Evolution of intelligence
Direct modeling of temporal effects of environment on a global absolute scale vs statistics
The Authors
H.M. Hubey, Department of Computer Science, Montclair State University, Upper Montclair, USA
Abstract
The social sciences are really the “hard sciences” and the physical sciences are the “easy” sciences. One of the great contributors to making the job of the social scientist very difficult is the lack of fundamental dimensions on the basis of which absolute (i.e. ratio) scales can be formulated and in which relationships could be realized as the [allegedly] coveted equations of physics. This deficiency leads directly to the uses of statistical methods of various types. However it is possible, as shown, to formulate equations and to use them to obtain ratio/absolute scales and relationships based on them. This paper uses differential/integral equations, fundamental ideas from the processing view of the brain-mind, multiple scale approximation via Taylor series, and basic reasoning, some of which may be formulated as infinite-valued logic, and which is related to probability theory (the theoretical basis of statistics), to resolve some of the basic issues relating to learning theory, the roles of nature and nurture in intelligence, and the measurement of intelligence itself, and leads to the correct formulation of the potential-actual type behaviors (specifically intelligence) and a dynamical-temporal model of intelligence development. Specifically, it is shown that the: (1) basic model for intelligence in terms of genetics and environment has to be multiplicative, which corresponds to a logical-AND, and is not additive; (2) related concept of “genetics” creating its own environment is simply another way of saying that the interaction of genetics and environment is multiplicative as in (1); (3) timing of environmental richness is critical and must be modeled dynamically, e.g. in the form of a differential equation; (4) path functions, not point functions, must be used to model such phenomena; (5) integral equation formulation shows that intelligence at any time t is a sum over time of the past interaction of intelligence with environmental and genetic factors; (6) intelligence is about 100 per cent inherited on a global absolute (ratio) scale, which is the natural (dimensionless) scale for measuring variables in social science; (7) nature of the approximation assumptions implicit in statistical methods leads to “heritability” calculations in the neighborhood of 0.5, and, short of having controlled randomized experiments such as in animal studies, these are expected sheerly due to the methods used; (8) concepts from AI, psychology, epistemology and physics coincide in many respects except for the terminology used, and these concepts can be modeled nonlinearly.
Article Type:
Research paper
Keyword(s):
Cybernetics; Intelligence; Computers; Brain.
Journal:
Kybernetes
Volume:
31
Number:
3/4
Year:
2002
pp:
361-431
Copyright ©
MCB UP Ltd
ISSN:
0368-492X
Introduction
There is an old issue going back to Aristotle (who thought that slaves were slavish by birth), which has become a heated debate in recent years through Burt, Spearman, Thurstone, Jensen, and Gould, having to do with the role of genetics and environment in intelligence, and which was revived only a few years ago by Herrnstein and Murray (H&M). During this century the discussion has become more “scientific” via the use of mathematical models and techniques, the evidence consisting of tests, grading, and statistical analysis of such tests. Because of the importance of the history of the subject, the various incommensurable views adhered to by the various parties, and the sweeping breadth of the discussion, the paper treads over known territory, some of it in standard/classical fashion and some with original twists, at the risk of boring some readers, in order to be accessible to the broad readership, some of whom may not have any familiarity with some of the mathematical techniques. The attitudes of the workers in this field, according to Herrnstein and Murray (1994), can, without oversimplifying and without taking too long, be put into three groups:
Classic: Intelligence is a structure. Whether there is a single number, two, or several is not as important as the fact that there's a structure to it, and this structure can be captured in a single number, which Spearman called g, general intelligence. Thurstone claimed about a half-dozen PMAs (Primary Mental Abilities). According to Vernon, they are hierarchical. According to Guilford there are 120 or so components in this structure.
Computational-AI model (revisionist): Intelligence is a process. This seems to be an evidently more modern attitude encompassing the information processing view. According to Sternberg there are three aspects of human information processing: the transducers (our sensory organs) that change real-world inputs into special forms for our brain; classifying the real-world problems into groups; and actually making use of the apparatus in living (and hopefully being successful) in the real world via the use of the schemes of adapting (to the environment), shaping (the environment), and selecting (a new environment).
Scalar vs. Tensor (radical): There are different kinds of things called intelligence. For example, according to Gardner there are linguistic, musical, logical-mathematical, spatial, bodily-kinesthetic, interpersonal and intrapersonal forms of intelligence.
Of course, the phrase cognitive ability (CA) has now replaced intelligence and, according to H&M, it is substantially heritable, apparently no less than 40 per cent and no more than 80 per cent. The importance of personal skills and emotional issues already clouds the definition of intelligence. Is it possible that all of these views have part of the truth and, like the men who fought over what to do with their money without knowing that they all wanted to purchase grapes, they are fundamentally more in agreement as far as the facts are concerned, just as men and women are more alike than unlike? Is there a unified view of which all of these are components?
Properties of intelligence: classical
One simple way of invalidating the results seems to be to deny the existence of race. The arguments are from biology: there is no scientific definition of race! It's too silly to be of much use. For one thing, it won't stop the racists; another word will take its place. For another, biology is hardly in a position to be arguing about what is science and what is not, since it is still rather low on the totem pole. And thirdly, the definition of race as given doesn't say anything more than what it's supposed to: arguing that there's no such thing as beauty because it's only skin deep is silly. Who said it's supposed to be any deeper? We could, of course, see all kinds of beauty in everything, including in intelligence; indeed it exists everywhere. We might make up a simple table of words and phrases used in the literature for describing the intelligence or CA debate as below (Table I):
Man came first to the realm of the minerals, and from them he fell in among plants. For years he lived among the plants and remembered nothing of the vegetative state. In the same way he passed from realm to realm, until now he is intelligent, knowledgeable, and strong. He remembers not his first intellects, and he will leave his present intellect behind. He will be delivered from this intellect full of avarice and cupidity and see hundreds of thousands of marvelous intellects. Rumi (Chittick, 1983)
We can also use the standard terminology of nature-nurture, or innate vs. cultural, but they all seem to boil down to a discussion of whether there is biological determinism. We might expand upon the standard arguments against the thesis that human intelligence is unfairly distributed to the different races by summarizing the arguments as making some version of the statement that the IQ tests measure a

structure (vector or tensor) AND a scalar should not or cannot be derived from the tensor.

scalar (single number) BUT one that cannot be used to linearly order or rank humans.

component that's race-based AND is immutable.
The last part has been changed slightly from the ones expressed by Gould (1981). Some people might say ‘genetically-based’ (i.e. genetically inheritable) and slightly mutable. According to both detractors and proponents, the IQ tests and their conclusions are about what is called biological determinism (BD): the idea that intelligence or cognitive ability/capacity is innate, intrinsic, inherited (biologically or genetically). A small problem revolves around the definition of heritability. Heritable means capable of being inherited. Inheriting has to do with coming to possess certain characteristics, and not necessarily genetically, although it is often meant that way. So the fact that statistical techniques such as correlation-regression and analysis of variance have been used to define inheritance or heritability means that somehow we are to assume that we know exactly what it is and that it is clearly defined, but that is simply not the case. It is as if we went to a doctor's office complaining of our heart beating too fast and he told us that we had tachycardia. He hasn't diagnosed the problem but only given it a name, and we should not be impressed. Several definitions of heritability are possible (Table II).
Of these, it is probably WAH and WAE that are the most commonly held views in opposition. So intelligence could be inherited, but only socially. Superficially the basis of statistical inference and correlation-regression analysis seems secure. Who would fight it? Most people would head for the intellectual hills whenever faced with the squiggly symbols of mathematics, so the battle lines for the Bell Curve would seem at a glance to resemble the Scopes monkey trial, with science once again about to triumph over its emotional opponents, who naturally once again seem to be the ‘bleeding heart liberals’. It would be strange to hear someone say that it is all for nothing, but that is essentially what it is about.
race: A local geographic or global human population distinguished as a more or less distinct group by genetically transmitted physical characteristics.
species: A fundamental taxonomic classification category, ranking after genus, and consisting of organisms capable of interbreeding.
subspecies: A subdivision of a taxonomic species, usually based on geographic distribution.
The theories of statistical testing (and the CA testing debates) are replete with oblique axes, multicollinearity, orthogonal regression, and covariant vs. contravariant tensors, and to this we could add others such as rates of cultural vs. biological change. Some of this is done in detail in later sections. For a brief and intuitive tour of the relevant ideas we should turn to a short history of the vector vs. scalar theory of intelligence and thus to the pioneers of this century. The real vectors of mind, Thurstone reasoned, must represent independent primary abilities (PMAs). If they are truly independent they should be orthogonal (that is, perpendicular) to each other. But whatever these PMAs are, they are correlated; that is, they tend to cluster. The problem is called multicollinearity in statistics. Not all sets of vectors have a definable simple structure. A random array without clusters cannot be fit by a set of factors.
The discovery of a simple structure implies that vectors are grouped into clusters and that the clusters are relatively independent of each other; that is, they represent, however inaccurately, some aspect of some primary mental abilities, or PMAs. Thurstone identified seven of them: Verbal comprehension, Word fluency, Number (computational), Spatial visualization, M (associative memory), Perceptual speed, and Reasoning. Thurstone admitted a strong potential influence for environment but emphasized inborn biology, and also refused to reduce these to a single number; hence, it might be said, he was an advocate of the structuralist school. He claimed that Spearman's scalar g (general intelligence factor of some sort) was simply an artifact of the tests Spearman gave and nothing more. Spearman's retort was that Thurstone's PMAs were also artifacts of the chosen tests, not invariant vectors of mind, which is as true as Thurstone's claim.
Vector/tensor vs. scalar controversy: distance metrics & normalizations
Suppose we want to represent the physical agility or physical capability of athletes from various different tests. Suppose we use only three tests: (i) endurance/stamina; (ii) reflex/reaction-time; and (iii) strength. How should we represent these three qualities (as quantities)? As the simplest such measure we can simply use three separate bits (i.e. zero or one) which will represent the possession or lack of the relevant property (such as a pass/fail grade), which we can write as 000, 001, 010, 011, 100, 101, 110, and 111 (see Figure 1). Or we can decide to give grades in the normalized interval [0,1] for each of the three separate tests, and thus implicitly switch to using some kind of reasoning related to fuzzy logic or probability theory. Of course, we can easily increase the number of such tests to five or ten, and we can also increase the dimensionality of the problem, but plotting more than 3 dimensions is very difficult. Hence, it is easy to deal with such high-dimensional problems only by using symbols and logic. To continue the example of 3 dimensions, we can make bar charts, pie charts, or we can plot them on a 3-dimensional graph. Then we can represent each person as a point in three dimensions {x,y,z}. We call such ordered n-tuples vectors. A vector is obviously a simpler case of a matrix: it is a 1 by n matrix. Matrices are also called tensors of rank 2, and vectors are tensors of rank 1. Therefore ordinary single numbers are called tensors of rank 0, or simply scalars.
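The two representations just described can be sketched in a few lines; the particular scores below are invented for illustration.

```python
# Two ways to represent three test results (endurance, reflex/reaction-time,
# strength): as a pass/fail bit pattern, and as graded scores in [0, 1].

def bits_to_label(bits):
    """Render a pass/fail triple such as (1, 0, 1) as the string '101'."""
    return "".join(str(b) for b in bits)

# Pass/fail representation: one bit per test.
pass_fail = (1, 0, 1)          # passed endurance and strength, failed reflex

# Graded representation: a point {x, y, z} in the unit cube.
graded = (0.9, 0.4, 0.7)

print(bits_to_label(pass_fail))                 # 101
print(all(0.0 <= x <= 1.0 for x in graded))     # True
```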
Consider the case of colors. Colors are produced from three so-called primary colors, Red, Green and Blue (RGB) or their complements, Cyan, Yellow, and Magenta (CYM), depending on whether an additive or subtractive process is used. No one would really argue that a color is not a single indivisible quantity if we think of it as something our perceptual/visual system is able to transduce. So then the natural question is whether a color is a single number, multiple numbers, a vector, a structure or a dynamic thing that causes our perceptual system to process the input data. It depends on our perceptual abilities and our knowledge. For sure it is all of them depending on what we want to do with it, and there's no contradiction. As we know, all the colors (for all practical purposes) can be obtained (additively) from the three basic primaries, Red, Green and Blue (RGB); see Figure 2. The gray scale runs from black to white along the diagonal. The great advantage of using a multiple-dimensional space is the accuracy of such representations of many phenomena. We all know what colors are but they would be virtually impossible to explain to someone who was congenitally blind. If we did attempt to “explain” colors by explaining that “black is the absence of color and white is a mixture of all the colors” it is likely that the blind person would think of colors as what we call “gray scale”. We can write the primary colors as the vectors
r = (1, 0, 0),  g = (0, 1, 0),  b = (0, 0, 1)    (Equation 1)
Since a vector consists of ordered elements, the first entry refers to redness, the second to greenness and the third to blueness. Thus the red vector r has only a 1 in the redness-place and zeroes elsewhere. Similarly for the other primary colors, g and b. We suspect, then, that the other colors will be some combination of these primary colors. What this boils down to is that we want to add different proportions of the primaries to create other colors, so we multiply the primary colors by some number less than one (so that it is a small proportion) and then add them all to get some other color c_any, so that
c_any = p_r·r + p_g·g + p_b·b    (Equation 2)
where p_r = proportion of red, p_g = proportion of green and p_b = proportion of blue. If we had p_r = p_g = p_b = 0.5 we would obtain a gray, since the diagonal of the color space that runs from black to white is called the grayscale. We can represent this particular gray as
gray = 0.5·r + 0.5·g + 0.5·b = (0.5, 0.5, 0.5)    (Equation 3)
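The scalar-vector multiplication and vector addition behind Equations 2 and 3 can be made explicit in a short sketch; plain tuples are used so that every operation is visible.

```python
# Primary colors as basis vectors (Equation 1).
r = (1, 0, 0)
g = (0, 1, 0)
b = (0, 0, 1)

def scale(p, v):
    """Scalar-vector multiplication p*v."""
    return tuple(p * x for x in v)

def add(u, v):
    """Component-wise vector addition."""
    return tuple(a + c for a, c in zip(u, v))

def mix(p_r, p_g, p_b):
    """c_any = p_r*r + p_g*g + p_b*b (Equation 2)."""
    return add(add(scale(p_r, r), scale(p_g, g)), scale(p_b, b))

# Equation 3: equal proportions give a gray on the black-to-white diagonal.
print(mix(0.5, 0.5, 0.5))    # (0.5, 0.5, 0.5)
```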
In the example above we saw the rules for scalar-vector multiplication and vector addition, but not vector multiplication. The final result for this particular gray is that it has 0.5 proportion of red, green and blue, since those are the vector components. However, if we make an analogy to the 3D space in which we live, with the exception that the dimensions of color are not homogeneous like our space dimensions, it is likely to be understood better. For a more detailed look at color, see Hubey (1997). There is a simple way to obtain a magnitude from the PA space (instead of using the Hamming metric) by treating it as the color space, except that the meaning may not be intuitive. Simply define
M = (ɛ·E^{2e} + ρ·R^{2r} + σ·S^{2s})^{n/2}    (Equation 4)
where E, R and S are the endurance, reflex and strength scores.
For the special case of ɛ=ρ=σ=e=r=s=n=1 this is simply the Euclidean distance metric that we use for our ordinary three-dimensional space (please see Appendix A.1). Although it's just as obvious that the color vector is being produced from the primary component colors, our mind's eye sees a single color. Indeed this is done all the time; the colors on computer monitors are produced directly by energizing the red, green and blue strips of phosphors to varying degrees of intensity. In the case of high resolution monitors (0.28 mm dot pitch) the eye is unable to resolve the different components and produces instead what we see as a single recognizable color on the color wheel. The number does mean something. We can all see it. But naturally we will not be able to assign a linear ranking, since it's pointless. We can see that the Euclidean norm of the color vector will be a section of a sphere in the positive orthant, but it could be one of an infinite number of colors on the surface of this sphere. Making the analogy to colors, what we can immediately see is that our unaided intuition, if we only considered this color space to be a homogeneous space like that in physics, would not be able to tell us that what we perceive subjectively as color often does not seem to have any obvious connection to the constituent components of the color vector, since we now know that what looks to be a distinct “thing” is merely a shorter/longer wavelength in the visible bandwidth of the electromagnetic spectrum. However, there is no doubt that any given color can be comprised of the basis colors RGB. Therefore we have no reason to insist that a vector created from the components of intelligence will not possess intuitive properties totally different from those of the basis vectors. At the same time, the scalar quantity obtained from the vector is certainly missing much information. The real question is how different from each other the components of the intelligence vector are.
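A sketch of this generalized magnitude, assuming the form M = (ɛ·E^{2e} + ρ·R^{2r} + σ·S^{2s})^{n/2} (an assumption consistent with the stated special case): with every parameter defaulting to 1 it collapses to the ordinary Euclidean norm, and other parameter choices give differently weighted scalars.

```python
# Generalized magnitude over endurance E, reflex R, strength S. With
# eps = rho = sig = e = r = s = n = 1 this reduces to the Euclidean norm.

def magnitude(E, R, S, eps=1, rho=1, sig=1, e=1, r=1, s=1, n=1):
    return (eps * E**(2 * e) + rho * R**(2 * r) + sig * S**(2 * s)) ** (n / 2)

print(magnitude(3, 4, 0))                  # 5.0, the Euclidean special case
print(magnitude(3, 4, 0, eps=2, n=2))      # a differently weighted scalar
```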
However, along the diagonal from black to white, we can indeed assign a single scale, the so-called grey scale. And everyone will be able to visually compare them. It will take some training to be able to estimate the color vector components for various colors; however, in these days of computers it should not be too difficult to find a program with which to play around. And indeed the results will be what we imagined above for the physical case: there are differences and they are quite noticeable. So the whole idea of whether to combine the components to produce a single number or to leave them alone may not be much more than a matter of taste. In fact, if anything, both should be done.
I died from the mineral kingdom and became a plant; I died to vegetative nature and attained to animality. I died to animality and became a man. Why should I fear? When did I become less through dying? Next time I will die to human nature, so that I may spread my wings and lift up my head among the angels. Once again, I will be sacrificed from angelic nature and become that which enters not the imagination… Rumi (Chittick, 1983).
And different weightings should be used, just to see what kinds of differences they would make. In the specific case of CA or PQ, since the various alleged factors or components of the tests are, or would be, highly correlated and not independent as in the case of the three primary colors, they would all increase more or less together, and this would correspond almost exactly to the case of the grey scale, so there is something after all to what the classicists claim. Since they are correlated (i.e. tend to increase or decrease together), this resembles something like the grey scale, and we can make use of this idea to comprehend what these tests purport to measure. So there's no serious difficulty in making sense of a scalar measure (i.e. a single number, say, Spearman's g). We can use analogical reasoning now to try to comprehend what this single number could mean, if we had a mind to produce such a single number. Indeed, it is an excellent example of the fact that although we can ‘see’ the grey number as something clearly related to black, we would not have been able to imagine that it is really being produced from red, green and blue. It is one of the miracles of the natural world; strange but true, just like finding order in the randomness of chaos. But there is another simple way in which we can produce scalars from which we can get an idea of the colors. The problem of structure vs. process vs. multiple intelligences is a pseudo-problem, since the arguments are really about the definition of intelligence. From the way it is explained, it seems that by structure is really meant a state. In some ways the cognition view seems to be an attempt to solve the problem of intelligence by avoiding it, and the multiple intelligences view seems to be taking the vectors-of-mind view as is and refusing to go further.
The only one that causes immediate grief is the difficulty of connecting the state view with the process view, since this is in general very difficult even in relatively well-trodden fields such as thermodynamics.
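The grey-scale argument can be illustrated numerically: when the components move together, different weightings produce essentially the same ordering. The score vectors and weights below are invented.

```python
# Three hypothetical people with strongly correlated component scores.
people = [
    (0.2, 0.3, 0.25),
    (0.5, 0.55, 0.45),
    (0.8, 0.75, 0.85),
]

def scalar(scores, weights):
    """Collapse a score vector to a single number via a weighted sum."""
    return sum(w * s for w, s in zip(weights, scores))

w1 = (1 / 3, 1 / 3, 1 / 3)    # equal weighting
w2 = (0.5, 0.3, 0.2)          # a different, arbitrary weighting

rank1 = sorted(range(len(people)), key=lambda i: scalar(people[i], w1))
rank2 = sorted(range(len(people)), key=lambda i: scalar(people[i], w2))
print(rank1 == rank2)          # True: the ordering survives the reweighting
```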
Anomaly or contradiction? (data is evaluated according to a theory)
A more serious problem is the apparent paradox that we have the largest b/B of all animals (where b = brain mass, B = body mass) and yet, from the evidence, brain size among humans doesn't seem to matter much. Evidently either

there is something analogous to flab for the brain so that massive brains don't necessarily imply high intelligence

or whatever intelligence is, the test doesn't measure it but rather a narrow set of skills taught to students who are expected to have this core knowledge just to survive and be a reasonably productive member of this society during this century

or it's the connectivity that is important so that the more efficient connections may be present in the brains of some individuals who have small brains and are anecdotally said to have been ‘smart’.
It seems as if the correlation between the Encephalization Index (EI) (Eccles, 1989) and intelligence holds at large scales (i.e. global scale) and does not hold at small scales (local scale). We do definitely find that the larger the EI, the more intelligent the species. Why then doesn't the relationship hold at local scales? Superficially there could be two reasons: the tests (instruments) do not possess the resolving power required, or the relationship is not linear and thus linear correlation-regression (LCR) analysis does not divulge any information. However, there are other reasons why the EI does not seem to correlate with intelligence at local scales (i.e. only for humans).
We do know that more complex organisms also have a larger brain/body mass ratio (Britten and Davidson, 1969; Sagan, 1977).
Artificial (machine) intelligence perspective: form, mode, and type
As for the taxonomic structure of the skills that comprise what we call intelligence, the first thing we note is that, like a database, there are different possible classifications, and that if they all seem to be just as attractive then they must be different conceptual views of the same thing, which can possibly all be accounted for some day when we have better mathematical models. The standard models were reviewed in the beginning, and we have yet more possible taxonomies and also other pieces of evidence that point in the direction of a logarithmic scale. Some skills of problem solving are serial, which would include what we call formal logic [and definitely its informal version that shows up constantly in verbal comprehension type questions], and some of the simple arithmetic (i.e. word) problems. Others, the most obvious of which is spatial visualization, require a parallel mode of processing. The visual [nonverbal and parallel] mode of thinking was probably best expressed by Einstein. Since the number of brain states (in analogy with the computer science sense of the word, i.e., say, a state of a set of flip-flops of a real machine or the internal states of an abstract machine such as a Turing machine) increases exponentially with the number of neurons, we expect that ability, in some sense, also increases exponentially, so that we should use a logarithmic scale. As for the complexity (in the sense of the number of components, or the number of operations a machine executes as in algorithmic complexity) of the brain and the expressive power of a language, there are good reasons to think that they should be multiplicative and that there are trade-offs in time vs. space complexity for languages (see Hubey (1994)). Going back to standard computer paradigms, if we concede that animals can think (although at some lower level) we must also concede that thinking doesn't require language [if the few tens of words that animals can recognize are not counted as language].
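Why a logarithmic scale is natural here can be shown with a toy calculation: with n idealized two-state elements the number of distinct states is 2^n, so the state count explodes exponentially while its logarithm grows only linearly with n.

```python
import math

def state_count(n):
    """Number of distinct states of n binary elements (flip-flops)."""
    return 2 ** n

for n in (10, 20, 30):
    # The state count grows exponentially; log2 brings it back to n.
    print(n, state_count(n), math.log2(state_count(n)))
```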
There may be natural spaces in which to represent intelligence, which means that we may yet provide some kind of a structure to it. For example it would be possible to represent many of the ideas in terms of a simplified threedimensional space whose axes are

Explicit-Implicit [Knowledge Form]

Parallel-Serial [Computation Mode]

Processing [Bound Type]
Some kinds of questions require explicit knowledge, such as mathematics, geography, verbal comprehension [grammar], and word fluency. Others are implicitly learned, such as personal and interpersonal skills, physical coordination, and much of language. We might also call explicit knowledge much of what is taught in schools, and the implicit, what is learned without a formal education [which would include the so-called street smarts and also certain personality skills which would make for a good manager or salesperson]. The last axis of the 3D space has to do with what might be called the difference in computation between batch vs. real-time, or between I/O-bound vs. compute-bound processes; it's really a combination of both. Into this last category (axis) would fall such things as the bodily-kinetic intelligence of Gardner, musical talents (i.e. ability to play an instrument), being athletically minded, and perhaps some aspects of personality. Those involved in real-time programming know that it is a task of difficult constraints. Similarly, coordinating physical activity and mental tasks (i.e. as in team sports) is a rather difficult task, i.e. of high complexity. It is for this reason that music and dancing have calming effects; they stop the internal dialogue of Castaneda. It is for this reason that music might break some people's concentration but improve others'. We can try to include what should really be a fourth dimension, that is essentially memory fetch vs. computation, in this third dimension, but only to make its comprehension easier, since representing more than three dimensions is very difficult except purely mathematically. The possible fourth dimension [only for the purposes of simplification and exposition], that of the difference between a compute-bound process vs. one of memory fetch, would in computer science be the difference between a complex algorithm vs. a table lookup.
In the real world of humans, the table lookup has the analog of word fluency and perceptual speed [of Thurstone's PMAs]. Clearly it has to do with the organizational skills of the person, which naturally is about the organization of the knowledge in his brain, and hence his past, which includes both formal and informal education. It is this which Thurstone probably calls M (associative memory). In Gardner's world view, this would get split into spatial (since perceptual speed might have to do with spatial resolution and manipulation of objects in space), and logical-mathematical would also be in this category. Since all memory in the brain seems to be associative and analogically based, this particular component is probably what we might call efficiency in a setting other than the intelligence debate, and is probably what we are measuring, along with some basic knowledge that we presume every human should know. Continuing with this idea, we can see that things are often measured as a product of two variables in which one is intensive and the other extensive. For example, in thermodynamics/physics, work done is δW=p·δV and heat is δQ=T·δS. The idea of intensive vs. extensive variables does have uses in many different areas. Training or education is probably something like δt=x·δT, where x is the intensity or quality of the training program and δT the extensive variable, which is the amount of time spent in it. Problem solving ability is δπ=ɛ·δK, where ɛ has something to do with the inferencing mechanism or engine used, and K the knowledge base, for despite all claims to the contrary and protestations, we cannot separate the two completely, at least in the human brain, and at least for the time being. Knowledge of the world comes from our senses, and our inferencing about the world at large comes from our observations. In fact, we can see the same ideas being used in scoring in gymnastics and diving.
The score is calculated by multiplying the raw score (how well performed) by an inherent degree of difficulty of the routine of the dive. Hence the measurement is really about a product of an intensive parameter (organizational effectiveness of the brain or its efficiency) multiplied by an extensive parameter which is knowledge. Please see appendix A.3 on Path Integrals and their connection to these ideas.
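This intensive-times-extensive pattern, as in the diving score and in δt = x·δT, can be sketched directly; all numbers below are invented.

```python
def dive_score(difficulty, execution):
    """Intensive (degree of difficulty) times extensive (raw execution score)."""
    return difficulty * execution

def training_effect(schedule):
    """Accumulate x * dT over (intensity, duration) intervals: a path sum."""
    return sum(x * dT for x, dT in schedule)

print(dive_score(3.0, 8.5))                        # 25.5

# Two schedules with the same total time (8 units) but different intensity
# paths give different totals: the result behaves like a path function, not
# a point function of elapsed time alone.
print(training_effect([(0.75, 4), (0.25, 4)]))     # 4.0
print(training_effect([(0.25, 8)]))                # 2.0
```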
Potential and Its Realization
In articles on intelligence (indeed on almost any other characteristically human trait, such as language) we often run into words which talk about human potential which has not been realized. It is often thought to be a single dimension in which the actual realization is simply a proportion of the potential (capacity). What seem like two poles of a continuum often turn out to be separate dimensions. The case of language turns out to be one of these. There are really two variables: capacity and existence of instruction. There is a window of opportunity for picking up language.
We see that we are dealing with a product of variables, since only a product can create this behavior, and in this case the simplest approximation is just a logical-AND. In other words, there must be language capacity (i.e. an innate, inherited potential) AND there must also be a proper environment (i.e. instruction), so that language can be learned. The next level of approximation simply uses fuzzy logic concepts: as long as the potential is there (for example, in mentally retarded children) and there is instruction, there will be some form of language. Indeed, IQ tests do measure language competence to various degrees and use it as part of the test of intelligence. Combining the two (i.e. Knowledge Form, Computation Mode, Bound Type) with the concept of potential from physics, we might try potentials of the form

Ψ = e^{ϕF^{f}} e^{μM^{m}} e^{τT^{t}}   (Equation 5)

Q = ϕF^{f} · μM^{m} · τT^{t}   (Equation 6)

from which we can compute the vectors of the mind, and also derive single or multiple scalars using any of the ideas shown in earlier sections. The potential in Equation 5 is already multiplicative and becomes additive as ln(Ψ) = ϕF^{f} + μM^{m} + τT^{t} after taking logarithms, so that if we are interested only in adding up scores on various sections of the test, without any compelling reason not to do so, the logarithm will relate these numbers to the potential. In the latter case, the logarithm produces the standard form for linear regression. Without some data on what these mean it would be pointless to speculate on the choice of functions; however, we should note that Q is multiplicative, so that if any one of the components is zero Q will be zero, and it therefore has built-in correlatedness for the components. It would tend to produce high scores for more well-rounded informal low-level education [i.e. cognitive intelligence], whereas if there is a limit to what is possible (our brain is finite, after all), then the high achievers would certainly be deficient in some areas and stronger in others, which would be exaggerated by the exponential form of F given the right coefficients; so both forms are flexible enough for creative use. Even a simple multiplicative model is much closer to the truth than the standard linear regression models. Many things having to do with psychophysics are best modeled by a power law, and the sigmoidal functions, as pioneered by Rasch (1980), seem to have had much success. Other sigmoidal models can be seen in Hubey (1987) and stochastic models in Hubey (1991a) and also below. We still have the problem of obtaining the actual/real from the potential (which is the concern of learning theory).
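To make the log-linearization step concrete, here is a minimal numeric sketch of a multiplicative potential. The component scores and exponents are illustrative assumptions, not values from the text, but the two properties claimed above (a zero component gives a zero score, and logarithms turn the product into the additive form used by linear regression) can be checked directly:

```python
import math

# Toy multiplicative potential of the kind discussed above.  The scores
# F, M, T (Knowledge Form, Computation Mode, Bound Type) and the
# exponents f_exp, m_exp, t_exp are illustrative assumptions only.
f_exp, m_exp, t_exp = 0.5, 0.3, 0.2

def potential(F, M, T):
    """Multiplicative (logical-AND) potential: zero if any factor is zero."""
    return (F ** f_exp) * (M ** m_exp) * (T ** t_exp)

# AND behaviour: one zero component annihilates the whole score.
assert potential(0.0, 0.9, 0.9) == 0.0

# Taking logs turns the product into a sum, i.e. the standard form
# for linear regression on log-transformed section scores.
F, M, T = 4.0, 2.0, 8.0
lhs = math.log(potential(F, M, T))
rhs = f_exp * math.log(F) + m_exp * math.log(M) + t_exp * math.log(T)
assert abs(lhs - rhs) < 1e-12
```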
Functional view of the psychology/biology of learning, and intelligence
It is reasonably clear from all the evidence that whatever it is that intelligence tests measure (whether it should be called the Intelligence Quotient, Cognitive Ability, Problem Solving Capability, or a Problem Solving and Creativity Scale) can be changed/affected via training, emotion, and poverty; in other words, environmental influences. Even after scaling things correctly, we are still left with variation among humans. It might be argued, despite the evidence, that it still means something and needs an "explanation", in other words, some simple theoretical model. It is not difficult to produce a very simple model that hopefully will not do terrible injustice to the idea of intelligence. We know that our memory is associative. Memory events seem to be linked to other events, and we can recall almost everything with some prompting. We might make an oblique reference here to artificial intelligence programs in which an inference engine works on data, so that we may liken problem solving to having an inference engine (not necessarily localized in some part of the brain but possibly scattered about) fetching data and doing some kind of a search (breadth-first, depth-first, some combination thereof, or something completely unknown to us yet). Of course, it will take time to do all this.
Let us call the time it takes to do this T_{c}, for complete-search time, without implying that the search does not include conventional computation, i.e. problem solving. Suppose now that over a period of time we have built up (via formal or informal education) a large bag of cheap tricks which is also kept in storage someplace. We can think of reasoning as analogical reasoning, in which we solve problems via analogy to problems resembling the given problem in one or more dimensions, and we keep a mental template of such solved problems in memory (which we might imagine is functionally kept someplace separate from the rest of memory). Thus if we are first able to find the template for the given problem in this memory of pre-solved problems [premem], we can 'solve' the problem much faster than if we had never encountered problems of this type. The truth of the matter is that there are really no completely original or novel problems that can be presented on any of these tests, and solving some of them really revolves around guessing what the tester wants as an answer. Therefore the time to solve a problem, if we can find an analogical match in our pre-solved memory, is much shorter than if we treated the problem as completely original and tried to be creative in its solution, in which case we might never be able to solve it at all. The time to solve a problem with this highly simplified two-tier memory system then drops to

T = H T_{p} + (1 − H)(T_{p} + T_{c})   (Equation 7)

where H is the probability of finding the solution in premem [pre-solved memory]. This is essentially what is referred to as "chunking" in learning theory and in artificial intelligence. Thus if we do find it there, the answer is very quickly found, which takes time T_{p}. The time to find the solution if it is not found in this pre-memory fetch is the time spent on that fetch plus the time spent actually solving the problem via supposedly original methods.
Naturally, this simplification is so gross that we should not expect anything beyond the simplest kind of match with reality. First, there really are no such two memories locatable anywhere in memory, but there is no need for that; the connections must behave something like it. Secondly, we would have a tough time solving very original problems; if anything, the problem we have is in finding a good match for the problem at hand and trying to force-fit a couple of problems together, or cobbling solutions from several such virtual templates; it is this efficient time that we have called T_{p}. In any case, the tests do not give us unlimited time to find the solutions but rather a fixed amount of time in which to solve such problems, so the assumption behind this idea is that we will be able to solve fewer problems in this fixed amount of time if we cannot find many of them in our premem. In truth, all of us who are alive have some small virtual memory in which already-solved problems from life are stored (naturally, not necessarily in some localized region of the brain), so that the time it takes for us to solve a problem, compared to some hypothetical baseline of the finely tuned problem-solving brain, would be of the form

T = 1 + (1 − H)λ,  λ = T_{c}/T_{p}   (Equation 8)
We should note that this solution time, T, is equal to 1 if H=1 and is equal to 1 + T_{c}/T_{p} if H=0. The case of H=0 corresponds hypothetically to a situation in which we are faced with a problem for which we have no handles. In this case the factor T_{c}/T_{p} is something that corresponds to the inherent originality of the problem, at least to the subject. We should suspect that λ should be a large number, since very few people [almost no one] are actually creative, but rather partially creative; we may cobble together solutions to new problems by combining several old ones. We solve large problems by cobbling together solutions to a bunch of smaller component problems. The process is iterative and hierarchical, since the same types of solutions can be used at different scales, hierarchies or levels. Most people probably cannot even do that unless they have been trained to do so. In any case, highly educated people, especially those who have studied mathematical sciences, will have high H values, since they have probably already solved symbolic problems of the type found on tests many times over. Similarly, questions such as "what is the opposite of…" will be easier for those children raised by parents who are highly literate than for those living in "tarzan neighborhoods". We should really consider not T but another quantity τ = 1/T if we want to consider the values as normalized. We then have

τ = 1/T = 1/(1 + (1 − H)λ)   (Equation 9)
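A minimal sketch of the two-tier timing model in Equations 7-9; the values chosen for T_{p}, T_{c} and H below are illustrative assumptions, not estimates from the text:

```python
# Sketch of the two-tier memory timing model (Equations 7-9).
# T_p: time for a premem (pre-solved template) fetch; T_c: time for a
# full creative search; H: probability of a premem hit.
def solution_time(H, T_p, T_c):
    """Expected solution time: a fast fetch with probability H,
    else the failed fetch plus a full search (Equation 7)."""
    return H * T_p + (1.0 - H) * (T_p + T_c)

def normalized_time(H, lam):
    """Solution time in units of T_p, with lam = T_c / T_p (Equation 8)."""
    return 1.0 + (1.0 - H) * lam

def efficiency(H, lam):
    """tau = 1/T (Equation 9): equals 1 at H = 1, small for large lam."""
    return 1.0 / normalized_time(H, lam)

T_p, T_c = 1.0, 50.0          # illustrative values; lam = 50 is "large"
lam = T_c / T_p
assert normalized_time(1.0, lam) == 1.0            # everything found in premem
assert normalized_time(0.0, lam) == 1.0 + lam      # nothing found: 1 + T_c/T_p
assert abs(solution_time(0.5, T_p, T_c) / T_p - normalized_time(0.5, lam)) < 1e-12
```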
The plot shows that for H=1, τ=1, since at that point we have reached maximum efficiency: every problem presented to us in the test is already present in our pre-solved memory and we need only fetch the answer. We should also note that the most rapid increases occur for large λ, which is exactly as it should be, since large λ means the full creative search takes much longer than a template fetch; if the problems are of the type that would be very difficult to solve without prior exposure to problems of this type, then the steepest increases come near H=1, when we can find the solutions in the pre-solved memory. It is possible that large-brained individuals may be capable of more original lines of thought, capable of more creative lines of thought, and have more memories built in. It is also possible that the so-called intelligent beings such as mathematicians, novelists or philosophers were merely one-dimensional experts in small domains who managed to score high on these tests precisely because they were trained for them. In particular, the tests might overemphasize classification, which is a large component of education, especially in the 'soft sciences'. It is said that an expert knows everything about nothing and the generalist knows nothing about everything. This is simply an example of a tradeoff, as can be observed in many fields (Hubey, 1996), and also in Appendix A.2. Most people would naturally fall somewhere in between. In yet another sense we can consider the effect of H plotted against the ratio S_{p}/S_{c}, where S_{p} and S_{c} are the proportions of memory devoted to the two different types of problem-solving modes and their associated memories, and where we have assumed that there must be some kind of a parameter Ω which has to do with the organization of the brain.
If knowledge is organized so that there is a method to the solution-searching mechanism, instead of the cut-and-try method that an unsophisticated person might attempt, the probability H of finding the answer (or something close to it) in the faster premem will increase. Hence we might think of Ω as a kind of efficiency of the brain as far as its organization goes. It is also possible that this could point to over-organization, in the sense that it will be good only for solving the types of problems given on such tests. As can be seen, if there were absolutely no efficiency-raising mechanisms or learning by experience, hence no localization of memory (i.e. associativity), then the increase in H should be about linear with S_{p}/S_{c}. There should be a higher rate of increase of H with S_{p}/S_{c} if the learning mechanism is efficient rather than simply rote training. In all likelihood memory (that is, the neural net) organizes itself in some manner which is captured in an extremely simple way by these equations. The early methods of solving problems are much closer to the parts of the triune brain (MacLean, 1973; Jerison, 1973), so that they become automatic means or fallback methods, and thus an increase in the likelihood of finding solutions to problems such as those given in various IQ/CA tests greatly increases performance. This "organizational efficiency" of the brain has been captured in the single parameter Ω. Other thoughts on functional descriptions of memories of living entities include procedural vs. declarative memory (Squire, 1983), working and reference memory (Olton, 1983), and associative and recognition memory (Gaffan and Weiskrantz, 1980), which, like the present work, borrows directly from computer science. For tradeoff-type relationships in many fields of science, and epistemology, see Hubey (1996).
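The qualitative claim above (H grows about linearly with S_{p}/S_{c} for an unorganized memory, and faster for an organized one) can be illustrated with a hypothetical one-parameter curve. The functional form H = s^{1/Ω} below is an assumption for illustration only, not a form proposed in the text:

```python
# Hypothetical form for the premem hit probability H as a function of
# s = S_p/S_c (share of memory devoted to pre-solved templates, taken
# in [0, 1]) and an organization parameter Omega.  H = s**(1/Omega) is
# NOT from the paper; it is one simple curve with the stated behaviour:
#   Omega = 1  ->  H grows linearly with s (rote storage, no organization)
#   Omega > 1  ->  H rises faster than linearly at small s (organized memory)
def hit_probability(s, omega):
    assert 0.0 <= s <= 1.0 and omega > 0.0
    return s ** (1.0 / omega)

assert hit_probability(0.25, 1.0) == 0.25   # unorganized: linear in s
assert hit_probability(0.25, 2.0) == 0.5    # organized: higher H at the same s
assert hit_probability(1.0, 3.0) == 1.0     # full premem always hits
```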
Mathematical analysis of proposals
The previous section was a purely functional view of the role of learning in problem solving, but IQ/CA is not supposed to be learned; it is supposed to be innate/hereditary/genetic. If intelligence cannot be learned, what exactly, then, is IQ? To answer this we must first ask what intelligence is; IQ is a normalized version of intelligence. The question has obviously been asked and answered in different ways in the past. In binary form the answer is the Turing Test. To know what intelligence is in non-binary form we should try to delineate its properties. Some of this was already done at the beginning, in the literature review. In this section we can try to produce answers from other points of view, ignoring the previous section and restarting a new thread by examining the standard arguments but evaluating them from different perspectives. Historically, the brain/mind has always been described by using the highest technology available as a metaphor, for what we are attempting to understand or describe is a function of the brain/mind. The mind was likened to clockworks, then the telephone switch, then the digital computer, and finally artificial neural networks. The memory part was likened to holograms, and the associative memories of computer science are still used as analogies. The computational paradigm is still rampant, and the concepts of state and process come from this view. However, since the brain/mind is a very complex thing, there is yet one more analogy we can make, and that is to databases, which have different conceptual (often called logical) views. The multiple-view perspective is taking hold these days even in operating systems. Since analogies are always single-dimensional, it is not surprising that something as complex as the human brain/mind (the three-pound universe) can be seen to be like so many things. Since we do not yet understand the whole but only its parts, we can liken ourselves to the story of the four blind men and the elephant.
There are other questions we can ask regarding its properties. Is it an extensive property or an intensive one? Is it like temperature or pressure (i.e. an intensive function), or is it like volume/capacity/mass/internal energy (i.e. an extensive one)?
The answer to both is that it is probably a product of both! Not only is problem-solving ability a function of some kind of efficiency of neurons or organization of the brain, but also of the pure mass or number of neurons. If it were not so, animals such as reptiles would be as intelligent as humans. On the other hand, if we claim that, since we are only considering humans and the brain masses all fall into the same range, we should consider this constant, then we still have to deal with whether IQ is intensive or extensive purely from consideration of whether it depends on knowledge (extensive) and also on some kind of efficiency of processing or creativity in solution finding (intensive). Therefore we still cannot escape the bind of choosing one or the other. It is most likely a function of both, and hence it must still be a multiplicative function, aside from the problem of being a path function and not a point function. On the basis of the foregoing we can find at least four serious problems with the attempts by which psychologists have so far tried to capture the idea of intelligence, aside from the ones that have already been discussed in the literature and earlier in this text.

(1) What kind of a quantity is intelligence? Is it binary or measurable on some scale? What kind of a scale is appropriate? Is it an ordinal, an interval, or an absolute (ratio) scale?

(2) Is it an additive function of its constituents, the most important ones for purposes of simplification being heredity (nature) and environment (nurture)? Or is it a multiplicative function? Is it a logarithmic, an exponential, or a polynomial function of its variables?

(3) Is it a vector/tensor function or a scalar?

(4) Is it a point function or a path function? In other words, is it a state or a process? Is it a quality or a quantity? Is it an extensive variable or an intensive variable?
We all recognize that genetic influence can be spread diffusely among many genes, and that genes set limits to ranges; they do not provide blueprints for exact replicas. In one sense, the debate between sociobiologists and their critics is an argument about breadth of ranges. For sociobiologists, the ranges are narrow enough to program a specific behavior as the predictable result of possessing certain genes. Critics argue that ranges permitted by these genetic factors are wide enough to include all behaviors that sociobiologists atomize into distinct traits coded by separate genes. Gould (1981), p. 329.
It is clear that these questions are not independent of each other but related to one another. If this thing called intelligence is to make any sense, it should be comprehended and comprehensible in a broader context. It is paradoxically true that sometimes one can find solutions to problems by generalizing them and looking for more general solutions, since that enables us not only to locate the phenomenon in its proper space relative to related ideas or objects, but also to use more data as evidence to better grasp the constraints to be imposed on the phenomenon. This intelligence scale should encompass, and allow us to measure, the intelligence of fleas as well as that of chimps, humans, and machines.
Common sense says that the scale should be logarithmic, not only to accommodate the vast differences in intelligence, but also because many laws in psychophysics are power laws. Logarithmic transduction of inputs allows for a greater range of sense perception without a proportional increase in the size of the organs. Furthermore, if this scale is to be something like the temperature scale, then absolute zero should belong to something like viruses or simple computer programs. Ideally, this scale should be an absolute/ratio scale instead of simply an interval or an ordinal scale. A highly mathematical treatment of the subject of scaling, going back to Campbell (1920), can be found in Suppes and Zinnes (Luce et al., 1963).
Heritability: Why is the “intelligence function” not additive?
The first problem with the Linear Correlation-Regression Models (LCRM) is that it is highly unlikely that intelligence is an additive function of environment and heredity, since addition corresponds to logical OR and not AND. Therefore the verbal expression that intelligence is a function of both environment and heredity is twisted out of shape as soon as we try a linear additive model. As is well known, AND is represented as multiplication, and not only in bivalent logic or fuzzy logic but even in modeling via differential equations; for example, in the nonlinear Lotka-Volterra models the interaction is multiplicative (see Appendix A.5). Various types of infinite-valued AND functions can be found in Hubey (1998). The sigmoidal function is produced quite naturally in the nonlinear differential equation modeling of forced binary discrimination of phonemes in Hubey (1994).
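The contrast between additive (OR-like) and multiplicative (AND-like) models can be made concrete in a few lines; the coefficients and exponents below are illustrative assumptions, not fitted values:

```python
# Minimal illustration of why "a function of BOTH heredity AND environment"
# forces a multiplicative model.  Scores are taken on [0, 1]; the default
# coefficients and exponents are illustrative only.
def additive(E, G, a=0.5, b=0.5):
    return a * E + b * G          # OR-like: either factor can compensate

def multiplicative(E, G, e=0.5, h=0.5):
    return (E ** e) * (G ** h)    # AND-like: both factors are required

# With zero "genetic capacity", no amount of environment compensates
# under AND, while the additive model still awards half the score.
assert multiplicative(1.0, 0.0) == 0.0
assert additive(1.0, 0.0) == 0.5
```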
[Biological determinism] is fundamentally a theory about limits… Why should human behavioral ranges be so broad, when anatomical ranges are generally narrower?… [I] conclude that wide behavioral ranges should arise as consequences of the evolution and structural organization of the brain. Human uniqueness lies in the flexibility of what our brain can do. What is intelligence, if not the ability to face problems in an unprogrammed (or, as we often say, creative) manner? Gould (1981), p. 331.
Additivity implies that the environmental and hereditary components are grossly substitutable for one another, which is simply untrue. No amount of teaching will make a chimp into a human. There is no question that the model should be multiplicative. The model cannot be additive, since additivity logically translates to OR, and nobody would really dispute that environment and heredity are not grossly substitutable for one another; if they were, we could teach calculus to dogs by enriching their environment to make up for their genetic deficiency. The coefficients no longer mean what they meant in linear regression. If we are looking for the magnitude of variation of intelligence with the factors, the two cases give fundamentally different results, because if we have I = f(E,G) then

dI = (∂f/∂E) dE + (∂f/∂G) dG   (Equation 10)
For the linear case

I = aE + bG   (Equation 11)

the differential (i.e. variation) is

dI = a dE + b dG   (Equation 12)
For the nonlinear case

I = αE^{e}G^{h}   (Equation 13)

the variation/differential is

dI = αeE^{e−1}G^{h} dE + αhE^{e}G^{h−1} dG   (Equation 14)
As can plainly be seen from the form of the multiplicative (i.e. AND) dependence, the powers of G and E essentially determine the sensitivity of intelligence to variations in environment and heredity. For the linear case the respective coefficients do determine the sensitivity of intelligence to the factors, but for the nonlinear case (which is the correct case) the respective coefficients no longer mean what they meant for the linear case. The model must be multiplicative. (See Appendix A.7 for some paradoxes.) The simplest such model accounting for environment and heredity is of the multiplicative type, which is interpreted as a logical-AND (i.e. conjunction). The linear regression could therefore be done using logarithms and would have the form

ln I = ln α + e ln E + h ln G   (Equation 15)
Immediately, we would see that all the numbers that were measured would get smaller, and hence so would the variances. However, that is not the only problem (see Appendix A.6 for the correct computation of variation and Appendix A.4 for the conditions on the functional form). The argument that the present testing methods and models are only for "human-level intelligence", where the linearity is valid, does not hold water, for there are standard mathematical methods to deal with such approximations. We simply expand the function in a Taylor series and attempt to regress about some point, which we may claim is some average human-level genetic and environmental condition, and about which the function is approximately linear. For example, if we suspected some general form I = f(E,G), then we can expand

f(E,G) ≈ f(E_{h},G_{h}) + f_{E}(E_{h},G_{h})(E − E_{h}) + f_{G}(E_{h},G_{h})(G − G_{h})   (Equation 16)

which for the form above is

I ≈ αE_{h}^{e}G_{h}^{h} + αeE_{h}^{e−1}G_{h}^{h}(E − E_{h}) + αhE_{h}^{e}G_{h}^{h−1}(G − G_{h})   (Equation 17)
Rearranging terms and simplifying, we obtain

I ≈ Φ + ΔE + ΛG   (Equation 18)

where Φ = αE_{h}^{e}G_{h}^{h}(1+e+h), Δ = αeE_{h}^{e−1}G_{h}^{h}, Λ = αhE_{h}^{e}G_{h}^{h−1}. To make the model linear we dropped the higher-order terms in the Taylor series to obtain Equation 18. However, the linear correlation-regression analysis computes the values of the constants Φ, Δ and Λ, and these parameters are no longer indicative of the effect of their own variables, since they are now functions of the other variable. To offset this dependence we would have to use the normalization E_{h} = G_{h} = 1, thereby computing the coefficients α(1+e+h), αe and αh in the linear regression. We can then solve for α, e and h from the three equations. If we do solve for these coefficients in terms of the regression values Φ, Δ and Λ, we obtain the results:

α = Φ − Δ − Λ,  e = Δ/(Φ − Δ − Λ),  h = Λ/(Φ − Δ − Λ)   (Equation 19)
If we had, say, Φ = 2 and Δ + Λ = 1, then the above works out only to a rescaling of the parameters, since we would then have e = Δ and h = Λ, so nothing would really change. If Φ < 1 we would obtain negative correlation, and we cannot allow Φ = 1 since the denominator would then be zero. However, if we had used another scale, say the one in use right now (i.e. E_{h} = G_{h} = 100), everything above would break down. Something which depends on a particular choice of interval scaling for its truth cannot be correct. We do not know whether the present IQ scaling is meant to be an interval scale or an absolute scale. It was through problems like this that Kelvin's research led to the postulation of an absolute temperature scale. (See Appendix A.1 and Appendix A.6. For more on fuzzy logic and differential equations, and the meaning of multiplication and nonlinearity, see Appendix A.4.)
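The solve-back step of Equation 19 can be checked numerically: under the normalization E_{h} = G_{h} = 1, regression coefficients generated from known α, e, h are recovered exactly. The numeric values below are arbitrary test values, not empirical ones:

```python
# Recovering the multiplicative-model parameters from linear-regression
# coefficients (Equation 19), under the normalization E_h = G_h = 1,
# where the regression returns Phi = alpha*(1 + e + h), Delta = alpha*e,
# Lam = alpha*h.
def recover(Phi, Delta, Lam):
    alpha = Phi - Delta - Lam
    if alpha == 0.0:
        raise ZeroDivisionError("Phi = Delta + Lam: exponents undefined")
    return alpha, Delta / alpha, Lam / alpha

# Forward direction: pick alpha, e, h and form the regression coefficients...
alpha, e, h = 1.5, 0.6, 0.3
Phi, Delta, Lam = alpha * (1 + e + h), alpha * e, alpha * h
# ...then check that Equation 19 recovers them.
a2, e2, h2 = recover(Phi, Delta, Lam)
assert abs(a2 - alpha) < 1e-12 and abs(e2 - e) < 1e-12 and abs(h2 - h) < 1e-12

# The special case in the text: Phi = 2 with Delta + Lam = 1 gives
# alpha = 1, so e = Delta and h = Lam (a pure rescaling).
assert recover(2.0, 0.7, 0.3) == (1.0, 0.7, 0.3)
```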
Problem of dynamics in measurement and attribution of causality
We should note there is another complication, since the real complexity of the problem is in the dependence of the variables on one another. For example, if we are traveling in an airplane from Maine to Florida, starting at around 9:00 a.m. and taking measurements of the ambient temperature, the rate of change of temperature we measure reflects not only the spatial variation in temperature (north-south) but also the temporal variation, since the air warms up after the sun comes up, reaching a peak around noon. Since we have the temperature q = q(x(t),t), where x is the distance traveled starting from Maine, the rate of change of the temperature measured (recorded by the instrument across time) is

dq/dt = ∂q/∂t + v ∂q/∂x   (Equation 20)

where v is the velocity of the airplane. The first term is the purely temporal rate of change of the temperature (due to the warming of the earth by the sun), and the second term, v·∂q/∂x, is the spatial variation multiplied by the velocity of the airplane, i.e. the rate at which this spatial thermocline is being sampled. For the case of measuring intelligence (whatever it may be) we do not know that the variables we have selected are really independent. For example, suppose we have y = y(M(t),V(t),t) [where M = mathematical ability, V = verbal ability, and t = training, i.e. formal or informal education]. We know that verbal ability is important, because without it we cannot even give these tests. But are we sure that mathematical/symbolic/quantitative reasoning is not important for verbal comprehension? What exactly is the relationship between the two? In terms of the underlying neural networks, both are handled by neurons, although there is much evidence of localization of speech, spatial reasoning, etc. (for example, Gazzaniga, 1985; Sperry, 1988; LeDoux, 1977).
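Equation 20 can be verified numerically for any smooth temperature field; the field q(x,t) and the airplane speed below are illustrative assumptions chosen only to mimic a north-south gradient plus diurnal warming:

```python
import math

# Numerical check of the total (material) derivative in Equation 20:
# dq/dt = dq_partial/dt + v * dq_partial/dx, for an illustrative field
# q(x, t) = 20 + 0.01*x + 2*sin(t).
v = 3.0                                  # airplane speed (arbitrary units)

def q(x, t):
    return 20.0 + 0.01 * x + 2.0 * math.sin(t)

def q_measured(t):
    """Temperature recorded on board: position is x = v*t."""
    return q(v * t, t)

t0, dt = 1.0, 1e-6
numeric = (q_measured(t0 + dt) - q_measured(t0 - dt)) / (2 * dt)  # central difference
analytic = 2.0 * math.cos(t0) + v * 0.01  # partial_t q + v * partial_x q
assert abs(numeric - analytic) < 1e-6
```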
However, our main concern now is the mathematical formulation of the problem. Since speech and visual ability develop in infants simultaneously, in all likelihood three-dimensional spatial comprehension and its verbal articulation go hand in hand, although people seem to start early into developing some modes more than others, for example spatial orientation, verbal fluency, or physical development.
In the study of any scientific discipline it is necessary, in the beginning stages, to use words whose precise meanings may not be defined, but are accepted as defined in an intuitive sense as starting points. On the basis of the ideas and concepts derived from these basic terms, a theory begins to develop, and then it is possible to retrogress and give precise quantitative definitions to the words and terms defined only verbally. Perhaps the best example of this process is in the field of thermodynamics. Concepts such as heat, temperature and pressure were properties only physically felt and intuitively understood. After thermodynamics was put on a theoretical footing, the concepts of temperature, heat and pressure were defined operationally (mathematically) on the basis of the developed micro (kinetic-statistical) theory of thermodynamics. Hubey (1979).
Putting it all together: effect of learning and timing of learning on potential
Many things which are accepted as part of the "natural" (how this word is abused would probably take a book to explain) growth/maturation of humans are really due to learning. For example, at very early ages, we are told that it is quite "natural" for children to engage in pretense play and to invent objects and people. In all likelihood, this is due simply to the fact that the infant still has not made a strong differentiation between sleep/dreams and wakefulness. The child falls asleep in one place and wakes up in another (for example in the car, at the beach, or in someone's arms). This is probably no more mysterious at this age than being in one wonderland (in sleep, i.e. dreaming) and then waking up to another reality in another place. At the same time, if it is talking to dolls or toys or dogs, it is still learning that some things are alive and move of their own accord, some are toys and are run by batteries, and some things that move (i.e. toys or animals) do not speak. Another stage in growth/development is when it does not yet understand, for example, the concept of a picture, so that if we tell it to "do a truck" it might mimic driving one instead of drawing the picture (Gardner, 1991). But of course, does the child at that age understand that the small iconic representations of objects which it sees on TV or in a book got there by various means, such as a camera or being "drawn" by other human beings? It is simply ignorance, nothing more. If someone draws a picture in front of its eyes (not a bad picture, since it might not be able to make the connection well at that stage), it might think that the pictures on TV are also drawn, or it might think that there is a little guy inside a Polaroid camera, like the German peasants during the last century who thought there were horses inside the locomotive. For the case of measuring intelligence (whatever it may be) we do not know that the variables we have selected are really independent.
For example, suppose we have ψ = ψ(M(t),V(t),t) [where V = verbal ability, M = mathematical ability, and t = training, i.e. formal or informal education]; then the variation in the potential is

dψ = ψ_{M} dM + ψ_{V} dV + ψ_{t} dt   (Equation 21)

where we denote the partial derivatives by subscripts, so that if we wanted to know the change in this potential with respect to training (which would naturally affect the measured intelligence), we would need to compute the total derivative

dψ/dt = ψ_{M}(dM/dt) + ψ_{V}(dV/dt) + ψ_{t}   (Equation 22)
It is possible that dψ/dt = 0 when neither M nor V changes, since we cannot now think of how ψ could change otherwise (assuming that these are the only factors/variables we have identified). In the general case, naturally, there would be more variables. But in truth things are more complicated; it may be more like ψ = ψ(M(V(t)),V(t),t). It is obvious that we cannot even state the problem, let alone the solution, without language, so V will definitely affect M. In this case we have

dψ/dt = (ψ_{M}M_{V} + ψ_{V})(dV/dt) + ψ_{t}   (Equation 23)

where t is a proxy for environmental richness. It seems at this point that we can get stuck in infinite regress, since if M = M(V) we may then need to write V = V(M) = V(M(V)) = V(M(V(M…)))) if we cannot separate the influence of V directly and via M, and if V is also a function of M. In some problems we can measure this, say, in industry analysis in economics. In any case, we can see the effect that this will have on computing the gradient of the potential to derive the vector function. Or we might have a more complex case, such as a potential of the form Ψ(M(V,t),V(t),t) or even Ψ(M(V,t),V(M,t),t). There is an even more serious objection to linear correlation-regression analysis. We can see immediately that new memories are built on top of old ones, and that learning to solve problems is just as good as being creative; in many cases learning eventually outstrips creativity, and that is naturally the reason why testing for IQ stops at adulthood, since by then everyone has already pretty much learned what there is to learn. If the same trends continued, we should be asking questions on algebra, trigonometry and calculus on the IQ tests given to adults, say college students or graduates. The fact that this is not done is testimony to the simple fact that the tests also test for knowledge.
Furthermore, the earliest memories should count more heavily, even on the standard IQ tests, since new memories are built on old ones, and thrusting someone into a new socioeconomic (SE) class is not the same as having someone in that SE class since childhood. In fact, there is probably a lag of several generations at least, and probably centuries, as can be seen in the long cycles of the histories of countries and empires. Therefore we already know that IQ is a path function and not a point function. Again, it is not a state function but a process function, where state and process are used in a more general sense than in computer science, psychology, or mental testing (see Appendix A.3). The earliest such models come from thermodynamics, and it is also from thermodynamics that we have the ideas of extensive vs. intensive properties of systems. The standard example of a path function is the length of a curve in a plane. The standard example in physics comes from thermodynamics, in which the heat rejected or the work done by a system depends on the path that the process took and is not a function of the end states of the process. We can thus surmise that intelligence will be a function of time and of the path that the environment variable, say E(t), took during this time.
Advantages of mathematical or analytical models (unambiguity, possibility of strict deduction, verifiability by observed data) are well known and make them highly desirable in systems engineering. This does not mean that models formulated in ordinary language (e.g. verbal descriptions) are to be despised. A verbal model is better than no model at all, or a model which, because it can be formulated mathematically, falsifies reality. Indeed, theories of enormous influence, such as Darwin's Theory of Selection, were originally verbal. Much of psychology, sociology and even economics today is still descriptive. Models in ordinary language, therefore, have their place in system theory. The system idea retains its value even where it cannot be formulated mathematically, or remains a "guide" rather than being a mathematical construct. Hubey (1979).
Thus, in addition to the problem that what we purport to measure, say verbal skill V, may be a function of mathematical or spatial skills M, or vice versa, we now have a bigger problem in the form of the function itself. The significance of this cross-dependence of variables is obvious if we consider the proposals that a richer environment is itself a function of genetics, i.e. I=f(E(G),G,t) (see, for example, Plomin and Bergeman (1991) for a review). Obviously, this is true on a global scale, and the derivatives of this function have to be calculated with the chain rule applied to the dependence E(G). More on this can be found in Appendix A.6 and in the conclusion section. We can approach this problem a little differently: if we have, say, some learning ability L, we can see that it will in all likelihood be a function of time, since the earliest years are the most important and in old age people can very rarely retain the elasticity of mind of their early years. Here, however, intelligence is a function of this learning ability, for which we use time t as a proxy. Furthermore, an enriched environment is essential, and the earlier this rich environment is provided the better, so that we can attempt to surmise the form of the functional dependence of intelligence on environment. We have I=f(G,E(t),L(t))=f(G,E(t),t), where we have accepted that t is a proxy for learning ability, and that E(t) and G are representatives of, or proxies for, the environmental and genetic variables, respectively.
We expect that f_{E}>0, since with a more enriched environment we expect increases in intelligence. Similarly, we expect f_{t}>0 if people are measured on the same scale, since problem-solving ability should increase with age (please see Appendix A.4). Note that we are not discussing IQ, which can be obtained from intelligence via normalization. The intelligence of people has been increasing over the past half-century or so. One of the reasons, of course, is that the environment itself has been changing. Not only has the educational level gone up, but the environment itself (i.e. the standard of living) has gone up, and thus children are being exposed to more things and are getting better care, both in health and in nutrition. Consequently, IQ tests measure knowledge [albeit claimed not to measure knowledge at all but some kind of "innate/genetic" capacity], but whether this quantity is an intensive variable, an extensive variable, or a product is not clear in the literature, since it has not been discussed with respect to any model except in terms of regression or correlation coefficients. The tests also measure a quantity which is a function of the path (the history of the individual in a particular changing environment). In other words, the words process and state are not to be understood only in the sense made popular by the emergence of the digital computer as the metaphor of choice among philosophers and scientists working in the intelligence/knowledge field, but rather in the more general sense which thermodynamics made popular. We should construct a path function for the dependence of intelligence on environment. The simplest path function is the length of a curve, and it is an integral. Taking a cue from this we may try a very simple function of this form for intelligence, for example

I(t) = ∫₀^t w(τ)E(τ) dτ (Equation 24)

where w(τ) is a weighting function.
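As a numerical sketch of this path dependence (the weighting function, time horizon, and the two environment histories below are hypothetical illustrations, not fitted to any data), two environments with the same endpoint but different timing accumulate different values of the path integral in Equation 24:

```python
import math

def intelligence_path(E, w, t, n=10_000):
    """Left Riemann approximation of I(t) = integral_0^t w(tau) E(tau) dtau."""
    h = t / n
    return sum(w(k * h) * E(k * h) for k in range(n)) * h

# Two environment histories with the same final value E = 1 but different paths.
rich_early = lambda tau: 1.0                        # enriched from birth
rich_late = lambda tau: 0.0 if tau < 5.0 else 1.0   # enriched only after age 5
weight = lambda tau: math.exp(-0.3 * tau)           # earlier years weigh more

I_early = intelligence_path(rich_early, weight, 10.0)
I_late = intelligence_path(rich_late, weight, 10.0)
```

Although both children end up in the same environment, the one enriched from birth accumulates the larger integral, which is exactly the point-function vs path-function distinction made above.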
However, it would be preferable to derive such a function from more basic considerations instead of producing it out of thin air, like the standard linear correlation-regression analysis. Otherwise we could be accused of behaving like the man who was looking for his lost keys in his yard because there was more light there. Even worse, we could be accused of behaving like the little boy who, given a toy hammer, discovered that everything looks like a nail. Since the process of acquiring intelligence (as measured in some fashion by standard intelligence tests) is a dynamic process, we should turn to differential equations. A first-order ordinary differential equation given by

dx/dt + a(t)x(t) = f(t) (Equation 25)

has the solution

x(t) = x(0)exp(−∫₀^t a(s)ds) + ∫₀^t exp(−∫_τ^t a(s)ds) f(τ)dτ (Equation 26)
We note that for a constant a it reduces to

x(t) = x(0)e^{−at} + ∫₀^t e^{−a(t−τ)} f(τ)dτ (Equation 27)

so that the Green's function (the exponential function which is the kernel of the integral) of the convolution integral is, in a sense, a weighting function, since it assigns an exponentially decreasing weight to the earlier forces that affect the system. In contrast, the weighting function that should be used for the effects of environment on intelligence should give greater weight to the earlier times, since it is now common knowledge that brain damage can occur even in the womb, due to the effects of simple things like smoking cigarettes. Since the fastest growth of the brain, as well as of the body, occurs during the early years, it is a foregone conclusion that it should be so. In this case, in addition to the fact that the basic model should be multiplicative, the environmental factor should be a path function, something that accounts for the effects of the environment at different phases of development.
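The equivalence between Equation 27 and direct integration of Equation 25 can be checked numerically; the following sketch (the forcing function and rate constant are arbitrary illustrative choices) compares the convolution with the Green's-function kernel against a simple Euler integration:

```python
import math

a = 0.5                      # constant decay rate
f = lambda t: math.sin(t)    # arbitrary forcing function

def x_convolution(t, n=20_000):
    # x(t) = integral_0^t exp(-a(t - tau)) f(tau) dtau, i.e. Equation 27 with x(0) = 0
    h = t / n
    return sum(math.exp(-a * (t - k * h)) * f(k * h) for k in range(n)) * h

def x_euler(t, n=20_000):
    # direct Euler integration of dx/dt = -a x + f(t), x(0) = 0
    h = t / n
    x = 0.0
    for k in range(n):
        x += h * (-a * x + f(k * h))
    return x
```

Both routes give the same trajectory, which is what makes the kernel interpretation as a weighting over past forcing legitimate.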
Clearly, the differential equation model, which has "feedback", can be modeled as a "black box", and we can find out from the inputs and outputs how the internal mechanism of the "black box" works. The black-box models in the social sciences are straw-man arguments, not against behaviorism but against science. This area of "identification" of the system (i.e. the black box) and prediction based on it is a well-developed science. The "black box" model above is "time-invariant" in that the parameters of the differential equation are not time-dependent. In the real world, the behavior of intelligent entities changes with time; it is a part of learning. These changes require nonlinear differential equations, and this topic is discussed briefly in Appendix A.6 (Meaning of Nonlinearity). We know that as children grow, not only do their bodies develop but so do their brains and minds. A simple example of growth is given by the differential equation

dy/dt = g(Y − y) (Equation 28)

where Y is the final (adult) size.
We assume that the growth rate of the child is proportional to the size yet to be achieved; that is, it grows faster when it is smaller, because the size yet to be achieved is much larger than in adolescence, when it has reached close to its adult height. The brain also grows at similar rates, and we can take this equation to be a simple model for the development of intelligence, for a start. As is well known, the solution consists of an exponential approach to a final constant value. The coefficient g controls the rate at which the child approaches its final adult height, and the child would grow faster for larger values of g. Of course, this is a simplified model and does not take into account the fact that there are spurts in growth rates around puberty. If anything, a large g would indicate a precocious child, especially if its intelligence were to increase at the same rate. At this point we need to consider other global effects in what to expect. On a large scale, as can be evidenced every day, we see that, except for humans, all other animals seem to have a limit of intelligence and capability/capacity beyond which they cannot advance. We already know that a multiplicative formulation is needed; therefore we need to combine this idea with the dynamics of intelligence. On a large scale over time we expect to see what is shown in Figure 9(a) and Figure 9(b).
It would seem on a global (large) scale that intelligence is definitely genetic. No dog will ever talk and no chimp will ever do calculus. So why do the statistical tests give results that intelligence is 40 per cent to 80 per cent genetic, instead of more like 99.99999 per cent? As a simple example of the kind of mathematical equation for the above we can try

x(t) = A_x(1 − e^{−kt}) (Equation 29)

The real question then becomes: exactly what kind of functions of heredity or environment are the parameters A_x and k? As can be seen from the plot and the equation, the parameter k determines how fast the intelligence of the subject increases in time toward its potential, which is apparently mostly genetically determined. But if we examine these plots at small scales (or higher resolutions), say only for humans, then we see something like this (again simplified). Of course, in reality both A_x and k fluctuate, and although the plots do not show them, a typical sample function could crisscross the others. In any case, now that we look at it at small scale (and high resolution), we have other means of interpreting the differential equation that gave rise to this solution

dx/dt = k(A_x − x) (Equation 30)
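A minimal sketch of Equations 29 and 30 (the parameter values are purely illustrative) shows how A_x sets the ceiling while k sets the speed of approach:

```python
import math

def intelligence_curve(t, A_x, k):
    # closed-form solution x(t) = A_x (1 - exp(-k t)) of dx/dt = k (A_x - x), x(0) = 0
    return A_x * (1.0 - math.exp(-k * t))

A = 1.0                                  # species ceiling on the [0, 1] absolute scale
slow = intelligence_curve(2.0, A, 0.3)   # small k: slow approach to the potential
fast = intelligence_curve(2.0, A, 1.5)   # large k: precocious, faster approach
late = intelligence_curve(50.0, A, 0.3)  # both eventually reach the same ceiling A_x
```

This is the picture in Figure 9(b): sample functions with different k values cross early but converge to the same genetically set limit.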
The rate of increase of intelligence is proportional to the intelligence yet to be achieved, with k being the constant of proportionality. The intelligence limit A_x is, on the whole, determined genetically, but it acts as a kind of attractor of expectation for the child. In other words, the difference (A_x − x) can easily be thought of as motivation. Does k then denote the genetically determined factor (i.e. a rate of increase)? Since we already have much evidence that this constant (it is not constant but varies) can be changed with more attention and greater quantity and quality of teaching and practice, it cannot be a purely genetic factor either. It could be genetically determined on the whole, but it is a factor of both heredity and environment. So then we are led toward the complete model, which could be of the form

I(t) = λE^{eε}G^{hη}(1 − e^{−λE^{ε}G^{η}t}) (Equation 31)
This is clearly the solution of the differential equation for intelligence

dI/dt = λE^{ε}G^{η}(λE^{eε}G^{hη} − I) (Equation 32)

which, although simple and linear, still has basically all the right ingredients to be a model of dynamical learning in the nature-nurture environment. From the solutions of first-order ODEs of this type (e.g. Equation 26), the coefficient of I(t) (i.e. λE^{ε}G^{η}) determines the rate of increase; therefore it is the part that represents the interaction of the environment with genetics. The limit intelligence (λE^{eε}G^{hη}) is achieved eventually, but if this were completely independent of the coefficients e and h it would mean that all this interaction has nothing to do with intelligence, and two other coefficients representing something else (i.e. e and h) determine the final components of intelligence. The important part is that the multiplicative interaction of G and E is modeled. Variations on this theme can be seen below and in Appendix A.4.
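A short numerical sketch (the parameter values are chosen only for illustration, not fitted to data) confirms that integrating Equation 32 reproduces the closed-form solution of Equation 31, with the interaction term λE^{ε}G^{η} controlling the rate and λE^{eε}G^{hη} the ceiling:

```python
import math

# illustrative parameter values on the [0, 1] absolute scale (not fitted to data)
lam, E, G = 1.0, 0.9, 0.8
eps, eta, e, h = 1.0, 1.0, 0.5, 0.5

rate = lam * E**eps * G**eta                 # coefficient of I(t): the E-G interaction
ceiling = lam * E**(e * eps) * G**(h * eta)  # the limit intelligence

def I_closed(t):
    # Equation 31 with I(0) = 0
    return ceiling * (1.0 - math.exp(-rate * t))

# Euler integration of Equation 32: dI/dt = rate * (ceiling - I)
I, dt = 0.0, 1e-3
for _ in range(int(20.0 / dt)):
    I += dt * rate * (ceiling - I)
```

The same multiplicative E-G interaction thus governs both how fast intelligence grows and where it saturates, which is the point of the model.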
Much ado about nothing?
In mathematical modeling, it is really the equations that talk. However, the meanings of these equations have been discussed throughout the exposition. By changing the scales to the natural (absolute) scales, by making the global intelligence/behavioral parameter multiplicative with the genetic factors, and by examining the behavior of this function in the neighborhood of human-level behavior, we can unify much of the work done on such topics. We can improve the model above by making appropriate changes, pointing out its relationship to the constraints that must be satisfied by such models, and connecting it to the standard analysis of such problems in the literature. We see clearly that Equation 32 is a simple linear (but dynamic) realization of the more general form

dI/dt = φ(E(t),t)(K(E,G) − I) (Equation 33)

where φ(E(t),t) is a path function, examples of which are given in Appendix A.4. The reasoning to obtain the differential equation was already given, but there are criteria/constraints that it must satisfy to be a good representation of the genetics-environment interaction. Since E is not constant, while heredity is fixed at conception (at least at present), a more general (and slightly different) version of Equation 32 is

dI/dt + λE^{ε}(t)G^{η}I = λ²E^{(1+e)ε}(t)G^{(1+h)η} (Equation 34)

which, by virtue of Equation 26, has the solution

I(t) = λ²G^{(1+h)η}∫₀^t exp(−λG^{η}∫_τ^t E^{ε}(s)ds) E^{(1+e)ε}(τ)dτ (Equation 35)

for I(0)=0.
There is yet more to the power that this simple linear differential equation hides. Integrating it once and rearranging terms, we obtain the integral equation

I(t) = K(t) − λG^{η}∫₀^t E^{ε}(τ)I(τ)dτ, where K(t) = I(0) + λ²G^{(1+h)η}∫₀^t E^{(1+e)ε}(τ)dτ (Equation 36)
The interpretation of this equation is exactly what is claimed by most researchers in the field, namely, that intelligence at time t, that is I(t), is a function of the past interaction of intelligence with the environment, summed up over time from time zero to the present time t. The K(t) term is also a multiplicative function of the environment-genetic interaction, and its position is reminiscent of the differential equation formulation, in that it seems to be some ultimate potential for a given environment and genetic makeup (for all humans) toward which all humans grow. Obviously, Equation 32 can easily be cast in integral form as above and has the same interpretation. This is a simple version of a more general integral equation given in Appendix A.4. A more convenient form of this equation (especially for purposes of testing, which is discussed in Appendix A.6) is also in Appendix A.4.
The solution of this integral equation [equivalent to the differential solution of Equation 35] shows that after a sufficient amount of time has elapsed, the transient effects wear off, so that the limit

I(∞) = λE^{e}G^{h} (Equation 37)

is reached [where we have made the change of variables eε→e and hη→h], which is the original multiplicative form that was posited for fundamental reasons valid for all intelligent organisms. Indeed, the complexity of the real world is much beyond what can be captured by these linear and deterministic equations. More on this train of thought can be seen in Appendices A.3 and A.4. Even if we did stick to these linear models, we would still have to consider that an ensemble of such equations would be needed, one for each person, with its own parameters. That would mean that we would have to consider the parameters E and G as random variables, thus turning the model into a stochastic process, for which we would compute the probability density p(I,t) of the process. For simple cases we can obtain solutions, for example in Appendix A.8. That such methods are the wave of the future for highly complex problems of the social sciences has been argued in detail in Hubey (1996), and examples of simple solutions can be seen in Hubey (1993) and complex ones in Helbing (1995). Since intelligence has been normalized to unity, Equation 37 is really another expression of the relationship of environment to genetics, which can be written as E^{e}G^{h} = const. In other words, the nonlinear formulation as in Equation 37 is not necessarily the only one, but rather an example; other formulations are possible (Equations 38-40).
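The ensemble view can be sketched by Monte Carlo simulation: treating E and G as random variables across individuals and examining the distribution of the limiting intelligence of Equation 37 (the ranges, exponents, and sample size below are hypothetical illustrations):

```python
import random

random.seed(42)

def limiting_intelligence(E, G, e=0.5, h=0.5, lam=1.0):
    # the limit lam * E^e * G^h of Equation 37 on the [0, 1] absolute scale
    return lam * E**e * G**h

# ensemble: each individual draws its own E and G on the natural scale
sample = [limiting_intelligence(random.uniform(0.5, 1.0), random.uniform(0.5, 1.0))
          for _ in range(100_000)]
mean = sum(sample) / len(sample)
var = sum((x - mean) ** 2 for x in sample) / len(sample)
```

Each individual follows its own sample path; the density p(I,t) mentioned above is simply the distribution of such an ensemble at a given time.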
Whatever the case, a linear approximation via the Taylor series in the neighborhood of human-level genetic endowment and cultural/environmental achievement will lead to the same linear approximation results. Since the linearity is obtained from an approximation, the numbers are only good in the neighborhood where E+G≈2 (see Appendix A.6). But then we have

δI ≈ eδE + hδG (Equation 41)

which leads inexorably to the conclusion

H² = h²δG²/(e²δE² + h²δG²) ≈ 0.5 (for e ≈ h and δE² = δG²) (Equation 42)

Since the genetic and environmental shares have to add up to the total variation of unity, it is not a surprise that the environmental variance or genetic variance hovers around 0.5 in studies (Rowe, 1994; Rushton, 1997; Plomin and Daniels, 1987; Plomin and Bergeman, 1991). Similar to the situation in economics (see Appendix A.4), this fact is really an indication that the socioeconomic and technological systems, and the educational systems that support them, are created via complex interactions, so that we (humanity) work near our optimal limits. It is interesting that similar results hold for the production functions of economic theory, for example the Cobb-Douglas type, which is multiplicative as here: the "share of capital" and the "share of labor" in the production process and function are each about 0.5. In the case of the production function, the numbers are merely a reflection of how the socioeconomic system is set up. For example, in LDCs, where machines must be bought from more developed countries, the "share of labor" is much less than the "capital's share". The reason that in advanced societies the shares of labor and capital are about equivalent merely reflects the fact that the machines in the production process are built by other workers in industry, and machine costs reflect the salaries and wages of those involved in producing these machines.
Furthermore, it is a sign of the power of workers, since if capitalists were really that powerful, they could conceivably pay pittances to all the workers and claim large shares of the profits for themselves. In a similar vein, the reason the heritability is about 50 per cent is really a reflection of the way tests are created, and is an indication of the importance that society attaches to various skills. It would be quite easy to create tests in which word recall and reading are rated low and mathematical-symbolic-logical capability is rated highly, and thus skew the results of tests so that the heritability results are even more skewed than they are now.
If these intelligence tests given today were given to our ancestors 5,000 years ago they probably would have scored about the same as those in less developed countries for most of what passes as intelligence is really knowledge of the shared environment which in advanced societies is shared through the popular mass media organs such as television, and propagated through our educational institutions. If anything, the scores less than 0.5 are a testimony to the unequal environment in our societies. Similar views have been voiced by others, for example very strongly by Lewontin (1975).
To show explicitly the effect of the nonlinearity on one of the infamous variables of cognitive ability studies, we can show that in the linear (additive) case, such as

I = αE + βG (Equation 43)

the heritability coefficient calculations are rather easy, since

δI = αδE + βδG (Equation 44)

and therefore

H² = β²δG²/(α²δE² + β²δG²) = β²/(α² + β²) (Equation 45)

The last step was obtained by assuming that the virtual variations (displacements) are equal, i.e. δE² = δG², and by assuming the cross-product term to be zero, as is usually done when the interaction term variance is ignored. We can do this because we want to know how much of the variation in I is due to unit variations in G and E. Behind this approximation are really stochastic differentials. In other words, we may alternatively treat the variations as random quantities and then average over the ensemble, in which case, assuming that the variations δE and δG are independent (which is the assumption used when ignoring V_{GE} in the standard ANOVA calculations), the cross-variation is zero. The result is clearly what is expected: the heritability is really the ratio of the variation due to genes to the total variation. What is hidden here but assumed is that both G and E are measured on the same ratio scale, since if it were not so, the equality of the small variations could not be assumed. For the nonlinear dependence of intelligence on both the environment and heredity, as given by

I = αE^{e}G^{γ} (Equation 46)

the variation/differential is

δI = αeE^{e−1}G^{γ}δE + αγE^{e}G^{γ−1}δG (Equation 47)
To compute the heritability as usually done, we would have to divide the variation due to genetics by the total variation. In this case we compute it (ignoring the cross-product, as is usually done) to be

H² = (αγE^{e}G^{γ−1})²δG² / [(αeE^{e−1}G^{γ})²δE² + (αγE^{e}G^{γ−1})²δG²] (Equation 48)

H² = γ²E²δG² / (e²G²δE² + γ²E²δG²) (Equation 49)

The result can be put in a final form which can be analyzed for semantics:

H² = 1 / (1 + (e²G²δE²)/(γ²E²δG²)) (Equation 50)

Clearly, it is now even more obvious that both G and E must be measured on a ratio scale (i.e. absolute scale), since now the ratio E/G shows up explicitly in the calculations. Simple analysis-of-variance calculations based on a linear (additive) model of the influence of genetics and environment, in which they can be used as substitutes for one another, are clearly false. The fact that only equations for variance are almost always used in the discussions, such as

V_P = V_G + V_E (Equation 51)

h² = V_G/V_P (Equation 52)

hides the fact that these ad hoc derivations rely on, and are based on, linear/additive models, as shown above. If E is measured on a scale, even a ratio scale, on which it is, say, 10 times larger than the scale on which G is measured, the product will make the contribution from the environment seem small, so that the heritability coefficient will get larger. To see the devastating effects of nonlinearity on computations of heritability as usually done, we can examine a simple case of nonlinearity in which I = αEG. Substituting e = γ = 1 into Equation 49 we obtain

H² = E²δG²/(G²δE² + E²δG²) = E²/(G² + E²) for δE² = δG² (Equation 53)
It is impossible under these conditions to claim that H² really measures heritability (the genetic component)! Obviously, to obtain the true heritability we must compute 1 − H². Furthermore, it can be shown that if the dependence of intelligence on the environment and genetics is I = exp(E^{e}G^{γ}), the results for H² (or h², as appropriate) are still the same as above. In the case of the dynamical equation, or its solution, the equation for h² still has the same form as above; therefore the heritability calculations based on the linear model are incorrect. Furthermore, such calculations need to be made on measurements based on a ratio scale, so that arbitrary scales for socioeconomic factors cannot be used. We can obtain similar results via Taylor series. Expanding I = EH about E₀ and H₀, we obtain

I ≈ E₀H₀ + H₀(E − E₀) + E₀(H − H₀) (Equation 54)

δI² ≈ H₀²δE² + E₀²δH² (Equation 55)

where I₀ = E₀H₀ is the average/normal human intelligence. Obviously, then, we can identify δH² = (H − H₀)² and δE² = (E − E₀)². It is clearer from these derivations that the previous analysis was basically the equivalent of analysis of variance, and thus the results have been demonstrated in a variety of ways.
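The contrast between Equations 45 and 53 can be made concrete with a few lines of code (the numerical values of E, G and the unit variations are illustrative):

```python
def H2_additive(alpha, beta, dE2, dG2):
    # Equation 45: I = alpha*E + beta*G gives H^2 = beta^2 dG^2 / (alpha^2 dE^2 + beta^2 dG^2)
    return beta**2 * dG2 / (alpha**2 * dE2 + beta**2 * dG2)

def H2_multiplicative(E, G, e, g, dE2, dG2):
    # Equation 49: I = alpha*E^e*G^g gives H^2 = g^2 E^2 dG^2 / (e^2 G^2 dE^2 + g^2 E^2 dG^2)
    num = g**2 * E**2 * dG2
    return num / (e**2 * G**2 * dE2 + num)

# additive model with equal weights and equal unit variations: exactly 0.5
h2_add = H2_additive(1.0, 1.0, 1.0, 1.0)

# I = alpha*E*G (e = g = 1): H^2 = E^2 / (E^2 + G^2), i.e. it grows with the
# ENVIRONMENT value, so 1 - H^2 is the true genetic share (Equation 53)
E, G = 0.9, 0.6
h2_mult = H2_multiplicative(E, G, 1.0, 1.0, 1.0, 1.0)
```

With a rich environment (E > G here), the quantity conventionally labeled "heritability" exceeds 0.5 even though it is driven by the environment value, which is the paradox described above.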
If these intelligence tests were given to our ancestors of 5,000 years ago, they would probably score about the same as the semiliterate peoples of the less developed countries, for much of what passes as intelligence is really knowledge of the shared environment, which in advanced societies is propagated through the popular mass media, such as television, and through our educational institutions. If anything, the wide variations on these tests are a testimony to the unequal environment in our societies. Similar views have been voiced by others, for example Lewontin (1975). In terms of the processes which give rise to such scores, there is basic agreement among many workers in the field, except for some lingering confusion. For a more detailed exposition of the ideas, see Hubey (1996). The fundamental concepts are shown in Table III. Most of the time, by "qualitative" people mean "not well understood", because many intensive variables are quite easily quantifiable. It is an unfortunate accident of history and sloganeering that a word like "quality" has come to be the basis of a word like "qualitative", which is used in opposition to quantitative to disparage the physical sciences. The most correct version of all of these is the intensive-extensive dichotomy (Hubey, 1996), which is what the psychological division of associative vs cognitive/conceptual signifies, as can easily be seen via extrapolation from the AI concepts of the knowledge base and the inference engine which operates on it. In humans both of these are stored in the brain, using neurons. The earlier some of the inferencing mechanisms (i.e. the intensive variable) are learned, the more they become a natural part of the human reasoning process, similar to talking and walking, and the more easily they are able to masquerade as intrinsic/genetic factors. The increase in the probability of finding similar problems already solved in memory (Section 2.3) greatly increases performance (Figure 5).
The earlier these are learned, the more efficient the brain organization for problem-solving (Figure 6). Therefore, more accurate measurement of these effects requires models in which time is explicit. This also explains why brain size does not correlate more strongly with intelligence tests. Problem-solving techniques, whether learned informally during early childhood or formally in school, present themselves in studies as "intelligence". It is for this reason that more difficult questions are not asked on such tests, especially of adults. Many people in the physical sciences and mathematics would score very high on such tests, but then the learned component would be very obvious to every researcher and layperson alike. However, when fundamental concepts learned early in childhood, which add to the efficiency of brain organization, are asked on such tests, we are instead left with "controversy". It is for this reason that some researchers have put forward ideas such as musical talent, body intelligence and the like. This argument misses the point if one can retort "are there music neurons?" or ask whether music is non-computational; clearly, music is also computational (Johnson, 1997). In the past, some researchers took refuge in such arguments as creativity and originality (in anti-AI arguments) and musical-kinetic intelligence (against the mathematical orientation of test questions). However, the validity of the argument stands if it is about the lack of natural dimensions and the weighting of the distance metric in n-dimensional space (Appendix A.1).
At this point in time, unless tests can be given which are stringently controlled and which explicitly take into account the nonlinear interaction of genetics and environment, there is not sufficient reason to attribute differences in performance on standardized tests to genetic differences, which is not to say that what the questions test for is not important to society. If, however, motivation is a key factor, one might ask why people with PhDs cannot learn two semesters' worth of calculus or physics over a period of 50-60 years.
In general, the differences between humans and other animals (say, chimps) in all measurable behavioral characteristics are likely differences of degree and not differences of kind, unless there are definite physiological constraints. This means that the interval [0,1] is the natural absolute scale of measurement, and the maximum will be achieved by our species, which provides for normalization at the upper end of the scale. Furthermore, the natural kind of relationship is multiplicative, which can still be tested using standard methods via logarithms, in which there is a trade-off between the order of magnitude and the nonlinearity of the logarithmic transformation. If Taylor series approximations are used to obtain linear relationships to be tested, the necessity of using the natural scales is obvious; otherwise the interactions of the different factors cannot be separated from each other.
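Testing a multiplicative relationship via logarithms, as suggested here, amounts to linear regression on log-transformed variables. The sketch below (synthetic, noise-free data with hypothetical exponents) recovers the exponents of I = E^{e}G^{h} by least squares on log I = e·log E + h·log G:

```python
import math
import random

random.seed(0)

# synthetic data generated from I = E^e * G^h with known exponents
e_true, h_true = 0.6, 0.4
data = []
for _ in range(500):
    E, G = random.uniform(0.5, 1.0), random.uniform(0.5, 1.0)
    I = E**e_true * G**h_true
    data.append((math.log(E), math.log(G), math.log(I)))

# least-squares fit of log I = e*log E + h*log G via the 2x2 normal equations
Sxx = sum(x * x for x, _, _ in data)
Syy = sum(y * y for _, y, _ in data)
Sxy = sum(x * y for x, y, _ in data)
Sxz = sum(x * z for x, _, z in data)
Syz = sum(y * z for _, y, z in data)
det = Sxx * Syy - Sxy * Sxy
e_hat = (Sxz * Syy - Syz * Sxy) / det
h_hat = (Syz * Sxx - Sxz * Sxy) / det
```

Because the model is multiplicative, the exponents become ordinary regression coefficients after the log transformation, but this only works if E, G and I are all measured on ratio scales, since logarithms of interval-scale quantities are meaningless.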
Figure 1. Pass/fail physical agility space: if we use only {0,1} we can have a discrete distance metric which we can use for binary pass/fail or {0,1} scoring
Figure 2. Color space: almost all colors can be produced additively via the three colors red, green, and blue (RGB). See Banks (1990)
Figure 9. (a) Increase in intelligence of various species after birth on an absolute intelligence scale. The slow increases up to some limit are typical of exponential curves (see Appendices A.3 and A.4). The initial condition is not really zero but is drawn this way for simplicity. (b) Variations in the parameters of the intelligence model for a given species. It is evident that both parameters A_x and k are functions of both heredity and environment. The arrows show the "track jumping" behavior of Head Start-type programs, in which a change in the environment puts the child onto a different sample function (i.e. path)
Figure 5. Effects of learning: the effect of learning is to make it possible to solve problems much faster and thus obtain higher scores
Figure 6. Effect of localization of memory and specialization: with early learning there is more efficient organization of the brain for certain types of tasks, leading to higher Ω than for late learning
Figure A1. The concept of distance
Figure A3. The ordinal vowel cube (Hubey, 1994)
Figure A4. The temperature scales: in physics the absolute temperature scale (Kelvin scale) must be used in order to be meaningful in equations involving ratios
Figure A5. The potential and the expert: in (a) the expert has more general knowledge than the expert in (b). The isoclines are types of knowledge
Figure A6. Difference in intelligence due to environment changes
Figure A9. Intelligence function/potential paradox: the linear relationship I=E+G−1 is approximate and valid only around E=G=1, therefore the paradox is more difficult to achieve or display. If, however, the nonlinear version is used, we can make use of the full scale
Table I
Table II
Table III
Figure 3. (a) Global pattern but lack of correlation at local scale: a situation where using an absolute (ratio) scale would yield correlation over a large scale, as expected, but fail over small scales (magnitudes). The figure is not drawn to scale but is only meant to be suggestive. The simulated data points would be much more closely clustered (horizontally) in real life. (b) The evolution of information content in genes and brains (after Britten and Davidson (1969); see also Sagan (1977)). Compare this to (a)
Figure 4 Memory levels: a simplified functional
view of memory needed for solving problems and the role of
learning
Figure 7 Realization of potential: the
potential (heredity) for speech is there for all humans, but if the
window of opportunity for learning language passes, language cannot be
learned. Meanwhile, although instruction (environment) is given to
animals, they cannot learn to speak. The dividing lines are
arbitrary and merely suggestive
Figure 8 Various “Black Box” models: it is
thought by some that the last two models, with a feedback loop from
the environment, are not part of the “black box” method of science
and cannot be handled by standard mathematical tools. Of course,
that is not how it is practiced in real life. See Appendix
17
References
Banks, S. (1990), Signal Processing, Image Processing and Pattern Recognition, Prentice Hall, Englewood Cliffs.
Britten, R.J., Davidson, E.H. (1969), "Gene Regulation for Higher Cells: A Theory", Science, Vol. 165 pp.349–57.
Chittick, W. (1983), The Sufi Path of Love: The Spiritual Teachings of Rumi, State University of New York Press, Albany.
Eccles, J. (1989), Evolution of the Brain: Creation of the Self, Routledge, New York.
Gardner, H. (1993a), The Unschooled Mind, Basic Books, New York.
Gould, S.J. (1981), The Mismeasure of Man, W.W. Norton, New York.
Gaffan, D., Weiskrantz, L. (1980), "Recency effects and lesion effects in delayed non-matching to randomly baited samples by monkeys", Brain Research, Vol. 196 pp.373–86.
Gazzaniga, M. (1985), The Social Brain, Basic Books, New York.
Helbing, D. (1995), Quantitative Sociodynamics: Stochastic Methods and Models of Social Interaction Processes, Kluwer Academic Publishers, Dordrecht.
Hubey, H.M. (1987).
Hubey, H.M. (1991a).
Hubey, H.M. (1993), "Psychosocioeconomic Evolution of Human Systems", Mathematical Modelling and Scientific Computing, Principia Scientia, Vol. 2 pp.320–5.
Hubey, H.M. (1994), Mathematical and Computational Linguistics, Mir Domu Tvoemu, Moscow.
Hubey, H.M. (1996), "Topology of Thought", CCAI: The Journal for the Integrated Study of Artificial Intelligence, Cognitive Science, and Applied Epistemology, Vol. 13 No.2–3, pp.225–92.
Hubey, H.M. (1997), "Logic, physics, physiology, and topology of color", Behavioral and Brain Sciences, Vol. 20 No.2, pp.191–4.
Hubey, H.M. (1998), The Diagonal Infinity: Problems of Multiple Scales, World Scientific, Singapore.
Hubey, H.M. (1999b), Mathematical Foundations of Linguistics, LINCOM EUROPA, Muenchen, Germany.
Herrnstein, R., Murray, C. (1994), The Bell Curve, The Free Press, New York.
Jerison, H.J. (1973), Evolution of the Brain and Intelligence, Academic Press, New York.
Kojima, K. (Ed.) (1997), Mathematical Topics in Population Genetics, Springer-Verlag, New York.
Klir, G., Yuan, B. (1995), Fuzzy Sets and Fuzzy Logic, Prentice-Hall, Englewood Cliffs.
Lewontin, R.C. (1975), "Genetic Aspects of Intelligence", Annual Review of Genetics, Vol. 9 pp.387–405.
Luce, R., Bush, R., Galanter, E. (Eds) (1963), Handbook of Mathematical Psychology, Vol. I, John Wiley & Sons, New York.
MacLean, P.D. (1973), A Triune Concept of the Brain and Behavior, University of Toronto Press, Toronto.
Maranell, G. (Ed.) (1974), Scaling: A Sourcebook for Behavioral Scientists, Aldine Publishing Company, Chicago.
Olton, D. (1983), "Memory functions and the hippocampus", in Seifert (Ed.), Molecular, Cellular and Behavioral Neurobiology of the Hippocampus, Academic Press, New York, pp.335–73.
Papoulis, A. (1984), Probability, Random Variables, and Stochastic Processes, McGraw-Hill, New York.
Plomin, R., Daniels, D. (1987), "Why are children in the same family so different from one another?", Behavioral and Brain Sciences, Vol. 10 pp.1–60.
Plomin, R., Bergeman, C.S. (1991), "The nature of nurture: genetic influence on environmental measures", Behavioral and Brain Sciences, Vol. 14 pp.373–427.
Rasch, G. (1980), Probabilistic Models for Some Intelligence and Attainment Tests, University of Chicago Press, Chicago.
Roughgarden, J. (1979), Theory of Population Genetics and Evolutionary Ecology: An Introduction, Macmillan, New York.
Rowe, D. (1994), The Limits of Family Influence, Guilford Press, New York.
Rushton, J.P. (1997), Race, Evolution, and Behavior, Transaction Publishers, New Brunswick.
Sagan, C. (1977), The Dragons of Eden: Speculations on the Evolution of Human Intelligence, Ballantine Books, New York.
Shrager, J., Hogg, T., Huberman, B.A. (1988), "A graph-dynamical model of the power law of practice and the problem-solving fan-effect", Science, Vol. 242 pp.414–6.
Tricomi, F.G. (1985).
Zadeh, L. (1963).
Zadeh, L. (1978).
Zadeh, L. (1987), "A Computational Theory of Dispositions", International Journal of Intelligent Systems.