07 January 2011

How Smart Were Archaic Humans?

Until 2010, we had to draw almost all of our conclusions about how smart archaic humans were from bones (mostly skull volume relative to body size) and artifact sophistication and evolution. Now that we have ancient DNA from two kinds of archaic humans, with a third kind of archaic human DNA in the process of being analyzed, we can also look at that ancient DNA in order to make inferences about their cognitive abilities. This post looks at the first couple of insights that have arisen from looking at the content of those ancient DNA samples.

We now have DNA samples of two kinds of archaic humans (Neanderthals and Denisovans) that we can compare to the DNA of modern humans and of our closest living primate relatives, chimpanzees and bonobos.

Mostly, state of the art science can only tell us how statistically similar or dissimilar those DNA sequences are, but not what they do. But the function of some genes thought to be critical to modern human intellectual capacity is known. John Hawks examines two of them, microcephalin (MCPH1) and FOXP2, from the published genomes at his blog.

Microcephalin (MCPH1) is associated with brain development in some manner. FOXP2 is associated strongly with language and speech and sound processing ability in humans, songbirds, mice and bats.

For FOXP2, the Neanderthal and Denisovan DNA sequences align with most modern humans, rather than with the chimpanzees. So it appears that this gene evolved in a common ancestor of all of these hominin types (although there are other ways this could happen that aren't entirely ruled out) and that all of them would have had superior language abilities to our closest modern primate relatives.

For MCPH1, the Neanderthal and Denisovan DNA sequences appear to align with chimpanzees, rather than with most modern humans, so this appears to be a mutation that took place much later in time.

The caveat to both findings is that either mutation could have been present in some percentage of, but not all, archaic humans, and that the results from looking at a very small number of ancient DNA samples could simply be statistical flukes.

About MCPH1

While we have a reasonably clear idea about how the FOXP2 gene expresses itself phenotypically, the manner in which MCPH1 is related to anything about a person that you could observe without cutting them open or doing invasive testing is less obvious, except in the case of a particular combination of alleles of that gene that produces greatly diminished brain size. As Wikipedia explains (citations omitted):

Microcephalin (MCPH1) is one of six genes causing primary microcephaly when non-functional mutations exist in the homozygous state. Derived from the Greek words for "small" and "head", this condition is characterised by a severely diminished brain. Hence it has been assumed that variants have a role in brain development, but in normal individuals no effect on mental ability, brain size or behavior has been attributed to either this or another similarly studied microcephaly gene, ASPM. . . MCPH1 is expressed in the fetal brain, in the developing forebrain, and on the walls of the lateral ventricles. Cells of this area divide, producing neurons that migrate to eventually form the cerebral cortex.

Actually, I'm understating the point here. The role of MCPH1 in intelligence is wildly controversial and contains multiple puzzles.

The amount of genetic diversity in the different versions of the MCPH1 gene suggests that five of the six versions have origins about 1.7 million years ago, but that one subtype, called haplogroup D, has much more recent origins, estimated at 14,000 to 60,000 years ago, although mutation rate estimates are very rough and problematic at any significant degree of detail. Moreover, the haplogroup D variation and a similarly young variation in the ASPM gene seem to be increasing in frequency.

This increase in frequency suggests that the younger haplogroups of these respective genes confer some selective advantage. Still, they could simply be chance fellow travelers with a population with a selective advantage totally unrelated to these particular genes. Also, the genetic advantage associated with MCPH1, if there is one, wouldn't necessarily have to be related to intelligence or brain function at all, since many genes have multiple impacts on human development. For example, if MCPH1 haplogroup D played a role in metabolizing domesticated foods in one of the many ways that domesticated foods differ generally from foods obtained from hunting and gathering, natural selection might favor it for reasons totally unrelated to its role in brain development.

But, generally, genes that increase in frequency in a population tend to be associated somehow or another, either causally, or coincidentally, with some observable trait or factor that confers selective advantage. So, even though we aren't sure precisely what good thing MCPH1 haplogroup D does for a person, the evidence that it is being selected for suggests that it does do something good, even if we're not really sure just what that good thing could be.

Part of the reason that the role of MCPH1 is controversial, particularly to the extent that it might be associated with a trait that we might care about in determining social status, like intelligence (or any trait that confers generalized selective advantage for that matter), is that "the D haplogroup is common in Europe and Asia, but is very rare in Africa." There is also evidence of "an ancient Asian allele [in MCPH1] being retained in living Asians."

The absence of MCPH1 D in Neanderthals and Denisovans, from whom it might have introgressed into modern humans, together with its predominantly Eurasian distribution, suggests that the MCPH1 haplogroup D specific mutations, like the mutations that distinguish mtDNA macro-haplogroups M and N from their ancestral L3 haplogroup, and like the mutations that distinguish Y-DNA macro-haplogroup CF from more basal Y-DNA haplogroups A and B, may have been mutations specific to the Eurasian founder group, either as a distinct African subpopulation (probably East African) or upon leaving Africa, perhaps in a "Eurasian Eden."

This too, however, warrants a caveat. Unlike a neutral trait, a trait with a selective advantage, given enough generations for selection to do its work, will soon predominate in the general population even if it arrives through only slight introgression. So, if Eurasia were effectively isolated from Africa for many generations, including the period when a selectively advantageous mutation arose somewhere in Eurasia that was less isolated from the rest of Eurasia than it was from Africa, one might expect the advantageous mutation to become common across Eurasia regardless of its source within Eurasia.
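To get a feel for how quickly a rare but advantageous variant can take over, here is a minimal sketch in Python. It is my own illustration rather than anything from the studies discussed: it assumes the simplest textbook model (one large, randomly mating population, haploid selection, no drift, no migration), and the function names and the selection coefficients of 1% and 5% are made-up values chosen only for illustration.

```python
def next_frequency(p, s):
    """One generation of haploid selection.

    p is the current frequency of the advantageous variant and s is its
    (hypothetical) selective advantage: carriers leave 1 + s offspring
    for every 1 left by non-carriers.
    """
    return p * (1 + s) / (1 + p * s)


def generations_to_reach(p0, target, s, max_generations=1_000_000):
    """Count generations until the variant's frequency reaches `target`."""
    p = p0
    generations = 0
    while p < target:
        if generations >= max_generations:
            raise RuntimeError("target frequency not reached; is s positive?")
        p = next_frequency(p, s)
        generations += 1
    return generations


if __name__ == "__main__":
    # Start from a variant carried by 1 in 1,000 (slight introgression).
    for s in (0.01, 0.05):
        n = generations_to_reach(p0=0.001, target=0.5, s=s)
        print(f"advantage {s:.0%}: majority after {n} generations")
```

Under these assumptions a 1% advantage takes the variant from one in a thousand to a majority in roughly 700 generations, and a 5% advantage in roughly 140. Real populations are messier (drift can easily eliminate a rare variant before selection takes hold), so treat this as an order-of-magnitude intuition only.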

MCPH1 and Tonality

"Modern distributions of chromosomes bearing the ancestral forms of MCPH1 and ASPM are correlated with the incidence of tonal languages, but the nature of this relationship is far from clear." The association of tone and this gene is particularly surprising, as tonality in language tends to be an areal feature shared by languages in the same geographic area even if they are part of different language families. Indeed, tonality in language seems to replace other linguistic features of languages in fairly orderly and predictable ways - to some extent tone in a language is simply one more way in which different phonemes (i.e. vowels and consonants) are created, and languages with tones may have fewer phonemes.

The association is also surprising because it seems to be present in both of the Old World regions where tonal languages are common, one in parts of Southeastern Asia and one in parts of Africa.

It could be an association that exists for profound causal reasons, or it could be a fluke (or due to incomplete tone language data). Consider that the study that found this relationship compared twenty-six language features to the distributions of these genes, so a coincidence uncommon enough to appear only 4% of the time would be expected to appear about once by random chance. Even greater coincidences are plausible, since language, geography and population genetics are clearly not independent variables.
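The arithmetic behind that multiple-comparisons worry can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not a reanalysis of the study: it assumes, unrealistically, that the twenty-six comparisons are independent, and the function names are my own.

```python
def expected_chance_hits(num_comparisons, chance_rate):
    """Average number of comparisons that clear the bar by luck alone."""
    return num_comparisons * chance_rate


def prob_at_least_one_hit(num_comparisons, chance_rate):
    """Probability that at least one comparison clears the bar by luck,
    assuming the comparisons are independent of one another."""
    return 1 - (1 - chance_rate) ** num_comparisons


if __name__ == "__main__":
    features = 26       # language features compared, per the text above
    chance_rate = 0.04  # a coincidence that shows up 4% of the time
    print(f"expected chance hits: {expected_chance_hits(features, chance_rate):.2f}")
    print(f"P(at least one): {prob_at_least_one_hit(features, chance_rate):.2f}")
```

With these inputs the expected number of chance hits is about one, and the probability of seeing at least one is roughly 65%. Because the real comparisons are correlated rather than independent, the true figures could differ in either direction.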

More Caveats On MCPH1, Race and Intelligence

Differing allele frequencies of genes like MCPH1, or of any other gene, in given populations are not subject to serious dispute in the kinds of studies being relied upon, so long as the sample sizes are sufficiently large. One can, with enough time and money, determine allele frequencies in a population to any desired degree of accuracy, and a pretty modest sample size produces a quite accurate result.
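How modest is "modest"? A rough sketch in Python, using the ordinary binomial standard error for a proportion. The numbers (500 people sampled, a true allele frequency of 30%) are hypothetical values I picked for illustration, and the calculation assumes a random sample in which each person contributes two independent copies of the gene.

```python
import math


def allele_frequency_standard_error(true_frequency, num_people):
    """Binomial standard error of an estimated allele frequency.

    Each diploid person contributes two copies of the gene, so the
    effective sample is 2 * num_people chromosomes (an assumption that
    ignores relatedness and population structure within the sample).
    """
    num_chromosomes = 2 * num_people
    variance = true_frequency * (1 - true_frequency) / num_chromosomes
    return math.sqrt(variance)


if __name__ == "__main__":
    se = allele_frequency_standard_error(true_frequency=0.30, num_people=500)
    margin = 1.96 * se  # approximate 95% margin of error
    print(f"standard error: {se:.4f}")
    print(f"95% margin of error: +/- {margin:.3f}")
```

Under those assumptions, sampling 500 people pins the frequency down to within about three percentage points either way, which is ample for telling a "common" allele from a "very rare" one.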

This is a very different situation than the one in which one makes comparisons, for example, of IQ between different populations, where the particular test or method used to make the measurement may have cultural biases, and where the extent to which the phenotype measured by the test in various populations is due to nature or nurture is often very hard to determine. Even the gold standard of twin studies won't help much if there is an environmental difference between the society in which you do the twin studies and another society that affects the phenotype. For example, if everyone in Rome drinks from lead pipes, and everyone in Paris drinks from copper pipes, the effect of lead on IQ would not be revealed by a twin study of twins in Rome.

If genes are a fairly minor part of a phenotype (even if they account for most of the intrapopulation variance because the relevant part of the environment is similar for almost everyone in the population), then genotype differences won't produce big population level phenotype differences, although the differences between populations might be statistically significant. If genes are a fairly important part of the phenotype, in contrast (even if environment accounts for most of the intrapopulation variance because almost everyone in the population shares the relevant genes), then there should be very significant and noticeable population level differences that track genotype differences between the populations.

When you have accurate information about genotypes (gene allele varieties) in a population, the link between population level genotypes and population level frequencies of phenotypes (visible traits) depends entirely upon the strength of the link between the genotype and the phenotype. When the causal link between genotype and phenotype is very strong, e.g. the link between the "red hair gene" and "red hair," then genotype frequency information can tell you a lot about population level differences. When the causal link between genotype and phenotype is weak, e.g. between Y-DNA haplogroup R1a and fluency in an Indo-European language, even strong coincidences of the two may tell you more about coincidences due to ancestry than about any biological effect of the gene itself.

Also, even if there is a perfectly established causal link between a genotype and a phenotype, if more than one gene contributes to the phenotype, knowing genotype frequencies for only one of those genes will give an incomplete and possibly misleading picture. For example, most lactose tolerance in large parts of the world is attributed to a couple of known gene mutations. But there are lactose tolerant populations in Africa that receive their lactose tolerance from some other gene. In some subcontinental sized parts of the world, genotype is an almost perfect predictor of lactose tolerance, but in others, knowing a person's genotype is almost useless because there is another gene out there that we haven't discovered yet that produces the same trait.

Indeed, as John Hawks has explained in some posts at his blog, there is actually a mathematical formula that explains how population differences in genotype develop when there is more than one kind of gene that confers the same kind of benefit to a greater or lesser degree and the genes start in different places at different times. Thus, one reason that a selectively beneficial gene could spread in Eurasians, but not in Africans, could be that Africans have some other gene that does something very similar, making the Eurasian gene less selectively advantageous there than it is in populations where neither advantageous gene is present.

Basically, the formula provides that advantageous genes are passed on to neighboring populations at a rate proportionate to the advantage conferred until they reach a boundary where a neighboring population has some other similarly advantageous gene. At that point the expansion of the more beneficial gene slows greatly, because the rate of further expansion is driven by the net benefit of the better gene over its rival, rather than by its total benefit.

Despite all of these caveats, it remains the case that in the next decade or two, we will have large, statistically robust data sets on the complete genotype of almost every gene in every reasonably large population. And, over a longer period of time, which will probably be several generations, we will learn much more about what particular genes do in a much more reliable way based on biological function, as opposed to the black box statistical associations and hunches that we rely on now.

I wouldn't be surprised, for example, if it would be possible to take a blood sample from my great-grandchild when (and if) he or she is born, drop it into a machine, and learn within the day from that test the extent to which 90%+ of the hereditary components of IQ and personality and mental illness and all known hereditary disease risks are present in that child, together with a sidebar in each category providing the latest information on the extent to which each of those traits is a function of nature v. nurture. For better or worse, parents may have a much better idea what to expect from their children at that point. Some people would probably intentionally refrain from finding out this information, but I know that I wouldn't have that strength of will if the information were easily and cheaply available to me.

The hard question is what people would do with that information, either at a population level, or in individuals, if they had it. One plus of genotyping compared to other forms of population level stereotyping is that it is much more obvious that there are exceptions and that the individual rather than the average is what matters.

Footnote on the Prediction of Ancient Admixture

Also looking prescient was a 2006 finding that concluded "that around five percent of human genes show some evidence for introgression from archaic humans. Their statistical test was looking for loci with ancient divergence times and in particular divergent alleles centered in Eurasian (non-African) populations." This is surprisingly close to the estimate of the percentage of archaic human DNA in modern humans based on direct comparisons of the Denisovan and Neanderthal genomes to those of modern humans, although the ancient DNA confounds the original hypothesis that MCPH1 itself may have been archaic human in origin.
