05 January 2016

The Law Of Averages And Genetic Models

A phenotype is an observable and measurable quality of a person or organism, such as hair color, eye color, lactase persistence (i.e. the ability to digest cow's milk as an adult), IQ, height, introversion, sexual orientation or schizophrenia.

A whole host of phenotypes have been determined to have some genetic component through means like twin studies, studies of the correlation of the phenotype with genetic relatedness, adoption studies, and so on. These methods can rule out or confirm the hypothesis that a phenotype has no genetic component, and can give a mildly model-dependent estimate of the relative importance of genetic and non-genetic factors in causing the phenotype to arise.

But knowing that a phenotype has a genetic component doesn't give you a genetic model to look for when you sequence genes.

To build such a model, you would like to know a few things.

You would like to know if almost all instances of the phenotype are due to a small number of common variants as in hair and eye color, or a small number of rare variants as in achromatopsia (a rare genetically caused form of color blindness), or a large number of rare variants as in the case of schizophrenia (more than 540 variants), or a large number of common variants as in the case of IQ.

One would also like to know how many genes in any given person are causing the phenotype. Lactase persistence, for example, is caused mostly by one or two genes at a time, even though the gene mutations in question aren't precisely the same in Africa, Europe and Asia. IQ and height, in contrast, seem to be caused by the combined action of scores or hundreds of genes at the same time in the same person, each of which contributes to the overall cumulative effect.

The law of averages can help with this task in principle. Traits determined by many genes of roughly equal importance should show less variation from generation to generation, because the many small effects tend to average out. Traits determined by fewer genes should show more intergenerational variation, because large-effect variants leave an effect that won't be averaged out.
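To make the averaging argument concrete, here is a rough simulation sketch. It is a toy model, not a real genetic analysis, and everything in it is an assumption made for illustration: every gene has two equally common alleles, all genes matter equally, the trait is simply the fraction of trait-raising alleles a person carries (so its possible range is the same no matter how many genes there are), and there are no environmental effects. The function name offspring_deviation_sd is made up for this example. It measures how far children land from the average of their two parents.

```python
import random
import statistics


def offspring_deviation_sd(n_genes, n_families=1000, seed=0):
    """Spread (SD) of child trait minus midparent trait in a toy additive model."""
    rng = random.Random(seed)
    deviations = []
    for _ in range(n_families):
        mother_total = father_total = child_total = 0
        for _ in range(n_genes):
            # Each parent carries two alleles per gene; 1 = trait-raising allele.
            mother = (rng.randint(0, 1), rng.randint(0, 1))
            father = (rng.randint(0, 1), rng.randint(0, 1))
            mother_total += sum(mother)
            father_total += sum(father)
            # The child inherits one randomly chosen allele from each parent.
            child_total += rng.choice(mother) + rng.choice(father)
        # Trait = fraction of trait-raising alleles, so it always lies in [0, 1].
        scale = 2 * n_genes
        midparent = (mother_total + father_total) / (2 * scale)
        deviations.append(child_total / scale - midparent)
    return statistics.pstdev(deviations)


if __name__ == "__main__":
    for n in (2, 20, 200):
        print(n, "genes:", round(offspring_deviation_sd(n), 4))
```

Under these assumptions the spread of children around their parents' average shrinks roughly in proportion to one over the square root of the number of genes, so a hundredfold increase in gene count cuts the spread about tenfold. Note that this depends on the trait being scaled as an average of per-gene contributions; real data would also have to contend with measurement error and environmental effects.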

Quite a few mathematical considerations go into determining how many genes are involved from observed phenotypes, most importantly the difficulty of properly measuring the phenotype and excluding environmental effects. But, in principle, it ought to be possible to get a good order-of-magnitude estimate of how many genes contribute to missing heritability in most cases.
