Saturday, April 02, 2016

Heritability, correlation and prediction

We're told that intelligence is 60-80% heritable, and that personality is 40-60% heritable. In some hand-wavy way, we know that heritability captures the nature side of the nature-nurture contribution to traits.

But what does heritability really mean? It's a rather slippery concept. We'll get there by stages.

1. The contribution of genes to a phenotype

Let's take height as our running example (it has pretty much the same heritability as intelligence). Consider a person with height P (P stands for phenotype - the measured trait), where P is measured in inches away from the population mean height.

How did a person get to be that height? Nature and nurture, right?

We assume that the alleles the person inherited from their father contribute Xfather inches of height, that the alleles inherited from their mother contribute Xmother inches, and that there is a nurture - or environmental - term of E inches. So their total height,
P = Xfather + Xmother + E.
Note that these are additive genetic effects: each allele is assumed to make its own independent contribution, raising or lowering X by a small amount. Dominance and epistatic effects are neglected in this simplified conceptual model (in a polygenic trait, they tend not to be large).

Since we're measuring deviations from the mean, the average values across the population of Xfather, Xmother and E must all be zero, and so must the average value of P.

So, as a modelling assumption, we take Xfather, Xmother and E to be normally distributed random variables with mean zero and variances as follows:
Var(Xfather) = Vadditive/2    -- each parent provides half the additive genetic 'input'

Var(Xmother) = Vadditive/2   -- each parent provides half the additive genetic 'input'

Var(E) = Venvironment.
So what is Var(P), the variance of height as we observe it in the population?
Var(P) = Var(Xfather) + Var(Xmother) + Var(E)
         + 2Cov(Xfather, Xmother) + 2Cov(Xfather, E) + 2Cov(Xmother, E).
Messy, but if we assume Xfather, Xmother and E are independent, their covariances are zero, so
Var(P) = Var(Xfather) + Var(Xmother) + Var(E),
that is,
Vphenotype = Vadditive + Venvironment.
The fraction of the population phenotypic variation due to genetic, additive effects is then simply
h2 = Vadditive/Vphenotype = Vadditive/(Vadditive + Venvironment).
This is the definition of heritability, h2.

So if h2 is 0.5, then 50% of the variance in the phenotype is genetic in origin (additive-genetic, that is) and 50% is environmental (everything else).
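As a rough numerical check of this definition, here is a minimal Python sketch that simulates P = Xfather + Xmother + E with independent normal terms and estimates h2 from the simulated variances. The function names, the sample size and the example variances (6 and 4 square inches) are illustrative assumptions, not values from the book.

```python
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def simulate_heritability(v_additive, v_environment, n=100_000, seed=0):
    """Estimate h2 from simulated phenotypes P = Xfather + Xmother + E."""
    rng = random.Random(seed)
    sd_parental = (v_additive / 2) ** 0.5  # each parent supplies Vadditive/2
    sd_env = v_environment ** 0.5
    genetic, phenotype = [], []
    for _ in range(n):
        x_father = rng.gauss(0.0, sd_parental)
        x_mother = rng.gauss(0.0, sd_parental)
        e = rng.gauss(0.0, sd_env)  # independent of the genetic terms
        genetic.append(x_father + x_mother)
        phenotype.append(x_father + x_mother + e)
    # h2 = Vadditive / Vphenotype, both estimated from the sample
    return variance(genetic) / variance(phenotype)


# Illustrative values only: Vadditive = 6, Venvironment = 4, so h2 should be near 0.6
h2_estimate = simulate_heritability(v_additive=6.0, v_environment=4.0)
print(round(h2_estimate, 2))
```

With these inputs the estimate should land close to 6/(6 + 4) = 0.6; shrinking v_environment while leaving v_additive alone pushes the estimate towards 1.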

Note that the more you reduce environmental variance, for example by making sure that everyone's well-fed, properly educated and not knocked about, the smaller Venvironment becomes in the denominator of h2, so genetic differences predominate and heritability goes up. Not what the SJWs really want to hear!

---

2. Correlations

What is the correlation, ρ, between a parent and child for height?

If we have two random variables, A and B, the correlation between them is defined as follows:
ρ =  Cov(A,B)/√(Var(A) * Var(B)).
This is the standard definition.

In the case of one parent and their offspring, under some simplifying assumptions,
Cov(parent,offspring) = Vadditive/2
- this takes a few lines to work out, setting most of the Xfather, Xmother and E cross-terms to zero. It reflects the 50% of genetic material they have in common.

More obviously,
Var(parent) = Var(offspring) = Vphenotype,
so using the formula for ρ above,
ρ = (Vadditive/2) / Vphenotype = h2/2.
This shows that heritability is not the same as the correlation between a child and one of its parents.

In general, under this additive model, the correlation, ρ, on a trait between relatives is equal to the coefficient of relatedness, r, times the heritability, i.e. ρ = r * h2.
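The h2/2 result for one parent and their offspring can be checked with a small simulation. The sketch below is only one way to set it up, and it rests on assumptions not spelled out in the text: parents mate at random, the child's additive genetic value is the average of the two parents' values plus an independent Mendelian segregation term of variance Vadditive/2, and all environments are independent. The function names are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation: Cov(x, y) / sqrt(Var(x) * Var(y))."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5


def simulate_parent_offspring(h2, v_phenotype=1.0, n=100_000, seed=1):
    """Correlation between one parent's and the child's phenotype."""
    rng = random.Random(seed)
    v_a = h2 * v_phenotype
    sd_a = v_a ** 0.5
    sd_e = (v_phenotype - v_a) ** 0.5
    sd_seg = (v_a / 2) ** 0.5  # ASSUMPTION: segregation variance Vadditive/2
    fathers, children = [], []
    for _ in range(n):
        a_father = rng.gauss(0.0, sd_a)
        a_mother = rng.gauss(0.0, sd_a)  # random mating: independent of father
        a_child = 0.5 * (a_father + a_mother) + rng.gauss(0.0, sd_seg)
        fathers.append(a_father + rng.gauss(0.0, sd_e))
        children.append(a_child + rng.gauss(0.0, sd_e))
    return pearson(fathers, children)


rho = simulate_parent_offspring(h2=0.8)
print(round(rho, 2))
```

With h2 = 0.8 the simulated correlation should come out near 0.4, i.e. roughly half the heritability rather than the heritability itself.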

---

3. Predictions

If we know the height of both the parents, what's our best prediction of the height of their offspring? In our mind, we draw the best-fit regression line on the scatter-plot of parental-midpoint and offspring heights measured across the population.

If we centre the graph-axes at the mean values of the two populations (parental mid-point heights and offspring heights) then the regression line goes through the origin, with slope β. Then the equation of the regression line takes this simple form:
predicted-offspring-height = β * parental-midpoint-height
with both heights measured as inches in deviation from the respective means.

How do we compute β?

In this special case it turns out that β equals the heritability: β = h2. *

This should remind you of the Breeder's Equation.

---

Example: suppose the heritability of height is 0.673, one parent is 3 inches above the population mean and the other is 1 inch above it. What's the predicted (expected) height deviation from the mean for their child?
Answer: predicted-offspring-height = β * (3 + 1)/2 = 2 * h2 = 2 * 0.673 ≈ 1.35 inches.
Yes, the child has regressed towards the mean: the prediction is below the parental midpoint of 2 inches.
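The same arithmetic as a tiny Python helper, for trying other values. The function name and argument names are made up for illustration; the only thing it encodes is the regression line with slope β = h2.

```python
def predict_offspring_deviation(h2, father_dev, mother_dev):
    """Predicted offspring deviation from the mean, in the parents' units.

    Uses the midparent regression with slope beta = h2; all inputs and the
    result are deviations from the respective population means.
    """
    midparent = (father_dev + mother_dev) / 2
    return h2 * midparent


# The worked example: h2 = 0.673, parents at +3 and +1 inches
prediction = predict_offspring_deviation(0.673, 3.0, 1.0)
print(f"{prediction:.2f} inches above the mean")  # 1.35 inches above the mean
```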

---

This is problem 6.3 (p. 149) from 'Population Genetics: a concise guide' by John H. Gillespie, from which all the material above has been summarised.

---

* In general, β = ρ * (σy/σx), where x is the independent variable and σx, σy are the standard deviations of x and y.
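For readers who want to see why the slope comes out as h2 for the parental midpoint, here is a short derivation sketch. It uses the equivalent form β = Cov(x, y)/Var(x), the parent-offspring covariance Vadditive/2 from section 2, and the extra assumption that the two parents' phenotypes are uncorrelated (random mating). The symbols P_f, P_m, P_o and the bar for the midpoint are notation introduced here, not taken from the book.

```latex
\begin{align*}
\bar{P} &= \tfrac{1}{2}\,(P_f + P_m) \\
\operatorname{Var}(\bar{P})
  &= \tfrac{1}{4}\bigl(\operatorname{Var}(P_f) + \operatorname{Var}(P_m)\bigr)
   = \tfrac{1}{2}\,V_{\text{phenotype}} \\
\operatorname{Cov}(\bar{P}, P_o)
  &= \tfrac{1}{2}\bigl(\operatorname{Cov}(P_f, P_o) + \operatorname{Cov}(P_m, P_o)\bigr)
   = \tfrac{1}{2}\,V_{\text{additive}} \\
\beta &= \frac{\operatorname{Cov}(\bar{P}, P_o)}{\operatorname{Var}(\bar{P})}
   = \frac{V_{\text{additive}}/2}{V_{\text{phenotype}}/2}
   = h^2
\end{align*}
```

Averaging the two parents halves the variance of the predictor but leaves the covariance with the child at Vadditive/2, which is why the slope is h2 here while the single-parent correlation is only h2/2.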
