Some notes on assortative mating
Simple derivations and implications for between and within family variance.
Definitions
When spouses pair up based on an observed phenotype — say height or education — this is called “direct” assortative mating. Other forms of phenotypic similarity between mates are also possible, as recently discussed in [Sunde et al. 2024 pre-print] and the figure below, but we will focus on direct assortative mating here.
If the phenotype that is being sorted on is heritable (as all phenotypes are to some extent), direct assortative mating induces correlation between otherwise independent alleles in offspring. This process operates over multiple generations until it reaches an equilibrium after ~10 generations (with most of the correlation accumulating within the first ~5).
Excess correlation across sites also increases the genetic variance in the population relative to what it would be under random mating with the exact same direct effects. If the environmental variance is fixed, that means a trait effectively becomes more “heritable” (here defined as the ratio of genetic variance to total phenotypic variance in a population) solely due to cultural structure. If assortative mating stops, the built-up correlation collapses within a few generations of random mating and heritability “decreases”. Phenotype causes genes!
One generation of assortative mating increases the additive genetic variance by a factor of:

$$1 + \frac{r\,h^2}{2}$$
Where r is the phenotypic correlation for mates and h2 is the heritability under random mating (this is important as we will now introduce a second heritability term). Writing $\sigma^2_{g,0}$ for the genetic variance under random mating, over multiple generations (for a trait with a reasonably large number of causal variants) the genetic variance converges to an equilibrium value $\sigma^2_{g,eq}$ of:

$$\sigma^2_{g,eq} = \frac{\sigma^2_{g,0}}{1 - r\,h^2_{eq}}$$
Where h2_eq is now the “equilibrium” heritability: defined as the ratio of genetic variance to phenotypic variance (as above), but now in the equilibrium population after sufficient generations of assortative mating. Traits with high heritability and high mate pair correlations can thus converge to substantially higher genetic variance at equilibrium.
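The convergence described above can be sketched numerically. The snippet below is a minimal illustration, assuming an infinitesimal model, a fixed environmental variance, and a constant mate correlation r in every generation. Under those assumptions the genetic variance follows the standard recursion V(t+1) = V(0)/2 + V(t)(1 + r h2(t))/2. The function name and the input values (random mating genetic variance 0.6, environmental variance 0.4, r = 2/3) are illustrative choices, not taken from the original analysis.

```python
def am_trajectory(v_g0=0.6, v_e=0.4, r=2 / 3, n_gen=10):
    """Genetic variance and heritability per generation of assortative mating.

    Assumes an infinitesimal trait, fixed environmental variance v_e, and a
    constant phenotypic mate correlation r (all illustrative assumptions).
    """
    v_g = v_g0
    path = [(0, v_g, v_g / (v_g + v_e))]
    for t in range(1, n_gen + 1):
        h2_t = v_g / (v_g + v_e)   # heritability in the parental generation
        m = r * h2_t               # genetic correlation between mates
        # Segregation variance (v_g0 / 2) plus the variance of the mid-parent value
        v_g = v_g0 / 2 + v_g * (1 + m) / 2
        path.append((t, v_g, v_g / (v_g + v_e)))
    return path


if __name__ == "__main__":
    for t, v_g, h2 in am_trajectory():
        print(f"generation {t:2d}: V_g = {v_g:.3f}, h2 = {h2:.3f}")
```

With these inputs the first generation multiplies the variance by 1 + r h2 / 2 = 1.2, and the fixed point of the recursion is V_g = 1.2 with h2_eq = 0.75, which satisfies V_g = 0.6 / (1 - r h2_eq).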
As a consequence, spouses and relatives become more genetically correlated on the sorting trait than would be expected under random mating. For mates, that genetic correlation is:

$$\rho_{mates} = r\,h^2_{eq}$$
For kth degree relatives, the genetic correlation is:

$$\rho_k = \left(\frac{1 + r\,h^2_{eq}}{2}\right)^k$$
Thus siblings (k=1) have their genetic correlation increased by half the excess genetic correlation in the parents, and so on.
As we know, in a randomly mating population the within-family genetic variance is half the population genetic variance. In an assortative mating population at equilibrium, we can re-derive the within-family (wf) variance as the additive variance (written here as $\sigma^2_{g,eq}$) minus the first-degree covariance to get:

$$\sigma^2_{wf} = \sigma^2_{g,eq} - \frac{1 + r\,h^2_{eq}}{2}\,\sigma^2_{g,eq} = \frac{1 - r\,h^2_{eq}}{2}\,\sigma^2_{g,eq}$$
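Now plugging in the equilibrium variance from above, $\sigma^2_{g,eq} = \sigma^2_{g,0} / (1 - r\,h^2_{eq})$ with $\sigma^2_{g,0}$ the genetic variance under random mating, the assortative mating coefficients cancel out and the within-family variance reduces to simply half the genetic variance under random mating:

$$\sigma^2_{wf} = \frac{1 - r\,h^2_{eq}}{2} \cdot \frac{\sigma^2_{g,0}}{1 - r\,h^2_{eq}} = \frac{\sigma^2_{g,0}}{2}$$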
Thus, within-family variance does not increase due to assortative mating (or, rather, the increased genetic correlations due to assortative mating cancel out after conditioning on the parents).
We can verify both phenomena in the simulation below, which decomposes and visualizes the variance for each generation of assortative mating. Population variance increases, while within-family (i.e. offspring minus mid-parent) variance stays fixed.
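As a rough illustration of how such a simulation could be set up (this is a sketch, not the code behind the original figure), the snippet below simulates explicit unlinked loci so that the within-family variance is measured rather than assumed. Everything in it is an illustrative assumption: the function name, the parameter values, allele frequencies of 0.5, two children per couple, and the mating scheme. Mates are paired by rank after adding noise to one partner's phenotype so that the phenotypic mate correlation is approximately r.

```python
import numpy as np


def simulate_assortative_mating(n=4000, n_loci=300, h2=0.6, r=2 / 3,
                                n_gen=8, seed=1):
    """Track population and within-family genetic variance under direct AM."""
    rng = np.random.default_rng(seed)
    half = n // 2
    n = 2 * half  # keep the population size even
    # Haplotypes: individuals x loci x 2 alleles, allele frequency 0.5
    hap = rng.integers(0, 2, size=(n, n_loci, 2), dtype=np.int8)
    beta = rng.normal(size=n_loci)
    # Scale effects so the random mating genetic variance is h2 (2pq = 0.5 per locus)
    beta *= np.sqrt(h2 / (0.5 * np.sum(beta ** 2)))
    v_e = 1.0 - h2  # fixed environmental variance
    results = []
    for gen in range(1, n_gen + 1):
        g = hap.sum(axis=2) @ beta
        y = g + rng.normal(scale=np.sqrt(v_e), size=n)
        # Randomly split the population into two parental pools
        perm = rng.permutation(n)
        fathers, mothers = perm[:half], perm[half:]
        # Rank-match: fathers by phenotype, mothers by a noisy copy of phenotype
        fathers = fathers[np.argsort(y[fathers])]
        z = (y[mothers] - y[mothers].mean()) / y[mothers].std()
        score = r * z + np.sqrt(1 - r ** 2) * rng.normal(size=half)
        mothers = mothers[np.argsort(score)]
        # Two children per couple; each parent passes one random allele per locus
        pa, ma = np.repeat(fathers, 2), np.repeat(mothers, 2)
        child = np.empty((n, n_loci, 2), dtype=np.int8)
        for slot, parent in enumerate((pa, ma)):
            pick = rng.integers(0, 2, size=(n, n_loci)).astype(bool)
            child[:, :, slot] = np.where(pick, hap[parent, :, 1], hap[parent, :, 0])
        g_child = child.sum(axis=2) @ beta
        midparent = (g[pa] + g[ma]) / 2
        results.append({
            "gen": gen,
            "mate_r": np.corrcoef(y[fathers], y[mothers])[0, 1],
            "pop_var": g_child.var(),               # population genetic variance
            "wf_var": (g_child - midparent).var(),  # offspring minus mid-parent
        })
        hap = child
    return results


if __name__ == "__main__":
    for row in simulate_assortative_mating():
        print(f"gen {row['gen']}: mate r = {row['mate_r']:.2f}, "
              f"pop var = {row['pop_var']:.3f}, wf var = {row['wf_var']:.3f}")
```

Under these assumptions the population variance should climb from generation to generation while the within-family variance stays close to h2 / 2 = 0.3, up to sampling noise and small finite-locus effects.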
Consequences
Assortative mating biases projections of the within-family segregation upwards if a population-level estimate of variance/heritability is used. This is because the within-family segregation follows the random mating variance, as we saw above. For example, for extrapolating the yield from embryo selection (a within-family process), the random mating heritability (as estimated within families) should be used and not the population-level heritability. Note that the population-level estimate of heritability can itself be biased upwards by assortative mating depending on the method used for inference (see [Border et al. 2022]), but we will assume here that the true parameters are known.
We can confirm in simulations that using the population-level h2_eq produces biased estimates. We start from a random mating heritability of 0.60 which, under strong assortative mating over 10 generations, increases to an equilibrium heritability of 0.75. The estimated yield (using the derivations from [Lencz et al. 2021]) is projected correctly when using the h2 of 0.60 but is over-estimated when using an h2_eq of 0.75. Thus, the within-family estimate of heritability is the relevant parameter for modeling within-family segregation in the presence of assortative mating.
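The size of this over-estimate can be illustrated with a simplified order-statistics sketch. It does not implement the formulas of [Lencz et al. 2021]. It assumes a hypothetical perfect predictor of genetic value, five embryos per family, and absolute variance units in which the random mating genetic variance is 0.6 and the equilibrium genetic variance is 1.2 (environmental variance 0.4, so h2 = 0.60 and h2_eq = 0.75). The expected gain is taken as the mean of the largest of n draws from the within-family distribution.

```python
import numpy as np


def expected_best_of_n(wf_var, n_embryos=5, n_draws=200_000, seed=0):
    """Monte Carlo mean of the best of n_embryos within-family deviations.

    Assumes a perfect predictor and normally distributed deviations from the
    mid-parent value with variance wf_var (illustrative assumptions).
    """
    rng = np.random.default_rng(seed)
    draws = rng.normal(scale=np.sqrt(wf_var), size=(n_draws, n_embryos))
    return draws.max(axis=1).mean()


v_g0, v_g_eq = 0.6, 1.2                          # illustrative genetic variances
gain_correct = expected_best_of_n(v_g0 / 2)      # random mating variance halved
gain_inflated = expected_best_of_n(v_g_eq / 2)   # population-level variance halved

if __name__ == "__main__":
    print(f"projected gain using random mating variance: {gain_correct:.3f}")
    print(f"projected gain using equilibrium variance:   {gain_inflated:.3f}")
    print(f"ratio: {gain_inflated / gain_correct:.3f}")
```

Because the gain scales with the within-family standard deviation, halving the equilibrium variance instead of the random mating variance inflates the projection by the square root of their ratio in this setup.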
On the other hand, assortative mating biases projections for population-level heritability downwards if within-family estimators are used. For example, the classic twin design assumes that DZ twins have an additive genetic correlation of 1/2, whereas (as we saw above) this correlation will be higher under assortative mating, leading to downward bias in the twin-based estimate of additive genetic variance. Thus, for extrapolating population-level “direct heritability” based on within-family estimates, a post-hoc correction must be applied that assumes a specific model of assortative mating (see, for example, [Kemper et al. 2021]). Alternatively, [Young 2023] derived a data-driven correction that uses the measured genetic correlation of polygenic scores across mates, but assumes that these polygenic scores do not contain any bias from population stratification (unlikely to be true).
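The direction of the twin bias can be checked with a few lines of arithmetic. The sketch below assumes a purely additive trait with no shared environment, a population at assortative mating equilibrium, and Falconer's estimator 2(r_MZ - r_DZ) applied without a shared-environment term. The function name and the default values are illustrative.

```python
def falconer_under_am(h2_eq=0.75, r=2 / 3):
    """Falconer twin estimate when DZ twins are more alike than assumed.

    Assumes additive effects only, no shared environment, and equilibrium
    under direct assortative mating (illustrative assumptions).
    """
    m = r * h2_eq                   # genetic correlation between mates
    r_mz = h2_eq                    # MZ twins share all additive variance
    r_dz = h2_eq * (1 + m) / 2      # DZ genetic correlation is (1 + m) / 2
    return 2 * (r_mz - r_dz)        # algebraically h2_eq * (1 - m)


if __name__ == "__main__":
    print(f"twin estimate: {falconer_under_am():.3f} vs population h2_eq: 0.750")
```

Under these assumptions the estimate simplifies to h2_eq (1 - r h2_eq), which is the random mating genetic variance divided by the equilibrium phenotypic variance. With r = 0 it returns h2_eq unchanged.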
In short, there are multiple true heritability/variance parameters and they differ in their relevant applications.
Further reading
The fundamentals are covered in quantitative genetics textbooks such as Genetics and Analysis of Quantitative Traits by Lynch & Walsh or The Mathematical Theory of Quantitative Genetics by Bulmer.
A comprehensive open-access overview is given in [Sunde et al. 2024 Nat Comms], in particular the extensive derivations in the Supplementary Note. Variant-level derivations of the change in genetic variance under assortative mating are also described in [Hayashi et al. 1998], with derivations for arbitrary variants in [Crow and Felsenstein 1968] and going all the way back to [Wright 1921].
[Yengo et al. 2023] includes a recent overview and discussion of interesting open questions.