# EvoMath 2: Testing for Hardy-Weinberg Equilibrium

In the first installment of EvoMath, I derived the Hardy-Weinberg Principle and discussed its significance to biology. In the second installment I will demonstrate how to test if a population deviates from Hardy-Weinberg equilibrium.

## Recap:

A population is considered to be in Hardy-Weinberg equilibrium if the allele and genotype frequencies are as follows.

Genotype |
Frequency |

AA |
\(P = P^\prime = p^2\) |

Aa |
\(H = H^\prime = 2pq\) |

aa |
\(Q = Q^\prime = q^2\) |

Allele |
Frequency |

A |
\(p = P + \frac{H}{2}\) |

a |
\(q = 1-p = Q + \frac{H}{2}\) |

## Test Procedure:

A goodness-of-fit test can be used to determine if a population is significantly different from the expections of Hardy-Weinberg equilibrium. If we have a series of genotype counts from a population, then we can compare these counts to the ones predicted by the Hardy-Weinberg model. We conclude that the population is not in Hardy-Weinberg equilibrium if the probability that the counts were drawn under the Hardy-Weinberg model is too small for the deviations to be considered due to random chance. The significance level that is typically used is \(\alpha = 0.05\), i.e. the genotype counts have less than a one in twenty chance of being caused by a population in Hardy-Weinberg equilibrium.

In order to calculate this probability, we will use a test statistic, \(\chi^2\), which was devised in 1900 by Karl Pearson and has a well characterized distribution. If \(O_i\) are the set of observed counts, and \(E_i\) are the set of expected counts, then

\[\chi^2 = \sum_i{\frac{ \left( O_i-E_i \right)^2 }{ E_i }}.\]This test statistic has a “chi-square” distribution with \(\nu\) degrees of freedom. Since we are testing Hardy-Weinberg equilibrium with two alleles, \(\nu=1\) (rationale not shown). Furthermore, it can be shown that if \(\chi^2 \ge 3.841\) then \(\Pr{ \left\{ \chi^2 \right\} } \le 0.05\). Therefore, if \(\chi^2 \ge 3.841\) we will reject the null model and conclude that there is significant statistical support that the population is not in Hardy-Weinberg equilibrium.

## Example 1:

Consider the following samples from a population.

Genotype |
Count |

AA |
30 |

Aa |
55 |

aa |
15 |

Allele |
Frequency |

A |
0.575 |

a |
0.425 |

Calculate the \(\chi^2\) value.

Genotype |
Observed |
Expected |
(O-E)^{2}/E |

AA |
30 | 33 | 0.27 |

Aa |
55 | 49 | 0.73 |

aa |
25 | 18 | 0.50 |

Total | 100 | 100 | 1.50 |

Since \(\chi^2 = 1.50 < 3.841\), we conclude that the genotype frequencies in this population are not significantly different than what would be expected if the population is in Hardy-Weinberg equilibrium.

## Example 2:

Race and Sanger (1975) determined the blood groups of 1000 Britons as follows (from Hartl and Clarke 1997).

Genotype |
Observed |
Expected |

MM |
298 | 294.3 |

MN |
489 | 496.4 |

NN |
213 | 209.3 |

This results in \(\chi^2 = 0.222 < 3.841\). As in the previous example, the measured genotype frequencies are not significantly different from the expectations of Hardy-Weinberg equilibrium.

## Example 3:

Matthijis et al. (1998) surveyed a group of 54 people suffering from Jaeken syndrome (from Freeman and Herron 2004).

Genotype |
Observed |
Expected |

OO |
11 | 19.44 |

OR |
43 | 25.92 |

RR |
0 | 8.64 |

This results in \(\chi^2 = 23.56 > 3.841\). Unlike the previous two examples, the measured genotype frequencies are significantly different from the expectations of Hardy-Weinberg equilibrium. This indicates that one or more of the Hardy-Weinberg conditions are being violated; although, it does not tell us which ones.

## Conclusion:

Although to derive the Hardy-Weinberg principle, we assumed that the size of the population was infinite, these statistical tests demonstrate that finite populations can approximately exist in Hardy-Weinberg equilibrium.

- Freeman S and Herron JC (2004) Evolutionary Analysis 3rd ed. Pearson Education, Inc (Upper Saddle River, NJ)
- Hartl DL and Clarke AG (1997) Principles of Population Genetics 3rd ed. Sinauer Associates, Inc (Sutherland, MA)
- Matthijis GE et al. (1998) Lack of homozygotes for the most frequent disease allele in carbohydrate-deficient-glycoprotein syndrome type 1A. American Journal of Human Genetics 62: 542-550
- Race RR and Sanger R (1975) Blood Groups in Man 6th ed. JB Lippincott, Philadelphia