Atlas of Genetics and Cytogenetics
in Oncology and Haematology
OPEN ACCESS JOURNAL AT INIST-CNRS
Educational Items Section
Hardy-Weinberg model
Robert Kalmes, Jean-Loup Huret
Institut de Recherche sur la Biologie de l'Insecte, IRBI - CNRS - ESA 6035, Av. Monge, F-37200 Tours,
France (RK); Genetics, Dept Medical Information, UMR 8125 CNRS, University of Poitiers, CHU Poitiers
Hospital, F-86021 Poitiers, France (JLH)
Published in Atlas Database: February 2001
Online updated version : http://AtlasGeneticsOncology.org/Educ/HardyEng.html
DOI: 10.4267/2042/37744
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 France Licence.
© 2001 Atlas of Genetics and Cytogenetics in Oncology and Haematology
I THE INTUITIVE APPROACH
II THE HARDY-WEINBERG EQUILIBRIUM
II-1 FOR AN AUTOSOMAL, DIALLELE, CO-DOMINANT GENE
EXERCISE
III THE HW LAW
III-1 DEMONSTRATION OF THE LAW
III-2 EXERCISES
III-3 CONSEQUENCES OF THE LAW
III-3.1 WHAT IS THE ALLELE FREQUENCY IN THE n+1 GENERATION?
III-3.2 WHAT IS THE GENOTYPE FREQUENCY IN THE n+1 GENERATION?
III-3.3 EXAMPLE
IV EXTENSION OF HW TO OTHER GENE SITUATIONS
IV-1 TO AN AUTOSOMAL, TRIALLELE, CO-DOMINANT GENE
IV-2 TO AN AUTOSOMAL, DIALLELE, NON CO-DOMINANT GENE
IV-3 TO AN AUTOSOMAL, TRIALLELE, NON CO-DOMINANT GENE
IV-3.1 BERNSTEIN's EQUATION
IV-4 TO A HETEROSOMAL (= gonosomic) GENE
IV-4.1 Y CHROMOSOME
IV-4.2 X CHROMOSOME
V SUMMARY- CONSEQUENCES OF HW's LAW
Atlas Genet Cytogenet Oncol Haematol. 2001; 5(2)
156
Hardy-Weinberg model
Kalmes R, Huret JL
I- THE INTUITIVE APPROACH
The Hardy-Weinberg law can be used under some circumstances to calculate genotype frequencies from allele frequencies.
Let A1 and A2 be two alleles at the same locus:
p is the frequency of allele A1, with $0 \le p \le 1$
q is the frequency of allele A2, with $0 \le q \le 1$ and $p + q = 1$
Where the distribution of allele frequencies is the same in men and women, i.e.:
Men (p, q)   Women (p, q)
If they procreate: $(p + q)^2 = p^2 + 2pq + q^2 = 1$
where:
$p^2$ = frequency of the A1A1 genotype ← HOMOZYGOTE
$2pq$ = frequency of the A1A2 genotype ← HETEROZYGOTE
$q^2$ = frequency of the A2A2 genotype ← HOMOZYGOTE
These frequencies remain constant in successive generations.
Example: Autosomal recessive inheritance with alleles A and a, and allele frequencies p and q:
→ frequency of the genotypes:    and of the phenotypes [ ]:
AA = $p^2$                       [A] = $p^2 + 2pq$
Aa = $2pq$
aa = $q^2$                       [a] = $q^2$
Example: Phenylketonuria (autosomal recessive), for which the deleterious allele has a frequency of 1/100: → q = 1/100
therefore, the frequency of this disease is $q^2 = 1/10\,000$,
and the frequency of heterozygotes is $2pq = 2 \times 99/100 \times 1/100 \approx 2/100$.
Note that there are a lot of heterozygotes: 1/50, about two hundred times more than there are individuals suffering from the condition.
For a rare disease, p is very close to 1, and the frequency of the heterozygotes is approximately 2q.
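The phenylketonuria arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the function name `recessive_frequencies` is ours, not the article's, and it assumes Hardy-Weinberg proportions hold.

```python
from fractions import Fraction

def recessive_frequencies(q):
    """Return (affected, carriers, approx_carriers) for a recessive allele
    of frequency q, assuming Hardy-Weinberg proportions."""
    p = 1 - q
    affected = q * q          # q^2: homozygotes aa
    carriers = 2 * p * q      # 2pq: heterozygotes Aa
    approx_carriers = 2 * q   # rare-allele approximation (p close to 1)
    return affected, carriers, approx_carriers

affected, carriers, approx = recessive_frequencies(Fraction(1, 100))
print(affected)             # 1/10000
print(carriers)             # 99/5000, close to the approximation 1/50
print(carriers / affected)  # 198, i.e. roughly two hundred carriers per patient
```

Exact fractions are used so that the gap between the exact value 2pq and the shortcut 2q is visible rather than hidden by rounding.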
We use these equations implicitly, in formal genetics and in the genetics of pooled populations, usually without
considering whether, and under what conditions, they are applicable.
II- THE HARDY-WEINBERG EQUILIBRIUM
The Hardy-Weinberg equilibrium, which is also known as the panmictic equilibrium, was discovered at the beginning of the 20th century by several researchers, notably by Hardy, a mathematician, and Weinberg, a physician.
The Hardy-Weinberg equilibrium is the central theoretical model in population genetics. The concept of equilibrium in
the Hardy-Weinberg model is subject to the following hypotheses/conditions:
1. The population is panmictic (couples form randomly (panmixia), and their gametes encounter each other randomly (pangamy)).
2. The population is "infinite" (very large: to minimize differences due to sampling).
3. There must be no selection, mutation or migration (no allele loss/gain).
4. Successive generations are discrete (no crosses between different generations).
Under these circumstances, the genetic diversity of the population is maintained and must tend towards a stable
equilibrium of the distribution of the genotype.
II-1. FOR AN AUTOSOMAL, DIALLELE, CO-DOMINANT GENE (Alleles A1 and A2)
Let:
The frequencies of genotypes F(G) be called D, H, and R, with $0 \le D, H, R \le 1$ and $D + H + R = 1$
The frequencies of alleles F(A) be called p and q, with $0 \le p, q \le 1$ and $p + q = 1$

Genotypes            A1A1   A1A2   A2A2
Number of subjects   DN     HN     RN     (total number N)
Frequencies F(G)     D      H      R      with (D + H + R) = 1

Allele frequencies F(A):
of A1: $D + H/2 = p$
of A2: $R + H/2 = q$, with $p + q = 1$
NOTES
The genotype frequencies F(G) can always be used to calculate the allele frequencies F(A)
F(A) contains less information than F(G)
if p = 0: the allele is lost; if p = 1: the allele is fixed.
First demonstration that p = D + H/2, by counting the alleles:
• size of the population = N → number of alleles = 2N
• p = number of A1 / total number = (2DN + HN) / 2N = D + H/2
• similarly for A2: q = number of A2 / total number = (2RN + HN) / 2N = R + H/2 (note the symmetry between p and q)
Second demonstration, by calculating the probabilities:
probability of drawing A1 =
   drawing an A1A1 individual, then drawing A1 from A1A1: D x 1
   or
   drawing an A1A2 individual, then drawing A1 from A1A2: H x 1/2
sum: → P(A1) = D + H/2
similarly for A2.
EXERCISE
Let:
Phenotypes           [A1]   [A1A2]   [A2]
Genotypes            A1A1   A1A2     A2A2
Number of subjects   167    280      109     total N: 556
Calculate the following frequencies: F(P: phenotypes), F(G: genotypes), F(A: alleles), F(gametes).
F(A) = F(gam), because there is 1 allele (of each gene) per gamete.
In addition, here F(P) = F(G), because these are co-dominant alleles.
F(P) = F(G), where:
D = 167/556 = 0.300
H = 280/556 = 0.504
R = 109/556 = 0.196
confirm: Σ(D,H,R) = 1
F(A) = F(gam.):
p = D + H/2 = (167 + 280/2)/556, or 0.300 + 0.504/2 = 0.552
q = R + H/2 = (109 + 280/2)/556, or 0.196 + 0.504/2 = 0.448
confirm: Σ(p,q) = 1
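The counting rule p = D + H/2 used in this exercise can be sketched in Python as follows. The function name `frequencies_from_counts` is an illustrative choice, not something defined in the article.

```python
def frequencies_from_counts(n11, n12, n22):
    """Genotype frequencies (D, H, R) and allele frequencies (p, q)
    from counts of A1A1, A1A2 and A2A2 individuals."""
    n = n11 + n12 + n22
    D, H, R = n11 / n, n12 / n, n22 / n
    p = D + H / 2   # each heterozygote carries one A1 allele out of two
    q = R + H / 2
    return D, H, R, p, q

D, H, R, p, q = frequencies_from_counts(167, 280, 109)
print(round(D, 3), round(H, 3), round(R, 3))  # 0.3 0.504 0.196
print(round(p, 3), round(q, 3))               # 0.552 0.448
```

No Hardy-Weinberg assumption is needed here: allele frequencies can always be recovered from genotype frequencies.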
III- THE HW LAW
In a population consisting of an infinite number of individuals (i.e. a very large population), which is panmictic (marriages occur randomly), and in the absence of mutation and selection, the frequencies of the genotypes are given by the expansion of $(p+q)^2$, p and q being the allele frequencies.
The figure shows the correspondence between the frequency q of allele a and the genotype frequencies in the case of two alleles in a panmictic system. The highest frequency of heterozygotes, H, is reached when p = q = 0.5, where H = 2pq = 0.50. In contrast, when one of the alleles is rare (i.e. q is very small), virtually all the subjects who have this allele are heterozygotes.
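Both statements about the curve can be explored numerically. The sketch below assumes Hardy-Weinberg proportions; the helper names `heterozygote_frequency` and `carrier_share_heterozygous` are hypothetical and introduced only for illustration.

```python
def heterozygote_frequency(q):
    """H = 2pq under Hardy-Weinberg, with p = 1 - q."""
    return 2 * (1 - q) * q

def carrier_share_heterozygous(q):
    """Among individuals carrying at least one copy of allele a,
    the fraction that are heterozygous: 2pq / (2pq + q^2)."""
    p = 1 - q
    return 2 * p * q / (2 * p * q + q * q)

# Scan q over a grid from 0 to 1: H peaks at q = 0.5
grid = [i / 100 for i in range(101)]
q_max = max(grid, key=heterozygote_frequency)
print(q_max, heterozygote_frequency(q_max))  # 0.5 0.5

# The rarer the allele, the larger the share of carriers that are heterozygous
for q in (0.5, 0.1, 0.01):
    print(q, round(carrier_share_heterozygous(q), 4))
```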
III-1. DEMONSTRATION OF THE LAW
Let A be an autosomal gene that is found in a population in two allele forms, A1 and A2 (with the same frequencies in both sexes). As there is codominance, 3 genotypes can be distinguished. According to the hypotheses/conditions of Hardy-Weinberg (HW), the individuals of the n+1 generation will be assumed to be the descendants of the random union of a male gamete and a female gamete.
Consequently, if, in generation n, the probability of drawing an A1 allele is p, then that of producing an A1A1 zygote after fertilization is $p \times p = p^2$ and similarly for A2, that of producing an A2A2 zygote is $q \times q = q^2$. The probability of producing a heterozygote is $pq + pq = 2pq$. Finally, $p^2 + 2pq + q^2 = (p+q)^2 = 1$.
Genotype frequencies (← only under HW):
A1A1: D = $p^2$
A1A2: H = $2pq$
A2A2: R = $q^2$

Table of gametes
           A1 (p)         A2 (q)
A1 (p)     A1A1 ($p^2$)   A1A2 ($pq$)
A2 (q)     A1A2 ($pq$)    A2A2 ($q^2$)
(The allele frequencies can only be used to calculate the genotype frequencies if they are subject to HW).
The allele frequencies remain the same from one generation to another.
The genotype frequencies remain the same from one generation to another.
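The table of gametes amounts to multiplying gamete frequencies pairwise. A minimal Python sketch of that random union, with illustrative function names (`random_union`, `allele_frequencies`) and arbitrary example values for p and q, might look like this:

```python
from fractions import Fraction
from itertools import product

def random_union(gametes):
    """Genotype frequencies from random union of gametes.
    `gametes` maps allele name -> frequency (same in both sexes)."""
    genotypes = {}
    for (a, fa), (b, fb) in product(gametes.items(), repeat=2):
        key = tuple(sorted((a, b)))       # A1A2 and A2A1 are the same genotype
        genotypes[key] = genotypes.get(key, 0) + fa * fb
    return genotypes

def allele_frequencies(genotypes):
    """Allele frequencies recovered by counting alleles in the genotypes."""
    freqs = {}
    for (a, b), f in genotypes.items():
        freqs[a] = freqs.get(a, 0) + f / 2
        freqs[b] = freqs.get(b, 0) + f / 2
    return freqs

p, q = Fraction(3, 10), Fraction(7, 10)   # arbitrary illustrative values
gen1 = random_union({"A1": p, "A2": q})
gen2 = random_union(allele_frequencies(gen1))
print(gen1 == gen2)  # True: genotype frequencies are unchanged in the next generation
```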
III-2. EXERCISES
Exercise
Show that, in the absence of panmixia, two populations with the same allele frequencies can have different genotype frequencies (by doing this, you show that there is a loss of information between genotype and allele frequencies):
Example: for p = q = 0.5
Answer
In both cases p = D + H/2 = 0.5:
if H = 0 → D = 0.5, H = 0, R = 0.5
if H = 1 → D = 0, H = 1, R = 0 (D = R = 0)
Exercise
Calculation of the genotype and allele frequencies, calculation of the numbers predicted by HW (theoretical numbers of individuals), and confirmation that we are indeed in a situation subject to HW:
Genotype   AA     AB     BB
Number     1787   3039   1303   N = 6129
           DN     HN     RN
Answer:
F(A) = (1787 + 3039/2) / 6129 = 0.54 = p
F(B) = (1303 + 3039/2) / 6129 = 0.46 = q
... and Σ(p,q) = 1
Genotype frequencies predicted by HW
AA: $p^2 = (0.54)^2 = 0.2916$
AB: $2pq = 2 \times 0.54 \times 0.46 = 0.4968$
BB: $q^2 = (0.46)^2 = 0.2116$
Numbers predicted by HW
AA: $p^2 N = 0.2916 \times 6129 = 1787.2$
AB: $2pq N = 0.4968 \times 6129 = 3044.9$
BB: $q^2 N = 0.2116 \times 6129 = 1296.9$
Confirmation:
$\chi^2 = \sum_i \frac{(O_i - C_i)^2}{C_i} = \frac{(1787 - 1787.2)^2}{1787.2} + \frac{(3039 - 3044.9)^2}{3044.9} + \frac{(1303 - 1296.9)^2}{1296.9}$ = NS (not significant)
→ We are in a situation subject to HW
III-3. CONSEQUENCES OF THE LAW
Change in HW across the generations (demonstration that the frequencies are invariable). In a population subject to HW, an equilibrium involving the distribution of the genotype frequencies is reached after a single reproductive cycle.
Consider a population in generation n.
III-3.1. WHAT WILL THE ALLELE FREQUENCY BE IN THE n+1 GENERATION?
Generation n:     A1A1    A1A2    A2A2
                  $p^2$   $2pq$   $q^2$
Generation n+1:   F(A1) $= D + H/2 = p^2 + \frac{1}{2}(2pq) = p(p+q) = p$
                  F(A2) $= R + H/2 = q^2 + \frac{1}{2}(2pq) = q(p+q) = q$
→ No change in allele frequencies:
in the n generation, we have p and q
in the n+1 generation, we have p and q
III-3.2. WHAT WILL THE GENOTYPE FREQUENCY BE IN THE n+1 GENERATION ?
Matings in generation n, with the proportion of A1A1 offspring from each type of couple:
                 female
male             A1A1 ($p^2$)   A1A2 ($2pq$)   A2A2 ($q^2$)
A1A1 ($p^2$)     A1A1           1/2 A1A1       no A1A1
A1A2 ($2pq$)     1/2 A1A1       1/4 A1A1       no A1A1
A2A2 ($q^2$)     no A1A1        no A1A1        no A1A1
Generation n+1
Frequency of (A1A1) in the generation n+1 $= (p^2)^2 + \frac{1}{2}(2pq \cdot p^2) + \frac{1}{2}(p^2 \cdot 2pq) + \frac{1}{4}(2pq)^2$
$= p^4 + p^3q + p^3q + p^2q^2 = p^2(p^2 + 2pq + q^2) = p^2$
The frequency of the (A1A1) genotype does not change between generation n and generation n+1 (same demonstration for the (A2A2) and (A1A2) genotypes). The genotype structure no longer undergoes any further changes once the population reaches the Hardy-Weinberg equilibrium.
In very many examples, the frequencies seen in natural populations are consistent with those predicted by the Hardy-Weinberg law.
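The mating-table argument can be verified with exact arithmetic. The following Python sketch is one possible way to do it; the names `TRANSMIT_A1` and `next_generation_A1A1` are assumptions made for the illustration.

```python
from fractions import Fraction

# Probability that a parent of a given genotype transmits allele A1.
TRANSMIT_A1 = {"A1A1": Fraction(1), "A1A2": Fraction(1, 2), "A2A2": Fraction(0)}

def next_generation_A1A1(p):
    """Frequency of A1A1 children when parents at Hardy-Weinberg
    proportions (p^2, 2pq, q^2) mate at random."""
    q = 1 - p
    parents = {"A1A1": p * p, "A1A2": 2 * p * q, "A2A2": q * q}
    total = Fraction(0)
    for father, f_freq in parents.items():
        for mother, m_freq in parents.items():
            mating = f_freq * m_freq   # frequency of this couple under random mating
            # a child is A1A1 only if both parents transmit A1
            total += mating * TRANSMIT_A1[father] * TRANSMIT_A1[mother]
    return total

for p in (Fraction(1, 10), Fraction(1, 2), Fraction(27, 50)):
    print(p, next_generation_A1A1(p) == p * p)  # True for every p
```

The nine terms of the double loop correspond to the nine cells of the mating table; only four of them are non-zero.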
III-3.3. EXAMPLE
The MN human blood groups.
Group    MM     MN     NN
Number   1787   3039   1303   Total, N = 6129
Frequency of M = (1787 + 3039/2)/6129 = 0.540 = p
Frequency of N = (1303 + 3039/2)/6129 = 0.460 = q
Predicted proportion of MM $= p^2 = (0.540)^2 = 0.2916$
Predicted proportion of MN $= 2pq = 2(0.540)(0.460) = 0.4968$
Predicted proportion of NN $= q^2 = (0.460)^2 = 0.2116$
Numbers predicted by Hardy-Weinberg:
For MM $= p^2 N = 0.2916 \times 6129 = 1787.2$
For MN $= 2pq N = 0.4968 \times 6129 = 3044.9$
For NN $= q^2 N = 0.2116 \times 6129 = 1296.9$
In the present situation, there is no need to do a $\chi^2$ test to see that the actual numbers are not statistically different from those predicted.
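For readers who do want to run the test, here is a hedged Python sketch. The function name `hw_chi_square` is ours. The code keeps p unrounded, so its expected numbers differ slightly from the ones above, which use p rounded to 0.540. The threshold 3.841 is the standard 5% critical value of the chi-square distribution with 1 degree of freedom (3 genotype classes, minus 1, minus 1 estimated allele frequency).

```python
def hw_chi_square(n_mm, n_mn, n_nn):
    """Chi-square statistic comparing observed genotype counts with the
    counts expected under Hardy-Weinberg proportions."""
    n = n_mm + n_mn + n_nn
    p = (n_mm + n_mn / 2) / n          # allele frequency from the observed counts
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_mm, n_mn, n_nn)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, expected, chi2

CRITICAL_5PCT_1DF = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05

p, expected, chi2 = hw_chi_square(1787, 3039, 1303)
print(round(p, 3))
print([round(e, 1) for e in expected])
print(chi2 < CRITICAL_5PCT_1DF)  # True: no significant deviation from HW
```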
IV- EXTENSION OF HW TO OTHER GENE SITUATIONS
IV-1. TO AN AUTOSOMAL, TRIALLELE, CO-DOMINANT GENE
3 alleles: A1, A2, A3
with frequencies: F(A1) = p, F(A2) = q, F(A3) = r
there will be 6 genotypes.

Table of gametes
           A1 (p)   A2 (q)   A3 (r)
A1 (p)     $p^2$    $pq$     $pr$
A2 (q)     $pq$     $q^2$    $qr$
A3 (r)     $pr$     $qr$     $r^2$

Genotype frequencies according to HW
A1A1    A1A2    A1A3    A2A2    A2A3    A3A3
$p^2$   $2pq$   $2pr$   $q^2$   $2qr$   $r^2$
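The same rule extends to any number of alleles: homozygotes have the squared allele frequency, heterozygotes twice the product. A short Python sketch, with a hypothetical helper `hw_genotype_frequencies` and arbitrary example frequencies, is given below.

```python
from itertools import combinations_with_replacement

def hw_genotype_frequencies(allele_freqs):
    """Expected genotype frequencies under Hardy-Weinberg for any number
    of alleles. `allele_freqs` maps allele name -> frequency."""
    result = {}
    for (a, fa), (b, fb) in combinations_with_replacement(allele_freqs.items(), 2):
        # homozygote: fa^2 ; heterozygote: 2 * fa * fb
        result[a + b] = fa * fb if a == b else 2 * fa * fb
    return result

# p, q, r chosen arbitrarily for the illustration
freqs = hw_genotype_frequencies({"A1": 0.5, "A2": 0.3, "A3": 0.2})
for genotype, f in freqs.items():
    print(genotype, round(f, 4))
print(len(freqs))  # 6 genotypes for 3 alleles
```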
IV-2. TO AN AUTOSOMAL, DIALLELE, NON CO-DOMINANT GENE
A is dominant over a, which is recessive; in this case the genotypes (AA) and (Aa) cannot be distinguished within the
population. Only the individuals with the phenotype [A], who number N1, will be distinguishable from the individuals
with the phenotype [a], who number N2.
Genotypes               AA   Aa      aa
Phenotypes              [A]          [a]
Number                  N1           N2      (total N)
Frequency of genotype   $1 - q^2$    $q^2$
with $q^2 = N_2/N = N_2/(N_1 + N_2)$
and the frequency of the allele a: $F(a) = q = \sqrt{q^2} = \sqrt{N_2/(N_1 + N_2)}$
This is a method commonly used in human genetics to calculate the frequency of rare, recessive genes.
Frequencies of homozygotes and heterozygotes for rare recessive human genes.
Gene                    Incidence in population $q^2$   Frequency of allele q   Frequency of heterozygotes 2pq
Albinism                1/22 500                        1/150                   1/75
Phenylketonuria         1/10 000                        1/100                   1/50
Mucopolysaccharidosis   1/90 000                        1/300                   1/150
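The square-root method can be sketched in Python as follows. The function name `estimate_recessive_allele` is an assumption, and the counts are built from the incidences in the table, taking 1/90 000 for mucopolysaccharidosis (the value consistent with q = 1/300). The exact 2pq is slightly below the rounded 2q shown in the table, which is the usual rare-allele approximation.

```python
import math

def estimate_recessive_allele(n_dominant, n_recessive):
    """Estimate q = F(a) from phenotype counts N1 ([A]) and N2 ([a]),
    assuming Hardy-Weinberg proportions, and return (q, 2pq)."""
    q = math.sqrt(n_recessive / (n_dominant + n_recessive))
    heterozygotes = 2 * (1 - q) * q
    return q, heterozygotes

# One affected individual per 22 500, 10 000 and 90 000 births respectively
for name, n_total in [("Albinism", 22500),
                      ("Phenylketonuria", 10000),
                      ("Mucopolysaccharidosis", 90000)]:
    q, het = estimate_recessive_allele(n_total - 1, 1)
    print(f"{name}: q = {q:.5f}, 2pq = {het:.5f}, 2q = {2 * q:.5f}")
```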
IV-3. TO AN AUTOSOMAL, TRIALLELE, NON CO-DOMINANT GENE
Example: the ABO blood group system. Although the human (ABO) blood group system is often taken to be a simple example of polyallelism, it is in fact a relatively complex situation combining the codominance of A and B, the presence of a null O allele and the dominance of A and B over O.
If we take
p to designate the frequency of allele A
q to designate the frequency of allele B
r to designate the frequency of allele O
(p + q + r = 1)
differing genotype and phenotype frequencies are found by applying the Hardy-Weinberg law.
Phenotype   Genotype   Genotype frequency   Phenotype frequency
[A]         (AA)       $p^2$                $p^2 + 2pr$
            (AO)       $2pr$
[B]         (BB)       $q^2$                $q^2 + 2qr$
            (BO)       $2qr$
[AB]        (AB)       $2pq$                $2pq$
[O]         (OO)       $r^2$                $r^2$
Using:
$p^2 + 2pr + r^2 = (p+r)^2$
$q^2 + 2qr + r^2 = (q+r)^2$
we obtain:
$F[A] + F[O] = (p+r)^2$
$F[B] + F[O] = (q+r)^2$ and $F[O] = r^2$
IV-3.1. BERNSTEIN's EQUATION (1930)
Bernstein's equation (1930) simplifies the calculations:
$p = 1 - \sqrt{F[B] + F[O]}$
$q = 1 - \sqrt{F[A] + F[O]}$
$r = \sqrt{F[O]}$
Then, if $p + q + r \ne 1$, correction by the deviation $D = 1 - (p + q + r)$ →
$p' = p\,(1 + D/2)$, $q' = q\,(1 + D/2)$, $r' = (r + D/2)(1 + D/2)$
Example:
Group       A        B        O        AB
Number      9123     2987     7725     1269
Frequency   0.4323   0.1415   0.3660   0.0601
$p = 1 - \sqrt{0.3660 + 0.1415} = 0.2876$
$q = 1 - \sqrt{0.3660 + 0.4323} = 0.1065$
$r = \sqrt{0.3660} = 0.6050$
$p + q + r = 0.9991$ → $p' = 0.2877$, $q' = 0.1065$, $r' = 0.6057$.
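Bernstein's estimate and correction can be reproduced with the short Python sketch below. The function name `bernstein` is illustrative; the code works from the raw counts rather than the rounded frequencies, so the last decimal may differ slightly from the values quoted above.

```python
import math

def bernstein(n_a, n_b, n_o, n_ab):
    """Bernstein's ABO allele-frequency estimates from phenotype counts.
    Returns the raw estimates (p, q, r) and the corrected (p', q', r')."""
    n = n_a + n_b + n_o + n_ab
    f_a, f_b, f_o = n_a / n, n_b / n, n_o / n
    p = 1 - math.sqrt(f_b + f_o)   # since F[B] + F[O] = (q + r)^2
    q = 1 - math.sqrt(f_a + f_o)   # since F[A] + F[O] = (p + r)^2
    r = math.sqrt(f_o)             # since F[O] = r^2
    d = 1 - (p + q + r)            # deviation D
    corrected = (p * (1 + d / 2), q * (1 + d / 2), (r + d / 2) * (1 + d / 2))
    return (p, q, r), corrected

raw, corrected = bernstein(9123, 2987, 7725, 1269)
print([round(x, 4) for x in raw])
print([round(x, 4) for x in corrected])
print(round(sum(corrected), 4))
```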
IV-4. TO A HETEROSOMAL (= gonosomic) GENE
IV-4.1. Y CHROMOSOME
Frequency p and q in subjects XY; transmission to male descendants.
IV-4.2. X CHROMOSOME
Genotype frequencies according to HW:
Female   XA1XA1   XA1XA2   XA2XA2
         $p^2$    $2pq$    $q^2$
Male     XA1/Y    XA2/Y
         $p$      $q$
Let the frequency of the allele A2 be $q_x$ in men and $q_{xx}$ in women:
The X chromosome of the boys (in generation n) has been transmitted from the mothers (generation n-1) → $q_x(n) = q_{xx}(n-1)$
The X chromosome carrying the allele in the daughters has:
1/2 chance of coming from their father,
1/2 chance of coming from their mother,
→ $q_{xx}(n) = (q_x(n-1) + q_{xx}(n-1))/2$
The frequency of the allele in men = the frequency in women in the previous generation.
The frequency of the allele in women = mean of the frequencies in the 2 sexes in the previous generation.
* calculation of the difference in allele frequencies between the 2 sexes:
$q_x(n) - q_{xx}(n) = q_{xx}(n-1) - q_x(n-1)/2 - q_{xx}(n-1)/2 = -\frac{1}{2}\,(q_x(n-1) - q_{xx}(n-1))$
→ $q_x(n) - q_{xx}(n) = (-\frac{1}{2})^n\,(q_x(0) - q_{xx}(0))$: tends towards zero in 8 to 10 generations
* mean frequency q:
1/3 of the X chromosomes belong to men, 2/3 to women: $q(n) = \frac{1}{3} q_x(n) + \frac{2}{3} q_{xx}(n)$
The mean frequency is invariable: expanding $q(1)$ in terms of generation 0 gives $q(1) = \frac{1}{3} q_{xx}(0) + \frac{2}{3} \cdot \frac{1}{2}(q_x(0) + q_{xx}(0)) = \frac{1}{3} q_x(0) + \frac{2}{3} q_{xx}(0) = q(0)$
At equilibrium: $q_x(e) = q_{xx}(e) = q(e)$
Exercise: For generation G0, consisting of 100% normal men and 100% color-blind women, calculate the frequencies of the color-blindness allele up to G6:
Answer
G0: XNY × XDXD
Generation   $q_x$   $q_{xx}$
G0           0.00    1.00
G1           1.00    0.50
G2           0.50    0.75
G3           0.75    0.63
G4           0.63    0.69
G5           0.69    0.66
G6           0.66    0.67
Therefore:
For a sex-linked locus, the Hardy Weinberg equilibrium is
reached asymptotically after 8-10 generations, whereas it is
reached after 1 generation for an autosomal locus.
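The damped oscillation in this exercise follows directly from the two recurrences for a sex-linked locus. A minimal Python sketch, with an illustrative function name `x_linked_trajectory`, is:

```python
def x_linked_trajectory(qx0, qxx0, generations):
    """Iterate the X-linked recurrences:
    qx(n) = qxx(n-1) and qxx(n) = (qx(n-1) + qxx(n-1)) / 2."""
    qx, qxx = qx0, qxx0
    history = [(qx, qxx)]
    for _ in range(generations):
        # sons receive their X from their mother;
        # daughters receive one X from each parent
        qx, qxx = qxx, (qx + qxx) / 2
        history.append((qx, qxx))
    return history

history = x_linked_trajectory(0.0, 1.0, 6)
for n, (qx, qxx) in enumerate(history):
    mean_q = qx / 3 + 2 * qxx / 3   # 1/3 of X chromosomes are in men, 2/3 in women
    print(f"G{n}: qx = {qx:.4f}  qxx = {qxx:.4f}  mean q = {mean_q:.4f}")
```

The printed mean stays at 2/3 in every generation, while the gap between the sexes is halved and changes sign at each step.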
V- CONSEQUENCES OF THE HW LAW
Regardless of whether we are in a situation subject to HW or
not, the genotype frequencies (D, H, R) can be used to calculate the allele frequencies (p,q), from : p = D + H/2, q = R +
H/2.
Whereas, if and only if we are subject to HW, the genotype frequencies can be calculated from the allele frequencies, from $D = p^2$, $H = 2pq$, $R = q^2$.
The dominance relationships between alleles have no effect on the change in allele frequencies (although they do affect
how difficult the exercises are!)
The allele frequencies remain stable over time; and so do the genotype frequencies.
The random mendelian segregation of the chromosomes preserves the genetic variability of populations.
Since "evolution" is defined as a change in allele frequencies, an ideal diploid population would not evolve.
It is only violations of the properties of an ideal population that allow the evolutionary process to take place.
The practical approach to a problem is always the same:
1. The Numbers Observed → the (Observed) Genotype Frequencies;
2. Calculate the Allele Frequencies: $p = D + \sum_i H_i/2$ (summing over the heterozygous genotypes that carry the allele), q = ...
3. If we are subject to HW (hypothetically), then $D = p^2$, $H = 2pq$, etc.: we calculate the Theoretical Genotype Frequencies according to HW;
4. The Calculated Genotype Frequencies → the Calculated Numbers;
5. Comparison of Observed Numbers and Calculated Numbers: $\chi^2 = \sum_i (O_i - C_i)^2 / C_i$
6. If $\chi^2$ is significant: we are not in accordance with HW; possible causes:
→ Consanguinity?
→ Selection?
→ Mutations?
This article should be referenced as such:
Kalmes R, Huret JL. Hardy-Weinberg model. Atlas Genet
Cytogenet Oncol Haematol. 2001; 5(2):156-163.