Atlas of Genetics and Cytogenetics in Oncology and Haematology
OPEN ACCESS JOURNAL AT INIST-CNRS

Educational Items Section

Hardy-Weinberg model

Robert Kalmes, Jean-Loup Huret
Institut de Recherche sur la Biologie de l'Insecte, IRBI - CNRS - ESA 6035, Av. Monge, F-37200 Tours, France (RK); Genetics, Dept Medical Information, UMR 8125 CNRS, University of Poitiers, CHU Poitiers Hospital, F-86021 Poitiers, France (JLH)

Published in Atlas Database: February 2001
Online updated version: http://AtlasGeneticsOncology.org/Educ/HardyEng.html
DOI: 10.4267/2042/37744
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 France Licence.
© 2001 Atlas of Genetics and Cytogenetics in Oncology and Haematology
Atlas Genet Cytogenet Oncol Haematol. 2001; 5(2)

I THE INTUITIVE APPROACH
II THE HARDY-WEINBERG EQUILIBRIUM
II-1 FOR AN AUTOSOMAL, DIALLELE, CO-DOMINANT GENE
EXERCISE
III THE HW LAW
III-1 DEMONSTRATION OF THE LAW
III-2 EXERCISES
III-3 CONSEQUENCES OF THE LAW
III-3.1 WHAT IS THE ALLELE FREQUENCY IN THE n+1 GENERATION?
III-3.2 WHAT IS THE GENOTYPE FREQUENCY IN THE n+1 GENERATION?
III-3.3 EXAMPLE
IV EXTENSION OF HW TO OTHER GENE SITUATIONS
IV-1 TO AN AUTOSOMAL, TRIALLELE, CO-DOMINANT GENE
IV-2 TO AN AUTOSOMAL, DIALLELE, NON CO-DOMINANT GENE
IV-3 TO AN AUTOSOMAL, TRIALLELE, NON CO-DOMINANT GENE
IV-3.1 BERNSTEIN's EQUATION
IV-4 TO A HETEROSOMAL (= gonosomic) GENE
IV-4.1 Y CHROMOSOME
IV-4.2 X CHROMOSOME
V SUMMARY- CONSEQUENCES OF HW's LAW

I- THE INTUITIVE APPROACH
The Hardy-Weinberg law can be used under some circumstances to calculate genotype frequencies from allele frequencies.
Let A1 and A2 be two alleles at the same locus:
p is the frequency of allele A1, with $0 \le p \le 1$
q is the frequency of allele A2, with $0 \le q \le 1$
and $p + q = 1$

Where the distribution of allele frequencies is the same in men and women, i.e.:
Men (p, q)    Women (p, q)
If they procreate:
$(p + q)^2 = p^2 + 2pq + q^2 = 1$
where:
$p^2$ = frequency of the A1A1 genotype ← HOMOZYGOTE
$2pq$ = frequency of the A1A2 genotype ← HETEROZYGOTE
$q^2$ = frequency of the A2A2 genotype ← HOMOZYGOTE
These frequencies remain constant in successive generations.

Example: Autosomal recessive inheritance with alleles A and a, and allele frequencies p and q:
→ frequency of the genotypes: AA = $p^2$, Aa = $2pq$, aa = $q^2$
→ frequency of the phenotypes [ ]: [A] = $p^2 + 2pq$, [a] = $q^2$

Example: Phenylketonuria (autosomal recessive), for which the deleterious allele has a frequency of 1/100:
→ q = 1/100; therefore, the frequency of the disease is $q^2$ = 1/10 000, and the frequency of heterozygotes is $2pq = 2 \times 99/100 \times 1/100 \approx 2/100$.
Note that there are a lot of heterozygotes: about 1/50, two hundred times more than there are individuals suffering from the condition.
For a rare disease, p is very close to 1, and the frequency of the heterozygotes is approximately 2q.
We use these equations implicitly, in formal genetics and in population genetics, usually without considering whether, and under what conditions, they are applicable.

II- THE HARDY-WEINBERG EQUILIBRIUM
The Hardy-Weinberg equilibrium, which is also known as the panmictic equilibrium, was discovered at the beginning of the 20th century by several researchers, notably by Hardy, a mathematician, and Weinberg, a physician.
The Hardy-Weinberg equilibrium is the central theoretical model in population genetics. The concept of equilibrium in the Hardy-Weinberg model is subject to the following hypotheses/conditions:
1. The population is panmictic: couples form randomly (panmixia), and their gametes encounter each other randomly (pangamy).
2. The population is "infinite" (very large, to minimize differences due to sampling).
3. There must be no selection, mutation or migration (no allele loss/gain).
4. Successive generations are discrete (no crosses between different generations).
Under these circumstances, the genetic diversity of the population is maintained and must tend towards a stable equilibrium of the distribution of the genotypes.

II-1. FOR AN AUTOSOMAL, DIALLELE, CO-DOMINANT GENE (Alleles A1 and A2)
Let:
The frequencies of genotypes F(G) be called D, H and R, with $0 \le D, H, R \le 1$ and $D + H + R = 1$
The frequencies of alleles F(A) be called p and q, with $0 \le p, q \le 1$ and $p + q = 1$

Genotypes            A1A1   A1A2   A2A2
Number of subjects   DN     HN     RN     (total number N)
Frequencies F(G)     D      H      R      with D + H + R = 1

Allele frequencies F(A):
of A1: $D + H/2 = p$
of A2: $R + H/2 = q$
with $p + q = 1$

NOTES
The genotype frequencies F(G) can always be used to calculate the allele frequencies F(A).
F(A) contains less information than F(G).
If p = 0, the allele is lost; if p = 1, the allele is fixed.
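The relations p = D + H/2 and q = R + H/2 can be sketched in a few lines of Python. This is only an illustration: the function name `allele_frequencies` and the counts used below are hypothetical and do not come from the article.

```python
def allele_frequencies(n_a1a1, n_a1a2, n_a2a2):
    """Return (D, H, R, p, q) from genotype counts at a diallelic, co-dominant locus."""
    n = n_a1a1 + n_a1a2 + n_a2a2
    if n == 0:
        raise ValueError("at least one individual is required")
    # Genotype frequencies F(G)
    d, h, r = n_a1a1 / n, n_a1a2 / n, n_a2a2 / n
    # Allele frequencies F(A): a homozygote carries two copies of its allele,
    # a heterozygote carries one copy of each.
    p = d + h / 2
    q = r + h / 2
    return d, h, r, p, q


# Hypothetical counts, chosen only for illustration.
d, h, r, p, q = allele_frequencies(30, 50, 20)
print(f"D={d:.3f} H={h:.3f} R={r:.3f} p={p:.3f} q={q:.3f}")
```

No Hardy-Weinberg assumption is needed here: the function only counts alleles, so it applies to any set of genotype counts.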
First demonstration that p = D + H/2, by counting the alleles:
• size of the population = N → number of alleles = 2N
• p = (number of A1 alleles) / (total number of alleles) = (2DN + HN) / 2N = D + H/2
similarly for A2:
• q = (number of A2 alleles) / (total number of alleles) = (2RN + HN) / 2N = R + H/2
(note the symmetry between p and q)

Second demonstration, by calculating the probabilities:
probability of drawing A1 =
drawing an A1A1 individual, then drawing A1 from A1A1: D × 1
or drawing an A1A2 individual, then drawing A1 from A1A2: H × 1/2
sum: → P(A1) = D + H/2
similarly for A2.

EXERCISE
Let:
Phenotypes   Genotypes   Number of subjects
[A1]         A1A1        167
[A1A2]       A1A2        280
[A2]         A2A2        109
                         total N: 556
Calculate the following frequencies: F(P: phenotypes), F(G: genotypes), F(A: alleles), F(gametes).
F(A) = F(gam.), because there is 1 allele (of each gene) per gamete. In addition, here F(P) = F(G), because these are co-dominant alleles.

F(P) = F(G), where:
D = 167/556 = 0.300
H = 280/556 = 0.504
R = 109/556 = 0.196
confirm: Σ(D, H, R) = 1

F(A) = F(gam.):
p = D + H/2 = (167 + 280/2)/556, or 0.300 + 0.504/2 = 0.552
q = R + H/2 = (109 + 280/2)/556, or 0.196 + 0.504/2 = 0.448
confirm: Σ(p, q) = 1

III- THE HW LAW
In a population consisting of an infinite number of individuals (i.e. a very large population), which is panmictic (matings occur randomly), and in the absence of mutation and selection, the genotype frequencies are given by the expansion of $(p+q)^2$, p and q being the allele frequencies.
The figure shows the correspondence between the allele frequency q of a and the genotype frequencies in the case of two alleles in a panmictic system.
The highest frequency of heterozygotes, H, is reached when p = q = 0.5, where H = 2pq = 0.50. In contrast, when one of the alleles is rare (i.e. q is very small), virtually all the subjects who have this allele are heterozygotes.

III-1. DEMONSTRATION OF THE LAW
Let A be an autosomal gene that is found in a population in two allele forms, A1 and A2 (with the same frequencies in both sexes, of course).
As there is codominance, 3 genotypes can be distinguished. According to the hypotheses/conditions of Hardy-Weinberg (HW), the individuals of generation n+1 are assumed to be the descendants of the random union of a male gamete and a female gamete. Consequently, if, in generation n, the probability of drawing an A1 allele is p, then the probability of producing an A1A1 zygote after fertilization is $p \times p = p^2$ and, similarly for A2, that of producing an A2A2 zygote is $q \times q = q^2$. The probability of producing a heterozygote is $pq + pq = 2pq$. Finally, $p^2 + 2pq + q^2 = (p+q)^2 = 1$.

Table of gametes
            A1 (p)          A2 (q)
A1 (p)      A1A1 ($p^2$)    A1A2 ($pq$)
A2 (q)      A1A2 ($pq$)     A2A2 ($q^2$)

A1A1: $D = p^2$
A1A2: $H = 2pq$
A2A2: $R = q^2$
← only under HW
(The allele frequencies can only be used to calculate the genotype frequencies if the population is subject to HW.)
The allele frequencies remain the same from one generation to another.
The genotype frequencies remain the same from one generation to another.

III-2.
EXERCISES
Exercise
Show that, in the absence of panmixia, two populations with identical allele frequencies can have different genotype frequencies (by doing this, you show that there is a loss of information between genotype and allele frequencies).
Example: for p = q = 0.5
Answer:
p = D + H/2 = 0.5
if H = 0 → D = 0.5, H = 0, R = 0.5
if H = 1 → D = 0, H = 1, R = 0

Exercise
Calculation of the genotype and allele frequencies, calculation of the numbers predicted by HW (theoretical numbers of individuals), and confirmation that we are indeed in a situation subject to HW:
Genotypes   AA     AB     BB
Numbers     1787   3039   1303    N = 6129
            (DN)   (HN)   (RN)
Answer:
F(A) = (1787 + 3039/2) / 6129 = 0.54 = p
F(B) = (1303 + 3039/2) / 6129 = 0.46 = q
… and Σ(p, q) = 1
Genotype frequencies predicted by HW:
AA: $p^2 = (0.54)^2 = 0.2916$
AB: $2pq = 2 \times 0.54 \times 0.46 = 0.4968$
BB: $q^2 = (0.46)^2 = 0.2116$
Numbers predicted by HW:
AA: $p^2 N = 0.2916 \times 6129 = 1787.2$
AB: $2pq N = 0.4968 \times 6129 = 3044.9$
BB: $q^2 N = 0.2116 \times 6129 = 1296.9$
Confirmation:
$\chi^2 = \sum_i \frac{(O_i - C_i)^2}{C_i} = \frac{(1787 - 1787.2)^2}{1787.2} + \frac{(3039 - 3044.9)^2}{3044.9} + \frac{(1303 - 1296.9)^2}{1296.9} \approx 0.04$, NS (not significant)
→ We are in a situation subject to HW.

III-3. CONSEQUENCES OF THE LAW
Change in HW across the generations (demonstration that the frequencies are invariable).
In a population subject to HW, an equilibrium in the distribution of the genotype frequencies is reached after a single reproductive cycle.
Consider a population in generation n.

III-3.1. WHAT WILL THE ALLELE FREQUENCY BE IN THE n+1 GENERATION?
Generation n:
A1A1     A1A2     A2A2
$p^2$    $2pq$    $q^2$
Generation n+1:
$F(A1) = D + H/2 = p^2 + \frac{1}{2}(2pq) = p(p+q) = p$
$F(A2) = R + H/2 = q^2 + \frac{1}{2}(2pq) = q(p+q) = q$
→ No change in allele frequencies:
in generation n, we have p and q;
in generation n+1, we have p and q.

III-3.2. WHAT WILL THE GENOTYPE FREQUENCY BE IN THE n+1 GENERATION?
Proportion of A1A1 offspring produced by each type of mating in generation n:

                  male
female            A1A1 ($p^2$)   A1A2 ($2pq$)   A2A2 ($q^2$)
A1A1 ($p^2$)      A1A1           1/2 A1A1       no A1A1
A1A2 ($2pq$)      1/2 A1A1       1/4 A1A1       no A1A1
A2A2 ($q^2$)      no A1A1        no A1A1        no A1A1

Frequency of (A1A1) in generation n+1:
$(p^2)^2 + \frac{1}{2}(2pq \cdot p^2) + \frac{1}{2}(p^2 \cdot 2pq) + \frac{1}{4}(2pq)^2 = p^4 + p^3 q + p^3 q + p^2 q^2 = p^2 (p^2 + 2pq + q^2) = p^2$
The frequency of the (A1A1) genotype does not change between generation n and generation n+1 (same demonstration for the (A2A2) and (A1A2) genotypes).
The genotype structure undergoes no further changes once the population reaches the Hardy-Weinberg equilibrium.
In very many examples, the frequencies seen in natural populations are consistent with those predicted by the Hardy-Weinberg law.

III-3.3. EXAMPLE
The MN human blood groups.
Group     MM     MN     NN
Number    1787   3039   1303    Total N = 6129
Frequency of M = (1787 + 3039/2)/6129 = 0.540 = p
Frequency of N = (1303 + 3039/2)/6129 = 0.460 = q
Predicted proportion of MM = $p^2 = (0.540)^2 = 0.2916$
Predicted proportion of MN = $2pq = 2(0.540)(0.460) = 0.4968$
Predicted proportion of NN = $q^2 = (0.460)^2 = 0.2116$
Numbers predicted by Hardy-Weinberg:
For MM = $p^2 N = 0.2916 \times 6129 = 1787.2$
For MN = $2pq N = 0.4968 \times 6129 = 3044.9$
For NN = $q^2 N = 0.2116 \times 6129 = 1296.9$
In the present situation, there is no need to perform a χ2 test to see that the observed numbers are not statistically different from those predicted.

IV- EXTENSION OF HW TO OTHER GENE SITUATIONS
IV-1. TO AN AUTOSOMAL, TRIALLELE, CO-DOMINANT GENE
3 alleles A1, A2, A3 with frequencies F(A1) = p, F(A2) = q, F(A3) = r; there will be 6 genotypes.

Table of gametes
            A1 (p)    A2 (q)    A3 (r)
A1 (p)      $p^2$     $pq$      $pr$
A2 (q)      $pq$      $q^2$     $qr$
A3 (r)      $pr$      $qr$      $r^2$

Genotype frequencies according to HW:
A1A1   $p^2$
A1A2   $2pq$
A1A3   $2pr$
A2A2   $q^2$
A2A3   $2qr$
A3A3   $r^2$

IV-2. TO AN AUTOSOMAL, DIALLELE, NON CO-DOMINANT GENE
A is dominant over a, which is recessive; in this case the genotypes (AA) and (Aa) cannot be distinguished within the population.
Only the individuals with the phenotype [A], who number N1, can be distinguished from the individuals with the phenotype [a], who number N2.

Genotypes   Phenotypes   Number   Frequency
AA, Aa      [A]          N1       $1 - q^2$
aa          [a]          N2       $q^2$
                         N

with $q^2 = N_2/N = N_2/(N_1 + N_2)$, and the frequency of the allele a: $F(a) = q = \sqrt{q^2} = \sqrt{N_2/(N_1 + N_2)}$
This is a method commonly used in human genetics to calculate the frequency of rare, recessive genes.

Frequencies of homozygotes and heterozygotes for rare recessive human genes.
Gene                    Incidence in population $q^2$   Frequency of allele q   Frequency of heterozygotes $2pq$
Albinism                1/22 500                        1/150                   1/75
Phenylketonuria         1/10 000                        1/100                   1/50
Mucopolysaccharidosis   1/90 000                        1/300                   1/150

IV-3. TO AN AUTOSOMAL, TRIALLELE, NON CO-DOMINANT GENE
Example: the ABO blood group system.
Although the human (ABO) blood group system is often taken to be a simple example of polyallelism, it is in fact a relatively complex situation combining the codominance of A and B, the presence of a null O allele and the dominance of A and B over O.
If we take
p to designate the frequency of allele A,
q to designate the frequency of allele B,
r to designate the frequency of allele O (p + q + r = 1),
the following genotype and phenotype frequencies are found by applying the Hardy-Weinberg law.

Phenotype   Genotype   Genotype frequency   Phenotype frequency
[A]         (AA)       $p^2$                $p^2 + 2pr$
            (AO)       $2pr$
[B]         (BB)       $q^2$                $q^2 + 2qr$
            (BO)       $2qr$
[AB]        (AB)       $2pq$                $2pq$
[O]         (OO)       $r^2$                $r^2$

Using:
$p^2 + 2pr + r^2 = (p + r)^2$
$q^2 + 2qr + r^2 = (q + r)^2$
Where $F[A] + F[O] = (p + r)^2$, $F[B] + F[O] = (q + r)^2$ and $F[O] = r^2$

IV-3.1.
BERNSTEIN's EQUATION (1930)
Bernstein's equation (1930) simplifies the calculations:
$p = 1 - \sqrt{F[B] + F[O]}$
$q = 1 - \sqrt{F[A] + F[O]}$
$r = \sqrt{F[O]}$
Then, if $p + q + r \ne 1$, a correction is applied using the deviation $D = 1 - (p + q + r)$:
→ $p' = p\,(1 + D/2)$
$q' = q\,(1 + D/2)$
$r' = (r + D/2)(1 + D/2)$
Example:
Group   Number   Frequency
A       9123     0.4323
B       2987     0.1415
O       7725     0.3660
AB      1269     0.0601
$p = 1 - \sqrt{0.3660 + 0.1415} = 0.2876$
$q = 1 - \sqrt{0.3660 + 0.4323} = 0.1065$
$r = \sqrt{0.3660} = 0.6050$
p + q + r = 0.9991 → p' = 0.2877, q' = 0.1065, r' = 0.6057.

IV-4. TO A HETEROSOMAL (= gonosomic) GENE
IV-4.1. Y CHROMOSOME
Frequencies p and q in XY subjects; transmission to male descendants.

IV-4.2. X CHROMOSOME
Female   XA1XA1   XA1XA2   XA2XA2
         $p^2$    $2pq$    $q^2$
Male     XA1/Y    XA2/Y
         $p$      $q$
The frequency q of the allele is written $q_x$ in men and $q_{xx}$ in women.
The X chromosome of the boys (in generation n) has been transmitted from the mothers (generation n-1):
→ $q_x(n) = q_{xx}(n-1)$
The X chromosome carrying the allele in the daughters has:
1/2 chance of coming from their father,
1/2 chance of coming from their mother,
→ $q_{xx}(n) = \frac{q_x(n-1) + q_{xx}(n-1)}{2}$
The frequency of the allele in men = the frequency in women in the previous generation.
The frequency of the allele in women = mean of the frequencies in the 2 sexes in the previous generation.
* calculation of the difference in allele frequencies between the 2 sexes:
$q_x(n) - q_{xx}(n) = q_{xx}(n-1) - \frac{q_x(n-1)}{2} - \frac{q_{xx}(n-1)}{2} = -\frac{1}{2}\,\bigl(q_x(n-1) - q_{xx}(n-1)\bigr)$
→ $q_x(n) - q_{xx}(n) = \left(-\frac{1}{2}\right)^n \bigl(q_x(0) - q_{xx}(0)\bigr)$: tends towards zero in 8 to 10 generations
* mean frequency q: 1/3 of the X chromosomes belong to men, 2/3 to women:
$q = \frac{1}{3}\, q_x(n) + \frac{2}{3}\, q_{xx}(n)$
The mean frequency is invariable (develop q1 into q0 ......
→ q1 = q0)
At equilibrium, q(e) is: $q_x(e) = q_{xx}(e) = q(e)$
Exercise: For generation G0, consisting of 100% normal men and 100% color-blind women, calculate the frequencies of the gene up to G6.
Answer
G0: XNY × XDXD
G0: $q_x(0) = 0.00$   $q_{xx}(0) = 1.00$
G1: $q_x(1) = 1.00$   $q_{xx}(1) = 0.50$
G2: $q_x(2) = 0.50$   $q_{xx}(2) = 0.75$
G3: $q_x(3) = 0.75$   $q_{xx}(3) = 0.63$
G4: $q_x(4) = 0.63$   $q_{xx}(4) = 0.69$
G5: $q_x(5) = 0.69$   $q_{xx}(5) = 0.66$
G6: $q_x(6) = 0.66$   $q_{xx}(6) = 0.67$
Therefore: For a sex-linked locus, the Hardy-Weinberg equilibrium is approached asymptotically, and is practically reached after 8-10 generations, whereas it is reached after 1 generation for an autosomal locus.

V- CONSEQUENCES OF THE HW LAW
Regardless of whether we are in a situation subject to HW or not, the genotype frequencies (D, H, R) can be used to calculate the allele frequencies (p, q), from: p = D + H/2, q = R + H/2.
In contrast, if and only if we are subject to HW, the genotype frequencies can be calculated from the allele frequencies, from $D = p^2$, $H = 2pq$, $R = q^2$.
The dominance relationships between alleles have no effect on the change in allele frequencies (although they do affect how difficult the exercises are!).
The allele frequencies remain stable over time, and so do the genotype frequencies.
The random mendelian segregation of the chromosomes preserves the genetic variability of populations.
Since "evolution" is defined as a change in allele frequencies, an ideal diploid population would not evolve.
It is only violations of the properties of an ideal population that allow the evolutionary process to take place.

The practical approach to a problem is always the same:
1. The Numbers Observed → the (Observed) Genotype Frequencies;
2. Calculate the Allele Frequencies: $p = D + \sum_i H_i/2$, q = ...
3. If we are subject to HW (hypothetically), then $D = p^2$, $H = 2pq$, etc.: we calculate the Theoretical Genotype Frequencies according to HW.
4. The Calculated Genotype Frequencies → the Calculated Numbers;
5.
Comparison of Observed Numbers and Calculated Numbers: $\chi^2 = \sum_i (O_i - C_i)^2 / C_i$
6. If χ2 is significant, we are not in accordance with HW; this raises the questions: → Consanguinity? → Selection? → Mutations?

This article should be referenced as such:
Kalmes R, Huret JL. Hardy-Weinberg model. Atlas Genet Cytogenet Oncol Haematol. 2001; 5(2):156-163.
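
The six steps of the practical approach in section V can be sketched in Python, here applied to the MN blood group numbers of section III-3.3. This is a minimal illustration, not the authors' procedure: the function name `hardy_weinberg_chi2` is hypothetical, and the 5% critical value of 3.841 assumes 1 degree of freedom (3 genotype classes, minus 1, minus 1 allele frequency estimated from the data), which the article does not state. Because p is not rounded to 0.54 here, the calculated numbers differ slightly from those given in the text.

```python
def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Steps 1-5: observed numbers -> allele frequencies -> calculated numbers -> chi-square."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one individual is required")
    # Step 2: allele frequencies from the observed genotype frequencies
    p = (n_aa + n_ab / 2) / n   # p = D + H/2
    q = 1 - p                   # q = R + H/2
    # Steps 3-4: theoretical genotype frequencies under HW, then calculated numbers
    calculated = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Step 5: chi-square = sum of (O_i - C_i)^2 / C_i
    # (assumes every calculated number is non-zero, i.e. 0 < p < 1)
    chi2 = sum((o - c) ** 2 / c for o, c in zip(observed, calculated))
    return p, q, calculated, chi2


# Assumed threshold: chi-square critical value at the 5% level for 1 degree of freedom.
CRITICAL_5PCT_1DF = 3.841

# MN blood group numbers from section III-3.3.
p, q, calculated, chi2 = hardy_weinberg_chi2(1787, 3039, 1303)
print(f"p={p:.3f} q={q:.3f}")
print("calculated numbers:", [round(c, 1) for c in calculated])
# Step 6: compare with the critical value
print("deviation from HW is significant:", chi2 > CRITICAL_5PCT_1DF)
```

A significant result would only say that the observed numbers do not fit HW; identifying the cause (consanguinity, selection, mutation) requires further information.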