Lecture 3: Allele Frequencies and Hardy-Weinberg Equilibrium
August 24, 2015

Review of genetic variation and Mendelian genetics
- Sample calculations for Mendelian expectations: see solutions in the Excel file on the website
Methods for detecting variation
- Morphology
- Allozymes
- DNA markers (deferred to Friday: guest lecture)
  - Anonymous
  - Sequence-tagged

Today
- Introduction to statistical distributions
- Estimating allele frequencies
- Introduction to Hardy-Weinberg Equilibrium
- Using Hardy-Weinberg: estimating allele frequencies for dominant loci

Statistical Distributions: Normal Distribution
- Many types of estimates follow a normal distribution
- Can be visualized as a frequency distribution (histogram)
- Can be interpreted as a probability density function
[Figure: normal curve with 1 sd and 2 sd marked]
- Expected Value (Mean): $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where n is the number of samples
- Variance ($V_x$): a measure of the dispersion around the mean: $V_x = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
- Standard Deviation (sd): a measure of dispersion around the mean that is on the same scale as the mean: $sd = \sqrt{V_x}$

Standard Error of Mean
- Standard deviation is a measure of how individual points differ from the mean estimate in a single sample
- Standard error is a measure of how much the estimate differs from the true parameter value (in the case of means, μ)
- If you repeated the experiment, how close would you expect the mean estimate to be to your previous estimate?
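The summary statistics on these slides can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `summarize` and the sample values are hypothetical, and the standard error is assumed to be computed as the square root of V_x / n, with the 95% confidence interval taken as the mean plus or minus 1.96 standard errors.

```python
import math

def summarize(samples):
    """Return mean, sample variance, sd, standard error, and 95% CI."""
    n = len(samples)
    mean = sum(samples) / n
    # Sample variance uses n - 1 in the denominator, matching the V_x formula.
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    sd = math.sqrt(variance)
    # Assumed form of the standard error of the mean: sqrt(V_x / n).
    se = math.sqrt(variance / n)
    ci = (mean - 1.96 * se, mean + 1.96 * se)
    return mean, variance, sd, se, ci

# Hypothetical measurements, not data from the lecture.
measurements = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, variance, sd, se, ci = summarize(measurements)
print(mean, variance, sd, se, ci)
```

Note that sd describes the spread of individual points, while se shrinks as n grows, which is why repeated experiments with larger samples give mean estimates that sit closer together.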
- Standard Error of the Mean (se): $se = \sqrt{\frac{V_x}{n}}$
- 95% Confidence Interval: $\bar{x} \pm 1.96\,(se)$

Estimating Allele Frequencies, Codominant Loci
- The measured allele frequency is the maximum likelihood estimator of the true frequency of the allele in the population (see Hedrick, pp. 82-83, for the derivation):
  $p = \frac{N_{11} + \frac{1}{2} N_{12}}{N}$
- Expected number of observations of allele A1: $E(Y) = np$, where n is the number of samples
- For diploid organisms, $n = 2N$, where N is the number of individuals sampled
- The expected number of observations of allele A1 is analogous to the mean of a sample from a normal distribution
- Allele frequency can also be interpreted as an estimate of the mean

Allele Frequency Example
- Assume a population of Mountain Laurel (Kalmia latifolia) at Cooper's Rock, WV
  Red buds (A1A1): 5000
  Pink buds (A1A2): 3000
  White buds (A2A2): 2000
- Phenotype is determined by a single, codominant locus: Anthocyanin
- What is the frequency of "red" alleles (A1) and "white" alleles (A2)?
  Frequency of A1 = p: $p = \frac{N_{11} + \frac{1}{2} N_{12}}{N} = \frac{2N_{11} + N_{12}}{2N}$
  Frequency of A2 = q: $q = \frac{N_{22} + \frac{1}{2} N_{12}}{N} = \frac{2N_{22} + N_{12}}{2N}$

Allele Frequencies are Distributed as Binomials
- Based on samples from a population
- For a two-allele system, each sample is like a "trial": does the individual contain allele A1?
- Remember, q = 1 - p, so only one parameter is estimated
- Binomials are variables that can be interpreted as the number of successes and failures in a series of trials:
  $P(Y = y) = \binom{n}{y} s^{y} f^{\,n-y}$
  where s is the probability of a success and f is the probability of a failure
- $s^{y} f^{\,n-y}$: probability of observing y positive results in n trials once
- $\binom{n}{y} = \frac{n!}{y!\,(n-y)!}$: number of ways of observing y positive results in n trials
- Given the allele frequencies that you calculated earlier for Cooper's Rock Kalmia latifolia, what is the probability of observing two "white" alleles in a sample of two plants?
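The allele-frequency estimator and the binomial formula can be tried on the Cooper's Rock counts. The sketch below is one possible reading of the closing question, not the lecture's worked answer: it assumes that two diploid plants contribute n = 4 allele draws and that "two white alleles" means exactly two A2 alleles among those four. The function names are hypothetical.

```python
from math import comb

def allele_frequencies(n11, n12, n22):
    """Estimate p (A1) and q (A2) from codominant genotype counts."""
    n = n11 + n12 + n22
    p = (n11 + 0.5 * n12) / n
    q = (n22 + 0.5 * n12) / n
    return p, q

def binomial_probability(y, n, s):
    """P(Y = y) for n trials with success probability s."""
    return comb(n, y) * s ** y * (1 - s) ** (n - y)

# Genotype counts from the slide: red (A1A1), pink (A1A2), white (A2A2).
p, q = allele_frequencies(5000, 3000, 2000)

# Assumption: 2 diploid plants = 4 allele draws, exactly 2 of them A2.
prob_two_white = binomial_probability(2, 4, q)
print(p, q, prob_two_white)
```

Under a different reading of the question (for example, sampling only two alleles in total), the same `binomial_probability` function applies with different values of y and n.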
Variation in Allele Frequencies, Codominant Loci
- Binomial variance is pq, or p(1-p)
- Variance in number of observations of A1: $V(Y) = np(1-p)$
- Variance in allele frequency estimates (codominant, diploid): $V_p = \frac{p(1-p)}{2N}$
- Standard error of allele frequency estimates: $SE_p = \sqrt{\frac{p(1-p)}{2N}}$
- Notice that estimates get better as sample size increases
- Notice also that variance is maximum at intermediate allele frequencies

[Figure: Maximum variance as a function of allele frequency for a codominant locus; x-axis: p from 0 to 1; y-axis: p(1-p) from 0 to 0.3]

Why is variance highest at intermediate allele frequencies?
[Figure: two targets, one with p = 0.5 and one with p = 0.125]
- If this were a target, how variable would your outcome be in each case (red versus white hits)?
- Variance is constrained when the value approaches its limits (0 or 1)

What if there are more than 2 alleles?
- General formula for calculating allele frequencies in a multiallelic system with codominant alleles:
  $p_i = \frac{N_{ii} + \frac{1}{2}\sum_{j=1,\, j \neq i}^{n} N_{ij}}{N}$
- Variance and standard error of allele frequency estimates remain:
  $V_{p_i} = \frac{p_i(1-p_i)}{2N}$, $SE_{p_i} = \sqrt{\frac{p_i(1-p_i)}{2N}}$

How do we estimate allele frequencies for dominant loci?
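Before turning to dominant loci, the codominant variance, standard-error, and multiallelic formulas above can be sketched in Python. This is an illustration under stated assumptions: the function names are hypothetical, and the three-allele genotype counts are invented for the example rather than taken from the lecture. Only the Kalmia values (p = 0.65, N = 10000) come from the slides.

```python
import math

def allele_frequency_se(p, n_individuals):
    """Standard error of a codominant, diploid allele frequency estimate."""
    return math.sqrt(p * (1 - p) / (2 * n_individuals))

def multiallelic_frequency(genotype_counts, allele):
    """Frequency of one allele given counts keyed by (allele, allele) pairs.

    Counting allele copies (2 per homozygote, 1 per heterozygote) over 2N
    is equivalent to (N_ii + 0.5 * sum of N_ij) / N.
    """
    total = sum(genotype_counts.values())
    copies = 0
    for (a, b), count in genotype_counts.items():
        copies += count * ((a == allele) + (b == allele))
    return copies / (2 * total)

# Kalmia example from the slides: p = 0.65, N = 10000 plants.
se_kalmia = allele_frequency_se(0.65, 10000)

# Hypothetical three-allele genotype counts (N = 100), for illustration only.
counts = {
    ("A1", "A1"): 10, ("A1", "A2"): 20, ("A1", "A3"): 10,
    ("A2", "A2"): 30, ("A2", "A3"): 20, ("A3", "A3"): 10,
}
freqs = {a: multiallelic_frequency(counts, a) for a in ("A1", "A2", "A3")}
print(se_kalmia, freqs)
```

For a fixed N, `allele_frequency_se` is largest at p = 0.5 and falls toward zero as p approaches 0 or 1, which matches the shape of the p(1-p) curve.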
[Figure: genotypes A1A1, A1A2, and A2A2 shown for a codominant locus and for a dominant locus]

Hardy-Weinberg Law
- Hardy and Weinberg came up with this simultaneously in 1908
- After one generation of random mating, single-locus genotype frequencies can be represented by a binomial (with 2 alleles) or a multinomial function of allele frequencies:
  $(p + q)^2 = p^2 + 2pq + q^2$
  Frequency of A1A1 (P) = $p^2$
  Frequency of A1A2 (H) = $2pq$
  Frequency of A2A2 (Q) = $q^2$

Hardy-Weinberg Equilibrium
- After one generation of random mating, genotype frequencies remain constant, as long as allele frequencies remain constant
- Provides a convenient neutral model to test for departures from assumptions
- Allows genotype frequencies to be represented by allele frequencies: simplification of calculations

New Notation
  Genotype:   AA   Aa   aa
  Frequency:  P    H    Q
  Allele:     A    a
  Frequency:  p    q

Hardy-Weinberg Assumptions
- Diploid
- Large population
- Random mating: equal probability of mating among genotypes
- No mutation
- No gene flow
- Equal allele frequencies between sexes
- Nonoverlapping generations

Graphical Representation of Hardy-Weinberg Law
  $(p + q)^2 = p^2 + 2pq + q^2 = 1$
[Figure: Relationship Between Allele Frequencies and Genotype Frequencies under Hardy-Weinberg]

Hardy-Weinberg Law and Probability
            A (p)        a (q)
  A (p)     AA ($p^2$)   Aa ($pq$)
  a (q)     aA ($qp$)    aa ($q^2$)
  $p^2 + 2pq + q^2 = 1$

How does Hardy-Weinberg Work?
- Reproduction is a sampling process
- Example: Mountain Laurel at Cooper's Rock
  Red flowers (A1A1): 5000
  Pink flowers (A1A2): 3000
  White flowers (A2A2): 2000
- What are the expected numbers of phenotypes and genotypes in a sample of 20 trees?
  Alleles: A1 = 26, A2 = 14
  Frequency of A1 = p = 0.65
  Frequency of A2 = q = 0.35
- What are the expected frequencies of alleles in pollen and ovules?
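The sampling questions above can be explored with a short Python sketch. It assumes that a sample of 20 trees is expected to mirror the population proportions, and that the next generation follows the Hardy-Weinberg proportions $p^2$, $2pq$, $q^2$ under random mating. The function and variable names are hypothetical.

```python
def hardy_weinberg(p):
    """Return expected genotype frequencies (P, H, Q) for allele frequency p."""
    q = 1 - p
    return p * p, 2 * p * q, q * q

# Genotype counts from the slide.
counts = {"A1A1": 5000, "A1A2": 3000, "A2A2": 2000}
n = sum(counts.values())

# Expected genotype counts in a sample of 20 trees from the current population.
sample_of_20 = {g: 20 * c / n for g, c in counts.items()}

# Allele frequency of A1 in the adults (and, by assumption, in the gamete pool).
p = (counts["A1A1"] + 0.5 * counts["A1A2"]) / n

# Expected genotype frequencies after one generation of random mating.
P, H, Q = hardy_weinberg(p)
print(sample_of_20, p, (P, H, Q))
```

Comparing `sample_of_20` (scaled to frequencies) with `(P, H, Q)` shows that the current population is not in Hardy-Weinberg proportions, even though the allele frequencies themselves carry over unchanged.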
  Genotypes: 10 A1A1 : 6 A1A2 : 4 A2A2
  Phenotypes: 10 red : 6 pink : 4 white
- What will be the genotype and phenotype frequencies in the next generation?
- What assumptions must we make?

What about a 3-Allele System?
- Alleles occur in the gamete pool at the same frequency as in adults
- The probability of two alleles coming together to form a zygote is the joint probability $P(A \cap B)$, which under random mating is the product of the two gamete frequencies
  Pollen gametes: A1 (p), A2 (q), A3 (r)
  Ovule gametes: A1 (p), A2 (q), A3 (r)
  A1A1 = $p^2$
  A1A2 = $2pq$
  A1A3 = $2pr$
  A2A2 = $q^2$
  A2A3 = $2qr$
  A3A3 = $r^2$
  From Neal, D. 2004. Introduction to Population Biology.
- Equilibrium is established with ONE GENERATION of random mating
- Genotype frequencies remain stable as long as allele frequencies remain stable
- Remember the assumptions!
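The three-allele case can be sketched by pairing every pollen gamete with every ovule gamete and multiplying their frequencies. This is a minimal illustration that assumes random union of gametes with identical allele frequencies in pollen and ovules; the allele frequencies used (0.5, 0.3, 0.2) and the function name are hypothetical and do not come from the lecture.

```python
from itertools import product

def genotype_frequencies(allele_freqs):
    """Expected genotype frequencies under random union of gametes.

    allele_freqs maps allele names to frequencies in the gamete pool.
    Unordered genotypes are keyed by a sorted (allele, allele) tuple, so the
    two heterozygote cells of the pollen-by-ovule table add up to 2 * fa * fb.
    """
    genotypes = {}
    for (a, fa), (b, fb) in product(allele_freqs.items(), repeat=2):
        key = tuple(sorted((a, b)))
        genotypes[key] = genotypes.get(key, 0.0) + fa * fb
    return genotypes

# Hypothetical allele frequencies p, q, r for A1, A2, A3.
gametes = {"A1": 0.5, "A2": 0.3, "A3": 0.2}
genos = genotype_frequencies(gametes)
for genotype, freq in sorted(genos.items()):
    print(genotype, round(freq, 4))
```

The same function handles any number of alleles, since the multinomial expansion is just the full pollen-by-ovule table collapsed over gamete order.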