Lecture IX: Selection Part 2

The Genetic Theory of Natural Selection

Considering a large (infinite) population, we have seen that allele frequencies do not change by Mendelian inheritance alone. With recurrent mutation added, we learned that allele frequencies change very slowly, but that eventually all alleles will end up as mutants or reach an equilibrium frequency that depends on the rate of back mutation. Considering natural populations of finite size, we could show that random genetic drift is a powerful force in changing allele frequencies: rapidly and with high probability if the effective population size is small.

How do allele frequencies change under natural selection? Again, we need to make several simplifying assumptions. We consider a bi-allelic locus with alleles A and a, and start deterministically with an infinitely large population, assuming the following basic life cycle model with discrete, non-overlapping generations:

Figure 1: Basic selection model assuming random mating and post-zygotic viability selection.

We thus assume that natural selection operates only in the left half of the cycle: from zygotes to adults at reproductive age. The right half of the cycle (adults to offspring) involves only random mating. This has the following consequences:

• All selection acts as differential viability: zygotes with different genotypes have different chances of surviving to reproductive age. We do not include differences in fertility; every mating pair produces the same average number of offspring.
• As a consequence, zygotes conform to Hardy-Weinberg proportions each generation because of random mating of parents. In contrast, adults are generally not in Hardy-Weinberg equilibrium (HWE) due to selection.

How do we measure selection? We have introduced a measure for differential reproductive success among genotypes: fitness.
We can assign absolute fitness values ($W$) to each zygote as the number of offspring (at the zygote stage) it will produce in the next generation or during its entire lifetime (lifetime reproductive success, the ultimate measure of fitness).

Table 1: Example of absolute genotype fitness with allele A being dominant.

Genotype   Absolute fitness
AA         $W_{AA} = 3$
Aa         $W_{Aa} = 3$
aa         $W_{aa} = 1$

In the absence of direct information on fitness, researchers often use components of fitness such as early survival or body condition. In the above example the A allele is dominant and obviously advantageous.

Lecture WS, Evolutionary Genetics Part I, Jochen B. W. Wolf

In order to calculate the change of allele frequencies through time, only the relative fitness among genotypes ($w$) matters. We therefore standardize absolute fitness by the fitness of the best genotype. Absolute fitness is only required where the absolute size of the population is of interest. For the dynamics of allele frequency change it does not matter whether the fitness values of AA and aa are 3 vs. 1 or 30 vs. 10.

Table 2: Example of relative genotype fitness with allele A being dominant.

Genotype   Frequency at birth   Relative fitness
AA         $p^2$                $w_{AA} = 1$
Aa         $2pq$                $w_{Aa} = 1$
aa         $q^2$                $w_{aa} = 1/3$

Using this convention, all relative fitness values satisfy $0 \le w \le 1$. Relative fitness is often expressed by means of the selection coefficient $s$, defined by

$$w = 1 - s$$

For the above case, $s = 1 - 1/3 = 2/3$.

In the following we will derive the change in allele frequency caused by selection. We will see that the genetic architecture of a trait under selection plays a role. First we consider the case of a dominant, advantageous allele, which can be nicely illustrated by a simple empirical example.

Rock pocket mice live in the deserts of the American Southwest. Ancestrally, pocket mice had light-colored coats that blended in with the region's rocks and sandy soil, keeping the mice hidden from their owl predators.
Most rock pocket mice have light-coloured fur, but every now and again a dominant mutation (let us call it A) in the MC1R gene causes an increase in the production of dark melanin. These mutant mice have dark-coloured fur. As the mutation is dominant, both AA and Aa mice are dark. In the sandy habitat this mutation is disadvantageous: dark mice are quickly spotted by their predators, which reduces their fitness. However, starting about 1.7 million years ago, a series of volcanic eruptions resulted in wide trails of black lava weaving right through the middle of pocket-mouse territory. In these areas the dark mice are well camouflaged and have a clear fitness advantage over light mice.

Assuming no pleiotropic effects of the mutation (only viability selection by means of predation), large population size, random mating and non-overlapping generations (none of which is too unrealistic in this system), can we predict how allele frequencies would change every generation? Do we expect the entire population to 'turn dark'? How long should the recessive allele conferring light colouration persist?

Directional selection

Example: Advantageous dominant allele, disadvantageous recessive allele ($w_{AA} = w_{Aa} > w_{aa}$), with $f(A) = p$ and $f(a) = q = 1 - p$.

Table 3: Fitness table for a dominant genetic architecture.

Genotype   Frequency before selection   Relative fitness     Frequency after selection (not normalized)
AA         $p^2$                        $w_{AA} = 1$         $p^2 w_{AA} = p^2$
Aa         $2pq$                        $w_{Aa} = 1$         $2pq\,w_{Aa} = 2pq$
aa         $q^2$                        $w_{aa} = 1 - s$     $q^2 w_{aa} = q^2(1 - s)$

Considering the table above, we can conceptualize fitness as a factor that transforms genotype frequencies before selection into genotype frequencies after selection. And if genotype frequencies change, we can expect allele frequencies to change as well. One final step is needed to arrive at the change in allele frequency.
Because of selection (some individuals were removed), the genotype frequencies after selection no longer add up to one:

$$p^2 + 2pq + q^2(1 - s) = 1 - q^2 s$$

We therefore need to normalize the genotype frequencies. We do this by dividing each of them by their sum, i.e. by the total frequency of gametes expected to enter the next generation after selection has acted. In general terms, this sum is given by the frequency of each genotype weighted by its fitness:

$$\bar{w} = p^2 w_{AA} + 2pq\,w_{Aa} + q^2 w_{aa}$$

$\bar{w}$ is the mean fitness of the population. It measures the average fitness of an individual in the population relative to the fittest genotype. As a relative measure it cannot be interpreted as a population parameter indicating whether a population is growing or not. In our particular case $\bar{w} = 1 - q^2 s$ (see above). Accordingly, the normalized frequencies of the genotypes AA, Aa and aa after selection sum to one:

$$\frac{p^2}{\bar{w}} + \frac{2pq}{\bar{w}} + \frac{q^2(1 - s)}{\bar{w}} = 1$$

We can now calculate the change in allele frequency. Let $p'$ be the frequency of A in the following generation:

$$p' = \frac{p^2}{\bar{w}} + \frac{1}{2}\,\frac{2pq}{\bar{w}} = \frac{p(p + q)}{\bar{w}} = \frac{p}{\bar{w}} = \frac{p}{1 - q^2 s}$$

The change in frequency then is:

$$\Delta_s p = p' - p = \frac{p}{1 - q^2 s} - p = \frac{p - p + pq^2 s}{1 - q^2 s} = \frac{pq^2 s}{1 - q^2 s}$$

Hence, as long as the A allele is advantageous ($0 < s < 1$), the A allele becomes more common. The increase in frequency will be slow at first, since initially only heterozygotes contribute. Once intermediate frequencies are reached, both heterozygotes and AA homozygotes contribute, $\Delta_s p$ is large and A increases fast in frequency. Finally, the increase in A (which equals the decrease in a, since $\Delta_s q = -\Delta_s p$) slows down again as the a allele becomes rarer. Since the change is proportional to $q^2$, it is small if $q$ is small. This makes sense: as long as the a allele is rare, it will mostly occur in heterozygotes, and we have assumed that selection acts against the deleterious allele in the homozygous state only.
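The recursion $\Delta_s p = pq^2 s/(1 - q^2 s)$ can be iterated numerically to see the slow-fast-slow pattern described above. The following is a minimal Python sketch and not part of the original lecture; the function names `delta_p_dominant` and `iterate_dominant` as well as the illustrative values ($s = 0.1$, starting frequency 0.01, 100 generations) are assumptions chosen for demonstration only.

```python
def delta_p_dominant(p, s):
    """Change per generation in the frequency p of a dominant advantageous
    allele A, with w_AA = w_Aa = 1 and w_aa = 1 - s."""
    q = 1.0 - p
    mean_fitness = 1.0 - q * q * s      # w-bar for this fitness scheme
    return p * q * q * s / mean_fitness


def iterate_dominant(p0, s, generations):
    """Return the list of allele frequencies p_0, p_1, ..., p_generations."""
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        p = p + delta_p_dominant(p, s)
        trajectory.append(p)
    return trajectory


# Illustrative (hypothetical) parameter values, not taken from the lecture.
trajectory = iterate_dominant(p0=0.01, s=0.1, generations=100)
for generation in (0, 25, 50, 75, 100):
    print(generation, round(trajectory[generation], 4))
```

Printing the trajectory at a few time points is enough to compare the per-generation change at low, intermediate and high frequencies of A.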
These dynamics hold across a large range of selection coefficients (Figure 2).

Figure 2: Change in allele frequency of the advantageous allele A, $\Delta_s p$, as a function of its frequency $p$ ($f_A$), shown for different selection coefficients.

Applying the above equation to the mouse example, we see that for low initial frequencies (which we would expect for a novel mutation) change is very slow at the beginning, but speeds up as the number of heterozygotes increases. Once a high frequency of A is reached in the population, the change per generation slows down significantly, as only the few mice with aa genotypes are 'seen' by selection and most a alleles 'hide' in the heterozygous state. Nonetheless, after only about 100 generations most of the population has turned dark.

Figure 3: Increase in frequency of the dominant advantageous allele A with relative fitness $w_{AA} = 1$, $w_{Aa} = 1$ and $w_{aa} = 0.9$ and different starting frequencies of the A allele, $p$ = 0.5, 0.1, 0.01, 0.001, 0.0001 (from left to right).

It turns out, however, that most genetic architectures cannot simply be grouped into dominant, additive or recessive. For example, in Drosophila recessive lethal mutations have an effect of about 3% in heterozygotes, and mildly deleterious alleles show as much as 30-40% of the homozygous effect in heterozygotes. We can take such dominance deviations into account by formulating a general selection model.

General Selection Model

Instead of assuming a discrete genetic architecture we now introduce the dominance coefficient $h$. We also change the allele notation to A1 (advantageous allele) and A2 (deleterious allele) instead of A and a, as the latter notation is often associated with full dominance or recessivity.

h = 0: the deleterious allele (A2) is completely recessive
h = 1: the deleterious allele (A2) is completely dominant

Table 4: Fitness table for the general selection model with dominance coefficient h.

Genotype   Frequency before selection   Relative fitness      Frequency after selection (not normalized)   Frequency after selection (normalized)
A1A1       $p^2$                        $w_{11} = 1$          $p^2 w_{11}$                                 $p^2 w_{11} / \bar{w}$
A1A2       $2pq$                        $w_{12} = 1 - hs$     $2pq\,w_{12}$                                $2pq\,w_{12} / \bar{w}$
A2A2       $q^2$                        $w_{22} = 1 - s$      $q^2 w_{22}$                                 $q^2 w_{22} / \bar{w}$

Mean fitness (normalizing factor):

$$\bar{w} = p^2 w_{11} + 2pq\,w_{12} + q^2 w_{22}$$

Change in allele frequencies:

$$p' = \frac{p^2 w_{11}}{\bar{w}} + \frac{1}{2}\,\frac{2pq\,w_{12}}{\bar{w}} = \frac{p^2 w_{11} + pq\,w_{12}}{\bar{w}}$$

$$\Delta_s p = p' - p = \frac{p^2 w_{11} + pq\,w_{12}}{\bar{w}} - p = \frac{p^2 w_{11} + pq\,w_{12} - p\bar{w}}{\bar{w}} = \frac{p\left[(p - p^2)\,w_{11} + q(1 - 2p)\,w_{12} - q^2 w_{22}\right]}{\bar{w}}$$

Using $p - p^2 = pq$ and $1 - 2p = q - p$, this simplifies to

$$\Delta_s p = \frac{pq\left[p(w_{11} - w_{12}) + q(w_{12} - w_{22})\right]}{\bar{w}}$$

or

$$\Delta_s p = \frac{pqs\left[ph + q(1 - h)\right]}{\bar{w}}$$

where we substitute 1 for $w_{11}$, $1 - hs$ for $w_{12}$ and $1 - s$ for $w_{22}$.

This is one of the most important equations in evolutionary biology. We note three points from it:

• Because $w_{11} > w_{12} > w_{22}$, $\Delta_s p$ is always positive. Thus the advantageous allele will always increase in frequency (which makes sense) and will eventually fix.
• The larger the fitness differences, the faster the change in frequency.
• Change is faster at intermediate allele frequencies (due to the $pq$ term at the beginning of the equation).

Selection maintaining genetic variation

Selection does not necessarily deplete genetic variation, as it does under the directional selection just described. In cases where the fitness of the heterozygote is greater than that of both homozygotes, such heterozygote advantage (also termed overdominance) will maintain genetic polymorphism at the locus. This situation can no longer be modelled with a single selection coefficient.

Genotype   Frequency before selection   Relative fitness
A1A1       $p^2$                        $w_{11} = 1 - s$
A1A2       $2pq$                        $w_{12} = 1$
A2A2       $q^2$                        $w_{22} = 1 - t$

Substituting the fitness values for each genotype into the general equation from above yields:

$$\Delta_s p = \frac{pq\left[p(w_{11} - w_{12}) + q(w_{12} - w_{22})\right]}{\bar{w}} = \frac{pq(qt - ps)}{\bar{w}}$$

If we set $\Delta_s p = 0$ we obtain the equilibrium allele frequencies from $qt - ps = 0$, or $(1 - p)t = ps$. Solving for $p$ we get

$$\hat{p} = \frac{t}{s + t} \quad \text{and} \quad \hat{q} = \frac{s}{s + t}$$

These are the equilibrium frequencies at which selection will no longer change allele frequencies.
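The general model can be checked numerically by computing $p'$ directly from the normalized genotype frequencies after selection. The following is a minimal Python sketch under that approach and not part of the original lecture; the function names and the illustrative values $s = 0.1$ and $t = 0.3$ are assumptions made for demonstration.

```python
def mean_fitness(p, w11, w12, w22):
    """Mean fitness w-bar = p^2 w11 + 2pq w12 + q^2 w22."""
    q = 1.0 - p
    return p * p * w11 + 2.0 * p * q * w12 + q * q * w22


def delta_p(p, w11, w12, w22):
    """Change in the frequency p of allele A1 after one generation of
    viability selection followed by random mating."""
    q = 1.0 - p
    w_bar = mean_fitness(p, w11, w12, w22)
    # A1A1 homozygotes contribute only A1 gametes, heterozygotes one half.
    p_next = (p * p * w11 + p * q * w12) / w_bar
    return p_next - p


def overdominance_equilibrium(s, t):
    """Equilibrium frequency of A1 for w11 = 1 - s, w12 = 1, w22 = 1 - t."""
    return t / (s + t)


# Hypothetical selection coefficients against the two homozygotes.
s, t = 0.1, 0.3
p_hat = overdominance_equilibrium(s, t)
for p in (0.5, p_hat, 0.9):
    print(round(p, 3), delta_p(p, 1.0 - s, 1.0, 1.0 - t))
```

Starting below the equilibrium gives a positive change and starting above it a negative one, so in this sketch the allele frequency moves toward $t/(s + t)$ from both sides.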
This polymorphic equilibrium is stable: if allele frequencies drift away (see below), selection will push them back. Overdominance thus does not remove an allele from a population; it maintains both alleles at a ratio given by the relative fitness values of the two homozygotes. Note that the proportion of heterozygotes can never reach 1, since random mating will inevitably create new homozygotes each generation.

Example: A classic example in humans is the beta-haemoglobin locus in some African and Mediterranean populations. One allele at the locus, which differs by one amino acid substitution from normal haemoglobin (A), encodes sickle-cell haemoglobin (S). At low oxygen concentrations sickle-cell haemoglobin forms elongate crystals and is less efficient in transporting oxygen. In the homozygous state, SS, the sickle allele distorts the red blood cells, causing severe anemia and damage to blood capillaries; carriers mostly die before they reach reproductive age. Heterozygotes AS only suffer slight anemia. Homozygotes AA transport oxygen best and show no signs of anemia. At first sight this sounds like a clear case of directional selection in which the A allele would be favoured. However, in regions exposed to malaria, selection pressures change. Here, normal AA homozygotes suffer higher mortality from malaria than AS heterozygotes, whose red blood cells are broken down faster. This provides worse conditions for the malaria parasite Plasmodium falciparum to develop in the red blood cells.

Genotype frequencies in a sample of 12387 individuals from Nigeria have been estimated as follows:

Genotype   Adults   Frequency
SS         29       0.234%
AS         2993     24.162%
AA         9365     75.603%

Given this basic information, can we estimate genotype fitness and predict the frequency of the mutant allele at equilibrium? We obtain the frequency $q$ of the S allele as $q = (29 + 2993/2)/12387 = 0.123$.
The expected zygote frequencies (not yet affected by selection, assuming Hardy-Weinberg equilibrium) are accordingly:

Genotype   Expected zygote frequency
SS         1.512%
AS         21.574%
AA         76.913%

Dividing the adult frequencies by the zygote frequencies we obtain the fitness measures:

Genotype   Adult frequency / zygote frequency
SS         0.155
AS         1.120
AA         0.983

Normalizing the maximum fitness (that of AS) to 1 gives the estimates $w_{SS} = 0.138$, $w_{AS} = 1$ and $w_{AA} = 0.878$. Hence, $t = 0.862$ and $s = 0.122$. At equilibrium, we would therefore expect the S allele to be at $\hat{q} = s/(s + t) = 0.124$, very close to the observed frequency of 0.123.

Selection-Drift equilibrium

We have seen that directional selection will always increase the frequency of a beneficial allele. This property, however, only holds if we assume an infinitely large population. In finite populations, we need to factor in the effects of random sampling from one generation to the next. When an advantageous allele is at low frequency, it is easy to conceive that it can still get lost because the few individuals who carry it happen not to survive or reproduce. Therefore, if we want to understand the role of natural selection in real populations, we need to factor in population size.

Remember that the probability of fixation for any neutrally evolving allele A is given by its frequency $p$. For a novel mutation the initial frequency is $1/(2N)$, and its probability of fixation is accordingly $1/(2N)$. It is intuitive that selection favouring a new mutation will increase the probability of fixation, but not as much as one might think. Provided that $N$ is large and $s$ is small in absolute value (which allows the use of mathematical diffusion theory), Kimura showed that for any allele A the probability of fixation is

$$\Pr\nolimits_{\mathrm{fix}}(A \mid s, N_e) = \frac{1 - e^{-2cN_e s p}}{1 - e^{-2cN_e s}}$$

where $N_e$ is the variance effective population size, $c$ indicates the ploidy level (1 = haploid, 2 = diploid), $s$ is the selection coefficient and $p$ is the frequency of allele A. This result covers both deleterious mutations ($s < 0$) and advantageous mutations ($s > 0$).
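To get a feeling for the magnitude of this probability, the formula can be evaluated directly. The following is a minimal Python sketch and not part of the original lecture; the function name `fixation_probability`, the treatment of $s = 0$ as the neutral limit, and the illustrative parameter values (including the assumption that census and effective size coincide, $N = N_e$) are choices made here for demonstration.

```python
import math


def fixation_probability(s, n_e, p, c=2):
    """Diffusion approximation for the fixation probability of an allele
    at frequency p: (1 - exp(-2 c Ne s p)) / (1 - exp(-2 c Ne s)).

    s   selection coefficient (negative for deleterious alleles)
    n_e variance effective population size
    c   ploidy level (1 = haploid, 2 = diploid)
    """
    x = 2.0 * c * n_e * s
    if x == 0.0:
        # Neutral limit: the fixation probability equals the frequency p.
        return p
    # expm1(y) = exp(y) - 1 stays accurate for small |y|; the two minus
    # signs of (1 - exp(.)) cancel in the ratio.
    # Note: math.expm1 raises OverflowError for very large arguments,
    # i.e. for strongly deleterious alleles in very large populations.
    return math.expm1(-x * p) / math.expm1(-x)


# Hypothetical example: a new mutation in a diploid population, N = Ne = 500.
n = 500
for s in (-0.01, 0.0, 0.01):
    print(s, fixation_probability(s, n_e=n, p=1.0 / (2 * n)))
```

The neutral case is handled separately because the formula takes the form 0/0 at $s = 0$.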
Considering specifically a diploid organism subject to additive selection (allele A is co-dominant), the expected probability of fixation for a novel mutant allele (with frequency $1/(2N)$, taking $N = N_e$) simplifies to

$$\Pr\nolimits_{\mathrm{fix}}(A \mid s, N_e) = \frac{1 - e^{-2s}}{1 - e^{-4N_e s}}$$

Figure 4: Graph illustrating the above equation, i.e. the fixation probability $P_{\mathrm{fix}}(A)$ as a function of the product of selection and population size ($2N_e s$). The shaded area (nearly neutral zone) corresponds to $-1 < 2N_e s < 1$, where the probability of fixation is dominated by genetic drift and close to that of a neutral allele, $1/(2N)$.

We often distinguish three ranges of selection intensity that illustrate properties of the fixation probability for a novel mutation.

Neutrality and near-neutrality

$2N_e s = 0$: In the case of no selection (neutrality), we already know that the fixation probability is $1/(2N)$.

$-1 < 2N_e s < 1$: For $2N_e s$ between $-1$ and $1$ the dynamics are mainly determined by the effects of genetic drift. The probability of fixation for the A allele is therefore close to that of a neutral allele, $\Pr_{\mathrm{fix}}(A) \approx 1/(2N_e)$. We therefore characterize this range of the population-scaled selection coefficient ($2N_e s$) as nearly neutral. Note that the actual threshold at which fitness differences matter depends critically on the effective population size. In populations of, say, $N_e = 50$ it requires large selection coefficients ($s = 1/(2N_e) = 0.01$) to overcome drift dynamics. In large populations of, say, $N_e = 50000$ very small fitness differences will affect the probability of fixation ($s = 0.00001$). In animal conservation, we often speak of adaptive genetic variation, i.e. genetic variation that may potentially help a population to adapt to rapid (human-imposed) environmental change. From the above equation, it is apparent that potentially advantageous alleles stand a much larger chance of getting lost by genetic drift in small populations.
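As a brief sketch of why the fixation probability approaches the neutral value when selection is weak relative to drift, one can expand both exponentials to first order ($e^{-x} \approx 1 - x$ for small $|x|$). This step is not part of the original lecture; it assumes $N = N_e$ and is only accurate when $|4N_e s|$ is well below 1, so within the nearly neutral range it should be read as a limiting argument rather than an exact statement.

```latex
\Pr\nolimits_{\mathrm{fix}}(A \mid s, N_e)
  = \frac{1 - e^{-2s}}{1 - e^{-4 N_e s}}
  \approx \frac{2s}{4 N_e s}
  = \frac{1}{2 N_e}
  \qquad \text{for } |4 N_e s| \ll 1 .
```

The selection coefficient cancels in this limit, which is why the fate of such an allele is governed by population size alone.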
Strong selection

$2N_e s > 1$: In the case of strongly advantageous alleles, $\Pr_{\mathrm{fix}}(A) \approx 2s$. This remarkable result shows that selection is not omnipotent. Counter-intuitively, in a population of $N_e = 100$ an allele with a selection coefficient of $s = 0.1$ is strongly advantageous, but is only expected to fix in 20% of the cases and to be lost in 80% of the cases.

$2N_e s < -1$: In the case of strongly deleterious alleles the fixation probability is very small and decreases rapidly towards 0 ($\Pr_{\mathrm{fix}}(A) \approx 0$) as $s$ becomes more negative. However, a disadvantageous mutant with an additive effect and a selection coefficient of $-0.001$ has a 0.4% chance of becoming fixed in a population of $N_e = 100$. Again, population size makes all the difference. In a population of $N_e = 1000$ the fixation probability diminishes to 0.004%. This explains why small human communities with a long history of isolation and inbreeding (small $N_e$) show increased rates of heritable diseases. It is also one of the reasons why we are worried when a species declines significantly in numbers and populations become fragmented and isolated. It simply requires a certain population size to purge deleterious mutations.

Literature: (Barton et al. 2007; Futuyma 2013; Nielsen and Slatkin 2013)

Barton NH, Briggs DEG, Eisen JA, Goldstein DB, Patel NH. 2007. Evolution. 1st ed. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press.
Futuyma DJ. 2013. Evolution. 3rd ed. Sinauer Associates.
Kimura M. 1983. The neutral theory of molecular evolution. Cambridge University Press.
Nielsen R, Slatkin M. 2013. An Introduction to Population Genetics: Theory and Applications. Sunderland, Mass.: Macmillan Education.