Lecture IX: Selection Part 2
The Genetic Theory of Natural Selection
Considering a large (infinite) population, we have seen that allele frequencies do not
change by Mendelian inheritance alone. With recurrent mutation added, we learned that allele
frequencies change very slowly: eventually all alleles end up as mutants, or an equilibrium
frequency is reached that depends on the rate of back mutation. Considering natural populations
of finite size, we could show that random genetic drift is a powerful force in changing allele
frequencies, rapidly and with high probability if the effective population size is small. How
do allele frequencies change under natural selection? Again, we need to make several
simplifying assumptions. We consider a bi-allelic locus with alleles A and a, and start
with a deterministic model of an infinitely large population, assuming the following basic life
cycle with discrete, non-overlapping generations:
Figure 1: Basic Selection model assuming random mating and post-zygotic viability selection.
We thus assume that natural selection only operates in the left half of the cycle: from zygotes
to adults at reproductive age. The right half of the cycle (adults to offspring) involves only
random mating. This has the following consequences:
• All selection acts as differential viability: zygotes with different genotypes have a
different chance to survive to reproductive age. We do not include differences in
fertility. Every mating pair produces the same average number of offspring.
• As a consequence, zygotes conform to Hardy-Weinberg proportions each generation
because of random mating of parents. In contrast, adults are generally not in Hardy-Weinberg
equilibrium (HWE) due to selection.
How do we measure selection? We have introduced a measure for differential reproductive
success among genotypes: fitness. We can assign absolute fitness values (W) to each zygote
as the number of offspring (in zygote stage) it will produce in the next generation or during its
entire lifetime (lifetime reproductive success – the ultimate measure of fitness).
Table 1: Example of absolute genotype fitness with allele A being dominant.
Genotype    Absolute fitness
AA          WAA = 3
Aa          WAa = 3
aa          Waa = 1
Lacking direct information on fitness, researchers often use components of fitness such as
early survival or body condition. In the above example the A allele is dominant and obviously
advantageous. In order to calculate the change of allele frequencies through the course of
time, only the relative fitness among genotypes (w) matters. We therefore standardize
absolute fitness by the fitness of the best genotype. Absolute fitness is only required where
the absolute size of the population is of interest. For the dynamics of allele frequency change
it does not matter if fitness of AA and aa are 3 vs. 1 or 30 vs. 10.
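The standardization can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the function name `relative_fitness` and the rescaled values (30, 30, 10) are assumptions chosen to mirror the "3 vs. 1 or 30 vs. 10" remark.

```python
def relative_fitness(absolute):
    """Scale absolute fitness values so that the fittest genotype has w = 1."""
    w_max = max(absolute.values())
    return {genotype: W / w_max for genotype, W in absolute.items()}


# Absolute fitness values from Table 1 (allele A dominant).
table1 = {"AA": 3, "Aa": 3, "aa": 1}
# Hypothetical rescaling by a factor of 10: the ratios stay the same.
scaled = {"AA": 30, "Aa": 30, "aa": 10}

w = relative_fitness(table1)
w_scaled = relative_fitness(scaled)

for genotype in table1:
    print(genotype, w[genotype], w_scaled[genotype])
```

Both inputs give the same relative fitness values, which is why only the ratios matter for allele frequency change.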
Table 2: Example of relative genotype fitness with allele A being dominant.
Genotype    Frequency at birth    Relative fitness
AA          p^2                   wAA = 1
Aa          2pq                   wAa = 1
aa          q^2                   waa = 1/3
Using this convention, all relative fitness values are 0 ≤ w ≤ 1. Relative fitness is often
expressed by means of the selection coefficient defined as:
w = 1-s
For the above case s=1-1/3 = 2/3. In the following we will derive the change in allele
frequency caused by selection. We will see that the genetic architecture of a trait under
selection plays a role. First we consider the case of a dominant, advantageous allele, which
can be nicely illustrated by a simple empirical example. Rock pocket mice live in the deserts
of the American Southwest. Ancestrally, pocket mice had light-colored coats that blended in
with the region's rocks and sandy soil, keeping the mice hidden from their owl predators.
Most rock pocket mice have light-coloured fur, but every now and again a dominant mutation (let's
call it A) in the MC1R gene causes an increase in the production of dark melanin. These
mutant mice have dark-coloured fur. As the mutation is dominant, both AA and Aa mice are
dark. In the sandy habitat this mutation is disadvantageous: dark mice are quickly spotted
by their predators, which reduces their fitness. However, starting about 1.7 million years ago, a
series of volcanic eruptions resulted in wide trails of black lava weaving right through the
middle of pocket-mouse territory. In these areas the dark mice are well camouflaged and have
a clear fitness advantage over light mice. Assuming no pleiotropic effects of the mutation
(only viability selection by means of predation), large population size, random mating and
non-overlapping generations (none of which is too unrealistic in this system), can we predict how
allele frequencies change every generation? Do we expect the entire population to
'turn dark'? How long should the recessive allele conferring light colouration persist?
Directional selection
Example: Advantageous dominant allele, disadvantageous recessive allele (wAA = wAa > waa)
f(A) = p; f(a) = q = 1-p
Table 3: Fitness table for a dominant genetic architecture.
Genotype    Frequency before selection    Relative fitness    Frequency after selection (not normalized)
AA          p^2                           wAA = 1             p^2 wAA = p^2
Aa          2pq                           wAa = 1             2pq wAa = 2pq
aa          q^2                           waa = 1-s           q^2 waa = q^2(1-s)
Considering the table above we can conceptualize fitness as a factor that transforms genotype
frequencies before selection into genotype frequencies after selection. And if genotype
frequencies change, we can expect allele frequencies to change as well. One final step is needed
to arrive at the change in allele frequency. Because selection has removed some individuals,
the genotype frequencies after selection no longer add up to one:
$$p^2 + 2pq + q^2(1-s) = 1 - q^2 s$$
We therefore need to normalize the genotype frequencies. We do this by dividing each of them by
their sum, so that they again describe proportions among the adults that produce the gametes
for the next generation. In general terms, this sum is given by the frequency of each genotype
weighted by its fitness:
$$\bar{w} = p^2 w_{AA} + 2pq\,w_{Aa} + q^2 w_{aa}$$
$\bar{w}$ is the mean fitness of the population. It measures the average fitness of an
individual in the population relative to the fittest genotype. As a relative measure it cannot be
interpreted as a population parameter indicating whether a population is growing or not.
In our particular case $\bar{w} = 1 - q^2 s$ (see above). Accordingly, the normalized genotype
frequencies of AA, Aa and aa after selection are
$$\frac{p^2}{\bar{w}} + \frac{2pq}{\bar{w}} + \frac{q^2(1-s)}{\bar{w}} = 1$$
We can now calculate the change in allele frequency. Let p' be the frequency of A in the
following generation:
$$p' = \frac{p^2}{\bar{w}} + \frac{1}{2}\,\frac{2pq}{\bar{w}} = \frac{p(p+q)}{\bar{w}} = \frac{p}{\bar{w}} = \frac{p}{1 - q^2 s}$$
The change in frequency then is:
$$\Delta_s p = p' - p = \frac{p}{1 - q^2 s} - p = \frac{p - p + p q^2 s}{1 - q^2 s} = \frac{p q^2 s}{1 - q^2 s}$$
Hence, as long as the A allele is advantageous (1 > s > 0), it becomes more common.
The increase in frequency is slow at first, since a rare A allele occurs almost exclusively in
heterozygotes. Once intermediate frequencies are reached, both heterozygotes and AA homozygotes
contribute, $\Delta_s p$ is large and A increases fast in frequency. Finally, the increase in A
(= the decrease in a, since $\Delta_s q = -\Delta_s p$) slows down again as the a allele gets rarer.
Since the change is proportional to q^2, it is small if q is small. This makes sense: as long as
the a allele is rare, it mostly occurs in heterozygotes, and we have assumed that selection acts
against the deleterious allele in the homozygous state only. These dynamics hold across a large
range of selection coefficients (Figure 2).
Figure 2: Change in allele frequency of the advantageous allele A ($\Delta_s p$) as a function of its frequency p
(fA), shown for different selection coefficients.
Applying the above equation to the mouse example, we see that for low initial
frequencies (which we would expect for a novel mutation) change is very slow at the
beginning, but speeds up as the number of heterozygotes increases. Once a high frequency of
A is reached in the population, the per-generation change slows down significantly, as only few
mice with aa genotypes are 'seen' by selection and most a alleles 'hide' in the heterozygous state.
Nonetheless, after only about 100 generations most of the population has turned dark.
Figure 3: Increase in frequency of the dominant advantageous allele A with relative fitness wAA = 1, wAa = 1
and waa = 0.9 and different starting frequencies of the A allele p = 0.5, 0.1, 0.01, 0.001, 0.0001 (from left to
right).
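The trajectories described above can be reproduced by iterating the recursion p' = p / (1 - q^2 s). The following is a minimal sketch, not the code behind Figure 3: the function names are illustrative, s = 0.1 corresponds to waa = 0.9, and p0 = 0.01 is one of the starting frequencies listed in the caption.

```python
def next_p_dominant(p, s):
    """One generation of selection with wAA = wAa = 1 and waa = 1 - s."""
    q = 1.0 - p
    return p / (1.0 - q * q * s)


def trajectory(p0, s, generations):
    """Frequencies of allele A for generations 0..generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_p_dominant(freqs[-1], s))
    return freqs


freqs = trajectory(p0=0.01, s=0.1, generations=300)

# Per-generation change: small at first, largest at intermediate
# frequencies, small again once the a allele is rare.
steps = [b - a for a, b in zip(freqs, freqs[1:])]

for t in (0, 50, 100, 200, 300):
    dark = 1.0 - (1.0 - freqs[t]) ** 2  # frequency of dark mice (AA + Aa)
    print(f"generation {t}: p = {freqs[t]:.4f}, dark phenotype = {dark:.4f}")
```

Printing the dark phenotype frequency 1 - q^2 next to p makes the point of the text visible: the phenotype becomes common long before the a allele itself disappears.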
It turns out, however, that most genetic architectures cannot simply be grouped into dominant,
additive or recessive. For example, in Drosophila recessive lethal mutations still show about 3%
of their homozygous effect in heterozygotes, and mildly deleterious alleles even 30-40%. We can
take such dominance deviations into account by formulating a general selection model.
General Selection Model
Instead of assuming a discrete genetic architecture we now introduce the dominance
coefficient h. We also change the allele notation to A1 (advantageous allele) and A2
(deleterious allele) instead of A and a, as the latter is often associated with full dominance or
recessivity.
h=0: the deleterious allele (A2) is completely recessive
h=1: the deleterious allele (A2) is completely dominant
Genotype    Frequency before selection    Relative fitness    Frequency after selection (not normalized)    Frequency after selection (normalized)
A1A1        p^2                           w11 = 1             p^2 w11                                       $p^2 w_{11}/\bar{w}$
A1A2        2pq                           w12 = 1-hs          2pq w12                                       $2pq\,w_{12}/\bar{w}$
A2A2        q^2                           w22 = 1-s           q^2 w22                                       $q^2 w_{22}/\bar{w}$
Table 4: Fitness table for the general selection model with dominance coefficient h.
Mean fitness (normalizing factor):
$$\bar{w} = p^2 w_{11} + 2pq\,w_{12} + q^2 w_{22}$$
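Change in allele frequencies:
$$p' = \frac{p^2 w_{11}}{\bar{w}} + \frac{1}{2}\,\frac{2pq\,w_{12}}{\bar{w}} = \frac{p^2 w_{11} + pq\,w_{12}}{\bar{w}}$$
$$\Delta p = p' - p = \frac{p^2 w_{11} + pq\,w_{12}}{\bar{w}} - p = \frac{p^2 w_{11} + pq\,w_{12} - p\bar{w}}{\bar{w}} = \frac{p\left[(p - p^2)\,w_{11} + q(1 - 2p)\,w_{12} - q^2 w_{22}\right]}{\bar{w}}$$
Using p - p^2 = pq and q(1 - 2p) = q^2 - pq, this simplifies to
$$\Delta_s p = \frac{pq\left[p(w_{11} - w_{12}) + q(w_{12} - w_{22})\right]}{\bar{w}}$$
or
$$\Delta_s p = \frac{pqs\left[ph + q(1 - h)\right]}{\bar{w}}$$
where we substitute 1 for $w_{11}$, 1-hs for $w_{12}$ and 1-s for $w_{22}$.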
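The substitution can be written out step by step. The lines below are a worked expansion using only the fitness values of Table 4 (w11 = 1, w12 = 1-hs, w22 = 1-s); the second line gives the corresponding mean fitness.

```latex
\begin{align*}
p\,(w_{11} - w_{12}) + q\,(w_{12} - w_{22})
  &= p\,[1 - (1 - hs)] + q\,[(1 - hs) - (1 - s)] \\
  &= phs + qs(1 - h) \\
  &= s\,[ph + q(1 - h)], \\
\bar{w} &= p^2 + 2pq(1 - hs) + q^2(1 - s) = 1 - 2pqhs - q^2 s .
\end{align*}
```

Setting h = 0 recovers the dominant case derived earlier, $\Delta_s p = pq^2 s/(1 - q^2 s)$.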
This is one of the most important equations in evolutionary biology. We note three points
from it:
• Because w11 ≥ w12 ≥ w22 (with w11 > w22), $\Delta_s p$ is always positive as long as both
alleles are present. Thus the advantageous allele will always increase in frequency (which
makes sense) and will eventually fix.
• The larger the fitness differences, the faster the change in frequency.
• Change is fastest at intermediate allele frequencies (due to the pq term at the
beginning of the equation).
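The three points can be checked numerically. The sketch below compares the closed-form expression pqs[ph + q(1-h)]/w̄ with the change obtained directly from the life cycle (weight genotypes by fitness, normalize, count alleles). Function names and the value s = 0.1 are illustrative assumptions, not taken from the lecture.

```python
def mean_fitness(p, w11, w12, w22):
    """Mean fitness: p^2 w11 + 2pq w12 + q^2 w22."""
    q = 1.0 - p
    return p * p * w11 + 2 * p * q * w12 + q * q * w22


def delta_p_life_cycle(p, w11, w12, w22):
    """p' - p from the normalized genotype frequencies after selection."""
    q = 1.0 - p
    w_bar = mean_fitness(p, w11, w12, w22)
    p_next = (p * p * w11 + p * q * w12) / w_bar
    return p_next - p


def delta_p_formula(p, s, h):
    """Closed form with w11 = 1, w12 = 1 - hs, w22 = 1 - s."""
    q = 1.0 - p
    w_bar = mean_fitness(p, 1.0, 1.0 - h * s, 1.0 - s)
    return p * q * s * (p * h + q * (1.0 - h)) / w_bar


s = 0.1  # hypothetical selection coefficient
for h in (0.0, 0.5, 1.0):
    for p in (0.01, 0.5, 0.99):
        direct = delta_p_life_cycle(p, 1.0, 1.0 - h * s, 1.0 - s)
        closed = delta_p_formula(p, s, h)
        print(f"h={h:.1f} p={p:.2f} direct={direct:.6f} closed={closed:.6f}")
```

Both routes should agree up to floating-point error, and the change should be positive for every combination with 0 < p < 1.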
Selection maintaining genetic variation
Selection does not necessarily deplete genetic variation, as we have just shown it to do under
directional selection. Where the fitness of the heterozygote is greater than that of
both homozygotes, such heterozygote advantage (also termed overdominance) will
maintain genetic polymorphism at the locus. This situation can no longer be modelled with a
single selection coefficient.
Genotype    Frequency before selection    Relative fitness
A1A1        p^2                           w11 = 1-s
A1A2        2pq                           w12 = 1
A2A2        q^2                           w22 = 1-t
Substituting the fitness values for each genotype into the general equation from above
yields:
$$\Delta_s p = \frac{pq\left[p(w_{11} - w_{12}) + q(w_{12} - w_{22})\right]}{\bar{w}} = \frac{pq(qt - ps)}{\bar{w}}$$
If we set $\Delta_s p = 0$ we obtain the equilibrium allele frequencies from qt - ps = 0, or (1-p)t = ps.
Solving for p we get
$$\hat{p} = \frac{t}{s+t} \quad \text{and} \quad \hat{q} = \frac{s}{s+t}$$
These are the equilibrium frequencies at which selection no longer changes allele
frequencies. This polymorphic equilibrium is stable; if allele frequencies drift away (see
below) selection will push them back. Overdominance thus does not remove an allele from a
population; it maintains both alleles at a ratio given by the selection coefficients against
the two homozygotes. Note that the proportion of heterozygotes can never reach 1, since random
mating will inevitably create new homozygotes each generation.
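The stability of the equilibrium can be illustrated by iterating the recursion from different starting frequencies. This is a sketch under assumed values: s = 0.1 and t = 0.3 are hypothetical, and the function names are illustrative.

```python
def next_p_overdominance(p, s, t):
    """One generation with w11 = 1 - s, w12 = 1, w22 = 1 - t."""
    q = 1.0 - p
    w_bar = p * p * (1.0 - s) + 2 * p * q + q * q * (1.0 - t)
    return p + p * q * (q * t - p * s) / w_bar


def iterate(p0, s, t, generations=2000):
    """Frequency of A1 after the given number of generations."""
    p = p0
    for _ in range(generations):
        p = next_p_overdominance(p, s, t)
    return p


s, t = 0.1, 0.3          # hypothetical selection coefficients
p_hat = t / (s + t)      # predicted equilibrium frequency of A1

for p0 in (0.01, 0.5, 0.99):
    print(f"start p = {p0:.2f} -> p = {iterate(p0, s, t):.6f} (predicted {p_hat:.6f})")
```

Whether A1 starts rare or common, the frequency should approach t/(s+t), which is what a stable polymorphic equilibrium means.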
Example: A classic example in humans is the beta-hemoglobin locus in some African and
Mediterranean populations. One allele at the locus, which differs by one amino acid
substitution from normal haemoglobin (A), encodes sickle-cell haemoglobin (S). At low
oxygen concentrations sickle-cell haemoglobin forms elongate crystals and is less efficient in
transporting oxygen. In the homozygous state, SS, the sickle allele causes distortion of the red
blood cells, leading to severe anemia and damage to blood capillaries; SS individuals mostly die
before they reach reproductive age. Heterozygotes AS only suffer slight anemia. Homozygotes AA
transport oxygen best and show no signs of anemia. At first sight this sounds like a clear case
of directional selection, where the A allele would be favoured. However, in regions exposed
to malaria, selection pressures change. Here, normal AA homozygotes suffer higher mortality
from malaria than AS heterozygotes, in whom red blood cells are broken down faster. This
provides worse conditions for the malaria parasite Plasmodium falciparum to develop in the
red blood cells. Genotype frequencies in a sample of 12387 individuals from Nigeria have
been estimated as follows:
SS: 29 adults, 0.234 %
AS: 2993 adults, 24.162 %
AA: 9365 adults, 75.603 %
Given this basic information, can we estimate genotype fitness and predict the frequency
of the mutant allele at equilibrium?
We obtain the frequency q of the S allele as q = (29 + 2993/2)/12387 = 0.123. The expected zygote
frequencies (before selection, assuming Hardy-Weinberg proportions) are
accordingly:
SS: 1.512%
AS: 21.574 %
AA: 76.913%
Dividing the adult frequencies by the zygote frequencies we obtain the fitness measures for
SS: 0.155
AS: 1.120
AA: 0.983
Normalizing the maximum fitness (that of AS) to 1, this gives estimates of
wSS = 0.138
wAS = 1
wAA = 0.878
Hence, t = 0.862 and s = 0.122. At equilibrium, we would therefore expect the S allele to be at
$\hat{q}$ = s/(s+t) = 0.124, very close to the observed frequency of 0.123.
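The estimation steps can be retraced in code. The sketch below uses the adult counts given in the text; variable names are illustrative, and intermediate values are not rounded, so the third decimal can differ slightly from the rounded figures above.

```python
# Adult genotype counts from the Nigerian sample quoted in the text.
counts = {"SS": 29, "AS": 2993, "AA": 9365}
n = sum(counts.values())

# Allele frequencies among adults (q: S allele, p: A allele).
q = (counts["SS"] + counts["AS"] / 2) / n
p = 1.0 - q

# Observed adult frequencies and expected Hardy-Weinberg zygote frequencies.
adult = {g: c / n for g, c in counts.items()}
zygote = {"SS": q * q, "AS": 2 * p * q, "AA": p * p}

# Viability estimates: adult / zygote, then scaled so that AS has fitness 1.
ratio = {g: adult[g] / zygote[g] for g in counts}
w = {g: r / ratio["AS"] for g, r in ratio.items()}

t = 1.0 - w["SS"]   # selection coefficient against SS
s = 1.0 - w["AA"]   # selection coefficient against AA
q_hat = s / (s + t)

print(f"q = {q:.4f}, wSS = {w['SS']:.3f}, wAA = {w['AA']:.3f}")
print(f"s = {s:.3f}, t = {t:.3f}, predicted equilibrium q = {q_hat:.4f}")
```

One caveat worth keeping in mind: because the zygote frequencies are computed from the adult allele frequency, the predicted equilibrium coincides with the observed q by construction when nothing is rounded. The small gap in the text (0.124 vs. 0.123) comes from rounding intermediate values.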
Selection-Drift equilibrium
We have seen that directional selection will always increase the frequency of a beneficial
allele. This property, however, only holds if we assume infinitely large population size. In
finite populations, we need to factor in the effects of random sampling from one generation to
the next. When an advantageous allele is in low frequency, it is easy to conceive that it still
can get lost, because the few individuals who carry it happen to not survive or reproduce.
Therefore, if we want to understand the role of natural selection in real populations, we need
to factor in population size. Remember that the probability of fixation for any neutrally
evolving allele A is given by its frequency p. For a novel mutation the initial frequency is
1/2N, and its probability of fixation is accordingly 1/2N. It is intuitive that selection favoring a new
mutation will increase the probability of fixation, but not as much as one would think.
Provided that N is large and s is small in absolute value (which allows the use of
diffusion theory), Kimura showed that for any allele A the probability of fixation is
$$Pr_{fix}(A \mid s, N_e) = \frac{1 - e^{-2cN_e s p}}{1 - e^{-2cN_e s}}$$
where Ne is the variance effective population size, c indicates the ploidy level (1=haploid,
2=diploid), s is the selection coefficient and p the frequency of allele A. This result allows
incorporating both deleterious mutations (s < 0) as well as advantageous mutations (s > 0).
Considering specifically a diploid organism subject to additive selection (allele A is co-dominant),
the expected probability of fixation for a novel mutant allele (with frequency
1/2N, and assuming N = Ne) simplifies to
$$Pr_{fix}(A \mid s, N_e) = \frac{1 - e^{-2s}}{1 - e^{-4N_e s}}$$
Figure 4: Graph illustrating the above equation, i.e. the fixation probability Pfix(A) (y-axis) as a function of the product of
selection and population size, 2Nes (x-axis, shown from -4 to 4). The shaded area (nearly neutral zone) corresponds to
-1 < 2Nes < 1, where the probability of fixation is dominated by genetic drift and close to that of a neutral allele (1/2N).
We often distinguish three ranges of selection intensity to illustrate properties of the fixation
probability for a novel mutation.
Neutrality and near-neutrality
2Nes = 0: In the case of no selection (neutrality), we already know that the fixation probability is
1/2N.
-1 < 2Nes < 1: In this range the dynamics are mainly determined by the effects of
genetic drift. The probability of fixation for the A allele is therefore close to that of a neutral
allele, $Pr_{fix}(A) \approx 1/2N_e$. We therefore characterize this range of the population-scaled
selection coefficient (2Nes) as nearly neutral. Note that the actual threshold at which
fitness differences matter depends critically on the effective population size. In a population of
Ne = 50, say, it requires large selection coefficients (s = 1/2Ne = 0.01) to overcome drift
dynamics. In a large population of, say, Ne = 50000, very small fitness differences (s = 0.00001)
will affect the probability of fixation. In animal conservation, we often speak of adaptive
genetic variation, i.e. genetic variation that may potentially help a population to adapt to
rapid (human-imposed) environmental change. From the above equation, it is apparent that
potentially advantageous alleles stand a much larger chance of getting lost by genetic drift in
small populations.
Strong selection
2Nes > 1: In the case of strongly advantageous alleles, $Pr_{fix}(A) \approx 2s$. This remarkable result
shows that selection is not omnipotent. Counter-intuitively, in a population of Ne = 100 an allele
with a selection coefficient of s = 0.1 is strongly advantageous, but is only expected to fix in about 20
% of the cases and to be lost in about 80 % of the cases.
2Nes < -1: In the case of strongly deleterious alleles the fixation probability is very small and
decreases rapidly towards 0 ($Pr_{fix}(A) \approx 0$) as s becomes more negative. However, a disadvantageous
mutant with an additive effect and a selection coefficient of -0.001 has a 0.4 % chance of
becoming fixed in a population of Ne = 100. Again, population size makes all the difference. In
a population of Ne = 1000 the fixation probability diminishes to 0.004 %. This explains why small
human communities with a long history of isolation and inbreeding (small Ne) show increased
rates of heritable diseases. It is also one of the reasons why we are worried when a species
declines significantly in numbers and its populations become fragmented and isolated. It simply
requires a certain population size to purge deleterious mutations.
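The numerical examples in this section can be recomputed from the fixation probability formula. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and `new_mutation` assumes a diploid population with N = Ne, so that the starting frequency is 1/(2Ne).

```python
import math


def fixation_probability(s, n_e, p, ploidy=2):
    """Diffusion approximation: (1 - exp(-2 c Ne s p)) / (1 - exp(-2 c Ne s))."""
    if s == 0:
        return p  # neutral case: fixation probability equals the frequency
    a = 2 * ploidy * n_e * s
    # expm1(x) = exp(x) - 1; the two minus signs cancel in the ratio.
    return math.expm1(-a * p) / math.expm1(-a)


def new_mutation(s, n_e):
    """Single new copy in a diploid population, assuming N = Ne."""
    return fixation_probability(s, n_e, p=1.0 / (2 * n_e))


for s, n_e in [(0.0, 100), (0.1, 100), (-0.001, 100), (-0.001, 1000)]:
    print(f"s = {s:+.3f}, Ne = {n_e}: Pfix = {new_mutation(s, n_e):.6f}")
```

For s = 0.1 and Ne = 100 the formula gives roughly 0.18, slightly below the approximation 2s = 0.2; for s = -0.001 it gives about 0.4 % at Ne = 100 and about 0.004 % at Ne = 1000, in line with the figures quoted above.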
Literature: (Barton et al. 2007; Futuyma 2013; Nielsen and Slatkin 2013)
Barton NH, Briggs DEG, Eisen JA, Goldstein DB, Patel NH. 2007. Evolution. 1st ed. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press.
Futuyma DJ. 2013. Evolution. 3rd ed. Sinauer Associates.
Kimura M. 1983. The neutral theory of molecular evolution. Cambridge University Press.
Nielsen R, Slatkin M. 2013. An Introduction to Population Genetics: Theory and Applications. Sunderland, Mass.: Macmillan Education.
Lecture WS: Evolutionary Genetics Part I - Jochen B. W. Wolf