Lecture VI: Neutral Theory
Effective population size
Given that the size of most natural populations (in terms of simple body counts) is large, one
may question the role of drift. However, as we will see now, the relevant ‘effective’
population size is often much smaller than the actual number of individuals, or census
population size Nc.
We have derived our results for drift for the Wright-Fisher model under a number of
restrictive – and in general unrealistic – conditions. Most importantly, we have assumed that
population size is constant, that mating is random and there are no separate sexes. In such an
‘ideal population’ genetic drift will proceed at a rate given directly by the census population
size Nc. However, in natural populations the variance in reproductive success is generally
much larger than assumed by Wright-Fisher sampling (binomial or Poisson as N gets large).
Examples of forces increasing variance in reproductive output are:
• sex ratio differences (sexual selection)
• fluctuations in population size
• overlapping generations
• population structure
All of these forces increase variance in reproductive success and thereby reduce the number
of individuals effectively contributing to the next generation. It is thus intuitive that the
‘effective’ population size will be smaller than the census size. In humans, who at present
roam the planet in their billions, the effective population size is estimated to be around
10,000! But, if we violate the assumptions of the Wright-Fisher model, is this abstract
mathematical model still valid? As it turns out, the model can nevertheless be applied if we
replace the census population size Nc by an effective population size Ne. Ne then reflects the
size of an ideal population that experiences genetic drift at the rate of the population in
question. Hence, if we are able to transform Nc to Ne in some meaningful way, we can still
quantify the rate at which genetic diversity gets lost through genetic drift using the same
mathematical model. Note that we need some readout to measure the effects
of genetic drift and calibrate the effective population size accordingly. This can be the loss of
heterozygosity in a population, the degree of inbreeding, genetic variance, the efficiency of
selection, or the rate of coalescence. The effective population size is named after
the quantity used to define it, e.g. the inbreeding effective size or the coalescence effective size.
Below, we derive Ne from the decrease in heterozygosity, starting with an example.
Fluctuations in population size
One obvious factor bearing on the effective population size is change in real population sizes
through time. The Wright-Fisher model assumes constant size, so can we approximate
fluctuation in Nc in terms of an idealized population of constant size?
First, it is important to understand that the lowest population numbers determine, to a large
extent, the overall effective population size: all future offspring will be descendants of
these few survivors. The effect of variation in population size can be shown by examining the
heterozygosity over time. Recall that, for constant N, the heterozygosity after t generations is
$$H_t = \left(1 - \frac{1}{2N}\right)^{t} H_0$$
Lecture WS
Evolutionary Genetics Part I - Jochen B. W. Wolf
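If N varies from generation to generation, then
$$\frac{H_t}{H_0} = \left(1 - \frac{1}{2N_0}\right)\left(1 - \frac{1}{2N_1}\right)\cdots\left(1 - \frac{1}{2N_{t-1}}\right) = \prod_{i=0}^{t-1}\left(1 - \frac{1}{2N_i}\right)$$
The overall effective population size is the one that causes the same reduction in
heterozygosity as the varying Ni values and thus
$$\prod_{i=0}^{t-1}\left(1 - \frac{1}{2N_i}\right) = \left(1 - \frac{1}{2N_e}\right)^{t}$$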
Solving for Ne, and neglecting terms of order 1/Ni², we get the harmonic mean of the Ni:
$$N_e \approx \frac{t}{\sum_{i=0}^{t-1} \frac{1}{N_i}}$$
where t is the number of discrete generations of fluctuating size. To illustrate the importance
of a bottleneck, imagine an insect population of size N that increases 10-fold in each of two summer
generations and returns to its original size in winter (due to winter mortality). Population sizes
are hence N, 10N, 100N. The mean census number Nc across all three generations is 37 N.
However, the effective population size Ne that matters and appropriately describes the effects
of genetic drift (such as reduction in heterozygosity, increase in variance in allele frequencies)
is 3/(1+1/10+1/100) ≈ 2.7 N, more than an order of magnitude less. In this case Ne/Nc ≈ 0.073,
i.e. Ne is only about 7.3% of the census size.
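The insect example can be checked numerically. The following is a minimal sketch in Python, assuming a hypothetical base size of N = 1000; the function names are illustrative and not part of the lecture. It compares the harmonic-mean approximation with the Ne obtained directly from the product of the per-generation factors (1 - 1/(2Ni)).

```python
from math import prod


def harmonic_mean_ne(sizes):
    """Approximate N_e as the harmonic mean of the per-generation sizes N_i."""
    return len(sizes) / sum(1.0 / n for n in sizes)


def exact_ne(sizes):
    """N_e that reproduces the exact loss of heterozygosity over all generations."""
    t = len(sizes)
    # Fraction of heterozygosity retained after t generations, H_t / H_0
    retained = prod(1.0 - 1.0 / (2.0 * n) for n in sizes)
    # Solve (1 - 1/(2 N_e))^t = retained for N_e
    return 1.0 / (2.0 * (1.0 - retained ** (1.0 / t)))


# Hypothetical insect population with N = 1000: sizes N, 10N, 100N
sizes = [1_000, 10_000, 100_000]
census_mean = sum(sizes) / len(sizes)

print("mean census size:", census_mean)
print("harmonic-mean N_e:", harmonic_mean_ne(sizes))
print("exact N_e:", exact_ne(sizes))
```

For population sizes of this order the two values of Ne agree closely, which is why the harmonic mean is usually taken as a sufficient approximation.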
Sex ratio differences
Ne is similarly reduced compared to Nc if males and females contribute highly unequally
to the next generation. Imagine a zoo population of primates with 20 males and
20 females. Due to the dominance hierarchy only one of the males actually breeds. What is the
relevant population size that informs us about the strength of drift in this system? 40? Or
21?
It can be shown that for this situation
$$N_e \approx \frac{4 N_m N_f}{N_m + N_f}$$
where Nf is the number of breeding females (20 in our case) and Nm is the number of
breeding males (only 1 in our example). We thus obtain
$$N_e \approx \frac{4 \cdot 20 \cdot 1}{20 + 1} = \frac{80}{21} \approx 4$$
The effective population size is thus an order of magnitude smaller than the census size due to the
fact that all offspring share the same father, and genetic variation will rapidly disappear. In
the case of an equal sex ratio, Nf = Nm = N/2, however, we obtain Ne ≈ Nc; the Wright-Fisher
model thus applies with the original census size Nc.
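The sex-ratio formula is easy to evaluate for other breeding systems. A minimal Python sketch (the function name `ne_sex_ratio` is an assumption made for illustration) reproduces the zoo example and the equal-sex-ratio case:

```python
def ne_sex_ratio(n_males, n_females):
    """Effective size for unequal numbers of breeding males and females."""
    return 4.0 * n_males * n_females / (n_males + n_females)


# Zoo example: 1 breeding male, 20 breeding females
print(ne_sex_ratio(1, 20))

# Equal sex ratio: 20 breeding males, 20 breeding females
print(ne_sex_ratio(20, 20))
```

Note that Ne can never exceed four times the number of the rarer sex, no matter how many individuals of the other sex breed.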
The Neutral Theory of Evolution: Genetic Drift + Mutation
In the introductory lecture we touched upon an important debate of the 1930s to 1960s.
Proponents of the classical school and the balancing school differed strongly in their view on
the extent of genetic diversity expected in natural populations and the responsible mechanism.
Both focused on morphological or physiological characters with a clear role for selection.
First data from allozyme electrophoresis (Lewontin and Hubby 1966) suggested that selection
alone could not be responsible for maintaining the high observed levels of polymorphism. At
around the same time in 1968 Motoo Kimura studied long-term protein evolution. His
observation of similar evolutionary rates across lineages prompted him to develop the
Neutral Theory of Evolution stating that most changes at the molecular level resulted from a
combination of mutation and genetic drift, without the action of selection. In his theory
selection is appreciated only in the form of strong purifying selection efficiently removing
highly deleterious mutations such that they do not contribute to segregating genetic
variation. Positively selected mutations play only a minor role. They are assumed to rapidly
reach fixation, and hence do not contribute to segregating variation. The Neutral Theory of
Evolution and its extension in the Nearly Neutral Theory of Evolution introduced by
Tomoko Ohta (relaxing the assumptions on strong purifying and positive selection) are
widely accepted as models appropriately describing sequence evolution across large parts of
the genome.
So how can we predict the level of genetic variation using Neutral Theory? We have seen that
mutations introduce genetic variation, and that genetic drift erodes it in populations of finite
size. The Neutral Theory combines both forces into one framework making predictions on the
level of genetic diversity we expect at equilibrium.
Heterozygosity H assumes a predictable equilibrium value
We have already derived that genetic drift reduces the heterozygosity within a population
each generation by ΔdriftH = -H/(2Ne). Mutation will work against that reduction by increasing
genetic variation. Therefore, at some point an equilibrium will be reached where the decrease
in H due to drift is balanced by the increase due to mutation (mutation-drift equilibrium).
To find the point of equilibrium, we first derive the change of H under mutation alone. For an
infinite population (no drift at this point), the heterozygosity in the new generation before
mutation equals the heterozygosity in the parent population, H' = H (Hardy-Weinberg
equilibrium). We now assume every new mutation results in a new allele not present in the
population before (see infinite alleles model). This is realistic if we distinguish alleles of a
gene on the level of allelic types (haplotypes, protein electrophoresis) without keeping
track of how these types relate to each other. Then every pair of genes with unequal alleles
before mutation will also have unequal alleles after mutation. Every pair of genes with equal
alleles before mutation will become heterozygous if either of the genes mutates. Summing
over these two cases and ignoring terms proportional to u² (both genes mutate), we obtain:
$$H' = H + 2u(1-H), \quad \text{thus} \quad \Delta_{mut}H = 2u(1-H)$$
The total change of heterozygosity is ΔH = ΔdriftH + ΔmutH. The equilibrium is obtained for
ΔH = 0:
$$\begin{aligned}
2u(1-H) - \frac{H}{2N_e} &= 0 \\
2u - 2uH &= \frac{H}{2N_e} \\
4N_e u - 4N_e u H &= H \\
4N_e u &= H + 4N_e u H \\
4N_e u &= H(1 + 4N_e u)
\end{aligned}$$
Writing θ for 4Neu we obtain
$$H = \frac{\theta}{1+\theta}$$
(The quantity 4Neu is central in population genetics and is generally denoted by θ). As
expected, the equilibrium heterozygosity increases with increasing mutation rate and
increasing population size (i.e. reduced drift). Expressed in terms of homozygosity G (=1-H)
we obtain:
$$G = \frac{1}{1+\theta}$$
If we sampled an individual from a randomly mating population, we would expect the
proportion of loci at which the individual is heterozygous to be θ/(1+θ). In terms of DNA
sequence under the infinite sites model, we would interpret H as the probability that two
haplotypes are non-identical.
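The approach to mutation-drift equilibrium can be illustrated by iterating the per-generation change ΔH = 2u(1-H) - H/(2Ne) directly. The sketch below is in Python; the parameter values (Ne = 1000, u = 1e-4) are hypothetical and chosen only so that convergence is fast, and the function names are illustrative.

```python
def equilibrium_h(ne, u):
    """Equilibrium heterozygosity theta / (1 + theta), with theta = 4 * Ne * u."""
    theta = 4.0 * ne * u
    return theta / (1.0 + theta)


def iterate_h(h0, ne, u, generations):
    """Apply the combined change from mutation and drift once per generation."""
    h = h0
    for _ in range(generations):
        delta_mut = 2.0 * u * (1.0 - h)   # gain of heterozygosity by mutation
        delta_drift = -h / (2.0 * ne)     # loss of heterozygosity by drift
        h += delta_mut + delta_drift
    return h


ne, u = 1_000, 1e-4  # hypothetical values, theta = 0.4
print("predicted equilibrium:", equilibrium_h(ne, u))
print("starting from H = 0:  ", iterate_h(0.0, ne, u, 20_000))
print("starting from H = 1:  ", iterate_h(1.0, ne, u, 20_000))
```

Both starting points end up at the same value, which shows that the equilibrium does not depend on the initial level of variation.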
The expected number of mutations (changes in the DNA sequence) separating two randomly
sampled sequences is given by
θ = 4Neu
if we assume that the infinite sites model holds (each mutation creates a new variable site).
We can think of θ as the population mutation pressure determining 1) the number of
differences we expect between two randomly sampled DNA sequences and 2) the probability
that the sequences are not identical, i.e. that an individual carrying them is heterozygous (Fig. 1). θ depends on
both the mutation rate and the population size. This has interesting implications, e.g. for sex
chromosomes such as the X, which has only ¾ of the Ne of autosomes; sex chromosomes also
differ in mutation rate (see male-biased mutation).
Figure 1 (plot of equilibrium heterozygosity Ĥ against θ): The equilibrium level of heterozygosity increases as a function of the product of the
neutral mutation rate and effective population size.
The Neutral Theory also makes a clear prediction about the degree of divergence expected
between two populations or species that are not connected by migration, by establishing the
expectation for the rate of neutral substitution (i.e. alleles that no longer segregate, but are
fixed).
The probability of fixation of a new mutation is 1/2N. We have previously seen that under
Wright-Fisher sampling the probability of fixation for any neutral allele is equal to its frequency in
the population. A novel mutation always enters a population at frequency 1/2N; its fixation
probability is thus 1/2N. We have also briefly discussed that at some point all gene copies of
a population will have descended from a single common ancestor (coalescent). Assume now
that a mutation happened in this early ancestral generation. It will spread to
fixation (and can thus be observed) only if it occurred in the common ancestor. Otherwise
it will be lost. If there are 2N gene copies in the ancestral generation, the fixation probability
is thus 1/2N.
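The fixation probability of 1/2N can also be checked by simulation. The following Python sketch follows a single new neutral mutation under Wright-Fisher sampling until it is lost or fixed. The function name, the small population size (N = 5, i.e. 10 gene copies) and the number of replicates are assumptions chosen to keep the run short; the estimate is stochastic and only approximates 1/2N.

```python
import random


def fixation_fraction(n_diploid, replicates, seed=1):
    """Fraction of new neutral mutations that reach fixation under Wright-Fisher sampling."""
    rng = random.Random(seed)
    copies = 2 * n_diploid  # number of gene copies in the population
    fixed = 0
    for _ in range(replicates):
        count = 1  # a single new mutant copy, frequency 1/(2N)
        while 0 < count < copies:
            p = count / copies
            # Binomial sampling: each of the 2N copies in the next generation
            # descends from a mutant copy with probability p
            count = sum(1 for _ in range(copies) if rng.random() < p)
        if count == copies:
            fixed += 1
    return fixed / replicates


# Hypothetical small population: N = 5, so the expectation is 1/(2N) = 0.1
print("simulated fixation fraction:", fixation_fraction(5, 2_000))
print("expected 1/(2N):            ", 1 / (2 * 5))
```

Increasing the number of replicates narrows the scatter of the simulated fraction around 1/2N.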
The rate of neutral substitution is equal to the mutation rate. Next, we want to calculate
the neutral rate of substitution k, defined as the number of mutations that arise in a
population per generation times the probability that each of those mutations is fixed. If the mutation rate per
site and generation is u, 2Nu mutations will arise every generation at the site. We thus have
$$k = 2Nu \cdot \frac{1}{2N} = u$$
The rate of substitution is just equal to the rate of new mutations, independent of the
population size. This is one of the most famous and most useful results from population
genetics. It implies that molecular evolution at a neutral site occurs at an approximately
constant rate per unit time. It is therefore said to show a molecular clock. As a consequence,
the number of substitutions between two species can be used to infer the time since these
species split from their common ancestor. Importantly, the molecular clock is independent of
fluctuations in population size. It assumes, however, that mutation rates are constant over
time.
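A short numerical sketch in Python makes the independence from population size explicit and shows how a molecular clock might be used for dating. The function names and parameter values are hypothetical. The dating step additionally assumes that substitutions accumulate along both lineages since the split, so that the expected divergence per site is d = 2ut, with t measured in generations.

```python
def substitution_rate(n_diploid, u):
    """Neutral substitution rate k = (2N u) * (1 / (2N)) per site and generation."""
    new_mutations = 2 * n_diploid * u      # mutations arising each generation
    p_fixation = 1.0 / (2 * n_diploid)     # fixation probability of each one
    return new_mutations * p_fixation


def divergence_time(d, u):
    """Generations since two lineages split, assuming d = 2 * u * t."""
    return d / (2.0 * u)


u = 1e-8  # hypothetical mutation rate per site and generation
for n in (100, 10_000, 1_000_000):
    print(n, substitution_rate(n, u))

# Hypothetical divergence of 1% between two species
print(divergence_time(0.01, u), "generations")
```

The same k results for every N, so the estimated split time depends only on the observed divergence and the assumed mutation rate.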
Literature: (Barton et al. 2007; Futuyma 2013; Nielsen and Slatkin 2013)
Barton NH, Briggs DEG, Eisen JA, Goldstein DB, Patel NH. 2007. Evolution. 1st edition.
Cold Spring Harbor, N.Y: Cold Spring Harbor Laboratory Press
Futuyma DJ. 2013. Evolution. 3rd ed. Sinauer Associates
Lewontin RC, Hubby JL. 1966. A Molecular Approach to the Study of Genic Heterozygosity
in Natural Populations. II. Amount of Variation and Degree of Heterozygosity in
Natural Populations of Drosophila pseudoobscura. Genetics 54:595–609.
Nielsen R, Slatkin M. 2013. An Introduction to Population Genetics: Theory and
Applications. Sunderland, Mass: Macmillan Education