Download The neutral theory of molecular

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Genome evolution wikipedia , lookup

Adaptive evolution in the human genome wikipedia , lookup

Oncogenomics wikipedia , lookup

Human genetic variation wikipedia , lookup

Viral phylodynamics wikipedia , lookup

Inbreeding wikipedia , lookup

Polymorphism (biology) wikipedia , lookup

Hardy–Weinberg principle wikipedia , lookup

Dominance (genetics) wikipedia , lookup

Frameshift mutation wikipedia , lookup

Epistasis wikipedia , lookup

Koinophilia wikipedia , lookup

Point mutation wikipedia , lookup

Mutation wikipedia , lookup

Genetic drift wikipedia , lookup

Microevolution wikipedia , lookup

Population genetics wikipedia , lookup

Transcript
The neutral theory of molecular
In 1968 Motoo Kimura introduced the concept and discipline
´Neutral theory of molecular evolution´ ,
Nature 217:624-626, Evolutionary rate at the molecular level.
Kimura (1924-1994) was a Japanese orchid breeder. He became
interested in stochastic processes and is known among mathematicians
due to innovative use of diffusion equations. In biology he is the father of
theoretical molecular evolution.
This set of slides will be replaced by a better edited version
Seq evol / Neutral theory / SVarvio
1
Theory of substitution dynamics in short
Mutation generates new versions of sequences, these are good,
bad or neutral for the species, i.e. increase or decrease the fitness, or
have no effect on fitness = neutral.
Sequences which evolve neutrally: divergence = accumulation of
neutral mutations, estimate species ´calendar time´ divergence. The
classical molecular clock hypothesis may hold.
Substitution is a process whereby a mutant allele replaces the
previous allele
In this process a mutant allele arises as a single copy and becomes
fixed after a certain number of generations.
Not all mutants, however, reach fixation – in fact, the majority of
them are lost after a few generations, by chance (random drift) even if
they are good.
Seq evol / Neutral theory / SVarvio
2
Fixation probability, fixation time, rate of gene
substitution, the number of fixations of new
alleles (new sequence versions) per unit time
The probability that a particular allele will become fixed in a population depends
on its frequency, its fitness advantage or disadvantage, i.e. (Darwinian) selection
increasing or decreasing its frequency, the effective population size Ne which affects
the sampling process (small population => more changes by mere chance, ´random
drift´, larger population => selection (fitness differences) outcompetes chance
effects.
In the field `molecular evolution and populations genetics´, long ago….when the
structure of DNA was found, when the genetic code was found, and when something
was known about protein sequences (i.e. they are more similar between closely
related species, and similarity decreases with decreasing species biological
relationship), Kimura (1962) showed, considering a selection scheme (s is the
selection advantage)
genotypes
fitness
A 1A 1 A 1A 2 A 2A 2
1
1+s
1+2s
that the probability of fixation of A2 is P = (1 – e-4Nesq) / (1 – e-Nes)
where q is the initial frequency of allele A2
Seq evol / Neutral theory / SVarvio
3
0.
Since e-x 1 – x for small values of x, the equation reduces to P
q as s approaches
Thus, for a neutral allele, the fixation probability equals its frequency in the
population. For example, a neutral allele with a frequency of 40% will become fixed in
40% of the cases and will be lost in 60% of the cases, fixation occurs by random drift,
which facilitates neither allele.
A new mutant arising as a single copy in a population of size N individuals has an
initial frequency of 1 / (2N) (diploid individuals). The probability of fixation of a
particular mutant allele is thus obtained by replacing q with 1 / (2N) in the previous
equa on. When s 0,
P = (1 – e -(2Nes/N) ) / (1 – e -4Ns)
For a neutral mutation, s = 0, the equation becomes P = 1 / 2N
If the population size is equal to the effective population size
P = (1 – e-2s) / (1 – e-4Ns)
If the absolute value of s is small
P = 2s / (1 – e-4Ns)
For positive values of s and large values of N, P 2s
Seq evol / Neutral theory / SVarvio
4
Thus: if an advantageous mutation arises in a large population and its selective
advantage over the rest of the alleles is, say up to 5%, the probability of its fixation is
approximately twice its selective advantage. For example, if a new mutation with s =
0.01 arises in a population, the probability of its eventual fixation is 2%.
An example: A new mutant arises in a population of 1000 individuals.
What is
the probability that this allele will become fixed in the population if
(a) it is
neutral, (b) it confers a selective advantage of 0.01, or (c) it has a selective
disadvantage of 0.001?
For simplicity, we assume that Ne = N (=1000).
For the neutral case, the probability P = 1 / 2N = 0.05%.
From equations (previous page) we obtain the probabilities 2% and 0.004% for the
advantageous and deleterious mutations, respectively.
The message: Advantageous mutations do not always become fixed in the
population. In fact, 98% of all the mutations with the selective advantage of 0.01 will
be lost by chance. Deleterious mutations have a finite probability of becoming fixed
in a population, albeit a small one. The fact that a deleterious allele may become
fixed in a population at the expense of better alleles illustrates the importance of
chance events in determining the fate of mutations during evolution.
Seq evol / Neutral theory / SVarvio
5
The dynamics of gene substitution for advantageous (top) and neutral (bottom)
mutations. Advantageous mutations are either rapidly lost or rapidly fixed in the
population. In contrast, for neutral alleles the frequency changes are slow, and the
fixation time is much longer than that for advantageous mutants. X axis is time ->, y axis
is allele frequency from 0 to 1 (=fixation)
Seq evol / Neutral theory / SVarvio
6
Considering larger population, the chance effects become smaller. For example, if N =
10 000, then the fixation probabilities become 0.005%, 2% and ~ 10-20.
Thus, while the fixation probability for the advantageous mutations remains
approximately the same, that for the neutral mutation becomes smaller, and that for the
deleterious allele becomes indistinguishable from zero.
The fixation time. Consider fixation and loss separately and restrict consideration to
those mutants that will eventually become fixed in the population. This is called the
conditional fixation time.
In the case of a new mutation whose initial frequency is q = 1/ (2N), the mean
conditional fixation time (theory by Kimura in 1960´s) for a neutral mutation is
approximated by
t = 4N generations
For a mutation with a selective advantage of s
t = (2/s) ln(2N) generations
Seq evol / Neutral theory / SVarvio
7
Let´s assume a species which has an effective population size of about 106
and a mean generation time 2 years.
Under these conditions, it will take a neutral mutation, on average 8
million years to become fixed in the population.
A mutation with a selective advantage of 1% will become fixed in the
population in 5800 years.
The conditional fixation time for a deleterious allele with a selective
disadvantage –s is the same as that for an advantageous allele with a
selective advantage s (theory by Kimura in the 1970´s).
This is intuitively understandable given the high probability of loss for a
deleterious allele. That is, for a deleterious allele to become fixed in a
population, fixation must occur very quickly.
Seq evol / Neutral theory / SVarvio
8
The rate of gene substitution. Definition: the number of mutants reaching
fixation per unit time. If neutral mutations occur at a rate of u per gene per
generation, then the number of mutants arising at gene locus in a population
of size N is 2Nu per generation.
Since the probability of fixation for each of these mutations is 1 / (2N), the
rate of substitutiton of neutral alleles is obtained by multiplying the total
number of mutations by the probability of their fixation
K = 2Nu (1/2N) = u
The rate of substitution is thus equal to the rate of mutation.
Intuitively: in a large population the number of mutations arising every
generation is high, but the fixation probability is low. In a small population the
number of mutations arising every generation in low, but the fixation
probability of each mutation is high. As a consequence, the rate of substitution
for neutral mutations is independent of population size.
For advantageous mutations the rate of substitution can also be obtained by
multiplying the rate of mutation by the probability of fixation for advantageous
alleles as given above (P 2s). For selection with s > 0
K = 4Nsu
Seq evol / Neutral theory / SVarvio
9
Let´s assume a species which has an effective population size of about 106
and a mean generation time 2 years.
Under these conditions, it will take a neutral mutation, on average 8
million years to become fixed in the population.
A mutation with a selective advantage of 1% will become fixed in the
population in 5800 years.
The conditional fixation time for a deleterious allele with a selective
disadvantage –s is the same as that for an advantageous allele with a
selective advantage s (theory by Kimura in the 1970´s).
This is intuitively understandable given the high probability of loss for a
deleterious allele. That is, for a deleterious allele to become fixed in a
population, fixation must occur very quickly.
Seq evol / Neutral theory / SVarvio
10
The rate of substitution depends on the population size, selective advantage and
mutation rate. The inverse of K is the mean time between two consequtive fixation
events (cf the figures above)
The formulae for K have been very important in the field of molecular evolution.
They define two very different predictions for genetic polymorphisms in populations. Are
the polymorphisms (for example sequence polymorphisms of certain genes) outcomes
from neutral evolution or from evolution dictated by selection?? Two opposite schools
since ~1970´s.
Today, when sequence information is extensive, the question has turned to:
How to identify genes from, for example, human sequence databases, which bear
signatures of positive selection?
Such genes have evolved faster => biased, “too long” branch lengths in phylogenetic
trees.
During the evolution of the human lineage, as compared with our phylogenetic
relatives (the great apes), positively selected genes might be responsible for some
important human specific phenotypic traits.
Seq evol / Neutral theory / SVarvio
11
Rate of nucleotide substitution
Below is a short introduction, in pursuance of serving a closure of the previous topic.
The number of substitutions per site per year
r = K / 2T
Divergence of two homologous sequences from a common ancestral sequence T years ago.
In the vast majority of genes, the synonymous substitution (does not result in amino acid
change) rate greatly exceeds the non-synonymous rate.
An extreme example: histone H4 gene, human vs. wheat, synonymous rate is ~25 times higher
than the nonsynonymous rate.
The rate is highly variable: fibrinopeptides evolve ~900 times faster than ubiquitin gene. This
kind of phenomena are explained by functional constraints: what proportion of possible
mutants will be deleterious and selectively removed, or neutral and fixed with a small
probability. The more functionally constrained the gene, the greater probability that a mutation
will be deleterious rather than neutral and so the lower the rate of nucleotide substitution.
Seq evol / Neutral theory / SVarvio
12
For example, proteins like histones which interact structurally with DNA so that any
amino acid change may be deleterious, evolve at the lowest rate whilst those
that only interact with other proteins, such as members of the immune system
or hormones, evolve at the highest rates because there is a great deal more
flexibility in which amino acids are functionally suitable.
Pseudogenes (loss of functionality) are important calibration tools. Rates of
nucleotide substitution per site, per year x 10-9 for mammalian globin
pseudogenes and their functional homologues. Each codon position is given
separately.
Pseudogene
Functional genes
1.
2.
3.
Mouse
5.0
0.75
0.68
2.65
Human
5.1
0.75
0.68
2.65
Rabbit
4.1
0.94
0.71
2.02
Goat
4.4
0.94
0.71
2.02
Thus, if the rate at pseudogens reflects evolution through mutation process, then
the 3. position sites (degenerate sites), which evolve slower, do also have
functional constraint, i.e. selection is slowing down their evolution, too.
Seq evol / Neutral theory / SVarvio
13