Download Neutral theory 3: Rates and patterns of molecular evolution

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Designer baby wikipedia , lookup

Biology and consumer behaviour wikipedia , lookup

Viral phylodynamics wikipedia , lookup

Gene wikipedia , lookup

Oncogenomics wikipedia , lookup

Genome (book) wikipedia , lookup

Genome evolution wikipedia , lookup

Genetic drift wikipedia , lookup

Adaptive evolution in the human genome wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

Polymorphism (biology) wikipedia , lookup

Frameshift mutation wikipedia , lookup

Genetic code wikipedia , lookup

Epistasis wikipedia , lookup

Point mutation wikipedia , lookup

Mutation wikipedia , lookup

Koinophilia wikipedia , lookup

Population genetics wikipedia , lookup

Microevolution wikipedia , lookup

Transcript
Neutral theory 3:
Rates and patterns of molecular evolution
Predictions of the neutral theory
1.
Within species variation is correlated with divergence between species.
2.
Evolutionary rate is inversely related functional constraint.
2.
Base composition at neutral sites reflects mutational equilibrium.
4.
A molecular clock.
Neutral theory is “the” rigid null hypothesis for molecular evolution
1
1. Polymorphism and divergence are correlated
Neutral theory is a bridge between microevolution and macroevolution
Neutral population polymorphism within species is correlated with neutral divergence
between species
1. Variation within and among species: polymorphism & divergence
This is one place where the genetic code is relevant:
1.
Synonymous (S)
2.
Non-synonymous (NS)
Neutrality and selection have different impacts on polymorphism:
1.
Neutrality: NS residence times determined by Ne
2.
Selection: NS residence times reduced by natural selection
Let’s look at the ratio NS:S [ratio of counts]
2
1. Variation within and among species: polymorphism & divergence
Comparison of the ratio of synonymous and nonsynonymous polymorphism within
species to divergence between species. Neutral theory suggests that the fraction of
variation that is nonsynonymous within species should be the same as between
species.
Species 1
Genealogies within
populations
12:4
17:6
Species 2
Species 3
6:2
10:3
14:5
19:6
Polymorphism within a
species
Substitutions between
species
Species level
phylogenies
Polymorphic
Fixed
Synonymous (S)
28
50
Non-synonymous (NS)
9
17
S:NS
3.1
2.9
Data are hypothetical. Ratios are tested by using a G-test on the counts of S and NS.
These hypothetical data are not significant. If positive selection were acting, residence
times for NS would be lower within species and polymorphic S:NS > fixed S:NS.
2. Rate of evolution is inversely related to functional constraint
Rate variation is well known:
• Fast genes (D-loop) verses slow genes (Histones)
• Introns verses exons
• Synonymous verse nonsynonymous sites
Neutral theory is consistent with such rate variation
- Asserts only that polymorphism is selectively equivalent
- Frequency of such polymorphism can change among genes, sites etc.
3
Note: two ways it is commonly measured
10
20
30
40
50
60
....|....| ....|....| ....|....| ....|....| ....|....| ....|....|
Mus2.FAS
MTTPALLPLS -----GRRIP PLNL--GPP- ----SFPHHR ATLRLSEKFI LLLILSAFIT
Human_GIA
---------- ---------- ---------- -----MNSNF ITFDLKMSLL PSNLFSAFIT
Human_GIB
MTTPALLPLS -----GRRIP PLNL--GPP- ----SFPHHR ATLRLSEKFI LLLILSAFIT
Mus_GIA
MPVGGLLPLF SSPGGGGLGS GLGGGLGGG- ----RKGSGP AAFRLTEKFV LLLVFSAFIT
Rabbit_GIA
---------- ---------- ---------- ---------- ---------- ----------
Sus_GIA
MPVGGLLPLF SSPAGGGLGG GLGGGLGGGG GGGGRKGSGP SAFRLTEKFV LLLVFSAFIT
70
80
90
100
110
120
....|....| ....|....| ....|....| ....|....| ....|....| ....|....|
Cebus
Saimiri
Mus2.FAS
LCFGAFFFLP DSSKHKRFDL G-LEDVLIPH VDAGKG---- AKNPGVFLIH GPDEHRHREE
Human_GIA
LCFGAIFFLP DSSKLLSGVL FHSSPALQPA ADHKPGPGAR AEDAAEGRAR RREEGAPGDP
Human_GIB
LCFGAFFFLP DSSKHKRFDL G-LEDVLIPH VDAGKG---- AKNPGVFLIH GPDEHRHREE
Mus_GIA
LCFGAIFFLP DSSKLLSGVL FHSNPALQPP AEHKPGLGAR AEDAAEGRVR HREEGAPGDP
Rabbit_GIA
---------- ---------- ---------- ---------- AEDAADGRAR PGEEGAPGDP
Sus_GIA
LCFGAIFFLP DSSKLLSGVL FHSSPALQPA ADHKPGPGAR AEDAADGRAR PGEEGAPGDP
Aotus
Callithrix
Lagothrix
Brachyteles
Alouatta
Ateles
Pan
Homo
Pongo
Macaca
Hylobates
Tarsius
Galago
Otolemur
Cheirogaleus
0.01
Eulemur
Ave of all pairwise distances
Tree length
2. Rate of evolution is inversely related to functional constraint
Mean number of substitution per site at the three codon positions of the epsilon-globin gene of
primates. Two measures are presented: (i) the average over all pair wise comparisons
between genes; and (ii) the sum of the branch lengths of the epsilon globin gene tree.
Mean number of substitutions/site
Cebus
Saimiri
0.15
0.8
pairwise subst/site
Callithrix
Lagothrix
Brachyteles
Alouatta
Ateles
Pan
Homo
0.6
0.1
0.4
0.05
0.2
0
0
Pongo
Macaca
subst/site as a sum
over tree
Aotus
1
2
3
Hylobates
Tarsius
Codon position
Galago
Otolemur
Cheirogaleus
0.01
mean pairwise subst rate
Subst rate as a sum of branch lengths
Eulemur
Gene tree for primate epsilon globins
Under both measures of substitution rate, 3rd codon
positions evolve faster than 1st and 2nd positions.
Note: mean number of substitutions per site were computed in all cases by using the Jukes and
Cantor (1969) correction.
4
2. Rate of evolution is inversely related to functional constraint
Neutral Model
Beneficial: rare and frequency quickly goes to zero
Deleterious: frequency = fD
f0 = 1 - fD
Neutral: frequency = f0
Kimura, 1968:
The neutral mutation rate per site is:
µ0 = µT f0
The neutral substitution rate per site is:
k = µT f0
2. Rate of evolution is inversely related to functional constraint
The rate of evolution depends on the “size (f0) of the selective sieve”
New mutations
Fixation in a “slow gene”
Kimura’s
New mutations
Fixation in a “fast gene”
f0 is the fraction of mutations that passes through the “sieve”.
5
2. Rate of evolution is inversely related to functional constraint
Example: 3rd codon positions verses synonymous sites
Some changes at 3rd codon positions are NOT synonymous
Prediction:
1.
f0 for 3rd codon positions < f0 for synonymous sites
2. rate 3rd codon positions < rate for synonymous sites
2. Rate of evolution is inversely related to functional constraint
The average substitution rate between primates and rodents is higher for synonymous sites
as compared with third codon positions. The results are based on a sample of 82 nuclear
genes.
45
Rodent
gene
t0
t1
Mean at 3rd positions: 0.40
40
number of proteins
Primate
gene
Mean at synonymous sites: 0.61
35
30
25
20
15
10
5
0
0.1
Ancestral gene
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
More
substitutions / site / 2x80 million years
3rd codon postions
Synonymous sites
Mean number of substitutions per site
This result is consistent with neutral theory given that
between primates and rodents is t = t0 +
f0 is smaller for 3rd codon positions because some
t1. The unit of time is 2 × 80my; the time
mutations at such site will be nonsynonymous.
since primates and rodents shared a
common ancestor.
Data from Bielawski, Dunn and Yang (2000) Genetics. 156:1299-1308.
6
2. Rate of evolution is inversely related to functional constraint
We can put sites into a wide variety of categories:
• 5’ and 3’ flanking regions
• 5’ and 3’ untranslated regions
• Introns
• Exons
• 3rd positions of 4-fold degenerate codons
• Nonsynonymous sites of a codon
• Functional domains
• Pseudogenes
2. Rate of evolution is inversely related to functional constraint
Substitutions per site per 109 years
Comparison of mean substitution rates in different parts of genes and pseudo-genes. Data is
from Li et al. (1985). Substitution rate is the mean number of substitutions per site per 109
years. Rates are an average over 3000 mammalian genes.
5
4.5
4
3.5
3
2.5
2
1.5
1
0.5
0
In
5'
Ps
5'
No
Sy
3'
3'
tro
un
fla
un
fla
eu
on
nns
nk
nk
tra
sy
tra
do
ym
in
no
i
ns
n
ge
ng
ou
sl
g
n
ne
l
at
at
re
s
re
y
m
s
ed
si
ed
gi
gi
o
t
o
on
es
us
re
re
n
gi
gi
si
on
on
te
s
7
2. Rate of evolution is inversely related to functional constraint
Sites subject to selection will have variable f0 depending on
the level of functional constraints acting on that site:
1. Nonsynonsymous sites
2. Functional domains
3. Etc.
2. Rate of evolution is inversely related to functional constraint
Multiple sequence alignment of four vertebrate beta-globin genes representing 450
million years of evolution. Amino acids shaded in green represent sites that appear
conserved for over those 450millin years.
Note: many of the shaded sites are located in the heme pocket or at the interfaces between
globins subunits, consistent with the notion that sites most critical to protein function evolve at
the slowest rates.
8
2. Rate of evolution is inversely related to functional constraint
Substitution rate differs in different polypeptide domains of preproinsulin.
C chain: 1.1 × 10-9 / site / year
A&B chains: 0.2 × 10-9 / site / year
5 fold higher rate in C chain
Note that c-chain rate is still lower than in many other proteins, and at
synonymous sites, so its amino acid sequence must still has considerable
functional importance to the protein, probably in folding to lowest free energy so
that disulfide bonds can be formed.
2. Rate of evolution: differences among genes
Let’s estimate the width of the selective sieve:
Under neutral theory:
• The synonymous substitution rate (kS) is equal to the neutral mutation rate.
• The nonsynonymous substitution rate (kN) measures the substitution rate for neutral amino acid changes.
• Thus the ratio of these rates (kN / kS) represents the fraction of amino acid mutations that are neutral: this is f0 for
amino acids
• The fraction of amino acid mutations that are deleterious (fD) must be 1 - (kN / kS).
Let’s take the Neuroleukin gene of primates as an example:
kN = 0.016
kS = 0.300
The fraction of amino acid changes that are neutral is 0.016/0.300 = 0.053, a small amount.
Hence the fraction of amino acid changes that are deleterious is 1 - 0.053 = 0.95!
9
2. Rate of evolution: differences among genes
Estimated level of function constraint for three nuclear genes of primates
A1 adenosine receptor
Most mutations in a gene
evolving under strong
functional constraints are
deleterious
Prolactin
Less functionally constrained
gene has more neutral
mutations
Pseudogene
Pseudogenes are nonfunctional so all mutations
are expected to be neutral
Fraction of deleterious mutations
Fraction of neutral mutations
Estimates obtained from relative rates of synonymous and nonsynonymous substitution. Data is from
Bielawski, Dunn, and Yang (2000) Genetics 156:1299-1308.
2. Rate of evolution: differences among genes
1. Nonsyonymous rates vary among genes due to differences
in functional constraints
2. Synonymous rates vary due to differences in mutation rates
[and in some cases weak selective constraints]
Best to estimate kS separately for each gene when trying to
estimate f0 and fD.
10
2. Rate of evolution: differences among genes
Distribution of nonsynonymous and synonymous substitution rates for 82 nuclear
genes of primates.
60
0.045 / site / 80 million years
Mean rate of synonymous
substitution:
0.201 / site / 80 million years
Number of proteins
Mean rate of nonsynonymous
substitution:
50
40
30
20
10
0.55
0.45
0.25
0.35
0.05
Data from Bielawski, Dunn, and
Yang (2000) Genetics,
156:1299-1308.
0.15
0
substitutions/site/80
million years
Method: GY94 under ML
Nonsyonymous rate
Synonymous rate
3. Patterns of mutation
Neutral theory: nucleotide frequencies at site free from selection will
reflect mutational equilibrium.
For example:
• Pseudogenes
• 3’ flanking regions
• Synonymous sites
11
3. Patterns of mutation: human beta globin
Nucleotide frequencies in the human beta-globin gene differ among the three
positions of the codon. Frequencies at positions 1 and 2 reflect selection
acting on the protein product of the gene. Frequencies at position 3 reflect a
strong influence of mutation pressure.
st
nd
1 codon
position
0.45
rd
2 codon
position
3 codon
position
0.45
0.45
3
0.4
0. 4
0.4
0.35
0.35
0.35
0.3
0. 3
0.25
0.2
0.15
0.1
0.25
0. 2
0.2
0 . 15
0 . 15
0.1
0.05
0
0.3
0.25
0.1
0.05
0.05
0
1
2
3
A C G
4
T
0
1
2
3
A C G
4
T
1
2
3
A C G
4
T
4. Molecular clock
k = µ
Neutral theory (1968) predicts that the rate of molecular evolution
[substitution] should be approximately constant over time, where
time is measures in generations.
Zukerkandl and Pauling (1965) noticed an approximately uniform
rate of amino acid substitution, with time measures in years.
The notion of a molecular clock is somewhat controversial
12
4. Molecular clock
Mean number of substitutions / site
Linear relation between mitochondrial substitution rate
and time since common ancestor in teleost fishes
0.2
0.15
0.1
0.05
0
0
50
100
150
200
Time (millions of years)
Linear relationship is expected under a uniform rate of substitution.
Substitutions are the mean number of changes at first codon
positions of all mitochondrial protein coding genes. Data were
kindly provided by K. Dunn.
Note: many proteins exhibit constant rate in terms of absolute time in years
Some initial problems with neutral theory
1. Expected heterozygosity was larger than observed in
natural populations
2. Some molecules evolved according to a per million year
clock even though generation times were very different
3. There were more problems to follow, but will not
consider them.
13
Nearly neutral theory
Tomoko Ohta
Motoo Kimura
Ohta: What happens if some mutations are only mildly deleterious?
Nearly neutral theory
Slightly deleterious mutations:
• Small selective coefficients (s)
• When s is small, populations size plays important role:
• Large Ne: selection effective
• Small Ne: selection less effective [more deleterious alleles get fixed!]
Ohta and Kimura (1971): the slightly deleterious model of evolution
- more adjustments were to follow
14
Remember this slide from PopGen Topic 8: Selection - drift
Then fate of a beneficial recessive allele (A1) is not always predictable under the combined effects of
directional selection and genetic drift. If there is no genetic drift (left: Nes = infinity), the fate of the recessive allele
(A1) is always determined by selection. When there is drift (right: Nes < infinity) the fate of the recessive allele (A1)
is not necessarily determined by selection; hence a deleterious allele can be fixed in a population.
Nes = 100
Nes = infinity
Note that Nes > 1 does not guarantee that an allele is going to be fixed, it simply indicates that (as a long term
average) the frequency that it is fixed will be greater than the frequency under genetic drift alone.
Slightly deleterious model
Large population size:
(selection very effective)
neutral
mutations
slightly
deleterious
mutations
FIXED
Small population size:
(selection a little less effective)
neutral
mutations
slightly
deleterious
mutations
FIXED
15
Nearly neutral theory
Slightly beneficial models were incorporated later
Large population size:
(selection very effective)
neutral
mutations
Small population size:
(selection a little less effective)
neutral
mutations
slightly
beneficial
mutations
slightly
beneficial
mutations
FIXED
FIXED
Mildly deleterious + slightly beneficial = “Nearly neutral model”
The strictly neutral model was extended to accommodate nearly neutral mutations
Beneficial
Deleterious
Neutral
Strictly neutral model
Neutral
Slightly deleterious model
Slightly deleterious
Slightly beneficial
Nearly neutral model
Neutral
Fraction of “neutral” mutations
(f0) changes with Ne
Note that Ne changes over time!
Slightly deleterious
16
Nearly neutral theory reconciles some observations with theory
1. Predicts lower natural levels of heterozygosity
2. Possible reconciliation of predicted rate constancy in
generations with rate constancy in years.
Neutrality depends on environmental conditions
Strictly neutral and nearly neutral models:
• distribution of fitness effects of new mutations changes according to
environment
Genetic environment and physical environment change:
• 3D space of protein reflect a genetic as well as physical environment
• neutral substitutions, recombination, LGT, etc. change the genetic
environment
• physical environment changes daily, weekly, seasonally, yearly….
17
Success of neutral theory: “the null model”
Rob Stainer (1970) at the general meeting of the society for general
Microbiology commented that evolutionary studies are
“a relatively harmless habit, like eating peanuts”.
Neutral theory provides the foundation for the science of molecular
evolution
Success of neutral theory: James F. Crow (1985)
1. The theory provides the best explanation for the dramatic differences in the rates
and patterns of evolution in molecules as compared with morphology.
2. The neutral theory provides a common framework for understanding the dramatic
differences among genes, codon positions, introns, and pseudogenes.
3. The neutral theory correctly predicts the differences in rates among molecular
datasets as well as the similarity of substitution rates between the so-called “living
fossil” organisms and the most rapidly changing species.
4. The neutral theory has stimulated theoretical studies as well as studies of natural
variation in a framework based on a rigid null hypothesis.
18