Neutral theory 3: Rates and patterns of molecular evolution

Predictions of the neutral theory
1. Within-species variation is correlated with divergence between species.
2. Evolutionary rate is inversely related to functional constraint.
3. Base composition at neutral sites reflects mutational equilibrium.
4. A molecular clock.
Neutral theory is “the” rigid null hypothesis for molecular evolution.

1. Polymorphism and divergence are correlated
Neutral theory is a bridge between microevolution and macroevolution.
Neutral polymorphism within species is correlated with neutral divergence between species.

1. Variation within and among species: polymorphism & divergence
This is one place where the genetic code is relevant:
1. Synonymous (S)
2. Non-synonymous (NS)
Neutrality and selection have different impacts on polymorphism:
1. Neutrality: NS residence times determined by Ne
2. Selection: NS residence times reduced by natural selection
Let’s look at the ratio NS:S [ratio of counts].

1. Variation within and among species: polymorphism & divergence
Comparison of the ratio of synonymous to nonsynonymous polymorphism within species with the same ratio for divergence between species. Neutral theory suggests that the fraction of variation that is nonsynonymous should be the same within species as between species.

[Figure: genealogies within populations of Species 1, 2 and 3 (polymorphism within a species) joined by a species-level phylogeny (substitutions between species); lineages are labelled with S:NS counts 12:4, 17:6, 6:2, 10:3, 14:5 and 19:6.]

                        Polymorphic   Fixed
Synonymous (S)               28         50
Non-synonymous (NS)           9         17
S:NS                         3.1        2.9

Data are hypothetical. Ratios are tested by using a G-test on the counts of S and NS. For these hypothetical data the difference is not significant. If positive selection were acting, residence times for NS would be lower within species and polymorphic S:NS > fixed S:NS.
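The G-test on the hypothetical counts can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used to prepare the slides: the function name `g_test_2x2` is made up here, no continuity correction is applied, and the p-value uses the chi-square distribution with 1 degree of freedom.

```python
import math

def g_test_2x2(table):
    """G-test of independence for a table of counts (no continuity correction).

    The p-value formula below is only valid for a 2x2 table (1 degree of freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if observed > 0:  # a zero cell contributes nothing to the sum
                g += observed * math.log(observed / expected)
    g *= 2.0
    # For 1 degree of freedom, P(chi-square > G) = erfc(sqrt(G / 2)).
    p_value = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return g, p_value

# Rows: synonymous, non-synonymous. Columns: polymorphic, fixed.
# These are the hypothetical counts from the slide.
counts = [[28, 50], [9, 17]]
g_stat, p_val = g_test_2x2(counts)
print(f"G = {g_stat:.3f}, p = {p_val:.2f}")
```

For these counts G comes out far below the 5% critical value of 3.84 for 1 degree of freedom, which agrees with the statement that the hypothetical data are not significant.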
2. Rate of evolution is inversely related to functional constraint
Rate variation is well known:
• Fast genes (D-loop) versus slow genes (histones)
• Introns versus exons
• Synonymous versus nonsynonymous sites
Neutral theory is consistent with such rate variation:
- It asserts only that polymorphism is selectively equivalent
- The frequency of such polymorphism can change among genes, sites, etc.

Note: two ways it is commonly measured
[Figure: a multiple sequence alignment of six sequences (Mus2.FAS, Human_GIA, Human_GIB, Mus_GIA, Rabbit_GIA, Sus_GIA), labelled “Ave of all pairwise distances”, and a primate gene tree (Cebus, Saimiri, Aotus, Callithrix, Lagothrix, Brachyteles, Alouatta, Ateles, Pan, Homo, Pongo, Macaca, Hylobates, Tarsius, Galago, Otolemur, Cheirogaleus, Eulemur; scale bar 0.01), labelled “Tree length”.]

2. Rate of evolution is inversely related to functional constraint
Mean number of substitutions per site at the three codon positions of the epsilon-globin gene of primates.
Two measures are presented: (i) the average over all pairwise comparisons between genes; and (ii) the sum of the branch lengths of the epsilon-globin gene tree.

[Figure: bar chart of the mean number of substitutions per site at codon positions 1, 2 and 3, with one axis for pairwise substitutions per site (mean pairwise substitution rate) and one for substitutions per site as a sum over the tree (substitution rate as a sum of branch lengths), shown beside the gene tree for primate epsilon globins (scale bar 0.01).]

Under both measures of substitution rate, 3rd codon positions evolve faster than 1st and 2nd positions.
Note: the mean number of substitutions per site was computed in all cases by using the Jukes and Cantor (1969) correction.

2. Rate of evolution is inversely related to functional constraint
Neutral model
Beneficial: rare, and frequency quickly goes to zero
Deleterious: frequency = fD
Neutral: frequency = f0
f0 = 1 - fD
Kimura, 1968:
The neutral mutation rate per site is: µ0 = µT f0
The neutral substitution rate per site is: k = µT f0

2. Rate of evolution is inversely related to functional constraint
The rate of evolution depends on the “size (f0) of the selective sieve”.
[Figure: Kimura’s selective sieve: new mutations pass through a sieve before fixation in a “slow gene” and in a “fast gene”.]
f0 is the fraction of mutations that passes through the “sieve”.

2. Rate of evolution is inversely related to functional constraint
Example: 3rd codon positions versus synonymous sites
Some changes at 3rd codon positions are NOT synonymous.
Prediction:
1. f0 for 3rd codon positions < f0 for synonymous sites
2. rate for 3rd codon positions < rate for synonymous sites

2. Rate of evolution is inversely related to functional constraint
The average substitution rate between primates and rodents is higher for synonymous sites as compared with third codon positions. The results are based on a sample of 82 nuclear genes.
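The claim that some changes at 3rd codon positions are not synonymous can be checked by enumeration. The sketch below is an illustration added here, not part of the original analysis: it assumes the standard genetic code, treats every sense codon and every possible third-position change as equally likely (no codon usage or mutation bias), and the name `classify_third_position_changes` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_third_position_changes():
    """Count single-base changes at the 3rd position of every sense codon."""
    counts = {"synonymous": 0, "nonsynonymous": 0, "stop": 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # only changes starting from sense codons are considered
        for base in BASES:
            if base == codon[2]:
                continue
            new_aa = CODON_TABLE[codon[:2] + base]
            if new_aa == aa:
                counts["synonymous"] += 1
            elif new_aa == "*":
                counts["stop"] += 1
            else:
                counts["nonsynonymous"] += 1
    return counts

third_position = classify_third_position_changes()
total_changes = sum(third_position.values())
print(third_position, total_changes)
```

Under these simplifying assumptions a sizeable minority of third-position changes alter the amino acid or create a stop codon, so f0 for 3rd positions is expected to be somewhat smaller than f0 for strictly synonymous sites.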
[Figure: histogram of the number of proteins (0 to 45) against the mean number of substitutions per site (bins from 0.1 to 1 and “More”; substitutions / site / 2 × 80 million years) for 3rd codon positions and synonymous sites, with a diagram of an ancestral gene diverging into a rodent gene and a primate gene along branches t0 and t1.]
Mean at 3rd positions: 0.40
Mean at synonymous sites: 0.61
The mean number of substitutions per site between primates and rodents is t = t0 + t1. The unit of time is 2 × 80 my, the time since primates and rodents shared a common ancestor.
This result is consistent with neutral theory, given that f0 is smaller for 3rd codon positions because some mutations at such sites will be nonsynonymous.
Data from Bielawski, Dunn and Yang (2000) Genetics 156:1299-1308.

2. Rate of evolution is inversely related to functional constraint
We can put sites into a wide variety of categories:
• 5’ and 3’ flanking regions
• 5’ and 3’ untranslated regions
• Introns
• Exons
• 3rd positions of 4-fold degenerate codons
• Nonsynonymous sites of a codon
• Functional domains
• Pseudogenes

2. Rate of evolution is inversely related to functional constraint
Comparison of mean substitution rates in different parts of genes and pseudogenes. Data are from Li et al. (1985). Substitution rate is the mean number of substitutions per site per 10^9 years. Rates are an average over 3000 mammalian genes.
[Figure: bar chart of substitutions per site per 10^9 years (axis 0 to 5) for introns, 5’ untranslated region, 5’ flanking region, pseudogenes, nonsynonymous sites, synonymous sites, 3’ untranslated region and 3’ flanking region.]

2. Rate of evolution is inversely related to functional constraint
Sites subject to selection will have variable f0 depending on the level of functional constraint acting on that site:
1. Nonsynonymous sites
2. Functional domains
3. Etc.

2. Rate of evolution is inversely related to functional constraint
Multiple sequence alignment of four vertebrate beta-globin genes representing 450 million years of evolution.
Amino acids shaded in green represent sites that appear conserved over those 450 million years. Note: many of the shaded sites are located in the heme pocket or at the interfaces between globin subunits, consistent with the notion that sites most critical to protein function evolve at the slowest rates.

2. Rate of evolution is inversely related to functional constraint
Substitution rate differs among the polypeptide domains of preproinsulin.
C chain: 1.1 × 10^-9 / site / year
A & B chains: 0.2 × 10^-9 / site / year
5-fold higher rate in the C chain
Note that the C-chain rate is still lower than in many other proteins, and lower than at synonymous sites, so its amino acid sequence must still have considerable functional importance to the protein, probably in folding to the lowest free energy so that disulfide bonds can be formed.

2. Rate of evolution: differences among genes
Let’s estimate the width of the selective sieve. Under neutral theory:
• The synonymous substitution rate (kS) is equal to the neutral mutation rate.
• The nonsynonymous substitution rate (kN) measures the substitution rate for neutral amino acid changes.
• Thus the ratio of these rates (kN / kS) represents the fraction of amino acid mutations that are neutral: this is f0 for amino acids.
• The fraction of amino acid mutations that are deleterious (fD) must be 1 - (kN / kS).
Let’s take the neuroleukin gene of primates as an example:
kN = 0.016
kS = 0.300
The fraction of amino acid changes that are neutral is 0.016 / 0.300 = 0.053, a small amount. Hence the fraction of amino acid changes that are deleterious is 1 - 0.053 = 0.95!
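The sieve-width arithmetic can be written as a small helper. This is only a sketch of the calculation described above; the function name `sieve_width` is an assumption made for illustration, and it relies on the neutral-theory assumptions stated on the slide (kS equals the neutral mutation rate and beneficial mutations are negligible).

```python
def sieve_width(k_n, k_s):
    """Return (f0, fD) for amino acid changes from nonsynonymous and synonymous rates.

    Assumes strict neutrality: f0 = kN / kS and fD = 1 - f0.
    A ratio above 1 would give a negative fD, which signals that these
    assumptions do not hold for the gene in question.
    """
    if k_s <= 0:
        raise ValueError("kS must be positive")
    f0 = k_n / k_s
    return f0, 1.0 - f0

# Neuroleukin values from the slide.
f0, f_d = sieve_width(0.016, 0.300)
print(f"f0 = {f0:.3f}, fD = {f_d:.3f}")
```

Because kS can differ among genes, the same helper would be called with each gene's own kS rather than a genome-wide average.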
2. Rate of evolution: differences among genes
Estimated level of functional constraint for three nuclear genes of primates.
[Figure: for each gene, the fraction of deleterious mutations and the fraction of neutral mutations.]
A1 adenosine receptor: most mutations in a gene evolving under strong functional constraints are deleterious.
Prolactin: a less functionally constrained gene has more neutral mutations.
Pseudogene: pseudogenes are nonfunctional, so all mutations are expected to be neutral.
Estimates obtained from relative rates of synonymous and nonsynonymous substitution. Data are from Bielawski, Dunn, and Yang (2000) Genetics 156:1299-1308.

2. Rate of evolution: differences among genes
1. Nonsynonymous rates vary among genes due to differences in functional constraints.
2. Synonymous rates vary due to differences in mutation rates [and in some cases weak selective constraints].
Best to estimate kS separately for each gene when trying to estimate f0 and fD.

2. Rate of evolution: differences among genes
Distribution of nonsynonymous and synonymous substitution rates for 82 nuclear genes of primates.
Mean rate of nonsynonymous substitution: 0.045 / site / 80 million years
Mean rate of synonymous substitution: 0.201 / site / 80 million years
[Figure: histogram of the number of proteins (0 to 60) against substitutions / site / 80 million years (0.05 to 0.55) for the nonsynonymous rate and the synonymous rate.]
Method: GY94 under ML. Data from Bielawski, Dunn, and Yang (2000) Genetics 156:1299-1308.

3. Patterns of mutation
Neutral theory: nucleotide frequencies at sites free from selection will reflect mutational equilibrium. For example:
• Pseudogenes
• 3’ flanking regions
• Synonymous sites

3. Patterns of mutation: human beta globin
Nucleotide frequencies in the human beta-globin gene differ among the three positions of the codon. Frequencies at positions 1 and 2 reflect selection acting on the protein product of the gene. Frequencies at position 3 reflect a strong influence of mutation pressure.
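Nucleotide frequencies by codon position, as described for human beta globin, can be tallied with a short function. The sketch below is illustrative only: the function name and the toy sequence are hypothetical (the sequence is not beta globin), and the input is assumed to be an in-frame coding sequence.

```python
from collections import Counter

def base_frequencies_by_codon_position(cds):
    """Return three dicts of A/C/G/T frequencies, one per codon position.

    Assumes `cds` starts in frame; trailing bases that do not complete a
    codon are ignored. Characters other than A, C, G, T count toward the
    denominator but are not reported.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    frequencies = []
    for offset in range(3):
        column = cds[offset:usable:3]  # every third base, starting at this position
        counts = Counter(column)
        total = len(column)
        frequencies.append(
            {base: (counts[base] / total if total else 0.0) for base in "ACGT"}
        )
    return frequencies

toy_cds = "ATGAAATTTGGC"  # hypothetical 4-codon sequence for illustration
freqs = base_frequencies_by_codon_position(toy_cds)
for position, freq in enumerate(freqs, start=1):
    print(position, freq)
```

Applied to a real coding sequence, markedly different frequencies at position 3 compared with positions 1 and 2 would be the pattern the slide attributes to mutation pressure.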
[Figure: three bar charts of the frequencies of A, C, G and T (axis 0 to 0.45) at the 1st, 2nd and 3rd codon positions.]

4. Molecular clock
k = µ
Neutral theory (1968) predicts that the rate of molecular evolution [substitution] should be approximately constant over time, where time is measured in generations.
Zuckerkandl and Pauling (1965) noticed an approximately uniform rate of amino acid substitution, with time measured in years.
The notion of a molecular clock is somewhat controversial.

4. Molecular clock
Linear relation between mitochondrial substitution rate and time since common ancestor in teleost fishes.
[Figure: mean number of substitutions / site (0 to 0.2) plotted against time (0 to 200 millions of years).]
A linear relationship is expected under a uniform rate of substitution. Substitutions are the mean number of changes at first codon positions of all mitochondrial protein-coding genes. Data were kindly provided by K. Dunn.
Note: many proteins exhibit a constant rate in terms of absolute time in years.

Some initial problems with neutral theory
1. Expected heterozygosity was larger than observed in natural populations.
2. Some molecules evolved according to a per-million-year clock even though generation times were very different.
3. There were more problems to follow, but we will not consider them.

Nearly neutral theory
[Photos: Tomoko Ohta and Motoo Kimura]
Ohta: What happens if some mutations are only mildly deleterious?

Nearly neutral theory
Slightly deleterious mutations:
• Small selective coefficients (s)
• When s is small, population size plays an important role:
  • Large Ne: selection effective
  • Small Ne: selection less effective [more deleterious alleles get fixed!]
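The dependence on population size can be made concrete with Kimura's diffusion approximation for the fixation probability of a new mutation, which is not given on the slides and is brought in here as an assumption. The sketch takes a diploid population whose census size equals Ne, genic selection (fitnesses 1, 1 + s, 1 + 2s) and an initial frequency of 1/(2Ne); the function name and the selection coefficient are hypothetical.

```python
import math

def fixation_probability(s, n_e):
    """Fixation probability of a new mutation (diffusion approximation).

    P = (1 - exp(-2s)) / (1 - exp(-4 * Ne * s)), with P = 1 / (2 * Ne) when s = 0.
    Very large values of Ne * |s| for deleterious mutations can overflow expm1.
    """
    if s == 0:
        return 1.0 / (2.0 * n_e)
    # expm1(x) = exp(x) - 1 keeps precision when s is tiny.
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n_e * s)

s = -0.001  # hypothetical slightly deleterious mutation
for n_e in (100, 1000):
    relative = fixation_probability(s, n_e) / fixation_probability(0.0, n_e)
    print(f"Ne = {n_e}: fixation probability relative to neutral = {relative:.3f}")

relative_small = fixation_probability(s, 100) / fixation_probability(0.0, 100)
relative_large = fixation_probability(s, 1000) / fixation_probability(0.0, 1000)
```

With the same s, the mutation behaves almost like a neutral one in the smaller population but is fixed far less often than a neutral one in the larger population, which is the sense in which selection is less effective when Ne is small.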
Ohta and Kimura (1971): the slightly deleterious model of evolution. More adjustments were to follow.

Remember this slide from PopGen Topic 8: Selection - drift
The fate of a beneficial recessive allele (A1) is not always predictable under the combined effects of directional selection and genetic drift. If there is no genetic drift (left: Nes = infinity), the fate of the recessive allele (A1) is always determined by selection. When there is drift (right: Nes < infinity), the fate of the recessive allele (A1) is not necessarily determined by selection; hence a deleterious allele can be fixed in a population.
[Figure: two panels, labelled Nes = infinity and Nes = 100.]
Note that Nes > 1 does not guarantee that an allele is going to be fixed; it simply indicates that (as a long-term average) the frequency with which it is fixed will be greater than the frequency under genetic drift alone.

Slightly deleterious model
[Figure: for a large population size (selection very effective) and a small population size (selection a little less effective), a bracket marks which of the neutral mutations and slightly deleterious mutations are FIXED.]

Nearly neutral theory
Slightly beneficial models were incorporated later.
[Figure: for a large population size (selection very effective) and a small population size (selection a little less effective), a bracket marks which of the neutral mutations and slightly beneficial mutations are FIXED.]
Mildly deleterious + slightly beneficial = “Nearly neutral model”

The strictly neutral model was extended to accommodate nearly neutral mutations
[Figure: classes of mutations under three models. Strictly neutral model: beneficial, deleterious, neutral. Slightly deleterious model: neutral, slightly deleterious. Nearly neutral model: neutral, slightly deleterious, slightly beneficial.]
The fraction of “neutral” mutations (f0) changes with Ne.
Note that Ne changes over time!

Nearly neutral theory reconciles some observations with theory
1. Predicts lower natural levels of heterozygosity.
2. Possible reconciliation of predicted rate constancy in generations with rate constancy in years.

Neutrality depends on environmental conditions
Strictly neutral and nearly neutral models:
• The distribution of fitness effects of new mutations changes according to environment
Genetic environment and physical environment change:
• The 3D space of a protein reflects a genetic as well as a physical environment
• Neutral substitutions, recombination, LGT, etc. change the genetic environment
• The physical environment changes daily, weekly, seasonally, yearly…

Success of neutral theory: “the null model”
Rob Stainer (1970), at the general meeting of the Society for General Microbiology, commented that evolutionary studies are “a relatively harmless habit, like eating peanuts”.
Neutral theory provides the foundation for the science of molecular evolution.

Success of neutral theory: James F. Crow (1985)
1. The theory provides the best explanation for the dramatic differences in the rates and patterns of evolution in molecules as compared with morphology.
2. The neutral theory provides a common framework for understanding the dramatic differences among genes, codon positions, introns, and pseudogenes.
3. The neutral theory correctly predicts the differences in rates among molecular datasets as well as the similarity of substitution rates between the so-called “living fossil” organisms and the most rapidly changing species.
4. The neutral theory has stimulated theoretical studies as well as studies of natural variation in a framework based on a rigid null hypothesis.