The neutral theory of molecular evolution

In 1968 Motoo Kimura introduced the concept, and the discipline, of the "neutral theory of molecular evolution" in the paper "Evolutionary rate at the molecular level", Nature 217:624-626. Kimura (1924-1994) was a Japanese population geneticist who also bred orchids. He became interested in stochastic processes and is known among mathematicians for his innovative use of diffusion equations. In biology he is regarded as the father of theoretical molecular evolution.

This set of slides will be replaced by a better edited version.

Seq evol / Neutral theory / SVarvio 1

Theory of substitution dynamics in short

Mutation generates new versions of sequences. For the species these are good, bad or neutral: they increase fitness, decrease fitness, or have no effect on fitness (= neutral).

Sequences which evolve neutrally: divergence = accumulation of neutral mutations, so divergence can be used to estimate the "calendar time" since two species diverged. The classical molecular clock hypothesis may hold.

Substitution is a process whereby a mutant allele replaces the previous allele. In this process a mutant allele arises as a single copy and becomes fixed after a certain number of generations. Not all mutants reach fixation, however. In fact, the majority of them are lost by chance (random drift) after a few generations, even if they are good.

Fixation probability, fixation time, rate of gene substitution (the number of fixations of new alleles, i.e. new sequence versions, per unit time)

The probability that a particular allele will become fixed in a population depends on its frequency; on its fitness advantage or disadvantage, i.e. (Darwinian) selection increasing or decreasing its frequency; and on the effective population size $N_e$, which affects the sampling process. In a small population more changes happen by mere chance ("random drift"); in a larger population selection (fitness differences) outcompetes chance effects.
In the field of molecular evolution and population genetics, long ago, when the structure of DNA had been found, the genetic code had been found, and something was known about protein sequences (i.e. they are more similar between closely related species, and similarity decreases as the species become more distantly related), Kimura (1962) showed, considering the selection scheme (s is the selective advantage)

genotype   A1A1   A1A2   A2A2
fitness    1      1+s    1+2s

that the probability of fixation of A2 is

$P = \dfrac{1 - e^{-4 N_e s q}}{1 - e^{-4 N_e s}}$

where q is the initial frequency of allele A2.

Since $e^{-x} \approx 1 - x$ for small values of x, the equation reduces to $P \approx q$ as s approaches 0. Thus, for a neutral allele, the fixation probability equals its frequency in the population. For example, a neutral allele with a frequency of 40% will become fixed in 40% of the cases and will be lost in 60% of the cases. Fixation then occurs by random drift, which favors neither allele.

A new mutant arising as a single copy in a population of N diploid individuals has an initial frequency of 1/(2N). The probability of fixation of a particular mutant allele is thus obtained by replacing q with 1/(2N) in the previous equation. When $s \neq 0$,

$P = \dfrac{1 - e^{-2 N_e s / N}}{1 - e^{-4 N_e s}}$

For a neutral mutation, s = 0, the equation becomes

$P = \dfrac{1}{2N}$

If the population size is equal to the effective population size,

$P = \dfrac{1 - e^{-2s}}{1 - e^{-4Ns}}$

If the absolute value of s is small,

$P \approx \dfrac{2s}{1 - e^{-4Ns}}$

For positive values of s and large values of N,

$P \approx 2s$

Thus: if an advantageous mutation arises in a large population and its selective advantage over the rest of the alleles is small, say up to 5%, the probability of its fixation is approximately twice its selective advantage. For example, if a new mutation with s = 0.01 arises in a population, the probability of its eventual fixation is 2%.
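The fixation-probability formula above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `fixation_probability`, its argument names and the population size N = 1000 are illustrative assumptions. It evaluates Kimura's formula for a single new mutant (q = 1/(2N)) and compares it with the approximation P ≈ 2s.

```python
import math

def fixation_probability(s, N, Ne=None, q=None):
    """Kimura's fixation probability for the fitness scheme 1, 1+s, 1+2s.

    s  : selection coefficient (0 = neutral, negative = deleterious)
    N  : census population size (diploid individuals)
    Ne : effective population size (assumed equal to N if not given)
    q  : initial allele frequency (a single new copy, 1/(2N), if not given)
    """
    Ne = N if Ne is None else Ne
    q = 1.0 / (2 * N) if q is None else q
    if s == 0:
        return q                      # neutral limit: P = q
    x = 4 * Ne * s
    if x < -700:
        return 0.0                    # e^{-x} would overflow; P is effectively zero
    # expm1(y) = e^y - 1, so this ratio equals (1 - e^{-xq}) / (1 - e^{-x})
    return math.expm1(-x * q) / math.expm1(-x)

# Illustrative values (N = 1000 is an assumption made for this sketch)
N = 1000
print("neutral      :", fixation_probability(0.0, N))
print("s = 0.01     :", fixation_probability(0.01, N), "vs 2s =", 2 * 0.01)
print("s = -0.001   :", fixation_probability(-0.001, N))
```

Using `math.expm1` rather than `1 - math.exp(...)` is a design choice for numerical accuracy when s is very small, which is exactly the regime where the formula approaches the neutral value q.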
An example: A new mutant arises in a population of 1000 individuals. What is the probability that this allele will become fixed in the population if (a) it is neutral, (b) it confers a selective advantage of 0.01, or (c) it has a selective disadvantage of 0.001? For simplicity, we assume that $N_e = N = 1000$. For the neutral case, the probability is P = 1/(2N) = 0.05%. From the equations above we obtain probabilities of 2% and 0.004% for the advantageous and deleterious mutations, respectively.

The message: Advantageous mutations do not always become fixed in the population. In fact, 98% of all the mutations with a selective advantage of 0.01 will be lost by chance. Deleterious mutations have a finite probability of becoming fixed in a population, albeit a small one. The fact that a deleterious allele may become fixed in a population at the expense of better alleles illustrates the importance of chance events in determining the fate of mutations during evolution.

Figure: The dynamics of gene substitution for advantageous (top) and neutral (bottom) mutations. Advantageous mutations are either rapidly lost or rapidly fixed in the population. In contrast, for neutral alleles the frequency changes are slow, and the fixation time is much longer than that for advantageous mutants. The x axis is time; the y axis is allele frequency from 0 to 1 (= fixation).

In a larger population the chance effects become smaller. For example, if N = 10 000, then the fixation probabilities become 0.005%, 2% and $\sim 10^{-20}$. Thus, while the fixation probability for the advantageous mutation remains approximately the same, that for the neutral mutation becomes smaller, and that for the deleterious allele becomes indistinguishable from zero.

The fixation time. Consider fixation and loss separately, and restrict consideration to the time taken by those mutants that will eventually become fixed in the population.
This is called the conditional fixation time. For a new mutation whose initial frequency is q = 1/(2N), the mean conditional fixation time (theory by Kimura in the 1960s) for a neutral mutation is approximated by

$t = 4N$ generations

and for a mutation with a selective advantage of s by

$t = (2/s) \ln(2N)$ generations

Let us assume a species which has an effective population size of about $10^6$ and a mean generation time of 2 years. Under these conditions it will take a neutral mutation, on average, 8 million years to become fixed in the population ($4 \times 10^6$ generations of 2 years each). A mutation with a selective advantage of 1% will become fixed in the population in about 5800 years ($(2/0.01) \ln(2 \times 10^6) \approx 2900$ generations).

The conditional fixation time for a deleterious allele with a selective disadvantage -s is the same as that for an advantageous allele with a selective advantage s (theory by Kimura in the 1970s). This is intuitively understandable given the high probability of loss for a deleterious allele. That is, for a deleterious allele to become fixed in a population, fixation must occur very quickly.

The rate of gene substitution. Definition: the number of mutants reaching fixation per unit time.

If neutral mutations occur at a rate of u per gene per generation, then the number of mutants arising at a gene locus in a diploid population of size N is 2Nu per generation. Since the probability of fixation for each of these mutations is 1/(2N), the rate of substitution of neutral alleles is obtained by multiplying the total number of mutations by the probability of their fixation:

$K = 2Nu \cdot \dfrac{1}{2N} = u$

The rate of substitution is thus equal to the rate of mutation.

Intuitively: in a large population the number of mutations arising every generation is high, but the fixation probability is low. In a small population the number of mutations arising every generation is low, but the fixation probability of each mutation is high.
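The fixation-time figures and the cancellation of N in the neutral substitution rate can be reproduced with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the function names are invented for illustration, and the mutation rate u = 1e-6 per gene per generation is a hypothetical value that does not come from the slides.

```python
import math

def neutral_fixation_time(N):
    """Mean conditional fixation time of a neutral mutation: t = 4N generations."""
    return 4 * N

def advantageous_fixation_time(s, N):
    """Mean conditional fixation time with advantage s: t = (2/s) ln(2N) generations."""
    return (2.0 / s) * math.log(2 * N)

def neutral_substitution_rate(u, N):
    """K = (new mutations per generation, 2Nu) * (fixation probability, 1/(2N))."""
    new_mutations_per_generation = 2 * N * u
    fixation_probability = 1.0 / (2 * N)
    return new_mutations_per_generation * fixation_probability

# Values from the worked example: N = 10^6, generation time 2 years, s = 0.01
N = 10**6
generation_time_years = 2
print("neutral     :", neutral_fixation_time(N) * generation_time_years, "years")
print("advantageous:", advantageous_fixation_time(0.01, N) * generation_time_years, "years")

# Hypothetical mutation rate, used only to show that K does not change with N
u = 1e-6
for size in (10**3, 10**6):
    print("N =", size, " K =", neutral_substitution_rate(u, size))
```

Keeping the two factors of K as separate variables, instead of simplifying to `return u`, makes the cancellation visible: the first factor grows with N while the second shrinks by the same amount.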
As a consequence, the rate of substitution for neutral mutations is independent of population size.

For advantageous mutations the rate of substitution can be obtained in the same way, by multiplying the number of new mutations per generation (2Nu) by the probability of fixation for advantageous alleles as given above ($P \approx 2s$). For selection with s > 0,

$K = 4Nsu$

In this case the rate of substitution depends on the population size, the selective advantage and the mutation rate.

The inverse of K is the mean time between two consecutive fixation events (cf. the figures above).

The formulae for K have been very important in the field of molecular evolution. They define two very different predictions for genetic polymorphisms in populations. Are the polymorphisms (for example, sequence polymorphisms of certain genes) outcomes of neutral evolution or of evolution dictated by selection? There have been two opposing schools since the 1970s.

Today, when sequence information is extensive, the question has turned to: How can we identify genes, for example in human sequence databases, which bear signatures of positive selection? Such genes have evolved faster, which shows up as biased, "too long" branch lengths in phylogenetic trees.
During the evolution of the human lineage, as compared with our phylogenetic relatives (the great apes), positively selected genes might be responsible for some important human-specific phenotypic traits.

Rate of nucleotide substitution

Below is a short introduction, to close the previous topic.

The number of substitutions per site per year is

$r = \dfrac{K}{2T}$

where K is here the number of substitutions per site between two homologous sequences that diverged from a common ancestral sequence T years ago. The factor 2 appears because substitutions accumulate along both lineages.

In the vast majority of genes, the synonymous substitution rate (synonymous = does not result in an amino acid change) greatly exceeds the nonsynonymous rate. An extreme example: in the histone H4 gene, human vs. wheat, the synonymous rate is ~25 times higher than the nonsynonymous rate.

The rate is highly variable: fibrinopeptides evolve ~900 times faster than the ubiquitin gene.

These phenomena are explained by functional constraints: what proportion of the possible mutants will be deleterious and selectively removed, and what proportion will be neutral and fixed with a small probability. The more functionally constrained the gene, the greater the probability that a mutation will be deleterious rather than neutral, and so the lower the rate of nucleotide substitution.

For example, proteins like histones, which interact structurally with DNA so that any amino acid change may be deleterious, evolve at the lowest rates, whilst those that only interact with other proteins, such as components of the immune system or hormones, evolve at the highest rates, because there is a great deal more flexibility in which amino acids are functionally suitable.

Pseudogenes (loss of functionality) are important calibration tools.

Rates of nucleotide substitution per site per year ($\times 10^{-9}$) for mammalian globin pseudogenes and their functional homologues. Each codon position is given separately.

            Pseudogene    Functional genes, codon position
                          1.       2.       3.
Mouse       5.0           0.75     0.68     2.65
Human       5.1           0.75     0.68     2.65
Rabbit      4.1           0.94     0.71     2.02
Goat        4.4           0.94     0.71     2.02

Thus, if the rate at pseudogenes reflects evolution through the mutation process alone, then the third-position sites (degenerate sites), which evolve more slowly than the pseudogenes, must also be under functional constraint, i.e. selection is slowing down their evolution, too.
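The comparison made in the table can be put into numbers. The sketch below is illustrative rather than part of the slides: the names `rate_per_site_per_year` and `third_vs_pseudogene` are made up for this example, the values K = 0.1 and T = 10 million years are hypothetical, and K is assumed to mean the number of substitutions per site separating the two sequences. The globin rates are copied from the table above.

```python
def rate_per_site_per_year(K, T):
    """r = K / (2T): K substitutions per site between two sequences,
    which diverged from a common ancestor T years ago."""
    return K / (2 * T)

# Hypothetical illustration: K = 0.1 substitutions per site, T = 10 million years
print("r =", rate_per_site_per_year(0.1, 1e7), "substitutions per site per year")

# Rates in units of 10^-9 substitutions per site per year, copied from the table:
# species -> (pseudogene, codon position 1, codon position 2, codon position 3)
globin_rates = {
    "Mouse":  (5.0, 0.75, 0.68, 2.65),
    "Human":  (5.1, 0.75, 0.68, 2.65),
    "Rabbit": (4.1, 0.94, 0.71, 2.02),
    "Goat":   (4.4, 0.94, 0.71, 2.02),
}

def third_vs_pseudogene(species):
    """Third-codon-position rate as a fraction of the pseudogene rate."""
    pseudogene, _first, _second, third = globin_rates[species]
    return third / pseudogene

for species in globin_rates:
    print(species, "third position / pseudogene =", round(third_vs_pseudogene(species), 2))
```

In every row the ratio is below 1, which is the numerical form of the statement that even third-position sites evolve more slowly than pseudogenes.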