PopGen 6: Brief Introduction to Evolution by Natural Selection

Introduction (The theory of Natural Selection)

Natural selection is one mechanism for evolution; i.e., changing allele frequencies in a population. We have seen a number of mechanisms for changing the genotype frequencies (inbreeding and assortative mating), but those do not impact allele frequencies. Natural selection can be summarized as:

GENETIC VARIATION + DIFFERENTIAL SUCCESS → EVOLUTION

GENETIC VARIATION: The "genetic" prefix is important, as the variation must be heritable. Change in this portion of the equation is undirected.

DIFFERENTIAL SUCCESS: This has two components: (i) reproduction and (ii) survival in a particular environmental context. This portion provides a direction to evolutionary change.

EVOLUTION: For this part of the course, we define evolution as change in allele frequencies.

The theory of natural selection was developed independently by Charles Darwin and Alfred Russel Wallace, and published jointly in 1858. At that time the theory of natural selection differed in one very important way from all the alternatives: it specified strict independence between an undirected process of change (mutation) and the direction resulting from differential success (natural selection). The major alternative to Darwinism at that time was Lamarckism, which made no such distinction. Under Lamarckism, evolution was a directed extension of the underlying mutational process.

Natural selection is not the only agent for change in allele frequencies. Two potentially important alternatives are MUTATION PRESSURE and GENETIC DRIFT. Under mutation pressure there is a direction to evolution and, unlike natural selection, it is a direct extension of the process of mutation. Under genetic drift change is driven by the rate of mutation, but the process is undirected. The important distinction between natural selection and both mutation pressure and genetic drift is that only natural selection can explain adaptation.
ADAPTATION refers to the way that the phenotype of an organism is suited to a lifestyle within a particular environment. Before Darwin and Wallace, it was difficult to explain adaptation without invoking either divine intervention or Lamarckism.

The conditions of natural selection

Both Darwin and Wallace made logical arguments for the action of natural selection. Their approach was to make observations about nature. After making many observations and considering a very large amount of data, both men formulated a series of principles. Darwin and Wallace independently argued that the same conclusion must follow from their individual principles: the conditions of natural selection. Before we develop an explicit model for natural selection on an allele within a population, it will be useful to review the logical argument for natural selection.

Natural selection will operate on any system in which (i) there is variation among individuals, (ii) individuals are able to make copies of themselves, and (iii) this variation can be faithfully transmitted to the next generation. Of course, living organisms possess all these attributes. Interestingly, artificial systems can be designed that possess these attributes, and they can be shown to evolve. In this section we will consider the following system: (i) allelic variation among genes carried by individuals, (ii) reproduction of alleles via DNA replication, and (iii) inheritance of alleles via Mendelian transmission genetics.

In addition to these conditions, there must be a relationship between allelic variation in the genes and its impact on an individual's ability to survive and reproduce; this is what we will call FITNESS. Remember that natural selection acts on phenotypes, not genotypes. In living organisms the phenotype reflects many genes and environmental factors.
Despite this complexity, one locus can have a major effect on a phenotype; hence, the main features of natural selection in living organisms can sometimes be understood in terms of a one-locus model (e.g., industrial melanism in the peppered moth Biston betularia). For the purposes of understanding the population genetic consequences of natural selection, we will focus on a one-locus model of fitness.

We can think of fitness as having two components: (i) viability and (ii) fertility. VIABILITY refers to the probability that an individual survives from fertilization to reproductive age. FERTILITY refers to differences in reproductive capabilities of different genotypes. In the simplest situation fitness can be thought of as the probability of survival multiplied by the number of offspring produced. The important point is that evolutionary fitness is not the same as physical fitness; an individual can have the highest possible physical fitness and yet be sterile and thus have no evolutionary fitness.

Malthus' principle (populations increase geometrically, and food supplies increase arithmetically) puts evolutionary fitness into an ecological context. Ecological resources such as food, water, and habitat are finite, whereas reproductive capacity is potentially limitless. From this observation comes the premise "life is a struggle for existence". This premise led both Darwin and Wallace to the same conclusion: that natural selection acts on living organisms.

Let's consider a specific example: tropical corals. One crude estimate suggests that a single coral polyp can produce about 1000 eggs. If we multiply that number by the number of polyps in a single coral reef we obtain a number in the millions for a single spawning season. If we consider all the reefs on the planet we obtain billions of eggs. If all such eggs were fertilized and grew to sexual maturity, it would not take long to cover the surface of the earth with corals.
Clearly, only a very small fraction of the offspring of corals survives to reproductive maturity. The ecological competition for finite resources ensures that only the successful competitors survive.

Natural selection in action

There are a few cases where biologists have been able to observe natural selection in action. One of the most well studied examples is the evolution of drug resistance in HIV. HIV uses RNA as its genetic material. Because HIV is a cellular parasite it must use the host's enzymatic machinery to replicate its genome. To do this, HIV must first convert its RNA genome into DNA. The conversion of RNA to DNA is accomplished by an enzyme called reverse transcriptase (RT), which is encoded in the HIV genome. The host's own RNA synthesis machinery is used to produce many copies of the genome from the DNA template. Because the host does not use the RT enzyme, therapeutic agents can be developed that interfere with its normal function, as such agents should not have deleterious effects on the host. Consequently a popular class of anti-HIV drugs is the RT inhibitors (e.g., AZT was the first of this type for HIV), which were designed to block the process of reverse transcription.

[Figure: Life cycle of HIV]

Nucleoside RT inhibitors are molecules that mimic one of the standard nucleotide monomers A, C, G, or T. The inhibitor is similar enough to be added to the growing chain by RT. However, the inhibitor cannot be extended, and polymerization is terminated.

Administering RT inhibitors typically results in a dramatic decline in the patient's population of HIV. Within as little as a few days drug resistant versions of HIV will have grown frequent enough to be detected within the patient. Over the course of just one or two months the population of HIV within the patient will be 100% resistant to the RT inhibitor. HIV evolves so quickly because it experiences a high mutation rate and it completes the process of replication in just about two days.
The variation existing within a pool of HIV, and that accumulating by mutation, is undirected with respect to drug resistance. Upon treatment with an RT inhibitor, most variants will be susceptible to the drug. Purely by chance, one or more variants will be resistant to the drug. These variants are resistant because they either (i) avoid uptake of the RT inhibitor, or (ii) can excise the inhibitor following incorporation. Because the resistant variants survive and reproduce at a higher rate under drug treatment, their frequency grows over time until they become fixed. This is natural selection in action.

[Figure: Concept map of evolution of resistance to RT inhibitors in HIV over generations 1 to 7. No drugs: high polymorphism, low resistance. Single drug therapy: low polymorphism, high resistance.]

Note that the resistant variants have a lower fitness in an untreated individual, as their RT enzyme is not as efficient. Interestingly, once treatment begins there is selection pressure for compensatory mutations that improve the RT enzyme, and evolution of such mutations has been observed.

Fitness

Evolutionary fitness is symbolized with $W$. Fitness values are specified for genotypes (e.g., $W_{AA}$, $W_{Aa}$ and $W_{aa}$) and are usually relative measures of fitness rather than absolute measures. One genotype (usually the one with the highest absolute fitness) is regarded as the standard and its fitness is set to 1. All other fitness values are measured relative to the standard (e.g., 1.0, 1.0, and 0.76).

Types of selection

1. Directional. DIRECTIONAL SELECTION occurs when selection favours the phenotype at an extreme of the range of phenotypes. This form of selection exerts pressure for the FIXATION (frequency goes to 1.0) of a specific allele in the population. Hence it imposes a direction on the course of evolution.

[Figure: fitness (y-axis) of genotypes AA, Aa, aa (x-axis), with $W_{AA} > W_{Aa} > W_{aa}$]

2. Overdominance.
OVERDOMINANT SELECTION occurs when the heterozygote has a greater fitness than either homozygote. This form of selection is also called BALANCING SELECTION or HETEROZYGOTE ADVANTAGE. Rather than a force for directional evolution, it is an evolutionary force for maintaining a stable polymorphism within the population.

[Figure: fitness (y-axis) of genotypes AA, Aa, aa (x-axis), with $W_{AA} < W_{Aa} > W_{aa}$]

3. Underdominance. UNDERDOMINANT SELECTION occurs when the heterozygote has lower fitness than either homozygote. This type of selection will result in an unstable equilibrium. The global equilibrium values of p and q represent a population polymorphism; however, even a slight perturbation of these frequencies will result in change towards a local equilibrium where one of the two alleles is fixed. Hence, the global equilibrium is said to be unstable. This form of selection is sometimes called APOSTATIC or DISRUPTIVE SELECTION.

[Figure: fitness (y-axis) of genotypes AA, Aa, aa (x-axis), with $W_{AA} > W_{Aa} < W_{aa}$]

Selection in diploids

We have used Hardy-Weinberg equilibrium to model allele frequencies in diploids when there is no selection. We can extend this model to incorporate selection by specifying the fitness values of the different genotypes. The model assumes that selection acts on the diploid genotypes, and not at the level of haploid gametes. We will use the symbols $W_{AA}$, $W_{Aa}$, and $W_{aa}$ to specify the relative fitness values of the AA, Aa, and aa genotypes. As mentioned earlier, fitness is a function of both viability and fertility. We will consider fitness a phenotype because it is an interaction between the genotype and the environment.
Symbolism for generation 0

Genotype     AA          Aa            aa
Frequency    $p_0^2$     $2p_0q_0$     $q_0^2$
Phenotype    $W_{AA}$    $W_{Aa}$      $W_{aa}$

By our definition the survival of the newly fertilized eggs will follow the ratio:

$W_{AA} : W_{Aa} : W_{aa}$

and the ratio of AA, Aa, and aa genotypes among the adults will be:

$p^2 W_{AA} : 2pq W_{Aa} : q^2 W_{aa}$

The above ratios are correct, but they are NOT frequencies because they no longer add up to 1! So, if we want to build a model with selection, we must divide each term by the grand total after selection. This will normalize the terms so that the frequencies sum to 1. The grand total is a measure of the AVERAGE FITNESS ($\bar{W}$) of individuals in the population.

$\bar{W} = p^2 W_{AA} + 2pq W_{Aa} + q^2 W_{aa}$   [eq. 1]

So, we can think in terms of a normalized fitness of a genotype; i.e., fitness divided by the average fitness: $W_{AA}/\bar{W}$, $W_{Aa}/\bar{W}$, and $W_{aa}/\bar{W}$.

The frequency of the A allele, p, in the next generation is:

Under HW: $p_1 = p^2 + \tfrac{1}{2}(2pq)$   [eq. 2]
With selection: $p_1 = p^2 \left(\frac{W_{AA}}{\bar{W}}\right) + \tfrac{1}{2}(2pq)\left(\frac{W_{Aa}}{\bar{W}}\right)$   [eq. 3]

Likewise the frequency of the a allele, q, is:

Under HW: $q_1 = \tfrac{1}{2}(2pq) + q^2$   [eq. 4]
With selection: $q_1 = \tfrac{1}{2}(2pq)\left(\frac{W_{Aa}}{\bar{W}}\right) + q^2 \left(\frac{W_{aa}}{\bar{W}}\right)$   [eq. 5]

We can simplify the selection equations a little:

$p_1 = p\,(p W_{AA} + q W_{Aa}) / \bar{W}$   [eq. 6]
$q_1 = q\,(p W_{Aa} + q W_{aa}) / \bar{W}$   [eq. 7]

Now we have the tools to consider some specific cases.

Deleterious recessive

Let's consider a case of complete dominance of A. The phenotype of the recessive allele, a, will only be subject to natural selection when it appears in the form of a homozygous recessive genotype, aa. Let's assume that the recessive allele is deleterious but not lethal. We have just specified a conceptual model. Now we want to convert this conceptual model into an explicit population genetics model. To do this we need a way of quantifying how much the likelihood of survival is reduced in individuals with an aa genotype. It should now be clear that this is most easily done in relation to the best genotype(s) in the population. So, we set the fitness of the best genotypes to 1 (hence, $W_{AA} = 1$ and $W_{Aa} = 1$).
Now we specify the fraction by which the chance of survival is reduced by a parameter called s, which is called the SELECTION COEFFICIENT (hence, $W_{aa} = 1 - s$).

Our model

Genotype     AA          Aa            aa
Frequency    $p_0^2$     $2p_0q_0$     $q_0^2$
W            1           1             1 - s

Using equation 1 from above we can determine the average fitness of an individual:

$\bar{W} = p^2(1) + 2pq(1) + q^2(1 - s)$   [eq. 8]

Equation 8 can be simplified considerably.

$\bar{W} = p^2 + 2pq + q^2 - sq^2$
$\bar{W} = 1 - sq^2$   [eq. 9]

By substitution in equation 7 above we can determine the change in allele frequency q in one generation.

$q_1 = q\,(p(1) + q(1 - s)) / (1 - sq^2)$
$q_1 = q\,(p + q - sq) / (1 - sq^2)$
$q_1 = q\,(1 - sq) / (1 - sq^2)$
$q_{t+1} = (q_t - s q_t^2) / (1 - s q_t^2)$   [eq. 10]

Let's take a closer look at the case of industrial melanism in the peppered moth Biston betularia. The peppered moth has a rare dark coloured variant that turns out to provide very good camouflage in polluted areas. As mentioned earlier the genetic control in these moths is more complex than a two-allele one-locus model. However, the main features of the system can be reasonably approximated by such a model. We will use A to symbolize the dark allele; hence the melanic form is produced by the AA and Aa genotypes, and the light form by the aa genotype. Following the industrial revolution, the habitat began to change dramatically, and the fitness of the dark form eventually exceeded that of the light form in polluted areas. We can construct a simple model for natural selection for the dark forms in polluted areas.

Peppered moths in polluted environment

Form                  Dark      Dark      Light
Genotype              AA        Aa        aa
Frequency at birth    $p^2$     $2pq$     $q^2$
Fitness               1         1         1 - s

Let s = 0.33 and p = 0.06. By using equation 10 above, and the specified values of s and p, we can investigate the change in the frequency of the recessive allele under conditions similar to those that occurred following industrialization. Note that one empirical estimate of s for a natural population of Biston betularia is 0.33.
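The recursion in equation 10 can be iterated numerically. The following is a minimal sketch in Python, assuming the values given above (s = 0.33 and p = 0.06, so q = 0.94). The function names `next_q_general`, `next_q_recessive`, and `trajectory` are illustrative and are not part of the original notes. The sketch also implements equation 7 directly, so that the special case (equation 10) can be checked against the general formula.

```python
def next_q_general(q, w_AA, w_Aa, w_aa):
    """One generation of selection on allele a (equation 7)."""
    p = 1.0 - q
    # Average fitness of the population (equation 1).
    w_bar = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    return q * (p * w_Aa + q * w_aa) / w_bar


def next_q_recessive(q, s):
    """Deleterious recessive special case (equation 10)."""
    return (q - s * q * q) / (1.0 - s * q * q)


def trajectory(q0, s, generations):
    """Return the list [q_0, q_1, ..., q_generations] under equation 10."""
    qs = [q0]
    for _ in range(generations):
        qs.append(next_q_recessive(qs[-1], s))
    return qs


s = 0.33          # selection coefficient quoted in the text
q0 = 1.0 - 0.06   # p = 0.06 for the dark allele, so q = 0.94
qs = trajectory(q0, s, 50)

for t in (0, 10, 20, 30, 40, 50):
    print(f"generation {t:2d}: q = {qs[t]:.3f}")
```

Calling `next_q_general(q, 1.0, 1.0, 1.0 - s)` should give the same value as `next_q_recessive(q, s)`, which is a quick way to confirm that equation 10 is equation 7 with the fitness values of this model substituted in.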
[Figure: Change in recessive allele frequency over time, s = 0.33. x-axis: generations; y-axis: frequency of a allele.]

Based on this result, we might predict a relatively fast decline in the frequency of the recessive allele following the negative effects of industrialization. Now let's examine the change in the frequency of the recessive allele under a variety of selection coefficients (s = 0, 0.01, 0.1, 0.5, and 0.9).

[Figure: Change in recessive allele frequency over time under different intensities of negative selection (s = 0, 0.01, 0.1, 0.5, and 0.9). x-axis: generations; y-axis: frequency of a allele.]

It is clear that there can be a very rapid decrease in the frequency of the recessive allele, and hence a correspondingly rapid increase in the dominant allele, when selection pressure is strong enough. Interestingly, many resistance genes to pesticides in insects are partially or completely dominant. There is considerable data tracking the change in the frequency of pesticide resistant insects following the large scale application of such chemicals during the 1940s. In fact, this represents another classic case study of evolution in action. What is remarkable is that resistance evolved in 5-50 generations in a wide variety of species and environments. We can see from our plot that for the range of s > 0.5, resistance genes will increase to very high frequencies in less than 50 generations.

Deleterious Dominant

By modelling a dominant trait as deleterious we can evaluate directional selection for a recessive allele. This situation represents the second half of the peppered moth story. Following clean air legislation and the resulting decrease in air pollution in the 1960s, the frequency of the dark form declined. At one site in northwest England the frequency declined from 0.94 in 1961 to 0.11 in 1998.
With the change in environment there was directional, positive selection for the homozygous recessive genotypes.

Peppered moths in restored environment

The model

Genotype     AA          Aa            aa
Frequency    $p_0^2$     $2p_0q_0$     $q_0^2$
W            1 - s       1 - s         1

The change in allele frequencies under one generation of selection according to the above model is:

$q_1 = \dfrac{q - sq + sq^2}{1 - s(1 - q^2)}$

The figure below illustrates the change in the frequency of a dominant allele when it is selected against according to s = 0.05, 0.1, 0.2, and 0.3.

[Figure: Change in frequency of dominant allele under different intensities of negative selection. x-axis: generations; y-axis: frequency of A allele; one curve per value of s.]

We can see that directional selection is quite effective when acting against a dominant trait. The dominant allele is eventually lost, with the recessive being "fixed" in the population. Even under very weak selection pressure (s = 0.05) there will eventually be a period of steep decline in frequency.

Overdominance (Balancing selection)

The case of overdominant selection provides an interesting contrast to cases of directional selection. While directional selection pressure results in the fixation of one or another allele, overdominant selection results in a globally stable equilibrium for allele frequency polymorphism. The average fitness of the population, $\bar{W}$, is at its maximum at the equilibrium values of p and q. We can construct such a model as follows:

The model

Genotype     AA            Aa            aa
Frequency    $p_0^2$       $2p_0q_0$     $q_0^2$
W            1 - $s_1$     1             1 - $s_2$

The change in allele frequencies under one generation of selection according to the above model is:

$q_1 = \dfrac{q - s_2 q^2}{1 - s_1 p^2 - s_2 q^2}$

Let $s_1$ = 0.3 and $s_2$ = 0.1. We can evaluate the convergence on the equilibrium value of q, from a variety of different initial values of q.
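This convergence can be sketched numerically. The following is a minimal Python sketch, assuming the fitness scheme implied by the recursion above (1 - s1 for AA, 1 for Aa, 1 - s2 for aa) with s1 = 0.3 and s2 = 0.1. The function names and the particular starting frequencies are illustrative assumptions, not taken from the notes. Setting q1 = q in the recursion and solving for a polymorphic solution gives q = s1 / (s1 + s2), which the code computes alongside the simulated values.

```python
def next_q_overdominant(q, s1, s2):
    """One generation with fitnesses 1 - s1 (AA), 1 (Aa), 1 - s2 (aa)."""
    p = 1.0 - q
    w_bar = 1.0 - s1 * p * p - s2 * q * q  # average fitness
    return (q - s2 * q * q) / w_bar


def simulate(q0, s1, s2, generations):
    """Iterate the recursion and return q after the given number of generations."""
    q = q0
    for _ in range(generations):
        q = next_q_overdominant(q, s1, s2)
    return q


s1, s2 = 0.3, 0.1
q_hat = s1 / (s1 + s2)  # equilibrium obtained by setting q1 = q

start_values = [0.05, 0.25, 0.5, 0.9, 0.99]  # illustrative starting points
final_values = [simulate(q0, s1, s2, 200) for q0 in start_values]

print(f"predicted equilibrium q = {q_hat:.3f}")
for q0, q_end in zip(start_values, final_values):
    print(f"start q = {q0:.2f} -> q after 200 generations = {q_end:.3f}")
```

Starting the simulation exactly at q = 0 or q = 1 would leave the frequency unchanged, since an allele that is absent cannot be reintroduced by selection alone; the starting points are therefore chosen strictly between 0 and 1.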
[Figure: Stable equilibrium resulting from overdominant selection. x-axis: generations; y-axis: frequency of a allele. Stable polymorphism: q = 0.75, p = 0.25.]

It is clear that given the two values of s, the equilibrium value of q will always be 0.75 regardless of the initial frequency of q in the population.

The power of overdominant selection to maintain a balanced polymorphism is illustrated by the phenomenon of TRANS-SPECIES EVOLUTION. In trans-species evolution balancing selection maintains two or more alleles in a population for such extremely long periods of time that the population actually undergoes the process of speciation one or more times, all the while maintaining a stable balance in the allele frequencies. An interesting result of trans-species evolution is a closer phylogenetic relationship between DNA sequences from different species than within the same species (only at the locus evolving under overdominant selection).

A good example of trans-species evolution is provided by the major histocompatibility locus (MHC) of humans and chimpanzees. The MHC locus is a 4 megabase region of the genome that encodes the antigen presenting receptors that function to present foreign peptides to T-cells. Recognition of a peptide by a T-cell stimulates the T-cell to mature into a killer cell that will destroy the infected cell. The figure to the right illustrates this relationship.

[Figure: a virus infected cell displaying a peptide on its antigen presenting receptor, which is recognized by the T-cell receptor (TCR) of a killer T cell]

By using a model called a molecular clock (we will cover the molecular clock in detail later in this course) it is possible to estimate the divergence date of gene sequences. When the first MHC genes were sequenced, it was evident that the divergences of the human and chimp MHC sequences were older (~9 mya and ~14 mya) than the divergence of humans and chimps as species (~6 mya).
Moreover, alleles in different species, i.e., humans and chimps, were more closely related to each other than alleles within the same species! The only possible explanation for such a pattern is that the alleles contained within both humans and chimps diverged long before humans and chimps evolved, and that this variation was preserved over evolutionary time and through speciation events. The figure below illustrates a human/chimp trans-species polymorphism. It turns out that this type of polymorphism is very common in MHC, being found in many sets of vertebrate species.

[Figure: human/chimp trans-species polymorphism. 6 mya: human-chimp speciation. 9 mya: divergence of MHCBh and MHCBch. 14 mya: divergence of MHCAh and MHCAch.]

Why is MHC subject to such strong overdominant selection? The simple answer is that the extra diversity associated with having two alleles is beneficial. The benefit of extra diversity lies in the increased number of antigen sequences that can be presented by the MHC molecule. Heterozygotes would have an advantage in a population exposed to more than one pathogen. A number of studies have shown that selection pressure is strongest at the antigen recognition sites (ARSs) in the antigen receptor, indicating that changes in the ability to bind peptides have fitness consequences for the individuals bearing these MHC genes. In addition there is an advantage to a pathogen if it can evolve a peptide sequence that can't be recognized by the host's receptor. Likewise it is to the advantage of the host to evolve MHC molecules that can recognize the new peptide sequence. This sets up a long term evolutionary "arms-race" that would result in very strong natural selection for change at the ARSs. MHC is an example of microevolution that leads to macroevolution.

Deleterious recessive under partial dominance

We can relax the assumption of complete dominance and allow selection to act on the heterozygote.
The effectiveness of selection on the heterozygote will depend on the DEGREE OF DOMINANCE, which we model with a parameter h. For example if the degree of dominance is zero, i.e., h = 0, then selection cannot act on the heterozygote and fitness values will be 1, 1, and 1 - s. When h = 1/2 we have an additive model for the effect of the alleles, as the fitness of the heterozygote is exactly intermediate between the fitness of each homozygote (1, 1 - (1/2)s, and 1 - s).

The model

Genotype     AA          Aa            aa
Frequency    $p_0^2$     $2p_0q_0$     $q_0^2$
W            1           1 - hs        1 - s

The change in allele frequencies under one generation of selection according to the above model is:

$q_1 = \dfrac{q - hspq - sq^2}{1 - 2hspq - sq^2}$

[Figure: Effect of partial dominance on the change of the recessive allele frequency under negative selection, s = 0.33. Curves: partial dominance (h = 0.5) and full dominance (h = 0). x-axis: generations; y-axis: frequency of a allele.]

In both cases the change in the frequency of the recessive allele is fast when the allele is common, and slowest when the allele is rare. An important difference under partial dominance is that the recessive allele cannot "hide" nearly as well in the heterozygotes, so its frequency approaches zero much more quickly than when there is full dominance. Initially the decrease in frequency is slower under partial dominance, but it eventually overtakes the case of full dominance. Under partial dominance, selection is initially removing many dominant alleles carried in the heterozygotes, and the relative decrease in frequency of the recessive allele is slower. However, the situation changes because the distinction between heterozygote (h = 1/2) and homozygote allows selection to continue to target the alleles hiding in the heterozygotes when the frequency is low.
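The comparison between full and partial dominance can be reproduced with a short simulation. This is a minimal Python sketch of the recursion above, assuming s = 0.33 as in the figure and a starting frequency of q = 0.94 borrowed from the peppered moth example (the starting value used for the original figure is not stated, so this is an assumption). The function names are illustrative.

```python
def next_q_partial(q, s, h):
    """One generation with fitnesses 1 (AA), 1 - h*s (Aa), 1 - s (aa)."""
    p = 1.0 - q
    w_bar = 1.0 - 2.0 * h * s * p * q - s * q * q  # average fitness
    return (q - h * s * p * q - s * q * q) / w_bar


def trajectory(q0, s, h, generations):
    """Return the list [q_0, q_1, ..., q_generations] for a given h."""
    qs = [q0]
    for _ in range(generations):
        qs.append(next_q_partial(qs[-1], s, h))
    return qs


s = 0.33    # selection coefficient used in the figure
q0 = 0.94   # assumed starting frequency (not given for the original figure)

full = trajectory(q0, s, h=0.0, generations=200)      # full dominance of A
additive = trajectory(q0, s, h=0.5, generations=200)  # partial dominance

for t in (1, 5, 10, 20, 50, 100, 200):
    print(f"generation {t:3d}: h=0 q = {full[t]:.4f}   h=0.5 q = {additive[t]:.4f}")
```

With h = 0 the function reduces to the full dominance recursion (equation 10), so the same code covers both curves. Comparing the two lists generation by generation shows where the partial dominance trajectory crosses below the full dominance one.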