Chapter 5 Adaptive evolution at the molecular level

Objective

One of the most exciting aspects of evolutionary biology is to consider how populations respond to changes in the environment and adapt to new conditions. Sometimes the changes are precisely targeted to one or a few particular molecules, such as the evolution of pesticide resistance in response to application of pesticides, or of warfarin resistance in rats. Other examples are observed in geographic patterns of variation, such as Adh in Drosophila and beta-globin and G6PD deficiency in humans. In this chapter, we will explore how simple genetic systems can respond to natural selection, and how to look for the "signature" of natural selection at the molecular level.

Definition of Adaptation

The word adaptation appears frequently in the literature on evolution, and it is worthwhile to define it carefully. An adaptation is a heritable change in an organism that improves reproductive fitness. As you might imagine, it is pretty easy to make up stories about how a mutation might improve an organism's survival or reproductive capacity. These are called "Just So" stories, after Rudyard Kipling's famous work. An adaptive story is useful as a hypothesis, but one has to recognize that it is only a hypothesis and must be tested somehow. It is not always easy to demonstrate with scientific rigor that a change is adaptive, but there are a few good examples of molecular changes whose adaptive consequences are clear. We will focus on those in this chapter.

Insecticide resistance

One of the clearest examples of adaptive evolution at the molecular level is the response of pest populations to the application of pesticide. Time and time again farmers have noticed that a spray may work initially, but after a few seasons the pests come back, even with repeated spraying. The reason for their return is that the population of pests has become resistant.
This happens by a true process of evolution: there is variation in the population for resistance, and when the pesticide is applied, even if only a tiny minority is resistant, that minority will survive and expand in numbers until it appears that the whole population has become resistant.

Shortly after the discovery of DDT it became wildly popular because it dramatically reduced pest damage to crops, and it did a beautiful job of controlling populations of disease-spreading mosquitoes. But in just a few years, resistance to DDT started appearing in a variety of pest species. Soon the spread of DDT resistance became truly frightening. The figure shows the LD50, the dose required to kill 50% of the flies, and how the LD50 increased dramatically as the fly population in Illinois became resistant to the pesticide. Shortly after this, Rachel Carson wrote the very influential book Silent Spring, which described the role that DDT plays in threatening many bird species by upsetting their metabolism and weakening their egg shells. DDT was banned for agricultural use in the United States in 1972, and many other countries imposed similar bans around that time.

Insecticide resistance in Culex pipiens (mosquito)

When insecticide resistance spreads in a population, it may do so because a single highly favorable mutation rapidly spreads through it, or because many different mutations can each confer resistance. A recent study showed that in at least one case, the former had occurred. Resistance to insecticide can occur by a number of different pathways, including avoidance of the dose, sequestering of the toxin, rapid excretion of the toxin, or metabolism of the toxin. Some forms of the enzyme esterase B are able to metabolize some organophosphate insecticides. In the case of the mosquito Culex pipiens, it seemed pretty clear that one allele of esterase B was able to confer resistance.
This was confirmed by doing simple genetic crosses and noting the expected Mendelian ratios of susceptible and resistant offspring, which correlated perfectly with the esterase B alleles. To address whether the same allele had spread all over the world, Michel Raymond and colleagues collected mosquitoes from several sites in Africa, Pakistan, and the US, and they made a restriction map of the region of the DNA that included the esterase B gene. Restriction maps are made by doing PCR of the gene region, then cutting the DNA with several different enzymes and piecing together where the cut sites must be from the sizes of the resulting fragments. The amazing thing that Raymond and colleagues saw was that the same allele had spread globally. This phenomenon, where selection is strong enough that a particular allele replaces all others in the population, is referred to as a "selective sweep". The evidence that a selective sweep had occurred in the mosquito esterase B gene is made stronger by the observation of plentiful variation in other genes.

Geographic patterns of variation may have an adaptive cause

In addition to the observation of selective sweeps, evidence for natural selection acting on a particular molecule may come from the geographic distribution of the variation in the respective gene. One kind of geographic pattern that can be very conspicuous is a cline, defined as a trend in allele frequency along a geographic line. The figure shows a famous cline in the frequencies of the Fast and Slow alleles of alcohol dehydrogenase (Adh) in the fruit fly Drosophila melanogaster. "Fast" and "Slow" refer to the migration rate of the protein forms of the alleles in an electrophoretic gel. Just as DNA can be separated by gel electrophoresis on the basis of size, proteins can also be separated by gel electrophoresis, except that proteins vary widely in charge, and it is charge as much as size that makes them move differently.
Knowing that there is a cline in Adh allele frequency is not completely satisfying, because we want to know why the cline is there. Could it be simply that the northern population was founded with a high frequency of the Fast allele, and that migration between the northern and southern populations (the latter having a higher Slow allele frequency) resulted in a smooth transition between the two? Or is there a real selective advantage to having Fast in the north or Slow in the south? One bit of evidence comes from Australia. The east coast of Australia also has an Adh cline, and there the north has a higher frequency of the Slow allele. This is consistent with selection based on some factor correlated with climate: on both continents, the more equatorial populations have a higher frequency of the Slow allele.

One way to determine whether the Adh cline is due to chance or to natural selection is to find a physiological mechanism explaining why the Fast allele would do better in the north (in North America). In general the Slow allele does appear to be more tolerant of elevated temperatures, but this was not a very convincing story because it was not known whether the flies actually experience greater heat stress in the south. Andrew Berry and Marty Kreitman sought to answer this question by examining the molecular variation around the Adh gene in a very large sample of flies from all over the eastern US. They found 83 polymorphic nucleotide sites in the Drosophila Adh region, and one of these sites is the one that actually causes the Fast vs. Slow difference in protein mobility. They reasoned that if the populations in the north and south had sufficient migration, it would eliminate the cline at all nucleotide positions other than the one that affects the Fast vs. Slow difference in the ADH protein. They did a regression analysis, plotting the frequency of each of the 83 varying sites on the Y axis, and the latitude on the X axis.
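To make the regression step concrete, here is a minimal sketch of an ordinary least-squares fit of allele frequency on latitude, reporting the slope and the r^2 value. The latitudes and frequencies are invented purely for illustration (they are not Berry and Kreitman's data), and the function name `regress` is hypothetical.

```python
def regress(x, y):
    """Ordinary least squares of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r^2 is the fraction of the variance in y accounted for by the line
    r_squared = 0.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical sampling latitudes (degrees north) and allele frequencies,
# made up for this example only.
latitude = [26.0, 30.0, 34.0, 38.0, 42.0, 46.0]
clinal_site = [0.15, 0.22, 0.30, 0.41, 0.48, 0.57]  # frequency rises with latitude
flat_site = [0.31, 0.28, 0.33, 0.29, 0.32, 0.30]    # no trend with latitude

for name, freqs in [("clinal site", clinal_site), ("flat site", flat_site)]:
    slope, intercept, r2 = regress(latitude, freqs)
    print(f"{name}: slope = {slope:.4f} per degree, r^2 = {r2:.3f}")
```

Running one such fit per polymorphic site and comparing the r^2 values is the kind of comparison described here: a site whose frequency tracks latitude gives an r^2 near 1, while a site that is well mixed by migration gives an r^2 near 0.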
For most of the sites, the slope of these best-fitting lines was nearly 0, but for the Fast vs. Slow site, there was a significant cline. The figure shows the $r^2$ value, which roughly speaking indicates how much of the variation in frequency is explained by latitude. As you can see, the Fast-Slow site is the only one with a strong geographic cline. This means that the other sites are sufficiently mixed by migration, which in turn means that the Fast-Slow cline must be maintained by some kind of selection.

The Algebra of Natural Selection

This is a good time to stop and think about how natural selection is expected to change allele frequencies. We need to make a mathematical model in order to be able to say whether the observed data fit one idea or another about the way selection works. We'll start with a model very similar to the Hardy-Weinberg principle; that is, we will assume that there is one locus with two alleles, A and a, and that the starting frequency of the A allele is freq(A) = $p$, and freq(a) = $1 - p = q$.

Let's first consider the case of a recessive lethal allele. We will let the aa genotype fail to grow to maturity. In terms of fitness, this means we will assume that the fitnesses of genotypes AA, Aa, and aa are 1, 1, and 0 respectively. A pool of zygotes drawn from this population will be in Hardy-Weinberg proportions of $p^2$, $2pq$, and $q^2$ for genotypes AA, Aa, and aa respectively. After multiplying zygote frequencies by their appropriate fitnesses, the relative allele frequencies after selection are:

$$p^* = p^2 + pq, \qquad q^* = pq.$$

To get the allele frequency in the next generation (written $q'$), divide $q^*$ by the sum of the two relative allele frequencies:

$$q' = \frac{q^*}{p^* + q^*} = \frac{pq}{p^2 + 2pq} = \frac{q}{p + 2q} = \frac{q}{1 + q}.$$

The last step uses $p + q = 1$. This equation, $q' = q/(1+q)$, is a recursion that lets us predict what the frequency will be in the next generation if we know what it is today. It helps to try these recursions with a calculator (or a spreadsheet or simple computer program) to see what they say.
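As one way of trying the recursion in a simple computer program, the sketch below iterates q' = q/(1+q) using exact fractions so that no rounding obscures the pattern. The function names `next_q` and `trajectory` are hypothetical, and the starting frequency of 1/2 is just a convenient choice.

```python
from fractions import Fraction


def next_q(q):
    """One generation of selection against a recessive lethal: q' = q / (1 + q)."""
    return q / (1 + q)


def trajectory(q0, generations):
    """Return the list [q_0, q_1, ..., q_generations]."""
    qs = [q0]
    for _ in range(generations):
        qs.append(next_q(qs[-1]))
    return qs


qs = trajectory(Fraction(1, 2), 98)
for t in (0, 1, 2, 3, 8, 98):
    print(f"generation {t}: q = {qs[t]}")
```

Starting from q = 1/2, the frequency after t generations works out to 1/(t + 2). Dropping from 1/2 to 1/10 therefore takes 8 generations, but dropping from 1/10 to 1/100 takes another 90, which illustrates how slowly selection removes a recessive lethal once it is rare.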
Plug in some numbers to try it. If $q = 1/2$, then $q' = 1/3$. To get the next generation let $q = 1/3$ and find that $q' = 1/4$. From $q = 1/4$, we get $q' = 1/5$, etc. As you can see, the allele frequency ($q$) of a recessive lethal always decreases, and will continue doing so until the allele is lost from the population. What might be surprising is that it can take a very long time to eliminate recessive lethal alleles: when the allele is rare, almost all copies are carried by heterozygotes, which selection does not see, so the allele is removed very slowly. Just to show that this is not only theory, here is a plot of the frequency of an allele that is recessive lethal in Drosophila. The curve is not a perfect fit to the model, and the reasons for the departure are often worth further study.

Directional selection causes fixation

If instead of being lethal, the aa genotype had fitness $1-s$, then the recursion is a bit more complicated. But even in this case, regardless of where the population starts, each generation there will be an inexorable decline in the frequency of the a allele, until finally the A allele goes to fixation. In general, if fitnesses are AA > Aa > aa, then A fixes, and if fitnesses are AA < Aa < aa, then a fixes. This pattern of selection is known as directional selection.

Heterozygote advantage leads to stable polymorphism

Getting back to the example of the Adh polymorphism, here we appear to have a situation where the polymorphism is maintained by selection. Such a polymorphism is called a balanced polymorphism. The easiest way for a population to maintain a balanced polymorphism is through heterozygote advantage. If the heterozygote has a fitness higher than either homozygote, then with a little algebra it is not too hard to show that the population is expected to go to a stable allele frequency. In fact, if the fitnesses are $1-t$ for the AA genotype, 1 for the heterozygote Aa, and $1-s$ for the aa genotype, where $s$ and $t$ are both greater than zero, then the equilibrium frequency of the A allele is $s/(s+t)$.
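The equilibrium result can be checked numerically. The sketch below generalizes the recessive-lethal recursion to arbitrary genotype fitnesses, using the same steps as before (Hardy-Weinberg zygotes, multiply by fitness, renormalize), and iterates it from several starting frequencies. The values s = 0.3 and t = 0.1 are arbitrary illustrative choices, and the function names are hypothetical.

```python
def next_p(p, w_AA, w_Aa, w_aa):
    """One generation of selection on the frequency p of allele A."""
    q = 1.0 - p
    # Mean fitness: Hardy-Weinberg zygote frequencies weighted by fitness
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # A alleles surviving in AA individuals plus half of those in Aa individuals
    return p * (p * w_AA + q * w_Aa) / mean_fitness


def iterate(p0, w_AA, w_Aa, w_aa, generations):
    """Apply next_p repeatedly and return the final frequency of A."""
    p = p0
    for _ in range(generations):
        p = next_p(p, w_AA, w_Aa, w_aa)
    return p


# Illustrative selection coefficients (assumed, not taken from any data set)
s, t = 0.3, 0.1
predicted = s / (s + t)  # equilibrium frequency of A from the formula

for p0 in (0.01, 0.5, 0.99):
    p_end = iterate(p0, 1 - t, 1.0, 1 - s, 500)
    print(f"start p = {p0:.2f} -> after 500 generations p = {p_end:.4f} "
          f"(formula: {predicted:.4f})")

# Directional selection (AA > Aa > aa) for comparison: A heads toward fixation
print(f"directional: p = {iterate(0.01, 1.0, 0.95, 0.90, 2000):.4f}")
```

With fitnesses 1, 1, 0 the same function reduces to the recessive-lethal recursion q' = q/(1+q), so the earlier model is a special case of this one.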
The figure below shows the allele frequency over time for a case of heterozygote advantage. It shows that regardless of the starting frequency, the theory says that the population will always go toward the polymorphic equilibrium. Such an equilibrium is said to be a stable equilibrium.

Malaria and human genetic disorders

Some of the best examples of adaptation at the molecular level are found in genes that affect resistance to pathogen infections. Malaria in particular is an important human pathogen, and an estimated 500 million people worldwide have this disease. It is fatal mainly in children, the elderly, and otherwise infirm individuals, and it is the cause of considerable human suffering. It turns out that many genetic disorders have an unusually high incidence in regions of the world where malaria is present. In fact, the maps of cases of thalassemia, G6PD deficiency, and sickle cell disease are surprisingly coincident with that of malaria. It is as though malaria provides a selection pressure, and in different populations, one or another of these genetic disorders increases in frequency in response.

Thalassemia is caused by a deficit of structural gene copies of either alpha globin (in the case of alpha thalassemia) or beta globin (in the case of beta thalassemia). Recall that hemoglobin normally has two alpha chains and two beta chains, each carrying a heme group that actually binds the oxygen. Thalassemias result in red blood cells that carry less oxygen and that, because of the resulting stress, have a shorter lifespan in the bloodstream.

G6PD deficiency is a low-activity variant of the enzyme glucose-6-phosphate dehydrogenase. This is an enzyme in the pentose phosphate shunt, and it serves to generate NADPH, which is used in lipid synthesis. It is also important in establishing the NADP:NADPH ratio, which is reflected in the ratio of oxidized vs. reduced glutathione.
While the details are not clear, this too affects the longevity of red blood cells, and their susceptibility to invasion by the Plasmodium pathogen.

Sickle cell disease is caused by a mutation in codon 6 of beta globin, changing GAG (glutamic acid) to GTG (valine). This results in a much more hydrophobic protein which tends to clump with other globin molecules. Under conditions of low oxygen, such as in the joints or during exercise, the clumping can cause the red blood cells to collapse into a sickle shape, causing bouts of hemolytic anemia. Heterozygotes have milder symptoms. The sickling tends to damage the red blood cells, reducing their longevity from the normal 120 days to about half that time.

The malarial parasite, Plasmodium falciparum, spends one of its life stages inside the red blood cell, so these genetic diseases, which alter either the longevity of the red blood cell or its membrane, seem to make individuals less susceptible to malaria. Certainly the elevated incidence of the genetic disorders in malarial areas is consistent with this.

So if thalassemia, G6PD deficiency, and sickle cell disease all confer resistance to malaria, why haven't these diseases become even more prevalent? It appears that all three diseases exhibit a kind of heterozygote advantage. In the case of sickle cell disease, in regions of the world where malaria is present, the relative reproductive fitnesses of AA, AS, and SS were estimated in one study to be 0.89 : 1 : 0.20. That is to say, at birth, on average, a heterozygote is five times as likely to survive and reproduce as an SS homozygote. This heterozygote advantage would appear to result in a stable polymorphism. Of course the real world is more complicated than this. There is a third globin allele, the C allele, which appears to confer malarial resistance but does not cause sickle cell disease. There is clear evidence that the C allele is spreading in malarial areas in Africa.
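As a rough worked example, the fitness estimates quoted above for AA, AS, and SS (0.89 : 1 : 0.20) can be plugged into the heterozygote-advantage result from earlier in the chapter, where fitnesses of 1-t, 1, and 1-s give an equilibrium frequency of s/(s+t) for the first allele. The sketch assumes a simple two-allele model with random mating and ignores the C allele, so the numbers are only indicative; the variable names are hypothetical.

```python
# Relative fitnesses from the one study cited in the text
w_AA, w_AS, w_SS = 0.89, 1.0, 0.20

t = 1 - w_AA  # selection against AA homozygotes (malaria)
s = 1 - w_SS  # selection against SS homozygotes (sickle cell disease)

# Equilibrium allele frequencies under the two-allele model
freq_A = s / (s + t)
freq_S = t / (s + t)

# Expected genotype frequencies at birth, assuming Hardy-Weinberg proportions
births_AS = 2 * freq_A * freq_S
births_SS = freq_S ** 2

print(f"equilibrium frequency of A: {freq_A:.3f}")
print(f"equilibrium frequency of S: {freq_S:.3f}")
print(f"AS carriers at birth:       {births_AS:.3f}")
print(f"SS homozygotes at birth:    {births_SS:.4f}")
```

Under these assumptions the S allele settles at a frequency of about 0.12, which is one way to see why the disease allele neither disappears nor takes over in malarial regions.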
The Major Histocompatibility Complex

It may seem odd that the cases discussed so far, in which a genetic variant is associated with a pathogen, have not involved the immune system at all. In fact, the immune system faces unusually strong natural selection, and some components of the immune system carry a very clear signature of past natural selection. You may have heard about how antibody diversity is important in fighting bacterial infections. Another class of molecules is important in fighting viral infections, and these are encoded in what is called the major histocompatibility complex (MHC). It turns out these same molecules are important in determining whether an organ or tissue transplant is accepted or rejected.

When a virus invades the cell, the cell has a means of picking up bits of the viral protein and transporting them to the cell surface. Because the viral protein bits are to be attacked by the immune system, they are called antigens. At the surface, the MHC molecules serve to present the viral protein bits. They do this by binding the protein in a pocket called the Antigen Recognition Site (ARS), and poking the ARS outside the cell's surface. T-cells recognize the infected cell from the MHC molecule and its bound antigen sticking out, and the T-cells then eliminate the infected cell. Greater amounts of variation in the ARS mean that a wider variety of antigens can be bound, presented, and eliminated.

The structure of the class I MHC molecule suggests that the ARS is the region that most requires a great diversity of sequences. If selection serves to increase variation in the ARS, we ought to see it. Within the ARS there were an average of 4.7 silent substitutions per silent site and 14.1 amino acid changing substitutions per replacement site. In the rest of the MHC molecule, there were 5.8 silent substitutions per silent site and 1.1 amino acid replacements per replacement site.
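One simple way to compare the two regions is to divide the amino acid changing rate by the silent rate in each. The sketch below does this arithmetic with the four numbers just quoted. The interpretation in the comments (a ratio above 1 suggesting selection for change, a ratio below 1 suggesting selection against change) is the usual rule of thumb rather than a formal test, and the function name is hypothetical.

```python
def replacement_to_silent_ratio(replacement_per_site, silent_per_site):
    """Ratio of amino acid changing to silent substitutions (each per site)."""
    return replacement_per_site / silent_per_site


# (replacement per site, silent per site), as quoted in the text
regions = {
    "ARS": (14.1, 4.7),
    "rest of MHC molecule": (1.1, 5.8),
}

for name, (replacement, silent) in regions.items():
    ratio = replacement_to_silent_ratio(replacement, silent)
    # Rule of thumb: > 1 hints at diversity-enhancing selection,
    # < 1 hints at selection removing amino acid changes.
    tendency = "excess of replacements" if ratio > 1 else "deficit of replacements"
    print(f"{name}: ratio = {ratio:.2f} ({tendency})")
```

The ARS ratio comes out at about 3, while the ratio for the rest of the molecule is about 0.19, a roughly sixteen-fold difference between two parts of the same protein.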
You notice that the ARS has proportionately far more changes that alter the amino acid sequence. In fact, this part of the molecule has far more amino acid changing substitutions than you would expect by chance, indicating that natural selection "sees" these differences in the protein product: they increase survival by increasing the diversity of pathogens that are recognized, and this makes them increase in frequency. Excess replacement polymorphism in the ARS is consistent with selection for an increase in the diversity of MHC alleles. On the other hand, the non-ARS portion of the molecule seems to be conservative in the sense that the number of amino acid changes is much lower, consistent with a form of selection that removes deleterious mutations that result in decreased function. Amazingly, the MHC shows both diversity-enhancing and diversity-suppressing selection acting in different parts of the same protein.

MHC shows polymorphism shared among humans, chimps and gorillas

A shared polymorphism exists when two different species share two or more alleles at a genetic locus. Such a situation can arise by chance, or the polymorphism may have been maintained in both species all the way back to and including the common ancestral species. In other words, a shared polymorphism may occur if the polymorphism is exceptionally stable and old.

Shared polymorphism between species is not expected to be common, because random drift should remove neutral shared polymorphisms fairly quickly. In fact, we do not expect a neutral polymorphism to be retained in a population much more than $2N_e$ generations, where $N_e$ is the effective population size. It turns out the effective size of the human population is only around 10,000. This is because of the rapid and recent expansion of our population since the time that our population was much smaller. Humans and chimps had a common ancestor around 5 million years ago, which is about $500N_e$ years. Even allowing 20 years or more per generation (an assumed round figure), that is on the order of $25N_e$ generations, far longer than the $2N_e$ generations a neutral polymorphism is expected to last.
The fact that humans and other primates share many alleles in the major histocompatibility complex is another line of evidence that strong natural selection has retained these alleles in the populations of all primates.

Summary

Natural selection is a powerful force for causing deviations from Hardy-Weinberg expectations; simple models can predict how quickly and in what direction allele frequencies should change in response to selection. Strongly deleterious alleles are expected to be removed by selection, but new mutations will result in a balance between selection and mutation. This can account for only a very low continuing frequency of strongly deleterious traits in populations. Heterozygote advantage is one mechanism that can maintain a deleterious trait at a higher frequency.

Natural selection may reveal its effects at the molecular level by:
1. "Sweep" of one allele.
2. Clines in allele frequency.
3. Excess number of amino acid changes.
4. Excess divergence among alleles compared to neutrality.
5. Shared polymorphisms of alleles in different species.