PopGen 6: Brief Introduction to Evolution by Natural Selection
Introduction (The theory of Natural Selection)
Natural selection is one mechanism for evolution; i.e., changing allele frequencies in a population. We have
seen a number of mechanisms for changing the genotype frequencies (inbreeding and assortative mating), but
those do not impact allele frequencies. Natural selection can be summarized as:
GENETIC VARIATION + DIFFERENTIAL SUCCESS → EVOLUTION
GENETIC VARIATION: This genetic prefix is important as the variation must be heritable. Change in this portion of the equation is undirected.
DIFFERENTIAL SUCCESS: This has two components: (i) reproduction and (ii) survival in a particular environmental context. This portion provides a direction to evolutionary change.
EVOLUTION: For this part of the course, we define evolution as change in allele frequencies.
The theory of natural selection was developed independently by Charles Darwin and Alfred Russel Wallace, and
published jointly in 1858. At that time the theory of natural selection differed in one very important way from all
the alternatives; it specified strict independence between an undirected process of change (mutation) and the
direction resulting from differential success (natural selection). The major alternative to Darwinism at that time
was Lamarckism, which made no such distinction. Under Lamarckism, evolution was a directed extension of the
underlying mutational process.
Natural selection is not the only agent for change in allele frequencies. Two potentially important alternatives
are MUTATION PRESSURE and GENETIC DRIFT. Under mutation pressure there is a direction to evolution and, unlike
natural selection, it is a direct extension of the process of mutation. Under genetic drift change is driven by the
rate of mutation, but the process is undirected. The important distinction between natural selection and both
mutation pressure and genetic drift is that only natural selection can explain adaptation. ADAPTATION refers to
the way that the phenotype of an organism is suited to a lifestyle within a particular environment. Before Darwin
and Wallace, it was difficult to explain adaptation without invoking either divine intervention, or Lamarckism.
The conditions of natural selection
Both Darwin and Wallace made logical arguments for the action of natural selection. Their approach was to
make observations about nature. After making many observations and considering a very large amount of data,
both men formulated a series of principles. Darwin and Wallace independently argued that the same conclusion
must follow from their individual principles: the conditions of natural selection. Before we develop an explicit model for
natural selection on an allele within a population, it will be useful to review the logical argument for natural
selection.
Natural selection will operate on any system in which (i) there is variation among individuals, (ii) individuals are
able to make copies of themselves, and (iii) this variation can be faithfully transmitted to the next generation. Of
course, living organisms possess all these attributes. Interestingly, artificial systems can be designed that
possess these attributes, and they can be shown to evolve. In this section we will consider the following system:
(i) allelic variation among genes carried by individuals, (ii) reproduction of alleles via DNA replication, and (iii)
inheritance of alleles via Mendelian transmission genetics.
In addition to these conditions, there must be a relationship between allelic variation in the genes and its impact
on an individual’s ability to survive and reproduce; this is what we will call FITNESS. Remember that natural
selection acts on phenotypes, not genotypes. In living organisms the phenotype reflects many genes and
environmental factors. Despite this complexity, one locus can have a major effect on a phenotype; hence, the
main features of natural selection in living organisms can sometimes be understood in terms of a one-locus model
(e.g., industrial melanism in the peppered moth Biston betularia). For the purposes of understanding the
population genetic consequences of natural selection, we will focus on a one-locus model of fitness.
We can think of fitness as having two components: (i) viability and (ii) fertility. VIABILITY refers to the probability that
an individual survives from fertilization to reproductive age. FERTILITY refers to differences in reproductive
capabilities of different genotypes. In the simplest situation fitness can be thought of as the probability of
survival multiplied by the number of offspring produced. The important point is that evolutionary fitness is not the
same as physical fitness; an individual can have the highest possible physical fitness and yet be sterile and thus
have no evolutionary fitness.
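The simplest-case definition above can be sketched in a few lines of Python. The function name and the example numbers are hypothetical, chosen only to illustrate that a physically robust but sterile individual has zero evolutionary fitness.

```python
def absolute_fitness(viability: float, fertility: float) -> float:
    """Simplest-case fitness: probability of survival times number of offspring."""
    if not 0.0 <= viability <= 1.0:
        raise ValueError("viability is a probability and must lie in [0, 1]")
    if fertility < 0.0:
        raise ValueError("fertility cannot be negative")
    return viability * fertility


# Hypothetical individuals (numbers invented for illustration only).
robust_but_sterile = absolute_fitness(viability=0.99, fertility=0.0)
frail_but_fertile = absolute_fitness(viability=0.30, fertility=4.0)

print("robust but sterile:", robust_but_sterile)
print("frail but fertile: ", frail_but_fertile)
```

Under this sketch the sterile individual scores zero no matter how high its probability of survival is.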
Malthus’ principle (populations increase geometrically, and food supplies increase arithmetically) puts
evolutionary fitness into an ecological context. Ecological resources such as food, water, and habitat are finite,
whereas the reproductive capacity is potentially limitless. From this observation comes the premise “life is a
struggle for existence”. This premise led both Darwin and Wallace to the same conclusion: that natural
selection acts on living organisms. Let’s consider a specific example: tropical corals. One crude estimate
suggests that a single coral polyp can produce about 1000 eggs. If we multiply that number by the number of
polyps in a single coral reef we obtain a number in the millions for a single spawning season. If we consider all
the reefs on the planet we obtain billions of eggs. If all such eggs were fertilized and grew to sexual maturity, it
would not take long to cover the surface of the earth with corals. Clearly, only a very small fraction of the
offspring of corals survives to reproductive maturity. The ecological competition for finite resources ensures that
only the successful competitors survive.
Natural selection in action
There are a few cases where biologists have been able to observe natural selection in action. One of the
best-studied examples is the evolution of drug resistance in HIV.
HIV uses RNA as its genetic material. Because HIV is a
cellular parasite it must use the host’s enzymatic machinery
to replicate its genome. To do this, HIV must first convert its
RNA genome into DNA. The conversion of RNA to DNA is
accomplished by an enzyme called reverse transcriptase
(RT), which is encoded in the HIV genome. The host's own
RNA synthesis machinery is used to produce many copies of
the genome from the DNA template. Because the host does
not use the RT enzyme, therapeutic agents can be developed
that interfere with its normal function, as such agents should
not have deleterious effects on the host. Consequently a
popular class of anti-HIV drugs are the RT inhibitors (e.g.,
AZT was the first of this type for HIV), as they were designed
to block the process of reverse transcription.
[Figure: Life cycle of HIV]
Nucleoside RT inhibitors are molecules that mimic one of the
standard nucleotide monomers A, C, G, or T. The inhibitor is
similar enough to be added to the growing chain by RT.
However, the inhibitor cannot be extended, and polymerization is terminated.
Administering RT inhibitors typically results in a dramatic decline in the patient’s population of HIV. Within as
little as a few days drug resistant versions of HIV will have grown frequent enough to be detected within the
patient. Over the course of just one or two months the population of HIV within the patient will be 100% resistant
to the RT inhibitor. HIV evolves so quickly because it experiences a high mutation rate and it completes the
process of replication in just about two days.
The variation existing within a pool of HIV, and that accumulating by mutation, is undirected with respect to drug
resistance. Upon treatment with an RT inhibitor, most variants will be susceptible to the drug. Purely by chance,
one or more variants will be resistant to the drug. These variants are resistant because they either (i) avoid
uptake of the RT inhibitor, or (ii) can excise the inhibitor following incorporation. Because drug treatment
yields a greater frequency of reproduction and survival in the resistant variants, the frequency of the variant
grows over time until it becomes fixed. This is natural selection in action.
[Figure: Concept map of evolution of resistance to RT inhibitors in HIV. Generations 1 to 7, progressing from no drugs (high polymorphism, low resistance) to single drug therapy (low polymorphism, high resistance).]
Note that the resistant variants have a lower fitness in an untreated individual, as their RT enzyme is not as efficient.
Interestingly, once treatment begins there is natural selection pressure for compensatory mutations that improve
the RT enzyme, and evolution of such mutations has been observed.
Fitness
Evolutionary fitness is symbolized with W. Fitness values are specified for genotypes (e.g., $W_{AA}$, $W_{Aa}$, and $W_{aa}$)
and are usually relative measures of fitness rather than absolute measures. One genotype (usually the one with
the highest absolute fitness) is regarded as the standard and its fitness is set to 1. All other fitness values are
measured relative to the standard (e.g., 1.0, 1.0, and 0.76).
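As a minimal sketch of how relative fitness values can be obtained from absolute ones, the Python snippet below divides every genotype's fitness by that of the standard genotype. The function name and the absolute values are invented for illustration; they were picked so that the result matches the example values 1.0, 1.0, and 0.76.

```python
def relative_fitness(absolute: dict) -> dict:
    """Scale fitness values so that the fittest genotype (the standard) equals 1."""
    standard = max(absolute.values())
    if standard <= 0.0:
        raise ValueError("the standard genotype must have positive fitness")
    return {genotype: w / standard for genotype, w in absolute.items()}


# Hypothetical absolute fitness values (e.g., expected surviving offspring).
absolute = {"AA": 2.5, "Aa": 2.5, "aa": 1.9}
relative = relative_fitness(absolute)

for genotype, w in relative.items():
    print(f"W_{genotype} = {w:.2f}")
```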
Types of selection
1. Directional. DIRECTIONAL SELECTION occurs when selection favours the phenotype at an extreme of the range
of phenotypes. This form of selection exerts pressure for the FIXATION (frequency goes to 1.0) of a specific allele
in the population. Hence it imposes a direction on the course of evolution.
[Figure: Fitness (vertical axis, 0 to 1) of the genotypes AA, Aa, and aa under directional selection: $W_{AA} > W_{Aa} > W_{aa}$]
2. Overdominance. OVERDOMINANT SELECTION occurs when the heterozygote has a greater fitness than either
homozygote. This form of selection is also called BALANCING SELECTION or HETEROZYGOTE ADVANTAGE. Rather
than a force for directional evolution, it is an evolutionary force for maintaining a stable polymorphism within the
population.
[Figure: Fitness (vertical axis, 0 to 1) of the genotypes AA, Aa, and aa under overdominant selection: $W_{AA} < W_{Aa} > W_{aa}$]
3. Underdominance. UNDERDOMINANT SELECTION occurs when the heterozygote has lower fitness than either
homozygote. This type of selection will result in an unstable equilibrium. The global equilibrium values of p and
q represent a population polymorphism; however, even a slight perturbation of these frequencies will result in
change towards a local equilibrium where one of the two alleles is fixed; hence, the global equilibrium is said
to be unstable. This form of selection is sometimes called APOSTATIC or DISRUPTIVE SELECTION.
[Figure: Fitness (vertical axis, 0 to 1) of the genotypes AA, Aa, and aa under underdominant selection: $W_{AA} > W_{Aa} < W_{aa}$]
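The three categories above depend only on where the heterozygote's fitness falls relative to the two homozygotes. The helper below is a hypothetical function (not part of any library) that sketches this rule. It assumes that any case in which the heterozygote is not strictly above or strictly below both homozygotes counts as directional, which includes complete dominance.

```python
def classify_selection(w_AA: float, w_Aa: float, w_aa: float) -> str:
    """Classify a one-locus, two-allele fitness scheme by the heterozygote's rank."""
    if w_AA == w_Aa == w_aa:
        return "neutral"          # no fitness differences, so no selection
    if w_Aa > w_AA and w_Aa > w_aa:
        return "overdominance"    # heterozygote fitter than both homozygotes
    if w_Aa < w_AA and w_Aa < w_aa:
        return "underdominance"   # heterozygote less fit than both homozygotes
    return "directional"          # one homozygote is (jointly) the fittest


# Illustrative fitness values (assumed, not taken from data).
print(classify_selection(1.0, 0.9, 0.8))   # heterozygote intermediate
print(classify_selection(0.7, 1.0, 0.9))   # heterozygote best
print(classify_selection(1.0, 0.8, 1.0))   # heterozygote worst
```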
Selection in diploids
We have used Hardy-Weinberg equilibrium to model allele frequencies in diploids when there is no selection.
We can extend this model to incorporate selection by specifying the fitness values of the different genotypes.
The model assumes that selection acts on the diploid genotypes, and not at the level of haploid gametes.
We will use the symbols $W_{AA}$, $W_{Aa}$, and $W_{aa}$ to specify the relative fitness values of the AA, Aa, and aa genotypes.
As mentioned earlier, fitness is a function of both viability and fertility. We will consider fitness a phenotype
because it is an interaction between the genotype and the environment.
Symbolism for generation 0
Genotype     AA           Aa            aa
Frequency    $p_0^2$      $2p_0q_0$     $q_0^2$
Phenotype    $W_{AA}$     $W_{Aa}$      $W_{aa}$
By our definition the survival of the newly fertilized eggs will follow the ratio:
$$W_{AA} : W_{Aa} : W_{aa}$$
and the ratio of AA, Aa, and aa genotypes among the adults will be:
$$p^2 W_{AA} : 2pq\,W_{Aa} : q^2 W_{aa}$$
The above ratios are correct, but they are NOT frequencies because they no longer add up to 1! So, if we want
to build a model with selection, we must divide by the grand total after selection. This will normalize the terms so
that the frequencies sum to 1. The grand total is a measure of the AVERAGE FITNESS ($\bar{W}$) of individuals in the
population.
$$\bar{W} = p^2 W_{AA} + 2pq\,W_{Aa} + q^2 W_{aa}$$ [eq. 1]
So, we can think in terms of a normalized fitness of a genotype; i.e., fitness divided by the average fitness:
$$\frac{W_{AA}}{\bar{W}}, \quad \frac{W_{Aa}}{\bar{W}}, \quad \text{and} \quad \frac{W_{aa}}{\bar{W}}$$
The frequency of the A allele (p) in the next generation is:
Under HW:
$$p_1 = p^2 + \tfrac{1}{2}(2pq)$$ [eq. 2]
With selection:
$$p_1 = p^2\left(\frac{W_{AA}}{\bar{W}}\right) + \tfrac{1}{2}(2pq)\left(\frac{W_{Aa}}{\bar{W}}\right)$$ [eq. 3]
Likewise the frequency of the a allele (q) is:
Under HW:
$$q_1 = \tfrac{1}{2}(2pq) + q^2$$ [eq. 4]
With selection:
$$q_1 = \tfrac{1}{2}(2pq)\left(\frac{W_{Aa}}{\bar{W}}\right) + q^2\left(\frac{W_{aa}}{\bar{W}}\right)$$ [eq. 5]
We can simplify the selection equations a little:
$$p_1 = \frac{p\,(pW_{AA} + qW_{Aa})}{\bar{W}}$$ [eq. 6]
$$q_1 = \frac{q\,(pW_{Aa} + qW_{aa})}{\bar{W}}$$ [eq. 7]
Now we have the tools to consider some specific cases.
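The general model (equations 1, 6, and 7) translates almost directly into code. The following Python functions are a minimal sketch; the function and parameter names are our own, and the starting frequency p = 0.3 is an arbitrary illustrative value.

```python
def mean_fitness(p: float, w_AA: float, w_Aa: float, w_aa: float) -> float:
    """Average fitness of the population (eq. 1)."""
    q = 1.0 - p
    return p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa


def next_generation(p: float, w_AA: float, w_Aa: float, w_aa: float) -> tuple:
    """Allele frequencies (p1, q1) after one generation of selection (eqs. 6 and 7)."""
    q = 1.0 - p
    w_bar = mean_fitness(p, w_AA, w_Aa, w_aa)
    if w_bar <= 0.0:
        raise ValueError("mean fitness must be positive")
    p1 = p * (p * w_AA + q * w_Aa) / w_bar
    q1 = q * (p * w_Aa + q * w_aa) / w_bar
    return p1, q1


# With equal fitness values the model reduces to Hardy-Weinberg (no change in p).
print(next_generation(0.3, 1.0, 1.0, 1.0))

# With the example relative fitness values 1.0, 1.0, and 0.76, allele A increases.
print(next_generation(0.3, 1.0, 1.0, 0.76))
```

Because both new frequencies are divided by the same average fitness, p1 and q1 always sum to 1, which is the normalization described above.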
Deleterious recessive
Let’s consider a case of complete dominance of A. The phenotype of the recessive allele, a, will only be subject
to natural selection when it appears in the form of a homozygous recessive genotype, aa. Let’s assume that the
recessive allele is deleterious but not lethal. We have just specified a model.
Now we want to convert this conceptual model into an explicit population genetics model. To do this we need a
way of quantifying how much the likelihood of survival is reduced in individuals with an aa genotype. It should
now be clear that this is most easily done in relation to the best genotype(s) in the population. So, we set the
fitness of the best genotypes to 1 (hence, $W_{AA} = 1$ and $W_{Aa} = 1$). Now we specify the fraction by which the
chance of survival is reduced by a parameter s, called the SELECTION COEFFICIENT (hence, $W_{aa} = 1 - s$).
Our model
Genotype     AA           Aa            aa
Frequency    $p_0^2$      $2p_0q_0$     $q_0^2$
W            1            1             $1 - s$
Using equation 1 from above we can determine the average fitness of an individual:
$$\bar{W} = p^2(1) + 2pq(1) + q^2(1 - s)$$ [eq. 8]
Equation 8 can be simplified considerably.
$$\bar{W} = p^2 + 2pq + q^2 - sq^2$$
$$\bar{W} = 1 - sq^2$$ [eq. 9]
By substitution in equation 7 above we can determine the change in allele frequency q in one generation.
$$q_1 = \frac{q\,[\,p(1) + q(1 - s)\,]}{1 - sq^2}$$
$$q_1 = \frac{q\,(p + q - sq)}{1 - sq^2}$$
$$q_1 = \frac{q\,(1 - sq)}{1 - sq^2}$$
$$q_{t+1} = \frac{q_t - s q_t^2}{1 - s q_t^2}$$ [eq. 10]
Let’s take a closer look at the case of industrial melanism in the peppered moth Biston betularia. The peppered
moth has a rare dark coloured variant that turns out to provide very good camouflage in polluted areas. As
mentioned earlier, the genetic control in these moths is more complex than a two-allele one-locus model.
However, the main features of the system can be reasonably approximated by such a model.
We will use A to symbolize the dark allele; hence the
melanic form is produced by the AA and Aa genotypes,
and the light form by the aa genotype. Following the
industrial revolution, the habitat began to change
dramatically, and the fitness of the dark form eventually
exceeded that of the light form in polluted areas. We
can construct a simple model for natural selection for the
dark forms in polluted areas.
[Figure: Peppered moths in polluted environment]
Phenotype            Dark form    Dark form    Light form
Genotype             AA           Aa           aa
Frequency at birth   $p^2$        $2pq$        $q^2$
Fitness              1            1            $1 - s$
Let s = 0.33 and p = 0.06. By using equation 10 above, and the specified values of s and p, we can investigate
the change in the frequency of the recessive allele under conditions similar to those that occurred following
industrialization. Note that one empirical estimate of s for a natural population of Biston betularia is 0.33.
[Figure: Change in recessive allele frequency over time (in generations), s = 0.33. Vertical axis: frequency of the a allele (0 to 1); horizontal axis: generations (1 to 49).]
Based on this result, we might predict a relatively fast decline in the frequency of the recessive allele following
the negative effects of industrialization.
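The trajectory described here can be reproduced by iterating equation 10. The sketch below uses the values given in the text (s = 0.33 and p = 0.06, so the starting value is q = 0.94); the function names and the choice of 49 generations (the range of the plot's horizontal axis) are our own.

```python
def recessive_step(q: float, s: float) -> float:
    """One generation of eq. 10: q' = (q - s*q^2) / (1 - s*q^2)."""
    return (q - s * q * q) / (1.0 - s * q * q)


def trajectory(q0: float, s: float, generations: int) -> list:
    """Frequency of the recessive allele in generations 0..generations."""
    qs = [q0]
    for _ in range(generations):
        qs.append(recessive_step(qs[-1], s))
    return qs


s = 0.33    # selection coefficient from the text
p0 = 0.06   # initial frequency of the dark (dominant) allele from the text
qs = trajectory(q0=1.0 - p0, s=s, generations=49)

for t in (0, 10, 20, 30, 40, 49):
    print(f"generation {t:2d}: q = {qs[t]:.3f}")
```

The printed values decline quickly at first and then more slowly as the recessive allele becomes rare and is increasingly carried by heterozygotes.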
Now let’s examine the change in the frequency of the recessive allele under a variety of selection coefficients (s
= 0, 0.01, 0.1, 0.5, and 0.9).
[Figure: Change in recessive allele frequency over time under different intensities of negative selection. Curves for s = 0, 0.01, 0.1, 0.5, and 0.9. Vertical axis: frequency of the a allele (0 to 1); horizontal axis: generations (1 to 251).]
It is clear that there can be a very rapid decrease in the frequency of the recessive allele, and hence a
corresponding rapid increase in the dominant allele, when selection pressure is strong enough. Interestingly,
many resistance genes to pesticides in insects are partially or completely dominant. There is
considerable data tracking the change in the frequency of pesticide-resistant insects following the large-scale
application of such chemicals during the 1940s. In fact, this represents another classic case study of evolution
in action. What is remarkable is that resistance evolved in 5-50 generations in a wide variety of species and
environments. We can see from our plot that for the range of s > 0.5, resistance genes will increase to very high
frequencies in less than 50 generations.
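One way to compare selection intensities is to count how many generations equation 10 needs to push the recessive allele below some threshold. The sketch below is illustrative only: the starting frequency (q = 0.94, borrowed from the peppered moth example) and the threshold of 0.1 are assumptions, since the starting value used for the plot is not stated.

```python
def generations_until(q0: float, s: float, threshold: float,
                      max_generations: int = 10_000):
    """Generations of eq. 10 needed for q to fall to `threshold` (None if never)."""
    q = q0
    for t in range(max_generations + 1):
        if q <= threshold:
            return t
        q = (q - s * q * q) / (1.0 - s * q * q)
    return None


q0 = 0.94        # assumed starting frequency of the recessive allele
threshold = 0.1  # assumed cutoff for "rare"

for s in (0.0, 0.01, 0.1, 0.5, 0.9):
    t = generations_until(q0, s, threshold)
    label = "never (within the limit)" if t is None else f"{t} generations"
    print(f"s = {s:<4}: {label}")
```

With s = 0 the frequency never changes, and the count shrinks as s grows, in line with the qualitative pattern described in the text.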
Deleterious Dominant
By modelling a dominant trait as deleterious we can evaluate
directional selection for a recessive allele. This situation represents
the second half of the peppered moth story. Following clean air
legislation and the resulting decreased air pollution in the 1960’s the
frequency of the dark form declined. At one site in northwest England
the frequency declined from 0.94 in 1961 to 0.11 in 1998. With the
change in environment there was directional, positive selection for the
homozygous recessive genotypes.
[Figure: Peppered moths in restored environment]
The model
Genotype     AA           Aa            aa
Frequency    $p_0^2$      $2p_0q_0$     $q_0^2$
W            $1 - s$      $1 - s$       1
The change in allele frequencies under one generation of selection according to the above model is:
$$q_1 = \frac{q - sq + sq^2}{1 - s(1 - q^2)}$$
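For readers who want to see where this recursion comes from, the following short derivation is a sketch that substitutes the fitness values of the model ($W_{AA} = W_{Aa} = 1 - s$, $W_{aa} = 1$) into equations 1 and 7. It uses only symbols already defined in the text, together with $p + q = 1$.

```latex
\begin{align*}
\bar{W} &= p^2(1-s) + 2pq(1-s) + q^2 \\
        &= (1-s)(1-q^2) + q^2 \;=\; 1 - s(1-q^2) \\[4pt]
q_1 &= \frac{q\,[\,p(1-s) + q\,]}{\bar{W}}
     \;=\; \frac{q\,(1 - s + sq)}{1 - s(1-q^2)}
     \;=\; \frac{q - sq + sq^2}{1 - s(1-q^2)}
\end{align*}
```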
The figure below illustrates the change in the frequency of a dominant allele when it is selected against
according to s = 0.05, 0.1, 0.2, and 0.3.
[Figure: Change in frequency of dominant allele under different intensities of negative selection. Curves labelled s = 0.05, 0.1, 0.2, and 0.5. Vertical axis: frequency of the A allele (0 to 1); horizontal axis: generations (1 to 97).]
We can see that directional selection is quite effective when acting against a dominant trait. The dominant allele
is eventually lost, with the recessive being “fixed” in the population. Even under very weak selection pressure (s
= 0.05) there will eventually be a period of steep decline in frequency.
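The behaviour described above can be explored by iterating the recursion for the deleterious dominant model and tracking p = 1 - q. This is a sketch under stated assumptions: the starting frequency p = 0.9 is hypothetical (the value used for the figure is not given), and the function names are our own.

```python
def dominant_step(q: float, s: float) -> float:
    """One generation of selection against a dominant allele:
    q' = (q - s*q + s*q^2) / (1 - s*(1 - q^2))."""
    return (q - s * q + s * q * q) / (1.0 - s * (1.0 - q * q))


def dominant_allele_trajectory(p0: float, s: float, generations: int) -> list:
    """Frequency of the dominant allele A in generations 0..generations."""
    q = 1.0 - p0
    ps = [p0]
    for _ in range(generations):
        q = dominant_step(q, s)
        ps.append(1.0 - q)
    return ps


p0 = 0.9  # hypothetical starting frequency of the dominant allele
for s in (0.05, 0.1, 0.2, 0.3):
    ps = dominant_allele_trajectory(p0, s, generations=97)
    print(f"s = {s:<4}: p after 97 generations = {ps[-1]:.4f}")
```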
Overdominance (Balancing selection)
The case of overdominant selection provides an interesting contrast to cases of directional selection. While
directional selection pressure results in the fixation of one or another allele, overdominant selection results in a
globally stable equilibrium for allele frequency polymorphism. The average fitness of the population, $\bar{W}$, is at its
maximum at the equilibrium values of p and q. We can construct such a model as follows:
The model
Genotype     AA           Aa            aa
Frequency    $p_0^2$      $2p_0q_0$     $q_0^2$
W            $1 - s_1$    1             $1 - s_2$
The change in allele frequencies under one generation of selection according to the above model is:
$$q_1 = \frac{q - s_2 q^2}{1 - s_1 p^2 - s_2 q^2}$$
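The polymorphic equilibrium of this recursion can be found by setting $q_1 = q$. The derivation below is a sketch that assumes both alleles are present ($p \neq 0$ and $q \neq 0$) and uses $p = 1 - q$; the hat notation for equilibrium values is introduced here for convenience.

```latex
\begin{align*}
q_1 = q \;&\Longrightarrow\; q - s_2 q^2 = q\,(1 - s_1 p^2 - s_2 q^2) \\
&\Longrightarrow\; s_2 q\,(1 - q) = s_1 p^2 \qquad (q \neq 0) \\
&\Longrightarrow\; s_2 q = s_1 p \qquad (p \neq 0) \\
&\Longrightarrow\; \hat{q} = \frac{s_1}{s_1 + s_2}, \qquad \hat{p} = \frac{s_2}{s_1 + s_2}
\end{align*}
```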
Let s1= 0.3 and s2= 0.1. We can evaluate the convergence on the equilibrium value of q, from a variety of
different initial values of q.
[Figure: Stable equilibrium resulting from overdominant selection. Trajectories from different initial values converge on the stable polymorphism q = 0.75, p = 0.25. Vertical axis: frequency of the a allele (0 to 1); horizontal axis: generations (1 to 97).]
It is clear that given the two values of s, the equilibrium value of q will always be 0.75 regardless of the initial
value of q in the population, provided both alleles are initially present.
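The convergence can be checked numerically by iterating the overdominance recursion from several starting points. In this sketch the starting values and the number of generations are arbitrary choices, and the closed-form value s1 / (s1 + s2), obtained by setting q1 = q in the recursion, is included only as a cross-check.

```python
def overdominance_step(q: float, s1: float, s2: float) -> float:
    """One generation with fitness values W_AA = 1 - s1, W_Aa = 1, W_aa = 1 - s2."""
    p = 1.0 - q
    return (q - s2 * q * q) / (1.0 - s1 * p * p - s2 * q * q)


def iterate(q0: float, s1: float, s2: float, generations: int) -> float:
    """Frequency of the a allele after the given number of generations."""
    q = q0
    for _ in range(generations):
        q = overdominance_step(q, s1, s2)
    return q


s1, s2 = 0.3, 0.1        # values from the text
q_hat = s1 / (s1 + s2)   # closed-form equilibrium, for comparison
print(f"predicted equilibrium: q = {q_hat:.2f}")

for q0 in (0.05, 0.25, 0.50, 0.75, 0.95):   # arbitrary starting frequencies
    q_end = iterate(q0, s1, s2, generations=100)
    print(f"start q = {q0:.2f} -> q after 100 generations = {q_end:.3f}")
```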
The power of overdominant selection to maintain a balanced polymorphism is
illustrated by the phenomenon of TRANS-SPECIES EVOLUTION. In trans-species
evolution balancing selection maintains two or more alleles in a population for
such extremely long periods of time that the population actually undergoes the
process of speciation one or more times, all the while maintaining a stable
balance in the allele frequencies. An interesting result of trans-species evolution
is a closer phylogenetic relationship between DNA sequences from different
species than within the same species (only at the locus evolving under
overdominant selection).
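[Figure: A virus-infected cell presents antigen on its antigen-presenting receptor to the T-cell receptor (TCR) of a T cell, which matures into a killer cell.]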
A good example of trans-species evolution is provided by the major
histocompatibility locus (MHC) of humans and chimpanzees. The MHC locus is
a 4 megabase region of the genome that encodes the antigen presenting
receptors that function to present foreign peptides to T-cells. Recognition of a
peptide by a T-cell stimulates the T-cell to mature into a killer cell that will
destroy the infected cell. The figure to the right illustrates this relationship.
By using a model called a molecular clock (we will cover the molecular clock in
detail later in this course) it is possible to estimate the divergence date of gene
sequences. When the first MHC genes were sequenced, it
was evident that divergences of the human and chimp MHC sequences were
older (~9 mya and ~14 mya) than humans and chimps as species (~6 mya). Moreover alleles in different
species, i.e., humans and chimps, were more closely related to each other than alleles within the same species!
The only possible explanation for such a pattern is that the alleles contained within both humans and chimps
diverged long before humans and chimps evolved, and that this variation was preserved over evolutionary time
and through speciation events. The figure below illustrates a human/chimp trans-species polymorphism. It turns
out that this type of polymorphism is very common in MHC, being found in many sets of vertebrate species.
[Figure: Human/chimp trans-species polymorphism. 6 mya: human-chimp speciation; 9 mya: MHCBh - MHCBch divergence; 14 mya: MHCAh - MHCAch divergence.]
Why is MHC subject to such strong
overdominant selection? The simple
answer is the extra diversity associated
with having two alleles is beneficial. The
reason for the benefit of having extra
diversity lies in the increased number of
antigen sequences that can be
presented by the MHC molecule.
Heterozygotes would have an
advantage in a population exposed to
more than one pathogen. A number of
studies have shown that selection
pressure is strongest at the antigen
recognition sites (ARSs) in the antigen
receptor, indicating that changes in the
ability to bind peptides have fitness
consequences for the individuals
bearing these MHC genes. In addition
there is an advantage to a pathogen if it
can evolve a peptide sequence that
can’t be recognized by the host’s
receptor. Likewise it is to the advantage
of the host to evolve MHC molecules
that can recognize the new peptide
sequence. This sets up a long term
evolutionary “arms-race” that would
result in very strong natural selection for
change at the ARSs. MHC is an
example of microevolution that leads to
macroevolution.
Deleterious recessive under partial dominance
We can relax the assumption of complete dominance and allow selection to act on the heterozygote. The
effectiveness of selection on the heterozygote will depend on the DEGREE OF DOMINANCE, which we model with a
parameter h. For example if the degree of dominance is zero, i.e., h = 0, then selection cannot act on the
heterozygote and fitness values will be 1, 1, and 1 – s. When h = 1/2 we have an additive model for the effect of
the alleles, as the fitness of the heterozygote is exactly intermediate between the fitness of each homozygote (1,
1 – (1/2)s, and 1 – s).
The model
Genotype     AA           Aa            aa
Frequency    $p_0^2$      $2p_0q_0$     $q_0^2$
W            1            $1 - hs$      $1 - s$
The change in allele frequencies under one generation of selection according to the above model is:
$$q_1 = \frac{q - hspq - sq^2}{1 - 2hspq - sq^2}$$
[Figure: Effect of partial dominance on the change of the recessive allele frequency under negative selection. Curves: partial dominance (h = 0.5) and full dominance (h = 0); s = 0.33. Vertical axis: frequency of the a allele (0 to 1); horizontal axis: generations (1 to 58).]
In both cases the change in allele frequency, q, is fast when the allele is common and slowest when the allele
is rare. An important difference under partial dominance is that the recessive allele cannot “hide” nearly as well
in the heterozygotes, so its frequency approaches zero much more quickly than when there is full dominance.
Initially the decrease in frequency is slower under partial dominance, but it eventually overtakes the case of full
dominance. Under partial dominance, selection is initially removing many dominant alleles carried in the
heterozygotes, so the relative decrease in frequency of the recessive allele is slower. However, the situation
changes because the fitness difference between the heterozygote (h = ½) and the AA homozygote allows selection to
continue to target the alleles hiding in the heterozygotes when their frequency is low.
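The crossover described above can be seen by iterating the partial dominance recursion for h = 0 and h = 0.5 side by side. The sketch below uses s = 0.33 from the figure; the starting frequency q = 0.94 is an assumption carried over from the peppered moth example, and the function names are our own.

```python
def partial_dominance_step(q: float, s: float, h: float) -> float:
    """One generation with fitness values W_AA = 1, W_Aa = 1 - h*s, W_aa = 1 - s."""
    p = 1.0 - q
    numerator = q - h * s * p * q - s * q * q
    mean_fitness = 1.0 - 2.0 * h * s * p * q - s * q * q
    return numerator / mean_fitness


def trajectory(q0: float, s: float, h: float, generations: int) -> list:
    """Frequency of the a allele in generations 0..generations."""
    qs = [q0]
    for _ in range(generations):
        qs.append(partial_dominance_step(qs[-1], s, h))
    return qs


s = 0.33    # selection coefficient shown in the figure
q0 = 0.94   # assumed starting frequency of the recessive allele
full = trajectory(q0, s, h=0.0, generations=58)
partial = trajectory(q0, s, h=0.5, generations=58)

print("gen   full dominance (h=0)   partial dominance (h=0.5)")
for t in (0, 1, 5, 10, 20, 40, 58):
    print(f"{t:3d}   {full[t]:20.4f}   {partial[t]:25.4f}")
```

In the first generations the h = 0 column falls faster, while by the end of the run the h = 0.5 column is much closer to zero, matching the description in the text.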