Population genetics

Population genetics is a field that can be viewed as the extension of Mendelian genetics to the population level, rather than a consideration of gene segregation within a cross or family. While a single diploid individual can carry at most two alleles of a gene, a population can contain numerous alleles at various frequencies. Population genetics describes the frequencies of genotypes and alleles in populations and measures their levels of genetic variation.

A population is a collection of organisms of a single species whose individuals interact with each other in some way. A species will therefore typically be divided into a number of populations.

Population genetics includes both empirical and theoretical studies. It provides the mechanics, or mathematics, underlying the evolutionary process, and it emerged in the 1920s. The main founders include R. A. Fisher, J. B. S. Haldane, and Sewall Wright. The increasing availability of DNA sequence data, faster computers, and new mathematics has continued to fuel the field. Population genetics is also relevant in applied fields, including agriculture, medical genetics (including the genetics of various human diseases), the evolution of bacterial resistance to antibiotics, and the evolution of virulence in pathogens.

Questions in population genetics:
1. What are the levels of genetic variation in populations?
2. How do different mating patterns affect genotype and allele frequencies?
3. What forces are responsible for changes in allele frequencies or the genetic composition of populations? In other words, what are the roles of mutation, migration, genetic drift and natural selection in populations?

Genetic variation

Genetic variation is critically important from both theoretical and applied perspectives. In the absence of genetic variation, evolution cannot occur.
Likewise, in order to improve crop plants or animals by breeding, genetic variation must be present. One of the first tasks in population genetics is the measurement of variation within populations. The easiest approach is to use single genes that are codominant (or show additive inheritance or incomplete dominance), where there is a one-to-one correspondence between genotype and phenotype. For quantitative or polygenic traits, other approaches are necessary. Among the first loci used to assess levels of variation in human populations were various blood group loci, including the ABO blood group system and the MN blood group polymorphism.

MN blood group polymorphisms

Consider an example involving the MN blood group polymorphism. The gene has two codominant alleles in populations, M and N, and a simple blood test determines the genotype of an individual as MM, MN or NN. Imagine you have the following sample from a population:

Phenotype              M                 MN                N                Total
Genotype               MM                MN                NN
Observed number        180               240               80               500
Genotype proportion    180/500 = 0.36    240/500 = 0.48    80/500 = 0.16

Note that the proportions should add to 1, or you've made a mistake.

We can also determine the frequencies of alleles in the population. Because there is codominance, this is done simply by counting the numbers of each allele and dividing by the total number of alleles counted. Alternatively, it can be done using the proportions of each genotype.

1. Using the numbers of each genotype observed in the population:
p = freq(M) = # of M alleles / total alleles = (2 x 180 + 240) / (2 x 500) = 0.6
q = freq(N) = # of N alleles / total alleles = (2 x 80 + 240) / (2 x 500) = 0.4
Note that the allele frequencies must satisfy p + q = 1, or you've made a mistake.

2. Using the proportions of each genotype:
p = freq(MM) + 1/2 x freq(MN) = 0.36 + 0.48/2 = 0.6
q = freq(NN) + 1/2 x freq(MN) = 0.16 + 0.48/2 = 0.4

This provides a simple description of the allele frequencies for a particular gene in a particular population. Populations can then be compared. Relative to most populations, which have roughly comparable frequencies of the two alleles, the Eskimo population studied has a considerably greater frequency of the M allele (and a correspondingly lower frequency of N), while the Australian Aboriginal population has the reverse pattern. The reasons for the differences are unknown, and neither the function of the MN polymorphism nor any functional differences between genotypes are known. Random processes and the lack of migration between populations (their degree of isolation) are likely responsible for these differences, or at least this is the null hypothesis.

The ABO blood group system shows codominance of only two of its three alleles, so other estimation methods and additional assumptions are needed to determine allele frequencies. Frequencies also differ among human populations. For example, the Sioux population has an unusually low frequency of the IA allele compared with other populations. Again, it may well be that random genetic drift is responsible.

Protein polymorphisms

The application of protein electrophoresis to assess the levels of genetic variation in populations began in 1966 and sparked the "find 'em and grind 'em" era. These methods allowed the easy determination of genotype and allele frequencies in numerous organisms and for a large number of genes. That is, one could assess not just model organisms, like fruit flies, but most organisms. Furthermore, these polymorphisms are commonly codominant, so there is a one-to-one correspondence between genotype and phenotype, making allele frequencies easy to determine.
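The allele-counting calculation used for the MN example can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `allele_frequencies` and the dictionary layout are assumptions made here, not part of the original notes. It works for any number of codominant alleles, so the same function is also applied to a three-allele sample (the esterase counts used in these notes).

```python
from collections import Counter

def allele_frequencies(genotype_counts):
    """Estimate allele frequencies by counting alleles in diploid genotypes.

    genotype_counts maps a pair of allele labels, e.g. ("M", "N"),
    to the number of individuals observed with that genotype.
    (Hypothetical helper written for this illustration.)
    """
    allele_counts = Counter()
    for (allele_1, allele_2), n in genotype_counts.items():
        # Every individual carries two alleles, so a homozygote
        # contributes two copies of the same allele.
        allele_counts[allele_1] += n
        allele_counts[allele_2] += n
    total_alleles = sum(allele_counts.values())
    return {a: c / total_alleles for a, c in allele_counts.items()}

# MN blood group sample: 180 MM, 240 MN, 80 NN
mn_freqs = allele_frequencies({("M", "M"): 180, ("M", "N"): 240, ("N", "N"): 80})
print(mn_freqs)

# Three-allele esterase sample: FF 100, FM 150, FS 50, MM 125, MS 0, SS 75
esterase_freqs = allele_frequencies({
    ("F", "F"): 100, ("F", "M"): 150, ("F", "S"): 50,
    ("M", "M"): 125, ("M", "S"): 0, ("S", "S"): 75,
})
print(esterase_freqs)
```

Counting alleles directly, rather than working from genotype proportions, keeps the check "frequencies sum to 1" automatic.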
The first studies were carried out in 1966 by Harris on humans, and by Lewontin and Hubby on fruit flies. (See the PowerPoint slides for gel images and for slides on the classical and balance hypotheses.) Using these methods, one can easily determine genotype and allele frequencies at numerous loci. Imagine you obtain the following data for an esterase locus with three alleles denoted F, M and S, encoding Fast, Intermediate and Slow migrating forms of the enzyme.

[Figure: schematic of allozyme gel electrophoresis showing an enzyme segregating for three alleles, F, M and S, in a population sample. Lane genotypes: SS, SS, MS, MM, FS, FM, FF, MS, MM, FS, FM, SS.]

Here are the data collected:

Genotype    Observed number
FF          100
FM          150
FS          50
MM          125
MS          0
SS          75
Total       500

We can calculate the proportion of each genotype by dividing the observed numbers by 500. Allele frequencies are estimated just as with two alleles:

p = freq(F) = (2 x #FF + #FM + #FS) / (2 x 500) = (2 x 100 + 150 + 50) / 1000 = 0.4
q = freq(M) = (2 x #MM + #FM + #MS) / (2 x 500) = (2 x 125 + 150 + 0) / 1000 = 0.4
r = freq(S) = (2 x #SS + #FS + #MS) / (2 x 500) = (2 x 75 + 50 + 0) / 1000 = 0.2

Note that p + q + r = 1, or you've made an error.

In general, for a locus with any number of codominant alleles:

Freq(allele) = (2 x # of homozygotes for the allele + # of heterozygotes carrying it) / total alleles

or, if you have the proportions of each genotype:

Freq(allele) = proportion of homozygotes + 1/2 x proportion of heterozygotes for the allele.

Note that these equations can be used for any codominant marker, including DNA-based markers or polymorphisms such as CAPS, RFLPs and SNPs. Sequencing DNA directly can of course give the finest level of resolution for determining the extent of genetic variation in populations.

RFLP variation in a sample of 58 flies across a 4.5 kb region of the genome of D. pseudoobscura.
There are 78 RFLPs and 53 unique haplotypes, illustrating the great extent of genetic variation in natural populations. (See the PowerPoint slides.)

Effect of sexual reproduction on genetic variation

Darwin was unaware of the correct mechanism of inheritance, and in at least one edition of On the Origin of Species he invoked blending inheritance. Under blending inheritance, sexual reproduction would deplete genetic variation from populations, and hence evolution could not occur. In the early 1900s, biologists independently determined that sexual reproduction alone does not alter levels of genetic variation and leads to an equilibrium frequency of genotypes in a population after just one generation of random mating. This has become known as Hardy-Weinberg equilibrium.

One way to work out what genotype frequencies should be under random mating (with respect to a particular locus) is the gamete pool approach. Imagine an organism in which all eggs and all sperm are shed into the ocean and gametes then unite at random. Suppose there are two alleles, A and a, with p = freq(A) and q = freq(a) in the population. The frequencies lie somewhere between 0 and 1, and of course p + q = 1. We can set up a Punnett square to deduce the expected frequencies of offspring in the population under random mating:

                     Eggs
                     A (freq p)      a (freq q)
Sperm  A (freq p)    AA: $p^2$       Aa: $pq$
       a (freq q)    Aa: $pq$        aa: $q^2$

So under random mating we expect:

Freq(AA) = $p^2$
Freq(Aa) = $2pq$
Freq(aa) = $q^2$

This result holds independent of the initial genotype frequencies in the population. After just one round of random mating, the progeny generation will be at these Hardy-Weinberg equilibrium frequencies. This mating system is of course atypical for humans, who usually don't shed their sperm and eggs into the ocean and allow random fertilization to occur. However, the same result holds for a locus if individuals mate at random in pairs.
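The claim that random mating in pairs gives the same Hardy-Weinberg proportions after one generation, whatever the starting genotype frequencies, can be checked numerically. The sketch below is an illustration only (the names `random_mating` and `PROB_TRANSMIT_A` are assumptions introduced here): it enumerates all nine ordered matings, weights each by the product of the parental genotype frequencies, and adds up the offspring.

```python
from itertools import product

# Probability that a parent of each genotype transmits the A allele.
PROB_TRANSMIT_A = {"AA": 1.0, "Aa": 0.5, "aa": 0.0}

def random_mating(freq_AA, freq_Aa, freq_aa):
    """Offspring genotype frequencies after one round of random pair mating."""
    parents = {"AA": freq_AA, "Aa": freq_Aa, "aa": freq_aa}
    offspring = {"AA": 0.0, "Aa": 0.0, "aa": 0.0}
    # Enumerate all 3 x 3 ordered matings (male genotype, female genotype).
    for (male, f_male), (female, f_female) in product(parents.items(), repeat=2):
        pair_prob = f_male * f_female          # random mating: frequencies multiply
        a_m = PROB_TRANSMIT_A[male]
        a_f = PROB_TRANSMIT_A[female]
        offspring["AA"] += pair_prob * a_m * a_f
        offspring["Aa"] += pair_prob * (a_m * (1 - a_f) + (1 - a_m) * a_f)
        offspring["aa"] += pair_prob * (1 - a_m) * (1 - a_f)
    return offspring

# A starting population far from Hardy-Weinberg proportions (made-up numbers):
# freq(AA) = 0.1, freq(Aa) = 0.2, freq(aa) = 0.7, so p = 0.2 and q = 0.8.
next_gen = random_mating(0.1, 0.2, 0.7)
print(next_gen)   # compare with p^2, 2pq, q^2 for p = 0.2

# A second round of random mating should leave the frequencies unchanged.
second_gen = random_mating(next_gen["AA"], next_gen["Aa"], next_gen["aa"])
print(second_gen)
```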
The demonstration that this holds is a little more laborious to derive, but not difficult. First enumerate all possible matings among the three genotypes (9 of them). Then weight each mating by its probability, given the genotype proportions in the population. Then determine the fraction of each kind of offspring produced by each mating, weight it by the probability of that mating, and add up the progeny across all matings.

Imagine a population with two alleles, A and a, in which the genotype frequencies are freq(AA) = D, freq(Aa) = H and freq(aa) = R. Note that p = D + 1/2 H and q = R + 1/2 H, and also that D + H + R = 1.

Mating (male x female)    Probability    Offspring AA    Offspring Aa    Offspring aa
AA x AA                   D x D          $D^2$
AA x Aa                   D x H          1/2 DH          1/2 DH
AA x aa                   D x R                          DR
etc., to
aa x aa                   R x R                                          $R^2$

Then add up the proportions of AA offspring, then Aa, and then aa. You'll have expressions in terms of D, H and R; substitute into them the expressions for p and q. For example, the AA offspring total $D^2 + DH + \frac{1}{4}H^2 = (D + \frac{1}{2}H)^2 = p^2$. You should end up with Freq(AA) = $p^2$, Freq(Aa) = $2pq$ and Freq(aa) = $q^2$.

Thus under random mating of individuals in pairs with respect to some locus, you expect to find Hardy-Weinberg proportions of each genotype. That is:

Freq(AA) = $p^2$; Freq(Aa) = $2pq$; Freq(aa) = $q^2$

Note that Hardy-Weinberg proportions can be extended to any number of alleles per locus. For any number of alleles, the frequency of a particular homozygote will always be the square of that allele's frequency, while the frequency of a heterozygote will always be 2 x the product of the two relevant allele frequencies. So, for three alleles F, M and S at frequencies p, q and r, you'd expect FF $p^2$; FM $2pq$; FS $2pr$; MM $q^2$; and so on.
This may be given by the expansion of $(p + q + r)^2$. Hardy-Weinberg equilibrium can also be extended to tetrasomic inheritance through the expansion of $(p + q)^4$.

Note that random mating and the resulting Hardy-Weinberg equilibrium do not change allele frequencies, so in a large random-mating population you wouldn't expect allele or genotype frequencies to change (unless some other force is operating).

One consequence of Hardy-Weinberg equilibrium (and random mating) is that rare alleles occur most of the time in heterozygotes. That is, there will be few individuals homozygous for a rare allele. This has important consequences in evolution and in medical genetics. Human genetic disorders are rare, and hence most of the alleles for recessive disorders exist, unrecognized, in individuals who are heterozygous for them.

Imagine the frequency of an allele d, causing a recessive disease, is q = 0.001 (so p = 0.999). The expected frequencies assuming Hardy-Weinberg equilibrium are:

DD: $p^2$ = 0.998
Dd: $2pq$ = 0.001998
dd: $q^2$ = 0.000001

So here 1 in 1 million have the disease (they are dd), while close to 2000 in a million carry the disease-causing allele as heterozygotes and are normal. In other words, there are about 2000 times as many individuals carrying the allele as heterozygotes as there are individuals with the disease. This is one reason why it isn't easily possible to get rid of rare recessive genetic disorders: you'd have to eliminate far more normal people than diseased ones. It is also why natural and artificial selection against recessives is difficult, or lags: most of the time the phenotype of the recessive allele isn't expressed until its frequency becomes appreciable.
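The Hardy-Weinberg expectations for any number of alleles, and the carrier-to-affected ratio in the rare recessive example, can be reproduced with a short sketch. The function name `hardy_weinberg` and the dictionary-based interface are assumptions made for illustration, not something the notes prescribe.

```python
from itertools import combinations_with_replacement

def hardy_weinberg(allele_freqs):
    """Expected genotype frequencies under random mating.

    allele_freqs maps allele label -> frequency (frequencies should sum to 1).
    Homozygotes get the squared frequency; heterozygotes get 2 x the product.
    """
    expected = {}
    for a1, a2 in combinations_with_replacement(sorted(allele_freqs), 2):
        f1, f2 = allele_freqs[a1], allele_freqs[a2]
        expected[(a1, a2)] = f1 * f1 if a1 == a2 else 2 * f1 * f2
    return expected

# Three codominant alleles F, M, S at frequencies p = 0.4, q = 0.4, r = 0.2
three_allele = hardy_weinberg({"F": 0.4, "M": 0.4, "S": 0.2})
for genotype, freq in three_allele.items():
    print(genotype, round(freq, 4))

# Rare recessive disease allele d at q = 0.001
disease = hardy_weinberg({"D": 0.999, "d": 0.001})
carriers_per_affected = disease[("D", "d")] / disease[("d", "d")]
print(carriers_per_affected)   # heterozygous carriers per affected homozygote
```

The ratio of carriers to affected individuals is 2pq / q^2 = 2p / q, which is why it grows so large as the allele becomes rare.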
Note that one can determine whether a population is at Hardy-Weinberg equilibrium with respect to some locus by assaying a random sample from the population and doing a chi-square goodness-of-fit test, as in the example below.

Sample 100 flies for ADH activity:

Genotype           FF    FS    SS    N
Number of flies    22    30    48    100

We first need to estimate the allele frequencies:

p = freq(F) = (2 x 22 + 30) / (2 x 100) = 0.37
q = freq(S) = (2 x 48 + 30) / (2 x 100) = 0.63
Check that p + q = 1.

Now set up a table with the observed numbers and the numbers expected under Hardy-Weinberg equilibrium:

Genotype    Observed # of flies    Expected # of flies
FF          22                     N x $p^2$ = 13.69
FS          30                     N x $2pq$ = 46.62
SS          48                     N x $q^2$ = 39.69
N           100

Chi-square = 12.7. The tabulated chi-square value for df = 1 is 3.841, where df = 3 classes - 1 - 1 independent parameter estimated from the data. Reject Hardy-Weinberg equilibrium: with respect to ADH, the flies are not at Hardy-Weinberg equilibrium. There appear to be too few FS and too many FF and SS. Why that is requires further study. It could be inbreeding.

Note that in cases where there is dominance, you can still estimate allele frequencies if you assume mating is random with respect to the locus you are considering. Imagine flower colour, where red = RR and Rr, and white = rr. Let p = freq(R) and q = freq(r), and sample a population as follows:

Phenotype                   Red (RR and Rr)    White (rr)     N
Observed number             75                 25             100
Phenotype proportion        75/100 = 0.75      25/100 = 0.25
Assuming Hardy-Weinberg     $p^2 + 2pq$        $q^2$

The easy solution is to set $q^2$ = 0.25 and take the square root, giving q = 0.5. Since p + q = 1, p = 1 - 0.5 = 0.5. If we wished, we could also determine the expected frequency of each genotype: $p^2$ of RR and $2pq$ of Rr. Can we test for Hardy-Weinberg equilibrium? No. We assumed it to get here, and there are also no degrees of freedom: df = 2 classes - 1 - 1 parameter estimated = 0.

Departures from random mating

If a species is divided into separate populations, random mating may occur within populations but not necessarily between them. If allele frequencies differ between populations, then Hardy-Weinberg equilibrium will likely not be observed across the whole species, whereas it may occur within each population (given random mating within populations).

Inbreeding
One form of departure from random mating is inbreeding (or at least inbreeding in excess of that which would occur by chance). Inbreeding is mating among relatives, the most extreme form being selfing. The more closely related the mating individuals, the greater the effects on genotype frequencies (the greater the departure from Hardy-Weinberg equilibrium). If an entire population were suddenly to undergo nothing but inbreeding, then eventually all loci would become homozygous. The rate at which this occurs depends on the kind of inbreeding (e.g. first-cousin matings, brother-sister matings, or selfing), and it is fastest for selfing.

Inbreeding depression often results from inbreeding (in normally outbred organisms). Inbreeding depression is the reduction in fitness of inbred relative to outbred offspring. In many ways it is the opposite of hybrid vigour. Inbreeding depression is largely the result of recessive or partially recessive genes that become homozygous upon inbreeding.

The inbreeding coefficient, F, may be estimated from codominant data at one (or preferably more than one) locus. Inbreeding affects all loci in the genome, unlike other departures from random mating.

F = 1 - (obs het / exp het)

where obs het is the observed proportion of heterozygotes at a locus and exp het is the expected proportion assuming Hardy-Weinberg equilibrium. When there is random mating, obs het = exp het and F = 0 (no inbreeding). At the opposite extreme, F = 1 when there are no heterozygotes at all at a locus that has two or more alleles. Normally 0 < F < 1. Note that you'd expect an approximately equivalent F value for all loci if a population is inbreeding.

To explore the effects of inbreeding, imagine that 1 in 10,000 people have the recessive genetic disease cystic fibrosis. That is, the frequency of the disease-causing allele is q = 0.01, while that of the normal allele is p = 0.99. Per 10,000 people:

              Normal    Cystic fibrosis
F = 0         9999      1
F = 0.5       9949.5    50.5
F = 1         9900      100

So with partial inbreeding (F = 0.5), we'd see about 50 times more people with the disease.
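The chi-square test from the ADH example and the inbreeding calculations above can be sketched together. The code below is a hedged illustration: the function names are invented here, the 3.841 critical value is the one quoted in the notes (df = 1), and applying the F estimator to the ADH sample is an extra step added for illustration rather than something the notes compute. The cystic fibrosis numbers assume the homozygote frequency under inbreeding is $q^2 + Fpq$.

```python
def hw_chi_square(n_hom1, n_het, n_hom2):
    """Chi-square goodness of fit to Hardy-Weinberg for a two-allele locus."""
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)      # allele frequency by counting
    q = 1 - p
    observed = (n_hom1, n_het, n_hom2)
    expected = (n * p * p, n * 2 * p * q, n * q * q)
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, q, expected, chi_sq

def inbreeding_coefficient(n_hom1, n_het, n_hom2):
    """F = 1 - (observed heterozygosity / expected heterozygosity)."""
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)
    return 1 - (n_het / n) / (2 * p * (1 - p))

def affected_per_10000(q, F):
    """Recessive homozygotes per 10,000, assuming frequency q^2 + F*p*q."""
    p = 1 - q
    return 10_000 * (q * q + F * p * q)

# ADH sample: 22 FF, 30 FS, 48 SS
p_F, q_S, expected_counts, chi_sq = hw_chi_square(22, 30, 48)
CRITICAL_DF1 = 3.841                        # tabulated value quoted in the notes
print(round(chi_sq, 2), "reject HW" if chi_sq > CRITICAL_DF1 else "consistent with HW")

# Illustrative only: the heterozygote deficit expressed as an F value
F_adh = inbreeding_coefficient(22, 30, 48)
print(round(F_adh, 3))

# Cystic fibrosis example, q = 0.01
for F in (0.0, 0.5, 1.0):
    print(F, round(affected_per_10000(0.01, F), 1))
```

A single-locus F like this cannot by itself distinguish inbreeding from other causes of heterozygote deficit, which is why the notes recommend using more than one locus.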
Now recall that there are numerous recessive genetic disorders. All would increase in homozygosity, resulting in increases in the frequency of people with numerous other genetic disorders.

Note that if you can estimate F, you can predict the frequencies of the various genotypes:

AA: $p^2 + Fpq$
Aa: $2pq(1 - F)$
aa: $q^2 + Fpq$

Other deviations from random mating

Assortative mating. This is where, say, individuals with the same phenotype mate with each other more often than predicted under random mating. In humans, for example, there is a tendency toward assortative mating by height. Note that assortative mating could increase homozygosity of genes for height, but it will not affect all genes in the genome as inbreeding does.

Negative assortative mating is the opposite, with unlike individuals mating. This form of mating can increase heterozygosity for genes involved in the particular trait. Tristylous plant populations exhibit negative assortative mating and thus maintain the three phenotypes in the population at approximately equal frequencies. (See the PowerPoint slides for the image, or diagram a homomorphic self-incompatibility system showing the mating advantage of low-frequency genotypes.)

Mutation

Mutation is the ultimate source of genetic variation and can introduce new alleles into populations. It is, however, a weak force because of its low rate, and alone it results only in very slow changes in allele frequency through time. Rates of mutation vary across organisms and genes, but typically lie around 1 mutation in 1 million per generation for a particular gene, i.e. $10^{-6}$ mutations per gene per generation, and perhaps range from about $10^{-6}$ to $10^{-9}$ mutations per gene per generation.

Imagine a population is composed of just the A allele at some point in time, and the mutation rate to other alleles is $10^{-6}$. So in generation t = 0 the frequency of the A allele is p = 1.
In the next generation, 1 in 1 million A alleles mutate to other alleles, so the frequency of the A allele is reduced to p = 1 - $10^{-6}$ = 0.999999. In the generation after that, p = 0.999999 - 0.999999 x $10^{-6}$ = 0.999998, and so on. In general, with mutation rate u, the frequency after t generations is $p_t = (1 - u)^t$, so even after 100,000 generations at a rate of $10^{-6}$, p is still about 0.90.

[Figure: Allele frequency change due to mutation alone. Frequency of the A allele (0 to 1) against generation (0 to 100,000) for mutation rates of 0.00001 and 0.000001.]

The basic message is plain. While mutation is a critically important process, mutation alone is very, very slow at causing allele frequency change in populations.

Variation due to recombination

Recombination has a greater capacity to generate genetic variation than mutation. This is because recombination can shuffle genes or create new gene combinations through crossing over, or through independent assortment for genes on different chromosomes. Given the low rate of mutation, when a mutation first occurs it will be unique and will occur in a given chromosomal context.

Imagine the following situation, with one gene already polymorphic in the population and a new mutation occurring at the other. Initially the population is polymorphic at the A locus, with A and a alleles, while it is fixed for the B allele except for the first new b* mutation. So with respect to the two loci, the population is initially composed of the chromosomes (or haplotypes) AB and aB. Mutation produces a new haplotype ab*, and the new mutation initially never occurs with the A allele.

Before mutation:          A B      a B
After mutation B to b*:   A B      a b*

Imagine there is no crossing over and that for some reason the ab* combination increases in frequency. Then the population will be composed of only the gene combinations AB, aB and ab*. There will be no Ab*. If we allow recombination to occur between the A and B genes, then over time a random association of alleles will be generated. This state is called linkage equilibrium.
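How recombination gradually creates the missing Ab* haplotype can be illustrated with a small sketch. It relies on a standard result that these notes do not derive, so treat it as an assumption: under random mating the non-random association D = freq(AB) x freq(ab*) - freq(Ab*) x freq(aB) shrinks by a factor (1 - r) each generation, where r is the recombination rate between the two genes. The starting haplotype frequencies below are made up for illustration, and the function name is hypothetical.

```python
def haplotype_trajectory(f_AB, f_Ab, f_aB, f_ab, r, generations):
    """Haplotype frequencies over time with recombination rate r.

    Assumes random mating and no selection, mutation, migration or drift,
    so allele frequencies stay constant and only the association D decays.
    """
    p_A = f_AB + f_Ab                 # frequency of allele A
    p_B = f_AB + f_aB                 # frequency of allele B
    d0 = f_AB * f_ab - f_Ab * f_aB    # initial association between the loci
    trajectory = []
    for t in range(generations + 1):
        d_t = d0 * (1 - r) ** t       # geometric decay of the association
        trajectory.append({
            "AB": p_A * p_B + d_t,
            "Ab": p_A * (1 - p_B) - d_t,
            "aB": (1 - p_A) * p_B - d_t,
            "ab": (1 - p_A) * (1 - p_B) + d_t,
        })
    return trajectory

# Made-up starting point matching the scenario above: AB, aB and ab* present,
# Ab* absent ("b" in the code stands for the new b* allele).
loose = haplotype_trajectory(0.5, 0.0, 0.3, 0.2, r=0.5, generations=30)
tight = haplotype_trajectory(0.5, 0.0, 0.3, 0.2, r=0.01, generations=30)

print(round(loose[-1]["Ab"], 4))   # unlinked genes: Ab* near freq(A) x freq(b*)
print(round(tight[-1]["Ab"], 4))   # tightly linked genes: Ab* still rare
```

With r = 0.5 (unlinked genes) the Ab* haplotype reaches its random-association frequency within a few generations, while with r = 0.01 it remains well below it after the same time.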
At linkage equilibrium, the frequency of each combination of alleles at different genes is expected to equal the product of their individual frequencies. That is, whether a chromosome has the A allele is independent of whether it has the B or b* allele. So at linkage equilibrium:

Freq(AB) = freq(A) x freq(B)
Freq(Ab*) = freq(A) x freq(b*)
Freq(aB) = freq(a) x freq(B)
Freq(ab*) = freq(a) x freq(b*)

The original situation is described as linkage disequilibrium, that is, a nonrandom association of alleles. When new mutations occur, they are initially in linkage disequilibrium. The disequilibrium eventually breaks down provided there is some recombination between the genes. The more recombination there is, the more rapidly the disequilibrium breaks down. An example of extreme disequilibrium would be a population containing only AB and ab chromosomes. You could even set this up in a fruit fly bottle experiment and follow the change in frequency of haplotypes through time.

Note that linkage disequilibrium can even occur for genes that are not linked, but of course it will break down more quickly if the genes are not linked. Linkage disequilibrium can also be generated by other forces, such as natural selection, migration, or the intermixing of two previously isolated populations. Even drift can generate linkage disequilibrium, which will of course decay. (Show graphs of decay; see more in the PowerPoint slides.)

[Figure: Decay of linkage disequilibrium with r = 0.01. Two-locus (haplotype) frequencies of AB, Ab, aB and ab, on a scale of 0 to 0.6, against generation (1 to 196).]

Recently, the occurrence of linkage disequilibrium has been exploited to help localize or identify genes for various human genetic diseases. One can look for an association between various molecular markers and the occurrence of the disease. The closer the marker is to the disease gene, the stronger the linkage disequilibrium. This can then be exploited to find the gene.

Variation from migration
If populations differ in allele frequency, then migration of individuals from one population into the other can change gene frequencies. The rate of change depends on how much migration occurs, and migration can be more potent than mutation at changing allele frequencies. Migration tends to homogenize allele frequencies across populations if it is bi-directional (migrants from population A go to population B and vice versa). The final allele frequencies may end up being the average of those initially found in the separate populations, although population size will also influence the final frequencies. Migration can also introduce an entirely new allele from one population into another in which it did not previously exist.

Natural selection

Natural selection involves the differential survival and/or reproduction of individuals with different genotypes. It has the capacity to change gene frequencies relatively rapidly (compared to the other evolutionary forces of mutation, migration and drift). It can also, however, maintain genetic variation and/or prevent evolutionary change from occurring at a locus. It is the process responsible for the adaptation of organisms to their environment. The process is very much like the artificial selection practiced by plant and animal breeders. The difference is that there is a human-conceived goal in the breeding of plants and animals, while in nature the particular environment (biotic and abiotic) drives the direction of evolutionary change.

We often refer to Darwinian fitness, which is a relative measure of the probability of survival and reproduction of a genotype or phenotype. That is, it is measured relative to other genotypes or phenotypes in the population. Relative fitness is a function of the environment an individual is in, so a particular genotype might be favoured in one environment but less favoured in another.
A gene that results in increased offspring production in an environment, compared with other genotypes, will increase in frequency.

Different forms of selection

Frequency-independent selection is selection where the fitness of a genotype/phenotype does not depend upon the composition of the population to which it belongs. That is, fitness is a function of the environment but not of the composition of the population. As an example, the fitness of two plants might be a function of their leaf numbers, which determine the amount of photosynthesis, something that is independent of the presence of other plants in the population.

Frequency-dependent selection is where the frequency of other genotypes/phenotypes affects fitness. For example, in plants with self-incompatibility systems, rare genotypes at the S-locus have greater relative fitness because they can mate with many other plants, compared with plants carrying more common S-alleles. This process can result in high levels of genetic variation at incompatibility loci and in their maintenance over long periods of time.

Density-dependent selection is selection where population size or density changes the relative fitness of genotypes/phenotypes. For example, in field mice there could be selection for increasingly aggressive behaviour as density increases (i.e. fight your neighbours to get more food).

Measuring fitness differences

While natural selection acts on the sum total of genotypic/phenotypic characters in the wild, we'll consider cases where variation at a single gene is responsible for fitness differences. Clearly there are many such examples in humans and other organisms when we consider genes with detrimental effects, such as alleles for various genetic disorders, which can severely reduce the fitness of individuals with those genotypes. The genes causing these disorders are typically selected against, and hence they are typically at low frequency in populations.
The effects of selection can be determined at a single locus by considering the relative fitness of individuals with various genotypes. Imagine a form of selection that operates some time after zygote formation but before reproductive maturity. This is sometimes referred to as viability selection. We'll consider a gene with two alleles, A and a, with initial frequencies p and q. Assume we begin at zygote formation with Hardy-Weinberg genotype frequencies.

Genotype        AA                 Aa                  aa
Zygote stage    $p^2$              $2pq$               $q^2$
(viability selection occurs here, and some individuals die as a function of their genotype)
Adults          $W_{AA} p^2$       $W_{Aa} 2pq$        $W_{aa} q^2$

Here the W's represent the probability of surviving to reproductive age. Because these W's are likely to be smaller than 1, the genotype proportions would no longer add up to one. We can rescale them so that they add up to one by dividing through by the mean fitness of the population:

$\bar{W} = W_{AA} p^2 + W_{Aa} 2pq + W_{aa} q^2$

So after selection the genotype frequencies will be:

AA: $W_{AA} p^2 / \bar{W}$
Aa: $W_{Aa} 2pq / \bar{W}$
aa: $W_{aa} q^2 / \bar{W}$

These now sum to one and can easily be compared with the initial genotype frequencies. Given these frequencies after selection, we can also calculate allele frequencies. After selection, the frequency of A will be p' and that of a will be q':

$p' = p^2 W_{AA} / \bar{W} + pq W_{Aa} / \bar{W}$
$q' = q^2 W_{aa} / \bar{W} + pq W_{Aa} / \bar{W}$

If selection is constant from one generation to the next, one can use the new genotype or allele frequencies and the same equations to derive what happens in subsequent generations, each time calculating the new genotype frequencies from those in the previous generation and then calculating mean fitness, allele frequencies and so on. Note that under frequency-independent selection the mean fitness of the population tends to increase through the generations. Keep in mind that selection is really a relative measure.
That is, it is the fitness, or in this case the survivorship, of one genotype relative to others that leads to increases or decreases in allele frequencies.

An example of viability selection in plants

Cyanogenesis is the ability of plants to release hydrogen cyanide gas, HCN, when they are eaten by herbivores. The genetics is relatively simple, with two alleles, A and a, where AA and Aa plants produce cyanide and aa plants don't. Let's imagine we start the population at Hardy-Weinberg frequencies with p = q = 0.5.

Genotype                    AA                 Aa                 aa
Zygote stage                0.25               0.5                0.25
Probability of surviving    0.5                0.5                0.1
Fitness                     $W_{AA}$ = 0.5     $W_{Aa}$ = 0.5     $W_{aa}$ = 0.1

$\bar{W} = W_{AA} p^2 + W_{Aa} 2pq + W_{aa} q^2$ = 0.5 x 0.25 + 0.5 x 0.5 + 0.1 x 0.25 = 0.4

So after selection the genotype frequencies will be:

AA: $W_{AA} p^2 / \bar{W}$ = 0.5 x 0.25 / 0.4 = 0.3125
Aa: $W_{Aa} 2pq / \bar{W}$ = 0.5 x 0.5 / 0.4 = 0.625
aa: $W_{aa} q^2 / \bar{W}$ = 0.1 x 0.25 / 0.4 = 0.0625

The allele frequencies are:

p' = 0.3125 + 0.625/2 = 0.625
q' = 0.0625 + 0.625/2 = 0.375

So there has been an increase in the frequency of the A allele and a decrease in a. If this process were to continue, the a allele would eventually be eliminated from the population. (Show some examples using the Populus simulation of natural selection, or the graphs of allele frequency change shown in the text.)

Selection can cause rapid changes in allele frequency, depending upon how different the relative fitnesses are.

Balanced polymorphism

Sometimes called overdominance, this phenomenon can occur when the fitness of the heterozygote is greater than that of either homozygote. When this occurs, an equilibrium point is reached where both alleles are maintained in the population at frequencies determined by the relative fitnesses (or selection coefficients). The equilibrium is often described using selection coefficients rather than fitnesses. Let t be the strength of selection against the AA genotype, so $W_{AA} = 1 - t$. Likewise, s is the strength of selection against the aa homozygote:
$W_{aa} = 1 - s$. The relative fitness of the heterozygote is set to 1, so $W_{Aa} = 1$. So we have:

Genotype    AA           Aa          aa
Fitness     $W_{AA}$     $W_{Aa}$    $W_{aa}$
            1 - t        1           1 - s

It can be shown that an equilibrium is reached where:

freq(A) = p = s / (s + t)

If any perturbation from this equilibrium occurs, the population is driven back towards the equilibrium point. Thus both alleles are held in the population by selection, which maintains genetic variation at this locus.

The classic example of this is sickle cell disease, where in malarial areas the Ss heterozygotes have superior fitness because they have increased resistance to malaria. The ss homozygotes have the disease, and the SS homozygotes get malaria. In malarial regions, the selection coefficient against SS is t = 0.15 and the selection coefficient against ss is s = 1 (here s denotes the selection coefficient, not the allele). Then:

Freq(S) = p_eq = 1 / (1 + 0.15) = 0.87
q_eq = 1 - p_eq = 0.13

These theoretical predictions are close to the observed allele frequencies in populations in malarial regions of Africa. (Illustrate using the natural selection simulation in Populus.)

Note that if the heterozygote has lower fitness than either homozygote, the situation is unstable and one or the other allele will become fixed in the population.

Mutation-selection balance

It can be shown that the frequency of a deleterious recessive allele reaches an equilibrium between its mutation rate and selection against the recessive homozygotes:

freq(a) = q = $(u / s)^{1/2}$

So the equilibrium is strictly a function of the mutation rate to the deleterious allele, u, and the strength of selection against it, s. Selection removes the alleles by acting against aa homozygotes, but mutation reintroduces them.

Let's imagine a recessive allele is harmful, with a fitness of $W_{aa}$ = 0.5 (so s = 0.5 as well), and u = $10^{-5}$. Then the equilibrium frequency of a is

q = $(10^{-5} / 0.5)^{1/2}$ = 0.0045

A more harmful, in fact lethal, allele will have $W_{aa}$ = 0, or s = 1:

q = $(10^{-5} / 1)^{1/2}$ = 0.003

This is obviously a lower frequency, and one could calculate the percentage of homozygous individuals assuming Hardy-Weinberg equilibrium.
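The selection calculations in this section can be reproduced with a short sketch. It is illustrative only: the function names are invented here, and the recursion is simply the viability-selection equation for p' given earlier, iterated with constant fitnesses. The block reruns the cyanogenesis example for one generation, iterates the sickle cell fitnesses to check the predicted equilibrium s / (s + t), and evaluates the mutation-selection balance formula.

```python
import math

def select_one_generation(p, w_AA, w_Aa, w_aa):
    """One generation of viability selection starting from HW zygotes.

    Returns the frequency of A after selection and the mean fitness.
    """
    q = 1 - p
    mean_w = w_AA * p * p + w_Aa * 2 * p * q + w_aa * q * q
    p_next = (w_AA * p * p + w_Aa * p * q) / mean_w
    return p_next, mean_w

def iterate_selection(p, w_AA, w_Aa, w_aa, generations):
    """Apply the same fitnesses repeatedly (constant selection)."""
    for _ in range(generations):
        p, _mean_w = select_one_generation(p, w_AA, w_Aa, w_aa)
    return p

def mutation_selection_balance(u, s):
    """Equilibrium frequency of a deleterious recessive: sqrt(u / s)."""
    return math.sqrt(u / s)

# Cyanogenesis example: p = q = 0.5, survival 0.5 / 0.5 / 0.1
p_after, mean_fitness = select_one_generation(0.5, 0.5, 0.5, 0.1)
print(p_after, mean_fitness)

# Overdominance with the sickle cell coefficients: t = 0.15, s = 1
t, s = 0.15, 1.0
p_equilibrium = iterate_selection(0.5, 1 - t, 1.0, 1 - s, generations=200)
print(round(p_equilibrium, 4), round(s / (s + t), 4))

# Mutation-selection balance with u = 1e-5
q_harmful = mutation_selection_balance(1e-5, 0.5)
q_lethal = mutation_selection_balance(1e-5, 1.0)
print(round(q_harmful, 4), round(q_lethal, 4))
```

Starting the overdominance iteration from other values of p between 0 and 1 should lead to the same equilibrium, which is what makes the polymorphism stable.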
Random effects (genetic drift)

The final force capable of changing allele frequencies is random and is referred to as genetic drift. It occurs in any population of finite size (and all populations are finite in size). The smaller the population, the more significant the effects of genetic drift. (Use Populus to illustrate random change in allele frequencies.)

Under drift alone in a finite population, eventually all but one allele at a locus will be lost. The length of time this takes is a function of population size: the larger the population, the slower the rate of loss. It isn't possible to predict with certainty which allele will be lost, although in general, the lower the frequency of an allele, the greater the chance it will be lost.

Elephant seals and cheetahs were driven to near extinction (cheetahs about 10,000 years ago), and both are largely devoid of genetic variation. This is because they were driven to very small population sizes, which is when the effects of drift are most marked.

Colonizing populations often go through a genetic bottleneck (a loss of genetic variation due to drift) and so also have lower levels of genetic variation, as in species that colonize an island with only a few founders (a phenomenon known as the founder effect).

Note that drift can result in the fixation of somewhat detrimental alleles if the population is small enough. Likewise, an allele favoured by natural selection can be lost through chance sampling from one generation to the next.

The effects of drift on DNA sequence data have become very important because they allow the formulation of a null model to explain DNA sequence diversity within species and between species. If the null model doesn't fit, then this provides evidence that natural selection may be responsible for sequence diversity or for sequence divergence between species.
Note that drift results in a loss of genetic variation, which may be counterbalanced by the reintroduction of genetic variation by mutation. A steady state can even be reached where the rate of loss by drift is balanced by mutation, leading to an equilibrium level of heterozygosity. This is often known as the neutral model of molecular evolution, and it forms the null hypothesis against which evidence for selection can be obtained at the molecular level.
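Random change in allele frequency in a finite population can be illustrated with a small simulation. The notes do not specify a sampling model, so this sketch assumes one: each generation, the 2N allele copies of the next generation are drawn at random according to the current allele frequency (commonly called Wright-Fisher sampling). The function name, population sizes and seed are arbitrary choices for illustration.

```python
import random

def simulate_drift(pop_size, p0, max_generations, rng):
    """Track the frequency of one allele under drift alone.

    pop_size is the number of diploid individuals, so 2 * pop_size allele
    copies are sampled each generation. Stops early on loss or fixation.
    """
    n_copies = 2 * pop_size
    p = p0
    trajectory = [p]
    for _ in range(max_generations):
        if p in (0.0, 1.0):        # allele lost or fixed: nothing left to sample
            break
        # Each copy in the next generation is the focal allele with probability p.
        count = sum(rng.random() < p for _ in range(n_copies))
        p = count / n_copies
        trajectory.append(p)
    return trajectory

rng = random.Random(1)             # fixed seed so a run can be repeated
small_pop = simulate_drift(pop_size=10, p0=0.5, max_generations=10_000, rng=rng)
large_pop = simulate_drift(pop_size=1000, p0=0.5, max_generations=50, rng=rng)

print("small population: generations simulated =", len(small_pop) - 1,
      "final frequency =", small_pop[-1])
print("large population after 50 generations: frequency =", large_pop[-1])
```

Repeating the small-population run with different seeds shows the two outcomes described above: the allele is eventually either lost or fixed, and which one happens cannot be predicted for a single run.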