Evolution and conservation genetics

Neutral model of evolution
What governs heterozygosity levels?

Neutral model of drift and mutation
• Single population
• Constant size
• Drift occurs at rate $1/(2N)$ per generation
• Mutation creates new or alternative alleles and prevents fixation of alleles

What model of mutation does a gene locus follow under the neutral model?
• Infinite Alleles Model
• Stepwise-Mutation Model

Infinite Alleles Model (IAM) (Crow-Kimura Model)
• Average protein contains about 300 amino acids (900 nucleotides), giving $4^{900} \approx 10^{542}$ possible alleles
• Mutations always occur to new alleles
• Finite population size (drift)
• How is loss of alleles due to drift balanced by new mutations?
• Do allozymes really fall under a mutation-drift process?

What is the equilibrium heterozygosity predicted by IAM?
F = probability that two alleles are both copies of the same ancestral allele (identical by descent, IBD)

$$F_t = \left[\frac{1}{2N_e} + \left(1 - \frac{1}{2N_e}\right) F_{t-1}\right](1-\mu)^2$$

• $1/(2N_e)$: probability that two alleles are IBD because they are copies of the same allele in the previous generation
• $\left(1 - 1/(2N_e)\right) F_{t-1}$: probability that they are not copies of the same allele in the previous generation but were already IBD
• $(1-\mu)^2$: both alleles do not mutate

At equilibrium $F_t = F_{t-1}$, so (ignoring terms of order $\mu^2$ and $\mu/N_e$)

$$F_t \approx \frac{1}{4N_e\mu + 1}$$

But we have two measures of homozygosity; both measure the same thing and thus equal each other:

$$\sum_i p_i^2 = F_t = \frac{1}{4N_e\mu + 1}$$

Can you derive this?
If H = 1 - F, then what is H at a mutation-drift equilibrium?

Heterozygosity at a mutation-drift equilibrium, given an IAM, is

$$H = \frac{4N_e\mu}{4N_e\mu + 1}$$

[Figure: equilibrium heterozygosity (y-axis, 0 to 1) against population size (x-axis, 10 to 10^7, log scale), with curves for μ = 0.001, μ = 10^-5 and μ = 10^-7]
• When the mutation rate is held constant, equilibrium heterozygosity increases with population size.
• When population size is held constant, equilibrium heterozygosity increases with the mutation rate.

Stepwise-mutation model (SMM) (Ohta and Kimura)
• Generated by slipped-strand mispairing; mutations occur only to adjacent states.
• Mutation can produce alleles already present in the population.
• Expect the equilibrium level of heterozygosity under SMM to be lower than that under IAM.
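The IAM recursion and its equilibrium can be checked numerically. The sketch below is only an illustration: the function names and the parameter values (Ne = 1000, μ = 10^-4) are assumptions chosen for the example, not values from the slides. It iterates the recursion for F and compares 1 - F with the closed-form H.

```python
def iam_equilibrium_h(ne: float, mu: float) -> float:
    """Approximate equilibrium heterozygosity under the IAM: 4*Ne*mu / (4*Ne*mu + 1)."""
    theta = 4.0 * ne * mu
    return theta / (theta + 1.0)


def iterate_f(ne: float, mu: float, generations: int, f0: float = 0.0) -> float:
    """Iterate F_t = [1/(2Ne) + (1 - 1/(2Ne)) * F_(t-1)] * (1 - mu)^2 from F_0 = f0."""
    f = f0
    drift = 1.0 / (2.0 * ne)
    for _ in range(generations):
        f = (drift + (1.0 - drift) * f) * (1.0 - mu) ** 2
    return f


if __name__ == "__main__":
    ne, mu = 1000, 1e-4  # hypothetical values for illustration
    h_closed_form = iam_equilibrium_h(ne, mu)
    h_iterated = 1.0 - iterate_f(ne, mu, generations=20_000)
    print(f"closed form H = {h_closed_form:.4f}, iterated H = {h_iterated:.4f}")
```

The two numbers agree only approximately, because the closed form drops the small terms in μ² and μ/Ne.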
Under the SMM, the equilibrium heterozygosity is

$$H = 1 - \frac{1}{\sqrt{8N_e\mu + 1}}$$

Genetic diversity and population size
• What is the effect of “finite” population size on gene frequencies?
• The various ways to study it mathematically
• Effective population size

Drift defined
Random changes of gene frequencies among generations
More important with:
• Small population sizes
• Fluctuation in population size
• Low selection and migration
• Long time periods

[Figure: a simple simulation of drift, “replicated outcomes” (mean frequency is dotted)]

[Figure: Buri’s (1956) classic genetic drift experiment, showing the number of wildtype versus neutral mutant alleles in populations of 16 Drosophila followed through time: “gene frequency distribution”]

Generalized effect of drift
• Allele frequencies do not change (much) on the landscape scale
• Within populations, drift decreases genetic variance
• Between populations, drift increases genetic variance

Consider the following to illustrate the principle: in a Buri-like experiment on 4 lines of n = 4 hermaphrodite snails, the frequency of an albinism allele was as follows at generations 2 and 6.
[Figure: two panels showing the number of albinism alleles per line (x-axis, 0 to 8). Generation 2: variance = 2.67. Generation 6: variance = 21.33]

Observed vs. expected changes of mean and variance of gene frequency

Loss of heterozygosity due to drift
Buri’s populations of 16 flies lost heterozygosity as expected for a population of about size nine

Effective population size
• Governs random change of gene frequency, p
• Depends on several factors: all those that reduce the size of the breeding population
• Ne = number of individuals in an ideal population which has the same magnitude of genetic drift as the actual population.
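The generalized effect of drift (variance falls within populations and rises between them, while the mean frequency barely moves) can be reproduced with a small simulation. This is a minimal sketch under Wright-Fisher assumptions (binomial sampling of 2N gene copies each generation). The function names, the seed, and the choice of 100 replicate lines are assumptions made for the example; only the population size of 16 echoes the Buri experiment.

```python
import random
from statistics import mean, pvariance


def drift_step(p: float, n_diploid: int, rng: random.Random) -> float:
    """Sample 2N gene copies from a parental frequency p; return the new frequency."""
    copies = 2 * n_diploid
    count = sum(1 for _ in range(copies) if rng.random() < p)
    return count / copies


def simulate_drift(n_diploid: int, p0: float, generations: int,
                   replicates: int, seed: int = 1) -> list:
    """Return a list of generations, each a list of allele frequencies (one per line)."""
    rng = random.Random(seed)
    freqs = [p0] * replicates
    history = [list(freqs)]
    for _ in range(generations):
        freqs = [drift_step(p, n_diploid, rng) for p in freqs]
        history.append(list(freqs))
    return history


def summarize(freqs: list) -> dict:
    """Mean frequency, between-line variance, and mean within-line heterozygosity."""
    return {
        "mean_p": mean(freqs),
        "var_between": pvariance(freqs),
        "mean_h_within": mean(2.0 * p * (1.0 - p) for p in freqs),
    }


if __name__ == "__main__":
    history = simulate_drift(n_diploid=16, p0=0.5, generations=20, replicates=100)
    for t in (0, 5, 10, 20):
        print(t, summarize(history[t]))
```

Within-line heterozygosity is measured here as 2p(1 - p), the value expected under random mating within each line.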
Wright-Fisher model
• Assume that the number of offspring is distributed as a Poisson variable with mean = 2 and variance = 2
• In this case, Ne = N
• No selection, random mating, random number of offspring

Factors reducing N to Ne
• Only adults of reproductive age count
• Sex ratio
• Variation in size over time
• Variation in offspring number
• Inbreeding (self-fertilization)

Factors reducing N to Ne -- 1
• Ne is usually less than the census population size
• Non-breeding individuals do not contribute juveniles: “bachelor males”, post-reproductives

Factors reducing N to Ne -- 2
• Different numbers of breeding individuals in the two sexes: one sex is represented by a small number of breeding individuals
• Example: captive-bred animals, where only one male is used for breeding
• Different numbers of males and females are analogous to having two different population sizes
• Unequal sex ratio: the effective population size is strongly influenced by the rarer of the two sexes.

Factors reducing N to Ne -- 3
• Variation in the number of offspring produced by different individuals
• Ne is smaller when offspring numbers are more unequal
• Ne can be larger when variation in offspring number is reduced

$$N_e = \frac{4N - 2}{V + 2}$$

• V is the variance of reproductive success
• What is the upper limit for effective population size?

Factors reducing N to Ne -- 4
• Variation of population size in different generations
• Consider the effect on loss of variation caused by the specific population size in generations 1, 2, 3, ..., t.
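The factors above can be turned into small calculators. This is a sketch, not the lecture's own material. The offspring-variance formula is the one on the slide. The sex-ratio formula, Ne = 4·Nm·Nf / (Nm + Nf), and the harmonic-mean rule for fluctuating size are standard results supplied here as assumptions, since no formula for them survives in this part of the text. All function names are hypothetical.

```python
def ne_sex_ratio(n_males: float, n_females: float) -> float:
    """Ne with unequal breeding sexes: 4*Nm*Nf / (Nm + Nf) (standard result, assumed)."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def ne_offspring_variance(n: float, v: float) -> float:
    """Ne = (4N - 2) / (V + 2), with V the variance in offspring number."""
    return (4.0 * n - 2.0) / (v + 2.0)


def ne_fluctuating(sizes: list) -> float:
    """Long-run Ne as the harmonic mean of per-generation sizes (assumed rule)."""
    return len(sizes) / sum(1.0 / n for n in sizes)


def heterozygosity_retained(sizes: list) -> float:
    """Fraction of H left after passing through each size: product of (1 - 1/(2N_i))."""
    retained = 1.0
    for n in sizes:
        retained *= 1.0 - 1.0 / (2.0 * n)
    return retained


if __name__ == "__main__":
    print(ne_sex_ratio(1, 99))               # one breeding male, 99 females
    print(ne_offspring_variance(100, 2.0))   # Poisson-like variance
    print(ne_offspring_variance(100, 0.0))   # every parent leaves exactly two offspring
    print(ne_fluctuating([1000, 10, 1000]))
    print(heterozygosity_retained([1000, 10, 1000]))
```

With V = 0 the offspring-variance formula gives 2N - 1, which bears on the question of the upper limit for Ne.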
Harmonic mean: occasional severe reductions in population size will predominate over long stretches of stable large population size in reducing variability

$$\frac{1}{N_e} = \frac{1}{t}\sum_{i=1}^{t}\frac{1}{N_i}$$

Example: N = 1000, 10, 1000 gives a harmonic mean of about 29.

Factors reducing N to Ne -- 5
• Self-fertilization increases homozygosity (the most extreme form of close inbreeding, or mating between relatives)
• f = fraction of loci in which both alleles are copies of an immediate ancestor
• $N_e = N / (1 + f)$

Effective size in continuous populations
• What if there is one population, mating occurs between nearby individuals, and progeny are dispersed a short distance?
• “Neighborhood size” (Wright 1943)
• Number of individuals within which 95% of the alleles derive from the previous generation (twice the standard deviation of gene flow in one direction, … don’t worry about the formula…)
• Mainly applied to plants, Ne = 500-1000; why?

Estimation of effective population size
• Demographic data (variance of number of offspring, variation of population size): direct… but usually difficult to obtain
• Can use genetic data:
  • reconstruct parentage of current population (paternity analysis, in a few weeks)
  • temporal changes of gene frequency (need to separate from sampling variance)
  • heterozygote excess, between few parents (only applicable to very small populations)

Heterozygosity vs. allele number as indicators of variation
Rarer alleles are lost in bigger bottlenecks (n)

Predicted reduction of H (= 1/N)    Observed reduction of Na (= (8-n)/8)
0.001                               0.000
0.005                               0.024
0.050                               0.518
0.100                               0.664
0.500                               0.831

Rarer genes are lost faster than predicted by the heterozygosity model!
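The contrast between heterozygosity and allele number can be illustrated with expected values for a one-generation bottleneck. This is a sketch under stated assumptions: an allele at frequency p is lost with probability (1 - p)^(2N) when 2N gene copies are sampled, and heterozygosity falls by 1/(2N), the diploid drift rate. The eight allele frequencies are hypothetical, chosen only to include rare alleles, so the output does not reproduce the table above.

```python
def expected_alleles_retained(freqs: list, n_diploid: int) -> float:
    """Expected number of alleles still present after sampling 2N gene copies."""
    copies = 2 * n_diploid
    return sum(1.0 - (1.0 - p) ** copies for p in freqs)


def proportion_alleles_lost(freqs: list, n_diploid: int) -> float:
    """Expected proportional reduction in allele number."""
    return 1.0 - expected_alleles_retained(freqs, n_diploid) / len(freqs)


def proportion_h_lost(n_diploid: int) -> float:
    """Expected proportional reduction in heterozygosity in one generation: 1/(2N)."""
    return 1.0 / (2.0 * n_diploid)


if __name__ == "__main__":
    # Hypothetical frequencies for 8 alleles (they sum to 1); not data from the slides.
    freqs = [0.40, 0.30, 0.15, 0.05, 0.04, 0.03, 0.02, 0.01]
    for n in (2, 10, 100):
        print(n, round(proportion_h_lost(n), 3),
              round(proportion_alleles_lost(freqs, n), 3))
```

For these frequencies the proportional loss of alleles exceeds the proportional loss of heterozygosity at every bottleneck size tried, because rare alleles contribute little to H but are the first to disappear.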
Bottlenecks and founding effects
• These are special cases of genetic drift
• Especially important in conservation genetics

The Founder effect
• New populations are often started by small numbers of migrants (analogous to a bottleneck)
• They carry only a fraction of the genetic variability of the parental population
• New populations tend to differ randomly both from the parent population and from each other, and tend to be “inbred”
• Applies to:
  • Invasive species
  • Island colonists
• Examples…
  • Amish of Lancaster Co., PA (Ellis-van Creveld syndrome)
  • Pirates of Pitcairn Isle

The Cheetah bottleneck
• 15,000 to 20,000 cats in the wild
• All sampled cheetahs share the same allozymes (Cohn 1996): homozygosity of 100%, population 0% polymorphic
• For genes mediating immune response, foreign skin is recognized as their own
• Why? Two bottlenecks: one 10,000 years ago and another in the last two centuries
• Work of Stephen J. O’Brien and collaborators (Cat genome project)

Intrinsic rate of growth affects H after a bottleneck
[Figure: dotted line, N = 10; solid line, N = 2]
Loss of alleles mainly depends on bottleneck size, not on the rate of growth following the bottleneck
Genetics 144: 2001-2014 (December, 1996)

Heterozygosity excess: difference between the observed heterozygosity and the heterozygosity expected from the observed number of alleles.
Journal of Heredity, 1998

Data from real populations

Inference of colonization history: the Northern elephant seal
• Formerly ranged from Mexico to California
• Hunted and collected to death
• Few survivors on Isla Guadalupe, Mexico (10-100?)
• Currently 200,000, many in Central/Southern Calif.
• How small was the bottlenecked population?
Attempted reconstruction of the bottleneck of Northern Elephant Seals
• Currently, two mitochondrial DNA haplotypes have frequencies 0.27 and 0.73, giving He = 0.40
• Museum sample of pre-bottleneck individuals gave He = 0.80
• Use

$$H_t = H_0 \left(1 - \frac{1}{2N_{e1}}\right)\left(1 - \frac{1}{2N_{e2}}\right)\cdots\left(1 - \frac{1}{2N_{et}}\right)$$

• This allows Ne to increase following the bottleneck (1922-1960)
• Rate of increase about 1.7 per generation
• Allows the population to grow from 15 to 200,000 in 38 years
• A one-generation bottleneck of 15 gives H0 = .80, H1 = .59, H2 = .50, H3 = .45 … to H = .40 very shortly
• But microsatellites don’t show such a reduction of diversity; why?

Inbreeding due to small population size
– Has predictable consequences for allele frequencies and genotype frequencies:
  • Increases the frequency of homozygous genotypes
– Similar in effect to:
  • Genetic drift
  • Variation in population size over time
  • Skewed sex ratios, etc.
– Two “kinds” of inbreeding:
  • nonrandom (self-fertilization)
  • random

Random inbreeding

Mutational meltdown?
Populations enter a positive feedback loop
– Inbreeding depression increases, population size decreases
– Effect of drift increases: deleterious mutations become fixed
– As deleterious mutations become fixed, inbreeding depression increases
– Maybe the population dies!

Among-population gene diversity
• Within populations (so far)
• Between populations

Genetic variation in space and time in populations
• Genetic structure of populations and frequency of alleles vary in space or time
• Variation across space: allele frequency clines in the blue mussel.
• Variation across time: temporal variation in a prairie vole (Microtus ochrogaster) esterase gene.
Measuring Genetic Differentiation: Fst
Fst = normalized variance in allele frequencies among populations

$$F_{ST} = \frac{\mathrm{Var}(p)}{p^{*}(1 - p^{*})}$$

where Var(p) is the variance in the frequency of allele p among populations and p* is the observed mean allele frequency across populations.

Or Fst = the relative reduction in gene diversity in a single population compared to pooling all populations

$$F_{ST} = \frac{H_t - H_s}{H_t}$$

where Ht is the expected heterozygosity for a pooled sample of alleles and Hs is the average expected heterozygosity within each sub-population.

Wright’s F statistics
Separate components of genetic variation into a hierarchy. How much genetic variation is contained in:
• a subpopulation compared to a region
• a region compared to the total
• a subpopulation compared to the total

Partition of Wright’s F
In a general sense, F is the probability that two alleles share a common ancestor (identity by descent)
• Total F = Fit (individual-total)
• Local F = Fis (individual-subpopulation)
• Regional F = Fst (subpopulation-total)

$$F_{IT} = F_{IS} + (1 - F_{IS})\,F_{ST}$$

If it ain’t locally inbred, then maybe it is regionally
Fundamental concept; can be defined for any number of levels

Stepwise mutation model (for SSRs = simple sequence repeats = microsatellites)
• Mutation is a progressive change, so fragments that migrate similar distances have had few mutations.
• In the case of SSRs, mutation is assumed to change the number of repeats, increasing or decreasing step by step.
• The square of the difference in the number of repeats between 2 microsatellites is proportional to the time of divergence from a common ancestor.
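The two definitions of Fst given above, and the partition Fit = Fis + (1 - Fis)·Fst, can be sketched for one biallelic locus. The function names and the example frequencies are hypothetical. The sketch assumes equally sized subpopulations, random mating within each (so expected heterozygosity is 2p(1 - p)), and the population variance (dividing by the number of subpopulations).

```python
from statistics import mean, pvariance


def fst_from_variance(freqs: list) -> float:
    """Fst = Var(p) / (p* (1 - p*)); undefined when the mean frequency is 0 or 1."""
    p_bar = mean(freqs)
    return pvariance(freqs) / (p_bar * (1.0 - p_bar))


def fst_from_heterozygosity(freqs: list) -> float:
    """Fst = (Ht - Hs) / Ht for one biallelic locus."""
    p_bar = mean(freqs)
    h_t = 2.0 * p_bar * (1.0 - p_bar)                 # pooled expected heterozygosity
    h_s = mean(2.0 * p * (1.0 - p) for p in freqs)    # mean within-subpopulation value
    return (h_t - h_s) / h_t


def fit_from_fis_fst(f_is: float, f_st: float) -> float:
    """Fit = Fis + (1 - Fis) * Fst."""
    return f_is + (1.0 - f_is) * f_st


if __name__ == "__main__":
    freqs = [0.2, 0.8]  # hypothetical allele frequencies in two subpopulations
    print(fst_from_variance(freqs), fst_from_heterozygosity(freqs))
    print(fit_from_fis_fst(0.1, 0.2))
```

Under these assumptions the two definitions return the same value, since Ht - Hs equals twice the variance in p.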
Partitioning variation of SSRs: Rst
• Differentiation based on variance in allele sizes between populations (Slatkin 1995)
• Microsatellite analog of Fst that explicitly takes into account mutational differences among alleles

$$R_{ST} = \frac{S - S_w}{S}$$

where S is the average squared difference in size of all alleles and Sw is the average sum of squares of the differences in allele sizes within each population.
• Analogous to $F_{ST} = (H_t - H_s)/H_t$
• Assumes the stepwise mutation model and weights differences between alleles by size (= number of repeats) differences

Issues with the use of the marker type (Hedrick. 1999. Evolution 53:313-318)
• High level of variation constrains the maximum value of Fst that is possible
• Max $F_{ST} < 1 - H_s$, that is, the observed level of homozygosity
• Complicates interpretation of the significance of Fst values
• Biological significance of statistically significant but small values of Fst (e.g. 0.01) from microsatellite data

Genetic distance
• Measures the genetic “difference” between populations; an alternative to variance partitioning
• Proportional to the time of separation from a common ancestor
• Between-population distance increases with time, due to genetic drift and mutation
• Four major models:
  • Mutation to infinite alleles: isozymes, sometimes microsatellites
  • Stepwise mutation: microsatellites
  • Genetic drift causes random changes of gene frequencies
  • Mutation in the nucleotide sequence

Genetic distance: infinite allele model of Nei
Expected homozygosities within and between populations
Two populations, "x" and "y"
• $J_x = \sum_i p_{ix}^2$ = probability that two alleles from population x are the same (expected homozygosity)
• $J_y$ is likewise defined for population y
• $J_{xy} = \sum_i p_{ix}\,p_{iy}$ = probability that two alleles chosen from different populations x and y are the same

Nei's gene identity

$$I = \frac{J_{xy}}{\sqrt{J_x J_y}}$$

• Analogous to a correlation coefficient
• With multiple loci, take the average of $J_x$, $J_y$, $J_{xy}$ over loci

Nei’s genetic distance

$$D = -\ln(I)$$

• Increases linearly with time under the infinite allele mutation model

Genetic distance:
stepwise mutation model
• Based on the squared difference of mean allele size
• $\mu_x$ = mean allele size for population x
• $\mu_y$ = mean allele size for population y

$$(\delta\mu)^2 = (\mu_x - \mu_y)^2$$

• Take the average over multiple loci
• Increases linearly with time under stepwise mutation
• Highly dependent on the allele size distribution
• Often Nei’s infinite allele model is better
PNAS December 2, 2008
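Both distances can be sketched in a few lines. This is an illustration under assumptions: each population is given as a list of loci, allele frequencies at a locus are listed in the same allele order for both populations, and allele sizes are repeat counts. The function names and the example numbers are hypothetical.

```python
import math
from statistics import mean


def nei_identity(loci_x: list, loci_y: list) -> float:
    """I = Jxy / sqrt(Jx * Jy), with Jx, Jy, Jxy averaged over loci."""
    jx = mean(sum(p * p for p in locus) for locus in loci_x)
    jy = mean(sum(p * p for p in locus) for locus in loci_y)
    jxy = mean(sum(a * b for a, b in zip(lx, ly)) for lx, ly in zip(loci_x, loci_y))
    return jxy / math.sqrt(jx * jy)


def nei_distance(loci_x: list, loci_y: list) -> float:
    """D = -ln(I); infinite when the populations share no alleles (I = 0)."""
    identity = nei_identity(loci_x, loci_y)
    return math.inf if identity == 0 else -math.log(identity)


def delta_mu_squared(sizes_x: list, sizes_y: list) -> float:
    """Average over loci of (mean allele size in x - mean allele size in y)^2."""
    return mean((mean(sx) - mean(sy)) ** 2 for sx, sy in zip(sizes_x, sizes_y))


if __name__ == "__main__":
    # One locus, two alleles; frequencies are hypothetical.
    pop_x = [[0.5, 0.5]]
    pop_y = [[1.0, 0.0]]
    print(nei_identity(pop_x, pop_y), nei_distance(pop_x, pop_y))
    # One microsatellite locus; repeat counts of sampled alleles are hypothetical.
    print(delta_mu_squared([[10, 12]], [[14, 16]]))
```

Averaging the J terms over loci before taking the ratio follows the multi-locus rule stated for Nei's identity. Taking the per-locus squared difference before averaging follows the "average over multiple loci" step for the stepwise distance.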