Random Allelic Variation, AKA Genetic Drift

Genetic Drift
- a non-adaptive mechanism of evolution (therefore, a theory of evolution) that sometimes operates simultaneously with others, such as natural selection
- the frequency of gene copies (i.e., alleles) in any generation of adult organisms represents only a sample of the gene copies carried by gametes of the previous generation, and the sample is subject to random variation, i.e., “sampling error”

Beginning with the Hardy-Weinberg model
- no mutation
- no selection
- no gene flow
But with one wrinkle in drift
- finite population size
Result
- random changes in allele frequency (there is never a change in allele frequency in the Hardy-Weinberg model)

Definitions
- monomorphy: no allelic variation at a locus in a population
- polymorphy: multiple alleles at a locus in a population
- fixation, fixed: describes an allele whose frequency has reached 100%, making the locus monomorphic
- N = censused population size
- private allele: an allele unique to only one population, but not necessarily fixed within it
- cline: continuous change in allele frequency along a geographic transect, the hallmark of gene flow
- deme: a reproductively isolated or semi-isolated population, i.e., reduced or no gene flow among or between demes
- metapopulation: a collection of conspecific demes

Two models of drift
- Random Walk: prospective, looking forward in time
- Coalescence: retrospective, looking backward in time

Random Walk Model (Monte Carlo Markov Chain)
- prospective, looking forward in time
- the state at time t is determined only by the state at time t-1 plus a random event
- example:
  - stand at the sundial on the Horseshoe in front of Hadley Hall, facing west
  - take one step forward, flip a coin, and move one step to the right if heads or one step to the left if tails
  - repeat until you reach Espina St or run into either North Horseshoe Rd or South Horseshoe Rd

[Figure: map of the Horseshoe, labeled Espina St, South Horseshoe Rd, and North Horseshoe Rd]

The Horseshoe as a graph of allele frequency
[Figure: the Horseshoe drawn as a graph, labeled Espina St, South Horseshoe Rd, and North Horseshoe Rd]
- y-axis (vertical) = allele frequency
- x-axis (horizontal) = time in generations
- p = 0 at North Horseshoe Rd (extinction of A1, fixation of A2)
- p = 1.0 at South Horseshoe Rd (fixation of A1, extinction of A2)
- the sundial is halfway between, p = 0.5 at time t = 0
- the outcome will differ every time because of the random component

Unlike the coin-toss exercise, in which the probability of heads and tails remains equal, the probability of an allele being represented in a gamete changes with each generation
- the probability of an allele being represented in a gamete is equal to the new allele frequency
- this will tend to ensure allele fixation or loss

The width of the Horseshoe (i.e., North-South) is analogous to population size (N)
- the smaller the population, the narrower the width (or more specifically, the greater the variance of change)
- the smaller the population, the greater the sampling error of gametes, and the more likely and the more rapidly an allele will become fixed (100%, monomorphic) or go extinct (0%)
- variance is higher in small samples:

$$V = \frac{p(1-p)}{2N}$$ (N = population size)

If drift is random (by definition, it is), then how can you predict change in allele frequency or which allele will become fixed?
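One way to explore this question is to run the random walk many times and count the outcomes. The sketch below is an illustration only and is not part of the original slides. It assumes a Wright-Fisher-style model in which each of the 2N gene copies in the next generation is drawn independently with probability equal to the current allele frequency. The function names `drift_walk` and `fixation_fraction` are hypothetical.

```python
import random


def drift_walk(p0, n_diploid, max_generations=100_000, rng=None):
    """Follow one allele until it is fixed or lost.

    Assumption (not from the slides): Wright-Fisher sampling, i.e. each of the
    2N gene copies of the next generation is drawn independently with
    probability equal to the current allele frequency.
    Returns (final_frequency, generations_elapsed).
    """
    if rng is None:
        rng = random.Random()
    copies = 2 * n_diploid           # diploid population: 2N gene copies
    count = round(p0 * copies)       # starting number of copies of the allele
    generations = 0
    while 0 < count < copies and generations < max_generations:
        p = count / copies           # sampling probability = current frequency
        count = sum(1 for _ in range(copies) if rng.random() < p)
        generations += 1
    return count / copies, generations


def fixation_fraction(p0, n_diploid, replicates, seed=0):
    """Fraction of independent walks that end with the allele fixed (p = 1)."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(replicates):
        final_p, _ = drift_walk(p0, n_diploid, rng=rng)
        if final_p == 1.0:
            fixed += 1
    return fixed / replicates


if __name__ == "__main__":
    for n in (5, 20):
        frac = fixation_fraction(0.5, n, replicates=500, seed=1)
        print(f"N = {n}: fraction of walks ending in fixation = {frac:.2f}")
```

Each individual walk ends unpredictably at one edge or the other, but across many replicates the share of walks that end in fixation can be compared with the starting frequency, and walks in smaller populations can be compared for how quickly they end.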
- Probability of fixation = p
- Probability of extinction = 1 - p
- for any new mutant, probability of fixation = initial frequency:

$$p = \frac{1}{2N}$$

- intuitively, the probability of fixation of a new mutant by chance alone is greater in a small population than in a large one
- average ‘time’ to fixation by drift (without selection) = 4N generations (in diploid species)

Coalescent Model (Coalescent Theory)
- any two lineages can be traced backward in time to a common ancestor
- alleles, haplotypes, or lineages are said to “coalesce” at that generation of common ancestry
- example: a haploid non-recombining bacterium
  - in each generation, the bacterium may die, survive but not reproduce, or survive and reproduce
  - thus, in a population of finite size, if some lineages leave no descendants while others reproduce, eventually all individuals will be descendants of just one single lineage
  - barring consideration of new mutants, initially polymorphic populations become increasingly closely related (as descendants of a single common ancestor) as allelic variation is lost by fixation and extinction

[Figure: the same example, in color]
[Figure: histogram of generations to coalescence of lineages, http://www.csbio.unc.edu/mcmillan/index.py?run=Courses.Comp790S09]

‘Time’ to coalescence of alleles
- 4N generations (diploids)
- 2N generations (haploids)
- 1N generations in maternally inherited haploid organellar DNA (i.e., mitochondria, chloroplast), because the paternal lineage ends in every generation
Conclusion
- coalescence is fast in small populations
- drift is greatest in small populations
- note that time to coalescence is exactly the same as time to fixation under the random walk model (retrospective = prospective)

Coalescent Theory Predicts (in the absence of gene flow, mutation, selection)
- allele or haplotype frequencies fluctuate at random but, in finite populations, one will become fixed
- individual populations lose their genetic variation
- initially similar populations diverge in allele frequencies
  by chance alone, because they become fixed for different alleles or different combinations of alleles at unlinked loci
- the probability that an allele will ultimately become fixed is equal to its frequency in the population in any given generation
- rate of fixation (or loss) is greater in small populations

Distinct evolutionary histories of species and their genes
- polymorphism arises before speciation
[Figure modified from Ebersberger et al., Mol Biol Evol 2007]

Lineage Sorting
- the time-dependent process by which species lose their ancestral polymorphism through genetic drift
Hemiplasy
- genes or characters with different evolutionary histories than the species that possess them, most often due to incomplete lineage sorting (ILS)
- the shorter the time between speciations, the more ILS and hemiplasy
[Figure: ancestral polymorphism, with three lineages labeled "complete"; Robinson et al. 2008, PNAS 105:14477-14481]

How is hemiplasy manifested?
- mosaic genomes with discordant gene trees among three or more species that diverged in rapid succession
[Figure: percentage of 25,000 genes most closely related between human-chimp, chimp-gorilla, and human-gorilla; Ebersberger et al., Mol Biol Evol 2007]

Heterozygosity (H)
- single-locus H: the number of individuals in a population that are heterozygous for a given locus
- multilocus H: the number of loci that are heterozygous in an average individual
- H is highest in a population with equal numbers of homozygotes
- within demes, drift fixes alleles
- across the metapopulation, allele frequencies remain unchanged, but genotype frequencies deviate from Hardy-Weinberg equilibrium, i.e., heterozygosity (H) decreases
[Figure: ten populations, red and blue alleles; panmictic with gene flow, high H; demic with genetic structure, low H]

Effective Population Size (Ne)
- the Ne of an actual population is equal to the censused population size (N) of an “ideal” population (i.e., in which all individuals breed and contribute equally to the gene pool) that would show the amount of drift actually observed
  and measured by heterozygosity (H)
- typically, Ne < N because of:
  - sex bias (the less numerous sex limits Ne)
  - reproductive variance of the sexes (the polygamous sex limits Ne)
  - overlapping generations
  - fluctuations in population size, e.g., past bottlenecks
  - ploidy

Founder Effect
- the principle that the founders of a new colony carry only a fraction of the total genetic variation of the source population
- genetic drift will have a strong effect on small founding populations
- most rare alleles will not be represented, a few will be overrepresented
- initially, H tends to be similar in source and founder populations because H is most influenced by common alleles
- but H decreases rapidly in founder populations, more so in small populations, less so in populations with high intrinsic growth rate (r)

Logistic Growth curve

$$\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right)$$

Examples of drift: Buri 1956
- fixation of eye color allele from initial freq = 0.5 in 107 populations of Drosophila in 19 generations

Examples of drift: Baker and Moeed 1987
- Mynah birds are indigenous to India
- Mynahs were introduced by humans to Australia, New Zealand, Fiji, and Hawaii in the 1800s
- among natural populations of Mynahs, Nei’s D = 0.001 (a genetic distance that describes the inverse correlation coefficient of shared alleles)
- among naturalized populations, Nei’s D = 0.006, equivalent to sub-species differences in about a century
- also, most rare alleles were lost, but some increased from p = 0.01 to p = 0.08

Inbreeding (Assortative Mating, compared to drift)
- the antithesis of panmixia, panmixis, random mating
- Inbreeding Coefficient (F): the frequency of autozygous individuals in a population
- Autozygous (“identical by descent”): both alleles in a homozygous individual were inherited directly from a single haploid allele in an ancestor (e.g., grandparent)
- Allozygous: not identical by descent; either homozygous or heterozygous
- in an inbred population, H is low

Pedigree with Identity by Descent
[Figure: pedigree over the Parental, F1, and F2 generations with genotypes ♂ A1A2*, ♀ A1A2, ♂ A1A2*, ♀ A2A2*, and A2*A2* produced by inbreeding]

Genotype Frequencies with Inbreeding

              A1A1          A1A2          A2A2
allozygous    p^2 (1-F)     2pq (1-F)     q^2 (1-F)
autozygous    + pF                        + qF

- as F → 1.0, the frequency of autozygous homozygotes increases at the expense of all allozygous genotypes
- the greater F, the faster H decreases
- selfing (self-fertilization): H is halved in each generation
- inbred population: F > the frequency of autozygous individuals in a panmictic population
- F = 1 fully inbred, F = 0 no inbreeding

Comparison of Inbreeding to H-W equilibrium
- allele frequencies do not change (necessarily)
- genotype frequencies do change
- phenotypic variance usually increases due to loss of heterozygotes
- inbreeding depression (reduction in mean of phenotype due to increased expression of recessive alleles in homozygous genotype)
  - number of homozygous recessive genotypes increases
  - mean fitness of population decreases
  - which, when coupled with natural selection, can then change allele frequency
- can promote linkage disequilibrium due to lack of heterozygotes, even if loci are not physically linked

Comparison of Inbreeding to Drift
- both genetic drift and inbreeding lead to deviation from Hardy-Weinberg equilibrium
- heterozygosity decreases in small demes
- genetic drift causes change in allele frequency (and consequently genotype frequency)
- inbreeding causes change in genotype frequency (but not allele frequency in the absence of selection)
- both cause a decrease in heterozygosity

Neutral Theory (Motoo Kimura 1968)
- original thesis: there are too many genes for selection to act in any significant way on all simultaneously; no population is sufficiently large to bear the reduction in fitness
- now better understood as a balance between new mutation and genetic drift; the genome is in a constant state of flux
- consistent with:
  - molecular clock
  - large percentage of non-coding and non-conserved DNA
  - redundancy of genetic code

Gene Flow
- homogenization of metapopulations
- addition of
  alleles/genotypes to demes
- opposite effect of drift
- complete gene flow = panmixia
- recall models of gene flow: island, stepping stone, isolation by distance, extinction and recolonization
- m = rate of gene flow = % of gene copies carried into a population from outside per generation
- Nm = number of immigrants per generation, a measure of gene flow
- FST = fixation index ≈ % of the total population's genetic variation that is due to differences among sub-populations
- genetic structure: a structured population is one with high FST, that is, the subpopulation is not representative of the total
- as Nm ↑, equilibrium FST ↓, genetic structure ↓
- it takes very few immigrants to homogenize populations
- yet, typically populations are very structured; gene flow is surprisingly low
- direct estimates of gene flow (mark and recapture studies) suggest more gene flow than typically measured by FST
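The inverse relationship between Nm and equilibrium FST, and the point that very few immigrants are enough to homogenize populations, can be illustrated numerically. The slides do not give a formula, so the sketch below assumes Wright's island-model approximation, FST ≈ 1/(4Nm + 1), which holds for diploids with neutral alleles, many demes, and a small migration rate m. The function name `equilibrium_fst` is hypothetical.

```python
def equilibrium_fst(nm):
    """Approximate equilibrium FST for a given number of immigrants (Nm).

    Assumption (not from the slides): Wright's island-model approximation
    FST ~ 1 / (4*Nm + 1) for diploid organisms and neutral alleles.
    """
    if nm < 0:
        raise ValueError("Nm cannot be negative")
    return 1.0 / (4.0 * nm + 1.0)


if __name__ == "__main__":
    # FST falls steeply as the number of immigrants per generation rises.
    for nm in (0.0, 0.25, 1.0, 5.0, 10.0):
        print(f"Nm = {nm:5.2f}  ->  equilibrium FST ~ {equilibrium_fst(nm):.3f}")
```

Under this approximation, one immigrant per generation already brings the equilibrium FST down to about 0.2, whatever the size of the receiving deme.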