Evolution and Population Genetics
Xiaole Shirley Liu
STAT115 / STAT215

Evolution
• Evolution is a gradual change in genetic makeup from one generation to the next
• Natural selection: a nonrandom process
• Mutation, genetic drift, …: random processes
• Natural selection and genetic drift are the two most important causes of allele substitution in populations

Evolution
• Evolution creates species-specific and population-specific differences
• Are they all selected for advantages to the species or population?
Some definitions:
• Locus: position on a chromosome where a sequence or a gene is located
• Allele: alternative form of DNA at a locus
  – Written as A vs a, or A vs B

Natural Selection
• What about transgenerational epigenetic inheritance? Controversial

Phenotypic vs Molecular Evolution
• Phenotypic evolution is controlled by natural selection
• Molecular mutations are selectively neutral in the strict sense that their fate in evolution is largely determined by random genetic drift
• Genetic drift is due to sampling error
[Photo: Motoo Kimura]

Random Fluctuation in Allele Frequencies
[Figure: demes within a metapopulation; neutral allele frequencies (p, q) fluctuating over time]
• A drunk traveler staggering on a train platform with tracks on both sides will eventually fall off the edge of the platform onto one or the other track

Genetic Drift
• Over time, the allele frequency in each sub-population fluctuates, and diversity in each sub-population decreases until an allele is fixed (100%) or lost (0%)

Factors Influencing Genetic Drift
• Deme: a local population of closely related individuals that typically breed with one another
• An initial mutation (allele) occurs in a deme of N individuals (effective population size)
• Assuming neutral evolution, its probability of being sampled in the offspring is 1/(2N)
• The likelihood of a mutation being fixed is its initial frequency, 1/(2N): in a smaller population it is more likely to be fixed; in a larger population it is more likely to be lost
• Founder effect: a new colony starts from a few members (small N) of the initial population

Factors Influencing Genetic Drift
• An allele's probability of fixation equals its frequency at that time and is not affected by its previous history
• In a diploid population, the average time to fixation of a newly arisen neutral allele that does become fixed is 4N generations: evolution by genetic drift proceeds faster in small than in large populations
• Bottleneck: a drastic population decrease for at least one generation accelerates fixation

Factors Influencing Genetic Drift
• Initially genetically identical demes can evolve by chance to have different genetic constitutions
• Pr(mutation X will fix) = its allele frequency
• Among genetically identical demes in a metapopulation, the average allele frequency does not change, but heterogeneity within each deme declines to 0

The Neutral Theory of Molecular Evolution
• Most mutations (genetic variations) are fixed by genetic drift: they are selectively neutral and lack adaptive significance
• Some mutations are disadvantageous and are eliminated
• Only a minority of mutations are advantageous and fixed by natural selection

Break

By comparing DNA changes among populations we can trace their history
Population 1: ATGTAACGTTATA
Population 2: ACGTAACGTTATA
Population 3: ACGAAACGTTATA
Population 4: ACGAAACCTTATA

From Phylogeny to Selection
• The protein-coding portion of DNA has synonymous and nonsynonymous substitutions, so some DNA changes have no corresponding protein change.
• If the synonymous substitution rate (dS) is greater than the nonsynonymous substitution rate (dN), the DNA sequence is under negative (purifying) selection.
• If dS < dN, the sequence is under positive selection, e.g. a duplicated gene may evolve rapidly to assume new functions.
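To make the synonymous vs nonsynonymous distinction concrete, here is a minimal Python sketch that classifies codon differences between two aligned coding sequences using the standard genetic code. The function name `count_substitutions` and the two example sequences are hypothetical, not from the lecture. The sketch only counts differing codons; a real dN/dS estimate also normalizes by the number of synonymous and nonsynonymous sites, which is deliberately left out here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... with the third base varying fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Assumes both sequences are aligned, in frame, gap-free, and contain
    only the letters A, C, G, T (illustrative simplification).
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue  # identical codons carry no substitution
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1  # DNA changed, protein did not
        else:
            nonsynonymous += 1  # DNA change alters the amino acid
    return synonymous, nonsynonymous


# Hypothetical example: Met-Leu-Glu vs Met-Leu-Asp
syn, nonsyn = count_substitutions("ATGCTTGAA", "ATGCTCGAT")
print(syn, nonsyn)  # prints "1 1"
```

In the example, CTT and CTC both encode leucine (a synonymous change), while GAA to GAT replaces glutamate with aspartate (a nonsynonymous change).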
Molecular Clock
• Molecular evolutionary substitutions proceed at an approximately constant rate, so the sequence difference between species acts as a molecular clock
• If sequences evolve at constant rates (a big if), they can be used to estimate the times at which sequences diverged, much like dating fossils by radioactive decay

Molecular Clock
• L = number of nucleotides compared between two sequences
• N = total number of substitutions
• K = N / L, number of substitutions per nucleotide
  – E.g. K = 0.093 for rat versus human
• r = rate of substitution (mutations) = $0.56 \times 10^{-9}$ per site per year
• $r = K / (2T)$, so $T = K / (2r) = 0.093 / (2 \times 0.56 \times 10^{-9}) \approx 8.3 \times 10^{7}$ years, i.e. roughly 80 million years
Graur and Li (1999)

Factors Influencing Mutation Rate / Molecular Clock
• Generation time (age to reproduction)
• Population size (stronger drift in small populations)
• Intensity of natural selection
• Species-specific differences
• When two species are very divergent, over a sufficiently long time some sites experience repeated base substitutions, so the observed number of differences plateaus

Factors Influencing Mutation Rate / Molecular Clock
• Generation time (age to reproduction)
• Population size (stronger drift in small populations)
• Intensity of natural selection
• Species-specific differences
• Change in protein function

Constant Mutation Rate?
Page & Holmes

Where did we come from?
• Two competing hypotheses
  – Multiregional evolution: about 1 million years ago, Homo erectus left Africa and evolved into modern humans in different parts of the Old World
  – The Out of Africa hypothesis: Homo erectus was displaced by new populations of modern humans that left Africa 100K to 50K years ago
• National Geographic story, Jan 2014
• If a fragment of DNA is shared by Neanderthals and non-Africans, but not by Africans or other primates, it is likely to be a Neanderthal heirloom
• People living outside Africa carry 1-4% Neanderthal DNA (skin, hair, etc.)
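Returning to the molecular clock arithmetic from the Molecular Clock slide, the following Python sketch reproduces the rat versus human calculation. The function names are hypothetical; K and r are the values quoted on the slide. The factor of 2 reflects that both lineages accumulate substitutions after they diverge.

```python
def substitutions_per_site(n_substitutions, n_sites):
    """K = N / L: substitutions per nucleotide compared."""
    if n_sites <= 0:
        raise ValueError("number of compared sites must be positive")
    return n_substitutions / n_sites


def divergence_time_years(k, r):
    """T = K / (2r), assuming a constant substitution rate r per site per year."""
    if r <= 0:
        raise ValueError("substitution rate must be positive")
    return k / (2.0 * r)


K = 0.093      # rat versus human, from the slide
r = 0.56e-9    # substitutions per site per year, from the slide
T = divergence_time_years(K, r)
print(f"{T / 1e6:.0f} million years")  # prints "83 million years"
```

The exact quotient is about 83 million years, which the slide rounds to 80 million. The estimate is only as good as the constant-rate assumption.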
Break

Polymorphism
• Polymorphism: sites/genes with "common" variation, where the less common allele has frequency >= 1%; otherwise the variant is called rare and the site is not polymorphic
• Single Nucleotide Polymorphism (SNP)
  – Comes from a DNA-replication mistake in an individual germ line cell, which is then transmitted
  – ~90% of human genetic variation
• Copy number variations
  – May or may not be genetic

Why Should We Care
• Disease gene discovery
  – Association studies, e.g. certain SNPs are associated with susceptibility to diabetes
  – Chromosome aberrations: duplication / deletion might cause cancer
• Personalized medicine
  – A drug may be effective only if you have one allele

SNP Distribution
• Most common, 1 SNP / 100-300 bp
  – Balance between mutation introduction rate and polymorphism loss rate
  – Most mutations are lost within a few generations
• 2/3 are C/T differences
• In non-coding regions, there are often fewer SNPs in more conserved regions
• In coding regions, there are often more synonymous than non-synonymous SNPs

SNP Characteristics: Allele Frequency Distribution
• Most alleles are rare (minor allele frequency < 10%)

SNP Characteristics: Linkage Disequilibrium
• Hardy-Weinberg equilibrium
  – In a population with genotypes AA, aa, and Aa, if p = freq(A) and q = freq(a), the frequencies of AA, aa, and Aa will be $p^2$, $q^2$, and $2pq$ respectively at equilibrium
  – Similarly with two loci, each with two alleles: A/a and B/b

SNP Characteristics: Linkage Disequilibrium
[Figure: haplotype frequencies at equilibrium vs disequilibrium; surviving label: 0.26 ab]
• LD: if alleles occur together more often than can be accounted for by chance, this indicates that the two alleles are physically close on the DNA
  – In mammals, LD is often lost at ~100 kb
  – In fly, LD often decays within a few hundred bases

SNP Characteristics: Linkage Disequilibrium
• Statistical significance of LD
  – Chi-square test (or Fisher's exact test): $\chi^2 = \sum_{i,j} (n_{ij} - e_{ij})^2 / e_{ij}$
  – $e_{ij} = n_{i\cdot} \, n_{\cdot j} / n_T$

          B1     B2     Total
  A1      n11    n12    n1.
  A2      n21    n22    n2.
  Total   n.1    n.2    nT

SNP Characteristics: Linkage Disequilibrium
• Haplotype block: a cluster of linked SNPs
• Haplotype boundary: sequence shows strong LD within blocks and no LD between blocks; the boundaries reflect recombination hotspots

SNP Characteristics: Linkage Disequilibrium
• Haplotype block: a cluster of linked SNPs
• Haplotype boundary: sequence shows strong LD within blocks and no LD between blocks; the boundaries reflect recombination hotspots
• Haplotype size distribution

Summary
• Phenotype evolution (natural selection) vs molecular evolution (neutral theory)
• Decrease of genetic variation over time
• Fixation: population size, probability
• Positive and negative selection (dN / dS ratio)
• Molecular clock and migration patterns
• Genome variations: SNP and CNV
• Linkage disequilibrium from recombination

Acknowledgement
• Francisco Ubeda
• Jun Liu
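
As a worked companion to the chi-square test of LD from the Linkage Disequilibrium slides, here is a minimal Python sketch that computes the statistic for a 2x2 table of haplotype counts (alleles A1/A2 at one locus, B1/B2 at the other). The function name and the counts are hypothetical, chosen only to illustrate the formula; for small counts, Fisher's exact test would be the safer choice, as the slide notes.

```python
def chi_square_2x2(n11, n12, n21, n22):
    """Chi-square statistic for a 2x2 haplotype count table.

    Expected counts are e_ij = (row total i) * (column total j) / n_T,
    and the statistic is the sum over cells of (n_ij - e_ij)^2 / e_ij.
    """
    observed = [[n11, n12], [n21, n22]]
    row_totals = [n11 + n12, n21 + n22]
    col_totals = [n11 + n21, n12 + n22]
    n_total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column total must be nonzero")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n_total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical counts: A1 mostly travels with B1, A2 with B2
print(chi_square_2x2(40, 10, 10, 40))  # prints 36.0
# Hypothetical counts with no association between the loci
print(chi_square_2x2(25, 25, 25, 25))  # prints 0.0
```

With one degree of freedom, a statistic above about 3.84 is significant at the 0.05 level, so the first table suggests LD while the second is consistent with linkage equilibrium.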