The Distribution of Fitness Effects of Mutations in Humans and Flies
Adam Eyre-Walker (University of Sussex)

Types of Mutation
• Deleterious (-ve)
• Neutral (0)
• Advantageous (+ve)

Deleterious Mutations
[Figure labels: "Mutation accumulation and mutagenesis expts"; "dn/ds in primates"; <30%; <10%; 1/100; 1/10,000]

Distribution of Effects
[Figure labels: deleterious, neutral; low, high]

Theory
Neutral sites (e.g. introns / synonymous)
$$P_s = L_s \theta \sum_{i=1}^{n-1} \frac{1}{i}$$
Selected sites (e.g. non-synonymous)
- assume all mutations neutral or deleterious
$$P_n = L_n \theta \iint D(S)\, \frac{1 - e^{-S(1-x)}}{x(1-x)(1 - e^{-S})}\, \bigl(1 - x^n - (1-x)^n\bigr)\, dS\, dx$$

Simplification
[Expression in f, b and Ne; layout not recoverable from the source]

Theory
Neutral sites
$$P_s = L_s \theta \sum_{i=1}^{n-1} \frac{1}{i}$$
Selected sites
$$P_n \approx L_n \theta \sum_{i=1}^{n-1} \frac{1}{i} \times [\text{term in } a \text{ and } b \text{; layout not recoverable from the source}]$$
Parameters
n - known
Ln - each gene
Ls - each gene
θ - each gene
[symbol lost] - shared
[symbol lost] - shared

Estimation
• Assume free recombination
• Bayesian estimation using MCMC

Dataset - humans
• Environmental Genome Project
• 275 human genes
• 90 individuals resequenced
• 549 non-synonymous polymorphisms
• 15746 intron polymorphisms

Pn/Pi versus θi
[Scatter plot, human: Pn/Pi (y-axis, -0.5 to 2.5) against θi (x-axis, -0.002 to 0.014)]

Results - human
Shape = 0.28
Nes = 240

Nes          %
0-1          23
1-10         22
10-100       37
100-1000     19
1000-10000   0.1

Results - human
Shape = 0.28 (0.03, 0.48)
Nes = 240 (90, )

Nes      0-1     1-10    10-100   100-1000   1000-10000
         0.38    0       0        0          0.62
         0.23    0.22    0.37     0.19       0.001
         0.17    0.33    0.47     0.03       0.000
(Row labels were not preserved; the middle row matches the point estimates above.)

Low Frequency Polymorphisms
[Bar chart: proportion (y-axis, 0 to 0.7) of syn and non-syn polymorphisms at low, medium and high frequency]

Dataset - D. melanogaster
• 44 genes
• 5-55 alleles sequenced
• 141 non-synonymous polymorphisms
• 346 synonymous polymorphisms

Pn/Ps versus θs
[Scatter plot, D. melanogaster: Pn/Ps (y-axis, -0.5 to 3.0) against θs (x-axis, -0.01 to 0.05)]
Shape = 0.46 (0.15, 0.65)

Adaptive Mutations

Human1   CCC  GCA  GAG  TTA  CTA  ATC  GAA
Human2   CCG  GCA  GAG  TTA  CTA  ATC  GAA
Human3   CCC  GCA  AAG  TTA  CTA  ATC  GAA
Human4   CCC  GCA  AAG  TTA  CTA  ATC  GAA
Chimp    CCC  GCC  GAG  TTA  GTA  ATT  GAA

         Poly   Sub
Syn      1      2
Non      1      1

Model
Assume
- synonymous mutations are neutral
- amino acid mutations are deleterious, neutral or advantageous

         Poly                    Sub
Syn      P_s ∝ 4N_e u            D_s ∝ 2ut
Non      P_n ∝ 4N_e u f          D_n ∝ 2ut f + a

$$a = D_n - D_s \frac{P_n}{P_s}$$
$$\alpha = 1 - \frac{D_s P_n}{D_n P_s}$$

Estimation
Parameters
n, Ln, Ls - known without error
[symbol lost] - each gene
[symbol lost] - each gene
α - shared, beta distributed or one per gene
Estimation by ML

Drosophila
35 genes with multiple alleles in D. simulans and one allele in D. yakuba

           Poly    Sub
Syn        707     2489
Non        153     1054
Non/Syn    0.22    0.42

Result: α = 0.26 (0.08, 0.41)

Proportion Constant

Model               n      Log(L)
One                 106    -327.5
Beta distributed    107    -327.5
One per gene        140    -302.9

Gene     Amino Acid Div
Hsc70    0.0023
Adh      0.036
Est-6    0.20

D. simulans & D. yakuba
• 600,000 aa differences
• 26% adaptive
• 160,000 adaptive
• 1 every 75 years

Human-Chimp
• Environmental Genome Project
• 232 human genes
• 90 individuals resequenced
• Non-synonymous versus intron
• Human sequence aligned against chimpanzee genome

Human Nuclear Genes

              Poly     Sub
Intron        17631    33223
Non           681      765
Non/Intron    0.039    0.023

Low Frequency Polymorphisms
[Bar chart: proportion (y-axis, 0 to 0.7) of syn and non-syn polymorphisms at low, medium and high frequency]

Dealing With Deleterious Mutations
• Use estimate of distribution of fitness effects from SNP data
• Assume adaptive and slightly deleterious mutations governed by one distribution
• Ignore low frequency variants

Excluding SNPs

Cutoff    ML       95% CI
0%        -0.62
5%        0.09     (-0.11, 0.26)
10%       0.26     (0.08, 0.41)
20%       0.31     (0.11, 0.52)

Humans & Chimpanzees
• 1%
• 290,000 amino acid differences
• 25% adaptive
• 72,500 adaptive differences
• 1 every 165 years

Conclusions
• Distribution of fitness effects of slightly/moderately deleterious mutations is highly leptokurtic in humans and Drosophila
• ~25% of amino acid substitutions are driven by positive selection in humans and Drosophila
• Proportion does not vary between genes

Thanks
Gwenael Piganeau
Nick Smith
Meg Woolfit
Nicolas Bierne
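
Worked example (not part of the original slides): the point estimator from the Model slide, alpha = 1 - (Ds Pn) / (Dn Ps), can be applied directly to the pooled count tables in the talk. This is a minimal sketch, not the ML procedure the slides describe. The function name `alpha_estimate` and the dictionary layout are hypothetical, introduced only for illustration. Because it pools all genes and keeps every polymorphism, it should not be expected to reproduce the ML results: the pooled human counts give about -0.68 against the ML value of -0.62 at the 0% cutoff, and the pooled Drosophila counts give about 0.49 against the reported 0.26.

```python
def alpha_estimate(p_syn, d_syn, p_non, d_non):
    """Simple point estimate of the proportion of adaptive substitutions.

    Implements alpha = 1 - (Ds * Pn) / (Dn * Ps) from the Model slide,
    where P = polymorphism counts and D = substitution counts at
    neutral (syn or intron) and non-synonymous sites.
    """
    return 1.0 - (d_syn * p_non) / (d_non * p_syn)


# Counts copied from the slides: (Ps, Ds, Pn, Dn).
# For the human data the neutral class is intron sites, not synonymous sites.
counts = {
    "toy alignment (Adaptive Mutations slide)": (1, 2, 1, 1),
    "Drosophila (D. simulans vs D. yakuba)": (707, 2489, 153, 1054),
    "Human nuclear genes (vs chimpanzee)": (17631, 33223, 681, 765),
}

for label, (p_syn, d_syn, p_non, d_non) in counts.items():
    alpha = alpha_estimate(p_syn, d_syn, p_non, d_non)
    print(f"{label}: alpha = {alpha:.3f}")
# toy alignment (Adaptive Mutations slide): alpha = -1.000
# Drosophila (D. simulans vs D. yakuba): alpha = 0.489
# Human nuclear genes (vs chimpanzee): alpha = -0.677
```

A negative value, as for the unfiltered human counts, means the non-synonymous to neutral ratio is higher for polymorphism than for divergence, which is the pattern expected when slightly deleterious mutations segregate at low frequency. That is the motivation the slides give for excluding low frequency variants.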