Lyon, Jan 2009

(ANIMAL) MITOCHONDRIAL GENOME EVOLUTION

N. GALTIER
Institut des Sciences de l'Evolution
CNRS - Université Montpellier 2
[email protected]

Why focus on mitochondrial evolution?

A small piece of DNA... in human:

              mitochondria   total        fraction
nucleotides   16 000         3.5 x 10^9   5 x 10^-6
genes         40             20 000       2 x 10^-3
genomes       1              2            0.5

... but a fascinating one:
- involved in fundamental aspects of cellular biology and physiology
- peculiar evolutionary history
- long-term interaction with the nucleus
- popular phylogenetic and population genetic marker

Mitochondrial endosymbiosis and the origins of Eukaryotes

[Figure: the universal tree of life according to ribosomal RNA, with mitochondria and chloroplasts marked]

Mitochondrial endosymbiosis and the origins of Eukaryotes

The closest relatives of mitochondria are α-proteobacteria, which include several obligate intracellular bacterial species:
Rhizobium, Wolbachia, Rickettsia

The mitochondrial genome has undergone a severe reduction in size, from several thousand genes (bacterial ancestor) to just 5 - 100 (current mitochondria). Some of these genes have been lost, others were transferred to the nucleus.

Mitochondrial endosymbiosis and the origins of Eukaryotes

[Figure: tree of life with mitochondria, chloroplasts and "amitochondriates" marked]

- all known eukaryotic phyla have, or have had, a mitochondrion
- the mitochondrial endosymbiosis has been proposed as a major/founding step of Eukaryote evolution

Mito-nuclear coevolution

- one of the two endosymbioses that persisted in the long run
- nuclear control of mtDNA replication

Largely, but not entirely:
- "petite" mutations in yeast
- male sterility phenotypes (animals, plants)
- doubly uniparental inheritance (mussels)

The nuclear genome adapts: germline mitochondrial bottleneck

Domestication?

Biological functions of the mitochondrion

- respiration
- many metabolic pathways
- thermoregulation
- apoptosis
- many mitochondrial diseases are known, including cancers
- ageing (oxidative stress, somatic mutations)

Why is animal mtDNA hypermutable?

The most popular genetic marker of biodiversity in animals

Real reasons:
- easy to amplify
- highly variable

Other good reasons:
- clonal inheritance
- (nearly) neutral
- clock-like

Let's check:
Galtier et al 2006 Genome Res 16:215
Bazin et al 2006 Science 312:270
Nabholz et al 2008 Genetics 178:351
Nabholz et al 2008 Mol Biol Evol 25:120

mtDNA, a clonal marker?

Mitochondria in animals are maternally transmitted, and therefore clonal:
- genetic and cytological arguments (Birky 2001 Annu Rev Genet 35:125)
- active mechanisms of paternal mtDNA degradation (Gyllensten et al 1991 Nature 352:192)
- mussel exception (Ladoukakis & Zouros 2001 Mol Biol Evol 18:1168)

Three 1999 papers challenged this dogma in humans:
- Hagelberg et al 1999 Proc Biol Sci 266:485: site-specific convergence on a Pacific Ocean island
- Awadalla et al 1999 Science 286:2524: relationship between linkage disequilibrium and physical distance
- Eyre-Walker et al 1999 Proc Biol Sci 266:477: excess of within-species mitochondrial homoplasy

Within-species homoplasy: recombination or mutation hotspots?

               site a   site b   site c
individual 1     A        C        T
individual 2     A        C        C
individual 3     G        C        C
individual 4     G        A        T
individual 5     G        A        T

[Figure: two explanations for the pattern at site c, which conflicts with sites a and b. Model 1: mutation hotspots. Model 2: recombination.]

Within-species homoplasy: recombination or mutation hotspots?
[Figure: sites a, b and c compared between species 1 and species 2 under each model]
- model 1 (mutation hotspots): shared polymorphic sites between close species, polymorphism/divergence correlation
- model 2 (recombination): no such relationship

Data set I: mammalian cytochrome b

- 27 mammalian genera, each with at least 2 distinct species for which the cytochrome b sequences of 6 individuals or more are available
- synonymous (neutral) sites only

[Figure: homoplasy plotted against the proportion of polymorphic sites; + significant, . non-significant]

Too much within-species homoplasy in mammalian mtDNA

Polymorphism co-occurrence between congeneric species

species 1
ACCAGATTGCAATAGC
ACCAGATTGCAATAGC
ACCAAATTGCGATAGC
ACCAGATTGCGATAGT
MMMMPMMMMMPMMMMP

species 2
ATTAACTTACCGTAGT
ATTAACTTACCGTAGT
ATTAACTTGCCGTAGT
ACTAGCTTACCGTAGT
ACTAGCTTACCGTAGT
ATTAGCTTACCGTAGT
MPMMPMMMPMMMMMMM

species 3
GCTAGATTACTATGGT
GCTAGATTACTATGGT
GCTAGATTACCATGGT
GCTAGATTACCATGGT
GCTAAATTACCATGGT
MMMMPMMMMMPMMMMM

(M: monomorphic site, P: polymorphic site)

number of species in which each site is polymorphic:
0100300010200001

2 co-occurrences

A site is called co-occurrent if it is polymorphic in more than half of the species in the genus.

Polymorphism co-occurrence between congeneric species

[Figure: observed co-occurrences; + significant]
[Figure axis: expected co-occurrences (permutations); . non-significant]

The amount of co-occurrence is higher than expected in 22 genera out of 27, significantly so in 11: there are mutational hotspots in mammalian mitochondrial DNA.

Simulations under a hotspot model

[Figure: Gamma distribution of mutation rate across sites (number of sites against mutation rate), and the Tamura & Nei 1993 substitution model between A, C, G and T]

- mutation rate heterogeneity is tuned to fit the observed within-species homoplasy
- then we compare the simulated and real levels of co-occurrence of polymorphisms

Simulations under a hotspot model

[Figure: predicted minus observed co-occurrence (co-occurrence shortage) plotted against between-species divergence]

Hotspot sites vary in time and between species

Data set II: human and ape full genomes

- 560 human full genomes
- 6 outgroups: Hylobates lar, Pongo pygmaeus pygmaeus, Pongo pygmaeus abelii, Gorilla gorilla, Pan troglodytes, Pan paniscus
- synonymous sites only

Data set II: human and ape full genomes

- within-Homo sapiens homoplasy is strong: several sites require at least ten distinct mutation events in humans
- sites polymorphic in humans are significantly more divergent between species
- hypervariable sites are mostly A↔G polymorphisms
- nine sites are like this (A→G):

  Hylobates lar              A
  Pongo pygmaeus pygmaeus    A
  Pongo pygmaeus abelii      A
  Gorilla gorilla            A
  Pan troglodytes            A
  Pan paniscus               A
  Homo sapiens               G/A

There are mutation hotspots in humans as well, many due to G→A hypermutation.

Conclusions

- Invoking recombination is not necessary to explain within-species homoplasy in mammalian mitochondrial DNA
- There are mutation hotspots in mammalian mtDNA; hotspot locations vary rapidly during evolution

Mitochondrial recombination?
- direct evidence for mitochondrial recombination has been reported in humans (Kraytsberg et al 2004 Science 304:981)
- indirect evidence in a couple of other animal species (Piganeau et al 2004 Mol Biol Evol 21:2319)
- mitochondrial recombination apparently occurs only anecdotally in animals

The most popular genetic marker of biodiversity in animals

Real reasons:
- easy to amplify
- highly variable

Other good reasons:
- clonal inheritance: YES
- (nearly) neutral
- clock-like

mtDNA: a neutral marker?

Being involved in fundamental processes of cell and organismal biology (respiration, apoptosis, metabolism), mtDNA is not likely to undergo frequent adaptive evolution.

Most analyses of variation between species are consistent with a predominant role of purifying selection on mitochondrial genes (Weinreich & Rand 2000 Genetics 156:385).

What do mitochondrial sequence polymorphism patterns say?

Evolutionary forces influencing the genetic diversity

- demography
- structure
- mating systems

π ~ Ne . µ

On average, abundant species should be more polymorphic than scarce ones.
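
As a rough illustration of the polymorphism summaries used in this part of the talk, the sketch below recomputes the M/P lines and the co-occurrence count for the toy alignment of the co-occurrence slide, and adds the mean pairwise diversity that the relation with Ne . µ refers to (assumed here to be the nucleotide diversity π). The function names are hypothetical, all sites are treated alike (no restriction to synonymous sites), and the permutation test for expected co-occurrences is not reproduced; this is not the authors' pipeline.

```python
from itertools import combinations

# Toy alignment copied from the co-occurrence slide (three congeneric species).
GENUS = {
    "species 1": [
        "ACCAGATTGCAATAGC",
        "ACCAGATTGCAATAGC",
        "ACCAAATTGCGATAGC",
        "ACCAGATTGCGATAGT",
    ],
    "species 2": [
        "ATTAACTTACCGTAGT",
        "ATTAACTTACCGTAGT",
        "ATTAACTTGCCGTAGT",
        "ACTAGCTTACCGTAGT",
        "ACTAGCTTACCGTAGT",
        "ATTAGCTTACCGTAGT",
    ],
    "species 3": [
        "GCTAGATTACTATGGT",
        "GCTAGATTACTATGGT",
        "GCTAGATTACCATGGT",
        "GCTAGATTACCATGGT",
        "GCTAAATTACCATGGT",
    ],
}


def polymorphism_mask(seqs):
    """Return one letter per site: P if the sequences differ there, M otherwise."""
    return "".join("P" if len(set(column)) > 1 else "M" for column in zip(*seqs))


def species_per_site(genus):
    """For each site, count the species in which that site is polymorphic."""
    masks = [polymorphism_mask(seqs) for seqs in genus.values()]
    return [sum(mask[i] == "P" for mask in masks) for i in range(len(masks[0]))]


def cooccurrent_sites(genus):
    """Count sites polymorphic in more than half of the species of the genus."""
    return sum(2 * n > len(genus) for n in species_per_site(genus))


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site within one species."""
    pairs = list(combinations(seqs, 2))
    differences = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return differences / (len(pairs) * len(seqs[0]))


if __name__ == "__main__":
    for name, seqs in GENUS.items():
        print(name, polymorphism_mask(seqs), round(nucleotide_diversity(seqs), 4))
    print("species polymorphic per site:", "".join(map(str, species_per_site(GENUS))))
    print("co-occurrent sites:", cooccurrent_sites(GENUS))
```

On the slide's alignment this reproduces the three M/P lines, the per-site string 0100300010200001 and the 2 co-occurrences.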
Measuring DNA polymorphism in animals

- start from Polymorphix 1.2 - Metazoa (Bazin et al 2005 Nucleic Acids Res 33:481)
- remove genome projects
- remove transposons, LINE, SINE, MHC, immunoglobulin, rRNA ...
- manually check highly polymorphic families
- focus on coding sequences
- for each family, calculate the synonymous diversity πs
- average over loci within species
- average over species within 8 taxa: Mammals, Sauropsids, Amphibians, Fish, Insects, Crustaceans, Molluscs, Echinoderms
- compare to allozyme data (Nevo et al 1984)

Measuring DNA polymorphism in animals

               mtDNA   nuclear DNA   allozymes
Mammals          311            25         184
Sauropsids       348            18         116
Amphibians        80             4          61
Fish             248            11         183
Echinoderms       26            22          15
Insects          451            69         122
Crustaceans       58             2         122
Molluscs         107            11          46
Total           1629           162         849

Taxonomy does not predict mtDNA sequence polymorphism

[Figure: vertebrates compared with invertebrates for mtDNA πs, nuclear DNA πs and allozyme heterozygosity]

Ecology does not predict mtDNA sequence polymorphism

[Figure: allozyme heterozygosity (H) and mtDNA πs compared between ecological groups. Crustaceans: Branch. (continent) against Dec. (marine). Molluscs: fresh against marine. Fish: fresh against marine. Some comparisons are marked * or **.]

Why is mtDNA sequence polymorphism not correlated with Ne?

- mutation: would imply a general, implausible inverse relationship between Ne and µ
- demography, structure: should affect the nuclear genome as well
- natural selection:
  . negative selection = background selection: still predicts a positive relationship between π and Ne (Charlesworth 1995 Genetics 141:1619)
  . positive selection = genetic draft: predicts an essentially flat relationship between π and Ne (Gillespie 2001 Evolution 55:2161)

Selective sweep, hitch-hiking and genetic draft

[Figure: a selective sweep at a selected locus and its effect on a linked, sampled neutral locus]

[Figure: π as a function of Ne under drift and under draft]

A selective sweep, the rapid fixation of an advantageous mutation, leads to a sudden drop of variability at linked loci through hitch-hiking. Advantageous mutations are more frequent in large populations: the increased genetic draft compensates for the decreased genetic drift (Gillespie 2001 Evolution 55:2161).

Conclusions

- population size influences nuclear, but not mitochondrial, DNA diversity
- recurrent adaptive evolution explains the homogeneous mtDNA pattern, at least in invertebrates
- mtDNA diversity is largely unpredictable, and mostly reflects the time since the last selective sweep

Question
- what is mtDNA adapting to?

The most popular genetic marker of biodiversity in animals

Real reasons:
- easy to amplify
- highly variable

Other good reasons:
- clonal inheritance: YES
- (nearly) neutral: NO
- clock-like

mtDNA: a clock-like marker?

The molecular clock hypothesis states that the rate of accumulation of substitutions is more or less constant in time and between lineages, so that molecules can be used as chronometers of evolutionary divergences.

Clock-like markers are useful for molecular dating purposes.

Mitochondrial DNA has been widely used to date phylogenetic / phylogeographic events.
Some differences in mtDNA evolutionary rate between lineages have been reported, though, and related to species metabolic rate (Gillooly et al 2005 PNAS 102:140).

A comprehensive study in mammals

Estimating evolutionary rates

- easy in principle: r = [dist(A,B) / 2] / t
  [Figure: two species A and B that diverged at time t]
- not so easy in practice:
  [Figure: "what we have" is a tree of species A to E with minimum and maximum bounds on divergence times (tmin(A,B), tmax(A,B), tmin(D,E), tmax(D,E)); "what we want" is the same tree placed on a time scale starting at 0]

Estimating evolutionary rates

problems:
- properly estimating branch lengths (when saturation can obscure the signal)
- reconciling potentially conflicting divergence date estimates
- modelling the evolution of the substitution rate
- accounting for the specificity of sequence evolutionary processes

our approach in mammals:
- including as many species as possible (cytochrome b, 1696 species)
- including as many fossil calibration points as possible (22)
- dating first (using protein sequences), then estimating synonymous rates within groups of reasonable maximal divergence (using 3rd codon positions)
- using statistical (Bayesian) modelling (Thorne et al 1998 Mol Biol Evol 15:1647)

The neutral substitution rate varies by 2 orders of magnitude across mammalian lineages

[Figure: taxonomic distribution of the substitution rate]

Mutation rate variation in mammals: the longevity hypothesis

Mitochondrial somatic mutations are involved in the ageing process: long-lived species may have to constrain their mtDNA mutation rate to low values to reach a long life-span.

The bird/mammal comparison

[Figure: substitution rate plotted against maximum longevity and against body mass, for birds and mammals]

Birds live 3 times as long, on average, as similar-sized mammals, but have a higher mass-specific metabolic rate.

The mtDNA mutation rate is lower in birds than in mammals, in agreement with the longevity hypothesis, and contradicting the metabolic rate hypothesis.
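
The "easy in principle" formula r = [dist(A,B) / 2] / t from the rate-estimation slide can be sketched as follows. This is only a minimal illustration under stated assumptions: the study itself uses Bayesian modelling (Thorne et al 1998), whereas here the distance is a raw proportion of differing sites with a Jukes-Cantor (1969) correction swapped in as a simple stand-in, to show how saturation makes the raw distance underestimate the number of substitutions. The function names, the example p-distance and the divergence time are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def jc69_distance(p):
    """Jukes-Cantor correction: expected substitutions per site given p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1 - 4 * p / 3)


def substitution_rate(dist_ab, t):
    """r = [dist(A,B) / 2] / t, with t the divergence time of A and B."""
    if t <= 0:
        raise ValueError("divergence time must be positive")
    return dist_ab / 2 / t


if __name__ == "__main__":
    p = 0.3    # hypothetical raw distance between species A and B
    t = 10.0   # hypothetical divergence time (units are an assumption, e.g. Myr)
    d = jc69_distance(p)
    print("raw distance:", p, "corrected distance:", round(d, 4))
    print("rate from raw distance:", substitution_rate(p, t))
    print("rate from corrected distance:", round(substitution_rate(d, t), 5))
```

The corrected distance is always at least as large as the raw one, and the gap widens as p grows, which is one reason the slide restricts synonymous rate estimates to groups of reasonable maximal divergence.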
The most popular genetic marker of biodiversity in animals

Real reasons:
- easy to amplify
- highly variable

Other good reasons:
- clonal inheritance: YES
- (nearly) neutral: NO
- clock-like: NO

General conclusions

Being under frequent adaptive evolution, and strongly non-clock-like, mtDNA might be the worst possible genetic marker of biodiversity in animals.

Yet people will obviously continue using it, if only for practical reasons.

This is good, because the evolution of this genome needs to be further understood:
- what is mtDNA adapting to?
- what controls the mtDNA mutation rate?
- why is it so compact in animals but not in other eukaryotes?
- why do mitochondria retain a genome?