Lyon, jan 2009
(ANIMAL) MITOCHONDRIAL GENOME EVOLUTION
N. GALTIER
Institut des Sciences de l’Evolution
CNRS - Université Montpellier 2
[email protected]
Why focus on mitochondrial evolution?
A small piece of DNA…
in human:      mitochondria   total        fraction
nucleotides    16 000         3.5 × 10^9   5 × 10^-6
genes          40             20 000       2 × 10^-3
genomes        1              2            0.5
… but a fascinating one:
- involved in fundamental aspects of cellular biology and physiology
- peculiar evolutionary history
- long-term interaction with the nucleus
- popular phylogenetic and population genetic marker
Mitochondrial endosymbiosis and the origins of Eukaryotes
[Figure: the universal tree of life according to ribosomal RNA, with the positions of mitochondria and chloroplasts]
The closest relatives of mitochondria are α-proteobacteria, which include several obligate
intracellular bacterial species:
Rhizobium
Wolbachia
Rickettsia
The mitochondrial genome has undergone a severe reduction in size, from
several thousand genes (bacterial ancestor) to just 5 - 100 (current mitochondria).
Some of these genes have been lost, others were transferred to the nucleus.
[Figure: the same tree, built up over several slides, showing mitochondria, chloroplasts and the "amitochondriates"]
- all known eukaryotic phyla have, or have had, a mitochondrion
- the mitochondrial endosymbiosis has been proposed as a major/founding step of Eukaryote evolution
Mito-nuclear coevolution
- one of the two endosymbioses that persisted in the long run
- nuclear control of mtDNA replication
Largely, but not entirely:
- "petite" mutations in yeast
- male sterility phenotypes (animals, plants)
- doubly uniparental inheritance (mussels)
The nuclear genome adapts: germline mitochondrial bottleneck
domestication?
Biological functions of the mitochondrion
- respiration
- many metabolic pathways
- thermoregulation
- apoptosis
- many mitochondrial diseases are known, including cancers
- ageing (oxidative stress, somatic mutations)
Why is animal mtDNA hypermutable?
The most popular genetic marker of biodiversity in animals
Real reasons:
- easy to amplify
- highly variable
Other good reasons:
- clonal inheritance
- (nearly) neutral
- clock-like
Let's check:
Galtier et al 2006 Genome Res 16:215
Bazin et al 2006 Science 312:270
Nabholz et al 2008 Genetics 178:351
Nabholz et al 2008 Mol Biol Evol 25:120
mtDNA, a clonal marker?
Mitochondria in animals are maternally transmitted, and therefore clonal
- genetic and cytological arguments (Birky 2001 Annu Rev Genet 35:125)
- active mechanisms of paternal mtDNA degradation (Gyllensten et al 1991 Nature 352:192)
- mussel exception (Ladoukakis & Zouros 2001 Mol Biol Evol 18:1168)
Three 1999 papers challenged this dogma in humans:
- Hagelberg et al 1999 Proc Biol Sci 266:485: site-specific convergence in a Pacific Ocean island
- Awadalla et al 1999 Science 286:2524: relationship between linkage disequilibrium and physical distance
- Eyre-Walker et al 1999 Proc Biol Sci 266:477: excess of within-species mitochondrial homoplasy
Within-species homoplasy: recombination or mutation hotspots?
Five individuals typed at three sites:
              site a   site b   site c
individual 1  A        C        T
individual 2  A        C        C
individual 3  G        C        C
individual 4  G        A        T
individual 5  G        A        T
Sites a and b fit a single tree with one mutation each; site c does not (homoplasy).
- model 1: mutation hotspots (site c mutated more than once)
  prediction: shared polymorphic sites between close species, polymorphism/divergence correlation
- model 2: recombination
  prediction: no such relationship
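The incompatibility in the five-individual example can be checked with the four-gamete test, sketched here in Python. The function name `four_gamete_incompatible` and the dictionary layout are illustrative choices, not part of the talk. Two biallelic sites that show all four allele combinations cannot both be explained by a single mutation each on a non-recombining tree, so either recurrent mutation (model 1) or recombination (model 2) has to be invoked. The test alone cannot tell the two models apart.

```python
from itertools import combinations

# Alleles carried by individuals 1 to 5 at the three sites of the example.
sites = {
    "a": "AAGGG",
    "b": "CCCAA",
    "c": "TCCTT",
}

def four_gamete_incompatible(x, y):
    """Return True if two biallelic sites show all four allele combinations."""
    return len(set(zip(x, y))) == 4

# List every pair of sites that fails the four-gamete test.
incompatible_pairs = [
    (s, t)
    for s, t in combinations(sorted(sites), 2)
    if four_gamete_incompatible(sites[s], sites[t])
]
print(incompatible_pairs)
```

Only the pair (a, c) is flagged: site c conflicts with the tree implied by site a, while sites a and b agree with each other.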
Data set I: mammalian cytochrome b
- 27 mammalian genera, each with at least 2 distinct species for which the cytochrome b
sequences of 6 individuals or more are available
- synonymous (neutral) sites only
[Figure: homoplasy (0 to 140) plotted against the proportion of polymorphic sites (0.0 to 0.5); + significant, . non-significant]
Too much within-species homoplasy in mammalian mtDNA
Polymorphism co-occurrence between congeneric species
species 1
ACCAGATTGCAATAGC
ACCAGATTGCAATAGC
ACCAAATTGCGATAGC
ACCAGATTGCGATAGT
species 2
ATTAACTTACCGTAGT
ATTAACTTACCGTAGT
ATTAACTTGCCGTAGT
ACTAGCTTACCGTAGT
ACTAGCTTACCGTAGT
ATTAGCTTACCGTAGT
species 3
GCTAGATTACTATGGT
GCTAGATTACTATGGT
GCTAGATTACCATGGT
GCTAGATTACCATGGT
GCTAAATTACCATGGT
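Polymorphism pattern per species (M = monomorphic site, P = polymorphic site):
species 1  MMMMPMMMMMPMMMMP
species 2  MPMMPMMMPMMMMMMM
species 3  MMMMPMMMMMPMMMMM
Number of species in which each site is polymorphic:
           0100300010200001
2 co-occurrences
A site is called co-occurrent if it is polymorphic in more than half of the species in the genus.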
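The counting rule can be reproduced on the three toy alignments of this slide. The following Python sketch is an illustration only; the function names `polymorphic_sites` and `count_cooccurrences` are assumptions of this example, not names used by the author.

```python
# Toy alignments copied from the slide (16 sites each).
species = {
    "species 1": ["ACCAGATTGCAATAGC", "ACCAGATTGCAATAGC",
                  "ACCAAATTGCGATAGC", "ACCAGATTGCGATAGT"],
    "species 2": ["ATTAACTTACCGTAGT", "ATTAACTTACCGTAGT", "ATTAACTTGCCGTAGT",
                  "ACTAGCTTACCGTAGT", "ACTAGCTTACCGTAGT", "ATTAGCTTACCGTAGT"],
    "species 3": ["GCTAGATTACTATGGT", "GCTAGATTACTATGGT", "GCTAGATTACCATGGT",
                  "GCTAGATTACCATGGT", "GCTAAATTACCATGGT"],
}

def polymorphic_sites(seqs):
    """Return the 0-based alignment columns that vary within one species."""
    return {i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1}

def count_cooccurrences(alignments):
    """Count, per site, the species in which it is polymorphic, and list
    the sites polymorphic in more than half of the species."""
    n_sites = len(next(iter(alignments.values()))[0])
    counts = [0] * n_sites
    for seqs in alignments.values():
        for i in polymorphic_sites(seqs):
            counts[i] += 1
    cooccurrent = [i for i, c in enumerate(counts) if c > len(alignments) / 2]
    return counts, cooccurrent

counts, cooccurrent = count_cooccurrences(species)
print("".join(map(str, counts)))  # per-site counts, same layout as the slide
print(len(cooccurrent))           # number of co-occurrent sites
```

The per-site counts come out as 0100300010200001, and sites 5 and 11 (1-based) are co-occurrent, which gives the 2 co-occurrences shown on the slide.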
[Figure: observed co-occurrences (0 to 60) plotted against expected co-occurrences obtained by permutations (0 to 60); + significant, . non-significant]
The amount of co-occurrence is higher than expected in 22 genera out of 27,
significantly in 11 : there are mutational hotspots in mammalian mitochondrial DNA.
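A permutation expectation of this kind can be sketched as follows. The scheme used here is an assumption made for illustration: within each species the number of polymorphic sites is kept fixed and their positions are redrawn uniformly at random. The procedure actually used in the study may differ, for instance in how sites are matched. The function name `permuted_cooccurrences` and its parameters are hypothetical. The input reuses the toy example above: 16 sites, and 3, 3 and 2 polymorphic sites in species 1, 2 and 3.

```python
import random

def permuted_cooccurrences(n_sites, n_polymorphic, n_perm=2000, seed=1):
    """Co-occurrence counts obtained when polymorphic positions are shuffled.

    Assumption of this sketch: positions are redrawn uniformly and
    independently in each species, keeping each species' number of
    polymorphic sites unchanged.
    """
    rng = random.Random(seed)
    threshold = len(n_polymorphic) / 2
    totals = []
    for _ in range(n_perm):
        counts = [0] * n_sites
        for k in n_polymorphic:
            for i in rng.sample(range(n_sites), k):
                counts[i] += 1
        totals.append(sum(c > threshold for c in counts))
    return totals

totals = permuted_cooccurrences(16, [3, 3, 2])
expected = sum(totals) / len(totals)
observed = 2
# One-sided permutation P-value: how often chance alone reaches the observed value.
p_value = sum(t >= observed for t in totals) / len(totals)
print(f"expected = {expected:.2f}, P(permuted >= observed) = {p_value:.3f}")
```

With so few sites the toy example is not expected to be significant; the point is only the comparison between an observed count and its permutation distribution, repeated per genus.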
Simulations under a hotspot model
[Figure: Gamma distribution of the mutation rate across sites (number of sites against mutation rate); Tamura & Nei 1993 substitution matrix between A, C, G and T]
- mutation rate heterogeneity is tuned to fit the observed within-species homoplasy
- then we compare the simulated and real levels of co-occurrence of polymorphisms
[Figure: co-occurrence shortage (predicted minus observed co-occurrence, 0 to 10) plotted against between-species divergence (0.05 to 0.15)]
Hotspot sites vary in time and between species
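The effect of Gamma-distributed rates on multiple hits can be illustrated with a deliberately crude Python sketch. It is not the simulation used in the study: it ignores genealogies and the Tamura & Nei substitution matrix, and simply drops a fixed number of mutations on sites with probability proportional to their rate. The function names and all parameter values (`n_sites`, `n_mutations`, the shape values) are assumptions of this example.

```python
import random

def multiple_hit_sites(alpha, n_sites=300, n_mutations=60, seed=0):
    """Number of sites hit more than once when rates follow a Gamma law.

    Rates are drawn from Gamma(shape=alpha, scale=1/alpha), which has mean 1;
    a small alpha means strong rate heterogeneity (a few hot sites).
    """
    rng = random.Random(seed)
    rates = [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]
    hits = [0] * n_sites
    for site in rng.choices(range(n_sites), weights=rates, k=n_mutations):
        hits[site] += 1
    return sum(h > 1 for h in hits)

def mean_multiple_hits(alpha, n_rep=200):
    """Average over replicates, one seed per replicate."""
    return sum(multiple_hit_sites(alpha, seed=s) for s in range(n_rep)) / n_rep

for alpha in (0.1, 0.5, 5.0):
    print(alpha, mean_multiple_hits(alpha))
```

Lowering the shape parameter concentrates mutations on a few sites and raises the number of sites hit repeatedly, which is the sense in which rate heterogeneity can be tuned to match an observed level of homoplasy.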
Data set II: human and ape full genome
- 560 human full genomes (Homo sapiens)
- 6 outgroups: Hylobates lar, Pongo pygmaeus pygmaeus, Pongo pygmaeus abelii, Gorilla gorilla, Pan troglodytes, Pan paniscus
- synonymous sites only
- within-Homo sapiens homoplasy is strong: several sites require at least ten distinct
mutation events in humans
- sites polymorphic in humans are significantly more divergent between species
- hypervariable sites are mostly A↔G polymorphisms
- nine sites are like this (A→G):
  Hylobates lar             A
  Pongo pygmaeus pygmaeus   A
  Pongo pygmaeus abelii     A
  Gorilla gorilla           A
  Pan troglodytes           A
  Pan paniscus              A
  Homo sapiens              G/A
There are mutation hotspots in humans as well, many due to G→A hypermutation.
Conclusions
- Invoking recombination is not necessary to explain within-species homoplasy
in mammalian mitochondrial DNA
- There are mutation hotspots in mammalian mtDNA; hotspot locations vary rapidly
during evolution
Mitochondrial recombination ?
- direct evidence for mitochondrial recombination has been reported in humans
(Kraytsberg et al 2004 Science 304:981)
- indirect evidence in a couple of other animal species (Piganeau et al 2004 Mol Biol Evol 21:2319)
- mitochondrial recombination apparently occurs only sporadically in animals
The most popular genetic marker of biodiversity in animals
Real reasons:
- easy to amplify
- highly variable
Other good reasons:
- clonal inheritance: YES
- (nearly) neutral
- clock-like
Let's check:
Galtier et al 2006 Genome Res 16:215
Bazin et al 2006 Science 312:270
Nabholz et al 2008 Genetics 178:351
Nabholz et al 2008 Mol Biol Evol 25:120
mtDNA: a neutral marker?
Being involved in fundamental processes of cell and organismal biology (respiration,
apoptosis, metabolism), mtDNA is not likely to undergo frequent adaptive evolution.
Most analyses of variation between species are consistent with a predominant role
of purifying selection on mitochondrial genes (Weinreich & Rand 2000 Genetics 156:385).
What do mitochondrial sequence polymorphism patterns say?
Evolutionary forces influencing the genetic diversity
demography
structure
mating systems
π ~ Ne · µ
On average, abundant species should be more polymorphic than scarce ones.
Measuring DNA polymorphism in animals
- start from Polymorphix 1.2 - Metazoa (Bazin et al 2005 Nucleic Acids Res 33:481)
- remove genome projects
- remove transposons, LINE, SINE, MHC, immunoglobulin, rRNA …
- manually check highly polymorphic families
- focus on coding sequences
- for each family, calculate the synonymous diversity πs
- average over loci within species
- average over species within 8 taxa:
Mammals, Sauropsids, Amphibians, Fish
Insects, Crustaceans, Molluscs, Echinoderms
- compare to allozyme data (Nevo et al 1984)
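The diversity and averaging steps of this pipeline can be sketched in Python. This is a simplified illustration: it computes plain nucleotide diversity (mean pairwise differences per site) over all sites, whereas the talk uses synonymous diversity πs, which requires counting synonymous sites in coding sequences. The function names are assumptions of this example, the alignment is the "species 1" toy alignment shown earlier in the talk, and the per-locus values in `example_taxon` are made up for illustration.

```python
from itertools import combinations
from statistics import mean

def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi) in an alignment."""
    n_sites = len(seqs[0])
    diffs = [sum(x != y for x, y in zip(s, t)) for s, t in combinations(seqs, 2)]
    return mean(diffs) / n_sites

def taxon_mean(per_species_loci):
    """Average over loci within species, then over species within a taxon."""
    return mean(mean(loci) for loci in per_species_loci.values())

toy_alignment = ["ACCAGATTGCAATAGC", "ACCAGATTGCAATAGC",
                 "ACCAAATTGCGATAGC", "ACCAGATTGCGATAGT"]
pi = nucleotide_diversity(toy_alignment)

# Hypothetical per-locus diversities, for illustration only.
example_taxon = {"species X": [0.010, 0.030], "species Y": [0.004]}
print(pi, taxon_mean(example_taxon))
```

Averaging within species first prevents a species with many sequenced loci from dominating its taxon.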
Measuring DNA polymorphism in animals
              mtDNA   nuclear DNA   allozymes
Mammals       311     25            184
Sauropsids    348     18            116
Amphibians    80      4             61
Fish          248     11            183
Echinoderms   26      22            15
Insects       451     69            122
Crustaceans   58      2             122
Molluscs      107     11            46
Total         1629    162           849
Taxonomy does not predict mtDNA sequence polymorphism
[Figure: πs for mtDNA and nuclear DNA, and allozyme heterozygosity, in vertebrates and invertebrates]
Ecology does not predict mtDNA sequence polymorphism
[Figure: allozyme heterozygosity (H) and mtDNA πs compared between ecological groups. Crustaceans: Branch. vs Dec., continent vs marine. Molluscs: fresh vs marine. Fish: fresh vs marine. Asterisks (* and **) mark significant differences]
Why is mtDNA sequence polymorphism not correlated with Ne?
- mutation: would imply a general, implausible inverse relationship between Ne and µ
- demography, structure: should affect the nuclear genome as well
- natural selection:
  . negative selection = background selection:
    still predicts a positive relationship between π and Ne (Charlesworth 1995 Genetics 141:1619)
  . positive selection = genetic draft:
    predicts an essentially flat relationship between π and Ne (Gillespie 2001 Evolution 55:2161)
Selective sweep, hitch-hiking and genetic draft
[Figure: SELECTIVE SWEEP at a sampled neutral locus linked to a selected locus]
A selective sweep, the rapid fixation of an advantageous mutation,
leads to a sudden drop of variability at linked loci through hitch-hiking.
Advantageous mutations are more frequent in large populations:
the increased genetic draft compensates for the decreased genetic drift
(Gillespie 2001 Evolution 55:2161).
[Figure: π as a function of Ne under drift and under draft]
Conclusions
- population size influences nuclear, but not mitochondrial DNA diversity
- recurrent adaptive evolution explains the homogeneous mtDNA pattern, at least
in invertebrates
- mtDNA diversity is largely unpredictable, and mostly reflects the time since the
last selective sweep
Question
- what is mtDNA adapting to ?
The most popular genetic marker of biodiversity in animals
Real reasons:
- easy to amplify
- highly variable
Other good reasons:
- clonal inheritance: YES
- (nearly) neutral: NO
- clock-like
Let's check:
Galtier et al 2006 Genome Res 16:215
Bazin et al 2006 Science 312:270
Nabholz et al 2008 Genetics 178:351
Nabholz et al 2008 Mol Biol Evol 25:120
mtDNA : a clock-like marker ?
The molecular clock hypothesis states that the rate of accumulation of substitutions
is more or less constant in time and between lineages, so that molecules can be used
as chronometers of evolutionary divergences.
Clock-like markers are useful for molecular dating purposes. Mitochondrial DNA
has been widely used to date phylogenetic / phylogeographic events.
Some differences in mtDNA evolutionary rate between lineages have been reported,
though, and related to species metabolic rate (Gillooly et al 2005 PNAS 102:140).
A comprehensive study in mammals
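Estimating evolutionary rates
- easy in principle:
r = [dist(A,B) / 2] / t
[Figure: species A and B separated by a divergence time t]
- difficult in practice:
[Figure: tree of species A, B, C, D and E on a time axis starting at 0, with node ages known only through bounds tmin(A,B), tmax(A,B), tmin(D,E), tmax(D,E); panels labelled "What we have" and "What we want"]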
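The "easy in principle" formula, and the way uncertain calibration dates turn it into a range, can be written as a short Python sketch. The function names and the numbers are hypothetical and chosen only for illustration; the talk's actual approach relies on Bayesian modelling rather than on this direct calculation.

```python
def substitution_rate(distance, divergence_time):
    """Rate per site per unit time along one lineage: r = (d / 2) / t."""
    if divergence_time <= 0:
        raise ValueError("divergence time must be positive")
    return (distance / 2.0) / divergence_time

def rate_bounds(distance, t_min, t_max):
    """Rates compatible with a divergence date known only within [t_min, t_max]."""
    return substitution_rate(distance, t_max), substitution_rate(distance, t_min)

# Hypothetical numbers: 0.2 substitutions per site between A and B,
# with a split dated between 5 and 10 time units ago.
print(substitution_rate(0.2, 10.0))
print(rate_bounds(0.2, 5.0, 10.0))
```

The distance is halved because substitutions accumulate along both lineages since the split. A twofold uncertainty on the date gives a twofold uncertainty on the rate, before any error on the distance itself is considered.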
Estimating evolutionary rates
problems:
- properly estimating branch lengths (when saturation can obscure the signal)
- reconciling potentially conflicting divergence date estimates
- modelling the evolution of the substitution rate
- accounting for the specificity of sequence evolutionary processes
our approach in mammals:
- including as many species as possible (cytochrome b, 1696 species)
- including as many fossil calibration points as possible (22)
- dating first (using protein sequences), then estimating synonymous rates
within groups of reasonable maximal divergence (using 3rd codon positions)
- using statistical (Bayesian) modelling (Thorne et al 1998 Mol Biol Evol 15:1647)
The neutral substitution rate varies by 2 orders of magnitude
across mammalian lineages
Taxonomic distribution
Mutation rate variation in mammals : the longevity hypothesis
Mitochondrial somatic mutations are involved in the ageing process :
long-lived species may have to constrain their mtDNA mutation rate to low values
to reach a long life-span.
The bird/mammal comparison
[Figure: substitution rate (log scale, 0.001 to 0.5) plotted against maximum longevity (5 to 100) and against body mass (10 to 10^6)]
Birds live on average 3 times as long as similar-sized mammals, but have a higher
mass-specific metabolic rate.
The mtDNA mutation rate is lower in birds than in mammals, in agreement with the longevity
hypothesis and in contradiction with the metabolic rate hypothesis.
The most popular genetic marker of biodiversity in animals
Real reasons:
- easy to amplify
- highly variable
Other good reasons:
- clonal inheritance: YES
- (nearly) neutral: NO
- clock-like: NO
Let's check:
Galtier et al 2006 Genome Res 16:215
Bazin et al 2006 Science 312:270
Nabholz et al 2008 Genetics 178:351
Nabholz et al 2008 Mol Biol Evol 25:120
General conclusions
Under frequent adaptive evolution, and strongly non-clock-like, mtDNA might be
the worst possible genetic marker of biodiversity in animals.
Yet people will obviously continue using it, if only for practical reasons.
This is good, because the evolution of this genome needs to be further understood:
- what is mtDNA adapting to?
- what controls the mtDNA mutation rate?
- why is it so compact in animals but not in other eukaryotes?
- why do mitochondria retain a genome?