EEOB 400: Lecture 13
Phylogeny
Outgroup   AAGCTTCATAGGAGCAACCATTCTAATAATAAGCCTCATAAAGCC
Species A  AAGCTTCACCGGCGCAGTTATCCTCATAATATGCCTCATAATGCC
Species B  GTGCTTCACCGACGCAGTTGTCCTCATAATGTGCCTCACTATGCC
Species C  GTGCTTCACCGACGCAGTTGCCCTCATGATGAGCCTCACTATGCA
Extra credit
Using genetic markers
to examine spatial and temporal variation in
Ohio Canada Goose harvest composition
Dr. Kristin Mylecraine
The Ohio State University
and Ohio Division of Natural Resources
Thursday November 9
12:00 PM, Lazenby 21
Tree of life
Why is phylogeny important?
Understanding and classifying the
diversity of life on Earth
Testing evolutionary hypotheses:
- trait evolution
- coevolution
- mode and pattern of speciation
- correlated trait evolution
- biogeography
- geographic origins
- age of different taxa
- nature of molecular evolution
- disease epidemiology
…and many more applications!
Phylogeny
What is a phylogeny?
Branching diagram showing relationships between species (or higher taxa)
based on their shared common ancestors
[Figure: phylogeny of species A, B, C, and D drawn against a time axis, with ancestors E and F at the internal nodes]
A and B are most closely related because they share a common ancestor (call the ancestor “E”) that C and D do not share
A + B + C are more closely related to each other than to D because they share a common ancestor (“F”) that D does not share
Phylogeny
Terminal nodes = contemporary taxa
Internal nodes = ancestral taxa
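The tree described above can be written down as a simple data structure. The sketch below is only an illustration: it stores each node's parent, using E and F as the ancestors named on the slide, while "Root" is a hypothetical label for the ancestor of all four species.

```python
# Child -> parent map for the tree in which E is the ancestor of A + B,
# F is the ancestor of A + B + C, and "Root" (a made-up label) joins F and D.
PARENT = {"A": "E", "B": "E", "E": "F", "C": "F", "F": "Root", "D": "Root"}


def ancestors(taxon):
    """Return the ancestors of a taxon, nearest first."""
    path = []
    while taxon in PARENT:
        taxon = PARENT[taxon]
        path.append(taxon)
    return path


def mrca(x, y):
    """Return the most recent common ancestor of two terminal taxa."""
    seen = set(ancestors(y))
    for node in ancestors(x):
        if node in seen:
            return node
    return None


def terminal_nodes():
    """Nodes that are never a parent = contemporary taxa."""
    return sorted(set(PARENT) - set(PARENT.values()))


def internal_nodes():
    """Nodes that are a parent of something = ancestral taxa."""
    return sorted(set(PARENT.values()))
```

Here mrca("A", "B") gives "E" and mrca("A", "C") gives "F", which is the sense in which A and B are more closely related to each other than either is to C.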
Phylogeny and classification
Hierarchy
All taxonomic classifications are hierarchical – how does phylogeny differ?
[Figure: hierarchical classification in which a Class contains Orders, each Order contains Families, each Family contains Genera, and each Genus contains one or more Species]
Phylogeny and classification
Hierarchy
Phylogenetic (cladistic) classification reflects evolutionary history
The only objective form of classification – organisms share a true evolutionary
history regardless of our arbitrary decisions of how to classify them
[Figure: a phylogeny shown beside the matching classification, with Genera nested in Families, Families in Orders, and Orders in a Class]
Phylogeny and classification
Classification
Note that taxa are nested on the basis of shared common ancestors
e.g., all tetrapods share a common ancestor with legs, but other chordates outside of Tetrapoda do not share this common ancestor
The traits mapped onto the phylogeny are synapomorphies; we will return to them later
Phylogeny and classification
Monophyletic group: includes an ancestor and all of its descendants
Paraphyletic group: includes an ancestor and some, but not all, of its descendants
How could this happen? Taxon A is highly derived and looks very different from B, C, and the ancestor
Polyphyletic group: includes two convergent descendants but not their common ancestor
How could this happen? Taxon A and C share similar traits through convergent evolution
[Figure: three trees of taxa A, B, C, and D, one for each type of group]
Only monophyletic groups (clades) are recognized in cladistic classification
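A monophyly check is easy to sketch once a tree is written as nested pairs. The example below assumes a hypothetical tree (((A, B), C), D), which is not necessarily the tree drawn on the slide, and it tests monophyly only. Telling paraphyly from polyphyly also depends on whether the common ancestor is included in the group, which a list of tips alone does not record.

```python
# Hypothetical rooted tree written as nested tuples: (((A, B), C), D)
TREE = ((("A", "B"), "C"), "D")


def tips(node):
    """Return the set of terminal taxa below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips(child) for child in node))


def clades(node):
    """Yield the tip set of every node in the tree, tips included."""
    yield tips(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(group, tree=TREE):
    """A group is monophyletic if it is exactly the tip set of some node."""
    return frozenset(group) in set(clades(tree))
```

On this tree, A + B and A + B + C are monophyletic, but B + C is not, because their common ancestor also gave rise to A.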
Phylogeny and classification
Monophyly
Each of the colored lineages
in this echinoderm phylogeny
is a good monophyletic group
Asteroidea
Ophiuroidea
Echinoidea
Holothuroidea
Crinoidea
Each group shares a common
ancestor that is not shared by any
members of another group
Paraphyletic groups
Foxes
Paraphyly
“Foxes” are paraphyletic with respect
to dogs, wolves, jackals, coyotes, etc.
This is a trivial example because
“fox” and “dog” are not formal
taxonomic units, but it does show
that a dog or a wolf is just a derived
fox in the phylogenetic sense
Lindblad-Toh et al. (2005) Nature 438: 803-819
Paraphyletic groups
Canids
Monophyly
Note that canids are still a good
monophyletic clade within Mammalia
Each of the colored lineages within
canids is also a monophyletic clade
Lindblad-Toh et al. (2005) Nature 438: 803-819
Paraphyletic groups
Lizards
Paraphyly
“Lizards” (Sauria) are
paraphyletic with respect
to snakes (Serpentes)
Serpentes is a monophyletic
clade within lizards
Squamata (lizards + snakes)
is a monophyletic clade
sister to Sphenodontida
Snakes are just derived,
limbless lizards
Fry et al. (2006) Nature 439: 584-588
Paraphyletic groups
Reptilia
Paraphyly
Birds are more closely related
to crocodilians than to other
extant vertebrates
Archosauria = Birds + Crocs
We think of reptiles as turtles,
lizards, snakes, and crocodiles
But Reptilia is a paraphyletic
group unless it includes Aves
What does this mean?
It means that
“reptiles” don’t
exist!
No, it means
that you’re one
of us!
What it means is that “reptile” is only a
valid clade if it includes birds
Birds are still birds, but Aves cannot be
considered a “Class” equivalent to
Class Reptilia because it is evolutionarily
nested within Reptilia
Reptilia
Aves
(birds)
Turtles
Crocodiles
Lizards and snakes
Tuataras
Testing evolutionary hypotheses
Mapping evolutionary transitions
Some horned lizards squirt blood from their eyes when attacked by canids
How many times has blood-squirting evolved? This phylogeny suggests a single evolutionary gain and a single loss of blood squirting
[Figure: horned lizard phylogeny with each species coded for blood squirting (No / Yes)]
Testing evolutionary hypotheses
Mapping evolutionary transitions
But a new phylogeny using multiple characters
suggests that blood squirting has been lost
many times in the evolution of this group
Our interpretation of these evolutionary
scenarios depends on phylogeny
Leaché and McGuire. Molecular Phylogenetics and Evolution 39: 628-644
Testing evolutionary hypotheses
Reconstructing ancestral characters
This phylogeny also shows how we can use
data from living species to infer character
states in ancestral taxa
[Figure: phylogeny with ancestral nodes marked “?”]
Ancestral state could be blue, purple, or intermediate… outgroup comparison indicates blue is most parsimonious
Leaché and McGuire. Molecular Phylogenetics and Evolution 39: 628-644
Testing evolutionary hypotheses
Mapping evolutionary transitions
How many times has venom
evolved in squamate reptiles?
Once in the large “venom clade”
Groups within this clade then
evolved different venom types
e.g., different proteins found in
Snakes versus Gila monsters
Even non-venomous lizards in this
clade (Iguania) share ancestral toxins
Fry et al. (2006) Nature 439: 584-588
Testing evolutionary hypotheses
Convergence and modes of speciation
What can this phylogeny tell us about homology/analogy and speciation?
[Figure: phylogeny of cichlids from Lake Tanganyika and Lake Malawi]
1. Similarities between each pair are the result of convergence
2. Sympatric speciation more likely than allopatric speciation
Testing evolutionary hypotheses
Coevolution
Aphids and bacteria are symbiotic
Given this close relationship, we might
expect that speciation in an aphid would
cause parallel speciation in the bacteria
When comparing phylogenies for each
group we see evidence for reciprocal
cladogenesis (but also contradictions)
Clark et al. (2000)
Testing evolutionary hypotheses
Geographic origins
Where did domestic corn (Zea mays maize) originate?
Populations from Highland Mexico are at the base of each maize clade
[Figure: panels A and B]
Matsuoka et al. (2002)
Testing evolutionary hypotheses
Geographic origins
Where did humans originate?
Each tip is one of 135 different
mitochondrial DNA types found
among 189 individual humans
African mtDNA types are clearly basal on the tree, with the non-African types derived
Suggests that humans originated in
Africa
Vigilant et al. (1991) Science
Reconstructing evolutionary history
Phenetic methods
Based on overall difference between taxa = “distance” methods
Only considers shared characters; not shared, derived characters
Suppose you use DNA hybridization
to compare DNA of 4 species
A differs from B by 4%
A differs from C by 10%
A differs from D by 10%…for all pairs
Use algorithm to find shortest tree
“Quick and dirty” method
Distance method will often recover
trees that are similar to cladistic trees,
but it requires constant rate of evolution
or it will give erroneous groupings
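As a rough illustration of a distance method, here is a minimal sketch of UPGMA clustering, one common algorithm of this kind (the slide does not say which algorithm is meant). The A–B, A–C, and A–D distances come from the example above; the B–C, B–D, and C–D values are hypothetical, filled in only so the matrix is complete. UPGMA assumes a constant rate of evolution, which is the limitation noted above.

```python
from itertools import combinations


def upgma(distances, taxa):
    """Join the closest pair of clusters repeatedly; return a nested label."""
    sizes = {t: 1 for t in taxa}
    d = {frozenset(pair): value for pair, value in distances.items()}
    while len(sizes) > 1:
        # Pick the pair of clusters separated by the smallest distance
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = f"({a},{b})"
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance from the new cluster = size-weighted average of its parts
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))


# Percent differences; the last three values are made up for illustration
DISTANCES = {
    ("A", "B"): 4, ("A", "C"): 10, ("A", "D"): 10,
    ("B", "C"): 10, ("B", "D"): 10, ("C", "D"): 6,
}
tree = upgma(DISTANCES, ["A", "B", "C", "D"])
```

With these numbers A and B are joined first (4%), then C and D (6%), and the two pairs are joined last.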
Reconstructing evolutionary history
Cladistic methods (Willi Hennig 1966)
Based on shared, derived characters = synapomorphies
Similarity is not enough – requires similarity reflecting descent with modification
Requires characters that can be assigned a particular character state
Characters and character states
Character              Character states
eye color              blue, brown, green
mammary glands         present, absent
number of legs         0, 2, 4, 6, 8, etc.
Molecular characters:
nucleotide bases       A, C, T, G
amino acid codons      ACC, CGT, GAT, etc.
Terminology
Plesiomorphy: character state found in ancestor of group
Apomorphy: derived character state in descendants of group
Symplesiomorphy: shared, ancestral character state
Synapomorphy: shared, derived character state (indicates homology)
Polarity: distinguishing ancestral (0) from derived (1) = assigning polarity
- polarity can be assessed by outgroup comparison
[Figure: phylogeny of taxa A, B, C, and D with color, shape, and size mapped onto it]
“Blue” and “square” are plesiomorphic
“Small size” is an apomorphy for A
“Red” is a synapomorphy for A + B
“Circle” is a synapomorphy for A + B + C
…but a symplesiomorphy for A + B
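Outgroup comparison can be sketched in a few lines. The example below assumes that D plays the role of the outgroup and that taxa other than A are "large"; neither detail is stated on the slide, so treat both as illustrative. Each state is coded 0 if it matches the outgroup (ancestral) and 1 if it differs (derived).

```python
# Character states per taxon; "large" and the choice of D as outgroup are
# assumptions made for this illustration.
TAXA = {
    "A": {"color": "red", "shape": "circle", "size": "small"},
    "B": {"color": "red", "shape": "circle", "size": "large"},
    "C": {"color": "blue", "shape": "circle", "size": "large"},
    "D": {"color": "blue", "shape": "square", "size": "large"},
}


def polarize(taxa, outgroup):
    """Code each state as 0 (same as outgroup) or 1 (different = derived)."""
    reference = taxa[outgroup]
    return {
        name: {ch: int(state != reference[ch]) for ch, state in states.items()}
        for name, states in taxa.items()
    }


def derived_in(coded, character):
    """List the taxa carrying the derived state (1) for one character."""
    return sorted(name for name, states in coded.items() if states[character] == 1)


coded = polarize(TAXA, outgroup="D")
```

The derived state of color is found in A + B, the derived state of shape in A + B + C, and the derived state of size in A alone, matching the synapomorphies and apomorphy listed above.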
Synapomorphy
Each character shown in
pink is a synapomorphy
Shared - by all descendants
in the clade
e.g., all chordates share a
notochord
Derived – not present in
ancestral taxa
e.g., ancestral deuterostome
lacks a notochord
Any clade must share at
least one synapomorphy
Synapomorphy
How can we tell how well
a clade is supported?
In part, by the number of
synapomorphies
Few synapomorphies = weaker support
Many synapomorphies = stronger support
Homoplasy
Taxa share a character, but not by descent from a common ancestor
Equivalent to analogy, homoplasy is a product of convergent evolution
Homoplasy gives the impression of homology (synapomorphy) and therefore
misleads phylogenetic analyses by supporting polyphyletic taxa
[Figure: recovered phylogeny compared with true phylogeny]
Homoplasy
Homoplasies that look like homologies: stripes, spotted caudal fin, yellow color
[Figure: recovered phylogeny versus true phylogeny for cichlids from Lake Tanganyika and Lake Malawi. True phylogeny: Malawi cichlids monophyletic]
Morphological characters
Examples
Skull structure in cetaceans
Genitalia in ants
Morphological characters
Constructing a character matrix
Suppose we want to know the phylogeny of cichlids A, B, C using an Outgroup
First, we need characters that are variable within this group
Taxon   Pattern   Caudal pattern   Caudal shape   Forehead bulge?
Out     Striped   Spot             Round          No
A       Barred    None             Forked         No
B       Barred    None             Forked         No
C       Barred    None             Round          Yes
Synapomorphies: shared, derived states (e.g., barred pattern in A + B + C)
Apomorphy: forehead bulge in C
Parsimony
How do we decide the “best” phylogeny?
Parsimony – the simplest explanation is preferred (Occam’s razor)
A trivial example (much more complicated with real datasets)
Requires 5 steps: round → forked tail (twice), stripe → barred, spot → plain tail, no bump → forehead bump
Most parsimonious: requires only 4 steps: round → forked tail, stripe → barred, spot → plain tail, no bump → forehead bump
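The step counts for the cichlid example can be checked with a small parsimony calculation. The sketch below uses the Fitch algorithm, one standard way to count the minimum number of changes on a given tree; the slide does not name an algorithm, and the two tree shapes are written here as assumptions about what the slide compares.

```python
def fitch(tree, states):
    """Return (possible ancestral states, steps) for one character."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    lset, lsteps = fitch(left, states)
    rset, rsteps = fitch(right, states)
    shared = lset & rset
    if shared:
        # Children agree on at least one state: no change needed here
        return shared, lsteps + rsteps
    # Children disagree: one change is required on this part of the tree
    return lset | rset, lsteps + rsteps + 1


def tree_length(tree, matrix):
    """Total number of steps over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


# Columns: pattern, caudal pattern, caudal shape, forehead bulge
MATRIX = {
    "Out": ("striped", "spot", "round", "no"),
    "A": ("barred", "none", "forked", "no"),
    "B": ("barred", "none", "forked", "no"),
    "C": ("barred", "none", "round", "yes"),
}
TREE_AB = ("Out", (("A", "B"), "C"))  # A and B are sister taxa
TREE_AC = ("Out", (("A", "C"), "B"))  # A and C are sister taxa
```

Grouping A with B needs 4 steps, while grouping A with C needs 5 because the forked tail must then change twice, so the first tree is preferred.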
Molecular characters
1. Extract
2. Sequence
3. Align
Outgroup   AAGCTTCATAGGAGCAACCATTCTAATAATAAGCCTCATAAAGCC
Species A  AAGCTTCACCGGCGCAGTTATCCTCATAATATGCCTCATAATGCC
Species B  GTGCTTCACCGACGCAGTTGTCCTCATAATGTGCCTCACTATGCC
Species C  GTGCTTCACCGACGCAGTTGCCCTCATGATGAGCCTCACTATGCA
Molecular characters
Outgroup   AAGCTTCATA
Species A  GAGCTTCACA
Species B  GTGCTTCACG
Species C  GTGCTTCACG
Invariable sites: these are not useful phylogenetic characters
[Figure: trees relating Out, A, B, and C]
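Sorting the sites of this short alignment into invariable and variable ones can be sketched as follows. Site numbers are 1-based and are my own labeling, not something shown on the slide.

```python
ALIGNMENT = {
    "Out": "AAGCTTCATA",
    "A": "GAGCTTCACA",
    "B": "GTGCTTCACG",
    "C": "GTGCTTCACG",
}


def variable_sites(alignment):
    """Return 1-based positions where the sequences are not all identical."""
    columns = zip(*alignment.values())
    return [i for i, col in enumerate(columns, start=1) if len(set(col)) > 1]


def derived_taxa(alignment, position, outgroup="Out"):
    """List ingroup taxa whose base differs from the outgroup at a site."""
    reference = alignment[outgroup][position - 1]
    return sorted(
        taxon
        for taxon, seq in alignment.items()
        if taxon != outgroup and seq[position - 1] != reference
    )
```

Only sites 1, 2, 9, and 10 vary. At sites 1 and 9 all three ingroup species differ from the outgroup, and at sites 2 and 10 only B and C do, so six of the ten sites carry no phylogenetic information.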
Molecular characters
Outgroup   AAGCTTCATA
Species A  GAGCTTCACA
Species B  GTGCTTCACG
Species C  GTGCTTCACG
Synapomorphies supporting A+B+C: A→G (site 1) and T→C (site 9)
Any mutations at this time would affect A, B, and C because they have not yet diverged
[Figure: tree of Out, A, B, and C with A→G and T→C mapped onto the branch leading to A + B + C]
Molecular characters
Outgroup   AAGCTTCATA
Species A  GAGCTTCACA
Species B  GTGCTTCACG
Species C  GTGCCTCACG
Synapomorphies supporting A+B+C: A→G (site 1) and T→C (site 9)
Synapomorphies supporting B+C: A→T (site 2) and A→G (site 10)
Any mutations at this time would affect B and C
[Figure: tree of Out, A, B, and C with these mutations mapped onto the branches]
Molecular characters
Outgroup   AAGCTTCATA
Species A  GAGCTTCACA
Species B  GTGCTTCACG
Species C  GTGCCTCACG
Synapomorphies supporting A+B+C: A→G (site 1) and T→C (site 9)
Synapomorphies supporting B+C: A→T (site 2) and A→G (site 10)
Apomorphy for C: T→C (site 5)
Any mutations at this time would affect only C
[Figure: tree of Out, A, B, and C with these mutations mapped onto the branches]
Molecular characters
Homoplasy is still a problem
There are only 4 possible character states for nucleotides:
A G C T
Homoplasy arises when nucleotide mutates back to ancestral state:
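Without back-mutation:
Outgroup   AAGCTTCATA
Species A  GAGCTTCACA
Species B  GTGCTTCACG
Species C  GTGCTTCACG
With a back-mutation in C (site 2: A→T→A):
Outgroup   AAGCTTCATA
Species A  GAGCTTCACA
Species B  GTGCTTCACG
Species C  GAGCTTCACG
Back-mutation “erases” synapomorphy and produces homoplasy
[Figure: tree of Out, A, B, and C with A→G, T→C, A→T, A→G, and T→A mapped onto the branches]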
Molecular characters
Homoplasy is still a problem
Homoplasy can also reflect convergent mutations:
Outgroup   AAGCTTCATA
Species A  GAGCTTAACA
Species B  GTGCTTCACG
Species C  GAGCTTAACG
The same C→A change at site 7 arises independently in A and in C, in addition to the back-mutation at site 2 in C (A→T→A)
[Figure: tree of Out, A, B, and C with A→G, T→C, A→T, A→G, T→A, and two separate C→A changes mapped onto the branches]
Morphology vs molecules
Morphology
- Homoplasy can be assessed from structure, development, etc. (PRO)
- Characters may be subject to selection = convergence = homoplasy (CON)
- Takes lots of time to identify and code characters for analysis (CON)
- Requires parsimony analysis (?)
- Only someone familiar with taxon can identify good characters (PRO & CON)
Nucleotides
- Homoplasy can’t be assessed directly (an “A” is an “A”) (CON)
- Characters may or may not be subject to selection, depending on the site (?)
- Sequencing yields lots of characters if gene is sufficiently variable (PRO)
- Can use either parsimony or likelihood analysis, giving stronger inference (PRO)
- Any idiot can get sequence data (PRO & CON)
With either approach, it all comes down to successfully identifying
synapomorphies and distinguishing them from homoplasies
Character conflict
With either morphology or molecules,
some characters will not “agree” on
the most parsimonious phylogeny
Some characters support monophyletic
Reptilia exclusive of birds
These are not synapomorphies for
“reptiles”, they are ancestral traits
Feathers, two legs, and endothermy
are apomorphic in birds
Other characters reflect synapomorphy
and recover the true relationships
But in many cases it is more difficult to
resolve character conflict
Consensus
When multiple phylogenies are supported…
A consensus tree shows only those relationships common to all trees
The lower tree is a “compromise” between conflicting upper phylogenies
Examples:
- two equally parsimonious trees
- two trees from different genes
- morphological vs. molecular tree
- parsimony vs. likelihood tree
Consensus trees built from conflicting phylogenies will have at least one polytomy, a branching event that is not a bifurcation
Better to have an incompletely resolved
tree than an incorrect tree
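A strict consensus, which keeps only the clades found in every input tree, can be sketched as below. The two input trees are hypothetical examples of "two equally parsimonious trees", not trees taken from the slide.

```python
def clades(tree):
    """Return the tip sets defined by the internal nodes of a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        tipset = frozenset().union(*(walk(child) for child in node))
        found.add(tipset)
        return tipset

    walk(tree)
    return found


def strict_consensus(trees):
    """Keep only the clades present in every tree, smallest first."""
    shared = set.intersection(*(clades(t) for t in trees))
    return sorted((sorted(c) for c in shared), key=lambda c: (len(c), c))


# Two conflicting hypothetical trees: they disagree on who is closest to A
TREE_1 = ((("A", "B"), "C"), "D")
TREE_2 = ((("A", "C"), "B"), "D")
consensus = strict_consensus([TREE_1, TREE_2])
```

Both trees agree on the clade A + B + C, but not on the pairing inside it, so the consensus keeps A + B + C as a three-way polytomy with D outside.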
Consensus
An example of a consensus tree for loons
The middle tree is a “compromise” between conflicting left and right trees
Polytomy
Are polytomies real?
Usually not - they reflect inability to reconstruct the true bifurcating phylogeny
We often encounter polytomies in cases of rapid speciation when an ancestor
rapidly diverged into many new forms
[Figure: true phylogeny compared with inferred phylogeny for taxa A through K; the legend shows a scale of 1 million years and a symbol for a change in character state]
We can only recover those branches
on which we “see” characters change
Different genes for different questions
Molecular stopwatch: deepest root 35 mya (use mtRNA)
Molecular hourglass: 600 mya (use nuclear rRNA)
Different genes, different trees
[Figure: gene trees for Gene 1 and Gene 2 drawn within the species tree for Species A, B, and C; one resulting tree is labeled Incorrect and the other Correct]
Red and blue indicate different alleles for a particular gene (gene 1 or 2)
Because genes are inherited as a single unit, all of the nucleotides in a gene can support the same phylogeny, and that phylogeny could still fail to reflect the true sequence of speciation