Lecture 6 Phylogenetic Inference

[Figures: tree sketch from Darwin’s notebook in 1837; tree diagram from “The Origin” in 1859. Portraits: Charles Darwin; Willi Hennig (cladistics).]

Phylogenetic inference
Willi Hennig, Cladistics
Definitions:
1. Clade, monophyletic group, natural group
   a. All individuals in the clade are derived from a single ancestor.
   b. All of this ancestor’s descendants are in the clade.
[Figure: monophyletic groups; tree labels: Ancestor, Fishes, Coelacanths, Lungfishes, Tetrapods, Sarcopterygians]
2. Ancestral vs. derived characters
   Apomorphy: derived character
3. Synapomorphy: shared derived character
[Figure: tree of taxa A, B, C, D marking an apomorphy and a synapomorphy]
4. Evolutionary reversal
5. Homoplasy, convergent evolution
[Figure: Fossa, Madagascar, Mongoose; Mountain Lion, California, Cat; Thylacine, Tasmania, Marsupial]
6. Parallel evolution

Phylogenetic Inference
• phylogenetic trees are built from “characters”.
• characters can be morphological, behavioral, physiological, or molecular.
• there are two important assumptions about the characters used to build trees:
1. they are independent.
2. they are homologous.

What is a homologous character?
• a homologous character is shared by two species because it was inherited from a common ancestor.
• a character that is possessed by two species but was not present in their recent ancestors is said to exhibit “homoplasy”.

Types of homoplasy:
1. Convergent evolution
   Example: evolution of eyes, flight.
2. Parallel evolution
   Example: lactose tolerance in human adults.
3. Evolutionary reversals
   Example: back mutations at the DNA sequence level (C → A → C).

[Figures: examples of convergent evolution; convergent evolution between placental and marsupial mammals]

What is the difference between convergent and parallel evolution?

                     Convergent                               Parallel
Species compared     distantly related                        closely related
Trait produced by    different genes/developmental pathways   same genes/developmental pathways

Phylogenetic reconstructions
1. Phenetics (Neighbor-Joining)
2. Cladistics (Maximum Parsimony)
3. Statistics (Maximum Likelihood)

Phylogenetic reconstructions
Phenetics (Distance Methods)

Aligned sequences [asterisks on the slide mark differing sites]:
A   ATGTTGCCA
B   AAGTTGCCA
C   ATCAACCCA
D   CTCAACTTA

Pairwise distances:
     A    B    C
B    1
C    4    5
D    7    8    4

Clustering by average distance:
(A,B) = 1
(A,B)C = (4+5)/2 = 4.5
(A,B)D = (7+8)/2 = 7.5
(A,B,C)D = (7+8+4)/3 = 6.3

[Figure: tree joining A and B, then C, then D; branch lengths on the slide: 0.5, 2.25, 1.75, 3.15, 0.9]

Phylogenetic reconstructions
Cladistics: Maximum Parsimony Method
[Figure: one character with states A = G, B = G, C = A, D = A mapped onto three alternative trees. Tree with taxon order A, B, C, D: 1 step. Tree with taxon order A, C, B, D: 3 steps. Tree with taxon order A, D, B, C: 3 steps.]

Phylogenetic reconstructions
Cladistics: Maximum Parsimony
Number of possible trees

Number of taxa              4     7        10
Number of rooted trees      15    10,395   34,459,425
Number of unrooted trees    3     945      2,027,025

How do we select the “best” tree?

No. of taxa    No. of possible (unrooted) trees
4              3
5              15
6              105
7              945
10             2 × 10^6
11             34 × 10^6
50             3 × 10^74

[Figure: independent gain of camera eye requires two changes; evolution and loss of camera eye requires six changes]

Phylogenetic reconstructions
Phenetics (Distance Methods)
- what are the principles pheneticists use to construct phylogenies?
1. tree should reflect overall degree of similarity.
2. tree should be based on as many characters as possible.
3. tree should minimize the distance between taxa.

Phylogenetic reconstructions
Cladistics
1. tree should reflect the true phylogeny.
2. phylogeny should be based on characters that are shared (by more than one taxon) and derived (from some known ancestral state).
3. the ancestral states of characters are inferred from an outgroup that roots the tree.
- an outgroup is ideally picked from fossil evidence, i.e., it belongs to a genus or family that existed prior to the taxa forming the ingroup.

Each subspecies of seaside sparrow has a restricted range.
[Figure: range map of the subspecies maritima, macgillivraii, nigrescens, peninsulae, junicola and fisheri along the Atlantic coast and Gulf coast]
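The tree counts tabulated for maximum parsimony can be reproduced with a short calculation. This is a minimal sketch, assuming strictly bifurcating trees with labeled tips, for which the number of rooted trees on n taxa is (2n-3)!! and the number of unrooted trees is (2n-5)!!. The function names are illustrative and do not come from the lecture.

```python
def double_factorial(k):
    """Product k * (k-2) * (k-4) * ... down to 1 (returns 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def rooted_trees(n_taxa):
    """Number of rooted bifurcating trees for n_taxa labeled tips (n_taxa >= 2)."""
    return double_factorial(2 * n_taxa - 3)


def unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees for n_taxa labeled tips (n_taxa >= 3)."""
    return double_factorial(2 * n_taxa - 5)


# Print the counts for the taxon numbers used in the first table.
for n in (4, 7, 10):
    print(n, rooted_trees(n), unrooted_trees(n))
```

The rapid growth of these numbers is why an exhaustive search for the most parsimonious tree quickly becomes impractical as taxa are added.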
The subspecies separate into two groups when DNA sequences are compared.
[Figure: tree of the subspecies macgillivraii, maritima, nigrescens, peninsulae, fisheri and junicola; group labels: Atlantic coast, Gulf coast]

How do distance trees differ from cladograms?

                   Distance trees         Cladograms
Characters used    as many as possible    synapomorphies only
Monophyly          not required           absolute requirement
Emphasis           branch lengths         branch-splitting
Outgroup           not required           absolute requirement

Phylogenetic reconstructions
3. Statistics (Maximum Likelihood)
The only method based on a mutation model!

Jukes-Cantor Model
[Figure: nucleotides A, C, G, T with every substitution occurring at rate α]
p_An = 3α (A changes to any of the other three nucleotides)

Kimura 2-parameter Model
[Figure: nucleotides A, C, G, T with transitions at rate α and transversions at rate β]
p_An = α + 2β

[Figure: infer relationships among three species, with an outgroup]

Markov chain Monte Carlo
1. Start at an arbitrary point
2. Make a small random move
3. Calculate height ratio (r) of new state to old state:
   1. r > 1 -> new state accepted
   2. r < 1 -> new state accepted with probability r. If new state not accepted, stay in the old state
4. Go to step 2

[Figure: posterior landscape over tree 1 (20 %), tree 2 (48 %) and tree 3 (32 %); from point 1, uphill move 2a is always accepted, downhill move 2b is accepted sometimes]

The proportion of time the MCMC procedure samples from a particular parameter region is an estimate of that region’s posterior probability density.

Phylogenetic reconstructions
1. Phenetics (Neighbor-Joining)
2. Cladistics (Maximum Parsimony)
3. Statistics (Maximum Likelihood)
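The Markov chain Monte Carlo steps listed earlier can be sketched for the three-tree example. This is a toy illustration under stated assumptions: the state space is just the three trees with posterior weights 0.20, 0.48 and 0.32, and the "small random move" is a symmetric jump to one of the other two trees. Real phylogenetic MCMC also proposes changes to branch lengths and model parameters. The function name `metropolis` is hypothetical.

```python
import random


def metropolis(target, n_steps, seed=0):
    """Sample states in proportion to `target` using the accept/reject rule.

    target: dict mapping state -> height (only ratios of heights are used)
    Returns the fraction of steps spent in each state.
    """
    rng = random.Random(seed)
    states = list(target)
    current = states[0]                      # step 1: arbitrary starting point
    counts = {s: 0 for s in states}
    for _ in range(n_steps):
        # Step 2: propose a move to one of the other states (symmetric proposal).
        proposal = rng.choice([s for s in states if s != current])
        # Step 3: height ratio of the new state to the old state.
        r = target[proposal] / target[current]
        # Uphill moves are always accepted; downhill moves with probability r.
        if r >= 1 or rng.random() < r:
            current = proposal
        counts[current] += 1                 # a rejected move counts the old state again
    return {s: c / n_steps for s, c in counts.items()}


posterior = {"tree 1": 0.20, "tree 2": 0.48, "tree 3": 0.32}
freqs = metropolis(posterior, 200_000)
for tree, f in freqs.items():
    print(tree, round(f, 3))
```

With a long enough chain the visit frequencies should approach the target weights, which is the sense in which sampling time estimates posterior probability.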
Phylogenetic Inference
Two points to keep in mind:
1. Phylogenetic trees are hypotheses
2. Gene trees are not the same as species trees
• a species tree depicts the evolutionary history of a group of species.
• a gene tree depicts the evolutionary history of a specific locus.

[Figure: conflict between gene trees and species trees]

How do we select the “best” tree?
Evaluating tree support by bootstrapping

Species 1    A  A  C  G  C  C  T  …  G
Species 2    A  T  C  G  C  C  T  …  G
Species 3    A  T  T  G  A  C  C  …  G
Species 4    A  T  T  G  A  C  C  …  G

Step 1. Randomly select a base to represent position 1 (on the slide: T, T, C, C for Species 1 to 4).
Step 2. Randomly select a base to represent position 2 (on the slide: G, G, G, G).
Step 3. Generate complete data set (sampling with replacement).
Step 4. Build tree and record if groupings match original tree.
Step 5. Repeat 1,000 times.

[Figure: bootstrapped tree of Species 1 to 4 with support values 98 and 92]

Cospeciation of aphids and their bacterial endosymbionts
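The bootstrapping steps above can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation. The alignment is a toy data set that assumes only the columns visible on the slide. The function `best_pairing` is a hypothetical stand-in for real tree building: for four taxa it simply picks the split into two pairs with the smallest total within-pair distance.

```python
import random

# Toy alignment (assumption): one string per species, visible slide columns only.
alignment = {
    "Species 1": "AACGCCTG",
    "Species 2": "ATCGCCTG",
    "Species 3": "ATTGACCG",
    "Species 4": "ATTGACCG",
}


def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def best_pairing(aln):
    """Hypothetical stand-in for tree building with four taxa.

    Scores the three ways to split four taxa into two pairs by total
    within-pair distance and returns the best split, or None on a tie.
    """
    a, b, c, d = sorted(aln)
    splits = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    scored = sorted(
        (hamming(aln[p[0]], aln[p[1]]) + hamming(aln[q[0]], aln[q[1]]), (p, q))
        for p, q in splits
    )
    if scored[0][0] == scored[1][0]:
        return None  # no unique best grouping in this data set
    return scored[0][1]


def bootstrap_support(aln, n_reps=1000, seed=0):
    """Fraction of bootstrap replicates that recover the original grouping."""
    rng = random.Random(seed)
    n_sites = len(next(iter(aln.values())))
    original = best_pairing(aln)
    if original is None:
        raise ValueError("original data do not support a unique grouping")
    hits = 0
    for _ in range(n_reps):  # Step 5: repeat n_reps times
        # Steps 1 to 3: draw site indices with replacement to fill every position.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {name: "".join(seq[i] for i in cols) for name, seq in aln.items()}
        # Step 4: rebuild the grouping and record whether it matches the original.
        if best_pairing(resampled) == original:
            hits += 1
    return original, hits / n_reps


grouping, support = bootstrap_support(alignment)
print(grouping, support)
```

Replicates in which no informative column is drawn produce a tie and are counted as not supporting the grouping, which is a conservative choice made for this sketch.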