Download 2_16S_TREE_RECONSTRUCTION

3- RIBOSOMAL RNA GENE RECONSTRUCITON  Phenetics Vs. Cladistics  Homology/Homoplasy/Orthology/Paralogy  Evolution Vs. Phylogeny  The relevance of the alignment  The algorithms  Bootstrap  One tree is no tree Phylogenetic coherence (monophyly) phylogenetic coherence genomic coherence phenotypic coherence 50% 60% 70% 70-50% 70% 80% 100% RNAr 16S Functional genes (MLSA) Genomic analyses Reasociación DNA-DNA G+C, AFLP, MLSA Genomic comparisons (ANI; AAI) metabolism chemotaxonomy spectrometry (Maldi-Tof; ICR-FT/MS)  Generally based on 16S rRNA gene analysis  important to recognize the closest relatives by means of the Type Strain gene sequences  Housekeeping genes (MLSA approach or single gene) may help in resolve phylogenies  Future perspectives will be done with full-genome sequences Phenetics vs Cladistics  Data can be treated as presence/absence/intensity to generate similarity matrices  If data is analyzed by their similarity  PHENETICS  If data is analyzed in an evolutionary context (i.e. changes in homologous characters are mutations or evolutive steps)  CLADISTICS Similarity matrix or alignment  For evolutive purposes is necessary to recognize HOMOLOGY PHENETICS 80 85 90 OTU A 10100010010010010 OTU B 11010001010001010 OTU C 00010010011110101 OTU D 00111110010101010 OTU E 00010010111001101 … M8 M31 A1 M1 A7 P13 P18 PR1 C12 C16 E3 E11 C9 C4 C5 C25A E7 CLADISTICS HOMOLOGY  ORTOLOGY  PARALOGY  HOMOPLASY Homology  same ancestral origin Organism A Gene X Homoplasy  false homology Organism B Gene X Orthology  homologous genes in different organisms Organism A Gene X Gene X’ Gene X’’ Paralogy  homologous genes in the same organism, gene duplications with identical or different function HOMOLOGY  ORTOLOGY  PARALOGY  HOMOPLASY Homoplasy (false homology) Organism A Gene X Organism B Gene X Orthology  homologous genes in different organisms Homology (same ancestral origin) Organism A Gene X Gene X’ Gene X’’ Paralogy  homologous genes in the same organism, gene duplications with identical or different function Evolution vs. Phylogeny Evolution => mutations (morphometrics) + age (fossil record) Phylogeny = genealogy => we know only the tips of the tree, nothing is said about putative ancestors Evolution ≠ phylogeny PROKARYOTES => no fossil record => molecular clocks Molecular clocks (housekeeping genes): 16S rRNA; 23S rRNA; ATPases; TU-elongation factor; gyrases… The 16S rRNA:  Universally represented  Conserved  No protein coding  Base pairing (helix)  Natural amplification  Proper size Ludwig and Schleifer, 1994 FEMS Rev 15:155-173 The relevance of the alignment To perform cladistic analyses we should first align al sequences in order to recognize all homologous positions. Recognition by:  Sequence similarities  Base pairing due secondary structure (helixes for rRNA)  Insertions & deletions  Empirically (subjective)  Minimize homoplasic influences There are many alignment programs, all look to common features that may indicate homologous sites:  Clustal X  MAFFT  PileUp … The relevance of the alignment Most of the programs do not take into account secondary structure, just sequence motive similarities rRNA has a secondary structure with helixes that help in aligning sequences Functional gene or translated proteins cannot be improved by secondary structure analysis The relevance of the alignment www.arb-home.de www.arb-silva.de ARB does take into account features as helix pairing By increasing the numbers of sequences, the alignment improves The algorithms b c c a a b Like Maximum Parsimony but takes into account dendrograms alignment Distance transformation a => 0 a => 100 b => 40 0 b => 60 100 Jukes-Cantor c => 60 20 0 c => 40 80 100 Kimura a b c a b c De Soete Distance matrix Similarity matrix (pitfalls: does not take into account multiple mutations) Maximum Parsimony G C C A T => a G C A C T => b G C A C C => c a b 2 b – c => 1 mutation a c 3 b c b 1 3 2 2 5 5 3 a – b => 2 mutations a – c => 3 mutations c Maximum Likelihood (pitfalls: nature may not be parsimonious) a  difficulties in mutation events (transitions vs. transversions)  mutation position  Slower transitions transversions Neighbor Joining: G C C A T => a G C A C T => b G C A C C => c T C A G Bootstrap Bootstrap indicates how stable is a branching order when a given dataset is submitted to multiple analysis Generally short internode branches will have low bootstrap values  TERMINI  42,284 homologous positions PHYLOGENETIC FILTERS  BACTERIA  1,532 homologous positions  30%  1,433 homologous positions  50%  1,288 homologous positions NJ_bac USE OF PHYLOGENETIC FILTERS  Conservational filters are useful for deepbranching phylogenies  complete sequences are useful for close relative organisms NJ_30% NJ_50% Size & information content  complete sequences give complete information  partial sequences lose phylogenetic signal  short sequences lose resolution 1500 nuc 300 nuc 900 nuc One tree is no tree  different algorithms  different topologies  try different datasets as well  draw a consensus tree RaXML NJ PAR RECOMMENDATIONS FOR 16S rRNA TREE RECONSTRUCTION  SEQUENCE almost complete is better than short partial sequences  ALIGNMENT Better take into account secondary structures  ALGORITHM Better maximum likelihood, but compare with other as neighbor joining and maximum parsimony  DATASET Never just one dataset, try different sets of data (i.e. different number of sequences; different filters to find the best resolution)  FINAL TREE  Either you show all trees, or the best bootstrapped, or a multifurcation showing unresolved branching order. B E C A G H I F D A 95 50 25 B E C D F G H I 100 90 25 100 100 Tree with bootstrap Tree with multifurcation MLSA: phylogenetic reconstructions MULTIPLE SEQUENCE ALIGNMENTS  sometimes have better resolution than the 16S rRNA gene  16S rRNA gene can have very low resolution Jiménez et al., 2013, System Appl Microbiol, 36: 383- 391

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download 2_16S_TREE_RECONSTRUCTION