MetaPhlAn v2 and tracking microbes at the strain level
Edoardo Pasolli
Nicola Segata’s Lab
Laboratory of Computational Metagenomics
Centre for Integrative Biology
University of Trento, Italy
MetaSUB Conference
June 20th, 2015
The shotgun metagenomic workflow
Taxonomic profiling: who’s there?
Figure (workflow): tons of microbes → shotgun sequencing → tons of short reads → taxonomic profiling → organismal relative abundances
Inputs to taxonomic profiling: sequenced genomes of (some) microbes, microbial taxonomy
• MetaPhlAn (Segata et al., Nature Methods 2012)
• MetaPhlAn v2 (released, under review)
• http://segatalab.cibio.unitn.it/tools/metaphlan2
Taxonomic profiling: unique marker genes
Figure: gene X is a unique marker gene for clade Y
IDEA
1. Pre-identify markers from reference genomes
2. Use markers as proxy for taxonomic clades in shotgun metagenomics
Method: ChocoPhlAn
THE INPUT
• ~25,000 genomes (Bacteria, Archaea, Fungi, Viruses)
• ~1/10 of genomes are final, ~9/10 draft
• ~7,100 species (excludes incomplete annotations, spp., etc.)
THE RESULTING DATABASE
• ~15M total unique marker genes
• ~1M “most representative” unique marker genes
• 180±45 markers per species (200 fixed max)
• Quasi-markers used to resolve ambiguity in postprocessing
Figure: number of microbial organisms in RefSeq per year (x-axis: 2003 to 2013; y-axis: 0k to 18k)
Taxonomic profiling: MetaPhlAn’s overview
Figure (pipeline):
Offline: reference genomes + taxonomy → marker identification (ChocoPhlAn) → marker database (Clade 1, Clade 2)
Per sample: metagenome → mapping to the marker database (MetaPhlAn) → profiled metagenome (Clade 1, Clade 2)
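The mapping step turns reads that hit clade-specific markers into a profile. A minimal sketch of that last step follows, under stated assumptions: read counts per marker are already available from some aligner, each marker's coverage is its read count divided by its length, a clade's coverage is the plain mean over its markers, and abundances are coverages rescaled to sum to 100. The estimator MetaPhlAn actually uses is not given on the slides, and the function and variable names here are hypothetical.

```python
def relative_abundances(read_counts, marker_lengths, marker_to_clade):
    """Percent abundance per clade from length-normalized marker read counts."""
    per_clade = {}
    for marker, clade in marker_to_clade.items():
        # Markers with no mapped reads contribute zero coverage.
        coverage = read_counts.get(marker, 0) / marker_lengths[marker]
        per_clade.setdefault(clade, []).append(coverage)

    # Average coverage over each clade's markers (simple mean, an assumption).
    mean_cov = {clade: sum(v) / len(v) for clade, v in per_clade.items()}
    total = sum(mean_cov.values())
    if total == 0:
        return {clade: 0.0 for clade in mean_cov}
    return {clade: 100.0 * cov / total for clade, cov in mean_cov.items()}


# Hypothetical toy input: two markers for clade A, one for clade B.
marker_to_clade = {"m1": "A", "m2": "A", "m3": "B"}
marker_lengths = {"m1": 1000, "m2": 500, "m3": 1000}
read_counts = {"m1": 30, "m2": 15, "m3": 10}
profile = relative_abundances(read_counts, marker_lengths, marker_to_clade)
print(profile)
```

Dividing by marker length is what makes the result an organismal abundance rather than a raw share of reads: a clade with longer markers does not appear more abundant merely because it attracts more reads.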
Taxonomic profiling: MetaPhlAn’s main features
• Species-level resolution
• Computational feasibility
• Organismal relative abundance rather than DNA concentrations
• Consistent detection confidence for all clades
• High accuracies for very short reads (as short as ~50nt)
• Detection of organisms without sequenced genomes
Figure label: Prevotella copri
Main MetaPhlAn2 additions
• Profiling not only for Bacteria and Archaea, but also for Viruses, Fungi and Protozoa
• 6-fold increase in the number of considered species: >7,000 species
• Introduction of the concept of quasi-markers
• Improvement of quantitative performance: higher correlation with true abundances, lower false positive and false negative rates
• Improvement of computational performance
• Addition of strain-specific barcoding for microbial strain tracking
• Strain-level identification for organisms with sequenced genomes
• Integration with post-processing and visualization tools
Callout: profiled thousands of samples in a few days
Thanks!
The Laboratory of Computational Metagenomics
Matthias Scholz
Adrian Tett
Tin Truong
Edoardo Pasolli
Federica Armanini
Francesco Asnicar
Pamela Ferretti
Moreno Zolfo
Thomas Tolio
Serena Manara
Mattia Bolzan
Francesco Beghini
Luca Erculiani
http://cibiocm.bitbucket.org - [email protected]
Olivier Jousson
Curtis Huttenhower
Wendy Garrett
Doyle Ward
Jacques Izard
Flaminia Catteruccia
Marco Ventura
Owen R White
Dan Littman
Veronica De Sanctis
Roberto Bertorelli
Enrico Blanzieri
http://segatalab.cibio.unitn.it/tools/metaphlan2