Ribosomal MLST: Universal Bacterial Typing
using Whole Genome Sequences
James E. Bray, Keith A. Jolley, Martin C.J. Maiden
Department of Zoology, University of Oxford, South Parks Road, Oxford, OX1 3PS, UK.
Contact: [email protected]
Introduction
Overview of the Bacterial Domain using Protein-encoding Ribosomal Genes
The Ribosomal Multi-Locus Sequence Typing (rMLST)
database hosted on the PubMLST website provides allelic
sequence definitions for more than 3,000 different bacterial
species and can be used for rapid speciation and sequence
typing across the bacterial domain [1]. The rMLST approach
indexes the variation of the 53 ribosomal protein subunit (rps)
genes. These genes are present in all bacterial species and
therefore this approach can provide a universal bacterial typing
nomenclature for the biomedical community.
rMLST Website
pubmlst.org/rmlst
Schematic diagram of the ribosomal subunits.
RNA is shown in yellow and orange; protein domains are
shown in blue. Image courtesy of RCSB Protein Data Bank.
Speciation by rMLST is concordant with 16S rRNA but is able
to resolve individual species that, in many cases, 16S rRNA
cannot.
Currently, more than 100,000 bacterial genomes have been collected
from public repositories and the rMLST allelic variation has been
catalogued using the BIGSdb platform [2]. BIGSdb is open-source
software that includes a database for storing isolate sequences,
metadata and allelic definitions for defined loci. The platform
includes advanced functionality for interrogating genomic data using
a gene-by-gene approach at different levels of resolution, from
traditional 7-locus MLST and 53-gene rMLST through to core genome
MLST and whole genome analyses.
Neighbor-joining tree reconstructed from concatenated
rMLST alleles from 1565 bacterial isolates.
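The trees and networks on this poster are built from nucleotide variation in concatenated rMLST alleles. As a minimal sketch of that idea (not the BIGSdb or SplitsTree implementation), the Python below joins each isolate's allele sequences in a fixed locus order and computes the proportion of differing sites between two isolates. The locus names, toy sequences and function names are hypothetical, and the sketch assumes the alleles are already aligned to equal length.

```python
def concatenate(alleles, loci):
    """Join one isolate's allele sequences in a fixed locus order."""
    return "".join(alleles[locus] for locus in loci)


def p_distance(seq_a, seq_b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return differences / len(seq_a)


# Hypothetical data: two loci standing in for the 53 rps loci.
loci = ["locus_A", "locus_B"]
isolates = {
    "isolate_1": {"locus_A": "ATGC", "locus_B": "GGTA"},
    "isolate_2": {"locus_A": "ATGA", "locus_B": "GGTA"},
    "isolate_3": {"locus_A": "TTGA", "locus_B": "GCTA"},
}

concatenated = {name: concatenate(seqs, loci) for name, seqs in isolates.items()}
d12 = p_distance(concatenated["isolate_1"], concatenated["isolate_2"])
d13 = p_distance(concatenated["isolate_1"], concatenated["isolate_3"])
d23 = p_distance(concatenated["isolate_2"], concatenated["isolate_3"])
```

A matrix of such pairwise distances is the kind of input that neighbor-joining or NeighborNet methods work from; real analyses would align the sequences first and typically use a substitution model rather than a raw proportion.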
Compare Genomes at Different Taxonomic Levels
The Genome Comparator module of the BIGSdb platform allows genomic
data from hundreds of isolates to be compared using gene-by-gene
analysis at any taxonomic level. Because rMLST uses 53 loci, it can
resolve down to the level of strain type, with resolution comparable
to, and often better than, conventional MLST.
For example, Campylobacter isolates can be analysed at genus,
species, or individual clonal complex level. NeighborNet diagrams
were calculated by SplitsTree [3] using nucleotide variation of
concatenated rMLST alleles.
A) Species within a Genus
NeighborNet diagram of Campylobacter isolates from 12 different
species.
B) Isolates within a Species
NeighborNet diagram of 36 Campylobacter jejuni isolates (one
representative of each clonal complex and C. jejuni subsp. doylei).
C) Isolates within a Single Clonal Complex
NeighborNet diagram of 186 Campylobacter jejuni isolates from within
the ST-21 complex.
Ribosomal Sequence Typing Nomenclature
Ribosomal sequence types (rSTs) are defined based on the unique
combination of alleles at the 53 rps loci. Ribosomal sequence types
have been defined for 14 bacterial genera, including many human
pathogenic species such as Campylobacter jejuni, Neisseria
meningitidis and Staphylococcus aureus, and work is ongoing to scale
up the process to encompass all bacterial species.
Workflow diagram (labels as extracted): NCBI Genomes (Assembled);
Public Genomes (Raw Reads); Velvet Assembly Pipeline [4]; Isolates;
Scan; Identify alleles for all ribosomal genes; Define Alleles;
Analyse all alleles for each gene for a whole genus; Designate;
Profiles; Define unique combinations of alleles.
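An rST is defined by a unique combination of alleles across the rps loci, which can be illustrated with a short Python sketch. This is not the BIGSdb implementation: the locus names, allele numbers and the sequential numbering of rSTs are assumptions made for illustration only.

```python
def define_rsts(isolate_profiles):
    """Assign one rST number to each unique combination of alleles.

    isolate_profiles maps isolate name -> {locus: allele number}.
    rST numbers are handed out sequentially in order of first
    appearance, which is an assumption of this sketch.
    """
    rst_by_profile = {}
    assignments = {}
    for isolate, profile in isolate_profiles.items():
        # Sort by locus so the key does not depend on dictionary order.
        key = tuple(sorted(profile.items()))
        if key not in rst_by_profile:
            rst_by_profile[key] = len(rst_by_profile) + 1
        assignments[isolate] = rst_by_profile[key]
    return assignments, rst_by_profile


# Hypothetical profiles over two loci standing in for the 53 rps loci.
isolate_profiles = {
    "isolate_1": {"locus_A": 1, "locus_B": 4},
    "isolate_2": {"locus_A": 1, "locus_B": 5},
    "isolate_3": {"locus_A": 1, "locus_B": 4},
}
assignments, rst_by_profile = define_rsts(isolate_profiles)
```

Here isolate_1 and isolate_3 share every allele and so receive the same rST, while isolate_2 differs at one locus and receives a new one.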
Using Ribosomal Alleles and Sequence Types to Identify Species
Diagram inputs: Genomic Sequence File; Library of ribosomal allele sequences.
The library of ribosomal allele sequences can be searched with any
bacterial genome sequence. The BIGSdb Species Finder program will
identify all the exact allelic matches within the sequence. The program
reports the number of isolates and associated species annotations for
each allelic match and the ribosomal sequence type (rST) if there is a
complete profile match.
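The search described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the BIGSdb Species Finder code: exact matching is done by plain substring search on both strands, only the first matching allele per locus is kept, and all data structures, names and toy sequences are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def find_exact_matches(genome, allele_library):
    """Return {locus: allele_id} for alleles found exactly in the genome."""
    matches = {}
    for locus, alleles in allele_library.items():
        for allele_id, sequence in alleles.items():
            if sequence in genome or reverse_complement(sequence) in genome:
                matches[locus] = allele_id
                break  # keep the first exact hit per locus (a simplification)
    return matches


def identify(genome, allele_library, species_by_allele, rst_by_profile):
    """Report exact matches, pooled species annotations, and rST if complete."""
    matches = find_exact_matches(genome, allele_library)
    species_counts = Counter()
    for locus, allele_id in matches.items():
        species_counts.update(species_by_allele.get((locus, allele_id), {}))
    rst = None
    if len(matches) == len(allele_library):  # every locus has an exact match
        rst = rst_by_profile.get(tuple(sorted(matches.items())))
    return matches, species_counts, rst


# Hypothetical library with two loci standing in for the 53 rps loci.
allele_library = {
    "locus_A": {1: "ATGGCT", 2: "ATGGCA"},
    "locus_B": {1: "TTGACC"},
}
species_by_allele = {
    ("locus_A", 2): {"Campylobacter jejuni": 12},
    ("locus_B", 1): {"Campylobacter jejuni": 9, "Campylobacter coli": 1},
}
rst_by_profile = {(("locus_A", 2), ("locus_B", 1)): 7}

# locus_B allele 1 occurs on the reverse strand of this toy genome.
genome = "CCCATGGCACCCGGTCAACCC"
matches, species_counts, rst = identify(
    genome, allele_library, species_by_allele, rst_by_profile
)
```

In this toy case both loci have an exact match, so the complete profile can be looked up and an rST is returned; with any locus unmatched, only the per-allele species annotations would be reported.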
Library contents: 53 loci; 300,000+ sequences.
Genus             Number of Ribosomal Sequence Types Defined
Streptococcus      3,987
Staphylococcus     3,709
Campylobacter      1,964
Escherichia        1,825
Neisseria          1,423
Salmonella         1,101
Mycobacterium        840
Acinetobacter        603
Vibrio               304
Shigella             251
Bacillus             249
Listeria             215
Yersinia             149
Bordetella            60
Total             16,680
Conclusions
• We have developed a universal bacterial sequence typing method for analysing
bacterial genomes using the 53 protein-encoding ribosomal (rps) genes.
• Genomes can be compared at all taxonomic levels using the BIGSdb Genome
Comparator module.
• We have defined more than 16,000 ribosomal sequence types (rSTs) for 154
bacterial species including many human pathogens of clinical importance.
Species Finder diagram (labels as extracted): Identify exact allelic matches;
Look up allele indexes in profile database to find Ribosomal Sequence Type (rST).
Processes involved for species identification using BIGSdb Species Finder program
References
1. Jolley, K.A., et al., Ribosomal multilocus sequence typing: universal characterization of bacteria from domain to strain.
Microbiology, 2012. 158(Pt 4): p. 1005-15.
2. Jolley, K.A. and M.C. Maiden, BIGSdb: Scalable analysis of bacterial genome variation at the population level. BMC
Bioinformatics, 2010. 11: p. 595.
3. Huson, D.H. and D. Bryant, Application of phylogenetic networks in evolutionary studies. Molecular Biology and
Evolution, 2006. 23(2): p. 254-67.
4. Zerbino, D.R. and E. Birney, Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res,
2008. 18(5): p. 821-9.
Table: number of ribosomal sequence types defined per genus.
Web interface for the BIGSdb Species Finder program
This project is funded by the Wellcome Trust.