Viral Metagenomics
Andrew Millard
Warwick Medical School
University of Warwick
Email: [email protected]
Twitter: @milja001
Overview
• Brief history
• Methods
• Analysis
• Examples
Brief History
• 1st viral metagenome – Breitbart 2002, PNAS
• Marine seawater sample
• Human faeces – Breitbart 2003, J Bacteriology
• Four Ocean biomes – Angly 2006, PLoS Biology
• Global Ocean Survey – Williamson 2008, PLOS ONE
• Human faeces from monozygotic twins – Reyes 2010, Nature
• Pacific Ocean Virome – Hurwitz 2013, PLOS ONE
• TARA Oceans – Brum 2015, Science
Collection of Viral Fraction
Concentration
Nucleic Acid Extraction
Library Preparation
Analysis
Virus Collection
• Separation from larger particulate matter
• Water samples – easy
• Faecal matter – less fun
• Iron Chloride flocculation – (Poulos 2015)
• Filtration
• Tangential flow filtration (TFF)
• Both methods work
• Depends on samples – TFF can filter from 0.5 l to 100s of litres of water.
Concentration - Caesium Chloride
• Expensive
• Specialised equipment
• Ultracentrifuge
• Low throughput
• Does not remove all free bacterial DNA
[Figure label: viral band]
Concentration - Columns
• Amicon columns
• Higher throughput
• Cheaper
• Less specialised equipment
Viral nucleic acid types
• dsDNA
• Linear
• Circular
• Linear with single strand breaks
• ssDNA
• Linear
• Circular
• dsRNA
• Linear, positive, negative strand, segmented
• ssRNA
• Linear
One library preparation method will NOT target all viruses
Bacterial Contamination (?)
• Bacterial contamination
• Does it matter?
• Removal of bacterial DNA by DNase I treatment, prior to viral lysis.
• Check for contamination with PCR or qPCR – 16S primers, rpoB etc.
• BUT – is detection contamination?
• Transduction
• Rates vary for different bacteriophages
• 1 x 10^9 vlp in seawater
• Transduction rate of 1 x 10^-6
• 1 x 10^3 particles may contain host DNA – is this contamination?
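The transduction estimate on this slide is a single multiplication. A minimal sketch of that arithmetic follows; the function and variable names are illustrative, and it assumes one uniform transduction rate across all particles, although the slide notes that rates vary between bacteriophages.

```python
# Figures taken from the slide; names are illustrative only.
VLP_COUNT = 1e9            # 1 x 10^9 virus-like particles (vlp) in seawater
TRANSDUCTION_RATE = 1e-6   # fraction of particles that package host DNA


def expected_transducing_particles(n_particles: float, rate: float) -> float:
    """Expected number of particles carrying host DNA.

    Assumes a single uniform rate for every phage in the sample,
    which is a simplification.
    """
    return n_particles * rate


expected = expected_transducing_particles(VLP_COUNT, TRANSDUCTION_RATE)
print(f"About {expected:.0f} particles may contain host DNA")
```

Under these assumptions, a 16S or rpoB signal at this level could come from transducing particles rather than from free bacterial DNA.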
Sequencing
• Library Preparation
• Will depend on nucleic acid type
• DNA – dsDNA – Nextera XT
• Options
• RASL, LASL, Nextera XT, TruSeq …
• Amount of DNA required depends on the method used
• Amplification – phi29. Can cause 10,000x differences in resultant population (Zhang 2006)
• How much sequence is enough?
• Depends on the question
To assemble or not?
• Dependent on question of interest
• Assembly
• MetaVelvet, CLC, Ray Meta, etc.
• Annotation
• EBI, MG-RAST
• Prokka (Seemann 2014) with custom bacteriophage database
Analysis – The Dark Matter
Data will look something like this
[Figure legend: known sequence; unknown sequence]
Viral metagenomic sequences from human faeces, a marine sediment sample
and two seawater samples were compared to the GenBank non-redundant
database at the date of publication and in December 2004. The percentage of
each library that could be classified as Eukarya, Bacteria, Archaea, viruses or
showed no similarities (E-value >0.001) is shown.
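The known/unknown split in the caption can be sketched as a filter over BLAST tabular output (the standard 12-column format, where column 11 is the E-value). This is an illustration under assumptions, not the pipeline used for the figure: the read IDs and hit lines are made up, and a read counts as "known" if any hit has an E-value at or below 0.001.

```python
from collections import Counter

E_VALUE_CUTOFF = 0.001  # threshold quoted in the figure caption


def classify_reads(blast_lines, all_read_ids, cutoff=E_VALUE_CUTOFF):
    """Count reads with at least one database hit at or below the cutoff.

    blast_lines: iterable of tab-separated BLAST tabular lines
                 (12 standard columns; column 11 holds the E-value).
    all_read_ids: every read in the library, including reads with no hit.
    """
    known = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        read_id, e_value = fields[0], float(fields[10])
        if e_value <= cutoff:
            known.add(read_id)
    return Counter("known" if r in known else "unknown" for r in all_read_ids)


# Hypothetical data for illustration only.
hits = [
    "read1\tphageA\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t150",
    "read2\tbactB\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t20",
]
reads = ["read1", "read2", "read3", "read4"]

counts = classify_reads(hits, reads)
percent_unknown = 100 * counts["unknown"] / len(reads)
print(f"{percent_unknown:.1f}% of reads show no similarity")
```

Reads without any hit never appear in the BLAST output, so the full list of read IDs is passed in separately to keep them in the denominator.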
Viral Diversity
• Majority of sequences will have no similarity to current databases
• Estimates of viral protein clusters vary
• ~2 billion protein clusters (Rohwer 2003)
• 3.9 million (Espinoza 2013)
• ~5,746 virus populations in the ocean
• Only 39 previously identified
Analysis Issues – Databases
• Databases are crucial
• Very small database of bacteriophage and viral genomes
• 4026 complete viral genomes (http://www.ebi.ac.uk/genomes/virus.html)
• ~96 Mb total sequence
• ~20 kb mean (7 kb median) viral genome size
• In contrast, >40,000 Salmonella genomes!
• 1 MiSeq run with V3 chemistry ~ 15 Gb
• Would allow sequencing of all complete viral genomes to reasonable coverage (in theory)
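The "in theory" coverage claim follows from the figures on this slide. A minimal sketch of the arithmetic, assuming the run's output is spread evenly over every base in the database (real libraries are never this even):

```python
# Figures taken from the slide; variable names are illustrative.
n_genomes = 4026           # complete viral genomes in the database
total_db_bases = 96e6      # ~96 Mb of total sequence
miseq_run_bases = 15e9     # ~15 Gb from one MiSeq run (V3 chemistry)

# Mean genome size implied by the totals (the slide rounds this to ~20 kb).
mean_genome_size = total_db_bases / n_genomes

# Mean depth if every base of every genome were sequenced equally.
mean_coverage = miseq_run_bases / total_db_bases

print(f"Implied mean genome size: {mean_genome_size / 1e3:.1f} kb")
print(f"Mean coverage from one run: {mean_coverage:.0f}x")
```

Even a several-fold skew in abundance between genomes would still leave substantial depth for most of them, which is why a single run would be enough in principle.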
Analysis – Assembly
Phage mixture | Fully assembled genomes, no misassembly | Partially assembled genomes, no misassembly | Genomes with assembly errors
Mycobacterium phages | 88% (169/192) | 3.6% (7/192) | 8.4% (16/192)
Pseudomonas phages | 85.5% (164/192) | 8.3% (16/192) | 6.2% (12/192)
Mycobacterium & Pseudomonas phages | 100% (192/192) | 0% (0/192) | 0% (0/192)
Mycobacterium & Pseudomonas & Synechococcus phages | 92.7% (267/288) | 3.47% (10/288) | 3.81% (11/288)
Mycobacterium & Pseudomonas & Synechococcus & Bacillus phages | 87.32% (335/384) | 4.42% (17/384) | 8.33% (32/384)
In silico modelling suggests misassembly of phage genomes will occur in mixtures of closely related phages
Viral Metagenomics
Lake Bogoria, Kenya
Sophie Clough & Martha Clokie
Background
Flamingos eat Arthrospira sp. blooms
Highly specialised diet
Crashes in blooms result in less food
Flamingos die (Kaggwa et al., 2013)
Flamingos bring tourism to the local community
Cyanophage implicated in lysis of blooms (Peduzzi et al., 2014)
Data Collected
• Viral numbers
• Identification of Arthrospira sp. present
• Abundance data of Arthrospira
• Viral DNA samples
• CTD, pH
• Samples are continually being collected on a monthly basis
Viral Metagenomics
• Iron chloride concentration
• Nextera XT library preparation
• Analysis
  • Assembly with CLC Workbench
  • MetaVir2 analysis
  • Annotation against custom viral database
[Figure: Diversity. Number of sequence clusters plotted against number of sequences]
Sample Summary
Sample | Details | Contigs | Genes Predicted | Circular Contigs | Unaffiliated Contigs | Affiliated Contigs | % Affiliated
a22 | C/0 | 67315 | 88990 | 42 | 52070 | 15245 | 29.2
a23 | N/0 | 48677 | 71352 | 21 | 36112 | 12565 | 34.7
a25 | S/0 | 109987 | 157761 | 67 | 88809 | 21178 | 23.8
b22 | C/25 | 79880 | 12744 | 54 | 59185 | 20695 | 34.9
b23 | N/25 | 52125 | 73752 | 17 | 40182 | 11943 | 29.7
b25 | S/25 | 87002 | 132177 | 53 | 67179 | 19823 | 29.5
c22 | C/50 | 12936 | 18973 | 4 | 9265 | 3671 | 39.6
c23 | N/50 | 50327 | 71111 | 17 | 39724 | 10603 | 26.6
c25 | S/50 | 43624 | 62006 | 10 | 34821 | 8803 | 25.8
d22 | C/75 | 59922 | 85208 | 42 | 47640 | 12082 | 25.3
d25 | N/75 | 8355 | 12065 | 6 | 5537 | 2818 | 50.8
[Figure labels: Pelagibacter phage HTVC008M; Mycobacterium phage PattyP; Synechococcus phage ACG2014c; Novel phage 1. Legend: no similarity; phage associated protein]
Identification of multiple novel phage assemblies
Depth Analysis
North Basin Vertical Stratification Comparisons
[Figure: % reads that map to assembly of Surface sample, plotted against depth (m)]
• Viral community changes with depth
Flamingo Summary
• Viral metagenomes are diverse
• Majority of genes are unknown
• Identification of multiple novel phage isolates
Grass in
Are phage a reservoir of antibiotic resistance genes in slurry?
70 litres a day
~30 million tonnes of slurry per year is spread onto farmland
Assaying for phage carriage of antibiotic resistance genes in cattle slurry
1. Isolate bacteriophage
• Sequence their genomes
2. Add phage fraction to bacterial isolates
• Assay for antibiotic resistance
• Selects for temperate phage only
• Adding phage increased occurrence of resistant colonies (0.005% of cells when using E. coli) *
3. Metagenomics of viral fraction
• Search for antibiotic resistance genes
Phage Isolation on E. coli
• Isolated 20 phage
• Purify
• Sequence
• Annotate
14 plaques
305 pfu/ml
Diversity of phage genomes isolated on E. coli
T4-like
T4-like
HK587-like
T5-like
Shivani
Seurat
Novel
Bacteriophage T4
No Antibiotic Resistance genes
Slurry Metagenome
• 2,171,116 PE reads
• ~217,735 contigs
• 56,824 genes
• 12,236 are annotated as known phage genes (BLASTp 1e-5)
Coverage
• Only 10% of reads map to any known viral genome
• BUT 1% of reads map to a novel phage genome
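To put the slurry numbers on a common footing, the sketch below turns the percentages into approximate counts and the gene counts into a fraction. It uses only figures quoted on these two slides; the variable names are illustrative, and it assumes the percentages refer to the 2,171,116 PE reads reported.

```python
# Figures taken from the slides; names are illustrative only.
pe_reads = 2_171_116
predicted_genes = 56_824
known_phage_genes = 12_236   # BLASTp hits at 1e-5

# Share of predicted genes annotated as known phage genes.
fraction_genes_known = known_phage_genes / predicted_genes

# Approximate read counts behind the quoted percentages
# (assumes the percentages are of the PE reads above).
reads_known_virus = round(0.10 * pe_reads)
reads_novel_phage = round(0.01 * pe_reads)

print(f"{fraction_genes_known:.1%} of predicted genes match known phage genes")
print(f"~{reads_known_virus:,} reads map to known viral genomes")
print(f"~{reads_novel_phage:,} reads map to the novel phage genome")
```

On these figures, roughly four in five predicted genes have no known phage match.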
Kraken Analysis
99.25% of reads unassigned
[Figure: viral families detected (axis 0 to 90): Astroviridae, Lipothrixviridae, Poxviridae, Potyviridae, Herpesviridae, Ascoviridae, Phycodnaviridae, Polydnaviridae, Baculoviridae, Microviridae, Tymoviridae, Flaviviridae, Podoviridae, Partitiviridae, Myoviridae, Siphoviridae]
Analysis of assembled contigs – MetaVir
• 70% of contigs have no “known” viral genes
• 30% of contigs are affiliated as viral contigs
Dominated by Siphoviridae
Assembled phage genomes (?)
[Figure: Phage 1, Phage 2, Phage 3 ... 20. Legend: known phage protein; no similarity; bacterial/phage protein]
Antibiotic Resistance Genes
• 190 genes have similarity to known antibiotic resistance genes
• Database from http://ardb.cbcb.umd.edu/
• None are localised on a contig with a “known” phage gene
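The co-localisation check described here amounts to a set intersection over contig IDs. The sketch below illustrates the idea, not the analysis actually run: the contig and gene names are hypothetical, and it assumes each annotation has already been reduced to a (contig, gene) pair.

```python
def contigs_with_both(arg_hits, phage_hits):
    """Return contigs carrying both a resistance-like gene and a known phage gene.

    arg_hits:   iterable of (contig_id, gene_id) pairs with similarity to
                known antibiotic resistance genes.
    phage_hits: iterable of (contig_id, gene_id) pairs annotated as known
                phage genes.
    """
    arg_contigs = {contig for contig, _gene in arg_hits}
    phage_contigs = {contig for contig, _gene in phage_hits}
    return arg_contigs & phage_contigs


# Hypothetical annotations for illustration only.
arg_hits = [("contig_12", "gene_a"), ("contig_40", "gene_b")]
phage_hits = [("contig_7", "gene_c"), ("contig_99", "gene_d")]

shared = contigs_with_both(arg_hits, phage_hits)
print(f"{len(shared)} contigs carry both gene types")
```

An empty result, as reported on the slide, does not rule out phage carriage: most contigs carry no "known" phage gene at all, so a resistance gene on a novel phage contig would be missed by this test.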
Conclusions III
• Cattle slurry harbours a vast diversity of unknown bacteriophages
• Just like every other viral metagenome!
• Isolation of 20 phage from a single host did not isolate the “same” phage twice.
• Viral fraction can transfer antibiotic resistance to E. coli
• Viral metagenomics reveals the presence of antibiotic resistance genes
Acknowledgements
Prof Martha Clokie
Sophie Clough
Becky Smith
Marie O Hara