Viral Metagenomics
Andrew Millard
Warwick Medical School, University of Warwick
Email: [email protected]
Twitter: @milja001

Overview
• Brief history
• Methods
• Analysis
• Examples

Brief History
• 1st viral metagenome – Breitbart 2002, PNAS
• Marine seawater sample
• Human faeces – Breitbart 2003, J Bacteriology
• Four ocean biomes – Angly 2006, PLoS Biology
• Global Ocean Survey – Williamson 2008, PLoS ONE
• Human faeces from monozygotic twins – Reyes 2010, Nature
• Pacific Ocean Virome – Hurwitz 2013, PLoS ONE
• TARA Oceans – Brum 2015, Science

Collection of viral fraction → Concentration → Nucleic acid extraction → Library preparation → Analysis

Virus Collection
• Separation from larger particulate matter
• Water samples – easy
• Faecal matter – less fun
• Iron chloride flocculation (Poulos 2015)
• Filtration
• Tangential flow filtration (TFF)
• Both methods work
• Depends on samples – TFF can filter from 0.5 l to 100s of litres of water

Concentration – Caesium Chloride
• Expensive
• Specialised equipment
• Ultracentrifuge
• Low throughput
• Does not remove all free bacterial DNA
[Figure label: viral band]

Concentration – Columns
• Amicon columns
• Higher throughput
• Cheaper
• Less specialised equipment

Viral nucleic acid types
• dsDNA
  – Linear
  – Circular
  – Linear with single-strand breaks
• ssDNA
  – Linear
  – Circular
• dsRNA
  – Linear, positive, negative strand, segmented
• ssRNA
  – Linear
One library preparation method will NOT target all viruses

Bacterial Contamination (?)
• Bacterial contamination
• Does it matter?
• Removal of bacterial DNA by DNase I treatment, prior to viral lysis
• Check for contamination with PCR or qPCR – 16S primers, rpoB etc.
• BUT – is detection contamination?
• Transduction
  – Rates vary for different bacteriophages
  – 1 × 10^9 virus-like particles (VLP) in seawater
  – Transduction rate of 1 × 10^-6
  – 1 × 10^3 particles may contain host DNA – is this contamination?
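
The transduction estimate above is a single multiplication, and a minimal sketch makes it easy to vary the inputs. The two input values are the figures quoted on the slide. The function and variable names are illustrative assumptions, and the calculation treats the transduction rate as the fraction of particles that package host DNA.

```python
# Expected number of virus-like particles (VLPs) that carry host DNA,
# assuming the transduction rate is the fraction of particles packaging host DNA.

def expected_transducing_particles(vlp_count: float, transduction_rate: float) -> float:
    """Return the expected number of particles containing host DNA."""
    return vlp_count * transduction_rate


vlp_in_sample = 1e9        # VLPs in the seawater sample (figure from the slide)
transduction_rate = 1e-6   # transduction rate (figure from the slide)

expected = expected_transducing_particles(vlp_in_sample, transduction_rate)
print(f"Expected particles carrying host DNA: {expected:.0f}")
```

Because rates vary between bacteriophages, changing `transduction_rate` by an order of magnitude shifts the expected amount of host DNA in a "clean" viral fraction by the same factor.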

Sequencing
• Library preparation
  – Will depend on nucleic acid type
  – DNA – dsDNA – Nextera XT
• Options
  – RASL, LASL, Nextera XT, TruSeq …
• Amount of DNA is dependent on method used
• Amplification – phi29. Can cause 10,000× differences in resultant population (Zhang 2006)
• How much sequence is enough?
  – Depends on the question

To assemble or not?
• Dependent on question of interest
• Assembly
  – MetaVelvet, CLC, Ray Meta, etc.
• Annotation
  – EBI, MG-RAST
  – PROKKA (Seemann 2014) with custom bacteriophage database

Analysis – The Dark Matter
Data will look something like this
[Figure: known sequence vs. unknown sequence]
Viral metagenomic sequences from human faeces, a marine sediment sample and two seawater samples were compared to the GenBank non-redundant database at the date of publication and in December 2004. The percentage of each library that could be classified as Eukarya, Bacteria, Archaea, viruses or showed no similarities (E-value >0.001) is shown.

Viral Diversity
• Majority of sequences will have no similarity to current databases
• Estimates of viral protein clusters vary
  – ~2 billion protein clusters (Rohwer 2003)
  – 3.9 million (Espinoza 2013)
• ~5,746 virus populations in the ocean
  – Only 39 previously identified

Analysis Issues – Databases
• Databases are crucial
• Very small database of bacteriophage and viral genomes
  – 4026 complete viral genomes (http://www.ebi.ac.uk/genomes/virus.html)
  – ~96 Mb total sequence
  – ~20 kb mean (7 kb median) viral genome size
• In contrast, >40,000 Salmonella genomes!!
• 1 MiSeq run (V3 chemistry) ~15 Gb
• Would allow sequencing of all complete viral genomes to reasonable coverage (in theory)

Analysis – Assembly
Phage set | Fully assembled genomes, no misassembly | Partially assembled genomes, no misassembly | Genomes with assembly errors
Mycobacterium phages | 88% (169/192) | 3.6% (7/192) | 8.4% (16/192)
Pseudomonas phages | 85.5% (164/192) | 8.3% (16/192) | 6.2% (12/192)
Mycobacterium & Pseudomonas phages | 100% (192/192) | 0% (0/192) | 0% (0/192)
Mycobacterium & Pseudomonas & Synechococcus phages | 92.7% (267/288) | 3.47% (10/288) | 3.81% (11/288)
Mycobacterium & Pseudomonas & Synechococcus & Bacillus phages | 87.32% (335/384) | 4.42% (17/384) | 8.33% (32/384)
In silico modelling suggests misassembly of phage genomes will occur in mixtures of closely related phages

Viral Metagenomics: Lake Bogoria, Kenya
Sophie Clough & Martha Clokie

Background
• Flamingos eat Arthrospira sp. blooms
• Highly specialised diet
• Crashes in blooms result in less food
• Flamingos die (Kaggwa et al., 2013)
• Flamingos provide tourism to local community
• Cyanophage implicated in lysis of blooms (Peduzzi et al., 2014)

Data Collected
• Viral numbers
• Identification of Arthrospira sp. present
• Abundance data of Arthrospira
• Viral DNA samples
• CTD, pH
• Samples are continually being collected on a monthly basis

Viral Metagenomics
• Iron chloride concentration
• Nextera XT library preparation
• Analysis
  – Assembly: CLC Workbench
  – MetaVir2 analysis
  – Annotation against custom viral database

Diversity
[Figure: number of sequence clusters vs. number of sequences]

Sample Summary
Sample | Details | Contigs | Genes predicted | Circular contigs | Unaffiliated contigs | Affiliated contigs | % Affiliated
a22 | C/0 | 67315 | 88990 | 42 | 52070 | 15245 | 29.2
a23 | N/0 | 48677 | 71352 | 21 | 36112 | 12565 | 34.7
a25 | S/0 | 109987 | 157761 | 67 | 88809 | 21178 | 23.8
b22 | C/25 | 79880 | 12744 | 54 | 59185 | 20695 | 34.9
b23 | N/25 | 52125 | 73752 | 17 | 40182 | 11943 | 29.7
b25 | S/25 | 87002 | 132177 | 53 | 67179 | 19823 | 29.50
c22 | C/50 | 12936 | 18973 | 4 | 9265 | 3671 | 39.6
c23 | N/50 | 50327 | 71111 | 17 | 39724 | 10603 | 26.6
c25 | S/50 | 43624 | 62006 | 10 | 34821 | 8803 | 25.8
d22 | C/75 | 59922 | 85208 | 42 | 47640 | 12082 | 25.3
d25 | N/75 | 8355 | 12065 | 6 | 5537 | 2818 | 50.8

Identification of multiple novel phage assemblies
[Figure: Novel phage 1 alongside Pelagibacter phage HTVC008M, Mycobacterium phage PattyP and Synechococcus phage ACG2014c. Key: no similarity; phage associated protein]

Depth Analysis
North Basin Vertical Stratification Comparisons
[Figure: depth (m) vs. % reads that map to assembly of surface sample]
• Viral community changes with depth

Flamingo Summary
• Viral metagenomes are diverse
• Majority of genes are unknown
• Identification of multiple novel phage isolates

Are phage a reservoir of antibiotic resistance genes in slurry?
[Figure labels: grass in; 70 litres a day]
~30 million tonnes of slurry per year is spread onto farmland

Assaying for phage carriage of antibiotic resistance genes in cattle slurry
1. Isolate bacteriophage
  • Sequence their genomes
2. Add phage fraction to bacterial isolates
  • Assay for antibiotic resistance
  • Selects for temperate phage only
  • Adding phage increased occurrence of resistant colonies (0.005% of cells when using E. coli)
3. Metagenomics of viral fraction
  • Search for antibiotic genes

Phage Isolation on E. coli
• Isolated 20 phage
• Purify
• Sequence
• Annotate
[Figure labels: 14 plaques; 305 pfu/ml]

Diversity of phage genomes isolated on E. coli
[Figure labels: T4-like; T4-like; HK587-like; T5-like; Shivani; Seurat; Novel; Bacteriophage T4]
No antibiotic resistance genes

Slurry Metagenome
• 2,171,116 PE reads
• ~217,735 contigs
• 56,824 genes
• 12,236 are annotated as known phage genes (BLASTp 1e-5)

Coverage
• Only 10% of reads map to any known viral genome
• BUT 1% of reads map to a novel phage genome

Kraken Analysis
99.25% of reads unassigned
[Figure: bar chart of reads by viral family: Astroviridae, Lipothrixviridae, Poxviridae, Potyviridae, Herpesviridae, Ascoviridae, Phycodnaviridae, Polydnaviridae, Baculoviridae, Microviridae, Tymoviridae, Flaviviridae, Podoviridae, Partitiviridae, Myoviridae, Siphoviridae]

Analysis of assembled contigs – MetaVir
• 70% of contigs have no “known” viral genes
• 30% of contigs associated as viral contigs
• Dominated by Siphoviridae

Assembled phage genomes (?)
[Figure: Phage 1, Phage 2, Phage 3. Key: known phage protein; no similarity; bacterial/phage protein]

Antibiotic Resistance Genes
• 190 genes have similarity to known antibiotic resistance genes
• Database from http://ardb.cbcb.umd.edu/
• None are localised on a contig with a “known” phage gene

Conclusions III
• Cattle slurry harbours a vast diversity of unknown bacteriophages
• Just like every other viral metagenome!!
• Isolation of 20 phage from a single host did not isolate the “same” phage twice
• Viral fraction can transfer antibiotic resistance to E. coli
• Viral metagenomics reveals the presence of antibiotic resistance genes

Acknowledgements
Prof Martha Clokie
Sophie Clough
Becky Smith
Marie O Hara
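
The antibiotic resistance slide asks whether any resistance-like gene sits on the same contig as a “known” phage gene. The slides do not describe how this was checked, so what follows is only a minimal sketch of one way to do it. The gene records, contig names and category labels ("arg", "phage", "unknown") are hypothetical toy data, not results from the slurry metagenome.

```python
from collections import defaultdict

# Hypothetical toy annotations: (gene_id, contig_id, category).
# Categories are illustrative labels, not output from any specific tool.
genes = [
    ("g1", "contig_1", "phage"),
    ("g2", "contig_1", "unknown"),
    ("g3", "contig_2", "arg"),
    ("g4", "contig_2", "unknown"),
    ("g5", "contig_3", "arg"),
    ("g6", "contig_3", "phage"),
]


def colocalised_contigs(gene_records):
    """Return sorted contig ids that carry both an ARG-like gene and a known phage gene."""
    categories_by_contig = defaultdict(set)
    for _gene_id, contig_id, category in gene_records:
        categories_by_contig[contig_id].add(category)
    return sorted(
        contig
        for contig, categories in categories_by_contig.items()
        if {"arg", "phage"} <= categories
    )


hits = colocalised_contigs(genes)
print(hits)
```

An empty result corresponds to the situation reported for the slurry data. With many short contigs, an absence of co-localisation may also reflect fragmented assembly rather than the absence of phage carriage.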