Next Generation Sequencing and Infectious Disease

Joel R Sevinsky, PhD





Microbial genomes
Common isolate identification techniques using molecular biology
Whole genome sequencing (WGS)
Example of WGS for outbreak investigation
Questions

Genome size varies from 4.56 to 5.70 Mb

How big is 5 Mb???

Long story, some big books!
1,084,440 words in all seven books
Average word length ~5 letters
~5,422,200 letters total in box set
E. coli genomes range from 4.56 to 5.70 Mb
A single E. coli genome ≈ 1 box set

Single human genome ≈ 1,000 box sets!!!
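The box-set arithmetic above can be checked with a few lines of Python. This is only a sketch of the analogy: the word count and average word length are the figures quoted on the slide, while the human genome size (about 3.1 billion bases per haploid copy) is an assumption added here and does not come from the slides.

```python
# Figures quoted on the slide.
WORDS_IN_BOX_SET = 1_084_440
AVERAGE_WORD_LENGTH = 5  # letters, approximate

letters_per_box_set = WORDS_IN_BOX_SET * AVERAGE_WORD_LENGTH


def box_sets(genome_size_bases: float) -> float:
    """Express a genome size as a number of box sets (one letter per base)."""
    return genome_size_bases / letters_per_box_set


ecoli_smallest = box_sets(4.56e6)
ecoli_largest = box_sets(5.70e6)

# Assumption, not from the slides: a haploid human genome of roughly 3.1e9 bases.
human_haploid = box_sets(3.1e9)

print(f"Letters per box set: {letters_per_box_set:,}")
print(f"E. coli: {ecoli_smallest:.2f} to {ecoli_largest:.2f} box sets")
print(f"Human (haploid, assumed 3.1 Gb): {human_haploid:.0f} box sets")
```

Under that assumed human genome size the haploid figure comes out in the hundreds of box sets, and counting both copies of each chromosome roughly doubles it, which is the same order of magnitude as the round number on the slide.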





What do these bands really mean???
[Figure: genomes of 5 Mb each carrying 1, 2, 3, 2, and 4 restriction enzyme sites; band sizes given as genome size in Mb, marked at 5 Mb, 1 Mb, and 0.5 Mb]




Specific word = enzyme restriction site
Word frequency determines banding pattern.
Different words represent different enzymes.
What does PFGE really tell you then?
Table 1. Frequency of one word by book

Book                   Voldemort (n)
Sorcerer’s Stone       31
Chamber of Secrets     20
Prisoner of Azkaban    37

Table 2. Frequency of four words by book

Book                   Broomstick (n)   Spell (n)   Wand (n)   Wizard (n)
Sorcerer’s Stone       27               14          62         41
Chamber of Secrets     12               6           107        44
Prisoner of Azkaban    20               6           114        39
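The word-as-restriction-site analogy can be sketched as a toy in-silico digest. Everything here is illustrative: the sequences are made up, `fragment_lengths` is a hypothetical helper, and the cut is placed at the start of each site for simplicity. The site used is TCTAGA, the recognition sequence of XbaI. The sketch shows that two sequences can differ and still give the same fragment pattern, while a change inside a site alters the pattern.

```python
def fragment_lengths(genome: str, site: str, circular: bool = True) -> list[int]:
    """Return fragment lengths (longest first) after cutting at every `site`.

    Simplifications: the cut is placed at the first base of the site, and a
    site spanning the origin of a circular genome is not detected.
    """
    cuts = []
    position = genome.find(site)
    while position != -1:
        cuts.append(position)
        position = genome.find(site, position + 1)

    if not cuts:
        return [len(genome)]  # uncut molecule

    if circular:
        # n cuts on a circle give n fragments; the last one wraps around.
        lengths = [b - a for a, b in zip(cuts, cuts[1:])]
        lengths.append(len(genome) - cuts[-1] + cuts[0])
    else:
        boundaries = [0] + cuts + [len(genome)]
        lengths = [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]
    return sorted(lengths, reverse=True)


XBAI_SITE = "TCTAGA"

# Made-up toy "genomes", 26 bases each.
isolate_1 = "AAAA" + XBAI_SITE + "CCCCCCCC" + XBAI_SITE + "TT"
isolate_2 = "AAAA" + XBAI_SITE + "CCCCTCCC" + XBAI_SITE + "TT"  # change outside any site
isolate_3 = "AAAA" + XBAI_SITE + "CCCCCCCC" + "TCTGGA" + "TT"   # change inside a site

for name, genome in [("isolate_1", isolate_1), ("isolate_2", isolate_2), ("isolate_3", isolate_3)]:
    print(name, fragment_lengths(genome, XBAI_SITE))
```

In this toy case the first two sequences are not identical yet produce the same two fragments, which is one way to read the question on the slide: a banding pattern reports where the sites fall, not what lies between them.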
Pulsed Field Gel Electrophoresis (PFGE)
◦ Total gDNA fragments
Ribosomal RNA Sequencing (16S rRNA)
◦ 1 gene
Multi Locus Sequence Typing (MLST)
◦ 7 genes
Whole Genome Multi Locus Sequence Typing (wgMLST)
◦ Thousands of reference genes plus pan genome
Whole Genome Single Nucleotide Polymorphism Typing (wgSNP or hqSNP)
◦ Total gDNA

[Figure: typing methods placed along an "Information" axis; labels: Protein, DNA, Serotyping, PFGE, Sequencing, 16S rRNA, MLST, wgMLST, wgSNP or hqSNP, WGS]
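The difference between 7-gene MLST and wgMLST can be sketched as a comparison of allele profiles. The locus names, allele numbers, and the figure of 1,000 extra loci below are all hypothetical placeholders chosen for illustration, and `allele_differences` is a helper written for this sketch, not a function from any typing tool.

```python
def allele_differences(profile_a: dict[str, int], profile_b: dict[str, int]) -> int:
    """Count loci, among those present in both profiles, with different allele numbers."""
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(profile_a[locus] != profile_b[locus] for locus in shared_loci)


# Hypothetical 7-gene MLST profiles: both isolates carry the same alleles.
seven_loci = [f"gene{i}" for i in range(1, 8)]
mlst_x = {locus: 1 for locus in seven_loci}
mlst_y = dict(mlst_x)

# Hypothetical wgMLST profiles: the same 7 loci plus 1,000 further loci.
extra_loci = [f"locus{i}" for i in range(1000)]
wgmlst_x = {**mlst_x, **{locus: 1 for locus in extra_loci}}
wgmlst_y = dict(wgmlst_x)
wgmlst_y["locus17"] = 2   # made-up allele change
wgmlst_y["locus512"] = 5  # made-up allele change

print("7-gene MLST differences:", allele_differences(mlst_x, mlst_y))
print("wgMLST differences:", allele_differences(wgmlst_x, wgmlst_y))
```

With these invented profiles the two isolates are indistinguishable at seven loci but separable once more of the genome is compared, which is the progression the list above describes.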
40 box sets
[Figure: the 35-base sequence ATGCGTGATCTAGTAGTCTAGGAGCTGACCGATTA shown as 18 identical stacked rows]
Salmonella enterica serovar Enteritidis
[Figure labels: JEGX01.004, JEGX01.002, JEGX01.002, JEGX01.004]
A = suspect isolate, same time/PFGE
B = same patient over 5 weeks
C = suspect isolate for outbreak 5
D = environmental isolate, egg farm swab
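Whole genome SNP typing compares isolates like these by counting the positions at which their aligned sequences differ. The sketch below shows that counting step only. The isolate names and substitutions are invented for illustration and say nothing about the isolates on the slide; the base sequence is the 35-base example string used earlier in the deck, and `snp_distance` and `mutate` are hypothetical helpers.

```python
from itertools import combinations


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count differing positions between two aligned sequences of equal length.

    Positions where either sequence has a non-ACGT character (gap, N) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        a != b
        for a, b in zip(seq_a, seq_b)
        if a in "ACGT" and b in "ACGT"
    )


def mutate(seq: str, index: int, base: str) -> str:
    """Return a copy of `seq` with the base at `index` replaced (toy helper)."""
    if seq[index] == base:
        raise ValueError("replacement base equals the original base")
    return seq[:index] + base + seq[index + 1:]


reference = "ATGCGTGATCTAGTAGTCTAGGAGCTGACCGATTA"

# Made-up isolates for illustration only.
toy_isolates = {
    "iso1": reference,
    "iso2": reference,
    "iso3": mutate(reference, 5, "C"),
    "iso4": mutate(mutate(reference, 5, "C"), 20, "A"),
}

for (name_a, seq_a), (name_b, seq_b) in combinations(toy_isolates.items(), 2):
    print(f"{name_a} vs {name_b}: {snp_distance(seq_a, seq_b)} SNP(s)")
```

A real pipeline would first map reads to a reference and filter low-quality sites; only the final distance calculation is sketched here.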

“…comparison of these 61 genome sequences revealed that neither the 16S gene, nor the gene fragments usually used for MLST, provides biologically meaningful information on the relatedness of the sequenced isolates. The best way to analyze this is by taking into account all the genomic content, rather than looking at one or a few individual genes.”



Genome size varies from 4.56 to 5.70 Mb
This size variation demonstrates a genomic difference of more than 1 Mb (5.70 - 4.56 = 1.14 Mb) between isolates.
1 Mb = ~1,000 genes
“One Shot” Characterization of STEC
Tools: ANI, SerotypeFinder, VirulenceFinder, 7-gene MLST, ResFinder, Phylogenetic ID
GENUS/SPECIES: Escherichia coli
SEROTYPE: O104:H4
PATHOTYPE: Shiga toxin producing and Enteroaggregative E. coli (STEC & EAEC)
VIRULENCE PROFILE: stx2a, aggR, aggA, sigA, sepA, pic, aatA, aaiC, aap
SEQUENCE TYPE: ST678
ANTIMICROBIAL RESISTANCE GENES: blaTEM-1, blaCTX-M-15, strAB, sul2, tet(A)A, dfrA7
wgMLST CODE: 102:45.26.35.3
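One way to hold a "one shot" result like this in software is a small record type with one field per reported item. The sketch below is an assumption about how such a report might be structured, not the format produced by any of the tools named above; the class and field names are hypothetical, and the values are copied from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsolateReport:
    """Hypothetical container for a whole-genome characterization summary."""
    genus_species: str
    serotype: str
    pathotype: str
    virulence_profile: tuple[str, ...]
    sequence_type: str
    amr_genes: tuple[str, ...]
    wgmlst_code: str

    def summary(self) -> str:
        """Render the report as labelled lines, mirroring the slide layout."""
        return "\n".join([
            f"GENUS/SPECIES: {self.genus_species}",
            f"SEROTYPE: {self.serotype}",
            f"PATHOTYPE: {self.pathotype}",
            f"VIRULENCE PROFILE: {', '.join(self.virulence_profile)}",
            f"SEQUENCE TYPE: {self.sequence_type}",
            f"ANTIMICROBIAL RESISTANCE GENES: {', '.join(self.amr_genes)}",
            f"wgMLST CODE: {self.wgmlst_code}",
        ])


# Values as shown on the slide.
report = IsolateReport(
    genus_species="Escherichia coli",
    serotype="O104:H4",
    pathotype="Shiga toxin producing and Enteroaggregative E. coli (STEC & EAEC)",
    virulence_profile=("stx2a", "aggR", "aggA", "sigA", "sepA", "pic", "aatA", "aaiC", "aap"),
    sequence_type="ST678",
    amr_genes=("blaTEM-1", "blaCTX-M-15", "strAB", "sul2", "tet(A)A", "dfrA7"),
    wgmlst_code="102:45.26.35.3",
)

print(report.summary())
```

Keeping each result in its own typed field makes it straightforward to filter a collection of isolates by, for example, serotype or the presence of a given resistance gene.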



Outbreak investigation
◦ Sporadic vs outbreak
◦ Not just cluster but phylogenetic relationships
Microbial Source Tracking (MST)
Microbial Surveillance
◦ Food
◦ Environment
    ◦ Animals, soil, food prep areas, hospitals, etc.

Antibiotic resistance monitoring

Virulence gene monitoring

What else???
◦ Genotype predicts phenotype
◦ Mobile vs integrated