Joel R Sevinsky, PhD

◦ Microbial genomes
◦ Common isolate identification techniques using molecular biology
◦ Whole genome sequencing (WGS)
◦ Example of WGS for outbreak investigation
◦ Questions

Genome size varies from 4.56 to 5.70 Mb

How big is 5 Mb?
Long story, some big books!
◦ 1,084,440 words in all seven books
◦ Average word length ~5 letters
◦ ~5,422,200 letters total in box set
◦ E. coli genomes range from 4.56 to 5.70 Mb
◦ A single E. coli genome ≈ 1 box set
◦ A single human genome ≈ 1,000 box sets!

What do these bands really mean?
[Figure: banding patterns for 5 Mb genomes with 1, 2, 3, 2, and 4 restriction enzyme sites; genome size axis in Mb (5 Mb, 1 Mb, 0.5 Mb)]
◦ Specific word = enzyme restriction site
◦ Word frequency determines banding pattern.
◦ Different words represent different enzymes.

What does PFGE really tell you then?

Table 1. Frequency
Book                   Voldemort (n)
Sorcerer’s Stone       31
Chamber of Secrets     20
Prisoner of Azkaban    37

Table 2. Frequency
Book                   Broomstick (n)   Spell (n)   Wand (n)   Wizard (n)
Sorcerer’s Stone       27               14          62         41
Chamber of Secrets     12               6           107        44
Prisoner of Azkaban    20               6           114        39

Protein, DNA, WGS
◦ Pulsed Field Gel Electrophoresis: total gDNA fragments
◦ Ribosomal RNA Sequencing: 1 gene
◦ Multi Locus Sequence Typing: 7 genes
◦ Whole Genome Multi Locus Sequence Typing: thousands of reference genes plus pan genome
◦ Whole Genome Single Nucleotide Polymorphism Typing: total gDNA

[Figure: typing methods compared by information: Serotyping, PFGE, 16S rRNA sequencing, MLST, wgMLST, wgSNP or hqSNP]

40 box sets
[Figure: the same sequence read, ATGCGTGATCTAGTAGTCTAGGAGCTGACCGATTA, repeated many times]
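
The PFGE analogy earlier in these slides (a specific word is a restriction site, and word frequency determines the banding pattern) can be sketched in code. The following is only a toy illustration, not a description of any laboratory protocol: the short "genome" string and the GAATTC recognition site are made-up inputs, the function names are hypothetical, and the cut is assumed to fall at the start of each site on a circular molecule.

```python
# Toy in-silico restriction digest of a circular genome.
# Assumptions (not from the slides): the cut falls at the start of each
# recognition site, and sites that span the origin of the string are ignored.


def find_sites(genome: str, site: str) -> list[int]:
    """Return the start position of every occurrence of `site` in `genome`."""
    positions = []
    start = genome.find(site)
    while start != -1:
        positions.append(start)
        start = genome.find(site, start + 1)
    return positions


def circular_fragments(genome: str, site: str) -> list[int]:
    """Return fragment lengths (largest first) after cutting a circular genome."""
    n = len(genome)
    cuts = find_sites(genome, site)
    if not cuts:
        return [n]  # no site: the circle stays intact
    fragments = []
    for i, cut in enumerate(cuts):
        next_cut = cuts[(i + 1) % len(cuts)]
        # Distance to the next cut, wrapping around the circle.
        # A single cut gives 0 here, which means one full-length fragment.
        fragments.append((next_cut - cut) % n or n)
    return sorted(fragments, reverse=True)


# Hypothetical 22-letter "genome" with two GAATTC sites.
toy_genome = "AAGAATTCAAAAAAGAATTCAA"
print(circular_fragments(toy_genome, "GAATTC"))  # [12, 10]
```

The list of fragment lengths plays the role of the banding pattern: two genomes with the same number of sites at different positions give different lists, while the pattern says nothing about the sequence between the sites.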
Salmonella enterica serovar Enteritidis
[Figure labels: JEGX01.004, JEGX01.002, JEGX01.002, JEGX01.004]
A = suspect isolate, same time/PFGE
B = same patient over 5 weeks
C = suspect isolate for outbreak 5
D = environmental isolate, egg farm swab

“…comparison of these 61 genomes sequences revealed that neither the 16S gene, nor the gene fragments usually used for MLST, provides biologically meaningful information on the relatedness of the sequenced isolates. The best way to analyze this is by taking into account all the genomic content, rather than looking at one or a few individual genes.”

Genome size varies from 4.56 to 5.70 Mb
This size variation demonstrates a genomic difference of up to 1 Mb between isolates.
1 Mb = ~1,000 genes

“One Shot” Characterization of STEC
Tools: ANI, SerotypeFinder, VirulenceFinder, 7-gene MLST, ResFinder, Phylogenetic ID
GENUS/SPECIES: Escherichia coli
SEROTYPE: O104:H4
PATHOTYPE: Shiga toxin producing and Enteroaggregative E. coli (STEC & EAEC)
VIRULENCE PROFILE: stx2a, aggR, aggA, sigA, sepA, pic, aatA, aaiC, aap
SEQUENCE TYPE: ST678
ANTIMICROBIAL RESISTANCE GENES: blaTEM-1, blaCTX-M-15, strAB, sul2, tet(A)A, dfrA7
wgMLST CODE: 102:45.26.35.3

Outbreak investigation
◦ Sporadic vs outbreak
◦ Not just cluster but phylogenetic relationships
Microbial Source Tracking (MST)
Microbial Surveillance
◦ Food
◦ Environment: animals, soil, food prep areas, hospitals, etc.
Antibiotic resistance monitoring
Virulence gene monitoring
What else?
◦ Genotype predicts phenotype
◦ Mobile vs integrated
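
As a rough illustration of the wgSNP idea from the typing-method comparison above, the sketch below counts single-nucleotide differences between pairs of already-aligned sequences. It is a minimal sketch under stated assumptions, not the pipeline used in the outbreak example: the isolate names and the 10-letter sequences are invented, the function names are hypothetical, and real SNP typing works on whole genomes with alignment and quality filtering that are not shown here.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count positions where two aligned sequences carry different bases.

    Positions where either sequence has something other than A, C, G or T
    (for example N or a gap) are skipped rather than counted.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a, b)
        if x != y and x in "ACGT" and y in "ACGT"
    )


def pairwise_distances(isolates: dict[str, str]) -> dict[tuple[str, str], int]:
    """Return the SNP distance for every unordered pair of isolates."""
    return {
        (p, q): snp_distance(isolates[p], isolates[q])
        for p, q in combinations(sorted(isolates), 2)
    }


# Hypothetical aligned toy sequences (not data from the slides).
isolates = {
    "isolate_1": "ATGCGTGATC",
    "isolate_2": "ATGCGTGATC",  # identical to isolate_1
    "isolate_3": "ATGCGTGTTC",  # one difference from isolate_1
    "isolate_4": "TTGCGAGATG",  # three differences from isolate_1
}

distances = pairwise_distances(isolates)
for pair, d in distances.items():
    print(pair, d)
```

Isolates separated by few or no differences would be candidates for the same cluster, while the full set of pairwise distances is the kind of input from which phylogenetic relationships, and not just clusters, can be inferred.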