Genomic island analysis: Improved web-based software and insights into an apparent gene pool associated with genomic islands

William Hsiao
Brinkman Laboratory, Simon Fraser University, Burnaby, BC, Canada

Prokaryotic Genomic Islands (GIs)

Definition: Genomic DNA segments with particular characteristics that indicate horizontal origins

[Figure: a bacterium with a GI marked on its chromosome]

Genomic Island Characteristics

- Exhibit sequence and annotation features (%G+C, sequence composition bias)
- Often contain genes encoding adaptive functions of medical and environmental importance
  - Pathogenicity Islands: virulence factors (genes contribute to diseases)
  - Resistance Islands: antibiotic resistance
  - Metabolic Islands: secondary metabolism (e.g. sucrose)

[Figure: a Genomic Island (e.g. PAI) on the chromosome, with a tRNA gene, Direct Repeats, mob genes, and VF genes. mob: mobility genes]

IslandPath: Aiding identification of GIs

[Figure: IslandPath display of the TCP island, Vibrio cholerae N16961 Chr1]

Legend:
- A yellow circle: %G+C above high cutoff
- A green circle: %G+C between cutoffs
- A pink circle: %G+C below low cutoff
- A black bar: transfer RNA
- A purple bar: ribosomal RNA
- A deep blue bar: both tRNA and rRNA
- A black square: transposase
- A black triangle: integrase
- A strike-line: regions with dinucleotide bias

TCP = toxin co-regulated pili (Hsiao et al 2003 Bioinformatics p418-20)

IslandPath V.2

Which Features Best Identify GIs

- Examined prevalence of features in 95 published islands
- 85% of islands with >25% dinucleotide bias coverage (62% have >50% dinucleotide bias coverage)
- Mobility genes identified in >75% of the islands
- tRNA genes observed in <50% of known islands
- Only 20% of the islands show atypical %G+C

Properties of genes in GIs?

- Defined a “putative island” as
  - 8 or more genes in a row with dinucleotide bias
  - 8 or more genes in a row with dinucleotide bias + an associated mobility gene
- Any difference for genes in islands versus outside of islands in terms of their protein functional categories?
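The "putative island" definition above (8 or more consecutive genes with dinucleotide bias, optionally with a mobility gene) can be sketched as a simple run-length scan. This is a minimal illustration, not IslandPath's implementation: the `Gene` record, the precomputed `dinuc_biased` flag, and the reading of "associated mobility gene" as "a mobility gene inside the run" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gene:
    # Hypothetical record; the per-gene bias call is assumed to be
    # computed elsewhere (the slides do not give the bias formula).
    name: str
    dinuc_biased: bool
    is_mobility: bool = False  # e.g. annotated integrase or transposase


def putative_islands(genes: List[Gene], min_run: int = 8,
                     require_mobility: bool = False) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) gene indices of putative islands.

    A putative island is a run of at least `min_run` consecutive genes
    flagged as dinucleotide-biased. With `require_mobility=True`, the run
    must also contain a mobility gene (an assumption about what
    "associated" means).
    """
    islands = []
    start = None
    # Iterate one step past the end so a run reaching the last gene is closed.
    for i in range(len(genes) + 1):
        biased = i < len(genes) and genes[i].dinuc_biased
        if biased:
            if start is None:
                start = i
            continue
        if start is not None:
            end = i - 1
            if end - start + 1 >= min_run:
                has_mob = any(g.is_mobility for g in genes[start:end + 1])
                if has_mob or not require_mobility:
                    islands.append((start, end))
            start = None
    return islands
```

Calling `putative_islands(genes)` corresponds to the DINUC dataset in the slides, and `putative_islands(genes, require_mobility=True)` to the stricter DINUC+MOB dataset, under the assumptions stated above.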

63 genomes (67 chromosomes) analyzed
COG: cluster of orthologous groups of proteins

More novel genes inside of islands

[Figure: Proportions of Genes with no COG Assignment in Islands vs. Outside. Bar chart per genome, island vs. outside island, y-axis 0.00% to 70.00%. Genomes shown: Bacillus subtilis 168, Borrelia burgdorferi B31, Buchnera sp. APS, Escherichia coli K12, Clostridium acetobutylicum ATCC824, Chlamydia trachomatis D, Escherichia coli O157, Haemophilus influenzae Rd-KW20, Helicobacter pylori 26695, Listeria innocua Clip11262, Mycoplasma pneumoniae M129, Mycobacterium tuberculosis CDC1551, Mycobacterium leprae, Neisseria meningitidis MC58, Pseudomonas aeruginosa PAO1, Salmonella typhimurium LT2, Staphylococcus aureus N315, Streptococcus pneumoniae TIGR4, Sulfolobus solfataricus, Vibrio cholerae chromosome I, Vibrio cholerae chromosome II, Yersinia pestis CO92. Paired t-test P value: 1.27E-18]

Hsiao et al. PLOS Genetics e62, Nov. 2005

Control for Analysis Biases

- Control for mis-prediction of genes in sequence composition biased regions
  - Excluded genes < 300bps
- Control for bias of COG Protein Classification
  - Used SUPERFAMILY classification which is better at detecting distant homologs
- Control for compositional bias due to other factors
  - Used the dinucleotide bias plus mobility gene dataset

More novel genes in islands in all experiments

| Island Dataset | Classification Method | Paired t-test p-value |
|---|---|---|
| DINUC (all genes) | COG | 1.27E-18 |
| DINUC+MOB (all genes) | COG | 1.20E-18 |
| DINUC (all genes) | SUPERFAMILY | 1.13E-18 |
| DINUC+MOB (all genes) | SUPERFAMILY | 4.43E-14 |
| DINUC (>300bps) | COG | 1.05E-17 |
| DINUC+MOB (>300bps) | COG | 7.65E-16 |
| DINUC (>300bps) | SUPERFAMILY | 3.01E-16 |
| DINUC+MOB (>300bps) | SUPERFAMILY | 2.04E-10 |

Hsiao et al. PLOS Genetics e62, Nov. 2005

Phage may be the predominant donors of GIs

- Some GIs are clearly of bacteriophage origin, but more may be from phage as well
- Predicted subcellular localizations of proteins encoded in our GIs similar to phage genomes (lower proportion of cytoplasmic membrane proteins) (Hsiao et al. PLOS Genetics e62, Nov. 2005)
- Many GI encoded genes have sequence characteristics similar to phage genes (A+T rich and short) (Daubin et al. Genome Biol. 4(9): R57)

Proportions of virulence factors in Islands vs. Outside of Islands in 26 pathogens

[Figure: bar chart of % of VFs (y-axis, 0 to 7) by Island Types (x-axis: DINUC, DINUC + Mob Gene), Island vs. Outside. P value: < 2.2E-16]

Higher proportions of genes in Islands are VFs

Fedynak, Hsiao, and Brinkman (unpublished) http://zdsys.chgb.org.cn/VFs/

Certain classes of VFs overrepresented in GIs

Virulence Factor Database (VFDB) classification of VFs in GIs and non-GIs

| VFDB Classification | GIs: VFs (#) | GIs: Proportion of genes (%) | non-GIs: VFs (#) | non-GIs: Proportion of genes (%) | p-value |
|---|---|---|---|---|---|
| Unclassified | 185 | 1.89 | 158 | 0.23 | < 2.20E-16 |
| Secretion system | 95 | 0.97 | 138 | 0.20 | < 2.20E-16 |
| Adherence | 59 | 0.60 | 138 | 0.20 | 5.69E-13 |
| Iron uptake | 33 | 0.34 | 59 | 0.09 | 5.83E-11 |
| Type III translocated protein | 6 | 0.06 | 1 | 0.00 | 1.54E-07 |
| Antiphagocytosis | 23 | 0.23 | 66 | 0.10 | 3.34E-04 |
| Protease | 5 | 0.05 | 5 | 0.01 | 2.08E-03 |
| Toxin | 18 | 0.18 | 53 | 0.08 | 2.34E-03 |

Most of these are “offensive” virulence factors

Fedynak, Hsiao, and Brinkman (unpublished)

Conclusions

- Genomic islands contain disproportionately higher number of novel genes, suggesting a large and understudied gene pool contributing to horizontal gene transfer
- These novel genes appear to be drawn from a large pool of phage - metagenomics studies useful
- These novel genes may contribute to microbial adaptation and may play a role in pathogenesis and in antibiotic resistance

Acknowledgements

- Fiona Brinkman
- Amber Fedynak - VF studies
- Brian Coombes, Michael Lowden, and Brett Finlay (UBC) - Microarray data
- Jenny Bryan (UBC) - Stats analysis

Brinkman Laboratory http://www.pathogenomics.sfu.ca/islandpath

Other categories more common in islands

| Category | In putative islands: Paired t-test p-value | In putative islands + mobility genes: Paired t-test p-value |
|---|---|---|
| Cell motility | 7.73E-5 | 0.002087 (may be a sampling size issue) |
| Intracellular trafficking, secretion, and vesicular transport | 8.124E-3 | 0.406955 (may be a sampling size issue) |

Several metabolism-associated categories are under-represented in islands

* Novel genes not included in analysis due to potential skew of other category results

[Figure: Proportions of Genes with no SUPERFAMILY Assignment in Islands vs. Outside. Bar chart per genome (same genomes as the COG figure), island vs. outside island, y-axis 0.00% to 80.00%. P value 3.0E-16]

IslandPath V.2

Experiment: S. typhimurium LT2 ssrB gene KO
Track 1: IslandPath
Track 2: Microarray expression (overexp & underexp)
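
The p-values quoted throughout these slides come from paired t-tests that compare, genome by genome, a proportion inside islands with the same proportion outside islands. The statistic itself can be sketched with the standard library alone. This is an illustration of the textbook paired t statistic, not the authors' analysis script; the function name and the toy numbers in the usage note are made up for the example and are not data from the study.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t_statistic(inside: Sequence[float],
                       outside: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees_of_freedom) for a paired t-test.

    `inside[i]` and `outside[i]` must describe the same genome, e.g. the
    proportion of genes with no COG assignment inside vs. outside islands.
    """
    if len(inside) != len(outside):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(inside, outside)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1
```

For example, `paired_t_statistic([0.5, 0.6, 0.7, 0.6], [0.3, 0.3, 0.4, 0.2])` on these invented proportions yields a positive t with 3 degrees of freedom, meaning the "inside" values are consistently higher. Converting t to a p-value needs the t distribution, which the standard library does not provide; `scipy.stats.ttest_rel` is one commonly used option.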