Introduction to the Pathway Tools Software and BioCyc Database Collection
SRI International Bioinformatics

MetaCyc Family of Pathway/Genome Databases
  2,500+ databases from multiple institutions
  Cover all domains of life, with microbial emphasis
  All DBs derived from MetaCyc via computational pathway prediction
    Common schema
    Common controlled vocabularies
    Common methodologies

Curated Databases Within the MetaCyc Family
  Database   Organism        Organization       Curated From (publications)
  MetaCyc    Multiorganism   SRI                34,000
  EcoCyc     E. coli         SRI                23,000
  HumanCyc   H. sapiens      SRI
  AraCyc     A. thaliana     Carnegie Instit.   2,282
  YeastCyc   S. cerevisiae   Stanford Univ      565
  MouseCyc   M. musculus     Jackson Labs

BioCyc Collection of 1,700 Pathway/Genome Databases
  Pathway/Genome Database (PGDB): combines information about
    Pathways, reactions, substrates
    Enzymes, transporters
    Genes, replicons
    Transcription factors/sites, promoters, operons
  Tier 1: Literature-derived PGDBs
    MetaCyc, HumanCyc, YeastCyc
    EcoCyc: Escherichia coli K-12
    AraCyc: Arabidopsis thaliana
  Tier 2: Computationally derived DBs, some curation (34 PGDBs)
    Bacillus subtilis, Mycobacterium tuberculosis
  Tier 3: Computationally derived DBs, no curation (the remainder)

Pathway/Genome Database
  [Diagram of object types in a PGDB: Pathways, Reactions, Compounds, Proteins, RNAs, Genes, Sequence Features, Operons, Promoters, DNA Binding Sites, Regulatory Interactions, Chromosomes, Plasmids, Cell]

Pathway Tools Software: PGDBs Created Outside SRI
  3,000+ licensees: 250+ groups applying software to 1,700 organisms
  Saccharomyces cerevisiae, SGD project, Stanford University
    135 pathways / 565 publications (BioCyc.org)
  FungiCyc, Broad Institute
  Candida albicans, CGD project, Stanford University
  dictyBase, Northwestern University
  Mouse, MGD, Jackson Laboratory (BioCyc.org)
  Drosophila, FlyBase, Harvard University (BioCyc.org)
  Under development: C. elegans, WormBase
  Arabidopsis thaliana, TAIR, Carnegie Institution of Washington
    288 pathways / 2,282 publications (BioCyc.org)
  ChlamyCyc, GoFORSYS
  PlantCyc, Carnegie Institution of Washington
  Six Solanaceae species, Cornell University
  GrameneDB, Cold Spring Harbor Laboratory
  Medicago truncatula, Samuel Roberts Noble Foundation

Pathway Tools Software: PGDBs Created Outside SRI
  G. Serres, MBL, Shewanella oneidensis
  M. Bibb, John Innes Centre, Streptomyces coelicolor
  TBDB Project, Mycobacterium tuberculosis
  F. Brinkman, Simon Fraser Univ, Pseudomonas aeruginosa
  Genoscope, Acinetobacter
  R.J.S. Baerends, University of Groningen, Lactococcus lactis IL1403, Lactococcus lactis MG1363, Streptococcus pneumoniae TIGR4, Bacillus subtilis 168, Bacillus cereus ATCC14579
  Matthew Berriman, Sanger Centre, Trypanosoma brucei, Leishmania major
  Sergio Encarnacion, UNAM, Sinorhizobium meliloti
  Mark van der Giezen, University of London, Entamoeba histolytica, Giardia intestinalis

Pathway Tools Software: PGDBs Created Outside SRI
  Large scale users:
    C. Medigue, Genoscope, 500+ PGDBs
    J. Zucker, Broad Inst, 94 PGDBs
    G. Sutton, J. Craig Venter Institute, 80+ PGDBs
    G. Burger, U Montreal, 60+ PGDBs
    E. Uberbacher, ORNL, 33 bioenergy-related organisms
    Bart Weimer, UC Davis, Lactococcus lactis, Brevibacterium linens, Lactobacillus acidophilus, Lactobacillus plantarum, Lactobacillus johnsonii, Listeria monocytogenes
  Partial listing of outside PGDBs at http://biocyc.org/otherpgdbs.shtml

Pathway Tools Software
  Comprehensive software environment spanning computational genomics and systems biology
  Create and maintain an organism database integrating genome, pathway, and regulatory information
    Computational inference tools
    Interactive editing tools
  Query and visualize that database
  Interpret genome-scale datasets
  Comparative analysis tools
  Generate flux-balance models

Pathway Tools Software
  [Diagram of components: Annotated Genome, PathoLogic, Pathway/Genome Database, Pathway/Genome Editors, Pathway/Genome Navigator, Genome-Scale Flux Model]
  Briefings in Bioinformatics 11:40-79 2010

Pathway Tools Software: PathoLogic
  Computational creation of new Pathway/Genome Databases
  Transforms genome into Pathway Tools schema and layers inferred information above the genome
  Predicts operons
  Predicts metabolic network
  Predicts which genes code for missing enzymes in metabolic pathways
  Infers transport reactions from transporter names
  Bioinformatics 18:S225 2002

Pathway Tools Software: Pathway/Genome Editors
  Interactively update PGDBs with graphical editors
  Support geographically distributed teams of curators with object database system
  Gene editor
  Protein editor
  Reaction editor
  Compound editor
  Pathway editor
  Operon editor
  Publication editor

What is Curation?
  Ongoing updating and refinement of a PGDB
  Correcting false-positive and false-negative predictions
  Incorporating information from experimental literature
  Authoring of comments and citations
  Updating database fields
    Gene positions, names, synonyms
    Protein functions, activators, inhibitors
  Addition of new pathways, modification of existing pathways
  Defining TF binding sites, promoters, regulation of transcription initiation and other processes

Pathway Tools Software: Pathway/Genome Navigator
  Querying and visualization of:
    Pathways
    Reactions
    Metabolites
    Proteins
    Genes
    Chromosomes
  Two modes of operation:
    Web mode
    Desktop mode
    Most functionality shared, but each has unique functionality

Pathway Tools Ontology / Schema
  Ontology classes: 1621
    Datatype classes: Define objects from genomes to pathways
    Classification systems for pathways, chemical compounds, enzymatic reactions (EC system)
    Protein Feature ontology
    Controlled vocabularies:
      Cell Component Ontology
      Evidence codes
  Comprehensive set of 248 attributes and relationships

What is a Pathway?
  A connected sequence of biochemical reactions
  Occurs in one organism
  Conserved through evolution
  Regulated as a unit
  Starts or stops at one of 13 common intermediate metabolites

Comparison of BioCyc to KEGG
  KEGG approach: Static collection of reference pathway diagrams that are color-coded to produce organism-specific views
  KEGG vs MetaCyc: Resource on literature-derived pathways
    KEGG maps are not pathways (Nuc Acids Res 34:3687 2006)
      KEGG maps contain multiple biological pathways
      KEGG maps are composites of pathways in many organisms; they do not identify which specific pathways were elucidated in which organisms
    KEGG has no literature citations, no comments, less enzyme detail
  KEGG vs BioCyc organism-specific PGDBs
    KEGG does not curate or customize pathway networks for each organism
    Highly curated PGDBs now exist for important organisms such as E. coli, yeast, mouse, Arabidopsis
    KEGG re-annotates entire genome for each organism

Comparison of Pathway Tools to KEGG
  Inference tools
    KEGG does not predict presence or absence of pathways
    KEGG lacks pathway hole filler, operon predictor
  Curation tools
    KEGG does not distribute curation tools
    No ability to customize pathways to the organism
    Pathway Tools schema much more comprehensive
  Visualization and analysis
    KEGG does not perform automatic pathway layout
    No comparative pathway analysis

Pathway Tools Implementation Details
  Allegro Common Lisp
  PC/Windows, Linux, Macintosh platforms
  Ocelot object database
  600,000+ lines of code
  Lisp-based WWW server at BioCyc.org
    Manages 1,100+ PGDBs

EcoCyc iPhone App
  Available in iTunes store
  Free
  Look up gene information while traveling, at a conference, in the library

Automated Generation of Metabolic Flux Models from PGDBs
  Joint work with Mario Latendresse

Flux-Balance Analysis
  Steady-state, constraint-based quantitative models of metabolism
  Starting information for organism of interest:
    [Diagram labels: Nutrients, Secretions, Metabolic Reaction List, Biomass; example metabolites A, B, C, D, X]
  Flux Balance Models
    Submit to linear optimization package
    Optimize biomass production, ATP production, etc.
  Results
    Steady-state reaction fluxes for the metabolic network
    Remove reactions from the model to predict knock-out phenotypes
    Supply alternative nutrient sets to predict growth phenotypes

Approach: Derive FBA Models from PGDBs
  Store and update metabolic model within Pathway Tools
    The PGDB is the model
    All query and visualization tools applicable to FBA model
    FBA model is tightly coupled to genome and regulatory information
  Export to constraint solver for model execution/solving
  Reaction balance checking
  Dead-end metabolite analysis
  Visualize reaction flux using cellular overview
  Multiple gap filling

Multiple Gap Filling of FBA Models
  Reaction gap filling (Kumar et al, BMC Bioinf 2007 8:212):
    Reverse directionality of selected reactions
    Add a minimal number of reactions from MetaCyc to the model to enable a solution
    Reaction cost is a function of reaction taxonomic range
  Metabolite gap filling:
    Postulate additional nutrients and secretions
  Partial solutions:
    Identify maximal subset of biomass components for which model can yield positive production rates

Downloading Pathway Tools
  Obtain license: http://biocyc.org/download.shtml
  Download directory offers several configurations
    Choose platform and database configuration
    Many combinations of databases available
    Installing all databases requires a lot of memory
  Use registry to add PGDBs to the configuration you downloaded

Information Sources
  Pathway Tools User’s Guide
    aic-export/pathway-tools/ptools/14.0/doc/manuals/userguide.pdf
    NOTE: Location of the aic-export directory can vary across different computers
  Pathway Tools Web Site
    http://bioinformatics.ai.sri.com/ptools/
    Publications, FAQ, programming examples, etc.
  Slides from this tutorial
    http://bioinformatics.ai.sri.com/ptools/tutorial/sessions/
  BioCyc Webinars
    http://biocyc.org/webinar.shtml
  Desktop vs Web functionality in Pathway Tools
    http://biocyc.org/desktop-vs-web-mode.shtml

Information Sources
  Publications
    “Pathway Tools version 13.0: Integrated Software for Pathway/Genome Informatics and Systems Biology”, Briefings in Bioinformatics 11:40-79 2010
    “A survey of metabolic databases emphasizing the MetaCyc family”, Archives of Toxicology 2011

Information Sources
  BioCyc Web site: Help Menu
    Basic Help
    Search Help
    BioCyc Glossary
    Publications
    Website User Guide
    PGDB Concepts
    Guide to EcoCyc
    Guide to MetaCyc
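
Worked sketch: model checks before FBA

The slides on deriving FBA models from PGDBs list reaction balance checking and dead-end metabolite analysis among the supporting tools. The Python sketch below is only a rough illustration of what such checks compute on a toy network. It is not Pathway Tools code and does not use any Pathway Tools API; the dictionary layout, the reaction identifiers, and the function names dead_end_metabolites and is_balanced are assumptions made for this example. A dead-end metabolite is taken here to be one that is only produced or only consumed when every reaction is treated as irreversible.

```python
from collections import Counter

# Hypothetical toy model (illustrative names only): reaction id -> (substrates, products),
# where each side maps a metabolite name to its stoichiometric coefficient.
TOY_MODEL = {
    "uptake_A": ({}, {"A": 1}),            # nutrient exchange: A enters the cell
    "r1": ({"A": 1}, {"B": 1}),
    "r2": ({"B": 1}, {"C": 1, "D": 1}),    # D is produced here and used nowhere else
    "r3": ({"C": 1}, {"X": 1}),
    "biomass": ({"X": 1}, {}),             # biomass sink: X leaves the model
}

# Elemental formulas used by the balance check.
FORMULAS = {
    "H2": {"H": 2},
    "O2": {"O": 2},
    "H2O": {"H": 2, "O": 1},
}


def dead_end_metabolites(reactions):
    """Return metabolites that are only produced or only consumed.

    All reactions are treated as irreversible, read left to right.
    """
    consumed, produced = set(), set()
    for substrates, products in reactions.values():
        consumed.update(substrates)
        produced.update(products)
    # Symmetric difference: metabolites appearing on exactly one side overall.
    return sorted(consumed ^ produced)


def element_totals(side, formulas):
    """Sum element counts over one side of a reaction."""
    totals = Counter()
    for metabolite, coeff in side.items():
        for element, count in formulas[metabolite].items():
            totals[element] += coeff * count
    return totals


def is_balanced(substrates, products, formulas):
    """True if both sides of the reaction contain the same atoms."""
    return element_totals(substrates, formulas) == element_totals(products, formulas)


if __name__ == "__main__":
    print(dead_end_metabolites(TOY_MODEL))                          # ['D']
    print(is_balanced({"H2": 2, "O2": 1}, {"H2O": 2}, FORMULAS))    # True
    print(is_balanced({"H2": 1, "O2": 1}, {"H2O": 1}, FORMULAS))    # False
```

In this toy network, D would block a steady-state solution unless a secretion or a consuming reaction were added, which is the kind of gap the metabolite and reaction gap-filling steps in the slides are meant to address. Exchange and biomass pseudo-reactions are unbalanced by construction, so a real balance check would presumably skip them.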