Editing Pathway/Genome Databases
Ron Caspi
Compounds, Reactions and Pathways
SRI International Bioinformatics

Activate Editing Mode
• Type (enable/disable-editors t) at the listener pane

Why Curation is Important!
• Curators need jobs
• “In silico” information is less solid than experimental evidence
• Database curation greatly enhances the usefulness of the data

Pathway Tools Paradigms
• Separate the database from the user interface
  • The Navigator provides one interface to the DB
  • The Editors provide an alternative interface to the DB
• Reuse information whenever possible!
• A PGDB should not describe the same biological or chemical entity more than once
• A tool helps to prevent creation of duplicate reactions

Editing Rules: Support Policy
• Do not modify the EcoCyc or MetaCyc datasets
• Do not alter the DB schema, e.g. do not add or remove classes or slots

List of Editors
• Compound Editor
• Compound Structure Editors
• Reaction Editor
• Synonym Editor
• Publication Editor
• Pathway Editor and Pathway Info Editor
• Protein/Subunit Structure/Enzymatic Reaction Editors
• Gene Editor
• Intron Editor (eukaryotes only)
• Transcription Unit Editor
• Frame Editor
• Relationships Editor
• Ontology Editor

Saving Changes
• The user must save changes explicitly
  • File => Save Current DB
  • Save DB button
• List Unsaved Changes in Current DB
• Revert Current DB
• Checkpoint Current DB Updates to File
• Restore Updates from Checkpoint File

Other DB Commands under the File Menu
• Summarize databases
• Summarize current organism
• Refresh DB list
• Refresh All Current DBs
• Delete a DB
• Attempt to Reconnect to Database Server

Invoking the Editors
1. New Object: Use the “New” command
2. Existing Object: Right-click on the object handle

Compound Editor
• Create or edit a compound
• Specify Class
• Common Name and Synonyms
• Comments, citations
• Links to other DBs

The Synonym Editor
• Lets you easily edit the synonyms and set the common name

Citations
• Citation boxes
• The CITS field
• File => Import Citations from PubMed
• Publication Editor (invoke by right-clicking on a citation at the bottom)
• Non-PubMed citation: enter it in a citation box in the form Smith06, then invoke the editor by clicking out of the citation box

More Compound Editing
• Compound Structure Editors (Marvin, JME)
• Mol files
• Exporting to other DBs
• Merging
• Duplicate Frame and Edit

Reaction Editor
• Enter or edit a reaction equation
• EC number (official?)
• Check for balance
• Compound Resolver

Pathway Info Editor
• Class (variant class)
• Common Name
• Synonyms
• Evidence code
• Citations (CIT)
• Comments
• External Links
• Hypothetical reactions
• Author credits

Pathway Editor
• Graphically create and modify pathways
• Two tools:
  • Connections Editor: add reactions one by one
  • Segment Editor: enter a linear pathway segment

Connections Editor Operations
• Two main display panes:
  • left: unconnected pathway reactions
  • right: draws connected reactions (looks like the regular Pathway display window)
• Connecting reactions:
  • select an initial reaction (in either pane) ===> red and green reactions
  • select a green reaction
• Useful commands:
  • choose main compounds for a reaction
  • disconnect all reactions
  • in circular pathways, specify which compound should be at the top
  • add links to other pathways, reactions, or comments

Pathway Segment Editor
• Used to enter a linear sequence of reactions simultaneously
• Reactions are specified by EC numbers or reaction substrates
• One segment may contain up to 7 reactions
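
The "Check for balance" step in the Reaction Editor can be pictured with a small standalone sketch. This is not Pathway Tools code and does not use its API: the function names and the FORMULAS table are hypothetical, and the neutral-form molecular formulas are typed in by hand for illustration, whereas in a PGDB they would come from the compound frames. The idea is simply to count atoms of each element on both sides of the equation and compare.

```python
import re
from collections import Counter

# Hand-entered, neutral-form formulas used only for this illustration (assumption);
# a real PGDB stores structures on the compound frames.
FORMULAS = {
    "ascorbate": "C6H8O6",
    "H2O": "H2O",
    "3-keto-L-gulonate": "C6H10O7",
}


def atom_counts(formula):
    """Count atoms in a simple formula such as 'C6H10O7' (no parentheses or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_counts(side):
    """Sum atom counts over one side of an equation, e.g. 'ascorbate + H2O'."""
    total = Counter()
    for name in side.split(" + "):
        total += atom_counts(FORMULAS[name.strip()])
    return total


def is_balanced(equation):
    """Return True when both sides of 'A + B = C' contain the same atoms."""
    left, right = equation.split(" = ")
    return side_counts(left) == side_counts(right)


print(is_balanced("ascorbate + H2O = 3-keto-L-gulonate"))
```

The sketch ignores charge and stoichiometric coefficients, so it is only a rough picture of what a balance check compares.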
Pathway Editor Limitations
• Complex situations can cause ambiguity:
  • a link may be ignored
  • a dialog box appears for disambiguating
  • the pathway is drawn in a bizarre arrangement
• Fix: try removing the offending link and adding links in a different order
• The Pathway Editor does not handle polymerization pathways easily

Evidence Codes for Pathways
http://brg.ai.sri.com/ptools/evidence-ontology.html
• EV-COMP: Inferred from computation
  • HINF - human inference
  • AINF - artificial inference
• EV-AS: Author statement
  • TAS - traceable
  • NAS - non-traceable
• EV-IC: Inferred by curator
• EV-EXP: Inferred from experiment
  • IDA - inferred from direct assay
  • IPI - inferred from physical interaction
  • TAS - inferred from traceable data (review)
  • IEP - inferred from expression pattern
  • IGI - inferred from genetic interaction
  • IMP - inferred from mutant phenotype

Enzyme/Protein Editors
• To add an enzyme to a reaction: right-click the reaction, choose Edit => Create/Add enzyme
• “Choose Protein”: specify ID, or “Search by genes or create new protein” => Protein Subunit Structure Editor, Protein Editor
• Check the Curator Guide at http://bioinformatics.ai.sri.com/ptools/curatorsguide.pdf

Protein Editor

Enzymatic Reaction Editor

Protein Subunit Editor

Super Pathways
• Need to keep pathways within well-defined end points
• Link pathways to upstream or downstream pathways with pathway links
• Create more complex metabolic networks using superpathways
• Example: superpathway of aromatic compound degradation (aerobic) is composed of:
  • catechol degradation II
  • mandelate degradation I
  • benzoate degradation (aerobic)
  • β-ketoadipate degradation
  • protocatechuate degradation II
  • shikimate degradation
  • quinate degradation
  • 4-hydroxymandelate degradation
  • tryptophan degradation I

Pathway Export
• Export:
  • Edit => Add Pathway to File Export List
  • File => Export => Selected Pathways to File
• Import:
  • File => Import => Pathways from File

Creating Links to External Databases
• Creating links from a PGDB to external databases. To define a new external database:
  • Tools => Ontology Browser
  • View => Browse from new root / type Databases
  • Highlight Databases
  • Frame => Create => Instance
  • Enter frame name, frame edit
  • Enter Common Name and Static-Search-URL, e.g. http://gene.pharma.com/dbquery?
  • Enter a value for Search-Object-Class (e.g. Proteins)
• Creating links to a PGDB: see http://biocyc.org/linking.shtml

Constraint Checking
• Constraints are general rules that restrict the valid relationships among instances
• Constraints are checked when new facts are asserted, to ensure that the DB remains logically consistent
• Constraints on slots:
  • Domain violation: checks that slots are in instances of the appropriate class
  • Range violation: value type, value cardinality
  • Inverse
  • Cardinality
  • Lisp-predicate

Consistency Checking (correctify-kb)
• Removes newlines from names
• Converts “<” to “|” in string citations
• Checks isozyme sequence similarity
• Fixes references between polypeptides and genes
• Changes compound names to IDs in a variety of slots
• Matches physiological regulators to other regulators
• Cross-references compounds to reactions
• Checks pathway predecessors/reactions/subs
• Checks reaction balancing
• Checks compound structures
• Calculates sub- and super-pathways
• Finds missing sub-pathway links
• Verifies chromosome components and positions

Update Your Computers!
• To install a patch: Tools => Instant Patch => Download and Activate All Patches

Make Sure That…
• You perform all exercises on the Hb. pylori database, not on your own!!!

Creating New Reactions
Don’t forget to include spaces between chemical names and terms such as “+” and “=”:
1. ascorbate + H2O = 3-keto-L-gulonate
2. 3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP
3. 3-keto-L-gulonate-6-phosphate = L-xylulose-5-phosphate + CO2
4. L-xylulose-5-phosphate = L-ribulose-5-phosphate
5. L-ribulose-5-phosphate = xylulose-5-phosphate
6. xylulose-5-phosphate = D-ribulose-5-phosphate

Fill in the Reaction Frame IDs in Your Handout
Reaction:
• ascorbate + H2O = 3-keto-L-gulonate
• 3-keto-L-gulonate + ATP = 3-keto-L-gulonate 6-phosphate + ADP
• 3-keto-L-gulonate 6-phosphate = L-xylulose-5-phosphate + CO2
• L-xylulose-5-phosphate = L-ribulose-5-phosphate
• L-ribulose-5-phosphate = xylulose-5-phosphate
• xylulose-5-phosphate = D-ribulose-5-phosphate
Frame ID: XXX

Duplicate Reaction?
• Frame ID of the new reaction to be created. This frame will NOT be created unless you choose “Keep”.
• Frame ID of the existing reaction. This reaction will NOT be transferred into your database until you click “Import”!
  Record this BEFORE you click “Import”.

Define a New Pathway
• Define the pathway L-ascorbate degradation to D-ribulose-5-phosphate by connecting the reactions together
• Assign class: (Pathways -> Degradation/Utilization/Assimilation -> Carboxylates, Other)
• Add the reactions, connect them, and add a link to the pathway non-oxidative branch of the pentose phosphate pathway (Generation of precursor metabolites and energy => Pentose phosphate pathways =>)
• Add a reverse link from non-oxidative branch of the pentose phosphate pathway to the new pathway

Run (correctify-kb)
• Open the database Hb. pylori (HypCyc): (so 'hyp)
• Run: (correctify-kb)
• Analyze output
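
Before connecting the exercise reactions in the Pathway Editor, it can help to confirm on paper that they really form a linear chain. The following standalone sketch is not part of Pathway Tools: the function names are hypothetical, and it works on the equation strings only. It splits each equation on " = " and " + " (which is why the spaces matter) and reports any step whose products do not include a substrate of the next step.

```python
# The six exercise reactions, written with spaces around "+" and "=".
REACTIONS = [
    "ascorbate + H2O = 3-keto-L-gulonate",
    "3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP",
    "3-keto-L-gulonate-6-phosphate = L-xylulose-5-phosphate + CO2",
    "L-xylulose-5-phosphate = L-ribulose-5-phosphate",
    "L-ribulose-5-phosphate = xylulose-5-phosphate",
    "xylulose-5-phosphate = D-ribulose-5-phosphate",
]


def parse_equation(equation):
    """Split 'A + B = C + D' into (['A', 'B'], ['C', 'D'])."""
    left, right = equation.split(" = ")
    return left.split(" + "), right.split(" + ")


def unlinked_steps(equations):
    """Return 1-based numbers of steps that share no product with the next step's substrates."""
    parsed = [parse_equation(e) for e in equations]
    gaps = []
    for i in range(len(parsed) - 1):
        products = set(parsed[i][1])
        next_substrates = set(parsed[i + 1][0])
        if not products & next_substrates:
            gaps.append(i + 1)
    return gaps


print(unlinked_steps(REACTIONS))
```

An empty list means every step feeds the next one. The comparison is by exact name, so writing the same compound two ways (for example "3-keto-L-gulonate 6-phosphate" in one reaction and "3-keto-L-gulonate-6-phosphate" in the next) would be reported as a gap; inside a PGDB the analogous risk is creating two frames for one compound.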