Download Pathways - Bioinformatics.ca
Protein Pathways and Pathway Databases
Shan Sundararaj
University of Alberta
Edmonton, AB
Lecture 4.3

Interactions Networks Pathways
• A collection of interactions defines a network
• Pathways are a subset of networks
  – All pathways are networks of interactions, however not all networks are pathways!
  – Difference in the level of annotation/understanding
• We can define a pathway as a biological network that relates to a known physiological process or phenotype

Pathways
• However, there is no precise biological definition of a pathway
• Our partitioning of networks into pathways is somewhat arbitrary
  – We choose the start/finish points based on “important” or easily understood compounds
  – Gives us the ability to conceptualize the mapping of genotype to phenotype

Biological pathways
• There are 3 types of interactions that can be mapped to pathways:
1) enzyme – ligand
   • metabolic pathways
2) protein – protein
   • cell signaling pathways
   • complexes for cell processes
3) gene regulatory elements – gene products
   • genetic networks

Pathways are inter-linked
[Diagram labels: Signalling pathway, Genetic network, STIMULUS, Metabolic pathway]

Metabolic Pathways
[Figure: 1993 Boehringer Mannheim GmbH - Biochemica]

What the pathway represents
• Metabolites involved
• Enzymes/transport proteins
• Order of reactions
• General biological function
• Reaction rates
• Expression data
• Inhibitors, activators, alternate pathways
• Genetic regulatory information

Describing metabolic networks
• Classical biochemical pathways
  – glycolysis, TCA cycle, etc.
• Stoichiometric modeling
  – flux balance analysis, extreme pathways
• Kinetic modeling (CyberCell, E-cell, …)
  – Need to accumulate comprehensive kinetic information

Complexity
• Pathways involve multiple enzymes, which may have multiple subunits, alternate forms, alternate specificities
• Enzymes may be involved in multiple pathways
• Malate dehydrogenase appears in 6 different metabolic pathways in some databases

Metabolic Pathway Reconstruction
• Given a genomic sequence, we can infer what metabolic pathways are available to an organism
• Used to design culture medium for Tropheryma whipplei by seeing what nutrients were essential for growth (Renesto et al., Lancet, 362, 447-449, 2003)

Co-expression within pathways
• Tempting thought: genes that occur within the same pathway will show similar expression profiles
• Reality: depends greatly on how you identify your pathways; KEGG pathways show at best 50% coexpression in a survey of available yeast expression data (Ihmels et al., Nat Biotechnol. 22, 86-92, 2004).
• Expression levels do not correlate very well with protein interactions (unless they are “stable” complexes, maintained in many different conditions)

Pathway Databases
• KEGG
• BioCyc
• Reactome
• GenMAPP
• BioCarta
• TransPATH
• …175 more at Pathway Resource List http://www.cbio.mskcc.org/prl/index.php

BioPAX (www.biopax.org)
• Collaborative effort to create a data exchange format for biological pathway data

KEGG
• 5904 chemical reactions
• 15,037 pathways
• 229 reference pathways
• 85 ortholog tables
• 181 organisms
http://www.genome.ad.jp/kegg/

KEGG
• GENES Database
  – The universe of genes and proteins in complete genomes
• LIGAND Database
  – The universe of chemical reactions involving metabolites and other biochemical compounds
• Pathway Database
  – Molecular interaction networks, metabolic and regulatory pathways, and molecular complexes

Connection between KEGG and other Databases

Pathways
• Represented as diagrams, manually created, stored as GIFs
• Easy to link to, highlight genes of interest
• Generate orthologous pathways in other organisms
[Figure: pathway diagram with enzymes labeled by EC number: 2.7.2.4, 1.2.1.11, 1.1.1.3, 2.3.1.46, 2.5.1.48, 4.4.1.8, 2.1.1.13, 2.5.1.6]

http://www.biocyc.org/

BioCyc
• The primary database was EcoCyc (E. coli)
• 21 more curated pathway/genome databases (PGDB), each focusing on one organism (e.g.
HumanCyc)
  – Also 142 more non-curated (computationally generated) pathways
• MetaCyc database contains non-redundant reference pathways from more than 240 organisms
• Supports “Pathway Tools” software suite to analyze PGDBs, and “PathoLogic” pathway prediction program for new genomes

BioCyc
• Each PGDB includes info about:
  – Pathways, reactions, substrates
  – Enzymes, transporters
  – Genes, replicons
  – Transcription factors, promoters, operons, DNA binding sites
• MetaCyc and EcoCyc are literature-based, the others are computationally derived
[Diagram labels: Pathways; Reactions; Compounds; Proteins; Genes; Operons, Promoters, DNA Binding Sites; Chromosomes, Plasmids]

• 164 datasets
• Query by protein, gene, compound, reaction, pathway
• BLAST sequence if protein name unknown

MetaCyc Statistics

EcoCyc Statistics

BioCyc: Pathway Tools
(Adapted from Pathway Tools tutorial, http://bioinformatics.ai.sri.com/ptools/)
• Full Metabolic Map
  – Paint gene expression data on metabolic network; compare metabolic networks
• Pathways
  – Pathway prediction (PathoLogic)
• Reactions
  – Balance checker
• Compounds
  – Chemical substructure comparison
• Enzymes, Transcription Factors
• Genes: BLAST search
• Operons
  – Operon prediction

PathoLogic – Making PGDBs

Completeness of Pathways

Issues with predicting pathways
• Predicting metabolic pathways from genome:
  – Predict genes
  – Assign enzymatic function to genes
  – Look for enzymes unique to pathway
  – Check if pathway is “balanced” (no holes)
  – Try to fill holes by re-searching genome

Reactome
http://www.reactome.org/

Reactome
• Joint venture of CSHL and EBI (supersedes the Genome Knowledgebase project)
• Curated database of biological processes in humans
  – Also rat, mouse, fugu, zebrafish, chicken
• Everything referenced by curators to literature citation or inference based on
sequence similarity

Reactome model
• Model reactions: (input_entities) → (output_entities)
• Distinguishes between modified/unmodified proteins (modification is an explicit reaction)
• Highly annotated at every step, very micromanaged, hope to find interesting links between reactions

Reactome: PathFinder
• Pathfinding between distant processes
• Enter two molecules or events and see if they can be joined together by reactions

Reactome: SkyPainter
• Find all reactions that contain a molecule or event
  – Very flexible input, any one or more of:
    • protein/gene ID (UniProt, GenBank or others)
    • protein/gene sequence
    • GO or OMIM identifier
    • time series from a gene expression study

Reactome: SkyPainter
• Starry sky output
• If expression data used, you get different colours for different levels of expression
• If time series available, you can make an animation

GenMAPP (www.genmapp.org)
• Designed to rapidly analyze gene profiling data in the context of known biochemical pathways
• Pathways (MAPPs) are authored by experts; several are adapted from KEGG
• Pathways easily web-queryable
• Free for all users
• But… Windows platform only

GenMAPP
• Easy to draw/edit pathways
• Color genes from user-imported expression data

MAPPFinder – maps to GO ontology

BioCarta (www.biocarta.com)

BioCarta
• Not a public database, but offers free, clickable, graphics-rich pathway database and gene information
  – Community annotation
• Easy-to-use glyph system for genes
• 355 pathways
  – mostly human/mouse metabolic and signaling pathways

TransPATH
• Part of larger BioBase package (commercial)
• PathwayBuilder package for network visualization
• Highly integrated with signaling networks and transcription factor networks (TransFAC)
• Linked to extensive enzyme information in BRENDA (www.brenda.uni-koeln.de/)
• 28,456 molecules; 52,007 reactions; 54 hand-drawn pathways

Pathway Database Comparison
• Organisms
  – KEGG: 181 (varied)
  – BioCyc: E. coli, human (20 others)
  – GenMAPP: Human, mouse, rat, fly, yeast
  – Reactome: Human, rat, mouse, chicken, fugu, zebrafish
  – BioCarta: Human, mouse
  – TransPATH: Human, mouse
• Pathway types
  – KEGG: Metabolic, genetic, signaling, complexes
  – BioCyc: Metabolic, complexes
  – GenMAPP: Metabolic, signaling, complexes
  – Reactome: Metabolic, signaling, complexes
  – BioCarta: Metabolic, signaling, complexes
  – TransPATH: Signaling, genetic
• Tools/visualization
  – KEGG: linked to from many
  – BioCyc: Pathway Tools
  – GenMAPP: GenMAPP
  – Reactome: PathView applets
  – BioCarta: none
  – TransPATH: Pathway Builder
• Images
  – KEGG: Static box flow diagrams
  – BioCyc: Detailed flow diagrams
  – GenMAPP: Static box flow diagrams
  – Reactome: “starry sky”
  – BioCarta: “Graphics rich” cell diagrams
  – TransPATH: Graphics rich cell diagrams
• Download formats
  – KEGG: KGML XML
  – BioCyc: BioPAX, SBML
  – GenMAPP: MAPP format
  – Reactome: SBML, MySQL
  – BioCarta: Just images
  – TransPATH: Proprietary XML files

Conclusion
• Pathway databases are continually evolving, and are an important abstract mid-level of expressing data: between genes/proteins and observable phenotypes
• Metabolic pathways are the best studied/modeled
• Many different formats of storage and display, but moving towards standards (PSI-MI, BioPAX)
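
The pathway-prediction steps listed under "Issues with predicting pathways" (assign enzymatic function, look for enzymes unique to a pathway, check for holes) can be illustrated with a small sketch. This is not how PathoLogic or any of the databases above is implemented; the function and class names, the two toy pathways, and the set of annotated enzymes are all hypothetical, and the EC numbers are used only as opaque labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathwayReport:
    """Result of comparing one reference pathway against a genome annotation."""
    name: str
    present: frozenset   # required EC numbers found in the annotation
    holes: frozenset     # required EC numbers missing from the annotation
    has_unique_enzyme: bool  # at least one found enzyme occurs in no other pathway

    @property
    def completeness(self) -> float:
        total = len(self.present) + len(self.holes)
        return len(self.present) / total if total else 0.0


def score_pathways(reference, annotated_ecs):
    """Compare each reference pathway (name -> set of EC numbers) with the
    EC numbers assigned to the genes of a genome (hypothetical helper)."""
    # Count how many reference pathways use each enzyme, so that
    # pathway-specific ("unique") enzymes can be recognised.
    usage = {}
    for ecs in reference.values():
        for ec in ecs:
            usage[ec] = usage.get(ec, 0) + 1

    reports = []
    for name, ecs in reference.items():
        present = frozenset(ecs & annotated_ecs)
        holes = frozenset(ecs - annotated_ecs)
        unique = any(usage[ec] == 1 for ec in present)
        reports.append(PathwayReport(name, present, holes, unique))
    return reports


# Toy data (assumed for illustration only, not taken from any database).
REFERENCE = {
    "toy_pathway_A": {"2.7.2.4", "1.2.1.11", "1.1.1.3"},
    "toy_pathway_B": {"1.1.1.3", "2.3.1.46", "2.5.1.48", "4.4.1.8"},
}
ANNOTATED = {"2.7.2.4", "1.1.1.3", "2.3.1.46", "2.5.1.48"}

reports = {r.name: r for r in score_pathways(REFERENCE, ANNOTATED)}

for report in reports.values():
    print(report.name, round(report.completeness, 2), sorted(report.holes))
```

The holes reported for each pathway are the enzymes one would go back and search the genome for, which corresponds to the last step in the list. A real predictor would also have to weigh evidence, since an enzyme shared between pathways says little about which pathway is actually present.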