Computational Biology
Networks and Pathways
Lecture Slides Week 11

Data is Interconnected

What is a Graph

Complexity

• A network is a collection of interactions
• Pathways are a subset of networks
• All pathways are networks of interactions; however, not all networks are pathways
• A pathway is a biological network that corresponds to a specific physiological process or phenotype
Young et al.: Transcriptional Regulatory Networks in Saccharomyces cerevisiae; Science 2002

Biological pathways
• Biological components interacting with each other over time to bring about a single biological effect
• Pathways can be broken down into sub-pathways
• Some common pathways: signal transduction, metabolic pathways, gene regulatory pathways
• Entities in one pathway can be found in others

3 types of interactions that can be mapped into pathways
• protein (enzyme) – metabolite (ligand): metabolic pathways
• protein – protein: cell signaling pathways, protein complexes
• protein – gene: genetic networks

Available resources
• KEGG http://www.genome.jp/kegg/
• BioCyc http://www.biocyc.org/
• Reactome http://www.reactome.org/
• GenMAPP http://www.genmapp.org/
• BioCarta http://www.biocarta.com/
• TransPATH http://www.biobaseinternational.com/pages/index.php?id=transpathdatabases
• Pathguide – the pathway resource list http://www.pathguide.org/

Network Topology (PPI)

Network analysis and visualization tools
• Databases for analysis
• Text mining algorithms (e.g., natural language processing (NLP)) technologies
• Expert human curation
• Ingenuity Pathway Analysis http://www.ingenuity.com/products/pathways_analysis.html
• PathwayStudio http://www.ariadnegenomics.com/products/pathway-studio/
• PathwayArchitect http://www.selectscience.net
• Cytoscape http://www.cytoscape.org/
• Biological Networks http://biologicalnetworks.net/
• GeneGO http://www.genego.com/
Nanduri et al. (unpublished)

GO term enrichment
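The statements that a network is a collection of interactions, and that the three interaction types map onto different kinds of pathways, can be sketched in a few lines of Python. This is only an illustration: the interaction list, the function names `build_network` and `select_by_type`, and the placeholder names `TF_X` and `gene_Y` are assumptions made for the example and are not taken from any of the databases listed above.

```python
from collections import defaultdict

# Illustrative interactions only: (source, target, interaction type).
# TF_X and gene_Y are hypothetical placeholders.
INTERACTIONS = [
    ("hexokinase", "glucose", "protein-metabolite"),
    ("RAS", "RAF", "protein-protein"),
    ("RAF", "MEK", "protein-protein"),
    ("TF_X", "gene_Y", "protein-gene"),
]

# Mapping from interaction type to the kind of pathway named on the slide.
PATHWAY_KIND = {
    "protein-metabolite": "metabolic pathways",
    "protein-protein": "cell signaling pathways, protein complexes",
    "protein-gene": "genetic networks",
}


def build_network(interactions):
    """Return an undirected adjacency map: node -> set of neighbours."""
    network = defaultdict(set)
    for source, target, _kind in interactions:
        network[source].add(target)
        network[target].add(source)
    return dict(network)


def select_by_type(interactions, kind):
    """Return the subset of interactions of one type (a candidate pathway)."""
    return [edge for edge in interactions if edge[2] == kind]


if __name__ == "__main__":
    network = build_network(INTERACTIONS)
    signaling = select_by_type(INTERACTIONS, "protein-protein")
    print("All nodes in the network:", sorted(network))
    print(PATHWAY_KIND["protein-protein"], "->", signaling)
```

The whole interaction list plays the role of the network, while each filtered subset is a smaller network of its own, which mirrors the point that every pathway is a network but not every network is a pathway.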
Nanduri et al. (unpublished)

End Theory I
• 5 min mindmapping
• 10 min break

Practice I: Cytoscape
• Download and install Cytoscape
• Add the Reactome app
• Initialize the Reactome app
• Inspect some metabolic pathways
End Practice I
• 15 min break

Theory II

Pathways vs. networks
Gene networks
• Clusters of genes (or gene products) with evidence of co-expression
• Connections usually represent degrees of co-expression
• In-depth knowledge of the process is not necessary
• Networks are non-predictive
Biochemical pathways
• Series of chained chemical reactions
• Connections represent describable (and quantifiable) relations between molecules, proteins, lipids, etc.
• Enzymatic process is elucidated
• Changes via perturbation are predictable downstream

Pathways vs. networks (gene networks; biochemical pathways)
• Curation: gene networks relatively easy (automated and manual); biochemical pathways difficult (mostly manual)
• Nodes: gene networks genes or gene products; biochemical pathways any general molecule
• Edges: gene networks levels of co-expression/influence or a qualitative relation; biochemical pathways representation of possibly quantifiable mechanisms between compounds
• Fidelity: gene networks low – usually very little detail; biochemical pathways high – specific processes
• Predictive power: gene networks relatively low; biochemical pathways relatively high

Pathway and network granularity
(Figure: effort to curate versus level of detail)

Outline
• Introduction to pathways and networks
• Examples of pathways and networks
• Review of pathway databases and tools
• Representing pathways and networks
• Methods of inferring pathways and networks
• Pathway and cellular simulations

Yeast gene interaction network
Tong, et al., Science 303, 808 (2004)

Characteristics of the yeast gene network
• Some genes (e.g. regulatory factors) act as ‘hubs’ in a network and have many interactions
• The degree of connectivity follows a power law
• Hubs may make interesting anti-cancer targets
• Clusters of genes with known function suggest function for hypothetical genes in the same cluster
• Network characteristics can be used to predict protein-protein interactions
• The path between two genes tends to be short (average ~3.3 hops)
Tong, et al., Science 303, 808 (2004)

E. coli metabolic pathway: glycolysis
Karp, et al., Science 293, 2040 (2001)

Pathways: E. coli metabolic map
• Encompasses >791 chemical compounds in >744 noted biochemical reactions
• Pathway was compiled via literature information extraction and extensive manual curation
• System allows users to indicate evidence for pathway annotations
• Curation is done collaboratively with numerous experts outside of EcoCyc
Karp, et al., Science 293, 2040 (2001)

Pathways in bioinformatics
• Most resources for pathways focus on metabolic pathways (signaling and regulatory pathways are gaining prominence)
• Pathways are a very specific subtype of networks
• Like networks, they can be put in computable (symbolic) form
• Specificities in chemical reactions make them more predictive
• Pathways can chain together, forming larger pathways
Karp, et al., Science 293, 2040 (2001)

Pathway repositories
• BioCyc/MetaCyc
• Kyoto Encyclopedia of Genes and Genomes (KEGG) PATHWAY DB
• BioCarta
• BioModels database

BioCyc database
http://www.biocyc.org
• Pathway/genome database (PGDB) for organisms with completely sequenced genomes
• 409 full genomes and pathways deposited
• Species-specific pathways are inferred from MetaCyc
• Query/navigation/pathway creation support through the Pathway Tools software suite

MetaCyc database
http://www.metacyc.org
• Non-redundant reference database for metabolic pathways, reactions, enzymes and compounds
• Curation through experimental verification and manual literature review
• >1200 pathways from 1600+ species (mostly plants and microorganisms)
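The network characteristics listed for the yeast gene network (hubs with many interactions, short paths between genes) can be computed on any interaction graph. The sketch below uses a tiny made-up graph with nodes `A` to `G`, not the Tong et al. data, and the function names are assumptions chosen for the example. It counts node degrees to find the hub and uses breadth-first search to get the average shortest path length in hops.

```python
from collections import deque

# Made-up undirected edges; "A" is deliberately given the most interactions.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("E", "F"), ("F", "G")]


def adjacency(edges):
    """Build node -> set of neighbours for an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degrees(adj):
    """Number of interactions per node; hubs are the high-degree nodes."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def bfs_distances(adj, start):
    """Shortest hop counts from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def average_path_length(adj):
    """Mean shortest path length over all connected pairs of distinct nodes."""
    total, pairs = 0, 0
    for u in adj:
        for v, d in bfs_distances(adj, u).items():
            if u < v:  # count each unordered pair once
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    adj = adjacency(EDGES)
    deg = degrees(adj)
    hub = max(deg, key=deg.get)
    print("Hub:", hub, "with degree", deg[hub])
    print("Average path length (hops):", round(average_path_length(adj), 2))
```

On a real interaction network the same degree counts could be tabulated into a degree distribution to check whether it looks like a power law; that step is left out here because a seven-node toy graph cannot show it.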
Glycolysis pathway in MetaCyc
http://www.metacyc.org

KEGG PATHWAY database
http://www.kegg.com
• Consolidated set of databases that cover genomics (GENE), chemical compounds (LIGAND) and reaction networks (PATHWAY)
• Broad focus on metabolism, signal transduction, disease, etc.
• Species-specific views available (but networks are static across all organisms)

Glycolysis pathway in KEGG
http://www.kegg.com

Global Pathway Map

BioCarta database
http://www.biocarta.com
• Corporate-owned, publicly-curated pathway database
• Series of interactive, “cartoon” pathway maps
• Predominantly human and mouse pathways
• Contains 120,000 gene entries and 355 pathways

Glycolysis pathway in BioCarta
http://www.biocarta.com

BioModels database
http://www.biomodels.net
• Database for published, quantitative models of biochemical processes
• All models/pathways curated manually, compliant with MIRIAM
• Models can be output in SBML format for quantitative modeling
• 86 curated models, 40 models pending curation

Glycolysis pathways in BioModels
http://www.biomodels.net

Comparison of pathway databases (MetaCyc/BioCyc; KEGG PATHWAY; BioCarta; BioModels)
• Curation: MetaCyc/BioCyc manual and automated; KEGG PATHWAY automated; BioCarta manual; BioModels manual
• Size: MetaCyc/BioCyc ~621+ pathways; KEGG PATHWAY ~289 reference pathways; BioCarta ~355 pathways; BioModels ~126 models
• Nomenclature: MetaCyc/BioCyc EC, GO; KEGG PATHWAY EC, KO; BioCarta none; BioModels GO
• Organism coverage: MetaCyc/BioCyc ~500 species; KEGG PATHWAY various; BioCarta primarily human and mouse; BioModels ~475 species
• Visuals: MetaCyc/BioCyc species-specific, custom; KEGG PATHWAY reference and species-specific; BioCarta animated, cartoonish; BioModels non-standardized
• Primary usage: MetaCyc/BioCyc PGDB, computational biology; KEGG PATHWAY PGDB, pathway comparisons; BioCarta human pathways, disease; BioModels simulations, modeling

Inferring pathways and networks
Experimental methods
• Microarray co-expression
• Quantitative trait locus (QTL) mapping
• Isotope-coded affinity tagging (ICAT)
• Yeast two-hybrid assay
• Green fluorescent protein (GFP) tagging
Computational methods
• Database-driven protein-protein interactions
• Expression clustering techniques
• Literature-mining for specified interactions

Cellular simulations
• Study the effect a perturbation has on a pathway (and thus the organism)
• Generally require extensive detail on the pathway or reactions of interest (flux equations, metabolite concentrations, etc.)
• Cellular pathway simulations must manage both temporal and spatial complexity
(Figure: temporal intervals, from picoseconds to years, versus spatial dimension, from 0.1 nm to 1 m)
Adapted from Kelly, H., http://www.fas.org/resource/05242004121456.pdf , via Neal, Yngve 2006 VHS, UW MEBI 591

Simulation methods and techniques (biological process; phenomena; computation scheme)
• Metabolism; enzymatic reaction; differential-algebraic equations, flux-based analysis
• Signal transduction; binding; differential-algebraic equations, stochastic algorithms, diffusion-reaction
• Gene expression; binding, polymerization, degradation; object-oriented modeling, differential-algebraic equations, stochastic algorithms, boolean networks
• DNA replication; binding, polymerization; object-oriented modeling, differential-algebraic equations
• Membrane transport; osmotic pressure, membrane potential; differential-algebraic equations, electrophysiology
Adapted from Tomita 2001

Research in simulation and modeling
• Virtual Cell (National Resource for Cell Analysis and Modeling)
• MCell (the Salk Institute)
• Gepasi (Virginia Tech)
• E-CELL (Institute for Advanced Biosciences, Keio University)
• Karyote/CellX (Indiana University)

End Theory II
• 5 min mindmapping
• 10 min break

Term Project
• Max 3000 words
• Focus on results and their discussion
• Make sure to incorporate all the little hints we gave
• Incorporate runtime for the new dataset as another performance measure

Practice
• Perform the steps as described here: http://wiki.cytoscape.org/GettingStarted
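
As a small illustration of the cellular simulation idea from Theory II (studying the effect a perturbation has on a pathway), the sketch below integrates a single enzymatic reaction over time. The choice of Michaelis-Menten kinetics, the forward Euler scheme, the function name `simulate` and all parameter values are assumptions made for this example. It is a plain ordinary differential equation, far simpler than the differential-algebraic or stochastic schemes used by the tools listed above.

```python
def simulate(vmax, km, s0, dt=0.01, t_end=5.0):
    """Integrate substrate S -> product P with forward Euler.

    Assumed rate law (Michaelis-Menten): rate = vmax * S / (km + S).
    Returns the final (substrate, product) concentrations.
    """
    substrate, product = s0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        rate = vmax * substrate / (km + substrate)
        substrate -= rate * dt  # substrate consumed in this time step
        product += rate * dt    # same amount appears as product
    return substrate, product


if __name__ == "__main__":
    # Hypothetical parameter values in arbitrary units.
    baseline = simulate(vmax=1.0, km=0.5, s0=10.0)
    # Perturbation: the enzyme is partially inhibited, so vmax is halved.
    inhibited = simulate(vmax=0.5, km=0.5, s0=10.0)
    print("Product without perturbation:", round(baseline[1], 3))
    print("Product with inhibited enzyme:", round(inhibited[1], 3))
```

Comparing the two runs shows how a change to one parameter propagates to the amount of product, which is the kind of downstream prediction that detailed biochemical pathways allow and co-expression networks do not.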