Databases for Systems Biology
Herbert M Sauro
Keck Graduate Institute
Claremont, CA, 91711

Systems Biology
• Computational Systems Biology Group (Peter Spirtes) in Pittsburgh, Pennsylvania
• Biochemical Networks Modeling Group (Pedro Mendes) at the Virginia Bioinformatics Institute
• Computational Systems Biology Group (Reinhard Laubenbacher) at the Virginia Bioinformatics Institute
• Evolution of Molecular Networks group (Andreas Wagner) at the University of New Mexico
• Systems biology group (Trey Ideker) at the Whitehead Institute for Biomedical Research, Cambridge (USA)
• Computational Cell Biology (Dennis Bray) at the University of Cambridge (UK)
• STRC Biocomputation Group (Hamid Bolouri) at the University of Hertfordshire
• Computational Molecular Biology (Ron Shamir) at the University of Tel Aviv
• Complex Systems Division (Carsten Peterson) at the University of Lund
• Design Principles of Protein Networks (Uri Alon) at the Weizmann Institute
• Design Principles of Protein Networks (Naama Barkai) at the Weizmann Institute
• Probabilistic Graphical Models (Daphne Koller) at the University of Stanford
• Molecular Biology and Probabilistic Models (Nir Friedman) at the Hebrew University of Jerusalem
• Systems Optimization Group (Eckart Zitzler) at the ETH Zürich
• Protein Interaction Group (Benno Schwikowski) at the Systems Biology Institute, Seattle
• Systems Biology Center at TU Delft
• Integrative Systems Biology at TU Denmark
• U Ghent
• Institute for Advanced Study, Center for Systems Biology
• Ron Weiss group, Princeton University
• BII Systems Biology Group (Singapore)
• UC San Francisco BioSystems Group

Groups World-Wide
• Kitano Systems Biology Group
• Davidson Lab at Caltech
• Bioinformatics & Systems Biology Group at the Burnham Institute (La Jolla)
• Virtual Cell Project, U Connecticut
• UC Santa Barbara IGERT Program on Systems Biology
• UC San Diego Bioinformatics & Systems Biology Groups
• UC San Diego Systems Biodynamics Group
• Integrated Systems Biology Group at Rensselaer Polytechnic Institute

Systems Biology Institutes and Larger Initiatives
• BioSPI Project at Weizmann
• BioSPICE
• BioMaps Institute at Rutgers
• Institute for Systems Biology, Seattle
• Bauer Center for Genomics Research (CGR) at Harvard University
• Systems Biology Department at Harvard Medical School
• Computational and Systems Biology Initiative at MIT
• Bio-X at Stanford University
• Center for Studies in Physics and Biology at The Rockefeller University
• GENSCEND Initiative of the Wellcome Trust
• "Genomes to Life program" (a funding initiative of the DOE)
• "Cell Systems Initiative" (an initiative of the University of Washington)
• "Systems of Life - System Biology" (a funding initiative of the German Ministry of Education and Research, BMBF)
• SFB 618 (funded by the German Research Council DFG)
• STAGSIM - Systems Biology (an Expression of Interest (EoI) submitted to the EU Framework Program VI)
• Systems Biology in Sweden
• Institute for Computational Biomedicine at the Weill Medical College of Cornell University
• Pathways/Systems Biology Working Group at I3C

Systems Biology Has its Backers and Attackers
Revolution or buzzword du jour, pundits ponder a pervasive term | By Mignon Fogarty

Though coined 40 years ago,1 a lot of people still ask, "What's that?" when the term systems biology comes up. "It is used in so many different contexts, nobody is really clear what you mean by it," says John Yates III, a professor at the Scripps Research Institute in La Jolla, Calif. He's not the only one stumped by the term's meaning. David Placek, president of Sausalito, Calif.-based Lexicon Branding, a company that cooks up names for pharmaceutical products such as Velcade and Meridia, says he's not so hot on the moniker. "Systems biology is just so general that it could apply to many things.
When you're naming a category, the underlying principle is that if you make a statement like, 'I'm doing systems biology,' do people know what you're talking about?" ...
Volume 17 | Issue 19 | 27 | Oct. 6, 2003, The Scientist

Systems Biology?
High-throughput Data?

Systems Biology?
Databases? PathDB

What is Systems Biology?
• Understanding the principles of how physiological/phenotypic characteristics emerge from the properties of the components.
• Predicting how these characteristics will change in response to alterations in the environment or system components.

What are we dealing with?
[Figure: Mirit Aladjem et al., STKE, March 2004]

Successful Models
• Red Blood Cell: Mulquiney, Joshi, Heinrich, …
• Yeast Glycolysis: Bas Teusink
• Trypanosoma brucei: Barbara Bakker, Westerhoff and Cornish-Bowden
• EGF Signaling Pathway: Frances Brightman et al
• Calvin Cycle: Poolman and Fell
• Chemotaxis, E. coli: Many Contributors
• Yeast Cell Cycle: John Tyson et al

Level of Complexity
E. coli composition

Molecule          # Molecules per cell   # of Types
Protein           2,360,000              1000-2000
RNA               270,000                5
Small Molecules   millions               500
Ions              millions               20-30

http://biosci191.bsd.uchicago.edu/L02/ecoli.htm
http://opbs.okstate.edu/5753/Composition%20table.html

Man-made Complex Devices
• Intel Pentium 4: 42 million transistors
• The AMD Opteron: 105.9 million transistors; number of gates > 54 million
• The Intel Itanium 2: 410 million transistors; number of gates > 100 million
By 2007 both Intel and AMD are predicting dies with 1 billion transistors.
Many of the new graphics chips have over 60 million transistors.
AMD are working towards 45-nanometer transistors by 2007.
The sizes of proteins vary from 2 nm to 20 nm.

Man-made Complex Devices
Probably by 2010, man-made devices will have comparable complexity to bacterial cells, if not greater.

Cellular Models
Building computational models of cells seems more and more like a viable project. Such a project would bring a much clearer understanding of how cellular systems are controlled, and ultimately it should bring unprecedented predictive power.

Are Biologists Ready?
Xo <-> S1 <-> S2 <-> S3 <-(v)-> S4 <-> S5 <-> S6 <-> X1
Xo and X1 fixed, all reactions reversible, assume stable steady state.
Step v: 50 %
What happens to the steady state?

Students reply:
1. Nothing happens.
2. Nothing happens unless it is the rate-limiting step.
3. The rate v goes down, but that's all.
4. S3 goes up.
5. S4 goes down.
6. Species downstream of v go up.
7. Steady state flow changes but species levels don't.
8. Xo and X1 change.

If we can't understand this system, how can we hope to understand:

Functional Motif Identification
Computer simulation of EGF signal transduction in PC12 cells.
Frances Brightman, Simon Thomas and David Fell
http://bms-mudshark.brookes.ac.uk/frances/fabweb5.htm
29 species
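The linear pathway question posed above (Xo through S1 to S6 to X1, with step v perturbed) can be checked numerically. The following is a minimal sketch, assuming every step follows reversible mass-action kinetics and reading the "50 %" label as a halving of both rate constants of step v, the step between S3 and S4. The rate constants, the boundary concentrations and the function name `steady_state` are illustrative assumptions, not values taken from the slides.

```python
def steady_state(kf, kr, x0, x1):
    """Steady state of X0 <-> S1 <-> ... <-> Sn <-> X1.

    Step i has rate v_i = kf[i] * upstream - kr[i] * downstream.
    At steady state every step carries the same flux J, so each
    concentration can be written as a + b * J and J follows from
    the fixed end concentration x1.
    Returns (J, [S1, ..., Sn]).
    """
    a, b = x0, 0.0
    coeffs = []
    for f, r in zip(kf, kr):
        # downstream = (f * upstream - J) / r, with upstream = a + b * J
        a, b = f * a / r, (f * b - 1.0) / r
        coeffs.append((a, b))
    a_end, b_end = coeffs[-1]          # last entry describes the fixed X1
    flux = (x1 - a_end) / b_end
    species = [a_i + b_i * flux for a_i, b_i in coeffs[:-1]]
    return flux, species


# Hypothetical parameters: seven identical steps, six floating species.
kf = [1.0] * 7
kr = [0.5] * 7
X0, X1 = 10.0, 1.0

j_ref, s_ref = steady_state(kf, kr, X0, X1)

# Halve both rate constants of step v (index 3, between S3 and S4).
kf_inh, kr_inh = list(kf), list(kr)
kf_inh[3] *= 0.5
kr_inh[3] *= 0.5
j_inh, s_inh = steady_state(kf_inh, kr_inh, X0, X1)

print("flux before/after:", j_ref, j_inh)
for i, (before, after) in enumerate(zip(s_ref, s_inh), start=1):
    print(f"S{i}: {before:.3f} -> {after:.3f}")
```

Under these assumptions the steady-state flux falls by much less than 50 %, the species upstream of v rise and the species downstream of v fall, so several of the student replies capture only part of the response.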
Functional Motif Identification
[Figure: 27 components]
[Figure labels: Amplifier, Demodulator, Resonance Detector]
[Figure labels: Rectifier, Audio Filter, Carrier Filter, Demodulator, Amplifier, Power Amplifier, Amplifier, Feedback, Feedback, Pre-Amplifier, Filter]

How Intel Engineers Cope
Complex man-made devices are modeled and designed on multiple levels; each level may use different modeling techniques:

Transistor Characteristics         Fundamental Protein Chemistry
Basic Logic Gates                  Basic Enzyme Rate Characteristics
Small Gate Modules                 Small Enzyme Motifs
Hierarchy of functional modules    Hierarchy of functional modules
Top Level Module                   Top Level Module

Functional Motif Identification
Negative Feedback in the MAPK Pathway
[Diagram: amplifier with gain A, input yi, output yo and feedback k]
At high amplifier gain (A k > 1): linearization of the amplifier response.
[Figure panels: Without Feedback; With Feedback]

Functional Motif Identification
E. coli Chemotaxis
[Figure labels: Signaling network, reset, Run, Tumble, Motor]

Software Tools and Resources:
• Software Infrastructure
• Interchange Formats
• Analysis Algorithms
• Model Editors
• Visualization
• Model Databases
• Theoretical Foundation

Databases for Systems Biology
• Kinetic Data
• Network Information

Systems Biology Models
• Simple first-order reaction kinetics
• Power Law
• Simple irreversible Michaelis-Menten
• Reversible Michaelis-Menten
• Irreversible Allosteric Mechanism

Databases for Systems Biology
The oldest known metabolic pathway is yeast glycolysis.
http://www.utoronto.ca/greenblattlab/yeast.htm
http://www.utc.edu/Faculty/Becky-Bell/210-outline05.html

Hexokinase, EC 2.7.1.1
Glucose + ATP = G6P + ADP
Km: None available
Specific Activity: 512 M/min/mg

Phosphofructokinase, EC 2.7.1.11
ATP + F6P = ADP + FBP
Km: None available
Specific Activity: 180 M/min/mg, 148 M/min/mg, 114 M/min/mg

Pyruvate Kinase, EC 2.7.1.40
PEP + ADP = Pyruvate + ATP
Km (ADP): 0.16 mM (+ FBP)
Specific Activity: None available

Databases for Systems Biology
1. Kinetic equations
2. Values for kinetic constants plus standard errors
3. Conditions under which enzyme was characterized

Networks
Network information is mostly not accessible in convenient formats; the user has to do much work to extract the desired information.
There is a need for a model or network exchange format.

Networks
There is also the need for a network visualization standard.
• DCL: Gene Network Sciences
• Mirit I. Aladjem and Kurt Kohn

Model Databases
[Diagram labels: Other Systems (e.g. BioSPICE), Database, Web Services, Peer Reviewed, Desktop Client, SBW Client, Version Controller, Scratchpad => translator => SBML/SQL translator]
Translator targets: Matlab, XPP, FORTRAN, Berkeley Madonna, SBML, CellML, C, Java, Mathematica, etc.

Modelling Tools
[Figure: bar chart, vertical axis 1 to 9, horizontal axis "Period" from 65-69 to 95-99. Klaus Mauch, University of Stuttgart]
• BIOSSIM (1968)
• ESSYN (1976)
• SCAMP (1983)
• SCOP (1986)
• METAMOD (1986)
• SIMFIT (1990)
• METAMODEL (1991)
• METASIM (1992)
• KINSIM (1993)
• GEPASI (1994)
• METALGEN (1994 ?)
• MIST (1995)
• METABOLIKA (1997 ?)
• METAFLUX (1997)
• SIMFLUX (1997)
• MNA (1998)
• CELLMOD (1998)
• FLUXMAP (1999)
• METATOOL (1999)
• VCELL (1999)

SBML – Systems Biology Markup Language
The Systems Biology Markup Language (SBML) is a computer-readable format for representing models of biochemical reaction networks. SBML is applicable to metabolic networks, cell-signaling pathways, genomic regulatory networks, and many other areas in systems biology.
[Diagram: Tool 1, Tool 2]
Originally developed by Hamid Bolouri, Andrew Finney, Michael Hucka and Herbert Sauro.

SBML – Systems Biology Markup Language
XML based Standard
• Simple Compartments (well stirred reactor)
• Internal/External Species
• Reaction Schemes
• Global Parameters
• Arbitrary Rate Laws
• DAEs (ODE + Algebraic functions, Constraints)
• Physical Units/Model Notes
• Annotation – extension capability

SBML – Systems Biology Markup Language
What is XML?
<?xml version="1.0" ?>
<note>
  <to> Hobbit </to>
  <from> Orc </from>
  <heading> Note to Frodo </heading>
  <body> I want to eat you </body>
</note>

SBML – Systems Biology Markup Language
XML has a hierarchical structure

<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>

Each node can also have optional attributes, e.g. <child name = "john">

SBML – Example

<?xml version="1.0" encoding="UTF-8"?>
<!-- Created by XMLPrettyPrinter on 11/14/2002 -->
<sbml level = "1" version = "1" xmlns = "http://www.sbml.org/sbml/level1">
  <!-- Model Starts Here -->
  <model name = "untitled">
    <listOfCompartments>
      <compartment name = "uVol" volume = "1"/>
    </listOfCompartments>
    <listOfSpecies>
      <specie boundaryCondition = "false" compartment = "uVol" initialAmount = "0" name = "Node0"/>
      <specie boundaryCondition = "false" compartment = "uVol" initialAmount = "0" name = "Node1"/>
      <specie boundaryCondition = "false" compartment = "uVol" initialAmount = "0" name = "Node2"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction name = "J0" reversible = "false">
        <listOfReactants>
          <specieReference specie = "Node0" stoichiometry = "1"/>
        </listOfReactants>
        <listOfProducts>
          <specieReference specie = "Node1" stoichiometry = "1"/>
        </listOfProducts>
        <kineticLaw formula = "v">
        </kineticLaw>
      </reaction>
      <reaction name = "J1" reversible = "false">
        <listOfReactants>
          <specieReference specie = "Node1" stoichiometry = "1"/>
        </listOfReactants>
        <listOfProducts>
          <specieReference specie = "Node2" stoichiometry = "1"/>
        </listOfProducts>
        <kineticLaw formula = "v">
        </kineticLaw>
      </reaction>
    </listOfReactions>
  </model>
</sbml>

Other Related Efforts - CellML
CellML is a more comprehensive attempt at developing an exchange standard, also defined in terms of XML. However, it is much more complex, and the designers of CellML have not provided software support in the form of tools and software libraries.
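Because SBML is plain XML, the Level 1 example above can be read with any XML parser. The following is a minimal sketch using only Python's standard `xml.etree.ElementTree` module rather than a dedicated SBML library; the function name `read_reactions` and the dictionary layout it returns are assumptions made for illustration. Note the Level 1 spelling `specie` used in the example.

```python
import xml.etree.ElementTree as ET

# The SBML Level 1 example from the slides, as bytes.
SBML = b"""<?xml version="1.0" encoding="UTF-8"?>
<sbml level="1" version="1" xmlns="http://www.sbml.org/sbml/level1">
  <model name="untitled">
    <listOfCompartments>
      <compartment name="uVol" volume="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <specie boundaryCondition="false" compartment="uVol" initialAmount="0" name="Node0"/>
      <specie boundaryCondition="false" compartment="uVol" initialAmount="0" name="Node1"/>
      <specie boundaryCondition="false" compartment="uVol" initialAmount="0" name="Node2"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction name="J0" reversible="false">
        <listOfReactants><specieReference specie="Node0" stoichiometry="1"/></listOfReactants>
        <listOfProducts><specieReference specie="Node1" stoichiometry="1"/></listOfProducts>
        <kineticLaw formula="v"></kineticLaw>
      </reaction>
      <reaction name="J1" reversible="false">
        <listOfReactants><specieReference specie="Node1" stoichiometry="1"/></listOfReactants>
        <listOfProducts><specieReference specie="Node2" stoichiometry="1"/></listOfProducts>
        <kineticLaw formula="v"></kineticLaw>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

# Every element lives in the namespace declared on the <sbml> root.
NS = {"sbml": "http://www.sbml.org/sbml/level1"}


def read_reactions(xml_bytes):
    """Return (species names, {reaction name: (reactants, products, rate law)})."""
    root = ET.fromstring(xml_bytes)
    model = root.find("sbml:model", NS)
    species = [s.get("name")
               for s in model.findall("sbml:listOfSpecies/sbml:specie", NS)]
    reactions = {}
    for rxn in model.findall("sbml:listOfReactions/sbml:reaction", NS):
        reactants = [r.get("specie") for r in
                     rxn.findall("sbml:listOfReactants/sbml:specieReference", NS)]
        products = [p.get("specie") for p in
                    rxn.findall("sbml:listOfProducts/sbml:specieReference", NS)]
        law = rxn.find("sbml:kineticLaw", NS).get("formula")
        reactions[rxn.get("name")] = (reactants, products, law)
    return species, reactions


species, reactions = read_reactions(SBML)
print(species)
for name, (reactants, products, law) in reactions.items():
    print(name, reactants, "->", products, "rate:", law)
```

A generic parser like this checks only that the file is well-formed XML; it does not validate the model against the SBML specification.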
Data Formats
One other area which is even more difficult to resolve is experimental data formats: microarray, proteomic, metabolomic, basically all the omics. Two projects are attempting to put some order in the data format area: bioSPICE and, particularly, the DOE GTL project.

The Future
There is obviously a long, long way to go.
Kinetic data must be more carefully curated.
Standards for exchanging data and models, including visualization notations, need to be developed further.

What are we dealing with?
Reaction systems working on multiple time scales:
1. Discrete deterministic events
2. Fast reactions
3. Continuous variables (modeled by ODEs)
4. Continuous variables with additive and multiplicative noise
5. Stochastic discrete systems (Gillespie type)
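The last item in the list of time scales, stochastic discrete systems of the Gillespie type, can be illustrated with a short simulation. The following is a minimal sketch of Gillespie's direct method, restricted to first-order conversion steps and applied to the chain Node0 -> Node1 -> Node2 from the SBML example. The rate constants, the initial molecule count and the function name `gillespie` are hypothetical choices for illustration.

```python
import random


def gillespie(counts, reactions, t_end, rng):
    """Direct-method stochastic simulation of first-order conversions.

    counts    : initial molecule numbers, one per species
    reactions : list of (rate_constant, source_index, product_index)
    t_end     : simulation stops once the next event would exceed this time
    rng       : a random.Random instance (seed it for reproducible runs)
    Returns a list of (time, counts) tuples, one per event.
    """
    t = 0.0
    counts = list(counts)
    trajectory = [(t, tuple(counts))]
    while True:
        # Propensity of each step: rate constant times source molecule count.
        propensities = [k * counts[src] for k, src, _ in reactions]
        total = sum(propensities)
        if total == 0.0:
            break                      # nothing left that can react
        t += rng.expovariate(total)    # waiting time to the next event
        if t > t_end:
            break
        # Pick the reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        running = 0.0
        for (_, src, dst), a in zip(reactions, propensities):
            running += a
            if threshold < running:
                counts[src] -= 1
                counts[dst] += 1
                break
        trajectory.append((t, tuple(counts)))
    return trajectory


# Hypothetical example: 50 molecules of Node0, two irreversible steps.
steps = [(1.0, 0, 1),   # Node0 -> Node1
         (0.5, 1, 2)]   # Node1 -> Node2
run = gillespie([50, 0, 0], steps, t_end=1000.0, rng=random.Random(1))
print("events:", len(run) - 1)
print("final state:", run[-1])
```

Each run gives a different trajectory unless the random generator is seeded; averaging many runs approaches the solution of the corresponding ODE model for this linear system.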