* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Symbolic Systems Biology
Protein–protein interaction wikipedia , lookup
Wnt signaling pathway wikipedia , lookup
Metabolic network modelling wikipedia , lookup
Polyclonal B cell response wikipedia , lookup
Proteolysis wikipedia , lookup
Two-hybrid screening wikipedia , lookup
Lipid signaling wikipedia , lookup
Evolution of metal ions in biological systems wikipedia , lookup
Secreted frizzled-related protein 1 wikipedia , lookup
Clinical neurochemistry wikipedia , lookup
G protein–coupled receptor wikipedia , lookup
Ultrasensitivity wikipedia , lookup
Gene regulatory network wikipedia , lookup
Multi-state modeling of biomolecules wikipedia , lookup
Signal transduction wikipedia , lookup
Mitogen-activated protein kinase wikipedia , lookup
Symbolic Systems Biology --- Using Formal Methods Tools to Model Biological Processes http://pl.csl.sri.com Carolyn Talcott SRI International SSFT May 2011 PLan Symbolic systems biology Pathway Logic Modeling Analyzing Case studies Choosing a formalism Future challenges Biological Systems Biological processes are complex signaling, regulation, defense, .... Huge dynamics timescales: microseconds to years Spatial scales over 12 orders of magnitude single protein to cell to organ to organism ... Oceans of experimental biological data generated Important intuitions captured in mental models that biologists build of biological processes How to build in silico models from all this? SyMbolic Systems Biology Symbolic -- represented in a logical framework Systems -- how things interact and work together, integration of multiple parts, viewpoints and levels of abstraction Goals: Develop formal models that are as close as possible to domain expert’s mental models Compute with, analyze and reason about complex networks New insights into / understanding of biological mechanisms Executable Formal Models Describe system states and rules for change in a formal system From an initial state, derive a transition graph nodes -- reachable states edges -- rules connecting states Path in transition graph ~ computation/derivation Many kinds of analysis available Pathway LogiC (PL) Representation of Signaling http://pl.csl.sri.com/ Signaling PATHWAYS Signaling pathways involve the modification and/or assembly of proteins and other molecules within cellular compartments into complexes that coordinate and regulate the flow of information. Signaling pathways are distributed in networks having stimulatory (positive) and inhibitory (negative) feedback loops, and other concurrent interactions to ensure that signals are propagated and interpreted appropriately in a particular cell or tissue. Signaling networks are robust and adaptive, in part because of combinatorial complex formation (several building blocks for forming the same type of complex), redundant pathways, and feedback loops. Egf stimulation of the Mitogen Activated Protein Kinase (MAPK) pathway. Egf → EgfR → Grb2 → Sos1 → Ras → Raf1 → Mek → Erk Egf (EGF) binds to the Egf receptor (EgfR) and stimulates its protein tyrosine kinase activity to cause autophosphorylation, thus activating EgfR. The adaptor protein Grb2 (GRB2) and the guanine nucleotide exchange factor Sos1 (SOS) are recruited to the membrane, binding to EgfR. The EgfR complex activates a Ras family GTPase Activated Ras activates Raf1, a member of the RAF serine/threonine protein kinase family. Raf1 activates the protein kinase Mek (MEK), which then activates Erk (MAPK) ... from Wikipedia Rewriting Logic Rewriting Logic is a logical formalism that is based on two simple ideas states of a system are represented as elements of an algebraic data type the behavior of a system is given by local transitions between states described by rewrite rules It is a logic for executable specification and analysis of software systems, that may be concurrent, distributed, or even mobile. It is also a (meta) logic for specifying and reasoning about formal systems, including itself (reflection!) About Pathway Logic Pathway Logic (PL) is an approach to modeling biological processes as executable formal specifications (in Maude) The resulting models can be queried using formal methods tools: given an initial state execute --- find some pathway search --- find all reachable states satisfying a given property model-check --- find a pathway satisfying a temporal formula using reflection find all rules that use / produce X (for example, activated Rac) find rules down stream of a given rule or component translate to alternative formalism and export Pathway Logic Organization A PL cell signaling model is generated from • a knowledge base a cell state A PL knowledge base consists of Theops --- sorts and operations Components --- specific proteins, chemicals ... Rules --- biomolecular reactions / processes A cell state is given by specifying • • the proteins and other molecular components present the incoming signals (ligands) Rule 1: Receptor Binding If a dish contains an EgfR ligand (?ErbB1L:ErbB1L) outside a cell with EgfR in the cell membrane then the ligand binds to exterior part of the receptor and the receptor is activated. rl[1.EgfR.act]: ?ErbB1L:ErbB1L [CellType:CellType | ct {CLm | clm EgfR}] => [CellType:CellType | ct {CLm | clm ([EgfR - act] : ?ErbB1L:ErbB1L)} ] . Rule 1 applies to rasDish PD(Egf [Cell | {CLm | EgfR PIP2}{CLi | [Hras - GDP] Src} {CLc | Gab1 Grb2 Pi3k Plcg Sos1}]) with the match ?ErbB1L:ErbB1L := Egf clm := PIP2 ct := {CLi | [Hras - GDP] Src} {CLc | Gab1 Grb2 Pi3k Plcg Sos1} giving rasDish1 PD([Cell | {CLm | ([EgfR - act] : Egf) PIP2} {CLi | [Hras - GDP] Src} {CLc | Gab1 Grb2 Pi3k Plcg Sos1}]) . How do we Infer Rules? Rules are inferred from evidence, called datums, curated from the literature. Consider the rule converting Rala-GDP to Rala-GTP in response to Egf stimulation rl[1064.Rala.irt.Egf]: {EgfRC | [EgfR - act] : Egf Pi3k RalGds clm } {CLi | cli [Rala - GDP]} => {EgfRC | [EgfR - act] : Egf Pi3k RalGds [Rala - GTP] clm } {CLi | cli } . Evidence for this rule includes: DID#05387:! Rala[Ab] GTP[BD-PD] is increased irt Egf (tnr) cells: Cos7 in BMLS inhibited by: Wortmannnin [chem] --- Pi3k Inhibitor inhibited by: LY294002 [chem] "" source: 15034142-Fig-5c DID#12876:! xRala[xAb]IP GTP/GDP[32Pi-TLC] is increased irt Egf cells: Cos1-xRalGds in BMS times: 0 1+ 2++ 3++ 4+ 5 min reqs: xRalGds [omission] inhibited by: xRalGds-C203S "membrane-binding-mutant" [substitution] comment: cells were pretreated with Vanadate 30 min before Egf treatment source: 9416833-Fig-2 The Pathway Logic Assistant (PLA) The Pathway Logic Assistant (PLA) Provides a means to interact with a PL model Manages multiple representations Maude module (logical representation) PetriNet (process representation for efficient query) Graph (for interactive visualization) Exports Representations to other tools Lola (and SAL model checkers) Dot -- graph layout JLambda (interactive visualization, Java side) About Petri Nets A Petri net is represented as a graph with two kinds of nodes: * transitions/rules (reactions--squares) * places/occurrences (reactants, products, modifiers--ovals) A A Petri net process has tokens on some of its places. A rule can fire if all of its inputs have tokens. Firing a rule moves tokens from input to output. A B AB C AB C AB 2 2 2 D D D An execution is a sequence of rule firings. A pathway is represented as an execution subgraph. B 1 1 1 C A B EgfR-CLm Hras activated 1 Parallel paths Grb2-CLc rasNet Egf-Out Egf-bound-CLo EgfR-act-CLm Rule instances relevant to Hras activation 5 Cross talk Grb2-reloc-CLi Synchronization Gab1-CLc 12 Conflict Sos1-CLc Grb2-Yphos-CLi 4 Gab1-Yphos-CLi Pi3k-CLc 13 8 Sos1-reloc-CLi Pi3k-act-CLi PIP2-CLm 9 Hras-GDP-CLi PIP3-CLm 6 Hras-GTP-CLi Src-CLi Plcg-CLc 10 Plcg-act-CLi 7 DAG-CLm IP3-CLc A Simple Query Language Given a Petri net with transitions P and initial marking O (for occurrences) there are two types of query subnet findPath - a computation / unfolding For each type there are three parameters G: a goal set---occurrences required to be present at the end of a path A: an avoid set---occurrences that must not appear in any transition fired H: as list of identifiers of transitions that must not be fired subnet returns a subnet containing all (minimal) such pathways (using backward and forward collection) findPath returns a pathway (transition list) generating a computation satisfying the requiremments (using model checking on the negation). Pathways in Ras Net EgfR-CLm EgfR-CLm Egf-Out 1 1 EgfR-CLm Egf-Out Grb2-CLc 1 Grb2-CLc Grb2-reloc-CLi Egf:EgfR-act-CLm 5 Grb2-reloc-CLi 5 Sos1-CLc Grb2-CLc Egf:EgfR-act-CLm 5 Egf:EgfR-act-CLm Egf-Out Gab1-CLc Sos1-CLc 4 Pi3k-CLc Gab1-Yphos-CLi 13 8 Sos1-reloc-CLi Pi3k-act-CLi Pi3k-CLc Grb2-reloc-CLi Gab1-CLc 13 4 Sos1-reloc-CLi Gab1-Yphos-CLi 8 Pi3k-act-CLi Model of EGF STimulation (by Merrill Knapp) The ErbB Network (CARTOON FORM) Yarden and Sliwkowski, Nat. Rev. Mol. Cell Biol. 2: 127-137, 2001 PL Egf Model Events that could occur in response to Egf Curated by Merrill Knapp Subnet Relevant to Erk Activation Contains all pathways leading to activation of Erk. Obtained by backwards followed by forwards collection. Sos1 EgfR-EgfRC Egf-XOut 001 Shc1-CLc Gab2-CLc Sos1-CLc 191 075 197 Gab2-Yphos-EgfRC Pi3k-CLc Sos1-Yphos-EgfRC Hras-GDP-CLi Egf:EgfR-Yphos-EgfRC Ptpn11-CLc Dbl-CLc Gab1-CLc 172 188 389 116 Pi3k-EgfRC 529-13 Ptpn11-Yphos-EgfRC Dbl-Yphos-EgfRC Cdc42-GDP-CLi 207 Src-act-EgfRC 399-2 Cdc42-GTP-EgfRC Hras-GTP-EgfRC Braf-CLc Shc1-Yphos-EgfRC Git1-CLc 398 Gab1-Yphos-EgfRC Git1-Yphos-EgfRC Mlk3-act-EgfRC 310 Mek1-CLc Braf-act-EgfRC 639-1 Fak2-act-EgfRC Mek1-act-EgfRC Activation of Erk in response to Egf Fak2-CLc 186 Mlk3-CLc 1063 IqGap1-CLc Src-CLc Shoc2-CLc 196 Erks-act-EgfRC Erks-CLc EgfR-EgfRC Egf-XOut 001 Shc1-CLc 191 Gab2-CLc Egf:EgfR-Yphos-EgfRC 075 Pi3k-CLc Ptpn11-CLc 172 Gab1-CLc 188 116 Ptpn11-Yphos-EgfRC Shc1-Yphos-EgfRC Gab2-Yphos-EgfRC Src-CLc Pi3k-EgfRC 207 RasGrp3-CLc ArhGef4-CLc Src-act-EgfRC Git1-CLc 156 398 440 RasGrp3-Yphos-EgfRC Hras-GDP-CLi Cdc42-GDP-CLi 529-11 Fak2-CLc 186 ArhGef4-Yphos-EgfRC 399 Cdc42-GTP-EgfRC Hras-GTP-EgfRC Braf-CLc 1063 IqGap1-CLc Mlk3-CLc Git1-Yphos-EgfRC Mlk3-act-EgfRC 310 Mek1-CLc Fak2-act-EgfRC Braf-act-EgfRC 639-1 Activation of Erk in response to Egf avoiding Sos1 Gab1-Yphos-EgfRC Mek1-act-EgfRC Shoc2-CLc 196 Erks-act-EgfRC Erks-CLc EgfR-EgfRC Egf-XOut 001 Shc1-CLc 191 Sos1-CLc Gab2-CLc 197 Gab2-Yphos-EgfRC Egf:EgfR-Yphos-EgfRC Pi3k-CLc 075 Sos1-Yphos-EgfRC Hras-GDP-CLi Ptpn11-CLc Dbl-CLc 172 188 Pi3k-EgfRC Ptpn11-Yphos-EgfRC Src-CLc Dbl-Yphos-EgfRC 207 529-13 RasGrp3-CLc 389 Cdc42-GDP-CLi 440 116 Src-act-EgfRC 399-2 ArhGef4-CLc 156 RasGrp3-Yphos-EgfRC 398 Git1-CLc Fak2-CLc 186 ArhGef4-Yphos-EgfRC 529-11 Gab1-Yphos-EgfRC 399 Hras-GTP-EgfRC Cdc42-GTP-EgfRC Mlk3-CLc 1063 Braf-CLc IqGap1-CLc Shc1-Yphos-EgfRC Comparing ways to activate Erk Pink - both pathways Cyan -- with Sos1 Blue -- avoiding Sos1 Gab1-CLc Git1-Yphos-EgfRC Mlk3-act-EgfRC 310 Mek1-CLc Braf-act-EgfRC 639-1 Fak2-act-EgfRC Mek1-act-EgfRC Shoc2-CLc 196 Erks-act-EgfRC Erks-CLc SLEEP (with MaryAnn Greco and Merrill Knapp) The Question What is the function of sleep? What are your cells doing when you sleep? vs awake? Rat model -- proteomics from different organs at different sleep states Natural Sleep Paradigm 2D Master GEL Proteins unique to different states were identified Those modeled in PL included Actin and Rhob Use the PLA explorer to find signaling connections Exploring PL KB from Actin Actin-mono 886 Tmsb4x Pfn1 11 732 Pfn1:Actin-mono Diaph1-act 555 Ssh Pxn Vasp Tns1 672 Tmsb4x:Actin-mono Actin-poly 374-8 Limk1-act 165 Cofilin Ptk2-act Pxn 364-2 Cofilin-phos Abi1:Eps8:Sos1 Arp23-act Src-act 764 Vasp Actinin 601 713 1075 Tns1 Actinin Ilk:Lims1:Parva 434 Egf:EgfR-act Pi3k-act Ptk6-act Rac1-GDP E41 53 Rac1-GTP Pxn-Yphos Integrins-clustered Crk-Yphos T 1 Exploring PL KB from RhoB Ngef-reloc Mcf2-act 798-2 Cit Dgk Pld1 591-2 699-2 611-2 kl-reloc DAG Dgk-act PC 906 27 PA Choline Pld1-act Prkcl1 Rhob-GDP 221-2 Rhob-GTP 581-2 680-2 Prkcl1-act Diaph1-act Trio-act 807-2 Diaph1 Limk1 Srf Ktn1 Rhpn1 Rtkn 679-4 700-2 698-2 593-2 Rock1-act Ssh PP1 671 697 Limk1-act Myl9-phos PP1-inhib 364-2 Actin-poly Myl9 238 58 Srf-act Rock1 Cofilin-phos 374-8 Cofilin Ktn1-act Rhpn1-act Rtk Comparing the Explore Nets Mcf2-act Rhob-GDP 221-2 Cit Prkcl1 Crkl-reloc 798-2 Erk2 2-act 680-2 Prkcl1-act Diaph1-act Actin-mono Pfn1 Ktn1 Rhpn1 Rtkn 679-4 700-2 698-2 593-2 Limk1 Myl9 238 Arp23-act Srf Limk1-act 732 Pxn Vasp Pxn Actinin Vasp Tns1 Tln-act Src-act 764 Actinin Integrins-clustered 713 Ptk2-act 434 Rac1-GDP Pxn-Yphos Rock1-act PP1 671 697 Myl9-phos Ktn1-act Rhpn1-act Ptk6-act Crk-Yphos 601 Tns1 Actin-poly 813 Vcl 1076 Zyx Rtkn-act PP1-inhib 58 165 Pi3k-act Diaph1-act Rock1 672 11 ctin-mono Diaph1 581-2 2003 Trio-act 807-2 Rhob-GTP 591-2 RapGef1-act Ngef-reloc 364-2 Srf-act Vcl 1075 Zyx Ilk:Lims1:Parva Cofilin-phos 374-8 Ilk:Lims1:Parva Cofilin Ssh A HypotheticaL Model Pathway Relating State and Synaptic Plasticity Wake state: unknown signal(s) => phosphorylation of Rock1 => activation of Limk1 => phosphorylation of cofilin => increase in polymerized actin (Phosphorylated cofilin is unable to depolymerize actin) SWS: RhoDG11 binds Rhob-GDP (is not phosphorylated) => Rock1, Limk1, and cofillin would not be phosphorylated and => actin depolymerization => decrease in synaptic weight TODO -- test the hypothesis Modeling METABOLISM (work of Malabika Sarker) The Problem Identify candidate drug targets in mycobacteria Idea: integrate screening data, molecular structure models, and metabolic models (using symbolic system biology!) Initial steps curation of PL model of mycolic acid synthesis (including drug action) importing PGDBs into PL Why Mycolic Acid Mycolate biosynthesis enzymes are essential for survival of Mycobacteria---excellent drug targets Isoniazid(INH)/Ethionammide(ETH)/Triclosan(TRC) --| InhA Mycobacterial Cell Wall Mycolic Acid Fragment Showing Inhibition of INHA acetyl-CoA Nat 9 KatG Isoniazid 8a Isonicotinic-acyl-anion 8b InhA activated-Ethionamide AcpM-trans-but-2-enoyl 10 nhA:Isonicotinic-acyl-NADH InhA:activated-Ethionamide-NADH 7 AcpM-butanoyl 11 hexacosanoyl-CoA eicosanoyl-CoA AcpM Choosing a Formalism Requirements What do you want to represent? What questions do you want to ask? What material/data is available, what must be estimated/ hypothesized? How are you going to validate? What are you familiar with? What formalism features: Language primitives, semantic models, tools model analysis/validation vs system analysis What to Model? Species: Individual, concentration, population Structure: compartments, binding, location Process: non-determinism -- computation trees stochastic / probablistic -- MC deterministic -- ODE Quantitative: kinetics, dynamics, Qualitative: interactions, causal relations ... Effects of perturbation Knockout / KnockIn / Mutations Stimulus / Stress What Questions? Examples: All the ways to phosphorylate Erk? How fast can signal get to nucleus? Is Pi3k required for activation of Raf? Categories: Reachability Information/material flow Dynamical features steady states, stable states, oscillations competition/race conditions/interference Kinetic profiles Future Challenges Future Challenges Biology Related Integration of signaling and metabolic networks Host response integrating host and pathogen Integrating models from different sources different levels of detail phos, Yphos, phos(Y 234), ... different representation choices Future CHAllenges Algorithms Adding semiquantitative information Integrate probabilistic / stochastic reasoning Algorithms to discover meaningful subnets Finding all pathways Visualizing 10s/100s of pathways Future CHAllenges Magic Automating curation from text from figures Inferring rules what changes (one step) in what ???