Download Symbolic Systems Biology

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Protein–protein interaction wikipedia , lookup

Wnt signaling pathway wikipedia , lookup

Metabolic network modelling wikipedia , lookup

Polyclonal B cell response wikipedia , lookup

Proteolysis wikipedia , lookup

Two-hybrid screening wikipedia , lookup

Metabolism wikipedia , lookup

Lipid signaling wikipedia , lookup

Evolution of metal ions in biological systems wikipedia , lookup

Secreted frizzled-related protein 1 wikipedia , lookup

Clinical neurochemistry wikipedia , lookup

G protein–coupled receptor wikipedia , lookup

Ultrasensitivity wikipedia , lookup

Gene regulatory network wikipedia , lookup

Multi-state modeling of biomolecules wikipedia , lookup

Signal transduction wikipedia , lookup

Mitogen-activated protein kinase wikipedia , lookup

Biochemical cascade wikipedia , lookup

Paracrine signalling wikipedia , lookup

Transcript
Symbolic Systems Biology
---
Using Formal Methods Tools to
Model Biological Processes
http://pl.csl.sri.com
Carolyn Talcott
SRI International
SSFT May 2011
PLan
Symbolic systems biology
Pathway Logic
Modeling
Analyzing
Case studies
Choosing a formalism
Future challenges
Biological Systems
Biological processes are complex
signaling, regulation, defense, ....
Huge dynamics timescales: microseconds to years
Spatial scales over 12 orders of magnitude
single protein to cell to organ to organism ...
Oceans of experimental biological data generated
Important intuitions captured in mental models
that biologists build of biological processes
How to build in silico models from all this?
SyMbolic Systems Biology
Symbolic -- represented in a logical framework
Systems -- how things interact and work together, integration
of multiple parts, viewpoints and levels of abstraction
Goals:
Develop formal models that are as close as possible to
domain expert’s mental models
Compute with, analyze and reason about complex networks
New insights into / understanding of biological mechanisms
Executable Formal Models
Describe system states and rules for change in a
formal system
From an initial state, derive a transition graph
nodes -- reachable states
edges -- rules connecting states
Path in transition graph ~ computation/derivation
Many kinds of analysis available
Pathway LogiC (PL)
Representation of Signaling
http://pl.csl.sri.com/
Signaling PATHWAYS
Signaling pathways involve the modification and/or assembly of
proteins and other molecules within cellular compartments into
complexes that coordinate and regulate the flow of information.
Signaling pathways are distributed in networks having
stimulatory (positive) and inhibitory (negative) feedback loops,
and other concurrent interactions to ensure that signals are
propagated and interpreted appropriately in a particular cell or
tissue.
Signaling networks are robust and adaptive, in part because of
combinatorial complex formation (several building blocks for
forming the same type of complex), redundant pathways, and
feedback loops.
Egf stimulation of the Mitogen Activated
Protein Kinase (MAPK) pathway.
Egf → EgfR → Grb2 → Sos1 → Ras → Raf1 → Mek → Erk
Egf (EGF) binds to the Egf receptor (EgfR) and
stimulates its protein tyrosine kinase activity to
cause autophosphorylation, thus activating EgfR.
The adaptor protein Grb2 (GRB2) and the guanine
nucleotide exchange factor Sos1 (SOS) are
recruited to the membrane, binding to EgfR.
The EgfR complex activates a Ras family GTPase
Activated Ras activates Raf1, a member of the
RAF serine/threonine protein kinase family.
Raf1 activates the protein kinase Mek (MEK),
which then activates Erk (MAPK)
...
from Wikipedia
Rewriting Logic
Rewriting Logic is a logical formalism that is based on two
simple ideas
states of a system are represented as elements of an
algebraic data type
the behavior of a system is given by local transitions
between states described by rewrite rules
It is a logic for executable specification and analysis of
software systems, that may be concurrent, distributed,
or even mobile.
It is also a (meta) logic for specifying and reasoning about
formal systems, including itself (reflection!)
About Pathway Logic
Pathway Logic (PL) is an approach to modeling biological
processes as executable formal specifications (in Maude)
The resulting models can be queried
using formal methods tools: given an initial state
execute --- find some pathway
search --- find all reachable states satisfying a given property
model-check --- find a pathway satisfying a temporal formula
using reflection
find all rules that use / produce X (for example, activated Rac)
find rules down stream of a given rule or component
translate to alternative formalism and export
Pathway Logic Organization
A PL cell signaling model is generated from
•
a knowledge base
a cell state
A PL knowledge base consists of
Theops --- sorts and operations
Components --- specific proteins, chemicals ...
Rules --- biomolecular reactions / processes
A cell state is given by specifying
•
•
the proteins and other molecular components present
the incoming signals (ligands)
Rule 1: Receptor Binding
If a dish contains an EgfR ligand (?ErbB1L:ErbB1L) outside a cell with EgfR in
the cell membrane then the ligand binds to exterior part of the receptor
and the receptor is activated.
rl[1.EgfR.act]:
?ErbB1L:ErbB1L [CellType:CellType | ct {CLm | clm EgfR}]
=>
[CellType:CellType | ct {CLm | clm ([EgfR - act] : ?ErbB1L:ErbB1L)} ] .
Rule 1 applies to rasDish
PD(Egf [Cell | {CLm | EgfR PIP2}{CLi | [Hras - GDP] Src}
{CLc | Gab1 Grb2 Pi3k Plcg Sos1}])
with the match
?ErbB1L:ErbB1L := Egf
clm := PIP2
ct := {CLi | [Hras - GDP]
Src} {CLc | Gab1 Grb2 Pi3k Plcg Sos1}
giving rasDish1
PD([Cell |
{CLm | ([EgfR - act] : Egf) PIP2}
{CLi | [Hras - GDP] Src}
{CLc | Gab1 Grb2 Pi3k Plcg Sos1}]) .
How do we Infer Rules?
Rules are inferred from evidence, called datums, curated from the literature.
Consider the rule converting Rala-GDP to Rala-GTP in response to Egf stimulation
rl[1064.Rala.irt.Egf]:
{EgfRC | [EgfR - act] : Egf Pi3k RalGds clm }
{CLi | cli [Rala - GDP]}
=>
{EgfRC | [EgfR - act] : Egf Pi3k RalGds [Rala - GTP] clm }
{CLi | cli } .
Evidence for this rule includes:
DID#05387:! Rala[Ab] GTP[BD-PD] is increased irt Egf (tnr)
cells: Cos7 in BMLS
inhibited by: Wortmannnin [chem] --- Pi3k Inhibitor
inhibited by: LY294002 [chem]
""
source: 15034142-Fig-5c
DID#12876:! xRala[xAb]IP GTP/GDP[32Pi-TLC] is increased irt Egf
cells: Cos1-xRalGds in BMS
times: 0 1+ 2++ 3++ 4+ 5 min
reqs: xRalGds [omission]
inhibited by: xRalGds-C203S "membrane-binding-mutant" [substitution]
comment: cells were pretreated with Vanadate 30 min before Egf treatment
source: 9416833-Fig-2
The Pathway Logic Assistant
(PLA)
The Pathway Logic Assistant (PLA)
Provides a means to interact with a PL model
Manages multiple representations
Maude module (logical representation)
PetriNet (process representation for efficient query)
Graph (for interactive visualization)
Exports Representations to other tools
Lola (and SAL model checkers)
Dot -- graph layout
JLambda (interactive visualization, Java side)
About Petri Nets
A Petri net is represented as a graph with two kinds of nodes:
* transitions/rules (reactions--squares)
* places/occurrences (reactants, products, modifiers--ovals)
A
A Petri net process has tokens on some
of its places. A rule can fire if all of
its inputs have tokens. Firing a rule
moves tokens from input to output.
A
B
AB
C
AB
C
AB
2
2
2
D
D
D
An execution is a sequence of rule firings.
A pathway is represented as an execution subgraph.
B
1
1
1
C
A
B
EgfR-CLm
Hras activated
1
Parallel paths
Grb2-CLc
rasNet
Egf-Out
Egf-bound-CLo
EgfR-act-CLm
Rule instances relevant
to Hras activation
5
Cross talk
Grb2-reloc-CLi
Synchronization
Gab1-CLc
12
Conflict
Sos1-CLc
Grb2-Yphos-CLi
4
Gab1-Yphos-CLi
Pi3k-CLc
13
8
Sos1-reloc-CLi
Pi3k-act-CLi
PIP2-CLm
9
Hras-GDP-CLi
PIP3-CLm
6
Hras-GTP-CLi
Src-CLi
Plcg-CLc
10
Plcg-act-CLi
7
DAG-CLm
IP3-CLc
A Simple Query Language
Given a Petri net with transitions P and initial marking O (for occurrences)
there are two types of query
subnet
findPath - a computation / unfolding
For each type there are three parameters
G: a goal set---occurrences required to be present at the end of a path
A: an avoid set---occurrences that must not appear in any transition fired
H: as list of identifiers of transitions that must not be fired
subnet returns a subnet containing all (minimal) such pathways (using
backward and forward collection)
findPath returns a pathway (transition list) generating a computation
satisfying the requiremments (using model checking on the negation).
Pathways in Ras Net
EgfR-CLm
EgfR-CLm
Egf-Out
1
1
EgfR-CLm
Egf-Out
Grb2-CLc
1
Grb2-CLc
Grb2-reloc-CLi
Egf:EgfR-act-CLm
5
Grb2-reloc-CLi
5
Sos1-CLc
Grb2-CLc
Egf:EgfR-act-CLm
5
Egf:EgfR-act-CLm
Egf-Out
Gab1-CLc
Sos1-CLc
4
Pi3k-CLc
Gab1-Yphos-CLi
13
8
Sos1-reloc-CLi
Pi3k-act-CLi
Pi3k-CLc
Grb2-reloc-CLi
Gab1-CLc
13
4
Sos1-reloc-CLi
Gab1-Yphos-CLi
8
Pi3k-act-CLi
Model of EGF STimulation
(by Merrill Knapp)
The ErbB Network
(CARTOON FORM)
Yarden and Sliwkowski, Nat. Rev. Mol. Cell Biol. 2: 127-137, 2001
PL Egf Model
Events that could occur in response to Egf
Curated by
Merrill Knapp
Subnet Relevant to Erk Activation
Contains all pathways leading to activation of Erk.
Obtained by backwards followed by forwards collection.
Sos1
EgfR-EgfRC
Egf-XOut
001
Shc1-CLc
Gab2-CLc
Sos1-CLc
191
075
197
Gab2-Yphos-EgfRC
Pi3k-CLc
Sos1-Yphos-EgfRC
Hras-GDP-CLi
Egf:EgfR-Yphos-EgfRC
Ptpn11-CLc
Dbl-CLc
Gab1-CLc
172
188
389
116
Pi3k-EgfRC
529-13
Ptpn11-Yphos-EgfRC
Dbl-Yphos-EgfRC
Cdc42-GDP-CLi
207
Src-act-EgfRC
399-2
Cdc42-GTP-EgfRC
Hras-GTP-EgfRC
Braf-CLc
Shc1-Yphos-EgfRC
Git1-CLc
398
Gab1-Yphos-EgfRC
Git1-Yphos-EgfRC
Mlk3-act-EgfRC
310
Mek1-CLc
Braf-act-EgfRC
639-1
Fak2-act-EgfRC
Mek1-act-EgfRC
Activation of Erk in response to Egf
Fak2-CLc
186
Mlk3-CLc
1063
IqGap1-CLc
Src-CLc
Shoc2-CLc
196
Erks-act-EgfRC
Erks-CLc
EgfR-EgfRC
Egf-XOut
001
Shc1-CLc
191
Gab2-CLc
Egf:EgfR-Yphos-EgfRC
075
Pi3k-CLc
Ptpn11-CLc
172
Gab1-CLc
188
116
Ptpn11-Yphos-EgfRC
Shc1-Yphos-EgfRC
Gab2-Yphos-EgfRC
Src-CLc
Pi3k-EgfRC
207
RasGrp3-CLc
ArhGef4-CLc
Src-act-EgfRC
Git1-CLc
156
398
440
RasGrp3-Yphos-EgfRC
Hras-GDP-CLi
Cdc42-GDP-CLi
529-11
Fak2-CLc
186
ArhGef4-Yphos-EgfRC
399
Cdc42-GTP-EgfRC
Hras-GTP-EgfRC
Braf-CLc
1063
IqGap1-CLc
Mlk3-CLc
Git1-Yphos-EgfRC
Mlk3-act-EgfRC
310
Mek1-CLc
Fak2-act-EgfRC
Braf-act-EgfRC
639-1
Activation of Erk in response to Egf
avoiding Sos1
Gab1-Yphos-EgfRC
Mek1-act-EgfRC
Shoc2-CLc
196
Erks-act-EgfRC
Erks-CLc
EgfR-EgfRC
Egf-XOut
001
Shc1-CLc
191
Sos1-CLc
Gab2-CLc
197
Gab2-Yphos-EgfRC
Egf:EgfR-Yphos-EgfRC
Pi3k-CLc
075
Sos1-Yphos-EgfRC
Hras-GDP-CLi
Ptpn11-CLc
Dbl-CLc
172
188
Pi3k-EgfRC
Ptpn11-Yphos-EgfRC
Src-CLc
Dbl-Yphos-EgfRC
207
529-13
RasGrp3-CLc
389
Cdc42-GDP-CLi
440
116
Src-act-EgfRC
399-2
ArhGef4-CLc
156
RasGrp3-Yphos-EgfRC
398
Git1-CLc
Fak2-CLc
186
ArhGef4-Yphos-EgfRC
529-11
Gab1-Yphos-EgfRC
399
Hras-GTP-EgfRC
Cdc42-GTP-EgfRC
Mlk3-CLc
1063
Braf-CLc
IqGap1-CLc
Shc1-Yphos-EgfRC
Comparing ways to activate Erk
Pink - both pathways
Cyan -- with Sos1
Blue -- avoiding Sos1
Gab1-CLc
Git1-Yphos-EgfRC
Mlk3-act-EgfRC
310
Mek1-CLc
Braf-act-EgfRC
639-1
Fak2-act-EgfRC
Mek1-act-EgfRC
Shoc2-CLc
196
Erks-act-EgfRC
Erks-CLc
SLEEP
(with MaryAnn Greco and Merrill Knapp)
The Question
What is the function of sleep?
What are your cells doing when you sleep? vs
awake?
Rat model -- proteomics from different organs
at different sleep states
Natural Sleep Paradigm
2D Master GEL
Proteins unique to different states were identified
Those modeled in PL included Actin and Rhob
Use the PLA explorer to find signaling connections
Exploring PL KB from Actin
Actin-mono
886
Tmsb4x
Pfn1
11
732
Pfn1:Actin-mono
Diaph1-act
555
Ssh
Pxn
Vasp
Tns1
672
Tmsb4x:Actin-mono
Actin-poly
374-8
Limk1-act
165
Cofilin
Ptk2-act
Pxn
364-2
Cofilin-phos
Abi1:Eps8:Sos1
Arp23-act
Src-act
764
Vasp
Actinin
601
713
1075
Tns1
Actinin
Ilk:Lims1:Parva
434
Egf:EgfR-act
Pi3k-act
Ptk6-act
Rac1-GDP
E41
53
Rac1-GTP
Pxn-Yphos
Integrins-clustered
Crk-Yphos
T
1
Exploring PL KB from RhoB
Ngef-reloc
Mcf2-act
798-2
Cit
Dgk
Pld1
591-2
699-2
611-2
kl-reloc
DAG
Dgk-act
PC
906
27
PA
Choline
Pld1-act
Prkcl1
Rhob-GDP
221-2
Rhob-GTP
581-2
680-2
Prkcl1-act
Diaph1-act
Trio-act
807-2
Diaph1
Limk1
Srf
Ktn1
Rhpn1
Rtkn
679-4
700-2
698-2
593-2
Rock1-act
Ssh
PP1
671
697
Limk1-act
Myl9-phos
PP1-inhib
364-2
Actin-poly
Myl9
238
58
Srf-act
Rock1
Cofilin-phos
374-8
Cofilin
Ktn1-act
Rhpn1-act
Rtk
Comparing the Explore Nets
Mcf2-act
Rhob-GDP
221-2
Cit
Prkcl1
Crkl-reloc
798-2
Erk2
2-act
680-2
Prkcl1-act
Diaph1-act
Actin-mono
Pfn1
Ktn1
Rhpn1
Rtkn
679-4
700-2
698-2
593-2
Limk1
Myl9
238
Arp23-act
Srf
Limk1-act
732
Pxn
Vasp
Pxn
Actinin
Vasp
Tns1
Tln-act
Src-act
764
Actinin
Integrins-clustered
713
Ptk2-act
434
Rac1-GDP
Pxn-Yphos
Rock1-act
PP1
671
697
Myl9-phos
Ktn1-act
Rhpn1-act
Ptk6-act
Crk-Yphos
601
Tns1
Actin-poly
813
Vcl
1076
Zyx
Rtkn-act
PP1-inhib
58
165
Pi3k-act
Diaph1-act
Rock1
672
11
ctin-mono
Diaph1
581-2
2003
Trio-act
807-2
Rhob-GTP
591-2
RapGef1-act
Ngef-reloc
364-2
Srf-act
Vcl
1075
Zyx
Ilk:Lims1:Parva
Cofilin-phos
374-8
Ilk:Lims1:Parva
Cofilin
Ssh
A HypotheticaL Model Pathway
Relating State and Synaptic Plasticity
Wake state:
unknown signal(s)
=> phosphorylation of Rock1
=> activation of Limk1
=> phosphorylation of cofilin
=> increase in polymerized actin
(Phosphorylated cofilin is unable
to depolymerize actin)
SWS:
RhoDG11 binds Rhob-GDP
(is not phosphorylated)
=> Rock1, Limk1, and cofillin would
not be phosphorylated and
=> actin depolymerization
=> decrease in synaptic weight
TODO -- test the hypothesis
Modeling METABOLISM
(work of Malabika Sarker)
The Problem
Identify candidate drug targets in mycobacteria
Idea: integrate screening data, molecular
structure models, and metabolic models (using
symbolic system biology!)
Initial steps
curation of PL model of mycolic acid synthesis
(including drug action)
importing PGDBs into PL
Why Mycolic Acid
Mycolate biosynthesis enzymes are essential for survival
of Mycobacteria---excellent drug targets
Isoniazid(INH)/Ethionammide(ETH)/Triclosan(TRC) --| InhA
Mycobacterial Cell Wall
Mycolic Acid Fragment Showing
Inhibition of INHA
acetyl-CoA
Nat
9
KatG
Isoniazid
8a
Isonicotinic-acyl-anion
8b
InhA
activated-Ethionamide
AcpM-trans-but-2-enoyl
10
nhA:Isonicotinic-acyl-NADH InhA:activated-Ethionamide-NADH
7
AcpM-butanoyl
11
hexacosanoyl-CoA
eicosanoyl-CoA
AcpM
Choosing a Formalism
Requirements
What do you want to represent?
What questions do you want to ask?
What material/data is available, what must be estimated/
hypothesized?
How are you going to validate?
What are you familiar with?
What formalism features:
Language primitives, semantic models, tools
model analysis/validation vs system analysis
What to Model?
Species: Individual, concentration, population
Structure: compartments, binding, location
Process:
non-determinism -- computation trees
stochastic / probablistic -- MC
deterministic -- ODE
Quantitative: kinetics, dynamics,
Qualitative: interactions, causal relations ...
Effects of perturbation
Knockout / KnockIn / Mutations
Stimulus / Stress
What Questions?
Examples:
All the ways to phosphorylate Erk?
How fast can signal get to nucleus?
Is Pi3k required for activation of Raf?
Categories:
Reachability
Information/material flow
Dynamical features
steady states, stable states, oscillations
competition/race conditions/interference
Kinetic profiles
Future Challenges
Future Challenges
Biology Related
Integration of signaling and metabolic networks
Host response
integrating host and pathogen
Integrating models from different sources
different levels of detail
phos, Yphos, phos(Y 234), ...
different representation choices
Future CHAllenges
Algorithms
Adding semiquantitative information
Integrate probabilistic / stochastic reasoning
Algorithms to discover meaningful subnets
Finding all pathways
Visualizing 10s/100s of pathways
Future CHAllenges
Magic
Automating curation
from text
from figures
Inferring rules
what changes (one step)
in what
???