What is systems biology?
Being a mathematician in a biologist’s and a bioinformatician’s world
Zofia Jones, PhD Mathematical Sciences
Outline
• Intro
• Mass action / ODE example.
• Metabolic modelling example.
A mathematical biologist is…
by definition interdisciplinary.
My main motivation: a chance to use a broader range of
mathematical skills than I could in physics, with fewer rules.
(but that’s just me, lots of people love biology for its own sake)
However, progress is somewhat dependent on effective
collaboration and communication.
First: What is the computational focus
of this group?
• Something like…
• Identification of species at OTU level.
• Fitting diversity distributions to hypotheses –
rarefaction, neutral theory with immigration from one or
more metacommunities
• Accounting for diversity using environmental variables.
• Lots of bioinformatics – de-noising, assembly,
visualisation, phylogenetic trees, identification
• Metagenomics – takes the bioinformatics challenge up a
notch in complexity and looks at function as well as
phylogeny
But more about diversity than function
But to understand diversity we need to ask how microorganisms are competitive
And for that we need to understand how they function
How does the available energy in the environment translate to fitness?
•What are the principles underlying diversity?
•Then match to data.
The big goals
• Pharmaceuticals £££
• Alternative fuels – bacteria to produce ethanol from
waste biomass
• Bioremediation, phytoremediation, mycoremediation
• Use microorganisms to grow building materials and
cellulose based “plastics” – check out BioMason,
Ecovative…
• Want to engineer specific metabolic pathways and
their efficiency
Scientific Method- hypothesis and evidence
Deduction / Induction
Need predictive models
• on a cell scale
• and/or on community / ecological scale
This covers a LOT of science and expertise
What do predictive models do?
• Need to integrate information on metabolic
pathways, regulation, kinetics...
• See if we can reproduce what we observe in
experiments on a computer
• Predict growth/no growth, specific pathways, coexistence, inhibition factors
• Help plan experiments.
• Help save money.
• Help save time.
• (Inspire funding)
The most common tasks of
systems biologists
when modelling a standard
gene regulatory network
- an example systems biology model
Figure 2. Network models, derived from the heuristic MIM shown in Figure 1, for simulation.
Detailed Diagram
-lots of interaction.
Lots of research and
Biologist’s input on this
-many years of work.
Kim S, Aladjem MI, McFadden GB, Kohn KW (2010) Predicted Functions of MdmX in Fine-Tuning the Response of p53 to DNA Damage. PLoS
Comput Biol 6(2): e1000665. doi:10.1371/journal.pcbi.1000665
http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000665
Research Similar Models with elements similar to yours.
Make assumptions on rate constants
- group them by type, evidence of relative
magnitude
Write down some equations
-here we have simple mass action
(rate proportional to concentration)
-though more complicated with less
information
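As a minimal sketch of what "simple mass action" means in practice, the Python below integrates a hypothetical reversible binding reaction A + B <-> C. The species, rate constants and initial concentrations are illustrative assumptions, not values from the p53/MdmX model of Kim et al., and the fixed-step Runge-Kutta integrator is just one convenient choice.

```python
def mass_action_rhs(state, k_on, k_off):
    """Right-hand side for A + B <-> C under mass action kinetics."""
    a, b, c = state
    forward = k_on * a * b   # rate proportional to the reactant concentrations
    backward = k_off * c     # rate proportional to the product concentration
    net = forward - backward
    return (-net, -net, net)


def rk4_step(f, state, dt, *args):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    k1 = f(state, *args)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *args)
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *args)
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)), *args)
    return tuple(
        s + dt * (a + 2 * b + 2 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, k_on, k_off, dt=0.01, steps=5000):
    """Integrate the system forward for steps * dt time units."""
    for _ in range(steps):
        state = rk4_step(mass_action_rhs, state, dt, k_on, k_off)
    return state


# Hypothetical parameters and initial conditions, chosen only for illustration.
K_ON, K_OFF = 2.0, 1.0
final_a, final_b, final_c = simulate((1.0, 0.5, 0.0), K_ON, K_OFF)
print(final_a, final_b, final_c)
```

By the end of the run the forward and backward rates balance, and the totals A + C and B + C stay at their initial values, which is a quick sanity check on any mass action model.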
Make assumptions on
Initial conditions
Fit to experimental data
-some well developed data
here
Sensitivity Analysis –
how sensitive are outputs
to parameters?
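One simple version of this is local sensitivity: nudge one parameter and see how much an output moves. The sketch below does this by central finite differences for a hypothetical first-order decay model; the model, the parameter values and the function names are assumptions for illustration and are not part of the cited p53 model.

```python
import math


def decay_output(k, x0=1.0, t=2.0):
    """Model output: concentration at time t for first-order decay dx/dt = -k x."""
    return x0 * math.exp(-k * t)


def relative_sensitivity(model, param, rel_step=1e-4):
    """Normalised sensitivity (dy/dp) * (p/y), estimated by central differences."""
    h = param * rel_step
    dy_dp = (model(param + h) - model(param - h)) / (2.0 * h)
    return dy_dp * param / model(param)


# Hypothetical rate constant; for this model the exact sensitivity is -k * t.
k = 0.5
sensitivity = relative_sensitivity(decay_output, k)
print(sensitivity)
```

A normalised value near zero means the output barely depends on that parameter, so a rough estimate of it is good enough; a large magnitude flags a parameter worth measuring carefully.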
Figure 10. Prediction of late response to DNA damage.
Make some predictions
on dynamics
-qualitative statements are
best
eg. oscillation or decay?
Give this back to the
Biologists.
Figure 7. Bifurcation diagram of the effects of MdmX on p53 oscillatory behavior.
So some stability analysis
-where in parameter space do these behaviours occur?
What might you think of when you think about
mathematical modelling?
Experimental biologists are pragmatic people…
• Something complicated or time consuming.
• Memories of tedious maths lessons at school.
• Michaelis-Menten enzyme kinetics.
• Don’t know where to start.
• Something to be used sparingly for specific questions or problems.
• There are interdisciplinary courses available.
Actually need a lot of qualitative knowledge.
I spend a lot of time reading the small print.
This is the real research bottleneck for numerical people in
biology.
I miss sums …
Bad luck Charlie Brown.
What I think of …
• Asking lots of questions
• What is the science I need to learn?
• What is the experimental evidence?
• What do we want to explain?
• What is the fundamental cause/effect
behind these results and discussion papers?
• How can I account for these relations
quantitatively?
What I think of …
• I need to know a lot about the biology:
details on what drives syntrophy, evolutionary
trade-offs, information on primary and secondary
metabolites, phylogeny of enzymes,
thermodynamic limitations, thermodynamic
gradients, co-regulation, regulation of pH, limiting
factors, summary of open questions, good
experimental data.
• Need specific, precise information and clearly
expressed ideas and theories.
• I need to know a lot about the relevant
mathematics which fits this biological information.
• The maths is not driven by complexity or difficulty
but relevance.
• ODEs, graph theory, flux balance analysis with
further constraints….
Again…
• Need specific, precise information and clearly
expressed ideas and theories.
(and biology is big on vagueness … )
• The maths is not driven by complexity or
difficulty but relevance and context.
• Should be helpful and accessible to both
mathematicians and biologists specialised in that
area.
• If it is confusing they haven’t done their job.
Let’s look at some infographics
“The Virtuous Cycle”
• to constrain models to be
more precise
• infer regulatory networks
• better kinetic data,
information on interactions
• to put the bigger picture theories
on evolutionary biology or cell function to
the test
Modern research : Different
people involved in different
steps.
• metabolic reconstruction
• mass action kinetics
• directed graphs
• microarray
• transcriptomics
• diversity data
• environmental variables
Eg. One lady Jeanette Johnson
• Who works with mathematicians in Oxford on
diffusion problems.
• She said that she often finds things in her
experiments which the modellers can then
explain in theory.
• And they can predict things they find in
experiment.
• So that’s the ideal situation.
Requires lots of skills and lots of people…
Requires communication, teamwork and patience.
After all, each cell can be viewed as a tiny computer
with a core program modified by experimental evidence.
Scientific
Deduction / Induction
Use your imagination!
Need predictive models
• on a cell scale
• and/or on community / ecological scale
This covers a LOT of science and expertise
Another example.
Metabolic Modelling
i.e. only need the genome to get the stoichiometry of the network and
some estimates of parameters, which can be provided by KBase.
Then can improve annotation as required.
Solution of metabolic model ==
• Net flux at each node = 0
• Flux is concentration x rate
• Extracellular source terms for substrates etc.
• Sink terms for biomass.
• Assign a vector to be optimised eg. biomass flux
• Standard linear discrete optimisation problem.
• Many alternative solutions are usually possible.
• Finding the biologically meaningful ones…
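To make the steady-state condition and the "many alternative solutions" point concrete, here is a small sketch on a made-up four-reaction network. It does not call a linear programming solver; it only checks hand-picked flux vectors against the balance condition S v = 0 and the bounds. The network, bounds and flux values are all hypothetical.

```python
# Toy stoichiometric matrix S (rows: metabolites A, B; columns: reactions).
# Reactions: uptake (-> A), r1 (A -> B), r2 (A -> B), biomass (B ->).
REACTIONS = ["uptake", "r1", "r2", "biomass"]
S = [
    [1, -1, -1, 0],   # metabolite A
    [0, 1, 1, -1],    # metabolite B
]
UPPER_BOUNDS = [10.0, 10.0, 10.0, 1000.0]  # hypothetical limits; lower bounds are 0


def net_flux(S, v):
    """Net production of each metabolite, i.e. the matrix-vector product S v."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_feasible(S, v, upper, tol=1e-9):
    """True if v is at steady state (S v = 0) and within its bounds."""
    balanced = all(abs(x) <= tol for x in net_flux(S, v))
    bounded = all(-tol <= v_j <= ub + tol for v_j, ub in zip(v, upper))
    return balanced and bounded


def biomass_flux(v):
    """Objective: flux through the biomass reaction."""
    return v[REACTIONS.index("biomass")]


# Two different flux distributions that give the same biomass flux.
via_r1 = [10.0, 10.0, 0.0, 10.0]
split = [10.0, 4.0, 6.0, 10.0]
unbalanced = [10.0, 10.0, 0.0, 5.0]   # B accumulates, so not a steady state

for v in (via_r1, split, unbalanced):
    print(v, is_feasible(S, v, UPPER_BOUNDS), biomass_flux(v))
```

In this toy network biomass flux equals uptake at steady state, so with uptake capped at 10 both feasible vectors are optimal. They differ only in how flux is shared between the parallel routes r1 and r2, which is the kind of degeneracy that has to be resolved with extra biological information.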
How to build a metabolic model
• Genome: get it, annotate it.
• All you need is databases.
• Get it: sequencing then assembly, NCBI,
arCOG
• Initial annotation: curated genomes NCBI,
RAST
• Additional annotation: comparative genomics
with MAUVE, literature review
How to build a metabolic model
• Draft a model: join the dots and create an SBML file.
• SEED or Kbase – same software different packages.
• SEED: slow and over-subscribed.
• Kbase: command line and faster.
• Draft built on RAST annotation then can add
additional missing reactions manually.
How to build a metabolic model
• Edit model: add or delete reactions
• Fit model to growth or no growth data.
• This data is usually specified by choice of media
or gene deletions.
• False growth positive requires more gaps.
• False growth negative requires more reactions.
• Can be done efficiently on KBase.
• Put model in paper.
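The fitting step above boils down to comparing model predictions with observed growth, condition by condition. A minimal sketch of that bookkeeping follows; the condition names other than acetate_minimal, and all the True/False values, are made up for illustration.

```python
def compare_growth(predicted, observed):
    """Label each condition by comparing predicted with observed growth."""
    labels = {}
    for condition, grows_in_model in predicted.items():
        grows_in_lab = observed[condition]
        if grows_in_model and not grows_in_lab:
            labels[condition] = "false positive"   # slide: "requires more gaps"
        elif grows_in_lab and not grows_in_model:
            labels[condition] = "false negative"   # slide: "requires more reactions"
        else:
            labels[condition] = "agree"
    return labels


# Hypothetical media / gene deletion conditions: True means growth.
predicted = {"acetate_minimal": True, "glucose_minimal": True, "knockout_1": False}
observed = {"acetate_minimal": True, "glucose_minimal": False, "knockout_1": True}

labels = compare_growth(predicted, observed)
for condition, label in labels.items():
    print(condition, label)
```

The list of disagreements is then the to-do list for manual curation.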
Some Kbase Commands
kbws-login zofia1 -p l******1
kbws-url http://kbase.us/services/workspace_service
kbws-workspace glasgow
kbws-listobj
kbws-url http://kbase.us/services/fba_model_services
kbfba-getmedia acetate_minimal glasgow -e >> acetate_minimal.txt
annotate_genome contigs.fasta concilii
kbfba-buildfbamodel concilii methanosaeta
kbfba-gapfill methanosaeta -m acetate_minimal
= gapfilled model
Curate using the gapfilling results.
= good enough model
Metabolic models mainly used to show we have
correct understanding of network via
growth/ no growth data.
Metabolic model building can be used to check
understanding as we make guesses about which
pathways are present or active in a given environment.
•Set constraints to reflect presence/absence of substrates.
•Run flux balance analysis to get steady state solution.
•See which pathways are active in solution.
•Use solution to help interpret transcriptomics, microarray data,
metagenomics or PCR.
•Adjust understanding of organism or model as necessary.
How to use a metabolic model
• Check understanding of topology: the most
complicated bit
• Done mainly by referring to growth or no growth
data.
• Databases are light on info on archaea and nonpathogenic micro-organisms.
• Difficult if microorganisms can only grow on a
limited range of media – fewer experimental
options.
• Challenge to choose the correct edits.
How to use a metabolic model
• Curate detail: learn about your pet organisms
• Some detail to add… proton/electron ratio = getting the net
ATP produced correct
How to use a metabolic model
• Curate detail: learn about your pet organisms
• Some detail to add… accurate rate of ATP production
-> accurate growth rate
Minimal ATP requirement.
ATP requirement increases linearly
with growth.
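Written as a formula, the linear relationship above can be expressed as follows. The symbols are notation assumed here for illustration, not taken from the slides: q_ATP is the total ATP demand, m_ATP the growth-independent (minimal) requirement, mu the growth rate and Y the ATP needed per unit of biomass formed.

```latex
q_{\mathrm{ATP}} = m_{\mathrm{ATP}} + Y \, \mu
```

The intercept is the minimal ATP requirement at zero growth and the slope is the growth-associated cost.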
How to use a metabolic model
• Curate detail: learn about your pet organisms
• Some detail to add… biomass composition
How to use a metabolic model
• Curate detail: learn about your pet organisms
• Some detail to add… Thermodynamics: is reaction
reversible or not?
How to use a metabolic model
• Add kinetics: model works out yield
• Just add Michaelis-Menten kinetics for carbon and
nitrogen sources.
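A minimal sketch of the Michaelis-Menten rate law, which could be used, for example, to set an upper bound on a substrate uptake flux from its external concentration. The parameter values and the function name are hypothetical.

```python
def michaelis_menten(substrate, v_max, k_m):
    """Uptake rate v = v_max * S / (K_m + S)."""
    if substrate < 0 or k_m <= 0:
        raise ValueError("substrate must be >= 0 and k_m must be > 0")
    return v_max * substrate / (k_m + substrate)


# Hypothetical kinetic parameters for a carbon source.
V_MAX = 10.0   # maximum uptake rate
K_M = 0.5      # substrate concentration giving half the maximum rate

for concentration in (0.0, 0.5, 5.0, 50.0):
    print(concentration, michaelis_menten(concentration, V_MAX, K_M))
```

At low concentration the rate rises almost linearly with substrate; at high concentration it saturates towards V_MAX, so the substrate stops being limiting.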
How to use a metabolic model
• Use to integrate omics data
• Transcriptome, ribosome, proteome, metabolome …
• Various statistical methods to try.
• Need money and experimental expertise.
• Ask what is necessary for a specific question.
How to use a metabolic model
• Search and brain-storm good
questions.
• Then find ways of testing them
in silico
• There is no set way of doing
this.
How to use a metabolic model
Visual starting point - Cytoscape
• Ask theoretical questions – hurrah!
• What is being optimised – ATP production, efficiency, flux
minimisation, what is the optimisation trade-off?
• What does the structure of the metabolism imply? Can we infer
regulatory networks?
• Does more modularity indicate robustness?
• Can many similar networks (genotypes) produce a similar
phenotype?
• What role does thermodynamic buffering play?
• Simplification – which pathways are responsible for bulk of growth?
Complex or just Complicated
• “Complexity” arises from the application of fundamental
principles.
• But are there fundamental principles in biology?
• After we exhaustively make lists and databases of what we
know, is it just complicated?
• Or is it just something in between where principles do exist,
but in a less black and white way?
Elementary Flux Modes
• A unique path through a network
• Form a basis set to all other paths
• Typically millions – computationally expensive
or impossible - meaningless?
• Find the k shortest EFMs
• Look for EFMs which connect two points of
interest
• Prune reactions with low flux values
• Can then test co-regulation and yield predictions with
transcriptomics or microarray data
• Check understanding of what is responsible for yield
• Find ways to improve yield via gene deletions etc.
Scientific
Deduction / Induction
Requires lots of skills and lots of people…
Requires communication, teamwork and patience.
What do predictive models do?
• Need to integrate information on metabolic
pathways, regulation, kinetics...
• See if we can reproduce what we observe in
experiments on a computer
• Predict growth/no growth, specific pathways, coexistence.
• Help plan experiments.
• Help save money.
• Help save time.
Please ask if you’re interested in constructing Kbase metabolic models.
Tutorial coming soon!
Mathematical biologist?
I spend a lot of time reading the small print
and reducing what it says to modellable parts.
You need to know your system to model your system.
I miss sums …
Bad luck Charlie Brown.