Stochastic modeling of molecular reaction networks
Daniel Forger
University of Michigan

Let’s begin with a simple genetic network

We can list the basic reaction rates and stoichiometry
numsites = total # of sites on a gene, G = # sites bound
M = mRNA, Po = unmodified protein, Pt = modified protein

Reaction               Rate                          Stoichiometry
Transcription          trans or 0                    +M
Translation            tl*M                          +Po
Protein Modification   conv*Po                       -Po, +Pt
M degradation          degM*M                        -M
Po degradation         degPo*Po                      -Po
Pt degradation         degPt*Pt                      -Pt
Binding to DNA         bin*(numsites - G)*Pt         -Pt, +G
Unbinding from DNA     unbin*G                       -G

We normally track concentration
Let’s track # molecules instead
• Let M, Po, Pt be # molecules
• First order rate constants (tl, unbin, conv, degM, degPo and degPt) have units 1/time and stay constant
• Zero order rate constant (trans) has units conc/time, so multiply it by volume
• 2nd order rate constant (bin) has units 1/(conc*time), so divide it by volume

numsites = total # of sites on a gene, G = # sites bound
M = mRNA, Po = unmodified protein, Pt = modified protein
V = Volume

Reaction               Rate                          Stoichiometry
Transcription          trans*V or 0                  +M
Translation            tl*M                          +Po
Protein Modification   conv*Po                       -Po, +Pt
M degradation          degM*M                        -M
Po degradation         degPo*Po                      -Po
Pt degradation         degPt*Pt                      -Pt
Binding to DNA         (bin/V)*(numsites - G)*Pt     -Pt, +G
Unbinding from DNA     unbin*G                       -G

How would you simulate this?
• Choose which reaction happens next
  – Find next reaction
  – Update species by stoichiometry of next reaction
  – Find time to this next reaction

How to find the next reaction
• Choose randomly based on their reaction rates
[Figure: a random # selects among the rates trans*V, tl*M, conv*Po, degM*M, degPo*Po, degPt*Pt, unbin*G and (bin/V)*(numsites - G)*Pt]

Now that we know the next reaction modifies the protein
• Po = Po - 1
• Pt = Pt + 1
• How much time has elapsed
  – a0 = sum of reaction rates
  – r0 = random # between 0 and 1
  – elapsed time = $\frac{1}{a_0} \ln \frac{1}{r_0}$

This method goes by many names
• Computational Biologists typically call this the Gillespie Method
  – Gillespie also has another method
• Materials Scientists typically call this Kinetic Monte Carlo

Myth 1: “Mass Action Formulations do not account for Stochasticity”

Consider a simple model inspired by the circadian clock in Cyanobacteria
[Figure: cycle A → B → C → A]
• Here a protein can be in 3 states, A, B or C
• We start the system with 100 molecules of A
• Assume all rates are 1, and that reactions occur without randomness (it takes one time unit to go from A to B, etc.)
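The selection and timing steps described above can be sketched in Python for this three-state cycle. This is a minimal sketch under stated assumptions, not the simulation used in the talk: the function name `gillespie_cycle`, the seed, and the end time are illustrative, and unlike the "without randomness" assumption on the slide, each conversion here waits a random, exponentially distributed time.

```python
import math
import random


def gillespie_cycle(n_a=100, n_b=0, n_c=0, rate=1.0, t_end=10.0, seed=0):
    """Direct-method simulation of the cycle A -> B -> C -> A (hypothetical helper)."""
    rng = random.Random(seed)
    counts = [n_a, n_b, n_c]          # molecule numbers of A, B, C
    t = 0.0
    times, history = [t], [tuple(counts)]
    while True:
        # Rate of each first-order reaction: rate * (# molecules of its reactant).
        props = [rate * n for n in counts]
        a0 = sum(props)               # a0 = sum of reaction rates
        if a0 == 0.0:
            break                     # nothing left that can react
        # Time to the next reaction: (1/a0) * ln(1/r0), with r0 in (0, 1].
        r0 = 1.0 - rng.random()
        tau = (1.0 / a0) * math.log(1.0 / r0)
        if t + tau > t_end:
            break
        t += tau
        # Choose the next reaction with probability proportional to its rate.
        threshold = rng.random() * a0
        cumulative = 0.0
        for i, a in enumerate(props):
            cumulative += a
            if threshold < cumulative:
                break
        # Stoichiometry: reaction i converts species i into the next species in the cycle.
        counts[i] -= 1
        counts[(i + 1) % 3] += 1
        times.append(t)
        history.append(tuple(counts))
    return times, history


if __name__ == "__main__":
    times, history = gillespie_cycle()
    print("events:", len(times) - 1, "final (A, B, C):", history[-1])
```

Because every reaction in this cycle is first order, the average over many such runs follows the mass-action equations, while any single run fluctuates around them.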
Mass Action Representation
$\frac{dA}{dt} = C - A, \quad \frac{dB}{dt} = A - B, \quad \frac{dC}{dt} = B - C$

Matlab simulation

Mass Action represents a limiting case of Stochastics
• Mass action and stochastic simulations should agree when certain “limits” are obtained
• Mass action typically represents the expected concentrations of chemical species (more later)

Myth 2: Stochastic and Mass Action Approaches agree only if there are enough molecules

What matters is the number of reactions
• This is particularly important for reversible reactions
• By the central limit theorem, fluctuations disappear like $n^{-1/2}$
• There are almost always a very limited number of genes,
  – Ok if fast binding and unbinding

There are several representations in between Mass Action and Gillespie
• Chemical Langevin Equations
• Master Equations
• Fokker-Planck
• Moment descriptions

We will illustrate this with an example
Kepler and Elston Biophysical Journal 81:3116

Master Equations describe how the probability of being in each state changes in time
Here $p_m^s$ is the probability of having $m$ molecules with the gene in state $s$ (0 or 1), $\alpha_s$ is the synthesis rate in state $s$, $\delta$ is the degradation rate, and $K k_0$, $K k_1$ are the rates of leaving states 0 and 1.
$\frac{dp_m^0}{dt} = -\left(K k_0 + \delta m + \alpha_0\right) p_m^0 + K k_1\, p_m^1 + \delta (m+1)\, p_{m+1}^0 + \alpha_0\, p_{m-1}^0$
$\frac{dp_m^1}{dt} = -\left(K k_1 + \delta m + \alpha_1\right) p_m^1 + K k_0\, p_m^0 + \delta (m+1)\, p_{m+1}^1 + \alpha_1\, p_{m-1}^1$

Sometimes we can solve for the mean and variance
moments: $\langle m^j \rangle_s = \sum_m m^j\, p_m^s$
at steady state (taking $k_0 + k_1 = 1$):
$\text{mean} = \frac{\alpha_0 k_1 + \alpha_1 k_0}{\delta}$
$\text{variance} = \text{mean} + \frac{k_0 k_1}{\delta + K} \cdot \frac{(\alpha_0 - \alpha_1)^2}{\delta}$

Distribution of molecules often looks Gaussian

Moment Descriptions
• Gaussian Random Variables are fully characterized by their mean and standard deviation
• We can write down ODEs for the mean and standard deviation of each variable
• However, for bimolecular reactions, we need to know the correlations between variables (potentially $N^2$)

Towards Fokker Planck
• Let’s divide the master equation by the mean m*.
• Although this equation describes many states, we can smooth the states to make a probability distribution function
$p_m^s(t) = \int_{(m-1/2)/m^*}^{(m+1/2)/m^*} dx\, p_s(x,t)$

Note
$p_s\left(x \pm \frac{1}{m^*}\right) = \sum_j \frac{1}{j!} \left(\pm\frac{1}{m^*}\right)^j \partial_x^j\, p_s(x) = e^{\pm \frac{1}{m^*} \partial_x}\, p_s(x)$
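A short worked step, assuming $p_s(x)$ is smooth enough to expand: keeping only the terms up to second order in $1/m^*$ in the series for the shifted density gives the approximation below. The truncation order is an assumption of this sketch; it is the usual choice when $1/m^*$ is small.

```latex
p_s\!\left(x \pm \frac{1}{m^*}\right)
  \approx p_s(x)
  \pm \frac{1}{m^*}\,\partial_x p_s(x)
  + \frac{1}{2\,(m^*)^2}\,\partial_x^2 p_s(x)
```

The first-derivative term contributes drift and the second-derivative term contributes diffusion once the expansion is substituted into the master equation.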
If 1/m* is small, we can then derive a simplified version of the Master equations
With $x = m/m^*$, $\alpha_s$ the synthesis rate in gene state $s$, $\delta$ the degradation rate, and $\hat{s}$ the other gene state:
$\partial_t p_s(x) = -\partial_x\left[\left(\frac{\alpha_s}{m^*} - \delta x\right) p_s(x)\right] + \frac{1}{2m^*}\, \partial_x^2\left[\left(\frac{\alpha_s}{m^*} + \delta x\right) p_s(x)\right] + K\left[k_{\hat{s}}\, p_{\hat{s}}(x) - k_s\, p_s(x)\right]$

Chemical Langevin Equations
• If we don’t want the whole probability distribution, we can sometimes derive a stochastic differential equation to generate a sample
$\frac{dX}{dt} = A(X) + B(X)\, \xi(t)$, where $\xi(t)$ is a white-noise term
Adalsteinsson et al. BMC Bioinformatics 5:24

Examples
• Transcription Control
• Lac Operon
• Oscillations
• Accounting for diffusion

Rossi et al. Molecular Cell

Ozbudak et al. Nature 427:737

Guantes and Poyatos PLoS Computational Biology 2:e30

[Figure: bifurcation diagrams of x2 (max/min) versus parameter p1. SNIC Bifurcation (Saddle-Node on an Invariant Circle): saddle, node, limit cycle. Hopf Bifurcation: stable limit cycle (slc), unstable steady state (uss), stable steady state (sss).]

Noise Induced oscillations
Liu et al. Cell 129:605

3-D Gillespie
http://www.math.utah.edu/~isaacson/3dmodel.html
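Returning to the Chemical Langevin form from earlier in the talk (a drift A(X) plus a noise amplitude B(X) multiplying white noise), one way to generate a sample path is the Euler-Maruyama scheme. The sketch below assumes a hypothetical one-species production/degradation process with A(X) = alpha - delta*X and B(X) = sqrt(alpha + delta*X); the process, the parameter values, the function name, and the clamp at zero are illustrative choices, not taken from the talk.

```python
import math
import random


def langevin_birth_death(alpha=50.0, delta=1.0, x0=0.0,
                         dt=0.001, t_end=10.0, seed=0):
    """Euler-Maruyama path for dX/dt = A(X) + B(X)*xi(t) (hypothetical example)."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        drift = alpha - delta * x                         # A(X): production minus degradation
        amplitude = math.sqrt(max(alpha + delta * x, 0.0))  # B(X): sqrt of summed reaction rates
        # White noise integrated over dt is Gaussian with standard deviation sqrt(dt).
        x += drift * dt + amplitude * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x = max(x, 0.0)                                   # assumption: keep molecule numbers non-negative
        path.append(x)
    return path


if __name__ == "__main__":
    path = langevin_birth_death()
    tail = path[len(path) // 2:]
    print("time-averaged X over the second half:", sum(tail) / len(tail))
```

With these assumed parameters the path should fluctuate around alpha/delta, the steady state of the corresponding mass-action equation.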