Probabilistic Roadmaps: A Tool for Computing Ensemble Properties of Molecular Motions

Serkan Apaydin, Doug Brutlag (1), Carlos Guestrin, David Hsu (2), Jean-Claude Latombe, Chris Varma
Computer Science Department, Stanford University
(1) Department of Biochemistry, Stanford University
(2) Computer Science Department, University of North Carolina

Goal of our Research
Develop efficient computational representations and algorithms to study molecular pathways for protein folding and ligand-protein binding.
• Protein folding (RECOMB '02)
• Ligand-protein binding (ECCB '02)

Acknowledgements
People:
• Leo Guibas
• Michael Levitt, Structural Biology
• Itay Lotan
• Vijay Pande, Chemistry
• Fabian Schwarzer
• Amit Singh
• Rohit Singh
Funding:
• NSF-ITR ACI-0086013
• Stanford's Bio-X and Graduate Fellowship programs

Analogy with Robotics

Configuration Space
• Approximate the free space by random sampling
• Probabilistic roadmaps

Probabilistic Roadmap
[Figure: roadmap in free space] [Kavraki, Svestka, Latombe, Overmars, 95]

Probabilistic Completeness
The probability that a roadmap fails to correctly capture the connectivity of the free space goes to 0 exponentially in the number of milestones (~ running time).
Random sampling is a convenient incremental scheme for approximating the free space.

Computed Examples
[Figures]

Biology vs. Robotics
• Energy field, instead of joint control
• Continuous energy field, instead of binary free and in-collision spaces
• Multiple pathways, instead of a single collision-free path
• Potentially many more degrees of freedom
• Relation to the real world is more complex

Initial Work [Singh, Latombe, Brutlag, 99]
• Study of ligand-protein binding
• Probabilistic roadmaps with edges weighted by energetic plausibility
• Search of most plausible paths
• Study of energy profiles along such paths
• Extensions to protein folding [Song and Amato, 01] [Apaydin et al., 01]
[Figure labels: energy; Catalytic Site]

New Idea
Capture the stochastic nature of molecular motion by assigning probabilities to edges.
[Figure: edge from v_i to v_j labeled P_ij]

Why is this a good idea?
1) We can approximate Monte Carlo simulation as closely as we wish
2) Unlike with MC simulation, we avoid the local-minima problem
3) We can consider all pathways in the roadmap at once to compute ensemble properties

Edge probabilities
Follow the Metropolis criterion:
$$P_{ij} = \begin{cases} \dfrac{1}{N_i} \exp\left(-\dfrac{\Delta E_{ij}}{k_B T}\right), & \text{if } \Delta E_{ij} > 0; \\ \dfrac{1}{N_i}, & \text{otherwise,} \end{cases}$$
where $N_i$ is the number of neighbors of $v_i$ and $\Delta E_{ij} = E_j - E_i$.
Self-transition probability:
$$P_{ii} = 1 - \sum_{j \neq i} P_{ij}$$
[Figure: node v_i with self-loop P_ii and edge P_ij to v_j]

Stochastic Roadmap Simulation
Stochastic simulation on the roadmap and Monte Carlo simulation converge to the same Boltzmann distribution.
[Figure: roadmap with edge probabilities P_ij and a set S]

Problems with Monte Carlo Simulation
• Much time is wasted in local minima
• Each run generates a single pathway

Solution
Treat the roadmap as a Markov chain and use the First-Step Analysis tool.

Example #1: Probability of Folding pfold
[Figure: HIV integrase [Du et al. '98]; a conformation reaches the folded set with probability pfold and the unfolded set with probability 1 - pfold]
"We stress that we do not suggest using pfold as a transition coordinate for practical purposes as it is very computationally intensive."
Du, Pande, Grosberg, Tanaka, and Shakhnovich, "On the Transition Coordinate for Protein Folding", Journal of Chemical Physics (1998).

First-Step Analysis
U: Unfolded set; F: Folded set
• One linear equation per node
• Solution gives pfold for all nodes
• No explicit simulation run
• All pathways are taken into account
• Sparse linear system
[Figure: node i with self-transition P_ii and edges P_ij, P_ik, P_il, P_im to neighbors j, k, l, m]
Let $f_i = \text{pfold}(i)$. After one step:
$$f_i = P_{ii} f_i + P_{ij} f_j + P_{ik} f_k + P_{il} f_l + P_{im} f_m$$
with $f = 1$ for nodes in F and $f = 0$ for nodes in U (the slide marks two of the neighbor terms as $= 1$).

In Contrast …
Computing pfold with MC simulation requires:
• Performing many MC simulation runs
• Counting the number of times F is attained first, for every conformation of interest

Computational Tests
• 1ROP (repressor of primer)
  – 2 α helices
  – 6 DOF
• 1HDD (Engrailed homeodomain)
  – 3 α helices
  – 12 DOF
H-P energy model with steric clash exclusion [Sun et al., 95]

Correlation with MC Approach
[Figures: correlation plots for 1ROP and 1HDD]

Computation Times (1ROP)
Monte Carlo:
• 49 conformations
• Over 11 days of computer time
• Over 10^6 energy computations
Roadmap:
• 5000 conformations
• 1 - 1.5 hours of computer time
• ~15,000 energy computations
~4 orders of magnitude speedup!

Example #2: Ligand-Protein Interaction
Computation of escape time from funnels of attraction around potential binding sites (funnel = ball of 10 Å rmsd).

Computing Escape Time with Roadmap
[Figure: node i inside the funnel of attraction, with self-transition P_ii and edges P_ij, P_ik, P_il, P_im to neighbors j, k, l, m]
$$t_i = 1 + P_{ii} t_i + P_{ij} t_j + P_{ik} t_k + P_{il} t_l + P_{im} t_m$$
with $t = 0$ for a node outside the funnel (the slide marks one neighbor term as $= 0$). Escape time is measured as the number of steps of stochastic simulation.

Similar Computation Through Simulation [Sept, Elcock and McCammon '99]
10K to 30K independent simulations

Applications
1) Distinguishing catalytic site: Given several potential binding sites, which one is the catalytic site?
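The Metropolis edge probabilities and the two first-step analysis systems (pfold and escape time) from the preceding slides can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the authors' implementation: the function names, the value of `K_B_T`, and the five-node toy roadmap are assumptions made up for the example, and a dense solver stands in for the sparse solver a real roadmap with thousands of nodes would need.

```python
import numpy as np

K_B_T = 0.6  # assumed thermal energy (kcal/mol, roughly room temperature)


def transition_matrix(energies, neighbors, kT=K_B_T):
    """Metropolis edge probabilities P_ij plus self-transitions P_ii."""
    n = len(energies)
    P = np.zeros((n, n))
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            dE = energies[j] - energies[i]
            accept = np.exp(-dE / kT) if dE > 0 else 1.0
            P[i, j] = accept / len(nbrs)      # (1/N_i) * acceptance
        P[i, i] = 1.0 - P[i].sum()            # P_ii = 1 - sum_{j != i} P_ij
    return P


def p_fold(P, folded, unfolded):
    """Solve f_i = sum_j P_ij f_j with f = 1 on F and f = 0 on U."""
    n = P.shape[0]
    A = np.eye(n) - P
    b = np.zeros(n)
    for i in folded:                          # boundary condition f_i = 1
        A[i] = 0.0
        A[i, i] = 1.0
        b[i] = 1.0
    for i in unfolded:                        # boundary condition f_i = 0
        A[i] = 0.0
        A[i, i] = 1.0
    return np.linalg.solve(A, b)


def escape_time(P, funnel):
    """Solve t_i = 1 + sum_j P_ij t_j inside the funnel, t = 0 outside."""
    idx = sorted(funnel)
    Q = P[np.ix_(idx, idx)]                   # transitions that stay inside
    t = np.zeros(P.shape[0])
    t[idx] = np.linalg.solve(np.eye(len(idx)) - Q, np.ones(len(idx)))
    return t


# Hypothetical toy roadmap: a 5-node chain with a symmetric energy barrier.
energies = [0.0, 1.0, 2.0, 1.0, 0.0]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
P = transition_matrix(energies, neighbors)

f = p_fold(P, folded=[4], unfolded=[0])       # pfold for every node at once
funnel = {1, 2, 3}
t = escape_time(P, funnel)                    # expected steps to leave funnel
print("pfold:", f)
print("escape time:", t)
```

Because the toy chain is symmetric about its middle node, that node's pfold comes out as one half, which makes a convenient sanity check. One linear solve yields values for all nodes, which is the point the slides make against running many separate Monte Carlo simulations.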
Complexes Studied
ligand          protein   # random nodes   # DOFs
oxamate         1ldm      8000             7
Streptavidin    1stp      8000             11
Hydroxylamine   4ts1      8000             9
COT             1cjw      8000             21
THK             1aid      8000             14
IPM             1ao5      8000             10
PTI             3tpi      8000             13

Distinction Based on Energy (kcal/mol)
Protein   Bound state   Best potential binding site
1stp      -15.1         -14.6
4ts1      -19.4         -14.6
3tpi      -25.2         -16.0
1ldm      -11.8         -13.6
1cjw      -11.7         -18.0
1aid      -11.2         -22.2
1ao5      -7.5          -13.1
The slide marks each row as able or not able to distinguish the catalytic site. The bound state has the lower energy only for 1stp, 4ts1, and 3tpi.

Distinction Based on Escape Time (# steps)
Protein   Bound state   Best potential binding site
1stp      3.4E+9        1.1E+7
4ts1      3.8E+10       1.8E+6
3tpi      1.3E+11       5.9E+5
1ldm      8.1E+5        3.4E+6
1cjw      5.4E+8        4.2E+6
1aid      9.7E+5        1.6E+8
1ao5      6.6E+7        5.7E+6
The slide marks each row as able or not able to distinguish the catalytic site. The bound state has the longer escape time for every protein except 1ldm and 1aid.

Applications
1) Distinguishing catalytic site
2) Computational mutagenesis
Some amino acids are deleted entirely, replaced by other amino acids, or have their sidechains altered.
[Figure: chemical environment of the LDH-NADH-substrate (pyruvate) complex; LDH catalyzes the conversion of pyruvate to lactate in the presence of NADH. Labels: GLN-101, Loop, ARG-106, ASP-166, ARG-169, HIS-193, ASP-195, NADH, pyruvate]

Binding of Pyruvate to LDH
[Figure: the same environment, additionally labeling THR-245]

Results
Mutant                     Escape Time   Change
Wildtype                   3.216E6       N/A
His193 → Ala, Arg106 → Ala 4.126E2
His193 → Ala               3.381E3
Arg106 → Ala               2.550E2
Asp195 → Asn               5.221E7
Gln101 → Arg               1.669E6       No change
Thr245 → Gly               4.607E5
[Only the "N/A" and "No change" entries of the Change column survive in the text; the accompanying figures show the mutated residues, e.g. ALA-106, ALA-193, GLY-245.]

Conclusion
Probabilistic roadmaps are a promising computational tool for studying ensemble properties of molecular pathways.
Current and future work:
• Better kinetic/energetic models
• Experimentally verifiable tests
• Non-uniform sampling strategies
• Encoding MD simulation

Stochastic Roadmap Simulation
[Figure: roadmap with nodes v_s and v_g and a set S]
Stochastic simulation on a roadmap and MC simulation converge to the same distribution $\pi$ (Boltzmann):
For any set $S$, $\epsilon > 0$, $\delta > 0$, $\gamma > 0$, there exists $N$ such that a roadmap with $N$ milestones has error bounded by
$$\pi(S)(1 - \delta) - \epsilon \le \hat{\pi}(S) \le \pi(S)(1 + \delta) + \epsilon$$
with probability at least $1 - \gamma$.

Ligand-Protein Modeling
[Figure: ligand with root atom at position x, y, z]
• DOF = 10
  – 3 coordinates to position the root atom;
  – 2 angles to specify the first bond;
  – Angles for all remaining non-terminal atoms;
  – Bond angles are assumed constant;
• Protein assumed rigid
[Singh, Latombe and Brutlag '99]

Energy of Interaction
Energy = van der Waals interaction ($E_v$) + electrostatic interaction ($E_c$)
$$E_v = 0.2\left[\left(\frac{R_0}{R_{ij}}\right)^{12} - 2\left(\frac{R_0}{R_{ij}}\right)^{6}\right]$$
$$E_c = \frac{332\, Q_i Q_j}{\epsilon R_{ij}}$$
[Figures: plots of $E_c$ and $E_v$ against $R_{ij}$]

Solvent Effects
$$E_c = \frac{332\, Q_i Q_j}{\epsilon R_{ij}}$$
• Is only valid for an infinite medium of uniform dielectric;
• Dielectric discontinuities result in induced surface charges;
Solution: the Poisson-Boltzmann equation
$$\nabla \cdot \left[\epsilon(r) \nabla \phi(r)\right] - \epsilon(r)\,\kappa(r)^2 \sinh\left[\phi(r)\right] + \frac{4\pi \rho^f(r)}{kT} = 0$$
Use DelPhi [Rocchia et al. '01]. The finite-difference solution is based on discretizing the workspace into a uniform grid.
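The pairwise energy from the "Energy of Interaction" slide (before any solvent correction) can be written out as a short Python sketch. The function names, the atom representation as `(x, y, z, charge)` tuples, and the values chosen for `r0` and `dielectric` are assumptions for illustration only; the slides do not give these parameters. With charges in electron units and distances in Å, the factor 332 yields energies in kcal/mol.

```python
import math


def van_der_waals(r_ij, r0):
    """E_v = 0.2 * [(R0/Rij)^12 - 2 * (R0/Rij)^6]; minimum of -0.2 at Rij = R0."""
    ratio = r0 / r_ij
    return 0.2 * (ratio ** 12 - 2.0 * ratio ** 6)


def coulomb(q_i, q_j, r_ij, dielectric):
    """E_c = 332 * Qi * Qj / (eps * Rij), valid for a uniform dielectric."""
    return 332.0 * q_i * q_j / (dielectric * r_ij)


def interaction_energy(ligand_atoms, protein_atoms, r0, dielectric):
    """Sum E_v + E_c over all ligand-protein atom pairs.

    Each atom is an (x, y, z, charge) tuple; this layout is hypothetical.
    """
    total = 0.0
    for (*pos_i, q_i) in ligand_atoms:
        for (*pos_j, q_j) in protein_atoms:
            r_ij = math.dist(pos_i, pos_j)    # Euclidean distance
            total += van_der_waals(r_ij, r0)
            total += coulomb(q_i, q_j, r_ij, dielectric)
    return total


# Hypothetical one-atom ligand and one-atom protein, placed 4 units apart.
ligand = [(0.0, 0.0, 0.0, 0.5)]
protein = [(4.0, 0.0, 0.0, -0.5)]
energy = interaction_energy(ligand, protein, r0=4.0, dielectric=1.0)
print(energy)
```

The all-pairs loop is quadratic in the number of atoms, which is acceptable for a sketch. The solvent correction described on the "Solvent Effects" slide would replace the `coulomb` term with potentials obtained from a Poisson-Boltzmann solver and is not attempted here.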