Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Development of SCWRL4 for improved prediction of protein side-chain conformations Georgii Krivov, Maxim Shapovalov and Roland L. Dunbrack Jr. In collaboration with Moscow Engineering & Physics Institute © George Krivov Version 3 was written ® SCWRL program by Adrian A. Canutescu and Dr. Roland L. Dunbrack Jr. SCWRL ETotal min Main assumption: There is a finite set of possible conformations (called rotamers) for each amino acid residue. pair-wise interactions backbone SCWRL’s sidechains packing algorithm 1. Obtain spatial backbone structure and aminoacid sequence PDB 2. For each residue build possible side-chain conformations (rotamers) using rotamer library 3. Build interaction graph – each vertex denotes a certain residue – an edge between vertices indicates that there is an interaction between some rotamers of the corresponding residues Res 1 Res 2 4. Find optimal assignment of side-chain conformations by graph decomposition and dynamic programming 5. Save resolved structure into file Res 3 Res 4 PDB Res 1 Inproved dynamic programming Res 2 Res 3 Res 4 a based on a tree-decomposition of the interaction graph A tree-decomposition of a graph G V , E is a pair T , X , where b c d T I , F – is a tree with a set of vertices I and a set of edges F X X i : X i V iI e – is a family of subsets of the set V, such that iI i f associated with the vertices of T, Xi V g g e which satisfies the conditions: l e d (u, v) E i I : u, v X i k egh d c v V a set of vertices egf ehi i I : v Xi is connected in T h cdb bca hk hl Hans L. Bodlaender. 1992 Described combinatorial algorithm… is capable of larger and denser graphs than one based on biconnected components SCWRL4 is capable of significantly larger proteins than SCWRL3 typically finishes pretty quickly no coffee-breaks… resolves global optimum (avoids stochastics) accuracy of prediction entirely depends on rotamer library and energy potentials However… a better accuracy is desired Involve more rotamers more interaction to evaluate more combinations to enumerate More realistic potentials longer interaction range more edges in the graph less decomposable combinatorial complexity blows-up quick collision detection algorithm hardly feasible even with new algorithm thermodynamic fluctuations via Flexible Rotamers Model Hierarchies of bounding boxes enable efficient search for intersections between two groups of geometric figures Given two groups of figures enclosed into k-dops… 2. Check each combination for overlapping 1. If overlap then split each 3. Disregard boxes that don’t clash 4. Continue recursively on each clashing pair Involve more rotamers more interaction to evaluate more combinations to enumerate More realistic potentials longer interaction range more edges in the graph less decomposable combinatorial complexity blows-up quick collision detection algorithm hardly feasible even with new algorithm James T. Klosowski, et.al. 1998 … works best in conjunction with k-Discrete Oriented Polytopes – – – – easy to enclose a ball easy to merge easy clash check almost rotatable – a class of convex polytopes with 2k planes any plane is orthogonal to one of k basic axes which remain fixed examples: k=3 k=2 Cubic (k = 3) Involve more rotamers more interaction to evaluate more combinations to enumerate More realistic potentials longer interaction range more edges in the graph less decomposable combinatorial complexity blows-up hardly feasible even with new algorithm k=4 Tetrahedral (k = 4) quick collision detection algorithm … works best in conjunction with k-Discrete Oriented Polytopes easy to enclose a ball easy to merge easy clash check almost rotatable – a class of convex polytopes with 2k planes any plane is orthogonal to one of k basic axes which remain fixed examples: k=3 k=2 i = 1..k k=4 simple projection onto all basic axes Cubic (k = 3) Involve more rotamers more interaction to evaluate more combinations to enumerate More realistic potentials longer interaction range more edges in the graph less decomposable combinatorial complexity blows-up hardly feasible even with new algorithm Tetrahedral (k = 4) quick collision detection algorithm … works best in conjunction with k-Discrete Oriented Polytopes easy to enclose a ball easy to merge easy clash check almost rotatable – a class of convex polytopes with 2k planes any plane is orthogonal to one of k basic axes which remain fixed examples: k=3 k=2 k=4 max i max max i min i min min i Involve more rotamers more interaction to evaluate more combinations to enumerate More realistic potentials longer interaction range more edges in the graph less decomposable Cubic (k = 3) combinatorial complexity blows-up hardly feasible even with new algorithm Tetrahedral (k = 4) quick collision detection algorithm … works best in conjunction with k-Discrete Oriented Polytopes easy to enclose a ball easy to merge easy clash check almost rotatable – a class of convex polytopes with 2k planes any plane is orthogonal to one of k basic axes which remain fixed examples: k-DOP A k-DOP B k=3 k=2 k=4 ? A doesn’t clash B if exists axis xi (1≤i ≤ k) such that minia maxbi or maxia minbi Involve more rotamers more interaction to evaluate more combinations to enumerate More realistic potentials longer interaction range more edges in the graph less decomposable Cubic (k = 3) combinatorial complexity blows-up hardly feasible even with new algorithm Tetrahedral (k = 4) quick collision detection algorithm Fast anisotropic hydrogen bond potential e1 O n e2 cd OH doptimal cO n, e1 cos max w z0 1 c c 2 d H cO E O, H (1 w) EDefault w E0 More realistic potentials longer interaction range more edges in the graph less decomposable 1 z0 H e0 cH n, e0 cosmax d 1 cos max 1 cos max E0 B qH qO doptimal 2.1 d 0.65 B 30 cos max 45 cos max 35 For more relevant comparison Amount of sidechains it make sense to predict a crystal not the ASU relative surface accessibility (%) Knowledge of crystal symmetry enables higher accuracy … Extra percent in average accuracy due to crystal awareness 14 12 All residues Crystal contacts 10 8 6 4 2 0 ALL ARG ASN ASP CYS GLN GLU HIS ILE LEU LYS MET PHE PRO SER THR TRP TYR VAL Tuning parameters of the Flexible Rotamer Model due to backbone and frame si sample around rotamer Estatic Epairwise library’s conformation pij due to sidechains’ interaction Eself Estatic T k log probability k , T , ci i i , i ci i may be setup independently for each type of amino acid n Estatic 1 T log n i1 1 T log e mn i 1 j 1 n Epairwise search for optimal values in highdimensional space from rotamer library m s i e T pij si sj T s1 s2 optimize one amino acid type in a time (and loop for all) Optimizing expensive function in multidimensional space arg max f x x || x || R n , f x expensive 1. Generate sample of arguments and evaluate function at these points 2. Assume that second order approximation works well 3. From the linear regression resolve coefficients and their covariance 4. Maximization of quadratic form is relatively simple, provided that we can resolve eigenvalues and eigenvectors 5. Hence, generate sample of quadratic forms, maximize each of them and aggregate robust for non-convex functions! Traces through the optimization of the FRM parameters Training on 40 proteins ( ~ 2 500 residues ) + 24 proteins more and continue ( ~ 5 000 residues ) Average conditional accuracy 89.4 88.9 Testing @ H-bonds 1 Testing @ H-bonds 2 88.4 Training @ H-bonds 1 87.9 87.4 Testing on 130 proteins ( ~ 20 000 residues ) 86.9 0 10 20 30 40 Iteration number 50 60 70 80 Conditional accuracy (%) Measurement: Side chains with better electron density are easier to predict ARG ASN GLU HIS MET PHE ASP CYS ILE LEU GLN LYS PRO SER THR sliding frame - 20% 1 40 2 | 1 TRP TYR VAL 3 | 2 & 1 4 | 3 & 2 & 1 Confidence of side-chain placement (derived from experimental EDS maps) ALL Shapovalov et.al. 2007 all this good with Improved usability functionality of SCWRL4 is available as library (coming soon) backbone PDB file rotamer library SCWRL3.exe class SCWRL { … }; output PDB file enables direct manipulation of the model via C++ API SCWRL4.DLL Questions, Comments, Suggestions ? Acknowledgements Dr. Roland Dunbrack Prof. Nickolai Kudryashov Colleagues: Adrian Canutescu Guoli Wang Maxim Shapovalov Qiang Wang Qifang Xu Thanks for Your Attention! Have a Nice Day and welcome to our poster!