DOCKING
Modeling Protein Complexes
Dr. Victor Lesk
24th October 2006

• Protein / small molecules
  – Enzyme / substrates
  – Enzyme / drug
• Protein / protein
  – Enzyme / inhibitor
  – Inhibitor / modulator
  – Macromolecular assemblies
• Protein / nucleic acid
  – RNA/DNA / polymerase
  – Ribosome / peptide

• Docking two molecules means constructing the coordinates of the bound state.
• The bound state is called the complex.
• We require coordinates for the independent molecules as input.
• Molecules move towards each other and bind (‘dock’).
• But the aim is to predict their docked configuration, not to describe their motion.

Drug design
• Drugs typically affect a protein target’s ability to bind substrate (competitively or non-competitively).
• For some time, carried out by automated screening of a large library of compounds
• Library chosen according to ADME criteria using Lipinski’s rule of 5
• Used to be experimental screening using a physical library
• Now computer-based “virtual screening” methods are also available.

How is virtual screening performed?
• QSAR when the structure of the protein target is unknown
• When the protein structure is known, docking the drugs onto the protein can be tried (“small-molecule docking”).
• If the partner is also a protein and the binding site cannot be identified by experiment or bioinformatics, protein-protein docking may be used to help find it.

SMALL-MOLECULE DOCKING
Ligand docking
Virtual screening
For:
• Drug design
  – Lead optimization
• Toxicology
• Metabolism study
• Development of tags for imaging
Software e.g.:
• Commercial
  – DOCK
  – GOLD
  – FlexX
• Free for academics
  – Autodock

Small molecule docking for drug design
• Try to dock a putative drug molecule onto the protein
• Each molecule has few atoms, so docking each one is computationally efficient, but there are many molecules.
• Search from a library of 500,000 compounds
  – Pre-filtered using heuristics (Lipinski)
• Score with a pairwise energy function
• Protein remains rigid
• Torsion angles of the drug are allowed to rotate.
• Bonds are not allowed to stretch or flex.
• Example: HIV protease inhibitors

PROTEIN-PROTEIN DOCKING
(also polynucleotides)
• Other reasons for protein-protein docking
• Basic concepts
• Methods
• Assessment
• Summary

Background: why do protein-protein docking?
Aside from helping with virtual screening:
• Protein-protein interaction networks are of widespread interest in systems biology
• Proteins exist for which genome projects give no information
• And known proteins have as yet unknown interactions
• Structure prediction technology is advancing

Protein-Protein Docking
• Configuration space is large, so the problem is computationally intensive
• Even larger when one of the components is not rigid ‘enough’
  – Throw away as many atoms as possible
  – Search the remaining space efficiently
  – And/or use high-performance computing

Protein-protein docking: Methods
• The set of configurations must contain a good enough one
• A good enough configuration must have nearly the best score
• Use non-structural help where possible

Protein-protein docking: Methods
• Fourier series methods
• Monte Carlo methods
• Surface methods
• Bioinformatics methods
• Normal modes

Fourier method for protein-protein docking
‘Reciprocal space method’
• Convolution scores only
• Non-structural help cannot be used efficiently
• Conformational change not allowed

Monte Carlo methods for protein-protein docking
• Make a small random change
• Prefer to accept a change with a better score
• Random change may include non-rigidity
• Not guaranteed to consider the true structure

Surface method for protein-protein docking
• Superpose a point on each protein’s molecular surface
• Rotate to make the normals antiparallel
• Surfaces created with marching cubes

Movie of surface method protein-protein docking
• 2 x Plasmodium vivax 25kD protein
• Homodimer complex
• Symmetry not imposed
• 1000 active triangles on each protein
• 60,000,000 configurations total
• Score rate: 30 configurations per second

Marching cubes (Lorensen and Cline, 1987)
• Originally for medical imaging
(Pictures: D. Lingrand, Université Nice Sophia Antipolis/CNRS)

Marching cubes: animated demonstration
• Molecular surface of P25 being constructed
• Low resolution

Marching cubes: properties
• Surface is constructed out of triangles (‘simplicial complex’)
• All mathematical topologies ok
• Restrictable to specified patches
• Internal pockets must be eliminated
• Major flexibility requires a refinement stage (although better than the Fourier method)

Marching cubes: variables
• Molecular surface
• Solvent-accessible surface

Marching cubes: patches
• 800 Å² patch around GLU152.Oe2
• Sample from patch

Bioinformatics methods for protein-protein docking
• Mutagenesis effects on affinity
• Surface residue conservation
• Correlated mutation between interactors
• Homologous complexes
• Bioinformatics is auxiliary, not stand-alone

Normal Modes
• About 10000 oscillations in an average protein
• Largest-amplitude oscillations are calculable
• Describe structural change for docking
• Or just reduce the backbone conformer search (needs to be done)

Scores for selecting the best configuration
• Free energy
  – Electrostatics
  – Stereochemistry
• Solvation score
• Statistical scores
• Geometric scores
• Phylogenetic scores
• Combined scores

Free energy
• Contribution from all pairs of atoms
• Same/opposite electric charges repel/attract
• Electron clouds exclude each other
• Atoms try to make glancing contact
• Hydrogen bonds are favourable (difficult to model and calculate, direction-dependent)

Solvation score
• Water attracts polar groups
• Non-polar groups buried by the interface
• Atomic contact energy (Zhang)

Statistical scores
• Interacting residue or atom type profile
• Profile from known complex interfaces

Geometric scores
• Convex hull of surfaces
• Buried surface area
• Volume of intersection

Phylogenetic scores
• Needs homologues for all interactors
• Conservation score
• Correlation of mutations across the interface

Combined scores
• best/worst rank(s1, s2, s3, …)
• reverse: -s1
• s1 with configurations filtered out if s2 > 5.7
• weighted sum: a × s1 + b × s2 + …
• weighted product: s1^a × s2^b × s3^c
• Automated combined score trials

Assessment of docking methods
• Benchmark
• Assessment event:
  – CAPRI double-blind trial

Protein-protein docking benchmark
• 84 protein-protein complexes
• Unique structural family combinations
• Diverse biological roles
• Maintained by Prof. Z. Weng at Boston U.

CAPRI: Critical Assessment of PRedicted Interactions
• Every 6 months or so, irregular
• Set of 1 to 6 target complexes
• Centralized double-blind assessment
• International meetings
• Proteins journal: special CAPRI edition

Interpretation of docking
• Not simulation (maybe Monte Carlo is)
• Energy functions are no more than ‘inspired’ by physics
• With greater understanding, affinity prediction could become possible

Imperial structural bioinformatics group
Virtual screening
• Prof. Mike Sternberg
• Dr. Ata Amini
• Dr. Paul Shrimpton
Protein-protein docking
• Prof. Mike Sternberg
• Dr. Suhail Islam
• Dr. Victor Lesk
• Philip Carter
• Sara Dobbins

Glossary
• Complex
• Component
• Ligand/Receptor
• Bond torsion
• Torsion angle
• Bond angle
• Configuration
• Decoy
• Blind Trial
• PDB file
• Coordinate file
• Energy function
• Scoring function
• Fitness function
• Pairwise energy
• Electrostatics
• Solvation
• Dielectric
• Affinity
• Fourier Transform
• Mutagenesis

Summary
• Surface method: fast, versatile, flexibility ok
• 600 processor hours for full rigid search
• To be done:
  – Score improvement
  – Fast sidechain flexibility
  – Backbone flexibility

Bibliography
• Virtual screening
  – Virtual Screening in Drug Discovery, Alvarez & Shoichet, CRC Press (2005)
  – Structure-based virtual screening: an overview, Lyne, DDT 7 20 1047 (2002)
• Lipinski’s rule of 5
  – Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings, Lipinski et al., Adv. Drug Del. Rev. 26 13 3 (2001)
• Protein-protein docking: Fourier method
  – Molecular surface recognition: determination of geometric fit between proteins and their ligands by correlation methods, Katzir et al., PNAS 89 2195 (1992)
• Protein-protein docking: Monte Carlo
  – Protein-protein docking with simultaneous optimization of rigid-body displacement and sidechain conformations, Gray et al., JMB 331 1 281 (2003)
• Benchmark for protein-protein docking
  – Protein-Protein Docking Benchmark 2.0, Mintseris et al., Proteins 60 2 216 (2005)
• Solvation modeling for proteins
  – Determination of atomic desolvation energies from the structures of crystallized proteins, Zhang et al., JMB 267 3 707 (1997)
• Automated protein-protein docking server
  – CLUSPRO, Comeau et al., Bioinf. 20 1 45 (2004) http://nrc.bu.edu/cluster/
• CAPRI docking assessment
  – Welcome to CAPRI, Janin, Proteins SFG 47 3 257 (2002)
  – CAPRI methods articles, Proteins SFG 52 1 (2003)
  – The CAPRI experiment, its evaluation and implications, Wodak & Mendez, Curr Opin Struct Biol 14 2 242
• Marching Cubes
  – Marching cubes: A high resolution 3d surface construction algorithm, Lorensen & Cline, Proc. ACM Siggraph Aug 1987 163

Critical
• Scores and hybrid scores
• How to describe the success rate of a docking method
• Benchmark
• Changes in protein structure upon complexation

Basic Scores
• electrostatic energy
• hard core repulsion penalty score (e.g. Lennard-Jones)
• solvation score

Quality of method
In x% of different protein-protein complexes, the best n guesses of method M contain at least one guess closer than r Ångströms to the true structure.

Benchmark
• Benchmark of 39 complexes, filtered for redundancy
• Ranked in 3 difficulty classes based on degree of conformational change
• Enzyme-inhibitor complexes: 11
• Antibody-antigen complexes: 11
• 27 others of unclassified functional role

Conformational change
• Proteins change shape a little (example) or a lot (example) upon complexation
• When conformational change is small, docking methods can ignore it.
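The success criterion stated under Quality of method can be computed directly once each guess has a distance to the true complex. The sketch below is only an illustration: the function name `success_rate`, the example distances, and the choice of distance measure (for instance an RMSD in Å) are assumptions, not part of the slides.

```python
from typing import Sequence


def success_rate(distances_by_complex: Sequence[Sequence[float]],
                 n: int, r: float) -> float:
    """Return x: the percentage of complexes whose best n guesses
    contain at least one guess closer than r Angstroms to the truth.

    distances_by_complex[i] lists, in rank order (best score first),
    the distance of each guess for complex i from the true structure.
    """
    if not distances_by_complex:
        raise ValueError("need at least one complex")
    hits = sum(
        1
        for distances in distances_by_complex
        if any(d < r for d in distances[:n])  # only the top n guesses count
    )
    return 100.0 * hits / len(distances_by_complex)


# Hypothetical distances (Angstroms) for four complexes, three guesses each.
example = [
    [12.0, 3.1, 8.4],
    [9.5, 7.7, 2.0],
    [15.2, 11.0, 6.3],
    [1.8, 20.0, 4.0],
]
print(success_rate(example, n=2, r=5.0))  # prints 50.0
```

Reporting x for several values of n and r gives a fuller picture of a method than any single triple (x, n, r).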
Strategy
Any docking method must work unfailingly in cases of zero conformational change. Methods are first tested with zero conformational change imposed.

Surface-based docking
• Automatically excludes a large subset of known undesirable conformations
• Can impose contact between specified surface patches
• N-fold rotational symmetry
• Provides alternative visualization

Construction of Surface from pdb file
• Step 1: read the pdb file and identify atom types
• Step 2: replace points with overlapping clouds of density
• Step 3: apply marching cubes to generate a set of interlocking triangles representing the atomic surface

Generation of guesses for complexed structure
A possible structure for the complex can be generated by specifying
• A triangle from the surface representing component 1
• A triangle from component 2
• An angle
Put triangle 1 against triangle 2 and rotate by the angle around the centre of the triangle. Score the configuration and record the structure. Repeat 1M times.

Ultimate aims of Protein-Protein Docking, in order of difficulty
• What is the complexed structure of proteins x, y of known structure which are known to form a complex? (Hard)
• Could proteins a, b of known structure form a stable complex in vivo? (Very Hard)
• What, approximately, is the chemical affinity for given interacting proteins? (Very Hard)

Hybrid scores
Hybrid scores are scores produced by operations on basic scores s1, s2, s3:
• reverse(s1)
• s1 with configurations filtered out if s2 > 5.7
• weighted sum: a*s1 + b*s2 + c*s3 + …
• weighted product: s1^a × s2^b × s3^c × …
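The hybrid-score operations listed above are simple enough to sketch in a few lines. This is a minimal illustration under stated assumptions: the function names, the use of `None` to mark a filtered-out configuration, and the example numbers are hypothetical; only the operations themselves and the 5.7 cutoff come from the slide. The weighted product assumes positive basic scores, since a negative score raised to a fractional power is not a real number.

```python
import math
from typing import Optional, Sequence


def reverse(s1: float) -> float:
    """Flip the sense of a score, so best becomes worst."""
    return -s1


def filtered(s1: float, s2: float, cutoff: float = 5.7) -> Optional[float]:
    """Keep s1 unless s2 exceeds the cutoff; None means 'filtered out'."""
    return None if s2 > cutoff else s1


def weighted_sum(scores: Sequence[float], weights: Sequence[float]) -> float:
    """a*s1 + b*s2 + c*s3 + ..."""
    if len(scores) != len(weights):
        raise ValueError("one weight per score is required")
    return sum(w * s for w, s in zip(weights, scores))


def weighted_product(scores: Sequence[float], exponents: Sequence[float]) -> float:
    """s1^a * s2^b * s3^c * ... (assumes every score is positive)."""
    if len(scores) != len(exponents):
        raise ValueError("one exponent per score is required")
    return math.prod(s ** e for s, e in zip(scores, exponents))


# Hypothetical basic scores s1, s2, s3 for one configuration.
scores = (2.0, 3.0, 4.0)
print(reverse(scores[0]))                         # -2.0
print(filtered(scores[0], scores[1]))             # 2.0 (s2 = 3.0 is below 5.7)
print(weighted_sum(scores, (1.0, 2.0, 0.5)))      # 10.0
print(weighted_product(scores, (1.0, 2.0, 0.5)))  # 36.0
```

Whether a lower or a higher hybrid score counts as better is left open here, as it is on the slide; `reverse` is the operation that converts between the two conventions.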