MOLECULAR DOCKING
V. Subramanian
Chemical Laboratory, Central Leather Research Institute
Adyar, Chennai

Introduction
• Drug discovery takes years to decades for each new drug and is very costly
• To cut down the research timeline and cost, wet-lab experiments are reduced in favor of computer modeling

Drug discovery
Chemical + biological system desired response?

TRADITIONAL DRUG DESIGN
• Lead generation: natural ligand / screening
• Biological testing
• Drug design cycle
• If promising
• Synthesis of new compounds
• Pre-clinical studies

Finding lead compound
• A lead compound is a small molecule that serves as the starting point for an optimization involving many small molecules that are closely related in structure to the lead compound
• Many organizations maintain databases of chemical compounds
• Some of these are publicly accessible; others are proprietary
• Databases contain an extremely large number of compounds (ACS databases contain 10 million compounds)
• 3D databases have information about chemical and geometrical features
    » Hydrogen bond donors
    » Hydrogen bond acceptors
    » Positive charge centers
    » Aromatic ring centers
    » Hydrophobic centers

Finding lead compound
• There are two approaches to this problem
  – A computer program such as AutoDock (or a similar program, e.g. Affinity (Accelrys)) can be used to search a database by generating a "fit" between each molecule and the receptor
  – Alternatively, one can search for a 3D pharmacophore

Structure based drug design
• Drug design and development
• Structure based drug design exploits the 3D structure of the target or a pharmacophore
  – Find a molecule that would be expected to interact with the receptor.
    (Searching a database)
  – Design an entirely new molecule from "scratch" (de novo drug/ligand design)
• In this context, bioinformatics and chemoinformatics play a crucial role

Structure-based Drug Design (SBDD)
• Natural ligand / screening
• Molecular biology & protein chemistry
• 3D structure determination of target and target-ligand complex
• Modelling
• Drug design cycle
• Structure analysis and compound design
• Biological testing
• If promising
• Synthesis of new compounds
• Pre-clinical studies

Structure based drug design
• SBDD:
  – drug targets (usually proteins)
  – binding of ligands to the target (docking)
  ↓
  "rational" drug design (benefits = saved time and $$)

Schematics for structure based drug design
• Select and purify the target protein
• Obtain known inhibitor
• X-ray structural determination of native protein
• X-ray structural determination of inhibitor complex
• Model inhibitor with computational tools
• Synthesis
• Determine IC50
• Evaluate: preclinical, clinical, in vitro, in vivo, cells, animals, & humans
• Drug
Structure based drug design has the potential to shave off years and millions of dollars

Working at the intersection
• Structural Biology
• Biochemistry
• Medicinal Chemistry
• Toxicology
• Pharmacology
• Biophysical Chemistry
• Natural Products Chemistry
• Chemical Ecology
• Information Technology

Molecular docking: definition
• It is a process by which two molecules are put together in three dimensions
• It seeks the best ways to put two molecules together
• It uses molecular modeling and computational chemistry tools

Molecular docking
• Docking is used for finding the binding modes of proteins with ligands/inhibitors
• In molecular docking, we attempt to predict the structure of the intermolecular complex formed between two or more molecules
• Docking algorithms are able to generate a large number of possible structures
• We use a force-field-based strategy to carry out docking

Oxygen transport molecule (101M) with surface and myoglobin ligand
Influenza virus b/beijing/1/87 neuraminidase complexed with
zanamivir
Plasma alpha antithrombin-iii and pentasaccharide protein with heparin ligand

Steps of molecular docking
• Three steps
  (1) Definition of the structure of the target molecule
  (2) Location of the binding site
  (3) Determination of the binding mode

Best ways to put two molecules together
  – Need to quantify or rank solutions
  – Scoring function or force field
  – The experimental structure may be one of several predicted solutions
  – Need a search method

Questions
• Search
  – What is it?
  – When/why and which search?
• Scoring
  – What is it?
• Dimensionality
  – Why is this important?

Spectrum of search
• Local
  – Molecular Mechanics
• Short - Medium
  – Monte Carlo Simulated Annealing
  – Brownian Dynamics
  – Molecular Dynamics
• Global
  – Docking

Details of search
Level-of-Detail
• Atom types
• Terms of force field
  – Bond stretching
  – Bond-angle bending
  – Torsional potentials
  – Polarizability terms
  – Implicit solvation

Kinds of search
Systematic
• Exhaustive
• Deterministic
• Dependent on granularity of sampling
• Feasible only for low-dimensional problems
• DOF, 6D search

Kinds of search
Stochastic
• Random
• Outcome varies
• Repeat to improve chances of success
• Feasible for higher-dimensional problems
• AutoDock, < ~40D search

Stochastic search methods
• Simulated Annealing (SA)
• Evolutionary Algorithms (EA)
  – Genetic Algorithm (GA)
• Others
  – Tabu Search (TS)
• Hybrid Global-Local Search
  – Lamarckian GA (LGA)

Simulated annealing
• One copy of the ligand (Population = 1)
• Starts from a random or specific position/orientation/conformation (= state)
• Constant temperature annealing cycle (accepted & rejected moves)
• Temperature reduced before next cycle
• Stops at maximum cycles

Search parameters
Simulated Annealing
• Initial temperature (K)
• Temperature reduction factor (K-1cycle)
• Termination criteria:
  – accepted moves
  – rejected moves
  – cycles

Genetic function algorithm
• Start with a random
population (50-200)
• Perform crossover (sex: two parents -> 2 children) and mutation (cosmic rays: one individual gives 1 mutant child)
• Compute the fitness of each individual
• Proportional selection & elitism
• A new generation begins unless the total number of energy evals or the maximum number of generations has been reached

Search parameters
• Population size
• Crossover rate
• Mutation rate
• Local search
  – energy evals
• Termination criteria
  – energy evals
  – generations

Dimensionality of molecular docking
• Degrees of Freedom (DOF)
• Position or Translation
  – (x, y, z) = 3
• Orientation or Quaternion
  – (qx, qy, qz, qw) = 4
• Rotatable Bonds or Torsions
  – (tor1, tor2, … tor n) = n
• Total DOF, or Dimensionality, D = 3 + 4 + n

Docking score
$\Delta G_{binding} = \Delta G_{vdW} + \Delta G_{elec} + \Delta G_{hbond} + \Delta G_{desolv} + \Delta G_{tors}$
• $\Delta G_{vdW}$: 12-6 Lennard-Jones potential
• $\Delta G_{elec}$: Coulombic with Solmajer dielectric
• $\Delta G_{hbond}$: 12-10 potential with Goodford directionality
• $\Delta G_{desolv}$: Stouten pairwise atomic solvation parameters
• $\Delta G_{tors}$: number of rotatable bonds

Molecular mechanics: theory
• Considering the simple harmonic approximation, the potential energy of molecules is given by
  $V = V_{Bond} + V_{Angle} + V_{Torsion} + V_{vdw} + V_{elec} + V_{op}$
• $V_{Bond} = \frac{1}{2} K_r (r_{ij} - r_0)^2$
• where $K_r$ is the stretching force constant
• $V_{Angle} = \frac{1}{2} K_\theta (\theta_{ijk} - \theta_0)^2$
• where $K_\theta$ is the bending force constant
• $V_{Torsion} = \frac{V}{2} \left[ 1 + \cos n(\phi + \phi_0) \right]$
• where $V$ is the barrier to rotation and $\phi$ is the torsional angle

Molecular mechanics: theory
• A Lennard-Jones type 6-12 potential is used to describe non-bonded and weak interactions
• $V_{vdw} = \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}$
• A simple Coulombic potential is used to describe the electrostatic interaction
• $V_{elec} = \frac{q_i q_j}{r_{ij}}$
• Out-of-plane bending/deformation is described by the following expression
• $V_{op} = 0.5\, K_{op}\, \chi^2$

The forcefield
• The purpose of a forcefield is to describe the potential energy surface of entire classes of molecules with reasonable accuracy
• In a sense, the forcefield extrapolates from the empirical data of the small set of models used to parameterize it to a larger set of related
models
• Some forcefields aim for high accuracy for a limited set of elements, thus enabling good predictions of many molecular properties
• Others aim for the broadest possible coverage of the periodic table, with necessarily lower accuracy

Components of a forcefield
• The forcefield contains all the necessary elements for calculations of energy and force:
  – A list of forcefield types
  – A list of partial charges
  – Forcefield-typing rules
  – Functional forms for the components of the energy expression
  – Parameters for the function terms
  – For some forcefields, rules for generating parameters that have not been explicitly defined
  – For some forcefields, a way of assigning functional forms and parameters

The energy expression
Valence interactions
• The energy of valence interactions is generally accounted for by diagonal terms:
  – bond stretching (bond)
  – valence angle bending (angle)
  – dihedral angle torsion (torsion)
  – inversion, also called out-of-plane interactions (oop) terms, which are part of nearly all forcefields for covalent systems
  – A Urey-Bradley (UB) term may be used to account for interactions between atom pairs involved in 1-3 configurations (i.e., atoms bound to a common atom)
• $E_{valence} = E_{bond} + E_{angle} + E_{torsion} + E_{oop} + E_{UB}$

Non-bond interactions
• The energy of interactions between non-bonded atoms is accounted for by
  – van der Waals (vdW) terms
  – electrostatic (Coulomb) terms
  – hydrogen bond (hbond) terms in some older forcefields
• $E_{non\text{-}bond} = E_{vdW} + E_{Coulomb} + E_{hbond}$

Molecular dynamics (MD) simulations
• A deterministic method based on the solution of Newton's equation of motion, $F_i = m_i a_i$, for the ith particle; the acceleration at each step is calculated from the negative gradient of the overall potential, using
  $F_i = -\operatorname{grad} V_i = -\nabla V_i$
  $V_i = \sum_k$ (energies of interactions between i and all other residues k located within a cutoff distance of $R_c$ from i)

Classical molecular dynamics
• Constituent molecules obey classical laws of motion
• In MD simulation, we have to
solve Newton's equation of motion
• Force calculation is the time-consuming part of the simulation
• MD simulation can be performed in various ensembles
• NVT, NPT and NVE are the ensembles most widely used in MD simulations
• Both quantum and classical potentials can be used to perform MD simulation

Calculation of interaction energy
• The MM total energy can be used to get the interaction energy of the ligands with biomolecules
• In order to compute the interaction energy, calculations have to be performed for the biomolecule, the ligand and the biomolecule-ligand adduct using the same force field
• $E_{int} = E_{complex} - (E_{biomolecule} + E_{ligand})$

Integration of equation of motion and time step
• A key parameter in the integration algorithm is the integration time step
• The time step is related to molecular vibration
• The main limitation is imposed by the highest-frequency motion
• The vibrational period must be split into at least 8-10 segments for models to satisfy the Verlet algorithm's assumption that the velocities and accelerations are constant over the time step used
• In most organic models, the highest vibrational frequency is that of C-H stretching, whose period is of the order of $10^{-14}$ s (10 fs).
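The arithmetic linking the vibrational period to the time step can be sketched in a few lines of Python. This is only an illustration: the function name `max_time_step_fs` is hypothetical, the 10 fs period is the value quoted above, and the 20-segment case is an added assumption to show a more conservative choice.

```python
# Period of the fastest motion (C-H stretch), as quoted in the slide: ~10 fs.
CH_STRETCH_PERIOD_FS = 10.0


def max_time_step_fs(period_fs: float, segments: int) -> float:
    """Largest time step that still splits one vibrational period into `segments` parts."""
    if segments <= 0:
        raise ValueError("segments must be positive")
    return period_fs / segments


if __name__ == "__main__":
    # 8 and 10 segments are the minimum quoted above; 20 is an illustrative stricter choice.
    for segments in (8, 10, 20):
        dt = max_time_step_fs(CH_STRETCH_PERIOD_FS, segments)
        print(f"{segments:>2} segments per period -> time step <= {dt:.2f} fs")
```

Splitting the 10 fs period into 10 to 20 segments gives steps of 1 fs down to 0.5 fs.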
• Therefore, the integration step should be 0.5-1 fs

Stages and duration in MD simulation
• Dynamics simulations are usually carried out in two stages: equilibration and data collection
• The purpose of equilibration is to prepare the system so that it comes to the most probable configuration consistent with the target temperature and pressure
• For a large system, equilibration takes a long time because of the vast conformational space it has to search
• The best way to judge whether a model has equilibrated is to plot various thermodynamic quantities, such as energy, temperature and pressure, versus time
• When the system is equilibrated, these quantities fluctuate around their averages

Durations of some real molecular events
Event                         Approximate duration
Bond stretching               1-20 fs
Elastic domain modes          100 fs to several ps
Water reorientation           4 ps
Inter-domain bending          10 ps-100 ns
Globular protein tumbling     1-10 ns
Aromatic ring flipping        100 µs to several seconds
Allosteric shifts             2 µs to several seconds
Local denaturation            1 ms to several seconds

Free energy simulations
• Ability to predict binding energy
• Free energy perturbation and thermodynamic integration
• Computational demand and issues related to sampling limit the use of this technique in structure based drug design
• Free energy equation

De novo design of inhibitor for HIV-1 protease
• An impressive example of the application of SBDD is the design of the HIV-1 protease inhibitor

De novo design
• HIV-1 protease is a member of the aspartyl protease family, with two catalytic aspartate residues in its active site
• The structure has a tetracoordinated water molecule that accepts two hydrogen bonds from the backbone amide hydrogens of the isoleucine residues in the flaps
• The same water donates two hydrogen bonds to the carbonyl oxygens of the inhibitor

Application of structure based drug design: HIV protease inhibitors
• The starting point is the series of X-ray structures of the enzyme and the enzyme-inhibitor complex
• The enzyme is made up of two equal halves
• HIV protease is a symmetrical molecule with two equal halves
and an active site near its center, like a butterfly
• For most such symmetrical molecules, both halves have a "business area," or active site, that carries out the enzyme's job
• But HIV protease has only one such active site, in the center of the molecule where the two halves meet

Structure based drug design: HIV protease inhibitors
• If the single active site is plugged with a small molecule, it is possible to shut down the whole enzyme and, theoretically, stop the virus' spread in the body
• Several inhibitors have been designed based on
  – Peptidic inhibitors
  – Peptidomimetic compounds
  – Non-peptide inhibitors
• Further work has demonstrated the success of this approach

Some examples
• Ritonavir (trade name Norvir) is one of a class of anti-HIV drugs called protease inhibitors
• Saquinavir
• Indinavir is another example of a very potent peptidomimetic compound discovered using the elements of 3D structure and Structure Activity Relationship (SAR)

De novo design…
• The first step was a 3D database search of a subset of the Cambridge Structural Database
• The pharmacophore for this search comprised two hydrophobic groups and a hydrogen bond donor or acceptor
• The hydrophobic groups were intended to bind in the S1 and S1' pockets, and the hydrogen-bonding group to bind to the catalytic Asp residues

De novo design…
• The search yielded a hit that contained the desired elements of the pharmacophore and also had an oxygen that could replace the bound water molecule
• The benzene ring in the original compound was changed to a cyclohexanone, which was able to position substituents in a more fitting manner
• The DuPont Merck group had explored a series of peptide-based diols that were potent inhibitors but had poor oral bioavailability

De novo design
• They retained the diol functionality and expanded the six-membered ring to a seven-membered diol
• The ketone was changed to a cyclic urea to enhance the hydrogen bonding to the flaps and to help synthesis
• The compound chosen for further studies, including clinical trials, was the
p-hydroxymethylbenzyl derivative

[Figure: design sequence for the HIV protease inhibitor. Panels: 3D pharmacophore (P1, P1', H-bond donor or acceptor; distances 3.5-6.5 Å, 8.5-12 Å, 3.5-6.5 Å); 3D hit; Symmetric diol docked into HIV active site; Initial idea for inhibitor; Expand ring to give diol and incorporate urea; Stereochemistry required for optimal binding; Final molecule selected for clinical trials]

Host-Guest Interactions with Collagen: As molecules Dominated by Geometrical factors and Solvent Accessible Volumes

Energy minimized structure of 24mer collagen triple helix

Complex formation of polyphenols at various collagen sites
• Asparagine of T.Helix and gallic acid
• Aspartic acid of T.Helix and catechin
• Lysine of T.Helix and epigallocatechin gallate

Binding energies of different complexes between polyphenols and triple helix
Binding energy (kcal/mol)
Binding sites in triple helix         Gal     Cat     EGCG    PGG
6th residue Hyp of A-chain (α1)       16.5    22.5    35.2    56.6
12th residue Lys of B-chain (α1)      14.5    20.8    34.5    48.4
21st residue Asp of A-chain (α1)      19.2    23.8    37.9    41.1
17th residue Asn of C-chain (α2)      18.4    20.0    38.2    59.8
9th residue Ser of C-chain (α2)       14.1    23.7    34.3    52.8
Gal = gallic acid, Cat = catechin, EGCG = epigallocatechin gallate, PGG = pentagalloyl glucose

[Figure: Interacting interfacial volume (Å3) vs binding energy of the collagen-polyphenol complex]
[Figure: Effective solvent inaccessible contact volume vs binding energy of the collagen-polyphenol complex. Inset: effective solvent inaccessible contact surface area vs binding energy of the complex]
[Figure: Plot of inverse of interacting interfacial volume (1/Int.Vol.) vs inverse of binding energy (1/B.E.) of the complexes]

Acknowledgement
• Mr. R. Parthasarathi
• Mr. B. Madhan
• Mr. J. Padmanabhan
• Mr. M. Elango
• Mr. S. Sundar Raman
• Mr. R. Vijayraj
• CSIR & DST, GOI

Big Thank You
Others have done the work. Some have used the work. I have spoken only on their behalf.