Download No Slide Title

Protein Structure Prediction N. Gautham Department of Crystallography and Biophysics University of Madras, Guindy Campus Chennai 600 025 Lecture Outline • The problem of protein structure prediction - its statement • The Levinthal paradox and computational complexity • Methods of structure prediction • Ab initio methods • Genetic algorithms • Potential energy minimisation • MOLS Information Transfer pathway within the cell ……ATGCATGCATGCATGCATGC.. ………CGUACGUACGUACGU………… DNA ………CGUACGUACGUACGU………… RNA DECODING MECHANISM PROTEIN Sequence PROTEIN Structure Biological function Statement of the problem • The sequence of amino acids in a protein determines its three dimensional structure ….AVTYRGSED…. • The structure of a protein is essential for its function Statement of the problem • Structures are determined experimentally using X-ray crystallography and NMR • This is expensive and time-consuming • Instead, can this be done using computers ? • The Problem: Given the sequence of a protein, can we use available information from Physics, Chemistry (and databases of previous structures, etc.) to calculate its three dimensional structure ? Levinthal Paradox • The Levinthal paradox arises when we consider • protein folding as a thermodynamic phenomenon (driven by entropy) This means: - the native fold of a protein is its minimum energy state the protein folds by sampling its conformational space to find the one with least energy - Levinthal Paradox • The time taken to search all possibilities increases exponentially with the size of the protein (by known algorithms) • THE PARADOX: In real life proteins fold in a few milliseconds, though we expect them to take centuries!! • In other words – the problem of protein structure prediction (or protein folding) is NP in terms of computational complexity Levinthal Paradox The ‘Golf Course’ model of the potential energy landscape Levinthal Paradox – The new view of protein folding The ‘folding funnel’ model of the potential energy landscape Computational complexity • If an algorithm is such that the computation time increases as a polynomial function of the size of the problem – it is a ‘Polynomial time’ algorithm. It belongs to the set P e.g. Time = const x (size)2 + const x (size)5 • If an algorithm is such that the computation time increases as an exponential function of the size of the problem – it is a ‘NonPolynomial time’ algorithm. It belongs to the set NP e.g. Time = const x 2.5size Computational complexity • The Travelling Salesman problem – an example of a problem in NP City 3 City 1 City 6 City 2 City 4 City 5 • Problem – Find the path with the least distance that covers all cities at least once. • The number of paths to be tried increases an exponential function of the number of cities as Computational complexity • Problems in computational biology that are in NP : • Construction of phylogenetic trees A B A C A B A B B D C D D C • Multiple sequence alignment • Protein Structure Prediction Methods of Protein Structure Prediction • Homology modelling • Fold recognition • Ab initio methods • Genetic algorithms • Potential energy minimisation • MOLS Ab initio methods: Genetic algorithms • a.k.a. ‘Evolutionary Computation’ • The method operates on pieces of information • (like Nature on genes) Start with a group of individuals (binary coded ?) that represent possible solutions to the problem • Apply mutation, variation and crossover operators to the individuals • From the resulting population, select individuals with high values of fitness to populate next generation • Iterate till best individual is obtained Genetic algorithms: Application to Protein Structure Prediction • The initial generation consisted of protein structures with random choice of torsion angle values • The fitness function was a semi-empirical potential energy function, i.e. EvdW + Eel + Etor + Epseudoentropy • The mutate operator randomly changed torsion angle values • The variate operator made small, random increments or decrements to torsion angle values • The crossover operator interchanged portions of randomly selected pairs of individuals Potential Energy Minimization • Minimize Potential Energy (Least squares, Conjugate Gradient, Molecular Dynamics….) • The problem – where to start? How to avoid local minima? • Many methods - Build-up method - Conformational Space Annealing - Monte Carlo Minimization - Diffusion Equation and Distance Scaling - Simulated Annealing - ……. Potential Energy Minimization • Build-up method 5 1 Step 1 Minimize 10 6 1 Step 2 10 Minimize Mutually Orthogonal Latin Squares OBJECTIVE: To build a library of the lowest energy conformations of an oligopeptide Mutually Orthogonal Latin Squares METHOD (IN BRIEF): The MOLS cycle Parameterize the search space Use these to build a set of MOLS (chosen at random) to globally sample the space Analyze the samples to obtain a low energy conformation (This is followed by gradient minimization) Yes Another low energy conformation? Mutually Orthogonal Latin Squares Results: The (23) best structures for Met-enkephalin GA MOLS Initial population of 50 individual structures for the sequence Sequence is split into overlapping fragments of five/seven/nine residues Mutations Conformational search for all fragments using MOLS yielding ~ 1000 structures each Resulting structures are clustered A library of structures for each fragment Variations Structures from MOLS libraries Crossing over Selection of individuals with lower energy Avian pancreatic polypeptide 36 residues RMSD 4.0 A Prediction Experiment (X-ray crystallography) Villin headpiece 36 residues RMSD 5.2 A Prediction Experiment (X-ray crystallography) Tryptophan zipper 16 residues RMSD 2.7 A Prediction Experiment (NMR) Bovine Pancreatic Trypsin Inhibitor 58 residues 3 disulphide bridges Prediction Experiment (X-ray) Protein Structure Prediction: Conclusion • If the sequence of the protein of unknown structure has greater than 40% identity with one of known structure, the structure prediction problem may be considered solved – especially with the structural genomics initiative • Ab initio structure prediction, using knowledge only of sequence, and of physics and chemistry, is as yet an unsolved problem

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download No Slide Title