
Predicting Protein Structure: Comparative Modeling (formerly homology modeling)

Sequence labeled 1alc (structure shown as "?"):
KQFTKCELSQNLYDIDGYGRIALPELICTMF
HTSGYDTQAIVENDESTEYGLFQISNALWCK
SSQSPQSRNICDITCDKFLDDDITDDIMCAK
KILDIKGIDYWIAHKALCTEKLEQWLCEKE

Sequence labeled 8lyz:
KVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAK
FESNFNTQATNRNTDGSTDYGILQINSRWWCND
GRTPGSRNLCNIPCSALLSSDITASVNCAKKIV
SDGNGMNAWVAWRNRCKGTDVQAWIRGCRL

Homologous: the two share similar sequence → use as template & model.

Structure prediction
• In an ideal world, we would be able to accurately predict protein structure from the sequence alone.
• Because of the myriad possible configurations of a protein chain, this goal cannot yet be reliably achieved.
• Knowledge-based prediction vs. simulation based on physical forces.
• Here we concern ourselves only with knowledge-based methods, although we might use simulation to optimize our models.

Can we predict protein structures?
MNIFEMLRID HLLTKSPSLN DEAEKLFNQD LDAVRRCALI LQQKRWDEAA TTFRTGTWDA EGLRLKIYKD AAKSELDKAI VDAAVRGILR NMVFQMGETG VNLAKSRWYN YKNL TEGYYTIGIG GRNCNGVITK NAKLKPVYDS VAGFTNSLRM QTPNRAKRVI
• ab initio folding simulation: not yet ...
• Rosetta approach: neither ...
• Fold recognition (threading): often works, but ...
• ???

Approaches to predicting protein structures
obtain sequence (target) → fold assignment → comparative modeling / ab initio modeling → build, assess model

Homology Modelling of Proteins
• Definition: Prediction of the three-dimensional structure of a target protein from its amino acid sequence (primary structure), using a homologous (template) protein for which an X-ray or NMR structure is available.
• Why a Model: A model is desirable when neither X-ray crystallography nor NMR spectroscopy can determine the structure of a protein in time, or at all. The built model provides a wealth of information on how the protein functions, down to the level of individual residue properties.
  This information can then be used for mutational studies or for drug design.

Homology modeling = Comparative protein modeling = Knowledge-based modeling
Idea: Extrapolation of the structure for a new (target) sequence from the known 3D structures of related family members (templates).

Homology models can be very smart!
Homology models have RMSDs of less than 2 Å more than 70% of the time.

Sequence similarity implies structural similarity?
[Figure: percentage sequence identity/similarity (0 to 100) plotted against number of residues aligned (0 to 250). One region is labeled "Sequence identity implies structural similarity", the other "Don't know region". (B. Rost, Columbia, New York)]

Step 1 in Homology Modeling: Fold Identification
Aim: To find a template structure, or several template structures, in the protein database.
• Pairwise sequence alignment finds sequences with high homology: BLAST http://www.ncbi.nlm.nih.gov/BLAST/
• Multiple sequence alignment methods improve sensitivity for remote homologs: PSIBLAST, CLUSTAL

Comparative Modeling
Known Structures (Templates)
• Protein Data Bank (PDB) http://www.pdb.org → database of templates
• Separate into single chains
• Remove bad structures (models)
• Create BLAST database
Target Sequence → Template Selection → Alignment Template - Target → Structure modeling → Homology Model(s) → Structure Evaluation & Assessment

Model Building from template
• Core conserved regions (protein fold)
• Variable loop regions
• Side chains
Multiple templates:
• Calculate the framework from the average of all template structures, or
• Generate one model for each template and evaluate

I. Manual Modeling [ http://www.expasy.org/spdbv/ ]

II. Template based fragment assembly
a) Build conserved core framework
• Average the core template backbone atoms (weighted by local sequence similarity with the target sequence)
• Leave non-conserved regions (loops) for later ...

Dressing up the Core Model
• Core model: rigid body assembly
• Add loops
• Add side chains

End Game in protein folding
Molecular dynamics of all atoms in explicit solvent

II. Template based fragment assembly
b) Loop modeling
• Use the "spare part" algorithm to find compatible fragments in a loop database
• "Ab initio" rebuilding of loops (Monte Carlo, molecular dynamics, genetic algorithms, etc.)
Loops result from substitutions, insertions and deletions within the same family.

Loop Builders
• A mini protein folding problem: 3 to 10 residues, longer in membrane proteins
• Ab initio methods generate various random conformations of the loop and score them
• Database methods compare the loop sequence string to the database, collect the hits and evaluate them
• Some homology modeling methods have fewer loops to add because they use extensive multiple sequence alignment of profiles

Construction of loops might be done by:
• Using a database of loops that appear in known structures. The loops can be categorised by their length or sequence.
• Ab initio methods, without any prior knowledge. These use empirical scoring functions that generate a large number of conformations and evaluate each of them.

II. Template based fragment assembly
c) Side chain placement
Find the most probable side chain conformation, using
• homologous structures
• backbone-dependent rotamer libraries
• energetic and packing criteria

II. Template based fragment assembly
d) Energy minimization
• Modeling will produce unfavorable contacts and bonds → idealization of local bond and angle geometry
• Extensive energy minimization will move the coordinates away → keep it to a minimum
• Swiss-Model uses the GROMOS 96 force field for steepest-descent minimization

Homology Modeling Programs
• Modeller (http://guitar.rockefeller.edu/modeller)
• Swiss-Model (http://www.expasy.ch/swissmod)
• Whatif (http://www.cmbi.kun.nl/whatif)

Swiss-Model
• Method: Knowledge-based approach.
• Requirements: At least one known 3D structure of a related protein. Good-quality sequence alignments.
• Procedures:
   Superposition of related 3D structures.
   Generation of a multiple alignment.
   Generation of a framework for the new sequence.
   Rebuild lacking loops.
   Complete and correct backbone.
   Correct and rebuild side chains.
   Verify model structure quality and check packing.
   Refine structure by energy minimisation and molecular dynamics.

Model Confidence Factors
The Model B-factors are determined as follows:
• The number of template structures used for model building.
• The deviation of the model from the template structures.
• The Distance trap value used for framework building.
The Model B-factor is computed as:
   85.0 * (1 / number of selected template structures) * (Distance trap / 2.5)
and 99.9 for all atoms added during loop and side-chain building.

Verifying the Model
• PROCHECK
• WHAT IF
• PROSA II
• VERIFY 3D, Profile3D

Errors in Models !!!
• Incorrect template selection
• Incorrect alignments
• Errors in positioning of sidechains and loops

General Structure Prediction Scheme
Any given protein sequence
→ Check sequence identity with proteins with known structure
   > 35%: Homology Modeling
   < 35%: Fold Recognition
   < 35%: ab initio Folding
→ Structure selection
→ Structure refinement
→ Final Structure
Baker and Sali (2000)

Model Accuracy Evaluation
• CASP: Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction
  http://PredictionCenter.llnl.gov/casp5/
• EVA: Evaluation of Automatic protein structure prediction
  [ Burkhard Rost, Andrej Sali, http://maple.bioc.columbia.edu/eva/ ]
• 3D-Crunch: Very Large Scale Protein Modelling Project
  http://www.expasy.org/swissmod/SM_LikelyPrecision.html

Several web pages for homology modeling
• COMPOSER: felix.bioccam.ac.uksoft-base.html
• MODELLER: guitar.rockefeller.edu/modeller/modeller.html
• WHAT IF: www.sander.embl-heidelberg.de/whatif/
• SWISS-MODEL: www.expasy.ch/SWISS-MODEL.html
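
Worked example: Model B-factor
The Model B-factor formula quoted under Model Confidence Factors is simple arithmetic, and a small sketch may make it easier to see how the two inputs interact. The function name, parameter names and the boolean flag below are hypothetical and chosen only for illustration; this is not Swiss-Model code, just the quoted formula written out in Python.

```python
def model_b_factor(n_templates, distance_trap, added_during_building=False):
    """Confidence value per the formula quoted in the slides (names are illustrative).

    n_templates: number of selected template structures (must be >= 1).
    distance_trap: Distance trap value used for framework building.
    added_during_building: True for atoms added during loop or
        side-chain building, which receive the fixed value 99.9.
    """
    if added_during_building:
        # Atoms with no template support get the fixed value from the slides.
        return 99.9
    if n_templates < 1:
        raise ValueError("at least one template structure is required")
    # 85.0 * (1 / number of selected templates) * (Distance trap / 2.5)
    return 85.0 * (1.0 / n_templates) * (distance_trap / 2.5)


if __name__ == "__main__":
    # One template with a distance trap of 2.5 gives the base value.
    print(model_b_factor(1, 2.5))
    # Two templates with the same distance trap halve it.
    print(model_b_factor(2, 2.5))
    # A loop atom gets the fixed value regardless of the other inputs.
    print(model_b_factor(2, 2.5, added_during_building=True))
```

Under this reading, more supporting templates lower the value and a larger distance trap raises it, so lower numbers indicate regions of the model with stronger template support.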
