* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download tutorial10_3D_structure
List of types of proteins wikipedia , lookup
Bimolecular fluorescence complementation wikipedia , lookup
Circular dichroism wikipedia , lookup
Protein design wikipedia , lookup
Alpha helix wikipedia , lookup
Western blot wikipedia , lookup
Protein mass spectrometry wikipedia , lookup
Rosetta@home wikipedia , lookup
Trimeric autotransporter adhesin wikipedia , lookup
Protein purification wikipedia , lookup
Protein folding wikipedia , lookup
Intrinsically disordered proteins wikipedia , lookup
Protein–protein interaction wikipedia , lookup
Protein domain wikipedia , lookup
Nuclear magnetic resonance spectroscopy of proteins wikipedia , lookup
Protein structure prediction wikipedia , lookup
Protein Tertiary Structure Protein Data Bank (PDB) • Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids: ~87,000 structures. • The data is typically obtained by X-ray crystallography or NMR (Nuclear magnetic resonance) spectroscopy and submitted by biologists and biochemists from around the world. • Freely accessible. Accession number PDB file Java based visualization tools 2ndary structure PDB file example: A PDB file can be viewed by different visualization tools , such as Pymol Protein, chain, domain • Here is a protein compound by 4 chains. • Which protein is that? Protein, chain, domain • One chain may have multiple domains. • A protein domain is a conserved part of a given protein sequence and structure that can evolve, function, and exist independently of the rest of the protein chain. • Each domain has a stable 3D structure. Protein domain classifications • Scientists have tried to classify proteins by their structural properties into a tree-like hierarchy. • The 2 most used domain classifications are CATH and SCOP. CATH: Protein Domain Structure Classification Class, Architecture, Topology and Homology •Class: The secondary structure composition: mainlyalpha, mainly-beta and alpha-beta. • Architecture: The overall shape of the domain structure. Orientations of the secondary structures : e.g. barrel or 3layer sandwich. • Topology: Structures are grouped into fold groups at this level depending on both the overall shape and connectivity of the secondary structures. •Homologous structures Superfamily: Evolutionary conserved http://www.cathdb.info/ CATH: Protein Domain Structure Classification Class, Architecture, Topology and Homology SCOP Structural Classification of Proteins http://scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.html Based on known protein structures •Manually created by visual inspection •Hierarchical database structure: –Class, Fold, Superfamily, Family, Protein and Species Node Parents of node Children of node Protein structure alignment • Structural alignment attempts to establish homology between two or more protein structures based on their 3D conformation. • Structural alignment often implies evolutionary relationships between proteins with low seq-id. Sequence – structure relations • Similar sequences Similar structures. • Different sequences ??? • Different sequences that fold into similar structures are most interesting, since they imply a common origin. • This is what we aim to find Protein structure alignment • Alignment tools try to superimpose the 2 structures, so that the distance between them is minimal. • The distance measure is RMSD - Root Mean Square Deviation. • Given two sets of n points v and w, the RMSD is defined as follows: n 1 RMSD(v, w) vi wi n i 1 2 Protein structure alignment • The structural alignment servers do LOCAL structural alignment. • They try to align larger stretches of protein backbone with minimal RMSD. • Thus, another parameter to assess the quality of the alignment is the alignment length. Protein structure alignment similar structures • Low RMSD _________ dissimilar structures • Low alignment length _________ • SAS score = 100*RMSD/(alignment length) similar structures • Low SAS _________ Structure alignment servers Dalilite: http://www.ebi.ac.uk/Tools/structure/dalilite/ • 1XIS and 1NAR have only 7% sequence identity, but they are structurally similar. • We will download their pdb files from the PDB, and structurally align them using Dalilite. Insert PDB files This file can be loaded to Pymol viewer Food for thought How can structure alignment help us in structure prediction? Structure prediction • Input: protein sequence; • Output: protein 3D structure. • This is a VERY difficult task. • CASP: Critical Assessment of Techniques for Protein Structure Prediction • Worldwide experiment for protein structure prediction taking place every two years. Structure prediction Comparative Modeling uses previously solved structures as starting points, or templates. Homology modeling: searches similarity in sequences with known structures. Protein threading: sequence to structure alignment, against a database of ‘templates’ – known structures. Ab Initio Modeling build 3D protein models "from scratch", i.e., based on physical principles rather than on previously solved structures. I-TASSER structure prediction server • based on multiple-threading alignments • I-TASSER (as 'Zhang-Server') was ranked as the No 1 server for protein structure prediction in recent CASP7, CASP8, CASP9, and CASP10 experiments. I-TASSER results I-TASSER results I-TASSER results