* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Protein Structure Prediction
Biosynthesis wikipedia , lookup
Expression vector wikipedia , lookup
Gene expression wikipedia , lookup
Magnesium transporter wikipedia , lookup
Point mutation wikipedia , lookup
Genetic code wikipedia , lookup
Ribosomally synthesized and post-translationally modified peptides wikipedia , lookup
G protein–coupled receptor wikipedia , lookup
Ancestral sequence reconstruction wikipedia , lookup
Interactome wikipedia , lookup
Protein purification wikipedia , lookup
Western blot wikipedia , lookup
Metalloprotein wikipedia , lookup
Structural alignment wikipedia , lookup
Two-hybrid screening wikipedia , lookup
Protein–protein interaction wikipedia , lookup
Protein Structure Prediction Protein Sequence + Dr. G.P.S. Raghava Structure Protein Structure Prediction • Experimental Techniques – X-ray Crystallography – NMR • Limitations of Current Experimental Techniques – Protein DataBank (PDB) -> 23000 protein structures – SwissProt -> 100,000 proteins – Non-Redudant (NR) -> 10,00,000 proteins • Importance of Structure Prediction – Fill gap between known sequence and structures – Protein Engg. To alter function of a protein – Rational Drug Design Different Levels of Protein Structure Protein Architecture • Proteins consist of amino acids linked by peptide bonds • Each amino acid consists of: – – – – a central carbon atom an amino group a carboxyl group and a side chain • Differences in side chains distinguish the various amino acids Amino Acid Side Chains Vary in: • Size • Shape • Polarity Peptide Bond Peptide Bonds Dihedral Angles Conformation Flexibility • Backbone (main chain of atoms in peptide bonds, minus side chains) conformation: – Torsion or rotation angles around: • C-N bond () • C-C bond () – Sterical hinderance: • Most – Pro • Least - Gly Ramachandran Plot Protein Secondary Structure Secondary Structure Regular Secondary Structure (-helices, sheets) Irregular Secondary Structure (Tight turns, Random coils, bulges) Secondary Structure: Helices ALPHA HELIX : a result of H-bonding between every fourth peptide bond (via amino and carbonyl groups) along the length of the polypeptide chain Individual Amino acid H-bond Helix formation is local THYROID hormone receptor (2nll) residues i and i+3 Secondary Structure: Beta Sheets BETA PLEATED SHEET: a result of H-bonding between polypeptide chains -sheet formation is NOT local Erabutoxin (3ebx) Definition of -turn A -turn is defined by four consecutive residues i, i+1, i+2 and i+3 that do not form a helix and have a C(i)-C(i+3) distance less than 7Å and the turn lead to reversal in the protein chain. (Richardson, 1981). The conformation of -turn is defined in terms of and of two central residues, i+1 and i+2 and can be classified into different types on the basis of and . i+1 i i+2 H-bond D <7Å i+3 Tight turns Type No. of residues H-bonding -turn 2 NH(i)-CO(i+1) -turn 3 CO(i)-NH(i+2) -turn 4 CO(i)-NH(i+3) -turn 5 CO(i)-NH(i+4) -turn 6 CO(i)-NH(i+5) Secondary Structure shortcuts Tertiary Structure: Hexokinase (6000 atoms, 48 kD, 457 amino acids) polypeptides with a tertiary level of structure are usually referred to as globular proteins, since their shape is irregular and globular in form Quarternary Structure: Haemoglobin What determines fold? • Anfinsen’s experiments in 1957 demonstrated that proteins can fold spontaneously into their native conformations under physiological conditions. This implies that primary structure does indeed determine folding or 3-D stucture. • Some exceptions exist – Chaperone proteins assist folding – Abnormally folded Prion proteins can catalyze misfolding of normal prion proteins that then aggregate Levels of Description of Structural Complexity • Primary Structure (AA sequence) • Secondary Structure – Spatial arrangement of a polypeptide’s backbone atoms without regard to side-chain conformations • , , coil, turns (Venkatachalam, 1968) – Super-Secondary Structure • , , /, + (Rao and Rassman, 1973) • Tertiary Structure – 3-D structure of an entire polypeptide • Quarternary Structure – Spatial arrangement of subunits (2 or more polypeptide chains) Techniques of Structure Prediction • Computer simulation based on energy calculation – Based on physio-chemical principles – Thermodynamic equilibrium with a minimum free energy – Global minimum free energy of protein surface • Knowledge Based approaches – Homology Based Approach – Threading Protein Sequence – Hierarchical Methods Energy Minimization Techniques Energy Minimization based methods in their pure form, make no priori assumptions and attempt to locate global minma. • Static Minimization Methods – Classical many potential-potential can be construted – Assume that atoms in protein is in static form – Problems(large number of variables & minima and validity of potentials) • Dynamical Minimization Methods – Motions of atoms also considered – Monte Carlo simulation (stochastics in nature, time is not cosider) – Molecular Dynamics (time, quantum mechanical, classical equ.) • Limitations – large number of degree of freedom,CPU power not adequate – Interaction potential is not good enough to model Molecular Dynamics • Provides a way to observe the motion of large molecules such as proteins at the atomic level – dynamic simulation • Newton’s second law applied to molecules • Potential energy function – Molecular coordinates – Force on all atoms can be calculated, given this function – Trajectory of motion of molecule can be determined Knowledge Based Approaches • Homology Modelling – – – – Need homologues of known protein structure Backbone modelling Side chain modelling Fail in absence of homology • Threading Based Methods – – – – – New way of fold recognition Sequence is tried to fit in known structures Motif recognition Loop & Side chain modelling Fail in absence of known example Homology Modeling • Simplest, reliable approach • Basis: proteins with similar sequences tend to fold into similar structures • Has been observed that even proteins with 25% sequence identity fold into similar structures • Does not work for remote homologs (< 25% pairwise identity) Homology Modeling • Given: – A query sequence Q – A database of known protein structures • Find protein P such that P has high sequence similarity to Q • Return P’s structure as an approximation to Q’s structure Threading • Given: – sequence of protein P with unknown structure – Database of known folds • Find: – Most plausible fold for P – Evaluate quality of such arrangement • Places the residues of unknown P along the backbone of a known structure and determines stability of side chains in that arrangement Hierarcial Methods Intermidiate structures are predicted, instead of predicting tertiary structure of protein from amino acids sequence • Prediction of backbone structure – Secondary structure (helix, sheet,coil) – Beta Turn Prediction – Super-secondary structure • Tertiary structure prediction • Limitation Accuracy is only 75-80 % Only three state prediction