Protein Structure Predictions 1

Roadmap
The topics:
- basic concepts of molecular biology
- more on Perl
- overview of the field
- biological databases and database searching
- sequence alignments
- phylogenetics
- structure prediction
- microarray data analysis

Protein Synthesis
[Figure: protein synthesis. Source: The National Health Museum]

Proteins
Proteins perform a vast array of biological functions, including:
- Transport: hemoglobin (carries O2 from the lungs to the tissues)
- Mechanical support: collagen
- Storage: ferritin (stores iron)
- Regulation: repressor proteins (gene expression)
- Antibodies: immunoglobulin
- Catalysis: SOD (superoxide dismutase)
- …
- Misfolding: mad cow disease, Alzheimer's disease, …

Amino acid composition
Basic amino acid structure: the side chain, R, varies for each of the 20 amino acids.
[Figure: amino acid structure, with the amino group, the carboxyl group, and the side chain R attached to the central carbon]

The Peptide Bond
- Formed by dehydration synthesis
- Polypeptide with repeating backbone: N–Cα–C–N–Cα–C

Side chain properties
What gives amino acids their different properties?
- Carbon does not readily form hydrogen bonds with water: hydrophobic
- O and N are generally more likely than C to hydrogen-bond to water: hydrophilic
The amino acids form three general groups:
- Hydrophobic
- Polar
- Charged (positive/basic and negative/acidic)

The Hydrophobic Amino Acids
- Proline severely limits allowable conformations!

The Charged Amino Acids
[Figure: Krane & Raymer]

The Polar Amino Acids
[Figure: Krane & Raymer]

More Polar Amino Acids

Peptidyl polymers
- A few amino acids in a chain are called a polypeptide.
- A protein is usually composed of 50 to 400+ amino acids.
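
The three side-chain groups described above can be tallied for any sequence. The sketch below is only illustrative: the slides that listed the members of each group were figures, so the assignment of residues to groups here is an assumption (one common textbook convention; histidine, glycine, cysteine, and tyrosine are placed differently by different authors). The function name `group_composition` is hypothetical.

```python
# Assumed grouping of the 20 amino acids (one-letter codes).
# This is one common convention, not necessarily the one on the original slides.
SIDE_CHAIN_GROUPS = {
    "hydrophobic": set("AVLIMFWPG"),
    "polar": set("STNQYC"),
    "charged": set("DEKRH"),  # D, E acidic; K, R, H basic
}


def group_composition(sequence: str) -> dict:
    """Count how many residues of a sequence fall into each side-chain group."""
    counts = {group: 0 for group in SIDE_CHAIN_GROUPS}
    for residue in sequence.upper():
        for group, members in SIDE_CHAIN_GROUPS.items():
            if residue in members:
                counts[group] += 1
                break
        else:
            # Not one of the 20 standard one-letter codes.
            raise ValueError(f"unknown residue: {residue!r}")
    return counts


if __name__ == "__main__":
    # An illustrative sequence fragment.
    print(group_composition("AGVGTVPMTAYGNDIQYYGQVT"))
```

Keeping the grouping in a single dictionary makes it easy to swap in a different classification without touching the counting logic.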

Primary & Secondary Structure
- Primary structure = the linear sequence of amino acids comprising a protein: AGVGTVPMTAYGNDIQYYGQVT…
- Secondary structure: regular patterns of hydrogen bonding in proteins result in two patterns that emerge in nearly every known protein structure, the α-helix and the β-sheet.
- The location and direction of these periodic, repeating structures is known as the secondary structure of the protein.

Levels of Protein Structure
- Secondary structure elements combine to form tertiary structure
- Quaternary structure occurs in multi-enzyme complexes
- Many proteins are active only as homodimers, homotetramers, etc.

Dihedral angles
[Figure: backbone dihedral angles φ and ψ]

α-Helix
- Most abundant secondary structure
- 3.6 amino acids per turn
- Hydrogen bond formed between every fourth residue (residue i and residue i+4)
- Average length: 10 amino acids, or 3 turns
- Varies from 5 to 40 amino acids

α-Helix
- Normally found on the surface of protein cores
- Interacts with the aqueous environment
  - Inner-facing side has hydrophobic amino acids
  - Outer-facing side has hydrophilic amino acids
- Every third amino acid tends to be hydrophobic
- Pattern can be detected computationally
- Rich in alanine (A), glutamic acid (E), leucine (L), and methionine (M)
- Poor in proline (P), glycine (G), tyrosine (Y), and serine (S)

β-Sheet
- Hydrogen bonds between 5-10 consecutive amino acids in one portion of the chain and another 5-10 farther down the chain
- Interacting regions may be adjacent with a short loop, or far apart with other structures in between
- Directions:
  - Same: parallel sheet
  - Opposite: anti-parallel sheet
  - Mixed: mixed sheet
- Alpha carbons (and R side groups) alternate above and below the sheet
- Prediction is difficult, due to the wide range of φ and ψ angles

[Figures: Ramachandran plot (α); Ramachandran plot (β); Ramachandran plot: helices and sheets]

Loop
- Regions between helices and sheets
- Various lengths and three-dimensional configurations
- Located on the surface of the structure
- Hairpin loops: a complete turn in the polypeptide chain (antiparallel sheets)
- More variable in sequence and structure
- Tend to have charged and polar amino acids

Coil
- Region of secondary structure that is not a helix, sheet, or loop

Determining Protein Structure
- There are on the order of 100,000 distinct proteins in the human proteome.
- Two methods for revealing the positions of atoms in 3D:
  - X-ray crystallography
    - X-ray diffraction pattern + mathematical reconstruction
    - A good protein crystal and good diffraction resolution are needed
  - Nuclear magnetic resonance (NMR)
    - Small proteins only (< 250 residues)
    - Inter-proton distances + geometric constraints

Bovine Ribonuclease
- Christian Anfinsen, 1957

Disulfide Bonds
- Two cysteines in close proximity will form a covalent bond
- Called a disulfide bond, disulfide bridge, or dicysteine bond
- Significantly stabilizes tertiary structure
- "Principles that govern the folding of protein chains", Christian Anfinsen, Science 1973

Ribonuclease Disulfide Bonds

# of cysteines   # of S-S bonds   # of combinations
4                2                3
6                3                15
8                4                105
10               5                945
12               6                10395

Levinthal's paradox
- How do proteins find the right conformation out of the practically endless number of three-dimensional forms they could randomly fold into?
- Consider a 100-residue protein. If each residue can take only 3 positions, there are 3^100 ≈ 5 × 10^47 possible conformations.
- If it takes 10^-13 s to convert from one structure to another, an exhaustive search would take about 1.6 × 10^27 years!
- Current Opinion in Structural Biology, 2004, 14, 70-75

What determines fold?
- Anfinsen's experiments in 1957 demonstrated that proteins can fold spontaneously into their native conformations under physiological conditions.
- This implies that primary structure does indeed determine folding, i.e. the 3-D structure.
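
The disulfide pairing counts and the Levinthal estimate above are both simple arithmetic and can be checked directly. The sketch below assumes that every cysteine is paired (so the number of ways to pair 2n cysteines is (2n)! / (2^n n!)) and uses a 365.25-day year; the function names are hypothetical.

```python
from math import factorial

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length


def disulfide_pairings(n_cysteines: int) -> int:
    """Number of ways to pair all cysteines into S-S bonds: (2n)! / (2^n * n!)."""
    if n_cysteines < 0 or n_cysteines % 2:
        raise ValueError("need a non-negative, even number of cysteines")
    n_bonds = n_cysteines // 2
    return factorial(n_cysteines) // (2 ** n_bonds * factorial(n_bonds))


def levinthal_estimate(n_residues: int = 100,
                       states_per_residue: int = 3,
                       seconds_per_step: float = 1e-13):
    """Return (number of conformations, years needed to visit them all once)."""
    conformations = states_per_residue ** n_residues
    years = conformations * seconds_per_step / SECONDS_PER_YEAR
    return conformations, years


if __name__ == "__main__":
    for cysteines in (4, 6, 8, 10, 12):
        print(cysteines, cysteines // 2, disulfide_pairings(cysteines))
    conformations, years = levinthal_estimate()
    print(f"{conformations:.1e} conformations, {years:.1e} years")
```

Python integers are arbitrary precision, so 3^100 is computed exactly before being converted to a floating-point number of years.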

Exceptions exist
- Chaperone proteins assist folding
- Abnormally folded prion proteins can catalyze misfolding of normal prion proteins, which then aggregate

Other factors
Physical properties of a protein that influence its stability, and therefore determine its fold:
- Rigidity of the backbone
- Amino acid interaction with water
  - Hydropathy index for side chains
- Interactions among amino acids
  - Electrostatic interactions
  - Hydrogen and disulfide bonds
  - Volume constraints

Understand protein folding
- Structure: given a sequence, what tertiary structure does it adopt?
  - Global optimization, Monte Carlo (MC), molecular dynamics (MD), coarse-grained dynamics, etc.
- Thermodynamics: how does the free energy of the native state change under mutation, relative to the native sequence?
  - MC, MD, free energy methods, etc.
- Kinetics: how fast does the protein fold? Does a different sequence fold faster, and why?
  - Lattice Monte Carlo, molecular dynamics, coarse-grained dynamics

CASP changed the landscape
- Critical Assessment of Structure Prediction competition
- Held in even-numbered years since 1994
- Solved but unpublished structures are posted in May; predictions are due in September
- Various categories
  - Relation to existing structures: ab initio, homology, fold, etc.
  - Partial vs. fully automated approaches
- Produces lots of information about which aspects of the problem are hard, and ends arguments about test sets
- Results show steady improvement and the value of integrative approaches
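
The hydropathy index listed under "Other factors" is one of the few folding-related properties that can be read straight off the sequence. As a rough sketch, the code below averages per-residue hydropathy over a sliding window. The slides do not say which scale they had in mind; the Kyte-Doolittle scale is assumed here, and the function name `hydropathy_profile` and the default window of 9 are illustrative choices.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list:
    """Mean hydropathy of each window of `window` consecutive residues."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    # A KeyError here means the sequence contains a non-standard residue code.
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    profile = hydropathy_profile("AGVGTVPMTAYGNDIQYYGQVT", window=9)
    for start, value in enumerate(profile, start=1):
        print(f"window starting at residue {start}: {value:+.2f}")
```

Stretches with a high average are candidates for buried or membrane-spanning segments, while low averages suggest surface-exposed regions; the window length controls how much the profile is smoothed.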