Download Computational Biology

V9 – orientation of TM helices - Modelling 3D structures of helical TM bundles Park, Staritzbichler, Elsner & Helms, Proteins (2004), Park & Helms, Proteins (2006) - Beuming & Weinstein (2004) T. Beming & H. Weinstein (2004) Bioinformatics 20, 1822 - Adamian & Liang (2006) L. Adamian & J. Liang (2006) BMC Struct. Biol. 6, 13 - TMX: predict lipid-accessible sides of TM helices from sequence Park & Helms, Bioinformatics (2007), Park, Hayat & Helms, BMC Bioinformatics (2007), Membrane Bioinformatics SS09 1 Structure modelling for helical membrane proteins >P52202 RHO -- Rhodopsin. 1D MNGTEGPDFYIPFSNKTGVVRSPFEYPQYYLAEPWKYSALAAYMFMLIILGFPINFLTLYVTVQHK KLRSPLNYILLNLAVADLFMVLGGFTTTLYTSMNGYFVFGVTGCYFEGFFATLGGEVALWCLVVL AIERYIVVCKPMSNFRFGENHAIMGVVFTWIMALTCAAPPLVGWSRYIPEGMQCSCGVDYYTLK PEVNNESFVIYMFVVHFAIPLAVIFFCYGRLVCTVKEAAAQQQESATTQKAEKEVTRMVIIMVVSF LICWVPYASVAFYIFSNQGSDFGPVFMTIPAFFAKSSAIYNPVIYIVMNKQFRNCMITT LCCGKNPLGDDETATGSKTETSSVSTSQVSPA 2D www.gpcr.org 3D EMBO Reports (2002) Membrane Bioinformatics SS09 2 Design helical bundles using effective energy functions Aim: assemble TM bundles Glycophorin A dimer, Erb/Neu dimer, phospholamban pentamer Method: scan 6-D conformational space of dimers of ideal helices Membrane Bioinformatics SS09 3 docking of helix-dimers: energy scoring search 5 degrees of freedom systematically. Example for parametrised energy function between 2 residues score conformations by residue-residue energy function. Park et al. Proteins (2004) Membrane Bioinformatics SS09 4 docking of helix-dimers Test for Glycophorin A, dimer of two identical helices, NMR structure available Energy landscape around the minimum  Minimum is truly global minimum. RMSD between best model and NMR structure only 0.8 Å. Park et al. Proteins (2004) Membrane Bioinformatics SS09 However, this is not the case for dimers in larger TMH proteins. 5 Need more/other information to orient helices Early suggestion: TM proteins are „inside-out“ proteins. That means that are hydrophobic outside and hydrophilic inside.  compute hydrophobic moment = the direction of largest hydrophobicity 1  N    H i  r N i 1  C i   rproji   here, rproj(i) is the projection of the sidechain onto the helical axis, i.e. the vector difference describes the shortest distance between residue i and the helix axis. H(i) is the hydrophobicity of residue i. This method was introduced by David Eisenberg (1982, Nature) Membrane Bioinformatics SS09 6 role of hydrophobic moment According to the concept of Eisenberg, all helices would orient their most hydrophobic side towards the bilayer. However, this measure is quite unprecise (Park & Helms, Biopolymers 2006). Hydrophobicity scales ww: Wimley-White scale eis: Eisenberg scale ges: Goldman/Engelman/Steitz scale kd: Kyte-Doolittle scale Specialized scales kP: kProt bw: Beuming & Weinstein scale tmlip1/2: Adamian & Liang Membrane Bioinformatics SS09 7 Beuming & Weinstein (2004): amino acid propensities (1) Hydrophobic residues (A, I, L, V) make up 48.7 % of all residues in TM proteins (2) Charged residues (D, E, H, K, R) constitute only 5.5% (3) Glycine (G) is relatively abundant (4) Small residues (A, C, S, T) form 30.6% (5) Aromatic residues (F,W,Y) represent 15.8% (6) -branched residues (T, I, V) form 24.9%. (7) Proline is a helix-breaker and is underrepresented (8) Also, Cys, Gln, and Asn are rarely found. Membrane Bioinformatics SS09 8 amino acid propensities: conclusions The overall amino acid composition deviates significantly from that of the whole genome. Hydrophobic residues (A, F, G, I, L, M, V, W) occur more frequently in MPs than in the whole genomes. Conversely, residues C, D, E, K, N, P, Q, R are underrepresented in MPs. H, S, T, and Y have equal distributions in MPs and whole genomes. Membrane Bioinformatics SS09 9 Beuming & Weinstein (2004): inside vs. outside (1) Most of the exposed (lipid facing) charged residues (D, E, K, H, R) that are found in TMs are located in the terminal regions (4.4%) rather than in the central region (2.7%). (2) The exposed terminal parts are very rich in aromatic residues (21.3%) compard to the central part (16.1%). Membrane Bioinformatics SS09 10 Beuming & Weinstein (2004): surface propensity scale SFX  SFHIS SPX  SFTRP  SFHIS Table shows fraction SF of exposed residue i. Trp has highest value of SF, His has smallest value. Normalize SP values with respect to His (SP=0) and Trp (SP=1). Membrane Bioinformatics SS09 11 correlation of SP scale with other scales Compute correlation coefficient. SP propensity scale has high correlation with hydrophobicity or volume scales. Combine SP scale with conservation index: f ia CI i   f ia log pa nAlignment pa : a priori distribution of residues Membrane Bioinformatics SS09 12 Beuming & Weinstein (2004) Add propensity score and conservation score: total score(i) = SPi + CIi Accuracy to detect the buried resides is ca. 70%. Membrane Bioinformatics SS09 13 Beuming & Weinstein (2004) (top) correct SASA in X-ray structure (middle): prediction based on aminoacid propensity + conservation BEST! (bottom): prediction based only on conservation Membrane Bioinformatics SS09 14 Adamian & Liang (2006): interacting helices Example for two interacting TM helices in succinate dehydrogenase. Interacting residues follow heptad motiv. Note the periodicity of 3.6 residues per turn in an ideal -helix. Membrane Bioinformatics SS09 15 Adamian & Liang (2006) Heptad motifs are generally preferred for interacting helix pairs. For left-handed helices, about 94.7% and 92.4% of interacting residues can be mapped to heptad repeats for parallel and anti-parallel helices. For right-handed pairs the number are slightly less. Assume that the residues of lipidaccessible helices follows a similar pattern. Membrane Bioinformatics SS09 16 Adamian & Liang (2006) Each TM helix has „7 faces“. A: the anchoring residues are 0, 7, 14, and 21 contacts are also formed by residues 3, 4, 10, 11, 17, 18 Membrane Bioinformatics SS09 17 Adamian & Liang (2006) Combine lipophilicity score Lf and positional entropy Ef of a helical face by simply multiplying them. Membrane Bioinformatics SS09 18 Adamian & Liang (2006): Test fo TRP channel Membrane Bioinformatics SS09 19 Adamian & Liang (2006): discuss failures Sometimes, binding sites for individual lipids (e.g. cardiolipin) are formed on the surfaces of TM proteins. Those residues will also be highly conserved, and the method will therefore fail. Membrane Bioinformatics SS09 20 What is needed for true de novo design of helical bundles? Aim: explore new TM protein topologies. distance-dependent residue-residue force field Generate energetically favorable geometries of helix dimers. Overlap helix dimers  full protein structure. Membrane Bioinformatics SS09 21 Derivation of position scores (1) For each test protein,  1000 similar sequences from non-redundant database using BLAST URLAPI. (2) generate initial multiple sequence alignment (MSA) with ClustalW. Delete fragments < 80% of length of query sequence. From these refined MSA, apply 6 different % identity criteria,  6 final MSAs for each test protein. Pei & Grishin: need to align ≥ 20 sequences to accurately estimate conservation indices from MSAs. Membrane Bioinformatics SS09 22 predicting the TM-helix-orientation from sequences Assumption: lipid-exposed positions are less conserved.  CIi       f i   f  2 j j j  1  C    CI: conservation index in MSA SASA: Solvent accessible surface area, relative to a single, free helix fj(i): frequency of amino acid j in position i. fj : frequency of amino acid j in full alignment. C : average conservation index (CI) : Standard deviation Positive values: conserved positions Negative values: variable positions Test: correct orientation (0,0) has lowest score. Membrane Bioinformatics SS09 23 Ab initio structure prediction of TM bundles Aim: construct structural model for a bundle of ideal transmembrane helices. (1) Construct 12 good geometries for every helix pair AB, BC, CD, DE, EF, FG (2) overlay ABCDEFG „thin out“ solution space containing ca. 126 models (a) remove „solutions“ where helices collide with eachother (b) delete non-compact „solutions“ (3) score remaining 106 solutions by sequence conservation (4) cluster 500 best solutions in 8 models (5) rigid-body refinement, select 5 models with best sequence conservation. Membrane Bioinformatics SS09 24 Rigid-body refinement Membrane Bioinformatics SS09 25 Compare best models with X-ray structures dark: Model light: X-ray structure Bacteriorhodopsin Halorhodopsin Sensory Rhodopsin Additional input: known connectivity of the helices A-B-C-D-E-F-G. Otherwise, the search space would have been too large. Rhodopsin NtpK Membrane Bioinformatics SS09 26 Comparing the best models with X-ray structures Membrane Bioinformatics SS09 27 Can one select the best model? These are our 4 best non-native models of bR. Because contact between A and E was not imposed, very different topologies were obtained. In 2006, our methods could not distinguish between these models.  but they could serve as input for further experiments. Membrane Bioinformatics SS09 28 “Success case”: True de novo model of 4-helix bundle Membrane Bioinformatics SS09 29 Predicting lipid-exposure Membrane Bioinformatics SS09 30 Predicting lipid-exposure Aim: derive optimal scale to predict exposure of residues to hydrophobic part of lipid bilayer. Scale should optimally correlate with SASA  minimize quadratical error. Solution for minimization task  Y: SASA values of the training set (N = 2901 residue positions) X: profile of residue frequencies from multiple sequence alignment ( N  21 matrix) : wanted propensity scale for 20 amino acids + 1 intercept value (21) Membrane Bioinformatics SS09 31 What does MO scale capture? Membrane Bioinformatics SS09 32 Improved prediction of exposure by statistical learning Beuming & Weinstein (2004) method Prediction method Prediction accuracy [%] Beuming & Weinstein 68.7 TMX 78.7 Yuan ... Teasdale 71.1 Membrane Bioinformatics SS09 33 Improved method by statistical learning The theory of Support Vector Classifiers evolves from a simpler case of optimal separating hyperplanes that, while separating two separable classes, maximize the distance between a separating hyperplane and the closest point from either class. A: The two classes can be fully separable by a hyperplane, and the optimal separating hyperplane can be obtained by solving Eq. 9. B: It is not possible to separate the two classes with a hyperplane, and the optimal hyperplane can be obtained by solving Eq. 17. Membrane Bioinformatics SS09 34 Improved method by statistical learning Stockholm Univ. Sept. 2008 Membrane Bioinformatics SS09 35 Improved method by statistical learning Membrane Bioinformatics SS09 36 Improved method by statistical learning Membrane Bioinformatics SS09 37 Improved method by statistical learning Membrane Bioinformatics SS09 38 http://service.bioinformatik.uni-saarland.de/tmx/ input: Putative TM helices TopoView draws Snake plot Master thesis Nadine Schneider Membrane Bioinformatics SS09 39 http://service.bioinformatik.uni-saarland.de/tmx/ Top: TMD11, Bottom: TMD 12 Membrane Bioinformatics SS09 40 http://service.bioinformatik.uni-saarland.de/tmx/ Top: TMD5, Bottom: TMD 12 Membrane Bioinformatics SS09 41 Summary TMX and related methods Sequences of TM proteins reveal many powerful features to allow prediction of 2D- and 3D structural features, function, and oligomerization status. TMX server can predict lipid exposure with ca. 78% accuracy.: http://service.bioinformatik.uni-saarland.de/tmx/ Possible applications: (1) predict transporter pores (2) predict lipid-exposed surface of TM proteins: correlate with different membrane composition collaborate with us  do you have lots of solubility data? (3) Conserved surface residues may indicate interaction sites Membrane Bioinformatics SS09 42

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Computational Biology