Download Computational Biology

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Metabolism wikipedia , lookup

Biochemistry wikipedia , lookup

Mitochondrion wikipedia , lookup

Lipid signaling wikipedia , lookup

Two-hybrid screening wikipedia , lookup

Magnesium transporter wikipedia , lookup

Proteolysis wikipedia , lookup

Protein wikipedia , lookup

Metalloprotein wikipedia , lookup

Protein–protein interaction wikipedia , lookup

NADH:ubiquinone oxidoreductase (H+-translocating) wikipedia , lookup

G protein–coupled receptor wikipedia , lookup

Signal transduction wikipedia , lookup

Homology modeling wikipedia , lookup

Oxidative phosphorylation wikipedia , lookup

Protein structure prediction wikipedia , lookup

Thylakoid wikipedia , lookup

SNARE (protein) wikipedia , lookup

Western blot wikipedia , lookup

Transcript
V9 – orientation of TM helices
-
Modelling 3D structures of helical TM bundles
Park, Staritzbichler, Elsner & Helms, Proteins (2004), Park & Helms, Proteins (2006)
-
Beuming & Weinstein (2004)
T. Beming & H. Weinstein (2004) Bioinformatics 20, 1822
-
Adamian & Liang (2006)
L. Adamian & J. Liang (2006) BMC Struct. Biol. 6, 13
-
TMX: predict lipid-accessible sides of TM helices from sequence
Park & Helms, Bioinformatics (2007), Park, Hayat & Helms, BMC Bioinformatics (2007),
Membrane Bioinformatics SS09
1
Structure modelling for helical membrane proteins
>P52202 RHO -- Rhodopsin.
1D
MNGTEGPDFYIPFSNKTGVVRSPFEYPQYYLAEPWKYSALAAYMFMLIILGFPINFLTLYVTVQHK
KLRSPLNYILLNLAVADLFMVLGGFTTTLYTSMNGYFVFGVTGCYFEGFFATLGGEVALWCLVVL
AIERYIVVCKPMSNFRFGENHAIMGVVFTWIMALTCAAPPLVGWSRYIPEGMQCSCGVDYYTLK
PEVNNESFVIYMFVVHFAIPLAVIFFCYGRLVCTVKEAAAQQQESATTQKAEKEVTRMVIIMVVSF
LICWVPYASVAFYIFSNQGSDFGPVFMTIPAFFAKSSAIYNPVIYIVMNKQFRNCMITT
LCCGKNPLGDDETATGSKTETSSVSTSQVSPA
2D
www.gpcr.org
3D
EMBO Reports (2002)
Membrane Bioinformatics SS09
2
Design helical bundles using effective energy functions
Aim: assemble TM bundles
Glycophorin A dimer, Erb/Neu dimer, phospholamban pentamer
Method: scan 6-D conformational space of dimers of ideal helices
Membrane Bioinformatics SS09
3
docking of helix-dimers: energy scoring
search 5 degrees of freedom systematically.
Example for parametrised
energy function between 2
residues
score conformations by residue-residue
energy function.
Park et al. Proteins (2004)
Membrane Bioinformatics SS09
4
docking of helix-dimers
Test for Glycophorin A, dimer of two identical helices, NMR structure available
Energy landscape
around the minimum
 Minimum is truly
global minimum.
RMSD between best model and NMR
structure only 0.8 Å.
Park et al. Proteins (2004)
Membrane Bioinformatics SS09
However, this is not the
case for dimers in
larger TMH proteins.
5
Need more/other information to orient helices
Early suggestion: TM proteins are „inside-out“ proteins.
That means that are hydrophobic outside and hydrophilic inside.

compute hydrophobic moment = the direction of largest hydrophobicity
1

N


 H i  r
N
i 1

C i   rproji  
here, rproj(i) is the projection of the sidechain onto the helical axis, i.e. the vector
difference describes the shortest distance
between residue i and the helix axis.
H(i) is the hydrophobicity of residue i.
This method was introduced by
David Eisenberg (1982, Nature)
Membrane Bioinformatics SS09
6
role of hydrophobic moment
According to the concept of Eisenberg,
all helices would orient their most hydrophobic side towards the bilayer.
However, this measure is quite unprecise (Park & Helms, Biopolymers 2006).
Hydrophobicity scales
ww: Wimley-White scale
eis: Eisenberg scale
ges: Goldman/Engelman/Steitz scale
kd: Kyte-Doolittle scale
Specialized scales
kP: kProt
bw: Beuming & Weinstein scale
tmlip1/2: Adamian & Liang
Membrane Bioinformatics SS09
7
Beuming & Weinstein (2004): amino acid propensities
(1) Hydrophobic residues (A, I, L, V)
make up 48.7 % of all residues in TM
proteins
(2) Charged residues (D, E, H, K, R)
constitute only 5.5%
(3) Glycine (G) is relatively abundant
(4) Small residues (A, C, S, T) form
30.6%
(5) Aromatic residues (F,W,Y) represent
15.8%
(6) -branched residues (T, I, V) form
24.9%.
(7) Proline is a helix-breaker and is
underrepresented
(8) Also, Cys, Gln, and Asn are rarely
found.
Membrane Bioinformatics SS09
8
amino acid propensities: conclusions
The overall amino acid composition deviates significantly from that of the whole
genome.
Hydrophobic residues (A, F, G, I, L, M, V, W) occur more frequently in MPs than in
the whole genomes.
Conversely, residues C, D, E, K, N, P, Q, R are underrepresented in MPs.
H, S, T, and Y have equal distributions in MPs and whole genomes.
Membrane Bioinformatics SS09
9
Beuming & Weinstein (2004): inside vs. outside
(1) Most of the exposed (lipid facing)
charged residues (D, E, K, H, R) that
are found in TMs are located in the
terminal regions (4.4%) rather than in
the central region (2.7%).
(2) The exposed terminal parts are very
rich in aromatic residues (21.3%)
compard to the central part (16.1%).
Membrane Bioinformatics SS09
10
Beuming & Weinstein (2004): surface propensity scale
SFX  SFHIS
SPX 
SFTRP  SFHIS
Table shows fraction SF of exposed residue i.
Trp has highest value of SF, His has smallest
value.
Normalize SP values with respect to His
(SP=0) and Trp (SP=1).
Membrane Bioinformatics SS09
11
correlation of SP scale with other scales
Compute correlation coefficient.
SP propensity scale has high
correlation with hydrophobicity or
volume scales.
Combine SP scale with conservation
index:
f ia
CI i   f ia log
pa
nAlignment
pa : a priori distribution of residues
Membrane Bioinformatics SS09
12
Beuming & Weinstein (2004)
Add propensity score and
conservation score:
total score(i) = SPi + CIi
Accuracy to detect the buried resides
is ca. 70%.
Membrane Bioinformatics SS09
13
Beuming & Weinstein (2004)
(top) correct SASA in X-ray structure
(middle): prediction based on aminoacid propensity + conservation
BEST!
(bottom): prediction based only on
conservation
Membrane Bioinformatics SS09
14
Adamian & Liang (2006): interacting helices
Example for two interacting TM
helices in succinate dehydrogenase.
Interacting residues follow heptad
motiv.
Note the periodicity of 3.6 residues
per turn in an ideal -helix.
Membrane Bioinformatics SS09
15
Adamian & Liang (2006)
Heptad motifs are generally
preferred for interacting helix pairs.
For left-handed helices, about 94.7%
and 92.4% of interacting residues
can be mapped to heptad repeats for
parallel and anti-parallel helices.
For right-handed pairs the number
are slightly less.
Assume that the residues of lipidaccessible helices follows a similar
pattern.
Membrane Bioinformatics SS09
16
Adamian & Liang (2006)
Each TM helix has „7 faces“.
A: the anchoring residues are
0, 7, 14, and 21
contacts are also formed by residues
3, 4, 10, 11, 17, 18
Membrane Bioinformatics SS09
17
Adamian & Liang (2006)
Combine lipophilicity score Lf and positional entropy Ef of a helical face by
simply multiplying them.
Membrane Bioinformatics SS09
18
Adamian & Liang (2006): Test fo TRP channel
Membrane Bioinformatics SS09
19
Adamian & Liang (2006): discuss failures
Sometimes, binding sites for individual lipids (e.g. cardiolipin) are formed on the
surfaces of TM proteins. Those residues will also be highly conserved, and the
method will therefore fail.
Membrane Bioinformatics SS09
20
What is needed for true de novo design of helical bundles?
Aim: explore new TM protein topologies.
distance-dependent residue-residue force field
Generate energetically favorable geometries of helix dimers.
Overlap helix dimers  full protein structure.
Membrane Bioinformatics SS09
21
Derivation of position scores
(1) For each test protein,  1000 similar sequences
from non-redundant database using BLAST URLAPI.
(2) generate initial multiple sequence alignment (MSA) with ClustalW.
Delete fragments < 80% of length of query sequence.
From these refined MSA, apply 6 different % identity criteria,
 6 final MSAs for each test protein.
Pei & Grishin: need to align ≥ 20 sequences to accurately estimate conservation
indices from MSAs.
Membrane Bioinformatics SS09
22
predicting the TM-helix-orientation from sequences
Assumption:
lipid-exposed positions are
less conserved.

CIi  


  f i   f 
2
j
j
j
 1
 C 


CI: conservation index in MSA
SASA: Solvent accessible surface area,
relative to a single, free helix
fj(i): frequency of amino acid j
in position i.
fj : frequency of amino acid j
in full alignment.
C : average conservation index (CI)
: Standard deviation
Positive values: conserved positions
Negative values: variable positions
Test: correct orientation (0,0)
has lowest score.
Membrane Bioinformatics SS09
23
Ab initio structure prediction of TM bundles
Aim: construct structural model for a bundle of ideal transmembrane
helices.
(1) Construct 12 good geometries for every helix pair AB, BC, CD, DE, EF, FG
(2) overlay ABCDEFG
„thin out“ solution space containing ca. 126 models
(a) remove „solutions“ where helices collide with eachother
(b) delete non-compact „solutions“
(3) score remaining 106 solutions by sequence conservation
(4) cluster 500 best solutions in 8 models
(5) rigid-body refinement, select 5 models with best sequence conservation.
Membrane Bioinformatics SS09
24
Rigid-body refinement
Membrane Bioinformatics SS09
25
Compare best models with X-ray structures
dark: Model
light: X-ray structure
Bacteriorhodopsin
Halorhodopsin
Sensory Rhodopsin
Additional input:
known connectivity of the
helices A-B-C-D-E-F-G.
Otherwise, the search
space would have been
too large.
Rhodopsin
NtpK
Membrane Bioinformatics SS09
26
Comparing the best models with X-ray structures
Membrane Bioinformatics SS09
27
Can one select the best model?
These are our 4 best
non-native models of bR.
Because contact between
A and E was not imposed,
very different topologies
were obtained.
In 2006, our methods
could not distinguish
between these models.
 but they could serve as
input for further
experiments.
Membrane Bioinformatics SS09
28
“Success case”: True de novo model of 4-helix bundle
Membrane Bioinformatics SS09
29
Predicting lipid-exposure
Membrane Bioinformatics SS09
30
Predicting lipid-exposure
Aim: derive optimal scale to predict exposure of residues
to hydrophobic part of lipid bilayer.
Scale should optimally correlate with SASA  minimize quadratical error.
Solution for minimization task

Y: SASA values of the training set (N = 2901 residue positions)
X: profile of residue frequencies from multiple sequence alignment ( N  21 matrix)
: wanted propensity scale for 20 amino acids + 1 intercept value (21)
Membrane Bioinformatics SS09
31
What does MO scale capture?
Membrane Bioinformatics SS09
32
Improved prediction of exposure by statistical learning
Beuming & Weinstein
(2004) method
Prediction method
Prediction accuracy [%]
Beuming & Weinstein
68.7
TMX
78.7
Yuan ... Teasdale
71.1
Membrane Bioinformatics SS09
33
Improved method by statistical learning
The theory of Support Vector Classifiers evolves from a simpler case of optimal
separating hyperplanes that, while separating two separable classes, maximize
the distance between a separating hyperplane and the closest point from either
class.
A: The two classes can be fully separable by a hyperplane,
and the optimal separating hyperplane can be obtained by
solving Eq. 9.
B: It is not possible to separate the two classes with a
hyperplane, and the optimal hyperplane can be obtained by
solving Eq. 17.
Membrane Bioinformatics SS09
34
Improved method by statistical learning
Stockholm Univ. Sept. 2008
Membrane Bioinformatics SS09
35
Improved method by statistical learning
Membrane Bioinformatics SS09
36
Improved method by statistical learning
Membrane Bioinformatics SS09
37
Improved method by statistical learning
Membrane Bioinformatics SS09
38
http://service.bioinformatik.uni-saarland.de/tmx/
input:
Putative TM helices
TopoView draws
Snake plot
Master thesis
Nadine Schneider
Membrane Bioinformatics SS09
39
http://service.bioinformatik.uni-saarland.de/tmx/
Top: TMD11, Bottom: TMD 12
Membrane Bioinformatics SS09
40
http://service.bioinformatik.uni-saarland.de/tmx/
Top: TMD5, Bottom: TMD 12
Membrane Bioinformatics SS09
41
Summary TMX and related methods
Sequences of TM proteins reveal many powerful features to allow prediction of
2D- and 3D structural features, function, and oligomerization status.
TMX server can predict lipid exposure with ca. 78% accuracy.:
http://service.bioinformatik.uni-saarland.de/tmx/
Possible applications:
(1) predict transporter pores
(2) predict lipid-exposed surface of TM proteins:
correlate with different membrane composition
collaborate with us  do you have lots of solubility data?
(3) Conserved surface residues may indicate interaction sites
Membrane Bioinformatics SS09
42