HIGH RESOLUTION LATTICE MODELS OF
PROTEINS: DESIGN & APPLICATIONS
Andrzej Kolinski
LABORATORY OF THEORY OF BIOPOLYMERS
WARSAW UNIVERSITY
http://www.biocomp.chem.uw.edu.pl
Structure and Function of Biomolecules, Bedlewo, May 12-15, 2004
WHY REDUCED MODELS?
• Classical Molecular Mechanics studies of large-scale
conformational rearrangements of biomolecules are still
impractical (proteins fold in a time frame of 0.001 s to
100 s, while “long” MD simulations cover 100 nanoseconds).
• The number of degrees of freedom treated explicitly
needs to be reduced and the energy landscape
smoothed.
• Knowledge-based force fields of reduced models
frequently seem to have higher predictive power than the
all-atom potentials of Molecular Mechanics.
• We know about 1000 times more protein sequences
than protein structures (ca. 30M against ca. 30k).
This gap is growing.
OUTLINE
• Reduced protein models of an intermediate and high
resolution (representation, sampling and force field)
• Ab initio folding (an illustration)
• Loop (or fragment) modeling using various reduced
representations: SICHO, CABS and REFINER models.
Comparison with standard modeling tools: MODELLER
and SWISS-MODEL
• Comparative modeling starting from multiple threading
alignments
SICHO, CABS and REFINER
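[Figure: reduced representations of a chain fragment; residue labels: Phe, Leu, Met, Leu, Ala, Gly, Gly, Ala; distance labels: 1.45 Å and 0.61 Å]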
All models use knowledge-based statistical potentials derived via an analysis
of structural regularities seen in the solved structures of globular proteins
Sampling of the conformational space of the
SICHO and CABS models
-Single residue moves
-Two-residue moves
-Three-residue moves
-Small distance (rigid body) moves of
a randomly selected fragment of the
model chain
-Reptation type moves
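Conformational Search Scheme
Replica Exchange Monte Carlo
[Figure: N copies of the system, each sampled by isothermal MC at its own temperature, from high to low, with the folding transition temperature marked; the exchange expression is cut off in the source after “exp (-”]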
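The exchange expression on this slide is cut off after “exp (-”, so the sketch below falls back on the standard replica exchange Metropolis rule, min(1, exp[(β_i − β_j)(E_i − E_j)]) with β = 1/T. That choice, the units (k_B = 1) and the function names `swap_probability` and `exchange_sweep` are assumptions for illustration, not taken from the SICHO or CABS code.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Standard Metropolis probability of swapping two replicas.

    beta = 1/T in reduced units (k_B = 1, an assumption of this sketch).
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def exchange_sweep(temperatures, energies, rng=random):
    """Attempt one swap between each pair of neighbouring temperatures.

    `energies[k]` stands in for the conformation currently held at
    `temperatures[k]`; a real simulation would swap conformations too.
    """
    energies = list(energies)
    for k in range(len(temperatures) - 1):
        p = swap_probability(1.0 / temperatures[k], 1.0 / temperatures[k + 1],
                             energies[k], energies[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return energies
```

Between exchange sweeps each copy would run ordinary isothermal Monte Carlo at its own temperature, which lets low-energy conformations migrate toward the low-temperature end.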
INTERACTION SCHEME
• Generic “protein-like” biases
• Statistical potentials for short-range
conformational propensities
• Model of main chain hydrogen bonds
• Pairwise interactions between united atoms
(including orientation- and secondary structure
dependent potentials)
Generic (sequence independent) chain stiffness
- regular secondary structure propensities
[Figure: chain fragment with positions i to i+4 and vectors v_i to v_{i+3}]
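Generic (sequence independent) chain stiffness
[Figure: chain fragment with positions i-1 to i+4 and vectors v_{i-1} to v_{i+3}]
$B_1 = f \times \varepsilon_g$
for: $(\mathbf{v}_{i-1} \cdot \mathbf{v}_{i+3}) < 0$
$B_2 = -f \times \varepsilon_g$
for: $|\mathbf{r}_{i+4} - \mathbf{r}_i| < 7.0$ Å and “right handed” twist
or $-g \times \varepsilon_g$
for: $|\mathbf{r}_{i+4} - \mathbf{r}_i| > 11.0$ Å and β-type geometry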
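Generic (sequence independent) chain stiffness
$B_4 = h \times \varepsilon_g$
for: $(\mathbf{r}_{i+5} - \mathbf{r}_i) \cdot (\mathbf{r}_{i+10} - \mathbf{r}_{i+5}) < 0$
and $(\mathbf{r}_{i+15} - \mathbf{r}_{i+10}) \cdot (\mathbf{r}_{i+5} - \mathbf{r}_i) > 0$
i.e., a penalty for too crumpled main
chain conformations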
For known or strongly predicted secondary structure fragments
an additional bias towards proper values of the medium-range
distances along the chain could be superimposed
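As a minimal sketch of the B4 condition, the function below applies the penalty when the chain reverses direction between positions i, i+5 and i+10 and then turns back by i+15. It reads the slide's “r10” as r(i+10). The default weights `h` and `eps_g` are placeholder values, and the function name is hypothetical.

```python
def _sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def crumple_penalty(coords, i, h=1.0, eps_g=1.0):
    """Return h * eps_g if the B4 condition holds at position i, else 0.

    coords: sequence of (x, y, z) positions of the chain units.
    h and eps_g are placeholder weights, not values from the talk.
    """
    a = _sub(coords[i + 5], coords[i])
    b = _sub(coords[i + 10], coords[i + 5])
    c = _sub(coords[i + 15], coords[i + 10])
    if _dot(a, b) < 0 and _dot(c, a) > 0:
        return h * eps_g
    return 0.0
```

A straight chain gives zero, while a chain that folds back on itself and then forward again within 15 residues is penalized.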
Short-range conformational propensities
$E_{13}(r_{i+2,i}, A_i, A_{i+2})$
$E_{14}(r^{*}_{i+3,i}, A_{i+1}, A_{i+2})$
$E_{15}(r_{i+4,i}, A_{i+1}, A_{i+3})$
$E/kT \sim -\ln\left( n_{k,A_1,A_2} / \langle n_{k,A_i,A_j} \rangle \right)$, where $\langle\,\rangle$ is the average over the database
Note: the reduced backbone
geometry correlates better
with secondary structure
than the phi-psi angles
[Figure: chain fragment with positions i to i+4; axis ticks: -10, -1, 0, 1, 10]
_______________________________________________________________________________________________________
ALA ALA
-0.25 -0.45 -0.39 0.73 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 -1.12 -2.55 0.44 0.56 0.25 0.76 0.51
VAL THR
-1.71 -1.83 0.06 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 0.11 -1.51 0.56 0.56 0.44 -0.57 -0.75
_______________________________________________________________________________________________________
Regions marked on the table: left-handed beta; unlikely or prohibited; alpha; right-handed beta
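The relation E/kT ~ -ln(n / <n>) is the usual inverse-Boltzmann recipe. The sketch below is one way to apply it to binned distances: it compares the bin frequency for one amino acid pair with the bin frequency over all pairs. Normalizing by frequencies rather than raw counts, and capping unobserved bins at 2.0 (echoing the repeated 2.00 entries in the table), are assumptions of this sketch; the function name is hypothetical.

```python
import math
from collections import Counter


def statistical_potential(observations, n_bins, cap=2.0):
    """Derive E/kT per (bin, pair) from database observations.

    observations: iterable of (bin_index, pair_label) tuples.
    Returns a dict mapping (bin_index, pair_label) to E/kT.
    """
    pair_bin = Counter()    # counts per (bin, pair)
    all_bin = Counter()     # counts per bin over all pairs
    pair_total = Counter()  # counts per pair over all bins
    total = 0
    for k, pair in observations:
        pair_bin[(k, pair)] += 1
        all_bin[k] += 1
        pair_total[pair] += 1
        total += 1

    energies = {}
    for pair, n_pair in pair_total.items():
        for k in range(n_bins):
            observed = pair_bin[(k, pair)] / n_pair
            expected = all_bin[k] / total
            if observed == 0.0 or expected == 0.0:
                energies[(k, pair)] = cap  # never seen: treat as prohibited
            else:
                energies[(k, pair)] = min(cap, -math.log(observed / expected))
    return energies
```

Bins that a pair visits more often than the database average come out negative (favorable), and bins it avoids come out positive.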
CABS reduced representation
Model of the main chain hydrogen bonds
Hydrogen bonds cause a specific spatial arrangement of the
α-trace vectors and the α-carbon united atoms.
The united atoms i and j are “hydrogen bonded” when:
- at least one of the vectors h points into the vicinity of
the α-carbon i or j
- vectors h are “almost” parallel (or antiparallel)
- $(\mathbf{b}_i \cdot \mathbf{b}_j) > 0$ (“roughly” parallel)
[Figure: chain fragments around positions i-1, i, i+1 and j-1, j, j+1, with vectors v_{i-1}, v_i, v_{j-1}, b_i, b_j, -h_i and h_j]
The strength of the hydrogen bond is moderated by a
cooperative component dependent on the distance
between the corresponding centers of the Cα-Cα virtual
bonds (minimum of the potential at 4.25 Å).
Additional rules: No hydrogen bonds between pairs assigned as (HE) and (HH for |i-j|>3)
The Cα-based model of hydrogen bonds correlates very well with the real hydrogen bonds.
When “translating” between the two, the indices need to be shifted (by ±1) depending on the type of
secondary structure.
Pairwise interactions (Ca, C, Side Groups)
• Hard-core excluded volume for Ca-Ca, C-C and Ca-C
•
•
•
•
pairs (the cut-off distances are amino acid independent).
Soft core excluded volume for interactions with the side
groups.
Pairwise potentials for side groups derived from a
statistical analysis of known protein structures.
Two side groups are assumed to be “in contact” when
any pair of their heavy atoms is “in contact” (4.5 Å cutoff) – the average distance between the centers of mass
are then taken as a contact distance for a pair of side
groups.
Side group pairwise potentials are “context” dependent
(mutual orientation, conformation of the main chain)
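The contact definition above can be sketched as follows: two side groups count as in contact when any pair of their heavy atoms lies within the 4.5 Å cutoff, and the contact distance for a pair type is the mean center-of-mass separation over contacting instances. Equal atomic masses, the inclusive comparison (`<=`) and all function names are assumptions made for this illustration.

```python
import math

CONTACT_CUTOFF = 4.5  # Å, heavy-atom cutoff quoted on the slide


def in_contact(atoms_a, atoms_b, cutoff=CONTACT_CUTOFF):
    """True if any heavy-atom pair from the two side groups is within cutoff."""
    return any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b)


def center_of_mass(atoms):
    """Geometric center of the atoms (equal masses assumed in this sketch)."""
    n = len(atoms)
    return tuple(sum(atom[d] for atom in atoms) / n for d in range(3))


def mean_contact_distance(instances):
    """Average center-of-mass distance over contacting side-group pairs.

    instances: iterable of (atoms_a, atoms_b) for one amino acid pair type.
    Returns None when no instance is in contact.
    """
    distances = [math.dist(center_of_mass(a), center_of_mass(b))
                 for a, b in instances if in_contact(a, b)]
    return sum(distances) / len(distances) if distances else None
```

Because the contact test uses atoms while the reported distance uses centers of mass, large side groups end up with contact distances well above 4.5 Å.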
Pairwise interactions of the side groups
Between centers of mass (all heavy
atoms of a side group + Cα).
Cut-off distances pairwise
dependent (not additive, account
for some packing details).
Square-well shape of the potential
(for charged residues a tail added).
Soft (however relatively large)
excluded volume potential – the
height is amino acid independent.
For a given pair of amino acids the
strength of interactions and the
cut-off distances depend on mutual
orientation of the interacting side
groups and on the local geometry
of the main chain.
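A square-well side-group potential with a soft repulsive core, as described above, might look like the sketch below. The repulsion height `e_rep`, the distances and the well depth in the usage note are placeholder numbers, not parameters from the talk, and the tail added for charged residues is left out.

```python
def square_well(r, r_rep, r_cut, eps, e_rep=4.0):
    """Square-well pair energy for two side-group centers of mass.

    r      distance between the centers of mass (Å)
    r_rep  below this distance the soft excluded-volume penalty applies
    r_cut  pair-specific contact cut-off distance
    eps    pair-specific contact energy (negative means attractive)
    e_rep  finite repulsion height, the same for all amino acid pairs
    """
    if r < r_rep:
        return e_rep
    if r < r_cut:
        return eps
    return 0.0
```

For example, `square_well(5.0, 4.0, 6.5, -0.9)` falls inside the well and returns -0.9. In a context-dependent scheme, `r_cut` and `eps` would be looked up by amino acid pair, mutual orientation and main chain conformation.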
CONTEXT-DEPENDENT STATISTICAL POTENTIALS
Three types of the mutual
orientations of the side groups:
A-antiparallel, M-intermediate,
P-parallel
Two types of the main chain conformations:
C- compact and E-extended
Pairwise contact potentials are derived from the statistics of the
numbers of parallel, antiparallel and semi-orthogonal contacts
for a given residue type and two types of the main chain
conformations.
NEW STATISTICAL POTENTIALS (AN EXAMPLE)
LYS-GLU POTENTIAL
        P      M      A
CC    -0.9   -0.4    0.9
EE    -1.1   -0.4    0.6
CE    -0.2    0.1    0.8
EC    -0.2    0.0    0.8
GAPLESS THREADING
            QUASI   QUASI3   QUASI3S
%NATIVE     86 %    94 %     97 %
Z-score     6.72    7.84     9.96
When tested on a large set of decoys, the orientation- and
backbone-conformation-dependent potential QUASI3S exhibits
a better correlation between energy and RMSD from native than
the more “generic” potentials
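The slides do not say how the Z-score and %NATIVE figures were computed. A common convention, assumed here, measures how many standard deviations the native energy lies below the mean decoy energy, and counts the native as recognized when its energy is lower than that of every decoy. Function names are hypothetical.

```python
import statistics


def z_score(native_energy, decoy_energies):
    """Standard deviations by which the native lies below the decoy mean.

    Positive values mean the native is lower in energy than a typical decoy
    (sign convention assumed, chosen to match positive reported Z-scores).
    """
    mean = statistics.fmean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    return (mean - native_energy) / spread


def native_recognized(native_energy, decoy_energies):
    """True if the native has lower energy than every decoy."""
    return native_energy < min(decoy_energies)
```

Under these assumptions, %NATIVE would be the fraction of test proteins for which `native_recognized` returns True, and the reported Z-score an average over the test set.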
Ab initio folding
• “Pure” ab initio (with only statistical
potentials) protein folding and
macromolecular assembly (results for the
SICHO model)
LOOP MODELING – STRUCTURE COMPLETION
• Fixed templates (with an “ideal” alignment) from the PDB,
with fragments of their native structure removed
• Random starting conformation of the loops (non-entangled)
• Loop optimization using SICHO, CABS and REFINER
(sampling via Replica Exchange Monte Carlo)
• The lowest energy structure taken for a comparison with
MODELLER and SWISS-MODEL (automatic version)
• No human intervention during the modeling procedures
EXAMPLES
(a-SICHO, b-CABS, c-REFINER, d-MODELLER)
Gray – template
Green – native fragment or loop
removed from the PDB structure
Red – Modeled fragment
EXAMPLES
(a-SICHO, b-CABS, c-REFINER, d-MODELLER)
• Green – native fragment
or loop removed from the
PDB structure
• Red – Modeled fragment
• Gray – template
COMPARATIVE MODELING WITH MULTIPLE
TEMPLATES
• Highest score templates detected by threading
procedures are used to extract the distance
restraints
• “Soft” implementation of the restraints in the
CABS algorithm (from the top-four templates,
when available)
• Sampling via Replica Exchange Monte Carlo
• Almost always a single cluster of structures is
obtained and its centroid is taken as a final
model
EXAMPLES OF COMPARATIVE MODELING
SUMMARY OF COMPARATIVE MODELING
Frequently the models are closer to the
native structure than to any of the templates
CONCLUSIONS
• Algorithms employing reduced representation of
the protein conformational space are now mature
and efficient tools for protein modeling
• Applications:
- ab initio structure prediction
- comparative modeling (also multitemplate)
- structure assembly from sparse experimental data
- dynamics and thermodynamics of proteins, prions
- flexible docking, macromolecular assemblies
• Tools exist for the all-atom reconstruction of the
reduced models. (See: NIH Research Resources
for Multiscale Modeling Tools in Structural Biology
http://mmtsb.scripps.edu)
Acknowledgement
• Warsaw University, Poland
Michal Boniecki
Dominik Gront
Sebastian Kmiecik
Piotr Klein
Piotr Pokarowski
Piotr Rotkiewicz
• SUNY at Buffalo (NY)
Andrzej Kolinski
Piotr Rotkiewicz
Jeffrey Skolnick
More info: http://www.biocomp.chem.uw.edu.pl