ProteinShop: A Tool for Protein
Structure Prediction and
Modeling
Silvia Crivelli
Computational Research Division
Lawrence Berkeley National Laboratory
The Protein Structure Prediction
Problem
To determine how proteins, the building
blocks of living cells, fold themselves into
three-dimensional shapes that define the
role they play in life.
Importance of Protein Structure
Prediction
• The shape of a protein determines its function.
• Knowledge of structure is used in many ways:
– Drug design
– Design of synthetic proteins
– Re-engineering defective proteins
• Genome projects are providing sequences for
many proteins whose structure will need to be
determined.
Protein Structures
Proteins consist of a long chain of
amino acids, the primary structure.
[Figure: chain of amino acids (Gly, Leu, Ser, Pro) with the backbone, side chains (R), and H-bonds labeled]
The constituent amino acids may
encourage hydrogen bonding that
forms regular structures, called
secondary structures (α-helix, β-sheet).
The secondary structures fold
together to form a compact
3-dimensional shape, called
the tertiary structure.
Ab Initio Approach
Our Goal: To provide an approach that relies more on physical
principles than on information from known proteins
The problem can be formulated as a global
minimization problem, as it is assumed that the
tertiary structure occurs at the global minimum of
the free energy function of the primary sequence
Ab Initio Method
Tertiary structure is
believed to minimize potential energy:
min V_MM(x)
where x = atom coordinates
Difficulties:
Proposed energy function may not
match nature
O(e^(n^2)) local minima
Very large parameter space
e.g., modestly sized protein
100 amino acids
~ 1,600 atoms
~ 4,800 variables
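The arithmetic behind these figures can be sketched as follows. This is only an illustration: the figure of 16 atoms per residue is an assumption inferred from the slide's own numbers (100 amino acids, about 1,600 atoms), and the function names are hypothetical.

```python
COORDS_PER_ATOM = 3  # each atom contributes x, y, and z


def estimate_atoms(num_residues: int, atoms_per_residue: int = 16) -> int:
    """Rough atom count; 16 per residue is an assumed average, not a measured value."""
    return num_residues * atoms_per_residue


def num_variables(num_atoms: int) -> int:
    """Number of optimization variables when x holds Cartesian atom coordinates."""
    return COORDS_PER_ATOM * num_atoms


atoms = estimate_atoms(100)
print(atoms, num_variables(atoms))
```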
The Search Algorithm
Given the amino acid sequence of a
protein, find the global minimum of
the free energy function.
Phase 1: Generate starting configurations
Phase 2: Global optimization
Secondary Structure Predictions
in Phase 1
Servers predict the secondary
structure likely to be present in a
target protein based on a
large database of known
proteins.
Sequence: SKIGIDGFGRIGRLVLRAALSCGAQ
Type:     CBBBBBCCCAAAAAAACCCBBBBBC
Weight:   1135522356789992888566733
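A prediction in this form can be split into contiguous segments with a few lines of code. The sketch below assumes that A, B, and C stand for α-helix, β-strand, and coil, and that each Weight digit is a per-residue confidence score; the slide does not define either, so both readings are assumptions, as are the function names.

```python
from itertools import groupby

SEQUENCE = "SKIGIDGFGRIGRLVLRAALSCGAQ"
TYPES = "CBBBBBCCCAAAAAAACCCBBBBBC"
WEIGHTS = "1135522356789992888566733"


def segments(types):
    """Return (label, start, end) for each run of equal labels; end is exclusive."""
    out = []
    pos = 0
    for label, run in groupby(types):
        length = len(list(run))
        out.append((label, pos, pos + length))
        pos += length
    return out


def mean_weight(weights, start, end):
    """Average of the single-digit weights over residues start..end-1."""
    digits = [int(c) for c in weights[start:end]]
    return sum(digits) / len(digits)


for label, start, end in segments(TYPES):
    print(label, SEQUENCE[start:end], round(mean_weight(WEIGHTS, start, end), 2))
```

Under the assumed reading, the two B segments are the predicted strands that the matching step then has to pair.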
Matching the predicted strands is a
combinatorial problem
Which strands are paired?
Which orientation? Anti-parallel or parallel
Which residues are paired? Odd or even
There are n! · 2^(n-2) possible
n-stranded motifs
96 motifs for n = 4
960 motifs for n = 5
It takes weeks to
create some of these
configurations using
constrained local
minimizations!
Distribution of Beta Sheets in Proteins with Applications to Structure Prediction
Ruczinski, Kooperberg, Bonneau, and Baker, Proteins 48, 2002
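The motif counts quoted on this slide follow directly from the formula n! · 2^(n-2). A minimal check (the function name is hypothetical):

```python
from math import factorial


def num_motifs(n: int) -> int:
    """Number of possible n-stranded motifs according to n! * 2^(n-2)."""
    if n < 2:
        raise ValueError("the formula is applied here only for n >= 2")
    return factorial(n) * 2 ** (n - 2)


for n in range(2, 7):
    print(n, num_motifs(n))
```

The count grows faster than exponentially, which is why building every configuration by constrained minimization quickly becomes impractical.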
CASP4 Competition
• Fourth community-wide experiment on the
Critical Assessment of Techniques for Protein
Structure Prediction (2000)
• Our group predicted 8 proteins
• Largest protein had 240 aa
• Most complex fold had 2 β-strands
ProteinShop
• Interactive tool for protein manipulation
• Designed to quickly create initial configurations
• It takes weeks to create a number of configurations
using constrained minimizations
• It takes a few hours to create the same
configurations with ProteinShop
Phase 1 with ProteinShop
[Flow diagram: amino acid sequence → secondary structure prediction → structure sequence → geometry generation → pre-configuration → direct manipulation → initial configurations → Phase 2 → final configuration]
ProteinShop takes minutes
CASP4 Competition (before ProteinShop)
• Our group predicted 8 proteins
• Largest protein had 240 aa
• Most complex fold had 2 β-strands
CASP5 Competition (with ProteinShop)
• Our group predicted 20 proteins
• Largest protein had 417 aa
• Most complex fold had 13 β-strands
Phase 2
[Flow diagram: amino acid sequence → Phase 1 → initial configurations → Phase 2: global optimization (subspace selection → subspace optimization → candidate selection) → final configuration]
Takes months to converge using hundreds of processors on Seaborg!
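The loop named in the diagram (subspace selection, subspace optimization, candidate selection) can be illustrated with a toy example. Everything here is an assumption made for illustration: the energy function is a standard multi-minimum test function rather than a molecular force field, the subspace is a random subset of coordinates, and the local search is a simple random perturbation. It is not the group's actual algorithm.

```python
import math
import random


def energy(x):
    """Toy function with many local minima (Rastrigin-like); a stand-in for V_MM."""
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)


def optimize_subspace(x, indices, rng, steps=200, step_size=0.3):
    """Random local search that perturbs only the selected coordinates."""
    best, best_e = list(x), energy(x)
    for _ in range(steps):
        trial = list(best)
        for i in indices:
            trial[i] += rng.gauss(0.0, step_size)
        e = energy(trial)
        if e < best_e:
            best, best_e = trial, e
    return best


def phase2(initial_configs, rounds=20, subspace_size=2, keep=4, seed=0):
    rng = random.Random(seed)
    pool = [list(c) for c in initial_configs]
    for _ in range(rounds):
        candidates = list(pool)  # keep the current pool so the best never gets worse
        for config in pool:
            indices = rng.sample(range(len(config)), subspace_size)  # subspace selection
            candidates.append(optimize_subspace(config, indices, rng))  # subspace optimization
        pool = sorted(candidates, key=energy)[:keep]  # candidate selection
    return pool[0]


rng = random.Random(1)
starts = [[rng.uniform(-5, 5) for _ in range(6)] for _ in range(4)]
best = phase2(starts)
print(min(energy(s) for s in starts), "->", energy(best))
```

Even in this toy setting the result depends strongly on the starting configurations, which is the motivation for generating good ones in Phase 1.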
Phase 2 with ProteinShop
[Flow diagram: amino acid sequence → Phase 1 → initial configurations → Phase 2: global optimization (subspace selection → subspace optimization → candidate selection) → final configuration, with ProteinShop adding a monitoring system, a steering system, and direct manipulation alongside Phase 2]
Will reduce computation time
Monitoring System
• Monitor progress of overall optimization/each
optimization process
• Alert user to important events during optimization
– A sudden drop in internal energy
– A group of processes getting stuck
• Test new heuristics for expanding nodes of the tree
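The two events named above could be detected from per-process energy histories along the following lines. This is a hypothetical sketch: the thresholds, the window size, the data layout, and all function names are assumptions, not part of ProteinShop.

```python
def sudden_drop(history, threshold):
    """True if the most recent step lowered the energy by more than threshold."""
    return len(history) >= 2 and history[-2] - history[-1] > threshold


def is_stuck(history, window, tolerance):
    """True if the energy varied by less than tolerance over the last `window` steps."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return max(recent) - min(recent) < tolerance


def scan(histories, threshold=5.0, window=4, tolerance=0.01):
    """Map process id -> "drop" or "stuck" for every process with an event."""
    events = {}
    for pid, history in histories.items():
        if sudden_drop(history, threshold):
            events[pid] = "drop"
        elif is_stuck(history, window, tolerance):
            events[pid] = "stuck"
    return events


histories = {
    0: [-100.0, -101.0, -110.0],            # large decrease on the last step
    1: [-90.0, -90.001, -90.002, -90.002],  # essentially flat
    2: [-80.0, -81.0, -82.5, -83.0],        # steady progress, no event
}
print(scan(histories))
```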
Steering System
• Change configurations during optimization to
account for developments not anticipated during
Phase 1
• Manipulate proteins that don’t seem to be realistic
or that are stuck in a local minimum
• Allow pruning of the optimization tree
• Assign multiple processes to a configuration that just had
a drop in internal energy
• Assign stuck processes to other configurations
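One way to picture the reassignment step is as a function that moves stuck processes onto configurations that just showed an energy drop. The sketch below is hypothetical: the event labels, the round-robin policy, and the data layout are assumptions made for illustration.

```python
def reassign(assignment, events):
    """Move every stuck process to a configuration that just had an energy drop.

    assignment: process id -> configuration id
    events:     process id -> "drop" or "stuck" (processes without events are omitted)
    """
    targets = [assignment[p] for p, e in sorted(events.items()) if e == "drop"]
    new_assignment = dict(assignment)
    if not targets:
        return new_assignment  # nothing promising to move to; leave everything as is
    stuck = [p for p, e in sorted(events.items()) if e == "stuck"]
    for i, pid in enumerate(stuck):
        # Round-robin over promising configurations (an assumed policy).
        new_assignment[pid] = targets[i % len(targets)]
    return new_assignment


assignment = {0: "A", 1: "B", 2: "C", 3: "D"}
events = {0: "drop", 1: "stuck", 3: "stuck"}
print(reassign(assignment, events))
```

In this example, processes 1 and 3 join process 0 on configuration A, so a configuration that just improved receives multiple processes while process 2 continues undisturbed.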
Plans for the Future
Use of the monitoring and steering
features to develop and test a new
method for protein structure prediction
Compete in CASP6 (Critical Assessment
of Techniques for Protein Structure Prediction)
Expand and enhance ProteinShop
ProteinShop
O. Kreylos, N. Max, B. Hamann,
S. Crivelli, and W. Bethel.
"Interactive Protein Manipulation."
Winner of the Best Application
Award, IEEE Visualization 2003,
Seattle.
Available to academic and non-profit organizations
proteinshop.lbl.gov