Download Protein Threading - Laboratory of Molecular Modelling

Protein Threading
Zhanggroup, 2003-10-22

Overview
Background: protein structure, protein folding and designability
Protein threading
Current limitations to protein threading
Computational complexity of certain formulations of the protein threading problem
Performance of protein threading systems
References

Protein Structure

Primary, secondary, tertiary structure
We can only refer to "the structure" of a protein if a particular environment is assumed:
  solvent environment (aqueous, trans-membrane, ...)
  temperature
  pH, etc.
Different environments yield different structures, or no stable structure at all
Protein molecules are not completely rigid structures:
  kinetic energy: energetic collisions with solvent molecules
  vibrations, sidechain conformational changes
  flexible sections of the peptide chain
The native tertiary structure of a protein is thus an average over these fluctuating conformations

Protein Folding

Protein folding = searching for a conformation having minimum energy
Factors in protein folding:
  hydrophobic effects
  electrostatic charges in residues
  hydrogen bonding
  chaperonins, ribosomes
Three stages of folding:
  denatured (unfolded) state
  molten globule state
  native (compact) state
Most proteins will return to their native state after forced denaturation

The Protein Folding Problem

Given a protein's amino acid sequence, what is its tertiary structure?

The protein folding problem is hard
Direct approach: molecular dynamics simulation
Simulate, on an atomic level, the folding of a single protein molecule:
  protein = thousands of atoms
  solvent environment = hundreds to thousands of molecules => thousands of atoms
  time steps on sub-picosecond time scales
  run the simulation for 1-5 seconds of simulated time
We need more years of Moore's law to make this computation feasible
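To make the scale of the direct approach concrete, here is a minimal back-of-the-envelope sketch. The 1 femtosecond time step and the `affordable` step budget are illustrative assumptions, not figures from the slides; the slides only say that steps are sub-picosecond and that 1-5 seconds must be simulated.

```python
import math

def md_steps(simulated_seconds, timestep_seconds=1e-15):
    """Number of integration steps needed to cover the simulated time.

    The default 1 fs time step is an assumption (a typical sub-picosecond value).
    """
    return round(simulated_seconds / timestep_seconds)

def doublings_needed(required_steps, affordable_steps):
    """Number of doublings of compute between an affordable and a required step count."""
    return math.log2(required_steps / affordable_steps)

required = md_steps(1.0)   # one second of folding at 1 fs per step
affordable = 1e9           # hypothetical budget: steps we can afford to compute today
print(f"steps required: {required:.1e}")
print(f"doublings of compute needed: {doublings_needed(required, affordable):.1f}")
```

Each step also has to evaluate forces over thousands of protein and solvent atoms, so the true cost is larger than the step count alone suggests.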

Designability
A protein with a stable native state cannot have another low-energy state nearby in conformational space
A structure is highly designable if its minimum energy state has no low-energy neighbours
Evolution depends on designability to preserve function under mutation

Protein Threading
Inverse protein folding problem: given a tertiary structure, find an amino acid sequence that folds to that structure
Protein threading: given a library of possible protein folds and an amino acid sequence, find the fold with the best sequence -> structure alignment (threading)

Estimate: only a limited number of different protein structures exist in nature (Chothia, 1992)
Protein threading has four components:
  a library of protein folds (templates)
  a scoring function to measure the fitness of a sequence -> structure alignment
  a search technique for finding the best alignment between a fixed sequence and structure
  a means of choosing the best fold from among the best-scoring alignments of a sequence to all possible folds
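The fourth component, choosing among folds, can be sketched as follows. Raw alignment scores are not directly comparable between templates of different size, so this sketch ranks folds by a z-score against shuffled versions of the same sequence. That choice is an assumption for illustration, not the method of the slides, and `align_score`, `z_score` and `choose_fold` are hypothetical names.

```python
import random
import statistics

def z_score(sequence, template, align_score, n_shuffles=50, seed=0):
    """Score of the real sequence, in standard deviations above shuffled sequences."""
    rng = random.Random(seed)
    native = align_score(sequence, template)
    shuffled_scores = []
    for _ in range(n_shuffles):
        letters = list(sequence)
        rng.shuffle(letters)  # same composition, random order
        shuffled_scores.append(align_score("".join(letters), template))
    spread = statistics.pstdev(shuffled_scores)
    if spread == 0:
        return 0.0
    return (native - statistics.mean(shuffled_scores)) / spread

def choose_fold(sequence, library, align_score):
    """Return the name of the template with the highest z-score."""
    return max(library, key=lambda name: z_score(sequence, library[name], align_score))

# Toy stand-in for components 2 and 3: an ungapped alignment that counts positions
# where the residue matches the residue type the template position prefers.
def toy_align_score(sequence, template):
    return sum(1 for aa, wanted in zip(sequence, template) if aa == wanted)

library = {"foldA": "LLLKKK", "foldB": "KKKLLL"}  # hypothetical templates
print(choose_fold("LLLKKK", library, toy_align_score))
```

In a real system `align_score` would be the best score found by the search technique for that template, which is where most of the computational cost lies.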
Scoring Schemes for Sequence -> Structure Alignments

The scoring scheme for a particular threading of a sequence onto a structure measures the degree to which environmental preferences are satisfied
Different amino acid types prefer different environments, e.g.
  structural preferences:
    in helix
    in sheet
    not exposed to solvent
  pairwise interactions with neighbouring amino acids
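A minimal sketch of such a scoring scheme, combining a per-position environment term with a pairwise contact term. All numbers in the tables are made-up illustrative values, and the function and table names are hypothetical. Higher scores mean the preferences are better satisfied.

```python
def score_threading(sequence, environments, contacts, env_score, pair_score):
    """Score a sequence placed residue-by-residue onto structural positions.

    environments[k] labels the environment of position k (e.g. "buried").
    contacts is a list of (i, j) position pairs that are neighbours in the structure.
    Missing table entries count as 0 (no preference).
    """
    singleton = sum(env_score.get((aa, env), 0.0)
                    for aa, env in zip(sequence, environments))
    pairwise = sum(pair_score.get(frozenset((sequence[i], sequence[j])), 0.0)
                   for i, j in contacts)
    return singleton + pairwise

# Illustrative preferences: leucine (L) likes to be buried, lysine (K) exposed.
env_score = {("L", "buried"): 2.0, ("L", "exposed"): -1.0,
             ("K", "exposed"): 1.5, ("K", "buried"): -2.0}
# Illustrative pairwise term: two leucines in contact are favourable.
pair_score = {frozenset(("L", "L")): 1.0}

environments = ["buried", "exposed", "buried"]
contacts = [(0, 2)]

print(score_threading("LKL", environments, contacts, env_score, pair_score))  # 6.5
print(score_threading("KLK", environments, contacts, env_score, pair_score))  # -5.0
```

The singleton term depends only on which residue lands at each position, while the pairwise term depends on two positions at once. That distinction is what drives the complexity results discussed later.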
Formal Statement of the Protein Threading Problem
C is a protein core having m segments C_i, each representing a set of contiguous amino acids. Let c_i be the length of C_i
Sequence a = a_1 a_2 … a_n of amino acids
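To see how large the search space is, the sketch below counts threadings under one common reading of this setup, which is an assumption here since the slides do not spell it out: a threading assigns each core segment C_i a start position in the sequence, segments keep their order and do not overlap, and the loops between them may have any length. With total slack s = n - (c_1 + ... + c_m), the count is the binomial coefficient C(s + m, m).

```python
from math import comb

def count_threadings(n, core_lengths):
    """Closed-form count of ordered, non-overlapping placements of the core segments."""
    m = len(core_lengths)
    slack = n - sum(core_lengths)  # residues left over for loops and the two ends
    if slack < 0:
        return 0
    return comb(slack + m, m)

def enumerate_threadings(n, core_lengths):
    """Brute-force enumeration of start positions, to check the closed form."""
    def place(i, earliest):
        if i == len(core_lengths):
            yield ()
            return
        remaining = sum(core_lengths[i:])  # room still needed by segments i..m-1
        for start in range(earliest, n - remaining + 1):
            for rest in place(i + 1, start + core_lengths[i]):
                yield (start,) + rest
    return list(place(0, 0))

print(count_threadings(10, [3, 2]))           # 21
print(len(enumerate_threadings(10, [3, 2])))  # 21
print(count_threadings(300, [8] * 15))
```

The last line uses hypothetical sizes (300 residues, 15 segments of length 8) to show how quickly the count grows beyond what enumeration can handle.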

Current limitations to protein threading

Statistical problems
Definition of neighbor and/or pairwise contact environments:
  energetic neighbor or contact neighbor?
Computational Complexity of Finding an Optimal Alignment
The complexity of the protein threading problem depends on whether:
  (i) variable-length gaps are allowed in alignments
  (ii) the scoring function for an alignment incorporates pairwise interactions between amino acids

Property (i) makes the search space exponential in the length of the sequence
Property (ii) forces a solution to take non-local effects into account

Any protein threading scheme with both properties is NP-complete
(via 3-SAT: Lathrop, 1994)
(via MAX-CUT: Akutsu and Miyano, 1999)
Thus all protein threading approaches can be divided into four groups:
1 no variable-length gaps allowed
2 no pairwise interactions considered in scoring function
3 no optimal solution guarantee
4 exponential runtime
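Group 2 illustrates why dropping pairwise interactions helps: when each segment's score depends only on where that segment is placed, an optimal threading with variable-length gaps can be found by dynamic programming in polynomial time. The sketch below assumes the start-position formulation (ordered, non-overlapping segments) and a hypothetical `seg_score(i, t)` callback giving the score of placing segment i at start t. It is an illustration, not any particular published algorithm.

```python
def best_threading(n, core_lengths, seg_score):
    """Maximise the sum of per-segment scores over ordered, non-overlapping placements.

    Returns (best_score, start_positions), or None if the segments do not fit.
    """
    m = len(core_lengths)
    if m == 0 or sum(core_lengths) > n:
        return None
    NEG = float("-inf")
    # best[i][t]: best total for segments 0..i when segment i starts at position t
    best = [[NEG] * n for _ in range(m)]
    back = [[None] * n for _ in range(m)]
    for i, c in enumerate(core_lengths):
        run_best, run_arg = NEG, None  # best placement of segment i-1 ending by t
        for t in range(n - c + 1):
            if i == 0:
                prev, arg = 0.0, None
            else:
                p = t - core_lengths[i - 1]  # latest start of segment i-1 that fits
                if p >= 0 and best[i - 1][p] > run_best:
                    run_best, run_arg = best[i - 1][p], p
                prev, arg = run_best, run_arg
            if prev != NEG:
                best[i][t] = prev + seg_score(i, t)
                back[i][t] = arg
    last = max(range(n), key=lambda t: best[m - 1][t])
    starts = [last]
    for i in range(m - 1, 0, -1):  # follow back-pointers to recover all starts
        starts.append(back[i][starts[-1]])
    return best[m - 1][last], starts[::-1]

# Toy example: segment 0 prefers leucine (L), segment 1 prefers lysine (K).
sequence = "AALLLAAKKA"
core_lengths = [3, 2]
preferred = ["L", "K"]

def toy_seg_score(i, t):
    return sequence[t:t + core_lengths[i]].count(preferred[i])

print(best_threading(len(sequence), core_lengths, toy_seg_score))  # (5.0, [2, 7])
```

The table has m rows and n columns and each cell is filled in constant time apart from the `seg_score` call. Adding a pairwise term would make the score of one segment depend on where the others are placed, which breaks this decomposition.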
Performance of Protein Threading Systems
CASP1 (1994), CASP2 (1996), CASP3 (1998): Critical Assessment of Structure Prediction meetings
  protein threading methods have consistently been the winners
  success depends on structural similarity of target to known structures
  successful even when target sequence and library sequence have low homology
Much room for improvement in all areas of protein threading, e.g.:
  algorithms for searching the threading space
  reliable, biologically accurate scoring functions