Download (Simple) Physical Models of Protein Folding

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Artificial gene synthesis wikipedia , lookup

SR protein wikipedia , lookup

Gene expression wikipedia , lookup

Expression vector wikipedia , lookup

G protein–coupled receptor wikipedia , lookup

Magnesium transporter wikipedia , lookup

Ancestral sequence reconstruction wikipedia , lookup

Peptide synthesis wikipedia , lookup

Point mutation wikipedia , lookup

Ribosomally synthesized and post-translationally modified peptides wikipedia , lookup

Protein purification wikipedia , lookup

Amino acid synthesis wikipedia , lookup

Metabolism wikipedia , lookup

Metalloprotein wikipedia , lookup

Interactome wikipedia , lookup

Western blot wikipedia , lookup

Biosynthesis wikipedia , lookup

Genetic code wikipedia , lookup

Protein wikipedia , lookup

Homology modeling wikipedia , lookup

Nuclear magnetic resonance spectroscopy of proteins wikipedia , lookup

Two-hybrid screening wikipedia , lookup

Protein–protein interaction wikipedia , lookup

Biochemistry wikipedia , lookup

Proteolysis wikipedia , lookup

Transcript
Bioinformatics: Practical
Application of Simulation and
Data Mining
Protein Folding I
Prof. Corey O’Hern
Department of Mechanical Engineering
Department of Physics
Yale University
Protein
Folding
Biochemistry
Statistical
Mechanics
What are proteins?
Linear polymer
Folded state
•Proteins are important; e.g. for catalyzing and regulating biochemical reactions,
transporting molecules, …
•Linear polymer chain composed of tens (peptides) to thousands (proteins) of monome
•Monomers are 20 naturally occurring amino acids
•Different proteins have different amino acid sequences
•Structureless, extended unfolded state
•Compact, ‘unique’ native folded state (with secondary and tertiary structure) required
for biological function
•Sequence determines protein structure (or lack thereof)
•Proteins unfold or denature with increasing temperature or chemical denaturants
Amino Acids I
N-terminal
C
R
variable
side chain
C-terminal
peptide
bonds
•Side chains differentiate amino acid repeat units
•Peptide bonds link residues into polypeptides
Amino Acids II
The Protein Folding Problem:
What is ‘unique’ folded 3D structure of a protein based on its amino acid
sequence?
Sequence  Structure
Lys-Asn-Val-Arg-Ser-Lys-Val-Gly-Ser-Thr-Glu-Asn-Ile-Lys- His-Gln-Pro- Gly-Gly-Gly-…
Driving Forces
•Folding: hydrophobicity, hydrogen bonding, van der
Waals interactions, …
•Unfolding: increase in conformational entropy,
electric charge…
Hydrophobicity index
inside
H (hydrophobic)
outside
P (polar)
Higher-order Structure
Secondary Structure: Loops, -helices, -strands/sheets
-helix
-strand
-sheet
5Å
•Right-handed; three turns
•5-10 residues; peptide backbones fully extend
•Vertical hydrogen bonds between NH2 (teal/white)
•NH (blue/white) of one strand hydrogen-bonde
backbone group and C=O (grey/red) backbone group
to C=O (black/red) of another strand
four residues earlier in sequence
•C ,side chains (yellow) on adjacent strands
•Side chains (R) on outside; point upwards toward NH2
aligned; side chains along single strand alterna
•Each amino acid corresponds to 100, 1.5Å, 3.6
up and down
amino acids per turn
•(,)=(-135,-135)
•(,)=(-60,-45)
•-strand propensities: Val, Thr, Tyr, Trp,
•-helix propensities: Met, Ala, Leu, Glu
Phe, Ile
Backbonde Dihedral Angles
cos  ̂1 ̂ 2
Ramachandran Plot: Determining Steric Clashes
Non-Gly
Backbone
dihedral angles
Gly

theory

4 atoms define dihedral angle:
CCNC
CN CC
N CCN

=0,180

PDB
vdW radii
< vdW radii
backbone
flexibility
How can structures from PDB exist
outside Ramachadran bounds?
•Studies were performed on alanine dipeptide
•Fixed bond angle (=110) [105,115]
J. Mol. Biol. (1963) 7, 95-99
Side-Chain Dihedral Angles
4: Lys, Arg
5: Arg
Side chain: C-CH2-CH2-CH2-CH2-NH3
Use NCCCCCN to define 1, 2, 3, 4
Your model is oversimplified
and has nothing to do with
biology!
Molecular biologist
Your model is too complicated
and has no predictive power!
Biological Physicist
Possible Strategies for Understanding Protein Folding
•For all possible conformations, compute free energy
from atomic interactions within protein and proteinsolvent interactions; find conformation with lowest free
energy…e.g using all-atom molecular dynamics
simulations
Not possible?, limited time resolution
•Use coarse-grained models with effective interactions
between residues and residues and solvent
General, but qualitative
Why do proteins fold (correctly & rapidly)??
Levinthal’s paradox:
For a protein with N amino acids, number of backbone conformations/minima
Nc

2N
 = # allowed dihedral angles
How does a protein find the global optimum w/o
global search? Proteins fold much faster.
Nc~ 3200 ~1095
fold ~ Nc sample ~1083 s vs fold ~ 10-6-10-3 s
universe ~ 1017 s
Energy Landscape
U, F =U-TS
S12
S23
S-1
M2
M1
U  0
 2U  0 minimum
 2
 U  0 saddle point
 2U  0 maximum

M3
r1, r2 ,
, rN 
all atomic
coordinates;
dihedral angles
Roughness of Energy Landscape
smooth, funneled
(Wolynes et. al. 1997)
rough
Folding Pathways
dead
end
similarity to
native state
Folding Phase Diagram
smooth
rough
Open Questions
•What differentiates the native state from other low-lying energy minima?
•How many low-lying energy minima are there? Can we calculate
landscape roughness from sequence?
•What determines whether protein will fold to the native state or become
trapped in another minimum?
•What are the pathways in the energy landscape that a given protein
follows to its native state?
NP Hard Problem!
Digression---Number of Energy Minima for Sticky Spheres
Nm
Ns
Np
4
1
1
5
1
6
6
2
50
7
5
486
8
13 5500
9
52 49029
10
-
sphere
packings
polymer
packings
-
Ns~exp(aNm); Np~exp(bNm) with b>a