Folding simulation: self-organization of 4-helix bundle protein (yellow = helical turns)
Protein structure
Protein: heteropolymer chain made of amino acid residues
[Figure: amino acid residue, +H3N-CH(R)-COO-, with backbone dihedral angles φ and ψ]
Chain of amino acid residues
20 different amino acids
More than 50,000 different proteins in human body alone
Protein structure
Hierarchical levels of structure:
primary: Linear chain of amino acids
secondary: Local regular structures (α-helix, β-strand)
tertiary: 3-D compact structure with long-range contacts (e.g. myoglobin)
The biological function is determined by shape.
The shape is determined by primary sequence. How?
(“Protein Folding Problem”)
Protein Folding
“Protein folding”: Primary sequence (random coil) → Native state (highly organized, compact 3-d structure), ∆S<0
- Huge variation in the possible primary sequence: 20^N (20 different amino acids, N is the number of amino acids in a chain)
- Most sequences do not fold; primary sequence must be carefully chosen
- Methods for finding primary sequences that fold to specific shapes:
  - Evolution: trial and error, requires lots of time
  - Engineering: understand underlying principles of self-organization
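To get a feel for the 20^N count, here is a minimal Python sketch. The function names are illustrative only and do not come from the slides; the base-10 logarithm is used because the raw numbers quickly become unwieldy.

```python
import math

def sequence_count(n_residues, alphabet_size=20):
    """Exact number of distinct chains of n_residues drawn from alphabet_size amino acids."""
    return alphabet_size ** n_residues

def log10_sequence_count(n_residues, alphabet_size=20):
    """Order of magnitude (base-10 logarithm) of the same count."""
    return n_residues * math.log10(alphabet_size)

# Chain lengths chosen for illustration; 153 is the length of myoglobin.
for n in (10, 50, 153):
    print(f"N = {n}: about 10^{log10_sequence_count(n):.1f} possible sequences")
```

Even a short chain of 50 residues already has far more possible sequences than could ever be tested one by one, which is why the choice of sequence matters.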
Protein Folding Problem
- Proteins are long (>50) chains of amino acid residues
- Biological functioning requires the protein chain to fold to a very specific compact shape: “native state”
- Chain is very flexible: each amino acid has internal degrees of freedom (Φ, Ψ, sidechain, e.g. 4 states each) ⇒ > 64 configurations per amino acid
  Ex: Myoglobin (153 amino acids): 64^153 = 10^276 configs
- Paradox: Even with a super-fast sampling rate of 10^-12 sec/config
  ⇒ 10^264 seconds (10^256 yrs) to randomly find native state.
  (degeneracy of native state reduces this to 10^118 years)
  Actual real protein folding times: milliseconds!
- How?
  Folding must be a guided deterministic process, not random.
  Configuration space is frustrated, ultra-metric.
  Different initial configurations converge to native state.
  How does primary sequence determine folding dynamics?
- Interactions are non-linear: Anti-chaotic dynamics!?
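The arithmetic behind the paradox can be checked with a short Python sketch. The constants are taken from the slide (64 configurations per residue, 10^-12 seconds per configuration); the function name and the seconds-per-year value are assumptions added for illustration. Working with logarithms avoids floating-point overflow.

```python
import math

STATES_PER_RESIDUE = 64      # e.g. 4 states each for phi, psi, and the side chain
SAMPLING_TIME_S = 1e-12      # seconds spent on each configuration
SECONDS_PER_YEAR = 3.156e7   # approximate length of a year in seconds

def levinthal_log10(n_residues):
    """Return log10 of (configurations, search time in seconds, search time in years)."""
    log_configs = n_residues * math.log10(STATES_PER_RESIDUE)
    log_seconds = log_configs + math.log10(SAMPLING_TIME_S)
    log_years = log_seconds - math.log10(SECONDS_PER_YEAR)
    return log_configs, log_seconds, log_years

configs, seconds, years = levinthal_log10(153)  # myoglobin
print(f"configs ~ 10^{math.floor(configs)}")
print(f"seconds ~ 10^{math.floor(seconds)}")
print(f"years   ~ 10^{math.floor(years)}")
```

The orders of magnitude agree with the slide: roughly 10^276 configurations, 10^264 seconds, and 10^256 years for an exhaustive random search.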
Energy Landscape
Funnel shaped: the landscape guides different initial configurations to the same native state.
Anti-Chaos? Is this a valid and useful approach?
(B. Gerstman and Y. Garbourg, Journal of Polymer Science B:
Polymer Physics, 36, 2761-2769, 1998.)
Many other axes are necessary to represent all the structural degrees of freedom. Which are most important?
Ultimate Physics Aim: Determine which aspects of 1-D sequence of
amino acids in peptide chain determine efficient folding pathway and
final shape (native state).
Immediate Aim: Determine if formalism of non-linear dynamics is
useful for investigating protein folding.
This work: Can the formalism of non-linear dynamics show that large
scale un-folding is deterministic (and is it mathematically anti-chaotic)
and distinguish it from random thermal fluctuations?
Use data from lattice simulations of protein unfolding
(realistic folding simulations of full proteins not available)
First check to confirm that model realistically simulates protein dynamics.
Compare results from model for characteristics that have been
experimentally measured;
e.g. Heat Capacity
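The slides do not say how the heat capacity is extracted from the simulation. One common route, shown here only as an assumed sketch, is the fluctuation formula C = (<E^2> - <E>^2) / (k_B T^2) applied to energies sampled at fixed temperature. The function name and the reduced units (k_B = 1) are illustrative choices.

```python
def heat_capacity(energies, temperature, k_b=1.0):
    """Estimate heat capacity from energy fluctuations at a fixed temperature.

    energies: sequence of configuration energies sampled at equilibrium.
    temperature: simulation temperature, in units where k_b defaults to 1.
    """
    n = len(energies)
    mean_e = sum(energies) / n
    mean_e2 = sum(e * e for e in energies) / n
    variance = mean_e2 - mean_e ** 2
    return variance / (k_b * temperature ** 2)

# Toy data: a system that spends equal time at E = 0 and E = 2.
print(heat_capacity([0.0, 2.0, 0.0, 2.0], temperature=1.0))
```

A peak in this quantity as a function of temperature is what would typically be compared with a measured unfolding transition.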
Why use computer model?
The system is complex
- Huge number of degrees of structural freedom
- Many terms in the Hamiltonian
- System is not solvable analytically
Monte Carlo simulations are very useful for these kinds of systems
- Interested in relaxation times (non-equilibrium dynamics), as
well as final configurations (equilibrium).
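As an illustration of how a Monte Carlo simulation samples configurations, here is a minimal Metropolis sketch on a two-state toy system. The slides do not specify the move set or acceptance rule used in the lattice model, so the Metropolis rule, the function name, and the parameters below are all assumptions.

```python
import math
import random

def metropolis_two_state(eps, temperature, n_steps, seed=0, k_b=1.0):
    """Fraction of steps spent in the upper state of a two-level system.

    State 0 has energy 0 and state 1 has energy eps. Each step proposes a
    flip and accepts it with the Metropolis probability min(1, exp(-dE/kT)).
    """
    rng = random.Random(seed)
    state = 0
    steps_in_upper = 0
    for _ in range(n_steps):
        proposed = 1 - state
        delta_e = eps if proposed == 1 else -eps
        if delta_e <= 0 or rng.random() < math.exp(-delta_e / (k_b * temperature)):
            state = proposed
        steps_in_upper += state
    return steps_in_upper / n_steps

# Compare the sampled occupancy with the exact Boltzmann value.
sampled = metropolis_two_state(eps=1.0, temperature=1.0, n_steps=200_000)
exact = math.exp(-1.0) / (1.0 + math.exp(-1.0))
print(sampled, exact)
```

In a lattice protein simulation the "state" would be a full chain configuration and the proposed change a local chain move, but the accept-or-reject step follows the same pattern.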
Lattice model and interaction Hamiltonian
Red: backbone
Green: side chain
Interaction Hamiltonian:
$$
H = \sum_{i} \left[ \sum_{j>i} \left( a_{ij}^{ss} \sum_{p} E_{ij}^{ssp} + a_{ij}^{bb} E^{bb} + a_{ij}^{rep} E^{rep} \right) + \sum_{l} a_{i}^{l} E^{l} + \sum_{m} a_{i}^{m} E^{m} \right]
$$
close enough contact (or preferred state)? Yes: a = 1; No: a = 0.
ss : sidechain-sidechain
bb: backbone-backbone
l : local
m : cooperative
p : hydrophobic or polar or hydrophilic
Protein Configuration Energy Determined by Interaction Hamiltonian
i, j: amino acid residue numbers in the primary sequence.
$a_{ij}^{ss}$: are sidechains of i and j close enough to interact; yes = 1, no = 0.
$E_{ij}^{ssp}$: sidechain-sidechain energy (p = 1 hydrophobic-hydrophobic, p = 2 hydrophilic-hydrophilic, p = 3 hydrophobic-hydrophilic).
$a_{ij}^{bb}$: are backbones of i and j close enough to interact; yes = 1, no = 0.
$E^{bb}$: backbone-backbone interaction energy (hydrogen bond, dipole, and soft core repulsion combined together).
$a_{i}^{l}$: are residues i-1, i, i+1 arranged so that i is in its preferred user-defined local configuration (i.e. α-helix, β-sheet, turn); yes = 1, no = 0.
$E^{l}$: local propensity energy.
$a_{i}^{m}$: are residues i-1, i, i+1, i+2 arranged so that i and i+1 are both in the same preferred local configuration; yes = 1, no = 0.
$E^{m}$: medium range (cooperative) propensity energy.
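The terms defined above can be summed in a few lines of Python. This is only a sketch under stated assumptions: the energy values, the input layout (contact sets and per-residue flags), and the function names are hypothetical, and the separate repulsive term is left out because the slides do not define it.

```python
# Hypothetical energy parameters in arbitrary units (not from the slides).
E_SS = {1: -1.0, 2: -0.2, 3: 0.3}  # p = 1 H-H, p = 2 P-P, p = 3 H-P
E_BB = -0.5                        # backbone-backbone energy
E_L = -0.3                         # local propensity energy
E_M = -0.4                         # cooperative propensity energy

def pair_type(hydrophobic, i, j):
    """Return p for the residue pair (i, j) as defined in the slides."""
    if hydrophobic[i] and hydrophobic[j]:
        return 1
    if not hydrophobic[i] and not hydrophobic[j]:
        return 2
    return 3

def configuration_energy(hydrophobic, ss_contacts, bb_contacts, local_ok, coop_ok):
    """Sum sidechain, backbone, local, and cooperative terms for one configuration.

    ss_contacts, bb_contacts: sets of (i, j) pairs with i < j whose indicator a = 1.
    local_ok, coop_ok: per-residue 0/1 flags for the local and cooperative indicators.
    """
    energy = 0.0
    for i, j in ss_contacts:
        energy += E_SS[pair_type(hydrophobic, i, j)]
    energy += E_BB * len(bb_contacts)
    energy += E_L * sum(local_ok)
    energy += E_M * sum(coop_ok)
    return energy

# Four-residue toy configuration.
hydrophobic = [True, False, True, True]
print(configuration_energy(hydrophobic, {(0, 2), (1, 3)}, {(0, 3)}, [1, 1, 0, 1], [1, 0, 0, 0]))
```

Storing only the pairs whose indicator equals 1 keeps the sum short, since every pair with a = 0 contributes nothing.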
DNA
[Figure: DNA Packaging Inside Nucleus. Labels: Nucleus, Chromosome, Protein Scaffold, Supercoil, Nucleosome, Double Helix (Partially Disrupted)]
[Figure: Double helix. Diameter 2 nanometers; 1 turn = 10 base pairs = 3.4 nanometers; Minor Groove, Major Groove]
NUCLEOTIDES
[Figure: nucleotide components: Phosphate, Sugar, Base]
GENETIC CODE: TRIPLET CODONS
Initiation Codon: AUG (Met)
Termination Codons: UAA, UAG, UGA

First Base            Second Base in Codon              Third Base
in Codon     U          C          A          G         in Codon
U            UUU Phe    UCU Ser    UAU Tyr    UGU Cys   U
             UUC Phe    UCC Ser    UAC Tyr    UGC Cys   C
             UUA Leu    UCA Ser    UAA Stop   UGA Stop  A
             UUG Leu    UCG Ser    UAG Stop   UGG Trp   G
C            CUU Leu    CCU Pro    CAU His    CGU Arg   U
             CUC Leu    CCC Pro    CAC His    CGC Arg   C
             CUA Leu    CCA Pro    CAA Gln    CGA Arg   A
             CUG Leu    CCG Pro    CAG Gln    CGG Arg   G
A            AUU Ile    ACU Thr    AAU Asn    AGU Ser   U
             AUC Ile    ACC Thr    AAC Asn    AGC Ser   C
             AUA Ile    ACA Thr    AAA Lys    AGA Arg   A
             AUG Met    ACG Thr    AAG Lys    AGG Arg   G
G            GUU Val    GCU Ala    GAU Asp    GGU Gly   U
             GUC Val    GCC Ala    GAC Asp    GGC Gly   C
             GUA Val    GCA Ala    GAA Glu    GGA Gly   A
             GUG Val    GCG Ala    GAG Glu    GGG Gly   G
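The genetic code table can be put to work in a short Python sketch that reads an mRNA string codon by codon. The function name and the example sequence are made up for illustration; the table itself is the standard genetic code, listed in U, C, A, G order for the first, second, and third base.

```python
BASES = "UCAG"

# Amino acids in table order: first base, then second base, then third base.
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr Stop Stop Cys Cys Stop Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()

CODON_TABLE = {
    first + second + third: AMINO[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no initiation codon found
    peptide = []
    for pos in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

# Hypothetical mRNA fragment: two leading bases, then AUG UUU GGC UAA.
print(translate("GGAUGUUUGGCUAAUU"))  # ['Met', 'Phe', 'Gly']
```

Reading starts at the initiation codon AUG and stops at the first termination codon, so the bases before AUG and after UAA in the example are ignored.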