Download Force Field

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Protein wikipedia , lookup

Metabolism wikipedia , lookup

Biosynthesis wikipedia , lookup

Genetic code wikipedia , lookup

Amino acid synthesis wikipedia , lookup

Proteolysis wikipedia , lookup

Biochemistry wikipedia , lookup

Transcript
Force Fields
Force Fields
G Vriend 30-11-2008
Force Fields
What is a Force Field ?
• A force field is a set of equations and parameters which when
evaluated for a molecular system yields an energy
• A force field is a specific type of vector field where the value of a
given force is defined at each point in space. Examples include
gravitational fields and electrostatic fields
• In the fictional Star Trek universe, force shields are the defenses
most commonly used to protect a starship. The physics of a shield
is extracted from the physics of a force field ….. etc.
• The space around a radiating body within which its electromagnetic
oscillations can exert force on another similar body not in contact
with it
• Force field analysis evaluates non-monetary factors, just as costbenefit analysis evaluates monetary factors
2
Force Fields
But even this is too time consuming to calculate
So, most MD force fields don’t recalculate atomic
charges, at all...
3
Force Fields
One ‘serious’ example: Chou and Fasman
Example of Chou and Fasman:
We count all amino acids in a dataset of 400 proteins with know structure
(they had many fewer proteins available in 1974, but anyway...)
These 400 proteins in total have 102.197 amino acids.
4
Ala
Cys
Asp
Glu
Phe
Gly
His
Ile
Lys
7.123 = 7.0%
1.232 = 1.2%
5.993 etc
6.086
4.822
7.339
989
6.550
8.127
Helix 35.017 = 34.3%
Sheet 27.038 = 26.5%
Rest 40.142 = 39.3%
Force Fields
What is the null-model?
The null-model is the model that assumes that there is no signal in the
input data. In case of our Chou-and-Fasman example, the null model
assumes that there is no relation between the amino acid type and the
secondary structure.
So, if 7% (0.07) of all amino acids are of type Ala, and ~34% (0.34) of all
amino acids are in a helix, then 7% of 34% (0.07*0.34) is 2.4% (0.024) of
all Alanines should be observed in a helix.
And since that isn’t true, we can make a model that differs from the nullmodel, and thus we can make predictions.
5
Force Fields
One ‘serious’ example: Chou and Fasman
These 400 proteins in total have 102.197 amino acids.
Ala
7.123 = 7.0%
Helix 35.017 = 34.3%
Cys
1.232 = 1.2%
Sheet 27.038 = 26.5%
Asp
5.993 etc
Rest
40.142 = 39.3%
(Ala,Helix)predicted=0.07*34.3=2.4% or 2505 Ala in helix predicted.
But we count 3457 Ala in helix; that is 1.38 times ‘too many’.
So chances are ‘better than random to find Ala in a helix. How do we quantify
this?
6
Force Fields
Come to the rescue, one long dead physicist
7
Force Fields
One ‘serious’ example: Chou and Fasman
These 400 proteins in total have 102.197 amino acids.
Ala
7.123 = 7.0%
Helix 35.017 = 34.3%
Cys
1.232 = 1.2%
Sheet 27.038 = 26.5%
Asp
5.993 etc
Rest
40.142 = 39.3%
(Ala,Helix)predicted=0.07*34.3=2.4% or 2505 Ala in helix predicted.
But we count 3457 Ala in helix; that is 1.38 times ‘too many’.
So chances are ‘better than random to find Ala in a helix. How do we
quantify this?
So the ‘score’ for (Ala,helix) = Q(A,H)= ln(observed/predicted) =
ln(3457/2505)=ln(1.38)=0.32. The preference parameter Q(A,H) is
positive.
8
And how do we now predict the secondary structure of a protein?
Force Fields
One ‘serious’ example: Chou and Fasman
One example could be:
Loop over the sequence to be predicted
Give each residue 3 energy scores (for H, S, and rest, respectively)
Find stretches where the energy for one of the three is higher than the
other two.
Hundreds of recipes can be thought of, and the best one wins (CASP
competition teaches us that the best one is a neural network that uses
multiple sequence alignments as input).
SNDALPIVAKGS Nice helix, but for the PIV.
TSEAAQIALHSG Aha, P is exception, and V too
TNDLLMAVMRGG and I too, so it IS a helix after all.
9
Force Fields
And the other way around
ΔG= -RT.ln(K)
ΔG is just over 1kCal/Mole when K=10 and K is the ratio between two
‘somethings’ (can be anything). Swimming into a gradient of a factor
10 costs 1 kCal/Mole. A pH unit difference must be ‘worth’ a
kCal/Mole.
A nice exam question would be to think of an example of Vriend’s law
of 10 that hasn’t been discussed in the classroom yet...
10