© Copyright American Institute of Physics. This article may be downloaded for personal use only. Any other use requires prior permission of the author and the American Institute of Physics.
THE JOURNAL OF CHEMICAL PHYSICS 126, 244111 (2007)
Dihedral angle principal component analysis of molecular dynamics simulations
Alexandros Altis, Phuong H. Nguyen, Rainer Hegger, and Gerhard Stock^{a)}
Institute of Physical and Theoretical Chemistry, J. W. Goethe University, Max-von-Laue-Strasse 7, D-60438 Frankfurt, Germany
(Received 14 March 2007; accepted 11 May 2007; published online 29 June 2007)
It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles $\{\varphi_n\}$ to the metric coordinate space $\{x_n = \cos\varphi_n,\ y_n = \sin\varphi_n\}$ was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which $N$ angular variables naturally lead to $N$ eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300 ns molecular dynamics simulation, a critical comparison of the various methods is given. © 2007 American Institute of Physics. [DOI: 10.1063/1.2746330]
a) Electronic mail: [email protected]
0021-9606/2007/126(24)/244111/10/$23.00
I. INTRODUCTION
Classical molecular dynamics (MD) simulations have become a popular and powerful method in describing the structure, dynamics, and function of biomolecules in microscopic detail [1]. As MD simulations produce a considerable amount of data (i.e., $3M$ coordinates of all $M$ atoms for each time step), there has been an increasing interest to develop methods to extract the "essential" information from the trajectory. For example, one often wants to represent the molecule's free energy surface (the "energy landscape" [2–4]) as a function of a few important coordinates (the "reaction coordinates"), which describe the essential physics of a biomolecular process such as protein folding or molecular recognition. The reduction of the dimensionality from $3M$ atom coordinates to a few collective degrees of freedom is therefore an active field of theoretical research [5–28].
Principal component analysis (PCA) [5], also called quasiharmonic analysis or essential dynamics method [6–9], is one of the most popular methods in systematically reducing the dimensionality of a complex system. The approach is based on the covariance matrix, which provides information on the two-point correlations of the system. The PCA represents a linear transformation that diagonalizes the covariance matrix and thus removes the instantaneous linear correlations among the variables. Ordering the eigenvalues of the transformation decreasingly, it has been shown that a large part of the system's fluctuations can be described in terms of only a few principal components, which may serve as reaction coordinates [6–12].
Recently, it has been suggested to employ internal (instead of Cartesian) coordinates in a PCA [13–19]. In biomolecules, in particular, the consideration of dihedral angles appears appealing, because other internal coordinates such as bond lengths and bond angles usually do not undergo changes of large amplitudes. Studying the reversible folding and unfolding of pentaalanine in explicit water, Mu et al. [17] showed that a PCA using Cartesian coordinates did not yield the correct rugged free energy landscape due to an artifact of the mixing of internal and overall motion. As internal coordinates naturally provide a correct separation of internal and overall dynamics, they proposed a method, referred to as dihedral angle principal component analysis (dPCA), which is based on the dihedral angles $(\phi_n, \psi_n)$ of the peptide backbone. To avoid the problems arising from the circularity of these variables, a transformation from the space of dihedral angles $\{\varphi_n\}$ to a linear metric coordinate space (i.e., a vector space with the usual Euclidean distance) was built up by the trigonometric functions $\sin\varphi_n$ and $\cos\varphi_n$. In a recent comment [29] to Ref. 17, the concern was raised that the dPCA method may lead to spurious results because of the inherent constraints ($\sin^2\varphi_n + \cos^2\varphi_n = 1$) of the formulation. While it is straightforward to show that the problem described in Ref. 29 was caused by numerical artifacts due to insufficient sampling [30], the discussion nevertheless demonstrates the need for a thorough general analysis of the dPCA.
In this work, we present a comprehensive account of various theoretical issues underlying the dPCA method. We start with a brief introduction to the circular statistics of angle variables, discuss the transformation from an angle to the unit circle proposed in Ref. 17, and demonstrate that the transformation amounts to a one-to-one representation of the original angle distribution. Adopting the $(\phi, \psi)$ distribution of trialanine as a simple but nontrivial example, the properties of the dPCA are discussed in detail. In particular, it is shown that in this case the dPCA results are equivalent to the results of a Cartesian PCA and that the dPCA eigenvectors may be characterized in terms of the corresponding conformational changes of the peptide. Furthermore, we introduce a complex-valued version of the dPCA, which provides new insights on the PCA of circular variables. Adopting a 300 ns MD simulation of the folding of decaalanine, we conclude with a critical comparison of the various methods.
Downloaded 20 Jul 2007 to 141.2.216.130. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp
II. CIRCULAR STATISTICS
Dihedral angles $\varphi \in [0^\circ, 360^\circ)$ represent circular (or directional) data [31]. Unlike the case of regular data $x \in (-\infty, \infty)$, the definition of a metric is not straightforward, which makes it difficult to calculate distances or means. For example, the regular data $x_1 = 10$ and $x_2 = 350$ clearly give $\Delta x = |x_2 - x_1| = 340$ and $\langle x \rangle = (10 + 350)/2 = 180$. A visual inspection of the corresponding angles $\varphi_1 = 10^\circ$ and $\varphi_2 = 350^\circ$, on the other hand, readily shows that $\Delta\varphi = 20^\circ \neq |\varphi_2 - \varphi_1|$ and $\langle\varphi\rangle = 0^\circ \neq (\varphi_1 + \varphi_2)/2$. To recover the standard rules of calculating distances and the mean, we may assume that $\varphi \in [-180^\circ, 180^\circ)$. Then $\varphi_1 = 10^\circ$ and $\varphi_2 = -10^\circ$, and we obtain $\Delta\varphi = |\varphi_2 - \varphi_1| = 20^\circ$ and $\langle\varphi\rangle = (\varphi_1 + \varphi_2)/2 = 0^\circ$. This example manifests the general property that, if the range of angles covered by the data set is smaller than $180^\circ$, we may simply shift the origin of the angle coordinates to the middle of this range and perform standard statistics.
The situation is more involved for "true" circular data whose range exceeds $180^\circ$. This is the case for folding biomolecules, since the $\psi$ angle of the peptide backbone is typically distributed as $\psi_\alpha \approx -60^\circ \pm 30^\circ$ (for $\alpha_R$ helical conformations) and $\psi_\beta \approx 140^\circ \pm 30^\circ$ (for $\beta$ extended conformations). If the values of the angles can be described by a normal distribution, one may employ the von Mises distribution [31], which represents the circular statistics' equivalent of the normal distribution for regular data. However, this method is not applicable to the description of conformational transitions, since the corresponding dihedral angle distributions can typically only be described by multipeaked probability densities.
A general approach to circular statistics is obtained by representing the angle $\varphi$ by its equivalent vector $(x, y)$ on the unit circle. This amounts to the transformation
$$\varphi \mapsto \left\{ \begin{array}{l} x = \cos\varphi \\ y = \sin\varphi \end{array} \right\}. \tag{1}$$
Unlike the periodic range of the angle coordinate $\varphi$, the vectors $(x, y)$ are defined in a linear space, which means that we can define the usual Euclidean metric $\Delta^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2$ between any two vectors $(x_1, y_1)^T$ and $(x_2, y_2)^T$. The distance of two angles with an actual small distance, e.g., $\varphi_1 = 179^\circ$ and $\varphi_2 = -179^\circ$, is given by a small $\Delta$ in the $(x, y)$ space, since the corresponding vectors lie close on the unit circle. Hence, the problem of periodicity is circumvented. Furthermore, the vector representation of the angles allows us to unambiguously calculate mean values and other quantities. For example, to evaluate the mean of the angles $\varphi_n$, one simply calculates the sum of the corresponding vector components and then determines the mean angle by [31]
$$\tan\langle\varphi\rangle = \frac{\langle y \rangle}{\langle x \rangle} = \frac{\sum_n \sin\varphi_n}{\sum_n \cos\varphi_n}. \tag{2}$$
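The mean-angle rule of Eq. (2) and the Euclidean distance on the unit circle can be illustrated with a minimal sketch. It assumes NumPy; the function names `circular_mean_deg` and `chord_distance` are illustrative and do not come from the paper. The quadrant-aware `arctan2` is used in place of a plain arctangent, because the ratio $\langle y \rangle / \langle x \rangle$ alone does not fix the quadrant.

```python
import numpy as np

def circular_mean_deg(angles_deg):
    """Mean angle in degrees via the unit-circle representation (Eqs. 1 and 2)."""
    phi = np.deg2rad(np.asarray(angles_deg, dtype=float))
    x_mean = np.cos(phi).mean()
    y_mean = np.sin(phi).mean()
    # arctan2 resolves the quadrant that tan<phi> = <y>/<x> leaves open
    return np.rad2deg(np.arctan2(y_mean, x_mean))

def chord_distance(phi1_deg, phi2_deg):
    """Euclidean distance between two angles mapped onto the unit circle."""
    p1, p2 = np.deg2rad(phi1_deg), np.deg2rad(phi2_deg)
    return np.hypot(np.cos(p1) - np.cos(p2), np.sin(p1) - np.sin(p2))

naive_mean = np.mean([10.0, 350.0])            # arithmetic mean of the raw numbers
circ_mean = circular_mean_deg([10.0, 350.0])   # mean direction on the circle
d_close = chord_distance(179.0, -179.0)        # angles that are 2 degrees apart
d_far = chord_distance(0.0, 180.0)             # opposite points on the circle
```

For the example of the text, the arithmetic mean of 10 and 350 is 180, whereas the circular mean is 0° up to rounding. The pair 179° and −179° yields a small chord length, as expected for neighboring points on the circle.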
Although the vector representation of angles in Eq. (1) appears straightforward and intuitively appealing, it has the peculiar property of doubling the variables: Given $N$ angle coordinates $\varphi_n$, we obtain $2N$ Cartesian-type coordinates $(x_n, y_n)$. In the example given in Eq. (2), this does not lead to any problems, because at the end of the calculation we are able to calculate back from the averaged vector coordinates to the original angle coordinate, that is, the correctly averaged angle. Since Eq. (1) represents a nonlinear transformation, however, we will see that obtaining the peptide's angles in a direct way after a dPCA treatment of the data is not possible in general (see below). In this case, a subsequent analysis needs to be performed.
Since we intend to employ these coordinates for the description of peptide energy landscapes, the question arises whether the resulting representation preserves the characteristics of the original energy landscapes. In particular, it is of interest whether the number and structure of minima and transition states are preserved in the $2N$-dimensional $(x_n, y_n)$ space. To answer these questions and to illustrate the properties of transformation (1), we consider a simple one-dimensional example described by the angular probability density [see Fig. 1(a)],
$$\rho(\varphi) = \frac{1}{2\pi}\,(1 - \cos 4\varphi), \tag{3}$$
with $\varphi \in [-180^\circ, 180^\circ)$. By construction, the density exhibits four maxima at $\varphi = \pm 45^\circ, \pm 135^\circ$. Employing transformation (1), we obtain the corresponding probability density on a circle of unit radius,
$$\rho(x, y) = \frac{8x^2(1 - x^2)}{\pi}\,\delta(x^2 + y^2 - 1). \tag{4}$$
The density plot of $\rho(x, y)$ displayed in Fig. 1(b) demonstrates that transformation (1) simply wraps the angular density $\rho(\varphi)$ around the circumference of the unit circle. Hence, all features of $\rho(\varphi)$ are faithfully represented by $\rho(x, y)$, particularly the number and the structure of extrema. This is a consequence of the fact that transformation (1) is a bijection, which uniquely assigns each angle $\varphi$ a corresponding vector $(x, y)$ and vice versa.
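The one-to-one character of transformation (1) can be checked numerically. The following sketch assumes NumPy and a grid spacing of 0.5°, which is an arbitrary choice. It maps a grid of angles onto the unit circle, recovers them with `arctan2`, and counts the local maxima of the density of Eq. (3). It also shows that a single coordinate does not suffice to recover the angle.

```python
import numpy as np

phi = np.deg2rad(np.arange(-180.0, 180.0, 0.5))   # grid on [-180, 180) degrees
x, y = np.cos(phi), np.sin(phi)                    # transformation (1)

# Inverse map: the pair (x, y) determines the angle uniquely.
phi_back = np.arctan2(y, x)
# arctan2 returns values in (-pi, pi]; fold +pi onto -pi to match the grid.
phi_back = np.where(np.isclose(phi_back, np.pi), -np.pi, phi_back)
round_trip_ok = bool(np.allclose(phi_back, phi))

# A single coordinate is not invertible: +45 and -45 degrees share the same x.
same_x = bool(np.isclose(np.cos(np.deg2rad(45.0)), np.cos(np.deg2rad(-45.0))))

# Count the local maxima of the density of Eq. (3) on the periodic grid.
rho = (1.0 - np.cos(4.0 * phi)) / (2.0 * np.pi)
is_max = (rho > np.roll(rho, 1)) & (rho > np.roll(rho, -1))
n_maxima = int(is_max.sum())
```

The count of maxima agrees with the four maxima at ±45° and ±135° stated in the text.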
We observe that this desirable feature is not obtained if
we transform to only a single Cartesian-type variable, x or y.
The corresponding densities,
$$\rho(x) = \frac{8x^2\sqrt{1 - x^2}}{\pi}, \tag{5}$$
$$\rho(y) = \frac{8y^2\sqrt{1 - y^2}}{\pi}, \tag{6}$$
are also shown in Fig. 1(b). As a consequence of the projection onto the $x$ or $y$ axis, each density exhibits only two instead of four maxima.
FIG. 1. (A) Angular density $\rho(\varphi) = (1/2\pi)(1 - \cos 4\varphi)$. (B) Representation of $\rho(\varphi)$ through its probability density $\rho(x, y)$ on the unit circle (artificial width added for a better visualization). Also shown are the densities $\rho(x)$ and $\rho(y)$, which display the angular densities along the single Cartesian-type variables $x$ and $y$, respectively. Note that only $\rho(x, y)$ reproduces the correct number of extrema of $\rho(\varphi)$.
The above described properties of the one-dimensional example readily generalize to the $N$-dimensional case, $\varphi_n \mapsto (x_n, y_n)$. In direct generalization of the unit circle, the data points $(x_n, y_n)$ are distributed on the surface of a $2N$-dimensional sphere with radius $\sqrt{N}$. This is because the distance of every data point $(x_1, y_1, \ldots, x_N, y_N)$ to the origin equals $(x_1^2 + y_1^2 + \cdots + x_N^2 + y_N^2)^{1/2} = (1 + \cdots + 1)^{1/2} = \sqrt{N}$. Since the transformation represents a bijection, there is a one-to-one correspondence between states in the $N$-dimensional angular space and in the $2N$-dimensional vector space. Again, the Euclidean metric of the $2N$-dimensional vector space guarantees that mean values and other quantities can be calculated easily.
We note in passing that, alternatively to transformation (1), one may employ a complex representation $z_n = e^{i\varphi_n}$ of the angles. As Euler's formula $e^{i\varphi} = \cos\varphi + i\sin\varphi$ provides a direct correspondence between the $2N$-dimensional real vectors $(x_1, y_1, \ldots, x_N, y_N)^T$ and the $N$-dimensional complex vectors $(z_1, \ldots, z_N)^T$, all considerations performed above can also be done using the complex representation. We will explore this idea in more detail in Sec. VI.
III. DIHEDRAL ANGLE PRINCIPAL COMPONENT ANALYSIS (dPCA)
Principal component analysis (PCA) is a well-established method in reducing the dimensionality of a high-dimensional data set [5]. In the case of molecular dynamics of $M$ atoms, the basic idea is that the correlated internal motions are represented by the covariance matrix,
$$\sigma_{ij} = \langle (q_i - \langle q_i \rangle)(q_j - \langle q_j \rangle) \rangle, \tag{7}$$
where $q_1, \ldots, q_{3M}$ are the mass-weighted Cartesian coordinates of the molecule and $\langle \ldots \rangle$ denotes the average over all sampled conformations [6–9]. By diagonalizing the covariance matrix we obtain $3M$ eigenvectors $v^{(i)}$ and eigenvalues $\lambda_i$, which are rank ordered descendingly, i.e., $\lambda_1$ represents the largest eigenvalue. The eigenvectors and eigenvalues of $\sigma$ yield the modes of collective motion and their amplitudes, respectively. The principal components,
$$V_i = v^{(i)} \cdot q, \tag{8}$$
of the data $q = (q_1, \ldots, q_{3M})^T$ can be used, for example, to represent the free energy surface of the system. Restricting ourselves to two dimensions, we obtain
$$\Delta G(V_1, V_2) = -k_B T\,[\ln\rho(V_1, V_2) - \ln\rho_{\max}], \tag{9}$$
where $\rho$ is an estimate of the probability density function obtained from a histogram of the data. $\rho_{\max}$ denotes the maximum of the density, which is subtracted to ensure that $\Delta G = 0$ for the lowest free energy minimum.
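The PCA workflow of Eqs. (7)–(9) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes NumPy, takes the input array `q` as already mass-weighted, uses synthetic Gaussian data as a stand-in for a trajectory, and works in units where $k_B T = 1$. The function names and the number of histogram bins are arbitrary choices.

```python
import numpy as np

def pca(q):
    """PCA of data q with shape (n_frames, n_coords), following Eqs. (7) and (8)."""
    dq = q - q.mean(axis=0)
    sigma = dq.T @ dq / len(q)            # covariance matrix, Eq. (7)
    lam, v = np.linalg.eigh(sigma)        # eigh returns eigenvalues in ascending order
    order = np.argsort(lam)[::-1]         # rank-order descendingly
    lam, v = lam[order], v[:, order]
    V = q @ v                             # principal components V_i = v^(i) . q, Eq. (8)
    return lam, v, V

def free_energy_2d(V1, V2, kT=1.0, bins=50):
    """Delta G(V1, V2) = -kT [ln rho - ln rho_max] from a 2D histogram, Eq. (9)."""
    rho, xedges, yedges = np.histogram2d(V1, V2, bins=bins, density=True)
    with np.errstate(divide="ignore"):
        dG = -kT * (np.log(rho) - np.log(rho.max()))   # empty bins give +inf
    return dG, xedges, yedges

# Synthetic stand-in for trajectory data (assumption: this is not MD output).
rng = np.random.default_rng(0)
q = rng.normal(size=(5000, 3)) * np.array([3.0, 1.0, 0.3])
lam, v, V = pca(q)
dG, _, _ = free_energy_2d(V[:, 0], V[:, 1])
```

By construction, the covariance matrix of the principal components is diagonal with the eigenvalues on the diagonal, and the most populated bin defines the zero of $\Delta G$.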
The basic idea of the dPCA proposed in Ref. 17 is to perform the PCA on sin- and cos-transformed dihedral angles,
$$q_{2n-1} = \cos\varphi_n, \qquad q_{2n} = \sin\varphi_n, \tag{10}$$
where $n = 1, \ldots, N$ and $N$ is the total number of peptide backbone and side-chain dihedral angles used in the analysis. Hence the covariance matrix [Eq. (7)] of the dPCA uses $2N$ variables $q_n$. The question then is whether the combination of the nonlinear transformation [Eq. (10)] and the subsequent PCA still gives a unique and faithful representation of the initial angular data $\varphi_n$.
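Transformation (10) amounts to interleaving cosines and sines of the angles. The sketch below assumes NumPy and uses uniformly distributed random angles as hypothetical input; the function name `dihedrals_to_sincos` is illustrative. It also checks that every transformed frame lies at distance $\sqrt{N}$ from the origin.

```python
import numpy as np

def dihedrals_to_sincos(phi):
    """Map angles (radians, shape (n_frames, N)) to q of shape (n_frames, 2N), Eq. (10)."""
    phi = np.asarray(phi, dtype=float)
    q = np.empty((phi.shape[0], 2 * phi.shape[1]))
    q[:, 0::2] = np.cos(phi)    # q_{2n-1} = cos(phi_n), stored in columns 0, 2, ...
    q[:, 1::2] = np.sin(phi)    # q_{2n}   = sin(phi_n), stored in columns 1, 3, ...
    return q

rng = np.random.default_rng(1)
phi = rng.uniform(-np.pi, np.pi, size=(1000, 4))   # hypothetical data with N = 4 angles
q = dihedrals_to_sincos(phi)
radii = np.linalg.norm(q, axis=1)                  # distance of each frame from the origin
```

With $N = 4$, all entries of `radii` equal 2 up to rounding, and each angle can be recovered from its pair of columns with `np.arctan2`.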
Let us first consider the above discussed example of a one-dimensional angular density $\rho(\varphi) = (1/2\pi)(1 - \cos 4\varphi)$, which is mapped via transformation (10) on the two-dimensional density on the unit circle $\rho(x, y) = [8x^2(1 - x^2)/\pi]\,\delta(x^2 + y^2 - 1)$, where $x = q_1 = \cos\varphi$ and $y = q_2 = \sin\varphi$. Since in this case $\langle x \rangle = \langle y \rangle = \langle xy \rangle = 0$ and $\langle x^2 \rangle = \langle y^2 \rangle = \tfrac{1}{2}$, we find that the covariance matrix is diagonal with $\sigma_{11} = \sigma_{22} = \tfrac{1}{2}$. That is, we have degenerate eigenvalues $\lambda_{1/2} = \tfrac{1}{2}$ and may choose any two orthonormal vectors as eigenvectors. Choosing, e.g., the unit vectors $e_x$ and $e_y$, the PCA leaves the density $\rho(x, y)$ invariant, which, as discussed above, is a unique and faithful representation of the initial angular density $\rho(\varphi)$. In general, one does not obtain a diagonal covariance matrix for a one-dimensional angular density $\rho(\varphi)$ [e.g., for $\rho(\varphi) = 1/(2\pi) + \tfrac{1}{9}\cos\varphi + \tfrac{1}{9}\sin\varphi$ we obtain $\sigma_{12} = -\pi^2/81 \neq 0$]. A sufficient condition for a diagonal covariance matrix for an $N$-dimensional angular density is that the latter factorizes in one-dimensional densities [i.e., $\rho(\varphi_1, \ldots, \varphi_N) = \rho(\varphi_1)\rho(\varphi_2) \cdots \rho(\varphi_N)$] and that $\langle\cos\varphi_n\rangle = 0$ or $\langle\sin\varphi_n\rangle = 0$ for all $n = 1, \ldots, N$. In these trivial cases, the dPCA method simply reduces to transformation (10).
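The two covariance matrices quoted above can be verified by numerical quadrature. This sketch assumes NumPy and a uniform grid of 20000 points, which is an arbitrary choice; the helper name `sincos_covariance` is illustrative. For trigonometric polynomials such as these densities, a uniform periodic grid reproduces the integrals to rounding accuracy.

```python
import numpy as np

def sincos_covariance(rho, n_grid=20000):
    """Covariance matrix of (x, y) = (cos phi, sin phi) for a density rho(phi) on [-pi, pi)."""
    phi = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    w = rho(phi)
    w = w / w.sum()                       # quadrature weights; also normalises rho
    q = np.stack([np.cos(phi), np.sin(phi)])
    mean = q @ w
    dq = q - mean[:, None]
    return (dq * w) @ dq.T

# Density of Eq. (3): the text states sigma = diag(1/2, 1/2).
sigma_a = sincos_covariance(lambda p: (1.0 - np.cos(4.0 * p)) / (2.0 * np.pi))

# Counterexample from the text: sigma_12 = -pi^2 / 81.
sigma_b = sincos_covariance(lambda p: 1.0 / (2.0 * np.pi) + (np.cos(p) + np.sin(p)) / 9.0)
```

In the second case $\langle xy \rangle$ vanishes while $\langle x \rangle = \langle y \rangle = \pi/9$, so the off-diagonal element is $-\langle x \rangle\langle y \rangle = -\pi^2/81$.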
FIG. 2. (Color) (A) Ramachandran $(\phi, \psi)$ probability distribution of Ala$_3$ in water as obtained from a 100 ns MD simulation. Performing a dPCA, the resulting free energy landscape along the first two principal components is shown in (B); the $(\phi, \psi)$ distributions pertaining to the labeled energy minima are shown in (C). Panels (D) and (E) show the corresponding results obtained for a Cartesian PCA. Panel (F) displays the $(\theta_1, \theta_2)$ distribution obtained from the complex dPCA.
IV. A SIMPLE EXAMPLE
The simplest nontrivial case of a dPCA occurs for a two-dimensional correlated angular density. As an example, we adopt trialanine whose conformation can be characterized by a single pair of $(\phi, \psi)$ backbone dihedral angles. Trialanine (Ala$_3$) in aqueous solution is a model peptide which has been the subject of numerous experimental [32–35] and computational [36–38] studies. To generate the angular distribution of $(\phi, \psi)$ of trialanine, we performed a 100 ns MD simulation at 300 K. We used the GROMACS program suite [39,40], the GROMOS96 force field 43a1 [41], the simple point charge (SPC) water model [42], and a particle-mesh Ewald [43] treatment of the electrostatics. Details of the simulation can be found in Ref. 37. Figure 2(a) shows the $(\phi, \psi)$ distribution obtained from the simulation, which predicts that mainly three conformational states are populated: the right-handed helix conformation $\alpha_R$ (15%), the extended conformation $\beta$ (39%), and the poly-L-proline II ($P_{II}$) helixlike conformation (42%). Although recent experimental data [35] indicate that the simulation overestimates the populations of $\alpha_R$ and $\beta$, we nevertheless adopt the MD data as a simple yet nontrivial example to illustrate the performance of the dPCA method.
Performing the dPCA on the $(\phi, \psi)$ data, we consider the four variables $q_1 = \cos\phi$, $q_2 = \sin\phi$, $q_3 = \cos\psi$, and $q_4 = \sin\psi$. Diagonalization of the resulting covariance matrix yields four principal components $V_1, \ldots, V_4$, which contribute 51%, 24%, 15%, and 10% to the overall fluctuations of the system, respectively. To characterize the principal components, Fig. 3 shows their one-dimensional probability densities. Only the first two distributions are found to exhibit multiple peaks, while the other two are approximately unimodal. Hence we may expect that the conformational states shown by the angular distribution of $(\phi, \psi)$ in Fig. 2(a) can be accounted for by the first two principal components.
If we assume that $V_1$ and $V_2$ are independent [i.e., $\rho(V_1, V_2) = \rho(V_1)\rho(V_2)$], the three peaks found for $\rho(V_1)$ as well as for $\rho(V_2)$ give rise to $3 \times 3 = 9$ peaks of $\rho(V_1, V_2)$. To identify possible correlations, Fig. 2(b) shows the two-dimensional density along the first two principal components. For the sake of better visibility, we have chosen a logarithmic representation, thus showing the free energy landscape [Eq. (9)] of the system. The figure exhibits three (instead of nine) well-defined minima labeled S1, S2, and S3, revealing that the first two principal components are indeed strongly dependent. To identify the corresponding three conformational states, we have back-calculated the $(\phi, \psi)$ distributions of the minima from the trajectory [44]. As shown in Fig. 2(c) as well as by Table I, the minima S1, S2, and S3 clearly correspond to $P_{II}$, $\beta$, and $\alpha_R$, respectively. A closer analysis reveals that fine details of the conformational distribution can also be discriminated by the first two principal components. For example, the shoulder on the left side of the $\alpha_R$ state in Fig. 2(a) corresponds to the region around $V_2 \approx -0.9$ of the S3 minimum. Moreover, the minor (3%) population of the left-handed helix conformation $\alpha_L$ at $\phi \approx 60^\circ$ corresponds to the small orange region (outside of the square) of the S1 minimum.
It is instructive to compare the above results obtained by the dPCA to the outcome of a standard PCA using Cartesian coordinates. Restricting the analysis to the atoms CONH–CHCH$_3$–CONH around the central $(\phi, \psi)$ dihedral angles of trialanine, the first four principal components contribute 47%, 28%, 15%, and 8% to the overall fluctuations, respectively, and exhibit one-dimensional probability densities that closely resemble the ones obtained by the dPCA (data not shown). Figure 2(d) shows the resulting free energy surface along the first two principal components, which looks quite similar to the dPCA result. The three minima S1′, S2′, and S3′ are identified in Fig. 2(e) as the conformational states $P_{II}$, $\beta$, and $\alpha_R$. Again, the details of the conformational distribution such as the $\alpha_L$ state are also resolved by the first two principal components.
In summary, it has been shown that both the Cartesian PCA and the dPCA reproduced the correct conformational distribution of the MD trajectory of trialanine. In both cases, the first two principal components were sufficient to resolve most details. Although only four coordinates were used, the dPCA was found to be equivalent to the Cartesian PCA using 33 coordinates.
FIG. 3. (Color online) Probability densities of the four principal components obtained from the sin/cos (full lines) and the complex (dashed lines) dPCA of trialanine, respectively.
TABLE I. Conformational states $P_{II}$, $\beta$, and $\alpha_R$ of trialanine in water, characterized by their population probability $P$ and the average dihedral angles $(\phi, \psi)$. The results from the dPCA and the Cartesian PCA are compared to reference data obtained directly from the MD simulation.
State        MD data                          dPCA                             Cartesian PCA
             P (%)   $(\phi,\psi)$ (deg)      P (%)   $(\phi,\psi)$ (deg)      P (%)   $(\phi,\psi)$ (deg)
$P_{II}$     42      −67, 132                 45      −63, 131                 47      −64, 132
$\beta$      39      −121, 131                40      −121, 131                38      −122, 130
$\alpha_R$   15      −75, −45                 16      −74, −46                 16      −75, −46
V. INTERPRETATION OF EIGENVECTORS
In the simple example above, Fig. 2 demonstrates that the first two principal components $V_1$ and $V_2$ (or, equivalently, the first two eigenvectors $v^{(1)}$ and $v^{(2)}$) are associated with motions along the $\psi$ and the $\phi$ dihedral angles, respectively. In the case of the Cartesian PCA, the structural changes of the molecule along the principal components are readily illustrated, even for high-dimensional systems. From
$$V_i = v^{(i)} \cdot q = v_1^{(i)} q_1 + v_2^{(i)} q_2 + v_3^{(i)} q_3 + \ldots + v_{3M-2}^{(i)} q_{3M-2} + v_{3M-1}^{(i)} q_{3M-1} + v_{3M}^{(i)} q_{3M},$$
we see that, e.g., the first three components $v_1^{(i)}$, $v_2^{(i)}$, and $v_3^{(i)}$ of the eigenvector $v^{(i)}$ simply reflect the influence of the $x$, $y$, and $z$ coordinates of the first atom on the $i$th principal component. Hence,
$$\Delta_1^{(i)} = (v_1^{(i)})^2 + (v_2^{(i)})^2 + (v_3^{(i)})^2 \tag{11}$$
is a suitable measure of this influence. The quantities $\Delta_2^{(i)}, \ldots, \Delta_M^{(i)}$ are defined analogously.
In the dPCA, the principal components are given by
$$V_k = v^{(k)} \cdot q = v_1^{(k)} \cos\varphi_1 + v_2^{(k)} \sin\varphi_1 + \ldots + v_{2N-1}^{(k)} \cos\varphi_N + v_{2N}^{(k)} \sin\varphi_N. \tag{12}$$
In direct analogy to Eq. (11), we may define
$$\Delta_1^{(k)} = (v_1^{(k)})^2 + (v_2^{(k)})^2 \tag{13}$$
as a measure of the influence of angle $\varphi_1$ on the principal component $V_k$ (and similarly $\Delta_2^{(k)}, \ldots, \Delta_N^{(k)}$ for the other angles). The definition implies that $\sum_n \Delta_n^{(k)} = 1$, since the length of each eigenvector is 1. Hence $\Delta_n^{(k)}$ can be considered as the percentage of the effect of the angle $\varphi_n$ on the principal component $V_k$. Furthermore, Eq. (12) assures that only structural rearrangements along angles with nonzero $\Delta_n^{(k)}$ may change the value of $V_k$.
FIG. 4. (Color online) Influence of the dihedral angles $\phi$ (black bars) and $\psi$ (gray bars) on the principal component $V_k$ ($k = 1, \ldots, 4$) of the cos/sin dPCA of trialanine. Shown are the quantities $\Delta_1^{(k)}$ (for $\phi$) and $\Delta_2^{(k)}$ (for $\psi$) defined in Eq. (13), representing the percentage of the effect of the two dihedral angles on $V_k$. Also shown are the contributions (in %) of each principal component to the overall fluctuations of the system.
To demonstrate the usefulness of definition (13), we again invoke our example of trialanine with angles $\phi$ ($n = 1$) and $\psi$ ($n = 2$) and consider the quantities $\Delta_n^{(k)}$ describing the effect of these angles on the four principal components ($k = 1, \ldots, 4$), see Fig. 4. We clearly see that the dihedral angle $\phi$ has almost no influence on $V_1$ ($\Delta_1^{(1)} \approx 0$), whereas $\psi$ has a very large one ($\Delta_2^{(1)} \approx 1$). As a consequence, the first principal component allows us to separate conformations with a different angle $\psi$ but does not separate conformations which differ in $\phi$. Indeed, Fig. 2(b) reveals that $V_1$ accounts essentially for the $\alpha \leftrightarrow \beta/P_{II}$ transition along $\psi$, but hardly separates conformations with different $\phi$, such as $\beta$ and $P_{II}$. Considering the second principal component $V_2$, we obtain $\Delta_1^{(2)} \approx 1$ and $\Delta_2^{(2)} \approx 0$. This is again in agreement with Fig. 2(b), which shows that the second principal component accounts essentially for transitions along $\phi$. Recalling that $V_1$, $V_2$, $V_3$, and $V_4$ contribute 51%, 24%, 15%, and 10% to the overall fluctuations, respectively, the $\beta \leftrightarrow P_{II}$ transitions described by the second principal component represent a much smaller conformational change than the $\alpha \leftrightarrow \beta/P_{II}$ transitions described by $V_1$. Similarly, although the $\Delta_n^{(k)}$ of the third and fourth principal components are quite similar to the previous ones, they only account for fluctuations within a conformational state and are therefore of minor importance in a conformational analysis.
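The influence measure of Eq. (13) can be evaluated for all angles and all principal components at once. The sketch below assumes NumPy and uses two correlated synthetic angles drawn from von Mises distributions; this is hypothetical data, not the trialanine trajectory, and the function name `angle_influence` is illustrative.

```python
import numpy as np

def angle_influence(v):
    """Delta_n^(k) = (v_{2n-1}^(k))^2 + (v_{2n}^(k))^2 for eigenvectors stored as columns of v."""
    v = np.asarray(v, dtype=float)
    two_n, n_modes = v.shape
    # Group the squared components pairwise: rows (2n-1, 2n) belong to angle n.
    return (v ** 2).reshape(two_n // 2, 2, n_modes).sum(axis=1)

# Hypothetical data: two correlated angles (not the trialanine trajectory).
rng = np.random.default_rng(2)
phi1 = rng.vonmises(0.0, 2.0, size=4000)
phi2 = phi1 + rng.vonmises(0.0, 4.0, size=4000)
q = np.column_stack([np.cos(phi1), np.sin(phi1), np.cos(phi2), np.sin(phi2)])

lam, v = np.linalg.eigh(np.cov(q.T, bias=True))
lam, v = lam[::-1], v[:, ::-1]       # descending eigenvalue order
delta = angle_influence(v)           # delta[n, k]: influence of angle n on component k
```

Each column of `delta` sums to 1 because the eigenvectors are normalised, which is the property that lets $\Delta_n^{(k)}$ be read as a percentage.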
VI. COMPLEX DPCA
Alternatively to the sin/cos transformation in Eq. (10) which maps $N$ angles on $2N$ real numbers, one may also transform from the angles $\varphi_n$ to the complex numbers
$$z_n = e^{i\varphi_n} \quad (n = 1, \ldots, N), \tag{14}$$
which give an $N$-dimensional complex vector $z = (z_1, z_2, \ldots, z_N)^T$. In what follows, we develop a dPCA based on this complex data ("complex dPCA") and discuss its relation to the real-valued dPCA ("sin/cos dPCA") considered above.
The covariance matrix pertaining to the complex variables $z_n$ is defined as
$$C_{mn} = \langle (z_m - \langle z_m \rangle)(z_n^* - \langle z_n^* \rangle) \rangle, \tag{15}$$
with $m, n = 1, \ldots, N$, and $z^*$ being the complex conjugate of $z$. Being in principle an observable quantity, $C$ is a Hermitian matrix with $N$ real-valued eigenvalues $\mu_n$ and $N$ complex eigenvectors $w^{(n)}$,
$$C w^{(n)} = \mu_n w^{(n)}, \tag{16}$$
where the eigenvectors are unique up to a phase $\theta_0$. We define the complex principal components to be
$$W_n = w^{(n)T} z = r_n e^{i(\theta_n + \theta_0)}, \tag{17}$$
where we use vector-vector multiplication instead of a Hermitian inner product (see Appendix for details). Two nice features of the complex dPCA are readily evident. First, the complex representation of $N$ angular variables directly results in $N$ eigenvalues and eigenvectors; that is, there is no doubling of variables as in the sin/cos dPCA. Second, the representation of the complex principal components by their weights $r_n$ and angles $\theta_n$ in Eq. (17) may facilitate their direct interpretation in terms of simple physical variables.
From Euler's formula $e^{i\varphi} = \cos\varphi + i\sin\varphi$, one would expect an evident correspondence between the sin/cos and the complex dPCA. That is, there should be a relation between the $N$ complex eigenvectors $w^{(n)}$ and the $2N$ real eigenvectors $v^{(k)}$. Furthermore, the $N$ real eigenvalues $\mu_n$ of the complex dPCA should be related to the $2N$ real eigenvalues $\lambda_k$ of the sin/cos dPCA. However, this general correspondence turned out to be less obvious than expected (see Appendix), and we were only able to find an analytical relation in some limiting cases. In these cases, one indeed may construct suitably normalized eigenvectors $w^{(n)}$ such that the real and imaginary parts of the resulting principal components $W_n$ of the complex dPCA are equal to the $2N$ principal components $V_k$ of the sin/cos dPCA. In other words, for every $n \in \{1, \ldots, N\}$ there are two indices $k_n, k_n' \in \{1, \ldots, 2N\}$ such that
$$\mathrm{Re}\, W_n = V_{k_n}, \qquad \mathrm{Im}\, W_n = V_{k_n'}, \tag{18}$$
and the union of the indices $k_n, k_n'$ gives the complete set $\{1, \ldots, 2N\}$. Moreover, the eigenvalues $\mu_n$ of the complex dPCA are given by the sum of the two corresponding eigenvalues $\lambda_{k_n}$ and $\lambda_{k_n'}$ of the sin/cos dPCA,
$$\mu_n = \lambda_{k_n} + \lambda_{k_n'}. \tag{19}$$
Apart from the limiting cases of completely uncorrelated and completely correlated variables, we could not establish general conditions under which Eqs. (18) and (19) hold. Empirically, Eq. (19) was always satisfied, while Eq. (18) was found to hold in many (but not all) cases under consideration, see Figs. 3 and 7 below. We note that even in numerical studies it may be cumbersome to establish the correspondences, since the accuracy of Eqs. (18) and (19) depends on the number of data points one uses to calculate the covariance matrices in both methods, i.e., on the overall sampling of the MD trajectory.
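A complex dPCA along the lines of Eqs. (14)–(17) can be sketched as follows. This assumes NumPy and synthetic von Mises angles, which is hypothetical data; the function names are illustrative. The sketch does not attempt to establish the pairing of Eq. (19). It checks only the weaker identity $\sum_n \mu_n = \sum_k \lambda_k$, which holds exactly because $|z_n - \langle z_n \rangle|^2 = (x_n - \langle x_n \rangle)^2 + (y_n - \langle y_n \rangle)^2$, so both covariance matrices have the same trace.

```python
import numpy as np

def complex_dpca(phi):
    """Eigen-decomposition of the Hermitian covariance of z_n = exp(i phi_n)."""
    z = np.exp(1j * np.asarray(phi, dtype=float))   # Eq. (14), shape (n_frames, N)
    dz = z - z.mean(axis=0)
    C = dz.T @ dz.conj() / len(z)                   # C_mn = <dz_m dz_n^*>, Eq. (15)
    mu, w = np.linalg.eigh(C)                       # real eigenvalues, complex eigenvectors
    mu, w = mu[::-1], w[:, ::-1]                    # descending order
    W = z @ w                                       # W_n = w^(n)T z, no conjugation, Eq. (17)
    return mu, w, W

def sincos_eigenvalues(phi):
    """Eigenvalues of the real sin/cos covariance matrix, in descending order."""
    phi = np.asarray(phi, dtype=float)
    q = np.concatenate([np.cos(phi), np.sin(phi)], axis=1)  # column order does not affect eigenvalues
    return np.sort(np.linalg.eigvalsh(np.cov(q.T, bias=True)))[::-1]

# Hypothetical data: three angles, two of them correlated.
rng = np.random.default_rng(3)
a = rng.vonmises(0.0, 1.0, size=20000)
b = rng.vonmises(1.0, 3.0, size=20000)
c = rng.vonmises(-2.0, 0.5, size=20000)
phi = np.column_stack([a, a + b, c])

mu, w, W = complex_dpca(phi)
lam = sincos_eigenvalues(phi)
r, theta = np.abs(W), np.angle(W)    # weights r_n and angles theta_n (+ theta_0) of Eq. (17)
```

Comparing `mu` with pairwise sums of `lam` for a given data set is then a direct way to probe Eq. (19) empirically, subject to the sampling caveat noted in the text.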
To demonstrate the performance of the complex dPCA, we first apply it to the above discussed example of trialanine. Comparing the $2N = 4$ eigenvalues of the sin/cos dPCA $\lambda_1, \ldots, \lambda_4$ to the two eigenvalues $\mu_1$ and $\mu_2$ of the complex dPCA, we obtain
$$\mu_1 = 0.630 = 0.489 + 0.141 = \lambda_1 + \lambda_3,$$
$$\mu_2 = 0.338 = 0.237 + 0.101 = \lambda_2 + \lambda_4,$$
that is, Eq. (19) is fulfilled. Choosing suitable normalization constants $\theta_0$ for the complex eigenvectors, we furthermore find the correspondence
$$\mathrm{Re}\, W_1 \approx V_1, \qquad \mathrm{Re}\, W_2 \approx V_2,$$
$$\mathrm{Im}\, W_1 \approx V_3, \qquad \mathrm{Im}\, W_2 \approx V_4.$$
As shown by the probability densities of the principal components in Fig. 3, both formulations lead to virtually identical principal components.
Finally, it is interesting to study whether the representation of the complex principal components by their weights rn and angles θn in Eq. (17) facilitates their interpretation. In the case of our trialanine data, it turns out that the weights are approximately constant, i.e., r1 ≈ r2 ≈ 1. Hence, the probability distribution of the two angles (θ1, θ2) contains all the conformational fluctuations of the data. Indeed, Fig. 2 reveals that ρ(θ1, θ2) is almost identical to the original (ϕ, ψ) density from the MD simulation. In this simple case, the complex dPCA has obviously managed to completely identify the underlying structure of the data.
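The polar representation of the complex principal components can be sketched in a few lines of Python/NumPy. The code below is an illustration under stated assumptions and not the original implementation: the function name `complex_dpca` is hypothetical, the projection uses the uncentered variables z_n = exp(iφ_n), and the synthetic input consists of two independent von Mises distributed angles. Since the eigenvectors returned by `numpy.linalg.eigh` carry an arbitrary phase, the angles θn obtained this way are defined only up to a constant offset per component.

```python
import numpy as np

def complex_dpca(phi):
    """Complex dPCA of angles phi (shape (n_frames, N), radians).

    Returns eigenvalues in descending order and the complex principal
    components W_n = w^(n)T z, formed WITHOUT complex conjugation of w.
    """
    z = np.exp(1j * phi)
    C = np.cov(z, rowvar=False, bias=True)  # Hermitian N x N covariance
    mu, w = np.linalg.eigh(C)
    order = np.argsort(mu)[::-1]
    mu, w = mu[order], w[:, order]
    W = z @ w                               # column n holds W_n for every frame
    return mu, W

# Synthetic data: two independent angles (illustrative parameters).
rng = np.random.default_rng(0)
phi = np.column_stack([
    rng.vonmises(0.0, 2.0, size=50_000),
    rng.vonmises(1.0, 0.5, size=50_000),
])
mu, W = complex_dpca(phi)

r = np.abs(W)        # weights r_n
theta = np.angle(W)  # angles theta_n in (-pi, pi]
print("mean weights:", r.mean(axis=0))
```

For nearly uncorrelated angles each eigenvector is localized on a single angle, so the weights stay close to one and the conformational information resides in the angles θn, in line with the trialanine result quoted above.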
VII. ENERGY LANDSCAPE OF DECAALANINE
We finally wish to present an example which demonstrates the potential of the dPCA method to represent the true multidimensional energy landscape of a folding biomolecule. Following earlier work on the folding of alanine peptides,17,28,35 we choose decaalanine (Ala10) in aqueous solution. Employing similar conditions as in the case of trialanine described above (GROMOS96 force field 43a1,41 SPC water model,42 and particle-mesh Ewald43 treatment of the electrostatics), we ran a 300 ns trajectory of Ala10 at 300 K and saved the coordinates every 0.4 ps for analysis.

FIG. 5. (Color) Free energy landscapes of Ala10 in water as obtained from a 300 ns MD simulation. The first column, (A) and (B), shows the results along the first four principal components obtained from a Cartesian PCA; the second column, (C) and (D), shows the corresponding landscapes calculated from the sin/cos dPCA. Panels (E)–(H) display the landscapes along the angles (θ1, θ2) and (θ3, θ4) and the weights (r1, r2) and (r3, r4) of the complex dPCA, respectively.
Let us first consider the free energy landscape ΔG [Eq. (9)] obtained from a PCA using all Cartesian coordinates of the system. The calculations of ΔG(V1, V2) and ΔG(V3, V4) presented in Figs. 5(a) and 5(b) show that the resulting energy landscape is rather unstructured and essentially single peaked, indicating a single folded state and a random ensemble of unfolded conformational states. However, as discussed in detail in Ref. 17 for the case of Ala5, this smooth appearance of the energy landscape in the Cartesian PCA merely represents an artifact of the mixing of internal and overall motion. This becomes clear when a sin/cos dPCA of the N = 18 inner backbone dihedral angles {φn} = {ψ1, ϕ2, ψ2, ..., ϕ9, ψ9, ϕ10} is performed. The resulting dPCA free energy surfaces ΔG(V1, V2) and ΔG(V3, V4) shown in Figs. 5(c) and 5(d) exhibit numerous well-separated minima, which correspond to specific conformational structures. By back-calculating from the dPCA free energy minima to the underlying backbone dihedral angles of all residues,44 we are able to discriminate and characterize 15 such states.45 The most populated ones are the all-αR helical conformation (8%), a state (15%) with the inner seven residues in αR (and the remaining residues in β/PII), and two states (8% each) with six inner residues in αR. Well-defined conformational states are also found in the unfolded part of the free energy landscape, revealing that the unfolded state of decaalanine is structured rather than random.
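A free energy surface along two principal components can be estimated from a two-dimensional histogram of the trajectory. The Python/NumPy sketch below assumes that Eq. (9) has the usual form ΔG = −k_B T [ln P − ln P_max], which is not restated in this section. The function name `free_energy_2d`, the bin number, and the synthetic two-state data are hypothetical choices for illustration.

```python
import numpy as np

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def free_energy_2d(v1, v2, temperature=300.0, bins=50):
    """Free energy (kJ/mol) from the joint probability density of (v1, v2).

    Assumed form: dG = -k_B T [ln P - ln P_max], so the global minimum is 0.
    Bins that were never visited receive dG = +inf.
    """
    hist, xedges, yedges = np.histogram2d(v1, v2, bins=bins, density=True)
    with np.errstate(divide="ignore"):  # log(0) for empty bins
        dG = -K_B * temperature * np.log(hist / hist.max())
    return dG, xedges, yedges

# Synthetic stand-in for two principal components with two populated states.
rng = np.random.default_rng(1)
v1 = np.concatenate([rng.normal(-1.0, 0.3, 6000), rng.normal(1.0, 0.3, 4000)])
v2 = rng.normal(0.0, 0.5, 10000)

dG, xedges, yedges = free_energy_2d(v1, v2)
visited = np.isfinite(dG)
print("number of visited bins:", visited.sum())
print("largest finite free energy (kJ/mol):", dG[visited].max())
```

The resolution of such a landscape depends on the bin width and on the sampling, so minima separated by less than a bin width cannot be distinguished in this simple estimate.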
To obtain an interpretation of the kth principal component in terms of the dihedral angles φn, Fig. 6 shows the quantities Δ_n^(k) defined in Eq. (13), which describe the effect of these angles on the first two principal components. The first principal component V1 is clearly dominated by motion along the ψ angles (gray bars), while fluctuations of the ϕ angles (black bars) hardly contribute. Hence, going along V1 we will find conformations which mainly differ in ψ angles. Considering the second principal component V2, we find a dominant Δ_n^(2) for the angle ψ3 (and a smaller value for ψ9), revealing that V2 mainly separates conformations that differ in ψ3. Similarly, the Δ_n^(k) obtained for the next few principal components are dominated by the contribution of a single ψ angle. For example, we find that Δ_n^(3), Δ_n^(4), Δ_n^(5), and Δ_n^(6) depend mostly on the angles ψ2, ψ9, ψ4 (and ψ8), and ψ5, respectively (data not shown). Together with the percentage of the fluctuations (18%, 10%, 8%, 7%, 6%, and 5% for V1, ..., V6), the quantities Δ_n^(k) therefore give a quick and valuable interpretation of the conformational changes along the principal components Vk.

FIG. 6. (Color online) Influence of the 18 inner backbone dihedral angles {φn} = {ψ1, ϕ2, ψ2, ..., ϕ9, ψ9, ϕ10} on the first two principal components V1 and V2 of the sin/cos dPCA of Ala10. Shown are the quantities Δ_n^(1) (for V1) and Δ_n^(2) (for V2) defined in Eq. (13), representing the percentage of the effect of the dihedral angles on Vk. The black and gray bars correspond to the ϕ and ψ angles, respectively. Also shown are the contributions (in %) of each principal component to the overall fluctuations of the system.

FIG. 7. (Color online) Probability densities of the first six principal components obtained from the sin/cos (full lines) and the complex (dashed lines) dPCA of Ala10, respectively.
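The influence of each angle on a principal component, together with the percentage of the fluctuations, can be computed directly from the eigen-decomposition of the sin/cos covariance matrix. The Python/NumPy sketch below assumes that Eq. (13) has the form Δ_n^(k) = (v_{2n−1}^(k))² + (v_{2n}^(k))², i.e., the squared cosine and sine components of the normalized eigenvector; this form is not restated in this section. The function name `angle_contributions` and the three synthetic angles are hypothetical.

```python
import numpy as np

def angle_contributions(phi):
    """Contributions of each angle to each sin/cos principal component.

    phi: array of shape (n_frames, N) in radians.
    Returns delta[n, k] (assumed form of Eq. (13)) and the percentage of
    the total fluctuations carried by each component, sorted descending.
    """
    n_frames, n_angles = phi.shape
    q = np.empty((n_frames, 2 * n_angles))
    q[:, 0::2] = np.cos(phi)
    q[:, 1::2] = np.sin(phi)
    sigma = np.cov(q, rowvar=False, bias=True)
    lam, v = np.linalg.eigh(sigma)
    order = np.argsort(lam)[::-1]            # largest eigenvalue first
    lam, v = lam[order], v[:, order]
    delta = v[0::2, :] ** 2 + v[1::2, :] ** 2  # shape (N, 2N)
    percent = 100.0 * lam / lam.sum()
    return delta, percent

# Synthetic data: two narrowly distributed angles and one two-state angle.
rng = np.random.default_rng(2)
n = 20_000
phi = np.column_stack([
    rng.vonmises(-1.0, 20.0, size=n),
    rng.vonmises(2.0, 20.0, size=n),
    np.where(rng.random(n) < 0.5, -1.5, 1.5) + 0.1 * rng.normal(size=n),
])
delta, percent = angle_contributions(phi)
print("contributions to V1:", delta[:, 0])
print("percentage of fluctuations:", percent)
```

Because the eigenvectors are normalized, the contributions of all angles to a given component sum to one, so each column of `delta` can be read as a set of fractions. In the synthetic example the first component is carried almost entirely by the two-state angle.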
It is interesting to compare the above results to the outcome of a complex dPCA of the Ala10 trajectory. To check the similarity of the complex and the sin/cos dPCA in this case, Fig. 7 compares the distributions of the sin/cos principal components Vk to the distributions of the corresponding principal components, Re Wn and Im Wn, obtained from the complex dPCA using suitably normalized eigenvectors. Although we find good overall agreement, the correspondence [Eq. (18)] is not perfect in all cases (see Appendix). Finally, we wish to investigate whether the polar representation [Eq. (17)] of the complex principal components facilitates the interpretation of the energy landscape of Ala10. To this end, Figs. 5(e)–5(h) show the free energy surfaces (E) ΔG(θ1, θ2), (F) ΔG(θ3, θ4), (G) ΔG(r1, r2), and (H) ΔG(r3, r4). Similar to that found for Ala3, the energy landscape is only weakly structured along the weights rn (mainly along r1), thus leaving the main information on the conformational states to the angles θn (mainly θ2, θ3, and θ4). A closer analysis reveals, e.g., that θ2 separates conformational states with a different dihedral angle ψ3, while θ3 separates conformations with a different dihedral angle ψ2. Unlike the simpler case of trialanine, however, where the (θ1, θ2) representation of the complex dPCA was found to directly reproduce the original (ϕ, ψ) distribution, the polar principal components of Ala10 appear to be equivalent to the results of the standard sin/cos dPCA. Roughly speaking, in both formulations we need about the same number of principal components to identify the same number of conformational states.
VIII. CONCLUSIONS
We have studied the theoretical foundations of the dPCA in order to clarify the validity and the applicability of the approach. In particular, we have shown that dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can be characterized by the corresponding conformational changes of the peptide. Furthermore, we have investigated a complex version of the dPCA which sheds some light on the mysterious doubling of variables occurring in the sin/cos dPCA. One learns that N angular variables can actually be represented by N complex variables, which then naturally lead to N eigenvalues and eigenvectors. Despite its similarity to the sin/cos dPCA, the complex dPCA might be advantageous because the representation of the complex principal components by their weights and angles may facilitate their direct interpretation in terms of simple physical variables.
To demonstrate the potential of the dPCA, we have applied it to the construction of the energy landscape of Ala10 from a 300 ns MD simulation. The resulting free energy surface exhibits numerous well-separated minima corresponding to specific conformational states, revealing that the unfolded state of decaalanine is structured rather than random. The smooth appearance of the energy landscape obtained from a PCA using Cartesian coordinates was found to be caused by an artifact of the mixing of internal and overall motion. Hence the correct separation of internal and overall motion is essential for the construction and interpretation of the energy landscape of a biomolecule undergoing large structural rearrangements. Internal coordinates such as dihedral angles fulfill this requirement in a natural way.
Recently, several nonlinear approaches have been proposed25–28 which may account for nonlinear correlations not detected by a standard PCA. For example, it has been discussed in Ref. 26 that completely correlated motion such as two atoms oscillating in parallel direction but with a 90° phase shift is not monitored by a linear PCA, since ⟨sin(ωt) sin(ωt + π/2)⟩ = 0. This geometrical artifact caused by the relative orientation of the atomic fluctuations was found to lead to a considerable (≈40%) underestimation of the correlation of protein motion.26 Because of the use of dihedral angles and the inherent nonlinear transformation, the dPCA represents a nonlinear PCA with respect to Cartesian atomic coordinates and is therefore able to identify this type of fluctuations.
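The vanishing linear correlation of two oscillations with a 90° phase shift is easily reproduced numerically. The following Python/NumPy lines are a minimal sketch with arbitrarily chosen frequency and sampling. They show only that the Pearson coefficient is zero although the two signals are deterministically related, and they do not reproduce the analysis of Ref. 26.

```python
import numpy as np

omega = 2.0 * np.pi                                 # one period per time unit
t = np.linspace(0.0, 10.0, 10_000, endpoint=False)  # ten full periods
x = np.sin(omega * t)
y = np.sin(omega * t + np.pi / 2)                   # 90 degree phase shift

pearson = np.corrcoef(x, y)[0, 1]   # linear (Pearson) correlation coefficient
radius_sq = x ** 2 + y ** 2         # deterministic nonlinear relation

print("Pearson correlation:", pearson)
print("x^2 + y^2 ranges from", radius_sq.min(), "to", radius_sq.max())
```

The covariance, which is the only quantity entering a linear PCA, is blind to the constraint x² + y² = 1 that ties the two coordinates together.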
Furthermore, various methods have been suggested which allow for an identification of metastable conformational states.12,21–24 By calculating the transition matrix that connects these states, one may then model the conformational dynamics of the system via a master-equation description. While the dPCA also allows us to calculate metastable conformational states and their transition matrix,17 it moreover provides a way to represent the free energy landscape as well as all observables of the system in terms of well-defined collective coordinates.46 This way the dPCA free energy surface can be used to perform (equilibrium or nonequilibrium) Langevin simulations of the molecular dynamics47,48 as well as a simulation using a nonlinear dynamic model.28 As all quantities of interest can be converged to the desired accuracy by including more principal components, the approach avoids problems associated with the use of empirical order parameters (such as the number of native contacts) or low-dimensional reaction coordinates (such as the radius of gyration), which may lead to artifacts and an oversimplification of the free energy landscape.49
ACKNOWLEDGMENTS
The authors thank Yuguang Mu and Alessandra Villa for
numerous inspiring and helpful discussions. This work has
been supported by the Frankfurt Center for Scientific Computing, the Fonds der Chemischen Industrie, and the Deutsche Forschungsgemeinschaft.
APPENDIX: RELATION BETWEEN SIN/COS AND COMPLEX dPCA
The purpose of the appendix is to discuss the relations of the principal components [Eq. (18)] and the eigenvalues [Eq. (19)] between the sin/cos and the complex dPCA, respectively. To this end, we first establish a correspondence between the covariance matrices of the two formulations. Using Euler's formula, we express the matrix elements of the covariance matrix [Eq. (15)] as
$$
\begin{aligned}
C_{mn} &= \langle (e^{i\varphi_m} - \langle e^{i\varphi_m} \rangle)(e^{-i\varphi_n} - \langle e^{-i\varphi_n} \rangle) \rangle \\
&= \operatorname{cov}(\cos\varphi_m, \cos\varphi_n) + \operatorname{cov}(\sin\varphi_m, \sin\varphi_n) \\
&\quad - i \operatorname{cov}(\cos\varphi_m, \sin\varphi_n) + i \operatorname{cov}(\sin\varphi_m, \cos\varphi_n),
\end{aligned} \tag{A1}
$$
where cov(a, b) = ⟨ab⟩ − ⟨a⟩⟨b⟩. Without loss of generality (since the generalization is straightforward), we restrict ourselves in the following to the case of two angles (N = 2). Using Eq. (A1) and the definition of σ [Eq. (7)] together with Eq. (10), it is easy to see that one can transform the sin/cos covariance matrix σ into the complex covariance matrix C according to
$$T \sigma T^{\dagger} = C, \tag{A2}$$
where
$$T = \begin{pmatrix} 1 & -i & 0 & 0 \\ 0 & 0 & 1 & -i \end{pmatrix}. \tag{A3}$$
Let us next derive Eqs. (18) and (19) for the limiting case of two uncorrelated angle variables. The resulting covariance matrix of the sin/cos dPCA exhibits a block-diagonal structure with 2 × 2 blocks A and B. Assuming that (x1, x2)^T is an eigenvector of A with eigenvalue λ1, then, due to orthogonality, (−x2, x1)^T is an eigenvector of A, too. Let its eigenvalue be λ2. Analogously, let (x3, x4)^T and (−x4, x3)^T be the eigenvectors of B with eigenvalues λ3 and λ4. It follows that
$$
\begin{aligned}
v^{(1)} &= (x_1, x_2, 0, 0)^T, & v^{(2)} &= (-x_2, x_1, 0, 0)^T, \\
v^{(3)} &= (0, 0, x_3, x_4)^T, & v^{(4)} &= (0, 0, -x_4, x_3)^T
\end{aligned} \tag{A4}
$$
are eigenvectors of σ with eigenvalues λ1, ..., λ4. Using Eq. (A2), it is now straightforward to verify that the eigenvectors w(n) of the complex dPCA can be defined as follows:
$$
\begin{aligned}
C w^{(1)} &:= C (x_1 - i x_2, 0)^T = (\lambda_1 + \lambda_2) w^{(1)} =: \mu_1 w^{(1)}, \\
C w^{(2)} &:= C (0, x_3 - i x_4)^T = (\lambda_3 + \lambda_4) w^{(2)} =: \mu_2 w^{(2)},
\end{aligned} \tag{A5}
$$
which reveals the simple relation [Eq. (19)] between the eigenvalues λk of the sin/cos dPCA and the eigenvalues μn of the complex dPCA. By comparing the principal components Wn = w(n)^T z (n = 1, 2) and Vk = v(k) · q (k = 1, ..., 4), we finally obtain the equality [Eq. (18)] of the principal components of the two formulations,
$$\operatorname{Re} W_1 = V_1, \quad \operatorname{Im} W_1 = V_2, \quad \operatorname{Re} W_2 = V_3, \quad \operatorname{Im} W_2 = V_4. \tag{A6}$$
We note that the above definition of the principal components Wn is not equivalent to the projection w(n) · z given by a Hermitian inner product. However, the appealingly simple relation [Eq. (18)] between the principal components of the two dPCA methods only holds when the Wn are defined that way.
While a 2 × 2 block-diagonal structure of the sin/cos covariance matrix σ represents a sufficient condition, it is certainly not a necessary requirement to yield relations (18) and (19). In the case of trialanine, where the latter equations were satisfied to high accuracy (see Fig. 3), the covariance matrix σ was indeed approximately block diagonal. On the other hand, our second example Ala10 also satisfied the equalities quite well (see Fig. 7), although σ revealed only little block-diagonal structure. Finally, we found cases where the correspondence holds for covariance matrices that are not block-diagonal at all. For example, it can be shown that two completely correlated angle variables (say, φ1 and φ2 = φ1 + const) result in dPCA covariance matrices that satisfy Eqs. (18) and (19).
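For the uncorrelated case, the step from Eq. (A5) to Eq. (A6) can be written out explicitly. The following lines are a sketch that assumes the orderings q = (cos φ1, sin φ1, cos φ2, sin φ2)^T and z = (e^{iφ1}, e^{iφ2})^T, which are not restated in this appendix. Since C is diagonal for uncorrelated angles, one has μ1 = C11 = tr A = λ1 + λ2, and

```latex
\begin{aligned}
W_1 = w^{(1)T} z
  &= (x_1 - i x_2)(\cos\varphi_1 + i \sin\varphi_1) \\
  &= (x_1 \cos\varphi_1 + x_2 \sin\varphi_1)
     + i\,(-x_2 \cos\varphi_1 + x_1 \sin\varphi_1) \\
  &= v^{(1)} \cdot q + i\, v^{(2)} \cdot q = V_1 + i V_2 .
\end{aligned}
```

With a Hermitian inner product one would instead obtain (x1 + i x2)(cos φ1 + i sin φ1) = (x1 cos φ1 − x2 sin φ1) + i (x2 cos φ1 + x1 sin φ1), whose real part differs from V1 in general. This illustrates why the Wn are defined without complex conjugation of the eigenvector.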
1 W. F. van Gunsteren, D. Bakowies, R. Baron et al., Angew. Chem., Int. Ed. 45, 4064 (2007).
2 J. N. Onuchic, Z. L. Schulten, and P. G. Wolynes, Annu. Rev. Phys. Chem. 48, 545 (1997).
3 K. A. Dill and H. S. Chan, Nat. Struct. Biol. 4, 10 (1997).
4 D. J. Wales, Energy Landscapes (Cambridge University Press, Cambridge, 2003).
5 I. T. Jolliffe, Principal Component Analysis (Springer, New York, 2002).
6 T. Ichiye and M. Karplus, Proteins 11, 205 (1991).
7 A. E. Garcia, Phys. Rev. Lett. 68, 2696 (1992).
8 A. Amadei, A. B. M. Linssen, and H. J. C. Berendsen, Proteins 17, 412 (1993).
9 S. Hayward, A. Kitao, F. Hirata, and N. Go, J. Mol. Biol. 234, 1207 (1993).
10 O. M. Becker, Proteins 27, 213 (1997).
11 O. F. Lange and H. Grubmüller, J. Phys. Chem. B 110, 22842 (2006).
12 F. Noe, D. Krachtus, J. C. Smith, and S. Fischer, J. Chem. Theory Comput. 2, 840 (2006).
13 R. Abseher and M. Nilges, J. Mol. Biol. 279, 911 (1998).
14 D. M. D. van Aalten, B. L. de Groot, J. B. C. Finday, H. J. C. Berendsen, and A. Amadei, J. Comput. Chem. 18, 169 (1997).
15 N. Elmaci and R. S. Berry, J. Chem. Phys. 110, 10606 (1999).
16 T. H. Reijmers, R. Wehrens, and L. M. C. Buydens, Chemom. Intell. Lab. Syst. 56, 61 (2001).
17 Y. Mu, P. H. Nguyen, and G. Stock, Proteins 58, 45 (2005).
18 G. E. Sims, I.-G. Choi, and S.-H. Kim, Proc. Natl. Acad. Sci. U.S.A. 102, 618 (2005).
19 J. Wang and R. Brüschweiler, J. Chem. Theory Comput. 2, 18 (2006).
20 B. Alakent, P. Doruker, and M. C. Camurdan, J. Chem. Phys. 121, 4756 (2004).
21 V. Schultheis, T. Hirschberger, H. Carstens, and P. Tavan, J. Chem. Theory Comput. 1, 515 (2005).
22 A. Ma and A. R. Dinner, J. Phys. Chem. B 109, 6769 (2005).
23 E. Meerbach, E. Dittmer, I. Horenko, and C. Schütte, Lect. Notes Phys. 703, 475 (2006).
24 J. D. Chodera, W. C. Swope, J. W. Pitera, and K. A. Dill, Multiscale Model. Simul. 5, 1214 (2006).
25 P. Das, M. Moll, H. Stamati, L. E. Kavraki, and C. Clementi, Proc. Natl. Acad. Sci. U.S.A. 103, 9885 (2006).
26 O. F. Lange and H. Grubmüller, Proteins 62, 1052 (2006).
27 P. H. Nguyen, Proteins 65, 898 (2006).
28 R. Hegger, A. Altis, P. H. Nguyen, and G. Stock, Phys. Rev. Lett. 98, 028102 (2007).
29 K. Hinsen, Proteins 64, 795 (2006).
30 Y. Mu, P. H. Nguyen, and G. Stock, Proteins 64, 798 (2006).
31 N. I. Fisher, Statistical Analysis of Circular Data (Cambridge University Press, Cambridge, 1996).
32 S. Woutersen and P. Hamm, J. Phys. Chem. B 104, 11316 (2000).
33 S. Woutersen, R. Pfister, P. Hamm, Y. Mu, D. Kosov, and G. Stock, J. Chem. Phys. 117, 6833 (2002).
34 R. Schweitzer-Stenner, F. Eker, Q. Huang, and K. Griebenow, J. Am. Chem. Soc. 123, 9628 (2001).
35 J. Graf, P. H. Nguyen, G. Stock, and H. Schwalbe, J. Am. Chem. Soc. 129, 1179 (2007).
36 Y. Mu and G. Stock, J. Phys. Chem. B 106, 5294 (2002).
37 Y. Mu, D. S. Kosov, and G. Stock, J. Phys. Chem. B 107, 5064 (2003).
38 S. Gnanakaran and A. E. Garcia, J. Phys. Chem. B 107, 12555 (2003).
39 H. J. C. Berendsen, D. van der Spoel, and R. van Drunen, Comput. Phys. Commun. 91, 43 (1995).
40 D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark, and H. J. C. Berendsen, J. Comput. Chem. 26, 1701 (2005).
41 W. F. van Gunsteren, S. R. Billeter, A. A. Eising, P. H. Hünenberger, P. Krüger, A. E. Mark, W. R. P. Scott, and I. G. Tironi, Biomolecular Simulation: The GROMOS96 Manual and User Guide (Vdf Hochschulverlag AG an der ETH Zürich, Zürich, 1996).
42 H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, and J. Hermans, in Intermolecular Forces, edited by B. Pullman (Reidel, Dordrecht, 1981), pp. 331–342.
43 T. Darden, D. York, and L. Petersen, J. Chem. Phys. 98, 10089 (1993).
44 A direct back-calculation of the dihedral angles is not possible. But since the time indices of the original trajectory and the principal components are identical, we can use these indices to identify corresponding dihedral angles.
45 Details of the identification of the metastable conformational states and their transition matrices are given in Ref. 17.
46 As the complete analysis is performed in the space of dihedral angle principal components, there is no need to invoke the Jacobian transformation between these coordinates and the atomic Cartesian coordinates (Ref. 50).
47 O. F. Lange and H. Grubmüller, J. Chem. Phys. 124, 214903 (2006).
48 S. Yang, J. N. Onuchic, and H. Levine, J. Chem. Phys. 125, 054910 (2006).
49 S. V. Krivov and M. Karplus, Proc. Natl. Acad. Sci. U.S.A. 101, 14766 (2004).
50 S. He and H. A. Scheraga, J. Chem. Phys. 108, 271 (1998).