PROTEINS: Structure, Function, and Genetics 37:44–55 (1999)
Strain in Protein Structures as Viewed Through Nonrotameric Side Chains: II. Effects Upon Ligand Binding
Jaap Heringa1,2* and Patrick Argos2
1Division of Mathematical Biology, National Institute for Medical Research, London, United Kingdom
2European Molecular Biology Laboratory, Heidelberg, Germany
ABSTRACT
The relation between the spatial positioning of nonrotameric residues and ligands was studied in 112 tertiary structures of protein-ligand complexes with a crystallographic resolution of ≤ 1.8 Å. Nonrotameric side chains and especially clusters of interacting nonrotameric residues were found to be associated preferentially with ligand- and substrate-binding sites. Asp, Glu, His, Met, and Asn are favored nonrotameric residue types positioned in the first 9-Å shell around ligands. Comparison of 20 complexes with associated apo structures suggests that ligand binding induces nonrotamericity and, hence, strain within protein-ligand complexes. The internal energy gain is not neutralized by increased hydrogen bonding or salt-bridge formation involving side chains that become nonrotameric in the complexed structure. It is suggested that the increased internal energy might aid in the formation and ejection of enzymatic products, thereby enhancing activity. These results could prove useful in protein engineering experiments aimed at altering enzymatic activity. Proteins 1999;37:44–55. © 1999 Wiley-Liss, Inc.
Key words: rotamers; protein structure; protein folding; ligand binding; product formation; catalysis
INTRODUCTION
In protein tertiary structures, amino acid side-chain conformations can be described by their torsion angles, the so-called χ angles. A rotamer is a k-tuple of χ angles {χ1, χ2, χ3, ...} defined by bonded atom positions in moving out along an individual side chain, with k = 1 to N, where N has a maximum of 4, depending on the residue type. A preferred dihedral conformation for a residue type is termed a rotamer, and the number of rotamers delineated for each of the amino acid types can vary from a minimum of 3 for Cys, Asp, Ser, and Thr up to 9 for Leu.1 Rotamers have been characterized as 1) possessing configurations representing a local minimum in potential energy;2–5 2) dense clusters in the χ-angle space, as observed from tertiary structures in the Protein Data Bank;1,6 and 3) being associated with a local, most favorable conformational entropy.7,8 For the amino acids Gly and Ala, no side-chain dihedral angles are defined due to the lack of sufficient side-chain atoms. Furthermore, Pro lacks the rotational freedom to attain multiple rotamers, because a part of its side chain forms the backbone.
By using data from 19 well-refined crystallographic
structures, Ponder and Richards6 assembled a library of
rotamers for each amino acid type based on clustering
techniques. They observed that most side chains adopt one
of the rotameric states. Schrauber et al.1 used a much
larger protein structure set and observed that a significant
fraction (up to 30%) of particular side-chain types could
not be assigned to a rotameric state based on the criterion
that rotameric side chains can deviate in each of the χ1 and
χ2 angles by no more than 20° from the closest associated
rotamer. Moreover, they found that this high fraction of
nonrotameric residues cannot be attributed to structural
resolution, because the rotamericity (fraction of all side
chains in a protein structure found rotameric) does not
increase with crystallographic resolutions < 1.9 Å.
Schrauber et al. suggested that nonrotamericity is often a
result of functional or structural constraints, such that,
locally, a minimum energy configuration could be sacrificed for the function or stability of the protein as a whole.
However, they performed no statistical analysis to support
their claim. Herzberg and Moult9 investigated steric strain
for main-chain atoms only in ten protein tertiary structures and observed strained residues near functional sites.
In Heringa and Argos (this issue),10 we investigated the
spatial positioning of buried, nonrotameric side chains and
found that they prefer to be in self-interacting clusters and
in coil secondary structures. Although nonrotameric side
chains generally have significantly higher crystallographic temperature factors (which measure energetic
atomic vibrations) than those in rotameric states, side
chains in nonrotameric clusters tend to have lower crystallographic temperature factors than isolated nonrotameric
side chains. Nonrotameric clustering thus appears to
alleviate structural strain in the protein three-dimensional (3D) structure.
Here, we study the location of nonrotameric residues with regard to ligand- and substrate-binding sites and address the following questions: Is there a relation between the placement of nonrotameric side chains and ligands? In an analysis over a nonredundant set of tertiary structures, it will be shown that nonrotameric residues tend to be situated closer to protein ligands than expected. Can any features be derived from such side chains that might influence the mechanism of ligand binding? This second question was addressed over a set of known tertiary structures with a bound ligand as well as a sequentially identical apo form. The analysis suggested that residues tended to switch from a rotameric state to a nonrotameric state upon ligand binding, leading to an increase in protein internal energy. The induced nonrotameric states were not accompanied by increased hydrogen bonding or salt bridges, which could compensate for the elevated local strain.

Abbreviations: rot, rotameric; nonrot, nonrotameric; ASA, solvent accessible surface area.
*Correspondence to: Jaap Heringa, Division of Mathematical Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, United Kingdom. E-mail: [email protected]
Received 23 April 1999; Accepted 26 April 1999

TABLE I. Set of 112 Protein Structures With a Recorded Ligand in the Protein Data Base Depository†
121P   1ABA   1ABE   1ADL   1ADS   1AIZA  1AMP   1ARS   1AST   1BBHA
1BCX   1BDMA  1BP2   1BTL   1CBS   1CELA  1CHN   1CLL   1COT   1COY
1CPCA  1CPCB  1CPN   1CSH   1CYO   1DBS   1DHJA  1DMB   1DXTB  1EAS
1ECA   1EZM   1FKB   1FLP   1FNB   1FRD   1GCA   1GD1O  1GDI   1GESB
1GMPA  1GOF   1HBG   1HBIA  1HFC   1HML   1HYT   1ICM   1ISAA  1IVD
1LYE   1MBA   1MDC   1MFA   1MRJ   1MUA   1MYT   1NSCA  1OFV   1ONC
1PAL   1PDA   1PHA   1PHP   1PMY   1RAS   1SBP   1SCS   1SGC   1SGT
1SNC   1SRIB  1TAG   1TCA   1THG   1THM   1TML   1TON   1TPFA  1TYS
1XIS   1YCC   256BA  2AAE   2ACT   2AK3A  2ALP   2APR   2BBKL  2CCYA
2CMD   2CTB   2CY3   2CYP   2DRI   2FCR   2GMT   2HMQA  2HTS   2MCM
2MSBB  2OHXA  2PIA   2PKC   2POR   2TRXA  351C   3C2C   3CLA   3DFR
7PCY   8DFR
†The proteins are identified by their Protein Data Base four-letter code. A fifth character identifies the chain of the protein used in the analysis.
MATERIALS AND METHODS
Protein 3D Structures
A nonredundant set of 3D structures was compiled from the Protein Data Bank (PDB)11 with recorded ligands. By using the method of Heringa et al.,12 a maximally sized set of proteins was gathered with minimal sequence lengths of 80 residues (see below), resolutions ≤ 1.8 Å, and sequence identities ≤ 35% over all possible sequence pair alignments. Generally, smaller proteins did not have a solvent-inaccessible core containing more than a single nonrotameric residue. Therefore, they would not have been useful in our interaction analysis of buried side chains with minimal crystallographic error in the coordinates. Of the 166 X-ray structures obtained in this manner, a set of 112 ligand-bound proteins, complexed with one or more ligands, was extracted (Table I). The R-factors of these structures had a value ≤ 0.225.
To test the conformational changes of nonrotameric residues upon ligand binding, a set of structural pairs consisting of complexed and corresponding apo structures was compiled. Of the 166 nonredundant ≤ 1.8 Å resolution structures, 29 had one or more sequentially identical complexed or apo structures in the PDB with resolution not exceeding 2.5 Å, resulting in 42 complex/apo pairs of structures. Because structural resolution and the degree of rotamericity (the fraction of residues found rotameric) are related (see Heringa and Argos, this issue10), only complex/apo structural pairs were considered that showed a resolution difference ≤ 0.5 Å. This threshold was applied to limit the chances that resolution effects would be the dominant factor in any statistical results. For instance, Schrauber et al.1 as well as Heringa and Argos (this issue10) have demonstrated that resolution has an influence on the rotamericity in protein tertiary structures. If a complexed (apo) structure had more than one apo (complexed) counterpart, then the pair showing the smallest resolution difference was selected. In a single case in which the resolution difference was identical, the paired structure with the smallest crystallographic R-factor was selected. This scenario resulted in a small set of 20 structural complex/apo pairs (Table II).
Defining Nonrotameric Residues
The protocol used for defining nonrotameric residues was that described in the accompanying article,10 in which residue types Gly, Ala, and Pro were appropriately excluded (see above). Side chains were deemed nonrotameric if $|\chi_{r1} - \chi_1| > 20^\circ$ or $|\chi_{r2} - \chi_2| > 20^\circ$, where χ1 and χ2 are the side-chain dihedral angles of the nonrotamer, and χr1 and χr2, respectively, are the corresponding dihedral angles of the nearest associated rotamer for the particular residue type. Rotamers used were those from the library of Ponder and Richards6 combined with that of Schrauber et al.1 Both were constructed by using the clustering techniques in χ angle space described in the accompanying article.10
Side Chain Selection
Only buried core residues were selected, such that the solvent-accessible surface area (ASA) summed over all main- and side-chain atoms was ≤ 5 Å². This selection was made because side-chain dihedral angles display greater error when at or near the protein surface (see the accompanying article10). All ASA values were calculated by using the program DSSP.13 For structures that comprised more than one chain, as a further precaution, all other chains were removed before surface accessibilities were calculated, such that chain interface regions became exposed and, hence, were excluded from the analysis.
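The 20° criterion and the burial filter described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy rotamer library, the function names, the periodic wrap-around of angle differences, and the use of a summed deviation to pick the "nearest" rotamer are all assumptions made for illustration.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two dihedral angles (degrees).
    Treating angles as periodic is an assumption; the text does not say."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def nearest_rotamer(chi, library):
    """Library rotamer closest to chi, by summed per-angle deviation
    (an assumed metric; the paper defers to the accompanying article)."""
    return min(library,
               key=lambda rot: sum(angle_diff(c, r) for c, r in zip(chi, rot)))


def is_nonrotameric(chi, library, tol=20.0):
    """True if chi1 or chi2 deviates by more than tol from the nearest rotamer."""
    rot = nearest_rotamer(chi, library)
    return any(angle_diff(c, r) > tol for c, r in zip(chi[:2], rot[:2]))


def is_buried(asa, max_asa=5.0):
    """Burial filter: residue ASA (main plus side chain) of at most 5 square angstroms."""
    return asa <= max_asa


# Hypothetical (chi1, chi2) library for one residue type, for illustration only.
TOY_LIBRARY = [(-60.0, 180.0), (180.0, 60.0), (60.0, -60.0)]

rotameric_example = is_nonrotameric((-65.0, 170.0), TOY_LIBRARY)
strained_example = is_nonrotameric((-95.0, 180.0), TOY_LIBRARY)
wrapped_example = is_nonrotameric((-175.0, 60.0), TOY_LIBRARY)
```

In this toy case the second side chain is flagged because its χ1 lies 35° from the nearest library value, while the third is kept rotameric only because -175° and 180° are treated as 5° apart.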
Contacts Among Side Chains
The protocol followed for side-chain contact delineation is described in the accompanying article.10 Side-chain pairs were considered to make contact if two atoms, one from each side chain, were at a distance ≤ 5 Å. A cluster of nonrotameric residues was required to contain at least two side chains. Further cluster growth involved nonrotameric residues that contacted one or more cluster members.

TABLE II. Ligands Involved in 20 Pairs of Complexes and Apo Enzymes†
No. | Complex | Apo | RMSd (Å) | Species and name | Ligand(s) in complex only | Ligand(s) in the complex and apo structures
1 | 1adl | 1lib | 0.28 | Mouse lipid-binding protein | 3 × oxygen; arachidonic acid; propanoic acid | -
2 | 1bp2 | 2bpp | 0.57 | Bovine phospholipase A2 | 2 × 2-methyl-2,4-pentanediol | Ca2+
3 | 1brnl | 1brsa | 0.45 | Bacillus amyloliquefaciens barnase | RNA | -
4 | 1cbq | 1cbs | 0.45 | Human retinoic acid-binding protein | Retinoic acid | Phosphate
5 | 1dmb | 1omp | 0.47 | E. coli D-maltodextrin-binding protein | β-cyclodextrin | -
6 | 1gca | 1gcg | 0.32 | Salmonella typhimurium galactose-binding protein | Galactose | Ca2+
7 | 1geub | 1gesb | 0.23 | E. coli glutathione reductase | NAD | FAD
8 | 1gmpa | 1gmqa | 0.18 | Streptomyces aureofaciens ribonuclease | 2′-Guanylic acid | Sulfate
9 | 1gof | 1gog | 0.11 | Dactylium dendroides galactose oxidase | 2 × Acetate ion | Cu2+; Sodium+ counter ion
10 | 1hyt | 1lnfe | 0.07 | B. thermoproteolyticus thermolysin | L-benzylsuccinate | 4 × Ca2+; Zn2+; Dimethyl sulfoxide
11 | 1icm | 1ifc | 0.45 | Rat intestinal fatty acid binding protein | Myristate | -
12 | 1isca | 1isaa | 0.09 | E. coli superoxide dismutase | Azide | Fe2+
13 | 1ndc | 1npk | 0.45 | Dictyostelium discoideum nucleoside diphosphate kinase | Thymidine-5′-diphosphate; Mg2+ | -
14 | 1rar | 1ras | 0.19 | Bovine ribonuclease A | Acetylaminoethyl naphthylamine sulfonate | 3 × Cl
15 | 4xis | 1xis | 0.06 | S. rubiginosus xylose isomerase | Xylose | 2 × Mn2+
16 | 1emd | 2cmd | 0.11 | E. coli malate dehydrogenase | NAD | Citrate
17 | 2ctc | 2ctb | 0.15 | Bovine carboxypeptidase A | L-phenyl lactate | Zn2+
18 | 2gmt | 1gmca | 0.25 | Bovine γ-chymotrypsin | N-acetyl-L-phenylalanyl-α-chloroethylketone | -
19 | 2sim | 2sil | 0.14 | Salmonella typhimurium sialidase | 2,3-dehydro-2-deoxy-N-acetyl neuraminic acid | -
20 | 2wgca | 9wgaa | 0.20 | Wheat germ isolectin | N-acetyl-neuraminyl lactose | -
†No., the number of the holo/apo pair. Complex and Apo identify the Protein Data Bank four-letter codes for the structural pairs, with a fifth character denoting the chain identifier, and RMSd (Å) gives the root-mean-square deviations between the pairs obtained from structure Cα superpositioning by the method of Taylor and Orengo.20 The ligands designated under Ligand(s) in complex only were used to derive the statistics.
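The contact and cluster-growth rules under Contacts Among Side Chains amount to finding connected groups of at least two residues in a contact graph. The sketch below shows one way this could be done; the data layout (a dictionary of hypothetical residue labels mapped to side-chain atom coordinates) and the function names are assumptions, not the procedure of the accompanying article.

```python
import math
from itertools import combinations


def in_contact(atoms_a, atoms_b, cutoff=5.0):
    """Two side chains touch if any atom pair is within the cutoff (angstroms)."""
    return any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b)


def contact_clusters(side_chains, cutoff=5.0):
    """Group side chains into clusters of size >= 2 by following contacts.

    side_chains: dict mapping a residue label to a list of (x, y, z) tuples.
    Returns a list of sets of residue labels.
    """
    neighbours = {name: set() for name in side_chains}
    for a, b in combinations(side_chains, 2):
        if in_contact(side_chains[a], side_chains[b], cutoff):
            neighbours[a].add(b)
            neighbours[b].add(a)

    seen, clusters = set(), []
    for start in side_chains:
        if start in seen:
            continue
        # Depth-first walk over contacts: every residue touching a member joins.
        stack, members = [start], set()
        while stack:
            name = stack.pop()
            if name in members:
                continue
            members.add(name)
            stack.extend(neighbours[name] - members)
        seen |= members
        if len(members) >= 2:  # isolated residues do not form a cluster
            clusters.append(members)
    return clusters


# Hypothetical nonrotameric side chains, one atom each for brevity.
toy = {
    "A": [(0.0, 0.0, 0.0)],
    "B": [(4.0, 0.0, 0.0)],
    "C": [(8.0, 0.0, 0.0)],
    "D": [(30.0, 0.0, 0.0)],
}
toy_clusters = contact_clusters(toy)
```

Here A and C are 8 Å apart and do not touch directly, yet both join the same cluster through B, while D stays isolated and is dropped.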
General Side Chain/Ligand Interactions
For all bound ligands recorded for each of the 112 protein 3D structures listed in Table I, the distances to all buried rotameric and nonrotameric side chains were determined. Consecutive shells, each with a depth of 9 Å, were constructed around the ligands; this distance corresponds to two carbon atom diameters and an additional 1 Å to consider any hydrogen atoms and experimental error. The side chains were assigned to these shells based on their side chain/ligand distances. For example, a side chain at 13.2 Å from a particular ligand would fall in the second 9-Å shell for that ligand. For all shell layers under a particular shell depth and over all of the ligands within the 112 protein structures listed in Table I, the rotameric and nonrotameric residues were sampled, and the nonrotameric ratio was calculated as $\mathrm{Ratio}_{shell} = N^{nonrot}_{shell} / N^{rot}_{shell}$, where $N^{nonrot}_{shell}$ is the number of nonrotameric residues in the shell considered, and $N^{rot}_{shell}$ is the corresponding number of rotameric residues. The nonrotameric ratios observed in each of the shells ($\mathrm{Ratio}_{shell}$) were converted to preferences ($\mathrm{Pref}_{shell}$) through normalization with the nonrotameric ratio ($\mathrm{Ratio}_{prot} = N^{nonrot}_{prot} / N^{rot}_{prot}$) for the associated complete structure: $\mathrm{Pref}_{shell} = \mathrm{Ratio}_{shell} / \mathrm{Ratio}_{prot}$. A value > 1.0 denotes that nonrotameric residues are preferred in the considered shell, and a lower value indicates avoidance. In structures with more than one attached ligand, nonrotameric residues were assigned to their closest ligand.
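As a worked check of the shell preference defined above, the following Python sketch assigns a distance to a 9-Å shell and recomputes the pooled preferences from the summed residue counts reported in Table IV (1081/131, 2293/198, 633/52, and 94/8 rotameric/nonrotameric side chains, against totals of 4,103 and 389). The function names are illustrative, and placing a distance of exactly 9.0 Å in the second shell is an assumption, since the shell boundaries are not specified to that level.

```python
SHELL_DEPTH = 9.0  # angstroms


def shell_index(distance, depth=SHELL_DEPTH):
    """Zero-based shell number; boundary handling at exact multiples is assumed."""
    return int(distance // depth)


def preference(n_nonrot_shell, n_rot_shell, n_nonrot_total, n_rot_total):
    """Pref = (Nnonrot/Nrot in the shell) / (Nnonrot/Nrot overall)."""
    ratio_shell = n_nonrot_shell / n_rot_shell
    ratio_total = n_nonrot_total / n_rot_total
    return ratio_shell / ratio_total


# (rotameric, nonrotameric) counts per shell, as listed in Table IV.
SHELL_COUNTS = [(1081, 131), (2293, 198), (633, 52), (94, 8)]
N_ROT_TOTAL, N_NONROT_TOTAL = 4103, 389  # totals quoted in the Results text

pooled_prefs = [
    round(preference(n_nonrot, n_rot, N_NONROT_TOTAL, N_ROT_TOTAL), 2)
    for n_rot, n_nonrot in SHELL_COUNTS
]
example_shell = shell_index(13.2)  # the 13.2-angstrom example from the text
```

With these inputs the pooled preferences round to 1.28, 0.91, 0.87, and 0.90, in line with the Preference column of Table IV. The per-protein averages of Table III would require the individual protein counts, which are not given here.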
Distance Between Side Chains and Ligands
Consistent with the definition for contacts among side chains, the default distance taken between a particular buried side chain (rotameric or nonrotameric) and ligand was the smallest of all interatomic distances among all side-chain and ligand atoms. However, this definition might lead to a bias in the statistics gathered for the above ligand shells, in that the shells become "hairy," i.e., the shells might involve residues toward the outside of each shell that are orthogonal to the shell surface, whereas side chains inside the shells might be positioned more in parallel. To alleviate this bias, control statistics were gathered from side-chain/ligand distances defined for each side chain and associated ligand as the distance between the side-chain centroid and the closest ligand atom.
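The two distance definitions above (default closest-atom distance versus the centroid-based control) can be contrasted with a short Python sketch. Coordinates are plain (x, y, z) tuples and the function names are hypothetical; an unweighted mean of atom positions is assumed for the centroid.

```python
import math


def min_atom_distance(side_chain, ligand):
    """Default ("hairy") distance: closest side-chain atom to closest ligand atom."""
    return min(math.dist(a, l) for a in side_chain for l in ligand)


def centroid(atoms):
    """Unweighted mean position of a list of (x, y, z) tuples (assumed definition)."""
    n = len(atoms)
    return tuple(sum(atom[i] for atom in atoms) / n for i in range(3))


def centroid_distance(side_chain, ligand):
    """Control distance: side-chain centroid to the closest ligand atom."""
    c = centroid(side_chain)
    return min(math.dist(c, l) for l in ligand)


# A side chain pointing straight at a single ligand atom (toy coordinates).
side_chain = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
ligand = [(10.0, 0.0, 0.0)]

d_default = min_atom_distance(side_chain, ligand)
d_control = centroid_distance(side_chain, ligand)
```

For this side chain, which points radially toward the ligand, the default distance is 6 Å and the centroid distance is 8 Å. A side chain lying parallel to the shell surface would show a much smaller gap between the two measures, which is the orientation effect the control is meant to remove.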
TABLE III. Nonrotameric Preferences in 9-Å Shells Around Ligands Averaged per Protein†
Shell (Å) | Average preference | PrfNonrot ≤ 1 (%) | PrfNonrot > 1 (%) | PrfNonrot n.a.
0.0–9.0 | 1.51 ± 0.16 | 35 (39.3) | 54 (60.7) | 23
9.0–18.0 | 0.88 ± 0.09 | 55 (61.1) | 35 (38.9) | 22
18.0–27.0 | 0.83 ± 0.15 | 38 (51.4) | 36 (48.7) | 38
27.0–36.0 | 0.64 ± 0.36 | 35 (81.4) | 8 (18.6) | 69
†PrfNonrot ≤ 1 gives the number of proteins for which the preferences were smaller than or equal to 1.0, and PrfNonrot > 1 provides the number of proteins with shell preferences greater than 1.0. PrfNonrot n.a. gives the numbers of proteins for which no shell preference could be calculated due to the absence of rotameric residues.
Cluster-Ligand Interaction
To determine whether nonrotameric clusters show a preference to be in the vicinity of ligands, the distance ($D_{nonrot}$) between each cluster and its nearest ligand was taken as the smallest separation between any ligand atom and any side-chain atom of the cluster. We also determined such distances for all rotameric clusters with a number of residues identical to the number in the compared nonrotameric cluster and with the same clustering criterion. For all such rotameric control clusters, we calculated the average distance ($D_{rot}$) to the ligand. The preference (Prf) for a single nonrotameric cluster to be in the proximity of a ligand was defined as $\mathrm{Prf} = D_{rot} / D_{nonrot}$, such that values > 1.0 denote a preference for the nonrotameric cluster to be in the proximity of a ligand. The average preference value was determined for various sized groups (two to four residues) as well as over all nonrotameric clusters. As with the ligand shell statistics of individual nonrotameric side chains (see above), a default and a control distance criterion were used. The default criterion for cluster-ligand distance involved the smallest separation between any considered cluster and ligand atom. For a control distance between a side-chain cluster and associated ligand, the shortest distance between any cluster side-chain centroid and any ligand atom was taken.
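The cluster preference defined above is a simple ratio, sketched here in Python with made-up distances. The function names and the toy numbers are illustrative only. The last line shows how a preference is converted into a relative distance, using the overall default value of 1.73 listed in Table V.

```python
def cluster_preference(d_nonrot, control_distances):
    """Prf = D_rot / D_nonrot, with D_rot the mean ligand distance of the
    size-matched rotameric control clusters."""
    d_rot = sum(control_distances) / len(control_distances)
    return d_rot / d_nonrot


def relative_distance(prf):
    """Fraction of the control distance at which the nonrotameric cluster sits."""
    return 1.0 / prf


# Hypothetical distances in angstroms: one nonrotameric cluster and three controls.
toy_prf = cluster_preference(5.0, [8.0, 9.0, 10.0])

# Overall default-criterion preference from Table V.
overall_fraction = round(relative_distance(1.73), 2)
```

A preference of 1.73 corresponds to the nonrotameric clusters lying at about 0.58 of the mean control-cluster distance, the figure quoted in the Results.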
Comparing Nonrotameric States in Complexed and
Apo Structures
The 20 analyzed pairs of proteins with known complexed and corresponding apo structures comprise enzymes complexed with natural ligands or analogues as well as binding proteins (Table II). For each of the 20 pairs, the holo and apo enzymes have identical topologies. Similar to the general analysis, only residues with ASA ≤ 5 Å² were analyzed. Solvent accessibilities were calculated only in the absence of ligands for consistency and crystallographic error avoidance. The complex/apo pairs were scrutinized for different side-chain conformations (rotameric or nonrotameric) from the unbound state to the bound state with one or more ligands.
The side-chain rotameric behavior was characterized by three states: 1) side chains that are nonrotameric in the complexed structure and rotameric in the apo form (complex only); 2) side chains that are nonrotameric both in the complex and the apo enzyme (complex and apo); and 3) those that are rotameric in the complex and become nonrotameric in the apo enzyme (apo only). The net increase (or decrease) of nonrotameric side chains could then be determined for various conditions; for example, the count for state 1 minus that for state 3 would give the net nonrotameric movement upon ligand binding.
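The three states and the net nonrotameric change can be expressed as set operations over the buried residues flagged nonrotameric in each structure of a pair. The sketch below uses hypothetical residue labels and is only an illustration of the bookkeeping, under the reading that the net change is the complex-only count minus the apo-only count.

```python
def classify_states(nonrot_complex, nonrot_apo):
    """Split nonrotameric residues of a complex/apo pair into the three states.

    Both arguments are sets of residue identifiers that are nonrotameric
    in the complexed and in the apo structure, respectively.
    """
    complex_only = nonrot_complex - nonrot_apo  # state 1
    both = nonrot_complex & nonrot_apo          # state 2
    apo_only = nonrot_apo - nonrot_complex      # state 3
    return complex_only, both, apo_only


def net_nonrot_change(nonrot_complex, nonrot_apo):
    """State 1 count minus state 3 count; positive means more strain when bound."""
    complex_only, _, apo_only = classify_states(nonrot_complex, nonrot_apo)
    return len(complex_only) - len(apo_only)


# Hypothetical residue labels for one complex/apo pair.
bound = {"Asp45", "Glu72", "His101", "Met130"}
unbound = {"His101", "Phe88"}

states = classify_states(bound, unbound)
delta = net_nonrot_change(bound, unbound)
```

In this toy pair three residues are nonrotameric in the complex only, one in both forms, and one in the apo form only, giving a net change of +2.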
Hydrogen Bonds and Salt Bridges in Complexed
and Apo Structures
For each of the 20 pairs of structures, protein internal hydrogen bonds and salt bridges were calculated by using the molecular mechanics computer package ICM.14 Hydrogen bonds were declared whenever the distance between a hydrogen and a proton acceptor was ≤ 2.5 Å.15 Hydrogen bonds were sampled only between a proton donor or acceptor of a side chain considered and, respectively, acceptors or donors from any other side-chain or main-chain atoms. The criterion for declaring a salt bridge was a distance ≤ 3.0 Å between charge complementary atoms, a distance geared to sample primary salt bridges.16 H-bond and salt bridge formation could neutralize increased energy involved in residues changing from rotameric to nonrotameric. As a control, pairwise differences in the number of hydrogen bonds and salt bridges were calculated for side-chain atoms in each of the three classes of nonrotamericity described above (i.e., complex only, complex and apo, and apo only). Although the actual bond energies may vary with the local geometries, the difference-counting scheme is feasible in our pairwise comparison of buried and sequentially identical protein environments.
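The difference-counting scheme can be sketched with the two distance cutoffs quoted above. This is a bare distance count in Python, not a reproduction of what ICM computes: the input layout (lists of hydrogen and acceptor coordinates for one side chain in each structure) and all names are assumptions.

```python
import math

HBOND_MAX = 2.5        # hydrogen to acceptor, angstroms
SALT_BRIDGE_MAX = 3.0  # between charge-complementary atoms, angstroms


def count_within(group_a, group_b, cutoff):
    """Number of atom pairs (one from each group) no farther apart than cutoff."""
    return sum(1 for a in group_a for b in group_b if math.dist(a, b) <= cutoff)


def bond_difference(complex_groups, apo_groups, cutoff):
    """Pairwise difference (complex minus apo) in the number of close pairs.

    Each argument is a (group_a, group_b) tuple of coordinate lists for the
    same side chain in the complexed and the apo structure.
    """
    return count_within(*complex_groups, cutoff) - count_within(*apo_groups, cutoff)


# Toy geometry: one side-chain hydrogen and two candidate acceptors.
hydrogens = [(0.0, 0.0, 0.0)]
acceptors_complex = [(2.0, 0.0, 0.0), (0.0, 2.4, 0.0)]
acceptors_apo = [(2.0, 0.0, 0.0), (0.0, 3.0, 0.0)]

hbond_delta = bond_difference(
    (hydrogens, acceptors_complex), (hydrogens, acceptors_apo), HBOND_MAX
)
```

A positive difference for side chains in the complex-only class would indicate compensating hydrogen bonds; the same function applies to salt bridges with `SALT_BRIDGE_MAX` and charged-atom coordinates.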
RESULTS
Nonrotameric Preferences to Be Near Ligands
Tables III and IV show results for four ligand shells, each with a depth of 9 Å. The shell nearest the ligands has a clear preference (1.51) for nonrotameric residues, whereas the preferences for outer ligand shells are consistently < 1, indicating avoidance (Table III). Moreover, the percentage of proteins that prefer nonrotameric residues drops from 61% for the nearest to 19% for the most remote ligand shell. Table IV gives statistics for residue numbers summed over all proteins for each of the ligand shells: Relative to a total number of 4,103 rotameric and 389 nonrotameric side chains over all protein ligand shells, there is a clear, nonrotameric preference of 1.28 only for the shell closest to the ligand.

TABLE IV. Nonrotameric Preferences in 9-Å Shells Around Ligands for Summed Residues Over All Ligands†
Shell (Å) | Preference | No. of rotameric side chains | No. of nonrotameric side chains
0.0–9.0 | 1.28 ± 0.08 | 1081 | 131
9.0–18.0 | 0.91 ± 0.07 | 2293 | 198
18.0–27.0 | 0.87 ± 0.13 | 633 | 52
27.0–36.0 | 0.90 ± 0.34 | 94 | 8
†Estimates for standard deviations of preferences have been compiled as described in the accompanying article (Heringa and Argos10).
The control preferences derived by using side-chain
centroid distances (see Materials and Methods) and averaged per protein showed a similar pattern (compared to
that in Table III) with values of 1.54 ± 0.23, 0.97 ± 0.06,
0.96 ± 0.19, and 0.81 ± 1.00, respectively, for the four 9-Å
shells going outward from the ligand. Also, the control
preferences using side-chain centroids compiled from summing each shell over all ligand proteins showed preferences that were virtually identical to those for the default
distance criterion (Table IV), with values of 1.29 ± 0.09,
0.93 ± 0.07, 0.89 ± 0.12, and 0.99 ± 0.30 for the respective
shells going outward from the ligand.
It might be argued that the ligand shell definition would
lead to selection of side chains positioned more toward the
protein surface for the closest ligand shell than for those at
greater distances from the ligand, which might explain the
observed rotameric bias. On the other hand, it must be
stressed that all ligand shells can include side chains near
the surface as well as those buried more deeply in the
protein structure. As a control for this depth effect, we
grouped the side chains over all proteins in six accessibility classes (ASA = 0 Å², 1 Å², 2 Å², 3 Å², 4 Å², and 5 Å²) and
found similar rotamericity values of 92.17%, 91.95%,
91.00%, 90.61%, 90.10%, and 91.16%, respectively. Clearly,
the small differences between these rotamericities are not
sufficient to explain the observed nonrotameric preference
for the closest ligand shell.
Because the nonrotameric cluster statistics were sparse,
we could not calculate similar ligand shell preferences.
However, average distances of nonrotameric clusters from
ligands could be compared with those for rotameric control
clusters comprising identical numbers of side chains (see
Materials and Methods). Nonrotameric clusters are 1.7-fold closer to ligands than rotameric control clusters, i.e.,
on average, they are at only 0.58 of the control cluster
distance from the ligands (Table V). The control cluster-ligand proximity preferences based on side-chain centroid-ligand distances showed a similar trend, with a value of
1.6. We conclude that there is a clear overall tendency for
nonrotameric residues and those in clusters to be associated with protein ligands.
TABLE V. Preferences of Nonrotameric Clusters to be Close to Ligands
No. of residues in cluster | No. of clusters | Average hairy preference compared to mean control cluster distance (a) | Average nonhairy preference compared to mean control cluster distance (b)
2 | 37 | 1.80 ± 0.19 | 1.62 ± 0.15
3 | 10 | 1.43 ± 0.21 | 1.41 ± 0.17
4 | 6 | 1.79 ± 0.35 | 1.72 ± 0.34
Total | 53 | 1.73 ± 0.21 | 1.60 ± 0.17
(a) Cluster-ligand distance was the separation between the closest cluster and associated ligand atom.
(b) For cluster-ligand distance, the separation between the closest cluster side-chain centroid and associated ligand atom was taken.
Nonrotameric Residue Composition Around
Ligands
We sampled compositional data over 17 residue types for
the two 9-Å shells closest to the ligands, as shown in Table
VI. Although the statistics were scant for the first two 9-Å
shells and were dramatically sparse for the third and
fourth 9-Å ligand-distance layers, the first two layers tend
to prefer nonoverlapping sets of amino acids compared
with the general nonrotameric composition (Prefnonrot).
Nonrotameric side-chain types that were preferred by ≥
20% (Prefnonrot) in the first layer (closest to the ligand) and
were avoided in the second layer included Asp, Glu, His,
Met, and Asn, which are mostly polar and negatively
charged. Although Lys and Gln were preferred in the first
layer, their low abundance and high preference error
values render the result insignificant. In the second layer,
hydrophobic residue types Cys, Ile, Leu, and Val were
preferred along with Arg, Ser and Thr (albeit the preference error values are relatively high for the latter types),
clearly distinguishing the nonrotameric residue types closest to the ligand. Indeed, the correlation coefficient between the Prefnonrot values for the first two layers was
−0.93. The shell compositions compared with the general
rotameric distribution (Prefrot) showed that the nonrotameric residue types that were highly preferred in general
(see the accompanying article10; Table II) remained favored in the two ligand shells, albeit far less dramatically.
It should be stressed that all sampled residues were buried
almost completely with ASA < 5 Å², such that any polar
preference observed would not be induced by the presence
of solvent.
TABLE VI. Amino Acid Composition of Nonrotameric Residues in Sampled Distance Shells Around Ligands†
First ligand shell (0.0–9.0 Å)
Amino acid type | No. | % | Prefrot | Prefnonrot
C | 3 | 2.3 | 0.53 ± 0.30 | 0.74 ± 0.42
D | 19 | 14.5 | 7.08 ± 1.50 | 1.25 ± 0.26
E | 6 | 4.6 | 2.51 ± 1.00 | 2.23 ± 0.89
F | 22 | 16.8 | 2.10 ± 0.41 | 1.11 ± 0.22
H | 9 | 6.9 | 4.78 ± 1.54 | 1.48 ± 0.48
I | 7 | 5.3 | 0.38 ± 0.14 | 0.58 ± 0.21
K | 1 | 0.8 | 2.24 ± 2.21 | 2.97 ± 2.96
L | 9 | 6.9 | 0.36 ± 0.12 | 0.56 ± 0.18
M | 12 | 9.2 | 2.01 ± 0.55 | 1.55 ± 0.43
N | 23 | 17.6 | 9.12 ± 1.73 | 1.55 ± 0.29
Q | 1 | 0.8 | 0.46 ± 0.46 | 1.48 ± 1.48
R | 1 | 0.8 | 0.98 ± 0.98 | 0.59 ± 0.59
S | 4 | 3.1 | 0.36 ± 0.18 | 0.57 ± 0.28
T | 1 | 0.8 | 0.10 ± 0.10 | 0.27 ± 0.27
V | 3 | 2.3 | 0.12 ± 0.07 | 0.52 ± 0.30
W | 3 | 2.3 | 1.03 ± 0.59 | 0.47 ± 0.27
Y | 7 | 5.3 | 1.60 ± 0.59 | 1.04 ± 0.38
Second ligand shell (9.0–18.0 Å)
Amino acid type | No. | % | Prefrot | Prefnonrot
C | 8 | 4.0 | 0.94 ± 0.32 | 1.31 ± 0.45
D | 15 | 7.6 | 3.70 ± 0.92 | 0.65 ± 0.16
E | 1 | 0.5 | 0.28 ± 0.28 | 0.25 ± 0.25
F | 28 | 14.1 | 1.77 ± 0.31 | 0.93 ± 0.16
H | 7 | 3.5 | 2.46 ± 0.91 | 0.76 ± 0.28
I | 23 | 11.6 | 0.83 ± 0.16 | 1.26 ± 0.25
K | 0 | 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00
L | 33 | 16.7 | 0.88 ± 0.14 | 1.35 ± 0.22
M | 10 | 5.1 | 1.11 ± 0.34 | 0.85 ± 0.26
N | 17 | 8.6 | 4.46 ± 1.03 | 0.76 ± 0.18
Q | 1 | 0.5 | 0.30 ± 0.30 | 0.98 ± 0.98
R | 3 | 1.5 | 1.94 ± 1.11 | 1.18 ± 0.68
S | 13 | 6.6 | 0.78 ± 0.21 | 1.22 ± 0.33
T | 8 | 4.0 | 0.53 ± 0.18 | 1.43 ± 0.50
V | 13 | 6.6 | 0.35 ± 0.10 | 1.50 ± 0.40
W | 10 | 5.1 | 2.28 ± 0.70 | 1.03 ± 0.32
Y | 8 | 4.0 | 1.21 ± 0.42 | 0.79 ± 0.28
†First ligand shell and Second ligand shell give the separation ranges in Å between ligand and side-chain atoms. No. denotes numbers of amino acids, % gives the percentages. Prefrot and Prefnonrot specify the preferences compared with the general rotameric and nonrotameric compositions, respectively, over all ligand proteins. The standard errors are estimated as described in the accompanying article.10

Secondary Structure Preferences Around Ligands
Table VII shows statistics similar to those shown in Table VI for the first two 9-Å layers around ligands regarding preferences for secondary structure types and regions therein (N- and C-terminal; middle) relative to the general rotameric and nonrotameric statistics. Clearly, nonrotameric residues in the shell closest to ligands prefer N-terminal helix, turn, and coil regions. The observations for helical N-terminal fragments are consistent with the high preferences for Asp and Glu within the first ligand shell; these amino acid types are observed frequently also in N-terminal helical turns,17 possibly due to the helical dipole effect18 or capping box structure.19 Residues in the second ligand shell avoid N-terminal helix termini (Prefnonrot) and prefer helical C-terminal fragments, in accordance with the observed relative avoidance of negatively charged residues (Table VI). Relative to the first ligand shell, N-terminal and middle sheet segments become favored. Turn and coil segments in the second ligand shell are avoided, although these regions suffer from critically low sampling frequencies, resulting in high error estimates.
Changes in Side-Chain Structural States for Apo
and Complexed Structures
The 20 pairs of proteins with known complexed and corresponding apo structures (Table II) were selected in the absence of any functional constraints (see Materials and Methods). They include various binding proteins as well as enzymes complexed with natural ligands or analogues. The structures within each pair show identical topologies, with low root mean square deviation (RMSd) values after superpositioning20 of all Cα atoms (Table II). The buried sample of residues within the pairs (ASA ≤ 5 Å²) would lead to a further decrease in the RMSd values. Statistics on conformational changes of rotameric and nonrotameric side chains as structures go from the unbound state to the bound state are given in Table VIII. Differences in crystallographic resolutions, limited to values < 0.50 Å to assure comparability (see Materials and Methods), also are listed. The statistics are reasonably balanced, because, out of 20 complex/apo pairs, 7 pairs show that the complexed structure is resolved better, whereas 8 pairs favor the apo structure, and 5 pairs display no difference.
Tables IX and X summarize the data from Table VIII.
When all side chains in the entire protein structure are
considered, 11 of 20 structural pairs show a net increase in
nonrotameric residues for the complex, 4 structural pairs
display more nonrotamers for the apo structures, and 5
structural pairs were indifferent, such that 73% of the
structure pairs that displayed a change in strain upon
ligand binding opted for more nonrotameric residues in the
complex state. Furthermore, if we considered only the 7
structural pairs with a better resolved complex (negative
ΔR values in Table VIII), for which the complexes would be expected to
contain smaller fractions of nonrotameric residues than
the associated apo structures, then 3 pairs showed increased nonrotamericity for the complex, only 1 pair
showed this for the apo form, and 3 pairs were neutral. It is
clear that the results cannot be attributed to resolution
differences. However, as expected from the small sample of
apo-complex pairs, the standard deviations were relatively
high for the average difference between the number of side chains nonrotameric in the complex only and the number nonrotameric in the
apo structure only (column Δnonrot in Table VIII), with values
of 0.90 ± 2.25 over the entire structures and 0.25 ± 1.09
for side chains within 9 Å of their associated
ligands.
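The summary figures quoted above can be rechecked with a short Python sketch. The two lists are transcribed from the Δnonrot columns of Table VIII; the function name `summarize` and the choice of the population standard deviation are assumptions made here for illustration, not details stated in the article.

```python
from statistics import mean, pstdev

# Δnonrot (C − A) per structure pair, transcribed from Table VIII (pairs 1–20).
delta_nonrot_entire = [0, 2, 3, 1, 4, 4, 2, 0, -6, 2, 2, -1, -1, 0, -1, 2, 0, 4, 0, 1]
delta_nonrot_9A = [0, 0, 2, 0, 0, 2, -1, 0, -2, 1, 2, 0, 0, -1, -1, 0, 0, 2, 1, 0]


def summarize(deltas):
    """Count pairs that gain, lose, or keep nonrotameric residues on ligand binding."""
    gains = [d for d in deltas if d > 0]
    losses = [d for d in deltas if d < 0]
    n_changed = len(gains) + len(losses)
    return {
        "n_gain": len(gains),
        "n_loss": len(losses),
        "n_neutral": len(deltas) - n_changed,
        "side_chains_gain": sum(gains),
        "side_chains_loss": -sum(losses),
        "pct_gain_of_changed": 100.0 * len(gains) / n_changed,
        "mean": mean(deltas),
        # Assumption: population SD; the article does not say which estimator was used.
        "sd": pstdev(deltas),
    }


entire = summarize(delta_nonrot_entire)
near = summarize(delta_nonrot_9A)
print(entire)
print(near)
```

For the entire structures this gives 11 gaining, 4 losing, and 5 neutral pairs (27 versus 9 side chains), matching Tables IX and X, and the means and standard deviations land close to the quoted 0.90 ± 2.25 and 0.25 ± 1.09.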
TABLE VII. Preferences for the Presence of Nonrotameric Residues in 9-Å Ligand Shells†

Shell (Å)     Secondary   Statistic     N-terminal     Middle         C-terminal
              structure
0.00–9.00     H           N             16             17             2
                          %             17.98          19.10          2.25
                          Pref_rot      2.74 ± 0.62    0.69 ± 0.15    0.30 ± 0.21
                          Pref_nonrot   1.62 ± 0.37    1.01 ± 0.23    0.36 ± 0.25
              E           N             11             10             11
                          %             12.36          11.24          12.36
                          Pref_rot      0.88 ± 0.25    0.57 ± 0.17    0.80 ± 0.23
                          Pref_nonrot   0.71 ± 0.20    0.53 ± 0.16    0.87 ± 0.25
              T           N             4              3              5
                          %             4.49           3.37           5.62
                          Pref_rot      2.63 ± 1.29    3.46 ± 1.96    4.10 ± 1.78
                          Pref_nonrot   2.17 ± 1.06    2.44 ± 1.38    2.32 ± 1.01
              C           N             6              2              2
                          %             6.74           2.25           2.25
                          Pref_rot      3.95 ± 1.56    1.64 ± 1.15    1.27 ± 0.89
                          Pref_nonrot   1.95 ± 0.77    1.30 ± 0.91    2.17 ± 1.51
9.00–18.00    H           N             15             30             14
                          %             9.43           18.87          8.81
                          Pref_rot      1.44 ± 0.35    0.68 ± 0.11    1.17 ± 0.30
                          Pref_nonrot   0.85 ± 0.21    1.05 ± 0.17    1.41 ± 0.36
              E           N             31             41             21
                          %             19.50          25.79          13.21
                          Pref_rot      1.39 ± 0.23    1.31 ± 0.18    0.86 ± 0.17
                          Pref_nonrot   1.13 ± 0.18    1.22 ± 0.16    0.93 ± 0.19
              T           N             1              0              1
                          %             0.63           0.00           0.63
                          Pref_rot      0.37 ± 0.37    0.00 ± 0.00    0.46 ± 0.46
                          Pref_nonrot   0.30 ± 0.30    0.00 ± 0.00    0.26 ± 0.26
              C           N             2              2              1
                          %             1.26           1.26           0.63
                          Pref_rot      0.74 ± 0.52    0.92 ± 0.64    0.36 ± 0.36
                          Pref_nonrot   0.36 ± 0.26    0.73 ± 0.51    0.61 ± 0.60

†Shell refers to the distance of the nonrotameric residues from the
ligand. Secondary structure gives secondary structural states of the
nonrotameric residues as determined by using the DSSP algorithm.¹³
The four secondary structural symbols shown are obtained from DSSP
symbols and classified as follows: H (helix) for G, H, and I; E (strand)
for B and E; T (turn) for S and T; and C (coil) for blank. Secondary
structural elements sampled had minimum lengths of 6, 4, 2, and 6 for
helix (H), strand (E), turn (T), and coil (C), respectively, whereas the
lengths of the N-terminal and C-terminal segments chosen, respectively, were 3, 2, 1, and 3. Lengths of middle segments were dependent
on the actual length of the secondary structure considered. Statistics
are shown as symbols designating the following: N, number of side
chains in the secondary structural element considered; %, percentages
of all shell residues found in the given secondary structural region;
Pref_rot, preference of rotameric side chains compared with the rotameric
residue distribution; Pref_nonrot, nonrotameric preference compared with the nonrotameric residue distribution. Estimates for preference standard
errors are compiled as described in the accompanying article.¹⁰

Table XI shows the nonrotameric residues occurring in
the three states (complex only, complex and apo, apo only)
over the 20 complex/apo pairs, and Table XII lists the
corresponding counts and preferences for the grouped and
individual residue types involved. Although the counts are
critically small, interesting trends are suggested. Nonrotameric states for the large aromatic side chains are preferred in the apo structures, whereas the negatively
charged and polar as well as smaller or branched hydrophobic residues would favor nonrotameric states in complexed
structures. The cross-composition preferences show that
the Asp, Glu, Asn, Gln group, although it is preferred in the
complex-only state relative to the apo-only state, is most
likely to be nonrotameric in complex and apo structures
(Table XIIb). The small/branched hydrophobic amino acid
types, indeed, show the highest preference for the complex-only state. The latter statistics are especially salient for
Leu, with 11 complex-only counts versus 2 apo-only counts,
such that Leu shows the most pronounced tendency to
become strained upon ligand binding (Table XIIa). The Leu
complex-only state preference relative to the apo-only
state has a value of 2.90 ± 0.74 (Table XIIa). Although
some trends for other individual residue types also are
suggested, due to the sparse data (particularly the paucity
in the distribution to which a considered distribution is
compared), the error numbers¹⁰ likely are underestimated,
so that only the preference for Leu residues might be
considered significant.
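The preference values discussed above follow from the counts in Table XII as a ratio of fractions. The sketch below is a minimal illustration under that reading of the table footnote; the dictionary layout and the function name `preference` are hypothetical, and the error estimates (described in the accompanying article) are not reproduced.

```python
# Counts transcribed from Table XII: (complex only, complex and apo, apo only).
counts = {
    "Leu": (11, 4, 2),
    "D+E+N+Q": (7, 11, 2),
    "F+H+W+Y": (1, 12, 5),
    "K+R": (2, 1, 0),
}
# Total observations per state, also from Table XII.
totals = (38, 40, 20)


def preference(residue, state_a, state_b):
    """Fraction of `residue` in state_a divided by its fraction in state_b.

    States are numbered 1 to 3 as in Table XII. Returns None when the
    residue type is absent from state_b (listed as n.a. in the table).
    """
    a, b = state_a - 1, state_b - 1
    frac_b = counts[residue][b] / totals[b]
    if frac_b == 0:
        return None
    return (counts[residue][a] / totals[a]) / frac_b


leu_1_3 = preference("Leu", 1, 3)
polar_2_3 = preference("D+E+N+Q", 2, 3)
aromatic_1_3 = preference("F+H+W+Y", 1, 3)
print(leu_1_3, polar_2_3, aromatic_1_3)
```

With these counts the Leu State 1/3 ratio comes out near the tabulated 2.90, the D+E+N+Q State 2/3 ratio equals 2.75, and the aromatic group shows the low complex-only preference noted in the text.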
To quantify further the increased nonrotamericity in
complexed structures, we checked how much the side-chain χ1,2 angles moved upon binding the ligands. For each
examined side chain, the difference in (Δχ1 + Δχ2)
was taken; i.e., Δ(Δχ1 + Δχ2), where Δχ1,2 denotes the
angular χ1,2 distance from the nearest associated rotamer,
and the outer Δ corresponds to the difference between the (Δχ1 +
Δχ2) values of equivalent side chains in the complexed
and apo structures. Table XIII lists the angular displacements obtained in this manner, averaged for the three
nonrotameric residue states (complex only, complex and
apo, and apo only) over the 20 complex/apo pairs, together
with associated standard deviations. Thirty-eight residues that are nonrotameric only in the complexed structure lie, on
average, an extra 16.6° from their rotameric states relative to the
equivalent residues in the apo structure, whereas the 20
residues that are nonrotameric only in the apo structure lie a smaller 11.2° further from their
rotameric states relative to the corresponding residues in
the complex. When residues are nonrotameric in both the
complexed structure and the apo structure, the difference
is small but positive. Clearly, there is increased strain in
the ligand-bound structures, although the error numbers
indicate fairly wide distributions. The results are consistent when only nonrotameric residues within 9 Å of the
ligand are considered (Table XIII). In fact, the relative
build-up of strain in residues that become stressed (complex only), compared with the relaxation of those becoming
rotameric in the complex (apo only), increases from 5.4° to
6.8° for side chains closest to the ligands (<9 Å).
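The Δ(Δχ1 + Δχ2) measure described above can be sketched in Python as follows. This is only an illustration: the rotamer library values are hypothetical, the function names are invented here, and choosing the nearest rotamer by the smallest summed angular distance is an assumption, since the selection rule is defined in the accompanying article rather than in this section.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def rotamer_deviation(chi, rotamers):
    """Summed distance (Δχ1 + Δχ2) from `chi` to the nearest rotamer.

    `chi` and every entry of `rotamers` are tuples of dihedral angles in degrees.
    Assumption: "nearest" means the rotamer with the smallest summed distance.
    """
    return min(
        sum(angular_distance(c, r) for c, r in zip(chi, rotamer))
        for rotamer in rotamers
    )


def strain_change(chi_complex, chi_apo, rotamers):
    """Δ(Δχ1 + Δχ2): deviation in the complex minus deviation in the apo form."""
    return rotamer_deviation(chi_complex, rotamers) - rotamer_deviation(chi_apo, rotamers)


# Hypothetical rotamer library for a side chain with two χ angles
# (illustrative values only, not taken from the article).
ROTAMERS = [(-60.0, 180.0), (180.0, 60.0), (-60.0, -60.0)]

# Hypothetical equivalent side chain observed in a complexed and an apo structure.
change = strain_change((-85.0, 170.0), (-65.0, 178.0), ROTAMERS)
print(change)
```

A positive value means the side chain sits further from its nearest rotamer in the complex. Averaging such values over the side chains in each state gives the entries of Table XIII, from which the quoted contrasts follow as 16.59 − 11.19 = 5.40° and 11.41 − 4.57 = 6.84°.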
Hydrogen Bonding and Charge Complementarity
Involving Nonrotameric Residues
Can the increased strain be mitigated by increased
hydrogen bonding and/or salt-bridge formation involving
the residues that have become nonrotameric upon ligand
binding? Differences in hydrogen bond formation over the
relevant residues were determined for the 20 complex/apo
pairs. Table XIV shows that, for side-chain atoms of
nonrotameric residues only in the complexes, the net
number of hydrogen bonds over 38 side chains is increased
by only 4 relative to the apo counterparts. Similarly, the
apo-only, nonrotameric side chains increase their net
hydrogen bonds by 2, which would lower further their
internal energy relative to the complexes. From Table XIV
it can be concluded that some residues sacrifice their
rotameric states for an extra hydrogen bond, but this
occurs only in about 10% of all cases. Moreover, we found
no correlation between the (Δχ1 + Δχ2) differences and
observed hydrogen bonding (complex-only state, 0.37;
complex and apo state, −0.26; apo-only state, −0.12).
TABLE VIII. Side-Chain Conformational Changes in Selected Complex/Apo Structure Pairs†

                                            Δnonrot           Entire structure   <9 Å
Pair  Compl   Apo     RC     RA     ΔR      Entire   <9 Å     C    C+A   A       C    C+A   A
1     1adl    1lib    1.60   1.70   −0.10   0        0        0    0     0       0    0     0
2     1bp2    2bpp    1.70   1.80   −0.10   +2       0        2    1     0       0    0     0
3     1brnl   1brsa   1.76   2.00   −0.24   +3       +2       3    0     0       2    0     0
4     1cbq    1cbs    1.80   2.20   0.40    +1       0        1    0     0       0    0     0
5     1dmb    1omp    1.80   1.80   0.00    +4       0        5    4     1       0    1     0
6     1gca    1gcg    1.70   1.90   −0.20   +4       +2       4    3     0       2    2     0
7     1geub   1gesb   2.20   1.74   0.46    +2       −1       4    4     2       0    0     1
8     1gmpa   1gmqa   1.70   1.80   −0.10   0        0        0    0     0       0    0     0
9     1gof    1gog    1.70   1.90   −0.20   −6       −2       1    10    7       0    2     2
10    1hyt    1lnfe   1.70   1.70   0.00    +2       +1       2    1     0       1    0     0
11    1icm    1ifc    1.50   1.19   0.31    +2       +2       2    0     0       2    0     0
12    1isca   1isaa   1.80   1.80   0.00    −1       0        0    1     1       0    0     0
13    1ndc    1npk    2.00   1.80   0.20    −1       0        0    0     1       0    0     0
14    1rar    1ras    1.90   1.70   0.20    0        −1       1    0     1       0    0     1
15    4xis    1xis    1.60   1.60   0.00    −1       −1       0    3     1       0    1     1
16    1emd    2cmd    1.90   1.87   0.03    +2       0        3    1     1       1    0     1
17    2ctc    2ctb    1.40   1.50   −0.10   0        0        0    5     0       0    0     0
18    2gmt    1gmca   1.80   2.20   0.40    +4       +2       6    1     2       3    0     1
19    2sim    2sil    1.60   1.60   0.00    0        +1       3    6     3       1    2     0
20    2wgca   9wgaa   2.20   1.80   0.40    +1       0        1    1     0       0    0     0

†Structure pair, Compl, and Apo designate the structural pairs as given in Table II. RC relates the crystallographic resolutions of the complexed
structures, and RA shows those for the apo enzymes. ΔR provides the difference of columns RC and RA; a negative (positive) number implies a
better resolved complex (apo) structure. Δnonrot designates the net difference in number of nonrotameric residues for the complex form (C − A).
Entire structure denotes this value over all sampled residues, and <9 Å gives the net difference only for residues at a distance smaller than 9 Å
from their associated ligands. Entire structure and <9 Å list the numbers of nonrotameric residues for three states: C, nonrotameric in the
complex and rotameric in the apo form; C + A, nonrotameric in both complex and apo forms; A, rotameric in the complex and nonrotameric in the
apo forms.
TABLE IX. Statistics Over Side-Chain Conformational Changes in Complex/Apo Structural Pairs (Statistics Over Proteins)†

Type   No.   %     % nonzero   Number ratio (+/−)
+      11    55    73          2.75
−      4     20    27          —
0      5     25    —           —

†+ indicates data for proteins with increased nonrotamericity in the complex structure, − indicates those
with fewer nonrotameric residues in the complex,
and 0 indicates proteins displaying no change in
nonrotamericity between the complex and apo structures.
Thus, side-chain dihedral displacement cannot be explained by increased hydrogen bonding. Table XIV also
shows similar statistics for salt-bridge formation, and,
again, the observed increase in nonrotamericity for ligand-bound structures appears to be little influenced or mitigated by increased charge complementarity within the
complexes.
Apo/Complexed Examples
Three examples are included to illustrate the positioning of strained side chains in ligand-bound complexes:
Salmonella galactose-binding protein (1gca; pair 6 in
Table VIII), bovine γ-chymotrypsin (2gmt; pair 18), and
Bacillus amyloliquefaciens barnase (1brnl; pair 3). These
structures illustrate how increased internal energy is
stored around ligands.

TABLE X. Statistics Over Side-Chain Conformational Changes in Complex/Apo Structural Pairs (Statistics Over Number of Side Chains)†

Type   No. of side chains   Average per structure   Number ratio (+/−)
+      27                   2.45                    3.0
−      9                    2.25                    —

†+ denotes residues that switch to nonrotameric from rotameric in the ligand-bound
structure, whereas − side chains are nonrotameric only in the apo form.
The complexed galactose-binding protein (1gca; Fig. 1)
provides a good example of a strained, four-side-chain
cluster that is completely within a 9-Å-radius shell of the
galactose ligand. The cluster contains Tyr10 and Met17,
strained in both the bound form and the unbound form,
whereas Asn66 and Lys92 become nonrotameric upon
ligand binding. The Asn and Met side chains, respectively,
show an extra 10° and 14° deviation from the rotameric
states relative to their dihedral angles in the apo structure. The cluster forms a supporting layer behind the
ligand. Two more side chains become strained in the
complex: Gln142 binds to the Ca²⁺ ligand and deviates
from the rotameric state by an extra 19° upon ligand binding,
and the distant Leu145 strains only by a further 2.4°.

TABLE XI. Residue Types Occurring in Various Nonrotameric States in 20 Complex/Apo Pairs

Pair  Compl   Apo     Nonrotameric in complex only          Nonrotameric in complex and apo       Nonrotameric in apo only
1     1adl    1lib    —                                     —                                     —
2     1bp2    2bpp    N71, I95                              D42                                   —
3     1brnl   1brsa   D75, L89, I96                         —                                     —
4     1cbq    1cbs    C130                                  —                                     —
5     1dmb    1omp    L43, I60, S114, L115, E308            I59, L89, D136, F258                  V183
6     1gca    1gcg    N66, K92, Q142, L145                  Y10, M17                              —
7     1geub   1gesb   L32, M53, V125, T383                  Y7, I20, D207, F446                   L257, S345
8     1gmpa   1gmqa   —                                     —                                     —
9     1gof    1gog    L252                                  W39, L68, F116, I165, H334, N355,     Y55, S139, L184, C228, W340, F399, I574
                                                            Y405, R439, N509, D586
10    1hyt    1lnfe   I20, E166                             Y76                                   —
11    1icm    1ifc    L113, R126                            —                                     —
12    1isca   1isaa   —                                     F106                                  I23
13    1ndc    1npk    —                                     —                                     M80
14    1rar    1ras    C72                                   —                                     N44
15    4xis    1xis    —                                     F11, F99, N215                        F288
16    1emd    2cmd    L6, L32, I151                         V171                                  I116
17    2ctc    2ctb    —                                     V33, I62, D65, L66, N146              —
18    2gmt    1gmca   L105, C136, D194, C201, V227, Y228    L199                                  V52, S214
19    2sim    2sil    V280, L319, V358                      I38, V116, Q194, N232, T256, V359     H57, D67, M269
20    2wgca   9wgaa   S62                                   S148                                  —

Notes: Column “Complex only” gives residues that are nonrotameric in the complex and rotameric in the apo form, “Complex and apo” provides
those nonrotameric both in complex and apo forms, and “Apo only” lists residues rotameric in the complex form and nonrotameric in the apo form.
Figure 2 shows a four cluster of side chains in the first
9-Å layer around the inhibitor N-acetyl-L-phenylalanyl-α-chloroethylketone attached to γ-chymotrypsin. Asp194,
Val227, and Tyr228 become nonrotameric in the complex,
whereas Ser214 becomes strained with the ligand removed. Asp194 strains by an extra 2.6° upon ligand
binding, whereas the Val227 χ1 angle is pushed 17° away
from its nearest associated rotamer, and the Tyr228 χ1 and
χ2 angles are displaced by 22° in total. Ser214 relaxes by 6°
relative to the apo structure.
Figure 3 shows the three side chains that become
nonrotameric in a barnase (1brn chain L) enzyme/RNA
complex. Asp75 and Leu89 are within the first 9-Å shell
around the bound, four-nucleotide fragment, whereas Ile96
is in the second shell. Although the three side chains are
not in contact, they are distributed about the ligand
pocket. Furthermore, Asp75, Leu89, and Ile96 show substantial extra (χ1 + χ2) dihedral deviations from
their rotameric states of 10°, 20°, and 50.1°, respectively, apparently
induced by binding the nucleotide fragment.
DISCUSSION
We have shown that nonrotameric side chains (and
especially clusters of interacting nonrotameric residues)
prefer to be in the proximity of ligand- and substrate-binding sites. Within the first 9-Å shell around the
ligands, the favored nonrotameric residue types, compared
with the general nonrotameric composition, are Asp, Glu,
His, Met, and Asn. In the second 9-Å layer, hydrophobic
residue types Cys, Ile, Leu, and Val are preferred along
with Arg, Ser, and Thr, in contrast with the nonrotameric
residue types closest to the ligand. The strain in the
vicinity of ligands apparently can be accommodated better
if the residues are in turn, coil, or N-terminal helical
regions.
It is known that ligand-binding sites show shape complementarity and are composed of residue types that allow
tight interaction with a substrate or other ligands, often
involving hydrophobic surface atoms. Induction of strain
through increased internal energy by nonrotameric side
chains is suggested here to aid catalysis and the release of
bound products. For example, enzymatic activity would be
enhanced when the dihedral strain is released upon reaching the transition state. Under this scenario, the nonrotameric residues containing energy might be viewed as
“rechargeable batteries” that feed and stimulate reactivity.
Such a mechanism would give nature an extra device for
fine tuning enzyme catalysis during evolution, particularly enzymes in which favorable free energy changes upon
ligand binding are relatively large. It must be stressed
that the few enzyme complexes included here do not
corroborate this notion clearly. In fact, three enzyme
complexes of the 20 complex/apo pairs presented (pairs 10,
18, and 19 in Table II) involve transition-state analogues
but, nonetheless, show a build-up of strain (Table VIII),
which would not be expected if such strain was the
predominant mechanism for the ejection of catalytic products. On the other hand, the aforementioned barnase
example (Fig. 3) shows that a substrate analogue can
cause nonrotamericity along with large shifts in side-chain
dihedral angles. It also is significant that the increased
energy on ligand binding, as observed over the complex/apo
pairs, does not appear to be mitigated by increased
internal hydrogen bonding or electrostatic interaction.

TABLE XII. Counts of Individual and Grouped Nonrotameric Residue Types in Three States

                  State (counts)                        Preference^a
Type              1. Complex   2. Complex   3. Apo      State 1/3      State 1/2      State 2/3
                  only         and apo      only
a. Individual (amino acid) residue types
C                 4            0            1           2.11 ± 1.00    n.a.           0.00 ± 0.00
D                 2            5            1           1.05 ± 0.72    0.42 ± 0.29    2.50 ± 1.05
E                 2            0            0           n.a.           n.a.           n.a.
F                 0            6            2           0.00 ± 0.00    0.00 ± 0.00    1.50 ± 0.57
H                 0            1            1           0.00 ± 0.00    0.00 ± 0.00    0.50 ± 0.49
I                 5            5            3           0.88 ± 0.37    1.04 ± 0.44    0.83 ± 0.35
K                 1            0            0           n.a.           n.a.           n.a.
L                 11           4            2           2.90 ± 0.74    2.90 ± 0.74    1.00 ± 0.47
M                 1            1            2           0.26 ± 0.26    1.05 ± 1.04    0.25 ± 0.25
N                 2            5            1           1.05 ± 0.72    0.42 ± 0.29    2.50 ± 1.05
Q                 1            1            0           n.a.           1.05 ± 1.04    n.a.
R                 1            1            0           n.a.           1.05 ± 1.04    n.a.
S                 2            1            3           0.35 ± 0.24    2.11 ± 1.54    0.17 ± 0.17
T                 1            1            0           n.a.           1.05 ± 1.04    n.a.
V                 4            4            2           1.05 ± 0.50    1.05 ± 0.50    1.00 ± 0.47
W                 0            1            1           0.00 ± 0.00    0.00 ± 0.00    0.50 ± 0.49
Y                 1            4            1           0.53 ± 0.52    0.26 ± 0.26    2.00 ± 0.95
Total             38           40           20          —              —              —
b. Grouped residue types
D+E+N+Q           7            11           2           1.84 ± 0.63    0.67 ± 0.23    2.75 ± 0.71
F+H+W+Y           1            12           5           0.11 ± 0.10    0.09 ± 0.09    1.20 ± 0.29
K+R               2            1            0           n.a.           2.11 ± 1.54    n.a.
S+T               3            2            3           0.53 ± 0.29    1.58 ± 0.88    0.33 ± 0.23
C+I+L+M+V         25           14           10          1.32 ± 0.15    1.88 ± 0.22    0.70 ± 0.15
Total             38           40           20          —              —              —

^a The preference State a/b is calculated as the fraction of the residue type in state a relative to that in state b, where a and b can be states 1 to 3
(columns 2 to 4). For example, State 1/3 denotes the nonrotameric preference of the considered residue type for being nonrotameric in the
complex-only state relative to being nonrotameric in the apo-only state. Estimates for preference standard deviations are compiled as described in
Heringa and Argos.¹⁰ n.a., not applicable.

TABLE XIII. Differences in Δχ1 and Δχ2 Angles of Nonrotameric Side Chains in Three Complex/Apo States†

                     Entire structure            <9 Å from ligand
Nonrotameric in      No.   Δ(Δχ1 + Δχ2)          No.   Δ(Δχ1 + Δχ2)
Complex only         38    16.59 ± 13.63         12    11.41 ± 9.75
Complex and apo      40    1.97 ± 9.08           8     3.00 ± 12.00
Apo only             20    −11.19 ± 13.66        7     −4.57 ± 1.50

†Δ(Δχ1 + Δχ2), the mean differences with standard deviations of the
summed (Δχ1 + Δχ2) values of equivalent side chains in complexed
and apo structures relative to the complexes; i.e.,
$\Delta(\Delta\chi_1 + \Delta\chi_2) = \frac{1}{N}\sum_{N}\left[(\Delta\chi_1^{c} + \Delta\chi_2^{c}) - (\Delta\chi_1^{a} + \Delta\chi_2^{a})\right]$,
where superscripts c and a designate Δχ angles of equivalent
residues in complexed and apo structures, respectively, and N is the
number of side chains. Summation is over the number of side chains
(N). For residues Cys, Ser, Thr, and Val, only the Δχ1 values were used.
No., the numbers of side chains observed in each of the three states.
As an aid to protein engineering and design, we note that the
residue types used in natural evolution to induce strain
are likely to be negatively charged and polar side chains as
well as branched hydrophobic residues, especially leucine.
TABLE XIV. Differences in Protein-Internal Hydrogen Bonding and Charge Complementarity of Nonrotameric Side Chains in Three Complex/Apo States

Nonrotameric in      No.   ΔH-bond^a   Δsalt bridge^a
Complex only         38    4           3
Complex and apo      40    0           −1
Apo only             20    −2          0

^a ΔH-bond and Δsalt-bridge values are given relative to the
complexed structure.
Furthermore, nonrotameric switching may provide a more
general way to channel energy through protein structures
without necessarily disrupting the main-chain topology. If
more corresponding complex and apo structures become
available, then suggestions for protein mutagenesis experiments could be derived (Tables XI and XII). Here, only the
statistics for Leu residues were significant enough to
suggest its engineering at ligand binding sites aimed at
provoking more inducible strain, which could increase
enzymatic activity. The residue type might be especially
appropriate in cases of hydrophobic patches binding ligands. However, such engineering is a complicated matter,
because every active site has unique properties with
associated residues in its specific environment, such that,
apart from altered side-chain dihedral strain, amino acid
substitutions involve changes in hydrophobicity, packing,
hydrogen bonding, and the like. Further research should
elucidate how enzymes bind their substrates, transition
states, and products in individual cases of strained side-chain conformations, as observed in this work.

Fig. 1. Space-filling representation of a nonrotameric side-chain four cluster and two isolated
side chains for the Salmonella typhimurium galactose-binding protein (1gca) complex with galactose and Ca²⁺ ligands. Orange indicates nonrotameric side chains only in the complex: namely,
Asn66, Lys92, Gln142, and Leu145. Yellow designates nonrotameric side chains in both the
complexed and the apo structures: Tyr10 and
Met17. Ligands galactose (GAL) and Ca²⁺ (CA)
are depicted in blue. The picture was prepared by
using RasMol graphics.²¹

Fig. 2. Space-filling representation of nonrotameric side chains in the
complexed bovine γ-chymotrypsin (2gmt). Ser214, Val227, Tyr228, and
Asp194 are within 9 Å of the ligand. Orange indicates nonrotameric side
chains only in the complex. Val52 and Ser214 (green) become relaxed
(rotameric) upon ligand binding. Leu199 (yellow) is nonrotameric in both
the complexed and apo forms. The side chains Cys136, Cys201, Leu199,
and Tyr228 form a nonrotameric four cluster. The N-acetyl-L-phenylalanyl-α-chloroethylketone ligand (HIN) is depicted in blue.

Fig. 3. Space-filling representation of the nonrotameric side chains for
Bacillus amyloliquefaciens barnase (1brn chain L) with a bound four-nucleotide fragment (RNA). Asp75, Leu89, and Ile96 (orange) become
nonrotameric and strained upon ligand binding. The nucleotide fragment
is indicated in blue.
It could be argued that the results presented here are
derived from critically small data sets and also that
extrapolating a limited number of observations on induced
strain to a general mechanism for providing energy in
enzyme-catalyzed reactions seems somewhat imaginative.
Nonetheless, our observations about the predominance of
nonrotameric residues in the vicinity of ligands are consistent, and we invite protein engineers to test our “battery
hypothesis” of inducible strain.
ACKNOWLEDGMENTS
We thank Frank Eisenhaber and Alex May for valuable
discussions and Gerhard Vogt for help in delineating
hydrogen bonds and salt bridges.
REFERENCES
1. Schrauber H, Eisenhaber F, Argos P. Rotamers: to be or not to be?
An analysis of amino acid side-chain conformations in globular
proteins. J Mol Biol 1993;230:592–612.
2. Janin J, Wodak S, Levitt M, Maigret B. Conformations of amino
acid side-chains in proteins. J Mol Biol 1978;125:357–386.
3. Bhat TN, Sasisekharan V, Vijayan M. An analysis of side-chain
conformation in proteins. Int J Protein Peptide Res 1979;13:170–
184.
4. Gelin BR, Karplus M. Side-chain torsional potentials and motion
of amino acids in proteins: bovine pancreatic trypsin inhibitor.
Proc Natl Acad Sci USA 1975;72:2002–2006.
5. Gelin BR, Karplus M. Side-chain torsional potentials: effect of
dipeptide, protein and solvent environment. Biochemistry 1979;18:
1256–1268.
6. Ponder JW, Richards FM. Tertiary templates for proteins. Use of
packing criteria in the enumeration of allowed sequences for
different structural classes. J Mol Biol 1987;193:775–791.
7. Creamer TP, Rose GD. Side-chain entropy opposes α-helix formation but rationalizes experimentally determined helix-forming
propensities. Proc Natl Acad Sci USA 1992;89:5937–5941.
8. Lee KH, Xie D, Amzel LM. Estimation of changes in side chain
configurational entropy in binding and folding: general methods
and application to helix formation. Proteins 1994;20:68–84.
9. Herzberg O, Moult J. Analysis of steric strain in the polypeptide
backbone of protein molecules. Proteins 1991;11:223–229.
10. Heringa J, Argos P. Strain in protein structures as viewed through
non-rotameric side-chains. I. Spatial position and interaction.
Proteins 1999;37:30–43.
11. Bernstein FC, Koetzle TF, Williams GJB, Meyer EF, Brice MD,
Rodgers JR, Kennard O, Shimanouchi T, Tasumi M. Protein data
bank: a computer based archival file for macromolecular structures. J Mol Biol 1977;112:535–542.
12. Heringa J, Sommerfeldt H, Higgins D, Argos P. OBSTRUCT: a
program to obtain largest cliques from a protein sequence set
according to structural resolution and sequence similarity. Comput Appl Biosci 1992;8:599–600.
13. Kabsch W, Sander C. Dictionary of protein secondary structure:
pattern recognition of hydrogen-bonded and geometrical features.
Biopolymers 1983;22:2577–2637.
14. Abagyan R, Totrov M, Kuznetsov D. ICM—a new method for
protein modeling and design: applications to docking and structure prediction from the distorted native conformation. J Comp
Chem 1994;15:488–506.
15. McDonald IK, Thornton JM. Satisfying hydrogen bonding potential in proteins. J Mol Biol 1994;238:777–793.
16. Barlow DJ, Thornton JM. Ion-pairs in proteins. J Mol Biol
1983;168:867–885.
17. Heringa J, Argos P. Side-chain clusters in protein structures and
their role in protein folding. J Mol Biol 1991;220:151–171.
18. Hol WGJ, van Duijnen PT, Berendsen HJC. The α-helix dipole and
the properties of proteins. Nature 1978;273:443–446.
19. Harper ET, Rose GD. Helix stop signals in proteins and peptides—
the capping box. Biochemistry 1993;32:7605–7609.
20. Taylor WR, Orengo CA. Protein structure alignment. J Mol Biol
1989;208:1–22.
21. Sayle RA, Milner-White EJ. RasMol: biomolecular graphics for all.
Trends Biochem Sci 1995;20:374–376.