PROTEINS: Structure, Function, and Genetics 37:44–55 (1999)

Strain in Protein Structures as Viewed Through Nonrotameric Side Chains: II. Effects Upon Ligand Binding

Jaap Heringa1,2* and Patrick Argos2
1Division of Mathematical Biology, National Institute for Medical Research, London, United Kingdom
2European Molecular Biology Laboratory, Heidelberg, Germany

ABSTRACT The relation between the spatial positioning of nonrotameric residues and ligands was studied in 112 tertiary structures of protein-ligand complexes with a crystallographic resolution of ≤ 1.8 Å. Nonrotameric side chains and especially clusters of interacting nonrotameric residues were found to be associated preferentially with ligand- and substrate-binding sites. Asp, Glu, His, Met, and Asn are favored nonrotameric residue types positioned in the first 9-Å shell around ligands. Comparison of 20 complexes with associated apo structures suggests that ligand binding induces nonrotamericity and, hence, strain within protein-ligand complexes. The internal energy gain is not neutralized by increased hydrogen bonding or salt-bridge formation involving side chains that become nonrotameric in the complexed structure. It is suggested that the increased internal energy might aid in the formation and ejection of enzymatic products, thereby enhancing activity. These results could prove useful in protein engineering experiments aimed at altering enzymatic activity. Proteins 1999;37:44–55. © 1999 Wiley-Liss, Inc.

Key words: rotamers; protein structure; protein folding; ligand binding; product formation; catalysis

INTRODUCTION

In protein tertiary structures, amino acid side-chain conformations can be described by their torsion angles, the so-called $\chi$ angles. A rotamer is a k-tuple of $\chi$ angles $\{\chi_1, \chi_2, \chi_3, \ldots\}$ defined by bonded atom positions in moving out along an individual side chain, with k = 1 to N, where N has a maximum of 4, depending on the residue type.
A preferred dihedral conformation for a residue type is termed a rotamer, and the number of rotamers delineated for each of the amino acid types can vary from a minimum of 3 for Cys, Asp, Ser, and Thr up to 9 for Leu.1 Rotamers have been characterized as 1) possessing configurations representing a local minimum in potential energy;2–5 2) dense clusters in the $\chi$-angle space, as observed from tertiary structures in the Protein Data Bank;1,6 and 3) being associated with a local, most favorable conformational entropy.7,8 For the amino acids Gly and Ala, no side-chain dihedral angles are defined due to the lack of sufficient side-chain atoms. Furthermore, Pro lacks the rotational freedom to attain multiple rotamers, because a part of its side chain forms the backbone.

By using data from 19 well-refined crystallographic structures, Ponder and Richards6 assembled a library of rotamers for each amino acid type based on clustering techniques. They observed that most side chains adopt one of the rotameric states. Schrauber et al.1 used a much larger protein structure set and observed that a significant fraction (up to 30%) of particular side-chain types could not be assigned to a rotameric state based on the criterion that rotameric side chains can deviate in each of the $\chi_1$ and $\chi_2$ angles by no more than 20° from the closest associated rotamer. Moreover, they found that this high fraction of nonrotameric residues cannot be attributed to structural resolution, because the rotamericity (fraction of all side chains in a protein structure found rotameric) does not increase with crystallographic resolutions < 1.9 Å. Schrauber et al. suggested that nonrotamericity is often a result of functional or structural constraints, such that, locally, a minimum energy configuration could be sacrificed for the function or stability of the protein as a whole. However, they performed no statistical analysis to support their claim.
Herzberg and Moult9 investigated steric strain for main-chain atoms only in ten protein tertiary structures and observed strained residues near functional sites. In Heringa and Argos (this issue),10 we investigated the spatial positioning of buried, nonrotameric side chains and found that they prefer to be in self-interacting clusters and in coil secondary structures. Although nonrotameric side chains generally have significantly higher crystallographic temperature factors (which measure energetic atomic vibrations) than those in rotameric states, side chains in nonrotameric clusters tend to have lower crystallographic temperature factors than isolated nonrotameric side chains. Nonrotameric clustering thus appears to alleviate structural strain in the protein three-dimensional (3D) structure.

Here, we study the location of nonrotameric residues with regard to ligand- and substrate-binding sites and address the following questions: Is there a relation between the placement of nonrotameric side chains and ligands? In an analysis over a nonredundant set of tertiary structures, it will be shown that nonrotameric residues tend to be situated closer to protein ligands than expected. Can any features be derived from such side chains that might influence the mechanism of ligand binding? This second question was addressed over a set of known tertiary structures with a bound ligand as well as a sequentially identical apo form. The analysis suggested that residues tended to switch from a rotameric state to a nonrotameric state upon ligand binding, leading to an increase in protein internal energy. The induced nonrotameric states were not accompanied by increased hydrogen bonding or salt bridges, which could compensate for the elevated local strain.

Abbreviations: rot, rotameric; nonrot, nonrotameric; ASA, solvent accessible surface area.
*Correspondence to: Jaap Heringa, Division of Mathematical Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, United Kingdom. E-mail: [email protected]
Received 23 April 1999; Accepted 26 April 1999

MATERIALS AND METHODS

Defining Nonrotameric Residues

The protocol used for defining nonrotameric residues was that described in the accompanying article,10 in which residue types Gly, Ala, and Pro were appropriately excluded (see above). Side chains were deemed nonrotameric if $|\chi_1^r - \chi_1| > 20^\circ$ or $|\chi_2^r - \chi_2| > 20^\circ$, where $\chi_1$ and $\chi_2$ are the side-chain dihedral angles of the nonrotamer, and $\chi_1^r$ and $\chi_2^r$, respectively, are the corresponding dihedral angles of the nearest associated rotamer for the particular residue type. Rotamers used were those from the library of Ponder and Richards6 combined with that of Schrauber et al.1 Both were constructed by using the clustering techniques in $\chi$-angle space described in the accompanying article.10

Protein 3D Structures

A nonredundant set of 3D structures was compiled from the Protein Data Bank (PDB)11 with recorded ligands. By using the method of Heringa et al.,12 a maximally sized set of proteins was gathered with minimal sequence lengths of 80 residues (see below), resolutions ≤ 1.8 Å, and sequence identities ≤ 35% over all possible sequence pair alignments. Generally, smaller proteins did not have a solvent-inaccessible core containing more than a single nonrotameric residue. Therefore, they would not have been useful in our interaction analysis of buried side chains with minimal crystallographic error in the coordinates. Of the 166 X-ray structures obtained in this manner, a set of 112 ligand-bound proteins, complexed with one or more ligands, was extracted (Table I). The R-factors of these structures had a value ≤ 0.225.

TABLE I. Set of 112 Protein Structures With a Recorded Ligand in the Protein Data Base Depository†

121P   1BCX   1CPCA  1ECA   1GMPA  1LYE   1PAL   1SNC   1XIS   2CMD   2MSBB  7PCY
1ABA   1BDMA  1CPCB  1EZM   1GOF   1MBA   1PDA   1SRIB  1YCC   2CTB   2OHXA  8DFR
1ABE   1BP2   1CPN   1FKB   1HBG   1MDC   1PHA   1TAG   256BA  2CY3   2PIA
1ADL   1BTL   1CSH   1FLP   1HBIA  1MFA   1PHP   1TCA   2AAE   2CYP   2PKC
1ADS   1CBS   1CYO   1FNB   1HFC   1MRJ   1PMY   1THG   2ACT   2DRI   2POR
1AIZA  1CELA  1DBS   1FRD   1HML   1MUA   1RAS   1THM   2AK3A  2FCR   2TRXA
1AMP   1CHN   1DHJA  1GCA   1HYT   1MYT   1SBP   1TML   2ALP   2GMT   351C
1ARS   1CLL   1DMB   1GD1O  1ICM   1NSCA  1SCS   1TON   2APR   2HMQA  3C2C
1AST   1COT   1DXTB  1GDI   1ISAA  1OFV   1SGC   1TPFA  2BBKL  2HTS   3CLA
1BBHA  1COY   1EAS   1GESB  1IVD   1ONC   1SGT   1TYS   2CCYA  2MCM   3DFR

†The proteins are identified by their Protein Data Base four-letter code. A fifth character identifies the chain of the protein used in the analysis.

To test the conformational changes of nonrotameric residues upon ligand binding, a set of structural pairs consisting of complexed and corresponding apo structures was compiled. Of the 166 nonredundant ≤ 1.8 Å resolution structures, 29 had one or more sequentially identical complexed or apo structures in the PDB with resolution not exceeding 2.5 Å, resulting in 42 complex/apo pairs of structures. Because structural resolution and the degree of rotamericity (the fraction of residues found rotameric) are related (see Heringa and Argos, this issue10), only complex/apo structural pairs were considered that showed a resolution difference ≤ 0.5 Å. This threshold was applied to limit the chances that resolution effects would be the dominant factor in any statistical results. For instance, Schrauber et al.1 as well as Heringa and Argos (this issue10) have demonstrated that resolution has an influence on the rotamericity in protein tertiary structures. If a complexed (apo) structure had more than one apo (complexed) counterpart, then the pair showing the smallest resolution difference was selected. In a single case in which the resolution difference was identical, the paired structure with the smallest crystallographic R-factor was selected. This scenario resulted in a small set of 20 structural complex/apo pairs (Table II).

TABLE II. Ligands Involved in 20 Pairs of Complexes and Apo Enzymes†

No.  Complex  Apo    RMSd (Å)  Species and name
1    1adl     1lib   0.28      Mouse lipid-binding protein
2    1bp2     2bpp   0.57      Bovine phospholipase A2
3    1brnl    1brsa  0.45      Bacillus amyloliquefaciens barnase
4    1cbq     1cbs   0.45      Human retinoic acid-binding protein
5    1dmb     1omp   0.47      E. coli D-maltodextrin-binding protein
6    1gca     1gcg   0.32      Salmonella typhimurium galactose-binding protein
7    1geub    1gesb  0.23      E. coli glutathione reductase
8    1gmpa    1gmqa  0.18      Streptomyces aureofaciens ribonuclease
9    1gof     1gog   0.11      Dactylium dendroides galactose oxidase
10   1hyt     1lnfe  0.07      B. thermoproteolyticus thermolysin
11   1icm     1ifc   0.45      Rat intestinal fatty acid binding protein
12   1isca    1isaa  0.09      E. coli superoxide dismutase
13   1ndc     1npk   0.45      Dictyostelium discoideum nucleoside diphosphate kinase
14   1rar     1ras   0.19      Bovine ribonuclease A
15   4xis     1xis   0.06      S. rubiginosus xylose isomerase
16   1emd     2cmd   0.11      E. coli malate dehydrogenase
17   2ctc     2ctb   0.15      Bovine carboxypeptidase A
18   2gmt     1gmca  0.25      Bovine γ-chymotrypsin
19   2sim     2sil   0.14      Salmonella typhimurium sialidase
20   2wgca    9wgaa  0.20      Wheat germ isolectin

Entries of the columns "Ligand(s) in complex only" and "Ligand(s) in the complex and apo structures", in source order (row alignment not recoverable): 3 × oxygen; arachidonic acid; propanoic acid / 2 × 2-methyl-2,4-pentanediol / RNA / none / Phosphate / Retinoic acid / β-cyclodextrin / none / Galactose / Ca2+ / NAD / 2′-Guanylic acid / FAD / Sulfate / 2 × Acetate ion / Cu2+; Na+ counter ion / L-benzylsuccinate / 4 × Ca2+; Zn2+; Dimethyl sulfoxide / none / Myristate / Ca2+ / none / Azide / Thymidine-5′-diphosphate; Mg2+ / Fe2+ / none / 3 × Cl / Acetylaminoethyl naphthylamine sulfonate / 2 × Mn2+ / Citrate / Zn2+ / none / Xylose / NAD / L-phenyl lactate / N-acetyl-L-phenylalanyl-α-chloroethylketone / 2,3-dehydro-2-deoxy-N-acetyl neuraminic acid / N-acetyl-neuraminyl lactose / none / none

†No., the number of the holo/apo pair. Complex and Apo identify the Protein Data Bank four-letter codes for the structural pairs, with a fifth character denoting the chain identifier, and RMSd (Å) gives the root-mean-square deviations between the pairs obtained from structure Cα superpositioning by the method of Taylor and Orengo.20 The ligands designated under Ligand(s) in complex only were used to derive the statistics.

Side Chain Selection

Only buried core residues were selected, such that the solvent-accessible surface area (ASA) summed over all main- and side-chain atoms was ≤ 5 Å². This selection was made because side-chain dihedral angles display greater error when at or near the protein surface (see the accompanying article10). All ASA values were calculated by using the program DSSP.13 For structures that comprised more than one chain, as a further precaution, all other chains were removed before surface accessibilities were calculated, such that chain interface regions became exposed and, hence, were excluded from the analysis.

Contacts Among Side Chains

The protocol followed for side-chain contact delineation is described in the accompanying article.10 Side-chain pairs were considered to make contact if two atoms, one from each side chain, were at a distance ≤ 5 Å. The smallest cluster of nonrotameric residues consisted of two side chains in contact. Further cluster growth involved nonrotameric residues that contacted one or more cluster members.
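The contact and cluster-growth rules above can be read as single-linkage grouping on a contact graph. The following is a minimal sketch under that reading, not the authors' implementation: the input layout (a dict from a residue label to a list of side-chain atom coordinates), the function names, and the union-find bookkeeping are all assumptions introduced here for illustration. Only the 5 Å cutoff and the minimum cluster size of two come from the text.

```python
import math
from itertools import combinations

CONTACT_CUTOFF = 5.0  # Å, contact criterion stated in the text


def in_contact(atoms_a, atoms_b, cutoff=CONTACT_CUTOFF):
    """True if any atom of one side chain is within `cutoff` Å of any atom of the other."""
    return any(math.dist(a, b) <= cutoff for a in atoms_a for b in atoms_b)


def nonrotameric_clusters(side_chains, cutoff=CONTACT_CUTOFF):
    """Group side chains into clusters of two or more members linked by contacts.

    side_chains: hypothetical input, dict of residue label -> list of (x, y, z).
    Returns a sorted list of sorted residue-label lists.
    """
    parent = {res: res for res in side_chains}

    def find(res):
        # Follow parent links to the cluster representative (with path halving).
        while parent[res] != res:
            parent[res] = parent[parent[res]]
            res = parent[res]
        return res

    # Merge the clusters of every contacting pair.
    for a, b in combinations(side_chains, 2):
        if in_contact(side_chains[a], side_chains[b], cutoff):
            parent[find(a)] = find(b)

    groups = {}
    for res in side_chains:
        groups.setdefault(find(res), []).append(res)
    # Isolated side chains are not clusters: keep groups of size >= 2 only.
    return sorted(sorted(g) for g in groups.values() if len(g) >= 2)


# Toy coordinates (made up): A-B and B-C are in contact, A-C is not, D is isolated.
example = {
    "A": [(0.0, 0.0, 0.0)],
    "B": [(4.0, 0.0, 0.0)],
    "C": [(8.0, 0.0, 0.0)],
    "D": [(30.0, 0.0, 0.0)],
}
print(nonrotameric_clusters(example))  # [['A', 'B', 'C']]
```

In the toy example, C joins the cluster through B even though it does not touch A, which is what "contacted one or more cluster members" implies.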

General Side Chain/Ligand Interactions

For all bound ligands recorded for each of the 112 protein 3D structures listed in Table I, the distances to all buried rotameric and nonrotameric side chains were determined. Various shells were constructed around the ligands that had a depth of 9 Å, a distance corresponding to two carbon atom diameters and an additional 1 Å to consider any hydrogen atoms and experimental error. The side chains were assigned to these shells based on their side chain/ligand distances. For example, a side chain at 13.2 Å from a particular ligand would fall in the second 9-Å shell for that ligand. For all shell layers under a particular shell depth and over all of the ligands within the 112 protein structures listed in Table I, the rotameric and nonrotameric residues were sampled, and the nonrotameric ratio was calculated as

$\mathrm{Ratio}_{\mathrm{shell}} = N^{\mathrm{nonrot}}_{\mathrm{shell}} / N^{\mathrm{rot}}_{\mathrm{shell}}$,

where $N^{\mathrm{nonrot}}_{\mathrm{shell}}$ is the number of nonrotameric residues in the shell considered, and $N^{\mathrm{rot}}_{\mathrm{shell}}$ is the corresponding number of rotameric residues. The nonrotameric ratios observed in each of the shells ($\mathrm{Ratio}_{\mathrm{shell}}$) were converted to preferences ($\mathrm{Pref}_{\mathrm{shell}}$) through normalization with the nonrotameric ratio ($\mathrm{Ratio}_{\mathrm{prot}} = N^{\mathrm{nonrot}}_{\mathrm{prot}} / N^{\mathrm{rot}}_{\mathrm{prot}}$) for the associated complete structure:

$\mathrm{Pref}_{\mathrm{shell}} = \mathrm{Ratio}_{\mathrm{shell}} / \mathrm{Ratio}_{\mathrm{prot}}$.

A value > 1.0 denotes that nonrotameric residues are preferred in the considered shell, and a lower value indicates avoidance. In structures with more than one attached ligand, nonrotameric residues were assigned to their closest ligand.

Distance Between Side Chains and Ligands

Consistent with the definition for contacts among side chains, the default distance taken between a particular buried side chain (rotameric or nonrotameric) and ligand was the smallest of all interatomic distances among all side-chain and ligand atoms.
However, this definition might lead to a bias in the statistics gathered for the above ligand shells, in that the shells become "hairy," i.e., the shells might involve residues toward the outside of each shell that are orthogonal to the shell surface, whereas side chains inside the shells might be positioned more in parallel. To alleviate this bias, control statistics were gathered from side-chain/ligand distances defined for each side chain and associated ligand as the distance between the side-chain centroid and the closest ligand atom.

TABLE III. Nonrotameric Preferences in 9-Å Shells Around Ligands Averaged per Protein†

Shell (Å)    Average preference   PrfNonrot ≤ 1 (%)   PrfNonrot > 1 (%)   PrfNonrot n.a.
0.0–9.0      1.51 ± 0.16          35 (39.3)           54 (60.7)           23
9.0–18.0     0.88 ± 0.09          55 (61.1)           35 (38.9)           22
18.0–27.0    0.83 ± 0.15          38 (51.4)           36 (48.7)           38
27.0–36.0    0.64 ± 0.36          35 (81.4)            8 (18.6)           69

†PrfNonrot ≤ 1 gives the number of proteins for which the preferences were smaller than or equal to 1.0, and PrfNonrot > 1 provides the number of proteins with shell preferences greater than 1.0. PrfNonrot n.a. gives the numbers of proteins for which no shell preference could be calculated due to the absence of rotameric residues.

Cluster-Ligand Interaction

To determine whether nonrotameric clusters show a preference to be in the vicinity of ligands, the distance ($D_{\mathrm{nonrot}}$) between each cluster and its nearest ligand was taken as the smallest separation between any ligand atom and any side-chain atom of the cluster. We also determined such distances for all rotameric clusters with a number of residues identical to the number in the compared nonrotameric cluster and with the same clustering criterion. For all such rotameric control clusters, we calculated the average distance ($D_{\mathrm{rot}}$) to the ligand.
The preference (Prf) for a single nonrotameric cluster to be in the proximity of a ligand was defined as $\mathrm{Prf} = D_{\mathrm{rot}} / D_{\mathrm{nonrot}}$, such that values > 1.0 denote a preference for the nonrotameric cluster to be in the proximity of a ligand. The average preference value was determined for various sized groups (two to four residues) as well as over all nonrotameric clusters. As for the ligand shell statistics of individual nonrotameric side chains (see above), both a default and a control distance criterion were used. The default criterion for cluster-ligand distance involved the smallest separation between any considered cluster and ligand atom. For a control distance between a side-chain cluster and associated ligand, the shortest distance between any cluster side-chain centroid and any ligand atom was taken.

Comparing Nonrotameric States in Complexed and Apo Structures

The 20 analyzed pairs of proteins with known complexed and corresponding apo structures comprise enzymes complexed with natural ligands or analogues as well as binding proteins (Table II). For each of the 20 pairs, the holo and apo enzymes have identical topologies. Similar to the general analysis, only residues with ASA ≤ 5 Å² were analyzed. Solvent accessibilities were calculated only in the absence of ligands, for consistency and to avoid crystallographic error. The complex/apo pairs were scrutinized for different side-chain conformations (rotameric or nonrotameric) from the unbound state to the bound state with one or more ligands. The side-chain rotameric behavior was characterized by three states: 1) side chains that are nonrotameric in the complexed structure and rotameric in the apo form (complex only); 2) side chains that are nonrotameric both in the complex and the apo enzyme (complex and apo); and 3) those that are rotameric in the complex and become nonrotameric in the apo enzyme (apo only).
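The three states can be expressed as set operations on the buried residues found nonrotameric in each member of a pair. This is only an illustrative sketch: the function names and residue labels are hypothetical, and it assumes that residue numbering is shared between the complex and apo structures, which the sequentially identical pairs make plausible.

```python
def classify_states(complex_nonrot, apo_nonrot):
    """Split nonrotameric residues of a complex/apo pair into the three states.

    complex_nonrot, apo_nonrot: sets of residue labels that are buried and
    nonrotameric in the complexed and in the apo structure, respectively.
    Returns (complex_only, complex_and_apo, apo_only).
    """
    complex_only = complex_nonrot - apo_nonrot      # state 1
    complex_and_apo = complex_nonrot & apo_nonrot   # state 2
    apo_only = apo_nonrot - complex_nonrot          # state 3
    return complex_only, complex_and_apo, apo_only


def net_nonrot_change(complex_nonrot, apo_nonrot):
    """Net gain of nonrotameric side chains on binding: state 1 count minus state 3 count."""
    complex_only, _, apo_only = classify_states(complex_nonrot, apo_nonrot)
    return len(complex_only) - len(apo_only)


# Made-up labels, for illustration only.
in_complex = {"L45", "D102", "N77"}
in_apo = {"D102", "F10"}
print(net_nonrot_change(in_complex, in_apo))  # 1
```

A positive result means the complex holds more nonrotameric side chains than the apo form for that pair.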
The net increase (or decrease) of nonrotameric side chains could then be determined for various conditions; for example, the count for state 1 minus that for state 3 would give the net nonrotameric movement upon ligand binding.

Hydrogen Bonds and Salt Bridges in Complexed and Apo Structures

For each of the 20 pairs of structures, protein internal hydrogen bonds and salt bridges were calculated by using the molecular mechanics computer package ICM.14 Hydrogen bonds were declared whenever the distance between a hydrogen and a proton acceptor was ≤ 2.5 Å.15 Only hydrogen bonds were sampled between a proton donor or acceptor of a side chain considered and, respectively, acceptors or donors from any other side-chain or main-chain atoms. The criterion for declaring a salt bridge was a distance ≤ 3.0 Å between charge complementary atoms, a distance geared to sample primary salt bridges.16 H-bond and salt bridge formation could neutralize increased energy involved in residues changing from rotameric to nonrotameric. For a control, pairwise differences in the number of hydrogen bonds and salt bridges were calculated for side-chain atoms in each of the three classes of nonrotamericity described above (i.e., complex only, complex and apo, and apo only). Although the actual bond energies may vary with the local geometries, the difference-counting scheme is feasible in our pairwise comparison of buried and sequentially identical protein environments.

RESULTS

Nonrotameric Preferences to Be Near Ligands

Tables III and IV show results for four ligand shells, each with a depth of 9 Å. The shell nearest the ligands has a clear preference (1.51) for nonrotameric residues, whereas the preferences for outer ligand shells are consistently < 1, indicating avoidance (Table III). Moreover, the percentage of proteins that prefer nonrotameric residues drops from 61% for the nearest to 19% for the most remote ligand shell.
Table IV gives statistics for residue numbers summed over all proteins for each of the ligand shells: Relative to a total number of 4,103 rotameric and 389 nonrotameric side chains over all protein ligand shells, there is a clear nonrotameric preference of 1.28 only for the shell closest to the ligand.

TABLE IV. Nonrotameric Preferences in 9-Å Shells Around Ligands for Summed Residues Over All Ligands†

Shell (Å)    Preference     No. of rotameric side chains   No. of nonrotameric side chains
0.0–9.0      1.28 ± 0.08    1081                           131
9.0–18.0     0.91 ± 0.07    2293                           198
18.0–27.0    0.87 ± 0.13     633                            52
27.0–36.0    0.90 ± 0.34      94                             8

†Estimates for standard deviations of preferences have been compiled as described in the accompanying article (Heringa and Argos10).

The control preferences derived by using side-chain centroid distances (see Materials and Methods) and averaged per protein showed a similar pattern (compared to that in Table III) with values of 1.54 ± 0.23, 0.97 ± 0.06, 0.96 ± 0.19, and 0.81 ± 1.00, respectively, for the four 9-Å shells going outward from the ligand. Also, the control preferences using side-chain centroids compiled from summing each shell over all ligand proteins showed preferences that were virtually identical to those for the default distance criterion (Table IV), with values of 1.29 ± 0.09, 0.93 ± 0.07, 0.89 ± 0.12, and 0.99 ± 0.30 for the respective shells going outward from the ligand. It might be argued that the ligand shell definition would lead to selection of side chains positioned more toward the protein surface for the closest ligand shell than for those at greater distances from the ligand, which might explain the observed rotameric bias. On the other hand, it must be stressed that all ligand shells can include side chains near the surface as well as those buried more deeply in the protein structure.
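As a check on the normalization defined in Materials and Methods, the summed-residue preferences of Table IV can be recomputed from the reported counts. The sketch below is illustrative only. The function names are introduced here. The pooled totals (4,103 rotameric and 389 nonrotameric side chains) are taken from the text and stand in for the per-structure ratio, and the handling of a distance lying exactly on a shell boundary is an assumption.

```python
SHELL_DEPTH = 9.0  # Å, shell depth used in the text


def shell_index(distance, depth=SHELL_DEPTH):
    """0-based shell number for a side chain/ligand distance in Å.

    Assumption: a distance exactly on a boundary (e.g., 9.0 Å) goes to the outer shell.
    """
    return int(distance // depth)


def preference(nonrot_shell, rot_shell, nonrot_total, rot_total):
    """Pref_shell = Ratio_shell / Ratio_prot, each ratio being nonrotameric/rotameric."""
    ratio_shell = nonrot_shell / rot_shell
    ratio_prot = nonrot_total / rot_total
    return ratio_shell / ratio_prot


# Counts per shell from Table IV (0-9, 9-18, 18-27, 27-36 Å).
ROT_PER_SHELL = [1081, 2293, 633, 94]
NONROT_PER_SHELL = [131, 198, 52, 8]
# Pooled totals as quoted in the text.
ROT_TOTAL, NONROT_TOTAL = 4103, 389

prefs = [
    preference(n, r, NONROT_TOTAL, ROT_TOTAL)
    for n, r in zip(NONROT_PER_SHELL, ROT_PER_SHELL)
]
print(shell_index(13.2))             # 1, i.e., the second 9-Å shell
print([round(p, 2) for p in prefs])  # [1.28, 0.91, 0.87, 0.9]
```

The recomputed values agree with the preference column of Table IV to two decimals.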
For a control for this deepness effect, we grouped the side chains over all proteins in six accessibility classes (ASA = 0 Å², 1 Å², 2 Å², 3 Å², 4 Å², and 5 Å²) and found similar rotamericity values of 92.17%, 91.95%, 91.00%, 90.61%, 90.10%, and 91.16%, respectively. Clearly, the small differences between these rotamericities are not sufficient to explain the observed nonrotameric preference for the closest ligand shell.

Because the nonrotameric cluster statistics were sparse, we could not calculate similar ligand shell preferences. However, average distances of nonrotameric clusters from ligands could be compared with those for rotameric control clusters comprising identical numbers of side chains (see Materials and Methods). Nonrotameric clusters are 1.7-fold closer to ligands than rotameric control clusters, i.e., on average, they are at only 0.58 of the control cluster distance from the ligands (Table V). The control cluster-ligand proximity preferences based on side-chain centroid-ligand distances showed a similar trend, with a value of 1.6. We conclude that there is a clear overall tendency for nonrotameric residues and those in clusters to be associated with protein ligands.

TABLE V. Preferences of Nonrotameric Clusters to be Close to Ligands

No. of residues   No. of     Average hairy preference compared    Average nonhairy preference compared
in cluster        clusters   to mean control cluster distance(a)  to mean control cluster distance(b)
2                 37         1.80 ± 0.19                          1.62 ± 0.15
3                 10         1.43 ± 0.21                          1.41 ± 0.17
4                  6         1.79 ± 0.35                          1.72 ± 0.34
Total             53         1.73 ± 0.21                          1.60 ± 0.17

(a) Cluster-ligand distance was the separation between the closest cluster and associated ligand atom.
(b) For cluster-ligand distance, the separation between the closest cluster side-chain centroid and associated ligand atom was taken.

Nonrotameric Residue Composition Around Ligands

We sampled compositional data over 17 residue types for the two 9-Å shells closest to the ligands, as shown in Table VI.
Although the statistics were scant for the first two 9-Å shells and were dramatically sparse for the third and fourth 9-Å ligand-distance layers, the first two layers tend to prefer nonoverlapping sets of amino acids compared with the general nonrotameric composition (Prefnonrot). Nonrotameric side-chain types that were preferred by ≥ 20% (Prefnonrot) in the first layer (closest to the ligand) and were avoided in the second layer included Asp, Glu, His, Met, and Asn, which are mostly polar and negatively charged. Although Lys and Gln were preferred in the first layer, their low abundance and high preference error values render the result insignificant. In the second layer, hydrophobic residue types Cys, Ile, Leu, and Val were preferred along with Arg, Ser, and Thr (albeit the preference error values are relatively high for the latter types), clearly distinguishing the nonrotameric residue types closest to the ligand. Indeed, the correlation coefficient between the Prefnonrot values for the first two layers was −0.93. The shell compositions compared with the general rotameric distribution (Prefrot) showed that the nonrotameric residue types that were highly preferred in general (see the accompanying article10; Table II) remained favored in the two ligand shells, albeit far less dramatically. It should be stressed that all sampled residues were buried almost completely with ASA < 5 Å², such that any polar preference observed would not be induced by the presence of solvent.

Secondary Structure Preferences Around Ligands

Table VII shows statistics similar to those shown in Table VI for the first two 9-Å layers around ligands regarding preferences for secondary structure types and regions therein (N- and C-terminal; middle) relative to the general rotameric and nonrotameric statistics. Clearly, nonrotameric residues in the shell closest to ligands prefer N-terminal helix, turn, and coil regions.
The observations for helical N-terminal fragments are consistent with the high preferences for Asp and Glu within the first ligand shell; these amino acid types are observed frequently also in N-terminal helical turns,17 possibly due to the helical dipole effect18 or capping box structure.19 Residues in the second ligand shell avoid N-terminal helix termini (Prefnonrot) and prefer helical C-terminal fragments, in accordance with the observed relative avoidance of negatively charged residues (Table VI). Relative to the first ligand shell, N-terminal and middle sheet segments become favored. Turn and coil segments in the second ligand shell are avoided, although these regions suffer from critically low sampling frequencies, resulting in high error estimates.

TABLE VI. Amino Acid Composition of Nonrotameric Residues in Sampled Distance Shells Around Ligands†

Amino   First ligand shell (0.0–9.0 Å)               Second ligand shell (9.0–18.0 Å)
acid    No.   %      Prefrot       Prefnonrot        No.   %      Prefrot       Prefnonrot
C        3    2.3    0.53 ± 0.30   0.74 ± 0.42        8    4.0    0.94 ± 0.32   1.31 ± 0.45
D       19   14.5    7.08 ± 1.50   1.25 ± 0.26       15    7.6    3.70 ± 0.92   0.65 ± 0.16
E        6    4.6    2.51 ± 1.00   2.23 ± 0.89        1    0.5    0.28 ± 0.28   0.25 ± 0.25
F       22   16.8    2.10 ± 0.41   1.11 ± 0.22       28   14.1    1.77 ± 0.31   0.93 ± 0.16
H        9    6.9    4.78 ± 1.54   1.48 ± 0.48        7    3.5    2.46 ± 0.91   0.76 ± 0.28
I        7    5.3    0.38 ± 0.14   0.58 ± 0.21       23   11.6    0.83 ± 0.16   1.26 ± 0.25
K        1    0.8    2.24 ± 2.21   2.97 ± 2.96        0    0.00   0.00 ± 0.00   0.00 ± 0.00
L        9    6.9    0.36 ± 0.12   0.56 ± 0.18       33   16.7    0.88 ± 0.14   1.35 ± 0.22
M       12    9.2    2.01 ± 0.55   1.55 ± 0.43       10    5.1    1.11 ± 0.34   0.85 ± 0.26
N       23   17.6    9.12 ± 1.73   1.55 ± 0.29       17    8.6    4.46 ± 1.03   0.76 ± 0.18
Q        1    0.8    0.46 ± 0.46   1.48 ± 1.48        1    0.5    0.30 ± 0.30   0.98 ± 0.98
R        1    0.8    0.98 ± 0.98   0.59 ± 0.59        3    1.5    1.94 ± 1.11   1.18 ± 0.68
S        4    3.1    0.36 ± 0.18   0.57 ± 0.28       13    6.6    0.78 ± 0.21   1.22 ± 0.33
T        1    0.8    0.10 ± 0.10   0.27 ± 0.27        8    4.0    0.53 ± 0.18   1.43 ± 0.50
V        3    2.3    0.12 ± 0.07   0.52 ± 0.30       13    6.6    0.35 ± 0.10   1.50 ± 0.40
W        3    2.3    1.03 ± 0.59   0.47 ± 0.27       10    5.1    2.28 ± 0.70   1.03 ± 0.32
Y        7    5.3    1.60 ± 0.59   1.04 ± 0.38        8    4.0    1.21 ± 0.42   0.79 ± 0.28

†First ligand shell and Second ligand shell give the separation ranges in Å between ligand and side-chain atoms. No. denotes numbers of amino acids, % gives the percentages. Prefrot and Prefnonrot specify the preferences compared with the general rotameric and nonrotameric compositions, respectively, over all ligand proteins. The standard errors are estimated as described in the accompanying article.10

Changes in Side-Chain Structural States for Apo and Complexed Structures

The 20 pairs of proteins with known complexed and corresponding apo structures (Table II) were selected in the absence of any functional constraints (see Materials and Methods). They include various binding proteins as well as enzymes complexed with natural ligands or analogues. The structures within each pair show identical topologies, with low root mean square deviation (RMSd) values after superpositioning20 of all Cα atoms (Table II). The buried sample of residues within the pairs (ASA ≤ 5 Å²) would lead to a further decrease in the RMSd values. Statistics on conformational changes of rotameric and nonrotameric side chains as structures go from the unbound state to the bound state are given in Table VIII. Differences in crystallographic resolutions, limited to values < 0.50 Å to assure comparability (see Materials and Methods), also are listed. The statistics are reasonably balanced, because, out of 20 complex/apo pairs, 7 pairs show that the complexed structure is resolved better, whereas 8 pairs favor the apo structure, and 5 pairs display no difference. Tables IX and X summarize the data from Table VIII.
When all side chains in the entire protein structure are considered, 11 of 20 structural pairs show a net increase in nonrotameric residues for the complex, 4 structural pairs display more nonrotamers for the apo structures, and 5 structural pairs were indifferent, such that 73% of the structure pairs that displayed a change in strain upon ligand binding opted for more nonrotameric residues in the complex state. Furthermore, if we considered only the 7 structural pairs with a better resolved complex (negative "ΔR" values in Tables IX and X), such that the complexes contained smaller fractions of nonrotameric residues than the associated apo structures, then 3 pairs showed increased nonrotamericity for the complex, only 1 pair showed this for the apo form, and 3 pairs were neutral. It is clear that the results cannot be attributed to resolution differences. However, as expected from the small sample of apo-complex pairs, the standard deviations were relatively high for the average difference between side chains nonrotameric in the complex only and those nonrotameric in the apo structure (column Δnonrot in Table VIII), with values of 0.90 ± 2.25 over the entire structures and 0.25 ± 1.09 for side chains within 9-Å distance from their associated ligands.

Table XI shows the nonrotameric residues occurring in the three states (complex only, complex and apo, apo only) over the 20 complex/apo pairs, and Table XII lists the corresponding counts and preferences for the grouped and individual residue types involved. Although the counts are critically small, interesting trends are suggested. Nonrotameric states for the large aromatic side chains are preferred in the apo structures, whereas the negatively charged and polar as well as smaller or branched hydrophobic residues would favor nonrotameric states in complexed structures.
TABLE VII. Preferences for the Presence of Nonrotameric Residues in 9-Å Ligand Shells†

Shell (Å)     Secondary structure   Statistic    N-terminal     Middle         C-terminal
0.00–9.00     H                     N            16             17             2
0.00–9.00     H                     %            17.98          19.10          2.25
0.00–9.00     H                     Prefrot      2.74 ± 0.62    0.69 ± 0.15    0.30 ± 0.21
0.00–9.00     H                     Prefnonrot   1.62 ± 0.37    1.01 ± 0.23    0.36 ± 0.25
0.00–9.00     E                     N            11             10             11
0.00–9.00     E                     %            12.36          11.24          12.36
0.00–9.00     E                     Prefrot      0.88 ± 0.25    0.57 ± 0.17    0.80 ± 0.23
0.00–9.00     E                     Prefnonrot   0.71 ± 0.20    0.53 ± 0.16    0.87 ± 0.25
0.00–9.00     T                     N            4              3              5
0.00–9.00     T                     %            4.49           3.37           5.62
0.00–9.00     T                     Prefrot      2.63 ± 1.29    3.46 ± 1.96    4.10 ± 1.78
0.00–9.00     T                     Prefnonrot   2.17 ± 1.06    2.44 ± 1.38    2.32 ± 1.01
0.00–9.00     C                     N            6              2              2
0.00–9.00     C                     %            6.74           2.25           2.25
0.00–9.00     C                     Prefrot      3.95 ± 1.56    1.64 ± 1.15    1.27 ± 0.89
0.00–9.00     C                     Prefnonrot   1.95 ± 0.77    1.30 ± 0.91    2.17 ± 1.51
9.00–18.00    H                     N            15             30             14
9.00–18.00    H                     %            9.43           18.87          8.81
9.00–18.00    H                     Prefrot      1.44 ± 0.35    0.68 ± 0.11    1.17 ± 0.30
9.00–18.00    H                     Prefnonrot   0.85 ± 0.21    1.05 ± 0.17    1.41 ± 0.36
9.00–18.00    E                     N            31             41             21
9.00–18.00    E                     %            19.50          25.79          13.21
9.00–18.00    E                     Prefrot      1.39 ± 0.23    1.31 ± 0.18    0.86 ± 0.17
9.00–18.00    E                     Prefnonrot   1.13 ± 0.18    1.22 ± 0.16    0.93 ± 0.19
9.00–18.00    T                     N            1              0              1
9.00–18.00    T                     %            0.63           0.00           0.63
9.00–18.00    T                     Prefrot      0.37 ± 0.37    0.00 ± 0.00    0.46 ± 0.46
9.00–18.00    T                     Prefnonrot   0.30 ± 0.30    0.00 ± 0.00    0.26 ± 0.26
9.00–18.00    C                     N            2              2              1
9.00–18.00    C                     %            1.26           1.26           0.63
9.00–18.00    C                     Prefrot      0.74 ± 0.52    0.92 ± 0.64    0.36 ± 0.36
9.00–18.00    C                     Prefnonrot   0.36 ± 0.26    0.73 ± 0.51    0.61 ± 0.60

†Shell refers to the distance of the nonrotameric residues from the ligand. Secondary structure gives secondary structural states of the nonrotameric residues as determined by using the DSSP algorithm.13 The four secondary structural symbols shown are obtained from DSSP symbols and classified as follows: H (helix) for G, H, and I; E (strand) for B and E; T (turn) for S and T; and C (coil) for blank. Secondary structural elements sampled had minimum lengths of 6, 4, 2, and 6 for helix (H), strand (E), turn (T), and coil (C), respectively, whereas the lengths of the N-terminal and C-terminal segments chosen, respectively, were 3, 2, 1, and 3. Lengths of middle segments were dependent on the actual length of the secondary structure considered. Statistics are shown as symbols designating the following: N, number of side chains in the secondary structural element considered; %, percentages of all shell residues found in the given secondary structural region; Prefrot, preference of rotameric side chains compared with the rotameric residue distribution; Prefnonrot, nonrotameric preference compared with the nonrotameric residue distribution. Estimates for preference standard errors are compiled as described in the accompanying article.10

The cross-composition preferences show that the Asp, Glu, Asn, Gln group, although it is preferred in the complex-only state relative to the apo-only state, is most likely to be nonrotameric in complex and apo structures (Table XIIa). The small/branched hydrophobic amino acid types, indeed, show the highest preference for the complex-only state. The latter statistics are especially salient for Leu, with 11 complex-only counts versus 2 apo-only counts, such that Leu shows the most pronounced tendency to become strained upon ligand binding (Table XIIb). The Leu complex-only state preference relative to the apo-only state has a value of 2.90 ± 0.74 (Table XIIb). Although some trends for other individual residue types also are suggested, due to the sparse data (particularly the paucity in the distribution to which a considered distribution is compared), the error numbers10 likely are underestimated, so that only the preference for Leu residues might be considered significant.

To quantify further the increased nonrotamericity in complexed structures, we checked how much the side-chain χ1,χ2 angles moved upon binding the ligands.
For each examined side chain, the average difference in (Δχ1 + Δχ2) was taken; i.e., Δ(Δχ1 + Δχ2), where Δχ1,2 denotes the angular χ1,χ2 distance from the nearest associated rotamer, and the outer Δ corresponds to differences between (Δχ1 + Δχ2) values from equivalent side chains in the complexed and apo structure. Table XIII lists the angular displacements obtained in this manner averaged for the three nonrotameric residue states (complex only, complex and apo, and apo only) over the 20 complex/apo pairs together with associated standard deviations. Thirty-eight residues, nonrotameric in the complexed structure, move on average an extra 16.6° from rotameric residues relative to equivalent residues in the apo structure, whereas only 20 nonrotameric apo residues move by a smaller 11.2° from rotameric residues relative to corresponding residues in the complex. When residues are nonrotameric in both the complexed structure and the apo structure, the difference is small, but positive. Clearly, there is increased strain in the ligand-bound structures, albeit the error numbers indicate fairly wide distributions. The results are consistent when only nonrotameric residues within 9 Å of the ligand are considered (Table XIII). In fact, the relative build-up of strain in residues that become stressed (complex only), compared with the relaxation of those becoming rotameric in the complex (apo only), increases from 5.4° to 6.8° for side chains closest to the ligands (<9 Å).

Hydrogen Bonding and Charge Complementarity Involving Nonrotameric Residues

Can the increased strain be mitigated by increased hydrogen bonding and/or salt-bridge formation involving the residues that have become nonrotameric upon ligand binding? Differences in hydrogen bond formation over the relevant residues were determined for the 20 complex/apo pairs.
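Returning briefly to the angular-displacement measure Δ(Δχ1 + Δχ2) defined above, the calculation can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the rotamer list `HYPOTHETICAL_ROTAMERS` is made up for the example, the nearest rotamer is assumed to be the one with the smallest summed absolute χ deviation (the actual rotamer definitions are given in the accompanying article), and all function names are hypothetical.

```python
def wrap(angle):
    """Map an angle difference in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def rotamer_offset(chis, rotamers):
    """Summed absolute chi deviation from the nearest rotamer.

    ASSUMPTION: "nearest" means the rotamer minimising the summed deviation.
    Passing one-element tuples restricts the measure to chi1, as done in
    Table XIII for Cys, Ser, Thr, and Val.
    """
    return min(
        sum(abs(wrap(chi - ref)) for chi, ref in zip(chis, rotamer))
        for rotamer in rotamers
    )


def mean_delta(pairs, rotamers):
    """Mean of (offset in complex) minus (offset in apo) over side chains.

    `pairs` holds (complex_chis, apo_chis) tuples for equivalent residues.
    Positive values indicate more strain in the complexed structure.
    """
    diffs = [
        rotamer_offset(complexed, rotamers) - rotamer_offset(apo, rotamers)
        for complexed, apo in pairs
    ]
    return sum(diffs) / len(diffs)


# Hypothetical (chi1, chi2) rotamers in degrees, for illustration only.
HYPOTHETICAL_ROTAMERS = [(-60.0, 180.0), (180.0, 60.0), (60.0, 180.0)]

if __name__ == "__main__":
    # One made-up side chain: 35 degrees off in the complex, 10 degrees in the apo form.
    example = [((-85.0, 170.0), (-65.0, 175.0))]
    print(mean_delta(example, HYPOTHETICAL_ROTAMERS))
```

The wrap-around step matters because dihedral angles are periodic: a χ value of 175° lies only 10° from a rotamer at -175°, not 350°.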
Table XIV shows that, for side-chain atoms of nonrotameric residues only in the complexes, the net number of hydrogen bonds over 38 side chains is increased by only 4 relative to the apo counterparts. Similarly, the apo-only, nonrotameric side chains increase their net hydrogen bonds by 2, which would lower further their internal energy relative to the complexes. From Table XIV it can be concluded that some residues sacrifice their rotameric states for an extra hydrogen bond, but this occurs only in about 10% of all cases. Moreover, we found no correlation between the (Δχ1 + Δχ2) differences and observed hydrogen bonding (complex-only state, 0.37; complex and apo state, −0.26; apo-only state, −0.12). Thus, side-chain dihedral displacement cannot be explained by increased hydrogen bonding. Table XIV also shows similar statistics for salt-bridge formation, and, again, the observed increase in nonrotamericity for ligand-bound structures appears to be little influenced or mitigated by increased charge complementarity within the complexes.

TABLE VIII. Side-Chain Conformational Changes in Selected Complex/Apo Structure Pairs†

                                             Δnonrot            Entire structure   <9 Å
Pair  Compl   Apo     RC     RA     ΔR       Entire   <9 Å      C    C+A   A       C    C+A   A
1     1adl    1lib    1.60   1.70   −0.10    0        0         0    0     0       0    0     0
2     1bp2    2bpp    1.70   1.80   −0.10    +2       0         2    1     0       0    0     0
3     1brnl   1brsa   1.76   2.00   −0.24    +3       +2        3    0     0       2    0     0
4     1cbq    1cbs    1.80   2.20   0.40     +1       0         1    0     0       0    0     0
5     1dmb    1omp    1.80   1.80   0.00     +4       0         5    4     1       0    1     0
6     1gca    1gcg    1.70   1.90   −0.20    +4       +2        4    3     0       2    2     0
7     1geub   1gesb   2.20   1.74   0.46     +2       −1        4    4     2       0    0     1
8     1gmpa   1gmqa   1.70   1.80   −0.10    0        0         0    0     0       0    0     0
9     1gof    1gog    1.70   1.90   −0.20    −6       −2        1    10    7       0    2     2
10    1hyt    1lnfe   1.70   1.70   0.00     +2       +1        2    1     0       1    0     0
11    1icm    1ifc    1.50   1.19   0.31     +2       +2        2    0     0       2    0     0
12    1isca   1isaa   1.80   1.80   0.00     −1       0         0    1     1       0    0     0
13    1ndc    1npk    2.00   1.80   0.20     −1       0         0    0     1       0    0     0
14    1rar    1ras    1.90   1.70   0.20     0        −1        1    0     1       0    0     1
15    4xis    1xis    1.60   1.60   0.00     −1       −1        0    3     1       0    1     1
16    1emd    2cmd    1.90   1.87   0.03     +2       0         3    1     1       1    0     1
17    2ctc    2ctb    1.40   1.50   −0.10    0        0         0    5     0       0    0     0
18    2gmt    1gmca   1.80   2.20   0.40     +4       +2        6    1     2       3    0     1
19    2sim    2sil    1.60   1.60   0.00     0        +1        3    6     3       1    2     0
20    2wgca   9wgaa   2.20   1.80   0.40     +1       0         1    1     0       0    0     0

†Structure pair, Compl, and Apo designate the structural pairs as given in Table II. RC relates the crystallographic resolutions of the complexed structures, and RA shows those for the apo enzymes. ΔR provides the difference of columns RC and RA; a negative (positive) number implies a better resolved complex (apo) structure. Δnonrot designates the net difference in number of nonrotameric residues for the complex form (C − A). Entire structure denotes this value over all sampled residues, and <9 Å gives the net difference only for residues at a distance smaller than 9 Å from their associated ligands. Entire structure and <9 Å list the numbers of nonrotameric residues for three states: C, nonrotameric in the complex and rotameric in the apo form; C + A, nonrotameric in both complex and apo forms; A, rotameric in the complex and nonrotameric in the apo forms.

TABLE IX. Statistics Over Side-Chain Conformational Changes in Complex/Apo Structural Pairs (Statistics Over Proteins)†

Type   No.   %    % nonzero   Number ratio (+/−)
+      11    55   73          2.75
−      4     20   27          -
0      5     25   -           -

†+ indicates data for proteins with increased nonrotamericity in the complex structure, − indicates those with fewer nonrotameric residues in the complex, and 0 indicates proteins displaying no change in nonrotamericity between the complex and apo structures.

TABLE X. Statistics Over Side-Chain Conformational Changes in Complex/Apo Structural Pairs (Statistics Over Number of Side Chains)†

Type   No. of side chains   Average per structure   Number ratio (+/−)
+      27                   2.45                    3.0
−      9                    2.25                    -

†+ denotes residues that switch to nonrotameric from rotameric in the ligand-bound structure, whereas − side chains are nonrotameric only in the apo form.

Apo/Complexed Examples

Three examples are included to illustrate the positioning of strained side chains in ligand-bound complexes: Salmonella galactose-binding protein (1gca; pair 6 in Table VIII), bovine γ-chymotrypsin (2gmt; pair 18), and Bacillus amyloliquefaciens barnase (1brnl; pair 3). These structures illustrate how increased internal energy is stored around ligands.

The complexed galactose-binding protein (1gca; Fig. 1) provides a good example of a strained, four-side-chain cluster that is completely within a 9-Å-radius shell of the galactose ligand. The cluster contains Tyr10 and Met17, strained in both the bound form and the unbound form, whereas Asn66 and Lys92 become nonrotameric upon ligand binding. The Asn and Met side chains, respectively, show an extra 10° and 14° deviation from the rotameric states relative to their dihedral angles in the apo structure. The cluster forms a supporting layer behind the ligand. Two more side chains become strained in the complex: Gln142 binds to the Ca2+ ligand and deviates from the rotameric by an extra 19° upon ligand binding, and the distant Leu145 strains only by a further 2.4°.

Figure 2 shows a four cluster of side chains in the first 9-Å layer around the inhibitor N-acetyl-L-phenylalanyl-α-chloroethylketone attached to γ-chymotrypsin. Asp194, Val227, and Tyr228 become nonrotameric in the complex, whereas Ser214 becomes strained with the ligand removed. Asp194 strains by an extra 2.6° upon ligand binding, whereas the Val227 χ1 angle is pushed 17° away from its nearest associated rotamer, and the Tyr228 χ1 and χ2 angles are displaced by 22° in total. Ser214 relaxes by 6° in the apo structure.

Figure 3 shows the three side chains that become nonrotameric in a barnase (1brn chain L) enzyme/RNA complex. Asp75 and Leu89 are within the first 9-Å shell around the bound, four-nucleotide fragment, whereas Ile96 is in the second shell. Although the three side chains are not in contact, they are distributed about the ligand pocket. Furthermore, Asp75, Leu89, and Ile96 show substantial and extra (χ1 + χ2) dihedral deviations from rotameric of 10°, 20°, and 50.1°, respectively, apparently induced by binding the nucleotide fragment.

TABLE XI. Residue Types Occurring in Various Nonrotameric States in 20 Complex/Apo Pairs

Pair  Compl   Apo     Nonrotameric in complex only            Nonrotameric in complex and apo                                   Nonrotameric in apo only
1     1adl    1lib    -                                       -                                                                 -
2     1bp2    2bpp    N71, I95                                D42                                                               -
3     1brnl   1brsa   D75, L89, I96                           -                                                                 -
4     1cbq    1cbs    C130                                    -                                                                 -
5     1dmb    1omp    L43, I60, S114, L115, E308              I59, L89, D136, F258                                              V183
6     1gca    1gcg    N66, K92, Q142, L145                    Y10, M17                                                          -
7     1geub   1gesb   L32, M53, V125, T383                    Y7, I20, D207, F446                                               L257, S345
8     1gmpa   1gmqa   -                                       -                                                                 -
9     1gof    1gog    L252                                    W39, L68, F116, I165, H334, N355, Y405, R439, N509, D586          Y55, S139, L184, C228, W340, F399, I574
10    1hyt    1lnfe   I20, E166                               Y76                                                               -
11    1icm    1ifc    L113, R126                              -                                                                 -
12    1isca   1isaa   -                                       F106                                                              I23
13    1ndc    1npk    -                                       -                                                                 M80
14    1rar    1ras    C72                                     -                                                                 N44
15    4xis    1xis    -                                       F11, F99, N215                                                    F288
16    1emd    2cmd    L6, L32, I151                           V171                                                              I116
17    2ctc    2ctb    -                                       V33, I62, D65, L66, N146                                          -
18    2gmt    1gmca   L105, C136, D194, C201, V227, Y228      L199                                                              V52, S214
19    2sim    2sil    V280, L319, V358                        I38, V116, Q194, N232, T256, V359                                 H57, D67, M269
20    2wgca   9wgaa   S62                                     S148                                                              -

Notes: Column "Complex only" gives residues that are nonrotameric in the complex and rotameric in the apo form, "Complex and apo" provides those nonrotameric both in complex and apo forms, and "Apo only" lists residues rotameric in the complex form and nonrotameric in the apo form.

DISCUSSION

We have shown that nonrotameric side chains (and especially clusters of interacting nonrotameric residues) prefer to be in the proximity of ligand- and substrate-binding sites. Within the first 9-Å shell around the ligands, the favored nonrotameric residue types, compared with the general nonrotameric composition, are Asp, Glu, His, Met, and Asn.
In the second 9-Å layer, hydrophobic residue types Cys, Ile, Leu, and Val are preferred along with Arg, Ser, and Thr, in contrast with the nonrotameric residue types closest to the ligand. The strain in the vicinity of ligands apparently can be accommodated better if the residues are in turn, coil, or N-terminal helical regions.

It is known that ligand-binding sites show shape complementarity and are composed of residue types that allow tight interaction with a substrate or other ligands, often involving hydrophobic surface atoms. Induction of strain through increased internal energy by nonrotameric side chains is suggested here to aid catalysis and the release of bound products. For example, enzymatic activity would be enhanced when the dihedral strain is released upon reaching the transition state. Under this scenario, the nonrotameric residues containing energy might be viewed as "rechargeable batteries" that feed and stimulate reactivity. Such a mechanism would give nature an extra device for fine tuning enzyme catalysis during evolution, particularly enzymes in which favorable free energy changes upon ligand binding are relatively large. It must be stressed that the few enzyme complexes included here do not corroborate this notion clearly. In fact, three enzyme complexes of the 20 complex/apo pairs presented (pairs 10, 18, and 19 in Table II) involve transition-state analogues but, nonetheless, show a build-up of strain (Table VIII), which would not be expected if such strain were the predominant mechanism for the ejection of catalytic products. On the other hand, the aforementioned barnase example (Fig. 3) shows that a substrate analogue can cause nonrotamericity along with large shifts in side-chain dihedral angles. It also is significant that the increased energy on ligand binding, as observed over the complex/apo pairs, does not appear to be mitigated by increased internal hydrogen bonding or electrostatic interaction. For use as aids in protein engineering and design, residue types used in natural evolution to induce strain are likely to be negatively charged and polar side chains as well as branched hydrophobic residues, especially leucine.

TABLE XII. Counts of Individual and Grouped Nonrotameric Residue Types in Three States

State 1, nonrotameric in complex only; State 2, nonrotameric in complex and apo; State 3, nonrotameric in apo only.

                                                 Preference
Type          State 1   State 2   State 3   State 1/3a     State 1/2a     State 2/3a
a. Individual (amino acid) residue types
C             4         0         1         2.11 ± 1.00    n.a.           0.00 ± 0.00
D             2         5         1         1.05 ± 0.72    0.42 ± 0.29    2.50 ± 1.05
E             2         0         0         n.a.           n.a.           n.a.
F             0         6         2         0.00 ± 0.00    0.00 ± 0.00    1.50 ± 0.57
H             0         1         1         0.00 ± 0.00    0.00 ± 0.00    0.50 ± 0.49
I             5         5         3         0.88 ± 0.37    1.04 ± 0.44    0.83 ± 0.35
K             1         0         0         n.a.           n.a.           n.a.
L             11        4         2         2.90 ± 0.74    2.90 ± 0.74    1.00 ± 0.47
M             1         1         2         0.26 ± 0.26    1.05 ± 1.04    0.25 ± 0.25
N             2         5         1         1.05 ± 0.72    0.42 ± 0.29    2.50 ± 1.05
Q             1         1         0         n.a.           1.05 ± 1.04    n.a.
R             1         1         0         n.a.           1.05 ± 1.04    n.a.
S             2         1         3         0.35 ± 0.24    2.11 ± 1.54    0.17 ± 0.17
T             1         1         0         n.a.           1.05 ± 1.04    n.a.
V             4         4         2         1.05 ± 0.50    1.05 ± 0.50    1.00 ± 0.47
W             0         1         1         0.00 ± 0.00    0.00 ± 0.00    0.50 ± 0.49
Y             1         4         1         0.53 ± 0.52    0.26 ± 0.26    2.00 ± 0.95
Total         38        40        20        -              -              -
b. Grouped residue types
D+E+N+Q       7         11        2         1.84 ± 0.63    0.67 ± 0.23    2.75 ± 0.71
F+H+W+Y       1         12        5         0.11 ± 0.10    0.09 ± 0.09    1.20 ± 0.29
K+R           2         1         0         n.a.           2.11 ± 1.54    n.a.
S+T           3         2         3         0.53 ± 0.29    1.58 ± 0.88    0.33 ± 0.23
C+I+L+M+V     25        14        10        1.32 ± 0.15    1.88 ± 0.22    0.70 ± 0.15
Total         38        40        20        -              -              -

aThe preference (State a/b) is calculated as the fraction of the residue type in one state (a) relative to that in another state (b), where a and b can be states 1 to 3 (columns 2 to 4). For example, State 1/3 denotes the nonrotameric preference of the considered residue type for being nonrotameric in the complex-only state relative to being nonrotameric in the apo-only state. Estimates for preference standard deviations are compiled as described in Heringa and Argos.10 n.a., not applicable.

TABLE XIII. Differences in Δχ1 and Δχ2 Angles of Nonrotameric Side Chains in Three Complex/Apo States†

                      Entire structure              <9 Å from ligand
Nonrotameric in       No.   Δ(Δχ1 + Δχ2)            No.   Δ(Δχ1 + Δχ2)
Complex only          38    16.59 ± 13.63           12    11.41 ± 9.75
Complex and apo       40    1.97 ± 9.08             8     3.00 ± 12.00
Apo only              20    −11.19 ± 13.66          7     −4.57 ± 1.50

†Δ(Δχ1 + Δχ2), the mean differences with standard deviations of the summed (Δχ1 + Δχ2) values of equivalent side chains in complexed and apo structures relative to the complexes; i.e., $\frac{1}{N}\sum_{N}\left[(\Delta\chi_1^{c} + \Delta\chi_2^{c}) - (\Delta\chi_1^{a} + \Delta\chi_2^{a})\right]$, where superscripts c and a designate Δχ angles of equivalent residues in complexed and apo structures, respectively, and N is the number of side chains. Summation is over the number of side chains (N). For residues Cys, Ser, Thr, and Val, only the Δχ1 values were used. No., the numbers of side chains observed in each of the three states.

TABLE XIV. Differences in Protein-Internal Hydrogen Bonding and Charge Complementarity of Nonrotameric Side Chains in Three Complex/Apo States

Nonrotameric in       No.   ΔH-bonda   Δsalt bridgea
Complex only          38    4          3
Complex and apo       40    0          −1
Apo only              20    −2         0

aΔH-bond and Δsalt-bridge values are given relative to the complexed structure.

Furthermore, nonrotameric switching may provide a more general way to channel energy through protein structures without necessarily disrupting the main-chain topology. If more corresponding complex and apo structures become available, then suggestions for protein mutagenesis experiments could be derived (Tables XI and XII). Here, only the statistics for Leu residues were significant enough to suggest its engineering at ligand binding sites aimed at provoking more inducible strain, which could increase enzymatic activity.
The residue type might be especially appropriate in cases of hydrophobic patches binding ligands. However, such engineering is a complicated matter, because every active site has unique properties with associated residues in its specific environment, such that, apart from altered side-chain dihedral strain, amino acid substitutions involve changes in hydrophobicity, packing, hydrogen bonding, and the like.

Fig. 1. Space-filling representation of a nonrotameric side-chain four cluster and two isolated side chains for the Salmonella typhimurium galactose-binding protein (1gca) complex with galactose and Ca2+ ligands. Orange indicates nonrotameric side chains only in the complex: namely, Asn66, Lys92, Gln142, and Leu145. Yellow designates nonrotameric side chains in both the complexed and the apo structures: Tyr10 and Met17. Ligands galactose (GAL) and Ca2+ (CA) are depicted in blue. The picture was prepared by using RasMol graphics.21

Fig. 2. Space-filling representation of nonrotameric side chains in the complexed bovine γ-chymotrypsin (2gmt). Ser214, Val227, Tyr228, and Asp194 are within 9 Å of the ligand. Orange indicates nonrotameric side chains only in the complex. Val52 and Ser214 (green) become relaxed (rotameric) upon ligand binding. Leu199 (yellow) is nonrotameric in both the complexed and apo forms. The side chains Cys136, Cys201, Leu199, and Tyr228 form a nonrotameric four cluster. The N-acetyl-L-phenylalanyl-α-chloroethylketone ligand (HIN) is depicted in blue.

Fig. 3. Space-filling representation of the nonrotameric side chains for Bacillus amyloliquefaciens barnase (1brn chain L) with a bound four-nucleotide fragment (RNA). Asp75, Leu89, and Ile96 (orange) become nonrotameric and strained upon ligand binding. The nucleotide fragment is indicated in blue.
Further research should elucidate how enzymes bind their substrates, transition states, and products in individual cases of strained side-chain conformations, as observed in this work. It could be argued that the results presented here are derived from critically small data sets and also that extrapolating a limited number of observations on induced strain to a general mechanism for providing energy in enzyme-catalyzed reactions seems somewhat imaginative. Nonetheless, our observations about the predominance of nonrotameric residues in the vicinity of ligands are consistent, and we invite protein engineers to test our "battery hypothesis" of inducible strain.

ACKNOWLEDGMENTS

We thank Frank Eisenhaber and Alex May for valuable discussions and Gerhard Vogt for help in delineating hydrogen bonds and salt bridges.

REFERENCES

1. Schrauber H, Eisenhaber F, Argos P. Rotamers: to be or not to be? An analysis of amino acid side-chain conformations in globular proteins. J Mol Biol 1993;230:592–612.
2. Janin J, Wodak S, Levitt M, Maigret B. Conformations of amino acid side-chains in proteins. J Mol Biol 1978;125:357–386.
3. Bhat TN, Sasisekharan V, Vijayan M. An analysis of side-chain conformation in proteins. Int J Protein Peptide Res 1979;13:170–184.
4. Gelin BR, Karplus M. Side-chain torsional potentials and motion of amino acids in proteins: bovine pancreatic trypsin inhibitor. Proc Natl Acad Sci USA 1975;72:2002–2006.
5. Gelin BR, Karplus M. Side-chain torsional potentials: effect of dipeptide, protein and solvent environment. Biochemistry 1979;18:1256–1268.
6. Ponder JW, Richards FM. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 1987;193:775–791.
7. Creamer TP, Rose GD. Side-chain entropy opposes α-helix formation but rationalizes experimentally determined helix-forming propensities. Proc Natl Acad Sci USA 1992;89:5937–5941.
8. Lee KH, Xie D, Amzel LM. Estimation of changes in side chain configurational entropy in binding and folding: general methods and application to helix formation. Proteins 1994;20:68–84.
9. Herzberg O, Moult J. Analysis of steric strain in the polypeptide backbone of protein molecules. Proteins 1991;11:223–229.
10. Heringa J, Argos P. Strain in protein structures as viewed through non-rotameric side-chains. I. Spatial position and interaction. Proteins 1999;37:30–43.
11. Bernstein FC, Koetzle TF, Williams GJB, Meyer EF, Brice MD, Rodgers JR, Kennard O, Shimanouchi T, Tasumi M. Protein data bank: a computer based archival file for macromolecular structures. J Mol Biol 1977;112:535–542.
12. Heringa J, Sommerfeldt H, Higgins D, Argos P. OBSTRUCT: a program to obtain largest cliques from a protein sequence set according to structural resolution and sequence similarity. Comput Appl Biosci 1992;8:599–600.
13. Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983;22:2577–2637.
14. Abagyan R, Totrov M, Kuznetsov D. ICM—a new method for protein modeling and design: applications to docking and structure prediction from the distorted native conformation. J Comp Chem 1994;15:488–506.
15. McDonald IK, Thornton JM. Satisfying hydrogen bonding potential in proteins. J Mol Biol 1994;238:777–793.
16. Barlow DJ, Thornton JM. Ion-pairs in proteins. J Mol Biol 1983;168:867–885.
17. Heringa J, Argos P. Side-chain clusters in protein structures and their role in protein folding. J Mol Biol 1991;220:151–171.
18. Hol WGJ, van Duijnen PT, Berendsen HJC. The α-helix dipole and the properties of proteins. Nature 1978;273:443–446.
19. Harper ET, Rose GD. Helix stop signals in proteins and peptides—the capping box. Biochemistry 1993;32:7605–7609.
20. Taylor WR, Orengo CA. Protein structure alignment. J Mol Biol 1989;208:1–22.
21. Sayle RA, Milner-White EJ. RasMol: biomolecular graphics for all. Trends Biochem Sci 1995;20:374–376.