BIOINFORMATICS APPLICATIONS NOTE
Vol. 22 no. 11 2006, pages 1397–1398
doi:10.1093/bioinformatics/btl128
Structural bioinformatics
NQ-Flipper: validation and correction of asparagine/glutamine
amide rotamers in protein crystal structures
Christian X. Weichenberger and Manfred J. Sippl
Center of Applied Molecular Engineering, University of Salzburg, Jakob Haringerstrasse 5, 5020 Salzburg, Austria
Received on January 12, 2006; revised on March 6, 2006; accepted on March 30, 2006
Advance Access publication April 4, 2006
Associate Editor: Anna Tramontano
ABSTRACT
Summary: The error rate of asparagine (Asn) and glutamine (Gln)
amide rotamers in protein crystal structures is on the order of 20%
and as a consequence the current Protein Data Bank (PDB) contains
approximately half a million incorrect Asn and Gln side-chain rotamers.
Here we present NQ-Flipper, a web service based on knowledge-based
potentials of mean force to automatically detect and correct erroneous
rotamers. We achieve excellent agreement with expert curated data.
Availability: The program is freely accessible as a web service at
http://flipper.services.came.sbg.ac.at
Contact: [email protected]
1 INTRODUCTION
The side-chain amide groups of asparagine (Asn) and glutamine
(Gln) act simultaneously as hydrogen bond donors and acceptors.
The electron density near the nitrogen and oxygen atoms is frequently compatible with two rotamers which are related by a 2-fold
symmetry axis. This hampers the correct interpretation of electron
density maps resulting in the perpetual assignment of incorrect
rotamers with an error rate of 20% (McDonald and Thornton,
1995; Word et al., 1999). Stated in this way the problem seems
to be specific for X-ray analysis of protein crystals but we emphasize that a similar error rate of 23% is found in NMR structures.
Since Asn and Gln residues frequently participate in hydrogen
bond networks and functional groups, incorrect rotamers may
severely interfere with physico-chemical studies of protein structures and molecular modeling tasks. On the other hand, incorrect
rotamers in general result in unfavorable interactions which should
be clearly detectable by proper energy calculations.
Currently, two web-based services are available for the detection
of incorrect Asn and Gln rotamers. Lovell et al. (2003) identify
correct rotamers by minimizing steric clashes after adding hydrogen
atoms to the protein structure. Hooft et al. (1996) optimize the
hydrogen bonding network of proteins allowing Asn and Gln residues to flip during the optimization process. Both methods use
artificially created hydrogen atom positions for rotamer characterization. The service provided by Lovell et al. (2003) offers visualization and download of corrected PDB (Berman et al., 2000)
entries.
2 IMPLEMENTATION AND USAGE
To address the rotamer problem we use potentials of mean force for
pairwise interactions among all heavy atoms of standard amino
acids (Sippl, 1990). For each Asn and Gln residue we compute
the energy $\varepsilon(R_1)$ of the original conformation R1 as found in the
PDB structure and the energy $\varepsilon(R_2)$ for the alternative rotamer R2.
The energy difference $\Delta\varepsilon := \varepsilon(R_1) - \varepsilon(R_2)$ serves as a score and
from Boltzmann's distribution we obtain the probabilities
$p(R_1) = (1 + \exp(\Delta\varepsilon))^{-1}$ and $p(R_2) = 1 - p(R_1)$. Given a threshold value $\omega$
(see below), we distinguish three cases. If $\Delta\varepsilon < -\omega$, the probability
$p(R_1)$ is close to one and R1 is considered to be the correct rotamer.
If $\Delta\varepsilon > \omega$, the probability $p(R_1)$ is close to zero and R1 is the
incorrect rotamer. For $|\Delta\varepsilon| \le \omega$, both rotamers have significant
probabilities so that they are likely to coexist in the crystal structure
and the assignment is ambiguous. In particular, for $\Delta\varepsilon = 0$ both
rotamers have equal probabilities $p(R_1) = p(R_2) = 1/2$.
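The scoring rule described above can be illustrated with a minimal Python sketch. It assumes that the energies are expressed in units of kT, so that the Boltzmann factor reduces to a plain exponential, and it takes the threshold of 6 reported for the reference-set comparison as a default. The function names are hypothetical and are not part of NQ-Flipper.

```python
import math

def rotamer_probabilities(delta_e):
    """Return (p_R1, p_R2) for delta_e = e(R1) - e(R2).

    Assumption: energies are in units of kT.
    """
    # Evaluate 1 / (1 + exp(delta_e)) without overflow for large |delta_e|.
    if delta_e >= 0:
        z = math.exp(-delta_e)
        p_r1 = z / (1.0 + z)
    else:
        p_r1 = 1.0 / (1.0 + math.exp(delta_e))
    return p_r1, 1.0 - p_r1

def classify(delta_e, omega=6.0):
    """Three-way call for the rotamer R1 found in the structure."""
    if delta_e > omega:
        return "incorrect"   # R2 is strongly favoured, flip suggested
    if delta_e < -omega:
        return "correct"     # R1 is strongly favoured
    return "ambiguous"       # both rotamers have significant probability

if __name__ == "__main__":
    for score in (0.0, 6.0, 20.5, -12.0):
        print(score, classify(score), rotamer_probabilities(score))
```

For a score of 0 the two probabilities are equal, and for a score of 6 the probability of the original rotamer is already only about 0.0025.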
A suitable value of the threshold $\omega$ is obtained by comparison with the
reference set reported by Word et al. (1999). The reference set consists of
100 protein chains containing 1006 (75.9%) Asn and Gln residues
classified as correct and 320 (24.1%) classified as incorrect. Using
established terminology (Baldi et al., 2000) we find for incorrect
rotamers, defined as $\Delta\varepsilon > \omega = 6$, a sensitivity of 92.7%, a specificity of
96.7% and an overall accuracy of 95.8%. The fraction of ambiguous
rotamers, defined by $-6 \le \Delta\varepsilon \le 6$, is 5.8%.
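For readers unfamiliar with these performance measures, the following sketch computes them from confusion-matrix counts, treating a rotamer flagged as incorrect as a "positive". It assumes the common definitions sensitivity = TP/(TP+FN) and specificity = TN/(TN+FP); the text does not spell out which convention is applied, and part of the bioinformatics literature uses TP/(TP+FP) for specificity instead. The counts are made up for illustration and are not the counts behind the percentages quoted above.

```python
def classification_rates(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp: incorrect rotamers flagged as incorrect
    fn: incorrect rotamers not flagged
    tn: correct rotamers not flagged
    fp: correct rotamers flagged as incorrect
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)   # assumed convention, see lead-in
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(classification_rates(tp=90, fn=10, tn=180, fp=20))
```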
The mean force potentials are first compiled from a database
containing the original R1 rotamers. The potentials are then refined
by several cycles of rotamer correction and recompilation of potentials. The potentials converge quickly to a stable self-consistent
solution. A subsequent comparison shows excellent agreement
with rotamer flips independently suggested by expert analysis
[e.g. Word et al. (1999)]. In principle the approach presented
here can be applied to the related problem of ambiguities in His
conformers. However, the respective analysis requires a careful
consideration of various protonation states which is beyond the
scope of the present analysis.
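The cycle of compilation, correction and recompilation can be pictured with a toy Python sketch. Everything in it is a simplifying assumption: each rotamer is reduced to a list of discrete contact labels, the energies are inverse-Boltzmann terms (the negative logarithm of an observed frequency relative to a uniform reference) standing in for the distance-dependent pair potentials of Sippl (1990), and the threshold is chosen to suit the toy data. It shows only the self-consistent loop, not the actual potentials.

```python
import math
from collections import Counter

def compile_potential(states, vocabulary, pseudo=1.0):
    """Return {contact_type: energy} as -ln(observed frequency / uniform reference)."""
    counts = Counter(c for contacts in states for c in contacts)
    total = sum(counts.values()) + pseudo * len(vocabulary)
    return {c: -math.log((counts[c] + pseudo) / total * len(vocabulary))
            for c in vocabulary}

def refine(residues, omega, max_cycles=20):
    """residues: list of (contacts_R1, contacts_R2); returns flags, True = use R2."""
    vocabulary = sorted({c for r1, r2 in residues for c in r1 + r2})
    flipped = [False] * len(residues)
    for _ in range(max_cycles):
        # Compile energies from the database in its current (partly corrected) state.
        current = [r2 if f else r1 for (r1, r2), f in zip(residues, flipped)]
        energy = compile_potential(current, vocabulary)
        # Re-score every residue: e(R1) - e(R2) above the threshold means flip.
        new_flipped = [
            sum(energy[c] for c in r1) - sum(energy[c] for c in r2) > omega
            for r1, r2 in residues
        ]
        if new_flipped == flipped:   # self-consistent: no assignment changed
            break
        flipped = new_flipped
    return flipped

if __name__ == "__main__":
    # Toy database: 8 residues built correctly, 2 built with the amide reversed.
    toy = [(["O..N"], ["N..N"])] * 8 + [(["N..N"], ["O..N"])] * 2
    print(refine(toy, omega=0.5))
```

In this toy database the two reversed residues are flipped in the first cycle, and the second cycle confirms that no further assignment changes.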
As an example we provide an analysis of oxidoreductase
1ra9 (resolution 1.55 Å). The structure contains four residues with
significant $\Delta\varepsilon$-scores (Asn-18, Asn-23, Gln-65, Gln-108; $\Delta\varepsilon$-scores
20.5, 16.7, 12.8 and 26.7, respectively). These residues should
be flipped to the alternative rotamer R2 which is corroborated by
a detailed analysis of the interactions among the affected atoms
(Fig. 1).
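As an illustration of what a flip amounts to at the level of a coordinate file, the following Python sketch exchanges the coordinates of the amide oxygen and nitrogen (OD1/ND2 in Asn, OE1/NE2 in Gln) in PDB-formatted ATOM records. This is an assumption about one simple way to realise the flip, not a description of how NQ-Flipper generates its corrected files: swapping coordinates only approximates a 180° rotation of the amide group, since the C-O and C-N bond lengths differ slightly. The function name is hypothetical, and alternate locations and insertion codes are ignored.

```python
# Amide oxygen/nitrogen atom names in standard PDB nomenclature.
FLIP_PAIRS = {"ASN": ("OD1", "ND2"), "GLN": ("OE1", "NE2")}

def flip_amide(pdb_lines, chain, resseq):
    """Return a copy of pdb_lines with the amide O and N coordinates of one
    Asn/Gln residue exchanged (fixed-column PDB ATOM records assumed)."""
    lines = list(pdb_lines)
    found = {}
    for i, line in enumerate(lines):
        if not line.startswith("ATOM"):
            continue
        resname = line[17:20]                      # columns 18-20
        if resname not in FLIP_PAIRS:
            continue
        if line[21] == chain and int(line[22:26]) == resseq:
            name = line[12:16].strip()             # columns 13-16
            if name in FLIP_PAIRS[resname]:
                found[name] = i
    if len(found) != 2:
        raise ValueError("amide O/N pair not found for the given residue")
    i, j = found.values()
    xyz_i, xyz_j = lines[i][30:54], lines[j][30:54]   # x, y, z: columns 31-54
    lines[i] = lines[i][:30] + xyz_j + lines[i][54:]
    lines[j] = lines[j][:30] + xyz_i + lines[j][54:]
    return lines
```

Atom names, serial numbers and B-factors stay with their records; only the positions are exchanged, so the oxygen ends up where the nitrogen was and vice versa.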
© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]
Fig. 1. Residues Gln-65 (a) and Asn-18 (b) of oxidoreductase 1ra9 (resolution 1.55 Å) shown in their respective chemical environment. The dashed lines
represent the distances between atoms. The atoms are colored by atom type: carbon, grey; oxygen, red; nitrogen, blue; phosphorus, orange. Both rotamers are
shown in their R1 state (as found in the PDB entry). In both cases the mean force energy difference clearly disfavors the R1 rotamer as compared to R2. This can also
be rationalized in terms of physico-chemical principles. In the R1 rotamer of Gln-65 (Fig. 1a) the amide oxygen is in close proximity to the hydroxyl group of
Ser-64 (2.90 Å). In principle the hydroxyl group can donate its proton to form a hydrogen bond with Gln-65, but then no hydrogen bond can be formed between the
Ser-64 hydroxyl and the negatively charged NAP moiety. The hydroxyl oxygen of Ser-64 has an ideal hydrogen bond distance of 2.70 Å to the respective oxygen
of NAP. The Ser-64 hydroxyl group therefore donates its proton to NAP rather than to Gln-65. Hence, the correct configuration is obtained by flipping the Gln-65
amide, in agreement with the result obtained by NQ-Flipper. Then the Ser-64 hydroxyl group accepts an N–H proton from the Gln-65 amide and Ser-64 can
participate in two hydrogen bonds. Also, there are no other interactions in the vicinity which become disfavored by rotating the amide group. We note that this flip
is suggested by NQ-Flipper even without taking into account the interaction between the Ser-64 hydroxyl and the NAP atoms. Adding these interactions will
further increase the energy difference between R1 and R2 of Gln-65. In the case of the R1 rotamer of Asn-18 (Fig. 1b) the amide nitrogen is close to its own
backbone nitrogen (3.1 Å). A flip to the R2 rotamer brings the amide oxygen (negative partial charges) in close proximity to the backbone nitrogen (positive partial
charges). In addition the flip removes an unfavorable close contact between the amide N-H2 group and the Cα-H group within the Asn-18 residue. A corresponding
analysis of the rotamers of Asn-23 and Gln-108 again corroborates that the R1 rotamer is unfavorable and should be flipped to the R2 rotamer. The major
unfavorable interaction of Asn-23 is a steric clash with Pro-25 which is absent in the flipped rotamer, and the major problems with Gln-108 are due to unfavorable
electrostatic interactions with the backbone oxygen atoms of Lys-107 and Lys-108 which are replaced by favorable electrostatic interactions in the flipped
rotamer (data not shown). The figure was generated using PyMOL (http://pymol.sourceforge.net).
The method presented here is implemented as a web service
called NQ-Flipper (http://flipper.services.came.sbg.ac.at). The service provides validation and correction of Asn and Gln residues in
protein structures that can be specified either as a valid PDB code or
uploaded as a PDB formatted file. For each Asn and Gln amino acid
in the structure the server computes $\Delta\varepsilon$-scores by taking into
account all chains and the full crystal symmetry.
The results are presented in the form of a table of $\Delta\varepsilon$ values and a
graphical view of the structure based on Jmol. The table marks
incorrect residues in red and residues which are within a radius of
8 Å of non-standard groups in blue. The residues in blue have to be
treated with caution since they may interact with atoms whose
potentials of mean force are currently not available. All assignments
can be edited by the user and the corrected coordinate files can be
downloaded in various compression formats.
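The proximity criterion behind the blue entries can be sketched in Python as a simple distance test. Which atoms of the residue and of the non-standard group the server actually compares is not stated here, so the all-atom comparison below, like the function name, is an assumption.

```python
import math

def near_nonstandard(residue_atoms, hetero_atoms, radius=8.0):
    """True if any residue atom lies within `radius` (in Å) of any atom
    of a non-standard group. Coordinates are (x, y, z) tuples."""
    return any(
        math.dist(a, h) <= radius
        for a in residue_atoms
        for h in hetero_atoms
    )

if __name__ == "__main__":
    amide = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)]   # hypothetical coordinates
    ligand = [(8.5, 0.0, 0.0)]
    print(near_nonstandard(amide, ligand))        # second atom is 7.3 Å away
```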
All transactions are encrypted by the HTTPS protocol and the
data are stored in session dependent directories that are only
accessible to the user who has control of the session. A detailed
description of all parameters is available in the help section of the
website.
ACKNOWLEDGEMENTS
The authors thank Ralf Grosse-Kunstleve for kind permission to use
his sglite crystallographic symmetry library.
Conflict of Interest: none declared.
REFERENCES
Baldi,P. et al. (2000) Assessing the accuracy of prediction algorithms for classification:
an overview. Bioinformatics, 16, 412–424.
Berman,H.M. et al. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.
Hooft,R.W. et al. (1996) Errors in protein structures. Nature, 381, 272.
Lovell,S.C. et al. (2003) Structure validation by Cα geometry: φ, ψ and Cβ
deviation. Proteins, 50, 437–450.
McDonald,I.K. and Thornton,J.M. (1995) The application of hydrogen bonding analysis in X-ray crystallography to help orientate asparagine, glutamine and histidine
side chains. Protein Eng., 8, 217–224.
Sippl,M.J. (1990) Calculation of conformational ensembles from potentials of mean
force. An approach to the knowledge-based prediction of local structures in globular proteins. J. Mol. Biol., 213, 859–883.
Word,J.M. et al. (1999) Asparagine and glutamine: using hydrogen atom contacts in the
choice of side-chain amide orientation. J. Mol. Biol., 285, 1735–1747.