BIOINFORMATICS APPLICATIONS NOTE
Vol. 23 no. 11 2007, pages 1429–1430
doi:10.1093/bioinformatics/btm124
Structural bioinformatics
PROTMAP2D: visualization, comparison and analysis
of 2D maps of protein structure
Michal J. Pietal, Irina Tuszynska and Janusz M. Bujnicki
Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology,
Trojdena 4, 02-109 Warsaw, Poland
Received November 11, 2006; accepted March 25, 2007
Advance Access publication March 30, 2007
Associate Editor: Prof. Martin Bishop
ABSTRACT
Motivation: Protein structure comparison is a fundamental problem
in structural biology and bioinformatics. Two-dimensional maps of
distances between residues in the structure contain sufficient
information to restore the 3D representation, while maps of contacts
reveal characteristic patterns of interactions between secondary
and super-secondary structures and are very attractive for visual
analysis. The overlap of 2D maps of two structures can be easily
calculated, providing a sensitive measure of protein structure
similarity. PROTMAP2D is a software tool for calculation of contact
and distance maps based on user-defined criteria, quantitative
comparison of pairs or series of contact maps (e.g. alternative
models of the same protein, model versus native structure,
different trajectories from molecular dynamics simulations, etc.)
and visualization of the results.
Availability: PROTMAP2D for Windows / Linux / MacOSX is freely
available for academic users from http://genesilico.pl/protmap2d.htm
Contact: [email protected]
Supplementary information: Supplementary data are available at
Bioinformatics online.
Comparison of protein structures is fundamental to structural
biology. Typically, the protein structures are represented by
spatial coordinates in 3D Cartesian space and are superimposed
as rigid bodies. The optimal superposition of 3D protein
structures and calculation of the number of residues superimposable under a certain distance cutoff is, however, a nontrivial problem (Godzik, 1996). Alternatively, the protein structure
can also be represented as a 2D matrix, where the element (i, j)
carries information about the interactions of residues i and j
(Phillips, 1970). The matrix elements can be distances between
particular atoms (distance map) or simple binary (yes/no)
information about residue interactions (contact map).
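The two representations can be illustrated with a minimal NumPy sketch. It is not PROTMAP2D code: the function names, the toy coordinates and the 8 Å default threshold are assumptions chosen for illustration, and one representative atom per residue is assumed.

```python
import numpy as np

def distance_map(coords):
    """Return the N x N matrix of Euclidean distances between N points."""
    coords = np.asarray(coords, dtype=float)
    # Broadcasting yields all pairwise coordinate differences, shape (N, N, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def contact_map(dist, threshold=8.0):
    """Return a boolean matrix: True where the distance is within the threshold."""
    return np.asarray(dist) <= threshold

# Toy coordinates (in Å): four residues on a straight line, 5 Å apart.
coords = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [10.0, 0.0, 0.0], [15.0, 0.0, 0.0]]
dmap = distance_map(coords)              # element (i, j) is a distance
cmap = contact_map(dmap, threshold=8.0)  # element (i, j) is yes/no
```

Both matrices are symmetric, and neither changes if the coordinates are rotated or translated.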
In contrast to Cartesian coordinates, the 2D map representation of protein structure is independent of the coordinate frame,
which makes it very useful for protein structure comparisons
(Mirny and Domany, 1996). For example DALI, one of the
most popular tools for protein structure database searches,
relies on comparison of distance matrices (Holm and
Sander, 1993).
Many methods have been developed for comparison or
visualization of the 3D representation of protein models.
In contrast, only a few methods are freely available for
comparison or visualization of 2D maps, such as VMD
(Humphrey et al., 1996), SeqX (Biro and Fordos, 2005) or
iMolTalk (Diemand and Scheib, 2004). However, these
programs provide little flexibility for definition of the contact
by the user (type of atom and the distance threshold),
for comparison of series or ensembles of models, or for
manipulation of the results.
We developed a new tool, PROTMAP2D, dedicated to the analysis
of contact maps and, to some extent, distance maps. Our aim
was to facilitate the comparative analysis of protein structures,
especially in cases where not only the backbone geometry
is important (a property typically analyzed by 3D superposition
methods), but also the mutual position and interactions of
residues. Contact maps are a very demonstrative representation
of protein structure, attractive for visual analysis, which reveals
interactions between residues spatially distant in the structure
and clearly depicts all secondary structure elements and contacts
between them (Godzik et al., 1993). It was also demonstrated
that it is possible to reproduce, with considerable accuracy, the
3D structure of the protein from its contact map (Vendruscolo
et al., 1997).
PROTMAP2D allows for quantitative and qualitative
(visual) analysis of contact maps of protein structures:
individual models, different models of the same protein
(e.g. structures solved experimentally under different conditions, theoretical models calculated with different programs),
ensembles of structures (e.g. from NMR or de novo folding)
and trajectories (e.g. from molecular dynamics simulations).
The user can specify the definition of the contact: the type
of atom to be included in calculations, a maximal distance (in Å)
between atoms to be considered in contact, and a minimal
separation of amino acids along the sequence. The user can also
specify a group of particular residues (or several segments of the
protein chain) and monitor the changes in the number of
contacts between them in a trajectory or an ensemble.
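A contact definition of this kind, and the monitoring of contacts between two residue groups across an ensemble, might be sketched as follows. This is an illustrative sketch rather than the program's implementation: the function names, parameter defaults and toy coordinates are assumptions, and residue indices are 0-based.

```python
import numpy as np

def pairwise(coords):
    """Pairwise Euclidean distances for one model (one atom per residue)."""
    c = np.asarray(coords, dtype=float)
    return np.linalg.norm(c[:, None] - c[None, :], axis=-1)

def contacts_with_separation(dist, max_dist=8.0, min_sep=3):
    """Binary contact map keeping only pairs with |i - j| >= min_sep."""
    dist = np.asarray(dist, dtype=float)
    idx = np.arange(dist.shape[0])
    far_in_sequence = np.abs(idx[:, None] - idx[None, :]) >= min_sep
    return (dist <= max_dist) & far_in_sequence

def count_group_contacts(cmap, group_a, group_b):
    """Number of contacts between two sets of residue indices."""
    return int(cmap[np.ix_(group_a, group_b)].sum())

# Toy ensemble of two six-residue models: an extended chain and a hairpin.
extended = [(4.0 * i, 0.0, 0.0) for i in range(6)]
hairpin = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (8, 5, 0), (4, 5, 0), (0, 5, 0)]

counts = []
for model in (extended, hairpin):
    cmap = contacts_with_separation(pairwise(model), max_dist=8.0, min_sep=3)
    counts.append(count_group_contacts(cmap, [0, 1, 2], [3, 4, 5]))
```

Here `counts` records, model by model, how many contacts link the first three residues to the last three.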
Full distance maps can also be calculated and analyzed
for individual models and then converted to contact maps.
Options for visualization of distance maps include generation
of derivative contact (binary) maps on the fly, as the user
changes parameters with which to define the contact.
© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]
This allows for convenient selection of parameters for
contact map calculation that capture the desired feature to be
analyzed for a large set of models (e.g. spatial proximity of a
given pair of secondary structure elements).
PROTMAP2D calculates contact maps for all uploaded 3D
models and provides many options for their visualization
(see e.g. Fig. 1, or examples available in the tutorial or at the
PROTMAP2D website, http://genesilico.pl/protmap2d.htm).
It also displays the statistics (density of contacts, conservation
of common contacts, etc.) and allows saving the output as
bitmap graphics or ASCII files. Maps generated in
PROTMAP2D can also be exported as matrices in the
PHYLIP, CLANS or Microsoft EXCEL formats, or as CASP or
EVA residue–residue contact files, allowing for visualization
in different programs or further processing, e.g. by clustering
methods. Our program also allows uploading of previously
calculated 2D maps saved in the above-mentioned formats,
thus enabling comparison of, e.g., computationally predicted
and native maps.
PROTMAP2D generates a particularly visually appealing
output for MD trajectories comprising multiple conformations
(e.g. from protein folding or unfolding simulations).
It produces a ‘movie’ (MPEG file, example available at the
program website), in which each frame displays in white
the contacts found in the current conformation, as well as a
fading ‘trace’ of contacts observed in previous conformations.
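One simple way to obtain such a fading trace is to let the intensity of every past contact decay by a constant factor per frame. The sketch below assumes this rule; the decay factor and the function name are hypothetical and do not describe how PROTMAP2D renders its movies.

```python
import numpy as np

def fading_frames(contact_maps, decay=0.5):
    """Yield one grayscale frame (values in [0, 1]) per conformation.

    Contacts of the current conformation get intensity 1.0 (white);
    contacts seen earlier keep a trace that fades by `decay` per frame.
    """
    trace = None
    for cmap in contact_maps:
        current = np.asarray(cmap, dtype=float)
        if trace is None:
            trace = current.copy()
        else:
            trace = np.maximum(current, decay * trace)
        yield trace.copy()

# Toy trajectory of three frames: one contact that is lost after frame 1.
with_contact = np.array([[0, 1], [1, 0]])
without_contact = np.zeros((2, 2), dtype=int)
frames = list(fading_frames([with_contact, without_contact, without_contact]))
```

Each frame could then be written out as an image and the images assembled into a movie with an external encoder.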
Given two alternative trajectories, PROTMAP2D calculates
the fraction of common contacts for all model pairs and
visualizes the distance matrix between the two trajectories,
facilitating identification of folding intermediates.
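The pairwise comparison of two trajectories can be sketched as below. The text does not state how the fraction of common contacts is normalized, so the sketch assumes the number of shared contacts divided by the number of contacts present in either map (the Jaccard index). All function names are illustrative.

```python
import numpy as np

def map_from_pairs(n, pairs):
    """Build a symmetric boolean contact map from (i, j) index pairs."""
    m = np.zeros((n, n), dtype=bool)
    for i, j in pairs:
        m[i, j] = m[j, i] = True
    return m

def common_contact_fraction(map_a, map_b):
    """Shared contacts divided by contacts present in either map."""
    a = np.triu(np.asarray(map_a, dtype=bool), k=1)  # count each pair once
    b = np.triu(np.asarray(map_b, dtype=bool), k=1)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # two empty maps are treated as identical
    return float(np.logical_and(a, b).sum() / union)

def trajectory_similarity(traj_a, traj_b):
    """Matrix whose element (i, j) compares frame i of A with frame j of B."""
    sim = np.zeros((len(traj_a), len(traj_b)))
    for i, a in enumerate(traj_a):
        for j, b in enumerate(traj_b):
            sim[i, j] = common_contact_fraction(a, b)
    return sim

traj_a = [map_from_pairs(5, [(0, 3)]), map_from_pairs(5, [(0, 3), (1, 4)])]
traj_b = [map_from_pairs(5, [(0, 3), (1, 4)]), map_from_pairs(5, [(0, 4)])]
similarity = trajectory_similarity(traj_a, traj_b)
distance = 1.0 - similarity  # one possible distance between conformations
```

Low-distance regions in such a matrix mark conformations in the two trajectories that share most of their contacts.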
Our program does not conduct alignments of contact maps
for non-identical proteins, e.g. remotely related homologs;
it performs analyses only for alternative models of the same
protein. Analysis of incomplete structures that differ in length
(e.g. crude fold-recognition models that exhibit deletions or
crystallographic structures with disordered parts) is however
possible, as long as the corresponding residues retain the same
index number in all models. PROTMAP2D allows analyses of
multimeric proteins, and can be used to visualize intermolecular
contacts in protein–protein complexes.
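The requirement that corresponding residues keep the same index number can be made concrete by storing contacts as pairs of residue numbers rather than matrix positions. The following sketch assumes each model is given as a dictionary mapping residue number to the coordinates of one representative atom; this data layout and the function names are illustrative and not taken from PROTMAP2D.

```python
import math
from itertools import combinations

def contact_set(atoms_by_resnum, max_dist=8.0):
    """Set of (i, j) residue-number pairs, i < j, within max_dist of each other."""
    pairs = set()
    for (i, xi), (j, xj) in combinations(sorted(atoms_by_resnum.items()), 2):
        if math.dist(xi, xj) <= max_dist:
            pairs.add((i, j))
    return pairs

def shared_contacts(model_a, model_b, max_dist=8.0):
    """Contacts found in both models, restricted to residues present in both."""
    common = set(model_a) & set(model_b)
    a = contact_set({r: model_a[r] for r in common}, max_dist)
    b = contact_set({r: model_b[r] for r in common}, max_dist)
    return a & b

# A complete model and an incomplete one lacking residue 2 (e.g. a disordered
# loop); residue 4 is placed far from the others in the incomplete model.
complete = {1: (0, 0, 0), 2: (4, 0, 0), 3: (7, 0, 0), 4: (4, 5, 0)}
incomplete = {1: (0, 0, 0), 3: (7, 0, 0), 4: (4, 20, 0)}
shared = shared_contacts(complete, incomplete)
```

Because the comparison is keyed on residue numbers, the missing residue does not shift the remaining contacts.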
PROTMAP2D has been tested thoroughly in our laboratory.
We routinely use it to analyze the results of unfolding
simulations in search of common pathways. We also use it
as the first step in clustering of protein models generated
by fold-recognition or de novo folding and to infer the most
probable contacts from a set of alternative models, which can
be used as distance restraints in subsequent folding simulations.
We hope that our method will also become a valuable resource
for other groups involved in protein structure prediction,
comparison of models and simulations of protein dynamics.
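Inferring the most probable contacts from a set of alternative models can be sketched as a frequency count over their contact maps. The 0.6 frequency cutoff and the function names below are assumptions made for illustration; the text does not specify how PROTMAP2D selects such contacts.

```python
import numpy as np

def map_from_pairs(n, pairs):
    """Build a symmetric boolean contact map from (i, j) index pairs."""
    m = np.zeros((n, n), dtype=bool)
    for i, j in pairs:
        m[i, j] = m[j, i] = True
    return m

def contact_frequencies(contact_maps):
    """Fraction of models in which each residue pair is in contact."""
    return np.asarray(contact_maps, dtype=float).mean(axis=0)

def consensus_contacts(contact_maps, min_fraction=0.6):
    """List (i, j, frequency) for pairs i < j seen in at least min_fraction of models."""
    freq = contact_frequencies(contact_maps)
    i_idx, j_idx = np.where(np.triu(freq >= min_fraction, k=1))
    return [(int(i), int(j), float(freq[i, j])) for i, j in zip(i_idx, j_idx)]

# Three alternative four-residue models of the same protein.
models = [
    map_from_pairs(4, [(0, 2), (0, 3)]),
    map_from_pairs(4, [(0, 3)]),
    map_from_pairs(4, [(0, 3), (1, 3)]),
]
consensus = consensus_contacts(models, min_fraction=0.6)
```

The resulting pairs could be passed on as restraints, with the frequency serving as a rough weight.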
PROTMAP2D is written in the Python programming
language. Installation of the Linux version requires preinstallation of Python 2.4, BioPython, PIL, wxPython,
PyExcelerator and PyMedia (see manual for details). The
Windows and Mac versions are essentially standalone programs. The distribution package contains an extensive manual
(describing details of all options) and tutorial with example
analyses for the most typical tasks. A set of example files is also
included (alternative models, ensembles and trajectories).
Fig. 1. Comparison between the crystal structure and the theoretical
model of the I-TevI catalytic domain, and the respective contact maps
(calculated using the 8 Å threshold for the maximal distance between
Cα atoms). The upper triangle represents contacts present in the crystal
structure (Van Roey et al., 2002), the lower triangle represents contacts
present in the earlier ‘blind’ de novo prediction (Bujnicki et al., 2001).
White color represents contacts common between the two models.
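The layout of the comparison map in Fig. 1 can be reproduced in outline with the sketch below. The integer codes (0, 1, 2) and the function names are illustrative assumptions; mapping the codes to colors is left to the plotting tool.

```python
import numpy as np

def map_from_pairs(n, pairs):
    """Build a symmetric boolean contact map from (i, j) index pairs."""
    m = np.zeros((n, n), dtype=bool)
    for i, j in pairs:
        m[i, j] = m[j, i] = True
    return m

def composite_map(map_upper, map_lower):
    """Combine two contact maps of the same protein into one matrix.

    0 = no contact, 1 = contact in one structure only (upper triangle:
    first map, lower triangle: second map), 2 = contact in both.
    """
    a = np.asarray(map_upper, dtype=bool)
    b = np.asarray(map_lower, dtype=bool)
    out = np.zeros(a.shape, dtype=int)
    out[np.triu(a, k=1)] = 1   # first structure above the diagonal
    out[np.tril(b, k=-1)] = 1  # second structure below the diagonal
    out[a & b] = 2             # common contacts, marked in both triangles
    np.fill_diagonal(out, 0)
    return out

crystal = map_from_pairs(4, [(0, 2), (0, 3)])
model = map_from_pairs(4, [(0, 3), (1, 3)])
combined = composite_map(crystal, model)
```

In the figure's terms, cells with code 2 correspond to the contacts drawn in white.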
ACKNOWLEDGEMENTS
This analysis was funded by the Polish Ministry of Science
(grant PBZ–KBN–088/PO4/2003). M.J.P. and I.T. were
supported by the NIH (Fogarty International Center grant
R03 TW007163-01).
Conflict of Interest: none declared.
REFERENCES
Biro,J.C. and Fordos,G. (2005) SeqX: a tool to detect, analyze and visualize
residue co-locations in protein and nucleic acid structures. BMC
Bioinformatics, 6, 170.
Bujnicki,J.M. et al. (2001) Three-dimensional modeling of the I-TevI homing
endonuclease catalytic domain, a GIY-YIG superfamily member, using NMR
restraints and Monte Carlo dynamics. Protein Eng., 14, 717–721.
Diemand,A.V. and Scheib,H. (2004) MolTalk - a programming library for
protein structures and structure analysis. BMC Bioinformatics, 5, 39.
Godzik,A. (1996) The structural alignment between two proteins: is there
a unique answer? Protein Sci., 5, 1325–1338.
Godzik,A. et al. (1993) Regularities in interaction patterns of globular proteins.
Protein Eng., 6, 801–810.
Holm,L. and Sander,C. (1993) Protein structure comparison by alignment of
distance matrices. J. Mol. Biol., 233, 123–138.
Humphrey,W. et al. (1996) VMD: visual molecular dynamics. J. Mol. Graph., 14,
33–38, 27–38.
Mirny,L. and Domany,E. (1996) Protein fold recognition and dynamics in the
space of contact maps. Proteins, 26, 391–410.
Phillips,D.C. (1970) The development of crystallographic enzymology. Biochem.
Soc. Symp., 30, 11–28.
Vendruscolo,M. et al. (1997) Recovery of protein structure from contact maps.
Fold. Des., 2, 295–306.