BIOINFORMATICS APPLICATIONS NOTE
Vol. 23 no. 11 2007, pages 1429–1430
doi:10.1093/bioinformatics/btm124

Structural bioinformatics

PROTMAP2D: visualization, comparison and analysis of 2D maps of protein structure

Michal J. Pietal, Irina Tuszynska and Janusz M. Bujnicki

Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw, Poland

Received November 11, 2006; accepted March 25, 2007
Advance Access publication March 30, 2007
Associate Editor: Prof. Martin Bishop

ABSTRACT

Motivation: Protein structure comparison is a fundamental problem in structural biology and bioinformatics. Two-dimensional maps of distances between residues in the structure contain sufficient information to restore the 3D representation, while maps of contacts reveal characteristic patterns of interactions between secondary and super-secondary structures and are very attractive for visual analysis. The overlap of 2D maps of two structures can be easily calculated, providing a sensitive measure of protein structure similarity. PROTMAP2D is a software tool for calculation of contact and distance maps based on user-defined criteria, quantitative comparison of pairs or series of contact maps (e.g. alternative models of the same protein, model versus native structure, different trajectories from molecular dynamics simulations, etc.) and visualization of the results.

Availability: PROTMAP2D for Windows / Linux / MacOSX is freely available for academic users from http://genesilico.pl/protmap2d.htm

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

Comparison of protein structures is fundamental to structural biology. Typically, the protein structures are represented by spatial coordinates in 3D Cartesian space and are superimposed as rigid bodies.
The optimal superposition of 3D protein structures, and the calculation of the number of residues superimposable under a given distance cutoff, is a nontrivial problem (Godzik, 1996). However, the protein structure can also be represented as a 2D matrix, where the element (i, j) carries information about the interactions of residues i and j (Phillips, 1970). The matrix elements can be distances between particular atoms (distance map) or simple binary (yes/no) information about residue interactions (contact map). In contrast to Cartesian coordinates, the 2D map representation of protein structure is independent of the coordinate frame, which makes it very useful for protein structure comparisons (Mirny and Domany, 1996). For example, DALI, one of the most popular tools for protein structure database searches, relies on comparison of distance matrices (Holm and Sander, 1993).

*To whom correspondence should be addressed.

Many methods have been developed for comparison or visualization of the 3D representation of protein models. In contrast, only a few methods are freely available for comparison or visualization of 2D maps, such as VMD (Humphrey et al., 1996), SeqX (Biro and Fordos, 2005) or iMolTalk (Diemand and Scheib, 2004). These programs, however, provide little flexibility for the user's definition of a contact (type of atom and distance threshold), for comparison of series or ensembles of models, or for manipulation of the results. We developed a new tool, PROTMAP2D, dedicated to the analysis of contact maps and, to some extent, distance maps. Our aim was to facilitate the comparative analysis of protein structures, especially in cases where not only the backbone geometry is important (a property typically analyzed by 3D superposition methods), but also the mutual positions and interactions of residues.
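The two map types defined above can be illustrated with a minimal Python sketch. It is not PROTMAP2D code: the function names, the toy coordinates and the 8 Å default cutoff are assumptions chosen for illustration. The last step checks the frame-independence property for the simple case of a rigid translation.

```python
import math

def distance_map(coords):
    """Symmetric matrix of Euclidean distances between all residue pairs."""
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dmap[i][j] = dmap[j][i] = math.dist(coords[i], coords[j])
    return dmap

def contact_map(dmap, cutoff=8.0):
    """Binary map: 1 if two different residues lie within `cutoff` (in Å)."""
    n = len(dmap)
    return [[1 if i != j and dmap[i][j] <= cutoff else 0 for j in range(n)]
            for i in range(n)]

# Toy one-atom-per-residue trace (hypothetical coordinates in Å,
# not taken from a real protein).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (7.6, 3.8, 0.0), (3.8, 3.8, 0.0)]

dmap = distance_map(coords)
cmap = contact_map(dmap)

# Translating the whole structure changes every coordinate but leaves the
# distance map (and hence the contact map) unchanged.
shifted = [(x + 10.0, y - 5.0, z + 3.0) for x, y, z in coords]
same = all(math.isclose(a, b, abs_tol=1e-9)
           for row_a, row_b in zip(dmap, distance_map(shifted))
           for a, b in zip(row_a, row_b))

for row in cmap:
    print(" ".join(str(v) for v in row))
print("unchanged after translation:", same)
```

In this toy example only residues 0 and 3 are farther apart than the cutoff, so the binary map is 1 everywhere except the diagonal and that one pair.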
Contact maps are a highly illustrative representation of protein structure, attractive for visual analysis, which reveals interactions between residues spatially distant in the structure and clearly depicts all secondary structure elements and contacts between them (Godzik et al., 1993). It has also been demonstrated that the 3D structure of a protein can be reproduced from its contact map with considerable accuracy (Vendruscolo et al., 1997).

PROTMAP2D allows for quantitative and qualitative (visual) analysis of contact maps of protein structures: individual models, different models of the same protein (e.g. structures solved experimentally under different conditions, theoretical models calculated with different programs), ensembles of structures (e.g. from NMR or de novo folding) and trajectories (e.g. from molecular dynamics simulations). The user can specify the definition of the contact: the type of atom to be included in calculations, a maximal distance (in Å) between atoms to be considered in contact, and a minimal separation of amino acids along the sequence. The user can also specify a group of particular residues (or several segments of the protein chain) and monitor the changes in the number of contacts between them in a trajectory or an ensemble.

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]

Full distance maps can also be calculated and analyzed for individual models and then converted to contact maps. Options for visualization of distance maps include generation of derivative contact (binary) maps on the fly, as the user changes the parameters that define the contact. This allows for convenient selection of parameters for contact map calculation that capture the feature to be analyzed across a large set of models (e.g. spatial proximity of a given pair of secondary structure elements).
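The user-defined contact criteria described above (maximal distance and minimal sequence separation) and the monitoring of contacts between two residue groups can be sketched as follows. This is an illustrative sketch only, not PROTMAP2D's implementation: the function names `contacts` and `group_contacts`, the one-atom-per-residue model and the toy two-frame "trajectory" are all assumptions.

```python
import math

def contacts(coords, cutoff, min_sep):
    """Set of residue index pairs (i, j), j - i >= min_sep, within `cutoff` Å."""
    n = len(coords)
    return {(i, j)
            for i in range(n)
            for j in range(i + min_sep, n)
            if math.dist(coords[i], coords[j]) <= cutoff}

def group_contacts(contact_set, group_a, group_b):
    """Number of contacts linking a residue in group_a to one in group_b."""
    a, b = set(group_a), set(group_b)
    return sum(1 for i, j in contact_set
               if (i in a and j in b) or (i in b and j in a))

# Hypothetical coordinates (Å): a hairpin-like frame and an extended frame.
hairpin = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
           (7.6, 5.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0)]
extended = [(3.8 * k, 0.0, 0.0) for k in range(6)]
trajectory = [hairpin, extended]

# Varying the threshold changes the derived binary map, which is the idea
# behind choosing contact parameters interactively.
sweep = [len(contacts(hairpin, cutoff, min_sep=3))
         for cutoff in (6.0, 8.0, 10.0)]

# Contacts between the two "strands" (residues 0-2 and 3-5) in each frame.
counts = [group_contacts(contacts(frame, 8.0, 3), [0, 1, 2], [3, 4, 5])
          for frame in trajectory]

print("contacts at 6, 8, 10 Å:", sweep)
print("inter-strand contacts per frame:", counts)
```

Storing each map as a set of index pairs rather than a full matrix keeps the counting step simple; a full binary matrix would work equally well.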
PROTMAP2D calculates contact maps for all uploaded 3D models and provides many options for their visualization (see e.g. Fig. 1, or examples available in the tutorial or at the PROTMAP2D website, http://genesilico.pl/protmap2d.htm). It also displays statistics (density of contacts, conservation of common contacts, etc.) and allows saving the output as bitmap graphics or ASCII files. Maps generated in PROTMAP2D can also be exported as matrices in the PHYLIP, CLANS or Microsoft EXCEL formats, or as CASP or EVA residue–residue contact files, allowing for visualization in different programs or further processing, e.g. by clustering methods. Our program also allows uploading of previously calculated 2D maps saved in the above-mentioned formats, thus enabling comparison of e.g. computationally predicted and native maps.

PROTMAP2D generates a particularly visually appealing output for MD trajectories comprising multiple conformations (e.g. from protein folding or unfolding simulations). It produces a ‘movie’ (MPEG file, example available at the program website), in which each frame displays in white the contacts found in the current conformation, as well as a fading ‘trace’ of contacts observed in previous conformations. Given two alternative trajectories, PROTMAP2D calculates the fraction of common contacts for all model pairs and visualizes the distance matrix between the two trajectories, facilitating identification of folding intermediates.

Our program does not conduct alignments of contact maps for non-identical proteins, e.g. remotely related homologs; it performs analyses only for alternative models of the same protein. Analysis of incomplete structures that differ in length (e.g. crude fold-recognition models that exhibit deletions or crystallographic structures with disordered parts) is, however, possible, as long as the corresponding residues retain the same index number in all models.
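The pairwise comparison of two trajectories can be sketched as below. The text does not state how the fraction of common contacts is normalized, so this sketch assumes shared contacts divided by the union of contacts in the two models, and takes one minus that fraction as the distance; both choices, the function names and the toy contact sets are hypothetical. Contacts are keyed by residue numbers, so models of different length remain comparable as long as the numbering is consistent.

```python
def common_fraction(contacts_a, contacts_b):
    """Shared contacts divided by all contacts seen in either model.

    The normalization by the union is an assumption made for this sketch.
    """
    union = contacts_a | contacts_b
    return len(contacts_a & contacts_b) / len(union) if union else 1.0

def cross_matrix(traj_a, traj_b):
    """Fraction of common contacts for every (model in A, model in B) pair."""
    return [[common_fraction(a, b) for b in traj_b] for a in traj_a]

# Hypothetical contact sets; each pair holds two residue numbers.
traj_a = [{(1, 5), (2, 6)}, {(1, 5), (2, 6), (3, 9)}]
traj_b = [{(1, 5)}, {(1, 5), (2, 6), (3, 9)}]

fractions = cross_matrix(traj_a, traj_b)
# Assumed conversion of similarity to distance for display as a matrix.
distances = [[1.0 - f for f in row] for row in fractions]

for row in fractions:
    print(["%.2f" % f for f in row])
```

Pairs of models with a small distance in this matrix share most of their contacts, which is the kind of pattern one would inspect when looking for conformations common to both trajectories.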
PROTMAP2D allows analyses of multimeric proteins and can be used to visualize intermolecular contacts in protein–protein complexes.

PROTMAP2D has been tested thoroughly in our laboratory. We routinely use it to analyze the results of unfolding simulations in search of common pathways. We also use it as the first step in clustering of protein models generated by fold-recognition or de novo folding and to infer the most probable contacts from a set of alternative models, which can be used as distance restraints in subsequent folding simulations. We hope that our method will also become a valuable resource for other groups involved in protein structure prediction, comparison of models and simulations of protein dynamics.

PROTMAP2D is written in the Python programming language. Installation of the Linux version requires preinstallation of Python 2.4, BioPython, PIL, wxPython, PyExcelerator and PyMedia (see manual for details). The Windows and Mac versions are essentially standalone programs. The distribution package contains an extensive manual (describing details of all options) and a tutorial with example analyses for the most typical tasks. A set of example files is also included (alternative models, ensembles and trajectories).

Fig. 1. Comparison between the crystal structure and the theoretical model of the I-TevI catalytic domain, and the respective contact maps (calculated using the 8 Å threshold for the maximal distance between Cα atoms). The upper triangle represents contacts present in the crystal structure (Van Roey et al., 2002), the lower triangle represents contacts present in the earlier ‘blind’ de novo prediction (Bujnicki et al., 2001). White color represents contacts common between the two models.

ACKNOWLEDGEMENTS

This analysis was funded by the Polish Ministry of Science (grant PBZ–KBN–088/PO4/2003). M.J.P. and I.T. were supported by the NIH (Fogarty International Center grant R03 TW007163-01).

Conflict of Interest: none declared.
REFERENCES

Biro,J.C. and Fordos,G. (2005) SeqX: a tool to detect, analyze and visualize residue co-locations in protein and nucleic acid structures. BMC Bioinformatics, 6, 170.
Bujnicki,J.M. et al. (2001) Three-dimensional modeling of the I-TevI homing endonuclease catalytic domain, a GIY-YIG superfamily member, using NMR restraints and Monte Carlo dynamics. Protein Eng., 14, 717–721.
Diemand,A.V. and Scheib,H. (2004) MolTalk - a programming library for protein structures and structure analysis. BMC Bioinformatics, 5, 39.
Godzik,A. (1996) The structural alignment between two proteins: is there a unique answer? Protein Sci., 5, 1325–1338.
Godzik,A. et al. (1993) Regularities in interaction patterns of globular proteins. Protein Eng., 6, 801–810.
Holm,L. and Sander,C. (1993) Protein structure comparison by alignment of distance matrices. J. Mol. Biol., 233, 123–138.
Humphrey,W. et al. (1996) VMD: visual molecular dynamics. J. Mol. Graph., 14, 33–38, 27–38.
Mirny,L. and Domany,E. (1996) Protein fold recognition and dynamics in the space of contact maps. Proteins, 26, 391–410.
Phillips,D.C. (1970) The development of crystallographic enzymology. Biochem. Soc. Symp., 30, 11–28.
Vendruscolo,M. et al. (1997) Recovery of protein structure from contact maps. Fold. Des., 2, 295–306.