International Workshop and Conference on: Statistical Physics Approaches to Multi-disciplinary Problems
January 07 – 13, 2008, IIT Guwahati, India
Analysis of Protein Structures Using Protein Contacts Networks
Pankaj Barah and Somdatta Sinha
Mathematical Modelling and Computational Biology Group,
Centre for Cellular and Molecular Biology, Uppal Road, Hyderabad, 500007
email: [email protected]
Proteins are biological macromolecules made up of a linear chain of amino acids that is organised into a three-dimensional structure comprising different secondary structural elements. From a topological viewpoint,
native-state protein structures can be modelled, using graph theory, as complex networks of their constituent
amino acid residues and their interactions. A coarse-grained model, the "Protein Contact Network" (PCN), is
built with the Cα atom of each amino acid as a node and a link between any two nodes that lie within a 7 Å
threshold distance, so that both covalent and non-covalent interactions are included. Using this network model
we have studied various network parameters (e.g., shortest path, clustering coefficient, degree distribution, etc.)
in proteins of different structural classes (α, β, α+β, α/β), which are composed primarily of α helices and
β strands. These parameters offer insight into the structural organisation and can be correlated with
biophysical properties of the proteins, such as their folding kinetics [1,2,3].
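The construction described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes the Cα coordinates have already been extracted (for example from a PDB file) into a list of (x, y, z) tuples in ångströms, and the function names are hypothetical. Only the 7 Å threshold and the three network parameters come from the abstract.

```python
import math
from collections import deque

CUTOFF = 7.0  # angstroms; the threshold distance stated in the abstract


def build_pcn(ca_coords, cutoff=CUTOFF):
    """Return adjacency sets: residues i and j are linked when their
    C-alpha atoms lie within `cutoff` of each other."""
    n = len(ca_coords)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degrees(adj):
    """Degree (number of contacts) of every node."""
    return [len(neighbours) for neighbours in adj]


def clustering_coefficient(adj, i):
    """Fraction of pairs of neighbours of node i that are themselves linked."""
    nb = list(adj[i])
    k = len(nb)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nb[b] in adj[nb[a]])
    return 2.0 * links / (k * (k - 1))


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop distance from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected pairs of nodes."""
    total, pairs = 0, 0
    for s in range(len(adj)):
        for t, d in shortest_path_lengths(adj, s).items():
            if t > s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0
```

The all-pairs distance loop is quadratic in the number of residues, which is adequate for single protein chains; a spatial index would be a reasonable substitute for very large structures.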
Proteins are classified to reflect both structural and evolutionary relatedness. Many levels exist in the
hierarchy, but the principal levels are family, super-family and fold. The exact positions of the boundaries
between these levels are to some degree subjective. The evolutionary classification is conservative: where any
doubt about relatedness exists, the SCOP classification makes new divisions at the super-family and family
levels [4]. Proteins clustered together into families are clearly evolutionarily related. Generally this means
that pairwise residue identities between the proteins are 30% or greater. On the other hand, proteins grouped
into a common super-family have low sequence identities, but their functional features suggest that a common
evolutionary origin is probable. Proteins are defined as having a common fold if they have the same major
secondary structural elements in the same arrangement and with the same topological connections. Different
proteins with the same fold often have peripheral elements of secondary structure and turn regions that
differ in size and conformation. In some cases these differing peripheral regions may comprise half of the
structure. Proteins placed together in the same fold category may not have a common evolutionary origin
[5, 6, 7].
In this study we have analysed protein structures using the coarse-grained network model (PCN) with the
aim of deciphering three-dimensional structural features from the two-dimensional contact matrices. Since, in
the three-dimensional structure of a protein, a specific combination of secondary structural elements can give
rise to a typical topological configuration, or fold, we have studied a few proteins belonging to some
well-known folds (the ubiquitin fold, the EF hand, and the TIM barrel) using the PCN formalism to analyse their
structural and contact features that may be conserved. We show that the short- and long-range contacts among
different secondary structural elements in each fold type can be identified from the contact matrices, and that
the visualization in the ring graph corresponds directly to the information in the two-dimensional matrix. The
results will be shown in comparison with the network structure and the three-dimensional structures of the
proteins available from the Protein Data Bank [8]. In future, we aim to develop a learning algorithm that can
automatically use the graph formalism to predict the different types of folds and their combinations in proteins.
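As a rough illustration of how contacts among secondary structural elements might be read off a contact matrix, the sketch below builds a binary matrix from Cα coordinates, splits the contacts by sequence separation, and counts contacts between labelled segments. It is an assumption-laden sketch, not the method used in the study: the abstract does not state the sequence separation that divides short-range from long-range contacts, so `LONG_RANGE_SEP` is a hypothetical value, and the function names and the segment dictionary format are likewise invented for illustration.

```python
import math

CUTOFF = 7.0         # angstroms; the threshold distance stated in the abstract
LONG_RANGE_SEP = 12  # hypothetical sequence-separation threshold (assumption)


def contact_matrix(ca_coords, cutoff=CUTOFF):
    """Symmetric 0/1 matrix: entry (i, j) is 1 when the C-alpha atoms of
    residues i and j lie within `cutoff`; the diagonal is left at 0."""
    n = len(ca_coords)
    return [[1 if i != j and math.dist(ca_coords[i], ca_coords[j]) <= cutoff else 0
             for j in range(n)]
            for i in range(n)]


def split_by_range(matrix, min_sep=LONG_RANGE_SEP):
    """Split contacts (i < j) into short-range and long-range lists
    according to their separation j - i along the chain."""
    short_range, long_range = [], []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j]:
                target = long_range if j - i >= min_sep else short_range
                target.append((i, j))
    return short_range, long_range


def element_contacts(matrix, elements):
    """Count contacts between each pair of secondary structural elements.

    `elements` maps a label to an inclusive (start, end) residue index range.
    Pairs of elements with no contacts are omitted from the result."""
    names = list(elements)
    counts = {}
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            s1, e1 = elements[names[a]]
            s2, e2 = elements[names[b]]
            c = sum(matrix[i][j]
                    for i in range(s1, e1 + 1)
                    for j in range(s2, e2 + 1))
            if c:
                counts[(names[a], names[b])] = c
    return counts
```

For example, calling `element_contacts(matrix, {"strand_A": (0, 3), "strand_B": (4, 7)})` on a small hairpin would report how many residue pairs connect the two strands; the labels and ranges here are placeholders and would in practice come from a secondary structure assignment.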
REFERENCES
1. G. Bagler and Somdatta Sinha, Physica A, 346, 27 (2005).
2. G. Bagler and Somdatta Sinha, Bioinformatics, 23, 1760 (2007).
3. N. S. ShijuLal and Somdatta Sinha, Proceedings of the 11th ADNAT Convention on "Advances in Structural Biology and Structure Prediction", 134 (2007).
4. A.G. Murzin, S.E. Brenner, T. Hubbard, and C. Chothia, J. Mol. Biol., 247, 536 (1995).
5. M.G. Rossmann and P. Argos, J. Mol. Biol., 109, 99 (1977).
6. C. Chothia, Annu. Rev. Biochem., 53, 537 (1984).
7. J.P. Overington, Z.Y. Zhu, A. Sali, M.S. Johnson, R. Sowdhamini, G.V. Louie, and T.L. Blundell, Biochem. Soc. Trans., 21, 597 (1993).
8. http://www.rcsb.org/; F.C. Bernstein, T.F. Koetzle, G.J.B. Williams, E.F. Meyer, M.D. Brice, J.R. Rodgers, O. Kennard, T. Shimanouchi, and M. Tasumi, J. Mol. Biol., 112, 535 (1977).