International Workshop and Conference on: Statistical Physics Approaches to Multi-disciplinary Problems
January 07 – 13, 2008, IIT Guwahati, India

Analysis of Protein Structures Using Protein Contacts Networks

Pankaj Barah and Somdatta Sinha
Mathematical Modelling and Computational Biology Group, Centre for Cellular and Molecular Biology, Uppal Road, Hyderabad, 500007
email: [email protected]

Proteins are biological macromolecules made up of a linear chain of amino acids that are organised into a three-dimensional structure comprising different secondary structural elements. From a topological viewpoint, native-state protein structures can be modelled, using Graph Theory, as complex networks of their constituent amino acid residues and their interactions. A coarse-grained model, the "Protein Contact Network" (PCN), is built with the Cα atom of each amino acid as a node, and a link is placed between any two nodes that lie within a threshold distance of 7 Å, so that both covalent and non-covalent interactions are included. Using this network model we have studied various network parameters (e.g., Shortest Path, Clustering Coefficient, Degree Distribution, etc.) in proteins of different structural classes (α, β, α+β, α/β) comprising primarily α helices and β strands. These parameters offer insight into the structural organisation and can be correlated with biophysical properties of the proteins, such as their folding kinetics [1,2,3].

Proteins are classified to reflect both structural and evolutionary relatedness. Many levels exist in the hierarchy, but the principal levels are family, super-family and fold. The exact positions of the boundaries between these levels are to some degree subjective. The evolutionary classification is conservative: where any doubt about relatedness exists, the SCOP classification makes new divisions at the super-family and family levels [4]. Proteins clustered together into families are clearly evolutionarily related. Generally, this means that pairwise residue identities between the proteins are 30% or greater.
On the other hand, proteins grouped into a common super-family have low sequence identities, but their functional features suggest that a common evolutionary origin is probable. Proteins are defined as having a common fold if they have the same major secondary structural elements in the same arrangement with the same topological connections. Different proteins with the same fold often have peripheral elements of secondary structure and turn regions that differ in size and conformation. In some cases these differing peripheral regions may comprise half of the structure. Proteins placed together in the same fold category may not have a common evolutionary origin [5, 6, 7].

In this study we have analysed protein structures using the coarse-grained network model (PCN) with the aim of deciphering three-dimensional structural features from the two-dimensional contact matrices. Since, in the three-dimensional structure of a protein, a specific combination of secondary structural elements can give rise to a typical topological configuration, or fold, we have studied a few proteins belonging to some well-known folds (Ubiquitin, the EF-hand, and the TIM barrel) using the PCN formalism to analyse their structural and contact features that may be conserved. We show that the short- and long-range contacts among different secondary structural elements in each fold type can be identified from the contact matrices, and that the ring-graph visualization can easily be related to the information in the two-dimensional matrix. The results will be shown in comparison with the network structure and the three-dimensional structures of the proteins available from the Protein Data Bank [8]. In future, we aim to develop a learning algorithm that can automatically use the graph formalism to predict the different types of folds and their combinations in proteins.

REFERENCES

1. G. Bagler and Somdatta Sinha, Physica A, 346, 27 (2005).
2. G. Bagler and Somdatta Sinha, Bioinformatics, 23, 1760 (2007).
3. N. S. ShijuLal and Somdatta Sinha, Proceedings of the 11th ADNAT Convention on "Advances in Structural Biology and Structure Prediction", 134 (2007).
4. A. G. Murzin, S. E. Brenner, T. Hubbard, C. Chothia, J. Mol. Biol., 247, 536 (1995).
5. M. G. Rossmann and P. Argos, J. Mol. Biol., 109, 99 (1977).
6. C. Chothia, Annu. Rev. Biochem., 53, 537 (1984).
7. J. P. Overington, Z. Y. Zhu, A. Sali, M. S. Johnson, R. Sowdhamini, G. V. Louie, and T. L. Blundell, Biochem. Soc. Trans., 21, 597 (1993).
8. http://www.rcsb.org/; F. C. Bernstein, T. F. Koetzle, G. J. B. Williams, E. F. Meyer, M. D. Brice, J. R. Rodgers, O. Kennard, T. Shimanouchi, M. Tasumi, J. Mol. Biol., 112, 535 (1977).
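
ILLUSTRATIVE SKETCH

The PCN construction described in the abstract (Cα atoms as nodes, a link for every pair within 7 Å) and the network parameters it mentions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the made-up hairpin coordinates, and the sequence-separation cutoff used to split short-range from long-range contacts are all hypothetical and chosen only for the example; real input would be Cα coordinates read from a Protein Data Bank entry.

```python
import math
from collections import deque


def contact_network(ca_coords, cutoff=7.0):
    """Link residues i and j if their C-alpha atoms lie within `cutoff` angstroms."""
    n = len(ca_coords)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def clustering_coefficient(adj):
    """Average, over all nodes, of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for nbrs in adj:
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute zero
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over connected node pairs (breadth-first search)."""
    total, pairs = 0, 0
    for src in range(len(adj)):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def split_contacts(adj, min_separation):
    """Split contacts by sequence separation |i - j| (the cutoff is an assumption)."""
    short, long_range = [], []
    for i, nbrs in enumerate(adj):
        for j in nbrs:
            if j > i:
                target = long_range if j - i >= min_separation else short
                target.append((i, j))
    return short, long_range


# Toy, made-up coordinates (not a real protein): a 10-residue hairpin with two
# antiparallel strands 5 angstroms apart and 3.8 angstroms between neighbours.
strand_a = [(3.8 * k, 0.0, 0.0) for k in range(5)]
strand_b = [(3.8 * k, 5.0, 0.0) for k in range(4, -1, -1)]
toy_coords = strand_a + strand_b

adj = contact_network(toy_coords, cutoff=7.0)
degrees = [len(nbrs) for nbrs in adj]
clustering = clustering_coefficient(adj)
path_len = characteristic_path_length(adj)
short, long_range = split_contacts(adj, min_separation=4)  # illustrative cutoff

print("degrees:", degrees)
print("clustering coefficient:", round(clustering, 3))
print("characteristic path length:", round(path_len, 3))
print("short-range contacts:", len(short), "long-range contacts:", len(long_range))
```

In the toy hairpin, the long-range contacts are exactly the cross-strand pairs, which is the kind of pattern that appears as an off-diagonal band in a two-dimensional contact matrix. The adjacency sets are equivalent to the rows of that matrix; a dense matrix representation would work equally well for proteins of typical size.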