COMPUTATIONAL STRUCTURAL AND FUNCTIONAL PROTEOMICS

SYMMETRY AND SPATIAL STRUCTURE OF THE CANONICAL SET OF AMINO ACIDS

Karasev V.A.*1, Luchinin V.V.1, Stefanov V.E.2
1 Saint-Petersburg State Electrotechnical University “LETI”, Saint-Petersburg, Russia
2 Saint-Petersburg State University, Saint-Petersburg, Russia
* Corresponding author: e-mail: [email protected]

Keywords: canonical set of amino acids, genetic code, icosahedron, dodecahedron, spatial structure

Summary

Motivation: The nature of the canonical set of twenty amino acids remains an unsolved problem. We therefore analyzed the group properties of the amino acids that compose the canonical set. The dodecahedron was used to depict the derived principles.

Results: Analysis of the properties of a set of 12 meridian cycles obtained on the structure of the duplet genetic code, which is isomorphic to the Boolean hypercube B4, revealed four groups of cycles united in pairs by anti-symmetry transformations of two types. These transformations are most clearly illustrated on the icosahedron, a polyhedron with 12 vertices. Related to the icosahedron is another polyhedron, the dodecahedron, which has 20 vertices. An approach based on these two polyhedra was applied to the analysis of the structure of the canonical set of 20 amino acids. It was demonstrated that four groups of amino acids, each containing five amino acids connected by anti-symmetry transformations of two types, can be distinguished in the initial set. The revealed principles were represented pictorially on the structure of the dodecahedron.

Availability: http://genetic-code.narod.ru/

Introduction

The development of spatial models of the duplet and triplet genetic code, isomorphic to the Boolean hypercubes B4 and B6, respectively (Klump, 1993; Jimenez-Montaño et al., 1996; Karasev, Sorokin, 1997), is an important achievement.
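The isomorphism between the 16 duplets and the hypercube B4 can be illustrated with a minimal sketch. The 2-bit assignment of letters used below is an assumption made for illustration only (the text does not specify one); it is chosen so that CC and AA are opposite vertices and the interchange C ↔ A, G ↔ U flips every bit. The names `BITS`, `encode`, `decode` and `antipode` are hypothetical helpers, not part of the authors' model.

```python
from itertools import combinations, permutations, product

# ASSUMPTION: this 2-bit assignment is illustrative and not taken from the paper.
# It makes C<->A and G<->U bitwise complements of each other.
BITS = {"C": (0, 0), "G": (0, 1), "U": (1, 0), "A": (1, 1)}
LETTER = {bits: letter for letter, bits in BITS.items()}


def encode(duplet):
    """Map a two-letter duplet to a vertex of B4 (a 4-bit tuple)."""
    return BITS[duplet[0]] + BITS[duplet[1]]


def decode(vertex):
    """Map a 4-bit tuple back to its duplet."""
    return LETTER[vertex[:2]] + LETTER[vertex[2:]]


def antipode(duplet):
    """Flip all four bits, i.e. apply C<->A and G<->U to both letters."""
    return decode(tuple(1 - b for b in encode(duplet)))


duplets = ["".join(pair) for pair in product("CGUA", repeat=2)]

# Tiers: group duplets by Hamming weight (number of 1-bits in the code).
tiers = {}
for d in duplets:
    tiers.setdefault(sum(encode(d)), []).append(d)
tier_sizes = [len(tiers[w]) for w in range(5)]

# Edges of B4: pairs of duplets whose codes differ in exactly one bit.
edges = [
    (a, b)
    for a, b in combinations(duplets, 2)
    if sum(x != y for x, y in zip(encode(a), encode(b))) == 1
]

# Shortest paths from CC to AA: set the four bits in every possible order.
shortest_paths = []
for order in permutations(range(4)):
    vertex = [0, 0, 0, 0]
    path = [decode(tuple(vertex))]
    for bit in order:
        vertex[bit] = 1
        path.append(decode(tuple(vertex)))
    shortest_paths.append(path)

print(tier_sizes)                        # vertices per tier
print(len(edges), len(shortest_paths))   # edge count, shortest-path count
print(antipode("CG"))                    # duplet opposite to CG
```

Under this assumed encoding the duplets fall into five tiers of 1, 4, 6, 4 and 1 vertices, and there are 4! = 24 shortest paths between CC and AA. Pairing each ascent with one return path would give 12 closed paths, which is numerically consistent with the twelve meridian cycles reported in the Model section; the pairing rule itself is not reconstructed here.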
However, the proposed structures deal with the duplet and triplet code only, ignoring the nature of the canonical set of 20 amino acids. This set must have its own structural principles, which remain obscure. The genetic code should be regarded as a natural system of amino acid organization. Within the model developed by us (Karasev, 2003; Karasev, Stefanov, 2001), the side chains of amino acids, encoded by triplets, are treated as physical operators reconstructing the encoded structure. They include connectivity operators (polar amino acids), encoded by codons that have G or A in the second position of the triplet, and anticonnectivity operators, encoded by triplets with C or U in the second position. Another classification based on the genetic code addresses the idea of complementarity of amino acids encoded by complementary triplets (Mekler, Idlis, 1993). There are alternative approaches based on classifying the amino acid side chains according to their physicochemical properties (Campbell, Smith, 1994) and according to the particular way amino acids function in the protein structure (Karasev, 2003). However, the system of twenty amino acids may have its own spatial representation that is only indirectly connected with the code.

The present study aims to analyze the structure of the canonical set of amino acids on the basis of the principles of symmetry and anti-symmetry, and to develop a spatial model that provides an illustrative representation of the structure of the set. This problem was addressed earlier in a preliminary study (Karasev, 2004).

278 BGRS'2004

Model

The model rests on the following prerequisites:
a) the number of amino acids in the canonical set coincides with the number of vertices of the dodecahedron;
b) there exists a structure related to the dodecahedron, i.e. the icosahedron;
c) the side chains of amino acids are functionally connected with the spatial-temporal self-organization of protein molecules;
d) spatial structures isomorphic to Boolean hypercubes can be used to describe the process of self-organization of protein molecules;
e) the icosahedron represents the spatial structure of the meridian cycles of the duplet genetic code.

The structure of the duplet genetic code is known to be isomorphic to the Boolean hypercube B4 (Karasev, 2003). Let us move along the structure of the genetic code (Fig. 1a) from the vertex situated in the first tier to the vertex situated in the fifth tier and then back along the symmetrical path. The resulting cyclic pathway, shown in Fig. 1b, can be called a meridian cycle (M-cycle).

Fig. 1. Structure of the duplet genetic code (a) and meridian cycle highlighted on this structure (b).

The total number of M-cycles connected by relations of anti-symmetry that can be identified on the structure of the duplet genetic code is twelve. The icosahedron is the spatial structure that most clearly incorporates the anti-symmetry principles of the M-cycles (Fig. 2). As seen from Fig. 2, plane I separates M-cycles interrelated by an anti-symmetry operation in which the letters of the duplets interchange according to the rule C ↔ A, G ↔ U. Plane II separates two groups of M-cycles that have no duplets in common other than CC and AA, the so-called antipode cycles. When this plane is rotated about the axis perpendicular to plane I (C2), vertices with antipode cycles coincide. Since the icosahedron is related to the dodecahedron, we applied the anti-symmetry principles established on the icosahedron to analyze the arrangement of amino acids on the dodecahedron.

Results and Discussion

The analysis revealed two groups of amino acids that have similar properties but different structure, e.g.
Lys–Arg, Glu–Asp, Asn–Gln, and two groups of amino acids with opposite properties, e.g. Lys–Glu, Arg–Asp, etc. These results are clearly demonstrated on the dodecahedron structure (Fig. 3). Amino acids with similar properties but differing in size occupy positions symmetrical with respect to plane I, whereas amino acids with opposite properties occupy vertices that coincide with each other upon rotation of the dodecahedron about axis C2. Thus, the set of 20 canonical amino acids consists of four groups, containing five amino acids each, which are interrelated by anti-symmetry transformations.

At present, we are analyzing proteins on the basis of the developed structural model, looking for construction principles for polypeptide chains that can be implemented in nanoelectronics and sensorics.

Fig. 2. Localization of meridian cycles of the duplet genetic code on the icosahedron. (I) – anti-symmetry plane separating antipodes of the 1st type; (II) – plane separating antipodes of the 2nd type.

Fig. 3. System of amino acids, connected by relations of anti-symmetry, constructed on the dodecahedron. (I) – anti-symmetry plane separating amino acids with similar properties; (II) – plane separating amino acids with opposite properties.

References

Campbell P.N., Smith A.D. Biochemistry Illustrated. Edinburgh – London – Madrid – Tokyo: Churchill Livingstone, 1994. P. 8–9.
Jimenez-Montaño M.A., de la Mora-Basañez C.R., Pöschel Th. The hypercube structure of the genetic code explains conservative and non-conservative aminoacid substitutions in vivo and in vitro // BioSystems. 1996. V. 39. P. 117–125.
Karasev V.A. Genetic Code: New Horizons. SPb: Tessa, 2003. 145 p. (Russ.).
Karasev V.A. On anti-symmetry of the canonical set of amino acids. Dep. VINITI, 23.03.2004, N 470-B2004 (Russ.).
Karasev V.A., Sorokin S.G. Topological structure of the genetic code // Russ. J. Genetics. 1997. V. 33. P. 622–628.
Karasev V.A., Stefanov V.E. Topological nature of the genetic code // J. Theor. Biol. 2001. V. 209. P. 303–317.
Klump H.H. The physical basis of the genetic code: the choice between speed and precision // Arch. Biochem. Biophys. 1993. V. 301. P. 207–209.
Mekler L.B., Idlis R.G. General stereochemical genetic code – towards biology and universal medicine of the XXI century // Priroda. 1993. N 5. P. 29–63 (Russ.).