Swetlana Nikolajewa, Thomas Wilhelm
Theoretical Systems Biology
Institute of Molecular Biotechnology, Jena

Overview

- The genetic code: introduction
- The new classification scheme of the genetic code shows:
  - symmetry characteristics
  - an explanation for the number (22) of tRNA genes in the mammalian mitochondrial genome
  - amino acid patterns and regularities of codons (strong, mixed and weak codons)
  - possible predecessors of our contemporary quaternary triplet code

The Genetic Code

- Triplets of the nucleotide bases A, G, C, U are used to code for 20 amino acids
- two purines (A, G), two pyrimidines (C, U)
- 64 possible codons (4 x 4 x 4 = 4^3)
- 3 termination codons: UGA, UA(G/A)
- 61 codons for amino acid coding
- the Met codon (AUG) is also the start codon

The Common Genetic Code Table

                      2nd base
1st base   U          C          A          G          3rd base
U          UUU Phe    UCU Ser    UAU Tyr    UGU Cys    U
           UUC Phe    UCC Ser    UAC Tyr    UGC Cys    C
           UUA Leu    UCA Ser    UAA Stop   UGA Stop   A
           UUG Leu    UCG Ser    UAG Stop   UGG Trp    G
C          CUU Leu    CCU Pro    CAU His    CGU Arg    U
           CUC Leu    CCC Pro    CAC His    CGC Arg    C
           CUA Leu    CCA Pro    CAA Gln    CGA Arg    A
           CUG Leu    CCG Pro    CAG Gln    CGG Arg    G
A          AUU Ile    ACU Thr    AAU Asn    AGU Ser    U
           AUC Ile    ACC Thr    AAC Asn    AGC Ser    C
           AUA Ile    ACA Thr    AAA Lys    AGA Arg    A
           AUG Met    ACG Thr    AAG Lys    AGG Arg    G
G          GUU Val    GCU Ala    GAU Asp    GGU Gly    U
           GUC Val    GCC Ala    GAC Asp    GGC Gly    C
           GUA Val    GCA Ala    GAA Glu    GGA Gly    A
           GUG Val    GCG Ala    GAG Glu    GGG Gly    G

The new classification scheme of the genetic code

- binary representation: purines (A, G) → 1, pyrimidines (C, U) → 0
- 2^3 = 8 different binary triplets: 000, 001, …, 111
- each of these has again 8 possibilities, for instance:
  - 000 stands for three pyrimidines: CCC, CCU, UUC, …, UUU
  - 111 stands for three purines: GGG, GGA, GAA, …, AAA
- C and G each bind via 3 hydrogen bonds in complementary base pairing
- A and U each bind via 2 hydrogen bonds in complementary base pairing

The Common Genetic Code Table contains 64 fields…

The new classification scheme (standard genetic code)

Code   Strong codons      Mixed codons        Mixed codons       Weak codons
       6 hydrogen bonds   5 hydrogen bonds    5 hydrogen bonds   4 hydrogen bonds
000    Pro CC (C/U)       Ser UC (C/U)        Leu CU (C/U)       Phe UU (C/U)
001    Pro CC (A/G)       Ser UC (A/G)        Leu CU (A/G)       Leu UU (A/G)
100    Ala GC (C/U)       Thr AC (C/U)        Val GU (C/U)       Ile AU (C/U)
101    Ala GC (A/G)       Thr AC (A/G)        Val GU (A/G)       Ile/Met AU (A/G)
010    Arg CG (C/U)       Cys UG (C/U)        His CA (C/U)       Tyr UA (C/U)
011    Arg CG (A/G)       Stop/Trp UG (A/G)   Gln CA (A/G)       Stop UA (A/G)
110    Gly GG (C/U)       Ser AG (C/U)        Asp GA (C/U)       Asn AA (C/U)
111    Gly GG (A/G)       Arg AG (A/G)        Glu GA (A/G)       Lys AA (A/G)

Abbreviations: Ala Alanine, Arg Arginine, Asn Asparagine, Asp Aspartic acid, Cys Cysteine, Gln Glutamine, Glu Glutamic acid, Gly Glycine, His Histidine, Ile Isoleucine, Leu Leucine, Lys Lysine, Met Methionine, Phe Phenylalanine, Pro Proline, Ser Serine, Thr Threonine, Trp Tryptophan, Tyr Tyrosine, Val Valine.

…the new scheme contains the same information in only 32 fields.
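The reduction from 64 to 32 fields can be checked with a short sketch. It is an illustration only, not the authors' code: the names `STANDARD_CODE`, `binary_class`, `doublet_bonds` and `fields` are hypothetical, and the amino acid string is the standard code written in the base order U, C, A, G with one-letter symbols and `*` for stop. A field is taken to be the first two bases plus the purine/pyrimidine class of the third base.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated with first, second and third base
# each running over U, C, A, G; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
H_BONDS = {"C": 3, "G": 3, "A": 2, "U": 2}


def binary_class(codon):
    """Map purines (A, G) to 1 and pyrimidines (C, U) to 0."""
    return "".join("1" if base in "AG" else "0" for base in codon)


def doublet_bonds(bases):
    """Hydrogen bonds formed by the first two bases: 6, 5 or 4."""
    return H_BONDS[bases[0]] + H_BONDS[bases[1]]


# Assumption: one field = (binary class of the triplet, first two bases).
fields = {}
for codon, aa in STANDARD_CODE.items():
    fields.setdefault((binary_class(codon), codon[:2]), set()).add(aa)

# Fields whose two codons do not share one meaning.
ambiguous = {key: aas for key, aas in fields.items() if len(aas) > 1}
# Number of fields per column (strong = 6, mixed = 5, weak = 4 bonds).
fields_per_strength = Counter(doublet_bonds(doublet) for _, doublet in fields)

print(len(STANDARD_CODE), "codons in", len(fields), "fields")
print("fields with two meanings:", ambiguous)
print("fields per hydrogen-bond count:", dict(fields_per_strength))
```

Under these assumptions only two fields carry two meanings, UG(A/G) and AU(A/G), which correspond to the Stop/Trp and Ile/Met entries of the scheme.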
Deviations from the Standard Code

[Table: the classification scheme of the standard genetic code with deviation counts attached to individual fields. The counts 1/1, 1/0, 1/0, 1/2, 5/0, 2/4, 9/0, 6/6 and 3/0 appear on the slide; their assignment to fields is not recoverable.]

Source: http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi

Mitochondrial Genomes Have Several Surprising Features

- the genetic code of mitochondria
- only 22 tRNAs are required for mammalian mitochondrial protein synthesis

The Mammalian Mitochondrial Genetic Code

Code   Strong codons      Mixed codons        Mixed codons       Weak codons
       6 hydrogen bonds   5 hydrogen bonds    5 hydrogen bonds   4 hydrogen bonds
000    Pro CC (C/U)       Ser UC (C/U)        Leu CU (C/U)       Phe UU (C/U)
001    Pro CC (A/G)       Ser UC (A/G)        Leu CU (A/G)       Leu UU (A/G)
100    Ala GC (C/U)       Thr AC (C/U)        Val GU (C/U)       Ile AU (C/U)
101    Ala GC (A/G)       Thr AC (A/G)        Val GU (A/G)       Met AU (A/G)
010    Arg CG (C/U)       Cys UG (C/U)        His CA (C/U)       Tyr UA (C/U)
011    Arg CG (A/G)       Trp UG (A/G)        Gln CA (A/G)       Stop UA (A/G)
110    Gly GG (C/U)       Ser AG (C/U)        Asp GA (C/U)       Asn AA (C/U)
111    Gly GG (A/G)       Stop AG (A/G)       Glu GA (A/G)       Lys AA (A/G)

Source: http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi

The Mammalian Mitochondrial Code: 8 tRNAs for family codons + 14 tRNAs for non-family codons

Code   Strong codons      Mixed codons          Mixed codons         Weak codons
       6 hydrogen bonds   5 hydrogen bonds      5 hydrogen bonds     4 hydrogen bonds
000    tRNA-Pro CC        tRNA-Ser1 UC          tRNA-Leu1 CU         tRNA-Phe UU (C/U)
001    tRNA-Pro CC        tRNA-Ser1 UC          tRNA-Leu1 CU         tRNA-Leu2 UU (A/G)
100    tRNA-Ala GC        tRNA-Thr AC           tRNA-Val GU          tRNA-Ile AU (C/U)
101    tRNA-Ala GC        tRNA-Thr AC           tRNA-Val GU          tRNA-Met AU (A/G)
010    tRNA-Arg CG        tRNA-Cys UG (C/U)     tRNA-His CA (C/U)    tRNA-Tyr UA (C/U)
011    tRNA-Arg CG        tRNA-Trp UG (A/G)     tRNA-Gln CA (A/G)    STOP UA (A/G)
110    tRNA-Gly GG        tRNA-Ser2 AG (C/U)    tRNA-Asp GA (C/U)    tRNA-Asn AA (C/U)
111    tRNA-Gly GG        STOP AG (A/G)         tRNA-Glu GA (A/G)    tRNA-Lys AA (A/G)

Family codons: entries without a third-base class (Pro, Ser1, Leu1, Ala, Thr, Val, Arg, Gly) span both rows of their pair.

Source: http://mamit-trna.u-strasbg.fr/2DStructures.html

Amino acid patterns: Polar requirement of NCN and NUN codons

[Table: the classification scheme of the standard genetic code, as above, illustrating the polar requirement of NCN and NUN codons.]

C. R. Woese, G. J. Olsen, M. Ibba, D. Söll. Aminoacyl-tRNA Synthetases, the Genetic Code, and the Evolutionary Process. MMBR 2000 (64) 202-236

Amino acid patterns: Hydrophobicity
[Table: the classification scheme of the standard genetic code, as above, illustrating amino acid hydrophobicity.]

Source: Kyte & Doolittle, 1982; http://biology-pages.info

Codon-Anticodon symmetry

[Table: the classification scheme of the standard genetic code, as above, illustrating the codon-anticodon symmetry.]

Point symmetry

[Table: the classification scheme of the standard genetic code, as above, illustrating the point symmetry.]

D. Halitsky. Extending the (Hexa-)Rhombic Dodecahedral Model of the Genetic Code: the Code's Four 6-fold Degeneracies and the Ten Orthogonal Projections of the 5-cube as 3-cube. Computer Systems Technology 2004

Correlation of codon strength and amino acid properties

Measure                                      Strong codons   Mixed codons   Weak codons

Dinucleoside monophosphates
Hydrophilicity (Weber & Lacey 1978)          1.686           1.434          1.235
Hydrophilicity (Barzilay et al. 1973)        2.72            2.26           2.26
Hydrophobicity (Garel et al. 1973)           2.556           3.413          3.982

Amino acids
Molec. Weight (Handbook value)               907             1065.6         1217.5
Molec. Volume (Grantham 1974)                381             637.5          906
Refractivity (Jones 1975)                    83.86           140.03         186.51
Alpha pK1 (Zimmerman et al. 1968)            16.96           17.11          17.43
Bulkiness (Zimmerman et al. 1968)            93.22           124.345        143.54
Specific volume (McMeekin et al. 1964)       5.26            5.37           5.8
Polarity (Zimmerman et al. 1968)             107.16          109.58         58.14
Polarity (Woese et al. 1967)                 61.2            59.15          51
Polarity (Grantham 1974)                     71.2            67             56.3
Hydrophobicity (Jones 1975)                  9.18            8.385          16.93
Hydrophobicity (Levitt 1976)                 -2.2            1.6            8.8
Hydrophobicity (Bull & Breese 1974)          3880            -165           -6790
Hydrophilicity (Weber & Lacey 1978)          7.02            6.585          5.59
Partition coefficient (Garel et al. 1973)    1.88            5.58           7.6
Sequence Frequency (Jungck 1971)             4280            3522           2966

Evolution of the genetic code

- our contemporary code is the quaternary triplet code: 4^3 = 64 fields (rows 00*, 01*, 10*, 11*; entries CGU, UAC, …)
- quaternary doublet code: 4^2 = 16 fields (rows 00, 01, 10, 11; entries CGU, UAC, …)
- binary doublet code: 4^1 = 4 fields (00, 01, 10, 11)

Evidence: Evolution of the Genetic Code

[Table: the classification scheme of the standard genetic code, as above, shown on two consecutive slides as evidence for the evolution of the code.]

Outlook

- Looking for binary patterns in the genomes

Additional information

http://www.imb-jena.de/~sweta/genetic_code/

Acknowledgment

Maik Friedel
Andreas Beyer
Frank Grosse

Thank you for your attention!

The new classification scheme of the standard genetic code

[Table: the classification scheme of the standard genetic code, as above.]

T. Wilhelm, S. Nikolajewa. A new classification scheme of the genetic code. J. Mol. Evol. (2004) 59: 598-605
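The tRNA count stated for the mammalian mitochondrial code (8 family + 14 non-family = 22) can be reproduced with a minimal sketch. It rests on two assumptions that are not taken from the slides: the amino acid string below is the vertebrate mitochondrial code (NCBI translation table 2) written in the base order U, C, A, G, and the counting rule is a simplification in which one tRNA reads a whole four-codon family box and one tRNA reads each two-codon (C/U) or (A/G) half of a split box. The names `MITO_CODE`, `family` and `non_family` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Assumption: vertebrate mitochondrial code (NCBI translation table 2),
# codons enumerated with each base running over U, C, A, G; '*' is stop.
MITO_AMINO_ACIDS = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIMMTTTTNNKKSS**"
    "VVVVAAAADDEEGGGG"
)
MITO_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), MITO_AMINO_ACIDS)
}

family = []      # boxes where all four codons code for one amino acid
non_family = []  # two-codon halves of split boxes that code for an amino acid

for first, second in product(BASES, repeat=2):
    doublet = first + second
    halves = {
        "C/U": {MITO_CODE[doublet + base] for base in "CU"},
        "A/G": {MITO_CODE[doublet + base] for base in "AG"},
    }
    # In this code every half has a single meaning.
    assert all(len(meanings) == 1 for meanings in halves.values())
    pyrimidine_aa = next(iter(halves["C/U"]))
    purine_aa = next(iter(halves["A/G"]))
    if pyrimidine_aa == purine_aa and pyrimidine_aa != "*":
        family.append((doublet, pyrimidine_aa))
        continue
    for label, aa in (("C/U", pyrimidine_aa), ("A/G", purine_aa)):
        if aa != "*":  # stop codons need no tRNA
            non_family.append((f"{doublet}({label})", aa))

print("family boxes:", len(family))
print("non-family halves:", len(non_family))
print("tRNAs under this counting rule:", len(family) + len(non_family))
```

The four stop codons, UA(A/G) and AG(A/G), are the only halves that contribute no tRNA under this rule.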