Swetlana Nikolajewa, Thomas Wilhelm
Theoretical Systems Biology
Institute of Molecular Biotechnology, Jena
Overview

- The genetic code: introduction
- The new classification scheme of the genetic code shows:
  - symmetry characteristics
  - an explanation for the number (22) of tRNA genes in the mammalian mitochondrial genome
  - amino acid patterns and regularities of codons (strong, mixed and weak codons)
  - possible predecessors of our contemporary quaternary triplet code
The Genetic Code

- Triplets of the 4 nucleotide bases A, G, C, U are used to code for 20 amino acids
  - two purines (A, G)
  - two pyrimidines (C, U)
- 64 possible codons (4 x 4 x 4 = 4^3)
  - 3 termination codons: UGA, UA(G/A)
  - 61 codons for amino acid coding
  - the Met codon (AUG) is also the start codon
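The counts above can be checked with a short Python sketch. The one-letter amino acid string is the usual compact encoding of the standard code in U, C, A, G order with `*` for stop; the variable names are illustrative and not from the slides.

```python
from itertools import product

BASES = "UCAG"  # conventional table order for 1st, 2nd and 3rd base
# Standard genetic code, one letter per codon in UCAG order ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every triplet (UUU, UUC, UUA, ...) to its amino acid.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
coding_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(CODON_TABLE))    # 64
print(stop_codons)         # ['UAA', 'UAG', 'UGA']
print(len(coding_codons))  # 61
print(len(amino_acids))    # 20
print(CODON_TABLE["AUG"])  # M
```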
The Common Genetic Code Table

1st                       2nd base                        3rd
base   U           C           A           G              base
U      UUU Phe     UCU Ser     UAU Tyr     UGU Cys        U
       UUC Phe     UCC Ser     UAC Tyr     UGC Cys        C
       UUA Leu     UCA Ser     UAA Stop    UGA Stop       A
       UUG Leu     UCG Ser     UAG Stop    UGG Trp        G
C      CUU Leu     CCU Pro     CAU His     CGU Arg        U
       CUC Leu     CCC Pro     CAC His     CGC Arg        C
       CUA Leu     CCA Pro     CAA Gln     CGA Arg        A
       CUG Leu     CCG Pro     CAG Gln     CGG Arg        G
A      AUU Ile     ACU Thr     AAU Asn     AGU Ser        U
       AUC Ile     ACC Thr     AAC Asn     AGC Ser        C
       AUA Ile     ACA Thr     AAA Lys     AGA Arg        A
       AUG Met     ACG Thr     AAG Lys     AGG Arg        G
G      GUU Val     GCU Ala     GAU Asp     GGU Gly        U
       GUC Val     GCC Ala     GAC Asp     GGC Gly        C
       GUA Val     GCA Ala     GAA Glu     GGA Gly        A
       GUG Val     GCG Ala     GAG Glu     GGG Gly        G
The new classification scheme of the genetic code

- binary representation of the bases:
  - purines (A, G) → 1
  - pyrimidines (C, U) → 0
- 2^3 = 8 different binary triplets: 000, 001, …, 111
- each of these again has 8 possibilities, for instance:
  - 000 stands for three pyrimidines: CCC, CCU, UUC, …, UUU
  - 111 stands for three purines: GGG, GGA, GAA, …, AAA
- C and G bind via 3 hydrogen bonds in complementary base pairing
- A and U bind via 2 hydrogen bonds in complementary base pairing
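A minimal Python sketch of the binary representation, assuming codons are given as RNA strings over A, G, C, U. The function names `binary_code` and `hydrogen_bonds` are illustrative, not taken from the slides.

```python
from collections import Counter
from itertools import product

PURINES = {"A", "G"}  # purines map to 1, pyrimidines (C, U) to 0
HYDROGEN_BONDS = {"C": 3, "G": 3, "A": 2, "U": 2}  # bonds per base pair

def binary_code(codon: str) -> str:
    """Return the binary triplet of a codon, e.g. 'CCU' -> '000'."""
    return "".join("1" if base in PURINES else "0" for base in codon)

def hydrogen_bonds(bases: str) -> int:
    """Total hydrogen bonds formed by the given bases when paired."""
    return sum(HYDROGEN_BONDS[base] for base in bases)

codons = ["".join(t) for t in product("UCAG", repeat=3)]
class_sizes = Counter(binary_code(c) for c in codons)

print(binary_code("CCU"))            # 000
print(binary_code("GAA"))            # 111
print(len(class_sizes))              # 8 binary triplets
print(set(class_sizes.values()))     # {8}: each triplet covers 8 codons
print(hydrogen_bonds("CG"))          # 6
print(hydrogen_bonds("AU"))          # 4
```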
The Common Genetic Code Table contains 64 fields…
The new classification scheme (standard genetic code)

Code   Strong codons      Mixed codons        Mixed codons       Weak codons
       6 hydrogen bonds   5 hydrogen bonds    5 hydrogen bonds   4 hydrogen bonds
000    Pro CC(C/U)        Ser UC(C/U)         Leu CU(C/U)        Phe UU(C/U)
001    Pro CC(A/G)        Ser UC(A/G)         Leu CU(A/G)        Leu UU(A/G)
100    Ala GC(C/U)        Thr AC(C/U)         Val GU(C/U)        Ile AU(C/U)
101    Ala GC(A/G)        Thr AC(A/G)         Val GU(A/G)        Ile/Met AU(A/G)
010    Arg CG(C/U)        Cys UG(C/U)         His CA(C/U)        Tyr UA(C/U)
011    Arg CG(A/G)        Stop/Trp UG(A/G)    Gln CA(A/G)        Stop UA(A/G)
110    Gly GG(C/U)        Ser AG(C/U)         Asp GA(C/U)        Asn AA(C/U)
111    Gly GG(A/G)        Arg AG(A/G)         Glu GA(A/G)        Lys AA(A/G)

The new scheme contains the same information in only 32 fields.
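The reduction from 64 to 32 fields can be reproduced with a small Python sketch. It assumes, as the column headings suggest, that codon strength is the number of hydrogen bonds of the first two bases, and that a field is identified by the binary code together with those two bases. All names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

PURINES = {"A", "G"}
BONDS = {"C": 3, "G": 3, "A": 2, "U": 2}

def field_key(codon):
    """(binary code, first two bases) identifies one field of the scheme."""
    binary = "".join("1" if b in PURINES else "0" for b in codon)
    return binary, codon[:2]

def strength(doublet):
    """Assumed rule: strength = hydrogen bonds of the first two bases."""
    return {6: "strong", 5: "mixed", 4: "weak"}[sum(BONDS[b] for b in doublet)]

# Collect the assignments of all codons that fall into the same field.
fields = {}
for codon, aa in CODE.items():
    fields.setdefault(field_key(codon), set()).add(aa)

per_strength = Counter(strength(doublet) for _, doublet in fields)
ambiguous = {key: sorted(aas) for key, aas in fields.items() if len(aas) > 1}

print(len(fields))                   # 32
print(sorted(per_strength.items()))  # [('mixed', 16), ('strong', 8), ('weak', 8)]
print(sorted(ambiguous.items()))
# [(('011', 'UG'), ['*', 'W']), (('101', 'AU'), ['I', 'M'])]
```

Only two fields hold more than one assignment, which matches the Stop/Trp and Ile/Met entries in the table.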
Deviations from the Standard Code

[Table: the classification scheme of the standard genetic code, with paired counts marked on individual fields: 1/1, 1/0, 1/0, 1/2, 5/0, 2/4, 9/0, 6/6, 3/0]
http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi
Mitochondrial Genomes Have Several Surprising Features

- genetic code of mitochondria
- only 22 tRNAs are required for mammalian mitochondrial protein synthesis
The Mammalian Mitochondrial Genetic Code

Code   Strong codons      Mixed codons        Mixed codons       Weak codons
       6 hydrogen bonds   5 hydrogen bonds    5 hydrogen bonds   4 hydrogen bonds
000    Pro CC(C/U)        Ser UC(C/U)         Leu CU(C/U)        Phe UU(C/U)
001    Pro CC(A/G)        Ser UC(A/G)         Leu CU(A/G)        Leu UU(A/G)
100    Ala GC(C/U)        Thr AC(C/U)         Val GU(C/U)        Ile AU(C/U)
101    Ala GC(A/G)        Thr AC(A/G)         Val GU(A/G)        Met/Met AU(A/G)
010    Arg CG(C/U)        Cys UG(C/U)         His CA(C/U)        Tyr UA(C/U)
011    Arg CG(A/G)        Trp/Trp UG(A/G)     Gln CA(A/G)        Stop UA(A/G)
110    Gly GG(C/U)        Ser AG(C/U)         Asp GA(C/U)        Asn AA(C/U)
111    Gly GG(A/G)        Stop AG(A/G)        Glu GA(A/G)        Lys AA(A/G)

http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi
The Mammalian Mitochondrial Code: 8 tRNAs for family codons + 14 tRNAs for non-family codons

Code   Strong codons      Mixed codons        Mixed codons       Weak codons
       6 hydrogen bonds   5 hydrogen bonds    5 hydrogen bonds   4 hydrogen bonds
000    tRNAPro CC         tRNASer1 UC         tRNALeu1 CU        tRNAPhe UU(C/U)
001    "                  "                   "                  tRNALeu2 UU(A/G)
100    tRNAAla GC         tRNAThr AC          tRNAVal GU         tRNAIle AU(C/U)
101    "                  "                   "                  tRNAMet AU(A/G)
010    tRNAArg CG         tRNACys UG(C/U)     tRNAHis CA(C/U)    tRNATyr UA(C/U)
011    "                  tRNATrp UG(A/G)     tRNAGln CA(A/G)    Stop UA(A/G)
110    tRNAGly GG         tRNASer2 AG(C/U)    tRNAAsp GA(C/U)    tRNAAsn AA(C/U)
111    "                  Stop AG(A/G)        tRNAGlu GA(A/G)    tRNALys AA(A/G)

Entries without a third base are family codons: one tRNA reads all four codons, so the entry spans both rows (").
http://mamit-trna.u-strasbg.fr/2DStructures.html
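The 8 + 14 = 22 count can be derived from the code table itself. The sketch below assumes one tRNA per family box (all four codons share one amino acid) and one tRNA per non-stop half box ((C/U) or (A/G) in the third position) otherwise. The amino acid string is the vertebrate mitochondrial code in UCAG order; the counting rule and names are illustrative assumptions, not the authors' procedure.

```python
from itertools import product

BASES = "UCAG"
# Vertebrate mitochondrial code in UCAG order ('*' = stop):
# UGA = Trp, AUA = Met, AGA/AGG = stop.
MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
MITO_CODE = {"".join(t): aa for t, aa in zip(product(BASES, repeat=3), MITO)}

family, non_family = [], []
for first, second in product(BASES, repeat=2):
    doublet = first + second
    pyr_half = {MITO_CODE[doublet + b] for b in "CU"}  # third base C or U
    pur_half = {MITO_CODE[doublet + b] for b in "AG"}  # third base A or G
    if pyr_half == pur_half and len(pyr_half) == 1 and "*" not in pyr_half:
        family.append(doublet)  # one tRNA reads all four codons
    else:
        for half in (pyr_half, pur_half):
            (assignment,) = half  # assumes each half box has one assignment
            if assignment != "*":
                non_family.append((doublet, assignment))

print(len(family))                    # 8
print(len(non_family))                # 14
print(len(family) + len(non_family))  # 22
```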
Amino acid patterns: Polar requirement of NCN and NUN codons

[Table: the classification scheme of the standard genetic code, highlighting the NCN and NUN codons]
C. R. Woese, G. J. Olsen, M. Ibba, D. Söll. Aminoacyl-tRNA Synthetases, the Genetic Code, and the Evolutionary Process. MMBR (2000) 64: 202-236
Amino acid patterns: Hydrophobicity

[Table: the classification scheme of the standard genetic code, used to show the hydrophobicity pattern]
Kyte & Doolittle, 1982, http://biology-pages.info
Codon-Anticodon symmetry

[Table: the classification scheme of the standard genetic code, used to illustrate the codon-anticodon symmetry]
Point symmetry

[Table: the classification scheme of the standard genetic code, used to illustrate the point symmetry]
D. Halitsky. Extending the (Hexa-)Rhombic Dodecahedral Model of the Genetic Code: the Code's Four 6-fold Degeneracies and the Ten Orthogonal Projections of the 5-cube as 3-cube. Computer Systems Technology 2004
Correlation of codon strength and amino acid properties

Measure                                        Strong codons   Mixed codons   Weak codons

Dinucleoside monophosphates
Hydrophilicity (Weber & Lacey 1978)            1.686           1.434          1.235
Hydrophilicity (Barzilay et al. 1973)          2.72            2.26           2.26
Hydrophobicity (Garel et al. 1973)             2.556           3.413          3.982

Amino acids
Molec. Weight (Handbook value)                 907             1065.6         1217.5
Molec. Volume (Grantham 1974)                  381             637.5          906
Refractivity (Jones 1975)                      83.86           140.03         186.51
Alpha pK1 (Zimmerman et al. 1968)              16.96           17.11          17.43
Bulkiness (Zimmerman et al. 1968)              93.22           124.345        143.54
Specific volume (McMeekin et al. 1964)         5.26            5.37           5.8
Polarity (Zimmerman et al. 1968)               107.16          109.58         58.14
Polarity (Woese et al. 1967)                   61.2            59.15          51
Polarity (Grantham 1974)                       71.2            67             56.3
Hydrophobicity (Jones 1975)                    9.18            8.385          16.93
Hydrophobicity (Levitt 1976)                   -2.2            1.6            8.8
Hydrophobicity (Bull & Breese 1974)            3880            -165           -6790
Hydrophilicity (Weber & Lacey 1978)            7.02            6.585          5.59
Partition coefficient (Garel et al. 1973)      1.88            5.58           7.6
Sequence Frequency (Jungck 1971)               4280            3522           2966
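One simple way to read the table is to check, per measure, whether the value moves in one direction from strong over mixed to weak codons. The Python sketch below does this for four rows copied from the table. The `trend` helper is an illustrative assumption, not the authors' correlation analysis.

```python
# (strong, mixed, weak) values copied from four rows of the table above.
ROWS = {
    "Hydrophobicity (Garel et al. 1973), dinucleoside monophosphates": (2.556, 3.413, 3.982),
    "Molec. Volume (Grantham 1974)": (381, 637.5, 906),
    "Polarity (Woese et al. 1967)": (61.2, 59.15, 51),
    "Hydrophobicity (Jones 1975)": (9.18, 8.385, 16.93),
}

def trend(values):
    """Classify a (strong, mixed, weak) triple by its direction."""
    strong, mixed, weak = values
    if strong < mixed < weak:
        return "increasing"
    if strong > mixed > weak:
        return "decreasing"
    return "not monotonic"

for measure, values in ROWS.items():
    print(f"{measure}: {trend(values)}")
# The first two rows are increasing, Polarity (Woese) is decreasing,
# and Hydrophobicity (Jones) is not monotonic.
```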
Evolution of the genetic code

- our contemporary code is the quaternary triplet code: 4^3 = 64 fields

  00*  00*  00*  00*
  01*  01*  01*  01*
  10*  10*  10*  10*
  11*  11*  11*  11*
  CGU, UAC, …

- quaternary doublet code: 4^2 = 16 fields

  00  00  00  00
  01  01  01  01
  10  10  10  10
  11  11  11  11
  CGU, UAC, …

- binary doublet: 4^1 = 4 fields

  00  01  10  11
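The three field counts can be reproduced by enumeration. The sketch below assumes that the doublet code is obtained by ignoring the third base (the `*` in `00*`) and that the binary doublet then maps purines to 1 and pyrimidines to 0. The helper `to_binary` is illustrative.

```python
from itertools import product

BASES = "UCAG"
PURINES = {"A", "G"}

def to_binary(bases: str) -> str:
    """Purines -> 1, pyrimidines -> 0."""
    return "".join("1" if b in PURINES else "0" for b in bases)

# Quaternary triplet code: every codon is its own field.
triplets = ["".join(t) for t in product(BASES, repeat=3)]
# Quaternary doublet code: drop the third base (assumed projection).
quaternary_doublets = {codon[:2] for codon in triplets}
# Binary doublet: keep only the purine/pyrimidine class of each base.
binary_doublets = {to_binary(d) for d in quaternary_doublets}

print(len(triplets), len(quaternary_doublets), len(binary_doublets))  # 64 16 4
print(sorted(binary_doublets))                       # ['00', '01', '10', '11']
print(to_binary("CGU"[:2]), to_binary("UAC"[:2]))    # 01 01
```

The two example codons from the slide, CGU and UAC, both fall into the 01 field.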
Evidence: Evolution of the Genetic Code

[Table: the classification scheme of the standard genetic code]
Outlook

- Looking for binary patterns in the genomes

Additional information
http://www.imb-jena.de/~sweta/genetic_code/

Acknowledgment
Maik Friedel
Andreas Beyer
Frank Grosse
Thank you for your attention!
The new classification scheme of the standard genetic code
T. Wilhelm, S. Nikolajewa. A new classification scheme of the genetic code. J. Mol. Evol. (2004) 59: 598-605