F.H.C. Crick postulated the existence of a “genetic code”,
the set of all codons that specify the 20 amino acids. The
sequence of bases in mRNA that specifies one amino acid is
known as a codon. Codons are usually written in a language
of four letters: adenine (A), guanine (G), cytosine (C) and
uracil (U). If a single nucleotide coded for one amino acid
(singlet code), only four codons would be available; a
doublet code would give 16 codons (4x4=16), which are still
not enough to code for 20 amino acids. However, if three
nucleotides code for one amino acid (triplet code), as many
as 64 codons (4x4x4=64) become available, which is more than
enough to code for 20 amino acids (Gupta, 2007).
Singlet code (4 codons)

A    G    C    U

Doublet code (16 codons)

AA   AG   AC   AU
GA   GG   GC   GU
CA   CG   CC   CU
UA   UG   UC   UU

Triplet code (64 codons)

AAA  UAA  GAA  CAA
AAU  UAU  GAU  CAU
AAG  UAG  GAG  CAG
AAC  UAC  GAC  CAC
AUA  UUA  GUA  CUA
AUU  UUU  GUU  CUU
AUG  UUG  GUG  CUG
AUC  UUC  GUC  CUC
AGA  UGA  GGA  CGA
AGU  UGU  GGU  CGU
AGG  UGG  GGG  CGG
AGC  UGC  GGC  CGC
ACA  UCA  GCA  CCA
ACU  UCU  GCU  CCU
ACG  UCG  GCG  CCG
ACC  UCC  GCC  CCC

Source: Gupta (2007)
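The counting argument above can be checked by brute force. The following is a minimal sketch, not part of the cited source; the function name `all_codes` is chosen here for illustration only.

```python
from itertools import product

BASES = "AGCU"  # the four RNA bases, in the order used in the table above


def all_codes(length):
    """Return every possible code word of the given length over the four bases."""
    return ["".join(p) for p in product(BASES, repeat=length)]


for n, name in [(1, "singlet"), (2, "doublet"), (3, "triplet")]:
    words = all_codes(n)
    verdict = "enough" if len(words) >= 20 else "not enough"
    print(f"{name}: {len(words)} code words, {verdict} for 20 amino acids")
```

Only the triplet code reaches the 20 code words that are needed, which is the point made in the paragraph above.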
Crick et al. (1961) provided the first experimental
evidence in support of the concept of a triplet code in
mRNA. A chemical called proflavin was given to T4
bacteriophage; it could either add or delete a base in the
DNA molecule, thus damaging the virus and resulting in an
altered or mutant form of the virus (Sarin, 1997). When one
or two base pairs were inserted or deleted, the
bacteriophage ceased to function normally. In contrast, when
three base pairs were added or deleted in the T4 DNA, the
bacteriophage functioned normally. Based on this experiment
they concluded that the genetic code is a triplet code: the
addition or deletion of one or two base pairs changed the
reading sequence, whereas the addition or deletion of a
third nucleotide returned it to normal.
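The logic of this experiment can be illustrated with a toy message. The sketch below is only an illustration under stated assumptions: the message `CATCATCATCATCAT` and the helper names `read_frame` and `insert` are hypothetical and are not taken from Crick's work.

```python
def read_frame(seq, size=3):
    """Split a sequence into consecutive non-overlapping words of `size` letters."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]


def insert(seq, pos, letters):
    """Return seq with `letters` inserted before index `pos`."""
    return seq[:pos] + letters + seq[pos:]


message = "CATCATCATCATCAT"  # hypothetical message: every triplet reads CAT

for added in ("G", "GG", "GGG"):
    mutant = insert(message, 4, added)
    print(len(added), "base(s) added:", " ".join(read_frame(mutant)))
```

With one or two added bases every triplet after the insertion point is misread, whereas with three added bases the original triplets reappear after a short altered stretch.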
Accordingly, a codon dictionary has been prepared, and 61
codons have been assigned to specific amino acids.
The remaining three codons, UAA (also called ochre), UAG
(also called amber) and UGA (also called opal), do not code
for any amino acid; before their function was discovered
they were called nonsense codons. Whenever one of these
three codons is present in mRNA it brings about termination
of the polypeptide chain, and they are therefore called stop
or termination codons (Gupta, 2007). Since there are more
codons than amino acids, almost all amino acids are
represented by more than one codon. The only exceptions are
methionine and tryptophan. Codons that have the same meaning
are called synonyms. This property is called degeneracy of
the genetic code.
                         Second letter
First                                                   Third
letter   U           C           A           G          letter

U        UUU Phe     UCU Ser     UAU Tyr     UGU Cys    U
         UUC Phe     UCC Ser     UAC Tyr     UGC Cys    C
         UUA Leu     UCA Ser     UAA Stop    UGA Stop   A
         UUG Leu     UCG Ser     UAG Stop    UGG Trp    G

C        CUU Leu     CCU Pro     CAU His     CGU Arg    U
         CUC Leu     CCC Pro     CAC His     CGC Arg    C
         CUA Leu     CCA Pro     CAA Gln     CGA Arg    A
         CUG Leu     CCG Pro     CAG Gln     CGG Arg    G

A        AUU Ile     ACU Thr     AAU Asn     AGU Ser    U
         AUC Ile     ACC Thr     AAC Asn     AGC Ser    C
         AUA Ile     ACA Thr     AAA Lys     AGA Arg    A
         AUG Met     ACG Thr     AAG Lys     AGG Arg    G

G        GUU Val     GCU Ala     GAU Asp     GGU Gly    U
         GUC Val     GCC Ala     GAC Asp     GGC Gly    C
         GUA Val     GCA Ala     GAA Glu     GGA Gly    A
         GUG Val     GCG Ala     GAG Glu     GGG Gly    G

Triplet codon dictionary (Source: Gupta, 2007)
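The codon dictionary lends itself to a simple lookup table. The following is a minimal sketch; the names `CODON_TABLE` and `lookup` are chosen here for illustration, and the 64-letter string uses one-letter amino acid symbols with `*` for a stop codon.

```python
from itertools import product

BASES = "UCAG"  # row and column order of the codon dictionary
# One-letter amino acid symbols in dictionary order (UUU, UUC, UUA, UUG, UCU, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# product() varies the last base fastest, which matches the order of AMINO.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def lookup(codon):
    """Return the one-letter amino acid symbol for a codon ('*' means stop)."""
    return CODON_TABLE[codon.upper()]


print(lookup("AUG"), lookup("UGG"), lookup("UAA"))
```

Building the table from a compact string avoids typing 64 entries by hand; a literal dictionary would work equally well.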
FEATURES OF THE GENETIC CODE
Triplet code
A singlet code means a one-to-one correspondence between
nucleotides and amino acids. It has been ruled out because
four nucleotides could then code for only four amino acids.
In a doublet code two nucleotides code for one amino acid;
only 16 amino acids could be coded, which is insufficient
for 20 amino acids. In a triplet code three nucleotides code
for one amino acid (4x4x4=64 triplet combinations); the
triplet is thus the smallest coding unit that can
accommodate 20 amino acids. So the triplet code fulfils the
requirement of coding for all 20 amino acids (Sarin, 1997).
Non-ambiguity of the code
Non-ambiguity means that a particular codon always codes
for the same amino acid. An amino acid can be coded by more
than one codon, but the same codon never codes for two
different amino acids.
There is some ambiguity when AUG and GUG are taken into
consideration: both may code for methionine as the
initiating codon, although GUG otherwise codes for valine
(Gupta, 2007).
Genetic code is universal
The genetic code is the same in almost all organisms;
e.g. the codon AGA specifies the amino acid arginine in
bacteria, humans and all other organisms whose genetic
code has been studied. The universality of the genetic code
is among the strongest evidence that all living things share
a common evolutionary heritage. The universality of the code
argues that it must have been established very early in
evolution. Perhaps the code started in a primitive form in
which a small number of codons were used to represent
comparatively few amino acids, possibly even with one codon
corresponding to any member of a group of amino acids. More
precise codon meanings and additional amino acids could have
been introduced later.
Evolution of the code could have become “frozen” at a
point at which the system had become so complex that any
change in codon meaning would disrupt existing proteins by
substituting unacceptable amino acids. Its universality
implies that this must have happened at such an early stage
that all living organisms are descended from a single pool
of primitive cells in which this occurred. Because the code
is universal, genes transcribed from one organism can be
translated in another. Similarly, genes can be transferred
from one organism to another and be successfully transcribed
and translated in their new host. This universality of gene
expression is central to many of the advances of genetic
engineering. Many commercial products, such as the insulin
used to treat diabetes, are now manufactured by placing
human genes into bacteria, which serve as tiny factories to
turn out prodigious quantities of insulin.
Genetic Code is degenerate
All the amino acids except methionine and tryptophan are
specified by more than one codon. A non-degenerate code
would be one in which one codon codes for one amino acid,
i.e. 20 codons code for 20 amino acids and the remaining 44
codons are unused, but this is not the case.
Degeneracy of the genetic code

Amino acid   Number of codons   Amino acid   Number of codons
Ala          4                  Ile          3
Arg          6                  Leu          6
Asn          2                  Lys          2
Asp          2                  Phe          2
Cys          2                  Pro          4
Gln          2                  Ser          6
Glu          2                  Thr          4
Gly          4                  Tyr          2
His          2                  Val          4
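The numbers in this table can be recomputed from the codon dictionary. The sketch below is an illustration only; it rebuilds the standard dictionary from a compact string (one-letter symbols, `*` for stop), and the name `degeneracy` is chosen here.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols in dictionary order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Count how many codons specify each amino acid, ignoring the stop codons.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

for aa, n in sorted(degeneracy.items(), key=lambda item: (-item[1], item[0])):
    print(aa, n)
```

Methionine (M) and tryptophan (W), which the table omits, come out with a single codon each, and the counts of all 20 amino acids add up to the 61 sense codons.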
This occurrence of more than one codon per amino acid is
called degeneracy. The degeneracy in the genetic code is not
random; it is highly ordered. Usually, the multiple codons
specifying an amino acid differ by only one base, the third
or 3' base of the codon. Because of the degeneracy of the
genetic code, there must be either several different tRNAs
that recognize the different codons specifying a given amino
acid, or the anticodon of a given tRNA must be able to
base-pair with several different codons. The degeneracy is
primarily of two types.
(1) Partial degeneracy: It occurs when the first two
nucleotides are identical but the third or 3' base of the
codon differs. The third base may be either of the two
pyrimidines (U or C) and the codon will still specify the
same amino acid (e.g. CUU and CUC code for leucine).
Similarly, the purines (A and G) are often interchangeable
at the third base of a codon (e.g. GUA and GUG code for
valine).
(2) Complete degeneracy: In the case of complete degeneracy
any of the four bases may be present at the third position
in the codon, and the codon will still specify the same
amino acid, e.g. UCU, UCC, UCA and UCG code for serine
(Sarin, 1997).
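Both types of degeneracy can be checked against the codon dictionary by grouping codons on their first two bases. This is a minimal sketch under the assumption of the standard dictionary; the names `complete`, `pyrimidine_ok` and `purine_ok` are chosen here for illustration.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols in dictionary order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

prefixes = ["".join(p) for p in product(BASES, repeat=2)]  # first two bases

# Complete degeneracy: all four third bases give the same meaning.
complete = [p for p in prefixes
            if len({CODON_TABLE[p + b] for b in BASES}) == 1]
# Partial degeneracy: the two pyrimidines, or the two purines, are interchangeable.
pyrimidine_ok = [p for p in prefixes if CODON_TABLE[p + "U"] == CODON_TABLE[p + "C"]]
purine_ok = [p for p in prefixes if CODON_TABLE[p + "A"] == CODON_TABLE[p + "G"]]

print("completely degenerate:", complete)
print("U/C interchangeable in", len(pyrimidine_ok), "of 16 groups")
print("A/G interchangeable in", len(purine_ok), "of 16 groups")
```

In the standard dictionary U and C are interchangeable in every group, while A and G are interchangeable in all groups except AU (AUA is isoleucine, AUG methionine) and UG (UGA is stop, UGG tryptophan).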
Importance of degeneracy in genetic Code
Degeneracy in the genetic code evolved as a way of
minimizing mutational lethality. If the degeneracy is of the
type that leads to replacement by equivalent amino acids,
small accidental mutational changes are much less damaging
than they would be under a non-degenerate code. Thus
degeneracy contributes favourably to genetic stability
(Sarin, 1997). The wobble or third base of the codon
contributes to specificity but, because it pairs only
loosely with its corresponding base in the anticodon, it
permits rapid dissociation of the tRNA from its codon during
protein synthesis. If all three bases of a codon engaged in
strong Watson-Crick pairing with the three bases of the
anticodon, tRNAs would dissociate too slowly and this would
severely limit the rate of protein synthesis. Thus,
codon-anticodon interactions balance the requirements for
accuracy and speed.
Wobble hypothesis
Transfer RNAs base-pair with mRNA codons by means of a
three-base sequence on the tRNA called the anticodon. The
first base of the codon in mRNA (read in the 5' to 3'
direction) pairs with the third base of the anticodon.
If the anticodon triplet of a tRNA recognized only one codon
triplet through Watson-Crick base pairing, cells would need
a different tRNA for each codon of an amino acid. This is
not the case. For example, the anticodons in some tRNAs
contain the nucleotide inosinate (designated I), which
contains the uncommon base hypoxanthine. Inosinate can form
hydrogen bonds with three different nucleotides (U, C and
A), although these pairings are much weaker than the
hydrogen bonds of the Watson-Crick base pairs G≡C and A=U.
Examination of these and other codon-anticodon pairings
led Crick to conclude that the third base of most codons
pairs rather loosely with the corresponding base of its
anticodon; the third base of such codons and the first base
of their corresponding anticodons “wobble”. Crick proposed
the wobble hypothesis in 1966 to explain this phenomenon.
The first two bases of an mRNA codon always form strong
Watson-Crick base pairs with the corresponding bases of the
tRNA anticodon and confer most of the coding specificity.
The first base of the anticodon determines the number
of codons recognized by the tRNA. When the first base of the
anticodon is C or A, base pairing is specific and only one
codon is recognized by that tRNA. When the first base is U
or G, binding is less specific and two different codons may
be read. When inosine (I) is the first nucleotide of an
anticodon, three different codons can be recognized (the
maximum number for any tRNA).
[Figure: summary of the codon-anticodon pairing
relationships; X and Y denote complementary bases capable of
strong Watson-Crick base pairing with each other.]
When an amino acid is specified by several different codons,
the codons that differ in either of the first two bases
require different tRNAs. A minimum of 32 tRNAs are required
to translate all 61 codons.
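The wobble rules can be written as a small lookup. The sketch below is an illustration under stated assumptions: the text gives only the number of codons read for each first anticodon base, so the specific partners for U (A or G) and G (C or U) are the standard textbook wobble pairs and are added here as an assumption; the function name `codons_read` is hypothetical.

```python
# Strict Watson-Crick partners, used for the first two codon positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# First (5') base of the anticodon -> third (3') codon bases it can pair with.
# Partners for U and G are the standard wobble pairs (assumption, see lead-in).
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}


def codons_read(anticodon):
    """List the codons (5'->3') recognized by an anticodon written 5'->3'."""
    first, second, third = anticodon.upper()
    # Codon base 1 pairs with anticodon base 3, codon base 2 with anticodon base 2.
    stem = WATSON_CRICK[third] + WATSON_CRICK[second]
    return [stem + base for base in WOBBLE[first]]


print(codons_read("CAU"))  # C in the wobble position: one codon
print(codons_read("GAA"))  # G in the wobble position: two codons
print(codons_read("IGC"))  # I in the wobble position: three codons
```

The number of codons returned depends only on the first anticodon base, in line with the one, two or three codons described in the text.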
The genetic code has polarity
The code has polarity, i.e. it is read between fixed start
and stop codons. The start codon is also known as the
initiation codon and the stop codon as the chain termination
codon. The message of mRNA is read in the 5' → 3' direction.
The polypeptide chain is synthesized from the amino (-NH2)
end to the carboxyl (-COOH) end, i.e. N → C.
Chain initiation codon
The codon present at the beginning of the cistron is known
as the initiation codon; it marks the beginning of the
message for a polypeptide chain. The initiation codon is AUG
in the majority of cases, and it codes for the amino acid
methionine. Rarely, GUG also acts as the initiation codon in
bacterial protein synthesis; elsewhere in a message GUG
codes for valine.
Chain termination codon
Three of the 64 codons do not code for specific amino
acids; these codons are UAG, UAA and UGA. They bring about
termination of the polypeptide chain and are therefore
called termination codons.
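Reading between a start and a stop codon can be sketched as follows. This is a simplified illustration, not a model of initiation in a real cell: it assumes the first AUG in the message is the start codon, and the example message and the function name `translate` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols in dictionary order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna):
    """Translate 5'->3' from the first AUG up to the first in-frame stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")  # simplifying assumption: first AUG initiates
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA terminates the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)  # written from the N terminus to the C terminus


print(translate("GGAUGUUUGGCUAAGG"))
```

In this example the bases before AUG and after UAA are not translated, so the peptide contains only methionine, phenylalanine and glycine.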
Genetic code is non-overlapping and commaless
The genetic code is non-overlapping, which means that a
base in mRNA is not used for two different codons. A
commaless code means that there is no punctuation between
the codons. In other words, the code is read from a fixed
starting point as a continuous sequence of bases, taken
three at a time; e.g. ABCDEFGHIJKL... is read as
ABC/DEF/GHI/JKL... without any punctuation between the
codons. When one amino acid has been coded, the second amino
acid is automatically coded by the next three letters, and
no letters are wasted in signalling that one amino acid has
been coded and that the next should begin. If one or two
nucleotides are either deleted from or added to the interior
of a message sequence, a frame-shift mutation occurs and the
reading frame is altered. The resulting amino acid sequence
may become radically different from this point onward.
[Figure: the sequence CATGAT read as an overlapping code and
as a non-overlapping, commaless code (CAT / GAT).]
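The difference between the two ways of reading can be shown on the sequence CATGAT from the figure. This is a minimal sketch; the function names are chosen here for illustration.

```python
def non_overlapping(seq):
    """Read triplets one after another; each base belongs to exactly one triplet."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def overlapping(seq):
    """Read a triplet starting at every base; bases are shared between triplets."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


sequence = "CATGAT"
print("non-overlapping:", non_overlapping(sequence))
print("overlapping:    ", overlapping(sequence))
```

In the overlapping reading a single base change would alter up to three neighbouring triplets, whereas in the non-overlapping reading it alters only one.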
Colinearity of genetic code
The sequence of codons in mRNA and the corresponding amino
acid residues of a polypeptide chain are arranged in the
same linear order, i.e. the code is colinear with the amino
acid sequence of the polypeptide chain.
In prokaryotes, the translation of mRNA occurs concomitantly
with its transcription. mRNA is translated in the same
direction as that in which it is synthesized. The RNA chain
grows from the 5' to the 3' end, and the translation of mRNA
also proceeds from the 5' to the 3' end. The protein chains
grow from their amino terminal end. It has been shown that
an amino acid closer to the amino terminus is represented by
a codon closer to the 5' end of the corresponding mRNA
(Sarin, 1997).
Deciphering the code
Deciphering the genetic code means finding out
• which codons specify which amino acids,
• how the code is punctuated, and
• whether different species use the same or different
  codons.
In the 1950s it was theoretically accepted that the genetic
code should be triplet in nature, but it was not possible to
say which of the 64 codons code for which amino acid (Gupta,
2007).
M.W. Nirenberg and J.H. Matthaei in 1961 synthesized RNA
using only one nucleotide, uracil. This means that there was
no base other than uracil along the length of the mRNA and
the only possible triplet was UUU. When they used this
poly-U RNA in polypeptide synthesis, only one amino acid,
phenylalanine, was incorporated repeatedly, and it was
concluded that phenylalanine is coded by the triplet codon
UUU. The same experiment was repeated with adenine and
cytosine. The poly-A RNA directed the synthesis of a
polypeptide of lysine and the poly-C RNA that of a
polypeptide of proline. They concluded that AAA codes for
lysine and CCC codes for proline. This type of experiment
with poly-G was unsuccessful.
Afterwards, in 1964, M.W. Nirenberg and P. Leder introduced
a technique known as the “binding technique”. They found
that if a synthetic trinucleotide of known sequence is used
with ribosomes and a particular aminoacyl-tRNA, these will
form a complex, provided the trinucleotide used codes for
the amino acid attached to the given aminoacyl-tRNA (Gupta,
2007).
Codon1 + Ribosome + AA1-tRNA1 → Ribosome-Codon1-AA1-tRNA1
In the above process, if a given AA1 is used with a given
codon1 and the formation of the complex is detected, this
proves that the given codon codes for the given amino acid
(Gupta, 2007).
Free AA-tRNA passes through a nitrocellulose membrane
easily, while the ribosome-codon-AA-tRNA complex is
adsorbed on such a membrane. If only one of the amino acids
in a mixture is made radioactive, the membrane becomes
radioactive only when that amino acid is held in the
complex, which proves the relationship between the codon and
the radioactive amino acid. For example, 20 samples of a
mixture of all 20 amino acids may be taken, and in each
sample a different amino acid is made radioactive, so that
every amino acid is radioactive in exactly one sample and no
two samples have the same radioactive amino acid. A
particular sample is then identified by its radioactive
amino acid. tRNAs and ribosomes are mixed with each sample,
and the same codon is used for complex formation in all 20
cases. When a mixture is poured on the nitrocellulose
membrane, radioactivity will be observed on the membrane
only when the radioactive amino acid takes part in the
formation of the complex. Since the radioactive amino acid
in each sample is known, the amino acid coded by a given
codon can be detected by the presence of radioactivity on
the membrane. Nirenberg and his co-workers applied this
treatment to all 64 synthetic codons and identified their
respective amino acids. By this technique Nirenberg and his
co-workers cracked 45 codons for the amino acids arginine,
alanine, methionine, proline, tryptophan, tyrosine, serine
and valine (Gupta, 2007).
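The bookkeeping of the 20-sample design can be sketched in code. This is only a toy model under a loud assumption: the known codon dictionary stands in for what happens in the test tube, so the code illustrates the logic of the assay and does not derive the code independently. The names `filter_is_radioactive` and `decode_by_binding` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols in dictionary order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
AMINO_ACIDS = sorted(set(AMINO) - {"*"})  # the 20 amino acids, one per sample


def filter_is_radioactive(codon, labelled):
    """One sample: the membrane retains label only if the labelled amino acid
    is the one whose tRNA binds the ribosome-codon complex (toy assumption)."""
    return CODON_TABLE[codon] == labelled


def decode_by_binding(codon):
    """Run all 20 samples with the same codon and report the labelled hit."""
    hits = [aa for aa in AMINO_ACIDS if filter_is_radioactive(codon, aa)]
    return hits[0] if hits else None  # None: no sample retained radioactivity


print(decode_by_binding("UUU"), decode_by_binding("GUU"), decode_by_binding("UAA"))
```

Because each sample carries a different label, at most one of the 20 membranes can become radioactive for a given codon, and a stop codon leaves all of them unlabelled in this model.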
H.G. Khorana (1961) also devised a technique for
cracking the genetic code. He prepared polyribonucleotides
with known repeating sequences. A repeating sequence
means that, if CU are two bases, these will be repeatedly
present throughout the length as follows:
CUCUCUCUCUCUCUCUCU
In a similar manner, if ACU are three bases they will be
present repeatedly as follows:
ACUACUACUACUACUACU
In (CU)n only two codons are possible, CUC and UCU, in
alternating sequence (CUC/UCU/CUC/UCU/CUC/UCU). It means
that a polypeptide of only two amino acids, leucine coded by
CUC and serine coded by UCU, is formed in alternating
fashion.
Copolymer   Codons read     Amino acids            Codons assigned
(CU)n       CUC/UCU/CUC     Leucine/serine         CUC/UCU
(UG)n       UGU/GUG/UGU     Cysteine/valine        UGU/GUG
(AC)n       ACA/CAC/ACA     Threonine/histidine    ACA/CAC

Assignment of codons having known sequences, with the help
of copolymers having repetitive sequences of two bases.
Similarly, consider a repeating sequence of three bases, e.g.
(ACG)n. Depending upon where the reading is started, three
kinds of homopolypeptides are expected. The actual codon
assignment, i.e. finding out which of the three codons codes
for which amino acid, would depend upon the previous
information available regarding the composition of bases in
different codons coding for different amino acids.
On the basis of the above techniques, a complete
genetic code dictionary could be prepared (Gupta, 2007).
Codons                                Homopolypeptide     Codon assignment
ACG/ACG/ACG/ACG/ACG = poly(ACG)       (Threonine)n        ACG = Threonine
A/CGA/CGA/CGA/CGA/CGA = poly(CGA)     (Arginine)n         CGA = Arginine
AC/GAC/GAC/GAC/GAC/GAC = poly(GAC)    (Aspartic acid)n    GAC = Aspartic acid
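The products expected from a repeating polymer can be worked out by reading the repeat in each of the three frames. The sketch below is an illustration assuming the standard codon dictionary (one-letter symbols, `*` for stop); the function name `peptides_from_repeat` and the default of six codons are chosen here.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols in dictionary order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def peptides_from_repeat(unit, codons=6):
    """Read the polymer (unit)n in each of the three frames, `codons` triplets each."""
    message = unit * (codons + 1) * 3  # long enough for every frame
    result = {}
    for frame in range(3):
        triplets = [message[i:i + 3] for i in range(frame, frame + 3 * codons, 3)]
        result[frame] = "".join(CODON_TABLE[t] for t in triplets)
    return result


for unit in ("CU", "UG", "AC", "ACG"):
    print(f"({unit})n", peptides_from_repeat(unit))
```

A two-base repeat gives the same alternating copolymer in every frame, while the three-base repeat (ACG)n gives a different homopolypeptide in each frame, as in the tables above. A one-base repeat such as `"U"` gives a single homopolypeptide.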
MITOCHONDRIAL GENETIC CODE
In 1979, scientists began to determine the complete
nucleotide sequences of the mitochondrial genomes of humans
and mice. It came as a shock when these scientists learned
that the genetic code used by these mammalian mitochondria
was not quite the same as the “universal code” that has
become so familiar to biologists. In the mitochondrial
genomes, UGA, which should have been a “stop” codon, was
instead read as the amino acid tryptophan, AUA was read as
methionine rather than isoleucine, and AGA and AGG were read
as “stop” rather than arginine. Furthermore, minor
differences from the universal code have also been found in
the genomes of chloroplasts and ciliates. Thus, it appears
that the genetic code is not quite universal. Some time ago,
presumably after they began their endosymbiotic existence,
mitochondria and chloroplasts began to read the code
differently, particularly the portion of the code associated
with “stop” signals.
The major differences between the universal code and the
mammalian mitochondrial code are:
(i) in the universal code UGA is a termination codon and AGA
and AGG are arginine codons; in the mammalian mitochondrial
code UGA codes for tryptophan and AGA and AGG code for stop
signals.
(ii) the number of tRNAs is 22 in mammalian mitochondria,
while it is 55 in E. coli (Gupta, 2007).
(iii) in the universal genetic code CUN (N = any nucleotide)
codes for leucine, whereas in the yeast mitochondrial code
CUN codes for threonine.
(iv) the tRNA with the UAG anticodon is aminoacylated with
leucine under the universal genetic code, whereas the UAG
anticodon in yeast mitochondria accepts threonine.
Deviant genetic codes were also identified in the mid-1980s
in ciliated protozoa, in which UAA and UAG code for
glutamine instead of stop signals, and in the bacterium
Mycoplasma capricolum, in which UGA codes for tryptophan.
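A variant code can be represented as a small set of changes applied to the universal dictionary. The sketch below is an illustration only: it encodes the four mammalian mitochondrial reassignments described above, and the example message and the names `STANDARD`, `MAMMALIAN_MITO` and `translate` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols in dictionary order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Mammalian mitochondrial code: the universal code plus the changes in the text.
MAMMALIAN_MITO = {**STANDARD, "UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}


def translate(mrna, table):
    """Read triplets from the first base and stop at the first stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


message = "AUGAUAUGAAGAUUU"  # hypothetical message: AUG AUA UGA AGA UUU
print("universal:    ", translate(message, STANDARD))
print("mitochondrial:", translate(message, MAMMALIAN_MITO))
```

The same message ends at UGA under the universal code, but under the mitochondrial code it is read through UGA as tryptophan and ends at AGA instead.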