F. H. C. Crick postulated the existence of a “genetic code”, the set of all codons that specify the 20 amino acids. The number and sequence of bases in mRNA that specify an amino acid is known as a codon. Codons are usually written in the four-letter language of adenine (A), guanine (G), cytosine (C) and uracil (U). If a single nucleotide coded for one amino acid (singlet code), only four codons would exist, and a doublet code would give only 16 codons (4x4=16); neither is enough to code for 20 amino acids. However, if three nucleotides code for one amino acid (triplet code), as many as 64 codons (4x4x4=64) become available, which is enough to code for 20 amino acids (Gupta, 2007).

Singlet code (4 codons)
A    G    C    U

Doublet code (16 codons)
AA   AG   AC   AU
GA   GG   GC   GU
CA   CG   CC   CU
UA   UG   UC   UU

Triplet code (64 codons)
AAA  UAA  GAA  CAA
AAU  UAU  GAU  CAU
AAG  UAG  GAG  CAG
AAC  UAC  GAC  CAC
AUA  UUA  GUA  CUA
AUU  UUU  GUU  CUU
AUG  UUG  GUG  CUG
AUC  UUC  GUC  CUC
AGA  UGA  GGA  CGA
AGU  UGU  GGU  CGU
AGG  UGG  GGG  CGG
AGC  UGC  GGC  CGC
ACA  UCA  GCA  CCA
ACU  UCU  GCU  CCU
ACG  UCG  GCG  CCG
ACC  UCC  GCC  CCC

Source: Gupta (2007)

Crick et al. (1961) provided the first experimental evidence in support of the concept of a triplet code in mRNA. Bacteriophage T4 was treated with a chemical called proflavin, which can add or delete a base in the DNA molecule, thus damaging the virus and producing an altered or mutant form (Sarin, 1997). When one or two base pairs were inserted or deleted, the bacteriophage lost its normal function. When three base pairs were added or deleted in the T4 DNA, however, the bacteriophage performed its normal function. From this experiment they concluded that the genetic code is a triplet code: the addition or deletion of one or two base pairs changes the reading sequence, whereas the addition or deletion of a third nucleotide returns it to normal.
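The reasoning behind this frameshift experiment can be illustrated with a short Python sketch. It is only an illustration: the message, the insertion position and the helper names (`codons`, `insert_bases`, `frame_restored`) are hypothetical and are not data or methods from Crick et al. (1961).

```python
def codons(seq):
    """Split a sequence into consecutive triplets, dropping an incomplete tail."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def insert_bases(seq, position, bases):
    """Return seq with extra bases inserted before the given index."""
    return seq[:position] + bases + seq[position:]


def frame_restored(original, mutant, n_last=3):
    """True if the last n_last codons of the mutant match those of the original."""
    return codons(mutant)[-n_last:] == codons(original)[-n_last:]


# Hypothetical 8-codon message, used only for illustration.
message = "AUGGCUGAAUUCCGAUACGGAUAA"

for extra in ("C", "CC", "CCC"):
    mutant = insert_bases(message, 7, extra)
    print(len(extra), "base(s) inserted:", codons(mutant),
          "frame restored:", frame_restored(message, mutant))
```

With one or two inserted bases every codon downstream of the insertion changes, while with three inserted bases the downstream codons are read exactly as in the original message. This is the pattern that pointed to a triplet code.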
Accordingly, a codon dictionary has been prepared in which 61 codons are assigned to specific amino acids. The remaining three codons, UAA (also called ochre), UAG (also called amber) and UGA (also called opal), do not code for any amino acid; before their function was discovered they were called nonsense codons. Whenever present in mRNA, these three codons bring about termination of the polypeptide chain and are therefore called stop or termination codons (Gupta, 2007). Since there are more codons than amino acids, almost all amino acids are represented by more than one codon. The only exceptions are methionine and tryptophan. Codons that have the same meaning are called synonyms. This property, in which multiple codons code for the same amino acid, is called degeneracy of the genetic code.

First                       Second letter                        Third
letter      U            C            A            G             letter
U           UUU Phe      UCU Ser      UAU Tyr      UGU Cys       U
            UUC Phe      UCC Ser      UAC Tyr      UGC Cys       C
            UUA Leu      UCA Ser      UAA Stop     UGA Stop      A
            UUG Leu      UCG Ser      UAG Stop     UGG Trp       G
C           CUU Leu      CCU Pro      CAU His      CGU Arg       U
            CUC Leu      CCC Pro      CAC His      CGC Arg       C
            CUA Leu      CCA Pro      CAA Gln      CGA Arg       A
            CUG Leu      CCG Pro      CAG Gln      CGG Arg       G
A           AUU Ile      ACU Thr      AAU Asn      AGU Ser       U
            AUC Ile      ACC Thr      AAC Asn      AGC Ser       C
            AUA Ile      ACA Thr      AAA Lys      AGA Arg       A
            AUG Met      ACG Thr      AAG Lys      AGG Arg       G
G           GUU Val      GCU Ala      GAU Asp      GGU Gly       U
            GUC Val      GCC Ala      GAC Asp      GGC Gly       C
            GUA Val      GCA Ala      GAA Glu      GGA Gly       A
            GUG Val      GCG Ala      GAG Glu      GGG Gly       G

Triplet codons (Source: Gupta (2007))

FEATURES OF THE GENETIC CODE

Triplet code
A singlet code means a one-to-one correspondence between nucleotides and amino acids. Biologists ruled it out because four nucleotides could then code for only four amino acids. In a doublet code two nucleotides code for one amino acid, so only 16 amino acids could be coded, which is insufficient for 20 amino acids. In a triplet code three nucleotides code for one amino acid (4x4x4=64 triplet combinations), which makes the triplet the smallest coding unit that can accommodate 20 amino acids. So the triplet code fulfills the requirement of coding for all 20 amino acids (Sarin, 1997).
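The counting argument above (4, 16 and 64 possible codons) can be checked with a minimal Python sketch. The function names `all_codons` and `smallest_sufficient_length` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

BASES = "AGCU"  # the four mRNA bases


def all_codons(length):
    """List every possible codon of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]


def smallest_sufficient_length(n_amino_acids=20):
    """Smallest codon length whose codon count covers n_amino_acids."""
    length = 1
    while len(BASES) ** length < n_amino_acids:
        length += 1
    return length


for n in (1, 2, 3):
    print(f"codon length {n}: {len(all_codons(n))} codons")

print("smallest length that covers 20 amino acids:", smallest_sufficient_length())
```

Running the sketch reports 4, 16 and 64 codons for lengths 1, 2 and 3, and gives 3 as the smallest length that covers 20 amino acids.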
Non-ambiguity of the code
Non-ambiguity means that there is no ambiguity about the meaning of a particular codon: a codon always codes for one particular amino acid. An amino acid can be coded by more than one codon, but the same codon never codes for two different amino acids. Some ambiguity arises when AUG and GUG are taken into consideration; both may code for methionine as an initiating codon, although GUG normally codes for valine (Gupta, 2007).

Genetic code is universal
The genetic code is the same in almost all organisms. For example, the codon AGA specifies the amino acid arginine in bacteria, humans and all other organisms whose genetic code has been studied. The universality of the genetic code is among the strongest evidence that all living things share a common evolutionary heritage. It also argues that the code must have been established very early in evolution. Perhaps the code started in a primitive form in which a small number of codons represented comparatively few amino acids, possibly even with one codon corresponding to any member of a group of amino acids. More precise codon meanings and additional amino acids could have been introduced later. Evolution of the code could have become “frozen” at the point at which the system had become so complex that any change in codon meaning would disrupt existing proteins by substituting unacceptable amino acids. Its universality implies that this must have happened at such an early stage that all living organisms are descended from a single pool of primitive cells in which it occurred. Because the code is universal, genes transcribed from one organism can be translated in another. Similarly, genes can be transferred from one organism to another and be successfully transcribed and translated in their new host. This universality of gene expression is central to many of the advances of genetic engineering.
Many commercial products, such as the insulin used to treat diabetes, are now manufactured by placing human genes into bacteria, which serve as tiny factories that turn out prodigious quantities of insulin.

Genetic code is degenerate
All the amino acids except methionine and tryptophan are specified by more than one codon. In a non-degenerate code each amino acid would be coded by a single codon, i.e. 20 codons would code for 20 amino acids and the remaining 44 codons would be useless, but this is not the case.

Degeneracy of the genetic code

Amino acid   Number of codons     Amino acid   Number of codons
Ala          4                    Ile          3
Arg          6                    Leu          6
Asn          2                    Lys          2
Asp          2                    Phe          2
Cys          2                    Pro          4
Gln          2                    Ser          6
Glu          2                    Thr          4
Gly          4                    Tyr          2
His          2                    Val          4

This occurrence of more than one codon per amino acid is called degeneracy. The degeneracy in the genetic code is not random; it is highly ordered. Usually, the multiple codons specifying an amino acid differ by only one base, the third or 3' base of the codon. Because of the degeneracy of the genetic code, either there must be several different tRNAs that recognize the different codons specifying a given amino acid, or the anticodon of a given tRNA must be able to base-pair with several different codons. The degeneracy is primarily of two types.

(1) Partial degeneracy: It occurs when the first two nucleotides are identical but the third or 3' base of the codon differs. The third base may be either of the two pyrimidines (U or C) and the codon will still specify the same amino acid (e.g. CUU and CUC code for leucine). Similarly, the purines (A and G) are often interchangeable as the third base of a codon (e.g. GUA and GUG code for valine).

(2) Complete degeneracy: In complete degeneracy any of the four bases may be present at the third position of the codon, and the codon will still specify the same amino acid, e.g. UCU, UCC, UCA and UCG code for serine (Sarin, 1997).

Importance of degeneracy in the genetic code
Degeneracy in the genetic code evolved as a way of minimizing mutational lethality.
If the degeneracy is of the type that leads to replacement by equivalent amino acids, small accidental mutational changes are much less damaging than they would be under a non-degenerate code. Thus degeneracy contributes favourably to genetic stability (Sarin, 1997). The wobble or third base of the codon contributes to specificity but, because it pairs only loosely with its corresponding base in the anticodon, it permits rapid dissociation of the tRNA from its codon during protein synthesis. If all three bases of a codon engaged in strong Watson-Crick pairing with the three bases of the anticodon, tRNAs would dissociate too slowly and this would severely limit the rate of protein synthesis. Thus, codon-anticodon interactions balance the requirements for accuracy and speed.

Wobble hypothesis
Transfer RNAs base-pair with mRNA codons by means of a three-base sequence on the tRNA called the anticodon. The first base of the codon in mRNA (read in the 5' to 3' direction) pairs with the third base of the anticodon. If the anticodon triplet of a tRNA recognized only one codon triplet through Watson-Crick base pairing, cells would need a different tRNA for each codon of an amino acid. This is not the case. For example, the anticodons in some tRNAs contain the nucleotide inosinate (designated I), which contains the uncommon base hypoxanthine. Inosinate can form hydrogen bonds with three different nucleotides, U, C and A, although these base pairings are much weaker than the hydrogen bonds of the Watson-Crick base pairs G≡C and A=U. Examination of these and other codon-anticodon pairings led Crick to conclude that the third base of most codons pairs rather loosely with the corresponding base of its anticodon; the third base of such codons and the first base of their corresponding anticodons “wobble”. Crick in 1966 proposed a hypothesis, known as the wobble hypothesis, to explain this phenomenon.
The first two bases of an mRNA codon always form strong Watson-Crick base pairs with the corresponding bases of the tRNA anticodon and confer most of the coding specificity. The first base of the anticodon determines the number of codons recognized by the tRNA. When the first base of the anticodon is C or A, base pairing is specific and only one codon is recognized by that tRNA. When the first base is U or G, binding is less specific and two different codons may be read. When inosine (I) is the first nucleotide of an anticodon, three different codons can be recognized (the maximum number for any tRNA). These relationships are summarized as follows; in each case the other two bases of the anticodon form strong Watson-Crick base pairs with the first two bases of the codon.

First base of anticodon    Third base of codon recognized    Codons read
C                          G                                 1
A                          U                                 1
U                          A or G                            2
G                          C or U                            2
I                          U, C or A                         3

When an amino acid is specified by several different codons, the codons that differ in either of the first two bases require different tRNAs. A minimum of 32 tRNAs is required to translate all 61 codons.

The genetic code has polarity
The code has polarity, i.e. it is read between fixed start and stop codons. The start codon is also known as the initiation codon and the stop codon as the chain termination codon. The message of mRNA is read in the 5' → 3' direction. The polypeptide chain is synthesized from the amino (-NH2) end to the carboxyl (-COOH) end, i.e. N → C.

Chain initiation codon
The codon present at the beginning of the cistron is known as the initiation codon; it marks the beginning of the message for a polypeptide chain. The initiation codon is AUG in the majority of cases, and it codes for the amino acid methionine. Rarely, GUG also acts as the initiation codon in bacterial protein synthesis; elsewhere in a message GUG codes for valine.

Chain termination codon
Three of the 64 codons, UAG, UAA and UGA, do not code for any amino acid. They bring about termination of the polypeptide chain and are therefore called termination codons.

Genetic code is non-overlapping and commaless
The genetic code is non-overlapping, which means that a base in mRNA is not used for two different codons.
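The difference between a non-overlapping and an overlapping reading can be made concrete with a small Python sketch. The sequence and the function names (`read_non_overlapping`, `read_overlapping`, `codons_using_base`) are hypothetical and serve only as an illustration.

```python
def read_non_overlapping(seq):
    """Read consecutive triplets; each base belongs to exactly one codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def read_overlapping(seq):
    """Read a triplet starting at every base; neighbouring codons share bases."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2)]


def codons_using_base(seq, index, step):
    """Count the codons (start positions spaced by step) that contain seq[index]."""
    starts = range(0, len(seq) - 2, step)
    return sum(1 for s in starts if s <= index < s + 3)


seq = "CATGATCATGAT"  # illustrative sequence only

print(read_non_overlapping(seq))
print(read_overlapping(seq))
print(codons_using_base(seq, 5, step=3), "codon uses base 5 when non-overlapping")
print(codons_using_base(seq, 5, step=1), "codons use base 5 when fully overlapping")
```

In the non-overlapping reading a change in one base can alter only one codon, whereas in a fully overlapping reading the same change would alter up to three adjacent codons.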
A commaless code means that there is no punctuation between the codons. In other words, the code is read from a fixed starting point as a continuous sequence of bases, taken three at a time; e.g. ABCDEFGHIJKL... is read as ABC/DEF/GHI/JKL... without any punctuation between the codons. When one amino acid has been coded, the next amino acid is automatically coded by the next three letters, and no letters are spent on signalling that one amino acid has been coded and that the next should now be coded. If one or two nucleotides are either deleted from or added to the interior of a message sequence, a frameshift mutation occurs and the reading frame is altered. The resulting amino acid sequence may become radically different from this point onward.

[Figure: the sequence CATGATCATGAT read as an overlapping code and as a non-overlapping, commaless code]

Colinearity of genetic code
The sequence of codons in mRNA and the corresponding amino acid residues of a polypeptide chain are arranged in the same linear order, i.e. the code is colinear with the amino acid sequence of the polypeptide chain. In prokaryotes, translation of mRNA occurs concomitantly with its transcription. mRNA is translated in the same direction as that in which it is synthesized: the RNA chain grows from the 5' to the 3' end, and translation of mRNA also proceeds from the 5' to the 3' end. Protein chains grow from their amino-terminal end. It has been shown that an amino acid closer to the amino terminus is represented by a codon closer to the 5' end of the corresponding mRNA (Sarin, 1997).

Deciphering the code
Deciphering the genetic code means establishing:
• Which codons specify which amino acids.
• How the code is punctuated.
• Whether different species use the same or different codons.
In the 1950s it was accepted on theoretical grounds that the genetic code should be triplet in nature, but it was not possible to say which of the 64 codons coded for which amino acid (Gupta, 2007). In 1961, M. W. Nirenberg and J. H. Matthaei synthesized RNA using only one nucleotide, uracil.
This means that the mRNA contained no base other than uracil, so the only possible triplet was UUU. When they used this poly-U RNA in polypeptide synthesis, only one amino acid, phenylalanine, was incorporated repeatedly, and it was concluded that phenylalanine is coded by the triplet codon UUU. The same experiment was repeated with adenine and cytosine. Poly-A RNA directed the synthesis of polylysine and poly-C RNA that of polyproline, so they concluded that AAA codes for lysine and CCC codes for proline. This type of experiment with poly-G was unsuccessful.

Afterwards, in 1964, M. W. Nirenberg and P. Leder proposed a technique known as the “binding technique”. They found that if a synthetic trinucleotide of known sequence is used with ribosomes and a particular aminoacyl-tRNA, these will form a complex, provided the codon used codes for the amino acid attached to the given aminoacyl-tRNA (Gupta, 2007).

Codon1 + Ribosome + AA1-tRNA1 → Ribosome-Codon1-AA1-tRNA1

In the above process, if a given amino acid AA1 is used with a given codon 1 and the formation of the complex is detected, this proves that the given codon codes for the given amino acid (Gupta, 2007). Free AA-tRNA passes through a nitrocellulose membrane easily, while the ribosome-codon-AA-tRNA complex adsorbs on such a membrane. If only one of the amino acids in a mixture is made radioactive, then the radioactive amino acid will be adsorbed on the nitrocellulose membrane only when it is part of the complex, which proves the relationship between the codon and the radioactive amino acid. For example, 20 samples of a mixture of all 20 amino acids may be taken, with one amino acid made radioactive in each sample, in such a manner that every amino acid is radioactive in one sample or another and no two samples have the same radioactive amino acid. A particular sample is then known by its radioactive amino acid. Now tRNAs and ribosomes are mixed with each sample, and the same codon is used for complex formation in all 20 cases.
When the mixture is poured on the nitrocellulose membrane, radioactivity will be observed on the membrane only when the radioactive amino acid takes part in the formation of the complex. Since the radioactive amino acid in each sample is known, the amino acid coded by a given codon can be detected by the presence of radioactivity on the membrane. Nirenberg and his co-workers applied this treatment to all 64 synthetic codons and identified their respective amino acids. By this technique they cracked 45 codons, including those for the amino acids arginine, alanine, methionine, proline, tryptophan, tyrosine, serine and valine (Gupta, 2007).

H. G. Khorana (1961) also devised a technique for cracking the genetic code. He prepared polyribonucleotides with known repeating sequences. A repeating sequence means that, if C and U are the two bases, they are present repeatedly throughout the length as follows:
CUCUCUCUCUCUCUCUCU
In a similar manner, if A, C and U are the three bases, they are present repeatedly as follows:
ACUACUACUACUACUACU
In (CU)n only two codons are possible, CUC and UCU, in alternating sequence (CUC/UCU/CUC/UCU/CUC/UCU). This means that only two amino acids, leucine coded by CUC and serine coded by UCU, are incorporated in alternating fashion.

Copolymer    Codons                  Amino acids               Codons assigned
(CU)n        CUC/UCU/CUC/UCU         Leucine / serine          CUC / UCU
(UG)n        UGU/GUG/UGU/GUG         Cysteine / valine         UGU / GUG
(AC)n        ACA/CAC/ACA/CAC         Threonine / histidine     ACA / CAC

Assignment of codons having known sequences, with the help of copolymers having repetitive sequences of two bases.

Similarly, consider a repeating sequence of three bases, e.g. (ACG)n. Depending upon where the reading is started, three kinds of homopolypeptides are expected. The actual codon assignment, i.e. finding out which of the three codons codes for which amino acid, depends upon previously available information regarding the composition of bases in the codons coding for different amino acids. On the basis of the above techniques, a complete genetic code dictionary could be prepared (Gupta, 2007).

Codons                                Homopolypeptide       Codon assignment
ACG/ACG/ACG/ACG = poly(ACG)           (Threonine)n          ACG = threonine
A/CGA/CGA/CGA/CGA = poly(CGA)         (Arginine)n           CGA = arginine
AC/GAC/GAC/GAC/GAC = poly(GAC)        (Aspartic acid)n      GAC = aspartic acid

MITOCHONDRIAL GENETIC CODE
In 1979, scientists began to determine the complete nucleotide sequences of the mitochondrial genomes of humans and mice. It came as a shock when these scientists learned that the genetic code used by these mammalian mitochondria was not quite the same as the “universal code” that had become so familiar to biologists. In the mitochondrial genomes, UGA, which should have been a “stop” codon, was instead read as the amino acid tryptophan; AUA was read as methionine rather than isoleucine; and AGA and AGG were read as “stop” rather than arginine. Furthermore, minor differences from the universal code have also been found in the genomes of chloroplasts and ciliates. Thus, it appears that the genetic code is not quite universal. Some time ago, presumably after they began their endosymbiotic existence, mitochondria and chloroplasts began to read the code differently, particularly the portion of the code associated with “stop” signals. The major differences between the universal code and the mammalian mitochondrial code are:
(i) The universal code has one termination codon UGA and two arginine codons AGA and AGG; in the mammalian mitochondrial code UGA codes for tryptophan, and AGA and AGG code for stop signals.
(ii) The number of tRNAs is 22 in mammalian mitochondria, while it is 55 in E. coli (Gupta, 2007).
(iii) In the universal genetic code CUN (N = any nucleotide) codes for leucine, while in the yeast mitochondrial code CUN codes for threonine.
(iv) A tRNA with the anticodon UAG is aminoacylated with leucine under the universal genetic code, whereas in yeast mitochondria the tRNA with the UAG anticodon accepts threonine.
Variant genetic codes were also identified outside mitochondria in the mid-1980s. In the bacterium Mycoplasma capricolum UGA codes for tryptophan, and in ciliated protozoa UAA and UAG code for glutamine instead of acting as stop signals.
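The effect of these reassignments can be illustrated with a minimal Python sketch that translates the same message under the universal code and under the mammalian mitochondrial code described above. The one-letter amino acid symbols (with `*` for stop), the example message and the names `STANDARD`, `MAMMALIAN_MITO` and `translate` are assumptions introduced for this illustration; only the four reassignments named in the text (UGA, AUA, AGA, AGG) are applied.

```python
from itertools import product

BASES = "UCAG"
# Standard code in one-letter symbols, ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Mammalian mitochondrial code: only the changes described in the text.
MAMMALIAN_MITO = dict(STANDARD)
MAMMALIAN_MITO.update({"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"})


def translate(mrna, table):
    """Translate codon by codon from the first base until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = "AUGAUAUGAAGACCC"  # hypothetical message: AUG AUA UGA AGA CCC

print("universal code:    ", translate(mrna, STANDARD))
print("mitochondrial code:", translate(mrna, MAMMALIAN_MITO))
```

Under the universal code the message yields Met-Ile and stops at UGA; under the mitochondrial code it yields Met-Met-Trp and stops at AGA, so the same mRNA gives a different polypeptide depending on which code is used.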