Elucidation of the Genetic Code

Four major advances helped to work out the code:
1) The demonstration of colinearity between genes and proteins
2) The idea of triplet codons
3) Deciphering the first word (UUU = Phe)
4) Deciphering the rest of the code

The Colinearity of Gene and Protein

Charles Yanofsky (1964) studied auxotrophic mutants of the trpA gene, which codes for tryptophan synthase:
1) Using genetic recombination, he mapped the positions of the various mutations in the gene;
2) He sequenced the mutant proteins to determine the position of the altered amino acid in the protein encoded by each mutant gene.

[Figure: positions of Mutants 1, 2 and 3 in the gene aligned with the corresponding amino acid changes in the protein]

• The changes were colinear: the order of mutations in the DNA was the same as the order of amino acid substitutions in the encoded protein.
• The relative distances between mutations in the DNA were proportional to the distances between amino acid substitutions.
• Different point mutations at the same position (within his power of resolution) could result in different amino acids in the product:
  – The nucleotide sequence determines the amino acid sequence. Therefore the gene sequence is colinear with the protein sequence.
What is the nature of the code?

mRNA (nucleotides): 4 different nucleotides
protein (amino acids): 20 different amino acids

For 4 nucleotides (A, U, G, C) to encode 20 amino acids, you need a coding unit of at least 3:
• A coding unit of 2 nucleotides can encode only 16 amino acids (4 x 4)
• A coding unit of 3 nucleotides can encode up to 64 amino acids (4 x 4 x 4)

The idea of triplet codons

Several types of triplet codes are possible. For the message AUCCGUCGAAU:
• A wholly overlapping code: successive codons are offset by one base (aa1 = AUC, aa2 = UCC, aa3 = CCG)
• A partially overlapping code: successive codons share one base (aa1 = AUC, aa2 = CCG, aa3 = GUC)
• A nonoverlapping code: each base belongs to only one codon (aa1 = AUC, aa2 = CGU, aa3 = CGA)

Francis Crick and Sydney Brenner (1961) generated mutations in the bacteriophage T4 rIIB gene using the mutagen proflavin:
• Phage DNA + proflavin ---> T4 rIIB mutants
• T4 rIIB mutants + other mutagens ---> no revertants
• T4 rIIB mutants + proflavin ---> wild-type revertants (wild-type phenotype, but different genotype)

Proflavin:
• Inserts between two base pairs
• Has the dimensions of a purine-pyrimidine base pair
• Distorts the helix and causes errors during DNA replication, leading to insertion or deletion of a base pair

They showed that while a single base addition or a single base deletion gave a mutant phenotype, a combination of a single base addition and a single base deletion near one another on the DNA restored a normal phenotype. This result established that the genetic code is a reading-frame code, with reading starting from some fixed point.

When one or two bases were inserted or deleted, protein function was destroyed. When three bases were inserted or deleted, protein function was often more or less normal. The genetic code is therefore made up of 3-base words, or codons, and the starting point of each gene establishes the reading frame.
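The counting argument and the frameshift logic above can be sketched in a few lines of Python. This is only an illustration, not a model of the phage experiments: a sentence of three-letter words stands in for a gene, and the helper names (coding_capacity, read_frame, delete_base, insert_base) are made up for this sketch.

```python
def coding_capacity(unit_length, alphabet_size=4):
    """Number of distinct code words of a given length (4 bases by default)."""
    return alphabet_size ** unit_length


def read_frame(message, start=0):
    """Split a message into nonoverlapping triplets from a fixed starting point."""
    return [message[i:i + 3] for i in range(start, len(message), 3)]


def delete_base(message, position):
    """Remove the single letter at the given index."""
    return message[:position] + message[position + 1:]


def insert_base(message, position, base):
    """Insert a single letter before the given index."""
    return message[:position] + base + message[position:]


# Hypothetical "gene": a sentence written in three-letter words.
wild_type = "THEBIGREDFOXATETHEEGG"

# One deletion shifts every downstream word out of frame.
one_deletion = delete_base(wild_type, 3)

# A nearby insertion brings the downstream words back into frame.
rescued = insert_base(one_deletion, 8, "X")

for label, message in [("wild type", wild_type),
                       ("one deletion", one_deletion),
                       ("deletion + insertion", rescued)]:
    print(f"{label:22s}", " ".join(read_frame(message)))

print("capacity of 1, 2, 3 bases:", [coding_capacity(n) for n in (1, 2, 3)])
```

Only the words between the deletion and the insertion stay garbled in the rescued message, which is why the two changes have to lie close together for the product to remain functional.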
Crick's hypothesis was confirmed: the message is based on triplets, read three nucleotides at a time from a definite starting point.

Wild type:              THE BIG RED FOX ATE THE EGG
One deletion:           THE IGR EDF OXA TET HEE GG
Deletion + insertion:   THE IGR EDX FOX ATE THE EGG
Three insertions:       THE BXI GYR EDZ FOX ATE THE EGG

Three added bases restore the correct reading frame and put the message back in phase, so a "code word" consists of 3 units.
• The genetic code is nonoverlapping.
• A codon (three bases, or triplet) encodes an amino acid.
• The genetic code is read continuously from a fixed starting point.

Most amino acids must be encoded by multiple codons
• There are potentially 64 codons, and only 20 amino acids.
• If each amino acid were encoded by only one codon, 44 codons would not code for any amino acid.
• This would imply that more than 50% of the time (44 of 64 codons), a frameshift would produce a "nonsense" codon.
• This was contrary to experimental observations.
• Conclusion: the genetic code is degenerate.
  – More than one codon can code for a given amino acid.
  – Each codon codes for only one amino acid.

Steps to decipher the genetic code
1. Synthetic RNA as a messenger (Nirenberg)
   Poly U ---> poly-phenylalanine (UUU is the phenylalanine codon)
   Poly C ---> poly-proline (CCC is the proline codon)
   Poly A ---> poly-lysine (AAA is the lysine codon)
2. Triplet binding (filter binding) assay (Nirenberg)
   Synthetic trinucleotides promote the binding of specific charged (aminoacyl) tRNAs to ribosomes.
   5'-GUU-3' promotes binding of valyl-tRNA ---> GUU is a codon for valine.
3. Alternating copolymers (Khorana)
   CUC UCU CUC UCU ---> alternating copolymer of leucine and serine
   CUC and UCU are codons for leucine and serine.

Elucidation of the genetic code

1961: Marshall Nirenberg and Heinrich Matthaei worked with an in vitro translation system from E. coli.

Cell-free extract:
• Ribosomes
• tRNAs
• Amino acids
• Enzymes
• ATP, GTP
Cell-free extract + mRNA = protein

• In vitro synthesis of viral proteins: Tobacco Mosaic Virus (TMV) RNA
• Control RNA template: homopolymer poly(U), synthesized from UDP using polynucleotide phosphorylase

Polynucleotide phosphorylase reaction:
poly A(n) + ribonucleoside diphosphate <---> poly A(n+1) + phosphate

Using this enzyme they made a poly(U) mRNA by supplying the reaction with UDP. When this RNA was put into the cell-free extract, it should be translated into a protein made up of the amino acid coded by the codon UUU.

Experiment:
• They set up 20 different test tube reactions
• Each one was spiked with a different radioactive amino acid
• They programmed each with the poly(U) RNA
• Then they recovered the proteins by acid precipitation
• Under these conditions the proteins precipitate but the free amino acids do not
• Then they asked which reaction (out of the 20) had radioactivity in the protein pellet

Results
Marshall Nirenberg and Heinrich Matthaei showed that poly(U) produced polyphenylalanine in a cell-free extract from E. coli. In other words, only the test tube reaction spiked with radioactive Phe generated a radioactive pellet.

They repeated the experiment with other synthetic homopolymer RNAs:
Poly C ---> poly-proline (CCC is the proline codon)
Poly A ---> poly-lysine (AAA is the lysine codon)

Getting at the Rest of the Code

Work with nucleotide copolymers (poly(A,C), etc.) revealed some of the codons.

Gobind Khorana (organic chemist)
• Synthesized DNA composed of alternating copolymers, e.g. ACACACACACAC...
• Then used RNA polymerase (RNAP) to make RNA from the DNA template, e.g. UGUGUGUGUGUGU...
• This RNA transcript has two possible alternating codons: UGU GUG UGU GUG
• In a translation extract you should get a protein with 2 alternating amino acids

• UGUGUGUGUGUGUGUGU...
  – Cys-Val-Cys-Val-Cys-Val-...
  – Therefore UGU and GUG code for Cys and Val, but this experiment alone does not show which codon specifies which amino acid
• UUCUUCUUCUUCUUC..., depending on the reading frame (UUC, UCU or CUU):
  – Phe-Phe-Phe-Phe-... or
  – Ser-Ser-Ser-Ser-... or
  – Leu-Leu-Leu-Leu-...

[Figure: a DNA template transcribed by RNA polymerase into repeating-triplet RNAs; each reading frame yields a homopolypeptide]
GUA GUA GUA GUA ---> Val Val Val Val
AGU AGU AGU ---> Ser Ser Ser
UAC UAC UAC UAC ---> Tyr Tyr Tyr Tyr
ACU ACU ACU ACU ---> Thr Thr Thr Thr
CUA CUA CUA CUA ---> Leu Leu Leu Leu

• Finally, Marshall Nirenberg and Philip Leder cracked the entire code in 1966.
• They showed that trinucleotides bound to ribosomes could direct the binding of specific aminoacyl-tRNAs.
• By using C-14 labeled amino acids with all the possible trinucleotides, they elucidated all 64 correspondences in the code.
• They found that all the codons (except the 3 stop codons) specify an amino acid.
• There are 64 codons and 20 amino acids.
• Therefore amino acids can be encoded by more than 1 codon.

In Vitro Triplet Binding Assay

Nirenberg and Leder (1964)
• Mixed all 20 amino acids with a cell-free translation extract (ribosomes, tRNAs, soluble enzymes)
• Added a synthetic triplet RNA (a codon), e.g. UUU

[Figure: ribosome with bound triplet RNA (UUU) and Phe-tRNA (anticodon AAA)]
• Ternary complex
• Very large
• Can be captured on a filter

They found that adding a simple triplet RNA to the cell-free extract could stimulate the binding of the tRNA that recognizes that codon to a ribosome. Since the tRNA is covalently linked to the amino acid specified by the codon, that amino acid becomes localized to the ribosome. By collecting the ribosomes from the experiment, they could identify which amino acid was brought to the ribosome by that triplet codon.

Experiment: for each triplet RNA, set up 20 reactions, each one spiked with a different radioactive amino acid. Ask which reaction generates radioactivity on the filter. That is the amino acid coded for by the triplet codon.
  – Ribosomes + UAU ---> Tyr binds
  – Ribosomes + AUA ---> Ile binds
  – Ribosomes + UUU ---> Phe binds, etc.
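The reading-frame reasoning behind the copolymer experiments described above can be checked with a short Python sketch. It is an illustration only: the codon table is the standard genetic code written in the usual U, C, A, G ordering with one-letter amino acid symbols ("*" marks a stop codon), and frame_products is a hypothetical helper name, not part of any library.

```python
# Standard genetic code, one-letter amino acid symbols, "*" = stop.
# The string follows the conventional ordering: first, second, then third
# base, each running through U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def frame_products(repeat_unit, copies=12):
    """Translate a repeating RNA in each of its three reading frames."""
    rna = repeat_unit * copies
    products = []
    for start in range(3):
        codons = [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]
        products.append("".join(CODON_TABLE[codon] for codon in codons))
    return products


# Alternating dinucleotide: every frame alternates Cys (C) and Val (V).
print("poly(UG): ", frame_products("UG", copies=6))

# Repeating trinucleotide: each frame gives a different homopolymer.
print("poly(UUC):", frame_products("UUC", copies=4))

# poly(GUA): one frame reads UAG, a stop codon, so only two products appear.
print("poly(GUA):", frame_products("GUA", copies=4))
```

The poly(GUA) case is consistent with the figure above, which lists only Val and Ser for that template: the third frame reads the stop codon UAG.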
Genetic code
• 61 of 64 possible triplets specify an amino acid
• 3 are signals to terminate translation

Codons for initiation of translation
• The major codon for initiation is AUG
• Regardless of the codon used, the first amino acid incorporated in E. coli is formyl-Met
• For the 4288 genes identified in E. coli:
  – AUG is used for 3542 genes
  – GUG is used for 612 genes
  – UUG is used for 130 genes
  – AUU is used for 1 gene
  – CUG may be used for 1 gene

Codons for termination of translation
• UAA (ochre), UAG (amber), UGA (opal)
• For genes identified in E. coli:
  – UAA is used for 2705 genes
  – UGA is used for 1257 genes
  – UAG is used for 326 genes

Universality of the Genetic Code
• Almost all living organisms use the same genetic code.
• The genetic code evolved early in the history of life and has remained nearly constant over billions of years because changes to it are poorly tolerated.
• Some exceptions exist, so the genetic code is not strictly universal:
  – In some ciliates there is only one nonsense codon (UGA): UAA and UAG (STOP in the standard code) encode Gln in Tetrahymena and Paramecium
• In eukaryotic mitochondria, there are some changes:
  – UGA (STOP) encodes Trp
  – In yeast, CUA codes for Thr instead of Leu
• The impact of such changes in these organisms and organelles may not be as drastic, since very few proteins are encoded by these systems, which allows the genetic code to evolve.
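The codon usage counts quoted above are easier to compare as percentages. The sketch below is plain arithmetic on the numbers given in the text. The dictionary and function names are made up for illustration. Note that the listed start codon counts add up to 4286 rather than 4288, so each percentage is computed against the sum of the counts actually listed.

```python
# Counts as quoted in the text for E. coli genes.
start_codon_counts = {"AUG": 3542, "GUG": 612, "UUG": 130, "AUU": 1, "CUG": 1}
stop_codon_counts = {"UAA": 2705, "UGA": 1257, "UAG": 326}


def usage_percentages(counts):
    """Convert raw counts to percentages of the listed total, to one decimal."""
    total = sum(counts.values())
    return {codon: round(100 * n / total, 1) for codon, n in counts.items()}


print("start codons listed:", sum(start_codon_counts.values()))
print("stop codons listed: ", sum(stop_codon_counts.values()))
print("start codon usage (%):", usage_percentages(start_codon_counts))
print("stop codon usage (%): ", usage_percentages(stop_codon_counts))
```

On these figures AUG accounts for a little over four fifths of the listed start codons, and UAA for a little under two thirds of the stop codons.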
Features of the Genetic Code
• All the codons have meaning: 61 specify amino acids, and the other 3 are "nonsense" or "stop" codons
• The code is unambiguous: only one amino acid is specified by each of the 61 codons
• The code is degenerate: except for Trp and Met, each amino acid is coded by two or more codons
• Codons representing the same or similar amino acids are similar in sequence
• 2nd base pyrimidine: usually a nonpolar amino acid
• 2nd base purine: usually a polar or charged amino acid

Wobble Theory in anticodon-codon pairing
• Crick suggested that bases in the codon and anticodon could "wobble" (move spatially) at the third codon position in order to form hydrogen bonds even if they were not complementary according to strict pairing rules
• Wobble means that a single tRNA can pair with more than 1 codon
• Result: 61 codons can be read by as few as 31 tRNAs

Mutations that change the code
1. Missense mutation. A substitution of one base pair for another that changes a codon for one amino acid into a codon for a different amino acid.
2. Nonsense mutation. A mutation that creates a nonsense or stop codon.
3. Frameshift mutation. An insertion or deletion that changes the reading frame.
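The features and mutation classes listed above can be checked against the standard codon table with a short Python sketch. It is an illustration under stated assumptions: the table is the standard code in the usual U, C, A, G ordering with one-letter amino acid symbols, the function names are hypothetical, and the extra "silent" label (a substitution that leaves the amino acid unchanged) is not in the list above but follows directly from degeneracy.

```python
from collections import Counter

# Standard genetic code, one-letter amino acid symbols, "*" = stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Degeneracy: how many codons specify each amino acid (or stop).
codons_per_residue = Counter(CODON_TABLE.values())
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
single_codon_residues = sorted(
    aa for aa, n in codons_per_residue.items() if n == 1
)


def classify_substitution(codon, position, new_base):
    """Classify a single-base substitution in a sense codon."""
    if CODON_TABLE[codon] == "*":
        raise ValueError("expected a sense codon, got a stop codon")
    mutant = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if after == before:
        return "silent"      # same amino acid, thanks to degeneracy
    if after == "*":
        return "nonsense"    # sense codon converted to a stop codon
    return "missense"        # codon now specifies a different amino acid


def classify_indel(bases_added_or_removed):
    """An insertion or deletion shifts the frame unless it is a multiple of 3."""
    return "frameshift" if bases_added_or_removed % 3 else "in-frame"


print("sense codons:", 64 - len(stop_codons), "stop codons:", stop_codons)
print("amino acids with a single codon:", single_codon_residues)
print("UUU -> UCU:", classify_substitution("UUU", 1, "C"))
print("UGG -> UGA:", classify_substitution("UGG", 2, "A"))
print("GUU -> GUC:", classify_substitution("GUU", 2, "C"))
print("+1 base:", classify_indel(1), "| -3 bases:", classify_indel(-3))
```

Here "M" and "W" are the one-letter symbols for Met and Trp, the two amino acids noted above as having a single codon each.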