Elucidation of the Genetic Code
4 major advances helped figure out the code:
1) The demonstration of colinearity between genes and proteins
2) The idea of triplet codons
3) Deciphering the first word (UUU = Phe)
4) Deciphering the rest of the code
The Colinearity of Gene and Protein
Charles Yanofsky (1964) studied auxotrophic mutants of the trpA gene, which codes for tryptophan synthase
1) Using genetic recombination he mapped the positions of the various mutations in the gene;
2) He sequenced the mutant proteins to determine the position of the mutant amino acid in the protein encoded by each mutant gene
(Figure: mutants 1, 2 and 3 mapped on the gene and on the protein.)
• The changes were colinear ‐ order of mutations in DNA was the same as the order of amino acid substitutions in the encoded protein.
• The relative distances between mutations in DNA were proportional to the distances between amino acid substitutions
• Different point mutations in the same position (based on his power of resolution) could result in different amino acids in the product:
– The nucleotide sequence determines the amino acid sequence.
Therefore gene sequence is colinear with protein sequence.
What is the nature of the code?
mRNA (nucleotides): 4 different nucleotides
protein (amino acids): 20 different amino acids
For 4 nucleotides (ATGC) to encode 20 amino acids, you need a coding unit of at least 3:
A coding unit of 2 nucleotides can only encode 16 amino acids (4x4), which is fewer than 20
A coding unit of 3 nucleotides can encode up to 64 amino acids (4x4x4), which is more than enough
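The counting argument above can be checked with a few lines of Python. This is only a sketch; the names code_words and minimum_unit_length are made up for illustration and do not come from the slides.

```python
# Number of distinct "words" available for a coding unit of a given length,
# using an alphabet of 4 nucleotides.
NUCLEOTIDES = 4
AMINO_ACIDS = 20

def code_words(unit_length: int) -> int:
    """Return how many different words a unit of this length can form."""
    return NUCLEOTIDES ** unit_length

def minimum_unit_length(targets: int = AMINO_ACIDS) -> int:
    """Return the shortest unit length that gives at least `targets` words."""
    n = 1
    while code_words(n) < targets:
        n += 1
    return n

for n in (1, 2, 3):
    enough = code_words(n) >= AMINO_ACIDS
    print(f"unit of {n}: {code_words(n)} words, enough for 20 amino acids: {enough}")
```

Calling minimum_unit_length() gives the smallest unit size that can cover all 20 amino acids.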
The idea of triplet codons
Several types of triplet codes are possible
(Figure: the sequence AUCCGUCGAAU read as amino acids aa1, aa2 and aa3 under three schemes: a wholly overlapping code, a partially overlapping code, and a nonoverlapping code.)
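The three reading schemes can be sketched by sliding a 3-base window along the example sequence with different step sizes. The assumption here (not stated on the slide) is that a wholly overlapping code advances 1 base per amino acid, a partially overlapping code advances 2, and a nonoverlapping code advances 3. The function name read_code is hypothetical.

```python
SEQUENCE = "AUCCGUCGAAU"  # example sequence from the slide

def read_code(seq: str, step: int, count: int = 3) -> list[str]:
    """Return the first `count` triplets read when the window moves `step` bases."""
    starts = range(0, step * count, step)
    return [seq[i:i + 3] for i in starts if i + 3 <= len(seq)]

# Assumed step sizes for the three schemes (illustrative labels).
SCHEMES = {"wholly overlapping": 1, "partially overlapping": 2, "nonoverlapping": 3}

for name, step in SCHEMES.items():
    print(f"{name}: {read_code(SEQUENCE, step)}")
```

In the overlapping schemes a single base change can alter more than one amino acid, whereas in the nonoverlapping scheme it can alter only one.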
Experiments performed by Francis Crick and Sydney Brenner (1961) generated mutations in bacteriophage T4 rIIB gene using the mutagen proflavin
Phage DNA + proflavin ‐‐‐> T4 rIIB mutants
T4 rIIB mutants + other mutagens ‐‐‐> no revertants
T4 rIIB mutants + proflavin ‐‐‐> wild type revertants
Proflavin inserts between two base pairs
It has the dimensions of a purine‐pyrimidine base pair
It distorts the helix and causes errors during DNA replication, leading to insertion or deletion of a base pair.
Revertants have a wild‐type phenotype, but a different genotype
They showed that while base addition or base deletion gave a mutant phenotype, a combination of a single base addition and single base deletion near one another on the DNA always produced a normal phenotype. This result established that the genetic code is a reading frame code, with code reading starting from some fixed point.
When one or two bases were inserted or deleted, protein function was destroyed. When three bases were inserted or deleted, protein function was often more or less normal: the genetic code is made up of 3‐base words, or codons, and the starting point of each gene establishes the reading frame.
Confirmed hypothesis of Crick:
message based on triplets, read three nucleotides at a time from a definite starting point
Wild type: THE BIG RED FOX ATE THE EGG
One deletion: THE IGR EDF OXA TET HEE GG
One deletion + one insertion: THE IGR EDX FOX ATE THE EGG
Three insertions: THE BXI GYR EDZ FOX ATE THE EGG
Three added bases restore the correct reading frame; the message is back in phase, so a “code word” consists of 3 units
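The sentence analogy above can be reproduced in code. This is a small sketch: the helper names delete_at, insert_at and read_frame are invented for illustration, and letters stand in for bases.

```python
MESSAGE = "THEBIGREDFOXATETHEEGG"  # "THE BIG RED FOX ATE THE EGG" without spaces

def delete_at(seq: str, index: int) -> str:
    """Remove the single letter at `index` (a one-base deletion)."""
    return seq[:index] + seq[index + 1:]

def insert_at(seq: str, index: int, letter: str) -> str:
    """Insert `letter` before position `index` (a one-base insertion)."""
    return seq[:index] + letter + seq[index:]

def read_frame(seq: str) -> str:
    """Read the message three letters at a time from the fixed starting point."""
    return " ".join(seq[i:i + 3] for i in range(0, len(seq), 3))

one_deletion = delete_at(MESSAGE, 3)                  # lose the B
rescued = insert_at(one_deletion, 8, "X")             # add X shortly after the deletion
three_insertions = insert_at(insert_at(insert_at(MESSAGE, 4, "X"), 7, "Y"), 11, "Z")

for label, seq in [("wild type", MESSAGE), ("one deletion", one_deletion),
                   ("deletion + insertion", rescued),
                   ("three insertions", three_insertions)]:
    print(f"{label}: {read_frame(seq)}")
```

Only the words between the first and last change stay garbled once the frame is restored, which matches the observation that such mutants are often more or less normal.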
Genetic Code is nonoverlapping
A codon (three bases or triplet) encodes an amino acid.
Genetic Code is read continuously from a fixed starting point.
Most amino acids must be encoded by multiple codons
• There are potentially 64 codons, and only 20 amino acids.
• If each amino acid is encoded by only one codon, there would be 44 codons which would not code for any amino acid.
• This would imply that more than 50% of the time, a frame shift would result in a codon that would be a ‘nonsense’ codon.
• This was contrary to experimental observations
• Conclusion: the genetic code is degenerate.
– More than one codon can code for a given amino acid.
– Each codon only codes for one amino acid
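The arithmetic behind this argument is short enough to write out. The sketch below assumes, as the slide does, a hypothetical code in which each amino acid has exactly one codon; the variable names are illustrative only.

```python
# Hypothetical one-codon-per-amino-acid code.
codons = 4 ** 3            # 64 possible triplets
amino_acids = 20

unassigned = codons - amino_acids      # codons with no amino acid
fraction = unassigned / codons         # share of all triplets that would be nonsense

print(f"unassigned codons: {unassigned}")
print(f"fraction of triplets that would be nonsense: {fraction:.1%}")
```

Under that assumption roughly two thirds of all triplets would be nonsense, which is the "more than 50%" figure quoted above.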
Steps to decipher the genetic code
1. Synthetic RNA as a messenger (Nirenberg)
Poly U ‐‐‐> poly‐phenylalanine (UUU is the phenylalanine codon)
Poly C ‐‐‐> poly‐proline (CCC is the proline codon)
Poly A ‐‐‐> poly‐lysine (AAA is the lysine codon)
2. Triplet binding (filter binding) assay (Nirenberg)
Synthetic trinucleotides promote binding of specific aminoacyl‐tRNAs (charged tRNAs) to ribosomes.
5’‐GUU‐3’ promotes binding of valyl‐tRNA ‐‐‐> GUU is the codon of valine.
3. Alternating copolymers (Khorana)
CUC UCU CUC UCU ‐‐‐> alternating copolymer of leucine and serine
CUC and UCU are the codons of leucine and serine.
Elucidation of the genetic code
1961 – Marshall Nirenberg and Heinrich Matthaei
Worked with an in vitro translation system from E. coli
Cell-free extract
•Ribosomes
•tRNAs
•Amino acids
•Enzymes
•ATP, GTP
+ mRNA = protein
in vitro synthesis of viral proteins – Tobacco Mosaic Virus (TMV) RNA
Control RNA template:
homopolymer poly(U) synthesized from UDP using polynucleotide phosphorylase
Reaction catalyzed by polynucleotide phosphorylase: poly An + ribonucleoside diphosphate ‐‐‐> poly An+1 + phosphate
Using this system they made a polyU mRNA by supplying UDP as the substrate; when this RNA was put into the cell‐free extract, it should be translated into a protein made up of the amino acid coded by the codon UUU.
Experiment:
• They set up 20 different test tube reactions
• Each one was spiked with a different radioactive amino acid
• They programmed each with the polyU RNA
• Then recovered the proteins by acid precipitation
• Under these conditions the proteins precipitate but the free amino acids do not
• Then they asked which reaction (out of the 20) has radioactivity in the protein pellet?
Results
Marshall Nirenberg and Heinrich Matthaei showed that poly‐U produced polyphenylalanine in a cell‐free solution from E. coli. In other words, only the test tube reaction spiked with radioactive Phe generated a radioactive pellet
They repeated the experiment with other synthetic homopolymer RNAs
Poly C ‐‐‐> poly‐proline (CCC is the proline codon)
Poly A ‐‐‐> poly‐lysine (AAA is the lysine codon)
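The logic of the 20-tube experiment can be written as a toy model. It is a sketch only: the homopolymer results are the three stated above, and the function name radioactive_pellets and the tube bookkeeping are invented for illustration, not a description of the actual protocol.

```python
# Products reported above for the three homopolymers.
HOMOPOLYMER_PRODUCT = {"U": "Phe", "C": "Pro", "A": "Lys"}

AMINO_ACIDS = ["Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
               "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val"]

def radioactive_pellets(base: str) -> dict[str, bool]:
    """For a homopolymer of `base`, report which labeled tube gives a hot pellet.

    Each tube contains one radioactive amino acid; the acid-precipitated
    protein is radioactive only if that amino acid was incorporated.
    """
    product = HOMOPOLYMER_PRODUCT[base]
    return {labeled: labeled == product for labeled in AMINO_ACIDS}

for base in "UCA":
    hot = [aa for aa, is_hot in radioactive_pellets(base).items() if is_hot]
    print(f"poly {base}: radioactive pellet in tube(s) {hot}")
```

Exactly one of the 20 tubes scores positive for each homopolymer, which is what assigns the codon.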
Getting at the Rest of the Code
Work with nucleotide copolymers (poly (A,C), etc.) revealed some of the codes
Gobind Khorana (organic chemist)
‐ synthesized DNA composed of alternating copolymers, e.g. ACACACACACAC…
‐ then used RNAP (RNA polymerase) to make RNA from the DNA template, e.g. UGUGUGUGUGUGU…
This RNA transcript has two possible alternating codons: UGU GUG UGU GUG
In a translation extract you should get a protein with 2 alternating amino acids
• UGUGUGUGUGUGUGUGU...
– Cys-Val-Cys-Val-Cys-Val-...
– Therefore one of UGU and GUG codes for Cys and the other for Val; this experiment alone cannot tell which is which
• UUCUUCUUCUUCUUC…
– Phe-Phe-Phe-Phe-... or
– Ser-Ser-Ser-Ser-… or
– Leu-Leu-Leu-Leu-...
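DNA template ‐‐‐> (RNA polymerase) ‐‐‐> repeating trinucleotide RNAs, each read in its possible frames:
GUA GUA GUA GUA ‐‐‐> Val Val Val Val
AGU AGU AGU ‐‐‐> Ser Ser Ser
UAC UAC UAC UAC ‐‐‐> Tyr Tyr Tyr Tyr
ACU ACU ACU ACU ‐‐‐> Thr Thr Thr Thr
CUA CUA CUA CUA ‐‐‐> Leu Leu Leu Leu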
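The repeating-copolymer results above can be reproduced by translating a repeated unit in each of its three reading frames. This sketch uses the standard genetic code written in the usual compact UCAG order, with one-letter amino acid symbols and * for a stop codon; the function names translate and repeat_products are hypothetical.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(rna: str, frame: int = 0) -> str:
    """Translate every complete codon starting at offset `frame`."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(frame, len(rna) - 2, 3))

def repeat_products(unit: str, copies: int = 4) -> list[str]:
    """Return the products of a repeating polymer in reading frames 0, 1 and 2."""
    rna = unit * copies
    return [translate(rna, frame) for frame in range(3)]

for unit in ("UG", "UUC", "GUA", "UAC"):
    print(unit, repeat_products(unit, copies=6))
```

A repeating dinucleotide gives one alternating product in every frame, while a repeating trinucleotide gives up to three different homopolymers. For poly(GUA) the frame that reads UAG gives no polypeptide, because UAG is a stop codon.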
Getting at the Rest of the Code
• Finally Marshall Nirenberg and Philip Leder cracked the entire code in 1966
• They showed that trinucleotides bound to ribosomes could direct the binding of specific aminoacyl‐tRNAs.
• By using C‐14 labeled amino acids with all the possible trinucleotide codes, they elucidated all 64 correspondences in the code
• Found that all the codons (except the 3 stop codons) specified an amino acid
• There are 64 codons and 20 amino acids
• Therefore amino acids can be encoded by >1 codon
In Vitro Triplet Binding Assay
Nirenberg and Leder (1964) mixed all 20 amino acids with cell‐free translation extract (ribosomes, tRNAs, soluble enzymes), then added a synthetic triplet RNA (a codon), e.g. UUU
(Figure: the triplet RNA UUU on the ribosome, paired with AAA on the tRNA carrying Phe. The ternary complex is very large and can be captured on a filter.)
They found that addition of the simple triplet RNA to the cell‐free extract could stimulate the binding of the tRNA that recognized that codon to a ribosome
Since the tRNA is covalently linked to the amino acid that is coded for by the codon, that amino acid gets localized to the ribosome
If they collect the ribosomes from the experiment they can identify which amino acid was brought to the ribosome by that triplet codon
Experiment: for each triplet RNA set up 20 reactions, each one spiked with a different radioactive amino acid. Ask which reaction generates radioactivity on the filter. That’s the amino acid coded for by the triplet codon!
– Ribosomes + UAU ‐> Tyr binds
– Ribosomes + AUA ‐> Ile binds
– Ribosomes + UUU ‐> Phe binds, etc.
Genetic code
61 of 64 possible triplets specify an amino acid
3 are signals to terminate translation
Codons for initiation of translation
Major codon for initiation is AUG
Regardless of codon used, the first amino acid incorporated in E. coli is formyl‐Met
For the 4288 genes identified in E. coli
AUG is used for 3542 genes
GUG is used for 612 genes
UUG is used for 130 genes
AUU is used for 1 gene
CUG may be used for 1 gene
Codons for termination of translation
UAA (ochre), UAG (amber), UGA (opal)
For genes identified in E. coli:
UAA is used for 2705 genes
UGA is used for 1257 genes
UAG is used for 326 genes
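The usage counts above are easier to compare as percentages. The sketch below uses only the numbers quoted in the lists above; the function name percentages is made up for illustration.

```python
GENES = 4288  # number of E. coli genes quoted above

START = {"AUG": 3542, "GUG": 612, "UUG": 130, "AUU": 1, "CUG": 1}
STOP = {"UAA": 2705, "UGA": 1257, "UAG": 326}

def percentages(counts: dict[str, int], total: int = GENES) -> dict[str, float]:
    """Express each codon count as a percentage of `total`, to one decimal."""
    return {codon: round(100 * n / total, 1) for codon, n in counts.items()}

print("start codons:", percentages(START))
print("stop codons: ", percentages(STOP))
print("start counts sum to", sum(START.values()))
print("stop counts sum to ", sum(STOP.values()))
```

The stop codon counts add up to all 4288 genes, while the start codon counts as listed add up to 4286.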
Universality of the Genetic Code
• Nearly all living beings use the same genetic code.
• Genetic code evolved early in life, and has remained nearly constant over billions of years because of the lack of tolerance for change.
• Some exceptions exist: the genetic code is not strictly universal
– in some ciliates (Tetrahymena and Paramecium), there is only one nonsense codon, UGA; UAA and UAG (STOP in the standard code) encode Gln
• In many eukaryotic mitochondria, there are some changes:
– UGA (STOP) encodes Trp
– in yeast mitochondria, CUA codes for Thr instead of Leu
• The impact of such changes may not be as drastic in organelles, since there are very few proteins encoded by these systems, allowing evolution of the genetic code.
Features of the Genetic Code
• All the codons have meaning: 61 specify amino acids, and the other 3 are "nonsense" or "stop" codons
• The code is unambiguous ‐ only one amino acid is indicated by each of the 61 codons
• The code is degenerate ‐ except for Trp and Met, each amino acid is coded by two or more codons
• Codons representing the same or similar amino acids are similar in sequence
– 2nd base pyrimidine: usually nonpolar amino acid
– 2nd base purine: usually polar or charged aa
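The counts in this list can be verified against the standard code table. The sketch below writes the standard code in the usual compact UCAG order with one-letter amino acid symbols (* marks a stop codon); the variable names are illustrative.

```python
from collections import Counter

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]

# How many codons specify each amino acid (stops excluded).
codons_per_amino_acid = Counter(CODON_TABLE[c] for c in sense_codons)
single_codon = sorted(aa for aa, n in codons_per_amino_acid.items() if n == 1)

print("sense codons:", len(sense_codons))
print("stop codons:", stop_codons)
print("amino acids with a single codon:", single_codon)   # M = Met, W = Trp
```

Because the table is a dictionary keyed by codon, each codon maps to exactly one meaning, which is the "unambiguous" property; the Counter shows the degeneracy.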
Wobble Theory in anticodon‐codon pairing
Crick suggested that the base at the third position of the codon and the matching (first) base of the anticodon could “wobble” (or spatially move) in order to form hydrogen bonds even if they were not complementary according to strict pairing rules
Wobble Theory means that a single tRNA can pair with more than 1 codon
Result: 61 codons can be read by as few as 31 tRNAs
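A sketch of how wobble lets one tRNA read several codons. The pairing table below is the usual textbook summary of Crick's wobble rules and is not given on the slides, so treat it as an assumption; I stands for inosine, and the function name codons_read and the example anticodons are illustrative.

```python
# Strict Watson-Crick pairing, used at the first two codon positions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Assumed wobble rules: 5' base of the anticodon -> 3' codon bases it can pair with.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}

def codons_read(anticodon: str) -> list[str]:
    """Return the codons (5'->3') readable by an anticodon written 5'->3'.

    Pairing is antiparallel: the anticodon's 5' base meets the codon's third
    base, and the anticodon's 3' base meets the codon's first base.
    """
    wobble_base, middle, last = anticodon
    first = COMPLEMENT[last]
    second = COMPLEMENT[middle]
    return [first + second + third for third in WOBBLE[wobble_base]]

for anticodon in ("GAA", "IGC", "CAU"):
    print(anticodon, "reads", codons_read(anticodon))
```

With these rules a single anticodon can cover two or three codons that differ only in the third base, which is why far fewer than 61 tRNAs are needed.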
Mutations that change the code
1. Missense mutation. A mutation in which the substitution of one base pair for another changes a codon into a codon for a different amino acid.
2. Nonsense mutation. A mutation that creates a nonsense or stop codon.
3. Frameshift mutation. An insertion or deletion that changes the reading frame.
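The three mutation classes can be told apart mechanically by comparing a reference mRNA with its mutant. This is a minimal sketch assuming the standard code and a reading frame that starts at the first base. The function name classify is hypothetical, and the extra labels "silent" and "in-frame indel" are additions for cases the list above does not name.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(rna: str) -> str:
    """Translate complete codons from the first base; * marks a stop codon."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))

def classify(original: str, mutant: str) -> str:
    """Classify a mutation by its effect on the reading frame and the protein."""
    if len(original) != len(mutant):
        # Indels that are not a multiple of 3 shift the reading frame.
        return "frameshift" if (len(mutant) - len(original)) % 3 else "in-frame indel"
    # Same length: a substitution. The first changed codon decides the class.
    for before, after in zip(translate(original), translate(mutant)):
        if before != after:
            return "nonsense" if after == "*" else "missense"
    return "silent"

print(classify("AUGGAGGGC", "AUGGUGGGC"))  # Glu codon GAG changed to Val codon GUG
print(classify("AUGUAUGGC", "AUGUAAGGC"))  # Tyr codon UAU changed to stop codon UAA
print(classify("AUGGAGGGC", "AUGAGGGC"))   # one base deleted
```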