Translation

Messenger RNA Structure
• In eukaryotes, genes can be divided into exons and introns. An exon is any sequence that ends up in the messenger RNA. An intron is any sequence that is transcribed but spliced out of the primary transcript.
• As we move to the level of translation, we need to add another set of definitions:
• The coding sequence or CDS is the part of the messenger RNA sequence that is translated into protein. The CDS starts at the first AUG (start codon) and ends at the first in-frame stop codon.
• From the beginning of the mRNA to the start codon is the 5’ untranslated region or 5’UTR.
• From the end of the stop codon to the end of the mRNA is the 3’ untranslated region or 3’UTR.
• Note that the first exon contains the 5’UTR, but it might also include part of the CDS. Or, the 5’UTR might take up the entire first exon and also part of the second exon. That is, intron/exon boundaries are not necessarily the same as UTR/CDS boundaries.

Codons and the Genetic Code
• There are 20 different amino acids coded in DNA, but only 4 different nucleotides. To accommodate all the amino acids, each amino acid is coded for by a group of 3 nucleotides, called a codon.
• There are 64 possible codons (4 x 4 x 4 = 64). The table that shows the correspondence between the 64 codons and the amino acids they code for is called the genetic code.
• Most amino acids have more than one codon: we say that the genetic code is degenerate. The consequence of this is that if you know the amino acid sequence of a protein, you can’t be sure of the exact nucleotide sequence.
• If an amino acid has 4 possible codons, they all start with the same 2 bases, and the third base can be anything. See proline, threonine, alanine, valine.
• If an amino acid has only 2 codons, the first two bases are the same, and the third is either a pyrimidine (C or U) or a purine (A or G). E.g., histidine/glutamine or asparagine/lysine.
• Three amino acids have 6 possible codons: leucine, serine, and arginine.
They have different first bases as well as third bases.
• All eukaryotic proteins start translation at AUG, the codon for methionine. This initial methionine is removed after translation in many proteins. There are also methionines coded by AUG in the middle of most proteins. AUG is the only methionine codon.
• In addition to AUG, bacteria use GUG and UUG as start codons. All three cause the first amino acid to be N-formyl methionine. In the rest of the protein, GUG and UUG code for valine and leucine.
• Three codons (UAA, UAG, and UGA) are stop codons: they code for no amino acid, and every CDS ends in one of these 3 stop codons.
• Except in the very special circumstances of the unusual amino acids selenocysteine and pyrrolysine (discussed later), there are no internal stop codons in a CDS.

The Genetic Code
• The genetic code is almost universal: nearly all organisms use it, both prokaryote and eukaryote.
• There are several minor variants, with one or two codons changed, in mitochondria and in some single-celled eukaryotes (protists).
• It is thought that these variants developed late in evolutionary time, because they only affect a few taxonomic groups.

Reading Frames
• Codons are groups of 3 bases. Since translation can start at any nucleotide of the mRNA, the same region of DNA can be read in 3 ways, starting one base apart. Each of these 3 modes is a reading frame.
• The DNA might also be read on the opposite strand, giving a total of 6 possible reading frames.
• The proper reading frame is set by the initiation AUG. Once this codon is read, the ribosome simply moves down 3 nucleotides at a time.
• Genes occur in open reading frames (ORFs), areas where there are no stop codons. Genes end at the first stop codon that exists in their reading frame.
• 3 out of every 64 codons are stop codons, so long open reading frames are rare in random, unselected DNA. Since genes are under selection pressure, most long open reading frames contain genes.
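The codon and reading-frame rules above can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the input is an uppercase RNA string containing only A, U, C and G, the standard genetic code is used, and the function names (`translate`, `find_orfs`, `reverse_complement`) are hypothetical helpers invented for this example, not part of any library.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
# "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degeneracy: how many codons code for each amino acid (or stop).
codons_per_aa = Counter(CODON_TABLE.values())


def translate(mrna, start=0):
    """Translate codon by codon from `start` until the first in-frame stop."""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def find_orfs(mrna, min_codons=2):
    """List ORFs that begin at AUG and end at an in-frame stop codon.

    Returns tuples (frame, start, end, protein): frame is 1, 2 or 3,
    start is a 0-based index, end is exclusive and includes the stop codon.
    Only the first AUG of each ORF is reported.
    """
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(mrna):
            if mrna[i:i + 3] != "AUG":
                i += 3
                continue
            # Walk forward 3 bases at a time until a stop codon or the end.
            j = i
            while j + 3 <= len(mrna) and CODON_TABLE[mrna[j:j + 3]] != "*":
                j += 3
            if j + 3 <= len(mrna):  # a stop codon was found
                protein = translate(mrna, i)
                if len(protein) >= min_codons:
                    orfs.append((frame + 1, i, j + 3, protein))
            i = j + 3
    return orfs


def reverse_complement(rna):
    """Opposite strand, read 5' to 3' (gives the other 3 reading frames)."""
    return rna.translate(str.maketrans("AUCG", "UAGC"))[::-1]


example = "GGAUGGCUUAACC"
print(find_orfs(example))  # [(3, 2, 11, 'MA')]
```

Calling `find_orfs(reverse_complement(seq))` scans the remaining three reading frames. Raising `min_codons` mimics the observation that long ORFs are unlikely to arise by chance.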
The accompanying figure shows all 3 reading frames for a small virus. Potential start codons (AUG) are marked with short lines, and stop codons are marked with full-width lines. There are 2 ORFs, one in reading frame 3 and the other in reading frame 1.

Transfer RNA
• Transfer RNA (tRNA) molecules are short RNAs that serve as the adapters between the codons of mRNA and the amino acids they code for.
• Transfer RNA molecules fold into a characteristic cloverleaf pattern formed by base-pairing within the molecule. Higher level (tertiary) structure then forms as different parts of the cloverleaf hydrogen-bond with each other.
• Each tRNA has 3 bases that make up the anticodon. These bases pair with the 3 bases of the codon on mRNA during translation.
• Some tRNAs can pair with more than one codon. The first base of the anticodon (which matches the third base of the codon) is called the wobble position. It can form base pairs with several different nucleotides, often using non-standard base pairings. Also, the wobble position is often inosine, a purine that can base pair with C, A, or U.
• Inosine is made by deaminating adenosine.
• Different species use different numbers of tRNAs; a minimum of 31 different tRNAs is necessary. Humans have 48 different tRNAs.
• There are multiple copies of tRNA genes, with about 500 total in humans.

tRNA Charging
• A set of enzymes, the aminoacyl-tRNA synthetases, are used to “charge” the tRNA with the proper amino acid (that is, attach the amino acid to it).
• There are 20 aminoacyl-tRNA synthetases, one for each amino acid. Each enzyme works with all of the tRNAs that carry that amino acid.
• The aminoacyl-tRNA synthetases recognize their tRNAs by bases in both the anticodon and the acceptor stem (just below the amino acid attachment point).
• The –COOH group of the amino acid is attached to the 3’ –OH at the 3’ end of the tRNA, using energy from hydrolyzing ATP to AMP (not just to ADP).
• The bond formed is a high energy bond, and the energy is later used to drive the formation of the peptide bond between this amino acid and the growing peptide chain.

Ribosomes
• Ribosomes are RNA/protein hybrids that perform protein synthesis. About 60% of the mass of a ribosome is RNA, with 40% protein.
• The ribosome is a ribozyme: the ribosomal RNA catalyzes the chemical reactions of protein synthesis.
• Ribosomes consist of a large subunit and a small subunit.
• In eukaryotes, the large subunit is called the 60S subunit. It has 3 RNA molecules and 46-50 polypeptides. The small subunit, called 40S, contains one RNA plus 33 polypeptides. Together, the ribosomal subunits make up the 80S ribosome.
• The S (Svedberg) units measure sedimentation velocity in a centrifuge, and they are not strictly additive.
• In prokaryotes, the assembled ribosome is called 70S. It is smaller than the eukaryotic ribosome. The large subunit (50S) has 2 RNAs (one fewer than eukaryotic) plus 31 polypeptides, and the small subunit (30S) has 1 RNA and 21 polypeptides.
• When the ribosome is assembled, it has a binding site for mRNA as well as 3 binding sites for tRNA, called the E site, P site, and A site.

G Proteins
• G proteins are used for a variety of signal transmissions within the cell, as well as between the outside of the cell and the inside. G proteins can be thought of as molecular switches: they have two different conformations, corresponding to “On” and “Off”.
• G proteins work by binding a molecule of GTP. When GTP is bound, the protein is in the “on” state. The G protein slowly hydrolyzes the GTP to GDP.
• Conversion of GTP to GDP (going from on to off) can be greatly sped up by GTPase-activating proteins or GAPs.
• Hydrolysis of GTP to GDP causes a change of conformation of the protein, converting it to the “off” state.
• The energy derived from converting GTP to GDP is used to change the conformation of the G protein.
Sometimes this energy does other useful work as well: moving the ribosome from one codon to the next, for example.
• The G protein goes back to the “on” state when the GDP is exchanged for a GTP, with the help of a guanine nucleotide exchange factor (GEF) protein.
• We will see several uses of G proteins and their associated GEF and GAP proteins in the translation process.

Initiation Process
• Translation has 3 phases: initiation, elongation, and termination.
• All proteins, both prokaryotic and eukaryotic, use an initiator tRNA to insert the first amino acid, which is always methionine or a derivative.
• In eukaryotes, the first amino acid is always methionine, whose codon is AUG.
• In bacteria, the first amino acid is a modified version of methionine: N-formyl methionine. Bacteria usually use AUG as the start codon, but some proteins use GUG or UUG instead. However, all use the same initiator tRNA.
• The initiator tRNA (tRNAiMet) is charged by the same aminoacyl-tRNA synthetase as the regular Met-tRNA (which inserts methionine at AUG codons in the middle of the protein).
• The initiator tRNA binds to the P site on the ribosome. All other tRNAs bind to the A site.
• Initiation in both prokaryotes and eukaryotes uses several different proteins, called initiation factors.

Where Does Translation Start?
• In prokaryotes, translation starts when the small ribosomal subunit binds to a specific sequence called a ribosome binding site (or Shine-Dalgarno sequence), which is just upstream from the translation start site. Ribosome binding site sequences are complementary to a region of the 16S ribosomal RNA.
• Many bacterial mRNAs code for multiple proteins, each with its own translation start site. This is an easy way to keep the amounts of different proteins in the same biochemical pathway relatively equal.
• An operon is a group of genes that are all transcribed into a single messenger RNA, and then translated separately.
The mRNA from an operon is sometimes called polycistronic: cistron is an old word for protein-coding gene.
• In eukaryotes, protein synthesis starts at the first AUG in the mRNA.
• This implies that eukaryotic messenger RNAs can only be translated into a single protein.

Initiation in Prokaryotes
1. To start the initiation process, the small ribosomal subunit and 2 initiation factors (IF1 and IF3) bind to the ribosome binding site.
2. Next, the initiator tRNA binds to the AUG start codon with the help of IF2, the third initiation factor.
   • IF2 is a G protein, in its active conformation with GTP bound.
3. Then, the large ribosomal subunit binds.
4. The initiation factors dissociate.
   • The large subunit acts as a GTPase-activating protein (GAP), causing GTP to be hydrolyzed to GDP. The conformation change in IF2 causes all of the initiation factors to dissociate.

Eukaryotic Initiation
• There are many more proteins involved in eukaryotic initiation than in prokaryotic.
• Eukaryotic initiation factors are named eIF (eukaryotic initiation factor).
1. Proteins bind to the 5’ cap and the 3’ poly-A tail. Then, these proteins bind to each other to form a circular structure with the mRNA.
   • This probably facilitates rapid re-use of ribosomes: as soon as a ribosome finishes making a protein, it can quickly find its way back to the translation start site.
2. The small subunit, combined with the initiator tRNA and several initiation factors, binds to the 5’ cap structure.
3. The small subunit complex then uses ATP energy to move down the mRNA, scanning for the first AUG.
   • One of the initiation factors, eIF2, is a G protein. Scanning is only possible when it is in the “on” conformation (has GTP bound).
4. When the initiator tRNA’s anticodon binds to the AUG, another initiation factor acts as a GAP and triggers hydrolysis of the GTP to GDP. This locks the initiation complex in place and prevents further scanning.
5. The large subunit then binds.
This causes the initiation factors to be released.

Elongation
• After initiation is completed, the ribosome is assembled with the messenger RNA, and the initiator tRNA is in the P site. All initiation factors have been released. Translation then enters the elongation phase, where the polypeptide chain is synthesized one amino acid at a time.
• Elongation also uses proteins, called elongation factors. Some of these are G proteins, which serve to proofread the new polypeptide.
• The process is very similar between prokaryotes and eukaryotes.
• Transfer RNAs, charged with the appropriate amino acid and bound to elongation factor EF-1α (called EF-Tu in bacteria), enter the A site of the ribosome.
• EF-1α is a G protein, and when the anticodon of the proper tRNA binds to the mRNA codon, EF-1α hydrolyzes GTP to GDP. This causes a conformational shift that moves the incoming amino acid into close proximity with the amino acid at the P site.
• Note that various incorrect tRNAs enter the A site but are rejected because their anticodon doesn’t match the mRNA codon.
• At this point, the P site has a transfer RNA with the initial methionine attached to its 3’ end, and the A site has a transfer RNA with the next amino acid attached to it.
• The ribosomal RNA then catalyzes the transfer of the polypeptide chain from the tRNA at the P site onto the NH2 group of the amino acid at the A site. The ribosome is actually a ribozyme: the reaction is catalyzed by RNA, not a protein.
• At this point, the tRNA in the P site has nothing attached to it, and the tRNA in the A site has the growing polypeptide chain attached to it. The attachment is through the COOH group of the last amino acid.
• The ribosome then moves down the mRNA 3 nucleotides: this is called translocation. It occurs because elongation factor EF-2, another G protein, hydrolyzes its GTP to GDP. This causes a conformation change that moves the ribosome.
• At this point, the empty tRNA is in the E site and the growing polypeptide is attached to the tRNA in the P site. The A site is empty.
• Finally, the empty tRNA in the E site detaches from the ribosome, leaving it ready to start the next cycle of adding an amino acid.

Termination
• The protein coding sequence on the mRNA ends in a stop codon. There are no tRNAs that match stop codons.
• Termination is accomplished by 2 release factor proteins.
• When the ribosome has a stop codon in the A site, the first release factor binds to the stop codon. This release factor has a shape very similar to a tRNA.
• The release factor catalyzes the hydrolysis of the bond between the tRNA and the C-terminus of the newly synthesized polypeptide.
• The second release factor is a G protein. After the polypeptide has been released, the second release factor hydrolyzes its GTP, and the resulting conformation change causes the ribosomal subunits to separate from each other and from the mRNA.

Inventory of GTP usage in eukaryotic translation
• Initiation: when the initiator tRNA binds to the start codon (plus ATP for small subunit scanning).
• Elongation: for each amino acid, GTP is used when the proper tRNA binds to the codon, and when the ribosome translocates.
• Termination: when the new polypeptide is released from its tRNA.

Polyribosomes
• Most mRNAs are translated by several ribosomes following each other down the RNA. This structure, the mRNA plus attached ribosomes, is called a polyribosome, or polysome.
• The circularization of the mRNA (in eukaryotes) by proteins binding to both the cap and the poly-A tail makes it easier to recycle the ribosomes.
• In bacteria, transcription and translation are coupled, with the first ribosome attached to the mRNA right after it is transcribed. Bacterial mRNA isn’t circularized.

Protein Synthesis Inhibitors
• There are some small differences in the way bacteria and eukaryotes synthesize proteins.
Many antibiotics work by using these differences to inhibit protein synthesis in bacteria while leaving human cells unharmed.

Selenocysteine and Pyrrolysine
• These two amino acids are not part of the regular genetic code, but they are coded in the DNA: they are specified by stop codons whose meaning is changed by other sequences in the mRNA.
• Selenocysteine (Sec, or U) is the best known. It is found in most organisms in all 3 domains of life, as part of the active site of some oxidation-reduction enzymes.
• At least 25 human proteins contain Sec.
• Sec uses the UGA stop codon, and has a special tRNA with a matching anticodon.
• tRNASec is initially charged with serine, and then enzymes convert the attached serine to selenocysteine, putting selenium in place of the side-chain oxygen.
• The selenocysteine insertion sequence (SECIS) is a hairpin loop in the mRNA.
• The SECIS binds a translation elongation factor similar to EF-Tu in bacteria or EF1A in eukaryotes.
• This elongation factor causes Sec-tRNASec to be inserted at UGA codons in the mRNA, instead of the usual chain-terminating release factor.
• Pyrrolysine has only been found in a few Archaea. It uses the UAG stop codon and a hairpin loop similar to the SECIS element.

Protein Degradation
• The lifespan of proteins is primarily controlled by the process of protein degradation. Damaged proteins are removed by the same process.
• The proteasome is a large, multi-subunit molecular machine. All of its subunits are proteins, with no RNA. It has about 50 subunits and a sedimentation size of 26S (compare to the 40S small ribosomal subunit).
• The proteasome has a cylindrical shape with 3 parts. The central region contains multiple proteases (enzymes that digest proteins). The end caps (or stoppers) regulate the proteasome’s activity.
• The proteasome can exist with just the central cylinder, or with 1 or 2 end caps.
• Proteins that are to be destroyed enter through the end cap. Here, ATP is used to unfold the proteins. Also, disulfide bridges are removed (reduced to –SH).
Then, the proteins are fed through a narrow opening into the central chamber. Proteases cut them up into short peptides, which are then released into the cytoplasm. In the cytoplasm, other proteases hydrolyze them into single amino acids.

Ubiquitin and Proteolysis
• Proteins are marked for destruction by being tagged with ubiquitin molecules.
• Ubiquitin is a 76 amino acid protein (which is quite short for a protein).
• Ubiquitin is found in all eukaryotes, and it is one of the slowest evolving proteins known.
• The COOH end of ubiquitin is attached to the –NH2 at the end of the R-group of lysine.
• This creates a peptide bond, but it is not part of the main polypeptide chain. It is thus called an isopeptide linkage.
• A set of 3 enzymes attaches the ubiquitin to the protein being marked for destruction, using ATP energy. Then more ubiquitins are attached to a lysine in the first ubiquitin, forming a long chain.
• E3, the last enzyme in the chain, is a large family of proteins that recognize specific misfolded or defective proteins. E3 thus provides specificity to the degradation process.
• The proteasome cap binds to proteins with 4 or more ubiquitins and destroys them. An enzyme removes the ubiquitins before the protein is destroyed.