Transcription

Transcription is the synthesis of RNA under the direction of DNA. RNA synthesis, or transcription, is the process of transcribing DNA nucleotide sequence information into RNA sequence information. Both nucleic acid sequences use a complementary language, and the information is simply transcribed, or copied, from one molecule to the other. The DNA sequence is enzymatically copied by RNA polymerase to produce a complementary RNA strand. When that strand carries a genetic message from the DNA to the protein-synthesizing machinery of the cell, it is called messenger RNA (mRNA). One significant difference between RNA and DNA sequence is the presence of U (uracil) in RNA in place of the T (thymine) of DNA.

In the case of protein-encoding DNA, transcription is the first step that usually leads to expression of the gene, through production of the mRNA intermediate, which is a faithful transcript of the gene's protein-building instructions. The stretch of DNA that is transcribed into an RNA molecule is called a transcription unit. A transcription unit that is translated into protein contains, in addition to the coding sequence, sequences that direct and regulate protein synthesis. The regulatory sequence before the coding sequence (upstream (-), towards the 5' DNA end) is called the 5' untranslated region (5'UTR), and the sequence following the coding sequence (downstream (+), towards the 3' DNA end) is called the 3' untranslated region (3'UTR).

Transcription has some proofreading mechanisms, but they are fewer and less effective than the controls for copying DNA; therefore, transcription has a lower copying fidelity than DNA replication.

As in DNA replication, RNA is synthesized in the 5' → 3' direction (from the point of view of the growing RNA transcript). Only one of the two DNA strands is transcribed.
This strand is called the template strand, because it provides the template for ordering the sequence of nucleotides in an RNA transcript. The other strand is called the coding strand, because its sequence is the same as the newly created RNA transcript (except for uracil being substituted for thymine). The DNA template strand is read 3' → 5' by RNA polymerase and the new RNA strand is synthesized in the 5' → 3' direction. A polymerase binds at the promoter, which lies toward the 3' end of the gene on the DNA template strand, and travels toward the 5' end of that strand.

Prokaryotic and Eukaryotic Transcription

1. Prokaryotic transcription occurs in the cytoplasm alongside translation.
2. Eukaryotic transcription is localized to the nucleus, where it is separated from the cytoplasm by the nuclear membrane. The transcript is then transported into the cytoplasm, where translation occurs.
3. Another important difference is that eukaryotic DNA is wound around histones to form nucleosomes and packaged as chromatin. Chromatin has a strong influence on the accessibility of the DNA to transcription factors and the transcriptional machinery, including RNA polymerase.
4. In prokaryotes, mRNA is not usually modified. Eukaryotic mRNA is modified through RNA splicing, 5' end capping (5' cap), and the addition of a polyA tail.

Transcription is divided into 5 stages: pre-initiation, initiation, promoter clearance, elongation and termination.

Pre-Initiation

Unlike DNA replication, transcription does not require primers for initiation. However, RNA polymerase does require the presence of a core promoter sequence in the DNA, to which it is able to bind in the presence of various specific transcription factors. Promoters are regions of DNA which promote transcription and are found around 10 to 35 bp upstream of the transcription start site (positions -10 to -35). Core promoters are sequences within the promoter which are essential for transcription initiation.
The most common type of core promoter in eukaryotes is a TATA box, with a consensus sequence of TATA(A/T)A(A/T). The TATA box, as a core promoter, is the binding site for a transcription factor known as TATA binding protein (TBP), which is itself a subunit of another transcription factor, called Transcription Factor II D (TFIID). After TFIID binds to the TATA box via the TBP, five more transcription factors and RNA polymerase combine around the TATA box in a series of stages to form what is known as the preinitiation complex. One such transcription factor has helicase activity and so is involved in separating the opposing strands of double-stranded DNA to provide access to a single-stranded DNA template. However, only a low, or basal, rate of transcription is driven by this preinitiation complex. Other proteins known as activators and repressors, along with any associated co-activators or co-repressors, may further enhance or inhibit transcription.

Initiation

Figure: Simple diagram of transcription initiation. RNAP = RNA polymerase.

In bacteria, transcription begins with the binding of RNA polymerase to the promoter in DNA. The RNA polymerase is a core enzyme consisting of five subunits: 2 α subunits, 1 β subunit, 1 β' subunit, and 1 ω subunit. At the start of initiation, the core enzyme is associated with a sigma factor (σ70) that aids in finding the -35 and -10 elements of the promoter, which lie upstream of the transcription start site.

Transcription initiation is far more complex in eukaryotes, the main difference being that eukaryotic polymerases do not directly recognize their core promoter sequences. In eukaryotes, a collection of proteins called transcription factors mediates the binding of RNA polymerase and the initiation of transcription. Only after certain transcription factors are attached to the promoter does the RNA polymerase bind to it. The completed assembly of transcription factors and RNA polymerase bound to the promoter is called the transcription initiation complex.
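The TATA box consensus quoted above, TATA(A/T)A(A/T), maps directly onto a regular expression. The following is a minimal sketch of a consensus scan, assuming the input is the coding strand written 5' → 3'; the function name `find_tata_boxes` and the example sequence are hypothetical, and real promoter prediction uses position weight matrices and sequence context rather than an exact pattern match.

```python
import re
from typing import List

# Consensus from the text: TATA(A/T)A(A/T).
# The lookahead (?=...) lets overlapping matches be reported as well.
TATA_CONSENSUS = re.compile(r"(?=TATA[AT]A[AT])")


def find_tata_boxes(coding_strand: str) -> List[int]:
    """Return 0-based start offsets of exact TATA-box consensus matches.

    Assumes `coding_strand` is DNA written 5' -> 3'. This is a plain
    pattern match, not a promoter predictor.
    """
    return [m.start() for m in TATA_CONSENSUS.finditer(coding_strand.upper())]


if __name__ == "__main__":
    # Hypothetical sequence, made up for illustration only.
    print(find_tata_boxes("GGCGCTATAAAAGGCTGACG"))
```

A match only says that the seven-base consensus is present; whether TBP binds there in a cell also depends on chromatin accessibility and the other factors described in this section.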
Transcription in archaea is similar to transcription in eukaryotes.

Promoter clearance

After the first bond is synthesized, the RNA polymerase must clear the promoter. During this time there is a tendency to release the RNA transcript and produce truncated transcripts. This is called abortive initiation and is common to both eukaryotes and prokaryotes. Once the transcript reaches approximately 23 nucleotides it no longer slips and elongation can occur. This is an ATP-dependent process. In eukaryotes, promoter clearance coincides with phosphorylation of serine 5 on the carboxy-terminal domain of RNA polymerase II, which is carried out by TFIIH.

Elongation

Figure: Simple diagram of transcription elongation.

One strand of DNA, the template strand (or noncoding strand), is used as a template for RNA synthesis. As transcription proceeds, RNA polymerase traverses the template strand and uses base-pairing complementarity with the DNA template to create an RNA copy. Although RNA polymerase traverses the template strand from 3' → 5', the coding (nontemplate) strand is usually used as the reference point, so transcription is said to go from 5' → 3'. This produces an RNA molecule from 5' → 3', an exact copy of the coding strand (except that thymines are replaced with uracils, and the nucleotides contain a ribose (5-carbon) sugar where DNA has deoxyribose (one fewer oxygen atom) in its sugar-phosphate backbone).

Unlike DNA replication, mRNA transcription can involve multiple RNA polymerases on a single DNA template and multiple rounds of transcription (amplification of a particular mRNA), so many mRNA molecules can be produced from a single copy of a gene. This step also involves a proofreading mechanism that can replace incorrectly incorporated bases.

Prokaryotic elongation starts with the "abortive initiation cycle". During this cycle RNA polymerase will synthesize mRNA fragments 2-12 nucleotides long.
This continues to occur until the σ factor rearranges, which results in the transcription elongation complex (which gives a 35 bp moving footprint). The σ factor is released before 80 nucleotides of mRNA are synthesized.

In eukaryotic transcription the polymerase can experience pauses. These pauses may be intrinsic to the RNA polymerase or due to chromatin structure. Often the polymerase pauses to allow appropriate RNA editing factors to bind.

Termination

Figure: Simple diagram of transcription termination.

Bacteria use two different strategies for transcription termination. In Rho-independent transcription termination, RNA transcription stops when the newly synthesized RNA molecule forms a G-C rich hairpin loop followed by a run of U's, which makes it detach from the DNA template. In the Rho-dependent type of termination, a protein factor called Rho destabilizes the interaction between the template and the mRNA, thus releasing the newly synthesized mRNA from the elongation complex.

Transcription termination in eukaryotes is less well understood. It involves cleavage of the new transcript, followed by template-independent addition of A's at its new 3' end, in a process called polyadenylation.

Reverse transcription

Figure: Scheme of reverse transcription.

Some viruses (such as HIV, the cause of AIDS) have the ability to transcribe RNA into DNA. HIV has an RNA genome that is copied into DNA. The resulting DNA can be merged with the DNA genome of the host cell. The main enzyme responsible for synthesis of DNA from an RNA template is called reverse transcriptase. In the case of HIV, reverse transcriptase is responsible for synthesizing a complementary DNA strand (cDNA) to the viral RNA genome. An associated enzyme, ribonuclease H, digests the RNA strand, and reverse transcriptase synthesizes a complementary strand of DNA to form a double-helix DNA structure.
This cDNA is integrated into the host cell's genome via another enzyme (integrase), causing the host cell to generate viral proteins which reassemble into new viral particles. Subsequently, the host cell undergoes programmed cell death (apoptosis).

Some eukaryotic cells contain an enzyme with reverse transcription activity called telomerase. Telomerase is a reverse transcriptase that lengthens the ends of linear chromosomes. Telomerase carries an RNA template from which it synthesizes a repeating DNA sequence, or "junk" DNA. This repeated sequence of "junk" DNA is important because every time a linear chromosome is duplicated, it is shortened in length. With "junk" DNA at the ends of chromosomes, the shortening eliminates some of the repeated, or junk, sequence rather than the protein-encoding DNA sequence that is further away from the chromosome ends. Telomerase is often activated in cancer cells, enabling them to duplicate their genomes without losing important protein-coding DNA sequence. Activation of telomerase could be part of the process that allows cancer cells to become technically immortal.

Prokaryotic transcription is the process in which messenger RNA transcripts of genetic material in prokaryotes are produced, to be translated for the production of proteins. Prokaryotic transcription occurs in the cytoplasm alongside translation. Unlike in eukaryotes, prokaryotic transcription and translation can occur simultaneously. This is impossible in eukaryotes, where transcription occurs in a membrane-bound nucleus while translation occurs outside the nucleus in the cytoplasm. In prokaryotes, genetic material is not enclosed in a membrane-bound nucleus and has access to ribosomes in the cytoplasm.

Initiation

The following steps occur, in order, for transcription initiation. RNA polymerase (RNAP) binds to one of several specificity factors, σ, to form a holoenzyme. In this form, it can recognize and bind to specific promoter regions in the DNA.
At this stage, the DNA is double-stranded ("closed"). This holoenzyme/wound-DNA structure is referred to as the closed complex.

Elongation

The DNA is unwound and becomes single-stranded ("open") in the vicinity of the initiation site (defined as +1). This holoenzyme/unwound-DNA structure is called the open complex. The RNA polymerase transcribes the DNA, but produces about 10 abortive (short, nonproductive) transcripts which are unable to leave the RNA polymerase because the exit channel is blocked by the σ-factor. The σ-factor eventually dissociates from the holoenzyme, and elongation proceeds.

Promoters can differ in "strength"; that is, in how actively they promote transcription of their adjacent DNA sequence. Promoter strength is in many (but not all) cases a matter of how tightly RNA polymerase and its associated accessory proteins bind to their respective DNA sequences. The more similar the sequences are to a consensus sequence, the stronger the binding is. Additional transcription regulation comes from transcription factors that can affect the stability of the holoenzyme structure at initiation.

Most transcripts originate using adenosine-5'-triphosphate (ATP) and, to a lesser extent, guanosine-5'-triphosphate (GTP) (purine nucleoside triphosphates) at the +1 site. Uridine-5'-triphosphate (UTP) and cytidine-5'-triphosphate (CTP) (pyrimidine nucleoside triphosphates) are disfavoured at the initiation site.

Termination

Two termination mechanisms are well known. Intrinsic termination (also called Rho-independent transcription termination) involves terminator sequences within the RNA that signal the RNA polymerase to stop. The terminator sequence is usually a palindromic sequence that forms a stem-loop hairpin structure that leads to the dissociation of the RNAP from the DNA template. Rho-dependent termination uses a termination factor called ρ factor (rho factor), a protein that stops RNA synthesis at specific sites.
This protein binds at a rho utilisation site on the nascent RNA strand and runs along the mRNA towards the RNAP. A stem-loop structure upstream of the terminator region pauses the RNAP; when ρ-factor reaches the RNAP, it causes RNAP to dissociate from the DNA, terminating transcription.

Other termination signals include a region of repeated thymidine residues in the DNA template, or a GC-rich inverted repeat followed by 4 A residues. The inverted repeat forms a stable stem-loop structure in the RNA, which causes the RNA to dissociate from the DNA template.

Figure: The -35 region and the -10 ("Pribnow box") region comprise the basic prokaryotic promoter, and |T| stands for the terminator. The DNA on the template strand between the +1 site and the terminator is transcribed into RNA, which is then translated into protein.

Eukaryotic transcription is more complex than prokaryotic transcription. For instance, in eukaryotes the genetic material (DNA), and therefore transcription, is primarily localized to the nucleus, where it is separated from the cytoplasm (in which translation occurs) by the nuclear membrane. This separation allows for the temporal regulation of gene expression through the sequestration of the RNA in the nucleus, and allows for selective transport of RNAs to the cytoplasm, where the ribosomes reside. DNA is also present in mitochondria in the cytoplasm, and mitochondria use a specialized RNA polymerase for transcription. The basal eukaryotic transcription complex includes the RNA polymerase and additional proteins that are necessary for correct initiation and elongation.

Initiation

Among eukaryotes that regulate the transcription of individual genes, the core promoter of a protein-encoding gene contains binding sites for the basal transcription complex and RNA polymerase II, and is normally within about 50 bases upstream of the transcription initiation site.
Further transcriptional regulation is provided by upstream control elements (UCEs), usually present within about 200 bases upstream of the initiation site. The core promoter for Pol II sometimes contains a TATA box, the highly conserved DNA recognition sequence for the TATA box binding protein, TBP, whose binding initiates transcription complex assembly at the promoter. Some genes also have enhancer elements that can be thousands of bases upstream or downstream of the transcription initiation site. Combinations of these upstream control elements and enhancers regulate and amplify the formation of the basal transcription complex.

Transcription process

Eukaryotes have three nuclear RNA polymerases, each with distinct roles and properties:

- RNA Polymerase I (Pol I, Pol A). Location: nucleolus. Transcribed: larger ribosomal RNA (rRNA) (28S, 18S, 5.8S).
- RNA Polymerase II (Pol II, Pol B). Location: nucleus. Transcribed: messenger RNA (mRNA) and most small nuclear RNAs (snRNAs).
- RNA Polymerase III (Pol III, Pol C). Location: nucleus (and possibly the nucleolus-nucleoplasm interface). Transcribed: transfer RNA (tRNA) and other small RNAs (including the small 5S rRNA).

There are many eukaryotes that differ from the canonical presentation of the roles of RNA polymerases. Certain organisms possess four distinct RNA polymerases. Other organisms use RNA polymerase I to transcribe certain protein-coding genes in addition to rRNAs.

Transcription regulation

The regulation of gene expression is achieved through the interaction of several levels of control, including the regulation of transcription initiation. Most (not all) eukaryotes possess robust methods of regulating transcription initiation on a gene-by-gene basis.
The transcription of a gene can be regulated by cis-acting elements within the regulatory regions of the DNA, and by trans-acting factors that include transcription factors and the basal transcription complex.

Splicing

Two types of splicing, cis-splicing and trans-splicing, use the same splicing machinery to cleave RNAs at specific points and rejoin them to form new combinations once transcribed. Although most eukaryotes possess splicing machinery, the extent of cis- and trans-splicing varies from organism to organism.

Cis-splicing

Primary (initial) mRNA transcripts are synthesized as larger precursor RNAs that are processed by splicing out introns (non-coding sequences) and ligating exons (non-contiguous coding sequences) into the mature mRNA. Primary transcripts for some genes can be large. The primary transcripts of the neurexin genes, for instance, are as large as 1.7 megabases (1,700,000 bases), while the mature (processed) neurexin mRNAs are under 10 kilobases (10,000 bases), with as many as 24 exons and thousands of possible alternative splice variants that produce proteins with different activities. Over 80% of human genes are alternatively spliced, greatly increasing the variety of actual proteins produced by the limited set of genes in the human genome.

Trans-splicing

Observed in a range of different eukaryotes (most conspicuously the worm C. elegans and a group of parasitic protists called kinetoplastids), trans-splicing occurs when an exon from one RNA molecule is spliced onto the 5' end of a completely separate molecule post-transcriptionally. While relatively unimportant in many eukaryotes, this process is ubiquitous in the biology of some organisms. In kinetoplastids, for example, every single nuclear-encoded message must be trans-spliced before translation of the message can occur.

RNA polymerase (RNAP or RNApol) is an enzyme that produces RNA.
In cells, RNAP is needed for constructing RNA chains using DNA genes as templates, a process called transcription. RNA polymerase enzymes are essential to life and are found in all organisms and many viruses. In chemical terms, RNAP is a nucleotidyl transferase that polymerizes ribonucleotides at the 3' end of an RNA transcript.

RNAP was discovered independently by Sam Weiss and Jerard Hurwitz in 1960. By this time the 1959 Nobel Prize in Medicine had been awarded to Severo Ochoa (jointly with Arthur Kornberg) for the discovery of what was believed to be RNAP, but instead turned out to be polynucleotide phosphorylase. The 2006 Nobel Prize in Chemistry was awarded to Roger Kornberg for creating detailed molecular images of RNA polymerase during various stages of the transcription process.

RNA polymerase I (also called Pol I) is, in eukaryotes, the only enzyme that transcribes ribosomal RNA (excluding 5S rRNA, which is synthesized by RNA polymerase III), a type of RNA which accounts for over 50% of the total RNA synthesized in a cell. Pol I consists of 8-14 protein subunits (polypeptides). All 12 subunits have identical or related counterparts in Pol II and Pol III. rDNA transcription is confined to the nucleolus, where several hundred copies of rRNA genes are present, arranged as tandem head-to-tail repeats. Pol I transcribes one large transcript encoding an rDNA gene over and over again. This gene encodes the 18S, the 5.8S and the 28S RNA molecules of the ribosome in eukaryotes. The transcripts are cleaved with the help of snoRNAs. The 5S ribosomal RNA is transcribed by Pol III. Because of the simplicity of Pol I transcription, it is the fastest-acting polymerase.

RNA polymerase II (also called RNAP II and Pol II) is an enzyme found in eukaryotic cells. It catalyzes the transcription of DNA to synthesize precursors of mRNA and most snRNA and microRNA. A 550 kDa complex of 12 subunits, RNAP II is the most studied type of RNA polymerase.
A wide range of transcription factors are required for it to bind to its promoters and begin transcription.

RNA polymerase III (also called Pol III) transcribes DNA to synthesize ribosomal 5S rRNA, tRNA and other small RNAs. The genes transcribed by RNA Pol III fall in the category of "housekeeping" genes whose expression is required in all cell types and most environmental conditions. Therefore the regulation of Pol III transcription is primarily tied to the regulation of cell growth and the cell cycle, thus requiring fewer regulatory proteins than RNA polymerase II.

Control of transcription

Figure: An electron micrograph of DNA strands decorated by hundreds of RNAP molecules too small to be resolved. Each RNAP is transcribing an RNA strand which can be seen branching off from the DNA. "Begin" indicates the 3' end of the DNA, where RNAP initiates transcription; "End" indicates the 5' end, where the longer RNA molecules are almost completely transcribed.

Control of the process of gene transcription affects patterns of gene expression and thereby allows a cell to adapt to a changing environment, perform specialized roles within an organism, and maintain basic metabolic processes necessary for survival. Therefore, it is hardly surprising that the activity of RNAP is both complex and highly regulated. In Escherichia coli bacteria, more than 100 transcription factors have been identified which modify the activity of RNAP.

RNAP can initiate transcription at specific DNA sequences known as promoters. It then produces an RNA chain which is complementary to the template DNA strand. The process of adding nucleotides to the RNA strand is known as elongation; in eukaryotes, RNAP can build chains as long as 2.4 million nucleotides (the full length of the dystrophin gene). RNAP will preferentially release its RNA transcript at specific DNA sequences encoded at the end of genes, known as terminators.
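The intrinsic (Rho-independent) terminator signal described in the termination sections, a GC-rich inverted repeat that folds into a hairpin and is followed by a run of U's, can be sketched as a simple scan over an RNA sequence. This is a toy illustration under stated assumptions: the function name `find_intrinsic_terminators` and its default thresholds (a perfect 6-nt stem, a 3-8 nt loop, at least 60% G/C in the stem, at least 4 U's) are hypothetical choices made for this example, not values from the text. Real terminators tolerate mismatches and are usually scored thermodynamically.

```python
from typing import List, Tuple

# Watson-Crick partners in RNA.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA string made of A, C, G, U."""
    return "".join(RNA_PAIR[base] for base in reversed(seq))


def find_intrinsic_terminators(
    rna: str,
    stem_len: int = 6,      # assumed stem length (hypothetical default)
    min_loop: int = 3,      # assumed loop size range (hypothetical default)
    max_loop: int = 8,
    min_gc: float = 0.6,    # assumed GC fraction required in the stem
    u_run: int = 4,         # assumed minimum run of U's after the hairpin
) -> List[Tuple[int, int]]:
    """Return (stem_start, hairpin_end) pairs, 0-based, end exclusive.

    A hit is a GC-rich stem whose exact reverse complement appears after a
    short loop, immediately followed by a run of U's.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - stem_len + 1):
        left_arm = rna[i:i + stem_len]
        if any(base not in RNA_PAIR for base in left_arm):
            continue  # skip windows containing non-ACGU characters
        gc_fraction = sum(base in "GC" for base in left_arm) / stem_len
        if gc_fraction < min_gc:
            continue
        right_arm = reverse_complement_rna(left_arm)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop   # start of the candidate right arm
            end = j + stem_len        # first base after the hairpin
            if rna[j:end] == right_arm and rna[end:end + u_run] == "U" * u_run:
                hits.append((i, end))
    return hits


if __name__ == "__main__":
    # Hypothetical transcript: stem GCCCGC, loop UUCG, stem GCGGGC, then U's.
    print(find_intrinsic_terminators("CCGCCCGCUUCGGCGGGCUUUUUUAA"))
```

Requiring an exact reverse complement keeps the sketch short; it will miss real terminators whose stems contain bulges or G-U wobble pairs.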
Products of RNAP include:

- Messenger RNA (mRNA): template for the synthesis of proteins by ribosomes.
- Non-coding RNA or "RNA genes": a broad class of genes that encode RNA that is not translated into protein. The most prominent examples of RNA genes are transfer RNA (tRNA) and ribosomal RNA (rRNA), both of which are involved in the process of translation. However, since the late 1990s, many new RNA genes have been found, and thus RNA genes may play a much more significant role than previously thought.
  - Transfer RNA (tRNA): transfers specific amino acids to growing polypeptide chains at the ribosomal site of protein synthesis during translation.
  - Ribosomal RNA (rRNA): a component of ribosomes.
  - Micro RNA: regulates gene activity.
  - Catalytic RNA (ribozyme): enzymatically active RNA molecules.

RNAP accomplishes de novo synthesis. It is able to do this because specific interactions with the initiating nucleotide hold RNAP rigidly in place, facilitating chemical attack on the incoming nucleotide. Such specific interactions explain why RNAP prefers to start transcripts with ATP (followed by GTP, UTP, and then CTP). In contrast to DNA polymerase, RNAP includes helicase activity, so no separate enzyme is needed to unwind DNA.

RNA polymerase action

Binding and initiation

RNA polymerase binding in prokaryotes involves the α subunit recognizing the upstream element (-40 to -70 base pairs) in DNA, as well as the σ factor recognizing the -10 to -35 region. There are numerous σ factors that regulate gene expression. For example, σ70 is expressed under normal conditions and allows RNAP binding to housekeeping genes, while σ32 elicits RNAP binding to heat-shock genes.

After binding to the DNA, the RNA polymerase switches from a closed complex to an open complex. This change involves the separation of the DNA strands to form an unwound section of DNA of approximately 13 bp. Ribonucleotides are base-paired to the template DNA strand, according to Watson-Crick base-pairing interactions.
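The base-pairing rule can be made concrete with a short sketch. It assumes the template strand is written in the order the polymerase reads it (3' → 5'), so the RNA comes out 5' → 3'; the function names `transcribe` and `template_from_coding` are hypothetical helpers for this example only.

```python
# Watson-Crick pairing between a DNA template base and the incoming ribonucleotide.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5' -> 3') made from a template strand given 3' -> 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


def template_from_coding(coding_5to3: str) -> str:
    """Return the template strand (3' -> 5') paired with a coding strand (5' -> 3')."""
    return "".join(DNA_TO_DNA[base] for base in coding_5to3.upper())


if __name__ == "__main__":
    coding = "ATGGCTTAA"                      # hypothetical coding strand, 5' -> 3'
    template = template_from_coding(coding)   # "TACCGAATT", 3' -> 5'
    rna = transcribe(template)
    print(template, rna)
    # The transcript matches the coding strand with U in place of T.
    print(rna == coding.replace("T", "U"))
```

The final comparison is the point made in the text: the transcript has the same sequence as the coding strand, with uracil substituted for thymine.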
Supercoiling plays an important part in polymerase activity because of the unwinding and rewinding of DNA. Because regions of DNA in front of RNAP are unwound, there are compensatory positive supercoils. Regions behind RNAP are rewound and negative supercoils are present.

Elongation

Transcription elongation involves the further addition of ribonucleotides and the change of the open complex to the transcriptional complex. RNAP cannot start forming full-length transcripts because of its strong binding to the promoter. Transcription at this stage primarily results in short RNA fragments of around 9 nucleotides, in a process known as abortive transcription. Once the RNAP starts forming longer transcripts it clears the promoter. At this point, the -10 to -35 promoter region is disrupted, and the σ factor falls off RNAP. This allows the rest of the RNAP complex to move forward, as the σ factor held the RNAP complex in place.

The 17 bp transcriptional complex has an 8 bp DNA-RNA hybrid; that is, 8 base pairs involve the RNA transcript bound to the DNA template strand. As transcription progresses, ribonucleotides are added to the 3' end of the RNA transcript and the RNAP complex moves along the DNA. Although RNAP does not seem to have the 3' exonuclease activity that characterizes the proofreading activity found in DNA polymerase, there is evidence that RNAP will halt at mismatched base pairs and correct them.

The addition of ribonucleotides to the RNA transcript has a very similar mechanism to DNA polymerization; it is believed that these polymerases are evolutionarily related. Aspartyl (asp) residues in the RNAP hold onto Mg2+ ions, which in turn coordinate the phosphates of the ribonucleotides. The first Mg2+ holds onto the α-phosphate of the NTP to be added. This allows the nucleophilic attack of the 3' OH of the RNA transcript, adding another nucleotide to the chain. The second Mg2+ holds onto the pyrophosphate of the NTP.
The overall reaction equation is:

(NMP)n + NTP → (NMP)n+1 + PPi

Termination

Termination of RNA transcription can be rho-independent or rho-dependent.

Rho-independent transcription termination is the termination of transcription without the aid of the rho protein. Transcription of a palindromic region of DNA causes the formation of a hairpin structure from the RNA transcript looping and binding upon itself. This hairpin structure is often rich in G-C base pairs, making it more stable than the DNA-RNA hybrid itself. As a result, the 8 bp DNA-RNA hybrid in the transcription complex shifts to a 4 bp hybrid. These last 4 base pairs are weak A-U base pairs, and the entire RNA transcript will fall off.[5]

RNA polymerase in bacteria

In bacteria, the same enzyme catalyzes the synthesis of mRNA and ncRNA. RNAP is a relatively large molecule. The core enzyme has 5 subunits (~400 kDa):

- α2: the two α subunits assemble the enzyme and recognize regulatory factors. Each subunit has two domains: αCTD (C-terminal domain) binds the UP element of the extended promoter, and αNTD (N-terminal domain) binds the rest of the polymerase. The αCTD is not used on promoters without an UP element.
- β: this has the polymerase activity (catalyzes the synthesis of RNA), which includes chain initiation and elongation.
- β': binds to DNA (nonspecifically).
- ω: restores denatured RNA polymerase to its functional form in vitro. It has been observed to offer a protective/chaperone function to the β' subunit in Mycobacterium smegmatis, and is now known to promote assembly.

In order to bind promoter-specific regions, the core enzyme requires another subunit, sigma (σ). The sigma factor greatly reduces the affinity of RNAP for nonspecific DNA while increasing specificity for certain promoter regions, depending on the sigma factor. That way, transcription is initiated at the right region. The complete holoenzyme therefore has 6 subunits: α2ββ'σω (~480 kDa).
The structure of RNAP exhibits a groove with a length of 55 Å (5.5 nm) and a diameter of 25 Å (2.5 nm). This groove comfortably fits the 20 Å (2 nm) wide double strand of DNA. The 55 Å (5.5 nm) length can accept 16 nucleotides.

When not in use, RNA polymerase binds to low-affinity sites to allow rapid exchange for an active promoter site when one opens. RNA polymerase holoenzyme, therefore, does not freely float around in the cell when not in use.

Transcriptional cofactors

There are a number of proteins which can bind to RNAP and modify its behavior. For instance, GreA and GreB, found in E. coli and in most other prokaryotes, can enhance the ability of RNAP to cleave the nascent RNA transcript near the growing end of the chain. This cleavage can rescue a stalled polymerase molecule, and is likely involved in proofreading the occasional mistakes made by RNAP. A separate cofactor, Mfd, is involved in transcription-coupled repair, the process in which RNAP recognizes damaged bases in the DNA template and recruits enzymes to restore the DNA. Other cofactors are known to play regulatory roles, i.e. they help RNAP choose whether or not to express certain genes.

RNA polymerase in eukaryotes

Eukaryotes have several types of RNAP, characterized by the type of RNA they synthesize:

- RNA polymerase I synthesizes a 45S pre-rRNA, which matures into the 28S, 18S and 5.8S rRNAs that form the major RNA sections of the ribosome.
- RNA polymerase II synthesizes precursors of mRNAs and most snRNAs and microRNAs. This is the most studied type, and because of the high level of control required over transcription, a range of transcription factors are required for its binding to promoters.
- RNA polymerase III synthesizes tRNAs, 5S rRNA and other small RNAs found in the nucleus and cytosol.
- RNA polymerase IV synthesizes siRNA in plants.
- RNA polymerase V synthesizes RNAs involved in siRNA-directed heterochromatin formation in plants.

There are other RNA polymerase types in mitochondria and chloroplasts.
There are also RNA-dependent RNA polymerases involved in RNA interference.

RNA polymerase in archaea

Archaea have a single RNAP that is closely related to the three main eukaryotic polymerases. Thus, it has been speculated that the archaeal polymerase resembles the ancestor of the specialized eukaryotic polymerases.

RNA polymerase in viruses

Figure: T7 RNA polymerase producing an mRNA (green) from a DNA template. The protein is shown as a purple ribbon. Image derived from PDB 1MSW.

Many viruses also encode an RNAP. Perhaps the most widely studied viral RNAP is found in bacteriophage T7. This single-subunit RNAP is related to that found in mitochondria and chloroplasts, and shares considerable homology with DNA polymerase.[13] It is therefore believed that most viral polymerases evolved from DNA polymerase and are not directly related to the multi-subunit polymerases described above.

The viral polymerases are diverse, and include some forms which can use RNA as a template instead of DNA. This occurs in negative-strand RNA viruses and dsRNA viruses, both of which exist for a portion of their life cycle as double-stranded RNA. However, some positive-strand RNA viruses, such as polio, also contain these RNA-dependent RNA polymerases.