Transcription
Transcription is the synthesis of RNA under the direction of DNA. RNA synthesis, or
transcription, is the process of transcribing DNA nucleotide sequence information into RNA
sequence information. Both nucleic acid sequences use complementary language, and the
information is simply transcribed, or copied, from one molecule to the other. DNA sequence
is enzymatically copied by RNA polymerase to produce a complementary nucleotide RNA
strand, called messenger RNA (mRNA), because it carries a genetic message from the DNA
to the protein-synthesizing machinery of the cell. One significant difference between RNA
and DNA sequence is the presence of U, or uracil in RNA instead of the T, or thymine of
DNA. In the case of protein-encoding DNA, transcription is the first step that usually leads to
the expression of the genes, by the production of the mRNA intermediate, which is a faithful
transcript of the gene's protein-building instruction. The stretch of DNA that is transcribed
into an RNA molecule is called a transcription unit. A transcription unit that encodes a
protein contains sequences that direct and regulate protein synthesis in
addition to the coding sequence that is translated into protein. The regulatory sequence that
lies before the coding sequence (upstream (-), towards the 5' DNA end) is called the 5'
untranslated region (5'UTR), and the sequence that follows the coding sequence (downstream (+), towards the 3'
DNA end) is called the 3' untranslated region (3'UTR). Transcription has
some proofreading mechanisms, but they are fewer and less effective than the controls for
copying DNA; therefore, transcription has a lower copying fidelity than DNA replication.
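The layout of 5'UTR, coding sequence and 3'UTR described above can be illustrated with a minimal sketch. The function name split_mrna, the example sequence, and the simplification that the coding sequence runs from the first AUG to the first in-frame stop codon are assumptions made for illustration; they are not taken from the original notes.

```python
# Stop codons in RNA (an assumption added for this sketch).
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_mrna(mrna):
    """Split an mRNA (written 5'->3') into (5'UTR, coding sequence, 3'UTR).

    Simplifying assumption: the coding sequence starts at the first AUG
    and ends at the first in-frame stop codon (stop codon included).
    """
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    # Walk codon by codon from the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")

# Hypothetical example transcript.
utr5, cds, utr3 = split_mrna("GGCACAUGGCUUUCUAAGCAAUAAA")
```

Here utr5 holds the bases upstream of the start codon and utr3 the bases downstream of the stop codon; real genes need more careful start-site rules than "first AUG".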
As in DNA replication, RNA is synthesized in the 5' → 3' direction (from the point of view of
the growing RNA transcript). Only one of the two DNA strands is transcribed. This strand is
called the template strand, because it provides the template for ordering the sequence of
nucleotides in an RNA transcript. The other strand is called the coding strand, because its
sequence is the same as the newly created RNA transcript (except for uracil being substituted
for thymine). The DNA template strand is read 3' → 5' by RNA polymerase and the new
RNA strand is synthesized in the 5'→ 3' direction.
A polymerase binds to the 3' end of a gene (promoter) on the DNA template strand and
travels toward the 5' end.
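The relationship between the template strand, the coding strand and the RNA transcript can be sketched in a few lines. The function names and the example sequence are hypothetical; the template is assumed to be written 3' → 5', so the RNA comes out written 5' → 3'.

```python
# Base-pairing rules used when RNA is built on a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5):
    """Return the RNA (5'->3') complementary to a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)

def coding_strand(template_3to5):
    """Return the coding strand (5'->3') paired with the given template."""
    return "".join(DNA_TO_DNA[base] for base in template_3to5)

template = "TACGGTCAT"             # hypothetical template, written 3'->5'
rna = transcribe(template)         # "AUGCCAGUA"
coding = coding_strand(template)   # "ATGCCAGTA"
```

The transcript matches the coding strand except that U replaces T, which is the point made in the paragraph above.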
Prokaryotic and Eukaryotic Transcription
1. Prokaryotic transcription occurs in the cytoplasm alongside translation.
2. Eukaryotic transcription is localized to the nucleus, where it is separated from the
cytoplasm by the nuclear membrane. The transcript is then transported into the cytoplasm
where translation occurs.
3. Another important difference is that eukaryotic DNA is wound around histones to form
nucleosomes and packaged as chromatin. Chromatin has a strong influence on the
accessibility of the DNA to transcription factors and the transcriptional machinery including
RNA polymerase.
4. In prokaryotes, mRNA is not usually modified. Eukaryotic mRNA is modified through
RNA splicing, 5' end capping (5' cap), and the addition of a polyA tail.
Transcription is divided into 5 stages: pre-initiation, initiation, promoter clearance,
elongation and termination.
Pre-Initiation
Unlike DNA replication, transcription does not require primers for initiation. However RNA
polymerase does require the presence of a core promoter sequence in the DNA, which it is
able to bind to in the presence of various specific transcription factors.
Promoters are regions of DNA which promote transcription and are found around 10 to 35
bp upstream of the start site of transcription (positions -10 to -35). Core promoters are sequences within the
promoter which are essential for transcription initiation. The most common type of core
promoter in eukaryotes is a TATA box, with a consensus sequence of TATA(A/T)A(A/T).
The TATA box, as a core promoter, is the binding site for a transcription factor known as
TATA binding protein (TBP), which is itself a subunit of another transcription factor, called
Transcription Factor II D (TFIID). After TFIID binds to the TATA box via the TBP, five more
transcription factors and RNA polymerase combine around the TATA box in a series of
stages to form what is known as the preinitiation complex. One such transcription factor has
helicase activity and so is involved in the separating of opposing strands of double-stranded
DNA to provide access to a single-stranded DNA template.
However, only a low, or basal, rate of transcription is driven by this pre-initiation complex.
Other proteins known as activators and repressors, along with any associated co-activators or
co-repressors, may further enhance or inhibit transcription.
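The TATA box consensus quoted above, TATA(A/T)A(A/T), translates directly into a regular expression. The sketch below is only an illustration: the function name and the promoter fragment are made up, and a real promoter search would also consider position relative to the start site and the opposite strand.

```python
import re

# Consensus from the text: TATA(A/T)A(A/T)
TATA_BOX = re.compile(r"TATA[AT]A[AT]")

def find_tata_boxes(sequence):
    """Return (0-based position, matched sequence) for each consensus match."""
    return [(m.start(), m.group()) for m in TATA_BOX.finditer(sequence)]

promoter = "GGCGCTATAAAAGGCGCGC"   # hypothetical promoter fragment
hits = find_tata_boxes(promoter)
```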
Initiation
Simple diagram of transcription initiation. RNAP = RNA polymerase
In bacteria, transcription begins with the binding of RNA polymerase to the promoter in
DNA. The RNA polymerase is a core enzyme consisting of five subunits: 2 α subunits, 1 β
subunit, 1 β' subunit, and 1 ω subunit. At the start of initiation, the core enzyme is associated
with a sigma factor (σ70) that aids in finding the appropriate -35 and -10 elements of the
promoter, which lie upstream of the transcription start site.
Transcription initiation is far more complex in eukaryotes, the main difference being that
eukaryotic polymerases do not directly recognize their core promoter sequences. In
eukaryotes, a collection of proteins called transcription factors mediate the binding of RNA
polymerase and the initiation of transcription. Only after certain transcription factors are
attached to the promoter does the RNA polymerase bind to it. The completed assembly of
transcription factors and RNA polymerase bound to the promoter is called the transcription
initiation complex. Transcription in archaea is similar to transcription in eukaryotes.
Promoter clearance
After the first bond is synthesized the RNA polymerase must clear the promoter. During this
time there is a tendency to release the RNA transcript and produce truncated transcripts. This
is called abortive initiation and is common for both eukaryotes and prokaryotes. Once the
transcript reaches approximately 23 nucleotides it no longer slips and elongation can occur.
This is an ATP dependent process.
Promoter clearance coincides with phosphorylation of serine 5 on the carboxy terminal
domain of RNA Pol II in eukaryotes, which is phosphorylated by TFIIH.
Elongation
Simple diagram of transcription elongation
One strand of DNA, the template strand (or noncoding strand), is used as a template for
RNA synthesis. As transcription proceeds, RNA polymerase traverses the template strand
and uses base pairing complementarity with the DNA template to create an RNA copy.
Although RNA polymerase traverses the template strand from 3' → 5', the coding (non-template) strand is usually used as the reference point, so transcription is said to go from 5'
→ 3'. This produces an RNA molecule from 5' → 3', an exact copy of the coding strand
(except that thymines are replaced with uracils, and the nucleotides are composed of a ribose
(5-carbon) sugar where DNA has deoxyribose (one less oxygen atom) in its sugar-phosphate
backbone).
Unlike DNA replication, mRNA transcription can involve multiple RNA polymerases on a
single DNA template and multiple rounds of transcription (amplification of particular
mRNA), so many mRNA molecules can be produced from a single copy of a gene. This step
also involves a proofreading mechanism that can replace incorrectly incorporated bases.
Prokaryotic elongation starts with the "abortive initiation cycle". During this cycle RNA
Polymerase will synthesize mRNA fragments 2-12 nucleotides long. This continues to occur
until the σ factor rearranges, which results in the transcription elongation complex (which
gives a 35 bp moving footprint). The σ factor is released before 80 nucleotides of mRNA are
synthesized.
In Eukaryotic transcription the polymerase can experience pauses. These pauses may be
intrinsic to the RNA polymerase or due to chromatin structure. Often the polymerase pauses
to allow appropriate RNA editing factors to bind.
Termination
Simple diagram of transcription termination
Bacteria use two different strategies for transcription termination: in Rho-independent
transcription termination, RNA transcription stops when the newly synthesized RNA
molecule forms a G-C rich hairpin loop, followed by a run of U's, which makes it detach from
the DNA template. In the "Rho-dependent" type of termination, a protein factor called "Rho"
destabilizes the interaction between the template and the mRNA, thus releasing the newly
synthesized mRNA from the elongation complex. Transcription termination in eukaryotes is
less well understood. It involves cleavage of the new transcript, followed by template-independent addition of As at its new 3' end, in a process called polyadenylation.
Reverse transcription
Scheme of reverse transcription
Some viruses (such as HIV, the cause of AIDS) have the ability to transcribe RNA into DNA.
HIV has an RNA genome that is copied into DNA. The resulting DNA can be integrated
into the DNA genome of the host cell. The main enzyme responsible for synthesis of DNA
from an RNA template is called reverse transcriptase. In the case of HIV, reverse
transcriptase is responsible for synthesizing a complementary DNA strand (cDNA) to the
viral RNA genome. An associated enzyme, ribonuclease H, digests the RNA strand, and
reverse transcriptase synthesises a complementary strand of DNA to form a double helix
DNA structure. This cDNA is integrated into the host cell's genome via another enzyme
(integrase) causing the host cell to generate viral proteins which reassemble into new viral
particles. Subsequently, the host cell undergoes programmed cell death (apoptosis).
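The two synthesis steps described above (an RNA-templated cDNA strand, then a second DNA strand) can be sketched as plain string operations. The function names and the example sequence are hypothetical, and the sketch ignores priming, RNase H digestion and integration.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna_5to3):
    """Return the first cDNA strand (5'->3') complementary to the RNA."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5to3))

def second_strand(cdna_5to3):
    """Return the DNA strand (5'->3') complementary to the first cDNA strand."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna_5to3))

viral_rna = "AUGGCU"                    # hypothetical RNA fragment
cdna = reverse_transcribe(viral_rna)    # "AGCCAT"
dna_copy = second_strand(cdna)          # "ATGGCT"
```

The second strand carries the same sequence as the original RNA with T in place of U, so the double-stranded product is a DNA version of the RNA genome.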
Some eukaryotic cells contain an enzyme with reverse transcription activity called
telomerase. Telomerase is a reverse transcriptase that lengthens the ends of linear
chromosomes. Telomerase carries an RNA template from which it synthesizes a repeating
DNA sequence, or "junk" DNA. This repeated sequence of "junk" DNA is important
because every time a linear chromosome is duplicated, it is shortened in length. With "junk"
DNA at the ends of chromosomes, the shortening eliminates some repeated, or junk
sequence, rather than the protein-encoding DNA sequence that is further away from the
chromosome ends. Telomerase is often activated in cancer cells to enable cancer cells to
duplicate their genomes without losing important protein-coding DNA sequence. Activation
of telomerase could be part of the process that allows cancer cells to become technically
immortal.
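The buffering role of the repeated end sequence can be shown with a toy model. Everything numeric here is an assumption chosen for illustration (telomere length, bases lost per duplication, bases added by telomerase); the notes give no such figures.

```python
def divisions_until_coding_loss(telomere_length, loss_per_division,
                                added_by_telomerase=0, max_divisions=1000):
    """Count duplications during which only repeat sequence is lost.

    Each division removes loss_per_division bases from the chromosome end;
    telomerase, if active, adds added_by_telomerase bases back. Counting
    stops when the remaining repeat can no longer absorb the loss, or at
    max_divisions (a cap so the loop ends when nothing is lost on balance).
    """
    divisions = 0
    while divisions < max_divisions and telomere_length >= loss_per_division:
        telomere_length += added_by_telomerase - loss_per_division
        divisions += 1
    return divisions

without_telomerase = divisions_until_coding_loss(600, 100)
with_telomerase = divisions_until_coding_loss(600, 100,
                                              added_by_telomerase=100,
                                              max_divisions=50)
```

Without telomerase the repeat is used up after a fixed number of divisions; with telomerase restoring what is lost, the count is limited only by the cap, which mirrors the "immortal" behaviour described above.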
Prokaryotic transcription is the process in which messenger RNA transcripts of genetic
material in prokaryotes are produced, to be translated for the production of proteins.
Prokaryotic transcription occurs in the cytoplasm alongside translation. Unlike in eukaryotes,
prokaryotic transcription and translation can occur simultaneously. This is impossible in
eukaryotes, where transcription occurs in a membrane-bound nucleus while translation
occurs outside the nucleus in the cytoplasm. In prokaryotes genetic material is not enclosed
in a membrane-enclosed nucleus and has access to ribosomes in the cytoplasm.
Initiation
The following steps occur, in order, for transcription initiation:
RNA polymerase (RNAP) binds to one of several specificity factors, σ, to form a holoenzyme.
In this form, it can recognize and bind to specific promoter regions in the DNA. At this stage,
the DNA is double-stranded ("closed"). This holoenzyme/wound-DNA structure is referred
to as the closed complex.
Elongation
The DNA is unwound and becomes single-stranded ("open") in the vicinity of the initiation
site (defined as +1). This holoenzyme/unwound-DNA structure is called the open complex.
The RNA polymerase transcribes the DNA, but produces about 10 abortive (short, nonproductive) transcripts which are unable to leave the RNA polymerase because the exit
channel is blocked by the σ-factor.
The σ-factor eventually dissociates from the holoenzyme, and elongation proceeds.
Promoters can differ in "strength"; that is, how actively they promote transcription of their
adjacent DNA sequence. Promoter strength is in many (but not all) cases, a matter of how
tightly RNA polymerase and its associated accessory proteins bind to their respective DNA
sequences. The more similar the sequences are to a consensus sequence, the stronger the
binding is. Additional transcription regulation comes from transcription factors that can
affect the stability of the holoenzyme structure at initiation.
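The idea that promoter strength tracks similarity to a consensus sequence can be made concrete with a simple match count. The hexamers TTGACA (-35) and TATAAT (-10) used below are the commonly cited σ70 consensus sequences and are an addition to these notes; the scoring function itself is a hypothetical simplification that ignores spacer length and accessory factors.

```python
CONSENSUS_35 = "TTGACA"   # assumed sigma-70 consensus for the -35 region
CONSENSUS_10 = "TATAAT"   # assumed sigma-70 consensus for the -10 region

def matches(seq, consensus):
    """Count positions at which seq agrees with the consensus."""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have equal length")
    return sum(a == b for a, b in zip(seq, consensus))

def promoter_score(box35, box10):
    """Fraction (0 to 1) of consensus positions matched by the two boxes."""
    total = matches(box35, CONSENSUS_35) + matches(box10, CONSENSUS_10)
    return total / (len(CONSENSUS_35) + len(CONSENSUS_10))

strong = promoter_score("TTGACA", "TATAAT")   # perfect match
weaker = promoter_score("TTTACA", "TATGTT")   # hypothetical weaker promoter
```

A higher score stands in for tighter binding in this sketch; as the text notes, strength is only in many (not all) cases a matter of binding affinity.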
Most transcripts originate using adenosine-5'-triphosphate (ATP) and, to a lesser extent,
guanosine-5'-triphosphate (GTP) (purine nucleoside triphosphates) at the +1 site. Uridine-5'-triphosphate (UTP) and cytidine-5'-triphosphate (CTP) (pyrimidine nucleoside
triphosphates) are disfavoured at the initiation site.
Termination
Two termination mechanisms are well known:
Intrinsic termination (also called Rho-independent transcription termination) involves
terminator sequences within the RNA that signal the RNA polymerase to stop. The
terminator sequence is usually a palindromic sequence that forms a stem-loop hairpin
structure that leads to the dissociation of the RNAP from the DNA template.
Rho-dependent termination uses a termination factor called ρ factor (rho factor), a
protein that stops RNA synthesis at specific sites. This protein binds at a rho utilisation site on
the nascent RNA strand and runs along the mRNA towards the RNAP. A stem loop structure
upstream of the terminator region pauses the RNAP; when ρ factor reaches the RNAP, it
causes RNAP to dissociate from the DNA, terminating transcription.
Other termination mechanisms include cases where RNAP comes across a region with repetitious
thymidine residues in the DNA template, or where a GC-rich inverted repeat is followed by 4 A
residues. The inverted repeat forms a stable stem loop structure in the RNA, which causes
the RNA to dissociate from the DNA template.
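The signature of intrinsic termination (a GC-rich stem-loop in the RNA followed by a run of U's) can be searched for with a small heuristic. This is a sketch under stated assumptions: the stem length, loop sizes, U-run length and GC threshold are arbitrary defaults, the stem must pair perfectly, and the function and sequence names are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))

def find_intrinsic_terminators(rna, stem_len=5, min_loop=3, max_loop=8,
                               u_run=4, min_gc=0.6):
    """Return (stem start, hairpin end) pairs for hairpin + U-run candidates.

    A candidate is a GC-rich stem, a short loop, the exact reverse
    complement of the stem, and then at least u_run U residues.
    """
    hits = []
    last_start = len(rna) - 2 * stem_len - min_loop - u_run
    for i in range(last_start + 1):
        stem = rna[i:i + stem_len]
        gc_fraction = sum(base in "GC" for base in stem) / stem_len
        if gc_fraction < min_gc:
            continue
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop              # start of the 3' arm
            arm = rna[j:j + stem_len]
            tail = rna[j + stem_len:j + stem_len + u_run]
            if arm == reverse_complement(stem) and tail == "U" * u_run:
                hits.append((i, j + stem_len))
    return hits

# Hypothetical transcript: stem GCCGC, loop GAAA, stem GCGGC, then U's.
transcript = "AUGCCGCGAAAGCGGCUUUUUUAG"
candidates = find_intrinsic_terminators(transcript)
```

Real terminator prediction scores imperfect stems and folding energy; the exact-match rule here only illustrates the pattern described in the text.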
The -35 region and the -10 ("Pribnow box") region comprise the basic prokaryotic promoter,
and |T| stands for the terminator. The DNA on the template strand between the +1 site and
the terminator is transcribed into RNA, which is then translated into protein.
Eukaryotic transcription is more complex than prokaryotic transcription. For instance, in
eukaryotes the genetic material (DNA), and therefore transcription, is primarily localized to
the nucleus, where it is separated from the cytoplasm (in which translation occurs) by the
nuclear membrane. DNA is also present in mitochondria in the cytoplasm and mitochondria
utilize a specialized RNA polymerase for transcription. This allows for the temporal
regulation of gene expression through the sequestration of the RNA in the nucleus, and
allows for selective transport of RNAs to the cytoplasm, where the ribosomes reside.
The basal eukaryotic transcription complex includes the RNA polymerase and additional
proteins that are necessary for correct initiation and elongation.
Initiation
Among eukaryotes that regulate the transcription of individual genes, the core promoter of
a protein-encoding gene contains binding sites for the basal transcription complex and RNA
polymerase II, and is normally within about 50 bases upstream of the transcription initiation
site. Further transcriptional regulation is provided by upstream control elements (UCEs),
usually present within about 200 bases upstream of the initiation site. The core promoter for
Pol II sometimes contains a TATA box, the highly conserved DNA recognition sequence for
the TATA box binding protein, TBP, whose binding initiates transcription complex assembly
at the promoter.
Some genes also have enhancer elements that can be thousands of bases upstream or
downstream of the transcription initiation site. Combinations of these upstream control
elements and enhancers regulate and amplify the formation of the basal transcription
complex.
Transcription process
Eukaryotes have three nuclear RNA polymerases, each with distinct roles and properties:
Name: location; RNA transcribed
RNA Polymerase I (Pol I, Pol A): nucleolus; larger ribosomal RNA (rRNA) (28S, 18S, 5.8S)
RNA Polymerase II (Pol II, Pol B): nucleus; messenger RNA (mRNA) and most small nuclear RNAs (snRNAs)
RNA Polymerase III (Pol III, Pol C): nucleus (and possibly the nucleolus-nucleoplasm interface); transfer RNA (tRNA) and other small RNAs (including the small 5S rRNA)
There are many eukaryotes that differ from the canonical presentation of the roles of RNA
polymerases. Certain organisms possess four distinct RNA polymerases. Other organisms
utilize RNA polymerase I to transcribe certain protein-coding genes in addition to rRNAs.
Transcription regulation
The regulation of gene expression is achieved through the interaction of several levels of
control including the regulation of transcription initiation. Most (not all) eukaryotes possess
robust methods of regulating transcription initiation on a gene-by-gene basis. The
transcription of a gene can be regulated by cis-acting elements within the regulatory regions
of the DNA, and trans-acting factors that include transcription factors and the basal
transcription complex.
Splicing
Two types of splicing, cis-splicing and trans-splicing, use the same splicing machinery to
cleave RNAs at specific points and rejoin them to form new combinations once transcribed.
Although most eukaryotes possess splicing machinery the extent of cis- and trans-splicing
varies from organism to organism.
Cis-splicing
Primary (initial) mRNA transcripts are synthesized as larger precursor RNAs that are
processed by splicing out introns (non-coding sequences) and ligating exons (non-contiguous
coding sequences) into the mature mRNA. Primary transcripts for some genes can be large.
The primary transcripts of the neurexin genes, for instance, are as large as 1.7 megabases
(1,700,000 bases), while the mature (processed) neurexin mRNAs are under 10 kilobases
(10,000 bases), with as many as 24 exons and thousands of possible alternative splice variants
that produce proteins with different activities. Over 80% of human genes are alternatively
spliced, greatly increasing the variety of actual proteins produced by the limited set of genes
in the human genome.
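Cis-splicing and alternative splicing can be pictured as joining selected exon intervals of the primary transcript. The sketch below assumes the exon coordinates are already known (0-based, end-exclusive); the sequence, coordinates and function name are invented for illustration.

```python
def splice(pre_mrna, exons):
    """Join the given exon intervals in order; everything else is dropped."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# Hypothetical primary transcript: exon, intron, exon, intron, exon.
pre_mrna = ("AUGGCC" + "GUAAGUUUUCAG" + "UUUGAA"
            + "GUAUGUCCCCAG" + "CCAUAA")
exons = [(0, 6), (18, 24), (36, 42)]

mature = splice(pre_mrna, exons)                      # all three exons
exon_skipped = splice(pre_mrna, [exons[0], exons[2]])  # middle exon skipped
```

Choosing different subsets of exons from the same primary transcript gives different mature mRNAs, which is how a limited set of genes can yield a larger variety of proteins.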
Trans-splicing
Observed in a range of different eukaryotes (including most conspicuously the worm C.
elegans and a group of parasitic protists called kinetoplastids), trans-splicing occurs whereby
an exon from one RNA molecule is spliced onto the 5' end of a completely separate molecule
post-transcriptionally. While relatively unimportant in many eukaryotes, this
process is ubiquitous in the biology of some organisms. In kinetoplastids, for example, every
single nuclear-encoded message must be trans-spliced before translation of the message can
occur.
RNA polymerase (RNAP or RNApol) is an enzyme that produces RNA. In cells, RNAP is
needed for constructing RNA chains from DNA genes as templates, a process called
transcription. RNA polymerase enzymes are essential to life and are found in all organisms
and many viruses. In chemical terms, RNAP is a nucleotidyl transferase that polymerizes
ribonucleotides at the 3' end of an RNA transcript.
RNAP was discovered independently by Sam Weiss and Jerard Hurwitz in 1960. By this
time the 1959 Nobel Prize in Medicine had been awarded to Severo Ochoa and Arthur
Kornberg for the discovery of what was believed to be RNAP, but instead turned out to be
polynucleotide phosphorylase.
The 2006 Nobel Prize in Chemistry was awarded to Roger Kornberg for creating detailed
molecular images of RNA polymerase during various stages of the transcription process.
RNA polymerase I (also called Pol I) is, in eukaryotes, the only enzyme that transcribes
ribosomal RNA (excluding 5S rRNA, which is synthesized by RNA Polymerase III), a type of
RNA which accounts for over 50% of the total RNA synthesized in a cell.
Pol I consists of 8-14 protein subunits (polypeptides). Most of these subunits have identical or related
counterparts in Pol II and Pol III. rDNA transcription is confined to the nucleolus where
several hundreds of copies of rRNA genes are present, arranged as tandem head-to-tail
repeats. Pol I transcribes one large transcript encoding an rDNA gene over and over again.
This gene encodes the 18S, the 5.8S and the 28S RNA molecules of the ribosome in
eukaryotes. The transcripts are processed with the help of snoRNAs. The 5S ribosomal RNA is transcribed by
Pol III. Because of the simplicity of Pol I transcription, it is the fastest acting polymerase.
RNA polymerase II (also called RNAP II and Pol II) is an enzyme found in eukaryotic cells.
It catalyzes the transcription of DNA to synthesize precursors of mRNA and most snRNA
and microRNA. A 550 kDa complex of 12 subunits, RNAP II is the most studied type of RNA
polymerase. A wide range of transcription factors are required for it to bind to its promoters
and begin transcription.
RNA polymerase III (also called Pol III) transcribes DNA to synthesize ribosomal 5S rRNA,
tRNA and other small RNAs. The genes transcribed by RNA Pol III fall in the category of
"housekeeping" genes whose expression is required in all cell types and most environmental
conditions. Therefore the regulation of Pol III transcription is primarily tied to the regulation
of cell growth and the cell cycle, thus requiring fewer regulatory proteins than RNA polymerase
II.
Control of transcription
An electron-micrograph of DNA strands decorated by hundreds of RNAP molecules too
small to be resolved. Each RNAP is transcribing an RNA strand which can be seen branching
off from the DNA. "Begin" indicates the 3' end of the DNA, where RNAP initiates
transcription; "End" indicates the 5' end, where the longer RNA molecules are almost
completely transcribed.
Control of the process of gene transcription affects patterns of gene expression and thereby
allows a cell to adapt to a changing environment, perform specialized roles within an
organism, and maintain basic metabolic processes necessary for survival. Therefore, it is
hardly surprising that the activity of RNAP is both complex and highly regulated. In
Escherichia coli bacteria, more than 100 transcription factors have been identified which
modify the activity of RNAP.
RNAP can initiate transcription at specific DNA sequences known as promoters. It then
produces an RNA chain which is complementary to the template DNA strand. The process of
adding nucleotides to the RNA strand is known as elongation; in eukaryotes, RNAP can
build chains as long as 2.4 million nucleotides (the full length of the dystrophin gene). RNAP
will preferentially release its RNA transcript at specific DNA sequences encoded at the end of
genes known as terminators.
Products of RNAP include:
Messenger RNA (mRNA)—template for the synthesis of proteins by ribosomes.
Non-coding RNA or "RNA genes"—a broad class of genes that encode RNA that is not
translated into protein. The most prominent examples of RNA genes are transfer RNA
(tRNA) and ribosomal RNA (rRNA), both of which are involved in the process of translation.
However, since the late 1990s, many new RNA genes have been found, and thus RNA genes
may play a much more significant role than previously thought.
Transfer RNA (tRNA)—transfers specific amino acids to growing polypeptide chains at the
ribosomal site of protein synthesis during translation
Ribosomal RNA (rRNA)—a component of ribosomes
Micro RNA—regulates gene activity
Catalytic RNA (Ribozyme)—enzymatically active RNA molecules
RNAP accomplishes de novo synthesis. It is able to do this because specific interactions with
the initiating nucleotide hold RNAP rigidly in place, facilitating chemical attack on the
incoming nucleotide. Such specific interactions explain why RNAP prefers to start transcripts
with ATP (followed by GTP, UTP, and then CTP). In contrast to DNA polymerase, RNAP
includes helicase activity, therefore no separate enzyme is needed to unwind DNA.
RNA polymerase action
Binding and initiation
RNA Polymerase binding in prokaryotes involves the α subunit recognizing the upstream
element (-40 to -70 base pairs) in DNA, as well as the σ factor recognizing the -10 to -35
region. There are numerous σ factors that regulate gene expression. For example, σ70 is
expressed under normal conditions and allows RNAP binding to house-keeping genes, while
σ32 elicits RNAP binding to heat-shock genes.
After binding to the DNA, the RNA polymerase switches from a closed complex to an open
complex. This change involves the separation of the DNA strands to form an unwound
section of DNA of approximately 13bp. Ribonucleotides are base-paired to the template DNA
strand, according to Watson-Crick base-pairing interactions. Supercoiling plays an important
part in polymerase activity because of the unwinding and rewinding of DNA. Because
regions of DNA in front of RNAP are unwound, there are compensatory positive supercoils.
Regions behind RNAP are rewound and negative supercoils are present.
Elongation
Transcription elongation involves the further addition of ribonucleotides and the change of
the open complex to the transcriptional complex. RNAP cannot start forming full length
transcripts because of its strong binding to the promoter. Transcription at this stage primarily
results in short RNA fragments of around 9 nucleotides in a process known as abortive transcription.
Once the RNAP starts forming longer transcripts it clears the promoter. At this point, the -10
to -35 promoter region is disrupted, and the σ factor falls off RNAP. This allows the rest of
the RNAP complex to move forward, as the σ factor held the RNAP complex in place.
The 17 bp transcriptional complex has an 8 bp DNA-RNA hybrid, that is, 8 base-pairs involve
the RNA transcript bound to the DNA template strand. As transcription progresses,
ribonucleotides are added to the 3' end of the RNA transcript and the RNAP complex moves
along the DNA. Although RNAP does not seem to have the 3' exonuclease activity that
characterizes the proofreading activity found in DNA polymerase, there is evidence that
RNAP will halt at mismatched base-pairs and correct them.
The addition of ribonucleotides to the RNA transcript has a very similar mechanism to DNA
polymerization - it is believed that these polymerases are evolutionarily related. Aspartyl
(asp) residues in the RNAP will hold onto Mg2+ ions, which will in turn coordinate the
phosphates of the ribonucleotides. The first Mg2+ will hold onto the α-phosphate of the NTP
to be added. This allows the nucleophilic attack of the 3'OH from the RNA transcript, adding
an additional nucleotide (NMP) to the chain. The second Mg2+ will hold onto the pyrophosphate of the
NTP. The overall reaction equation is:
(NMP)_n + NTP → (NMP)_{n+1} + PP_i
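The reaction equation can be traced step by step in a short sketch: each NTP added to an existing chain extends it by one nucleotide and releases one pyrophosphate. The function name and example template are hypothetical, and the sketch assumes the first nucleotide simply starts the chain without releasing pyrophosphate.

```python
def elongate(template_3to5):
    """Polymerize RNA along a template, one NTP at a time.

    Returns (transcript written 5'->3', number of PPi released).
    """
    pairing = {"A": "U", "T": "A", "G": "C", "C": "G"}
    transcript = ""
    ppi_released = 0
    for base in template_3to5:
        if transcript:
            # (NMP)n + NTP -> (NMP)n+1 + PPi
            ppi_released += 1
        transcript += pairing[base]   # new nucleotide joins the 3' end
    return transcript, ppi_released

rna, ppi = elongate("TACGGT")   # hypothetical template, written 3'->5'
```

For a transcript of n nucleotides the sketch counts n - 1 pyrophosphates, one per phosphodiester bond formed.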
Termination
Termination of RNA transcription can be rho-independent or rho-dependent:
Rho-independent transcription termination is the termination of transcription without the aid
of the rho protein. Transcription of a palindromic region of DNA causes the formation of a
hairpin structure from the RNA transcript looping and binding upon itself. This hairpin
structure is often rich in G-C base-pairs, making it more stable than the DNA-RNA hybrid
itself. As a result, the 8bp DNA-RNA hybrid in the transcription complex shifts to a 4bp
hybrid. Coincidentally, these last 4 base-pairs are weak A-U base-pairs, and the entire RNA
transcript will fall off.[5]
RNA polymerase in bacteria
In bacteria, the same enzyme catalyzes the synthesis of mRNA and ncRNA.
RNAP is a relatively large molecule. The core enzyme has 5 subunits (~400 kDa):
α2: the two α subunits assemble the enzyme and recognize regulatory factors. Each subunit
has two domains: αCTD (C-Terminal domain) binds the UP element of the extended
promoter, and αNTD (N-terminal domain) binds the rest of the polymerase. This subunit is
not used on promoters without an UP element.
β: this has the polymerase activity (catalyzes the synthesis of RNA) which includes chain
initiation and elongation.
β': binds to DNA (nonspecifically).
ω: restores denatured RNA polymerase to its functional form in vitro. It has been observed to
offer a protective/chaperone function to the β' subunit in Mycobacterium smegmatis. Now
known to promote assembly.
In order to bind promoter-specific regions, the core enzyme requires another subunit, sigma
(σ). The sigma factor greatly reduces the affinity of RNAP for nonspecific DNA while
increasing specificity for certain promoter regions, depending on the sigma factor. That way,
transcription is initiated at the right region. The complete holoenzyme therefore has 6
subunits: α2ββ'σω (~480 kDa). The structure of RNAP exhibits a groove with a length of 55 Å
(5.5 nm) and a diameter of 25 Å (2.5 nm). This groove fits well the 20 Å (2 nm) double strand
of DNA. The 55 Å (5.5 nm) length can accept 16 nucleotides.
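The figure of 16 nucleotides for a 55 Å groove can be checked with simple arithmetic. The rise of about 3.4 Å per base pair for B-form DNA is an assumption brought in for this check; it is not stated in the notes.

```python
RISE_PER_BP_ANGSTROM = 3.4   # assumed rise per base pair of B-form DNA

def base_pairs_in_groove(groove_length_angstrom, rise=RISE_PER_BP_ANGSTROM):
    """Whole number of base pairs that fit along the given length."""
    return int(groove_length_angstrom // rise)

fits = base_pairs_in_groove(55)   # 55 / 3.4 is a little over 16
```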
When not in use RNA polymerase binds to low affinity sites to allow rapid exchange for an
active promoter site when one opens. RNA polymerase holoenzyme, therefore, does not
freely float around in the cell when not in use.
Transcriptional cofactors
There are a number of proteins which can bind to RNAP and modify its behavior. For
instance, GreA and GreB from E. coli and most other prokaryotes can enhance the ability
of RNAP to cleave the RNA transcript near the growing end of the chain. This cleavage can
rescue a stalled polymerase molecule, and is likely involved in proofreading the occasional
mistakes made by RNAP. A separate cofactor, Mfd, is involved in transcription-coupled
repair, the process in which RNAP recognizes damaged bases in the DNA template and
recruits enzymes to restore the DNA. Other cofactors are known to play regulatory roles, i.e.
they help RNAP choose whether or not to express certain genes.
RNA polymerase in eukaryotes
Eukaryotes have several types of RNAP, characterized by the type of RNA they synthesize:
RNA polymerase I synthesizes a pre-rRNA 45S, which matures into 28S, 18S and 5.8S rRNAs
which will form the major RNA sections of the ribosome.
RNA polymerase II synthesizes precursors of mRNAs and most snRNA and microRNAs. This
is the most studied type, and due to the high level of control required over transcription a
range of transcription factors are required for its binding to promoters.
RNA polymerase III synthesizes tRNAs, rRNA 5S and other small RNAs found in the
nucleus and cytosol.
RNA polymerase IV synthesizes siRNA in plants.
RNA polymerase V synthesizes RNAs involved in siRNA-directed heterochromatin
formation in plants.
There are other RNA polymerase types in mitochondria and chloroplasts. And there are
RNA-dependent RNA polymerases involved in RNA interference.
RNA polymerase in archaea
Archaea have a single RNAP that is closely related to the three main eukaryotic polymerases.
Thus, it has been speculated that the archaeal polymerase resembles the ancestor of the
specialized eukaryotic polymerases.
RNA polymerase in viruses
T7 RNA polymerase producing a mRNA (green) from a DNA template. The protein is shown
as a purple ribbon. Image derived from PDB 1MSW.
Many viruses also encode RNAP. Perhaps the most widely studied viral RNAP is found
in bacteriophage T7. This single-subunit RNAP is related to that found in mitochondria and
chloroplasts, and shares considerable homology to DNA polymerase.[13] It is believed that
most viral polymerases therefore evolved from DNA polymerase and are not directly related
to the multi-subunit polymerases described above.
The viral polymerases are diverse, and include some forms which can use RNA as a template
instead of DNA. This occurs in negative strand RNA viruses and dsRNA viruses, both of
which exist for a portion of their life cycle as double-stranded RNA. However, some positive
strand RNA viruses, such as polio, also contain these RNA dependent RNA polymerases.