Biochemistry 401G
Lecture 36
ANDRES
Lecture Summary:
Introduce RNA polymers. Know the chemistry of these macromolecules and
why different classes are present.
Define a Gene. Understand the nomenclature and the definition of a promoter
(why are they necessary?).
Understand the RNA polymerase reaction. How is it similar to or different from DNA
polymerases? What is Sigma factor? Know the consensus promoter sequence.
Understand the process of transcription. How is this process terminated in
prokaryotes?
Central Dogma of Molecular Biology:
How does the sequence of a strand of DNA correspond to the
amino acid sequence of a protein? This concept is explained by
the central dogma of molecular biology, which states that:
Flow of genetic information in normal cells:
     Replication           Transcription              Translation
DNA -------------> DNA ---------------------> RNA -------------------> Protein
Why would an organism want to have an intermediate between
DNA and the protein it encodes?
•DNA can stay pristine and protected, away from the caustic
chemistry of the cytoplasm.
•Genetic information can be amplified by having many copies
of RNA made from one copy of DNA.
•Regulation of gene expression can be effected by having
specific controls at each element of the pathway between DNA and
proteins. The more elements there are in the pathway, the more
opportunities there are to control the process in response to
different circumstances.
Transcription: Information stored in the sequence of DNA is
converted to RNA. Mechanistically, transcription is similar to DNA
replication: it uses nucleoside triphosphates and template-directed
synthesis in the 5' to 3' direction.
2 major differences:
1) Only one DNA strand is used as the template (a single-stranded RNA
chain is synthesized).
2) Only a small fraction of the total genetic potential of an
organism is used in any one cell.
The reaction is thermodynamically favorable: Hydrolysis of the
terminal phosphoanhydride bond of a nucleoside triphosphate yields
13 kJ/mol more energy than is necessary for formation of a
phosphodiester linkage within the RNA backbone (remember back
to our discussion of DNA synthesis in earlier lectures).
Structural features of RNA:
1. Similar to DNA except it contains a 2' hydroxyl group (makes
phosphodiester bond more labile than DNA).
2. Thymine in DNA is replaced by Uracil in RNA.
3. RNAs can adopt regular three-dimensional structures that allow them to
function in the process of genetic expression (i.e. the production of proteins).
This ability to adopt defined three-dimensional structures that impart
functionality places RNA in a unique class, somewhat akin to proteins,
and different from DNA. For example, certain RNA molecules, when folded,
exhibit catalytic capacities (e.g. the cleavage of RNA molecules). The
majority of RNA in cells is found in complex with proteins. The most common
example is ribosomes (involved in protein synthesis).
Classes of RNA:
1) Messenger RNA (mRNA): It is the carrier of genetic
information on the primary structure of proteins from DNA, along
with special features that allow it to attach to ribosomes and
function in protein synthesis. Its size depends on the size of the
protein for which it codes. It tends to be relatively short-lived, and
its lifetime varies from molecular species to molecular species
(depending to a great extent on the biological role of the protein
which it encodes). 3% of total RNA in bacteria is mRNA.
2) Ribosomal RNA (rRNA): Forms the ribosome, the site of
protein synthesis, and one rRNA is the catalyst for formation of the
peptide bond. Various species range in size from 4700 bases to
about 120 bases. Eukaryotic and prokaryotic rRNAs are distinctly
different. rRNA is long-lived (stable). 83% of bacterial total RNA.
3) Transfer RNA (tRNA): Is a small (65-110 nucleotide) molecule
designed to carry activated amino acids to the site of protein
synthesis, the ribosome. Is long-lived (stable). 14% of total
bacterial RNA.
Define a new term: GENE
A Gene is the entire nucleic acid sequence that is necessary
for the synthesis of a functional RNA molecule (this includes
mRNAs that would allow the production of active
polypeptides). Genes can be transcribed into any of the classes
of RNA that we discussed above.
Thus, a gene contains additional sequence information beyond
that which codes for the amino acids in a protein or the nucleotides
in an RNA molecule. The gene also contains the DNA necessary to
get a particular transcript made.
Terminology and numbering of Gene Sequences:
1). DNA is indicated in a 5' to 3' direction along its top (or
coding) strand and 3' to 5' along the bottom (TEMPLATE or
noncoding) strand.
5'---------------------------------------3' Coding strand
3'---------------------------------------5' Template strand
If this DNA sequence is capable of being transcribed to RNA, the
sequence would be termed a "gene" and the RNA would be written
as the 5' to 3' TOP or CODING strand sequence.
Coding Strand: Identical to the RNA transcript.
Template Strand: Serves as the template for making the RNA
transcript and is complementary to that of the RNA transcript.
2). Numbering system
Transcription Start Site. Nucleotide in DNA coding strand
corresponding to the first nucleotide of the transcribed RNA is
numbered +1.
Nucleotides to the right of the start site (+1) toward the 3' end on the
coding strand are indicated by increasing positive numbers (+2, +3,
+4, +5, etc.).
Nucleotide directly to the left of the +1 nucleotide (start site) is
defined as -1, and the next is -2, -3, etc. There is no zero between
-1 and +1.
3). Promoter Sequences: Each gene has sequences that are
important for controlling its expression. These are termed
"promoter sequences."
Usually found at the 5' end of the gene, relative to the coding
strand. In the numbering system, these promoter sequences have
negative numbers.
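The strand and numbering conventions above can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are made up for this sketch, and only the conventions themselves (transcript matches the coding strand, +1 at the start site, no position 0) come from the notes.

```python
# Hypothetical helper names; the sequences below are invented for illustration.
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3to5: str) -> str:
    """Return the RNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(RNA_COMPLEMENT[base] for base in template_3to5.upper())


def coding_to_rna(coding_5to3: str) -> str:
    """The transcript is identical to the coding strand, with U replacing T."""
    return coding_5to3.upper().replace("T", "U")


def gene_position(index: int, start_index: int) -> int:
    """Map a 0-based index on the coding strand to gene numbering.

    The transcription start site is +1, the base to its left is -1,
    and there is no position 0.
    """
    offset = index - start_index
    return offset + 1 if offset >= 0 else offset


coding = "TTGACATATAATGCATCG"    # made-up coding strand, written 5'->3'
template = "AACTGTATATTACGTAGC"  # its complement, written 3'->5'
start_index = 12                 # assumed position of the start site in `coding`

rna_from_template = transcribe_template(template)
rna_from_coding = coding_to_rna(coding)
labels = [gene_position(i, start_index) for i in range(len(coding))]
```

Both routes give the same transcript, and `labels` runs from -12 up to -1 and then from +1 to +6 with no zero in between.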
Enzymology of RNA Synthesis/Transcription: RNA
POLYMERASE
RNA polymerase can initiate the synthesis of a new nucleic acid
strand given a template. This means that a primer is not
necessary!
A single RNA polymerase functions in bacteria.
In eukaryotes, three distinct RNA polymerases share the work, each
synthesizing different classes of RNA.
DNA and RNA polymerases Catalyze Similar Reactions:
Vmax DNA pol III:    500-1000 nucleotides/sec
Vmax RNA polymerase: 50 nucleotides/sec
10 molecules of DNA polymerase/cell, 3000 molecules of RNA
polymerase (~50% involved in making RNA at any one time). DNA
replication is fast but initiates at a few sites while RNA transcription
is slow but occurs at many sites of initiation and so accumulates to
high levels.
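The comparison above can be checked with quick arithmetic. This is a rough sketch only: it assumes every DNA polymerase molecule is active at once and takes the midpoint of the quoted 500-1000 nucleotides/sec range, neither of which is stated in the notes.

```python
# Figures quoted in the notes.
rna_pol_rate = 50               # nucleotides/sec per RNA polymerase
dna_pol_per_cell = 10
rna_pol_per_cell = 3000
rna_pol_active_fraction = 0.5   # "~50% involved in making RNA at any one time"

# Assumption: use the midpoint of the quoted 500-1000 nucleotides/sec range.
dna_pol_rate = (500 + 1000) / 2

# Assumption: all DNA polymerase molecules are working simultaneously.
dna_output = dna_pol_per_cell * dna_pol_rate
rna_output = rna_pol_per_cell * rna_pol_active_fraction * rna_pol_rate

ratio = rna_output / dna_output
print(f"DNA synthesis: {dna_output:.0f} nt/sec per cell")
print(f"RNA synthesis: {rna_output:.0f} nt/sec per cell")
print(f"RNA/DNA ratio: {ratio:.0f}")
```

Under these assumptions the cell polymerizes about ten times more RNA than DNA per second, even though each RNA polymerase is more than ten times slower than DNA pol III.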
RNA polymerase is highly processive (like DNA pol.). So once
initiated, it will not dissociate until a specific termination
signal is received.
Another difference is that RNA polymerase is much less accurate.
RNA Polymerase is an Oligomeric Protein:
5 different protein subunits comprise RNA Polymerase in bacteria:
2 copies of α, and one each of β, β', σ, and ω. A separate function has been
ascribed to different subunits:
1. 2 alpha -- initiation.
2. beta -- phosphodiester bond formation.
3. beta' -- binds the DNA template.
These four subunits (2 alpha, beta, beta') are the core enzyme; they alone carry out
transcription, but cannot initiate rapidly at specific sites.
4. sigma -- recognizes the promoter and provides binding
specificity. The core enzyme plus sigma factor is called the
holoenzyme.
5. omega -- unknown function.
The sigma subunit (σ) can be removed from the RNA polymerase
core while leaving the rest of the complex intact.
Using these two complexes, scientists tested the binding affinity of
the entire complex and the Core complex (lacking sigma) for
general DNA and "Promoter" DNA (which contains -10 and -35
consensus sequences, see below).
Kassoc values for:
                            Any DNA             Promoter DNA sequence
RNA polymerase (- sigma)    1 x 10^10 M^-1      1 x 10^10 M^-1
RNA polymerase (+ sigma)    5 x 10^6 M^-1       2 x 10^11 M^-1
Sigma Factor does two things:
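1). Decreases the affinity of RNA polymerase for general DNA (by
3 to 4 orders of magnitude: about 2000-fold, from 1 x 10^10 to
5 x 10^6 M^-1).
2). Increases the affinity of RNA polymerase for promoter DNA
sites (by about 1 order of magnitude: 20-fold, from 1 x 10^10 to
2 x 10^11 M^-1).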
The function of sigma is to interact with the -10 and -35 consensus
sequences (promoter region) so that RNA polymerase can find genes
by binding to their promoter regions, and then initiate RNA
synthesis from them.
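The effect of sigma on binding can be made concrete by computing ratios of the association constants quoted above. This is a small sketch; the dictionary layout and the function names `specificity` and `fold_change` are invented for illustration, and only the four Kassoc values come from the notes.

```python
import math

# Association constants (M^-1) as quoted in the notes.
K_ASSOC = {
    ("core", "any"): 1e10,
    ("core", "promoter"): 1e10,
    ("holo", "any"): 5e6,
    ("holo", "promoter"): 2e11,
}


def specificity(enzyme: str) -> float:
    """Preference for promoter DNA over general DNA (ratio of Kassoc values)."""
    return K_ASSOC[(enzyme, "promoter")] / K_ASSOC[(enzyme, "any")]


def fold_change(dna: str) -> float:
    """How much adding sigma changes the affinity for a given kind of DNA."""
    return K_ASSOC[("holo", dna)] / K_ASSOC[("core", dna)]


core_specificity = specificity("core")
holo_specificity = specificity("holo")
any_dna_log_change = math.log10(fold_change("any"))
promoter_log_change = math.log10(fold_change("promoter"))
```

With these numbers the core enzyme shows no preference for promoters (ratio 1), while the holoenzyme prefers promoter DNA by a factor of 40,000. The change for general DNA is about -3.3 on a log10 scale and the change for promoter DNA is about +1.3.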
STEPS OF TRANSCRIPTION:
1). Binding of RNA polymerase to Promoter Sequences:
In E. coli there are two regions that are similar in all promoters.
One sequence is centered at -10 and the other at -35 relative to the
transcriptional start site at +1.
The -10 and -35 sequences are used to identify the location of
genes.
They are called "Consensus Sequences". A consensus sequence is an
idealized sequence of bases whose real counterparts appear in
various places in a polynucleotide and perform the same function in
each, but with minor deviations of the real sequence from the ideal.
For the -10 region (or Pribnow box) the consensus sequence is:
5' TATAAT 3', often called the "TATA" box for this reason.
For the -35 region the consensus sequence is 5' TTGACA 3'.
The nucleotide at the transcriptional start site is almost ALWAYS A
PURINE (A or G), most often an Adenine.
Promoter recognition is a critical step in transcription because it
is the rate-limiting step. Because the same protein complex in bacteria
transcribes all genes, differences in promoter structure are largely
responsible for differences in the frequency of initiation (from as
often as once every 10 sec to as rarely as once per generation
[30-60 min]).
The notion of consensus sequence represents relative (as opposed
to absolute) specificity for a nucleotide sequence. The more closely
a real promoter (-10, -35 region) resembles the consensus, the
better it performs as a promoter (more often recognized by the
sigma factor-containing complex). Therefore, in prokaryotes, the
more closely the promoter region for a given gene resembles the
perfect consensus sequence, the more often the gene will be
transcribed.
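One simple way to picture "resemblance to the consensus" is to count matching positions in the -35 and -10 hexamers. This is a deliberately crude sketch: the scoring scheme, the function names and the "weak" example promoter are assumptions made here for illustration, and a plain match count is not claimed to predict real promoter strength.

```python
# Consensus hexamers from the notes.
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"


def count_matches(site: str, consensus: str) -> int:
    """Count positions at which a hexamer agrees with the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a == b for a, b in zip(site.upper(), consensus))


def promoter_score(minus35: str, minus10: str) -> int:
    """Total matches over both hexamers (0 to 12); hypothetical scoring."""
    return count_matches(minus35, CONSENSUS_35) + count_matches(minus10, CONSENSUS_10)


perfect = promoter_score("TTGACA", "TATAAT")
weak = promoter_score("CTGACG", "TACTGT")  # invented hexamers, not a real promoter
print(perfect, weak)
```

The perfect consensus scores 12 of 12, while the invented weaker promoter scores 7. On the notes' reasoning, the second would be recognized less often by the sigma-containing holoenzyme.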
How does the RNA polymerase find a promoter in DNA?
RNA polymerase binds to DNA at random sites and moves quickly
along the DNA while the sigma factor scans for promoter regions.
Once a promoter is located, the sigma subunit binds to the
promoter sequences with high affinity and prevents the polymerase
from scanning any further.
Why use a scanning mechanism? Because it is much faster than
a random association/dissociation search, which is diffusion
controlled and therefore a second-order reaction (maximum rate
10^8 M^-1 s^-1).
The scanning scheme is essentially first order and has a rate
constant of 10^10 M^-1 s^-1. This is two orders of magnitude faster
than a bind/release search.
2). Initiation of Transcription.
A). RNA polymerase associates with promoter sequences near the +1
Transcription start site.
This is called a "CLOSED PROMOTER COMPLEX" because the DNA at the
Transcription start site is still double stranded.
B). The RNA Polymerase complex then unwinds the DNA at the Transcription
start site to make it single-stranded.
This complex is termed the "Open Complex" because the DNA is single
stranded within the RNA polymerase active site.
17 base-pairs of DNA are unwound, forming a "Transcription Bubble".
RNA polymerase now starts to synthesize the RNA transcript.
RNA polymerase has two binding sites for ribonucleoside triphosphates.
The FIRST is used during elongation and binds all 4 common
ribonucleoside triphosphates with a half saturating concentration of
10 µM. The SECOND, used only during initiation, binds ATP and GTP
preferentially at 100 µM. Thus, most RNA molecules have a purine at
their 5' end. The binding of a purine at this site is a critical
difference between DNA and RNA polymerases. The binding to an
initiating nucleotide allows the RNA polymerase complex to begin
chain synthesis without a primer.
Chain growth begins with binding of the template-specified rNTP at the
initiation site, followed by binding of the next nucleotide at the
elongation site. Next, nucleophilic attack by the 3' hydroxyl of the
first nucleotide on the α (inner) phosphorus of the second nucleotide
generates the first phosphodiester bond and leaves an intact
triphosphate at the 5' position of the first nucleotide.
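The difference between the two nucleotide-binding sites can be illustrated with a simple saturation curve. This is a sketch under stated assumptions: it treats each site as a single hyperbolic binding site, and it reads the 100 µM figure for the initiation site as a half-saturating concentration, which the notes do not say explicitly. The function name is made up.

```python
# Half-saturating concentrations (µM); the initiation value is our reading
# of the "100 µM" figure in the notes, which is an assumption.
ELONGATION_HALF_SAT_UM = 10.0
INITIATION_HALF_SAT_UM = 100.0


def fractional_saturation(conc_um: float, half_sat_um: float) -> float:
    """Fraction of sites occupied, assuming simple hyperbolic binding."""
    return conc_um / (half_sat_um + conc_um)


for conc in (10.0, 100.0, 1000.0):
    elongation = fractional_saturation(conc, ELONGATION_HALF_SAT_UM)
    initiation = fractional_saturation(conc, INITIATION_HALF_SAT_UM)
    print(f"{conc:7.0f} µM  elongation {elongation:.2f}  initiation {initiation:.2f}")
```

Under this model the elongation site is half full at 10 µM nucleotide, while the initiation site needs ten times more nucleotide to reach the same occupancy.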
RNA polymerase moves in 5' to 3' direction (relative to the coding
strand) and continues synthesizing RNA off the DNA template
strand.
"Transcription Bubble" moves down the DNA helix in concert with
the new synthesis.
Within the "Bubble" only 12 nucleotides of the DNA template
strand are base-paired with the RNA strand at any time. This is
called the "RNA:DNA hybrid".
As each new ribonucleotide is incorporated, one base-pair of the
RNA:DNA hybrid at the other end of the transcription bubble has to
dissociate.
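The moving RNA:DNA hybrid described above behaves like a sliding window over the transcript. This is a toy sketch, assuming the hybrid is exactly 12 nucleotides once the transcript is long enough and simply equals the whole transcript before that; the function name is hypothetical.

```python
HYBRID_LENGTH = 12  # nucleotides of RNA paired with the template, per the notes


def hybrid_window(transcript_length: int) -> tuple[int, int]:
    """Return the (first, last) 1-based transcript positions still paired to DNA.

    Assumes one base-pair at the 5' end of the hybrid is released for every
    nucleotide added at the 3' end once the hybrid has reached full length.
    """
    if transcript_length < 1:
        raise ValueError("transcript must contain at least one nucleotide")
    first = max(1, transcript_length - HYBRID_LENGTH + 1)
    return first, transcript_length


for length in (5, 12, 13, 30):
    print(length, hybrid_window(length))
```

For a 13-nucleotide transcript the window is positions 2 to 13, so adding nucleotide 13 released nucleotide 1 from the template.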
3). Termination of Transcription.
RNA transcripts are not infinitely long. There are two ways in which
termination of transcription is known to occur in prokaryotes.
First, let's talk about pausing:
RNA polymerase can pause during transcription. Pausing
occurs at sequences rich in G/C base-pairs.
This is because it is difficult to disrupt stable G/C base-pairs to
allow formation of the transcription bubble and to release the
RNA:DNA hybrid.
Pausing can last from 10 seconds to 30 minutes.
Two Major Mechanisms of Transcription Termination.
Simple and Rho-dependent.
1). SIMPLE (Rho-independent): These termination sites share two
structural features:
A). Two symmetrical G/C-rich sequences that in the transcript
have the potential to form a stem-loop structure.
B). A downstream run of four to eight A residues.
RNA polymerase pauses at the first G/C rich region; this allows the
second G/C rich region of the RNA transcript to base-pair with the
first region, forming an RNA:RNA stem-loop duplex and eliminating
some of the base-pairing between the DNA template and the RNA
transcript. Further weakening, leading to dissociation, occurs when
the A-rich region is transcribed to give a series of very weak A-U
bonds.
2). Rho Mediated: Factor-dependent termination is rarer.
The Rho protein is necessary for termination at these genes.
3 Steps:
1). RNA Polymerase complex pauses.
2). Rho protein recognizes and binds to a specific RNA sequence
in the nascent RNA transcript.
3). The Rho protein terminates transcription in an ATP dependent
process by migrating toward the 3' end of the RNA transcript,
displacing the RNA polymerase and disrupting the RNA:DNA
hybrid.
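The two structural features of a simple (Rho-independent) terminator can be turned into a naive pattern search on a transcript. This is a minimal sketch, not a real terminator predictor: the exact-match stem, the stem length, the loop range, the G/C threshold and the example sequences are all assumptions chosen for illustration.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Sequence that would base-pair with `rna` in an antiparallel stem."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))


def gc_fraction(seq: str) -> float:
    return sum(base in "GC" for base in seq) / len(seq)


def find_simple_terminators(rna, stem_len=6, loop_range=(3, 8),
                            min_gc=0.6, min_u_run=4):
    """Return (stem_start, loop_length) for each G/C-rich stem-loop + U-run.

    Looks for: a G/C-rich segment, a short loop, the exact reverse
    complement of the first segment, then a run of U residues.
    """
    rna = rna.upper()
    hits = []
    last_start = len(rna) - 2 * stem_len - loop_range[0]
    for start in range(last_start + 1):
        left = rna[start:start + stem_len]
        if gc_fraction(left) < min_gc:
            continue
        for loop in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem_len + loop
            right = rna[right_start:right_start + stem_len]
            if len(right) < stem_len:
                break  # ran off the end of the transcript
            if right != reverse_complement(left):
                continue
            tail_start = right_start + stem_len
            if rna[tail_start:tail_start + min_u_run] == "U" * min_u_run:
                hits.append((start, loop))
    return hits


# Invented transcripts: the first has a stem-loop followed by a U-run.
with_terminator = "AUGGCCGCCGAAAGGCGGCUUUUUUAC"
without_u_run = "AUGGCCGCCGAAAGGCGGCACGACG"
hits = find_simple_terminators(with_terminator)
no_hits = find_simple_terminators(without_u_run)
```

In the first transcript the search reports one candidate, a stem starting at index 3 with a 4-nucleotide loop; the second has the same stem-loop but no U-run, so nothing is reported. Rho-dependent sites have no such simple signature in the notes, so they are not modeled here.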
Differences in RNA transcription between eukaryotes and
prokaryotes:
1). There is only one RNA polymerase in E. coli. There are three
RNA polymerases in eukaryotes.
2). In eukaryotes, most promoters direct transcription of only one
gene. In bacteria, several genes are often transcribed from a
single promoter. As we will discuss, this type of transcriptional unit
is called an "Operon".
             Gene A            Gene B                          Gene C
5'----------[--------]-----[---------------]----------[------------------------]----------3'
3'----------[--------]-----[---------------]----------[------------------------]----------5'
3). Eukaryotic RNA polymerases require additional protein factors
(Transcription Factors) to bind to a promoter and initiate
transcription. We will discuss these proteins when we discuss
eukaryotic gene expression.
4). Eukaryotic RNA polymerases must pass through nucleosomes
(that are found on all chromatin) during transcription.
5). Eukaryotic RNA polymerases do not have terminator signals;
rather, they proceed well past the coding region and into the 3'
noncoding region of genes. The action of additional enzymes
processes the RNA molecule extensively in a series of reactions
that we will discuss (capping, splicing, editing).