Chapter 29
Transcription and the Regulation of
Gene Expression
Reginald H. Garrett & Charles M. Grisham • University of Virginia
www.cengage.com/chemistry/garrett
Essential Question
• How are the genes of prokaryotes and
eukaryotes transcribed to form RNA products
that can be translated into proteins?
All Cells Contain Three Major Classes of
RNA – mRNA, rRNA, and tRNA
• All three forms participate in protein synthesis
• All RNAs are synthesized from DNA templates by
DNA-dependent RNA polymerases
• This process is called transcription
• Only mRNAs direct the synthesis of proteins
• Transcription is tightly regulated in all cells
• Only 3% of genes in a typical eukaryotic cell are
undergoing transcription at any given moment
• The metabolic conditions and growth status of the
cell dictate which gene products are needed at
any moment
29.1 How Are Genes Transcribed in Prokaryotes?
• In prokaryotes, virtually all RNA is synthesized by a
single species of DNA-dependent RNA polymerase
• RNA polymerases link NTPs (ATP, GTP, CTP, and
UTP) in the order specified by base pairing with a
DNA template
• The polymerase moves along the DNA strand in the
3'-5' direction
• Thus, the RNA chain grows 5'-3' during transcription
• Subsequent hydrolysis of PPi to inorganic phosphate
by pyrophosphatases makes the polymerase
reaction thermodynamically favorable
Sigma Subunits of Prokaryotic RNA Polymerases
Identify Transcription Start Sites
• Transcription is initiated in prokaryotes by RNA
polymerase holoenzyme, with the subunit
composition α2ββ'σ
• The core polymerase is α2ββ' (see Figure 29.1)
• Binding of the σ subunit allows the polymerase to
recognize different DNA sequences that act as
promoters
• Promoters are nucleotide sequences that identify the
location of transcription start sites, where
transcription begins
• Without σ bound, the core polymerase can
transcribe DNA into RNA, but cannot initiate
transcription
Conventions Used in Expressing the Sequences of
Nucleic Acids and Proteins
• Certain conventions are used in describing
information transfer from DNA to protein:
• The strand of duplex DNA that is read by RNA
polymerase is termed the template strand
• The strand not read is the nontemplate strand
• The template is read by the RNA polymerase moving
3'-5' along the template, so the RNA product, the
transcript, grows in the 5'-3' direction
• By convention, when the order of nucleotides in DNA
is shown as a single strand, it is the 5'-3' sequence
of nucleotides in the nontemplate strand that is
shown
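The strand conventions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the six-base example sequence are hypothetical, and the template is assumed to be written 3'-5' so that the output reads 5'-3'.

```python
# Base-pairing rules: DNA template base -> RNA base, and DNA -> DNA complement.
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5):
    """Return the transcript (5'-3') for a template strand written 3'-5'."""
    return "".join(RNA_PAIR[base] for base in template_3to5.upper())

def nontemplate(template_3to5):
    """Return the nontemplate strand (5'-3') for a template written 3'-5'."""
    return "".join(DNA_PAIR[base] for base in template_3to5.upper())

template = "TACGGT"              # hypothetical template, written 3'-5'
rna = transcribe(template)       # transcript, 5'-3'
shown = nontemplate(template)    # the strand shown by convention, 5'-3'
print(rna, shown)
```

The transcript has the same sequence as the nontemplate strand with U in place of T, which is why the nontemplate strand is the one conventionally shown.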
Structure of the Core RNA Polymerase from
Thermus thermophilus
The template DNA strand is green, the nontemplate strand is
blue, and the RNA transcript is hot pink. The 2 α chains are
orange, the β chain is cyan, and the β' chain is yellow.
Bacteriophage T7 Expresses a Simpler,
Monomeric RNA Polymerase
Figure 29.2 Bacteriophage T7 RNA polymerase in the act of
transcription. The DNA is shown entering the enzyme from the
upper right. The template is green, the nontemplate is blue,
and the RNA transcript is hot pink.
The Process of Transcription Has Four Stages
• Transcription can be divided into four stages:
1) Binding of RNA polymerase holoenzyme to
template DNA at promoter sites
2) Initiation of polymerization
3) Chain elongation
4) Chain termination
Binding of Polymerase to Template DNA
Figure 29.3 Sequence of events in the initiation and elongation
phases of transcription as it occurs in prokaryotes.
Nucleotides in this region are numbered with reference to the
base at the transcription start site, which is designated +1.
• Polymerase binds nonspecifically to DNA with low
affinity and migrates along it, searching for a promoter
• Sigma subunit recognizes promoter sequence
• RNA polymerase holoenzyme and promoter form a
closed promoter complex (in which the DNA is
not unwound)
• Polymerase then unwinds about 12 base pairs to form
the "open promoter complex"
• RNA polymerase binding protects a nucleotide
sequence spanning the region from -70 to +20,
where +1 is defined as the transcription start site
Properties of Prokaryotic Promoters
• Promoters recognized by the σ factor typically
consist of a 40 bp region on the 5'-side of the
transcription start site
• Within the promoter are two consensus
sequence elements:
• The Pribnow box near -10, with consensus
TATAAT - this region is ideal for unwinding because it is
rich in As and Ts, which form only two H bonds per base pair
• The -35 region, with consensus TTGACA - σ
subunit binds here. The more the -35 region
sequence corresponds to the consensus
sequence, the better the σ subunit binds, and
the greater is the efficiency of gene transcription
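To make the idea of "corresponding to the consensus" concrete, the sketch below counts how many positions of a candidate site match the -35 or -10 consensus, and counts the Watson-Crick H bonds in a sequence. The function names and the test sites are hypothetical, and a simple match count is only a crude stand-in for real promoter strength.

```python
MINUS_35 = "TTGACA"   # -35 region consensus (from the slide)
PRIBNOW = "TATAAT"    # -10 Pribnow box consensus (from the slide)

def consensus_matches(site, consensus):
    """Count positions at which a candidate site matches the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a == b for a, b in zip(site.upper(), consensus))

def h_bonds(seq):
    """Watson-Crick H bonds if seq is base-paired: 2 per A:T, 3 per G:C."""
    return sum(2 if base in "AT" else 3 for base in seq.upper())

print(consensus_matches("TTTACA", MINUS_35))  # one mismatch from consensus
print(h_bonds(PRIBNOW), h_bonds("GCGCGC"))    # A/T-rich vs G/C-rich hexamer
```

The A/T-only Pribnow box has fewer H bonds than a G/C hexamer of the same length, consistent with it being the easier region to unwind.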
The Nucleotide Sequences of Representative E.
coli Promoters
Figure 29.4 Consensus sequences for the -35 region, the
Pribnow box, and the initiation site are shown at the bottom. The
numbers represent the percent occurrence of the indicated base.
In this figure, sequences are aligned relative to the Pribnow box.
DNA Footprinting: Identifying the Nucleotide
Sequence in DNA Where a Protein Binds
• DNA footprinting is a widely used technique to
identify the nucleotide sequence within DNA where a
specific protein binds (such as a promoter sequence
bound to RNA polymerase holoenzyme)
• The protein is incubated with a labeled DNA
fragment containing the sequence where the protein
is thought to bind
• Digestion with DNase cleaves the DNA backbone in
exposed regions, but not where the DNA-binding
protein is bound
• Analysis of the DNase digests reveals the location of
the protein-binding site on the DNA
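The analysis step can be sketched as a comparison of two lanes: cleavage positions seen with free DNA but missing when the protein is bound mark the footprint. The data here are invented for illustration (a 40-position fragment with positions 15-28 protected), and the function name is hypothetical.

```python
def protected_region(free_lane, bound_lane):
    """Return (first, last) cleavage positions missing from the bound lane."""
    missing = sorted(set(free_lane) - set(bound_lane))
    if not missing:
        return None          # no footprint: protein did not protect the DNA
    return missing[0], missing[-1]

# Hypothetical data: DNase cuts at every position of free DNA, but positions
# 15 through 28 are shielded when the protein is bound.
free = list(range(1, 41))
bound = [n for n in free if not 15 <= n <= 28]

print(protected_region(free, bound))
```

A real gel would be noisier, so this assumes ideal cleavage at every exposed position.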
Initiation of Polymerization
• RNA polymerase has two binding sites for NTPs
• The initiation site prefers to bind ATP and GTP
(most RNAs begin with a purine at 5'-end)
• The elongation site binds the second incoming NTP
• 3'-OH of the first nucleotide bound attacks α-P of
the second to form a new phosphoester bond
(eliminating PPi)
• When a 6-10 unit oligonucleotide has been made, the
σ subunit dissociates, signaling the completion of
"initiation"
Chain Elongation
• The core polymerase (without σ) is the
elongation enzyme
• RNA polymerase is accurate - only about 1 error in
10,000 bases
• Even this error rate is tolerable, since many transcripts
are made from each gene
• Elongation rate is 20-50 bases per second - slower
in G/C-rich regions and faster elsewhere, because
G:C base pairs share 3 H bonds, whereas A:T
base pairs, with 2 H bonds, are less stable
• Topoisomerases precede and follow polymerase to
relieve supercoiling
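As a rough worked example of the figures quoted above (about 1 error in 10,000 bases, 20-50 bases per second), the sketch below estimates the expected number of errors and the time needed for one transcript. The 1,000-nucleotide length and the function names are illustrative assumptions, not values from the slides.

```python
ERRORS_PER_BASE_DENOM = 10_000   # about 1 error per 10,000 bases (from the slide)
RATE_SLOW, RATE_FAST = 20, 50    # elongation rate, bases per second (from the slide)

def expected_errors(length_nt):
    """Expected number of misincorporated bases in one transcript."""
    return length_nt / ERRORS_PER_BASE_DENOM

def elongation_time_range(length_nt):
    """(fastest, slowest) time in seconds to elongate one transcript."""
    return length_nt / RATE_FAST, length_nt / RATE_SLOW

length = 1_000                   # hypothetical transcript length
print(expected_errors(length))
print(elongation_time_range(length))
```

Under these assumptions most copies of a 1,000-nucleotide transcript carry no error at all, which is why the error rate is acceptable when many transcripts are made per gene.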
Supercoiling Versus Transcription
(a) If the RNA polymerase followed the template strand around the
axis of the DNA duplex, no supercoiling of the DNA would occur
but the RNA chain would be wrapped around the double helix
once every 10 bp. This possibility seems unlikely because it
would be difficult to untangle the transcript from the DNA duplex.
(b) Instead, gyrases and topoisomerases act to remove the torsional
stresses induced by transcription.
Chain Termination
• Two types of transcription termination
mechanisms operate in bacteria: One depends
on Rho termination factor
• Rho is an ATP-dependent helicase
• It moves along the RNA transcript, finds the
"transcription bubble", unwinds the DNA:RNA
hybrid, and releases the RNA chain
• It is likely that the RNA polymerase stalls in a
G:C-rich termination region, allowing rho factor
to overtake it
Figure 29.7 Transcription termination by rho factor.
Intrinsic Termination
• The second termination mechanism is termed
intrinsic termination
• Here termination is determined by specific
sequences in the DNA – called termination sites
• Termination sites consist of 3 structural features:
• Inverted repeats, rich in G:C, which form a
stable stem-loop structure in RNA transcript
• A nonrepeating segment that punctuates the
inverted repeats
• A run of 6-8 As in the DNA template, coding for
Us in the transcript
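The three features listed above can be turned into a simple search over a transcript: a G:C-rich stem, a short nonrepeating loop, the reverse complement of the stem, then a run of Us. This is a minimal sketch under stated assumptions; the stem length, loop range, G:C threshold, and test sequence are all hypothetical choices, and real terminator prediction is considerably more involved.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def revcomp_rna(seq):
    """Reverse complement of an RNA sequence."""
    return "".join(RNA_PAIR[base] for base in reversed(seq))

def find_terminator(rna, stem=6, min_loop=3, max_loop=8, min_us=6):
    """Return (stem_start, loop_length) of the first hairpin + U-run, or None."""
    rna = rna.upper()
    for i in range(len(rna)):
        left = rna[i:i + stem]
        if len(left) < stem:
            break
        # Require a G:C-rich stem (at least half G or C; assumed threshold).
        if sum(base in "GC" for base in left) < stem / 2:
            continue
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + min_us]
            if right == revcomp_rna(left) and tail == "U" * min_us:
                return i, loop
    return None

# Hypothetical transcript: stem GCCGCC, loop UUCG, stem GGCGGC, then 8 Us.
example = "CC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUUUU"
print(find_terminator(example))
```

The default of six Us mirrors the run of 6-8 As in the DNA template described above.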
Figure 29.6
The intrinsic termination site for the E. coli trp operon. The
inverted repeats give rise to a stem-loop, or “hairpin,” structure
ending in a series of U residues.
29.2 – How Is Transcription Regulated in
Prokaryotes?
• Genes for enzymes for pathways are grouped in
clusters on the chromosome - called operons
• This allows coordinated expression through
transcription into a single polycistronic mRNA
• Regulatory sequences adjacent to such a unit
determine whether it is transcribed – these
regulatory sequences are the promoter and the
operator
• Regulatory proteins work with operators to
control transcription of the genes
The General Organization of Operons
Figure 29.8 Operons consist of transcriptional control regions
and a set of related structural genes, all organized in a
contiguous linear array along the chromosome. The
transcriptional control regions are the promoter and the operator,
which lie next to, or overlap, each other, upstream from the
structural genes they control. Operators may lie at various
positions relative to the promoter, either upstream or
downstream. Expression of the operon is determined by access
of RNA polymerase to the promoter, and occupancy of the
operator by regulatory proteins influences this access. Induction
activates transcription from the promoter; repression prevents it.
Lactose is an Inducer of the lac Operon
Figure 29.9 The structure of lactose, a β-galactoside.
Metabolism of lactose depends on hydrolysis into its
component sugars, glucose and galactose, by the enzyme
β-galactosidase. Lactose availability induces the synthesis
of this enzyme by activating transcription of the lac operon.
Transcription of Operons is Controlled by Induction
and Repression
• Increased synthesis of enzymes in response to
the presence of a metabolite is induction
• Decreased synthesis in response to a metabolite
is repression
• Some substrates induce enzyme synthesis even
though the enzymes can’t metabolize the
substrate - these are gratuitous inducers - such
as IPTG (isopropyl β-thiogalactoside)
IPTG is a Gratuitous Inducer
Figure 29.10 The structure of IPTG (isopropyl β-thiogalactoside), a gratuitous inducer.
The lac Operon Serves as a Paradigm of
Operons
• lacI mutants express the genes needed for
lactose metabolism
• The structural genes of the lac operon are
controlled by negative regulation
• lacI gene product is the lac repressor, a
tetrameric protein
• The lac operator is a palindromic DNA segment
• lac repressor has a DNA-binding domain at its N-terminus;
the C-terminus binds inducer and forms the tetramer
The Mode of Action of lac Repressor
Figure 29.12 The structure of the lac repressor tetramer with
bound IPTG (purple) is shown.
The lac Operon
Figure 29.11 The operon consists of two transcription units.
In one unit, there are three structural genes, lacZ, lacY, and
lacA, under control of the promoter, plac, and the operator O.
In the other unit, there is a regulator gene, lacI, with its own
promoter, placI.
The Nucleotide Sequence of the lac Operator
Figure 29.13 This sequence comprises 36 bp showing nearly
palindromic symmetry. The inverted repeats that constitute this
approximate twofold symmetry are shaded in rose. The bases
are numbered relative to the +1 start site for transcription. The
G:C base pair at position +11 represents the axis of symmetry. In
vitro studies show that bound lac repressor protects a 26-bp
region from -5 to +21 against nuclease digestion. Bases that
interact with bound lac repressor are indicated below the operator.
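"Nearly palindromic" symmetry can be quantified by comparing a sequence with its own reverse complement. The sketch below is only an illustration: the function names are hypothetical and the short test sequences are invented, not the lac operator sequence from the figure.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return "".join(DNA_PAIR[base] for base in reversed(seq.upper()))

def dyad_symmetry(seq):
    """Fraction of positions that equal the reverse complement (1.0 = perfect)."""
    rc = reverse_complement(seq)
    return sum(a == b for a, b in zip(seq.upper(), rc)) / len(seq)

print(dyad_symmetry("GAATTC"))   # perfect palindrome
print(dyad_symmetry("GAATTA"))   # one end disrupted: approximate symmetry
```

A score below 1.0 but well above chance corresponds to the approximate twofold symmetry described in the caption.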
Lac Repressor Is a Negative Regulator of the lac
Operon
Catabolite Activator Protein Provides Positive
Control of the lac Operon
• Some promoters require an accessory protein to
speed transcription
• Catabolite activator protein or CAP is one such
protein
• CAP is a dimer of 22.5-kD peptides
• N-terminus binds cAMP; C-terminus binds DNA
• Binding of CAP-(cAMP)2 to DNA assists
formation of closed promoter complex
• Catabolite repression ensures that the operons
necessary for metabolism of alternative energy
sources (the lac and gal operons) remain
repressed until the supply of glucose is
exhausted.
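The combined effect of lac repressor (negative control) and CAP (positive control) can be summarized as a small truth table. This is a deliberately simplified Boolean sketch: the function name and the "off", "low", and "high" labels are assumptions for illustration, and real expression levels are graded rather than discrete.

```python
def lac_operon_state(lactose_present, glucose_present):
    """Simplified expression state of the lac operon."""
    repressor_bound = not lactose_present   # inducer releases the repressor
    cap_bound = not glucose_present         # glucose lowers cAMP, so no CAP-(cAMP)2
    if repressor_bound:
        return "off"
    return "high" if cap_bound else "low"

for lactose in (False, True):
    for glucose in (False, True):
        print(lactose, glucose, lac_operon_state(lactose, glucose))
```

Only the combination of lactose present and glucose exhausted gives full expression, which is the outcome catabolite repression is meant to ensure.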
The Mechanism of Catabolite Repression
and CAP Action
Binding of CAP-(cAMP)2 induces a severe
bend in DNA
Figure 29.14 The mechanism of catabolite repression and CAP
action. Glucose instigates catabolite repression by lowering
cAMP levels. cAMP is necessary for CAP binding near promoters
of operons whose gene products are involved in the metabolism
of alternative energy sources such as lactose, galactose, and
arabinose. The binding sites for the CAP-(cAMP)2 complex are
consensus DNA sequences containing the conserved pentamer
TGTGA and a less well-conserved inverted repeat, TCANA (where
N is any nucleotide).
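The two sequence elements named in the caption can be located with regular expressions, with N treated as any of the four bases. The test sequence below is invented for illustration and is not a real CAP site; the function name is likewise hypothetical.

```python
import re

PENTAMER = re.compile(r"TGTGA")              # conserved pentamer
INVERTED = re.compile(r"TCA[ACGT]A")         # TCANA, N = any nucleotide

def cap_site_elements(dna):
    """Return start indices (0-based) of TGTGA and TCANA matches in dna."""
    dna = dna.upper()
    pentamers = [m.start() for m in PENTAMER.finditer(dna)]
    inverted = [m.start() for m in INVERTED.finditer(dna)]
    return pentamers, inverted

# Hypothetical sequence containing TGTGA and its reverse complement TCACA.
print(cap_site_elements("AATGTGACCGGTCACATT"))
```

TCACA is the exact reverse complement of TGTGA, so a site with N = C has perfect twofold symmetry, which fits a dimeric protein binding it.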
Figure 29.15 The CAP
dimer with two molecules
of cAMP bound interacts
with 27 to 30 base pairs
of duplex DNA.
Negative and Positive Control Systems are
Fundamentally Different
• Negative and positive control systems operate in
fundamentally different ways
• Genes under negative control are transcribed unless
they are turned off by the presence of a repressor
protein
• Often, transcription activation is merely the release
from negative control
• In contrast, genes under positive control are
expressed only if an active regulator protein is
present
Figure 29.16 Control circuits governing the expression of genes.
The araBAD Operon is Both Positively and
Negatively Controlled by AraC
• E. coli can use the plant pentose L-arabinose as sole
source of carbon and energy
• Arabinose is metabolized by three enzymes encoded
in the araBAD operon
• Transcription of this operon is regulated by both
catabolite repression and arabinose-mediated
induction
• araBAD is regulated both positively and negatively
by AraC
• Positive control of araBAD occurs in the presence of
L-arabinose and cAMP
Regulation of the araBAD operon by the
combined action of CAP and AraC Protein
Figure 29.17
The trp Operon is Regulated Through a
Co-Repressor-Mediated Negative Control Circuit
• The trp operon encodes a leader sequence and 5
proteins (trpE through trpA) that synthesize
tryptophan
• Trp repressor controls the operon
• Trp repressor binding excludes RNA polymerase
from the promoter
• Trp repressor also regulates trpR and aroH
operons and is itself encoded by the trpR operon.
This is autogenous regulation (autoregulation).
Figure 29.18 The trp operon of E.coli.
Attenuation is a Prokaryotic Mechanism for
Post-Transcriptional Regulation of Expression
• In addition to repression, expression of the trp
operon is controlled by transcription attenuation
• Unlike the mechanisms discussed thus far,
attenuation regulates transcription after it has begun
• Attenuation is any regulatory mechanism that
manipulates transcription termination or transcription
pausing to regulate gene transcription downstream
• In prokaryotes, transcription and translation are
coupled, and the translating ribosome is affected by
the formation and persistence of secondary structure
in the mRNA
• In many operons encoding enzymes of amino acid
biosynthesis, a transcribed leader region lies
between the promoter and the first major structural
gene
• These regions encode a short leader peptide
containing multiple codons for the pertinent amino
acid
• For example, the leader peptide of the leu operon
has four leucine codons, the trp operon has two
tandem tryptophan codons, and so on (Fig. 29.19)
• Translation of these codons depends on availability
of the amino acid
Sequences of Leader Peptides in Various Amino
Acid Biosynthetic Operons
Figure 29.19 Amino acid sequences of leader peptides in
various amino acid biosynthetic operons regulated by
attenuation. Color indicates amino acids synthesized in the
pathway catalyzed by the operon’s gene products. (The ilv
operon encodes enzymes of isoleucine, leucine, and valine
biosynthesis.)
Attenuation is a Prokaryotic Mechanism for Post-Transcriptional Regulation of Expression
• When tryptophan is scarce, the entire trp operon is
transcribed to give a polycistronic mRNA
• But as [Trp] increases, more and more of the trp
transcripts consist of only a 140-nucleotide fragment
corresponding to the 5'-end of trpL
• Tryptophan availability is causing premature
termination of trp transcripts
• This is transcription attenuation
• The secondary structure of the 160 bp leader region
transcript is the principal control element in
transcription attenuation (Figure 29.20)
The Secondary Structure of the Leader Transcript is
the Control Element in Transcription Attenuation
Figure 29.20 Alternative secondary structures for the leader
region of the trp operon transcript.
Figure 29.21 The mechanism of
attenuation in the trp operon.
DNA: Protein & Protein: Protein Interactions are
Essential to Transcription Regulation
DNA Looping Allows Multiple DNA-Binding
Proteins to Interact With One Another
• DNA: protein interactions are a central feature in
transcriptional control
• The DNA sites where regulatory proteins bind
commonly display at least partial dyad symmetry or
inverted repeats
• DNA-binding proteins themselves are generally
even-numbered oligomers (dimers, tetramers, etc.)
that have innate twofold rotational symmetry
• Protein: protein interactions are an essential
component of transcriptional activation
• Proteins that activate transcription work through
protein: protein contacts with RNA polymerase
• Because transcription must respond to a variety of
regulatory signals, multiple proteins are essential for
appropriate regulation of gene expression
• These regulatory proteins are the sensors of cellular
circumstances
• They communicate this information to the genome by
binding at specific nucleotide sequences
• But DNA is a one-dimensional polymer, with limited
space for proteins to bind
• DNA looping permits additional proteins to convene
at the initiation site and to exert their influence on
creating and activating the initiation complex
Figure 29.22 Formation of a DNA loop delivers DNA-bound
transcriptional activator to RNA polymerase positioned at the
promoter. Protein: protein interactions between the
transcriptional activator and RNA polymerase activate
transcription.
29.3 How Are Genes Transcribed in Eukaryotes?
• Three classes of RNA polymerases (I, II and III)
transcribe rRNA, mRNA and tRNA genes,
respectively
• Pol III transcribes a few other RNAs as well
• All 3 are big, multimeric proteins (500-700 kD)
• All have 2 large subunits with sequences similar to
β and β' in E.coli RNA polymerase, so the
catalytic site is evolutionarily conserved
• Pol II is most sensitive to α-amanitin, an
octapeptide from Amanita phalloides (the "death cap"
mushroom)
• Pol III is less so, and Pol I is insensitive
Sensitivity to α-Amanitin Distinguishes the Three
Classes of RNA Polymerase
Figure 29.23 The structure of α-amanitin, one of a series of
toxic compounds known as amatoxins that are found in the
mushroom Amanita phalloides.
• With three categories of polymerases acting on three
sets of genes, there are also at least three
categories of promoters that maintain specificity
• Eukaryotic promoters are very different from
prokaryotic promoters
• All three eukaryotic RNA polymerases interact with
their promoters via transcription factors
• Transcription factors are DNA-binding proteins that
recognize and accurately initiate transcription at
specific promoter sequences
RNA Polymerase II Transcribes Protein-Coding
Genes
• RNA Pol II must be capable of transcribing a great
diversity of genes, but must also function at any
moment only on the genes whose products are
appropriate to the needs of the cell
• The RNA Pol II enzymes from yeast and humans are
homologous
• The structure of RNA Pol II from yeast is known
(Figure 29.24) and consists of 12 polypeptides
• RNA polymerases adopt a claw-like structure, to
grasp the DNA duplex
Figure 29.24 Structure of RNA
Pol II. Template DNA is green,
nontemplate DNA is blue, RNA
transcript is pink, emerging from
the bottom of the structure.
• Yeast Pol II consists of 12 different peptides (RPB1-RPB12)
• RPB1 and RPB2 are homologous to E. coli RNA
polymerase β′ and β
• RPB1 has a DNA-binding site; RPB2 binds NTP
• RPB1 has a C-terminal domain (CTD) consisting of
multiple YSPTSPS repeats
• 5 of 7 residues in the heptad repeat have an –OH
group, making each both a hydrophilic and a
phosphorylatable site
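The "5 of 7" count can be checked directly from the heptad sequence, since Ser, Thr, and Tyr are the residues with side-chain hydroxyl groups. In this sketch the function names are hypothetical, and the repeat number passed to potential_sites is an illustrative assumption, because the slides do not state how many repeats the CTD contains.

```python
HEPTAD = "YSPTSPS"                    # CTD repeat (from the slide)
HYDROXYL_RESIDUES = {"S", "T", "Y"}   # side chains carrying an -OH group

def hydroxyl_positions(repeat=HEPTAD):
    """1-based positions in the repeat that carry a hydroxyl side chain."""
    return [i + 1 for i, aa in enumerate(repeat) if aa in HYDROXYL_RESIDUES]

def potential_sites(n_repeats):
    """Potential phosphorylation sites in a CTD with n_repeats heptads."""
    return n_repeats * len(hydroxyl_positions())

print(hydroxyl_positions())   # positions of Tyr, Ser, Thr, Ser, Ser
print(potential_sites(26))    # 26 repeats is an assumed, illustrative value
```

Only the two prolines lack a hydroxyl group, so nearly the whole tail is available for phosphorylation.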
• CTD of RPB1 is essential; this domain projects
away from the globular portion of the enzyme
• Only RNA Pol II whose CTD is NOT
phosphorylated can initiate transcription
• The TATA box (TATAAA) is a consensus promoter element
• Several general transcription factors are required
• See TBP bound to TATA (Fig. 29.28)
The Regulation of Gene Expression is More
Complex in Eukaryotes
The Site of Transcription Initiation Includes an
Initiator (Inr) and a TATA Box
• Pol II promoters consist of two separate sequence
features:
• the core element near the start site, where
general transcription factors bind, and
• More distantly located regulatory elements
(known as enhancers and silencers)
• Promoters encoding proteins typically contain
modules of short conserved sequences
• In addition to promoters, eukaryotic genes have
enhancers, also known as upstream activation
sequences, which may lie far from the promoter
• DNA looping allows multiple proteins bound to
different DNA sequences to convene
Figure 29.25 The Inr and TATA box in selected eukaryotic genes.
The consensus sequence of a number of such promoters is
presented in the lower part of the figure, the numbers giving the
percent occurrence of various bases at the positions indicated.
Promoter Regions of Several Representative
Eukaryotic Genes
Figure 29.26
Response Elements are Promoter Modules
Responsive to Common Regulation
• Promoter modules in genes responsive to common
regulation are termed response elements
• Examples include
• the heat shock element (HSE)
• the glucocorticoid response element (GRE), and
• the metal response element (MRE)
• Many genes are subject to multiple regulatory
influences
• Regulation of such genes is achieved through the
presence of an array of different regulatory elements
• The metallothionein gene is a good example (Figure
29.27)
Figure 29.27 The metallothionein gene possesses several
constitutive elements in its promoter (the TATA and GC boxes)
as well as specific response elements such as MREs and a GRE.
The BLEs are elements involved in basal level expression
(constitutive expression). TRE is a tumor response element
activated in the presence of tumor-promoting phorbol esters such
as TPA (tetradecanoyl phorbol acetate).
Transcription Initiation by RNA Polymerase II
Requires TBP and the GTFs
• The eukaryotic transcription initiation complex
consists of:
• RNA polymerase II
• Five general transcription factors (GTFs)
• A complex called Mediator (Srb/Med)
• The CTD of Pol II anchors Mediator
• Mediator allows Pol II to communicate with
transcriptional activators bound at sites distant from
the promoter
Figure 29.28 Transcription initiation. (a)
Model of the TATA-binding protein (TBP,
gold) in complex with a DNA TATA
sequence.
Figure 29.28b Transcription Initiation
(b) RNA pol II: Mediator: TFIIF complex.
The Role of Mediator in Transcription Activation
and Repression
• Transcription activation requires Mediator
• Mediator is a bridge between gene-specific
transcription co-activators bound to enhancers and
the RNA polymerase II/GTF transcription machinery
bound at the promoter
• Once DNA is accessible (through chromatin
remodeling), a transcription co-activator binds to an
enhancer and recruits Mediator to the gene
• Mediator promotes the binding of GTFs and RNA
polymerase II at the promoter
• Mediator is about 1 million daltons in mass, with a core
composed of about 20 distinct subunits in yeast and
30 subunits in humans
The Role of Mediator
Figure 29.29 Simple models of
Mediator in the regulation of
eukaryotic gene transcription.
(a) Mediator as a transcription
activator. Mediator regions
are highlighted in color: green
for the tail, yellow for the
middle, and red for the head.
RNA polymerase II and the
GTFs are blue. The
transcription co-activator is
orange. DNA is shown as a
black line.
(b) Mediator as a repressor.
Chromatin-Remodeling Complexes Alleviate
Repression Due to Nucleosomes
• Chromatin-remodeling complexes are enormous (MW
= 1 megadalton)
• These assemblies serve to loosen the DNA:protein
interactions in nucleosomes by sliding, ejecting,
inserting, or otherwise restructuring core octamers
• Two sets of factors are important:
• Chromatin-remodeling complexes that mediate ATP-dependent
conformational changes in nucleosome structure
• Histone-modifying enzymes that introduce covalent
modifications into the N-terminal tails of the histone
core octamer
• Chromatin remodeling and histone modification are
closely linked processes
Chromatin-Remodeling Complexes Alleviate
Repression Due to Nucleosomes
• The central structural unit of nucleosomes, the
histone “core octamer”, is constructed from the eight
histone-fold protein domains of the eight various
histone monomers comprising the octamer
• Interactions between histone tails contributed by core
histones in adjacent nucleosomes are an important
influence in establishing higher orders of chromatin
organization
• Activation of eukaryotic transcription depends on:
• Relief from repression imposed by chromatin
structure
• Interaction of RNA polymerase II with promoter and
transcription regulatory proteins
Chromatin-Remodeling Complexes Alleviate
Repression Due to Nucleosomes
• Two sets of factors are important to eukaryotic
transcription:
• Chromatin-remodeling complexes that mediate
ATP-dependent conformational changes
• Histone-modifying enzymes that introduce
covalent modifications into the N-terminal tails of the
histone core octamer
• Chromatin remodeling and histone modification are
closely linked processes
• Chromatin-remodeling complexes are nucleic
acid-stimulated multisubunit ATPases
Covalent Modification of Histones
Covalent Modification of Histones Forms
the Basis of the Histone Code
• Chromatin is remodeled through the actions of
enzymes that covalently modify side chains on
histones within the core octamer
• Initial events in transcriptional activation include
acetyl-CoA-dependent acetylation of ε-amino groups
on lysine residues in histone tails by histone
acetyltransferases (HATs)
• Phosphorylation of Ser residues and methylation of
Lys residues in histone tails also contribute to
transcription regulation
• Ubiquitination and sumoylation, which attach small
proteins to histone C-terminal Lys residues,
are two other forms of covalent modification
• A code based on histone-tail covalent modifications
determines gene expression through selective
recruitment of proteins
• Proteins that cause chromatin compaction
(heterochromatin formation) lead to repression
• Proteins giving easier access to DNA through
relaxation of histone: DNA interactions favor the
possibility of gene expression
• Prominent forms of histone covalent modification are
lysine acetylation, lysine methylation, serine
phosphorylation, lysine ubiquitination, and lysine
sumoylation
Methylation and Phosphorylation Act as a Binary
Switch in the Histone Code
• As cells enter mitosis, the chromatin becomes
condensed and histone H3 is not only methylated at
K9 but also phosphorylated at the adjacent S10
• S10 phosphorylation triggers dissociation of HP1 from
the heterochromatin
• Thus phosphorylation next to K9 trumps HP1 binding
• Similarly phosphorylation of Thr (T3) neighboring K4 in
the histone H3 tail evicts CHD1 from its site on the
methylated K4.
• Lysine methylation is the “on” position for the binary
switch that recruits proteins to the histone tail and
phosphorylation at a neighboring residue turns the
switch to “off” by ejecting the bound proteins
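The binary switch described above reduces to a two-input rule: a reader protein stays bound only when the lysine is methylated and the neighboring residue is not phosphorylated. The function name and the Boolean framing are assumptions made for illustration; the two examples come from the slide (HP1 at K9/S10 and CHD1 at K4/T3 of histone H3).

```python
def reader_bound(lysine_methylated, neighbor_phosphorylated):
    """True if a methyl-lysine reader protein remains bound to the tail."""
    return lysine_methylated and not neighbor_phosphorylated

# HP1 reads methylated K9; phosphorylation of S10 ejects it.
print("HP1, K9me only:", reader_bound(True, False))
print("HP1, K9me + S10ph:", reader_bound(True, True))

# CHD1 reads methylated K4; phosphorylation of T3 evicts it.
print("CHD1, K4me + T3ph:", reader_bound(True, True))
```

Methylation alone is the "on" position; phosphorylation of the neighbor overrides it without removing the methyl mark.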
Nucleosome Alteration and Interaction of RNA
Polymerase II are Essential
• Gene activation (initiation of transcription) requires two
principal steps:
(1)Alterations in nucleosomes (and thus chromatin) that
relieve the general repressed state imposed by
chromatin structure, followed by
(2)The interaction of RNA polymerase II and the GTFs
with the promoter
• Transcription activators initiate the process by
recruiting chromatin-altering proteins (the
chromatin-remodeling complexes and histone-modifying
enzymes)
• Once these alterations have occurred, promoter DNA is
accessible to TBP:TFIID, other GTFs, and RNA Pol II
Figure 29.30 Diagram of the nucleosome.
• A schematic diagram of the nucleosome illustrates
the various covalent modifications on the N-terminal
tails of histones:
• AcK = acetylated lysine residue
• meK = methylated lysine residue
• meR = methylated arginine residue
• PS = phosphorylated serine residue
• The numbers indicate the positions of the amino
acids in the amino acid sequences. Note the
prevalence of modifiable sites, particularly
acetylatable lysine, on the N-terminal tails of
histones H2B, H3, and H4.
A Model for the Transcriptional Regulation of
Eukaryotic Genes
Figure 29.31 The DNA is a green ribbon wrapped around
disclike nucleosomes. A specific transcription factor (TF, pink) is
bound to a regulatory element (either an enhancer or silencer).
RNA polymerase II and its associated GTF (blue) are bound at
the promoter. The N-terminal tails of histones are shown as
wavy lines (blue) emanating from the nucleosome discs. A
specific transcription factor that is a transcription activator
stimulates transcription through interactions with a co-activator
whose HAT activity renders DNA more accessible.
29.4 How Do Gene Regulatory Proteins Recognize
Specific DNA Sequences?
• Proteins that recognize nucleic acids do so by the
basic rule of macromolecular recognition:
• They present a three-dimensional shape that is
structurally and chemically complementary to the
surface of a DNA sequence
• Protein contacts with the bases of DNA usually occur
within the major groove of the DNA (but not always)
• Protein contacts with DNA involve H bonding and
salt bridges with electronegative oxygen atoms of the
phosphodiester linkages
• 80% of DNA-binding proteins belong to one of three
principal classes based on their structures:
• The helix-turn-helix (HTH) motif
• The zinc-finger (or Zn-finger) motif
• The leucine zipper-basic region (or bZIP)
• Alpha helices fit into the major groove of B-DNA
• α-helix diameter (including side chains) is 1.2 nm
• DNA major groove: 1.2 nm wide × 0.6 to 0.8 nm deep
• The α-helix and B-form DNA are the predominant
structures involved in protein:DNA interactions
Proteins With the Helix-Turn-Helix Motif Use One
Helix to Recognize DNA
• The HTH motif is a protein structural domain
consisting of two successive α-helices separated
by a sharp β-turn (Figure 29.32)
• All contain two α-helices separated by a loop
with a β-turn
• The C-terminal helix (denoted helix 3) fits in the
major groove of DNA; the N-terminal helix (helix
2) locks helix 3 into its DNA interface
• Recognition of DNA sequence involves the sides
of base pairs that face the major groove
Alpha Helices and DNA
A perfect fit
• A recurring feature of DNA-binding proteins is the
presence of α-helical segments that fit directly
into the major groove of B-form DNA
• Diameter of helix is 1.2 nm
• Major groove of DNA is about 1.2 nm wide and
0.6 to 0.8 nm deep
• Proteins can recognize specific sites in DNA
Proteins With the Helix-Turn-Helix Motif Use
One Helix to Recognize DNA
• An HTH motif example: Antp is a member of a family
of eukaryotic proteins involved in the regulation of
early embryonic development that have in common
an amino acid sequence element known as the
homeobox domain
• The homeobox is a DNA motif that encodes a related
60-residue sequence (the homeobox domain) found among
proteins of virtually every eukaryote
• Embedded in the homeobox domain is an HTH motif
• Homeobox domain proteins are sequence-specific
transcription factors
• Other DNA-binding proteins with HTH motifs are lac
repressor, trp repressor, and the CAP C-term domain
Proteins With the Helix-Turn-Helix Motif Use One
Helix to Recognize DNA
Figure 29.32 An HTH motif
protein: Antp monomer bound
to DNA. Helix 3 (yellow) is
locked into the major groove
of the DNA by helix 2
(magenta).
Some Proteins Bind to DNA via Zn-Finger
Motifs
Figure 29.33 The Zn-finger motif of the C2H2 type showing (a)
the coordination of Cys and His residues to Zn and (b) the
secondary structure.
Some Proteins Bind to DNA via Zn-Finger
Motifs
• First discovered in TFIIIA from Xenopus laevis, the
African clawed toad
• Now known to exist in nearly all organisms
• Two main classes: C2H2 and Cx
• C2H2 domains consist of Cys-x2-Cys and His-x3-His
domains separated by at least 7-8 amino acids
• This motif can be repeated as many as 13 times
over the primary structure of a Zn-finger protein
• Cx domains consist of 4, 5 or 6 Cys residues
separated by various numbers of other residues
• The Cx proteins have a variable number of Cys
residues available for Zn chelation
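The C2H2 spacing rule above can be turned into a simple sequence scan. The following is a minimal sketch, not a validated motif search: the regular expression is built only from the slide's description (Cys-x2-Cys, a spacer of at least 7 residues, His-x3-His), the upper spacer bound of 20 is an assumption added to keep matches local, and the function name and test sequence are made up for illustration.

```python
import re

# Hypothetical pattern derived from the slide text:
# Cys-x2-Cys, then a spacer of 7 or more residues, then His-x3-His.
# The upper bound of 20 on the spacer is an assumption, not from the slides.
C2H2_PATTERN = re.compile(r"C.{2}C.{7,20}?H.{3}H")


def find_c2h2_candidates(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping candidate."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(protein.upper())]


# Toy sequence written for illustration only; it is not a real protein.
toy = "MKPYACPECGKSFSQKSNLQKHQRTHTGEK"
print(find_c2h2_candidates(toy))
```

A hit only means the Cys and His spacing is compatible with a C2H2 finger; it says nothing about whether the segment actually folds or binds Zn.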
Some Proteins Bind to DNA via Zn-Finger
Motifs
(c) Structure of a classic C2H2 zinc finger protein with
three zinc fingers bound to DNA.
Some Proteins Bind to DNA via Zn-Finger
Motifs
• Comparison of secondary and tertiary
structures
• C2H2 -type Zn fingers form a folded beta
strand and an alpha helix that fits into the DNA
major groove
• Cx-type Zn fingers consist of two mini-domains,
each with four Cys ligands to Zn followed by
an alpha helix: the first helix is the DNA
recognition helix, the second helix packs against
the first
Model for a Dimeric bZIP Protein
Some DNA-Binding Proteins Use a Basic Region
Leucine Zipper (bZIP) Motif
• First found in C/EBP, a DNA-binding protein in rat
liver nuclei
• Now found in nearly all organisms
• Characteristic features: a 28-residue sequence
with Leu every 7th position and a "basic region"
• (What do you know by now about 7-residue
repeats?)
• This suggests amphipathic α-helices and a
coiled-coil dimer (see Chapter 6, page 155)
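The "Leu every 7th position" feature can be checked directly on a sequence. This is a minimal sketch under stated assumptions: four consecutive heptads are required (matching the 28-residue figure above), the basic region is not examined, and the function name and toy sequence are hypothetical.

```python
def leucine_heptad_starts(protein: str, repeats: int = 4) -> list[int]:
    """Return 0-based positions where Leu recurs at every 7th residue
    for `repeats` consecutive heptads (4 heptads span 28 residues)."""
    seq = protein.upper()
    span = 7 * (repeats - 1)  # distance from the first Leu to the last Leu
    return [
        i
        for i in range(len(seq) - span)
        if all(seq[i + 7 * k] == "L" for k in range(repeats))
    ]


# Toy sequence for illustration: two leading residues, then four heptads
# that each begin with Leu. It is not a real bZIP protein.
toy = "MK" + "LEDKVEE" * 4
print(leucine_heptad_starts(toy))
```

A 7-residue spacing places every Leu on the same face of an α-helix (about two turns apart), which is why the repeat points to an amphipathic helix that can pair in a coiled coil.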
The Structure of the Leucine Zipper
In complex with DNA
• Leucine zipper proteins (aka bZIP proteins)
dimerize, either as homo- or hetero-dimers
• The basic region is the DNA-recognition site
• Basic region is often modeled as a pair of
helices that can wrap around the major groove
• Homodimers recognize dyad-symmetric DNA
• Heterodimers recognize non-symmetric DNA
• Fos and Jun are classic bZIPs
Figure 29.34 BR-A and BR-B are basic regions A and B.
Structure of a Leucine Zipper:DNA Complex
Figure 29.35 Model for the
heterodimeric bZIP transcription factor
c-Fos:c-Jun bound to a DNA oligomer
containing the AP-1 consensus target
sequence TGACTCA.
Eukaryotic Genes are Split Genes
• Introns (non-coding regions) intervene between
exons (protein-coding regions)
• Examples: the actin gene has a 309-bp intron separating the codons
for the first three amino acids from those for the other 350 or so
• But chicken pro α-2 collagen gene is 40-kbp
long, with 51 exons of only 5 kbp total.
• In these cases, the exons range in size from 45 to
249 bases
• The mechanism by which introns are excised and
exons are spliced together is complex and must be
precise
29.5 How Are Eukaryotic Transcripts Processed
and Delivered to the Ribosomes for Translation?
• In prokaryotes, transcription and translation are
concomitant processes
• In eukaryotes, the two processes are spatially
separated: transcription occurs on DNA in the
nucleus, and translation occurs on ribosomes in the
cytoplasm
• Thus, transcripts must be transported from the
nucleus to the cytosol to be translated
• On the way, these transcripts undergo processing
• Alterations that convert the newly synthesized
RNAs (primary transcripts) into mature mRNAs
• And unlike prokaryotes, eukaryotic mRNAs encode
only one polypeptide; i.e., they are monocistronic
Eukaryotic Genes are Split Genes
Figure 29.36 The organization of split eukaryotic genes.
Eukaryotic Genes are Split Genes
mRNA Processing Involves Capping, Methylation,
Polyadenylylation, & Splicing
• Primary transcripts (aka pre-mRNAs or
heterogeneous nuclear RNA) are usually
capped by addition of a guanylyl group
• The reaction is catalyzed by guanylyl
transferase
• Cap G residue is methylated at 7-position
• Additional methylations occur at 2'-O positions
of next two residues and at 6-amino of the first
adenine
Figure 29.37 The organization of the mammalian DHFR gene in
three representative species. Note that the exons are much
shorter than the introns. Note also that the exon pattern is more
highly conserved than the intron pattern.
The Capping of Eukaryotic pre-mRNAs
Figure 29.38 Guanylyl transferase catalyzes the addition of a
guanylyl residue derived from GTP to the 5'-end of the
growing transcript, which has a 5'-triphosphate group already
there. In the process, pyrophosphate (pp) is liberated from
GTP and the terminal phosphate (p) is removed from the
transcript:
Gppp + pppApNpNpNp… → GpppApNpNpNp… + pp + p
(A is often the initial nucleotide in the primary transcript.)
Methylation at Several Sites is Essential to
mRNA Maturation
Figure 29.39 A cap bearing only a single
–CH3 on the guanyl is termed cap 0. This
methylation occurs in all eukaryotic
mRNAs. If a methyl is also added to the
2'-O position of the first nucleotide after the cap, a cap 1
structure is generated. This is the predominant cap form in
RNA from all multicellular eukaryotes.
3'-Polyadenylylation of Eukaryotic mRNAs
• Termination of transcription occurs only after
RNA polymerase has transcribed past a
consensus AAUAAA sequence - the polyadenylylation
signal
• 10-35 nucleotides past this signal, a string of 100 to
200 adenine residues is added to the mRNA
transcript - the poly(A) tail
• Poly(A) polymerase adds these A residues
• Poly(A) tail enhances mRNA stability
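The 10 to 35 nucleotide rule can be sketched as a scan for AAUAAA followed by a calculation of the window in which poly(A) addition would be expected. This is an illustrative sketch only: measuring the distance from the 3'-end of the hexamer is an assumption (the slides do not say which end is the reference point), and the function name and toy transcripts are hypothetical.

```python
def polya_cleavage_windows(rna: str) -> list[tuple[int, int, int]]:
    """For each AAUAAA, return (signal_start, earliest_site, latest_site)
    as 0-based indices, with the window clipped to the transcript length."""
    seq = rna.upper().replace("T", "U")  # accept DNA-style input as well
    signal = "AAUAAA"
    windows = []
    pos = seq.find(signal)
    while pos != -1:
        signal_end = pos + len(signal)  # index just past the hexamer
        # Assumption: the 10-35 nt distance is counted from the hexamer's 3'-end.
        earliest = signal_end + 10
        latest = min(signal_end + 35, len(seq) - 1)
        if earliest <= latest:  # skip signals too close to the transcript end
            windows.append((pos, earliest, latest))
        pos = seq.find(signal, pos + 1)
    return windows


# Toy transcript for illustration: signal at index 2, followed by 40 nt.
print(polya_cleavage_windows("GG" + "AAUAAA" + "C" * 40))
```

In a cell the actual site within this window also depends on the downstream G/U-rich element and on CPSF binding, which this sketch does not model.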
Nuclear Pre-mRNA Splicing
• Within the nucleus, hnRNA forms ribonucleoprotein
particles (RNPs) through association with a
characteristic set of nuclear proteins
• These proteins maintain the hnRNA in an untangled
and accessible conformation
• The substrate for splicing, that is, intron excision
and exon ligation, is the capped primary transcript
emerging from the RNA polymerase II transcriptional
apparatus
• Splicing occurs exclusively in the nucleus
• Consensus sequences define the exon/intron
junctions in eukaryotic mRNA precursors
Figure 29.40 Poly (A) addition
to the 3'-ends of transcripts
occurs 10 to 35 nucleotides
downstream from a consensus
AAUAAA sequence, defined
as the polyadenylylation signal.
CPSF (cleavage and
polyadenylylation specificity
factor) binds to this signal
sequence and mediates
looping of the 3'-end of the
transcript through interactions
with a G/U-rich sequence even
further downstream.
Splicing of Pre-mRNA
• Capped, polyadenylated RNA, in the form of an RNP
complex, is the substrate for splicing
• In "splicing", the introns are excised and the exons
are joined together to form mature mRNA
• The 5'-end of an intron in higher eukaryotes is
always GU and the 3'-end is always AG
• All introns have a "branch site" 18 to 40 nucleotides
upstream from 3'-splice site
• The branch site is essential to splicing
Figure 29.41 Consensus Sequences at the Splice
Sites in Vertebrate Genes
The Splicing Reaction Proceeds via
Formation of a Lariat Intermediate
• Figure 29.42 shows the splicing mechanism
• The branch site is usually YNYRAY, where Y =
pyrimidine, R = purine and N is anything
• The lariat, a covalently closed loop of RNA, is
formed by attachment of the 5'-P of the intron's
invariant 5'-G to the 2'-OH at the branch A site
• The exons then join, excising the lariat.
• The lariat is unstable; the 2'-5' phosphodiester is
quickly cleaved and the intron is degraded in the
nucleus.
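The sequence rules above (GU at the 5'-end, AG at the 3'-end, and a YNYRAY branch site 18 to 40 nucleotides upstream of the 3'-splice site) can be checked on a candidate intron. This is a minimal sketch with assumptions labeled: the distance is counted from the branch A to the last nucleotide of the intron, and the function name, return format, and toy intron are hypothetical.

```python
import re

# YNYRAY with Y = C or U, R = A or G, N = any base.
# The lookahead lets overlapping candidate sites be reported.
BRANCH_SITE = re.compile(r"(?=[CU][ACGU][CU][AG]A[CU])")


def check_intron(intron: str) -> dict:
    """Report whether an intron obeys the GU...AG rule and list the 0-based
    positions of candidate branch A residues 18-40 nt from the 3'-end."""
    seq = intron.upper().replace("T", "U")
    last = len(seq) - 1
    branch_a = []
    for m in BRANCH_SITE.finditer(seq):
        a_index = m.start() + 4  # the A is the fifth base of YNYRAY
        # Assumption: distance is measured from the branch A to the final G.
        if 18 <= last - a_index <= 40:
            branch_a.append(a_index)
    return {
        "gu_ag": seq.startswith("GU") and seq.endswith("AG"),
        "branch_a_positions": branch_a,
    }


# Toy intron for illustration: 5' GU, a CUUGAC branch site, 3' AG.
toy = "GUAAGU" + "C" * 10 + "CUUGAC" + "U" * 20 + "AG"
print(check_intron(toy))
```

The 2'-OH of the reported A is the nucleophile that attacks the 5'-splice site to form the lariat; the sketch only locates candidates and does not predict which one a spliceosome would use.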
The Splicing Reaction Proceeds via
Formation of a Lariat Intermediate
Figure 29.42 Splicing of
mRNA precursors. A
representative precursor
mRNA is depicted. Exon 1
and Exon 2 indicate two
exons separated by an
intervening sequence (an
intron) with consensus 5', 3',
and branch sites.
Splicing Depends on snRNPs
• Splicing depends on a unique set of small nuclear
ribonucleoprotein particles - snRNPs, pronounced
"snurps"
• A snRNP consists of a small RNA (100-200 bases
long) and about 10 different proteins
• Some of the 10 proteins are general, some are
specific (see Table 29.6)
• Major snRNP species are abundant, with more than
100,000 copies per nucleus
• snRNPs and pre-mRNA form the spliceosome
• The spliceosome is the size of ribosomes, and its
assembly requires ATP
Splicing Depends on snRNPs
snRNPs Form the Spliceosome
• Splicing occurs when the various snRNPs come
together with the pre-mRNA to form a
multicomponent complex called the spliceosome
• The spliceosome is a large complex, about the
size of a ribosome; its assembly requires ATP
• snRNPs U1 and U5 bind at the 5'- and 3'- splice
sites, and U2 snRNP binds at the branch site
• Interaction between the snRNPs brings 5'- and 3'-splice
sites together so the lariat can form and
exon ligation can occur
• Spliceosome assembly requires ATP-dependent
RNA rearrangements catalyzed by spliceosomal
DEAD-box ATPases/helicases
Figure 29.43
Structure of the core domain
of the U4 snRNP. The U4
snRNA (orange, with bases
in light blue stick) passes
through the central hole in
the heteroheptad Sm protein
complex, SmG-SmD3-SmB-SmD1-SmD2-SmF. Each of
the seven Sm proteins is a
different color.
snRNPs Form the Spliceosome
Figure 29.44 The mammalian U1 snRNA can be arranged in
a secondary structure where its 5'-end is single-stranded
and can base-pair with the consensus 5'-splice site of the
intron.
snRNPs Form the Spliceosome
Figure 29.45 Events in spliceosome assembly. U1
snRNP binds at the 5'-splice site, followed by the
association of U2 snRNP with the UACUAA*C branch-point
sequence. The triple U4/U6-U5 snRNP complex replaces
U1 at the 5'-splice site and directs the juxtaposition of the
branch-point sequence with the 5'-splice site, whereupon
U4 snRNP is released.
Alternative RNA Splicing Creates Protein Isoforms
• In constitutive splicing, every intron is removed
and every exon is incorporated into the mature RNA
• This produces a single form of mature mRNA from
the primary transcript
• However, many eukaryotic genes can give rise to
multiple forms of mature RNA transcripts
• This may occur by:
• Use of different promoters
• Selection of different polyadenylylation sites
• Alternative splicing of the primary transcript, or
• A combination of these three mechanisms
Alternative RNA Splicing Creates Protein
Isoforms
• Different transcripts from a single gene make possible a
set of related polypeptides, termed protein isoforms,
each with a slightly altered function
• The isoforms of fast skeletal muscle troponin T are an
example of alternative splicing
• This gene consists of 18 exons, 11 of which are found in
all mature mRNAs and are constitutive
• Five of the exons (4 through 8) are combinatorial, in
that they may be included or excluded
• Two (16 and 17) are mutually exclusive – one is always
present but never both
• 64 different mature mRNAs can be formed from this gene
by alternative splicing
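The figure of 64 follows from the exon classes listed above: five combinatorial exons give 2^5 = 32 include/exclude patterns, and each pattern is combined with one of the two mutually exclusive exons, so 32 × 2 = 64. The sketch below enumerates the combinations explicitly; the exon numbering comes from the slide, while the variable and function names are made up for illustration.

```python
from itertools import product

ALL_EXONS = range(1, 19)            # exons 1 through 18
COMBINATORIAL = [4, 5, 6, 7, 8]     # each may be included or excluded
MUTUALLY_EXCLUSIVE = [16, 17]       # exactly one of these is present
CONSTITUTIVE = [
    e for e in ALL_EXONS
    if e not in COMBINATORIAL and e not in MUTUALLY_EXCLUSIVE
]


def enumerate_isoforms() -> list[tuple[int, ...]]:
    """List every exon combination allowed by the three exon classes."""
    isoforms = []
    for flags in product([False, True], repeat=len(COMBINATORIAL)):
        optional = [e for e, keep in zip(COMBINATORIAL, flags) if keep]
        for exclusive in MUTUALLY_EXCLUSIVE:
            isoforms.append(tuple(sorted(CONSTITUTIVE + optional + [exclusive])))
    return isoforms


isoforms = enumerate_isoforms()
print(len(CONSTITUTIVE), len(isoforms))
```

The count is the number of mRNAs that are possible in principle; the slides do not state how many of them are actually observed in muscle.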
Alternative RNA Splicing Creates Protein
Isoforms
Figure 29.46 Organization of the fast skeletal muscle
troponin T gene and the 64 possible mRNAs that can be
generated from it. Exons are constitutive (yellow),
combinatorial (green), or mutually exclusive (blue or orange).
RNA Editing: Another Way To Increase the Diversity
of Genetic Information
29.6 Can Gene Expression Be Regulated Once the
Transcript Has Been Synthesized?
• RNA editing is a process that changes one or more
nucleotides in an RNA transcript by deaminating a
base, either A→I or C→U
• These changes alter the coding possibilities in a
transcript, because I will pair with C (not U as A
does) and U will pair with A (not G as C does)
• RNA editing can increase protein diversity by
(1) Altering amino acid coding possibilities
(2) Introducing premature stop codons
(3) Changing a splice site in a transcript
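The effect of a single deamination on coding can be shown with a few codons. This is a minimal sketch: the codon table holds only the four codons needed here (standard genetic code), inosine is decoded as G because it pairs with C, and the function names are hypothetical. The two example codons are commonly cited editing cases (the apolipoprotein B transcript and a glutamate receptor subunit transcript) and are not taken from the slides.

```python
# Partial codon table: only the codons used in the examples below.
CODON_TABLE = {"CAA": "Gln", "UAA": "Stop", "CAG": "Gln", "CGG": "Arg"}


def deaminate(codon: str, index: int) -> str:
    """Apply A->I or C->U deamination at one position of a codon."""
    edits = {"A": "I", "C": "U"}
    base = codon[index]
    if base not in edits:
        raise ValueError(f"{base} is not a deamination target")
    return codon[:index] + edits[base] + codon[index + 1:]


def decode(codon: str) -> str:
    """Translate a codon, reading inosine (I) as G because I pairs with C."""
    return CODON_TABLE[codon.replace("I", "G")]


# C->U editing: a Gln codon becomes a premature stop codon.
print(decode("CAA"), "->", decode(deaminate("CAA", 0)))
# A->I editing: a Gln codon is read as an Arg codon.
print(decode("CAG"), "->", decode(deaminate("CAG", 1)))
```

The first case illustrates outcome (2) in the list above and the second illustrates outcome (1).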
• miRNAs are key regulators in post-transcriptional gene
regulation
• miRNAs are a large family of small, noncoding RNAs found in
animals, plants, and protists
• At least 800 are found in mammals and they are predicted to
target the expression of about 60% of all protein-coding
genes
• Mature miRNAs are incorporated into a miRNA-induced
silencing complex through interaction with AGO2
• In most cases, miRNAs target the 3'-untranslated region of
the mRNAs they regulate
• miRNA-RISC blocks gene expression in two ways:
• miRNA-RISC binding can interfere with recruitment of
ribosomes; in addition, miRNA-RISC complexes destabilize
mRNAs through deadenylation at their 3′-ends
Figure 29.47 Domain organization of human
Argonaute 2 (AGO2)
29.7 Can We Propose a Unified Theory of Gene
Expression?
• Traditionally, the stages of eukaryotic gene expression,
from transcriptional activation through mRNA translation,
have been viewed as discrete steps
• We now know that each stage is part of a continuous
process with physical and functional connections, running
from transcription through processing to protein synthesis
as DNA → RNA → protein
• This continuous process is achieved by an interacting
network of macromolecular machines - nucleosomes,
HATs, chromatin remodeling complexes, RNA pol II,
capping, splicing and poly(A) enzymes, mRNA export
proteins, and ribosomes
Figure 29.48 A unified theory of gene expression.
RNA Degradation
Figure 29.49 Structure of the human exosome core,
composed of nine different polypeptides. A hexameric
ring of subunits surrounds a central cavity that is
capped by a set of three other proteins (the ones
colored pink here).
• The amount of specific mRNAs or proteins in a cell
at any time represents a balance between rates of
macromolecular synthesis and degradation
• Regulated degradation of mRNAs and proteins (see
Chapter 31) is a rapid and effective way to control
the levels of these macromolecules
• Targeted degradation of RNAs and proteins is
enclosed within ringlike or cylindrical macromolecular
complexes – the exosome for RNA and the
proteasome for proteins (Chapter 31)
• The exosome consists of a ring of six subunits
surrounding a central cavity, with one or more having
RNase PH activity