Transcription
• Transcription factors
• Chromatin
• RNA polymerase II
RNA polymerase II structure – without one important domain
Armache K et al. J. Biol. Chem. 2005;280:7131-7134
©2005 by American Society for Biochemistry and Molecular Biology
Molecular Cell Previews
Unique features of the C-terminal domain repeats of RNA polymerase II
Each heptad has several possible posttranslational modifications
Voss et al., 2015). What role these modifications play in establishing different phosphorylation states is not known. Of course, the experimental paradigm presented in these two papers will be an effective and highly informative route to addressing such questions.
The modifications vary in the course of transcription
Figure 1. RNA Polymerase II with CTD Indicating Multiple Heptad Repeats
(A) Five possible phosphorylation sites in each repeat.
(B) Sparse phosphorylation at S5 close to the start of transcription.
(C) Sparse phosphorylation at S2 near the cleavage/polyadenylation site.
REFERENCES
Buratowski, S. (2003). Nat. Struct. Biol. 10, 679–680.
Corden, J.L. (2013). Chem. Rev. 113, 8423–8455.
Dias, J.D., Rito, T., Torlai Triglia, E., Kukalev, A., Ferrai, C., Chotalia, M., Brookes, E., Kimura, H., and Pombo, A. (2015). eLife 4, 4.
Egloff, S., Dienstbier, M., and Murphy, S. (2012). Trends Genet. 28, 333–341.
Eick, D., and Geyer, M. (2013). Chem. Rev. 113, 8456–8490.
Komarnitsky, P., Cho, E.J., and Buratowski, S. (2000). Genes Dev. 14, 2452–2460.
Schüller, R., Forné, I., Straub, T., Schreieck, A., Texier, Y., Shah, N., Decker, T.-M., Cramer, P., Imhof, A., and Eick, D. (2016). Mol. Cell 61, this issue, 305–314.
Schwer, B., and Shuman, S. (2011). Mol. Cell 43, 311–318.
Suh, H., Ficarro, S.B., Kang, U.-B., Chun, Y., Marto, J.A., and Buratowski, S. (2016). Mol. Cell 61, this issue, 297–304.
Review
Trends in Genetics Vol.24 No.6
x 34 repeats in Arabidopsis
Figure 2. The updated carboxyl-terminal domain (CTD) code. All the possible serine phosphorylation and proline isomerization combinations are shown. It has not been
ruled out that Tyr1 and Thr4 phosphorylation co-exists with serine phosphorylation. The glycosylation state of serine residue in position 2 (Ser2), Thr4, Ser5 and Ser7 may
also play a role in recognition of the CTD by factors and multiple differentially glycosylated forms are possible. Before and after recruitment, subsets of these combinations
will influence the function of polymerase II (pol II). Insertion of an extra amino acid between heptapeptide pairs is tolerated, whereas insertion between individual repeats is
not, suggesting that the unit of recognition is a heptapeptide pair [51]. This would increase the potential complexity of the CTD code (64 potential combinations for
phosphorylation sites and 16 for proline isomerization) but reduce the number of different protein binding sites on the same CTD. Although the code is potentially very
complex, with each of 52 repeats or 26 pairs having a different set of modifications, in reality the number of combinations is restricted by the recruitment of different sets of
modification enzymes at the appropriate points. In addition, some modifications will enhance or preclude others. See also Refs. [17,18]. Modification of the heptapeptide is
shown as in Figure 1.
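The counts quoted in this legend follow from treating each modifiable position as an on/off switch. The sketch below is a minimal Python illustration, assuming that only the three serines (Ser2, Ser5, Ser7) are counted as phosphorylation sites and the two prolines (Pro3, Pro6) as cis/trans switches, as the legend does. The function and variable names are illustrative and do not come from the review.

```python
from itertools import product

HEPTAD = "YSPTSPS"  # consensus repeat Y1 S2 P3 T4 S5 P6 S7

def binary_states(n_sites: int) -> int:
    """Number of on/off combinations for n independent two-state sites."""
    return 2 ** n_sites

# 1-based positions of serines and prolines in one consensus heptad.
serines = [i + 1 for i, aa in enumerate(HEPTAD) if aa == "S"]
prolines = [i + 1 for i, aa in enumerate(HEPTAD) if aa == "P"]

# Single heptad: serine phosphorylation and proline isomerization states.
per_heptad_phospho = binary_states(len(serines))
per_heptad_isomers = binary_states(len(prolines))

# Heptapeptide pair (the proposed recognition unit): twice as many sites.
per_pair_phospho = binary_states(2 * len(serines))
per_pair_isomers = binary_states(2 * len(prolines))

# Explicit enumeration of the cis/trans configurations of one heptad.
isomer_configs = list(product(("cis", "trans"), repeat=len(prolines)))
```

Under these assumptions a heptapeptide pair gives 64 serine phosphorylation patterns and 16 proline configurations, matching the legend. These are upper bounds only: as the legend notes, the combinations actually realized are restricted by which modification enzymes are recruited.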
processing the 3′ end of the transcripts and/or transcription
termination. However, substitution of Ser7 by the non-phospho-acceptor
alanine in each heptapeptide has little
effect on expression of endogenous or transiently transfected protein-coding genes [6,22]. By contrast, alanine
substitution of Ser7 causes a marked defect in the transcription of human snRNA genes and in formation of the 3′
end of transcripts [6], indicating that at least one element
of the CTD code can be read in a gene-specific manner.
Phosphorylation of other residues in the heptapeptide
The highly conserved Tyr1 is a potential site for tyrosine
kinases, and some tyrosine phosphorylation occurs in vivo
[19]. Ectopic expression of c-Abl both increases Tyr1 phosphorylation of endogenous pol II and activates expression
of HIV templates [39]. However, no clear function has been
attributed to this modification. Likewise, some threonine
phosphorylation is detected on the CTD of pol II in mammalian cells [20], but the functional significance is unclear.
Because there are 15 threonines in positions 2, 5 or 7, it is
possible that these rather than Thr4 are subject to phosphorylation.
Finally, in vertebrates, the last repeat of the CTD is
followed by a conserved 10 amino acid extension [2], which
contains a constitutively phosphorylated casein kinase
(CK)II site [40]. Deletion of this extension results in degradation of the CTD and a reduced ability to transcribe and
process RNA [40,41]. However, mutation of the CKII target
site does not affect pol II CTD stability. Phosphorylation of
Tyr1 by c-Abl requires this extension [42], implicating Tyr1
phosphorylation in functions specific to vertebrates.
Proline isomerization
There are two peptidyl-prolyl bonds in each consensus
heptapeptide, and these can be in either cis or trans
orientation, resulting in four possible configurations of
each repeat for the mammalian CTD (Figure 2). At least
one protein involved in recognition of poly(A) sites selectively binds to repeats with prolines in an all-trans conformation (see below), and peptidyl-prolyl cis/trans
isomerases (PPIases) might be involved in isomerization
in vivo. Mammalian Pin1 and yeast ESS1 proteins, which
possess PPIase activity, are good candidates for proteins
regulating CTD structure and function through proline
isomerization (see Ref. [36] for review). They interact with
phospho-CTD and display an unusual substrate specificity
for peptides with phospho-serine and phospho-threonine
residues preceding the proline. Mutations in ESS1 are
associated with pre-mRNA 3′ processing defects, whereas
there is no evidence that Pin1 plays a positive role in
polyadenylation in mammals. Indeed, it has not been
unequivocally shown that binding of any transcription or
processing factors to the CTD is influenced by enzymatic
proline isomerization. However, Pin1 influences the
5-2
The proline peptide bond
- proline, an imino acid
- trans conformation favoured ~1000-fold over cis
- cis conformation rare except for proline
- cis can be 10-30%, depending on the nature of the Xaa-Pro bond
Figure labels: 180º; R = side chain; O=C-N-H is planar; trans ~93%; cis ~7%
www.sfu.ca/~leroux/class_L05.ppt
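The population figures on this slide can be restated as free-energy differences with the standard two-state relation ΔG = RT ln(ratio). The sketch below is a minimal Python illustration. It assumes that the slide's numbers are equilibrium population ratios and that the temperature is 25ºC (the slide states a temperature only for the half-life on the next slide). The function name is illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def delta_g_kcal(ratio: float, temp_c: float = 25.0) -> float:
    """Free-energy gap (kcal/mol) implied by a two-state population ratio.

    Uses dG = R * T * ln(ratio); temp_c = 25 is an assumed default.
    """
    temp_k = temp_c + 273.15
    return R_KCAL * temp_k * math.log(ratio)

# trans favoured ~1000-fold over cis
dg_1000_fold = delta_g_kcal(1000.0)

# trans ~93%, cis ~7%
dg_93_to_7 = delta_g_kcal(93.0 / 7.0)
```

With these inputs the gap is roughly 4.1 kcal/mol for a 1000-fold preference and roughly 1.5 kcal/mol for a 93:7 split, which shows how much smaller the trans preference is in the second case.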
5-3
Proline cis-trans isomerization
- slow because it involves rotation about a partial double-bond (t1/2 between 10-100 sec at 25ºC)
- cis-trans equilibria more common in flexible regions of native proteins (e.g., coils) OR: during protein folding
- strong acids favour cis-trans isomerization by protonating the nitrogen atom
- proline residues disrupt alpha-helices; often found in turns
- cis-trans isomerization could be used as a molecular switch
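To put the quoted half-lives on a rate scale, the first-order relation k = ln 2 / t1/2 can be applied. The sketch below is a minimal Python illustration. It assumes the uncatalyzed isomerization behaves as a simple first-order process, which the slide does not state explicitly, and the function names are illustrative.

```python
import math

def rate_constant_from_half_life(t_half_s: float) -> float:
    """First-order rate constant (1/s) from a half-life in seconds."""
    return math.log(2) / t_half_s

def fraction_remaining(t_s: float, t_half_s: float) -> float:
    """Fraction of the starting isomer left after t_s seconds."""
    return 0.5 ** (t_s / t_half_s)

# Bounds quoted on the slide: t1/2 between 10 and 100 s at 25 C.
k_fast = rate_constant_from_half_life(10.0)
k_slow = rate_constant_from_half_life(100.0)

# Fraction not yet isomerized after one minute at each bound.
left_fast = fraction_remaining(60.0, 10.0)
left_slow = fraction_remaining(60.0, 100.0)
```

Under this assumption the rate constants fall between about 0.007 and 0.07 per second, so a bond at the slow end of the range is still mostly unconverted after a minute. That is slow enough for catalysis to matter.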
Catalysis of cis-trans isomerization
- simple reaction; does not involve breaking or forming bonds
- mechanism: catalysis by distortion and transition state containing partially-rotated C-N bond
- this would result in a reduced partial double-bond character
PPIase, Peptidyl Prolyl Isomerase, catalyzes proline cis-trans isomerization
- active site of PPIase hydrophobic in character; conserved Arg residue of a PPIase might be involved in H-bond formation with N:, producing C-N bond with more single-bond character
Peptidyl prolyl isomerases
Three classes are known:
• Cyclophilins
- ubiquitous; 11 different ones found in S. cerevisiae; not essential for viability
- bind cyclosporin A
• FK506-binding proteins (FKBPs)
- no sequence similarity with cyclophilins; many different members found in eukaryotes, as well as prokaryotes; not essential for viability. Yeast mutant lacking all its cyclophilins and FKBPs still alive!
- bind immunosuppressants FK506 and rapamycin (but not cyclosporin A)
- both cyclophilins and FKBPs form complexes with the molecular chaperone Hsp90, perhaps to catalyze cis-trans isomerization as well as to assist folding (or modulate protein conformation)
• Parvulins
- not related to cyclophilins or FKBPs, and are not inhibited by cyclosporin A, FK506 or rapamycin
- occur as small proteins of <100 amino acids or as domains of larger proteins
- have high PPIase activity
.... and they differ in their activities and specificities,
cellular localization, and binding partner(s)
Implicated in: protein folding, protection against stress, apoptosis, cell cycle progression, etc. etc.
5-4
Figure 1. Modification of the polymerase II (Pol II) carboxyl-terminal domain (CTD) heptapeptide during transcription of protein-coding genes. (a) The CTD of pol II, which is
recruited by preinitiation complexes at the promoter, is unphosphorylated [10] but might be glycosylated [44]. Phosphorylation of the CTD before initiation is thought to
block recruitment of pol II, emphasizing that the CTD code is read differently before and after initiation of transcription. (b) The CTD is located close to the RNA exit channel
[16]. Phosphorylation of serine residue in position 5 (Ser5), by the CDK7 subunit of the general transcription factor TFIIH, just after initiation, helps to recruit and activate
enzymes that add a methylguanosine cap (filled black circle) to the 5′ end of the emerging transcript [10]. Because glycosylated and phosphorylated residues are not found
together on the same CTD, glycosylation must have been removed by this point. (c) Subsequent phosphorylation of Ser2 by the CDK9 subunit of positive-transcription
elongation factor b (P-TEFb) activates elongation and RNA processing [10,27]. Phosphorylation of Ser7 also occurs during transcription and can peak at the 3′ end [22] but
has, thus far, no known function in protein-coding gene expression. In yeast and in some mammalian genes, Ser5 is dephosphorylated toward the 3′ end of the
transcription unit [10,24,25]. (d) After cleavage and polyadenylation of the 3′ end of the pre-mRNA, directed by the poly(A) site, dephosphorylation of the CTD may help pol II
to disengage ready for another round of transcription [10,14,15]. Glycosylation is indicated by circles containing Gs (light blue), phosphorylation by circles containing Ps
(orange), and trans isomerisation of prolines by a t (red) above the amino acid. Only one heptapeptide of the multiple heptapeptide repeats is shown on the schematic of pol
II (green) with a CTD ‘tail’.
Ser7 phosphorylation extends the CTD code and is
part of a gene specific signal
Using novel anti-CTD monoclonal antibodies, it has now
been shown that the serine residue in position 7 (Ser7) of
the heptapeptide repeat is also phosphorylated in vivo
during transcription of snRNA genes and a range of
protein-coding genes [6,22]. This mark expands the
complexity of the CTD code (Figure 2 and Ref. [18]). ChIP
analysis indicates that this phosphorylation mark peaks
toward the 3′ end of the transcribed T-cell receptor β
(TCRβ) gene, in a similar way to phosphorylation of
Ser2, suggesting that Ser7 phosphorylation has a role in
Molecular Cell
Rtr1 Regulates RNAPII CTD Phosphorylation
Figure 3. Rtr1 Localizes to Open Reading
Frames
(A) ChIP assays were performed using antibodies
directed against different forms of RNAPII and
Rtr1 as indicated. Quantitation of the S5-P (black
circles) and S2-P (blue squares) occupancy
compared to the occupancy of Rtr1 (red triangles)
across the PMA1 genomic loci.
(B) Quantitation of protein occupancy across the
PYK1 genomic loci as performed above. These
data are shown as the average percent maximum
IP ± SD for comparison purposes between the
different antibodies (n = 3). The midpoint of the
PCR amplicon was used as the distance from
the ATG. A schematic representation of the
genomic loci for each gene is shown below the
graphs to illustrate the approximate location of
the promoter and polyadenylation regions.
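The legend reports occupancy as average percent maximum IP ± SD (n = 3). The sketch below is a minimal Python illustration of one way to compute that summary. It assumes each replicate is scaled to its own maximum before averaging, which the legend does not specify. The input values are hypothetical and are not data from the paper.

```python
from statistics import mean, stdev

def percent_of_max(signals):
    """Scale one replicate's IP signals so the highest amplicon equals 100."""
    top = max(signals)
    return [100.0 * s / top for s in signals]

def summarize(replicates):
    """Per-amplicon (mean, sample SD) of percent-maximum IP across replicates."""
    scaled = [percent_of_max(rep) for rep in replicates]
    per_amplicon = zip(*scaled)  # regroup values by amplicon position
    return [(mean(vals), stdev(vals)) for vals in per_amplicon]

# Hypothetical raw IP values: 3 replicates x 4 amplicons ordered 5' to 3'.
replicates = [
    [2.0, 8.0, 4.0, 1.0],
    [1.0, 4.0, 2.0, 0.5],
    [3.0, 12.0, 6.0, 1.5],
]
summary = summarize(replicates)
```

Scaling to percent maximum puts signals from different antibodies (S5-P, S2-P, Rtr1) on a common 0 to 100 axis, so their peak positions along a gene can be compared even though the absolute IP efficiencies differ.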
Rtr1 Is Found in the Open Reading Frames of PMA1 and PYK1
To determine whether Rtr1 colocalizes with elongating, initiating, or terminating RNAPII in vivo, chromatin immunoprecipitation (ChIP) assays were performed with an Rtr1-HFH strain using antibodies directed against HA. The genomic loci of PMA1 and PYK1 were analyzed for Rtr1, S5-P, and S2-P RNAPII occupancy using qPCR (Figures 3A and 3B). Both of these genes are highly expressed and have previously been used to study the localization of different phosphorylated forms of RNAPII (Komarnitsky et al., 2000). As shown in Figures 3A and 3B, a peak in S5-phosphorylated RNAPII is observed at the 5′ end of both PMA1 and PYK1 and decreases prior to an observed increase of S2-P, in agreement with previous results (Komarnitsky et al., 2000). Surprisingly, we found a strong and distinct peak of Rtr1 that localized to a region of both genes between the enriched regions of S5-P and S2-P. This peak was also observed after normalization of the levels of Rtr1 to the level of RNAPII occupancy (Figure S2). This localization, in combination with our other results, could indicate a role for Rtr1 in the transition from S5-P CTD to S2-P CTD in vivo.
Rtr1 Is Required for S5-P Dephosphorylation during Early RNAPII Elongation
The Rtr1 deletion strain was analyzed for defects in CTD phosphorylation in vivo by western blot analysis of whole-cell extracts. Loss of Rtr1 resulted in increased levels of S5-P CTD in vivo (Figure 4A, upper panel) and corresponded with a slight decrease in the cellular level of unmodified RNAPII (Figure 4B, third panel). The level of S2-P was not affected in rtr1Δ extracts, nor was the level of Rpb3, which was used as a control for protein loading (Figure 4A, bottom two panels). Quantitation was performed on triplicate experiments and is shown in Figure 4B. These data indicate that Rtr1 is involved in the regulation of S5-P in vivo and support the hypothesis that Rtr1 is involved in decreasing S5-P in wild-type cells, given that there is an accumulation of the S5-P form in rtr1Δ cells.
In order to determine the effects of Rtr1 deletion on the levels of DNA-associated RNAPII, we performed ChIP analyses in the rtr1Δ strain and compared the levels of S5-P and S2-P RNAPII to those found in wild-type. The levels of S5-P RNAPII dramatically increase in the rtr1Δ cells when compared to the levels observed in wild-type cells at both the PMA1 and PYK1 ORFs (Figures 4C and 4D). This finding is especially true toward the 3′ ends of these coding regions, where S5-P on the CTD is normally very low. The level of S5-P RNAPII decreases near the polyadenylation sites of both genes to near wild-type levels. This decline is likely due to decreased RNAPII occupancy in these regions, which was observed for Rpb3 in both the rtr1Δ and wild-type cells (Figure 5A). The levels of S2-P RNAPII were also
Molecular Cell 34, 168–178, April 24, 2009 © 2009 Elsevier Inc.
Molecular Cell
Article
Rtr1 Is a CTD Phosphatase that Regulates RNA
Polymerase II during the Transition from Serine 5
to Serine 2 Phosphorylation
Amber L. Mosley,1 Samantha G. Pattenden,1 Michael Carey,1,2 Swaminathan Venkatesh,1 Joshua M. Gilmore,1 Laurence Florens,1 Jerry L. Workman,1 and Michael P. Washburn1,*
1Stowers Institute for Medical Research, Kansas City, MO 64110, USA
2Department of Biological Chemistry, David Geffen School of Medicine, University of California, Los Angeles, 10833 LeConte Avenue, Los Angeles, CA 90095, USA
*Correspondence: [email protected]
DOI 10.1016/j.molcel.2009.02.025
SUMMARY
Messenger RNA processing is coupled to RNA
polymerase II (RNAPII) transcription through coordinated recruitment of accessory proteins to the Rpb1
C-terminal domain (CTD). Dynamic changes in CTD
phosphorylation during transcription elongation are
responsible for their recruitment, with serine 5 phosphorylation (S5-P) occurring toward the 5′ end of
genes and serine 2 phosphorylation (S2-P) occurring
toward the 3′ end. The proteins responsible for regulation of the transition state between S5-P and S2-P
CTD remain elusive. We show that a conserved
protein of unknown function, Rtr1, localizes within
coding regions, with maximum levels of enrichment
occurring between the peaks of S5-P and S2-P
RNAPII. Upon deletion of Rtr1, the S5-P form of
RNAPII accumulates in both whole-cell extracts and
throughout coding regions; additionally, RNAPII transcription is decreased, and termination defects are
observed. Functional characterization of Rtr1 reveals
its role as a CTD phosphatase essential for the S5-to-S2-P transition.
INTRODUCTION
From yeast to mammals, there are three highly conserved RNA
polymerase complexes that are responsible for the transcription
of all classes of cellular RNAs. RNA processing is closely tied to
transcription in order to ensure the fate of nascent RNA. One
unique mechanism for proper RNA processing involves the
recruitment of a wide variety of accessory proteins to the
C-terminal domain (CTD) of the largest subunit of RNAPII, Rpb1
(for review, see Phatnani and Greenleaf, 2006). The CTD consists
of 27 repeats of the sequence Y1S2P3T4S5P6S7 in yeast and is not
conserved within the Rpb1 counterparts found in RNAP I and
RNAPIII, thereby serving as a unique signaling platform for
RNAPII. In order to form a competent initiation complex at the
promoter of a target gene, the CTD must exist in a hypophosphorylated state. Following assembly of the initiation complex,
the CTD exhibits increased phosphorylation on serine 5 (S5-P),
carried out by the cyclin-dependent kinase Kin28, a subunit of
the general transcription factor TFIIH (Komarnitsky et al., 2000;
Schroeder et al., 2000). This phosphorylation event is responsible
for the recruitment of the capping machinery, which begins processing of the nascent mRNA during early transcription (Cho
et al., 1997; Fabrega et al., 2003; Komarnitsky et al., 2000;
Schroeder et al., 2000). As transcription elongation progresses,
there is a change in the modification state of the CTD as serine
2 phosphorylation (S2-P) increases through the action of the
CTDK-I complex (Cho et al., 2001). Chromatin immunoprecipitation (ChIP) experiments have demonstrated that the increase
in S2-P occurs as transcription progresses through the open
reading frame (ORF) (Komarnitsky et al., 2000). As transcription
approaches the 3′ end of the ORF, the termination and polyadenylation machinery are recruited, some of which interact with the
S2-P CTD (Licatalosi et al., 2002; Meinhart and Cramer, 2004;
Kim et al., 2004). Although this transition state from S5-P to
S2-P during the transcription cycle is thought to distinguish
different phases of RNAPII elongation, the proteins involved in
the decrease of S5-P during elongation have yet to be identified.
In addition to the aforementioned CTD kinases, the actions of
CTD phosphatases are also required to manage the different
CTD modification states. Two CTD phosphatases, Fcp1 and
Ssu72, have been characterized in yeast (for review, see Meinhart
et al., 2005). Fcp1 has a preference for the S2-P modification and
has been shown by ChIP analysis to colocalize with RNAPII
throughout coding regions (Cho et al., 2001). In addition, Fcp1
mutants show an increase in the level of S2-P in the coding region
of genes, indicating that the phosphatase plays a role in dephosphorylation of S2-P during the transcription cycle (Cho et al.,
2001). Fcp1 is also thought to play a major role in RNAPII recycling
after the complex has dissociated from the coding region (Cho
et al., 1999; Kong et al., 2005; Archambault et al., 1997; Chambers
et al., 1995; Aygun et al., 2008). Ssu72, conversely, is a S5-P-specific CTD phosphatase and a component of the yeast cleavage
and polyadenylation factor (CPF), which is involved in mRNA processing at the 3′ ends of genes (Krishnamurthy et al., 2004; Reyes-Reyes and Hampsey, 2007). ChIP assays have revealed that
Ssu72 is predominately enriched at the 3′ ends of genes, with little
to no enrichment found at the promoter (Nedea et al., 2003; Ansari
and Hampsey, 2005). Although Fcp1 and Ssu72 have both been
Figure 3. The role of the carboxyl-terminal domain (CTD) code in selective binding of proteins. The effect of modification of residues in the heptapeptide on protein binding
has only been determined for a few of the many factors shown to interact with the CTD [10,55,74]. Only factors that have been shown to bind directly to the CTD are shown.
Modification of the heptapeptide is shown as in Figure 1. (a) CTD-interacting proteins are shown as light blue shapes, with their function during the protein-coding gene
transcription cycle [10] noted on the left. The CTD can adopt numerous conformations that can interact with a range of different CTD-interacting domains. These include the
WW domain (Pin1, Ess1), the FF domain (CA150), the Set2 Rpb1 interacting (SRI) domain (Set2), an unusual SH2 domain (Spt6) and the CID domain of Pcf11 and Rtt103
[10,54]. Prp40 has FF and WW domains. However, it is not clear exactly what mediates binding to the CTD [10,75]. (b) The small nuclear RNA gene-specific RNA 3′
processing complex, Integrator, is the only factor known to date that requires phosphorylation of serine residue in position 7 (Ser7) [6]. The role of phosphorylation of Ser2
and Ser5 in this interaction remains to be determined.
phosphorylation status of the CTD by inhibiting the Fcp1
phosphatase, and this has a negative effect on transcription [43].
Glycosylation
The mammalian CTD can also be modified by the addition
of a monosaccharide N-acetylglucosamine (O-GlcNAc), to
the hydroxyl groups of serine and threonine residues [44].
Interestingly, no glycosylation is detected on the phosphorylated form of the enzyme, suggesting that phosphorylation and glycosylation are mutually exclusive. It
is possible that the glycosylated form of pol II is recruited to
the promoter and that an N-acetylglucosaminase acts at
this stage to selectively remove the O-GlcNAc group before
phosphorylation occurs. However, to date, it has not been
clearly demonstrated that the low level of glycosylation
detected plays any role in gene expression.
Reading the code
A multitude of factors binds to the CTD (Figure 3). The
influence of specific CTD modifications on interaction is
only well understood in a few cases. However, each
described phosphorylation state clearly correlates with
the requirement of specific processing factors through
the transcription cycle (Figures 1 and 3).
Association of factors with the CTD early in the
transcription cycle
Pol II interacts with the general transcription factor TBP
and the multi-subunit Mediator complex (Box 2) through
unphosphorylated CTD [45,46], which might help tether pol
II to promoters. The Mediator complex plays an important
role in transducing signals from transcriptional regulators
to the general transcription machinery (for a review, see Ref.
[47]). Phosphorylation on Ser5 releases the Mediator complex from pol II, disrupting one of the bonds that stabilizes
promoter-associated complexes [48]. The Ser5-phosphorylated CTD then becomes a landing pad for guanylyltransferases responsible for the addition of the cap structure to
the 5′ end of newly synthesized RNAs (for review, see Ref.
[10]). Mammalian guanylyltransferase [mouse mRNA capping enzyme (Mce)1] recognizes and is allosterically activated by as few as two phosphorylated heptads in vitro [49].
Tyr1, Pro3, Pro6 and Ser5-PO4 side chains from each of two
heptapeptide repeats contribute to the interface between
yeast C. albicans guanylyltransferase, Cgt1 and the CTD
[50], emphasizing that, at least in some cases, a pair of
heptads is the recognition unit of the CTD code (Figure 2)
[51]. The close association of the transcription and capping
machineries early in the transcription cycle ensures that
nascent transcripts are accurately and efficiently capped.
Review
Cracking the RNA polymerase II CTD code
Sylvain Egloff and Shona Murphy
Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, UK
The carboxyl-terminal domain (CTD) of the largest
subunit of RNA polymerase II comprises multiple tandem
conserved heptapeptide repeats, unique to this eukaryotic RNA polymerase. This unusual structure provides a
docking platform for factors involved in various co-transcriptional events. Recruitment of the appropriate factors
at different stages of the transcription cycle is achieved
through changing patterns of post-translational modification of the CTD repeats, which create a readable ‘code’.
A new phosphorylation mark both expands the CTD code
and provides the first example of a CTD signal read in a
gene type–specific manner. How and when is the code
written and read? How does it contribute to transcription
and coordinate RNA processing?
The polymerase II CTD
In prokaryotes, one DNA-dependent RNA polymerase is
sufficient to transcribe all genes into the variety of RNA
molecules required by the cell. In eukaryotes, however, this
function is shared by three distinct multi-subunit enzymes,
each dedicated to transcription of specific gene types.
Polymerase I (pol I) transcribes only genes encoding 18S
and 28S rRNA, whereas pol III transcribes a range of short
genes, including those encoding tRNA and 5S RNA. Pol II
is responsible instead for transcription of the thousands of
protein-coding genes constituting the largest group of distinct individual genes in the eukaryotic genome. It also
transcribes genes encoding small noncoding RNAs [e.g.
spliceosomal small nuclear RNAs (snRNAs)]. The largest,
catalytic subunits of all three eukaryotic polymerases
share homology with one another and with the largest
subunit of bacterial polymerase [1]. However, an unusual
structure is uniquely found at the C terminus of the largest
subunit of pol II (Rpb1) in eukaryotes as evolutionarily
distant as yeast and humans (Box 1 and Ref. [2]). This
carboxyl-terminal domain (CTD) comprises multiple tandemly repeated heptapeptides with the consensus
sequence Tyr-Ser-Pro-Thr-Ser-Pro-Ser (Y1S2P3T4S5P6S7),
with up to 52 repeats in the mammalian protein (Box 1).
Deletion of the CTD in mouse, Drosophila or yeast is lethal,
demonstrating that this structure is essential for life.
However, transcription, both in vitro and on transiently
transfected templates, can proceed without the CTD [2–4],
pointing to an ancillary, rather than fundamental role for
this structure. Our current understanding of the role of the
CTD comes mainly from studying expression of protein-coding genes in either the baker’s yeast, Saccharomyces
cerevisiae, or higher eukaryotic tissue culture cells.
Corresponding author: Murphy, S. ([email protected]).
However, recent studies of mammalian snRNA gene
expression have shed new light on the function of this
enigmatic domain.
CTD function is programmed by reversible posttranslational modification
The CTD serves as a scaffold for the interaction of a wide
range of nuclear factors and plays a major role in transcription and co-transcriptional RNA processing in expression of protein-coding genes, mammalian snRNA genes
and yeast small nucleolar RNA (snoRNA) genes [5–15].
The CTD extends from the pol II core enzyme close to the
RNA exit channel [16], where it is ideally placed to influence RNA processing reactions, through direct or indirect
interaction with components of the RNA processing
machinery.
Dynamic, reversible modification of the CTD underpins the efficient and accurate completion of transcript
synthesis (Figure 1). More specifically, recruitment of
transcription and processing factors at different steps
of the transcription cycle is closely linked to the modification state of the CTD, particularly the position of
phosphorylated serines. For example, CTD phosphorylation helps recruit capping factors to the 5′ end of newly
synthesized RNA and 3′ processing factors to poly(A)
sites. Accordingly, it has been proposed that differential
modification of residues within the heptapeptide generates a CTD code critical for the timely recruitment of
different proteins [17,18]. Phosphorylation has been
detected on tyrosine, threonine and all three serines of
the CTD repeat in vivo. Glycosylation of serines and
threonines and isomerization of the two proline residues
(Pro3 and Pro6) can also occur. The potential for the CTD
heptapeptide to be modified at every residue enables a
wide range of signaling combinations, potentially readable as a binary code, where the presence or absence of
modification can influence factor–CTD interactions.
Here, we provide an overview of the elements that can
contribute to a CTD code and explain how this code is
‘read’ by proteins involved in transcription and RNA
processing (Figures 2 and 3).
Creating the code
A range of enzymes participates in the dynamic modification of the CTD. These include kinases and phosphatases
responsible for the addition or removal of phosphates,
respectively. Likewise, glycosyltransferases and deglycosylases are implicated in reversible glycosylation. In
addition, peptidyl-prolyl bonds, which exist in either cis
0168-9525/$ – see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.tig.2008.03.008 Available online 3 May 2008
252 Nucleus and gene expression
CTD conformation is modified by the peptidyl-prolyl
isomerase Pin1 (Ess1 in yeast). How this enzyme affects
recruitment of mRNA processing factors is unknown;
however, it does affect CTD kinase and phosphatase
activity by remodelling the substrate [24]. The CTD is
also O-glycosylated in a manner that is mutually exclusive
with serine phosphorylation, suggesting that there could
be cross-talk between these two modifications [25].
CTD peptides do not assume fixed conformations regardless of phosphorylation state [26]. Rather, the CTD is
quite malleable in the grip of its binding partners. There
are three identified protein domains that recognize CTD
heptads: the CTD interacting domain [27], the WW
domain [28] and the FF domain [29], but other proteins
that lack these domains can also bind directly to the CTD.
There are three available structures of phosphorylated
CTD peptides bound to different partners: Pin1, which
binds via a WW domain; the guanylyltransferase Cgt1; and
the CTD interacting domain of Pcf11, a cleavage/polyadenylation factor [28,30,31]. These structures show
quite different CTD conformations, although the prolyl
bonds are exclusively trans isomers [32]. The Pcf11 CTD-interacting domain, which comprises eight α-helices in a
superhelical arrangement, binds Ser 2-phosphorylated
heptads by an induced-fit mechanism [26,30]. The
extreme flexibility of the CTD helps explain how it
can bind so many partners. The changing pattern of
covalent and non-covalent modifications between the 5′
and 3′ ends of the gene may constitute a CTD code [33]
that is translated into a sequence of processing factor
binding and release reactions as transcription proceeds.
The code seems to be quite a loose one, however, as Ser 2
phosphorylation is not essential for viability in yeast [34].
Capping enzymes: processing factors with
transcriptional functions
The first mRNA processing factors to be recruited to the
CTD during the transcription cycle are the capping
enzymes RNA triphosphatase, guanylyltransferase and
7-methyltransferase. Yeast guanylyltransferase and
7-methyltransferase bind directly and independently to
the phosphorylated CTD and phosphorylation of Ser 5 at
the promoter by Kin28, a subunit of TFIIH, is necessary
for their recruitment [22,35]. Hence, recognition of a Ser5-phosphate ‘code word’ permits recruitment of
capping enzymes at the right time to perform co-transcriptional capping. CTD binding not only localizes
the mammalian guanylyltransferase but also allosterically
regulates it, reducing the Km for GTP [10]. Removal of
Ser5 phosphate by phosphatases early in elongation is
coupled to release of capping enzymes. Guanylyltransferase is released within the first 500 bases of the gene,
whereas 7-methyltransferase remains bound throughout
the length of the gene [22,35] (Figures 1 and 2).
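The idea of a phosphorylation 'code word' that is read by particular factors can be sketched as a small lookup. The Python below is a deliberately simplified illustration, not a model from this review: it uses only the two pairings stated in the text (Ser5 phosphate with the capping enzymes, Ser2 phosphate with Pcf11), and the `Heptad` class, the `READERS` table and the function names are hypothetical. Real recruitment is looser than a lookup table, as the text itself notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Heptad:
    """One CTD heptad repeat, tracking only Ser2 and Ser5 phosphorylation."""
    ser2_p: bool = False
    ser5_p: bool = False


# Readers named in the text, keyed by the mark they recognise (illustrative table).
READERS = {
    "Ser5-P": ("guanylyltransferase", "7-methyltransferase"),
    "Ser2-P": ("Pcf11",),
}


def marks(heptad: Heptad) -> list[str]:
    """List the phosphorylation marks present on a heptad."""
    found = []
    if heptad.ser5_p:
        found.append("Ser5-P")
    if heptad.ser2_p:
        found.append("Ser2-P")
    return found


def candidate_readers(heptad: Heptad) -> list[str]:
    """Factors that could, in this toy model, recognise the heptad."""
    return [reader for mark in marks(heptad) for reader in READERS[mark]]


promoter_proximal = Heptad(ser5_p=True)   # high Ser5-P near the 5' end
three_prime = Heptad(ser2_p=True)         # Ser2-P dominates towards the 3' end

print(candidate_readers(promoter_proximal))
print(candidate_readers(three_prime))
```

An unphosphorylated heptad returns an empty list here. That is a limit of the toy model rather than a claim about the biology, since proteins lacking the known CTD-binding domains can also bind the CTD.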
Capping enzymes participate in a network of interactions
that regulate early steps in transcription [36]. The yeast
7-methyltransferase (Abd1) stabilizes pol II on some
promoters and both 7-methyltransferase and guanylyltransferase (Ceg1) stimulate early elongation [37,38],
whereas the RNA triphosphatase (Cet1) inhibits re-initiation [39]. Stimulation of transcription by Abd1 occurs in a
methylation-defective mutant and is therefore independent of capping itself [38]. Human capping enzyme (CE)
stimulates promoter escape (Figure 1) by countering the
negative elongation factor NELF, and CE recruitment is
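[Figure 1 schematic: pol II elongation complex near the 5′ end of a gene, where the Ser5 PO4:Ser2 PO4 ratio is high. Labelled components: Pol II, CTD, GT, MT, CBC, C/P and Rat1; P2 and P5 mark phosphorylated Ser2 and Ser5.]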
The CTD is phosphorylated on Ser5 residues of the
heptads by the TFIIH-associated kinase when transcription initiates, and later the Ser 2 residues are phosphorylated by the kinases CTK1 (yeast) and PTEFb (positive
transcription elongation factor b or CDK9) [18]. The pattern of phosphates on the CTD is also shaped by
several phosphatases that differ in their preferences for
Ser2 or Ser5 phosphate: Fcp1 [19], Ssu72 [20] and SCPs
[21]. The ratio of Ser5:Ser2 phosphorylation is high at the
5′ end and lower at the 3′ end [22]. A technical limitation
is that analysis of Ser2 phosphorylation has relied on one
monoclonal antibody, H5, which has highest affinity for
heptads phosphorylated on both Ser2 and Ser5 [23].
[Figure 1, continued. Labels: GT, MeGpppN (capped RNA 5′ end), C/P, Rat1, MT, P2, P5. Figure source: Current Opinion in Cell Biology.]
Figure 1. Co-transcriptional recruitment of pre-mRNA processing factors at
the 5′ end of a gene. Capping enzymes (guanylyltransferase [GT] and
7-methyltransferase [MT]), cap binding complex (CBC), cleavage/
polyadenylation factors (C/P), and the RNA 5′–3′ exonuclease, Rat1,
are indicated. Processing factors interact with pol II elongation complex
via the CTD (green line) and probably also via the nascent RNA (red line).
Phosphorylation of Ser2 and Ser5 residues in the CTD heptad
repeats are marked P2 and P5. Capping enzymes stimulate early
steps in pol II transcription including promoter clearance denoted by
the thick arrow [17,38]. Following addition of the cap, MeGppp,
GT is released from the elongation complex.
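[Figure 2 schematic: pol II elongation complex within the gene and at its 3′ end, where the Ser5 PO4:Ser2 PO4 ratio is low. Labelled components: spliceosome, CBC, MT, C/P, Rat1, capped RNA (MeGpppN), the AAUAAA poly(A) site and the poly(A) tail; P2 and P5 mark phosphorylated Ser2 and Ser5. Figure source: Current Opinion in Cell Biology.]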
Figure 2. Co-transcriptional splicing and cleavage/polyadenylation. The spliceosome is recruited to intron RNA (black line) and probably also to the pol II
elongation complex, at least in metazoans. Co-transcriptional splicing releases the intron (black lariat). Note that while spliceosome assembly
probably occurs co-transcriptionally on most introns, excision of the intron can occur post-transcriptionally [7]. More cleavage/polyadenylation
factors (C/P) and Rat1 are detectable at the 3′ end than at the 5′ end of the gene. These factors interact with the nascent RNA at the poly(A)
site, AAUAAA, to cleave and polyadenylate the transcript. The Rat1 exonuclease degrades RNA downstream of the cleavage site (scissors) and
helps trigger termination of transcription. Whether or not processing factors are released from pol II prior to termination is not known.
enhanced by direct binding to the elongation factor Spt5
[17]. In summary, capping enzymes have a previously
unsuspected ability to manipulate early steps in transcription. In this way they may operate a checkpoint to ensure
that the cap has been added before commitment to
productive elongation of the transcript. It is not known
whether recruitment of capping enzymes is regulated;
however, it is intriguing that a viral activator of elongation,
HIV Tat, can stimulate capping, at least in vitro [40].
Recruiting the spliceosome: RNA versus
protein recognition
Splicing factors are rapidly recruited to nascent transcripts
and many introns are removed co-transcriptionally, while
others are marked co-transcriptionally for post-transcriptional splicing [7]. Like capping enzymes, spliceosomal
UsnRNPs can also enhance elongation via the co-factor
TAT-SF1 [41]; therefore it appears that pol II may help
itself along the gene by the sequential recruitment of
mRNA processing factors.
Of the three major pre-mRNA processing events, least is
known about recruitment of the splicing apparatus. The
simplest situation is exemplified by budding yeast, where
introns are short and few genes have more than one. In
yeast, U1snRNP, which binds the 5′ splice site, associates
with the elongation complex in the intron, not upstream,
and remains bound in the downstream exon. This binding
pattern suggests initial recruitment by recognition of
intron RNA rather than protein subunits of the elongation
complex [42] (Figure 2). On the other hand, recruitment
by RNA recognition alone is not easily reconciled with
the binding of the U1snRNP protein Prp40 to the phosphorylated CTD via its FF domains [12]. It is possible
that splicing factors like Prp40 are recruited to the RNA
and later handed off to the CTD or other components of
the elongation complex. A second yeast splicing factor,
Sub2, which doubles as a general mRNA export factor,
may be recruited by a protein–protein interaction followed by hand-off to the RNA [43]. In Chironomus this
factor (UAP56) was found on transcription complexes in
both introns and exons [44]. It should be noted that hand-off of proteins on or off the nascent substrate could
substantially affect the results of ChIP experiments by
altering cross-linking efficiency.
Whether co-transcriptional spliceosome assembly occurs
by one-step binding of a pre-assembled penta-snRNP or
by sequential binding of individual snRNPs is not known.
ChIP (chromatin immunoprecipitation) analysis in yeast
suggests that different snRNPs are recruited at different
times, which is consistent with a sequential pathway for
spliceosome assembly in vivo (M Rosbash, K Neugebauer, personal communication).
Rules of engagement: co-transcriptional recruitment of
pre-mRNA processing factors
David L Bentley
The universal pre-mRNA processing events of 5′ end capping,
splicing, and 3′ end formation by cleavage/polyadenylation
occur co-transcriptionally. As a result, the substrate for mRNA
processing factors is a nascent RNA chain that is being
extruded from the RNA polymerase II exit channel at 10–30
bases per second. How do processing factors find their
substrate RNAs and complete most mRNA maturation before
transcription is finished? Recent studies suggest that this task
is facilitated by a combination of protein–RNA and protein–
protein interactions within a ‘mRNA factory’ that comprises the
elongating RNA polymerase and associated processing
factors. This ‘factory’ undergoes dynamic changes in
composition as it traverses a gene and provides the setting
for regulatory interactions that couple processing to
transcriptional elongation and termination.
Addresses
Department of Biochemistry and Molecular Genetics, University of
Colorado School of Medicine, UCHSC at Fitzsimons, Mail Stop 8101,
PO Box 6511, Aurora Colorado 80045, USA
Corresponding author: Bentley, David L
Current Opinion in Cell Biology 2005, 17:251–256
This review comes from a themed issue on
Nucleus and gene expression
Edited by Christine Guthrie and Joan Steitz
Available online 13th April 2005
0955-0674/$ – see front matter
© 2005 Elsevier Ltd. All rights reserved.
DOI 10.1016/j.ceb.2005.04.006
Introduction: how does coupling to
transcription enhance pre-mRNA
processing?
How mRNA processing is facilitated by coupling to RNA
polymerase II (pol II) transcription is a fascinating problem that is the subject of several excellent reviews [1–5].
Transcription-coupled processing differs from uncoupled
processing in that the substrate RNA is a growing and
progressively folding structure rather than a static full-length pre-mRNA. The importance of coupling is suggested by the fact that processing of full-length synthetic
pre-mRNAs in injected oocytes is less efficient than co-transcriptional processing in vivo [6]. In vivo, introns can
be removed and the poly(A) site cleaved by the time
polymerase has transcribed only 1 kb beyond the processing sites, probably within 30 s [7]. By contrast, in vitro
processing uncoupled from transcription usually takes
>20 min. Optimal processing is achieved by coupling
with transcription by RNA pol II and not other polymerases because pol II is uniquely equipped with an
unusual domain on its large subunit, called the C-terminal
domain (CTD), that provides a landing pad for mRNA
processing factors.
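The timing comparison above can be checked with simple arithmetic. The Python sketch below assumes a constant elongation rate and uses only figures quoted in this review (10–30 bases per second, 1 kb, 20 min); the function and variable names are illustrative and do not come from any published method.

```python
def seconds_to_transcribe(bases: int, rate_bases_per_s: float) -> float:
    """Time for pol II to extend the nascent RNA by `bases` at a constant rate."""
    if rate_bases_per_s <= 0:
        raise ValueError("rate must be positive")
    return bases / rate_bases_per_s


DISTANCE_BASES = 1_000        # "1 kb beyond the processing sites"
RATES = (10.0, 30.0)          # "10–30 bases per second"
UNCOUPLED_SECONDS = 20 * 60   # uncoupled in vitro processing "usually takes >20 min"

# Time needed to transcribe 1 kb at each end of the quoted rate range.
times = {rate: seconds_to_transcribe(DISTANCE_BASES, rate) for rate in RATES}

for rate, t in times.items():
    ratio = UNCOUPLED_SECONDS / t
    print(f"{rate:.0f} bases/s: {t:.0f} s for {DISTANCE_BASES} bases; "
          f"a 20 min reaction is about {ratio:.0f} times longer")
```

Under these assumptions, 1 kb takes about 33 s at 30 bases per second and 100 s at 10 bases per second. The 30 s estimate for co-transcriptional processing therefore corresponds to the fast end of the quoted rate range.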
Coupling of pol II transcription with processing can
influence processing reactions in at least three ways. First,
localization: in its simplest form, coupling positions
mRNA processing factors at the elongation complex,
raising their local concentration in the vicinity of the
nascent transcript. Second, kinetic coupling: the rate of
transcript elongation can have profound effects on RNA
folding and the assembly of RNA–protein complexes and
has been shown to affect the choice between alternative
processing sites [8,9]. Third, allostery: contacts between
mRNA processing factors and the pol II elongation complex can allosterically activate or inhibit mRNA processing factors [10]. Here I will review recent progress in our
understanding of how the factors that carry out mRNA
capping, splicing and 3′ end formation engage the pol II
elongation complex.
The pol II C-terminal domain: a recruitment
platform
The CTD is an essential domain in the large subunit of
pol II, but is absent from the related subunits of RNA
polymerases I and III. This domain comprises tandem
heptads whose consensus sequence, Y1S2P3T4S5P6S7, is
identical across animals, plants and some protozoa. The
in vivo functional unit of the CTD appears to be a pair of
tandem heptads [11]. A recent proteomic analysis identified over 100 yeast proteins that bind to the phosphorylated CTD [12]. The CTD is more than a passive
landing pad, however. Among its numerous roles, the
CTD can allosterically regulate capping enzymes and
regulate transcriptional elongation and termination [13].
CTD deletion prevents efficient co-transcriptional capping, splicing and 3′ end formation in metazoans [14,15].
Although it is essential for co-transcriptional processing,
the CTD is dispensable for processing uncoupled from
transcription in injected Xenopus oocytes [6], suggesting
that processing at the site of transcription differs from
post-transcriptional processing elsewhere in the nucleus.
Important clues to how the CTD works have come from in
vitro systems, in which it can stimulate processing reactions, in some cases even in the absence of ongoing
transcription [16,17].
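The repeat structure described in this section can be made concrete with a short sketch. The Python below splits a CTD-like sequence into heptads and reports deviations from the consensus Y1S2P3T4S5P6S7. The sequence `toy_ctd` is invented for illustration and is not a real CTD sequence. Listing adjacent heptads as overlapping pairs is only one possible reading of "a pair of tandem heptads" and is an assumption made here.

```python
CONSENSUS = "YSPTSPS"  # Y1 S2 P3 T4 S5 P6 S7, as given in the text


def split_heptads(ctd: str) -> list[str]:
    """Split a sequence into consecutive 7-residue repeats, dropping any partial tail."""
    usable = len(ctd) - len(ctd) % 7
    return [ctd[i:i + 7] for i in range(0, usable, 7)]


def mismatches(heptad: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, consensus residue, observed residue) per deviation."""
    return [
        (pos, expected, observed)
        for pos, (expected, observed) in enumerate(zip(CONSENSUS, heptad), start=1)
        if expected != observed
    ]


# Hypothetical four-repeat sequence: two consensus heptads, then two variants.
toy_ctd = "YSPTSPS" + "YSPTSPS" + "YSPTSPK" + "YTPTSPS"

heptads = split_heptads(toy_ctd)
for index, heptad in enumerate(heptads, start=1):
    print(index, heptad, mismatches(heptad))

# Adjacent heptads taken two at a time (assumed reading of "tandem pair").
tandem_pairs = list(zip(heptads, heptads[1:]))
print(len(tandem_pairs), "tandem pairs")
```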
Transcription termination
needed to complete a transcription cycle (frees RNA pol II for further rounds of
initiation)
in mammals and yeast, linked in many ways to
the process of mRNA 3' end formation (a role
for polyadenylation factor recruitment by
CTD?)
the site of termination of transcription is not
the same as the site of polyadenylation; rather, transcription terminates somewhere
downstream from the poly(A) site
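The distinction between the poly(A) site and the downstream termination region can be illustrated with a toy search for the AAUAAA signal mentioned in the Figure 2 caption. The Python below is a minimal sketch: the transcript is an invented sequence and the function name is hypothetical. It only locates the hexamer and shows the RNA that lies beyond it. It does not predict where cleavage or termination actually occurs.

```python
POLY_A_SIGNAL = "AAUAAA"


def find_signal(rna: str, signal: str = POLY_A_SIGNAL) -> list[int]:
    """Return 0-based start positions of every occurrence, including overlapping ones."""
    positions = []
    start = rna.find(signal)
    while start != -1:
        positions.append(start)
        start = rna.find(signal, start + 1)
    return positions


# Hypothetical nascent transcript that extends past the poly(A) signal.
toy_transcript = "GGCUAAUAAAGCUUCGAUCGGAUCC"

hits = find_signal(toy_transcript)
print("signal starts at:", hits)

# RNA synthesised after the signal: pol II is still transcribing this region,
# so termination must lie somewhere downstream of the poly(A) site.
downstream = toy_transcript[hits[0] + len(POLY_A_SIGNAL):]
print("downstream RNA:", downstream, f"({len(downstream)} bases)")
```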
Transcription termination and RNA turnover - a
function for the nuclear 5′→3′ exonuclease Xrn2
Figure 2. A Hybrid of the Torpedo and Allosteric Models
It is proposed that the exonuclease cooperates with an unknown helicase and/or allosteric modulator
of the polymerase, converting it from processive to nonprocessive form, ultimately disrupting the RNA-DNA hybrid and releasing the polymerase. Symbols and color coding as in Figure 1.
From Luo and Bentley, Cell 119, 911-914, 2004
• Polyadenylation, transcription termination, and pol II recycling may be
connected through the exonuclease Xrn2 (aka Rat1)