Transcription • Transcription factors • Chromatin • RNA polymerase II

RNA polymerase II structure – without one important domain
Armache K et al. J. Biol. Chem. 2005;280:7131-7134. ©2005 by American Society for Biochemistry and Molecular Biology

Unique features of the C-terminal domain repeats of RNA polymerase II
Each heptad has several possible posttranslational modifications
The modifications vary in the course of transcription

Molecular Cell Previews
Voss et al., 2015). What role these modifications play in establishing different phosphorylation states is not known. Of course, the experimental paradigm presented in these two papers will be an effective and highly informative route to addressing such questions.

Figure 1. RNA Polymerase II with CTD Indicating Multiple Heptad Repeats
(A) Five possible phosphorylation sites in each repeat. (B) Sparse phosphorylation at S5 close to the start of transcription. (C) Sparse phosphorylation at S2 near the cleavage/polyadenylation site.

REFERENCES
Buratowski, S. (2003). Nat. Struct. Biol. 10, 679–680.
Corden, J.L. (2013). Chem. Rev. 113, 8423–8455.
Dias, J.D., Rito, T., Torlai Triglia, E., Kukalev, A., Ferrai, C., Chotalia, M., Brookes, E., Kimura, H., and Pombo, A. (2015). eLife 4, 4.
Egloff, S., Dienstbier, M., and Murphy, S. (2012). Trends Genet. 28, 333–341.
Eick, D., and Geyer, M. (2013). Chem. Rev. 113, 8456–8490.
Komarnitsky, P., Cho, E.J., and Buratowski, S. (2000). Genes Dev. 14, 2452–2460.
Schüller, R., Forné, I., Straub, T., Schreieck, A., Texier, Y., Shah, N., Decker, T.-M., Cramer, P., Imhof, A., and Eick, D. (2016). Mol. Cell 61, this issue, 305–314.
Schwer, B., and Shuman, S. (2011). Mol. Cell 43, 311–318.
Suh, H., Ficarro, S.B., Kang, U.-B., Chun, Y., Marto, J.A., and Buratowski, S.
(2016). Mol. Cell 61, this issue, 297–304.

Review Trends in Genetics Vol.24 No.6

× 34 repeats in Arabidopsis

Figure 2. The updated carboxyl-terminal domain (CTD) code. All the possible serine phosphorylation and proline isomerization combinations are shown. It has not been ruled out that Tyr1 and Thr4 phosphorylation co-exists with serine phosphorylation. The glycosylation state of the serine residue in position 2 (Ser2), Thr4, Ser5 and Ser7 may also play a role in recognition of the CTD by factors, and multiple differentially glycosylated forms are possible. Before and after recruitment, subsets of these combinations will influence the function of polymerase II (pol II). Insertion of an extra amino acid between heptapeptide pairs is tolerated, whereas insertion between individual repeats is not, suggesting that the unit of recognition is a heptapeptide pair [51]. This would increase the potential complexity of the CTD code (64 potential combinations for phosphorylation sites and 16 for proline isomerization) but reduce the number of different protein binding sites on the same CTD. Although the code is potentially very complex, with each of 52 repeats or 26 pairs having a different set of modifications, in reality the number of combinations is restricted by the recruitment of different sets of modification enzymes at the appropriate points. In addition, some modifications will enhance or preclude others. See also Refs. [17,18]. Modification of the heptapeptide is shown as in Figure 1.

processing the 3′ end of the transcripts and/or transcription termination. However, substitution of Ser7 by the non-phospho-acceptor alanine in each heptapeptide has little effect on expression of endogenous or transiently transfected protein-coding genes [6,22].
By contrast, alanine substitution of Ser7 causes a marked defect in the transcription of human snRNA genes and in formation of the 3′ end of transcripts [6], indicating that at least one element of the CTD code can be read in a gene-specific manner.

Phosphorylation of other residues in the heptapeptide
The highly conserved Tyr1 is a potential site for tyrosine kinases, and some tyrosine phosphorylation occurs in vivo [19]. Ectopic expression of c-Abl both increases Tyr1 phosphorylation of endogenous pol II and activates expression of HIV templates [39]. However, no clear function has been attributed to this modification. Likewise, some threonine phosphorylation is detected on the CTD of pol II in mammalian cells [20], but the functional significance is unclear. Because there are 15 threonines in positions 2, 5 or 7, it is possible that these rather than Thr4 are subject to phosphorylation. Finally, in vertebrates, the last repeat of the CTD is followed by a conserved 10 amino acid extension [2], which contains a constitutively phosphorylated casein kinase (CK)II site [40]. Deletion of this extension results in degradation of the CTD and a reduced ability to transcribe and process RNA [40,41]. However, mutation of the CKII target site does not affect pol II CTD stability. Phosphorylation of Tyr1 by c-Abl requires this extension [42], implicating Tyr1 phosphorylation in functions specific to vertebrates.

Proline isomerization
There are two peptidyl-prolyl bonds in each consensus heptapeptide, and these can be in either cis or trans orientation, resulting in four possible configurations of each repeat for the mammalian CTD (Figure 2). At least one protein involved in recognition of poly(A) sites selectively binds to repeats with prolines in an all-trans conformation (see below), and peptidyl-prolyl cis/trans isomerases (PPIases) might be involved in isomerization in vivo.
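The counts quoted for the CTD code (four proline configurations per repeat, and 64 phosphorylation and 16 proline-isomerization combinations per heptapeptide pair) can be checked by simple enumeration. The sketch below is only an illustration: it assumes that the phosphorylation count refers to the three serines (Ser2, Ser5, Ser7), that each site is either modified or not, and that the two repeats of a pair vary independently. The function and variable names are made up for this example.

```python
from itertools import product

# Assumed sites: the three serines for phosphorylation, the two prolines for isomerization.
SERINES = ("Ser2", "Ser5", "Ser7")
PROLINES = ("Pro3", "Pro6")


def states(sites, options):
    """Enumerate every assignment of one option to each site."""
    return [dict(zip(sites, combo)) for combo in product(options, repeat=len(sites))]


# One heptapeptide repeat.
phospho_per_repeat = states(SERINES, (False, True))      # phosphorylated or not
proline_per_repeat = states(PROLINES, ("cis", "trans"))  # cis or trans bond

# A heptapeptide pair, treating the two repeats as independent.
phospho_per_pair = len(phospho_per_repeat) ** 2
proline_per_pair = len(proline_per_repeat) ** 2

print("per repeat:", len(phospho_per_repeat), "phospho,", len(proline_per_repeat), "proline")
print("per pair:  ", phospho_per_pair, "phospho,", proline_per_pair, "proline")
```

Under these assumptions the enumeration gives 8 and 4 states per repeat, and 64 and 16 per pair, matching the figures in the text. Adding Tyr1 or Thr4 phosphorylation, or glycosylation, would enlarge the count accordingly.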
Mammalian Pin1 and yeast ESS1 proteins, which possess PPIase activity, are good candidates for proteins regulating CTD structure and function through proline isomerization (see Ref. [36] for review). They interact with phospho-CTD and display an unusual substrate specificity for peptides with phospho-serine and phospho-threonine residues preceding the proline. Mutations in ESS1 are associated with pre-mRNA 3′ processing defects, whereas there is no evidence that Pin1 plays a positive role in polyadenylation in mammals. Indeed, it has not been unequivocally shown that binding of any transcription or processing factors to the CTD is influenced by enzymatic proline isomerization. However, Pin1 influences the

The proline peptide bond
- proline, an imino acid
- figure labels: 180º; trans conformation favoured ~1000-fold over cis; R = side chain; O=C-N-H is planar; trans ~93%, cis ~7%
- cis conformation rare except for proline
- cis can be 10-30%, depending on the nature of the Xaa-Pro bond
www.sfu.ca/~leroux/class_L05.ppt

Proline cis-trans isomerization
- slow because it involves rotation about a partial double bond (t1/2 between 10-100 sec at 25ºC)
- cis-trans equilibria more common in flexible regions of native proteins (e.g., coils) OR: during protein folding
- figure label: partial double-bond character
- strong acids favour cis-trans isomerization by protonating the nitrogen atom
- proline residues disrupt alpha-helices; often found in turns
- cis-trans isomerization could be used as a molecular switch

Catalysis of cis-trans isomerization
- simple reaction; does not involve breaking or forming bonds
- mechanism: catalysis by distortion and transition state containing partially-rotated C-N bond
- this would result in a reduced partial double-bond character
- PPIase, Peptidyl Prolyl Isomerase, catalyzes proline cis-trans isomerization
- active site of PPIase hydrophobic in character; conserved Arg residue of a PPI might be involved in H-bond formation with N:, producing C-N bond with more single-bond character

Peptidyl prolyl isomerases
Three classes are known:
• Cyclophilins
- ubiquitous; 11 different ones found in S. cerevisiae; not essential for viability
- bind cyclosporin A
• FKBP binding proteins
- no sequence similarity with cyclophilins; many different members found in eukaryotes, as well as prokaryotes; not essential for viability. Yeast mutant lacking all its cyclophilins and FKBP binding proteins still alive!
- bind immunosuppressants FK506 and rapamycin (but not cyclosporin A)
- both cyclophilins and FKBPs form complexes with the molecular chaperone Hsp90, perhaps to catalyze cis-trans isomerization as well as to assist folding (or modulate protein conformation)
• Parvulins
- not related to cyclophilins or FKBP binding proteins, and are not inhibited by cyclosporin A, FK506 or rapamycin
- occur as small proteins of <100 amino acids or as domains of larger proteins
- have high PPIase activity
.... and they differ in their activities and specificities, cellular localization, and binding partner(s)
Implicated in: protein folding, protection against stress, apoptosis, cell cycle progression, etc.

Figure 1. Modification of the polymerase II (Pol II) carboxyl-terminal domain (CTD) heptapeptide during transcription of protein-coding genes. (a) The CTD of pol II, which is recruited by preinitiation complexes at the promoter, is unphosphorylated [10] but might be glycosylated [44]. Phosphorylation of the CTD before initiation is thought to block recruitment of pol II, emphasizing that the CTD code is read differently before and after initiation of transcription. (b) The CTD is located close to the RNA exit channel [16]. Phosphorylation of the serine residue in position 5 (Ser5), by the CDK7 subunit of the general transcription factor TFIIH, just after initiation, helps to recruit and activate enzymes that add a methylguanosine cap (filled black circle) to the 5′ end of the emerging transcript [10].
Because glycosylated and phosphorylated residues are not found together on the same CTD, glycosylation must have been removed by this point. (c) Subsequent phosphorylation of Ser2 by the CDK9 subunit of positive-transcription elongation factor b (P-TEFb) activates elongation and RNA processing [10,27]. Phosphorylation of Ser7 also occurs during transcription and can peak at the 3′ end [22] but has, thus far, no known function in protein-coding gene expression. In yeast and in some mammalian genes, Ser5 is dephosphorylated toward the 3′ end of the transcription unit [10,24,25]. (d) After cleavage and polyadenylation of the 3′ end of the pre-mRNA, directed by the poly(A) site, dephosphorylation of the CTD may help pol II to disengage ready for another round of transcription [10,14,15]. Glycosylation is indicated by circles containing Gs (light blue), phosphorylation by circles containing Ps (orange), and trans isomerisation of prolines by a t (red) above the amino acid. Only one heptapeptide of the multiple heptapeptide repeats is shown on the schematic of pol II (green) with a CTD ‘tail’.

Ser7 phosphorylation extends the CTD code and is part of a gene specific signal
Using novel anti-CTD monoclonal antibodies, it has now been shown that the serine residue in position 7 (Ser7) of the heptapeptide repeat is also phosphorylated in vivo during transcription of snRNA genes and a range of protein-coding genes [6,22]. This mark expands the complexity of the CTD code (Figure 2 and Ref. [18]). ChIP analysis indicates that this phosphorylation mark peaks toward the 3′ end of the transcribed T-cell receptor β (TCRβ) gene, in a similar way to phosphorylation of Ser2, suggesting that Ser7 phosphorylation has a role in

Molecular Cell
Rtr1 Regulates RNAPII CTD Phosphorylation

Figure 3. Rtr1 Localizes to Open Reading Frames
(A) ChIP assays were performed using antibodies directed against different forms of RNAPII and Rtr1 as indicated.
Quantitation of the S5-P (black circles) and S2-P (blue squares) occupancy compared to the occupancy of Rtr1 (red triangles) across the PMA1 genomic loci. (B) Quantitation of protein occupancy across the PYK1 genomic loci as performed above. These data are shown as the average percent maximum IP ± SD for comparison purposes between the different antibodies (n = 3). The midpoint of the PCR amplicon was used as the distance from the ATG. A schematic representation of the genomic loci for each gene is shown below the graphs to illustrate the approximate location of the promoter and polyadenylation regions.

(Figure S2). This localization, in combination with our other results, could indicate a role for Rtr1 in the transition from S5-P CTD to S2-P CTD in vivo.

Rtr1 Is Found in the Open Reading Frames of PMA1 and PYK1
To determine whether Rtr1 colocalizes with elongating, initiating, or terminating RNAPII in vivo, chromatin immunoprecipitation (ChIP) assays were performed with an Rtr1-HFH strain using antibodies directed against HA. The genomic loci of PMA1 and PYK1 were analyzed for Rtr1, S5-P, and S2-P RNAPII occupancy using qPCR (Figures 3A and 3B). Both of these genes are highly expressed and have previously been used to study the localization of different phosphorylated forms of RNAPII (Komarnitsky et al., 2000). As shown in Figures 3A and 3B, a peak in S5-phosphorylated RNAPII is observed at the 5′ end of both PMA1 and PYK1 and decreases prior to an observed increase of S2-P, in agreement with previous results (Komarnitsky et al., 2000). Surprisingly, we found a strong and distinct peak of Rtr1 that localized to a region of both genes between the enriched regions of S5-P and S2-P.
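The Figure 3 legend reports occupancy as the average percent maximum IP ± SD (n = 3), a scaling that lets profiles from different antibodies share one axis. The sketch below shows one way such a normalization could be computed. It is not the authors' analysis pipeline: scaling each replicate to its own maximum before averaging is an assumption, and the input numbers are invented purely for illustration.

```python
from statistics import mean, stdev


def percent_max_profile(replicates):
    """Return (mean, sd) of percent-maximum signal at each amplicon.

    replicates: list of replicate experiments, each a list of raw IP
    signals ordered by amplicon position along the gene.
    Assumption: every replicate is scaled so that its own maximum is 100.
    """
    scaled = [[100.0 * value / max(rep) for value in rep] for rep in replicates]
    per_amplicon = zip(*scaled)  # regroup values by amplicon position
    return [(mean(values), stdev(values)) for values in per_amplicon]


# Hypothetical raw signals for three replicates at three amplicons.
raw = [
    [1, 4, 2],
    [2, 8, 6],
    [3, 12, 3],
]

for position, (avg, sd) in enumerate(percent_max_profile(raw), start=1):
    print(f"amplicon {position}: {avg:.1f} ± {sd:.1f} % max IP")
```

Because each profile is expressed relative to its own peak, the position of a peak can be compared between antibodies even when their absolute IP efficiencies differ, which is the stated purpose of the scaling in the legend.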
This peak was also observed after normalization of the levels of Rtr1 to the level of RNAPII occupancy

Rtr1 Is Required for S5-P Dephosphorylation during Early RNAPII Elongation
The Rtr1 deletion strain was analyzed for defects in CTD phosphorylation in vivo by western blot analysis of whole-cell extracts. Loss of Rtr1 resulted in increased levels of S5-P CTD in vivo (Figure 4A, upper panel) and corresponded with a slight decrease in the cellular level of unmodified RNAPII (Figure 4B, third panel). The level of S2-P was not affected in rtr1Δ extracts, nor was the level of Rpb3, which was used as a control for protein loading (Figure 4A, bottom two panels). Quantitation was performed on triplicate experiments and is shown in Figure 4B. These data indicate that Rtr1 is involved in the regulation of S5-P in vivo and support the hypothesis that Rtr1 is involved in decreasing S5-P in wild-type cells, given that there is an accumulation of the S5-P form in rtr1Δ cells.

In order to determine the effects of Rtr1 deletion on the levels of DNA-associated RNAPII, we performed ChIP analyses in the rtr1Δ strain and compared the levels of S5-P and S2-P RNAPII to those found in wild-type. The levels of S5-P RNAPII dramatically increase in the rtr1Δ cells when compared to the levels observed in wild-type cells at both the PMA1 and PYK1 ORFs (Figures 4C and 4D). This finding is especially true toward the 3′ ends of these coding regions, where S5-P on the CTD is normally very low. The level of S5-P RNAPII decreases near the polyadenylation sites of both genes to near wild-type levels. This decline is likely due to decreased RNAPII occupancy in these regions, which was observed for Rpb3 in both the rtr1Δ and wild-type cells (Figure 5A). The levels of S2-P RNAPII were also

Molecular Cell 34, 168–178, April 24, 2009 ©2009 Elsevier Inc.
Molecular Cell Article

Rtr1 Is a CTD Phosphatase that Regulates RNA Polymerase II during the Transition from Serine 5 to Serine 2 Phosphorylation
Amber L. Mosley,1 Samantha G. Pattenden,1 Michael Carey,1,2 Swaminathan Venkatesh,1 Joshua M. Gilmore,1 Laurence Florens,1 Jerry L. Workman,1 and Michael P. Washburn1,*
1Stowers Institute for Medical Research, Kansas City, MO 64110, USA
2Department of Biological Chemistry, David Geffen School of Medicine, University of California, Los Angeles, 10833 LeConte Avenue, Los Angeles, CA 90095, USA
*Correspondence: [email protected]
DOI 10.1016/j.molcel.2009.02.025

SUMMARY
Messenger RNA processing is coupled to RNA polymerase II (RNAPII) transcription through coordinated recruitment of accessory proteins to the Rpb1 C-terminal domain (CTD). Dynamic changes in CTD phosphorylation during transcription elongation are responsible for their recruitment, with serine 5 phosphorylation (S5-P) occurring toward the 5′ end of genes and serine 2 phosphorylation (S2-P) occurring toward the 3′ end. The proteins responsible for regulation of the transition state between S5-P and S2-P CTD remain elusive. We show that a conserved protein of unknown function, Rtr1, localizes within coding regions, with maximum levels of enrichment occurring between the peaks of S5-P and S2-P RNAPII. Upon deletion of Rtr1, the S5-P form of RNAPII accumulates in both whole-cell extracts and throughout coding regions; additionally, RNAPII transcription is decreased, and termination defects are observed. Functional characterization of Rtr1 reveals its role as a CTD phosphatase essential for the S5-to-S2-P transition.

INTRODUCTION
From yeast to mammals, there are three highly conserved RNA polymerase complexes that are responsible for the transcription of all classes of cellular RNAs. RNA processing is closely tied to transcription in order to ensure the fate of nascent RNA.
One unique mechanism for proper RNA processing involves the recruitment of a wide variety of accessory proteins to the C-terminal domain (CTD) of the largest subunit of RNAPII, Rpb1 (for review, see Phatnani and Greenleaf, 2006). The CTD consists of 27 repeats of the sequence Y1S2P3T4S5P6S7 in yeast and is not conserved within the Rpb1 counterparts found in RNAPI and RNAPIII, thereby serving as a unique signaling platform for RNAPII. In order to form a competent initiation complex at the promoter of a target gene, the CTD must exist in a hypophosphorylated state. Following assembly of the initiation complex, the CTD exhibits increased phosphorylation on serine 5 (S5-P), carried out by the cyclin-dependent kinase Kin28, a subunit of the general transcription factor TFIIH (Komarnitsky et al., 2000; Schroeder et al., 2000). This phosphorylation event is responsible for the recruitment of the capping machinery, which begins processing of the nascent mRNA during early transcription (Cho et al., 1997; Fabrega et al., 2003; Komarnitsky et al., 2000; Schroeder et al., 2000). As transcription elongation progresses, there is a change in the modification state of the CTD as serine 2 phosphorylation (S2-P) increases through the action of the CTDK-I complex (Cho et al., 2001). Chromatin immunoprecipitation (ChIP) experiments have demonstrated that the increase in S2-P occurs as transcription progresses through the open reading frame (ORF) (Komarnitsky et al., 2000). As transcription approaches the 3′ end of the ORF, the termination and polyadenylation machinery are recruited, some of which interact with the S2-P CTD (Licatalosi et al., 2002; Meinhart and Cramer, 2004; Kim et al., 2004).
Although this transition state from S5-P to S2-P during the transcription cycle is thought to distinguish different phases of RNAPII elongation, the proteins involved in the decrease of S5-P during elongation have yet to be identified.

In addition to the aforementioned CTD kinases, the actions of CTD phosphatases are also required to manage the different CTD modification states. Two CTD phosphatases, Fcp1 and Ssu72, have been characterized in yeast (for review, see Meinhart et al., 2005). Fcp1 has a preference for the S2-P modification and has been shown by ChIP analysis to colocalize with RNAPII throughout coding regions (Cho et al., 2001). In addition, Fcp1 mutants show an increase in the level of S2-P in the coding region of genes, indicating that the phosphatase plays a role in dephosphorylation of S2-P during the transcription cycle (Cho et al., 2001). Fcp1 is also thought to play a major role in RNAPII recycling after the complex has dissociated from the coding region (Cho et al., 1999; Kong et al., 2005; Archambault et al., 1997; Chambers et al., 1995; Aygun et al., 2008). Ssu72, conversely, is an S5-P-specific CTD phosphatase and a component of the yeast cleavage and polyadenylation factor (CPF), which is involved in mRNA processing at the 3′ ends of genes (Krishnamurthy et al., 2004; Reyes-Reyes and Hampsey, 2007). ChIP assays have revealed that Ssu72 is predominantly enriched at the 3′ ends of genes, with little to no enrichment found at the promoter (Nedea et al., 2003; Ansari and Hampsey, 2005). Although Fcp1 and Ssu72 have both been

Figure 3. The role of the carboxyl-terminal domain (CTD) code in selective binding of proteins. The effect of modification of residues in the heptapeptide on protein binding has only been determined for a few of the many factors shown to interact with the CTD [10,55,74]. Only factors that have been shown to bind directly to the CTD are shown.
Modification of the heptapeptide is shown as in Figure 1. (a) CTD-interacting proteins are shown as light blue shapes, with their function during the protein-coding gene transcription cycle [10] noted on the left. The CTD can adopt numerous conformations that can interact with a range of different CTD-interacting domains. These include the WW domain (Pin1, Ess1), the FF domain (CA150), the Set2 Rpb1 interacting (SRI) domain (Set2), an unusual SH2 domain (Spt6) and the CID domain of Pcf11 and Rtt103 [10,54]. Prp40 has FF and WW domains. However, it is not clear exactly what mediates binding to the CTD [10,75]. (b) The small nuclear RNA gene-specific RNA 3′ processing complex, Integrator, is the only factor known to date that requires phosphorylation of the serine residue in position 7 (Ser7) [6]. The role of phosphorylation of Ser2 and Ser5 in this interaction remains to be determined.

phosphorylation status of the CTD by inhibiting the Fcp1 phosphatase, and this has a negative effect on transcription [43].

Glycosylation
The mammalian CTD can also be modified by the addition of a monosaccharide, N-acetylglucosamine (O-GlcNAc), to the hydroxyl groups of serine and threonine residues [44]. Interestingly, no glycosylation is detected on the phosphorylated form of the enzyme, suggesting that phosphorylation and glycosylation are mutually exclusive. It is possible that the glycosylated form of pol II is recruited to the promoter and that an N-acetylglucosaminase acts at this stage to selectively remove the O-GlcNAc group before phosphorylation occurs. However, to date, it has not been clearly demonstrated that the low level of glycosylation detected plays any role in gene expression.

Reading the code
A multitude of factors binds to the CTD (Figure 3). The influence of specific CTD modifications on interaction is only well understood in a few cases.
However, each described phosphorylation state clearly correlates with the requirement of specific processing factors through the transcription cycle (Figures 1 and 3).

Association of factors with the CTD early in the transcription cycle
Pol II interacts with the general transcription factor TBP and the multi-subunit Mediator complex (Box 2) through unphosphorylated CTD [45,46], which might help tether pol II to promoters. The Mediator complex plays an important role in transducing signals from transcriptional regulators to the general transcription machinery (for a review, see Ref. [47]). Phosphorylation on Ser5 releases the Mediator complex from pol II, disrupting one of the bonds that stabilizes promoter-associated complexes [48]. The Ser5-phosphorylated CTD then becomes a landing pad for guanylyltransferases responsible for the addition of the cap structure to the 5′ end of newly synthesized RNAs (for review, see Ref. [10]). Mammalian guanylyltransferase [mouse mRNA capping enzyme (Mce)1] recognizes and is allosterically activated by as few as two phosphorylated heptads in vitro [49]. Tyr1, Pro3, Pro6 and Ser5-PO4 side chains from each of two heptapeptide repeats contribute to the interface between the yeast C. albicans guanylyltransferase, Cgt1, and the CTD [50], emphasizing that, at least in some cases, a pair of heptads is the recognition unit of the CTD code (Figure 2) [51]. The close association of the transcription and capping machineries early in the transcription cycle ensures that nascent transcripts are accurately and efficiently capped.

Review
Cracking the RNA polymerase II CTD code
Sylvain Egloff and Shona Murphy
Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, UK

The carboxyl-terminal domain (CTD) of the largest subunit of RNA polymerase II comprises multiple tandem conserved heptapeptide repeats, unique to this eukaryotic RNA polymerase.
This unusual structure provides a docking platform for factors involved in various co-transcriptional events. Recruitment of the appropriate factors at different stages of the transcription cycle is achieved through changing patterns of post-translational modification of the CTD repeats, which create a readable ‘code’. A new phosphorylation mark both expands the CTD code and provides the first example of a CTD signal read in a gene type–specific manner. How and when is the code written and read? How does it contribute to transcription and coordinate RNA processing?

The polymerase II CTD
In prokaryotes, one DNA-dependent RNA polymerase is sufficient to transcribe all genes into the variety of RNA molecules required by the cell. In eukaryotes, however, this function is shared by three distinct multi-subunit enzymes, each dedicated to transcription of specific gene types. Polymerase I (pol I) transcribes only genes encoding 18S and 28S rRNA, whereas pol III transcribes a range of short genes, including those encoding tRNA and 5S RNA. Pol II is responsible instead for transcription of the thousands of protein-coding genes constituting the largest group of distinct individual genes in the eukaryotic genome. It also transcribes genes encoding small noncoding RNAs [e.g. spliceosomal small nuclear RNAs (snRNAs)]. The largest, catalytic subunits of all three eukaryotic polymerases share homology with one another and with the largest subunit of bacterial polymerase [1]. However, an unusual structure is uniquely found at the C terminus of the largest subunit of pol II (Rpb1) in eukaryotes as evolutionarily distant as yeast and humans (Box 1 and Ref. [2]). This carboxyl-terminal domain (CTD) comprises multiple tandemly repeated heptapeptides with the consensus sequence Tyr-Ser-Pro-Thr-Ser-Pro-Ser (Y1S2P3T4S5P6S7), with up to 52 repeats in the mammalian protein (Box 1).
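The tandem-repeat organization described above lends itself to a short illustration: a CTD can be cut into consecutive seven-residue windows and each window compared with the consensus Y1S2P3T4S5P6S7. The sketch below does this on a made-up toy sequence; it is not a real Rpb1 sequence, and the helper names are hypothetical.

```python
CONSENSUS = "YSPTSPS"  # Y1 S2 P3 T4 S5 P6 S7


def split_heptads(ctd):
    """Cut a CTD sequence into consecutive 7-residue windows (any remainder is dropped)."""
    usable = len(ctd) - len(ctd) % 7
    return [ctd[i:i + 7] for i in range(0, usable, 7)]


def mismatches(heptad):
    """List (position, consensus residue, observed residue) for each deviation."""
    return [
        (pos + 1, CONSENSUS[pos], residue)
        for pos, residue in enumerate(heptad)
        if residue != CONSENSUS[pos]
    ]


# Toy sequence for illustration only: three consensus repeats and two variants.
toy_ctd = "YSPTSPS" * 3 + "YSPTSPT" + "YTPTSPK"

heptads = split_heptads(toy_ctd)
n_consensus = sum(h == CONSENSUS for h in heptads)

print(f"{len(heptads)} heptads, {n_consensus} match the consensus")
for index, heptad in enumerate(heptads, start=1):
    for pos, expected, observed in mismatches(heptad):
        print(f"repeat {index}: position {pos} is {observed}, consensus is {expected}")
```

Reporting deviations by position mirrors the numbered-residue notation used in the text, so a substitution at position 7, for example, is immediately identifiable as affecting the Ser7 site.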
Deletion of the CTD in mouse, Drosophila or yeast is lethal, demonstrating that this structure is essential for life. However, transcription, both in vitro and on transiently transfected templates, can proceed without the CTD [2–4], pointing to an ancillary, rather than fundamental role for this structure. Our current understanding of the role of the CTD comes mainly from studying expression of protein-coding genes in either the baker’s yeast, Saccharomyces cerevisiae, or higher eukaryotic tissue culture cells. However, recent studies of mammalian snRNA gene expression have shed new light on the function of this enigmatic domain.

Corresponding author: Murphy, S. ([email protected]).

CTD function is programmed by reversible posttranslational modification
The CTD serves as a scaffold for the interaction of a wide range of nuclear factors and plays a major role in transcription and co-transcriptional RNA processing in expression of protein-coding genes, mammalian snRNA genes and yeast small nucleolar RNA (snoRNA) genes [5–15]. The CTD extends from the pol II core enzyme close to the RNA exit channel [16], where it is ideally placed to influence RNA processing reactions, through direct or indirect interaction with components of the RNA processing machinery. Dynamic, reversible modification of the CTD underpins the efficient and accurate completion of transcript synthesis (Figure 1). More specifically, recruitment of transcription and processing factors at different steps of the transcription cycle is closely linked to the modification state of the CTD, particularly the position of phosphorylated serines. For example, CTD phosphorylation helps recruit capping factors to the 5′ end of newly synthesized RNA and 3′ processing factors to poly(A) sites. Accordingly, it has been proposed that differential modification of residues within the heptapeptide generates a CTD code critical for the timely recruitment of different proteins [17,18].
Phosphorylation has been detected on tyrosine, threonine and all three serines of the CTD repeat in vivo. Glycosylation of serines and threonines and isomerization of the two proline residues (Pro3 and Pro6) can also occur. The potential for the CTD heptapeptide to be modified at every residue enables a wide range of signaling combinations, potentially readable as a binary code, where the presence or absence of modification can influence factor–CTD interactions. Here, we provide an overview of the elements that can contribute to a CTD code and explain how this code is ‘read’ by proteins involved in transcription and RNA processing (Figures 2 and 3).

Creating the code
A range of enzymes participates in the dynamic modification of the CTD. These include kinases and phosphatases responsible for the addition or removal of phosphates, respectively. Likewise, glycosyltransferases and deglycosylases are implicated in reversible glycosylation. In addition, peptidyl-prolyl bonds, which exist in either cis

0168-9525/$ – see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.tig.2008.03.008 Available online 3 May 2008

Nucleus and gene expression

CTD conformation is modified by the peptidyl-prolyl isomerase Pin1 (Ess1 in yeast). How this enzyme affects recruitment of mRNA processing factors is unknown; however, it does affect CTD kinase and phosphatase activity by remodelling the substrate [24]. The CTD is also O-glycosylated in a manner that is mutually exclusive with serine phosphorylation, suggesting that there could be cross-talk between these two modifications [25].

CTD peptides do not assume fixed conformations regardless of phosphorylation state [26]. Rather, the CTD is quite malleable in the grip of its binding partners. There are three identified protein domains that recognize CTD heptads: the CTD interacting domain [27], the WW domain [28] and the FF domain [29], but other proteins that lack these domains can also bind directly to the CTD.
There are three available structures of phosphorylated CTD peptides bound to different partners: Pin1, which binds via a WW domain; the guanylyltransferase Cgt1; and the CTD interacting domain of Pcf11, a cleavage/polyadenylation factor [28,30,31]. These structures show quite different CTD conformations, although the prolyl bonds are exclusively trans isomers [32]. The Pcf11 CTD-interacting domain, which comprises eight α-helices in a superhelical arrangement, binds Ser 2-phosphorylated heptads by an induced-fit mechanism [26,30]. The extreme flexibility of the CTD helps explain how it can bind so many partners. The changing pattern of covalent and non-covalent modifications between the 5′ and 3′ ends of the gene may constitute a CTD code [33] that is translated into a sequence of processing factor binding and release reactions as transcription proceeds. The code seems to be quite a loose one, however, as Ser 2 phosphorylation is not essential for viability in yeast [34].

Capping enzymes: processing factors with transcriptional functions
The first mRNA processing factors to be recruited to the CTD during the transcription cycle are the capping enzymes RNA triphosphatase, guanylyltransferase and 7-methyltransferase. Yeast guanylyltransferase and 7-methyltransferase bind directly and independently to the phosphorylated CTD and phosphorylation of Ser 5 at the promoter by Kin28, a subunit of TFIIH, is necessary for their recruitment [22,35]. Hence, recognition of a Ser5-phosphate ‘code word’ permits recruitment of capping enzymes at the right time to perform cotranscriptional capping. CTD binding not only localizes the mammalian guanylyltransferase but also allosterically regulates it, reducing the Km for GTP [10]. Removal of Ser5 phosphate by phosphatases early in elongation is coupled to release of capping enzymes.

Current Opinion in Cell Biology 2005, 17:251–256
Guanylyltransferase is released within the first 500 bases of the gene, whereas 7-methyltransferase remains bound throughout the length of the gene [22,35] (Figures 1 and 2). Capping enzymes participate in a network of interactions that regulate early steps in transcription [36]. The yeast 7-methyltransferase (Abd1) stabilizes pol II on some promoters and both 7-methyltransferase and guanylyltransferase (Ceg1) stimulate early elongation [37,38], whereas the RNA triphosphatase (Cet1) inhibits re-initiation [39]. Stimulation of transcription by Abd1 occurs in a methylation-defective mutant and is therefore independent of capping itself [38]. Human capping enzyme (CE) stimulates promoter escape (Figure 1) by countering the negative elongation factor NELF, and CE recruitment is enhanced by direct binding to the elongation factor Spt5 [17].

The CTD is phosphorylated on Ser5 residues of the heptads by the TFIIH-associated kinase when transcription initiates, and later the Ser2 residues are phosphorylated by the kinases CTK1 (yeast) and PTEFb (positive transcription elongation factor b or CDK9) [18]. Decoration of the CTD with phosphates is also fashioned by several phosphatases which differ in their preferences for Ser2 or Ser5 phosphate: Fcp1 [19], Ssu72 [20] and SCPs [21]. The ratio of Ser5:Ser2 phosphorylation is high at the 5′ end and lower at the 3′ end [22]. A technical limitation is that analysis of Ser2 phosphorylation has relied on one monoclonal antibody, H5, which has highest affinity for heptads phosphorylated on both Ser2 and Ser5 [23].

Figure 1. Co-transcriptional recruitment of pre-mRNA processing factors at the 5′ end of a gene (high Ser5 PO4:Ser2 PO4 ratio). Capping enzymes (guanylyltransferase [GT] and 7-methyltransferase [MT]), cap binding complex (CBC), cleavage/polyadenylation factors (C/P), and the RNA 5′–3′ exonuclease, Rat1, are indicated. Processing factors interact with the pol II elongation complex via the CTD (green line) and probably also via the nascent RNA (red line). Phosphorylated Ser2 and Ser5 residues in the CTD heptad repeats are marked P2 and P5. Capping enzymes stimulate early steps in pol II transcription including promoter clearance denoted by the thick arrow [17,38]. Following addition of the cap, MeGppp, GT is released from the elongation complex.

Figure 2. Co-transcriptional splicing and cleavage/polyadenylation (low Ser5 PO4:Ser2 PO4 ratio). The spliceosome is recruited to intron RNA (black line) and probably also to the pol II elongation complex, at least in metazoans. Co-transcriptional splicing releases the intron (black lariat). Note that while spliceosome assembly probably occurs co-transcriptionally on most introns, excision of the intron can occur post-transcriptionally [7]. More cleavage/polyadenylation factors (C/P) and Rat1 are detectable at the 3′ end than at the 5′ end of the gene. These factors interact with the nascent RNA at the poly(A) site, AAUAAA, to cleave and polyadenylate the transcript. The Rat1 exonuclease degrades RNA downstream of the cleavage site (scissors) and helps trigger termination of transcription. Whether or not processing factors are released from pol II prior to termination is not known.

In summary, capping enzymes have a previously unsuspected ability to manipulate early steps in transcription. In this way they may operate a checkpoint to ensure that the cap has been added before commitment to productive elongation of the transcript.
It is not known whether recruitment of capping enzymes is regulated; however, it is intriguing that a viral activator of elongation, HIV Tat, can stimulate capping, at least in vitro [40].

Recruiting the spliceosome: RNA versus protein recognition

Splicing factors are rapidly recruited to nascent transcripts and many introns are removed co-transcriptionally, while others are marked co-transcriptionally for post-transcriptional splicing [7]. Like capping enzymes, spliceosomal U snRNPs can also enhance elongation via the co-factor TAT-SF1 [41]; therefore it appears that pol II may help itself along the gene by the sequential recruitment of mRNA processing factors. Of the three major pre-mRNA processing events, least is known about recruitment of the splicing apparatus. The simplest situation is exemplified by budding yeast, where introns are short and few genes have more than one. In yeast, U1 snRNP, which binds the 5′ splice site, associates with the elongation complex in the intron, not upstream, and remains bound in the downstream exon. This binding pattern suggests initial recruitment by recognition of intron RNA rather than protein subunits of the elongation complex [42] (Figure 2). On the other hand, recruitment by RNA recognition alone is not easily reconciled with the binding of the U1 snRNP protein Prp40 to the phosphorylated CTD via its FF domains [12]. It is possible that splicing factors like Prp40 are recruited to the RNA and later handed off to the CTD or other components of the elongation complex. A second yeast splicing factor, Sub2, which doubles as a general mRNA export factor, may be recruited by a protein–protein interaction followed by hand-off to the RNA [43]. In Chironomus this factor (UAP56) was found on transcription complexes in both introns and exons [44].
It should be noted that hand-off of proteins on or off the nascent substrate could substantially affect the results of ChIP experiments by altering cross-linking efficiency. Whether co-transcriptional spliceosome assembly occurs by one-step binding of a pre-assembled penta-snRNP or by sequential binding of individual snRNPs is not known. ChIP (chromatin immunoprecipitation) analysis in yeast suggests that different snRNPs are recruited at different times, which is consistent with a sequential pathway for spliceosome assembly in vivo (M Rosbash, K Neugebauer, personal communication).

Rules of engagement: co-transcriptional recruitment of pre-mRNA processing factors
David L Bentley

The universal pre-mRNA processing events of 5′ end capping, splicing, and 3′ end formation by cleavage/polyadenylation occur co-transcriptionally. As a result, the substrate for mRNA processing factors is a nascent RNA chain that is being extruded from the RNA polymerase II exit channel at 10–30 bases per second. How do processing factors find their substrate RNAs and complete most mRNA maturation before transcription is finished? Recent studies suggest that this task is facilitated by a combination of protein–RNA and protein–protein interactions within a ‘mRNA factory’ that comprises the elongating RNA polymerase and associated processing factors. This ‘factory’ undergoes dynamic changes in composition as it traverses a gene and provides the setting for regulatory interactions that couple processing to transcriptional elongation and termination.
Addresses
Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, UCHSC at Fitzsimons, Mail Stop 8101, PO Box 6511, Aurora, Colorado 80045, USA
Corresponding author: Bentley, David L ([email protected])

Current Opinion in Cell Biology 2005, 17:251–256
This review comes from a themed issue on Nucleus and gene expression
Edited by Christine Guthrie and Joan Steitz
Available online 13th April 2005
0955-0674/$ – see front matter © 2005 Elsevier Ltd. All rights reserved.
DOI 10.1016/j.ceb.2005.04.006

Introduction: how does coupling to transcription enhance pre-mRNA processing?

How mRNA processing is facilitated by coupling to RNA polymerase II (pol II) transcription is a fascinating problem that is the subject of several excellent reviews [1–5]. Transcription-coupled processing differs from uncoupled processing in that the substrate RNA is a growing and progressively folding structure rather than a static full-length pre-mRNA. The importance of coupling is suggested by the fact that processing of full-length synthetic pre-mRNAs in injected oocytes is less efficient than co-transcriptional processing in vivo [6]. In vivo, introns can be removed and the poly(A) site cleaved by the time polymerase has transcribed only 1 kb beyond the processing sites, probably within 30 s [7]. By contrast, in vitro processing uncoupled from transcription usually takes >20 min. Optimal processing is achieved by coupling with transcription by RNA pol II and not other polymerases because pol II is uniquely equipped with an unusual domain on its large subunit, called the C-terminal domain (CTD), that provides a landing pad for mRNA processing factors. Coupling of pol II transcription with processing can influence processing reactions in at least three ways.
First, localization: in its simplest form, coupling positions mRNA processing factors at the elongation complex, raising their local concentration in the vicinity of the nascent transcript. Second, kinetic coupling: the rate of transcript elongation can have profound effects on RNA folding and the assembly of RNA–protein complexes and has been shown to affect the choice between alternative processing sites [8,9]. Third, allostery: contacts between mRNA processing factors and the pol II elongation complex can allosterically activate or inhibit mRNA processing factors [10]. Here I will review recent progress in our understanding of how the factors that carry out mRNA capping, splicing and 3′ end formation engage the pol II elongation complex.

The pol II C-terminal domain: a recruitment platform

The CTD is an essential domain in the large subunit of pol II, but is absent from the related subunits of RNA polymerases I and III. This domain comprises tandem heptads whose consensus sequence, Y1S2P3T4S5P6S7, is identical across animals, plants and some protozoa. The in vivo functional unit of the CTD appears to be a pair of tandem heptads [11]. A recent proteomic analysis identified over 100 yeast proteins that bind to the phosphorylated CTD [12]. The CTD is more than a passive landing pad, however. Among its numerous roles, the CTD can allosterically regulate capping enzymes and regulate transcriptional elongation and termination [13]. CTD deletion prevents efficient co-transcriptional capping, splicing and 3′ end formation in metazoans [14,15]. Although it is essential for co-transcriptional processing, the CTD is dispensable for processing uncoupled from transcription in injected Xenopus oocytes [6], suggesting that processing at the site of transcription differs from post-transcriptional processing elsewhere in the nucleus.
Important clues to how the CTD works have come from in vitro systems, in which it can stimulate processing reactions, in some cases even in the absence of ongoing transcription [16,17].

Transcription termination
needed to complete a transcription cycle (frees RNA pol II for further rounds of initiation)
in mammals and yeast, linked in many ways to the process of mRNA 3′ end formation (a role for polyadenylation factor recruitment by CTD?)
the site of termination of transcription is not the same as the site of polyadenylation; rather, transcription terminates somewhere downstream from the poly(A) site

Transcription termination and RNA turnover - a function for the nuclear 5′→3′ exonuclease Xrn2

Figure 2. A Hybrid of the Torpedo and Allosteric Models. It is proposed that the exonuclease cooperates with an unknown helicase and/or allosteric modulator of the polymerase, converting it from processive to nonprocessive form, ultimately disrupting the RNA-DNA hybrid and releasing the polymerase. Symbols and color coding as in Figure 1. From Luo and Bentley, Cell 119, 911-914, 2004

Polyadenylation, transcription termination, and pol II recycling may be connected through the exonuclease Xrn2 (aka Rat1)