* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download WW Domains Provide a Platform for the
Survey
Document related concepts
SNARE (protein) wikipedia , lookup
Endomembrane system wikipedia , lookup
Protein phosphorylation wikipedia , lookup
G protein–coupled receptor wikipedia , lookup
Histone acetylation and deacetylation wikipedia , lookup
Cell nucleus wikipedia , lookup
Magnesium transporter wikipedia , lookup
Protein moonlighting wikipedia , lookup
Signal transduction wikipedia , lookup
Intrinsically disordered proteins wikipedia , lookup
List of types of proteins wikipedia , lookup
Transcript
MOLECULAR AND CELLULAR BIOLOGY, Aug. 2005, p. 7092–7106 0270-7306/05/$08.00⫹0 doi:10.1128/MCB.25.16.7092–7106.2005 Copyright © 2005, American Society for Microbiology. All Rights Reserved. Vol. 25, No. 16 WW Domains Provide a Platform for the Assembly of Multiprotein Networks† Robert J. Ingham,1 Karen Colwill,1 Caley Howard,1 Sabine Dettwiler,2 Caesar S. H. Lim,1,3 Joanna Yu,1,3 Kadija Hersi,1 Judith Raaijmakers,1 Gerald Gish,1 Geraldine Mbamalu,1 Lorne Taylor,1 Benny Yeung,1 Galina Vassilovski,1 Manish Amin,1 Fu Chen,4 Liudmila Matskova,4 Gösta Winberg,4 Ingemar Ernberg,4 Rune Linding,1 Paul O’Donnell,1 Andrei Starostine,1 Walter Keller,2 Pavel Metalnikov,1Chris Stark,1 and Tony Pawson1,3* Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada1; Department of Molecular and Medical Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada3; Department of Cell Biology, Biozentrum, University of Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland2; and Karolinska Institutet, Microbiology and Tumor Biology Center (MTC), SE-171 Stockholm, Sweden4 Received 8 April 2005/Returned for modification 5 May 2005/Accepted 22 May 2005 WW domains are protein modules that mediate protein-protein interactions through recognition of prolinerich peptide motifs and phosphorylated serine/threonine-proline sites. To pursue the functional properties of WW domains, we employed mass spectrometry to identify 148 proteins that associate with 10 human WW domains. Many of these proteins represent novel WW domain-binding partners and are components of multiprotein complexes involved in molecular processes, such as transcription, RNA processing, and cytoskeletal regulation. We validated one complex in detail, showing that WW domains of the AIP4 E3 proteinubiquitin ligase bind directly to a PPXY motif in the p68 subunit of pre-mRNA cleavage and polyadenylation factor Im in a manner that promotes p68 ubiquitylation. The tested WW domains fall into three broad groups on the basis of hierarchical clustering with respect to their associated proteins; each such cluster of bound proteins displayed a distinct set of WW domain-binding motifs. We also found that separate WW domains from the same protein or closely related proteins can have different specificities for protein ligands and also demonstrated that a single polypeptide can bind multiple classes of WW domains through separate prolinerich motifs. These data suggest that WW domains provide a versatile platform to link individual proteins into physiologically important networks. tidyl prolyisomerase domains (Pin1), and Rho GTPase-activating protein domains. Consequently, WW domain-containing proteins are involved in a variety of cellular processes, including transcription, RNA processing, protein trafficking, receptor signaling, and control of the cytoskeleton (32, 33, 68). WW domain-mediated interactions have been implicated in cancer (4, 75), in hereditary disorders, such as Liddle’s syndrome (66) and Rett’s syndrome (8), as well as in Alzheimer’s (46, 48) and Huntington’s (20, 55) diseases. WW domains are typically 35 to 40 amino acids in length (6) and fold into a three-stranded, antiparallel  sheet with two ligand-binding grooves (30, 38, 50, 72). WW domains bind a variety of distinct peptide ligands including motifs with core proline-rich sequences, such as PPXY (amino acid single-letter code; X is any amino acid) (PY) (14), PPLP (2, 18), as well as proline/arginine-containing (PR) sequences (3) and phosphorylated serine/threonine-proline sites [p(S/T)P] (49, 76). WW domains have been classified into four groups on the basis of their binding to peptide ligands (3, 19, 67). Group I WW domains, which include the NEDD4 family proteins and the 65-kDa Yes-associated protein (YAP65), recognize PY motifs (14, 66). Group II WW domains, such as the FE65 WW domain and the first WW domain of FBP11, recognize PPLP motifs (2, 18), and group III WW domains (FBP30 WW1) recognize PR motifs (3). Group IV WW domains, such as those from Pin1 (49) and PDX-1 C-terminus-interacting factor Many signaling proteins contain modular domains that mediate specific protein-protein interactions, frequently through the recognition of short peptide motifs in their binding partners (56). In many cases these interactions are regulated by posttranslational modifications, such as phosphorylation. Interaction domains can thereby control the subcellular localization, enzymatic activity, and substrate specificity of regulatory proteins and the assembly of multiprotein complexes, and thus the flow of information through signaling pathways. WW domains comprise a family of protein-protein interaction modules that are found in many eukaryotes and are present in approximately 50 human proteins (6; see Fig. 1). Within these polypeptides, WW domains are joined to a number of distinct interaction modules, including phosphotyrosinebinding domains (i.e., in the FE65 protein) and FF domains (CA150 and FBP11), as well as protein localization domains, such as C2 (NEDD4 family proteins) and pleckstrin homology domains (PLEKHA5). WW domains are also linked to a variety of catalytic domains, including HECT E3 protein-ubiquitin ligase domains (in NEDD4 family proteins), rotomerase/pep* Corresponding author. Mailing address: Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, Ontario M5G 1X5, Canada. Phone: (416) 586-8262. Fax: (416) 586-8869. E-mail: [email protected]. † Supplemental material for this article may be found at http://mcb .asm.org/. 7092 VOL. 25, 2005 IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS 7093 FIG. 1. Domain organization and alternate names of WW domain-containing proteins analyzed in this study. WW domain-containing proteins examined in this study, their unique GeneIDs, as well as alternate names are indicated. The relative domain organization of each of the proteins was determined by searching the National Center for Biotechnology Information conserved domain database (CDD) (51) and is indicated. WW domains specifically analyzed in this study are indicated with an asterisk. a.a., amino acids; PTB, phosphotyrosine binding; PH, pleckstrin homology; SH3, Src homology 3. 1 (21), recognize p(S/T)P motifs (76). Group II and III WW domains can be rather versatile in their binding properties, since they not only recognize both PPLP- and PR-containing peptides (with varied affinities) but also polyproline stretches often containing glycine, methionine, or arginine (29, 39, 40, 54), and therefore could be viewed as a single set. A number of groups have identified WW domain-binding proteins through screening cDNA expression libraries and yeast two-hybrid analysis (2, 14, 18, 36, 53). These studies, coupled with probing of peptide libraries, have suggested that consensus sequences are recognized by individual WW domains (29, 54). To obtain a more comprehensive view of the functions of WW domain-containing proteins, as well as the ability of individual WW domains to recognize cellular proteins, we have used tandem mass spectrometry (MS) to identify human polypeptides that associate with a range of WW domains. This approach has several advantages. (i) It is not biased towards previously determined peptide ligands and provides orthogonal results regarding WW domain-binding properties that can be overlaid on peptide-binding data. (ii) It allows the identification of proteins that interact indirectly with WW domains, providing new information regarding the cellular complexes and machinery engaged by WW domains. (iii) It facilitates the identification of phosphorylated ligands for group IV WW domains. Here, we identify 148 T-cell proteins that selectively associate with 10 human WW domains. Interestingly, hierarchical clustering of this data set organized the WW domains into three groups on the basis of their protein-binding properties, which correlates with the presence of specific proline-rich or phosphorylated peptide motifs in these associated proteins. The screen identified novel, direct, and biologically relevant binding partners for WW domain-containing proteins, as well as associated proteins that bind WW domains indirectly as components of multiprotein complexes. Finally, we show that WW domains from the same protein or protein families can have distinct binding preferences and that individual proteins can potentially serve as scaffolds to recruit multiple WW domains of different classes to distinct binding sites. These results argue that the reiterated use of a small interaction domain can yield a versatile network of protein interactions that influences many facets of cellular behavior. MATERIALS AND METHODS Antibodies and cDNAs. The polyclonal antisera to AIP4 (74) and p68 and p25 antisera (60) have been previously described. Antibodies were purchased as 7094 INGHAM ET AL. follows: anti-Diaphanous 1 polyclonal antisera (ImmunoGlobe, Himmelstadt, Germany); anti-FLAG M2 monoclonal antibody (MAb) (Sigma, St. Louis, MO); anti-Myc 9E10 MAb, antibody to the large subunit of RNA polymerase II (anti-RNA Pol II LS), antihemagglutinin (anti-HA), and anti-EWSR1 (antiEWS) polyclonal antisera (Santa Cruz, Santa Cruz, CA); anti-CA150 polyclonal antisera (Abcam, Cambridge, MA); and anti-Pin1 polyclonal antisera (Upstate Biotechnology, Inc., Upstate, NY). The HA-tagged ubiquitin construct was a gift from Ivan Dikic (Goethe University Medical School, Frankfurt, Germany). The doubly Myc-tagged AIP4 constructs were generated by PCR of our previously published constructs (74) with the amino terminus of the published AIP4 sequence (31) replacing the amino-terminal Itch portion of the fusion used in our earlier study. These constructs were cloned into the pCDNA3.1A expression vector (Invitrogen, San Diego, CA). The Flag-tagged p68 (Flag-p68) and KIAA0144 and the corresponding tyrosine-to-alanine (Y/A) mutation in the PY motif, were generated by PCR and cloned into the pCMVFlac eukaryotic expression vector (Sigma). All PCR products were sequence verified. Production of the His-tagged, recombinant pre-mRNA cleavage and polyadenylation factor Im (CFIm) p68 and p25 has been previously described (16). Construction of glutathione S-transferase (GST) fusion proteins. Oligonucleotides for the WW domains including four or five amino acids N and C terminal to the conserved tryptophan residues of each of the WW domains (see Table S1 in the supplemental material) were purchased from Sigma. Complementary oligonucleotides (approximately 2 mM of each) were annealed and cloned into BamHI/EcoRI-cut pGEXKT (Amersham Biosciences AB, Uppsala, Sweden). All constructs were sequence verified. The AIP4 WW domain fusion proteins have been previously described (74). Purification of GST fusion proteins. GST fusion proteins were purified from BL21 Escherichia coli bacterial lysate with glutathione-Sepharose beads (Amersham). Fusion protein-containing beads were stored at 4°C as a 10% slurry in phosphate-buffered saline containing 0.02% NaN3. Fusion protein used for SPOTS blotting was eluted from the beads with 50 mM Tris base and 25 mM glutathione (Sigma), and the eluates were dialyzed overnight at 4°C in phosphate-buffered saline. Cell culture. Jurkat cell pellets were purchased from the National Cell Culture Center (Minneapolis, MN) where cells were grown in RPMI 1640 medium supplemented with 0.1 mM sodium pyruvate and 10% normal calf serum. 293T cells were grown in Dulbecco modified Eagle medium (Gibco BRL, Cambridge, Ontario, Canada) supplemented with 10% fetal calf serum (Gibco BRL). GST-WW domain pulldowns for MS analysis. Jurkat cells were resuspended in 1% NP-40 lysis buffer supplemented with 1 mM NaVO4, 1 mM phenylmethylsulfonyl fluoride, 10 g/ml leupeptin, and 10 g/ml aprotinin at 1.0 ⫻ 108 cells/ml and incubated on a nutator in the cold for 30 min. Detergent-insoluble material was removed by centrifugation at ⬃50,000 ⫻ g for 30 min at 4°C. One milliliter of lysate (1.0 ⫻ 108 cell equivalents) was then added to 80 g of GST fusion protein precoupled to glutathione-Sepharose beads and allowed to incubate for 6 h on a nutator at 4°C. The beads were washed four times with 1% NP-40 lysis buffer with inhibitors, and proteins were removed from the beads by boiling in sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) sample buffer. Running and staining of SDS-PAGs. Samples were run on 10% SDS-PAGs and washed a minimum of three times for 10 min each time with distilled, deionized water before being stained with GelCode colloidal Coomassie blue reagent (MJS BioLynx, Inc., Brockville, Ontario, Canada) overnight at room temperature. Gels were destained the next day with distilled, deionized water. Excision of bands for MS. Individual bands were excised using an Investigator ProPic Gel Cutting Robotic workstation (Genomic Solutions, Ann Arbor, MI) and delivered to a single well of a 96-well microtiter plate with prepunched holes at the bottom of each well (Genomic Solutions). Reduction, alkylation, and tryptic digestion of samples. The PCR microtiter plate was transferred to a Genomic Solutions ProGest Digestion robot for “in-gel” trypsin digestion where protein bands were washed, reduced with dithiothreitol, alkylated with iodoacetamide, before being digested with sequencegrade, modified trypsin (Promega, Madison, WI) as previously described (28). Tryptic peptides were then extracted from the gel for analysis by MS. Liquid chromatography-MS (LC-MS). Tryptic peptides were analyzed by liquid chromatography-tandem MS (LC-MS/MS) using two systems. The first was an Ultimate high-performance liquid chromatography system equipped with a FAMOS 96-well autosampler (LC Packings-Dionex, Sunnyvale, CA) linked to a Q-TRAP mass spectrometer (MDS Sciex, Concord, Ontario, Canada). The second was an HP 1100 high-performance liquid chromatography system (Palo Alto, CA) connected to an LCQ-Deca mass spectrometer (Thermo Electron, San Jose, CA). Peptides were separated on custom-made 75-m-inner-diameter PicoTip columns packed with 5-m C-18 beads. The gradient was 3 to 60% of acetonitrile MOL. CELL. BIOL. during 25 min with a total run time of 45 min. Data were analyzed in batch using the Mascot search engine (57), and proteins were considered “hits” if two independent peptides or a single peptide with a Mascot score of 50 or higher was found. Protein hits were converted to gene identifiers (GeneIDs) for further analysis. Any “hit” that was not seen in two of three independent experiments or was seen binding to GST alone was eliminated. Also excluded were all actins, myosins, keratins, spectrins, and heat shock proteins. The cellular process for each “hit” was assigned on the basis of published work and GeneOntology information provided by the Gene Ontology Consortium (10). For the identification of phosphopeptides, the data were analyzed using the Spectrum Mill software (Agilent Technologies, Palo Alto, CA). The validity of phosphorylation sites was manually verified by examination of the MS/MS fragmentation spectra. Transfection of 293T cells, GST pulldowns, and immunoprecipitation experiments. 293T cells were transfected in 10-cm plates using CaPO4 as previously described (42). Transfected cells were lysed in 1 ml 1% NP-40 lysis buffer supplemented with inhibitors (see above) for 10 min on ice. Detergent-insoluble material was removed by spinning at ⬃18,000 ⫻ g in a microcentrifuge, and the concentration of the cleared lysate was determined by bicinchoninic acid assay (Pierce, Rockford, IL). An equivalent amount of cleared lysate was incubated with 2 g of the indicated antibodies and either protein A or G-Sepharose or anti-mouse immunoglobulin G-agarose for 1 h on a rocker at 4°C. For GST fusion protein experiments, equivalent amounts of lysate were incubated for 1 h on a rocker at 4°C with ⬃8 g of GST fusion protein preabsorbed on glutathione-Sepharose beads. Immunoprecipitates and GST pulldowns were then washed three times with 1% NP-40 lysis buffer with inhibitors, and bound proteins were eluted by boiling in SDS-PAGE sample buffer. Western blotting. Cell lysate, immunoprecipitations, and GST pulldowns were separated on SDS-PAGs, transferred to Polyscreen polyvinylidene fluoride (PVDF) membranes (Perkin-Elmer), and blocked for 1 h in Tris-buffered saline containing 0.5% Tween 20 (TBST) supplemented with 5% skim milk powder. Membranes were probed with the indicated antibodies (1 g/ml in TBST) as previously described (34), and proteins were visualized using enhanced chemiluminescence reagent. Blots were stripped by incubating the membrane for 1 h in TBST, pH 2, and reprobed as described above. Hierarchical clustering. Pearson correlation coefficients were computed to reflect the similarities between the different WW domains with respect to the proteins they precipitated and vice versa. The x and y axes were hierarchically clustered independently using an average linkage method, and the matrix was reorganized to reflect the hierarchically clustered ordering (17). The tree shows the relatedness between items based on the clustering, with items closer together being more related (65). Ubiquitylation experiments. For ubiquitylation experiments, transfected cells were treated for 1 h with 50 M of the proteasome inhibitor MG132 (BACHEM, King of Prussia, PA) prior to lysis. Cells were lysed and lysates processed as indicated above with the exception that 50 M MG132 and 20 mM N-ethylmaleimide (Sigma) were added to the lysis buffer. Synthesis and blotting of SPOTS membranes. SPOTS blots (23) were synthesized, using standard 9-fluorenylmethoxycarbonyl (Fmoc) chemistry, in 30 ⫻ 20 spot arrays using a Multipep peptide synthesizer adapted for SPOTS synthesis (Intavis AG, Cologne, Germany). Membranes were blocked for at least 1 h in TBST with 5% skim milk powder before being probed overnight with the indicated GST fusion protein (1 M in TBST) in the cold. The following day, blots were washed three times for 10 min each time in TBST, incubated with a horseradish peroxidase-conjugated, anti-GST MAb (B14) (Santa Cruz) for 45 min at room temperature, and then washed with TBST three more times for 10 min each time before visualizing SPOTS by exposing the membranes to enhanced chemiluminescence reagent. RESULTS Cluster analysis groups WW domains according to their binding partners. To identify cellular proteins that interact with individual WW domains, we constructed GST fusion proteins of 15 WW domains from 13 proteins containing WW domains (Fig. 1). This set includes WW domains that have different binding preferences for peptide ligands and are derived from proteins with diverse molecular functions. Distinct WW domains from the same protein (FBP21 or CA150) and from proteins with similar domain organizations (i.e., NEDD41/AIP4/Smurf1 and CA150/FBP11) were also part of the test VOL. 25, 2005 IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS FIG. 2. Purification of WW domain-binding proteins. Proteins were precipitated from Jurkat cell lysate with the indicated fusion proteins as outlined in Materials and Methods. Precipitated proteins were separated on SDS-PAGs, and gels were stained with colloidal Coomassie blue to visualize proteins. One representative gel is shown here. The positions of molecular mass markers (in kilodaltons) are indicated to the left. The positions of bait proteins are indicated by arrows. set. Fusion proteins were expressed in bacteria, purified, and used to precipitate proteins from Jurkat T-cell lysates as described in Materials and Methods. A representative gel stained with colloidal Coomassie blue shows that the GST-WW domain fusion proteins can precipitate proteins from Jurkat cell lysates not seen with GST alone (Fig. 2). Ten of the tested WW domains associated with specific proteins and showed distinct binding profiles (Fig. 2), whereas five (Smurf1 WW1, CA150 WW1, PQBP1, PLEKHA5, and KIAA1052) did not bind specific proteins in this assay (data not shown), potentially because they require additional sequences either to fold into functional modules or to recognize ligands (15, 29, 30, 50, 54). Proteins that were precipitated by the 10 WW domains were then excised from the relevant gel and analyzed by LC-MS/MS. Precipitations with each GST-WW domain were repeated three times, and only proteins identified in two of three independent experiments with a minimum score of confidence (see Materials and Methods) were included for subsequent analysis. Using these criteria, a total of 148 proteins were identified as being precipitated by one or more of the 10 WW domains but not by GST alone (see Table S2 in the supplemental material). The majority of these proteins represent novel WW domainassociated proteins and are involved in processes such as transcription, RNA processing, and regulation of the cytoskeleton. In order to compare the binding properties of the different WW domains, we employed hierarchical clustering to group the WW domains on the basis of the similarity of the proteins they recognized. This clustering separated the WW domains into three classes (Fig. 3). One cluster (group A [shown in 7095 yellow]) includes the NEDD4-1 WW2, AIP4 WW2, and WWOX WW1 domains. A second group comprises the FE65 WW, Gas7 WW, FBP21 WW2, FBP11 WW1 and WW2, and CA150 WW2 domains (group B [shown in red]). The WW domain of Pin1 clustered independently of the other WW domains and thus forms its own group (group C [shown in green]). Thus, despite the fact that these WW domain complexes may contain both direct and indirect interaction partners, data based exclusively on the analysis of full-length proteins isolated by WW domains identified three groups of domains, raising the possibility that these might correspond to groups previously distinguished by their selectivity for short peptide motifs. Clusters of WW domain ligands are enriched for specific WW domain-binding motifs. WW domains bind a number of proline-based peptide ligands including PY, PPLP, PR, and p(S/T)P motifs as well as polyproline stretches (2, 3, 14, 18, 29, 39, 40, 49, 54, 76). Therefore, we examined whether PPXY, PPLP, and PPPPP motifs are present in the WW-binding proteins from Jurkat cells (Fig. 4). Since a clear consensus sequence does not exist for PR peptide ligands, we searched for the PPRXP and PXPPXR motifs described by Otte et al. (54). Analysis of over 24,000 reference proteins with unique GeneIDs showed that only 13% of these polypeptides contained at least one such motif. In contrast, over 50% of the proteins isolated in association with the analyzed WW domains possess at least one of these peptide sequences, with the exception of Pin1 WW domain-binding proteins. Furthermore, WW domains assigned to group A by hierarchical clustering were enriched for precipitated proteins with PPXY motifs, while group B WW domains preferentially precipitated proteins containing PPLP, PPPPP, and PXPPXR motifs (Fig. 4). This relationship is highlighted by superimposing the presence of these motifs on the clustered data of Fig. 3 (see Fig. S1 in the supplemental material). The majority of the proteins precipitated by the WW domain of Pin1 lacked these proline-rich motifs, and there was no enrichment for any specific proline-rich sequence (Fig. 4; see Fig. S1 in the supplemental material). This is consistent with the ability of this domain to recognize p(S/T)P ligands. We therefore examined the MS data for tryptic peptides containing phosphorylated serine or threonine and compared Pin1 phosphorylation data with results obtained for representatives of group A (NEDD4-1 WW2) and group B (FBP11 WW2) WW domains (Fig. 5A). Twenty-two distinct serine/threonine phosphorylation sites were found in 7 of the 45 proteins precipitated by the Pin1 WW domain (Fig. 5C), and the majority of these phosphorylation sites conform to the Pin1 consensus, p(S/T)P. A single phosphorylation site from one protein was identified in the NEDD4-1 WW2 pulldowns, while two phosphorylation sites were identified for the single phosphoprotein precipitated by the FBP11 WW2 (Fig. 5C). While this analysis is not comprehensive, the GST-Pin1 WW domain pulldowns appear to be enriched for phosphorylated-serine/threonine peptides which fit the consensus for Pin1 WW domain binding. Taken together, these results indicate that the binding properties of individual WW domains for full-length cellular proteins can be integrated with their selective recognition of peptide ligands to provide a more comprehensive view of their 7096 INGHAM ET AL. MOL. CELL. BIOL. VOL. 25, 2005 IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS 7097 FIG. 4. Examination of precipitated proteins for known proline-based WW domain-binding motifs. The proteins precipitated with each of the GST-WW pulldowns were analyzed for the presence of the indicated proline-containing motifs. Note that the pie charts do not equal 100%, since individual proteins can have more than one motif. The number of proteins precipitated by each WW domain and the percentage of proteins with at least one of the proline-rich motifs are indicated below the corresponding pie charts. The ⬃24,000 reference proteins with unique GeneIDs were downloaded from the National Center for Biotechnology Information database and searched locally for the indicated motifs. physiological interactions. This approach may be of general utility in exploring the functions of interaction domains. Domain-based screen identifies novel, physiologically relevant WW domain-associated proteins. Several of the proteins identified as binding to specific WW domains have been previously shown to coimmunoprecipitate with the corresponding WW domain-containing protein. For example, transfected FE65 coprecipitates with endogenous Mena from Cos-7 cells (18), and CA150 coprecipitates with the SF1 splicing factor through its second WW domain (45). We identified both of these interactions in our screen (ENAH [GeneID 55740] and SF1 [GeneID 7536]). In addition, NEDD4-1 and its yeast orthologue, Rsp5p, directly interact with the RNA Pol II LS (12) and induce its ubiquitylation in response to UV-induced DNA damage (1). We found RNA Pol II LS in association with NEDD4-1 WW2 and also with WW domains from AIP4 and WWOX (POL2RA, GeneID 5430). However, the majority of the proteins we identified have not been previously shown to interact with the relevant WW domain-containing proteins. Therefore, we assessed whether selected proteins identified as binding isolated WW domains in vitro could associate with the cognate full-length proteins in cells (Fig. 6A). In a fashion similar to that of NEDD4-1, endogenous AIP4 coimmunoprecipitated with the RNA Pol II LS. More interestingly, we tested whether AIP4 associates in vivo with the p68 subunit of CFIm (p68) (CPSF6, GeneID 11052), which precipitated with the AIP4 WW2 domain in vitro. The CFIm complex is involved in the earliest events of pre-mRNA cleavage prior to the addition of the poly(A) tail (59, 60) and is formed through the heterodimerization of p68 with a p25 subunit (CPSF5, GeneID 11051) which was also precipitated with the AIP4 WW2 domain (Fig. 3). Indeed, the p68 subunit of CFIm coprecipitated with AIP4 from Jurkat T-cell lysate. We also found that CA150, which FIG. 3. Hierarchical clustering of WW domains with precipitated proteins. The GST-WW domains (abscissa) and precipitated proteins (ordinate) were hierarchically clustered as outlined in Materials and Methods. The tree at the top indicates the relatedness of the WW domains with respect to precipitated proteins. This clustering distinguishes three groups of WW domains: group A (AIP4 WW2, NEDD4-1 WW2, and WWOX WW1) in yellow, group B (FBP11 WW1 and WW2, FBP21 WW2, FE65 WW, CA150 WW2, and Gas7 WW) in red, and group C (Pin1 WW) in green. The color of the precipitated protein indicates cellular function and is shown to the right. 7098 INGHAM ET AL. MOL. CELL. BIOL. FIG. 5. Analysis of GST-WW domain pulldowns for phosphopeptides. (A) The MS data from the GST-Pin1 WW, NEDD4-1 WW2, and FBP11 WW2 pulldowns were analyzed for the presence of phosphopeptides as described in Materials and Methods. Identified phosphopeptides and their host proteins are indicated. Definitively phosphorylated residues are indicated in red, and possible phosphorylation sites where the identification was ambiguous are indicated in green. A lowercase “m” denotes an oxidized methionine, while a lowercase “q” denotes a pyroglutamine residue. Amino acids in parentheses are residues N and C terminal to the peptide in the protein sequence. Peptides indicated by an asterisk are doubly phosphorylated. (B) The MS/MS spectra for the THRAP3 phosphopeptide is shown. Note that only b (blue), y (red), and parental ion minus phosphoric acid (P(m/z) ⫺ H3PO4) or minus phosphoric acid and water (P(m/z) ⫺ H3PO4-H2O) (green) are indicated. The number associated with each ion is the mass/charge ratio (m/z) of that ion. (C) Statistics showing the number of phosphorylation sites, number of phosphorylated proteins, and number of phosphorylated proteins as a percentage of total proteins precipitated with each WW domain (from Fig. 3). has been reported to regulate transcription and splicing (25, 45, 63, 69) and to bind the Huntingtin protein (27), coprecipitated with Diaphanous 1 (DIAPH1, GeneID 1729) (Fig. 6A), a protein with links to the cytoskeleton (73), but which has also been shown to be present in the nucleus in NIH 3T3 cells (70). We next tested whether these in vivo interactions are indeed WW domain dependent. Figure 6B shows that Myc-tagged VOL. 25, 2005 IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS 7099 FIG. 6. Identified proteins are direct, in vivo, binding partners. (A) Proteins were precipitated from 293T cell lysate (anti-RNA Pol II LS immunoprecipitate [ippt]) or Jurkat cell lysate (anti-p68 ippts) with the indicated antibodies, separated on SDS-PAGs, transferred to PVDF membranes, and blotted with antiserum against AIP4 or Diaphanous 1 (top blots). Blots were stripped and reprobed with the indicated antibodies (bottom blots). (B) 293T cells were transfected with cDNAs coding for a doubly Myc-tagged AIP4 (Myc-AIP4), enzymatically inactive AIP4 mutant (Myc-C830A AIP4), an AIP4 construct lacking the entire C2 domain (Myc-⌬C2 AIP4), or a construct in which the second tryptophan (tyrosine in the case of the fourth WW domain) residue in each WW domain had been mutated to alanine (Myc-WWmut AIP4). Cell lysates were immunoprecipitated with the anti-Myc MAb 9E10, and immunoprecipitates were separated on SDS-PAGs, transferred to PVDF membranes, and blotted for the large subunit of RNA polymerase II (anti-RNA Pol II LS) or the p68 and p25 subunits of CFIm (anti-p68 and anti-p25). The blots were stripped and reprobed with the 9E10 anti-Myc MAb (bottom blots). Cell lysates are included to show that the levels of RNA Pol II LS, p68, and p25 are equivalent in each of the lysates. Note that the dot in the top blot of the Myc-WWmut AIP4 pulldown is an artifact. (C) Baculovirusexpressed, purified recombinant His-tagged p68 or p25 protein was incubated with GST alone or a GST fusion protein of the first WW domain of AIP4 (GST-AIP4 WW1). GST precipitates were separated on SDS-PAGs, transferred to nitrocellulose, and blotted with either anti-p68 antiserum (top blot) or anti-p25 antiserum (bottom blot). The positions of molecular mass markers (in kilodaltons) are indicated to the left. The positions of p68 and p25 proteins are indicated by the arrows. The asterisk in the top blot indicates a p68 degradation product, while the asterisk in the bottom blot indicates the GST-AIP4 WW1 fusion protein. The input is one/fifth of the material used for the pulldown experiment. (D) 293T cells were transfected as indicated with cDNAs coding for a doubly Myc-tagged AIP4 (Myc-AIP4), Flag-tagged p68 protein (FLAG-p68), or Flag-tagged p68 protein in which the tyrosine residue in the PY motif had been mutated to alanine (FLAG-p68 Y/A). Cell lysates were immunoprecipitated with the anti-FLAG M2 MAb, and immunoprecipitates were separated on SDS-PAGs, transferred to PVDF membranes, and blotted with the anti-Myc MAb (top blot). The blot was stripped and reprobed with the anti-Flag M2 MAb (middle blot). Cell lysate is included to show that equivalent amounts of Myc-tagged AIP4 were expressed in the appropriate transfections (bottom blot). (E) 293T cells were transfected as indicated with cDNAs coding for a Flag-tagged p68 protein (FLAG-p68), a doubly Myc-tagged AIP4 (Myc-AIP4), an enzymatically inactive AIP4 mutant (Myc-C830A AIP4), or an HA-tagged ubiquitin (HA-Ub) construct. Cell lysates were immunoprecipitated with the anti-FLAG M2 MAb, and immunoprecipitates were separated on SDS-PAGs, transferred to PVDF membranes, and blotted with an anti-HA antiserum (top blot). The blot was stripped and reprobed with the anti-Myc 9E10 MAb (middle blot). Cell lysate is included to show that equivalent amounts of FLAG protein were expressed in the appropriate transfections (bottom blot). The positions of ubiquitylated p68 proteins are indicated by arrows. The positions of molecular mass standards (in kilodaltons) are indicated to the left. AIP4 coprecipitates with the RNA Pol II LS and the p68 and p25 subunits of CFIm from 293T cell lysate. Similar results were obtained with Myc-tagged AIP4 variants lacking the entire C2 domain (⌬C2) or rendered enzymatically inactive by substitution of a critical catalytic cysteine within the HECT domain (C830A). However, collectively inactivating the four WW domains in AIP4 by mutating the second conserved tryptophan residue in each case (tyrosine in the case of the fourth WW domain) (13, 50) abolished AIP4 interaction with all three of these binding partners (Fig. 6B). 7100 INGHAM ET AL. While these coprecipitation experiments identify proteins that associate with AIP4 in cells, they do not prove that these interactions are direct. To explore this point, we tested whether baculovirus-expressed, purified, recombinant Histagged p68 and p25 subunits of CFIm could be precipitated with a GST fusion protein containing the first WW domain of AIP4. Although both the p68 and p25 subunits possess a single PY motif and therefore could potentially directly interact with AIP4, only the purified p68 protein on its own precipitated with GST-AIP4 WW1, whereas p25 did not (Fig. 6C). However, the p25 subunit was precipitated with the GST-AIP4 WW1 fusion protein when coincubated with the p68 subunit, likely through heterodimerizing with the p68 protein. To test whether the single PY motif of p68 is required for its WW domain-dependent interaction with AIP4, we coexpressed Myc-tagged AIP4 in 293T cells with either Flag-p68 or Flag-p68 Y/A, which contains a mutation that disrupts WW domain binding (14, 74). Unlike wild-type Flag-p68, the Flagp68 Y/A protein did not coprecipitate the Myc-tagged AIP4 (Fig. 6D), indicating that the interaction between AIP4 and p68 depends on both the AIP4 WW domains and the p68 PY motif. Since AIP4 is an E3 protein-ubiquitin ligase, we examined whether the interaction between p68 and AIP4 might promote p68 ubiquitylation. Figure 6E shows that when Flag-tagged p68 is coexpressed with an HA-tagged ubiquitin construct in 293T cells, the HA-tagged ubiquitin is incorporated into p68, forming a characteristic ubiquitin ladder. Moreover, the further expression of Myc-tagged AIP4, but not the enzymatically inactive C830A mutant, strongly enhanced p68 ubiquitylation. Taken together, these data suggest that proteins identified in this domain-based screen can serve as direct physiological binding partners for WW domain-containing polypeptides. Furthermore, the screen can identify a protein such as the p25 subunit of CFIm, which is recruited into a complex with a WW domain protein through an indirect association. In addition, complexes such as CFIm/AIP4 identified in the WW domain screen can be functional in vivo, as indicated by the AIP4mediated ubiquitylation of p68. WW domain screen isolates multiprotein complexes involved in transcription, splicing, chromatin remodeling, and cytoskeletal organization. The data for CFIm suggest that the WW domains can bind proteins that are themselves components of stable multiprotein complexes. In fact, analysis of the WW domain-binding proteins identified by LC-MS/MS suggests that several such complexes are recruited to WW domains. For instance, several studies have shown that the WW domains of NEDD4 family proteins bind the RNA Pol II LS (POLR2A) (1, 12). Indeed, we found the RNA Pol II LS associated with the AIP4 WW2, NEDD4 WW2, and WWOX WW1 domains. Moreover, we identified additional RNA Pol II subunits, as well as proteins reported to bind RNA Pol II subunits (RNA polymerase II, subunit 5-mediating protein [C19orf2] [GeneID 8725]) (Fig. 7A). These data argue that the WW domain pulldowns precipitate larger transcriptional complexes. Additionally, we identified proteins of the ENA/VASP family (ENAH [Mena], VASP, and EVL) in pulldowns with the FE65 WW domain and other group B (II/III) WW domains (Fig. 7B). The FE65 WW domain reportedly binds a PPLP motif in Mena (18) and EVL (44), and the related VASP MOL. CELL. BIOL. protein also has this motif. Collectively, these proteins control cell motility through regulation of the actin cytoskeleton (61). The ENA/VASP proteins associate with WASP/WAVE family proteins (11), which also regulate the actin cytoskeleton through the Arp2/3 complex, and we found several family members (WAS [WASP], WASL [N-WASP], and WASF2 [WAVE2]) in group B (II/III) WW domain pulldowns (Fig. 7B and Fig. 3) (5). These proteins can recruit additional polypeptides, such as WASPIP (WIP), WIRE, ABI1, and CYFIP1/2, which were also identified in the group B (II/III) WW domain precipitations. It will be of interest to determine how WW domain interactions may regulate this actin polymerizing complex; in one instance, a WW domain protein, FBP11, has been shown to interfere with N-WASP function by sequestering it in the nucleus (52). Other cellular complexes were also seen in WW domain pulldowns. We purified components that make up the U2 splicing complex (SF3A1, -2, and -3 and SF3B1, -2, -3, and -4) with group B (II/III) WW domains, subunits of the chaperonin-containing t-complex polypeptide 1 (CCT) that facilitate the folding of actin, tubulin, and other cytosolic proteins (CCT2, CCT4, CCT7) (71) with FBP11 WW2 and several components of the chromatin remodeling SWI/SNF complex (ARID1A, SMARCC1, SMARCC2, and SMARCE1) with the AIP4 WW2 domain. These results are consistent with the notion that WW domain proteins interact with larger multiprotein complexes involved in multiple facets of cellular regulation. Specificity within the WW domains of AIP4 and between the related WW domains of AIP4 and NEDD4-1. In the MS-based analysis, we used the second WW domain of AIP4 to precipitate proteins from Jurkat cell lysate. However, AIP4 possesses four WW domains, which may differ in their ability to bind ligands (22, 26, 47, 64). Therefore, we investigated whether proteins identified by MS as precipitating with the second WW domain could also bind to the other WW domains of AIP4. Figure 8A shows that the RNA Pol II LS and the p68 and p25 subunits of CFIm were precipitated by each of four WW domains, whereas EWS (EWSR1, GeneID 2130), a protein with an RNA recognition motif that is frequently rearranged with Ets family transcription factors in cancers such as Ewing’s sarcoma (41), was specifically precipitated by AIP4 WW2. Similarly, a Flag-tagged KIAA0144 protein (UBAP2L, GeneID 9898), a protein of unknown function with a ubiquitin-associated domain, was selectively precipitated by the first and second WW domains. Thus, the isolated domains of tandem WW domain arrays can show both common and selective binding, suggesting that the tandem WW domains of AIP4 may act synergistically to bind common partners or alternatively may act as a scaffold to recruit specific proteins to individual WW domains. Many of the proteins precipitated by the second WW domains of AIP4 and NEDD4-1 possess at least one PY motif (Fig. 8B and Fig. 4). While many of these proteins were precipitated by WW domains from both proteins, several were selectively precipitated by only one of the WW domains (e.g., UBAP2L [KIAA0144] and EWSR1 [EWS]). Direct analysis by immunoblotting of proteins isolated from various amounts of 293T cell lysate showed that the second WW domains of NEDD4-1 and AIP4 were equivalent in their abilities to pre- VOL. 25, 2005 IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS 7101 FIG. 7. Protein complexes precipitated by GST-WW domains. (A) The transcriptional complex precipitated by the AIP4 WW2 domain is shown. The WW domain is indicated in yellow, direct WW domain-binding proteins in red, and likely indirectly precipitating proteins in green. Gene symbols and common names (in parentheses) are provided. Black lines indicate a direct physical interaction, while gray lines indicate a physical association, but not necessarily direct. (B) Actin cytoskeleton-regulating complex precipitated by the FE65 WW domain (as well as other group B [II/III] WW domains) is shown and labeled as described above for panel A. Although not found in the FE65 pulldowns, NAP1 (NCKAP1) was included (in blue) as it has been shown to bridge ABI1 with CYFIP1/2 (24, 35) and was found in association with other group B (II/III) WW domains (Fig. 3). Note that this model represents one possible cytoskeleton-regulating complex based on published work. Since the WASP proteins (WAS and WASL), WIPIP, WIRE, WASF2, and ABI1 are extremely proline rich and contain motifs that are predicted to interact with group B WW domains, they may interact directly with group B WW domains. cipitate the p68 subunit of CFIm, whereas the KIAA0144 and EWS proteins were preferentially precipitated by the AIP4 WW2 domain (Fig. 8C). This preferential binding of EWS and KIAA0144 likely explains why the EWS and KIAA0144 proteins were not identified as NEDD4-1 WW2-binding proteins by MS (Fig. 3; see Table S2 in the supplemental material). To confirm that binding of the AIP4 and NEDD4-1 WW domains to the KIAA0144 protein was through the PY motif in KIAA0144, we mutated the tyrosine residue in the PY motif. Figure 8D shows this substitution greatly decreased KIAA0144 precipitation by both the AIP4 and NEDD4-1 WW2 domains. The mutation did not completely abolish binding and had a more severe effect on precipitation by the NEDD4-1 WW2 domain than AIP4 WW2 (Fig. 8D). To probe whether the PY motif is the main site of interaction for the WW domains of AIP4 and NEDD4-1 and determine whether there may be additional binding sites, as suggested by the PY mutants, we generated an array of overlapping peptides corresponding to the entire sequence of KIAA0144 and determined which peptides were recognized by GST-AIP4 WW2 and GST-NEDD4-1 WW2 fusion proteins. A representative array (Fig. 8E) shows that three peptides were reproducibly and specifically recognized by GST-AIP4 WW2 and GST-NEDD4-1 WW2 (two strongly and one weakly, indicated in red) but not GST alone. The two strong-intensity spots (spots 1 and 2) include the PY motif, while the weak-intensity spot (spot 3) contains an LPXY motif, a motif that has been shown to bind the WW domains of NEDD4 family proteins (13, 62). Taken together, these data show that recognition of KIAA0144 by the second WW domains of AIP4 and NEDD4-1 is largely dependent on the KIAA0144 PY motif and that individual group A (I) WW domains show preference for specific PY motifs. 7102 INGHAM ET AL. MOL. CELL. BIOL. FIG. 8. Specificity of binding between the different WW domains of AIP4 and the second WW domains of AIP4 and NEDD4-1. (A) 293T cells were transfected with a cDNA coding for a Flag-tagged KIAA0144 protein (FLAG-KIAA0144 blot) or left untransfected (other blots). Cell lysates were prepared, and an equivalent amount of lysate was precipitated with GST alone or with the indicated GST-AIP4 WW domain. Half of each precipitate was separated on an SDS-PAG, transferred to a PVDF membrane, and blotted with the indicated antibody. The other half of the GST pulldown was separated on an SDS-PAG and stained with Coomassie blue to show that equivalent amounts of GST fusion protein were used for the pulldowns (bottom blot). Cell lysate is included to show the electrophoretic mobilities of the indicated proteins. (B) Osprey diagram (7) showing the proteins identified by MS to precipitate with the second WW domains of NEDD4-1 and AIP4. Proteins with PY motifs are indicated in red. Note that the large subunit of RNA polymerase II (POL2A) does not have a PY motif but does possess several copies of the YSPTSPS heptamer sequence in its C-terminal domain (CTD), which has been shown to bind the WW domains of NEDD4 family proteins (12). (C) 293T cells were transfected with a cDNA coding for a Flag-tagged KIAA0144 protein (FLAG-KIAA0144 blot) or left untransfected (other blots). Cell lysates were prepared, and proteins were precipitated with the indicated GST fusion proteins from either 2 mg or 500 g of cell lysate. Half of each precipitate was separated on an SDS-PAG, transferred to a PVDF membrane, and blotted with the indicated antibody. The other half of the GST pulldown was separated on an SDS-PAG and stained with Coomassie blue to show that equivalent amounts of GST fusion protein were used for the pulldowns (bottom blot). Cell lysate (10 g) was included to show the electrophoretic mobility of the immunoblotted protein and to give an indication of the amount of protein precipitated by the GST fusion proteins. (D) 293T cells were transfected with a cDNA coding for a Flag-tagged KIAA0144 protein (wild type [wt]) or a Flag-tagged KIAA0144 protein in which the tyrosine residue in the PY motif had been mutated to alanine (Y/A). Cell lysates were prepared, and proteins were precipitated with the indicated fusion proteins from equivalent amounts of cell lysate. Half of each precipitate was separated on an SDS-PAG, transferred to a PVDF membrane, and blotted with the anti-FLAG M2 MAb. The other half of the GST pulldown was separated on an SDS-PAG and stained with Coomassie blue to show that equivalent amounts of GST fusion protein were used for the pulldowns (bottom blot). Cell lysate was included to show the electrophoretic mobility of the FLAG-KIAA0144 protein. (E) SPOTS membranes were prepared as outlined in Materials and Methods. Proteins (12-mers) comprising the entire coding sequence of KIAA0144 were generated, with a moving window of four amino acids. SPOTS membranes were probed with 1 ⌴ of either GST-AIP4 WW2, GST-NEDD4-1 WW2, or GST alone. Spots specific to GST-AIP4 WW2 and NEDD4-1 WW2 are numbered and indicated in red. The sequences of these peptides (with the PY motif and LPXY motifs in red) are shown below the blots. Spots marked by an asterisk were recognized by GST alone in other experiments. An individual protein with distinct binding motifs for different classes of WW domains. A few proteins identified in the MS screen were precipitated by WW domains that typically recognize different classes of proline-rich ligands. For example, the p68 protein of CFIm was precipitated by seven of the WW domains analyzed (AIP4 WW2, NEDD4-1 WW2, WWOX WW, CA150 WW2, FBP21 WW2, and FBP11 WW2). This raised the question as to whether p68 possessed a single “ge- VOL. 25, 2005 IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS 7103 FIG. 9. Recognition of the p68 CFIm protein by multiple WW domains. (A) The coding sequence of the p68 protein is shown. The proline-rich region used for the SPOTS blots in panel C is indicated in pink. The PY (red), PPPPP stretch (yellow), and three PPGPPP motifs (green) are also indicated. The RNA recognition motif is underlined. (B) 293T cells were transfected with cDNAs coding for either a Flag-tagged p68 protein (wild type [wt]) or a Flag-tagged p68 protein in which the tyrosine residue in the PY motif had been mutated to alanine (Y/A). Cell lysates were prepared, and an equivalent amount of lysate was precipitated with GST alone or the indicated GST-WW domain fusion protein. Half of each precipitate was separated on an SDS-PAG, transferred to a PVDF membrane, and blotted with the anti-FLAG M2 MAb. The other half of the GST pulldown was separated on an SDS-PAG and stained with Coomassie blue to show that equivalent amounts of GST fusion protein were used for the pulldowns (bottom blot). Cell lysate is included to show the electrophoretic mobilities of the indicated proteins; note that the p68 Y/A was consistently expressed at higher levels than the wt protein. (C) Peptides (12-mers), with a moving window of five amino acids, were generated for the proline-rich region of p68 (panel A, pink region), and SPOTS membranes were probed with 1 M of GST alone or the indicated GST-WW domain fusion protein. Peptides recognized by one group of WW domains over another are boxed and numbered, and the peptide sequences are provided below the blots. In the boxed sequences, the PY (red), PPPPP (yellow), and PPGPPP (green) motifs in the peptides are highlighted. neric” WW domain recognition motif, or alternatively, distinct recognition motifs recognized by different classes of WW domains. The sequence of p68 (Fig. 9A) shows that the C terminus possesses numerous RS repeats, a characteristic of SR proteins (9). The p68 protein is also extremely proline rich and has a single PY motif, a PPPPP stretch, and three PPGPPP motifs. Since the PY motif of p68 is required for association with AIP4 (Fig. 6), we tested whether it is also important for association with other WW domains that recognize p68. Mutation of the p68 PY motif abolished binding to the WW domains of WWOX and NEDD4-1. In contrast, the AIP4 WW2 domain showed reduced but still significant binding to the p68 Y/A mutant, while the FE65 WW domain and CA150 WW2 domains were equally able to precipitate wild-type p68 and the p68 Y/A mutant (Fig. 9B). To locate the recognition motifs for these latter WW domains, we synthesized a peptide SPOTS array corresponding to the proline-rich region of p68 (Fig. 9A, pink region) and probed this with GST-WW fusion proteins (Fig. 9C). As expected, the NEDD4-1 WW2, WWOX WW1, and AIP4 WW2 domains recognized two overlapping peptides containing the PY motif (Fig. 9C, red box). The fact that the AIP4 WW2 domain recognized no additional peptides suggests that the binding of this WW domain to the p68 Y/A mutant still involves the p68 PY motif (Fig. 9B). The FE65 WW domain and CA150 WW2 domain collectively bound a distinct set of proline-rich peptides that were not recognized by the NEDD4-1, WWOX, or AIP4 WW domain or GST alone (Fig. 9C, blue, pink, and green boxes). These included peptides comprising the PPPPP stretch and the three PPGPPP motifs, as well as several other peptides rich in proline and proline/ 7104 INGHAM ET AL. arginine (Fig. 9C). These results illustrate the specificity of group A (I) WW domains for PY motifs and show the more promiscuous nature of the group B (II/III) WW domains. Furthermore, the data suggest that WW domains may interact with p68 at multiple sites, with specific classes of WW domains recruited to separate sites. The p68 protein may therefore exemplify a scaffold with the ability to recruit multiple classes of WW domains and WW domain proteins through distinct proline-rich sequences. DISCUSSION To explore the range of cellular processes and ligand-binding preferences exhibited by individual WW domains, we have undertaken an MS-based screen to identify proteins from Jurkat T cells that associate in vitro with GST-WW domain fusion proteins taken from a broad range of WW-containing proteins. We have used unsupervised hierarchical clustering to group WW domains on the basis of their similarity of precipitated protein ligands in this assay. This approach yielded three subsets of WW domains that cluster on the basis of the similarities of their protein-binding partners. Many of these associated proteins contain proline-rich sequences typical of peptide ligands for WW domains. Indeed, we found that proteins in one cluster (A) are enriched for PY motifs and are recognized by WW domains with a known predilection for PY peptides (i.e., group I WW domains). In a similar fashion, proteins in a second cluster (B) possess PPLP and proline/argininecontaining motifs, as well as polyproline stretches containing glycine and methionine, and associate with WW domains previously classified as group II/III on the basis of their peptidebinding properties. The final cluster (C) contains proteins that are recognized by the Pin1 WW domain (group IV); these show no preference for the specific proline-rich motifs analyzed but are enriched in phosphorylated-Ser/Thr-Pro sites. Although we have included the FBP21 WW2 in group II/III (cluster B), it has a fair number of specific binding partners (Fig. 3) and has previously been suggested to lie outside the conventional WW domain groups (29). Hierarchical clustering therefore reveals a striking relationship between the full-length proteins recognized by a particular subset of WW domains in cell lysates and the peptide-binding specificities of these domains. This suggests that significant information regarding the molecular recognition properties of interaction domains can be inferred from proteomic-scale analysis of their proteinbinding partners. Furthermore, these data argue that the WW domain-binding motifs identified by the analysis of synthetic peptides are used extensively in the context of intact proteins to determine binding specificity. While some of the polypeptides precipitated by WW domains in this analysis have been described as binding partners for specific WW domain proteins in cells, most of these interactions have not been previously identified. We therefore used the p68/p25 subunits of the CFIm complex to address whether a novel interaction detected in this in vitro screen might be relevant in cells, whether it involves the anticipated recognition of a specific proline-rich motif by one or more WW domains, whether WW domains can recruit multiple components of a larger protein complex, and whether such interactions are functional in vivo. The p68 protein contains RS repeats and a MOL. CELL. BIOL. RNA recognition motif, which together with p25 binds premRNA and aids in the recruitment of other 3⬘ processing factors, such as cleavage and polyadenylation specificity factor (16, 59, 60). Indeed, we have found that the AIP4 E3 proteinubiquitin ligase binds directly to the p68 CFIm subunit in a fashion that requires functional AIP4 WW domains and a PY motif in p68. Although p25 has a PY motif, it does not interact directly with AIP4 WW domains but rather binds p68, which acts as a bridge to AIP4. Furthermore, we identified an SR protein, SFRS7 (9G8), which interacts with the RS domain of p68 (16) in the NEDD4-1 WW2 pulldowns (Fig. 3), providing further support for the notion that WW domain pulldowns can precipitate cellular complexes. From a functional point of view, AIP4 stimulates the selective ubiquitylation of p68 (but not p25) in cells. Ubiquitylation has many functions, including degradation via the 26S proteasome, but can also positively and negatively regulate molecular interactions independent of degradation (37, 43). It will be of interest to explore whether ubiquitylation of p68 may regulate its binding to RNA or its ability to recruit other factors important for pre-mRNA cleavage. Taken together, these data suggest that WW domains may regulate an extensive network of protein-protein interactions in vivo. Two additional points support this view. First, the WW domains tested in this screen isolated a number of multiprotein complexes with central functions in cellular regulation, including assemblies involved in transcription, splicing, chromatin remodeling, and actin polymerization. WW domain proteins may therefore serve to bridge or regulate such cellular machinery, for example in the coupling of transcription to splicing (25, 45, 63). In support of this notion, we have used peptide arrays to show that the p68 CFIm subunit contains multiple prolinerich sequences that are able to bind WW domains of different classes in vitro. Thus, WW domains can potentially function to establish networks of interactions, and this may be especially true of proteins with tandem WW domains. In this regard, the screen compared the second WW domains of the related NEDD4 family E3 ubiquitin ligases NEDD4-1 and AIP4. While many similarities were seen in the proteins precipitated by the second WW domains of AIP4 and NEDD4-1, there were some proteins specifically precipitated with one WW domain (Fig. 8B). Further investigation of two such selective proteins, KIAA0144 and EWS, confirmed that these proteins were indeed precipitated much more efficiently by the AIP4 WW2 domain than by NEDD4-1 WW2 (Fig. 8C). Since humans have nine NEDD4 family proteins (33), an important question is whether these proteins are functionally overlapping or whether they have specific targets. Our data, although limited to one WW domain for both AIP4 and NEDD4-1, suggest that both these paradigms may apply, as some proteins are recognized equivalently by the two domains, while others are selective for one. Of interest, a chromosomal inversion that disrupts the promoter of the Itch/AIP4 gene in mice leads to a hyperactivation of cells of the immune system and an autoimmune-like phenotype; this also suggests that these proteins can have nonoverlapping biological functions (58). In conclusion, our experiments show that WW domains, and by extension WW domain-containing proteins, associate with multiprotein cellular complexes, many of which have not been VOL. 25, 2005 IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS previously described. Thus, WW domains provide a versatile module with which to assemble regulatory protein networks. 22. ACKNOWLEDGMENTS R.J.I. was a recipient of fellowships from the National Cancer Institute of Canada (supported by Funds from the Terry Fox Run) and the Leukemia & Lymphoma Society. T.P. is a Distinguished Scientist of the Canadian Institutes for Health Research (CIHR). This work was supported by grants from the CIHR, the Ontario R&D Research Fund, and Genome Canada. We also thank the National Cell Culture Center for tissue culture services. 23. 24. 25. REFERENCES 1. Beaudenon, S. L., M. R. Huacani, G. Wang, D. P. McDonnell, and J. M. Huibregtse. 1999. Rsp5 ubiquitin-protein ligase mediates DNA damageinduced degradation of the large subunit of RNA polymerase II in Saccharomyces cerevisiae. Mol. Cell. Biol. 19:6972–6979. 2. Bedford, M. T., D. C. Chan, and P. Leder. 1997. FBP WW domains and the Abl SH3 domain bind to a specific class of proline-rich ligands. EMBO J. 16:2376–2383. 3. Bedford, M. T., D. Sarbassova, J. Xu, P. Leder, and M. B. Yaffe. 2000. A novel pro-Arg motif recognized by WW domains. J. Biol. Chem. 275:10359– 10369. 4. Bednarek, A. K., K. J. Laflin, R. L. Daniel, Q. Liao, K. A. Hawkins, and C. M. Aldaz. 2000. WWOX, a novel WW domain-containing protein mapping to human chromosome 16q23.3–24.1, a region frequently affected in breast cancer. Cancer Res. 60:2140–2145. 5. Bompard, G., and E. Caron. 2004. Regulation of WASP/WAVE proteins: making a long story short. J. Cell Biol. 166:957–962. 6. Bork, P., and M. Sudol. 1994. The WW domain: a signalling site in dystrophin? Trends Biochem. Sci. 19:531–533. 7. Breitkreutz, B. J., C. Stark, and M. Tyers. 27 February 2003, posting date. Osprey: a network visualization system. Genome Biol. 4:R22. [Online.] http://genomebiology.com/2003/4/3/R22. 8. Buschdorf, J. P., and W. H. Stratling. 2004. A WW domain binding region in methyl-CpG-binding protein MeCP2: impact on Rett syndrome. J. Mol. Med. 82:135–143. 9. Caceres, J. F., and A. R. Kornblihtt. 2002. Alternative splicing: multiple control mechanisms and involvement in human disease. Trends Genet. 18: 186–193. 10. Camon, E., M. Magrane, D. Barrell, D. Binns, W. Fleischmann, P. Kersey, N. Mulder, T. Oinn, J. Maslen, A. Cox, and R. Apweiler. 2003. The Gene Ontology Annotation (GOA) project: implementation of GO in SWISSPROT, TrEMBL, and InterPro. Genome Res. 13:662–672. 11. Castellano, F., C. Le Clainche, D. Patin, M. F. Carlier, and P. Chavrier. 2001. A WASp-VASP complex regulates actin polymerization at the plasma membrane. EMBO J. 20:5603–5614. 12. Chang, A., S. Cheang, X. Espanel, and M. Sudol. 2000. Rsp5 WW domains interact directly with the carboxyl-terminal domain of RNA polymerase II. J. Biol. Chem. 275:20562–20571. 13. Chen, H. I., A. Einbond, S. J. Kwak, H. Linn, E. Koepf, S. Peterson, J. W. Kelly, and M. Sudol. 1997. Characterization of the WW domain of human Yes-associated protein and its polyproline-containing ligands. J. Biol. Chem. 272:17070–17077. 14. Chen, H. I., and M. Sudol. 1995. The WW domain of Yes-associated protein binds a proline-rich ligand that differs from the consensus established for Src homology 3-binding modules. Proc. Natl. Acad. Sci. USA 92:7819–7823. 15. Chung, W., and J. T. Campanelli. 1999. WW and EF hand domains of dystrophin-family proteins mediate dystroglycan binding. Mol. Cell Biol. Res. Commun. 2:162–171. 16. Dettwiler, S., C. Aringhieri, S. Cardinale, W. Keller, and S. M. L. Barabino. 2004. Distinct sequence motifs within the 68-kDa subunit of cleavage factor Im mediate RNA binding, protein-protein interactions, and subcellular localization. J. Biol. Chem. 279:35788–35797. 17. Eisen, M. B., P. T. Spellman, P. O. Brown, and D. Botstein. 1998. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95:14863–14868. 18. Ermekova, K. S., N. Zambrano, H. Linn, G. Minopoli, F. Gertler, T. Russo, and M. Sudol. 1997. The WW domain of neural protein FE65 interacts with proline-rich motifs in Mena, the mammalian homolog of Drosophila Enabled. J. Biol. Chem. 272:32869–32877. 19. Espanel, X., and M. Sudol. 1999. A single point mutation in a group I WW domain shifts its specificity to that of group II WW domains. J. Biol. Chem. 274:17284–17289. 20. Faber, P. W., G. T. Barnes, J. Srinidhi, J. Chen, J. F. Gusella, and M. E. MacDonald. 1998. Huntingtin interacts with a family of WW domain proteins. Hum. Mol. Genet. 7:1463–1474. 21. Fan, H., K. Sakuraba, A. Komuro, S. Kato, F. Harada, and Y. Hirose. 2003. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. 43. 44. 7105 PCIF1, a novel human WW domain-containing protein, interacts with the phosphorylated RNA polymerase II. Biochem. Biophys. Res. Commun. 301: 378–385. Fotia, A. B., A. Dinudom, K. E. Shearwin, J. P. Koch, C. Korbmacher, D. I. Cook, and S. Kumar. 2003. The role of individual Nedd4-2 (KIAA0439) WW domains in binding and regulating epithelial sodium channels. FASEB J. 17:70–72. Frank, R. 2002. The SPOT-synthesis technique. Synthetic peptide arrays on membrane supports—principles and applications. J. Immunol. Methods 267: 13–26. Gautreau, A., H. Y. Ho, J. Li, H. Steen, S. P. Gygi, and M. W. Kirschner. 2004. Purification and architecture of the ubiquitous Wave complex. Proc. Natl. Acad. Sci. USA 101:4379–4383. Goldstrohm, A. C., T. R. Albrecht, C. Sune, M. T. Bedford, and M. A. Garcia-Blanco. 2001. The transcription elongation factor CA150 interacts with RNA polymerase II and the pre-mRNA splicing factor SF1. Mol. Cell. Biol. 21:7617–7628. Harvey, K. F., A. Dinudom, P. Komwatana, C. N. Jolliffe, M. L. Day, G. Parasivam, D. I. Cook, and S. Kumar. 1999. All three WW domains of murine Nedd4 are involved in the regulation of epithelial sodium channels by intracellular Na⫹. J. Biol. Chem. 274:12525–12530. Holbert, S., I. Denghien, T. Kiechle, A. Rosenblatt, C. Wellington, M. R. Hayden, R. L. Margolis, C. A. Ross, J. Dausset, R. J. Ferrante, and C. Neri. 2001. The Gln-Ala repeat transcriptional activator CA150 interacts with huntingtin: neuropathologic and genetic evidence for a role in Huntington’s disease pathogenesis. Proc. Natl. Acad. Sci. USA 98:1811–1816. Houthaeve, T., H. Gausepohl, M. Mann, and K. Ashman. 1995. Automation of micro-preparation and enzymatic cleavage of gel electrophoretically separated proteins. FEBS Lett. 376:91–94. Hu, H., J. Columbus, Y. Zhang, D. Wu, L. Lian, S. Yang, J. Goodwin, C. Luczak, M. Carter, L. Chen, M. James, R. Davis, M. Sudol, J. Rodwell, and J. J. Herrero. 2004. A map of WW domain family interactions. Proteomics 4:643–655. Huang, X., F. Poy, R. Zhang, A. Joachimiak, M. Sudol, and M. J. Eck. 2000. Structure of a WW domain containing fragment of dystrophin in complex with beta-dystroglycan. Nat. Struct. Biol. 7:634–638. Ikeda, M., A. Ikeda, L. C. Longan, and R. Longnecker. 2000. The EpsteinBarr virus latent membrane protein 2A PY motif recruits WW domaincontaining ubiquitin-protein ligases. Virology 268:178–191. Ilsley, J. L., M. Sudol, and S. J. Winder. 2002. The WW domain: linking cell signalling to the membrane cytoskeleton. Cell Signal 14:183–189. Ingham, R. J., G. Gish, and T. Pawson. 2004. The Nedd4 family of E3 ubiquitin ligases: functional diversity within a common modular architecture. Oncogene 23:1972–1984. Ingham, R. J., L. Santos, M. Dang-Lawson, M. Holgado-Madruga, P. Dudek, C. R. Maroun, A. J. Wong, L. Matsuuchi, and M. R. Gold. 2001. The Gab1 docking protein links the B cell antigen receptor to the phosphatidylinositol 3-kinase/Akt signaling pathway and to the SHP2 tyrosine phosphatase. J. Biol. Chem. 276:12257–12265. Innocenti, M., A. Zucconi, A. Disanza, E. Frittoli, L. B. Areces, A. Steffen, T. E. Stradal, P. P. Di Fiore, M. F. Carlier, and G. Scita. 2004. Abi1 is essential for the formation and activation of a WAVE2 signalling complex. Nat. Cell Biol. 6:319–327. Jolliffe, C. N., K. F. Harvey, B. P. Haines, G. Parasivam, and S. Kumar. 2000. Identification of multiple proteins expressed in murine embryos as binding partners for the WW domains of the ubiquitin-protein ligase Nedd4. Biochem. J. 351:557–565. Kaiser, P., K. Flick, C. Wittenberg, and S. I. Reed. 2000. Regulation of transcription by ubiquitination without proteolysis: Cdc34/SCF(Met30)-mediated inactivation of the transcription factor Met4. Cell 102:303–314. Kanelis, V., D. Rotin, and J. D. Forman-Kay. 2001. Solution structure of a Nedd4 WW domain-ENaC peptide complex. Nat. Struct. Biol. 8:407–412. Kato, Y., K. Nagata, M. Takahashi, L. Lian, J. J. Herrero, M. Sudol, and M. Tanokura. 2004. Common mechanism of ligand recognition by group II/III WW domains: redefining their functional classification. J. Biol. Chem. 279: 31833–31841. Komuro, A., M. Saeki, and S. Kato. 1999. Association of two nuclear proteins, Npw38 and NpwBP, via the interaction between the WW domain and a novel proline-rich motif containing glycine and arginine. J. Biol. Chem. 274:36513–36519. Kovar, H. 2003. Ewing tumor biology: perspectives for innovative treatment approaches. Adv. Exp. Med. Biol. 532:27–37. Krebs, D. L., Y. Yang, M. Dang, J. Haussmann, and M. R. Gold. 1999. Rapid and efficient retrovirus-mediated gene transfer into B cell lines. Methods Cell Sci. 21:57–68. Kuras, L., A. Rouillon, T. Lee, R. Barbey, M. Tyers, and D. Thomas. 2002. Dual regulation of the Met4 transcription factor by ubiquitin-dependent degradation and inhibition of promoter recruitment. Mol. Cell 10:69–80. Lambrechts, A., A. V. Kwiatkowski, L. M. Lanier, J. E. Bear, J. Vandekerckhove, C. Ampe, and F. B. Gertler. 2000. cAMP-dependent protein kinase phosphorylation of EVL, a Mena/VASP relative, regulates its interaction with actin and SH3 domains. J. Biol. Chem. 275:36143–36151. 7106 INGHAM ET AL. 45. Lin, K. T., R. M. Lu, and W. Y. Tarn. 2004. The WW domain-containing proteins interact with the early spliceosome and participate in pre-mRNA splicing in vivo. Mol. Cell. Biol. 24:9176–9185. 46. Liou, Y. C., A. Sun, A. Ryo, X. Z. Zhou, Z. X. Yu, H. K. Huang, T. Uchida, R. Bronson, G. Bing, X. Li, T. Hunter, and K. P. Lu. 2003. Role of the prolyl isomerase Pin1 in protecting against age-dependent neurodegeneration. Nature 424:556–561. 47. Lott, J. S., S. J. Coddington-Lawson, P. H. Teesdale-Spittle, and F. J. McDonald. 2002. A single WW domain is the predominant mediator of the interaction between the human ubiquitin-protein ligase Nedd4 and the human epithelial sodium channel. Biochem. J. 361:481–488. 48. Lu, P. J., G. G. Wulf, X. Z. Zhou, P. Davies, and K. P. Lu. 1999. The prolyl isomerase Pin1 restores the function of Alzheimer-associated phosphorylated tau protein. Nature 399:784–788. 49. Lu, P. J., X. Z. Zhou, M. Shen, and K. P. Lu. 1999. Function of WW domains as phosphoserine- or phosphothreonine-binding modules. Science 283:1325– 1328. 50. Macias, M. J., M. Hyvonen, E. Baraldi, J. Schultz, M. Sudol, M. Saraste, and H. Oschkinat. 1996. Structure of the WW domain of a kinase-associated protein complexed with a proline-rich peptide. Nature 382:646–649. 51. Marchler-Bauer, A., J. B. Anderson, P. F. D. Cherukuri, C. DeWeese-Scott, L. Y. Geer, M. Gwadz, S. He, D. I. Hurwitz, J. D. Jackson, Z. Ke, C. J. Lanczycki, C. A. Liebert, C. Liu, F. Lu, G. H. Marchler, M. Mullokandov, B. A. Shoemaker, V. Simonyan, J. S. Song, P. A. Thiessen, R. A. Yamashita, J. J. Yin, D. Zhang, and S. H. Bryant. 2005. CDD: a Conserved Domain Database for protein classification. Nucleic Acids Res. 33:D192–D196. 52. Mizutani, K., S. Suetsugu, and T. Takenawa. 2004. FBP11 regulates nuclear localization of N-WASP and inhibits N-WASP-dependent microspike formation. Biochem. Biophys. Res. Commun. 313:468–474. 53. Murillas, R., K. S. Simms, S. Hatakeyama, A. M. Weissman, and M. R. Kuehn. 2002. Identification of developmentally expressed proteins that functionally interact with Nedd4 ubiquitin ligase. J. Biol. Chem. 277:2897–2907. 54. Otte, L., U. Wiedemann, B. Schlegel, J. R. Pires, M. Beyermann, P. Schmieder, G. Krause, R. Volkmer-Engert, J. Schneider-Mergener, and H. Oschkinat. 2003. WW domain sequence activity relationships identified using ligand recognition propensities of 42 WW domains. Protein Sci. 12:491– 500. 55. Passani, L. A., M. T. Bedford, P. W. Faber, K. M. McGinnis, A. H. Sharp, J. F. Gusella, J. P. Vonsattel, and M. E. MacDonald. 2000. Huntingtin’s WW domain partners in Huntington’s disease post-mortem brain fulfill genetic criteria for direct involvement in Huntington’s disease pathogenesis. Hum. Mol. Genet. 9:2175–2182. 56. Pawson, T., and P. Nash. 2003. Assembly of cell regulatory systems through protein interaction domains. Science 300:445–452. 57. Perkins, D. N., D. J. Pappin, D. M. Creasy, and J. S. Cottrell. 1999. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20:3551–3567. 58. Perry, W. L., C. M. Hustad, D. A. Swing, T. N. O’Sullivan, N. A. Jenkins, and N. G. Copeland. 1998. The itchy locus encodes a novel ubiquitin protein ligase that is disrupted in a18H mice. Nat. Genet. 18:143–146. 59. Ruegsegger, U., K. Beyer, and W. Keller. 1996. Purification and characterization of human cleavage factor Im involved in the 3⬘ end processing of messenger RNA precursors. J. Biol. Chem. 271:6107–6113. 60. Ruegsegger, U., D. Blank, and W. Keller. 1998. Human pre-mRNA cleavage MOL. CELL. BIOL. 61. 62. 63. 64. 65. 66. 67. 68. 69. 70. 71. 72. 73. 74. 75. 76. factor Im is related to spliceosomal SR proteins and can be reconstituted in vitro from recombinant subunits. Mol. Cell 1:243–253. Sechi, A. S., and J. Wehland. 2004. ENA/VASP proteins: multifunctional regulators of actin cytoskeleton dynamics. Front. Biosci. 9:1294–1310. Shcherbik, N., Y. Kee, N. Lyon, J. M. Huibregtse, and D. S. Haines. 2004. A single PXY motif located within the carboxyl terminus of Spt23p and Mga2p mediates a physical and functional interaction with ubiquitin ligase Rsp5p. J. Biol. Chem. 279:53892–53898. Smith, M. J., S. Kulkarni, and T. Pawson. 2004. FF domains of CA150 bind transcription and splicing factors through multiple weak interactions. Mol. Cell. Biol. 24:9274–9285. Snyder, P. M., D. R. Olson, F. J. McDonald, and D. B. Bucher. 2001. Multiple WW domains, but not the C2 domain, are required for inhibition of the epithelial Na⫹ channel by human Nedd4. J. Biol. Chem. 276:28321– 28326. Sokal, R. R., and C. D. Michener. 1958. A statistical method for evaluating systematic relationships. Univ. Kans. Sci. Bull. 38:1409–1438. Staub, O., S. Dho, P. Henry, J. Correa, T. Ishikawa, J. McGlade, and D. Rotin. 1996. WW domains of Nedd4 bind to the proline-rich PY motifs in the epithelial Na⫹ channel deleted in Liddle’s syndrome. EMBO J. 15:2371– 2380. Sudol, M., and T. Hunter. 2000. NeW wrinkles for an old domain. Cell 103:1001–1004. Sudol, M., K. Sliwa, and T. Russo. 2001. Functions of WW domains in the nucleus. FEBS Lett. 490:190–195. Sune, C., T. Hayashi, Y. C. Liu, W. S. Lane, R. A. Young, and M. A. Garcia-Blanco. 1997. CA150, a nuclear protein associated with the RNA polymerase II holoenzyme, is involved in Tat-activated human immunodeficiency virus type 1 transcription. Mol. Cell. Biol. 17:6029–6039. Tominaga, T., W. Meng, K. Togashi, H. Urano, A. S. Alberts, and M. Tominaga. 2002. The Rho GTPase effector protein, mDia, inhibits the DNA binding ability of the transcription factor Pax6 and changes the pattern of neurite extension in cerebellar granule cells through its binding to Pax6. J. Biol. Chem. 277:47686–47691. Valpuesta, J. M., J. Martin-Benito, P. Gomez-Puertas, J. L. Carrascosa, and K. R. Willison. 2002. Structure and function of a protein folding machine: the eukaryotic cytosolic chaperonin CCT. FEBS Lett. 529:11–16. Verdecia, M. A., M. E. Bowman, K. P. Lu, T. Hunter, and J. P. Noel. 2000. Structural basis for phosphoserine-proline recognition by group IV WW domains. Nat. Struct. Biol. 7:639–643. Wallar, B. J., and A. S. Alberts. 2003. The formins: active scaffolds that remodel the cytoskeleton. Trends Cell Biol. 13:435–446. Winberg, G., L. Matskova, F. Chen, P. Plant, D. Rotin, G. Gish, R. Ingham, I. Ernberg, and T. Pawson. 2000. Latent membrane protein 2A of EpsteinBarr virus binds WW domain E3 protein-ubiquitin ligases that ubiquitinate B-cell tyrosine kinases. Mol. Cell. Biol. 20:8526–8535. Wulf, G. M., A. Ryo, G. G. Wulf, S. W. Lee, T. Niu, V. Petkova, and K. P. Lu. 2001. Pin1 is overexpressed in breast cancer and cooperates with Ras signaling in increasing the transcriptional activity of c-Jun towards cyclin D1. EMBO J. 20:3459–3472. Yaffe, M. B., M. Schutkowski, M. Shen, X. Z. Zhou, P. T. Stukenberg, J. U. Rahfeld, J. Xu, J. Kuang, M. W. Kirschner, G. Fischer, L. C. Cantley, and K. P. Lu. 1997. Sequence-specific and phosphorylation-dependent proline isomerization: a potential mitotic regulatory mechanism. Science 278:1957– 1960. PIN1 WW AIP4 WW2 NEDD4-1 WW2 WWOX WW1 FBP11 WW2 GAS7 WW FBP11 WW1 FE65 WW FBP21 WW2 CA150 WW2 NEDD4-1 WW2 WWOX WW1 PIN1 WW AIP4 WW2 FBP11 WW2 GAS7 WW FBP11 WW1 FE65 WW FBP21 WW2 CA150 WW2 PIN1 WW AIP4 WW2 NEDD4-1 WW2 WWOX WW1 FBP11 WW2 GAS7 WW FBP11 WW1 FE65 WW FBP21 WW2 CA150 WW2 NEDD4-1 WW2 WWOX WW1 PIN1 WW AIP4 WW2 GAS7 WW FBP11 WW1 FE65 WW FBP21 WW2 FBP11 WW2 CA150 WW2 NEDD4-1 WW2 WWOX WW1 PIN1 WW AIP4 WW2 GAS7 WW FBP11 WW1 FE65 WW FBP21 WW2 FBP11 WW2 CA150 WW2 NEDD4-1 WW2 WWOX WW1 PIN1 WW AIP4 WW2 GAS7 WW FBP11 WW1 FE65 WW FBP21 WW2 FBP11 WW2 CA150 WW2 NEDD4-1 WW2 WWOX WW1 PIN1 WW AIP4 WW2 GAS7 WW FBP11 WW1 FE65 WW FBP21 WW2 FBP11 WW2 CA150 WW2 Supplementary Material, Figure 1 HNRPK CPSF6 HNRPU SFPQ SF3A1 SF3B1 SF3B3 SF1 DIAPH1 W AS NONO SF3A2 U2AF2 EVL SF3A3 SF3B2 ABI1 CYFIP1 CYFIP2 W ASF2 PSPC1 BRD4 FMNL1 HEM1 HNRPF APBB1IP GAPD RPLP0 NCKAP1 W ASL YLPM1 DHX9 CCT2 CCT4 CCT7 TD-60 DHX15 RPL4 CRK7 NSEP1 DDX46 VASP ATAD3A W IRE DIAPH2 DDX17 HNRPH1 ENAH KHSRP MGC20255 PQBP1 PTBP1 RBM17 PABP C1 FLJ12529 W ASPIP SF3B4 KHDRBS1 CHERP HTATSF1 W BP11 BUB3 DERPC DDX23 DNM2 Group B enriched FUS PRP3 PRPF4 SART1 SFRS15 SMARCA3 SNRP70 SNRPA ZNF207 DDX3X THRAP3 PRPF8 G22P1 M11 S1 U5-116KD AP2A1 BCLAF1 C13orf8 CCNK CDC2L1 CDC2L2 DDB1 DDX5 DKFZP434I11 6 ETV6 G3BP2 HADHA G3BP LOC133619 PKM2 PPA RBP RBM21 REPS1 RNPS1 SFRS11 SRRM1 SRRM2 TBC1D4 TFG TLE3 ASCC3L1 VCY2IP1 W RNIP1 CPSF5 POLR2A POLR2B SMARCC1 W BP2 C19orf2 FLJ13150 FLJ21908 HNRPL HNRPUL1 POLR2C POLR2E ARID1A CPSF1 AMOTL1 CAD DHX30 EPRS GRINL1A NEDD4 RAPGEF2 POLR3A RPS3A RUVBL1 SFRS7 EW SR1 DAZAP1 ITCH UBAP2L SMARCC2 SMARCE1 SPG20 UBAP2 TCERG1 CPSF2 CPSF3 FIP1L1 KIAA0460 SPK W DR33 Group C enriched Group A enriched PPxY PPLP PxPPxR PPRxxP PPPPP PPGPPP PxxGMxPP Supplementary Material Supplementary Material, Table I NEDD4-1 WW2-5’ GATCC GGA TTA CCA CCA GGT TGG GAA GAA AAA CAA GAT GAA AGA GGA AGA TCA TAT TAT GTA GAT CAC AAT TCC AGA ACG ACT ACT TGG ACA AAG CCC ACT TAA G NEDD4-1 WW2-3’ AATTC TTA AGT GGG CTT TGT CCA AGT AGT CGT TCT GGA ATT GTG ATC TAC ATA ATA TGA TCT TCC TCT TTC ATC TTG TTT TTC TTC CCA ACC TGG TGG TAA TCC G Smurf1WW1-5’ GATCC CTG CCC GAA GGC TAC GAA CAA AGA ACA ACA GTC CAG GGC CAA GTT TAC TTT TTG CAT ACA CAG ACT GGA GTT AGC ACG TGG CAC GAC CCC AGG TAA G Smurf1 WW1-3’ AATTC TTA CCT GGG GTC GTG CCA CGT GCT AAC TCC AGT CTG TGT ATG CAA AAA GTA AAC TTG GCC CTG GAC TGT TGT TCT TTG TTC GTA GCC TTC GGG CAG G CA150 WW1-5’ GATCC CCT ACG GAG GAG ATA TGG GTT GAA AAT AAA ACT CCA GAT GGG AAG GTT TAT TAT TAT AAT GCT CGG ACA CGT GAA TCT GCA TGG ACC AAG CCA GAT TAA G CA150 WW1-3’ AATTC TTA ATC TGG CTT GGT CCA TGC AGA TTC ACG TGT CCG AGC ATT ATA ATA ATA AAC CTT CCC ATC TGG AGT TTT ATT TTC AAC CCA TAT CTC CTC CGT AGG G CA150 WW2-5’ GATCC ACA GCA GTT TCT GAA TGG ACT GAA TAT AAA ACA GCA GAT GGG AAG ACA TAT TAT TAT AAT AAT AGA ACA TTA GAA TCA ACC TGG GAA AAA CCC CAA GAA TAA G CA150 WW2-3’ AATTC TTA TTC TTG GGG TTT TTC CCA GGT TGA TTC TAA TGT TCTATT ATT ATA ATA ATA TGT CTT CCC ATC TGC TGT TTT ATA TTC AGT CCA TTC AGA AAC TGC TGT G WWOX WW1-5’ GATCC GAG CTG CCT CCG GGC TGG GAG GAG AGA ACC ACC AAG GAC GGC TGG GTT TAC TAC GCC AAT CAC ACC GAG GAG AAG ACT CAG TGG GAA CAT CCA AAA TAA G WWOX WW1-3’ AATTC TTA TTT TGG ATG TTC CCA CTG AGT CTT CTC CTC GGT GTG ATT GGC GTA GTA AAC CCA GCC GTC CTT GGT GGT TCT CTC CTC CCA GCC CGG AGG CAG CTC G FE65 WW-5’ GATCC GAC CTG CCG GCT GGA TGG ATG AGG GTC CAG GAC ACC TCA GGG ACC TAT TAC TGG CAC ATC CCA ACA GGG ACC ACC CAG TGG GAA CCC CCC GGC TAA G FE65 WW-3’ AATTC TTA GCC GGG GGG TTC CCA CTG GGT GGT CCC TGT TGG GAT GTG CCA GTA ATA GGT CCC TGA GGT GTC CTG GAC CCT CAT CCA TCC AGC CGG CAG GTC G GAS7 WW-5’ GATCC ATC CTT CCA CCT GGC TGG CAG AGC TAC CTG TCG CCT CAG GGC CGG CGG TAC TAT GTC AAC ACG ACC ACC AAT GAG ACC ACC TGG GAA CGT CCC AGC TAA G GAS7 WW-3’ AATTC TTA GCT GGG ACG TTC CCA GGT GGT CTC ATT GGT GGT CGT GTT GAC ATA GTA CCG CCG GCC CTG AGG CGA CAG GTA GCT CTG CCA GCC AGG TGG AAG GAT G Pin1 WW-5’ GATCC AAG CTG CCG CCC GGC TGG GAG AAG CGC ATG AGC CGC AGC TCA GGC CGA GTG TAC TAC TTC AAC CAC ATC ACT AAC GCC AGC CAG TGG GAG CGG CCC AGC TAA G Pin1 WW-3’ AATTC TTA GCT GGG CCG CTC CCA CTG GCT GGC GTT AGT GAT GTG GTT GAA GTA GTA CAC TCG GCC TGA GCT GCG GCT CAT GCG CTT CTC CCA GCC GGG CGG CAG CTT G PLEKHA5 WW1-5’ GATCC TCC CTG CCC CGG TCC TGG ACT TAC GGG ATC ACC AGG GGC GGC CGA GTC TTC TTC ATC AAC GAG GAG GCC AAG AGC ACC ACC TGG CTG CAC CCC GTC ACC TAA G PLEKHA5 WW1-3’ AATTC TTA GGT GAC GGG GTG CAG CCA GGT GGT GCT CTT GGC CTC CTC GTT GAT GAA GAA GAC TCG GCC GCC CCT GGT GAT CCC GTA AGT CCA GGA CCG GGG CAG GGA G PQBP1 WW-5’ GATCC GGC CTA CCA CCA AGC TGG TAC AAG GTG TTC GAC CCT TCC TGC GGG CTC CCT TAC TAC TGG AAT GCA GAC ACA GAC CTT GTA TCC TGG CTC TCC CCA CAT GAC TAA G PQBP1 WW-3’ AATTC TTA GTC ATG TGG GGA GAG CCA GGA TAC AAG GTC TGT GTC TGC ATT CCA GTA GTA AGG GAG CCC GCA GGA AGG GTC GAA CAC CTT GTA CCA GCT TGG TGG TAG GCC G FBP11 WW1-5’ GATCC GGT GCA AAA TCA ATG TGG ACT GAA CAT AAA TCA CCT GAT GGA AGG ACT TAC TAC TAC AAC ACT GAA ACC AAA CAG TCT ACC TGG GAG AAA CCA GAT GAT TAA G FBP11 WW1-3’ AATTC TTA ATC ATC TGG TTT CTC CCA GGT AGA CTG TTT GGT TTC AGT GTT GTA GTA GTA AGT CCT TCC ATC AGG TGA TTT ATG TTC AGT CCA CAT TGA TTT TGC ACC G FBP11 WW2-5’ GATCC CTA TCT AAA TGC CCC TGG AAG GAA TAC AAA TCA GAT TCT GGA AAG CCT TAC TAT TAT AAT TCT CAA ACA AAA GAA TCT CGC TGG GCC AAA CCT AAA GAA TAA G FBP11 WW2-3’ AATTC TTA TTC TTT AGG TTT GGC CCA GCG AGA TTC TTT TGT TTG AGA ATT ATA ATA GTA AGG CTT TCC AGA ATC TGA TTT GTA TTC CTT CCA GGG GCA TTT AGA TAG G KIAA1052 WW-5’ GATCC CCA CTG CCT GGA GAG TGG AAA CCA TGC CAG GAC ATC ACA GGT GAC ATT TAC TAT TTC AAC TTC GCC AAC GGG CAG TCT ATG TGG GAC CAT CCA TGT GAC TAA G KIAA1052 WW-3’ AATTC TTA GTC ACA TGG ATG GTC CCA CAT AGA CTG CCC GTT GGC GAA GTT GAA ATA GTA AAT GTC ACC TGT GAT GTC CTG GCA TGG TTT CCA CTC TCC AGG CAG TGG G FBP21 WW2-5’ GATCC GCA GTG AAG ACC GTT TGG GTA GAA GGT TTA AGT GAA GAT GGT TTT ACC TAT TAC TAT AAT ACA GAA ACA GGA GAA TCC AGA TGG GAG AAA CCT GAT GAT TAA G FBP21 WW2-3’ AATTC TTA ATC ATC AGG TTT CTC CCA TCT GGA TTC TCC TGT TTC TGT ATT ATA GTA ATA GGT AAA ACC ATC TTC ACT TAA ACC TTC TAC CCA AAC GGT CTT CAC TGC G Supplementary Material, Table II AIP4 WW2 Gene Symbol Alternate Name(s) ARID1A SF1 ITCH B120:P270:BM029:BAF250:C1or f4:SMARCF1 RMP:URI:NNX3:FLJ10575 EWS RPB1:RPO2:POLR2:POLRA:RP Bh1:RPOL2:RpIILS:hsRPB1:hRP B220 RPB2:POL2RB:hsRPB2:hRPB14 0 RPB3:RPB31:hRPB33:hsRPB3 RPB5:XAP4:RPABC1:hRPB25:h sRPB5 Rsc8:SRG3:SWI3:BAF155:CRA CC1 Rsc8:BAF170:CRACC2 BAF57 CFIM25 CFIM:CFIM68:HPBRII-4:HPBRII7 MGC19907 UNKNOWN hnRNP-L E1BAP5:E1B-AP5:FLJ12944 MGC9315:FLJ39024 PAB1:PABP:PABP1:PABPC2:PA BPL1 ZFM1:ZNF162:D11S636 AIF4:AIP4:NAPP1 SPG20 FLJ21908 UBAP2L UBAP2 WBP2 SPARTIN:TAHCCP1:KIAA0610 UNKNOWN KIAA0144; NICE-4 FLJ22435:KIAA1491:bA176F3.5 WBP-2:MGC18269 C19orf2 EWSR1 POLR2A POLR2B POLR2C POLR2E SMARCC1 SMARCC2 SMARCE1 CPSF5 CPSF6 DAZAP1 FLJ13150 HNRPL HNRPUL1 FLJ12529 PABPC1 Gene ID Domains Cellular Process 8289 BRIGHT Transcription 8725 2130 5430 Prefoldin RRM,ZnF_RBZ RPOLA_N Transcription Transcription Transcription 5431 none Transcription 5432 5434 RPOLD none Transcription Transcription 6599 SANT Transcription 6601 6605 11051 11052 CHROMO,SANT HMG none RRM Transcription Transcription RNA processing RNA processing 26528 79871 3191 11100 79869 26986 RRM,RRM none RRM,RRM,RRM SAP,SPRY RRM 4 RRM, PolyA RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing 7536 83737 KH C2,4 WW, HECTc MIT 2 TPR UBA UBA none RNA processing Protein Transport 23111 79657 9898 55833 23558 Protein Transport Unknown Unknown Unknown Unknown CA150 WW domain 2 Gene Symbol Alternate Name(s) Gene ID Domains Cellular Process CPSF5 CPSF6 CFIM25 CFIM:CFIM68:HPBRII-4:HPBRII7 CSBP:TUNP:HNRNPK SAF-A:U21.1:HNRNPU P54:NMT55:NRB54:P54NRB PSP1:FLJ10955 ZFM1:ZNF162:D11S636 PRP21:SAP114:SF3A120 11051 none RRM RNA processing RNA processing KH,KH,KH SPRY RRM,RRM RRM,RRM KH SWAP,SWAP,UB Q TGFB none ARM RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing BASIC/PSP RNA processing none RNA processing RRM,RRM,BBC RRM,RRM,RRM BROMO,BROMO SH3 none none FH2 FH2 WH1 FH2 RNA processing RNA processing Cell Cycle Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton WH1,PBD,WH2 WH2 none Cytoskeleton Cytoskeleton Cytoskeleton HNRPK HNRPU NONO PSPC1 SF1 SF3A1 SF3A2 SF3A3 SF3B1 SF3B2 SF3B3 SFPQ U2AF2 BRD4 ABI1 CYFIP1 CYFIP2 DIAPH1 DIAPH2 EVL FMNL1 WAS WASF2 WIRE MIF:MIS:PRP11:SAP62:SF3a66 PRP9:SAP61:SF3a60 PRP10:SAP155:SF3B155:SF3b1 55 SF3b1:SAP145:SF3B145:SF3b1 50 RSE1:SAP130:SF3B130:SF3b13 0:KIAA0017 PSF:POMP100 U2AF65 CAP:MCAP:HUNKI E3B1:NAP1:ABI-1 SHYC:KIAA0068:P140SRA-1 PIR121:PRO1331 DRF1:DFNA1:LFHL1:hDIA1 DIA:POF:DIA2:POF2 RNB6 FMNL:FHOD4:KW13:C17orf1:MGC1894:C17orf1B: MGC21878 THC:IMD2:WASP SCAR2:WAVE2:dJ393P12.2 FLJ23260:FLJ30495:DKFZp547 N082 11052 3190 3192 4841 55269 7536 10291 8175 10946 23451 10992 23450 6421 11338 23476 10006 23191 26999 1729 1730 51466 752 7454 10163 147179 FBP11 WW domain 1 Gene Symbol Alternate Name(s) Gene ID Domains Cellular Process CRK7 NSEP1 CRKRS:KIAA0904 YB1:BP-8:CSDB:DBPB:YB1:YBX1:NSEP-1:MDR-NF1 MGC9936:FLJ25329:KIAA0801 DBP1:HRH2:DDX15:PRP43:PrP p43p HNRNPF:mcs94-1 P54:NMT55:NRB54:P54NRB ZFM1:ZNF162:D11S636 PRP21:SAP114:SF3A120 51755 4904 Kinase CSP Transcription Transcription 9879 1665 DEXDc,HELICc DEXDc,HELICc RNA processing RNA processing 3185 4841 7536 10291 RNA processing RNA processing RNA processing RNA processing 8175 10946 23451 RRM,RRM,RRM RRM,RRM KH SWAP,SWAP,UB Q TGFB none ARM 10992 BASIC/PSP RNA processing 23450 none RNA processing 10262 RRM,RRM RNA processing 6421 11338 6124 1729 51466 752 RRM,RRM,BBC RRM,RRM,RRM none FH2 WH1 FH2 RNA processing RNA processing Translation Cytoskeleton Cytoskeleton Cytoskeleton 10006 7408 7454 8976 7456 55210 3071 SH3 WH1 WH1,PBD,WH2 WH1,PBD,WH2 WH1,PBD,WH2 AAA none Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Unknown Unknown DDX46 DHX15 HNRPF NONO SF1 SF3A1 SF3A2 SF3A3 SF3B1 SF3B2 SF3B3 SF3B4 SFPQ U2AF2 RPL4 DIAPH1 EVL FMNL1 ABI1 VASP WAS WASL WASPIP ATAD3A HEM1 MIF:MIS:PRP11:SAP62:SF3a66 PRP9:SAP61:SF3a60 PRP10:SAP155:SF3B155:SF3b1 55 SF3b1:SAP145:SF3B145:SF3b1 50 RSE1:SAP130:SF3B130:SF3b13 0:KIAA0017 SAP49:SF3B49:SF3b49:MGC10 828 PSF:POMP100 U2AF65 HRPL4 DRF1:DFNA1:LFHL1:hDIA1 RNB6 FMNL:FHOD4:KW13:C17orf1:MGC1894:C17orf1B: MGC21878 E3B1:NAP1:ABI-1 UNKNOWN THC:IMD2:WASP N-WASP:MGC48327 WIP:PRPL-2 FLJ10709 UNKNOWN RNA processing RNA processing RNA processing FBP11 WW domain 2 Gene Symbol Alternate Name(s) CPSF6 DHX9 CFIM:CFIM68:HPBRII-4:HPBRII-7 LKP:RHA:DDX9:NDHII:NDH II 11052 1660 HNRPF HNRPK HNRPU NONO SF1 SF3A1 SF3A2 SF3B1 SF3B3 HNRNPF:mcs94-1 CSBP:TUNP:HNRNPK SAF-A:U21.1:HNRNPU P54:NMT55:NRB54:P54NRB ZFM1:ZNF162:D11S636 PRP21:SAP114:SF3A120 MIF:MIS:PRP11:SAP62:SF3a66 PRP10:SAP155:SF3B155:SF3b155 RSE1:SAP130:SF3B130:SF3b130:KI AA0017 PSF:POMP100 U2AF65 P0:L10E:RPP0:PRLP0 ML8:KU70:TLAA:D22S671:D22S731 KIAA1470:DKFZp762N0610 SHYC:KIAA0068:P140SRA-1 PIR121:PRO1331 DRF1:DFNA1:LFHL1:hDIA1 RNB6 FMNL:FHOD4:KW13:C17orf1:MGC1894:C17orf1B:MG C21878 E3B1:NAP1:ABI-1 THC:IMD2:WASP SCAR2:WAVE2:dJ393P12.2 N-WASP:MGC48327 WIP:PRPL-2 RIAM:INAG1 ZAP:ZAP3 UNKNOWN G3PD:GAPDH CCTB SRB:Cctd Ccth:Nip7-1 SFPQ U2AF2 RPLP0 G22P1 TD-60 CYFIP1 CYFIP2 DIAPH1 EVL FMNL1 ABI1 WAS WASF2 WASL WASPIP APBB1IP YLPM1 HEM1 GAPD CCT2 CCT4 CCT7 Gene ID Domains Cellular Process RNA processing RNA processing 3185 3190 3192 4841 7536 10291 8175 23451 23450 RRM DSRM,DSRM,DEX Dc,HELICc RRM,RRM,RRM KH,KH,KH SPRY RRM,RRM KH SWAP,SWAP,UBQ TGFB ARM none RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing 6421 11338 6175 2547 55920 23191 26999 1729 51466 752 RRM,RRM,BBC RRM,RRM,RRM none Ku78 none none none FH2 WH1 FH2 RNA processing RNA processing Translation Cell Cycle Cell Cycle Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton 10006 7454 10163 8976 7456 54518 56252 3071 2597 10576 10575 10574 SH3 WH1,PBD,WH2 WH2 WH1,PBD,WH2 WH1,PBD,WH2 RA,PH none none none none none none Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Signal Transduction Unknown Unknown Metabolism Chaperone Chaperone Chaperone FBP21 WW domain 2 Gene Symbol Alternate Name(s) Gene ID Domains Cellular Process EWSR1 FUS HTATSF1 SMARCA3 EWS TLS:FUS1 TAT-SF1:dJ196E23.2 HLTF:ZBU1:HLTF1:RNF80:HIP11 6:SNF2L3:HIP116A CA150:TAF2S 2130 2521 27336 6596 Transcription Transcription Transcription UNKNOWN CFIM:CFIM68:HPBRII-4:HPBRII-7 prp28:MGC8416:U5-100K DBX:DDX3:HLP2:DDX14 DBP1:HRH2:DDX15:PRP43:PrPp 43p CSBP:TUNP:HNRNPK SAF-A:U21.1:HNRNPU p62:Sam68 MGC9315:FLJ39024 P54:NMT55:NRB54:P54NRB PAB1:PABP:PABP1:PABPC2:PA BPL1 PRP3:RP18:HPRP3:Prp3p:HPRP 3P PRP4:HPRP4:Prp4p:HPRP4P 7756 11052 9416 1654 1665 RRM,ZnF_RBZ RRM,ZnF_RBZ RRM,RRM DEXDc,RING,HE LICc PQQ_DH, 3 WW domains, 7 FF domains none RRM DEXDc,HELICc DEXDc,HELICc DEXDc,HELICc 3190 3192 10657 79869 4841 26986 KH,KH,KH SPRY KH RRM RRM,RRM 4 RRM, PolyA 9129 PWI 9128 WD40/SFM, 5 WD40 RRM,RRM none TCERG1 ZNF207 CPSF6 DDX23 DDX3X DHX15 HNRPK HNRPU KHDRBS1 FLJ12529 NONO PABPC1 PRP3 PRPF4 PSPC1 SART1 SF1 SF3A1 SF3A2 SF3A3 SF3B1 SF3B3 SF3B4 SFPQ SFRS15 SNRP70 SNRPA U5-116KD WBP11 BUB3 DERPC G22P1 PSP1:FLJ10955 Ara1:HOMS1:MGC2038:SART125 9 ZFM1:ZNF162:D11S636 PRP21:SAP114:SF3A120 MIF:MIS:PRP11:SAP62:SF3a66 PRP9:SAP61:SF3a60 PRP10:SAP155:SF3B155:SF3b15 5 RSE1:SAP130:SF3B130:SF3b130 :KIAA0017 SAP49:SF3B49:SF3b49:MGC108 28 PSF:POMP100 SRA4:KIAA1172:DKFZP434E098 RPU1:U1AP:U170K:U1RNP:RNP U1Z UNKNOWN KIAA0031 NPWBP BUB3L CTF8:DERPC ML8:KU70:TLAA:D22S671:D22S7 31 10915 Transcription Transcription Transcription RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing 55269 9092 7536 10291 8175 10946 23451 KH SWAP,SWAP,UB Q TGFB none ARM 23450 none 10262 RRM,RRM 6421 57466 6625 RRM,RRM,BBC RPR,RRM RRM 6626 9343 51729 9184 54921 2547 RRM,RRM none none WD40,WD40 none Ku78 RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing Cell Cycle Cell Cycle Cell Cycle DIAPH1 DNM2 WAS WASPIP CHERP DRF1:DFNA1:LFHL1:hDIA1 DYNII THC:IMD2:WASP WIP:PRPL-2 DAN16:SCAF6 1729 1785 7454 7456 10523 M11S1 GPIAP1 4076 FH2 DYNc,PH,GED WH1,PBD,WH2 WH1,PBD,WH2 SWAP,RPR,G_pa tch none Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Other Unknown FE65 WW domain Gene Alternate Name(s) Symbol HTATSF1 NSEP1 PQBP1 THRAP3 CPSF6 DDX17 DDX3X DDX46 DHX15 DHX9 HNRPH1 HNRPK KHDRBS1 KHSRP RBM17 FLJ12529 NONO PABPC1 PTBP1 SF1 SF3A1 SF3A2 SF3A3 SF3B1 SF3B2 SF3B3 SF3B4 SFPQ U2AF2 WBP11 RPL4 CYFIP1 CYFIP2 DIAPH1 DIAPH2 ENAH EVL WIRE TAT-SF1:dJ196E23.2 YB1:BP-8:CSDB:DBPB:YB1:YBX1:NSEP-1:MDR-NF1 SHS:MRX55:NPW38 TRAP150 CFIM:CFIM68:HPBRII-4:HPBRII-7 P72:RH70 DBX:DDX3:HLP2:DDX14 MGC9936:FLJ25329:KIAA0801 DBP1:HRH2:DDX15:PRP43:PrPp4 3p LKP:RHA:DDX9:NDHII:NDH II Gene ID Domains Cellular Process 27336 4904 RRM,RRM CSP Transcription Transcription 10084 9967 11052 10521 1654 9879 1665 none none RRM DEXDc,HELICc DEXDc,HELICc DEXDc,HELICc DEXDc,HELICc Transcription Transcription RNA processing RNA processing RNA processing RNA processing RNA processing 1660 RNA processing HNRPH:hnRNPH CSBP:TUNP:HNRNPK p62:Sam68 FBP2:KSRP:FUBP2 SPF45:MGC14439 MGC9315:FLJ39024 P54:NMT55:NRB54:P54NRB PAB1:PABP:PABP1:PABPC2:PAB PL1 PTB:PTB2:PTB3:PTB4:pPTB:HNR PI:PTB-1:PTBT:HNRNPI:MGC8461:MGC10830 ZFM1:ZNF162:D11S636 PRP21:SAP114:SF3A120 3187 3190 10657 8570 84991 79869 4841 26986 DSRM,DSRM,DE XDc,HELICc RRM,RRM,RRM KH,KH,KH KH 4 KH domains G_patch,RRM RRM RRM,RRM 4 RRM, PolyA 5725 4 RRM's RNA processing 7536 10291 RNA processing RNA processing MIF:MIS:PRP11:SAP62:SF3a66 PRP9:SAP61:SF3a60 PRP10:SAP155:SF3B155:SF3b15 5 SF3b1:SAP145:SF3B145:SF3b15 0 RSE1:SAP130:SF3B130:SF3b130: KIAA0017 SAP49:SF3B49:SF3b49:MGC1082 8 PSF:POMP100 U2AF65 NPWBP HRPL4 SHYC:KIAA0068:P140SRA-1 PIR121:PRO1331 DRF1:DFNA1:LFHL1:hDIA1 DIA:POF:DIA2:POF2 mena:NDPP1:FLJ10773 RNB6 FLJ23260:FLJ30495:DKFZp547N0 82 8175 10946 23451 KH SWAP,SWAP,UB Q TGFB none ARM 10992 BASIC/PSP RNA processing 23450 none RNA processing 10262 RRM,RRM RNA processing 6421 11338 51729 6124 23191 26999 1729 1730 55740 51466 14717 9 RRM,RRM,BBC RRM,RRM,RRM none none none none FH2 FH2 WH1 WH1 none RNA processing RNA processing RNA processing Translation Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing ABI1 VASP WAS WASF2 WASL WASPIP CHERP YLPM1 MGC20255 E3B1:NAP1:ABI-1 UNKNOWN THC:IMD2:WASP SCAR2:WAVE2:dJ393P12.2 N-WASP:MGC48327 WIP:PRPL-2 DAN16:SCAF6 10006 7408 7454 10163 8976 7456 10523 ZAP:ZAP3 UNKNOWN 56252 90324 SH3 WH1 WH1,PBD,WH2 WH2 WH1,PBD,WH2 WH1,PBD,WH2 SWAP,RPR,G_pat ch none none Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Other Unknown Unknown Gas7 WW Domain Gene Symbol Alternate Name(s) Gene ID Domains Cellular Process HNRPU KHDRBS1 SF1 SF3A1 SAF-A:U21.1:HNRNPU p62:Sam68 ZFM1:ZNF162:D11S636 PRP21:SAP114:SF3A120 3192 10657 7536 10291 RNA processing RNA processing RNA processing RNA processing SFPQ RPLP0 ABI1 CYFIP1 CYFIP2 DIAPH1 FMNL1 PSF:POMP100 P0:L10E:RPP0:PRLP0 E3B1:NAP1:ABI-1 SHYC:KIAA0068:P140SRA-1 PIR121:PRO1331 DRF1:DFNA1:LFHL1:hDIA1 FMNL:FHOD4:KW13:C17orf1:MGC1894:C17orf1B:M GC21878 THC:IMD2:WASP SCAR2:WAVE2:dJ393P12.2 RIAM:INAG1 6421 6175 10006 23191 26999 1729 752 SPRY KH KH SWAP,SWAP,UB Q RRM,RRM,BBC none SH3 none none FH2 FH2 7454 10163 54518 WH1,PBD,WH2 WH2 RA,PH NAP1:NAP125:MGC8981:KIAA05 87 UNKNOWN G3PD:GAPDH 10787 none 3071 2597 none none WAS WASF2 APBB1IP NCKAP1 HEM1 GAPD RNA processing Translation Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Cytoskeleton Signal Transduction Signal Transduction Unknown Metabolism NEDD4-1 WW2 Gene Symbol Alternate Name(s) Gene ID Domains Cellular Process C19orf2 NSEP1 8725 4904 Prefoldin CSP Transcription Transcription 5430 RPOLA_N Transcription 5431 none Transcription 5432 5434 RPOLD None Transcription Transcription 11128 8607 6599 RPOLA_N AAA SANT Transcription Transcription Transcription 8289 BRIGHT Transcription 9967 29894 11051 11052 none none none RRM Transcription RNA processing RNA processing RNA processing DDX3X DHX30 FLJ13150 HNRPK HNRPL HNRPU HNRPUL1 PRPF8 SFRS7 EPRS RPS3A DIAPH1 AMOTL1 RMP:URI:NNX3:FLJ10575 YB1:BP-8:CSDB:DBPB:YB1:YBX1:NSEP-1:MDR-NF1 RPB1:RPO2:POLR2:POLRA:RP Bh1:RPOL2:RpIILS:hsRPB1:hRP B220 RPB2:POL2RB:hsRPB2:hRPB14 0 RPB3:RPB31:hRPB33:hsRPB3 RPB5:XAP4:RPABC1:hRPB25:h sRPB5 RPC1; RPC155 ECP54:TIP49:NMP238 Rsc8:SRG3:SWI3:BAF155:CRA CC1 B120:P270:BM029:BAF250:C1or f4:SMARCF1 TRAP150 CPSF160:HSU37012 CFIM25 CFIM:CFIM68:HPBRII-4:HPBRII7 DBX:DDX3:HLP2:DDX14 DDX30:FLJ11214:KIAA0890 UNKNOWN CSBP:TUNP:HNRNPK hnRNP-L SAF-A:U21.1:HNRNPU E1BAP5:E1B-AP5:FLJ12944 PRP8:RP13:HPRP8:PRPC8 9G8:HSSG1 PARS:QARS:QPRS FTE1 DRF1:DFNA1:LFHL1:hDIA1 JEAP DEXDc,HELICc DEXDc,HELICc none KH,KH,KH RRM,RRM,RRM SPRY SAP,SPRY JAB_MPN RRM none none FH2 none RAPGEF2 PDZGEF2:PDZ-GEF2:RA-GEF-2 1654 22907 79871 3190 3191 3192 11100 10594 6432 2058 6189 1729 15481 0 51735 RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing Translation Translation Cytoskeleton Signal Transduction Signal Transduction NEDD4 KIAA0093 4734 GRINL1A FLJ21908 WBP2 CAD DKFZp586F1918 UNKNOWN WBP-2:MGC18269 UNKNOWN 81488 79657 23558 790 POLR2A POLR2B POLR2C POLR2E POLR3A RUVBL1 SMARCC1 ARID1A THRAP3 CPSF1 CPSF5 CPSF6 CNMP,RasGEFN ,PDZ,RA,RasGE F WW,WW,WW,W W,HECTc none 2 TPR none none Protein Transport Other Unknown Unknown Metabolism Pin1 WW Gene Symbol Alternate Name(s) Gene ID Domains Cellular Process BCLAF1 CCNK CRK7 ETV6 PPARBP BTF:KIAA0164 CPR4:MGC9113 CRKRS:KIAA0904 TEL PBP:CRSP1:RB18A:TRIP2:CRSP 200:DRIP205:DRIP230:PPARGB P:TRAP220 TRAP150 ESG:ESG3:HsT18976:KIAA1547 P72:RH70 DBX:DDX3:HLP2:DDX14 P68:p68:HLR1:G17P1:HUMP68 DBP1:HRH2:DDX15:PRP43:PrPp 43p MSTP054 9774 8812 51755 2120 5469 none Cyclin,Cyclin Kinase SAM_PNT,ETS none Transcription Transcription Transcription Transcription Transcription 9967 7090 10521 1654 1655 1665 none WD40 DEXDc,HELICc DEXDc,HELICc DEXDc,HELICc DEXDc,HELICc Transcription Transcription RNA processing RNA processing RNA processing RNA processing 25962 none RNA processing PAPD2:FLJ21850:FLJ22267:FLJ2 2347 UNKNOWN HDH-VIII HNRPH:hnRNPH CSBP:TUNP:HNRNPK SAF-A:U21.1:HNRNPU KIAA0031 P54:NMT55:NRB54:P54NRB PAB1:PABP:PABP1:PABPC2:PA BPL1 PRP8:RP13:HPRP8:PRPC8 E5.1 PSF:POMP100 p54:dJ677H15.2 160KD:POP101:SRM160:MGC39488 300KD:SRL300:SRM300:SRm300:KI AA0324:SIII U2AF65 KIAA0788 64852 ZnF_U1,RRM RNA processing 9908 10146 3187 3190 3192 9343 4841 26986 NTF2, RRM NTF2,RRM RRM,RRM,RRM KH,KH,KH SPRY none RRM,RRM 4 RRM, PolyA RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing RNA processing 10594 10921 6421 9295 10250 JAB_MPN RRM RRM,RRM,BBC RRM PWI RNA processing RNA processing RNA processing RNA processing RNA processing 23524 none RNA processing 11338 23020 RNA processing RNA processing HRPL4 DDBA:XAP1:XPCE:XPE-BF:UVDDB1 WHIP:FLJ22526:bA420G6.2 p58:PK58:CLK-1:p58CLK1:p58CDC2L1 CDC2L3:p58GTA:PITSLRE ML8:KU70:TLAA:D22S671:D22S7 6124 1642 RRM,RRM,RRM DEXDc,HELICc,S EC63,DEXDc,HE LICc,SEC63 none none Translation DNA repair 56897 984 ZnF_Rad18,AAA kinase DNA repair Cell Cycle 985 2547 kinase Ku78 Cell Cycle Cell Cycle THRAP3 TLE3 DDX17 DDX3X DDX5 DHX15 DKFZP434I 116 RBM21 G3BP2 G3BP HNRPH1 HNRPK HNRPU U5-116KD NONO PABPC1 PRPF8 RNPS1 SFPQ SFRS11 SRRM1 SRRM2 U2AF2 ASCC3L1 RPL4 DDB1 WRNIP1 CDC2L1 CDC2L2 G22P1 AP2A1 REPS1 C13orf8 31 C19ORF5:C19orf5:VCY2IP1:FLJ1 0669:VCY2IP-1 ADTAA:CLAPA1:AP2-ALPHA RALBP1 FLJ90413:KIAA1802 LOC133619 UNKNOWN M11S1 TBC1D4 TFG HADHA PKM2 GPIAP1 KIAA0603 TF6 GBP:MTPA:LCHAD PK3:PKM:TCB:OIP3:CTHBP:THB P1:MGC3932 VCY2IP1 55201 none Cytoskeleton 160 85021 28348 9 13361 9 4076 9882 10342 3030 5315 none EH none Protein Transport Other Unknown none Unknown none PTB,PTB,TBC PB1 none none Unknown Unknown Unknown Metabolism Metabolism WWOX WW domain 1 Gene Symbol POLR2A POLR2B SMARCC1 TCERG1 CPSF1 CPSF2 CPSF3 CPSF5 CPSF6 FIP1L1 FLJ12529 HNRPK KIAA0460 PABPC1 Alternate Name(s) Gene ID Domains Cellular Process RPB1:RPO2:POLR2:POLRA:RP Bh1:RPOL2:RpIILS:hsRPB1:hRP B220 RPB2:POL2RB:hsRPB2:hRPB14 0 Rsc8:SRG3:SWI3:BAF155:CRA CC1 CA150:TAF2S 5430 RPOLA_N Transcription 5431 none Transcription 6599 SANT Transcription 10915 Transcription RNA processing RNA processing RNA processing RNA processing RNA processing CPSF160:HSU37012 CPSF100:KIAA1367 CPSF:CPSF73 CFIM25 CFIM:CFIM68:HPBRII-4:HPBRII7 Rhe:DKFZp586K0717 MGC9315:FLJ39024 CSBP:TUNP:HNRNPK UNKNOWN PAB1:PABP:PABP1:PABPC2:PA BPL1 PRP21:SAP114:SF3A120 29894 53981 51692 11051 11052 PQQ_DH, 3 WW domains, 7 FF domains none none none none RRM 81608 79869 3190 23248 26986 none RRM KH,KH,KH RPR 4 RRM, PolyA RNA processing RNA processing RNA processing RNA processing RNA processing 10291 RNA processing PRP10:SAP155:SF3B155:SF3b1 55 RSE1:SAP130:SF3B130:SF3b13 0:KIAA0017 PSF:POMP100 SPK:SYM FLJ23260:FLJ30495:DKFZp547 N082 WDC146:FLJ11294 WBP-2:MGC18269 23451 SWAP,SWAP,UB Q ARM 23450 none RNA processing 6421 8189 14717 9 55339 23558 RRM,RRM,BBC none none RNA processing RNA processing Cytoskeleton 6 WD40 domains none Unknown Unknown SF3A1 SF3B1 SF3B3 SFPQ SYMPK WIRE WDR33 WBP2 RNA processing Supplementary Figure Legends Supplementary Material, Table I - Sequence of WW oligonucleotides. The sequences of oligonucleotides (5’ and 3’) used to generate the GST-WW domain fusion constructs are shown. The resulting annealed products contain overhanging BamHI (5’) and EcoRI (3’) ends (shown in bold) for direct cloning into BamHI/EcoRI cut pGEXKT (Amersham Biosciences AB, Upsalla, Sweden). Supplementary Material, Table II – Proteins identified precipitating with WW domains. Proteins identified by mass spectrometry (indicated by gene symbol) that met our “hit” criteria (see Materials and Methods) are shown for each WW domain bait. Also included are alternate names, modular domains, and putative cellular function for each precipitated protein. Supplementary Material, Figure 1 – Superimposition of proline-based WW domainbinding motifs onto the clustered WW domain data. The individual proline-rich motifs of Figure 4, as well as PPGPPP and PXXPMXPP motifs, are indicated in the precipitated proteins from the clustered data of Figure 3. The coloring of the precipitated proteins on the right indicates cellular function, which is defined in the Figure 3. Group A (yellow), B (red), and C (green) WW domains are colored at the top, as in Figure 3. Groups of precipitated proteins that were enriched in the pull-downs of specific WW domain classes are indicated to the left.