Download WW Domains Provide a Platform for the

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

SNARE (protein) wikipedia , lookup

Endomembrane system wikipedia , lookup

Protein phosphorylation wikipedia , lookup

G protein–coupled receptor wikipedia , lookup

Histone acetylation and deacetylation wikipedia , lookup

Cell nucleus wikipedia , lookup

Magnesium transporter wikipedia , lookup

LSm wikipedia , lookup

Protein wikipedia , lookup

Protein moonlighting wikipedia , lookup

Cyclol wikipedia , lookup

Signal transduction wikipedia , lookup

Ribosome wikipedia , lookup

Intrinsically disordered proteins wikipedia , lookup

List of types of proteins wikipedia , lookup

JADE1 wikipedia , lookup

Proteolysis wikipedia , lookup

SR protein wikipedia , lookup

Gene expression wikipedia , lookup

Transcript
MOLECULAR AND CELLULAR BIOLOGY, Aug. 2005, p. 7092–7106
0270-7306/05/$08.00⫹0 doi:10.1128/MCB.25.16.7092–7106.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Vol. 25, No. 16
WW Domains Provide a Platform for the Assembly of
Multiprotein Networks†
Robert J. Ingham,1 Karen Colwill,1 Caley Howard,1 Sabine Dettwiler,2 Caesar S. H. Lim,1,3
Joanna Yu,1,3 Kadija Hersi,1 Judith Raaijmakers,1 Gerald Gish,1 Geraldine Mbamalu,1
Lorne Taylor,1 Benny Yeung,1 Galina Vassilovski,1 Manish Amin,1 Fu Chen,4
Liudmila Matskova,4 Gösta Winberg,4 Ingemar Ernberg,4 Rune Linding,1
Paul O’Donnell,1 Andrei Starostine,1 Walter Keller,2
Pavel Metalnikov,1Chris Stark,1 and Tony Pawson1,3*
Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada1; Department of Molecular and
Medical Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada3; Department of Cell Biology, Biozentrum,
University of Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland2; and Karolinska Institutet, Microbiology and
Tumor Biology Center (MTC), SE-171 Stockholm, Sweden4
Received 8 April 2005/Returned for modification 5 May 2005/Accepted 22 May 2005
WW domains are protein modules that mediate protein-protein interactions through recognition of prolinerich peptide motifs and phosphorylated serine/threonine-proline sites. To pursue the functional properties of
WW domains, we employed mass spectrometry to identify 148 proteins that associate with 10 human WW
domains. Many of these proteins represent novel WW domain-binding partners and are components of
multiprotein complexes involved in molecular processes, such as transcription, RNA processing, and cytoskeletal regulation. We validated one complex in detail, showing that WW domains of the AIP4 E3 proteinubiquitin ligase bind directly to a PPXY motif in the p68 subunit of pre-mRNA cleavage and polyadenylation
factor Im in a manner that promotes p68 ubiquitylation. The tested WW domains fall into three broad groups
on the basis of hierarchical clustering with respect to their associated proteins; each such cluster of bound
proteins displayed a distinct set of WW domain-binding motifs. We also found that separate WW domains from
the same protein or closely related proteins can have different specificities for protein ligands and also
demonstrated that a single polypeptide can bind multiple classes of WW domains through separate prolinerich motifs. These data suggest that WW domains provide a versatile platform to link individual proteins into
physiologically important networks.
tidyl prolyisomerase domains (Pin1), and Rho GTPase-activating protein domains. Consequently, WW domain-containing
proteins are involved in a variety of cellular processes, including transcription, RNA processing, protein trafficking, receptor
signaling, and control of the cytoskeleton (32, 33, 68). WW
domain-mediated interactions have been implicated in cancer
(4, 75), in hereditary disorders, such as Liddle’s syndrome (66)
and Rett’s syndrome (8), as well as in Alzheimer’s (46, 48) and
Huntington’s (20, 55) diseases.
WW domains are typically 35 to 40 amino acids in length (6)
and fold into a three-stranded, antiparallel ␤ sheet with two
ligand-binding grooves (30, 38, 50, 72). WW domains bind a
variety of distinct peptide ligands including motifs with core
proline-rich sequences, such as PPXY (amino acid single-letter
code; X is any amino acid) (PY) (14), PPLP (2, 18), as well as
proline/arginine-containing (PR) sequences (3) and phosphorylated serine/threonine-proline sites [p(S/T)P] (49, 76). WW
domains have been classified into four groups on the basis of
their binding to peptide ligands (3, 19, 67). Group I WW
domains, which include the NEDD4 family proteins and the
65-kDa Yes-associated protein (YAP65), recognize PY motifs
(14, 66). Group II WW domains, such as the FE65 WW domain and the first WW domain of FBP11, recognize PPLP
motifs (2, 18), and group III WW domains (FBP30 WW1)
recognize PR motifs (3). Group IV WW domains, such as
those from Pin1 (49) and PDX-1 C-terminus-interacting factor
Many signaling proteins contain modular domains that mediate specific protein-protein interactions, frequently through
the recognition of short peptide motifs in their binding partners (56). In many cases these interactions are regulated by
posttranslational modifications, such as phosphorylation. Interaction domains can thereby control the subcellular localization, enzymatic activity, and substrate specificity of regulatory
proteins and the assembly of multiprotein complexes, and thus
the flow of information through signaling pathways.
WW domains comprise a family of protein-protein interaction modules that are found in many eukaryotes and are
present in approximately 50 human proteins (6; see Fig. 1).
Within these polypeptides, WW domains are joined to a number of distinct interaction modules, including phosphotyrosinebinding domains (i.e., in the FE65 protein) and FF domains
(CA150 and FBP11), as well as protein localization domains,
such as C2 (NEDD4 family proteins) and pleckstrin homology
domains (PLEKHA5). WW domains are also linked to a variety of catalytic domains, including HECT E3 protein-ubiquitin
ligase domains (in NEDD4 family proteins), rotomerase/pep* Corresponding author. Mailing address: Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, Ontario M5G 1X5, Canada. Phone: (416) 586-8262. Fax: (416)
586-8869. E-mail: [email protected].
† Supplemental material for this article may be found at http://mcb
.asm.org/.
7092
VOL. 25, 2005
IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS
7093
FIG. 1. Domain organization and alternate names of WW domain-containing proteins analyzed in this study. WW domain-containing proteins
examined in this study, their unique GeneIDs, as well as alternate names are indicated. The relative domain organization of each of the proteins
was determined by searching the National Center for Biotechnology Information conserved domain database (CDD) (51) and is indicated. WW
domains specifically analyzed in this study are indicated with an asterisk. a.a., amino acids; PTB, phosphotyrosine binding; PH, pleckstrin
homology; SH3, Src homology 3.
1 (21), recognize p(S/T)P motifs (76). Group II and III WW
domains can be rather versatile in their binding properties,
since they not only recognize both PPLP- and PR-containing
peptides (with varied affinities) but also polyproline stretches
often containing glycine, methionine, or arginine (29, 39, 40,
54), and therefore could be viewed as a single set.
A number of groups have identified WW domain-binding
proteins through screening cDNA expression libraries and
yeast two-hybrid analysis (2, 14, 18, 36, 53). These studies,
coupled with probing of peptide libraries, have suggested that
consensus sequences are recognized by individual WW domains (29, 54). To obtain a more comprehensive view of the
functions of WW domain-containing proteins, as well as the
ability of individual WW domains to recognize cellular proteins, we have used tandem mass spectrometry (MS) to identify
human polypeptides that associate with a range of WW domains. This approach has several advantages. (i) It is not biased towards previously determined peptide ligands and provides orthogonal results regarding WW domain-binding
properties that can be overlaid on peptide-binding data. (ii) It
allows the identification of proteins that interact indirectly with
WW domains, providing new information regarding the cellular complexes and machinery engaged by WW domains. (iii) It
facilitates the identification of phosphorylated ligands for
group IV WW domains.
Here, we identify 148 T-cell proteins that selectively associate with 10 human WW domains. Interestingly, hierarchical
clustering of this data set organized the WW domains into
three groups on the basis of their protein-binding properties,
which correlates with the presence of specific proline-rich or
phosphorylated peptide motifs in these associated proteins.
The screen identified novel, direct, and biologically relevant
binding partners for WW domain-containing proteins, as well
as associated proteins that bind WW domains indirectly as
components of multiprotein complexes. Finally, we show that
WW domains from the same protein or protein families can
have distinct binding preferences and that individual proteins
can potentially serve as scaffolds to recruit multiple WW domains of different classes to distinct binding sites. These results
argue that the reiterated use of a small interaction domain can
yield a versatile network of protein interactions that influences
many facets of cellular behavior.
MATERIALS AND METHODS
Antibodies and cDNAs. The polyclonal antisera to AIP4 (74) and p68 and p25
antisera (60) have been previously described. Antibodies were purchased as
7094
INGHAM ET AL.
follows: anti-Diaphanous 1 polyclonal antisera (ImmunoGlobe, Himmelstadt,
Germany); anti-FLAG M2 monoclonal antibody (MAb) (Sigma, St. Louis, MO);
anti-Myc 9E10 MAb, antibody to the large subunit of RNA polymerase II
(anti-RNA Pol II LS), antihemagglutinin (anti-HA), and anti-EWSR1 (antiEWS) polyclonal antisera (Santa Cruz, Santa Cruz, CA); anti-CA150 polyclonal
antisera (Abcam, Cambridge, MA); and anti-Pin1 polyclonal antisera (Upstate
Biotechnology, Inc., Upstate, NY). The HA-tagged ubiquitin construct was a gift
from Ivan Dikic (Goethe University Medical School, Frankfurt, Germany). The
doubly Myc-tagged AIP4 constructs were generated by PCR of our previously
published constructs (74) with the amino terminus of the published AIP4 sequence (31) replacing the amino-terminal Itch portion of the fusion used in our
earlier study. These constructs were cloned into the pCDNA3.1A expression
vector (Invitrogen, San Diego, CA). The Flag-tagged p68 (Flag-p68) and
KIAA0144 and the corresponding tyrosine-to-alanine (Y/A) mutation in the PY
motif, were generated by PCR and cloned into the pCMVFlac eukaryotic expression vector (Sigma). All PCR products were sequence verified. Production of
the His-tagged, recombinant pre-mRNA cleavage and polyadenylation factor Im
(CFIm) p68 and p25 has been previously described (16).
Construction of glutathione S-transferase (GST) fusion proteins. Oligonucleotides for the WW domains including four or five amino acids N and C terminal
to the conserved tryptophan residues of each of the WW domains (see Table S1
in the supplemental material) were purchased from Sigma. Complementary
oligonucleotides (approximately 2 mM of each) were annealed and cloned into
BamHI/EcoRI-cut pGEXKT (Amersham Biosciences AB, Uppsala, Sweden).
All constructs were sequence verified. The AIP4 WW domain fusion proteins
have been previously described (74).
Purification of GST fusion proteins. GST fusion proteins were purified from
BL21 Escherichia coli bacterial lysate with glutathione-Sepharose beads (Amersham). Fusion protein-containing beads were stored at 4°C as a 10% slurry in
phosphate-buffered saline containing 0.02% NaN3. Fusion protein used for
SPOTS blotting was eluted from the beads with 50 mM Tris base and 25 mM
glutathione (Sigma), and the eluates were dialyzed overnight at 4°C in phosphate-buffered saline.
Cell culture. Jurkat cell pellets were purchased from the National Cell Culture
Center (Minneapolis, MN) where cells were grown in RPMI 1640 medium
supplemented with 0.1 mM sodium pyruvate and 10% normal calf serum. 293T
cells were grown in Dulbecco modified Eagle medium (Gibco BRL, Cambridge,
Ontario, Canada) supplemented with 10% fetal calf serum (Gibco BRL).
GST-WW domain pulldowns for MS analysis. Jurkat cells were resuspended in
1% NP-40 lysis buffer supplemented with 1 mM NaVO4, 1 mM phenylmethylsulfonyl fluoride, 10 ␮g/ml leupeptin, and 10 ␮g/ml aprotinin at 1.0 ⫻ 108 cells/ml
and incubated on a nutator in the cold for 30 min. Detergent-insoluble material
was removed by centrifugation at ⬃50,000 ⫻ g for 30 min at 4°C. One milliliter
of lysate (1.0 ⫻ 108 cell equivalents) was then added to 80 ␮g of GST fusion
protein precoupled to glutathione-Sepharose beads and allowed to incubate for
6 h on a nutator at 4°C. The beads were washed four times with 1% NP-40 lysis
buffer with inhibitors, and proteins were removed from the beads by boiling in
sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) sample
buffer.
Running and staining of SDS-PAGs. Samples were run on 10% SDS-PAGs
and washed a minimum of three times for 10 min each time with distilled,
deionized water before being stained with GelCode colloidal Coomassie blue
reagent (MJS BioLynx, Inc., Brockville, Ontario, Canada) overnight at room
temperature. Gels were destained the next day with distilled, deionized water.
Excision of bands for MS. Individual bands were excised using an Investigator
ProPic Gel Cutting Robotic workstation (Genomic Solutions, Ann Arbor, MI)
and delivered to a single well of a 96-well microtiter plate with prepunched holes
at the bottom of each well (Genomic Solutions).
Reduction, alkylation, and tryptic digestion of samples. The PCR microtiter
plate was transferred to a Genomic Solutions ProGest Digestion robot for
“in-gel” trypsin digestion where protein bands were washed, reduced with dithiothreitol, alkylated with iodoacetamide, before being digested with sequencegrade, modified trypsin (Promega, Madison, WI) as previously described (28).
Tryptic peptides were then extracted from the gel for analysis by MS.
Liquid chromatography-MS (LC-MS). Tryptic peptides were analyzed by liquid chromatography-tandem MS (LC-MS/MS) using two systems. The first was
an Ultimate high-performance liquid chromatography system equipped with a
FAMOS 96-well autosampler (LC Packings-Dionex, Sunnyvale, CA) linked to a
Q-TRAP mass spectrometer (MDS Sciex, Concord, Ontario, Canada). The second was an HP 1100 high-performance liquid chromatography system (Palo Alto,
CA) connected to an LCQ-Deca mass spectrometer (Thermo Electron, San Jose,
CA). Peptides were separated on custom-made 75-␮m-inner-diameter PicoTip
columns packed with 5-␮m C-18 beads. The gradient was 3 to 60% of acetonitrile
MOL. CELL. BIOL.
during 25 min with a total run time of 45 min. Data were analyzed in batch using
the Mascot search engine (57), and proteins were considered “hits” if two
independent peptides or a single peptide with a Mascot score of 50 or higher was
found. Protein hits were converted to gene identifiers (GeneIDs) for further
analysis. Any “hit” that was not seen in two of three independent experiments or
was seen binding to GST alone was eliminated. Also excluded were all actins,
myosins, keratins, spectrins, and heat shock proteins. The cellular process for
each “hit” was assigned on the basis of published work and GeneOntology
information provided by the Gene Ontology Consortium (10). For the identification of phosphopeptides, the data were analyzed using the Spectrum Mill
software (Agilent Technologies, Palo Alto, CA). The validity of phosphorylation
sites was manually verified by examination of the MS/MS fragmentation spectra.
Transfection of 293T cells, GST pulldowns, and immunoprecipitation experiments. 293T cells were transfected in 10-cm plates using CaPO4 as previously
described (42). Transfected cells were lysed in 1 ml 1% NP-40 lysis buffer
supplemented with inhibitors (see above) for 10 min on ice. Detergent-insoluble
material was removed by spinning at ⬃18,000 ⫻ g in a microcentrifuge, and the
concentration of the cleared lysate was determined by bicinchoninic acid assay
(Pierce, Rockford, IL). An equivalent amount of cleared lysate was incubated
with 2 ␮g of the indicated antibodies and either protein A or G-Sepharose or
anti-mouse immunoglobulin G-agarose for 1 h on a rocker at 4°C. For GST
fusion protein experiments, equivalent amounts of lysate were incubated for 1 h
on a rocker at 4°C with ⬃8 ␮g of GST fusion protein preabsorbed on glutathione-Sepharose beads. Immunoprecipitates and GST pulldowns were then
washed three times with 1% NP-40 lysis buffer with inhibitors, and bound proteins were eluted by boiling in SDS-PAGE sample buffer.
Western blotting. Cell lysate, immunoprecipitations, and GST pulldowns were
separated on SDS-PAGs, transferred to Polyscreen polyvinylidene fluoride
(PVDF) membranes (Perkin-Elmer), and blocked for 1 h in Tris-buffered saline
containing 0.5% Tween 20 (TBST) supplemented with 5% skim milk powder.
Membranes were probed with the indicated antibodies (1 ␮g/ml in TBST) as
previously described (34), and proteins were visualized using enhanced chemiluminescence reagent. Blots were stripped by incubating the membrane for 1 h
in TBST, pH 2, and reprobed as described above.
Hierarchical clustering. Pearson correlation coefficients were computed to
reflect the similarities between the different WW domains with respect to the
proteins they precipitated and vice versa. The x and y axes were hierarchically
clustered independently using an average linkage method, and the matrix was
reorganized to reflect the hierarchically clustered ordering (17). The tree shows
the relatedness between items based on the clustering, with items closer together
being more related (65).
Ubiquitylation experiments. For ubiquitylation experiments, transfected cells
were treated for 1 h with 50 ␮M of the proteasome inhibitor MG132 (BACHEM,
King of Prussia, PA) prior to lysis. Cells were lysed and lysates processed as
indicated above with the exception that 50 ␮M MG132 and 20 mM N-ethylmaleimide (Sigma) were added to the lysis buffer.
Synthesis and blotting of SPOTS membranes. SPOTS blots (23) were synthesized, using standard 9-fluorenylmethoxycarbonyl (Fmoc) chemistry, in 30 ⫻ 20
spot arrays using a Multipep peptide synthesizer adapted for SPOTS synthesis
(Intavis AG, Cologne, Germany). Membranes were blocked for at least 1 h in
TBST with 5% skim milk powder before being probed overnight with the indicated GST fusion protein (1 ␮M in TBST) in the cold. The following day, blots
were washed three times for 10 min each time in TBST, incubated with a
horseradish peroxidase-conjugated, anti-GST MAb (B14) (Santa Cruz) for 45
min at room temperature, and then washed with TBST three more times for 10
min each time before visualizing SPOTS by exposing the membranes to enhanced chemiluminescence reagent.
RESULTS
Cluster analysis groups WW domains according to their
binding partners. To identify cellular proteins that interact
with individual WW domains, we constructed GST fusion proteins of 15 WW domains from 13 proteins containing WW
domains (Fig. 1). This set includes WW domains that have
different binding preferences for peptide ligands and are derived from proteins with diverse molecular functions. Distinct
WW domains from the same protein (FBP21 or CA150) and
from proteins with similar domain organizations (i.e., NEDD41/AIP4/Smurf1 and CA150/FBP11) were also part of the test
VOL. 25, 2005
IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS
FIG. 2. Purification of WW domain-binding proteins. Proteins
were precipitated from Jurkat cell lysate with the indicated fusion
proteins as outlined in Materials and Methods. Precipitated proteins
were separated on SDS-PAGs, and gels were stained with colloidal
Coomassie blue to visualize proteins. One representative gel is shown
here. The positions of molecular mass markers (in kilodaltons) are
indicated to the left. The positions of bait proteins are indicated by
arrows.
set. Fusion proteins were expressed in bacteria, purified, and
used to precipitate proteins from Jurkat T-cell lysates as described in Materials and Methods. A representative gel stained
with colloidal Coomassie blue shows that the GST-WW domain fusion proteins can precipitate proteins from Jurkat cell
lysates not seen with GST alone (Fig. 2). Ten of the tested WW
domains associated with specific proteins and showed distinct
binding profiles (Fig. 2), whereas five (Smurf1 WW1, CA150
WW1, PQBP1, PLEKHA5, and KIAA1052) did not bind specific proteins in this assay (data not shown), potentially because
they require additional sequences either to fold into functional
modules or to recognize ligands (15, 29, 30, 50, 54). Proteins
that were precipitated by the 10 WW domains were then excised from the relevant gel and analyzed by LC-MS/MS. Precipitations with each GST-WW domain were repeated three
times, and only proteins identified in two of three independent
experiments with a minimum score of confidence (see Materials and Methods) were included for subsequent analysis. Using these criteria, a total of 148 proteins were identified as
being precipitated by one or more of the 10 WW domains but
not by GST alone (see Table S2 in the supplemental material).
The majority of these proteins represent novel WW domainassociated proteins and are involved in processes such as transcription, RNA processing, and regulation of the cytoskeleton.
In order to compare the binding properties of the different
WW domains, we employed hierarchical clustering to group
the WW domains on the basis of the similarity of the proteins
they recognized. This clustering separated the WW domains
into three classes (Fig. 3). One cluster (group A [shown in
7095
yellow]) includes the NEDD4-1 WW2, AIP4 WW2, and
WWOX WW1 domains. A second group comprises the FE65
WW, Gas7 WW, FBP21 WW2, FBP11 WW1 and WW2, and
CA150 WW2 domains (group B [shown in red]). The WW
domain of Pin1 clustered independently of the other WW
domains and thus forms its own group (group C [shown in
green]). Thus, despite the fact that these WW domain complexes may contain both direct and indirect interaction partners, data based exclusively on the analysis of full-length proteins isolated by WW domains identified three groups of
domains, raising the possibility that these might correspond to
groups previously distinguished by their selectivity for short
peptide motifs.
Clusters of WW domain ligands are enriched for specific
WW domain-binding motifs. WW domains bind a number of
proline-based peptide ligands including PY, PPLP, PR, and
p(S/T)P motifs as well as polyproline stretches (2, 3, 14, 18, 29,
39, 40, 49, 54, 76). Therefore, we examined whether PPXY,
PPLP, and PPPPP motifs are present in the WW-binding proteins from Jurkat cells (Fig. 4). Since a clear consensus sequence does not exist for PR peptide ligands, we searched for
the PPRXP and PXPPXR motifs described by Otte et al. (54).
Analysis of over 24,000 reference proteins with unique
GeneIDs showed that only 13% of these polypeptides contained at least one such motif. In contrast, over 50% of the
proteins isolated in association with the analyzed WW domains
possess at least one of these peptide sequences, with the exception of Pin1 WW domain-binding proteins. Furthermore,
WW domains assigned to group A by hierarchical clustering
were enriched for precipitated proteins with PPXY motifs,
while group B WW domains preferentially precipitated proteins containing PPLP, PPPPP, and PXPPXR motifs (Fig. 4).
This relationship is highlighted by superimposing the presence
of these motifs on the clustered data of Fig. 3 (see Fig. S1 in
the supplemental material).
The majority of the proteins precipitated by the WW domain
of Pin1 lacked these proline-rich motifs, and there was no
enrichment for any specific proline-rich sequence (Fig. 4; see
Fig. S1 in the supplemental material). This is consistent with
the ability of this domain to recognize p(S/T)P ligands. We
therefore examined the MS data for tryptic peptides containing
phosphorylated serine or threonine and compared Pin1 phosphorylation data with results obtained for representatives of
group A (NEDD4-1 WW2) and group B (FBP11 WW2) WW
domains (Fig. 5A). Twenty-two distinct serine/threonine phosphorylation sites were found in 7 of the 45 proteins precipitated by the Pin1 WW domain (Fig. 5C), and the majority of
these phosphorylation sites conform to the Pin1 consensus,
p(S/T)P. A single phosphorylation site from one protein was
identified in the NEDD4-1 WW2 pulldowns, while two phosphorylation sites were identified for the single phosphoprotein
precipitated by the FBP11 WW2 (Fig. 5C). While this analysis
is not comprehensive, the GST-Pin1 WW domain pulldowns
appear to be enriched for phosphorylated-serine/threonine
peptides which fit the consensus for Pin1 WW domain binding.
Taken together, these results indicate that the binding properties of individual WW domains for full-length cellular proteins can be integrated with their selective recognition of peptide ligands to provide a more comprehensive view of their
7096
INGHAM ET AL.
MOL. CELL. BIOL.
VOL. 25, 2005
IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS
7097
FIG. 4. Examination of precipitated proteins for known proline-based WW domain-binding motifs. The proteins precipitated with each of the
GST-WW pulldowns were analyzed for the presence of the indicated proline-containing motifs. Note that the pie charts do not equal 100%, since
individual proteins can have more than one motif. The number of proteins precipitated by each WW domain and the percentage of proteins with
at least one of the proline-rich motifs are indicated below the corresponding pie charts. The ⬃24,000 reference proteins with unique GeneIDs were
downloaded from the National Center for Biotechnology Information database and searched locally for the indicated motifs.
physiological interactions. This approach may be of general
utility in exploring the functions of interaction domains.
Domain-based screen identifies novel, physiologically relevant WW domain-associated proteins. Several of the proteins
identified as binding to specific WW domains have been previously shown to coimmunoprecipitate with the corresponding
WW domain-containing protein. For example, transfected
FE65 coprecipitates with endogenous Mena from Cos-7 cells
(18), and CA150 coprecipitates with the SF1 splicing factor
through its second WW domain (45). We identified both of
these interactions in our screen (ENAH [GeneID 55740] and
SF1 [GeneID 7536]). In addition, NEDD4-1 and its yeast orthologue, Rsp5p, directly interact with the RNA Pol II LS (12)
and induce its ubiquitylation in response to UV-induced DNA
damage (1). We found RNA Pol II LS in association with
NEDD4-1 WW2 and also with WW domains from AIP4 and
WWOX (POL2RA, GeneID 5430).
However, the majority of the proteins we identified have not
been previously shown to interact with the relevant WW domain-containing proteins. Therefore, we assessed whether selected proteins identified as binding isolated WW domains in
vitro could associate with the cognate full-length proteins in
cells (Fig. 6A). In a fashion similar to that of NEDD4-1, endogenous AIP4 coimmunoprecipitated with the RNA Pol II
LS. More interestingly, we tested whether AIP4 associates in
vivo with the p68 subunit of CFIm (p68) (CPSF6, GeneID
11052), which precipitated with the AIP4 WW2 domain in
vitro. The CFIm complex is involved in the earliest events of
pre-mRNA cleavage prior to the addition of the poly(A) tail
(59, 60) and is formed through the heterodimerization of
p68 with a p25 subunit (CPSF5, GeneID 11051) which was
also precipitated with the AIP4 WW2 domain (Fig. 3). Indeed, the p68 subunit of CFIm coprecipitated with AIP4
from Jurkat T-cell lysate. We also found that CA150, which
FIG. 3. Hierarchical clustering of WW domains with precipitated proteins. The GST-WW domains (abscissa) and precipitated proteins
(ordinate) were hierarchically clustered as outlined in Materials and Methods. The tree at the top indicates the relatedness of the WW domains
with respect to precipitated proteins. This clustering distinguishes three groups of WW domains: group A (AIP4 WW2, NEDD4-1 WW2, and
WWOX WW1) in yellow, group B (FBP11 WW1 and WW2, FBP21 WW2, FE65 WW, CA150 WW2, and Gas7 WW) in red, and group C (Pin1
WW) in green. The color of the precipitated protein indicates cellular function and is shown to the right.
7098
INGHAM ET AL.
MOL. CELL. BIOL.
FIG. 5. Analysis of GST-WW domain pulldowns for phosphopeptides. (A) The MS data from the GST-Pin1 WW, NEDD4-1 WW2, and FBP11
WW2 pulldowns were analyzed for the presence of phosphopeptides as described in Materials and Methods. Identified phosphopeptides and their
host proteins are indicated. Definitively phosphorylated residues are indicated in red, and possible phosphorylation sites where the identification
was ambiguous are indicated in green. A lowercase “m” denotes an oxidized methionine, while a lowercase “q” denotes a pyroglutamine residue.
Amino acids in parentheses are residues N and C terminal to the peptide in the protein sequence. Peptides indicated by an asterisk are doubly
phosphorylated. (B) The MS/MS spectra for the THRAP3 phosphopeptide is shown. Note that only b (blue), y (red), and parental ion minus
phosphoric acid (P(m/z) ⫺ H3PO4) or minus phosphoric acid and water (P(m/z) ⫺ H3PO4-H2O) (green) are indicated. The number associated with
each ion is the mass/charge ratio (m/z) of that ion. (C) Statistics showing the number of phosphorylation sites, number of phosphorylated proteins,
and number of phosphorylated proteins as a percentage of total proteins precipitated with each WW domain (from Fig. 3).
has been reported to regulate transcription and splicing (25,
45, 63, 69) and to bind the Huntingtin protein (27), coprecipitated with Diaphanous 1 (DIAPH1, GeneID 1729) (Fig.
6A), a protein with links to the cytoskeleton (73), but which
has also been shown to be present in the nucleus in NIH 3T3
cells (70).
We next tested whether these in vivo interactions are indeed
WW domain dependent. Figure 6B shows that Myc-tagged
VOL. 25, 2005
IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS
7099
FIG. 6. Identified proteins are direct, in vivo, binding partners. (A) Proteins were precipitated from 293T cell lysate (anti-RNA Pol II LS
immunoprecipitate [ippt]) or Jurkat cell lysate (anti-p68 ippts) with the indicated antibodies, separated on SDS-PAGs, transferred to PVDF
membranes, and blotted with antiserum against AIP4 or Diaphanous 1 (top blots). Blots were stripped and reprobed with the indicated antibodies
(bottom blots). (B) 293T cells were transfected with cDNAs coding for a doubly Myc-tagged AIP4 (Myc-AIP4), enzymatically inactive AIP4 mutant
(Myc-C830A AIP4), an AIP4 construct lacking the entire C2 domain (Myc-⌬C2 AIP4), or a construct in which the second tryptophan (tyrosine
in the case of the fourth WW domain) residue in each WW domain had been mutated to alanine (Myc-WWmut AIP4). Cell lysates were
immunoprecipitated with the anti-Myc MAb 9E10, and immunoprecipitates were separated on SDS-PAGs, transferred to PVDF membranes, and
blotted for the large subunit of RNA polymerase II (anti-RNA Pol II LS) or the p68 and p25 subunits of CFIm (anti-p68 and anti-p25). The blots
were stripped and reprobed with the 9E10 anti-Myc MAb (bottom blots). Cell lysates are included to show that the levels of RNA Pol II LS, p68,
and p25 are equivalent in each of the lysates. Note that the dot in the top blot of the Myc-WWmut AIP4 pulldown is an artifact. (C) Baculovirusexpressed, purified recombinant His-tagged p68 or p25 protein was incubated with GST alone or a GST fusion protein of the first WW domain
of AIP4 (GST-AIP4 WW1). GST precipitates were separated on SDS-PAGs, transferred to nitrocellulose, and blotted with either anti-p68
antiserum (top blot) or anti-p25 antiserum (bottom blot). The positions of molecular mass markers (in kilodaltons) are indicated to the left. The
positions of p68 and p25 proteins are indicated by the arrows. The asterisk in the top blot indicates a p68 degradation product, while the asterisk
in the bottom blot indicates the GST-AIP4 WW1 fusion protein. The input is one/fifth of the material used for the pulldown experiment. (D) 293T
cells were transfected as indicated with cDNAs coding for a doubly Myc-tagged AIP4 (Myc-AIP4), Flag-tagged p68 protein (FLAG-p68), or
Flag-tagged p68 protein in which the tyrosine residue in the PY motif had been mutated to alanine (FLAG-p68 Y/A). Cell lysates were
immunoprecipitated with the anti-FLAG M2 MAb, and immunoprecipitates were separated on SDS-PAGs, transferred to PVDF membranes, and
blotted with the anti-Myc MAb (top blot). The blot was stripped and reprobed with the anti-Flag M2 MAb (middle blot). Cell lysate is included
to show that equivalent amounts of Myc-tagged AIP4 were expressed in the appropriate transfections (bottom blot). (E) 293T cells were
transfected as indicated with cDNAs coding for a Flag-tagged p68 protein (FLAG-p68), a doubly Myc-tagged AIP4 (Myc-AIP4), an enzymatically
inactive AIP4 mutant (Myc-C830A AIP4), or an HA-tagged ubiquitin (HA-Ub) construct. Cell lysates were immunoprecipitated with the
anti-FLAG M2 MAb, and immunoprecipitates were separated on SDS-PAGs, transferred to PVDF membranes, and blotted with an anti-HA
antiserum (top blot). The blot was stripped and reprobed with the anti-Myc 9E10 MAb (middle blot). Cell lysate is included to show that equivalent
amounts of FLAG protein were expressed in the appropriate transfections (bottom blot). The positions of ubiquitylated p68 proteins are indicated
by arrows. The positions of molecular mass standards (in kilodaltons) are indicated to the left.
AIP4 coprecipitates with the RNA Pol II LS and the p68 and
p25 subunits of CFIm from 293T cell lysate. Similar results
were obtained with Myc-tagged AIP4 variants lacking the entire C2 domain (⌬C2) or rendered enzymatically inactive by
substitution of a critical catalytic cysteine within the HECT
domain (C830A). However, collectively inactivating the four
WW domains in AIP4 by mutating the second conserved tryptophan residue in each case (tyrosine in the case of the fourth
WW domain) (13, 50) abolished AIP4 interaction with all three
of these binding partners (Fig. 6B).
7100
INGHAM ET AL.
While these coprecipitation experiments identify proteins
that associate with AIP4 in cells, they do not prove that these
interactions are direct. To explore this point, we tested
whether baculovirus-expressed, purified, recombinant Histagged p68 and p25 subunits of CFIm could be precipitated
with a GST fusion protein containing the first WW domain of
AIP4. Although both the p68 and p25 subunits possess a single
PY motif and therefore could potentially directly interact with
AIP4, only the purified p68 protein on its own precipitated
with GST-AIP4 WW1, whereas p25 did not (Fig. 6C). However, the p25 subunit was precipitated with the GST-AIP4
WW1 fusion protein when coincubated with the p68 subunit,
likely through heterodimerizing with the p68 protein.
To test whether the single PY motif of p68 is required for its
WW domain-dependent interaction with AIP4, we coexpressed Myc-tagged AIP4 in 293T cells with either Flag-p68 or
Flag-p68 Y/A, which contains a mutation that disrupts WW
domain binding (14, 74). Unlike wild-type Flag-p68, the Flagp68 Y/A protein did not coprecipitate the Myc-tagged AIP4
(Fig. 6D), indicating that the interaction between AIP4 and
p68 depends on both the AIP4 WW domains and the p68 PY
motif.
Since AIP4 is an E3 protein-ubiquitin ligase, we examined
whether the interaction between p68 and AIP4 might promote
p68 ubiquitylation. Figure 6E shows that when Flag-tagged p68
is coexpressed with an HA-tagged ubiquitin construct in 293T
cells, the HA-tagged ubiquitin is incorporated into p68, forming a characteristic ubiquitin ladder. Moreover, the further
expression of Myc-tagged AIP4, but not the enzymatically inactive C830A mutant, strongly enhanced p68 ubiquitylation.
Taken together, these data suggest that proteins identified in
this domain-based screen can serve as direct physiological
binding partners for WW domain-containing polypeptides.
Furthermore, the screen can identify a protein such as the p25
subunit of CFIm, which is recruited into a complex with a WW
domain protein through an indirect association. In addition,
complexes such as CFIm/AIP4 identified in the WW domain
screen can be functional in vivo, as indicated by the AIP4mediated ubiquitylation of p68.
WW domain screen isolates multiprotein complexes involved in transcription, splicing, chromatin remodeling, and
cytoskeletal organization. The data for CFIm suggest that the
WW domains can bind proteins that are themselves components of stable multiprotein complexes. In fact, analysis of the
WW domain-binding proteins identified by LC-MS/MS suggests that several such complexes are recruited to WW domains. For instance, several studies have shown that the WW
domains of NEDD4 family proteins bind the RNA Pol II LS
(POLR2A) (1, 12). Indeed, we found the RNA Pol II LS
associated with the AIP4 WW2, NEDD4 WW2, and WWOX
WW1 domains. Moreover, we identified additional RNA Pol II
subunits, as well as proteins reported to bind RNA Pol II
subunits (RNA polymerase II, subunit 5-mediating protein
[C19orf2] [GeneID 8725]) (Fig. 7A). These data argue that the
WW domain pulldowns precipitate larger transcriptional complexes. Additionally, we identified proteins of the ENA/VASP
family (ENAH [Mena], VASP, and EVL) in pulldowns with
the FE65 WW domain and other group B (II/III) WW domains (Fig. 7B). The FE65 WW domain reportedly binds a
PPLP motif in Mena (18) and EVL (44), and the related VASP
MOL. CELL. BIOL.
protein also has this motif. Collectively, these proteins control
cell motility through regulation of the actin cytoskeleton (61).
The ENA/VASP proteins associate with WASP/WAVE family
proteins (11), which also regulate the actin cytoskeleton
through the Arp2/3 complex, and we found several family
members (WAS [WASP], WASL [N-WASP], and WASF2
[WAVE2]) in group B (II/III) WW domain pulldowns (Fig. 7B
and Fig. 3) (5). These proteins can recruit additional polypeptides, such as WASPIP (WIP), WIRE, ABI1, and CYFIP1/2,
which were also identified in the group B (II/III) WW domain
precipitations. It will be of interest to determine how WW
domain interactions may regulate this actin polymerizing complex; in one instance, a WW domain protein, FBP11, has been
shown to interfere with N-WASP function by sequestering it in
the nucleus (52).
Other cellular complexes were also seen in WW domain
pulldowns. We purified components that make up the U2
splicing complex (SF3A1, -2, and -3 and SF3B1, -2, -3, and -4)
with group B (II/III) WW domains, subunits of the chaperonin-containing t-complex polypeptide 1 (CCT) that facilitate
the folding of actin, tubulin, and other cytosolic proteins
(CCT2, CCT4, CCT7) (71) with FBP11 WW2 and several
components of the chromatin remodeling SWI/SNF complex
(ARID1A, SMARCC1, SMARCC2, and SMARCE1) with the
AIP4 WW2 domain. These results are consistent with the notion that WW domain proteins interact with larger multiprotein complexes involved in multiple facets of cellular regulation.
Specificity within the WW domains of AIP4 and between the
related WW domains of AIP4 and NEDD4-1. In the MS-based
analysis, we used the second WW domain of AIP4 to precipitate proteins from Jurkat cell lysate. However, AIP4 possesses
four WW domains, which may differ in their ability to bind
ligands (22, 26, 47, 64). Therefore, we investigated whether
proteins identified by MS as precipitating with the second WW
domain could also bind to the other WW domains of AIP4.
Figure 8A shows that the RNA Pol II LS and the p68 and p25
subunits of CFIm were precipitated by each of four WW domains, whereas EWS (EWSR1, GeneID 2130), a protein with
an RNA recognition motif that is frequently rearranged with
Ets family transcription factors in cancers such as Ewing’s
sarcoma (41), was specifically precipitated by AIP4 WW2. Similarly, a Flag-tagged KIAA0144 protein (UBAP2L, GeneID
9898), a protein of unknown function with a ubiquitin-associated domain, was selectively precipitated by the first and second WW domains. Thus, the isolated domains of tandem WW
domain arrays can show both common and selective binding,
suggesting that the tandem WW domains of AIP4 may act
synergistically to bind common partners or alternatively may
act as a scaffold to recruit specific proteins to individual WW
domains.
Many of the proteins precipitated by the second WW domains of AIP4 and NEDD4-1 possess at least one PY motif
(Fig. 8B and Fig. 4). While many of these proteins were precipitated by WW domains from both proteins, several were
selectively precipitated by only one of the WW domains (e.g.,
UBAP2L [KIAA0144] and EWSR1 [EWS]). Direct analysis by
immunoblotting of proteins isolated from various amounts of
293T cell lysate showed that the second WW domains of
NEDD4-1 and AIP4 were equivalent in their abilities to pre-
VOL. 25, 2005
IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS
7101
FIG. 7. Protein complexes precipitated by GST-WW domains. (A) The transcriptional complex precipitated by the AIP4 WW2 domain is
shown. The WW domain is indicated in yellow, direct WW domain-binding proteins in red, and likely indirectly precipitating proteins in green.
Gene symbols and common names (in parentheses) are provided. Black lines indicate a direct physical interaction, while gray lines indicate a
physical association, but not necessarily direct. (B) Actin cytoskeleton-regulating complex precipitated by the FE65 WW domain (as well as other
group B [II/III] WW domains) is shown and labeled as described above for panel A. Although not found in the FE65 pulldowns, NAP1 (NCKAP1)
was included (in blue) as it has been shown to bridge ABI1 with CYFIP1/2 (24, 35) and was found in association with other group B (II/III) WW
domains (Fig. 3). Note that this model represents one possible cytoskeleton-regulating complex based on published work. Since the WASP proteins
(WAS and WASL), WIPIP, WIRE, WASF2, and ABI1 are extremely proline rich and contain motifs that are predicted to interact with group B
WW domains, they may interact directly with group B WW domains.
cipitate the p68 subunit of CFIm, whereas the KIAA0144 and
EWS proteins were preferentially precipitated by the AIP4
WW2 domain (Fig. 8C). This preferential binding of EWS and
KIAA0144 likely explains why the EWS and KIAA0144 proteins were not identified as NEDD4-1 WW2-binding proteins
by MS (Fig. 3; see Table S2 in the supplemental material).
To confirm that binding of the AIP4 and NEDD4-1 WW
domains to the KIAA0144 protein was through the PY motif in
KIAA0144, we mutated the tyrosine residue in the PY motif.
Figure 8D shows this substitution greatly decreased KIAA0144
precipitation by both the AIP4 and NEDD4-1 WW2 domains.
The mutation did not completely abolish binding and had a
more severe effect on precipitation by the NEDD4-1 WW2
domain than AIP4 WW2 (Fig. 8D). To probe whether the PY
motif is the main site of interaction for the WW domains of
AIP4 and NEDD4-1 and determine whether there may be
additional binding sites, as suggested by the PY mutants, we
generated an array of overlapping peptides corresponding to
the entire sequence of KIAA0144 and determined which peptides were recognized by GST-AIP4 WW2 and GST-NEDD4-1
WW2 fusion proteins. A representative array (Fig. 8E) shows
that three peptides were reproducibly and specifically recognized by GST-AIP4 WW2 and GST-NEDD4-1 WW2 (two
strongly and one weakly, indicated in red) but not GST alone.
The two strong-intensity spots (spots 1 and 2) include the PY
motif, while the weak-intensity spot (spot 3) contains an LPXY
motif, a motif that has been shown to bind the WW domains of
NEDD4 family proteins (13, 62). Taken together, these data
show that recognition of KIAA0144 by the second WW domains of AIP4 and NEDD4-1 is largely dependent on the
KIAA0144 PY motif and that individual group A (I) WW
domains show preference for specific PY motifs.
7102
INGHAM ET AL.
MOL. CELL. BIOL.
FIG. 8. Specificity of binding between the different WW domains of AIP4 and the second WW domains of AIP4 and NEDD4-1. (A) 293T cells
were transfected with a cDNA coding for a Flag-tagged KIAA0144 protein (FLAG-KIAA0144 blot) or left untransfected (other blots). Cell lysates
were prepared, and an equivalent amount of lysate was precipitated with GST alone or with the indicated GST-AIP4 WW domain. Half of each
precipitate was separated on an SDS-PAG, transferred to a PVDF membrane, and blotted with the indicated antibody. The other half of the GST
pulldown was separated on an SDS-PAG and stained with Coomassie blue to show that equivalent amounts of GST fusion protein were used for
the pulldowns (bottom blot). Cell lysate is included to show the electrophoretic mobilities of the indicated proteins. (B) Osprey diagram (7)
showing the proteins identified by MS to precipitate with the second WW domains of NEDD4-1 and AIP4. Proteins with PY motifs are indicated
in red. Note that the large subunit of RNA polymerase II (POL2A) does not have a PY motif but does possess several copies of the YSPTSPS
heptamer sequence in its C-terminal domain (CTD), which has been shown to bind the WW domains of NEDD4 family proteins (12). (C) 293T
cells were transfected with a cDNA coding for a Flag-tagged KIAA0144 protein (FLAG-KIAA0144 blot) or left untransfected (other blots). Cell
lysates were prepared, and proteins were precipitated with the indicated GST fusion proteins from either 2 mg or 500 ␮g of cell lysate. Half of
each precipitate was separated on an SDS-PAG, transferred to a PVDF membrane, and blotted with the indicated antibody. The other half of the
GST pulldown was separated on an SDS-PAG and stained with Coomassie blue to show that equivalent amounts of GST fusion protein were used
for the pulldowns (bottom blot). Cell lysate (10 ␮g) was included to show the electrophoretic mobility of the immunoblotted protein and to give
an indication of the amount of protein precipitated by the GST fusion proteins. (D) 293T cells were transfected with a cDNA coding for a
Flag-tagged KIAA0144 protein (wild type [wt]) or a Flag-tagged KIAA0144 protein in which the tyrosine residue in the PY motif had been mutated
to alanine (Y/A). Cell lysates were prepared, and proteins were precipitated with the indicated fusion proteins from equivalent amounts of cell
lysate. Half of each precipitate was separated on an SDS-PAG, transferred to a PVDF membrane, and blotted with the anti-FLAG M2 MAb. The
other half of the GST pulldown was separated on an SDS-PAG and stained with Coomassie blue to show that equivalent amounts of GST fusion
protein were used for the pulldowns (bottom blot). Cell lysate was included to show the electrophoretic mobility of the FLAG-KIAA0144 protein.
(E) SPOTS membranes were prepared as outlined in Materials and Methods. Proteins (12-mers) comprising the entire coding sequence of
KIAA0144 were generated, with a moving window of four amino acids. SPOTS membranes were probed with 1 ␮⌴ of either GST-AIP4 WW2,
GST-NEDD4-1 WW2, or GST alone. Spots specific to GST-AIP4 WW2 and NEDD4-1 WW2 are numbered and indicated in red. The sequences
of these peptides (with the PY motif and LPXY motifs in red) are shown below the blots. Spots marked by an asterisk were recognized by GST
alone in other experiments.
An individual protein with distinct binding motifs for different classes of WW domains. A few proteins identified in the
MS screen were precipitated by WW domains that typically
recognize different classes of proline-rich ligands. For example,
the p68 protein of CFIm was precipitated by seven of the WW
domains analyzed (AIP4 WW2, NEDD4-1 WW2, WWOX
WW, CA150 WW2, FBP21 WW2, and FBP11 WW2). This
raised the question as to whether p68 possessed a single “ge-
VOL. 25, 2005
IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS
7103
FIG. 9. Recognition of the p68 CFIm protein by multiple WW domains. (A) The coding sequence of the p68 protein is shown. The proline-rich
region used for the SPOTS blots in panel C is indicated in pink. The PY (red), PPPPP stretch (yellow), and three PPGPPP motifs (green) are also
indicated. The RNA recognition motif is underlined. (B) 293T cells were transfected with cDNAs coding for either a Flag-tagged p68 protein (wild
type [wt]) or a Flag-tagged p68 protein in which the tyrosine residue in the PY motif had been mutated to alanine (Y/A). Cell lysates were prepared,
and an equivalent amount of lysate was precipitated with GST alone or the indicated GST-WW domain fusion protein. Half of each precipitate
was separated on an SDS-PAG, transferred to a PVDF membrane, and blotted with the anti-FLAG M2 MAb. The other half of the GST pulldown
was separated on an SDS-PAG and stained with Coomassie blue to show that equivalent amounts of GST fusion protein were used for the
pulldowns (bottom blot). Cell lysate is included to show the electrophoretic mobilities of the indicated proteins; note that the p68 Y/A was
consistently expressed at higher levels than the wt protein. (C) Peptides (12-mers), with a moving window of five amino acids, were generated for
the proline-rich region of p68 (panel A, pink region), and SPOTS membranes were probed with 1 ␮M of GST alone or the indicated GST-WW
domain fusion protein. Peptides recognized by one group of WW domains over another are boxed and numbered, and the peptide sequences are
provided below the blots. In the boxed sequences, the PY (red), PPPPP (yellow), and PPGPPP (green) motifs in the peptides are highlighted.
neric” WW domain recognition motif, or alternatively, distinct
recognition motifs recognized by different classes of WW domains. The sequence of p68 (Fig. 9A) shows that the C terminus possesses numerous RS repeats, a characteristic of SR
proteins (9). The p68 protein is also extremely proline rich and
has a single PY motif, a PPPPP stretch, and three PPGPPP
motifs. Since the PY motif of p68 is required for association
with AIP4 (Fig. 6), we tested whether it is also important for
association with other WW domains that recognize p68. Mutation of the p68 PY motif abolished binding to the WW
domains of WWOX and NEDD4-1. In contrast, the AIP4
WW2 domain showed reduced but still significant binding to
the p68 Y/A mutant, while the FE65 WW domain and CA150
WW2 domains were equally able to precipitate wild-type p68
and the p68 Y/A mutant (Fig. 9B). To locate the recognition
motifs for these latter WW domains, we synthesized a peptide
SPOTS array corresponding to the proline-rich region of p68
(Fig. 9A, pink region) and probed this with GST-WW fusion
proteins (Fig. 9C). As expected, the NEDD4-1 WW2, WWOX
WW1, and AIP4 WW2 domains recognized two overlapping
peptides containing the PY motif (Fig. 9C, red box). The fact
that the AIP4 WW2 domain recognized no additional peptides
suggests that the binding of this WW domain to the p68 Y/A
mutant still involves the p68 PY motif (Fig. 9B). The FE65
WW domain and CA150 WW2 domain collectively bound a
distinct set of proline-rich peptides that were not recognized by
the NEDD4-1, WWOX, or AIP4 WW domain or GST alone
(Fig. 9C, blue, pink, and green boxes). These included peptides
comprising the PPPPP stretch and the three PPGPPP motifs,
as well as several other peptides rich in proline and proline/
7104
INGHAM ET AL.
arginine (Fig. 9C). These results illustrate the specificity of
group A (I) WW domains for PY motifs and show the more
promiscuous nature of the group B (II/III) WW domains.
Furthermore, the data suggest that WW domains may interact
with p68 at multiple sites, with specific classes of WW domains
recruited to separate sites. The p68 protein may therefore
exemplify a scaffold with the ability to recruit multiple classes
of WW domains and WW domain proteins through distinct
proline-rich sequences.
DISCUSSION
To explore the range of cellular processes and ligand-binding preferences exhibited by individual WW domains, we have
undertaken an MS-based screen to identify proteins from
Jurkat T cells that associate in vitro with GST-WW domain
fusion proteins taken from a broad range of WW-containing
proteins. We have used unsupervised hierarchical clustering to
group WW domains on the basis of their similarity of precipitated protein ligands in this assay. This approach yielded three
subsets of WW domains that cluster on the basis of the similarities of their protein-binding partners. Many of these associated proteins contain proline-rich sequences typical of peptide ligands for WW domains. Indeed, we found that proteins
in one cluster (A) are enriched for PY motifs and are recognized by WW domains with a known predilection for PY peptides (i.e., group I WW domains). In a similar fashion, proteins
in a second cluster (B) possess PPLP and proline/argininecontaining motifs, as well as polyproline stretches containing
glycine and methionine, and associate with WW domains previously classified as group II/III on the basis of their peptidebinding properties. The final cluster (C) contains proteins that
are recognized by the Pin1 WW domain (group IV); these
show no preference for the specific proline-rich motifs analyzed but are enriched in phosphorylated-Ser/Thr-Pro sites.
Although we have included the FBP21 WW2 in group II/III
(cluster B), it has a fair number of specific binding partners
(Fig. 3) and has previously been suggested to lie outside the
conventional WW domain groups (29). Hierarchical clustering
therefore reveals a striking relationship between the full-length
proteins recognized by a particular subset of WW domains in
cell lysates and the peptide-binding specificities of these domains. This suggests that significant information regarding the
molecular recognition properties of interaction domains can
be inferred from proteomic-scale analysis of their proteinbinding partners. Furthermore, these data argue that the WW
domain-binding motifs identified by the analysis of synthetic
peptides are used extensively in the context of intact proteins
to determine binding specificity.
While some of the polypeptides precipitated by WW domains in this analysis have been described as binding partners
for specific WW domain proteins in cells, most of these interactions have not been previously identified. We therefore used
the p68/p25 subunits of the CFIm complex to address whether
a novel interaction detected in this in vitro screen might be
relevant in cells, whether it involves the anticipated recognition
of a specific proline-rich motif by one or more WW domains,
whether WW domains can recruit multiple components of a
larger protein complex, and whether such interactions are
functional in vivo. The p68 protein contains RS repeats and a
MOL. CELL. BIOL.
RNA recognition motif, which together with p25 binds premRNA and aids in the recruitment of other 3⬘ processing
factors, such as cleavage and polyadenylation specificity factor
(16, 59, 60). Indeed, we have found that the AIP4 E3 proteinubiquitin ligase binds directly to the p68 CFIm subunit in a
fashion that requires functional AIP4 WW domains and a PY
motif in p68. Although p25 has a PY motif, it does not interact
directly with AIP4 WW domains but rather binds p68, which
acts as a bridge to AIP4. Furthermore, we identified an SR
protein, SFRS7 (9G8), which interacts with the RS domain of
p68 (16) in the NEDD4-1 WW2 pulldowns (Fig. 3), providing
further support for the notion that WW domain pulldowns can
precipitate cellular complexes. From a functional point of view,
AIP4 stimulates the selective ubiquitylation of p68 (but not
p25) in cells. Ubiquitylation has many functions, including degradation via the 26S proteasome, but can also positively and
negatively regulate molecular interactions independent of degradation (37, 43). It will be of interest to explore whether
ubiquitylation of p68 may regulate its binding to RNA or its
ability to recruit other factors important for pre-mRNA cleavage.
Taken together, these data suggest that WW domains may
regulate an extensive network of protein-protein interactions
in vivo. Two additional points support this view. First, the WW
domains tested in this screen isolated a number of multiprotein
complexes with central functions in cellular regulation, including assemblies involved in transcription, splicing, chromatin
remodeling, and actin polymerization. WW domain proteins
may therefore serve to bridge or regulate such cellular machinery, for example in the coupling of transcription to splicing (25,
45, 63). In support of this notion, we have used peptide arrays
to show that the p68 CFIm subunit contains multiple prolinerich sequences that are able to bind WW domains of different
classes in vitro. Thus, WW domains can potentially function to
establish networks of interactions, and this may be especially
true of proteins with tandem WW domains.
In this regard, the screen compared the second WW domains of the related NEDD4 family E3 ubiquitin ligases
NEDD4-1 and AIP4. While many similarities were seen in the
proteins precipitated by the second WW domains of AIP4 and
NEDD4-1, there were some proteins specifically precipitated
with one WW domain (Fig. 8B). Further investigation of two
such selective proteins, KIAA0144 and EWS, confirmed that
these proteins were indeed precipitated much more efficiently
by the AIP4 WW2 domain than by NEDD4-1 WW2 (Fig. 8C).
Since humans have nine NEDD4 family proteins (33), an important question is whether these proteins are functionally
overlapping or whether they have specific targets. Our data,
although limited to one WW domain for both AIP4 and
NEDD4-1, suggest that both these paradigms may apply, as
some proteins are recognized equivalently by the two domains,
while others are selective for one. Of interest, a chromosomal
inversion that disrupts the promoter of the Itch/AIP4 gene in
mice leads to a hyperactivation of cells of the immune system
and an autoimmune-like phenotype; this also suggests that
these proteins can have nonoverlapping biological functions
(58).
In conclusion, our experiments show that WW domains, and
by extension WW domain-containing proteins, associate with
multiprotein cellular complexes, many of which have not been
VOL. 25, 2005
IDENTIFICATION OF WW DOMAIN-BINDING PROTEINS
previously described. Thus, WW domains provide a versatile
module with which to assemble regulatory protein networks.
22.
ACKNOWLEDGMENTS
R.J.I. was a recipient of fellowships from the National Cancer Institute of Canada (supported by Funds from the Terry Fox Run) and
the Leukemia & Lymphoma Society. T.P. is a Distinguished Scientist
of the Canadian Institutes for Health Research (CIHR). This work was
supported by grants from the CIHR, the Ontario R&D Research
Fund, and Genome Canada.
We also thank the National Cell Culture Center for tissue culture
services.
23.
24.
25.
REFERENCES
1. Beaudenon, S. L., M. R. Huacani, G. Wang, D. P. McDonnell, and J. M.
Huibregtse. 1999. Rsp5 ubiquitin-protein ligase mediates DNA damageinduced degradation of the large subunit of RNA polymerase II in Saccharomyces cerevisiae. Mol. Cell. Biol. 19:6972–6979.
2. Bedford, M. T., D. C. Chan, and P. Leder. 1997. FBP WW domains and the
Abl SH3 domain bind to a specific class of proline-rich ligands. EMBO J.
16:2376–2383.
3. Bedford, M. T., D. Sarbassova, J. Xu, P. Leder, and M. B. Yaffe. 2000. A
novel pro-Arg motif recognized by WW domains. J. Biol. Chem. 275:10359–
10369.
4. Bednarek, A. K., K. J. Laflin, R. L. Daniel, Q. Liao, K. A. Hawkins, and C. M.
Aldaz. 2000. WWOX, a novel WW domain-containing protein mapping to
human chromosome 16q23.3–24.1, a region frequently affected in breast
cancer. Cancer Res. 60:2140–2145.
5. Bompard, G., and E. Caron. 2004. Regulation of WASP/WAVE proteins:
making a long story short. J. Cell Biol. 166:957–962.
6. Bork, P., and M. Sudol. 1994. The WW domain: a signalling site in dystrophin? Trends Biochem. Sci. 19:531–533.
7. Breitkreutz, B. J., C. Stark, and M. Tyers. 27 February 2003, posting date.
Osprey: a network visualization system. Genome Biol. 4:R22. [Online.]
http://genomebiology.com/2003/4/3/R22.
8. Buschdorf, J. P., and W. H. Stratling. 2004. A WW domain binding region
in methyl-CpG-binding protein MeCP2: impact on Rett syndrome. J. Mol.
Med. 82:135–143.
9. Caceres, J. F., and A. R. Kornblihtt. 2002. Alternative splicing: multiple
control mechanisms and involvement in human disease. Trends Genet. 18:
186–193.
10. Camon, E., M. Magrane, D. Barrell, D. Binns, W. Fleischmann, P. Kersey,
N. Mulder, T. Oinn, J. Maslen, A. Cox, and R. Apweiler. 2003. The Gene
Ontology Annotation (GOA) project: implementation of GO in SWISSPROT, TrEMBL, and InterPro. Genome Res. 13:662–672.
11. Castellano, F., C. Le Clainche, D. Patin, M. F. Carlier, and P. Chavrier.
2001. A WASp-VASP complex regulates actin polymerization at the plasma
membrane. EMBO J. 20:5603–5614.
12. Chang, A., S. Cheang, X. Espanel, and M. Sudol. 2000. Rsp5 WW domains
interact directly with the carboxyl-terminal domain of RNA polymerase II.
J. Biol. Chem. 275:20562–20571.
13. Chen, H. I., A. Einbond, S. J. Kwak, H. Linn, E. Koepf, S. Peterson, J. W.
Kelly, and M. Sudol. 1997. Characterization of the WW domain of human
Yes-associated protein and its polyproline-containing ligands. J. Biol. Chem.
272:17070–17077.
14. Chen, H. I., and M. Sudol. 1995. The WW domain of Yes-associated protein
binds a proline-rich ligand that differs from the consensus established for Src
homology 3-binding modules. Proc. Natl. Acad. Sci. USA 92:7819–7823.
15. Chung, W., and J. T. Campanelli. 1999. WW and EF hand domains of
dystrophin-family proteins mediate dystroglycan binding. Mol. Cell Biol.
Res. Commun. 2:162–171.
16. Dettwiler, S., C. Aringhieri, S. Cardinale, W. Keller, and S. M. L. Barabino.
2004. Distinct sequence motifs within the 68-kDa subunit of cleavage factor
Im mediate RNA binding, protein-protein interactions, and subcellular localization. J. Biol. Chem. 279:35788–35797.
17. Eisen, M. B., P. T. Spellman, P. O. Brown, and D. Botstein. 1998. Cluster
analysis and display of genome-wide expression patterns. Proc. Natl. Acad.
Sci. USA 95:14863–14868.
18. Ermekova, K. S., N. Zambrano, H. Linn, G. Minopoli, F. Gertler, T. Russo,
and M. Sudol. 1997. The WW domain of neural protein FE65 interacts with
proline-rich motifs in Mena, the mammalian homolog of Drosophila Enabled. J. Biol. Chem. 272:32869–32877.
19. Espanel, X., and M. Sudol. 1999. A single point mutation in a group I WW
domain shifts its specificity to that of group II WW domains. J. Biol. Chem.
274:17284–17289.
20. Faber, P. W., G. T. Barnes, J. Srinidhi, J. Chen, J. F. Gusella, and M. E.
MacDonald. 1998. Huntingtin interacts with a family of WW domain proteins. Hum. Mol. Genet. 7:1463–1474.
21. Fan, H., K. Sakuraba, A. Komuro, S. Kato, F. Harada, and Y. Hirose. 2003.
26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.
37.
38.
39.
40.
41.
42.
43.
44.
7105
PCIF1, a novel human WW domain-containing protein, interacts with the
phosphorylated RNA polymerase II. Biochem. Biophys. Res. Commun. 301:
378–385.
Fotia, A. B., A. Dinudom, K. E. Shearwin, J. P. Koch, C. Korbmacher, D. I.
Cook, and S. Kumar. 2003. The role of individual Nedd4-2 (KIAA0439) WW
domains in binding and regulating epithelial sodium channels. FASEB J.
17:70–72.
Frank, R. 2002. The SPOT-synthesis technique. Synthetic peptide arrays on
membrane supports—principles and applications. J. Immunol. Methods 267:
13–26.
Gautreau, A., H. Y. Ho, J. Li, H. Steen, S. P. Gygi, and M. W. Kirschner.
2004. Purification and architecture of the ubiquitous Wave complex. Proc.
Natl. Acad. Sci. USA 101:4379–4383.
Goldstrohm, A. C., T. R. Albrecht, C. Sune, M. T. Bedford, and M. A.
Garcia-Blanco. 2001. The transcription elongation factor CA150 interacts
with RNA polymerase II and the pre-mRNA splicing factor SF1. Mol. Cell.
Biol. 21:7617–7628.
Harvey, K. F., A. Dinudom, P. Komwatana, C. N. Jolliffe, M. L. Day, G.
Parasivam, D. I. Cook, and S. Kumar. 1999. All three WW domains of
murine Nedd4 are involved in the regulation of epithelial sodium channels by
intracellular Na⫹. J. Biol. Chem. 274:12525–12530.
Holbert, S., I. Denghien, T. Kiechle, A. Rosenblatt, C. Wellington, M. R.
Hayden, R. L. Margolis, C. A. Ross, J. Dausset, R. J. Ferrante, and C. Neri.
2001. The Gln-Ala repeat transcriptional activator CA150 interacts with
huntingtin: neuropathologic and genetic evidence for a role in Huntington’s
disease pathogenesis. Proc. Natl. Acad. Sci. USA 98:1811–1816.
Houthaeve, T., H. Gausepohl, M. Mann, and K. Ashman. 1995. Automation
of micro-preparation and enzymatic cleavage of gel electrophoretically separated proteins. FEBS Lett. 376:91–94.
Hu, H., J. Columbus, Y. Zhang, D. Wu, L. Lian, S. Yang, J. Goodwin, C.
Luczak, M. Carter, L. Chen, M. James, R. Davis, M. Sudol, J. Rodwell, and
J. J. Herrero. 2004. A map of WW domain family interactions. Proteomics
4:643–655.
Huang, X., F. Poy, R. Zhang, A. Joachimiak, M. Sudol, and M. J. Eck. 2000.
Structure of a WW domain containing fragment of dystrophin in complex
with beta-dystroglycan. Nat. Struct. Biol. 7:634–638.
Ikeda, M., A. Ikeda, L. C. Longan, and R. Longnecker. 2000. The EpsteinBarr virus latent membrane protein 2A PY motif recruits WW domaincontaining ubiquitin-protein ligases. Virology 268:178–191.
Ilsley, J. L., M. Sudol, and S. J. Winder. 2002. The WW domain: linking cell
signalling to the membrane cytoskeleton. Cell Signal 14:183–189.
Ingham, R. J., G. Gish, and T. Pawson. 2004. The Nedd4 family of E3
ubiquitin ligases: functional diversity within a common modular architecture.
Oncogene 23:1972–1984.
Ingham, R. J., L. Santos, M. Dang-Lawson, M. Holgado-Madruga, P. Dudek,
C. R. Maroun, A. J. Wong, L. Matsuuchi, and M. R. Gold. 2001. The Gab1
docking protein links the B cell antigen receptor to the phosphatidylinositol
3-kinase/Akt signaling pathway and to the SHP2 tyrosine phosphatase.
J. Biol. Chem. 276:12257–12265.
Innocenti, M., A. Zucconi, A. Disanza, E. Frittoli, L. B. Areces, A. Steffen,
T. E. Stradal, P. P. Di Fiore, M. F. Carlier, and G. Scita. 2004. Abi1 is
essential for the formation and activation of a WAVE2 signalling complex.
Nat. Cell Biol. 6:319–327.
Jolliffe, C. N., K. F. Harvey, B. P. Haines, G. Parasivam, and S. Kumar. 2000.
Identification of multiple proteins expressed in murine embryos as binding
partners for the WW domains of the ubiquitin-protein ligase Nedd4. Biochem. J. 351:557–565.
Kaiser, P., K. Flick, C. Wittenberg, and S. I. Reed. 2000. Regulation of
transcription by ubiquitination without proteolysis: Cdc34/SCF(Met30)-mediated inactivation of the transcription factor Met4. Cell 102:303–314.
Kanelis, V., D. Rotin, and J. D. Forman-Kay. 2001. Solution structure of a
Nedd4 WW domain-ENaC peptide complex. Nat. Struct. Biol. 8:407–412.
Kato, Y., K. Nagata, M. Takahashi, L. Lian, J. J. Herrero, M. Sudol, and M.
Tanokura. 2004. Common mechanism of ligand recognition by group II/III
WW domains: redefining their functional classification. J. Biol. Chem. 279:
31833–31841.
Komuro, A., M. Saeki, and S. Kato. 1999. Association of two nuclear proteins, Npw38 and NpwBP, via the interaction between the WW domain and
a novel proline-rich motif containing glycine and arginine. J. Biol. Chem.
274:36513–36519.
Kovar, H. 2003. Ewing tumor biology: perspectives for innovative treatment
approaches. Adv. Exp. Med. Biol. 532:27–37.
Krebs, D. L., Y. Yang, M. Dang, J. Haussmann, and M. R. Gold. 1999. Rapid
and efficient retrovirus-mediated gene transfer into B cell lines. Methods
Cell Sci. 21:57–68.
Kuras, L., A. Rouillon, T. Lee, R. Barbey, M. Tyers, and D. Thomas. 2002.
Dual regulation of the Met4 transcription factor by ubiquitin-dependent
degradation and inhibition of promoter recruitment. Mol. Cell 10:69–80.
Lambrechts, A., A. V. Kwiatkowski, L. M. Lanier, J. E. Bear, J. Vandekerckhove, C. Ampe, and F. B. Gertler. 2000. cAMP-dependent protein
kinase phosphorylation of EVL, a Mena/VASP relative, regulates its interaction with actin and SH3 domains. J. Biol. Chem. 275:36143–36151.
7106
INGHAM ET AL.
45. Lin, K. T., R. M. Lu, and W. Y. Tarn. 2004. The WW domain-containing
proteins interact with the early spliceosome and participate in pre-mRNA
splicing in vivo. Mol. Cell. Biol. 24:9176–9185.
46. Liou, Y. C., A. Sun, A. Ryo, X. Z. Zhou, Z. X. Yu, H. K. Huang, T. Uchida,
R. Bronson, G. Bing, X. Li, T. Hunter, and K. P. Lu. 2003. Role of the prolyl
isomerase Pin1 in protecting against age-dependent neurodegeneration. Nature 424:556–561.
47. Lott, J. S., S. J. Coddington-Lawson, P. H. Teesdale-Spittle, and F. J.
McDonald. 2002. A single WW domain is the predominant mediator of the
interaction between the human ubiquitin-protein ligase Nedd4 and the human epithelial sodium channel. Biochem. J. 361:481–488.
48. Lu, P. J., G. G. Wulf, X. Z. Zhou, P. Davies, and K. P. Lu. 1999. The prolyl
isomerase Pin1 restores the function of Alzheimer-associated phosphorylated tau protein. Nature 399:784–788.
49. Lu, P. J., X. Z. Zhou, M. Shen, and K. P. Lu. 1999. Function of WW domains
as phosphoserine- or phosphothreonine-binding modules. Science 283:1325–
1328.
50. Macias, M. J., M. Hyvonen, E. Baraldi, J. Schultz, M. Sudol, M. Saraste, and
H. Oschkinat. 1996. Structure of the WW domain of a kinase-associated
protein complexed with a proline-rich peptide. Nature 382:646–649.
51. Marchler-Bauer, A., J. B. Anderson, P. F. D. Cherukuri, C. DeWeese-Scott,
L. Y. Geer, M. Gwadz, S. He, D. I. Hurwitz, J. D. Jackson, Z. Ke, C. J.
Lanczycki, C. A. Liebert, C. Liu, F. Lu, G. H. Marchler, M. Mullokandov,
B. A. Shoemaker, V. Simonyan, J. S. Song, P. A. Thiessen, R. A. Yamashita,
J. J. Yin, D. Zhang, and S. H. Bryant. 2005. CDD: a Conserved Domain
Database for protein classification. Nucleic Acids Res. 33:D192–D196.
52. Mizutani, K., S. Suetsugu, and T. Takenawa. 2004. FBP11 regulates nuclear
localization of N-WASP and inhibits N-WASP-dependent microspike formation. Biochem. Biophys. Res. Commun. 313:468–474.
53. Murillas, R., K. S. Simms, S. Hatakeyama, A. M. Weissman, and M. R.
Kuehn. 2002. Identification of developmentally expressed proteins that functionally interact with Nedd4 ubiquitin ligase. J. Biol. Chem. 277:2897–2907.
54. Otte, L., U. Wiedemann, B. Schlegel, J. R. Pires, M. Beyermann, P.
Schmieder, G. Krause, R. Volkmer-Engert, J. Schneider-Mergener, and H.
Oschkinat. 2003. WW domain sequence activity relationships identified using ligand recognition propensities of 42 WW domains. Protein Sci. 12:491–
500.
55. Passani, L. A., M. T. Bedford, P. W. Faber, K. M. McGinnis, A. H. Sharp,
J. F. Gusella, J. P. Vonsattel, and M. E. MacDonald. 2000. Huntingtin’s WW
domain partners in Huntington’s disease post-mortem brain fulfill genetic
criteria for direct involvement in Huntington’s disease pathogenesis. Hum.
Mol. Genet. 9:2175–2182.
56. Pawson, T., and P. Nash. 2003. Assembly of cell regulatory systems through
protein interaction domains. Science 300:445–452.
57. Perkins, D. N., D. J. Pappin, D. M. Creasy, and J. S. Cottrell. 1999. Probability-based protein identification by searching sequence databases using
mass spectrometry data. Electrophoresis 20:3551–3567.
58. Perry, W. L., C. M. Hustad, D. A. Swing, T. N. O’Sullivan, N. A. Jenkins, and
N. G. Copeland. 1998. The itchy locus encodes a novel ubiquitin protein
ligase that is disrupted in a18H mice. Nat. Genet. 18:143–146.
59. Ruegsegger, U., K. Beyer, and W. Keller. 1996. Purification and characterization of human cleavage factor Im involved in the 3⬘ end processing of
messenger RNA precursors. J. Biol. Chem. 271:6107–6113.
60. Ruegsegger, U., D. Blank, and W. Keller. 1998. Human pre-mRNA cleavage
MOL. CELL. BIOL.
61.
62.
63.
64.
65.
66.
67.
68.
69.
70.
71.
72.
73.
74.
75.
76.
factor Im is related to spliceosomal SR proteins and can be reconstituted in
vitro from recombinant subunits. Mol. Cell 1:243–253.
Sechi, A. S., and J. Wehland. 2004. ENA/VASP proteins: multifunctional
regulators of actin cytoskeleton dynamics. Front. Biosci. 9:1294–1310.
Shcherbik, N., Y. Kee, N. Lyon, J. M. Huibregtse, and D. S. Haines. 2004. A
single PXY motif located within the carboxyl terminus of Spt23p and Mga2p
mediates a physical and functional interaction with ubiquitin ligase Rsp5p.
J. Biol. Chem. 279:53892–53898.
Smith, M. J., S. Kulkarni, and T. Pawson. 2004. FF domains of CA150 bind
transcription and splicing factors through multiple weak interactions. Mol.
Cell. Biol. 24:9274–9285.
Snyder, P. M., D. R. Olson, F. J. McDonald, and D. B. Bucher. 2001.
Multiple WW domains, but not the C2 domain, are required for inhibition of
the epithelial Na⫹ channel by human Nedd4. J. Biol. Chem. 276:28321–
28326.
Sokal, R. R., and C. D. Michener. 1958. A statistical method for evaluating
systematic relationships. Univ. Kans. Sci. Bull. 38:1409–1438.
Staub, O., S. Dho, P. Henry, J. Correa, T. Ishikawa, J. McGlade, and D.
Rotin. 1996. WW domains of Nedd4 bind to the proline-rich PY motifs in the
epithelial Na⫹ channel deleted in Liddle’s syndrome. EMBO J. 15:2371–
2380.
Sudol, M., and T. Hunter. 2000. NeW wrinkles for an old domain. Cell
103:1001–1004.
Sudol, M., K. Sliwa, and T. Russo. 2001. Functions of WW domains in the
nucleus. FEBS Lett. 490:190–195.
Sune, C., T. Hayashi, Y. C. Liu, W. S. Lane, R. A. Young, and M. A.
Garcia-Blanco. 1997. CA150, a nuclear protein associated with the RNA
polymerase II holoenzyme, is involved in Tat-activated human immunodeficiency virus type 1 transcription. Mol. Cell. Biol. 17:6029–6039.
Tominaga, T., W. Meng, K. Togashi, H. Urano, A. S. Alberts, and M.
Tominaga. 2002. The Rho GTPase effector protein, mDia, inhibits the DNA
binding ability of the transcription factor Pax6 and changes the pattern of
neurite extension in cerebellar granule cells through its binding to Pax6.
J. Biol. Chem. 277:47686–47691.
Valpuesta, J. M., J. Martin-Benito, P. Gomez-Puertas, J. L. Carrascosa, and
K. R. Willison. 2002. Structure and function of a protein folding machine:
the eukaryotic cytosolic chaperonin CCT. FEBS Lett. 529:11–16.
Verdecia, M. A., M. E. Bowman, K. P. Lu, T. Hunter, and J. P. Noel. 2000.
Structural basis for phosphoserine-proline recognition by group IV WW
domains. Nat. Struct. Biol. 7:639–643.
Wallar, B. J., and A. S. Alberts. 2003. The formins: active scaffolds that
remodel the cytoskeleton. Trends Cell Biol. 13:435–446.
Winberg, G., L. Matskova, F. Chen, P. Plant, D. Rotin, G. Gish, R. Ingham,
I. Ernberg, and T. Pawson. 2000. Latent membrane protein 2A of EpsteinBarr virus binds WW domain E3 protein-ubiquitin ligases that ubiquitinate
B-cell tyrosine kinases. Mol. Cell. Biol. 20:8526–8535.
Wulf, G. M., A. Ryo, G. G. Wulf, S. W. Lee, T. Niu, V. Petkova, and K. P. Lu.
2001. Pin1 is overexpressed in breast cancer and cooperates with Ras signaling in increasing the transcriptional activity of c-Jun towards cyclin D1.
EMBO J. 20:3459–3472.
Yaffe, M. B., M. Schutkowski, M. Shen, X. Z. Zhou, P. T. Stukenberg, J. U.
Rahfeld, J. Xu, J. Kuang, M. W. Kirschner, G. Fischer, L. C. Cantley, and
K. P. Lu. 1997. Sequence-specific and phosphorylation-dependent proline
isomerization: a potential mitotic regulatory mechanism. Science 278:1957–
1960.
PIN1 WW
AIP4 WW2
NEDD4-1 WW2
WWOX WW1
FBP11 WW2
GAS7 WW
FBP11 WW1
FE65 WW
FBP21 WW2
CA150 WW2
NEDD4-1 WW2
WWOX WW1
PIN1 WW
AIP4 WW2
FBP11 WW2
GAS7 WW
FBP11 WW1
FE65 WW
FBP21 WW2
CA150 WW2
PIN1 WW
AIP4 WW2
NEDD4-1 WW2
WWOX WW1
FBP11 WW2
GAS7 WW
FBP11 WW1
FE65 WW
FBP21 WW2
CA150 WW2
NEDD4-1 WW2
WWOX WW1
PIN1 WW
AIP4 WW2
GAS7 WW
FBP11 WW1
FE65 WW
FBP21 WW2
FBP11 WW2
CA150 WW2
NEDD4-1 WW2
WWOX WW1
PIN1 WW
AIP4 WW2
GAS7 WW
FBP11 WW1
FE65 WW
FBP21 WW2
FBP11 WW2
CA150 WW2
NEDD4-1 WW2
WWOX WW1
PIN1 WW
AIP4 WW2
GAS7 WW
FBP11 WW1
FE65 WW
FBP21 WW2
FBP11 WW2
CA150 WW2
NEDD4-1 WW2
WWOX WW1
PIN1 WW
AIP4 WW2
GAS7 WW
FBP11 WW1
FE65 WW
FBP21 WW2
FBP11 WW2
CA150 WW2
Supplementary Material, Figure 1
HNRPK
CPSF6
HNRPU
SFPQ
SF3A1
SF3B1
SF3B3
SF1
DIAPH1
W AS
NONO
SF3A2
U2AF2
EVL
SF3A3
SF3B2
ABI1
CYFIP1
CYFIP2
W ASF2
PSPC1
BRD4
FMNL1
HEM1
HNRPF
APBB1IP
GAPD
RPLP0
NCKAP1
W ASL
YLPM1
DHX9
CCT2
CCT4
CCT7
TD-60
DHX15
RPL4
CRK7
NSEP1
DDX46
VASP
ATAD3A
W IRE
DIAPH2
DDX17
HNRPH1
ENAH
KHSRP
MGC20255
PQBP1
PTBP1
RBM17
PABP C1
FLJ12529
W ASPIP
SF3B4
KHDRBS1
CHERP
HTATSF1
W BP11
BUB3
DERPC
DDX23
DNM2
Group B
enriched
FUS
PRP3
PRPF4
SART1
SFRS15
SMARCA3
SNRP70
SNRPA
ZNF207
DDX3X
THRAP3
PRPF8
G22P1
M11 S1
U5-116KD
AP2A1
BCLAF1
C13orf8
CCNK
CDC2L1
CDC2L2
DDB1
DDX5
DKFZP434I11 6
ETV6
G3BP2
HADHA
G3BP
LOC133619
PKM2
PPA RBP
RBM21
REPS1
RNPS1
SFRS11
SRRM1
SRRM2
TBC1D4
TFG
TLE3
ASCC3L1
VCY2IP1
W RNIP1
CPSF5
POLR2A
POLR2B
SMARCC1
W BP2
C19orf2
FLJ13150
FLJ21908
HNRPL
HNRPUL1
POLR2C
POLR2E
ARID1A
CPSF1
AMOTL1
CAD
DHX30
EPRS
GRINL1A
NEDD4
RAPGEF2
POLR3A
RPS3A
RUVBL1
SFRS7
EW SR1
DAZAP1
ITCH
UBAP2L
SMARCC2
SMARCE1
SPG20
UBAP2
TCERG1
CPSF2
CPSF3
FIP1L1
KIAA0460
SPK
W DR33
Group C
enriched
Group A
enriched
PPxY
PPLP
PxPPxR
PPRxxP
PPPPP
PPGPPP PxxGMxPP
Supplementary Material
Supplementary Material, Table I
NEDD4-1 WW2-5’ GATCC GGA TTA CCA CCA GGT TGG GAA GAA AAA
CAA GAT GAA AGA GGA AGA TCA TAT TAT GTA GAT CAC AAT TCC AGA
ACG ACT ACT TGG ACA AAG CCC ACT TAA G
NEDD4-1 WW2-3’ AATTC TTA AGT GGG CTT TGT CCA AGT AGT CGT TCT
GGA ATT GTG ATC TAC ATA ATA TGA TCT TCC TCT TTC ATC TTG TTT TTC
TTC CCA ACC TGG TGG TAA TCC G
Smurf1WW1-5’
GATCC CTG CCC GAA GGC TAC GAA CAA AGA ACA
ACA GTC CAG GGC CAA GTT TAC TTT TTG CAT ACA CAG ACT GGA GTT
AGC ACG TGG CAC GAC CCC AGG TAA G
Smurf1 WW1-3’
AATTC TTA CCT GGG GTC GTG CCA CGT GCT AAC TCC
AGT CTG TGT ATG CAA AAA GTA AAC TTG GCC CTG GAC TGT TGT TCT
TTG TTC GTA GCC TTC GGG CAG G
CA150 WW1-5’
GATCC CCT ACG GAG GAG ATA TGG GTT GAA AAT AAA
ACT CCA GAT GGG AAG GTT TAT TAT TAT AAT GCT CGG ACA CGT GAA
TCT GCA TGG ACC AAG CCA GAT TAA G
CA150 WW1-3’
AATTC TTA ATC TGG CTT GGT CCA TGC AGA TTC ACG
TGT CCG AGC ATT ATA ATA ATA AAC CTT CCC ATC TGG AGT TTT ATT TTC
AAC CCA TAT CTC CTC CGT AGG G
CA150 WW2-5’
GATCC ACA GCA GTT TCT GAA TGG ACT GAA TAT AAA
ACA GCA GAT GGG AAG ACA TAT TAT TAT AAT AAT AGA ACA TTA GAA
TCA ACC TGG GAA AAA CCC CAA GAA TAA G
CA150 WW2-3’
AATTC TTA TTC TTG GGG TTT TTC CCA GGT TGA TTC
TAA TGT TCTATT ATT ATA ATA ATA TGT CTT CCC ATC TGC TGT TTT ATA
TTC AGT CCA TTC AGA AAC TGC TGT G
WWOX WW1-5’
GATCC GAG CTG CCT CCG GGC TGG GAG GAG AGA
ACC ACC AAG GAC GGC TGG GTT TAC TAC GCC AAT CAC ACC GAG GAG
AAG ACT CAG TGG GAA CAT CCA AAA TAA G
WWOX WW1-3’
AATTC TTA TTT TGG ATG TTC CCA CTG AGT CTT CTC
CTC GGT GTG ATT GGC GTA GTA AAC CCA GCC GTC CTT GGT GGT TCT
CTC CTC CCA GCC CGG AGG CAG CTC G
FE65 WW-5’
GATCC GAC CTG CCG GCT GGA TGG ATG AGG GTC CAG
GAC ACC TCA GGG ACC TAT TAC TGG CAC ATC CCA ACA GGG ACC ACC
CAG TGG GAA CCC CCC GGC TAA G
FE65 WW-3’
AATTC TTA GCC GGG GGG TTC CCA CTG GGT GGT CCC
TGT TGG GAT GTG CCA GTA ATA GGT CCC TGA GGT GTC CTG GAC CCT
CAT CCA TCC AGC CGG CAG GTC G
GAS7 WW-5’
GATCC ATC CTT CCA CCT GGC TGG CAG AGC TAC CTG
TCG CCT CAG GGC CGG CGG TAC TAT GTC AAC ACG ACC ACC AAT GAG
ACC ACC TGG GAA CGT CCC AGC TAA G
GAS7 WW-3’
AATTC TTA GCT GGG ACG TTC CCA GGT GGT CTC ATT
GGT GGT CGT GTT GAC ATA GTA CCG CCG GCC CTG AGG CGA CAG GTA
GCT CTG CCA GCC AGG TGG AAG GAT G
Pin1 WW-5’
GATCC AAG CTG CCG CCC GGC TGG GAG AAG CGC ATG
AGC CGC AGC TCA GGC CGA GTG TAC TAC TTC AAC CAC ATC ACT AAC
GCC AGC CAG TGG GAG CGG CCC AGC TAA G
Pin1 WW-3’
AATTC TTA GCT GGG CCG CTC CCA CTG GCT GGC GTT
AGT GAT GTG GTT GAA GTA GTA CAC TCG GCC TGA GCT GCG GCT CAT
GCG CTT CTC CCA GCC GGG CGG CAG CTT G
PLEKHA5 WW1-5’ GATCC TCC CTG CCC CGG TCC TGG ACT TAC GGG ATC
ACC AGG GGC GGC CGA GTC TTC TTC ATC AAC GAG GAG GCC AAG AGC
ACC ACC TGG CTG CAC CCC GTC ACC TAA G
PLEKHA5 WW1-3’ AATTC TTA GGT GAC GGG GTG CAG CCA GGT GGT GCT
CTT GGC CTC CTC GTT GAT GAA GAA GAC TCG GCC GCC CCT GGT GAT
CCC GTA AGT CCA GGA CCG GGG CAG GGA G
PQBP1 WW-5’
GATCC GGC CTA CCA CCA AGC TGG TAC AAG GTG TTC
GAC CCT TCC TGC GGG CTC CCT TAC TAC TGG AAT GCA GAC ACA GAC
CTT GTA TCC TGG CTC TCC CCA CAT GAC TAA G
PQBP1 WW-3’
AATTC TTA GTC ATG TGG GGA GAG CCA GGA TAC AAG
GTC TGT GTC TGC ATT CCA GTA GTA AGG GAG CCC GCA GGA AGG GTC
GAA CAC CTT GTA CCA GCT TGG TGG TAG GCC G
FBP11 WW1-5’
GATCC GGT GCA AAA TCA ATG TGG ACT GAA CAT AAA
TCA CCT GAT GGA AGG ACT TAC TAC TAC AAC ACT GAA ACC AAA CAG
TCT ACC TGG GAG AAA CCA GAT GAT TAA G
FBP11 WW1-3’
AATTC TTA ATC ATC TGG TTT CTC CCA GGT AGA CTG
TTT GGT TTC AGT GTT GTA GTA GTA AGT CCT TCC ATC AGG TGA TTT ATG
TTC AGT CCA CAT TGA TTT TGC ACC G
FBP11 WW2-5’
GATCC CTA TCT AAA TGC CCC TGG AAG GAA TAC AAA
TCA GAT TCT GGA AAG CCT TAC TAT TAT AAT TCT CAA ACA AAA GAA
TCT CGC TGG GCC AAA CCT AAA GAA TAA G
FBP11 WW2-3’
AATTC TTA TTC TTT AGG TTT GGC CCA GCG AGA TTC
TTT TGT TTG AGA ATT ATA ATA GTA AGG CTT TCC AGA ATC TGA TTT GTA
TTC CTT CCA GGG GCA TTT AGA TAG G
KIAA1052 WW-5’ GATCC CCA CTG CCT GGA GAG TGG AAA CCA TGC CAG
GAC ATC ACA GGT GAC ATT TAC TAT TTC AAC TTC GCC AAC GGG CAG
TCT ATG TGG GAC CAT CCA TGT GAC TAA G
KIAA1052 WW-3’ AATTC TTA GTC ACA TGG ATG GTC CCA CAT AGA CTG
CCC GTT GGC GAA GTT GAA ATA GTA AAT GTC ACC TGT GAT GTC CTG
GCA TGG TTT CCA CTC TCC AGG CAG TGG G
FBP21 WW2-5’
GATCC GCA GTG AAG ACC GTT TGG GTA GAA GGT TTA
AGT GAA GAT GGT TTT ACC TAT TAC TAT AAT ACA GAA ACA GGA GAA
TCC AGA TGG GAG AAA CCT GAT GAT TAA G
FBP21 WW2-3’
AATTC TTA ATC ATC AGG TTT CTC CCA TCT GGA TTC
TCC TGT TTC TGT ATT ATA GTA ATA GGT AAA ACC ATC TTC ACT TAA ACC
TTC TAC CCA AAC GGT CTT CAC TGC G
Supplementary Material, Table II
AIP4 WW2
Gene
Symbol
Alternate Name(s)
ARID1A
SF1
ITCH
B120:P270:BM029:BAF250:C1or
f4:SMARCF1
RMP:URI:NNX3:FLJ10575
EWS
RPB1:RPO2:POLR2:POLRA:RP
Bh1:RPOL2:RpIILS:hsRPB1:hRP
B220
RPB2:POL2RB:hsRPB2:hRPB14
0
RPB3:RPB31:hRPB33:hsRPB3
RPB5:XAP4:RPABC1:hRPB25:h
sRPB5
Rsc8:SRG3:SWI3:BAF155:CRA
CC1
Rsc8:BAF170:CRACC2
BAF57
CFIM25
CFIM:CFIM68:HPBRII-4:HPBRII7
MGC19907
UNKNOWN
hnRNP-L
E1BAP5:E1B-AP5:FLJ12944
MGC9315:FLJ39024
PAB1:PABP:PABP1:PABPC2:PA
BPL1
ZFM1:ZNF162:D11S636
AIF4:AIP4:NAPP1
SPG20
FLJ21908
UBAP2L
UBAP2
WBP2
SPARTIN:TAHCCP1:KIAA0610
UNKNOWN
KIAA0144; NICE-4
FLJ22435:KIAA1491:bA176F3.5
WBP-2:MGC18269
C19orf2
EWSR1
POLR2A
POLR2B
POLR2C
POLR2E
SMARCC1
SMARCC2
SMARCE1
CPSF5
CPSF6
DAZAP1
FLJ13150
HNRPL
HNRPUL1
FLJ12529
PABPC1
Gene
ID
Domains
Cellular Process
8289
BRIGHT
Transcription
8725
2130
5430
Prefoldin
RRM,ZnF_RBZ
RPOLA_N
Transcription
Transcription
Transcription
5431
none
Transcription
5432
5434
RPOLD
none
Transcription
Transcription
6599
SANT
Transcription
6601
6605
11051
11052
CHROMO,SANT
HMG
none
RRM
Transcription
Transcription
RNA processing
RNA processing
26528
79871
3191
11100
79869
26986
RRM,RRM
none
RRM,RRM,RRM
SAP,SPRY
RRM
4 RRM, PolyA
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
7536
83737
KH
C2,4 WW,
HECTc
MIT
2 TPR
UBA
UBA
none
RNA processing
Protein Transport
23111
79657
9898
55833
23558
Protein Transport
Unknown
Unknown
Unknown
Unknown
CA150 WW domain 2
Gene
Symbol
Alternate Name(s)
Gene
ID
Domains
Cellular Process
CPSF5
CPSF6
CFIM25
CFIM:CFIM68:HPBRII-4:HPBRII7
CSBP:TUNP:HNRNPK
SAF-A:U21.1:HNRNPU
P54:NMT55:NRB54:P54NRB
PSP1:FLJ10955
ZFM1:ZNF162:D11S636
PRP21:SAP114:SF3A120
11051
none
RRM
RNA processing
RNA processing
KH,KH,KH
SPRY
RRM,RRM
RRM,RRM
KH
SWAP,SWAP,UB
Q
TGFB
none
ARM
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
BASIC/PSP
RNA processing
none
RNA processing
RRM,RRM,BBC
RRM,RRM,RRM
BROMO,BROMO
SH3
none
none
FH2
FH2
WH1
FH2
RNA processing
RNA processing
Cell Cycle
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
WH1,PBD,WH2
WH2
none
Cytoskeleton
Cytoskeleton
Cytoskeleton
HNRPK
HNRPU
NONO
PSPC1
SF1
SF3A1
SF3A2
SF3A3
SF3B1
SF3B2
SF3B3
SFPQ
U2AF2
BRD4
ABI1
CYFIP1
CYFIP2
DIAPH1
DIAPH2
EVL
FMNL1
WAS
WASF2
WIRE
MIF:MIS:PRP11:SAP62:SF3a66
PRP9:SAP61:SF3a60
PRP10:SAP155:SF3B155:SF3b1
55
SF3b1:SAP145:SF3B145:SF3b1
50
RSE1:SAP130:SF3B130:SF3b13
0:KIAA0017
PSF:POMP100
U2AF65
CAP:MCAP:HUNKI
E3B1:NAP1:ABI-1
SHYC:KIAA0068:P140SRA-1
PIR121:PRO1331
DRF1:DFNA1:LFHL1:hDIA1
DIA:POF:DIA2:POF2
RNB6
FMNL:FHOD4:KW13:C17orf1:MGC1894:C17orf1B:
MGC21878
THC:IMD2:WASP
SCAR2:WAVE2:dJ393P12.2
FLJ23260:FLJ30495:DKFZp547
N082
11052
3190
3192
4841
55269
7536
10291
8175
10946
23451
10992
23450
6421
11338
23476
10006
23191
26999
1729
1730
51466
752
7454
10163
147179
FBP11 WW domain 1
Gene
Symbol
Alternate Name(s)
Gene
ID
Domains
Cellular Process
CRK7
NSEP1
CRKRS:KIAA0904
YB1:BP-8:CSDB:DBPB:YB1:YBX1:NSEP-1:MDR-NF1
MGC9936:FLJ25329:KIAA0801
DBP1:HRH2:DDX15:PRP43:PrP
p43p
HNRNPF:mcs94-1
P54:NMT55:NRB54:P54NRB
ZFM1:ZNF162:D11S636
PRP21:SAP114:SF3A120
51755
4904
Kinase
CSP
Transcription
Transcription
9879
1665
DEXDc,HELICc
DEXDc,HELICc
RNA processing
RNA processing
3185
4841
7536
10291
RNA processing
RNA processing
RNA processing
RNA processing
8175
10946
23451
RRM,RRM,RRM
RRM,RRM
KH
SWAP,SWAP,UB
Q
TGFB
none
ARM
10992
BASIC/PSP
RNA processing
23450
none
RNA processing
10262
RRM,RRM
RNA processing
6421
11338
6124
1729
51466
752
RRM,RRM,BBC
RRM,RRM,RRM
none
FH2
WH1
FH2
RNA processing
RNA processing
Translation
Cytoskeleton
Cytoskeleton
Cytoskeleton
10006
7408
7454
8976
7456
55210
3071
SH3
WH1
WH1,PBD,WH2
WH1,PBD,WH2
WH1,PBD,WH2
AAA
none
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Unknown
Unknown
DDX46
DHX15
HNRPF
NONO
SF1
SF3A1
SF3A2
SF3A3
SF3B1
SF3B2
SF3B3
SF3B4
SFPQ
U2AF2
RPL4
DIAPH1
EVL
FMNL1
ABI1
VASP
WAS
WASL
WASPIP
ATAD3A
HEM1
MIF:MIS:PRP11:SAP62:SF3a66
PRP9:SAP61:SF3a60
PRP10:SAP155:SF3B155:SF3b1
55
SF3b1:SAP145:SF3B145:SF3b1
50
RSE1:SAP130:SF3B130:SF3b13
0:KIAA0017
SAP49:SF3B49:SF3b49:MGC10
828
PSF:POMP100
U2AF65
HRPL4
DRF1:DFNA1:LFHL1:hDIA1
RNB6
FMNL:FHOD4:KW13:C17orf1:MGC1894:C17orf1B:
MGC21878
E3B1:NAP1:ABI-1
UNKNOWN
THC:IMD2:WASP
N-WASP:MGC48327
WIP:PRPL-2
FLJ10709
UNKNOWN
RNA processing
RNA processing
RNA processing
FBP11 WW domain 2
Gene
Symbol
Alternate Name(s)
CPSF6
DHX9
CFIM:CFIM68:HPBRII-4:HPBRII-7
LKP:RHA:DDX9:NDHII:NDH II
11052
1660
HNRPF
HNRPK
HNRPU
NONO
SF1
SF3A1
SF3A2
SF3B1
SF3B3
HNRNPF:mcs94-1
CSBP:TUNP:HNRNPK
SAF-A:U21.1:HNRNPU
P54:NMT55:NRB54:P54NRB
ZFM1:ZNF162:D11S636
PRP21:SAP114:SF3A120
MIF:MIS:PRP11:SAP62:SF3a66
PRP10:SAP155:SF3B155:SF3b155
RSE1:SAP130:SF3B130:SF3b130:KI
AA0017
PSF:POMP100
U2AF65
P0:L10E:RPP0:PRLP0
ML8:KU70:TLAA:D22S671:D22S731
KIAA1470:DKFZp762N0610
SHYC:KIAA0068:P140SRA-1
PIR121:PRO1331
DRF1:DFNA1:LFHL1:hDIA1
RNB6
FMNL:FHOD4:KW13:C17orf1:MGC1894:C17orf1B:MG
C21878
E3B1:NAP1:ABI-1
THC:IMD2:WASP
SCAR2:WAVE2:dJ393P12.2
N-WASP:MGC48327
WIP:PRPL-2
RIAM:INAG1
ZAP:ZAP3
UNKNOWN
G3PD:GAPDH
CCTB
SRB:Cctd
Ccth:Nip7-1
SFPQ
U2AF2
RPLP0
G22P1
TD-60
CYFIP1
CYFIP2
DIAPH1
EVL
FMNL1
ABI1
WAS
WASF2
WASL
WASPIP
APBB1IP
YLPM1
HEM1
GAPD
CCT2
CCT4
CCT7
Gene
ID
Domains
Cellular Process
RNA processing
RNA processing
3185
3190
3192
4841
7536
10291
8175
23451
23450
RRM
DSRM,DSRM,DEX
Dc,HELICc
RRM,RRM,RRM
KH,KH,KH
SPRY
RRM,RRM
KH
SWAP,SWAP,UBQ
TGFB
ARM
none
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
6421
11338
6175
2547
55920
23191
26999
1729
51466
752
RRM,RRM,BBC
RRM,RRM,RRM
none
Ku78
none
none
none
FH2
WH1
FH2
RNA processing
RNA processing
Translation
Cell Cycle
Cell Cycle
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
10006
7454
10163
8976
7456
54518
56252
3071
2597
10576
10575
10574
SH3
WH1,PBD,WH2
WH2
WH1,PBD,WH2
WH1,PBD,WH2
RA,PH
none
none
none
none
none
none
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Signal Transduction
Unknown
Unknown
Metabolism
Chaperone
Chaperone
Chaperone
FBP21 WW domain 2
Gene
Symbol
Alternate Name(s)
Gene
ID
Domains
Cellular Process
EWSR1
FUS
HTATSF1
SMARCA3
EWS
TLS:FUS1
TAT-SF1:dJ196E23.2
HLTF:ZBU1:HLTF1:RNF80:HIP11
6:SNF2L3:HIP116A
CA150:TAF2S
2130
2521
27336
6596
Transcription
Transcription
Transcription
UNKNOWN
CFIM:CFIM68:HPBRII-4:HPBRII-7
prp28:MGC8416:U5-100K
DBX:DDX3:HLP2:DDX14
DBP1:HRH2:DDX15:PRP43:PrPp
43p
CSBP:TUNP:HNRNPK
SAF-A:U21.1:HNRNPU
p62:Sam68
MGC9315:FLJ39024
P54:NMT55:NRB54:P54NRB
PAB1:PABP:PABP1:PABPC2:PA
BPL1
PRP3:RP18:HPRP3:Prp3p:HPRP
3P
PRP4:HPRP4:Prp4p:HPRP4P
7756
11052
9416
1654
1665
RRM,ZnF_RBZ
RRM,ZnF_RBZ
RRM,RRM
DEXDc,RING,HE
LICc
PQQ_DH, 3 WW
domains, 7 FF
domains
none
RRM
DEXDc,HELICc
DEXDc,HELICc
DEXDc,HELICc
3190
3192
10657
79869
4841
26986
KH,KH,KH
SPRY
KH
RRM
RRM,RRM
4 RRM, PolyA
9129
PWI
9128
WD40/SFM, 5
WD40
RRM,RRM
none
TCERG1
ZNF207
CPSF6
DDX23
DDX3X
DHX15
HNRPK
HNRPU
KHDRBS1
FLJ12529
NONO
PABPC1
PRP3
PRPF4
PSPC1
SART1
SF1
SF3A1
SF3A2
SF3A3
SF3B1
SF3B3
SF3B4
SFPQ
SFRS15
SNRP70
SNRPA
U5-116KD
WBP11
BUB3
DERPC
G22P1
PSP1:FLJ10955
Ara1:HOMS1:MGC2038:SART125
9
ZFM1:ZNF162:D11S636
PRP21:SAP114:SF3A120
MIF:MIS:PRP11:SAP62:SF3a66
PRP9:SAP61:SF3a60
PRP10:SAP155:SF3B155:SF3b15
5
RSE1:SAP130:SF3B130:SF3b130
:KIAA0017
SAP49:SF3B49:SF3b49:MGC108
28
PSF:POMP100
SRA4:KIAA1172:DKFZP434E098
RPU1:U1AP:U170K:U1RNP:RNP
U1Z
UNKNOWN
KIAA0031
NPWBP
BUB3L
CTF8:DERPC
ML8:KU70:TLAA:D22S671:D22S7
31
10915
Transcription
Transcription
Transcription
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
55269
9092
7536
10291
8175
10946
23451
KH
SWAP,SWAP,UB
Q
TGFB
none
ARM
23450
none
10262
RRM,RRM
6421
57466
6625
RRM,RRM,BBC
RPR,RRM
RRM
6626
9343
51729
9184
54921
2547
RRM,RRM
none
none
WD40,WD40
none
Ku78
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
Cell Cycle
Cell Cycle
Cell Cycle
DIAPH1
DNM2
WAS
WASPIP
CHERP
DRF1:DFNA1:LFHL1:hDIA1
DYNII
THC:IMD2:WASP
WIP:PRPL-2
DAN16:SCAF6
1729
1785
7454
7456
10523
M11S1
GPIAP1
4076
FH2
DYNc,PH,GED
WH1,PBD,WH2
WH1,PBD,WH2
SWAP,RPR,G_pa
tch
none
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Other
Unknown
FE65 WW domain
Gene
Alternate Name(s)
Symbol
HTATSF1
NSEP1
PQBP1
THRAP3
CPSF6
DDX17
DDX3X
DDX46
DHX15
DHX9
HNRPH1
HNRPK
KHDRBS1
KHSRP
RBM17
FLJ12529
NONO
PABPC1
PTBP1
SF1
SF3A1
SF3A2
SF3A3
SF3B1
SF3B2
SF3B3
SF3B4
SFPQ
U2AF2
WBP11
RPL4
CYFIP1
CYFIP2
DIAPH1
DIAPH2
ENAH
EVL
WIRE
TAT-SF1:dJ196E23.2
YB1:BP-8:CSDB:DBPB:YB1:YBX1:NSEP-1:MDR-NF1
SHS:MRX55:NPW38
TRAP150
CFIM:CFIM68:HPBRII-4:HPBRII-7
P72:RH70
DBX:DDX3:HLP2:DDX14
MGC9936:FLJ25329:KIAA0801
DBP1:HRH2:DDX15:PRP43:PrPp4
3p
LKP:RHA:DDX9:NDHII:NDH II
Gene
ID
Domains
Cellular Process
27336
4904
RRM,RRM
CSP
Transcription
Transcription
10084
9967
11052
10521
1654
9879
1665
none
none
RRM
DEXDc,HELICc
DEXDc,HELICc
DEXDc,HELICc
DEXDc,HELICc
Transcription
Transcription
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
1660
RNA processing
HNRPH:hnRNPH
CSBP:TUNP:HNRNPK
p62:Sam68
FBP2:KSRP:FUBP2
SPF45:MGC14439
MGC9315:FLJ39024
P54:NMT55:NRB54:P54NRB
PAB1:PABP:PABP1:PABPC2:PAB
PL1
PTB:PTB2:PTB3:PTB4:pPTB:HNR
PI:PTB-1:PTBT:HNRNPI:MGC8461:MGC10830
ZFM1:ZNF162:D11S636
PRP21:SAP114:SF3A120
3187
3190
10657
8570
84991
79869
4841
26986
DSRM,DSRM,DE
XDc,HELICc
RRM,RRM,RRM
KH,KH,KH
KH
4 KH domains
G_patch,RRM
RRM
RRM,RRM
4 RRM, PolyA
5725
4 RRM's
RNA processing
7536
10291
RNA processing
RNA processing
MIF:MIS:PRP11:SAP62:SF3a66
PRP9:SAP61:SF3a60
PRP10:SAP155:SF3B155:SF3b15
5
SF3b1:SAP145:SF3B145:SF3b15
0
RSE1:SAP130:SF3B130:SF3b130:
KIAA0017
SAP49:SF3B49:SF3b49:MGC1082
8
PSF:POMP100
U2AF65
NPWBP
HRPL4
SHYC:KIAA0068:P140SRA-1
PIR121:PRO1331
DRF1:DFNA1:LFHL1:hDIA1
DIA:POF:DIA2:POF2
mena:NDPP1:FLJ10773
RNB6
FLJ23260:FLJ30495:DKFZp547N0
82
8175
10946
23451
KH
SWAP,SWAP,UB
Q
TGFB
none
ARM
10992
BASIC/PSP
RNA processing
23450
none
RNA processing
10262
RRM,RRM
RNA processing
6421
11338
51729
6124
23191
26999
1729
1730
55740
51466
14717
9
RRM,RRM,BBC
RRM,RRM,RRM
none
none
none
none
FH2
FH2
WH1
WH1
none
RNA processing
RNA processing
RNA processing
Translation
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
ABI1
VASP
WAS
WASF2
WASL
WASPIP
CHERP
YLPM1
MGC20255
E3B1:NAP1:ABI-1
UNKNOWN
THC:IMD2:WASP
SCAR2:WAVE2:dJ393P12.2
N-WASP:MGC48327
WIP:PRPL-2
DAN16:SCAF6
10006
7408
7454
10163
8976
7456
10523
ZAP:ZAP3
UNKNOWN
56252
90324
SH3
WH1
WH1,PBD,WH2
WH2
WH1,PBD,WH2
WH1,PBD,WH2
SWAP,RPR,G_pat
ch
none
none
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Other
Unknown
Unknown
Gas7 WW Domain
Gene
Symbol
Alternate Name(s)
Gene
ID
Domains
Cellular Process
HNRPU
KHDRBS1
SF1
SF3A1
SAF-A:U21.1:HNRNPU
p62:Sam68
ZFM1:ZNF162:D11S636
PRP21:SAP114:SF3A120
3192
10657
7536
10291
RNA processing
RNA processing
RNA processing
RNA processing
SFPQ
RPLP0
ABI1
CYFIP1
CYFIP2
DIAPH1
FMNL1
PSF:POMP100
P0:L10E:RPP0:PRLP0
E3B1:NAP1:ABI-1
SHYC:KIAA0068:P140SRA-1
PIR121:PRO1331
DRF1:DFNA1:LFHL1:hDIA1
FMNL:FHOD4:KW13:C17orf1:MGC1894:C17orf1B:M
GC21878
THC:IMD2:WASP
SCAR2:WAVE2:dJ393P12.2
RIAM:INAG1
6421
6175
10006
23191
26999
1729
752
SPRY
KH
KH
SWAP,SWAP,UB
Q
RRM,RRM,BBC
none
SH3
none
none
FH2
FH2
7454
10163
54518
WH1,PBD,WH2
WH2
RA,PH
NAP1:NAP125:MGC8981:KIAA05
87
UNKNOWN
G3PD:GAPDH
10787
none
3071
2597
none
none
WAS
WASF2
APBB1IP
NCKAP1
HEM1
GAPD
RNA processing
Translation
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Cytoskeleton
Signal
Transduction
Signal
Transduction
Unknown
Metabolism
NEDD4-1 WW2
Gene
Symbol
Alternate Name(s)
Gene
ID
Domains
Cellular Process
C19orf2
NSEP1
8725
4904
Prefoldin
CSP
Transcription
Transcription
5430
RPOLA_N
Transcription
5431
none
Transcription
5432
5434
RPOLD
None
Transcription
Transcription
11128
8607
6599
RPOLA_N
AAA
SANT
Transcription
Transcription
Transcription
8289
BRIGHT
Transcription
9967
29894
11051
11052
none
none
none
RRM
Transcription
RNA processing
RNA processing
RNA processing
DDX3X
DHX30
FLJ13150
HNRPK
HNRPL
HNRPU
HNRPUL1
PRPF8
SFRS7
EPRS
RPS3A
DIAPH1
AMOTL1
RMP:URI:NNX3:FLJ10575
YB1:BP-8:CSDB:DBPB:YB1:YBX1:NSEP-1:MDR-NF1
RPB1:RPO2:POLR2:POLRA:RP
Bh1:RPOL2:RpIILS:hsRPB1:hRP
B220
RPB2:POL2RB:hsRPB2:hRPB14
0
RPB3:RPB31:hRPB33:hsRPB3
RPB5:XAP4:RPABC1:hRPB25:h
sRPB5
RPC1; RPC155
ECP54:TIP49:NMP238
Rsc8:SRG3:SWI3:BAF155:CRA
CC1
B120:P270:BM029:BAF250:C1or
f4:SMARCF1
TRAP150
CPSF160:HSU37012
CFIM25
CFIM:CFIM68:HPBRII-4:HPBRII7
DBX:DDX3:HLP2:DDX14
DDX30:FLJ11214:KIAA0890
UNKNOWN
CSBP:TUNP:HNRNPK
hnRNP-L
SAF-A:U21.1:HNRNPU
E1BAP5:E1B-AP5:FLJ12944
PRP8:RP13:HPRP8:PRPC8
9G8:HSSG1
PARS:QARS:QPRS
FTE1
DRF1:DFNA1:LFHL1:hDIA1
JEAP
DEXDc,HELICc
DEXDc,HELICc
none
KH,KH,KH
RRM,RRM,RRM
SPRY
SAP,SPRY
JAB_MPN
RRM
none
none
FH2
none
RAPGEF2
PDZGEF2:PDZ-GEF2:RA-GEF-2
1654
22907
79871
3190
3191
3192
11100
10594
6432
2058
6189
1729
15481
0
51735
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
Translation
Translation
Cytoskeleton
Signal
Transduction
Signal
Transduction
NEDD4
KIAA0093
4734
GRINL1A
FLJ21908
WBP2
CAD
DKFZp586F1918
UNKNOWN
WBP-2:MGC18269
UNKNOWN
81488
79657
23558
790
POLR2A
POLR2B
POLR2C
POLR2E
POLR3A
RUVBL1
SMARCC1
ARID1A
THRAP3
CPSF1
CPSF5
CPSF6
CNMP,RasGEFN
,PDZ,RA,RasGE
F
WW,WW,WW,W
W,HECTc
none
2 TPR
none
none
Protein Transport
Other
Unknown
Unknown
Metabolism
Pin1 WW
Gene
Symbol
Alternate Name(s)
Gene
ID
Domains
Cellular Process
BCLAF1
CCNK
CRK7
ETV6
PPARBP
BTF:KIAA0164
CPR4:MGC9113
CRKRS:KIAA0904
TEL
PBP:CRSP1:RB18A:TRIP2:CRSP
200:DRIP205:DRIP230:PPARGB
P:TRAP220
TRAP150
ESG:ESG3:HsT18976:KIAA1547
P72:RH70
DBX:DDX3:HLP2:DDX14
P68:p68:HLR1:G17P1:HUMP68
DBP1:HRH2:DDX15:PRP43:PrPp
43p
MSTP054
9774
8812
51755
2120
5469
none
Cyclin,Cyclin
Kinase
SAM_PNT,ETS
none
Transcription
Transcription
Transcription
Transcription
Transcription
9967
7090
10521
1654
1655
1665
none
WD40
DEXDc,HELICc
DEXDc,HELICc
DEXDc,HELICc
DEXDc,HELICc
Transcription
Transcription
RNA processing
RNA processing
RNA processing
RNA processing
25962
none
RNA processing
PAPD2:FLJ21850:FLJ22267:FLJ2
2347
UNKNOWN
HDH-VIII
HNRPH:hnRNPH
CSBP:TUNP:HNRNPK
SAF-A:U21.1:HNRNPU
KIAA0031
P54:NMT55:NRB54:P54NRB
PAB1:PABP:PABP1:PABPC2:PA
BPL1
PRP8:RP13:HPRP8:PRPC8
E5.1
PSF:POMP100
p54:dJ677H15.2
160KD:POP101:SRM160:MGC39488
300KD:SRL300:SRM300:SRm300:KI
AA0324:SIII
U2AF65
KIAA0788
64852
ZnF_U1,RRM
RNA processing
9908
10146
3187
3190
3192
9343
4841
26986
NTF2, RRM
NTF2,RRM
RRM,RRM,RRM
KH,KH,KH
SPRY
none
RRM,RRM
4 RRM, PolyA
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
10594
10921
6421
9295
10250
JAB_MPN
RRM
RRM,RRM,BBC
RRM
PWI
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
23524
none
RNA processing
11338
23020
RNA processing
RNA processing
HRPL4
DDBA:XAP1:XPCE:XPE-BF:UVDDB1
WHIP:FLJ22526:bA420G6.2
p58:PK58:CLK-1:p58CLK1:p58CDC2L1
CDC2L3:p58GTA:PITSLRE
ML8:KU70:TLAA:D22S671:D22S7
6124
1642
RRM,RRM,RRM
DEXDc,HELICc,S
EC63,DEXDc,HE
LICc,SEC63
none
none
Translation
DNA repair
56897
984
ZnF_Rad18,AAA
kinase
DNA repair
Cell Cycle
985
2547
kinase
Ku78
Cell Cycle
Cell Cycle
THRAP3
TLE3
DDX17
DDX3X
DDX5
DHX15
DKFZP434I
116
RBM21
G3BP2
G3BP
HNRPH1
HNRPK
HNRPU
U5-116KD
NONO
PABPC1
PRPF8
RNPS1
SFPQ
SFRS11
SRRM1
SRRM2
U2AF2
ASCC3L1
RPL4
DDB1
WRNIP1
CDC2L1
CDC2L2
G22P1
AP2A1
REPS1
C13orf8
31
C19ORF5:C19orf5:VCY2IP1:FLJ1
0669:VCY2IP-1
ADTAA:CLAPA1:AP2-ALPHA
RALBP1
FLJ90413:KIAA1802
LOC133619
UNKNOWN
M11S1
TBC1D4
TFG
HADHA
PKM2
GPIAP1
KIAA0603
TF6
GBP:MTPA:LCHAD
PK3:PKM:TCB:OIP3:CTHBP:THB
P1:MGC3932
VCY2IP1
55201
none
Cytoskeleton
160
85021
28348
9
13361
9
4076
9882
10342
3030
5315
none
EH
none
Protein Transport
Other
Unknown
none
Unknown
none
PTB,PTB,TBC
PB1
none
none
Unknown
Unknown
Unknown
Metabolism
Metabolism
WWOX WW domain 1
Gene
Symbol
POLR2A
POLR2B
SMARCC1
TCERG1
CPSF1
CPSF2
CPSF3
CPSF5
CPSF6
FIP1L1
FLJ12529
HNRPK
KIAA0460
PABPC1
Alternate Name(s)
Gene
ID
Domains
Cellular Process
RPB1:RPO2:POLR2:POLRA:RP
Bh1:RPOL2:RpIILS:hsRPB1:hRP
B220
RPB2:POL2RB:hsRPB2:hRPB14
0
Rsc8:SRG3:SWI3:BAF155:CRA
CC1
CA150:TAF2S
5430
RPOLA_N
Transcription
5431
none
Transcription
6599
SANT
Transcription
10915
Transcription
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
CPSF160:HSU37012
CPSF100:KIAA1367
CPSF:CPSF73
CFIM25
CFIM:CFIM68:HPBRII-4:HPBRII7
Rhe:DKFZp586K0717
MGC9315:FLJ39024
CSBP:TUNP:HNRNPK
UNKNOWN
PAB1:PABP:PABP1:PABPC2:PA
BPL1
PRP21:SAP114:SF3A120
29894
53981
51692
11051
11052
PQQ_DH, 3 WW
domains, 7 FF
domains
none
none
none
none
RRM
81608
79869
3190
23248
26986
none
RRM
KH,KH,KH
RPR
4 RRM, PolyA
RNA processing
RNA processing
RNA processing
RNA processing
RNA processing
10291
RNA processing
PRP10:SAP155:SF3B155:SF3b1
55
RSE1:SAP130:SF3B130:SF3b13
0:KIAA0017
PSF:POMP100
SPK:SYM
FLJ23260:FLJ30495:DKFZp547
N082
WDC146:FLJ11294
WBP-2:MGC18269
23451
SWAP,SWAP,UB
Q
ARM
23450
none
RNA processing
6421
8189
14717
9
55339
23558
RRM,RRM,BBC
none
none
RNA processing
RNA processing
Cytoskeleton
6 WD40 domains
none
Unknown
Unknown
SF3A1
SF3B1
SF3B3
SFPQ
SYMPK
WIRE
WDR33
WBP2
RNA processing
Supplementary Figure Legends
Supplementary Material, Table I - Sequence of WW oligonucleotides. The sequences
of oligonucleotides (5’ and 3’) used to generate the GST-WW domain fusion constructs
are shown. The resulting annealed products contain overhanging BamHI (5’) and EcoRI
(3’) ends (shown in bold) for direct cloning into BamHI/EcoRI cut pGEXKT (Amersham
Biosciences AB, Upsalla, Sweden).
Supplementary Material, Table II – Proteins identified precipitating with WW
domains. Proteins identified by mass spectrometry (indicated by gene symbol) that met
our “hit” criteria (see Materials and Methods) are shown for each WW domain bait. Also
included are alternate names, modular domains, and putative cellular function for each
precipitated protein.
Supplementary Material, Figure 1 – Superimposition of proline-based WW domainbinding motifs onto the clustered WW domain data. The individual proline-rich
motifs of Figure 4, as well as PPGPPP and PXXPMXPP motifs, are indicated in the
precipitated proteins from the clustered data of Figure 3. The coloring of the precipitated
proteins on the right indicates cellular function, which is defined in the Figure 3. Group
A (yellow), B (red), and C (green) WW domains are colored at the top, as in Figure 3.
Groups of precipitated proteins that were enriched in the pull-downs of specific WW
domain classes are indicated to the left.