Download msb201053-sup-0001 - Molecular Systems Biology

Supplementary information
Table of Contents
Supplementary Figures 1 – 7
Supplementary Tables I – XIII
Supplementary References
Supplementary MS data
Subject Categories: Plant biology, Metabolic and regulatory networks
Supplementary Figures
Figure 1. Graphical views created in Cytoscape representing the whole cell cycle interactome,
the subnetwork comprising the domain I1 interactions, and the network of the domain I2 set.
The legend corresponding to the networks is shown. N/A = not assessed, if no PCC could be
calculated. Pred. = predicted.
Figure 2. Graphical representation of the bait-bait interactions extracted from domain I1 (A)
and I2 (B) datasets, and the union of both, showing a single highly interconnected bait-bait
network (C). Networks were built in Cytoscape and visualized as described in the legend used
in Supplementary Figure 1.
Figure 3. Distribution of the number of cell cycle-related features (Supplementary Table VIII)
among the whole gene pool (genome-wide), a collection of 518 cell cycle genes
(Supplementary Table IX), and bait proteins (without baits used in reverse TAP experiments).
Figure 4. Diagrams showing the distribution of the GO similarity scores of gene pairs of
domain I1 (blue) and domain I2 (green). This is compared to the distribution of the average
GO similarity scores of 1000 corresponding networks containing an equal number of
randomly chosen gene pairs as I1 (red) or I2 (purple). Standard deviations are shown for the
average of the random networks. GO terms representing the biological process, cellular
component and molecular function were assessed. Proteins that could not be assigned to a
specific gene locus were discarded from the analysis.
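The comparison with random networks described for Supplementary Figure 4 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names, the binning of scores into ten equal-width bins, the requirement that scores lie in [0, 1], and the uniform sampling of gene pairs from a pool. The GO similarity measure itself is not specified in this legend, so it is passed in as a placeholder function.

```python
import random
import statistics

def binned_fractions(scores, n_bins=10):
    """Fraction of scores falling in each equal-width bin of [0, 1] (assumed range)."""
    counts = [0] * n_bins
    for s in scores:
        idx = min(int(s * n_bins), n_bins - 1)  # a score of exactly 1.0 goes in the last bin
        counts[idx] += 1
    return [c / len(scores) for c in counts]

def random_network_profile(gene_pool, n_pairs, similarity,
                           n_networks=1000, n_bins=10, seed=0):
    """Average binned score distribution over random networks, with standard deviations.

    gene_pool  : list of gene identifiers to draw random pairs from
    n_pairs    : number of gene pairs per random network (same as the real network)
    similarity : placeholder function (gene_a, gene_b) -> score in [0, 1]
    n_networks : number of random networks; must be at least 2 for a standard deviation
    """
    rng = random.Random(seed)
    profiles = []
    for _ in range(n_networks):
        pairs = [tuple(rng.sample(gene_pool, 2)) for _ in range(n_pairs)]
        scores = [similarity(a, b) for a, b in pairs]
        profiles.append(binned_fractions(scores, n_bins))
    # Transpose so that each column holds one bin across all random networks.
    means = [statistics.mean(col) for col in zip(*profiles)]
    sds = [statistics.stdev(col) for col in zip(*profiles)]
    return means, sds
```

The returned per-bin means and standard deviations correspond to the kind of averaged distribution with error bars that the legend describes; the observed I1 or I2 distribution would be binned the same way with `binned_fractions` for comparison.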
[Figure 5 image. Panels: A, B, C. Rows: AT1G01880 X AT1G15570; AT5G02530 X AT4G11920; Negative control 1]
Figure 5. Split luciferase analysis of AT1G01880 (DNA repair protein) X AT1G15570
(CYCA2;3), AT5G02530 (RNA and export factor-binding protein) X AT4G11920
(CCS52A2), and AT5G06680 (spindle pole body component 98) X AT3G12280 (RBR)
(Negative control 1) interactions. The different pairs of split-luciferase constructs were
transiently expressed in Arabidopsis cotyledons, and light emission was measured directly from
the seedlings. (A) Seedlings were infiltrated with split-luciferase vectors. (B) Light emission
signal as measured by the camera. (C) Light emission in false color was overlaid on seedlings.
Bar, 10 mm.
Figure 6. Subnetwork around genes with an E2F motif in their promoter. The network was
built in Cytoscape and visualized as described in the legend used in Supplementary Figure 1.
Figure 7. Subnetwork around genes with an MSA motif in their promoter. The network was
built in Cytoscape and visualized as described in the legend used in Supplementary Figure 1.
Supplementary Tables
Supplementary Table I : Comparison of the number of basic cell cycle regulators.
The numbers of five different types of cell cycle regulators were counted and compared between
Arabidopsis thaliana (Capron et al, 2003; Menges et al, 2005; Peres et al, 2007),
Saccharomyces cerevisiae and Homo sapiens (Morgan, 2008a; Morgan, 2008b). For every
type of regulator, the number of regulators present in a given species is shown at the end of
each box.
Regulator type | S. cerevisiae | H. sapiens | A. thaliana
Cyclin-dependent kinases | Cdk1 (cdc28) (1) | Cdk1, 2, 4, 6 (4) | CDKA;1, CDKB1;1-2, CDKB2;1-2 (5)
Cyclins | Cln1, 2, 3; Clb1, 2, 3, 4, 5, 6 (9) | Cyclin D1-3, Cyclin E (4); Cyclin A1-2, Cyclin B1-2 (4) | 10 D-Type cyclins (10); 10 A-Type cyclins, 11 B-Type cyclins (21)
CDK-activating kinases | Cak1 (1) | Cdk7 (+ cyclin H) (2) | CDKD;1-3 (+ CYCH;1), CDKF;1 (5)
APC activators | Cdc20, Cdh1, Ama1 (3) | Cdc20, Cdh1 (2) | Cdc20;1-6, CCS52A1-2,B (9)
CDK inhibitor proteins | Sic1 (1) | Cip1, Kip1-2 (3); Ink4a-d (4) | KRP1-7 (7); SIM, SMR1-13 a (14)
Total | 15 | 23 | 71
a The family of SIM-related proteins (SIM, SMR1-5) was recently extended with 8 additional
members (SMR6-13) that were not described before (own unpublished data).
Supplementary Table II: Core cell cycle gene list of Arabidopsis
Overview of core cell cycle genes identified in Arabidopsis thaliana, augmented with the SIM-related
genes, mitotic checkpoint homologs including proteins of the anaphase-promoting complex
(APC), and genes involved in DNA replication or DNA repair, as described in the introduction. The
last column specifies to which cell cycle category a gene belongs.
Supplementary Table III: Overview of bait proteins used to build the interactome.
The name of the bait is shown together with its accession number from the TAIR database
(Locus). In the column category, ‘Reverse’ refers to the six baits that were chosen for reverse
TAP experiments. Topology = C- or N-terminal tag fusions. Tag refers to the applied tag
being either the traditional TAP tag developed for Saccharomyces cerevisiae (Rigaut et al,
1999), or the tandem affinity tag GS (Van Leene et al, 2008). Expression was indicated as (+)
if the TAP fusion protein could be detected in transgenic cell cultures by western blot analysis.
The total number of purifications performed per bait is shown, with a minimum of 2 for each.
Supplementary Table IV: List of non-specific and background interactors.
These were determined by control TAP experiments such as TAP [7] or GS [3] purifications
on mock cultures, purifications from cultures expressing TAP fusions of heterologous GFP
[7], RFP [2] or β-glucuronidase [5], or GS fusions of heterologous GFP [8] or
β-glucuronidase [4]. The numbers between brackets represent the number of experiments
performed. Ribosomal proteins, actins and tubulins identified in these control experiments
were not included in the list; instead, their whole protein families were treated as non-specific
interactors and background.
Supplementary Table V: Node attributes used to generate the network in Cytoscape.
The node attribute file contains the AGI accession numbers (Locus), protein names
(Description), the TAIR8.0 descriptions, localization data downloaded from the SUB-cellular
location database for Arabidopsis (SUBA) (Heazlewood et al, 2007), information about the
periodicity of the corresponding gene (Periodic) and the cell cycle phase during which the
transcript peaks (Phase). Furthermore, data is integrated about the presence of E2F or MSA
motifs in the promoters of the corresponding gene, about the presence of CDK
phosphorylation sites in the protein sequence, and finally one can see if a protein was used as
bait or if it was predicted as a new cell cycle protein (Type). For the localization data obtained
from SUBA, we ordered the different data sources as a kind of evidence score: (1) for GFP, (2)
for MS, (3) for SwissProt, and (4) for AmiGO. Each gene was annotated by taking these
scores into account: if an annotation from source (1) was available for a gene, that annotation
was used; otherwise, we moved down the list until an annotation was found.
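The evidence-ordered choice of a localization annotation can be sketched as a simple priority lookup. This is an illustrative sketch only: the function name, the dictionary layout, and the example localizations are assumptions, and only the source order (GFP, MS, SwissProt, AmiGO) is taken from the description above.

```python
# Data sources in decreasing order of preference, as listed in the table description.
EVIDENCE_ORDER = ["GFP", "MS", "SwissProt", "AmiGO"]

def pick_localization(annotations):
    """Return (source, localization) from the highest-ranked source that has an annotation.

    annotations: dict mapping a source name to a localization string
                 (hypothetical layout; sources without data are absent or empty).
    Returns None when no source provides an annotation.
    """
    for source in EVIDENCE_ORDER:
        localization = annotations.get(source)
        if localization:
            return source, localization
    return None

# Hypothetical example: no GFP data, so the MS annotation is chosen over AmiGO.
example = pick_localization({"MS": "nucleus", "AmiGO": "cytosol"})
```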
Supplementary Table VI: Edge attributes used to generate the network in Cytoscape.
The edge attribute file lists all identified interactions, together with their AGI accession
numbers, protein names, the degree of co-expression correlation (PCC) (a value of -2 means
that the PCC could not be calculated) and the GO similarity scores (NaN = not assessed) for
biological process, molecular function and cellular component. Furthermore, it contains
information about whether or not an interaction was found in both directions (Reciprocal), or
if the interaction was confirmed and belongs to domain I1 (Confirmed). Finally, the presence
of the interaction in one of the 6 interaction databases (Arabidopsis Reactome, AtPID, BAR,
IntAct, Reactome, TAIR) can be consulted.
Supplementary Table VII: Overview of microarray experiments used to calculate the
transcript PCCs.
List of the experiments that make up an Arabidopsis ATH1 microarray compendium of 518
experiments focused on plant growth and development. This compendium was used to calculate
the transcript Pearson correlation coefficients.
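As an illustration of the co-expression measure, the sketch below computes a Pearson correlation coefficient between two expression profiles and returns -2 when no coefficient can be calculated, mirroring the placeholder value used in Supplementary Table VI. The function name and the conditions treated as "cannot be calculated" (profiles of unequal length, fewer than two values, or zero variance) are assumptions; the supplementary text does not state why a PCC might be unavailable.

```python
import math

PCC_NOT_CALCULATED = -2.0  # placeholder value reported in Supplementary Table VI

def pearson(x, y):
    """Pearson correlation of two equally long expression profiles.

    Returns PCC_NOT_CALCULATED when the coefficient is undefined
    (assumed cases: unequal lengths, fewer than 2 values, or a constant profile).
    """
    n = len(x)
    if n != len(y) or n < 2:
        return PCC_NOT_CALCULATED
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return PCC_NOT_CALCULATED
    return sxy / math.sqrt(sxx * syy)
```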
Supplementary Table VIII: Overview of cell cycle-related features used in the
computational analysis to search for new cell cycle proteins.
Cell cycle feature | Category | # of genes | References
Periodicity | Gene expression | 1258 | (Jensen et al, 2006; Menges et al, 2003)
MSA-like | Promoter motif | 2295 | (Vandepoele et al, 2006)
E2Fa-like | Promoter motif | 1809 | (Vandepoele et al, 2006)
E2F10SPCNA | Promoter motif | 2221 | (Vandepoele et al, 2006)
OS_motifsIandIIa | Promoter motif | 2310 | (Vandepoele et al, 2006)
UP1ATMSD | Promoter motif | 3738 | (Vandepoele et al, 2006)
wrrmGCGn | Promoter motif | 2179 | (Vandepoele et al, 2006)
CDK consensus site | Phosphorylation motif | 6321 | (De Veylder et al, 1997)
[IM]R-tail | Protein sequence motif | 116 | (Hayes et al, 2006; Vodermaier et al, 2003)
PEST-sequence | Destruction motif | 2719 | (Rechsteiner & Rogers, 1996)
D-box | Destruction motif | 2369 | (Capron et al, 2003)
KEN-box | Destruction motif | 410 | (Capron et al, 2003)
GxEN-box | Destruction motif | 300 | (Castro et al, 2003)
A-box | Destruction motif | 1779 | (Littlepage & Ruderman, 2002)
Supplementary Table IX: Collection of known cell cycle genes.
Collection of 518 genes annotated as cell cycle by gene ontology, including all genes
described as core cell cycle genes or related to cell cycle (Capron et al, 2003; Menges et al,
2005; Shultz et al, 2007; Vandepoele et al, 2002). The following GO terms were used to build
the collection: cell cycle (GO:0007049), cell division (GO:0051301), cell proliferation
(GO:0008283), chromosome segregation (GO:0007059), DNA replication (GO:0006260),
DNA replication and chromosome cycle (GO:0000067). For every gene, the number of cell
cycle-related features (Supplementary Table VIII) present is shown. This collection was
subtracted from the prey lists to find novel cell cycle proteins.
Supplementary Table X: New candidate cell cycle proteins.
A) Overview of 40 new candidate cell cycle proteins identified in the domain I1 dataset
containing at least 2 of the cell cycle-related features listed in Supplementary Table VIII. The
number of identified features per protein is shown.
B) Overview of 83 new candidate cell cycle proteins identified in the domain I2 dataset
containing at least 2 of the cell cycle-related features listed in Supplementary Table VIII. The
number of identified features per protein is shown.
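The selection of candidates described in Supplementary Tables IX and X (remove known cell cycle genes from the prey list, then keep preys with at least 2 cell cycle-related features) can be sketched as follows. The function name, argument layout, and the gene identifiers in the example are hypothetical; only the subtraction step and the threshold of 2 features come from the text.

```python
def select_candidates(preys, known_cell_cycle_genes, feature_counts, min_features=2):
    """Return preys that are not known cell cycle genes and carry enough features.

    preys                  : iterable of prey gene identifiers from one dataset (I1 or I2)
    known_cell_cycle_genes : iterable of genes in the known collection (Table IX)
    feature_counts         : dict gene -> number of cell cycle-related features (Table VIII)
    min_features           : minimum number of features required (2 in the text)
    """
    novel = set(preys) - set(known_cell_cycle_genes)
    return sorted(g for g in novel if feature_counts.get(g, 0) >= min_features)

# Hypothetical identifiers, for illustration only.
candidates = select_candidates(
    preys=["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    known_cell_cycle_genes=["GENE_A"],
    feature_counts={"GENE_A": 5, "GENE_B": 3, "GENE_C": 1, "GENE_D": 2},
)
```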
Supplementary Table XI: Comparison of overrepresented Biological Processes among
the preys of domain I1 (A) and I2 (B) as determined with the Cytoscape plugin BiNGO.
Search parameters among the preys of the domain I1 and I2 datasets:
File created with BiNGO (c) on 22-okt-2008 at 16:17:17
ontology: process
curator: GO
Selected ontology file : BiNGO.jar!/GO_Biological_Process
Selected annotation file : gene_association_2008okt04.tair
Overrepresentation
Selected statistical test : Hypergeometric test
Selected correction : Benjamini & Hochberg False Discovery Rate (FDR) correction
Selected significance level : 0.05
Testing option : Test cluster versus whole annotation
Number of annotated genes in selection : 137 for domain I1 and 197 for domain I2
Number of annotated genes in network/whole annotation : 25557
The result files of the BiNGO analysis were parsed into one table. The table shows the GO term
and the corresponding corrected p-value for the domain I1 dataset (A) and the domain I2
dataset (B); the different ranges of p-values are highlighted in different colors.
GO terms that are absent from one of these sets are marked with >0.05.
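For readers unfamiliar with the two statistical settings listed above, the sketch below shows a hypergeometric overrepresentation test followed by a Benjamini & Hochberg correction. It is a generic illustration written with the Python standard library, not BiNGO's own code; the function and parameter names are assumptions.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    k : genes in the selection annotated with the GO term
    n : annotated genes in the selection (e.g. 137 for I1, 197 for I2)
    K : genes in the whole annotation carrying the GO term
    N : annotated genes in the whole annotation (e.g. 25557)
    """
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total

def benjamini_hochberg(pvalues):
    """Benjamini & Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A GO term would then be reported as overrepresented when its adjusted p-value falls below the selected significance level of 0.05.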
Supplementary Table XII: Overview of tested protein pairs by the split-luciferase assay
Overview of all 17 protein pairs tested by the transient split-luciferase assay, including 3
negative controls (Untr., Neg. 1 and Neg. 2). The column ‘Type’ shows if an interaction
discovered by the TAP analysis belongs to domain I1 or domain I2, or if it was used as a
negative control. The TAIR accessions of the corresponding genes are shown together with
the protein name. Proteins in column ‘Protein A’ were fused to the amino-terminal moiety of
the Firefly luciferase, while proteins in column ‘Protein B’ were fused to the carboxy-terminal
moiety of the luciferase. Fusions to the luciferase moieties were made either to the
amino-terminus of both tested proteins (N) or to the carboxy-terminus of both proteins (C). The
mean value of the relative light emission from at least two experiments per tested interaction
is shown together with the standard errors. The number of replicates is shown, taking into
account that only experiments recorded with an integration of 2000 images are shown. The mean
value was set to zero when net LUC activity was negative, meaning that the background value
was higher than the total LUC activity. An interaction was considered positive (strong or
weak) when the net relative LUC activity was above an arbitrarily chosen threshold of 4500,
which is 9-fold higher than the signal from untransformed seedlings.
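The scoring rule described above (mean of replicates with standard error, negative means set to zero, positive call above 4500) can be written as a short sketch. The function name, the returned dictionary, and the example readings are hypothetical; the threshold value and the zeroing rule are taken from the text.

```python
import statistics

LUC_THRESHOLD = 4500  # arbitrarily chosen threshold stated in the table description

def score_interaction(net_luc_values, threshold=LUC_THRESHOLD):
    """Summarize replicate net LUC activities for one tested protein pair.

    net_luc_values: list of at least two net relative LUC activities
                    (total minus background), one per experiment.
    """
    mean = statistics.mean(net_luc_values)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(net_luc_values) / len(net_luc_values) ** 0.5
    if mean < 0:
        mean = 0.0  # background exceeded the total LUC activity
    return {"mean": mean, "sem": sem, "positive": mean > threshold}

# Hypothetical replicate readings, for illustration only.
strong = score_interaction([5000, 6000])
below_background = score_interaction([-100, -300])
```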
Supplementary Table XIII: List of CDK/cyclin complexes and their regulators extracted
from the interactome.
The CDK/cyclin complexes were determined based on the interactions found between a CDK
and a cyclin. MAT1 was added as canonical interactor to both CDKD/cyclin complexes.
Interacting CKS scaffolding proteins and negative regulators are based on bait-prey
interactions between a cyclin and a CKS, a cyclin and a KRP, or a cyclin and a SIM/SMR.
(SIM/SMR1,2) indicates possible regulation of the complex by one or more of these inhibitors,
based on their association with CDKB1;1. Complexes are ordered according to the time of
action of the cyclin based on its transcript peak level (Menges et al, 2005). SMR6 =
AT5G40460, SMR8 = AT1G10690, SMR11 = AT2G28330. The table is further visualized in
Figure 4.
Complex (CDK) | Complex (cyclin) | CKS interactors | Negative regulators | Time of action
CDKA;1 | CYCD2;1 | CKS1/CKS2 | SMR3/SMR4/SMR6/SMR8; KRP2/KRP3/KRP4/KRP5/KRP6/KRP7 | Constant
CDKA;1 | CYCD3;1 | CKS2 | SMR4; KRP6 | G1/S, M
CDKA;1 | CYCD3;2 | CKS1/CKS2 | - | Constant
CDKA;1 | CYCD3;3 | CKS2 | SMR6 | Early G1
CDKA;1 | CYCD4;1 | CKS2 | SMR8; KRP5/KRP6/KRP7 | G1/S
CDKA;1 | CYCD4;2 | CKS1/CKS2 | KRP2/KRP3/KRP4/KRP5/KRP6/KRP7 | G1/S
CDKA;1 | CYCD5;1 | CKS1/CKS2 | SMR6/SMR8; KRP2/KRP7 | Early G1
CDKA;1 | CYCD6;1 | CKS1 | KRP3/KRP4/KRP7 | G1/S
CDKA;1 | CYCD7;1 | CKS1/CKS2 | SMR4/SMR6 | G1/S
CDKB1;1 | CYCA2;1 | - | (SIM/SMR1/SMR2) | G2/M
CDKB1;1 | CYCA2;3 | CKS1 | (SIM/SMR1/SMR2) | G2/M
CDKA;1 | CYCA3;1 | CKS2 | - | S
CDKA;1 | CYCA3;3 | CKS1/CKS2 | - | Constant
CDKA;1 | CYCA3;4 | CKS1/CKS2 | - | S
CDKB2;1
CYCB1;1
G2/M
CDKB2;2
CYCB1;1
G2/M
CDKB2;2
CYCB1;2
CKS2
CDKB1;1
CYCB2;2
CKS1/CKS2
CDKB1;2
CYCB2;2
CKS1/CKS2
CDKB1;1
CYCB2;4
CKS1/CKS2
CDKB1;1
CYCB3;1
M
(SIM/SMR1/SMR2)
G2/M
G2/M
(SIM/SMR1/SMR2)
G2/M
SMR11
G2/M
(SIM/SMR1/SMR2)
G2/M
CDKC;1 | CYCT1;3 | - | - | Constant
CDKC;2 | CYCT1;3 | - | - | Constant
CDKC;1 | CYCH;1 | - | - | Constant
CDKC;2 | CYCH;1 | - | - | Constant
CDKD;2 | CYCH;1/MAT1 | - | - | Constant
CDKD;3 | CYCH;1/MAT1 | - | - | Constant
CDKG;1 | CYCL1 | - | - | Constant, peak at G0/G1
CDKG;2 | CYCL1 | - | - | Constant
Supplementary References
Capron A, Ökrész L, Genschik P (2003) First glance at the plant APC/C, a highly conserved
ubiquitin-protein ligase. Trends Plant Sci 8: 83-89
Castro A, Vigneron S, Bernis C, Labbe JC, Lorca T (2003) Xkid is degraded in a D-box, KEN-box,
and A-box-independent pathway. Mol Cell Biol 23: 4126-4138
De Veylder L, Segers G, Glab N, Van Montagu M, Inzé D (1997) Identification of proteins interacting
with the Arabidopsis Cdc2aAt protein. J Exp Bot 48: 2113-2114
Hayes MJ, Kimata Y, Wattam SL, Lindon C, Mao G, Yamano H, Fry AM (2006) Early mitotic
degradation of Nek2A depends on Cdc20-independent interaction with the APC/C. Nat Cell
Biol 8: 607-614
Heazlewood JL, Verboom RE, Tonti-Filippini J, Small I, Millar AH (2007) SUBA: the Arabidopsis
Subcellular Database. Nucleic Acids Res 35: D213-D218
Jensen LJ, Jensen TS, de Lichtenberg U, Brunak S, Bork P (2006) Co-evolution of transcriptional and
post-translational cell-cycle regulation. Nature 443: 594-597
Littlepage LE, Ruderman JV (2002) Identification of a new APC/C recognition domain, the A box,
which is required for the Cdh1-dependent destruction of the kinase Aurora-A during mitotic
exit. Genes Dev 16: 2274-2285
Menges M, de Jager SM, Gruissem W, Murray JAH (2005) Global analysis of the core cell cycle
regulators of Arabidopsis identifies novel genes, reveals multiple and highly specific profiles
of expression and provides a coherent model for plant cell cycle control. Plant J 41: 546-566
Menges M, Hennig L, Gruissem W, Murray JAH (2003) Genome-wide gene expression in an
Arabidopsis cell suspension. Plant Mol Biol 53: 423-442
Morgan DO (2008a) SnapShot: cell-cycle regulators I. Cell 135: 764-764 E761
Morgan DO (2008b) SnapShot: Cell-cycle regulators II. Cell 135: 974-974 E971
Peres A, Churchman ML, Hariharan S, Himanen K, Verkest A, Vandepoele K, Magyar Z, Hatzfeld Y,
Van Der Schueren E, Beemster GTS, Frankard V, Larkin JC, Inzé D, De Veylder L (2007)
Novel plant-specific cyclin-dependent kinase inhibitors induced by biotic and abiotic stresses.
J Biol Chem 282: 25588-25596
Rechsteiner M, Rogers SW (1996) PEST sequences and regulation by proteolysis. Trends Biochem Sci
21: 267-271
Rigaut G, Shevchenko A, Rutz B, Wilm M, Mann M, Séraphin B (1999) A generic protein purification
method for protein complex characterization and proteome exploration. Nature Biotechnol 17:
1030-1032
Shultz RW, Tatineni VM, Hanley-Bowdoin L, Thompson WF (2007) Genome-wide analysis of the
core DNA replication machinery in the higher plants Arabidopsis and rice. Plant Physiol 144:
1697-1714
Van Leene J, Witters E, Inzé D, De Jaeger G (2008) Boosting tandem affinity purification of plant
protein complexes. Trends Plant Sci 13: 517-520
Vandepoele K, Casneuf T, Van de Peer Y (2006) Identification of novel regulatory modules in
dicotyledonous plants using expression data and comparative genomics. Genome Biol 7: R103
Vandepoele K, Raes J, De Veylder L, Rouzé P, Rombauts S, Inzé D (2002) Genome-wide analysis of
core cell cycle genes in Arabidopsis. Plant Cell 14: 903-916
Vodermaier HC, Gieffers C, Maurer-Stroh S, Eisenhaber F, Peters JM (2003) TPR subunits of the
anaphase-promoting complex mediate binding to the activator protein CDH1. Curr Biol 13:
1459-1468