Download msb201053-sup-0001 - Molecular Systems Biology
Supplementary information

Table of Contents
Supplementary Figures 1 – 7
Supplementary Tables I – XIII
Supplementary References
Supplementary MS data

Subject Categories: Plant biology, Metabolic and regulatory networks

Supplementary figures

Figure 1. Graphical views created in Cytoscape representing the whole cell cycle interactome, the subnetwork comprising the domain I1 interactions, and the network of the domain I2 set. The legend corresponding to the networks is shown. N/A = not assessed, if no PCC could be calculated. Pred. = predicted.

Figure 2. Graphical representation of the bait-bait interactions extracted from the domain I1 (A) and I2 (B) datasets, and the union of both, showing a single highly interconnected bait-bait network (C). Networks were built in Cytoscape and visualized as described in the legend used in Supplementary Figure 1.

Figure 3. Distribution of the number of cell cycle-related features (Supplementary Table VIII) among the whole gene pool (genome-wide), a collection of 518 cell cycle genes (Supplementary Table IX), and bait proteins (without baits used in reverse TAP experiments).

Figure 4. Diagrams showing the distribution of the GO similarity scores of gene pairs of domain I1 (blue) and domain I2 (green). This is compared to the distribution of the average GO similarity scores of 1000 corresponding networks containing an equal number of randomly chosen gene pairs as I1 (red) or I2 (purple). Standard deviations are shown for the average of the random networks. GO terms representing the biological process, cellular component and molecular function were assessed. Proteins that could not be assigned to a specific gene locus were discarded from the analysis.

Figure 5.
Split luciferase analysis of AT1G01880 (DNA repair protein) X AT1G15570 (CYCA2;3), AT5G02530 (RNA and export factor-binding protein) X AT4G11920 (CCS52A2), and AT5G06680 (spindle pole body component 98) X AT3G12280 (RBR) (Negative control 1) interactions. The different pairs of split-luciferase constructs were transiently expressed in Arabidopsis cotyledons and light was measured directly from the seedlings. (A) Seedlings were infiltrated with split-luciferase vectors. (B) Light emission signal as measured by the camera. (C) Light emission in false color was overlaid on seedlings. Bar, 10 mm.

Figure 6. Subnetwork around genes with an E2F motif in their promoter. The network was built in Cytoscape and visualized as described in the legend used in Supplementary Figure 1.

Figure 7. Subnetwork around genes with an MSA motif in their promoter. The network was built in Cytoscape and visualized as described in the legend used in Supplementary Figure 1.

Supplementary Tables

Supplementary Table I: Comparison of the number of basic cell cycle regulators.
Five different types of cell cycle regulators were counted and compared between Arabidopsis thaliana (Capron et al, 2003; Menges et al, 2005; Peres et al, 2007), Saccharomyces cerevisiae and Homo sapiens (Morgan, 2008a; Morgan, 2008b). For every type of regulator, the number of regulators present in a species is shown in the # column following that species.

| Regulator type | S. cerevisiae | # | H. sapiens | # | A. thaliana | # |
|---|---|---|---|---|---|---|
| Cyclin-dependent kinases | Cdk1 (cdc28) | 1 | Cdk1, 2, 4, 6 | 4 | CDKA;1, CDKB1;1-2, CDKB2;1-2 | 5 |
| Cyclins | Cln1, 2, 3; Clb1, 2, 3, 4, 5, 6 | 9 | Cyclin D1-3, Cyclin E; Cyclin A1-2, Cyclin B1-2 | 4 + 4 | 10 D-Type cyclins; 10 A-Type cyclins, 11 B-Type cyclins | 10 + 21 |
| CDK-activating kinases | Cak1 | 1 | Cdk7 (+ cyclin H) | 2 | CDKD;1-3 (+ CYCH;1), CDKF;1 | 5 |
| APC activators | Cdc20, Cdh1, Ama1 | 3 | Cdc20, Cdh1 | 2 | Cdc20;1-6, CCS52A1-2,B | 9 |
| CDK inhibitor proteins | Sic1 | 1 | Cip1, Kip1-2; Ink4a-d | 3 + 4 | KRP1-7; SIM, SMR1-13 (a) | 7 + 14 |
| Total (a) | | 15 | | 23 | | 71 |

(a) The family of SIM-related proteins (SIM, SMR1-5) was recently extended with 8 additional members (SMR6-13) that were not described before (own unpublished data).

Supplementary Table II: Core cell cycle gene list of Arabidopsis.
Overview of core cell cycle genes identified in Arabidopsis thaliana, augmented with the SIM-related genes, mitotic checkpoint homologs including proteins of the anaphase promoting complex (APC) and genes involved in DNA replication or DNA repair, as described in the introduction. The last column specifies to which cell cycle category a gene belongs.

Supplementary Table III: Overview of bait proteins used to build the interactome.
The name of the bait is shown together with its accession number from the TAIR database (Locus). In the column category, ‘Reverse’ refers to the six baits that were chosen for reverse TAP experiments. Topology = C- or N-terminal tag fusions. Tag refers to the applied tag, being either the traditional TAP tag developed for Saccharomyces cerevisiae (Rigaut et al, 1999) or the tandem affinity tag GS (Van Leene et al, 2008). Expression was indicated as (+) if the TAP fusion protein could be detected in transgenic cell cultures by western blot analysis. The total number of purifications performed per bait is shown, with a minimum of 2 for each.

Supplementary Table IV: List of non-specific and background interactors.
These were determined by control TAP experiments such as TAP [7] or GS [3] purifications on mock cultures, purifications from cultures expressing TAP fusions of heterologous GFP [7], RFP [2] or β-glucuronidase [5], or GS fusions of heterologous GFP [8] or β-glucuronidase [4]. The numbers between brackets represent the number of experiments performed. Ribosomal proteins, actins and tubulins identified in these control experiments were not included in the list; instead, their whole protein families were treated as non-specific interactors and background.

Supplementary Table V: Node attributes used to generate the network in Cytoscape.
The node attribute file contains the AGI accession numbers (Locus), protein names (Description), the TAIR8.0 descriptions, localization data downloaded from the SUB-cellular location database for Arabidopsis (SUBA) (Heazlewood et al, 2007), information about the periodicity of the corresponding gene (Periodic) and the cell cycle phase during which the transcript peaks (Phase). Furthermore, data are integrated about the presence of E2F or MSA motifs in the promoter of the corresponding gene and about the presence of CDK phosphorylation sites in the protein sequence. Finally, the file indicates whether a protein was used as bait or was predicted as a new cell cycle protein (Type). For the localization data obtained from SUBA, we ranked the different data sources as a kind of evidence score: (1) for GFP, (2) for MS, (3) for SwissProt, and (4) for AmiGO. Each gene was annotated by taking these scores into account: if an annotation from source (1) was available for a gene, that annotation was used; otherwise we went down the list until an annotation was found.

Supplementary Table VI: Edge attributes used to generate the network in Cytoscape.
The edge attribute file lists all identified interactions, together with their AGI accession numbers, protein names, the degree of co-expression correlation (PCC) (a value of -2 means that the PCC could not be calculated) and the GO similarity scores (NaN = not assessed) for biological process, molecular function and cellular component. Furthermore, it indicates whether an interaction was found in both directions (Reciprocal) and whether the interaction was confirmed and belongs to domain I1 (Confirmed). Finally, the presence of the interaction in one of the 6 interaction databases (Arabidopsis Reactome, AtPID, BAR, IntAct, Reactome, TAIR) can be consulted.

Supplementary Table VII: Overview of microarray experiments used to calculate the transcript PCCs.
List of experiments used to build an Arabidopsis ATH1 microarray compendium of 518 experiments focused on plant growth and development, which was used to calculate the transcript Pearson correlation coefficients.

Supplementary Table VIII: Overview of cell cycle-related features used in the computational analysis to search for new cell cycle proteins.
| Cell cycle feature | Category | # of genes | References |
|---|---|---|---|
| Periodicity | Gene expression | 1258 | (Jensen et al, 2006; Menges et al, 2003) |
| MSA-like | Promoter motif | 2295 | (Vandepoele et al, 2006) |
| E2Fa-like | Promoter motif | 1809 | (Vandepoele et al, 2006) |
| E2F10SPCNA | Promoter motif | 2221 | (Vandepoele et al, 2006) |
| OS_motifsIandIIa | Promoter motif | 2310 | (Vandepoele et al, 2006) |
| UP1ATMSD | Promoter motif | 3738 | (Vandepoele et al, 2006) |
| wrrmGCGn | Promoter motif | 2179 | (Vandepoele et al, 2006) |
| CDK consensus site | Phosphorylation motif | 6321 | (De Veylder et al, 1997) |
| [IM]R-tail | Protein sequence motif | 116 | (Hayes et al, 2006; Vodermaier et al, 2003) |
| PEST-sequence | Destruction motif | 2719 | (Rechsteiner & Rogers, 1996) |
| D-box | Destruction motif | 2369 | (Capron et al, 2003) |
| KEN-box | Destruction motif | 410 | (Capron et al, 2003) |
| GxEN-box | Destruction motif | 300 | (Castro et al, 2003) |
| A-box | Destruction motif | 1779 | (Littlepage & Ruderman, 2002) |

Supplementary Table IX: Collection of known cell cycle genes.
Collection of 518 genes annotated as cell cycle by gene ontology, including all genes described as core cell cycle genes or related to cell cycle (Capron et al, 2003; Menges et al, 2005; Shultz et al, 2007; Vandepoele et al, 2002). The following GO terms were used to build the collection: cell cycle (GO:0007049), cell division (GO:0051301), cell proliferation (GO:0008283), chromosome segregation (GO:0007059), DNA replication (GO:0006260), DNA replication and chromosome cycle (GO:0000067). For every gene, the number of cell cycle-related features (Supplementary Table VIII) present is shown. This collection was subtracted from the prey lists to find novel cell cycle proteins.

Supplementary Table X: New candidate cell cycle proteins.
A) Overview of 40 new candidate cell cycle proteins identified in the domain I1 dataset containing at least 2 of the cell cycle-related features listed in Supplementary Table VIII. The number of identified features per protein is shown.
B) Overview of 83 new candidate cell cycle proteins identified in the domain I2 dataset containing at least 2 of the cell cycle-related features listed in Supplementary Table VIII. The number of identified features per protein is shown.

Supplementary Table XI: Comparison of overrepresented Biological Processes among the preys of domain I1 (A) and I2 (B) as determined with the Cytoscape plugin BiNGO.
Search parameters among the preys of the domain I1 and I2 datasets:

File created with BiNGO (c) on 22-okt-2008 at 16:17:17
ontology: process
curator: GO
Selected ontology file : BiNGO.jar!/GO_Biological_Process
Selected annotation file : gene_association_2008okt04.tair
Overrepresentation
Selected statistical test : Hypergeometric test
Selected correction : Benjamini & Hochberg False Discovery Rate (FDR) correction
Selected significance level : 0.05
Testing option : Test cluster versus whole annotation
Number of annotated genes in selection : 137 for domain I1 and 197 for domain I2
Number of annotated genes in network/whole annotation : 25557

The result files of the BiNGO analysis are parsed into one table. The table shows the GO term and the corresponding corrected p-value for the domain I1 dataset (A) and the domain I2 dataset (B), with the different ranges of the p-values highlighted in different colors. GO terms that are not overrepresented in one of these sets are marked with >0.05.

Supplementary Table XII: Overview of protein pairs tested by the split-luciferase assay.
Overview of all 17 protein pairs tested by the transient split-luciferase assay, including 3 negative controls (Untr., Neg. 1 and Neg. 2). The column ‘Type’ shows whether an interaction discovered by the TAP analysis belongs to domain I1 or domain I2, or whether it was used as a negative control. The TAIR accessions of the corresponding genes are shown together with the protein name.
Proteins in column ‘Protein A’ were fused to the amino-terminal moiety of the Firefly luciferase, while proteins in column ‘Protein B’ were fused to the carboxy-terminal moiety of the luciferase. Fusions to the luciferase moieties were made either to the amino-terminus of both tested proteins (N) or to the carboxy-terminus of both proteins (C). The mean value of the relative light emission from at least two experiments per tested interaction is shown together with the standard errors. The number of replicates is shown, taking into account that only experiments recorded with an integration of 2000 images are included. The mean value was set to zero when net LUC activity was negative, meaning that the background value was higher than the total LUC activity. An interaction was considered positive (strong or weak) when the net relative LUC activity was above an arbitrarily chosen threshold of 4500, which is 9-fold higher than the signal from untransformed seedlings.

Supplementary Table XIII: List of CDK/cyclin complexes and their regulators extracted from the interactome.
The CDK/cyclin complexes were determined based on the interactions found between a CDK and a cyclin. MAT1 was added as canonical interactor to both CDKD/cyclin complexes. Interacting CKS scaffolding proteins and negative regulators are based on bait-prey interactions between a cyclin and a CKS, a cyclin and a KRP, or a cyclin and a SIM/SMR. (SIM/SMR1,2) indicates possible regulation of the complex by one or more of these inhibitors, based on their association with CDKB1;1. Complexes are ordered according to the time of action of the cyclin based on its transcript peak level (Menges et al, 2005). SMR6 = AT5G40460, SMR8 = AT1G10690, SMR11 = AT2G28330. The table is further visualized in Figure 4.
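The assembly rule stated in this legend can be sketched in Python. This is a minimal illustration and not the authors' pipeline: the function names `protein_class` and `assemble_complexes`, the classification by name prefix, and the small edge list in the usage note are all assumptions made for the example. The "(SIM/SMR1,2)" annotation, which is inferred through CDKB1;1, is deliberately not modeled.

```python
from collections import defaultdict


def protein_class(name):
    """Classify a protein by its name prefix (hypothetical helper)."""
    if name.startswith("CDK"):
        return "CDK"
    if name.startswith("CYC"):
        return "cyclin"
    if name.startswith("CKS"):
        return "CKS"
    if name.startswith(("KRP", "SIM", "SMR")):
        return "inhibitor"
    return "other"


def assemble_complexes(edges):
    """Build CDK/cyclin complexes from an undirected list of interactions.

    A complex is reported for every CDK-cyclin interaction. CKS proteins and
    negative regulators are attached through their interaction with the
    cyclin, and MAT1 is added to CDKD complexes as canonical interactor.
    """
    partners = defaultdict(set)
    for a, b in edges:
        partners[a].add(b)
        partners[b].add(a)

    complexes = []
    cdks = sorted(p for p in partners if protein_class(p) == "CDK")
    for cdk in cdks:
        cyclins = sorted(q for q in partners[cdk] if protein_class(q) == "cyclin")
        for cyc in cyclins:
            core = [cdk, cyc]
            if cdk.startswith("CDKD"):
                core.append("MAT1")  # canonical interactor, added by rule
            complexes.append({
                "core": core,
                "cks": sorted(x for x in partners[cyc]
                              if protein_class(x) == "CKS"),
                "inhibitors": sorted(x for x in partners[cyc]
                                     if protein_class(x) == "inhibitor"),
            })
    return complexes
```

As a usage example, the illustrative edges `[("CDKA;1", "CYCD3;1"), ("CYCD3;1", "CKS2"), ("CYCD3;1", "SMR4"), ("CYCD3;1", "KRP6"), ("CDKD;2", "CYCH;1")]` yield two complexes: CDKA;1/CYCD3;1 with CKS2 and the inhibitors KRP6 and SMR4, and CDKD;2/CYCH;1/MAT1 without further interactors.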
| Complex | CKS interactors | Negative regulators | Time of action |
|---|---|---|---|
| CDKA;1 CYCD2;1 | CKS1/CKS2 | SMR3/SMR4/SMR6/SMR8, KRP2/KRP3/KRP4/KRP5/KRP6/KRP7 | Constant |
| CDKA;1 CYCD3;1 | CKS2 | SMR4, KRP6 | G1/S, M |
| CDKA;1 CYCD3;2 | CKS1/CKS2 | | Constant |
| CDKA;1 CYCD3;3 | CKS2 | SMR6 | Early G1 |
| CDKA;1 CYCD4;1 | CKS2 | SMR8, KRP5/KRP6/KRP7 | G1/S |
| CDKA;1 CYCD4;2 | CKS1/CKS2 | KRP2/KRP3/KRP4/KRP5/KRP6/KRP7 | G1/S |
| CDKA;1 CYCD5;1 | CKS1/CKS2 | SMR6/SMR8, KRP2/KRP7 | Early G1 |
| CDKA;1 CYCD6;1 | CKS1 | KRP3/KRP4/KRP7 | G1/S |
| CDKA;1 CYCD7;1 | CKS1/CKS2 | SMR4/SMR6 | G1/S |
| CDKB1;1 CYCA2;1 | | (SIM/SMR1/SMR2) | G2/M |
| CDKB1;1 CYCA2;3 | CKS1 | (SIM/SMR1/SMR2) | G2/M |
| CDKA;1 CYCA3;1 | CKS2 | | S |
| CDKA;1 CYCA3;3 | CKS1/CKS2 | | Constant |
| CDKA;1 CYCA3;4 | CKS1/CKS2 | | S |
| CDKB2;1 CYCB1;1 | | | G2/M |
| CDKB2;2 CYCB1;1 | | | G2/M |
| CDKB2;2 CYCB1;2 | CKS2 | | |
| CDKB1;1 CYCB2;2 | CKS1/CKS2 | | |
| CDKB1;2 CYCB2;2 | CKS1/CKS2 | | |
| CDKB1;1 CYCB2;4 | CKS1/CKS2 | | |
| CDKB1;1 CYCB3;1 | | | |
| CDKC;1 CYCT1;3 | | | Constant |
| CDKC;2 CYCT1;3 | | | Constant |
| CDKC;1 CYCH;1 | | | Constant |
| CDKC;2 CYCH;1 | | | Constant |
| CDKD;2 CYCH;1/MAT1 | | | Constant |
| CDKD;3 CYCH;1/MAT1 | | | Constant |
| CDKG;1 CYCL1 | | | Constant, peak at G0/G1 |
| CDKG;2 CYCL1 | | | Constant |

[Negative regulators and time of action for the five complexes from CDKB2;2 CYCB1;2 to CDKB1;1 CYCB3;1, in source order: M (SIM/SMR1/SMR2) G2/M G2/M (SIM/SMR1/SMR2) G2/M SMR11 G2/M (SIM/SMR1/SMR2) G2/M]

Supplementary References

Capron A, Ökrész L, Genschik P (2003) First glance at the plant APC/C, a highly conserved ubiquitin-protein ligase. Trends Plant Sci 8: 83-89
Castro A, Vigneron S, Bernis C, Labbe JC, Lorca T (2003) Xkid is degraded in a D-box, KEN-box, and A-box-independent pathway. Mol Cell Biol 23: 4126-4138
De Veylder L, Segers G, Glab N, Van Montagu M, Inzé D (1997) Identification of proteins interacting with the Arabidopsis Cdc2aAt protein. J Exp Bot 48: 2113-2114
Hayes MJ, Kimata Y, Wattam SL, Lindon C, Mao G, Yamano H, Fry AM (2006) Early mitotic degradation of Nek2A depends on Cdc20-independent interaction with the APC/C. Nat Cell Biol 8: 607-614
Heazlewood JL, Verboom RE, Tonti-Filippini J, Small I, Millar AH (2007) SUBA: the Arabidopsis Subcellular Database. Nucleic Acids Res 35: D213-D218
Jensen LJ, Jensen TS, de Lichtenberg U, Brunak S, Bork P (2006) Co-evolution of transcriptional and post-translational cell-cycle regulation. Nature 443: 594-597
Littlepage LE, Ruderman JV (2002) Identification of a new APC/C recognition domain, the A box, which is required for the Cdh1-dependent destruction of the kinase Aurora-A during mitotic exit. Genes Dev 16: 2274-2285
Menges M, de Jager SM, Gruissem W, Murray JAH (2005) Global analysis of the core cell cycle regulators of Arabidopsis identifies novel genes, reveals multiple and highly specific profiles of expression and provides a coherent model for plant cell cycle control. Plant J 41: 546-566
Menges M, Hennig L, Gruissem W, Murray JAH (2003) Genome-wide gene expression in an Arabidopsis cell suspension. Plant Mol Biol 53: 423-442
Morgan DO (2008a) SnapShot: Cell-cycle regulators I. Cell 135: 764-764 E761
Morgan DO (2008b) SnapShot: Cell-cycle regulators II. Cell 135: 974-974 E971
Peres A, Churchman ML, Hariharan S, Himanen K, Verkest A, Vandepoele K, Magyar Z, Hatzfeld Y, Van Der Schueren E, Beemster GTS, Frankard V, Larkin JC, Inzé D, De Veylder L (2007) Novel plant-specific cyclin-dependent kinase inhibitors induced by biotic and abiotic stresses. J Biol Chem 282: 25588-25596
Rechsteiner M, Rogers SW (1996) PEST sequences and regulation by proteolysis. Trends Biochem Sci 21: 267-271
Rigaut G, Shevchenko A, Rutz B, Wilm M, Mann M, Séraphin B (1999) A generic protein purification method for protein complex characterization and proteome exploration. Nat Biotechnol 17: 1030-1032
Shultz RW, Tatineni VM, Hanley-Bowdoin L, Thompson WF (2007) Genome-wide analysis of the core DNA replication machinery in the higher plants Arabidopsis and rice. Plant Physiol 144: 1697-1714
Van Leene J, Witters E, Inzé D, De Jaeger G (2008) Boosting tandem affinity purification of plant protein complexes. Trends Plant Sci 13: 517-520
Vandepoele K, Casneuf T, Van de Peer Y (2006) Identification of novel regulatory modules in dicotyledonous plants using expression data and comparative genomics. Genome Biol 7: R103
Vandepoele K, Raes J, De Veylder L, Rouzé P, Rombauts S, Inzé D (2002) Genome-wide analysis of core cell cycle genes in Arabidopsis. Plant Cell 14: 903-916
Vodermaier HC, Gieffers C, Maurer-Stroh S, Eisenhaber F, Peters JM (2003) TPR subunits of the anaphase-promoting complex mediate binding to the activator protein CDH1. Curr Biol 13: 1459-1468
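
The positive-call rule described in the legend of Supplementary Table XII can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function name `score_pair`, the representation of each experiment as a (total, background) pair, and the numbers in the usage note are invented for the example. Only the threshold of 4500, the clamping of negative means to zero, and the minimum of two experiments come from the legend.

```python
from math import sqrt
from statistics import mean, stdev

# Threshold stated in the legend; being 9-fold above the untransformed
# signal, it implies an untransformed signal of about 500.
THRESHOLD = 4500


def score_pair(experiments, threshold=THRESHOLD):
    """Score one tested protein pair.

    experiments: list of (total, background) relative LUC readings,
    one tuple per replicate experiment (hypothetical input layout).
    """
    if len(experiments) < 2:
        raise ValueError("at least two experiments per interaction are required")

    # Net activity per experiment = total signal minus background signal.
    net = [total - background for total, background in experiments]

    standard_error = stdev(net) / sqrt(len(net))
    # The mean is set to zero when the net activity is negative.
    net_mean = max(mean(net), 0.0)

    return {
        "mean": net_mean,
        "se": standard_error,
        "n": len(net),
        "positive": net_mean > threshold,
    }
```

For instance, two invented experiments with readings (6000, 500) and (7000, 600) give a mean net activity of 5950 and a positive call, whereas readings (300, 500) and (400, 450) give a negative net mean that is set to zero and therefore a negative call.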