Immunology and Cell Biology (2007) 85, 567–570
© 2007 Australasian Society for Immunology Inc. All rights reserved 0818-9641/07 $30.00
www.nature.com/icb

SHORT COMMUNICATION

Use of gene profiling to describe a niche for dendritic cell development

Geneviève Despars1,3, Terence J O’Neill2 and Helen C O’Neill1

Gene profiling provides a multitude of data on individual gene expression. The view is expressed here that unreplicated data can be used in a descriptive way to compare cell populations in terms of their lineage characteristics and function. In these studies, the aim is to provide a snapshot of gene expression or its absence as a reflection of cell lineage or type, rather than to gain a reliable expression measure for all genes expressed. The data set used in this analysis represents gene expression in the splenic stroma STX3, which supports dendritic cell hematopoiesis, and the lymph node stroma 2RL22, which does not. These data were obtained by hybridization of Affymetrix U74Av2 genechips. P-value selection was used effectively to identify genes with a high probability of differential expression. Genes that relate to a niche environment for hematopoiesis have been selected for further study to make predictions about the cell types of supportive stroma.

Immunology and Cell Biology (2007) 85, 567–570; doi:10.1038/sj.icb.7100080; published online 29 May 2007

The microenvironment required for development of dendritic cells (DC) is still poorly understood, owing to a lack of appropriate in vitro systems to study differentiation.
Long-term cultures (LTC) of spleen producing DC have been promising in this regard.1 These cultures comprise an adherent stromal cell layer required for DC development and a suspension fraction of progenitors and immature DC.2,3 The cultures were found not to produce the known growth factors that regulate DC development in vitro.1 Cell–cell contact is required between DC progenitors and adherent stromal cells to drive DC development, but no adhesion molecules have yet been identified.3,4 Very little is therefore known about the potential regulators expressed by stromal cell components of spleen that support DC production in LTC.

The need to identify stromal regulators of DC hematopoiesis has prompted an investigation of the genome-wide expression patterns of two functionally different stroma. The STX3 splenic stroma supports DC development from bone-marrow-derived progenitors, whereas the 2RL22 lymph node stroma does not.5 A major challenge of transcriptome analysis using microarrays is the analysis of signal values and the retrieval of specific data sets. This paper details the procedures for retrieval of probe sets identifying genes expressed with high certainty in STX3 but not expressed in 2RL22.

RESULTS AND DISCUSSION

Transcriptome analysis on two functionally distinct stroma was performed with the aim of identifying new regulators involved in DC development in vitro. Since the aim was to ‘fish’ for genes expressed with high certainty in STX3 but not expressed in 2RL22, we performed computational analysis to identify these genes from an unreplicated Affymetrix genechip experiment. We compare here the standard method of the Affymetrix Microarray Suite 5.0 software package (MAS 5.0) with another method, based on P-value, for retrieving a short list of genes of interest. This computational subtraction method could be relevant for gene expression studies that are not easily replicated and that sometimes involve small numbers of stem cells or progenitors.
These studies usually involve difficult and costly procedures for cell isolation.

Comparative transcriptome analysis is based on the hypothesis that genes with similar expression patterns are related mechanistically, and functionally associated with a given biological process. The STX3 splenic stroma is an in vitro niche model for early DC hematopoiesis. To identify genes specifically expressed in STX3, computational subtractions were performed between 2RL22 and STX3. This differential analysis allows identification of genes potentially involved in early DC hematopoiesis. Analysis of signal value distribution revealed that the vast majority of genes are expressed at similar levels in both stroma. The common data set of 2RL22 and STX3 probably reflects housekeeping genes related to cell metabolism and division.

1School of Biochemistry and Molecular Biology, The Australian National University, Canberra, Australia and 2School of Finance and Applied Statistics, The Australian National University, Canberra, Australia
3Current address: Institut de Génomique Fonctionnelle de Lyon, IFR128 Gerland Lyon Sud, Université Lyon 1, CNRS, INRA, Ecole Normale Supérieure de Lyon, France.
Correspondence: Professor HC O’Neill, School of Biochemistry and Molecular Biology, The Australian National University, Bldg 41 Linnaeus Way, Canberra, ACT 0200, Australia. E-mail: [email protected]
Received 15 February 2007; revised 29 April 2007; accepted 4 May 2007; published online 29 May 2007

The general aim of the Affymetrix genechip experiment was to retrieve a list of genes with high confidence in their specific expression in STX3. Detection calls ascribed by MAS 5.0 were first used to retrieve specific data sets. Detection calls correspond to ‘present’ for detection P-value <0.04, ‘marginal’ for detection P-value between 0.04 and 0.06 and ‘absent’ for detection P-value >0.06. This method of retrieval gives poor discrimination of specific expression: the specific data sets overlap, and many probe sets lie on the line of slope 1 (Figure 1a).

Figure 1 Comparison of specific data sets retrieved on detection calls and detection P-values. (a) Signal plot of specific data sets retrieved according to the indicated selection criterion. For detection call, the 2RL22-specific data set was selected by 2RL22: presence and STX3: absence or marginal. The STX3-specific data set included probe sets with both STX3: presence and 2RL22: absence or marginal. Retrieval on stringent detection P-value was based on detection P-value ≤0.005 for presence and detection P-value ≥0.1 for absence. Corresponding probe sets were plotted for 2RL22 signal value against STX3 signal value. Contours represent 1, 5, 10, 15, 20 and 25 probe set limits for the left panel and 1, 2, 3, 4 and 5 probe set limits for the right panel. The line of slope 1 is shown. (b) Detection P-value plot of specific data sets retrieved on detection call and detection P-value. The line at the standard detection P-value of 0.04 is shown.

In MAS 5.0, the detection P-value is calculated with the one-sided Wilcoxon Signed Rank Test, which yields discrete values. For a given probe set, a low STX3 detection P-value is associated with a high certainty of signal value. Requiring a low detection P-value limits the number of false positives, in the same spirit as a Bonferroni correction. A 2RL22 detection P-value greater than the MAS 5.0 default value of 0.06 for absence should remove false negatives, that is, probe sets called absent that are in fact expressed. To identify probe sets specifically expressed by STX3 but not 2RL22, several detection P-values below 0.04 were set for STX3, along with a constant 2RL22 detection P-value above 0.1. The same approach was taken to retrieve a 2RL22-specific data set.
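The two retrieval strategies described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Excel or IDL procedure: the `ProbeSet` record, its field names and the example values are hypothetical, and only the thresholds (0.04 and 0.06 for detection calls, ≤0.005 and ≥0.1 for stringent selection) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeSet:
    """Hypothetical record for one probe set (field names are assumptions)."""
    probe_id: str
    stx3_p: float       # detection P-value on the STX3 chip
    rl22_p: float       # detection P-value on the 2RL22 chip
    stx3_signal: float
    rl22_signal: float


def detection_call(p: float) -> str:
    """MAS 5.0-style call using the default cut-offs quoted in the text."""
    if p < 0.04:
        return "present"
    if p <= 0.06:
        return "marginal"
    return "absent"


def stx3_specific_by_call(rows):
    """STX3 present, 2RL22 marginal or absent."""
    return [r for r in rows
            if detection_call(r.stx3_p) == "present"
            and detection_call(r.rl22_p) in ("marginal", "absent")]


def stx3_specific_by_pvalue(rows, present_max=0.005, absent_min=0.1):
    """Stringent selection: STX3 P <= present_max and 2RL22 P >= absent_min."""
    return [r for r in rows
            if r.stx3_p <= present_max and r.rl22_p >= absent_min]


# Invented example values, for illustration only.
rows = [
    ProbeSet("probe_a", 0.001, 0.50, 400.0, 30.0),   # clearly STX3-specific
    ProbeSet("probe_b", 0.030, 0.05, 110.0, 90.0),   # passes on calls only
    ProbeSet("probe_c", 0.002, 0.02, 300.0, 280.0),  # present in both
    ProbeSet("probe_d", 0.200, 0.001, 20.0, 350.0),  # 2RL22-specific
]

print([r.probe_id for r in stx3_specific_by_call(rows)])
print([r.probe_id for r in stx3_specific_by_pvalue(rows)])
```

In this toy input, the borderline probe set (present in STX3, marginal in 2RL22, similar signal in both) is retained by the call-based filter but rejected by the stringent P-value filter, which is the behaviour the text attributes to the two methods. Swapping the roles of the two P-value fields gives the corresponding 2RL22-specific selection.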
Specific data sets retrieved on stringent detection P-values (≤0.005 for presence and ≥0.1 for absence) showed no overlap in terms of signal value (Figure 1a). Probe sets of these specific data sets were annotated and clustered into functional categories. The pattern of functional categorization differed between the STX3 and 2RL22 data sets, suggesting that selection on detection P-value retrieves genes related to distinct biological functions (data not shown).

STX3-specific data sets retrieved on standard and stringent detection P-values were further analyzed in terms of detection P-values and signal (Figure 1b). The nature of the one-sided Wilcoxon Signed Rank Test was revealed by the alignment of P-values at discrete values rather than their random dispersion. Selection on the standard detection P-value extracted a data set of 673 probe sets, which is relatively large and complicates the choice of potentially interesting genes. Discrimination between STX3 and 2RL22 signal values was also poor, with a mean signal value of 113.3±10.0 for STX3 and 36.3±2.1 for 2RL22. The size of data sets decreased as the STX3 detection P-value decreased from 0.03 to 0.005, ranging from 317 to 165 probe sets. Analysis of signal values indicated good discrimination between STX3 and 2RL22, with the mean STX3 signal being ~7-fold to ~15-fold higher than the mean 2RL22 signal. These were significantly different (Z-test). Selection on the basis of stringent detection P-value also generated data sets with higher STX3 signal values compared with selection based on detection calls. Selection with stringent detection P-value can therefore efficiently discriminate specifically expressed probe sets and enrich for highly expressed probe sets.

Since specific gene expression was thought to be related to the DC supportive function of STX3, we were interested in identifying genes that are potential regulators of the microenvironment for DC development.
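The signal comparison reported above (mean signal per stroma, fold difference and a Z-test) can be illustrated with a short Python sketch. It rests on assumptions that are not stated in the text: the two samples are treated as independent, the standard error of the mean is used as the spread of each mean, and the P-value is two-sided. The function names and example signal values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def sem(values):
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return stdev(values) / sqrt(len(values))


def fold_difference(a, b):
    """Ratio of mean signals, e.g. mean STX3 signal / mean 2RL22 signal."""
    return mean(a) / mean(b)


def z_test(a, b):
    """Two-sample Z statistic and two-sided P-value (normal approximation)."""
    z = (mean(a) - mean(b)) / sqrt(sem(a) ** 2 + sem(b) ** 2)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Invented signal values for one small STX3-specific data set.
stx3_signal = [120.0, 340.0, 95.0, 510.0, 260.0, 180.0]
rl22_signal = [22.0, 41.0, 18.0, 35.0, 30.0, 28.0]

z, p = z_test(stx3_signal, rl22_signal)
print(f"fold = {fold_difference(stx3_signal, rl22_signal):.1f}, z = {z:.2f}, p = {p:.4f}")
```

The normal approximation is more defensible for data sets of the size reported here (165 to 673 probe sets) than for the six invented values used in this example.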
A DC niche could be regulated by secreted factors as well as cell surface molecules. Probe sets clustered under chemokines/cytokines, growth factors, matrix remodeling, extracellular matrix/cell adhesion, surface proteins, receptors, signaling molecules, transcriptional regulation and development were considered to be of interest. Selection on the basis of detection P-value resulted in enrichment of functionally relevant categories. These categories altogether represent 56% of probe sets in the specific data set retrieved on detection call (STX3 P-value <0.04; 2RL22 P-value >0.04), compared with 66.9% of the STX3-specific data set retrieved on the basis of detection P-value (STX3 P-value ≤0.005 and 2RL22 P-value ≥0.1). This enrichment reflected decreasing numbers of probe sets related to metabolism across data sets, rather than an increase in the absolute number of probe sets within relevant categories (data not shown).

Analysis of genes in the categories ‘receptor’ and ‘extracellular matrix’ showed that selection on detection P-value removed probe sets with an STX3:2RL22 signal ratio of ~1 and gave a higher mean signal value (Figure 2). These results indicate that selection on detection P-value eliminated genes related to common metabolic functions, as well as genes of little interest in terms of signal discrimination between STX3 and 2RL22 within the categories of interest. Expression of a number of genes in STX3 but not 2RL22 was confirmed using RT-PCR.6 In the ‘receptor’ category, these genes included Acvrl1, Ms4a4d and Tnfrsf9, and in the ‘extracellular matrix’ category, these included Col18a1, Mcam and Cd34. With the expression also of Flt1 but not Cd31 or Vwf (von Willebrand factor), STX3 appears to represent an immature endothelial cell. Altogether, retrieval of genes on the basis of detection P-value was successful in identifying the phenotype of STX3 stromal cells that provide hematopoietic support function for DC development.
Rather than a comprehensive map of gene expression, this method provided a biological snapshot of gene expression in STX3. In contrast, 2RL22 showed expression of genes for extracellular matrix proteins (Col1a1, Col2a1, Col3a1, Col5a1, Col5a2, P4ha1 and P4ha2) expressed by fibroblasts.

This report compares various approaches for extraction of differentially expressed genes from unreplicated Affymetrix data sets. Our aim was to detect a subset of genes with high certainty of differential expression rather than to detect all the genes that are potentially differentially expressed. The method for extraction of data was based on P-value selection to remove data outside the bounds of statistical significance. This procedure was compared with the default detection call associated with the commonly used software package MAS 5.0. Selection on the basis of P-value was retained as the most discriminating method for retrieving probe sets specifically expressed by STX3 but not 2RL22. This method was considered the most appropriate, given the size of the data set and expression levels. Using P-value selection, probe sets were retrieved with higher expression levels in STX3 compared with 2RL22, indicated by differences in the mean and median signal values. Several of these genes were further verified by RT-PCR, which confirmed differential gene expression between STX3 and 2RL22.6 Furthermore, Affymetrix genechip results have been shown to correlate well with quantitative real-time PCR data.7

Specifically expressed genes are expressed at low levels and represent a relatively small proportion of genes expressed by each stroma. This suggests that a small set of genes expressed at low level may be responsible for cell specificity. Hence, definition of the specific microenvironment for early DC hematopoiesis could depend on expression of a limited number of relevant genes.

Other computational approaches have involved lower confidence bound calculation and P-value. However, the lower confidence bound calculation depends on a large number of arrays.8 The statistical significance (P-value) of gene expression has also been described for microarray experiments that have a series of samples.9 Most importantly, the P-value selection method used here can be readily applied to non-replicated data sets derived from limited numbers of rare cells like stem cells.

Figure 2 Comparison of the categories ‘receptor’ and ‘extracellular matrix’ for STX3-specific data sets. Dot plots of the receptor (a) and extracellular matrix (b) categories of STX3-specific data sets retrieved on detection call and detection P-value. The line of slope 1 is shown.

METHODS

Cell lines
Derivation of the splenic stroma STX3 and the lymph node stroma 2RL22 has been described previously.1–3 Both were derived from LTC established from B10.A(2R) mice. The two stroma differ in cell morphology. STX3 comprises a mix of endothelial cells and fibroblasts, while 2RL22 contains mainly fibroblast-like cells.5 Stromal cells were cultured as described previously6 and maintained by scraping attached cells for passage into a new flask.

Microarray analysis of gene expression
Total RNA extraction was performed using Trizol (Invitrogen Life Technologies, Mount Waverley, VIC, Australia). Synthesis of cDNA involved the use of T7-(dT)24 primers and SuperScript II according to the manufacturer’s instructions (Invitrogen Life Technologies). This was followed by second strand synthesis with DNA polymerase I (Promega, Annandale, NSW, Australia). In vitro transcription and biotin labeling were performed by Dr Kaiman Peng (Biomolecular Resources Facility, Australian National University) using the BioArray High-Yield RNA Transcript Labeling Kit (Affymetrix, Santa Clara, CA, USA). Labeled cRNA was fragmented and hybridized to Test 3 chips (Affymetrix), before hybridization to murine genome U74Av2 Genechips (Affymetrix). Hybridization involved 0.05 mg/ml biotin-labelled cRNA in hybridization buffer (100 mM MES, 1 N [Na+], 20 mM EDTA, 0.01% Tween 20) supplemented with 0.1 mg/ml herring sperm DNA and 0.5 mg/ml acetylated bovine serum albumin for 16 h. Washing and staining with streptavidin-phycoerythrin were performed on the fluidics station according to the manufacturer’s instructions (Affymetrix). The quality of the two arrays was confirmed on the basis of low background (STX3: 55.76; 2RL22: 58.56), high overall % genes expressed indicative of high-quality RNA (STX3: 52.8%; 2RL22: 50.5%), low noise (RawQ) (STX3: 2.300; 2RL22: 2.210) and signal intensity ratios of 3′/5′ probe sets for housekeeping genes of ~1.0 (β-actin: STX3=1.08, 2RL22=1.29; GAPDH: STX3=1.02, 2RL22=1.01).

Data mining for retrieval of specific data sets
Scanned images of murine genome U74Av2 genechips (Affymetrix) hybridized with labelled cRNA prepared from STX3 or 2RL22 RNA were processed using Affymetrix Microarray Suite 5.0 software (MAS 5.0). This analysis generated a text file of probe set entries, P-values and signal values for STX3 and 2RL22, as well as partial annotation of probe sets representing individual genes. In MAS 5.0, the detection P-value is calculated from the distribution of the test statistic of the one-sided Wilcoxon Signed Rank Test. The P-value is the probability, under the null hypothesis, that the test statistic is at least as extreme as the observed value. Microsoft Excel or scripts written using Interactive Data Language (IDL) software (http://www.ittvis.com) were used to calculate mean and median signal values for data sets.
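The discreteness of the detection P-value follows from the test itself: with a fixed number of probe pairs, the signed-rank statistic can take only a finite set of values. The Python sketch below computes an exact one-sided signed-rank P-value by enumerating the null distribution. The discrimination score R = (PM − MM)/(PM + MM) and the threshold tau = 0.015 are taken from Affymetrix's description of the MAS 5.0 detection algorithm rather than from this paper, and the 16 probe pairs, function names and intensity values are illustrative assumptions.

```python
def signed_rank_p(diffs):
    """Exact one-sided P-value of the Wilcoxon signed-rank test.

    Tests H1: differences tend to be positive. Zero differences are dropped
    and tied absolute values receive average ranks.
    """
    d = [x for x in diffs if x != 0]
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1          # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)

    # Null distribution: each rank is positive with probability 1/2.
    # Ranks are doubled so that average ranks stay integers.
    counts = {0: 1}
    for r in ranks:
        step = int(round(2 * r))
        nxt = {}
        for s, c in counts.items():
            nxt[s] = nxt.get(s, 0) + c
            nxt[s + step] = nxt.get(s + step, 0) + c
        counts = nxt
    target = int(round(2 * w_plus))
    at_least = sum(c for s, c in counts.items() if s >= target)
    return at_least / 2 ** n


def detection_p(pm, mm, tau=0.015):
    """Detection P-value from perfect-match and mismatch intensities."""
    scores = [(p - m) / (p + m) for p, m in zip(pm, mm)]
    return signed_rank_p([r - tau for r in scores])


# Invented intensities for 16 probe pairs, every PM well above its MM.
pm = [100.0 + 10.0 * i for i in range(16)]
mm = [50.0] * 16
print(detection_p(pm, mm))   # smallest attainable value for 16 pairs: 1 / 2**16
```

With n probe pairs the P-value can only be a multiple of 1/2^n, which is consistent with the alignment of P-values at discrete levels described for Figure 1b.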
Data sets were made for probe sets according to a range of different selection criteria. Probe sets within each data set were manually annotated with gene name and the Gene Ontology categories of ‘Biological Process’, ‘Cellular Component’ and ‘Molecular Function’, available on the websites of Mouse Genome Informatics (www.informatics.jax.org), Affymetrix database (www.affymetrix.com) and NCBI database (http://www.ncbi.nlm.nih.gov). The Conserved Domain database10 was used in a few cases to predict gene function. Statistical analysis was performed on signal values of data sets using the Z-test, chosen because of the skewed distribution of signal values.

ACKNOWLEDGEMENTS

This work was supported by funding from the Australian National University to HO and TO. GD was supported by a PhD scholarship from the Fonds de la Recherche en Santé du Québec.

1 Ni K, O’Neill HC. Long-term stromal cultures produce dendritic-like cells. Br J Haematol 1997; 97: 710–725.
2 Ni K, O’Neill HC. Spleen stromal cells support haemopoiesis and in vitro growth of dendritic cells from bone marrow. Br J Haematol 1999; 105: 58–67.
3 Wilson HL, Ni K, O’Neill HC. Identification of progenitor cells in long-term spleen stromal cultures that produce immature dendritic cells. Proc Natl Acad Sci USA 2000; 97: 4784–4789.
4 Wilson HL, Ni K, O’Neill HC. Proliferation of dendritic cell progenitors in long term culture is not dependent on granulocyte macrophage-colony stimulating factor. Exp Hematol 2000; 28: 193–202.
5 Ni K, O’Neill HC. Hemopoiesis in long-term stroma-dependent cultures from lymphoid tissues: production of cells with myeloid/dendritic characteristics. In vitro Cell Dev Biol Anim 1998; 34: 298–307.
6 Despars G, Ni K, Bouchard A, O’Neill TJ, O’Neill HC. Molecular definition of an in vitro niche for dendritic cell development. Exp Hematol 2004; 32: 1182–1193.
7 de Reyniès A, Geromin D, Cayuela JM, Petel F, Dessen P, Sigaux F et al. Comparison of the latest commercial short and long oligonucleotide microarray technologies. BMC Genomics 2006; 15: 51.
8 Li C, Wong WH. Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application. Genome Biol 2001; 2, research 0032.1–0032.11.
9 Sasik R, Calvo E, Corbeil J. Statistical analysis of high-density oligonucleotide arrays: a multiplicative noise model. Bioinformatics 2002; 18: 1633–1640.
10 Marchler-Bauer A, Anderson JB, DeWeese-Scott C, Fedorova ND, Geer LY, He S et al. CDD: a curated Entrez database of conserved domain alignments. Nucleic Acids Res 2003; 31: 383–387.