A platform for querying breast and prostate cancer-related microRNA genes

Shun-Tsung Chen, Hsing-Fang Wu, Ka-Lok Ng*
Department of Biomedical Informatics, Asia University, Taiwan 41354
*corresponding author: [email protected]

Abstract - Recent studies indicate that microRNA may play an important role in human cancer, where microRNA targets tumor suppressor genes or oncogenes. To study this hypothesis, breast and prostate cancers are selected as illustrations. Differentially expressed genes (DEGs) are identified by using the Bioconductor package. By integrating three complementary resources, that is, cancerous gene, microRNA target gene, and cancer-related microRNA databases, it is found that certain cancer-related DEGs are regulated by microRNAs. These findings suggest a potential relationship between those microRNAs and DEGs, which deserves further in-vitro investigation. An integrated platform has been set up that provides a user-friendly query interface, http://ppi.bioinfo.asia.edu.tw/R_cancer/ .

Keywords - differentially expressed genes; microRNA; breast cancer; prostate cancer; Bioconductor; Significance Analysis of Microarray; Empirical Bayes Analysis of Microarrays; Empirical Bayes Method

I. INTRODUCTION

Microarray technology allows high-throughput screening and analysis of tens of thousands of genes at the same time. Some genes are activated or inhibited by certain regulatory factors, so that their expression levels change by a few fold, ten fold or more; such genes are called differentially expressed genes (DEGs). Given sets of cancer microarray data, one can identify DEGs among a large number of expressed genes and study the disease mechanisms associated with these DEGs.

MicroRNAs (miRNAs) are a class of small non-coding RNAs that hybridize to their target mRNA sequence in the 3' untranslated region and induce either translational repression or mRNA degradation. Studies show that abnormal gene expression often results from regulation by miRNAs. Recent works indicate that miRNAs could play an important role in human cancer, where miRNAs target oncogenes (OCG) or tumor suppressor genes (TSG) to regulate gene expression [1-5]. In this paper, we propose to study the role of miRNAs in carcinogenesis.

There are many microarray data analysis methods, such as using the concept of false discovery rate (FDR) to screen for significant genes [6], using ANOVA to explore the effect of a single factor on microarray gene expression values [7], and clustering analysis [8]. In this study, three statistical methods, i.e. Significance Analysis of Microarray (SAM) [9-10], Empirical Bayes Analysis of Microarrays (EBAM) [11], and empirical Bayes statistics for differential expression (eBayes) [12], are employed to screen DEGs. The publicly available microarray data analysis package Bioconductor [13-14] is adopted to perform these calculations.

We selected breast cancer and prostate cancer as our study cases in this work. Three complementary resources, (i) the cancerous gene database, Tumor Associated Gene (TAG) [15], (ii) the miRNA target gene database, ncRNAppi [16], and (iii) the cancer-related microRNA database, miR2Disease [17], are used to look for miRNAs which regulate cancer-related genes. The main advantage of the present platform is that all of its miRNA-mRNA targeting information and disease records are high-confidence, experimentally verified records.

II. INPUT DATA AND METHODS

A. Input data

The microarray data for breast cancer and prostate cancer were downloaded from ArrayExpress [18], with experiment IDs E-GEOD-9574 and E-GEO... . E-GEOD-9574, a 133A microarray data set, compared gene expression between 15 samples of histologically normal breast epithelium from breast cancer patients and 14 samples of cancer-free controls.
The 15 samples were drawn from epithelia adjacent to a breast tumor, and the 14 samples were obtained from patients undergoing reduction mammoplasty without apparent breast cancer. The data set used for prostate cancer contains 504 hybridizations, including data from several kinds of normal and cancer tissues. Among these 504 arrays, 64 are from prostate cancer tissue and 18 are from normal prostate tissue.

B. Bioconductor packages - SAM, EBAM and eBayes

SAM is a statistical method for identifying DEGs by comparing two or more groups of samples. It uses repeated permutations of the data to estimate the False Discovery Rate (FDR), based on the observed score versus the expected score obtained from the randomized data. A gene whose observed score deviates significantly from the expected score is considered a DEG.

EBAM performs one- and two-class analyses using either a modified t-statistic or a standardized Wilcoxon rank statistic, and a multiclass analysis using a modified F-statistic. Moreover, this function provides an EBAM procedure for categorical data such as SNP data, and the possibility of employing a user-written score function.

The eBayes algorithm computes moderated t-statistics, moderated F-statistics, and log-odds of differential expression by empirical Bayes shrinkage of the standard errors towards a common value.

Since both the breast and prostate cancer microarray data are obtained from two different groups of patients, a two-class unpaired test is adopted in the SAM, EBAM and eBayes analyses.

C. Databases integration

The Tumor Associated Gene database (TAG) has collected a total of 655 tumor-associated genes, including 245 oncogenes (OCG) and 259 tumor suppressor genes (TSG).

ncRNAppi (Non-coding RNA protein-protein interaction) is a tool for identifying ncRNA targeting pathways (http://ncrnappi.cs.nthu.edu.tw/). It offers all possible sub-paths from miRNA and siRNA target genes by using protein-protein interactions.
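The permutation idea behind SAM's FDR estimate (Section II.B) can be illustrated with a deliberately simplified sketch. It is not the SAM implementation used in this work: the fudge constant `s0`, the fixed score threshold, the averaging over permutations, and the toy expression values are all assumptions made for illustration, and real SAM compares ordered observed scores with ordered expected scores using a delta cutoff.

```python
import random
from statistics import mean


def sam_like_score(x, y, s0=0.1):
    """Two-class unpaired score d = (mean(x) - mean(y)) / (s + s0).

    s is the pooled standard error; s0 is an assumed small constant that
    keeps genes with tiny variance from dominating the ranking.
    """
    nx, ny = len(x), len(y)
    mx, my = mean(x), mean(y)
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    s = ((1 / nx + 1 / ny) * ss / (nx + ny - 2)) ** 0.5
    return (mx - my) / (s + s0)


def score_all(data, labels):
    """Score every gene for a given assignment of samples to classes."""
    scores = {}
    for gene, values in data.items():
        x = [v for v, lab in zip(values, labels) if lab == 1]
        y = [v for v, lab in zip(values, labels) if lab == 0]
        scores[gene] = sam_like_score(x, y)
    return scores


def permutation_fdr(data, labels, threshold, n_perm=200, seed=0):
    """Call genes with |d| >= threshold and estimate the FDR by permutation."""
    rng = random.Random(seed)
    observed = score_all(data, labels)
    called = sorted(g for g, d in observed.items() if abs(d) >= threshold)
    false_calls = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # break the link between samples and classes
        permuted = score_all(data, shuffled)
        false_calls.append(sum(abs(d) >= threshold for d in permuted.values()))
    # Expected number of false calls divided by the number of genes called.
    fdr = min(1.0, mean(false_calls) / len(called)) if called else 0.0
    return called, fdr


# Toy data (hypothetical): 5 samples of class 1 followed by 5 of class 0.
LABELS = [1] * 5 + [0] * 5
DATA = {
    "geneA": [5.0, 5.1, 4.9, 5.2, 5.0, 1.0, 1.1, 0.9, 1.2, 1.0],
    "geneB": [1.0, 1.2, 0.9, 1.1, 1.0, 1.1, 1.0, 1.0, 0.9, 1.2],
}

if __name__ == "__main__":
    genes, fdr = permutation_fdr(DATA, LABELS, threshold=5.0)
    print(genes, fdr)
```

In this toy setting only `geneA` separates the two classes, so it is the only gene called; the permutations show how rarely a score that large arises when the class labels carry no information.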
miR2Disease is a manually curated database, which aims at providing a comprehensive resource for miRNA deregulation in various human diseases (http://www.mir2disease.org/). Each entry in miR2Disease contains detailed information on a miRNA-disease relationship, including miRNA ID, disease name, a brief description of the miRNA-disease relationship, miRNA expression pattern in the disease state, detection method for miRNA expression, experimentally verified miRNA target genes, and literature reference.

III. RESULTS

In this section, the results of DEGs predicted by the Bioconductor package are reported for both breast cancer and prostate cancer.

A. Breast cancer

Using a delta (Δ) value of 1.14, SAM predicted 84 DEGs after removing redundant probe IDs, with an FDR of 0.053. Fig. 1 shows the SAM plot.

Figure 1. SAM plot for breast cancer DEGs.

Using a delta value of 0.909, EBAM predicted 100 DEGs. The EBAM plot is shown in Fig. 2. The z-score cutoffs for up- and down-regulated DEGs are 2.833 and 2.518 in magnitude, respectively. After removing redundant probe IDs, 87 DEGs are left, with an FDR of 0.04. The Jaccard coefficient of the overlap between the SAM and EBAM predictions is 50%, i.e. 57/(84 + 87 - 57) = 57/114. Besides this difference, the ranks of the DEGs determined by SAM and EBAM are slightly different.

Figure 2. EBAM plot for breast cancer DEGs, Delta = 0.909.

Predicted DEGs are validated by using the TAG database, and the prediction accuracies of SAM and EBAM are compared. It is found that 18 of the genes predicted by SAM (21.4%) and 16 of the genes predicted by EBAM (18.4%) are validated. This result suggests that SAM performs slightly better (by about 3 percentage points) than EBAM in identifying cancer-related genes. Table I lists the DEGs obtained by SAM and EBAM and validated by the cancer gene database, TAG.

TABLE I. DEGS PREDICTED BY SAM AND EBAM AND VALIDATED BY TAG

          SAM prediction                                  EBAM prediction
TSG       EIF1, BTG2, FANCG, EGR1, CYLD, CEBPD, KLF6      EIF1, BTG2, FANCG, EGR1, CYLD, CEBPD, KLF6
OCG       JUN, FOS, JUND, KLF4, LYN, MLF1, CD24, GNAS     JUN, FOS, JUND, KLF4, LYN
Others    PTP4A1, EMP1, PMAIP1, CLDN1, H3F3B              PTP4A1, EMP1, PMAIP1, AKR7A2

To further annotate the oncogenic or tumor suppressor role of the rest of the predicted DEGs, their gene symbols were manually entered into PubMed to search for relevant publications. Cancer-related DEGs were queried against NCBI PubMed and the ncRNAppi and miR2Disease databases to identify their upstream miRNAs. The results are shown in Table II.

TABLE II. CANCER-RELATED MIRNA IDENTIFIED BY LITERATURE MINING, USING NCRNAPPI AND MIR2DISEASE

Gene     TAG            ncRNAppi                 miR2Disease
JUN      OCG            miR-30                   n/a
FOS      OCG            n/a                      miR-101
CD24     OCG            miR-373                  n/a
MCL1     OCG [19-20]    miR-29b                  miR-29b, 133b, 101, 153, 204, 320, 512-5p
PTGS2    OCG [21-22]    let-7b, miR-16           n/a
ZEB1     OCG [23-24]    miR-200(a,b,c), 429      miR-141, 200(a,b,c), 205, 429
CLDN1    TSG [25-26]    miR-155                  n/a
CYR61    TSG [27-28]    miR-30a-3p               n/a
H3F3B    n/a            miR-1                    n/a

n/a denotes information is not available

Tissue-specific information for cancer-related miRNAs, which is provided by miR2Disease, is summarized in Table III. It is found that certain miRNAs which target the DEGs predicted by SAM or EBAM are indeed breast tissue specific, suggesting that these miRNAs can potentially play an oncogenic or tumor suppressor role in breast cancer.

TABLE III. CANCER TISSUE SPECIFIC MIRNA AND ITS DOWNSTREAM REGULATED GENE

miRNA             cancer tissue type                                         regulated DEG
miR-29b           breast                                                     MCL1
miR-30a-3p        breast, colon, lung                                        CYR61
miR-155           breast, colon, liver, lung, ovary                          CLDN1
miR-200(a,b,c)    breast, liver, lung, ovary                                 ZEB1
miR-429           breast, lung, ovary                                        ZEB1
miR-373           breast, head and neck squamous cell (HNSC), prostate       CD24

B. Prostate cancer

The 100 most significant DEGs predicted by eBayes, which consist of up- and down-regulated genes, were considered. The adjusted p-values of these DEGs are less than 1.9 × 10^-5. Among these 100 DEGs, 15 genes are removed due to repeated probe IDs. Of the remaining 85 DEGs, 16 are known cancer genes (18.8%) recorded by TAG and the PubMed service. The results are summarized in Table IV.

TABLE IV. THE RESULTS OF MATCHING THE TOP 85 DEGS WITH TAG

OCG (5)                  AXL, ETS2, JUND, LMO2, FOS
TSG (8), Others (3)      TGFBR3, PPP1R3C, GPX3, RARRES1, GAS1, BIN1, TGFBR2, CSPG4, RSU1, ...

Genes PLP2, CD59 and GJA1 are not collected by TAG; however, PLP2 is recorded by the ncRNAppi database, whereas the literature suggests that GJA1 is a TSG [29] and that CD59 is associated with prostate cancer [30]. Again, cancer-related DEGs are queried against NCBI PubMed and the ncRNAppi and miR2Disease databases. The results are shown in Table V.

TABLE V. CANCER-RELATED MIRNA IDENTIFIED BY LITERATURE MINING, USING NCRNAPPI AND MIR2DISEASE

Gene      TAG      ncRNAppi           miR2Disease
FOS       OCG      n/a                miR-101
AXL       OCG      miR-1              n/a
TGFBR2    TSG      miR-20a            miR-20a
PLP2      n/a      miR-124            n/a
CD59      other    miR-124            n/a
GJA1      TSG      miR-1, miR-206     miR-1, miR-206

n/a denotes information is not available; other denotes cancer-related gene

It was found that FOS, AXL and TGFBR2 are regulated by miR-101, miR-1 and miR-20a, respectively; both PLP2 and CD59 are regulated by miR-124; and GJA1 is regulated by both miR-1 and miR-206. Results for tissue-specific cancer-related miRNAs are given in Table VI.

TABLE VI. CANCER TISSUE SPECIFIC MIRNA AND ITS DOWNSTREAM REGULATED GENE

miRNA      cancer tissue type                                                 regulated DEG
miR-101    prostate, ovary, renal, liver, bladder, HNSC                       FOS
miR-20a    prostate, breast, colon, kidney, liver, lung, ovary, pancreatic    TGFBR2
miR-1      colon, liver, lung, HNSC                                           AXL
miR-124    liver                                                              PLP2
miR-124    liver                                                              CD59
miR-206    breast, ovary                                                      GJA1
miR-1      colon, liver, lung, HNSC                                           GJA1

Literature records indicate that both miR-101 and miR-20a are associated with prostate cancer. FOS plays the role of an OCG in prostate cancer [31], but this information is not recorded by miR2Disease. TGFBR2 is identified as a miR-20a target gene by miR2Disease [32], which indicates that miR-20a participates in human prostate cancer by regulating TGFBR2.
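The lookup behind Table V can be sketched as a merge of two gene-to-miRNA mappings. This is a minimal illustration, not the platform's actual code: the function name `integrate_targets` and the dictionary layout are hypothetical, and the entries are a small excerpt transcribed from Table V rather than the contents of the real ncRNAppi or miR2Disease databases.

```python
def integrate_targets(genes, ncrnappi, mir2disease):
    """For each gene, merge the miRNAs reported by the two resources.

    Returns a dict: gene -> {"miRNAs": union of both sources,
                             "both": miRNAs reported by both sources}.
    """
    merged = {}
    for gene in genes:
        from_ppi = set(ncrnappi.get(gene, ()))
        from_m2d = set(mir2disease.get(gene, ()))
        merged[gene] = {
            "miRNAs": sorted(from_ppi | from_m2d),
            "both": sorted(from_ppi & from_m2d),
        }
    return merged


# Excerpt of Table V (prostate cancer DEGs); a gene that is n/a in a
# resource is simply absent from that resource's dictionary.
NCRNAPPI = {
    "AXL": ["miR-1"],
    "TGFBR2": ["miR-20a"],
    "PLP2": ["miR-124"],
    "CD59": ["miR-124"],
}
MIR2DISEASE = {
    "FOS": ["miR-101"],
    "TGFBR2": ["miR-20a"],
}
DEGS = ["FOS", "AXL", "TGFBR2", "PLP2", "CD59"]

if __name__ == "__main__":
    for gene, hit in integrate_targets(DEGS, NCRNAPPI, MIR2DISEASE).items():
        note = " (reported by both resources)" if hit["both"] else ""
        print(f"{gene}: {', '.join(hit['miRNAs'])}{note}")
```

Keeping the union and the intersection separate makes it easy to see which miRNA-gene pairs, such as miR-20a and TGFBR2, are supported by both resources.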
CD59 is associated with prostate cancer, but no association with cancer could be identified for its regulator miR-124. In addition, the literature indicates that AXL, which is regulated by miR-1, is expressed in prostate cancer [33]. This suggests that AXL is probably closely related to prostate cancer.

An integrated platform has been set up to provide a user-friendly query interface, http://ppi.bioinfo.asia.edu.tw/R_cancer/. This website provides genetic, miRNA and tissue information for (i) breast cancer DEGs predicted by SAM and EBAM, and (ii) prostate cancer DEGs predicted by eBayes. This platform will serve as a useful resource for studying the oncogenic and tumor suppressor roles of miRNAs.

IV. SUMMARY AND CONCLUSION

In this study, the Bioconductor package is adopted to identify DEGs for breast as well as prostate cancer from microarray data. Our results suggest that SAM, EBAM and eBayes achieve a similar level of cancer gene prediction accuracy, i.e. around 20%. By integrating the ncRNAppi and miR2Disease databases, it is found that certain DEGs are regulated by miRNAs. Breast and prostate tissue-specific cancer-related miRNAs are identified, suggesting that these miRNAs can potentially play an oncogenic or tumor suppressor role in cancer. Also, a few of the identified cancer-related miRNAs are verified in the literature, which indicates the effectiveness of the present approach.

ACKNOWLEDGMENT

The work of Drs. Ka-Lok Ng and Shan-Chin Lee is supported by the National Science Council of R.O.C. under grant NSC 99-2221-E-468-016-MY2. We thank Tim Williams, Asia University, for providing an English proofreading service for this article.

REFERENCES

[1] C. Z. Chen, “MicroRNAs as Oncogenes and Tumor Suppressors,” New Eng. J. Med, 353, pp.1768-1771, 2005.
[2] R. Garzon, M. Fabbri, A. Cimmino, G. A. Calin, and C. M. Croce, “MicroRNA expression and function in cancer,” Trends Mol Med, 12, pp.580-587, 2006.
[3] A. Esquela-Kerscher and F. J. Slack, “Oncomirs - microRNAs with a role in cancer,” Nature Reviews Cancer, 6, pp.259-269, 2006.
[4] P. M. Voorhoeve, “MicroRNAs: Oncogenes, tumor suppressors or master regulators of cancer heterogeneity?,” Biochim Biophys Acta, 1805, pp.72-86, 2010.
[5] B. Zhang, X. Pan, G. P. Cobb, and T. A. Anderson, “microRNAs as oncogenes and tumor suppressors,” Develop Biol, 302(1), pp.1-12, 2007.
[6] B. Efron and R. Tibshirani, “Empirical bayes methods and false discovery rates for microarrays,” Genet Epidemiol, 23(1), pp.70-86, 2002.
[7] M. K. Kerr, C. A. Afshari, B. Lee, P. Bushel, J. Martinez, N. J. Walker et al., “Statistical analysis of a gene expression microarray experiment with replication,” Statistica Sinica, 12, pp.203-217, 2002.
[8] L. J. van 't Veer, H. Dai, M. J. van de Vijver, Y. D. He, A. A. M. Hart, M. Mao et al., “Gene expression profiling predicts clinical outcome of breast cancer,” Nature, 415, pp.530-536, 2002.
[9] V. Tusher, R. Tibshirani, and G. Chu, “Significance analysis of microarrays applied to the ionizing radiation response,” Proc Natl Acad Sci U.S.A., 98(9), pp.5116-5121, 2001.
[10] S. Zhang, “A comprehensive evaluation of SAM, the SAM R-package and a simple modification to improve its performance,” BMC Bioinformatics, 8, 230, 2007.
[11] B. Efron, R. Tibshirani, J. Storey, and V. Tusher, “Empirical Bayes Analysis of a Microarray Experiment,” Journal of the American Statistical Association, 96(456), pp.1151-1160, 2001.
[12] B. Efron, “Robbins, empirical Bayes and microarrays,” Annals of Statistics, 31(2), pp.366-378, 2003.
[13] http://www.bioconductor.org
[14] R. Irizarry, “From CEL Files to Annotated Lists of Interesting Genes,” in Bioinformatics and Computational Biology Solutions using R and Bioconductor, pp.431-442, 2005.
[15] H. H. Chan, “Identification of novel tumor-associated gene (TAG) by bioinformatics analysis,” MSc Thesis, National Cheng Kung University, Institute of Molecular Medicine, Taiwan, 2006.
[16] K. L. Ng, H. C. Liu, and S. C. Lee, “ncRNAppi - A tool for identifying disease-related miRNA and siRNA targeting pathways,” Bioinfo., 25(23), pp.3199-3201, 2009.
[17] Q. Jiang, Y. Wang, Y. Hao, L. Juan, M. Teng, X. Zhang et al., “miR2Disease: a manually curated database for microRNA deregulation in human disease,” Nucl Acids Res, 37, D98-104, 2009.
[18] http://www.ebi.ac.uk/arrayexpress
[19] C. Kempkensteffen, S. Hinz, M. Johannsen, H. Krause, A. Magheli, F. Christoph et al., “Expression of Mcl-1 splicing variants in clear-cell renal cancer and their correlation with histopathological parameters and prognosis,” Tumour Biol., 30(2), pp.73-79, 2009.
[20] M. Sano, Y. Nakanishi, H. Yagasaki, T. Honma, T. Oinuma, Y. Obana et al., “Overexpression of anti-apoptotic Mcl-1 in testicular germ cell tumours,” Histopathology, 46(5), pp.532-539, 2005.
[21] C. Fordyce, T. Fessenden, C. Pickering, J. Jung, V. Singla, H. Berman et al., “DNA damage drives an activin a-dependent induction of cyclooxygenase-2 in premalignant cells and lesions,” Cancer Prev. Res., 3(2), pp.190-201, 2010.
[22] A. Lucci, S. Krishnamurthy, B. Singh, I. Bedrosian, F. Meric-Bernstam, J. Reuben et al., “Cyclooxygenase-2 expression in primary breast cancers predicts dissemination of cancer cells to the bone marrow,” Breast Cancer Res., 117(1), pp.61-68, 2009.
[23] J. Clarhaut, R. M. Gemmill, V. A. Potiron, S. Ait-Si-Ali, J. Imbert, H. A. Drabkin et al., “ZEB-1, a repressor of the semaphorin 3F tumor suppressor gene in lung cancer cells,” Neoplasia, 11(2), pp.157-166, 2009.
[24] U. Wellner, J. Schubert, U. C. Burk, O. Schmalhofer, F. Zhu, A. Sonntag et al., “The EMT-activator ZEB1 promotes tumorigenicity by repressing stemness-inhibiting microRNAs,” Nat. Cell Biol., pp.1487-1495, 2009.
[25] T. L. Chang, K. Ito, T. K. Ko, Q. Liu, M. Salto-Tellez, K. G. Yeoh et al., “Claudin-1 has tumor suppressive activity and is a direct target of RUNX3 in gastric epithelial cells,” Gastroenterology, 138(1), pp.255-265, 2010.
[26] S. Morohashi, T. Kusumi, F. Sato, H. Odagiri, H. Chiba, S. Yoshihara et al., “Decreased expression of claudin-1 correlates with recurrence status in breast cancer,” Int J Mol Med., 20(2), pp.139-143, 2007.
[27] W. Chien, T. Kumagai, C. W. Miller, J. C. Desmond, J. M. Frank, J. W. Said et al., “Cyr61 suppresses growth of human endometrial cancer cells,” J. Biol. Chem., 279(51), pp.53087-53096, 2004.
[28] A. S. Dobroff, H. Wang, V. O. Melnikova, G. J. Villares, M. Zigler, L. Huang et al., “Silencing cAMP-response element-binding protein (CREB) identifies CYR61 as a tumor suppressor gene in melanoma,” J. Biol. Chem., 284(38), pp.26194-26206, 2009.
[29] K. Shima, T. Muramatsu, Y. Abiko, Y. Yamaoka, H. Sasaki, and M. Shimono, “Connexin 43 transfection in basaloid squamous cell carcinoma cells,” PMID: 16820904, 16(2), pp.285-288, 2006.
[30] A. A. Babiker, B. Nilsson, G. Ronquist, L. Carlsson, and K. N. Ekdahl, “Transfer of functional prostasomal CD59 of metastatic prostatic cancer cell origin protects cells against complement attack,” PMID: 15389819, 62(2), pp.105-114, 2005.
[31] B. Collado, M. G. Sánchez, I. Díaz-Laviada, J. C. Prieto, and M. J. Carmena, “Vasoactive intestinal peptide (VIP) induces c-fos expression in LNCaP prostate cancer cells through a mechanism that involves Ca2+ signalling. Implications in angiogenesis and neuroendocrine differentiation,” Biochimica et Biophysica Acta, 1744(2), pp.224-233, 2005.
[32] S. Volinia, G. A. Calin, C. G. Liu, S. Ambs, A. Cimmino, F. Petrocca et al., “A microRNA expression signature of human solid tumors defines cancer gene targets,” Proc Natl Acad Sci U.S.A., 103(7), pp.2257-2261, 2006.
[33] P. P. Sainaghi, L. Castello, L. Bergamasco, M. Galletti, P. Bellosta, and G. C. Avanzi, “Gas6 induces proliferation in prostate carcinoma cell lines expressing the Axl receptor,” J Cell Physiol, 204(1), pp.36-44, 2005.