A platform for querying breast and prostate cancer-related microRNA genes
Shun-Tsung Chen, Hsing-Fang Wu, Ka-Lok Ng*
Department of Biomedical Informatics
Asia University
Taiwan 41354
*corresponding author: [email protected]
Abstract—Recent studies indicate that microRNA may play an
important role in human cancer, where microRNA targets tumor
suppressor genes or oncogenes. To study this hypothesis, breast
and prostate cancers are selected as illustrations. Differentially
expressed genes (DEGs) are identified by using the Bioconductor
package. By integrating three complementary resources, that is,
cancerous genes, microRNA target genes, and cancer-related
microRNA databases, it is found that certain cancer-related
DEGs are regulated by microRNAs. These findings suggest a
potential relationship between those microRNAs and DEGs,
which deserves further in-vitro investigation. An
integrated platform has been set up that provides a user-friendly
interface for query, http://ppi.bioinfo.asia.edu.tw/R_cancer/ .
Keywords- differentially expressed genes; microRNA; breast
cancer; prostate cancer; Bioconductor; Significance Analysis of
Microarray, Empirical Bayes Analysis of Microarrays and
Empirical Bayes Method
I. INTRODUCTION
Microarray technology allows for high-throughput
screening and analysis of tens of thousands of genes at the same
time. Some genes are activated or inhibited (called
differentially expressed genes, DEGs) by certain regulatory
factors, resulting in changes in gene expression levels of a
few fold, ten fold or more. Given sets of cancer microarray
data, one can identify DEGs among a large number of expressed
genes, and study the disease mechanisms associated with
these DEGs.
MicroRNAs (miRNAs) are a class of small non-coding
RNAs that hybridize to their target mRNA sequence in the
3’-untranslated region, and induce either translation repression or
mRNA degradation. Studies show that the abnormal expression
of genes often results from the regulation of miRNAs. Recent
works indicate that miRNAs could play an important role in
human cancer, where miRNAs target oncogenes (OCG) or tumor
suppressor genes (TSG) to regulate gene expression [1-5]. In
this paper, we propose to study the role of miRNAs in
carcinogenesis.
There are many microarray data analysis methods, such as
using the concept of false discovery rate (FDR) to screen for
significant genes [6], using ANOVA to explore the impact of
a single factor on microarray gene expression values [7],
and clustering analysis [8]. In this study, three statistical
methods, i.e. Significance Analysis of Microarray (SAM) [9-10],
Empirical Bayes Analysis of Microarrays (EBAM) [11],
and empirical Bayes statistics for differential expression
(eBayes) [12], are employed to screen DEGs. The publicly
available microarray data analysis package Bioconductor [13-14]
is adopted to perform such calculations. We selected breast
cancer and prostate cancer as our study cases in this work.
Three complementary resources, (i) the cancerous gene
database, Tumor Associated Gene (TAG) [15], (ii) the miRNA
target gene database, ncRNAppi [16], and (iii) the cancer-related
microRNA database, miR2Disease [17], are used to
look for miRNAs which regulate cancer-related genes. The
main advantage of the present platform's miRNA-mRNA
targeting information is that all the target gene information
and disease records are experimentally verified,
high-confidence records.
II. INPUT DATA AND METHODS
A. Input data
The microarray data for breast cancer and prostate cancer
were downloaded with experiment IDs E-GEOD-9574 and
E-GEO[...]. E-GEOD-9574, a 133A microarray, compared gene
expression between 15 samples of histologically normal breast
epithelium from breast cancer patients and 14 samples of
cancer-free controls. The 15 samples were drawn from
epithelia adjacent to a breast tumor, and the 14 samples were
obtained from patients undergoing reduction mammoplasty
without apparent breast cancer. [...] 504 hybridizations, including
data from several kinds of normal tissues and cancer tissue.
Among these 504 arrays, 64 arrays are from prostate
cancer tissue and 18 arrays are from normal prostate tissue samples.
B. Bioconductor packages – SAM, EBAM and eBayes
SAM is a statistical method for identifying DEGs by
comparing two or more groups of samples. It uses repeated
permutations of the data to estimate False Discovery Rate
(FDR) based on observed versus expected score, which is
obtained from randomized data. A gene which has an
observed score that deviates significantly from the expected
score is considered a DEG. EBAM performs one- and two-class
analyses using either a modified t-statistic or a
standardized Wilcoxon rank statistic, and a multiclass analysis
using a modified F-statistic. Moreover, this function provides
an EBAM procedure for categorical data such as SNP data and
the possibility of employing a user-written score function. The
eBayes algorithm computes moderated t-statistics, moderated
F-statistic, and log-odds of differential expression by
empirical Bayes shrinkage of the standard errors towards a
common value.
Since both the breast and prostate cancer microarray data
are obtained from two different groups of patients, a two-class
unpaired test is adopted in the SAM, EBAM and eBayes
analysis.
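As a rough illustration of the two-class unpaired statistics named above, the sketch below computes a SAM-style relative difference and an eBayes-style moderated t-statistic for a single gene. It is not the Bioconductor implementation: the fudge factor `s0` and the prior parameters `d0` and `s0_sq` are supplied by hand here (an assumption made for brevity), whereas the packages estimate them from all genes on the array, and the function names are hypothetical.

```python
import math


def pooled_stats(group_a, group_b):
    """Return (mean difference, pooled variance, degrees of freedom, 1/n_a + 1/n_b)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Within-group sums of squares, pooled over both classes.
    ss = sum((x - mean_a) ** 2 for x in group_a) + sum((x - mean_b) ** 2 for x in group_b)
    df = n_a + n_b - 2
    return mean_b - mean_a, ss / df, df, 1.0 / n_a + 1.0 / n_b


def sam_relative_difference(group_a, group_b, s0):
    """SAM-style d = (mean_b - mean_a) / (s + s0); s0 is a small non-negative fudge factor."""
    diff, var, _, v = pooled_stats(group_a, group_b)
    return diff / (math.sqrt(var * v) + s0)


def moderated_t(group_a, group_b, d0, s0_sq):
    """eBayes-style t: the gene-wise variance is shrunk towards the prior value s0_sq."""
    diff, var, df, v = pooled_stats(group_a, group_b)
    shrunk_var = (d0 * s0_sq + df * var) / (d0 + df)
    return diff / math.sqrt(shrunk_var * v)


if __name__ == "__main__":
    controls = [1.0, 2.0, 3.0]  # made-up expression values for one gene
    cases = [4.0, 5.0, 6.0]
    print(sam_relative_difference(controls, cases, s0=0.5))
    print(moderated_t(controls, cases, d0=4.0, s0_sq=0.5))
```

With `s0 = 0` and `d0 = 0` both functions reduce to the ordinary two-sample t-statistic, which shows what the fudge factor and the shrinkage add: they prevent genes with very small variances from dominating the ranking.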
C. Databases integration
The Tumor Associated Gene database (TAG) has collected
a total of 655 tumor-associated genes, including 245
oncogenes (OCG), 259 tumor suppressor genes (TSG) [...].
ncRNAppi (Non-coding RNA protein-protein interaction)
is a useful tool for identifying ncRNA targeting pathways
(http://ncrnappi.cs.nthu.edu.tw/). It offers all possible sub-paths
from miRNA and siRNA target genes by using protein-protein
interactions.
miR2Disease is a manually curated database, which aims at
providing a comprehensive resource for miRNA deregulation
in various human diseases (http://www.mir2disease.org/). Each
entry in the miR2Disease contains detailed information on a
miRNA-disease relationship, including miRNA ID, disease
name, a brief description of the miRNA-disease relationship,
miRNA expression pattern in the disease state, detection
method for miRNA expression, experimentally verified
miRNA target genes, and literature reference.
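The integration step can be pictured as three look-ups per gene. The sketch below is a minimal illustration rather than the platform's code: the dictionary layout and the function names are assumptions, and the few records are copied from the prostate cancer results reported in this paper.

```python
# Toy excerpts of the three resources; the layout is an assumption for illustration.
TAG_ROLE = {"FOS": "OCG", "AXL": "OCG", "ETS2": "OCG", "TGFBR2": "TSG"}
NCRNAPPI_MIRNA = {"AXL": ["miR-1"], "TGFBR2": ["miR-20a"], "PLP2": ["miR-124"]}
MIR2DISEASE_MIRNA = {"FOS": ["miR-101"], "TGFBR2": ["miR-20a"]}


def annotate_degs(degs):
    """Attach the TAG role and the upstream miRNAs from each resource to every DEG."""
    rows = []
    for gene in degs:
        rows.append({
            "gene": gene,
            "tag": TAG_ROLE.get(gene, "n/a"),
            "ncRNAppi": NCRNAPPI_MIRNA.get(gene, []),
            "miR2Disease": MIR2DISEASE_MIRNA.get(gene, []),
        })
    return rows


def mirna_regulated(rows):
    """Keep only the DEGs with at least one upstream miRNA in either resource."""
    return [row["gene"] for row in rows if row["ncRNAppi"] or row["miR2Disease"]]


if __name__ == "__main__":
    rows = annotate_degs(["FOS", "AXL", "TGFBR2", "PLP2", "ETS2"])
    for row in rows:
        print(row)
    print(mirna_regulated(rows))
```

Keeping the three resources as separate look-ups means a gene that is missing from one of them is still reported, with "n/a" or an empty list in the corresponding column.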
III. RESULTS
In this section, the results of DEGs predicted by the
Bioconductor package are reported for both breast cancer and
prostate cancer.
A. Breast cancer
Using a delta (Δ) value of 1.14, SAM predicted 84 DEGs
after removing redundant probe IDs, where the FDR is 0.053.
Fig. 1 shows the SAM plot.
Figure 1. SAM plot for breast cancer DEGs
Using a delta value of 0.909, EBAM predicted 100 DEGs.
The EBAM plot is shown in Fig. 2. The z-score cutoffs for up-
and down-regulated DEGs are 2.833 and 2.518, respectively.
After removing redundant probe IDs, 87 DEGs are left with
an FDR of 0.04. The Jaccard coefficient of the overlap between
the SAM and EBAM predictions is 50%, i.e. 57/(84+87-57).
Besides this difference, the ranks of the DEGs determined by
SAM and EBAM are slightly different.
Figure 2. EBAM plot for breast cancer DEGs, Delta = 0.909.
Predicted DEGs are validated by using the TAG database,
and the prediction accuracies of SAM and EBAM are
compared. It is found that SAM and EBAM predicted 18
(21.4%) and 16 (18.4%) TAG-validated genes, respectively. This
result suggests that SAM performs slightly better (by 3
percentage points) than EBAM in identifying cancer-related
genes. Table I lists the DEGs obtained by SAM and EBAM, and
validated by the cancer gene database, TAG.

TABLE I. DEGS PREDICTED BY SAM AND EBAM AND VALIDATED BY TAG
Category | SAM prediction | EBAM prediction
TSG | EIF1, BTG2, FANCG, EGR1, CYLD, CEBPD, KLF6 | EIF1, BTG2, FANCG, EGR1, CYLD, CEBPD, KLF6
OCG | JUN, FOS, JUND, KLF4, LYN, MLF1, CD24, GNAS | JUN, FOS, JUND, KLF4, LYN
Others | PTP4A1, EMP1, PMAIP1, CLDN1, H3F3B | PTP4A1, EMP1, PMAIP1, AKR7A2

To further annotate the oncogenic or tumor suppressor role
of the rest of the predicted DEGs, their gene symbols were
manually input into PubMed to search for relevant publications.
Cancer-related DEGs were queried against NCBI PubMed, the
ncRNAppi and miR2Disease databases to identify their
upstream miRNAs. The results are shown in Table II.

TABLE II. CANCER-RELATED MIRNA IDENTIFIED BY LITERATURE MINING, USING NCRNAPPI AND MIR2DISEASE
Gene | TAG | ncRNAppi | miR2Disease
JUN | OCG | miR-30 | n/a
FOS | OCG | n/a | miR-101
CD24 | OCG | miR-373 | n/a
MCL1 | OCG [19-20] | miR-29b | miR-29b, 133b, 101, 153, 204, 320, 512-5p
PTGS2 | OCG [21-22] | let-7b, miR-16 | n/a
ZEB1 | OCG [23-24] | miR-200 (a,b,c), 429 | miR-141, 200 (a,b,c), 205, 429
CLDN1 | TSG [25-26] | miR-155 | n/a
CYR61 | TSG [27-28] | miR-30a-3p | n/a
H3F3B | n/a | miR-1 | n/a
n/a denotes information is not available
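The overlap and validation figures quoted for the breast cancer data can be rechecked with a few lines of arithmetic. The sketch below uses only the counts reported in this section (84 SAM genes, 87 EBAM genes, 57 shared, and 18 and 16 validated genes); the function names are illustrative.

```python
def jaccard_from_counts(n_a, n_b, n_shared):
    """Jaccard coefficient |A and B| / |A or B| computed from the set sizes."""
    return n_shared / (n_a + n_b - n_shared)


def jaccard(genes_a, genes_b):
    """Same coefficient computed directly from two collections of gene symbols."""
    set_a, set_b = set(genes_a), set(genes_b)
    return len(set_a & set_b) / len(set_a | set_b)


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    print(jaccard_from_counts(84, 87, 57))  # 0.5
    print(percent(18, 84))                  # 21.4
    print(percent(16, 87))                  # 18.4
```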
Tissue-specific information for cancer-related miRNAs,
which is provided by miR2Disease, is summarized in Table III.
It is found that certain miRNAs which target the DEGs
predicted by SAM or EBAM are indeed breast tissue specific,
suggesting these miRNAs can potentially play an oncogenic or
tumor suppressor role in breast cancer.
TABLE III. CANCER TISSUE SPECIFIC MIRNA AND ITS DOWNSTREAM REGULATED GENE
miRNA | cancer tissue type | regulated DEG
miR-29b | breast | MCL1
miR-30a-3p | breast, colon, lung | CYR61
miR-155 | breast, colon, liver, lung, ovary | CLDN1
miR-200(a,b,c) | breast, liver, lung, ovary | ZEB1
miR-429 | breast, lung, ovary | ZEB1
miR-373 | breast, head and neck squamous cell (HNSC), prostate | CD24

B. Prostate cancer
The 100 most significant DEGs predicted by eBayes, which
consist of up- and down-regulated genes, are considered. The
adjusted p-values of these DEGs are less than 1.9 × 10^-5.
Among these 100 DEGs, 15 genes are removed due to repeated
probe IDs. For the remaining 85 DEGs, there are 16 known
cancer genes (18.8%) recorded by TAG and the PubMed service.
The results are summarized in Table IV.

TABLE IV. THE RESULTS OF MATCHING THE TOP 85 DEGS WITH TAG
OCG (5) | TSG (8) | Others (3)
AXL, ETS2, JUND, LMO2, FOS | TGFBR3, PPP1R3C, GPX3, RARRES1, GAS1, BIN1, TGFBR2, [...] | CSPG4, RSU1, [...]

Genes PLP2, CD59 and GJA1 are not collected by TAG;
however, PLP2 is recorded by the ncRNAppi database,
whereas the literature suggests that GJA1 is a TSG [29] and
CD59 [30] is associated with prostate cancer.
Again, cancer-related DEGs are queried against NCBI
PubMed, the ncRNAppi and miR2Disease databases. The
results are shown in Table V.

TABLE V. CANCER-RELATED MIRNA IDENTIFIED BY LITERATURE MINING, USING NCRNAPPI AND MIR2DISEASE
Gene | TAG | ncRNAppi | miR2Disease
FOS | OCG | n/a | miR-101
AXL | OCG | miR-1 | n/a
TGFBR2 | TSG | miR-20a | miR-20a
PLP2 | n/a | miR-124 | n/a
CD59 | other | miR-124 | n/a
GJA1 | TSG | miR-1, miR-206 | miR-1, miR-206
n/a denotes information is not available; other denotes cancer-related gene

It was found that FOS, AXL and TGFBR2 are regulated
by miR-101, miR-1 and miR-20a, respectively; both PLP2
and CD59 are regulated by miR-124; and GJA1 is regulated by
both miR-1 and miR-206. Results for tissue-specific
cancer-related miRNAs are given in Table VI. Literature records
indicate that both miR-101 and miR-20a are associated with
prostate cancer. FOS plays the role of an OCG in prostate cancer
[31], but this information is not recorded by miR2Disease.

TABLE VI. CANCER TISSUE SPECIFIC MIRNA AND ITS DOWNSTREAM REGULATED GENE
miRNA | cancer tissue type | regulated DEG
miR-101 | prostate, ovary, renal, liver, bladder, HNSC | FOS
miR-20a | prostate, breast, colon, kidney, liver, lung, ovary, pancreatic | TGFBR2
miR-1 | colon, liver, lung, HNSC | AXL
miR-124 | liver | PLP2
miR-124 | liver | CD59
miR-206 | breast, ovary | GJA1
miR-1 | colon, liver, lung, HNSC | GJA1

TGFBR2 is identified as a miR-20a target gene by
miR2Disease [32]; this suggests that miR-20a participates in
human prostate cancer by regulating TGFBR2. CD59 is
associated with prostate cancer, but no association with
cancer could be identified for its regulator miR-124. In
addition, the literature indicates that AXL,
which is regulated by miR-1, is expressed in prostate cancer
[33]. This suggests that AXL is probably closely related to
prostate cancer.
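Removing redundant probe IDs, as done for both data sets, amounts to keeping one entry per gene. The paper does not say which probe is retained, so the sketch below assumes the list is already ranked by significance and keeps the first (most significant) probe for each gene symbol; the probe IDs and the function name are made up for illustration.

```python
def collapse_probes(ranked_probes):
    """Keep the first probe seen for each gene symbol.

    ranked_probes: list of (probe_id, gene_symbol) pairs, most significant first.
    """
    seen = set()
    unique = []
    for probe_id, gene in ranked_probes:
        if gene in seen:
            continue  # a more significant probe for this gene was already kept
        seen.add(gene)
        unique.append((probe_id, gene))
    return unique


if __name__ == "__main__":
    ranked = [
        ("probe_01", "FOS"),
        ("probe_02", "AXL"),
        ("probe_03", "FOS"),   # second probe for FOS, dropped
        ("probe_04", "GJA1"),
    ]
    unique = collapse_probes(ranked)
    print(unique)
    print(len(ranked) - len(unique), "redundant probe(s) removed")
```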
An integrated platform has been set up to provide a user-friendly
interface for query, http://ppi.bioinfo.asia.edu.tw/R_cancer/.
This website provides genetic, miRNA and tissue information for
(i) breast cancer DEGs predicted by SAM and EBAM, and (ii) prostate
cancer DEGs predicted by eBayes. This platform will serve as
a useful resource for studying the oncogenic and tumor
suppressor roles of miRNAs.
IV. SUMMARY AND CONCLUSION
In this study, the Bioconductor package is adopted to
identify DEGs for breast as well as prostate cancer from
microarray data. Our results suggest that SAM, EBAM and
eBayes achieve a similar level of cancer gene prediction
accuracy, i.e. around 20%. By integrating the ncRNAppi and
miR2Disease databases, it is found that certain DEGs are
regulated by miRNAs. Breast and prostate tissue-specific
cancer-related miRNAs are identified, suggesting that these
miRNAs can potentially play an oncogenic or tumor suppressor
role in cancer. Also, a few of the identified cancer-related
miRNAs are verified in the literature, thus indicating
the effectiveness of the present approach.
ACKNOWLEDGMENT
The work of Drs. Ka-Lok Ng and Shan-Chin Lee is supported by
the National Science Council of R.O.C. under grant
NSC 99-2221-E-468-016-MY2. We thank Tim Williams, Asia
University, for providing an English proofreading service for
this article.
REFERENCES
[1] C. Z. Chen, “MicroRNAs as Oncogenes and Tumor Suppressors,” New Eng. J. Med, 353, pp.1768-1771, 2005.
[2] R. Garzon, M. Fabbri, A. Cimmino, G. A. Calin, and C. M. Croce, “MicroRNA expression and function in cancer,” Trends Mol Med, 12, pp.580-587, 2006.
[3] A. Esquela-Kerscher and F. J. Slack, “Oncomirs - microRNAs with a role in cancer,” Nature Reviews Cancer, 6, pp.259-269, 2006.
[4] P. M. Voorhoeve, “MicroRNAs: Oncogenes, tumor suppressors or master regulators of cancer heterogeneity?,” Biochim Biophys Acta, 1805, pp.72-86, 2010.
[5] B. Zhang, X. Pan, G. P. Cobb, and T. A. Anderson, “microRNAs as oncogenes and tumor suppressors,” Develop Biol, 302(1), pp.1-12, 2007.
[6] B. Efron and R. Tibshirani, “Empirical bayes methods and false discovery rates for microarrays,” Genet Epidemiol, 23(1), pp.70-86, 2002.
[7] M. K. Kerr, C. A. Afshari, B. Lee, P. Bushel, J. Martinez, N. J. Walker et al., “Statistical analysis of a gene expression microarray experiment with replication,” Statistica Sinica, 12, pp.203-217, 2002.
[8] L. J. van 't Veer, H. Dai, M. J. van de Vijver, Y. D. He, A. A. M. Hart, M. Mao et al., “Gene expression profiling predicts clinical outcome of breast cancer,” Nature, 415, pp.530-536, 2002.
[9] V. Tusher, R. Tibshirani, and G. Chu, “Significance analysis of microarrays applied to the ionizing radiation response,” Proc Natl Acad Sci U.S.A., 98(9), pp.5116-5121, 2001.
[10] S. Zhang, “A comprehensive evaluation of SAM, the SAM R-package and a simple modification to improve its performance,” BMC Bioinformatics, 8, 230, 2007.
[11] B. Efron, R. Tibshirani, J. Storey, and V. Tusher, “Empirical Bayes Analysis of a Microarray Experiment,” Journal of the American Statistical Association, 96(456), pp.1151-1160, 2001.
[12] B. Efron, “Robbins, empirical Bayes and microarrays,” Annals of Statistics, 31(2), pp.366-378, 2003.
[13] http://www.bioconductor.org
[14] R. Irizarry, “From CEL Files to Annotated Lists of Interesting Genes. In: Bioinformatics and Computational Biology Solutions using R and Bioconductor,” pp.431-442, 2005.
[15] H. H. Chan, “Identification of novel tumor-associated gene (TAG) by bioinformatics analysis,” MSc Thesis, National Cheng Kung University, Institute of Molecular Medicine, Taiwan, 2006.
[16] K. L. Ng, H. C. Liu, and S. C. Lee, “ncRNAppi – A tool for identifying disease-related miRNA and siRNA targeting pathways,” Bioinfo., 25(23), pp.3199-3201, 2009.
[17] Q. Jiang, Y. Wang, Y. Hao, L. Juan, M. Teng, X. Zhang et al., “miR2Disease: a manually curated database for microRNA deregulation in human disease,” Nucl Acids Res, 37, D98-104, 2009.
[18] http://www.ebi.ac.uk/arrayexpress
[19] C. Kempkensteffen, S. Hinz, M. Johannsen, H. Krause, A. Magheli, F. Christoph et al., “Expression of Mcl-1 splicing variants in clear-cell renal cancer and their correlation with histopathological parameters and prognosis,” Tumour Biol., 30(2), pp.73-79, 2009.
[20] M. Sano, Y. Nakanishi, H. Yagasaki, T. Honma, T. Oinuma, Y. Obana et al., “Overexpression of anti-apoptotic Mcl-1 in testicular germ cell tumours,” Histopathology, 46(5), pp.532-539, 2005.
[21] C. Fordyce, T. Fessenden, C. Pickering, J. Jung, V. Singla, H. Berman et al., “DNA damage drives an activin A-dependent induction of cyclooxygenase-2 in premalignant cells and lesions,” Cancer Pre. Res, 3(2), pp.190-201, 2010.
[22] A. Lucci, S. Krishnamurthy, B. Singh, I. Bedrosian, F. Meric-Bernstam, J. Reuben et al., “Cyclooxygenase-2 expression in primary breast cancers predicts dissemination of cancer cells to the bone marrow,” Breast Cancer Res., 117(1), pp.61-68, 2009.
[23] J. Clarhaut, R. M. Gemmill, V. A. Potiron, S. Ait-Si-Ali, J. Imbert, H. A. Drabkin et al., “ZEB-1, a repressor of the semaphorin 3F tumor suppressor gene in lung cancer cells,” Neoplasia, 11(2), pp.157-166, 2009.
[24] U. Wellner, J. Schubert, U. C. Burk, O. Schmalhofer, F. Zhu, A. Sonntag et al., “The EMT-activator ZEB1 promotes tumorigenicity by repressing stemness-inhibiting microRNAs,” Nat. Cell Biol., pp.1487-1495, 2009.
[25] T. L. Chang, K. Ito, T. K. Ko, Q. Liu, M. Salto-Tellez, K. G. Yeoh et al., “Claudin-1 has tumor suppressive activity and is a direct target of RUNX3 in gastric epithelial cells,” Gastroenterology, 138(1), pp.255-265, 2010.
[26] S. Morohashi, T. Kusumi, F. Sato, H. Odagiri, H. Chiba, S. Yoshihara et al., “Decreased expression of claudin-1 correlates with recurrence status in breast cancer,” Int J Mol Med., 20(2), pp.139-143, 2007.
[27] W. Chien, T. Kumagai, C. W. Miller, J. C. Desmond, J. M. Frank, J. W. Said et al., “Cyr61 suppresses growth of human endometrial cancer cells,” J. Biol. Chem., 279(51), pp.53087-53096, 2004.
[28] A. S. Dobroff, H. Wang, V. O. Melnikova, G. J. Villares, M. Zigler, L. Huang et al., “Silencing cAMP-response element-binding protein (CREB) identifies CYR61 as a tumor suppressor gene in melanoma,” J. Biol. Chem., 284(38), pp.26194-26206, 2009.
[29] K. Shima, T. Muramatsu, Y. Abiko, Y. Yamaoka, H. Sasaki, and M. Shimono, “Connexin 43 transfection in basaloid squamous cell carcinoma cells,” PubMed PMID: 16820904, 16(2), pp.285-288, 2006.
[30] A. A. Babiker, B. Nilsson, G. Ronquist, L. Carlsson, and K. N. Ekdahl, “Transfer of functional prostasomal CD59 of metastatic prostatic cancer cell origin protects cells against complement attack,” PubMed PMID: 15389819, 62(2), pp.105-114, 2005.
[31] B. Collado, M. G. Sánchez, I. Díaz-Laviada, J. C. Prieto, and M. J. Carmena, “Vasoactive intestinal peptide (VIP) induces c-fos expression in LNCaP prostate cancer cells through a mechanism that involves Ca2+ signalling. Implications in angiogenesis and neuroendocrine differentiation,” Biochimica et Biophysica Acta, 1744(2), pp.224-233, 2005.
[32] S. Volinia, G. A. Calin, C. G. Liu, S. Ambs, A. Cimmino, F. Petrocca et al., “A microRNA expression signature of human solid tumors defines cancer gene targets,” Proc Natl Acad Sci U.S.A., 103(7), pp.2257-2261, 2006.
[33] P. P. Sainaghi, L. Castello, L. Bergamasco, M. Galletti, P. Bellosta, and G. C. Avanzi, “Gas6 induces proliferation in prostate carcinoma cell lines expressing the Axl receptor,” J Cell Physiol, 204(1), pp.36-44, 2005.