Prognostic microRNA Signature of Triple-Negative Breast Cancer Identified by Cross-Validated Cox Model Development

Jianying Zhang, PhD
Charles Shapiro, MD
Center for Biostatistics, Department of Biomedical Informatics
Ohio State University, Columbus, OH

Background - TNBC
• ER (estrogen receptor), PR (progesterone receptor), HER2/neu (human epidermal growth factor receptor 2) all negative
• About 170K new TNBC diagnoses per year (worldwide), accounting for 12-24% of all breast cancers
• Poor prognosis: chemo-resistance and early recurrence
• Limited treatment options

Background - microRNA
• Small (19-25 nucleotide), non-coding RNAs
• Regulatory role: reduce the abundance and translational efficiency of mRNAs
• microRNAs play key roles in cancer progression and drug resistance
• oncomiRs regulate tumor suppressor genes
• A microRNA signature is needed to predict the prognosis of TNBC

Research Goal
• Identify a miR signature for good prognosis of TNBC over time, especially within 2-5 years
• Dichotomize patients into high- vs low-risk groups
• Primary endpoint: time-to-progression (recurrence/death)

TNBC Cohort
• 159 TNBCs from the Wexner Medical Center of the Ohio State University with first diagnosis from 1985-2005 (4 before 1995)
• Both survival/recurrence data and valid miRNA profiling were available
• Demographic, pathologic, and clinical outcome data were collected from the Information Warehouse

Clinical Data Summary
Variable | Level | Progression-free | Recurrence/death | Progression rate
Chemo | multi/single-agent | 66 | 37 | 36%
Chemo | no chemo | 11 | 6 | 35%
Race | White | 69 | 39 | 36%
Race | Other | 7 | 4 | 36%
Age (mean: 51) | 50- | 35 | 23 | 40%
Age (mean: 51) | 50+ | 42 | 20 | 32%
Lymph nodes | Negative | 50 | 19 | 28%
Lymph nodes | Positive | 26 | 20 | 43%
Recurrence | recurrence | 0 | 32 | 100%
Recurrence | no recurrence | 77 | 11 | 13%
Death | Alive | 77 | 2 | 3%
Death | Dead | 0 | 41 | 100%
Grade | II | 11 | 2 | 15%
Grade | III | 64 | 38 | 37%
Grade | IV | 1 | 0 | 0%
Grade | Unknown | 0 | 3 | 100%
Menopause status | Yes | 27 | 17 | 39%
Menopause status | No | 49 | 22 | 31%
Menopause status | Unknown | 1 | 4 | 80%
P value (Cox model), in extracted order (row assignment not recoverable): 0.985, 0, 0.554, 0.048, 0.702, 0.11

miRNA Profiling
• Patient primary tumor tissues at diagnosis/surgery
• NanoString nCounter platform for miR profiling
• 6 positive controls, 8 negative controls, 5 housekeeping controls, and 734 miRs
• Technical normalization by a normalization factor (NF) calculated from the positive controls
• Samples with out-of-range NF were excluded
• Low-expressed miRs were filtered out by a negative-control threshold (398 miRs remained)
• Global normalization: quantile normalization

Statistical Challenges/Pitfalls
Model over-fitting: a model fits the training data too well while fitting the test data poorly.
Reasons:
• More predictors were selected than necessary
• High-dimensional data: number of predictors p >> sample size n
• The same data were used to develop the model and to assess its performance
• The training set is too small or too noisy

[Figure: (A) re-substitution estimates and (B) cross-validated estimates. Simon et al 2011]

Solution: Cross-Validation (CV)
• K-fold CV
  - The training data are split at random into K equal partitions
  - Each partition (1 to K) takes the test-set role in turn
  - The remaining K-1 partitions play the training-set role: build the model
  - The model prediction (risk score for Cox) is evaluated on the test set
  - After all K folds have run, overall model performance is evaluated: cross-validated AUC, ROC, survival curves (KM), etc.
• Repeat the above K-fold CV many times (e.g. 100 times)
Simon et al 2011

Model Building - Cox Regression
• Feature Screening/Selection (FS)
  - Univariate Cox regression
  - Univariate two-group comparison
  - Clustering
  - Principal component analysis (PCA)
  - Linear discriminant analysis (LDA)
• Model Selection (MS)
  - Traditional stepwise selection
  - Penalized likelihood or shrinkage methods (Lasso/Ridge, etc.)
  - Machine learning

Cox Proportional Hazard Model
• Semi-parametric: for observation i at time t,
  $h_i(t) = h_0(t) \exp(\beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip})$, where p is the number of predictors
• Linear model: $f(x_i) = \log(h_i(t)/h_0(t)) = \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip}$
• $h_i(t)/h_0(t)$: hazard ratio, which does not depend on t (proportional hazards)
• $f(x) = B'X$ is called the linear risk score function, where $B = (\beta_1, \beta_2, \dots, \beta_p)$ for p predictors

Parameter Estimate
• OLS (ordinary least squares)
  Minimize: $\sum_{i=1}^{n} \left( f(x_i) - x_i'\beta \right)^2$
• LASSO (least absolute shrinkage and selection operator)
  Maximize Cox's partial likelihood:
  $L(\beta) = \prod_{r \in D} \frac{\exp(\beta' x_r)}{\sum_{j \in R_r} \exp(\beta' x_j)}$
  where D is the set of event indices and $R_r$ is the set of indices at risk at the event time of r
  Subject to: $\sum_{j=1}^{p} |\beta_j| \le s$

Model/Variable Selection
• Forward (AIC)
• Backward (AIC)
• Stepwise (AIC)
• Penalized or shrinkage method (Lasso)
• Cross-validation is used to find the optimal tuning parameter (maximum likelihood)

Model Assessment (MA)
• Continuous response: mean-square error = bias² + variance
• Binary/categorical response: AUC (area under the ROC curve), sensitivity/specificity
• Time-to-event response: time-dependent ROC/AUC (Heagerty et al, 2000)

Time-Dependent ROC
• ROC and AUC are estimated at a landmark time t
• Sensitivity: Pr[M ≥ c | T ≤ t]
• Specificity: Pr[M < c | T > t]
  T: survival time; M: test value (risk score); c: threshold of positivity
• Two methods:
  - KM: Kaplan-Meier
  - NNE: nearest neighbor estimate
Heagerty et al 2000

CV Modeling on TNBC Data
[Flowchart]
N=159 -> 5 exclusions (older than 80 or no PF) -> N=154 -> Training Set (N=120) and Validation Set (N=34)
Training set: 5-fold CV, 100 iterations -> final model
• FS: FS0: no FS; FS1: univariate Cox; FS2: progression vs progression-free
• MS: MS1: CV stepwise; MS2: CV Lasso
• Other covariates (NODES positive): add or skip
• MA: time-dependent ROC/AUC; risk classification
• Risk score prediction -> cross-validated ROC/AUC/risk classification

[Figure: Cross-validated AUC by KM (100 runs). x-axis: CV AUC (KM), 0.3 to 0.8; model-building procedures (Cox-Stepwise, Cox-Lasso variants) grouped as FS0, FS1, FS2, FS2 + NODES, and NODES only]

[Figure: Cross-validated AUC by NNE (100 runs). x-axis: CV AUC (NNE), 0.4 to 0.8; same procedures and grouping]

High-Frequency miRs
[Figure: two bar charts of selection frequency; x-axis: Freq (/500) each miR, 0 to 400. miRs are listed in the order extracted from each panel.]
• T-FS w/ FC 1.15, Stepwise CV, plus NODES (scaled): hsa.miR.1913, hsa.miR.361.5p, hsa.miR.1305, hsa.miR.566, hsa.miR.623, hsa.miR.1280, hsa.miR.516a.3p, hsa.miR.146b.5p, hsa.miR.548a.3p, hsa.miR.423.5p, hsa.miR.205, hsa.miR.2115, hsa.let.7c, hsa.miR.199a.5p, hsa.miR.122, hsa.miR.1252, hsa.miR.874, hsa.miR.1253, hsa.miR.365, hsa.miR.525.5p, hsa.miR.578, hsa.miR.1286, hsa.miR.16, hsa.miR.1285, hsa.miR.142.3p, hsa.miR.1307, hsa.miR.142.5p, kshv.miR.K12.11, hsa.miR.363, hsa.miR.155
• T-FS w/ FC 1.15, Stepwise CV, plus NODES (unscaled): hsa.miR.1201, hsa.miR.31, hsa.miR.361.5p, hsa.miR.197, hsa.miR.1979, hsa.miR.613, hsa.miR.362.5p, hsa.miR.1280, hsa.miR.146b.5p, hsa.miR.1305, hsa.miR.423.5p, hsa.miR.2115, hsa.miR.516a.3p, hsa.miR.205, hsa.miR.199a.5p, hsa.let.7c, hsa.miR.1286, hsa.miR.874, hsa.miR.365, hsa.miR.578, hsa.miR.1253, hsa.miR.525.5p, hsa.miR.1285, hsa.miR.16, hsa.miR.142.3p, hsa.miR.1307, hsa.miR.142.5p, kshv.miR.K12.11, hsa.miR.363, hsa.miR.155

Significance Test of the AUC
• Hypothesis: AUC = 0.5 vs AUC > 0.5
• 500 random permutations of the time-to-event and event status
• Five-fold cross-validated AUC was calculated within each permutation based on
the candidate model development procedures
• For example, for the procedure with the highest mean AUC:
  P(AUC >= 0.61) = 0.05
  P(AUC >= 0.63) = 0.04
  P(AUC >= 0.68) = 0.032

Optimal Model
• The full training set (n=120) was used for modeling
• Feature selection/screening by comparing progression vs progression-free with FC>1.15 and P<0.06
• 5-fold cross-validated stepwise Cox regression was used for model selection
• The 5-fold CV stepwise selection was repeated 100 times to pick the most frequent model
• The miRs in the most frequent model, plus NODES, were used to fit the final Cox model
• The final Cox model was evaluated on the validation data set

Final model - Full train set
• Five miRs were selected: miR-363, miR-155, miR-142.5p, kshv.miR.K12.11, miR-1307

# miRs | miRs | freq
5 | miR-363 + miR-155 + miR-142.5p + kshv.miR.K12.11 + miR-1307 | 27
4 | miR-363 + miR-155 + miR-142.5p + kshv.miR.K12.11 | 24
2 | miR-363 + miR-155 | 13

IPA Enrichment Pathway Analysis
IPA - Diseases or Functions

Categories | Diseases or Functions Annotation | # Molecules
Cancer, Hematological Disease, Immunological Disease, Organismal Injury and Abnormalities | acute lymphocytic leukemia | 2
Cancer, Gastrointestinal Disease, Organismal Injury and Abnormalities | advanced colorectal adenoma | 1
Cancer, Gastrointestinal Disease, Organismal Injury and Abnormalities | advanced Dukes' stage colorectal cancer | 1
Cancer, Hematological Disease, Immunological Disease, Organismal Injury and Abnormalities | ALK fusion negative anaplastic large cell lymphoma | 1
Cancer, Hematological Disease, Immunological Disease, Organismal Injury and Abnormalities | B-cell lymphoma | 2
Cancer, Organismal Injury and Abnormalities, Reproductive System Disease | breast cancer | 3
Cancer, Hematological Disease, Immunological Disease, Organismal Injury and Abnormalities | Burkitt's lymphoma | 1
Cancer, Organismal Injury and Abnormalities, Reproductive System Disease | early stage invasive cervical squamous cell carcinoma | 3
Cancer, Organismal Injury and Abnormalities | EPCAM positive tumor | 1
Cancer, Organismal Injury and Abnormalities | growth of tumor | 2
Cancer, Gastrointestinal Disease, Hepatic System Disease, Organismal Injury and Abnormalities | hepatocellular carcinoma | 2
Cancer, Hematological Disease, Immunological Disease, Organismal Injury and Abnormalities | Hodgkin's disease | 1
Cancer, Endocrine System Disorders, Organismal Injury and Abnormalities, Respiratory Disease | large cell lung cancer | 1
Cancer, Gastrointestinal Disease, Hepatic System Disease, Organismal Injury and Abnormalities | liver metastasis | 1
Cancer, Gastrointestinal Disease, Organismal Injury and Abnormalities | metastatic colorectal cancer | 1
Cancer, Organismal Injury and Abnormalities | metastatic melanoma cancer | 3
Cancer, Hematological Disease, Immunological Disease, Organismal Injury and Abnormalities | multiple myeloma | 1
Cancer, Neurological Disease, Organismal Injury and Abnormalities | neuroblastoma | 1
Cancer, Endocrine System Disorders, Organismal Injury and Abnormalities, Respiratory Disease | neuroendocrine lung cancer | 2
Cancer, Organismal Injury and Abnormalities, Respiratory Disease | non-small cell lung cancer | 3
Cancer, Endocrine System Disorders, Gastrointestinal Disease, Organismal Injury and Abnormalities | pancreatic ductal adenocarcinoma | 1
Cancer, Hematological Disease, Immunological Disease, Organismal Injury and Abnormalities | precursor T-cell lymphoblastic leukemia-lymphoma | 1
Cancer, Hematological Disease, Immunological Disease, Neurological Disease, Organismal Injury and Abnormalities | primary central nervous system lymphoma | 1
Cancer, Organismal Injury and Abnormalities | primary melanoma | 3
Cancer, Organismal Injury and Abnormalities, Reproductive System Disease | progesterone receptor positive breast carcinoma | 1
Cancer, Cellular Development, Cellular Growth and Proliferation, Organismal Injury and Abnormalities, Respiratory Disease, Tumor Morphology | proliferation of lung adenocarcinoma cells | 1
Cancer, Hematological Disease, Immunological Disease, Organismal Injury and Abnormalities, Tissue Morphology | quantity of lymphoma | 1
Cancer, Hematological Disease, Immunological Disease, Organismal Injury and Abnormalities | sinonasal natural killer/T-cell lymphoma | 3
Cancer, Endocrine System Disorders, Organismal Injury and Abnormalities, Respiratory Disease | small cell lung cancer | 1
Cancer, Organismal Injury and Abnormalities, Respiratory Disease | squamous cell lung cancer | 2
Cancer, Gastrointestinal Disease, Organismal Injury and Abnormalities | stage 2 colorectal cancer | 1
Cancer, Organismal Injury and Abnormalities, Reproductive System Disease, Skeletal and Muscular Disorders | uterine leiomyoma | 1

Recent Published Evidence

miR | Findings | Author/year | Cancer
hsa.miR.155 | miR-155 suppresses ErbB2-induced malignant transformation of breast epithelial cells | He et al, 2016 | breast cancer
hsa.miR.155 | miR-155 represses matrix Gla protein to promote oncogenic signals in MCF-7 cells | Tiago et al, 2016 | breast cancer cell line
hsa.miR.155 | miR-155 is one of the oncomiRs in the treatment of breast cancer | Hemmatzadeh et al, 2016 | breast cancer
hsa.miR.155 | Loss of miR-155 enhances C/EBP-β-mediated MDSC infiltration and tumor growth | Kim et al, 2016 | breast cancer
hsa.miR.155 | miR-155 expression change is associated with breast cancer lymph-node metastasis | Petrovic et al, 2016 | breast cancer
hsa-miR-142-5p | Autoregulatory response to oncogenic drivers for good prognosis for melanoma | Tembe et al, 2014 | melanoma
hsa-miR-142-5p | One of the prognostic biomarkers in metastatic melanoma | Jayawardana et al, 2016 | melanoma
hsa-miR-363 | miR-363-3p inhibits the epithelial-to-mesenchymal transition and suppresses metastasis by targeting Sox4 | Hu et al, 2016 | colorectal cancer
hsa-miR-363 | Oncogenic cAMP responsive element binding protein 1 is overexpressed upon loss of tumor-suppressive miR-363-3p | Li et al, 2016 | renal cancer
hsa-miR-363 | Tumor suppressor role of miR-363-3p: inhibits cell growth/migration by targeting NOTCH1 | Song et al, 2015 | gastric cancer
hsa-miR-363 | Anti-proliferative properties | Khuu et al, 2016 | carcinoma
hsa-miR-363 | Suppresses breast tumor growth and metastasis | Beltran et al, 2011 | breast cancer
hsa-miR-363 | Sensitizes cisplatin-induced apoptosis by targeting Mcl-1 | Zhang et al, 2014 | breast cancer
kshv-miR-K12-11 | Induces non-cell-autonomous target gene regulation by viral oncomiR spreading between B and T cells | Rainy et al, 2016 | Kaposi sarcoma
kshv-miR-K12-11 | Promotes B-cell expansion in vivo | Dahlke et al, 2012 | Kaposi sarcoma
kshv-miR-K12-11 | Encoded by a herpesvirus as an ortholog of miR-155 in B cell lymphoma development | Skalsky et al, 2007 | Kaposi's sarcoma
miR-1307 | Up-regulated in chemoresistant epithelial ovarian cancer tissues, but not associated with lymph node metastasis | Zhou et al, 2015 | ovarian cancer

Final Model Validation (n=34)

Event: Recurrence/death
Final model | AUC (2-year) | AUC (3-year) | AUC (5-year) | P value (log-rank test) by validation median | P value (log-rank test) by train median
Scaled data | 0.84 | 0.72 | 0.77 | 0.065 | 0.0008
Unscaled data | 0.87 | 0.79 | 0.83 | 0.065 | 0.023

Event: Recurrence
Final model | AUC (2-year) | AUC (3-year) | AUC (5-year) | P value (log-rank test) by validation median | P value (log-rank test) by train median
Scaled data | 0.79 | 0.67 | 0.67 | 0.028 | 0.337
Unscaled data | 0.79 | 0.68 | 0.68 | 0.028 | 0.154

[Figure: Validation: ROC of 5-year on the 5-miR+NODES model. x-axis: 1-specificity; y-axis: sensitivity; AUC = 0.79]
Sensitivity = P(high risk | event at time t); specificity = P(low risk | event-free at time t). At a risk score cutoff of -0.04, sensitivity = 80% and specificity = 64%.

Validation Set
[Figure: two Kaplan-Meier panels, high-risk vs low-risk groups. x-axis: Time to event (months), 0 to 150; y-axis: Event-Free Probability. Cutoff: median of train risk score, P = 0.023. Cutoff: median of validation risk score, P = 0.065.]

Ongoing …
• Any other predictive covariates missing?
• Non-molecular predictors parallel to miRNA predictors?
• Any covariates competing with miRNA biomarkers?
• Should covariates be forced into the model, or take part in variable selection?
• Does the "final" model still need further variable selection in terms of its parameter estimates and P values?
• More model selection methods?
• Will the recommended model comparison procedure work for other public/published data?
• Can the proposed miRNA signature be validated in other independent studies?

Reference
1. Heagerty PJ, Lumley T, Pepe MS. Time dependent ROC curves for censored survival data and a diagnostic marker. Biometrics. 2000;56:337-44.
2. Simon RM, Subramanian J, Li MC, Menezes S. Using cross-validation to evaluate predictive accuracy of survival risk classifiers based on high-dimensional data. Briefings in Bioinformatics. 2011;12(3):203-214.
3. Tibshirani R. The lasso method for variable selection in the Cox model. Statist. Med. 1997;16:385-395.
4. He XH, Zhu W, Yuan P, Jiang S, Li D, Zhang HW, Liu MF. miR-155 downregulates ErbB2 and suppresses ErbB2-induced malignant transformation of breast epithelial cells. Oncogene. 2016 Nov 17;35(46):6015-6025.
5. Tiago DM, Conceição N, Caiado H, Laizé V, Cancela ML. Matrix Gla protein repression by miR-155 promotes oncogenic signals in breast cancer MCF-7 cells. FEBS Lett. 2016 Apr;590(8):1234-41.
6. Hemmatzadeh M, Mohammadi H, Jadidi-Niaragh F, Asghari F, Yousefi M. The role of oncomirs in the pathogenesis and treatment of breast cancer. Biomed Pharmacother. 2016 Mar;78:129-39.
7. Kim S, Song JH, Kim S, Qu P, Martin BK, Sehareen WS, Haines DC, Lin PC, Sharan SK, Chang S. Loss of oncogenic miR-155 in tumor cells promotes tumor growth by enhancing C/EBP-β-mediated MDSC infiltration. Oncotarget. 2016 Mar 8;7(10):11094-112.
8. Petrović N, Kolaković A, Stanković A, Lukić S, Řami A, Ivković M, Mandušić V. MiR-155 expression level changes might be associated with initial phases of breast cancer pathogenesis and lymph-node metastasis. Cancer Biomark. 2016;16(3):385-94.
9. Tembe V, Schramm SJ, Stark MS, et al. microRNA and mRNA expression profiling in metastatic melanoma reveal associations with BRAF mutation and patient prognosis. Pigment Cell Melanoma Res. 2014;28:254-266.
10. Jayawardana K, Schramm SJ, Tembe V, Mueller S, Thompson JF, Scolyer RA, Mann GJ, Yang J. Identification, Review, and Systematic Cross-Validation of microRNA Prognostic Signatures in Metastatic Melanoma. J Invest Dermatol. 2016 Jan;136(1):245-54.
11. Zhang R, Li Y, Dong X, Peng L, Nie X. MiR-363 sensitizes cisplatin-induced apoptosis targeting in Mcl-1 in breast cancer. Med Oncol. 2014 Dec;31(12):347.
12. Beltran AS, Russo A, Lara H, Fan C, Lizardi PM, Blancafort P. Suppression of breast tumor growth and metastasis by an engineered transcription factor. PLoS One. 2011;6(9):e24595.
13. Hu F, Min J, Cao X, Liu L, Ge Z, Hu J, Li X. MiR-363-3p inhibits the epithelial-to-mesenchymal transition and suppresses metastasis in colorectal cancer by targeting Sox4. Biochem Biophys Res Commun. 2016 May 20;474(1):35-42.
14. Khuu C, Sehic A, Eide L, Osmundsen H. Anti-proliferative properties of miR-20b and miR-363 from the miR-106a-363 cluster on human carcinoma cells. Microrna. 2016 Mar 22.
15. Song B, Yan J, Liu C, Zhou H, Zheng Y. Tumor Suppressor Role of miR-363-3p in Gastric Cancer. Med Sci Monit. 2015 Dec 28;21:4074-80.
16. Li Y, Chen D, Li Y, Jin L, Liu J, Su Z, Qi Z, Shi M, Jiang Z, Ni L, Yang S, Gui Y, Mao X, Chen Y, Lai Y. Oncogenic cAMP responsive element binding protein 1 is overexpressed upon loss of tumor suppressive miR-10b-5p and miR-363-3p in renal cancer. Oncol Rep. 2016 Apr;35(4):1967-78.
17. Rainy N, Zayoud M, Kloog Y, Rechavi O, Goldstein I. Viral oncomiR spreading between B and T cells is employed by Kaposi's sarcoma herpesvirus to induce non-cell-autonomous target gene regulation. Oncotarget. 2016 Jul 5;7(27):41870-41884.
18. Dahlke C, Maul K, Christalla T, Walz N, Schult P, Stocking C, Grundhoff A. A microRNA encoded by Kaposi sarcoma-associated herpesvirus promotes B-cell expansion in vivo. PLoS One. 2012;7(11):e49435.
19. Skalsky RL, Samols MA, Plaisance KB, Boss IW, Riva A, Lopez MC, Baker HV, Renne R. Kaposi's sarcoma-associated herpesvirus encodes an ortholog of miR-155. J Virol. 2007 Dec;81(23):12836-45.
20. Zhou Y, Wang M, Wu J, Jie Z, Chang S, Shuang T. The clinicopathological significance of miR-1307 in chemotherapy resistant epithelial ovarian cancer. J Ovarian Res. 2015 Apr 9;8:23.
21. Shimomura A, Shiino S, Kawauchi J, Takizawa S, Sakamoto H, Matsuzaki J, Ono M, Takeshita F, Niida S, Shimizu C, Fujiwara Y, Kinoshita T, Tamura K, Ochiya T. Novel combination of serum microRNA for detecting breast cancer in the early stage. Cancer Sci. 2016 Mar;107(3):326-34.
22. Cordero F, Ferrero G, Polidoro S, Fiorito G, Campanella G, Sacerdote C, Mattiello A, Masala G, Agnoli C, Frasca G, Panico S, Palli D, Krogh V, Tumino R, Vineis P, Naccarati A. Differentially methylated microRNAs in prediagnostic samples of subjects who developed breast cancer in the European Prospective Investigation into Nutrition and Cancer (EPIC-Italy) cohort. Carcinogenesis. 2015 Oct;36(10):1144-53.

Thanks to:
Dr. Charles L. Shapiro
Medicine, Hematology and Medical Oncology
Mount Sinai Health System, New York, NY

James Cancer Hospital and Solove Research Institute
The Ohio State University, Columbus, OH
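
Supplementary sketch: repeated K-fold CV
The procedure on the "Solution: Cross-Validation (CV)" slide (random split into K partitions, each partition held out in turn, the whole run repeated many times) can be illustrated with a minimal Python sketch. This is not the code used in the study. The helper names (`kfold_partitions`, `cross_validated_predictions`, `fit_mean`, `predict_mean`) are hypothetical, and a trivial "mean of the training values" model stands in for the Cox model and its risk score.

```python
import random


def kfold_partitions(n, k, seed=None):
    """Split indices 0..n-1 at random into k partitions of nearly equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Partition i takes every k-th shuffled index, so sizes differ by at most 1.
    return [sorted(idx[i::k]) for i in range(k)]


def cross_validated_predictions(y, k, fit, predict, seed=None):
    """Return one out-of-fold prediction per sample."""
    n = len(y)
    preds = [None] * n
    for test in kfold_partitions(n, k, seed):
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        model = fit([y[i] for i in train])   # build the model on K-1 partitions
        for i in test:
            preds[i] = predict(model, i)     # predict only on the held-out partition
    return preds


# Hypothetical stand-in model: predicts the mean of the training values.
def fit_mean(train_values):
    return sum(train_values) / len(train_values)


def predict_mean(model, _index):
    return model


# Toy data and 100 repeated 5-fold runs, each with its own random split.
y = [float(v) for v in range(20)]
runs = [cross_validated_predictions(y, 5, fit_mean, predict_mean, seed=s)
        for s in range(100)]
```

Because every prediction comes from a model that never saw that sample, performance measures computed from these out-of-fold predictions avoid the optimism of re-substitution estimates.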
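
Supplementary sketch: Cox partial likelihood with a lasso penalty
The LASSO item on the "Parameter Estimate" slide maximizes Cox's partial likelihood subject to a bound on the sum of |β_j|. The Python sketch below evaluates the log partial likelihood and, as an assumption, the equivalent penalized (Lagrangian) form with a tuning parameter `lam` in place of the bound s. Function names and the toy data are hypothetical. Tied event times are handled in the simplest way: each event uses the full risk set at its own time.

```python
import math


def cox_partial_loglik(beta, X, time, event):
    """log L(beta) = sum over events r of [beta'x_r - log sum_{j in R_r} exp(beta'x_j)]."""
    # Linear risk score beta'x for every subject.
    eta = [sum(b * v for b, v in zip(beta, row)) for row in X]
    total = 0.0
    for r in range(len(time)):
        if not event[r]:
            continue  # censored subjects contribute only through risk sets
        # Risk set R_r: everyone still under observation at the event time of r.
        risk_set = [j for j in range(len(time)) if time[j] >= time[r]]
        total += eta[r] - math.log(sum(math.exp(eta[j]) for j in risk_set))
    return total


def l1_norm(beta):
    return sum(abs(b) for b in beta)


def lasso_objective(beta, X, time, event, lam):
    """Penalized log partial likelihood (assumed Lagrangian form of the constraint)."""
    return cox_partial_loglik(beta, X, time, event) - lam * l1_norm(beta)


# Toy data: one predictor, three subjects, all with an observed event.
X = [[0.0], [1.0], [2.0]]
time = [1.0, 2.0, 3.0]
event = [1, 1, 1]

# At beta = 0 every subject in a risk set is equally likely to fail,
# so log L = -(log 3 + log 2 + log 1) = -log 6.
loglik_at_zero = cox_partial_loglik([0.0], X, time, event)
```

In practice the tuning parameter would be chosen by cross-validation, as the "Model/Variable Selection" slide states; that search is outside the scope of this sketch.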