ARTICLES
© 2008 Nature Publishing Group http://www.nature.com/naturemedicine
Stromal gene expression predicts clinical outcome in
breast cancer
Greg Finak1–3, Nicholas Bertos2,7, Francois Pepin1–3,7, Svetlana Sadekova2, Margarita Souleimanova2,
Hong Zhao2, Haiying Chen2, Gulbeyaz Omeroglu2, Sarkis Meterissian4, Atilla Omeroglu6, Michael Hallett1,3 &
Morag Park2,3,5
Although it is increasingly evident that cancer is influenced by signals emanating from tumor stroma, little is known regarding how
changes in stromal gene expression affect epithelial tumor progression. We used laser capture microdissection to compare gene
expression profiles of tumor stroma from 53 primary breast tumors and derived signatures strongly associated with clinical
outcome. We present a new stroma-derived prognostic predictor (SDPP) that stratifies disease outcome independently of standard
clinical prognostic factors and published expression-based predictors. The SDPP predicts outcome in several published whole
tumor–derived expression data sets, identifies poor-outcome individuals from multiple clinical subtypes, including lymph
node–negative tumors, and shows increased accuracy with respect to previously published predictors, especially for HER2-positive
tumors. Prognostic power increases substantially when the predictor is combined with existing outcome predictors. Genes
represented in the SDPP reveal the strong prognostic capacity of differential immune responses as well as angiogenic and
hypoxic responses, highlighting the importance of stromal biology in tumor progression.
Breast cancer is a major cause of morbidity and mortality in Western
countries1. Disease-related mortality has declined as a result of
earlier diagnoses and the availability of adjuvant therapies. The
identification of individuals at increased risk of recurrence, who
may then be selected for aggressive systemic therapy, remains a
considerable challenge. Recent technological advances permit
the genomic characterization of tumors, enhancing our understanding of cancer initiation and progression2–6. Gene expression
signatures have been identified that classify breast tumors into subtypes showing distinct expression profiles associated with specific
clinical characteristics2–8. Transcriptional signatures have been identified for estrogen receptor (ER)–positive (luminal), HER2-positive
(ERBB2-amplified), and ER-, progesterone receptor (PR)- and
HER2–negative (basal) breast cancer4,6, indicating that there exist
molecularly distinct subtypes of breast cancer with distinct clinical
outcomes. Genomics-based predictors of metastasis in breast cancer
have been developed for clinical use9. Although some investigators have focused their analyses on specific cell types (primarily
fibroblasts) or derived gene expression signatures from defined
cell types studied in vitro10–14, most prognostic gene expression
signatures and predictors have been derived from whole tissue
consisting of tumor epithelial cells and the surrounding stroma.
Samples judged to possess insufficient tumor epithelial cell content
are generally excluded2,3,5,6.
Under normal physiological conditions, stroma serves as an important barrier to epithelial cell transformation; the interplay between
epithelial cells and the microenvironment maintains epithelial polarity
and modulates growth inhibition15. However, the stromal compartment undergoes changes in response to emerging epithelial lesions and
can have a key role in cancer initiation and progression15,16. These
changes may include the recruitment of immune and endothelial cells
providing growth and matrix remodeling factors, as well as a new
blood supply promoting tumor growth and metastasis15–17. To date,
the mechanisms involved are poorly understood and their relationship
to disease outcome has not been addressed.
Gene expression in isolated tumor stroma from clinical breast
cancer samples has not been rigorously examined. Given the importance of stromal elements in modulating and driving cancer progression in animal models, it is crucial to elucidate the contribution of
stroma and understand the consequences of changes in this tissue
compartment as they relate to breast cancer outcome. In this study, we
have used laser capture microdissection (LCM) to isolate tumor
stroma and matched normal stroma from human breast cancers and
have used microarrays to identify stromal gene expression signatures.
To our surprise, whereas the gene expression signatures derived from
whole tumors generate clusters associated with ER and HER2
status2,4,6, differential gene expression from the tumor stroma
generates clusters linked to clinical outcome.
1McGill Centre for Bioinformatics, 3775 University Street, McGill University, Québec H3A 2B4, Canada. 2Molecular Oncology Group, 687 Pine Avenue West, McGill
University Health Centre, Québec H3A 1A1, Canada. 3Department of Biochemistry, 3655 Promenade Sir William Osler, McGill University, Québec H3G 1Y6, Canada.
4Department of Surgery and 5Department of Oncology, 687 Pine Avenue West, McGill University, Québec H3A 1A1, Canada. 6Department of Pathology,
3775 University Street, McGill University, Québec, H3A 2B4, Canada. 7These authors contributed equally to this work. Correspondence should be addressed to
M.P. ([email protected]).
Received 3 October 2007; accepted 2 April 2008; published online 27 April 2008; doi:10.1038/nm1764
NATURE MEDICINE
ADVANCE ONLINE PUBLICATION
[Figure 1 graphic. (a) Flow chart. Class discovery: identify genes (n = 31 matched samples), then identify clusters (n = 53 IDC-stroma samples). Class distinction: identify differentially expressed genes between clusters (n = 53 IDC-stroma samples). Logistic regression: order genes by predictive power for outcome in a multivariate model with LN, ER, PR and HER2 status (n = 53 IDC-stroma samples). Class prediction: train predictors with increasing numbers of genes (n = 27 random IDC-stroma samples), test on the n = 26 remaining samples and cross-validate over 50 iterations; choose the minimal-size classifier with maximal AUC and retrain on the full data set (n = 53 IDC-stroma samples). Test the classifier on independent data: NKI3 (n = 295), Rotterdam5 (n = 286), JRH32 (n = 251) and Uppsala33 (n = 189). (b) Dendrogram of the 53 tumor stroma samples (clusters 1, 2 and 3, with bootstrap support on the branches) above rows for ER, PR, HER2, LN, grade III, recurrence, poor outcome, and postoperative radiation, hormonal and chemotherapy. Cluster 1: recurrence negative (P = 7.26 × 10⁻³), good outcome (P = 2.30 × 10⁻³). Cluster 2: ER negative (P = 5.9 × 10⁻³), recurrence positive (P = 5.98 × 10⁻⁴), poor outcome (P = 2.04 × 10⁻⁵). (c) Kaplan-Meier plot: fraction good outcome versus recurrence-free good-outcome survival time (months) for clusters 1, 2 and 3; P = 3.64 × 10⁻⁵.]
Figure 1 Class discovery in tumor stroma. (a) Flow chart outlining principal steps in the construction of the SDPP. LN, lymph node. (b) Class discovery in
tumor stroma samples over a basis set of the 200 most variable genes from matched normal versus tumor stroma gene expression data. Clusters in the tree
are labeled with the percentage of times they were observed in 1,000 bootstrap iterations. Clinical characteristics of each tumor sample are presented in the
boxes below each sample with a shaded box representing a positive status, and a strikethrough indicating no information available. Poor outcome is defined
as dead of disease or alive with disease as of last follow-up. Significant associations of each cluster with clinical characteristics are presented below the
relevant cluster. (c) Kaplan-Meier survival curves for samples belonging to clusters 1, 2 and 3 in b.
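The Kaplan-Meier curves referred to in panel c can be illustrated with a minimal sketch of the product-limit estimator. This is not the authors' code; the function name `kaplan_meier` and the follow-up data are hypothetical and chosen only to show how censored samples leave the risk set without lowering the curve.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return the (time, survival probability) steps of a Kaplan-Meier estimate.

    times  -- follow-up time of each sample (for example, months)
    events -- True if the event (for example, recurrence) was observed,
              False if the sample was censored at that time
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        observed = 0   # events observed at time t
        leaving = 0    # samples leaving the risk set at time t (events + censored)
        while i < len(order) and order[i][0] == t:
            observed += order[i][1]
            leaving += 1
            i += 1
        if observed:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (months); not taken from the study.
times = [5, 8, 12, 12, 20, 30, 41, 60]
events = [True, False, True, True, False, True, False, False]
curve = kaplan_meier(times, events)
```

Each step multiplies the running survival probability by the fraction of at-risk samples that remain event free at that time, which is why the curve only drops at observed events.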
RESULTS
Tumor stroma gene expression identifies outcome-linked clusters
To investigate changes in breast tumor stroma, we have combined
LCM-based tissue isolation and RNA amplification with gene expression profiling using DNA microarrays as previously described18. We
isolated cells from the stromal compartment within the tumor bed, and
from morphologically normal tissue at least 2 mm from the tumor,
from individuals with invasive breast carcinoma (Supplementary
Fig. 1a and Supplementary Table 1 online). We have previously
shown that such morphologically normal tissue has not undergone significant gene expression changes compared to breast reduction tissue18.
We determined whether the gene expression profiles derived from
tumor stroma can be used to identify cancer subtypes via a class-discovery approach. We identified genes whose expression varied most
between tumor tissue and normal stroma for the 31 tissue-matched
pairs. The 200 most variable genes were used to cluster our complete
data set of 53 tumor stroma samples (Fig. 1a), generating three
clusters (Fig. 1b). Cluster 1 (Fig. 1b,c) shows a significantly reduced
rate of recurrence (P = 7.26 × 10⁻³, χ² test for association) and longer
relapse-free survival (P = 4.17 × 10⁻³) compared to the remaining
samples, whereas cluster 2 (Fig. 1b,c) has a significantly increased rate
of recurrence (P = 2.04 × 10⁻⁵) and shorter relapse-free survival (P =
2.87 × 10⁻⁴). Cluster 3 (Fig. 1b,c) contains a mixture of outcomes.
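A χ² test for association of the kind cited above can be sketched for a 2 × 2 table (cluster membership versus recurrence). The counts below are hypothetical, and the text does not state whether a continuity correction was applied, so this sketch assumes the uncorrected Pearson statistic with one degree of freedom.

```python
import math
from typing import Tuple


def chi2_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for the table

                    recurrence   no recurrence
        in cluster      a             b
        other           c             d

    Returns (chi-square statistic, P value for 1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for 53 samples; not the counts from the study.
chi2, p_value = chi2_2x2(2, 16, 17, 18)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2g}")
```

The closed form `erfc(sqrt(chi2 / 2))` holds only for one degree of freedom; larger tables would need a general χ² tail function.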
Multivariate Cox regression indicates that the poor outcome–
associated cluster 2 is independent of ER, HER2 and lymph node
status, as well as age, grade and tumor size, whereas the good
outcome–associated cluster 1 is independent of these variables, as well
as of radiotherapy, chemotherapy and hormonal therapy (Supplementary Tables 1 and 2 online). Hence, tumor-associated changes in
stromal gene expression can be used to identify sample clusters
distinct from previously identified breast tumor subtypes2,4,6.
Outcome-linked stroma shows distinct biological responses
The tripartition of samples by stromal expression profiles may
represent three subtypes of breast tumor stroma (Fig. 1b). To
investigate whether differences between these groups reflect distinct
biological responses, we performed pair-wise comparisons of gene
expression between the three clusters (Fig. 1a). From this class
distinction, we identified 163 unique genes that show the greatest
differential expression pattern between clusters (Fig. 2 and Supplementary Data online) and can be used to group tumors by outcome
in a manner similar to that previously generated by class discovery
(Fig. 2a,b). The 163-gene set was used as a starting point to
characterize the differences between the good outcome– and poor
outcome–associated stroma subtypes. Each subtype can be distinguished by its differential expression of genes involved in distinct
biological responses, providing evidence that the subtypes reflect
different combinations of underlying biologies (Fig. 2a). Gene ontology analysis of the genes with elevated expression in the poor-outcome
sample cluster (Fig. 2c) identifies an enrichment for genes involved in
functions and processes associated with poor outcome19,20 (Supplementary Fig. 1b and Supplementary Data). These include factors
[Figure 2 graphic. (a) Heat map of the 163 genes across the 53 tumor stroma samples, grouped into good-, poor- and mixed-outcome sample clusters; bootstrap support 89%, 100% and 93%; color scale -6 to 6; rows below the heat map for ER, PR, HER2, LN, grade III, recurrence and poor outcome. (b) Kaplan-Meier plot: fraction good outcome versus recurrence-free good-outcome survival time (months) for the mixed-, good- and poor-outcome clusters; P = 8.34 × 10⁻³. (d) Gene labels: ITGBL1*, OGN*, C21orf34*, SORCS2, ADRA2A*, CXCL14*, FRZB*, RAI2*, HOXA10*, PRND, F2RL2*, SLC40A1*, FGF18, ESR1, AR, BCAN, TLN2, PSCD3, SUSD3, GREB1, AREG, WISP2, PIP, TFF1, TFF3, PSMD1, RPL10, SCGB2A2, ACAA2, PDCD7, ZHX2, A_24_P936252, TCEA3. (e) Gene labels: CD48*, PLEK*, SOAT1, LAP3, PLA2G7, MS4A4A, GIMAP5*, RUNX3*, HLA-A, HLA-F, IL10RA, C1orf38, NCF2, COTL1, COTL1, GZMA*, CD8A*, CD52*, TRBV5-4*, CD3D, CD247*, TRA@, CD2, XCL1, GZMB, CYBB, CCL13, MEI1, HCST.]
Figure 2 Class distinction of tumor stroma. (a) Hierarchical clustering of tumor stroma
samples with the 163 genes differentially expressed between clusters 1, 2 and 3 of
Figure 1b. Gene clusters are labeled with significance from bootstrap analysis, and color bars
represent the three gene clusters described in the text. Heat map colors represent mean-centered fold change expression in log-space. (b) Kaplan-Meier curves for each of the three
clusters. (c) Expanded view of the genes expressed predominantly in samples of the poor-outcome cluster. (d) Genes expressed predominantly in samples of the mixed-outcome cluster.
(e) Genes expressed predominantly in samples of the good-outcome cluster. *, member of
the SDPP gene set.
linked to angiogenic, hypoxic and tumor-associated macrophage
(TAM) responses, such as adrenomedullin (ADM), interleukin-8
(IL8), endothelin-1 (EDN1), osteopontin (SPP1) and the chemokine
CXCL1 (CXCL1)7,21–25. Further supporting a link with angiogenesis,
samples within the poor-outcome cluster show the highest endothelial
content, as established by immunohistochemical staining for CD31
(Supplementary Fig. 2a,b online).
Genes whose expression is decreased in subjects in the poor–
outcome group (Fig. 2a,d) include elements involved in negative
regulation of WNT receptor signaling (FRZB) and in development
(HOXA10) (Supplementary Data); the loss of either of these genes has
been linked to epithelial-to-mesenchymal-transition and poor outcome in some cancers26,27. The gene set expressed predominantly in
the good-outcome cluster (Fig. 2e) is functionally enriched for
elements of the T helper type 1 (TH1) immune response, including
T cell receptor complex (CD8A, CD247, CD3D), MHC class I protein
binding (CD8A) and granzyme A and granzyme B activity (GZMA)
(Supplementary Fig. 1b)28–30. We observed elevated levels of CD8A- and CD247-positive cells in tumor stroma from individuals in the
good outcome–linked cluster compared to those in the poor outcome–linked cluster (Supplementary Fig. 2c,d), suggesting that there
is an increased recruitment of activated T cells and natural killer cells
to tumors in individuals from the good-outcome cluster (Fig. 2a,e).
Immunohistochemistry directed against osteopontin (SPP1) (elevated
in the poor-outcome cluster) confirmed microarray results (Supplementary Fig. 2e). Expression of selected genes was successfully
validated by quantitative real-time PCR (Supplementary Fig. 2f).
[Figure 2c gene labels (genes expressed predominantly in samples of the poor-outcome cluster; *, member of the SDPP gene set): IL4I1, AQP9, S100A8, S100P, CLEC4E, CAPS, MMP12, MMP12, A_24_P6918261 (unknown), LCN2, SYTL1, CALB2, MMP7, GRB14, HRASLS*, SCEL, A_32_P18475 (unknown), GPR110, SPP1*, IQGAP3 (AI345640), A_32_P13533 (unknown), S100A7, S100A9, A_23_P366468 (unknown), HIST1H1C, SPNS2, CXCL1, MMP1, STK38L, KRT23, UGCGL1, ACTG2, ACTG2, KCNK5, SCRG1 (scrapie-responsive gene 1), C1orf31, VGLL1*, ROPN1, SHC4, UBE2C, KIF18B, FAM83D, NCAPG, ASPM, CENPF, OIP5, C6orf173, ECT2, GPR56, RDH10, MYBL1, CLIP4, FAM54A, SGOL1, E2F8, CHEK1, HSPC159, ZNF165, GBP5, LCP1*, CDCA7, KYNU, NDC80, RIOK3, CTSL2, SLAIN1, SQLE, GJD4, BXDC1, AZIN1, ATG5, GTF3C6, SRPK1, AMD1, GK, CRY1, C1orf31, TACSTD1, TFEC, LACTB2, ITGB8, SLC30A5, LRRCC1, ORMDL1, RP11–11C5.2, CHML, KLF8, IL8, ADM*, STK24, C6orf168*, SNTG2*, HTATIP2, C6orf203, C6orf117, B3GNT5, RCAN1, OXR1, EDN1, RIPK4, PERP, GALNT3.]
Thus, we identify distinct sets of genes reflecting hypoxia and
angiogenesis (linked to poor outcome) and a TH1-like immune
response (linked to good outcome).
Construction of a stroma-derived prognostic predictor
No prognostic predictors have been derived from clinical samples of
breast tumor stroma. We generated a predictor of minimal size and
maximum accuracy from the 163 genes identified as discussed above.
Many factors that are prognostic for breast cancer outcome, such as
ER or HER2 status, can substantially affect tumor gene expression
profiles2,4,6. To limit the influence of these variables, we identified
genes predictive of outcome independently of these factors. We used
multivariate logistic regression with ER, PR, HER2 and lymph node
status as covariates to rank genes by their independent prognostic
ability (Fig. 1a and Supplementary Data). To construct a predictor,
we trained a multivariate naive Bayes classifier31 by using incrementally larger gene sets from this ordered list (Fig. 1a). After generating
ROC curves, we assessed the performance of classifiers by comparing
the areas under the curve (AUC), where a larger AUC indicates
better performance.
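The selection criterion described above (compare classifiers by AUC, prefer the smallest gene set among the best performers) can be sketched as follows. The AUC is computed through its rank interpretation, the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The function names, scores and AUC-by-size values are hypothetical; this is an illustration of the criterion under those assumptions, not the authors' pipeline.

```python
from typing import Dict, Sequence


def auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Area under the ROC curve as the probability that a positive example
    is ranked above a negative one (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def smallest_best_size(auc_by_size: Dict[int, float]) -> int:
    """Among gene-set sizes that reach the maximal AUC, return the smallest."""
    best = max(auc_by_size.values())
    return min(size for size, value in auc_by_size.items() if value == best)


# Hypothetical classifier scores and true outcomes (True = poor outcome).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [True, True, True, False, False, False]
print(auc(scores, labels))

# Hypothetical AUC obtained with increasingly large gene sets.
print(smallest_best_size({5: 0.81, 26: 0.93, 40: 0.93}))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a cohort of a few dozen samples; a rank-sum formulation would be preferable for large data sets.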
Although there were a number of predictors with similar performance and composition (Supplementary Fig. 3a online), the
predictor that maximized the AUC contained 26 genes (SDPP;
Fig. 1a) and performed well in data derived from tumor stroma
(Fig. 3a–c), but not in data derived from tumor epithelium (Supplementary Fig. 3a). This result, as well as the inability of the SDPP to cluster tumor epithelium by outcome (Supplementary Fig. 3b–e and Supplementary Results online), shows that the prognostic pattern of SDPP genes is specific to tumor stroma. Notably, the genes in the SDPP contain representatives from each of the three gene clusters identified in the 163-gene set, highlighting the importance of each of these biological functions in tumor stroma (Figs. 2a and 3b).
[Figure 3 graphic. (a) ROC curve: average true-positive rate versus false-positive rate; AUC = 0.93. (b) Heat map of the 26 SDPP genes (GZMA, CD8A, TRBV5-4, CD52, CD247, CD48, PLEK, RUNX3, GIMAP5, LCP1, F2RL2, SLC40A1, FRZB, RAI2, HOXA10, ITGBL1, OGN, C21orf34, ADRA2A, CXCL14, SPP1, HRASLS, VGLL1, ADM, C6orf168, SNTG2) across the 53 tumor stroma samples, with class-membership probability and rows for ER, PR, HER2, LN, grade III, recurrence, poor outcome, and postoperative radiation, hormonal and chemotherapy; color scale -4 to 4. (c) Kaplan-Meier plot: fraction event free versus recurrence-free good-outcome survival time (months); P = 1.11 × 10⁻¹⁶.]
Figure 3 Construction and performance of the SDPP. (a) ROC curve for the SDPP applied to tumor stroma samples, showing the true-positive and false-positive rate, as well as the AUC. The AUC corresponds to the probability of the SDPP ranking a randomly selected positive example higher than a randomly selected negative example. Dotted diagonal line shows the performance of a random predictor. (b) Predictions made by the SDPP in the stroma data set. Samples are ordered by the probability of membership in each of the three classes, whereas genes are arranged by hierarchical clustering. Each gene cluster is color-coded as in Figure 2a. Heat map colors represent mean-centered fold change expression in log-space. (c) Kaplan-Meier curves for the three tumor groups identified by the SDPP.
SDPP performance in whole-tumor data sets
Previous analyses have identified predictors of outcome with gene expression data derived from whole breast tumor tissue comprising both tumor epithelium and stroma2,3,5,6. To establish whether the SDPP could predict outcome from such data, we analyzed three publicly available breast cancer data sets, the Netherlands Cancer Institute (NKI)3, John Radcliffe Hospital (JRH)32 and Uppsala33 data sets (Fig. 1a). These comprise 295, 251 and 189 invasive ductal carcinoma (IDC) samples, respectively, with mixed ER, PR, grade and lymph node status. We also analyzed a fourth data set, the Rotterdam set5, which contains 286 lymph node–negative IDC samples with mixed clinical characteristics. Only a subset of the 26 SDPP genes was available in each data set (NKI, 15; JRH, 16; Uppsala, 11; Rotterdam, 19). However, in each case, genes from each of the three biological classes were present (Fig. 4a).
Despite the lack of expression data for some of the 26 SDPP genes in these data sets, the SDPP accurately predicted outcome in each tumor data set, performing significantly better than random (accuracy: NKI, 75.6%, P = 3.5 × 10⁻¹¹; JRH, 73.3%, P = 1.2 × 10⁻⁴; Uppsala, 59.8%, P = 4.8 × 10⁻²; Rotterdam, 60%, P = 8.0 × 10⁻³). This indicates that a stromal signal can be detected in expression data generated from whole tumor. In all data sets, individuals assigned to the poor-outcome group had a substantially increased risk of recurrence or death from disease when compared to individuals in other groups (Table 1, Fig. 4b and Supplementary Fig. 4a,b online), demonstrating the utility and robustness of the SDPP in whole tumor–derived data. Because all samples in the Rotterdam set were node negative5, our analysis shows that gene expression in tumor stroma is predictive of outcome before detectable lymph node involvement.
The SDPP is an independent prognostic factor
To test whether the SDPP is an independent prognostic factor in whole tumor–derived data sets, we performed multivariate Cox regression of available prognostic factors in all four data sets. The SDPP was independent of ER and HER2 status, lymph node involvement, grade, age, chemotherapy and hormonal therapy (Table 1 and Supplementary Fig. 4a,c). Of note, the hazard ratio (HR) for the poor-outcome group identified by the SDPP in the HER2-positive cohorts was, on average, 2.6 times greater than for the complete populations, indicating increased utility of the predictor in this cohort (Fig. 4b; HR(HER2)/HR(total) = 3.33, HR(HER2)/HR(total) = 2.53 and HR(HER2)/HR(total) = 1.35 for overall survival and time to metastasis in the NKI data set and relapse-free survival in the Rotterdam data set, respectively).
Other expression-based prognostic signatures and predictors for breast cancer outcome have been identified. The US Food and Drug Administration (FDA)–approved 70-gene predictor, developed from a subset of the NKI cohort9, predicts metastatic progression and contains genes involved in proliferation, angiogenesis and invasion2,34. The SDPP predicted outcome with greater accuracy (75.6%, P = 3.5 × 10⁻¹¹) than the 70-gene predictor (61.0%, P = 0.457) in the NKI data set and was 5.96 times more likely to identify a true poor-outcome individual than the 70-gene predictor in the HER2-positive cohort (positive diagnostic likelihood ratio of 6.86 for SDPP versus 1.15 for 70-gene predictor; Supplementary Fig. 4d).
Signatures have also been developed that reflect in vitro biological responses, including the response of cultured stromal fibroblasts to serum stimulation8 (wound response signature, 380 genes), a hypoxia-associated transcriptional response7 (123 genes), a signature derived from good- versus poor-outcome fibroblastic tumors (solitary fibrous tumor versus desmoid-type fibromatosis; SFT/DTF)10 (656 genes) and an interferon (IFN) response signature from cocultured fibroblasts14 (49 genes). Given that these predictors and signatures arise from different sources and probably reflect different biologies, we compared their performances with that of the SDPP in the NKI data set. Notably, all of the large gene signatures showed good accuracy in predicting outcome: hypoxia (68%, P = 2.1 × 10⁻³), wound response (68.5%, P = 9.0 × 10⁻⁴), IFN (63.4%, P = 0.16) and SFT/DTF (73.6%, P = 1.48 × 10⁻⁸). Although the SFT/DTF signature shows similar accuracy to the SDPP, the SDPP is independent of all other predictors and signatures, including the SFT/DTF (Table 1) and the IFN signature (data not shown). Consistent with these findings, only three genes of the 656-gene SFT/DTF signature overlap with the 26-gene SDPP (SLC40A1, OGN and CXCL14). Moreover, in accordance with the fibroblast-derived nature of the SFT/DTF, gene ontology analysis shows that the SFT/DTF encompasses neither an immune response linked to the good-outcome SFT component nor enhanced angiogenesis or hypoxia linked to the poor-outcome DTF component (data not shown). Thus, the SDPP reflects biological processes beyond those indicated by the fibroblast-derived SFT/DTF signature, the IFN response signature and other signatures.
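The ratio statistics quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes the standard definition of the positive diagnostic likelihood ratio, sensitivity / (1 - specificity), and uses the likelihood ratios reported in the text together with the hazard ratios annotated in Figure 4b; the confusion-matrix counts in the last example are hypothetical.

```python
def positive_likelihood_ratio(tp: int, fn: int, fp: int, tn: int) -> float:
    """Positive diagnostic likelihood ratio: sensitivity / (1 - specificity).

    Undefined (division by zero) when there are no false positives.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity / (1.0 - specificity)


# Likelihood ratios reported in the text (HER2-positive NKI cohort).
lr_sdpp, lr_70_gene = 6.86, 1.15
lr_ratio = lr_sdpp / lr_70_gene
# The rounded inputs give about 5.97; the reported 5.96 was presumably
# computed from unrounded likelihood ratios.
print(f"LR+ ratio: {lr_ratio:.2f}")

# Hazard ratios annotated in Fig. 4b: (HER2-positive cohort, complete cohort).
hazard_ratios = {
    "NKI overall survival": (12.22, 3.67),
    "NKI time to metastasis": (8.12, 3.21),
    "Rotterdam relapse-free survival": (2.60, 1.932),
}
hr_ratios = {name: her2 / total for name, (her2, total) in hazard_ratios.items()}
for name, value in hr_ratios.items():
    print(f"{name}: HR(HER2)/HR(total) = {value:.2f}")

# Hypothetical confusion-matrix counts, only to show the definition at work.
example_lr = positive_likelihood_ratio(tp=8, fn=2, fp=3, tn=27)
```

A likelihood ratio near 1, as for the 70-gene predictor in this cohort, means that a poor-outcome call barely changes the odds of a true poor outcome.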
Integration of multiple molecular predictors
Published prognostic predictors for breast cancer are considered to be of
limited value when used independently34,35. To test whether the SDPP
adds value, we applied a graphical modeling approach (Supplementary
Results and Supplementary Fig. 4e). Using the NKI data set and
predictions from the 70-gene predictor, the wound response, hypoxia
and SFT/DTF signatures, and the SDPP, we constructed a Bayes’
classifier of metastasis31. We found that the first four signatures each
have a posterior probability of metastasis (posterior probability) below
50%, whereas the SDPP has a posterior probability of 55.8% (Fig. 4c).
[Figure 4 graphic. (a) Heat maps of the SDPP genes available in the NKI data set (CD8A, GZMA, RUNX3, LCP1, CD48, PLEK, CD247, HOXA10, ADRA2A, F2RL2, OGN, RAI2, FRZB, ADM, SPP1) and in the Rotterdam data set (RUNX3, PLEK, GZMA, CD48, CD247, GIMAP5, CD8A, LCP1, CD52, CXCL14, RAI2, ADRA2A, OGN, FRZB, HOXA10, ADM, SPP1, VGLL1, HRASLS; several genes appear as two rows), each with a prediction track and a metastasis or relapse track. (b) Kaplan-Meier plots: fraction surviving versus overall survival time (years), fraction metastasis free versus metastasis-free survival time (years) and fraction relapse free versus relapse-free survival time (years). Hazard ratios for the poor-outcome group: NKI overall survival, 3.67 (P = 9 × 10⁻⁵) in the complete data set and 12.22 (P = 1.4 × 10⁻⁴) in the HER2-positive cohort; NKI time to metastasis, 3.21 (P = 1.1 × 10⁻⁴) and 8.12 (P = 1.3 × 10⁻⁴); Rotterdam relapse-free survival, 1.932 (P = 8.0 × 10⁻⁴) and 2.60 (P = 4.6 × 10⁻²). (c) Posterior probability of metastasis (approximately 0.3 to 0.7 on the vertical axis) for combinations of the SFT/DTF, hypoxia, 70-gene, wound and SDPP predictions; legend: poor-outcome prediction, good-outcome prediction, predictor not used; SDPP poor outcome, SDPP good outcome, SDPP not used.]
Figure 4 Performance of the SDPP in publicly
available breast cancer gene expression data sets.
(a) Predictions of good, poor and mixed outcome
for individuals in the NKI (top) and Rotterdam
(bottom) data sets using the SDPP. Samples are
ordered by their score from the SDPP, and genes
are ordered by hierarchical clustering. Tick marks
below the heat map represent metastasis or
relapse events. (b) Kaplan-Meier survival analysis
of the SDPP classes for overall survival (top left)
and time to metastasis (middle left) of the
complete NKI data set. Overall survival (top right)
and time to metastasis (middle right) are also
shown for the HER2-positive cohort of the NKI
data set. Relapse-free survival is shown for the
complete Rotterdam data set (bottom left) and
the HER2-positive cohort thereof (bottom right).
Relevant hazard ratios, median survival, and P
values are indicated. (c) Posterior probability of
metastasis calculated from the Bayes’ classifier
trained on good- and poor-outcome predictions
from the SDPP, 70-gene predictor, and wound,
hypoxia and SFT/DTF signatures in the NKI data
set. A black box indicates a poor-outcome
prediction from a signature, a gray box indicates
a good-outcome prediction and an empty box
indicates that information from that predictor
was not used. Red box, dots, cases where the
SDPP predicted good outcome; green box, dots,
cases where the SDPP predicted poor outcome;
black box, dots, SDPP predictions were
not used. Gray dotted line, the prior probability
of metastasis for the case where predictor
information was not used.
Notably, combining the SDPP with any of these predictors improves
the posterior probability in the NKI data set beyond any combination
of poor-outcome predictors that does not include the SDPP (maximum 86.6% posterior probability with SDPP versus maximum
59.1% without SDPP; Fig. 4c). Additionally, we observed comparable
improvements in the posterior probability when we combined the
other predictors with the SDPP to predict good outcome in the NKI
data set (28.2% with SDPP versus 42.7% without SDPP; Fig. 4c).
These results demonstrate an interaction between the biological
processes underlying the predictors and highlight the increased
prognostic power that can be gained by characterization of the
tumor microenvironment.
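How binary predictions from several signatures can be combined into a posterior probability of metastasis is sketched below with a naive Bayes update. The graphical model actually used is described in the Supplementary Results and is not reproduced here; this sketch assumes that predictors are conditionally independent given the outcome, and the predictor names, sensitivities, specificities and prior are hypothetical.

```python
from typing import Dict, Tuple


def posterior_metastasis(
    prior: float,
    predictors: Dict[str, Tuple[float, float]],
    calls: Dict[str, bool],
) -> float:
    """Naive Bayes posterior probability of metastasis.

    prior      -- prior probability of metastasis in the cohort
    predictors -- name -> (sensitivity, specificity) of a poor-outcome call
    calls      -- name -> True (poor-outcome call) or False (good-outcome call);
                  predictors absent from this mapping are treated as not used
    """
    weight_metastasis = prior
    weight_no_metastasis = 1.0 - prior
    for name, poor_call in calls.items():
        sensitivity, specificity = predictors[name]
        if poor_call:
            weight_metastasis *= sensitivity
            weight_no_metastasis *= 1.0 - specificity
        else:
            weight_metastasis *= 1.0 - sensitivity
            weight_no_metastasis *= specificity
    return weight_metastasis / (weight_metastasis + weight_no_metastasis)


# Hypothetical operating characteristics; not estimated from the NKI data.
predictors = {"predictor_a": (0.8, 0.7), "predictor_b": (0.6, 0.6)}
prior = 0.3

print(posterior_metastasis(prior, predictors, {}))
print(posterior_metastasis(prior, predictors, {"predictor_a": True}))
print(posterior_metastasis(prior, predictors, {"predictor_a": True, "predictor_b": True}))
print(posterior_metastasis(prior, predictors, {"predictor_a": False}))
```

With no predictor used, the posterior equals the prior; concordant poor-outcome calls raise it and good-outcome calls lower it, which mirrors the pattern reported for combinations that include the SDPP.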
DISCUSSION
Here we identify subtypes of tumor stroma corresponding to good- and poor-outcome breast cancers that have not been previously
recognized. Although there is an increasing awareness that stromal
interactions contribute to tumor progression, previous studies have
not addressed how changes occurring in tumor stroma affect disease
outcome. Such insight is essential for the development of new
therapeutic strategies. On the basis of differential gene expression
patterns in breast tumor stroma, we have developed a 26-gene
predictor (Table 2) that forecasts disease outcome with greater
accuracy than predictors or signatures derived from whole tissue.
This result strongly suggests that changes in breast tumor stroma have
a crucial role in disease progression and outcome.
A key finding of this study is that multiple, distinct biological
responses are differentially present within the stroma of individuals in
outcome-linked categories (Fig. 2). For example, tumor stroma
samples from the good-outcome cluster overexpress a distinct set of
immune-related genes, including T cell and NK cell markers indicative of a
TH1-type immune response (GZMA, CD52, CD247, CD8A). This is consistent with
studies correlating increased memory and activated TH1 cell content in the
tumor with good outcome in lung and colon cancers36,37. Therefore, individuals
with this gene expression pattern may benefit from treatments targeting tumor
cells via the immune response, such as vaccine therapies in the adjuvant setting.
In contrast, the immune cell presence within the tumor stroma is diminished in
individuals in the poor-outcome cluster. Instead, stroma from individuals in the
poor-outcome cluster shows markers of an increased hypoxic and angiogenic
response, as well as a decrease in chemokines that stimulate natural killer cell
migration (CXCL14) and mediate prosurvival signals in T lymphocytes (GIMAP5)38,39.
Tumor hypoxia elicits changes in gene expression that act to modify the tumor
microenvironment25,40. Individuals in the poor-outcome cluster have elevated
stromal expression of hypoxia-associated genes. IL-8 enhances endothelial cell
proliferation41, which is consistent with the increased density of CD31-positive
endothelial cells in the tumor stromal bed of these individuals. Monocytes and
TAMs accumulate in hypoxic areas of tumors, in part recruited by the action of
chemoattractants such as CXCL1 and ET-1 induced in the hypoxic environment24,42.
Consistent with this mechanism, we observe elevated expression of genes found in
the hypoxic transcriptome of monocytes and macrophages, including ADM, MMP1 and
SPP1 (ref. 25). Several matrix metallopeptidases are also highly expressed in
poor-outcome as compared to good-outcome sample clusters, including MMP12 and
MMP1, which are known to be involved in tissue remodeling by macrophages; MMP1
is also involved in angiogenesis, invasion and metastasis43,44. Although these
biological responses (hypoxia, angiogenesis and differential immune response)
have been associated with poor outcome, and although TAM recruitment correlates
with poor prognosis in multiple cancers, including breast cancer45, their utility
as independent prognostic factors is poor19,20 or restricted to specific clinical
subtypes46.
Our study reveals that integrating the output of these multiple biological
responses generates an effective and independent predictor of outcome (Fig. 3).
These observations reflect the heterogeneity of breast cancer, wherein outcome is
affected by genetic and epigenetic changes in tumor epithelium as well as stroma,
the combined outputs of which are predictive35. The gene expression profile of
tumor stroma, distilled as a 26-gene SDPP, functions as a robust predictor whose
output is correlated with clinical outcome in our bank of breast cancer samples.
The performance of the predictor in independent, whole tumor–derived data sets
indicates that although the prognostic ability of the SDPP is specific to tumor
stroma, the signal can be detected in and the SDPP can accurately predict outcome
in data sets from whole tissue (Fig. 4). Notably, the identification of good or
poor outcome by the SDPP in individuals in the NKI, Rotterdam and other public
data sets is independent of ER and HER2 status, indicating that the SDPP
identifies distinct biological processes, rather than those associated with
clinical breast cancer subtypes. The independence of good- and poor-outcome
categories with respect to treatment suggests that these classes reflect host
responses that are not targeted by conventional adjuvant therapies (Table 1).
The hazard ratio for the SDPP poor-outcome group is greater than, and independent
of, the hazard ratio for lymph node involvement, the single most clinically
accepted factor currently used to predict breast cancer outcome47. Moreover, the
SDPP has a higher accuracy than all other signatures and predictors in the NKI
data set, and it shows no gene overlap and adds prognostic information when
combined with the FDA-approved 70-gene predictor. Of note, in the HER2-positive
cohort of the NKI data set, the SDPP is more likely to identify a true
poor-outcome sample than the 70-gene predictor, providing increased specificity
at the cost of sensitivity (Supplementary Fig. 4d). In light of the SDPP's
improved performance in HER2-positive tumors over the total population, and in
consideration of the immune-related functions of genes associated with good
outcome, we speculate that the SDPP may provide information necessary to predict
response to trastuzumab or other anti-HER2 therapies.
Of note, the markedly improved prognostic power obtained from combining the SDPP
with other signatures and predictors demonstrates that the SDPP captures
biological processes in a prognostic space orthogonal to existing predictors
(Fig. 4c). The SDPP is complementary to the SFT/DTF signature. Furthermore, the
small size of the SDPP (26 genes) makes it more amenable to clinical use than the
656-gene SFT/DTF signature.
Distinct gene expression signatures in breast tumor stroma reflect different
clinical outcomes that are not restricted to a specific clinical subtype. The
stroma signature presented here, alone or in combination with other molecular
prognostic predictors, promises to improve molecular classification and outcome
prediction in breast cancer, specifically by aiding identification of patients
who may benefit from aggressive therapies, or stratifying breast cancer subjects
for clinical trials. The SDPP provides prognostic information beyond classical
prognostic factors and published molecular signatures. This finding, in
combination with the improved accuracy that we obtained with a combinatorial
approach, highlights the need to integrate all aspects of the tumor
microenvironment into prognostic prediction and suggests future directions for
the development of novel diagnostic or therapeutic modalities.
Table 1 Multivariate Cox regression for overall survival in the complete NKI data set3
Variable | P value | Hazard ratio | Lower 95% CI | Upper 95% CI
Stroma predictor (poor outcome) | 0.0043** | 3.055 | 1.419 | 6.58
Stroma predictor (mixed outcome) | 0.014* | 2.343 | 1.185 | 4.63
ER (positive) | 0.83 | 0.93 | 0.487 | 1.78
HER2 (positive) | 0.12 | 1.536 | 0.888 | 2.66
Grade (poorly differentiated) | 0.82 | 1.065 | 0.615 | 1.84
Grade (well differentiated) | 0.12 | 0.414 | 0.138 | 1.24
Lymph nodes positive (>4) | 0.021* | 2.343 | 1.139 | 4.82
Lymph nodes positive (0) | 0.55 | 0.801 | 0.389 | 1.65
Age (<40 years) | 0.0042** | 2.122 | 1.268 | 3.55
Hormonal therapy (yes) | 0.83 | 0.905 | 0.353 | 2.32
Chemotherapy (yes) | 0.23 | 0.65 | 0.321 | 1.32
70 genes (poor outcome) | 0.0029** | 3.847 | 1.587 | 9.32
Hypoxia signature | 0.033* | 1.922 | 1.055 | 3.5
Wound signature (intermediate) | 0.94 | 1.033 | 0.411 | 2.6
Wound signature (quiescent) | 0.5 | 0.848 | 0.523 | 1.37
SFT/DTF (DTF-like) | 0.18 | 1.679 | 0.791 | 3.56
SFT/DTF (unclassified) | 0.15 | 1.562 | 0.848 | 2.88
*P < 0.05; **P < 0.01. CI, confidence interval.
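The P value, hazard ratio and confidence limits reported for each row of Table 1 are not independent quantities. As a minimal sketch, assuming the intervals are standard Wald intervals computed on the log hazard scale (the paper does not state this, so it is an assumption, as is the helper name `wald_p_from_ci`), the standard error and two-sided P value can be recovered from a hazard ratio and its 95% limits:

```python
import math


def wald_p_from_ci(hazard_ratio, lower, upper, z_crit=1.959964):
    """Recover SE(log HR), the Wald z statistic and a two-sided P value.

    Assumes the 95% CI is symmetric around log(HR), as in a Wald interval.
    """
    # Width of the interval on the log scale is 2 * z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hazard_ratio) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Values taken from the "Stroma predictor (poor outcome)" row of Table 1.
se, z, p = wald_p_from_ci(3.055, 1.419, 6.58)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

Under this assumption the recovered P value for the poor-outcome stroma predictor is approximately 0.0043, in agreement with the tabulated value; small discrepancies for other rows would be expected from rounding of the published limits.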
Table 2 Description of genes in the 26-gene SDPP
Gene name | Chromosome | P value (logistic regression) | Transcript description | KEGG pathways | Gene ontology processes
GZMA | 5q11-q12 | 0.023 | Granzyme A (granzyme 1; cytotoxic T lymphocyte–associated serine esterase-3) | Neuroactive ligand-receptor interaction | Apoptosis, cleavage of lamin, cytolysis, immune response, proteolysis
CD8A | 2p12 | 0.036 | CD8a (CD8; MAL; p32; Leu2) | Antigen processing and presentation, cell adhesion molecules (CAMs), hematopoietic cell lineage, T cell receptor signaling pathway | T cell activation, antigen processing and presentation, immune response, transmembrane receptor protein tyrosine kinase signaling pathway
TRBV5-4 | 7q34 | 0.048 | T cell receptor β variable 5-4 (TRBV5-4) | - | -
CD52 | 1p36 | 0.017 | CD52 (CDW52; CAMPATH-1 antigen) | - | Elevation of cytosolic calcium ion concentration, respiratory burst
CD247 | 1q22-q23 | 0.028 | CD247 (CD3H; CD3Q; CD3Z; TCRZ; CD3-ZETA) | Natural killer cell–mediated cytotoxicity; T cell receptor signaling pathway | Cell surface receptor–linked signal transduction
CD48 | 1q21.3-q22 | 0.032 | CD48 (BCM1; BLAST; hCD48; mCD48; BLAST1; SLAMF2; MEM-102) | Natural killer cell–mediated cytotoxicity | Defense response
PLEK | 2p14 | 0.055 | Pleckstrin (P47; FLJ27168) | - | Intracellular signaling cascade
RUNX3 | 1p36 | 0.058 | Runt-related transcription factor-3 (AML2; CBFA3; PEBP2aC; FLJ34510; MGC16070) | - | Cell proliferation, induction of apoptosis, negative regulation of epithelial cell proliferation, negative regulation of progression through cell cycle, regulation of transcription (DNA-dependent), transcription, transcription from RNA polymerase II promoter
GIMAP5 | 7q36.1 | 0.050 | GTPase, IMAP family member 5 (IAN4; IAN5; IMAP3; hIAN5; HIMAP3; IAN4L1; FLJ11296) | - | -
LCP1 | 13q14.3 | 0.075 | Lymphocyte cytosolic protein-1 (L-plastin) (CP64; PLS2; LC64P; FLJ25423; FLJ26114; FLJ39956; L-PLASTIN; DKFZp781A23186) | - | Actin filament bundle formation
F2RL2 | 5q13 | 0.028 | Coagulation factor II (thrombin) receptor-like 2 (PAR3) | Neuroactive ligand–receptor interaction | G protein–coupled receptor protein signaling pathway, blood coagulation, response to wounding, signal transduction
SLC40A1 | 2q32 | 0.020 | Solute carrier family 40 (iron-regulated transporter) member-1 (FPN1; HFE4; MTP1; IREG1; MST079; MSTP079; SLC11A3) | - | Anatomical structure morphogenesis, cellular iron ion homeostasis, ion transport, iron ion transport
FRZB | 2qter | 0.059 | Frizzled-related protein (FRE; FZRB; hFIZ; FRITZ; FRP-3; FRZB1; SFRP3; SRFP3; FRZB-1; FRZB-PEN) | - | Wnt receptor signaling pathway, cell differentiation, multicellular organismal development, negative regulation of Wnt receptor signaling pathway, skeletal development
RAI2 | Xp22 | 0.027 | Retinoic acid–induced-2 | - | Embryonic development
HOXA10 | 7p15-p14 | 0.045 | Homeobox A10 (PL; HOX1; HOX1H; HOX1.8; MGC12859) | - | Multicellular organismal development, regulation of transcription (DNA-dependent), spermatogenesis, transcription
ITGBL1 (AL359052) | 13q33 | 0.066 | Homo sapiens mRNA full-length insert cDNA clone EUROIMAGE 1968422 OR integrin, beta-like 1 (with EGF-like repeat domains) (OSCP; TIED) | - | Cell adhesion, cell-matrix adhesion, integrin-mediated signaling pathway
OGN | 9q22 | 0.038 | Osteoglycin (mimecan; OIF; SLRR3A; DKFZP586P2421) | - | -
C21orf34 | 21q21.1 | 0.047 | Chromosome 21 open reading frame 34 (C21orf35; FLJ38295; hypothetical protein LOC388815) | - | -
ADRA2A | 10q24-q26 | 0.044 | Adrenergic, α-2A-, receptor (ADRA2; ADRAR; ZNF32; ADRA2R; ALPHA2AAR) | Neuroactive ligand-receptor interaction | G protein–coupled receptor protein signaling pathway, Rho protein signal transduction, actin cytoskeleton organization and biogenesis, activation of MAPK activity, cell motility, fear response, negative regulation of adenylate cyclase activity, positive regulation of cell proliferation, signal transduction
CXCL14 | 5q31 | 0.051 | Chemokine (C-X-C motif) ligand-14 (KS1; Kec; BMAC; BRAK; NJAC; MIP-2g; SCYB14; MGC10687; bolekine) | Cytokine-cytokine receptor interaction, leukocyte transendothelial migration | Cell-cell signaling, chemotaxis, immune response, inflammatory response, signal transduction
SPP1 | 4q21-q25 | 0.037 | Secreted phosphoprotein-1 (osteopontin, bone sialoprotein I, early T lymphocyte activation 1) (OPN; BNSP; BSPI; ETA-1; MGC110940) | Cell communication, ECM-receptor interaction, focal adhesion | Cell adhesion, cell-matrix adhesion, negative regulation of bone mineralization, ossification
HRASLS | 3q29 | 0.028 | HRAS-like suppressor (A-C1; HSD28; HRASLS1; H-REV107) | - | -
VGLL1 | Xq26.3 | 0.038 | Vestigial-like-1 (Drosophila) (TDU; VGL1) | - | Regulation of transcription (DNA-dependent)
ADM | 11p15.4 | 0.022 | Adrenomedullin (AM) | - | cAMP biosynthetic process, cell-cell signaling, circulation, excretion, female pregnancy, heart development, positive regulation of cell proliferation, progesterone biosynthetic process, response to wounding, signal transduction
C6orf168 (A_32_P74964) | 6q16.2 | 0.021 | Chromosome 6 open reading frame 168 | - | -
SNTG2 (A_32_P49764) | 2p25.3 | 0.017 | Syntrophin, γ2 (SYN5; G2SYN; MGC133174) | - | Central nervous system development
KEGG, Kyoto Encyclopedia of Genes and Genomes. A hyphen marks a cell with no entry.
METHODS
Description of samples. We collected tissue samples from 73 women presenting with invasive breast carcinoma. We froze the samples in liquid nitrogen
within 30 min of surgical removal and subsequently subjected them to LCM.
From this cohort, we obtained 53 samples of tumor stroma (50 IDC and three
invasive lobular carcinoma cases); in 31 cases, we were able to obtain
individual-matched normal adjacent stroma (Supplementary Methods
online). The median follow-up time was 3.58 years. We determined recurrence
(local or distant) by examination of medical records after diagnosis. We defined
poor outcome as alive with disease or dead of disease as of the time of the latest
follow-up. No subject in the study received neoadjuvant therapy. This study
was approved by the McGill University Health Centre (MUHC) Research Ethics
Board (Protocols SUR-99-780 and SUR-00-966). All human subjects provided
written, informed consent.
Laser capture microdissection, RNA isolation and microarray hybridization.
We carried out LCM, sample isolation and preparation and microarray
hybridization as previously described18 (Supplementary Methods). After we
performed normalization and model fitting as previously described18, our
microarray data set contained 111 distinct expression experiments.
Stroma subtype identification. We fit a linear model48 to the individual-matched
tumor versus normal stroma data and identified the genes in the 99.2 percentile
of the variance distribution across all matched samples, which were also
significantly differentially expressed in at least three samples
(P < 1 × 10^-5). We clustered the tumor stroma data using these genes and
evaluated the resulting clusters as described (Supplementary Methods).
Identification of differentially expressed genes. We used pair-wise class
distinction to identify genes differentially expressed between the stroma clusters
as described (Supplementary Methods). The resulting gene list contained
163 members.
Predictor construction and evaluation. We used logistic regression to score
and rank each gene in the expression profile on the basis of its statistical
significance (P-value) in predicting recurrence in a model that included the
gene expression level, lymph node status, and ER, PR and HER2 status. This
model ensured that the predictive strength of a gene was not confounded with
those variables. We trained naive Bayes' classifiers to predict outcome using the
ranked gene expression profile of the recurrence-positive stroma cluster and
compared them using ROC curves as described (Supplementary Methods).
The AUC was used to evaluate the performance of the SDPP in tumor stroma,
tumor epithelium, normal stroma and normal epithelium.
SDPP performance in tumor epithelium. We evaluated the SDPP in gene expression data from LCM-derived tumor epithelium (Supplementary Methods).
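The predictor construction and evaluation step can be illustrated with a small, self-contained sketch of a naive Bayes classifier scored by the area under the ROC curve. The Gaussian likelihood, the variance floor, the toy expression values and all function names below are assumptions made for illustration; the details of the actual classifier are given only in the Supplementary Methods.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a class prior and per-gene mean/variance for each outcome class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small floor keeps every variance strictly positive (assumption).
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-6)
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model


def log_posterior(model, row, label):
    """Unnormalized log posterior: log prior plus summed per-gene log likelihoods."""
    prior, means, variances = model[label]
    total = math.log(prior)
    for v, m, s2 in zip(row, means, variances):
        total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
    return total


def score(model, row):
    """Log-odds of poor outcome (label 1) versus good outcome (label 0)."""
    return log_posterior(model, row, 1) - log_posterior(model, row, 0)


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical two-gene expression profiles; 1 = recurrence, 0 = no recurrence.
X = [[2.1, 0.3], [1.8, 0.5], [2.4, 0.2], [0.4, 1.9], [0.6, 2.2], [0.2, 1.7]]
y = [1, 1, 1, 0, 0, 0]
model = fit_gaussian_nb(X, y)
scores = [score(model, row) for row in X]
print("training AUC:", auc(scores, y))
```

Scoring the training samples, as done here for brevity, gives an optimistic AUC; a realistic evaluation would score held-out samples or use cross-validation.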
Gene ontology analysis. We cross-referenced genes differentially expressed in
each stroma outcome subtype against gene ontology annotations49 to identify
overrepresented gene ontology categories using a test against the hypergeometric distribution, at a significance threshold of P ≤ 0.05.
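A test against the hypergeometric distribution asks how likely it is to see at least the observed number of annotated genes in a selected list if the list were drawn at random from all genes on the array. The following is a minimal sketch of that upper-tail calculation; the function name and all counts in the usage lines are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """Upper-tail hypergeometric P value, P(X >= overlap).

    population: genes on the array with any annotation
    annotated:  genes in the population carrying the category of interest
    selected:   genes in the differentially expressed list
    overlap:    selected genes that carry the category
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(comb(annotated, k) * comb(population - annotated, selected - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Hypothetical counts, for illustration only.
p = hypergeom_enrichment_p(population=20000, annotated=150, selected=100, overlap=5)
print(f"P = {p:.3g}", "(overrepresented)" if p <= 0.05 else "(not significant)")
```

Exact integer binomial coefficients are used so that the result does not depend on any external statistics library; no multiple-testing correction is applied in this sketch.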
Comparison with publicly available breast cancer data sets. We downloaded
four publicly available breast cancer data sets and used the SDPP to predict the
outcome for each sample. In the NKI (ref. 3), JRH (ref. 32), Uppsala (ref. 33) and Rotterdam (ref. 5) data
sets, the poor-, mixed- and good-outcome categories of samples identified by
the SDPP were treated as categorical variables in Cox proportional hazards
regression and included other prognostic factors available for each data set.
The accuracy of a predictor was measured as the total true positives over
the total number of samples in a cohort. Details are provided in Supplementary Methods.
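Two details of this comparison lend themselves to a short sketch: how a three-level outcome category enters a regression as a categorical variable, and how accuracy is computed under the definition given above. The choice of the good-outcome category as the reference level is an assumption (it is suggested by Table 1, which lists only the poor and mixed levels), and both function names are hypothetical.

```python
def dummy_encode(categories, reference="good"):
    """Indicator (dummy) coding of a categorical variable against a reference level."""
    levels = sorted({c for c in categories if c != reference})
    rows = [[1 if c == level else 0 for level in levels] for c in categories]
    return levels, rows


def accuracy_as_defined(predicted_poor, observed_poor):
    """True positives divided by the total number of samples, as stated in the text.

    Note that this differs from the more common definition, which also
    counts true negatives in the numerator.
    """
    true_pos = sum(1 for p, o in zip(predicted_poor, observed_poor) if p and o)
    return true_pos / len(observed_poor)


# Hypothetical SDPP calls for five samples.
levels, design = dummy_encode(["good", "poor", "mixed", "poor", "good"])
print(levels)   # column order of the indicator variables
print(design)   # one row per sample

# Hypothetical predictions and observed outcomes (1 = poor outcome).
print(accuracy_as_defined([1, 1, 0, 0], [1, 0, 0, 1]))
```

Each indicator column would receive its own coefficient, and therefore its own hazard ratio relative to the reference category, in a Cox proportional hazards model.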
Predictor integration. We combined the independent predictions of the SDPP,
70-gene predictor, wound response, SFT/DTF and hypoxia signatures in the
NKI data set to construct a Bayes’ classifier of metastasis as described
(Supplementary Methods).
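One generic way to combine the calls of several predictors is a naive Bayes classifier over categorical inputs, in which each predictor's call is treated as conditionally independent given the outcome. The sketch below illustrates that idea only; the Laplace smoothing, the two-predictor toy data and the function names are assumptions, and the classifier actually used is described in the Supplementary Methods.

```python
import math
from collections import Counter, defaultdict


def fit_categorical_nb(calls, outcomes, alpha=1.0):
    """Count how often each predictor call co-occurs with each outcome class."""
    class_counts = Counter(outcomes)
    n_features = len(calls[0])
    value_counts = [defaultdict(Counter) for _ in range(n_features)]
    levels = [set() for _ in range(n_features)]
    for row, c in zip(calls, outcomes):
        for j, v in enumerate(row):
            value_counts[j][c][v] += 1
            levels[j].add(v)
    return class_counts, value_counts, levels, alpha


def posterior(model, row):
    """Posterior probability of each outcome class given one sample's calls."""
    class_counts, value_counts, levels, alpha = model
    total = sum(class_counts.values())
    log_scores = {}
    for c, n_c in class_counts.items():
        s = math.log(n_c / total)
        for j, v in enumerate(row):
            # Laplace-smoothed estimate of P(call | class).
            s += math.log((value_counts[j][c][v] + alpha)
                          / (n_c + alpha * len(levels[j])))
        log_scores[c] = s
    m = max(log_scores.values())
    z = sum(math.exp(s - m) for s in log_scores.values())
    return {c: math.exp(s - m) / z for c, s in log_scores.items()}


# Hypothetical (SDPP call, 70-gene call) pairs; outcome 1 = metastasis.
calls = [("poor", "poor"), ("poor", "poor"), ("poor", "good"),
         ("good", "poor"), ("good", "good"), ("good", "good")]
outcomes = [1, 1, 1, 0, 0, 0]
model = fit_categorical_nb(calls, outcomes)
post = posterior(model, ("poor", "poor"))
print({c: round(p, 3) for c, p in post.items()})
```

Additional signatures (wound response, SFT/DTF, hypoxia) would simply add further elements to each tuple of calls.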
Annotation of unknown genes. We obtained probe annotations from the
BioConductor hgug4112a Agilent annotation package. Unknown probes were
annotated by using BLAST to align probe sequences against the ENSEMBL
human genome assembly (release 49) or by searching across other databases.
For details, see Supplementary Methods. Assignments for unknown members
of the set of 163 genes are found in Supplementary Data.
Immunohistochemistry. We validated protein expression of selected members
of the SDPP gene set (CD8A, CD247 and SPP1) by immunohistochemistry
with sections from formalin-fixed, paraffin-embedded blocks obtained from
the MUHC Pathology archive. CD31 expression was evaluated on frozen tissue
sections. Procedures were carried out as per the manufacturer’s instructions
(Supplementary Methods).
Quantitative real-time–PCR. We used amplified RNAs prepared from microdissected material as templates for quantitative RT-PCR validation using a
LightCycler system (Roche Applied Science) as per the manufacturer’s instructions (Supplementary Methods).
Accession codes. Microarray results have been submitted to the Gene Expression Omnibus with accession code GSE9014.
Note: Supplementary information is available on the Nature Medicine website.
ACKNOWLEDGMENTS
We thank: D. Fleiszer, A. Loutfi, C. Milne, D. Owen, G. Pearl, R. Salasidis,
F. Tremblay, M. Wexler (surgeons); F. Halwani, K. Khetani (pathologists);
H. Barwick, A. Cuellar, D. Hori, S. Eng, L. Pasyuk, T. Vilhena, C. Palko-Condron
(Pathology staff); C. Loiselle (Nursing); the MUHC Anaesthesia Department;
A. Dedhar and A. Viquez (tissue and data collectors) for their assistance. We also
thank C. Mihalcioiu, P. Siegel and members of the Park lab for their critical
review of this manuscript. This work was supported by grants to M.P. from
the Québec Breast Cancer Foundation, Genome Canada–Génome Québec,
Valorisation-Recherche Québec and Fonds de la recherche en santé du Québec
and a Canadian Institutes of Health Research (CIHR) Team Grant; a Natural
Sciences and Engineering Research Council of Canada Discovery Grants Program
grant to M.H.; a CIHR McGill University Cancer Consortium Training Award to
G.F.; a US Department of Defense Breast Cancer Predoctoral Traineeship Award
to F.P.; MUHC Research Institute and MUHC Department of Medicine
Fellowships to N.B.; and Cedars Cancer Institute Fellowships to S.S. and N.B. M.P.
holds the Diane and Sal Guerrera Chair in Cancer Genetics at McGill University.
AUTHOR CONTRIBUTIONS
G.F. designed and implemented the data analysis pipeline for the data generated
for this study, developed methods and software for data analysis, analyzed and
interpreted the data, and contributed to manuscript preparation. N.B.
coordinated experiments, supervised the quantitative RT-PCR and
immunohistochemical validation aspects of this study, participated in discussions
of data analysis and interpretation, and contributed to manuscript preparation.
F.P. contributed to methods and software development and participated in
discussions of data analysis and interpretation. S.S. developed protocols for tissue
storage, LCM, linear amplification and labeling, and supervised these applications.
M.S. performed LCM and immunohistochemistry. H.Z. performed quantitative
RT-PCR and isolated RNA after LCM. H.C. prepared samples and conducted
gene expression profiling. G.O. performed pathological and histological analysis
of samples and gave advice regarding immunohistochemistry. S.M. contributed
to clinical analyses and tissue procurement. A.O. performed pathological and
histological analyses on tissue samples before LCM. M.H. supervised the
bioinformatics and biostatistics aspects of the project, designed and coordinated
analyses, and contributed to manuscript preparation. M.P. initiated and
supervised the tissue collection and microarray preparation, supervised the
expression profiling aspect of this project, designed and coordinated experiments
and contributed to manuscript preparation.
Published online at http://www.nature.com/naturemedicine
Reprints and permissions information is available online at http://npg.nature.com/reprintsandpermissions
1. Kamangar, F., Dores, G.M. & Anderson, W.F. Patterns of cancer incidence, mortality,
and prevalence across five continents: defining priorities to reduce cancer disparities in
different geographic regions of the world. J. Clin. Oncol. 24, 2137–2150 (2006).
2. van ’t Veer, L.J. et al. Gene expression profiling predicts clinical outcome of breast
cancer. Nature 415, 530–536 (2002).
3. van de Vijver, M.J. et al. A gene-expression signature as a predictor of survival in breast
cancer. N. Engl. J. Med. 347, 1999–2009 (2002).
4. Perou, C.M. et al. Molecular portraits of human breast tumours. Nature 406, 747–752
(2000).
5. Wang, Y. et al. Gene-expression profiles to predict distant metastasis of lymph-node–negative primary breast cancer. Lancet 365, 671–679 (2005).
6. Sorlie, T. et al. Gene expression patterns of breast carcinomas distinguish tumor
subclasses with clinical implications. Proc. Natl. Acad. Sci. USA 98, 10869–10874
(2001).
7. Chi, J.T. et al. Gene expression programs in response to hypoxia: cell type specificity
and prognostic significance in human cancers. PLoS Med. 3, e47 (2006).
8. Chang, H.Y. et al. Robustness, scalability and integration of a wound-response gene
expression signature in predicting breast cancer survival. Proc. Natl. Acad. Sci. USA
102, 3738–3743 (2005).
9. Glas, A.M. et al. Converting a breast cancer microarray signature into a high-throughput
diagnostic test. BMC Genomics 7, 278 (2006).
10. West, R.B. et al. Determination of stromal signatures in breast carcinoma. PLoS Biol.
3, e187 (2005).
11. Allinen, M. et al. Molecular characterization of the tumor microenvironment in breast
cancer. Cancer Cell 6, 17–32 (2004).
12. Iyer, V.R. et al. The transcriptional program in the response of human fibroblasts to
serum. Science 283, 83–87 (1999).
13. Singer, C.F. et al. Differential gene expression profile in breast cancer–derived stromal
fibroblasts. Breast Cancer Res. Treat. published online, doi:10.1007/s10549-007-9725-2 (27 September 2007).
14. Buess, M. et al. Characterization of heterotypic interaction effects in vitro
to deconvolute global gene expression profiles in cancer. Genome Biol. 8, R191
(2007).
15. Bhowmick, N.A. & Moses, H.L. Tumor-stroma interactions. Curr. Opin. Genet. Dev. 15,
97–101 (2005).
16. Kim, J.B., Stein, R. & O’Hare, M.J. Tumour-stromal interactions in breast cancer: the
role of stroma in tumourigenesis. Tumour Biol. 26, 173–185 (2005).
17. Tlsty, T.D. & Coussens, L.M. Tumor stroma and regulation of cancer development.
Annu. Rev. Pathol. 1, 119–150 (2006).
18. Finak, G. et al. Gene expression signatures of morphologically normal breast tissue
identify basal-like tumors. Breast Cancer Res. 8, R58 (2006).
19. Uzzan, B., Nicolas, P., Cucherat, M. & Perret, G.Y. Microvessel density as a prognostic
factor in women with breast cancer: a systematic review of the literature and meta-analysis. Cancer Res. 64, 2941–2955 (2004).
20. Gruber, G. et al. Hypoxia-inducible factor 1α in high-risk breast cancer: an independent prognostic parameter? Breast Cancer Res. 6, R191–R198 (2004).
21. Nikitenko, L.L., Fox, S.B., Kehoe, S., Rees, M.C. & Bicknell, R. Adrenomedullin and
tumour angiogenesis. Br. J. Cancer 94, 1–7 (2006).
22. Bobrovnikova-Marjon, E.V., Marjon, P.L., Barbash, O., Vander Jagt, D.L. & Abcouwer,
S.F. Expression of angiogenic factors vascular endothelial growth factor and interleukin-8/CXCL8 is highly responsive to ambient glutamine availability: role of nuclear
factor-κB and activating protein-1. Cancer Res. 64, 4858–4869 (2004).
23. Wang, D. et al. CXCL1 induced by prostaglandin E2 promotes angiogenesis in colorectal cancer. J. Exp. Med. 203, 941–951 (2006).
24. Murdoch, C., Giannoudis, A. & Lewis, C.E. Mechanisms regulating the recruitment of
macrophages into hypoxic areas of tumors and other ischemic tissues. Blood 104,
2224–2234 (2004).
25. Bosco, M.C. et al. Hypoxia modifies the transcriptome of primary human monocytes:
modulation of novel immune-related genes and identification of CC-chemokine
ligand 20 as a new hypoxia-inducible gene. J. Immunol. 177, 1941–1955
(2006).
26. Yoshida, H., Broaddus, R., Cheng, W., Xie, S. & Naora, H. Deregulation of the HOXA10
homeobox gene in endometrial carcinoma: role in epithelial-mesenchymal transition.
Cancer Res. 66, 889–897 (2006).
27. Lee, A.Y. et al. Expression of the secreted frizzled-related protein gene family is
downregulated in human mesothelioma. Oncogene 23, 6672–6676 (2004).
28. Dunn, G.P., Koebel, C.M. & Schreiber, R.D. Interferons, immunity and cancer
immunoediting. Nat. Rev. Immunol. 6, 836–848 (2006).
29. Ellyard, J.I., Simson, L. & Parish, C.R. TH2-mediated anti-tumour immunity: friend or
foe? Tissue Antigens 70, 1–11 (2007).
30. Mills, C.D., Kincaid, K., Alt, J.M., Heilman, M.J. & Hill, A.M. M-1/M-2 macrophages
and the TH1/TH2 paradigm. J. Immunol. 164, 6166–6173 (2000).
31. Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible
Inference. 116–226 (Morgan Kaufmann Publishers, San Mateo, California, 1988).
32. Miller, L.D. et al. An expression signature for p53 status in human breast cancer
predicts mutation status, transcriptional effects and patient survival. Proc. Natl. Acad.
Sci. USA 102, 13550–13555 (2005).
33. Sotiriou, C. et al. Gene expression profiling in breast cancer: understanding the
molecular basis of histologic grade to improve prognosis. J. Natl. Cancer Inst. 98,
262–272 (2006).
34. Nuyten, D.S. & van de Vijver, M.J. Gene expression signatures to predict the
development of metastasis in breast cancer. Breast Dis. 26, 149–156 (2006).
35. Massague, J. Sorting out breast-cancer gene signatures. N. Engl. J. Med. 356,
294–297 (2007).
36. Pages, F. et al. Effector memory T cells, early metastasis and survival in colorectal
cancer. N. Engl. J. Med. 353, 2654–2666 (2005).
37. Hiraoka, K. et al. Concurrent infiltration by CD8+ T cells and CD4+ T cells is a
favourable prognostic factor in non–small-cell lung carcinoma. Br. J. Cancer 94,
275–280 (2006).
38. Dalberg, U., Markholst, H. & Hornum, L. Both Gimap5 and the diabetogenic
BBDP allele of Gimap5 induce apoptosis in T cells. Int. Immunol. 19, 447–453
(2007).
39. Starnes, T. et al. The chemokine CXCL14 (BRAK) stimulates activated NK cell
migration: implications for the downregulation of CXCL14 in malignancy. Exp.
Hematol. 34, 1101–1105 (2006).
40. Boudreau, N. & Myers, C. Breast cancer–induced angiogenesis: multiple mechanisms
and the role of the microenvironment. Breast Cancer Res. 5, 140–146 (2003).
41. Li, A., Dubey, S., Varney, M.L., Dave, B.J. & Singh, R.K. IL-8 directly enhanced
endothelial cell survival, proliferation and matrix metalloproteinase production and
regulated angiogenesis. J. Immunol. 170, 3369–3376 (2003).
42. Sica, A., Schioppa, T., Mantovani, A. & Allavena, P. Tumour-associated macrophages
are a distinct M2 polarised population promoting tumour progression: potential targets
of anti-cancer therapy. Eur. J. Cancer 42, 717–727 (2006).
43. Gupta, G.P. et al. Mediators of vascular remodelling co-opted for sequential steps in
lung metastasis. Nature 446, 765–770 (2007).
44. Hofmann, H.S. et al. Matrix metalloproteinase-12 expression correlates with local
recurrence and metastatic disease in non–small cell lung cancer patients. Clin. Cancer
Res. 11, 1086–1092 (2005).
45. Lewis, C.E. & Pollard, J.W. Distinct role of macrophages in different tumor microenvironments. Cancer Res. 66, 605–612 (2006).
46. Teschendorff, A.E., Miremadi, A., Pinder, S.E., Ellis, I.O. & Caldas, C. An immune
response gene expression module identifies a good prognosis subtype in estrogen
receptor negative breast cancer. Genome Biol. 8, R157 (2007).
47. Fitzgibbons, P.L. et al. Prognostic factors in breast cancer. College of American
Pathologists Consensus Statement 1999. Arch. Pathol. Lab. Med. 124, 966–978
(2000).
48. Smyth, G.K. Linear models and empirical bayes methods for assessing differential
expression in microarray experiments. Stat. Appl. Genet. Mol. Biol. 3 Article 3 (2004).
49. Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene
Ontology Consortium. Nat. Genet. 25, 25–29 (2000).