A Gene Expression Signature with Independent Prognostic Significance in Epithelial
Ovarian Cancer
Dimitrios Spentzos, M.D., Douglas A. Levine, M.D., Marco F. Ramoni, Ph.D., Marie Joseph,
Xuesong Gu, Ph.D., Jeff Boyd, Ph.D., Towia A. Libermann, Ph.D., and
Stephen A. Cannistra, M.D.*
From the Program of Gynecologic Medical Oncology, Beth Israel Deaconess Medical Center,
(DS, SAC), Genomics Center (DS, MJ, XG, TL) and Bioinformatics Core (DS, TL), Beth Israel
Deaconess Medical Center, Harvard Medical School, Children’s Hospital Informatics Program
and Harvard Partners Center for Genetics and Genomics (MFR), Boston, MA, and the
Department of Surgery, Memorial Sloan-Kettering Cancer Center, NY (DL, JB).
Funded in part through grants from the Patricia Cronin Foundation, the Director’s Challenge
Grant (U01 CA88175), RO1 CA85467, and U24 DK58739.
*Address correspondence and reprint requests to Dr. Stephen A. Cannistra, Program of
Gynecologic Medical Oncology, Beth Israel Deaconess Medical Center, 330 Brookline Avenue,
Boston, MA 02215. (Telephone: 617-667-4283; Fax: 617-975-5598; E-mail:
[email protected]).
Running head: A prognostic gene signature in ovarian cancer
Presented in part at the Annual Meeting of the American Society of Clinical Oncology, Chicago,
IL, June 2003.
Abstract
Purpose: Currently available clinical and molecular prognostic factors provide an imperfect
assessment of prognosis for patients with epithelial ovarian cancer (EOC). In this study, we
investigated whether tumor transcription profiling could be used as a prognostic tool in this
disease.
Methods: Tumor tissue from 68 patients was profiled with oligonucleotide microarrays.
Samples were randomly split into training and validation sets. A three-step training procedure
was employed to discover a statistically significant Kaplan-Meier split in the training set. The
resultant prognostic signature was then tested on an independent validation set for confirmation.
Results: In the training set, a 115-gene signature referred to as the Ovarian Cancer Prognostic
Profile (OCPP) was identified. When applied to the validation set, the OCPP distinguished
between patients with unfavorable and favorable overall survival (median 30 months versus not
yet reached, respectively, log-rank p=0.004). The signature maintained independent prognostic
value in multivariate analysis, controlling for other known prognostic factors such as age, stage,
grade, and debulking status. The Hazard Ratio for death in the unfavorable OCPP group was
4.8, p=0.021 by Cox Proportional Hazards analysis.
Conclusion: The OCPP is an independent prognostic determinant of outcome in EOC. The use
of gene profiling may ultimately permit identification of EOC patients appropriate for
investigational treatment approaches, based upon a low likelihood of achieving prolonged
survival with standard first-line platinum-based therapy.
Introduction
The majority of patients with epithelial ovarian cancer (EOC) are diagnosed with advanced
disease involving sites such as the upper abdomen, pleural space, and para-aortic lymph nodes 1.
Post-operative chemotherapy is almost always required in an attempt to eradicate residual tumor
that remains after initial surgery. Standard chemotherapy with carboplatin in combination with a
taxane results in an initial response rate of over 70%, although relapse frequently
occurs and the disease eventually becomes resistant to a wide variety of agents 2. Consequently, the
long-term survival of patients with upper abdominal involvement (stage III) or those with disease
beyond the abdomen (stage IV) ranges from 30% to less than 10% 1.
Despite the highly lethal nature of EOC, the clinical course of advanced disease can be difficult
to predict in an individual patient. A small fraction of patients will be cured with surgery
followed by chemotherapy, another group will experience relapse after a relatively long time
interval (e.g., greater than 1-2 years), others will relapse and succumb to this disease within
months of completing first-line therapy, and some will exhibit primary resistance to first-line
chemotherapy. For patients with advanced disease, features associated with a more favorable
prognosis include ability to perform an optimal surgical debulking, low grade disease, non-clear
cell histology, age less than 65 years, a rapid serologic (CA-125) response to chemotherapy, the
presence of BRCA-1 germ-line mutation, and overexpression of pro-apoptotic proteins such as
BAX 1,3-7. Nonetheless, these prognostic factors are imperfect predictors of outcome, and for the
most part they do not provide insight into the biologic mechanisms responsible for clinical
behavior.
The heterogeneity of clinical outcomes in patients with ovarian cancer suggests that reliable
prognostic and/or predictive factors would be of potential clinical value. Accurate predictive
markers might identify patients who are appropriate candidates for novel first-line experimental
approaches, based upon a high chance of exhibiting resistance to standard first-line
chemotherapy. Alternatively, accurate prognostic factors may permit identification of patients
who are likely to relapse and die of disease, despite achievement of a complete response. Such
patients may be appropriate candidates for experimental approaches designed to determine the
value of maintenance or consolidation strategies, for instance. Finally, reliable prognostic and/or
predictive factors might provide important insights into the biology of drug resistance and tumor
aggressiveness, yielding potentially new molecular targets for drug development.
Previous studies investigating the mechanisms of drug resistance, tumor growth, and metastatic
potential have revealed that these processes are multifactorial in nature, associated with genetic
abnormalities in multiple gene families.
Thus, more recent attempts to develop accurate
predictors of clinical outcome in other malignancies have focused on techniques that are capable
of assessing global gene expression. This task has become feasible through the development of
genome-wide expression arrays (cDNA and oligonucleotide microarrays), which have been
capable of distinguishing between specific tumor types (e.g., myeloid versus lymphoid
leukemia), between specific histologic subtypes (e.g., follicular versus large cell lymphoma), and
between different clinical outcomes 8-11. For example, microarray expression profiles in patients
with non-Hodgkin’s lymphoma have been recently shown to provide prognostic information that
was independent of standard clinical metrics such as the International Prognostic Index, attesting
to the potential clinical utility of this technique 10,11.
In this study, we utilized oligonucleotide microarrays to globally analyze gene expression of
primary ovarian cancer samples in order to define profiles that have prognostic relevance. We
demonstrate that it is possible to accurately prognosticate clinical outcome in patients with EOC
using this technique, and we discuss the potential relevance of these findings for clinical
management.
Materials and Methods
Study subjects. Sixty-eight patients with epithelial ovarian cancer diagnosed between January
1995 and October 2000 form the basis of this study (n=38 patients from Beth Israel Deaconess
Medical Center (BIDMC) and n=30 patients from Memorial Sloan-Kettering Cancer Center
(MSKCC)).
All patients underwent exploratory laparotomy for diagnosis, staging, and
debulking, followed by first-line platinum/taxane based chemotherapy.
Standard post-
chemotherapy surveillance included serial physical examination, serum CA-125 level, and CT
scanning based upon clinical suspicion of relapse. At one of the two institutions (MSKCC),
patients who were in complete clinical remission after standard chemotherapy were considered
for a second-look laparoscopy, although findings from this procedure were not taken into
account in the definition of complete clinical remission (see below). Follow-up data for this
study were extracted from the Ovarian Cancer Relational Database at BIDMC and the Ovarian
Cancer Clinical Database at MSKCC. The study protocol for collection of tissue and clinical
information was approved by the Institutional Review Boards at both institutions, and patients
provided written informed consent authorizing collection and use of the tissue for study
purposes.
Clinical definitions. Staging was assessed in accordance with the International Federation of
Gynecology and Obstetrics (FIGO) 1. Optimal debulking was defined as less than or equal to 1
cm. gross residual disease, and suboptimal debulking was more than 1 cm. residual disease. A
complete clinical response/remission (CCR) was defined as resolution of all clinical/radiographic
evidence of disease and normalization of the serum CA-125 level after the completion of first
line chemotherapy. Completion of first-line chemotherapy was considered to be the date of the
last administered cycle of treatment. For the purpose of this study, persistent disease was
defined as lack of a complete response to first-line chemotherapy. For patients who achieved a
CCR, disease-free survival (DFS) was defined as the time interval between the end of first-line
chemotherapy and the first confirmed sign of disease recurrence. Overall survival (OS) was
defined as the time interval between the date of diagnosis and the date of death from any cause.
RNA isolation. Ovarian cancer samples were collected at the time of primary debulking surgery
and frozen at –80°C. Microdissection was not used in this analysis, in order to assess the
contribution of stromal and hematopoietic cell elements to the genetic profile. Tumor samples
were pulverized in liquid nitrogen and homogenized in Trizol solution, followed by RNA
isolation using standard techniques.
cDNA synthesis, microarray probe preparation, and Affymetrix GeneChip hybridization. These
procedures were carried out using standard protocols and are described in detail in the on-line
supplement to this manuscript (www.bidmcgenomics.org/OvarianCancer ) as well as in previous
publications 12-15. The Affymetrix U 95 A2 array was used containing 12,625 transcripts. Image
analysis was performed using the MAS 5 Affymetrix algorithm 12-15.
Training set data analysis. A three-step process was developed to identify a gene expression
profile using a randomly chosen training set of 34 samples (Figure 1). In step 1, samples from
seven patients with the shortest survival (excluding censored patients) and seven patients with
the longest known survival were analyzed with supervised statistical methods of pattern
recognition and class prediction (first training step, Figure 1) 16-23. The subsets of genes with the
highest predictive accuracy (by leave-one-out cross validation)8 for the initial 14 samples were
then selected for a second training step (Figure 1), in order to refine the expression profile. For
this step, class labels were assigned to the remaining 20 patient samples from the training set, by
predicting their class membership using the genes identified in the first training step. Once the
labels were assigned, the survival times of the entire group of 34 training samples were assessed
by Kaplan-Meier analysis. Predictive signatures with various numbers of genes were tested (all
of them with the highest predictive accuracy for the first 14 training samples), until a distinction
with maximal statistical significance and stable class assignments was reached by Kaplan-Meier
analysis. The class assignments that yielded the best survival discrimination were considered to
be the candidate phenotypes for final refinement in the third training step (Figure 1). For this
step, the entire training set of 34 samples was then split into a favorable and unfavorable group
(based on these class assignments), and the two groups were again subjected to pattern
recognition and class prediction analysis. The signature with the highest predictive accuracy (by
leave one out cross validation) for the previously assigned 34 labels was chosen as the final gene
profile. The resultant gene profile was then applied to an independent set of samples (validation
set) in order to confirm its prognostic significance.
Gene expression pattern analysis and class prediction. Details on the pattern recognition
algorithm are provided in previous publications 16-23. Briefly, this is a supervised method that is
designed to discover patterns of gene expression associated with binary phenotypes. A pattern is
defined as a subset of genes whose expression levels are tightly clustered (usually at a high or
low expression level) in a subset of samples within a given phenotype. A computer algorithm
(SPLASH) 17 is used to discover all patterns characteristic of the two phenotypes at a given level
of statistical significance as previously described 16-22. The degree of differential gene
expression was assessed by a signal to noise ratio and a permutation test as described previously 8.
Class predictions at all steps (training and validation) were carried out using the weighted
voting 8,11,23-25 and k nearest neighbor (k-nn) 9,26,27 algorithms. Predictive accuracy in the
training set was assessed by leave-one-out cross validation 8. The p value for predictor accuracy
was calculated using the Fisher’s test on the prediction contingency table and by a permutation
test as described previously 23,25.
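As an illustration of the class prediction step, the following Python sketch scores genes by a signal-to-noise ratio and classifies a held-out sample by weighted voting inside a leave-one-out loop. It is a minimal sketch under stated assumptions, not the GeneCluster implementation used in this study: the function names, the synthetic expression matrix, and the choice to rank genes by absolute score are all illustrative assumptions.

```python
import numpy as np

def gene_scores(X, y):
    """Per-gene signal-to-noise score (mean1 - mean0) / (sd1 + sd0),
    plus the per-gene midpoint between the two class means."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    sd0, sd1 = X0.std(axis=0, ddof=1), X1.std(axis=0, ddof=1)
    s2n = (mu1 - mu0) / (sd1 + sd0 + 1e-12)  # small constant avoids 0/0
    return s2n, (mu1 + mu0) / 2.0

def weighted_vote(X_train, y_train, x_new, n_genes):
    """Predict class 0 or 1 for x_new from the n_genes highest-scoring genes."""
    s2n, midpoint = gene_scores(X_train, y_train)
    # Assumption: rank by absolute score (published variants may instead
    # take equal numbers of genes from each direction).
    top = np.argsort(-np.abs(s2n))[:n_genes]
    votes = s2n[top] * (x_new[top] - midpoint[top])
    return int(votes.sum() > 0)

def loocv_accuracy(X, y, n_genes):
    """Leave-one-out accuracy; genes are re-selected in every fold so the
    held-out sample never influences gene selection."""
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        hits += int(weighted_vote(X[keep], y[keep], X[i], n_genes) == y[i])
    return hits / len(y)

# Synthetic data (hypothetical): 14 samples, 50 genes, 5 informative genes.
rng = np.random.default_rng(0)
y = np.array([0] * 7 + [1] * 7)
X = rng.normal(size=(14, 50))
X[y == 1, :5] += 6.0
acc = loocv_accuracy(X, y, n_genes=5)
print(f"leave-one-out accuracy: {acc:.2f}")
```

Re-selecting genes within each fold is the design choice that matters here: selecting genes once on all samples and then cross-validating would overstate accuracy.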
Statistical tests and survival analysis. Associations between categorical variables were assessed
with the Fisher’s exact test. Differences in median values were assessed with the Wilcoxon’s
test when appropriate. All deaths observed in the dataset were cancer-related, meaning that
overall survival is equivalent to cancer specific survival for purposes of this analysis. Overall
survival (OS) and disease-free survival (DFS) curves were generated with the Kaplan-Meier
method, and differences between survival curves were assessed for statistical significance with
the log-rank test. Multivariate analysis for confounding factors was carried out using Cox
Proportional Hazards Regression with categorical or continuous covariates as appropriate. For
this analysis, gene profile was considered as a binary category (favorable, unfavorable) as
described in the predictive analysis. Age was considered as a continuous variable and the rest of
the covariates were considered as categorical variables. The p values of all statistical tests were
two-sided. The Genes at Work (IBM), Whitehead GeneCluster 2 and SPSS (version 11.5)
packages were used for statistical tests. Details of the bioinformatics and statistical methods are
provided in the on-line supplement to this manuscript
(www.bidmcgenomics.org/OvarianCancer).
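The survival methods named above can be sketched in a few lines of Python. The example below implements the Kaplan-Meier product-limit estimate and a two-group log-rank test from first principles; it is a hedged illustration only (the study itself used SPSS and other packages), and the function names are assumptions. The input values are the overall survival times of the 14 samples listed in the Results for the first training step, where "+" denotes a censored observation. Because those samples were deliberately chosen from the extremes of the survival spectrum, the resulting p value is purely illustrative and is not a finding of the study.

```python
import math

def kaplan_meier(times, events):
    """Product-limit estimate. events: 1 = death, 0 = censored.
    Returns [(time, survival probability)] at each death time."""
    curve, surv = [], 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
    return curve

def median_survival(curve):
    """First time at which survival drops to 0.5 or below;
    None means the median has not yet been reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

def logrank_p(times_a, events_a, times_b, events_b):
    """Two-sided log-rank p value (chi-square with 1 degree of freedom)."""
    all_times, all_events = times_a + times_b, events_a + events_b
    death_times = sorted({t for t, e in zip(all_times, all_events) if e})
    o_minus_e, var = 0.0, 0.0
    for t in death_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        o_minus_e += d_a - d * n_a / n          # observed minus expected
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = o_minus_e ** 2 / var
    return math.erfc(math.sqrt(chi2 / 2.0))     # upper tail, 1 df

# Shortest survivors (all deaths) and longest survivors (63 is a death,
# the remaining six are censored), in months.
short_t, short_e = [4, 10, 12, 18, 19, 24, 26], [1, 1, 1, 1, 1, 1, 1]
long_t, long_e = [58, 59, 61, 63, 65, 68, 73], [0, 0, 0, 1, 0, 0, 0]

short_median = median_survival(kaplan_meier(short_t, short_e))
long_median = median_survival(kaplan_meier(long_t, long_e))
p_value = logrank_p(short_t, short_e, long_t, long_e)
print(short_median, long_median, p_value)
```

Returning None for the median mirrors the "not yet reached" wording used throughout the Results: the estimated survival curve for that group never falls to 50% within the available follow-up.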
Results
Patient characteristics. The clinical and pathologic characteristics of the 68 patients with
epithelial ovarian cancer are shown in Table 1. The median age at diagnosis was 55 years (range
36 to 80 years), and the majority (96%) had advanced stage (FIGO stages III/IV), grade III
tumors (80%), with papillary serous histology (97%). Sixty-five percent of patients were
optimally cytoreduced after initial surgery (less than or equal to 1 cm. residual diameter disease),
and all received post-operative taxane/platinum-based combination chemotherapy. The median
follow up was 40+ months (range 1 to 74+ months), with a median overall survival (OS) for the
entire group of 49 months, and a median disease-free survival (DFS) of 15 months. Thus, the
survival characteristics of this group are typical for patients with advanced epithelial ovarian
cancer.
Development of the Ovarian Cancer Prognostic Profile (OCPP). The strategy for identifying a
gene expression profile with prognostic significance is shown in Figure 1. In the first training
step, 14 samples out of a randomly chosen training set of 34 samples were initially selected for
pattern analysis. This group consisted of 7 samples with the shortest OS time (4, 10, 12, 18, 19,
24, 26 months) and 7 samples with the longest OS times (58+, 59+, 61+, 63, 65+, 68+, 73+).
These samples were selected in an orderly fashion starting from the most extreme sample on
each end of the survival spectrum, with the two groups roughly representing less than 2-year and
more than 5-year survival. Pattern analysis comparing the two groups revealed approximately
100 multigene patterns that associated with the two survival groups (p<0.001 for each pattern;
additional data may be found in the on-line supplement). Class prediction with the weighted
voting and the k nearest neighbor algorithm was performed. Predictor sets ranging from 120 to
300 genes using the k nearest neighbor algorithm showed high predictive accuracy (100%,
p=0.0004 by a Fisher’s exact test and p<0.001 by a permutation test). Similar results were
obtained with the weighted voting algorithm (100%, p=0.0003) using 180-400 genes. In the
second training step, we used a weighted voting predictor with 200 genes (with 100% accuracy
for the initial 14 samples) to assign labels (favorable and unfavorable) to the remaining 20
samples from the training set and then generated survival curves for the entire group.
Kaplan-Meier analysis showed a statistically significant difference in OS between the two groups. The
unfavorable group had a median survival of 33 months whereas the favorable group had a
median survival that has not yet been reached (log rank p=0.0008). We also tested a range of the
other highly accurate predictors (between 120-400 genes both by the k-nn and weighted voting
methods) for their performance on the remaining 20 samples and obtained similar results, with p
values for the Kaplan-Meier analysis being very similar to that of the 180-gene predictor. In the
third training step, we utilized the entire group of the 34 training samples in order to develop a
final candidate signature.
We carried out pattern recognition and obtained 766 multigene
patterns that associated with the two classes (favorable and unfavorable) at a p<0.001. The
highest predictive accuracy was obtained using a 115-gene predictor (85% by weighted voting
and 91% by k-nn, p=0.00005 by Fisher’s exact test and p<0.001 by permutation test). This final
gene expression profile is shown in Figure 2 and will be referred to as the Ovarian Cancer
Prognostic Profile (OCPP).
Association of the OCPP with survival. The OCPP was used to assign labels (favorable versus
unfavorable) to a randomly chosen validation set of 34 patient samples, followed by
Kaplan-Meier analysis. These samples are distinct from the training set and had not been used at any
step in the generation of the OCPP. As shown in Figure 3A, a strong survival split was observed
on the basis of the OCPP, with a median OS of the unfavorable and favorable groups of 30
months and not yet reached, respectively, at a median follow up of 47 months (log rank
p=0.004). In addition, there is a suggestion of a plateau in the favorable curve that identifies a
subset of patients with a particularly indolent course, having a close to 70% long-term survival at
5 years. After the prognostic value of the signature was validated, we then applied the signature
to the entire set of 68 patients in order to arrive at a more stable estimate of the effect size
(Figure 3B). The median survival for the unfavorable group was 30 months, while it has not yet
been reached for the favorable group at a median follow up of 49 months (log rank p=0.0001).
By a univariate Cox Proportional Hazards Model, the hazard ratio (HR) for death in the
unfavorable group was 4.6 (95% CI: 2.0-10.7, p=0.0001) relative to the favorable group.
Patients from both hospital sites were similarly represented in the two groups, with 47% and
60% of samples from each site assigned to the unfavorable and favorable groups, respectively.
The OCPP was similarly prognostic when used to analyze the BIDMC versus MSKCC groups
separately (see on-line supplement for additional details). The OCPP was also used to assess
DFS, as shown in Figure 4. Within the validation set, median DFS was 10 and 33 months for the
unfavorable and favorable groups, respectively (log rank p=0.01). When all 68 patients were
considered together, the median DFS was 10 and 20 months, respectively (log rank p=0.015).
Figure 5 shows the Kaplan-Meier analysis as a function of gene profile for homogeneous subsets
of patients with stage III/IV disease (n=33), grade III disease (n=29), or optimal debulking status
(n=24) in the validation set. We purposely avoided mixing the training and validation sets for
these subset analyses, in order to avoid re-analyzing samples that had already been used to
generate the prognostic gene profile. For the subset of patients with stage III/IV disease, the
median OS for the unfavorable and the favorable gene profile classes was 30 months versus not
yet reached, respectively (Figure 5A, p=0.006). For patients with grade III disease, the median
OS for the unfavorable and the favorable profile was also 30 months versus not yet reached,
respectively (Figure 5B, p<0.0006). Restricting the analysis to only those patients who were
optimally debulked, the median OS for the unfavorable versus the favorable gene profile groups
was 41 months versus not yet reached, respectively (Figure 5C, p=0.08). Thus, the OCPP
provided excellent discrimination of survival curves for these patient subsets. More specifically,
these results indicate that the survival predictions shown in Figure 3 were not sensitive to the
small number of early stage and low-grade patients contained within this patient cohort.
Association of the OCPP with other clinical parameters. Table 2 shows the distribution of
several known prognostic factors as a function of gene profile assignment. The two groups
(favorable and unfavorable) were well balanced for grade, stage, and histology. However, the
favorable profile group was enriched for patients who were optimally cytoreduced (81% versus
51%, p=0.02), whereas the unfavorable profile group was characterized by a higher median age
(61 versus 52 years, p=0001). Therefore, several prognostic factors were next evaluated by both
univariate and multivariate analysis (Table 3). In addition to gene profile, debulking status and
age maintained prognostic value for OS in univariate analysis. However, the OCPP maintained
independent prognostic significance in multivariate analysis (Table 3), when correcting for
debulking status and age. Specifically, the HR for death for the unfavorable versus the favorable
group was 4.8 in the validation set (95% CI: 1.3-17.9, p=0.021), as well as in the entire dataset
(HR 3.6, 95% CI: 1.6-8.3, p=0.002), while controlling for debulking status and age. Debulking
status and age were not independently associated with survival in any of the analyses (training
set, validation set, or entire dataset), while controlling for each other and for the OCPP, although
debulking status showed a trend towards significance in the validation set (HR 2.6, 95% CI: 2.97.5, p=0.069).
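For readers less familiar with the Cox model, the relationship between the reported hazard ratio, its confidence interval, and its p value can be written out as follows. This is an explanatory sketch, not part of the original analysis: it assumes a Wald-type test on the log hazard ratio and uses the rounded validation-set figures quoted above (HR 4.8, 95% CI 1.3-17.9), so the result is only approximate.

```latex
\begin{align*}
h(t \mid x, \mathbf{z}) &= h_0(t)\,\exp\!\left(\beta x + \boldsymbol{\gamma}^{\top}\mathbf{z}\right),
  \qquad x = 1 \text{ (unfavorable OCPP)},\; x = 0 \text{ (favorable OCPP)} \\
\mathrm{HR} &= \frac{h(t \mid x = 1, \mathbf{z})}{h(t \mid x = 0, \mathbf{z})} = e^{\beta} \\
\hat{\beta} &= \ln 4.8 \approx 1.57 \\
\widehat{\mathrm{SE}}(\hat{\beta}) &\approx \frac{\ln 17.9 - \ln 1.3}{2 \times 1.96} \approx 0.67 \\
z &= \frac{\hat{\beta}}{\widehat{\mathrm{SE}}(\hat{\beta})} \approx 2.3,
  \qquad p \approx 0.02 \text{ (two-sided)}
\end{align*}
```

Here \(\mathbf{z}\) stands for the other covariates in the model (debulking status and age), and \(h_0(t)\) is the unspecified baseline hazard. The approximate p value of 0.02 is in line with the reported p=0.021.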
Association between the OCPP and response to first-line chemotherapy. As shown in Table 4,
the percentage of patients achieving a CCR after first-line therapy in the favorable versus
unfavorable groups was 96% and 81%, respectively (p=0.063).
Although this trend did not reach
significance by a two-sided Fisher’s test, it suggests that the association between the OCPP and
survival (Figures 3 and 4) may be partly related to the likelihood of achieving a CCR with
first-line chemotherapy. However, after excluding patients who did not achieve a complete response
to chemotherapy, the unfavorable and favorable groups as defined by the OCPP still showed
significantly different OS (41 months versus not yet reached, respectively, p=0.012). Taken
together, these observations suggest that the prognostic influence of the OCPP is largely
independent of response to first-line treatment.
Second-look laparoscopy was routinely performed at one of the two participating institutions
(MSKCC) on patients who had achieved complete remission, had no detectable tumor by CT
scan at the end of post-operative chemotherapy, and met eligibility criteria for various
investigational protocols. Twenty-four (24) of the 30 patients from MSKCC had second-look
laparoscopy, with 14 patients having evidence of residual disease. There was no statistically
significant association between gene profile (favorable/unfavorable) and the second-look
laparoscopy findings.
Specifically, the percentage of patients with positive second-look
laparoscopy in the unfavorable and favorable groups was 55% and 61%, respectively (Fisher’s
p=1.0).
Functional classification of genes contained in the OCPP. The OCPP as shown in Figure 2
consists of 70 genes overexpressed in the unfavorable group and 45 genes overexpressed in the
favorable group. A full list of the 115 prognostic genes is provided in the on-line supplement.
Interestingly, several of these genes belong to families known to be associated with the
malignant phenotype (Table 5).
In order to avoid inflating the statistical significance of
differentially expressed genes, the p values were estimated using the validation set only. Gene
families represented in this profile include growth factor receptors and signaling molecules,
angiogenesis genes, cellular adhesion and tumor invasion genes, mesenchymal markers, as well
as hormone receptor associated genes. The possible significance of this gene expression profile
will be briefly discussed below.
Discussion
Currently available clinical factors provide an imperfect assessment of prognosis for patients
with advanced epithelial ovarian cancer.
By using gene expression profiling, we now
demonstrate the independent prognostic value of this technique when applied to tissue samples
obtained at the time of initial diagnostic laparotomy. In order to define the OCPP, we combined
a number of well-described methods of microarray analysis and phenotypic prediction in a way
that allowed us to approach survival as a continuous and censored variable. The training
approach that we have developed is based upon an initial assessment of samples at the extreme
ends of the survival spectrum, but avoids using an arbitrary cut-off for defining “long” and
“short” survival durations. In this regard, our analysis is similar to that used in previous studies
involving lymphoma and lung cancer 10,28, which also approached survival as a continuous
outcome in order to discover relevant prognostic signatures.
The gene profile shown in Figure 2 provided independent prognostic information for OS in
patients with advanced ovarian cancer. Specifically, we were able to discriminate between two
distinct OS groups on the basis of the OCPP (Figure 3), one with median OS of 30 months and
another with median OS that has not yet been reached after a median follow up of 49 months
(p=0.0001).
Importantly, the gene profile was strongly associated with survival in the
independent validation set. Beyond the difference in median survival, it is notable that the
favorable group demonstrated a possible survival plateau, with a subset of patients having a 70%
probability of survival at 5 years. This level of discrimination between poor and good risk
patients is not generally possible using conventional clinical factors and may provide a powerful
way to identify, at the time of diagnosis, those patients who are at highest risk for an unfavorable
outcome with conventional treatment approaches. In addition to OS, the OCPP was also
prognostic for DFS (Figure 4).
The prognostic power of our gene expression profile was not dependent upon its association with
other known characteristics, as it retained independent significance in multivariate analysis
(Table 3). This is a particularly important aspect of this study, given the emphasis recently
placed on appropriate multivariate assessment of genomic signatures used for clinical prediction 29.
Although there were 3 patients with early stage disease in our initial cohort (Table 2),
excluding these patients from the analysis did not diminish the prognostic significance of the
gene profile when applied only to patients with advanced stage disease (Figure 5A). Similarly,
the prognostic value of the gene profile was not sensitive to the small number of low-grade
tumors that were present in our study (Figure 5B). Furthermore, the profile showed prognostic
value even within the subset of optimally debulked patients, although this did not reach statistical
significance (p=0.08, Figure 5C).
Although limited by small numbers, insight into potential mechanisms underlying the prognostic
value of the OCPP was obtained by analyzing its association with response to first-line
chemotherapy. Although the OCPP was associated with a trend in chemotherapy response
(p=0.063, Table 4), the profile maintained strong prognostic significance when applied to the
homogeneous group of patients with chemosensitive disease. Thus, the prognostic value of the
OCPP cannot be solely ascribed to its association with drug resistance, and it is possible that it is
identifying other factors such as proliferative rate or metastatic potential that could alter the
natural history of this disease. In this regard, several genes with potential functional relevance
were overexpressed in the unfavorable group (Table 5 and Figure 2), including the platelet
derived growth factor receptor 30-31 and mesenchymal markers such as fibronectin 32 and
connective tissue growth factor 33. The coordinated expression of these and other mesenchymal
genes (such as fibromodulin and vimentin, Table 5) observed in the OCPP may reflect a
contribution from tumor stroma, and/or might represent a process known as
epithelial-mesenchymal transition, which has been correlated with aggressive tumor behavior in
pre-clinical model systems 34-39. In addition, the overexpression of estrogen pathway related genes
(such as the estrogen receptor binding site associated antigen 9) in the favorable group could
imply that estrogen responsiveness may contribute to an overall improved outcome, reminiscent
of the well-described prognostic association in breast cancer. It is particularly interesting that
certain genes upregulated in the unfavorable OCPP signature (Table 5) have been previously
associated with poor prognosis in EOC. For example, expression of plasminogen activator
inhibitor type 1 (PAI-1), a potentially important mediator of tumor invasion, has correlated with
tumor aggressiveness and poor patient outcome in EOC 40-43, as well as in other tumor types 41.
Likewise, thrombospondin 2 expression has been associated with poor prognosis in endometrial
cancer 44 and in EOC 45. Finally, VEGF-C expression has been previously associated with
inferior survival and lymphatic spread in EOC 46-48. These interesting observations
notwithstanding, it is important to point out that the functional role of these genes in ovarian
cancer remains to be established and cannot be conclusively derived from this descriptive study.
This study demonstrates that it is feasible to define a gene profile that independently correlates
with survival in epithelial ovarian cancer. The availability of a powerful prognostic tool such as
the OCPP may enable clinicians to identify those patients most appropriate for investigational
approaches such as novel first-line or maintenance strategies. In addition, the availability of
molecularly-defined survival phenotypes may permit more rational targeted therapy using agents
that inhibit the VEGF or PDGF pathways, for instance. Finally, although not directly tested in
our study, it may be possible to use gene profiling to identify patients with early stage disease
who are at high risk for relapse, and are therefore most appropriate for adjuvant platinum-based
chemotherapy. Although our data suggest the potential utility of this approach, it is recognized
that the prognostic value of gene profiling in ovarian cancer must be further evaluated in
additional prospective studies of patients with both advanced as well as early stage disease.
Acknowledgements
We thank Dr. Todd Golub for his helpful comments during manuscript preparation. We also
thank Dr. Arthur Sytkowski and the Clinical Investigator Training Program (CITP) for providing
Dr. Dimitrios Spentzos with prior research experience during fellowship training. Finally, we
wish to acknowledge the efforts of gynecologic oncologists at BIDMC and MSKCC in providing
tissue samples used in this analysis.
Figure Legends
Figure 1: Development of Gene Expression Profile. One-half of the patient cohort (training set,
n = 34) was randomly selected in order to develop a prognostic gene expression profile. Three
training steps were used to progressively refine the profile, as described in text. The resultant
gene expression profile was then applied to an independent set of patient samples (validation
set).
Figure 2: Expression plot of the 115 prognostic genes comprising the Ovarian Cancer
Prognostic Profile (OCPP). Rows: Prognostic gene expression levels (normalized). Complete
information regarding gene identity is provided in the on-line supplement (a subset of these
genes is also provided in Table 5). Columns: Training set samples (n = 34). Red color:
Overexpressed genes. Blue color: Underexpressed genes.
Figure 3: Association between the OCPP and Survival. Figure 3A: Overall survival in the
validation set (n = 34). Median survival for the unfavorable group is 30 months and has not been
reached for the favorable group at a median follow up of 47 months (p=0.004 by log rank test).
Figure 3B: Overall survival in the entire data set (n = 68). The OCPP was applied to the entire
data set (validation plus training samples) in order to more accurately assess effect size. Median
survival for the unfavorable and favorable groups is 30 months and not yet reached, respectively,
at a median follow up of 49 months (p=0.0001 by log-rank test).
Figure 4: Association between the OCPP and Disease Free Survival. Figure 4A: Disease-free
survival in the validation set. The median DFS for the unfavorable and favorable groups was 10
months and 33 months, respectively (p=0.01). Figure 4B: Disease-free survival in the entire
data set. The median DFS for the unfavorable and favorable groups was 10 months and 20
months, respectively (p=0.01).
Figure 5: Relationship between the OCPP and survival in homogeneous patient subsets.
Median survival of the unfavorable versus favorable OCPP groups was as follows: Stage III/IV
(n=33), 30 months versus not yet reached; Grade 3 (n=29), 30 months versus not yet reached;
Optimally debulked (n=24), 41 months versus not yet reached. All analyses were performed in the
validation set.
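Several of the legends above report a median survival that has "not yet been reached". The following minimal sketch illustrates how a Kaplan-Meier median is read off and why it can be undefined. The follow-up times are made up for illustration and are not patient data from this study; the function names `kaplan_meier` and `median_survival` are likewise hypothetical and do not describe the software used by the authors.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  -- follow-up in months
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve, surv = [], 1.0
    for t in sorted(deaths):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve

def median_survival(curve):
    """First time the estimated survival is 0.5 or lower; None if never."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical follow-up data (months), for illustration only.
unfav_times  = [8, 14, 22, 30, 30, 41, 47, 52]
unfav_events = [1,  1,  1,  1,  1,  1,  0,  0]
fav_times    = [12, 35, 40, 44, 47, 49, 55, 60]
fav_events   = [1,  0,  0,  1,  0,  0,  0,  0]

print(median_survival(kaplan_meier(unfav_times, unfav_events)))  # 30
print(median_survival(kaplan_meier(fav_times, fav_events)))      # None
```

In the second hypothetical group the estimated survival never falls to 50%, so the median is undefined at the available follow-up, which is the situation described as "not yet reached".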
Table 1: Clinical and pathological characteristics (N = 68)

Characteristic                          Number (percentage)
Age (Median, range)                     55 (36-80)
Stage (FIGO)
  I                                     1 (1.5%)
  II                                    2 (3%)
  III                                   58 (85.5%)
  IV                                    7 (10%)
Grade
  1                                     1 (1.5%)
  2                                     13 (19%)
  3                                     54 (79.5%)
Histologic subtype
  Papillary serous (pure or mixed)      62 (91%)
  Endometrioid                          1 (1.5%)
  Clear cell                            5 (7.5%)
Debulking status
  Optimal                               44 (65%)
  Suboptimal                            24 (35%)
First-line chemotherapy
  Platinum-based                        68 (100%)
  Taxane (paclitaxel or docetaxel)      68 (100%)
Table 2. Relationship between the OCPP and known prognostic factors.

Characteristic          Unfavorable OCPP   Favorable OCPP   P value*
                        (N=37)             (N=31)
Median Age (years)      61                 52               0.0001
Grade                                                       0.77
  1/2                   7 (19%)            7 (23%)
  3                     30 (81%)           24 (77%)
Stage                                                       0.59
  I/II                  1 (3%)             2 (6%)
  III/IV                36 (97%)           29 (94%)
Histology                                                   0.65
  Clear cell            2 (5%)             3 (10%)
  Other histology       35 (95%)           28 (90%)
Debulking status                                            0.02
  Optimal               19 (51%)           25 (81%)
  Suboptimal            18 (49%)           6 (19%)

* P values for grade, stage, histology, debulking status: Fisher’s exact test. P value for age:
Wilcoxon’s rank sum test.
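As an illustration of the Fisher's exact test named in the footnote, the sketch below applies a two-sided test to the debulking-status counts of Table 2 (19 and 18 patients in the unfavorable group, 25 and 6 in the favorable group). It is a plain-Python sketch written for this purpose, not the authors' software. The function name `fisher_exact_two_sided` is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is an assumption, since the text does not state which convention was applied.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Rows: unfavorable, favorable OCPP. Columns: optimal, suboptimal debulking.
p = fisher_exact_two_sided(19, 18, 25, 6)
print(round(p, 3))
```

The value printed can be compared with the P value of 0.02 reported for debulking status in Table 2.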
Table 3. Prognostic value of the gene expression profile adjusted for debulking
status and age by Cox Proportional Hazards regression.

                                        Univariate p value   Multivariate p value
Prognostic Factor(a)                    All patients         Training Set       Validation Set   All patients
OCPP(b) (unfavorable/favorable)         0.0001               0.03 (HR(c) 4.2)   0.02 (HR 4.8)    0.002 (HR 3.6)
Debulking status (suboptimal/optimal)   0.03                 0.51               0.069 (HR 2.6)   0.39
Age(d)                                  0.01                 0.36               0.44             0.27

a) Prognostic significance assessed for overall survival.
b) OCPP: Ovarian Cancer Prognostic Profile.
c) HR: Hazard Ratio for death.
d) Age analyzed as a continuous variable.
Table 4. Association between the OCPP and response to first-line chemotherapy.

Response category                     Unfavorable OCPP   Favorable OCPP   p value(a)
                                      (N=37)             (N=31)
Clinical response(b)                                                      0.063
  Complete Response                   30 (81%)           29 (96%)
  Persistent Disease                  7 (19%)            1 (4%)
Second-look laparoscopy(c) (N=24)                                         1.0
  Positive                            6 (55%)            8 (62%)
  Negative                            5 (45%)            5 (38%)

a) P value by Fisher’s exact test.
b) Clinically defined as described in text.
c) Defined on the basis of surgical findings, including random biopsies.
Table 5: Expression pattern of selected genes in the unfavorable OCPP.

Gene Identity                               Expression pattern   P value
Connective tissue growth factor             Increased            <0.0001
Estrogen receptor binding site antigen 9    Decreased            <0.001
Fibromodulin                                Increased            <0.0001
Fibronectin                                 Increased            <0.0001
Fibronectin precursor                       Increased            <0.0001
Integrin beta 5 subunit                     Increased            <0.01
Leukemia inhibitory factor receptor         Decreased            <0.001
Lymphocyte antigen 75                       Decreased            <0.01
Mitogen inducible gene (mig-2)              Increased            <0.0001
PDGF receptor                               Increased            <0.0001
Plasminogen activator inhibitor I           Increased            <0.0001
Receptor protein tyrosine kinase            Increased            <0.0001
SHC transforming protein 1                  Increased            <0.001
Thrombospondin 2                            Increased            <0.0001
VEGF-C                                      Increased            <0.05
Vimentin                                    Increased            <0.001
v-fos transformation effector protein       Increased            <0.0001

Expression patterns were determined using Affymetrix expression values generated with the
MAS 5 algorithm. Permutation p values were estimated in the validation set. A color plot with
the OCPP expression patterns is also provided in Figure 2. Complete information regarding the
identity of genes comprising the OCPP is available in the on-line supplement at
www.bidmcgenomics.org/OvarianCancer .
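The footnote above states that permutation p values were estimated, without specifying the test statistic. The sketch below shows the general idea of a label-permutation test for a single gene, under the assumption that the statistic is the difference in mean expression between the two prognostic groups. The expression values are invented for illustration, and the function name `permutation_p_value` is hypothetical; this is not the procedure or software used in the study.

```python
import random

def permutation_p_value(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation p value for a difference in group means."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    def mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return sum(a) / len(a) - sum(b) / len(b)

    # Statistic for the true group labels, computed before any shuffling.
    observed = abs(mean_diff(pooled))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        if abs(mean_diff(pooled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)

# Hypothetical normalized expression values for one gene (not study data).
unfavorable = [9.1, 8.7, 9.4, 8.9, 9.6, 9.0]
favorable   = [7.2, 7.8, 7.5, 7.1, 7.9, 7.4]

p_gene = permutation_p_value(unfavorable, favorable)
print(p_gene < 0.05)
```

With a fixed seed the estimate is reproducible; a larger `n_perm` narrows the Monte Carlo error at the cost of run time.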
Figure 1: Development of gene expression profile
[Flow diagram, in order:]
1. All patients (n=68)
2. Training set (n=34), randomly selected
3. Selection of “extreme” survival patients (n=14)
4. Pattern recognition/class prediction (1st training step)
5. Optimization of class predictors (2nd training step) (n=34)
6. Refinement of final predictive signature (3rd training step) (n=34)
7. Independent validation of profile using a separate patient cohort (n=34)
Figure 2: Expression plot of 115 prognostic genes.
[Heat map. Columns grouped by prognosis: Unfavorable, Favorable. Color scale: normalized
expression, low to high. Labeled genes: Receptor protein tyrosine kinase; Fibronectin;
Fibronectin precursor; Mitogen inducible gene; Integrin beta 5; PDGF receptor; Connective
tissue growth factor; VEGF C; Thrombospondin 2; Vimentin; Plasminogen activator inhibitor 1;
Leukemia inhibitory factor receptor; Estrogen receptor binding site antigen 9; NOT4Hp
transcriptional repressor; Beta dystrobrevin; Non receptor protein tyrosine kinase Tnk1.]
Figure 3A: Overall Survival in the Validation Set
[Kaplan-Meier plot. Y-axis: overall survival. X-axis: months from diagnosis. Curves: favorable
profile versus unfavorable profile, P=0.004. Number at risk at baseline: unfavorable profile 18,
favorable profile 16.]
Figure 3B: Overall Survival in the Entire Data Set.
[Kaplan-Meier plot. Y-axis: overall survival. X-axis: months from diagnosis. Curves: favorable
profile versus unfavorable profile, P=0.0001. Number at risk at baseline: unfavorable profile 37,
favorable profile 31.]
Figure 4A. Disease Free Survival in the Validation Set
[Kaplan-Meier plot. Y-axis: disease free survival. X-axis: months from diagnosis. Curves:
favorable profile versus unfavorable profile, P=0.01. Number at risk at baseline: favorable
profile 15, unfavorable profile 13.]
Figure 4B. Disease Free Survival in the Entire Data Set
[Kaplan-Meier plot. Y-axis: disease free survival. X-axis: months from diagnosis. Curves:
favorable profile versus unfavorable profile, P=0.01. Number at risk at baseline: favorable
profile 31, unfavorable profile 29.]
Figure 5: Association of the OCPP with survival in selected patient subsets.
[Three Kaplan-Meier plots. Y-axis: overall survival. X-axis: months from diagnosis. Curves:
favorable profile versus unfavorable profile.
Stages III/IV: P=0.006; number at risk at baseline: favorable profile 15, unfavorable profile 18.
Grade III: P=0.0006; number at risk at baseline: favorable profile 13, unfavorable profile 16.
Optimally debulked: P=0.08; number at risk at baseline: favorable profile 12, unfavorable
profile 12.]
References
1. Cannistra SA: Cancer of the ovary. N Engl J Med 329:1550-9, 1993
2. McGuire WP, Hoskins WJ, Brady MF, et al: Cyclophosphamide and cisplatin compared
with paclitaxel and cisplatin in patients with stage III and stage IV ovarian cancer. N Engl J
Med 334:1-6, 1996
3. Ben David Y, Chetrit A, Hirsh-Yechezkel G, et al: Effect of BRCA mutations on the
length of survival in epithelial ovarian tumors. J Clin Oncol 20:463-6, 2002
4. Boyd J, Sonoda Y, Federici MG, et al: Clinicopathologic features of BRCA-linked and
sporadic ovarian cancer. JAMA 283:2260-5, 2000
5. Cass I, Baldwin RL, Varkey T, et al: Improved survival in women with BRCA-associated
ovarian carcinoma. Cancer 97:2187-95, 2003
6. Rubin SC: BRCA-related ovarian carcinoma. Cancer 97:2127-9, 2003
7. Tai YT, Lee S, Niloff E, et al: BAX protein expression and clinical outcome in epithelial
ovarian cancer. J Clin Oncol 16:2583-90, 1998
8. Golub TR, Slonim DK, Tamayo P, et al: Molecular classification of cancer: class
discovery and class prediction by gene expression monitoring. Science 286:531-7, 1999
9. Armstrong SA, Staunton JE, Silverman LB, et al: MLL translocations specify a distinct
gene expression profile that distinguishes a unique leukemia. Nat Genet 30:41-7, 2002
10. Rosenwald A, Wright G, Chan WC, et al: The use of molecular profiling to predict
survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med 346:1937-47,
2002
11. Shipp MA, Ross KN, Tamayo P, et al: Diffuse large B-cell lymphoma outcome
prediction by gene-expression profiling and supervised machine learning. Nat Med 8:68-74,
2002
12. Ruel M, Bianchi C, Khan TA, et al: Gene expression profile after cardiopulmonary
bypass and cardioplegic arrest. J Thorac Cardiovasc Surg 126:1521-30, 2003
13. Mitsiades N, Mitsiades CS, Richardson PG, et al: The proteasome inhibitor PS-341
potentiates sensitivity of multiple myeloma cells to conventional chemotherapeutic agents:
therapeutic applications. Blood 101:2377-80, 2003
14. Mitsiades CS, Mitsiades NS, McMullan CJ, et al: Transcriptional signature of histone
deacetylase inhibition in multiple myeloma: biological and clinical implications. Proc Natl
Acad Sci U S A 101:540-5, 2004
15. Mitsiades N, Mitsiades CS, Poulaki V, et al: Molecular sequelae of proteasome inhibition
in human multiple myeloma cells. Proc Natl Acad Sci U S A 99:14374-9, 2002
16. Califano A, Stolovitzky G, Tu Y: Analysis of gene expression microarrays for phenotype
classification. Proc Int Conf Intell Syst Mol Biol 8:75-85, 2000
17. Califano A: SPLASH: structural pattern localization analysis by sequential histograms.
Bioinformatics 16:341-57, 2000
18. Klein U, Tu Y, Stolovitzky GA, et al: Gene expression profiling of B cell chronic
lymphocytic leukemia reveals a homogeneous phenotype related to memory B cells. J Exp
Med 194:1625-38, 2001
19. Klein U, Tu Y, Stolovitzky GA, et al: Gene expression dynamics during germinal center
transit in B cells. Ann N Y Acad Sci 987:166-72, 2003
20. Klein U, Tu Y, Stolovitzky GA, et al: Transcriptional analysis of the B cell germinal
center reaction. Proc Natl Acad Sci U S A 100:2639-44, 2003
21. Kuppers R, Klein U, Schwering I, et al: Identification of Hodgkin and Reed-Sternberg
cell-specific genes by gene expression profiling. J Clin Invest 111:529-37, 2003
22. Lepre J, Rice JJ, Tu Y, et al: Genes@Work: an efficient algorithm for pattern discovery
and multivariate feature selection in gene expression data. Bioinformatics, 2004
23. Pomeroy SL, Tamayo P, Gaasenbeek M, et al: Prediction of central nervous system
embryonal tumour outcome based on gene expression. Nature 415:436-42, 2002
24. Ramaswamy S, Ross KN, Lander ES, et al: A molecular signature of metastasis in
primary solid tumors. Nat Genet 33:49-54, 2003
25. Ramaswamy S, Tamayo P, Rifkin R, et al: Multiclass cancer diagnosis using tumor gene
expression signatures. Proc Natl Acad Sci U S A 98:15149-54, 2001
26. Wu B, Abbott T, Fishman D, et al: Comparison of statistical methods for classification of
ovarian cancer using mass spectrometry data. Bioinformatics 19:1636-43, 2003
27. Kim S: Protein beta-turn prediction using nearest-neighbor method. Bioinformatics
20:40-4, 2004
28. Beer DG, Kardia SL, Huang CC, et al: Gene-expression profiles predict survival of
patients with lung adenocarcinoma. Nat Med 8:816-24, 2002
29. Ntzani EE, Ioannidis JP: Predictive ability of DNA microarrays for cancer outcomes and
correlates: an empirical assessment. Lancet 362:1439-44, 2003
30. Henriksen R, Funa K, Wilander E, et al: Expression and prognostic significance of
platelet-derived growth factor and its receptors in epithelial ovarian neoplasms. Cancer Res
53:4550-4, 1993
31. Shawver LK, Schwartz DP, Mann E, et al: Inhibition of platelet-derived growth
factor-mediated signal transduction and tumor growth by
N-[4-(trifluoromethyl)-phenyl]5-methylisoxazole-4-carboxamide. Clin Cancer Res 3:1167-77, 1997
32. Thant AA, Nawa A, Kikkawa F, et al: Fibronectin activates matrix metalloproteinase-9
secretion via the MEK1-MAPK and the PI3K-Akt pathways in ovarian cancer cells. Clin Exp
Metastasis 18:423-8, 2000
33. Sakamoto M, Kondo A, Kawasaki K, et al: Analysis of gene expression profiles
associated with cisplatin resistance in human ovarian cancer cell lines and tissues using
cDNA microarray. Hum Cell 14:305-15, 2001
34. Auersperg N, Pan J, Grove BD, et al: E-cadherin induces mesenchymal-to-epithelial
transition in human ovarian surface epithelium. Proc Natl Acad Sci U S A 96:6249-54, 1999
35. Jechlinger M, Grunert S, Tamir IH, et al: Expression profiling of epithelial plasticity in
tumor progression. Oncogene 22:7155-69, 2003
36. Thiery JP, Chopin D: Epithelial cell plasticity in development and tumor progression.
Cancer Metastasis Rev 18:31-42, 1999
37. Thiery JP: Epithelial-mesenchymal transitions in tumour progression. Nat Rev Cancer
2:442-54, 2002
38. Tran NL, Nagle RB, Cress AE, et al: N-Cadherin expression in human prostate
carcinoma cell lines. An epithelial-mesenchymal transformation mediating adhesion
with stromal cells. Am J Pathol 155:787-98, 1999
39. Vincent-Salomon A, Thiery JP: Host microenvironment in breast cancer development:
epithelial-mesenchymal transition in breast cancer development. Breast Cancer Res 5:101-6,
2003
40. Kuhn W, Schmalfeldt B, Reuning U, et al: Prognostic significance of urokinase (uPA)
and its inhibitor PAI-1 for survival in advanced ovarian carcinoma stage FIGO IIIc. Br J
Cancer 79:1746-51, 1999
41. Harbeck N, Kruger A, Sinz S, et al: Clinical relevance of the plasminogen activator
inhibitor type 1--a multifaceted proteolytic factor. Onkologie 24:238-44, 2001
42. Konecny G, Untch M, Pihan A, et al: Association of urokinase-type plasminogen
activator and its inhibitor with disease progression and prognosis in ovarian cancer. Clin
Cancer Res 7:1743-9, 2001
43. Chambers SK, Ivins CM, Carcangiu ML: Plasminogen activator inhibitor-1 is an
independent poor prognostic factor for survival in advanced stage epithelial ovarian cancer
patients. Int J Cancer 79:449-54, 1998
44. Seki N, Kodama J, Hashimoto I, et al: Thrombospondin-1 and -2 messenger RNA
expression in normal and neoplastic endometrial tissues: correlation with angiogenesis and
prognosis. Int J Oncol 19:305-10, 2001
45. Kodama J, Hashimoto I, Seki N, et al: Thrombospondin-1 and -2 messenger RNA
expression in epithelial ovarian tumor. Anticancer Res 21:2983-7, 2001
46. Hsieh CY, Chen CA, Chou CH, et al: Overexpression of Her-2/NEU in epithelial ovarian
carcinoma induces vascular endothelial growth factor C by activating NF-kappa B:
implications for malignant ascites formation and tumor lymphangiogenesis. J Biomed Sci
11:249-59, 2004
47. Ueda M, Terai Y, Kumagai K, et al: Vascular endothelial growth factor C gene
expression is closely related to invasion phenotype in gynecological tumor cells. Gynecol
Oncol 82:162-6, 2001
48. Yokoyama Y, Charnock-Jones DS, Licence D, et al: Vascular endothelial growth factor-D
is an independent prognostic factor in epithelial ovarian carcinoma. Br J Cancer 88:237-44,
2003