Cancer Survival as a Function of Age at Diagnosis: A Study
of the Surveillance, Epidemiology and End Results Database.
Authors
Mena N. Bassily a,b,c, MSc, MBBCh; Richard Wilson c, D.Phil; Francesco Pompei c, PhD, M.S;
Dimitriy Burmistrov d, PhD, MSc.
a Harvard School of Public Health, 401 Park Drive, Boston, MA 02215, USA.
b Menoufiya University, Shebin El-kom, Menoufiya, Egypt.
c Harvard University, Department of Physics, Jefferson Laboratory, 17 Oxford St, Cambridge, MA 02138, USA.
d ITA Software, 141 Portland St, Cambridge, MA 02139, USA.
Corresponding author:
Dr. Mena Bassily
Harvard University
Jefferson Laboratory
Room 257-A
17 Oxford Street, Cambridge, MA 02138
U.S.A.
Tel.: (617)- 332- 4823 (USA code: +1)
Fax: (617)- 495- 3387 (USA code: +1)
E-mail: [email protected]
ABSTRACT
Background: Recent research suggests that cancer survival has improved in recent cohorts.
Improvement in cancer survival is considered a valid indicator of the quality of care delivered to
patients. The aim of this study is to investigate changes in the survival profile over age for patients
with the most incident cancers. Methods: Survival data for 3.94 million patients diagnosed with 23
primary-site cancers in the periods 1979 to 1983, 1989 to 1993, and 1999 to 2003 were obtained
from the Surveillance, Epidemiology and End Results database. Gender- and cause-specific survival
probabilities were estimated at one, three, and five years after diagnosis using the Kaplan-Meier
survival estimate. Survival was presented for each of the studied cancers, cohorts, and sexes as
line graphs as a function of age at diagnosis. Error bars show the 95% confidence intervals.
Results: The graphs demonstrated that cancer survival improved over the
successive cohorts for most cancers, with several exceptions such as brain and lung cancers. The relation
between survival and the age at diagnosis generally took the form of a gradual decline phase
followed by a rapid fall-off phase at 70-80 years of age, with a few exceptions such as leukemia and Hodgkin
lymphoma. Patients who survived for 3 years were likely to live for 5 years after diagnosis, but
this prediction could not be extended to one-year survivors. Conclusion: Further studies on
tumor-specific characteristics and treatment modalities of these patients are suggested to clarify
the possible causes of variation in patients’ survival profiles over age.
Key words: Cancer survival; Cause-specific survival; Cancer in old age; SEER; Kaplan-Meier; survival phases.
Abbreviations:
DNA: Deoxyribonucleic Acid.
KM: Kaplan-Meier.
NCI: National Cancer Institute.
SEER: Surveillance, Epidemiology, and End Results.
SEER*Stat: Surveillance, Epidemiology, and End Results Statistical Software.
INTRODUCTION
Studying the survival of different cancers has important practical value for patients, providers, and
researchers 1,2. Cancer patients may wish to know how their prognosis is changing over time, and what
their life expectancy is given their disease status. A proper understanding of prognosis may help
both physicians and patients decide on treatment options, balancing personal values for
quality versus quantity of life 3. Knowledge of cancer survival provides a more objective basis to deem a
patient “cured” of their disease. Providers can use survival information to determine more objectively
an appropriate frequency of follow-up visits and aggressiveness of surveillance testing, based
on the patient’s current risk profile. When designing clinical trials, clinical researchers may also find it
useful in determining sufficient follow-up times for trial endpoints 4.
Period survival more accurately describes patients’ prognosis since the overall survival
projections are often discouraging and not necessarily pertinent for patients who have survived the initial
treatment period, as prognosis after initial management isn’t static. Patients who have survived an
interval of time after diagnosis have a different probability of surviving for the following 5 years from
that which was estimated at the time of diagnosis 5.
Two forms of net (non-crude) survival analysis are available in SEER*Stat: relative survival and
cause-specific survival. Both methods estimate the likelihood that cancer patients will not die
from causes associated with their cancer. Relative survival is derived by comparing all-cause survival
in a group of cancer patients with all-cause survival in a cancer-free age- and
sex-matched population. This means that in relative survival, unlike cause-specific survival, the
numerator and the denominator are derived from different populations. Moreover, relative survival
rests entirely on the assumption that the other causes of death in the Surveillance, Epidemiology and
End Results (SEER) cohorts and in the general U.S. population are highly comparable. This
assumption may be misleading if a factor present in a cohort increases the risk of dying
from causes other than cancer, compared with the general population. For example, smokers may be
heavily represented in a lung cancer cohort, and they often die of smoking-related causes other than
lung cancer, e.g. heart disease. Relative survival cannot separate the risk of death from lung cancer from
the risk of dying of non-cancer causes due to smoking. Therefore, the relative survival fraction for those
with lung cancer overestimates the effect of lung cancer alone 6.
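In symbols, the relative survival ratio described above can be written as follows. The notation is an assumption introduced here for illustration and is not taken from SEER*Stat: S_obs(t) is the all-cause survival of the patient cohort at time t, and S_exp(t) is the all-cause survival expected in the matched cancer-free population.

```latex
\[
  RS(t) = \frac{S_{\mathrm{obs}}(t)}{S_{\mathrm{exp}}(t)}
\]
```

With hypothetical values S_obs(5) = 0.40 and S_exp(5) = 0.80, the ratio is RS(5) = 0.50. If the cohort carries extra non-cancer mortality (as in the smoking example), S_obs falls while S_exp is unchanged, so RS falls even though the cancer itself did not cause the additional deaths.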
This article provides an overview of cause-specific survival of many cancers as a function of age
at diagnosis, to assess whether a specific pattern or trend describes the survival of all
cancer types, especially in the oldest age groups (85+ years). To provide detailed modeling of survival at old
ages, the 85+ age group provided in the SEER database was re-classified into smaller five-year age
groups. This study may also be used by clinicians as a benchmark for informing patients with different cancers
about the probability of surviving for 1, 3, and 5 years. We were also concerned with studying the
progress in cancer survival over the last three decades, to detect whether advances in cancer screening,
diagnosis, and treatment translated into improved longevity for cancer patients.
MATERIALS AND METHODS
The Surveillance, Epidemiology and End Results database of the National Cancer Institute (NCI)
is the largest population-based cancer registry in the United States, geographically encompassing
approximately 26% of the U.S. population. The latest SEER17 cancer registry collects data on cancer
incidence and survival from seventeen population-based cancer registries: Connecticut, Iowa,
New Mexico, Utah, Hawaii, the metropolitan areas of Detroit, California (San Francisco-Oakland, San
Jose-Monterey, and Los Angeles), Atlanta, Seattle-Puget, rural Georgia, Arizona, New Orleans,
Louisiana, New Jersey, Puerto Rico, Kentucky, in addition to American Indian Alaska natives 7. These
registries were chosen for their completeness and their adequate representation of all races and minority
populations 8. The SEER program standard for data completeness is 98% 9.
The SEER program registries routinely collect data on the demographic characteristics of
patients, tumor characteristics and staging, and follow-up of survival status. An
epidemiological study comparing SEER areas with non-SEER areas in the United States concluded that
the age and sex distributions of these areas were comparable 8. Survival data are actively collected
by SEER cancer registries and reported to the NCI. Data are then ascertained from hospital records,
private laboratories, radiotherapy units, other health care service units, and from death certificates when
cancer is listed as a cause of death 10.
SEER Data Confidentiality:
All available data in the SEER database are retrospective in nature. Personal identifiers are
absent from the database. All variables that might lead to reidentification such as the date of birth have
been removed or transposed. Any remaining risk of reidentification has been minimized by the
governing agency in not allowing the data to be available as public-use information. Investigators have
to sign a legally binding data-use agreement with the Centers for Medicare and Medicaid Services and
SEER 11.
Statistical Methods:
SEER*Stat software (version 6.5.2, National Cancer Institute, Bethesda, MD) was used to
download and analyze patients’ cause-specific survival. Cause-specific survival is defined as the
probability of surviving a certain type of cancer for a certain period of time, excluding death from causes
other than the cancer of interest (as reported in the death certificate and/or autopsy). In this study, only
primary cancers were considered. Consequently, the cause-specific survival presented in this study is an
estimate of the likelihood that primary cancer patients will not die from their primary cancer only, not from a
secondary cancer or other associated causes. To calculate cause-specific survival we used the
Kaplan-Meier product-limit method available in the SEER*Stat program. The Kaplan-Meier (KM) estimator is the
nonparametric maximum likelihood estimate of the survival function 12. In the SEER database, survival is
available for each cancer patient from the time of initial diagnosis to the date of last contact, or the date of
death if the patient has died. The KM estimator calculates the survival probability over a defined period of
time from the survival estimate at the end of each month of this period 7. This method
removes those who died during the specified time interval and promptly
censors the cases lost to follow-up, adjusting the at-risk group (denominator)
on a monthly basis, in order to give an accurate net survival estimate for the defined period of time.
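The product-limit calculation described above can be written compactly as follows. The symbols are assumptions introduced for illustration: t_j is the end of month j, d_j is the number of deaths from the cancer of interest during month j, and n_j is the number of patients still at risk at the start of month j, after earlier deaths and censored cases have been removed.

```latex
\[
  \hat{S}(t) = \prod_{j:\, t_j \le t} \left( 1 - \frac{d_j}{n_j} \right)
\]
```

For example, the five-year estimate multiplies the sixty monthly factors for t_1 through t_60; months with no deaths contribute a factor of 1.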
One-, three-, and five-year survival data for approximately 3.94 million patients diagnosed with
different primary site cancers (21 sites for females and 20 sites for males) were analyzed using the KM
method. The KM analysis allows estimation of survival over time, even when patients drop out or are
studied for different lengths of time. For each interval, survival probability is calculated as the number
of patients surviving the interval divided by the number of patients at risk. Patients who have died
are removed from the “at-risk” group for later intervals, and patients who dropped out or did not
reach the time of follow-up are considered censored. The probability of surviving to any point is then
estimated as the cumulative product of the probabilities of surviving each of the preceding time intervals.
When the population is large enough (such as in the SEER database), the estimated Kaplan-Meier
survival closely approaches the true survival of that population 12, 26, 27.
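The interval-by-interval calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the SEER*Stat implementation: the function names, the toy follow-up data, and the event coding (1 for death from the cancer of interest, 0 for censored) are all assumptions made for the example.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time.

    times  : follow-up in months for each patient (hypothetical data)
    events : 1 = died of the cancer of interest, 0 = censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)      # every patient leaves the risk set at their own time
    at_risk = len(times)        # denominator, updated after each time point
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            surv *= 1.0 - d / at_risk   # probability of surviving this interval
            curve.append((t, surv))
        at_risk -= exits[t]             # remove deaths and censored cases
    return curve


def survival_at(curve, t):
    """Read the step function: last estimate at or before time t."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


# Toy cohort of six patients followed for up to 60 months.
toy_times = [6, 12, 12, 20, 30, 60]
toy_events = [1, 1, 0, 1, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
print(round(survival_at(toy_curve, 60), 3))  # 0.444
```

In the toy cohort the censored patient at month 12 shrinks the risk set without lowering the curve, which is how the method handles patients followed for different lengths of time.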
Cancer cases were grouped according to age at diagnosis into twenty-three 5-year age groups
ranging from 0-4 to 110-114 years old. Age- and cause-specific survival rates for each type of cancer
were calculated separately for each sex and over three different cross-sectional cohorts (1979 to 1983,
1989 to 1993, and 1999 to 2003). We used three cohorts drawn from three successive decades to study
possible variability in cancer survival over time. To investigate the changes in cancer survival
over decades, the age- and sex-specific 5-year survival fraction for the three studied cohorts was
shown in line graphs (Figure 3). To show the survival probability at different periods after
diagnosis, the one-, three-, and five-year survival fractions for each type of cancer and sex were plotted
against age at diagnosis as line graphs (Figures 1 and 2).
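The age grouping described above can be reproduced with a small helper. This is a sketch only; the function name and the label format are assumptions chosen for illustration, not SEER variable names.

```python
def age_group(age, width=5, max_age=114):
    """Map an age at diagnosis (in years) to a 5-year group label such as '85-89'."""
    if not 0 <= age <= max_age:
        raise ValueError(f"age {age} is outside 0-{max_age}")
    lower = (int(age) // width) * width
    return f"{lower}-{lower + width - 1}"


# The twenty-three groups from 0-4 to 110-114.
AGE_GROUPS = [f"{lo}-{lo + 4}" for lo in range(0, 115, 5)]

print(len(AGE_GROUPS))   # 23
print(age_group(87))     # 85-89
```

Splitting the open-ended 85+ category this way is what allows separate survival estimates for the 85-89, 90-94, and later groups.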
For graphical clarity, only the 5-year survival plots show error bars. Where a very large
error bar appeared (> 75% of the entire scale of the y-axis), the corresponding data
point was considered statistically unreliable and the error bars were not shown at that point. Points
with such statistical uncertainty were connected to the other points of the line graph with a grey line.
Two types of error bars are presented in the plots. The first is the 95% confidence bands provided by
the SEER*Stat program. These confidence intervals were presented whenever SEER*Stat was able to
calculate them in the survival-session outputs. When the last patient in an age group died of the cancer of
interest, that patient was neither “censored” nor “lost” from observation for reasons
other than death. In that case, SEER*Stat returns both the survival and its upper confidence limit as
0%. This tells us that calculating the variance of the KM estimator with a small-variance
approximation is inapplicable, because some uncertainty should be associated even with the empirical
zero-survival values.
The zero-survival points were shown on the plots as bold square points. A method to assign
uncertainty and the confidence bands at these points was developed. These bands were shown on the
plots with dashed error-bars connected with the zero-survival points. These error-bars demonstrate the
one-sided 95% confidence intervals.
The main difficulty in estimating error bars on zero-survival points was the small sample size (1-20)
in the age groups that ended with zero survival. One approach that avoids the small-variance
approximation is the “bootstrap” 13. This approach is based on the idea that the original sample
represents the population from which it was drawn, so re-sampling from this sample may represent what
we would get if we took many samples from the population. Bootstrap methods for right-censored data
in general, and for the KM estimator in particular, have been studied by many authors 14-16. Technically, the
bootstrap is a version of a Monte Carlo (MC) calculation, implementing repeated random sampling with
replacement from the observed sample under the assumption that the times of death of the
patients of interest are independent of the times of patients’ loss from observation for other reasons
(censoring times). The procedure independently simulates two times for each hypothetical person: the time
of death from the cause under study, and the censoring time. If censoring happens before death, this person is
considered lost from observation at the simulated censoring time; otherwise he/she is considered deceased at
the simulated death time. Many random realizations of the KM estimator can be drawn this
way, and their distribution can be used to estimate confidence limits on the KM survival curve.
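A basic version of this idea can be sketched as follows. Note the simplification: instead of simulating separate death and censoring times as described above, this sketch resamples whole (time, event) pairs with replacement, which is the simplest common form of the bootstrap for censored data. The function names, the toy data, and the percentile interval are assumptions for illustration and are not the authors' code.

```python
import random


def km_survival(pairs, horizon):
    """KM survival at `horizon` from (time, event) pairs; event 1 = cancer death."""
    surv = 1.0
    at_risk = len(pairs)
    for t in sorted({time for time, _ in pairs}):
        if t > horizon:
            break
        deaths = sum(1 for u, e in pairs if u == t and e == 1)
        leaving = sum(1 for u, _ in pairs if u == t)
        if deaths:
            surv *= 1.0 - deaths / at_risk
        at_risk -= leaving
    return surv


def bootstrap_interval(pairs, horizon, n_boot=2000, level=0.95, seed=0):
    """Percentile interval for KM survival from resampled cohorts."""
    rng = random.Random(seed)
    n = len(pairs)
    estimates = sorted(
        km_survival([pairs[rng.randrange(n)] for _ in range(n)], horizon)
        for _ in range(n_boot)
    )
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * (n_boot - 1))]
    upper = estimates[int((1.0 - tail) * (n_boot - 1))]
    return lower, upper


# Hypothetical cohort: follow-up in months, 1 = cancer death, 0 = censored.
toy_pairs = [(6, 1), (12, 1), (12, 0), (20, 1), (30, 0), (60, 0)]
print(km_survival(toy_pairs, 60))
print(bootstrap_interval(toy_pairs, 60))
```

The fixed seed only makes the sketch reproducible; with real data the number of resamples and the interval type would need to be chosen with more care.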
Although it has been shown that such an approach can produce asymptotically correct confidence
bands for survival curves, and it works well in simulation studies with medium to large sample sizes, the
approach has limitations for small sample sizes. The following extreme example shows that this
approach is not perfect: suppose there is only one observed death from the cancer of interest in the original
sample. All bootstrap samples will consist of this single observation replicated, which produces zero
uncertainty in the survival estimate. The situation in this extreme example can
be improved if random samples are drawn not from the single observed value, but from a Bayesian
posterior distribution conditional on this single observation. We used the bootstrapping procedure with
random sampling from the Bayesian posterior distribution conditional on all of the observed death and
loss times, rather than from their empirical distributions. The proposed approach was implemented in
code and used to estimate the error bars at the zero-survival points in this paper. Confidence bands
obtained with the described method should be interpreted as expressions of the uncertainty of the
information extracted from the SEER database for the respective data points with zero survival. A
detailed description of the algorithm, as well as its small-sample and large-sample properties, may be
the subject of a forthcoming publication. The formal steps of the algorithm are shown in the appendix.
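The single-observation problem described above is easy to reproduce. The sketch below resamples a cohort containing one uncensored death and shows that every bootstrap estimate is identical. For contrast, it also computes a textbook exact binomial one-sided upper bound for the case where all n patients died. That bound is a swapped-in standard technique that ignores censoring; it is not the Bayesian-posterior procedure used in this paper, and all names here are illustrative assumptions.

```python
import random


def empirical_survival(death_times, horizon):
    """Fraction alive past `horizon`; equals the KM estimate when nobody is censored."""
    return sum(1 for t in death_times if t > horizon) / len(death_times)


def bootstrap_range(death_times, horizon, n_boot=1000, seed=0):
    """Smallest and largest survival estimate over ordinary bootstrap resamples."""
    rng = random.Random(seed)
    n = len(death_times)
    estimates = [
        empirical_survival([rng.choice(death_times) for _ in range(n)], horizon)
        for _ in range(n_boot)
    ]
    return min(estimates), max(estimates)


def zero_survival_upper_bound(n, alpha=0.05):
    """One-sided upper limit on survival when 0 of n uncensored patients survive.

    Solves (1 - p)**n = alpha for p (exact binomial bound).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 - alpha ** (1.0 / n)


# One patient who died at month 7: every resample is the same observation.
print(bootstrap_range([7], horizon=60))   # (0.0, 0.0)
print(zero_survival_upper_bound(1))
print(zero_survival_upper_bound(20))
```

The zero-width bootstrap range illustrates why resampling from the empirical distribution alone cannot express uncertainty at a zero-survival point, while the binomial bound shrinks toward zero only as the number of patients grows.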
RESULTS
Figure 1(A-D) and Figure 2(A-D) show age-specific cancer survival for all 21 of the major organ
site cancers listed in SEER for females and all 20 major organ sites listed for males, respectively,
along with the total for all cancers, for the periods 1979 to 1983, 1989 to 1993, and 1999 to 2003, for
a total of 129 cancer datasets. These graphs show the one-, three-, and five-year cancer survival as a
function of age, for both sexes, and over the three previously mentioned cohorts. No consistent cancer
survival pattern or trend could be derived from these plots; however, several results about cancer
survival were observed.
A) - Cancer survival over the three cohorts (decades): (figure 3 A-C)
For most of the studied cancers, the figures show a definite shift of cancer survival from
left to right from the earlier to the later cohorts. This denotes a considerable improvement in survival
of these cancers at the younger ages, with the greatest survival improvements in non-Hodgkin
lymphoma and leukemia. These figures are also available in an expanded larger view at
(http://physics.harvard.edu/~Wilson/cancers and chemicals/survival.html). Interestingly, some cancers showed
persistently high survival rates over the decades, e.g. melanoma (females better than males), prostate,
testicular, and thyroid cancers.
Although cancer survival improved over the years, the survival fraction for some cancers such as
pancreatic cancer remained poor. On the other hand, survival of corpus uteri and urinary bladder cancers
did not seem to improve over the three cohorts but the survival fraction was good.
Some cancers showed inconsistent increases in the survival fraction. For example, survival of
laryngeal cancer patients increased between the 1979-1983 and 1989-1993 cohorts, then declined
between the latter and the 1999-2003 cohort. This inconsistency was mainly observed for patients aged
40 years or older at diagnosis. Similarly, a very slight and inconsistent improvement in survival rate was
observed in male breast cancer. In contrast, survival of some cancers remained unchanged, e.g.
brain, lung and bronchus, urinary bladder, cervical, and uterine cancers. Unfortunately, liver cancer
survival was very low and became worse in the last decade.
B) - Cancer survival with the age of patient at diagnosis:
Two main conclusions could be drawn from the plots. Firstly, cancer survival declines as the
age at diagnosis increases. Secondly, the relation of cancer survival to the age at diagnosis can be
described as two-phased. Phase I is a gradual decrease of survival with the age at
diagnosis. Phase II is a rapid fall-off phase during which cancer survival declines rapidly. For the
majority of the studied cancers, phase II begins at 70-80 years of age.
Some exceptions to these patterns were noted:
- In leukemia and prostate cancer, there was a transient improvement of survival between the
ages of 40 and 60 years for the former and 40 and 90 years for the latter, before the fall-off phase took
place.
- Phase I did not apply to some cancers, e.g. colorectal, uterine, oral, prostate, testicular
cancers, and melanoma. These cancers showed a plateau of constant survival, rather
than a gradual decline, before the rapid fall-off phase (Phase II) started.
- In some cancers, e.g. brain, lung and bronchus, renal, pancreatic, and ovarian cancers, as well
as Hodgkin lymphoma, the rapid fall-off phase (Phase II) starts at an earlier age, mainly at
40 years.
C) – One, three, and five-year cancer survivals:
The one-, three-, and five-year survival plots followed the same pattern of change with the age of
the patient at diagnosis, as indicated by the approximate parallelism of the three period-survival plots. In
most of the studied cancers, the plots representing the three- and five-year survival rates were very close;
however, there was a wide gap between both of them and the one-year survival plot. This
observation suggests that cancer patients who lived for 3 years after diagnosis have a high
probability of surviving for 5 years, e.g. patients with Hodgkin lymphoma, renal cancer, and esophageal
cancer, but this conclusion cannot be extended to cancer patients who have survived only one year after
diagnosis.
One exception to this conclusion is that in some cancers the probability of survival for 5
years after diagnosis was similarly high for both one-year and 3-year survivors, e.g. melanoma
(especially in females), uterine, and testicular cancers.
Another exception is breast cancer and myeloma, in which survival for 3 years post-diagnosis
was not a valid predictor of surviving for 5 years. This is shown by the wide gap between the 3- and
5-year survival plots in both cancers.
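The link between the gap separating two survival curves and the outlook for patients who have already survived some time can be made explicit with conditional survival: the probability of reaching 5 years given survival to year k is S(5)/S(k). The sketch below uses made-up survival fractions, not values from the SEER analysis, and the function name is an assumption.

```python
def conditional_survival(s_later, s_earlier):
    """P(survive to the later time | alive at the earlier time) = S(later) / S(earlier)."""
    if not 0.0 < s_earlier <= 1.0:
        raise ValueError("s_earlier must be in (0, 1]")
    if not 0.0 <= s_later <= s_earlier:
        raise ValueError("s_later must be between 0 and s_earlier")
    return s_later / s_earlier


# Hypothetical one-, three-, and five-year survival fractions for one age group.
s1, s3, s5 = 0.80, 0.62, 0.58

print(round(conditional_survival(s5, s3), 3))  # 0.935: 3-year survivors usually reach 5 years
print(round(conditional_survival(s5, s1), 3))  # 0.725: much less certain for 1-year survivors
```

When the 3-year and 5-year curves nearly coincide the first ratio approaches 1, whereas a wide gap between the 1-year and 5-year curves keeps the second ratio well below 1.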
DISCUSSION
Survival information is potentially of great importance to patients, clinicians, and researchers as
it quantifies a patient’s changing risk profile over time. This study discussed age-specific cancer survival
for 20 of the major organ site male cancers and 21 major organ site female cancers based on the SEER
data.
No single model describes the pattern of cancer survival as a function of age at diagnosis for all
of the studied cancers. This may be explained by the wide range of cancer-specific factors which
influence the survival pattern, including cancer site, pathological differentiation of the cancer cells,
cancer stage at the time of diagnosis, the time span between cancer incidence and start of treatment, type
of intervention, quality of care received, and associated morbidity 17. This result highlights the wide
discrepancy between modeling of cancer incidence and modeling of survival. Harding and colleagues 18 observed a
unique cancer incidence model when they studied the same primary site cancers, cohorts, and age
groups which we used in this study. They further reported that the incidence of each cancer tends to fall
in the same way at old ages and could fit into one single model. Unlike incidence, no turnover in the
cause-specific survival was observed in old age groups.
An overall improvement in survival of cancer patients over the successive decades was achieved,
as represented by the definite shift of cancer survival from left to right in the later cohorts for most of the
cancer types. This is consistent with the results of Pulte et al. 19, who reported an ongoing increase in
trends of cancer survival in the period from 1990 to 2004 for four of the studied cancers. The greatest
survival improvements were observed in Hodgkin lymphoma, non-Hodgkin lymphoma, and leukemia.
Similarly, Koumarianou and colleagues 20 reported that survival of Hodgkin lymphoma improved
significantly over the last three decades owing to improved chemotherapy regimens. A study of trends in
survival of leukemia patients in Slovenia documented that the hazard of dying from leukemia
decreased constantly between 1957 and 2007 21. The improvement of cancer survival for most of the
cancers studied may be attributed to the application of cancer screening programs and the availability of tumor
markers for many cancers, which resulted in early diagnosis; the development of better diagnostic
techniques; and advances in intervention regimens and the quality of care. For a few cancers,
with liver and pancreatic cancers as prominent examples, patients’ prognosis remained poor over the
three studied cohorts (decades). Further studies are needed to investigate the factors which lie behind the
unimproved prognosis for each individual cancer.
Marubini and colleagues 22 suggested that the relation between survival fraction and patient’s age
at diagnosis must be considered, as this provides a better understanding of the biological basis for the
complex relation between age at diagnosis and prognosis, and might clarify our view of the natural
history of cancer. In this regard, we found that cancer survival declined with the age at diagnosis.
The same results were also reported by Navarro-Garcia et al. 23 and Adami et al. 24.
For the majority of the studied cancers, the relation of cancer survival to the age at diagnosis can
be divided into two phases. The first phase is the gradual decline phase and the second is the rapid
fall-off phase. For most of the cancers, the second phase starts at the age of 70-80 years. The rapid decline of
cancer survival after the age of 70 might be attributed to comorbidities which decrease the patient’s
vitality, immunity, and the compensatory reserve of all body systems. Cancer-associated morbidity in
elderly patients may lead to greater vulnerability to failure of the body systems and rapid
decompensation of the vital organs, especially if the patient is exposed to the toxic effects of
chemotherapy or if the cancer involves a vital organ. Although our study examined cause-specific
survival, which means that all other causes of death were excluded, this does not preclude
a role for other comorbidities in facilitating and precipitating cancer mortality.
For a few cancers, e.g. Hodgkin lymphoma, the rapid fall-off phase starts at an earlier age, mainly at
40 years old. This indicates that cancer survival is widely variable, takes different patterns over age
and across different types of cancer, and cannot be attributed solely to poorer diagnosis and treatment
options at old ages. The relation between survival and age at diagnosis should be understood in terms of
cancer biology at the cellular and molecular levels. To our knowledge, no previous studies have
discussed the pattern of survival in relation to the age at diagnosis. Brenner et al. 25 studied
the trends in long-term survival in Hodgkin lymphoma patients over three period cohorts. The authors
were concerned with the improvement in survival over the successive period cohorts and did not give
much attention to the pattern of survival over the different age groups; however, a secondary analysis of
their data confirmed our results, as survival of Hodgkin lymphoma drops abruptly after the age of 35 to
44 in all of the studied cohorts.
The approximate parallelism of the one-, three-, and five-year period survival plots over the
different cohorts invites speculation that cancer survival might be biologically linked to the age of the
patient at diagnosis, through age-related cellular or DNA molecular changes. The relationship
between period survival and the age of the patient at diagnosis may no longer be explained in terms of
differences in the ease of diagnosis over the different age groups. This is suggested by the fact that the
parallelism of the plots representing the three studied period survivals was not disrupted, even with the
advances achieved in the diagnosis of cancer over the different cohorts.
Unlike the one-year survival plots, the 3-year and 5-year survival plots lie close together.
This proximity shows that cancer patients who lived for 3 years after diagnosis
have a high probability of surviving for 5 years post-diagnosis, but this prediction could not be extended
to those who survived only one year after diagnosis. Some exceptions apply to this conclusion;
for example, the probability of surviving uterine or testicular cancer for 5 years was similarly high
for both the one-year and the 3-year survivors. In contrast, surviving breast cancer or myeloma
for 3 years after diagnosis was not a valid predictor of surviving for 5 years.
One limitation of this study is that the SEER classification of the cause of death was entirely
based on death certificates or autopsy reports. It is not uncommon for the cause of death to be inaccurately
reported in the death certificate as “death due to circulatory failure” or “senility”. If a misclassification
bias occurred in the reporting of the cause of death to the SEER database, this may
affect the survival models we presented in this study. Also, in this study we were concerned with
investigating survival over cohorts, age groups, both sexes, and primary cancer sites. However,
we were not able to provide further modeling for important cancer details such as cancer stage at the
time of diagnosis, pathological type, cellular differentiation, type of treatment received, and primary
versus metastatic cancer. Modeling of the effect of these parameters on the survival function will be
discussed in future publications. In some instances cause-specific survival may be inaccurate. For
example, the cause of death may be unreliable or unknown, or, if a cancer has metastasized to another site,
the death certificate may list cancer of the metastasized site as the cause of death. To reduce these
limitations, only individuals with one primary cancer were included in the cohort, and cases with a
missing cause of death were excluded from the analysis.
Although the variability of cancer survival over different age groups was clearly demonstrated
in this study, we were not able to identify the cause of this variability on a mathematical basis. These
data, however, could be of great importance to cancer biologists and molecular geneticists, who
may explain the cause of this variation on a biological basis (at the cellular level).
The KM survival estimate was somewhat inaccurate when the number of observed cases was less
than 20 patients, which was frequently the case at very young or very old ages. When
the last patient died of the cancer of interest, the KM estimate in SEER*Stat automatically
reports the standard error as zero when the survival is equal to 0%. The “bootstrap” method was used
to overcome this difficulty (refer to the appendix section).
Otherwise, this study has several strengths, as it is the first study to demonstrate the
age-dependent survival of many types of cancer, in both genders, over three different cohorts, and for about
3.94 million patients. One advantage is that we obtained our data from the SEER database, which
is the largest and most accurate U.S. database and in which all minorities and ethnic subgroups are
represented. This is the first study to provide detailed survival graphs for patients of 85+ years,
divided into 5-year age groups up to the age of 115.
In conclusion, this study offers two main perspectives. The first is that it would be useful for
patients to know their prognosis; for clinicians to choose the best treatment intervention for patients
based on their prognosis and to determine the frequency of follow-up visits; and for business
decision makers (as in health and life insurance companies), who need to estimate the probability of
patient survival versus mortality in each age group in order to set their premiums and
expected costs accurately. The second is that these data would be useful to molecular biologists who try to
understand cell senescence, as well as the incidence and survival of cancer at different ages, on a cellular
basis.
ACKNOWLEDGMENT
Dr. Bassily acknowledges the Egyptian Ministry of Higher Education fellowship which enabled
him to participate in this work. All authors gratefully acknowledge the considerable help of Mr. Charles
Harding of Harvard University and the SEER staff in understanding the SEER
database and in overcoming some of its technical difficulties. Personal funds were used in support of
this study.
CONFLICTS OF INTEREST STATEMENT
The authors declare that there are no conflicts of interest pertinent to the topic of this study to
disclose.
REFERENCES
1. American Cancer Society. Cancer facts and figures 2006 [online] 2006 [cited 12-12-2006].
http://www.cancer.org.
2. Blanke CD, Coia LR, Shwartz RE, Bonin SR. Gastric cancer. In: Pazdur R, Coia LR, Hoskins WJ,
Wagman LD, editors. Cancer management: A Multidisciplinary approach. 9th ed. Lawrence, KS: CMP
Media, LLC, 2005. p. 279-92.
3. Kato I, Severson R, Shwartz A. Conditional median survival of patients with advanced carcinoma.
Cancer 2001; 92: 2211-9.
4. Wang SJ, Emery R, Fuller CD, Kim JS, Sittig DF. Conditional survival in gastric cancer- A SEER
database analysis. Gastric Cancer 2007; 10: 153-8.
5. Henson DE, Ries LA. On the estimation of survival. Semin Surg Oncol 1994; 10: 2-6.
6. Ries LA, Reichman ME, Lewis DR, Hankey BF, Edwards BK. Cancer survival and incidence from
the Surveillance, Epidemiology, and End Results (SEER) Program. The Oncologist 2003; 8: 541-52.
7. National Cancer Institute. Surveillance, Epidemiology and End Results (SEER) Program 2005.
http://seer.cancer.gov/data.
8. Nattinger AB, McAuliffe TL, Schapira MM. Generalizability of the Surveillance, Epidemiology and
End Results registry population: Factors relevant to epidemiologic and health care research. J Clin
Epidemiol 1997; 50: 939-45.
9. Warren JL, Klabunde CN, Schrag D, Bach PB, Riley GF. Overview of the SEER- Medicare data:
content, research applications, and generalizability to the United States elderly population. Med Care
2002; 40 (8 Suppl): IV-3-18.
10. Fuller CD, Wang SJ, Thomas CR. Conditional survival in head and neck squamous cell carcinoma-
results from the SEER dataset 1973-1998. Cancer 2007; 109: 1331-43.
11. Surveillance, Epidemiology and End Results (SEER) program. SEER*stat database: Incidence-
SEER 17 regs public-use, 2005 sub (1973-2004 varying). In: National Cancer Institute D, Surveillance
Research Program, Cancer Statistics Branch. http://seer.cancer.gov/resources. Accessed August, 2009.
12. Klein M. Survival analysis- a self-learning text. David G. Kleinbaum, Springer Science and Business
Media Inc., New York, NY. 2005: 51-55.
13. Efron B. Bootstrap methods- another look at the Jackknife. Ann Statist 1979; 7 (1): 1-26.
14. Efron B. Censored data and the bootstrap. J Amer Statist Assoc 1981; 76: 312-19.
15. Reid N. Estimating the median survival time. Biometrika 1981; 68: 601-8.
16. Akritas MG. Bootstrapping the Kaplan-Meier estimator. J Amer Statist Assoc 1986; 81: 1032-38.
17. Swanson GM, Lin CS. Survival patterns among younger women with breast cancer- the effects of
age, race, stage, and treatment. J Natl Cancer Inst Monogr 1994; 16: 69-77.
18. Harding C, Pompei F, Lee EE, Wilson R. Cancer suppression at old age. Cancer Res 2008; 68(11):
4465-78.
19. Pulte D, Gondos A, Brenner H. Trends in 5- and 10-year survival after diagnosis with childhood
hematologic malignancies in the United States, 1990-2004. J Natl Cancer Inst 2008; 100(18): 1301-9.
20. Koumarianou AA, Xiros N, Papageorgio U, Pectasides D, Economopoulos T. Survival improvement
of young patients, aged 16-23, with Hodgkin lymphoma during the last three decades. Anticancer Res
2007; 27(2): 1191-7.
21. Perme MP, Jerels B. Trends in survival of childhood cancer in Slovenia between 1957 and 2007.
Pediatr Haematol Oncol 2009; 26(4):240-51.
22. Marubini E, Mariani L. New method for expressing survival in cancer- the relation between survival
times and patient’s age at diagnosis must be taken into account. BMJ 1997; 315(7119): 1375-6.
23. Navarro- Garcia JF, Viogue J, Cuch A. Survival in cancer of the breast in Zaragoza (1960-1990) in
relation to age, clinical stage, and period of time of the diagnosis. Med Clin (Barc) 1995; 105(19): 721-7.
24. Adami HO, Malker B, Holmberg L, Persson I, Stone B. The relation between survival and age at
diagnosis. N Engl J Med 1986; 315(9): 559-63.
25. Brenner H, Gondos A, Pulte D. Ongoing improvement in long-term survival of patients with
Hodgkin disease at all ages and recent catch-up of older patients. Blood 2008; 111(6): 2976- 83.
26. Bland M, Altman D. Survival probabilities: the Kaplan- Meier method. BMJ 1998; 317: 1572.
27. Marubini E, Valsecchi MG. Analyzing survival data from clinical trials and observational studies.
Statistics in Practice. J.Wiley, Chichester; New York. P. 5-105.
Figure Legends:
# Titles of the graphs: the title of each graph consists of the name of the major organ site cancer, the
studied sex, and the cohort, e.g. Breast, F, 1979-1983. Dotted line: 1-year survival; interrupted line:
3-year survival; continuous line: 5-year survival.
Figure 1 (A)-(D): Cancer survival as a function of the age at diagnosis in years for females for the periods 1979
to 1983, 1989 to 1993, and 1999 to 2003. Figure A: Survival for all cancer sites and each of five major organ
sites, B: survival of six major organ sites, C: survival of six major organ sites, D: survival of four major organ
sites. The error bars show the two-sided probability of error. Large solid squares mark the age groups
in which the last patient died. Grey lines connect points of statistical uncertainty to the other points
of the line graph.
Figure 2 (A)-(D): Cancer survival as a function of the age at diagnosis in years for males for the periods 1979 to
1983, 1989 to 1993, and 1999 to 2003. Figure A: Survival for all cancer sites and each of five major organ sites,
B: survival of six major organ sites, C: survival of six major organ sites, D: survival of three major organ sites.
The error bars show the two-sided probability of error. Large solid squares mark the age groups in which
the last patient died. Grey lines connect points of statistical uncertainty to the other points of the line graph.
Figure 3 (A)-(C): 5-year cancer survival for the three studied cohorts as a function of the age at diagnosis in years
for both females and males. Figure A: survival for all cancer sites and each of eight major organ sites, B: survival
of nine major organ sites, C: survival of six major organ sites. The error bars show the two-sided
probability of error. Large solid squares mark the age groups in which the last patient died. Grey lines
connect points of statistical uncertainty to the other points of the line graph.