
Cancer Survival as a Function of Age at Diagnosis: A Study of the Surveillance, Epidemiology and End Results Database

Authors: Mena N. Bassily (a,b,c), MSc, MBBCh; Richard Wilson (c), D.Phil; Francesco Pompei (c), PhD, M.S; Dimitriy Burmistrov (d), PhD, MSc.

a Harvard School of Public Health, 401 Park Drive, Boston, MA 02215, USA.
Menoufiya University, Shebin El-kom, Menoufiya, Egypt.
c Harvard University, Department of Physics, Jefferson Laboratory, 17 Oxford St, Cambridge, MA 02138, USA.
d ITA Software, 141 Portland St, Cambridge, MA 02139, USA.

b Corresponding author: Dr. Mena Bassily, Harvard University, Jefferson Laboratory, Room 257-A, 17 Oxford Street, Cambridge, MA 02138, U.S.A. Tel.: (617) 332-4823 (USA code: +1). Fax: (617) 495-3387 (USA code: +1). E-mail: [email protected]

ABSTRACT

Background: Recent research suggests that cancer survival has improved in recent cohorts. Improvement in cancer survival is considered a valid indicator of the quality of care provided to patients. The aim of this study is to investigate how the survival profile changes with age for patients with the most frequently diagnosed cancers.

Methods: Survival data for 3.94 million patients diagnosed with 23 primary-site cancers within the periods 1979 to 1983, 1989 to 1993, and 1999 to 2003 were obtained from the Surveillance, Epidemiology and End Results database. Sex- and cause-specific survival probabilities were estimated at one, three, and five years after diagnosis using the Kaplan-Meier survival estimate. Survival was presented for each of the studied cancers, cohorts, and sexes as line graphs plotted against age at diagnosis. Error bars show the 95% confidence intervals.

Results: The graphs demonstrated that cancer survival improved over the successive cohorts for most cancers, with several exceptions such as brain and lung cancers.
The relation between survival and age at diagnosis generally took the form of a gradual decline phase followed by a rapid fall-off phase at 70-80 years of age, with a few exceptions such as leukemia and Hodgkin lymphoma. Patients who survived for 3 years were likely to live for 5 years after diagnosis, but this prediction could not be extended to one-year survivors.

Conclusion: Further studies on tumor-specific characteristics and treatment modalities of these patients are suggested to clarify the possible causes of variation in patients’ survival profile over age.

Key words: Cancer survival; Cause-specific survival; Cancer in old age; SEER; Kaplan-Meier; survival phases.

Abbreviations: DNA: Deoxyribonucleic Acid. KM: Kaplan-Meier. NCI: National Cancer Institute. SEER: Surveillance, Epidemiology, and End Results. SEER*Stat: Surveillance, Epidemiology, and End Results Statistical Software.

INTRODUCTION

Studying the survival of different cancers has important practical value for patients, providers, and researchers [1,2]. Cancer patients may wish to know how their prognosis is changing over time and what their life expectancy is given their disease status. A proper understanding of prognosis may help both physicians and patients decide on treatment options, balancing personal values for quality versus quantity of life [3]. Knowledge of cancer survival provides a more objective basis on which to deem a patient “cured” of their disease. Providers can use survival information to determine more objectively an appropriate frequency of follow-up visits and aggressiveness of surveillance testing, based on the patient’s current risk profile. When designing clinical trials, clinical researchers may also find it useful in determining sufficient follow-up times for trial endpoints [4].
Period survival describes patients’ prognosis more accurately, since overall survival projections are often discouraging and not necessarily pertinent for patients who have survived the initial treatment period; prognosis after initial management is not static. Patients who have survived an interval of time after diagnosis have a different probability of surviving the following 5 years from the one estimated at the time of diagnosis [5].

Two forms of net (non-crude) survival analysis are available in SEER*Stat: relative survival and cause-specific survival. Both methods estimate the likelihood that cancer patients will not die from causes associated with their cancer. Relative survival is derived by comparing all-cause survival in a group of cancer patients with all-cause survival in a cancer-free, age- and sex-matched population. This means that in relative survival, unlike cause-specific survival, the numerator and the denominator are derived from different populations. Moreover, relative survival rests entirely on the assumption that the other causes of death in the Surveillance, Epidemiology and End Results database (SEER) cohorts and in the general U.S. population are highly comparable. This assumption may be misleading if a cohort carries a factor that increases its risk of dying from causes other than cancer, compared with the general population. For example, smokers may be heavily represented in a lung cancer cohort, and they often die of smoking-related causes other than lung cancer, e.g. heart disease. Relative survival cannot separate the risk of death from lung cancer from the risk of dying of non-cancer causes due to smoking. Therefore, the relative survival fraction for those with lung cancer overestimates the effect of lung cancer alone [6].
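The lung cancer argument above can be made concrete with a small numerical sketch. The relative survival ratio is the cohort's all-cause survival divided by the expected survival of the matched population; all figures below are hypothetical and chosen only to show the direction of the bias, and the function name is illustrative.

```python
def relative_survival(observed: float, expected: float) -> float:
    """Relative survival ratio: all-cause survival of the cancer cohort divided by
    the all-cause survival expected in a matched cancer-free population."""
    if not 0.0 < expected <= 1.0:
        raise ValueError("expected survival must lie in (0, 1]")
    return observed / expected


# Hypothetical 5-year figures (not taken from SEER).
observed_all_cause = 0.15   # lung cancer cohort, death from any cause
expected_general = 0.90     # matched general population
expected_smokers = 0.80     # assumed expected survival if the cohort's smoking were accounted for

rs_reported = relative_survival(observed_all_cause, expected_general)
rs_adjusted = relative_survival(observed_all_cause, expected_smokers)

# The ratio based on the general population is the lower of the two, so it
# attributes smoking-related non-cancer deaths to the cancer itself.
print(round(rs_reported, 4), round(rs_adjusted, 4))
```

With these assumed inputs the reported ratio is about 0.17 and the adjusted one about 0.19; the gap is the share of mortality wrongly charged to lung cancer.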
This article provides an overview of the cause-specific survival of many cancers as a function of age at diagnosis, to assess whether a specific pattern or trend describes the survival of all cancer types, especially in the oldest age groups (85+ years). To model survival at old ages in detail, the 85+ age group provided in the SEER database was reclassified into smaller five-year age groups. This study may also be used by clinicians as a benchmark when telling patients with different cancers about their probability of surviving for 1, 3, and 5 years. We were also concerned with the progress in cancer survival over the last three decades, to detect whether the advances in cancer screening, diagnosis, and treatment have translated into improved longevity for cancer patients.

MATERIALS AND METHODS

The Surveillance, Epidemiology and End Results database of the National Cancer Institute (NCI) is the largest population-based cancer registry in the United States; geographically it covers approximately 26% of the U.S. population. The latest SEER17 cancer registry collects data on cancer incidence and survival from seventeen population-based cancer registries: Connecticut, Iowa, New Mexico, Utah, Hawaii, the metropolitan areas of Detroit, California (San Francisco-Oakland, San Jose-Monterey, and Los Angeles), Atlanta, Seattle-Puget, rural Georgia, Arizona, New Orleans, Louisiana, New Jersey, Puerto Rico, Kentucky, in addition to American Indian/Alaska Natives [7]. These registries were chosen for their completeness and their adequate representation of all races and minority populations [8]. The SEER program standard for data completeness is 98% [9]. The SEER program registries routinely collect data on the demographic characteristics of patients, tumor characteristics and staging, as well as follow-up for survival status.
An epidemiological study comparing SEER areas with non-SEER areas in the United States concluded that the age and sex distributions of these areas were comparable [8]. Survival data are actively collected by the SEER cancer registries and reported to the NCI; the data are then ascertained from hospital records, private laboratories, radiotherapy units, other health care service units, and from death certificates when cancer is listed as a cause of death [10].

SEER Data Confidentiality: All available data in the SEER database are retrospective in nature. Personal identifiers are absent from the database. All variables that might lead to reidentification, such as the date of birth, have been removed or transposed. Any remaining risk of reidentification has been minimized by the governing agency by not making the data available as public-use information. Investigators have to sign a legally binding data-use agreement with the Centers for Medicare and Medicaid Services and SEER [11].

Statistical Methods: SEER*Stat software (version 6.5.2, National Cancer Institute, Bethesda, MD) was used to download and analyze patients’ cause-specific survival. Cause-specific survival is defined as the probability of surviving a given type of cancer for a given period of time, excluding death from causes other than the cancer of interest (as reported in the death certificate and/or autopsy). In this study, only primary cancers were considered. Consequently, the cause-specific survival presented here estimates the likelihood that patients with a primary cancer will not die from that primary cancer; deaths from a secondary cancer or other associated causes are not counted. To calculate cause-specific survival we used the Kaplan-Meier product-limit method available in the SEER*Stat program. The Kaplan-Meier (KM) estimator is the nonparametric maximum likelihood estimate of the survival function [12].
In the SEER database, survival is available for each cancer patient from the time of initial diagnosis to the date of last contact, or to the date of death if the patient has died. The KM estimator calculates the survival probability over a defined period from the survival estimates at the end of each month of that period [7]. This method removes those who died during the specified time interval and promptly censors the cases lost to follow-up, adjusting the at-risk group (the denominator) on a monthly basis in order to produce an accurate net survival estimate for the defined period. One-, three-, and five-year survival data for approximately 3.94 million patients diagnosed with different primary-site cancers (21 sites for females and 20 sites for males) were analyzed using the KM method.

The KM analysis allows estimation of survival over time even when patients drop out or are studied for different lengths of time. For each interval, the survival probability is calculated as the number of patients surviving the interval divided by the number of patients at risk. Patients who have died are removed from the at-risk group; patients who dropped out or have not yet reached the follow-up time are considered censored and are likewise no longer counted as “at risk”. The probability of surviving to any point is then estimated as the cumulative product of the probabilities of surviving each of the preceding time intervals. When the population is large enough (as in the SEER database), the estimated Kaplan-Meier survival closely approaches the true survival of that population [12, 26, 27].

Cancer cases were grouped according to age at diagnosis into twenty-three 5-year age groups ranging from 0-4 to 110-114 years old. Age- and cause-specific survival rates for each type of cancer were calculated separately for each sex and for three different cross-sectional cohorts (1979 to 1983, 1989 to 1993, and 1999 to 2003).
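The product-limit calculation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SEER*Stat implementation: follow-up is given in whole months, `events` marks death from the cancer of interest (True) versus censoring (False), and the ten patients are invented for the example.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each month in which at least one death occurred.

    times  : follow-up in months for each patient
    events : True if the patient died of the cancer of interest, False if censored
    """
    deaths = Counter(t for t, died in zip(times, events) if died)
    leaving = Counter(times)          # deaths plus censored cases leaving at month t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk   # conditional survival for this month
            curve.append((t, survival))
        at_risk -= leaving[t]               # shrink the denominator for the next month
    return curve


def survival_at(curve, horizon):
    """Step-function lookup: S(horizon) from the curve returned above."""
    s = 1.0
    for t, value in curve:
        if t > horizon:
            break
        s = value
    return s


# Hypothetical cohort of ten patients followed for up to 60 months.
times = [3, 8, 12, 12, 20, 30, 41, 60, 60, 60]
events = [True, True, False, True, True, False, True, False, False, False]

curve = kaplan_meier(times, events)
one_year, three_year, five_year = (survival_at(curve, m) for m in (12, 36, 60))
print(round(one_year, 4), round(three_year, 4), round(five_year, 4))  # 0.7 0.5833 0.4375
```

Patients censored in the same month as a death are still counted as at risk for that month, which is the usual convention.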
We used three cohorts drawn from three successive decades to study possible variability in cancer survival over time. To investigate the changes in cancer survival over the decades, the age- and sex-specific 5-year survival fraction for the three studied cohorts was shown in line graphs (Figure 3). To show the survival probability at different periods after diagnosis, the one-, three-, and five-year survival fractions for each type of cancer and sex were plotted against age at diagnosis as line graphs (Figures 1 and 2). For graphical clarity, only the 5-year survival plots show error bars. Where a very large error bar appeared (> 75% of the entire scale of the y-axis), the corresponding data point was considered statistically unreliable and no error bar was shown at that point. Such statistically uncertain points were connected to the other points of the line graph with a grey line.

Two types of error bars are presented in the plots. The first is the 95% confidence band provided by the SEER*Stat program. These confidence intervals were presented whenever SEER*Stat was able to calculate them in the survival-session outputs. When the last patient in a group died of the cancer of interest, that patient was neither “censored” nor “lost” from observation for reasons other than death, and SEER*Stat returns both the survival and its upper confidence limit as 0%. This tells us that calculating the variance of the KM estimator with a small-variance approximation is inapplicable here, because some uncertainty should be associated even with empirical zero-survival values. The zero-survival points were shown on the plots as bold square points. A method to assign uncertainty and confidence bands at these points was developed. These bands were shown on the plots as dashed error bars attached to the zero-survival points.
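One way to see why a zero standard error appears at these points is to look at a Greenwood-type variance approximation for the KM estimator. The source does not state which variance formula SEER*Stat applies, so the formula below is an assumption used only for illustration; $d_i$ is the number of deaths and $n_i$ the number at risk in interval $i$.

```latex
\widehat{\operatorname{Var}}\bigl[\hat S(t)\bigr] \;\approx\; \hat S(t)^{2} \sum_{t_i \le t} \frac{d_i}{n_i\,(n_i - d_i)}
```

When the last patient at risk dies, $\hat S(t) = 0$ and the final term has $n_i = d_i$, so the leading factor vanishes while the sum is undefined. A reported standard error of zero at such a point therefore reflects a breakdown of the approximation, not certainty about the survival value.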
These error bars show one-sided 95% confidence intervals. The main difficulty in estimating error bars at zero-survival points was the small sample size (1-20) in the age groups that ended with zero survival. One approach that avoids the small-variance approximation is the bootstrap [13]. It is based on the idea that the original sample represents the population from which it was drawn, so resampling from this sample may represent what we would obtain if we took many samples from the population. Bootstrap methods for right-censored data in general, and for the KM estimator in particular, have been studied by many authors [14-16].

Technically, the bootstrap is a version of a Monte Carlo (MC) calculation: repeated random sampling with replacement from the observed sample, under the assumption that patients’ times of death from the cancer of interest are independent of the times at which they are lost from observation for other reasons (censoring times). The procedure independently simulates two times for each hypothetical person: the time of death from the cause under study, and the censoring time. If censoring happens before death, the person is considered lost from observation at the simulated censoring time; otherwise he/she is considered deceased at the simulated death time. Many random realizations of KM estimator values can be drawn in this way, and their distribution can be used to estimate confidence limits on the KM survival curve.

Although such an approach has been shown to produce asymptotically correct confidence bands for survival curves, and it works well in simulation studies with medium to large sample sizes, it has limitations for small sample sizes. The following extreme example shows that the approach is not perfect when there is only one observed death from the cancer of interest in the original sample: all bootstrap samples will consist of replicates of this single observation, which yields zero uncertainty in the survival estimate.
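The limitation in the extreme example can be reproduced with a short sketch. Note the assumptions: this resamples whole (time, died) pairs with replacement, which is a simpler variant than simulating death and censoring times separately as described above; the percentile interval, the function names, and the patient data are all illustrative and are not the procedure used for the figures.

```python
import random


def km_survival(sample, horizon):
    """Kaplan-Meier estimate of S(horizon) for a list of (time, died) pairs."""
    survival = 1.0
    at_risk = len(sample)
    for t in sorted({time for time, _ in sample}):
        if t > horizon:
            break
        deaths = sum(1 for time, died in sample if time == t and died)
        leaving = sum(1 for time, _ in sample if time == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
        at_risk -= leaving
    return survival


def bootstrap_interval(sample, horizon, n_boot=2000, level=0.95, seed=0):
    """Two-sided percentile interval for S(horizon) from pair resampling."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = sorted(
        km_survival([sample[rng.randrange(n)] for _ in range(n)], horizon)
        for _ in range(n_boot)
    )
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * (n_boot - 1))]
    upper = estimates[int((1.0 - tail) * (n_boot - 1))]
    return lower, upper


# Hypothetical small age group with censoring: the interval has nonzero width.
small = [(3, True), (8, True), (12, False), (12, True),
         (20, True), (30, False), (41, True), (60, False)]
lo, hi = bootstrap_interval(small, horizon=60)

# Extreme case from the text: a single observed death. Every resample is a
# replicate of that one observation, so the interval collapses to a point.
single = [(14, True)]
degenerate = bootstrap_interval(single, horizon=60)
print((lo, hi), degenerate)
```

The degenerate interval is (0.0, 0.0) regardless of the number of replicates, which is exactly the zero-uncertainty artifact the paragraph describes.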
On the other hand, the situation in this extreme example can be improved if random samples are drawn not from the single observed value, but from a Bayesian posterior distribution conditional on that observation. We used the bootstrapping procedure with random sampling from the Bayesian posterior distribution conditional on all of the observed death and loss times, rather than from their empirical distributions. The proposed approach was implemented in code and used to estimate the error bars at the zero-survival points in this paper. Confidence bands obtained with the described method should be interpreted as expressions of the uncertainty of the information extracted from the SEER database for the respective zero-survival data points. A detailed description of the algorithm, together with its small-sample and large-sample properties, is left for a forthcoming publication. The formal steps of the algorithm are shown in the appendix.

RESULTS

Figure 1 (A-D) and Figure 2 (A-D) show age-specific cancer survival for all 21 of the major organ site cancers listed in SEER for females and all 20 major organ sites listed for males, respectively, along with the total for all cancers, for the periods 1979 to 1983, 1989 to 1993, and 1999 to 2003, for a total of 129 cancer datasets. These graphs show one-, three-, and five-year cancer survival as a function of age, for both sexes, over the three cohorts. No consistent cancer survival pattern or trend could be derived from these plots; however, several observations about cancer survival were made.

A) Cancer survival over the three cohorts (decades) (Figure 3 A-C):

For most of the studied cancers, the figures show a definite shift of cancer survival from left to right from the earlier to the later cohorts.
This denotes a considerable improvement in the survival of these cancers at younger ages, with the greatest survival improvements in non-Hodgkin lymphoma and leukemia. These figures are also available in an expanded larger view at (http:// physics.harvard.edu/~Wilson/cancers and chemicals/survival.html). Interestingly, some cancers showed persistently high survival rates over the decades, e.g. melanoma (females better than males), prostate, testicular, and thyroid cancers. Although cancer survival improved over the years, the survival fraction for some cancers, such as pancreatic cancer, remained poor. On the other hand, survival of corpus uteri and urinary bladder cancers did not seem to improve over the three cohorts, but the survival fraction was good. Some cancers showed inconsistent increases in the survival fraction. For example, survival of laryngeal cancer patients increased between the 1979-1983 and 1989-1993 cohorts, then declined between the latter and the 1999-2003 cohort. This inconsistency was mainly observed for patients aged 40 years or older at diagnosis. Similarly, a very slight and inconsistent improvement in survival rate was observed in male breast cancer. By contrast, survival of some cancers remained unchanged, e.g. brain, lung and bronchus, urinary bladder, cervical, and uterine cancers. Unfortunately, liver cancer survival was very low and became worse in the last decade.

B) Cancer survival with the age of patient at diagnosis:

Two main conclusions could be drawn from the plots. Firstly, cancer survival declines as the age at diagnosis increases. Secondly, the relation of cancer survival to the age at diagnosis can be described as two-phased. Phase I is a gradual decrease of survival with the age at diagnosis. Phase II is a rapid fall-off phase during which cancer survival declines steeply. For the majority of the studied cancers, phase II occurs at 70-80 years of age.
Some exceptions to the observations in the previous paragraph were noted:
- In leukemia and prostate cancer, there was a transient improvement of survival between the ages of 40-60 years for the former and 40-90 years for the latter, before the fall-off phase took place.
- Phase I could not be applied to some cancers, e.g. colorectal, uterine, oral, prostate, and testicular cancers, and melanoma. These cancers showed a plateau of constant survival, rather than a gradual decline, before the rapid fall-off phase (Phase II) started.
- In some cancers, e.g. brain, lung and bronchus, renal, pancreatic, and ovarian cancers, as well as Hodgkin lymphoma, the rapid fall-off phase (Phase II) starts at an earlier age, mainly at 40 years old.

C) One-, three-, and five-year cancer survival:

The one-, three-, and five-year survival plots followed the same pattern of change with the age of the patient at diagnosis, as indicated by the approximate parallelism of the three period-survival plots. In most of the studied cancers, the plots representing the three- and five-year survival rates were very close; however, there was a wide gap between both of them and the one-year survival plot. This observation suggests that cancer patients who lived for 3 years after diagnosis have a high probability of surviving for 5 years, e.g. patients with Hodgkin lymphoma, renal cancer, and esophageal cancer, but the conclusion cannot be extended to patients who have survived only one year after diagnosis. One exception is that in some cancers the probability of surviving for 5 years after diagnosis was similarly high for both one-year and 3-year survivors, e.g. melanoma (especially in females), uterine, and testicular cancers. Another exception is breast cancer and myeloma, in which survival for 3 years post-diagnosis was not a valid predictor of surviving for 5 years.
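The reasoning in section C rests on a simple ratio: the probability of reaching 5 years given survival to 3 years is S(5)/S(3), so curves that lie close together imply a conditional probability near 1. The sketch below uses hypothetical survival fractions, not values read from the figures, and the function name is illustrative.

```python
def conditional_survival(s_later: float, s_earlier: float) -> float:
    """P(alive at the later time | alive at the earlier time) = S(later) / S(earlier)."""
    if not 0.0 < s_earlier <= 1.0:
        raise ValueError("earlier survival must lie in (0, 1]")
    if not 0.0 <= s_later <= s_earlier:
        raise ValueError("survival cannot increase over time")
    return s_later / s_earlier


# Hypothetical one-, three-, and five-year survival fractions for one age group.
s1, s3, s5 = 0.80, 0.62, 0.58

p5_given_3 = conditional_survival(s5, s3)   # close curves, ratio near 1
p5_given_1 = conditional_survival(s5, s1)   # wide gap, noticeably lower ratio
print(round(p5_given_3, 3), round(p5_given_1, 3))
```

A wide gap between the 3- and 5-year curves pushes S(5)/S(3) well below 1, which is why 3-year survival is a weak predictor in that situation.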
For breast cancer and myeloma, the wide gap between the 3- and 5-year survival plots illustrates this.

DISCUSSION

Survival information is potentially of great importance to patients, clinicians, and researchers, as it quantifies a patient’s changing risk profile over time. This study discussed age-specific cancer survival for 20 of the major organ site male cancers and 21 major organ site female cancers based on the SEER data. No single model describes the pattern of cancer survival as a function of age at diagnosis for all of the studied cancers. This may be explained by the wide range of cancer-specific factors that influence the survival pattern, including cancer site, pathological differentiation of the cancer cells, cancer stage at the time of diagnosis, the time span between cancer incidence and start of treatment, type of intervention, quality of care received, and associated morbidity [17]. This result highlights the wide discrepancy between the modeling of cancer incidence and that of survival. Harding and colleagues [18] observed a single cancer incidence model when they studied the same primary-site cancers, cohorts, and age groups used in this study. They further reported that the incidence of each cancer tends to fall in the same way at old ages and could fit one single model. Unlike incidence, no turnover in cause-specific survival was observed in the old age groups.

An overall improvement in the survival of cancer patients over the successive decades was achieved, as represented by the definite shift of cancer survival from left to right in the later cohorts for most cancer types. This is consistent with the results of Pulte et al. [19], who reported an ongoing increase in cancer survival between 1990 and 2004 for four of the studied cancers. The greatest survival improvements were observed in Hodgkin lymphoma, non-Hodgkin lymphoma, and leukemia.
Similarly, Koumarianou and colleagues [20] reported that survival of Hodgkin lymphoma improved significantly over the last three decades owing to improved chemotherapy regimens. A study of survival trends among leukemia patients in Slovenia documented that the hazard of dying from leukemia decreased steadily between 1957 and 2007 [21]. The improvement in survival for most of the cancers studied may be attributed to cancer screening programs and the availability of tumor markers for many cancers, which resulted in early diagnosis, to the development of better diagnostic techniques, and to advances in intervention regimens and quality of care. For a few cancers, with liver and pancreatic cancers as prominent examples, patients’ prognosis remained poor over the three studied cohorts (decades). Further studies are needed to investigate the factors behind the unimproved prognosis for each individual cancer.

Marubini and colleagues [22] suggested that the relation between survival fraction and the patient’s age at diagnosis must be considered, as this provides a better understanding of the biological basis for the complex relation between age at diagnosis and prognosis, and might clarify our view of the natural history of cancer. In this regard, we found that cancer survival declined with the age at diagnosis. The same results were also reported by Navarro-Garcia et al. [23] and Adami et al. [24]. For the majority of the studied cancers, the relation of cancer survival to the age at diagnosis can be divided into two phases: a gradual decline phase followed by a rapid fall-off phase. For most of the cancers, the second phase starts at the age of 70-80 years. The rapid decline of cancer survival after the age of 70 might be attributed to comorbidities which decrease the patient’s vitality, immunity, and the compensatory reserve of all body systems.
Cancer-associated morbidity in elderly patients may lead to greater vulnerability to failure of the body systems and rapid decompensation of the vital organs, especially if the patient is exposed to the toxic effects of chemotherapy or if the cancer involves a vital organ. Although our study examined cause-specific survival, which means that all other causes of death were excluded, this does not preclude a role for other comorbidities in facilitating and precipitating cancer mortality.

For a few cancers, e.g. Hodgkin lymphoma, the rapid fall-off phase starts at an earlier age, mainly at 40 years old. This indicates that cancer survival is widely variable, takes different patterns over age and across cancer types, and can no longer be attributed simply to worse diagnosis and treatment options at old ages. The relation between survival and age at diagnosis should be understood in terms of cancer biology at the cellular and molecular levels. To our knowledge, no previous studies have discussed the pattern of survival in relation to the age at diagnosis. Brenner et al. [25] studied the trends in long-term survival of Hodgkin lymphoma patients over three period cohorts. The authors were concerned with the improvement in survival over the successive period cohorts and did not give much attention to the pattern of survival over the different age groups; however, a secondary analysis of their data confirmed our results, as survival of Hodgkin lymphoma drops abruptly after the age of 35 to 44 in all of the studied cohorts.

The approximate parallelism of the one-, three-, and five-year period survival plots over the different cohorts invites the speculation that cancer survival might be biologically linked to the age of the patient at diagnosis, whether through age-related cellular changes or molecular changes in DNA.
The relationship between period survival and the age of the patient at diagnosis may no longer be explained in terms of differences in the ease of diagnosis across age groups. This is indicated by the fact that the parallelism of the plots representing the three studied period survivals was not disrupted, even with the advances achieved in cancer diagnosis over the different cohorts. Unlike the one-year survival plots, the 3-year and 5-year survival plots lie close together. This proximity demonstrates that cancer patients who lived for 3 years after diagnosis have a high probability of surviving for 5 years post-diagnosis, but this prediction could not be extended to those who have survived only one year after diagnosis. Some exceptions apply to this conclusion; for example, the probability of surviving uterine or testicular cancer for 5 years was similarly high for both one-year and 3-year survivors. By contrast, surviving breast cancer or myeloma for 3 years after diagnosis was not a valid predictor of surviving for 5 years.

One limitation of this study is that the SEER classification of the cause of death is based entirely on death certificates or autopsy reports. It is not uncommon for the cause of death to be inaccurately reported on the death certificate as “death due to circulatory failure” or “senility”. If misclassification bias occurred in the reporting of the cause of death to the SEER database, it may affect the survival models presented in this study. Also, in this study we were concerned with survival across cohorts, age groups, both sexes, and primary cancer sites. However, we were not able to provide further modeling for important cancer details such as cancer stage at the time of diagnosis, pathological type, cellular differentiation, type of treatment received, and primary versus metastatic cancer.
Modeling of the effect of these parameters on the survival function will be discussed in future publications. In some instances cause-specific survival may be inaccurate. For example, the cause of death may be unreliable or unknown, or, if a cancer has metastasized to another site, the death certificate may list cancer of the metastasized site as the cause of death. To reduce these limitations, only individuals with one primary cancer were included in the cohort, and cases with a missing cause of death were excluded from the analysis. Although the variability of cancer survival over different age groups was clearly demonstrated in this study, we were not able to identify the cause of this variability on a mathematical basis. These data, however, could be of great importance to cancer biologists and molecular geneticists, who may explain the cause of this variation on a biological basis (at the cellular level). The KM survival estimate was somewhat inaccurate when the number of observed cases was less than 20 patients, which was frequently the case at very young or very old ages. When the last patient died of the cancer of interest, the KM estimate in SEER*Stat automatically reports the standard error as zero because the survival is equal to 0%. The “bootstrap” method was used to overcome this difficulty (refer to the appendix section).

This study also has several strengths: it is the first study to demonstrate the age-dependent survival of many types of cancer, in both sexes, over three different cohorts, and for about 3.94 million patients. One advantage is that we obtained our data from the SEER database, which is the largest and most accurate such database in the U.S. and in which all minorities and ethnic subgroups are represented. This is the first study to provide detailed survival graphs for patients aged 85+ years, divided into 5-year age groups up to the age of 115.
In conclusion, this study offers two main perspectives. The first is that it would be useful to patients who wish to know their prognosis; to clinicians who need to choose the best treatment intervention based on that prognosis and to determine the frequency of follow-up visits; and to business decision makers (as in health and life insurance companies) who need to estimate the probability of patient survival versus mortality in each age group in order to set their premiums and expected costs accurately. The second is that these data would be useful to molecular biologists who try to understand cell senescence, as well as the incidence and survival of cancer at different ages, on a cellular basis.

ACKNOWLEDGMENT

Dr. Bassily acknowledges the Egyptian Ministry of Higher Education fellowship which enabled him to participate in this work. All authors gratefully acknowledge the considerable help of Mr. Charles Harding of Harvard University and the SEER staff for their kind assistance in understanding the SEER database and in overcoming some of its technical difficulties. Personal funds were used in support of this study.

CONFLICTS OF INTEREST STATEMENT

The authors declare that there are no conflicts of interest pertinent to the topic of this study.

REFERENCES

1. American Cancer Society. Cancer facts and figures 2006 [online] 2006 [cited 12-12-2006]. http://www.cancer.org.
2. Blanke CD, Coia LR, Shwartz RE, Bonin SR. Gastric cancer. In: Pazdur R, Coia LR, Hoskins WJ, Wagman LD, editors. Cancer management: A Multidisciplinary approach. 9th ed. Lawrence, KS: CMP Media, LLC, 2005. p. 279-92.
3. Kato I, Severson R, Shwartz A. Conditional median survival of patients with advanced carcinoma. Cancer 2001; 92: 2211-9.
4. Wang SJ, Emery R, Fuller CD, Kim JS, Sittig DF. Conditional survival in gastric cancer- A SEER database analysis. Gastric Cancer 2007; 10: 153-8.
5. Henson DE, Ries LA. On the estimation of survival. Semin Surg Oncol 1994; 10: 2-6.
6.
Ries LA, Reichman ME, Lewis DR, Hankey BF, Edwards BK. Cancer survival and incidence from the Surveillance, Epidemiology, and End Results (SEER) Program. The Oncologist 2003; 8: 541-52. 7. National Cancer Institute. Surveillance, Epidemiology and End Results (SEER) Program 2005. http://seer.cancer.gov/data. 8. Nattinger AB, McAuliffe TL, Schapira MM. Generalizability of the Surveillance, Epidemiology and End Results registry population: Factors relevant to epidemiologic and health care research. J clin Epidemiol 1997; 50:939-45. 9. Warren JL, Klabunde CN, Schrag D, Bach PB, Riley GF. Overview of the SEER- Medicare data: content, research applications, and generalizability to the United States elderly population. Med care 2002; 40 (8 Suppl): IV-3-18. 10. Fuller CD, Wang SJ, Thomas CR. Conditional survival in head and neck squamous cell carcinomaresults from the SEER dataset 1973-1998. Cancer 2007; 109:1331-43. 11. Surveillance, Epidemiology and End Results (SEER) program. SEER*stat database: IncidenceSERR 17 regs public-use, 2005 sub (1973-2004 varying). In: National Cancer Institute D, Surveillance Research Program, Cancer Statistics Branch. http://seer.cancer.gov/resources. Accessed August, 2009. 12. Klein M. Survival analysis- a self-learning text. David G. Kleinbaun, Springer Science and Business Media Inc., New York, NY. 2005: 51-55. 13. Efron B. Bootstrap methods- another look at the Jackknife. The Annals of Statist 1979; 7 (1): 1–26. 14. Efron B. Censored data and the bootstrap. J Amer Statist Assoc 1981; 76: 312-19. 15. Reid N. Estimating the median survival time. Biometrika 1981; 68: 601-8. 16. Akritas MG. Bootstrapping the Kaplan-Meier estimator. J Amer Statist Assoc 1986; 81: 1032-38. 19 17. Swanson GM, Lin CS. Survival patterns among younger women with breast cancer- the effects of age, race, stage, and treatment. J Natl Cancer Inst Monogr 1994; 16: 69-77. 18. Harding C, Pompei F, Lee EE, Wilson R. Cancer suppression at old age. 
Cancer Res 2008; 68(11): 4465-78. 19. Pulte D, Gondos A, Brenner H. Trends in 5- and 10-year survival after diagnosis with childhood hematologic malignancies in the United States, 1990-2004. J Natl Cancer Inst 2008; 100(18): 1301-9. 20. Koumarianou AA, Xiros N, Papageorgio U, pectasides D, Economopoulos T. Survival improvement of young patients, aged 16-23, with Hodgkin lymphoma during the last three decades. Anticancer Res 2007; 27(2): 1191-7. 21. Perme MP, Jerels B. Trends in survival of childhood cancer in Slovenia between 1957 and 2007. Pediatr Haematol Oncol 2009; 26(4):240-51. 22. Marubini E, Mariani L. New method for expressing survival in cancer- the relation between survival times and patient’s age at diagnosis must be taken into account. BMJ 1997; 315(7119): 1375-6. 23. Navarro- Garcia JF, Viogue J, Cuch A. Survival in cancer of the breast in Zaragoza (1960-1990) in relation to age, clinical stage, and period of time of the diagnosis. Med Clin (Barc) 1995; 105(19): 7217. 24. Adami HO, Malker B, Holmberg L, Persson I, Stone B. The relation between survival and age at diagnosis. N Engl J Med 1986; 315(9): 559-63. 25. Brenner H, Gondos A, Pulte D. Ongoing improvement in long-term survival of patients with Hodgkin disease at all ages and recent catch-up of older patients. Blood 2008; 111(6): 2976- 83. 26. Bland M, Altman D. Survival probabilities: the Kaplan- Meier method. BMJ 1998; 317: 1572. 27. Marubini E, Valsecchi MG. Analyzing survival data from clinical trials and observational studies. Statistics in Practice. J.Wiley, Chichester; New York. P. 5-105. 20 Figure Legends: # Titles of the graphs: the title of each graph consists of the name of the major organ site cancer, the studied sex, and cohort e.g. Breast, F, 1979-1983. Dotted line: 1-year survival model, interrupted line: 3year survival, and the continuous line: 5-year survival. 
Figure 1 (A)-(D): Cancer survival as a function of the age at diagnosis in years for females for the periods 1979 to 1983, 1989 to 1993, and 1999 to 2003. (A) Survival for all cancer sites and for each of five major organ sites; (B) survival for six major organ sites; (C) survival for six major organ sites; (D) survival for four major organ sites. Error bars show the two-sided probability of error. Large solid squares mark the age groups in which the last patient died. Grey lines connect points of statistical uncertainty to the other points of the line graph.

Figure 2 (A)-(D): Cancer survival as a function of the age at diagnosis in years for males for the periods 1979 to 1983, 1989 to 1993, and 1999 to 2003. (A) Survival for all cancer sites and for each of five major organ sites; (B) survival for six major organ sites; (C) survival for six major organ sites; (D) survival for three major organ sites. Error bars show the two-sided probability of error. Large solid squares mark the age groups in which the last patient died. Grey lines connect points of statistical uncertainty to the other points of the line graph.

Figure 3 (A)-(C): 5-year cancer survival for the three studied cohorts as a function of the age at diagnosis in years for both females and males. (A) Survival for all cancer sites and for each of eight major organ sites; (B) survival for nine major organ sites; (C) survival for six major organ sites. Error bars show the two-sided probability of error. Large solid squares mark the age groups in which the last patient died. Grey lines connect points of statistical uncertainty to the other points of the line graph.
