Vol. 158, 1487-1490, October 1997
Printed in U.S.A.

VARIATION IN PROSTATE CANCER SURVIVAL EXPLAINED BY SIGNIFICANT PROGNOSTIC FACTORS

ARNON KRONGRAD,* HONG LAI AND SHENGHAN LAI

From the Departments of Urology, Medicine, and Epidemiology and Public Health, and the Sylvester Comprehensive Cancer Center, University of Miami School of Medicine, and Geriatric Research, Education, and Clinical Center, Veterans Affairs Medical Center, Miami, Florida

* Accepted for publication February 21, 1997.
Funded by the Health Foundation of South Florida.
Requests for reprints: Urology (112), Veterans Affairs Medical Center, 1201 NW 16 St., Miami, Florida 33125.

ABSTRACT

Purpose: Traditional survival analytical tools do not reveal the ability of significant prognostic factors to predict (that is, explain variation in) survival. We used survival data in patients with prostate cancer to illustrate how the association of factors with survival diverges from their ability to explain variation in survival; bladder cancer was included as a point of general comparison.

Materials and Methods: We used the 1973 to 1990 records of the Surveillance Epidemiology and End Results program. Multivariate proportional hazards models were used to identify factors that significantly associated with survival. The proportion of variation explained by these factors was estimated with the Schemper method.

Results: The dataset included 10,636 patients with prostate cancer and 1,070 with bladder cancer. Median survival was significantly longer in prostate than bladder cancer; other characteristics were similarly distributed. Age, stage and marital status were associated with survival in both cancers (p value range 0.0001 to 0.0009). The total proportion of variation explained was 7.1% and 32.1% for prostate and bladder cancer, respectively. In prostate cancer, age, stage and marital status explained 0.6, 5.5 and 0.4% of the adjusted proportion of variation explained, respectively, and in bladder cancer they explained 14.7, 8.9 and 0.6%, respectively.

Conclusions: Proportional hazards models identified but did not reveal the ability of significant prognostic factors to explain variations in survival. The proportion of variation explained analyses illustrate why predicting survival is so difficult, especially in prostate cancer. The prognostic factors used do not possess the ability to explain variation in survival; new prognostic factors must be identified.

KEY WORDS: prognosis; mortality; prostatic neoplasms; bladder neoplasms; models, statistical

Predicting clinical outcomes can be very difficult. Theoretically, precise predictions depend on variables that can explain large variations in outcome. However, one might find it ironic and frustrating that prediction of specific outcomes, death of patients with prostate cancer, for example, is difficult when statistically significant prognostic variables (for example, disease stage) have been identified. The origins of the problem may reside partly in the blurred distinction between the term "significant" as used in precise statistical language and in everyday language.¹ In statistical terminology, "significant" denotes that an association is not attributable to chance alone. In everyday language, "significant" connotes that an observed association is large. Blurring semantic distinctions can generate the false expectation that statistically significant prognostic variables can predict a large part of an associated outcome. Unless a variable is not only statistically significantly associated with an outcome but also able to explain variation in the outcome, our efforts to apply it in precise predictions will be frustrated.

To identify factors (for example, age) that predict end points other than survival, for example, serum prostate specific antigen (PSA) level, one often uses linear regression models. Regression models are generally characterized with a descriptor of the variation in the end point ("goodness of fit") explained by the tested factor; this descriptor is R², the square of the multiple correlation coefficient. As shown in the figure, R² quantifies the ability of age to predict specific end points. R² cannot be derived from the statistical association (p value) of the putative predictive factor with the study end point. As shown in figure 1, the associations of age with PSA and year of birth are highly significant (p <0.001). This means that the relationships of age with year of birth and PSA are not attributable to chance alone; that is, we can observe reproducibly that PSA goes up and year of birth goes down as we examine older patients. The significant association does not mean that we can gauge equally precisely PSA and year of birth by knowing a patient's age. This point is exemplified in figure 1, in that age very precisely predicts year of birth (because its relationship with year of birth is perfect; R² = 1.0) but not PSA (because its relationship with PSA is far from perfect; R² = 0.02). As a practical validation of the example, try using a prostate cancer patient's age to predict his year of birth and PSA. To more precisely predict PSA, one must examine age and other independently associated factors, such as disease stage.

To identify factors that predict survival, one often uses proportional hazards models. Proportional hazards models generate risk ratios, confidence intervals and p values but not a descriptor of the variation in survival explained by the prognostic factor. In this regard, proportional hazards models are somewhat handicapped, relative to the linear regression model.

In recent years, an approach has been introduced to survival analyses to quantify the proportion of variation explained by significant prognostic factors.²⁻⁴ Like the R² of linear models, the proportion of variation explained of proportional hazards models is held as a descriptor that is distinct from measures of relative risk, confidence intervals and p value. One reason to use proportion of variation explained is the direct comparability of quantitative prognostic factors that it allows. With proportion of variation explained, one can establish a ranking of the relative "weight" of independent prognostic variables that appear based on proportional hazards models to have similar risk ratios and significance. One can also learn that the presumed "weight" of a prognostic variable is wrong. For example, CD4 count is a frequently applied clinical measure in the assessment of the severity of human immunodeficiency virus (HIV) infection, and analyses have shown that a CD4 count of 200 or more is a highly significant predictor of mortality in HIV infection (risk ratio = 2.67; p = 0.02).⁵ However, using proportion of variation explained analysis, we have recently shown that CD4 accounts for only 4% of the variation in individual mortality. To predict prognosis in HIV, one must also take into account variables that explain more of the variation.

To our knowledge, proportion of variation explained analyses have not been previously applied to analysis of survival in prostate cancer. Our purpose was to illustrate the use of proportion of variation explained analyses and generate hypotheses regarding the ability of prognostic factors to explain variation in survival. To provide a point of comparison, we also included bladder cancer in the analysis.

METHODS

We used the 1973 to 1990 public-use tapes of the Surveillance Epidemiology and End Results program, which we have described previously.⁶,⁷ We examined the importance of clinical stage in untreated nonmetastatic prostate or bladder cancer, in which stage was defined by the historic staging systems: in situ (T1), localized (T2) or regional (T3 and/or N+). In this analysis, bladder cancer was studied to allow a point of comparison. Because there were few in situ cases, they were grouped with localized cases in the analyses. Survival time was calculated from the date of diagnosis to death or December 31, 1990.

Cox proportional hazards models⁸ were used to examine the relative risk of overall mortality, corresponding 95% confidence intervals and p values by age, disease stage and marital status. In these models age was treated as a continuous variable, whereas stage and marital status were dichotomized in accordance with our previous observations.⁶,⁷ The proportion of variation explained by the full model (total proportion of variation explained) and for each factor was calculated with the methods of Schemper.⁴ Marginal "unadjusted" proportion of variation explained for each of the factors was obtained from separate regression for each of the factors, and partial "adjusted" proportion of variation explained was calculated as the difference of the total proportion of variation explained and proportion of variation explained for a model with the factor of interest excluded.

RESULTS

The dataset included 10,636 patients with untreated prostate cancer and 1,070 patients with bladder cancer whose survival times were known.

Fig. 1. Goodness of fit (R²) in artificial linear regression models. A, year of birth is plotted as function of patient age and provides model of year of birth described by equation (1997) - (1 × age). B, PSA is plotted as function of patient age and provides model of PSA described by (-6.3) + (0.23 × age). Models show highly significant associations of year of birth and PSA with age (p <0.001), but goodness of fit (R²; possible range 0.00 to 1.00) differs substantially.
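The contrast drawn in the figure, in which both associations are highly significant but only one predicts well, can be imitated with simulated data. The following is a minimal Python sketch, not the authors' analysis: the sample size, the uniform age range, the random seed and the noise standard deviation of 23 are assumptions chosen so that the PSA model lands near R² = 0.02, and the p value uses a normal approximation to the t distribution. The simulated PSA values are artificial and can be negative.

```python
import math
import random

def r2_and_p(x, y):
    """Return (R^2, approximate two-sided p value) for a straight-line fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r2 = min(sxy * sxy / (sxx * syy), 1.0)
    if 1.0 - r2 < 1e-9:
        # Essentially perfect fit: the test statistic is unbounded, so p is 0.
        return r2, 0.0
    t = math.sqrt(r2 * (n - 2) / (1.0 - r2))
    # Normal approximation to the t distribution (reasonable for large n).
    p = math.erfc(t / math.sqrt(2.0))
    return r2, p

random.seed(0)                      # assumed seed, for repeatability only
ages = [random.uniform(40, 90) for _ in range(5000)]   # assumed age range

# Model A from the figure caption: year of birth = 1997 - (1 x age), no noise.
year_of_birth = [1997 - a for a in ages]

# Model B from the figure caption: PSA = -6.3 + (0.23 x age), plus assumed noise.
psa = [-6.3 + 0.23 * a + random.gauss(0, 23.0) for a in ages]

r2_birth, p_birth = r2_and_p(ages, year_of_birth)
r2_psa, p_psa = r2_and_p(ages, psa)

print(f"year of birth: R^2 = {r2_birth:.3f}, p = {p_birth:.2g}")
print(f"PSA:           R^2 = {r2_psa:.3f}, p = {p_psa:.2g}")
```

Both p values fall far below 0.001, yet knowing age fixes year of birth exactly and narrows the simulated PSA hardly at all, which is the distinction between significance and explained variation that the figure is meant to convey.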
The median survival for patients with prostate cancer was 50 months (with 95% confidence interval 49 to 52 months), which was significantly (p <0.001) longer than the median survival of 40 months (with 95% confidence interval 31 to 47 months) for patients with bladder cancer. Other distributions were quite similar and are presented in table 1. About 72% of patients with prostate cancer were above 70 years old, whereas only 56% of patients with bladder cancer were above 70 years old.

In proportional hazards models, age, disease stage and marital status were highly significantly (p value range 0.0001 to 0.0009) associated with overall survival in prostate and bladder cancer (tables 2 and 3). The total proportion of variation explained for the model for prostate cancer is 7.1%, whereas the total proportion of variation explained for the model for bladder cancer is 32.1%. The proportion of variation explained for individual prognostic factors for prostate cancer is dramatically different from those for bladder cancer. For example, although the relative risk for age is almost identical in the prostate and bladder cancer models, the proportion of variation explained is much higher for bladder cancer.

TABLE 1. Demographic characteristics of patients with untreated prostate or bladder cancer

| | No. Prostate Ca (%) | No. Bladder Ca (%) |
|---|---|---|
| Age: | | |
| 50 or Less | 72 (0.7) | 60 (5.6) |
| 51 to 60 | 527 (5.0) | 134 (12.5) |
| 61 to 70 | 2,419 (22.7) | 282 (26.4) |
| 71 to 80 | 4,657 (43.8) | 357 (33.4) |
| 80 or More | 2,961 (27.8) | 237 (22.1) |
| Race: | | |
| White | 8,931 (84.0) | 951 (88.9) |
| Black | 1,128 (10.6) | 82 (7.7) |
| Other | 577 (5.5) | 37 (3.5) |
| Stage: | | |
| In situ | 19 (0.2) | 0 (0.0) |
| Localized | 8,485 (79.8) | 888 (83.0) |
| Regional | 2,132 (20.0) | 182 (17.0) |
| Marital status: | | |
| Married | 7,106 (66.8) | 710 (66.4) |
| Other | 3,530 (33.2) | 360 (33.6) |

TABLE 2. Relative risk (and 95% confidence intervals) of overall mortality in patients with prostate cancer

| | Relative Risk (95% CI) | p Value | % Unadjusted Proportion of Variation Explained | % Adjusted Proportion of Variation Explained |
|---|---|---|---|---|
| Age | 1.04 (1.036-1.043) | 0.0001 | 0.4 | 0.6 |
| Stage: | | | 6.1 | 5.5 |
| Localized | 1.00 | | | |
| Regional | 1.28 (1.21-1.36) | 0.0001 | | |
| Married: | | | 1.1 | 0.4 |
| No | 1.00 | | | |
| Yes | 0.84 (0.79-0.88) | 0.0001 | | |

Total proportion of variation explained 7.1%.

TABLE 3. Relative risk (95% confidence interval) of overall mortality in patients with bladder cancer

| | Relative Risk (95% CI) | p Value | % Unadjusted Proportion of Variation Explained | % Adjusted Proportion of Variation Explained |
|---|---|---|---|---|
| Age | 1.06 (1.05-1.07) | 0.0001 | 22.1 | 14.7 |
| Stage: | | | 14.4 | 8.9 |
| Localized* | 1.00 | | | |
| Regional | 3.55 (2.91-4.32) | 0.0001 | | |
| Married: | | | 5.3 | 0.6 |
| No | 1.00 | | | |
| Yes | 0.75 (0.63-0.89) | 0.0009 | | |

* Localized also included in situ cases.
Total proportion of variation explained 32.1%.

DISCUSSION

Identification of factors that predict clinical outcomes is important. When prognosis is poorly understood, it is very difficult to advise and treat patients intelligently. Given the importance of understanding prognostic factors, how do we assess the predictive power of a given prognostic model? Theoretically, a high predictive power depends on the correct integration of factors that explain large variations in outcomes. However, one might find it ironic and frustrating that prediction of specific outcomes (for example, survival in prostate cancer) is difficult even when many statistically significant prognostic factors have been identified from conventional models, for example, Cox proportional hazards models.

Because analysis of prognostic factors in survival studies is usually based on Cox proportional hazards models, the results allow only point estimates and confidence intervals for the relative risk associated with prognostic factors (and corresponding p values). The predictive power of the prognostic factor is rarely evaluated and should not be confused with statistical significance. The fact that a model is correctly specified for a set of prognostic variables does imply that, on average, predictions will be precise but not that for an individual patient the prediction will be very precise.² By limiting analysis of a data set solely to the Cox models, we may miss important information on the extent to which prognostic factors determine the outcome for a patient. This point is made vividly in the present study. Inspection of the p values generated in proportional hazards models did not allow meaningful assessment of the relative ability of significant input variables to explain variations in survival; even the relative risk estimates lead to misleading conclusions about predictive power. For example, proportion of variation explained analysis shows that the variation explained by age is larger for bladder cancer than prostate cancer (14.7 versus 0.6%). However, the Cox model showed identical p values and nearly identical relative risks for bladder and prostate cancer. Why age explains more of the variation in survival in bladder than prostate cancer is hard to explain.

Readers may be surprised that prognostic factors to which they have become accustomed performed poorly. The reader is asked to consider that the factors applied were found to be independently associated with survival in the relative absence of other measures, such as illness. Thus, they may altogether cease to associate independently with survival on the introduction into the models of other measures. One might even argue that the goal of clinical medicine is to substitute such nonremediable factors as age with such remediable factors as illness in our models of patient outcome so that we may better effect improved outcomes; this goal is in essence a rendering of such factors as age irrelevant to our prognostic models. Consider also that having prostate cancer is associated with many other factors (physical dysfunction, illness, cognitive impairment) that may better explain variation in survival. Given that in prostate cancer, median survival is relatively long and patients are relatively older (medically and psychosocially diverse), the opportunity for associated factors to exert an effect on survival is relatively great. Accordingly, one might hypothesize that in younger, simpler patient cohorts (for example, patients with osteosarcoma), such factors as disease stage would better explain variation in survival. One might also hypothesize that other measures of disease severity (for example, cancer grade) will explain greater proportions of survival variation than observed with the measures applied in this study.

How much explanation of survival is enough? The utopian answer is 100% (maximal theoretical explanation), although "the more the better" is more useful, given that the objectives are to make precise predictions of outcomes and develop interventions that redistribute survival. When proposing a new prognostic factor, one should be prepared first to illustrate its independence from other known prognostic factors. One might also wish to quantify its ability to explain variation in outcome. If this ability is meager, interventions designed to change the factor may effect minimally the improved outcomes. On the other hand, a great ability to explain variation suggests that interventions aimed at this factor have merit and will effect improved outcomes.

Our observations fit with the clinical impression that predicting survival in prostate cancer is difficult and strongly suggest that current efforts to predict survival in prostate cancer are dependent on variables that are poorly suited to them. More poignantly, the observations suggest that prostate cancer stage, as assessed in the Surveillance Epidemiology and End Results program data, is in fact a very poor predictor (5.5% of variation in this analysis). Substitution with other stage ascertainments may or may not change ability to explain survival variation, especially because clinical stage ascertainment is notoriously unreliable.⁹ To predict precisely survival in prostate cancer will require supplementation with variables that can explain larger proportions of survival variation.

The present analysis is not meant to be a definitive and final analysis of the variation in survival of prostate and bladder cancer. Importantly, we have used a simple but unusual sample (untreated patients) to help illustrate concepts; because the sample is unusual, the results may not be generalizable. The relevance of the proportions of variation explained that are generated in the present analyses to predictive models that might include other prognostic factors, such as grade and illness, can only be determined through future analyses.

CONCLUSIONS

The present study is meant as an illustration of principle from which we can learn to define more precisely our objectives and match our methods to those objectives. The analysis is also meant as a reminder that, as in Babylon, our plans can be unraveled by semantic imprecision and segregation of the clinical and analytical worlds. To develop the predictive models that our patients deserve, we will have to accept the limitations of variables to which we have become accustomed and seek variables that can explain progressively larger proportions of variations in survival.
Our present results, if generalizable, suggest that we have a long way to go before we can effect better survival for patients with prostate cancer.

REFERENCES

1. Significance of significant. New Engl. J. Med., 278: 1232, 1968.
2. Korn, E. L. and Simon, R.: Measures of explained variation for survival data. Stat. Med., 9: 487, 1990.
3. Korn, E. L. and Simon, R.: Explained residual variation, explained risk and goodness of fit. Amer. Stat., 45: 201, 1991.
4. Schemper, M.: The relative importance of prognostic factors in studies of survival. Stat. Med., 12: 2377, 1993.
5. Page, J. B., Lai, S., Fletcher, M. A., Patarca, R., Smith, P. C., Lai, H. C. and Klimas, N. G.: Predictors of survival in human immunodeficiency virus type 1-seropositive intravenous drug users. Clin. Diag. Lab. Immunol., 3: 51, 1996.
6. Krongrad, A., Lai, H., Lamm, S. H. and Lai, S.: Mortality in prostate cancer. J. Urol., 156: 1084, 1996.
7. Krongrad, A., Lai, H., Burke, M. A., Goodkin, K. and Lai, S.: Marriage and mortality in prostate cancer. J. Urol., 156: 1696, 1996.
8. Cox, D. R.: Regression models and life-tables. J. Roy. Stat. Soc., 34: 187, 1972.
9. Smith, D. S. and Catalona, W. J.: Interexaminer variability of digital rectal examination in detecting prostate cancer. Urology, 45: 70, 1995.

EDITORIAL COMMENT

This article provides important insight for nonstatisticians into one of the vagaries of statistics in clinical papers. The urology journals are replete with studies using Cox regression analysis to show that various clinical parameters have a significant correlation with survival. The authors used the large data base from the Surveillance Epidemiology and End Results program to show that age, clinical stage and marital status were all associated with survival for patients with prostate and bladder cancer. Among the potential uses for such correlations is the ability to counsel patients about their relative risks from a disease and help guide them toward optimal therapy.

The important lesson from this study is that finding a significant statistical correlation using Cox regression does not necessarily have much clinical significance. The authors have introduced to the urology community the proportion of variation explained, which is a measure of how much outcome can be explained by a particular parameter. Despite finding similar p values for the relationship between stage and survival in bladder and prostate cancer, the adjusted proportion of variation explained for this variable in prostate cancer was only 5.5%. In other words, this variable, although statistically significant, has very little ability to predict outcome. Other factors not assessed in this study need to be studied to determine which factor predicts survival more effectively.

The importance of this paper is not that stage is a poor predictor of outcome. Rather it demonstrates that when reading papers on prognostic factors, there is a need to know the proportion of variation explained in order to truly assess the potential usefulness of the information. Perhaps future papers that use Cox regression should be required to also provide the proportion of variation explained so that the reader can assess whether statistically significant variables have any clinical usefulness. What is so interesting about this study is that the 3 highly significant factors only explain 6.5% of the mortality. By using proportion of variation explained, perhaps, the most clinically useful prognostic factors now can be defined.

Gerald W. Chodak
Prostate and Urology Center
Louis A. Weiss Memorial Hospital
University of Chicago Hospitals
Chicago, Illinois
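
The Methods define the adjusted proportion of variation explained as the total proportion minus the proportion for a model with the factor of interest excluded. Turning that definition around gives the proportion each reduced model would explain and a ranking of the factors. The following Python sketch is bookkeeping on the published values only, not a reimplementation of the Schemper method; the numbers are transcribed from tables 2 and 3, and the names `TABLES`, `reduced_model_pve` and `rank_factors` are illustrative and do not come from the paper.

```python
# Percent of variation explained, transcribed from tables 2 and 3.
TABLES = {
    "prostate": {"total": 7.1,
                 "adjusted": {"age": 0.6, "stage": 5.5, "married": 0.4}},
    "bladder": {"total": 32.1,
                "adjusted": {"age": 14.7, "stage": 8.9, "married": 0.6}},
}

def reduced_model_pve(total, adjusted):
    """Proportion explained by the model omitting each factor.

    Follows from the Methods definition: adjusted = total - reduced,
    so reduced = total - adjusted.
    """
    return {factor: round(total - value, 1) for factor, value in adjusted.items()}

def rank_factors(adjusted):
    """Order factors from largest to smallest adjusted proportion explained."""
    return sorted(adjusted, key=adjusted.get, reverse=True)

for cancer, row in TABLES.items():
    reduced = reduced_model_pve(row["total"], row["adjusted"])
    summed = round(sum(row["adjusted"].values()), 1)
    print(cancer)
    print("  ranking by adjusted proportion:", rank_factors(row["adjusted"]))
    print("  proportion explained without each factor:", reduced)
    print("  sum of adjusted proportions:", summed, "| full-model total:", row["total"])
```

The adjusted proportions need not add up to the full-model total, because the factors can share explanatory ground. For prostate cancer the adjusted values sum to 6.5%, which appears to be the figure quoted in the editorial comment, whereas table 2 reports a full-model total of 7.1%.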