An Alternative to the Odds Ratio: A Method for Comparing Adjusted Treatment Group Effects on a Dichotomous Outcome

P. Chris Holland, CL McIntosh and Associates, Inc., Rockville, MD USA

Abstract

The odds ratio is a commonly used statistic for measuring the association between two groups on a dichotomous outcome. In clinical trials, it can be used as a measure of association between two treatment groups on a clinical response rate. However, it is sometimes desired to compare directly the estimated difference in response rates between two treatment groups. While the SAS/STAT® software does offer features for such comparisons, it does not allow the treatment effect to be adjusted for other explanatory variables. I present a method for testing success rate differences between two treatment groups while adjusting for other factors, and a SAS macro that performs the necessary steps for carrying out the procedure. The macro produces an output data set that contains estimated adjusted success rates, the difference between them, confidence intervals around the difference, and an associated p-value.

Introduction

The odds ratio is a widely used statistic in logistic regression analysis. In clinical trials, it allows researchers to answer the question, “which treatment is better?” with respect to a certain outcome or event. In a study that examines the effects of a drug used to treat heart disease, the event could be a heart attack within. Or there could be some pre-defined criteria that are used to determine if a study subject is a treatment responder or nonresponder.

In a simple logistic regression model, the odds ratio between two groups is often used to determine which group increases or decreases a subject’s chances of experiencing whatever outcome is being modeled. An odds ratio of 1 indicates that both groups are equal. Confidence intervals can be used to see whether the interval around the odds ratio contains the value of 1.
If so, one could conclude that, with a specified level of certainty, the two groups are statistically equal. Conversely, a confidence interval that does not contain 1 would signify that the odds of the event occurring are statistically significantly greater for one treatment group as compared to another.

In a multiple logistic regression model, factors such as age, gender or smoking history can be added to a model and used to help better predict the chances of an event. For example, if the gender distribution differs between two groups and gender is known to have an effect on the outcome, then the effect from the gender disparity can be used to adjust the group’s effect on the event. This results in a more accurate estimate of the model coefficient for the group effect and the odds ratio associated with it.

Sometimes, however, researchers are interested in knowing the estimated event rates and the difference between those rates as actual percentages. One reason for this is ease of interpretation. Percentages, the differences between those percentages, and confidence intervals around those differences are likely to be more intuitive to the reader than an odds ratio. Another, and perhaps more compelling, reason for certain clinical studies can be attributed to FDA guidelines. For studies that evaluate the efficacy of antimicrobials, for example, the FDA guidance documents suggest trial success criteria that are based on the differences in treatment success rates between the test drug and an active control. The recommended analytical approach involves estimation through the use of 95% confidence intervals around the treatment difference and a certain threshold for the lower limit of those intervals (the threshold depends on the treatment success rates observed during the trial).

Presenting results in the form of success rates, differences between the observed success rates, and confidence intervals around those differences is straightforward when no adjustments are made. The 6.12 release of the SAS System made these analyses easy to perform with the addition of the RISKDIFF option for the FREQ procedure. However, the same philosophy that explained the advantage of using adjusted odds ratios could be applied to estimated response rates. Without making adjustments for factors that are known to affect the response, disparities among these factors between the treatment groups could be mistaken for a treatment effect or non-effect.

I describe the PCNTDIFF macro that, with the use of the GENMOD procedure and the SAS macro language, derives estimated and adjusted response rates from a given logistic regression model for two specified groups. The difference between these rates and an associated p-value (using normal approximation) are then computed. Associated confidence intervals are also constructed. The results are then saved in an output data set.

Logistic Regression Background

Logistic regression is used to model dichotomous or binary outcomes. It represents a way of transforming these data so that the properties of simple linear regression can be applied. In any simple regression equation, we model the mean value of the outcome variable, Y, given the value of the independent variable, x. This is known as the conditional mean, or the “expected value of Y given x”, and is denoted as E(Y|x). The regression equation is:

$$E(Y \mid x) = \beta_0 + \beta_1 x$$

As the name suggests, logistic regression is based on the logistic distribution. Common notation to represent E(Y|x) when logistic regression is used is π(x). The form of the logistic regression model is:

$$\pi(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$

In order to take advantage of the properties of linear regression, a transformation of π(x) is necessary.
This transformation is called the logit transformation and is defined as:

$$g(x) = \ln\left[\frac{\pi(x)}{1 - \pi(x)}\right]$$

Given π(x) from above we have:

$$g(x) = \beta_0 + \beta_1 x$$

Expanding this equation for multiple regression we get:

$$g(x_1, x_2, \ldots, x_p) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$$

Adjusted Effect Estimates

Although the odds ratio is a useful and widely used statistic, it doesn’t tell the whole story when trying to compare two groups to one another. Most importantly, it doesn’t tell a reader the difference in the outcome rates between the two groups. The methodology below explains how adjusted effect estimates, the difference between the estimates, a confidence interval around the difference and the associated p-value are all constructed.

Given x’ = (x1, x2, …, xp), we have the following equation for this multiple logistic regression model:

$$g(x') = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$$

Reverting back from the logit transformation gives us:

$$\pi(x') = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p}}$$

This can perhaps more simply be expressed as:

$$\pi(x') = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p)}}$$

Now, let’s assume that the two groups we want to compare are represented by x1, whose corresponding coefficient is β1. Since π(x’) represents the equation for the estimated outcome rate, our objective then is to fit π(x’) for x1=1 and x1=0 (noted as π1 and π0) and then find the difference between these two values. The approach used for doing this is the same as that used to construct least-squares means, or population marginal means, which results in the mean for each group that you would expect from a balanced design.

For each model parameter, a coefficient is sought. For the covariates (continuous variables), we use the overall population mean. One thought may be to choose the mean value for each factor in the group where x1=1 and then do likewise for x1=0. This would, after all, allow us to find the most accurate predictor from each of the two treatment groups.
However, finding the most accurate predictor isn’t the objective. In fact, allowing potential differences between the two groups with respect to other factors in the model would confound the objective of trying to quantify the differences between the two groups while holding all other effects equal.

For a categorical variable, the weight (or coefficient) is the inverse of the number of levels in the category. Alternatively, if desired, one could use the population mean percentages for each category. To do this, replace the categorical variable from the CLASS statement with dummy or indicator variables that are treated as covariates.

For interaction terms, simply use the product of the coefficients used for each term. So, if modeling for x1=1, an interaction term involving the group and a categorical variable with k levels would use 1*1/k as the coefficient for each of the (k-1) interaction parameters. If modeling for π0, then the coefficient is zero.

Let’s say we have a model with the group variable and parameter β1, a categorical variable with k levels and parameters β21 to β2(k-1), a continuous covariate with parameter β3, and a term for the group-by-categorical variable interaction with parameters β41 to β4(k-1). The two equations for finding estimates of π1 and π0 would then be:

$$\pi_1 = \frac{1}{1 + e^{-\left[\beta_0 + \beta_1 + \beta_{21}(1/k) + \cdots + \beta_{2(k-1)}(1/k) + \beta_3 \bar{x} + \beta_{41}(1/k) + \cdots + \beta_{4(k-1)}(1/k)\right]}}$$

$$\pi_0 = \frac{1}{1 + e^{-\left[\beta_0 + \beta_{21}(1/k) + \cdots + \beta_{2(k-1)}(1/k) + \beta_3 \bar{x}\right]}}$$

The estimate of interest is then π1 - π0, which, at this point, is simple to compute. Finding the variance of this difference, however, is admittedly heuristic. Unlike least squares means for linear models, where the variance associated with the difference between the means is well understood, a precise variance estimator for π1 - π0 is not. As a result, π1 and π0 are treated as parameters of binomial distributions. The per-observation variance of each is thus π1(1 - π1) and π0(1 - π0), respectively.
We can estimate the variance for π1 - π0 as

$$S_p^2 = \frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_0(1 - \pi_0)}{n_0}$$

For sufficiently large n1 and n0, we can regard π1 - π0 as a normal random variable and can use the normal approximation to construct a two-sided confidence interval around this difference. This gives us the following formula:

$$(\pi_1 - \pi_0) \pm z_{\alpha/2} \sqrt{\frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_0(1 - \pi_0)}{n_0}}$$

where z(α/2) is the upper α/2 critical value of the standard normal distribution. It follows that the p-value is also derived from the normal distribution.

The PCNTDIFF Macro

All of the computations mentioned so far are automatically taken care of by the PCNTDIFF macro. To use the macro, one simply needs to provide a few macro parameters, such as the input data set, the response variable, the group variable, x1, and other explanatory variables for the requested logistic regression model. These include the CLASS (categorical) variables, the covariates, and the interaction terms. A macro call using all of these parameters would look something like this:

    %pcntdiff(data=dataset,
              response=success,
              grp=trt,
              classvrs=site race,
              intrax=trt*site trt*race,
              covariat=age,
              byvars=study);

The first three parameters are required, although without any adjustments from class variables, covariates, or interactions, the purpose of using the macro is defeated. The first three parameters represent the input data set, the dependent variable, and the group variable, respectively. Class variables, interaction terms, and covariates are specified with the CLASSVRS, INTRAX, and COVARIAT parameters, respectively. If more than one term is desired for any of these, then the terms should be separated by spaces (however, with interaction terms, the term itself should have no spaces on either side of the “*” symbol). Lastly, the BYVARS parameter is used to run the analysis for different BY groups.
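
As an illustration of the arithmetic described in the Adjusted Effect Estimates section, the following Python sketch computes the two adjusted rates, their difference, the normal-approximation confidence interval, and the p-value. It is not the source of the PCNTDIFF macro, which this paper does not list; the function and argument names are hypothetical. The inputs are the rounded parameter estimates and group sizes reported in the applied example later in this paper, with x1=1 taken to be the test drug group. The pooled mean age is derived here from the group means in Output 4, which is an assumption about how the overall mean would be obtained.

```python
import math
from statistics import NormalDist


def inv_logit(eta):
    """Inverse of the logit transformation: pi = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def adjusted_rate_difference(beta0, beta1, adj_terms, n1, n0,
                             group_interactions=(), alpha=0.05):
    """Adjusted rates for x1=1 and x1=0, their difference, CI and p-value.

    adj_terms and group_interactions are sequences of (estimate, weight)
    pairs. The weight is the overall mean for a covariate, or 1/k for each
    of the k-1 parameters of a k-level class variable. Interactions with
    the group variable contribute to pi1 only, because x1=0 for pi0.
    """
    base = beta0 + sum(b * w for b, w in adj_terms)
    pi0 = inv_logit(base)
    pi1 = inv_logit(base + beta1 + sum(b * w for b, w in group_interactions))
    diff = pi1 - pi0

    # Heuristic standard error: each rate is treated as a binomial parameter.
    se = math.sqrt(pi1 * (1 - pi1) / n1 + pi0 * (1 - pi0) / n0)

    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - std_normal.cdf(abs(diff) / se))
    return {
        "pi1": pi1,
        "pi0": pi0,
        "diff": diff,
        "lower": diff - z_crit * se,
        "upper": diff + z_crit * se,
        "p_value": p_value,
    }


# Pooled mean age from the group sizes and group means in Output 4 (assumed).
mean_age = (113 * 42.33628319 + 106 * 47.31132075) / (113 + 106)

# Rounded estimates from Output 5: intercept, test drug effect, age effect.
result = adjusted_rate_difference(
    beta0=4.8700,
    beta1=-0.3817,
    adj_terms=[(-0.0791, mean_age)],
    n1=106,   # test drug group
    n0=113,   # active control group
)

for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With these rounded inputs the sketch gives adjusted rates close to 0.72 for the test drug and 0.79 for the active control, a difference near -0.07. Small discrepancies from the macro's results in Output 6 would be expected, since the macro presumably works with unrounded estimates.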

Once all of the macro parameters are specified and the macro is executed, the GENMOD procedure in the SAS/STAT software is used to construct the parameter estimates for the logistic regression model. The FREQ procedure is used to find the weights for the CLASS variables and the MEANS procedure is used to calculate the means of each covariate. As shown above, if interaction terms are present, the coefficients for interaction terms involving the treatment group variable fall out of the π0 equation (since x1=0). If desired, the procedure output from all procedures used within the macro can be printed to an output file (or the output window if running SAS interactively). This can be helpful for double-checking the macro’s results.

After all of the necessary values are found, the estimated and adjusted outcome rates are computed. An output data set is created containing variables that represent the BY variables (if any are specified), character or numeric representations for each of the two comparison groups, n1, n0, π1, π0, (π1 - π0), the CI around (π1 - π0), the associated p-value, and other related statistics.

An Applied Example

To illustrate the advantage of constructing adjusted success rates and the use of the PCNTDIFF macro, consider an analysis using fabricated data that compares an experimental drug with an active control in subjects who have been diagnosed with pneumonia. A subject is considered a success if his or her condition improves or the subject is cured after one week on treatment (as judged by the investigating physician). In order for the trial to be considered a success, the results should demonstrate that the experimental treatment is statistically and clinically superior or equivalent in efficacy to the active control. The recommended analytical approach as defined by FDA guidance documents is to construct a two-sided 95% confidence interval of the treatment difference (test drug minus control) in success rates.
The confidence interval should contain zero and the lower limit of the confidence interval should not fall below the clinically specified boundary for establishing efficacy equivalence, which depends on the better of the two treatment success rates.

The data used to demonstrate this application contain 219 intent-to-treat subjects, 113 of whom are in the active control group and 106 of whom are in the test drug group. In accordance with the analytical approach mentioned above, the efficacy analysis is based on a two-sided 95% confidence interval around the difference in the treatment success rates. These are obtained by using the RISKDIFF option in PROC FREQ. Output 1 contains the contingency table.

Output 1:

    TABLE OF TMT BY SUCCESS

    TMT               SUCCESS
    Frequency
    Row Pct           No        Yes       Total
    ---------------------------------------------
    Test Drug            35        71       106
                      33.02     66.98
    ---------------------------------------------
    Active Control       21        92       113
                      18.58     81.42
    ---------------------------------------------
    Total                56       163       219

As can be seen from the table, the success rate in the active control group looks considerably greater than that in the test group (81.4% vs. 67.0%). Since the overall success rate of the better of the two groups is in the 80-90% range, the lower limit of the confidence interval is going to have to be greater than -15% in order to demonstrate clinical and statistical equivalence (according to the FDA guidelines). The Chi-Square test statistics appear in Output 2.

Output 2:

    STATISTICS FOR TABLE OF TMT BY SUCCESS

    Statistic                       DF     Value     Prob
    ------------------------------------------------------
    Chi-Square                       1     5.988     0.014
    Likelihood Ratio Chi-Square      1     6.027     0.014
    Continuity Adj. Chi-Square       1     5.253     0.022
    Mantel-Haenszel Chi-Square       1     5.961     0.015
    Fisher's Exact Test (Left)                       0.011
                        (Right)                      0.995
                        (2-Tail)                     0.020
    Phi Coefficient                       -0.165
    Contingency Coefficient                0.163
    Cramer's V                            -0.165

Regardless of which test you use, statistical significance falls in favor of the active control group, which suggests that the asymptotic confidence interval is not going to contain zero and the study success criteria are therefore not going to be met. Nonetheless, let’s look at the column 2 risk estimates in Output 3:

Output 3:

    Column 2 Risk Estimates

                                     95% Confidence Bounds   95% Confidence Bounds
                     Risk     ASE        (Asymptotic)              (Exact)
    --------------------------------------------------------------------------
    Row 1            0.670    0.046     0.580     0.759        0.572     0.758
    Row 2            0.814    0.037     0.742     0.886        0.730     0.881
    Total            0.744    0.029     0.687     0.802        0.681     0.801
    Difference      -0.144    0.059    -0.259    -0.030
    (Row 1 - Row 2)

The confidence interval for Test minus Control is (-0.259, -0.030). All may not be lost, however. Review of the demographics and baseline characteristics between the two treatment groups reveals a statistically significant difference with respect to age. Results from a t-test are in Output 4.

Output 4:

    TTEST PROCEDURE

    Variable: AGE

    TMT                N           Mean        Std Dev      Std Error
    ------------------------------------------------------------------
    Active Control   113    42.33628319     4.95625618     0.46624536
    Test Drug        106    47.31132075     6.93419849     0.67350890

    Variances          T       DF    Prob>|T|
    ------------------------------------------
    Unequal      -6.0734    189.0      0.0001
    Equal        -6.1369    217.0      0.0000

Subjects in the test group are five years older, on average, than those in the active control group. This is not something you would expect to happen in a properly randomized trial, but it can happen nonetheless. Since age can be considered an influential factor on the outcome, there is good reason to re-construct the response rates by adjusting for age. This is where the PCNTDIFF macro comes into play. The macro call would look like:

    %pcntdiff(data=pneumo, response=success, grp=trt, covariat=age);

Here, we are adjusting treatment group effects by the covariate age. Looking at the parameter estimates from the logistic regression model as seen in Output 5, we can get a good idea about the strength and direction of the age effect.

Output 5:

    The GENMOD Procedure

    Analysis Of Parameter Estimates

    Parameter                    DF    Estimate    Std Err    ChiSquare    Pr>Chi
    ------------------------------------------------------------------------------
    INTERCEPT                     1      4.8700     1.1000      19.6017    0.0001
    TMT        Test Drug          1     -0.3817     0.3470       1.2097    0.2714
    TMT        Active Control     0      0.0000     0.0000        .         .
    AGE                           1     -0.0791     0.0247      10.2394    0.0014
    SCALE                         0      1.0000     0.0000        .         .

    NOTE: The scale parameter was held fixed.

The evidence is strong that response rates adjusted for age would yield more favorable results. The results from the PCNTDIFF macro are in Output 6.

Output 6:

    Final Results-- Data Set logit

    OBS    Group 1 Value     Group 2 Value    Adjusted Success Rate    Adjusted Success Rate
                                              for Group 1              for Group 2
      1    Active Control    Test Drug        0.7912                   0.7212

    OBS    Difference    95% Confidence Interval    normal approximation p-value
      1    -0.07         (-.1836, 0.0436)           0.22712

As suspected, the results are much different from what was seen before the age adjustment was made. Applying these results to the study success criteria, we see that the confidence interval does contain zero. The lower limit of the confidence interval is less than -15%, but since the success rate for the better of the two treatments is now less than 80%, the lower bound threshold changes to -20% according to the FDA criteria. By these parameters, the trial success criteria are met.

Conclusion

The odds ratio has become an important statistic for comparing the relationship between two groups with respect to a dichotomous or binary outcome. Multiple logistic regression models allow researchers to adjust group effects for other possible sources of variation.
However, there remains a missing link between the odds ratio (and the model parameter estimate with which it is associated) and the adjusted event rates that multiple logistic regression models help estimate. The methodology presented in this paper, and the SAS macro used to carry out this methodology, attempt to bridge the gap from hypothesis-based statistics for the logistic regression model’s parameters to the estimated event rates derived from those models. The macro creates a data set that contains adjusted estimates of the event rates for the groups of interest, an estimated difference between those event rates, a two-sided confidence interval around the difference, and a p-value for the test that the two groups are equal. Under proper conditions, the conclusions drawn from this hypothesis test should closely match those from the multiple logistic regression model’s odds ratio and parameter estimate hypothesis tests.

References

Fisher, Lloyd; van Belle, Gerald; Biostatistics: A Methodology for the Health Sciences, New York, Wiley, 1993.

Holland, P. Chris; More Class to PROC PHREG: An Enhanced SAS® Macro for the Analysis of Cox Proportional Hazards Models that Involve Multinomial Effects, PharmaSUG 1999 Conference Proceedings.

Hosmer, David W.; Lemeshow, Stanley; Applied Logistic Regression, New York, Wiley, 1989.

SAS Institute Inc.; SAS/STAT User’s Guide, Version 6, Fourth Edition, Volumes 1 and 2, Cary, NC: SAS Institute Inc., 1989.

SAS Institute Inc.; SAS/STAT Software: Changes and Enhancements through Release 6.11, Cary, NC: SAS Institute Inc., 1996. 1104 pp.

Stokes, Maura E.; Davis, Charles S.; Koch, Gary G.; Categorical Data Analysis Using the SAS System, Cary, NC: SAS Institute Inc., 1995. 499 pp.

US FDA; Draft Guidance Document for Evaluating Clinical Studies of Antimicrobials in the Division of Anti-infective Drug Products. Issued February, 1997.

SAS and SAS/STAT are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Acknowledgements

I would like to thank Dr. Hoi Leung for his help in explaining and introducing this procedure to me.

Contact Information

P. Chris Holland, MS
Statistician
CL McIntosh & Associates, Inc.
12300 Twinbrook Parkway, Suite 625
Rockville, MD 20852
Phone: (301) 770-9590 ext. 271 (W)
       (703) 524-9810 (H)
e-mail: [email protected]

The PCNTDIFF macro can be downloaded from the World Wide Web at: http://www.erols.com/petey/macrodoc