Conditional logistic regression using COXREG
Conditional logistic regression models are designed for situations in which one or more "cases," who show the
response of interest, are matched with one or more "controls," who do not show the response. The most
common situation involves 1-1 matching, though 1-N and M-N matching are also seen. (To avoid
confusion between the term "case" as used here and a physical case in an SPSS Statistics data file, I'll use
"case" to refer to a subject (or subjects) showing the response, and "physical case" for a
record in the data file when such reference is required.)
In many software packages, the standard binary logistic regression procedures can be used to fit 1-1 matching
situations by suppressing the intercept, using a constant dependent variable with a value of 1 for every physical
case, and defining a physical case by taking the difference between the case and control values on the
predictor variables. This will not work with the LOGISTIC REGRESSION procedure because it will only
estimate a model when the dependent variable has exactly two values. However, this can be done in the
NOMREG procedure, which is accessed in the menus via Analyze>Regression>Multinomial Logistic. See the
example on matched case-control studies in the chapter on multinomial logistic regression in the SPSS
Advanced Statistical Procedures Companion, by Marija Norusis, or the Case Study in the Help (Help>Case
Studies>Regression Option>Multinomial Logistic Regression>Using Multinomial Logistic Regression to
Analyze a 1-1 Matched Case-Control Study) for more details on how to use NOMREG for matched 1-1 case
control studies.
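As a rough sketch of the difference-based approach in NOMREG, assume (hypothetically) that the file has been
restructured to one record per pair, with predictor values stored as x1_case, x1_ctrl, x2_case, and x2_ctrl.
These variable names and the PRINT keywords chosen are illustrative only and are not taken from the
Companion or the Case Study, which remain the authoritative descriptions of this method.

```spss
* One record per matched pair is assumed, and all variable names are hypothetical.
COMPUTE d_x1 = x1_case - x1_ctrl.
COMPUTE d_x2 = x2_case - x2_ctrl.
* Constant dependent variable with a value of 1 for every record.
COMPUTE dvconst = 1.
EXECUTE.
* Fit the no-intercept model on the case-minus-control differences.
NOMREG dvconst WITH d_x1 d_x2
  /INTERCEPT=EXCLUDE
  /PRINT=PARAMETER SUMMARY.
```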
Models with more than one control per case cannot be fitted using NOMREG. It is possible to use the COXREG
procedure (or the CSCOXREG procedure in the Complex Samples module) to fit such models. This approach
may also be easier with 1-1 matched data, as it does not require you to compute differences between
predictor variable values for cases and controls.
Suppose we have K pairs of matched cases and controls (in a 1-1 matching). The total number of physical
cases in the data file will then be 2K. In order to use COXREG to do the conditional logistic regression, we
need to do the following:
1. Code or recode the dependent variable so that it has a value of 1 for the cases and 2 for the controls. We'll call
   this variable DV. (Technically, the only requirement is that the case in each set has a positive value that is
   smaller than that for its control.)
2. Create a copy of this variable with another name. We'll call this variable STATUS. (Technically, all that's
   needed here is for all cases to share some property not shared by the controls.)
3. If it does not already exist, create a variable that denotes each pair (this will have K different values in our
   example). We'll call this variable PAIR.
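The preparation steps above might be sketched in command syntax as follows. This assumes a hypothetical
variable named response, coded 1 for cases and 0 for controls. The last COMPUTE is needed only if no pair
identifier exists, and it is valid only under the further assumption that the file is sorted so that each
pair occupies two consecutive rows.

```spss
* response is a hypothetical 0/1 variable where 1 = case and 0 = control.
RECODE response (1=1) (0=2) INTO dv.
* STATUS is simply a copy of DV.
COMPUTE status = dv.
* Assumption: each pair occupies two consecutive rows, so rows 1-2 get pair 1, rows 3-4 get pair 2.
COMPUTE pair = TRUNC(($CASENUM - 1) / 2) + 1.
EXECUTE.
```

If the data already carry a matching identifier, use that variable as the stratification variable instead
of computing one from row position.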
In the menus, click on Analyze>Survival>Cox Regression. Move DV into the Time slot. Move STATUS into the
Status slot, click on the Define Event button, and define the value 1 as the single value denoting an event.
Move the PAIR variable into the Strata slot. Then specify all desired predictors, choose any desired variable
selection methods, and define any appropriate covariates as categorical, just as you would in LOGISTIC
REGRESSION. The Variables in the Equation output for COXREG looks exactly like that in LOGISTIC
REGRESSION (without an intercept). You can get confidence intervals for the Exp(B) values, or odds ratios, in
the Options dialog.
In command syntax, the basic structure would be:
COXREG dv WITH covlist
/STATUS=status(1)
/STRATA=pair.
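A slightly fuller sketch of the same command, with a categorical predictor and confidence intervals for
Exp(B), might look like this. The predictor names age and smoker are hypothetical. The indicator contrast
is requested explicitly here so that Exp(B) for the categorical predictor reads as an odds ratio against a
reference category.

```spss
* age and smoker are hypothetical predictors, and smoker is treated as categorical.
COXREG dv WITH age smoker
  /STATUS=status(1)
  /STRATA=pair
  /CATEGORICAL=smoker
  /CONTRAST (smoker)=INDICATOR
  /PRINT=CI(95).
```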
This method works because the partial likelihood maximized by the COXREG procedure, when stratified on the
matched pairs, is the same as the conditional likelihood that arises in conditional logistic regression.
Within each matched pair, the likelihood is based on the probability that the physical case that is actually
the case is the one to respond, rather than its control. In theory this extends to 1-N and M-N
matching, where the pairs become larger sets, but COXREG should generally be used only for the 1-1 and 1-N
situations.
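To make the equivalence concrete, here is a sketch of the likelihood contribution of a single matched set.
The notation is introduced for illustration and is not part of the original note: x_j is the predictor
vector of physical case j, beta is the coefficient vector, R_k is the set of physical cases in stratum k,
c(k) indexes the case in that stratum, and d(k) indexes the control in a 1-1 pair.

```latex
% Cox partial likelihood contribution of stratum k with a single event (the case).
% This is also the conditional logistic likelihood for a 1-N matched set.
L_k(\beta) = \frac{\exp(\beta^\top x_{c(k)})}{\sum_{j \in R_k} \exp(\beta^\top x_j)}

% With 1-1 matching, R_k contains only the case c(k) and its control d(k):
L_k(\beta) = \frac{\exp(\beta^\top x_{c(k)})}{\exp(\beta^\top x_{c(k)}) + \exp(\beta^\top x_{d(k)})}
           = \frac{1}{1 + \exp\{-\beta^\top (x_{c(k)} - x_{d(k)})\}}
```

The last form is a logistic model with no intercept applied to case-minus-control differences, which is why
the difference-based approach described earlier gives the same estimates for 1-1 data.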
If there are multiple controls for each case, you can easily extend the COXREG method by simply having
more than one control in each set. In this situation, the stratification variable might be named
SET or something more descriptive than PAIR, but this isn't necessary. Thus, fitting the conditional
logistic model for the 1-N matched situation is easy. The number of controls can also differ from
case to case.
Using the COXREG method for the M-N matching situation (or any situation with more than one case in a
set) is not recommended. The reason for this is that the COXREG procedure offers only Breslow's
approximate method for dealing with tied event times within a stratum. This approximation is good only when
the number of ties at each event time is small relative to the number of physical cases at risk at that event time.
Since in this application we are defining strata as sets, we will have M tied event times out of M+N total at risk
physical cases, which will generally be a substantial proportion. The estimates one gets from COXREG in this
situation are thus likely to be inadequate approximations to the true maximum likelihood values based on the
discrete time likelihood.
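The gap between the two likelihoods can be sketched as follows, again in illustrative notation that is not
part of the original note: R_k is the set of all M+N physical cases in set k, x_j is the predictor vector of
physical case j, and s_k is the sum of x_j over the M cases in the set.

```latex
% Exact conditional (discrete-time) likelihood for set k:
% the denominator sums over every subset A of R_k with exactly M members.
L_k^{\text{exact}}(\beta) = \frac{\exp(\beta^\top s_k)}
  {\sum_{A \subseteq R_k,\ |A| = M} \exp\bigl(\beta^\top \sum_{j \in A} x_j\bigr)}

% Breslow approximation for the same set:
% the single-event denominator is raised to the power M.
L_k^{\text{Breslow}}(\beta) = \frac{\exp(\beta^\top s_k)}
  {\bigl(\sum_{j \in R_k} \exp(\beta^\top x_j)\bigr)^{M}}
```

When M = 1 the two denominators are identical, which is why the 1-1 and 1-N situations are unaffected. For
M greater than 1 they differ, and the difference grows as M becomes a larger share of M+N.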