Chapter 11
Survival Analysis
Part 2
Survival Analysis and Regression

Combine lots of information
 Look at several variables simultaneously

Explore interactions
 model interaction directly

Control (adjust) for confounding
Proportional hazards regression
(Cox Regression)

Can we relate predictors to survival time?

We would like something like linear regression
$t = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots$

Can we incorporate censoring too?

Use the hazard function
Hazard function

Given patient survived to time t, what is the
probability they develop outcome very soon?
(t + small amount of time)

Approximates proportion of patients having
event around time t
Hazard function
$\lambda(t) = \lim_{\Delta \to 0} \mathrm{Prob}(t \le T < t + \Delta \mid t \le T) \,/\, \Delta$

Hazard is less intuitive than the survival curve
Conditional probability the event will occur between t and t + Δ, given it has not previously occurred
Dividing by Δ gives a rate per unit of time; as Δ goes to 0 we get the instantaneous rate
Tells us where the greatest risk is given survival up to that time (risk of the event at that time for an individual)
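The interval form of the hazard can be illustrated with a small calculation. The following is a minimal sketch, assuming a hypothetical cohort followed in one-year intervals with no censoring; the counts and the helper name `interval_hazard` are made up for illustration and do not come from the slides.

```python
# Hypothetical cohort (assumed values, not from the slides):
# subjects still event-free at the start of each one-year interval,
# and the number of events observed during that interval.
at_risk = [100, 90, 70, 40]
events = [10, 20, 30, 10]

def interval_hazard(n_at_risk, n_events, width=1.0):
    """Conditional probability of the event in the interval, per unit time."""
    return n_events / n_at_risk / width

hazards = [interval_hazard(n, d) for n, d in zip(at_risk, events)]

for year, h in enumerate(hazards):
    print(f"year {year}: hazard is about {h:.3f} per year")

# The interval with the largest hazard is where risk is greatest
# among those who have survived up to that point.
riskiest_year = hazards.index(max(hazards))
print("greatest hazard in year", riskiest_year)
```

Note that the hazard is computed only among subjects still at risk, which is why it can rise even while the number of events per interval falls.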
Possible Hazard of Death from Birth
Probability of dying in next year as function of age
[Figure: hazard λ(t) plotted against age; axis ticks at 0, 6, 17, 23, and 80]
At which age would the hazard be greatest?
Possible Hazard of Divorce
[Figure: hazard curve; axis ticks at 0, 2, 10, 25, 35, and 50]
Why “proportional hazards”?
Ratio of hazards measures relative risk
$RR(t) = \dfrac{\lambda(t) \text{ for exposed}}{\lambda(t) \text{ for unexposed}}$
If we assume relative risk is constant over time…

$\dfrac{\lambda(t) \text{ for exposed}}{\lambda(t) \text{ for unexposed}} = c$
The hazards are proportional!
Proportional Hazard of Death from Birth
Probability of dying in next year as function of age for two groups (women, men)
[Figure: hazard λ(t) plotted against age for each group; axis ticks at 0, 6, 17, 23, and 80]
At which age would the hazard be greatest?
Proportional Hazards and
Survival Curves

If we assume proportional hazards then
$S_a(t) = [S_b(t)]^c$

The curves should not cross.
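A short derivation may help show why proportional hazards imply this relationship between the survival curves. It is a sketch that assumes group a has hazard c times that of group b at every time (the labels a and b are taken from the slide; which group is "exposed" is an assumption) and uses the standard identity linking survival to the cumulative hazard.

```latex
\begin{aligned}
S(t) &= \exp\!\left(-\int_0^t \lambda(u)\,du\right) \\
\lambda_a(u) &= c\,\lambda_b(u) \quad \text{for all } u \\
S_a(t) &= \exp\!\left(-c\int_0^t \lambda_b(u)\,du\right)
        = \left[\exp\!\left(-\int_0^t \lambda_b(u)\,du\right)\right]^{c}
        = [S_b(t)]^{c}
\end{aligned}
```

Because $0 \le S_b(t) \le 1$, raising it to a power $c > 1$ can only lower it (and $c < 1$ can only raise it), so one curve stays on the same side of the other at every time.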
Proportional hazards regression model
one covariate
$\lambda(t) = \lambda_0(t) \exp(\beta_1 X_1)$
$\lambda_0(t)$ - unspecified baseline hazard (plays the role of the constant term; it may vary with t)
$\lambda_0(t)$ is the hazard for a subject with X = 0 (cannot be negative)
$\beta_1$ = regression coefficient associated with the predictor (X)
$\beta_1$ positive indicates larger X increases the hazard
Can include more than one predictor
Interpretation of Regression Parameters
$\lambda(t) = \lambda_0(t) \exp(\beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_p X_p)$
$\log(\lambda(t)) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$, where $\beta_0 = \log(\lambda_0(t))$
For a binary predictor, X1 = 1 if exposed and 0 if unexposed,
$\exp(\beta_1)$ is the relative hazard for exposed versus unexposed
($\beta_1$ is the log of the relative hazard).
$\exp(\beta_1)$ can be interpreted as relative risk or relative rate with all other covariates held fixed.
Example - risk of outcome for
women vs. men
Suppose X1 = 1 for females, 0 for males
$\lambda(t) = \lambda_0(t) \exp(\beta_1 X_1)$
For females: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 1) = \lambda_0(t) \exp(\beta_1)$
For males: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 0) = \lambda_0(t)$
$\text{Relative hazard} = \dfrac{\text{hazard for females}}{\text{hazard for males}} = \dfrac{\lambda_0(t) \exp(\beta_1)}{\lambda_0(t)} = \exp(\beta_1)$
Example - Risk of outcome for
1 unit change in blood pressure
Suppose X1 = systolic blood pressure (mm Hg)
$\lambda(t) = \lambda_0(t) \exp(\beta_1 X_1)$
For person with SBP = 114: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 114)$
For person with SBP = 113: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 113)$
Relative risk of 1 unit increase in SBP:
$\dfrac{\lambda_0(t) \exp(114 \beta_1)}{\lambda_0(t) \exp(113 \beta_1)} = \exp(114 \beta_1 - 113 \beta_1) = \exp(\beta_1)$
Example - Risk of outcome for
10 unit change in blood pressure
Suppose X1 = systolic blood pressure (mm Hg)
$\lambda(t) = \lambda_0(t) \exp(\beta_1 X_1)$
For person with SBP = 110: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 110)$
For person with SBP = 100: $\lambda(t) = \lambda_0(t) \exp(\beta_1 \cdot 100)$
Relative risk of 10 unit increase in SBP:
$\dfrac{\lambda_0(t) \exp(110 \beta_1)}{\lambda_0(t) \exp(100 \beta_1)} = \exp(110 \beta_1 - 100 \beta_1) = \exp(10 \beta_1)$

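The two blood pressure examples follow one rule: the relative risk for a change of k units in a predictor is exp(k × coefficient). Here is a minimal sketch of that rule; the coefficient value 0.02 per mm Hg is hypothetical and chosen only for illustration, as is the function name `hazard_ratio`.

```python
import math

def hazard_ratio(beta, delta_x):
    """Hazard ratio for a change of delta_x units in X: exp(beta * delta_x)."""
    return math.exp(beta * delta_x)

# Hypothetical coefficient per mm Hg of systolic blood pressure
# (an assumed value, not an estimate from the slides).
beta_sbp = 0.02

hr_1_unit = hazard_ratio(beta_sbp, 114 - 113)    # SBP 114 vs. 113
hr_10_units = hazard_ratio(beta_sbp, 110 - 100)  # SBP 110 vs. 100

print(f"1 unit increase:  {hr_1_unit:.4f}")
print(f"10 unit increase: {hr_10_units:.4f}")
```

The baseline hazard cancels in the ratio, so only the difference in SBP matters, and the 10-unit ratio equals the 1-unit ratio raised to the tenth power.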
Parameter estimation

How do we come up with estimates for $\beta_i$?

Can’t use least squares since the outcome is not fully observed (survival times may be censored)

Maximum partial-likelihood (beyond the scope of this class)
 Given our data, what are the values of $\beta_i$ that are most likely?

See page 392 of Le for details
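For readers who want to see the idea in action, here is a minimal sketch of maximum partial-likelihood estimation for a single covariate. It assumes a tiny hypothetical data set with no tied event times and uses Newton-Raphson; the data, function names, and iteration limit are all illustrative assumptions, not the slides' or SAS's implementation.

```python
import math

# Hypothetical data (not from the slides): (time, status, x), where
# status = 1 is an event and status = 0 is censored; no tied event times.
data = [(2, 1, 1), (3, 1, 0), (4, 0, 1), (5, 1, 1),
        (7, 1, 0), (8, 0, 0), (9, 1, 1), (12, 1, 0)]

def risk_sums(beta, t):
    """Weighted sums over the risk set at time t (everyone with time >= t)."""
    s0 = s1 = s2 = 0.0
    for time_j, _, x_j in data:
        if time_j >= t:
            w = math.exp(beta * x_j)
            s0 += w
            s1 += x_j * w
            s2 += x_j * x_j * w
    return s0, s1, s2

def log_partial_likelihood(beta):
    """Sum over events of beta*x_i - log(sum of exp(beta*x_j) in the risk set)."""
    total = 0.0
    for t, status, x in data:
        if status == 1:
            s0, _, _ = risk_sums(beta, t)
            total += beta * x - math.log(s0)
    return total

def score_and_information(beta):
    """First derivative and negative second derivative of the log partial likelihood."""
    score = info = 0.0
    for t, status, x in data:
        if status == 1:
            s0, s1, s2 = risk_sums(beta, t)
            mean_x = s1 / s0
            score += x - mean_x
            info += s2 / s0 - mean_x ** 2
    return score, info

# Newton-Raphson iterations starting from beta = 0.
beta_hat = 0.0
for _ in range(25):
    score, info = score_and_information(beta_hat)
    step = score / info
    beta_hat += step
    if abs(step) < 1e-10:
        break

standard_error = 1.0 / math.sqrt(score_and_information(beta_hat)[1])
print(f"beta_hat = {beta_hat:.4f}, SE = {standard_error:.4f}")
```

Censored subjects never contribute a term of their own, but they do appear in the risk sets of earlier events, which is how the method incorporates censoring. The baseline hazard does not appear anywhere, so it can stay unspecified.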
Inference for proportional hazards regression


Collect data, choose model, estimate the $\beta_i$s
Describe hazard ratios, $\exp(\beta_i)$, in statistical terms.


How confident are we of our estimate?
Is the hazard ratio different from one due to chance?
95% Confidence Intervals for the relative
risk (hazard ratio)

Based on transforming the 95% CI for the coefficient $\beta_i$
$\left( e^{\beta_i - 1.96\,SE},\ e^{\beta_i + 1.96\,SE} \right)$
Supplied automatically by SAS
If the interval does not contain 1: “We have a statistically significant association between the predictor and the outcome controlling for all other covariates”

Equivalent to a hypothesis test; reject Ho: RR = 1 at alpha = 0.05 (Ha: RR ≠ 1)
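The interval formula can be checked by hand. The sketch below applies it to the unadjusted treatment estimate and standard error reported in the PROC PHREG output later in these slides (0.57276 and 0.50960); the function name `hazard_ratio_ci` is an illustrative choice.

```python
import math

def hazard_ratio_ci(beta, se, z=1.96):
    """Exponentiate the endpoints of the CI for beta to get a CI for exp(beta)."""
    return math.exp(beta - z * se), math.exp(beta + z * se)

# Estimate and standard error for treat from the unadjusted PHREG output.
beta_treat = 0.57276
se_treat = 0.50960

lower, upper = hazard_ratio_ci(beta_treat, se_treat)
contains_one = lower < 1.0 < upper

print(f"hazard ratio = {math.exp(beta_treat):.3f}")
print(f"95% CI = ({lower:.3f}, {upper:.3f})")
print("interval contains 1:", contains_one)
```

The interval runs from roughly 0.65 to roughly 4.81, so it contains 1, which is consistent with the non-significant p-value for treatment in that output.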
Hypothesis test for individual PH
regression coefficient

Null and alternative hypotheses

Ho: $\beta_i = 0$, Ha: $\beta_i \ne 0$

Test statistic and p-values supplied by SAS

If p<0.05, “there is a statistically significant association
between the predictor and outcome variable controlling
for all other covariates” at alpha = 0.05

When X is binary, identical results as log-rank test
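The chi-square statistic and p-value that SAS reports for a single coefficient appear to be a Wald test, (estimate / standard error) squared, compared with a chi-square distribution on 1 degree of freedom. The sketch below reproduces the numbers for the unadjusted treatment effect shown later in the slides; the function name `wald_test` is illustrative.

```python
import math

def wald_test(beta, se):
    """Wald chi-square (1 df) and its two-sided p-value."""
    chi_square = (beta / se) ** 2
    # For 1 df, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value

# Estimate and standard error for treat from the unadjusted PHREG output.
chi_square, p_value = wald_test(0.57276, 0.50960)

print(f"chi-square = {chi_square:.4f}, p = {p_value:.4f}")
print("significant at alpha = 0.05:", p_value < 0.05)
```

These values agree with the Chi-Square (1.2633) and Pr > ChiSq (0.2610) columns of that output.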
Hypothesis test for all coefficients

Null and alternative hypotheses

Ho: all $\beta_i = 0$, Ha: at least one $\beta_i \ne 0$

Several test statistics, each supplied by SAS
 Likelihood ratio, score, Wald

p-values are supplied by SAS

If p<0.05, “there is a statistically significant association
between the predictors and outcome at alpha = 0.05”
Example
Myelomatosis: Tumors throughout the body composed of cells derived from hemopoietic (blood) tissues of the bone marrow.
N=25
dur => time in days from the point of randomization to either death or censoring (which could occur either by loss to follow-up or termination of the observation).
status => has a value of 1 if dead; it has a value of 0 if censored.
treat => specifies a value of 1 or 2 to correspond to two treatments.
renal => has a value of 0 if renal functioning was normal at the time of randomization; it has a value of 1 for impaired functioning.
The MYEL data set is taken from: Survival Analysis Using SAS, A Practical Guide by Paul D. Allison - page 269
SAS- PHREG
PROC PHREG DATA = myel;
  /* dur is the survival time; status(0) marks status = 0 as censored */
  MODEL dur*status(0) = treat;
RUN;

Same as LIFETEST
Fit proportional hazards model with time to death as outcome

“status(0)”: observations with status variable = 0 are censored
status = 1 means an event occurred
Look at effect of Treatment 2 vs. Treatment 1 on mortality.
PROC PHREG Output
Analysis of Maximum Likelihood Estimates

Variable  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
treat      1             0.57276         0.50960      1.2633      0.2610         1.773
77% increased risk of death for treatment 2 vs. treatment 1, but it is not significant. Why?
Complications

competing risks (high death rate) - RENAL FUNCTION
Non-proportional hazards - time-dependent covariates (will show you later)
Extreme censoring in one group
SAS- PHREG
PROC PHREG DATA = myel;
  /* treatment effect adjusted for baseline renal functioning */
  MODEL dur*status(0) = renal treat;
RUN;

Same as LIFETEST
Look at effect of Treatment 2 vs. Treatment 1 on mortality
adjusted for renal functioning at baseline.
Output with adjusted treatment effect
Analysis of Maximum Likelihood Estimates

Variable  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
renal      1             4.10540         1.16451     12.4286      0.0004        60.667
treat      1             1.24308         0.59932      4.3021      0.0381         3.466
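With two covariates in the model, each hazard ratio is read with the other covariate held fixed, and the ratios multiply when both covariates differ. The sketch below uses the adjusted estimates from this output and the variable coding given for the MYEL data (treat = 1 or 2, renal = 0 or 1); the function `relative_hazard` and the patient profiles are illustrative assumptions.

```python
import math

# Adjusted parameter estimates from the PHREG output above.
estimates = {"renal": 4.10540, "treat": 1.24308}

def relative_hazard(profile_a, profile_b):
    """Hazard for profile_a relative to profile_b; the baseline hazard cancels."""
    log_ratio = sum(beta * (profile_a[name] - profile_b[name])
                    for name, beta in estimates.items())
    return math.exp(log_ratio)

reference = {"renal": 0, "treat": 1}  # normal renal function, treatment 1

hr_treat = relative_hazard({"renal": 0, "treat": 2}, reference)
hr_renal = relative_hazard({"renal": 1, "treat": 1}, reference)
hr_both = relative_hazard({"renal": 1, "treat": 2}, reference)

print(f"treatment 2 vs. 1, same renal status: {hr_treat:.3f}")
print(f"impaired vs. normal renal, same treatment: {hr_renal:.3f}")
print(f"both differ: {hr_both:.1f}")
```

The first two values reproduce the Hazard Ratio column, and the third is their product. With N = 25 these estimates are imprecise, so the combined figure is best read as an illustration of the arithmetic rather than a reliable effect size.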