Statistics for Health Research
Assessing Binary Outcomes: Logistic Regression
Peter T. Donnan, Professor of Epidemiology and Biostatistics

Objectives of Session
• Understand what is meant by a binary outcome
• Understand how analyses of binary outcomes are implemented in a logistic regression model
• Understand when a logistic model is appropriate
• Be able to implement the model in SPSS
• Interpret logistic model output

Binary Outcome
Extremely common in health research:
• Dead / Alive
• Hospitalisation (Yes / No)
• Diagnosis of diabetes (Yes / No)
• Met target, e.g. total cholesterol < 5.0 mmol/l (Yes / No)
N.b. Any coding can be used, such as 1 / 2, but 0 / 1 is mathematically easier.

How is the relationship formulated?
For a linear model the simplest equation is:
$y = a + bx + e_i$
where y is the outcome, a is the intercept, b is the slope for the explanatory variable x, and e_i is the error term or random 'noise'.

Can we fit y as a probability in the range 0 to 1?
$y = a + bx + e_i$
Not quite! y is continuous and can take any value from -∞ to +∞.
The outcome is the probability of an event, Π (or p), on the scale 0 to 1.
Certain transformations of p give the required scale.
The probit is a normal transformation of p, but its results are not easy to interpret.

The logit transformation works!
We can now fit p as a probability in the range 0 to 1, with y in the range -∞ to +∞:
$y = \operatorname{logit}(p) = a + bx + e_i$
$\log\left(\frac{p}{1-p}\right) = a + bx + e_i$

Logistic Regression Model
$\log\left(\frac{p}{1-p}\right) = a + bx + e_i$
This has very useful properties:
• The term p/(1-p) is called the 'odds' of an event. Note that this is not the same as the probability p of the event.
• If x is binary, coded 0/1, then exp(b) is the ODDS RATIO for the outcome in those coded 1 relative to those coded 0, e.g. odds of death in men (1) vs. women (0).

Logistic Regression Model
Consider the LDL data.
It has two binary outcomes:
1) LDL target achieved
2) Chol target achieved
For example, consider gender as a predictor: Male = 1 and Female = 2.
For a binary x we can express results as odds ratios (available in crosstabs).

LDL target achieved by gender
Gender    No     Yes    Odds of Yes
Male      140    563    563/140 = 4.02
Female    149    531    531/149 = 3.56
Odds ratio (Female cf. Male) = 3.56 / 4.02 = 0.886
N.b. Odds are different from probability: for men, p = 563/(140+563) = 0.80 or 80%.

Odds ratio from Crosstabs
Obtain odds ratios for 2 x 2 tables from crosstabs by selecting the option 'Risk'.

Results from Crosstabs
Odds ratios for achieving LDL target in females vs. males
N.b. OR given for female vs. male = 0.886

Fit Logistic Regression Model
• Dependent is the binary outcome: LDL target met (Yes = 1, No = 0)
• Independent: Gender (1 = M, 2 = F)
• Should get the same result as crosstabs
• Select Analyze / Regression / Binary Logistic
• Select the option of 95% CI for exp(B)

Regression / Binary logistic…..

Odds ratio from logistic model results for a binary predictor
EXP(B) = odds ratio, F vs. M
Note that the OR for men vs. women = 1/0.886 = 1.13

Fit Logistic Regression Model: continuous predictor
• Dependent is the binary outcome: LDL target met
• Independent: continuous predictor, Adherence
• B represents the change in the log odds for a 1-unit increase in adherence, so exp(B) is the ODDS RATIO for a 1-unit increase
• exp(B x 10) is the ODDS RATIO for a 10-unit increase in adherence

Odds ratio from logistic model results for a continuous predictor
EXP(B) = odds ratio for a 1% increase in adherence
OR for a 10% increase is exp(10 x 0.010) = 1.105, i.e. a 10.5% increase in the odds of meeting the LDL target for each 10% increase in adherence

Fit Logistic Regression Model: categorical predictor
• Dependent is the binary outcome: LDL target met
• Independent: APOE genotype (1 to 6)
• Choose a reference category; in this case the worst outcome is genotype 6, so choose 6 to give ORs > 1
• exp(B) represents the OR for each category relative to the reference category

Regression / Binary logistic…..
Choose Categorical

Odds ratios from logistic model results for a categorical predictor
EXP(B) = odds ratio for APOE (2) vs. APOE (6)
OR = 4.381 (95% CI 1.742, 11.021)

Other binary models
The logistic model is only applicable when the length of follow-up is the same for each individual, e.g. 5-yr follow-up of a cohort.
For binary outcomes where censoring occurs, i.e. people leave the cohort through death or migration, the length of follow-up varies and survival models such as the Cox Proportional Hazards model are needed.

Summary
• Logistic model easily fitted in SPSS
• Clear link with ODDS RATIOS
• Common model for case-control and cohort studies, as well as for development of clinical prediction models
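
Worked check of the arithmetic
The odds, odds ratio and logit calculations in the LDL example can be reproduced by hand outside SPSS. The following is a minimal Python sketch, not the SPSS procedure itself. It uses only the counts and the adherence coefficient (0.010) quoted in the slides. The function and variable names (logit, inv_logit, odds and so on) are illustrative choices and do not come from the original material.

```python
import math


def logit(p: float) -> float:
    """Map a probability in (0, 1) to the log-odds scale (-inf, +inf)."""
    return math.log(p / (1.0 - p))


def inv_logit(y: float) -> float:
    """Map a log-odds value back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))


def odds(events: int, non_events: int) -> float:
    """Odds of the event: number with the event / number without it."""
    return events / non_events


# Counts from the 'LDL target achieved by gender' table in the slides.
male_yes, male_no = 563, 140
female_yes, female_no = 531, 149

odds_male = odds(male_yes, male_no)        # 563/140
odds_female = odds(female_yes, female_no)  # 531/149

# Odds ratio for females relative to males, and its reciprocal.
or_female_vs_male = odds_female / odds_male
or_male_vs_female = 1.0 / or_female_vs_male

# Probability is not the same as odds: proportion of men meeting target.
p_male = male_yes / (male_yes + male_no)

# With gender as the only predictor, the fitted slope equals log(OR),
# so exponentiating it recovers the crosstabs odds ratio.
b_gender = math.log(or_female_vs_male)

# Continuous predictor: coefficient per 1% adherence quoted in the slides.
b_adherence = 0.010
or_per_10_percent = math.exp(10 * b_adherence)

print(f"Odds (male)         = {odds_male:.2f}")
print(f"Odds (female)       = {odds_female:.2f}")
print(f"OR female vs male   = {or_female_vs_male:.3f}")
print(f"OR male vs female   = {or_male_vs_female:.2f}")
print(f"P(target | male)    = {p_male:.2f}")
print(f"exp(b_gender)       = {math.exp(b_gender):.3f}")
print(f"OR per 10% adherence = {or_per_10_percent:.3f}")
```

The printed values should agree with the figures in the slides (4.02, 3.56, 0.886, 1.13, 0.80 and 1.105). Note that logit(p_male) equals the log of the male odds, which is the link between the probability scale and the scale on which the model is fitted.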