Logistic Regression
Statistics Sweden, September 2004
Dan Hedlin

Binary Y variable (0 or 1)
• Contract cancer or not, over or under a poverty line, response or nonresponse
• Y is not limited to a fixed range in ordinary regression
• Trick: $\log\frac{p}{1-p} = \alpha + \beta_1 x_1 + \beta_2 x_2$
• p is the probability of cancer, etc.

Alternative expressions
• Common notation: $\operatorname{logit}(p) = \alpha + \beta_1 x_1 + \beta_2 x_2$
• Equivalent: $p = \frac{e^{\alpha + \beta_1 x_1 + \beta_2 x_2}}{1 + e^{\alpha + \beta_1 x_1 + \beta_2 x_2}}$

Different scales
• Log-odds (additive effects)
• Odds p/(1-p) (multiplicative effects)
• Probability p

Another difference to ’ordinary’ regression:
• Iterative computation and numerical issues

Interpretation of parameters
• ’Base probability’ for $x_1 = 0$ and $x_2 = 0$: $p = \frac{e^{\alpha}}{1 + e^{\alpha}}$
• Maybe most interpretable when the x are interval-scaled variables and the zero point is meaningful

Interpretation of β
• One auxiliary variable: $\log\frac{p}{1-p} = \alpha + \beta x$
• So $\frac{p}{1-p} = e^{\alpha + \beta x}$
• Hence an additive one-step increment of x gives a multiplicative effect on the odds, with factor $e^{\beta}$

Classical example
• Bliss (1935), also in Agresti (1990) ’Categorical Data Analysis’, Wiley, section 4.5.3.
• Beetles, two interval-scaled variables: y = dead/survived, x = log(dose carbon disulphide)
• There are other models for a binary y that in some cases may be better. Logistic regression is the most common.

Model fitting
1. Tabulate low-high risk against each variable separately
2. Are there cells with zero observations?
3. First selection with e.g. forward selection at the 0.25 significance level
4. Test each remaining variable separately
5. For continuous variables: examine linearity by dividing the continuous variable into groups and computing the log-odds within each group
6. Test interaction effects
7. Consider subject matter knowledge
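
Illustration (not part of the original slides): a minimal Python sketch of two points above. It first moves between the three scales (log-odds, odds, probability) and checks that a one-step increment of x multiplies the odds by e^β. It then computes empirical log-odds within groups of a continuous variable, as in step 5 of the model-fitting list. The intercept name `alpha`, the coefficient values, the helper names and the toy data are all assumptions made for this example; none come from Bliss (1935) or from the slides.

```python
import bisect
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Probability corresponding to the log-odds eta."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients, chosen only for illustration.
alpha, beta = -2.0, 0.5


def odds(x):
    """Odds p/(1-p) under the one-variable model logit(p) = alpha + beta*x."""
    p = inv_logit(alpha + beta * x)
    return p / (1.0 - p)


# 'Base probability': the probability when x = 0.
base_probability = inv_logit(alpha)

# Increasing x by one step multiplies the odds by exp(beta).
odds_ratio = odds(3.0) / odds(2.0)


def grouped_log_odds(x, y, cut_points):
    """Empirical log-odds of y = 1 within groups of x.

    cut_points must be sorted; they split x into len(cut_points) + 1 groups.
    Returns one (n, log_odds) pair per group. log_odds is None when the
    group is empty or contains only 0s or only 1s, because the empirical
    log-odds is undefined there.
    """
    n = [0] * (len(cut_points) + 1)
    ones = [0] * (len(cut_points) + 1)
    for xi, yi in zip(x, y):
        g = bisect.bisect_right(cut_points, xi)  # index of the group for xi
        n[g] += 1
        ones[g] += yi
    result = []
    for ng, sg in zip(n, ones):
        if 0 < sg < ng:
            result.append((ng, math.log(sg / (ng - sg))))
        else:
            result.append((ng, None))
    return result


# Toy data, invented for the example.
x_toy = [1, 2, 3, 4, 5, 6, 7, 8]
y_toy = [0, 0, 0, 1, 0, 1, 1, 1]
groups = grouped_log_odds(x_toy, y_toy, cut_points=[4.5])

print("base probability:", base_probability)
print("odds ratio for a one-step increment:", odds_ratio)
print("exp(beta):", math.exp(beta))
print("log-odds by group:", groups)
```

If the within-group log-odds rise roughly linearly across ordered groups, treating the variable as linear on the logit scale looks reasonable; a `None` entry marks a group where the empirical log-odds cannot be computed, which connects to the zero-cell check in step 2.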