Understanding and Interpreting Results from Logistic, Multinomial, and Ordered Logistic Regression Models: Using Post-Estimation Commands in Stata

Raymond Sin-Kwok Wong
University of California-Santa Barbara

Model Estimation and Interpretation
• For OLS models, both model estimation and interpretation are relatively easy, since the effects are linear.
• For non-linear models, model estimation is simple, but interpreting the results can be tricky, especially for beginners who are not familiar with the non-linear relationship between dependent and independent variables.

What is my talk about?
– Not the rationale for statistical modeling, nor the mathematical and statistical derivation of specific non-linear models
– But a set of post-estimation tools that aid understanding, interpretation, and the presentation of complex relationships among variables using graphical displays

Alternative Output Methods
• (a) Display odds ratios rather than logit coefficients
    logit y x1 x2 x3 x4 x5
    logit, or
• (b) Use listcoef
    listcoef [varlist] [, pvalue(#) [factor|percent|std] constant help]
  factor:  factor changes in the odds or expected counts
  percent: % change in the odds or expected counts
  std:     standardized coefficients

    ======================================================================
                                                      Option
                                              std       factor    percent
    ----------------------------------------------------------------------
    Type 1: regress, probit, cloglog,         Default   No        No
            oprobit, tobit, cnreg, intreg
    Type 2: logit, logistic, ologit           Yes       Default   Yes
    Type 3: clogit, mlogit, poisson,          No        Default   Yes
            nbreg, zip, zinb
    ======================================================================

• Different standardized coefficients
  – x-standardized (bStdX)
    • For a standard deviation increase in xk, y is expected to change by βk^Sx units (βk^Sx = σx·βk), holding all other variables constant
  – y-standardized (bStdY)
    • For a unit increase in xk, y is expected to change
by βk^Sy standard deviations (βk^Sy = βk/σy), holding all other variables constant
  – Fully standardized (bStdXY)
    • For a standard deviation increase in xk, y is expected to change by βk^S standard deviations (βk^S = σx·βk/σy), holding all other variables constant

Post-Estimation Tests
    regress y x1 x2 … xk
    estimates store mod1
What is the use? The stored estimates can be recalled for post-estimation analysis.
Two kinds of tests are common: (a) the Wald test and (b) the LR test.

Wald Test
    test varlist [, accumulate]
For example,
(a) regress y x1 x2 x3 x4 x5
    test x1 x2 x3 x4 x5
    This tests H0: β1 = β2 = β3 = β4 = β5 = 0
(b) test x1 = 2*x2
    test x3 = x4, accumulate
    This tests H0: β1 = 2β2 and β3 = β4

LR (Likelihood-Ratio) Tests
    lrtest [, saving(name) using(name) model(name) df(#) ]
For example,
(a) logit chd age age2 sex      estimate saturated model
(b) lrtest, saving(0)           save results
(c) logit chd age sex           estimate simpler model
(d) lrtest                      obtain test
(e) lrtest, saving(1)           save results as 1
(f) logit chd sex               estimate simplest model
(g) lrtest                      compare to saturated model
(h) lrtest, using(1)            compare to model 1
(i) lrtest, model(1)            repeat earlier test

    . logit died studytime age drug

    Logit estimates                                   Number of obs   =         48
                                                      LR chi2(3)      =      13.67
                                                      Prob > chi2     =     0.0034
    Log likelihood = -24.364293                       Pseudo R2       =     0.2191

    ------------------------------------------------------------------------------
            died |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
       studytime |  -.0236468   .0457671    -0.52   0.605    -.1133487    .0660551
             age |   .0793438   .0699391     1.13   0.257    -.0577343    .2164219
            drug |  -1.150009   .5549529    -2.07   0.038    -2.237697   -.0623212
           _cons |  -1.113136   3.945369    -0.28   0.778    -8.845918    6.619645
    ------------------------------------------------------------------------------

    . lrtest, saving(0)

    . logit died studytime age

    Iteration 0:   log likelihood = -31.199418
    Iteration 1:   log likelihood =  -26.82757
    Iteration 2:   log likelihood = -26.734502
    Iteration 3:   log likelihood = -26.734061
    Iteration 4:   log likelihood = -26.734061

    Logit estimates                                   Number of obs   =         48
                                                      LR chi2(2)      =       8.93
                                                      Prob > chi2     =     0.0115
    Log likelihood = -26.734061                       Pseudo R2       =     0.1431

    ------------------------------------------------------------------------------
            died |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
       studytime |  -.0843475   .0353784    -2.38   0.017     -.153688    -.015007
             age |   .0518897   .0646409     0.80   0.422    -.0748042    .1785836
           _cons |    -.87332   3.729449    -0.23   0.815    -8.182906    6.436266
    ------------------------------------------------------------------------------

    . lrtest
    Logit:  likelihood-ratio test                     chi2(1)         =       4.74
                                                      Prob > chi2     =     0.0295

    . lrtest, saving(1)

    . logit died age

    Iteration 0:   log likelihood = -31.199418
    Iteration 1:   log likelihood = -29.955649
    Iteration 2:   log likelihood = -29.945382
    Iteration 3:   log likelihood = -29.945379

    Logit estimates                                   Number of obs   =         48
                                                      LR chi2(1)      =       2.51
                                                      Prob > chi2     =     0.1133
    Log likelihood = -29.945379                       Pseudo R2       =     0.0402

    ------------------------------------------------------------------------------
            died |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0893535   .0585925     1.52   0.127    -.0254857    .2041928
           _cons |  -4.353928   3.238757    -1.34   0.179    -10.70177    1.993919
    ------------------------------------------------------------------------------

    . lrtest
    Logit:  likelihood-ratio test                     chi2(2)         =      11.16
                                                      Prob > chi2     =     0.0038

    . lrtest, using(1)
    Logit:  likelihood-ratio test                     chi2(1)         =       6.42
                                                      Prob > chi2     =     0.0113

Fit Statistics
• fitstat calculates a large number of fit statistics for many kinds of regression models.
It works after the following: clogit, cnreg, cloglog, intreg, logistic, logit, mlogit, nbreg, ocratio, ologit, oprobit, poisson, probit, regress, zinb, and zip. With the saving() and using() options, it can also be used to compare fit measures for two different models.
    fitstat [, saving(name) using(name) bic force save dif]
Examples:
(a) To display fit statistics
    logit y x1 x2 … x10
    fitstat
(b) To compute, save, and compare with other models
    logit y x1 x2 x3 x4 x5 age
    quietly fitstat, saving(mod1)
    generate age2 = age*age
    logit y x1 x2 x3 x4 x5 age age2
    fitstat, using(mod1)

Post-Estimation Approach to Interpret NonLinear Regression Models
• For non-linear regression models, individual coefficients do not have a simple linear interpretation. For example, the beta coefficient in a logistic regression model can be interpreted directly only as a logit coefficient, that is, as a change in the log-odds. If we want to interpret the model in terms of predicted probabilities, the effect of a change in one variable depends on the values of all variables in the model. To put it differently, it depends on where we evaluate the effect.
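A short worked derivation for the binary logit case may make this concrete. The symbols below (p for the predicted probability, x for the row vector of covariates, β for the coefficient vector) are notation assumed here for illustration; they do not appear in the slides.

```latex
% Binary logit: predicted probability as a function of the covariates
p(\mathbf{x}) = \Pr(y = 1 \mid \mathbf{x})
             = \frac{\exp(\mathbf{x}\boldsymbol{\beta})}{1 + \exp(\mathbf{x}\boldsymbol{\beta})}

% Marginal effect of x_k on the probability (chain rule)
\frac{\partial p(\mathbf{x})}{\partial x_k}
  = \beta_k \, p(\mathbf{x}) \, \bigl[ 1 - p(\mathbf{x}) \bigr]

% The same coefficient gives different effects at different evaluation points:
%   p(x) = 0.5  =>  marginal effect = 0.25 * beta_k
%   p(x) = 0.1  =>  marginal effect = 0.09 * beta_k
```

Because p(x) depends on every covariate, the marginal effect of xk changes whenever any other variable changes, which is why the post-estimation commands ask for the values at which the other variables are held.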
• (1) Use the predict command
    regress y x1 x2 x3
    predict newvar          generates predicted y
    logit y x1 x2 x3
    predict newvar          generates predicted P(Y=1)
    ologit y x1 x2 x3
    predict newvar          generates predicted P(Y=k)
    mlogit y x1 x2 x3
    predict newvar          generates predicted P(Y=k)
    poisson y x1 x2 x3
    predict newvar          generates predicted count

• (2) Use the prchange command to compute discrete and marginal changes in the predicted outcomes
    prchange [varlist] [if exp] [in range] [, x(variables_and_values) rest(stat) outcome(#) fromto brief nobase nolabel help all uncentered delta(#) ]
Examples:
(a) prchange age, x(x1=20 x2=10) rest(mean) help
(b) prchange, help
(c) prchange x1 x2, fromto
This will calculate the change in the predicted outcome as x goes from its minimum to its maximum, from 0 to 1, from 1/2 unit below to 1/2 unit above its base value, and from 1/2 standard deviation below to 1/2 standard deviation above, plus the marginal effect.

• (3) Use the prvalue command to compute predicted values at specified values of the independent variables, and hence the change in probability for a discrete change of any magnitude in an independent variable.
    prvalue [if exp] [in range] [, x(variables_and_values) rest(stat) level(#) save dif brief all maxcnt(#) nobase nolabel ystar]
Examples:
(a) prvalue, rest(median)
(b) prvalue, x(age=30) save brief
    prvalue, x(age=40) dif brief
(c) prchange age, x(age=30) uncentered delta(10) rest(mean) brief
Examples (b) and (c) each give the change in probability, P(Y=1), as age goes from 30 to 40. Example (c) uses prchange, since uncentered and delta() are prchange options.

• (4) Use the prgen command to compute predicted values as one variable changes over a range of values, which is useful for constructing plots. The syntax is:
    prgen varname, generate(newvar) [from(#) to(#) ncases(#) x(variables_and_values) rest(stat) maxcnt(#) brief nobase all]
Examples:
To compute predicted values from an ordered probit where warm has four categories SD, D, A and SA:
    . oprobit warm yr89 male white age ed prst
    . prgen age, f(20) t(80) gen(mn)
    . prgen age, x(male=0) rest(grmean) f(20) t(80) gen(fem)
    . prgen age, x(male=1) rest(grmean) f(20) t(80) gen(mal)
To plot the predicted probabilities for average males:
    . graph malp1 malp2 malp3 malp4 malX

Models and Predictions (* is the prefix given in generate())
    all models:        *X: value of X
    logit & probit:    predicted probability of each outcome: *p0, *p1
    ologit, oprobit:   predicted probabilities: *p#1, *p#2, ... where #1, #2, ... are values of the outcome variable
                       cumulative probabilities: *s#1, *s#2, ... where #1, #2, ... are values of the outcome variable; *s#k is the probability of all categories up to or equal to #k
    mlogit:            predicted probabilities: *p#1, *p#2, ... where #1, #2, ... are values of the outcome variable

• (5) Use the prtab command to construct a table of predicted values for all combinations of up to three variables. The syntax is:
    prtab rowvar [colvar [supercolvar]] [if exp] [in range] [, by(superrowvar) x(variables_and_values) rest(stat) outcome(string) brief nobase nolabel novarlbl all]
Examples:
(a) probit faculty female fellow phd mcit3 mnas
    prtab female fellow mnas
(b) ologit jobclass female fellow pub1 phd
    prtab female fellow, x(phd=min)
(c) logit died female race age educ
    prtab female race educ

• (6) Use the mfx compute command to calculate numerically the marginal effects or the elasticities, and their standard errors, after estimation. Exactly what mfx can calculate is determined by the previous estimation command and the predict() option. The points at which the marginal effects or elasticities are evaluated are determined by the at() option.
By default, mfx calculates the marginal effects or elasticities at the means of the independent variables, using the default prediction option associated with the preceding estimation command.
Examples:
(a) logit foreign mpg price
    mfx compute
    mfx, at(mpg = 20, price = 6000)
    mfx compute, predict(xb)
    mfx replay, level(90)
(b) mlogit rep78 mpg displ, nolog
    mfx compute, predict(outcome(1))
(c) regress mpg length weight
    mfx compute, eyex
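As a rough guide to what these options report, the quantities can be written out as derivatives. Here y stands for the prediction selected by predict() and x for a single covariate; both symbols are assumed for illustration. Only eyex appears in the examples above. The other three are the companion mfx options, with dydx as the default.

```latex
% All four are evaluated at the point chosen by at() (the means by default)
\begin{aligned}
\texttt{dydx}: &\quad \frac{\partial y}{\partial x}
  && \text{marginal effect (default)} \\
\texttt{eyex}: &\quad \frac{\partial \ln y}{\partial \ln x}
  = \frac{\partial y}{\partial x} \cdot \frac{x}{y}
  && \text{elasticity} \\
\texttt{dyex}: &\quad \frac{\partial y}{\partial \ln x}
  = \frac{\partial y}{\partial x} \cdot x
  && \text{semi-elasticity} \\
\texttt{eydx}: &\quad \frac{\partial \ln y}{\partial x}
  = \frac{\partial y}{\partial x} \cdot \frac{1}{y}
  && \text{semi-elasticity}
\end{aligned}
```

Read this way, example (c) reports the percentage change in predicted mpg for a one percent change in length or weight, evaluated at the sample means.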