Determining Probability Estimates from Logistic Regression Coefficient Estimates
Vartanian: SW541

Method 1: Evaluating the coefficients at their mean values.

The outcome is the likelihood of being at or below 150% of the poverty line 10 years after initially entering the sample.

HiAFDC = using AFDC for 3 or 4 out of 4 years in late adolescence or early adulthood.
LowAFDC = using AFDC for 1 or 2 out of 4 years in late adolescence or early adulthood.
HiPov = being at or below 150% of the poverty line for 3 or 4 years in late adolescence or early adulthood.
LowPov = being at or below 150% of the poverty line for 1 or 2 years in late adolescence or early adulthood.
Npsamp = never using AFDC or being below 150% of the poverty line during all four years in late adolescence or early adulthood.

The above variables are a set of dummy variables, with HiPov being the excluded category.

Kids1 = number of children.
White = dummy variable for race (white).

The LOGISTIC Procedure

Model Information

Data Set                     WORK.Z
Response Variable            IN150
Number of Response Levels    2
Number of Observations       1708
Weight Variable              WEIGHT
Sum of Weights               1708
Link Function                Logit
Optimization Technique       Fisher's scoring

Model Fit Statistics

Criterion    Intercept Only    Intercept and Covariates
AIC          1958.664          1517.652
SC           1964.107          1555.754
-2 Log L     1956.664          1503.652

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio    453.0119      6     <.0001
Score               471.6869      6     <.0001
Wald                344.0779      6     <.0001

Analysis of Maximum Likelihood Estimates

Parameter    DF    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept    1      1.0902     0.2098             26.9979      <.0001
KIDS1        1     -0.3963     0.1465              7.3228      0.0068
WHITE        1     -0.9588     0.1449             43.7570      <.0001
HIAFDC       1      0.8817     0.2541             12.0388      0.0005
LOWAFDC      1     -0.4945     0.2514              3.8694      0.0492
LOWPOV       1     -1.3263     0.2033             42.5562      <.0001
NPSAMP       1     -2.1128     0.1976            114.2793      <.0001

Odds Ratio Estimates

Effect     Point Estimate    95% Wald Confidence Limits
KIDS1      0.673             0.505    0.896
WHITE      0.383             0.289    0.509
HIAFDC     2.415             1.468    3.974
LOWAFDC    0.610             0.373    0.998
LOWPOV     0.265             0.178    0.395
NPSAMP     0.121             0.082    0.178

Determining Mean Values for the Independent Variables

The MEANS Procedure

Variable    N       Mean         Std Dev      Minimum    Maximum
----------------------------------------------------------------
KIDS1       1708    0.7684085    0.4219729    0          1.0000000
WHITE       1708    0.7776183    0.4159680    0          1.0000000
HIAFDC      1708    0.0914987    0.2884014    0          1.0000000
LOWAFDC     1708    0.0683175    0.2523639    0          1.0000000
hipov       1708    0.0985880    0.2981954    0          1.0000000
LOWPOV      1708    0.2213286    0.4152628    0          1.0000000
NPSAMP      1708    0.5202671    0.4997354    0          1.0000000
----------------------------------------------------------------

To determine the overall probability of being at or below 150% of the poverty line, use the formula for the logistic function:

$$\text{prob} = \frac{e^{bX}}{1 + e^{bX}}$$

Here bX (written XB in the calculations that follow) is the intercept plus the sum of each coefficient estimate multiplied by its x value. Replace the x values with the means of the x variables. Remember that you must also include the intercept.

XB = 1.0902 + .7684085*(-.3963) + .7776183*(-.9588) + .0914987*(.8817) + .0683175*(-.4945) + .2213286*(-1.3263) + .5202671*(-2.1128) = -1.30578

Use this value of -1.30578 in the logistic formula to come up with an overall likelihood of being at or below 150% of the poverty line:

$$\text{prob} = \frac{e^{-1.30578}}{1 + e^{-1.30578}} = \frac{.270961}{1.270961} = .213194$$

In other words, evaluated at the means, there is a 21.3194% chance of being at or below 150% of the poverty line.

If you're interested in the likelihood of being at or below 150% of the poverty line for those who are in the high AFDC group, multiply the high AFDC coefficient by 1 (instead of by the mean value for this group) and multiply the coefficients for the other dummy variables in this set (low AFDC, low poverty, and non-poor sample) by 0. White and number of kids are still evaluated at their mean values.
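As a cross-check on the arithmetic in Method 1, here is a minimal sketch in Python (used as a stand-in; it is not the SAS code from class). The coefficient and mean values are copied from the output above. The names `COEFS`, `MEANS`, `linear_predictor`, and `logistic` are illustrative choices made for this sketch, not part of any SAS output.

```python
import math

# Coefficient estimates copied from the Analysis of Maximum Likelihood Estimates.
COEFS = {
    "Intercept": 1.0902,
    "KIDS1": -0.3963,
    "WHITE": -0.9588,
    "HIAFDC": 0.8817,
    "LOWAFDC": -0.4945,
    "LOWPOV": -1.3263,
    "NPSAMP": -2.1128,
}

# Sample means copied from the MEANS Procedure output.
# hipov is the excluded category, so it has no coefficient and is left out.
MEANS = {
    "KIDS1": 0.7684085,
    "WHITE": 0.7776183,
    "HIAFDC": 0.0914987,
    "LOWAFDC": 0.0683175,
    "LOWPOV": 0.2213286,
    "NPSAMP": 0.5202671,
}


def linear_predictor(values):
    """Return XB: the intercept plus each coefficient times its x value."""
    xb = COEFS["Intercept"]
    for name, x in values.items():
        xb += COEFS[name] * x
    return xb


def logistic(xb):
    """Logistic function: e^XB / (1 + e^XB)."""
    return math.exp(xb) / (1.0 + math.exp(xb))


# Overall probability, with every variable evaluated at its mean.
xb_mean = linear_predictor(MEANS)
p_mean = logistic(xb_mean)

# High AFDC group: set that dummy to 1 and the other dummies in the set to 0,
# keeping KIDS1 and WHITE at their means.
hi_afdc = dict(MEANS, HIAFDC=1, LOWAFDC=0, LOWPOV=0, NPSAMP=0)
xb_hi = linear_predictor(hi_afdc)
p_hi = logistic(xb_hi)

print(f"At means:      XB = {xb_mean:.5f}, prob = {p_mean:.6f}")
print(f"High AFDC:     XB = {xb_hi:.6f}, prob = {p_hi:.4f}")
```

Because the script does not round e^XB before dividing, its high AFDC probability can differ from a hand calculation in the later decimal places.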
XB = 1.0902 + .7684085*(-.3963) + .7776183*(-.9588) + 1*(.8817) + 0*(-.4945) + 0*(-1.3263) + 0*(-2.1128) = .921799

Plug this XB into the logistic formula:

$$\text{prob} = \frac{e^{.921799}}{1 + e^{.921799}} = \frac{2.514}{3.514} = .715424$$

This gives you the probability of being at or below 150% of the poverty line for those who are in the high AFDC group, controlling for the effects of race and number of children.

If you wanted to determine the likelihood of being in or near poverty for those with 5 children, you would use the first XB calculation in Method 1 (the one evaluated at the means), except that you would multiply the coefficient estimate for kids by 5 instead of by its mean value.

Method 2: Evaluating the coefficient estimates at the actual values for each sample member for the independent variables.

This is what I was showing you with the SAS code in class. Let's say we only had 3 individuals in the sample (for simplicity) but somehow came up with the coefficients given above. Let's say that these 3 individuals had the following independent variable values:

Observation    Kids1    White    HiAFDC    LowAFDC    Lowpov    Npsamp
1              1        1        0         1          0         0
2              2        0        1         0          0         0
3              3        1        0         0          0         0

We would then determine XB values for each of the observations and use these XB values to determine probability estimates for each of the individuals. We would then take the mean of these probability estimates to come up with an overall probability estimate.

For observation 1:

XB = 1.0902 + 1*(-.3963) + 1*(-.9588) + 0*(.8817) + 1*(-.4945) + 0*(-1.3263) + 0*(-2.1128) = -.7594

$$\text{prob} = \frac{e^{-.7594}}{1 + e^{-.7594}} = \frac{.4679}{1.4679} = .318755$$

For observation 2:

XB = 1.0902 + 2*(-.3963) + 0*(-.9588) + 1*(.8817) + 0*(-.4945) + 0*(-1.3263) + 0*(-2.1128) = 1.1793

$$\text{prob} = \frac{e^{1.1793}}{1 + e^{1.1793}} = \frac{3.25}{4.25} = .7647$$

For observation 3:

XB = 1.0902 + 3*(-.3963) + 1*(-.9588) + 0*(.8817) + 0*(-.4945) + 0*(-1.3263) + 0*(-2.1128) = -1.058

$$\text{prob} = \frac{e^{-1.058}}{1 + e^{-1.058}} = \frac{.347}{1.347} = .2577$$

To determine the overall mean: (.318755 + .7647 + .2577)/3 = .447052, or a 44.7052% likelihood of being in or near poverty.
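
Method 2 can be sketched the same way. The following Python example (a stand-in, not the SAS code from class) computes a probability for each of the three hypothetical observations and then averages them. The coefficients are copied from the output above. The names `COEFS`, `OBSERVATIONS`, and `predicted_probability` are illustrative assumptions made for this sketch.

```python
import math

# Coefficient estimates copied from the Analysis of Maximum Likelihood Estimates.
COEFS = {
    "Intercept": 1.0902,
    "KIDS1": -0.3963,
    "WHITE": -0.9588,
    "HIAFDC": 0.8817,
    "LOWAFDC": -0.4945,
    "LOWPOV": -1.3263,
    "NPSAMP": -2.1128,
}

# The three hypothetical sample members from the Method 2 table.
OBSERVATIONS = [
    {"KIDS1": 1, "WHITE": 1, "HIAFDC": 0, "LOWAFDC": 1, "LOWPOV": 0, "NPSAMP": 0},
    {"KIDS1": 2, "WHITE": 0, "HIAFDC": 1, "LOWAFDC": 0, "LOWPOV": 0, "NPSAMP": 0},
    {"KIDS1": 3, "WHITE": 1, "HIAFDC": 0, "LOWAFDC": 0, "LOWPOV": 0, "NPSAMP": 0},
]


def predicted_probability(obs):
    """Compute XB for one observation, then apply e^XB / (1 + e^XB)."""
    xb = COEFS["Intercept"] + sum(COEFS[name] * x for name, x in obs.items())
    return math.exp(xb) / (1.0 + math.exp(xb))


# One probability per observation, then the mean of those probabilities.
probs = [predicted_probability(obs) for obs in OBSERVATIONS]
mean_prob = sum(probs) / len(probs)

for i, p in enumerate(probs, start=1):
    print(f"Observation {i}: prob = {p:.4f}")
print(f"Mean probability: {mean_prob:.4f}")
```

The script carries full precision through each step, so its mean (about .4471) differs slightly from the hand-calculated .447052, which uses rounded intermediate values.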