Newsom
Data Analysis II
Fall 2015
Multiple Logistic Regression
Hypothetical Widget Example
SPSS
logistic regression vars=success with yrsexp prevbiz
/print=summary ci(.95) goodfit iter(1)
/classplot.
Block 0: Beginning Block
Iteration History(a,b,c)

                     -2 Log
Iteration         likelihood   Constant
Step 0   1            69.235       .080
         2            69.235       .080

a. Constant is included in the model.
b. Initial -2 Log Likelihood: 69.235
c. Estimation terminated at iteration number 2 because
   parameter estimates changed by less than .001.
Classification Table(a,b)

                                   Predicted
                                success       Percentage
Observed                       0        1        Correct
Step 0   success    0          0       24             .0
                    1          0       26          100.0
         Overall Percentage                         52.0

a. Constant is included in the model.
b. The cut value is .500
Variables in the Equation

                        B    S.E.    Wald   df   Sig.   Exp(B)
Step 0   Constant    .080    .283    .080    1   .777    1.083
Variables not in the Equation

                                Score   df   Sig.
Step 0   Variables   yrsexp    13.127    1   .000
                     prevbiz    8.013    1   .005
         Overall Statistics    13.398    2   .001

Block 1: Method = Enter

Iteration History(a,b,c,d)

                     -2 Log               Coefficients
Iteration         likelihood   Constant   yrsexp   prevbiz
Step 1   1            54.878     -1.750     .195     -.570
         2            54.454     -2.160     .242     -.774
         3            54.450     -2.205     .247     -.796
         4            54.450     -2.206     .247     -.796

a. Method: Enter
b. Constant is included in the model.
c. Initial -2 Log Likelihood: 69.235
d. Estimation terminated at iteration number 4 because
   parameter estimates changed by less than .001.
Omnibus Tests of Model Coefficients

                   Chi-square   df   Sig.
Step 1   Step          14.785    2   .001
         Block         14.785    2   .001
         Model         14.785    2   .001
Model Summary

             -2 Log    Cox & Snell   Nagelkerke
Step      likelihood      R Square     R Square
1          54.450(a)          .256         .341

a. Estimation terminated at iteration number 4 because
   parameter estimates changed by less than .001.
Hosmer and Lemeshow Test

Step   Chi-square   df   Sig.
1           4.571    7   .712
Contingency Table for Hosmer and Lemeshow Test

                 success = 0            success = 1
            Observed  Expected     Observed  Expected    Total
Step 1   1         6     5.307            0      .693        6
         2         4     4.861            2     1.139        6
         3         3     3.880            3     2.120        6
         4         4     3.060            2     2.940        6
         5         3     2.076            2     2.924        5
         6         2     2.401            5     4.599        7
         7         1     1.168            4     3.832        5
         8         1      .620            3     3.380        4
         9         0      .627            5     4.373        5
Classification Table(a)

                                   Predicted
                                success       Percentage
Observed                       0        1        Correct
Step 1   success    0         15        9           62.5
                    1          6       20           76.9
         Overall Percentage                         70.0

a. The cut value is .500
Variables in the Equation

                                                                  95.0% C.I. for EXP(B)
                         B     S.E.    Wald   df   Sig.   Exp(B)      Lower      Upper
Step 1(a)  yrsexp     .247     .103   5.741    1   .017    1.280      1.046      1.568
           prevbiz   -.796    1.153    .478    1   .490     .451       .047      4.316
           Constant -2.206     .811   7.390    1   .007     .110

a. Variable(s) entered on step 1: yrsexp, prevbiz.
R

> #base R
> #obtain the likelihood ratio test of all predictors by testing two separate models
> #intercept-only model -- use 1 when there are no predictors
> model0 <- glm(success ~ 1, data=mydata, family="binomial")
> #full model
> model1 <- glm(success ~ yrsexp + prevbiz, data=mydata, family="binomial")
> summary(model1)

Call:
glm(formula = success ~ yrsexp + prevbiz, family = "binomial",
    data = mydata)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.9307  -0.9381   0.5176   0.8956   1.9377
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -2.2057     0.8114  -2.719  0.00656
yrsexp        0.2472     0.1032   2.396  0.01658
prevbiz      -0.7965     1.1525  -0.691  0.48951
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 69.235  on 49  degrees of freedom
Residual deviance: 54.450  on 47  degrees of freedom
Number of Fisher Scoring iterations: 4
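The Exp(B) values and confidence limits shown in the SPSS output can be obtained in base R by exponentiating the model coefficients and their Wald confidence limits. A minimal sketch, using simulated stand-in data because the widget data set itself is not reproduced here (only the variable names match the handout):

```r
# Simulated stand-in data -- NOT the original widget data
set.seed(1)
n <- 50
mydata <- data.frame(yrsexp  = round(runif(n, 0, 20)),
                     prevbiz = rbinom(n, 1, .3))
mydata$success <- rbinom(n, 1,
                         plogis(-2.2 + .25 * mydata$yrsexp - .8 * mydata$prevbiz))

fit <- glm(success ~ yrsexp + prevbiz, data = mydata, family = "binomial")

exp(coef(fit))             # odds ratios, the analogue of SPSS Exp(B)
exp(confint.default(fit))  # Wald 95% CIs, matching SPSS's C.I. for EXP(B)
```

Note that `confint.default()` gives Wald intervals like SPSS; plain `confint()` on a glm gives profile-likelihood intervals, which will differ slightly.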
>
> #requests likelihood ratio, G-squared, comparing the deviances from the two models
> anova(model0,model1,test="Chisq")
Analysis of Deviance Table
Model 1: success ~ 1
Model 2: success ~ yrsexp + prevbiz
  Resid. Df Resid. Dev Df Deviance  Pr(>Chi)
1        49     69.235
2        47     54.450  2   14.785  0.000616
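The likelihood ratio (G-squared) statistic above is simply the difference between the two deviances, and its p-value can be checked directly against the chi-square distribution. A quick hand computation with the deviances reported above:

```r
# Hand computation of the likelihood ratio (G-squared) test from the
# deviances reported above
G2 <- 69.235 - 54.450   # null deviance minus residual deviance
df <- 49 - 47           # difference in residual degrees of freedom
p  <- pchisq(G2, df = df, lower.tail = FALSE)
G2            # 14.785
round(p, 6)   # 0.000616, matching Pr(>Chi) above
```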
>
> #obtain pseudo-R-sq values with modEvA package
> #install.packages("modEvA", repos="http://R-Forge.R-project.org")
> library(modEvA)
> RsqGLM(model=model1) #model1 is the name of the full model fit above
$CoxSnell
[1] 0.2559842
$Nagelkerke
[1] 0.3414946
$McFadden
[1] 0.2135439
$Tjur
[1] 0.2648686
$sqPearson
[1] 0.2603336
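The Cox & Snell, Nagelkerke, and McFadden values returned by RsqGLM can also be computed directly from the two deviances and the sample size, which makes clear what each index measures. A hand check using the numbers reported above (n = 50; small discrepancies in the last digits reflect rounding of the deviances):

```r
# Pseudo-R-squared indices computed from the reported deviances
D0 <- 69.235   # null (intercept-only) deviance
D1 <- 54.450   # residual deviance of the full model
n  <- 50

cox_snell  <- 1 - exp((D1 - D0) / n)          # about .256
nagelkerke <- cox_snell / (1 - exp(-D0 / n))  # about .341
mcfadden   <- 1 - D1 / D0                     # about .214
```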
Write-up
To identify factors that predict success in the widget business, a multiple logistic regression analysis was
conducted, simultaneously entering years of prior experience in the widget field and prior ownership of a
business into the model. The results indicated that, together, years of experience and prior ownership
accounted for a significant amount of variance in success, likelihood ratio χ2(2) = 14.785, p < .001. The
Nagelkerke R2 indicated approximately 34% of the variance in success of the new business was accounted for
by the predictors. Prior ownership was not significantly independently related to success (b = -.80, OR = .45,
ns). Years of experience was significantly related to the probability of success (b = .25, OR = 1.28, p < .05)
after controlling for prior business ownership. For every additional year of experience in the widget business,
there was approximately a 28% increase in the odds of success in the new business.
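Because the odds ratio describes a multiplicative change in the odds rather than in the probability itself, it is often helpful to convert the fitted equation to predicted probabilities for particular cases. A sketch using the coefficient estimates reported above (the covariate values chosen here are purely illustrative):

```r
# Predicted probability of success from the fitted logistic equation,
# using the coefficient estimates reported above
b0 <- -2.2057; b_yrs <- 0.2472; b_prev <- -0.7965

# e.g., 10 years of experience, no prior business ownership (illustrative case)
plogis(b0 + b_yrs * 10 + b_prev * 0)   # about .57

# the same case with prior business ownership
plogis(b0 + b_yrs * 10 + b_prev * 1)   # about .37
```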