1/93: Topic 2.1 – Binary Choice Models
Microeconometric Modeling
William Greene
Stern School of Business
New York University
New York NY USA
2.1 Binary Choice Models
2/93: Topic 2.1 – Binary Choice Models
Concepts
• Random Utility
• Maximum Likelihood
• Parametric Model
• Partial Effect
• Average Partial Effect
• Odds Ratio
• Linear Probability Model
• Cluster Correction
• Pseudo R squared
• Likelihood Ratio, Wald, LM
• Decomposition of Effect
• Exclusion Restrictions
• Incoherent Model

Models
• Nonparametric Regression
• Klein and Spady Model
• Probit
• Logit
• Bivariate Probit
• Recursive Bivariate Probit
• Multivariate Probit
• Sample Selection
• Panel Probit
3/93: Topic 2.1 – Binary Choice Models
Central Proposition: A Utility Based Approach
• Observed outcomes partially reveal underlying preferences.
• There exists an underlying preference scale defined over alternatives, U*(choices).
• Revelation of preferences between two choices labeled 0 and 1 reveals the ranking of the underlying utility:
  U*(choice 1) > U*(choice 0)  =>  Choose 1
  U*(choice 1) < U*(choice 0)  =>  Choose 0
• Net utility U = U*(choice 1) – U*(choice 0); U > 0 => choice 1.
4/93: Topic 2.1 – Binary Choice Models
Binary Outcome: Visit Doctor
In the 1984 wave of the GSOEP, 1,611 of 3,874 individuals visited the doctor at least once.
5/93: Topic 2.1 – Binary Choice Models
A Random Utility Model for the Binary Choice
• Yes or No decision | Visit or not visit the doctor
• Model: net utility of visiting at least once
• Net utility depends on observables and unobservables:
  Udoctor = Net utility = U*visit – U*not visit
  Random utility: Udoctor = α + β1 Age + β2 Income + β3 Sex + ε
• Choose to visit at least once if net utility is positive.
• Observed data: x = Age, Income, Sex
                 y = 1 if choose to visit (Udoctor > 0), 0 if not.
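A minimal simulation sketch of this random-utility mechanism (Python/NumPy, not part of the original slides); the coefficient values are made up purely for illustration, and a logistic draw stands in for the unobserved ε:

import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical coefficient values, chosen only to illustrate the mechanism
alpha, b_age, b_inc, b_sex = -1.0, 0.02, -0.5, 0.3

age = rng.uniform(25, 65, n)
income = rng.uniform(0.1, 1.5, n)
sex = rng.integers(0, 2, n)

eps = rng.logistic(size=n)                  # unobserved part of utility
u_doctor = alpha + b_age*age + b_inc*income + b_sex*sex + eps
doctor = (u_doctor > 0).astype(int)         # the analyst observes only the sign of net utility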
6/93: Topic 2.1 – Binary Choice Models
Modeling the Binary Choice Between the Two Alternatives
Net utility:       Udoctor = U*visit – U*not visit
                   Udoctor = α + β1 Age + β2 Income + β3 Sex + ε
Chooses to visit:  Udoctor > 0
                   α + β1 Age + β2 Income + β3 Sex + ε > 0
Choosing to visit is a random outcome because of ε:
                   ε > -(α + β1 Age + β2 Income + β3 Sex)
7/93: Topic 2.1 – Binary Choice Models
Probability Model for Choice Between Two Alternatives
People with the same (Age, Income, Sex) will make different choices because ε is random. We can model the probability that the random event "visits the doctor" will occur. The probability is governed by ε, the random part of the utility function.
Event DOCTOR = 1 occurs if ε > -(α + β1 Age + β2 Income + β3 Sex).
We model the probability of this event.
8/93: Topic 2.1 – Binary Choice Models
An Application
27,326 observations in the GSOEP sample
• 1 to 7 years, panel
• 7,293 households observed
• We use the 1994 wave: 3,337 household observations
9/93: Topic 2.1 – Binary Choice Models
An Econometric Model
• Choose to visit iff Udoctor > 0
  Udoctor = α + β1 Age + β2 Income + β3 Sex + ε
  Udoctor > 0  =>  ε > -(α + β1 Age + β2 Income + β3 Sex)
               =>  -ε < α + β1 Age + β2 Income + β3 Sex
• Probability model: for any person observed by the analyst (using the symmetry of the distribution of ε),
  Prob(doctor = 1) = Prob(ε < α + β1 Age + β2 Income + β3 Sex)
• Note the relationship between the unobserved ε and the observed outcome DOCTOR.
10/93: Topic 2.1 – Binary Choice Models
Index = α + β1 Age + β2 Income + β3 Sex
Probability = a function of the Index:  P(Doctor = 1) = F(Index)
Internally consistent probabilities:
(1) (Coherence)    0 < Probability < 1
(2) (Monotonicity) Probability increases with the Index.
11/93: Topic 2.1 – Binary Choice Models
A Fully Parametric Model
• Index function: U = β'x + ε
• Observation mechanism: y = 1[U > 0]
• Distribution: ε ~ f(ε); normal, logistic, …
• Maximum likelihood estimation:
  Maxβ logL = Σi log Prob(Yi = yi | xi)
• We will focus on parametric models.
• We examine the linear probability "model" in passing.
12/93: Topic 2.1 – Binary Choice Models
Parametric Model Estimation
• How do we estimate α, β1, β2, β3?
• The technique of maximum likelihood:
  L = Π(y=0) Prob[y = 0 | x] × Π(y=1) Prob[y = 1 | x]
  Prob[doctor = 1] = Prob[ε > -(α + β1 Age + β2 Income + β3 Sex)]
  Prob[doctor = 0] = 1 – Prob[doctor = 1]
• Requires a model for the probability.
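A minimal sketch of this estimation step in Python/SciPy (not from the slides), assuming a design matrix X with a constant column plus Age, Income, Sex, a 0/1 outcome vector y, and, for concreteness, a logistic distribution for ε:

import numpy as np
from scipy.optimize import minimize

def neg_loglik(beta, y, X):
    """Negative log likelihood: -sum_i log Prob(Y = y_i | x_i) for a logit model."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# With X (n x k) and y (n,) in hand, the MLE is found numerically:
# beta_hat = minimize(neg_loglik, x0=np.zeros(X.shape[1]), args=(y, X), method="BFGS").x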
13/93: Topic 2.1 – Binary Choice Models
Completing the Model: F(ε)
• The distribution:
  • Normal:   PROBIT, natural for behavior
  • Logistic: LOGIT, allows "thicker tails"
  • Gompertz: EXTREME VALUE, asymmetric
  • Others…
• Does it matter?
  • Yes, large differences in the estimates
  • Not much, quantities of interest are more stable.
14/93: Topic 2.1 – Binary Choice Models
Estimated Binary Choice Models for
Three Distributions
Log-L(0) = log likelihood for a model that has only a constant term.
Ignore the t ratios for now.
15/93: Topic 2.1 – Binary Choice Models
Partial Effects in Probability Models
• Prob[Outcome] = some F(α + β1 Income …)
• "Partial effect" = ∂F(α + β1 Income …)/∂x  (a derivative)
  • Partial effects are derivatives.
  • The result varies with the model:
    Logit:          ∂F(α + β1 Income …)/∂x = Prob × (1 – Prob) × β
    Probit:         ∂F(α + β1 Income …)/∂x = Normal density × β
    Extreme Value:  ∂F(α + β1 Income …)/∂x = Prob × (–log Prob) × β
• Scaling usually erases the model differences.
16/93: Topic 2.1 – Binary Choice Models
Partial effect for the logit model
Prob(doctor = 1) = exp(α + β1 Age + β2 Income + β3 Sex) / [1 + exp(α + β1 Age + β2 Income + β3 Sex)]
                 = Λ(α + β1 Age + β2 Income + β3 Sex)
                 = Λ(β'x)
The derivative with respect to one of the variables is
  ∂Λ(β'x)/∂xk = Λ(β'x)[1 – Λ(β'x)] βk
(1) A multiple of the coefficient, not the coefficient itself
(2) A function of all of the coefficients and variables
(3) Evaluated using the data and model parts after the model is estimated.
Similar computations apply for other models such as probit.
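A short sketch of the calculation above in Python/NumPy, purely illustrative (X and beta are assumed to hold the data and the estimated coefficients):

import numpy as np

def logit_prob(X, beta):
    """Lambda(beta'x) for every row of X."""
    return 1.0 / (1.0 + np.exp(-(X @ beta)))

def logit_partial_effects(X, beta):
    """d Prob(y=1)/d x_k = Lambda(1 - Lambda) * beta_k, one row per observation."""
    p = logit_prob(X, beta)
    return (p * (1.0 - p))[:, None] * beta[None, :]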
17/93: Topic 2.1 – Binary Choice Models
Estimated Partial Effects
for Three Models
(Standard errors to be considered later)
18/93: Topic 2.1 – Binary Choice Models
Partial Effect for a Dummy Variable Computed Using Means of Other Variables
• Prob[yi = 1 | xi, di] = F(β'xi + γ di), where d is a dummy variable such as Sex in our doctor model.
• For the probit model, Prob[yi = 1 | xi, di] = Φ(β'xi + γ di), Φ = the normal CDF.
• Partial effect of d, with the other variables held at their means:
  Prob[yi = 1 | x̄, di = 1] – Prob[yi = 1 | x̄, di = 0] = Φ(β̂'x̄ + γ̂) – Φ(β̂'x̄)
19/93: Topic 2.1 – Binary Choice Models
Partial Effect – Dummy Variable
20/93: Topic 2.1 – Binary Choice Models
Computing Partial Effects
• Compute at the data means (PEA)
  • Simple
  • Inference is well defined.
  • Not realistic for some variables, such as Sex
• Average the individual effects (APE)
  • More appropriate
  • Asymptotic standard errors are slightly more complicated.
21/93: Topic 2.1 – Binary Choice Models
Partial Effects
Probability:                  Pi = F(β'xi)
Partial effect:               ∂Pi/∂xi = f(β'xi) β = di
Partial effect at the means:  f(β'x̄) β = f(β'[(1/n) Σi xi]) β
Average partial effect:       (1/n) Σi di = (1/n) Σi f(β'xi) β
Both are estimates of δ = E[di] under certain assumptions.
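A brief Python/SciPy sketch of the two calculations, not from the slides; a probit F is assumed so that f is the standard normal density, and X, beta stand for the data and estimated coefficients:

import numpy as np
from scipy.stats import norm

def pea(X, beta):
    """Partial effects at the means: f(beta'xbar) * beta."""
    xbar = X.mean(axis=0)
    return norm.pdf(xbar @ beta) * beta

def ape(X, beta):
    """Average partial effects: (1/n) sum_i f(beta'x_i) * beta."""
    return norm.pdf(X @ beta).mean() * beta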
22/93: Topic 2.1 – Binary Choice Models
Average Partial Effects vs. Partial Effects at Data Means
The two approaches often give similar answers, though sometimes the results differ substantially.
23/93: Topic 2.1 – Binary Choice Models
APE vs. Partial Effects at the Mean
Delta method for the average partial effect:
  Estimator of Var[(1/N) Σi PartialEffecti] = G Var(β̂) G'
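A compact numerical sketch of this delta-method calculation (Python/NumPy, illustrative only); a probit f is assumed, the Jacobian G is formed by finite differences, and V_beta stands for the estimated Var(β̂):

import numpy as np
from scipy.stats import norm

def ape(beta, X):
    """Vector of average partial effects for a probit-style model."""
    return norm.pdf(X @ beta).mean() * beta

def ape_delta_var(beta_hat, V_beta, X, h=1e-6):
    """Delta-method covariance G V G', with G = d APE / d beta' taken numerically."""
    k = beta_hat.size
    G = np.empty((k, k))
    for j in range(k):
        e = np.zeros(k); e[j] = h
        G[:, j] = (ape(beta_hat + e, X) - ape(beta_hat - e, X)) / (2 * h)
    return G @ V_beta @ G.T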
24/93: Topic 2.1 – Binary Choice Models
25/93: Topic 2.1 – Binary Choice Models
26/93: Topic 2.1 – Binary Choice Models
27/93: Topic 2.1 – Binary Choice Models
How Well Does the Model Fit the Data?
• There is no R squared for a probability model.
  • Least squares for linear models is computed to maximize R2.
  • There are no residuals or sums of squares in a binary choice model.
  • The model is not computed to optimize the fit of the model to the data.
• How can we measure the "fit" of the model to the data?
  • "Fit measures" computed from the log likelihood:
    • Pseudo R squared = 1 – logL/logL0
    • Also called the "likelihood ratio index"
  • Direct assessment of the effectiveness of the model at predicting the outcome
28/93: Topic 2.1 – Binary Choice Models
Pseudo R2 = Likelihood Ratio Index
Pseudo R 2 = 1 -
log L for the model
log L for a model with only a constant term
 
The prediction of the model is Fˆ = F ˆ xi = Estimated Prob(yi  1| xi )
Using only the constant term, F()
LogL0 =
 (1  y ) log[1  F()]  y log F()
N
i 1
i
i
= N 0 log[1  F( )]  N1 log F() < 0
The log likelihood for the model is larger, but also < 0.
log L
LRI = 1 . Since logL > logL0 0  LRI < 1.
log L0
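A small Python helper illustrating the index (not from the slides); it uses the fact that in a constant-only binary choice model the fitted F(α) is the sample proportion of ones:

import numpy as np

def likelihood_ratio_index(loglik_model, y):
    """Pseudo R2 = 1 - logL/logL0, with logL0 from a constant-only model."""
    n1 = y.sum()
    n0 = y.size - n1
    p0 = n1 / y.size                          # fitted F(alpha) in the constant-only model
    loglik0 = n0 * np.log(1 - p0) + n1 * np.log(p0)
    return 1.0 - loglik_model / loglik0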
29/93: Topic 2.1 – Binary Choice Models
Fit Measures Based on Predictions
• Computation
  • Use the model to compute predicted probabilities:
    P̂ = F(a + b1 Age + b2 Income + b3 Female + …)
  • Use a rule to compute the predicted y = 0 or 1:
    Predict y = 1 if P̂ is "large" enough; generally 0.5 is used for "large" (more likely than not):
    ŷ = 1 if P̂ > P*
• The fit measure compares predictions to actuals: count successes and failures.
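A quick Python sketch of this counting exercise (illustrative; y and p_hat are assumed to hold the observed outcomes and fitted probabilities):

import numpy as np

def prediction_table(y, p_hat, cutoff=0.5):
    """2x2 table of actual outcomes against predictions y_hat = 1[p_hat > cutoff]."""
    y_hat = (p_hat > cutoff).astype(int)
    table = np.zeros((2, 2), dtype=int)
    for actual, predicted in zip(np.asarray(y).astype(int), y_hat):
        table[actual, predicted] += 1
    return table    # rows: actual 0/1, columns: predicted 0/1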
30/93: Topic 2.1 – Binary Choice Models
Cramer Fit Measure
F̂i = predicted probability
λ̂ = Σi yi F̂i / N1  –  Σi (1 – yi) F̂i / N0
  = Mean F̂ | when y = 1  –  Mean F̂ | when y = 0
  = reward for correct predictions minus penalty for incorrect predictions
+------------------------------------------+
| Fit Measures Based on Model Predictions  |
| Efron                 =   .04825         |
| Ben Akiva and Lerman  =   .57139         |
| Veall and Zimmerman   =   .08365         |
| Cramer                =   .04771         |
+------------------------------------------+
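The Cramer measure is simple to compute directly; a one-line Python sketch (illustrative only):

import numpy as np

def cramer_lambda(y, p_hat):
    """Mean predicted probability when y = 1 minus mean predicted probability when y = 0."""
    return p_hat[y == 1].mean() - p_hat[y == 0].mean()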
31/93: Topic 2.1 – Binary Choice Models
Hypothesis Tests
• We consider "nested" models and parametric tests.
• Test statistics are based on the usual 3 strategies:
  • Wald statistics: use the unrestricted model
  • Likelihood ratio statistics: based on comparing the two models
  • Lagrange multiplier statistics: based on the restricted model
• Test statistics require the log likelihood and/or the first and second derivatives of logL.
32/93: Topic 2.1 – Binary Choice Models
Computing test statistics requires the log likelihood and/or standard errors based on the Hessian of logL.
Notation: qi = 2yi – 1, zi = qi β'xi; Λi = exp(zi)/[1 + exp(zi)]; φi = φ(zi), Φi = Φ(zi).
Logit:   gi = yi – Λi,        Hi = Λi(1 – Λi),           E[Hi] = Λi(1 – Λi)
Probit:  gi = qi φi / Φi,     Hi = (φi/Φi)(zi + φi/Φi),  E[Hi] = δi = φi² / [Φi(1 – Φi)]
Note: gi is a "generalized residual."
Estimators are based on Hi, E[Hi] and gi², all functions evaluated at zi.
  Actual Hessian:    Est.Asy.Var[β̂] = [ Σi Hi xi xi' ]⁻¹
  Expected Hessian:  Est.Asy.Var[β̂] = [ Σi E[Hi] xi xi' ]⁻¹
  BHHH:              Est.Asy.Var[β̂] = [ Σi gi² xi xi' ]⁻¹
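A Python/NumPy sketch of the Hessian-based and BHHH estimators for the logit case (for the logit the actual and expected Hessians coincide); illustrative only, with X, y, beta_hat assumed:

import numpy as np

def logit_cov_estimators(X, y, beta_hat):
    """Hessian-based and BHHH estimators of Asy.Var[beta_hat] for a logit model."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta_hat)))
    g = y - p                                   # g_i, the generalized residual
    w = p * (1.0 - p)                           # H_i = E[H_i] = Lambda_i (1 - Lambda_i)
    hessian_based = np.linalg.inv(X.T @ (w[:, None] * X))
    bhhh = np.linalg.inv(X.T @ ((g ** 2)[:, None] * X))
    return hessian_based, bhhh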
33/93: Topic 2.1 – Binary Choice Models
Robust Covariance Matrix
(Robust to the model specification? Latent heterogeneity? Correlation across observations? Not always clear.)
"Robust" covariance matrix: V = A B A
A = negative inverse of the second derivatives matrix
  = { estimated E[ -∂²logL/∂β∂β' ] }⁻¹ = [ Σi=1..N -∂²log Probi/∂β̂∂β̂' ]⁻¹
B = matrix sum of outer products of the first derivatives
  = estimated E[ (∂logL/∂β)(∂logL/∂β') ] = Σi=1..N (∂log Probi/∂β̂)(∂log Probi/∂β̂')
For a logit model,
  A = [ Σi=1..N P̂i(1 – P̂i) xi xi' ]⁻¹
  B = Σi=1..N (yi – P̂i)² xi xi' = Σi=1..N ei² xi xi'
(Resembles the White estimator in the linear model case.)
34/93: Topic 2.1 – Binary Choice Models
Robust Covariance Matrix for the Logit Model
Doesn't change much. The model is well specified.
--------+---------------------------------------------------------------------
        |              Standard                  Prob.        95% Confidence
  DOCTOR| Coefficient     Error          z      |z|>Z*           Interval
--------+---------------------------------------------------------------------
Conventional Standard Errors
Constant|   1.86428***    .67793       2.75     .0060      .53557     3.19299
     AGE|   -.10209***    .03056      -3.34     .0008     -.16199     -.04219
 AGE^2.0|    .00154***    .00034       4.56     .0000      .00088      .00220
  INCOME|    .51206       .74600        .69     .4925     -.95008     1.97420
        |Interaction AGE*INCOME
_ntrct02|   -.01843       .01691      -1.09     .2756     -.05157      .01470
  FEMALE|    .65366***    .07588       8.61     .0000      .50494      .80237
--------+---------------------------------------------------------------------
Robust Standard Errors
Constant|   1.86428***    .68518       2.72     .0065      .52135     3.20721
     AGE|   -.10209***    .03118      -3.27     .0011     -.16321     -.04098
 AGE^2.0|    .00154***    .00035       4.44     .0000      .00086      .00222
  INCOME|    .51206       .75171        .68     .4958     -.96127     1.98539
        |Interaction AGE*INCOME
_ntrct02|   -.01843       .01705      -1.08     .2796     -.05185      .01498
  FEMALE|    .65366***    .07594       8.61     .0000      .50483      .80249
--------+---------------------------------------------------------------------
35/93: Topic 2.1 – Binary Choice Models
Base Model for Hypothesis Tests
----------------------------------------------------------------------
Binary Logit Model for Binary Choice
Dependent variable                   DOCTOR
Log likelihood function         -2085.92452
Restricted log likelihood       -2169.26982
Chi squared [  5 d.f.]            166.69058
Significance level                   .00000
McFadden Pseudo R-squared          .0384209
Estimation based on N =   3377, K =   6
Information Criteria: Normalization=1/N
                    Normalized    Unnormalized
AIC                    1.23892      4183.84905
--------+-------------------------------------------------------------
Variable| Coefficient   Standard Error  b/St.Er.  P[|Z|>z]   Mean of X
--------+-------------------------------------------------------------
        |Characteristics in numerator of Prob[Y = 1]
Constant|   1.86428***       .67793       2.750     .0060
     AGE|   -.10209***       .03056      -3.341     .0008     42.6266
   AGESQ|    .00154***       .00034       4.556     .0000     1951.22
  INCOME|    .51206          .74600        .686     .4925      .44476
 AGE_INC|   -.01843          .01691      -1.090     .2756     19.0288
  FEMALE|    .65366***       .07588       8.615     .0000      .46343
--------+-------------------------------------------------------------
Hypothesis to be tested: Age is not a significant determinant of
Prob(Doctor = 1), i.e., H0: β2 = β3 = β5 = 0.
36/93: Topic 2.1 – Binary Choice Models
Likelihood Ratio Test
Null hypothesis restricts the parameter vector
Alternative relaxes the restriction
Test statistic: Chi-squared =
2 (LogL|Unrestricted model – LogL|Restrictions) > 0
Degrees of freedom = number of restrictions
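A short Python sketch of this statistic (illustrative); chi2.sf gives the p-value, and the log likelihoods reported on the next slide can be plugged in directly:

from scipy.stats import chi2

def lr_test(loglik_unrestricted, loglik_restricted, df):
    """Likelihood ratio statistic 2(logL_U - logL_R) and its chi-squared p-value."""
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    return stat, chi2.sf(stat, df)

# Example with the values reported on the next slide:
# lr_test(-2085.92452, -2124.06568, df=3)   # statistic = 76.28232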
37/93: Topic 2.1 – Binary Choice Models
LR Test of H0: β2 = β3 = β5 = 0
UNRESTRICTED MODEL
Binary Logit Model for Binary Choice
Dependent variable                   DOCTOR
Log likelihood function         -2085.92452
Restricted log likelihood       -2169.26982
Chi squared [  5 d.f.]            166.69058
Significance level                   .00000
McFadden Pseudo R-squared          .0384209
Estimation based on N =   3377, K =   6
Information Criteria: Normalization=1/N
                    Normalized    Unnormalized
AIC                    1.23892      4183.84905

RESTRICTED MODEL
Binary Logit Model for Binary Choice
Dependent variable                   DOCTOR
Log likelihood function         -2124.06568
Restricted log likelihood       -2169.26982
Chi squared [  2 d.f.]             90.40827
Significance level                   .00000
McFadden Pseudo R-squared          .0208384
Estimation based on N =   3377, K =   3
Information Criteria: Normalization=1/N
                    Normalized    Unnormalized
AIC                    1.25974      4254.13136
Chi squared[3] = 2[-2085.92452 – (-2124.06568)] = 76.28232
38/93: Topic 2.1 – Binary Choice Models
Wald Test of H0: β2 = β3 = β5 = 0
Unrestricted parameter vector is estimated
Discrepancy: q= Rb – m is computed
(or r(b,m) if nonlinear)
Variance of discrepancy is estimated:
Var[q] = R V R’
Wald Statistic is q’[Var(q)]-1q = q’[RVR’]-1q
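A Python/NumPy sketch of the computation (illustrative); b and V are the unrestricted estimates and their covariance matrix, and R, m define the restrictions Rb = m:

import numpy as np
from scipy.stats import chi2

def wald_test(b, V, R, m):
    """Wald statistic q'[R V R']^{-1} q with q = R b - m, plus its p-value."""
    q = R @ b - m
    stat = q @ np.linalg.inv(R @ V @ R.T) @ q
    return stat, chi2.sf(stat, df=R.shape[0])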
39/93: Topic 2.1 – Binary Choice Models
Lagrange Multiplier Test of H0: β2 = β3 = β5 = 0
• The restricted model is estimated.
• Derivatives of the unrestricted model and the variances of those derivatives are computed at the restricted estimates.
• A Wald test of whether the derivatives are zero tests the restrictions.
• Usually hard to compute – it is difficult to program the derivatives and their variances.
40/93: Topic 2.1 – Binary Choice Models
LM Test for a Logit Model
• Compute b0 subject to the restrictions (e.g., with zeros in the appropriate positions).
• Compute Pi(b0) for each observation.
• Compute ei(b0) = [yi – Pi(b0)].
• Compute gi(b0) = xi ei(b0) using the full xi vector.
• LM = [Σi gi(b0)]' [Σi gi(b0) gi(b0)']⁻¹ [Σi gi(b0)]
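These steps translate almost line for line into Python/NumPy; a hedged sketch, with X, y and the restricted estimate b0 assumed:

import numpy as np

def lm_test_logit(X, y, b0):
    """LM statistic for a logit model evaluated at the restricted estimates b0."""
    p0 = 1.0 / (1.0 + np.exp(-(X @ b0)))    # P_i(b0)
    g = X * (y - p0)[:, None]               # g_i(b0) = x_i * e_i(b0), full x_i vector
    gbar = g.sum(axis=0)
    return gbar @ np.linalg.inv(g.T @ g) @ gbar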
41/93: Topic 2.1 – Binary Choice Models
42/93: Topic 2.1 – Binary Choice Models
Application: Health Care Usage
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Variables in the file are
Data downloaded from Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293
individuals. They can be used for regression, count models, binary choice, ordered choice, and bivariate binary
choice. This is a large data set. There are altogether 27,326 observations. The number of observations ranges
from 1 to 7. (Frequencies are: 1=1525, 2=1079, 3=825, 4=926, 5=1051, 6=1000, 7=887). Note, the variable
NUMOBS below tells how many observations there are for each person. This variable is repeated in each row of
the data for the person.
DOCTOR = 1(Number of doctor visits > 0)
HOSPITAL = 1(Number of hospital visits > 0)
HSAT = health satisfaction, coded 0 (low) - 10 (high)
DOCVIS = number of doctor visits in last three months
HOSPVIS = number of hospital visits in last calendar year
PUBLIC = insured in public health insurance = 1; otherwise = 0
ADDON = insured by add-on insurance = 1; otherwise = 0
HHNINC = household nominal monthly net income in German marks / 10000.
(4 observations with income=0 were dropped)
HHKIDS = children under age 16 in the household = 1; otherwise = 0
EDUC = years of schooling
AGE = age in years
MARRIED = marital status
43/93: Topic 2.1 – Binary Choice Models
The Bivariate Probit Model
y1* = β1'x1 + ε1,  y1 = 1(y1* > 0)
y2* = β2'x2 + ε2,  y2 = 1(y2* > 0)
(ε1, ε2) ~ N[ (0, 0), (1, ρ; ρ, 1) ]
The variables in x1 and x2 may be the same or different. There is no need for each equation to have its 'own variable.'
(The equations can be fit one at a time. Use FIML for (1) efficiency and (2) to get the estimate of ρ.)
44/93: Topic 2.1 – Binary Choice Models
ML Estimation of the Bivariate Probit Model
logL = Σi=1..n log Φ2[ (2yi1 – 1)β1'xi1, (2yi2 – 1)β2'xi2, (2yi1 – 1)(2yi2 – 1)ρ ]
     = Σi=1..n log Φ2[ qi1 β1'xi1, qi2 β2'xi2, qi1 qi2 ρ ]
Note: qi1 = (2yi1 – 1) = -1 if yi1 = 0 and +1 if yi1 = 1.
Φ2 = bivariate normal CDF – must be computed using quadrature.
Maximized with respect to β1, β2 and ρ.
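A sketch of this log likelihood in Python/SciPy (illustrative only); SciPy's numerical bivariate normal CDF stands in for the quadrature mentioned above, and maximization would be handed to a numerical optimizer:

import numpy as np
from scipy.stats import multivariate_normal

def bivariate_probit_loglik(params, y1, y2, X1, X2):
    """logL = sum_i log Phi2(q_i1 b1'x_i1, q_i2 b2'x_i2, q_i1 q_i2 rho)."""
    k1, k2 = X1.shape[1], X2.shape[1]
    b1, b2, rho = params[:k1], params[k1:k1 + k2], params[-1]
    q1, q2 = 2 * y1 - 1, 2 * y2 - 1
    z1, z2 = q1 * (X1 @ b1), q2 * (X2 @ b2)
    r = q1 * q2 * rho
    ll = 0.0
    for z1i, z2i, ri in zip(z1, z2, r):
        cov = [[1.0, ri], [ri, 1.0]]
        ll += np.log(multivariate_normal.cdf([z1i, z2i], mean=[0.0, 0.0], cov=cov))
    return ll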
45/93: Topic 2.1 – Binary Choice Models
Application to Health Care Data
x1=one,age,female,educ,married,working
x2=one,age,female,hhninc,hhkids
BivariateProbit ; lhs=doctor,hospital
; rh1=x1
; rh2=x2;marginal effects $
46/93: Topic 2.1 – Binary Choice Models
Parameter Estimates
----------------------------------------------------------------------
FIML Estimates of Bivariate Probit Model
Dependent variable            DOCTOR HOSPITAL
Log likelihood function         -25323.63074
Estimation based on N =  27326, K = 12
--------+-------------------------------------------------------------
Variable| Coefficient   Standard Error  b/St.Er.  P[|Z|>z]   Mean of X
--------+-------------------------------------------------------------
        |Index equation for DOCTOR
Constant|   -.20664***       .05832      -3.543     .0004
     AGE|    .01402***       .00074      18.948     .0000     43.5257
  FEMALE|    .32453***       .01733      18.722     .0000      .47877
    EDUC|   -.01438***       .00342      -4.209     .0000     11.3206
 MARRIED|    .00224          .01856        .121     .9040      .75862
 WORKING|   -.08356***       .01891      -4.419     .0000      .67705
        |Index equation for HOSPITAL
Constant|  -1.62738***       .05430     -29.972     .0000
     AGE|    .00509***       .00100       5.075     .0000     43.5257
  FEMALE|    .12143***       .02153       5.641     .0000      .47877
  HHNINC|   -.03147          .05452       -.577     .5638      .35208
  HHKIDS|   -.00505          .02387       -.212     .8323      .40273
        |Disturbance correlation
RHO(1,2)|    .29611***       .01393      21.253     .0000
--------+-------------------------------------------------------------
47/93: Topic 2.1 – Binary Choice Models
Marginal Effects
• What are the marginal effects?
  • Effect of what on what?
  • Two-equation model: what is the conditional mean?
• Possible margins:
  • Derivatives of the joint probability Φ2(β1'xi1, β2'xi2, ρ)
  • Partials of E[yij | xij] = Φ(βj'xij)  (univariate probability)
  • Partials of E[yi1 | xi1, xi2, yi2 = 1] = Prob(yi1 = 1, yi2 = 1)/Prob(yi2 = 1)
• Note that the marginal effects involve both sets of regressors. If there are common variables, there are two effects in the derivative that are added.
(See Appendix for formulations; a numerical sketch follows below.)
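A hedged Python/SciPy sketch of the last margin listed above, E[y1 | x1, x2, y2 = 1] = Φ2(β1'x1, β2'x2, ρ)/Φ(β2'x2), with the partial effect taken numerically; all inputs (x1, x2, b1, b2, rho) are assumed to come from the fitted model:

import numpy as np
from scipy.stats import norm, multivariate_normal

def cond_mean(x1, x2, b1, b2, rho):
    """E[y1 | x1, x2, y2 = 1] = Phi2(b1'x1, b2'x2, rho) / Phi(b2'x2)."""
    joint = multivariate_normal.cdf([x1 @ b1, x2 @ b2], mean=[0.0, 0.0],
                                    cov=[[1.0, rho], [rho, 1.0]])
    return joint / norm.cdf(x2 @ b2)

def direct_effect(x1, x2, b1, b2, rho, k, h=1e-6):
    """Numerical partial effect with respect to the k-th element of x1 (the direct part).
    A variable that also appears in x2 picks up an indirect part from that equation as well."""
    x1 = np.asarray(x1, dtype=float)
    x1p = x1.copy()
    x1p[k] += h
    return (cond_mean(x1p, x2, b1, b2, rho) - cond_mean(x1, x2, b1, b2, rho)) / h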
48/93: Topic 2.1 – Binary Choice Models
Marginal Effects: Decomposition
49/93: Topic 2.1 – Binary Choice Models
Direct Effects
Derivatives of E[y1|x1,x2,y2=1] with respect to x1
+-------------------------------------------+
| Partial derivatives of E[y1|y2=1] with    |
| respect to the vector of characteristics. |
| They are computed at the means of the Xs. |
| Effect shown is total of 4 parts above.   |
| Estimate of E[y1|y2=1] = .819898          |
| Observations used for means are All Obs.  |
| These are the direct marginal effects.    |
+-------------------------------------------+
+---------+--------------+----------------+--------+---------+------------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X  |
+---------+--------------+----------------+--------+---------+------------+
  AGE        .00382760      .00022088       17.329   .0000    43.5256898
  FEMALE     .08857260      .00519658       17.044   .0000     .47877479
  EDUC      -.00392413      .00093911       -4.179   .0000    11.3206310
  MARRIED    .00061108      .00506488         .121   .9040     .75861817
  WORKING   -.02280671      .00518908       -4.395   .0000     .67704750
  HHNINC     .000000       ......(Fixed Parameter).......      .35208362
  HHKIDS     .000000       ......(Fixed Parameter).......      .40273000
50/93: Topic 2.1 – Binary Choice Models
Indirect Effects
Derivatives of E[y1|x1,x2,y2=1] with respect to x2
+-------------------------------------------+
| Partial derivatives of E[y1|y2=1] with    |
| respect to the vector of characteristics. |
| They are computed at the means of the Xs. |
| Effect shown is total of 4 parts above.   |
| Estimate of E[y1|y2=1] = .819898          |
| Observations used for means are All Obs.  |
| These are the indirect marginal effects.  |
+-------------------------------------------+
+---------+--------------+----------------+--------+---------+------------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X  |
+---------+--------------+----------------+--------+---------+------------+
  AGE       -.00035034      .697563D-04     -5.022   .0000    43.5256898
  FEMALE    -.00835397      .00150062       -5.567   .0000     .47877479
  EDUC       .000000       ......(Fixed Parameter).......     11.3206310
  MARRIED    .000000       ......(Fixed Parameter).......      .75861817
  WORKING    .000000       ......(Fixed Parameter).......      .67704750
  HHNINC     .00216510      .00374879         .578   .5636     .35208362
  HHKIDS     .00034768      .00164160         .212   .8323     .40273000
51/93: Topic 2.1 – Binary Choice Models
Partial Effects: Total Effects
Sum of the Two Derivative Vectors
+-------------------------------------------+
| Partial derivatives of E[y1|y2=1] with    |
| respect to the vector of characteristics. |
| They are computed at the means of the Xs. |
| Effect shown is total of 4 parts above.   |
| Estimate of E[y1|y2=1] = .819898          |
| Observations used for means are All Obs.  |
| Total effects reported = direct+indirect. |
+-------------------------------------------+
+---------+--------------+----------------+--------+---------+------------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X  |
+---------+--------------+----------------+--------+---------+------------+
  AGE        .00347726      .00022941       15.157   .0000    43.5256898
  FEMALE     .08021863      .00535648       14.976   .0000     .47877479
  EDUC      -.00392413      .00093911       -4.179   .0000    11.3206310
  MARRIED    .00061108      .00506488         .121   .9040     .75861817
  WORKING   -.02280671      .00518908       -4.395   .0000     .67704750
  HHNINC     .00216510      .00374879         .578   .5636     .35208362
  HHKIDS     .00034768      .00164160         .212   .8323     .40273000
52/93: Topic 2.1 – Binary Choice Models
A Simultaneous Equations Model
y1* = β1'x1 + θ1 y2 + ε1,  y1 = 1(y1* > 0)
y2* = β2'x2 + θ2 y1 + ε2,  y2 = 1(y2* > 0)
(ε1, ε2) ~ N[ (0, 0), (1, ρ; ρ, 1) ]
This model is not identified. It is incoherent.
(Not estimable. The computer can compute 'estimates' but they have no meaning.)
53/93: Topic 2.1 – Binary Choice Models
Fully Simultaneous “Model”
----------------------------------------------------------------------
FIML Estimates of Bivariate Probit Model
Dependent variable                   DOCHOS
Log likelihood function        -20318.69455
--------+-------------------------------------------------------------
Variable| Coefficient   Standard Error  b/St.Er.  P[|Z|>z]   Mean of X
--------+-------------------------------------------------------------
        |Index equation for DOCTOR
Constant|   -.46741***       .06726      -6.949     .0000
     AGE|    .01124***       .00084      13.353     .0000     43.5257
  FEMALE|    .27070***       .01961      13.807     .0000      .47877
    EDUC|   -.00025          .00376       -.067     .9463     11.3206
 MARRIED|   -.00212          .02114       -.100     .9201      .75862
 WORKING|   -.00362          .02212       -.164     .8701      .67705
HOSPITAL|   2.04295***       .30031       6.803     .0000      .08765
        |Index equation for HOSPITAL
Constant|  -1.58437***       .08367     -18.936     .0000
     AGE|   -.01115***       .00165      -6.755     .0000     43.5257
  FEMALE|   -.26881***       .03966      -6.778     .0000      .47877
  HHNINC|    .00421          .08006        .053     .9581      .35208
  HHKIDS|   -.00050          .03559       -.014     .9888      .40273
  DOCTOR|   2.04479***       .09133      22.389     .0000      .62911
        |Disturbance correlation
RHO(1,2)|   -.99996***       .00048    ********     .0000
--------+-------------------------------------------------------------
54/93: Topic 2.1 – Binary Choice Models
A Recursive Simultaneous Equations Model
y1* = β1'x1           + ε1,  y1 = 1(y1* > 0)
y2* = β2'x2 + θ2 y1 + ε2,  y2 = 1(y2* > 0)
(ε1, ε2) ~ N[ (0, 0), (1, ρ; ρ, 1) ]
This model is identified. It can be consistently and efficiently estimated by full information maximum likelihood, treated as a bivariate probit model, ignoring the simultaneity.
Bivariate ; Lhs = y1,y2 ; Rh1 = …, y2 ; Rh2 = … $
55/93: Topic 2.1 – Binary Choice Models
56/93: Topic 2.1 – Binary Choice Models
57/93: Topic 2.1 – Binary Choice Models
Causal Inference?
There is no partial (marginal) effect for PIP.
PIP cannot change partially (marginally). It changes because something else changes (X or I or u2).
The calculation of MEPIP does not make sense.
58/93: Topic 2.1 – Binary Choice Models
59/93: Topic 2.1 – Binary Choice Models
60/93: Topic 2.1 – Binary Choice Models
61/93: Topic 2.1 – Binary Choice Models
62/93: Topic 2.1 – Binary Choice Models