Instrumental variables
Anant Nyshadham
Instrumental Variables
• What is a natural experiment?
– “situations where the forces of nature or
government policy have conspired to produce
an environment somewhat akin to a randomized
experiment”
• Angrist and Krueger (2001, p. 73)
• Natural experiments can provide a useful
source of exogenous variation in
problematic regressors
– But they require detailed institutional knowledge
Instrumental Variables and Natural Experiments
• Some natural experiments in economics
– Existing policy differences, or changes that
affect some jurisdictions (or groups) but not
others
• Minimum wage rate
• Excise taxes on consumer goods
• Unemployment insurance, workers’ compensation
– Unexpected “shocks” to the local economy
• Coal prices and the Middle East oil embargo (1973)
• Agricultural production and adverse weather events
Instrumental Variables and Natural Experiments
• Some potential pitfalls
– Not all policy differences/changes are exogenous
• Political factors and past realizations of the response
variable can affect existing policies or policy changes
– Generalizability of causal effect estimates
• Results may not generalize beyond the units under study
– Heterogeneity in causal effects
• Results may be sensitive to the natural experiment
chosen in a specific study (L.A.T.E.)
Instrumental Variables and Natural Experiments
• Some natural experiments used as IV
which are of interest to development
economists
– Acemoglu Johnson & Robinson (2001): settler
mortality
– Paxson (1992): rainfall
– Schultz & Tansel (1997): healthcare prices
True Model
• Suppose true model is:
– Y = a + bX + cV + e
• a, b, and c are parameters to be estimated; e is error
term
• Do not observe V
• Can only estimate:
– Y = a + bX + e
• What do we do to recover the true b instead of a biased estimate of b?
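
Why the short regression fails can be written out directly. The expression below is the standard omitted-variable result, stated under the added assumption that Cov(X, e) = 0, which the slide leaves implicit. The hat notation for the OLS estimate is introduced here for illustration.

```latex
\operatorname{plim}\,\hat{b}_{OLS}
  = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)}
  = b + c\,\frac{\operatorname{Cov}(X, V)}{\operatorname{Var}(X)}
```

The second term is the bias. It vanishes only if V has no effect on Y (c = 0) or V is uncorrelated with X.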
Methods
• Y = a + bX + η; η = cV + e
• Differencing/FE
• Find groups with common V (assumption), but variation in
X
• Subtract off V to remove it from error term
• Instrumental Variable
• Find instrument Z; X = j + kZ + i
• Predict portion of X which does not correlate with V
• Use this portion in original estimating equation
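
The differencing/FE idea can be sketched on simulated data. This is a minimal illustration, assuming a made-up panel in which V is constant within each group. The parameter values, group sizes and the helper names `slope` and `within_demean` are assumptions for the example, not part of the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

n_groups, per_group = 500, 4
a, b, c = 1.0, 2.0, 3.0                     # assumed "true" parameters

group = np.repeat(np.arange(n_groups), per_group)
V = rng.normal(size=n_groups)[group]        # unobserved, common within each group
X = 0.8 * V + rng.normal(size=group.size)   # X is correlated with V
e = rng.normal(size=group.size)
Y = a + b * X + c * V + e


def slope(x, y):
    """OLS slope of y on x (intercept handled by centering)."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))


def within_demean(values, group, n_groups):
    """Subtract each group's mean from its members, which removes V."""
    sums = np.bincount(group, weights=values, minlength=n_groups)
    counts = np.bincount(group, minlength=n_groups)
    return values - (sums / counts)[group]


b_naive = slope(X, Y)                       # ignores V, so it is biased
b_fe = slope(within_demean(X, group, n_groups),
             within_demean(Y, group, n_groups))

print(f"naive OLS slope:    {b_naive:.3f}")
print(f"within-group slope: {b_fe:.3f}")
```

The within-group slope should land near the assumed b = 2, while the naive slope is pulled upward because V loads on both X and Y. The approach only works if V really is common within groups, which is the assumption flagged on the slide.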
IV Criteria and Assumptions
• Step/Stage 1: X = j + kZ + i → X’ = j’ + k’Z
• Step/Stage 2: Y = a + bX’ + η; recover true b
• Criteria for Z
• Z must sufficiently predict X: k>>0 or k<<0
– Testable using estimate of k from first stage
• Z must only impact Y through X
– Cov(Z,η)=0; Cov(Z,V)=0 & Cov(Z,e)=0
– Z does not belong in original estimation equation
– Assumption, untestable
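
The two stages can be run by hand on simulated data. The sketch below assumes a data-generating process in which Z is independent of both V and e, so both criteria hold by construction. All parameter values and the helper `ols` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
a, b, c = 1.0, 0.5, 2.0          # assumed true parameters
j, k = 0.0, 1.0                  # assumed first-stage parameters

Z = rng.normal(size=n)           # instrument, independent of V and e
V = rng.normal(size=n)           # unobserved confounder
X = j + k * Z + 0.7 * V + rng.normal(size=n)
Y = a + b * X + c * V + rng.normal(size=n)


def ols(y, x):
    """Return (intercept, slope) from regressing y on a constant and x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Stage 1: regress X on Z and form the fitted values X'.
j_hat, k_hat = ols(X, Z)
X_fit = j_hat + k_hat * Z

# Stage 2: regress Y on X' instead of X.
_, b_2sls = ols(Y, X_fit)
_, b_ols = ols(Y, X)

print(f"first-stage k: {k_hat:.3f}")
print(f"OLS b:         {b_ols:.3f}")
print(f"2SLS b:        {b_2sls:.3f}")
```

The point estimate from the manual second stage is the 2SLS estimate, but the standard errors that a second-stage OLS would report are not valid, because they are computed from residuals based on X’ rather than X. Dedicated IV routines correct for this.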
An IV example: Angrist and Krueger (1991), Q.J.E.
• Returns to education (Y = wages)
– Problem of omitted “ability bias”
• Years of schooling vary by quarter of birth
– Compulsory schooling laws, age-at-entry rules
– Someone born in Q1 is a little older and will be
able to drop out sooner than someone born in
Q4
• Q.O.B. can be treated as a useful source
of exogeneity in schooling
Angrist and Krueger (1991), Q.J.E.
• People born in Q1 do
obtain less schooling
– But pay close attention to
the scale of the y-axis
– Mean difference between
Q1 and Q4 is only 0.124,
or 1.5 months
• So...need large N since the R² between
X and Z will be very small
– A&K had over 300k for
the 1930-39 cohort
Source: Angrist and Krueger (1991), Figure I
Angrist and Krueger (1991), Q.J.E.
• Final 2SLS model interacted QOB with
year of birth (30), state of birth (150)
– OLS: b = .0628 (s.e. = .0003)
– 2SLS: b = .0811 (s.e. = .0109)
• Least squares estimate does not appear
to be badly biased by omitted variables
– But...replication effort identified some
pitfalls in this analysis that are instructive
Bound, Jaeger, and Baker (1995), J.A.S.A.
• Potential problems with QOB as an IV
– Correlation between QOB and schooling is
weak
• Small Cov(X,Z) introduces finite-sample bias, which
will be exacerbated with the inclusion of many IV’s
– QOB may not be exogenous (correlated with
unobservable determinants of wages, e.g.
family income)
– QOB may not satisfy exclusion restriction (e.g.
age relative to peers changes social dynamics,
competition, leadership skill etc.)
Bound, Jaeger, and Baker (1995), J.A.S.A.
• Even if the instrument is “good,” matters
can be made far worse with IV as opposed
to LS
– Weak correlation between IV and endogenous
regressor can pose severe finite-sample bias
• And…really large samples won’t help, especially if
there is even weak endogeneity between IV and
error
• First-stage diagnostics provide a sense of
how good an IV is in a given setting
– F-test and partial R² on IV’s
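
The finite-sample problem can be illustrated with a small Monte Carlo. The sketch below is an assumption-laden toy, not a replication of Bound, Jaeger, and Baker: it uses 20 simulated instruments, a true b of 1, and an error that is correlated with X so that OLS converges to roughly 1.8. The function names and all numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)


def tsls_slope(y, x, Z):
    """2SLS slope of y on x, using the columns of Z plus a constant as instruments."""
    design = np.column_stack([np.ones(len(x)), Z])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    x_fit = design @ coef
    xc = x_fit - x_fit.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def median_estimate(pi, n=500, n_iv=20, reps=200, b=1.0):
    """Median 2SLS estimate across replications for first-stage coefficient pi."""
    estimates = []
    for _ in range(reps):
        Z = rng.normal(size=(n, n_iv))
        v = rng.normal(size=n)
        e = 0.8 * v + 0.6 * rng.normal(size=n)   # structural error, correlated with x
        x = Z @ np.full(n_iv, pi) + v
        y = b * x + e
        estimates.append(tsls_slope(y, x, Z))
    return float(np.median(estimates))


median_strong = median_estimate(pi=0.5)       # instruments that predict x well
median_irrelevant = median_estimate(pi=0.0)   # instruments unrelated to x

print(f"median 2SLS, strong IVs:     {median_strong:.3f}")
print(f"median 2SLS, irrelevant IVs: {median_irrelevant:.3f}")
```

With strong instruments the median estimate should sit near the assumed true value of 1. With many irrelevant instruments it drifts toward the biased OLS value, even though every instrument is exogenous by construction.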
Useful Diagnostic Tools for IV Models
• Tests of instrument relevance
– Weak IV’s → Large variance of b_IV as well as
potentially severe finite-sample bias
• Tests of instrument exogeneity
– Endogenous IV’s → Inconsistency of b_IV that
makes it no better (and probably worse) than
b_LS
• Durbin-Wu-Hausman test
– Endogeneity of the problem regressor(s)
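
One common regression-based version of the Durbin-Wu-Hausman idea adds the first-stage residuals to the structural equation and tests their coefficient. The sketch below applies it to simulated data in which X is endogenous by construction. The data-generating process and the helper `ols_with_se` are assumptions for illustration, and the standard errors are the plain homoskedastic ones.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000
Z = rng.normal(size=n)
V = rng.normal(size=n)                        # unobserved confounder
X = Z + V + rng.normal(size=n)                # X is endogenous because of V
Y = 1.0 + 0.5 * X + V + rng.normal(size=n)


def ols_with_se(y, columns):
    """OLS of y on a constant plus the given columns; returns (coef, std_err)."""
    design = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma2 = resid @ resid / (len(y) - design.shape[1])
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef, np.sqrt(np.diag(cov))


# Step 1: first-stage residuals from regressing X on Z.
coef1, _ = ols_with_se(X, [Z])
v_hat = X - (coef1[0] + coef1[1] * Z)

# Step 2: add the residuals to the structural equation and test their coefficient.
coef2, se2 = ols_with_se(Y, [X, v_hat])
t_resid = coef2[2] / se2[2]

print(f"coefficient on X:         {coef2[1]:.3f}")
print(f"t-statistic on residuals: {t_resid:.2f}")
```

A large t-statistic on the residual term rejects exogeneity of X. In this augmented regression the coefficient on X coincides with the 2SLS estimate.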
Tests of Instrument Relevance
• Diagnostics based on the F-test for the joint
significance of the IV’s
– Nelson and Startz (1990); Staiger and Stock
(1997)
– Bound, Jaeger, and Baker (1995)
• Partial R-square for the IV’s
– Shea (1997)
• There is a growing econometric literature
on the “weak instrument” problem
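
Both relevance diagnostics come from comparing the first stage with and without the instruments. The sketch below is a minimal version on simulated data, assuming one exogenous control W and two excluded instruments. The variable names and coefficients are hypothetical. It computes the simple partial R², not Shea's (1997) variant for multiple endogenous regressors.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2_000
W = rng.normal(size=n)                    # exogenous control (assumed)
Z = rng.normal(size=(n, 2))               # two excluded instruments (assumed)
X = 0.5 * W + 0.3 * Z[:, 0] + 0.2 * Z[:, 1] + rng.normal(size=n)


def ssr(y, design):
    """Sum of squared residuals from regressing y on the design matrix."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


const = np.ones(n)
restricted = np.column_stack([const, W])      # first stage without the IVs
full = np.column_stack([const, W, Z])         # first stage with the IVs

ssr_r, ssr_f = ssr(X, restricted), ssr(X, full)
q = Z.shape[1]                                # number of excluded instruments
df_resid = n - full.shape[1]

first_stage_F = ((ssr_r - ssr_f) / q) / (ssr_f / df_resid)
partial_r2 = (ssr_r - ssr_f) / ssr_r

print(f"F on excluded instruments: {first_stage_F:.1f}")
print(f"partial R-squared:         {partial_r2:.3f}")
```

A frequently quoted rule of thumb from the weak-instrument literature treats a first-stage F below about 10 as a warning sign, though this is a heuristic rather than a formal test.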
Tests of Instrument Exogeneity
• Model must be overidentified, i.e., more
IV’s than endogenous X’s
– H0: All IV’s uncorrelated with structural error
• Overidentification test:
1. Estimate structural model
2. Regress IV residuals on all exogenous
variables
3. Compute NR2 and compare to chi-square
• df = # IV’s – # endogenous X’s
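
The three steps can be sketched for the simplest overidentified case, one endogenous X and two instruments, so that df = 1. The simulated data, the function name `sargan` and the deliberately invalid second scenario are assumptions for illustration. The p-value uses the closed-form chi-square(1) tail, erfc(sqrt(s/2)).

```python
import math
import numpy as np

rng = np.random.default_rng(5)
n = 5_000


def sargan(y, x, Z):
    """Return (N*R^2, p-value) for one endogenous x and two instruments in Z."""
    const = np.ones(len(y))
    inst = np.column_stack([const, Z])
    # 1. Estimate the structural model by 2SLS.
    g, *_ = np.linalg.lstsq(inst, x, rcond=None)
    x_fit = inst @ g
    coef, *_ = np.linalg.lstsq(np.column_stack([const, x_fit]), y, rcond=None)
    resid = y - coef[0] - coef[1] * x          # residuals use the actual x
    # 2. Regress the IV residuals on all exogenous variables.
    d, *_ = np.linalg.lstsq(inst, resid, rcond=None)
    fitted = inst @ d
    r2 = 1.0 - ((resid - fitted) ** 2).sum() / ((resid - resid.mean()) ** 2).sum()
    # 3. Compute N*R^2 and compare to chi-square with 2 - 1 = 1 df.
    stat = len(y) * r2
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


Z = rng.normal(size=(n, 2))
V = rng.normal(size=n)
x = Z[:, 0] + Z[:, 1] + V + rng.normal(size=n)
y_valid = 1.0 + 0.5 * x + V + rng.normal(size=n)
y_invalid = y_valid + 0.5 * Z[:, 1]            # second IV now affects y directly

stat_valid, p_valid = sargan(y_valid, x, Z)
stat_invalid, p_invalid = sargan(y_invalid, x, Z)

print(f"valid IVs:   N*R2 = {stat_valid:.3f}, p = {p_valid:.4f}")
print(f"invalid IVs: N*R2 = {stat_invalid:.3f}, p = {p_invalid:.4g}")
```

When one instrument enters the outcome equation directly, the statistic becomes very large. The test can only detect disagreement among instruments, so it has no power if all of them are invalid in the same direction.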
Application: Adolescent Work and Delinquent Behavior
• Prior research shows a positive correlation
between teenage work and delinquency
– Reasons to suspect serious endogeneity bias
• 2nd wave of the NLSY97 (N = 8,368)
– Y = 1 if committed delinquent act (31.9%)
– X = 1 if worked in a formal job (52.6%)
– Z1 = 1 if child labor law allows 40+ hours
(14.2%)
– Z2 = 1 if no child labor restriction in place
(39.6%)
Regression Model Ignoring Endogeneity
. reg pcrime work if nomiss==1 & wave==2

      Source |       SS       df       MS              Number of obs =    8368
-------------+------------------------------           F(  1,  8366) =    6.33
       Model |  1.37395379     1  1.37395379           Prob > F      =  0.0119
    Residual |  1815.97786  8366  .217066443           R-squared     =  0.0008
-------------+------------------------------           Adj R-squared =  0.0006
       Total |  1817.35182  8367  .217204711           Root MSE      =   .4659

------------------------------------------------------------------------------
      pcrime |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        work |   .0256633   .0102005     2.52   0.012     .0056677    .0456588
       _cons |   .3053242   .0074009    41.26   0.000     .2908167    .3198318
------------------------------------------------------------------------------
• Teenage workers significantly more
delinquent
– Modest effect but consistent with prior research
First-Stage Model
. reg work law40 nolaw if nomiss==1 & wave==2

      Source |       SS       df       MS              Number of obs =    8368
-------------+------------------------------           F(  2,  8365) =  626.64
       Model |  271.829722     2  135.914861           Prob > F      =  0.0000
    Residual |  1814.33364  8365  .216895832           R-squared     =  0.1303
-------------+------------------------------           Adj R-squared =  0.1301
       Total |  2086.16336  8367  .249332301           Root MSE      =  .46572

------------------------------------------------------------------------------
        work |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       law40 |   .0688902   .0154383     4.46   0.000     .0386274     .099153
       nolaw |   .3818684   .0110273    34.63   0.000     .3602521    .4034847
       _cons |   .3655636   .0074883    48.82   0.000     .3508847    .3802425
------------------------------------------------------------------------------
• State child labor laws affect probability of
work
– This is a really strong first stage (F, R²)
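
The F and R² cited here follow directly from the sums of squares in the first-stage output. A quick arithmetic check, using only the figures printed in the table (the variable names are illustrative):

```python
# Sums of squares and degrees of freedom copied from the first-stage output.
model_ss, model_df = 271.829722, 2
resid_ss, resid_df = 1814.33364, 8365

# F is the ratio of the model mean square to the residual mean square.
f_stat = (model_ss / model_df) / (resid_ss / resid_df)

# R-squared is the model share of the total sum of squares.
r_squared = model_ss / (model_ss + resid_ss)

print(f"F({model_df}, {resid_df}) = {f_stat:.2f}")
print(f"R-squared = {r_squared:.4f}")
```

Because the first stage here contains only the two instruments, the overall F is also the F-test on the excluded instruments. With additional exogenous covariates the two would differ.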
Two-Stage Least Squares Model
. ivreg pcrime (work = law40 nolaw) if nomiss==1 & wave==2

Instrumental variables (2SLS) regression

      Source |       SS       df       MS              Number of obs =    8368
-------------+------------------------------           F(  1,  8366) =    6.86
       Model | -19.5287923     1 -19.5287923           Prob > F      =  0.0088
    Residual |  1836.88061  8366  .219564978           R-squared     =       .
-------------+------------------------------           Adj R-squared =       .
       Total |  1817.35182  8367  .217204711           Root MSE      =  .46858

------------------------------------------------------------------------------
      pcrime |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        work |  -.0744352   .0284206    -2.62   0.009    -.1301466   -.0187238
       _cons |   .3580171   .0158135    22.64   0.000     .3270187    .3890155
------------------------------------------------------------------------------
Instrumented:  work
Instruments:   law40 nolaw
------------------------------------------------------------------------------
What Do the Models Suggest Thus Far?
• Completely different conclusions!
– OLS = Teenage work is criminogenic (b = +.026)
• Delinquency risk increases by 8.5 percent (base = .305)
– 2SLS = Teenage work is prophylactic (b = –.074)
• Delinquency risk decreases by 20.7 percent (base = .358)
• Which model should we believe?
– We still have some additional diagnostic work to do
to evaluate the 2SLS model
• Overidentification test
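
The percentage figures on this slide are the coefficient divided by the constant, that is, the change in delinquency risk relative to the predicted risk for non-workers. A short check using the rounded values quoted on the slide (variable names are illustrative):

```python
# Rounded coefficients and baselines as quoted on the slide.
b_ols, base_ols = 0.026, 0.305
b_2sls, base_2sls = -0.074, 0.358

# Relative change in delinquency risk, in percent of the baseline.
pct_ols = 100 * b_ols / base_ols
pct_2sls = 100 * b_2sls / base_2sls

print(f"OLS:  {pct_ols:+.1f}% relative to non-workers")
print(f"2SLS: {pct_2sls:+.1f}% relative to non-workers")
```

Using the unrounded estimates from the regression output shifts each figure by about a tenth of a percentage point.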
Overidentification Test from the Software
. overid
Tests of overidentifying restrictions:
Sargan N*R-sq test    0.509  Chi-sq(1)    P-value = 0.4757
Basmann test          0.508  Chi-sq(1)    P-value = 0.4758
• IV’s jointly pass the exogeneity
requirement
– Notice that -overid- provides a global test,
whereas the regression-based approach
allows you to test the IV’s jointly as well as
individually
So Where Do We Stand with
the Work-Delinquency Question?
• Are child labor laws correlated with work?
– YES = first-stage F is large
• Are child labor laws good IV’s?
– YES = overidentification test is not rejected
• Is teenage work endogenous?
– YES = Hausman test is rejected
• Prior research findings that teenage
work is criminogenic are selection
artifacts
Now…What Happens if I Throw
in a Potentially Bogus Instrument?
• Now there are three instrumental variables
– Z1 = 1 if child labor law allows 40+ hours (14.2%)
– Z2 = 1 if no child labor restriction in place (39.6%)
– Z3 = 1 if high unemployment rate in county (20.1%)
• A little more difficult to tell a convincing story
that the unemployment rate is only related to
delinquency through work experience
– But let’s see what happens
First-Stage Model
. reg work law40 nolaw highun if nomiss==1 & wave==2

      Source |       SS       df       MS              Number of obs =    8368
-------------+------------------------------           F(  3,  8364) =  427.28
       Model |  277.229696     3  92.4098987           Prob > F      =  0.0000
    Residual |  1808.93366  8364  .216276144           R-squared     =  0.1329
-------------+------------------------------           Adj R-squared =  0.1326
       Total |  2086.16336  8367  .249332301           Root MSE      =  .46505

------------------------------------------------------------------------------
        work |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       law40 |   .0636421   .0154519     4.12   0.000     .0333525    .0939317
       nolaw |   .3775975   .0110447    34.19   0.000     .3559472    .3992479
      highun |  -.0636009   .0127283    -5.00   0.000    -.0885517   -.0386502
       _cons |   .3808061   .0080759    47.15   0.000     .3649754    .3966368
------------------------------------------------------------------------------
• So far so good and consistent with
expectation
Two-Stage Least Squares Model
. ivreg pcrime (work = law40 nolaw highun) if nomiss==1 & wave==2

Instrumental variables (2SLS) regression

      Source |       SS       df       MS              Number of obs =    8368
-------------+------------------------------           F(  1,  8366) =    5.47
       Model | -16.0635514     1 -16.0635514           Prob > F      =  0.0194
    Residual |  1833.41537  8366  .219150773           R-squared     =       .
-------------+------------------------------           Adj R-squared =       .
       Total |  1817.35182  8367  .217204711           Root MSE      =  .46814

------------------------------------------------------------------------------
      pcrime |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        work |  -.0657624   .0281159    -2.34   0.019    -.1208765   -.0106483
       _cons |   .3534516   .0156602    22.57   0.000     .3227537    .3841496
------------------------------------------------------------------------------
Instrumented:  work
Instruments:   law40 nolaw highun
------------------------------------------------------------------------------
Post-Hoc Diagnostics
. overid
Tests of overidentifying restrictions:
Sargan N*R-sq test    5.301  Chi-sq(2)    P-value = 0.0706
Basmann test          5.301  Chi-sq(2)    P-value = 0.0706

. ivendog
Tests of endogeneity of: work
H0: Regressor is exogenous
    Wu-Hausman F test:                12.32811  F(1,8365)    P-value = 0.00045
    Durbin-Wu-Hausman chi-sq test:    12.31438  Chi-sq(1)    P-value = 0.00045
• Overidentification gives cause for concern
– The p-value shouldn’t be anywhere near 0.05
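
The chi-square p-values reported by the software can be reproduced from the test statistics with closed-form tail probabilities, which exist for 1 and 2 degrees of freedom. The helper `chi2_sf` below is a hypothetical name and covers only those two cases. The statistics are copied from the output for the two-instrument and three-instrument models.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail chi-square probability for df = 1 or 2 (closed forms only)."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only covers df = 1 or 2")


p_two_ivs = chi2_sf(0.509, 1)     # Sargan statistic with law40 and nolaw
p_three_ivs = chi2_sf(5.301, 2)   # Sargan statistic after adding highun
p_dwh = chi2_sf(12.31438, 1)      # Durbin-Wu-Hausman chi-square

print(f"overid, two IVs:   p = {p_two_ivs:.4f}")
print(f"overid, three IVs: p = {p_three_ivs:.4f}")
print(f"DWH endogeneity:   p = {p_dwh:.5f}")
```

Small differences in the last digit relative to the software output are expected, because the printed statistics are themselves rounded.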
Conclusion from Diagnostic Tests
• 2SLS “work effect” is similar
– Without unemployment, b = –.074 (s.e. = .028)
– With unemployment, b = –.066 (s.e. = .028)
• But…the second model is invalidated
because the unemployment rate is not
exogenous
– It affects criminality through other channels
• We need to control for all other indirect pathways,
or…
• It should not be used as an IV at all
Closing Comments about
Instrumental Variables Studies
• In general, a lagged value of the
endogenous regressor is not a good
instrument
– Traditional structural equation model uses lagged
values of X and Y as instruments to break the
simultaneity between the current values of X and Y
[Path diagram with nodes X1, X2, Y1, Y2]
These models impose the
awfully strong assumption that
lagged values of X and Y only
affect the outcomes through
current values
Rules for Good Practice with
Instrumental Variables Models
• IV models can be very informative, but
it’s your job to convince your audience
– Show the first-stage model diagnostics
• Even the most clever IV might not be
sufficiently strongly related to X to be a useful
source of identification
– Report test(s) of overidentifying restrictions
• An invalid IV is often worse than no IV at all
– Report LS endogeneity (DWH) test
Rules for Good Practice with
Instrumental Variables Models
• Most importantly, TELL A STORY about
why a particular IV is a “good instrument”
• Something to consider when thinking
about whether a particular IV is “good”
– Does the IV, for all intents and purposes,
randomize the endogenous regressor?