Finding Instrumental Variables:
Identification Strategies
Amine Ouazad
Assistant Professor of Economics
Outline
1. Before/After
2. Difference-in-difference estimation
3. Regression Discontinuity Design
BEFORE/AFTER ESTIMATION
Principle
• $Y_{i,t} = X_{i,t}\beta + \sum_t \gamma_t\, 1(t) + \varepsilon_{i,t}$.
• And the event/the change/the policy happened
in period t.
• Examples: Event Studies: Impact of media on
share prices. Corporate governance: quota of
40% of female board members in Norway.
• Threats to identification:
– Secular changes cannot be distinguished from changes due to the
policy/the event.
– Also, the reform should not be applied gradually.
– The reform should not be anticipated.
DIFFERENCE-IN-DIFFERENCES
Card on Immigration and
Unemployment
Potential Outcome Framework
              Before      After
Treatment     E(Y0|T)     E(Y1|T)
Control       E(Y0|C)     E(Y1|C)
• Y0: outcome before.
• Y1: outcome after.
• Observed: E(Y1(1)|T), E(Y0(0)|T), E(Y1(0)|C), E(Y0(0)|C).
• E(Y1(1)|T)-E(Y0(0)|T) does not identify the effect of the treatment because
of secular changes (invalid before/after design).
• We do not observe: E(Y1(0)|T)-E(Y0(0)|T) !!
Parallel Trend Hypothesis
[Figure: E(Yt|group) plotted against time, showing the change in the outcome
in the treatment group, the counterfactual change in the outcome in the
treatment group, and the change in the outcome in the control group.]
• Assumption: the trend in the outcome in the treatment group would have been
identical to that in the control group had there been no change in the treatment.
Difference-in-differences estimator
Simple estimator
• DD = [E(Y1|T)-E(Y0|T)]-[E(Y1|C)-E(Y0|C)]
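In OLS, reduced form:
• $y_{i,t} = x_{i,t}\beta + \alpha\,\mathrm{After}_t + \gamma\,\mathrm{TreatmentGroup}_i + \delta\,\mathrm{After}_t \times \mathrm{TreatmentGroup}_i + \varepsilon_{i,t}$
In 2SLS:
• $y_{i,t} = x_{i,t}\beta + \delta\,\mathrm{Treatment}_{i,t} + \varepsilon_{i,t}$.
• $\mathrm{Treatment}_{i,t} = \mathrm{constant} + \alpha\,\mathrm{After}_t + \gamma\,\mathrm{TreatmentGroup}_i + \delta\,\mathrm{After}_t \times \mathrm{TreatmentGroup}_i + \varepsilon_{i,t}$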
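As a rough illustration of the simple estimator and its OLS form, the sketch below simulates a two-group, two-period dataset and computes the DD both from the four cell means and from the interaction coefficient. All numbers (sample size, a true effect of 1.5, a secular change of 2.0) are made-up assumptions for the illustration, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, illustrative only

n = 4000
after = rng.integers(0, 2, size=n)   # 1 if observed after the change
group = rng.integers(0, 2, size=n)   # 1 if in the treatment group
true_effect = 1.5                    # hypothetical treatment effect
secular_change = 2.0                 # hypothetical change common to both groups

y = (1.0 + secular_change * after + 3.0 * group
     + true_effect * after * group + rng.normal(0.0, 1.0, size=n))

def cell_mean(a, g):
    """Sample analogue of E(Y_a | group g)."""
    return y[(after == a) & (group == g)].mean()

# Before/after in the treatment group alone picks up the secular change too.
before_after = cell_mean(1, 1) - cell_mean(0, 1)

# DD = [E(Y1|T)-E(Y0|T)] - [E(Y1|C)-E(Y0|C)]
dd_means = before_after - (cell_mean(1, 0) - cell_mean(0, 0))

# Same quantity as the coefficient on After*TreatmentGroup in OLS.
X = np.column_stack([np.ones(n), after, group, after * group])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
dd_ols = coef[3]

print(f"before/after: {before_after:.3f}, DD (means): {dd_means:.3f}, DD (OLS): {dd_ols:.3f}")
```

Without covariates the regression is saturated, so the interaction coefficient coincides with the difference of cell-mean differences; the before/after comparison, by contrast, absorbs the common time change.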
Difference-in-differences estimator
• A single dummy for the treatment group can
be replaced by a set of dummies for each
individual (“individual fixed effects”).
– Each fixed effect controls for the individual-specific intercept.
• The After*Treatment interaction can be itself
interacted with observables to estimate the
heterogeneity in the effect of the treatment.
Testing the parallel trend hypothesis
Placebo difference-in-differences
• Trends in the outcome before the event are not statistically different
in the control and treatment groups.
• Replace the outcome with an outcome that you expect not to be affected
by the change.
Confounding factors
• Other covariates should not experience a break in trend at the time of
the change.
• i.e. controls xit should be in the regression to improve
efficiency, preferably not to control for confounding factors.
Robustness check
• Use an alternative control group.
Identification issues in
Difference-in-differences
• Functional form misspecification/non linearities
– The parallel trend hypothesis is very sensitive to the use of logs/absolute
values.
– Example: in levels, (30-20)-(10-5) = 5 > 0. In logs,
(log(30)-log(20))-(log(10)-log(5)) = log(1.5)-log(2) < 0.
• Selection into control and treatment groups should be exogenous.
– Selection not based on pretreatment outcomes.
• No spillover across control and treatment group.
• Externalities within the control and the treatment group.
• Heterogeneity in the magnitude of the treatment.
– $Y_{it} = \mu_i + \alpha_t + \eta_i M_{it} + \varepsilon_{it}$.
– Then $DD = [Y_{1T} - Y_{0T}] - [Y_{1C} - Y_{0C}] = \eta_T M_T - \eta_C M_C$.
• Long-run versus short-run effects of the treatment.
– Change in time window/time between two measures.
– Confounding factors more frequent with a larger time window.
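The levels-versus-logs example in the functional form bullet can be checked numerically. The sketch below assumes, following the DD formula, that the treatment group moves from 20 to 30 and the control group from 5 to 10; the slide does not label the groups explicitly.

```python
import math

# Assumed reading of the example: treatment 20 -> 30, control 5 -> 10.
treat_before, treat_after = 20.0, 30.0
control_before, control_after = 5.0, 10.0

# DD in levels: (30-20) - (10-5)
dd_levels = (treat_after - treat_before) - (control_after - control_before)

# DD in logs: (log 30 - log 20) - (log 10 - log 5) = log(1.5) - log(2)
dd_logs = (math.log(treat_after) - math.log(treat_before)) - (
    math.log(control_after) - math.log(control_before)
)

print(f"DD in levels: {dd_levels:+.3f}")
print(f"DD in logs:   {dd_logs:+.3f}")
```

The sign flips because the control group grows less in absolute terms but more in proportional terms, so parallel trends cannot hold in both specifications at once.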
Inference issues in
difference-in-differences.
• Standard errors in difference-in-differences are
very sensitive to autocorrelation of the residuals.
– Bertrand, Duflo, and Mullainathan (2004), “How much should we trust
difference-in-differences estimates?”, Quarterly Journal of
Economics.
– The paper creates “fake” policy changes (dummies
randomly placed in the timeline) and runs OLS DD
without correcting the s.e.s for AR(1). More than 5% of the
placebo effects come out statistically significant.
– Solution: ar option in xtreg/xtivreg, or bootstrap
(coming soon).
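The over-rejection problem can be illustrated with a small placebo simulation. This is only a sketch in the spirit of the exercise described above, not a replication of the paper: the panel size, the AR(1) coefficient of 0.8, the adoption dates, and the function names are all assumptions. It regresses a pure-noise outcome on a fake policy dummy with state and year fixed effects and uses conventional OLS standard errors.

```python
import numpy as np

def two_way_demean(a):
    """Remove state (row) and year (column) fixed effects in a balanced panel."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

def placebo_rejection_rate(rho, n_states=20, n_periods=20, n_sims=300, seed=0):
    """Share of placebo policies found significant at 5% with naive OLS s.e.s."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        # Outcome is AR(1) noise within each state: there is no true effect.
        shocks = rng.normal(size=(n_states, n_periods))
        y = np.empty_like(shocks)
        y[:, 0] = shocks[:, 0] / np.sqrt(1.0 - rho**2)  # stationary start
        for t in range(1, n_periods):
            y[:, t] = rho * y[:, t - 1] + shocks[:, t]

        # Fake policy: half of the states adopt at a random date and keep it.
        d = np.zeros((n_states, n_periods))
        for s in rng.permutation(n_states)[: n_states // 2]:
            start = rng.integers(5, n_periods - 5)
            d[s, start:] = 1.0

        # OLS with two-way fixed effects via the within transformation.
        y_w, d_w = two_way_demean(y), two_way_demean(d)
        beta = (d_w * y_w).sum() / (d_w**2).sum()
        resid = y_w - beta * d_w
        dof = n_states * n_periods - (n_states + n_periods - 1) - 1
        se = np.sqrt((resid**2).sum() / dof / (d_w**2).sum())
        rejections += abs(beta / se) > 1.96
    return rejections / n_sims

rate_ar1 = placebo_rejection_rate(rho=0.8)
rate_iid = placebo_rejection_rate(rho=0.0)
print(f"rejection rate, AR(1) errors: {rate_ar1:.2f}; i.i.d. errors: {rate_iid:.2f}")
```

With serially uncorrelated errors the rejection rate should sit near the nominal 5%, while with persistent errors it should be noticeably higher, which is the reason for correcting the standard errors.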
Levitt on Abortion and Crime
Identification strategy
• State and year fixed effects. ABORT is the measure of the effective abortion
rate.
Results
Being (a bit too) fancy: DDD
• Useful if the parallel trends hypothesis is violated in DD.
• Assume there is a change in state health care law that affects people aged
65 years and older.
• Before and after, control group is people younger than 65.
• The difference in slopes can be controlled for by looking at another state
not affected by the change.
• dB: dummy for state implementing the policy.
– A: state not implementing the policy.
• dE: dummy for people older than 65.
• d2: after dummy.
• Parameter of interest is $\delta_3$.
Equivalently:
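The equations of this slide did not survive in the transcript. As a sketch only, a standard triple-difference specification that is consistent with the variable names above would look as follows. The coefficient names other than $\delta_3$, the error term $u$, and the cell-mean subscripts (B: policy state, A: other state, E: older than 65, N: younger than 65, 1/2: before/after) are assumptions made for this illustration.

```latex
\[
y = \beta_0 + \beta_1\, dB + \beta_2\, dE + \beta_3\, dB \cdot dE
  + \delta_0\, d2 + \delta_1\, d2 \cdot dB + \delta_2\, d2 \cdot dE
  + \delta_3\, d2 \cdot dB \cdot dE + u
\]
\[
\hat{\delta}_3 =
  \left[ (\bar{y}_{B,E,2} - \bar{y}_{B,E,1}) - (\bar{y}_{B,N,2} - \bar{y}_{B,N,1}) \right]
- \left[ (\bar{y}_{A,E,2} - \bar{y}_{A,E,1}) - (\bar{y}_{A,N,2} - \bar{y}_{A,N,1}) \right]
\]
```

The first bracket is the DD within the policy state (older versus younger); the second bracket is the same DD in the comparison state, which nets out any age-specific trend common to both states.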
REGRESSION DISCONTINUITY
DESIGNS
DiNardo on Unions and Wages
First Stage
Reduced Form
Reduced Form
RD Designs in the potential outcome
framework
• Y(1) outcome with the treatment, Y(0) outcome without
the treatment.
• Yi(0) is observed below the cutoff, Yi(1) is observed above
the cutoff.
• Cutoff at x=c.
• And E(Y|X) = E(Y(0)|X) for X<c.
• E(Y|X) = E(Y(1)|X) for X>c.
• Sharp regression discontinuity design estimator.
– With the continuity of the conditional mean.
• Estimation of the effect at the cutoff. Any heterogeneity in
the effect will not be captured.
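The sharp RD estimand referred to above can be written out as a worked equation. This is a sketch under the stated continuity assumption, namely that $E(Y(0)|X=x)$ and $E(Y(1)|X=x)$ are continuous in $x$ at $c$; the symbol $\delta$ is chosen here for the effect at the cutoff.

```latex
\[
\delta
= \lim_{x \downarrow c} E(Y \mid X = x) - \lim_{x \uparrow c} E(Y \mid X = x)
= \lim_{x \downarrow c} E(Y(1) \mid X = x) - \lim_{x \uparrow c} E(Y(0) \mid X = x)
= E\big( Y(1) - Y(0) \mid X = c \big)
\]
```

The second equality uses the fact that Y(1) is observed above the cutoff and Y(0) below it; the third uses continuity of both conditional means at c. The result is an effect at X = c only, which is why heterogeneity away from the cutoff is not captured.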
Regression discontinuity design
framework
• No discontinuity in the unobservables at the cutoff.
– Discontinuity in the observables can be controlled for, but are
considered suspicious.
• Discontinuity in the treatment at the cutoff.
• Estimator:
$$\delta = \frac{\lim_{x \downarrow c} E(Y|x) - \lim_{x \uparrow c} E(Y|x)}{\lim_{x \downarrow c} E(D|x) - \lim_{x \uparrow c} E(D|x)}$$
• Fuzzy regression discontinuity design: the probability of the
treatment jumps at the cutoff, but by less than 1.
• Sharp regression discontinuity design: the probability of the
treatment goes from 0 to 1 at the cutoff.
RD estimator
• The regression discontinuity estimator identifies
the effect of interest.
• Simple case: no effect of distance to cutoff on
outcome.
– $y = z\gamma + \delta D + \varepsilon$
– Take out the effect of x by projecting on x.
– $y^* = \delta D^* + \varepsilon^*$
– Assuming $\lim_{x \downarrow c} E(\varepsilon|x) = \lim_{x \uparrow c} E(\varepsilon|x)$.
– Then $\mathrm{plim}\, \hat{\delta}_{RD} = \delta$.
OLS estimation
• First Stage
– $D = \sum_k \alpha_k x^k + \gamma\, 1(x>c) + \varepsilon$
– Start with a low-order polynomial.
Add powers if needed.
– Same analysis of the F-stat and Angrist-Pischke F-stat as in
the IV session.
• Reduced Form
– $Y = \sum_k \beta_k x^k + \delta\, 1(x>c) + \varepsilon$
– The estimate in the reduced form should be significant.
• IV Estimation
– $Y = \mathrm{constant} + \delta D + \eta$.
– With $D = \sum_k \alpha_k x^k + \gamma\, 1(x>c) + \varepsilon$.
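The first stage, reduced form, and IV steps above can be sketched on simulated data. Everything in the simulation is an assumption for illustration: a cutoff at 0, a treatment probability that jumps from 0.2 to 0.7, a true effect of 2, and a third-order polynomial in x. Because the same polynomial is included in both regressions, the just-identified IV estimate equals the ratio of the reduced-form jump to the first-stage jump.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, illustrative only
n, c = 20000, 0.0
true_delta = 2.0  # hypothetical treatment effect

x = rng.uniform(-1.0, 1.0, size=n)               # assignment variable
above = (x > c).astype(float)                    # 1(x > c)
# Fuzzy design: treatment probability jumps by 0.5 at the cutoff.
d = (rng.uniform(size=n) < 0.2 + 0.5 * above).astype(float)
y = 1.0 + 0.5 * x + true_delta * d + rng.normal(size=n)

def jump_at_cutoff(outcome, order=3):
    """Coefficient on 1(x > c) in an OLS regression on a polynomial in x."""
    cols = [np.ones(n)] + [(x - c) ** k for k in range(1, order + 1)] + [above]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return coef[-1]

gamma_hat = jump_at_cutoff(d)         # first stage: jump in treatment
rf_hat = jump_at_cutoff(y)            # reduced form: jump in outcome
delta_hat = rf_hat / gamma_hat        # IV (Wald) estimate of the effect

print(f"first stage: {gamma_hat:.3f}, reduced form: {rf_hat:.3f}, IV: {delta_hat:.3f}")
```

In practice the polynomial order and the window around the cutoff would be varied to check that the estimate is stable.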
Polynomial regression
• Accounts for the effect of distance on the expected value of the
outcome conditional on distance.
• Example: third order polynomial.
• Choice of the order:
• Akaike Information Criterion
– X Rated: [Microfoundation: the Kullback-Leibler distance between the estimated
density function and the true density function is minimized when the AIC is
minimized.
– With normal errors, $\ln(L) = -(N/2)\ln(\hat{\sigma}^2)$ up to a constant, where L is
the likelihood of the model, so that $AIC = N\ln(\hat{\sigma}^2) + 2p$ with p parameters.]
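Choosing the polynomial order by AIC can be sketched as follows, using the common form AIC = N ln(sigma2_hat) + 2p, where p is the number of estimated coefficients and lower values are preferred. The data-generating process (a cubic in x plus a jump of 1 at the cutoff) and the range of candidate orders are assumptions for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed, illustrative only
n, c = 2000, 0.0
x = rng.uniform(-1.0, 1.0, size=n)
above = (x > c).astype(float)
# Hypothetical outcome: cubic in x plus a discontinuity of 1 at the cutoff.
y = 0.5 * x - 1.0 * x**2 + 3.0 * x**3 + 1.0 * above + rng.normal(0.0, 0.5, size=n)

def aic(order):
    """AIC of an OLS fit with a polynomial of the given order plus 1(x > c)."""
    cols = [np.ones(n)] + [x**k for k in range(1, order + 1)] + [above]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2_hat = np.mean((y - X @ coef) ** 2)   # residual variance estimate
    return n * np.log(sigma2_hat) + 2 * X.shape[1]

aic_by_order = {p: aic(p) for p in range(1, 7)}
best_order = min(aic_by_order, key=aic_by_order.get)

for p, value in aic_by_order.items():
    print(f"order {p}: AIC = {value:.1f}")
print("selected order:", best_order)
```

Orders below the true one fit markedly worse here, while higher orders gain little fit and pay the penalty of 2 per extra coefficient.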
Test of the polynomial restriction
• Assume linear functional form and add dummies for bins of
the distance between the cutoff and the assignment variable
x.
• Bk: dummy for bin k. Size of bins: bandwidth.
• And test for the joint significance of the $\phi_k$, the coefficients on the Bk.
• Also a test of the presence of discontinuities at other values
than the cutoff.
The role of covariates
• Covariates:
– Reduce the sampling variability around the cutoff.
– Should not affect the point estimate if
randomization around the cutoff is correct.
• Either control directly in OLS (for sharp RD) or
in 2SLS (for fuzzy RD).
• Or take the residuals of Y and of X on the
covariates before drawing the graph and
performing the regression.
Tests of the Regression Discontinuity
Design Hypothesis
• Can individuals precisely manipulate the “assignment
variable”?
– If individuals cannot precisely manipulate the assignment
variable, the variation around the threshold is as good as
random.
Hence:
• Show that there is no discontinuity in observable
covariates.
• Change the window around the cutoff for the estimation.
Stable estimates?
– Trade off between statistical power and assumptions.
• Higher order polynomials, analysis by subset.
• Density/Number of individuals below/above the cutoff.
Additional notes
• Regression discontinuity designs (and regression
kink designs) are very demanding on the data.
– Hence they are typically used with large datasets with a
large number of points.
• Estimate the effect for individuals/firms that are
likely to be around the cutoff.
– Industries that are always far from the cutoff (e.g.
retail) will not be part of the estimate for small
windows around the cutoff. Impact of unionization for
the subset of individuals/firms likely to be around the
cutoff.
CONCLUSION
Finding IV estimators
• IV estimators can be found by:
– Looking at the timing of changes.
– Looking at the differential implementation of changes.
• Using: law/events/randomness.
– To build “natural” experiments rather than controlled
experiments.
• Drawback: demanding on the data.
• Trade-off between statistical power and
credibility of the hypothesis.