Finding Instrumental Variables: Identification Strategies
Amine Ouazad
Ass. Professor of Economics

Outline
1. Before/After
2. Difference-in-difference estimation
3. Regression Discontinuity Design

BEFORE/AFTER ESTIMATION

Principle
• $Y_{i,t} = X_{i,t}\beta + \sum_t \gamma_t 1(t) + \varepsilon_{i,t}$, where $1(t)$ is a dummy for period $t$.
• The event, the change, or the policy happens in period $t$.
• Examples:
– Event studies: impact of media on share prices.
– Corporate governance: quota of 40% of female board members in Norway.
• Threats to identification:
– How is it possible to distinguish secular changes from changes due to the policy/the event? It is not.
– The reform should not be applied gradually.
– The reform should not be anticipated.

DIFFERENCE-IN-DIFFERENCES

Card on Immigration and Unemployment

Potential Outcome Framework

             Before      After
Treatment    E(Y0|T)     E(Y1|T)
Control      E(Y0|C)     E(Y1|C)

• Y0: outcome before. Y1: outcome after. In $Y_t(d)$ below, the argument $d$ denotes treatment status (1 treated, 0 untreated).
• Observed: $E(Y_1(1)|T)$, $E(Y_0(0)|T)$, $E(Y_1(0)|C)$, $E(Y_0(0)|C)$.
• $E(Y_1(1)|T) - E(Y_0(0)|T)$ does not identify the effect of the treatment because of secular changes (invalid before/after design).
• We do not observe the counterfactual change $E(Y_1(0)|T) - E(Y_0(0)|T)$, because $E(Y_1(0)|T)$ is never observed.

Parallel Trend Hypothesis
[Figure: E(Yt|group) plotted against time, showing the change in the outcome in the treatment group, the counterfactual change in the outcome in the treatment group, and the change in the outcome in the control group.]
• Assumption: had there been no change in the treatment, the trend in the outcome in the treatment group would have been identical to the trend in the control group.

Difference-in-differences estimator
Simple estimator
• $DD = [E(Y_1|T) - E(Y_0|T)] - [E(Y_1|C) - E(Y_0|C)]$
In OLS, reduced form:
• $y_{i,t} = x_{i,t}\beta + \alpha\,\text{After}_t + \gamma\,\text{TreatmentGroup}_i + \delta\,(\text{After} \times \text{TreatmentGroup})_{i,t} + \varepsilon_{i,t}$
In 2SLS:
• $y_{i,t} = x_{i,t}\beta + \delta\,\text{Treatment}_{i,t} + \varepsilon_{i,t}$
• $\text{Treatment}_{i,t} = \text{constant} + \alpha\,\text{After}_t + \gamma\,\text{TreatmentGroup}_i + \delta\,(\text{After} \times \text{TreatmentGroup})_{i,t} + \varepsilon_{i,t}$ (the first-stage coefficients and error term are distinct from those of the outcome equation, although the slide reuses the same letters).

Difference-in-differences estimator
• A single dummy for the treatment group can be replaced by a set of dummies for each individual ("individual fixed effects").
– Each fixed effect controls for the individual-specific, time-invariant level of the outcome (an individual-specific intercept).
• The After*Treatment interaction can itself be interacted with observables to estimate the heterogeneity in the effect of the treatment.

Testing the parallel trend hypothesis
Placebo difference-in-differences
• Check that trends in the outcome before the event are not statistically different between the control and the treatment group.
• Replace the outcome with an outcome that you expect not to be affected by the change.
Confounding factors
• Check that trends in other covariates do not experience a break at the time of the change.
• That is, the controls $x_{it}$ should be in the regression to improve efficiency, preferably not to control for confounding factors.
Robustness check
• Use an alternative control group.

Identification issues in Difference-in-differences
• Functional form misspecification/nonlinearities
– The parallel trend hypothesis is very sensitive to the use of logs versus levels.
– Example: in levels, (30-20)-(10-5) = 5 > 0. In logs, (log(30)-log(20))-(log(10)-log(5)) = log(1.5)-log(2) < 0. The sign of the estimate flips.
• Selection into control and treatment groups should be exogenous.
– Selection not based on pretreatment outcomes.
• No spillover across control and treatment group.
• Externalities within the control and the treatment group.
• Heterogeneity in the magnitude of the treatment.
– $Y_{it} = \mu_i + \alpha_t + \eta_i M_{it} + \varepsilon_{it}$.
– Then $DD = [Y_{1T} - Y_{0T}] - [Y_{1C} - Y_{0C}] = \eta_T M_T - \eta_C M_C$.
• Long-run versus short-run effects of the treatment.
– Change in time window/time between two measures.
– Confounding factors are more frequent with a larger time window.

Inference issues in difference-in-differences
• Standard errors in difference-in-differences are very sensitive to autocorrelation of the residuals.
– Bertrand, Duflo, and Mullainathan, "How much should we trust difference-in-differences estimates?", Quarterly Journal of Economics.
– The paper creates "fake" policy changes (dummies randomly placed in the timeline) and runs OLS DD regressions without correcting the standard errors for AR(1). More than 5% of the placebo effects come out statistically significant.
– Solution: ar option in xtreg/xtivreg, or bootstrap (coming soon).

Levitt on Abortion and Crime
Identification strategy
• State and year fixed effects. ABORT is the measure of effective abortion rates.
Results

Being (a bit too) fancy: DDD
• Use when the parallel trends hypothesis is violated in DD.
• Assume there is a change in state health care law that affects people aged 65 years and older.
• Before and after, the control group is people younger than 65.
• The difference in slopes can be controlled for by looking at another state not affected by the change.
• dB: dummy for the state implementing the policy.
– A state not implementing the policy.
• dE: dummy for people older than 65.
• d2: after dummy.
• Parameter of interest is $\delta_3$, the coefficient on the triple interaction of dB, dE, and d2.
• Equivalently:

REGRESSION DISCONTINUITY DESIGNS

DiNardo on Unions and Wages
[Figures: First Stage; Reduced Form; Reduced Form]

RD Designs in the potential outcome framework
• $Y(1)$: outcome with the treatment. $Y(0)$: outcome without the treatment.
• $Y_i(0)$ is observed below the cutoff, $Y_i(1)$ is observed above the cutoff.
• Cutoff at $x = c$.
• $E(Y|X) = E(Y(0)|X)$ for $X < c$.
• $E(Y|X) = E(Y(1)|X)$ for $X > c$.
• Sharp regression discontinuity design estimator.
– With the continuity of the conditional mean.
• The effect is estimated at the cutoff. Any heterogeneity in the effect away from the cutoff will not be captured.

Regression discontinuity design framework
• No discontinuity in the unobservables at the cutoff.
– Discontinuities in the observables can be controlled for, but are considered suspicious.
• Discontinuity in the treatment at the cutoff.
• Estimator: $\delta = \dfrac{\lim_{x \downarrow c} E(Y|x) - \lim_{x \uparrow c} E(Y|x)}{\lim_{x \downarrow c} E(D|x) - \lim_{x \uparrow c} E(D|x)}$
• Fuzzy regression discontinuity design: some discontinuity in the probability of the treatment at the cutoff.
• Sharp regression discontinuity design: the probability of the treatment goes from 0 to 1 at the cutoff.

RD estimator
• The regression discontinuity estimator identifies the effect of interest.
• Simple case: no effect of distance to cutoff on outcome.
– $y = x\gamma + \delta D + \varepsilon$
– Take out the effect of $x$ by projecting on $x$.
– $y^* = \delta D^* + \varepsilon^*$
– Assuming $\lim_{x \downarrow c} E(\varepsilon|x) = \lim_{x \uparrow c} E(\varepsilon|x)$.
– Then $\text{plim}\ \hat{\delta}_{RD} = \delta$.

OLS estimation
• First Stage
– $D = \sum_k \alpha_k x^k + \gamma 1(x > c) + \varepsilon$
– Start with a low-order polynomial. Add powers if needed.
– Same analysis of the F-stat and Angrist-Pischke F-stat as in the IV session.
• Reduced Form
– $Y = \sum_k \beta_k x^k + \delta 1(x > c) + \varepsilon$
– The estimate in the reduced form should be significant.
• IV Estimation
– $Y = \text{constant} + \delta D + \eta$.
– With $D = \sum_k \alpha_k x^k + \gamma 1(x > c) + \varepsilon$.

Polynomial regression
• Accounts for the effect of distance on the expected value of the outcome conditional on distance.
• Example: third order polynomial.
• Choice of the order:
• Akaike Information Criterion
– X Rated: [Microfoundation: under the standard definition $AIC = 2k - 2\ln(L)$, with $k$ the number of estimated parameters, the Kullback-Leibler distance between the estimated density function and the true density function is (approximately) minimized when the AIC is minimized.
– $\ln(L) = -(N/2)\ln(\hat{\sigma}^2)$ up to an additive constant, where $L$ is the likelihood of the model.]

Test of the polynomial restriction
• Assume a linear functional form and add dummies for bins of the distance between the cutoff and the assignment variable $x$.
• $B_k$: dummy for bin $k$. Size of bins: bandwidth.
• Test for the joint significance of the $\phi_k$, the coefficients on the bin dummies.
• This is also a test of the presence of discontinuities at values other than the cutoff.

The role of covariates
• Covariates:
– Reduce the sampling variability around the cutoff.
– Should not affect the point estimate if randomization around the cutoff is correct.
• Either control directly in OLS (for sharp RD) or in 2SLS (for fuzzy RD).
• Or take the residuals of Y and of X on the covariates before drawing the graph and performing the regression.

Tests of the Regression Discontinuity Design Hypothesis
• Threat: individuals can precisely manipulate the "assignment variable."
– If individuals cannot precisely manipulate the assignment variable, around the threshold the variation is as good as random.
Hence:
• Show that there is no discontinuity in observable covariates.
• Change the window around the cutoff for the estimation. Stable estimates?
– Trade-off between statistical power and assumptions.
• Higher order polynomials, analysis by subset.
• Density/Number of individuals below/above the cutoff.

Additional notes
• Regression discontinuity design (and regression kink design) are very demanding on the data.
– Hence it is typically used with large datasets with a large number of points.
• Estimate the effect for individuals/firms that are likely to be around the cutoff.
– Industries that are always far from the cutoff (e.g. retail) will not be part of the estimate for small windows around the cutoff. Impact of unionization for the subset of individuals/firms likely to be around the cutoff.

CONCLUSION

Finding IV estimators
• IV estimators can be found by:
– Looking at the timing of changes.
– Looking at the differential implementation of changes.
• Using: law/events/randomness.
– To build "natural" experiments rather than controlled experiments.
• Drawback: demanding on the data.
• Trade-off between statistical power and credibility of the hypothesis.
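
As a recap of the "differential implementation of changes" strategy, here is a minimal Python sketch of the difference-in-differences estimator. It assumes a two-period, two-group setting without covariates. The function names (`dd_from_means`, `dd_from_ols`) and the simulated numbers are hypothetical and chosen for illustration only. The sketch computes the simple estimator from the four cell means, checks that it coincides with the OLS coefficient on the After*Treatment Group interaction, and then replays the levels-versus-logs example (30, 20, 10, 5) from the identification issues.

```python
import numpy as np


def dd_from_means(y, after, treated):
    """Simple estimator: [E(Y1|T) - E(Y0|T)] - [E(Y1|C) - E(Y0|C)]."""
    y, after, treated = (np.asarray(v) for v in (y, after, treated))

    def cell_mean(a, g):
        # Mean outcome in period a (0 = before, 1 = after) and group g (0 = control, 1 = treatment).
        return y[(after == a) & (treated == g)].mean()

    return (cell_mean(1, 1) - cell_mean(0, 1)) - (cell_mean(1, 0) - cell_mean(0, 0))


def dd_from_ols(y, after, treated):
    """OLS of y on a constant, After, Treatment Group, and their interaction.

    Returns the coefficient on the interaction term (delta).
    """
    y, after, treated = (np.asarray(v, dtype=float) for v in (y, after, treated))
    X = np.column_stack([np.ones_like(y), after, treated, after * treated])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[3]


# Simulated repeated cross-section; all parameter values are made up.
rng = np.random.default_rng(0)
n = 4000
treated = rng.integers(0, 2, n)
after = rng.integers(0, 2, n)
true_delta = 2.0
y = 1.0 + 0.5 * after + 1.5 * treated + true_delta * after * treated + rng.normal(0, 1, n)

dd_means = dd_from_means(y, after, treated)
dd_ols = dd_from_ols(y, after, treated)
print(f"DD from means: {dd_means:.3f}, DD from OLS: {dd_ols:.3f}")

# Levels versus logs: treatment group goes 20 -> 30, control group goes 5 -> 10.
y_small = np.array([20.0, 30.0, 5.0, 10.0])
after_small = np.array([0, 1, 0, 1])
treated_small = np.array([1, 1, 0, 0])
dd_levels = dd_from_means(y_small, after_small, treated_small)
dd_logs = dd_from_means(np.log(y_small), after_small, treated_small)
print(f"DD in levels: {dd_levels:.3f}, DD in logs: {dd_logs:.3f}")
```

Because the regression is saturated in the two dummies, the OLS interaction coefficient and the four-means estimator agree up to floating-point error. With covariates or individual fixed effects the two would generally differ, and standard errors would still need the autocorrelation correction discussed under inference issues.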