Instrumental Variables: Introduction
Methods of Economic Investigation
Lecture 14

Last Time
  Review of causal effects
  Defining our types of estimates:
    ATE (hypothetical)
    TOT (can get this if selection bias SB = 0)
    ITT (use this if we’ve got compliance problems)
  Methods:
    Experiment (gold standard, but can’t always get it)
    Fixed Effects (assumption on within-group variation)
    Difference-in-Differences (assumption on parallel trends)
    Propensity Score Matching (assumption on relationship between observables, unobservables, and treatment)

Today’s Class
  Introduction to Instrumental Variables
    What are they
    How do we estimate IV
    Tests for specification/fit

Recap of the problem
  There is some part of the error that we don’t observe (maybe behavioral parameters, maybe a simultaneously determined component, etc.)
  This component might not be:
    Fixed within a group
    Fixed over time/space
    Related to observables
  BUT this component IS correlated with the treatment/variable of interest

Our Treatment Effects Model
  Consider the following model to estimate the effect of treatment S on some outcome Y:
    Y = αX + ρS + η
  Our treatment here is S
  Think of the example of schooling:
    How much more will you earn if you go to college?
    Can’t observe true underlying ability, which is correlated with both the college attendance decision and future earnings

What’s correlated and what’s not
  The model we want to estimate:
    Yi = αX + ρsi + γAi + vi
  We have that:
    E[sv] = 0 (by assumption)
    E[Av] = 0 (by construction)
  The idea: if A could be observed, we’d just include it in the regression and be done

The Instrument…
  [Figure: group A, assigned to treatment (S=1), and group B, not assigned to treatment (S=0), each split into AH=1 and AL=0]
  ITT compares all of A to all of B: this mixes up the compliers (AH=S=1; AL=S=0) and the non-compliers (AH=1, S=0; AL=0, S=1)

Introducing Instruments
  The problem: how to estimate ρ when
    A is not observed
    A is related to Y
    Cov(A, S) ≠ 0
  The solution: find something that is
    Correlated with S [“Relevance”: a non-zero first stage]
    Uncorrelated with any other determinant of the outcome variable Y [“Exclusion Restriction”]

How does IV work
  Call our instrument z
  Our two instrument characteristics can be rewritten as
    E[zS] ≠ 0
    E[zη] = 0
  Then from our equations we can write our population estimate of ρ as:
    $\rho = \frac{\mathrm{Cov}(Y, z)}{\mathrm{Cov}(s, z)} = \frac{\mathrm{Cov}(Y, z) / V(z)}{\mathrm{Cov}(s, z) / V(z)}$
  (the reduced-form coefficient divided by the first-stage coefficient)

The Instrument…
  [Figure: the same groups A (assigned to treatment, S=1) and B (not assigned to treatment, S=0), each split into AH=1 and AL=0]
  Using the instrument, we can determine where the partition is: then we can compare the part of A that was “randomly assigned” (AH=S=1) to the part of B that was randomly assigned (AL=S=0)

Simplest case for IV
  Homogeneous treatment effects (same ρ for all i)
  Dummy variable for instrument: z = 1 with probability q
    Can break up continuous instruments into sets of dummy variables, or use GLS to generalize
  For now, don’t worry about covariates
    Simple extension: just include these in both stages
    Simplify our notation later…

Return to LATE
  Using z as a dummy that’s 1 with probability q:
    Cov(Y, z) = {E[Y | z = 1] - E[Y | z = 0]} q(1 - q)
    Cov(s, z) = {E[s | z = 1] - E[s | z = 0]} q(1 - q)
  Can rewrite ρ as:
    $\rho = \frac{E[Y_i \mid z_i = 1] - E[Y_i \mid z_i = 0]}{E[s_i \mid z_i = 1] - E[s_i \mid z_i = 0]}$
  Should look familiar: it’s our LATE estimate

Another type of intuition
  Remember that E[η | S] ≠ 0 (that’s why we’re in this mess)
    E[Y | S] ≠ ρS
  Can condition on z, rather than S
    By the “exclusion restriction” property of our instrument, E[η | z] = 0
    So now can estimate ρ because E[Y | z] = ρE[S | z]
  If z is binary, then this simplifies to our Wald estimator

IV estimate intuition
  The only reason for a relationship between z and Y is the relationship between z and S
  In the dummy variable specification, this is just rescaling the reduced-form difference in means (E[Y | z=1] - E[Y | z=0]) by the first-stage difference in means (E[S | z=1] - E[S | z=0])

How does IV work: Regression Intuition
  To see why this is, think about our “structural equation”:
    Yi = αX + ρsi + ηi
  We can estimate ρ by getting the ratio of two different coefficients
    First stage: si = π10X + π11zi + ξ1i
    Reduced form: yi = π20X + π21zi + ξ2i
  Here si is endogenous, X are exogenous covariates, and zi is the exogenous instrument

Rewriting the Structural equation
  Plug in the values from the first stage:
    Yi = αX + ρsi + ηi
       = αX + ρ[π10X + π11zi + ξ1i] + ηi
       = [α + ρπ10]X + ρπ11zi + [ρξ1i + ηi]
       = π20X + π21zi + ξ2i
  So π20 = α + ρπ10, π21 = ρπ11 and ξ2i = ρξ1i + ηi, which gives ρ = π21 / π11
  Equivalently:
    Yi = αX + ρ[π10X + π11zi] + ξ2i
    ρ: coefficient in the population regression of y on s, and also on the fitted value of s (and the X’s)
    [π10X + π11zi]: fitted value in the population regression of s on z (and X)

Population vs. Estimates
  If we had the entire population, we could measure the relationship between z and S and obtain the true π’s
  Using these π’s we could then obtain the true ρ
  Unfortunately, most of the time, we have finite samples

Estimating 2SLS
  In practice, use finite samples to obtain the fitted value
    $\hat{s}_i = \hat{\pi}_{10} X_i + \hat{\pi}_{11} z_i$
    Consistent estimates of the parameters from OLS
    Use these parameters to construct the fitted value
  Then use this fitted value to construct the second-stage estimating equation
    $y_i = \alpha X_i + \rho \hat{s}_i + [\eta_i + \rho (s_i - \hat{s}_i)]$
  Can get consistent estimates because covariates and fitted values are
    Uncorrelated with η (by assumption)
    Uncorrelated with (si - ŝi) (by construction)

Bias in 2SLS
  2SLS is biased. We’ll talk about this in detail next time, but the general idea is:
    We must estimate the first stage (e.g. ŝ)
    In practice, the first-stage estimates reflect some of the randomness in the endogenous variable (e.g. S)
    This randomness generates finite-sample correlations between first-stage fitted values and second-stage errors
      The endogenous variable is correlated with the second-stage errors
      Some of that correlation is left in the first-stage fitted value
  Asymptotically this bias goes to zero, but in finite samples it might not

Next time: Issues with IV estimates
  Return to consistency: what about bias?
  Weak instruments
  Heterogeneous treatment effects
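
The estimators in this lecture can be checked on simulated data. The sketch below is illustrative only: the data-generating process, the parameter values (rho = 2, gamma = 1.5, q = 0.5, the 0.8 and 0.7 weights) and the function names are assumptions made for this example, not part of the lecture. It uses a binary instrument, a homogeneous treatment effect and no covariates, matching the "simplest case" above, and compares OLS with the Wald ratio, the covariance ratio and a manual two-step 2SLS.

```python
import numpy as np


def simulate(n=200_000, rho=2.0, gamma=1.5, q=0.5, seed=0):
    """Draw (y, s, z) from an illustrative schooling model (all values assumed)."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=n)                  # unobserved ability
    z = (rng.random(n) < q).astype(float)   # binary instrument, P(z = 1) = q
    # Schooling depends on the instrument and on ability, so it is endogenous.
    s = (0.8 * z + 0.7 * A + rng.normal(size=n) > 0.5).astype(float)
    v = rng.normal(size=n)                  # noise independent of s, A and z
    y = rho * s + gamma * A + v             # homogeneous treatment effect rho
    return y, s, z


def ols_slope(y, x):
    """Slope from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


def wald(y, s, z):
    """Reduced-form difference in means over first-stage difference in means."""
    reduced_form = y[z == 1].mean() - y[z == 0].mean()
    first_stage = s[z == 1].mean() - s[z == 0].mean()
    return reduced_form / first_stage


def cov_ratio(y, s, z):
    """Sample analogue of Cov(Y, z) / Cov(s, z)."""
    return np.cov(y, z)[0, 1] / np.cov(s, z)[0, 1]


def two_sls(y, s, z):
    """Manual 2SLS: regress s on z, then regress y on the fitted value."""
    Z = np.column_stack([np.ones_like(z), z])
    pi_hat, *_ = np.linalg.lstsq(Z, s, rcond=None)
    s_hat = Z @ pi_hat
    return ols_slope(y, s_hat)


y, s, z = simulate()
estimates = {
    "ols": ols_slope(y, s),
    "wald": wald(y, s, z),
    "cov_ratio": cov_ratio(y, s, z),
    "2sls": two_sls(y, s, z),
}
for name, value in estimates.items():
    print(f"{name:>9}: {value:.3f}")
```

Under these assumptions OLS should overstate rho, because ability raises both schooling and earnings, while the three IV calculations coincide in-sample since there is a single binary instrument and no covariates. The manual second stage is only meant to show the point estimate; its OLS standard errors would not be valid 2SLS standard errors.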