* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Z - NYU
Survey
Document related concepts
Transcript
Econometrics I Professor William Greene Stern School of Business Department of Economics 12-1/71 Part 12: Endogeneity Econometrics I Part 12 - Endogeneity 12-2/71 Part 12: Endogeneity Sources of Endogeneity One or more variables is correlated with the disturbance. Sources: 12-3/71 Omitted variables Nonrandom sampling Nonrandom attrition from the sample Voluntary participation in the program (Not so) old fashioned simultaneous equations Dynamics in panel data Measurement error Part 12: Endogeneity The First IV Study – Omitted Variable (Snow, J., On the Mode of Communication of Cholera, 1855) London Cholera epidemic, ca 1853-4 Cholera = f(Water Purity,u)+ε. Effect of water purity on cholera? Purity=f(cholera prone environment (waste in streets, rodents, etc.)). Regression does not work. Two London water companies Lambeth Vauxhall/Southwark =====|||||====== Main sewage discharge Paul Grootendorst: A Review of Instrumental Variables Estimation of Treatment Effects… http://individual.utoronto.ca/grootendorst/pdf/IV_Paper_Sept6_2007.pdf 12-4/71 Part 12: Endogeneity IV Estimation – The Wald Estimator Cholera=f(Water Purity,u)+ε Z = water company Z = 1(Lambeth Water) Cov(Cholera,Z)=δCov(Water Purity,Z); δ = Cov(C,Z)/Cov(WP,Z) Z is randomly mixed in the population (two full sets of pipes) and uncorrelated with behavioral unobservables, u) Cholera=α+δPurity+u+ε Purity = Mean+random variation+λu Cov(Cholera,Z)= δCov(Purity,Z) C1 = Proportion with Cholera & Lambeth Water ˆδ = C1 − C0 C0 = Proportion with Cholera & Vauxhall Water W1 − W0 W1 = Proportion with Bad Water & Lambeth Water W0 = Proportion with Bad Water & Vauxhall Water [This is two stage least squares C on (BadW) with Z = (WaterCo).] 12-5/71 Part 12: Endogeneity Generalities y = Xβ + Yδ + ε Cov( X, ε) = 0, K1 variables Cov(Y, ε) ≠ 0, K 2 variables Y is endogenous OLS regression of y on (X,Y) cannot estimate (β,δ ) consistently. Some other estimator is needed. Additional structure: Y =ZΠ +V where Cov(Z,ε)= 0. An instrumental variable (IV) estimator based on (X,Z) may be able to estimate (β,δ ) consistently. 12-6/71 Part 12: Endogeneity The Standard Example The Effect of Education on Earnings LWAGE = β1 + β2EDUC + β3EXP + β4EXP 2 + ...ε+ What is ε? Ability, Motivation ,... + everything else EDUC = f(GENDER, SMSA, SOUTH, Ability, Motivation,...) 12-7/71 Part 12: Endogeneity What Influences LWAGE? LWAGE = β1 + β2EDUC( X, Ability, Motivation,...) + β3EXP + β4EXP 2 + ... ε(+ Ability, Motivation ) Increased Ability is associated with increases in EDUC( X, Ability, Motivation,...) and ε(Ability, Motivation ) What looks like an effect due to increase in EDUC may be an increase in Ability. The estimate of β2 picks up the effect of EDUC and the hidden effect of Ability. 12-8/71 Part 12: Endogeneity An Exogenous Influence LWAGE = β1 + β2EDUC( X, Z, Ability, Motivation,...) + β3EXP + β4EXP 2 + ... ε(+ Ability, Motivation ) Increased Z is associated with increases in EDUC( X, Z, Ability, Motivation,...) and not ε(Ability, Motivation ) An effect due to the effect of an increase Z on EDUC will only be an increase in EDUC. The estimate of β2 picks up the effect of EDUC only. Z is an Instrumental Variable 12-9/71 Part 12: Endogeneity Instrumental Variables Structure LWAGE (ED,EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION) ED (MS, FEM) Reduced Form: LWAGE[ ED (MS, FEM), EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION ] 12-10/71 Part 12: Endogeneity Two Stage Least Squares Strategy Reduced Form: LWAGE[ ED (MS, FEM,X), EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION ] Strategy 12-11/71 (1) Purge ED of the influence of everything but MS, FEM (and the other X variables). Predict ED using all exogenous information in the sample (X and Z). (2) Regress LWAGE on this prediction of ED and everything else. Standard errors must be adjusted for the predicted ED Part 12: Endogeneity Cornwell and Rupert Data Cornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 Years Variables in the file are EXP WKS OCC IND SOUTH SMSA MS FEM UNION ED LWAGE = = = = = = = = = = = work experience weeks worked occupation, 1 if blue collar, 1 if manufacturing industry 1 if resides in south 1 if resides in a city (SMSA) 1 if married 1 if female 1 if wage set by union contract years of education log of wage = dependent variable in regressions These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988, pp. 149-155. See Baltagi, page 122 for further analysis. The data were downloaded from the website for Baltagi's text. 12-12/71 Part 12: Endogeneity Specification: Quadratic Effect of Experience 12-13/71 Part 12: Endogeneity OLS 12-14/71 Part 12: Endogeneity 2 Stage Least Squares The weird results for the coefficient on ED happened because the instruments, MS and FEM are dummy variables. There is not enough variation in these variables. 12-15/71 Part 12: Endogeneity Source of Endogeneity LWAGE = f(ED, EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION) + ε ED = f(MS,FEM, EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION) + u 12-16/71 Part 12: Endogeneity Remove the Endogeneity LWAGE = f(ED, EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION) + u + ε Strategy 12-17/71 Estimate u Add u to the equation. ED is uncorrelated with ε when u is in the equation. Part 12: Endogeneity Auxiliary Regression for ED to Obtain Residuals 12-18/71 Part 12: Endogeneity OLS with Residual (Control Function) Added 2SLS 12-19/71 Part 12: Endogeneity A Warning About Control Function 12-20/71 Part 12: Endogeneity The two stage LS strategy: (The two stage button in your software.) The software regresses EDUC on all independent variables plus the two instrumental variables (stage 1), then takes the predicted value on education and regresses lwage on that predicted value plus the original independent variables (stage 2). Is this correct? Then the second method you showed is the same except the predicted residuals are included in the second stage OLS. Is one method preferred over another? They produce the same results. 12-21/71 Part 12: Endogeneity General Results By construction, the IV estimator is consistent. So, we have an estimator that is consistent when least squares is not. 12-22/71 Part 12: Endogeneity Instrumental Variables Framework: y = Xβ + ε, K variables in X. There exists a set of K variables, Z such that plim(Z’X/n) ≠ 0 but plim(Z’ε/n) = 0 The variables in Z are called instrumental variables. An alternative (to least squares) estimator of β is bIV = (Z’X)-1Z’y We consider the following: 12-23/71 Why use this estimator? What are its properties compared to least squares? We will also examine an important application Part 12: Endogeneity IV Estimators Consistent bIV = (Z’X)-1Z’y = (Z’X/n)-1 (Z’X/n)β+ (Z’X/n)-1Z’ε/n = β+ (Z’X/n)-1Z’ε/n β Asymptotically normal (same approach to proof as for OLS) Inefficient – (relative to what?) to be shown. 12-24/71 Part 12: Endogeneity IV Estimation Why use an IV estimator? Suppose that X and ε are not uncorrelated. Then least squares is neither unbiased nor consistent. Recall the proof of consistency of least squares: b = β + (X’X/n)-1(X’ε/n). Plim b = β requires plim(X’ε/n) = 0. If this does not hold, the estimator is inconsistent. 12-25/71 Part 12: Endogeneity A Popular Misconception A popular misconception. If only one variable in X is correlated with ε, the other coefficients are consistently estimated. False. Suppose only the first variable is correlated with ε σ1ε 0 Under the assumptions, plim(X'ε/n) = . Then ... . q 11 σ1ε 21 0 q plim b -β = plim(X'X /n)-1 = σ1ε ... ... K 1 . q = σ1ε times the first column of Q-1 The problem is “smeared” over the other coefficients. 12-26/71 Part 12: Endogeneity Asymptotic Covariance Matrix of bIV bIV − β = (Z'X ) −1 Z ' ε (bIV − β)(bIV − β) ' = (Z'X )−1 Z ' εε'Z(X'Z)-1 E[(bIV − β)(bIV − β) ' | X, Z] = σ2 (Z'X )−1 Z ' Z(X'Z)-1 12-27/71 Part 12: Endogeneity Asymptotic Efficiency Asymptotic efficiency of the IV estimator. The variance is larger than that of LS. (A large sample type of GaussMarkov result is at work.) (1) It’s a moot point. LS is inconsistent. (2) Mean squared error is uncertain: MSE[estimator|β]=Variance + square of bias. IV may be better or worse. Depends on the data 12-28/71 Part 12: Endogeneity Two Stage Least Squares How to use an “excess” of instrumental variables (1) X is K variables. Some (at least one) of the K variables in X are correlated with ε. (2) Z is M > K variables. Some of the variables in Z are also in X, some are not. None of the variables in Z are correlated with ε. (3) Which K variables to use to compute Z’X and Z’y? 12-29/71 Part 12: Endogeneity Choosing the Instruments Choose K randomly? Choose the included Xs and the remainder randomly? Use all of them? How? A theorem: There is a most efficient way to construct the IV estimator from this subset: (1) For each column (variable) in X, compute the predictions of that variable using all the columns of Z. (2) Linearly regress y on these K predictions. This is two stage least squares 12-30/71 Part 12: Endogeneity Algebraic Equivalence Two stage least squares is equivalent to 12-31/71 (1) each variable in X that is also in Z is replaced by itself. (2) Variables in X that are not in Z are replaced by predictions of that X with all the variables in Z that are not in X. Part 12: Endogeneity 2SLS Algebra ˆ = Z(Z'Z)-1 Z'X X ˆ ˆ ) −1 X'y ˆ b2SLS = ( X'X But, Z(Z'Z)-1 Z'X = (I - MZ ) X and (I - MZ ) is idempotent. ˆ ˆ = X'(I - MZ )(I - MZ ) X = X'(I - MZ ) X so X'X ˆ ) −1 X'y ˆ = a real IV estimator by the definition. b2SLS = ( X'X ˆ are linear combinations ˆ ε/n) = 0 since columns of X Note, plim(X' of the columns of Z , all of which are uncorrelated with ε. b2SLS = [ X'(I - MZ ) X ]-1 X'(I - MZ ) y 12-32/71 Part 12: Endogeneity Asymptotic Covariance Matrix for 2SLS General Result for Instrumental Variable Estimation E[(bIV − β)(bIV − β) ' | X, Z] = σ2 (Z'X )−1 Z ' Z(X'Z)-1 ˆ = (I - MZ ) X Specialize for 2SLS, using Z = X ˆ ) −1 X ˆ 'X ˆ (X'X ˆ )-1 E[(b2SLS − β)(b2SLS − β) ' | X, Z] = σ2 ( X'X ˆ ˆ ) −1 X ˆ 'X ˆ (X'X ˆ ˆ )-1 = σ2 ( X'X ˆ ˆ ) −1 = σ2 ( X'X 12-33/71 Part 12: Endogeneity 2SLS Has Larger Variance than LS A comparison to OLS ˆ 'X ˆ )-1 Asy.Var[2SLS]=σ2 ( X Neglecting the inconsistency, Asy.Var[LS] =σ2 ( X ' X )-1 (This is the variance of LS around its mean, not β) Asy.Var[2SLS] ≥ Asy.Var[LS] in the matrix sense. Compare inverses: -1 ˆ 'X ˆ] {Asy.Var[LS]}-1 - {Asy.Var[2SLS]}= (1 / σ2 )[X ' X - X (1 / σ2 )[X ' X - X '(I − MZ ) X ]=(1 / σ2 )[X ' MZ X ] = This matrix is nonnegative definite. (Not positive definite as it might have some rows and columns which are zero.) Implication for "precision" of 2SLS. The problem of "Weak Instruments" 12-34/71 Part 12: Endogeneity Estimating σ2 Estimating the asymptotic covariance matrix a caution about estimating σ2 . ˆ, Since the regression is computed by regressing y on x one might use 2 σ ˆ = 1 n ˆ 2sls ) Σni=1 (y i − x'b This is inconsistent. Use 2 σ ˆ = 1 n Σni=1 (y i − x'b2sls ) (Degrees of freedom correction is optional. Conventional, but not necessary.) 12-35/71 Part 12: Endogeneity Robust Covariance Matrix for 2SLS Robust VC Estimator = ( ˆ ′X ˆ X ) −1 C = ∑ c 1 (∑ nc ∑ nc =t 1 =s 1 )( ) ˆ ( y − x′ β) ˆ xˆ xˆ ′ X ˆ ′X ˆ ( yct − x′ct β) cs cs ct cs Actual xct 12-36/71 −1 Predicted xct Part 12: Endogeneity Endogeneity Test? (Hausman) Exogenous Endogenous OLS Consistent, Efficient Inconsistent 2SLS Consistent, Inefficient Consistent Base a test on d = b2SLS - bOLS Use a Wald statistic, d’[Var(d)]-1d What to use for the variance matrix? Hausman: V2SLS - VOLS 12-37/71 Part 12: Endogeneity An Endogenous Dummy Variable HospitalVisits = f(Age,Health,Insurance, Other unobservables) Insurance 12-38/71 = f(Expected Doctor Visits, Age,Health,Gender,Income,Kids, Education, Other unobservables) Part 12: Endogeneity Hausman Test 12-39/71 Part 12: Endogeneity 12-40/71 Part 12: Endogeneity Rank = 1. Use a generalized inverse. Still not significant 12-41/71 Part 12: Endogeneity Endogeneity Test: Wu Considerable complication in Hausman test (text, pp. 235-236) Simplification: Wu test. Regress y on X and X^ estimated for the endogenous part of X. Then use an ordinary Wald test. 12-42/71 Part 12: Endogeneity Wu Test in C&R Wage Equation 12-43/71 Part 12: Endogeneity Alternative to Hausman’s Formula? H test requires the difference between an efficient and an inefficient estimator. Any way to compare any two competing estimators even if neither is efficient? Bootstrap? (Maybe) 12-44/71 Part 12: Endogeneity 12-45/71 Part 12: Endogeneity Weak Instruments Symptom: The relevance condition, plim Z’X/n not zero, is close to being violated. Detection: Standard F test in the regression of xk on Z. F < 10 suggests a problem. F statistic based on 2SLS – see text p. 351. Remedy: Not much – most of the discussion is about the condition, not what to do about it. Use LIML? Requires a normality assumption. Probably not too restrictive. 12-46/71 Part 12: Endogeneity Two Problems with 2SLS Z’X/n may not be sufficiently large. The covariance matrix for the IV estimator is Asy.Cov(b ) = σ2[(Z’X)(Z’Z)-1(X’Z)]-1 If Z’X/n -> 0, the variance explodes. Additional problems: 2SLS biased toward plim OLS Asymptotic results for inference fall apart. When there are many instruments, X̂ is too close to X; 2SLS becomes OLS. 12-47/71 Part 12: Endogeneity Orthodoxy 12-48/71 A proxy is not an instrumental variable Instrument is a noun, not a verb Are you sure that the instrument is really exogenous? The “natural experiment.” Part 12: Endogeneity Autism: Natural Experiment Autism ----- Television watching Which way does the causation go? We need an instrument: Rainfall Rainfall effects staying indoors which influences TV watching Rainfall is definitely absolutely truly exogenous, so it is a perfect instrument. The correlation survives, so TV “causes” autism. 12-49/71 Part 12: Endogeneity Whitehouse (WJS) Commend on Waldman When economists use one variable as a proxy for another—rainfall patterns instead of TV viewing, for example—it’s not always clear what the results actually measure. Prof. Waldman’s willingness to hazard an opinion on a delicate matter of science reflects the growing ambition of economists—and also their growing hubris, in the view of critics. Academic economists are increasingly venturing beyond their traditional stomping ground, a wanderlust that has produced some powerful results but also has raised concerns about whether they’re sometimes going too far. 12-50/71 Part 12: Endogeneity Treatment Effect Earnings and Education: Effect of an additional year of schooling Estimating Average and Local Average Treatment Effects of Education when Compulsory Schooling Laws Really Matter 12-51/71 Philip Oreopoulos AER, 96,1, 2006, 152-175 Part 12: Endogeneity Treatment Effects and Natural Experiments 12-52/71 Part 12: Endogeneity Treatment Effects Outcome Y and treatment T = 0/1 Y = T*Y1 + (1-T)*Y0; Potential outcomes Y1 = outcome if treated Y0 = outcome if not treated Y = x’b + dT + e d = endogenous (causal) treatment effect How to estimate? 12-53/71 2SLS Sample Selection Propensity Score Matching Part 12: Endogeneity Union Effect in LWAGE 12-54/71 Part 12: Endogeneity Union Dummy 12-55/71 Part 12: Endogeneity 2SLS 12-56/71 Part 12: Endogeneity Sample Selection 12-57/71 Part 12: Endogeneity Sample Selection MLE 12-58/71 Part 12: Endogeneity Matching Estimates ATT = E[lnWage(1)|X,Union=1] – E[lnWage(1)|X,Union=0] 12-59/71 Part 12: Endogeneity Measurement Error y = βx* + ε all of the usual assumptions x = x* + u the true x* is not observed (education vs. years of school) What happens when y is regressed on x? Least squares attenutation: cov(x,y) cov(x * +u, β x * +ε) = plim b = var(x) var(x * +u) β var(x*) = <β var(x*) + var(u) 12-60/71 Part 12: Endogeneity Why Is Least Squares Attenuated? y = βx* + ε x = x* + u y = βx + (ε - βu) y = βx + v, cov(x,v) = - β var(u) Some of the variation in x is not associated with variation in y. The effect of variation in x on y is dampened by the measurement error. 12-61/71 Part 12: Endogeneity Measurement Error in Multiple Regression Multiple regression: y = β1 x1 * +β2 x 2 * + ε x1 * is measured with error;= x1 x1 * +u x 2 is measured with out error. The regression is estimated by least squares Popular myth #1. b1 is biased downward, b2 consistent. Popular myth #2. All coefficients are biased toward zero. Result for the simplest case. Let = σij cov(x = 1, 2 (2x2 covariance matrix) i *, x j *),i, j σij =ijth element of the inverse of the covariance matrix θ2 = var(u) For the least squares estimators: θ2 σ12 1 plim b1 = β1 , plim b2 = β2 − β1 2 11 2 11 1 + θ σ 1 + θ σ The effect is called "smearing." 12-62/71 Part 12: Endogeneity Twins Application from the literature: Ashenfelter/Kreuger: A wage equation for twins that includes “schooling.” 12-63/71 Part 12: Endogeneity A study of moral hazard Riphahn, Wambach, Million: “Incentive Effects in the Demand for Healthcare” Journal of Applied Econometrics, 2003 Did the presence of the ADDON insurance influence the demand for health care – doctor visits and hospital visits? 12-64/71 Part 12: Endogeneity Application: Health Care Panel Data German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods Variables in the file are Data downloaded from Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. They can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. This is a large data set. There are altogether 27,326 observations. The number of observations ranges from 1 to 7. (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987). Note, the variable NUMOBS below tells how many observations there are for each person. This variable is repeated in each row of the data for the person. (Downloaded from the JAE Archive) DOCTOR HOSPITAL HSAT DOCVIS HOSPVIS PUBLIC ADDON HHNINC HHKIDS EDUC AGE MARRIED EDUC 12-65/71 = = = = = = = = 1(Number of doctor visits > 0) 1(Number of hospital visits > 0) health satisfaction, coded 0 (low) - 10 (high) number of doctor visits in last three months number of hospital visits in last calendar year insured in public health insurance = 1; otherwise = 0 insured by add-on insurance = 1; otherswise = 0 household nominal monthly net income in German marks / 10000. (4 observations with income=0 were dropped) = children under age 16 in the household = 1; otherwise = 0 = years of schooling = age in years = marital status = years of education Part 12: Endogeneity Evidence of Moral Hazard? 12-66/71 Part 12: Endogeneity Regression Study 12-67/71 Part 12: Endogeneity Endogenous Dummy Variable HospitalVisits = f(Age,Health,Insurance, Other unobservables) Insurance 12-68/71 = f(Expected Doctor Visits, Age,Health,Gender,Income,Kids, Education, Other unobservables) Part 12: Endogeneity Approaches (Parametric) Control Function: Build a structural model for the two variables (Heckman) (Semiparametric) Instrumental Variable: Create an instrumental variable for the dummy variable (Barnow/Cain/ Goldberger, Angrist, Current generation of researchers) 12-69/71 Part 12: Endogeneity Heckman’s Control Function Approach Y = xβ + δT + E[ε|T] + {ε - E[ε|T]} λ = E[ε|T] , computed from a model for whether T = 0 or 1 Result is insignificant and goes in the wrong direction. 12-70/71 Part 12: Endogeneity Instrumental Variable Approach Construct a prediction for AddOn using only the exogenous information. Use 2SLS using this instrumental variable. Same non-result. 12-71/71 Part 12: Endogeneity