Download Z - NYU

Document related concepts

Data assimilation wikipedia , lookup

Bias of an estimator wikipedia , lookup

Interaction (statistics) wikipedia , lookup

Regression analysis wikipedia , lookup

Linear regression wikipedia , lookup

Coefficient of determination wikipedia , lookup

Transcript
Econometrics I
Professor William Greene
Stern School of Business
Department of Economics
12-1/71
Part 12: Endogeneity
Econometrics I
Part 12 - Endogeneity
12-2/71
Part 12: Endogeneity
Sources of Endogeneity
One or more variables is correlated with the
disturbance.
 Sources:








12-3/71
Omitted variables
Nonrandom sampling
Nonrandom attrition from the sample
Voluntary participation in the program
(Not so) old fashioned simultaneous equations
Dynamics in panel data
Measurement error
Part 12: Endogeneity
The First IV Study – Omitted Variable
(Snow, J., On the Mode of Communication of Cholera, 1855)
London Cholera epidemic, ca 1853-4
Cholera = f(Water Purity,u)+ε.
 Effect of water purity on cholera?
 Purity=f(cholera prone environment (waste in streets,
rodents, etc.)). Regression does not work.
Two London water companies
Lambeth
Vauxhall/Southwark
=====|||||======
Main sewage discharge
Paul Grootendorst: A Review of Instrumental Variables Estimation of Treatment Effects…
http://individual.utoronto.ca/grootendorst/pdf/IV_Paper_Sept6_2007.pdf
12-4/71
Part 12: Endogeneity
IV Estimation – The Wald Estimator





Cholera=f(Water Purity,u)+ε
Z = water company Z = 1(Lambeth Water)
Cov(Cholera,Z)=δCov(Water Purity,Z); δ = Cov(C,Z)/Cov(WP,Z)
Z is randomly mixed in the population (two full sets of pipes) and uncorrelated
with behavioral unobservables, u)
Cholera=α+δPurity+u+ε
 Purity = Mean+random variation+λu
 Cov(Cholera,Z)= δCov(Purity,Z)
C1 = Proportion with Cholera & Lambeth Water
ˆδ = C1 − C0 C0 = Proportion with Cholera & Vauxhall Water
W1 − W0 W1 = Proportion with Bad Water & Lambeth Water
W0 = Proportion with Bad Water & Vauxhall Water
[This is two stage least squares C on (BadW) with Z = (WaterCo).]
12-5/71
Part 12: Endogeneity
Generalities
y = Xβ + Yδ + ε
Cov( X, ε) = 0, K1 variables
Cov(Y, ε) ≠ 0, K 2 variables
Y is endogenous
OLS regression of y on (X,Y) cannot estimate (β,δ )
consistently. Some other estimator is needed.
Additional structure:
Y =ZΠ +V where Cov(Z,ε)= 0.
An instrumental variable (IV) estimator based on (X,Z) may
be able to estimate (β,δ ) consistently.
12-6/71
Part 12: Endogeneity
The Standard Example
The Effect of Education on Earnings
LWAGE = β1 + β2EDUC + β3EXP + β4EXP 2 + ...ε+
What is ε? Ability, Motivation ,... + everything else
EDUC = f(GENDER, SMSA, SOUTH, Ability, Motivation,...)
12-7/71
Part 12: Endogeneity
What Influences LWAGE?
LWAGE = β1 + β2EDUC( X, Ability, Motivation,...)
+ β3EXP + β4EXP 2 + ...
ε(+ Ability, Motivation
)
Increased Ability is associated with increases in
EDUC( X, Ability, Motivation,...) and ε(Ability, Motivation )
What looks like an effect due to increase in EDUC may
be an increase in Ability. The estimate of β2 picks up
the effect of EDUC and the hidden effect of Ability.
12-8/71
Part 12: Endogeneity
An Exogenous Influence
LWAGE = β1 + β2EDUC( X, Z, Ability, Motivation,...)
+ β3EXP + β4EXP 2 + ...
ε(+ Ability, Motivation
)
Increased Z is associated with increases in
EDUC( X, Z, Ability, Motivation,...) and not ε(Ability, Motivation )
An effect due to the effect of an increase Z on EDUC will
only be an increase in EDUC. The estimate of β2 picks up
the effect of EDUC only.
Z is an Instrumental Variable
12-9/71
Part 12: Endogeneity
Instrumental Variables

Structure



LWAGE (ED,EXP,EXPSQ,WKS,OCC,
SOUTH,SMSA,UNION)
ED (MS, FEM)
Reduced Form:
LWAGE[ ED (MS, FEM),
EXP,EXPSQ,WKS,OCC,
SOUTH,SMSA,UNION ]
12-10/71
Part 12: Endogeneity
Two Stage Least Squares Strategy
Reduced Form:
LWAGE[ ED (MS, FEM,X),
EXP,EXPSQ,WKS,OCC,
SOUTH,SMSA,UNION ]
 Strategy




12-11/71
(1) Purge ED of the influence of everything but MS,
FEM (and the other X variables). Predict ED using all
exogenous information in the sample (X and Z).
(2) Regress LWAGE on this prediction of ED and
everything else.
Standard errors must be adjusted for the predicted ED
Part 12: Endogeneity
Cornwell and Rupert Data
Cornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 Years
Variables in the file are
EXP
WKS
OCC
IND
SOUTH
SMSA
MS
FEM
UNION
ED
LWAGE
=
=
=
=
=
=
=
=
=
=
=
work experience
weeks worked
occupation, 1 if blue collar,
1 if manufacturing industry
1 if resides in south
1 if resides in a city (SMSA)
1 if married
1 if female
1 if wage set by union contract
years of education
log of wage = dependent variable in regressions
These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel
Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied
Econometrics, 3, 1988, pp. 149-155. See Baltagi, page 122 for further analysis. The
data were downloaded from the website for Baltagi's text.
12-12/71
Part 12: Endogeneity
Specification: Quadratic Effect of Experience
12-13/71
Part 12: Endogeneity
OLS
12-14/71
Part 12: Endogeneity
2 Stage Least Squares
The weird results for the
coefficient on ED happened
because the instruments,
MS and FEM are dummy
variables. There is not
enough variation in these
variables.
12-15/71
Part 12: Endogeneity
Source of Endogeneity
LWAGE = f(ED,
EXP,EXPSQ,WKS,OCC,
SOUTH,SMSA,UNION) + ε
 ED
= f(MS,FEM,
EXP,EXPSQ,WKS,OCC,
SOUTH,SMSA,UNION) + u

12-16/71
Part 12: Endogeneity
Remove the Endogeneity

LWAGE = f(ED,
EXP,EXPSQ,WKS,OCC,
SOUTH,SMSA,UNION) + u + ε

Strategy


12-17/71
Estimate u
Add u to the equation. ED is uncorrelated with ε when u is in
the equation.
Part 12: Endogeneity
Auxiliary Regression for ED to Obtain Residuals
12-18/71
Part 12: Endogeneity
OLS with Residual (Control Function) Added
2SLS
12-19/71
Part 12: Endogeneity
A Warning About Control Function
12-20/71
Part 12: Endogeneity
The two stage LS strategy: (The two stage button in your software.)
The software regresses EDUC on all independent variables plus the
two instrumental variables (stage 1), then takes the predicted value on
education and regresses lwage on that predicted value plus the
original independent variables (stage 2). Is this correct?
Then the second method you showed is the same except the
predicted residuals are included in the second stage OLS.
Is one method preferred over another? They produce the same
results.
12-21/71
Part 12: Endogeneity
General Results
By construction, the IV estimator is consistent. So,
we have an estimator that is consistent when
least squares is not.
12-22/71
Part 12: Endogeneity
Instrumental Variables


Framework: y = Xβ + ε, K variables in X.
There exists a set of K variables, Z such that
plim(Z’X/n) ≠ 0 but plim(Z’ε/n) = 0

The variables in Z are called instrumental variables.
An alternative (to least squares) estimator of β is
bIV = (Z’X)-1Z’y

We consider the following:



12-23/71
Why use this estimator?
What are its properties compared to least squares?
We will also examine an important application
Part 12: Endogeneity
IV Estimators
Consistent
bIV = (Z’X)-1Z’y
= (Z’X/n)-1 (Z’X/n)β+ (Z’X/n)-1Z’ε/n
= β+ (Z’X/n)-1Z’ε/n  β
Asymptotically normal (same approach to proof as
for OLS)
Inefficient – (relative to what?) to be shown.
12-24/71
Part 12: Endogeneity
IV Estimation
Why use an IV estimator? Suppose that X and ε
are not uncorrelated. Then least squares is
neither unbiased nor consistent.
Recall the proof of consistency of least squares:
b = β + (X’X/n)-1(X’ε/n).
Plim b = β requires plim(X’ε/n) = 0. If this does not
hold, the estimator is inconsistent.
12-25/71
Part 12: Endogeneity
A Popular Misconception
A popular misconception. If only one variable in X is correlated with ε,
the other coefficients are consistently estimated. False.
Suppose only the first variable is correlated with ε
 σ1ε 
 
0
Under the assumptions, plim(X'ε/n) =   . Then
 ... 
 
 . 
 q 11 
 σ1ε 
 21 
 
0
q 
plim b -β = plim(X'X /n)-1   = σ1ε 
 ... 
 ... 
 K 1 
 
.
 
q 
= σ1ε times the first column of Q-1
The problem is “smeared” over the other coefficients.
12-26/71
Part 12: Endogeneity
Asymptotic Covariance Matrix of bIV
bIV − β =
(Z'X ) −1 Z ' ε
(bIV − β)(bIV − β) ' =
(Z'X )−1 Z ' εε'Z(X'Z)-1
E[(bIV − β)(bIV − β) ' | X, Z] =
σ2 (Z'X )−1 Z ' Z(X'Z)-1
12-27/71
Part 12: Endogeneity
Asymptotic Efficiency
Asymptotic efficiency of the IV estimator. The variance is
larger than that of LS. (A large sample type of GaussMarkov result is at work.)
(1) It’s a moot point. LS is inconsistent.
(2) Mean squared error is uncertain:
MSE[estimator|β]=Variance + square of bias.
IV may be better or worse. Depends on the data
12-28/71
Part 12: Endogeneity
Two Stage Least Squares
How to use an “excess” of instrumental variables
(1) X is K variables. Some (at least one) of the K
variables in X are correlated with ε.
(2) Z is M > K variables. Some of the variables in
Z are also in X, some are not. None of the
variables in Z are correlated with ε.
(3) Which K variables to use to compute Z’X and Z’y?
12-29/71
Part 12: Endogeneity
Choosing the Instruments




Choose K randomly?
Choose the included Xs and the remainder randomly?
Use all of them? How?
A theorem: There is a most efficient way to construct the
IV estimator from this subset:



(1) For each column (variable) in X, compute the predictions of
that variable using all the columns of Z.
(2) Linearly regress y on these K predictions.
This is two stage least squares
12-30/71
Part 12: Endogeneity
Algebraic Equivalence

Two stage least squares is equivalent to


12-31/71
(1) each variable in X that is also in Z is replaced by
itself.
(2) Variables in X that are not in Z are replaced by
predictions of that X with all the variables in Z that are
not in X.
Part 12: Endogeneity
2SLS Algebra
ˆ = Z(Z'Z)-1 Z'X
X
ˆ ˆ ) −1 X'y
ˆ
b2SLS = ( X'X
But, Z(Z'Z)-1 Z'X = (I - MZ ) X and (I - MZ ) is idempotent.
ˆ ˆ = X'(I - MZ )(I - MZ ) X = X'(I - MZ ) X so
X'X
ˆ ) −1 X'y
ˆ = a real IV estimator by the definition.
b2SLS = ( X'X
ˆ are linear combinations
ˆ ε/n) = 0 since columns of X
Note, plim(X'
of the columns of Z , all of which are uncorrelated with ε.
b2SLS = [ X'(I - MZ ) X ]-1 X'(I - MZ ) y
12-32/71
Part 12: Endogeneity
Asymptotic Covariance Matrix for 2SLS
General Result for Instrumental Variable Estimation
E[(bIV − β)(bIV − β) ' | X, Z] =
σ2 (Z'X )−1 Z ' Z(X'Z)-1
ˆ = (I - MZ ) X
Specialize for 2SLS, using Z = X
ˆ ) −1 X
ˆ 'X
ˆ (X'X
ˆ )-1
E[(b2SLS − β)(b2SLS − β) ' | X, Z] =
σ2 ( X'X
ˆ ˆ ) −1 X
ˆ 'X
ˆ (X'X
ˆ ˆ )-1
= σ2 ( X'X
ˆ ˆ ) −1
= σ2 ( X'X
12-33/71
Part 12: Endogeneity
2SLS Has Larger Variance than LS
A comparison to OLS
ˆ 'X
ˆ )-1
Asy.Var[2SLS]=σ2 ( X
Neglecting the inconsistency,
Asy.Var[LS] =σ2 ( X ' X )-1
(This is the variance of LS around its mean, not β)
Asy.Var[2SLS] ≥ Asy.Var[LS] in the matrix sense.
Compare inverses:
-1
ˆ 'X
ˆ]
{Asy.Var[LS]}-1 - {Asy.Var[2SLS]}=
(1 / σ2 )[X ' X - X
(1 / σ2 )[X ' X - X '(I − MZ ) X ]=(1 / σ2 )[X ' MZ X ]
=
This matrix is nonnegative definite. (Not positive definite
as it might have some rows and columns which are zero.)
Implication for "precision" of 2SLS.
The problem of "Weak Instruments"
12-34/71
Part 12: Endogeneity
Estimating σ2
Estimating the asymptotic covariance matrix a caution about estimating σ2 .
ˆ,
Since the regression is computed by regressing y on x
one might use
2
σ
ˆ =
1
n
ˆ 2sls )
Σni=1 (y i − x'b
This is inconsistent. Use
2
σ
ˆ =
1
n
Σni=1 (y i − x'b2sls )
(Degrees of freedom correction is optional. Conventional,
but not necessary.)
12-35/71
Part 12: Endogeneity
Robust Covariance Matrix for 2SLS
Robust VC Estimator =
(
ˆ ′X
ˆ
X
)
−1
 C
=
 ∑ c 1
(∑
nc
∑
nc
=t 1 =s 1
)( )
ˆ ( y − x′ β)
ˆ xˆ xˆ ′  X
ˆ ′X
ˆ
( yct − x′ct β)
cs
cs
ct cs 

Actual xct
12-36/71
−1
Predicted xct
Part 12: Endogeneity
Endogeneity Test? (Hausman)
Exogenous
Endogenous
OLS
Consistent, Efficient
Inconsistent
2SLS
Consistent, Inefficient
Consistent
Base a test on d = b2SLS - bOLS
Use a Wald statistic, d’[Var(d)]-1d
What to use for the variance matrix?
Hausman: V2SLS - VOLS
12-37/71
Part 12: Endogeneity
An Endogenous Dummy Variable

HospitalVisits = f(Age,Health,Insurance,
Other unobservables)

Insurance
12-38/71
= f(Expected Doctor Visits,
Age,Health,Gender,Income,Kids,
Education, Other unobservables)
Part 12: Endogeneity
Hausman Test
12-39/71
Part 12: Endogeneity
12-40/71
Part 12: Endogeneity
Rank = 1. Use a generalized inverse. Still not significant
12-41/71
Part 12: Endogeneity
Endogeneity Test: Wu
Considerable complication in Hausman test
(text, pp. 235-236)
 Simplification: Wu test.
 Regress y on X and X^ estimated for the
endogenous part of X. Then use an ordinary
Wald test.

12-42/71
Part 12: Endogeneity
Wu Test in C&R Wage Equation
12-43/71
Part 12: Endogeneity
Alternative to Hausman’s Formula?

H test requires the difference between an
efficient and an inefficient estimator.

Any way to compare any two competing
estimators even if neither is efficient?

Bootstrap? (Maybe)
12-44/71
Part 12: Endogeneity
12-45/71
Part 12: Endogeneity
Weak Instruments



Symptom: The relevance condition, plim Z’X/n not zero, is close to
being violated.
Detection:
 Standard F test in the regression of xk on Z. F < 10 suggests a
problem.
 F statistic based on 2SLS – see text p. 351.
Remedy:
 Not much – most of the discussion is about the condition, not
what to do about it.
 Use LIML? Requires a normality assumption. Probably not too
restrictive.
12-46/71
Part 12: Endogeneity
Two Problems with 2SLS

Z’X/n may not be sufficiently large. The
covariance matrix for the IV estimator is
Asy.Cov(b ) = σ2[(Z’X)(Z’Z)-1(X’Z)]-1


If Z’X/n -> 0, the variance explodes.
Additional problems:



2SLS biased toward plim OLS
Asymptotic results for inference fall apart.
When there are many instruments, X̂ is too close
to X; 2SLS becomes OLS.
12-47/71
Part 12: Endogeneity
Orthodoxy
12-48/71

A proxy is not an instrumental variable

Instrument is a noun, not a verb

Are you sure that the instrument is really
exogenous? The “natural experiment.”
Part 12: Endogeneity
Autism: Natural Experiment
Autism ----- Television watching
 Which way does the causation go?
 We need an instrument: Rainfall




Rainfall effects staying indoors which influences TV
watching
Rainfall is definitely absolutely truly exogenous, so it
is a perfect instrument.
The correlation survives, so TV “causes” autism.
12-49/71
Part 12: Endogeneity
Whitehouse (WJS) Commend on Waldman
When economists use one variable as a proxy for another—rainfall
patterns instead of TV viewing, for example—it’s not always clear what
the results actually measure.
Prof. Waldman’s willingness to hazard an opinion on a delicate matter of
science reflects the growing ambition of economists—and also their growing
hubris, in the view of critics. Academic economists are increasingly venturing
beyond their traditional stomping ground, a wanderlust that has produced
some powerful results but also has raised concerns about whether they’re
sometimes going too far.
12-50/71
Part 12: Endogeneity
Treatment Effect

Earnings and Education: Effect of an additional
year of schooling

Estimating Average and Local Average
Treatment Effects of Education when
Compulsory Schooling Laws Really Matter


12-51/71
Philip Oreopoulos
AER, 96,1, 2006, 152-175
Part 12: Endogeneity
Treatment Effects and Natural Experiments
12-52/71
Part 12: Endogeneity
Treatment Effects
Outcome Y and treatment T = 0/1
 Y = T*Y1 + (1-T)*Y0; Potential outcomes




Y1 = outcome if treated
Y0 = outcome if not treated
Y = x’b + dT + e


d = endogenous (causal) treatment effect
How to estimate?



12-53/71
2SLS
Sample Selection
Propensity Score Matching
Part 12: Endogeneity
Union Effect in LWAGE
12-54/71
Part 12: Endogeneity
Union Dummy
12-55/71
Part 12: Endogeneity
2SLS
12-56/71
Part 12: Endogeneity
Sample Selection
12-57/71
Part 12: Endogeneity
Sample Selection MLE
12-58/71
Part 12: Endogeneity
Matching
Estimates ATT = E[lnWage(1)|X,Union=1] – E[lnWage(1)|X,Union=0]
12-59/71
Part 12: Endogeneity
Measurement Error
y = βx* + ε all of the usual assumptions
x = x* + u the true x* is not observed
(education vs. years of school)
What happens when y is regressed on x? Least
squares attenutation:
cov(x,y) cov(x * +u, β x * +ε)
=
plim b =
var(x)
var(x * +u)
β var(x*)
=
<β
var(x*) + var(u)
12-60/71
Part 12: Endogeneity
Why Is Least Squares Attenuated?
y = βx* + ε
x = x* + u
y = βx + (ε - βu)
y = βx + v, cov(x,v) = - β var(u)
Some of the variation in x is not
associated with variation in y. The
effect of variation in x on y is
dampened by the measurement error.
12-61/71
Part 12: Endogeneity
Measurement Error in Multiple Regression
Multiple regression: y = β1 x1 * +β2 x 2 * + ε
x1 * is measured with error;=
x1 x1 * +u
x 2 is measured with out error.
The regression is estimated by least squares
Popular myth #1. b1 is biased downward, b2 consistent.
Popular myth #2. All coefficients are biased toward zero.
Result for the simplest case. Let
=
σij cov(x
=
1, 2 (2x2 covariance matrix)
i *, x j *),i, j
σij =ijth element of the inverse of the covariance matrix
θ2 = var(u)
For the least squares estimators:
 θ2 σ12 
1


plim b1 = β1 
, plim b2 = β2 − β1 
2 11 
2 11 
1
+
θ
σ


1 + θ σ 
The effect is called "smearing."
12-62/71
Part 12: Endogeneity
Twins
Application from the literature:
Ashenfelter/Kreuger: A wage
equation for twins that includes
“schooling.”
12-63/71
Part 12: Endogeneity
A study of moral hazard
Riphahn, Wambach, Million: “Incentive Effects in the Demand
for Healthcare”
Journal of Applied Econometrics, 2003
Did the presence of the ADDON insurance influence the
demand for health care – doctor visits and hospital visits?
12-64/71
Part 12: Endogeneity
Application: Health Care Panel Data
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Variables in the file are
Data downloaded from Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293
individuals. They can be used for regression, count models, binary choice, ordered choice, and bivariate
binary choice. This is a large data set. There are altogether 27,326 observations. The number of
observations ranges from 1 to 7. (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051,
6=1000, 7=987). Note, the variable NUMOBS below tells how many observations there are for each
person. This variable is repeated in each row of the data for the person. (Downloaded from the JAE
Archive)
DOCTOR
HOSPITAL
HSAT
DOCVIS
HOSPVIS
PUBLIC
ADDON
HHNINC
HHKIDS
EDUC
AGE
MARRIED
EDUC
12-65/71
=
=
=
=
=
=
=
=
1(Number of doctor visits > 0)
1(Number of hospital visits > 0)
health satisfaction, coded 0 (low) - 10 (high)
number of doctor visits in last three months
number of hospital visits in last calendar year
insured in public health insurance = 1; otherwise = 0
insured by add-on insurance = 1; otherswise = 0
household nominal monthly net income in German marks / 10000.
(4 observations with income=0 were dropped)
= children under age 16 in the household = 1; otherwise = 0
= years of schooling
= age in years
= marital status
= years of education
Part 12: Endogeneity
Evidence of Moral Hazard?
12-66/71
Part 12: Endogeneity
Regression Study
12-67/71
Part 12: Endogeneity
Endogenous Dummy Variable

HospitalVisits = f(Age,Health,Insurance,
Other unobservables)

Insurance
12-68/71
= f(Expected Doctor Visits,
Age,Health,Gender,Income,Kids,
Education, Other unobservables)
Part 12: Endogeneity
Approaches

(Parametric) Control Function: Build a structural
model for the two variables (Heckman)

(Semiparametric) Instrumental Variable: Create
an instrumental variable for the dummy variable
(Barnow/Cain/ Goldberger, Angrist, Current
generation of researchers)
12-69/71
Part 12: Endogeneity
Heckman’s Control Function Approach


Y = xβ + δT + E[ε|T] + {ε - E[ε|T]}
λ = E[ε|T] , computed from a model for whether T = 0 or 1
Result is insignificant and goes in the wrong direction.
12-70/71
Part 12: Endogeneity
Instrumental Variable Approach
Construct a prediction for AddOn using only the exogenous
information. Use 2SLS using this instrumental variable.
Same non-result.
12-71/71
Part 12: Endogeneity