Simultaneity Issues in Ordinary Least Squares
Amine Ouazad
Assistant Professor of Economics
Recap from last sessions
• OLS is consistent under the linearity assumption, the full rank assumption, and the exogeneity of the covariates assumption.
• The exogeneity of the covariates (A3) is violated whenever:
1. There is an omitted variable in the residual which is correlated with the covariates. (last session)
2. There is measurement error. (last session)
3. There is a reverse causality or simultaneity problem (this session).
• These three issues cause identification problems: even if sample size is infinite, the estimator does not come arbitrarily close to the true value.
• The OLS estimator is inconsistent.
• The OLS estimator is biased.
Outline
1. Supply/demand estimation
2. Simultaneous Equation Model (X Rated)
3. Bruce Sacerdote, Peer effects with random
assignment: Results for Dartmouth
roommates, Quarterly Journal of Economics,
2001.
SUPPLY AND DEMAND ESTIMATION
[Figures omitted: price (current US $) and total production (barrels per day) plotted by year, 1940 to 2020, and price plotted against total production.]
Source: BP Statistical Review, 2009. Forecastchart.com
. eststo: regress totalquantity price
. esttab, se r2

                      (1)
             totalquant~y
price            481.3***
                 (78.93)
_cons          52098.0***
                (2224.4)
N                      44
R-sq                0.470
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
Supply/demand estimation
The problem
• p_t: price at time t.
• q_t: quantity at time t.
• Specification: q_t = a + b p_t + e_t.
• The OLS regression of q_t on p_t does not satisfy A3 because there is a correlation between price changes and the unobservables e_t.
Full model
Supply: $q_t = a + b\,p_t + e_t$
Demand: $q_t = c - d\,p_t + u_t$
• q_t, p_t: endogenous variables.
• e_t: supply shock, u_t: demand shock.
Market equilibrium:
Notice the effect of supply and demand shocks on price and quantity.
$p_t = \frac{c-a}{b+d} + \frac{u_t - e_t}{b+d}$
$q_t = \frac{cb+ad}{b+d} - \frac{d}{b+d}(u_t - e_t) + u_t = \frac{cb+ad}{b+d} + \frac{b\,u_t + d\,e_t}{b+d}$
Hence Cov(p_t, e_t) is nonzero, and the regression of quantity on price does not yield a consistent estimator of either the demand curve or the supply curve.
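The inconsistency can be checked numerically. The sketch below simulates the two-equation model and regresses quantity on price. It assumes illustrative parameter values (a, b, c, d) = (1, 2, 4, 5) and independent standard normal shocks, none of which come from the oil data. Under these assumptions the OLS slope converges to (b Var(u) − d Var(e)) / (Var(u) + Var(e)), which is neither the supply slope b nor the demand slope −d.

```python
import random


def simulate_market(a, b, c, d, n, seed=0):
    """Draw n equilibrium (price, quantity) pairs.

    Supply: q = a + b*p + e, demand: q = c - d*p + u.
    e and u are independent standard normal shocks (an assumption of this sketch).
    """
    rng = random.Random(seed)
    prices, quantities = [], []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)  # supply shock
        u = rng.gauss(0.0, 1.0)  # demand shock
        p = (c - a + u - e) / (b + d)  # equilibrium price
        q = a + b * p + e              # equilibrium quantity (read off the supply curve)
        prices.append(p)
        quantities.append(q)
    return prices, quantities


def ols_slope(x, y):
    """Slope of the OLS regression of y on x (with a constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Illustrative values, not estimates from the oil data.
a, b, c, d = 1.0, 2.0, 4.0, 5.0
prices, quantities = simulate_market(a, b, c, d, n=200_000)
slope = ols_slope(prices, quantities)

# Probability limit of the OLS slope when Var(e) = Var(u) = 1 and Cov(e, u) = 0.
plim = (b * 1.0 - d * 1.0) / (1.0 + 1.0)
print("OLS slope:", slope, "plim:", plim, "supply slope:", b, "demand slope:", -d)
```

The estimate settles on a mixture of the two slopes, weighted by the shock variances, no matter how large the sample is.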
Exogenous/Endogenous
• Greek: ενδογενής, meaning "proceeding from
within" ("ενδο"=inside "-γενής"=coming
from), the complement of exogenous (Greek:
εξωγενής exo, "έξω"= outside) "proceeding
from outside".
• Definition of exogenous/endogenous depends
on the model. For instance, in the previous
model, if
Structural, Identifiable Parameters
• Structural parameters: (a, b, c, d, Var(e_t), Var(u_t))
• Identifiable parameters:
$\frac{c-a}{b+d},\quad \frac{cb+ad}{b+d},\quad \mathrm{Var}\left(\frac{u_t-e_t}{b+d}\right),\quad \mathrm{Var}\left(\frac{b\,u_t+d\,e_t}{b+d}\right)$
– These parameters are the mean of p_t, the mean of q_t, the variance of p_t, and the variance of q_t.
• Hence there are 4 identifiable parameters and 6 structural parameters. The model is not identified.
Observational equivalence, example
$q_t = 1 + 2p_t + e_t$
$q_t = 4 - 5p_t + u_t$
• With Cov(e_t, u_t) = 0, Var(e_t) = 1, Var(u_t) = 1.
• Then, the following demand and supply schedules give the same distributions of prices, quantities, and correlation between price and quantity:
$q_t = 1 + 2p_t + e_t$
$q_t = 2 - \frac{1}{3}p_t + \frac{2e_t + u_t}{3}$
• How did I find this??
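The equivalence can be verified with exact arithmetic. One construction that produces this pair (not necessarily the one used for the slide) is to replace the demand equation by 2/3 times the supply equation plus 1/3 times the demand equation. The sketch below solves each system for the equilibrium price and quantity and compares their means, variances and covariance. The tuple layout and function name are choices made for this illustration.

```python
from fractions import Fraction as F


def reduced_form_moments(eq1, eq2):
    """Moments of equilibrium (p, q) for a system of two linear equations.

    Each equation is (intercept, slope, weight_on_e, weight_on_u), meaning
        q = intercept + slope * p + weight_on_e * e + weight_on_u * u.
    Assumes Var(e) = Var(u) = 1 and Cov(e, u) = 0.
    Returns (E[p], E[q], Var(p), Var(q), Cov(p, q)).
    """
    i1, s1, e1, u1 = map(F, eq1)
    i2, s2, e2, u2 = map(F, eq2)
    denom = s1 - s2
    mean_p = (i2 - i1) / denom
    # Loadings of the equilibrium price on e and u.
    p_e, p_u = (e2 - e1) / denom, (u2 - u1) / denom
    # Quantity follows from the first equation.
    mean_q = i1 + s1 * mean_p
    q_e, q_u = s1 * p_e + e1, s1 * p_u + u1
    var_p = p_e ** 2 + p_u ** 2
    var_q = q_e ** 2 + q_u ** 2
    cov_pq = p_e * q_e + p_u * q_u
    return mean_p, mean_q, var_p, var_q, cov_pq


supply = (1, 2, 1, 0)                            # q = 1 + 2p + e
demand = (4, -5, 0, 1)                           # q = 4 - 5p + u
alt_demand = (2, F(-1, 3), F(2, 3), F(1, 3))     # q = 2 - p/3 + (2e + u)/3

original = reduced_form_moments(supply, demand)
alternative = reduced_form_moments(supply, alt_demand)
print(original)
print(alternative)
```

Both systems give the same five moments, so no amount of price and quantity data alone can tell the two demand curves apart.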
Solving the simultaneity problem
Intuition
• If you have a variable that affects demand
without affecting supply, then it is possible to
identify the supply curve.
• If you have a variable that affects supply
without affecting demand, then it is possible
to identify the demand curve.
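• Here we have this:
– temperature affects only supply!
– We are able to estimate the demand curve. How?
Consider the following ratio, where z_t is a variable that shifts supply only (here, temperature):
$\frac{\mathrm{Cov}(q_t, z_t)}{\mathrm{Cov}(p_t, z_t)}$
• Prove that this ratio of covariances is equal to the slope of the demand curve (−d in the full model above).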
– Under what assumption?
• z is called a supply shifter.
– A supply shifter identifies demand.
• Question: what if you had one variable that
affects demand without affecting supply?
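A sketch of the argument for the supply shifter, written in the notation of the full model (demand: q_t = c − d p_t + u_t). The two assumptions are that z_t is uncorrelated with the demand shock and that z_t actually moves the price.

```latex
\begin{aligned}
\mathrm{Cov}(q_t, z_t) &= \mathrm{Cov}(c - d\,p_t + u_t,\; z_t)
  = -d\,\mathrm{Cov}(p_t, z_t) + \mathrm{Cov}(u_t, z_t), \\
\frac{\mathrm{Cov}(q_t, z_t)}{\mathrm{Cov}(p_t, z_t)} &= -d
  \qquad \text{if } \mathrm{Cov}(u_t, z_t) = 0 \text{ and } \mathrm{Cov}(p_t, z_t) \neq 0 .
\end{aligned}
```

The same steps applied to the supply equation, with a variable that shifts demand only, give the supply slope b.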
Stata application
ivreg quantity (price = temperature)
• This regression estimates the demand curve,
since temperature affects only supply.
• This is called an instrumental variable
regression, to be seen later in econometrics A.
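With a single instrument and a constant, the instrumental variable slope is the ratio Cov(q, z) / Cov(p, z). The sketch below simulates a market with a supply shifter and compares that ratio with the OLS slope. The parameter values, the coefficient g on the shifter and the standard normal distribution of z are assumptions made for this illustration, not features of the oil data.

```python
import random


def covariance(x, y):
    """Sample covariance of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n


def simulate_with_shifter(a, b, c, d, g, n, seed=0):
    """Supply: q = a + b*p + g*z + e, demand: q = c - d*p + u.

    z is a hypothetical supply shifter (think of temperature), independent of
    the shocks e and u; all three are standard normal in this sketch.
    """
    rng = random.Random(seed)
    prices, quantities, shifters = [], [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)
        p = (c - a - g * z + u - e) / (b + d)  # equilibrium price
        q = c - d * p + u                      # equilibrium quantity (read off the demand curve)
        prices.append(p)
        quantities.append(q)
        shifters.append(z)
    return prices, quantities, shifters


a, b, c, d, g = 1.0, 2.0, 4.0, 5.0, 3.0  # illustrative values
p, q, z = simulate_with_shifter(a, b, c, d, g, n=200_000)

iv_slope = covariance(q, z) / covariance(p, z)   # uses only variation in p driven by z
ols_slope = covariance(q, p) / covariance(p, p)  # uses all variation in p
print("IV slope:", iv_slope, "OLS slope:", ols_slope, "demand slope:", -d)
```

The ratio of covariances lands close to the demand slope, while the OLS slope stays away from it because price is still correlated with the demand shock.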
(Reference: William Greene, Simultaneous Equations Model)
SIMULTANEOUS EQUATIONS
Structural form of the model
• y_t1, …, y_tM are the endogenous variables.
• x_t1, …, x_tK are the exogenous variables.
• e_t1, …, e_tM are the structural residuals/shocks/unobservables.
• t: time periods.
In matrix form:
$y_t' \Gamma + x_t' B = \varepsilon_t'$
Reduced form of the model
$y_t' = -x_t' B \Gamma^{-1} + \varepsilon_t' \Gamma^{-1}$
• Joy! This reduced form model can be estimated as is, with M separate equations: the OLS estimator of the regression of each element of y_t on x_t is consistent.
• But wait a minute: from $B\Gamma^{-1}$ it is not possible to recover all the elements of B and Γ.
Exercise
• Write the structural form of the model for the
oil example.
• Hints:
– There are 2 equations.
– 2 Endogenous variables: pt qt.
– 1 Exogenous variable: the constant.
• Write the reduced form model using the
previous formula. Do we find the same
solution?
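One possible way to set up the exercise, offered as a sketch: order the endogenous variables as y_t' = (q_t, p_t), let the first column of Γ and B hold the supply equation and the second column the demand equation, and take x_t = 1. The ordering, the function names and the numerical values (a, b, c, d) = (1, 2, 4, 5) are assumptions of this illustration.

```python
from fractions import Fraction as F


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as [[m11, m12], [m21, m22]]."""
    (m11, m12), (m21, m22) = m
    det = m11 * m22 - m12 * m21
    return [[m22 / det, -m12 / det], [-m21 / det, m11 / det]]


def reduced_form_constants(a, b, c, d):
    """Return (pi_q, pi_p), the reduced-form constants for q_t and p_t.

    Structural form y' Gamma + x' B = eps' with y' = (q, p) and x = 1:
      column 1 (supply): q - b*p - a = e
      column 2 (demand): q + d*p - c = u
    """
    a, b, c, d = map(F, (a, b, c, d))
    gamma = [[F(1), F(1)],
             [-b, d]]
    big_b = [-a, -c]  # the single row of B (K = 1)
    g_inv = inverse_2x2(gamma)
    # Pi = -B Gamma^{-1}, a 1 x 2 row vector.
    pi_q = -(big_b[0] * g_inv[0][0] + big_b[1] * g_inv[1][0])
    pi_p = -(big_b[0] * g_inv[0][1] + big_b[1] * g_inv[1][1])
    return pi_q, pi_p


pi_q, pi_p = reduced_form_constants(1, 2, 4, 5)
print("mean of q:", pi_q, "mean of p:", pi_p)
```

With these values the formula returns (cb + ad)/(b + d) = 13/7 for quantity and (c − a)/(b + d) = 3/7 for price, the same constants as in the market equilibrium solved by hand.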
Matrix form notation: Structural model
𝑌Γ + 𝑋𝐵 = 𝐸
• Y : TxM matrix. T rows, M columns.
– M = 2 in the oil example.
• X: TxK matrix.
– K = 1 in the oil example.
• E: TxM matrix.
• Exogeneity: E(E|X) = 0, and each row ε_t of E has variance-covariance matrix E(ε_t ε_t'|X) = Σ.
Matrix form notation:
Reduced form model
𝑌 = 𝑋Π + 𝑉
• Π: the matrix of reduced form parameters (KxM matrix).
• V: the TxM matrix of residuals, with variance-covariance matrix Ω. The var-cov matrix has size MxM.
Identification of
Reduced Form parameters
• Parameters in the structural form model:
– M*M + K*M + ½ M(M+1)
– Γ matrix, B matrix, Σ matrix.
• Parameters in the reduced form model:
– K*M + ½ M(M+1)
– Π matrix, Ω matrix.
• Aïe! M*M parameters 'too many'!
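The counts above can be tabulated for any system size. A small sketch, with a function name chosen for this illustration, that counts parameters before any normalization or restriction:

```python
def parameter_counts(M, K):
    """Count free parameters with M endogenous and K exogenous variables.

    Returns (structural, reduced_form, excess), before any normalization.
    """
    covariance_terms = M * (M + 1) // 2           # a symmetric M x M matrix
    structural = M * M + K * M + covariance_terms  # Gamma, B, Sigma
    reduced_form = K * M + covariance_terms        # Pi, Omega
    return structural, reduced_form, structural - reduced_form


# Oil example: M = 2 endogenous variables (price, quantity), K = 1 (the constant).
oil_counts = parameter_counts(M=2, K=1)
print(oil_counts)
```

The excess is always M*M, whatever the number of exogenous variables.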
Solutions?
1. Normalizations:
– in each equation, set the coefficient of the dependent variable equal to 1. The number of excess parameters is then M(M-1).
2. Identities & Restrictions:
– Pin down relationships between parameters.
3. Exclusions:
– Political events have an effect on supply, but not on
world demand.
4. Restrictions on the variance covariance matrix:
– Assume 0 correlation between disturbances in the
reduced form model.
Notations for equation j
• Considering equation j in isolation…
• We set the coefficient on yj equal to 1.
• We are going to exclude endogenous variables
(exactly Mj* variables) and exclude exogenous
variables (Kj* variables).
Equation j
Finding the structural parameters
$\Pi_j^* \gamma_j = \pi_j^*$
• This equation gives the structural parameters of equation j. It has K_j* equations (the rows of Π_j*) and M_j unknowns (the coefficients on the endogenous variables included in equation j).
Order condition
$K_j^* \geq M_j$
• The number of exogenous variables excluded
in equation j must be at least as large as the
number of endogenous variables included in
equation j.
• Relationship with agricultural product
exercise??
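The order condition is a counting rule, so it can be checked mechanically. The sketch below applies it to the supply and demand system with temperature as a supply shifter. The variable names and the function are hypothetical, and the count of endogenous variables refers to those on the right-hand side of the equation.

```python
def order_condition_holds(all_exogenous, included_exogenous, rhs_endogenous):
    """Check K_j* >= M_j for one equation.

    all_exogenous:      exogenous variables of the whole system
    included_exogenous: exogenous variables appearing in equation j
    rhs_endogenous:     endogenous variables on the right-hand side of equation j
    """
    excluded = set(all_exogenous) - set(included_exogenous)
    return len(excluded) >= len(rhs_endogenous)


system_exogenous = {"const", "temperature"}

# Demand excludes temperature, so one excluded exogenous variable is available for price.
demand_ok = order_condition_holds(system_exogenous, {"const"}, {"price"})
# Supply includes every exogenous variable, so nothing is left over for price.
supply_ok = order_condition_holds(system_exogenous, {"const", "temperature"}, {"price"})
print("demand:", demand_ok, "supply:", supply_ok)
```

The condition is necessary but not sufficient, which is why the rank condition is stated separately.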
Rank condition
$\mathrm{rank}[\Pi_j^*] = M_j$
• This condition imposes a restriction on the
submatrix of the reduced-form coefficient
matrix.
Deducing the coefficients of the
exogenous variables
$\beta_j = \pi_j - \Pi_j \gamma_j$
• This equation gives the coefficients of the
exogenous covariates of equation j as a
function of known quantities.
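A worked instance of the two formulas, using exact arithmetic. It assumes a system with one supply shifter z (supply: q = a + b p + g z + e, demand: q = c − d p + u), with the demand equation as equation j, normalized on q. There, Π_j* and π_j* are scalars: the reduced-form coefficients on z in the price and quantity equations. The parameter values and function names are illustrative.

```python
from fractions import Fraction as F


def reduced_form(a, b, c, d, g):
    """Reduced-form coefficients on (constant, z) for q and for p.

    Supply: q = a + b*p + g*z + e, demand: q = c - d*p + u.
    Returns ((pi_q0, pi_qz), (pi_p0, pi_pz)).
    """
    a, b, c, d, g = map(F, (a, b, c, d, g))
    pi_p0 = (c - a) / (b + d)
    pi_pz = -g / (b + d)
    # Quantity follows from substituting the price equation into supply.
    pi_q0 = a + b * pi_p0
    pi_qz = b * pi_pz + g
    return (pi_q0, pi_qz), (pi_p0, pi_pz)


def demand_from_reduced_form(pi_q, pi_p):
    """Recover the demand equation from reduced-form coefficients alone.

    gamma_j solves Pi_j* gamma_j = pi_j* (the row of the excluded variable z).
    beta_j = pi_j - Pi_j gamma_j (the row of the included variable, the constant).
    """
    pi_q0, pi_qz = pi_q
    pi_p0, pi_pz = pi_p
    gamma_j = pi_qz / pi_pz            # coefficient on price in the demand equation
    beta_j = pi_q0 - pi_p0 * gamma_j   # constant of the demand equation
    return gamma_j, beta_j


pi_q, pi_p = reduced_form(a=1, b=2, c=4, d=5, g=3)
gamma_j, beta_j = demand_from_reduced_form(pi_q, pi_p)
print("price coefficient:", gamma_j, "constant:", beta_j)
```

The recovered values are −d = −5 and c = 4. The supply equation cannot be recovered this way, because it excludes no exogenous variable.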
Take Away
• With simultaneous equations, only the
reduced-form model is typically identified in
OLS.
– You cannot interpret the results of an OLS of the
structural model (where endogenous variables are
in the covariates).
• However, by making suitable assumptions on
the exclusion of exogenous variables, you can
identify the model.
Take Away
• 4 steps:
1. Estimate the reduced form model.
2. Make the necessary exclusion restrictions.
3. Write the structural parameters as a function of
the reduced form parameters.
4. Solve the system of equations.
SOCIAL INTERACTIONS:
SACERDOTE QJE 2001
The question
• What is the impact of a roommate’s
characteristic and behavior on your
GPA/behavior ?
– Characteristic: SAT score (before university),
gender, age, any x that is not changing over time.
– Behavior: any variable that is a choice, such as
choice of major, achievement.
The issues
• Regressing my GPA on the GPA of my
roommate has a number of problems…
– First, the GPA of my roommate is also determined
by my GPA: simultaneity bias. (A3)
– Second, if roommates are not randomly allocated,
and if, for instance, having a male roommate is
correlated with having a drinking roommate, then:
omitted variable bias. (A3)
– Third, there are some common shocks that affect
both me and my roommate at the same time:
correlation of the error terms. (A4)
– Fourth, the effect of my roommate may depend
on my characteristics, and also on his other
characteristics: non linearity. (A1)
The structural model
• Academic ability, Measurement error, Grade
point average of the roommate, residual.
• Assumptions A1,A2 are maintained. The
exogeneity of the GPA of the other roommate
(A3) is violated.
The reduced form model
• Notice that right-hand side variables are
purely exogenous.
• Identification problem: given π_0, π_1, and π_2, can we recover the structural parameters? Without any further constraint, no.
• Notice that there will be correlation of the
residuals across individuals.
Identification problem
• In the reduced form, assuming A1,A2,A3, the OLS
is consistent. A4 is violated (more on this in the
next session).
– regress gpa gpa_roommate characteristics
characteristics_roommate gives consistent and
unbiased estimates of the effects.
• (To correct for A4, add option cluster(room) if
room is a variable indicating the room number).
Randomization
• Individuals are randomized conditionally on
their stated preferences. Violation of A3? Not
if the preferences are present as covariates.
– Conditional randomization, E(e|X) = 0.
• Method of randomization?
Results
Checking Linearity (A1)
Take Away from this session
• Spot Reverse Causality issues in papers!
– Sometimes mild, sometimes very severe
(demand/supply, social interactions)
• You can solve the problem by instrumenting the endogenous variable with a variable that affects the endogenous variable without directly affecting the outcome (a demand shifter for supply, a supply shifter for demand).
OLS: Three identification issues
1. Measurement error
2. Omitted variable bias
3. Reverse causality/Simultaneity