Simultaneity Issues in Ordinary Least Squares
Amine Ouazad, Assistant Professor of Economics

Recap from last sessions
• OLS is consistent under the linearity assumption, the full rank assumption, and the assumption that the covariates are exogenous.
• The exogeneity of the covariates (A3) is violated whenever:
1. There is an omitted variable in the residual that is correlated with the covariates (last session).
2. There is measurement error (last session).
3. There is a reverse causality or simultaneity problem (this session).
• These three issues cause identification problems: even if the sample size is infinite, the estimator does not come arbitrarily close to the true value.
• The OLS estimator is inconsistent.
• The OLS estimator is biased.

Outline
1. Supply/demand estimation
2. Simultaneous Equation Model (X Rated)
3. Bruce Sacerdote, Peer effects with random assignment: Results for Dartmouth roommates, Quarterly Journal of Economics, 2001.

SUPPLY AND DEMAND ESTIMATION

[Figures: oil price in current US $ by year; total production in barrels per day by year; price against total production in barrels per day. Source: BP Statistical Review, 2009; Forecastchart.com.]

    . eststo: regress totalquantity price
    . esttab, se r2

                      (1)
                      totalquant~y
    price             481.3***
                      (78.93)
    _cons             52098.0***
                      (2224.4)
    N                 44
    R-sq              0.470
    Standard errors in parentheses
    * p<0.05, ** p<0.01, *** p<0.001

Supply/demand estimation: The problem
• $p_t$: price at time t.
• $q_t$: quantity at time t.
• Specification: $q_t = a + b p_t + e_t$.
• The OLS regression of $q_t$ on $p_t$ does not satisfy A3, because the price is correlated with the unobservable $e_t$.

Full model
$q_t = a + b p_t + e_t$ (supply)
$q_t = c - d p_t + u_t$ (demand)
• $q_t$, $p_t$: endogenous variables.
• $e_t$: supply shock, $u_t$: demand shock.

Market equilibrium
Notice the effect of supply and demand shocks on price and quantity.
$p_t = \frac{c - a}{b + d} + \frac{u_t - e_t}{b + d}$
$q_t = \frac{cb + ad}{b + d} + \frac{b u_t + d e_t}{b + d}$
(The second line follows from substituting $p_t$ into the supply equation.)
Hence $Cov(p_t, e_t)$ is nonzero, and the regression of quantity on price does not yield a consistent estimator of either the demand or the supply curve.

Exogenous/Endogenous
• Endogenous, from the Greek ενδογενής, meaning "proceeding from within" ("ενδο" = inside, "-γενής" = coming from); the complement of exogenous, from the Greek εξωγενής ("έξω" = outside), "proceeding from outside".
• The definition of exogenous/endogenous depends on the model. For instance, in the previous model, if …

Structural, Identifiable Parameters
• Structural parameters: $(a, b, c, d, Var(e_t), Var(u_t))$
• Identifiable parameters:
$\frac{c - a}{b + d}$, $\frac{cb + ad}{b + d}$, $Var\left(\frac{u_t - e_t}{b + d}\right)$, $Var\left(\frac{b u_t + d e_t}{b + d}\right)$
– These parameters are the mean of $p_t$, the mean of $q_t$, the variance of $p_t$, and the variance of $q_t$.
• Hence there are 4 identifiable parameters (5 if $Cov(p_t, q_t)$ is counted) and 6 structural parameters. The model is not identified.

Observational equivalence, example
$q_t = 1 + 2 p_t + e_t$
$q_t = 4 - 5 p_t + u_t$
• With $Cov(e_t, u_t) = 0$, $Var(e_t) = 1$, $Var(u_t) = 1$.
• Then the following demand and supply schedule gives the same distributions of prices and quantities, and the same correlation between price and quantity:
$q_t = 1 + 2 p_t + e_t$
$q_t = 2 - \frac{1}{3} p_t + \frac{2 e_t + u_t}{3}$
• How did I find this?

Solving the simultaneity problem: Intuition
• If you have a variable that affects demand without affecting supply, then it is possible to identify the supply curve.
• If you have a variable that affects supply without affecting demand, then it is possible to identify the demand curve.
• Here we have this:
– Temperature affects only supply!
– We are able to estimate the demand curve.

How? Consider the following ratio, where $z_t$ is the variable that affects only supply:
$\frac{Cov(q_t, z_t)}{Cov(p_t, z_t)}$
• Prove that this ratio of covariances is equal to the slope of the demand curve ($-d$ in the full model).
– Under what assumption?
• z is called a supply shifter.
– A supply shifter identifies demand.
• Question: what if you had one variable that affects demand without affecting supply?
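One way to work through the covariance question above is sketched here. It assumes the supply equation is extended with the shifter $z_t$ and a coefficient $g$; both the extended equation and the symbol $g$ are illustrative assumptions, not part of the slides. In the notation of the full model the demand slope is $-d$, so that is the value the ratio of covariances should deliver.

```latex
% Assumed model: supply is shifted by z_t with a hypothetical coefficient g.
\begin{align*}
\text{Supply:}\quad & q_t = a + b\,p_t + g\,z_t + e_t \\
\text{Demand:}\quad & q_t = c - d\,p_t + u_t \\[4pt]
\operatorname{Cov}(q_t, z_t)
  &= \operatorname{Cov}(c - d\,p_t + u_t,\; z_t)
   = -d\,\operatorname{Cov}(p_t, z_t) + \operatorname{Cov}(u_t, z_t) \\
  &= -d\,\operatorname{Cov}(p_t, z_t)
   \quad \text{if } \operatorname{Cov}(u_t, z_t) = 0 \\[4pt]
\Rightarrow\quad
\frac{\operatorname{Cov}(q_t, z_t)}{\operatorname{Cov}(p_t, z_t)} &= -d
   \quad \text{provided } \operatorname{Cov}(p_t, z_t) \neq 0 .
\end{align*}
% Relevance: the equilibrium price is
% p_t = (c - a - g z_t + u_t - e_t)/(b + d), so with Cov(z_t, e_t) = 0 as well,
\[
\operatorname{Cov}(p_t, z_t) = -\frac{g}{b + d}\,\operatorname{Var}(z_t) \neq 0
\quad \text{whenever } g \neq 0 .
\]
```

Two assumptions do the work: the shifter is uncorrelated with the demand shock (exclusion from demand), and it actually moves the price ($g \neq 0$). By symmetry, a variable that enters only the demand equation would identify the supply slope $b$ in the same way.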
Stata application
    ivreg quantity (price = temperature)
• This regression estimates the demand curve, since temperature affects only supply.
• This is called an instrumental variable regression, to be seen later in Econometrics A.

SIMULTANEOUS EQUATIONS
(Reference: William Greene, Simultaneous Equations Model)

Structural form of the model
• $y_{t1}, \dots, y_{tM}$ are the endogenous variables.
• $x_{t1}, \dots, x_{tK}$ are the exogenous variables.
• $\varepsilon_{t1}, \dots, \varepsilon_{tM}$ are the structural residuals/shocks/unobservables.
• t: time periods.
In matrix form:
$y_t' \Gamma + x_t' B = \varepsilon_t'$

Reduced form of the model
$y_t' = -x_t' B \Gamma^{-1} + \varepsilon_t' \Gamma^{-1}$
• Joy! This reduced form model can be estimated as is, as M separate equations: the OLS estimator of the regression of each element of $y_t$ on $x_t$ is consistent.
• But wait a minute: from $B \Gamma^{-1}$ it is not possible to recover all the elements of $B$ and $\Gamma$.

Exercise
• Write the structural form of the model for the oil example.
• Hints:
– There are 2 equations.
– 2 endogenous variables: $p_t$, $q_t$.
– 1 exogenous variable: the constant.
• Write the reduced form model using the previous formula. Do we find the same solution?

Matrix form notation: Structural model
$Y \Gamma + X B = E$
• Y: T×M matrix. T rows, M columns.
– M = 2 in the oil example.
• X: T×K matrix.
– K = 1 in the oil example.
• E: T×M matrix.
• Exogeneity: $E(E \mid X) = 0$ and $E(\frac{1}{T} E'E \mid X) = \Sigma$.

Matrix form notation: Reduced form model
$Y = X \Pi + V$
• $\Pi$: the matrix of reduced form parameters (K×M matrix).
• V: the T×M matrix of reduced form residuals, with variance-covariance matrix $\Omega$ of size M×M.

Identification of Reduced Form parameters
• Parameters in the structural form model:
– $M^2 + KM + \frac{1}{2} M(M+1)$
– $\Gamma$ matrix, $B$ matrix, $\Sigma$ matrix.
• Parameters in the reduced form model:
– $KM + \frac{1}{2} M(M+1)$
– $\Pi$ matrix, $\Omega$ matrix.
• Aie! $M^2$ parameters "too many"!

Solutions?
1. Normalizations:
– In each equation, set the coefficient of the dependent variable equal to 1. The number of excess parameters is then M(M-1).
2. Identities & Restrictions:
– Pin down relationships between parameters.
3. Exclusions:
– Political events have an effect on supply, but not on world demand.
4. Restrictions on the variance-covariance matrix:
– Assume 0 correlation between disturbances in the reduced form model.

Notations for equation j
• Considering equation j in isolation…
• We set the coefficient on $y_j$ equal to 1.
• We are going to exclude endogenous variables (exactly $M_j^*$ variables) and exclude exogenous variables ($K_j^*$ variables).

Equation j

Finding the structural parameters
$\Pi_j^* \gamma_j = \pi_j^*$
• This equation gives the structural parameters of equation j. It has $K_j^*$ equations (the rows of $\Pi_j^*$) and $M_j$ unknowns (the coefficients on the included endogenous variables).

Order condition
$K_j^* \geq M_j$
• The number of exogenous variables excluded from equation j must be at least as large as the number of endogenous variables included in equation j.
• Relationship with agricultural product exercise?

Rank condition
$rank[\Pi_j^*] = M_j$
• This condition imposes a restriction on a submatrix of the reduced-form coefficient matrix.

Deducing the coefficients of the exogenous variables
$\beta_j = \pi_j - \Pi_j \gamma_j$
• This equation gives the coefficients of the exogenous covariates of equation j as a function of known quantities.

Take Away
• With simultaneous equations, typically only the reduced-form model can be consistently estimated by OLS.
– You cannot interpret the results of an OLS regression of the structural model (where endogenous variables are among the covariates).
• However, by making suitable assumptions on the exclusion of exogenous variables, you can identify the model.

Take Away
• 4 steps:
1. Estimate the reduced form model.
2. Make the necessary exclusion restrictions.
3. Write the structural parameters as a function of the reduced form parameters.
4. Solve the system of equations.

SOCIAL INTERACTIONS: SACERDOTE QJE 2001

The question
• What is the impact of a roommate's characteristics and behavior on your GPA/behavior?
– Characteristic: SAT score (before university), gender, age, any x that does not change over time.
– Behavior: any variable that is a choice, such as choice of major, achievement.

The issues
• Regressing my GPA on the GPA of my roommate has a number of problems…
– First, the GPA of my roommate is also determined by my GPA: simultaneity bias. (A3)
– Second, if roommates are not randomly allocated and if, for instance, having a male roommate is correlated with having a drinking roommate, then there is omitted variable bias. (A3)
– Third, there are some common shocks that affect both me and my roommate at the same time: correlation of the error terms. (A4)
– Fourth, the effect of my roommate may depend on my characteristics, and also on his other characteristics: nonlinearity. (A1)

The structural model
• Terms of the equation: academic ability, measurement error, grade point average of the roommate, residual.
• Assumptions A1 and A2 are maintained. The exogeneity of the GPA of the other roommate (A3) is violated.

The reduced form model
• Notice that the right-hand side variables are purely exogenous.
• Identification problem: given $\pi_0$, $\pi_1$, and $\pi_2$, can we "get" the structural parameters? Without any further constraint, no.
• Notice that there will be correlation of the residuals across individuals.

Identification problem
• In the reduced form, assuming A1, A2, A3, OLS is consistent. A4 is violated (more on this in the next session).
– regress gpa characteristics characteristics_roommate gives consistent and unbiased estimates of the reduced-form effects.
• (To correct for A4, add the option cluster(room) if room is a variable indicating the room number.)

Randomization
• Individuals are randomized conditionally on their stated preferences. Violation of A3? Not if the preferences are present as covariates.
– Conditional randomization, $E(e \mid X) = 0$.
• Method of randomization?

Results

Checking Linearity (A1)

Take Away from this session
• Spot reverse causality issues in papers!
– Sometimes mild, sometimes very severe (demand/supply, social interactions).
• You can solve the problem by instrumenting the endogenous variable with a variable that affects the endogenous variable without directly affecting the outcome (a demand shifter for supply, a supply shifter for demand).

OLS: Three identification issues
1. Measurement error
2. Omitted variable bias
3. Reverse causality/Simultaneity
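To make the third issue concrete, here is a minimal simulation sketch of a supply/demand market with a supply shifter. Everything in it is an assumption made for illustration: the parameter values (a = 1, b = 2, c = 4, d = 5) are borrowed from the observational equivalence example, the shifter coefficient `g`, the standard normal shocks, and the function names `cov` and `simulate_market` are hypothetical. It compares the OLS slope of quantity on price with the ratio Cov(q, z) / Cov(p, z).

```python
import random


def cov(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


def simulate_market(T=50_000, a=1.0, b=2.0, c=4.0, d=5.0, g=1.0, seed=0):
    """Draw T equilibrium (price, quantity, shifter) triples.

    Assumed model (illustrative, not taken from the slides' data):
      supply: q = a + b*p + g*z + e
      demand: q = c - d*p + u
    with z, e, u independent standard normal draws.
    """
    rng = random.Random(seed)
    prices, quantities, shifters = [], [], []
    for _ in range(T):
        z = rng.gauss(0.0, 1.0)  # supply shifter, excluded from demand
        e = rng.gauss(0.0, 1.0)  # supply shock
        u = rng.gauss(0.0, 1.0)  # demand shock
        # Equilibrium price: equate supply and demand and solve for p.
        p = (c - a - g * z + u - e) / (b + d)
        # Equilibrium quantity, read off the demand curve.
        q = c - d * p + u
        prices.append(p)
        quantities.append(q)
        shifters.append(z)
    return prices, quantities, shifters


prices, quantities, shifters = simulate_market()

# OLS slope of q on p: Cov(q, p) / Var(p).
ols_slope = cov(quantities, prices) / cov(prices, prices)
# Instrumental variable slope using the supply shifter z.
iv_slope = cov(quantities, shifters) / cov(prices, shifters)

print(f"OLS slope of q on p:        {ols_slope:.3f}")
print(f"IV slope Cov(q,z)/Cov(p,z): {iv_slope:.3f}")
```

Under these assumed values the OLS slope converges to -8/3, which is neither the supply slope (2) nor the demand slope (-5), while the covariance ratio converges to the demand slope -5. The simulation uses only the standard library so that it can be run without installing anything; in Stata the same comparison would be a `regress` next to an instrumental variable regression.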