Simultaneous Equation Models
• As discussed last week, one important form of endogeneity is simultaneity. This arises when one or more of the explanatory variables is jointly determined with the dependent variable, usually through an equilibrium mechanism.
• Simultaneous Equations Models (SEMs) differ from the models considered previously because each model has two or more dependent variables rather than just one.
• SEMs also differ from most of the econometric models we have considered so far because they consist of a set of equations.
• The least squares estimation procedure is not appropriate in these models, and we must develop new ways to obtain reliable estimates of economic parameters.
• The usual method for estimating SEMs is the instrumental variables method, discussed last week.

Example
• The classic example of an SEM is a supply and demand equation for some commodity (e.g. coffee) or input to production (e.g. labour).
• Consider a simple market supply function:
$$q_s = \alpha_1 p + \beta_1 z_1 + u_1$$
• Where $q_s$ is quantity or output, $p$ is price and $z_1$ is some observed variable affecting supply of the commodity (e.g. weather). The error term, $u_1$, contains other factors that affect supply.
• The equation is an example of a structural equation, i.e. it is derivable from economic theory and has a causal interpretation.
• The coefficient $\alpha_1$ measures how supply of the product changes when the price changes. If price and quantity are measured in logs, the coefficient gives the price elasticity of supply.
• Plotting the supply function, we plot output as a function of price, holding $z_1$ and $u_1$ fixed. Changes in either of these two factors lead to shifts in the supply curve; the difference is that $z_1$ is observed, while $u_1$ is not.
• The crucial assumption we make for OLS is that the explanatory variables are uncorrelated with the error term.
• In this case, this assumption does not hold.
• Assuming that the demand curve is downward sloping (or vertical), a shift in the supply curve caused by a change in $u_1$ changes the equilibrium price (and, unless demand is vertical, the quantity as well). Thus the error term is correlated with price.
• In addition, the fact that $p$ is random means that on the right-hand side of the supply and demand equations we have an explanatory variable that is random. This is contrary to the assumption of "fixed explanatory variables" that we usually make in regression analysis.
• The important thing to remember is that supply and demand interact to jointly determine the market price of a good and the amount of it that is sold.
• An econometric model that explains market price and quantity should therefore consist of two equations, one for supply and one for demand.
Demand:
$$q_d = \alpha_2 p + \beta_2 z_2 + u_2 \tag{1}$$
Supply:
$$q_s = \alpha_1 p + \beta_1 z_1 + u_1 \tag{2}$$
• Where $q_d$ is the quantity demanded and $z_2$ is an observed variable affecting the demand for the commodity (e.g. income). In equilibrium, $q_d = q_s = q$.
• In this model the variables $p$ and $q$ are called endogenous variables because their values are determined within the system we have created.
• The variables $z_1$ and $z_2$ have values that are given to us, and which are determined outside this system. As such, these are exogenous variables.
• The error terms in the supply and demand equations are assumed to have the usual properties, i.e. they have a constant mean and variance, and are independently distributed.

A Bad Example
• An important point to remember when using SEMs is that each equation in the model should have a ceteris paribus, causal interpretation.
• In the above example, the two equations describe entirely different relationships.
  - The supply equation describes the behaviour of firms.
  - The demand equation is a behavioural relationship for consumers.
• Each equation therefore has a ceteris paribus interpretation and stands on its own.
• They become linked in the econometric analysis only because the observed price and quantity are determined by the intersection of supply and demand.
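The intersection of supply and demand can be written out explicitly. The following is only a sketch: it assumes the linear forms $q = \alpha_2 p + \beta_2 z_2 + u_2$ for demand and $q = \alpha_1 p + \beta_1 z_1 + u_1$ for supply (the symbol names are an assumed notation), and that the two slopes differ, $\alpha_1 \neq \alpha_2$.

```latex
% Equate quantity demanded and quantity supplied and solve for price:
\alpha_2 p + \beta_2 z_2 + u_2 = \alpha_1 p + \beta_1 z_1 + u_1
\quad\Longrightarrow\quad
p = \frac{\beta_2 z_2 - \beta_1 z_1 + u_2 - u_1}{\alpha_1 - \alpha_2}

% Substitute back into the supply equation to obtain equilibrium quantity:
q = \frac{\alpha_1 \beta_2 z_2 - \alpha_2 \beta_1 z_1 + \alpha_1 u_2 - \alpha_2 u_1}{\alpha_1 - \alpha_2}
```

Under these assumptions the equilibrium price depends on both error terms, so $p$ cannot be uncorrelated with $u_1$ or with $u_2$, even though each equation on its own describes a separate behavioural relationship.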
• Consider the following example:
$$\text{housing} = \alpha_1 \text{saving} + \beta_{10} + \beta_{11} \text{income} + \beta_{12} \text{education} + \beta_{13} \text{age} + u_1$$
$$\text{saving} = \alpha_2 \text{housing} + \beta_{20} + \beta_{21} \text{income} + \beta_{22} \text{education} + \beta_{23} \text{age} + u_2$$
• Neither of these equations has a sensible ceteris paribus interpretation because housing and saving are chosen by the same individual.
  - If income increases, a person will generally change the optimal mix of housing expenditures and saving. The first equation, however, makes it seem as though we want to know the impact of a change in income, education or age on housing expenditure, holding saving constant.
• Just because two variables are determined simultaneously does not mean that an SEM is suitable.

Simultaneity Bias in OLS
• Consider the following example:
$$y_1 = \alpha_1 y_2 + \beta_1 z_1 + u_1 \tag{3}$$
$$y_2 = \alpha_2 y_1 + \beta_2 z_2 + u_2 \tag{4}$$
• To show that $y_2$ is generally correlated with $u_1$, we can solve for $y_2$ in terms of the exogenous variables and the error terms. Replacing $y_1$ in (4) with the expression in (3) gives
$$(1 - \alpha_2 \alpha_1) y_2 = \alpha_2 \beta_1 z_1 + \beta_2 z_2 + \alpha_2 u_1 + u_2 \tag{5}$$
• Assuming that $\alpha_2 \alpha_1 \neq 1$, we can divide (5) by $(1 - \alpha_2 \alpha_1)$ to obtain
$$y_2 = \pi_{21} z_1 + \pi_{22} z_2 + v_2 \tag{6}$$
Where $\pi_{21} = \alpha_2 \beta_1 / (1 - \alpha_2 \alpha_1)$; $\pi_{22} = \beta_2 / (1 - \alpha_2 \alpha_1)$ and $v_2 = (\alpha_2 u_1 + u_2) / (1 - \alpha_2 \alpha_1)$.
• Equation (6) expresses $y_2$ in terms of the exogenous variables and the error terms and is called the reduced form equation for $y_2$.
• The parameters $\pi_{21}$ and $\pi_{22}$ are non-linear functions of the structural parameters, and are termed the reduced form parameters.
• The reduced form error, $v_2$, is a linear function of the structural error terms, $u_1$ and $u_2$.
• Since the $u$'s are uncorrelated with the $z$'s, $v_2$ is also uncorrelated with the $z$'s, hence the reduced form parameters in (6) can be estimated by OLS.
  - The reduced form equations can be important for economic analysis. These equations relate the equilibrium values of the endogenous variables to the exogenous variables.
• Equation (6) also tells us that estimation of equation (3) by OLS will result in biased and inconsistent estimates of $\alpha_1$ and $\beta_1$.
  - In equation (3) the issue is whether $y_2$ and $u_1$ are correlated ($z_1$ and $u_1$ are by assumption uncorrelated).
  - From (6) we see that $y_2$ and $u_1$ are correlated if and only if $v_2$ and $u_1$ are correlated.
  - Since $v_2$ is a linear combination of $u_1$ and $u_2$, it is generally correlated with $u_1$.
• When $y_2$ is correlated with $u_1$ because of simultaneity, we say that OLS suffers from simultaneity bias.

The Instrumental Variables Solution
• As we saw last time, the IV solution of two-stage least squares (TSLS) can be used to solve the problem of endogenous explanatory variables.
• This is also true for SEMs. The major difference is that, because we specify a structural equation for each endogenous variable, we can immediately see whether sufficient IVs are available to estimate either equation.
• Consider the following example:
Supply:
$$q = \alpha_1 p + \beta_1 z_1 + u_1$$
Demand:
$$q = \alpha_2 p + u_2$$
• Here we can think of the coffee market as an example, with $q$ being, say, per capita coffee consumption, $p$ being the average price per jar and $z_1$ being something like the weather (in Brazil!) that affects supply. It is assumed that $z_1$ is exogenous to both the supply and demand equations.
• The first question to be addressed is: given a random sample on $q$, $p$ and $z_1$, which of the above equations can be estimated, i.e. which is an identified equation?
• It turns out that the demand equation is identified, but the supply equation is not.
  - This is indicated by our rules for instruments.
  - We can use $z_1$ as an instrument for price in the demand equation.
  - Because $z_1$ appears in the supply equation, however, we cannot use it as an instrument in this equation.
  - In order to estimate the supply equation we would need an observed exogenous variable that shifts the demand curve.
• Considering the more general two-equation model:
$$y_1 = \beta_{10} + \alpha_1 y_2 + z_1 \beta_1 + u_1$$
$$y_2 = \beta_{20} + \alpha_2 y_1 + z_2 \beta_2 + u_2$$
• Where $y_1$ and $y_2$ are the endogenous variables, $u_1$ and $u_2$ are the structural error terms, and $z_1$ and $z_2$ now denote sets of $k_1$ and $k_2$ exogenous regressors that appear in the first and second equation respectively, i.e. $z_1 = (z_{11}, z_{12}, \dots, z_{1k_1})$ and $z_2 = (z_{21}, z_{22}, \dots, z_{2k_2})$.
• In many cases $z_1$ and $z_2$ will overlap.
• The assumption that $z_1$ and $z_2$ contain different exogenous variables means that we impose exclusion restrictions on the model, i.e. we assume that certain exogenous regressors do not appear in the first equation and others are absent from the second. This allows us to distinguish between the two structural equations.
• The Rank Condition for Identification of a Structural Equation
  - The first equation in a two-equation SEM is identified if, and only if, the second equation contains at least one exogenous variable (with a nonzero coefficient) that is excluded from the first equation.
  - The order condition for identifying the first equation states that at least one exogenous variable is excluded from this equation. It is necessary but not sufficient, and is simple to check once both equations have been specified.
  - The rank condition requires more: at least one of the exogenous regressors excluded from the first equation must have a non-zero population coefficient in the second equation. This can be tested using a t or F test.
  - Identification of the second equation is the mirror image of the above.

Estimation
• Once we have determined that an equation is identified, we can estimate it by TSLS.
  - The instruments consist of the exogenous variables appearing in either equation.
• Tests for endogeneity, overidentifying restrictions and so on proceed as before.
• It turns out that, when any system with two or more equations is correctly specified and certain additional assumptions hold, system estimation methods (e.g. Three-Stage Least Squares) are generally more efficient than estimating each equation by TSLS.

Systems with More Than Two Equations
• SEMs can consist of more than two equations.
• Studying the general identification of these models is not straightforward.
• Once an equation has been shown to be identified, it can be estimated by TSLS.
• Consider the following three-equation system:
$$y_1 = \alpha_{12} y_2 + \alpha_{13} y_3 + \beta_{11} z_1 + u_1 \tag{7}$$
$$y_2 = \alpha_{21} y_1 + \beta_{21} z_1 + \beta_{22} z_2 + \beta_{23} z_3 + u_2 \tag{8}$$
$$y_3 = \alpha_{32} y_2 + \beta_{31} z_1 + \beta_{32} z_2 + \beta_{33} z_3 + \beta_{34} z_4 + u_3 \tag{9}$$
• It is difficult to show that an equation in an SEM with more than two equations is identified.
• It is clear, however, that (9) is not identified, since all exogenous regressors are included in the equation, leaving no instruments for $y_2$. In the terms used last week, we have an unidentified equation.
• Equation (7), on the other hand, looks promising: we have three exogenous regressors excluded from the equation, $z_2$, $z_3$ and $z_4$, and only two endogenous regressors, $y_2$ and $y_3$. This equation is therefore overidentified.
• In general, an equation in any SEM satisfies the order condition for identification if the number of exogenous variables excluded from the equation is at least as large as the number of endogenous regressors.
  - As such, the order condition in (8) is also satisfied, since we have one excluded exogenous regressor, $z_4$, and one endogenous regressor, $y_1$. The equation is exactly identified.
• Identification of an equation depends, however, on the parameters in the other equations (which we can never know for sure).
  - For example, if $\beta_{34} = 0$ in (9) then (8) is not identified, as $z_4$ is useless as an instrument for $y_1$.
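The simultaneity bias and the IV remedy discussed in these notes can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: it assumes the two-equation system $y_1 = \alpha_1 y_2 + \beta_1 z_1 + u_1$, $y_2 = \alpha_2 y_1 + \beta_2 z_2 + u_2$ with no intercepts, independent standard normal $z$'s and $u$'s, and illustrative parameter values. The function names (`simulate`, `ols_first_equation`, `iv_first_equation`) are hypothetical helpers. Because the first equation is exactly identified here (one excluded exogenous variable, $z_2$, for one endogenous regressor, $y_2$), TSLS coincides with the simple IV estimator, which is what the sketch computes.

```python
import random


def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))


def solve_2x2(a, b, c, d, e, f):
    """Solve [[a, b], [c, d]] [x, y]' = [e, f]' by Cramer's rule."""
    det = a * d - b * c
    return (e * d - b * f) / det, (a * f - c * e) / det


def simulate(n, alpha1, beta1, alpha2, beta2, seed=0):
    """Draw n observations of (y1, y2, z1, z2) from the assumed system.

    y2 is generated from its reduced form, then y1 from the first
    structural equation, so both structural equations hold exactly.
    """
    rng = random.Random(seed)
    denom = 1.0 - alpha2 * alpha1  # requires alpha2 * alpha1 != 1
    y1, y2, z1, z2 = [], [], [], []
    for _ in range(n):
        z1_i, z2_i = rng.gauss(0, 1), rng.gauss(0, 1)  # exogenous variables
        u1_i, u2_i = rng.gauss(0, 1), rng.gauss(0, 1)  # structural errors
        y2_i = (alpha2 * beta1 * z1_i + beta2 * z2_i
                + alpha2 * u1_i + u2_i) / denom
        y1_i = alpha1 * y2_i + beta1 * z1_i + u1_i
        y1.append(y1_i)
        y2.append(y2_i)
        z1.append(z1_i)
        z2.append(z2_i)
    return y1, y2, z1, z2


def ols_first_equation(y1, y2, z1):
    """OLS of y1 on (y2, z1): solves the normal equations X'X b = X'y."""
    return solve_2x2(dot(y2, y2), dot(y2, z1),
                     dot(z1, y2), dot(z1, z1),
                     dot(y2, y1), dot(z1, y1))


def iv_first_equation(y1, y2, z1, z2):
    """IV estimate of (alpha1, beta1) using z2 as the instrument for y2.

    Solves Z'X b = Z'y with Z = [z2, z1] and X = [y2, z1].
    """
    return solve_2x2(dot(z2, y2), dot(z2, z1),
                     dot(z1, y2), dot(z1, z1),
                     dot(z2, y1), dot(z1, y1))


if __name__ == "__main__":
    # Illustrative (assumed) parameter values; true alpha1 = 0.5, beta1 = 1.0.
    data = simulate(20000, alpha1=0.5, beta1=1.0, alpha2=-1.0, beta2=1.0)
    y1, y2, z1, z2 = data
    print("OLS (alpha1, beta1):", ols_first_equation(y1, y2, z1))
    print("IV  (alpha1, beta1):", iv_first_equation(y1, y2, z1, z2))
```

With these assumed values the reduced form error of $y_2$ contains $u_1$ with a negative weight, so the OLS estimate of $\alpha_1$ is pulled well below 0.5, while the IV estimate should lie close to the true value in a sample of this size. Setting `beta2=0.0` removes the only excluded instrument and makes the IV system singular, which mirrors the failure of the rank condition.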