
Chapter 18: Explanatory Variable/Error Term Independence Premise, Consistency, and Instrumental Variables

Chapter 18 Outline
• Review
  o Regression Model
  o Standard Ordinary Least Squares (OLS) Premises
  o Estimation Procedures Embedded within the Ordinary Least Squares (OLS) Estimation Procedure
• Taking Stock and a Preview: The Ordinary Least Squares (OLS) Estimation Procedure
• A Closer Look at the Explanatory Variable/Error Term Independence Premise
• Explanatory Variable/Error Term Correlation and Bias
  o Geometric Motivation
  o Confirming Our Logic
• Estimation Procedures: Large and Small Sample Properties
  o Unbiased and Consistent Estimation Procedure
  o Unbiased but Not Consistent Estimation Procedure
  o Biased but Consistent Estimation Procedure
• The Ordinary Least Squares (OLS) Estimation Procedure, and Consistency
• Instrumental Variable (IV) Estimation Procedure: A Two Regression Procedure
  o Mechanics
  o The “Good” Instrument Conditions

Chapter 18 Prep Questions

1. Consider the following model:
yt = βConst + βxxt + et
yt = Dependent variable
xt = Explanatory variable
et = Error term
t = 1, 2, …, T
T = Sample size
Suppose that the actual constant equals 6 and the actual coefficient equals 1/2:
βConst = 6, βx = 1/2
Also, suppose that the sample size is 6. The following table reports the value of the explanatory variable and the error term for each of the 6 observations:

| Observation | xt | et |
|---|---|---|
| 1 | 2 | 4 |
| 2 | 6 | 2 |
| 3 | 10 | 3 |
| 4 | 14 | −2 |
| 5 | 18 | −1 |
| 6 | 22 | −4 |

a. On a sheet of graph paper place x on the horizontal axis and e on the vertical axis.
1) Plot a scatter diagram of x and e.
2) As xt increases, does et typically increase or decrease?
3) Is et positively or negatively correlated with xt?
b. Immediately below this graph construct a second graph with x on the horizontal axis and y on the vertical axis.
1) Plot the line depicting the actual equation, the line representing the actual constant and the actual coefficient: y = 6 + (1/2)x.
2) Fill in the following blanks for each observation t:

| Observation | xt | et | yt |
|---|---|---|---|
| 1 | 2 | 4 | _____ |
| 2 | 6 | 2 | _____ |
| 3 | 10 | 3 | _____ |
| 4 | 14 | −2 | _____ |
| 5 | 18 | −1 | _____ |
| 6 | 22 | −4 | _____ |

3) Plot the x and y values for each of the 6 observations.
c. Based on the points you plotted in your second graph, “eyeball” the best fitting line and sketch it in.
d. How are the slope of the line representing the actual equation and the slope of the best fitting line related?

2. Recall the poll Clint conducted to estimate the fraction of the student population that supported him in the upcoming election for class president. He used the following approach:
Random Sample Technique:
• Write the name of each individual in the population on a 3×5 card.
• Perform the following procedure 16 times:
  o Thoroughly shuffle the cards.
  o Randomly draw one card.
  o Ask that individual if he/she is voting for Clint and record the answer.
  o Replace the card.
• Calculate the fraction of the sample supporting Clint.
Now, consider an alternative approach. You are visiting Clint in his dorm room when he asks you to conduct the poll. Instead of writing the name of each individual on a 3×5 card, you simply leave Clint’s room and ask the first 16 people you run into how they will vote:
Nonrandom Sample Technique:
• Leave Clint’s dorm room and ask the first 16 people you run into if he/she is voting for Clint.
• Calculate the fraction of the sample supporting Clint.
Use the Econometrics Lab to simulate the two sampling techniques:
[Link to MIT-Lab 18P.2 goes here.]
[Figure 18.1: Opinion Poll Simulation]
a. Answer the questions posed in the lab and then fill in the following blanks (after many, many repetitions):

| Sampling Technique | Population Fraction | Sample Size | Mean (Average) of Estimates | Magnitude of Bias |
|---|---|---|---|---|
| Random | .50 | 16 | _____ | _____ |
| Nonrandom | .50 | 16 | _____ | _____ |
| Nonrandom | .50 | 25 | _____ | _____ |
| Nonrandom | .50 | 100 | _____ | _____ |

Focus on the nonrandom sampling technique.
b. What happens to the mean of the estimated fraction as the sample size increases?
c. Explain why your answer to part b “makes sense.” To do so, consider the following questions:
1) Compared to the general student population, are the students who live near Clint more likely to be Clint’s friends?
2) Compared to the general student population, are the students who live near Clint more likely to vote for him?
3) Would the nonrandom sampling technique bias the poll in Clint’s favor?
4) What happens to the magnitude of the bias as the sample size increases? Explain.
d. As the sample size increases, what happens to the variance of the estimates? Record the variance of the estimates for each row of the table:

| Sampling Technique | Sample Size | Variance of Estimates |
|---|---|---|
| Random | 16 | _____ |
| Nonrandom | 16 | _____ |
| Nonrandom | 25 | _____ |
| Nonrandom | 100 | _____ |

Review

Regression Model
We begin by reviewing the basic regression model:
yt = βConst + βxxt + et
yt = Dependent variable
xt = Explanatory variable
et = Error term
t = 1, 2, …, T
T = Sample size
The error term is a random variable that represents random influences: Mean[et] = 0

The Standard Ordinary Least Squares (OLS) Premises
Recall the standard ordinary least squares (OLS) regression premises:
• Error Term Equal Variance Premise: The variance of the error term’s probability distribution for each observation is the same; all the variances equal Var[e]: Var[e1] = Var[e2] = … = Var[eT] = Var[e]
• Error Term/Error Term Independence Premise: The error terms are independent: Cov[ei, ej] = 0 for i ≠ j. Knowing the value of the error term from one observation does not help us predict the value of the error term for any other observation.
• Explanatory Variable/Error Term Independence Premise: The explanatory variables, the xt’s, and the error terms, the et’s, are not correlated. Knowing the value of an observation’s explanatory variable does not help us predict the value of that observation’s error term.

Estimation Procedures Embedded within the Ordinary Least Squares (OLS) Estimation Procedure
The ordinary least squares (OLS) estimation procedure includes three important estimation procedures.
A procedure to estimate the:
• Values of the regression parameters, βx and βConst:
$$b_x = \frac{\sum_{t=1}^{T}(y_t-\bar{y})(x_t-\bar{x})}{\sum_{t=1}^{T}(x_t-\bar{x})^2} \quad\text{and}\quad b_{Const} = \bar{y} - b_x\bar{x}$$
• Variance of the error term’s probability distribution, Var[e]:
$$\text{EstVar}[e] = \frac{SSR}{\text{Degrees of Freedom}}$$
• Variance of the coefficient estimate’s probability distribution, Var[bx]:
$$\text{EstVar}[b_x] = \frac{\text{EstVar}[e]}{\sum_{t=1}^{T}(x_t-\bar{x})^2}$$
When the standard ordinary least squares (OLS) regression premises are met:
• Each estimation procedure is unbiased; that is, each estimation procedure does not systematically underestimate or overestimate the actual value.
• The ordinary least squares (OLS) estimation procedure for the coefficient value is the best linear unbiased estimation procedure (BLUE).
Crucial Point: When the ordinary least squares (OLS) estimation procedure performs its calculations, it implicitly assumes that the standard ordinary least squares (OLS) regression premises are satisfied.

Taking Stock and a Preview: The Ordinary Least Squares (OLS) Estimation Procedure
The ordinary least squares (OLS) estimation procedure is economists’ most widely used estimation procedure. When contemplating the use of this procedure, we should keep two issues in mind:
• Is the ordinary least squares (OLS) estimation procedure for the coefficient value unbiased?
• If unbiased, is the ordinary least squares (OLS) estimation procedure reliable in the following two ways:
  o Can the calculations for the standard errors be trusted?
  o Is the ordinary least squares (OLS) estimation procedure for the coefficient value the most reliable, the best linear unbiased estimation procedure (BLUE)?
In the previous two chapters we showed that violation of the first two standard ordinary least squares (OLS) premises, the error term equal variance premise and the error term/error term independence premise, does not cause the ordinary least squares (OLS) estimation procedure to be biased. This was good news. We then focused on the second issue.
We learned that, when either of these premises is violated, the standard error calculations cannot be trusted and the ordinary least squares (OLS) estimation procedure is not the best linear unbiased estimation procedure (BLUE). In this chapter, we turn our attention to the third premise, explanatory variable/error term independence. Unfortunately, violation of the third premise does cause the ordinary least squares (OLS) estimation procedure to be biased. The explanatory variable/error term independence premise determines whether the ordinary least squares (OLS) estimation procedure is unbiased or biased. Figure 18.2 summarizes the roles played by the three standard premises.

[Figure 18.2: OLS Bias and Reliability Flow Diagram]
• OLS Bias Question: Is the explanatory variable/error term independence premise satisfied or violated? Is the OLS estimation procedure for the value of the coefficient unbiased or biased?
  o Satisfied (explanatory variable and error term independent): Unbiased. Proceed to the reliability question.
  o Violated (explanatory variable and error term correlated): Biased.
• OLS Reliability Question: Are the error term equal variance and error term/error term independence premises satisfied or violated? Can the OLS calculations for the standard error be “trusted?” Is the OLS estimation procedure for the value of the coefficient BLUE?
  o Satisfied: Yes and Yes.
  o Violated: No and No.

This chapter begins by explaining why bias results when the explanatory variable/error term independence premise is violated. Next, we introduce a new property that is used to describe estimation procedures, consistency. Consistency is typically considered less desirable than being unbiased, but some estimation procedures that are biased nevertheless meet the consistency standard. We close the chapter by introducing one such procedure: the instrumental variables (IV) estimation procedure.

A Closer Look at the Explanatory Variable/Error Term Independence Premise
We begin by using a simulation to illustrate the explanatory variable/error term independence premise:
• Explanatory Variable/Error Term Independence Premise: The explanatory variables, the xt’s, and the error terms, the et’s, are not correlated. Knowing the value of an observation’s explanatory variable does not help us predict the value of that observation’s error term.

Econometrics Lab 18.1: Explanatory Variable/Error Term Independence and Correlation
[Link to MIT-Lab 18.1 goes here.]
Initially, the explanatory variable/error term correlation coefficient equals .00. Be certain that the Pause checkbox is checked. Then, click the Start button. Note that the blue points indicate the observations with low x values, the black points the observations with medium x values, and the red points the observations with high x values. Click the Start button a few more times to convince yourself that this is always true. Now, clear the Pause checkbox and click Start. After many, many repetitions, click Stop. Note that the scatter diagram points are distributed more or less evenly across the graph as shown in Figure 18.3:
[Figure 18.3: Corr X&E = 0. Scatter diagram with xt on the horizontal axis and et on the vertical axis.]
Since they are spread evenly, knowing the value of the explanatory variable, xt, does not help us predict the value of the error term, et. The explanatory variable and the error term are independent: the explanatory variable/error term independence premise is satisfied. The value of x, low, medium, or high, does not affect the mean of the error terms. The mean is approximately 0 in each case.
[Figure 18.4: Error Term Probability Distributions – Corr X&E = 0. Low x’s: mean .0, variance 500. Medium x’s: mean .0, variance 500. High x’s: mean .0, variance 500.]
Next, we select .60 in the Corr X&E list. Consequently, the explanatory variable and error term are now positively correlated.
After many, many repetitions we observe that the explanatory variable and the error term are no longer independent. The scatter diagram points are no longer spread evenly; a pattern emerges. As illustrated in Figure 18.5, as the value of the explanatory variable rises, the error term tends to rise also:
[Figure 18.5: Corr X&E = .6. Scatter diagram with xt on the horizontal axis and et on the vertical axis.]
• When the value of the explanatory variable is low, the error term is typically negative. The mean of the low x value error terms is negative.
• When the value of the explanatory variable is high, the error term is typically positive. The mean of the high x value error terms is positive.
[Figure 18.6: Error Term Probability Distributions – Corr X&E = .6. Low x’s: mean −24, variance 500. Medium x’s: mean .0, variance 500. High x’s: mean 24, variance 500.]
Last, we select −.60 in the Corr X&E list. Again, the scatter diagram points are not spread evenly. The explanatory variable and error term are now negatively correlated. As illustrated in Figure 18.7, as the value of the explanatory variable rises, the error term tends to fall:
[Figure 18.7: Corr X&E = −.6. Scatter diagram with xt on the horizontal axis and et on the vertical axis.]
• When the value of the explanatory variable is low, the error term is typically positive. The mean of the low x value error terms is positive.
• When the value of the explanatory variable is high, the error term is typically negative. The mean of the high x value error terms is negative.
In summary, the mean of the low x error terms is positive and the mean of the high x error terms is negative.
We shall proceed by explaining geometrically why correlation between the explanatory variables and error terms biases the ordinary least squares (OLS) estimation procedure for the coefficient value. Then, we shall use a simulation to confirm our logic.
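The pattern seen in Lab 18.1 can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: the x values are drawn from a normal distribution, observations are split into equal thirds, and the function name error_means_by_x_group is hypothetical. Only the error term variance of 500 is taken from the figures, so the group means will not match the lab's ±24 exactly; the signs are what matter.

```python
import numpy as np

def error_means_by_x_group(corr_xe, n=300_000, error_var=500.0, seed=0):
    """Mean of the error term for low, medium, and high x values.

    Assumption (not from the lab): x is standard normal and the three
    groups are the bottom, middle, and top thirds of the x values.
    """
    rng = np.random.default_rng(seed)
    x_std = rng.standard_normal(n)
    noise = rng.standard_normal(n)
    # By construction, e_std has correlation corr_xe with x_std.
    e_std = corr_xe * x_std + np.sqrt(1.0 - corr_xe**2) * noise
    e = np.sqrt(error_var) * e_std
    # Cut points that split the observations into thirds by x.
    lo_cut, hi_cut = np.quantile(x_std, [1 / 3, 2 / 3])
    low = e[x_std <= lo_cut].mean()
    medium = e[(x_std > lo_cut) & (x_std <= hi_cut)].mean()
    high = e[x_std > hi_cut].mean()
    return low, medium, high

for corr in (0.0, 0.6, -0.6):
    low, medium, high = error_means_by_x_group(corr)
    print(f"Corr X&E = {corr:+.2f}: low {low:6.1f}, medium {medium:6.1f}, high {high:6.1f}")
```

With a correlation of 0 the three group means should all be close to 0; with a positive correlation the low x mean should be negative and the high x mean positive, and the signs should reverse for a negative correlation.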

Explanatory Variable/Error Term Correlation and Bias

Geometric Motivation
[Figure 18.8: Explanatory Variable/Error Term Correlation. Left panels: explanatory variable and error term positively correlated. Right panels: explanatory variable and error term negatively correlated. The top panels plot et against xt; the bottom panels plot yt against xt together with the actual equation line.]
Focus attention on Figure 18.8. The lines in the lower two graphs represent the actual relationship between the dependent variable, yt, and the explanatory variable, xt: yt = βConst + βxxt. βConst is the actual constant and βx the actual coefficient. Now, we shall examine the left and right panels:
• Left panels of Figure 18.8: The explanatory variable, xt, and error term, et, are positively correlated as illustrated in the top left scatter diagram. et tends to be low for low values of xt and high for high values of xt. Now, consider the bottom left scatter diagram in which the xt’s and yt’s are plotted. When the explanatory variable and the error term are positively correlated, the scatter diagram points tend to lie below the actual equation line for low values of xt and above the actual equation line for high values of xt.
• Right panels of Figure 18.8: The explanatory variable, xt, and error term, et, are negatively correlated as illustrated in the top right scatter diagram. et tends to be high for low values of xt and low for high values of xt. Now, consider the bottom right scatter diagram in which the xt’s and yt’s are plotted. When the explanatory variable and the error term are negatively correlated, the scatter diagram points tend to lie above the actual equation line for low values of xt and below the actual equation line for high values of xt.
In Figure 18.9 we have added the best fitting line for each of the two panels:
• Left panels of Figure 18.9: When the explanatory variable and error terms are positively correlated, the best fitting line is more steeply sloped than the actual equation line; consequently, the ordinary least squares (OLS) estimation procedure for the coefficient value is biased upward.
• Right panels of Figure 18.9: When the explanatory variable and error terms are negatively correlated, the best fitting line is less steeply sloped than the actual equation line; consequently, the ordinary least squares (OLS) estimation procedure for the coefficient value is biased downward.
[Figure 18.9: Explanatory Variable/Error Term Correlation with Best Fitting Line]

Confirming Our Suspicions
Based on our logic we would expect the ordinary least squares (OLS) estimation procedure for the coefficient value to be biased whenever the explanatory variable and the error term are correlated.

Econometrics Lab 18.2: Ordinary Least Squares (OLS) and Explanatory Variable/Error Term Correlation
[Link to MIT-Lab 18.2 goes here.]
We can confirm our logic using a simulation. As a base case, we begin with .00 specified in the Corr X&E list; the explanatory variables and error terms are independent. Click Start and then after many, many repetitions, click Stop. The simulation confirms that no bias results whenever the explanatory variable/error term independence premise is satisfied. Now, specify .30 in the Corr X&E list; the explanatory variable and error terms are positively correlated. Click Start and then after many, many repetitions, click Stop. The average of the estimated coefficient values, 6.1, exceeds the actual value, 2.0; the ordinary least squares (OLS) estimation procedure for the coefficient value is biased upward whenever the explanatory variable and error terms are positively correlated.
By selecting −.30 from the Corr X&E list, we can show that downward bias results whenever the explanatory variable and error terms are negatively correlated. The average of the estimated coefficient values, −2.1, is less than the actual value, 2.0.

| Estimation Procedure | Corr X&E | Sample Size | Actual Coef | Mean of Coef Ests | Magnitude of Bias | Variance of Coef Ests |
|---|---|---|---|---|---|---|
| OLS | .00 | 50 | 2.0 | ≈2.0 | ≈0.0 | ≈4.0 |
| OLS | .30 | 50 | 2.0 | ≈6.1 | ≈4.1 | ≈3.6 |
| OLS | −.30 | 50 | 2.0 | ≈−2.1 | ≈4.1 | ≈3.6 |

Table 18.1: Explanatory Variable/Error Term Correlation – Simulation Results

The simulation validates our logic:
• Explanatory variable and error term positively correlated → OLS estimation procedure for the coefficient value is biased upward.
• Explanatory variable and error term negatively correlated → OLS estimation procedure for the coefficient value is biased downward.

Estimation Procedures: Small and Large Sample Properties
Explanatory variable/error term correlation creates a problem for the ordinary least squares (OLS) estimation procedure. Positive correlation causes upward bias and negative correlation causes downward bias. What can we do in these cases? Econometricians respond to this question very pragmatically by adopting the philosophy that “half a loaf is better than none.” In general, we use different estimation procedures that, while still biased, may meet a less demanding criterion called consistency. In most cases, consistency is not as desirable as being unbiased; nevertheless, if we cannot find an unbiased estimation procedure, consistency proves to be better than nothing. To explain the notion of consistency, we begin by reviewing what it means for an estimation procedure to be unbiased.
[Figure 18.10: Unbiased Estimation Procedures and the Probability Distribution of Estimates. The probability distribution of the estimate is centered on the actual value.]
Unbiased: An estimation procedure is unbiased whenever the mean (center) of the estimate’s probability distribution equals the actual value.
Mean of Estimate’s Probability Distribution = Actual Value
An unbiased estimation procedure does not systematically underestimate or overestimate the actual value. The relative frequency interpretation of probability provides intuition. If the experiment were repeated many, many times, the average of the numerical values of the estimates would equal the actual value:
Mean (Average) of the Estimates = Actual Value after many, many repetitions
Being unbiased is a small sample property because the size of the sample plays no role in determining whether or not an estimation procedure is unbiased.
Consistent: Consistency is a large sample property; the size of the sample plays a critical role here. Also, both the mean and variance of the estimate’s probability distribution are important when deciding if an estimation procedure is consistent.
• Mean of the estimate’s probability distribution: Consistency requires the mean either to
  o equal the actual value: Mean[Est] = Actual Value
  or
  o approach the actual value as the sample size approaches infinity: Mean[Est] → Actual Value as Sample Size → ∞
That is, either the
  o estimation procedure must be unbiased or
  o magnitude of the bias must diminish as the sample size becomes larger.
• Variance of the estimate’s probability distribution: Consistency requires the variance to diminish as the sample size becomes larger; more specifically, the variance must approach 0 as the sample size approaches infinity: Variance[Est] → 0 as Sample Size → ∞
Figure 18.11 illustrates the relationship between the two properties of estimation procedures:
[Figure 18.11: Unbiased and Consistent Estimation Procedures]
Figure 18.12 provides a flow diagram, a “roadmap,” we can use to determine the properties of an estimation procedure:
[Figure 18.12: Determining the Properties of an Estimation Procedure]
• Does Mean[Est] equal the Actual Value?
  o Yes: Unbiased. Go to the variance question.
  o No: Biased. Does Mean[Est] → Actual Value as the sample size → ∞? If yes, go to the variance question; if no, Not Consistent.
• Does Var[Est] → 0 as the sample size → ∞?
  o Yes: Consistent.
  o No: Not Consistent.
To illustrate the distinction between these two properties of estimation procedures we shall consider three examples of an estimation procedure that is
• unbiased and consistent.
• unbiased but not consistent.
• biased but consistent.

Unbiased and Consistent Estimation Procedure
When the standard ordinary least squares (OLS) premises are met, the ordinary least squares (OLS) estimation procedure is not only unbiased, but also consistent. We shall use our Econometrics Lab to illustrate this.

Econometrics Lab 18.3: Ordinary Least Squares (OLS) Estimation Procedure
[Link to MIT-Lab 18.3 goes here.]

| Estimation Procedure | Sample Size | Actual Coef | Mean of Coef Ests | Magnitude of Bias | Variance of Coef Ests |
|---|---|---|---|---|---|
| OLS | 3 | 2.0 | ≈2.0 | ≈0.0 | ≈2.50 |
| OLS | 6 | 2.0 | ≈2.0 | ≈0.0 | ≈1.14 |
| OLS | 10 | 2.0 | ≈2.0 | ≈0.0 | ≈.67 |
| OLS | 100 | 2.0 | ≈2.0 | ≈0.0 | ≈.07 |
| OLS | 250 | 2.0 | ≈2.0 | ≈0.0 | ≈.03 |

Table 18.2: Unbiased and Consistent Estimation Procedure

This estimation procedure is unbiased and consistent. After many, many repetitions:
• The average of the estimated coefficient values equals the actual value, 2.0, suggesting that the estimation procedure is unbiased.
• The variance of the estimated coefficient values appears to be approaching 0 as the sample size increases.
[Figure 18.13: OLS Estimation Procedure – Probability Distributions. Both the small sample and the large sample distributions of the estimate are centered on the actual value; the large sample distribution is more tightly concentrated.]
When the standard ordinary least squares (OLS) premises are met, the ordinary least squares (OLS) estimation procedure provides us with the best of all possibilities; it is both unbiased and consistent.

Unbiased but Inconsistent Estimation Procedure
[Figure 18.14: Any Two Estimation Procedure. Scatter diagram with xt on the horizontal axis (ticks at 3, 9, 15, 21, 27) and yt on the vertical axis.]
The Any Two estimation procedure that we introduced in Chapter 6 provides us with an example of an estimation procedure that is unbiased, but not consistent. Let us review the Any Two estimation procedure. First, we construct a scatter diagram plotting the explanatory variable on the horizontal axis and the dependent variable on the vertical axis. Then, we choose any two points at random and draw a straight line connecting these points. The coefficient estimate equals the slope of this line.

Econometrics Lab 18.4: Any Two Estimation Procedure
[Link to MIT-Lab 18.4 goes here.]

| Estimation Procedure | Sample Size | Actual Coef | Mean of Coef Ests | Magnitude of Bias | Variance of Coef Ests |
|---|---|---|---|---|---|
| Any Two | 3 | 2.0 | ≈2.0 | ≈0.0 | ≈7.5 |
| Any Two | 6 | 2.0 | ≈2.0 | ≈0.0 | ≈17.3 |
| Any Two | 10 | 2.0 | ≈2.0 | ≈0.0 | ≈31.1 |

Table 18.3: Any Two Estimation Procedure

This estimation procedure is unbiased but not consistent. After many, many repetitions:
• The average of the estimated coefficient values equals the actual value, 2.0, suggesting that the estimation procedure is unbiased.
• The variance of the estimated coefficient values increases as the sample size increases; consequently, the estimation procedure is not consistent.
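The contrast between the OLS and Any Two procedures can be sketched numerically. The design below is our assumption, not a description of the lab's internals: the x values are taken to be the midpoints of equal bins on the range 0 to 30 (suggested by the axis ticks in Figure 18.14), the error term variance is taken to be 500 (the value shown in Figure 18.4), and the constant of 10 is arbitrary. All function names are hypothetical.

```python
import itertools
import numpy as np

ERROR_VAR = 500.0          # assumed Var[e]
X_MIN, X_MAX = 0.0, 30.0   # assumed range of the explanatory variable

def x_values(sample_size):
    """Evenly spaced x's: midpoints of sample_size equal bins on [X_MIN, X_MAX]."""
    width = (X_MAX - X_MIN) / sample_size
    return X_MIN + width * (np.arange(sample_size) + 0.5)

def ols_variance(sample_size):
    """Var[b_x] = Var[e] / sum((x_t - x_bar)^2) under the standard premises."""
    x = x_values(sample_size)
    return ERROR_VAR / np.sum((x - x.mean()) ** 2)

def any_two_variance(sample_size):
    """Variance of the Any Two slope when every pair of points is equally likely.

    The slope minus the actual coefficient equals (e_j - e_i) / (x_j - x_i),
    so the variance is 2 * Var[e] times the average of 1 / (x_j - x_i)^2.
    """
    x = x_values(sample_size)
    inv_sq = [1.0 / (xj - xi) ** 2 for xi, xj in itertools.combinations(x, 2)]
    return 2.0 * ERROR_VAR * np.mean(inv_sq)

def simulate_any_two(sample_size, beta_const=10.0, beta_x=2.0,
                     reps=200_000, seed=0):
    """Monte Carlo check: mean and variance of the Any Two coefficient estimates."""
    rng = np.random.default_rng(seed)
    x = x_values(sample_size)
    e = rng.normal(0.0, np.sqrt(ERROR_VAR), size=(reps, sample_size))
    y = beta_const + beta_x * x + e
    # Choose two distinct observations at random in each repetition.
    i = rng.integers(0, sample_size, size=reps)
    j = (i + rng.integers(1, sample_size, size=reps)) % sample_size
    rows = np.arange(reps)
    slopes = (y[rows, j] - y[rows, i]) / (x[j] - x[i])
    return slopes.mean(), slopes.var()

for size in (3, 6, 10):
    mean_est, var_est = simulate_any_two(size)
    print(f"T={size:3d}  OLS var {ols_variance(size):6.2f}  "
          f"Any Two var {any_two_variance(size):6.2f}  "
          f"(simulated mean {mean_est:5.2f}, var {var_est:6.2f})")
```

Under these assumed settings the formulas give OLS variances of about 2.50, 1.14, and 0.67 and Any Two variances of about 7.5, 17.3, and 31.0 for sample sizes 3, 6, and 10, close to Tables 18.2 and 18.3. The intuition is that packing more observations into the same x range makes two randomly chosen points more likely to be close together horizontally, so the slope of the line joining them becomes more erratic, while OLS uses all of the points and becomes more precise.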
[Figure 18.15: Any Two Estimation Procedure – Probability Distributions. Both the small sample and the large sample distributions of the estimate are centered on the actual value; the large sample distribution is more spread out.]

Biased but Consistent Estimation Procedure
To illustrate an estimation procedure that is biased, but consistent, we shall revisit the opinion poll conducted by Clint. Recall that Clint used a random sampling procedure to poll the population:
Random Sampling Procedure
• Write the name of each individual in the population on a 3×5 card.
• Perform the following procedure 16 times:
  o Thoroughly shuffle the cards.
  o Randomly draw one card.
  o Ask that individual if he/she supports Clint and record the answer.
  o Replace the card.
• Calculate the fraction of the sample supporting Clint.
This estimation procedure proved to be unbiased. But now consider an alternative approach. Suppose that you are visiting Clint in his dorm room and he asks you to conduct the poll. Instead of taking the time to write the name of each individual on a 3×5 card, however, you simply leave Clint’s room and ask the first 16 people you run into how they will vote:
Nonrandom Sampling Procedure
• Leave Clint’s dorm room.
• Ask the first 16 people you run into if he/she is voting for Clint.
• Calculate the fraction of the sample supporting Clint.
Why do we call this a nonrandom sampling technique? Compared to the general student population:
• Are the students who live near Clint more likely to be a friend of Clint?
• Consequently, are the students who live near Clint more likely to vote for Clint?
Since your starting point is Clint’s dorm room, it is likely that you will poll students who are Clint’s friends. They will probably be more supportive of Clint than the general student population, will they not? Consequently, we would expect this polling technique to be biased in favor of Clint. We shall use a simulation to test our logic.

Econometrics Lab 18.5: Biased but Consistent Estimation Procedure
[Figure 18.16: Opinion Poll Simulation]
[Link to MIT-Lab 18.5 goes here.]
Observe that you can select the sampling technique by checking or clearing the Non-random Sample checkbox. Begin by clearing the Non-random Sample checkbox to choose the random sampling technique; this provides us with a benchmark. Click Start and then after many, many repetitions click Stop. As before, we observe that the estimation procedure is unbiased. Next, specify the nonrandom technique that we just introduced by checking the Non-random Sample checkbox. You walk out of Clint’s dorm room and poll the first 16 people you run into. Click Start and then after many, many repetitions click Stop. The simulation results confirm our logic. The nonrandom polling technique biases the poll results in favor of Clint. But now what happens as we increase the sample size from 16 to 25 and then to 100?

After many, many repetitions:

| Sampling Technique | Population Fraction | Sample Size | Mean (Average) of Estimates | Magnitude of Bias | Variance of Estimates |
|---|---|---|---|---|---|
| Random | .50 | 16 | ≈.50 | ≈.00 | ≈.015 |
| Nonrandom | .50 | 16 | ≈.56 | ≈.06 | ≈.015 |
| Nonrandom | .50 | 25 | ≈.54 | ≈.04 | ≈.010 |
| Nonrandom | .50 | 100 | ≈.51 | ≈.01 | ≈.0025 |

Table 18.4: Opinion Poll Simulation – Random and Nonrandom Samples

We observe that while the nonrandom sampling technique is still biased, the magnitude of the bias declines as the sample size increases. As the sample size increases from 16 to 25 to 100, the magnitude of the bias decreases from .06 to .04 to .01. This makes sense, does it not? As the sample size becomes larger you will be farther and farther from Clint’s dorm room, which means that you will be getting a larger and larger portion of your sample from the general student population rather than from Clint’s friends. Furthermore, the variance of the estimates also decreases as the sample size increases. This estimation procedure is biased but consistent. After many, many repetitions:
• The average of the estimates appears to be approaching the actual value, .5.
• The variance of the estimates appears to be approaching 0 as the sample size increases.
[Figure 18.17: Nonrandom Sample Estimation Procedure – Probability Distributions. The small sample distribution of the estimate is centered away from the actual value; the large sample distribution is more tightly concentrated and centered closer to the actual value.]

The Ordinary Least Squares (OLS) Estimation Procedure, and Consistency
We have shown that when the explanatory variable/error term independence premise is violated, the ordinary least squares (OLS) estimation procedure for the coefficient value is biased. But might it be consistent?

Econometrics Lab 18.6: Ordinary Least Squares (OLS) Estimation Procedure and Consistency
[Link to MIT-Lab 18.6 goes here.]

| Estimation Procedure | Corr X&E | Sample Size | Actual Coef | Mean of Coef Ests | Magnitude of Bias | Variance of Coef Ests |
|---|---|---|---|---|---|---|
| OLS | .30 | 50 | 2.0 | ≈6.1 | ≈4.1 | ≈3.6 |
| OLS | .30 | 100 | 2.0 | ≈6.1 | ≈4.1 | ≈1.7 |
| OLS | .30 | 150 | 2.0 | ≈6.1 | ≈4.1 | ≈1.2 |

Table 18.5: Explanatory Variable/Error Term Correlation – Simulation Results

Clearly, the magnitude of the bias does not diminish as the sample size increases, even though the variance does. The simulation demonstrates that when the explanatory variable/error term independence premise is violated, the ordinary least squares (OLS) estimation procedure is neither unbiased nor consistent. This leads us to a new estimation procedure, the instrumental variable (IV) estimation procedure. Like the ordinary least squares (OLS) estimation procedure, the instrumental variable (IV) estimation procedure will prove to be biased when the explanatory variable/error term independence premise is violated, but it has an advantage: under certain conditions, the instrumental variable (IV) estimation procedure is consistent.
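The persistence of the OLS bias reported in Table 18.5 can be sketched with a small simulation. The settings here are illustrative assumptions rather than the lab's values: x and e are drawn from a joint normal distribution with standard deviations 5 and 10, the constant is 10, and the function name simulate_ols is hypothetical. The magnitude of the bias therefore differs from the table; the point is only that it does not shrink as the sample grows.

```python
import numpy as np

def simulate_ols(corr_xe, sample_size, beta_const=10.0, beta_x=2.0,
                 x_sd=5.0, e_sd=10.0, reps=5_000, seed=0):
    """Mean and variance of the OLS coefficient estimates across repetitions.

    Assumed design (not from the lab): x and e are jointly normal with
    correlation corr_xe and standard deviations x_sd and e_sd.
    """
    rng = np.random.default_rng(seed)
    shape = (reps, sample_size)
    x_std = rng.standard_normal(shape)
    e_std = corr_xe * x_std + np.sqrt(1.0 - corr_xe**2) * rng.standard_normal(shape)
    x = x_sd * x_std
    e = e_sd * e_std
    y = beta_const + beta_x * x + e
    # OLS coefficient for each repetition (each row is one sample).
    x_dev = x - x.mean(axis=1, keepdims=True)
    y_dev = y - y.mean(axis=1, keepdims=True)
    b_x = (x_dev * y_dev).sum(axis=1) / (x_dev**2).sum(axis=1)
    return b_x.mean(), b_x.var()

for size in (50, 100, 150):
    mean_est, var_est = simulate_ols(0.30, size)
    print(f"T={size:3d}  mean of coef ests {mean_est:5.2f}  "
          f"bias {mean_est - 2.0:5.2f}  variance {var_est:6.4f}")
```

In this assumed design the bias stays near Corr X&E × e_sd / x_sd = .30 × 10 / 5 = 0.6 at every sample size, while the variance of the estimates falls. A shrinking variance around the wrong center is exactly what "biased and not consistent" means.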

Instrumental Variable (IV) Estimation Procedure: A Two Regression Procedure

Motivation of the Instrumental Variables Estimation Procedure
In some situations, the instrumental variable estimation procedure can mitigate, but not completely remedy, cases in which the explanatory variable and the error term are correlated:
Original Model: yt = βConst + βxxt + εt
where
yt = Dependent variable
xt = Explanatory variable
εt = Error term
t = 1, 2, …, T
T = Sample size
When xt and εt are correlated → xt is the “problem” explanatory variable
[Figure 18.18: “Problem” Explanatory Variable]
When an explanatory variable, xt, is correlated with the error term, εt, we shall refer to the explanatory variable as the “problem” explanatory variable. The correlation of the explanatory variable and the error term creates the bias problem for the ordinary least squares (OLS) estimation procedure. We begin by searching for another variable called an instrument. Traditionally, we denote the instrument by the lower case Roman letter z, zt. An effective instrument must possess two properties. A “good” instrument, zt, must be
• correlated with the “problem” explanatory variable, xt.
• independent of the error term, εt.
We use the instrument to provide us with an estimate of the “problem” explanatory variable. Then, this estimate is used as a surrogate for the “problem” explanatory variable. The estimate of the “problem” explanatory variable, rather than the “problem” explanatory variable itself, is used to explain the dependent variable.

Mechanics
• Choose a “Good” Instrument: A “good” instrument, zt, must have two properties:
  o Correlated with the “problem” explanatory variable, xt.
  o Uncorrelated with the error term, εt.
• Instrumental Variables (IV) Regression 1: Use the instrument, zt, to provide an “estimate” of the problem explanatory variable, xt.
  o Dependent variable: “Problem” explanatory variable, xt.
  o Explanatory variable: Instrument, zt.
  o Estimate of the “problem” explanatory variable: Estxt = aConst + azzt where aConst and az are the estimates of the constant and coefficient in this regression, IV Regression 1.
• Instrumental Variables (IV) Regression 2: In the original model, replace the “problem” explanatory variable, xt, with its surrogate, Estxt, the estimate of the “problem” explanatory variable provided by the instrument, zt, from IV Regression 1.
  o Dependent variable: Original dependent variable, yt.
  o Explanatory variable: Estimate of the “problem” explanatory variable based on the results from IV Regression 1, Estxt.

The “Good” Instrument Conditions
Let us now provide the intuition behind why a “good” instrument, zt, must satisfy the two conditions mentioned above:
• Instrument/“Problem” Explanatory Variable Correlation: The instrument, zt, must be correlated with the “problem” explanatory variable, xt. To understand why, focus on IV Regression 1. We are using the instrument to create a surrogate for the “problem” explanatory variable in IV Regression 1:
Estxt = aConst + azzt
The estimate, Estxt, will be a good surrogate only if it is a good predictor of the “problem” explanatory variable, xt. This will occur only if the instrument, zt, is correlated with the “problem” explanatory variable, xt.
• Instrument/Error Term Independence: The instrument, zt, must be independent of the error term, εt. Focus on IV Regression 2. We begin with the original model and then replace the “problem” explanatory variable, xt, with its surrogate, Estxt:
yt = βConst + βxxt + εt
↓ Replace “problem” with surrogate
yt = βConst + βxEstxt + εt
where Estxt = aConst + azzt from IV Regression 1
To avoid violating the explanatory variable/error term independence premise in IV Regression 2, the surrogate for the “problem” explanatory variable, Estxt, must be independent of the error term, εt.
The surrogate, Estxt, is derived from the instrument, zt, in IV Regression 1:

  Estxt = aConst + azzt

Consequently, to avoid violating the explanatory variable/error term independence premise, the instrument, zt, and the error term, εt, must be independent:

  yt = βConst + βxEstxt + εt
  Estxt and εt must be independent
  ↓ because Estxt = aConst + azzt
  zt and εt must be independent

Justification of the Instrumental Variables Estimation Procedure

As we shall see, while the instrumental variable estimation procedure will not solve the problem of bias, it does mitigate it. We shall use a simulation to illustrate that while the instrumental variable (IV) estimation procedure is still biased, it is consistent when the “good” instrument conditions are satisfied.

Econometrics Lab 18.7: Ordinary Least Squares (OLS) Estimation Procedure and Consistency

Figure 18.19: Instrumental Variables Simulation

[Link to MIT-Lab 18.7 goes here.]

Two new correlation lists appear in this simulation: Corr X&Z and Corr Z&E. The two new lists reflect the two conditions required for a good instrument.
• The Corr X&Z list specifies the correlation coefficient for the explanatory variable and the instrument. For the instrument to be “good,” the explanatory variable and the instrument must be correlated. The default value is .50.
• The Corr Z&E list specifies the correlation coefficient for the instrument and the error term. For the instrument to be “good,” the instrument and the error term must be independent. The default value is .00; that is, the instrument and the error term are independent.

The default values meet the conditions necessary for a “good” instrument. Also, note that in the Corr X&E list the value .30 is specified. The correlation coefficient for the explanatory variable and the error term equals .30, so the explanatory variable/error term independence premise is violated.
Last, IV is selected, indicating that the instrumental variable (IV) estimation procedure we just described will be used to estimate the value of the explanatory variable’s coefficient.

We shall now illustrate that while the instrumental variable (IV) estimation procedure is still biased, it is consistent. To do so, click Start and then, after many, many repetitions, click Stop. Subsequently, we increase the sample size from 50 to 100 and then again from 100 to 150. Table 18.6 reports the simulation results:

Estimation   Correlation Coefficients   Sample   Actual   Mean of     Magnitude   Variance of
Procedure    X&Z     Z&E     X&E        Size     Coef     Coef Ests   of Bias     Coef Ests
IV           .50     .00     .30        50       2.0      ≈1.61       ≈.39        ≈20.3
IV           .50     .00     .30        100      2.0      ≈1.82       ≈.18        ≈8.7
IV           .50     .00     .30        150      2.0      ≈1.88       ≈.12        ≈5.5

Table 18.6: IV Estimation Procedure – “Good” Instrument Conditions Satisfied

Both bad news and good news emerge:
• Bad News: The instrumental variable estimation procedure is biased. The mean of the estimates for the coefficient of the explanatory variable does not equal the actual value we specified, 2.0.
• Good News: As we increase the sample size,
  o the mean of the coefficient estimates gets closer to the actual value and
  o the variance of the coefficient estimates becomes smaller.
  This illustrates the fact that the instrumental variable (IV) estimation procedure is consistent.

Next, we shall use the lab to illustrate the importance of the “good” instrument conditions. First, let us see what happens when we improve the instrument by making it more highly correlated with the “problem” explanatory variable.
We do this by increasing the correlation coefficient of the explanatory variable and the instrument from .50 to .75 in the Corr X&Z list:

Estimation   Correlation Coefficients   Sample   Actual   Mean of     Magnitude   Variance of
Procedure    X&Z     Z&E     X&E        Size     Coef     Coef Ests   of Bias     Coef Ests
IV           .50     .00     .30        150      2.0      ≈1.88       ≈.12        ≈5.5
IV           .75     .00     .30        150      2.0      ≈1.95       ≈.05        ≈2.3

Table 18.7: IV Estimation Procedure – A Better Instrument

The magnitude of the bias decreases, and the variance of the coefficient estimates decreases as well. A more highly correlated instrument provides a better estimate of the “problem” explanatory variable in IV Regression 1 and hence is a better instrument.

Last, let us use the lab to illustrate the important role that the independence of the error term and the instrument plays. We select .10 in the Corr Z&E list, so the instrument and the error term are no longer independent:

Estimation   Correlation Coefficients   Sample   Actual   Mean of     Magnitude   Variance of
Procedure    X&Z     Z&E     X&E        Size     Coef     Coef Ests   of Bias     Coef Ests
IV           .75     .10     .30        50       2.0      ≈3.69       ≈1.69       ≈6.8
IV           .75     .10     .30        100      2.0      ≈3.74       ≈1.74       ≈3.2
IV           .75     .10     .30        150      2.0      ≈3.76       ≈1.76       ≈2.1

Table 18.8: IV Estimation Procedure – Instrument Correlated with Error Term

As we increase the sample size from 50 to 100 to 150, the magnitude of the bias does not decrease. The instrumental variable (IV) estimation procedure is no longer consistent when the instrument is correlated with the error term; the explanatory variable/error term independence premise is violated in IV Regression 2.
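
For readers who want to experiment outside the lab, the two regression procedure described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lab's actual implementation: the function names (ols, iv_two_regressions, draw_sample, median_iv_estimate) are hypothetical, the variables are drawn from a joint normal distribution with unit variances, and the actual constant is set to 0. Because the lab's variances are not stated in the text, the numbers produced here should not be expected to match Tables 18.6 through 18.8. The sketch summarizes the repetitions with the median rather than the mean, since a few extreme estimates can dominate the mean in small samples.

```python
import numpy as np


def ols(x, y):
    """One-variable least squares: return (constant, coefficient)."""
    coef = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    const = y.mean() - coef * x.mean()
    return const, coef


def iv_two_regressions(y, x, z):
    """IV estimate of the constant and coefficient via two regressions."""
    # IV Regression 1: explain the "problem" variable x with the instrument z.
    a_const, a_z = ols(z, x)
    est_x = a_const + a_z * z  # surrogate for x
    # IV Regression 2: explain y with the surrogate instead of x itself.
    return ols(est_x, y)


def draw_sample(rng, T, corr_xz, corr_ze, corr_xe,
                beta_const=0.0, beta_x=2.0):
    """Draw one sample of (y, x, z).

    Assumption: x, z, and the error term are jointly normal with unit
    variances, so the correlations are also the covariances.
    """
    cov = np.array([[1.0, corr_xz, corr_xe],
                    [corr_xz, 1.0, corr_ze],
                    [corr_xe, corr_ze, 1.0]])
    x, z, e = rng.multivariate_normal(np.zeros(3), cov, size=T).T
    y = beta_const + beta_x * x + e
    return y, x, z


def median_iv_estimate(T, reps=2000, corr_xz=0.5, corr_ze=0.0,
                       corr_xe=0.3, seed=0):
    """Median of the IV coefficient estimates over many repetitions."""
    rng = np.random.default_rng(seed)
    ests = [
        iv_two_regressions(*draw_sample(rng, T, corr_xz, corr_ze, corr_xe))[1]
        for _ in range(reps)
    ]
    return float(np.median(ests))


if __name__ == "__main__":
    for T in (50, 100, 150):
        good = median_iv_estimate(T)
        bad = median_iv_estimate(T, corr_xz=0.75, corr_ze=0.10)
        print(f"T={T:4d}  good instrument: {good:.3f}  "
              f"instrument correlated with error term: {bad:.3f}")
```

Under the unit-variance assumption, the coefficient from IV Regression 2 settles at βx + Cov(z, ε)/Cov(z, x) as the sample size grows. That limit is 2.0 when Corr Z&E is .00, and about 2.13 when Corr Z&E is .10 and Corr X&Z is .75, which mirrors the qualitative pattern in the tables: the estimates home in on the actual value only when the instrument is independent of the error term.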
