Download Lecture_17 and lab 10

Outline
Announcements about the field trip this afternoon.
Starting Chapter 8: Heteroskedasticity
Practice problem on the linear probability model
Continue Chapter 8.

© 2012 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Chapter 8: Heteroskedasticity
Recall MLR.5, the homoskedasticity assumption: $\mathrm{Var}(y \mid x_1, x_2, \dots) = \sigma^2$
The values of the explanatory variables must contain no information about the variance of the unobserved factors.
An example of how it can be hard to justify: the wage equation.
There is greater variance in the wages of college graduates than in the wages of high school graduates: heteroskedasticity.
There is also greater variance in the wages of 50-year-olds than in the wages of 25-year-olds: heteroskedasticity.

Another example: the linear probability model (using a binary variable as Y, see Lecture 16)
We treat Y as a binary random variable with probability $p(X)$ of $y = 1$ and probability $1 - p(X)$ of $y = 0$.
(The purpose of the regression is to determine how $p(X)$ depends on each variable $X_1, X_2, \dots, X_k$.)
What is the variance of Y conditional on X? For a binary variable it is $\mathrm{Var}(Y \mid X) = p(X)\,(1 - p(X))$.
Since p is really $p(X)$, the variance of Y depends on the values of X.
The linear probability model is necessarily heteroskedastic.

Heteroskedasticity
Consequences of heteroskedasticity for OLS
OLS is still unbiased and consistent under heteroskedasticity! R-squared is still valid.
But heteroskedasticity invalidates the variance formulas for the OLS estimators.
As a result, the usual F-tests and t-tests are not valid.
And OLS is no longer the best linear unbiased estimator (BLUE); there may be more efficient linear estimators that are also unbiased.

An easy fix: Use heteroskedasticity-robust standard errors.
Heteroskedasticity-robust inference after OLS
Formulas for OLS standard errors and related statistics have been developed that are robust to heteroskedasticity of unknown form.
Formula for the heteroskedasticity-robust OLS variance (the standard error is its square root): $\widehat{\mathrm{Var}}(\hat\beta_j) = \dfrac{\sum_{i=1}^{n} \hat r_{ij}^2 \, \hat u_i^2}{SSR_j^2}$
Here $\hat u_i$ are the residuals from the regression, $\hat r_{ij}$ are the residuals from a regression of $x_j$ on all other explanatory variables, and $SSR_j$ is the sum of squared residuals from that second regression.
These are also called White/Eicker standard errors.
Using these formulas, the usual t-test is valid in large samples.
The usual F-statistic does not work under heteroskedasticity, but heteroskedasticity-robust versions are available in Stata, using the "test" command that you saw on HW 7 (and that we will use in lab today).

An easy fix: Use heteroskedasticity-robust standard errors.
Example: Comparing the standard errors
1. reg lwage educ exper exper2
2. reg lwage educ exper exper2, robust
Heteroskedasticity-robust standard errors may be larger or smaller than their nonrobust counterparts. The differences are often small in practice.
F-statistics are also often not too different.
If there is strong heteroskedasticity, differences may be larger. To be on the safe side, it is advisable to always compute robust standard errors.
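To see what the robust option changes, the following Python sketch computes both standard errors by hand for a simple regression. It is only an illustration and not part of the lab: the five data points are made up, the function name `ols_slope_with_se` is hypothetical, and the robust formula is the unadjusted White/Eicker version, $\sum_i (x_i - \bar x)^2 \hat u_i^2 / SST_x^2$, which is what the general formula reduces to with one regressor.

```python
import math

def ols_slope_with_se(x, y):
    """Simple regression of y on x: return (slope, usual SE, robust SE)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]              # residuals from regressing x on a constant
    sst_x = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sst_x
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Usual formula: sigma^2-hat / SST_x (relies on homoskedasticity)
    sigma2_hat = sum(u * u for u in resid) / (n - 2)
    usual_se = math.sqrt(sigma2_hat / sst_x)

    # White/Eicker formula: each squared residual is weighted by (x_i - x_bar)^2
    robust_se = math.sqrt(sum(d * d * u * u for d, u in zip(dx, resid)) / sst_x ** 2)
    return slope, usual_se, robust_se

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 6, 3]
slope, usual_se, robust_se = ols_slope_with_se(x, y)
print(slope, usual_se, robust_se)
```

In this toy sample the robust standard error (about 0.39) is smaller than the usual one (about 0.55), which illustrates that robust standard errors can move in either direction. Stata's robust option also applies a degrees-of-freedom adjustment, so its reported values would differ slightly from this unadjusted version.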
Testing for Heteroskedasticity: Breusch-Pagan test
Testing for heteroskedasticity is important because if there is heteroskedasticity, OLS may no longer be the most efficient linear estimator.
Breusch-Pagan test for heteroskedasticity
Null hypothesis: $H_0: \mathrm{Var}(u \mid x_1, x_2, \dots, x_k) = \sigma^2$
Under MLR.4, $\mathrm{Var}(u \mid x_1, \dots, x_k) = E(u^2 \mid x_1, \dots, x_k)$, so under $H_0$ the mean of $u^2$ must not vary with $x_1, x_2, \dots, x_k$.
If $H_0$ is false, then the expected value of $u^2$ could be any function of the x's, but for simplicity, let's suppose that the function is linear: $u^2 = \delta_0 + \delta_1 x_1 + \dots + \delta_k x_k + v$
The null hypothesis is then $H_0: \delta_1 = \delta_2 = \dots = \delta_k = 0$.

Testing for Heteroskedasticity: Breusch-Pagan test
Breusch-Pagan test for heteroskedasticity (cont.)
Regress the squared residuals (our way to estimate $u^2$) on all explanatory variables and test whether this regression has explanatory power.
A large F statistic (coming from a high R-squared) is evidence against the null hypothesis.
If the F statistic is large enough to reject the null hypothesis, we conclude that (some of) the X's are related to the variance of U, meaning that there is heteroskedasticity.

Heteroskedasticity of a known form
When we actually know how Var(U) depends on X, we can transform the variables to correct for this.
Our goal is to redefine them so that the regression using the transformed variables no longer has any heteroskedasticity.
This method is called Weighted Least Squares (WLS), as opposed to Ordinary Least Squares.

Heteroskedasticity of a known form
Weighted least squares estimation
Heteroskedasticity is known up to a multiplicative constant: $\mathrm{Var}(u_i \mid x_i) = \sigma^2 h(x_i)$, with $h(x_i) = h_i > 0$
The functional form of the heteroskedasticity, $h$, is known.
Transformed model: $\dfrac{y_i}{\sqrt{h_i}} = \beta_0 \dfrac{1}{\sqrt{h_i}} + \beta_1 \dfrac{x_{i1}}{\sqrt{h_i}} + \dots + \beta_k \dfrac{x_{ik}}{\sqrt{h_i}} + \dfrac{u_i}{\sqrt{h_i}}$

Example: Weighted least squares (WLS)
Example: Savings and income
Model: $sav_i = \beta_0 + \beta_1 inc_i + u_i$, with $\mathrm{Var}(u_i \mid inc_i) = \sigma^2 inc_i$
Transformed model: $\dfrac{sav_i}{\sqrt{inc_i}} = \beta_0 \dfrac{1}{\sqrt{inc_i}} + \beta_1 \sqrt{inc_i} + u_i^*$, where $u_i^* = u_i / \sqrt{inc_i}$
Note that this regression model has no intercept.
The transformed model is homoskedastic: $\mathrm{Var}(u_i^* \mid inc_i) = \sigma^2$
If the other Gauss-Markov assumptions hold, then OLS applied to the transformed model is the best linear unbiased estimator!

Example: Weighted least squares (WLS)
OLS in the transformed model is weighted least squares (WLS).
Observations with a large variance are downweighted (treated as less important relative to the observations with less variance).
Why is WLS more efficient than OLS in the original model?
Observations with a large variance are less informative than observations with a small variance and therefore should get less weight.
WLS is a special case of generalized least squares (GLS).

Multiple Regression Analysis: Heteroskedasticity
Example: Financial wealth equation
Dependent variable: net financial wealth
Assumed form of heteroskedasticity: $\mathrm{Var}(u_i \mid inc_i) = \sigma^2 inc_i$
WLS estimates have considerably smaller standard errors (which is in line with the expectation that they are more efficient).
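The claim that OLS in the transformed model is WLS can be checked numerically. The Python sketch below is an illustration under stated assumptions, not the lab's method: the income and savings numbers are invented, the function names are hypothetical, and the variance function is assumed to be $h_i = inc_i$ as in the savings example. It estimates the same model two ways, once by OLS on the variables divided by $\sqrt{h_i}$ (no intercept) and once by minimizing the weighted sum of squared residuals with weights $1/h_i$.

```python
import math

def wls_via_transform(x, y, h):
    """OLS on the transformed model: y/sqrt(h) on 1/sqrt(h) and x/sqrt(h), no intercept."""
    z0 = [1 / math.sqrt(hi) for hi in h]               # transformed constant
    z1 = [xi / math.sqrt(hi) for xi, hi in zip(x, h)]  # transformed regressor
    ys = [yi / math.sqrt(hi) for yi, hi in zip(y, h)]  # transformed outcome
    # Normal equations for two regressors without an intercept
    a = sum(v * v for v in z0)
    b = sum(u * v for u, v in zip(z0, z1))
    d = sum(v * v for v in z1)
    r0 = sum(u * v for u, v in zip(z0, ys))
    r1 = sum(u * v for u, v in zip(z1, ys))
    det = a * d - b * b
    return (d * r0 - b * r1) / det, (a * r1 - b * r0) / det

def wls_weighted(x, y, h):
    """Minimize sum of w_i * (y_i - b0 - b1 * x_i)^2 with weights w_i = 1 / h_i."""
    w = [1 / hi for hi in h]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted means
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b1 = (sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
          / sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x)))
    return y_bar - b1 * x_bar, b1

# Invented data: income and savings, with Var(u | inc) assumed proportional to inc
inc = [1, 4, 9, 16]
sav = [1.0, 2.5, 2.0, 6.0]
b0_t, b1_t = wls_via_transform(inc, sav, inc)
b0_w, b1_w = wls_weighted(inc, sav, inc)
print(b0_t, b1_t)
print(b0_w, b1_w)
```

Both routes give the same coefficients, which is why high-variance observations end up with less influence: dividing by $\sqrt{h_i}$ is the same as weighting each squared residual by $1/h_i$.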
Multiple Regression Analysis: Heteroskedasticity
WLS in the linear probability model
In the LPM, the exact form of heteroskedasticity is known: $\mathrm{Var}(y \mid x) = p(x)\,(1 - p(x))$, estimated by $\hat h_i = \hat y_i (1 - \hat y_i)$
Use the inverse values $1/\hat h_i$ as weights in WLS.
Discussion
Infeasible if LPM predictions are below zero or greater than one.
If such cases are rare, they may be adjusted to values such as .01/.99.
Otherwise, it is probably better to use OLS with robust standard errors.

Lab Session slides
We will practice testing for heteroskedasticity.

Multiple Regression Analysis: Heteroskedasticity
Important special case of heteroskedasticity
If the observations are reported as averages at the city/county/state/country/firm level, they should be weighted by the size of the unit.
Model: $\overline{contrib}_i = \beta_0 + \beta_1 \overline{earns}_i + \beta_2 \overline{age}_i + \beta_3 mrate_i + \bar u_i$
$\overline{contrib}_i$: average contribution to the pension plan in firm i (as a share of earnings)
$\overline{earns}_i$, $\overline{age}_i$: average earnings and age in firm i
$mrate_i$: percentage the firm contributes to the plan
$\bar u_i$: heteroskedastic error term
Error variance if errors are homoskedastic at the employee level: $\mathrm{Var}(\bar u_i) = \sigma^2 / m_i$, where $m_i$ is the size of firm i
If errors are homoskedastic at the employee level, WLS with weights equal to firm size $m_i$ should be used.
If the assumption of homoskedasticity at the employee level is not exactly right, one can calculate robust standard errors after WLS (i.e., for the transformed model).

Try it out: The Breusch-Pagan test.
Open the "obesity" data from the Piazza page.
Start with the SLR you ran in your HW: reg obes_rate F50
To check for heteroskedasticity, create a variable for $\hat u^2$ and see if it is correlated with F50.
Recall that the command: predict ___, resid
creates a variable containing the residual $\hat u$.
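The steps of this exercise (run the regression, save the residuals, square them, regress the squares on the explanatory variable) can be sketched outside Stata as follows. This is a minimal Python illustration with made-up numbers rather than the obesity data; the function names are hypothetical, and the statistics are the standard F form, $F = \frac{R^2/k}{(1 - R^2)/(n - k - 1)}$, and LM form, $n R^2$, computed from the R-squared of the auxiliary regression with $k = 1$ regressor.

```python
def simple_ols(x, y):
    """Regress y on x with an intercept: return (intercept, slope, residuals, R-squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sst_y = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - sum(u * u for u in resid) / sst_y
    return intercept, slope, resid, r2

def breusch_pagan(x, y):
    """Return (auxiliary R-squared, F statistic, LM statistic) for one regressor."""
    n, k = len(x), 1
    _, _, resid, _ = simple_ols(x, y)        # step 1: original regression
    u2 = [u * u for u in resid]              # step 2: squared residuals
    _, _, _, r2 = simple_ols(x, u2)          # step 3: regress squared residuals on x
    f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
    lm_stat = n * r2
    return r2, f_stat, lm_stat

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 6, 3]
r2, f_stat, lm_stat = breusch_pagan(x, y)
print(r2, f_stat, lm_stat)
```

The sketch stops at the test statistics; deciding whether to reject still requires comparing the F statistic with an F critical value (1 and n - 2 degrees of freedom here) or the LM statistic with a chi-square critical value.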

Breusch-Pagan test 2nd example
Now, let's run the model with all the Census controls: reg obes F50 medhhinc-pctpopownerocc
Perform the same test as before. What do you find?

Example: Weighted least squares (WLS) obesity dataset
Recall: What is the level of observation here?
Suppose that at the individual level, the OLS model is homoskedastic. That is, for the model
$Obese_i = B_0 + B_1 F50_i + U_i$
let's assume that $\mathrm{Var}(U_i \mid F50_i) = \sigma^2$.
What happens when we aggregate to the school level?
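One way to work through the aggregation question is the short derivation below. It is a sketch under added assumptions: school i is taken to have $m_i$ students (notation borrowed from the firm-size example, not from the obesity slides), $U_{i,e}$ denotes the error of student e in school i, and the individual errors are assumed to be uncorrelated within a school with common variance $\sigma^2$.

```latex
\begin{align*}
\bar{U}_i &= \frac{1}{m_i} \sum_{e=1}^{m_i} U_{i,e}, \\
\mathrm{Var}(\bar{U}_i \mid F50_i)
  &= \frac{1}{m_i^2} \sum_{e=1}^{m_i} \mathrm{Var}(U_{i,e} \mid F50_i)
   = \frac{m_i \sigma^2}{m_i^2}
   = \frac{\sigma^2}{m_i}.
\end{align*}
```

Under these assumptions the school-level error is heteroskedastic even though the individual-level error is not: its variance shrinks with school size, which is the same argument that motivates weighting by unit size when observations are averages.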
