Lecture 10: Heteroskedasticity
Econ 488

Order of Testing
1. Omitted variables and incorrect functional form (Adjusted $R^2$)
2. Either A or B, but not both
   A. Serial Correlation (Durbin-Watson)
   B. Heteroskedasticity (Park’s Test, White’s Test)
3. Multicollinearity (Correlation Matrix, VIF)
4. Irrelevant Variables (t-test)

Homoskedasticity
Ideal case: homoskedasticity
• The error variance $\sigma^2$ is constant across the sample.
• $\sigma^2$ measures the dispersion of the dependent variable around the regression line.
• Homoskedasticity means that this dispersion is the same throughout the sample.

Heteroskedasticity
• Heteroskedasticity (or heteroscedasticity) is when $\sigma^2$ is not constant across the sample.
• The dispersion of the dependent variable around the regression line is not constant.

Why do we care?
If we don’t fix heteroskedasticity:
• Coefficient estimates are not efficient (not minimum variance).
• Estimated standard errors are biased and inconsistent, meaning the t-stats are not right!

When can it occur?
• Whenever the dispersion around the regression line differs within the sample.

Example: MLB Payroll and Market Size
2008 MLB Payrolls
Large Markets (Population > 5,000,000)
• Mean: $104,000,000
• Std Dev: $44,600,000
• Min: $21,800,000 (Florida Marlins)
• Max: $209,000,000 (NY Yankees)
Small Markets (Population < 5,000,000)
• Mean: $78,800,000
• Std Dev: $28,300,000
• Min: $43,800,000 (Tampa Bay Rays)
• Max: $139,000,000 (Detroit Tigers)

Note: The same principle applies when observations are groups that differ in size, e.g.:
• States (population)
• Countries (population)
• Colleges (enrollment)
• Companies (sales)
• Etc.

Another Example: Household Income vs. Consumption
A. Low-income households
• Little flexibility in spending
• Most income is spent on necessities: food, shelter, clothing, transportation, utilities
• Little dispersion of consumption around mean consumption: small $\sigma^2$
B. High-income households
• More flexibility in spending
• Once necessities are purchased, much remains to be spent in different ways: big spenders, savers and investors
• Large dispersion of consumption around the mean: large $\sigma^2$

Pure vs. Impure Heteroskedasticity
• Impure: occurs when the regression is not correctly specified. E.g. omitted variables can cause heteroskedasticity.
• Pure: occurs due to the nature of the data.

Consequences
If we ignore heteroskedasticity, coefficient estimates are:
• Unbiased: OK!
• Consistent: OK!
• Inefficient: Not OK.
The t-tests are inaccurate.

Detection
• Tests detect heteroskedasticity, but won’t distinguish between the pure and impure types.
• If a test uncovers heteroskedasticity: STOP! Try to decide if you have an omitted variable.
• If you do, include it in your model, and then retest for heteroskedasticity.
• OR, if you don’t have an omitted variable, employ one of the remedies we’ll discuss.
• After you “fix” the problem, test again. If you still have heteroskedasticity, it might be the impure type.

Detection: Plots
1) Estimate the model, save the residuals
2) Plot the residuals against each independent variable separately
Example: data3-6.gdt
Residual patterns that suggest heteroskedasticity:
• V on its side
• Increasing or decreasing
• Rainbow or inverted rainbow

Park Test
If there is heteroskedasticity, then
$\mathrm{Var}(\varepsilon_i) = \sigma^2 Z_i^2$
• $\varepsilon_i$ = error term
• $\sigma^2$ = variance of the homoskedastic error term
• $Z_i$ = proportionality factor
If you know something about Z, you can use the Park test. Find a variable that is related to the heteroskedasticity (e.g. population).
1. Run the regression, obtain the residuals.
2. Run the following regression:
   $\ln(e_i^2) = \alpha_0 + \alpha_1 \ln(Z_i) + u_i$
   where:
   o $e_i$ = residuals from the regression in step 1
   o $Z_i$ = best choice as to the proportionality factor in the data
   o $u_i$ = classical error term
3. Test the significance of the coefficient on $\ln(Z_i)$.
   o If it is significant, there is evidence of heteroskedasticity.
Problem: We don’t always have a good Z. So, we can use White’s Test.

White’s Test
• H0: No heteroskedasticity
• HA: Heteroskedasticity
1) Estimate the equation
   $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
2) Save the residual
   $e_i = Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i}$
   and square it.
3) Regress the squared residual on a constant, $X_1$, $X_2$, $X_1^2$, $X_2^2$, $X_1 X_2$ (all combinations of the X’s):
   $e_i^2 = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + \alpha_3 X_{1i}^2 + \alpha_4 X_{2i}^2 + \alpha_5 X_{1i} X_{2i} + v_i$
4) Compute $N R^2$
   o N = sample size
   o $R^2$ = unadjusted $R^2$ of the auxiliary regression in step 3
5) Reject the null if
   o $N R^2$ is greater than the $\chi^2$ (chi-square) critical value with 5 degrees of freedom
   o There are 5 degrees of freedom because there are 5 independent vars in the auxiliary regression (step 3)

With k independent vars, the auxiliary regression has k linear terms, k squared terms, and k(k-1)/2 cross products.
• If you have 3 independent vars, the auxiliary regression will have 9 independent vars: $X_1$, $X_2$, $X_3$, $X_1^2$, $X_2^2$, $X_3^2$, $X_1 X_2$, $X_2 X_3$, $X_1 X_3$
• If you have 6 independent vars, the auxiliary regression will have 27 independent vars! This can get out of hand quickly.

White’s Test Version 2
• Same as before, except the auxiliary regression uses only the X and $X^2$ terms (no cross products).
• Use when you have a lot of independent variables.

Remedies For Heteroskedasticity
1. Heteroskedasticity-Corrected Standard Errors
   o Fixes the consistency of the standard errors, so when N is large, the standard errors are correct.
   o In gretl, just check the “robust standard error” box when running a regression.
2. Weighted Least Squares (WLS)
   (1) $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
   (2) $\mathrm{Var}(\varepsilon_i) = \sigma^2 Z_i^2$
   Given (2), eqn. (1) is equivalent to
   (3) $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + Z_i u_i$
   where $u_i$ is a homoskedastic error term with variance $\sigma^2$. So we can divide through by $Z_i$.

Step one:
$\frac{Y_i}{Z_i} = \frac{\beta_0}{Z_i} + \beta_1 \frac{X_{1i}}{Z_i} + \beta_2 \frac{X_{2i}}{Z_i} + u_i$
Step two: estimate by OLS.
Caution about step two: there are two cases.

Case 1: Z is not in the original equation
Old: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
New: $\frac{Y_i}{Z_i} = \beta_0 \frac{1}{Z_i} + \beta_1 \frac{X_{1i}}{Z_i} + \beta_2 \frac{X_{2i}}{Z_i} + u_i$
What’s missing? The constant!
Solution: Add a constant.
Better: $\frac{Y_i}{Z_i} = \alpha_0 + \beta_0 \frac{1}{Z_i} + \beta_1 \frac{X_{1i}}{Z_i} + \beta_2 \frac{X_{2i}}{Z_i} + u_i$

Case 2: Z is in the original equation
Suppose $X_1$ is Z.
Old: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
New: $\frac{Y_i}{X_{1i}} = \beta_0 \frac{1}{X_{1i}} + \beta_1 + \beta_2 \frac{X_{2i}}{X_{1i}} + u_i$
What’s different about this equation? One of the slope coefficients in the original equation becomes an intercept! This happens because $X_{1i}/X_{1i} = 1$.
That is: the intercept in the new equation is the same as the slope $\beta_1$ in the original equation.
What should you look at in the new equation to find the effect of $X_1$? The constant.

Example: saving.gdt (weight by income)
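The three Park test steps described earlier can be sketched in plain Python. This is a minimal illustration, not the gretl procedure used in class: the helper names `simple_ols` and `park_test` and the residual values are hypothetical, and the sketch assumes every Z is positive and no residual is exactly zero (so the logs exist).

```python
import math

def simple_ols(x, y):
    """Return (intercept, slope, standard error of slope) for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance uses n - 2 degrees of freedom (two estimated coefficients).
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(ssr / (n - 2) / sxx)
    return intercept, slope, se_slope

def park_test(residuals, z):
    """Regress ln(e_i^2) on ln(Z_i); return (alpha1, t-statistic of alpha1)."""
    log_e2 = [math.log(e ** 2) for e in residuals]
    log_z = [math.log(zi) for zi in z]
    _, alpha1, se = simple_ols(log_z, log_e2)
    return alpha1, alpha1 / se

# Hypothetical residuals whose spread grows in proportion to Z (made-up data).
z = [float(i) for i in range(1, 21)]
residuals = [zi * (1.1 if i % 2 == 0 else -0.9) for i, zi in enumerate(z)]

alpha1, t_stat = park_test(residuals, z)
print(f"alpha1 = {alpha1:.3f}, t = {t_stat:.2f}")
```

A large absolute t-statistic on ln(Z) is the evidence of heteroskedasticity that step 3 looks for. In practice the residuals would come from the original regression rather than being constructed by hand.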
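White’s test with two independent variables (steps 1 to 5 described earlier) can be sketched as follows. This is an illustration under stated assumptions: `ols_fit` and `white_test` are hypothetical helper names, the data are made up so that the error spread grows with X1, and 11.07 is the 5% chi-square critical value with 5 degrees of freedom.

```python
def ols_fit(rows, y):
    """OLS via the normal equations. Each row must already include the constant.
    Returns (coefficients, unadjusted R-squared)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [v - f * w for v, w in zip(a[row], a[col])]
    coefs = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return coefs, 1 - ssr / sst

def white_test(x1, x2, y, chi2_crit=11.07):
    """Return (N*R^2, reject H0?) for the two-regressor White test."""
    n = len(y)
    # Steps 1 and 2: estimate the equation, then square the residuals.
    design = [[1.0, a, b] for a, b in zip(x1, x2)]
    coefs, _ = ols_fit(design, y)
    e2 = [(yi - sum(c * v for c, v in zip(coefs, r))) ** 2
          for r, yi in zip(design, y)]
    # Step 3: auxiliary regression on X1, X2, X1^2, X2^2, X1*X2.
    aux = [[1.0, a, b, a * a, b * b, a * b] for a, b in zip(x1, x2)]
    _, r2 = ols_fit(aux, e2)
    # Steps 4 and 5: compare N*R^2 with the chi-square critical value (5 df).
    stat = n * r2
    return stat, stat > chi2_crit

# Made-up data: the error term is proportional to X1, so it is heteroskedastic.
x1 = [i / 10 for i in range(1, 41)]
x2 = [((7 * i) % 11) / 10 for i in range(1, 41)]
err = [0.5 * a * (1 if i % 2 == 0 else -1) for i, a in enumerate(x1)]
y = [1 + 2 * a + 3 * b + e for a, b, e in zip(x1, x2, err)]

stat, reject = white_test(x1, x2, y)
print(f"N*R^2 = {stat:.2f}, reject H0: {reject}")
```

Solving the normal equations directly keeps the sketch free of dependencies; a statistics package would be the more numerically robust choice for real data.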
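The Case 2 result (a slope in the original equation becomes the intercept of the transformed equation) can be checked numerically. The sketch below simplifies to a single regressor, so the model is Y = β0 + β1·X1 + X1·u with Z = X1. The helper names and the data are hypothetical, and this is not the saving.gdt example.

```python
def simple_ols(x, y):
    """Return (intercept, slope) for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def wls_case2(x1, y):
    """Divide Y = b0 + b1*X1 + X1*u through by X1, then run OLS of Y/X1 on 1/X1."""
    y_star = [yi / xi for xi, yi in zip(x1, y)]
    inv_x1 = [1 / xi for xi in x1]
    intercept, slope = simple_ols(inv_x1, y_star)
    # The intercept of the transformed equation estimates b1 (the original slope);
    # the coefficient on 1/X1 estimates b0 (the original intercept).
    return {"beta0": slope, "beta1": intercept}

# Made-up data with true beta0 = 2 and beta1 = 3; the error term is X1 * u,
# so its spread grows with X1.
x1 = [float(i) for i in range(1, 21)]
u = [0.1 if i % 2 == 0 else -0.1 for i in range(20)]
y = [2 + 3 * xi + xi * ui for xi, ui in zip(x1, u)]

estimates = wls_case2(x1, y)
print(estimates)
```

Reading the original slope off the new constant is easy to get backwards, which is why the function returns the estimates under their original names.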
