Economics 405
Due October 16 at 6:30 pm.
Problem Set #2

Instructions: Show all of your work and give complete explanations. As noted on the course syllabus, the problem set will be graded on a credit/no-credit basis. You must give a substantive answer to each question/problem. Non-attempts and weak attempts will be docked accordingly. Group work is discouraged. To the extent that you do work with a colleague, be absolutely sure to give answers in your own words. Duplicate answers will automatically be assigned a 0.0.

1. The total sum of squares is defined as $SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$.

a. Show that

$$SST = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + 2\sum_{i=1}^{n} (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}).$$

For each observation,

$$(y_i - \bar{y})^2 = (y_i - \hat{y}_i + \hat{y}_i - \bar{y})^2 = [(y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})]^2$$

$$= \{[(y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})]\,[(y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})]\} = \{(y_i - \hat{y}_i)^2 + (\hat{y}_i - \bar{y})^2 + 2(y_i - \hat{y}_i)(\hat{y}_i - \bar{y})\}.$$

Then, summing over $i$ and applying Property Sum.3 (p. 708), we obtain:

$$SST = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + 2\sum_{i=1}^{n} (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}).$$

b. Show that the last term on the right-hand side of a.), $2\sum_{i=1}^{n} (y_i - \hat{y}_i)(\hat{y}_i - \bar{y})$, equals 0. (Hint: First show that $\sum_{i=1}^{n} (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) = \sum_{i=1}^{n} \hat{u}_i(\hat{\beta}_0 + \hat{\beta}_1 x_i - \bar{y})$, then apply the algebraic properties of OLS in conjunction with the rules of summation.)

By definition, $\hat{u}_i = y_i - \hat{y}_i$ and $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$. Substituting the residual into the expression inside the first set of brackets and the predicted value of $y$ into the second set of brackets, we get the hint. Therefore:

$$\sum_{i=1}^{n} \hat{u}_i(\hat{\beta}_0 + \hat{\beta}_1 x_i - \bar{y}) = \sum_{i=1}^{n} \{\hat{\beta}_0 \hat{u}_i + \hat{\beta}_1 x_i \hat{u}_i - \bar{y}\hat{u}_i\}.$$

Recognizing that $\hat{\beta}_0$, $\hat{\beta}_1$, and $\bar{y}$ are constants in a given sample and applying Property Sum.3, we obtain:

$$\sum_{i=1}^{n} \{\hat{\beta}_0 \hat{u}_i + \hat{\beta}_1 x_i \hat{u}_i - \bar{y}\hat{u}_i\} = \hat{\beta}_0 \sum_{i=1}^{n} \hat{u}_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i \hat{u}_i - \bar{y} \sum_{i=1}^{n} \hat{u}_i.$$

The sum of the OLS residuals equals zero (see p. 40 of the text), therefore terms 1 and 3 must both equal 0. Furthermore, $\sum_{i=1}^{n} x_i \hat{u}_i = 0$ follows from the second first-order condition for the minimization of SSR (see p. 40 or p. 30), therefore the second term in the expression above must also equal zero.
Thus:

$$2\sum_{i=1}^{n} (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) = 2 \cdot 0 = 0.$$

2. The OLS estimator of the slope coefficient is

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}.$$

a. Show algebraically that

$$\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x})u_i}{SST_x}.$$

See pp. 53-54 in the text or lecture notes from the second half of class on September 25.

b. Using the expression in a.), explain why it is unlikely that $\hat{\beta}_1 = \beta_1$ for a given sample of data.

There is nothing to force the second element on the right-hand side of $\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x})u_i}{SST_x}$ to add up to zero. $u_i$ is the population error. While on average it is equal to zero, any given value can be positive or negative. Therefore, over a sample of $n$ observations, the sum of the product of the mean deviations of the regressor and the population errors can be positive or negative depending on the sample of data.

c. Given the Simple Linear Regression Assumptions (SLR.1 through SLR.4), show that $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$.

Proof is given in Theorem 2.1 on p. 54 of the text and in the lecture notes from the second half of class on September 25.

d. Under what circumstances is the zero conditional mean assumption not valid? If the zero conditional mean assumption is not valid, does that imply that $\hat{\beta}_1$ is a biased estimator of $\beta_1$? Why or why not?

The zero conditional mean assumption is not valid when the covariance between the regressor and the error is not equal to zero. The error contains factors that determine the dependent variable which have not been included in the systematic component of the regression model. To the extent that any of these factors are correlated with the regressor, the zero conditional mean assumption will not be valid. If the zero conditional mean assumption is not valid, then $E(u \mid x) \neq 0$. As a result,

$$E(\hat{\beta}_1) = \beta_1 + (1/SST_x) \sum_{i=1}^{n} (x_i - \bar{x})E(u_i \mid x) \neq \beta_1,$$

since the sum term does not equal zero when the zero conditional mean assumption is not valid. Specifically, if $x$ and $u$ are positively correlated, then the products inside the sum will tend to be positive.
Likewise, if the correlation between $x$ and $u$ is negative, then the products inside the sum will tend to be negative.

3. a. If the errors, $u_i$, have constant variance regardless of the value of $x$, then we say that they are homoskedastic. True or false? Explain and illustrate with the appropriate graph.

True. By definition, the errors are homoskedastic if $Var(u \mid x) = \sigma^2$. This says that the variance of the errors is the same regardless of the value of $x$. See Figure 2.8 on p. 58.

b. If the errors, $u_i$, are heteroskedastic, then the OLS estimators, $\hat{\beta}_0$ and $\hat{\beta}_1$, are biased. True or false? Explain.

This statement is false. Unbiasedness of the OLS estimators requires assumptions SLR.1-SLR.4. The homoskedasticity assumption SLR.5 is not necessary to show unbiasedness of the OLS estimators.

4. Show that $\frac{\sum_{i=1}^{n} \hat{u}_i^2}{n}$ is a biased estimator of the error variance, $Var(u) = \sigma^2$. Is this estimator biased in the downward direction or the upward direction? Explain.

The claim was made in class on October 2 (the claim's proof is part of the proof of Theorem 2.3 on p. 62 of the text) that $E\left(\sum_{i=1}^{n} \hat{u}_i^2\right) = (n-2)\sigma^2$. Therefore,

$$E\left(\frac{\sum_{i=1}^{n} \hat{u}_i^2}{n}\right) = (1/n)\,E\left(\sum_{i=1}^{n} \hat{u}_i^2\right) = (1/n)(n-2)\sigma^2 = \frac{n-2}{n}\sigma^2 < \sigma^2.$$

The punch line here is that the sample average of the squared residuals is biased in the downward direction, i.e., it would systematically underestimate the true population variance. Note that with a small sample, this would be a big problem. For example, n = 3 => (n-2)/n = 1/3 = .333. With a large sample, there wouldn't be much bias at all. For example, n = 1000 => (n-2)/n = .998 ≈ 1.

5. I obtained a sample of 88 home prices (measured in thousands of dollars). I regressed the house sale price (price) on house size (sqrft, measured in hundreds of square feet, e.g., 2,300 square feet implies sqrft = 23.0). Here's what I found:

Dependent Variable: PRICE
Method: Least Squares
Sample: 1 88
Included observations: 88

Variable     Coefficient   Std. Error   t-Statistic   Prob.
SQRFT        14.02110      ________                   0.0000
C            11.20414      24.74261     0.452828      0.6518

R-squared              0.620797    Mean dependent var       293.5460
Adjusted R-squared     0.616387    S.D. dependent var       102.7134
S.E. of regression     ________    Akaike info criterion    11.16611
Sum squared resid      348053.4    Schwarz criterion        11.22241
Log likelihood         -489.3087   F-statistic              140.7913
Durbin-Watson stat     1.728723    Prob(F-statistic)        0.000000

a. Write down the fitted model.

$$\widehat{price} = 11.204 + 14.021\,sqrft, \qquad R^2 = .621$$

b. Determine $\hat{\sigma}$. Interpret the value you obtain (be sure to make your interpretation relative to the standard deviation of the dependent variable).

$$\hat{\sigma} = \sqrt{\frac{SSR}{n-2}} = \sqrt{\frac{348053.4}{88-2}} = 63.617$$

The standard deviation of the dependent variable is 102.713, which, for this sample, implies that the typical amount of deviation from the mean for a given observation is $102,713 (since the dependent variable is measured in thousands of dollars). The standard error of the regression, at 63.617, implies that the typical amount of discrepancy between an observed value and the predicted value based on the regression relationship is $63,617. In other words, the typical amount of deviation in an observation due to unobservable factors is $63,617.

c. It turns out that $SST_x = 2898.406$. Determine $SE(\hat{\beta}_1)$. Interpret the value you obtain.

$$SE(\hat{\beta}_1) = \frac{\hat{\sigma}}{\sqrt{SST_x}} = \frac{63.617}{\sqrt{2898.406}} = 1.182.$$

The point estimate of $\beta_1$ is 14.021. The standard error of $\hat{\beta}_1$ gives us an idea of how precise the estimate of the parameter is. The closer the standard error is to 0, the more precise the estimate. In the problem at hand, we have estimated that an extra hundred square feet in house size is associated with a $14,021 increase in house price. The standard error of the estimate indicates that the true impact of a hundred square foot increase in house size on price could be as much as $15,000+ or as little as a bit less than $13,000.

6. Let $\hat{\beta}_0$ and $\hat{\beta}_1$ be the intercept and slope from the simple regression of $y_i$ on $x_i$, using $n$ observations. Let $c_1$ and $c_2$, with $c_2 \neq 0$, be constants.
Let $\tilde{\beta}_0$ and $\tilde{\beta}_1$ be the intercept and slope from the regression of $c_1 y_i$ on $c_2 x_i$.

a. Show that $\tilde{\beta}_1 = (c_1/c_2)\hat{\beta}_1$ and $\tilde{\beta}_0 = c_1 \hat{\beta}_0$. [Hint: To obtain $\tilde{\beta}_1$, plug $c_1 y_i$ and $c_2 x_i$ into equation (2.19) from the text. Then, use equation (2.17) from the text for $\tilde{\beta}_0$, being sure to plug in $c_1 y_i$ and $c_2 x_i$ and $\tilde{\beta}_1$.]

The formula for the OLS slope coefficient estimator is

$$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}.$$

In this problem the dependent variable is $c_1 y_i$ and the regressor is $c_2 x_i$. Note that the mean of the dependent variable is $\frac{\sum c_1 y_i}{n} = c_1 \frac{\sum y_i}{n} = c_1 \bar{y}$ and, by similar reasoning, the mean of the regressor is $c_2 \bar{x}$. Plugging into the formula for the OLS estimator we get:

$$\tilde{\beta}_1 = \frac{\sum (c_2 x_i - c_2 \bar{x})(c_1 y_i - c_1 \bar{y})}{\sum (c_2 x_i - c_2 \bar{x})^2} = \frac{c_1 c_2 \sum (x_i - \bar{x})(y_i - \bar{y})}{c_2^2 \sum (x_i - \bar{x})^2} = \frac{c_1 c_2}{c_2^2} \cdot \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{c_1}{c_2}\hat{\beta}_1.$$

The general formula for the OLS intercept estimator is $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$. Recalling the definitions of the dependent variable and the regressor in the case at hand and plugging into the general formula, the intercept estimator is

$$\tilde{\beta}_0 = c_1 \bar{y} - \tilde{\beta}_1 c_2 \bar{x} = c_1 \bar{y} - \frac{c_1}{c_2}\hat{\beta}_1 c_2 \bar{x} = c_1 \bar{y} - c_1 \hat{\beta}_1 \bar{x} = c_1(\bar{y} - \hat{\beta}_1 \bar{x}) = c_1 \hat{\beta}_0.$$

b. Using the Ceosal1 data from the textbook's data files, I regressed salary on roe and obtained

$$\widehat{salary} = 963.19 + 18.501\,roe, \qquad R^2 = .013$$

Let $c_1 = 1000$ and $c_2 = 1/100$. What do the results from part a.) imply regarding the estimated slope and intercept coefficients from a regression of $c_1 y_i$ on $c_2 x_i$? With the proposed transformations of $x$ and $y$, what units are the variables measured in?

$$\tilde{\beta}_1 = \left(\frac{c_1}{c_2}\right)\hat{\beta}_1 = \left(\frac{1000}{1/100}\right)18.501 = 1{,}850{,}100$$

$$\tilde{\beta}_0 = c_1 \hat{\beta}_0 = 1000 \times 963.19 = 963{,}190$$

Since salary is measured in thousands of dollars, the transformation puts the dependent variable into dollars. Since roe is measured in percentage points, the transformation puts the regressor into decimal terms. For example, if roe = 20.0, then the transformation implies $c_2 x_i = .20$.

$\hat{\beta}_1 = 18.501$ implies that a 1 percentage point increase in roe raises predicted CEO salary by $18,501.
$\tilde{\beta}_1 = 1{,}850{,}100$ implies that a 1 unit change in $c_2 x$ raises predicted CEO salary (measured in dollars) by $1,850,100. A 1 unit change in the transformed regressor, however, is much too large a change to consider. That would be like going from a return on equity of 20% to 120%! What's a more reasonable change to look at? A 1 percentage point change in the transformed regressor would be .01 (e.g., .20 to .21). Therefore, the predicted increase in salary (measured in dollars) due to a .01 change in $c_2 x$ is

$$\Delta(\widehat{c_1 y}) = \tilde{\beta}_1\,\Delta(c_2 x) = 1{,}850{,}100 \times .01 = \$18{,}501.$$

If this looks like the interpretation for $\hat{\beta}_1$, it should, since they are the same! In other words, the transformations employed here do not alter the fundamental relationship between the dependent and independent variables.

c. Use EViews to actually run the regression of $c_1 y_i$ on $c_2 x_i$ proposed in b. Note that you'll have to GENR the transformed variables. Attach your computer output. Does the regression confirm your claim in part b? (It should.)

Dependent Variable: C1Y
Method: Least Squares
Date: 10/15/07 Time: 06:13
Sample: 1 209
Included observations: 209

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C2X          1850119.      1112325.     1.663290      0.0978
C            963191.3      213240.3     4.516930      0.0000

R-squared              0.013189    Mean dependent var       1281120.
Adjusted R-squared     0.008421    S.D. dependent var       1372345.
S.E. of regression     1366555.    Akaike info criterion    31.10301
Sum squared resid      3.87E+14    Schwarz criterion        31.13499
Log likelihood         -3248.264   F-statistic              2.766532
Prob(F-statistic)      0.097768    Durbin-Watson stat       2.104990

The claims in b.) are confirmed. The small differences from the values computed in b.) (1,850,119 versus 1,850,100 and 963,191.3 versus 963,190) arise because b.) used rounded coefficients.

d. Compare the $R^2$ of the new regression with the original regression. Which is greater? Explain.

The $R^2$ values of the two regressions are the same. This shows that changing the definitions of the dependent and independent variables by linear transformation has no impact on the goodness of fit of the model. The underlying amount of variation in the dependent variable explained by variation in the regressor remains constant.
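The scaling results from Problem 6 can be checked numerically. The following is a minimal sketch, assuming synthetic data rather than the Ceosal1 file (the data-generating values and the helper name `ols_simple` are illustrative assumptions, not part of the problem set). It applies the slope and intercept formulas used in part a.) to both the original and the rescaled variables.

```python
import numpy as np


def ols_simple(x, y):
    """Return (intercept, slope, r_squared) for a simple regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_dev = x - x.mean()
    y_dev = y - y.mean()
    # Slope: sum of cross-deviations over the sum of squared x-deviations.
    slope = np.sum(x_dev * y_dev) / np.sum(x_dev ** 2)
    # Intercept: y-bar minus slope times x-bar.
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    r_squared = 1.0 - np.sum(resid ** 2) / np.sum(y_dev ** 2)
    return intercept, slope, r_squared


# Synthetic data (an assumption for illustration; NOT the Ceosal1 data).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 40.0, size=200)
y = 900.0 + 20.0 * x + rng.normal(0.0, 500.0, size=200)

c1, c2 = 1000.0, 1.0 / 100.0  # the constants from part b.)

b0, b1, r2 = ols_simple(x, y)                  # regression of y on x
b0_t, b1_t, r2_t = ols_simple(c2 * x, c1 * y)  # regression of c1*y on c2*x

print("slope scales by c1/c2:  ", np.isclose(b1_t, (c1 / c2) * b1))
print("intercept scales by c1: ", np.isclose(b0_t, c1 * b0))
print("R-squared is unchanged: ", np.isclose(r2_t, r2))
```

All three checks should hold for any data set and any nonzero `c2`, since the derivation in part a.) does not depend on the particular sample.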
7. Use EViews to access Wage2 from the textbook's data files. The variables of interest are monthly salary (wage) and IQ score (IQ).

a. Use EViews to generate the descriptive statistics table for wage and IQ. Attach the printout. What are average wage and average IQ in the sample? What is the sample standard deviation of IQ?

Date: 10/15/07 Time: 06:24
Sample: 1 460

                WAGE        IQ
Mean            1005.246    104.7196
Median          960.0000    106.0000
Maximum         3078.000    145.0000
Minimum         233.0000    59.00000
Std. Dev.       409.6735    13.82357
Skewness        1.316589    -0.295697
Kurtosis        6.029875    2.884268

Jarque-Bera     308.8473    6.960182
Probability     0.000000    0.030805

Observations    460         460

From the table, the average monthly wage is about $1,005.25, the average IQ is about 104.72, and the sample standard deviation of IQ is about 13.82.

b. Estimate a level-level model with wage as the dependent variable and IQ as the regressor. Use your estimated model to determine the predicted increase in wage for a 15 point increase in IQ. Does IQ account for most of the variation in wage? Explain.

Dependent Variable: WAGE
Method: Least Squares
Date: 10/15/07 Time: 06:27
Sample: 1 460
Included observations: 460

Variable     Coefficient   Std. Error   t-Statistic   Prob.
IQ           8.096739      1.332109     6.078137      0.0000
C            157.3586      140.7054     1.118355      0.2640

R-squared              0.074642    Mean dependent var       1005.246
Adjusted R-squared     0.072622    S.D. dependent var       409.6735
S.E. of regression     394.5175    Akaike info criterion    14.79754
Sum squared resid      71284974    Schwarz criterion        14.81550
Log likelihood         -3401.435   F-statistic              36.94375
Durbin-Watson stat     1.794438    Prob(F-statistic)        0.000000

The fitted model is:

$$\widehat{wage} = 157.36 + 8.097\,IQ, \qquad R^2 = .075$$

$$\Delta\widehat{wage} = \hat{\beta}_1\,\Delta IQ = 8.097 \times 15 = 121.45.$$

A 15 point increase in IQ raises predicted monthly earnings by $121.45. Since $R^2 = .075$, variation in IQ scores accounts for only about 7.5% of the variation in monthly earnings, which is not much.

c. GENR a new variable that gives the logarithm of wage. Call this new variable logwage. (Note: I want you to perform this step despite the fact that the data file already contains the logarithm of wage (lwage).) Attach the descriptive statistics table for logwage.
Date: 10/15/07 Time: 06:47
Sample: 1 460

                LOGWAGE     WAGE
Mean            6.835596    1005.246
Median          6.866931    960.0000
Maximum         8.032035    3078.000
Minimum         5.451038    233.0000
Std. Dev.       0.396488    409.6735
Skewness        -0.138713   1.316589
Kurtosis        3.399628    6.029875

Jarque-Bera     4.536122    308.8473
Probability     0.103513    0.000000

Observations    460         460

d. Estimate a log-level model, using logwage as the dependent variable and IQ as the regressor. Attach the printout and write down the fitted model. If IQ increases by 15 points, what is the approximate percentage increase in predicted wage? Explain.

Dependent Variable: LOGWAGE
Method: Least Squares
Date: 10/15/07 Time: 06:50
Sample: 1 460
Included observations: 460

Variable     Coefficient   Std. Error   t-Statistic   Prob.
IQ           0.007968      0.001287     6.188747      0.0000
C            6.001208      0.135991     44.12959      0.0000

R-squared              0.077172    Mean dependent var       6.835596
Adjusted R-squared     0.075157    S.D. dependent var       0.396488
S.E. of regression     0.381298    Akaike info criterion    0.913866
Sum squared resid      66.58771    Schwarz criterion        0.931828
Log likelihood         -208.1892   F-statistic              38.30059
Durbin-Watson stat     1.795664    Prob(F-statistic)        0.000000

The fitted model is:

$$\widehat{\log(wage)} = 6.001 + 0.00797\,IQ, \qquad R^2 = .077$$

In a log-level model, the approximate percent change in the dependent variable due to a 1-unit change in the regressor is $100\hat{\beta}_1$. Therefore, the regression model here implies that monthly earnings increase by about .797% for a 1 point increase in IQ. Accordingly, a 15 point increase in IQ is predicted to increase earnings by approximately .797% × 15 = 11.95% (i.e., $\%\Delta\widehat{wage} \approx 100\hat{\beta}_1\,\Delta IQ$).
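The log-level rule in part d.) is an approximation that works best for small changes. As a hedged illustration that goes slightly beyond the solution above, the sketch below compares it with the exact percentage change implied by the same fitted model, $100[\exp(\hat{\beta}_1 \Delta IQ) - 1]$. The function names are hypothetical, and the slope is the IQ coefficient copied from the EViews output.

```python
import math


def approx_pct_change(slope, delta_x):
    """Log-level approximation: %change in y is about 100 * slope * delta_x."""
    return 100.0 * slope * delta_x


def exact_pct_change(slope, delta_x):
    """Exact %change in predicted y implied by a log-level model."""
    return 100.0 * (math.exp(slope * delta_x) - 1.0)


slope_iq = 0.007968  # IQ coefficient from the log-level EViews output
delta_iq = 15        # the 15 point IQ increase asked about in part d.)

approx = approx_pct_change(slope_iq, delta_iq)
exact = exact_pct_change(slope_iq, delta_iq)

print(f"approximate change: {approx:.2f}%")
print(f"exact change:       {exact:.2f}%")
```

For this 15 point change the exact figure comes out somewhat larger than the approximation (roughly 12.7% versus 11.95%), which is why the problem asks only for the approximate percentage increase.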