Answers: Chapters 9, 11 and 12, Suggested Problems

9.1 (a) When GPA increases by one unit, holding the other variables constant, average starting salary increases by $1643. Students who take econometrics have a starting salary that is, on average, $5033 higher than that of students who did not. The intercept suggests a starting salary of $24,200 for someone with a zero GPA and no econometrics; however, this figure is likely to be unreliable, since no one has a zero GPA.

(b) A suitably modified equation is

SAL = β1 + β2 GPA + β3 METRICS + β4 SEX + e

(c) To see whether the value of econometrics is the same for men and women, we change the model to

SAL = β1 + β2 GPA + β3 METRICS + β4 SEX + β5 (METRICS x SEX) + e

(d) The estimated models, with standard errors in parentheses below the estimated coefficients, are

SAL = 24242 + 1658 GPA + 5024 METRICS - 205 SEX
      (1091)  (356)      (460)          (420)

SAL = 24222 + 1675 GPA + 4924 METRICS - 280 SEX + 275 (METRICS x SEX)
      (1104)  (365)      (582)          (500)     (966)

The estimated equation for part (b) suggests the starting salary for females is $205 lower than that for males. However, this estimated coefficient is smaller than its standard error, so the hypothesis that males and females have the same starting salary would not be rejected. The estimated equation for part (c) suggests:

Value of econometrics for men = 4924
Value of econometrics for women = 4924 + 275 = 5199

That is, econometrics appears more valuable for women than for men. However, the estimated coefficient on METRICS x SEX is not significantly different from zero, so the hypothesis that econometrics is equally valuable for men and women would not be rejected.

9.2 (a) Considering each of the coefficients in turn, we have the following interpretations.
Intercept: At the beginning of the period over which observations were taken, on a day that is not a Friday, Saturday or holiday, and that has neither a full moon nor a new moon, the average number of emergency room cases was 94.
T: The average number of emergency room cases has been increasing by 0.0338 per day.
HOLIDAY: The average number of cases goes up by 13.9 on holidays.
FRI and SAT: The average number of cases goes up by 6.9 on Fridays and 10.6 on Saturdays.
FULLMOON: The average number of cases goes up by 2.45 on days with a full moon. However, a null hypothesis that a full moon has no influence on the number of cases would not be rejected.
NEWMOON: The average number of cases goes up by 6.4 on days with a new moon. However, a null hypothesis that a new moon has no influence on the number of cases would not be rejected.

(b) Here are the results. (I did not post the data or code, but with this output you can still do the restricted F-test.)

The REG Procedure
Model: MODEL1
Dependent Variable: calls

Analysis of Variance
                           Sum of         Mean
Source            DF      Squares       Square    F Value    Pr > F
Model              6   5693.37691    948.89615       7.77    <.0001
Error            222        27109    122.11182
Corrected Total  228        32802

Root MSE           11.05042    R-Square    0.1736
Dependent Mean    100.56769    Adj R-Sq    0.1512
Coeff Var          10.98804

Parameter Estimates
                    Parameter    Standard
Variable     DF      Estimate       Error    t Value    Pr > |t|
Intercept     1      93.69583     1.55916      60.09      <.0001
t             1       0.03380     0.01105       3.06      0.0025
hol           1      13.86293     6.44517       2.15      0.0326
fri           1       6.90978     2.11132       3.27      0.0012
sat           1      10.58940     2.11843       5.00      <.0001
full          1       2.45445     3.98092       0.62      0.5382
new           1       6.40595     4.25689       1.50      0.1338

(c) The null and alternative hypotheses are

H0: β6 = β7 = 0
H1: β6 or β7 is nonzero.
The test statistic is

F = [(SSE_R - SSE_U)/2] / [SSE_U/(229 - 7)]

where SSE_R = 27424 is the sum of squared errors from the estimated equation with FULLMOON and NEWMOON omitted, and SSE_U = 27109 is the sum of squared errors from the equation with these variables included. The calculated value of the F statistic is 1.29, with corresponding p-value 0.277 (from the SAS output below). Alternatively, the F critical value at the 5% level of significance is approximately 3.07. Since 1.29 < 3.07, we do not reject the null hypothesis that new and full moons have no impact on the number of emergency room cases. Here is the restricted regression:

The REG Procedure
Model: MODEL2
Dependent Variable: calls

Analysis of Variance
                           Sum of         Mean
Source            DF      Squares       Square    F Value    Pr > F
Model              4   5378.00978   1344.50245      10.98    <.0001
Error            224        27424    122.42942
Corrected Total  228        32802

Root MSE           11.06478    R-Square    0.1640
Dependent Mean    100.56769    Adj R-Sq    0.1490
Coeff Var          11.00232

Parameter Estimates
                    Parameter    Standard
Variable     DF      Estimate       Error    t Value    Pr > |t|
Intercept     1      94.02146     1.54585      60.82      <.0001
t             1       0.03383     0.01107       3.06      0.0025
hol           1      13.61679     6.45107       2.11      0.0359
fri           1       6.84914     2.11367       3.24      0.0014
sat           1      10.34207     2.11533       4.89      <.0001

F = [(SSE_R - SSE_U)/2] / [SSE_U/(T - K)]
  = [(27424 - 27109)/2] / [27109/(229 - 7)]
  = 157.5/122.11
  = 1.29

Note: the following SAS code will automatically do the restricted F-test:

proc reg;
  model calls = t hol fri sat full new;
  test full=0, new=0;
run;

Here is the output; see the F statistic at the bottom.
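As a cross-check, the F statistic can be computed by hand from the two SSE values reported above; this is a quick Python sketch of the same arithmetic, not part of the original SAS session:

```python
# Restricted F-test computed by hand from the SSE values reported above.
sse_r = 27424.0      # restricted model: FULLMOON and NEWMOON omitted
sse_u = 27109.0      # unrestricted model
J = 2                # number of restrictions being tested
df_error = 229 - 7   # T - K for the unrestricted model

f_stat = ((sse_r - sse_u) / J) / (sse_u / df_error)
print(round(f_stat, 2))  # 1.29, matching the F value SAS reports for the test
```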
(The output first repeats the unrestricted regression shown above; the test result follows it.)

The REG Procedure
Model: MODEL1

Test 1 Results for Dependent Variable calls
                             Mean
Source            DF       Square    F Value    Pr > F
Numerator          2    157.68356       1.29    0.2770
Denominator      222    122.11182

11.1 Specification of the transformation for each variance function:

var(e_t) = σ²√x_t: Divide the model by x_t^(1/4); the transformed independent variables are 1/x^(1/4) and x/x^(1/4). Why? We divide by the standard deviation, which is the square root of the variance; here that is σ x_t^(1/4). (We can ignore σ in all of these cases, since it does not vary over observations.)

var(e_t) = σ²x_t: Divide the model by x_t^(1/2); the transformed independent variables are 1/x^(1/2) and x/x^(1/2). This is just like the one we did in class. The standard deviation is σ√x_t.

var(e_t) = σ²x_t²: Divide the model by x_t; the transformed independent variables are 1/x_t plus an intercept. Here the standard deviation is σx_t.

var(e_t) = σ²ln(x_t): Divide the model by (ln x_t)^(1/2); the transformed independent variables are 1/(ln x_t)^(1/2) and x/(ln x_t)^(1/2). Here the standard deviation is σ√(ln x_t).

Here is the main regression that we used in class to test for heteroskedasticity. I do not repeat the test here.
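All four cases above follow the same mechanical recipe: divide every column of the model, including the constant, by the square root of the assumed variance function. Here is a minimal numpy sketch of that step; the data and the choice h(x) = x are made up for illustration, and you would swap in √x, x², or ln(x) for the other cases:

```python
import numpy as np

# GLS-by-transformation sketch: if var(e_t) = sigma^2 * h(x_t), divide
# y, the intercept column, and x by sqrt(h(x_t)). The data are synthetic.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 20.0, size=40)
y = 40.0 + 0.13 * x + np.sqrt(x) * rng.normal(size=40)  # error variance grows with x

h = x                    # assumed variance function: var(e_t) = sigma^2 * x_t
w = 1.0 / np.sqrt(h)     # weight = 1 / standard deviation

ystar = w * y
Xstar = np.column_stack([w, w * x])  # transformed intercept and slope columns

b, *_ = np.linalg.lstsq(Xstar, ystar, rcond=None)
print(b)  # [intercept estimate, slope estimate] of the ORIGINAL model
```

Note that the transformed regression has no ordinary intercept; the column of weights w plays that role, which is why SAS is run with the no-intercept option in the transformed models below.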
(See slides 11.11 and 11.12.)

The REG Procedure
Model: food
Dependent Variable: y

Analysis of Variance
                           Sum of         Mean
Source            DF      Squares       Square    F Value    Pr > F
Model              1        25221        25221      17.65    0.0002
Error             38        54311   1429.24556
Corrected Total   39        79533

Root MSE           37.80536    R-Square    0.3171
Dependent Mean    130.31300    Adj R-Sq    0.2991
Coeff Var          29.01120

Parameter Estimates
                    Parameter    Standard
Variable     DF      Estimate       Error    t Value    Pr > |t|
Intercept     1      40.76756    22.13865       1.84      0.0734
x             1       0.12829     0.03054       4.20      0.0002

Below is the output for three different transformed regressions. I do not do part (b), since we did it in class; see slides 11.17 and 11.18. (Ignore testing these models for heteroskedasticity: these models have had the heteroskedasticity removed, albeit in three different ways.) In each case the coefficient estimate b2 measures the effect of x on y.

The REG Procedure
Model: model_A
Dependent Variable: ystar
NOTE: No intercept in model. R-Square is redefined.

Analysis of Variance
                             Sum of        Mean
Source              DF      Squares      Square    F Value    Pr > F
Model                2        26028       13014     258.20    <.0001
Error               38   1915.30947    50.40288
Uncorrected Total   40        27943

Root MSE            7.09950    R-Square    0.9315
Dependent Mean     25.30918    Adj R-Sq    0.9279
Coeff Var          28.05108

Parameter Estimates
                    Parameter    Standard
Variable     DF      Estimate       Error    t Value    Pr > |t|
x1star        1      36.75257    20.05232       1.83      0.0747
x2star        1       0.13391     0.02879       4.65      <.0001

******************************************************************************
The REG Procedure
Model: model_C
Dependent Variable: ystar

Analysis of Variance
                           Sum of         Mean
Source            DF      Squares       Square    F Value    Pr > F
Model              1      0.00562      0.00562       2.30    0.1377
Error             38      0.09285      0.00244
Corrected Total   39      0.09846

Root MSE            0.04943    R-Square    0.0571
Dependent Mean      0.19116    Adj R-Sq    0.0322
Coeff Var          25.85840

Parameter Estimates
                    Parameter    Standard
Variable     DF      Estimate       Error    t Value    Pr > |t|
Intercept     1       0.15769     0.02342       6.73      <.0001
x1star        1      21.28584    14.03797       1.52      0.1377
******************************************************************************
The REG Procedure
Model: model_D
Dependent Variable: ystar
NOTE: No intercept in model. R-Square is redefined.

Analysis of Variance
                             Sum of        Mean
Source              DF      Squares      Square    F Value    Pr > F
Model                2       106685       53343     249.90    <.0001
Error               38   8111.23735   213.45361
Uncorrected Total   40       114797

Root MSE           14.61005    R-Square    0.9293
Dependent Mean     50.89489    Adj R-Sq    0.9256
Coeff Var          28.70632

Parameter Estimates
                    Parameter    Standard
Variable     DF      Estimate       Error    t Value    Pr > |t|
x1star        1      39.55015    21.46901       1.84      0.0733
x2star        1       0.12996     0.02997       4.34      0.0001

11.2 SAS output appears below.

(a) Countries with high per capita income can decide whether to spend larger amounts on education than their poorer neighbours, or to spend more of their larger income on other things. They are likely to have more discretion with respect to where public monies are spent. On the other hand, countries with low per capita income may regard a particular level of education spending as essential, meaning that they have less scope for deviating from a mean function. These differences can be captured by a model with heteroskedasticity. Remember that heteroskedasticity is more common in cross-section data.

(b) The least squares estimated function is

ŷ_t = -0.1246 + 0.07317 x_t        R² = 0.862
      (0.0485)  (0.00518)

This function and the corresponding residuals appear in Figure 11.1. The absolute magnitude of the errors does tend to increase as x increases, suggesting the existence of heteroskedasticity.

[Figure 11.1: Estimated Function for Education Expenditure, ŷ = -0.1246 + 0.0732x]

(c) Since it is suspected that, if heteroskedasticity exists, the variance is related to x_t, we begin by ordering the observations according to the magnitude of x_t.
Then, splitting the sample into two equal subsets of 17 observations each, and applying least squares to each subset, we obtain σ̂1² = 0.0081608 and σ̂2² = 0.029127, leading to a Goldfeld-Quandt statistic of

GQ = 0.029127/0.008161 = 3.569

The critical value from an F-distribution with (15, 15) degrees of freedom at a 5% significance level is Fc = 2.40. Since 3.569 > 2.40, we reject the null hypothesis of homoskedasticity and conclude that the error variance is directly related to per capita income x_t.

(e) Generalized least squares estimation under the assumption var(e_t) = σ²x_t yields

ŷ_t = -0.0929 + 0.06932 x_t
      (0.0289)  (0.00441)

(Note: I have expressed these results in the model's original form, although it was estimated with no intercept and two independent variables: the reciprocal of the square root of x, and x over the square root of x.) The estimated response of per capita education expenditure to per capita income has declined slightly relative to the least squares estimate. The associated 95% confidence interval is (0.0603, 0.0783). This interval is narrower than both of those computed from least squares estimates. The comparison with the White-calculated interval suggests that generalized least squares is more efficient; a comparison with the conventional least squares interval is not really valid, because the standard errors used to compute that interval are not valid. See below for the case where var(e_t) = σ²x_t². The differences in how this is carried out and how to interpret the results are important.
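The Goldfeld-Quandt arithmetic above is simple enough to verify directly; this is just the ratio of the two subsample error-variance estimates, reproduced in Python:

```python
# Goldfeld-Quandt statistic from the two subsample variance estimates above.
sigma2_low = 0.0081608    # subsample with the smaller x values (15 df)
sigma2_high = 0.029127    # subsample with the larger x values (15 df)

gq = sigma2_high / sigma2_low
print(round(gq, 3))  # 3.569, which exceeds the F(15,15) 5% critical value 2.40
```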
Part B: Least squares results

The REG Procedure
Model: MODEL1
Dependent Variable: y

Analysis of Variance
                           Sum of         Mean
Source            DF      Squares       Square    F Value    Pr > F
Model              1      3.68386      3.68386     199.59    <.0001
Error             32      0.59063      0.01846
Corrected Total   33      4.27449

Root MSE            0.13586    R-Square    0.8618
Dependent Mean      0.47674    Adj R-Sq    0.8575
Coeff Var          28.49753

Parameter Estimates
                    Parameter    Standard
Variable     DF      Estimate       Error    t Value    Pr > |t|
Intercept     1      -0.12457     0.04852      -2.57      0.0151
x             1       0.07317     0.00518      14.13      <.0001

******************************************************************************
This is part (d). The White standard error for b2 is the square root of 0.0000363146, which is 0.006. This is larger than the 0.00518 value reported by least squares.

The REG Procedure
Model: MODEL1
Dependent Variable: y

Consistent Covariance of Estimates
Variable          Intercept               x
Intercept      0.0015372135    -0.000211654
x             -0.000211654     0.0000363146

******************************************************************************
This regression gets you the numerator for the GQ statistic:

The REG Procedure
Model: MODEL1
Dependent Variable: y

Analysis of Variance
                           Sum of         Mean
Source            DF      Squares       Square    F Value    Pr > F
Model              1      0.42220      0.42220      14.50    0.0017
Error             15      0.43690      0.02913
Corrected Total   16      0.85910

Root MSE            0.17067    R-Square    0.4914
Dependent Mean      0.78115    Adj R-Sq    0.4575
Coeff Var          21.84803

Parameter Estimates
                    Parameter    Standard
Variable     DF      Estimate       Error    t Value    Pr > |t|
Intercept     1      -0.14087     0.24569      -0.57      0.5749
x             1       0.07516     0.01974       3.81      0.0017

This regression gets you the denominator for the GQ statistic:

The REG Procedure
Model: MODEL1
Dependent Variable: y

Analysis of Variance
                           Sum of         Mean
Source            DF      Squares       Square    F Value    Pr > F
Model              1      0.14225      0.14225      17.43    0.0008
Error             15      0.12241      0.00816
Corrected Total   16      0.26466

Root MSE            0.09034    R-Square    0.5375
Dependent Mean      0.17232    Adj R-Sq    0.5066
Coeff Var          52.42382

Parameter Estimates
                    Parameter    Standard
Variable     DF      Estimate       Error    t Value    Pr > |t|
Intercept     1      -0.03807     0.05495      -0.69      0.4990
x             1       0.05047     0.01209       4.17      0.0008

******************************************************************************
These are the critical values:

Obs    fc         tc
1      2.40345    2.03693

This regression corrects for heteroskedasticity of the form var(e_t) = σ²x_t:

The REG Procedure
Model: MODEL1
Dependent Variable: ystar
NOTE: No intercept in model. R-Square is redefined.

Analysis of Variance
                             Sum of        Mean
Source              DF      Squares      Square    F Value    Pr > F
Model                2      0.96083     0.48041     242.45    <.0001
Error               32      0.06341     0.00198
Uncorrected Total   34      1.02423

Root MSE            0.04451    R-Square    0.9381
Dependent Mean      0.15116    Adj R-Sq    0.9342
Coeff Var          29.44875

Parameter Estimates
                    Parameter    Standard
Variable     DF      Estimate       Error    t Value    Pr > |t|
x1star        1      -0.09292     0.02890      -3.21      0.0030
x2star        1       0.06932     0.00441      15.71      <.0001

We predict that if GDP per capita increases by $1.00, public expenditure on education per capita will increase by $0.069.

******************************************************************************
This regression corrects for heteroskedasticity of the form var(e_t) = σ²x_t²:

The REG Procedure
Model: MODEL1
Dependent Variable: ystar

Analysis of Variance
                           Sum of         Mean
Source            DF      Squares       Square    F Value    Pr > F
Model              1      0.00349      0.00349      12.69    0.0012
Error             32      0.00880   0.00027504
Corrected Total   33      0.01229

Root MSE            0.01658    R-Square    0.2840
Dependent Mean      0.05153    Adj R-Sq    0.2616
Coeff Var          32.18259

Parameter Estimates
                    Parameter    Standard
Variable     DF      Estimate       Error    t Value    Pr > |t|
Intercept     1       0.06443     0.00460      13.99      <.0001
xstar         1      -0.06739     0.01892      -3.56      0.0012

We predict that if GDP per capita increases by $1.00, public expenditure on education per capita will increase by $0.064, because the intercept in this transformed model is actually the slope coefficient of the original model.

11.10 (a) The plots of the residuals against income and age show that the absolute values of the residuals increase as income increases, but appear constant as age increases.
This indicates that the error variance depends on income.

(b) Since the residual plot shows that the error variance may increase when income increases, and this is a reasonable outcome since greater income implies greater flexibility in travel, we set up the null and alternative hypotheses as H0: σ1² = σ2² against H1: σ1² > σ2². The test statistic is

GQ = σ̂1²/σ̂2² = [(2.9471 x 10⁷)/(100 - 4)] / [(1.0479 x 10⁷)/(100 - 4)] = 2.8124

The 5% critical value for (96, 96) degrees of freedom is Fc = 1.35. Thus, we reject H0 and conclude that the error variance depends on income.

12.1 (a) The least squares estimated equation is

Î_t = 6.22 + 0.770 Y_t - 0.184 R_t        R² = 0.816
     (2.51)  (0.072)    (0.126)

Both b2 and b3 have the expected signs: income is expected to have a positive effect on investment, whereas an increase in the interest rate should reduce investment. The standard errors for b1 and b2 are relatively small, suggesting that the corresponding coefficients are significantly different from zero. However, the standard error of b3 is large, yielding a t-ratio that is less than two. Based on this standard error, we can question whether we should include R_t in the equation, although economic theory suggests R_t should have a strong influence on I_t.

(b) The plot of the least squares residuals in Figure 12.1 reveals a few long runs of negative and positive residuals, suggesting the existence of autocorrelation.

[Figure 12.1: Residuals for Investment Equation]

(c) In this context, the Durbin-Watson test is a test of H0: ρ = 0 against H1: ρ > 0 in the first-order autoregressive model e_t = ρe_{t-1} + v_t. The computed value of the Durbin-Watson statistic is d = 0.852 (calculated by SAS). With T = 30 and K = 3, we have dL = 1.284 and dU = 1.567. Because the d statistic is less than dL, we reject H0.
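For reference, the statistic SAS computes is d = Σ(ê_t - ê_{t-1})² / Σê_t², and it is linked to the AR(1) parameter through the approximation d ≈ 2(1 - ρ̂) used in problem 12.8 below. A small numpy sketch, with made-up residuals purely for illustration:

```python
import numpy as np

# Durbin-Watson statistic and its AR(1) approximation d ~ 2(1 - rho_hat).
# The residual series here is hypothetical; the point is the formulas.
e = np.array([1.2, 0.8, 0.9, 0.3, -0.2, -0.7, -0.5, 0.1, 0.6, 0.4])

d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
rho_hat = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)
print(round(d, 3), round(2 * (1 - rho_hat), 3))  # both well below 2: positive autocorrelation

# Check of problem 12.8(a), which uses the approximation directly:
rho_128a = 55453 / 63316
print(round(2 * (1 - rho_128a), 4))  # 0.2484
```

The approximation is rough in a series this short, but with real sample sizes the two values are close, which is why d* = 2(1 - ρ̂) is an acceptable substitute when only ρ̂ is reported.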
(d) The results from estimating the model in SAS and correcting for AR(1) errors are:

Î_t = 8.41 + 0.742 Y_t - 0.285 R_t        ρ̂ = 0.5616
     (2.90)  (0.115)    (0.081)

Comparing these results with those from part (a), we find that there has been little change in the coefficient estimates, but a considerable change in the standard errors. The standard error on the coefficient of Y_t has increased, suggesting that, if we did not correct for autocorrelation, our confidence interval for β2 would be too narrow, giving us a false sense of the reliability of our estimate. The opposite has occurred for β3: here the standard error has dropped after correcting for autocorrelation. From the results in part (a) we might be misled into thinking that the interest rate has no impact on investment (the estimated coefficient is not significant). After correcting for autocorrelation, we have a relatively narrow confidence interval that does not include zero.

(e) Given next year's values Y_{T+1} = 36 and R_{T+1} = 14, the appropriate forecast is

Î_{T+1} = β̂1 + β̂2 Y_{T+1} + β̂3 R_{T+1} + ρ̂ ẽ_T
        = 8.4093 + 0.7422(36) - 0.2849(14) + 0.5616(2.1462)
        = 32.346

If autocorrelation is ignored, our prediction is

Î_{T+1} = b1 + b2 Y_{T+1} + b3 R_{T+1}
        = 6.22 + 0.77(36) - 0.184(14)
        = 31.363

There is not a large difference between the two predictions.

12.3 (a) The least squares estimated equation is

ln(JV_t) = 3.5027 - 1.6116 ln(U_t)        R² = 0.8299
          (0.2829)  (0.1555)

Using the value tc = 2.074, a 95% confidence interval for β2 is b2 ± tc se(b2) = (-1.9342, -1.2890).

(b) The value of the Durbin-Watson statistic is d = 1.09. The lower limit is dL = 1.273 and the upper limit is dU = 1.446. Since d < dL, we reject H0 and conclude that positive autocorrelation exists. The existence of autocorrelation means the original assumption about e_t, that the e_t are independent, is not correct. This problem also makes the confidence interval for β2 in part (a) incorrect, giving us a false sense of the reliability of the coefficient estimate.
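The forecast arithmetic in 12.1(e) above is easy to verify; this Python snippet just replays the two calculations with the coefficients reported there:

```python
# One-step forecast from problem 12.1(e): AR(1)-corrected vs. plain least squares.
b1, b2, b3 = 8.4093, 0.7422, -0.2849   # AR(1)-corrected estimates
rho, e_T = 0.5616, 2.1462              # estimated rho and the final residual
Y_next, R_next = 36, 14

corrected = b1 + b2 * Y_next + b3 * R_next + rho * e_T
naive = 6.22 + 0.77 * Y_next - 0.184 * R_next   # least squares coefficients

print(round(corrected, 2), round(naive, 2))  # roughly 32.35 and 31.36
```

The only difference between the two is the ρ̂ ẽ_T term, which carries forward the information in the last observed residual.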
(c) After correcting for autocorrelation, the estimated equation is

ln(JV_t) = 3.5138 - 1.616 ln(U_t)        ρ̂ = 0.4318 (SAS)
          (0.2437)  (0.127)

The 95% confidence interval for β2 from SAS is (-1.879, -1.353). This confidence interval is slightly narrower than that given in part (a). A direct comparison with the interval in part (a) is difficult, because the least squares standard errors are incorrect in the presence of AR(1) errors. However, given that the change in standard errors is not great, and that we know least squares is less efficient, one could conjecture that the least squares confidence interval is narrower than it should be, implying unjustified reliability.

12.8 (a) The estimate of the AR(1) parameter is

ρ̂ = Σ(ê_t ê_{t-1}) / Σ ê²_{t-1} = 55453/63316 = 0.8758

The approximate Durbin-Watson statistic is d* = 2(1 - ρ̂) = 0.2484. Based on T = 90 and K = 5, dL = 1.566 and dU = 1.751. Since d* is less than dL, we conclude that positive autocorrelation is present.

(b) The estimate of the AR(1) parameter is

ρ̂ = Σ(ê_t ê_{t-1}) / Σ ê²_{t-1} = 621/12292 = 0.0505

The approximate Durbin-Watson statistic is d* = 2(1 - ρ̂) = 1.8990. Based on T = 90 and K = 6, dL = 1.542 and dU = 1.776. Since d* > dU, we cannot reject the null hypothesis of no positive autocorrelation.

12.9 (a) From the residual plots, the residuals tend to exhibit runs of positive and negative values, suggesting autocorrelated errors. The Durbin-Watson statistic is 1.124. With T = 26 and K = 2, we obtain dL = 1.302 and dU = 1.461. Since the value of the Durbin-Watson statistic is less than dL, we conclude that there is evidence of positive autocorrelation.

(b) The estimates, their standard errors and the confidence intervals obtained from least squares and generalised least squares (GLS) are presented in the table below. For the least squares method, T = 26, K = 2 and the critical t value is t0.025 = 2.064. For GLS, T = 25 and K = 2, so t0.025 = 2.069. The estimates obtained from least squares and GLS are very similar.
However, the standard errors from GLS are much higher than those from least squares, resulting in GLS confidence intervals that are much wider than those obtained from least squares. Hence, ignoring autocorrelation means the estimates are less reliable than they appear. (This information was presented in the book, page 281.)

          Least squares                          GLS
     Estimate (se)    Confidence interval   Estimate (se)    Confidence interval
β1   -387.97          (-620.49, -155.45)    -343.85          (-741.45, 53.75)
     (112.66)                               (192.17)
β2   24.7646          (22.759, 26.770)      24.3882          (21.131, 27.645)
     (0.9715)                               (1.5741)

(c) Because of the evidence of autocorrelation, the forecasts are based on the GLS results:

DISP̂_86 = β̂1 + β̂2 DUR_86 + ρ̂ ê_85
         = -343.85 + 24.3882(190) + 0.4186(-277.2) = 4173.87
DISP̂_87 = β̂1 + β̂2 DUR_87 + ρ̂² ê_85
         = -343.85 + 24.3882(195) + 0.4186²(-277.2) = 4363.28
DISP̂_88 = β̂1 + β̂2 DUR_88 + ρ̂³ ê_85
         = -343.85 + 24.3882(192) + 0.4186³(-277.2) = 4318.35
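The three forecasts above follow one pattern: the last in-sample residual is discounted by an extra factor of ρ̂ for each additional period ahead. A quick Python check, with the coefficients and the 1985 residual taken from the calculations above:

```python
# Multi-step GLS forecasts from problem 12.9(c): the AR(1) component
# decays geometrically, rho**h * e_85, as the forecast horizon h grows.
b1, b2 = -343.85, 24.3882
rho, e_85 = 0.4186, -277.2
dur = [(1986, 190), (1987, 195), (1988, 192)]

for h, (year, d) in enumerate(dur, start=1):
    disp = b1 + b2 * d + rho**h * e_85
    print(year, round(disp, 2))
# prints 4173.87, 4363.28 and 4318.35, matching the hand calculations
```

As h grows, ρ̂^h ê_85 shrinks toward zero, so distant forecasts revert to the regression line alone.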