Answers Chapter 9, 11 and 12 Suggested problems
9.1
(a) When a GPA is increased by one unit, and other variables are held constant, average starting
salary will increase by the amount $1643. Students who take econometrics will have a starting
salary which is $5033 higher, on average, than the starting salary of those who did not take
econometrics. The intercept suggests the starting salary for someone with zero GPA and no
econometrics is $24,200. However, this figure is likely to be unreliable since there would be no
one with a zero GPA.
(b) A suitably modified equation is
$SAL = \beta_1 + \beta_2\,GPA + \beta_3\,METRICS + \beta_4\,SEX + e$
(c) To see if the value of econometrics is the same for men and women, we change the model to
$SAL = \beta_1 + \beta_2\,GPA + \beta_3\,METRICS + \beta_4\,SEX + \beta_5\,(METRICS \times SEX) + e$
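A minimal SAS sketch of how the models in parts (b) and (c) could be estimated (the dataset name salary and the lower-case variable names are assumptions, not from the text):

data salary2;
  set salary;
  metsex = metrics*sex;   /* METRICS x SEX interaction for part (c) */
run;
proc reg data=salary2;
  model sal = gpa metrics sex;          /* part (b) */
  model sal = gpa metrics sex metsex;   /* part (c) */
run;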
(d) The estimated models, with standard errors in parentheses below the estimated coefficients, are

$\widehat{SAL} = 24242 + 1658\,GPA + 5024\,METRICS - 205\,SEX$
(se)             (1091)    (356)        (460)         (420)

$\widehat{SAL} = 24222 + 1675\,GPA + 4924\,METRICS - 280\,SEX + 275\,(METRICS \times SEX)$
(se)             (1104)    (365)        (582)         (500)        (966)
The estimated equation for part (b) suggests the starting salary for females is $205 lower than
that for males. However, this estimated coefficient is less than its standard error. The
hypothesis that males and females have the same starting salary would not be rejected.
The estimated equation for part (c) suggests that:
Value of econometrics for men = 4924
Value of econometrics for women = 4924 + 275 = 5199
That is, econometrics is more valuable for women than men. However, the estimated
coefficient for METRICS $\times$ SEX is not significantly different from zero. The hypothesis that
econometrics is equally valuable for men and women would not be rejected.
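For instance, the relevant t ratios can be computed directly from the estimates and standard errors above:

$$t_{SEX} = \frac{-205}{420} \approx -0.49, \qquad t_{METRICS \times SEX} = \frac{275}{966} \approx 0.28$$

Both are far below conventional critical values (roughly 2 in absolute value at the 5% level), so neither null hypothesis is rejected.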
9.2
(a) Considering each of the coefficients in turn, we have the following interpretations.
Intercept: At the beginning of the time period over which observations were taken, on a day
which is not a Friday, Saturday or holiday, and which has neither a full moon nor a new
moon, the average number of emergency room cases was 94.
T: The average number of emergency room cases has been increasing by 0.0338 per day.
HOLIDAY: The average number of emergency room cases goes up by 13.9 on holidays.
FRI and SAT: The average number of emergency room cases goes up by 6.9 and 10.6 on
Fridays and Saturdays, respectively.
FULLMOON: The average number of emergency room cases goes up by 2.45 on days when
there is a full moon. However, a null hypothesis stating that a full moon has no influence on the
number of emergency room cases would not be rejected.
NEWMOON: The average number of emergency room cases goes up by 6.4 on days when there
is a new moon. However, a null hypothesis stating that a new moon has no influence on the
number of emergency room cases would not be rejected.
(b) Here are the results. I realize I did not post the data or code, but with this output you can still do
the restricted F-test.
******************************************************************************
The REG Procedure
Model: MODEL1
Dependent Variable: calls

Analysis of Variance
Source             DF   Sum of Squares    Mean Square   F Value   Pr > F
Model               6       5693.37691      948.89615      7.77   <.0001
Error             222            27109      122.11182
Corrected Total   228            32802

Root MSE           11.05042    R-Square   0.1736
Dependent Mean    100.56769    Adj R-Sq   0.1512
Coeff Var          10.98804

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             93.69583          1.55916     60.09     <.0001
t            1              0.03380          0.01105      3.06     0.0025
hol          1             13.86293          6.44517      2.15     0.0326
fri          1              6.90978          2.11132      3.27     0.0012
sat          1             10.58940          2.11843      5.00     <.0001
full         1              2.45445          3.98092      0.62     0.5382
new          1              6.40595          4.25689      1.50     0.1338
(c) The null and alternative hypotheses are

$$H_0: \beta_6 = \beta_7 = 0 \qquad H_1: \beta_6 \text{ or } \beta_7 \text{ is nonzero}$$

The test statistic is

$$F = \frac{(SSE_R - SSE_U)/2}{SSE_U/(229-7)}$$
where $SSE_R = 27424$ is the sum of squared errors from the estimated equation with
FULLMOON and NEWMOON omitted and $SSE_U = 27109$ is the sum of squared errors from the
estimated equation with these variables included. The calculated value of the F statistic is 1.29
with corresponding p-value of 0.277. This p-value came from SAS output (see below).
Alternatively, you can compare against the F critical value of approximately 3.07 at the 5% level
of significance. Thus, we do not reject the null hypothesis that new and full moons have no
impact on the number of emergency room cases.
Here is the restricted regression:
The REG Procedure
Model: MODEL2
Dependent Variable: calls

Analysis of Variance
Source             DF   Sum of Squares    Mean Square   F Value   Pr > F
Model               4       5378.00978     1344.50245     10.98   <.0001
Error             224            27424      122.42942
Corrected Total   228            32802

Root MSE           11.06478    R-Square   0.1640
Dependent Mean    100.56769    Adj R-Sq   0.1490
Coeff Var          11.00232

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             94.02146          1.54585     60.82     <.0001
t            1              0.03383          0.01107      3.06     0.0025
hol          1             13.61679          6.45107      2.11     0.0359
fri          1              6.84914          2.11367      3.24     0.0014
sat          1             10.34207          2.11533      4.89     <.0001
$$F = \frac{(SSE_R - SSE_U)/2}{SSE_U/(T-K)} = \frac{(27424 - 27109)/2}{27109/(229-7)} = \frac{157.5}{122.11} = 1.29$$
Note: the following SAS code will automatically perform the restricted F-test.
proc reg;
  model calls = t hol fri sat full new;   /* unrestricted model */
  test full=0, new=0;                     /* joint test that both moon coefficients are zero */
run;
Here is the output; see the F statistic at the bottom.
(The regression output repeated here is identical to the MODEL1 output shown above.)
******************************************************************************
The REG Procedure
Model: MODEL1

Test 1 Results for Dependent Variable calls
Source          DF    Mean Square   F Value   Pr > F
Numerator        2      157.68356      1.29   0.2770
Denominator    222      122.11182
11.1
Specification for $\mathrm{var}(e_t)$, transformation, and reasoning:

$\mathrm{var}(e_t) = \sigma^2\sqrt{x_t}$: Divide the model by $x_t^{1/4}$; independent variables: $1/x_t^{1/4}$ and $x_t/x_t^{1/4}$. Why? We divide by the standard deviation, which is the square root of the variance: $\sigma\,x_t^{1/4}$. We can ignore the $\sigma$ term in all of the models since it doesn't vary over observations.

$\mathrm{var}(e_t) = \sigma^2 x_t$: Divide the model by $x_t^{1/2}$; independent variables: $1/x_t^{1/2}$ and $x_t/x_t^{1/2}$. This is just like the one we did in class. The standard deviation is $\sigma\sqrt{x_t}$.

$\mathrm{var}(e_t) = \sigma^2 x_t^2$: Divide the model by $x_t$; independent variables: $1/x_t$ plus an intercept. Here, the standard deviation is $\sigma x_t$, so we divide by $x_t$.

$\mathrm{var}(e_t) = \sigma^2 \ln(x_t)$: Divide the model by $(\ln x_t)^{1/2}$; independent variables: $1/(\ln x_t)^{1/2}$ and $x_t/(\ln x_t)^{1/2}$. Here the standard deviation is $\sigma\sqrt{\ln(x_t)}$.
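As a concrete sketch, the transformed variables for the $\mathrm{var}(e_t) = \sigma^2 x_t$ case can be built in a SAS data step before running PROC REG (the dataset name food and variable names y, x are assumptions):

data gls;
  set food;
  ystar  = y / sqrt(x);   /* divide through by the standard-deviation factor x^(1/2) */
  x1star = 1 / sqrt(x);   /* transformed intercept term */
  x2star = x / sqrt(x);   /* transformed slope term */
run;
proc reg data=gls;
  model ystar = x1star x2star / noint;   /* no intercept: x1star plays that role */
run;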
Here is the main regression that we used in class to test for heteroskedasticity. I do not
repeat the test here (see slides 11.11 and 11.12).
The REG Procedure
Model: food
Dependent Variable: y

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1            25221         25221     17.65   0.0002
Error              38            54311    1429.24556
Corrected Total    39            79533

Root MSE           37.80536    R-Square   0.3171
Dependent Mean    130.31300    Adj R-Sq   0.2991
Coeff Var          29.01120

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             40.76756         22.13865      1.84     0.0734
x            1              0.12829          0.03054      4.20     0.0002
Below is the output for 3 different transformed regressions. I do not do part (b) since we did
this in class; see slides 11.17 and 11.18.
(Ignore testing these models for heteroskedasticity. These models have had the
heteroskedasticity removed, albeit in 3 different ways.)
Below I highlight the coefficient estimate b2 that measures the effect of X on Y:
The REG Procedure
Model: model_A
Dependent Variable: ystar
NOTE: No intercept in model. R-Square is redefined.

Analysis of Variance
Source               DF   Sum of Squares   Mean Square   F Value   Pr > F
Model                 2            26028         13014    258.20   <.0001
Error                38       1915.30947      50.40288
Uncorrected Total    40            27943

Root MSE           7.09950    R-Square   0.9315
Dependent Mean    25.30918    Adj R-Sq   0.9279
Coeff Var         28.05108

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
x1star       1             36.75257         20.05232      1.83     0.0747
x2star       1              0.13391          0.02879      4.65     <.0001
******************************************************************************
The REG Procedure
Model: model_C
Dependent Variable: ystar

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1          0.00562       0.00562      2.30   0.1377
Error              38          0.09285       0.00244
Corrected Total    39          0.09846

Root MSE           0.04943    R-Square   0.0571
Dependent Mean     0.19116    Adj R-Sq   0.0322
Coeff Var         25.85840

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              0.15769          0.02342      6.73     <.0001
x1star       1             21.28584         14.03797      1.52     0.1377
******************************************************************************
The REG Procedure
Model: model_D
Dependent Variable: ystar
NOTE: No intercept in model. R-Square is redefined.

Analysis of Variance
Source               DF   Sum of Squares   Mean Square   F Value   Pr > F
Model                 2           106685         53343    249.90   <.0001
Error                38       8111.23735     213.45361
Uncorrected Total    40           114797

Root MSE          14.61005    R-Square   0.9293
Dependent Mean    50.89489    Adj R-Sq   0.9256
Coeff Var         28.70632

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
x1star       1             39.55015         21.46901      1.84     0.0733
x2star       1              0.12996          0.02997      4.34     0.0001
11.2
SAS output appears below.
(a)
Countries with high per capita income can decide whether to spend larger amounts on education than
their poorer neighbours, or to spend more of their larger income on other things. They are likely
to have more discretion with respect to where public monies are spent. On the other hand,
countries with low per capita income may regard a particular level of education spending as
essential, meaning that they have less scope for deviating from a mean function. These
differences can be captured by a model with heteroskedasticity. Remember that
heteroskedasticity is more common in cross-section data.
(b) The least squares estimated function is

$\hat{y}_t = -0.1246 + 0.07317\,x_t$    $R^2 = 0.862$
(se)        (0.0485)  (0.00518)

This function and the corresponding residuals appear in Figure 11.1. The absolute magnitude of
the errors does tend to increase as x increases, suggesting the existence of heteroskedasticity.
[Figure: scatter of Yt against Xt (0 to 20) with the fitted line y = -0.1246 + 0.0732x]
Figure 11.1 Estimated Function for Education Expenditure
(c) Since it is suspected that, if heteroskedasticity exists, the variance is related to $x_t$, we begin by
ordering the observations according to the magnitude of $x_t$. Then, splitting the sample into two
equal subsets of 17 observations each and applying least squares to each subset, we obtain
$\hat\sigma_1^2 = 0.0081608$ and $\hat\sigma_2^2 = 0.029127$, leading to a Goldfeld-Quandt statistic of

$$GQ = \frac{\hat\sigma_2^2}{\hat\sigma_1^2} = \frac{0.029127}{0.0081608} = 3.569$$

The critical value from an F-distribution with (15, 15) degrees of freedom and a 5% significance
level is $F_c = 2.40$. Since 3.569 > 2.40, we reject the null hypothesis of homoskedasticity and
conclude that the error variance is directly related to per capita income $x_t$.
(e) Generalized least squares estimation under the assumption $\mathrm{var}(e_t) = \sigma^2 x_t$ yields

$\hat{y}_t = -0.0929 + 0.06932\,x_t$
(se)        (0.0289)   (0.00441)

(Note: I have expressed these results in the model's original form, although it was estimated with
no intercept and two independent variables: the reciprocal of the square root of x, and x over the
square root of x.) The estimated response of per capita education expenditure to per capita
income has declined slightly relative to the least squares estimate. The associated 95%
confidence interval is (0.0603, 0.0783). This interval is narrower than both of those computed from
the least squares estimates. The comparison with the White-calculated interval suggests that
generalized least squares is more efficient; a comparison with the conventional least squares
interval is not really valid because the standard errors used to compute that interval are not
valid. See below for the case where $\mathrm{var}(e_t) = \sigma^2 x_t^2$; the differences in how this is carried out
and how the results are interpreted are important.
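Explicitly, the interval uses the $t_c$ value reported with the SAS output below ($t_c = 2.03693$ with 32 degrees of freedom):

$$0.06932 \pm 2.03693 \times 0.00441 = (0.0603,\ 0.0783)$$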
Part B
Least squares results
The REG Procedure
Model: MODEL1
Dependent Variable: y

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1          3.68386       3.68386    199.59   <.0001
Error              32          0.59063       0.01846
Corrected Total    33          4.27449

Root MSE           0.13586    R-Square   0.8618
Dependent Mean     0.47674    Adj R-Sq   0.8575
Coeff Var         28.49753

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             -0.12457          0.04852     -2.57     0.0151
x            1              0.07317          0.00518     14.13     <.0001
******************************************************************************
The REG Procedure
Model: MODEL1
Dependent Variable: y

This is part (d). The White standard error for b2 is the square root of 0.0000363146, i.e.
approximately 0.006. This is larger than the 0.00518 value reported by least squares.

Consistent Covariance of Estimates
Variable           Intercept               x
Intercept       0.0015372135    -0.000211654
x               -0.000211654    0.0000363146
******************************************************************************
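The two subset regressions below can be produced by sorting on x and splitting the sample in half; a rough sketch, assuming the dataset is named educ with variables y and x, and assuming the high-x half is the one with the larger variance (its MSE is the GQ numerator):

proc sort data=educ;
  by x;
run;
data low high;
  set educ;
  if _n_ <= 17 then output low;   /* 17 smallest-x observations */
  else output high;               /* 17 largest-x observations  */
run;
proc reg data=high;   /* MSE = 0.02913, the GQ numerator   */
  model y = x;
run;
proc reg data=low;    /* MSE = 0.00816, the GQ denominator */
  model y = x;
run;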
This regression gets you the numerator for the GQ-statistic
The REG Procedure
Model: MODEL1
Dependent Variable: y

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1          0.42220       0.42220     14.50   0.0017
Error              15          0.43690       0.02913
Corrected Total    16          0.85910

Root MSE           0.17067    R-Square   0.4914
Dependent Mean     0.78115    Adj R-Sq   0.4575
Coeff Var         21.84803

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             -0.14087          0.24569     -0.57     0.5749
x            1              0.07516          0.01974      3.81     0.0017
This regression gets you the denominator for the GQ-statistic
The REG Procedure
Model: MODEL1
Dependent Variable: y

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1          0.14225       0.14225     17.43   0.0008
Error              15          0.12241       0.00816
Corrected Total    16          0.26466

Root MSE           0.09034    R-Square   0.5375
Dependent Mean     0.17232    Adj R-Sq   0.5066
Coeff Var         52.42382

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             -0.03807          0.05495     -0.69     0.4990
x            1              0.05047          0.01209      4.17     0.0008
******************************************************************************
These are the critical values:

The SAS System
Obs        fc         tc
  1    2.40345    2.03693
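For reference, critical values like these can be computed directly with SAS's quantile functions; the degrees of freedom, (15, 15) for F and 32 for t, are taken from the problem:

data crit;
  fc = finv(0.95, 15, 15);   /* 5% upper-tail critical value of F(15,15) */
  tc = tinv(0.975, 32);      /* two-sided 5% critical value of t(32)     */
run;
proc print data=crit;
run;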
This regression corrects for heteroskedasticity of the form $\mathrm{var}(e_t) = \sigma^2 x_t$:

The REG Procedure
Model: MODEL1
Dependent Variable: ystar
NOTE: No intercept in model. R-Square is redefined.

Analysis of Variance
Source               DF   Sum of Squares   Mean Square   F Value   Pr > F
Model                 2          0.96083       0.48041    242.45   <.0001
Error                32          0.06341       0.00198
Uncorrected Total    34          1.02423

Root MSE           0.04451    R-Square   0.9381
Dependent Mean     0.15116    Adj R-Sq   0.9342
Coeff Var         29.44875

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
x1star       1             -0.09292          0.02890     -3.21     0.0030
x2star       1              0.06932          0.00441     15.71     <.0001
We predict that if GDP per capita increases by $1.00, public expenditure on education per capita
will increase by $0.069.
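To see why the coefficient on x2star still estimates the original slope, divide $y_t = \beta_1 + \beta_2 x_t + e_t$ through by $\sqrt{x_t}$:

$$\frac{y_t}{\sqrt{x_t}} = \beta_1 \frac{1}{\sqrt{x_t}} + \beta_2 \frac{x_t}{\sqrt{x_t}} + \frac{e_t}{\sqrt{x_t}}, \qquad \mathrm{var}\!\left(\frac{e_t}{\sqrt{x_t}}\right) = \frac{\sigma^2 x_t}{x_t} = \sigma^2$$

so the regression of ystar on x1star and x2star (with no intercept) has a homoskedastic error, and the coefficient on x2star estimates $\beta_2$.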
******************************************************************************
This regression corrects for heteroskedasticity of the form $\mathrm{var}(e_t) = \sigma^2 x_t^2$:

The REG Procedure
Model: MODEL1
Dependent Variable: ystar

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1          0.00349       0.00349     12.69   0.0012
Error              32          0.00880    0.00027504
Corrected Total    33          0.01229

Root MSE           0.01658    R-Square   0.2840
Dependent Mean     0.05153    Adj R-Sq   0.2616
Coeff Var         32.18259

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              0.06443          0.00460     13.99     <.0001
xstar        1             -0.06739          0.01892     -3.56     0.0012
We predict that if GDP per capita increases by $1.00, public expenditure on education per capita
will increase by $0.064, because the intercept in this transformed model is actually the slope
coefficient of the original model.
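The last point can be seen by dividing the original model by $x_t$:

$$\frac{y_t}{x_t} = \beta_1 \frac{1}{x_t} + \beta_2 + \frac{e_t}{x_t}, \qquad \mathrm{var}\!\left(\frac{e_t}{x_t}\right) = \frac{\sigma^2 x_t^2}{x_t^2} = \sigma^2$$

so the intercept of the transformed regression (0.06443) estimates the original slope $\beta_2$, while the coefficient on xstar $= 1/x_t$ ($-0.06739$) estimates the original intercept $\beta_1$.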
11.10 (a) The plots of the residuals against income and age show that the absolute values of the
residuals increase as income increases but appear to be constant as age increases. This
indicates that the error variance depends on income.
(b) Since the residual plot shows that the error variance may increase when income increases, and
this is a reasonable outcome since greater income implies greater flexibility in travel, we set the
null and alternative hypotheses as $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 > \sigma_2^2$. The test statistic is

$$GQ = \frac{\hat\sigma_1^2}{\hat\sigma_2^2} = \frac{(2.9471\times 10^7)/(100-4)}{(1.0479\times 10^7)/(100-4)} = 2.8124$$

The 5% critical value for (96, 96) degrees of freedom is $F_c = 1.35$. Thus, we reject $H_0$ and
conclude that the error variance depends on income.
12.1
(a) The least-squares estimated equation is given by
It = 6.22 + 0.770 Yt  0.184 Rt
(2.51) (0.072) (0.126)
R2 = 0.816
Both b2 and b3 have the expected signs; income is expected to have a positive effect on
investment whereas an increase in the interest rate should reduce investment. The standard
errors for b1 and b2 are relatively small, suggesting that the corresponding coefficients are
significantly different from zero. However, the standard error of b3 is large, yielding a t ratio
that is less than two. Based on this standard error, we can question whether we should include Rt
in the equation, although economic theory suggests Rt should have a strong influence on It.
(b) The plot of the least squares residuals in Figure 12.1 reveals a few long runs of negative and
positive residuals, suggesting the existence of autocorrelation.
[Figure: least squares residuals e plotted against t = 1, ..., 35]
Figure 12.1 Residuals for Investment Equation
(c) In this context, the Durbin-Watson test is a test of $H_0: \rho = 0$ against $H_1: \rho > 0$ in the first-order
autoregressive model $e_t = \rho e_{t-1} + v_t$. The computed value of the Durbin-Watson statistic
is d = 0.852 (calculated by SAS). With T = 30 and K = 3, we have $d_L = 1.284$ and $d_U = 1.567$.
Because the d statistic is less than $d_L$, we reject $H_0$ and conclude that positive autocorrelation is present.
(d) The results from estimating the model in SAS and correcting for AR(1) errors are:
It = 8.41 + 0.742 Yt  0.285 Rt
(2.90) (0.115) (0.081)
 = 0.5616
Comparing these results with those from part (a), we find that there has been little change in the
coefficient estimates, but a considerable change in the standard errors. The standard error on the
coefficient of Yt has increased, suggesting that, if we did not correct for autocorrelation, our
confidence interval for $\beta_2$ would be too narrow, giving us a false sense of the reliability of our
estimate. The opposite has occurred for $\beta_3$. Here the standard error has dropped after correcting
for autocorrelation. From the results in part (a) we might be misled into thinking that the interest
rate has no impact on investment (the estimated coefficient is not significant). After correcting
for autocorrelation we have a relatively narrow confidence interval that does not include zero.
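For reference, estimates of this kind can be produced with PROC AUTOREG (SAS/ETS); a minimal sketch, where the dataset name invest and variable names i, y, r are assumptions:

proc autoreg data=invest;
  model i = y r / nlag=1 method=ml dwprob;   /* AR(1) error model; DWPROB prints the Durbin-Watson p-value */
run;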
(e) Given that next year's values of Y and R are $Y_{T+1} = 36$ and $R_{T+1} = 14$, the appropriate forecast uses

$$\hat I_{T+1} = \hat\beta_1 + \hat\beta_2 Y_{T+1} + \hat\beta_3 R_{T+1} + \hat\rho\,\tilde e_T = 8.4093 + 0.7422(36) - 0.2849(14) + 0.5616(2.1462) = 32.346$$

If autocorrelation is ignored, our prediction is

$$\hat I_{T+1} = b_1 + b_2 Y_{T+1} + b_3 R_{T+1} = 6.22 + 0.77(36) - 0.184(14) = 31.363$$
There is not a large difference between the two predictions.
12.3
(a) The least-squares estimated equation is

$\widehat{\ln(JV_t)} = 3.5027 - 1.6116\,\ln(U_t)$    $R^2 = 0.8299$
(se)                 (0.2829)  (0.1555)

Using the value $t_c = 2.074$, a 95% confidence interval for $\beta_2$ is
$b_2 \pm t_c\,\mathrm{se}(b_2) = (-1.9342,\ -1.2890)$.
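Explicitly, $-1.6116 \pm 2.074 \times 0.1555 = (-1.9342,\ -1.2890)$.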
(b) The value of the Durbin-Watson statistic is d = 1.09. The lower limit is $d_L = 1.273$ and the upper
limit is $d_U = 1.446$; since d is less than $d_L$, we reject $H_0$ and conclude that positive autocorrelation
exists. The existence of autocorrelation means the original assumption for $e_t$, that the $e_t$ are
independent, is not correct. This problem also causes the confidence interval for $\beta_2$ in part (a) to be
incorrect, meaning we will have a false sense of the reliability of the coefficient estimate.
(c) After correcting for autocorrelation, the estimated equation (from SAS) is

$\widehat{\ln(JV_t)} = 3.5138 - 1.616\,\ln(U_t)$    $\hat\rho = 0.4318$
(se)                 (0.2437)  (0.127)
The 95% confidence interval for $\beta_2$ from SAS is $(-1.879,\ -1.353)$. This confidence interval is
slightly narrower than that given in part (a). A direct comparison with the interval in part (a) is
difficult because the least squares standard errors are incorrect in the presence of AR(1) errors.
However, given the change in standard errors is not great, and that we know least squares is less
efficient, one could conjecture that the least squares confidence interval is narrower than it
should be, implying unjustified reliability.
12.8
(a) The estimate for the AR(1) parameter $\rho$ is

$$\hat\rho = \frac{\sum_{t=2}^{T} \hat e_t \hat e_{t-1}}{\sum_{t=1}^{T} \hat e_t^2} = \frac{55453}{63316} = 0.8758$$

The approximate Durbin-Watson statistic is $d^* = 2(1 - \hat\rho) = 0.2484$.
Based on $T = 90$ and $K = 5$, $d_L = 1.566$ and $d_U = 1.751$. Since $d^*$ is less than $d_L$, we conclude
that positive autocorrelation is present.
(b) The estimate for the AR(1) parameter $\rho$ is

$$\hat\rho = \frac{\sum_{t=2}^{T} \hat e_t \hat e_{t-1}}{\sum_{t=1}^{T} \hat e_t^2} = \frac{621}{12292} = 0.0505$$

The approximate Durbin-Watson statistic is $d^* = 2(1 - \hat\rho) = 1.8990$.
Based on $T = 90$ and $K = 6$, $d_L = 1.542$ and $d_U = 1.776$. Since $d^* > d_U$, we cannot reject a
null hypothesis of no positive autocorrelation.
12.9
(a) From the residual plots the residuals tend to exhibit runs of positive and negative values,
suggesting autocorrelated errors. The Durbin-Watson statistic is 1.124. With T = 26 and K = 2
we obtain $d_L = 1.302$ and $d_U = 1.461$. Since the value of the Durbin-Watson statistic is less than
$d_L$, we conclude that there is evidence of positive autocorrelation.
(b) The estimates, their standard errors and the confidence intervals obtained from least squares and
generalised least squares (GLS) are presented in the table below. For the least squares method T
= 26, K = 2 and the critical t value is $t_{0.025} = 2.064$. For GLS, T = 25 and K = 2, so $t_{0.025} = 2.069$.
The estimates obtained from least squares and GLS are very similar. However, the standard
errors from GLS are much higher than from least squares, resulting in GLS confidence intervals
which are much wider than those obtained from least squares. Hence, ignoring autocorrelation
means the estimates are less reliable than they appear.
This information was presented in the book, page 281.
            Least squares                             GLS
            Estimate (se)    Confidence interval      Estimate (se)    Confidence interval
β1          -387.97          (-620.49, -155.45)       -343.85          (-741.45, 53.75)
            (112.66)                                  (192.17)
β2          24.7646          (22.759, 26.770)         24.3882          (21.131, 27.645)
            (0.9715)                                  (1.5741)
(c) Because of the evidence of autocorrelation, the forecasts are based on the GLS results.

$$\widehat{DISP}_{86} = \hat\beta_1 + \hat\beta_2 DUR_{86} + \hat\rho\,\hat e_{85} = -343.85 + 24.3882(190) + 0.4186(-277.2) = 4173.87$$

$$\widehat{DISP}_{87} = \hat\beta_1 + \hat\beta_2 DUR_{87} + \hat\rho^2\,\hat e_{85} = -343.85 + 24.3882(195) + 0.4186^2(-277.2) = 4363.28$$

$$\widehat{DISP}_{88} = \hat\beta_1 + \hat\beta_2 DUR_{88} + \hat\rho^3\,\hat e_{85} = -343.85 + 24.3882(192) + 0.4186^3(-277.2) = 4318.35$$