Chapter 2
ANSWERS TO END-OF-CHAPTER EXERCISES
1.
The null hypothesis states that mean sales per representative are $24,000. The
alternative hypothesis is that mean sales are not $24,000.
\[ H_0: \mu = \$24{,}000 \quad \text{and} \quad H_A: \mu \neq \$24{,}000. \]
Since the sample standard deviation is reported, the appropriate test statistic follows a
t distribution:
\[ \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}. \]
However, because the sample size of 100 is so large, we can apply the standard normal table.
Since the alternative is one of inequality, with level of significance being .05, we need the area
making up two and one-half percent of the distribution in each tail. From the Standard Normal
Table we see that the probability that a Z-variable exceeds 1.96 is .025. Accordingly, our
rejection region is for values of the test statistic that lie either above 1.96 or below -1.96.
Using the sample mean, sample standard deviation, and sample size, the calculated value of the
test statistic is:
\[ \frac{25350 - 24000}{7490/\sqrt{100}} = 1.80. \]
Since this lies outside the rejection region, we fail to reject the null at the 95% level of
confidence and conclude that the reported mean level of sales is statistically indistinguishable
from $24,000.
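As a rough check on the arithmetic above, the following sketch recomputes the test statistic and the two-tailed decision using only the Python standard library. The function name `z_test_two_tailed` is an illustrative assumption, not something from the text, and the normal approximation is used because the sample is large.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_tailed(xbar, mu0, s, n, alpha=0.05):
    """Large-sample two-tailed test of H0: mu = mu0 (hypothetical helper)."""
    stat = (xbar - mu0) / (s / sqrt(n))      # standardized sample mean
    z = NormalDist()                         # standard normal distribution
    crit = z.inv_cdf(1 - alpha / 2)          # upper critical value, about 1.96
    p_value = 2 * (1 - z.cdf(abs(stat)))     # two-tailed p-value
    return stat, crit, p_value, abs(stat) > crit

stat, crit, p_value, reject = z_test_two_tailed(25350, 24000, 7490, 100)
print(f"z = {stat:.2f}, critical value = {crit:.2f}, "
      f"p-value = {p_value:.3f}, reject H0: {reject}")
```

The p-value is an extra that the worked answer does not report; it simply restates the same decision on a probability scale.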
2.
State the null hypothesis as the inventory level is the same as the industry average, and
the alternative as one of inequality:
\[ H_0: \mu = 325 \quad \text{and} \quad H_A: \mu \neq 325. \]
Since both the population mean and variance are unknown, the appropriate test statistic is based
on the t-distribution:
\[ \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}. \]
The critical region for a 5% level of significance with sample size 120 for a two-tailed alternative
is given by:
Reject H0 if
\[ \left| \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \right| > 1.96. \]
Given that n = 120, \(\bar{X}\) = 310, and s = 72, the absolute value of the test statistic under the null is:
\[ \left| \frac{310 - 325}{72/\sqrt{120}} \right| = 2.28. \]
Since 2.28 > 1.96, we can reject the null and conclude that the manufacturer's mean tire inventory is
significantly different from the industry norm.
3.
a) The mean is the arithmetic average of all the numbers in the data set. The median is
the value that splits the respondents into two equal parts when they are arrayed from the lowest
to the highest value. The mode is the value that occurs most frequently.
Using FORECASTX™, the following descriptive statistics were calculated:

Statistics for Solution: Credit Hours
Mean                 8.3
Median               9
Mode                 9
Standard Deviation   2.6378
Sample Variance      6.9579
Range                11
Minimum              2
Maximum              13
Standard Error       0.5898

b) Since the claim is that business graduate students take fewer credit hours than the average
graduate student, we state the null such that its rejection merits the claim. Specifically, the null
states that the mean number of credit hours taken by the business graduate student is equal to or
larger than the mean number of credit hours taken by typical graduate students. We state the
alternative as mean credit hours taken is less than the mean credit hours taken by a typical
graduate student:
\[ H_0: \mu \geq 9.1 \quad \text{and} \quad H_A: \mu < 9.1. \]
Since both the population mean and variance are unknown, the appropriate test statistic is based
on the t-distribution.
\[ \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}. \]
The critical region for a one-tailed alternative at the 5% level of significance with sample size 20
is given by:
Reject H0 if
\[ \frac{\bar{X} - \mu_0}{s/\sqrt{n}} < -1.725. \]
Given that n = 20, \(\bar{X}\) = 8.3, and s = 2.638, we can calculate the sample value of the test statistic
under the null hypothesis:
\[ \frac{8.3 - 9.1}{2.638/\sqrt{20}} = -1.356. \]
Since -1.356 does not fall below -1.725 (in absolute value, 1.356 < 1.725), we fail to reject the null
hypothesis. The sample does not support the claim that business graduate students take fewer credits
per quarter than non-business graduate students.
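The lower-tail decision in part b can be sketched in a few lines. The critical value 1.725 is taken from the answer above rather than computed, and the helper name `t_statistic` is an assumption made for illustration.

```python
from math import sqrt

def t_statistic(xbar, mu0, s, n):
    """Sample mean standardized by its estimated standard error."""
    return (xbar - mu0) / (s / sqrt(n))

t_crit = 1.725                      # tabulated critical value quoted in the text
t = t_statistic(8.3, 9.1, 2.638, 20)

# One-tailed (lower) alternative: reject only when t falls below -t_crit.
reject = t < -t_crit
print(f"t = {t:.3f}, reject H0: {reject}")
```

Writing the rule as `t < -t_crit` keeps the direction of the alternative explicit, which is easy to lose when only absolute values are compared.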
4.
The issue is whether or not ACC’s sales staff is comparable to those of other producers in
the same industry. Accordingly, state the null as ACC’s mean sales per salesperson equals that
of other producers, with the alternative of inequality:
\[ H_0: \mu = \$255{,}000 \quad \text{and} \quad H_A: \mu \neq \$255{,}000. \]
The following statistics are used to test this assertion.

Statistics for the Solution: Sales
Mean                 219,017.5625
Median               180,142.0000
Mode                 None
Standard Deviation   76,621.2783
Sample Variance      5,870,820,285.1958
Standard Error       19,155.3196
Range                229,726.0000
Minimum              110,027.0000
Maximum              339,753.0000

Since both the population mean and variance are unknown, the appropriate test statistic is based
on the t-distribution:
\[ \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}. \]
The critical region for a 5% level of significance, when the sample size is 16 under a two-tailed
alternative, is given by:
Reject H0 if
\[ \left| \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \right| > 2.131. \]
Given that n = 16, \(\bar{X}\) = 219,018, and s = 76,621.3, we can calculate the absolute value of the test
statistic under the null hypothesis:
\[ \left| \frac{219018 - 255000}{76621.3/\sqrt{16}} \right| = 1.878. \]
Since 1.878 < 2.131, we cannot reject the null. ACC's mean sales per salesperson is not statistically
different from that of other producers.
5.
a) Since the mean is 205 pounds, and the normal distribution is symmetric about the
mean, half of the population should lie above the mean. Therefore 50 percent of the players
would weigh more than 205 pounds.
b) To make statistical inference from the probability distribution of football player weights, we
use the standard normal distribution with the appropriate transformation. The appropriate Z
value for 250 pounds is:
Z = (250 - 205)/30 = 1.5.
Accordingly, what percent of players weigh less than 250 pounds is the same as asking:
P(Z < 1.5) = P[Z ≤ 0] + P[0 ≤ Z < 1.5] = .5000 + .4332 = .9332.
Using the relative frequency interpretation of probabilities this would imply that 93.3% of the
players would weigh less than 250 pounds.
c) From Table 2-4: Z.10 = -1.285. Accordingly, 90% of the area under the standard normal
density lies above the Z-value of -1.285:
P[Z > -1.285] = P[(X - 205)/30 > -1.285] = .90
P[X - 205 > -38.55] = P[X > 166.45] = .90.
Therefore 90% of the players would weigh more than 166.45 pounds.
d) P[180 ≤ X ≤ 230] = P[180 - 205 ≤ X - 205 ≤ 230 - 205]
= P[-25 ≤ X - 205 ≤ 25] = P[-25/30 ≤ Z ≤ 25/30] = P[-.8333 ≤ Z ≤ .8333]
= P[-.8333 ≤ Z ≤ 0] + P[0 ≤ Z ≤ .8333] = .2967 + .2967 = .5934
Therefore, 59.34% of the players would weigh between 180 and 230 pounds.
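The four probabilities in this exercise can be cross-checked with the standard library's `statistics.NormalDist`. This is only a sketch; the variable names are illustrative assumptions. Because it evaluates the normal distribution exactly instead of reading a rounded Z from a table, the results for parts c and d differ slightly from the table-based figures above (roughly 166.55 pounds rather than 166.45, and about .595 rather than .5934).

```python
from statistics import NormalDist

# Player weights: normal with mean 205 pounds and standard deviation 30 pounds.
weights = NormalDist(mu=205, sigma=30)

p_above_mean = 1 - weights.cdf(205)               # part a: share above 205
p_below_250 = weights.cdf(250)                    # part b: share below 250
w_90_above = weights.inv_cdf(0.10)                # part c: weight exceeded by 90%
p_180_230 = weights.cdf(230) - weights.cdf(180)   # part d: share between 180 and 230

print(f"a) {p_above_mean:.4f}")
print(f"b) {p_below_250:.4f}")
print(f"c) {w_90_above:.2f} pounds")
print(f"d) {p_180_230:.4f}")
```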
6.
a) Ms. Wharton’s hope is that her bank is viewed more favorably than the average bank,
which has an approval rating of 7.01. We formulate the null such that its rejection merits Ms.
Wharton’s hopes:
\[ H_0: \mu \leq 7.01 \quad \text{and} \quad H_A: \mu > 7.01. \]
Since data were derived from a market research survey, we view both the population mean and
variance as having been estimated. Accordingly, the appropriate test statistic is based upon the
t-distribution:
\[ \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}. \]
To find the calculated value of the test statistic based upon this sample, we note the following
information: Sample Mean = 7.25, \(\mu_0\) = 7.01, s = 2.51, and n = 400.
\[ \frac{7.25 - 7.01}{2.51/\sqrt{400}} = 1.91. \]
The critical region for significance level .05 and sample size 400, under a one-tailed alternative,
is given by:
P[t399 > 1.645] = .05.
Since our calculated t-value falls into the critical region, we can reject the null hypothesis at the
5% level of significance.
c) With a sample size of 100, the calculated value of our t-statistic would now be:
\[ \frac{7.25 - 7.01}{2.51/\sqrt{100}} = .956. \]
Since .956 < 1.645, we cannot reject the null with this smaller sample, other things held constant.
Why? The key is to examine the variance of the sampling distribution of the sample mean,
which depends on the sample size. As sample size increases, the standard error \(s/\sqrt{n}\) falls, so we
can be more confident in our estimate of the population mean.
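The effect of sample size can be seen by holding the sample mean and standard deviation fixed and varying only n. The sketch below uses the figures from this exercise; the dictionary `results` and the loop are illustrative choices, and 1.645 is the critical value quoted above.

```python
from math import sqrt

xbar, mu0, s = 7.25, 7.01, 2.51   # sample mean, hypothesized mean, sample std. dev.
crit = 1.645                      # one-tailed 5% critical value from the text

results = {}
for n in (400, 100):
    se = s / sqrt(n)              # standard error shrinks as n grows
    t = (xbar - mu0) / se
    results[n] = (se, t, t > crit)
    print(f"n = {n}: standard error = {se:.4f}, t = {t:.3f}, reject H0: {t > crit}")
```

Cutting the sample from 400 to 100 doubles the standard error and therefore halves the test statistic, which is why the same sample mean no longer leads to rejection.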
7.
a) Using FORECASTX™, the following descriptive statistics about class size were
obtained.

Statistics for Solution: Number of students
Mean                 40.1600
Median               42.0000
Mode                 20.0000
Standard Deviation   13.1424
Sample Variance      172.7233
Standard Error       2.6285
Range                54.0000
Minimum              10.0000
Maximum              64.0000

b) The standard error of the sample mean is \(\sqrt{\sigma^2/n}\). Estimating the population variance by the
sample variance and noting the sample size, the standard error of the estimated sample mean is
\((172.72/25)^{1/2} = 2.628\).
c) The sampling distribution of the sample mean shows that the sample mean is an unbiased
estimator of the population mean. Accordingly, 40.16 is our point estimate for the population
class size.
d) Since the population mean and variance are unknown, we use the following confidence
interval involving the t-distribution to make probability statements about the unknown
population mean:
\[ P\left[ \bar{X} - t_{n-1,\alpha/2} \frac{s}{\sqrt{n}} \leq \mu \leq \bar{X} + t_{n-1,\alpha/2} \frac{s}{\sqrt{n}} \right] = 1 - \alpha. \]
For a 95 percent confidence interval, using \(t_{24,.025} = 2.064\), we get:
\[ P[34.74 < \mu < 45.58] = .95. \]
For a 90 percent confidence interval, using \(t_{24,.05} = 1.711\), we get:
\[ P[35.66 < \mu < 44.66] = .90. \]
The 95% confidence interval is wider because a higher level of confidence requires a wider interval.
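The two intervals in part d can be reproduced from the summary statistics. In this sketch the function name `t_interval` is an assumption, and the t critical values are the tabulated ones quoted above. Endpoints may differ from the printed ones in the second decimal place because the text rounds the standard error to 2.628 first.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean: xbar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width

xbar, s, n = 40.16, 13.1424, 25
ci95 = t_interval(xbar, s, n, 2.064)   # t(24, .025) from the text
ci90 = t_interval(xbar, s, n, 1.711)   # t(24, .05) from the text

print(f"95% interval: ({ci95[0]:.2f}, {ci95[1]:.2f})")
print(f"90% interval: ({ci90[0]:.2f}, {ci90[1]:.2f})")
```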
8.
a) To examine whether there has been an upward trend in annual larceny thefts in the
United States, a time-series plot of annual data from 1972 through 1994 was prepared.
[Figure: time-series plot "Larceny Thefts"; vertical axis: Larceny Thefts (000), 0 to 9000; horizontal axis: Year, 1972 to 1994.]
As shown in the time-series plot, there is a positive trend over the sample period.
b) The ACF and PACF estimates and correlograms for THEFTS are reported below.
rk = 2/(sqrt n) so in this case rk = 2/(sqrt 23) = .417.

ACF Values For Larceny Thefts
Obs   ACF     Upper Limit   Lower Limit
1     .7986   .4087         -.4087
2     .5594   .4087         -.4087
3     .4045   .4087         -.4087
4     .3298   .4087         -.4087
5     .2844   .4087         -.4087

To test the significance of individual autocorrelation parameters, we use the following 95%
confidence approximation, which states that the critical value for rk is:
\[ |r_k| \geq \frac{2}{\sqrt{n}}, \]
where n is the sample size. For estimated autocorrelation coefficients in excess of this value in
absolute value, we reject the null hypothesis of zero autocorrelation assuming a two-tailed
alternative at the .05 level of significance.
Since the sample size is 23, the appropriate critical value for testing the null of zero autocorrelation
is .417. Examining the autocorrelation function, we see that we can reject the null of zero
autocorrelation at lags 1 and 2, since their coefficient estimates exceed .417. Autocorrelations at
lags 3 through 5 are not statistically different from zero. Accordingly, because of the trend, the data
are nonstationary.
c) To account for trend, Holt’s exponential smoothing or a regression trend model should be
considered.
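The autocorrelation test in part b can be sketched as follows. The larceny data are not reproduced in this answer, so the series below is a purely hypothetical 23-point linear trend used only to show the mechanics; the function names `acf` and `critical_value` are also assumptions, and the ACF formula is the usual sample autocorrelation, which may differ in detail from what the forecasting software computes.

```python
from math import sqrt

def acf(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a list of numbers."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]

def critical_value(n):
    """Approximate 95% bound for testing zero autocorrelation: 2 / sqrt(n)."""
    return 2 / sqrt(n)

# Hypothetical trending series with the same length as the larceny sample.
trend = [float(t) for t in range(1, 24)]
r = acf(trend, 5)
crit = critical_value(len(trend))

for lag, r_k in enumerate(r, start=1):
    print(f"lag {lag}: r = {r_k:.4f}, significant: {abs(r_k) > crit}")
```

A trending series like this one produces large autocorrelations at short lags that die out only gradually, the same pattern the answer uses to diagnose nonstationarity.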
9.
A time-series plot of mobile home shipments (MHS) is shown below.
[Figure: time-series plot "Mobile Home Shipments (000)"; vertical axis: MHS (000), 0 to 100; horizontal axis: End Month of Quarters, Mar-81 to Mar-95.]
As indicated by the time-series plot, there is significant seasonality to mobile home sales as
shown by the regular periodic variation in the data arising at the same time each year.
Specifically, the plot shows seasonal downturns in quarters four and one compared to quarters
two and three. The plot also reveals periods of upward trends followed by downward trends,
presumably related to business cycle factors such as interest rates and unemployment.
The autocorrelation structure of MHS and its correlograms are reported below.

Obs   ACF      PACF
1     .7779    .7779
2     .5593    -.1161
3     .6165    .5678
4     .6911    .0537
5     .4623    -.5250
6     .2359    -.0206
7     .2494    -.0345
8     .3154    .1172
9     .1103    -.2919
10    -.1118   -.0385
11    -.0954   .0099
12    -.0136   -.0064

[Figure: correlograms of the ACF and PACF for MHS, lags 1 to 12, each shown with upper and lower limits.]
Using the 95% approximation for testing the null of zero autocorrelation at lag k, we can reject
the null if the estimated autocorrelation coefficient satisfies
\[ |r_k| \geq \frac{2}{\sqrt{n}}. \]
With our sample size of 60, the critical value for rk is .258199. Examining the estimated
autocorrelations in the table above shows that we can reject the null of zero autocorrelation at lags
1, 2, 3, 4, 5, and 8; it is not until lag six that the autocorrelation function falls below .258199 (r6
= .235934). Accordingly, the autocorrelation results indicate a significant trend in the data.
In addition, the relatively large autocorrelation coefficients for lags of 4 and 8 quarters (r4 =
.691087 and r8 = .315445) indicate significant seasonality in the MHS series.
Forecasting methods that might be suggested as good candidates for MHS, based on Table 2-1 in
the text would include Winters’ exponential smoothing, time series decomposition, and a causal
regression model.
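The list of significant lags quoted for MHS can be checked directly against the reported ACF values. This is a small sketch; the variable names are illustrative, and the coefficients are copied from the table in this answer at four decimal places.

```python
from math import sqrt

# ACF estimates for MHS at lags 1 through 12, as reported in the table.
acf_mhs = [.7779, .5593, .6165, .6911, .4623, .2359,
           .2494, .3154, .1103, -.1118, -.0954, -.0136]

n = 60                       # sample size stated in the text
crit = 2 / sqrt(n)           # approximate 95% critical value

significant = [lag for lag, r in enumerate(acf_mhs, start=1) if abs(r) > crit]
print(f"critical value = {crit:.6f}")
print(f"significant lags: {significant}")
```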
10.
a) Private housing starts data (PHS) are plotted below.
[Figure: time-series plot "Private Housing Starts (000)"; vertical axis: PHS (000), 0 to 400; horizontal axis: Mid-Month of Quarter, Feb-80 to Feb-98.]
The time-series plot of private housing starts (PHS) shows no significant trend and significant
seasonality.
b) Using FORECASTX™, the estimated autocorrelation coefficients and correlogram for
PHS are reported below.

ACF Values For Total Houses Sold (000) Per Quarter
Obs   ACF     Upper Limit   Lower Limit
1     .8258   .2450         -.2450
2     .6887   .2450         -.2450
3     .7155   .2450         -.2450
4     .7613   .2450         -.2450
5     .6044   .2450         -.2450
6     .4660   .2450         -.2450
7     .4898   .2450         -.2450
8     .5411   .2450         -.2450

With our sample size of 64, the critical value for rk is .25. Examining the estimated
autocorrelations in the table above shows that we can reject the null of zero autocorrelation at lags
1 through 8 using the approximate 95% confidence rule.
The autocorrelation results indicate trend in the data. In addition, the relatively large
autocorrelation coefficients for lags of 4 and 8 quarters (r4 = .76 and r8 = .54) indicate
significant seasonality in the PHS series.
c) Using FORECASTX™, the estimated autocorrelation coefficients and correlograms of the
first-differenced series (DPHS) are reported below.
Since we lose a data point in the first-differenced series relative to the original, our critical value
of rk is now .252. Accordingly, we can reject the null for lags 2, 4, 6, 8, 10, and 12. These large
autocorrelations suggest that the PHS data are highly seasonal and that the seasonality carries over
into the first-differenced series.
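First differencing, and the way it changes the critical value, can be sketched as below. The four-point series is hypothetical and serves only to show the operation; `first_difference` is an illustrative name, not a function from the forecasting software.

```python
from math import sqrt

def first_difference(series):
    """Return the changes between consecutive observations."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Hypothetical values, used only to demonstrate the operation.
example = [100, 120, 90, 130]
print(first_difference(example))

# Differencing the 64 quarterly PHS observations leaves 63 points,
# so the approximate 95% critical value rises slightly.
n_original = 64
n_differenced = n_original - 1
crit_original = 2 / sqrt(n_original)
crit_differenced = 2 / sqrt(n_differenced)
print(f"{crit_original:.3f} -> {crit_differenced:.3f}")
```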

ACF Values For DTHS
Obs   ACF     Upper Limit   Lower Limit
1     .1475   .2469         -.2469
2     .5529   .2469         -.2469
3     .0726   .2469         -.2469
4     .7088   .2469         -.2469
5     .1158   .2469         -.2469
6     .5272   .2469         -.2469
7     .0685   .2469         -.2469
8     .6052   .2469         -.2469
9     .0591   .2469         -.2469
10    .4676   .2469         -.2469
11    .1032   .2469         -.2469
12    .5803   .2469         -.2469

11.
A time-series plot of the Japanese-yen U.S.-dollar exchange rate (EXRJ) is shown below.
[Figure: time-series plot "Japanese Yen per U.S. Dollar Exchange Rate"; vertical axis: EXRJ, 0 to 200; horizontal axis: Month, 1 to 23.]
As shown by the time-series plot, there appears to be no significant trend or seasonality in
exchange-rate data.
The autocorrelation structure and correlograms for EXRJ are reported below:

Obs   ACF      PACF
1     .8157    .8157
2     .5383    -.3797
3     .2733    -.0798
4     .0340    -.1550
5     -.1214   .0408
6     -.1924   -.0112
7     -.2157   -.0537
8     -.1978   -.0036
9     -.1215   .1120
10    -.1217   -.3281
11    -.1823   -.2593
12    -.1047   -.1248

[Figure: correlograms of the ACF and PACF for EXRJ, lags 1 to 12, each shown with upper and lower limits.]
The approximate 95% critical value for rejecting the null of zero autocorrelation at lag k with a
sample size of 24 is .408. Since the autocorrelation coefficients fall below the critical value
after just two periods, we can conclude that there is no trend in the data.