Chapter 7
Estimates and Sample Sizes
7-2 Estimating a Population Proportion
1. The confidence level was not stated. The most common level of confidence is 95%, and
sometimes that level is carelessly assumed without actually being stated.
2. The margin of error is the maximum likely difference between the point estimate for a
parameter and its true value.
3. By including a statement of the maximum likely error, a confidence interval provides
information about the accuracy of an estimate.
4. No. A voluntary response sample is not necessarily representative of the population.
5. For 99% confidence, α = 1–0.99 = 0.01 and α/2 = 0.01/2 = 0.005.
For the upper 0.005, A = 0.9950 and z = 2.575.
zα/2 = z0.005 = 2.575
6. For 99.5% confidence, α = 1–0.995 = 0.005 and α/2 = 0.005/2 = 0.0025.
For the upper 0.0025, A = 0.9975 and z = 2.81.
zα/2 = z0.0025 = 2.81
7. For α = 0.10, α/2 = 0.10/2 = 0.05.
For the upper 0.05, A = 0.9500 and z = 1.645.
zα/2 = z0.05 = 1.645
8. For α = 0.02, α/2 = 0.02/2 = 0.01.
For the upper 0.01, A = 0.9900 [0.9901] and z = 2.33.
zα/2 = z0.01 = 2.33
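The table lookups in Exercises 5-8 can be cross-checked in software. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical and not part of the manual. Software carries more decimal places than the printed table, so small differences in the last digit from the table-based values are expected.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_(alpha/2) for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence
    # The cumulative area to the left of the critical value is A = 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

# Confidence levels corresponding to Exercises 5-8.
for level in (0.99, 0.995, 0.90, 0.98):
    print(f"{level:.1%} confidence: z = {z_critical(level):.3f}")
```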
9. Let L = the lower confidence limit; U = the upper confidence limit.
p̂ = (L+U)/2 = (0.200+0.500)/2 = 0.700/2 = 0.350
E = (U–L)/2 = (0.500–0.200)/2 = 0.300/2 = 0.150
The interval can be expressed as 0.350 ± 0.150.
10. Let L = the lower confidence limit; U = the upper confidence limit.
p̂ = (L+U)/2 = (0.720+0.780)/2 = 1.500/2 = 0.750
E = (U–L)/2 = (0.780–0.720)/2 = 0.060/2 = 0.030
The interval can be expressed as 0.750 ± 0.030.
11. Let L = the lower confidence limit; U = the upper confidence limit.
p̂ = (L+U)/2 = (0.437+0.529)/2 = 0.966/2 = 0.483
E = (U–L)/2 = (0.529–0.437)/2 = 0.092/2 = 0.046
The interval can be expressed as 0.483 ± 0.046.
12. Given that p̂ = 0.222 and E = 0.044,
L = p̂ – E = 0.222 – 0.044 = 0.178
U = p̂ +E = 0.222 + 0.044 = 0.266
The interval can be expressed as 0.178 < p < 0.266.
13. Let L = the lower confidence limit; U = the upper confidence limit.
p̂ = (L+U)/2 = (0.320+0.420)/2 = 0.740/2 = 0.370
E = (U–L)/2 = (0.420–0.320)/2 = 0.100/2 = 0.050
14. Let L = the lower confidence limit; U = the upper confidence limit.
p̂ = (L+U)/2 = (0.772+0.776)/2 = 1.548/2 = 0.774
E = (U–L)/2 = (0.776–0.772)/2 = 0.004/2 = 0.002
15. Let L = the lower confidence limit; U = the upper confidence limit.
p̂ = (L+U)/2 = (0.433+0.527)/2 = 0.960/2 = 0.480
E = (U–L)/2 = (0.527–0.433)/2 = 0.094/2 = 0.047
16. Let L = the lower confidence limit; U = the upper confidence limit.
p̂ = (L+U)/2 = (0.102+0.236)/2 = 0.338/2 = 0.169
E = (U–L)/2 = (0.236–0.102)/2 = 0.134/2 = 0.067
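The midpoint and half-width calculations used in Exercises 9-16 can be sketched in a few lines of Python. The function name `estimate_and_margin` is a hypothetical helper introduced only for illustration.

```python
def estimate_and_margin(lower, upper):
    """Recover the point estimate and margin of error from interval limits."""
    p_hat = (lower + upper) / 2    # midpoint of the interval
    margin = (upper - lower) / 2   # half the interval width
    return p_hat, margin

# Exercise 9: limits 0.200 and 0.500
p_hat, margin = estimate_and_margin(0.200, 0.500)
print(f"{p_hat:.3f} ± {margin:.3f}")
```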
IMPORTANT NOTE: When calculating E = zα/2·√(p̂q̂/n), do not round off in the middle of the
problem. This, and the subsequent calculations of U = p̂ + E and L = p̂ – E, may be accomplished
conveniently on most calculators having a memory as follows.
(1) Calculate p̂ = x/n and STORE the value.
(2) Calculate E as 1 – RECALL = * RECALL = ÷ n = √
* zα/2 =
(3) With the value for E showing on the display, the upper confidence limit U can be calculated
by using + RECALL =.
(4) With the value for U showing on the display, the lower confidence limit L can be calculated
by using – RECALL ± + RECALL.
THE MANUAL USES THIS PROCEDURE, AND ROUNDS THE FINAL ANSWER TO 3
SIGNIFICANT DIGITS, EVEN THOUGH IT REPORTS INTERMEDIATE STEPS WITH A FINITE
NUMBER OF DECIMAL PLACES. If the above procedure does not work on your calculator, or to
find out if some other procedure would be more efficient on your calculator, ask your instructor
for assistance. You must become familiar with your own calculator – and be sure to do your
homework on the same calculator you will use for the exams.
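The same "store and recall" idea carries over to software: keep p̂ at full precision and round only when reporting. The sketch below is one possible Python version, not the manual's own procedure; the function name `proportion_interval` is hypothetical, and the z value is passed in so that the table value used in the manual can be supplied.

```python
from math import sqrt

def proportion_interval(x, n, z):
    """Return (p_hat, E, lower, upper) without rounding any intermediate value."""
    p_hat = x / n                          # step (1): keep p-hat at full precision
    q_hat = 1 - p_hat
    margin = z * sqrt(p_hat * q_hat / n)   # step (2): E = z * sqrt(p_hat*q_hat/n)
    return p_hat, margin, p_hat - margin, p_hat + margin

# Exercise 17: x = 400, n = 1000, z = 1.96
p_hat, margin, lower, upper = proportion_interval(400, 1000, 1.96)
print(f"E = {margin:.4f}; {lower:.3f} < p < {upper:.3f}")
```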
17. α = 0.05 and zα/2 = z0.025 = 1.96; p̂ = x/n = 400/1000 = 0.40
E = zα/2·√(p̂q̂/n) = 1.96·√[(0.40)(0.60)/1000] = 0.0304
18. α = 0.01 and zα/2 = z0.005 = 2.575; p̂ = x/n = 220/500 = 0.44
E = zα/2·√(p̂q̂/n) = 2.575·√[(0.44)(0.56)/500] = 0.0572
19. α = 0.02 and zα/2 = z0.01 = 2.33; p̂ = x/n = [492]/1230 = 0.40
E = zα/2·√(p̂q̂/n) = 2.33·√[(0.40)(0.60)/1230] = 0.0325
NOTE: The value x=[492] was not given. In truth, any 486 ≤ x ≤ 498 rounds to the given
p̂ = x/1230 = 40%. For want of a more precise value, p̂ = 0.40 is used in the calculation of E.
20. α = 0.10 and zα/2 = z0.05 = 1.645; p̂ = x/n = [623]/1780 = 0.35
E = zα/2·√(p̂q̂/n) = 1.645·√[(0.35)(0.65)/1780] = 0.0186
NOTE: The value x=[623] was not given. In truth, any 615 ≤ x ≤ 631 rounds to the given
p̂ = x/1780 = 35%. For want of a more precise value, p̂ = 0.35 is used in the calculation of E.
21. α = 0.05 and zα/2 = z0.025 = 1.96; p̂ = x/n = 40/200 = 0.2000
p̂ ± zα/2·√(p̂q̂/n)
0.2000 ± 1.96·√[(0.2000)(0.8000)/200]
0.2000 ± 0.0554
0.145 < p < 0.255
22. α = 0.05 and zα/2 = z0.025 = 1.96; p̂ = x/n = 400/2000 = 0.2000
p̂ ± zα/2·√(p̂q̂/n)
0.2000 ± 1.96·√[(0.2000)(0.8000)/2000]
0.2000 ± 0.0175
0.182 < p < 0.218
23. α = 0.01 and zα/2 = z0.005 = 2.575; p̂ = x/n = 109/1236 = 0.0882
p̂ ± zα/2·√(p̂q̂/n)
0.0882 ± 2.575·√[(0.0882)(0.9118)/1236]
0.0882 ± 0.0207
0.0674 < p < 0.109
24. α = 0.01 and zα/2 = z0.005 = 2.575; p̂ = x/n = 4821/5200 = 0.9271
p̂ ± zα/2·√(p̂q̂/n)
0.9271 ± 2.575·√[(0.9271)(0.0729)/5200]
0.9271 ± 0.0093
0.918 < p < 0.936
25. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.045; p̂ unknown, use p̂ = 0.5
n = [(zα/2)²·p̂q̂]/E²
= [(1.96)²(0.5)(0.5)]/(0.045)² = 474.27, rounded up to 475
26. α = 0.01, zα/2 = z0.005 = 2.575 and E = 0.005; p̂ unknown, use p̂ = 0.5
n = [(zα/2)²·p̂q̂]/E²
= [(2.575)²(0.5)(0.5)]/(0.005)² = 66306.25, rounded up to 66,307
27. α = 0.01, zα/2 = z0.005 = 2.575 and E = 0.02; p̂ estimated to be 0.14
n = [(zα/2)²·p̂q̂]/E²
= [(2.575)²(0.14)(0.86)]/(0.02)² = 1995.82, rounded up to 1996
28. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.03; p̂ estimated to be 0.87
n = [(zα/2)²·p̂q̂]/E²
= [(1.96)²(0.87)(0.13)]/(0.03)² = 482.76, rounded up to 483
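The sample-size calculations in Exercises 25-28 follow one pattern: compute n from the formula, then round up. A minimal Python sketch follows; the name `sample_size_proportion` is hypothetical, and the default p̂ = 0.5 mirrors the convention used above when no estimate is available.

```python
from math import ceil

def sample_size_proportion(z, margin, p_hat=0.5):
    """Smallest whole n whose margin of error does not exceed `margin`."""
    n = (z ** 2) * p_hat * (1 - p_hat) / margin ** 2
    return ceil(n)  # any fractional part is rounded UP, never down

print(sample_size_proportion(1.96, 0.045))         # Exercise 25
print(sample_size_proportion(2.575, 0.02, 0.14))   # Exercise 27
```

When the formula lands exactly on a whole number, floating-point error could push `ceil` one unit too high, so such borderline results are worth checking by hand.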
29. Let x = the number of girls born using the method.
a. p̂ = x/n = 525/574 = 0.9146, rounded to 0.915
b. α = 0.05, zα/2 = z0.025 = 1.96
p̂ ± zα/2·√(p̂q̂/n)
0.9146 ± 1.96·√[(0.9146)(0.0854)/574]
0.9146 ± 0.0229
0.892 < p < 0.937
c. Yes. Since 0.5 is not within the confidence interval, and below the interval, we can be 95%
certain that the method is effective.
30. Let x = the number of boys born using the method.
a. p̂ = x/n = 127/152 = 0.8355, rounded to 0.836
b. α = 0.01, zα/2 = z0.005 = 2.575
p̂ ± zα/2·√(p̂q̂/n)
0.8355 ± 2.575·√[(0.8355)(0.1645)/152]
0.8355 ± 0.0774
0.758 < p < 0.913
c. Yes. Since 0.5 is not within the confidence interval, and below the interval, we can be 99%
certain that the method is effective.
31. Let x = the number of deaths in the week before Thanksgiving.
a. p̂ = x/n = 6062/12000 = 0.5052, rounded to 0.505
b. α = 0.05, zα/2 = z0.025 = 1.96
p̂ ± zα/2·√(p̂q̂/n)
0.5052 ± 1.96·√[(0.5052)(0.4948)/12000]
0.5052 ± 0.0089
0.496 < p < 0.514
c. No. Since 0.5 is within the confidence interval, there is no evidence that people can
temporarily postpone their death in such circumstances.
32. Let x = the number of suits dropped or dismissed.
a. p̂ = x/n = 856/1228 = 0.6971, rounded to 0.697
b. α = 0.01, zα/2 = z0.005 = 2.575
p̂ ± zα/2·√(p̂q̂/n)
0.6971 ± 2.575·√[(0.6971)(0.3029)/1228]
0.6971 ± 0.0338
0.663 < p < 0.731
c. Yes. Since 0.5 is not within the confidence interval, and below the interval, we can be 99%
certain that more than half the suits are dropped or dismissed.
33. Let x = the number of yellow peas
a. α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 152/(428+152) = 152/580 = 0.2621
p̂ ± zα/2·√(p̂q̂/n)
0.2621 ± 1.96·√[(0.2621)(0.7379)/580]
0.2621 ± 0.0358
0.226 < p < 0.298 or 22.6% < p < 29.8%
b. No. Since 0.25 is within the confidence interval, it is a reasonable possibility for the true
population proportion. The results do not contradict the theory.
34. Let x = the number who say they voted.
a. α = 0.01, zα/2 = z0.005 = 2.575 and p̂ = x/n = 701/1002 = 0.6996
p̂ ± zα/2·√(p̂q̂/n)
0.6996 ± 2.575·√[(0.6996)(0.3004)/1002]
0.6996 ± 0.0373
0.662 < p < 0.737
b. No. Since 0.61 is not within the confidence interval, the results are not consistent with the
actual voter turnout. It appears that people do not tell the truth about whether they voted.
35. Let x = the number that develop those types of cancer.
a. α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 135/420095 = 0.0003214
p̂ ± zα/2·√(p̂q̂/n)
0.0003214 ± 1.96·√[(0.0003214)(0.9996786)/420095]
0.0003214 ± 0.0000542
0.000267 < p < 0.000376 or 0.0267% < p < 0.0376%
b. No. Since 0.0340% = 0.000340 is within the confidence interval, it is a reasonable
possibility for the true population value. The results do not provide evidence that cell phone
users have a different cancer rate than the general population.
36. Let x = the number who think global warming demands priority attention.
a. p̂ = x/n = 939/1708 = 0.5498, rounded to 0.550 or 55.0%
b. α = 0.01, zα/2 = z0.005 = 2.575
p̂ ± zα/2·√(p̂q̂/n)
0.5498 ± 2.575·√[(0.5498)(0.4502)/1708]
0.5498 ± 0.0310
0.519 < p < 0.581
c. Yes. Since 0.5 is not within the confidence interval, and below the interval, we can be 99%
certain that more than half the population thinks global warming demands such attention.
37. Let x = the number who say they use the Internet.
α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = [2198]/3011 = 0.73
NOTE: The value x=[2198] was not given. In truth, any 2183 ≤ x ≤ 2213 rounds to the given
p̂ = x/3011 = 73%. For want of a more precise value, p̂ = 0.73 is used in the calculations.
Technically, this should limit the exercise to two significant digit accuracy.
p̂ ± zα/2·√(p̂q̂/n)
0.73 ± 1.96·√[(0.73)(0.27)/3011]
0.73 ± 0.0159
0.714 < p < 0.746
No. Since 0.75 is not within the confidence interval, it is not likely to be the correct value of
the population proportion and should not be reported as such. In this particular exercise,
however, the above NOTE indicates that the third significant digit in the confidence interval
endpoints is not reliable – and if p̂ is really 2213/3011 = 0.73497, for example, the confidence
interval is 0.719 < p < 0.751 and 75% is acceptable.
38. Let x = the number who display little or no knowledge of the company.
α = 0.01, zα/2 = z0.005 = 2.575 and p̂ = x/n = [71]/150 = 0.47
NOTE: The value x=[71] was not given. In truth, any 70 ≤ x ≤ 71 rounds to the given
p̂ = x/150 = 47%. For want of a more precise value, p̂ = 0.47 is used in the calculations.
Technically, this should limit the exercise to two significant digit accuracy.
p̂ ± zα/2·√(p̂q̂/n)
0.47 ± 2.575·√[(0.47)(0.53)/150]
0.47 ± 0.1049
0.365 < p < 0.575
Yes. Since 0.50 is within the confidence interval, it is a likely value for the true population
proportion.
39. Let x = the number who indicate the outbreak would deter them from taking a cruise.
α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = [21302]/34358 = 0.62
NOTE: The value x=[21302] was not given. In truth, any 21131 ≤ x ≤ 21473 rounds to the
given p̂ = x/34358 = 62%. For want of a more precise value, p̂ = 0.62 is used in the
calculations. Technically, this should limit the exercise to two significant digit accuracy.
p̂ ± zα/2·√(p̂q̂/n)
0.62 ± 1.96·√[(0.62)(0.38)/34358]
0.62 ± 0.0051
0.615 < p < 0.625
No. Since the sample is a voluntary response sample, the respondents are not likely to be
representative of the population.
40. Let x = the number who correctly identified which hand Emily selected.
a. If the touch therapists made random guesses, one would expect the proportion of correct
responses to be 0.50 regardless of how Emily chose which hand to use – even if, for
example, she always used the right hand.
b. p̂ = x/n = 123/280 = 0.4393, rounded to 0.439
c. α = 0.01, zα/2 = z0.005 = 2.575
p̂ ± zα/2·√(p̂q̂/n)
0.4393 ± 2.575·√[(0.4393)(0.5607)/280]
0.4393 ± 0.0764
0.363 < p < 0.516
d. Since the confidence interval includes 0.50, the therapists’ success rate is consistent with
chance guessing. There is no evidence that the professional touch therapists have any
special ability in this area.
41. α = 0.01, zα/2 = z0.005 = 2.575 and E = 0.02
a. p̂ unknown, use p̂ =0.5
n = [(zα/2)²·p̂q̂]/E² = [(2.575)²(0.5)(0.5)]/(0.02)² = 4144.14, rounded up to 4145
b. p̂ estimated to be 0.73
n = [(zα/2)²·p̂q̂]/E² = [(2.575)²(0.73)(0.27)]/(0.02)² = 3267.24, rounded up to 3268
42. α = 0.10, zα/2 = z0.05 = 1.645 and E = 0.04
a. p̂ unknown, use p̂ =0.5
n = [(zα/2)²·p̂q̂]/E² = [(1.645)²(0.5)(0.5)]/(0.04)² = 422.82, rounded up to 423
b. p̂ estimated to be 0.08
n = [(zα/2)²·p̂q̂]/E² = [(1.645)²(0.08)(0.92)]/(0.04)² = 124.48, rounded up to 125
43. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.03; p̂ unknown, use p̂ = 0.5
n = [(zα/2)²·p̂q̂]/E² = [(1.96)²(0.5)(0.5)]/(0.03)² = 1067.11, rounded up to 1068
44. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.02; p̂ estimated to be 0.10
n = [(zα/2)²·p̂q̂]/E² = [(1.96)²(0.10)(0.90)]/(0.02)² = 864.36, rounded up to 865
45. Let x = the number of green M&M’s.
α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 19/100 = 0.19
p̂ ± zα/2·√(p̂q̂/n)
0.1900 ± 1.96·√[(0.19)(0.81)/100]
0.1900 ± 0.0769
0.113 < p < 0.267 or 11.3% < p < 26.7%
Yes. Since 0.160 is within the confidence interval, this result is consistent with the claim
that the true population proportion is 16%.
46. Let x = the number of students who gained weight during their freshman year.
Of the 67 students, 17 lost weight and 5 stayed the same and 45 gained weight.
a. p̂ = x/n = 45/67 = 0.6716, rounded to 0.672 or 67.2%
b. α = 0.05, zα/2 = z0.025 = 1.96
p̂ ± zα/2·√(p̂q̂/n)
0.6716 ± 1.96·√[(0.6716)(0.3284)/67]
0.6716 ± 0.1125
0.559 < p < 0.784 or 55.9% < p < 78.4%
c. It is estimated that 67% of U.S. college students gain weight during their freshman year.
This result comes from a study of 67 men and women published in the Journal of American
College Health, Vol. 55, No. 1. That estimate has an 11% margin of error with a 95%
confidence level. In other words, 95% of all such studies can be expected to produce
estimates that are within 11% of the true proportion of all college students who gain weight
during their freshman year.
47. Let x = the number of days with precipitation.
α = 0.05, zα/2 = z0.025 = 1.96
Wednesdays: p̂ = x/n = 16/53 = 0.3019
p̂ ± zα/2·√(p̂q̂/n)
0.3019 ± 1.96·√[(0.3019)(0.6981)/53]
0.3019 ± 0.1236
0.178 < p < 0.425
Sundays: p̂ = x/n = 15/52 = 0.2885
p̂ ± zα/2·√(p̂q̂/n)
0.2885 ± 1.96·√[(0.2885)(0.7115)/52]
0.2885 ± 0.1231
0.165 < p < 0.412
The confidence intervals are similar. It does not appear to rain more on either day.
48. Let x = the number of movies with R ratings.
α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 12/35 = 0.3429
p̂ ± zα/2·√(p̂q̂/n)
0.3429 ± 1.96·√[(0.3429)(0.6571)/35]
0.3429 ± 0.1573
0.186 < p < 0.500
No. To more significant digits, the upper limit of the confidence interval is 0.500114. Since the
confidence interval includes values higher than 0.50, it is not an unreasonable possibility that
the proportion of movies rated R is greater than ½. We cannot conclude that most movies have
a rating different from R.
49. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.03; p̂ unknown, use p̂ =0.5
n = N·p̂q̂[zα/2]² / {p̂q̂[zα/2]² + (N–1)E²}
= (12784)(0.5)(0.5)[1.96]² / {(0.5)(0.5)[1.96]² + (12783)(0.03)²}
= 12277.8/12.4651
= 984.97, rounded up to 985
No. The sample size is not too much lower than the n=1068 required for a population of
millions of people.
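The finite-population version of the sample-size formula in Exercise 49 can be checked with a short Python sketch. The function name `sample_size_proportion_finite` is hypothetical; the formula is the one written out in the exercise.

```python
from math import ceil

def sample_size_proportion_finite(N, z, margin, p_hat=0.5):
    """Sample size for a proportion when the population size N is known."""
    pq_z2 = p_hat * (1 - p_hat) * z ** 2              # p_hat * q_hat * z^2
    n = N * pq_z2 / (pq_z2 + (N - 1) * margin ** 2)
    return ceil(n)  # round up to the next whole number

# Exercise 49: N = 12784, z = 1.96, E = 0.03, p-hat unknown
print(sample_size_proportion_finite(12784, 1.96, 0.03))
```

As N grows, the result approaches the infinite-population value from Exercise 43, which is why the two answers are close here.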
50. α = 0.05, zα = z0.05 = 1.645 and p̂ = x/n = 630/750 = 0.8400
p̂ – zα·√(p̂q̂/n)
0.8400 – 1.645·√[(0.8400)(0.1600)/750]
0.8400 – 0.0220
0.818 < p
The interval is expressed as p > 0.818. The desired figure is 81.8%.
51. α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 3/8 = 0.3750
p̂ ± zα/2·√(p̂q̂/n)
0.3750 ± 1.96·√[(0.3750)(0.6250)/8]
0.3750 ± 0.3355
0.0395 < p < 0.710
Yes. The results are “reasonably close” – being shifted down 4.5% from the correct interval
0.085 < p < 0.755. But depending on the context, such an error could be serious.
52. α = 0.01, zα/2 = z0.005 = 2.575 and p̂ = x/n = 95/100 = 0.9500
p̂ ± zα/2·√(p̂q̂/n)
0.9500 ± 2.575·√[(0.9500)(0.0500)/100]
0.9500 ± 0.0561
0.894 < p < 1.006
This interval is noteworthy because the upper limit is greater than 1, the maximum possible
value for p in any problem. This occurs because the normal distribution used is only an
approximation to the binomial. In this case the approximation is barely appropriate – since
nq ≈ 100(0.05) = 5, the minimum acceptable value to use the normal to approximate the
binomial. In such cases the interval should be reported as 0.894<p<1. NOTE: Do not use
0.894<p ≤ 1, because the presence of 5 tails indicates that p=1 is not true.
53. a. If p̂ = x/n = 0/n = 0, then
(1) np ≈ 0 < 5, and the normal approximation to the binomial does not apply.
(2) E = zα/2·√(p̂q̂/n) = 0, and there is no meaningful interval.
b. Since p̂ = x/n = 0/20 = 0, use 3/n = 3/20 = 0.15 as the 95% upper bound for p.
NOTE: The corresponding interval would be 0 ≤ p<0.15. Do not use 0<p<0.15, because the
failure to observe any successes in the sample does not rule out p=0 as the true population
proportion.
54. Since “19 cases out of 20” implies 19/20 = 0.95 = 95% confidence, use α = 0.05.
Since p̂ is unknown, use p̂ = 0.5.
n = [(zα/2)²·p̂q̂]/E²
= [(1.96)²(0.5)(0.5)]/(0.01)² = 9604
7-3 Estimating a Population Mean: σ Known
1. A point estimate is a single value used to estimate a population parameter. If the parameter in
question is the mean of a population, the best point estimate is the mean of a random sample
from that population.
2. No. The list of the employees at her facility from which she obtained her simple random
sample is itself a convenience sample. Those employees are likely not representative of the
population by age, gender, ethnicity, or other factors that may affect leg length.
3. It is estimated that the mean height of U.S. women is 63.195 inches. This result comes from
the Third National Health and Nutrition Examination Survey of the U.S. Department of Health
and Human Services. It is based on an in-depth study of 40 women and assumes a population
standard deviation of 2.5 inches. The estimate has a margin of error of 0.775 inches with a
95% level of confidence. In other words, 95% of all such studies can be expected to produce
estimates that are within 0.775 inches of the true population mean height of all U.S. women.
4. While any one particular estimate for a population parameter may not be correct, a statistic is
an unbiased estimator of a population parameter if its long-run expected value (i.e., its mean
value) is equal to the true value of the parameter being estimated.
5. For 90% confidence, α = 1–0.90 = 0.10 and α/2 = 0.10/2 = 0.05.
For the upper 0.05, A = 0.9500 and z = 1.645.
zα/2 = z0.05 = 1.645
6. For 98% confidence, α = 1–0.98 = 0.02 and α/2 = 0.02/2 = 0.01.
For the upper 0.01, A = 0.9900 [0.9901] and z = 2.33.
zα/2 = z0.01 = 2.33
7. For α = 0.20, α/2 = 0.20/2 = 0.10.
For the upper 0.10, A = 0.9000 and z = 1.28.
zα/2 = z0.10 = 1.28
8. For α = 0.04, α/2 = 0.04/2 = 0.02.
For the upper 0.02, A = 0.9800 [0.9798] and z = 2.05.
zα/2 = z0.02 = 2.05
9. Since σ is known and n>30, the methods of this section may be used.
α = 0.05, zα/2 = z0.025 = 1.96
E = zα/2·σ/√n = 1.96(68)/√50 = 18.8 FICO units
x̄ ± E
677.0 ± 18.8
658.2 < μ < 695.8 (FICO units)
NOTE: The above interval assumes x̄ = 677.0. Technically, the failure to report x̄ to tenths
limits the endpoints of the confidence interval to whole number accuracy.
10. Since σ is known and n>30, the methods of this section may be used.
α = 0.05, zα/2 = z0.025 = 1.96
E = zα/2·σ/√n = 1.96(7)/√32 = 2.4 feet
x̄ ± E
137.0 ± 2.4
134.6 < μ < 139.4 (feet)
NOTE: The above interval assumes x̄ = 137.0. Technically, the failure to report x̄ to tenths
limits the endpoints of the confidence interval to whole number accuracy.
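The interval for a mean with σ known, as in Exercises 9 and 10, can be sketched the same way as the proportion interval. The function name `mean_interval_sigma_known` is hypothetical, and the z value is supplied by the caller so that the manual's table value can be used.

```python
from math import sqrt

def mean_interval_sigma_known(x_bar, sigma, n, z):
    """Return (E, lower, upper) for a mean when sigma is known."""
    margin = z * sigma / sqrt(n)   # E = z * sigma / sqrt(n)
    return margin, x_bar - margin, x_bar + margin

# Exercise 9: x-bar = 677.0, sigma = 68, n = 50, z = 1.96
margin, lower, upper = mean_interval_sigma_known(677.0, 68, 50, 1.96)
print(f"E = {margin:.1f}; {lower:.1f} < mu < {upper:.1f}")
```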
11. Since n<30 and the population is far from normal, the methods of this section may not be used.
12. Since n<30 and the population is far from normal, the methods of this section may not be used.
13. α = 0.05, zα/2 = z0.025 = 1.96
n = [zα/2·σ/E]² = [(1.96)(68)/(3)]² = 1973.73, rounded up to 1974
14. α = 0.01, zα/2 = z0.005 = 2.575
n = [zα/2·σ/E]² = [(2.575)(7)/(2)]² = 81.23, rounded up to 82
15. α = 0.01, zα/2 = z0.005 = 2.575
n = [zα/2·σ/E]² = [(2.575)(0.212)/(0.010)]² = 2980.07, rounded up to 2981
16. α = 0.05, zα/2 = z0.025 = 1.96
n = [zα/2·σ/E]² = [(1.96)(18.6)/(2)]² = 332.26, rounded up to 333
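The sample-size formula for a mean used in Exercises 13-16 is a one-line calculation followed by rounding up. A minimal Python sketch, with the hypothetical helper name `sample_size_mean`:

```python
from math import ceil

def sample_size_mean(z, sigma, margin):
    """Smallest whole n giving a margin of error no larger than `margin`."""
    return ceil((z * sigma / margin) ** 2)

print(sample_size_mean(1.96, 68, 3))    # Exercise 13
print(sample_size_mean(2.575, 7, 2))    # Exercise 14
```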
17. x̄ = 21.12 mg
18. 19.853 < μ < 22.387 (mg)
19. E = (U – L)/2 = (22.387 – 19.853)/2 = 1.267
21.12 ± 1.267 (mg)
20. We are 95% confident that the interval from 19.853 mg to 22.387 mg contains the true mean
amount of tar in all king-size, non-filtered, non-menthol, and non-light cigarettes.
21. a. x̄ = 146.22 lbs
b. α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
146.22 ± 1.96(30.86)/√40
146.22 ± 9.56
136.66 < μ < 155.78 (lbs)
22. a. x̄ = $415,953
b. α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
415,953 ± 1.96(463,364)/√40
415,953 ± 143,598
272,355 < μ < 559,551 (dollars)
c. Yes. In this case the confidence interval includes the true population mean.
23. a. x̄ = 58.3 seconds
b. α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
58.3 ± 1.96(9.5)/√40
58.3 ± 2.9
55.5 < μ < 61.2 (seconds)
c. Yes. Since the confidence interval contains 60 seconds, it is reasonable to assume that the
sample mean was reasonably close to 60 seconds – and it was, in fact, 58.3 seconds.
24. a. x̄ = 4.63 cells/microliter
b. α = 0.01, zα/2 = z0.005 = 2.575
x̄ ± zα/2·σ/√n
4.63 ± 2.575(0.54)/√50
4.63 ± 0.20
4.43 < μ < 4.83 (cells/microliter)
c. The intervals are not directly comparable, since the two given in part (c) are normal ranges
for individual counts and the one calculated in part (b) is a confidence interval for mean
counts. One would expect the confidence interval for mean counts to be well within the
normal ranges for individual counts. The fact that the point estimate and the lower
confidence interval limit for the mean are so close to the lower limit of the normal ranges for
individuals suggests that the sample may consist of persons with lower red blood cell
counts.
25. a. α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
1522 ± 1.96(333)/√125
1522 ± 58
1464 < μ < 1580
b. α = 0.01, zα/2 = z0.005 = 2.575
x̄ ± zα/2·σ/√n
1522 ± 2.575(333)/√125
1522 ± 77
1445 < μ < 1599
c. The 99% confidence interval in part (b) is wider than the 95% confidence interval in part (a).
For an interval to have more confidence associated with it, it must be wider to allow for
more possibilities.
26. a. α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
3433 ± 1.96(495)/√75
3433 ± 112
3321 < μ < 3545 (grams)
b. α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
3433 ± 1.96(495)/√75000
3433 ± 4
3429 < μ < 3437 (grams)
c. The n=75 confidence interval in part (a) is wider than the n=75,000 confidence interval in
part (b). There is less accuracy associated with smaller samples.
27. summary statistics: n = 14, Σx = 1875, x̄ = 133.93
α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
133.93 ± 1.96(10)/√14
133.93 ± 5.24
128.7 < μ < 139.2 (mmHg)
Ideally, there is a sense in which all the measurements should be the same – and in that case
there would be no need for a confidence interval. It is unclear what the given σ = 10 represents
in this situation. Is it the true standard deviation in the values of all people in the population
(in which case it would not be appropriate in this context where only a single person is
involved)? Is it the true standard deviation in momentary readings on a single person (due to
constant biological fluctuations)? Is it the true standard deviation in readings from evaluator to
evaluator (when they are supposedly evaluating the same thing)? Using the methods of this
section and assuming σ = 10, the confidence interval would be 128.7 < μ < 139.2 as given
above even if all the readings were the same.
28. a. summary statistics: n = 10, Σx = 39, x̄ = 3.9
α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
3.9 ± 1.96(2.87)/√10
3.9 ± 1.8
2.1 < μ < 5.7
b. No; since n<30 and the population distribution is not normal, the methods of this section do
not apply. No; since the methods of this section do not apply, the confidence interval does
not provide a good estimate. Even though the confidence interval may include the true
mean, the endpoints of the confidence limits do not carry the supposed level of confidence.
29. summary statistics: n = 35, Σx = 4305, x̄ = 123.00
α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
123.00 ± 1.96(100)/√35
123.00 ± 33.13
89.9 < μ < 156.1 (million dollars)
30. summary statistics: n = 100, Σx = 70311, x̄ = 703.11
α = 0.01, zα/2 = z0.005 = 2.575
x̄ ± zα/2·σ/√n
703.11 ± 2.575(92.2)/√100
703.11 ± 23.74
679.4 < μ < 726.9 (FICO units)
31. α = 0.05, zα/2 = z0.025 = 1.96
n = [zα/2·σ/E]² = [(1.96)(15)/(5)]² = 34.57, rounded up to 35
32. α = 0.01, zα/2 = z0.005 = 2.575
n = [zα/2·σ/E]² = [(2.575)(2.5)/(0.2)]² = 1036.04, rounded up to 1037
33. α = 0.05, zα/2 = z0.025 = 1.96
n = [zα/2·σ/E]² = [(1.96)(10.6)/(0.25)]² = 6906.27, rounded up to 6907
The sample size is too large to be practical.
34. α = 0.10, zα/2 = z0.05 = 1.645
n = [zα/2·σ/E]² = [(1.645)(0.88)/(0.1)]² = 209.55, rounded up to 210
35. α = 0.05, zα/2 = z0.025 = 1.96
Using the range rule of thumb: R = 40,000 – 0 = 40,000, and σ ≈ R/4 = 40,000/4 = 10,000.
n = [zα/2·σ/E]² = [(1.96)(10,000)/(100)]² = 38,416
36. α = 0.05, zα/2 = z0.025 = 1.96
a. Using the range rule of thumb: R = 96 – 56 = 40, and σ ≈ R/4 = 40/4 = 10.
n = [zα/2·σ/E]² = [(1.96)(10)/(2)]² = 96.04, rounded up to 97
b. Using the sample standard deviation: σ ≈ s = 11.297
n = [zα/2·σ/E]² = [(1.96)(11.297)/(2)]² = 122.56, rounded up to 123
c. The two values are relatively close. Since s (which considers all the data) is a better
estimator for σ than R/4 (which is based entirely on the extreme values), the sample size of
123 should be preferred.
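The comparison in Exercise 36 between the range rule of thumb and the sample standard deviation can be reproduced with a short Python sketch. Both function names are hypothetical helpers; `sigma_from_range` encodes the rule σ ≈ R/4 used above.

```python
from math import ceil

def sigma_from_range(low, high):
    """Range rule of thumb: sigma is roughly one quarter of the range."""
    return (high - low) / 4

def sample_size_mean(z, sigma, margin):
    """Smallest whole n giving a margin of error no larger than `margin`."""
    return ceil((z * sigma / margin) ** 2)

sigma_est = sigma_from_range(56, 96)                     # Exercise 36(a)
print(sigma_est, sample_size_mean(1.96, sigma_est, 2))
print(sample_size_mean(1.96, 11.297, 2))                 # Exercise 36(b), using s
```

Because n grows with the square of σ, even a modest difference between the two estimates of σ changes the required sample size noticeably.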
37. Since n/N = 125/200 = 0.625 > 0.05, use the finite population correction factor.
α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± [zα/2·σ/√n]·√[(N–n)/(N–1)]
1522 ± [1.96(333)/√125]·√[(200–125)/(200–1)]
1522 ± [58.3774]·[0.6139]
1522 ± 36
1486 < μ < 1558
The confidence interval becomes narrower because the sample is a larger portion of the
population. As n approaches N, the length of the confidence interval shrinks to 0 – because
when n=N the true mean μ can be determined with certainty.
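The effect of the finite population correction factor in Exercise 37 can be seen numerically. The following Python sketch is illustrative only; `mean_interval_fpc` is a hypothetical name, and the correction factor is √[(N–n)/(N–1)] as in the exercise.

```python
from math import sqrt

def mean_interval_fpc(x_bar, sigma, n, N, z):
    """Return (fpc, E, lower, upper) using the finite population correction."""
    fpc = sqrt((N - n) / (N - 1))            # correction factor, at most 1
    margin = z * sigma / sqrt(n) * fpc       # corrected margin of error
    return fpc, margin, x_bar - margin, x_bar + margin

# Exercise 37: x-bar = 1522, sigma = 333, n = 125, N = 200, z = 1.96
fpc, margin, lower, upper = mean_interval_fpc(1522, 333, 125, 200, 1.96)
print(f"factor = {fpc:.4f}; {lower:.0f} < mu < {upper:.0f}")
```

Setting n equal to N makes the factor 0, which matches the remark that the interval shrinks to a single point when the whole population is sampled.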
38. From Exercise 32: α = 0.01, zα/2 = z0.005 = 2.575 and σ = 2.5 and E = 0.2.
n = Nσ²(zα/2)² / [(N–1)E² + σ²(zα/2)²]
= 500(2.5)²(2.575)² / [(500–1)(0.2)² + (2.5)²(2.575)²]
= 20720.7/61.4014
= 337.46, rounded up to 338
Yes; the information about the population size has a significant effect, dropping the required
sample size from 1037 to 338.
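The drop from 1037 to 338 can be confirmed by evaluating the finite-population formula from Exercise 38 directly. This is a minimal Python sketch; `sample_size_mean_finite` is a hypothetical helper name.

```python
from math import ceil

def sample_size_mean_finite(N, z, sigma, margin):
    """Sample size for a mean when the population size N is known."""
    s2z2 = (sigma * z) ** 2                           # sigma^2 * z^2
    n = N * s2z2 / ((N - 1) * margin ** 2 + s2z2)
    return ceil(n)  # round up to the next whole number

# Exercise 38: N = 500, z = 2.575, sigma = 2.5, E = 0.2
print(sample_size_mean_finite(500, 2.575, 2.5, 0.2))
```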
7-4 Estimating a Population Mean: σ Not Known
1. According to the point estimate (“average”), the parameter of interest is a population mean.
But according to the margin of error (“percentage points”), the parameter of interest is a
population proportion. It is possible that the margin of error the paper intended to
communicate was 1% of $483 (or $4.83, which in a 95% confidence interval would
correspond to a sample standard deviation of $226.57) – but the proper units for the margin of
error in a situation like this are “dollars” and not “percentage points.”
2. Robust against departures from normality means that the requirement that the original
population be approximately normal is not a strong requirement, and that the methods of this
section still give good results if the departure from normality is not too extreme. The methods
of this section are not robust against poor sampling methods, as poor sampling methods can
yield data that are entirely useless.
3. No; the estimate will not be good for at least two reasons. First, the sample is a convenience
sample using the state of California, and California residents may not be representative of the
entire country. Secondly, any survey that involves self-reporting (especially of financial
information) is suspect because people tend to report favorable rather than accurate data.
4. The degrees of freedom in this survey is 4. In general, the degrees of freedom in a problem is
the number of data values that are free to vary without changing the estimate of the parameter
of interest. The estimate of a mean is determined by Σx, and n-1 of the data values are free to
vary so long as the nth value is the one necessary to produce the required Σx.
5. σ unknown, normal population, n=23: use t with df =22
α = 0.05, tdf,α/2= t22,0.025 = 2.074
IMPORTANT NOTE: This manual uses the following conventions.
(1) The designation “df” stands for “degrees of freedom.”
(2) Since the t value depends on the degrees of freedom, a subscript may be used to clarify which t
distribution is being used. For df =15 and α/2 =0.025, for example, one may indicate
t15,α/2 = 2.132. As with the z distribution, it is also acceptable to use the actual numerical value
within the subscript and indicate t15,.025 = 2.132.
(3) Always use the closest entry in Table A-3. When the desired df is exactly halfway between the
two nearest tabled values, be conservative and choose the one with the lower df.
(4) As the degrees of freedom increase, the t distribution approaches the standard normal
distribution – and the “large” row of the t table actually gives z values. Consequently the z
score for certain “popular” α and α/2 values may be found by reading Table A-3 “frontwards”
instead of Table A-2 “backwards.” This is not only easier but also more accurate – since Table
A-3 includes one more decimal place. Note the following examples.
For “large” df and α/2 = 0.05, tα/2 = 1.645 = zα/2 (as found in the z table).
For “large” df and α/2 = 0.01, tα/2 = 2.326 = zα/2 (more accurate than the 2.33 in the z table).
This manual uses this technique from this point on. [For df = “large” and α/2 = 0.005,
tα/2 = 2.576 ≠ 2.575 = zα/2 (as found in the z table). This is a discrepancy caused by using
different mathematical approximation techniques to construct the tables, and not a true
difference. While 2.576 is the more standard value, this manual will continue to use 2.575.]
6. σ known, normal population: use z
α = 0.01, zα/2 = z0.005 = 2.575
7. σ unknown, population not normal, n=6: neither normal nor t applies
8. σ unknown, population not normal, n=40: use t with df =39
α = 0.05, tdf,α/2 = t39,0.025 =2.023
9. σ known, population not normal, n=200: use z
α = 0.10, zα/2 = z0.05 = 1.645
10. σ unknown, population not normal, n=9: neither normal nor t applies
11. σ unknown, population normal, n=12: use t with df = 11
α = 0.01, tdf,α/2 = t11,0.005 = 3.106
12. σ unknown, population not normal, n=38: use t with df =37
α = 0.05, tdf,α/2 = t37,0.025 = 2.026
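The pattern in the answers to Exercises 5-12 can be summarized as a small decision rule. The Python sketch below is an assumption inferred from those answers, treating n > 30 as "large"; the function name `choose_distribution` is hypothetical and is not a procedure stated in the manual.

```python
def choose_distribution(sigma_known, population_normal, n):
    """Return 'z', 't', or 'neither' following the pattern in Exercises 5-12."""
    # With a non-normal population and a small sample, neither method applies.
    if not population_normal and n <= 30:
        return "neither"
    # Otherwise the choice depends only on whether sigma is known.
    return "z" if sigma_known else "t"

print(choose_distribution(False, True, 23))    # Exercise 5
print(choose_distribution(False, False, 6))    # Exercise 7
print(choose_distribution(True, False, 200))   # Exercise 9
```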
13. σ unknown, normal distribution: use t with df = 19
α = 0.05, tdf,α/2 = t19,0.025 = 2.093
a. E = tα/2·s/√n
= 2.093(569)/√20
= 266 dollars
b. x̄ ± E
9004 ± 266
8738 < μ < 9270 (dollars)
14. σ unknown, normal distribution: use t with df = 6
α = 0.01, tdf,α/2 = t6,0.005 = 3.707
a. E = tα/2·s/√n
= 3.707(0.04)/√7
= 0.06 grams/mile
b. x̄ ± E
0.12 ± 0.06
0.06 < μ < 0.18 (grams/mile)
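The t-based interval in Exercises 13 and 14 has the same shape as the z-based one, with s in place of σ and a t critical value in place of z. The sketch below assumes the critical value is read from Table A-3 and passed in as `t_crit`, since Python's standard library has no t quantile function; `mean_interval_t` is a hypothetical name.

```python
from math import sqrt

def mean_interval_t(x_bar, s, n, t_crit):
    """Return (E, lower, upper) for a mean when sigma is unknown."""
    margin = t_crit * s / sqrt(n)   # E = t * s / sqrt(n), with df = n - 1
    return margin, x_bar - margin, x_bar + margin

# Exercise 13: x-bar = 9004, s = 569, n = 20, t(19, 0.025) = 2.093 from the table
margin, lower, upper = mean_interval_t(9004, 569, 20, 2.093)
print(f"E = {margin:.0f}; {lower:.0f} < mu < {upper:.0f}")
```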
15. From the SPSS display: 8.0518 < μ < 8.0903 (grams)
There is 95% confidence that the interval from 8.0518 grams to 8.0903 grams contains the true
mean weight of all U.S. dollar coins in circulation.
16. From the TI-83/84 Plus display: 1.5514 < μ < 2.2706 (lbs)
There is 99% confidence that the interval from 1.5514 lbs to 2.2706 lbs contains the true mean
annual weight of the plastic discarded by U.S. households.
17. a. x̄ = 3.2 mg/dL
b. σ unknown, n > 30: use t with df = 46 [45]
α = 0.05, tdf,α/2 = t46,0.025 = 2.014
x̄ ± tα/2·s/√n
3.2 ± 2.014(18.6)/√47
3.2 ± 5.5
-2.3 < μ < 8.7 (mg/dL)
Since the confidence interval includes 0, there is a reasonable possibility that the true value
is zero – i.e., that the Garlicin treatment has no effect on LDL cholesterol levels.
18. a. x̄ = 3103 grams
b. σ unknown, n > 30: use t with df = 185 [200]
α = 0.05, tdf,α/2 = t185,0.025 = 1.972
x̄ ± tα/2·s/√n
3103 ± 1.972(696)/√186
3103 ± 101
3002 < μ < 3204 (grams)
c. Yes. Since the confidence interval for the mean birth weight for mothers who used cocaine
is entirely below the confidence interval in part (b), it appears that cocaine use is associated
with lower birth weights.
19. a. x̄ = 98.20 °F
b. σ unknown, n > 30: use t with df = 105 [100]
α = 0.01, tdf,α/2 = t105,0.005 = 2.626
x̄ ± tα/2·s/√n
98.20 ± 2.626(0.62)/√106
98.20 ± 0.16
98.04 < μ < 98.36 (°F)
c. No, the confidence interval does not contain the value 98.6 °F. This suggests that the
common belief that 98.6 °F is the normal body temperature may not be correct.
20. a. x̄ = 2.1 lbs
b. σ unknown, n > 30: use t with df = 39
α = 0.01, tdf,α/2 = t39,0.005 = 2.708
x̄ ± tα/2·s/√n
2.1 ± 2.708(4.8)/√40
2.1 ± 2.1
0 < μ < 4.2 (lbs)
c. Yes; since the confidence interval does not include 0, the diet appears to be effective.
No; since the amount of weight loss is so small, the diet does not appear to be practical.
21. a. σ unknown, n > 30: use t with df = 336 [300]
α = 0.05, tdf,α/2 = t336,0.025 = 1.968
x̄ ± tα/2·s/√n
6.0 ± 1.968(2.3)/√337
6.0 ± 0.2
5.8 < μ < 6.2 (days)
b. σ unknown, n > 30: use t with df = 369 [400]
α = 0.05, tdf,α/2 = t369,0.025 = 1.966
x̄ ± tα/2·s/√n
6.1 ± 1.966(2.4)/√370
6.1 ± 0.2
5.9 < μ < 6.3 (days)
c. The two confidence intervals are very similar and overlap considerably. There is no
evidence that the echinacea treatment is effective.
22. a. σ unknown, n > 30: use t with df = 141 [100]
α = 0.05, tdf,α/2 = t141,0.025 = 1.984
x̄ ± tα/2·s/√n
1.8 ± 1.984(1.4)/√142
1.8 ± 0.2
1.6 < μ < 2.0 (headaches)
b. σ unknown, n > 30: use t with df = 79 [80]
α = 0.05, tdf,α/2 = t79,0.025 = 1.990
x̄ ± tα/2·s/√n
1.6 ± 1.990(1.2)/√80
1.6 ± 0.3
1.3 < μ < 1.9 (headaches)
c. The two confidence intervals are very similar and overlap considerably. There is no
evidence that the acupuncture treatment is effective.
23. a. σ unknown, n ≤ 30: if approximately normal distribution, use t with df = 19
α = 0.05, tdf,α/2 = t19,0.025 = 2.093
x̄ ± tα/2·s/√n
5.0 ± 2.093(2.4)/√20
5.0 ± 1.1
3.9 < μ < 6.1 (VAS units)
b. σ unknown, n ≤ 30: if approximately normal distribution, use t with df = 19
α = 0.05, tdf,α/2 = t19,0.025 = 2.093
x̄ ± tα/2·s/√n
4.7 ± 2.093(2.9)/√20
4.7 ± 1.4
3.3 < μ < 6.1 (VAS units)
c. The two confidence intervals are very similar and overlap considerably. There is no
evidence that the magnet treatment is effective.
24. a. σ unknown, n > 30: use t with df = 78 [80]
α = 0.01, tdf,α/2 = t78,0.005 = 2.639
x̄ ± tα/2·s/√n
35.8 ± 2.639(11.3)/√79
35.8 ± 3.4
32.4 < μ < 39.2 (years)
b. σ unknown, n > 30: use t with df = 78 [80]
α = 0.01, tdf,α/2 = t78,0.005 = 2.639
x̄ ± tα/2·s/√n
43.8 ± 2.639(8.9)/√79
43.8 ± 2.6
41.2 < μ < 46.4 (years)
c. The two confidence intervals differ considerably and do not overlap at all. Women Oscar
winners are considerably younger than their male counterparts. Either women and men
reach their peak acting ability at different ages, or the standards for judging women and
men are not really the same.
25. preliminary values: n = 6, Σx = 9.23, Σx² = 32.5197
x̄ = (Σx)/n = (9.23)/6 = 1.538
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [6(32.5197) – (9.23)²]/[6(5)] = 3.664
s = 1.914
σ unknown (and assuming the distribution is approximately normal), use t with df = 5
α = 0.05, tdf,α/2 = t5,0.025 = 2.571
x̄ ± tα/2·s/√n
1.538 ± 2.571(1.914)/√6
1.538 ± 2.009
-0.471 < μ < 3.547 [which should be adjusted, since negative values are not possible]
0 < μ < 3.547 (micrograms/cubic meter)
Yes. The fact that 5 of the 6 sample values are below x raises a question about whether the
data meet the requirement that the underlying distribution is normal.
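The "preliminary values" step used in Exercise 25 and the exercises that follow relies on the shortcut formula s² = [n(Σx²) – (Σx)²]/[n(n-1)]. A minimal sketch of that computation, with the hypothetical helper name stats_from_sums, could look like this:

```python
import math


def stats_from_sums(n, sum_x, sum_x2):
    """Return (mean, variance, standard deviation) from n, Σx and Σx²."""
    mean = sum_x / n
    # Shortcut formula for the sample variance
    variance = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))
    return mean, variance, math.sqrt(variance)


# Exercise 25: n = 6, Σx = 9.23, Σx² = 32.5197
mean, variance, sd = stats_from_sums(6, 9.23, 32.5197)
print(round(mean, 3), round(variance, 3), round(sd, 3))
```

Rounded to three decimal places, the output reproduces the 1.538, 3.664 and 1.914 shown in the exercise. The shortcut formula can lose precision when Σx² and (Σx)²/n are nearly equal, so it is best suited to hand-checkable data like these.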
26. preliminary values: n = 7, Σx = 0.85, Σx² = 0.1123
x̄ = (Σx)/n = (0.85)/7 = 0.121
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [7(0.1123) – (0.85)²]/[7(6)] = 0.001514
s = 0.0389
σ unknown (and assuming the distribution is approximately normal), use t with df = 6
α = 0.02, tdf,α/2 = t6,0.01 = 3.143
x̄ ± tα/2·s/√n
0.121 ± 3.143(0.0389)/√7
0.121 ± 0.046
0.075 < μ < 0.168 (grams/mile)
No. Since the confidence interval includes values greater than 0.165, there is a reasonable
possibility that the true mean emission amount is greater than that.
NOTE: This is a two-sided 98% confidence interval, and the requirement is one-sided (i.e., that
μ < 0.165). This means that the level of significance associated with the interval may not be
the same level of significance associated with a conclusion about the requirement.
27. preliminary values: n = 10, Σx = 204.0, Σx² = 5494.72
x̄ = (Σx)/n = (204.0)/10 = 20.40
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [10(5494.72) – (204.0)²]/[10(9)] = 148.124
s = 12.171
a. σ unknown (and assuming the distribution is approximately normal), use t with df = 9
α = 0.05, tdf,α/2 = t9,0.025 = 2.262
x̄ ± tα/2·s/√n
20.40 ± 2.262(12.171)/√10
20.40 ± 8.71
11.7 < μ < 29.1 (million dollars)
b. No. Since the data are the top 10 salaries, they are not a random sample.
c. There is a sense in which the data are the population (i.e., the top ten salaries) and are not a
sample of any population. Possible populations from which the data could be considered a
sample (but not a representative sample appropriate for any statistical inference) would be
the salaries of all TV personalities or the top 10 salaries of TV personalities in
different years.
d. No. Since no population can be identified from which these data are a random sample, the
confidence interval has no context and makes no sense.
28. preliminary values: n = 12, Σx = 1461, Σx² = 182,435
x̄ = (Σx)/n = (1461)/12 = 121.75
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [12(182435) – (1461)²]/[12(11)] = 414.386
s = 20.356
a. σ unknown (and assuming the distribution is approximately normal), use t with df = 11
α = 0.01, tdf,α/2 = t11,0.005 = 3.106
x̄ ± tα/2·s/√n
121.75 ± 3.106(20.356)/√12
121.75 ± 18.25
103.5 < μ < 140.0 (minutes)
b. While it is tempting to add 30 minutes to the upper confidence interval limit associated with
the mean times, it is not appropriate to make a decision about individual times based on the
distribution of the means. Without knowing the distribution of the lengths of the individual
films, and without assuming they are normally distributed, it is still possible to give the
manager some guidance. Defining an outlier to be any value more than two standard
deviations from the mean, the usual maximum and minimum film lengths are:
usual min: 121.75 – 2(20.356) = 81.0 minutes
usual max: 121.75 + 2(20.356) = 162.5 minutes
If the manager allowed 162.5 + 30 = 192.5 minutes between showings, he would
accommodate all but the unusually long films. In practice, in order to use round numbers
and err slightly on the conservative side, he should consider a regular schedule of 195
minutes (i.e., 3 hours and 15 minutes) between feature showings.
29. preliminary values: n = 12, Σx = 52118, Σx² = 228,072,688
x̄ = (Σx)/n = (52118)/12 = 4343.17
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [12(228072688) – (52118)²]/[12(11)] = 155957.06
s = 394.91
σ unknown (and assuming the distribution is approximately normal), use t with df = 11
α = 0.05, tdf,α/2 = t11,0.025 = 2.201
x̄ ± tα/2·s/√n
4343.17 ± 2.201(394.91)/√12
4343.17 ± 250.91
4092.2 < μ < 4594.1 (seconds)
30. preliminary values: n = 43, Σx = 2358, Σx² = 130,930
x̄ = (Σx)/n = (2358)/43 = 54.837
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [43(130930) – (2358)²]/[43(42)] = 38.663
s = 6.218
σ unknown and n > 30, use t with df = 42 [40]
α = 0.01, tdf,α/2 = t42,0.005 = 2.704
x̄ ± tα/2·s/√n
54.837 ± 2.704(6.218)/√43
54.837 ± 2.564
52.3 < μ < 57.4 (years)
There is a sense in which the data are the population (i.e., the ages at inauguration of all US
Presidents) and are not a sample of any population. Possible populations from which the data
could be considered a sample (but not a representative sample appropriate for any statistical
inference) would be the ages of all US adults, the ages upon taking office of world heads of
state, and the ages at inauguration of all past, present, and future US presidents.
31. a. preliminary values: n = 25, Σx = 31.4, Σx² = 40.74
x̄ = (Σx)/n = (31.4)/25 = 1.256
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [25(40.74) – (31.4)²]/[25(24)] = 32.54/600 = 0.0542
s = 0.2329
σ unknown (and assuming the distribution is approximately normal), use t with df = 24
α = 0.05, tdf,α/2 = t24,0.025 = 2.064
x̄ ± tα/2·s/√n
1.256 ± 2.064(0.2329)/√25
1.256 ± 0.096
1.16 < μ < 1.35 (mg)
NOTE: The Minitab output for this exercise is given below.
Variable   N    Mean      StDev     SE Mean   95% CI
nicotine   25   1.25600   0.23288   0.04658   (1.15987, 1.35213)
b. preliminary values: n = 25, Σx = 22.9, Σx² = 22.45
x̄ = (Σx)/n = (22.9)/25 = 0.916
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [25(22.45) – (22.9)²]/[25(24)] = 36.84/600 = 0.0614
s = 0.2478
σ unknown (and assuming the distribution is approximately normal), use t with df = 24
α = 0.05, tdf,α/2 = t24,0.025 = 2.064
x̄ ± tα/2·s/√n
0.916 ± 2.064(0.2478)/√25
0.916 ± 0.102
0.81 < μ < 1.02 (mg)
NOTE: The Minitab output for this exercise is given below.
Variable   N    Mean       StDev      SE Mean    95% CI
nicotine   25   0.916000   0.247790   0.049558   (0.813717, 1.018283)
c. There is no overlap in the confidence intervals. Yes; since the CI for the filtered cigarettes
is completely below the CI for the unfiltered cigarettes, the filters appear to be effective in
reducing the amounts of nicotine.
32. a. preliminary values: n = 40, Σx = 2776, Σx² = 197632
x̄ = (Σx)/n = (2776)/40 = 69.4
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [40(197632) – (2776)²]/[40(39)] = 127.631
s = 11.297
σ unknown and n > 30, use t with df = 39
α = 0.05, tdf,α/2 = t39,0.025 = 2.023
x̄ ± tα/2·s/√n
69.4 ± 2.023(11.297)/√40
69.4 ± 3.6
65.8 < μ < 73.0 (beats/min)
NOTE: The Minitab output for this exercise is given below.
Variable   N    Mean      StDev     SE Mean   95% CI
PULSE      40   69.4000   11.2974   1.7863    (65.7869, 73.0131)
b. preliminary values: n = 40, Σx = 3052, Σx² = 238960
x̄ = (Σx)/n = (3052)/40 = 76.3
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [40(238960) – (3052)²]/[40(39)] = 156.215
s = 12.499
σ unknown and n > 30, use t with df = 39
α = 0.05, tdf,α/2 = t39,0.025 = 2.023
x̄ ± tα/2·s/√n
76.3 ± 2.023(12.499)/√40
76.3 ± 4.0
72.3 < μ < 80.3 (beats/min)
NOTE: The Minitab output for this exercise is given below.
Variable   N    Mean      StDev     SE Mean   95% CI
PULSE      40   76.3000   12.4986   1.9762    (72.3027, 80.2973)
c. Since the two confidence intervals overlap, we cannot conclude that the two population
means are different. But recall the CAUTION in this section that the overlapping of
confidence intervals should not be used for making formal and final conclusions about
equality of means.
33. preliminary values: n = 43, Σx = 2738, Σx² = 307,250
x̄ = (Σx)/n = (2738)/43 = 63.674
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [43(307250) – (2738)²]/[43(42)] = 3164.511
s = 56.254
σ unknown and n > 30, use t with df = 42 [40]
α = 0.01, tdf,α/2 = t42,0.005 = 2.704
x̄ ± tα/2·s/√n
63.674 ± 2.704(56.254)/√43
63.674 ± 23.197
40.5 < μ < 86.9 (years)
Yes, the confidence interval changes considerably from the previous 52.3 < μ < 57.4.
Yes, apparently confidence interval limits can be very sensitive to outliers.
When apparent outliers are discovered in data sets they should be carefully examined to
determine if an error has been made. If an error has been made that cannot be corrected, the
value should be discarded. If the value appears to be valid, it may be informative to construct
confidence intervals with and without the outlier.
34. preliminary values: n = 43, Σx = 2358, Σx² = 130,930
x̄ = (Σx)/n = (2358)/43 = 54.837
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [43(130930) – (2358)²]/[43(42)] = 38.663
s = 6.218
σ unknown and n > 30, alternate method says use s for σ and use z
α = 0.01, zα/2 = z0.005 = 2.575
x̄ ± zα/2·σ/√n
54.837 ± 2.575(6.218)/√43
54.837 ± 2.441
52.4 < μ < 57.3 (years)
For any α, the z value is smaller than the corresponding t value – although the difference
decreases as n increases. This creates a smaller E and a narrower confidence interval than one
is entitled to – i.e., it does not take into consideration the extra uncertainty created by using the
sample s instead of the population σ. And so the confidence interval found by the alternative
method will always be narrower, but usually by a very small amount. In some situations,
however, the unjustified narrowness of the interval could lead to incorrect conclusions.
35. assuming a large population:
α = 0.05 & df = 99 [100], tdf,α/2 = t99,0.025 = 1.984
E = tα/2·s/√n
= 1.984(0.0518)/√100
= 0.0103 g
x̄ ± E
0.8565 ± 0.0103
0.8462 < μ < 0.8668 (grams)
using the finite population N = 465:
α = 0.05 & df = 99 [100], tdf,α/2 = t99,0.025 = 1.984
E = [tα/2·s/√n] × √[(N-n)/(N-1)]
= [1.984(0.0518)/√100] × √(365/464)
= 0.0091 g
x̄ ± E
0.8565 ± 0.0091
0.8474 < μ < 0.8656 (grams)
The second confidence interval is narrower, reflecting the fact that there are more restrictions
and less variability (and more certainty) in the finite population situation when n>.05N.
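The effect of the finite population correction in Exercise 35 can be sketched as below. The function name margin_of_error and its optional population_size parameter are assumptions for illustration; the t value 1.984 is the table value used in the exercise.

```python
import math


def margin_of_error(t_crit, s, n, population_size=None):
    """Margin of error for a mean; applies the finite population
    correction sqrt((N - n)/(N - 1)) when population_size (N) is given."""
    margin = t_crit * s / math.sqrt(n)
    if population_size is not None:
        margin *= math.sqrt((population_size - n) / (population_size - 1))
    return margin


# Exercise 35: n = 100, s = 0.0518, t = 1.984, with and without N = 465
print(round(margin_of_error(1.984, 0.0518, 100), 4))
print(round(margin_of_error(1.984, 0.0518, 100, population_size=465), 4))
```

Because the correction factor is always less than 1 when n > 1, the corrected margin of error is never larger than the uncorrected one, which is why the second interval is narrower.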
36. a. In general, one sample value gives no information about the variation of the variable. It is
possible, however, that one value plus other considerations can give some insight. If one
knows that 0 is a possible value, for example, then one large sample value would indicate a
large variance. [For example: If you take a sample of n=1 of the daily snowfall in a US city
and find that 10.0 feet of snow fell that day, you would assume that there are days with no
snow and that there must be a large variability in the amounts of daily snowfall.]
b. The formula for E requires a value for s and a t score with n-1 degrees of freedom. When
n=1, the formula for s fails to produce a value [because there is an (n-1) in the denominator]
and there is no df=0 row for the t statistic. No confidence interval can be constructed.
c. x ± 9.68|x|
12.0 ± 9.68|12.0|
12.0 ± 116.2
-104.2 < μ < 128.2 [which should be adjusted, since negative heights are not possible]
0 < μ < 128.2 (feet)
Is it likely that some other randomly selected Martian may be 50 feet tall?
No, if “likely” is understood to be “highly probable.” The range for individual heights
would be even larger than the 0 – 128 given for the mean. With so many possibilities over
such a wide range, 50 (or any other individual value) is not highly probable.
Yes, if “likely” is understood to be “reasonable.” Since the confidence interval includes
the value 50, it is a reasonable possibility for the mean height of all Martians – and any
possible mean height would be a possible individual height.
7-5 Estimating a Population Variance
1. We can be 95% confident that the interval from 0.0455 grams to 0.0602 grams includes the true
value of the standard deviation in the weights for the population of all M&M’s.
2. Yes; (0.0455 g, 0.0602 g) is another format for indicating the confidence interval given in
Exercise 1 – although the format in Exercise 1 has the advantage of indicating that the
parameter of interest is σ, the population standard deviation. No; while the given expression
yields the same endpoints given in Exercise 1, it falsely implies that the point estimate for the
parameter in question is 0.05285 grams.
3. No; the population of last two digits from 00 to 99 follows a uniform distribution and not a
normal distribution. One of the requirements for using the methods of this section is that the
population values have a distribution that is approximately normal – even if the sample size is
large.
4. An unbiased estimator is one whose long-run average value is equal to the true value of the
population parameter it estimates. The sample variance is an unbiased estimator of the
population variance, but the sample standard deviation is not an unbiased estimator of the
population standard deviation – as illustrated by exercises 10 and 11 in section 6-4 of the
previous chapter.
5. α = 0.05 and df = 8
χ²L = χ²8,0.975 = 2.180
χ²R = χ²8,0.025 = 17.535
6. α = 0.05 and df = 19
χ²L = χ²19,0.975 = 8.907
χ²R = χ²19,0.025 = 32.852
7. α = 0.01 and df = 80
χ²L = χ²80,0.995 = 51.172
χ²R = χ²80,0.005 = 116.321
8. α = 0.10 and df = 50
χ²L = χ²50,0.95 = 34.764
χ²R = χ²50,0.05 = 67.505
9. α = 0.05 and df = 29; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(29)(333)²/45.722 < σ² < (29)(333)²/16.047
70333.3 < σ² < 200397.6
265 < σ < 448
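The interval in Exercise 9 follows the pattern (n-1)s²/χ²R < σ² < (n-1)s²/χ²L used throughout this section. A minimal sketch, assuming the two critical values are read from the chi-square table (the standard library has no chi-square quantile function) and using the hypothetical helper name sigma_interval:

```python
import math


def sigma_interval(n, s, chi2_left, chi2_right):
    """Return ((var_low, var_high), (sd_low, sd_high)) for the population
    variance and standard deviation, given table values of chi-square."""
    df = n - 1
    var_low = df * s ** 2 / chi2_right   # the larger critical value gives the lower limit
    var_high = df * s ** 2 / chi2_left   # the smaller critical value gives the upper limit
    return (var_low, var_high), (math.sqrt(var_low), math.sqrt(var_high))


# Exercise 9: n = 30 (df = 29), s = 333, table values 16.047 and 45.722
(var_low, var_high), (sd_low, sd_high) = sigma_interval(30, 333, 16.047, 45.722)
print(round(var_low, 1), round(var_high, 1))
print(round(sd_low), round(sd_high))
```

The rounded limits agree with the values shown in the exercise. Note that the interval for σ is obtained by taking square roots of the limits for σ², not by building a separate interval.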
10. α = 0.05 and df = 24; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(24)(2.3)²/39.364 < σ² < (24)(2.3)²/12.401
3.23 < σ² < 10.24
1.8 < σ < 3.2 (mph)
11. α = 0.01 and df = 6; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(6)(2.019)²/18.548 < σ² < (6)(2.019)²/0.676
1.3186 < σ² < 36.1807
1.148 < σ < 6.015 (cells/microliter)
12. α = 0.01 and df = 7; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(7)(0.12)²/20.278 < σ² < (7)(0.12)²/0.989
0.00497 < σ² < 0.10192
0.07 < σ < 0.32 (seconds)
13. From the upper right section of Table 7-2, n = 19,205.
No. This sample size is too large to be practical for most applications.
14. From the upper right section of Table 7-2, n = 21.
Yes. This sample size is practical for most applications.
15. From the lower left section of Table 7-2, n = 101.
Yes. This sample size is practical for most applications.
16. From the upper left section of Table 7-2, n = 211.
17. α = 0.05 and df = 189; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(189)(645)²/228.9638 < σ² < (189)(645)²/152.8222
343411 < σ² < 514511
586 < σ < 717 (grams)
No. Since the confidence interval includes 696, it is a reasonable possibility for σ.
18. a. R = 1.015 – 0.696 = 0.319 grams
By the range rule of thumb, σ ≈ R/4 = 0.319/4 = 0.07975 grams.
b. α = 0.05 and df = 99 [100]; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(99)(0.0518)²/129.561 < σ² < (99)(0.0518)²/74.222
0.002050 < σ² < 0.003579
0.0453 < σ < 0.0598 (grams)
c. No; the confidence interval does not contain the estimate from part (a). This suggests that
the range rule of thumb is not accurate in this case. Remember, however, that the range rule
of thumb applies to all distributions – and that normal distributions (like the weights of the
M&M’s) have smaller standard deviations than other distributions with the same range
because they bunch up near the middle.
19. a. α = 0.05 and df = 22; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(22)(22.9)²/36.781 < σ² < (22)(22.9)²/10.982
313.67 < σ² < 1050.54
17.7 < σ < 32.4 (minutes)
b. α = 0.05 and df = 11; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(11)(20.8)²/21.920 < σ² < (11)(20.8)²/3.816
217.11 < σ² < 1247.13
14.7 < σ < 35.3 (minutes)
c. The two intervals are similar. No, there does not appear to be a difference in the variation of
lengths of PG/PG-13 movies and R movies.
20. a. α = 0.01 and df = 39 [40]; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(39)(11.3)²/66.766 < σ² < (39)(11.3)²/20.707
74.588 < σ² < 240.494
8.6 < σ < 15.5 (beats/minute)
b. α = 0.01 and df = 39 [40]; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(39)(12.5)²/66.766 < σ² < (39)(12.5)²/20.707
91.270 < σ² < 294.285
9.6 < σ < 17.2 (beats/minute)
c. The two intervals are similar. No, there does not appear to be a difference in the variation of
pulse rates of men and women.
21. preliminary values: n = 12, Σx = 52118, Σx² = 228,072,688
x̄ = (Σx)/n = (52118)/12 = 4343.2
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [12(228072688) – (52118)²]/[12(11)] = 155,957.06
s = 394.91
α = 0.01 and df = 11; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(11)(394.91)²/26.757 < σ² < (11)(394.91)²/2.603
64115.10 < σ² < 659057.88
253.2 < σ < 811.8 (seconds)
22. preliminary values: n = 12, Σx = 10008, Σx² = 8,360,132
x̄ = (Σx)/n = (10008)/12 = 834.0
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [12(8360132) – (10008)²]/[12(11)] = 1223.64
s = 34.98
α = 0.05 and df = 11; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(11)(34.98)²/21.920 < σ² < (11)(34.98)²/3.816
614.05 < σ² < 3527.25
24.8 < σ < 59.4 (mm)
Yes, the interval contains the traditionally believed value of 35 mm.
23. preliminary values: n = 6, Σx = 9.23, Σx² = 32.5197
x̄ = (Σx)/n = (9.23)/6 = 1.538
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [6(32.5197) – (9.23)²]/[6(5)] = 3.664
s = 1.914
α = 0.05 and df = 5; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(5)(3.664)/12.833 < σ² < (5)(3.664)/0.831
1.4276 < σ² < 22.0468
1.195 < σ < 4.695 (micrograms per cubic meter)
Yes. One of the requirements to use the methods of this section is that the original distribution
be approximately normal, and the fact that 5 of the 6 sample values are less than the mean
suggests that the original distribution is not normal.
24. a. preliminary values: n = 10, Σx = 71.5, Σx² = 513.27
x̄ = (Σx)/n = (71.5)/10 = 7.15
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [10(513.27) – (71.5)²]/[10(9)] = 0.2272
s = 0.48
α = 0.05 and df = 9; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(9)(0.2272)/19.023 < σ² < (9)(0.2272)/2.700
0.1075 < σ² < 0.7574
0.33 < σ < 0.87 (minutes)
b. preliminary values: n = 10, Σx = 71.5, Σx² = 541.09
x̄ = (Σx)/n = (71.5)/10 = 7.15
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [10(541.09) – (71.5)²]/[10(9)] = 3.3183
s = 1.82
α = 0.05 and df = 9; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(9)(3.3183)/19.023 < σ² < (9)(3.3183)/2.700
1.5699 < σ² < 11.0611
1.25 < σ < 3.33 (minutes)
c. The variation is considerably higher in part (b). Yes; since the intervals do not overlap,
there is a significant difference in the variability of the two systems. The single-line system
in part (a) is better for the customers because it eliminates the long wait endured by some
customers when one of the lines is slow.
25. preliminary values: n = 100, Σx = 70311, Σx² = 50,278,497
x̄ = (Σx)/n = (70311)/100 = 703.11
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [100(50278497) – (70311)²]/[100(99)] = 8506.36
s = 92.23
α = 0.05 and df = 99 [100]; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(99)(8506.36)/129.561 < σ² < (99)(8506.36)/74.222
6499.87 < σ² < 11346.09
80.6 < σ < 106.5 (FICO units)
NOTE: The statistical portion of Excel yielded the following results.
Confidence Level    0.95
Lower Conf. Limit   80.979
Stan. Dev.          92.23
Upper Conf. Limit   107.141
26. preliminary values: n = 48, Σx = 133,522, Σx² = 393,933,262
x̄ = (Σx)/n = (133522)/48 = 2781.71
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [48(393933262) – (133522)²]/[48(47)] = 479,021.32
s = 692.11
α = 0.01 and df = 47 [50]; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(47)(479021.32)/79.490 < σ² < (47)(479021.32)/27.991
283230.62 < σ² < 804330.03
532.2 < σ < 896.8 (kWh)
NOTE: The statistical portion of Excel yielded the following results.
Confidence Level    0.99
Lower Conf. Limit   545.339
Stan. Dev.          692.114
Upper Conf. Limit   934.611
27. Applying the given formula yields the following χ²L and χ²R values.
χ² = (1/2)[±zα/2 + √(2(df) - 1)]²
= (1/2)[±1.96 + √(2(189) - 1)]²
= (1/2)[±1.96 + 19.416]²
= (1/2)[17.456]² and (1/2)[21.376]²
= 152.3645 and 228.4771
These are close to the 152.8222 and 228.9638 given in exercise #17.
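The approximation in Exercise 27, χ² = (1/2)[±zα/2 + √(2(df) - 1)]², is easy to evaluate directly. The sketch below uses the hypothetical helper name chi2_approx and takes z = 1.96 as given in the exercise.

```python
import math


def chi2_approx(df, z):
    """Approximate left and right chi-square critical values using
    (1/2) * (±z + sqrt(2*df - 1))**2."""
    root = math.sqrt(2 * df - 1)
    left = 0.5 * (-z + root) ** 2
    right = 0.5 * (z + root) ** 2
    return left, right


# Exercise 27: df = 189, z = 1.96
left, right = chi2_approx(189, 1.96)
print(round(left, 4), round(right, 4))
```

The two results differ from the values used in Exercise 17 by less than 0.5, which is small relative to critical values of this size.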
28. Notice that what the exercise calls (n-1)s²/(n+1) can be restated more naturally in a format
similar to the definition of the variance as
(n-1)s²/(n+1) = [(n-1)/(n+1)] × [Σ(x-x̄)²/(n-1)] = Σ(x-x̄)²/(n+1).
This exercise compares the mean square error (MSE) of the unbiased s² = Σ(x-x̄)²/(n-1)
with the biased s²B = Σ(x-x̄)²/(n+1).
Consider the original population of size N=3.
x       x-μ    (x-μ)²
2       -2     4
3       -1     1
7        3     9
total
12       0     14
μ = (Σx)/N = 12/3 = 4
σ² = Σ(x-μ)²/N = 14/3 = 4.667
Consider the 9 possible equally likely samples (with replacement) of size n=2.
The following table contains the values necessary to answer the various parts of this exercise.
sample   x̄      s²     s²-σ²    (s²-σ²)²   s²B     s²B-σ²   (s²B-σ²)²
2,2      2.0    0      -4.667   21.778     0       -4.667   21.778
2,3      2.5    0.5    -4.167   17.361     0.167   -4.500   20.250
2,7      4.5    12.5    7.833   61.361     4.167   -0.500    0.250
3,2      2.5    0.5    -4.167   17.361     0.167   -4.500   20.250
3,3      3.0    0      -4.667   21.778     0       -4.667   21.778
3,7      5.0    8.0     3.333   11.111     2.667   -2.000    4.000
7,2      4.5    12.5    7.833   61.361     4.167   -0.500    0.250
7,3      5.0    8.0     3.333   11.111     2.667   -2.000    4.000
7,7      7.0    0      -4.667   21.778     0       -4.667   21.778
total    36.0   42      0       245        14      -28      114.333
a. Consider the estimator s².
E(s²) = (Σs²)/9 = 42/9 = 4.667 = σ², and so s² is an unbiased estimator of σ².
MSE(s²) = Σ(s²-σ²)²/9 = 245/9 = 27.222
b. Consider the estimator s²B.
E(s²B) = (Σs²B)/9 = 14/9 = 1.556 ≠ σ², and so s²B is not an unbiased estimator of σ².
MSE(s²B) = Σ(s²B-σ²)²/9 = 114.333/9 = 12.704
c. Parts (a) and (b) show that s²B has a smaller mean square error (MSE), but that it is a biased
estimator. It can be shown that the estimator s²B = Σ(x-x̄)²/(n+1) has the minimum MSE of
all estimators of σ².
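The table for Exercise 28 can be reproduced by enumerating the 9 equally likely samples. This is a minimal sketch; the helper names sum_sq_dev, mean and mse are assumptions for illustration.

```python
from itertools import product

population = [2, 3, 7]
n = 2  # sample size, sampling with replacement

mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample)


def mean(values):
    return sum(values) / len(values)


def mse(estimates, target):
    """Mean square error of a list of equally likely estimates."""
    return mean([(e - target) ** 2 for e in estimates])


samples = list(product(population, repeat=n))          # the 9 ordered samples
s2 = [sum_sq_dev(s) / (n - 1) for s in samples]        # unbiased estimator
s2_b = [sum_sq_dev(s) / (n + 1) for s in samples]      # biased estimator

print(round(mean(s2), 3), round(mse(s2, sigma2), 3))
print(round(mean(s2_b), 3), round(mse(s2_b, sigma2), 3))
```

The first line of output corresponds to part (a) and the second to part (b): the mean of s² equals σ², while s²B has the smaller MSE despite being biased.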
Statistical Literacy and Critical Thinking
1. A point estimate is a single value calculated from sample data that is used to estimate the true
value of a population characteristic, called the parameter. In this context the sample proportion
that test positive is the best point estimate for the population proportion that would test
positive. A confidence interval is a range of values that is likely, with some specific degree of
confidence, to include the true value of the population parameter. The major advantage of the
confidence interval over the point estimate is its ability to communicate a sense of the accuracy
of the estimate.
2. We can be 95% confident that the interval from 2.62% to 4.99% contains the true percentage
of all job applicants who would test positive for drug use.
3. The confidence level in Exercise 2 is 95%. In general, the confidence level specifies the
proportion of times a given procedure to construct an interval estimate can be expected to
produce an interval that will include the true value of the parameter.
4. The respondents are not likely to be representative of the general population for two reasons.
The sample is a convenience sample, composed only of those who visit the AOL Web site.
The sample is a voluntary response sample, composed only of those who take the time to self-select to be in the survey. Convenience samples are typically not representative
racially, socio-economically, etc. Voluntary response samples typically include mainly those
with strong opinions on, or a personal interest in, the topic of the survey.
Chapter Quick Quiz
1. We can be 95% confident that the interval from 20.0 to 20.0 contains the true value of the
population mean.
2. The interval includes some values greater than 50%, suggesting that the Republican may win;
but the interval also includes some values less than 50%, suggesting that the Republican may
lose. Statement (2), that the election is too close to call, best describes the results of the
survey.
3. The critical value of tα/2 for n=20 and α = 0.05 is t19,0.025 = 2.093.
4. The critical value of zα/2 for n=20 and α = 0.10 is z0.05 = 1.645.
5. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.02; p̂ unknown, use p̂ = 0.5
n = [(zα/2)²p̂q̂]/E² = [(1.96)²(0.5)(0.5)]/(0.02)² = 2401
6. p̂ = x/n = 240/600 = 0.40
7. α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 240/600 = 0.4000
p̂ ± zα/2·√(p̂q̂/n)
0.4000 ± 1.96·√[(0.4000)(0.6000)/600]
0.4000 ± 0.0392
0.361 < p < 0.439
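The proportion interval in Quiz item 7 can be sketched as follows. The function name proportion_interval is hypothetical, and z = 1.96 is the value used in the solution.

```python
import math


def proportion_interval(x, n, z):
    """Return (p_hat, E, lower, upper) for p_hat ± z * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    q_hat = 1 - p_hat
    margin = z * math.sqrt(p_hat * q_hat / n)
    return p_hat, margin, p_hat - margin, p_hat + margin


# Quiz item 7: x = 240 successes in n = 600 trials, z = 1.96
p_hat, E, lower, upper = proportion_interval(240, 600, 1.96)
print(round(E, 4), round(lower, 3), round(upper, 3))
```

The rounded margin of error and limits match those in the solution; the same call with x = 589 and n = 745 reproduces the interval in Review Exercise 1.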
8. σ unknown, n > 30: use t with df = 35 and α = 0.05, tdf,α/2 = t35,0.025 = 2.030
x̄ ± tα/2·s/√n
40.0 ± 2.030(10.0)/√36
40.0 ± 3.4
36.6 < μ < 43.4 (years)
9. σ known, n > 30: use z and α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
40.0 ± 1.96(10.0)/√36
40.0 ± 3.3
36.7 < μ < 43.3 (years)
10. α = 0.05, zα/2 = z0.025 = 1.96
n = [zα/2·σ/E]² = [(1.96)(12)/(0.5)]² = 2212.76, rounded up to 2213
Review Exercises
1. α = 0.05 and zα/2 = z0.025 = 1.96; p̂ = x/n = 589/745 = 0.7906
p̂ ± zα/2·√(p̂q̂/n)
0.7906 ± 1.96·√[(0.7906)(0.2094)/745]
0.7906 ± 0.0292
0.761 < p < 0.820 or 76.1% < p < 82.0%
We can be 95% confident that the interval from 76.1% to 82.0% contains the true percentage
of all adults who believe that it is morally wrong to not report all income on their tax returns.
2. α = 0.01, zα/2 = z0.005 = 2.575 and E = 0.02; p̂ unknown, use p̂ = 0.5
n = [(zα/2)²p̂q̂]/E² = [(2.575)²(0.5)(0.5)]/(0.02)² = 4144.14, rounded up to 4145
3. α = 0.01, zα/2 = z0.005 = 2.575
n = [zα/2·σ/E]² = [(2.575)(28785)/(500)]² = 21975.91, rounded up to 21,976
No, the required sample size is too large to be practical. It appears that some re-thinking of the
requirements is necessary.
4. σ unknown, n > 30: use t with df = 36 and α = 0.01, tdf,α/2 = t36,0.005 = 2.719
x̄ ± tα/2·s/√n
2.4991 ± 2.719(0.0165)/√37
2.4991 ± 0.0074
2.4917 < μ < 2.5065 (grams)
Since the above interval includes 2.5 grams, it appears on the surface that the manufacturing
process is meeting the design specifications – but the interval may not be relevant to determine
whether the pennies were manufactured according to specification if the sample came (as it
seems) from worn pennies in circulation.
5. preliminary values: n = 5, Σx = 3344, Σx² = 2,470,638
x̄ = (Σx)/n = (3344)/5 = 668.8
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [5(2470638) – (3344)²]/[5(4)] = 58542.7
s = 241.96
σ unknown and n = 5: assuming a normal distribution, use t with df = 4
α = 0.05, tdf,α/2 = t4,0.025 = 2.776
x̄ ± tα/2·s/√n
668.8 ± 2.776(241.96)/√5
668.8 ± 300.4
368.4 < μ < 969.2 (hic)
6. preliminary values: n = 5, Σx = 3344, Σx² = 2,470,638
x̄ = (Σx)/n = (3344)/5 = 668.8
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [5(2470638) – (3344)²]/[5(4)] = 58542.7
s = 241.96
α = 0.05 and df = 4; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(4)(58542.7)/11.143 < σ² < (4)(58542.7)/0.484
21015.06 < σ² < 483823.97
145.0 < σ < 695.6 (hic)
7. Let x = the number who believe that cloning should not be allowed.
a. p̂ = x/n = 901/1012 = 0.8903, rounded to 0.890
b. α = 0.05, zα/2 = z0.025 = 1.96
p̂ ± zα/2·√(p̂q̂/n)
0.8903 ± 1.96·√[(0.8903)(0.1097)/1012]
0.8903 ± 0.0193
0.871 < p < 0.910
c. Yes. Since the entire interval is above 50% = 0.50, there is strong evidence that the majority
is opposed to such cloning.
8. a. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.04; p̂ unknown, use p̂ = 0.5
n = [(zα/2)²p̂q̂]/E² = [(1.96)²(0.5)(0.5)]/(0.04)² = 600.25, rounded up to 601
b. α = 0.05, zα/2 = z0.025 = 1.96
n = [zα/2·σ/E]² = [(1.96)(14227)/(750)]² = 1382.34, rounded up to 1383
c. To meet both criteria simultaneously, use the larger sample size of n = 1383.
9. preliminary values: n = 8, Σx = 30.72, Σx² = 160.2186
a. x̄ = (Σx)/n = (30.72)/8 = 3.840 lbs
b. σ unknown and n = 8: assuming a normal distribution, use t with df = 7
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [8(160.2186) – (30.72)²]/[8(7)] = 6.03626
s = 2.4569
α = 0.05, tdf,α/2 = t7,0.025 = 2.365
x̄ ± tα/2·s/√n
3.8400 ± 2.365(2.4569)/√8
3.8400 ± 2.0543
1.786 < μ < 5.894 (lbs)
c. σ known and a normal distribution: use z
α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
3.8400 ± 1.96(3.108)/√8
3.8400 ± 2.1537
1.686 < μ < 5.994 (lbs)
10. preliminary values: n = 8, Σx = 30.72, Σx² = 160.2186
x̄ = (Σx)/n = (30.72)/8 = 3.840
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [8(160.2186) – (30.72)²]/[8(7)] = 6.03626
s = 2.4569
a. α = 0.05 and df = 7; χ²L = χ²df,1-α/2 = χ²7,0.975 = 1.690 and χ²R = χ²df,α/2 = χ²7,0.025 = 16.013
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(7)(6.03626)/16.013 < σ² < (7)(6.03626)/1.690
2.6387 < σ² < 25.0022
1.624 < σ < 5.000 (lbs)
b. α = 0.05 and df = 7; χ²L = χ²df,1-α/2 = χ²7,0.975 = 1.690 and χ²R = χ²df,α/2 = χ²7,0.025 = 16.013
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(7)(6.03626)/16.013 < σ² < (7)(6.03626)/1.690
2.639 < σ² < 25.002 (lbs²)
Cumulative Review Exercises
1. scores in order: 103 105 110 119 119 123 125 125 127 128
preliminary values: n = 10, Σx = 1184, Σx² = 140948
a. x̄ = (Σx)/n = (1184)/10 = 118.4 lbs
b. x̃ = (x5 + x6)/2 = (119 + 123)/2 = 121.0 lbs
c. s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [10(140948) – (1184)²]/[10(9)] = 84.711
s = 9.2 lbs
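The preliminary values and summary statistics in Exercise 1 can be verified from the raw scores with Python's standard `statistics` module. This is a sketch for checking the arithmetic, not part of the original solution.

```python
import statistics

scores = [103, 105, 110, 119, 119, 123, 125, 125, 127, 128]

n = len(scores)
sum_x = sum(scores)
sum_x2 = sum(x * x for x in scores)

mean = statistics.mean(scores)          # part (a)
median = statistics.median(scores)      # part (b): average of the 5th and 6th scores
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
stdev = statistics.stdev(scores)        # part (c)

print(n, sum_x, sum_x2)
print(f"mean = {mean:.1f}, median = {median:.1f}, s^2 = {variance:.3f}, s = {stdev:.1f}")
```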
2. Ratio, since differences are meaningful and there is a meaningful zero.
3. From Exercise 1, x̄ = 118.4 and s = 9.204.
α = 0.05, tdf,α/2 = t9,0.025 = 2.262
x̄ ± tα/2·s/√n
118.4 ± 2.262(9.204)/√10
118.4 ± 6.6
111.8 < μ < 125.0 (lbs)
4. α = 0.05, zα/2 = z0.025 = 1.96
n = [zα/2·σ/E]² = [(1.96)(7.5)/(2)]² = 54.0225, rounded up to 55
5. Let D = an applicant tests positive for drugs.
P(D) = 0.038
a. P(D̄) = 1 – P(D)
= 1 – 0.038
= 0.962
b. P(D1 and D2) = P(D1)·P(D2|D1)
= (0.038)(0.038) = 0.00144
c. binomial: n=500 and p=0.038
normal approximation appropriate since
np = 500(0.038) = 19 ≥ 5
nq = 500(0.962) = 481 ≥ 5
μ = np = 500(0.038) = 19
σ = √(npq) = √[500(0.038)(0.962)] = 4.275
P(x ≥ 20)
= P(x>19.5)
= P(z>0.12)
= 1 – 0.5478
= 0.4522
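The normal approximation in part (c) can be sketched in Python using `statistics.NormalDist`. The z score is rounded to two decimals to mirror the table lookup in the solution. The exact binomial sum at the end is an added cross-check that does not appear in the original solution.

```python
import math
from statistics import NormalDist

n, p = 500, 0.038
q = 1 - p

mu = n * p
sigma = math.sqrt(n * p * q)

# Continuity correction: P(x >= 20) becomes P(x > 19.5).
z = round((19.5 - mu) / sigma, 2)   # rounded as when using a z table
approx = 1 - NormalDist().cdf(z)

# Cross-check (not in the source): exact binomial P(x >= 20) = 1 - P(x <= 19).
exact = 1 - sum(math.comb(n, k) * p ** k * q ** (n - k) for k in range(20))

print(f"mu = {mu:.0f}, sigma = {sigma:.3f}, z = {z}")
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The two results should be close but not identical, since the normal curve only approximates the discrete binomial distribution.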
6. a. normal distribution
μ = 21.1
σ = 4.8
P(x>20.0)
= P(z>-0.23)
= 1 – 0.4090
= 0.5910
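[Figure for Exercise 5c: normal curve with mean x = 19 (z = 0) and area 0.5478 to the left of x = 19.5 (z = 0.12).]
[Figure for Exercise 6a: normal curve with mean x = 21.1 (z = 0) and area 0.4090 to the left of x = 20.0 (z = -0.23).]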
NOTE: Since ACT scores are whole numbers, another valid interpretation of part (a) is
P(x>20) = PC(x>20.5) = P(z>-0.13) = 1 – 0.4483 = 0.5517. Presumably the “20.0” was
specified in the exercise to discourage this interpretation and to allow for a direct
comparison to part (b).
b. normal distribution, since the original distribution is normal
μx̄ = μ = 21.1
σx̄ = σ/√n = 4.8/√25 = 0.96
P(x̄ > 20.0)
= P(z>-1.15)
= 1 – 0.1251
= 0.8749
c. normal distribution: μ = 21.1, σ = 4.8
For P90, A = 0.9000 [0.8997]
and z = 1.28 [from z table]
or z = 1.282 [from last row of t table]
x = μ + zσ
= 21.1 + (1.282)(4.8)
= 21.1 + 6.2
= 27.3
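[Figure for Exercise 6b: normal curve for x̄ with mean 21.1 (z = 0) and area 0.1251 to the left of x̄ = 20.0 (z = -1.15).]
[Figure for Exercise 6c: normal curve with mean x = 21.1 (z = 0) and area 0.9000 to the left of the unknown score x = ? (z = 1.282).]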
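The three parts of Exercise 6 can be checked with `statistics.NormalDist`. In this sketch the z scores for parts (a) and (b) are rounded to two decimals to match the table lookups in the solution; part (c) uses the unrounded inverse, which agrees with z = 1.282 to three decimals.

```python
import math
from statistics import NormalDist

mu, sigma, n = 21.1, 4.8, 25
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# (a) One individual scores above 20.0.
z_a = round((20.0 - mu) / sigma, 2)
p_a = 1 - std_normal.cdf(z_a)

# (b) The mean of n = 25 scores is above 20.0; use the standard error sigma/sqrt(n).
se = sigma / math.sqrt(n)
z_b = round((20.0 - mu) / se, 2)
p_b = 1 - std_normal.cdf(z_b)

# (c) The 90th percentile: x = mu + z*sigma with cumulative area 0.9000 to the left.
z_c = std_normal.inv_cdf(0.90)
p90 = mu + z_c * sigma

print(f"(a) z = {z_a}, P = {p_a:.4f}")
print(f"(b) z = {z_b}, P = {p_b:.4f}")
print(f"(c) z = {z_c:.3f}, P90 = {p90:.1f}")
```

Comparing (a) and (b) shows how the smaller standard error makes a sample mean above 20.0 more likely than a single score above 20.0.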
7. A simple random sample of size n from some population occurs when every sample of size n
has an equal chance of being selected from that population. A voluntary response sample
occurs when the subjects themselves decide whether to be included.
8. As grade point averages are typically reported to two decimal places, R = 4.00 – 0.00 = 4.00.
The range rule of thumb states that σ ≈ R/4 = 4.00/4 = 1.000.
9. Let C = getting a T-F question correct by random guessing.
P(C1 and C2 and …and C12) = P(C1)·P(C2)·…·P(C12)
= (0.5)(0.5)…(0.5) = (0.5)¹² = 0.000244
Since 0.000244 > 0, it is possible to get all 12 questions correct by random guessing.
Since 0.000244 < 0.05, it is unlikely to get all 12 questions correct by random guessing.
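The multiplication rule in Exercise 9 reduces to a single power, which a short Python sketch can confirm. The 0.05 cutoff for "unlikely" is the one used in the solution; the variable names are illustrative.

```python
p_correct = 0.5          # chance of guessing one true-false question correctly
questions = 12

# Independent guesses: multiply the individual probabilities.
p_all_correct = p_correct ** questions

print(f"P(all {questions} correct) = {p_all_correct:.6f}")
print("possible:", p_all_correct > 0)
print("unlikely:", p_all_correct < 0.05)
```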
10. This is a convenience sample, composed of those friends who happen to be available.
Convenience samples are typically not representative of the population in a variety of ways –
e.g., racially, socio-economically, by gender, etc.