Chapter 7 Estimates and Sample Sizes
7-2 Estimating a Population Proportion
1. The confidence level was not stated. The most common level of confidence is 95%, and sometimes that level is carelessly assumed without actually being stated.
2. The margin of error is the maximum likely difference between the point estimate for a parameter and its true value.
3. By including a statement of the maximum likely error, a confidence interval provides information about the accuracy of an estimate.
4. No. A voluntary response sample is not necessarily representative of the population.
5. For 99% confidence, α = 1–0.99 = 0.01 and α/2 = 0.01/2 = 0.005.
    For the upper 0.005, A = 0.9950 and z = 2.575.
    zα/2 = z0.005 = 2.575
6. For 99.5% confidence, α = 1–0.995 = 0.005 and α/2 = 0.005/2 = 0.0025.
    For the upper 0.0025, A = 0.9975 and z = 2.81.
    zα/2 = z0.0025 = 2.81
7. For α = 0.10, α/2 = 0.10/2 = 0.05.
    For the upper 0.05, A = 0.9500 and z = 1.645.
    zα/2 = z0.05 = 1.645
8. For α = 0.02, α/2 = 0.02/2 = 0.01.
    For the upper 0.01, A = 0.9900 [0.9901] and z = 2.33.
    zα/2 = z0.01 = 2.33
9. Let L = the lower confidence limit; U = the upper confidence limit.
    p̂ = (L+U)/2 = (0.200+0.500)/2 = 0.700/2 = 0.350
    E = (U–L)/2 = (0.500–0.200)/2 = 0.300/2 = 0.150
    The interval can be expressed as 0.350 ± 0.150.
10. Let L = the lower confidence limit; U = the upper confidence limit.
    p̂ = (L+U)/2 = (0.720+0.780)/2 = 1.500/2 = 0.750
    E = (U–L)/2 = (0.780–0.720)/2 = 0.060/2 = 0.030
    The interval can be expressed as 0.750 ± 0.030.
11. Let L = the lower confidence limit; U = the upper confidence limit.
    p̂ = (L+U)/2 = (0.437+0.529)/2 = 0.966/2 = 0.483
    E = (U–L)/2 = (0.529–0.437)/2 = 0.092/2 = 0.046
    The interval can be expressed as 0.483 ± 0.046.
12. Given that p̂ = 0.222 and E = 0.044,
    L = p̂ – E = 0.222 – 0.044 = 0.178
    U = p̂ + E = 0.222 + 0.044 = 0.266
    The interval can be expressed as 0.178 < p < 0.266.
13. Let L = the lower confidence limit; U = the upper confidence limit.
    p̂ = (L+U)/2 = (0.320+0.420)/2 = 0.740/2 = 0.370
    E = (U–L)/2 = (0.420–0.320)/2 = 0.100/2 = 0.050
14. Let L = the lower confidence limit; U = the upper confidence limit.
    p̂ = (L+U)/2 = (0.772+0.776)/2 = 1.548/2 = 0.774
    E = (U–L)/2 = (0.776–0.772)/2 = 0.004/2 = 0.002
15. Let L = the lower confidence limit; U = the upper confidence limit.
    p̂ = (L+U)/2 = (0.433+0.527)/2 = 0.960/2 = 0.480
    E = (U–L)/2 = (0.527–0.433)/2 = 0.094/2 = 0.047
16. Let L = the lower confidence limit; U = the upper confidence limit.
    p̂ = (L+U)/2 = (0.102+0.236)/2 = 0.338/2 = 0.169
    E = (U–L)/2 = (0.236–0.102)/2 = 0.134/2 = 0.067
IMPORTANT NOTE: When calculating E = zα/2·√(p̂q̂/n), do not round off in the middle of the problem. This, and the subsequent calculations of U = p̂ + E and L = p̂ – E, may be accomplished conveniently on most calculators having a memory as follows.
    (1) Calculate p̂ = x/n and STORE the value.
    (2) Calculate E as 1 – RECALL = * RECALL = ÷ n = √ * zα/2 =
    (3) With the value for E showing on the display, the upper confidence limit U can be calculated by using + RECALL =.
    (4) With the value for U showing on the display, the lower confidence limit L can be calculated by using – RECALL ± + RECALL.
THE MANUAL USES THIS PROCEDURE, AND ROUNDS THE FINAL ANSWER TO 3 SIGNIFICANT DIGITS, EVEN THOUGH IT REPORTS INTERMEDIATE STEPS WITH A FINITE NUMBER OF DECIMAL PLACES. If the above procedure does not work on your calculator, or to find out if some other procedure would be more efficient on your calculator, ask your instructor for assistance. You must become familiar with your own calculator – and be sure to do your homework on the same calculator you will use for the exams.
17. α = 0.05 and zα/2 = z0.025 = 1.96; p̂ = x/n = 400/1000 = 0.40
    E = zα/2·√(p̂q̂/n) = 1.96·√[(0.40)(0.60)/1000] = 0.0304
18. α = 0.01 and zα/2 = z0.005 = 2.575; p̂ = x/n = 220/500 = 0.44
    E = zα/2·√(p̂q̂/n) = 2.575·√[(0.44)(0.56)/500] = 0.0572
19. 
α = 0.02 and zα/2 = z0.01 = 2.33; p̂ = x/n = [492]/1230 = 0.40
    E = zα/2·√(p̂q̂/n) = 2.33·√[(0.40)(0.60)/1230] = 0.0325
    NOTE: The value x=[492] was not given. In truth, any 486 ≤ x ≤ 498 rounds to the given p̂ = x/1230 = 40%. For want of a more precise value, p̂ = 0.40 is used in the calculation of E.
20. α = 0.10 and zα/2 = z0.05 = 1.645; p̂ = x/n = [623]/1780 = 0.35
    E = zα/2·√(p̂q̂/n) = 1.645·√[(0.35)(0.65)/1780] = 0.0186
    NOTE: The value x=[623] was not given. In truth, any 615 ≤ x ≤ 631 rounds to the given p̂ = x/1780 = 35%. For want of a more precise value, p̂ = 0.35 is used in the calculation of E.
21. α = 0.05 and zα/2 = z0.025 = 1.96; p̂ = x/n = 40/200 = 0.2000
    p̂ ± zα/2·√(p̂q̂/n)
    0.2000 ± 1.96·√[(0.2000)(0.8000)/200]
    0.2000 ± 0.0554
    0.145 < p < 0.255
22. α = 0.05 and zα/2 = z0.025 = 1.96; p̂ = x/n = 400/2000 = 0.2000
    p̂ ± zα/2·√(p̂q̂/n)
    0.2000 ± 1.96·√[(0.2000)(0.8000)/2000]
    0.2000 ± 0.0175
    0.182 < p < 0.218
23. α = 0.01 and zα/2 = z0.005 = 2.575; p̂ = x/n = 109/1236 = 0.0882
    p̂ ± zα/2·√(p̂q̂/n)
    0.0882 ± 2.575·√[(0.0882)(0.9118)/1236]
    0.0882 ± 0.0207
    0.0674 < p < 0.109
24. α = 0.01 and zα/2 = z0.005 = 2.575; p̂ = x/n = 4821/5200 = 0.9271
    p̂ ± zα/2·√(p̂q̂/n)
    0.9271 ± 2.575·√[(0.9271)(0.0729)/5200]
    0.9271 ± 0.0093
    0.918 < p < 0.936
25. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.045; p̂ unknown, use p̂ = 0.5
    n = [(zα/2)²·p̂q̂]/E² = [(1.96)²(0.5)(0.5)]/(0.045)² = 474.27, rounded up to 475
26. α = 0.01, zα/2 = z0.005 = 2.575 and E = 0.005; p̂ unknown, use p̂ = 0.5
    n = [(zα/2)²·p̂q̂]/E² = [(2.575)²(0.5)(0.5)]/(0.005)² = 66306.25, rounded up to 66,307
27. α = 0.01, zα/2 = z0.005 = 2.575 and E = 0.02; p̂ estimated to be 0.14
    n = [(zα/2)²·p̂q̂]/E² = [(2.575)²(0.14)(0.86)]/(0.02)² = 1995.82, rounded up to 1996
28. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.03; p̂ estimated to be 0.87
    n = [(zα/2)²·p̂q̂]/E² = [(1.96)²(0.87)(0.13)]/(0.03)² = 482.76, rounded up to 483
29. Let x = the number of girls born using the method.
    a. 
p̂ = x/n = 525/574 = 0.9146, rounded to 0.915
    b. α = 0.05, zα/2 = z0.025 = 1.96
       p̂ ± zα/2·√(p̂q̂/n)
       0.9146 ± 1.96·√[(0.9146)(0.0854)/574]
       0.9146 ± 0.0229
       0.892 < p < 0.937
    c. Yes. Since 0.5 is not within the confidence interval and lies below it, we can be 95% certain that the method is effective.
30. Let x = the number of boys born using the method.
    a. p̂ = x/n = 127/152 = 0.8355, rounded to 0.836
    b. α = 0.01, zα/2 = z0.005 = 2.575
       p̂ ± zα/2·√(p̂q̂/n)
       0.8355 ± 2.575·√[(0.8355)(0.1645)/152]
       0.8355 ± 0.0774
       0.758 < p < 0.913
    c. Yes. Since 0.5 is not within the confidence interval and lies below it, we can be 99% certain that the method is effective.
31. Let x = the number of deaths in the week before Thanksgiving.
    a. p̂ = x/n = 6062/12000 = 0.5052, rounded to 0.505
    b. α = 0.05, zα/2 = z0.025 = 1.96
       p̂ ± zα/2·√(p̂q̂/n)
       0.5052 ± 1.96·√[(0.5052)(0.4948)/12000]
       0.5052 ± 0.0089
       0.496 < p < 0.514
    c. No. Since 0.5 is within the confidence interval, there is no evidence that people can temporarily postpone their death in such circumstances.
32. Let x = the number of suits dropped or dismissed.
    a. p̂ = x/n = 856/1228 = 0.6971, rounded to 0.697
    b. α = 0.01, zα/2 = z0.005 = 2.575
       p̂ ± zα/2·√(p̂q̂/n)
       0.6971 ± 2.575·√[(0.6971)(0.3029)/1228]
       0.6971 ± 0.0338
       0.663 < p < 0.731
    c. Yes. Since 0.5 is not within the confidence interval and lies below it, we can be 99% certain that more than half the suits are dropped or dismissed.
33. Let x = the number of yellow peas.
    a. α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 152/(428+152) = 152/580 = 0.2621
       p̂ ± zα/2·√(p̂q̂/n)
       0.2621 ± 1.96·√[(0.2621)(0.7379)/580]
       0.2621 ± 0.0358
       0.226 < p < 0.298 or 22.6% < p < 29.8%
    b. No. Since 0.25 is within the confidence interval, it is a reasonable possibility for the true population proportion. The results do not contradict the theory.
34. Let x = the number who say they voted.
    a. 
α = 0.01, zα/2 = z0.005 = 2.575 and p̂ = x/n = 701/1002 = 0.6996
       p̂ ± zα/2·√(p̂q̂/n)
       0.6996 ± 2.575·√[(0.6996)(0.3004)/1002]
       0.6996 ± 0.0373
       0.662 < p < 0.737
    b. No. Since 0.61 is not within the confidence interval, the results are not consistent with the actual voter turnout. It appears that people do not tell the truth about whether they voted.
35. Let x = the number that develop those types of cancer.
    a. α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 135/420095 = 0.0003214
       p̂ ± zα/2·√(p̂q̂/n)
       0.0003214 ± 1.96·√[(0.0003214)(0.9996786)/420095]
       0.0003214 ± 0.0000542
       0.000267 < p < 0.000376 or 0.0267% < p < 0.0376%
    b. No. Since 0.0340% = 0.000340 is within the confidence interval, it is a reasonable possibility for the true population value. The results do not provide evidence that cell phone users have a different cancer rate than the general population.
36. Let x = the number who think global warming demands priority attention.
    a. p̂ = x/n = 939/1708 = 0.5498, rounded to 0.550 or 55.0%
    b. α = 0.01, zα/2 = z0.005 = 2.575
       p̂ ± zα/2·√(p̂q̂/n)
       0.5498 ± 2.575·√[(0.5498)(0.4502)/1708]
       0.5498 ± 0.0310
       0.519 < p < 0.581
    c. Yes. Since 0.5 is not within the confidence interval and lies below it, we can be 99% certain that more than half the population thinks global warming demands such attention.
37. Let x = the number who say they use the Internet.
    α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = [2198]/3011 = 0.73
    NOTE: The value x=[2198] was not given. In truth, any 2183 ≤ x ≤ 2213 rounds to the given p̂ = x/3011 = 73%. For want of a more precise value, p̂ = 0.73 is used in the calculations. Technically, this should limit the exercise to two significant digit accuracy.
    p̂ ± zα/2·√(p̂q̂/n)
    0.73 ± 1.96·√[(0.73)(0.27)/3011]
    0.73 ± 0.0159
    0.714 < p < 0.746
    No. Since 0.75 is not within the confidence interval, it is not likely to be the correct value of the population proportion and should not be reported as such.
    In this particular exercise, however, the above NOTE indicates that the third significant digit in the confidence interval endpoints is not reliable – and if p̂ is really 2213/3011 = 0.73497, for example, the confidence interval is 0.719 < p < 0.751 and 75% is acceptable.
38. Let x = the number who display little or no knowledge of the company.
    α = 0.01, zα/2 = z0.005 = 2.575 and p̂ = x/n = [71]/150 = 0.47
    NOTE: The value x=[71] was not given. In truth, any 70 ≤ x ≤ 71 rounds to the given p̂ = x/150 = 47%. For want of a more precise value, p̂ = 0.47 is used in the calculations. Technically, this should limit the exercise to two significant digit accuracy.
    p̂ ± zα/2·√(p̂q̂/n)
    0.47 ± 2.575·√[(0.47)(0.53)/150]
    0.47 ± 0.1049
    0.365 < p < 0.575
    Yes. Since 0.50 is within the confidence interval, it is a likely value for the true population proportion.
39. Let x = the number who indicate the outbreak would deter them from taking a cruise.
    α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = [21302]/34358 = 0.62
    NOTE: The value x=[21302] was not given. In truth, any 21131 ≤ x ≤ 21473 rounds to the given p̂ = x/34358 = 62%. For want of a more precise value, p̂ = 0.62 is used in the calculations. Technically, this should limit the exercise to two significant digit accuracy.
    p̂ ± zα/2·√(p̂q̂/n)
    0.62 ± 1.96·√[(0.62)(0.38)/34358]
    0.62 ± 0.0051
    0.615 < p < 0.625
    No. Since the sample is a voluntary response sample, the respondents are not likely to be representative of the population.
40. Let x = the number who correctly identified which hand Emily selected.
    a. If the touch therapists made random guesses, one would expect the proportion of correct responses to be 0.50 regardless of how Emily chose which hand to use – even if, for example, she always used the right hand.
    b. p̂ = x/n = 123/280 = 0.4393, rounded to 0.439
    c. α = 0.01, zα/2 = z0.005 = 2.575
       p̂ ± zα/2·√(p̂q̂/n)
       0.4393 ± 2.575·√[(0.4393)(0.5607)/280]
       0.4393 ± 0.0764
       0.363 < p < 0.516
    d. 
Since the confidence interval includes 0.50, the therapists’ success rate is consistent with chance guessing. There is no evidence that the professional touch therapists have any special ability in this area.
41. α = 0.01, zα/2 = z0.005 = 2.575 and E = 0.02
    a. p̂ unknown, use p̂ = 0.5
       n = [(zα/2)²·p̂q̂]/E² = [(2.575)²(0.5)(0.5)]/(0.02)² = 4144.14, rounded up to 4145
    b. p̂ estimated to be 0.73
       n = [(zα/2)²·p̂q̂]/E² = [(2.575)²(0.73)(0.27)]/(0.02)² = 3267.24, rounded up to 3268
42. α = 0.10, zα/2 = z0.05 = 1.645 and E = 0.04
    a. p̂ unknown, use p̂ = 0.5
       n = [(zα/2)²·p̂q̂]/E² = [(1.645)²(0.5)(0.5)]/(0.04)² = 422.82, rounded up to 423
    b. p̂ estimated to be 0.08
       n = [(zα/2)²·p̂q̂]/E² = [(1.645)²(0.08)(0.92)]/(0.04)² = 124.48, rounded up to 125
43. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.03; p̂ unknown, use p̂ = 0.5
    n = [(zα/2)²·p̂q̂]/E² = [(1.96)²(0.5)(0.5)]/(0.03)² = 1067.11, rounded up to 1068
44. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.02; p̂ estimated to be 0.10
    n = [(zα/2)²·p̂q̂]/E² = [(1.96)²(0.10)(0.90)]/(0.02)² = 864.36, rounded up to 865
45. Let x = the number of green M&M’s.
    α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 19/100 = 0.19
    p̂ ± zα/2·√(p̂q̂/n)
    0.1900 ± 1.96·√[(0.19)(0.81)/100]
    0.1900 ± 0.0769
    0.113 < p < 0.267 or 11.3% < p < 26.7%
    Yes. Since 0.160 is within the confidence interval, this result is consistent with the claim that the true population proportion is 16%.
46. Let x = the number of students who gained weight during their freshman year. Of the 67 students, 17 lost weight, 5 stayed the same, and 45 gained weight.
    a. p̂ = x/n = 45/67 = 0.6716, rounded to 0.672 or 67.2%
    b. α = 0.05, zα/2 = z0.025 = 1.96
       p̂ ± zα/2·√(p̂q̂/n)
       0.6716 ± 1.96·√[(0.6716)(0.3284)/67]
       0.6716 ± 0.1125
       0.559 < p < 0.784 or 55.9% < p < 78.4%
    c. It is estimated that 67% of U.S. college students gain weight during their freshman year.
       This result comes from a study of 67 men and women published in the Journal of American College Health, Vol. 55, No. 1. That estimate has an 11% margin of error with a 95% confidence level. In other words, 95% of all such studies can be expected to produce estimates that are within 11% of the true proportion of all college students who gain weight during their freshman year.
47. Let x = the number of days with precipitation.
    α = 0.05, zα/2 = z0.025 = 1.96
    Wednesdays: p̂ = x/n = 16/53 = 0.3019
       p̂ ± zα/2·√(p̂q̂/n)
       0.3019 ± 1.96·√[(0.3019)(0.6981)/53]
       0.3019 ± 0.1236
       0.178 < p < 0.425
    Sundays: p̂ = x/n = 15/52 = 0.2885
       p̂ ± zα/2·√(p̂q̂/n)
       0.2885 ± 1.96·√[(0.2885)(0.7115)/52]
       0.2885 ± 0.1231
       0.165 < p < 0.412
    The confidence intervals are similar. It does not appear to rain more on either day.
48. Let x = the number of movies with R ratings.
    α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 12/35 = 0.3429
    p̂ ± zα/2·√(p̂q̂/n)
    0.3429 ± 1.96·√[(0.3429)(0.6571)/35]
    0.3429 ± 0.1573
    0.186 < p < 0.500
    No. To more significant digits, the upper limit of the confidence interval is 0.500114. Since the confidence interval includes values higher than 0.50, it is not an unreasonable possibility that the proportion of movies rated R is greater than ½. We cannot conclude that most movies have a rating different from R.
49. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.03; p̂ unknown, use p̂ = 0.5
    n = N·p̂q̂·[zα/2]² / {p̂q̂·[zα/2]² + (N–1)E²}
      = (12784)(0.5)(0.5)[1.96]² / {(0.5)(0.5)[1.96]² + (12783)(0.03)²}
      = 12277.8/12.4651 = 984.97, rounded up to 985
    No. The sample size is not too much lower than the n=1068 required for a population of millions of people.
50. α = 0.05, zα = z0.05 = 1.645 and p̂ = x/n = 630/750 = 0.8400
    p̂ – zα·√(p̂q̂/n)
    0.8400 – 1.645·√[(0.8400)(0.1600)/750]
    0.8400 – 0.0220
    0.818 < p
    The interval is expressed as p > 0.818. The desired figure is 81.8%.
51. 
α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 3/8 = 0.3750
    p̂ ± zα/2·√(p̂q̂/n)
    0.3750 ± 1.96·√[(0.3750)(0.6250)/8]
    0.3750 ± 0.3355
    0.0395 < p < 0.710
    Yes. The results are “reasonably close” – being shifted down 4.5% from the correct interval 0.085 < p < 0.755. But depending on the context, such an error could be serious.
52. α = 0.01, zα/2 = z0.005 = 2.575 and p̂ = x/n = 95/100 = 0.9500
    p̂ ± zα/2·√(p̂q̂/n)
    0.9500 ± 2.575·√[(0.9500)(0.0500)/100]
    0.9500 ± 0.0561
    0.894 < p < 1.006
    This interval is noteworthy because the upper limit is greater than 1, the maximum possible value for p in any problem. This occurs because the normal distribution used is only an approximation to the binomial. In this case the approximation is barely appropriate – since nq ≈ 100(0.05) = 5, the minimum acceptable value to use the normal to approximate the binomial. In such cases the interval should be reported as 0.894 < p < 1.
    NOTE: Do not use 0.894 < p ≤ 1, because the presence of 5 tails indicates that p=1 is not true.
53. a. If p̂ = x/n = 0/n = 0, then
       (1) np ≈ 0 < 5, and the normal approximation to the binomial does not apply.
       (2) E = zα/2·√(p̂q̂/n) = 0, and there is no meaningful interval.
    b. Since p̂ = x/n = 0/20 = 0, use 3/n = 3/20 = 0.15 as the 95% upper bound for p.
       NOTE: The corresponding interval would be 0 ≤ p < 0.15. Do not use 0 < p < 0.15, because the failure to observe any successes in the sample does not rule out p=0 as the true population proportion.
54. Since “19 cases out of 20” implies 19/20 = 0.95 = 95% confidence, use α = 0.05. Since p̂ is unknown, use p̂ = 0.5.
    n = [(zα/2)²·p̂q̂]/E² = [(1.96)²(0.5)(0.5)]/(0.01)² = 9604

7-3 Estimating a Population Mean: σ Known
1. A point estimate is a single value used to estimate a population parameter. If the parameter in question is the mean of a population, the best point estimate is the mean of a random sample from that population.
2. No. 
The list of the employees at her facility from which she obtained her simple random sample is itself a convenience sample. Those employees are likely not representative of the population by age, gender, ethnicity, or other factors that may affect leg length.
3. It is estimated that the mean height of U.S. women is 63.195 inches. This result comes from the Third National Health and Nutrition Examination Survey of the U.S. Department of Health and Human Services. It is based on an in-depth study of 40 women and assumes a population standard deviation of 2.5 inches. The estimate has a margin of error of 0.775 inches with a 95% level of confidence. In other words, 95% of all such studies can be expected to produce estimates that are within 0.775 inches of the true population mean height of all U.S. women.
4. While any one particular estimate for a population parameter may not be correct, a statistic is an unbiased estimator of a population parameter if its long-run expected value (i.e., its mean value) is equal to the true value of the parameter being estimated.
5. For 90% confidence, α = 1–0.90 = 0.10 and α/2 = 0.10/2 = 0.05.
    For the upper 0.05, A = 0.9500 and z = 1.645.
    zα/2 = z0.05 = 1.645
6. For 98% confidence, α = 1–0.98 = 0.02 and α/2 = 0.02/2 = 0.01.
    For the upper 0.01, A = 0.9900 [0.9901] and z = 2.33.
    zα/2 = z0.01 = 2.33
7. For α = 0.20, α/2 = 0.20/2 = 0.10.
    For the upper 0.10, A = 0.9000 and z = 1.28.
    zα/2 = z0.10 = 1.28
8. For α = 0.04, α/2 = 0.04/2 = 0.02.
    For the upper 0.02, A = 0.9800 [0.9798] and z = 2.05.
    zα/2 = z0.02 = 2.05
9. Since σ is known and n>30, the methods of this section may be used.
    α = 0.05, zα/2 = z0.025 = 1.96
    E = zα/2·σ/√n = 1.96(68)/√50 = 18.8 FICO units
    x̄ ± E
    677.0 ± 18.8
    658.2 < μ < 695.8 (FICO units)
    NOTE: The above interval assumes x̄ = 677.0. Technically, the failure to report x̄ to tenths limits the endpoints of the confidence interval to whole number accuracy.
10. 
Since σ is known and n>30, the methods of this section may be used.
    α = 0.05, zα/2 = z0.025 = 1.96
    E = zα/2·σ/√n = 1.96(7)/√32 = 2.4 feet
    x̄ ± E
    137.0 ± 2.4
    134.6 < μ < 139.4 (feet)
    NOTE: The above interval assumes x̄ = 137.0. Technically, the failure to report x̄ to tenths limits the endpoints of the confidence interval to whole number accuracy.
11. Since n<30 and the population is far from normal, the methods of this section may not be used.
12. Since n<30 and the population is far from normal, the methods of this section may not be used.
13. α = 0.05, zα/2 = z0.025 = 1.96
    n = [zα/2·σ/E]² = [(1.96)(68)/(3)]² = 1973.73, rounded up to 1974
14. α = 0.01, zα/2 = z0.005 = 2.575
    n = [zα/2·σ/E]² = [(2.575)(7)/(2)]² = 81.23, rounded up to 82
15. α = 0.01, zα/2 = z0.005 = 2.575
    n = [zα/2·σ/E]² = [(2.575)(0.212)/(0.010)]² = 2980.07, rounded up to 2981
16. α = 0.05, zα/2 = z0.025 = 1.96
    n = [zα/2·σ/E]² = [(1.96)(18.6)/(2)]² = 332.26, rounded up to 333
17. x̄ = 21.12 mg
18. 19.853 < μ < 22.387 (mg)
19. E = (U – L)/2 = (22.387 – 19.853)/2 = 1.267
    21.12 ± 1.267 (mg)
20. We are 95% confident that the interval from 19.853 mg to 22.387 mg contains the true mean amount of tar in all king-size, non-filtered, non-menthol, and non-light cigarettes.
21. a. x̄ = 146.22 lbs
    b. α = 0.05, zα/2 = z0.025 = 1.96
       x̄ ± zα/2·σ/√n
       146.22 ± 1.96(30.86)/√40
       146.22 ± 9.56
       136.66 < μ < 155.78 (lbs)
22. a. x̄ = $415,953
    b. α = 0.05, zα/2 = z0.025 = 1.96
       x̄ ± zα/2·σ/√n
       415,953 ± 1.96(463,364)/√40
       415,953 ± 143,598
       272,355 < μ < 559,551 (dollars)
    c. Yes. In this case the confidence interval includes the true population mean.
23. a. x̄ = 58.3 seconds
    b. α = 0.05, zα/2 = z0.025 = 1.96
       x̄ ± zα/2·σ/√n
       58.3 ± 1.96(9.5)/√40
       58.3 ± 2.9
       55.4 < μ < 61.2 (seconds)
    c. Yes. Since the confidence interval contains 60 seconds, it is reasonable to assume that the sample mean was reasonably close to 60 seconds – and it was, in fact, 58.3 seconds.
24. a. 
x̄ = 4.63 cells/microliter
    b. α = 0.01, zα/2 = z0.005 = 2.575
       x̄ ± zα/2·σ/√n
       4.63 ± 2.575(0.54)/√50
       4.63 ± 0.20
       4.43 < μ < 4.83 (cells/microliter)
    c. The intervals are not directly comparable, since the two given in part (c) are normal ranges for individual counts and the one calculated in part (b) is a confidence interval for mean counts. One would expect the confidence interval for mean counts to be well within the normal ranges for individual counts. The fact that the point estimate and the lower confidence interval limit for the mean are so close to the lower limit of the normal ranges for individuals suggests that the sample may consist of persons with lower red blood cell counts.
25. a. α = 0.05, zα/2 = z0.025 = 1.96
       x̄ ± zα/2·σ/√n
       1522 ± 1.96(333)/√125
       1522 ± 58
       1464 < μ < 1580
    b. α = 0.01, zα/2 = z0.005 = 2.575
       x̄ ± zα/2·σ/√n
       1522 ± 2.575(333)/√125
       1522 ± 77
       1445 < μ < 1599
    c. The 99% confidence interval in part (b) is wider than the 95% confidence interval in part (a). For an interval to have more confidence associated with it, it must be wider to allow for more possibilities.
26. a. α = 0.05, zα/2 = z0.025 = 1.96
       x̄ ± zα/2·σ/√n
       3433 ± 1.96(495)/√75
       3433 ± 112
       3321 < μ < 3545 (grams)
    b. α = 0.05, zα/2 = z0.025 = 1.96
       x̄ ± zα/2·σ/√n
       3433 ± 1.96(495)/√75000
       3433 ± 4
       3429 < μ < 3437 (grams)
    c. The n=75 confidence interval in part (a) is wider than the n=75,000 confidence interval in part (b). There is less accuracy associated with smaller samples.
27. summary statistics: n = 14   Σx = 1875   x̄ = 133.93
    α = 0.05, zα/2 = z0.025 = 1.96
    x̄ ± zα/2·σ/√n
    133.93 ± 1.96(10)/√14
    133.93 ± 5.24
    128.7 < μ < 139.2 (mmHg)
    Ideally, there is a sense in which all the measurements should be the same – and in that case there would be no need for a confidence interval. It is unclear what the given σ = 10 represents in this situation.
    Is it the true standard deviation in the values of all people in the population (in which case it would not be appropriate in this context where only a single person is involved)? Is it the true standard deviation in momentary readings on a single person (due to constant biological fluctuations)? Is it the true standard deviation in readings from evaluator to evaluator (when they are supposedly evaluating the same thing)? Using the methods of this section and assuming σ = 10, the confidence interval would be 128.7 < μ < 139.2 as given above even if all the readings were the same.
28. a. summary statistics: n = 10   Σx = 39   x̄ = 3.9
       α = 0.05, zα/2 = z0.025 = 1.96
       x̄ ± zα/2·σ/√n
       3.9 ± 1.96(2.87)/√10
       3.9 ± 1.8
       2.1 < μ < 5.7
    b. No; since n<30 and the population distribution is not normal, the methods of this section do not apply. No; since the methods of this section do not apply, the confidence interval does not provide a good estimate. Even though the confidence interval may include the true mean, the endpoints of the confidence limits do not carry the supposed level of confidence.
29. summary statistics: n = 35   Σx = 4305   x̄ = 123.00
    α = 0.05, zα/2 = z0.025 = 1.96
    x̄ ± zα/2·σ/√n
    123.00 ± 1.96(100)/√35
    123.00 ± 33.13
    89.9 < μ < 156.1 (million dollars)
30. summary statistics: n = 100   Σx = 70311   x̄ = 703.11
    α = 0.01, zα/2 = z0.005 = 2.575
    x̄ ± zα/2·σ/√n
    703.11 ± 2.575(92.2)/√100
    703.11 ± 23.74
    679.4 < μ < 726.9 (FICO units)
31. α = 0.05, zα/2 = z0.025 = 1.96
    n = [zα/2·σ/E]² = [(1.96)(15)/(5)]² = 34.57, rounded up to 35
32. α = 0.01, zα/2 = z0.005 = 2.575
    n = [zα/2·σ/E]² = [(2.575)(2.5)/(0.2)]² = 1036.04, rounded up to 1037
33. α = 0.05, zα/2 = z0.025 = 1.96
    n = [zα/2·σ/E]² = [(1.96)(10.6)/(0.25)]² = 6906.27, rounded up to 6907
    The sample size is too large to be practical.
34. α = 0.10, zα/2 = z0.05 = 1.645
    n = [zα/2·σ/E]² = [(1.645)(0.88)/(0.1)]² = 209.55, rounded up to 210
35. 
α = 0.05, zα/2 = z0.025 = 1.96
    Using the range rule of thumb: R = 40,000 – 0 = 40,000, and σ ≈ R/4 = 40,000/4 = 10,000.
    n = [zα/2·σ/E]² = [(1.96)(10,000)/(100)]² = 38,416 (already a whole number, so no rounding up is needed)
36. α = 0.05, zα/2 = z0.025 = 1.96
    a. Using the range rule of thumb: R = 96 – 56 = 40, and σ ≈ R/4 = 40/4 = 10.
       n = [zα/2·σ/E]² = [(1.96)(10)/(2)]² = 96.04, rounded up to 97
    b. Using the sample standard deviation: σ ≈ s = 11.297
       n = [zα/2·σ/E]² = [(1.96)(11.297)/(2)]² = 122.56, rounded up to 123
    c. The two values are relatively close. Since s (which considers all the data) is a better estimator for σ than R/4 (which is based entirely on the extreme values), the sample size of 123 should be preferred.
37. Since n/N = 125/200 = 0.625 > 0.05, use the finite population correction factor.
    α = 0.05, zα/2 = z0.025 = 1.96
    x̄ ± [zα/2·σ/√n]·√[(N–n)/(N–1)]
    1522 ± [1.96(333)/√125]·√[(200–125)/(200–1)]
    1522 ± [58.3774]·[0.6139]
    1522 ± 36
    1486 < μ < 1558
    The confidence interval becomes narrower because the sample is a larger portion of the population. As n approaches N, the length of the confidence interval shrinks to 0 – because when n=N the true mean μ can be determined with certainty.
38. From Exercise 32: α = 0.01, zα/2 = z0.005 = 2.575 and σ = 2.5 and E = 0.2.
    n = Nσ²(zα/2)² / [(N–1)E² + σ²(zα/2)²]
      = 500(2.5)²(2.575)² / [(500–1)(0.2)² + (2.5)²(2.575)²]
      = 20720.7/61.4014 = 337.46, rounded up to 338
    Yes; the information about the population size has a significant effect, dropping the required sample size from 1037 to 338.

7-4 Estimating a Population Mean: σ Not Known
1. According to the point estimate (“average”), the parameter of interest is a population mean. But according to the margin of error (“percentage points”), the parameter of interest is a population proportion.
    It is possible that the margin of error the paper intended to communicate was 1% of $483 (or $4.83, which in a 95% confidence interval would correspond to a sample standard deviation of $226.57) – but the proper units for the margin of error in a situation like this are “dollars” and not “percentage points.”
2. Robust against departures from normality means that the requirement that the original population be approximately normal is not a strong requirement, and that the methods of this section still give good results if the departure from normality is not too extreme. The methods of this section are not robust against poor sampling methods, as poor sampling methods can yield data that are entirely useless.
3. No; the estimate will not be good for at least two reasons. First, the sample is a convenience sample using the state of California, and California residents may not be representative of the entire country. Second, any survey that involves self-reporting (especially of financial information) is suspect because people tend to report favorable rather than accurate data.
4. The degrees of freedom in this survey is 4. In general, the degrees of freedom in a problem is the number of data values that are free to vary without changing the estimate of the parameter of interest. The estimate of a mean is determined by Σx, and n–1 of the data values are free to vary so long as the nth value is the one necessary to produce the required Σx.
5. σ unknown, normal population, n=23: use t with df = 22
    α = 0.05, tdf,α/2 = t22,0.025 = 2.074
IMPORTANT NOTE: This manual uses the following conventions.
    (1) The designation “df” stands for “degrees of freedom.”
    (2) Since the t value depends on the degrees of freedom, a subscript may be used to clarify which t distribution is being used. For df = 15 and α/2 = 0.025, for example, one may indicate t15,α/2 = 2.132.
        As with the z distribution, it is also acceptable to use the actual numerical value within the subscript and indicate t15,.025 = 2.132.
    (3) Always use the closest entry in Table A-3. When the desired df is exactly halfway between the two nearest tabled values, be conservative and choose the one with the lower df.
    (4) As the degrees of freedom increase, the t distribution approaches the standard normal distribution – and the “large” row of the t table actually gives z values. Consequently the z score for certain “popular” α and α/2 values may be found by reading Table A-3 “frontwards” instead of Table A-2 “backwards.” This is not only easier but also more accurate – since Table A-3 includes one more decimal place. Note the following examples.
        For “large” df and α/2 = 0.05, tα/2 = 1.645 = zα/2 (as found in the z table).
        For “large” df and α/2 = 0.01, tα/2 = 2.326 = zα/2 (more accurate than the 2.33 in the z table).
        This manual uses this technique from this point on.
        [For df = “large” and α/2 = 0.005, tα/2 = 2.576 ≠ 2.575 = zα/2 (as found in the z table). This is a discrepancy caused by using different mathematical approximation techniques to construct the tables, and not a true difference. While 2.576 is the more standard value, this manual will continue to use 2.575.]
6. σ known, normal population: use z
    α = 0.01, zα/2 = z0.005 = 2.575
7. σ unknown, population not normal, n=6: neither normal nor t applies
8. σ unknown, population not normal, n=40: use t with df = 39
    α = 0.05, tdf,α/2 = t39,0.025 = 2.023
9. σ known, population not normal, n=200: use z
    α = 0.10, zα/2 = z0.05 = 1.645
10. σ unknown, population not normal, n=9: neither normal nor t applies
11. σ unknown, population normal, n=12: use t with df = 11
    α = 0.01, tdf,α/2 = t11,0.005 = 3.106
12. σ unknown, population not normal, n=38: use t with df = 37
    α = 0.05, tdf,α/2 = t37,0.025 = 2.026
13. 
σ unknown, normal distribution: use t with df = 19
    α = 0.05, tdf,α/2 = t19,0.025 = 2.093
    a. E = tα/2·s/√n = 2.093(569)/√20 = 266 dollars
    b. x̄ ± E
       9004 ± 266
       8738 < μ < 9270 (dollars)
14. σ unknown, normal distribution: use t with df = 6
    α = 0.01, tdf,α/2 = t6,0.005 = 3.707
    a. E = tα/2·s/√n = 3.707(0.04)/√7 = 0.06 grams/mile
    b. x̄ ± E
       0.12 ± 0.06
       0.06 < μ < 0.18 (grams/mile)
15. From the SPSS display: 8.0518 < μ < 8.0903 (grams)
    There is 95% confidence that the interval from 8.0518 grams to 8.0903 grams contains the true mean weight of all U.S. dollar coins in circulation.
16. From the TI-83/84 Plus display: 1.5514 < μ < 2.2706 (lbs)
    There is 99% confidence that the interval from 1.5514 lbs to 2.2706 lbs contains the true mean annual weight of the plastic discarded by U.S. households.
17. a. x̄ = 3.2 mg/dL
    b. σ unknown, n > 30: use t with df=46 [45]
       α = 0.05, tdf,α/2 = t46,0.025 = 2.014
       x̄ ± tα/2·s/√n
       3.2 ± 2.014(18.6)/√47
       3.2 ± 5.5
       –2.3 < μ < 8.7 (mg/dL)
       Since the confidence interval includes 0, there is a reasonable possibility that the true value is zero – i.e., that the Garlicin treatment has no effect on LDL cholesterol levels.
18. a. x̄ = 3103 grams
    b. σ unknown, n > 30: use t with df=185 [200]
       α = 0.05, tdf,α/2 = t185,0.025 = 1.972
       x̄ ± tα/2·s/√n
       3103 ± 1.972(696)/√186
       3103 ± 101
       3002 < μ < 3204 (grams)
    c. Yes. Since the confidence interval for the mean birth weight for mothers who used cocaine is entirely below the confidence interval in part (b), it appears that cocaine use is associated with lower birth weights.
19. a. x̄ = 98.20 °F
    b. σ unknown, n > 30: use t with df=105 [100]
       α = 0.01, tdf,α/2 = t105,0.005 = 2.626
       x̄ ± tα/2·s/√n
       98.20 ± 2.626(0.62)/√106
       98.20 ± 0.16
       98.04 < μ < 98.36 (°F)
    c. No, the confidence interval does not contain the value 98.6 °F. This suggests that the common belief that 98.6 °F is the normal body temperature may not be correct.
20. a. x̄ = 2.1 lbs
    b. 
σ unknown, n > 30: use t with df = 39
α = 0.01, tdf,α/2 = t39,0.005 = 2.708
x̄ ± tα/2·s/√n
2.1 ± 2.708(4.8)/√40
2.1 ± 2.1
0 < μ < 4.2 (lbs)
c. Yes; since the confidence interval does not include 0, the diet appears to be effective. No; since the amount of weight loss is so small, the diet does not appear to be practical.
21. a. σ unknown, n > 30: use t with df = 336 [300]
α = 0.05, tdf,α/2 = t336,0.025 = 1.968
x̄ ± tα/2·s/√n
6.0 ± 1.968(2.3)/√337
6.0 ± 0.2
5.8 < μ < 6.2 (days)
b. σ unknown, n > 30: use t with df = 369 [400]
α = 0.05, tdf,α/2 = t369,0.025 = 1.966
x̄ ± tα/2·s/√n
6.1 ± 1.966(2.4)/√370
6.1 ± 0.2
5.9 < μ < 6.3 (days)
c. The two confidence intervals are very similar and overlap considerably. There is no evidence that the echinacea treatment is effective.
22. a. σ unknown, n > 30: use t with df = 141 [100]
α = 0.05, tdf,α/2 = t141,0.025 = 1.984
x̄ ± tα/2·s/√n
1.8 ± 1.984(1.4)/√142
1.8 ± 0.2
1.6 < μ < 2.0 (headaches)
b. σ unknown, n > 30: use t with df = 79 [80]
α = 0.05, tdf,α/2 = t79,0.025 = 1.990
x̄ ± tα/2·s/√n
1.6 ± 1.990(1.2)/√80
1.6 ± 0.3
1.3 < μ < 1.9 (headaches)
c. The two confidence intervals are very similar and overlap considerably. There is no evidence that the acupuncture treatment is effective.
23. a. σ unknown, n ≤ 30: if approximately normal distribution, use t with df = 19
α = 0.05, tdf,α/2 = t19,0.025 = 2.093
x̄ ± tα/2·s/√n
5.0 ± 2.093(2.4)/√20
5.0 ± 1.1
3.9 < μ < 6.1 (VAS units)
b. σ unknown, n ≤ 30: if approximately normal distribution, use t with df = 19
α = 0.05, tdf,α/2 = t19,0.025 = 2.093
x̄ ± tα/2·s/√n
4.7 ± 2.093(2.9)/√20
4.7 ± 1.4
3.3 < μ < 6.1 (VAS units)
c. The two confidence intervals are very similar and overlap considerably. There is no evidence that the magnet treatment is effective.
24. a. σ unknown, n > 30: use t with df = 78 [80]
α = 0.01, tdf,α/2 = t78,0.005 = 2.639
x̄ ± tα/2·s/√n
35.8 ± 2.639(11.3)/√79
35.8 ± 3.4
32.4 < μ < 39.2 (years)
b. σ unknown, n > 30: use t with df = 78 [80]
α = 0.01, tdf,α/2 = t78,0.005 = 2.639
x̄ ± tα/2·s/√n
43.8 ± 2.639(8.9)/√79
43.8 ± 2.6
41.2 < μ < 46.4 (years)
c. The two confidence intervals differ considerably and do not overlap at all. Women Oscar winners are considerably younger than their male counterparts. Either women and men reach their peak acting ability at different ages, or the standards for judging women and men are not really the same.
25. preliminary values: n = 6, Σx = 9.23, Σx² = 32.5197
x̄ = (Σx)/n = (9.23)/6 = 1.538
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [6(32.5197) – (9.23)²]/[6(5)] = 3.664
s = 1.914
σ unknown (and assuming the distribution is approximately normal), use t with df = 5
α = 0.05, tdf,α/2 = t5,0.025 = 2.571
x̄ ± tα/2·s/√n
1.538 ± 2.571(1.914)/√6
1.538 ± 2.009
-0.471 < μ < 3.547 [which should be adjusted, since negative values are not possible]
0 < μ < 3.547 (micrograms/cubic meter)
Yes. The fact that 5 of the 6 sample values are below x̄ raises a question about whether the data meet the requirement that the underlying distribution is normal.
26. preliminary values: n = 7, Σx = 0.85, Σx² = 0.1123
x̄ = (Σx)/n = (0.85)/7 = 0.121
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [7(0.1123) – (0.85)²]/[7(6)] = 0.001514
s = 0.0389
σ unknown (and assuming the distribution is approximately normal), use t with df = 6
α = 0.02, tdf,α/2 = t6,0.01 = 3.143
x̄ ± tα/2·s/√n
0.121 ± 3.143(0.0389)/√7
0.121 ± 0.046
0.075 < μ < 0.168 (grams/mile)
No. Since the confidence interval includes values greater than 0.165, there is a reasonable possibility that the true mean emission amount is greater than that.
NOTE: This is a two-sided 98% confidence interval, and the requirement is one-sided (i.e., that μ < 0.165).
This means that the level of significance associated with the interval may not be the same level of significance associated with a conclusion about the requirement.
27. preliminary values: n = 10, Σx = 204.0, Σx² = 5494.72
x̄ = (Σx)/n = (204.0)/10 = 20.40
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [10(5494.72) – (204.0)²]/[10(9)] = 148.124
s = 12.171
a. σ unknown (and assuming the distribution is approximately normal), use t with df = 9
α = 0.05, tdf,α/2 = t9,0.025 = 2.262
x̄ ± tα/2·s/√n
20.40 ± 2.262(12.171)/√10
20.40 ± 8.71
11.7 < μ < 29.1 (million dollars)
b. No. Since the data are the top 10 salaries, they are not a random sample.
c. There is a sense in which the data are the population (i.e., the top ten salaries) and are not a sample of any population. Possible populations from which the data could be considered a sample (but not a representative sample appropriate for any statistical inference) would be the salaries of all TV personalities, or the top 10 salaries of TV personalities in different years.
d. No. Since no population can be identified from which these data are a random sample, the confidence interval has no context and makes no sense.
28. preliminary values: n = 12, Σx = 1461, Σx² = 182,435
x̄ = (Σx)/n = (1461)/12 = 121.75
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [12(182435) – (1461)²]/[12(11)] = 414.386
s = 20.356
a. σ unknown (and assuming the distribution is approximately normal), use t with df = 11
α = 0.01, tdf,α/2 = t11,0.005 = 3.106
x̄ ± tα/2·s/√n
121.75 ± 3.106(20.356)/√12
121.75 ± 18.25
103.5 < μ < 140.0 (minutes)
b. While it is tempting to add 30 minutes to the upper confidence interval limit associated with the mean times, it is not appropriate to make a decision about individual times based on the distribution of the means. Without knowing the distribution of the lengths of the individual films, and without assuming they are normally distributed, it is still possible to give the manager some guidance.
Defining an outlier to be any value more than two standard deviations from the mean, the usual maximum and minimum film lengths are:
usual min: 121.75 – 2(20.356) = 81.0 minutes
usual max: 121.75 + 2(20.356) = 162.5 minutes
If the manager allowed 162.5 + 30 = 192.5 minutes between showings, he would accommodate all but the unusually long films. In practice, in order to use round numbers and err slightly on the conservative side, he should consider a regular schedule of 195 minutes (i.e., 3 hours and 15 minutes) between feature showings.
29. preliminary values: n = 12, Σx = 52118, Σx² = 228,072,688
x̄ = (Σx)/n = (52118)/12 = 4343.17
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [12(228072688) – (52118)²]/[12(11)] = 155957.06
s = 394.91
σ unknown (and assuming the distribution is approximately normal), use t with df = 11
α = 0.05, tdf,α/2 = t11,0.025 = 2.201
x̄ ± tα/2·s/√n
4343.17 ± 2.201(394.91)/√12
4343.17 ± 250.91
4092.2 < μ < 4594.1 (seconds)
30. preliminary values: n = 43, Σx = 2358, Σx² = 130,930
x̄ = (Σx)/n = (2358)/43 = 54.837
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [43(130930) – (2358)²]/[43(42)] = 38.663
s = 6.218
σ unknown and n > 30, use t with df = 42 [40]
α = 0.01, tdf,α/2 = t42,0.005 = 2.704
x̄ ± tα/2·s/√n
54.837 ± 2.704(6.218)/√43
54.837 ± 2.564
52.3 < μ < 57.4 (years)
There is a sense in which the data are the population (i.e., the ages at inauguration of all US Presidents) and are not a sample of any population. Possible populations from which the data could be considered a sample (but not a representative sample appropriate for any statistical inference) would be the ages of all US adults, the ages upon taking office of world heads of state, or the ages at inauguration of all past, present, and future US presidents.
31. a.
preliminary values: n = 25, Σx = 31.4, Σx² = 40.74
x̄ = (Σx)/n = (31.4)/25 = 1.256
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [25(40.74) – (31.4)²]/[25(24)] = 32.54/600 = 0.0542
s = 0.2329
σ unknown (and assuming the distribution is approximately normal), use t with df = 24
α = 0.05, tdf,α/2 = t24,0.025 = 2.064
x̄ ± tα/2·s/√n
1.256 ± 2.064(0.2329)/√25
1.256 ± 0.096
1.16 < μ < 1.35 (mg)
NOTE: The Minitab output for this exercise is given below.
Variable   N    Mean      StDev     SE Mean   95% CI
nicotine   25   1.25600   0.23288   0.04658   (1.15987, 1.35213)
b. preliminary values: n = 25, Σx = 22.9, Σx² = 22.45
x̄ = (Σx)/n = (22.9)/25 = 0.916
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [25(22.45) – (22.9)²]/[25(24)] = 36.84/600 = 0.0614
s = 0.2478
σ unknown (and assuming the distribution is approximately normal), use t with df = 24
α = 0.05, tdf,α/2 = t24,0.025 = 2.064
x̄ ± tα/2·s/√n
0.916 ± 2.064(0.2478)/√25
0.916 ± 0.102
0.81 < μ < 1.02 (mg)
NOTE: The Minitab output for this exercise is given below.
Variable   N    Mean       StDev      SE Mean    95% CI
nicotine   25   0.916000   0.247790   0.049558   (0.813717, 1.018283)
c. There is no overlap in the confidence intervals. Yes; since the CI for the filtered cigarettes is completely below the CI for the unfiltered cigarettes, the filters appear to be effective in reducing the amounts of nicotine.
32. a. preliminary values: n = 40, Σx = 2776, Σx² = 197632
x̄ = (Σx)/n = (2776)/40 = 69.4
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [40(197632) – (2776)²]/[40(39)] = 127.631
s = 11.297
σ unknown and n > 30, use t with df = 39
α = 0.05, tdf,α/2 = t39,0.025 = 2.024
x̄ ± tα/2·s/√n
69.4 ± 2.024(11.297)/√40
69.4 ± 3.6
65.8 < μ < 73.0 (beats/min)
NOTE: The Minitab output for this exercise is given below.
Variable   N    Mean      StDev     SE Mean   95% CI
PULSE      40   69.4000   11.2974   1.7863    (65.7869, 73.0131)
b.
preliminary values: n = 40, Σx = 3052, Σx² = 238960
x̄ = (Σx)/n = (3052)/40 = 76.3
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [40(238960) – (3052)²]/[40(39)] = 156.215
s = 12.499
σ unknown and n > 30, use t with df = 39
α = 0.05, tdf,α/2 = t39,0.025 = 2.024
x̄ ± tα/2·s/√n
76.3 ± 2.024(12.499)/√40
76.3 ± 4.0
72.3 < μ < 80.3 (beats/min)
NOTE: The Minitab output for this exercise is given below.
Variable   N    Mean      StDev     SE Mean   95% CI
PULSE      40   76.3000   12.4986   1.9762    (72.3027, 80.2973)
c. Since the two confidence intervals overlap, we cannot conclude that the two population means are different. But recall the CAUTION in this section that the overlapping of confidence intervals should not be used for making formal and final conclusions about equality of means.
33. preliminary values: n = 43, Σx = 2738, Σx² = 307,250
x̄ = (Σx)/n = (2738)/43 = 63.674
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [43(307250) – (2738)²]/[43(42)] = 3164.511
s = 56.254
σ unknown and n > 30, use t with df = 42 [40]
α = 0.01, tdf,α/2 = t42,0.005 = 2.704
x̄ ± tα/2·s/√n
63.674 ± 2.704(56.254)/√43
63.674 ± 23.197
40.5 < μ < 86.9 (years)
Yes, the confidence interval changes considerably from the previous 52.3 < μ < 57.4. Yes, apparently confidence interval limits can be very sensitive to outliers. When apparent outliers are discovered in data sets they should be carefully examined to determine if an error has been made. If an error has been made that cannot be corrected, the value should be discarded. If the value appears to be valid, it may be informative to construct confidence intervals with and without the outlier.
34.
preliminary values: n = 43, Σx = 2358, Σx² = 130,930
x̄ = (Σx)/n = (2358)/43 = 54.837
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [43(130930) – (2358)²]/[43(42)] = 38.663
s = 6.218
σ unknown and n > 30, alternate method says use s for σ and use z
α = 0.01, zα/2 = z0.005 = 2.575
x̄ ± zα/2·σ/√n
54.837 ± 2.575(6.218)/√43
54.837 ± 2.441
52.4 < μ < 57.3 (years)
For any α, the z value is smaller than the corresponding t value, although the difference decreases as n increases. This creates a smaller E and a narrower confidence interval than one is entitled to, i.e., it does not take into consideration the extra uncertainty created by using the sample s instead of the population σ. And so the confidence interval found by the alternative method will always be narrower, but usually by a very small amount. In some situations, however, the unjustified narrowness of the interval could lead to incorrect conclusions.
35. assuming a large population:
α = 0.05 & df = 99 [100], tdf,α/2 = t99,0.025 = 1.984
E = tα/2·s/√n = 1.984(0.0518)/√100 = 0.0103 g
x̄ ± E
0.8565 ± 0.0103
0.8462 < μ < 0.8668 (grams)
using the finite population N = 465:
α = 0.05 & df = 99 [100], tdf,α/2 = t99,0.025 = 1.984
E = [tα/2·s/√n] × √[(N-n)/(N-1)] = [1.984(0.0518)/√100] × √(365/464) = 0.0091 g
x̄ ± E
0.8565 ± 0.0091
0.8474 < μ < 0.8656 (grams)
The second confidence interval is narrower, reflecting the fact that there are more restrictions and less variability (and more certainty) in the finite population situation when n > 0.05N.
36. a. In general, one sample value gives no information about the variation of the variable. It is possible, however, that one value plus other considerations can give some insight. If one knows that 0 is a possible value, for example, then one large sample value would indicate a large variance.
[For example: If you take a sample of n=1 of the daily snowfall in a US city and find that 10.0 feet of snow fell that day, you would assume that there are days with no snow and that there must be a large variability in the amounts of daily snowfall.]
b. The formula for E requires a value for s and a t score with n-1 degrees of freedom. When n=1, the formula for s fails to produce a value [because there is an (n-1) in the denominator] and there is no df=0 row for the t statistic. No confidence interval can be constructed.
c. x ± 9.68|x|
12.0 ± 9.68|12.0|
12.0 ± 116.2
-104.2 < μ < 128.2 [which should be adjusted, since negative heights are not possible]
0 < μ < 128.2 (feet)
Is it likely that some other randomly selected Martian may be 50 feet tall?
No, if “likely” is understood to be “highly probable.” The range for individual heights would be even larger than the 0 to 128 given for the mean. With so many possibilities over such a wide range, 50 (or any other individual value) is not highly probable.
Yes, if “likely” is understood to be “reasonable.” Since the confidence interval includes the value 50, it is a reasonable possibility for the mean height of all Martians, and any possible mean height would be a possible individual height.
7-5 Estimating a Population Variance
1. We can be 95% confident that the interval from 0.0455 grams to 0.0602 grams includes the true value of the standard deviation in the weights for the population of all M&M’s.
2. Yes; (0.0455 g, 0.0602 g) is another format for indicating the confidence interval given in Exercise 1, although the format in Exercise 1 has the advantage of indicating that the parameter of interest is σ, the population standard deviation. No; while the given expression yields the same endpoints given in Exercise 1, it falsely implies that the point estimate for the parameter in question is 0.05285 grams.
3. No; the population of last two digits from 00 to 99 follows a uniform distribution and not a normal distribution.
One of the requirements for using the methods of this section is that the population values have a distribution that is approximately normal, even if the sample size is large.
4. An unbiased estimator is one whose long-run average value is equal to the true value of the population parameter it estimates. The sample variance is an unbiased estimator of the population variance, but the sample standard deviation is not an unbiased estimator of the population standard deviation, as illustrated by exercises 10 and 11 in section 6-4 of the previous chapter.
5. α = 0.05 and df = 8
χ²L = χ²8,0.975 = 2.180
χ²R = χ²8,0.025 = 17.535
6. α = 0.05 and df = 19
χ²L = χ²19,0.975 = 8.907
χ²R = χ²19,0.025 = 32.852
7. α = 0.01 and df = 80
χ²L = χ²80,0.995 = 51.172
χ²R = χ²80,0.005 = 116.321
8. α = 0.10 and df = 50
χ²L = χ²50,0.95 = 34.764
χ²R = χ²50,0.05 = 67.505
9. α = 0.05 and df = 29; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(29)(333)²/45.722 < σ² < (29)(333)²/16.047
70333.3 < σ² < 200397.6
265 < σ < 448
10. α = 0.05 and df = 24; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(24)(2.3)²/39.364 < σ² < (24)(2.3)²/12.401
3.23 < σ² < 10.24
1.8 < σ < 3.2 (mph)
11. α = 0.01 and df = 6; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(6)(2.019)²/18.548 < σ² < (6)(2.019)²/0.676
1.3186 < σ² < 36.1807
1.148 < σ < 6.015 (cells/microliter)
12. α = 0.01 and df = 7; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(7)(0.12)²/20.278 < σ² < (7)(0.12)²/0.989
0.00497 < σ² < 0.10192
0.07 < σ < 0.32 (seconds)
13. From the upper right section of Table 7-2, n = 19,205. No. This sample size is too large to be practical for most applications.
14. From the upper right section of Table 7-2, n = 21. Yes. This sample size is practical for most applications.
15. From the lower left section of Table 7-2, n = 101. Yes.
This sample size is practical for most applications.
16. From the upper left section of Table 7-2, n = 211.
17. α = 0.05 and df = 189; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(189)(645)²/228.9638 < σ² < (189)(645)²/152.8222
343411 < σ² < 514511
586 < σ < 717 (grams)
No. Since the confidence interval includes 696, it is a reasonable possibility for σ.
18. a. R = 1.015 – 0.696 = 0.319 grams
By the range rule of thumb, σ ≈ R/4 = 0.319/4 = 0.07975 grams.
b. α = 0.05 and df = 99 [100]; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(99)(0.0518)²/129.561 < σ² < (99)(0.0518)²/74.222
0.002050 < σ² < 0.003579
0.0453 < σ < 0.0598 (grams)
c. No; the confidence interval does not contain the estimate from part (a). This suggests that the range rule of thumb is not accurate in this case. Remember, however, that the range rule of thumb applies to all distributions, and that normal distributions (like the weights of the M&M’s) have smaller standard deviations than other distributions with the same range because they bunch up near the middle.
19. a. α = 0.05 and df = 22; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(22)(22.9)²/36.781 < σ² < (22)(22.9)²/10.982
313.67 < σ² < 1050.54
17.7 < σ < 32.4 (minutes)
b. α = 0.05 and df = 11; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(11)(20.8)²/21.920 < σ² < (11)(20.8)²/3.816
217.11 < σ² < 1247.13
14.7 < σ < 35.3 (minutes)
c. The two intervals are similar. No, there does not appear to be a difference in the variation of lengths of PG/PG-13 movies and R movies.
20. a. α = 0.01 and df = 39 [40]; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(39)(11.3)²/66.766 < σ² < (39)(11.3)²/20.707
74.588 < σ² < 240.494
8.6 < σ < 15.5 (beats/minute)
b.
α = 0.01 and df = 39 [40]; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(39)(12.5)²/66.766 < σ² < (39)(12.5)²/20.707
91.270 < σ² < 294.285
9.6 < σ < 17.2 (beats/minute)
c. The two intervals are similar. No, there does not appear to be a difference in the variation of pulse rates of men and women.
21. preliminary values: n = 12, Σx = 52118, Σx² = 228,072,688
x̄ = (Σx)/n = (52118)/12 = 4343.2
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [12(228072688) – (52118)²]/[12(11)] = 155,957.06
s = 394.91
α = 0.01 and df = 11; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(11)(394.91)²/26.757 < σ² < (11)(394.91)²/2.603
64115.10 < σ² < 659057.88
253.2 < σ < 811.8 (seconds)
22. preliminary values: n = 12, Σx = 10008, Σx² = 8,360,132
x̄ = (Σx)/n = (10008)/12 = 834.0
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [12(8360132) – (10008)²]/[12(11)] = 1223.64
s = 34.98
α = 0.05 and df = 11; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(11)(34.98)²/21.920 < σ² < (11)(34.98)²/3.816
614.05 < σ² < 3527.25
24.8 < σ < 59.4 (mm)
Yes, the interval contains the traditionally believed value of 35 mm.
23. preliminary values: n = 6, Σx = 9.23, Σx² = 32.5197
x̄ = (Σx)/n = (9.23)/6 = 1.538
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [6(32.5197) – (9.23)²]/[6(5)] = 3.664
s = 1.914
α = 0.05 and df = 5; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(5)(3.664)/12.833 < σ² < (5)(3.664)/0.831
1.4276 < σ² < 22.0468
1.195 < σ < 4.695 (micrograms per cubic meter)
Yes. One of the requirements to use the methods of this section is that the original distribution be approximately normal, and the fact that 5 of the 6 sample values are less than the mean suggests that the original distribution is not normal.
24. a.
preliminary values: n = 10, Σx = 71.5, Σx² = 513.27
x̄ = (Σx)/n = (71.5)/10 = 7.15
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [10(513.27) – (71.5)²]/[10(9)] = 0.2272
s = 0.48
α = 0.05 and df = 9; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(9)(0.2272)/19.023 < σ² < (9)(0.2272)/2.700
0.1075 < σ² < 0.7574
0.33 < σ < 0.87 (minutes)
b. preliminary values: n = 10, Σx = 71.5, Σx² = 541.09
x̄ = (Σx)/n = (71.5)/10 = 7.15
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [10(541.09) – (71.5)²]/[10(9)] = 3.3183
s = 1.82
α = 0.05 and df = 9; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(9)(3.3183)/19.023 < σ² < (9)(3.3183)/2.700
1.5699 < σ² < 11.0611
1.25 < σ < 3.33 (minutes)
c. The variation is considerably higher in part (b). Yes; since the intervals do not overlap, there is a significant difference in the variability of the two systems. The single-line system in part (a) is better for the customers because it eliminates the long wait endured by some customers when one of the lines is slow.
25. preliminary values: n = 100, Σx = 70311, Σx² = 50,278,497
x̄ = (Σx)/n = (70311)/100 = 703.11
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [100(50278497) – (70311)²]/[100(99)] = 8506.36
s = 92.23
α = 0.05 and df = 99 [100]; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(99)(8506.36)/129.561 < σ² < (99)(8506.36)/74.222
6499.87 < σ² < 11346.09
80.6 < σ < 106.5 (FICO units)
NOTE: The statistical portion of Excel yielded the following results.
Confidence Level     0.95
Stan. Dev.           92.23
Lower Conf. Limit    80.979
Upper Conf. Limit    107.141
26.
preliminary values: n = 48, Σx = 133,522, Σx² = 393,933,262
x̄ = (Σx)/n = (133522)/48 = 2781.71
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [48(393933262) – (133522)²]/[48(47)] = 479,021.32
s = 692.11
α = 0.01 and df = 47 [50]; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(47)(479021.32)/79.490 < σ² < (47)(479021.32)/27.991
283230.62 < σ² < 804330.03
532.2 < σ < 896.8 (kWh)
NOTE: The statistical portion of Excel yielded the following results.
Confidence Level     0.99
Stan. Dev.           692.114
Lower Conf. Limit    545.339
Upper Conf. Limit    934.611
27. Applying the given formula yields the following χ²L and χ²R values.
χ² = (1/2)[±zα/2 + √(2(df) - 1)]²
= (1/2)[±1.96 + √(2(189) - 1)]²
= (1/2)[±1.96 + 19.416]²
= (1/2)[17.456]² and (1/2)[21.376]²
= 152.3645 and 228.4771
These are close to the 152.8222 and 228.9638 given in exercise #17.
28. Notice that what the exercise calls (n-1)s²/(n+1) can be restated more naturally in a format similar to the definition of the variance as
(n-1)s²/(n+1) = [(n-1)/(n+1)] × [Σ(x-x̄)²/(n-1)] = Σ(x-x̄)²/(n+1).
This exercise compares the mean square error (MSE) of the unbiased s² = Σ(x-x̄)²/(n-1) with the biased s²B = Σ(x-x̄)²/(n+1).
Consider the original population of size N=3.
x        x-μ    (x-μ)²
2        -2     4
3        -1     1
7         3     9
total 12  0     14
μ = (Σx)/N = 12/3 = 4
σ² = Σ(x-μ)²/N = 14/3 = 4.667
Consider the 9 possible equally likely samples (with replacement) of size n=2. The following table contains the values necessary to answer the various parts of this exercise.
sample   x̄      s²      s²-σ²    (s²-σ²)²   s²B     s²B-σ²   (s²B-σ²)²
2,2      2.0    0       -4.667   21.778     0       -4.667   21.778
2,3      2.5    0.5     -4.167   17.361     0.167   -4.500   20.250
2,7      4.5    12.5     7.833   61.361     4.167   -0.500    0.250
3,2      2.5    0.5     -4.167   17.361     0.167   -4.500   20.250
3,3      3.0    0       -4.667   21.778     0       -4.667   21.778
3,7      5.0    8.0      3.333   11.111     2.667   -2.000    4.000
7,2      4.5    12.5     7.833   61.361     4.167   -0.500    0.250
7,3      5.0    8.0      3.333   11.111     2.667   -2.000    4.000
7,7      7.0    0       -4.667   21.778     0       -4.667   21.778
total    36.0   42       0       245        14      -28      114.333
a. Consider the estimator s².
E(s²) = (Σs²)/9 = 42/9 = 4.667 = σ², and so s² is an unbiased estimator of σ².
MSE(s²) = Σ(s²-σ²)²/9 = 245/9 = 27.222
b. Consider the estimator s²B.
E(s²B) = (Σs²B)/9 = 14/9 = 1.556 ≠ σ², and so s²B is not an unbiased estimator of σ².
MSE(s²B) = Σ(s²B-σ²)²/9 = 114.333/9 = 12.704
c. Parts (a) and (b) show that s²B has a smaller mean square error (MSE), but that it is a biased estimator. It can be shown that the estimator s²B = Σ(x-x̄)²/(n+1) has the minimum MSE of all estimators of σ².
Statistical Literacy and Critical Thinking
1. A point estimate is a single value calculated from sample data that is used to estimate the true value of a population characteristic, called the parameter. In this context the sample proportion that test positive is the best point estimate for the population proportion that would test positive. A confidence interval is a range of values that is likely, with some specific degree of confidence, to include the true value of the population parameter. The major advantage of the confidence interval over the point estimate is its ability to communicate a sense of the accuracy of the estimate.
2. We can be 95% confident that the interval from 2.62% to 4.99% contains the true percentage of all job applicants who would test positive for drug use.
3. The confidence level in Exercise 2 is 95%.
In general, the confidence level specifies the proportion of times a given procedure to construct an interval estimate can be expected to produce an interval that will include the true value of the parameter.
4. The respondents are not likely to be representative of the general population for two reasons. The sample is a convenience sample, composed only of those who visit the AOL Web site. The sample is a voluntary response sample, composed only of those who take the time to self-select themselves to be in the survey. Convenience samples are typically not representative racially, socio-economically, etc. Voluntary response samples typically include mainly those with strong opinions on, or a personal interest in, the topic of the survey.
Chapter Quick Quiz
1. We can be 95% confident that the interval from 20.0 to 20.0 contains the true value of the population mean.
2. The interval includes some values greater than 50%, suggesting that the Republican may win; but the interval also includes some values less than 50%, suggesting that the Republican may lose. Statement (2), that the election is too close to call, best describes the results of the survey.
3. The critical value of tα/2 for n=20 and α = 0.05 is t19,0.025 = 2.093.
4. The critical value of zα/2 for n=20 and α = 0.10 is z0.05 = 1.645.
5. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.02; p̂ unknown, use p̂ = 0.5
n = [(zα/2)²·p̂q̂]/E² = [(1.96)²(0.5)(0.5)]/(0.02)² = 2401
6. p̂ = x/n = 240/600 = 0.40
7. α = 0.05, zα/2 = z0.025 = 1.96 and p̂ = x/n = 240/600 = 0.4000
p̂ ± zα/2·√(p̂q̂/n)
0.4000 ± 1.96·√[(0.4000)(0.6000)/600]
0.4000 ± 0.0392
0.361 < p < 0.439
8. σ unknown, n > 30: use t with df = 35 and α = 0.05, tdf,α/2 = t35,0.025 = 2.030
x̄ ± tα/2·s/√n
40.0 ± 2.030(10.0)/√36
40.0 ± 3.4
36.6 < μ < 43.4 (years)
9. σ known, n > 30: use z and α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
40.0 ± 1.96(10.0)/√36
40.0 ± 3.3
36.7 < μ < 43.3 (years)
10.
α = 0.05, zα/2 = z0.025 = 1.96
n = [zα/2·σ/E]² = [(1.96)(12)/(0.5)]² = 2212.76, rounded up to 2213
Review Exercises
1. α = 0.05 and zα/2 = z0.025 = 1.96; p̂ = x/n = 589/745 = 0.7906
p̂ ± zα/2·√(p̂q̂/n)
0.7906 ± 1.96·√[(0.7906)(0.2094)/745]
0.7906 ± 0.0292
0.761 < p < 0.820 or 76.1% < p < 82.0%
We can be 95% confident that the interval from 76.1% to 82.0% contains the true percentage of all adults who believe that it is morally wrong to not report all income on their tax returns.
2. α = 0.01, zα/2 = z0.005 = 2.575 and E = 0.02; p̂ unknown, use p̂ = 0.5
n = [(zα/2)²·p̂q̂]/E² = [(2.575)²(0.5)(0.5)]/(0.02)² = 4144.14, rounded up to 4145
3. α = 0.01, zα/2 = z0.005 = 2.575
n = [zα/2·σ/E]² = [(2.575)(28785)/(500)]² = 21975.91, rounded up to 21,976
No, the required sample size is too large to be practical. It appears that some re-thinking of the requirements is necessary.
4. σ unknown, n > 30: use t with df = 36 and α = 0.01, tdf,α/2 = t36,0.005 = 2.719
x̄ ± tα/2·s/√n
2.4991 ± 2.719(0.0165)/√37
2.4991 ± 0.0074
2.4917 < μ < 2.5065 (grams)
Since the above interval includes 2.5 grams, it appears on the surface that the manufacturing process is meeting the design specifications. However, the interval may not be relevant to determining whether the pennies were manufactured according to specification if the sample came (as it seems) from worn pennies in circulation.
5. preliminary values: n = 5, Σx = 3344, Σx² = 2,470,638
x̄ = (Σx)/n = (3344)/5 = 668.8
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [5(2470638) – (3344)²]/[5(4)] = 58542.7
s = 241.96
σ unknown and n=5: assuming a normal distribution, use t with df = 4
α = 0.05, tdf,α/2 = t4,0.025 = 2.776
x̄ ± tα/2·s/√n
668.8 ± 2.776(241.96)/√5
668.8 ± 300.4
368.4 < μ < 969.2 (hic)
6.
preliminary values: n = 5, Σx = 3344, Σx² = 2,470,638
x̄ = (Σx)/n = (3344)/5 = 668.8
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [5(2470638) – (3344)²]/[5(4)] = 58542.7
s = 241.96
α = 0.05 and df = 4; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(4)(58542.7)/11.143 < σ² < (4)(58542.7)/0.484
21015.06 < σ² < 483823.97
145.0 < σ < 695.6 (hic)
7. Let x = the number who believe that cloning should not be allowed.
a. p̂ = x/n = 901/1012 = 0.8903, rounded to 0.890
b. α = 0.05, zα/2 = z0.025 = 1.96
p̂ ± zα/2·√(p̂q̂/n)
0.8903 ± 1.96·√[(0.8903)(0.1097)/1012]
0.8903 ± 0.0193
0.871 < p < 0.910
c. Yes. Since the entire interval is above 50% = 0.50, there is strong evidence that the majority is opposed to such cloning.
8. a. α = 0.05, zα/2 = z0.025 = 1.96 and E = 0.04; p̂ unknown, use p̂ = 0.5
n = [(zα/2)²·p̂q̂]/E² = [(1.96)²(0.5)(0.5)]/(0.04)² = 600.25, rounded up to 601
b. α = 0.05, zα/2 = z0.025 = 1.96
n = [zα/2·σ/E]² = [(1.96)(14227)/(750)]² = 1382.34, rounded up to 1383
c. To meet both criteria simultaneously, use the larger sample size of n=1383.
9. preliminary values: n = 8, Σx = 30.72, Σx² = 160.2186
a. x̄ = (Σx)/n = (30.72)/8 = 3.840 lbs
b. σ unknown and n=8: assuming a normal distribution, use t with df = 7
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [8(160.2186) – (30.72)²]/[8(7)] = 6.03626
s = 2.4569
α = 0.05, tdf,α/2 = t7,0.025 = 2.365
x̄ ± tα/2·s/√n
3.8400 ± 2.365(2.4569)/√8
3.8400 ± 2.0543
1.786 < μ < 5.894 (lbs)
c. σ known and a normal distribution: use z
α = 0.05, zα/2 = z0.025 = 1.96
x̄ ± zα/2·σ/√n
3.8400 ± 1.96(3.108)/√8
3.8400 ± 2.1537
1.686 < μ < 5.994 (lbs)
10. preliminary values: n = 8, Σx = 30.72, Σx² = 160.2186
x̄ = (Σx)/n = (30.72)/8 = 3.840
s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [8(160.2186) – (30.72)²]/[8(7)] = 6.03626
s = 2.4569
a. α = 0.05 and df = 7; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(7)(6.03626)/16.013 < σ² < (7)(6.03626)/1.690
2.6387 < σ² < 25.0022
1.624 < σ < 5.000 (lbs)
b.
α = 0.05 and df = 7; χ²L = χ²df,1-α/2 and χ²R = χ²df,α/2
(n-1)s²/χ²R < σ² < (n-1)s²/χ²L
(7)(6.03626)/16.013 < σ² < (7)(6.03626)/1.690
2.639 < σ² < 25.002 (lbs²)
Cumulative Review Exercises
1. scores in order: 103 105 110 119 119 123 125 125 127 128
preliminary values: n = 10, Σx = 1184, Σx² = 140948
a. x̄ = (Σx)/n = (1184)/10 = 118.4 lbs
b. median = (x5 + x6)/2 = (119 + 123)/2 = 121.0 lbs
c. s² = [n(Σx²) – (Σx)²]/[n(n-1)] = [10(140948) – (1184)²]/[10(9)] = 84.711
s = 9.2 lbs
2. Ratio, since differences are meaningful and there is a meaningful zero.
3. From Exercise 1, x̄ = 118.4 and s = 9.204.
α = 0.05, tdf,α/2 = t9,0.025 = 2.262
x̄ ± tα/2·s/√n
118.4 ± 2.262(9.204)/√10
118.4 ± 6.6
111.8 < μ < 125.0 (lbs)
4. α = 0.05, zα/2 = z0.025 = 1.96
n = [zα/2·σ/E]² = [(1.96)(7.5)/(2)]² = 54.0225, rounded up to 55
5. Let D = an applicant tests positive for drugs. P(D) = 0.038
a. P(not D) = 1 – P(D) = 1 – 0.038 = 0.962
b. P(D1 and D2) = P(D1)·P(D2|D1) = (0.038)(0.038) = 0.00144
c. binomial: n=500 and p=0.038
normal approximation appropriate since
np = 500(0.038) = 19 ≥ 5
nq = 500(0.962) = 481 ≥ 5
μ = np = 500(0.038) = 19
σ = √(npq) = √[500(0.038)(0.962)] = 4.275
P(x ≥ 20) = P(x>19.5) = P(z>0.12) = 1 – 0.5478 = 0.4522
[Figure: normal curve with μ = 19 (z = 0) and x = 19.5 (z = 0.12); the area to the left of 19.5 is 0.5478]
6. a. normal distribution
μ = 21.1
σ = 4.8
P(x>20.0) = P(z>-0.23) = 1 – 0.4090 = 0.5910
[Figure: normal curve with x = 20.0 (z = -0.23) and μ = 21.1 (z = 0); the area to the left of 20.0 is 0.4090]
NOTE: Since ACT scores are whole numbers, another valid interpretation of part (a) is P(x>20) = PC(x>20.5) = P(z>-0.13) = 1 – 0.4483 = 0.5517. Presumably the “20.0” was specified in the exercise to discourage this interpretation and to allow for a direct comparison to part (b).
b. normal distribution, since the original distribution is so
μx̄ = μ = 21.1
σx̄ = σ/√n = 4.8/√25 = 0.96
P(x̄>20.0) = P(z>-1.15) = 1 – 0.1251 = 0.8749
c.
normal distribution: μ = 21.1, σ = 4.8
For P90, A = 0.9000 [0.8997] and z = 1.28 [from z table] or z = 1.282 [from last row of t table]
x = μ + zσ = 21.1 + (1.282)(4.8) = 21.1 + 6.2 = 27.3
[Figure for part (b): normal curve with x̄ = 20.0 (z = -1.15) and μ = 21.1 (z = 0); the area to the left of 20.0 is 0.1251]
[Figure for part (c): normal curve with μ = 21.1 (z = 0) and the unknown x at z = 1.282; the area to the left of x is 0.9000]
7. A simple random sample of size n from some population occurs when every sample of size n has an equal chance of being selected from that population. A voluntary response sample occurs when the subjects themselves decide whether to be included.
8. As grade point averages are typically reported to two decimal places, R = 4.00 – 0.00 = 4.00. The range rule of thumb states that σ ≈ R/4 = 4.00/4 = 1.000.
9. Let C = getting a T-F question correct by random guessing.
P(C1 and C2 and … and C12) = P(C1)·P(C2)·…·P(C12) = (0.5)(0.5)…(0.5) = (0.5)¹² = 0.000244
Since 0.000244 > 0, it is possible to get all 12 questions correct by random guessing.
Since 0.000244 < 0.05, it is unlikely to get all 12 questions correct by random guessing.
10. This is a convenience sample, composed of those friends who happen to be available. Convenience samples are typically not representative of the population in a variety of ways, e.g., racially, socio-economically, by gender, etc.
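The interval and sample-size calculations that recur throughout Sections 7-4 and 7-5 follow a fixed recipe, which can be sketched in Python as a way to check the arithmetic. This is a minimal sketch and not part of the manual: the function names are hypothetical, and the critical values (t, χ², z) are assumed to be looked up in Tables A-2, A-3 and A-4 and passed in, rather than computed.

```python
import math


def summary_stats(n, sum_x, sum_x2):
    """Mean and sample standard deviation from n, Σx and Σx² (shortcut formula)."""
    mean = sum_x / n
    variance = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))
    return mean, math.sqrt(variance)


def t_interval(n, sum_x, sum_x2, t_crit):
    """Confidence interval for μ when σ is unknown: x̄ ± t·s/√n."""
    mean, s = summary_stats(n, sum_x, sum_x2)
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin


def sigma_interval(n, s, chi2_left, chi2_right):
    """Confidence interval for σ: sqrt of (n-1)s²/χ²R and (n-1)s²/χ²L."""
    lower = math.sqrt((n - 1) * s ** 2 / chi2_right)
    upper = math.sqrt((n - 1) * s ** 2 / chi2_left)
    return lower, upper


def sample_size_mean(z_crit, sigma, margin):
    """Sample size for estimating μ: [z·σ/E]², rounded up."""
    return math.ceil((z_crit * sigma / margin) ** 2)


def sample_size_proportion(z_crit, margin, p_hat=0.5):
    """Sample size for estimating p: z²·p̂q̂/E², rounded up (p̂ = 0.5 if unknown)."""
    return math.ceil(z_crit ** 2 * p_hat * (1 - p_hat) / margin ** 2)


if __name__ == "__main__":
    # Section 7-4, Exercise 29: n = 12, Σx = 52118, Σx² = 228,072,688, t = 2.201
    print(t_interval(12, 52118, 228072688, 2.201))
    # Section 7-5, Exercise 21: s = 394.91, χ²L = 2.603, χ²R = 26.757
    print(sigma_interval(12, 394.91, 2.603, 26.757))
    # Chapter Quick Quiz, Exercise 10: z = 1.96, σ = 12, E = 0.5
    print(sample_size_mean(1.96, 12, 0.5))
```

Passing the tabled critical values in as arguments keeps the results comparable with the manual, which rounds them (for example 2.575 rather than 2.576); computing them numerically could shift a rounded-up sample size by a few units.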

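The comparison of the unbiased and biased variance estimators in Exercise 28 of Section 7-5 can also be reproduced by brute-force enumeration. The sketch below assumes the same population {2, 3, 7} and samples of size n = 2 drawn with replacement; the helper name `estimator_summary` is hypothetical and chosen only for illustration.

```python
from itertools import product

population = [2, 3, 7]
n = 2  # sample size, sampling with replacement

mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)


def estimator_summary(divisor):
    """Expected value and MSE of Σ(x - x̄)²/divisor over all equally likely samples."""
    values = []
    for sample in product(population, repeat=n):
        xbar = sum(sample) / n
        sum_sq = sum((x - xbar) ** 2 for x in sample)
        values.append(sum_sq / divisor)
    expected = sum(values) / len(values)
    mse = sum((v - sigma2) ** 2 for v in values) / len(values)
    return expected, mse


unbiased = estimator_summary(n - 1)  # s²: divide by n-1
biased = estimator_summary(n + 1)    # s²B: divide by n+1

print("sigma^2 =", round(sigma2, 3))
print("s^2 : E =", round(unbiased[0], 3), " MSE =", round(unbiased[1], 3))
print("s^2B: E =", round(biased[0], 3), " MSE =", round(biased[1], 3))
```

The enumeration covers only this one small population, so it illustrates the bias and MSE trade-off rather than proving the general minimum-MSE claim made in part (c).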