
Part IV: Complete Solutions, Chapter 3

Chapter 3: Averages and Variation

Calculations may vary slightly owing to rounding.

Section 3.1

1. The middle value is the median. The most frequent value is the mode. The mean takes all values into account.

2. The symbol for the sample mean is $\bar{x}$, and the symbol for the population mean is $\mu$.

3. For a mound-shaped, symmetric distribution, the mean, median, and mode all will be equal.

4. (a) Mean, median, and mode if it exists
   (b) Mode if it exists
   (c) Mean, median, and mode if it exists

5. (a) Mode = 5, the most common value
       Median = 4, the middle value in the ordered data set
       Mean = $\frac{2 + 3 + 4 + 5 + 5}{5} = \frac{19}{5} = 3.8$
   (b) Only the mode
   (c) All three make sense.
   (d) The mode and the median

6. (a) Mode = 2, the most common value
       Median = 3, the middle value in the ordered data set
       Mean = $\frac{2 + 2 + 3 + 6 + 10}{5} = 4.6$
   (b) Mode = 7, median = 8, mean = 9.6, using the same techniques as part (a)
   (c) Each statistic was increased by 5. In general, adding a constant c to each value in a data set results in the mode, median, and mean increasing by c.

7. (a) Mode = 2, the most common value
       Median = 3, the middle value in the ordered data set
       Mean = $\frac{2 + 2 + 3 + 6 + 10}{5} = 4.6$
   (b) Mode = 10, median = 15, mean = 23, using the same techniques as part (a)
   (c) Each statistic was multiplied by 5. In general, multiplying each value in a data set by a constant c results in the mode, median, and mean being multiplied by c.
   (d) Mode = 177.8 cm, median = 172.72 cm, mean = 180.34 cm

8. (a) If the largest data value is replaced by a larger value, the mean will increase because the sum of the data values will increase. The median will not change because the same value will still be in the eighth position when the data are ordered.
   (b) If the largest value is replaced by a smaller value (but still higher than the median), the mean will decrease because the sum of the data values will decrease.
   The median will not change because the same value will be in the eighth position in increasing order.

Copyright © Houghton Mifflin Company. All rights reserved.

   (c) If the largest value is replaced by a value that is smaller than the median, the mean will decrease because the sum of the data values will decrease. The median also will decrease because the former value in the eighth position will move to the ninth position in increasing order. The median will be the new value in the eighth position.

9. Mean = $\frac{146 + 152 + \cdots + 144}{14} \approx 167.3$
   To compute the median, first order the data set smallest to largest. Then
   Median = $\frac{168 + 174}{2} = 171$
   Mode = most common value = 178

10. Mean = $\frac{111}{18} \approx 6.167$
    Median = $\frac{5 + 7}{2} = 6$
    Mode = 7

11. First, organize the data from smallest to largest. Then compute the mean, median, and mode.
    (a) Upper Canyon: 1 1 1 2 3 3 3 3 4 6 9
        Mean = $\bar{x} = \frac{\sum x}{n} = \frac{36}{11} \approx 3.27$
        Median = 3 (middle value)
        Mode = 3 (occurs most frequently)
    (b) Lower Canyon: 0 0 1 1 1 1 2 2 3 6 7 8 13 14
        Mean = $\bar{x} = \frac{\sum x}{n} = \frac{59}{14} \approx 4.21$
        Median = $\frac{2 + 2}{2} = 2$
        Mode = 1 (occurs most frequently)
    (c) The mean for the Lower Canyon is greater than that of the Upper Canyon. However, the median and mode for the Lower Canyon are less than those of the Upper Canyon.
    (d) 5% of 14 is 0.7, which rounds to 1. So eliminate one data value from the bottom of the list and one from the top. Then compute the mean of the remaining 12 values.
        5% trimmed mean = $\frac{\sum x}{n} = \frac{45}{12} = 3.75$
        Now this value is closer to the Upper Canyon mean.

12. (a) Mean = $\bar{x} = \frac{\sum x}{n} = \frac{1050}{40} \approx 26.3$ years
        Median = $\frac{25 + 26}{2} = 25.5$ years
        Mode = 25 years
    (b) The three averages are close, so each represents the age fairly accurately. There may be one high outlier (37), so the median may be the best measure.

13.
    (a) Mean = $\bar{x} = \frac{2723}{20} \approx \$136.20$
        Median = $\frac{65 + 68}{2} = \$66.50$
        Mode = $\$60.00$
    (b) 5% of 20 data values is 1, so we remove the smallest and largest values and recompute the mean.
        Mean = $\bar{x} = \frac{2183}{18} \approx \$121.30$
        The trimmed mean is still much larger than the median.
    (c) Reporting the median certainly will give the customer a much lower figure for the daily cost, but that really doesn't tell the whole story. Reporting the mean and the median, as well as the high outliers, may be the most useful description of the situation.

14. Weighted average = $\frac{\sum xw}{\sum w} = \frac{92(0.25) + 81(0.225) + 93(0.225) + 85(0.30)}{1} = 87.65$

15. Weighted average = $\frac{\sum xw}{\sum w} = \frac{9(2) + 7(3) + 6(1) + 10(4)}{2 + 3 + 1 + 4} = \frac{85}{10} = 8.5$

16. (a) Weighted average = $\frac{\sum xw}{\sum w} = \frac{64.1(0.38) + 75.8(0.47) + 23.9(0.07) + 68.2(0.08)}{1} \approx 67.1$ mg/L
    (b) Since 67.1 mg/L is greater than 58 mg/L, this wetlands system does not meet the target standard for the chlorine compound. The average concentration of the chlorine compound is too high.

17. Harmonic mean = $\frac{2}{\frac{1}{60} + \frac{1}{75}} \approx 66.67$ mph

18. Geometric mean = $\sqrt[5]{(1.10)(1.12)(1.148)(1.038)(1.16)} \approx 1.112$. Thus the average growth factor is approximately 1.112, that is, average growth of about 11%.

Section 3.2

1. The mean is associated with the standard deviation.

2. The standard deviation is the square root of the variance.

3. Yes. When computing the sample standard deviation, divide by n – 1. When computing the population standard deviation, divide by n.

4. The symbol for the sample standard deviation is s. The symbol for the population standard deviation is σ.

5. (a) i, ii, iii
   (b) The data change between data sets (i) and (ii) increased the squared difference sum $\sum (x - \bar{x})^2$ by 10, whereas the data change between data sets (ii) and (iii) increased the squared difference sum $\sum (x - \bar{x})^2$ by only 6.

6. (a) $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \approx 3.61$
   (b) Adding a constant to each data value does not change s. Thus s ≈ 3.61.
   (c) Shifting data by c units does not change the standard deviation.

7. (a) s ≈ 3.61 (same as above)
   (b) s ≈ 18.0
   (c) We see that the standard deviation has been multiplied by 5. In general, multiplying each data value by a constant c will result in the standard deviation being multiplied by the absolute value of c.

8. (a) No, 80 is only 2 standard deviations away from its mean.
   (b) Yes, 80 is 3.33 standard deviations away from its mean.

9. (a) Range = maximum – minimum = 30 – 15 = 15
   (b) Use a calculator to verify that $\sum x = 110$ and that $\sum x^2 = 2{,}568$.
   (c) Computation formula (sample data) for $s^2$:
       $s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{2568 - \frac{(110)^2}{5}}{5 - 1}} \approx 6.08$
       $s^2 \approx 6.08^2 \approx 37$
   (d) $\bar{x} = \frac{\sum x}{n} = \frac{110}{5} = 22$
       Defining formula (sample data) for $s^2$:
       $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{(23 - 22)^2 + (17 - 22)^2 + \cdots + (25 - 22)^2}{5 - 1}} \approx 6.08$
       $s^2 \approx 6.08^2 \approx 37$
   (e) $\mu = 22$
       $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} = \sqrt{\frac{(23 - 22)^2 + (17 - 22)^2 + \cdots + (25 - 22)^2}{5}} \approx 5.44$
       $\sigma^2 \approx 5.44^2 \approx 29.59$

10. (a)
       x     x^2      y    y^2
      11     121     10    100
       0       0      2      4
      36    1296     29    841
      21     441     14    196
      31     961     22    484
      23     529     18    324
      24     576     14    196
      11     121      2      4
      11     121      3      9
      21     441     10    100
      $\sum x = 103$, $\sum x^2 = 4607$, $\sum y = 90$, $\sum y^2 = 2258$
    (b) $\bar{x} = \frac{\sum x}{n} = \frac{103}{10} = 10.3$
        $s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{4607 - \frac{(103)^2}{10}}{10 - 1}} \approx 19.85$
        $s^2 \approx 19.85^2 \approx 394.0$
    (c) $\bar{y} = \frac{\sum y}{n} = \frac{90}{10} = 9$
        $s = \sqrt{\frac{\sum y^2 - \frac{(\sum y)^2}{n}}{n - 1}} = \sqrt{\frac{2258 - \frac{(90)^2}{10}}{10 - 1}} \approx 12.68$
        $s^2 \approx 12.68^2 \approx 160.8$
        $\bar{x} - 2s = 10.3 - 2(19.85) = -29.4$
        $\bar{x} + 2s = 10.3 + 2(19.85) = 50$
        $\bar{y} - 2s = 9 - 2(12.68) = -16.36$
        $\bar{y} + 2s = 9 + 2(12.68) = 34.36$
        At least 75% of the returns for the Total Stock Fund fall between –29.4% and 50%, whereas at least 75% of the returns for the Balanced Index fall between –16.36% and 34.36%.
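The computation formula and the 75% interval used in Problem 10 can be checked numerically. The following is a minimal sketch, not part of the original solution: the function names are illustrative, and the only inputs are the sums printed in the Problem 10 table (n = 10, Σx = 103, Σx² = 4607). The interval uses k = 2 standard deviations, which by Chebyshev's theorem covers at least 75% of the data.

```python
import math


def sample_sd_from_sums(sum_x, sum_x2, n):
    """Sample standard deviation via the computation formula:
    s = sqrt((sum(x^2) - (sum(x))^2 / n) / (n - 1))."""
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))


def interval_about_mean(mean, s, k=2):
    """Interval mean - k*s to mean + k*s (k = 2 gives the 'at least 75%' interval)."""
    return mean - k * s, mean + k * s


# Sums taken from the Problem 10 table (x column).
n, sum_x, sum_x2 = 10, 103, 4607

mean_x = sum_x / n
s_x = sample_sd_from_sums(sum_x, sum_x2, n)
low, high = interval_about_mean(mean_x, s_x)

print(f"mean = {mean_x:.1f}, s = {s_x:.2f}")
print(f"75% interval: {low:.1f} to {high:.1f}")
```

The same two functions apply unchanged to the y column (Σy = 90, Σy² = 2258).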
    (d) Stock fund: CV = $\frac{s}{\bar{x}} \cdot 100 = \frac{19.85}{10.3} \cdot 100 \approx 192.7\%$
        Balanced fund: CV = $\frac{s}{\bar{y}} \cdot 100 = \frac{12.68}{9} \cdot 100 \approx 140.9\%$
        For each unit of return, the balanced fund has lower risk. Since the CV can be thought of as a measure of risk per unit of expected return, a smaller CV is better because a lower risk is better.

11. (a) Range = 7.89 – 0.02 = 7.87
    (b) Use a calculator to verify that $\sum x = 62.11$ and $\sum x^2 = 164.23$.
    (c) $\bar{x} = \frac{\sum x}{n} = \frac{62.11}{50} \approx 1.24$
        $s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{164.23 - \frac{(62.11)^2}{50}}{50 - 1}} \approx 1.333 \approx 1.33$
        $s^2 \approx 1.333^2 \approx 1.78$
    (d) CV = $\frac{s}{\bar{x}} \cdot 100 = \frac{1.33}{1.24} \cdot 100 \approx 107\%$
        The standard deviation of the time to failure is just slightly larger than the average time.

12. (a)
          x        x^2        y        y^2
      13.20     174.24    11.85     140.42
       5.60      31.36    15.25     232.56
      19.80     392.04    21.30     453.69
      15.05     226.50    17.30     299.29
      21.40     457.96    27.50     756.25
      17.25     297.56    10.35     107.12
      27.45     753.50    14.90     222.01
      16.95     287.30    48.70    2371.69
      23.90     571.21    25.40     645.16
      32.40    1049.76    25.95     673.40
      40.75    1660.56    57.60    3317.76
       5.10      26.01    34.35    1179.92
      17.75     315.06    38.80    1505.44
      28.35     803.72    41.00    1681.00
                          31.25     976.56
      $\sum x = 284.95$, $\sum x^2 = 7046.80$, $\sum y = 421.5$, $\sum y^2 = 14{,}562.27$
    (b) Grid E: $\bar{x} = \frac{\sum x}{n} = \frac{284.95}{14} \approx 20.35$
        $s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{7046.80 - \frac{(284.95)^2}{14}}{14 - 1} \approx 96$
        $s = \sqrt{s^2} = \sqrt{96} \approx 9.80$
        Grid H: $\bar{y} = \frac{\sum y}{n} = \frac{421.5}{15} = 28.1$
        $s^2 = \frac{\sum y^2 - \frac{(\sum y)^2}{n}}{n - 1} = \frac{14{,}562.27 - \frac{(421.5)^2}{15}}{15 - 1} \approx 194$
        $s = \sqrt{s^2} = \sqrt{194} \approx 13.93$
    (c) $\bar{x} - 2s = 20.35 - 2(9.80) = 0.75$
        $\bar{x} + 2s = 20.35 + 2(9.80) = 39.95$
        For Grid E, at least 75% of the data fall in the interval 0.75–39.95.
        $\bar{y} - 2s = 28.1 - 2(13.93) = 0.24$
        $\bar{y} + 2s = 28.1 + 2(13.93) = 55.96$
        For Grid H, at least 75% of the data fall in the interval 0.24–55.96. Grid H shows a wider 75% range of values.
    (d) Grid E: CV = $\frac{s}{\bar{x}} \cdot 100 = \frac{9.80}{20.35} \cdot 100 \approx 48.2\%$
        Grid H: CV = $\frac{s}{\bar{y}} \cdot 100 = \frac{13.93}{28.1} \cdot 100 \approx 49.6\%$
        Grid H demonstrates slightly greater variability per expected signal. The CV, together with the 75% interval from part (c), indicates that Grid H might have more buried artifacts.

13. (a) Students verify results with a calculator.
    (b) $\bar{x} = \frac{\sum x}{n} = \frac{245}{5} = 49$
        $s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{14{,}755 - \frac{(245)^2}{5}}{5 - 1}} \approx 26.22$
        $s^2 \approx 26.22^2 \approx 687.49$
    (c) $\bar{y} = \frac{\sum y}{n} = \frac{224}{5} = 44.8$
        $s = \sqrt{\frac{\sum y^2 - \frac{(\sum y)^2}{n}}{n - 1}} = \sqrt{\frac{12{,}070 - \frac{(224)^2}{5}}{5 - 1}} \approx 22.55$
        $s^2 \approx 22.55^2 \approx 508.50$
    (d) Mallard nest: CV = $\frac{s}{\bar{x}} \cdot 100 = \frac{26.22}{49} \cdot 100 \approx 53.5\%$
        Canada Goose nest: CV = $\frac{s}{\bar{y}} \cdot 100 = \frac{22.55}{44.8} \cdot 100 \approx 50.3\%$
        The CV gives the ratio of the standard deviation to the mean. With respect to their means, the variation for the mallards is slightly higher than the variation for the Canada geese.

14. (a) Pax: CV = $\frac{s}{\bar{x}} \cdot 100 = \frac{14.05}{9.58} \cdot 100 \approx 146.7\%$
        Vanguard: CV = $\frac{s}{\bar{x}} \cdot 100 = \frac{12.50}{9.02} \cdot 100 \approx 138.6\%$
        The Vanguard fund has slightly less risk per unit of return.
    (b) Pax: $\bar{x} - 2s = 9.58 - 2(14.05) = -18.52$
        $\bar{x} + 2s = 9.58 + 2(14.05) = 37.68$
        At least 75% of the returns for Pax fall within the interval –18.52% to 37.68%.
        Vanguard: $\bar{x} - 2s = 9.02 - 2(12.50) = -15.98$
        $\bar{x} + 2s = 9.02 + 2(12.50) = 34.02$
        At least 75% of the returns for Vanguard fall within the interval –15.98% to 34.02%. Vanguard has a narrower range of returns, with less downside, but also less upside.

15. CV = $\frac{s}{\bar{x}} \cdot 100$, so $s = \frac{\bar{x} \cdot \text{CV}}{100} = \frac{(2.2)(1.5)}{100} = 0.033$

16.
    Class           x      f      xf    x - x̄    (x - x̄)^2    (x - x̄)^2 f
    1–10           5.5    34   187.0    -10.6      112.36       3820.24
    11–20         15.5    18   279.0     -0.6        0.36          6.48
    21–30         25.5    17   433.5      9.4       88.36       1502.12
    31 and over   35.5    11   390.5     19.4      376.36       4139.96
    $n = \sum f = 80$, $\sum xf = 1290$, $\sum (x - \bar{x})^2 f = 9468.8$
    $\bar{x} = \frac{\sum xf}{n} = \frac{1290}{80} \approx 16.1$
    $s^2 = \frac{\sum (x - \bar{x})^2 f}{n - 1} = \frac{9468.8}{79} \approx 119.9$
    $s = \sqrt{119.9} \approx 10.95$

17.
    Class           x      f        xf    x - x̄    (x - x̄)^2    (x - x̄)^2 f
    21–30         25.5    260    6630.0    -10.3      106.09      27,583.4
    31–40         35.5    348  12,354.0     -0.3        0.09          31.3
    41 and over   45.5    287  13,058.5      9.7       94.09      27,003.8
    $n = \sum f = 895$, $\sum xf = 32{,}042.5$, $\sum (x - \bar{x})^2 f \approx 54{,}619$
    $\bar{x} = \frac{\sum xf}{n} = \frac{32{,}042.5}{895} \approx 35.80$
    $s^2 = \frac{\sum (x - \bar{x})^2 f}{n - 1} = \frac{54{,}619}{894} \approx 61.1$
    $s = \sqrt{61.1} \approx 7.82$

18.
       x      f     xf     x^2 f
     3.5      2      7      24.5
     4.5      2      9      40.5
     5.5      4     22     121.0
     6.5     22    143     929.5
     7.5     64    480   3,600.0
     8.5     90    765   6,502.5
     9.5     14    133   1,263.5
    10.5      2     21     220.5
    $n = \sum f = 200$, $\sum xf = 1{,}580$, $\sum x^2 f = 12{,}702$
    $\bar{x} = \frac{\sum xf}{n} = \frac{1{,}580}{200} = 7.9$
    $SS_x = \sum x^2 f - \frac{(\sum xf)^2}{n} = 12{,}702 - \frac{(1{,}580)^2}{200} = 220$
    $s = \sqrt{\frac{SS_x}{n - 1}} = \sqrt{\frac{220}{199}} \approx 1.05$
    CV = $\frac{s}{\bar{x}} \cdot 100 = \frac{1.05}{7.9} \cdot 100 \approx 13.29\%$

19.
    Class          x       f       xf    x - x̄    (x - x̄)^2    (x - x̄)^2 f
    8.6–12.5     10.55    15   158.25    -5.05      25.502       382.537
    12.6–16.5    14.55    20   291.00    -1.05       1.102        22.050
    16.6–20.5    18.55     5    92.75     2.95       8.703        43.513
    20.6–24.5    22.55     7   157.85     6.95      48.303       338.118
    24.6–28.5    26.55     3    79.65    10.95     119.903       359.708
    $n = \sum f = 50$, $\sum xf = 779.5$, $\sum (x - \bar{x})^2 f \approx 1{,}145.9$
    $\bar{x} = \frac{\sum xf}{n} = \frac{779.5}{50} \approx 15.6$
    $s^2 = \frac{\sum (x - \bar{x})^2 f}{n - 1} = \frac{1{,}145.9}{49} \approx 23.4$
    $s = \sqrt{23.4} \approx 4.8$

20. (a) Students can use a TI-83 to verify the calculations.
    (b) For 1992, $\bar{x} = \frac{1.78 + 17.79 + 7.46}{3} = 9.01$
        For 2000, $\bar{x} = \frac{17.49 + 6.80 + (-2.38)}{3} \approx 7.30$
    (c) Students can use a TI-83 to verify the calculations.
    (d) The 3-year moving averages have approximately the same mean as computed in part (a), but the standard deviation is much smaller.

21.
    $\sum (x - \bar{x})^2 = \sum (x^2 - 2x\bar{x} + \bar{x}^2)$
    $= \sum x^2 - 2\bar{x} \sum x + n\bar{x}^2$
    $= \sum x^2 - 2\bar{x}(n\bar{x}) + n\bar{x}^2$
    $= \sum x^2 - 2n\bar{x}^2 + n\bar{x}^2$
    $= \sum x^2 - n\bar{x}^2$
    $= \sum x^2 - n\left(\frac{\sum x}{n}\right)^2$
    $= \sum x^2 - \frac{(\sum x)^2}{n}$

Section 3.3

1. 82% or more of the scores were at or below her score. 100% – 82% = 18% or fewer of the scores were above her score.

2. The upper quartile is the 75th percentile. Therefore, the minimum percentile rank must be the 75th percentile.

3. No, the score 82 might have a percentile rank less than 70. Raw scores are not necessarily equal to percentile scores.

4. Timothy performed better because a percentile rank of 72 is greater than a percentile rank of 70.

5. Order the data from smallest to largest.
   Lowest value = 2
   Highest value = 42
   There are 20 data values.
   Median = $\frac{23 + 23}{2} = 23$
   There are 10 values less than the Q2 position and 10 values greater than the Q2 position.
   Q1 = $\frac{8 + 11}{2} = 9.5$
   Q3 = $\frac{28 + 29}{2} = 28.5$
   IQR = Q3 – Q1 = 28.5 – 9.5 = 19
   [Figure: Boxplot of Months for Nurses; vertical axis: Months]

6. (a) Order the data from smallest to largest.
       Lowest value = 3
       Highest value = 72
       There are 20 data values.
       Median = $\frac{22 + 24}{2} = 23$
       There are 10 values less than the median and 10 values greater than the median.
       Q1 = $\frac{15 + 17}{2} = 16$
       Q3 = $\frac{29 + 31}{2} = 30$
       IQR = Q3 – Q1 = 30 – 16 = 14
       [Figure: Boxplot of Months Clerical; vertical axis: Months Clerical]
   (b) The median for nurses and clerical workers is 23 months. The upper half of the data for the nurses falls between values of 23 and 42 months, whereas the upper half of the data for the clerical workers falls between 23 and 72 months. The distance between Q3 and the maximum for nurses is 13.5 months; for clerical workers, this distance is 42 months.
       The distance between Q1 and the minimum for nurses is 7.5 months; for clerical workers, this distance is 13 months.

7. (a) Lowest value = 17
       Highest value = 38
       There are 50 data values.
       Median = $\frac{24 + 24}{2} = 24$
       There are 25 values above and 25 values below the Q2 position.
       Q1 = 22
       Q3 = 27
       IQR = 27 – 22 = 5
       [Figure: Boxplot of College Graduates; vertical axis: College Graduates]
   (b) 26% is in the third quartile because it is between the median and Q3.

8. (a) Lowest value = 5
       Highest value = 15
       There are 50 data values.
       Median = $\frac{10 + 10}{2} = 10$
       There are 25 values above and 25 values below the Q2 position.
       Q1 = 9
       Q3 = 12
       IQR = 12 – 9 = 3
       [Figure: Boxplot of High School Dropouts; vertical axis: High School Dropouts]
   (b) 7% is in the first quartile because it is below Q1.

9. (a) California has the lowest premium, and Pennsylvania has the highest.
   (b) Pennsylvania has the highest median premium.
   (c) California has the smallest range, and Texas has the smallest IQR.
   (d) The smallest IQR will be Texas. The largest IQR will be Pennsylvania.
       For figure (a), IQR = 3,652 – 2,758 = 894
       For figure (b), IQR = 5,801 – 4,326 = 1,475
       For figure (c), IQR = 3,966 – 2,801 = 1,165
       Therefore, figure (a) is Texas and figure (b) is Pennsylvania. By elimination, figure (c) is California.

10. (a) Order the data from smallest to largest.
        Lowest value = 4
        Highest value = 80
        There are 24 data values.
        Median = $\frac{65 + 66}{2} = 65.5$
        There are 12 values above and 12 values below the median.
        Q1 = $\frac{61 + 62}{2} = 61.5$
        Q3 = $\frac{71 + 72}{2} = 71.5$
        [Figure: Boxplot of Heights; vertical axis: Heights]
    (b) IQR = Q3 – Q1 = 71.5 – 61.5 = 10
    (c) 1.5(10) = 15
        Lower limit: Q1 – 1.5(IQR) = 61.5 – 15 = 46.5
        Upper limit: Q3 + 1.5(IQR) = 71.5 + 15 = 86.5
    (d) Yes, the value 4 is below the lower limit and so is an outlier; it is probably an error.

Chapter 3 Review

1. (a) The variance and the standard deviation
   (b) Box-and-whisker plot

2. (a) For (i), the mode is the tallest bar, namely, 7; the median and mean are estimated to be 7. For (ii), the mode = median = mean = 7.
   (b) Distribution (i) will have a larger standard deviation because more data are in the tails. This is indicated by the tall bars at values of 4 and 10.

3. (a) For both data sets, the mean is 20. Also, for both data sets, the range = maximum – minimum = 31 – 7 = 24.
   (b) Data set C1 seems more symmetric because the mean equals the median and the median is centered in the interquartile range.
   (c) For C1, IQR = 25 – 15 = 10. For C2, IQR = 22 – 20 = 2. Thus, for C1, the middle 50% of the data have a range of 10, whereas for C2, the middle 50% of the data have a smaller range of 2.

4. (a) Mean = $\bar{x} = \frac{\sum x}{n} = \frac{1.9 + 2.8 + \cdots + 7.2}{8} = \frac{36.2}{8} = 4.525$
       Order the data from smallest to largest: 1.9 1.9 2.8 3.9 4.2 5.7 7.2 8.6
       Median = $\frac{3.9 + 4.2}{2} = 4.05$
       The mode is 1.9 because it is the value that occurs most frequently.
   (b) $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{42.395}{7}} \approx 2.46$
       CV = $\frac{s}{\bar{x}} \cdot 100 = \frac{2.46}{4.525} \cdot 100 \approx 54.4\%$
       Range = 8.6 – 1.9 = 6.7

5. (a) Lowest value = 31
       Highest value = 68
       There are 60 data values.
       Median = $\frac{45 + 45}{2} = 45$
       There are 30 values above and 30 values below the Q2 position.
       Q1 = $\frac{40 + 40}{2} = 40$
       Q3 = $\frac{52 + 53}{2} = 52.5$
       IQR = 52.5 – 40 = 12.5
       [Figure: Boxplot of Percentage of Georgia Democrats by County; vertical axis: Georgia Democrats]
   (b) Class width = 8
       Class    Midpoint x     f       xf       x^2 f
       31–38       34.5       11    379.5    13,092.8
       39–46       42.5       24   1020.0    43,350.0
       47–54       50.5       15    757.5    38,253.8
       55–62       58.5        7    409.5    23,955.8
       63–70       66.5        3    199.5    13,266.8
       $n = \sum f = 60$, $\sum xf = 2{,}766$, $\sum x^2 f = 131{,}919$
       $\bar{x} = \frac{\sum xf}{n} = \frac{2{,}766}{60} = 46.1$
       $s = \sqrt{\frac{\sum x^2 f - \frac{(\sum xf)^2}{n}}{n - 1}} = \sqrt{\frac{131{,}919 - \frac{(2{,}766)^2}{60}}{59}} = \sqrt{\frac{4{,}406.4}{59}} \approx 8.64$
       $\bar{x} - 2s = 46.1 - 2(8.64) = 28.82$
       $\bar{x} + 2s = 46.1 + 2(8.64) = 63.38$
       We expect at least 75% of the counties in Georgia to have between 28.82% and 63.38% Democrats.
   (c) $\bar{x} = 46.15$, $s = 8.63$

6. (a) Weighted average = $\frac{\sum xw}{\sum w} = \frac{92(0.05) + 73(0.08) + 81(0.08) + 85(0.15) + 87(0.15) + 83(0.15) + 90(0.34)}{0.05 + 0.08 + 0.08 + 0.15 + 0.15 + 0.15 + 0.34} = \frac{85.77}{1} = 85.77$
   (b) Weighted average = $\frac{\sum xw}{\sum w} = \frac{20(0.05) + 73(0.08) + 81(0.08) + 85(0.15) + 87(0.15) + 83(0.15) + 90(0.34)}{1} = 82.17$

7. Mean weight = $\frac{2{,}500}{16} = 156.25$
   The mean weight is 156.25 lb.

8. (a) Lowest value = 7.8
       Highest value = 29.5
       There are 72 data values.
       Median = $\frac{20.2 + 20.3}{2} = 20.25$
       There are 36 values above and 36 values below the Q2 position.
       Q1 = $\frac{14.0 + 14.4}{2} = 14.2$
       Q3 = $\frac{23.8 + 23.8}{2} = 23.8$
   (b) IQR = 23.8 – 14.2 = 9.6 kilograms
   (c) [Figure: Boxplot of Kilograms; vertical axis: Kilograms]
   (d) The median is closer to the maximum value, indicating that the higher weights are more concentrated than the lower weights. The lower whisker is also longer than the upper, which again emphasizes skewness toward the lower values. Yes, the lower half shows slightly more spread, indicating skewness to the left (low).

9. (a) A college degree does not guarantee an increase of 83.4% in earnings compared with a high-school diploma. This statement is based on averages.
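The grouped-data estimate in Review Problem 5(b) can be reproduced from the class midpoints and frequencies alone. The following is a minimal sketch under that assumption; the function name is illustrative and not from the text, and the formula is the same computation formula used in the solution, applied to Σxf and Σx²f.

```python
import math


def grouped_mean_sd(midpoints, freqs):
    """Approximate sample mean and standard deviation from grouped data,
    treating every value in a class as equal to the class midpoint."""
    n = sum(freqs)
    sum_xf = sum(x * f for x, f in zip(midpoints, freqs))
    sum_x2f = sum(x * x * f for x, f in zip(midpoints, freqs))
    mean = sum_xf / n
    s = math.sqrt((sum_x2f - sum_xf ** 2 / n) / (n - 1))
    return n, mean, s


# Midpoints and frequencies from the Problem 5(b) table.
midpoints = [34.5, 42.5, 50.5, 58.5, 66.5]
freqs = [11, 24, 15, 7, 3]

n, mean, s = grouped_mean_sd(midpoints, freqs)
print(f"n = {n}, mean = {mean:.1f}, s = {s:.2f}")
print(f"75% interval: {mean - 2 * s:.2f} to {mean + 2 * s:.2f}")
```

Because each class is collapsed to its midpoint, the result is only an approximation of the raw-data values reported in part (c).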
   (b) We compute as follows:
       $\bar{x} - 2s = \$51{,}206 - 2(\$8{,}500) = \$34{,}206$
       $\bar{x} + 2s = \$51{,}206 + 2(\$8{,}500) = \$68{,}206$
   (c) $\bar{x} = \frac{(0.46)(4{,}500) + (0.21)(7{,}500) + (0.07)(12{,}000) + (0.08)(18{,}000) + (0.09)(24{,}000) + (0.09)(31{,}000)}{0.46 + 0.21 + 0.07 + 0.08 + 0.09 + 0.09}$
       $\bar{x} = \$10{,}875$

10. (a) Order the data from smallest to largest.
        Lowest value = 6
        Highest value = 16
        There are 50 data values.
        Median = $\frac{11 + 11}{2} = 11$
        There are 25 values above and 25 values below the Q2 position.
        Q1 = 10
        Q3 = 13
        IQR = Q3 – Q1 = 13 – 10 = 3
        [Figure: Boxplot of Soil Water Content; vertical axis: Soil Water Content]
    (b)
        Class    Midpoint x     f     xf    x^2 f
        6–8           7         4     28      196
        9–11         10        24    240    2,400
        12–14        13        15    195    2,535
        15–17        16         7    112    1,792
        $n = \sum f = 50$, $\sum xf = 575$, $\sum x^2 f = 6{,}923$
        $\bar{x} = \frac{\sum xf}{n} = \frac{575}{50} = 11.5$
        $s = \sqrt{\frac{\sum x^2 f - \frac{(\sum xf)^2}{n}}{n - 1}} = \sqrt{\frac{6{,}923 - \frac{(575)^2}{50}}{49}} = \sqrt{\frac{310.5}{49}} \approx 2.52$
        $\bar{x} - 2s = 11.5 - 2(2.52) = 6.46$
        $\bar{x} + 2s = 11.5 + 2(2.52) = 16.54$
        We expect at least 75% of the soil water content measurements to fall in the interval 6.46–16.54.
    (c) Using a TI-83, $\bar{x} = 11.48$; $s = 2.44$

11. Weighted average = $\frac{\sum xw}{\sum w} = \frac{5(2) + 8(3) + 7(3) + 9(5) + 7(3)}{2 + 3 + 3 + 5 + 3} = \frac{121}{16} \approx 7.56$

Cumulative Review Problems, Chapters 1, 2, 3

1. (a) Median, percentile
   (b) Mean, variance, standard deviation

2. (a) Gap between first bar and rest of bars or between last bar and rest of bars
   (b) Large gap between data on far left side or far right side and rest of data
   (c) Several empty stems after stem including lowest values or before stem including highest values
   (d) Data beyond fences placed at Q1 – 1.5(IQR) and at Q3 + 1.5(IQR)

3. (a) Same
   (b) Set B has higher mean.
   (c) Set B has higher standard deviation.
   (d) Set B has much longer whisker beyond Q3.

4.
   (a) In Set A, 86 is the relatively higher score because a larger percentage of scores fall below it.
   (b) In Set B because 86 is more standard deviations above the mean.

5. One could assign a consecutive number to each well in West Texas and then use a random-number table or a computer package to draw the simple random sample.

6. The pH levels are ratios because the values can be multiplied. Also, 0 pH is meaningful and not just a place on the scale.

7. Use the ones digit for the stem and the tenths digit for the leaves. Split each stem into five rows. Here, 7 | 0 = 7.0.
   7 | 000000001111111111
   7 | 222222222233333333333
   7 | 44444444455555555
   7 | 666666666777777
   7 | 8888899999
   8 | 01111111
   8 | 2222222
   8 | 45
   8 | 67
   8 | 88

8.
   Class Limits   Class Boundaries   Midpoint   Frequency   Relative Frequency   Cumulative Relative Frequency
   7.0–7.3        6.95–7.35          7.15          39            0.382                  0.382
   7.4–7.7        7.35–7.75          7.55          32            0.314                  0.696
   7.8–8.1        7.75–8.15          7.95          18            0.176                  0.872
   8.2–8.5        8.15–8.55          8.35           9            0.088                  0.960
   8.6–8.9        8.55–8.95          8.75           4            0.039                  0.999
   [Figures: Histogram of pH Level (vertical axis: Frequency) and Histogram of pH Level (vertical axis: Relative Frequency); horizontal axis: class boundaries 6.95, 7.35, 7.75, 8.15, 8.55, 8.95]
   To construct the frequency polygon, draw a dot at the minimum class boundary, at each midpoint, and at the maximum class boundary. Then connect the dots.

9. To draw the ogive, the vertical axis is labeled with relative frequency, and the horizontal axis is labeled with the upper class boundaries. Draw a dot at the minimum class boundary and zero, and then draw a dot at each upper class boundary and the corresponding cumulative frequency. Connect the dots.

10. Range = 8.8 – 7.0 = 1.8
    $\bar{x} = \frac{\sum x}{n} = \frac{7.0 + 7.0 + \cdots + 8.8}{102} \approx 7.58$
    Median = $\frac{7.5 + 7.5}{2} = 7.5$
    Mode = 7.3

11. (a) The students can verify the figures using a calculator or a statistics package.
    (b) $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \approx 0.1984$
        $s = \sqrt{s^2} = \sqrt{0.1984} \approx 0.4454$
        CV = $\frac{s}{\bar{x}} = \frac{0.4454}{7.58} \approx 0.059 = 5.9\%$
        The sample standard deviation is only 5.9% of the mean. This appears to be small.

12. $\bar{x} - 2s = 7.58 - 2(0.4454) = 6.69$
    $\bar{x} + 2s = 7.58 + 2(0.4454) = 8.47$
    Thus at least 75% of all pH levels are found between 6.69 and 8.47.

13. We know the minimum value is 7.0, the maximum value is 8.8, and the median is 7.5. Using Minitab, we find that Q1 = 7.2 and Q3 = 7.9. Thus IQR = 7.9 – 7.2 = 0.7.
    [Figure: Boxplot of pH Levels for West Texas; vertical axis: pH]

14. The histogram shows that the distribution is skewed right. Lower values are more common because the bars over the lower classes are taller.

15. 87.2% of the wells have a pH of less than 8.15. 57.8% of the wells could be used for irrigation. Here, 57.8% = 31.4% + 17.6% + 8.8%.

16. There do not appear to be any outliers because there are no large gaps in the data set. Eight wells are neutral (pH 7.0).

17. Half the wells are found to have a pH between 7.2 and 7.9. There is skewness toward the high values, with half the wells having a pH between 7.5 and 8.8. The boxplot and the histogram are consistent because both show the distribution to be right skewed.

18. Answers will vary. Good reports will include the preceding graphs, measures of center, measures of variation, and a comment about any unusual features.
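The answer to Problem 16 argues from gaps in the data. As a cross-check, the fence rule stated in Problem 2(d) can be applied to the five-number summary from Problem 13. The following is a minimal sketch; the function name is illustrative, and the inputs are only the values quoted in Problem 13 (minimum 7.0, Q1 = 7.2, Q3 = 7.9, maximum 8.8).

```python
def iqr_fences(q1, q3, k=1.5):
    """Lower and upper fences at Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Values quoted in Problem 13 for the West Texas pH data.
data_min, q1, q3, data_max = 7.0, 7.2, 7.9, 8.8

lower, upper = iqr_fences(q1, q3)

# If both extremes lie inside the fences, no value in the data set can lie outside.
has_outliers = data_min < lower or data_max > upper

print(f"fences: {lower:.2f} to {upper:.2f}")
print("outliers beyond fences:", has_outliers)
```

Checking only the minimum and maximum is sufficient here because every other observation lies between them.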
