Part IV: Complete Solutions, Chapter 3
Chapter 3: Averages and Variation
Calculations may vary slightly owing to rounding.
Section 3.1
1.
The middle value is the median. The most frequent value is the mode. The mean takes all values into
account.
2.
The symbol for the sample mean is x̄ (read "x bar"), and the symbol for the population mean is μ.
3.
For a mound-shaped, symmetric distribution, the mean, median, and mode all will be equal.
4.
(a) Mean, median, and mode if it exists
(b) Mode if it exists
(c) Mean, median, and mode if it exists
5.
(a) Mode = 5, the most common value
Median = 4, the middle value in the ordered data set
Mean = (2 + 3 + 4 + 5 + 5)/5 = 19/5 = 3.8
(b) Only the mode
(c) All three make sense.
(d) The mode and the median
6.
(a) Mode = 2, the most common value
Median = 3, the middle value in the ordered data set
Mean = (2 + 2 + 3 + 6 + 10)/5 = 23/5 = 4.6
(b) Mode = 7, median = 8, mean = 9.6, using the same techniques as part (a)
(c) Each statistic was increased by 5. In general, adding a constant c to each value in a data set results in
the mode, median, and mean increasing by c.
7.
(a) Mode = 2, the most common value
Median = 3, the middle value in the ordered data set
Mean = (2 + 2 + 3 + 6 + 10)/5 = 23/5 = 4.6
(b) Mode = 10, median = 15, mean = 23, using the same techniques as part (a)
(c) Each statistic was multiplied by 5. In general, multiplying each value in a data set by a constant c
results in the mode, median, and mean being multiplied by c.
(d) Mode = 177.8 cm, median = 172.72 cm, mean = 180.34 cm
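The shift and scale rules stated in Problems 6(c) and 7(c) can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module on the data set 2, 2, 3, 6, 10 from part (a); the helper name `summarize` is illustrative only and is not part of the original solutions.

```python
from statistics import mean, median, mode

def summarize(values):
    """Return (mode, median, mean) for a list of numbers."""
    return mode(values), median(values), mean(values)

data = [2, 2, 3, 6, 10]            # data set from Problems 6(a) and 7(a)
shifted = [x + 5 for x in data]    # add the constant c = 5 to each value
scaled = [x * 5 for x in data]     # multiply each value by the constant c = 5

print(summarize(data))     # statistics of the original data
print(summarize(shifted))  # each statistic should rise by 5
print(summarize(scaled))   # each statistic should be multiplied by 5
```

Comparing the three printed tuples shows the pattern described in parts (c): adding 5 raises each statistic by 5, and multiplying by 5 multiplies each statistic by 5.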
8.
(a) If the largest data value is replaced by a larger value, the mean will increase because the sum of the
data values will increase. The median will not change because the same value will still be in the eighth
position when the data are ordered.
(b) If the largest value is replaced by a smaller value (but still higher than the median), the mean will
decrease because the sum of the data values will decrease. The median will not change because the
same value will be in the eighth position in increasing order.
Copyright © Houghton Mifflin Company. All rights reserved.
(c) If the largest value is replaced by a value that is smaller than the median, the mean will decrease
because the sum of the data values will decrease. The median also will decrease because the former
value in the eighth position will move to the ninth position in increasing order. The median will be the
new value in the eighth position.
9.
Mean = (146 + 152 + ... + 144)/14 ≈ 167.3
To compute the median, first order the data set from smallest to largest. Then
Median = (168 + 174)/2 = 171
Mode = most common value = 178
10.
Mean = 111/18 ≈ 6.167
Median = (5 + 7)/2 = 6
Mode = 7
11. First, organize the data from smallest to largest. Then compute the mean, median, and mode.
(a) Upper Canyon
1  1  1  2  3  3  3  3  4  6  9
Mean = x̄ = Σx/n = 36/11 ≈ 3.27
Median = 3 (middle value)
Mode = 3 (occurs most frequently)
(b) Lower Canyon
0  0  1  1  1  1  2  2  3  6  7  8  13  14
Mean = x̄ = Σx/n = 59/14 ≈ 4.21
Median = (2 + 2)/2 = 2
Mode = 1 (occurs most frequently)
(c) The mean for the Lower Canyon is greater than that of the Upper Canyon. However, the median and
mode for the Lower Canyon are less than those of the Upper Canyon.
(d) 5% of 14 is 0.7, which rounds to 1. So eliminate one data value from the bottom of the list and one
from the top. Then compute the mean of the remaining 12 values.
5% trimmed mean = Σx/n = 45/12 = 3.75
This value is now closer to the Upper Canyon mean.
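The trimming procedure in Problem 11(d) can be sketched in a few lines of Python. This is an illustrative helper, not the textbook's method verbatim: the function name `trimmed_mean` is hypothetical, and it assumes the number of values trimmed from each end is the trimming fraction times n, rounded to the nearest whole number.

```python
def trimmed_mean(values, fraction):
    """Mean after dropping round(fraction * n) values from each end.

    Assumption: the trim count is rounded to the nearest integer
    (Python's round() sends exact .5 ties to the even integer).
    """
    ordered = sorted(values)
    k = round(fraction * len(ordered))            # values to drop per end
    kept = ordered[k:len(ordered) - k] if k else ordered
    return sum(kept) / len(kept)

# Lower Canyon data from Problem 11(b), in increasing order
lower_canyon = [0, 0, 1, 1, 1, 1, 2, 2, 3, 6, 7, 8, 13, 14]

print(sum(lower_canyon) / len(lower_canyon))   # ordinary mean
print(trimmed_mean(lower_canyon, 0.05))        # 5% trimmed mean
```

Dropping the single smallest and largest values removes the influence of the high value 14, which is why the trimmed mean moves toward the Upper Canyon mean.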
12. (a) Mean = x̄ = Σx/n = 1050/40 ≈ 26.3 years
Median = (25 + 26)/2 = 25.5 years
Mode = 25
(b) The three averages are close, so each represents the age fairly accurately. There may be one high
outlier (37), so the median may be the best measure.
13.
(a) Mean = x̄ = 2723/20 ≈ $136.20
Median = (65 + 68)/2 = $66.50
Mode = $60.00
(b) 5% of 20 data values is 1, so we remove the smallest and largest values and recompute the mean.
Mean = x̄ = 2183/18 ≈ $121.30. The trimmed mean is still much larger than the median.
(c) Reporting the median certainly will give the customer a much lower figure for the daily cost, but that
really doesn’t tell the whole story. Reporting the mean and the median, as well as the high outliers,
may be the most useful description of the situation.
14.
Weighted average = Σxw/Σw = [92(0.25) + 81(0.225) + 93(0.225) + 85(0.30)]/1 = 87.65
15. Weighted average = Σxw/Σw = [9(2) + 7(3) + 6(1) + 10(4)]/(2 + 3 + 1 + 4) = 85/10 = 8.5
16. (a) Weighted average = Σxw/Σw = [64.1(0.38) + 75.8(0.47) + 23.9(0.07) + 68.2(0.08)]/1 ≈ 67.1 mg/L
(b) Since 67.1 mg/L is greater than 58 mg/L, this wetlands system does not meet the target standard for
the chlorine compound. The average chlorine compound mg/L is too high.
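The weighted averages in Problems 14 through 16 all follow the same formula, Σxw/Σw. A minimal sketch is shown below with the Problem 16 figures; the function name `weighted_average` and the variable `target` are illustrative names, not from the original.

```python
def weighted_average(values, weights):
    """Return sum(x * w) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

# Problem 16(a): chlorine compound concentrations (mg/L) and their weights
concentrations = [64.1, 75.8, 23.9, 68.2]
weights = [0.38, 0.47, 0.07, 0.08]
target = 58  # mg/L, the target standard quoted in part (b)

avg = weighted_average(concentrations, weights)
print(round(avg, 1), "exceeds target" if avg > target else "meets target")
```

Dividing by the sum of the weights means the same function handles weights that add to 1 (Problems 14 and 16) and raw weights such as 2, 3, 1, 4 (Problem 15).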
17. Harmonic mean = 2/(1/60 + 1/75) ≈ 66.67 mph
18. Geometric mean = (1.10 × 1.12 × 1.148 × 1.038 × 1.16)^(1/5) ≈ 1.112. Thus the average growth factor is
approximately 1.112, an average increase of about 11%.
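Both results can be reproduced with Python's standard library, which provides `statistics.harmonic_mean` and (from Python 3.8) `statistics.geometric_mean`. The sketch below uses the speeds and growth factors quoted in Problems 17 and 18; the variable names are illustrative.

```python
from statistics import geometric_mean, harmonic_mean

speeds = [60, 75]                                   # mph, Problem 17
growth_factors = [1.10, 1.12, 1.148, 1.038, 1.16]   # Problem 18

hm = harmonic_mean(speeds)            # n / sum(1/x)
gm = geometric_mean(growth_factors)   # n-th root of the product

print(round(hm, 2))   # average speed in mph
print(round(gm, 3))   # average growth factor
```

The harmonic mean is the appropriate average for rates over equal distances, and the geometric mean is the appropriate average for multiplicative growth factors.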
Section 3.2
1.
The mean is associated with the standard deviation.
2.
The standard deviation is the square root of the variance.
3.
Yes. When computing the sample standard deviation, divide by n – 1. When computing the population
standard deviation, divide by n.
4.
The symbol for the sample standard deviation is s. The symbol for the population standard deviation is σ.
5.
(a) i, ii, iii
(b) The data change between data sets (i) and (ii) increased the squared difference sum Σ(x − x̄)² by
10, whereas the data change between data sets (ii) and (iii) increased the squared difference sum
Σ(x − x̄)² by only 6.
6.
(a) s = √[Σ(x − x̄)²/(n − 1)] ≈ 3.61
(b) Adding a constant to each data value does not change s. Thus s ≈ 3.61.
(c) Shifting data by c units does not change the standard deviation.
7.
(a) s ≈ 3.61 (same as above)
(b) s ≈ 18.0
(c) We see that the standard deviation has been multiplied by 5. In general, multiplying each data value by a
constant c will result in the standard deviation being multiplied by the absolute value of c.
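The two rules from Problems 6 and 7 (shifting leaves s unchanged; scaling by c multiplies s by |c|) can be verified numerically. The sketch below uses a small hypothetical sample, since the original data for these problems are not reproduced in this transcript; `statistics.stdev` computes the sample standard deviation with the n − 1 divisor.

```python
from statistics import stdev

data = [2, 4, 4, 7, 11]   # hypothetical sample, for illustration only

s = stdev(data)                              # sample standard deviation
s_shifted = stdev([x + 5 for x in data])     # add c = 5 to every value
s_scaled = stdev([-5 * x for x in data])     # multiply every value by c = -5

print(s, s_shifted, s_scaled)
```

A negative multiplier is used on purpose: the standard deviation is multiplied by |c| = 5, not by −5, because it can never be negative.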
8.
(a) No, 80 is only 2 standard deviations away from its mean.
(b) Yes, 80 is 3.33 standard deviations away from its mean.
9.
(a) Range = maximum – minimum = 30 – 15 = 15
(b) Use a calculator to verify that Σx = 110 and that Σx² = 2,568.
(c) Computation formula (sample data) for s²:
s = √[(Σx² − (Σx)²/n)/(n − 1)] = √[(2568 − (110)²/5)/(5 − 1)] ≈ 6.08
s² = 6.08² ≈ 37
(d) x̄ = Σx/n = 110/5 = 22
Defining formula (sample data) for s²:
s = √[Σ(x − x̄)²/(n − 1)] = √{[(23 − 22)² + (17 − 22)² + ... + (25 − 22)²]/(5 − 1)} ≈ 6.08
s² = 6.08² ≈ 37
(e) μ = 22
σ = √[Σ(x − μ)²/N] = √{[(23 − 22)² + (17 − 22)² + ... + (25 − 22)²]/5} ≈ 5.44
σ² = 5.44² ≈ 29.59
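The three formulas used in Problem 9 (computation formula, defining formula, and population formula) can be compared side by side. The sketch below is illustrative: the function names are hypothetical, and the data list is an assumption, chosen to be consistent with the values shown in the solution (23, 17, ..., 25), the range 30 − 15, Σx = 110, and Σx² = 2,568.

```python
from math import sqrt

def sample_sd_defining(xs):
    """s = sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(xs)
    mean = sum(xs) / n
    return sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))

def sample_sd_computation(xs):
    """s = sqrt((sum(x^2) - (sum x)^2 / n) / (n - 1))."""
    n = len(xs)
    return sqrt((sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1))

def population_sd(xs):
    """sigma = sqrt(sum((x - mu)^2) / N)."""
    n = len(xs)
    mu = sum(xs) / n
    return sqrt(sum((x - mu) ** 2 for x in xs) / n)

# Assumed data: consistent with sum = 110 and sum of squares = 2,568
xs = [23, 17, 15, 30, 25]

print(round(sample_sd_defining(xs), 2))
print(round(sample_sd_computation(xs), 2))
print(round(population_sd(xs), 2))
```

The two sample formulas agree because they are algebraically identical; the population value is smaller because it divides by N rather than n − 1.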
10. (a)
x       x²      y       y²
11      121     10      100
0       0       2       4
36      1296    29      841
21      441     14      196
31      961     22      484
23      529     18      324
24      576     14      196
11      121     2       4
11      121     3       9
21      441     10      100
Σx = 103     Σx² = 4607     Σy = 90     Σy² = 2258
(b) x̄ = Σx/n = 103/10 = 10.3
s = √[(Σx² − (Σx)²/n)/(n − 1)] = √[(4607 − (103)²/10)/(10 − 1)] ≈ 19.85
s² = 19.85² ≈ 394.0
ȳ = Σy/n = 90/10 = 9
s = √[(Σy² − (Σy)²/n)/(n − 1)] = √[(2258 − (90)²/10)/(10 − 1)] ≈ 12.68
s² = 12.68² ≈ 160.8
(c) x̄ − 2s = 10.3 − 2(19.85) = −29.4
x̄ + 2s = 10.3 + 2(19.85) = 50
ȳ − 2s = 9 − 2(12.68) = −16.36
ȳ + 2s = 9 + 2(12.68) = 34.36
At least 75% of the returns for the Total Stock Fund fall between −29.4% and 50%, whereas at
least 75% of the returns for the Balanced Index fall between −16.36% and 34.36%.
(d) Stock fund: CV = (s/x̄) × 100 = (19.85/10.3) × 100 ≈ 192.7%
Balanced fund: CV = (s/ȳ) × 100 = (12.68/9) × 100 ≈ 140.9%
For each unit of return, the balanced fund has lower risk. Since the CV can be thought of as a measure
of risk per unit of expected return, a smaller CV is better because a lower risk is better.
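The coefficient of variation comparison in part (d) is a one-line calculation. The sketch below uses the means and standard deviations computed above; the function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(s, mean):
    """Return the CV as a percentage: (s / mean) * 100."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return s / mean * 100

cv_stock = coefficient_of_variation(19.85, 10.3)    # stock fund, from part (b)
cv_balanced = coefficient_of_variation(12.68, 9)    # balanced fund, from part (b)

print(round(cv_stock, 1), round(cv_balanced, 1))
print("lower risk per unit of return:",
      "balanced fund" if cv_balanced < cv_stock else "stock fund")
```

Because the CV is unit-free, it allows the two funds to be compared even though their mean returns differ.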
11. (a) Range = 7.89 – 0.02 = 7.87
(b) Use a calculator to verify that Σx = 62.11 and Σx² = 164.23.
(c) x̄ = Σx/n = 62.11/50 ≈ 1.24
s = √[(Σx² − (Σx)²/n)/(n − 1)] = √[(164.23 − (62.11)²/50)/(50 − 1)] ≈ 1.333 ≈ 1.33
s² = 1.333² ≈ 1.78
(d) CV = (s/x̄) × 100 = (1.33/1.24) × 100 ≈ 107%
The standard deviation of the time to failure is just slightly larger than the average time.
12. (a)
x        x²         y        y²
13.20    174.24     11.85    140.42
5.60     31.36      15.25    232.56
19.80    392.04     21.30    453.69
15.05    226.50     17.30    299.29
21.40    457.96     27.50    756.25
17.25    297.56     10.35    107.12
27.45    753.50     14.90    222.01
16.95    287.30     48.70    2371.69
23.90    571.21     25.40    645.16
32.40    1049.76    25.95    673.40
40.75    1660.56    57.60    3317.76
5.10     26.01      34.35    1179.92
17.75    315.06     38.80    1505.44
28.35    803.72     41.00    1681.00
                    31.25    976.56
Σx = 284.95     Σx² = 7046.80     Σy = 421.5     Σy² = 14,562.27
(b) Grid E: x̄ = Σx/n = 284.95/14 ≈ 20.35
s² = (Σx² − (Σx)²/n)/(n − 1) = (7046.80 − (284.95)²/14)/(14 − 1) ≈ 96
s = √s² = √96 ≈ 9.80
Grid H: ȳ = Σy/n = 421.5/15 = 28.1
s² = (Σy² − (Σy)²/n)/(n − 1) = (14,562.27 − (421.5)²/15)/(15 − 1) ≈ 194
s = √s² = √194 ≈ 13.93
(c) x̄ − 2s = 20.35 − 2(9.80) = 0.75
x̄ + 2s = 20.35 + 2(9.80) = 39.95
For Grid E, at least 75% of the data fall in the interval 0.75–39.95.
ȳ − 2s = 28.1 − 2(13.93) = 0.24
ȳ + 2s = 28.1 + 2(13.93) = 55.96
For Grid H, at least 75% of the data fall in the interval 0.24–55.96. Grid H shows a wider 75% range of
values.
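The 75% intervals above come from Chebyshev's theorem: at least 1 − 1/k² of the data lie within k standard deviations of the mean, so k = 2 gives at least 75%. A minimal sketch follows; the function name `chebyshev_interval` is a hypothetical helper, and the inputs are the rounded means and standard deviations from part (b).

```python
def chebyshev_interval(mean, s, k=2):
    """Return (low, high, min_fraction) for mean ± k*s.

    Chebyshev's theorem guarantees at least 1 - 1/k**2 of the data
    fall inside the interval, for any k > 1.
    """
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return mean - k * s, mean + k * s, 1 - 1 / k ** 2

grid_e = chebyshev_interval(20.35, 9.80)    # Grid E
grid_h = chebyshev_interval(28.1, 13.93)    # Grid H

for name, (low, high, frac) in (("Grid E", grid_e), ("Grid H", grid_h)):
    print(f"{name}: at least {frac:.0%} of data in {low:.2f} to {high:.2f}")
```

The guarantee holds for any distribution shape, which is why the interval is stated as "at least 75%" rather than an exact percentage.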
(d) Grid E: CV = (s/x̄) × 100 = (9.80/20.35) × 100 ≈ 48%
Grid H: CV = (s/ȳ) × 100 = (13.93/28.1) × 100 ≈ 49.6%
Grid H demonstrates slightly greater variability per expected signal. The CV, together with the Chebyshev
interval, indicates that Grid H might have more buried artifacts.
13. (a) Students verify results with a calculator.
(b) x̄ = Σx/n = 245/5 = 49
s = √[(Σx² − (Σx)²/n)/(n − 1)] = √[(14,755 − (245)²/5)/(5 − 1)] ≈ 26.22
s² = 26.22² ≈ 687.49
(c) ȳ = Σy/n = 224/5 = 44.8
s = √[(Σy² − (Σy)²/n)/(n − 1)] = √[(12,070 − (224)²/5)/(5 − 1)] ≈ 22.55
s² = 22.55² ≈ 508.50
(d) Mallard nest: CV = (s/x̄) × 100 = (26.22/49) × 100 ≈ 53.5%
Canada Goose nest: CV = (s/ȳ) × 100 = (22.55/44.8) × 100 ≈ 50.3%
The CV gives the ratio of the standard deviation to the mean. With respect to their means, the variation
for the mallards is slightly higher than the variation for the Canada geese.
14. (a) Pax: CV = (s/x̄) × 100 = (14.05/9.58) × 100 ≈ 146.7%
Vanguard: CV = (s/x̄) × 100 = (12.50/9.02) × 100 ≈ 138.6%
Vanguard fund has slightly less risk per unit of return.
(b) Pax: x̄ − 2s = 9.58 − 2(14.05) = −18.52
x̄ + 2s = 9.58 + 2(14.05) = 37.68
At least 75% of returns for Pax fall within the interval −18.52% to 37.68%.
Vanguard: x̄ − 2s = 9.02 − 2(12.50) = −15.98
x̄ + 2s = 9.02 + 2(12.50) = 34.02
At least 75% of the returns for Vanguard fall within the interval −15.98% to 34.02%.
Vanguard has a narrower range of returns, with less downside, but also less upside.
15. CV = (s/x̄) × 100
s = (x̄ × CV)/100 = (2.2 × 1.5)/100 = 0.033
16.
Class          x       f      xf       x − x̄     (x − x̄)²    (x − x̄)²f
1–10           5.5     34     187      −10.6     112.36      3820.24
11–20          15.5    18     279      −0.6      0.36        6.48
21–30          25.5    17     433.5    9.4       88.36       1502.12
31 and over    35.5    11     390.5    19.4      376.36      4139.96
n = Σf = 80     Σxf = 1290     Σ(x − x̄)²f = 9468.8
x̄ = Σxf/n = 1290/80 ≈ 16.1
s² = Σ(x − x̄)²f/(n − 1) = 9468.8/79 ≈ 119.9
s = √119.9 ≈ 10.95
17.
Class          x       f      xf          x − x̄     (x − x̄)²    (x − x̄)²f
21–30          25.5    260    6630        −10.3     106.09      27,583.4
31–40          35.5    348    12,354      −0.3      0.09        31.3
41 and over    45.5    287    13,058.5    9.7       94.09       27,003.8
n = Σf = 895     Σxf = 32,042.5     Σ(x − x̄)²f = 54,619
x̄ = Σxf/n = 32,042.5/895 ≈ 35.80
s² = Σ(x − x̄)²f/(n − 1) = 54,619/894 ≈ 61.1
s = √61.1 ≈ 7.82
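The grouped-data calculations in Problems 16 and 17 can be sketched as a small function that takes class midpoints and frequencies. The name `grouped_mean_sd` is hypothetical. The code keeps the unrounded mean throughout, so its results may differ from the hand calculation in the last digit, where x̄ was rounded before the deviations were computed.

```python
from math import sqrt

def grouped_mean_sd(midpoints, freqs):
    """Approximate sample mean and standard deviation for grouped data.

    mean = sum(x*f) / n,  s = sqrt(sum((x - mean)^2 * f) / (n - 1)),
    where x is a class midpoint, f its frequency, and n = sum(f).
    """
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
    ss = sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs))
    return mean, sqrt(ss / (n - 1))

# Problem 16: midpoints and frequencies
mean16, s16 = grouped_mean_sd([5.5, 15.5, 25.5, 35.5], [34, 18, 17, 11])
# Problem 17: midpoints and frequencies
mean17, s17 = grouped_mean_sd([25.5, 35.5, 45.5], [260, 348, 287])

print(round(mean16, 1), round(s16, 2))
print(round(mean17, 2), round(s17, 2))
```

The open-ended classes ("31 and over", "41 and over") are represented by the midpoints 35.5 and 45.5 used in the solutions, which treats them as having the same width as the other classes.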
18.
x       f      xf     x²f
3.5     2      7      24.5
4.5     2      9      40.5
5.5     4      22     121.0
6.5     22     143    929.5
7.5     64     480    3,600.0
8.5     90     765    6,502.5
9.5     14     133    1,263.5
10.5    2      21     220.5
n = Σf = 200     Σxf = 1,580     Σx²f = 12,702
x̄ = Σxf/n = 1,580/200 = 7.9
SS_x = Σx²f − (Σxf)²/n = 12,702 − (1,580)²/200 = 220
s = √[SS_x/(n − 1)] = √(220/199) ≈ 1.05
CV = (s/x̄) × 100 = (1.05/7.9) × 100 ≈ 13.29%
19.
Class        x        f     xf        x − x̄     (x − x̄)²    (x − x̄)²f
8.6–12.5     10.55    15    158.25    −5.05     25.502      382.537
12.6–16.5    14.55    20    291.00    −1.05     1.102       22.050
16.6–20.5    18.55    5     92.75     2.95      8.703       43.513
20.6–24.5    22.55    7     157.85    6.95      48.303      338.118
24.6–28.5    26.55    3     79.65     10.95     119.903     359.708
n = Σf = 50     Σxf = 779.5     Σ(x − x̄)²f = 1,145.9
x̄ = Σxf/n = 779.5/50 ≈ 15.6
s² = Σ(x − x̄)²f/(n − 1) = 1,145.9/49 ≈ 23.4
s = √23.4 ≈ 4.8
20. (a) Students can use a TI-83 to verify the calculations.
(b) For 1992, x̄ = (1.78 + 17.79 + 7.46)/3 ≈ 9.01
For 2000, x̄ = (17.49 + 6.80 − 2.38)/3 ≈ 7.30
(c) Students can use a TI-83 to verify the calculations.
(d) The 3-year moving averages have approximately the same mean as computed in part (a), but the
standard deviation is much smaller.
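The smoothing effect described in part (d) can be illustrated with a short moving-average function. This is a sketch only: the function name `moving_average` is illustrative, and the `returns` series is hypothetical, because the full data set for this problem is not reproduced in the transcript. The single window checked against the solution is the 1992 one from part (b).

```python
from statistics import mean, stdev

def moving_average(values, window=3):
    """Return the list of averages over each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Window from part (b): the three returns averaged for 1992
print(round(moving_average([1.78, 17.79, 7.46])[0], 2))

# Hypothetical annual returns (%), for illustration only
returns = [10, -4, 22, 3, 15, -8, 19]
smoothed = moving_average(returns)

print(round(mean(returns), 2), round(stdev(returns), 2))
print(round(mean(smoothed), 2), round(stdev(smoothed), 2))
```

Averaging neighboring years cancels much of the year-to-year fluctuation, so the smoothed series has a similar center but a noticeably smaller standard deviation.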
21.
Σ(x − x̄)² = Σ(x² − 2x x̄ + x̄²)
= Σx² − 2x̄ Σx + n x̄²
= Σx² − 2x̄(n x̄) + n x̄²     (since Σx = n x̄)
= Σx² − 2n x̄² + n x̄²
= Σx² − n x̄²
= Σx² − n(Σx/n)²
= Σx² − (Σx)²/n
Section 3.3
1.
82% or more of the scores were at or below her score. 100% − 82% = 18% or fewer of the scores were
above her score.
2.
The upper quartile is the 75th percentile. Therefore, the minimum percentile rank must be the 75th
percentile.
3.
No, the score 82 might have a percentile rank less than 70. Raw scores are not necessarily equal to
percentile scores.
4.
Timothy performed better because a percentile rank of 72 is greater than a percentile rank of 70.
5.
Order the data from smallest to largest.
Lowest value = 2
Highest value = 42
There are 20 data values.
Median = (23 + 23)/2 = 23
There are 10 values less than the Q2 position and 10 values greater than the Q2 position.
Q1 = (8 + 11)/2 = 9.5
Q3 = (28 + 29)/2 = 28.5
IQR = Q3 − Q1 = 28.5 − 9.5 = 19
[Figure: Boxplot of Months for Nurses; vertical axis: Months]
6.
(a) Order the data from smallest to largest.
Lowest value = 3
Highest value = 72
There are 20 data values.
Median = (22 + 24)/2 = 23
There are 10 values less than the median and 10 values greater than the median.
Q1 = (15 + 17)/2 = 16
Q3 = (29 + 31)/2 = 30
IQR = Q3 − Q1 = 30 − 16 = 14
[Figure: Boxplot of Months Clerical; vertical axis: Months Clerical]
(b) The median for nurses and clerical workers is 23 months. The upper half of the data for the nurses falls
between values of 23 and 42 months, whereas the upper half of the data for the clerical workers falls
between 23 and 72 months. The distance between Q3 and the maximum for nurses is 13.5 months; for
clerical workers, this distance is 42 months. The distance between Q1 and the minimum for nurses is
7.5 months; for clerical workers, this distance is 13 months.
7.
(a)
Lowest value = 17
Highest value = 38
There are 50 data values.
Median = (24 + 24)/2 = 24
There are 25 values above and 25 values below the Q2 position.
Q1 = 22
Q3 = 27
IQR = 27 − 22 = 5
[Figure: Boxplot of College Graduates; vertical axis: College Graduates]
(b) 26% is in the third quartile because it is between the median and Q3.
8.
(a) Lowest value = 5
Highest value = 15
There are 50 data values.
Median = (10 + 10)/2 = 10
There are 25 values above and 25 values below the Q2 position.
Q1 = 9
Q3 = 12
IQR = 12 − 9 = 3
[Figure: Boxplot of High School Dropouts; vertical axis: High School Dropouts]
(b) 7% is in the first quartile because it is below Q1.
9.
(a) California has the lowest premium, and Pennsylvania has the highest.
(b) Pennsylvania has the highest median premium.
(c) California has the smallest range, and Texas has the smallest IQR.
(d) The smallest IQR will be Texas. The largest IQR will be Pennsylvania.
For figure (a), IQR = 3,652 – 2,758 = 894
For figure (b), IQR = 5,801 – 4,326 = 1,475
For figure (c), IQR = 3,966 – 2,801 = 1,165
Therefore, figure (a) is Texas and figure (b) is Pennsylvania. By elimination, figure (c) is California.
10. (a) Order the data from smallest to largest.
Lowest value = 4
Highest value = 80
There are 24 data values.
Median = (65 + 66)/2 = 65.5
There are 12 values above and 12 values below the median.
Q1 = (61 + 62)/2 = 61.5
Q3 = (71 + 72)/2 = 71.5
[Figure: Boxplot of Heights; vertical axis: Heights]
(b) IQR = Q3 − Q1 = 71.5 − 61.5 = 10
(c) 1.5(10) = 15
Lower limit: Q1 − 1.5(IQR) = 61.5 − 15 = 46.5
Upper limit: Q3 + 1.5(IQR) = 71.5 + 15 = 86.5
(d) Yes, the value 4 is below the lower limit and so is an outlier; it is probably an error.
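The quartile and outlier steps used throughout this section can be sketched as follows. The helper names `quartiles` and `find_outliers` are hypothetical, and the `heights` list is an invented example, since the original 24 values are not reproduced in the transcript. The quartile rule assumed here is the one the solutions appear to use: Q1 and Q3 are the medians of the lower and upper halves of the ordered data, with the overall median excluded from both halves when n is odd.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]               # values below the median position
    upper = ordered[half + (n % 2):]     # values above the median position
    return median(lower), median(ordered), median(upper)

def find_outliers(values):
    """Return (outliers, (lower_limit, upper_limit)) using 1.5 * IQR fences."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < low or x > high], (low, high)

# Hypothetical heights (inches), for illustration only
heights = [4, 60, 62, 64, 66, 68, 70, 72]

print(quartiles(heights))
print(find_outliers(heights))
```

Statistical packages use several slightly different quartile conventions, so software output may not match hand calculations exactly; the fence rule Q1 − 1.5(IQR) and Q3 + 1.5(IQR) is the same in each case.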
Chapter 3 Review
1.
(a) The variance and the standard deviation
(b) Box-and-whisker plot
2.
(a) For (i), the mode is the tallest bar, namely, 7; the median and mean are estimated to be 7. For (ii), the
mode = median = mean = 7.
(b) Distribution (i) will have a larger standard deviation because more data are in the tails. This is
indicated by the tall bars at values of 4 and 10.
3.
(a) For both data sets, the mean is 20. Also, for both data sets, the range = maximum – minimum = 31 – 7
= 24.
(b) Data set C1 seems more symmetric because the mean equals the median and the median is centered in
the interquartile range.
(c) For C1, IQR = 25 – 15 = 10. For C2, IQR = 22 – 20 = 2. Thus, for C1, the middle 50% of the data have
a range of 10, whereas for C2, the middle 50% of the data have a smaller range of 2.
4.
(a) Mean = x̄ = Σx/n = (1.9 + 2.8 + ... + 7.2)/8 = 36.2/8 = 4.525
Order the data from smallest to largest.
1.9  1.9  2.8  3.9  4.2  5.7  7.2  8.6
Median = (3.9 + 4.2)/2 = 4.05
The mode is 1.9 because it is the value that occurs most frequently.
(b) s = √[Σ(x − x̄)²/(n − 1)] = √(42.395/7) ≈ 2.46
CV = (s/x̄) × 100 = (2.46/4.525) × 100 ≈ 54.4%
Range = 8.6 − 1.9 = 6.7
5.
(a)
Lowest value = 31
Highest value = 68
There are 60 data values.
Median = (45 + 45)/2 = 45
There are 30 values above and 30 values below the Q2 position.
Q1 = (40 + 40)/2 = 40
Q3 = (52 + 53)/2 = 52.5
IQR = 52.5 − 40 = 12.5
[Figure: Boxplot of Percentage of Georgia Democrats by County; vertical axis: Georgia Democrats]
(b) Class width = 8
Class    Midpoint x    f     xf       x²f
31–38    34.5          11    379.5    13,092.8
39–46    42.5          24    1020     43,350.0
47–54    50.5          15    757.5    38,253.8
55–62    58.5          7     409.5    23,955.8
63–70    66.5          3     199.5    13,266.8
n = Σf = 60     Σxf = 2,766     Σx²f = 131,919
x̄ = Σxf/n = 2,766/60 = 46.1
s = √[(Σx²f − (Σxf)²/n)/(n − 1)] = √[(131,919 − (2,766)²/60)/59] = √(4,406.4/59) ≈ 8.64
x̄ − 2s = 46.1 − 2(8.64) = 28.82
x̄ + 2s = 46.1 + 2(8.64) = 63.38
We expect at least 75% of the counties in Georgia to have between 28.82% and 63.38% Democrats.
(c) x̄ ≈ 46.15, s ≈ 8.63
6.
(a) Weighted average = Σxw/Σw
= [92(0.05) + 73(0.08) + 81(0.08) + 85(0.15) + 87(0.15) + 83(0.15) + 90(0.34)]/(0.05 + 0.08 + 0.08 + 0.15 + 0.15 + 0.15 + 0.34)
= 85.77/1 = 85.77
(b) Weighted average = Σxw/Σw
= [20(0.05) + 73(0.08) + 81(0.08) + 85(0.15) + 87(0.15) + 83(0.15) + 90(0.34)]/1
= 82.17
7.
Mean weight = 2,500/16 = 156.25
The mean weight is 156.25 lb.
8.
(a)
Lowest value = 7.8
Highest value = 29.5
There are 72 data values.
Median = (20.2 + 20.3)/2 = 20.25
There are 36 values above and 36 values below the Q2 position.
Q1 = (14.0 + 14.4)/2 = 14.2
Q3 = (23.8 + 23.8)/2 = 23.8
(b) IQR = 23.8 − 14.2 = 9.6 kilograms
(c) [Figure: Boxplot of Kilograms; vertical axis: Kilograms]
(d) The median is closer to the maximum value, indicating that the higher weights are more concentrated
than the lower weights. The lower whisker is also longer than the upper, which emphasizes again
skewness toward the lower values. Yes, the lower half shows slightly more spread, indicating
skewness to the left (low).
9.
(a) A college degree does not guarantee an increase of 83.4% in earnings compared with a high-school
diploma. This statement is based on averages.
(b) We compute as follows:
x̄ − 2s = $51,206 − 2($8,500) = $34,206
x̄ + 2s = $51,206 + 2($8,500) = $68,206
(c) x̄ = [(0.46)(4,500) + (0.21)(7,500) + (0.07)(12,000) + (0.08)(18,000) + (0.09)(24,000) + (0.09)(31,000)]/(0.46 + 0.21 + 0.07 + 0.08 + 0.09 + 0.09)
x̄ = $10,875
10. (a) Order the data from smallest to largest.
Lowest value = 6
Highest value = 16
There are 50 data values.
Median = (11 + 11)/2 = 11
There are 25 values above and 25 values below the Q2 position.
Q1 = 10
Q3 = 13
IQR = Q3 − Q1 = 13 − 10 = 3
[Figure: Boxplot of Soil Water Content; vertical axis: Soil Water Content]
(b)
Class    Midpoint x    f     xf     x²f
6–8      7             4     28     196
9–11     10            24    240    2,400
12–14    13            15    195    2,535
15–17    16            7     112    1,792
n = Σf = 50     Σxf = 575     Σx²f = 6,923
x̄ = Σxf/n = 575/50 = 11.5
s = √[(Σx²f − (Σxf)²/n)/(n − 1)] = √[(6,923 − (575)²/50)/49] = √(310.5/49) ≈ 2.52
x̄ − 2s = 11.5 − 2(2.52) = 6.46
x̄ + 2s = 11.5 + 2(2.52) = 16.54
We expect at least 75% of the soil water content measurements to fall in the interval 6.46–16.54.
(c) Using a TI-83, x̄ ≈ 11.48; s ≈ 2.44
11. Weighted average = Σxw/Σw = [5(2) + 8(3) + 7(3) + 9(5) + 7(3)]/(2 + 3 + 3 + 5 + 3) = 121/16 ≈ 7.56
Cumulative Review Problems Chapters 1, 2, 3
1.
(a) Median, percentile
(b) Mean, variance, standard deviation
2.
(a) Gap between first bar and rest of bars or between last bar and rest of bars
(b) Large gap between data on far left side or far right side and rest of data
(c) Several empty stems after stem including lowest values or before stem including highest values
(d) Data beyond fences placed at Q1 – 1.5(IQR) and at Q3 + 1.5(IQR).
3.
(a) Same
(b) Set B has higher mean.
(c) Set B has higher standard deviation.
(d) Set B has much longer whisker beyond Q3.
4.
(a) In Set A, 86 is the relatively higher score because a larger percentage of scores fall below it.
(b) In Set B because 86 is more standard deviations above the mean
5.
One could assign a consecutive number to each well in West Texas and then use a random-number table or
a computer package to draw the simple random sample.
6.
The pH levels are ratios because the values can be multiplied. Also, 0 pH is meaningful and not just a place
on the scale.
7.
Use the ones digit for the stem and the tenths digit for the leaves. Split each stem into five rows.
Here, 7 | 0 = 7.0.
7 | 000000001111111111
7 | 222222222233333333333
7 | 44444444455555555
7 | 666666666777777
7 | 8888899999
8 | 01111111
8 | 2222222
8 | 45
8 | 67
8 | 88
8.
Class Limits    Class Boundaries    Midpoint    Frequency    Relative Frequency    Cumulative Relative Frequency
7.0–7.3         6.95–7.35           7.15        39           0.382                 0.382
7.4–7.7         7.35–7.75           7.55        32           0.314                 0.696
7.8–8.1         7.75–8.15           7.95        18           0.176                 0.872
8.2–8.5         8.15–8.55           8.35        9            0.088                 0.960
8.6–8.9         8.55–8.95           8.75        4            0.039                 0.999
[Figure: two histograms titled "Histogram of pH Level", one with vertical axis Frequency and one with vertical axis
Relative Frequency; horizontal axis: class boundaries 6.95, 7.35, 7.75, 8.15, 8.55, 8.95]
To construct the frequency polygon, draw a dot at the minimum class boundary, at each midpoint, and at
the maximum class boundary. Then connect the dots.
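The relative and cumulative relative frequency columns in the table for Problem 8 can be generated directly from the class frequencies. The sketch below uses the five frequencies from that table; the variable names are illustrative.

```python
from itertools import accumulate

frequencies = [39, 32, 18, 9, 4]   # class frequencies from the pH table
n = sum(frequencies)               # total number of wells

relative = [f / n for f in frequencies]
cumulative = [c / n for c in accumulate(frequencies)]   # running total / n

print(n)
print([round(r, 3) for r in relative])
print([round(c, 3) for c in cumulative])
```

Cumulating the exact counts before dividing makes the final cumulative value exactly 1, whereas adding the already-rounded relative frequencies, as in the table, ends at 0.999 because of accumulated rounding.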
9.
To draw the ogive, label the vertical axis with cumulative relative frequency and the horizontal axis
with the upper class boundaries. Draw a dot at the minimum class boundary and zero, and then draw a dot
at each upper class boundary and the corresponding cumulative relative frequency. Connect the dots.
10. Range = 8.8 – 7.0 = 1.8
x̄ = Σx/n = (7.0 + 7.0 + ... + 8.8)/102 ≈ 7.58
Median = (7.5 + 7.5)/2 = 7.5
Mode = 7.3
11. (a) The students can verify the figures using a calculator or a statistics package.
(b) s² = Σ(x − x̄)²/(n − 1) ≈ 0.1984
s = √s² = √0.1984 ≈ 0.4454
CV = s/x̄ = 0.4454/7.58 ≈ 0.059 = 5.9%
The sample standard deviation is only 5.9% of the mean. This appears to be small.
12.
x̄ − 2s = 7.58 − 2(0.4454) ≈ 6.69
x̄ + 2s = 7.58 + 2(0.4454) ≈ 8.47
Thus at least 75% of all pH levels are found between 6.69 and 8.47.
13.
We know the minimum value is 7.0, the maximum value is 8.8, and the median is 7.5. Using Minitab, we
find that Q1 = 7.2 and Q3 = 7.9. Thus IQR = 7.9 – 7.2 = 0.7.
[Figure: Boxplot of pH Levels for West Texas; vertical axis: pH]
14.
The histogram shows that the distribution is skewed right. Lower values are more common because the
height of the bars is higher.
15.
87.2% of the wells have a pH of less than 8.15. 57.8% of the wells could be used for the irrigation. Here,
57.8% = 31.4% + 17.6% + 8.8%.
16.
There do not appear to be any outliers because there are no large gaps in the data set. Eight are neutral.
17.
Half the wells are found to have a pH between 7.2 and 7.9. There is skewness toward the high values,
with half the wells having a pH between 7.5 and 8.8. The boxplot and the histogram are consistent
because both show the distribution to be right skewed.
18.
Answers will vary. Good reports will include the preceding graphs, measures of center, measures of
variation, and a comment about any unusual features.