Chapter 2
ORGANIZATION AND DESCRIPTION OF DATA
2.1 (a) The percentage in other classes is 100 - 32.7 - 12.8 - 12.5 - 12.1 - 8.2 = 21.7%.
(b) [Bar chart: Percent Waste (vertical axis, 0 to 35) for the categories Paper, Yard, Food, Plastic, Metals, Other]
(c) The percentage of waste that is paper or paperboard is 32.7%.
The percentage of waste in the top two categories is 32.7 + 12.8 = 45.5%.
The percentage in the top five categories is 32.7 + 12.8 + 12.5 + 12.1 + 8.2 = 78.3%.
2.2 The frequency table for blood type is
Blood type   Frequency   Relative Frequency
O            16          16/40 = 0.40
A            18          18/40 = 0.45
B            4           4/40 = 0.10
AB           2           2/40 = 0.05
Total        40          1.00
2.3 The frequency table for number of activities is
Number of Activities   Frequency   Relative Frequency
0                      7           7/40 = 0.175
1                      10          10/40 = 0.25
2                      13          13/40 = 0.325
3                      5           5/40 = 0.125
4                      2           2/40 = 0.05
5                      1           1/40 = 0.025
6                      1           1/40 = 0.025
7                      1           1/40 = 0.025
Total                  40          1.00
This is the relative frequency histogram:
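As a cross-check on tables like the one for Exercise 2.3, the relative frequencies can be recomputed from the raw counts. The sketch below is illustrative only: the dictionary `frequencies` and the helper `relative_frequencies` are names introduced here, not part of the text.

```python
from fractions import Fraction

# Counts copied from the Exercise 2.3 table: number of activities -> frequency.
frequencies = {0: 7, 1: 10, 2: 13, 3: 5, 4: 2, 5: 1, 6: 1, 7: 1}

def relative_frequencies(freq):
    """Return {value: frequency / total} as exact fractions."""
    total = sum(freq.values())
    return {value: Fraction(count, total) for value, count in freq.items()}

rel = relative_frequencies(frequencies)
for value, share in rel.items():
    # Print each relative frequency to three decimals, as in the table.
    print(f"{value}: {float(share):.3f}")
```

Using exact fractions means the relative frequencies sum to exactly 1, so any departure from 1.00 in a printed table comes from rounding the displayed values.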
2.4 The frequency table for number of crashes per month is
Number of Crashes   Frequency   Relative Frequency
0                   5           5/59 = 0.085
1                   12          12/59 = 0.203
2                   11          11/59 = 0.186
3                   14          14/59 = 0.237
4                   8           8/59 = 0.136
5                   8           8/59 = 0.136
6                   1           1/59 = 0.017
Total               59          1.00 (rounding error)
This is the relative frequency histogram:
2.5 (a) The table of relative frequencies for workers in the department is
Mode of Transportation   Frequency   Relative Frequency
Drive alone              50          50/80 = 0.625
Car pool                 6           6/80 = 0.075
Ride bus                 14          14/80 = 0.175
Other                    10          10/80 = 0.125
Total                    80          1.000
(b) The pie chart for workers in the department is
2.6 The table of relative frequencies for the money raised (in million dollars) is
Source                         Frequency   Relative Frequency
Individuals and bequests       234         234/414 = 0.565
Industry and business          48          48/414 = 0.116
Foundations and associations   132         132/414 = 0.319
Total                          414         1.000
The pie chart for the university fund drive is
2.7 There are overlapping classes in the grouping. A report of 3 stolen bicycles will fall in two classes.
2.8 There is a gap. A report of 6 complaints in one week does not fall in any class. The last class should be 6 or more.
2.9 There is a gap. The response 5 close friends does not fall in any class. The last class should be 5 or more.
2.10 The first class should be “less than 175 pounds”. Otherwise, a light weight kicker
cannot be assigned to a class.
2.11 (a) Yes. (b) Yes. (c) Yes. (d) No. (e) No.
2.12 The frequency table of the survey response is
Response   Frequency   Relative Frequency
1          14          14/50 = 0.28
2          13          13/50 = 0.26
3          7           7/50 = 0.14
4          16          16/50 = 0.32
Total      50          1.00
2.13 (a) The relative frequencies are 0.18 , 0.48, 0.26, and 0.08 for 0, 1, 2, and 3 bags,
respectively.
(b) Nearly one-half of the passengers check exactly one bag. The longest tail is to
the right.
(c) The proportion of passengers who fail to check a bag is 9/50 = 0.18.
2.14 The dot diagram of meter readings is
[Dot diagram: Measurements, horizontal axis from 422 to 472]
2.15 The dot diagram of amounts of radiation leakage is
2.16 The dot diagram of number of bad checks received is
[Dot diagram: Number of Bad Checks Received, horizontal axis from 3 to 8]
2.17 (a) The dot diagram of number of CFUs is
(b) There is a long tail to the right with one extremely large value of 1700 CFU
units.
(c) There is one day so the proportion is 1/15 = 0.067.
2.18 (a) The frequency distribution of tornado fatalities is given in the table below.
Class Interval   Frequency   Relative Frequency
[0, 25)          2           2/58 = 0.034
[25, 50)         19          19/58 = 0.328
[50, 75)         18          18/58 = 0.310
[75, 100)        7           7/58 = 0.121
[100, 150)       5           5/58 = 0.086
[150, 200)       2           2/58 = 0.034
[200, 250)       1           1/58 = 0.017
[250, 550)       3           3/58 = 0.052
Total            58          0.982 (rounding error)
(b) The relative frequency histogram is given below.
[Histogram: Relative Frequency (0.0 to 0.3) versus Number of Deaths, class boundaries at 0, 25, 50, 75, 100, 150, 200, 250]
(c) The proportion of years having 49 or fewer tornado fatalities is 0.034 + 0.328 = 0.362.
(d) There is a long tail to the right due to the fact that the last class interval is much
wider than the others yet still exhibits a low frequency of observations.
2.19 (a) In the following frequency distribution of lizard speed (in meters per second), the left endpoint is included in the class interval but not the right endpoint.
Class Interval   Frequency   Relative Frequency
0.45 to 0.90     2           0.067
0.90 to 1.35     6           0.200
1.35 to 1.80     11          0.367
1.80 to 2.25     5           0.167
2.25 to 2.70     6           0.200
Total            30          1.001 (rounding error)
(b) All of the class intervals are of length 0.45 so we can graph rectangles whose
heights are the relative frequency. The histogram is
2.20 In the following frequency distribution of order of earthquake magnitudes (as given on the Richter scale), the left endpoint is not included in the class interval, but the right one is.
Class Interval   Frequency   Relative Frequency
(6.0, 6.3]       12          12/55 = 0.218
(6.3, 6.6]       15          15/55 = 0.273
(6.6, 6.9]       10          10/55 = 0.182
(6.9, 7.2]       10          10/55 = 0.182
(7.2, 7.5]       5           5/55 = 0.091
(7.5, 7.8]       2           2/55 = 0.036
(7.8, 8.1]       1           1/55 = 0.018
Total            55          1.0000 (rounding error)
The class intervals all have the same length so we take the option of making the
height of a rectangle equal to the relative frequency. The histogram is
[Histogram: Frequency (0 to 16) versus Order of Earthquake Magnitude, class boundaries from 6.3 to 8.1]
2.21 This time, the frequency distribution is given by
Class Interval   Frequency   Relative Frequency
(6.0, 6.3]       12          12/55 = 0.218
(6.3, 6.6]       15          15/55 = 0.273
(6.6, 6.9]       10          10/55 = 0.182
(6.9, 7.2]       10          10/55 = 0.182
(7.2, 7.9]       8           8/55 = 0.145
Total            55          1.0000 (rounding error)
The corresponding frequency histogram is as follows:
[Histogram: Frequency (0 to 16) versus Order of Earthquake Magnitude, axis marks at 6.3, 6.6, 6.9, 7.2, and >7.2]
2.22 The stem-and-leaf display of the scores is
9 58
10 6
11 559
12 6
13 135678
14 344557
15 2478
16 01222567
17 14688
18 24
19 04
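A stem-and-leaf display like the one in Exercise 2.22 can be generated mechanically by splitting each value into a stem (all digits but the last) and a leaf (the last digit). The following is a minimal sketch, assuming non-negative integer data; the function name `stem_and_leaf` and the list `scores` are hypothetical, and `scores` holds only the values read back from the first three rows of the display above.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digit)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 115 -> stem 11, leaf 5
        rows[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in rows.items()}

# Scores recovered from the rows "9 58", "10 6", "11 559" of the display.
scores = [95, 98, 106, 115, 115, 119]
for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem:>2} {leaves}")
```

Stems with no observations are simply omitted by this sketch; a display that must show empty stems (as in the five-stem display of Exercise 2.27) would need to iterate over the full stem range instead.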
2.23 The stem-and-leaf display of the amount of iron present in the oil is
0 6
1 2234455567777889
2 000000222445567799
3 022444566
4 1167
5 12
2.24 The corresponding measurements are
225 238 290 319 344 371 382 397 405 416 433 480 504 568 613
2.25 The double-stem display of the amount of iron present in the oil is
0 6
1 22344
1 55567777889
2 00000022244
2 5567799
3 022444
3 566
4 11
4 67
5 12
2.26 The corresponding measurements are
18 20 20 20 20 21 22 22 23 23 23 23 23 24
24 24 25 25 25 26 26 27 29 30 31 31 34
2.27 The five-stem display of the Consumer Price Index in 2001 for the given cities is
15 5
15
15 9
16
16
16
16 7
16 8
17 0
17 22333
17 4
17 677
17 88
18 11
18 2
18
18 67
18
19 011
2.28 (a) The ordered measurements are 3, 4, 5, 7, 11, so the median is 5. The sample mean is
x̄ = (3 + 7 + 4 + 11 + 5)/5 = 30/5 = 6
(b) The median is 3. The sample mean is
x̄ = (3 + 1 + 7 + 3 + 1)/5 = 15/5 = 3
2.29 (a) The sample mean is
x̄ = (4 + 8 + 1 + 2 + 0)/5 = 15/5 = 3
The ordered measurements are 0, 1, 2, 4, 8, so the median is 2.
(b) The mean is
x̄ = (26 + 30 + 38 + 32 + 26 + 31)/6 = 183/6 = 30.5
The ordered measurements are: 26, 26, 30, 31, 32, 38
The median = (30 + 31)/2 = 30.5
(c) The sample mean is
x̄ = (-2 - 2 + 0 + 2 + 4 + 4 + 8)/7 = 14/7 = 2
The ordered measurements are: -2, -2, 0, 2, 4, 4, 8.
The median is 2.
2.30 The sample mean is
x̄ = (6.2 + 6.9 + 5.5 + 5.3 + 5.6 + 5.4 + 6.6 + 6.5)/8 = 48.0/8 = 6.0
The ordered measurements are: 5.3, 5.4, 5.5, 5.6, 6.2, 6.5, 6.6, 6.9.
The median = (5.6 + 6.2)/2 = 11.8/2 = 5.9
2.31 (a) x̄ = 3810/15 = 254.
(b) The ordered observations are:
10 20 50 60 80 90 90 110 140 180 260 340 380 400 1600
So, the median is 110 CFU units. The one very large observation makes the sample mean much larger. Hence, the sample median is better to use in this instance.
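The effect of the single large observation in Exercise 2.31 can be seen by recomputing both measures with and without it. This is only an illustrative sketch using Python's standard `statistics` module; the variable names are introduced here.

```python
import statistics

# Ordered CFU observations from Exercise 2.31(b).
cfu = [10, 20, 50, 60, 80, 90, 90, 110, 140, 180, 260, 340, 380, 400, 1600]

mean_all = statistics.mean(cfu)        # pulled upward by 1600
median_all = statistics.median(cfu)    # middle (8th) ordered value

# Drop the largest observation and recompute.
trimmed = cfu[:-1]
mean_trimmed = statistics.mean(trimmed)
median_trimmed = statistics.median(trimmed)

print(mean_all, median_all)
print(round(mean_trimmed, 1), median_trimmed)
```

Removing the one extreme value moves the mean by almost 100 CFU units while the median shifts only from 110 to 100, which is the resistance property the solution appeals to.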
2.32 (a) The ordered monthly incomes are: 2300 2350 2400 2450 2575 2650 4700.
x̄ = 19425/7 = 2775, median = 2450.
(b) For a typical salary, the median is better. Only one person earns more than the
mean.
2.33 The mean is 956/12 = 79.67. The claim ignores variability and is not true. It is certainly unpleasant with a daily maximum temperature of 105°F in July.
2.34 The sample mean is
x̄ = (65 + 72 + 67 + 73 + 70 + 67 + 84)/7 = 498/7 = 71.14 cases
The ordered sales times are: 65, 67, 67, 70, 72, 73, 84
median = 70
2.35 (a) x̄ = 212/25 = 8.48
(b) The sample median is 8. Since the sample mean and median are about the same, either of them can be used as an indication of radiation leakage.
2.36 The mean, 10.30, is one measure of central tendency and the median, 10.00, is another. These values may be interpreted as follows. On average, there were 10.3 reports of aggravated assault at the 27 universities. Thirteen of the universities had at least 10 such reports while 13 recorded at most 10 such reports. At least one school logged exactly 10 reports.
2.37 The mean, 118.05, is one measure of central tendency and the median, 117.00, is another. The value 118.05 tells us that, on average, a baby weighed 118.05 ounces. The median tells us that about half of the babies weighed at least 117 ounces while roughly half weighed at most 117 ounces.
2.38 (a) x̄ = [0(7) + 1(10) + 2(13) + 3(5) + 4(2) + 5(1) + 6(1) + 7(1)]/40 = 1.925 (activities)
(b) Sample median is 2 activities.
(c) The large observations of 5, 6, and 7 activities did not drastically affect the computation of the mean in this instance.
2.39 (a) x̄ = [1(7) + 2(9) + 3(6) + 4(5) + 5(3)]/30 = 2.6 (returns)
(b) Sample median is 2 returns.
2.40 (a) Sample median = (235 + 242)/2 = 238.5 (seconds).
(b) x̄ = 1240/6 = 206.7 (seconds).
2.41 (a) x̄ = 271/40 = 6.775 days.
(b) Sample median = (6 + 7)/2 = 6.5. Both the sample mean and the sample median give a good indication of the amount of mineral lost.
2.42 (a) Sample median for males = (45.8 + 48.3)/2 = 47.05.
(b) Sample median for females = 30.3.
(c) Sample median for the combined set of males and females = 38.6.
2.43 The ordered measurements are: 145, 158, 165, 176, 182, 183, 200, 205, 216, 232
Sample median = (182 + 183)/2 = 182.5 (minutes).
2.44 In Exercise 2.43, the sample mean = 1862/10 = 186.2 (minutes). The total time for 10 games is 10x̄ = 1862 minutes and this is meaningful. However, 10 × median ignores the actual times of the long games and is therefore meaningless.
2.45 (a) The dot diagram for the diameters (in feet) of the Indian mounds in southern
Wisconsin is
(b) x̄ = 346/13 = 26.62. Sample median = 24.
(c) 13/4 = 3.25, so we count in 4 observations. Q1 = 22 and Q3 = 30.
2.46 16/4 = 4, an integer, so we average the 4th and 5th observations.
Q1 = (15 + 20)/2 = 17.5 and Q3 = (34 + 42)/2 = 38. The median, or Q2, is (26 + 31)/2 = 28.5 days.
2.47 (a) Median = (152 + 154)/2 = 153.
(b) 40/4 = 10, so we need to count in 10 observations. The 11th smallest observation also satisfies the definition.
Q1 = (135 + 136)/2 = 135.5, Q3 = (166 + 167)/2 = 166.5
2.48 x̄ = 2283/25 = 91.32 calls per shift.
2.49 The ordered data are
50 57 68 69 72 73 73 80 82 91
92 93 94 96 96 100 102 104 105 106
108 109 118 118 127
Since the number of observations is 25, the median or second quartile is the 13th ordered observation in the list. The first quartile is the 7th observation.
Q1 = 73, Q2 = 94, Q3 = 105
2.50 (a) The ordered data are
0.50 0.76 1.02 1.04 1.20 1.24 1.28 1.29 1.36 1.49
1.55 1.56 1.57 1.57 1.63 1.70 1.72 1.78 1.78 1.92
1.94 2.10 2.11 2.17 2.47 2.52 2.54 2.57 2.66 2.67
Since the number of observations is 30, the median or second quartile is the average of the 15th and 16th in the list. Sample median = (1.63 + 1.70)/2 = 1.665 meters per second. Because 30/4 = 7.5, the first quartile is the 8th ordered observation.
Q1 = 1.29, Q2 = 1.665, Q3 = 2.11
(b) Since 0.9(30) = 27, the 90th percentile is the average of the 27th and 28th observations in the ordered list. Sample 90th percentile = (2.54 + 2.57)/2 = 2.555.
2.51 (a) The ordered observations are
10 20 50 60 80 90 90 110 140 180 260 340 440 450 1700
Since the sample size is 15, the median is the 8th observation, 110. To obtain Q1, we find 15/4 = 3.75, so the first quartile is the 4th observation in the ordered list.
Q1 = 60, Q3 = 340
(b) The 90th percentile requires us to count in at least 0.9(15) = 13.5 or 14 observations. The 90th sample percentile = 450.
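The counting rule used in Exercises 2.45 to 2.51 (compute n × p; if it is not an integer, round up and take that ordered value; if it is an integer k, average the k-th and (k+1)-th ordered values) can be sketched as follows. The function name `sample_percentile` is hypothetical, and the percentage is passed as an integer so that the integer test uses exact arithmetic.

```python
def sample_percentile(data, pct):
    """Sample percentile by the counting rule applied in these solutions.

    pct is an integer percentage strictly between 0 and 100
    (25 for Q1, 50 for the median, 75 for Q3, ...).
    """
    if not 0 < pct < 100:
        raise ValueError("pct must be strictly between 0 and 100")
    ordered = sorted(data)
    n = len(ordered)
    k, remainder = divmod(n * pct, 100)   # whole part and leftover of n * p
    if remainder:
        # n * p is not an integer: round up to k + 1 (0-based index k).
        return ordered[k]
    # n * p is an integer k: average the k-th and (k+1)-th ordered values.
    return (ordered[k - 1] + ordered[k]) / 2

# Ordered observations from Exercise 2.51.
cfu = [10, 20, 50, 60, 80, 90, 90, 110, 140, 180, 260, 340, 440, 450, 1700]
q1, med, q3, p90 = (sample_percentile(cfu, p) for p in (25, 50, 75, 90))
print(q1, med, q3, p90)
```

Statistical packages often use other conventions (the MINITAB output quoted later in this chapter gives slightly different quartiles), so results from library routines need not match this rule exactly.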
2.52 (a) The mean of the original data set is
x̄ = (4 + 8 + 8 + 7 + 9 + 6)/6 = 42/6 = 7
Adding c = 4 to the original data set we get: 8, 12, 12, 11, 13, 10. The mean of the new data set is
mean of (x + c) = (8 + 12 + 12 + 11 + 13 + 10)/6 = 66/6 = 11 = x̄ + c
which equals x̄ + 4 = 7 + 4. Multiplying the original data set by d = 2 we get: 8, 16, 16, 14, 18, 12. The mean of the new data set is
mean of dx = (8 + 16 + 16 + 14 + 18 + 12)/6 = 84/6 = 14 = d x̄
which equals d × x̄ = 2(7).
(b) The median of the original data set is
median = (7 + 8)/2 = 7.5
When c = 4 is added to the original data set, the median of the new data set is
median of (x + 4) = (11 + 12)/2 = 11.5
which equals (median + c) = 7.5 + 4. When the original data set is multiplied by d = 2, the median of the new data set is
median of 2x = (14 + 16)/2 = 15
which equals (d × median) = 2(7.5).
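The two properties verified by hand in Exercise 2.52 (adding a constant c shifts the mean and median by c; multiplying by d scales them by d) can also be checked numerically. A small sketch, with variable names chosen here for illustration:

```python
import statistics

x = [4, 8, 8, 7, 9, 6]   # original data set from Exercise 2.52
c, d = 4, 2

shifted = [v + c for v in x]   # 8, 12, 12, 11, 13, 10
scaled = [d * v for v in x]    # 8, 16, 16, 14, 18, 12

# Property (i): mean and median of x + c equal the original values plus c.
print(statistics.mean(shifted), statistics.mean(x) + c)
print(statistics.median(shifted), statistics.median(x) + c)

# Property (ii): mean and median of d*x equal d times the original values.
print(statistics.mean(scaled), d * statistics.mean(x))
print(statistics.median(scaled), d * statistics.median(x))
```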
2.53 (a) The ordered data are 62, 70, 75, 75, 80. The median is 75°F and the mean is x̄ = 362/5 = 72.4°F.
(b) The mean of (°F - 32) is x̄ - 32 by property (i) of Exercise 2.52 with c = -32. By property (ii),
mean of (5/9)(°F - 32) = (5/9)(mean of (°F - 32)) = (5/9)(x̄ - 32) = (5/9)(72.4 - 32) = 22.44°C
By similar properties for the median,
median of (5/9)(°F - 32) = (5/9)(median of (°F) - 32) = (5/9)(75 - 32) = 23.89°C
2.54 (a) Company A. The average is highest and a superior machinist would earn above
the median.
(b) Company B. A medium quality machinist would be paid near the median.
Company B has the higher median.
2.55 (a) [Dot diagrams not reproduced]
(b) Lake Apopka: x̄ = 67/5 = 13.40
Lake Woodruff: x̄ = 454/9 = 50.44
(c) From the dot diagrams, the males in Lake Apopka have lower levels of testosterone and their sample mean is only about one-third of that for males in (un-contaminated) Lake Woodruff. This finding is consistent with the environmentalists' concern that the contamination has affected the testosterone levels and the reproductive abilities.
2.56 (a) [Dot diagrams not reproduced]
(b) Males: x̄ = 67/5 = 13.40
Females: x̄ = 144/11 = 13.09
(c) The dot diagrams of the amount of testosterone seem to be quite similar for males and females, although there is a gap in the male diagram. The two means are nearly the same, which suggests that the insecticide contamination has pushed hormone concentrations far out of balance because, ordinarily, males should have higher testosterone concentrations.
2.57 (a) We carry out all necessary calculations in the following table. The mean is x̄ = 12/3 = 4.
         x     x - x̄    (x - x̄)²
         7      3         9
         2     -2         4
         3     -1         1
Total   12      0.0      14
(b) The variance and the standard deviation are
s² = 14/(3 - 1) = 7, s = √7 = 2.646
2.58 (a) We carry out all necessary calculations in the following table. The mean is x̄ = 15/3 = 5.
         x     x - x̄    (x - x̄)²
         1     -4        16
        10      5        25
         4     -1         1
Total   15      0.0      42
(b) The variance and the standard deviation are
s² = 42/(3 - 1) = 21, s = √21 = 4.583
2.59 (a) We carry out all necessary calculations in the following table. The mean is x̄ = 24/4 = 6.
         x     x - x̄    (x - x̄)²
         6      0         0
         4     -2         4
        12      6        36
         2     -4        16
Total   24      0.0      56
(b) The variance and the standard deviation are
s² = 56/(4 - 1) = 18.667, s = √18.667 = 4.320
2.60 (a) We carry out all necessary calculations in the following table. The mean is x̄ = 11.5/5 = 2.3.
         x      x - x̄    (x - x̄)²
         2.6     0.3      0.09
         1.5    -0.8      0.64
         3.5     1.2      1.44
         2.4     0.1      0.01
         1.5    -0.8      0.64
Total   11.5     0.0      2.82
(b) The variance and the standard deviation are
s² = 2.82/(5 - 1) = 0.705, s = √0.705 = 0.840
2.61 We carry out all necessary calculations in the following table.
         x     x²
         8     64
         3      9
         4     16
Total   15     89
The variance is
s² = [1/(n - 1)] [Σx² - (Σx)²/n] = (1/2)(89 - 15²/3) = (1/2)(89 - 75) = 7
2.62 We carry out all necessary calculations in the following table.
         x     x²
         6     36
         4     16
        12    144
         2      4
Total   24    200
The variance is
s² = [1/(n - 1)] [Σx² - (Σx)²/n] = (1/3)(200 - 24²/4) = (1/3)(200 - 144) = 56/3 = 18.667
2.63 (a) s² = (38 - 12²/5)/4 = 2.30.
(b) s² = (19 - (-7)²/6)/5 = 2.167.
(c) s² = (499 - 59²/7)/6 = 0.286.
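The shortcut formula used in Exercises 2.61 to 2.63, s² = [Σx² - (Σx)²/n]/(n - 1), gives the same value as the definition based on squared deviations. The sketch below compares a hand-written version (the helper name `shortcut_variance` is an assumption made for this example) with `statistics.variance` from the Python standard library, which also divides by n - 1.

```python
import math
import statistics

def shortcut_variance(data):
    """Sample variance from the sums of x and x squared (divisor n - 1)."""
    n = len(data)
    total = sum(data)                     # sum of x
    total_sq = sum(v * v for v in data)   # sum of x squared
    return (total_sq - total ** 2 / n) / (n - 1)

for data in ([8, 3, 4], [6, 4, 12, 2]):   # data of Exercises 2.61 and 2.62
    s2 = shortcut_variance(data)
    print(round(s2, 3), round(math.sqrt(s2), 3))
    # Agreement with the deviation-based definition.
    assert math.isclose(s2, statistics.variance(data))
```

The shortcut form is convenient for hand calculation, but it subtracts two large, nearly equal quantities, so for data with a large mean and small spread the deviation-based form is numerically safer in floating point.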
2.64 (a) Many factors could explain the difference in apartment rents. One possible
factor is simply that different landlords may charge different rents. Other
factors are the size of the apartment, the proximity of the apartment to key
locations such as parks or public transportation, and whether utilities such as
water and electricity are included.
(b) s² = (3,836,675 - 5155²/7)/6 = 6730.95.
(c) s = √6730.95 = 82.04
2.65 s = √((9726 - 346²/13)/12) = 6.5643.
2.66 (a) s² = (142 - 26²/5)/4 = 1.70.
(b) s = √1.70 = 1.304
2.67 (a) s² = (3,140,900 - 3810²/15)/14 = 155,225.7.
(b) s = √155,225.7 = 393.99.
(c) s² = (580,900 - 2210²/14)/13 = 17,848.9, so s = √17,848.9 = 133.6. The single very large value greatly inflates the standard deviation.
2.68 (a) s² = (2410 - 212²/25)/24 = 25.51.
(b) s = √25.51 = 5.05.
2.69 (a) x̄ = 1862/10 = 186.2.
(b) s² = (353,308 - 1862²/10)/9 = 733.73.
(c) s = √733.73 = 27.09
2.70 (a) x̄ = 62/50 = 1.24 bags.
(b) s² = (112 - 62²/50)/49 = 0.71673, so that s = 0.847.
2.71 (a) Median = 67.4.
(b) x̄ = 478.4/7 = 68.343.
(c) s² = (32,773.34 - 478.4²/7)/6 = 13.020. Hence s = 3.608.
2.72 (a) The measure of variation displayed is 7.61, the sample standard deviation. The sample variance is s² = 7.61² = 57.9121.
(b) The interquartile range is Q3 - Q1 = 14.00 - 5.00 = 9.00. This means the center half of the data span an interval of length 9.
(c) Any value greater than 7.61 would correspond to greater variation.
2.73 (a) The measure of variation displayed is 15.47, the sample standard deviation. The sample variance is s² = 15.47² = 239.321.
(b) The interquartile range is Q3 - Q1 = 131.00 - 106.00 = 25.00. This means the center half of the data span an interval of length 25 ounces.
(c) Any value smaller than 15.47 would correspond to smaller variation.
2.74 (i) For the observations 5, 9, 9, 8, 10, 7, x̄ = 8, s² = 3.2 and s = 1.789. Adding c = 4 to the observations x, we have 9, 13, 13, 12, 14, 11. The sample mean and variance of the new data set are
mean of (x + 4) = (9 + 13 + 13 + 12 + 14 + 11)/6 = 12
variance of (x + 4) = (9 + 1 + 1 + 0 + 4 + 1)/(6 - 1) = 16/5 = 3.2
So the standard deviation of the new data set x + 4 is √3.2 = 1.789, which is the same as the standard deviation of x.
(ii) Multiply the observations x by d = 2. We get 10, 18, 18, 16, 20, 14. The sample mean and variance of the new data set are
mean of 2x = (10 + 18 + 18 + 16 + 20 + 14)/6 = 16
variance of 2x = (36 + 4 + 4 + 0 + 16 + 4)/(6 - 1) = 4(9 + 1 + 1 + 0 + 4 + 1)/5 = 4(3.2) = 12.8
So the standard deviation of the new data set 2x is 2√3.2 = 3.578, or d times the standard deviation of x.
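The two results of Exercise 2.74 (adding c leaves s unchanged; multiplying by d multiplies s by d and s² by d²) can be confirmed numerically. This is an illustrative check only, using the standard `statistics` module and variable names introduced here.

```python
import math
import statistics

x = [5, 9, 9, 8, 10, 7]   # observations from Exercise 2.74
c, d = 4, 2

var_x = statistics.variance(x)                           # 16/5 = 3.2
var_shifted = statistics.variance([v + c for v in x])    # unchanged by the shift
var_scaled = statistics.variance([d * v for v in x])     # d squared times var_x

print(var_x, var_shifted, var_scaled)
print(round(math.sqrt(var_x), 3), round(math.sqrt(var_scaled), 3))
```

The check above uses d > 0; for a negative multiplier the standard deviation is multiplied by |d|, since a standard deviation cannot be negative.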
2.75 Using the data set in Exercise 2.22, in Exercise 2.47, we determined that Q1 = 135.5 and Q3 = 166.5. Hence,
Interquartile range = Q3 - Q1 = 166.5 - 135.5 = 31.0 points.
2.76 From the data set of Exercise 2.33, in Exercise 2.46 we determined that Q1 = 17.5 and Q3 = 38. Hence,
Interquartile range = Q3 - Q1 = 38 - 17.5 = 20.5 days.
2.77 No. Typically, the middle half of a data set is much more concentrated than the sum of the two quarters, one in each tail. As an example, for the water quality data of Exercise 2.17, the range is 1600 - 10 = 1590 because of one extremely large observation. From the quartiles determined in Exercise 2.51, the interquartile range is 340 - 60 = 280. The range is six times larger than the interquartile range.
2.78 (a) x̄ = 150.125 and 2s = 49.354, so x̄ ± 2s is the interval (100.771, 199.479). This interval contains 38 observations, or proportion 0.95 of the observations. And x̄ ± 3s is the interval (76.094, 224.156), which contains proportion 1 of the observations.
(b) The empirical guidelines suggest proportion 0.95 in the interval x̄ ± 2s and we observed 0.95. They suggest proportion 0.997 for the interval x̄ ± 3s and we observed 1.000. The agreement is excellent.
2.79 (a) x̄ = 6.775 and s = √19.4096 = 4.406.
(b) The proportions of the observations are given in the following table:
              x̄ ± s             x̄ ± 2s             x̄ ± 3s
Interval:     (2.369, 11.181)   (-2.037, 15.587)   (-6.443, 19.993)
Proportion:   26/40 = 0.65      38/40 = 0.95       40/40 = 1.00
Guidelines:   0.68              0.95               0.997
(c) We observe a good agreement with the proportions suggested by the empirical
guideline.
2.80 (a) x̄ = 51.71/30 = 1.724 and s = √((98.641 - 51.71²/30)/29) = 0.5727.
(b) The proportions of the observations are given in the following table:
              x̄ ± s            x̄ ± 2s           x̄ ± 3s
Interval:     (1.151, 2.296)   (0.579, 2.869)   (0.006, 3.442)
Proportion:   20/30 = 0.667    29/30 = 0.967    30/30 = 1.00
Guidelines:   0.68             0.95             0.997
(c) We observe a good agreement with the proportions suggested by the empirical
guideline.
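The comparison with the empirical guidelines in Exercise 2.80 can be reproduced by counting the observations inside x̄ ± ks for k = 1, 2, 3. The sketch below assumes that the 30 ordered lizard speeds listed in Exercise 2.50 are the data behind Exercise 2.80 (their sum is 51.71, matching the mean calculation); the helper `count_within` is a name introduced for this example.

```python
import statistics

# Ordered speeds (meters per second) as listed in Exercise 2.50.
speeds = [
    0.50, 0.76, 1.02, 1.04, 1.20, 1.24, 1.28, 1.29, 1.36, 1.49,
    1.55, 1.56, 1.57, 1.57, 1.63, 1.70, 1.72, 1.78, 1.78, 1.92,
    1.94, 2.10, 2.11, 2.17, 2.47, 2.52, 2.54, 2.57, 2.66, 2.67,
]

mean = statistics.mean(speeds)
s = statistics.stdev(speeds)   # sample standard deviation (divisor n - 1)

def count_within(data, center, spread, k):
    """Number of observations strictly inside center ± k * spread."""
    lo, hi = center - k * spread, center + k * spread
    return sum(lo < v < hi for v in data)

counts = [count_within(speeds, mean, s, k) for k in (1, 2, 3)]
for k, count in zip((1, 2, 3), counts):
    print(f"within {k}s: {count}/{len(speeds)} = {count / len(speeds):.3f}")
```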
2.81 (a) x̄ = 25.160 and s = √114.790 = 10.714.
(b) The proportions of the observations are given in the following table:
              x̄ ± s              x̄ ± 2s            x̄ ± 3s
Interval:     (14.446, 35.874)   (3.732, 46.588)   (-6.982, 57.302)
Proportion:   36/50 = 0.72       47/50 = 0.94      50/50 = 1.00
Guidelines:   0.68               0.95              0.997
(c) We observe a good agreement with the proportions suggested by the empirical guideline.
2.82 (a) The z-values of 350 and 620 are
z = (350 - 490)/120 = -1.167, z = (620 - 490)/120 = 1.083.
(b) For the z-score of 2.4, the raw score is obtained by solving the equation
z = 2.4 = (x - 210)/50, so x = 210 + 50(2.4) = 330.
2.83 (a) z = (102 - 118.05)/15.47 = -1.037
(b) z = (144 - 118.05)/15.47 = 1.677
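The standardization in Exercises 2.82 and 2.83, z = (x - x̄)/s, and its inverse, x = x̄ + s z, are simple enough to express as two small functions. The names `z_score` and `raw_score` are hypothetical and used only for this sketch.

```python
def z_score(x, mean, sd):
    """Standardized value: how many standard deviations x lies from the mean."""
    return (x - mean) / sd

def raw_score(z, mean, sd):
    """Invert the standardization: recover x from its z-value."""
    return mean + sd * z

# Exercise 2.82(a): mean 490, standard deviation 120.
print(round(z_score(350, 490, 120), 3), round(z_score(620, 490, 120), 3))
# Exercise 2.82(b): mean 210, standard deviation 50, z = 2.4.
print(round(raw_score(2.4, 210, 50), 1))
# Exercise 2.83: mean 118.05, standard deviation 15.47.
print(round(z_score(102, 118.05, 15.47), 3), round(z_score(144, 118.05, 15.47), 3))
```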
2.84 (a), (b) The boxplots for salaries in City A and City B are shown below.
(c) There is a greater difference between the cities with respect to the higher salaries. For instance, any salary above the median in City B is greater than the 75th percentile in City A.
2.85 For males, the minimum and the maximum horizontal velocity of a thrown ball are 25.2 and 59.9 respectively. The quartiles are:
Q1 = (38.6 + 39.1)/2 = 38.85,
median = (45.8 + 48.3)/2 = 47.05,
Q3 = (49.9 + 51.7)/2 = 50.8.
For females, the minimum and the maximum horizontal velocity of a thrown ball are 19.4 and 53.7 respectively. The quartiles are
Q1 = 25.7, median = 30.3, Q3 = 33.5.
The boxplots of the male and female throwing speeds are
Comparing the two boxplots, we can see that males throw the ball faster than
females.
2.86 (a) The differences, arranged in order, between 2007 and 1992 Consumer Price Index are
13 13 14 18 20 20 21 21 21 22 23 24 25 25 26 26
27 33 33 34 35 36 37 41
Q1 = (20 + 21)/2 = 20.5, Q2 = (24 + 25)/2 = 24.5, Q3 = (33 + 34)/2 = 33.5
The five-number summary is:
13, 20.5, 24.5, 33.5, 41
Alternatively, from Minitab:
Descriptive Statistics:
Variable   N    N*   Mean    SE Mean   StDev   Minimum   Q1      Median   Q3      Maximum
C1         24   0    25.33   1.59      7.81    13.00     20.25   24.50    33.00   41.00
(b) [Boxplot of Increases in Consumer Price Indices, vertical axis from 10 to 40]
2.87 (a) x̄ = 608/24 = 25.33 and s = √((16806 - (608)²/24)/(24 - 1)) = 7.811
(b) Since x̄ - 2s = 25.33 - 2(7.811) = 9.708 and x̄ + 2s = 25.33 + 2(7.811) = 40.952, only the increase of 41 for Honolulu lies outside the interval. The proportion 23/24 = 0.958 lies within the interval.
2.88 (a) Using the ordered data set from Example 5, we have
min = 4.5
max = 10.0
Q1 = (6.0 + 6.3)/2 = 6.15
Q2 = 7.3
Q3 = (8.0 + 8.3)/2 = 8.15
(b) The box plot depicting this data set is as follows:
[Boxplot: Hours of Sleep, vertical axis from 4 to 10]
2.89 From Exercise 2.38, we know that x̄ = 1.925 and
s = √((249 - (77)²/40)/(40 - 1)) = 1.607
2.90 (a) x̄ = 296/14 = 21.14
(b) s = √((12078 - 296²/14)/(14 - 1)) = 21.16
(c) The dot plot is given below
(d) All are losses except for two gains in the 1998 and 2002 elections.
2.91 (a) The ordered data are
-8 -5 4 5 8 11 12 16 26 30 43 47 52 55
Median = (12 + 16)/2 = 14 seats lost.
(b) The maximum number of seats lost, 55, occurred when Harry S. Truman was President. The minimum number, -8 or a gain, occurred during G.W. Bush's term as President.
(c) range = 55 - (-8) = 63
2.92 The process appears to be in statistical control. The pattern is nearly a horizontal band with one possible low value.
2.93 The value 215 from the second pay period looks high and 194 from the fifth period is possibly high.
2.94 We calculate x̄ = 2283/25 = 91.32 and s = √(9281.44/24) = 19.67, so the upper limit is x̄ + 2s = 130.66 and the lower limit is x̄ - 2s = 51.98.
Only the value 50 calls for worker 20 is out of control.
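The control limits x̄ ± 2s of Exercise 2.94 can be recomputed and used to flag out-of-control values. The sketch below assumes that the 25 ordered values listed in Exercise 2.49 are the same calls-per-shift data (their sum is 2283, as in the mean above); because that list is sorted, it identifies out-of-control values but not which worker produced them.

```python
import statistics

# Ordered data from Exercise 2.49 (assumed to be the calls-per-shift data).
calls = [
    50, 57, 68, 69, 72, 73, 73, 80, 82, 91,
    92, 93, 94, 96, 96, 100, 102, 104, 105, 106,
    108, 109, 118, 118, 127,
]

center = statistics.mean(calls)
s = statistics.stdev(calls)            # sample standard deviation
lower, upper = center - 2 * s, center + 2 * s

# Values falling outside the two-standard-deviation control limits.
out_of_control = [v for v in calls if not lower <= v <= upper]

print(round(center, 2), round(s, 2))
print(round(lower, 2), round(upper, 2))
print(out_of_control)
```

The limits printed here can differ from the solution's 51.98 and 130.66 in the second decimal place, because the solution rounds s to 19.67 before doubling it.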
2.95 We calculate x̄ = 2501/26 = 96.2 and s = √(65254/25) = 51.1, so the upper limit is x̄ + 2s = 198.4 and the lower limit is x̄ - 2s = -6.0, which we take as 0.
Only the value 215 from the second pay period is out of control.
2.96 [Time Series Plot of Exchange Rate: Exchange Rate (1.0 to 2.4) versus year, 1992 to 2006]
The process appears to be in statistical control between 1994 and 2003, but then begins to taper off from 2003 to 2007.
2.97 [Xbar Chart of Exchange Rate: Sample Mean (1.0 to 2.4) versus Sample (1 to 15), with UCL = 1.902, center line = 1.416, LCL = 0.930]
The process appears to be in statistical control. Only the value of 2.29 in 1993 is out of control.
2.98 We re-calculate without the outlier 5326.
x̄ = 18329/15 = 1221.9 and s = √(5195138/14) = 609.16
so the upper limit is x̄ + 2s = 2440.2 and the lower limit is x̄ - 2s = 3.6. All of the points are within the control limits.
2.99 (a) The relative frequencies of the occupation groups are:
                    Relative Frequency
                    2007     2000
Goods Producing     0.139    0.161
Service (Private)   0.722    0.702
Government          0.139    0.136
Total               1.000    1.000
(b) The proportions of persons in private service occupations and government have increased while the proportion in goods producing has decreased from 2000 to 2007.
2.100 (a) The frequency table of "intended major" of the students is:
Intended major       Frequency   Relative Frequency
Biological Science   18          0.367
Humanities           4           0.082
Physical Sciences    9           0.184
Social Science       18          0.367
Total                49          1.000
(b) The frequency table of "year in college" of the students is:
Year    Frequency   Relative Frequency
1       4           0.082
2       10          0.204
3       20          0.408
4       15          0.306
Total   49          1.000
2.101 The dot diagrams of heights for the male and female students are
2.102 The frequency table of the causes for power outage is:
Frequency Table for Causes of Outage
Cause             Frequency
Trees and limbs   12
Animals           9
Lightning         3
Wind storm        1
Fuse              1
Unknown           4
The Pareto chart for the cause of outage is
2.103 (a) Yes. The exact number of lunches is the sum of the frequencies of the first
four classes.
(b) Yes. The exact number of lunches is the sum of the frequencies of the last
two classes.
(c) No.
2.104 The sample mean and sample standard deviation are:
x̄ = (137 + 139 + 96 + 137 + 115)/5 = 124.8 mm.
s² = (79,300 - 624²/5)/4 = 356.2, so that s = √356.2 = 18.9 mm.
2.105 (a) The mean, 227.4, is one measure of central tendency and the median, 232.5, is another. These values may be interpreted as follows. On average, the 20 grizzly bears weigh 227.4 pounds apiece. Half of the grizzly bears sampled weighed at least 232.5 pounds while half weighed at most 232.5 pounds.
(b) The sample standard deviation is 82.7 pounds.
(c) The z score for a weight of 320 pounds is
z = (320 - 227.4)/82.7 = 1.12
2.106 (a) Median = (64 + 67)/2 = 65.5.
(b) We count in 38/4 = 9.5 or 10 observations to find Q1 = 50 and Q3 = 79.
(c) The proportion of students who scored below 40 is 5/38 = 0.132.
The proportion of students who scored 90 or over is 4/38 = 0.105.
2.107 (a) Sample median = (9 + 9)/2 = 9.
(b) x̄ = 271/30 = 9.033.
(c) The sample variance is
s² = (1/29)(2587 - 271²/30) = 4.792.
2.108 (a) The double stem display is
4 23
4 6677899
5 0011112244444
5 555566677778
6 0111244
6 589
(b) Median = (54 + 55)/2 = 54.5, Q1 = 50, Q3 = 57.
2.109 (a) x̄ = 7, s = 2
(b) By the properties, the new data set x + 100 has sample mean = (7 + 100) = 107 and standard deviation 2. By direct calculation, we verify
mean = (106 + 108 + 104 + 109 + 108)/5 = 107
variance = [(106 - 107)² + (108 - 107)² + (104 - 107)² + (109 - 107)² + (108 - 107)²]/4 = 4
(c) By the properties, the new data set 3x has sample mean = 3(7) = 21 and standard deviation 3s = 3(2) = 6. By direct calculation, we verify
mean = (18 + 24 + 12 + 27 + 24)/5 = 21
variance = [(-3)² + (3)² + (-9)² + (6)² + (3)²]/4 = 9(1 + 1 + 9 + 4 + 1)/4 = 9s²
2.110 (a) For the heights of males, x̄ = 69.61, s = 2.97.
(b) For the heights of females, x̄ = 66.14, s = 2.60.
(c) For the heights of males, median = 70, Q1 = 68, Q3 = 72.
(d) For the heights of females, median = 66.5, Q1 = 65, Q3 = 67.
2.111 (a) The dot diagrams are
(b) From the dot diagrams we can see the number of flies (grape juice) is centered
at about 11 and the number of flies (regular food) is centered near 25. The
spread looks about the same.
(c) Regular food: x̄ = 25.1, s = 6.84.
Grape juice: x̄ = 10.6, s = 6.07.
2.112 (a) The dot diagram of the usage times per ounce of toothpaste is
(b) The relative frequency of usage times that do not exceed .80 is
(number of usage times less than or equal to .80)/24 = 12/24 = 0.50.
(c) x̄ = 0.794 and s = 0.115.
(d) Median = 0.805, Q1 = 0.755 and Q3 = 0.86.
2.113 (a) x̄ = 5.38 and s = 3.42.
(b) Median = 5.
(c) Range = 13 - 0 = 13.
2.114 (a) 0.75 since 106.0 is the third quartile.
(b) 0.50 since 94.0 is the median.
(c) 0.95 in the interval x̄ ± 2s or 69.5-111.5 if the frequency distribution is nearly bell-shaped.
(d) 0.50 since Q1 = 85.5 and Q3 = 106.0.
(e) 0.997 in the interval x̄ ± 3s or 59.0-122.0 if the frequency distribution is nearly bell-shaped.
2.115 (a) Median = 4.505, Q1 = 4.30 and Q3 = 4.70.
(b) 90th percentile = (4.80 + 5.07)/2 = 4.935.
(c) x̄ = 4.5074 and s = 0.368.
(d) The boxplot of acid rain in Wisconsin is
2.116 (a) and (b). We have x̄ = 4.507 and s = 0.368 so
              x̄ ± s            x̄ ± 2s           x̄ ± 3s
Interval:     (4.139, 4.875)   (3.771, 5.243)   (3.403, 5.611)
Proportion:   38/50 = 0.76     46/50 = 0.92     50/50 = 1.000
Guidelines:   0.68             0.95             0.997
(c) The observed proportions are somewhat close to those suggested by the empirical guidelines. However, the proportion between one and two standard deviations, (46 - 38)/50 = 0.16, is noticeably smaller than the expected 0.95 - 0.68 = 0.27.
2.117 (a) Median = 6.3, Q1 = 5.9 and Q3 = 6.9.
(b) x̄ = 314.8/49 = 6.424 and s = √((2047.4 - (314.8)²/49)/48) = 0.721.
(c)
Class Interval (%)   Frequency   Relative Frequency
(5.2, 5.6]           5           0.1020
(5.6, 6.0]           11          0.2245
(6.0, 6.4]           13          0.2653
(6.4, 6.8]           7           0.1429
(6.8, 7.2]           8           0.1633
(7.2, 8.4]           5           0.1020
Total                49          1.0000
We use the convention that the right endpoint is included in the class interval.
(d) The boxplot of the data is
2.118 (a)
Class Interval     Frequency   Relative Frequency
(-4500, -1600]     1           0.026
(-1600, -850]      1           0.026
(-850, -250]       2           0.051
(-250, 0]          8           0.205
(0, 250]           13          0.333
(250, 850]         6           0.154
(850, 1650]        5           0.128
(1650, 2450]       3           0.077
Total              39          1.0000
(b) The frequency and relative frequency histograms are:
[Histogram: Yearly Changes in Dow Jones Averages, Frequency (0 to 14), class boundaries at -1600, -850, -250, 0, 250, 850, 1650, 2450]
[Histogram: Yearly Changes in Dow Jones Averages, Relative Frequency (0 to 0.35) versus Changes in DJ Average, classes 1 to 8]
(Here, the classes were renumbered 1 to 8, consecutively, for simplicity.)
(c) Since the value 0 is included in the interval (-250, 0], it is unclear how many of the 8 observations contained in that interval are negative. It is, however, unlikely that the Dow Jones average at the end of one year was exactly the same as that in the previous year. It seems safe to assume that all 8 differences in the interval (-250, 0] are negative. The proportion of changes that are negative is then (1 + 1 + 2 + 8)/39 = 0.308.
(d) The distribution is roughly bell-shaped centered slightly above 0.
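The relative frequencies in 2.118 (a) and the proportion in part (c) follow directly from the class counts. The short Python sketch below is an added illustration, not part of the original solution. It assumes, as part (c) does, that none of the 8 changes in the class ending at 0 is exactly 0.

```python
# Class frequencies from the 2.118 (a) table, lowest class first.
frequencies = [1, 1, 2, 8, 13, 6, 5, 3]
total = sum(frequencies)

# Relative frequency = class frequency / total, rounded as in the table.
relative = [round(f / total, 3) for f in frequencies]

# Part (c): the four lowest classes end at or below 0, so their counts are
# treated as negative changes (assumption: no change is exactly 0).
negative_count = sum(frequencies[:4])
negative_proportion = round(negative_count / total, 3)

print(total, relative, negative_proportion)
```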
2.119 (a)
[Figure: plot of the winning times; axis label: Winning Times in Minutes and Seconds, with ticks at 3.4, 4.4, 5.4]
(b) It is not reasonable, because a frequency distribution would not show the
systematic decrease of the winning times over the years, which is the main
feature of these observations.
2.120 The mode is 1 bag, since the sample has twenty-four 1’s.
2.121 We calculate
x̄ = 4167/50 = 83.34 and s = √[(419,411 − (4167)²/50)/49] = 38.368.
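The arithmetic in 2.121 uses the computing formula for the sample standard deviation, which needs only n, the sum of the observations, and the sum of their squares. The Python sketch below is an added check, not part of the original solution. The function name `mean_and_sd_from_sums` is hypothetical, and it reads 4167 as the sum of the observations and 419,411 as the sum of squares, which is an interpretation of the flattened formula that matches the stated results.

```python
import math

def mean_and_sd_from_sums(n, sum_x, sum_x_sq):
    """Sample mean and standard deviation from n, sum(x), and sum(x^2)."""
    mean = sum_x / n
    # Computing formula: s^2 = (sum(x^2) - (sum(x))^2 / n) / (n - 1)
    variance = (sum_x_sq - sum_x ** 2 / n) / (n - 1)
    return mean, math.sqrt(variance)

# Values read from 2.121 (assumed: 4167 = sum of x, 419411 = sum of x squared).
mean, sd = mean_and_sd_from_sums(50, 4167, 419411)
print(round(mean, 2), round(sd, 3))
```

The same function applied to the sums in 2.117 (b), `mean_and_sd_from_sums(49, 314.8, 2047.4)`, gives values that round to the 6.424 and 0.721 reported there.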
2.122 (a) The time plot for this data set is as follows:
[Figure: "Time Plot for Deaths Due to Lightning"; vertical axis: Number of Deaths (50 to 200); horizontal axis: Year (1959 to 2004)]
(b) The number of deaths due to lightning is steadily declining over the indicated
time period. The mean and standard deviation, while important, would not
capture this trend. The 5-number summary would be an important supplement
to report when describing this data set.
2.123 (a) The partial MINITAB output is
(b) The partial MINITAB output for the acid rain data.
(Note that MINITAB uses a slightly different convention for determining Q1
and Q3 .)
2.124 (a) The histogram and box plot reveal a longer right-hand tail. The five
observations 4749, 4846, 4949, 5005, and 5157 seem to be detached and
larger than expected for a bell-shaped pattern.
(b) Textbook scheme: Q1 = 3470. MINITAB scheme: Q1 = 3457.5.
The scheme used by the text yields a slightly higher value.
2.125 The partial MINITAB output for the data set in Table 4.
2.126 The MINITAB commands and partial output of the final times to run 1.5 miles in
Data Bank D.5.
2.127 The mean and standard deviation given by MINITAB are the rounded off values
of the answer given by SAS.
2.128 (a) Freshwater growth of male salmon: x̄ = 98.350, s = 30.03,
median = 98.5, Q1 = 80 and Q3 = 118. The frequency table for freshwater
growth of male salmon is:
Class Interval    Frequency    Relative Frequency
40-60                  6            0.150
60-80                  4            0.100
80-100                11            0.275
100-120               10            0.250
120-140                6            0.150
140-160                3            0.075
Total                 40            1.000
Because the cells have equal width, we take the option of using relative
frequency for the vertical scale. The histogram is shown below.
(b) Freshwater growth of female salmon: x̄ = 114.825, s = 22.22,
median = 115.5, Q1 = 98.5 and Q3 = 132.5. The frequency table for
freshwater growth of female salmon is:
Class Interval    Frequency    Relative Frequency
60-80                  2            0.050
80-100                 9            0.225
100-120               12            0.300
120-140               13            0.325
140-160                3            0.075
160-180                1            0.025
Total                 40            1.000
Because the cells have equal width, we take the option of using relative
frequency for the vertical scale. The histogram is shown below.
(c) The box plots for freshwater growth of male and female salmon are:
2.129 (a) The histogram of the alligator data is
(b) x̄ = 4035/37 = 109.1 and s = √[155672/(37 − 1)] = 65.8.
2.130 (a) The ordered data are
 16  18  19  19  24  29  32  33  46  58
 68  72  78  82  82  83  99 101 109 110
114 118 125 134 140 141 142 143 163 170
184 194 200 220 220 221 228
Median = 109, Q1 = 58 and Q3 = 143.
(b) We count in 34 positions to find the 90th percentile = 220. The only two
observations that are higher were taken from females.
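The positions used in 2.130 follow the percentile convention these solutions appear to use: compute n times p; if the result is a whole number k, average the k-th and (k+1)-th ordered values; otherwise round up to the next position. The Python sketch below is an added illustration of that convention, not part of the original solution, and the function name `sample_percentile` is hypothetical. It uses integer arithmetic so that products such as 0.9 × 37 are not affected by floating-point rounding.

```python
def sample_percentile(sorted_values, percent):
    """Sample percentile under the round-up / average convention.

    `percent` is an integer from 1 to 99. If n * percent / 100 is a whole
    number k, the k-th and (k+1)-th ordered values are averaged; otherwise
    the value at the next higher position is returned.
    """
    n = len(sorted_values)
    whole, remainder = divmod(n * percent, 100)
    if remainder == 0:
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    return sorted_values[whole]  # zero-based index of position whole + 1

# Ordered data from 2.130 (a), n = 37.
data = [16, 18, 19, 19, 24, 29, 32, 33, 46, 58,
        68, 72, 78, 82, 82, 83, 99, 101, 109, 110,
        114, 118, 125, 134, 140, 141, 142, 143, 163, 170,
        184, 194, 200, 220, 220, 221, 228]

median = sample_percentile(data, 50)
q1 = sample_percentile(data, 25)
q3 = sample_percentile(data, 75)
p90 = sample_percentile(data, 90)
print(median, q1, q3, p90)
```

Statistical packages often use a different rule, so their quartiles can differ slightly from values found this way.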
2.131 (a)
(b) The ordered observations are
75.3 75.7 75.9 75.9 76.2 76.3 76.4 76.4 76.6 76.6
76.7 76.9 76.9 77.0 77.0 77.1 77.4 77.4 77.4 77.4
77.4 77.5 77.6 77.6 77.8 77.9 77.9 77.9 77.9 77.9
78.0 78.1 78.3 78.4 78.4 78.5 79.1 79.2 80.0 80.4
There are 40 observations, so the median = (77.4 + 77.4)/2 = 77.4. The first
quartile is the average of the 40/4 = 10th and 11th observations in the sorted
list, and the third quartile is the average of the 30th and 31st.
Q1 = (76.6 + 76.7)/2 = 76.65 and Q3 = (77.9 + 78.0)/2 = 77.95.
(c) The interval x̄ ± s or (76.36, 78.56) has relative frequency 30/40 = 0.75
compared to 0.683. The interval x̄ ± 2s or (75.26, 79.66) has relative
frequency 38/40 = 0.95 compared to 0.95. The interval x̄ ± 3s or
(74.16, 80.76) has relative frequency 1 compared to 0.997. The agreement is
quite good.
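The relative frequencies in 2.131 (c) can be recomputed from the 40 ordered observations listed in part (b). The Python sketch below is an added check, not part of the original solution. The helper name `proportion_within` is hypothetical, and the sketch assumes the intervals are open, which makes no difference here because no observation falls on an endpoint.

```python
import statistics

# Ordered observations from 2.131 (b), n = 40.
observations = [
    75.3, 75.7, 75.9, 75.9, 76.2, 76.3, 76.4, 76.4, 76.6, 76.6,
    76.7, 76.9, 76.9, 77.0, 77.0, 77.1, 77.4, 77.4, 77.4, 77.4,
    77.4, 77.5, 77.6, 77.6, 77.8, 77.9, 77.9, 77.9, 77.9, 77.9,
    78.0, 78.1, 78.3, 78.4, 78.4, 78.5, 79.1, 79.2, 80.0, 80.4,
]

x_bar = statistics.mean(observations)
s = statistics.stdev(observations)  # sample standard deviation (divisor n - 1)

def proportion_within(values, center, spread, k):
    """Fraction of values strictly inside center +/- k * spread."""
    low, high = center - k * spread, center + k * spread
    inside = sum(1 for v in values if low < v < high)
    return inside / len(values)

proportions = [proportion_within(observations, x_bar, s, k) for k in (1, 2, 3)]
print(round(x_bar, 2), round(s, 2), proportions)
```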