Chapter Solutions
Solution 1
a. Determine the mean price of this raw data by summing the prices for the six jars and dividing
the total by six. Recall the formula for the mean of a sample was given previously. See
Formula [3-2].
\[ \bar{X} = \frac{\Sigma X}{n} = \frac{\$7.98}{6} = \$1.33 \]
b. As noted above the median is defined as the middle value of a set of data, after the data is
arranged from smallest to largest. The prices for the six jars of blackberry preserves have
been ordered from a low of $1.26 up to $1.42. Because this is an even number of prices the
median price is halfway between the third and the fourth price. The median is $1.32.
Prices Arranged from Low to High:
$1.26   $1.31   $1.31   $1.33   $1.35   $1.42

\[ \text{Median} = \frac{\$1.31 + \$1.33}{2} = \$1.32 \]
Suppose there is an odd number of blackberry preserve prices, such as the five shown here:
$1.31   $1.31   $1.33   $1.35   $1.42
The median is the middle value ($1.33). To find the median, the values must first be ordered
from low to high.
c. The mode is the price that occurs most often. The price of $1.31 occurs twice in the original
data and is the mode.
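The three measures of location in Solution 1 can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and are not part of the original solution.

```python
from statistics import mean, median, mode

# Prices of the six jars of blackberry preserves from Solution 1.
prices = [1.26, 1.31, 1.31, 1.33, 1.35, 1.42]

sample_mean = mean(prices)      # sum of the prices divided by n
sample_median = median(prices)  # halfway between the 3rd and 4th ordered prices
sample_mode = mode(prices)      # the price that occurs most often

print(f"mean   = ${sample_mean:.2f}")
print(f"median = ${sample_median:.2f}")
print(f"mode   = ${sample_mode:.2f}")
```

Note that `median` sorts the data itself, so the values do not have to be ordered beforehand.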
Solution 2
The geometric mean (GM) annual percent increase from one time period to another is determined
using formula [3-5].
\[ GM = \sqrt[n]{\frac{\text{Value at the end of the period}}{\text{Value at the start of the period}}} - 1 \qquad [3\text{-}5] \]

Note that there are 14 years between 1988 and 2002, so n = 14.

\[ GM = \sqrt[14]{\frac{300}{5}} - 1 = \sqrt[14]{60.0} - 1 = 1.33971 - 1 = 0.33971 \]

For those with a \(\sqrt[x]{y}\) key on their calculator, the geometric mean can be solved quickly by:

Using                              Display
300 ÷ 5 =                          60
Depress \(\sqrt[x]{y}\)
Depress 14                         1.33971
Depress − 1 =                      0.33971, or about 34%
The value 1 is subtracted, according to formula [3-5], so the rate of increase is 0.33971, or
33.971% per year. The sale of hospital beds increased at a rate of almost 34% per year.
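The same calculation can be reproduced without a calculator. The sketch below assumes the start and end values of 5 and 300 that appear in the worked computation; the function name `geometric_mean_rate` is hypothetical.

```python
def geometric_mean_rate(start_value, end_value, n_periods):
    """Average rate of increase per period, following formula [3-5]."""
    # n-th root of (end / start), minus 1
    return (end_value / start_value) ** (1 / n_periods) - 1


# 14 years between 1988 and 2002; values taken from the worked solution.
rate = geometric_mean_rate(start_value=5, end_value=300, n_periods=14)
print(f"GM rate of increase: {rate:.5f} ({rate:.1%} per year)")
```

Raising to the power 1/n is equivalent to taking the n-th root, which is what the root key on the calculator does.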
Solution 3
Recall that the range is the difference between the largest value and the smallest value.
\[ \text{Range} = \text{Highest Value} - \text{Lowest Value} = \$212 - \$92 = \$120 \]
This indicates that there is a difference of $120 between the largest and the smallest heating cost.
Solution 4
The mean deviation is the mean of the absolute deviations from the arithmetic mean. For raw, or
ungrouped data, it is computed by first determining the mean. Next, the difference between each
value and the arithmetic mean is determined. Finally, these differences are totaled and the total
divided by the number of observations. We ignore the sign of each difference. Formula [3-2] for
the sample mean and formula [3-7] for the mean deviation are shown below.
Sample Mean
\[ \bar{X} = \frac{\Sigma X}{n} \qquad [3\text{-}2] \]
Mean Deviation
\[ MD = \frac{\Sigma |X - \bar{X}|}{n} \qquad [3\text{-}7] \]
The table below shows the data values, each data value minus the mean, and the absolute value of
the deviations from the mean.
In other words, the signs of the deviations from the mean are disregarded.
Payment X      X − X̄      Absolute Deviation |X − X̄|
  $191          +48          $48
   212          +69           69
   176          +33           33
   129          −14           14
   106          −37           37
    92          −51           51
   108          −35           35
   109          −34           34
   103          −40           40
   121          −22           22
   175          +32           32
   194          +51           51
$1,716                      $466

\[ \bar{X} = \frac{\Sigma X}{n} = \frac{\$1{,}716}{12} = \$143.00 \]

\[ MD = \frac{\Sigma |X - \bar{X}|}{n} = \frac{\$466}{12} = \$38.83 \]
The mean deviation of $38.83 indicates that the typical electric bill deviates $38.83 from the
mean of $143.00.
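The steps described above (find the mean, take absolute deviations, average them) can be sketched in a few lines of Python. The variable names are illustrative; the data are the twelve payments from the table.

```python
# Twelve payments from the table in Solution 4.
payments = [191, 212, 176, 129, 106, 92, 108, 109, 103, 121, 175, 194]

n = len(payments)
mean_payment = sum(payments) / n                     # formula [3-2]
abs_devs = [abs(x - mean_payment) for x in payments]  # signs are disregarded
mean_deviation = sum(abs_devs) / n                    # formula [3-7]

print(f"mean = ${mean_payment:.2f}")
print(f"sum of absolute deviations = ${sum(abs_devs):.0f}")
print(f"mean deviation = ${mean_deviation:.2f}")
```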
Solution 5
a. The arithmetic mean of this sample data, grouped into a frequency distribution, is computed
by formula [3-17].
\[ \bar{X} = \frac{\Sigma fM}{n} \qquad [3\text{-}17] \]

Where:
\(\bar{X}\) is the designation for the arithmetic mean.
M is the mid-value, or midpoint, of each class.
f is the frequency in each class.
fM is the frequency in each class times the midpoint of the class.
\(\Sigma fM\) is the sum of these products.
n is the total number of frequencies.
It is assumed that the observations in each class are represented by the midpoint of the class. The midpoint of the first class is $9.00, found by ($8.00 + $10.00)/2. For the next higher class, the midpoint is $11.00.

Wage Rate          Frequency f    Class Midpoint M    fM
$8 up to $10            3             $9.00           $27.00
$10 up to $12           6             11.00            66.00
$12 up to $14          12             13.00           156.00
$14 up to $16          10             15.00           150.00
$16 up to $18           7             17.00           119.00
$18 up to $20           2             19.00            38.00
Total                  40                            $556.00

Using formula [3-17] the arithmetic mean hourly wage is $13.90, found by

\[ \bar{X} = \frac{\Sigma fM}{n} = \frac{\$556.00}{40} = \$13.90 \]
b. The mode is the value that occurs most often. For data grouped into a frequency distribution,
the mode is the midpoint of the class containing the most observations. There are more
observations (12) in the $12 up to $14 class than in any other class. The midpoint of the class
is $13, which is the mode.
We computed two measures of location for the hourly wage data. Observe that the mean ($13.90)
and the mode ($13.00) are different. Generally, this is the case. We will discuss what measure of
location to select to represent the data.
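The grouped mean and mode from Solution 5 can be reproduced from the class limits and frequencies. This is a sketch under the same assumption the solution makes, namely that each class is represented by its midpoint; the tuple layout `(lower, upper, frequency)` is an illustrative choice.

```python
# (lower limit, upper limit, frequency) for each wage class in Solution 5.
classes = [(8, 10, 3), (10, 12, 6), (12, 14, 12),
           (14, 16, 10), (16, 18, 7), (18, 20, 2)]

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]
n = sum(freqs)

# Formula [3-17]: sum of (frequency x midpoint), divided by n.
sum_fm = sum(f * m for f, m in zip(freqs, midpoints))
grouped_mean = sum_fm / n

# Mode for grouped data: midpoint of the class with the most observations.
modal_lo, modal_hi, _ = max(classes, key=lambda c: c[2])
grouped_mode = (modal_lo + modal_hi) / 2

print(f"n = {n}, sum fM = ${sum_fm:.2f}")
print(f"mean = ${grouped_mean:.2f}, mode = ${grouped_mode:.2f}")
```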
Solution 6
The number of faculty for each rank is not equal. Therefore, it is not appropriate simply to add the
average salaries of the four ranks and divide by 4. We have a better method for weighting the
averages. In this problem the salaries for each rank are multiplied by the number of faculty in that
rank, the products totaled, then divided by the number of faculty. The result is the weighted mean.
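\[
\begin{aligned}
\bar{X}_w &= \frac{w_1 X_1 + w_2 X_2 + w_3 X_3 + w_4 X_4}{w_1 + w_2 + w_3 + w_4} \\
          &= \frac{10(\$34{,}000) + 12(\$45{,}000) + 20(\$58{,}000) + 5(\$68{,}000)}{10 + 12 + 20 + 5} \\
          &= \frac{\$2{,}380{,}000}{47} \\
          &= \$50{,}638
\end{aligned}
\]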
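To see why the weighting matters, the sketch below computes the weighted mean from Solution 6 and, for contrast, the simple average of the four rank salaries. The helper `weighted_mean` is a hypothetical name introduced for illustration.

```python
def weighted_mean(values, weights):
    """Sum of weight x value, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Average salary for each rank and the number of faculty in that rank.
salaries = [34_000, 45_000, 58_000, 68_000]
faculty_counts = [10, 12, 20, 5]

weighted = weighted_mean(salaries, faculty_counts)
unweighted = sum(salaries) / len(salaries)  # ignores the unequal group sizes

print(f"weighted mean   = ${weighted:,.0f}")
print(f"unweighted mean = ${unweighted:,.0f}")
```

The two results differ because the ranks do not contain equal numbers of faculty.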
Solution 7
The sample variance, designated \(s^2\), is based on squared deviations from the mean. For ungrouped raw data, it is computed using formula [3-10] or [3-11].

Formula [3-10]
\[ s^2 = \frac{\Sigma (X - \bar{X})^2}{n - 1} \]

Formula [3-11]
\[ s^2 = \frac{\Sigma X^2 - \dfrac{(\Sigma X)^2}{n}}{n - 1} \]
Computing the sample variance both ways:
     X        X − X̄     (X − X̄)²         X²
  $191          48        2,304        36,481
   212          69        4,761        44,944
   176          33        1,089        30,976
   129         −14          196        16,641
   106         −37        1,369        11,236
    92         −51        2,601         8,464
   108         −35        1,225        11,664
   109         −34        1,156        11,881
   103         −40        1,600        10,609
   121         −22          484        14,641
   175          32        1,024        30,625
   194          51        2,601        37,636
$1,716           0       20,410       265,798

\[ s^2 = \frac{\Sigma (X - \bar{X})^2}{n - 1} = \frac{20{,}410}{12 - 1} = 1{,}855.45 \]

or

\[ s^2 = \frac{\Sigma X^2 - \dfrac{(\Sigma X)^2}{n}}{n - 1} = \frac{265{,}798 - \dfrac{(1{,}716)^2}{12}}{12 - 1} = 1{,}855.45 \]
The standard deviation of the sample, designated by s, is the square root of the variance. The
square root of 1,855.45 is $43.07. Note that the standard deviation is in the same terms as the
original data, that is, dollars.
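Both variance formulas can be checked against each other in code. The following is a minimal sketch with illustrative variable names; it uses the same twelve payments as the table.

```python
from math import sqrt

# Twelve payments from the table in Solution 7.
payments = [191, 212, 176, 129, 106, 92, 108, 109, 103, 121, 175, 194]
n = len(payments)
mean_payment = sum(payments) / n

# Formula [3-10]: squared deviations from the mean.
var_deviation = sum((x - mean_payment) ** 2 for x in payments) / (n - 1)

# Formula [3-11]: computational form using the sum of X and the sum of X squared.
sum_x = sum(payments)
sum_x2 = sum(x * x for x in payments)
var_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)

std_dev = sqrt(var_deviation)

print(f"variance [3-10] = {var_deviation:,.2f}")
print(f"variance [3-11] = {var_shortcut:,.2f}")
print(f"standard deviation = ${std_dev:.2f}")
```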
Solution 8
For data grouped into a frequency distribution, the range is the difference between the upper class limit of the highest class and the lower class limit of the lowest class.
Range = Upper Class Limit of Highest Class − Lower Class Limit of Lowest Class
Range = 50 − 20 = 30 months
Solution 9
Formula [3-18] is used to compute the standard deviation of grouped data.
\[ s = \sqrt{\frac{\Sigma fX^2 - \dfrac{(\Sigma fX)^2}{n}}{n - 1}} \qquad [3\text{-}18] \]
Where:
s is the symbol for the sample standard deviation.
X is the midpoint of a class.
f is the class frequency.
n is the total number of sample observations.
Applying this formula to the distribution of the ages of the personal computers in Problem 8, the
standard deviation is 6.39 months.
Age to the                     Class
Nearest Month        f      Midpoint X       fX           fX²
20 up to 25          3         22.5          67.5       1,518.75
25 up to 30          5         27.5         137.5       3,781.25
30 up to 35         10         32.5         325.0      10,562.50
35 up to 40          7         37.5         262.5       9,843.75
40 up to 45          4         42.5         170.0       7,225.00
45 up to 50          1         47.5          47.5       2,256.25
Total               30                    1,010.0      35,187.50

\[ s = \sqrt{\frac{\Sigma fX^2 - \dfrac{(\Sigma fX)^2}{n}}{n - 1}} = \sqrt{\frac{35{,}187.50 - \dfrac{(1{,}010)^2}{30}}{30 - 1}} = 6.39 \text{ months} \]

The variance is the square of the standard deviation.
\[ s^2 = (6.39)^2 = 40.83 \]
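The grouped standard deviation can be verified directly from the class limits and frequencies. This sketch follows formula [3-18] and assumes, as the solution does, that each class is represented by its midpoint; the names are illustrative.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for the computer ages, in months.
classes = [(20, 25, 3), (25, 30, 5), (30, 35, 10),
           (35, 40, 7), (40, 45, 4), (45, 50, 1)]

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))

# Formula [3-18]
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = sqrt(variance)

print(f"n = {n}, sum fX = {sum_fx}, sum fX^2 = {sum_fx2}")
print(f"s = {std_dev:.2f} months, s^2 = {variance:.2f}")
```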
Solution 10
To find the proportion of faculty who earn between $46,000 and $58,000 we must first determine
k; k is the number of standard deviations above or below the mean.
\[ k = \frac{X - \bar{X}}{s} = \frac{\$46{,}000 - \$52{,}000}{\$3{,}000} = -2.00 \]

\[ k = \frac{X - \bar{X}}{s} = \frac{\$58{,}000 - \$52{,}000}{\$3{,}000} = 2.00 \]

Applying Chebyshev's Theorem:

\[ 1 - \frac{1}{k^2} = 1 - \frac{1}{2^2} = 0.75 \]
This means that at least 75 percent of the faculty earn between $46,000 and $58,000.
The Empirical Rule states that about 68 percent of the observations fall within one standard deviation of the mean, about 95 percent are within plus and minus two standard deviations of the mean, and virtually all (99.7 percent) lie within three standard deviations of the mean. Hence, about 95 percent of the observations fall between $46,000 and $58,000, found by \(\bar{X} \pm 2s = \$52{,}000 \pm 2(\$3{,}000)\). So if we conclude that we have a bell-shaped distribution, most of the observations fall within the interval.
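The Chebyshev calculation in Solution 10 can be sketched as two small functions. The function names are hypothetical; the mean of $52,000 and standard deviation of $3,000 come from the worked solution.

```python
def standard_distance(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is only informative for k > 1")
    return 1 - 1 / k ** 2


k_low = standard_distance(46_000, mean=52_000, std_dev=3_000)
k_high = standard_distance(58_000, mean=52_000, std_dev=3_000)
bound = chebyshev_lower_bound(abs(k_high))

print(f"k = {k_low:.2f} and {k_high:.2f}")
print(f"at least {bound:.0%} of salaries lie between $46,000 and $58,000")
```

Chebyshev's bound holds for any distribution, which is why it is lower than the 95 percent figure the Empirical Rule gives for a bell-shaped distribution.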
Solution 11
The coefficient of variation measures the relative dispersion in a distribution. In this problem it
allows for a comparison of two distributions expressed in different units (dollars and years).
Formula [3-13] is used.
\[ CV = \frac{s}{\bar{X}}(100) \qquad [3\text{-}13] \]

For the salaries:

\[ CV = \frac{\$3{,}000}{\$52{,}000}(100) = 5.8\% \]

For the length of service:

\[ CV = \frac{4 \text{ years}}{15 \text{ years}}(100) = 26.7\% \]
The coefficient of variation is larger for length of service than for salary. This indicates that, relative to the mean, there is more dispersion in the distribution of lengths of service than in the distribution of salaries.
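Because the coefficient of variation is unit-free, the two results can be compared directly. A minimal sketch of formula [3-13], with a hypothetical function name:

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean, formula [3-13]."""
    return std_dev / mean * 100


cv_salary = coefficient_of_variation(std_dev=3_000, mean=52_000)  # dollars
cv_service = coefficient_of_variation(std_dev=4, mean=15)         # years

print(f"CV for salaries:          {cv_salary:.1f}%")
print(f"CV for length of service: {cv_service:.1f}%")
```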
Solution 12
The first step is to organize the data into an ordered array from smallest to largest:
13   16   17   20   25   26   27   50   56   65   68   80   86   90   92

To locate the first quartile, let P = 25 and

\[ L_p = (n + 1)\frac{P}{100} = (15 + 1)\frac{25}{100} = 4 \]

Then locate the 4th observation in the array, which is 20. Thus Q1 = 20 or $20,000.

To locate the third quartile, let P = 75 and

\[ L_p = (n + 1)\frac{P}{100} = (15 + 1)\frac{75}{100} = 12 \]

Then locate the 12th observation in the array, which is 80. Thus Q3 = 80 or $80,000.

To locate the median, let P = 50 and

\[ L_p = (n + 1)\frac{P}{100} = (15 + 1)\frac{50}{100} = 8 \]

Then locate the 8th observation in the array, which is 50. Thus Q2 = the median = 50 or $50,000.
In the above example with 15 observations the location formula yielded a whole number result. Suppose we were to add one more observation (95) to the data list.

13   16   17   20   25   26   27   50   56   65   68   80   86   90   92   95

What is the third quartile now?
To locate the third quartile, let P = 75 and

\[ L_p = (n + 1)\frac{P}{100} = (16 + 1)\frac{75}{100} = 12.75 \]

Then locate the 12th and 13th observations in the array, which are 80 and 86. The value of the third quartile is 0.75 of the distance between the 12th and 13th values. We must calculate 0.75(86 − 80) = 4.5. Thus Q3 = 80 + 4.5 = 84.5 or $84,500.
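The location formula and the interpolation step can be combined in one function. This is a sketch, not a library routine: `percentile_value` is a hypothetical name, and clamping to the first or last observation when the location falls outside the data is an assumption added here, since the solution does not cover that case.

```python
def percentile_value(sorted_data, p):
    """Value at percentile p, located with L_p = (n + 1) * p / 100."""
    n = len(sorted_data)
    location = (n + 1) * p / 100   # 1-based position in the ordered array
    lower = int(location)          # whole-number part of the position
    fraction = location - lower    # fractional distance to the next value

    # Assumption: clamp when the location falls outside the data.
    if lower < 1:
        return sorted_data[0]
    if lower >= n:
        return sorted_data[-1]

    below = sorted_data[lower - 1]
    above = sorted_data[lower]
    return below + fraction * (above - below)


# Ordered values from Solution 12, in thousands of dollars.
values = [13, 16, 17, 20, 25, 26, 27, 50, 56, 65, 68, 80, 86, 90, 92]

q1, q2, q3 = (percentile_value(values, p) for p in (25, 50, 75))
q3_extended = percentile_value(values + [95], 75)  # 16 observations

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}")
print(f"Q3 with 95 added = {q3_extended}")
```

Other software may use a different location rule, so quartiles from a spreadsheet or statistics package can differ slightly from these.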
Solution 13
The first step is to identify the five essential pieces of data:
Minimum value = 13, Q1 = 20, Q2 = 50, Q3 = 80, Maximum value = 92
The next step in drawing a box plot is to create an appropriate scale along the horizontal axis. Next, we draw a box that starts at Q1 = 20 and ends at Q3 = 80. Inside the box we place a vertical line to represent the median, 50. We then extend horizontal lines from the box to the minimum (13) and the maximum (92).
[Box plot: horizontal axis from 0 to 100 in steps of 10, with Min (13), Q1 (20), Median (50), Q3 (80), and Max (92) marked.]
The box plot shows that the middle 50 percent of the homes sold for between $20,000 and $80,000. Also, the distribution is somewhat positively skewed, since the line from Q3 (80) to the maximum (92) is longer than the line from Q1 (20) to the minimum (13). In other words, the 25% of the data larger than the third quartile is more spread out than the 25% of the data less than the first quartile.