Chapter Solutions

Solution 1

a. Determine the mean price of this raw data by summing the prices for the six jars and dividing the total by six. Recall that the formula for the mean of a sample was given previously; see formula [3-2].

\[ \bar{X} = \frac{\sum X}{n} = \frac{\$7.98}{6} = \$1.33 \]

b. As noted above, the median is defined as the middle value of a set of data after the data are arranged from smallest to largest. The prices for the six jars of blackberry preserves have been ordered from a low of $1.26 up to $1.42. Because there is an even number of prices, the median price is halfway between the third and the fourth price. The median is $1.32.

Prices arranged from low to high:

$1.26   $1.31   $1.31   $1.33   $1.35   $1.42

\[ \text{Median} = \frac{\$1.31 + \$1.33}{2} = \$1.32 \]

Suppose there is an odd number of blackberry preserve prices, such as those shown below.

$1.31   $1.31   $1.33   $1.35   $1.42

The median is the middle value ($1.33). To find the median, the values must first be ordered from low to high.

c. The mode is the price that occurs most often. The price of $1.31 occurs twice in the original data and is the mode.

Solution 2

The geometric mean (GM) annual percent increase from one time period to another is determined using formula [3-5].

\[ GM = \sqrt[n]{\frac{\text{Value at the end of the period}}{\text{Value at the start of the period}}} - 1 \qquad [3\text{-}5] \]

Note that there are 14 years between 1988 and 2002, so n = 14.

\[ GM = \sqrt[14]{\frac{300}{5}} - 1 = \sqrt[14]{60.0} - 1 = 1.33971 - 1 = 0.33971 \]

For those with a \( \sqrt[y]{x} \) key on their calculator, the geometric mean can be solved quickly by:

Using                               Display
300 ÷ 5 =                           60
Depress the \( \sqrt[y]{x} \) key
Depress 14 =                        1.33971
Depress - 1 =                       0.33971, or about 34%

The value 1 is subtracted, according to formula [3-5], so the rate of increase is 0.33971, or 33.971% per year. The sale of hospital beds increased at a rate of almost 34% per year.

Solution 3

Recall that the range is the difference between the largest value and the smallest value.
\[ \text{Range} = \text{Highest value} - \text{Lowest value} = \$212 - \$92 = \$120 \]

This indicates that there is a difference of $120 between the largest and the smallest heating cost.

Solution 4

The mean deviation is the mean of the absolute deviations from the arithmetic mean. For raw, or ungrouped, data it is computed by first determining the mean. Next, the difference between each value and the arithmetic mean is determined. Finally, these differences are totaled and the total is divided by the number of observations. We ignore the sign of each difference. Formula [3-2] for the sample mean and formula [3-7] for the mean deviation are shown below.

Sample mean:

\[ \bar{X} = \frac{\sum X}{n} \qquad [3\text{-}2] \]

Mean deviation:

\[ MD = \frac{\sum |X - \bar{X}|}{n} \qquad [3\text{-}7] \]

The table below shows the data values, each data value minus the mean, and the absolute value of the deviations from the mean. In other words, the signs of the deviations from the mean are disregarded.

Payment X     Deviation from mean     Absolute deviation
$191          +48                     $48
 212          +69                      69
 176          +33                      33
 129          -14                      14
 106          -37                      37
  92          -51                      51
 108          -35                      35
 109          -34                      34
 103          -40                      40
 121          -22                      22
 175          +32                      32
 194          +51                      51
$1,716                                $466

\[ \bar{X} = \frac{\sum X}{n} = \frac{\$1,716}{12} = \$143.00 \]

\[ MD = \frac{\sum |X - \bar{X}|}{n} = \frac{\$466}{12} = \$38.83 \]

The mean deviation of $38.83 indicates that the typical electric bill deviates $38.83 from the mean of $143.00.

Solution 5

a. The arithmetic mean of this sample data, grouped into a frequency distribution, is computed by formula [3-17].

\[ \bar{X} = \frac{\sum fM}{n} \qquad [3\text{-}17] \]

Where:
\( \bar{X} \) is the designation for the arithmetic mean.
\( M \) is the mid-value, or midpoint, of each class.
\( f \) is the frequency in each class.
\( fM \) is the frequency in each class times the midpoint of the class.
\( \sum fM \) is the sum of these products.
\( n \) is the total number of frequencies.

It is assumed that the observations in each class are represented by the midpoint of the class. The midpoint of the first class is $9.00, found by ($8.00 + $10.00)/2. For the next higher class, the midpoint is $11.00.
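As a rough sketch of the midpoint step and formula [3-17] described above, the short Python snippet below computes the class midpoints and the grouped mean. The class limits and frequencies are those of the wage-rate distribution in this problem; the function name grouped_mean is hypothetical and chosen only for illustration.

```python
# Class limits (lower, upper) and frequencies of the wage-rate distribution
# in Solution 5. The name `grouped_mean` is a hypothetical helper.
classes = [(8, 10), (10, 12), (12, 14), (14, 16), (16, 18), (18, 20)]
frequencies = [3, 6, 12, 10, 7, 2]


def grouped_mean(classes, frequencies):
    """Return (midpoints, mean) for grouped data, following formula [3-17]."""
    # Each class is represented by its midpoint M = (lower + upper) / 2.
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    n = sum(frequencies)  # total number of observations
    sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))
    return midpoints, sum_fm / n


midpoints, mean_wage = grouped_mean(classes, frequencies)
print(midpoints)
print(round(mean_wage, 2))
```

The same helper works for any frequency distribution, provided each class is adequately represented by its midpoint.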
Using formula [3-17], the arithmetic mean hourly wage is $13.90, found by:

Wage Rate          Frequency f     Class Midpoint M     fM
$8 up to $10        3              $9.00                $27.00
$10 up to $12       6              11.00                 66.00
$12 up to $14      12              13.00                156.00
$14 up to $16      10              15.00                150.00
$16 up to $18       7              17.00                119.00
$18 up to $20       2              19.00                 38.00
Total              40                                  $556.00

\[ \bar{X} = \frac{\sum fM}{n} = \frac{\$556.00}{40} = \$13.90 \]

b. The mode is the value that occurs most often. For data grouped into a frequency distribution, the mode is the midpoint of the class containing the most observations. There are more observations (12) in the $12 up to $14 class than in any other class. The midpoint of the class is $13, which is the mode.

We computed two measures of location for the hourly wage data. Observe that the mean ($13.90) and the mode ($13.00) are different. Generally, this is the case. We will discuss what measure of location to select to represent the data.

Solution 6

The number of faculty for each rank is not equal. Therefore, it is not appropriate simply to add the average salaries of the four ranks and divide by 4. We have a better method for weighting the averages. In this problem the salary for each rank is multiplied by the number of faculty in that rank, the products are totaled, and the total is divided by the number of faculty. The result is the weighted mean.

\[ \bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + w_3 X_3 + w_4 X_4}{w_1 + w_2 + w_3 + w_4} = \frac{10(\$34,000) + 12(\$45,000) + 20(\$58,000) + 5(\$68,000)}{10 + 12 + 20 + 5} = \frac{\$2,380,000}{47} = \$50,638 \]

Solution 7

The sample variance, designated \( s^2 \), is based on squared deviations from the mean. For ungrouped raw data, it is computed using formula [3-10] or [3-11].
Formula [3-10]:

\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]

Formula [3-11]:

\[ s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} \]

Computing the sample variance both ways:

X          Deviation from mean     Squared deviation     X squared
$191        48                      2,304                 36,481
 212        69                      4,761                 44,944
 176        33                      1,089                 30,976
 129       -14                        196                 16,641
 106       -37                      1,369                 11,236
  92       -51                      2,601                  8,464
 108       -35                      1,225                 11,664
 109       -34                      1,156                 11,881
 103       -40                      1,600                 10,609
 121       -22                        484                 14,641
 175        32                      1,024                 30,625
 194        51                      2,601                 37,636
$1,716       0                     20,410                265,798

\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{20,410}{12 - 1} = 1,855.45 \]

or

\[ s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} = \frac{265,798 - \frac{(1,716)^2}{12}}{12 - 1} = 1,855.45 \]

The standard deviation of the sample, designated by \( s \), is the square root of the variance. The square root of 1,855.45 is $43.07. Note that the standard deviation is in the same terms as the original data, that is, dollars.

Solution 8

For grouped data, the range is the difference between the upper class limit of the highest class and the lower class limit of the lowest class.

Range = Upper class limit of the highest class - Lower class limit of the lowest class
Range = 50 - 20 = 30 months

Solution 9

Formula [3-18] is used to compute the standard deviation of grouped data.

\[ s = \sqrt{\frac{\sum fX^2 - \frac{(\sum fX)^2}{n}}{n - 1}} \qquad [3\text{-}18] \]

Where:
\( s \) is the symbol for the sample standard deviation.
\( X \) is the midpoint of a class.
\( f \) is the class frequency.
\( n \) is the total number of sample observations.

Applying this formula to the distribution of the ages of the personal computers in Problem 8, the standard deviation is 6.39 months.

Age to the Nearest Month     f      Class Midpoint X     fX          fX squared
20 up to 25                   3     22.5                   67.5       1,518.75
25 up to 30                   5     27.5                  137.5       3,781.25
30 up to 35                  10     32.5                  325.0      10,562.50
35 up to 40                   7     37.5                  262.5       9,843.75
40 up to 45                   4     42.5                  170.0       7,225.00
45 up to 50                   1     47.5                   47.5       2,256.25
Total                        30                         1,010.0      35,187.50

\[ s = \sqrt{\frac{\sum fX^2 - \frac{(\sum fX)^2}{n}}{n - 1}} = \sqrt{\frac{35,187.50 - \frac{(1,010)^2}{30}}{30 - 1}} = 6.39 \text{ months} \]

The variance is the square of the standard deviation.

\[ s^2 = (6.39)^2 = 40.83 \]

Solution 10

To find the proportion of faculty who earn between $46,000 and $58,000, we must first determine k, the number of standard deviations above or below the mean.
\[ k = \frac{X - \bar{X}}{s} = \frac{\$46,000 - \$52,000}{\$3,000} = -2.00 \]

\[ k = \frac{X - \bar{X}}{s} = \frac{\$58,000 - \$52,000}{\$3,000} = 2.00 \]

Applying Chebyshev's Theorem:

\[ 1 - \frac{1}{k^2} = 1 - \frac{1}{2^2} = 0.75 \]

This means that at least 75 percent of the faculty earn between $46,000 and $58,000.

The Empirical Rule states that about 68 percent of the observations fall within one standard deviation of the mean, 95 percent are within plus and minus two standard deviations of the mean, and virtually all (99.7%) will lie within three standard deviations of the mean. Hence, about 95 percent of the observations fall between $46,000 and $58,000, found by \( \bar{X} \pm 2s = \$52,000 \pm 2(\$3,000) \). So if we conclude that we have a bell-shaped distribution, most of the observations fall within the interval.

Solution 11

The coefficient of variation measures the relative dispersion in a distribution. In this problem it allows for a comparison of two distributions expressed in different units (dollars and years). Formula [3-13] is used.

\[ CV = \frac{s}{\bar{X}}(100) \qquad [3\text{-}13] \]

For the salaries:

\[ CV = \frac{\$3,000}{\$52,000}(100) = 5.8\% \]

For the length of service:

\[ CV = \frac{4 \text{ years}}{15 \text{ years}}(100) = 26.7\% \]

The coefficient of variation is larger for length of service than for salary. This indicates that, relative to the mean, there is more dispersion in the distribution of the lengths of service than in the distribution of salaries.

Solution 12

The first step is to organize the data into an ordered array from smallest to largest:

13  16  17  20  25  26  27  50  56  65  68  80  86  90  92

To locate the first quartile, let P = 25 and

\[ L_p = (n + 1)\frac{P}{100} = (15 + 1)\frac{25}{100} = 4 \]

Then locate the 4th observation in the array, which is 20. Thus Q1 = 20, or $20,000.

To locate the third quartile, let P = 75 and

\[ L_p = (n + 1)\frac{P}{100} = (15 + 1)\frac{75}{100} = 12 \]

Then locate the 12th observation in the array, which is 80. Thus Q3 = 80, or $80,000.

To locate the median, let P = 50 and

\[ L_p = (n + 1)\frac{P}{100} = (15 + 1)\frac{50}{100} = 8 \]

Then locate the 8th observation in the array, which is 50. Thus Q2 = the median = 50, or $50,000.
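The location formula \( L_p = (n + 1)P/100 \) used above can be sketched in Python as follows. The function name percentile_value is hypothetical. When the location is not a whole number, this sketch moves the fractional part of the distance between the two neighbouring observations; clamping to the first or last observation when the location falls outside the array is an added assumption.

```python
def percentile_value(sorted_values, p):
    """Value at percentile p using the location L_p = (n + 1) * p / 100.

    `sorted_values` must already be ordered from smallest to largest.
    Locations are 1-based, matching the worked solution.
    """
    n = len(sorted_values)
    location = (n + 1) * p / 100
    whole = int(location)          # position of the lower neighbour
    fraction = location - whole    # share of the gap to the next value

    # Assumption: clamp to the ends when the location falls outside the array.
    if whole < 1:
        return float(sorted_values[0])
    if whole >= n:
        return float(sorted_values[-1])

    lower = sorted_values[whole - 1]
    upper = sorted_values[whole]
    return lower + fraction * (upper - lower)


# Ordered array from Solution 12 (values in $000).
prices = [13, 16, 17, 20, 25, 26, 27, 50, 56, 65, 68, 80, 86, 90, 92]

q1 = percentile_value(prices, 25)
median = percentile_value(prices, 50)
q3 = percentile_value(prices, 75)
print(q1, median, q3)  # 20.0 50.0 80.0
```

Statistical packages use several different quartile conventions, so their results can differ slightly from the (n + 1) rule used in this solution.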
In the above example with 15 observations, the location formula yielded a whole number result. Suppose we were to add one more observation (95) to the data list.

13  16  17  20  25  26  27  50  56  65  68  80  86  90  92  95

What is the third quartile now? To locate the third quartile, let P = 75 and

\[ L_p = (n + 1)\frac{P}{100} = (16 + 1)\frac{75}{100} = 12.75 \]

Then locate the 12th and 13th observations in the array, which are 80 and 86. The value of the third quartile is 0.75 of the distance between the 12th and 13th values. We must calculate 0.75(86 - 80) = 4.5. Thus Q3 = 80 + 4.5 = 84.5, or $84,500.

Solution 13

The first step is to identify the five essential pieces of data:

Minimum value = 13, Q1 = 20, Q2 = 50, Q3 = 80, Maximum value = 92

The next step in drawing a box plot is to create an appropriate scale along the horizontal axis. Next, we draw a box that starts at Q1 = 20 and ends at Q3 = 80. Inside the box we place a vertical line to represent the median, 50. We then extend horizontal lines from the box to the minimum (13) and the maximum (92).

[Box plot: Min, Q1, Med, Q3, and Max marked above a horizontal axis scaled from 0 to 100 in steps of 10]

The box plot shows that the middle 50 percent of the homes sold for between $20,000 and $80,000. Also, the distribution is somewhat positively skewed, since the line from Q3 (80) to the maximum (92) is longer than the line from Q1 (20) to the minimum (13). In other words, the 25% of the data larger than the third quartile is spread out more than the 25% of the data less than the first quartile.
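The box plot construction in Solution 13 can be sketched in Python as follows: the snippet compares the two whisker lengths and draws a rough one-line text version of the plot. The function name text_box_plot, the characters used, and the choice of one character per two units of the scale are all assumptions made for illustration.

```python
# Five-number summary from Solution 13 (values in $000).
minimum, q1, median, q3, maximum = 13, 20, 50, 80, 92

lower_whisker = q1 - minimum   # spread of the lowest 25% of the data
upper_whisker = maximum - q3   # spread of the highest 25% of the data
print(lower_whisker, upper_whisker)  # 7 12


def text_box_plot(minimum, q1, median, q3, maximum, scale_max=100, step=2):
    """Draw a one-line box plot; each character covers `step` units.

    Hypothetical helper: '-' marks the whiskers, '=' the box,
    '|' the minimum and maximum, and 'M' the median.
    """
    def pos(value):
        return int(value // step)

    cells = [" "] * (scale_max // step + 1)
    for i in range(pos(minimum), pos(maximum) + 1):
        cells[i] = "-"
    for i in range(pos(q1), pos(q3) + 1):
        cells[i] = "="
    cells[pos(minimum)] = "|"
    cells[pos(maximum)] = "|"
    cells[pos(median)] = "M"
    return "".join(cells)


plot = text_box_plot(minimum, q1, median, q3, maximum)
print(plot)
```

The upper whisker (12) is longer than the lower whisker (7), which is the comparison the solution uses to describe the distribution as somewhat positively skewed.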