Chapter 9 Statistics
Section 9.2 Measures of Variation

Who Has Better Scores?
Adam and Bonnie are comparing their quiz scores to determine who is the “best”. Help them decide by calculating the mean, median, and mode for each.

Adam’s Scores:    85   60   105   85   72   100
Bonnie’s Scores:  81   85   86    85   90   80

The Winner?

          Adam    Bonnie
Mean      84.5    84.5
Median    85      85
Mode      85      85

So, who has the better quiz scores?

Another Way to Compare
Sometimes the measures of central tendency (mean, median, mode) aren’t enough to adequately describe the data. We also need to take into account the consistency, or spread, of the data.

Range
The range of the data is the difference between the largest and smallest number in a sample. Find the range of Adam and Bonnie’s scores.

Adam: 105 - 60 = 45
Bonnie: 90 - 80 = 10

Based on the range, Bonnie’s scores are more consistent and, some might argue, therefore better than Adam’s.

Another Measure of Dispersion
The most useful measure of variation (spread) is the standard deviation. First, we will look at the deviations from the mean.

Deviations from the Mean
The deviation from the mean is the difference between a single data point and the calculated mean of the data, $X - \bar{X}$.

Data point close to mean: small deviation
Data point far from mean: large deviation

The sum of the deviations from the mean is always zero, so the mean of the deviations is always zero.

Deviations from Mean for Adam and Bonnie

Adam                                  Bonnie
Data Point   Deviation from Mean      Data Point   Deviation from Mean
85           85 - 84.5 = 0.5          81           81 - 84.5 = -3.5
60           60 - 84.5 = -24.5        85           85 - 84.5 = 0.5
105          105 - 84.5 = 20.5        86           86 - 84.5 = 1.5
85           85 - 84.5 = 0.5          85           85 - 84.5 = 0.5
72           72 - 84.5 = -12.5        90           90 - 84.5 = 5.5
100          100 - 84.5 = 15.5        80           80 - 84.5 = -4.5
             Sum = 0                               Sum = 0

Variance
Because the mean of the deviations is always zero, it tells us nothing about spread, so we must modify our approach using the variance. The variance is the mean of the squares of the deviations; for a sample of n data points, the sum of the squared deviations is divided by n - 1.
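As a quick check of the numbers so far, the following minimal Python sketch recomputes the mean, median, mode, range, and deviations for both sets of scores, and then squares the deviations to get a variance. The names `adam`, `bonnie`, and `summarize` are illustrative only and are not part of the lesson. The variance line assumes the sample convention of dividing by n - 1, which is what Python's `statistics.variance` does.

```python
from statistics import mean, median, mode, variance

# Quiz scores as listed in the lesson
adam = [85, 60, 105, 85, 72, 100]
bonnie = [81, 85, 86, 85, 90, 80]

def summarize(scores):
    """Return the summary measures discussed above for one list of scores."""
    m = mean(scores)
    deviations = [x - m for x in scores]  # each data point minus the mean
    return {
        "mean": m,
        "median": median(scores),
        "mode": mode(scores),
        "range": max(scores) - min(scores),
        "deviations": deviations,
        "sum_of_deviations": sum(deviations),
        # statistics.variance divides the squared deviations by n - 1
        "sample_variance": variance(scores),
    }

adam_summary = summarize(adam)
bonnie_summary = summarize(bonnie)

for name, summary in (("Adam", adam_summary), ("Bonnie", bonnie_summary)):
    print(name)
    for key, value in summary.items():
        print(f"  {key}: {value}")
```

Both students come out with the same mean, median, and mode, and both sets of deviations sum to zero, so only the range and the squared deviations separate them.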
Variance for Adam and Bonnie’s Scores
Using the deviations from the mean we have already calculated for Adam and Bonnie, we will find the variance for each.

Adam:
\[ s^2 = \frac{(0.5)^2 + (-24.5)^2 + (20.5)^2 + (0.5)^2 + (-12.5)^2 + (15.5)^2}{6 - 1} = \frac{1417.5}{5} = 283.5 \]

Bonnie:
\[ s^2 = \frac{(-3.5)^2 + (0.5)^2 + (1.5)^2 + (0.5)^2 + (5.5)^2 + (-4.5)^2}{6 - 1} = \frac{65.5}{5} = 13.1 \]

Standard Deviation
To find the variance, we squared the deviations from the mean, so the variance is in squared units. To return to the same units as the data, we use the square root of the variance, the standard deviation.

\[ \text{Standard Deviation} = \sqrt{\text{Variance}}, \qquad s = \sqrt{s^2} \]

Adam and Bonnie’s Standard Deviation

Adam: $s = \sqrt{s^2} = \sqrt{283.5} \approx 16.84$
Bonnie: $s = \sqrt{s^2} = \sqrt{13.1} \approx 3.62$

Based on the standard deviation, Bonnie’s scores are better because there is less dispersion. In other words, she is more consistent than Adam.

Formulas for Variance and Standard Deviation
For a sample of n data points, as used in the calculations above:

\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}, \qquad s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \]

Example 1
The number of homicide victims in Vermont from 1992 through 2001 is given in the table below. (Source: http://170.222.24.9/cjs/crime_01/homicide_01.html) Find the mean, median, mode, and standard deviation of the data.

Year    Homicide Victims
1992    21
1993    15
1994    5
1995    13
1996    11
1997    9
1998    12
1999    17
2000    12
2001    11

Sample vs. Population
The mean, variance, and standard deviation of a random sample are referred to as the sample mean ($\bar{X}$), sample variance (s²), and sample standard deviation (s). These sample statistics can only give us an approximation to the population mean (µ), the population variance (σ²), and the population standard deviation (σ). The main difference lies in the denominator of the formulas for variance and standard deviation: the sample formulas divide by n - 1, while the population formulas divide by the number of data points. When the sample size, n, is large, the sample standard deviation gives a good estimate of the population standard deviation.

Grouped Distributions

Example 2
Mr. Smith recently gave a math test and organized his scores into the table below. Help Mr. Smith determine the class average, the median score, and the standard deviation.
Score       Frequency
40 - 49     2
50 - 59     3
60 - 69     6
70 - 79     12
80 - 89     7
90 - 100    5

Chebyshev’s Theorem
Chebyshev’s Theorem states that for any set of numbers, the fraction (or probability) that will lie within k standard deviations of the mean (for k > 1) is at least

\[ 1 - \frac{1}{k^2} \]

Example 3
Use Chebyshev’s Theorem to find the fraction of all the numbers of a data set that must lie within 4 standard deviations from the mean.

Example 4
In a certain distribution of numbers, the mean is 50 with a standard deviation of 6. Use Chebyshev’s Theorem to tell the probability that a number lies in each interval.
a.) between 38 and 62
b.) less than 38 or more than 62
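One way to check answers to Examples 3 and 4 is to evaluate the Chebyshev bound directly. The sketch below is only an illustration: the function names `chebyshev_lower_bound` and `k_for_symmetric_interval` are made up for this example, and the second function assumes the interval is centered on the mean, as it is in Example 4. Exact fractions are used so the results match hand calculation.

```python
from fractions import Fraction

def chebyshev_lower_bound(k):
    """Smallest fraction of the data guaranteed to lie within k standard
    deviations of the mean, by Chebyshev's Theorem (requires k > 1)."""
    k = Fraction(k)
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2

def k_for_symmetric_interval(mean, sd, low, high):
    """Number of standard deviations from the mean to either endpoint.
    Assumes the interval is centered on the mean."""
    if mean - low != high - mean:
        raise ValueError("interval must be symmetric about the mean")
    return Fraction(high - mean) / Fraction(sd)

# Example 3: within 4 standard deviations of the mean
within_4 = chebyshev_lower_bound(4)           # 1 - 1/16 = 15/16

# Example 4: mean 50, standard deviation 6, interval from 38 to 62
k = k_for_symmetric_interval(50, 6, 38, 62)   # (62 - 50) / 6 = 2
inside = chebyshev_lower_bound(k)             # part a: at least 3/4
outside = 1 - inside                          # part b: at most 1/4

print(within_4, k, inside, outside)           # prints: 15/16 2 3/4 1/4
```

The theorem gives a guarantee, not an exact value: part a.) is "at least" 3/4, and part b.), being the complement, is "at most" 1/4.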