Section 3.2 Measures of Dispersion

1. Range
2. Variance
3. Standard deviation
4. Empirical Rule for bell-shaped distributions
5. Chebyshev's Inequality for any distribution

Range

The range of a set of data is the difference between the maximum value and the minimum value.

Range = (maximum value) − (minimum value)

EXAMPLE Range

The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company.

23, 36, 23, 18, 5, 26, 43

Find the range.

Range = 43 − 5 = 38 minutes

Variance

The population variance is the sum of squared deviations about the population mean divided by the number of observations in the population, N. That is, it is the mean of the squared deviations about the population mean. The population variance is symbolically represented by σ² (lowercase Greek sigma squared):

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$

EXAMPLE Population Variance

The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company.

23, 36, 23, 18, 5, 26, 43

Compute the population variance of this data. Recall that

$$\mu = \frac{174}{7} \approx 24.85714$$

x_i    μ           x_i − μ      (x_i − μ)²
23     24.85714    -1.85714     3.44898
36     24.85714    11.14286     124.1633
23     24.85714    -1.85714     3.44898
18     24.85714    -6.85714     47.02041
5      24.85714    -19.8571     394.3061
26     24.85714    1.142857     1.306122
43     24.85714    18.14286     329.1633

$$\sum (x_i - \mu)^2 = 902.8571$$

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{902.8571}{7} \approx 129.0 \text{ minutes}^2$$

The sample variance, s², is computed by determining the sum of squared deviations about the sample mean and then dividing this result by n − 1:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

EXAMPLE Sample Variance

For the travel time data, assume we obtained the following simple random sample: 5, 36, 26. Compute the sample variance of the travel times.
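The two variance definitions above differ only in the divisor: N for a population, n − 1 for a sample. As a rough check, the sketch below applies both to the travel-time data in Python. The function names are illustrative and do not come from the original slides.

```python
# Travel times (minutes) for all seven employees: the full population.
travel_times = [23, 36, 23, 18, 5, 26, 43]

# The simple random sample of three travel times used in the example.
sample = [5, 36, 26]


def data_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)


def population_variance(values):
    """Sum of squared deviations about the mean, divided by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations about the sample mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


print(data_range(travel_times))                     # 38
print(round(population_variance(travel_times), 1))  # 129.0
print(round(sample_variance(sample), 3))            # 250.333
```

Python's standard library offers the same two calculations as `statistics.pvariance` and `statistics.variance`; the hand-written versions are used here only to mirror the definitions.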
Travel time, x_i    Sample mean, x̄    Deviation about the mean, x_i − x̄    Squared deviation about the mean, (x_i − x̄)²
5                   22.333             5 − 22.333 = -17.333                  (-17.333)² = 300.432889
36                  22.333             13.667                                186.786889
26                  22.333             3.667                                 13.446889

$$\sum (x_i - \bar{x})^2 = 500.66667$$

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{500.66667}{3 - 1} \approx 250.333 \text{ square minutes}$$

Standard Deviation

The standard deviation of a set of values is a measure of the variation of the values about the mean.

Population standard deviation: σ = square root of the population variance, so that $\sigma = \sqrt{\sigma^2}$

Sample standard deviation: s = square root of the sample variance, so that $s = \sqrt{s^2}$

EXAMPLE Population Standard Deviation

The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company.

23, 36, 23, 18, 5, 26, 43

Compute the population standard deviation of this data.

Recall from the last objective that σ² = 129.0 minutes². Therefore,

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{902.8571}{7}} \approx 11.4 \text{ minutes}$$

EXAMPLE Sample Standard Deviation

Recall that the sample data 5, 26, 36 result in a sample variance of

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{500.66667}{3 - 1} \approx 250.333 \text{ square minutes}$$

Use this result to determine the sample standard deviation.

$$s = \sqrt{s^2} = \sqrt{\frac{500.666667}{3 - 1}} \approx 15.8 \text{ minutes}$$

EXAMPLE Comparing Standard Deviations

Wait Time at Wendy's (minutes)

1.50  2.53  1.88  3.99  0.90  0.79  1.20  2.94  1.90  1.23
1.01  1.46  1.40  1.00  0.92  1.66  0.89  1.33  1.54  1.09
0.94  0.95  1.20  0.99  1.72  0.67  0.90  0.84  0.35  2.00

Wait Time at McDonald's (minutes)

3.50  0.00  1.97  0.00  3.08  0.00  0.26  0.71  0.28  2.75
0.38  0.14  2.22  0.44  0.36  0.43  0.60  4.54  1.38  3.10
1.82  2.33  0.80  0.92  2.19  3.04  2.54  0.50  1.17  0.23

Determine the standard deviation of the waiting time for Wendy's and for McDonald's. Which is the better company in terms of waiting times?
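One way to carry out this calculation is sketched below in Python, using the standard-library `statistics` module. The wait times are treated as samples, so `statistics.stdev` (which divides by n − 1) is used. The variable names are illustrative.

```python
import statistics

# Wait times in minutes, copied from the two tables in this example.
wendys = [
    1.50, 2.53, 1.88, 3.99, 0.90, 0.79, 1.20, 2.94, 1.90, 1.23,
    1.01, 1.46, 1.40, 1.00, 0.92, 1.66, 0.89, 1.33, 1.54, 1.09,
    0.94, 0.95, 1.20, 0.99, 1.72, 0.67, 0.90, 0.84, 0.35, 2.00,
]
mcdonalds = [
    3.50, 0.00, 1.97, 0.00, 3.08, 0.00, 0.26, 0.71, 0.28, 2.75,
    0.38, 0.14, 2.22, 0.44, 0.36, 0.43, 0.60, 4.54, 1.38, 3.10,
    1.82, 2.33, 0.80, 0.92, 2.19, 3.04, 2.54, 0.50, 1.17, 0.23,
]

# statistics.stdev uses the n - 1 divisor (sample standard deviation).
s_wendys = statistics.stdev(wendys)
s_mcdonalds = statistics.stdev(mcdonalds)

print(f"Wendy's:    mean = {statistics.mean(wendys):.2f}, s = {s_wendys:.3f}")
print(f"McDonald's: mean = {statistics.mean(mcdonalds):.2f}, s = {s_mcdonalds:.3f}")
```

Both samples have a mean of about 1.39 minutes, so the standard deviation is what separates them. One reasonable reading is that the restaurant with the smaller standard deviation has the more consistent wait times.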
Sample standard deviation for Wendy's: 0.738 minutes
Sample standard deviation for McDonald's: 1.265 minutes

The Empirical Rule for bell-shaped distributions

For many lists of observations, especially if their histogram is bell-shaped:

1. Roughly 68% of the observations in the list lie within 1 standard deviation of the average.
2. Roughly 95% of the observations lie within 2 standard deviations of the average.

In addition, about 99.7% of the observations lie within 3 standard deviations of the average.

[Figure: bell-shaped curve with the horizontal axis marked Ave − 2 s.d., Ave − s.d., Average, Ave + s.d., Ave + 2 s.d.; the central 68% and 95% regions are indicated.]

EXAMPLE Using the Empirical Rule

The following data represent the serum HDL cholesterol of the 54 female patients of a family doctor.

41  62  67  60  54  45
48  75  69  60  54  47
43  77  69  60  55  47
38  58  70  61  56  48
35  82  65  62  56  48
37  39  72  63  56  50
44  85  74  64  57  52
44  55  74  64  58  52
44  54  74  64  59  53

(a) Compute the population mean and standard deviation.
(b) Draw a histogram to verify that the data are bell-shaped.
(c) Determine the percentage of patients that have serum HDL within 3 standard deviations of the mean according to the Empirical Rule.
(d) Determine the percentage of patients that have serum HDL between 34 and 69.1 according to the Empirical Rule.
(e) Determine the actual percentage of patients that have serum HDL between 34 and 69.1. Use the raw data directly, not the Empirical Rule, and see how close the Empirical Rule approximation was.

Solution

(a) Using a TI-83 Plus graphing calculator or Excel, we find μ ≈ 57.4 and σ ≈ 11.7.

(b) [Histogram of serum HDL; horizontal axis marked at 22.3, 34.0, 45.7, 57.4, 69.1, 80.8, 92.5, that is, from 3 standard deviations below the mean to 3 standard deviations above it.]

(c) According to the Empirical Rule, 99.7% of the patients have serum HDL within 3 standard deviations of the mean.

(d) The value 34.0 is 2 standard deviations below the mean and 69.1 is 1 standard deviation above it, so 13.5% + 34% + 34% = 81.5% of patients will have a serum HDL between 34.0 and 69.1 according to the Empirical Rule.

(e) 45 out of the 54 patients, or 83.3%, have a serum HDL between 34.0 and 69.1.
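Parts (a) and (e) of this example can be checked with a short Python sketch. It assumes the 54 HDL values are the whole population, so it uses `statistics.pstdev` (divisor N). The helper `percent_between` is a hypothetical name introduced for illustration.

```python
import statistics

# Serum HDL cholesterol for the 54 patients, copied from the example.
hdl = [
    41, 62, 67, 60, 54, 45,
    48, 75, 69, 60, 54, 47,
    43, 77, 69, 60, 55, 47,
    38, 58, 70, 61, 56, 48,
    35, 82, 65, 62, 56, 48,
    37, 39, 72, 63, 56, 50,
    44, 85, 74, 64, 57, 52,
    44, 55, 74, 64, 58, 52,
    44, 54, 74, 64, 59, 53,
]

mu = statistics.mean(hdl)       # population mean
sigma = statistics.pstdev(hdl)  # population standard deviation (divides by N)


def percent_between(values, low, high):
    """Percentage of values with low <= value <= high."""
    inside = sum(1 for x in values if low <= x <= high)
    return 100 * inside / len(values)


# Empirical Rule estimate for (mu - 2 sigma, mu + sigma): 13.5% + 34% + 34%.
empirical_estimate = 13.5 + 34 + 34

# Actual share of patients between 34.0 and 69.1, counted from the raw data.
actual = percent_between(hdl, 34.0, 69.1)

print(round(mu, 1), round(sigma, 1))         # 57.4 11.7
print(empirical_estimate, round(actual, 1))  # 81.5 83.3
```

The counted percentage (83.3%) and the Empirical Rule estimate (81.5%) differ by less than two percentage points for this data set.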
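Chebyshev's Inequality: a rule for distributions of any shape

For any data set, regardless of the shape of its distribution, at least $\left(1 - \frac{1}{k^2}\right) \cdot 100\%$ of the observations lie within k standard deviations of the mean, for any k > 1.

EXAMPLE Using Chebyshev's Theorem

Using the data from the previous example, use Chebyshev's Theorem to

(a) determine the minimum percentage of patients that have serum HDL within 3 standard deviations of the mean.

$$\left(1 - \frac{1}{3^2}\right) \cdot 100\% \approx 88.9\%$$

(b) determine the actual percentage of patients that have serum HDL between 34 and 80.8.

The values 34 and 80.8 lie 2 standard deviations below and above the mean (57.4 ± 2 × 11.7), so Chebyshev's Theorem guarantees at least

$$\left(1 - \frac{1}{2^2}\right) \cdot 100\% = 75\%$$

Counting the raw data directly, 52 of the 54 patients (about 96.3%) have serum HDL between 34 and 80.8, which is consistent with the 75% lower bound.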
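The Chebyshev bound is simple enough to express as a one-line function. The sketch below, in Python, computes the bound for k = 3 and k = 2 and compares the k = 2 bound with a direct count on the HDL data. The function name `chebyshev_min_percent` is hypothetical, and rejecting k ≤ 1 follows the usual textbook statement of the theorem rather than anything in the original slides.

```python
def chebyshev_min_percent(k):
    """Minimum percentage of observations within k standard deviations
    of the mean, for any distribution: (1 - 1/k^2) * 100."""
    if k <= 1:
        # Assumption: the bound is only informative for k > 1.
        raise ValueError("k must be greater than 1")
    return (1 - 1 / k**2) * 100


# Serum HDL cholesterol for the 54 patients in the previous example.
hdl = [
    41, 62, 67, 60, 54, 45, 48, 75, 69, 60, 54, 47,
    43, 77, 69, 60, 55, 47, 38, 58, 70, 61, 56, 48,
    35, 82, 65, 62, 56, 48, 37, 39, 72, 63, 56, 50,
    44, 85, 74, 64, 57, 52, 44, 55, 74, 64, 58, 52,
    44, 54, 74, 64, 59, 53,
]

# Actual share of patients between 34 and 80.8 (mean +/- 2 standard deviations).
count_inside = sum(1 for x in hdl if 34 <= x <= 80.8)
actual_percent = 100 * count_inside / len(hdl)

print(round(chebyshev_min_percent(3), 1))     # 88.9
print(round(chebyshev_min_percent(2), 1))     # 75.0
print(count_inside, round(actual_percent, 1)) # 52 96.3
```

Chebyshev's bound holds for every distribution, so it is usually much more conservative than the Empirical Rule: here it guarantees only 75%, while the data actually place about 96% of patients within 2 standard deviations.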