Putting Statistics to Work
Copyright © 2011 Pearson Education, Inc.

Unit 6B: Measures of Variation

Why Variation Matters

Consider the following waiting times for 11 customers at each of 2 banks.

Big Bank (three lines): 4.1 5.2 5.6 6.2 6.7 7.2 7.7 7.7 8.5 9.3 11.0
Best Bank (one line):   6.6 6.7 6.7 6.9 7.1 7.2 7.3 7.4 7.7 7.8 7.8

Which bank is likely to have more unhappy customers?
→ Big Bank, because its waiting times vary more, so customers face more unexpectedly long waits.

The range, R, of a variable is the difference between the largest data value and the smallest data value. That is,

Range = R = Largest Data Value - Smallest Data Value

Quartiles

The lower quartile (or first quartile) divides the lowest fourth of a data set from the upper three-fourths. It is the median of the data values in the lower half of a data set.

The middle quartile (or second quartile) is the overall median.

The upper quartile (or third quartile) divides the lower three-fourths of a data set from the upper fourth. It is the median of the data values in the upper half of a data set.

Quartiles divide data sets into fourths, or four equal parts.
• The 1st quartile, denoted Q1, divides the bottom 25% of the data from the top 75%. Therefore, the 1st quartile is equivalent to the 25th percentile.
• The 2nd quartile, denoted Q2, divides the bottom 50% of the data from the top 50%. Therefore, the 2nd quartile is equivalent to the 50th percentile, which is the median.
• The 3rd quartile, denoted Q3, divides the bottom 75% of the data from the top 25%. Therefore, the 3rd quartile is equivalent to the 75th percentile.

© 2010 Pearson Prentice Hall.
All rights reserved.

EXAMPLE Finding and Interpreting Quartiles

A group of Brigham Young University-Idaho students (Matthew Herring, Nathan Spencer, Mark Walker, and Mark Steiner) collected data on the speed of vehicles traveling through a construction zone on a state highway, where the posted speed was 25 mph. The recorded speeds of 14 randomly selected vehicles are given below:

20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40

Find and interpret the quartiles for speed in the construction zone.

Step 1: The data are already in ascending order.
Step 2: There are n = 14 observations, so the median, or second quartile, Q2, is the mean of the 7th and 8th observations: (32 + 33)/2 = 32.5. Therefore, Q2 = M = 32.5.
Step 3: The median of the bottom half of the data is the first quartile, Q1.
    20, 24, 27, 28, 29, 30, 32
The median of these seven observations is 28. Therefore, Q1 = 28.
The median of the top half of the data is the third quartile, Q3.
    33, 34, 36, 38, 39, 40, 40
The median of these seven observations is 38. Therefore, Q3 = 38.

Interpretation:
• About 25% of the speeds are less than or equal to the first quartile, 28 miles per hour, and about 75% of the speeds are greater than 28 miles per hour.
• About 50% of the speeds are less than or equal to the second quartile, 32.5 miles per hour, and about 50% of the speeds are greater than 32.5 miles per hour.
• About 75% of the speeds are less than or equal to the third quartile, 38 miles per hour, and about 25% of the speeds are greater than 38 miles per hour.

The Five-Number Summary

The five-number summary for a data set consists of the following five numbers:

low value
lower quartile
median
upper quartile
high value

A boxplot shows the five-number summary visually, with a rectangular box enclosing the lower and upper quartiles, a line marking the median, and whiskers extending to the low and high values.
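The quartile steps and the five-number summary described above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function names `quartiles` and `five_number_summary` are made up for this example, and the code assumes the median-of-halves rule used in the worked example (when the number of values is odd, the middle value is left out of both halves). Statistical software often uses other quartile conventions and may return slightly different values.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule.

    Assumption: when the number of values is odd, the middle value
    is excluded from both the lower and the upper half.
    """
    data = sorted(values)
    n = len(data)
    lower_half = data[: n // 2]          # values below the median position
    upper_half = data[(n + 1) // 2 :]    # values above the median position
    return median(lower_half), median(data), median(upper_half)


def five_number_summary(values):
    """Return (low value, Q1, median, Q3, high value)."""
    q1, q2, q3 = quartiles(values)
    return min(values), q1, q2, q3, max(values)


# Speed data (mph) from the construction-zone example.
speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]
print(five_number_summary(speeds))  # (20, 28, 32.5, 38, 40)
```

Applied to the Big Bank waiting times, the same function returns 4.1, 5.6, 7.2, 8.5, and 11.0.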
Five-number summary of the waiting times at each bank:

                    Big Bank   Best Bank
low value (min)        4.1        6.6
lower quartile         5.6        6.7
median                 7.2        7.2
upper quartile         8.5        7.7
high value (max)      11.0        7.8

The corresponding boxplot:
[Figure: boxplots of the waiting times at Big Bank and Best Bank]

EXAMPLE Determining and Interpreting the Interquartile Range

Determine and interpret the interquartile range of the speed data.

Q1 = 28, Q3 = 38
IQR = Q3 - Q1 = 38 - 28 = 10

The range of the middle 50% of the speeds of cars traveling through the construction zone is 10 miles per hour.

Suppose a 15th car travels through the construction zone at 100 miles per hour. How does this value impact the mean, median, standard deviation, and interquartile range?

                      Without 15th car   With 15th car
Mean                      32.1 mph          36.7 mph
Median                    32.5 mph          33 mph
Standard deviation         6.2 mph          18.5 mph
IQR                       10 mph            11 mph

The mean and standard deviation change substantially, while the median and interquartile range change only slightly.

The closing prices for 9 telecommunications stocks are shown below. Compute the interquartile range, IQR.

3.14 5.70 40.87 71.64 6.72 15.63 17.75 28.12 31.24

A. 29.845
B. 68.32
C. 6.21
D. 36.055

EXAMPLE Checking for Outliers

Check the speed data for outliers.

Step 1: The first and third quartiles are Q1 = 28 mph and Q3 = 38 mph.
Step 2: The interquartile range is 10 mph.
Step 3: The fences are
    Lower Fence = Q1 - 1.5(IQR) = 28 - 1.5(10) = 13 mph
    Upper Fence = Q3 + 1.5(IQR) = 38 + 1.5(10) = 53 mph
Step 4: There are no values less than 13 mph or greater than 53 mph. Therefore, there are no outliers.
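The fence calculation in the outlier check above can be sketched in Python. This is an illustrative sketch only: the function names `fences` and `outliers` are hypothetical, and the quartiles are computed with the same median-of-halves rule as the worked example, which is an assumption about the convention intended by the slides.

```python
from statistics import median


def fences(values):
    """Return (lower fence, upper fence) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    with the middle value excluded when the number of values is odd.
    """
    data = sorted(values)
    n = len(data)
    q1 = median(data[: n // 2])
    q3 = median(data[(n + 1) // 2 :])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(values):
    """Return the values that fall outside the fences."""
    lower, upper = fences(values)
    return [x for x in values if x < lower or x > upper]


# Speed data (mph) from the construction-zone example.
speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]
print(fences(speeds))            # (13.0, 53.0)
print(outliers(speeds))          # []
print(outliers(speeds + [100]))  # [100]
```

The last call adds the hypothetical 15th car traveling at 100 mph; with it included, the fences become 11.5 mph and 55.5 mph, so that speed is flagged as an outlier.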
EXAMPLE Obtaining the Five-Number Summary

Every six months, the United States Federal Reserve Board conducts a survey of credit card plans in the U.S. The following data are the interest rates charged by 10 credit card issuers randomly selected for the July 2005 survey. Determine the five-number summary of the data.

Institution                               Rate
Pulaski Bank and Trust Company             6.5%
Rainier Pacific Savings Bank              12.0%
Wells Fargo Bank NA                       14.4%
Firstbank of Colorado                     14.4%
Lafayette Ambassador Bank                 14.3%
Infibank                                  13.0%
United Bank, Inc.                         13.3%
First National Bank of The Mid-Cities     13.9%
Bank of Louisiana                          9.9%
Bar Harbor Bank and Trust Company         14.5%

Source: http://www.federalreserve.gov/pubs/SHOP/survey.htm

First, we write the data in ascending order:

6.5%, 9.9%, 12.0%, 13.0%, 13.3%, 13.9%, 14.3%, 14.4%, 14.4%, 14.5%

The smallest number is 6.5%. The largest number is 14.5%. The first quartile is 12.0%. The second quartile is 13.6%. The third quartile is 14.4%.

Five-number summary: 6.5%, 12.0%, 13.6%, 14.4%, 14.5%

EXAMPLE Constructing a Boxplot

Draw a boxplot of the same credit card interest rate data.

Step 1: The interquartile range (IQR) is 14.4% - 12.0% = 2.4%.
The lower and upper fences are:
    Lower Fence = Q1 - 1.5(IQR) = 12 - 1.5(2.4) = 8.4%
    Upper Fence = Q3 + 1.5(IQR) = 14.4 + 1.5(2.4) = 18.0%

Step 2:
[Figure: boxplot of the interest rates, with the fences marked by brackets and an outlier marked by an asterisk. The rate 6.5% lies below the lower fence of 8.4%.]

The interest rate boxplot indicates that the distribution is skewed left.

Use the boxplot to identify the first quartile.

[Figure: boxplot with labeled values 10, 18, 24, 26, and 30 on an axis running from 10 to 30]

A. 10
B. 18
C. 24
D. 26

Copyright © 2010 Pearson Education, Inc.

Standard Deviation

The standard deviation is the single number most commonly used to describe variation.

\[
\text{standard deviation} = \sqrt{\frac{\text{sum of (deviations from the mean)}^2}{\text{total number of data values} - 1}}
\]

Calculating the Standard Deviation

The standard deviation is calculated by completing the following steps:

1. Compute the mean of the data set. Then find the deviation from the mean for every data value.
   deviation from the mean = data value - mean
2. Find the squares of all the deviations from the mean.
3. Add all the squares of the deviations from the mean.
4. Divide this sum by the total number of data values minus 1.
5. The standard deviation is the square root of this quotient.

Standard Deviation

Let A = {2, 8, 9, 12, 19} with a mean of 10. Find the sample standard deviation of the data set A.

x (data value)   x - mean (deviation)   (deviation)^2
      2              2 - 10 = -8         (-8)^2 = 64
      8              8 - 10 = -2         (-2)^2 = 4
      9              9 - 10 = -1         (-1)^2 = 1
     12             12 - 10 = 2           (2)^2 = 4
     19             19 - 10 = 9           (9)^2 = 81
                                          Total = 154

\[
\text{standard deviation} = \sqrt{\frac{154}{5 - 1}} = \sqrt{38.5} \approx 6.2
\]

The Range Rule of Thumb

The standard deviation is approximately related to the range of a data set by the range rule of thumb:

\[
\text{standard deviation} \approx \frac{\text{range}}{4}
\]

If we know the standard deviation for a data set, we estimate the low and high values as follows:

\[
\text{low value} \approx \text{mean} - 2 \times \text{standard deviation}
\]
\[
\text{high value} \approx \text{mean} + 2 \times \text{standard deviation}
\]

Assignment: pp. 389-390, problems 7-25 odd
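The five steps for calculating the standard deviation, together with the range rule of thumb, can be sketched in Python. This is a minimal illustration rather than part of the original slides; the function names `sample_std` and `range_rule_estimate` are invented for this example.

```python
from math import sqrt


def sample_std(values):
    """Sample standard deviation, following the five steps in order."""
    n = len(values)
    mean = sum(values) / n                      # Step 1: mean ...
    deviations = [x - mean for x in values]     # ... and deviations from it
    squares = [d ** 2 for d in deviations]      # Step 2: square each deviation
    total = sum(squares)                        # Step 3: add the squares
    quotient = total / (n - 1)                  # Step 4: divide by n - 1
    return sqrt(quotient)                       # Step 5: take the square root


def range_rule_estimate(values):
    """Rough estimate of the standard deviation: range / 4."""
    return (max(values) - min(values)) / 4


# Data set A from the worked example (mean = 10).
A = [2, 8, 9, 12, 19]
print(round(sample_std(A), 1))   # 6.2
print(range_rule_estimate(A))    # 4.25
```

For this small data set the rule of thumb gives 4.25 against a computed value of about 6.2, which illustrates that the rule is only a rough approximation.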