Chapter Two
Descriptive Statistics

McGraw-Hill/Irwin. Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.

Descriptive Statistics
2.1 Describing the Shape of a Distribution
2.2 Describing Central Tendency
2.3 Measures of Variation
2.4 Percentiles, Quartiles, and Box-and-Whiskers Displays
2.5 Describing Qualitative Data
*2.6 Using Scatter Plots to Study the Relationship Between Variables
*2.7 Misleading Graphs and Charts

2.1 Stem and Leaf Display: Car Mileage
Example 2.1: The Car Mileage Case

   1   29 | 8
   5   30 | 1344
  12   30 | 5666889
  21   31 | 001233444
 (11)  31 | 55566777889
  17   32 | 0001122344
   7   32 | 556788
   1   33 | 3

Stem and Leaf Display: Payment Times
Example 2.2: The Accounts Receivable Case

   1   10 | 0
   2   11 | 0
   4   12 | 00
   7   13 | 000
  11   14 | 0000
  18   15 | 0000000
  27   16 | 000000000
  (8)  17 | 00000000
  30   18 | 000000
  24   19 | 00000
  19   20 | 000
  16   21 | 000
  13   22 | 000
  10   23 | 00
   8   24 | 000
   5   25 | 00
   3   26 | 0
   2   27 | 0
   1   28 |
   1   29 | 0

Histograms
Example 2.4: The Accounts Receivable Case
[Figures: Frequency Histogram; Relative Frequency Histogram]

The Normal Curve
[Figure]

Skewness
[Figure panels: Left Skewed; Symmetric; Right Skewed]

Dot Plots
[Figure: Scores on Exams 1 and 2]

2.2 Population Parameters and Sample Statistics
A population parameter is a number calculated from all the population measurements that describes some aspect of the population.
The population mean, denoted μ, is a population parameter and is the average of the population measurements.
A point estimate is a one-number estimate of the value of a population parameter.
A sample statistic is a number calculated using sample measurements that describes some aspect of the sample.

Measures of Central Tendency
Mean, μ: The average or expected value
Median, Md: The middle point of the ordered measurements
Mode, Mo: The most frequent value

The Mean
Population: X1, X2, …, XN
Sample: x1, x2, …, xn
Population Mean:
\[ \mu = \frac{\sum_{i=1}^{N} X_i}{N} \]
Sample Mean:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

The Sample Mean
The sample mean \(\bar{x}\) is defined as
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]
and is a point estimate of the population mean, μ.

Example: Car Mileage Case
Example 2.5: Sample mean for the first five car mileages from Table 2.1
30.8, 31.7, 30.1, 31.6, 32.1
\[ \bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{30.8 + 31.7 + 30.1 + 31.6 + 32.1}{5} = \frac{156.3}{5} = 31.26 \]

The Median
The population or sample median is a value such that 50% of all measurements lie above (or below) it.
The median Md is found as follows:
1. If the number of measurements is odd, the median is the middlemost measurement in the ordered values.
2. If the number of measurements is even, the median is the average of the two middlemost measurements in the ordered values.

Example: Sample Median
Example 2.6: Internists' Salaries (in thousands of dollars)
127 132 138 141 144 146 152 154 165 171 177 192 241
Since n = 13 is odd, the median is the middlemost (7th) measurement: Md = 152

The Mode
The mode, Mo, of a population or sample of measurements is the measurement that occurs most frequently.

Example: Sample Mode
Example 2.2: The Accounts Receivable Case
In the stem-and-leaf display of payment times (Section 2.1), the value 16 occurs 9 times, more often than any other value. Therefore Mo = 16.

Relationships Among Mean, Median and Mode
[Figure]

2.3 Measures of Variation
Range: The largest minus the smallest measurement
Variance: The average of the squared deviations from the mean
Standard Deviation: The square root of the variance

The Range
Range = largest measurement - smallest measurement
Example: Internists' Salaries (in thousands of dollars)
127 132 138 141 144 146 152 154 165 171 177 192 241
Range = 241 - 127 = 114 ($114,000)

The Variance
Population: X1, X2, …, XN
Sample: x1, x2, …, xn
Population Variance:
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \]
Sample Variance:
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]

The Standard Deviation
Population Standard Deviation, σ:
\[ \sigma = \sqrt{\sigma^2} \]
Sample Standard Deviation, s:
\[ s = \sqrt{s^2} \]

Example: Population Variance/Standard Deviation
Population of annual returns for five junk bond mutual funds: 10.0%, 9.4%, 9.1%, 8.3%, 7.8%
\[ \mu = \frac{10.0 + 9.4 + 9.1 + 8.3 + 7.8}{5} = \frac{44.6}{5} = 8.92\% \]
\[ \sigma^2 = \frac{(10.0 - 8.92)^2 + (9.4 - 8.92)^2 + (9.1 - 8.92)^2 + (8.3 - 8.92)^2 + (7.8 - 8.92)^2}{5} \]
\[ \sigma^2 = \frac{1.1664 + 0.2304 + 0.0324 + 0.3844 + 1.2544}{5} = \frac{3.068}{5} = 0.6136 \]
\[ \sigma = \sqrt{\sigma^2} = \sqrt{0.6136} = 0.7833 \]

Example: Sample Variance/Standard Deviation
Example 2.11: Sample variance and standard deviation for the first five car mileages from Table 2.1
30.8, 31.7, 30.1, 31.6, 32.1, with \(\bar{x} = 31.26\)
\[ s^2 = \frac{\sum_{i=1}^{5} (x_i - \bar{x})^2}{5 - 1} \]
\[ s^2 = \frac{(30.8 - 31.26)^2 + (31.7 - 31.26)^2 + (30.1 - 31.26)^2 + (31.6 - 31.26)^2 + (32.1 - 31.26)^2}{4} \]
\[ s^2 = \frac{2.572}{4} = 0.643 \]
\[ s = \sqrt{s^2} = \sqrt{0.643} = 0.8019 \]

The Empirical Rule for Normal Populations
If a population has mean μ and standard deviation σ and is described by a normal curve, then
68.26% of the population measurements lie within one standard deviation of the mean: [μ - σ, μ + σ]
95.44% of the population measurements lie within two standard deviations of the mean: [μ - 2σ, μ + 2σ]
99.73% of the population measurements lie within three standard deviations of the mean: [μ - 3σ, μ + 3σ]

Example: The Empirical Rule
Example 2.13: The Car Mileage Case
[Figure]

Chebyshev's Theorem
Let μ and σ be a population's mean and standard deviation. Then for any value k > 1, at least \(100(1 - 1/k^2)\%\) of the population measurements lie in the interval [μ - kσ, μ + kσ].

2.4 Percentiles and Quartiles
For a set of measurements arranged in increasing order, the pth percentile is a value such that p percent of the measurements fall at or below the value and (100 - p) percent of the measurements fall at or above the value.
The first quartile Q1 is the 25th percentile.
The second quartile (or median) Md is the 50th percentile.
The third quartile Q3 is the 75th percentile.
The interquartile range IQR is Q3 - Q1.

Example: Quartiles
20 customer satisfaction ratings:
1 3 5 5 7 8 8 8 8 8 8 9 9 9 9 9 10 10 10 10
Md = (8 + 8)/2 = 8
Q1 = (7 + 8)/2 = 7.5
Q3 = (9 + 9)/2 = 9
IQR = Q3 - Q1 = 9 - 7.5 = 1.5

Box and Whiskers Plots
[Figure]

2.5 Describing Qualitative Data

Population and Sample Proportions
Population: X1, X2, …, XN
Sample: x1, x2, …, xn
Population Proportion: p
Sample Proportion:
\[ \hat{p} = \frac{\sum_{i=1}^{n} x_i}{n} \]
where x_i = 1 if the characteristic is present, 0 if not.

Example: Sample Proportion
Example 2.16: Marketing Ethics Case
117 out of 205 marketing researchers disapproved of the action taken in a hypothetical scenario.
X = 117, number of researchers who disapprove
n = 205, number of researchers surveyed
Sample Proportion:
\[ \hat{p} = \frac{X}{n} = \frac{117}{205} = 0.57 \]

Bar Chart
[Figure: Percentage of Automobiles Sold by Manufacturer, 1970 versus 1997]

Pie Chart
[Figure: Percentage of Automobiles Sold by Manufacturer, 1997]

Pareto Chart
[Figure: Pareto Chart of Labeling Defects]

2.6 Scatter Plots
[Figure: Restaurant Ratings: Mean Preference vs. Mean Taste]

2.7 Misleading Graphs and Charts: Scale Break
[Figure: Mean Salaries at a Major University, 1999-2002]

Misleading Graphs and Charts: Horizontal Scale Effects
[Figure: Mean Salary Increases at a Major University, 1999-2002]

Descriptive Statistics
Summary:
2.1 Describing the Shape of a Distribution
2.2 Describing Central Tendency
2.3 Measures of Variation
2.4 Percentiles, Quartiles, and Box-and-Whiskers Displays
2.5 Describing Qualitative Data
*2.6 Using Scatter Plots to Study the Relationship Between Variables
*2.7 Misleading Graphs and Charts
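
Checking the worked examples
The numerical examples in this chapter (Examples 2.5, 2.6, 2.11 and 2.16, the junk bond returns, and the customer satisfaction ratings) can be checked with a short script. The following is a minimal sketch using Python's standard `statistics` module, not part of the original slides. The helper names `quartiles_by_halves` and `chebyshev_lower_bound` are hypothetical. Computing quartiles as medians of the lower and upper halves is an assumption inferred from the quartile example, and dropping the middle value when n is odd is a further assumption the slides do not state.

```python
import statistics as st

# Data copied from the chapter's examples.
mileages = [30.8, 31.7, 30.1, 31.6, 32.1]            # Examples 2.5 and 2.11
salaries = [127, 132, 138, 141, 144, 146, 152,
            154, 165, 171, 177, 192, 241]            # Example 2.6 (x $1000)
returns = [10.0, 9.4, 9.1, 8.3, 7.8]                 # junk bond fund returns (%)
ratings = [1, 3, 5, 5, 7, 8, 8, 8, 8, 8,
           8, 9, 9, 9, 9, 9, 10, 10, 10, 10]         # satisfaction ratings


def quartiles_by_halves(values):
    """Return (Q1, Md, Q3) as medians of the lower half, all data, upper half.

    Assumption: for odd n the middle value is left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return st.median(lower), st.median(ordered), st.median(upper)


def chebyshev_lower_bound(k):
    """Minimum percentage of measurements within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem requires k > 1")
    return 100 * (1 - 1 / k ** 2)


# Central tendency
sample_mean = st.mean(mileages)
median_salary = st.median(salaries)

# Variation: pvariance/pstdev divide by N, variance/stdev divide by n - 1
salary_range = max(salaries) - min(salaries)
pop_var = st.pvariance(returns)
pop_sd = st.pstdev(returns)
samp_var = st.variance(mileages)
samp_sd = st.stdev(mileages)

# Quartiles, interquartile range, and sample proportion
q1, md, q3 = quartiles_by_halves(ratings)
iqr = q3 - q1
p_hat = 117 / 205

print("sample mean:", round(sample_mean, 2))
print("median salary:", median_salary, "range:", salary_range)
print("population variance and sd:", round(pop_var, 4), round(pop_sd, 4))
print("sample variance and sd:", round(samp_var, 3), round(samp_sd, 4))
print("Q1, Md, Q3, IQR:", q1, md, q3, iqr)
print("sample proportion:", round(p_hat, 2))
print("Chebyshev bound for k = 2 (%):", chebyshev_lower_bound(2))
```

The distinction between `pvariance` (divisor N) and `variance` (divisor n - 1) mirrors the population and sample formulas in Section 2.3, so choosing the right function depends on whether the data are the whole population or a sample from it.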