Business Statistics, 3e, by Ken Black
Chapter 3: Descriptive Statistics
Business Statistics: Contemporary Decision Making, 3e, by Black. © 2001 South-Western/Thomson Learning

Learning Objectives
• Distinguish between measures of central tendency, measures of variability, and measures of shape
• Understand the meanings of mean, median, mode, quartile, percentile, and range
• Compute mean, median, mode, percentile, quartile, range, variance, standard deviation, and mean absolute deviation on ungrouped data

Learning Objectives -- Continued
• Differentiate between sample and population variance and standard deviation
• Understand the meaning of standard deviation as it is applied by using the empirical rule and Chebyshev’s theorem
• Compute the mean, median, standard deviation, and variance on grouped data
• Understand box and whisker plots, skewness, and kurtosis

Measures of Central Tendency: Ungrouped Data
• Measures of central tendency yield information about “particular places or locations in a group of numbers.”
• Common Measures of Location
– Mode
– Median
– Mean
– Percentiles
– Quartiles

Mode
• The most frequently occurring value in a data set
• Applicable to all levels of data measurement (nominal, ordinal, interval, and ratio)
• Bimodal -- Data sets that have two modes
• Multimodal -- Data sets that contain more than two modes

Mode -- Example
• The mode is 44.
• There are more 44s than any other value.
35  41  44  45
37  41  44  46
37  43  44  46
39  43  44  46
40  43  44  46
40  43  45  48

Median
• Middle value in an ordered array of numbers
• Applicable for ordinal, interval, and ratio data
• Not applicable for nominal data
• Unaffected by extremely large and extremely small values

Median: Computational Procedure
• First Procedure
– Arrange the observations in an ordered array.
– If there is an odd number of terms, the median is the middle term of the ordered array.
– If there is an even number of terms, the median is the average of the middle two terms.
• Second Procedure
– The median’s position in an ordered array is given by (n+1)/2.

Median: Example with an Odd Number of Terms
Ordered Array: 3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21 22
• There are 17 terms in the ordered array.
• Position of median = (n+1)/2 = (17+1)/2 = 9
• The median is the 9th term, 15.
• If the 22 is replaced by 100, the median is still 15.
• If the 3 is replaced by -103, the median is still 15.

Median: Example with an Even Number of Terms
Ordered Array: 3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21
• There are 16 terms in the ordered array.
• Position of median = (n+1)/2 = (16+1)/2 = 8.5
• The median is between the 8th and 9th terms, 14.5.
• If the 21 is replaced by 100, the median is still 14.5.
• If the 3 is replaced by -88, the median is still 14.5.
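
The mode and median procedures above can be checked with a short Python sketch. The function names (`modes`, `median`) are illustrative choices, not part of the text; the data are the 24-value mode example and the two ordered arrays from the median examples.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, sorted."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def median(values):
    """Median located with the (n + 1) / 2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2             # 1-based position; ends in .5 when n is even
    lower = ordered[int(position) - 1]
    if position == int(position):      # odd n: a single middle term
        return lower
    upper = ordered[int(position)]     # even n: average the two middle terms
    return (lower + upper) / 2


mode_data = [35, 41, 44, 45, 37, 41, 44, 46, 37, 43, 44, 46,
             39, 43, 44, 46, 40, 43, 44, 46, 40, 43, 45, 48]
odd_array = [3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 17, 19, 19, 20, 21, 22]
even_array = odd_array[:-1]            # the same array without the final 22

print(modes(mode_data))                # [44]
print(median(odd_array))               # 15
print(median(even_array))              # 14.5
print(median(odd_array[:-1] + [100]))  # 15, unchanged by the extreme value
```

Returning a list from `modes` is a design choice so that bimodal and multimodal data sets are reported in full rather than collapsed to one value.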

Arithmetic Mean
• Commonly called “the mean”
• Is the average of a group of numbers
• Applicable for interval and ratio data
• Not applicable for nominal or ordinal data
• Affected by each value in the data set, including extreme values
• Computed by summing all values in the data set and dividing the sum by the number of values in the data set

Population Mean
$\mu = \frac{\sum X}{N} = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N} = \frac{24 + 13 + 19 + 26 + 11}{5} = \frac{93}{5} = 18.6$

Sample Mean
$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n} = \frac{57 + 86 + 42 + 38 + 90 + 66}{6} = \frac{379}{6} = 63.167$

Percentiles
• Measures of central tendency that divide a group of data into 100 parts
• At least n% of the data lie below the nth percentile, and at most (100 - n)% of the data lie above the nth percentile
• Example: 90th percentile indicates that at least 90% of the data lie below it, and at most 10% of the data lie above it
• The median and the 50th percentile have the same value.
• Applicable for ordinal, interval, and ratio data
• Not applicable for nominal data

Percentiles: Computational Procedure
• Organize the data into an ascending ordered array.
• Calculate the location index of the Pth percentile: $i = \frac{P}{100}(n)$
• Determine the percentile’s location and its value.
• If i is a whole number, the percentile is the average of the values at the i and (i+1) positions.
• If i is not a whole number, the percentile is at the position given by the whole-number part of (i+1) in the ordered array.
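
The mean formulas and the percentile procedure above can be sketched in Python as follows. The function names are illustrative, and the guard on `p` is an added assumption, because the location rule only makes sense for 0 < P < 100. The eight-value data set is the one used in the deck's percentile example.

```python
import math


def mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


def percentile(values, p):
    """P-th percentile from the location index i = (P / 100) * n."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    i = p * len(ordered) / 100
    if i == int(i):
        # Whole-number i: average the values at positions i and i + 1 (1-based).
        i = int(i)
        return (ordered[i - 1] + ordered[i]) / 2
    # Otherwise use the whole-number part of i + 1 as the 1-based position.
    return ordered[math.floor(i + 1) - 1]


population = [24, 13, 19, 26, 11]
sample = [57, 86, 42, 38, 90, 66]
raw_data = [14, 12, 19, 23, 5, 13, 28, 17]

print(mean(population))           # 18.6
print(round(mean(sample), 3))     # 63.167
print(percentile(raw_data, 30))   # 13
print(percentile(raw_data, 50))   # 15.5, the same as the median
```

The population and sample means use the same arithmetic; only the notation ($\mu$ and N versus $\bar{X}$ and n) differs.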

Percentiles: Example
• Raw Data: 14, 12, 19, 23, 5, 13, 28, 17
• Ordered Array: 5, 12, 13, 14, 17, 19, 23, 28
• Location of 30th percentile: $i = \frac{30}{100}(8) = 2.4$
• The location index, i, is not a whole number; i+1 = 2.4+1 = 3.4; the whole number portion is 3; the 30th percentile is at the 3rd location of the array; the 30th percentile is 13.

Quartiles
• Measures of central tendency that divide a group of data into four subgroups
• Q1: 25% of the data set is below the first quartile
• Q2: 50% of the data set is below the second quartile
• Q3: 75% of the data set is below the third quartile
• Q1 is equal to the 25th percentile
• Q2 is located at the 50th percentile and equals the median
• Q3 is equal to the 75th percentile
• Quartile values are not necessarily members of the data set

Quartiles
[Figure: Q1, Q2, and Q3 divide the ordered data into four parts, each containing 25% of the values.]

Quartiles: Example
• Ordered array: 106, 109, 114, 116, 121, 122, 125, 129
• Q1: $i = \frac{25}{100}(8) = 2$, so $Q_1 = \frac{109 + 114}{2} = 111.5$
• Q2: $i = \frac{50}{100}(8) = 4$, so $Q_2 = \frac{116 + 121}{2} = 118.5$
• Q3: $i = \frac{75}{100}(8) = 6$, so $Q_3 = \frac{122 + 125}{2} = 123.5$

Variability
[Figure: two panels, "No Variability in Cash Flow" and "Variability in Cash Flow", each marking the mean.]

Variability
[Figure: two panels, "Variability" and "No Variability".]

Measures of Variability: Ungrouped Data
• Measures of variability describe the spread or the dispersion of a set of data.
• Common Measures of Variability
– Range
– Interquartile Range
– Mean Absolute Deviation
– Variance
– Standard Deviation
– Z scores
– Coefficient of Variation

Range
• The difference between the largest and the smallest values in a set of data
• Simple to compute
• Ignores all data points except the two extremes
• Example: Range = Largest - Smallest = 48 - 35 = 13
35  41  44  45
37  41  44  46
37  43  44  46
39  43  44  46
40  43  44  46
40  43  45  48

Interquartile Range
• Range of values between the first and third quartiles
• Range of the “middle half”
• Less influenced by extremes
Interquartile Range = $Q_3 - Q_1$

Deviation from the Mean
• Data set: 5, 9, 16, 17, 18
• Mean: $\mu = \frac{\sum X}{N} = \frac{65}{5} = 13$
• Deviations from the mean: -8, -4, 3, 4, 5
[Figure: number line from 0 to 20 showing the deviations -8, -4, +3, +4, and +5 from the mean.]

Mean Absolute Deviation
• Average of the absolute deviations from the mean
$X$      $X - \mu$    $|X - \mu|$
5        -8           +8
9        -4           +4
16       +3           +3
17       +4           +4
18       +5           +5
Total     0           24
$M.A.D. = \frac{\sum |X - \mu|}{N} = \frac{24}{5} = 4.8$

Population Variance
• Average of the squared deviations from the arithmetic mean
$X$      $X - \mu$    $(X - \mu)^2$
5        -8           64
9        -4           16
16       +3           9
17       +4           16
18       +5           25
Total     0           130
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$

Population Standard Deviation
• Square root of the variance
$X$      $X - \mu$    $(X - \mu)^2$
5        -8           64
9        -4           16
16       +3           9
17       +4           16
18       +5           25
Total     0           130
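
The range, mean absolute deviation, population variance, and population standard deviation above can be reproduced with a brief Python sketch. Function names are illustrative rather than taken from the text, and the data are the five-value set 5, 9, 16, 17, 18 together with the 24-value set from the range example.

```python
import math


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


def deviations(values):
    """Deviation of each value from the mean of the data set."""
    mu = sum(values) / len(values)
    return [x - mu for x in values]


def mean_absolute_deviation(values):
    """Average of the absolute deviations from the mean."""
    return sum(abs(d) for d in deviations(values)) / len(values)


def population_variance(values):
    """Average of the squared deviations from the mean (divides by N)."""
    return sum(d ** 2 for d in deviations(values)) / len(values)


def population_std_dev(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


data = [5, 9, 16, 17, 18]
range_data = [35, 41, 44, 45, 37, 41, 44, 46, 37, 43, 44, 46,
              39, 43, 44, 46, 40, 43, 44, 46, 40, 43, 45, 48]

print(value_range(range_data))              # 13
print(deviations(data))                     # [-8.0, -4.0, 3.0, 4.0, 5.0]
print(sum(deviations(data)))                # 0.0
print(mean_absolute_deviation(data))        # 4.8
print(population_variance(data))            # 26.0
print(round(population_std_dev(data), 1))   # 5.1
```

The deviations always sum to zero, which is why the absolute value (for the M.A.D.) or the square (for the variance) is taken before averaging.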
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$
$\sigma = \sqrt{\sigma^2} = \sqrt{26.0} = 5.1$

Sample Variance
• Average of the squared deviations from the arithmetic mean
$X$       $X - \bar{X}$    $(X - \bar{X})^2$
2,398     625              390,625
1,844     71               5,041
1,539     -234             54,756
1,311     -462             213,444
Total 7,092     0          663,866
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663,866}{3} = 221,288.67$

Sample Standard Deviation
• Square root of the sample variance
$X$       $X - \bar{X}$    $(X - \bar{X})^2$
2,398     625              390,625
1,844     71               5,041
1,539     -234             54,756
1,311     -462             213,444
Total 7,092     0          663,866
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663,866}{3} = 221,288.67$
$S = \sqrt{S^2} = \sqrt{221,288.67} = 470.41$

Uses of Standard Deviation
• Indicator of financial risk
• Quality Control
– construction of quality control charts
– process capability studies
• Comparing populations
– household incomes in two cities
– employee absenteeism at two plants

Standard Deviation as an Indicator of Financial Risk
Annualized Rate of Return
Financial Security    $\mu$    $\sigma$
A                     15%      3%
B                     15%      7%

Empirical Rule
• Data are normally distributed (or approximately normal)
Distance from the Mean    Percentage of Values Falling Within Distance
$\mu \pm 1\sigma$         68
$\mu \pm 2\sigma$         95
$\mu \pm 3\sigma$         99.7

Chebyshev’s Theorem
• Applies to all distributions
$P(\mu - k\sigma < X < \mu + k\sigma) \geq 1 - \frac{1}{k^2}$ for k > 1

Chebyshev’s Theorem
• Applies to all distributions
Number of Standard Deviations: K = 2, K = 3, K = 4
Distance from the Mean: $\mu \pm 2\sigma$, $\mu \pm 3\sigma$, $\mu \pm 4\sigma$
Minimum Proportion of Values Falling Within Distance: $1 - 1/2^2 = 0.75$, $1 - 1/3^2 = 0.89$, $1 - 1/4^2 = 0.94$

Coefficient of Variation
• Ratio of the standard deviation to the mean, expressed as a percentage
• Measurement of relative dispersion
$C.V. = \frac{\sigma}{\mu}(100)$

Coefficient of Variation
$\mu_1 = 29$, $\sigma_1 = 4.6$: $C.V._1 = \frac{\sigma_1}{\mu_1}(100) = \frac{4.6}{29}(100) = 15.86$
$\mu_2 = 84$, $\sigma_2 = 10$: $C.V._2 = \frac{\sigma_2}{\mu_2}(100) = \frac{10}{84}(100) = 11.90$

Measures of Central Tendency and Variability: Grouped Data
• Measures of Central Tendency
– Mean
– Median
– Mode
• Measures of Variability
– Variance
– Standard Deviation

Mean of Grouped Data
• Weighted average of class midpoints
• Class frequencies are the weights
$\mu = \frac{\sum fM}{\sum f} = \frac{\sum fM}{N} = \frac{f_1 M_1 + f_2 M_2 + f_3 M_3 + \cdots + f_i M_i}{f_1 + f_2 + f_3 + \cdots + f_i}$

Calculation of Grouped Mean
Class Interval    Frequency (f)    Class Midpoint (M)    fM
20-under 30       6                25                    150
30-under 40       18               35                    630
40-under 50       11               45                    495
50-under 60       11               55                    605
60-under 70       3                65                    195
70-under 80       1                75                    75
Total             50                                     2150
$\mu = \frac{\sum fM}{\sum f} = \frac{2150}{50} = 43.0$

Median of Grouped Data
$Median = L + \frac{\frac{N}{2} - cf_p}{f_{med}}(W)$
Where:
L = the lower limit of the median class
cfp = cumulative frequency of class preceding the median class
fmed = frequency of the median class
W = width of the median class
N = total of frequencies
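
The grouped-data mean and median formulas above, along with the Chebyshev bound and the coefficient of variation, can be sketched in Python. This is a minimal illustration: the function names and the `(lower limit, upper limit, frequency)` tuple layout are assumptions made for the example, and the median class is taken to be the first class whose cumulative frequency reaches N/2.

```python
# Each class is (lower limit, upper limit, frequency); the layout is illustrative.
classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]


def grouped_mean(classes):
    """Weighted average of class midpoints, weighted by class frequency."""
    n = sum(f for _, _, f in classes)
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n


def grouped_median(classes):
    """Median = L + ((N/2 - cfp) / fmed) * W for the median class."""
    n = sum(f for _, _, f in classes)
    cfp = 0                                  # cumulative frequency so far
    for lo, hi, f in classes:
        if cfp + f >= n / 2:                 # this class contains the N/2-th value
            return lo + (n / 2 - cfp) / f * (hi - lo)
        cfp += f


def coefficient_of_variation(sigma, mu):
    """Standard deviation as a percentage of the mean."""
    return sigma / mu * 100


def chebyshev_bound(k):
    """Minimum proportion of values within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


print(grouped_mean(classes))                           # 43.0
print(round(grouped_median(classes), 3))               # 40.909
print(round(coefficient_of_variation(4.6, 29), 2))     # 15.86
print(round(coefficient_of_variation(10, 84), 2))      # 11.9
print(chebyshev_bound(2))                              # 0.75
print(round(chebyshev_bound(3), 2))                    # 0.89
```

In the coefficient-of-variation example the second data set has the larger standard deviation (10 versus 4.6) but the smaller relative dispersion (11.90 versus 15.86).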

Median of Grouped Data -- Example
Class Interval    Frequency    Cumulative Frequency
20-under 30       6            6
30-under 40       18           24
40-under 50       11           35
50-under 60       11           46
60-under 70       3            49
70-under 80       1            50
N = 50
$Md = L + \frac{\frac{N}{2} - cf_p}{f_{med}}(W) = 40 + \frac{\frac{50}{2} - 24}{11}(10) = 40.909$

Mode of Grouped Data
• Midpoint of the modal class
• Modal class has the greatest frequency
Class Interval    Frequency
20-under 30       6
30-under 40       18
40-under 50       11
50-under 60       11
60-under 70       3
70-under 80       1
$Mode = \frac{30 + 40}{2} = 35$

Variance and Standard Deviation of Grouped Data
Population: $\sigma^2 = \frac{\sum f(M - \mu)^2}{N}$, $\sigma = \sqrt{\sigma^2}$
Sample: $S^2 = \frac{\sum f(M - \bar{X})^2}{n - 1}$, $S = \sqrt{S^2}$

Population Variance and Standard Deviation of Grouped Data
Class Interval    f     M     fM      $M - \mu$    $(M - \mu)^2$    $f(M - \mu)^2$
20-under 30       6     25    150     -18          324              1944
30-under 40       18    35    630     -8           64               1152
40-under 50       11    45    495     2            4                44
50-under 60       11    55    605     12           144              1584
60-under 70       3     65    195     22           484              1452
70-under 80       1     75    75      32           1024             1024
Total             50          2150                                  7200
$\sigma^2 = \frac{\sum f(M - \mu)^2}{N} = \frac{7200}{50} = 144$
$\sigma = \sqrt{\sigma^2} = \sqrt{144} = 12$

Measures of Shape
• Skewness
– Absence of symmetry
– Extreme values in one side of a distribution
• Kurtosis
– Peakedness of a distribution
– Leptokurtic: high and thin
– Mesokurtic: normal shape
– Platykurtic: flat and spread out
• Box and Whisker Plots
– Graphic display of a distribution
– Reveals skewness

Skewness
[Figure: three distribution shapes, labeled Negatively Skewed, Symmetric (Not Skewed), and Positively Skewed.]
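
The grouped-data variance and standard deviation calculation above can be sketched in Python. The function name, the `sample` flag, and the tuple layout are illustrative assumptions. Treating the same frequency table as a sample (dividing by n - 1) is shown only to contrast the two formulas; the deck's worked example is the population case.

```python
import math

# Each class is (lower limit, upper limit, frequency); the layout is illustrative.
classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]


def grouped_variance(classes, sample=False):
    """Sum of f * (M - mean)^2 divided by N (population) or n - 1 (sample)."""
    n = sum(f for _, _, f in classes)
    midpoints = [((lo + hi) / 2, f) for lo, hi, f in classes]
    mean = sum(f * m for m, f in midpoints) / n
    squared = sum(f * (m - mean) ** 2 for m, f in midpoints)
    return squared / (n - 1 if sample else n)


population_variance = grouped_variance(classes)
print(population_variance)                # 144.0
print(math.sqrt(population_variance))     # 12.0

# Hypothetical: the same table treated as a sample of 50 observations.
sample_variance = grouped_variance(classes, sample=True)
print(round(sample_variance, 2))          # 146.94
```

Because the class midpoints stand in for the individual observations, the grouped results are approximations of the values that the raw data would give.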

Skewness
[Figure: relative positions of the mean, median, and mode in negatively skewed, symmetric (not skewed), and positively skewed distributions.]

Coefficient of Skewness
• Summary measure for skewness
$S = \frac{3(\mu - M_d)}{\sigma}$
• If S < 0, the distribution is negatively skewed (skewed to the left).
• If S = 0, the distribution is symmetric (not skewed).
• If S > 0, the distribution is positively skewed (skewed to the right).

Coefficient of Skewness
$\mu_1 = 23$, $M_{d1} = 26$, $\sigma_1 = 12.3$: $S_1 = \frac{3(\mu_1 - M_{d1})}{\sigma_1} = \frac{3(23 - 26)}{12.3} = -0.73$
$\mu_2 = 26$, $M_{d2} = 26$, $\sigma_2 = 12.3$: $S_2 = \frac{3(\mu_2 - M_{d2})}{\sigma_2} = \frac{3(26 - 26)}{12.3} = 0$
$\mu_3 = 29$, $M_{d3} = 26$, $\sigma_3 = 12.3$: $S_3 = \frac{3(\mu_3 - M_{d3})}{\sigma_3} = \frac{3(29 - 26)}{12.3} = +0.73$

Kurtosis
• Peakedness of a distribution
– Leptokurtic: high and thin
– Mesokurtic: normal in shape
– Platykurtic: flat and spread out
[Figure: three curves, labeled Leptokurtic, Mesokurtic, and Platykurtic.]

Box and Whisker Plot
• Five specific values are used:
– Median, Q2
– First quartile, Q1
– Third quartile, Q3
– Minimum value in the data set
– Maximum value in the data set
• Inner Fences
– IQR = Q3 - Q1
– Lower inner fence = Q1 - 1.5 IQR
– Upper inner fence = Q3 + 1.5 IQR
• Outer Fences
– Lower outer fence = Q1 - 3.0 IQR
– Upper outer fence = Q3 + 3.0 IQR

Box and Whisker Plot
[Figure: box and whisker plot marking Minimum, Q1, Q2, Q3, and Maximum.]
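
The fence formulas above can be sketched in Python. The function names are illustrative, and the quartiles used (Q1 = 111.5, Q3 = 123.5) come from the deck's eight-value quartile example. The `classify` helper follows the common convention that values beyond the inner fences are mild outliers and values beyond the outer fences are extreme outliers; the slides list the fences without stating that convention, so treat it as an assumption.

```python
def fences(q1, q3):
    """Interquartile range plus the inner and outer fences built from it."""
    iqr = q3 - q1
    return {
        "iqr": iqr,
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3.0 * iqr, q3 + 3.0 * iqr),
    }


def classify(x, q1, q3):
    """Label a value by where it falls relative to the fences (assumed convention)."""
    f = fences(q1, q3)
    lower_outer, upper_outer = f["outer"]
    lower_inner, upper_inner = f["inner"]
    if x < lower_outer or x > upper_outer:
        return "extreme outlier"
    if x < lower_inner or x > upper_inner:
        return "mild outlier"
    return "within inner fences"


result = fences(111.5, 123.5)
print(result["iqr"])                    # 12.0
print(result["inner"])                  # (93.5, 141.5)
print(result["outer"])                  # (75.5, 159.5)
print(classify(129, 111.5, 123.5))      # within inner fences
print(classify(150, 111.5, 123.5))      # mild outlier
```

All eight values in the quartile example (106 through 129) fall inside the inner fences, so that data set would show no outliers on a box and whisker plot under this convention.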

Skewness: Box and Whisker Plots, and Coefficient of Skewness
[Figure: three box and whisker plots, labeled S < 0 Negatively Skewed, S = 0 Symmetric (Not Skewed), and S > 0 Positively Skewed.]
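
The coefficient of skewness and its sign interpretation can be checked with a small Python sketch. The function names are illustrative; the three (mean, median, standard deviation) triples are the ones from the deck's coefficient of skewness example.

```python
def skewness_coefficient(mu, median, sigma):
    """S = 3 * (mean - median) / standard deviation."""
    return 3 * (mu - median) / sigma


def describe_skewness(s):
    """Interpret the sign of S."""
    if s < 0:
        return "negatively skewed (skewed to the left)"
    if s > 0:
        return "positively skewed (skewed to the right)"
    return "symmetric (not skewed)"


examples = [(23, 26, 12.3), (26, 26, 12.3), (29, 26, 12.3)]
for mu, median, sigma in examples:
    s = skewness_coefficient(mu, median, sigma)
    print(round(s, 2), describe_skewness(s))
# -0.73 negatively skewed (skewed to the left)
# 0.0 symmetric (not skewed)
# 0.73 positively skewed (skewed to the right)
```

The sign depends only on whether the mean lies below or above the median; the standard deviation scales the magnitude.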