Descriptive Statistics
Chapter 3
MSIS 111
Prof. Nick Dedeke

Objectives
- Define measures of central tendency, variability, shape and association
- Define statistical measures
- Compute statistical measures for ungrouped and grouped data
- Interpret statistical results

Introduction
In most competitive sports, one looks for the position of the athletes, e.g. who came in first, second, and so on. In statistics, one is interested in the following measures:
- most frequent value in the data set
- summary of all values in the data set
- midpoint position of the data set
- positions of data in the data set
- distances to the midpoint of the data set

Exercise: Statistical Measure 1
We want to find out which of the following students is the better one, using the available data.

Kuli:   1st 2nd 1st 2nd 1st 4th 3rd 3rd 2nd 5th 1st
Marti:  3rd 2nd 3rd 2nd 2nd 1st 1st 1st 3rd 2nd 1st

Using Statistical Measures

Kuli:   1st 2nd 1st 2nd 1st 4th 3rd 3rd 2nd 5th 1st
Marti:  3rd 2nd 3rd 2nd 2nd 1st 1st 1st 3rd 2nd 1st

Mode: most frequently occurring value of a variable
- Mode for Kuli: 1st
- Mode for Marti: 1st and 2nd (tie, four occurrences each)

Mean: average of the values of a variable
Sample mean: $\bar{X} = \frac{\sum X_i}{n}$
- Mean (average) score for Kuli: 25/11 ≈ 2.27
- Mean (average) score for Marti: 21/11 ≈ 1.91

Using Frequency Distributions
Analysis of Kuli's performance

Xi      Frequency (Fi)   Fi * Xi   Cum. freq. (CFi)
1st     4                4         4
2nd     3                6         7
3rd     2                6         9
4th     1                4         10
5th     1                5         11
Total   11               25

Mean = $\frac{\sum F_i X_i}{\sum F_i}$ = 25/11 ≈ 2.27
Mode = 1st
Median point = (11 + 1)/2 = 6th
Median value = 2nd (using the cumulative frequency column, the 6th observation falls in the 2nd-place row)

Using Frequency Distributions
Analysis of Marti's performance

Xi      Frequency (Fi)   Fi * Xi   Cum. freq. (CFi)
1st     4                4         4
2nd     4                8         8
3rd     3                9         11
4th     0                0         11
5th     0                0         11
Total   11               21

Mean = $\frac{\sum F_i X_i}{\sum F_i}$ = 21/11 ≈ 1.91
Mode = 1st and 2nd (tie)
Median point = (11 + 1)/2 = 6th
Median value = 2nd (using the cumulative frequency column)

New Case: Median Measure
Analysis of Katie's performance

Xi      Frequency (Fi)   Fi * Xi   Cum. freq. (CFi)
1st     4                4         4
2nd     2                4         6
3rd     5                15        11
4th     1                4         12
Total   12               27

Mean = $\frac{\sum F_i X_i}{\sum F_i}$ = 27/12 = 2.25
Mode = 3rd
Median point = (12 + 1)/2 = 6.5th
Median value = (2nd + 3rd)/2 = 2.5, the average of the values in the 6th and 7th positions

Percentiles
Sometimes we are not analyzing several values from one person, but one value for each of several persons or objects. For example, we have data on the performance of several fund managers for the year 2006. We want to present the data in the form "manager XX is in the 10th percentile" or "manager XX is in the 25th percentile".
The method consists of three steps:
- organize the data in ascending order
- calculate the location of the percentile you want
- identify the object at that percentile location in the data set

Interpretation: Percentiles
If a manager is in the 10th percentile of a group, 90% of everyone in the data set scored better than the manager.
If a manager is in the 95th percentile of a group, 5% of everyone in the data set scored higher or better than the manager.

Exercise: Percentiles for Known Values

First name   Fund performance
Bill         106%
Jane         109%
Sven         114%
Larry        116%
Dub          121%
Anna         122%
Cole         125%
Salome       129%

In which percentile is Sven?

Response: Percentiles for Known Values

First name   Fund performance   Fi   Rel. fi   Cum. fi   Percentile
Bill         106%               1    1/8       1/8       12.5th
Jane         109%               1    1/8       2/8       25th
Sven         114%               1    1/8       3/8       37.5th
Larry        116%               1    1/8       4/8       50th
Dub          121%               1    1/8       5/8       62.5th
Anna         122%               1    1/8       6/8       75th
Cole         125%               1    1/8       7/8       87.5th
Salome       129%               1    1/8       1         100th
N = 8

In which percentile is Sven? Sven is in the 37.5th percentile.

Example: Percentiles for Unknown Values

First name   Fund performance   Fi   Rel. fi   Cum. fi   Percentile
Bill         106%               1    1/8       1/8       12.5th
Jane         109%               1    1/8       2/8       25th
Sven         114%               1    1/8       3/8       37.5th
Larry        116%               1    1/8       4/8       50th
Dub          121%               1    1/8       5/8       62.5th
Anna         122%               1    1/8       6/8       75th
Cole         125%               1    1/8       7/8       87.5th
Salome       129%               1    1/8       1         100th
N = 8

What is the value of the 90th percentile? It does not coincide with any row of the table, so its location has to be computed.

Computing Percentile Locations
90th percentile location: $i = \frac{P}{100} \cdot N = 0.9 \times 8 = 7.2$, i.e. the 7.2th position.
The 90th percentile lies 0.2 (20%) of the way from the 7th to the 8th position.
Its value is found by interpolating between the values at those two positions:
value = 7th position's value + (8th position's value - 7th position's value) × fractional part of i
      = 125% + (129% - 125%) × 0.2 = 125.8% (≈ 126%)
50th percentile location: $i = \frac{P}{100} \cdot N = 0.5 \times 8 = 4$, i.e. the 4th position (Larry, 116%).

Computing Central Tendency Measures

Xi      Fi    Fi * Xi
55      2     110
60      1     60
100     3     300
125     5     625
140     4     560
Total   15    1655

Mean = $\frac{\sum F_i X_i}{\sum F_i}$ = 1655/15 = 110.33

Computing Dispersion Measures

Xi      Fi    Fi * Xi   (Xi - μ)   (Xi - μ)^2   Fi * (Xi - μ)^2
55      2     110       -55.33     3061.409     6122.818
60      1     60        -50.33     2533.109     2533.109
100     3     300       -10.33     106.709      320.127
125     5     625       14.67      215.209      1076.045
140     4     560       29.67      880.309      3521.236
Total   15    1655                              13573.335

Mean: $\mu = \frac{\sum F_i X_i}{\sum F_i}$ = 1655/15 = 110.33
Variance: $s^2 = \frac{\sum F_i (X_i - \mu)^2}{n - 1}$ = 13573.335/(15 - 1) = 969.52
Standard deviation: $s = \sqrt{s^2}$ = 31.137

Computing Dispersion Measures 2

Xi      Fi    Fi * Xi   Xi^2     Fi * Xi^2
55      2     110       3025     6050
60      1     60        3600     3600
100     3     300       10000    30000
125     5     625       15625    78125
140     4     560       19600    78400
Total   15    1655               196175

Variance: $s^2 = \frac{\sum F_i X_i^2 - \frac{(\sum F_i X_i)^2}{n}}{n - 1}$
        = (196175 - 1655^2/15)/(15 - 1)
        = (196175 - 182601.67)/14
        = 969.52
Standard deviation: s = 31.137

Exercise: Dispersion Measures

Xi      Fi    Fi * Xi   Xi^2     Fi * Xi^2
5       2
6       1
10      3
12      2
14      1

Variance: $s^2 = \frac{\sum F_i X_i^2 - \frac{(\sum F_i X_i)^2}{n}}{n - 1}$ =
Standard deviation (s) =

Excel Examples

Grouped Data Examples

Class interval (inches)   Freq (Fi)   Midpoint M (inches)   Fi * M (inches)   Fi * M^2 (sq. inches)
[1 – 3)                   16          2                     32                64
[3 – 5)                   2           4                     8                 32
[5 – 7)                   4           6                     24                144
[7 – 9)                   3           8                     24                192
[9 – 11)                  9           10                    90                900
[11 – 13)                 6           12                    72                864
Total                     40                                250               2,196

Variance: $s^2 = \frac{\sum F_i M_i^2 - \frac{(\sum F_i M_i)^2}{n}}{n - 1}$ = (2196 - 1562.5)/39 = 16.24 sq. inches
Standard deviation (s) = 4.03 inches

Grouped Data Exercise

Class interval (inches)   Freq (Fi)   M   Fi * M   Fi * M^2
[1 – 4)                   4
[4 – 8)                   4
[8 – 12)                  6
[12 – 16)                 12
[16 – 20)                 8
[20 – 24)                 6
Total                     40

Variance: $s^2 = \frac{\sum F_i M_i^2 - \frac{(\sum F_i M_i)^2}{n}}{n - 1}$ =
Standard deviation (s) =
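
Python Sketch: Measures from a Frequency Table
The mean, mode, and median computations on the "Using Frequency Distributions" and "New Case: Median measure" slides can be sketched in Python. This is only an illustration, not part of the course material: the function name `frequency_summary` and the coding of places as the integers 1 to 5 are assumptions made for the example.

```python
def frequency_summary(values, freqs):
    """Return (mean, modes, median) for a frequency table.

    values: distinct values Xi in ascending order (hypothetical coding: 1 = 1st place)
    freqs:  matching frequencies Fi
    """
    n = sum(freqs)

    # Mean = sum(Fi * Xi) / sum(Fi)
    mean = sum(f * x for x, f in zip(values, freqs)) / n

    # Mode = every value that reaches the highest frequency (ties are kept)
    top = max(freqs)
    modes = [x for x, f in zip(values, freqs) if f == top]

    def value_at(position):
        # Walk the cumulative frequency column until it covers the position.
        cumulative = 0
        for x, f in zip(values, freqs):
            cumulative += f
            if position <= cumulative:
                return x
        raise ValueError("position exceeds total frequency")

    # Median point = (n + 1) / 2; for even n, average the two middle positions.
    if n % 2 == 1:
        median = value_at((n + 1) // 2)
    else:
        median = (value_at(n // 2) + value_at(n // 2 + 1)) / 2

    return mean, modes, median


kuli_mean, kuli_modes, kuli_median = frequency_summary([1, 2, 3, 4, 5], [4, 3, 2, 1, 1])
marti_mean, marti_modes, marti_median = frequency_summary([1, 2, 3, 4, 5], [4, 4, 3, 0, 0])
katie_mean, katie_modes, katie_median = frequency_summary([1, 2, 3, 4], [4, 2, 5, 1])
```

With the frequencies from the slides, Kuli's median comes out as 2 (2nd place) at the 6th position, and Katie's as 2.5, the average of the 6th and 7th positions. Returning a list of modes is a design choice made here so that a tie, as in Marti's data, is reported rather than silently resolved.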
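
Python Sketch: Percentile Locations
The three-step percentile method and the location formula i = (P/100) * N from the "Computing Percentile locations" slide can be sketched as follows. The function names `percentile_of` and `percentile_value` are hypothetical, and the handling of locations below the 1st position or at the last position is an assumption, since the slides do not cover those cases. Statistical packages often use other percentile conventions, so their results may differ from this one.

```python
def percentile_of(sorted_values, value):
    """Percentile of a known value: its 1-based position divided by N, times 100."""
    position = sorted_values.index(value) + 1
    return position / len(sorted_values) * 100


def percentile_value(sorted_values, p):
    """Value at the p-th percentile, using location i = (p / 100) * N."""
    n = len(sorted_values)
    i = p / 100 * n
    whole = int(i)        # position just below the location
    fraction = i - whole  # how far the location lies toward the next position

    # Assumption (not from the slides): clamp locations that fall outside the data.
    if whole < 1:
        return sorted_values[0]
    if whole >= n:
        return sorted_values[-1]

    lower = sorted_values[whole - 1]  # value at the whole-numbered position
    upper = sorted_values[whole]      # value at the next position
    return lower + (upper - lower) * fraction


# Step 1: fund performance in percent, organized in ascending order
performance = [106, 109, 114, 116, 121, 122, 125, 129]

sven_percentile = percentile_of(performance, 114)
p90 = percentile_value(performance, 90)
p50 = percentile_value(performance, 50)
```

For the eight fund managers this places Sven (114%) in the 37.5th percentile, puts the 90th percentile at about 125.8%, and puts the 50th percentile at the 4th position (116%), in line with the slide's arithmetic.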
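
Python Sketch: Variance and Standard Deviation
The two variance formulas from the "Computing Dispersion Measures" slides, and the midpoint approach from "Grouped Data Examples", can be checked with a short Python sketch. The function names are hypothetical and chosen for this example only; the data are the worked examples from the slides, not the exercise tables.

```python
import math


def variance_definition(values, freqs):
    """Sample variance: sum(Fi * (Xi - mean)^2) / (n - 1)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / (n - 1)


def variance_shortcut(values, freqs):
    """Sample variance: (sum(Fi * Xi^2) - (sum(Fi * Xi))^2 / n) / (n - 1)."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x ** 2 for x, f in zip(values, freqs))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


def midpoints(intervals):
    """Class midpoints M for grouped data given as (lower, upper) pairs."""
    return [(lower + upper) / 2 for lower, upper in intervals]


# Ungrouped example from the dispersion slides
xi = [55, 60, 100, 125, 140]
fi = [2, 1, 3, 5, 4]
var_a = variance_definition(xi, fi)
var_b = variance_shortcut(xi, fi)
std_ungrouped = math.sqrt(var_a)

# Grouped example: class intervals in inches, represented by their midpoints
classes = [(1, 3), (3, 5), (5, 7), (7, 9), (9, 11), (11, 13)]
class_freqs = [16, 2, 4, 3, 9, 6]
var_grouped = variance_shortcut(midpoints(classes), class_freqs)
std_grouped = math.sqrt(var_grouped)
```

Both formulas should agree up to floating-point rounding: about 969.52 for the ungrouped example (standard deviation about 31.137), and about 16.24 square inches for the grouped example (standard deviation about 4.03 inches). The shortcut form avoids computing each deviation from the mean, which is why the slides use it for hand calculation.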