Measures of Variability

OBJECTIVES
• To understand the different measures of variability
• To determine the range, variance, quartile deviation, mean deviation and standard deviation for ungrouped and grouped data

Measures of dispersion (variability or spread) describe the extent to which the observations vary.

MEASURES OF VARIATION
• Range
• Quartile deviation
• Mean deviation
• Variance
• Standard deviation

1. Range, R
The difference in value between the highest-valued observation, $H$, and the lowest-valued observation, $L$:
\[ R = H - L \]
Example: for the data 3, 3, 5, 6, 8,
\[ R = H - L = 8 - 3 = 5 \]

2. Quartile Deviation, QD (semi-interquartile range)
One half of the difference between the third and the first quartiles:
\[ QD = \frac{Q_3 - Q_1}{2} \]

Solving for $Q_1$ and $Q_3$ (grouped data)
\[ Q_1 = ll + \left( \frac{\frac{N}{4} - cf}{f_i} \right) w \qquad Q_3 = ll + \left( \frac{\frac{3N}{4} - cf}{f_i} \right) w \]
where
$ll$ = lower boundary of the quantile class (for example, 78.5 for the class 79-82)
$cf$ = less-than cumulative frequency before the quantile class
$f_i$ = frequency of the quantile class
$w$ = class width, or size of the class interval
$N$ = total number of observations

Problem
The examination scores of 50 students in a statistics class gave the following values: $Q_3 = 75.43$ and $Q_1 = 54.24$. Determine the value of the quartile deviation or semi-interquartile range.

Solution
\[ QD = \frac{Q_3 - Q_1}{2} = \frac{75.43 - 54.24}{2} = 10.60 \]

Problem
Compute the value of the semi-interquartile range or quartile deviation. The performance ratings of 100 faculty members of a certain college are presented in a frequency distribution as follows:

Classes    f    <cf
71-74      3      3
75-78     10     13
79-82     13     26   (1st quartile class)
83-86     18     44
87-90     25     69
91-94     19     88   (3rd quartile class)
95-98     12    100

Solution (grouped data)
With $N/4 = 25$, the first quartile class is 79-82; with $3N/4 = 75$, the third quartile class is 91-94.
\[ Q_1 = ll + \left( \frac{\frac{N}{4} - cf}{f_{Q_1}} \right) w = 78.5 + \left( \frac{25 - 13}{13} \right)(4) = 82.19 \]
\[ Q_3 = ll + \left( \frac{\frac{3N}{4} - cf}{f_{Q_3}} \right) w = 90.5 + \left( \frac{75 - 69}{19} \right)(4) = 91.76 \]
Substitute:
\[ QD = \frac{Q_3 - Q_1}{2} = \frac{91.76 - 82.19}{2} \approx 4.79 \]

3. Mean Deviation, MD
Based on all items in a distribution.

For ungrouped data
\[ MD = \frac{\sum |x_i|}{n} \]
For grouped data
\[ MD = \frac{\sum f\,|x_i|}{n} \]
where
$MD$ = mean deviation
$f$ = frequency
$x_i = x - \bar{x}$, the deviation of a value from the mean
$x$ = the individual values
$\bar{x}$ = mean of the distribution
$n$ = sample size

4.
Variance, $s^2$
- the most commonly used measure of variability
- the square of the standard deviation

For ungrouped data
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n} \quad \text{or} \quad s^2 = \frac{\sum x_i^2}{n} \]
where
$s^2$ = variance of a set of observations
$x_i = x - \bar{x}$, the deviation of a score from the mean
$\bar{x}$ = the mean
$x$ = a score
$n$ = sample size

Note: The greater the variability of the observations in a data set, the greater the variance. If there is no variability, that is, if all observations are equal and hence all equal to the mean, then $s^2 = 0$.

For grouped data
\[ s^2 = \frac{\sum f x_i^2}{n} \quad \text{or} \quad s^2 = \frac{\sum f x^2}{n} - \left( \frac{\sum f x}{n} \right)^2 \]
where
$s^2$ = variance of a set of observations
$f$ = frequency
$x_i = x - \bar{x}$, the deviation of a score from the mean
$\bar{x}$ = the mean
$x$ = individual values in the distribution
$n$ = sample size

5. Standard Deviation, $s$
The positive square root of the variance:
\[ s = \sqrt{s^2} \]

Problem
Find the (a) range, (b) quartile deviation, (c) mean deviation, (d) variance and (e) standard deviation.

Student    1    2    3    4    5    6    7    8    9   10
Score     50   48   72   67   71   65   73   62   64   60

The lowest value is 48 and the highest value is 73.

(a) Range, R
\[ R = H - L = 73 - 48 = 25 \]

(b) Quartile Deviation, QD
Arrangement in ascending order: 48 50 60 62 64 65 67 71 72 73
Using method 3 for finding $Q_n$ (ungrouped data):
$Q_1$ is located at $n/4 = 10/4 = 2.5$, so $Q_1 = (50 + 60)/2 = 55$
$Q_3$ is located at $3n/4 = 3(10)/4 = 7.5$, so $Q_3 = (67 + 71)/2 = 69$
\[ QD = \frac{Q_3 - Q_1}{2} = \frac{69 - 55}{2} = 7 \]

(c) Mean Deviation, MD
First, solve for the mean (ungrouped data):
\[ \bar{x} = \frac{73 + 72 + 71 + 67 + 65 + 64 + 62 + 60 + 50 + 48}{10} = \frac{632}{10} = 63.20 \]

Data for the mean deviation and the variance

Score, x    x_i = x - mean    x_i^2
   73             9.8          96.04
   72             8.8          77.44
   71             7.8          60.84
   67             3.8          14.44
   65             1.8           3.24
   64             0.8           0.64
   62            -1.2           1.44
   60            -3.2          10.24
   50           -13.2         174.24
   48           -15.2         231.04
 TOTAL           65.6 *       669.60

* The total of the deviation column is the sum of the absolute deviations, $\sum |x_i|$.

\[ MD = \frac{\sum |x_i|}{n} = \frac{65.6}{10} = 6.56 \]

(d) Variance, $s^2$
\[ s^2 = \frac{\sum x_i^2}{n} = \frac{669.6}{10} = 66.96 \]

(e) Standard Deviation, $s$
\[ s = \sqrt{s^2} = \sqrt{66.96} = 8.18 \]

Problem
The following are marks obtained by a group of 40 university students on an English examination:

42 88 37 75 98 93 73 62 96 80
52 76 66 73 69 54 83 62 53 79
69 56 81 75 52 65 49 80 67 59
88 80 44 71 87 82 89 79 72 91

Find the following:
a. range
b. quartile deviation
c.
mean deviation
d. variance
e. standard deviation

Solution

a. Range, R
\[ R = H - L = 98 - 37 = 61 \]

b. Quartile Deviation, QD
\[ QD = \frac{Q_3 - Q_1}{2} \]

English scores of 40 university students

Classes    f    <cf
95-99      2    40
90-94      2    38
85-89      4    36
80-84      6    32
75-79      5    26
70-74      4    21
65-69      5    17
60-64      2    12
55-59      2    10
50-54      4     8
45-49      1     4
40-44      2     3
35-39      1     1

Solve for $Q_3$ (with $3n/4 = 30$, the third quartile class is 80-84):
\[ Q_3 = ll + \left( \frac{\frac{3n}{4} - cf}{f_{Q_3}} \right) w = 79.5 + \left( \frac{30 - 26}{6} \right)(5) = 82.83 \]
Solve for $Q_1$ (with $n/4 = 10$, the first quartile class is 55-59):
\[ Q_1 = ll + \left( \frac{\frac{n}{4} - cf}{f_{Q_1}} \right) w = 54.5 + \left( \frac{10 - 8}{2} \right)(5) = 59.50 \]
Substitute:
\[ QD = \frac{Q_3 - Q_1}{2} = \frac{82.83 - 59.50}{2} = 11.67 \]

Data for the mean deviation and the variance ($x$ = class mark, $x_i = x - \bar{x}$)

Class interval    x     f     fx    x_i   f|x_i|   f x_i^2
95-99            97     2    194     26     52      1352
90-94            92     2    184     21     42       882
85-89            87     4    348     16     64      1024
80-84            82     6    492     11     66       726
75-79            77     5    385      6     30       180
70-74            72     4    288      1      4         4
65-69            67     5    335     -4     20        80
60-64            62     2    124     -9     18       162
55-59            57     2    114    -14     28       392
50-54            52     4    208    -19     76      1444
45-49            47     1     47    -24     24       576
40-44            42     2     84    -29     58      1682
35-39            37     1     37    -34     34      1156
Total                  40   2840           516      9660

c. Mean Deviation, MD
\[ \bar{x} = \frac{\sum f x}{n} = \frac{2840}{40} = 71 \]
With $x_i = x - \bar{x}$ taken from the table:
\[ MD = \frac{\sum f\,|x_i|}{n} = \frac{516}{40} = 12.9 \]

d. Variance, $s^2$
\[ s^2 = \frac{\sum f x_i^2}{n} = \frac{9660}{40} = 241.5 \]

e.
Standard Deviation, $s$
The standard deviation, $s$, is the positive square root of the variance:
\[ s = \sqrt{s^2} = \sqrt{241.5} = 15.54 \]

New Topic

Objectives
• To know the measures of skewness and kurtosis
• To find the Pearsonian coefficient of skewness

Measures of Skewness
Measures of skewness summarize the extent to which the observations are symmetrically distributed.

Skewness
• the degree to which a distribution departs from symmetry about its mean value
• refers to asymmetry (or "tapering") in the distribution of sample data

Positive skew
• the right tail is longer
• the mass of the distribution is concentrated on the left of the figure
• the distribution has a few relatively high values
• the distribution is said to be right-skewed
• mean > median > mode
• the skewness is greater than zero

Negative skew
• the left tail is longer
• the mass of the distribution is concentrated on the right of the figure
• the distribution has a few relatively low values
• the distribution is said to be left-skewed
• mean < median < mode
• the skewness is lower than zero

No skew
• the distribution is symmetric, like the bell-shaped normal curve
• mean = median = mode, that is, $\bar{x} = \tilde{x} = \hat{x}$

Pearsonian coefficient of skewness
\[ S_k = \frac{3(\bar{x} - \tilde{x})}{s} \quad \text{or} \quad S_k = \frac{\bar{x} - \hat{x}}{s} \]
where
$S_k$ = Pearsonian coefficient of skewness
$\bar{x}$ = mean
$\tilde{x}$ = median
$\hat{x}$ = mode
$s$ = standard deviation

Skewness based on quartiles
\[ S_k = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1} \]
where
$Q_1$ = 1st quartile
$Q_2$ = 2nd quartile
$Q_3$ = 3rd quartile

Interpretation
If skewness is positive, the data are positively skewed or skewed right, meaning that the right tail of the distribution is longer than the left. If skewness is negative, the data are negatively skewed or skewed left, meaning that the left tail is longer.

If skewness = 0, the data are perfectly symmetrical. But a skewness of exactly zero is quite unlikely for real-world data, so how can you interpret the skewness number? In the classic Principles of Statistics (1965), M.G. Bulmer suggests this rule of thumb:
• If skewness is less than −1 or greater than +1, the distribution is highly skewed.
• If skewness is between −1 and −½ or between +½ and +1, the distribution is moderately skewed.
• If skewness is between −½ and +½, the distribution is approximately symmetric.
Example: With a skewness of −0.1082, the sample data are approximately symmetric.

Problem
Find the Pearsonian coefficient of skewness of the set of data shown in the following table:

Scores of ten students in a mathematics ability test

Student    1    2    3    4    5    6    7    8    9   10
Score     50   48   72   67   71   65   73   62   64   60

Computed values (refer to the previous computations)
\[ \bar{x} = 63.20 \qquad s = 8.18 \]
The median is taken from the scores arranged in ascending order (48 50 60 62 64 65 67 71 72 73), as the average of the 5th and 6th values:
\[ \tilde{x} = Mdn = \frac{64 + 65}{2} = 64.5 \]
\[ S_k = \frac{3(\bar{x} - \tilde{x})}{s} = \frac{3(63.2 - 64.5)}{8.18} = -0.48 \]

Interpretation
• The negative sign means the tail extends to the left.
• The mean is less than the median; three times their difference is about 0.48 standard deviations.
• The value lies between −½ and +½, so the data are approximately symmetric.

Problem
Find the Pearsonian coefficient of skewness for the frequency distribution of the English scores of the 40 university students, for which
\[ \bar{x} = 71 \qquad s = 15.54 \]
The modal class is 80-84 ($f = 6$); the class below it, 75-79, has $f = 5$ and the class above it, 85-89, has $f = 4$.
\[ \hat{x} = Mo = ll + \left( \frac{d_1}{d_1 + d_2} \right) w \]
where
$ll$ = lower boundary of the modal class
$d_1$ = frequency of the modal class minus the frequency of the class below it, $6 - 5 = 1$
$d_2$ = frequency of the modal class minus the frequency of the class above it, $6 - 4 = 2$
$w$ = class width
\[ \hat{x} = 79.5 + \left( \frac{1}{1 + 2} \right)(5) = 81.17 \]
\[ S_k = \frac{\bar{x} - \hat{x}}{s} = \frac{71 - 81.17}{15.54} = -0.654 \]

Interpretation
• The negative computed value means the mean is less than the mode, here by about 0.65 standard deviations, and the tail extends to the left.
• The value lies between −1 and −½, so by the rule of thumb above the distribution is moderately skewed to the left.

Problem
Find the Pearsonian coefficient of skewness for the distribution whose mean is $\bar{x} = 20.5$, mode is $\hat{x} = 18.6$ and standard deviation is $s = 5$.

Solution
\[ S_k = \frac{\bar{x} - \hat{x}}{s} = \frac{20.5 - 18.6}{5} = 0.38 \]

Interpretation
• The positive sign indicates that the tail of the distribution extends to the right.
• The computed value means the mean is greater than the mode by 0.38 standard deviations.
• This is considered negligible skewness.

Measures of Kurtosis
Kurtosis is the degree of peakedness (or flatness) of a distribution.

Standardized kurtosis measure
\[ m_4' = \frac{m_4}{s^4} \qquad \text{where} \qquad m_4 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n - 1} \]
and $s = \sqrt{s^2}$ is the standard deviation. In this formula $x_i$ denotes the $i$-th observation itself, not its deviation from the mean.

Types of Kurtosis

Mesokurtic distribution
• a normal distribution, neither too peaked nor too flat
• its kurtosis (Ku) is equal to 3

Leptokurtic distribution
• has a higher peak than the normal distribution, with narrow humps and heavier tails
• its kurtosis (Ku) is higher than 3

Platykurtic distribution
• has a lower peak than a normal distribution
• a flat distribution, with values evenly distributed about the center, broad humps and short tails
• its kurtosis (Ku) is less than 3
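
The measures defined in this deck can be checked numerically. The following is a minimal Python sketch, not part of the original slides, that recomputes them for the ten ungrouped scores used in the worked problems. The names `quartile` and `describe` are hypothetical. Three assumptions are made: the quartile rule averages the two ordered values on either side of position kn/4, which is one reading of the deck's method; the variance divides by n, as in the worked example; and the fourth moment is assumed to divide by n − 1.

```python
import math


def quartile(sorted_x, k):
    """k-th quartile (k = 1, 2, 3) of an ascending list.

    Assumed rule: locate the 1-based position k*n/4 and average the value
    at the rank at or just below it with the next value. Needs n >= 4.
    """
    pos = k * len(sorted_x) / 4
    lo = math.floor(pos)                      # 1-based rank at or below pos
    return (sorted_x[lo - 1] + sorted_x[lo]) / 2


def describe(scores):
    """Return the deck's measures of variability and shape for raw scores."""
    x = sorted(scores)
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]               # deviations x - mean
    q1, median, q3 = (quartile(x, k) for k in (1, 2, 3))
    variance = sum(d * d for d in dev) / n    # divides by n, as in the worked example
    sd = math.sqrt(variance)
    m4 = sum(d ** 4 for d in dev) / (n - 1)   # assumption: n - 1 denominator
    return {
        "range": x[-1] - x[0],
        "quartile_deviation": (q3 - q1) / 2,
        "mean_deviation": sum(abs(d) for d in dev) / n,
        "variance": variance,
        "standard_deviation": sd,
        "pearson_skewness": 3 * (mean - median) / sd,  # median-based form
        "kurtosis": m4 / sd ** 4,
    }


scores = [50, 48, 72, 67, 71, 65, 73, 62, 64, 60]
result = describe(scores)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The median here is the second quartile of the sorted scores, 64.5. Dividing the variance by n − 1 instead of n would give the usual sample variance and slightly different values for the standard deviation, skewness and kurtosis.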