“JUST THE MATHS”

UNIT NUMBER 18.3

STATISTICS 3
(Measures of dispersion (or scatter))

by A.J. Hobson

18.3.1 Introduction
18.3.2 The mean deviation
18.3.3 Practical calculation of the mean deviation
18.3.4 The root mean square (or standard) deviation
18.3.5 Practical calculation of the standard deviation
18.3.6 Other measures of dispersion
18.3.7 Exercises
18.3.8 Answers to exercises

UNIT 18.3 - STATISTICS 3

MEASURES OF DISPERSION (OR SCATTER)

18.3.1 INTRODUCTION

Averages typify a whole collection of values, but they give little information about how the values are distributed within the whole collection.

For example, 99.9, 100.0, 100.1 is a collection which has an arithmetic mean of 100.0, and so is 99.0, 100.0, 101.0; but the second collection is more widely dispersed than the first.

It is the purpose of this Unit to examine two types of quantity which typify the distance of all the values in a collection from their arithmetic mean. They are known as measures of dispersion (or scatter) and the smaller these quantities are, the more clustered are the values around the arithmetic mean.

18.3.2 THE MEAN DEVIATION

If the $n$ values $x_1, x_2, x_3, \ldots, x_n$ have an arithmetic mean of $\bar{x}$, then $x_1 - \bar{x}, x_2 - \bar{x}, x_3 - \bar{x}, \ldots, x_n - \bar{x}$ are called the “deviations” of $x_1, x_2, x_3, \ldots, x_n$ from the arithmetic mean.

Note:
The deviations add up to zero since

\[ \sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \bar{x} = n\bar{x} - n\bar{x} = 0. \]

DEFINITION
The “mean deviation” (or, more accurately, the “mean absolute deviation”) is defined by the formula

\[ \mathrm{M.D.} = \frac{1}{n} \sum_{i=1}^{n} |x_i - \bar{x}|. \]

18.3.3 PRACTICAL CALCULATION OF MEAN DEVIATION

In calculating a mean deviation, the following short-cuts usually turn out to be useful, especially for larger collections of values:

(a) If a constant, $k$, is subtracted from each of the values $x_i$ ($i = 1, 2, 3, \ldots, n$), and also we use the “fictitious” arithmetic mean, $\bar{x} - k$, in the formula, then the mean deviation is unaffected.

Proof:

\[ \frac{1}{n} \sum_{i=1}^{n} |x_i - \bar{x}| = \frac{1}{n} \sum_{i=1}^{n} |(x_i - k) - (\bar{x} - k)|. \]
(b) If we divide each of the values $x_i$ ($i = 1, 2, 3, \ldots, n$) by a positive constant, $l$, and also we use the “fictitious” arithmetic mean, $\bar{x}/l$, then the mean deviation will be divided by $l$.

Proof:

\[ \frac{1}{ln} \sum_{i=1}^{n} |x_i - \bar{x}| = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{x_i}{l} - \frac{\bar{x}}{l} \right|. \]

Summary
If we code the data using both a subtraction by $k$ and a division by $l$, the value obtained from the mean deviation formula needs to be multiplied by $l$ to give the correct value.

18.3.4 THE ROOT MEAN SQUARE (OR STANDARD) DEVIATION

A more common method of measuring dispersion, which ensures that negative deviations from the arithmetic mean do not tend to cancel out positive deviations, is to determine the arithmetic mean of their squares, and then take the square root.

DEFINITION
The “root mean square deviation” (or “standard deviation”) is defined by the formula

\[ \mathrm{R.M.S.D.} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}. \]

Notes:
(i) The root mean square deviation is usually denoted by the symbol $\sigma$.
(ii) The quantity $\sigma^2$ is called the “variance”.

18.3.5 PRACTICAL CALCULATION OF THE STANDARD DEVIATION

In calculating a standard deviation, the following short-cuts usually turn out to be useful, especially for larger collections of values:

(a) If a constant, $k$, is subtracted from each of the values $x_i$ ($i = 1, 2, 3, \ldots, n$), and also we use the “fictitious” arithmetic mean, $\bar{x} - k$, in the formula, then $\sigma$ is unaffected.

Proof:

\[ \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[ (x_i - k) - (\bar{x} - k) \right]^2}. \]

(b) If we divide each of the values $x_i$ ($i = 1, 2, 3, \ldots, n$) by a positive constant, $l$, and also we use the “fictitious” arithmetic mean, $\bar{x}/l$, then $\sigma$ will be divided by $l$.

Proof:

\[ \frac{1}{l} \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i}{l} - \frac{\bar{x}}{l} \right)^2}. \]

Summary
If we code the data using both a subtraction by $k$ and a division by $l$, the value obtained from the standard deviation formula needs to be multiplied by $l$ to give the correct value, $\sigma$.
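The two Summary statements on coding can be checked numerically. The following is a minimal sketch in Python, not part of the original Unit; the function names and the choice $k = 100$, $l = 0.1$ are illustrative assumptions, and the data are the first collection quoted in 18.3.1.

```python
import math

def mean_deviation(values):
    """Mean (absolute) deviation about the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

def standard_deviation(values):
    """Root mean square deviation (divisor n, as in 18.3.4)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def code(values, k, l):
    """Code the data: subtract k from each value, then divide by l."""
    return [(x - k) / l for x in values]

data = [99.9, 100.0, 100.1]   # first collection from 18.3.1
k, l = 100.0, 0.1             # illustrative choice of coding constants
coded = code(data, k, l)      # approximately [-1, 0, 1]

# The mean of the coded values is the "fictitious" mean, so applying the
# same formulae to the coded data and multiplying by l should recover
# the uncoded results.
md_direct = mean_deviation(data)
md_from_coded = l * mean_deviation(coded)
sd_direct = standard_deviation(data)
sd_from_coded = l * standard_deviation(coded)

print(md_direct, md_from_coded)
print(sd_direct, sd_from_coded)
```

Each printed pair should agree to within floating-point rounding, which is why the comparison is best made with a tolerance rather than exact equality.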
(c) For the calculation of the standard deviation, whether by coding or not, a more convenient formula may be obtained by expanding out the expression $(x_i - \bar{x})^2$ as follows:

\[ \sigma^2 = \frac{1}{n} \left[ \sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar{x}^2 \right]. \]

That is, since $\sum_{i=1}^{n} x_i = n\bar{x}$ and $\sum_{i=1}^{n} \bar{x}^2 = n\bar{x}^2$,

\[ \sigma^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - 2\bar{x}^2 + \bar{x}^2, \]

which gives the formula

\[ \sigma = \sqrt{\frac{1}{n} \left( \sum_{i=1}^{n} x_i^2 \right) - \bar{x}^2}. \]

Note:
In advanced statistical work, the above formulae for standard deviation are used only for descriptive problems in which we know every member of a collection of observations.

For inference problems, it may be shown that the standard deviation of a sample, calculated with the divisor $n$, tends to underestimate that of the total population; and the basic formula used for a sample is

\[ \sigma = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}. \]

18.3.6 OTHER MEASURES OF DISPERSION

We mention here, briefly, two other measures of dispersion:

(i) The Range
This is the difference between the largest and the smallest members of a collection of values.

(ii) The Coefficient of Variation
This is a quantity which expresses the standard deviation as a percentage of the arithmetic mean. It is given by the formula

\[ \mathrm{C.V.} = \frac{\sigma}{\bar{x}} \times 100. \]

EXAMPLE

The following grouped frequency distribution table shows the diameters of 98 rivets, where $x_i' = (x_i - 6.61) \div 0.02$:

Class          Cls. Mid   Freq.   Cum.     x_i'   f_i x_i'   x_i'^2   f_i x_i'^2   f_i |x_i' − x̄'|
Intvl.         Pt. x_i    f_i     Freq.
6.60 − 6.62    6.61        1       1        0        0         0          0            0.58
6.62 − 6.64    6.63        4       5        1        4         1          4            2.40
6.64 − 6.66    6.65        6      11        2       12         4         24            3.72
6.66 − 6.68    6.67       12      23        3       36         9        108            7.68
6.68 − 6.70    6.69        5      28        4       20        16         80            3.30
6.70 − 6.72    6.71       10      38        5       50        25        250            6.80
6.72 − 6.74    6.73       17      55        6      102        36        612           11.90
6.74 − 6.76    6.75       10      65        7       70        49        490            7.20
6.76 − 6.78    6.77       14      79        8      112        64        896           10.36
6.78 − 6.80    6.79        9      88        9       81        81        729            6.84
6.80 − 6.82    6.81        7      95       10       70       100        700            5.46
6.82 − 6.84    6.83        2      97       11       22       121        242            1.60
6.84 − 6.86    6.85        1      98       12       12       144        144            0.82
Totals                    98                       591                  4279           68.66

Estimate the arithmetic mean, the standard deviation and the mean (absolute) deviation of these diameters.
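Solution

\[ \text{Fictitious arithmetic mean} = \frac{591}{98} \simeq 6.03 \]

\[ \text{Actual arithmetic mean} = 6.03 \times 0.02 + 6.61 \simeq 6.73 \]

\[ \text{Fictitious standard deviation} = \sqrt{\frac{4279}{98} - 6.03^2} \simeq 2.70 \]

\[ \text{Actual standard deviation} = 2.70 \times 0.02 \simeq 0.054 \]

\[ \text{Fictitious mean deviation} = \frac{1}{98} \sum f_i |x_i' - \bar{x}'| \simeq \frac{215.4}{98} \simeq 2.20 \]

\[ \text{Actual mean deviation} \simeq 2.20 \times 0.02 \simeq 0.044 \]

Note:
The entries in the final column of the table, which total 68.66, are equal to $f_i |x_i - 6.03|$; that is, they combine the uncoded mid-points with the coded mean. Using the coded values $x_i' = 0, 1, 2, \ldots, 12$ with $\bar{x}' \simeq 6.03$ gives $\sum f_i |x_i' - \bar{x}'| \simeq 215.4$, which is the total used above. The total 68.66 would lead to a mean deviation of about 0.014, which is too small to be consistent with a standard deviation of 0.054.

18.3.7 EXERCISES

1. For the collection of numbers

6.5, 8.3, 4.7, 9.2, 11.3, 8.5, 9.5, 9.2

calculate (correct to two places of decimals) the arithmetic mean, the standard deviation and the mean (absolute) deviation.

2. Estimate the arithmetic mean, the standard deviation and the mean (absolute) deviation for the following grouped frequency distribution table:

Class Interval   Frequency
10 − 30           5
30 − 50           8
50 − 70          12
70 − 90          18
90 − 110          3
110 − 130         2

3. The following table shows the registered speeds of 100 speedometers at 30 m.p.h.

Regd. Speed    Frequency
27.5 − 28.5     2
28.5 − 29.5     9
29.5 − 30.5    17
30.5 − 31.5    26
31.5 − 32.5    24
32.5 − 33.5    16
33.5 − 34.5     5
34.5 − 35.5     1

Estimate the arithmetic mean, the standard deviation and the coefficient of variation.

18.3.8 ANSWERS TO EXERCISES

1. Arithmetic mean ≃ 8.40, standard deviation ≃ 1.88, mean deviation ≃ 1.43

2. Arithmetic mean ≃ 65, standard deviation ≃ 24.66, mean deviation ≃ 20.21

3. Arithmetic mean ≃ 31.34, standard deviation ≃ 1.44, coefficient of variation ≃ 4.61.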
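The answer to Exercise 2 can be checked with a short calculation. The following is a minimal sketch in Python, not part of the original Unit. It assumes, as in the worked example, that each class is represented by its mid-point; the function name `grouped_summary` and the coding constants $k = 20$, $l = 20$ are illustrative choices.

```python
import math

# Exercise 2: class mid-points (assumed representative values) and frequencies.
midpoints = [20, 40, 60, 80, 100, 120]
freqs = [5, 8, 12, 18, 3, 2]

def grouped_summary(xs, fs, k=0.0, l=1.0):
    """Return (mean, standard deviation, mean deviation) for grouped data.

    The values are coded as x' = (x - k) / l before the sums are formed;
    the "fictitious" results are then converted back to actual values.
    """
    n = sum(fs)
    coded = [(x - k) / l for x in xs]
    mean_c = sum(f * x for f, x in zip(fs, coded)) / n          # fictitious mean
    mean_sq_c = sum(f * x * x for f, x in zip(fs, coded)) / n   # mean of squares
    sd_c = math.sqrt(mean_sq_c - mean_c ** 2)                   # formula from 18.3.5(c)
    md_c = sum(f * abs(x - mean_c) for f, x in zip(fs, coded)) / n
    # Undo the coding: the mean is scaled and shifted, the dispersions only scaled.
    return l * mean_c + k, l * sd_c, l * md_c

mean, sd, md = grouped_summary(midpoints, freqs, k=20, l=20)
print(round(mean, 2), round(sd, 2), round(md, 2))   # 65.0 24.66 20.21
```

Calling `grouped_summary(midpoints, freqs)` with the default constants performs the same calculation without coding and should give the same three values to within rounding.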