Dispersion refers to the extent to which the items vary from one another and from the central value. Measures of dispersion (or variation) measure only the degree of variation, not its direction. They are also called averages of the second order because they are based on the deviations of the individual values from the mean or another measure of central tendency; those measures are themselves called averages of the first order.

In the words of Bowley, “Dispersion is the measure of the variation of the items.”
According to Connor, “Dispersion is a measure of the extent to which the individual items vary.”

METHODS OF MEASURING DISPERSION
• Range
• Interquartile range and quartile deviation
• Mean deviation
• Standard deviation
• Coefficient of variation
• Lorenz curve

RANGE
Range is the simplest method of measuring dispersion. It is defined as the difference between the largest and the smallest observations in a given set of data:
$R = L - S$
where R = range, L = largest value, S = smallest value.
The relative measure of range, called the coefficient of range, is defined as:
$\text{Coefficient of range} = \frac{L - S}{L + S}$

Range can be calculated for three kinds of series:
I. Individual series
II. Discrete series
III. Continuous series
Each is illustrated with an example below.

Ex 1. Five students obtained the following marks in statistics: 20, 35, 25, 30, 15. Find the range and coefficient of range.
Sol. Here L = 35, S = 15.
Range = L - S = 35 - 15 = 20
Coefficient of range = $\frac{L - S}{L + S} = \frac{35 - 15}{35 + 15} = \frac{20}{50} = 0.40$

Ex 2. Find the range and coefficient of range from the following data:
Marks             10   20   30   40   50   60   70
No. of students   15   18   25   30   16   10    9
Sol. Here L = 70, S = 10.
Range = L - S = 70 - 10 = 60
Coefficient of range = $\frac{L - S}{L + S} = \frac{70 - 10}{70 + 10} = \frac{60}{80} = 0.75$

Ex 3. Find the range and coefficient of range from the series:
Size        5-10   10-15   15-20   20-25   25-30
Frequency      4       9      15      30      40
Sol.
Range = L - S. Here L = upper limit of the largest class = 30 and S = lower limit of the smallest class = 5.
Range = 30 - 5 = 25
Coefficient of range = $\frac{L - S}{L + S} = \frac{30 - 5}{30 + 5} = \frac{25}{35} = \frac{5}{7} \approx 0.71$

INTERQUARTILE RANGE AND QUARTILE DEVIATION
The interquartile range is the difference between the upper quartile ($Q_3$) and the lower quartile ($Q_1$):
$\text{Interquartile range} = Q_3 - Q_1$
The quartile deviation is half of that difference:
$\text{Quartile deviation} = \frac{Q_3 - Q_1}{2}$
The relative measure of quartile deviation, called the coefficient of quartile deviation, is defined as:
$\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$

These measures can be calculated for three kinds of series:
I. Individual series
II. Discrete series
III. Continuous series
Each is illustrated with an example below.

Ex 1. Find the interquartile range, quartile deviation and coefficient of quartile deviation from the data 28, 18, 20, 24, 27, 30, 15.
Sol. Arrange the data in ascending order: 15, 18, 20, 24, 27, 28, 30 (n = 7).
$Q_1$ = size of the $\frac{n+1}{4}$th item = size of the $\frac{7+1}{4}$th item = size of the 2nd item = 18
$Q_3$ = size of the $3\left(\frac{n+1}{4}\right)$th item = size of the $3\left(\frac{7+1}{4}\right)$th item = size of the 6th item = 28
Interquartile range = $Q_3 - Q_1$ = 28 - 18 = 10
Quartile deviation = $\frac{Q_3 - Q_1}{2} = \frac{10}{2} = 5$
Coefficient of quartile deviation = $\frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{28 - 18}{28 + 18} = \frac{10}{46} = 0.217$

Ex 2. Find the interquartile range, quartile deviation and coefficient of quartile deviation from the following data:
Wages            10   20   30   40   50   60
No. of workers    2    8   20   35   42   20
Sol. N = 127, and the cumulative frequencies are 2, 10, 30, 65, 107, 127.
$Q_1$ = size of the $\frac{N+1}{4}$th item = size of the $\frac{127+1}{4}$th item = size of the 32nd item = 40
$Q_3$ = size of the $3\left(\frac{N+1}{4}\right)$th item = size of the $3\left(\frac{127+1}{4}\right)$th item = size of the 96th item = 50
Interquartile range = $Q_3 - Q_1$ = 50 - 40 = 10
Quartile deviation = $\frac{Q_3 - Q_1}{2} = \frac{50 - 40}{2} = 5$
Coefficient of quartile deviation = $\frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{50 - 40}{50 + 40} = 0.11$

Ex 3. Find the interquartile range, quartile deviation and coefficient of quartile deviation from the following data:
Age (years)   0-20   20-40   40-60   60-80   80-100
Persons          4      10      15      20       11
Sol.
Age (years)   f    cf
0-20          4     4
20-40        10    14
40-60        15    29
60-80        20    49
80-100       11    60
N = 60
$Q_1$ = size of the $\frac{N}{4}$th item = $\frac{60}{4}$th item = 15th item, so $Q_1$ lies in the class 40-60.
$Q_1 = l_1 + \frac{N/4 - cf}{f} \times i = 40 + \frac{15 - 14}{15} \times 20 = 41.33$
$Q_3$ = size of the $3\left(\frac{N}{4}\right)$th item = $3\left(\frac{60}{4}\right)$th item = 45th item, so $Q_3$ lies in the class 60-80.
$Q_3 = l_1 + \frac{3N/4 - cf}{f} \times i$
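$= 60 + \frac{45 - 29}{20} \times 20 = 76$
Interquartile range = $Q_3 - Q_1$ = 76 - 41.33 = 34.67
Quartile deviation = $\frac{Q_3 - Q_1}{2} = \frac{76 - 41.33}{2} = 17.33$
Coefficient of quartile deviation = $\frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{76 - 41.33}{76 + 41.33} \approx 0.295$

MEAN DEVIATION
Mean deviation is also known as average deviation. Deviations may be taken from any average, usually the mean, the median or the mode. The signs of the deviations are ignored, so every deviation is treated as positive.

Formulae to calculate mean deviation
M.D. from mean: $\text{M.D.}_{\bar{X}} = \frac{\sum |X - \bar{X}|}{N}$
M.D. from median: $\text{M.D.}_{M} = \frac{\sum |X - M|}{N}$

Coefficient of mean deviation
$\text{Coefficient of M.D.}_{\bar{X}} = \frac{\text{M.D.}_{\bar{X}}}{\bar{X}}$
$\text{Coefficient of M.D.}_{M} = \frac{\text{M.D.}_{M}}{M}$

• Individual Series
$\text{M.D.}_{\bar{X}} = \frac{\sum |d_{\bar{X}}|}{N}$, $\text{M.D.}_{M} = \frac{\sum |d_{M}|}{N}$
where $d_{\bar{X}} = X - \bar{X}$ and $d_{M} = X - M$.

Example: Calculate the mean deviation from the mean and from the median, and the coefficients of mean deviation, from the following data:
Marks: 20, 22, 25, 38, 40, 50, 65, 70, 75
Solution:
$\bar{X} = \frac{\sum X}{N} = \frac{405}{9} = 45$
Median = size of the $\frac{N+1}{2}$th item = size of the $\frac{9+1}{2}$th item = size of the 5th item = 40

Marks (X)   Deviation from mean 45, $|d_{\bar{X}}|$   Deviation from median 40, $|d_{M}|$
20          25                                        20
22          23                                        18
25          20                                        15
38           7                                         2
40           5                                         0
50           5                                        10
65          20                                        25
70          25                                        30
75          30                                        35
N = 9, $\sum X = 405$, $\sum |d_{\bar{X}}| = 160$, $\sum |d_{M}| = 155$

M.D. from mean = $\frac{\sum |d_{\bar{X}}|}{N} = \frac{160}{9} = 17.78$
Coefficient of $\text{M.D.}_{\bar{X}} = \frac{\text{M.D.}_{\bar{X}}}{\bar{X}} = \frac{17.78}{45} \approx 0.395$
M.D. from median = $\frac{\sum |d_{M}|}{N} = \frac{155}{9} = 17.22$
Coefficient of $\text{M.D.}_{M} = \frac{\text{M.D.}_{M}}{M} = \frac{17.22}{40} = 0.43$

• Discrete Series
$\text{M.D.}_{\bar{X}} = \frac{\sum f|d_{\bar{X}}|}{N}$, $\text{M.D.}_{M} = \frac{\sum f|d_{M}|}{N}$

Example: Calculate the mean deviation from the median and from the mean, and their coefficients, from the following table:
X:           20   30   40   50   60   70
Frequency:    8   12   20   10    6    4
Solution:
M = size of the $\frac{N+1}{2}$th item = size of the $\frac{60+1}{2}$th item = size of the 30.5th item, so M = 40.
$\bar{X} = \frac{\sum fX}{N} = \frac{2460}{60} = 41$

Calculation of mean deviation from median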
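X    f    c.f.   $|d_M| = |X - 40|$   $f|d_M|$
20    8    8     20                   160
30   12   20     10                   120
40   20   40      0                     0
50   10   50     10                   100
60    6   56     20                   120
70    4   60     30                   120
N = 60, M = 40, $\sum f|d_M| = 620$

$\text{M.D.}_{M} = \frac{\sum f|d_M|}{N} = \frac{620}{60} = 10.33$
Coefficient of $\text{M.D.}_{M} = \frac{\text{M.D.}_{M}}{M} = \frac{10.33}{40} = 0.258$

Calculation of mean deviation from mean
X    f    fX    $|d_{\bar{X}}| = |X - 41|$   $f|d_{\bar{X}}|$
20    8   160   21                           168
30   12   360   11                           132
40   20   800    1                            20
50   10   500    9                            90
60    6   360   19                           114
70    4   280   29                           116
N = 60, $\sum fX = 2460$, $\bar{X} = 41$, $\sum f|d_{\bar{X}}| = 640$

$\text{M.D.}_{\bar{X}} = \frac{\sum f|d_{\bar{X}}|}{N} = \frac{640}{60} = 10.67$
Coefficient of $\text{M.D.}_{\bar{X}} = \frac{\text{M.D.}_{\bar{X}}}{\bar{X}} = \frac{10.67}{41} = 0.26$

• Continuous Series
$\text{M.D.}_{\bar{X}} = \frac{\sum f|d_{\bar{X}}|}{N}$, where $d_{\bar{X}} = m - \bar{X}$
$\text{M.D.}_{M} = \frac{\sum f|d_{M}|}{N}$, where $d_{M} = m - M$
Here m is the mid-value of each class.

Example: Calculate the mean deviation from the mean and its coefficient from the following data:
Marks             0-10   10-20   20-30   30-40   40-50
No. of students      5       8      15      16       6
Solution:
$\bar{X} = \frac{\sum fm}{N} = \frac{1350}{50} = 27$

Calculation of mean deviation from mean
Marks    f    m    fm    $|d_{\bar{X}}| = |m - 27|$   $f|d_{\bar{X}}|$
0-10     5    5    25    22                           110
10-20    8   15   120    12                            96
20-30   15   25   375     2                            30
30-40   16   35   560     8                           128
40-50    6   45   270    18                           108
N = 50, $\sum fm = 1350$, $\sum f|d_{\bar{X}}| = 472$

$\text{M.D.}_{\bar{X}} = \frac{\sum f|d_{\bar{X}}|}{N} = \frac{472}{50} = 9.44$
Coefficient of $\text{M.D.}_{\bar{X}} = \frac{\text{M.D.}_{\bar{X}}}{\bar{X}} = \frac{9.44}{27} \approx 0.35$

Example: Calculate the mean deviation from the median and its coefficient from the following data:
Size        100-120   120-140   140-160   160-180   180-200
Frequency         4         6        10         8         5
Solution:
Median = size of the $\frac{N}{2}$th item = $\frac{33}{2}$ = 16.5th item, so the median lies in the class 140-160.
$M = l_1 + \frac{N/2 - c.f.}{f} \times i = 140 + \frac{16.5 - 10}{10} \times 20 = 153$

Calculation of mean deviation from median
Size      f    cf    m     $|d_M| = |m - 153|$   $f|d_M|$
100-120    4    4   110    43                    172
120-140    6   10   130    23                    138
140-160   10   20   150     3                     30
160-180    8   28   170    17                    136
180-200    5   33   190    37                    185
N = 33, M = 153, $\sum f|d_M| = 661$

$\text{M.D.}_{M} = \frac{\sum f|d_M|}{N} = \frac{661}{33} = 20.03$
Coefficient of $\text{M.D.}_{M} = \frac{\text{M.D.}_{M}}{M} = \frac{20.03}{153} = 0.1309$

Short-cut method for mean deviation
If the value of the average comes out as a fraction, M.D. can be calculated with the following formulae, in which the subscript A marks the items above the average and the subscript B the items below it.
• Individual Series
$\text{M.D.} = \frac{\sum X_A - \sum X_B - (N_A - N_B)(\bar{X} \text{ or } M)}{N}$
• Discrete and Continuous Series
$\text{M.D.} = \frac{\sum fX_A - \sum fX_B - (\sum f_A - \sum f_B)(\bar{X} \text{ or } M)}{N}$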
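• Individual Series
Example: Using the short-cut method, calculate the mean deviations from the mean and from the median for the following data:
7, 9, 13, 13, 15, 17, 19, 21, 23
Solution:
$\sum X = 137$, N = 9, $\bar{X} = \frac{137}{9} = 15.22$
M = size of the $\frac{N+1}{2}$th item = size of the $\frac{9+1}{2}$th item = size of the 5th item = 15

From mean ($\bar{X} = 15.22$):
Items below the mean: 7, 9, 13, 13, 15, so $\sum X_B = 57$, $N_B = 5$
Items above the mean: 17, 19, 21, 23, so $\sum X_A = 80$, $N_A = 4$
$\text{M.D.}_{\bar{X}} = \frac{\sum X_A - \sum X_B - (N_A - N_B)\bar{X}}{N} = \frac{80 - 57 - (4 - 5)(15.22)}{9} = \frac{38.22}{9} = 4.25$

From median (M = 15):
Items below the median: 7, 9, 13, 13, so $\sum X_B = 42$, $N_B = 4$
Items above the median: 17, 19, 21, 23, so $\sum X_A = 80$, $N_A = 4$
$\text{M.D.}_{M} = \frac{\sum X_A - \sum X_B - (N_A - N_B)M}{N} = \frac{80 - 42 - (4 - 4)(15)}{9} = \frac{38}{9} = 4.22$

• Continuous Series
Example: Using the short-cut method, calculate the mean deviations from the mean and from the median for the following data:
Marks             0-10   10-20   20-30   30-40   40-50
No. of students      6      28      51      11       4

Calculation of mean deviation from mean
Marks    f    X (mid-value)   fX
0-10     6     5                30
10-20   28    15               420
20-30   51    25              1275
30-40   11    35               385
40-50    4    45               180
N = 100, $\sum fX = 2290$
$\bar{X} = \frac{\sum fX}{N} = \frac{2290}{100} = 22.9$
Classes below the mean (0-10, 10-20): $\sum f_B = 34$, $\sum fX_B = 450$
Classes above the mean (20-30, 30-40, 40-50): $\sum f_A = 66$, $\sum fX_A = 1840$
$\text{M.D.}_{\bar{X}} = \frac{\sum fX_A - \sum fX_B - (\sum f_A - \sum f_B)\bar{X}}{N} = \frac{1840 - 450 - (66 - 34)(22.9)}{100} = 6.572$

Calculation of mean deviation from median
Marks    f    cf    X (mid-value)   fX
0-10     6     6     5                30
10-20   28    34    15               420
20-30   51    85    25              1275
30-40   11    96    35               385
40-50    4   100    45               180
N = 100, $\sum fX = 2290$
The median is the 50th item, which lies in the class 20-30: $M = 20 + \frac{50 - 34}{51} \times 10 = 23.14$
Classes below the median: $\sum f_B = 34$, $\sum fX_B = 450$
Classes above the median: $\sum f_A = 66$, $\sum fX_A = 1840$
$\text{M.D.}_{M} = \frac{\sum fX_A - \sum fX_B - (\sum f_A - \sum f_B)M}{N} = \frac{1840 - 450 - (66 - 34)(23.14)}{100} = 6.4952$

STANDARD DEVIATION
The concept of standard deviation was first introduced by Karl Pearson in 1893. The standard deviation is the most useful and the most popular measure of dispersion. Just as the arithmetic mean is the most useful of all the averages, the standard deviation is the best of all measures of dispersion.

Calculation of Standard Deviation
Individual Series
For an individual series, the standard deviation can be computed by any of three methods.

1) Actual Mean Method
When deviations are taken from the actual mean, the following formula is used:
$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \sqrt{\frac{\sum x^2}{N}}$, where $x = X - \bar{X}$
Steps for Calculation
i. Calculate the actual mean ($\bar{X}$) of the series.
ii. Take the deviation of each item from the mean, i.e.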
find $X - \bar{X}$, and denote these deviations by x.
iii. Square these deviations and obtain the total, i.e. $\sum x^2$.
iv. Divide $\sum x^2$ by the total number of items N and take the square root. The result is the standard deviation.

2) Assumed Mean Method
When the actual mean is a fraction rather than a whole number, taking deviations from it and squaring them becomes laborious. To save time and labour we use the assumed mean method, also called the short-cut method. When deviations are taken from an assumed mean A, the following formula is used:
$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$, where $d = X - A$
Steps for Calculation
i. Take any item in the series as the assumed mean A.
ii. Take the deviations of the items from the assumed mean, i.e. X - A, and denote these deviations by d. Sum them to obtain $\sum d$.
iii. Square these deviations and obtain the total, i.e. $\sum d^2$.
iv. Substitute the values of $\sum d^2$, $\sum d$ and N in the above formula. The result is the standard deviation.

Example: Calculate the standard deviation from the following:
X = 16, 20, 18, 19, 20, 20, 28, 17, 22, 20
Solution: Calculation of Standard Deviation (actual mean $\bar{X} = 20$)
X     $x = X - \bar{X}$   $x^2$
16    -4                  16
20     0                   0
18    -2                   4
19    -1                   1
20     0                   0
20     0                   0
28     8                  64
17    -3                   9
22     2                   4
20     0                   0
N = 10, $\sum X = 200$, $\sum x = 0$, $\sum x^2 = 98$
$\sigma = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{98}{10}} = \sqrt{9.8} \approx 3.13$

Example: Calculate the standard deviation of the following series using the assumed mean method:
7, 10, 12, 13, 15, 20, 21, 28, 29, 35
Solution: Calculation of Standard Deviation (assumed mean A = 20)
X          $d = X - A$   $d^2$
7          -13           169
10         -10           100
12          -8            64
13          -7            49
15          -5            25
20 (= A)     0             0
21           1             1
28           8            64
29           9            81
35          15           225
N = 10, $\sum d = -10$, $\sum d^2 = 778$
$\sigma = \sqrt{\frac{778}{10} - \left(\frac{-10}{10}\right)^2} = \sqrt{77.8 - 1} = \sqrt{76.8} \approx 8.76$

3) Method Based on Use of Actual Data
When the observations are few, the standard deviation can be calculated directly from the actual data. The following formula is used:
$\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}$
Steps for Calculation
i. First find the sum of the items, i.e. $\sum X$.
ii. Square the values of X and add them to get $\sum X^2$.
iii. Substitute the values in the above formula; the result is the standard deviation.
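The three individual-series methods can be cross-checked with a short Python sketch. It assumes the usual population formulas (division by N, not N - 1); the function names `sd_actual_mean`, `sd_assumed_mean` and `sd_direct` are illustrative and do not come from the slides.

```python
import math

def sd_actual_mean(values):
    """Standard deviation from deviations about the actual mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def sd_assumed_mean(values, assumed):
    """Standard deviation from deviations about an assumed mean A."""
    n = len(values)
    d = [x - assumed for x in values]
    return math.sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)

def sd_direct(values):
    """Standard deviation from the raw sums of X and X squared."""
    n = len(values)
    return math.sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)

marks = [16, 20, 18, 19, 20, 20, 28, 17, 22, 20]
print(round(sd_actual_mean(marks), 2))       # 3.13
print(round(sd_assumed_mean(marks, 18), 2))  # 3.13, whatever A is chosen
print(round(sd_direct(marks), 2))            # 3.13
```

All three routes give the same value because the assumed-mean and direct formulas are algebraic rearrangements of the actual-mean formula; the choice between them only affects how much arithmetic is done by hand.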
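Example: Calculate the standard deviation from the following:
X = 16, 20, 18, 19, 20, 20, 28, 17, 22, 20
Solution: Calculation of Standard Deviation
X     $X^2$
16    256
20    400
18    324
19    361
20    400
20    400
28    784
17    289
22    484
20    400
N = 10, $\sum X = 200$, $\sum X^2 = 4098$
$\sigma = \sqrt{\frac{4098}{10} - \left(\frac{200}{10}\right)^2} = \sqrt{409.8 - 400} = \sqrt{9.8} \approx 3.13$

Discrete Series
There are three methods of finding the standard deviation, which is denoted by the symbol $\sigma$:
1) Actual mean method: $\sigma = \sqrt{\frac{\sum fx^2}{N}}$, where $x = X - \bar{X}$
2) Assumed mean method: $\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$, where $d = X - A$
3) Step deviation method: $\sigma = \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} \times i$, where $d' = \frac{X - A}{i}$ and i = common factor

Example: Calculate the standard deviation from the following data:
X:   10   20   30   40   50   60   70
f:    3    5    7    9    8    5    3
Solution (A = 40, i = 10):
X    f    d = X - 40   d'    fd'    fd'²
10   3    -30          -3    -9     27
20   5    -20          -2    -10    20
30   7    -10          -1    -7      7
40   9      0           0     0      0
50   8     10           1     8      8
60   5     20           2    10     20
70   3     30           3     9     27
N = 40, $\sum fd' = 1$, $\sum fd'^2 = 109$
$\sigma = \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} \times i = \sqrt{\frac{109}{40} - \left(\frac{1}{40}\right)^2} \times 10 = 16.5$

Continuous Series
In a continuous series we can use any of the three methods discussed above for discrete series, but the step deviation method is the one most commonly used. Its formula is:
$\sigma = \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} \times i$, where $d' = \frac{m - A}{i}$, m = mid-value and i = size of the class interval

Example: Calculate the standard deviation from the following data:
Marks             0-10   10-20   20-30   30-40   40-50   50-60   60-70   70-80
No. of students      5      10      20      40      30      20      10       4
Solution (A = 35, i = 10):
Marks    f    Mid-value (m)   d = m - 35   d'    fd'    fd'²
0-10      5    5              -30          -3    -15    45
10-20    10   15              -20          -2    -20    40
20-30    20   25              -10          -1    -20    20
30-40    40   35                0           0      0     0
40-50    30   45               10           1     30    30
50-60    20   55               20           2     40    80
60-70    10   65               30           3     30    90
70-80     4   75               40           4     16    64
N = 139, $\sum fd' = 61$, $\sum fd'^2 = 369$
$\sigma = \sqrt{\frac{369}{139} - \left(\frac{61}{139}\right)^2} \times 10 = 15.69$

Variance is the square of the standard deviation, i.e. variance = $(\text{S.D.})^2 = \sigma^2$.

COMBINED STANDARD DEVIATION
The combined standard deviation of two or more groups can be calculated by the formula:
$\sigma_{12} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$
where $d_1 = \bar{X}_1 - \bar{X}_{12}$ and $d_2 = \bar{X}_2 - \bar{X}_{12}$.

Example:
                     Series A   Series B
Mean                 50         40
Standard deviation   5          6
No. of items         100        150
Find the combined mean and standard deviation of the two series.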
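As a numerical cross-check on the hand solution that follows, the combined-mean and combined-standard-deviation formulas can be sketched in Python. The helper `combined_mean_sd` and its `(n, mean, sd)` tuple layout are assumptions made for illustration, not part of the slides.

```python
import math

def combined_mean_sd(groups):
    """Combine groups given as (n, mean, sd) tuples.

    Returns the combined mean and the combined (population) standard
    deviation, using d_k = mean_k - combined mean for each group.
    """
    total_n = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total_n
    variance = sum(n * (s ** 2 + (m - mean) ** 2) for n, m, s in groups) / total_n
    return mean, math.sqrt(variance)

# Series A: N = 100, mean 50, SD 5; Series B: N = 150, mean 40, SD 6
mean_ab, sd_ab = combined_mean_sd([(100, 50, 5), (150, 40, 6)])
print(mean_ab, round(sd_ab, 2))  # 44.0 7.46
```

Passing the groups as a list means the same function also covers the "two or more groups" case mentioned above.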
Solution:
$\bar{X}_{12} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2} = \frac{(100 \times 50) + (150 \times 40)}{100 + 150} = \frac{11{,}000}{250} = 44$
$\sigma_{12} = \sqrt{\frac{N_1(\sigma_1^2 + d_1^2) + N_2(\sigma_2^2 + d_2^2)}{N_1 + N_2}}$
$d_1 = \bar{X}_1 - \bar{X}_{12} = 50 - 44 = 6$
$d_2 = \bar{X}_2 - \bar{X}_{12} = 40 - 44 = -4$
$\sigma_{12} = \sqrt{\frac{100(5^2 + 6^2) + 150(6^2 + (-4)^2)}{100 + 150}} = \sqrt{\frac{2500 + 3600 + 5400 + 2400}{250}} = \sqrt{\frac{13{,}900}{250}} = \sqrt{55.6} = 7.46$

Correcting Mean and Standard Deviation
Quite often, while tabulating data, some values are entered wrongly. For example, 15 may be misread as 51, or 159 as 59, and the wrong values may then be used in calculating the mean and standard deviation. When the error is discovered, two courses are open to us: either redo all the calculations, which is very tedious, or correct the mean and standard deviation by removing the wrong figures and adding the right ones. The second course is easier and takes much less time. The procedure is shown in the example below.

Example: The mean and standard deviation of 20 items were found to be 10 and 2 respectively. Later it was found that the item 12 had been misread as 8. Calculate the correct mean and standard deviation.
Solution: Given N = 20, $\bar{X} = 10$, $\sigma = 2$. Wrong value used: 8; correct value: 12.

(i) Calculation of correct mean:
$\bar{X} = \frac{\sum X}{N}$, so $\sum X = \bar{X} \times N = 10 \times 20 = 200$
This is the wrong value of $\sum X$. The correct value is 200 - 8 + 12 = 204.
Correct $\sum X = 204$, so the correct mean is $\frac{204}{20} = 10.2$.

(ii) Calculation of correct standard deviation:
Variance: $\sigma^2 = \frac{\sum X^2}{N} - \bar{X}^2$
The wrong value of $\sum X^2 = N(\sigma^2 + \bar{X}^2) = 20(4 + 100) = 2080$
The correct value of $\sum X^2 = 2080 - 8^2 + 12^2 = 2160$
Correct variance = $\frac{\sum X^2}{N} - \bar{X}^2 = \frac{2160}{20} - (10.2)^2$
Correct standard deviation = $\sqrt{\frac{2160}{20} - (10.2)^2} = \sqrt{108 - 104.04} = \sqrt{3.96} \approx 1.99$

LORENZ CURVE
The Lorenz curve is a graphical method of studying dispersion. It was devised by the American economist Max O. Lorenz.
The Lorenz curve is of great use in studying the degree of inequality in the distribution of income and wealth between countries. It is also useful for comparing the distribution of wages, profits, etc. over different business groups. It is a cumulative percentage curve in which the cumulative percentage of frequency (for example, persons) is plotted against the cumulative percentage of another item such as income, profit or wages.

EXAMPLE
Income           100   200   400   500   800
No. of persons    80    70    50    30    20

Income   Cumulative income   Cumulative percentage (income)   No. of persons   Cumulative total   Cumulative percentage (persons)
100       100                 100/2000 × 100 = 5               80                80                 80/250 × 100 = 32
200       300                 300/2000 × 100 = 15              70               150                150/250 × 100 = 60
400       700                 700/2000 × 100 = 35              50               200                200/250 × 100 = 80
500      1200                1200/2000 × 100 = 60              30               230                230/250 × 100 = 92
800      2000                2000/2000 × 100 = 100             20               250                250/250 × 100 = 100
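The cumulative percentages in the table can be reproduced with a short Python sketch. It follows the table's own convention of cumulating the listed income values directly; the helper name `cumulative_percentages` is illustrative only.

```python
from itertools import accumulate

def cumulative_percentages(values):
    """Return each running total as a percentage of the grand total."""
    running = list(accumulate(values))
    total = running[-1]
    return [100 * r / total for r in running]

income = [100, 200, 400, 500, 800]
persons = [80, 70, 50, 30, 20]

income_pct = cumulative_percentages(income)    # [5.0, 15.0, 35.0, 60.0, 100.0]
persons_pct = cumulative_percentages(persons)  # [32.0, 60.0, 80.0, 92.0, 100.0]

# Each pair is one point on the Lorenz curve (persons on x, income on y).
for p, q in zip(persons_pct, income_pct):
    print(f"{p:5.1f}% of persons -> {q:5.1f}% of income")
```

Plotting these points, starting from (0, 0), against the straight diagonal from (0, 0) to (100, 100) gives the Lorenz curve; the further the curve sags below the diagonal, the greater the inequality.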