Dispersion refers to the extent to which the
items vary from one another and from the
central value. It may be noted that the
measures of dispersion or variation measure
only the degree but not the direction of the
variation. The measures of dispersion are
also called averages of the second order
because they are based on the deviations of
the different values from the mean or other
measures of central tendency which are
called averages of the first order.
• In the words of Bowley, "Dispersion is the measure of the variation of the items."
• According to Connor, "Dispersion is a measure of the extent to which the individual items vary."
METHODS OF MEASURING DISPERSION
• Range
• Interquartile Range and Quartile Deviation
• Mean Deviation
• Standard Deviation
• Coefficient of Variation
• Lorenz Curve
• Range is the simplest method of measuring dispersion.
• It is defined as the difference between the largest and the smallest observations in a given set of data.
• Its formula is:
R = L - S
where R = range, L = largest value, S = smallest value.
• The relative measure of range, also called the coefficient of range, is defined as:
Coefficient of range = (L - S) / (L + S)
• Range can be calculated for the following types of series:
I. Individual series
II. Discrete series
III. Continuous series
Each type is illustrated below with an example.
Ex 1. Five students obtained the following marks in statistics:
20, 35, 25, 30, 15
Find the range and coefficient of range.
Sol. Here L = 35, S = 15
Range = L - S = 35 - 15 = 20
Coefficient of range = (L - S) / (L + S)
= (35 - 15) / (35 + 15)
= 20/50
= 0.40
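The calculation in Ex 1 can be checked with a short Python sketch. Python is an illustration choice here, and `range_and_coefficient` is a hypothetical helper name, not something from the slides.

```python
def range_and_coefficient(values):
    """Return (range, coefficient of range) for a list of observations."""
    largest, smallest = max(values), min(values)
    value_range = largest - smallest                   # R = L - S
    coefficient = value_range / (largest + smallest)   # (L - S) / (L + S)
    return value_range, coefficient


marks = [20, 35, 25, 30, 15]                           # data from Ex 1
print(range_and_coefficient(marks))                    # (20, 0.4)
```

For a discrete series the same function can be applied to the values of the variable alone, since the frequencies do not affect the range.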
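Ex 2. Find the range and coefficient of range from the following series:
Marks:            10   20   30   40   50   60   70
No. of students:  15   18   25   30   16   10    9
Sol. Here L = 70, S = 10
Range = L - S = 70 - 10 = 60
Coefficient of range = (L - S) / (L + S)
= (70 - 10) / (70 + 10)
= 60/80
= 0.75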
Ex 3. Find the range and coefficient of range from the following series:
Size:        5-10   10-15   15-20   20-25   25-30
Frequency:      4       9      15      30      40
Sol. Range = L - S
Here, L = upper limit of the largest class = 30
S = lower limit of the smallest class = 5
Range = 30 - 5 = 25
Coefficient of range = (L - S) / (L + S)
= (30 - 5) / (30 + 5)
= 25/35 = 5/7 ≈ 0.71
• The interquartile range is the difference between the upper quartile (Q3) and the lower quartile (Q1). Its formula is:
Interquartile range = Q3 - Q1
• The quartile deviation is half of the difference between the upper quartile (Q3) and the lower quartile (Q1). Its formula is:
Quartile deviation = (Q3 - Q1) / 2
• The relative measure of quartile deviation, also called the coefficient of quartile deviation, is defined as:
Coefficient of quartile deviation = (Q3 - Q1) / (Q3 + Q1)
• These measures can be calculated for the following types of series:
I. Individual series
II. Discrete series
III. Continuous series
Each type is illustrated below with an example.
Ex 1. Find the interquartile range, quartile deviation and coefficient of quartile deviation from the data:
28, 18, 20, 24, 27, 30, 15
Sol. Arrange the data in ascending order:
15, 18, 20, 24, 27, 28, 30
Q1 = size of ((n + 1)/4)th item = size of ((7 + 1)/4)th item = size of 2nd item = 18 marks
Q3 = size of 3((n + 1)/4)th item = size of 3((7 + 1)/4)th item = size of 6th item = 28 marks
Interquartile range = Q3 - Q1 = 28 - 18 = 10
Quartile deviation = (Q3 - Q1) / 2 = 10/2 = 5
Coefficient of quartile deviation = (Q3 - Q1) / (Q3 + Q1)
= (28 - 18) / (28 + 18)
= 10/46 = 0.217
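The positional rule used in Ex 1 can be sketched in Python as follows. `item_at` and `quartile_measures` are hypothetical helper names. The linear interpolation for fractional positions is an assumption that the example does not show (it is not needed when n + 1 is divisible by 4), and the sketch assumes at least three observations.

```python
def item_at(sorted_values, position):
    """Value at a 1-based position; fractional positions are interpolated
    linearly between neighbours (an assumption, not shown in the example)."""
    lower = int(position)
    fraction = position - lower
    value = sorted_values[lower - 1]
    if fraction:
        value += fraction * (sorted_values[lower] - value)
    return value


def quartile_measures(values):
    """Return Q1, Q3, interquartile range, quartile deviation and coefficient."""
    data = sorted(values)
    n = len(data)
    q1 = item_at(data, (n + 1) / 4)        # size of ((n + 1)/4)th item
    q3 = item_at(data, 3 * (n + 1) / 4)    # size of 3((n + 1)/4)th item
    iqr = q3 - q1
    return q1, q3, iqr, iqr / 2, iqr / (q3 + q1)


q1, q3, iqr, qd, coeff = quartile_measures([28, 18, 20, 24, 27, 30, 15])
print(q1, q3, iqr, qd, round(coeff, 3))    # 18 28 10 5.0 0.217
```

Note that statistical libraries use several different quartile conventions, so their results may differ slightly from this textbook rule.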
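Ex 2. Find the interquartile range, quartile deviation and coefficient of quartile deviation from the following series:
Wages:            10   20   30   40   50   60
No. of workers:    2    8   20   35   42   20
Sol. Here N = 127 and the cumulative frequencies are 2, 10, 30, 65, 107, 127.
Q1 = size of ((N + 1)/4)th item
= size of ((127 + 1)/4)th item
= size of 32nd item = 40
Q3 = size of 3((N + 1)/4)th item
= size of 3((127 + 1)/4)th item
= size of 96th item = 50
Interquartile range = Q3 - Q1 = 50 - 40 = 10
Quartile deviation = (Q3 - Q1) / 2 = (50 - 40) / 2 = 5
Coefficient of quartile deviation = (Q3 - Q1) / (Q3 + Q1)
= (50 - 40) / (50 + 40)
= 10/90 = 0.11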
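Ex 3. Find the interquartile range, quartile deviation and coefficient of quartile deviation from the following data:
Age (years):  0-20   20-40   40-60   60-80   80-100
Persons:         4      10      15      20       11
Sol.
Age (years)    f    cf
0-20           4     4
20-40         10    14
40-60         15    29
60-80         20    49
80-100        11    60
N = 60
Q1 = size of (N/4)th item = (60/4)th item = 15th item
Q1 lies in the class 40-60.
Q1 = l1 + ((N/4 - cf) / f) × i
where l1 = lower limit of the quartile class, cf = cumulative frequency of the preceding class, f = frequency of the quartile class and i = class width.
= 40 + ((15 - 14) / 15) × 20
= 41.33
Q3 = size of 3(N/4)th item = 3(60/4)th item = 45th item
Q3 lies in the class 60-80.
Q3 = l1 + ((3N/4 - cf) / f) × i
= 60 + ((45 - 29) / 20) × 20
= 76
Interquartile range = Q3 - Q1 = 76 - 41.33 = 34.67
Quartile deviation = (Q3 - Q1) / 2 = (76 - 41.33) / 2 = 17.33
Coefficient of quartile deviation = (Q3 - Q1) / (Q3 + Q1)
= (76 - 41.33) / (76 + 41.33)
= 34.67/117.33 ≈ 0.295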
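The interpolation inside the quartile class can be sketched in Python as below. `grouped_quantile` is a hypothetical helper name, and the class limits and frequencies are taken from Ex 3. It is a sketch of the textbook formula under the assumption of contiguous classes with positive frequencies, not a general-purpose routine.

```python
def grouped_quantile(classes, freqs, fraction):
    """Interpolated value of the (fraction * N)th item of a grouped series.

    classes: list of (lower, upper) class limits; freqs: matching frequencies.
    """
    n = sum(freqs)
    target = fraction * n                  # N/4 for Q1, 3N/4 for Q3
    cumulative = 0                         # cf of the classes before this one
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f >= target:
            # l1 + ((target - cf) / f) * i
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("fraction must lie between 0 and 1")


classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
freqs = [4, 10, 15, 20, 11]
q1 = grouped_quantile(classes, freqs, 0.25)
q3 = grouped_quantile(classes, freqs, 0.75)
print(round(q1, 2), round(q3, 2), round((q3 - q1) / (q3 + q1), 3))  # 41.33 76.0 0.295
```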
MEAN DEVIATION
• Mean deviation is also known as average deviation. The deviations are taken from an average, usually the mean, the median or the mode. While taking deviations we ignore their signs and treat all of them as positive.
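Formulae to calculate mean deviation
M.D. from mean: $\text{M.D.}_{\bar{X}} = \dfrac{\sum |X - \bar{X}|}{N}$
M.D. from median: $\text{M.D.}_{M} = \dfrac{\sum |X - M|}{N}$
Coefficient of mean deviation
Coefficient of $\text{M.D.}_{\bar{X}} = \dfrac{\text{M.D.}_{\bar{X}}}{\bar{X}}$
Coefficient of $\text{M.D.}_{M} = \dfrac{\text{M.D.}_{M}}{M}$
•Individual Series
$\text{M.D.}_{\bar{X}} = \dfrac{\sum |d_{\bar{X}}|}{N}$, $\text{M.D.}_{M} = \dfrac{\sum |d_{M}|}{N}$
where $d_{\bar{X}} = X - \bar{X}$ and $d_{M} = X - M$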
Example: Calculate mean deviation from mean as well as
from median and coefficient of mean deviation from the
following data:
Marks: 20 , 22 , 25 , 38 , 40 , 50 , 65 , 70 , 75
Solution:
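$\bar{X} = \dfrac{\sum X}{N} = \dfrac{405}{9} = 45$
Median = size of $\left(\dfrac{N+1}{2}\right)$th item = size of $\left(\dfrac{9+1}{2}\right)$th item = size of 5th item = 40
Marks (X)   Deviations from mean 45, $|d_{\bar{X}}|$   Deviations from median 40, $|d_{M}|$
20          25                                     20
22          23                                     18
25          20                                     15
38           7                                      2
40           5                                      0
50           5                                     10
65          20                                     25
70          25                                     30
75          30                                     35
N = 9, ΣX = 405, $\sum |d_{\bar{X}}| = 160$, $\sum |d_{M}| = 155$
M.D. from mean = $\dfrac{\sum |d_{\bar{X}}|}{N} = \dfrac{160}{9} = 17.78$
Coefficient of $\text{M.D.}_{\bar{X}} = \dfrac{\text{M.D.}_{\bar{X}}}{\bar{X}} = \dfrac{17.78}{45} \approx 0.395$
M.D. from median = $\dfrac{\sum |d_{M}|}{N} = \dfrac{155}{9} = 17.22$
Coefficient of $\text{M.D.}_{M} = \dfrac{\text{M.D.}_{M}}{M} = \dfrac{17.22}{40} = 0.43$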
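The individual-series calculation can be reproduced with a few lines of Python. This is a sketch: `mean_deviation` is a hypothetical helper name, while `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median


def mean_deviation(values, center):
    """Average of the absolute deviations of the values from `center`."""
    return sum(abs(x - center) for x in values) / len(values)


marks = [20, 22, 25, 38, 40, 50, 65, 70, 75]
x_bar, m = mean(marks), median(marks)          # 45 and 40
md_mean = mean_deviation(marks, x_bar)         # 160 / 9
md_median = mean_deviation(marks, m)           # 155 / 9
print(round(md_mean, 2), round(md_mean / x_bar, 3))    # 17.78 0.395
print(round(md_median, 2), round(md_median / m, 2))    # 17.22 0.43
```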
•Discrete Series
$\text{M.D.}_{\bar{X}} = \dfrac{\sum f|d_{\bar{X}}|}{N}$, $\text{M.D.}_{M} = \dfrac{\sum f|d_{M}|}{N}$
Example: Calculate the mean deviation from median and
mean and their coefficient from the following table:
X:           20   30   40   50   60   70
Frequency:    8   12   20   10    6    4
Solution:
M = size of $\left(\dfrac{N+1}{2}\right)$th item = size of $\left(\dfrac{60+1}{2}\right)$th item = size of 30.5th item
M = 40
$\bar{X} = \dfrac{\sum fX}{N} = \dfrac{2460}{60} = 41$
Calculation of mean deviation from median (M = 40)
X     f     c.f.   $|d_{M}| = |X - M|$   $f|d_{M}|$
20     8     8     20                 160
30    12    20     10                 120
40    20    40      0                   0
50    10    50     10                 100
60     6    56     20                 120
70     4    60     30                 120
N = 60, $\sum f|d_{M}| = 620$
$\text{M.D.}_{M} = \dfrac{\sum f|d_{M}|}{N} = \dfrac{620}{60} = 10.33$
Coefficient of $\text{M.D.}_{M} = \dfrac{\text{M.D.}_{M}}{M} = \dfrac{10.33}{40} = 0.258$
Calculation of mean deviation from mean ($\bar{X} = 41$)
X     f     fX     $|d_{\bar{X}}| = |X - \bar{X}|$   $f|d_{\bar{X}}|$
20     8    160    21                    168
30    12    360    11                    132
40    20    800     1                     20
50    10    500     9                     90
60     6    360    19                    114
70     4    280    29                    116
N = 60, ΣfX = 2460, $\sum f|d_{\bar{X}}| = 640$
$\text{M.D.}_{\bar{X}} = \dfrac{\sum f|d_{\bar{X}}|}{N} = \dfrac{640}{60} = 10.67$
Coefficient of $\text{M.D.}_{\bar{X}} = \dfrac{\text{M.D.}_{\bar{X}}}{\bar{X}} = \dfrac{10.67}{41} = 0.26$
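For a discrete series the deviations are weighted by the frequencies. The following Python sketch mirrors the two tables above. `discrete_median` and `weighted_mean_deviation` are hypothetical helper names, and the median helper makes a simplifying assumption: it reads the ((N + 1)/2)th item off the cumulative frequencies and does not average two neighbouring values.

```python
def discrete_median(values, freqs):
    """Size of the ((N + 1)/2)th item, read off the cumulative frequencies."""
    position = (sum(freqs) + 1) / 2
    cumulative = 0
    for x, f in sorted(zip(values, freqs)):
        cumulative += f
        if cumulative >= position:
            return x


def weighted_mean_deviation(values, freqs, center):
    """Sum of f * |X - center| divided by N."""
    n = sum(freqs)
    return sum(f * abs(x - center) for x, f in zip(values, freqs)) / n


values = [20, 30, 40, 50, 60, 70]
freqs = [8, 12, 20, 10, 6, 4]
m = discrete_median(values, freqs)                               # 40
x_bar = sum(f * x for x, f in zip(values, freqs)) / sum(freqs)   # 41.0
print(round(weighted_mean_deviation(values, freqs, m), 2))       # 10.33
print(round(weighted_mean_deviation(values, freqs, x_bar), 2))   # 10.67
```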
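•Continuous Series
$\text{M.D.}_{\bar{X}} = \dfrac{\sum f|d_{\bar{X}}|}{N}$, where $d_{\bar{X}} = m - \bar{X}$
$\text{M.D.}_{M} = \dfrac{\sum f|d_{M}|}{N}$, where $d_{M} = m - M$
Here m is the mid-value of each class.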
Example: Calculate the mean deviation from mean and
its coefficient from the following data:
Marks:             0-10   10-20   20-30   30-40   40-50
No. of students:      5       8      15      16       6
Solution:
$\bar{X} = \dfrac{\sum fm}{N} = \dfrac{1350}{50} = 27$
Calculation of mean deviation from mean ($\bar{X} = 27$)
Marks    f     m     fm     $|d_{\bar{X}}| = |m - 27|$   $f|d_{\bar{X}}|$
0-10      5     5     25    22                110
10-20     8    15    120    12                 96
20-30    15    25    375     2                 30
30-40    16    35    560     8                128
40-50     6    45    270    18                108
N = 50, Σfm = 1350, $\sum f|d_{\bar{X}}| = 472$
$\text{M.D.}_{\bar{X}} = \dfrac{\sum f|d_{\bar{X}}|}{N} = \dfrac{472}{50} = 9.44$
Coefficient of $\text{M.D.}_{\bar{X}} = \dfrac{\text{M.D.}_{\bar{X}}}{\bar{X}} = \dfrac{9.44}{27} \approx 0.35$
Example: Calculate the mean deviation from the median and its coefficient from the following data:
Size:        100-120   120-140   140-160   160-180   180-200
Frequency:         4         6        10         8         5
Solution:
Median item = $\dfrac{N}{2} = \dfrac{33}{2}$ = 16.5th item, so the median lies in the class 140-160.
$M = l_1 + \dfrac{\frac{N}{2} - c.f.}{f} \times i = 140 + \dfrac{16.5 - 10}{10} \times 20 = 153$
Calculation of mean deviation from median (M = 153)
Size       f     cf     m     $|d_{M}| = |m - M|$   $f|d_{M}|$
100-120     4     4    110    43                172
120-140     6    10    130    23                138
140-160    10    20    150     3                 30
160-180     8    28    170    17                136
180-200     5    33    190    37                185
N = 33, $\sum f|d_{M}| = 661$
$\text{M.D.}_{M} = \dfrac{\sum f|d_{M}|}{N} = \dfrac{661}{33} = 20.03$
Coefficient of $\text{M.D.}_{M} = \dfrac{\text{M.D.}_{M}}{M} = \dfrac{20.03}{153} = 0.1309$
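For a continuous series the class mid-values stand in for the individual items. A minimal Python sketch of the median example follows. `grouped_median` and `grouped_mean_deviation` are hypothetical helper names, and the classes are assumed to be contiguous with positive frequencies.

```python
def grouped_median(classes, freqs):
    """Median by interpolation: l1 + ((N/2 - cf) / f) * i."""
    half = sum(freqs) / 2
    cumulative = 0
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f >= half:
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f


def grouped_mean_deviation(classes, freqs, center):
    """Mean deviation about `center`, using class mid-values m."""
    mids = [(lower + upper) / 2 for lower, upper in classes]
    return sum(f * abs(m - center) for m, f in zip(mids, freqs)) / sum(freqs)


classes = [(100, 120), (120, 140), (140, 160), (160, 180), (180, 200)]
freqs = [4, 6, 10, 8, 5]
median_value = grouped_median(classes, freqs)               # 153.0
md = grouped_mean_deviation(classes, freqs, median_value)   # 661 / 33
print(round(md, 2), round(md / median_value, 4))            # 20.03 0.1309
```

The same `grouped_mean_deviation` function gives 9.44 for the marks example when it is called with the mean 27 as the centre.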
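Short-cut method for mean deviation
If the value of the average comes out to be a fraction, M.D. is calculated using the following formulas:
•Individual Series
$\text{M.D.} = \dfrac{\left(\sum X_A - \sum X_B\right) - (N_A - N_B)(\bar{X} \text{ or } M)}{N}$
where $\sum X_A$ and $N_A$ are the sum and number of the items above the average, and $\sum X_B$ and $N_B$ are the sum and number of the items below it.
•Discrete and Continuous Series
$\text{M.D.} = \dfrac{\left(\sum fX_A - \sum fX_B\right) - \left(\sum f_A - \sum f_B\right)(\bar{X} \text{ or } M)}{N}$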
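•Individual Series
Example: Using the short-cut method, calculate the mean deviations from the mean and the median from the following data:
7, 9, 13, 13, 15, 17, 19, 21, 23
Solution:
ΣX = 137, N = 9
$\bar{X} = \dfrac{137}{9} = 15.22$
M = size of $\left(\dfrac{N+1}{2}\right)$th item = size of $\left(\dfrac{9+1}{2}\right)$th item = size of 5th item = 15
Taking the mean ($\bar{X} = 15.22$): the items below it are 7, 9, 13, 13, 15, so $\sum X_B = 57$ and $N_B = 5$; the items above it are 17, 19, 21, 23, so $\sum X_A = 80$ and $N_A = 4$.
Taking the median (M = 15): the items below it are 7, 9, 13, 13, so $\sum X_B = 42$ and $N_B = 4$; the items above it are 17, 19, 21, 23, so $\sum X_A = 80$ and $N_A = 4$.
From mean:
$\text{M.D.}_{\bar{X}} = \dfrac{\left(\sum X_A - \sum X_B\right) - (N_A - N_B)\bar{X}}{N} = \dfrac{(80 - 57) - (4 - 5)(15.22)}{9} = \dfrac{38.22}{9} = 4.25$
From median:
$\text{M.D.}_{M} = \dfrac{\left(\sum X_A - \sum X_B\right) - (N_A - N_B)M}{N} = \dfrac{(80 - 42) - (4 - 4)(15)}{9} = \dfrac{38}{9} = 4.22$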
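A quick way to see that the short-cut formula is only a rearrangement of the direct definition is to compute both. The Python sketch below does this for the nine values above. Both function names are hypothetical, and the sketch leaves items equal to the average out of both groups, which does not change the result because their deviation is zero.

```python
def mean_deviation_shortcut(values, center):
    """((sum above - sum below) - (n above - n below) * center) / N."""
    above = [x for x in values if x > center]
    below = [x for x in values if x < center]
    numerator = (sum(above) - sum(below)) - (len(above) - len(below)) * center
    return numerator / len(values)


def mean_deviation_direct(values, center):
    """Direct definition: average absolute deviation from `center`."""
    return sum(abs(x - center) for x in values) / len(values)


data = [7, 9, 13, 13, 15, 17, 19, 21, 23]
x_bar = sum(data) / len(data)      # 137 / 9
m = sorted(data)[len(data) // 2]   # 5th item of nine, i.e. 15
print(round(mean_deviation_shortcut(data, x_bar), 2))   # 4.25
print(round(mean_deviation_shortcut(data, m), 2))       # 4.22
```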
•Continuous Series
Example: Using the short-cut method, calculate the mean deviations from the mean and the median from the following data:
Marks:             0-10   10-20   20-30   30-40   40-50
No. of students:      6      28      51      11       4
Solution:
Marks    f     cf     X (mid-value)   fX
0-10      6     6      5               30
10-20    28    34     15              420
20-30    51    85     25             1275
30-40    11    96     35              385
40-50     4   100     45              180
N = 100, ΣfX = 2290
Calculation of mean deviation from mean
$\bar{X} = \dfrac{\sum fX}{N} = \dfrac{2290}{100} = 22.9$
Classes with mid-values below 22.9: $\sum f_B = 34$, $\sum fX_B = 450$
Classes with mid-values above 22.9: $\sum f_A = 66$, $\sum fX_A = 1840$
$\text{M.D.}_{\bar{X}} = \dfrac{\left(\sum fX_A - \sum fX_B\right) - \left(\sum f_A - \sum f_B\right)\bar{X}}{N} = \dfrac{(1840 - 450) - (66 - 34)(22.9)}{100} = 6.572$
Calculation of mean deviation from median
The median is the (N/2)th = 50th item, which lies in the class 20-30:
$M = 20 + \dfrac{50 - 34}{51} \times 10 = 23.14$
Classes with mid-values below 23.14: $\sum f_B = 34$, $\sum fX_B = 450$
Classes with mid-values above 23.14: $\sum f_A = 66$, $\sum fX_A = 1840$
$\text{M.D.}_{M} = \dfrac{\left(\sum fX_A - \sum fX_B\right) - \left(\sum f_A - \sum f_B\right)M}{N} = \dfrac{(1840 - 450) - (66 - 34)(23.14)}{100} = 6.4952$
STANDARD DEVIATION
• The concept of standard deviation was first introduced by Karl Pearson in 1893. The standard deviation is the most useful and the most popular measure of dispersion. Just as the arithmetic mean is the most popular of all the averages, the standard deviation is the best of all measures of dispersion.
Calculation of Standard Deviation
•Individual Series
In case of an individual series, the standard deviation can be computed by applying any of the three methods:
1) Actual Mean Method
When deviations are taken from the actual mean, the following formula is used:
$\sigma = \sqrt{\dfrac{\sum (X - \bar{X})^2}{N}}$ or $\sigma = \sqrt{\dfrac{\sum x^2}{N}}$, where $x = X - \bar{X}$
Steps for Calculation
i. Calculate the actual mean ($\bar{X}$) of the series.
ii. Take the deviations of the items from the mean, i.e. find $X - \bar{X}$, and denote these deviations by x.
iii. Square these deviations and obtain the total, i.e. $\sum x^2$.
iv. Divide $\sum x^2$ by the total number of items, i.e. N, and take the square root. The result is the standard deviation.
2) Assumed Mean Method
When the actual mean is not a whole number but a fraction, it becomes difficult to take deviations from the mean and to square them. To save time and labor we use the assumed mean method, also called the short-cut method. When deviations are taken from an assumed mean A, the following formula is used:
$\sigma = \sqrt{\dfrac{\sum d^2}{N} - \left(\dfrac{\sum d}{N}\right)^2}$, where $d = X - A$
Steps for Calculation
i. Any of the items in the series is taken as
assumed mean.
ii. Take the deviations of the items from the
assumed mean, i.e. X – A and denote these
deviations by ‘d’. Sum up these deviations
to obtain ∑ d.
iii. Then square these deviations taken from
assumed mean and obtain the total , i.e.
∑d²
iv. Substitute the values of ∑d² , ∑ d and N in
the above formula. The result will give the
value of standard deviation.
Example: Calculate the standard deviation from the following:
X = 16, 20, 18, 19, 20, 20, 28, 17, 22, 20
Solution: Calculation of Standard Deviation ($\bar{X} = 20$)
X      $x = X - \bar{X}$    $x^2$
16     -4             16
20      0              0
18     -2              4
19     -1              1
20      0              0
20      0              0
28      8             64
17     -3              9
22      2              4
20      0              0
N = 10, ΣX = 200, Σx = 0, Σx² = 98
$\sigma = \sqrt{\dfrac{\sum x^2}{N}} = \sqrt{\dfrac{98}{10}} = \sqrt{9.8} = 3.13$
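Example: Calculate the standard deviation of the following series using the assumed mean method:
7, 10, 12, 13, 15, 20, 21, 28, 29, 35
Solution: Calculation of Standard Deviation (A = 20)
X         d = X - A    d²
7         -13          169
10        -10          100
12         -8           64
13         -7           49
15         -5           25
20 = A      0            0
21          1            1
28          8           64
29          9           81
35         15          225
N = 10, Σd = -10, Σd² = 778
$\sigma = \sqrt{\dfrac{\sum d^2}{N} - \left(\dfrac{\sum d}{N}\right)^2} = \sqrt{\dfrac{778}{10} - \left(\dfrac{-10}{10}\right)^2} = \sqrt{76.8} = 8.76$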
3) Method Based on Use of Actual Data
When the number of observations is small, the standard deviation can be calculated directly from the actual data. When this method is used, the following formula is used:
$\sigma = \sqrt{\dfrac{\sum X^2}{N} - \left(\dfrac{\sum X}{N}\right)^2}$
Steps for Calculation
i. First find the sum of the items, i.e. ΣX.
ii. Then square the values of X and add them up to get ΣX².
iii. Substitute the values in the above formula. The result is the standard deviation.
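Example: Calculate the standard deviation from the following:
X = 16, 20, 18, 19, 20, 20, 28, 17, 22, 20
Solution: Calculation of Standard Deviation
X      X²
16     256
20     400
18     324
19     361
20     400
20     400
28     784
17     289
22     484
20     400
N = 10, ΣX = 200, ΣX² = 4098
$\sigma = \sqrt{\dfrac{\sum X^2}{N} - \left(\dfrac{\sum X}{N}\right)^2} = \sqrt{\dfrac{4098}{10} - \left(\dfrac{200}{10}\right)^2} = \sqrt{409.8 - 400} = \sqrt{9.8} = 3.13$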
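The three individual-series methods are algebraically equivalent, which the following Python sketch illustrates on the data of the examples above. The function names are hypothetical. All three divide by N, so they compute the population standard deviation, the same quantity as `statistics.pstdev` in the standard library.

```python
from math import sqrt


def sd_actual_mean(values):
    """sqrt(sum((X - mean)^2) / N)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_assumed_mean(values, assumed):
    """sqrt(sum(d^2)/N - (sum(d)/N)^2) with d = X - A."""
    n = len(values)
    d = [x - assumed for x in values]
    return sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)


def sd_actual_data(values):
    """sqrt(sum(X^2)/N - (sum(X)/N)^2)."""
    n = len(values)
    return sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)


first = [16, 20, 18, 19, 20, 20, 28, 17, 22, 20]
second = [7, 10, 12, 13, 15, 20, 21, 28, 29, 35]
print(round(sd_actual_mean(first), 2), round(sd_actual_data(first), 2))  # 3.13 3.13
print(round(sd_assumed_mean(second, 20), 2))                             # 8.76
```

The choice of the assumed mean A does not affect the result; it only changes how convenient the arithmetic is by hand.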
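•Discrete Series
There are three methods to find the standard deviation of a discrete series. The standard deviation is denoted by the symbol σ.
1) Actual mean method: $\sigma = \sqrt{\dfrac{\sum fx^2}{N}}$, where $x = X - \bar{X}$
2) Assumed mean method: $\sigma = \sqrt{\dfrac{\sum fd^2}{N} - \left(\dfrac{\sum fd}{N}\right)^2}$, where $d = X - A$
3) Step deviation method: $\sigma = \sqrt{\dfrac{\sum fd'^2}{N} - \left(\dfrac{\sum fd'}{N}\right)^2} \times i$, where $d' = \dfrac{X - A}{i}$ and i = common factor
Example:
X: 10, 20, 30, 40, 50, 60, 70
f: 3, 5, 7, 9, 8, 5, 3
Solution (A = 40, i = 10):
X     f    d = X - 40    d'    fd'    fd'²
10    3    -30           -3    -9     27
20    5    -20           -2    -10    20
30    7    -10           -1    -7      7
40    9      0            0     0      0
50    8     10            1     8      8
60    5     20            2    10     20
70    3     30            3     9     27
N = 40, Σfd' = 1, Σfd'² = 109
$\sigma = \sqrt{\dfrac{\sum fd'^2}{N} - \left(\dfrac{\sum fd'}{N}\right)^2} \times i = \sqrt{\dfrac{109}{40} - \left(\dfrac{1}{40}\right)^2} \times 10 = 16.5$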
•Continuous Series
In a continuous series we can use any of the three methods discussed above for a discrete series, but the step deviation method is the one commonly used. Its formula is:
$\sigma = \sqrt{\dfrac{\sum fd'^2}{N} - \left(\dfrac{\sum fd'}{N}\right)^2} \times i$, where $d' = \dfrac{m - A}{i}$, m = mid-value of the class and i = size of the class interval
Example:
Marks:             0-10   10-20   20-30   30-40   40-50   50-60   60-70   70-80
No. of students:      5      10      20      40      30      20      10       4
Solution (A = 35, i = 10):
Marks    f     m     d = m - 35    d'    fd'    fd'²
0-10      5     5    -30           -3    -15    45
10-20    10    15    -20           -2    -20    40
20-30    20    25    -10           -1    -20    20
30-40    40    35      0            0      0     0
40-50    30    45     10            1     30    30
50-60    20    55     20            2     40    80
60-70    10    65     30            3     30    90
70-80     4    75     40            4     16    64
N = 139, Σfd' = 61, Σfd'² = 369
$\sigma = \sqrt{\dfrac{369}{139} - \left(\dfrac{61}{139}\right)^2} \times 10 = 15.69$
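The step deviation method for frequency data can be sketched in Python as follows. `sd_step_deviation` is a hypothetical helper name. For a continuous series the class mid-values are passed in place of X, and the assumed mean A and step i are supplied by the caller, as in the worked tables.

```python
from math import sqrt


def sd_step_deviation(values, freqs, assumed, step):
    """sqrt(sum(f d'^2)/N - (sum(f d')/N)^2) * i with d' = (X - A) / i."""
    n = sum(freqs)
    d = [(x - assumed) / step for x in values]
    sum_fd = sum(f * v for f, v in zip(freqs, d))
    sum_fd2 = sum(f * v * v for f, v in zip(freqs, d))
    return sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * step


# Discrete series: X = 10..70, A = 40, i = 10
discrete = sd_step_deviation([10, 20, 30, 40, 50, 60, 70],
                             [3, 5, 7, 9, 8, 5, 3], assumed=40, step=10)
# Continuous series: mid-values 5..75, A = 35, i = 10
continuous = sd_step_deviation([5, 15, 25, 35, 45, 55, 65, 75],
                               [5, 10, 20, 40, 30, 20, 10, 4], assumed=35, step=10)
print(round(discrete, 1), round(continuous, 2))   # 16.5 15.69
```

Setting `step=1` reduces this to the assumed mean method, so one function covers both formulas.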
Variance is the square of the standard deviation, i.e.
variance = (S.D.)² = σ²
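COMBINED STANDARD DEVIATION
The combined standard deviation of two or more groups can be calculated by the formula:
$\sigma_{12} = \sqrt{\dfrac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$
where $d_1 = \bar{X}_1 - \bar{X}_{12}$ and $d_2 = \bar{X}_2 - \bar{X}_{12}$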
Example:
                      Series A    Series B
Mean                  50          40
Standard deviation    5           6
No. of items          100         150
Find the combined mean and standard deviation of the two series.
Solution:
$\bar{X}_{12} = \dfrac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2} = \dfrac{(100 \times 50) + (150 \times 40)}{100 + 150} = \dfrac{11{,}000}{250} = 44$
$d_1 = \bar{X}_1 - \bar{X}_{12} = 50 - 44 = 6$
$d_2 = \bar{X}_2 - \bar{X}_{12} = 40 - 44 = -4$
$\sigma_{12} = \sqrt{\dfrac{N_1(\sigma_1^2 + d_1^2) + N_2(\sigma_2^2 + d_2^2)}{N_1 + N_2}} = \sqrt{\dfrac{100(5^2 + 6^2) + 150(6^2 + (-4)^2)}{100 + 150}}$
$= \sqrt{\dfrac{2500 + 3600 + 5400 + 2400}{250}} = \sqrt{\dfrac{13{,}900}{250}} = \sqrt{55.6} = 7.46$
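The combined mean and standard deviation can be computed for any number of groups with a short Python sketch. `combine_groups` is a hypothetical helper name, and each group is described by a (number of items, mean, standard deviation) tuple.

```python
from math import sqrt


def combine_groups(groups):
    """groups: list of (n, mean, sd) tuples. Returns (combined mean, combined sd)."""
    total = sum(n for n, _, _ in groups)
    combined_mean = sum(n * mean for n, mean, _ in groups) / total
    # Each group contributes N * (sd^2 + d^2), with d = group mean - combined mean.
    pooled = sum(n * (sd ** 2 + (mean - combined_mean) ** 2)
                 for n, mean, sd in groups)
    return combined_mean, sqrt(pooled / total)


mean_12, sd_12 = combine_groups([(100, 50, 5), (150, 40, 6)])
print(mean_12, round(sd_12, 2))   # 44.0 7.46
```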
Correcting Mean and Standard Deviation
Quite often while tabulating data some values are wrongly entered. For example, 15 may be misread as 51, or 159 may be misread as 59, and these wrong values may have been used in calculating the mean and standard deviation. When the error is found, there are two courses open to us: either redo all the calculations, which is very tedious, or correct the values of the mean and standard deviation by removing the wrong figures and adding the right ones. The second process is easy and takes much less time. It is illustrated in the example given below:
Example: The mean and standard deviation of 20 items were found to be 10 and 2 respectively. Later it was found that the item 12 was misread as 8. Calculate the correct mean and standard deviation.
Solution:
Given N = 20, $\bar{X}$ = 10, σ = 2. Wrong value used: 8; correct value: 12.
(i) Calculation of correct mean:
$\bar{X} = \dfrac{\sum X}{N}$, so $\sum X = \bar{X} \times N = 10 \times 20 = 200$
This is the wrong value of ΣX. The correct value is 200 - 8 + 12 = 204.
Correct mean = 204/20 = 10.2
(ii) Calculation of correct standard deviation:
Variance $\sigma^2 = \dfrac{\sum X^2}{N} - \bar{X}^2$, so $\sum X^2 = N(\sigma^2 + \bar{X}^2)$
The wrong value of ΣX² = 20(2² + 10²) = 20 × 104 = 2080
The correct value of ΣX² = 2080 - 8² + 12² = 2160
Correct variance = $\dfrac{2160}{20} - (10.2)^2$ = 108 - 104.04 = 3.96
Correct standard deviation = √3.96 ≈ 1.99
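The correction procedure amounts to adjusting ΣX and ΣX² and recomputing. A minimal Python sketch, with the hypothetical helper name `correct_mean_sd`, handling a single misread item:

```python
from math import sqrt


def correct_mean_sd(n, mean, sd, wrong, right):
    """Correct a mean and standard deviation after one item was misread."""
    sum_x = n * mean - wrong + right                             # corrected sum of X
    sum_x2 = n * (sd ** 2 + mean ** 2) - wrong ** 2 + right ** 2  # corrected sum of X^2
    new_mean = sum_x / n
    new_sd = sqrt(sum_x2 / n - new_mean ** 2)
    return new_mean, new_sd


new_mean, new_sd = correct_mean_sd(n=20, mean=10, sd=2, wrong=8, right=12)
print(round(new_mean, 1), round(new_sd, 2))   # 10.2 1.99
```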
LORENZ CURVE
The Lorenz curve is a graphical method of studying dispersion. It was given by the famous statistician Max O. Lorenz. The Lorenz curve is widely used to study the degree of inequality in the distribution of income and wealth across countries. It is also useful for comparing the distribution of wages, profits etc. over different business groups. The Lorenz curve is a cumulative percentage curve in which the cumulative percentage of frequencies (for example, persons) is plotted against the cumulative percentage of another item such as income, profit or wages.
EXAMPLE
Income:           100   200   400   500   800
No. of persons:    80    70    50    30    20
Income   Cumulative income   Cumulative percentage     No. of persons   Cumulative total   Cumulative percentage
100      100                 100/2000 × 100 = 5        80               80                 80/250 × 100 = 32
200      300                 300/2000 × 100 = 15       70               150                150/250 × 100 = 60
400      700                 700/2000 × 100 = 35       50               200                200/250 × 100 = 80
500      1200                1200/2000 × 100 = 60      30               230                230/250 × 100 = 92
800      2000                2000/2000 × 100 = 100     20               250                250/250 × 100 = 100
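The two cumulative-percentage columns of the table can be generated with a short Python sketch. `cumulative_percentages` is a hypothetical helper name, and the code follows the table's own convention of cumulating the income column directly. Plotting the person percentages against the income percentages, together with the 45-degree line of equal distribution, would give the Lorenz curve.

```python
from itertools import accumulate


def cumulative_percentages(values):
    """Running totals expressed as percentages of the grand total."""
    running = list(accumulate(values))
    return [100 * r / running[-1] for r in running]


income = [100, 200, 400, 500, 800]
persons = [80, 70, 50, 30, 20]
print(cumulative_percentages(income))    # [5.0, 15.0, 35.0, 60.0, 100.0]
print(cumulative_percentages(persons))   # [32.0, 60.0, 80.0, 92.0, 100.0]
```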