UNIT-1
LESSON-1 : PREPARATION OF FREQUENCY DISTRIBUTION AND THEIR
GRAPHICAL PRESENTATION
1. STRUCTURE
1.0 Objective
1.1 What is Frequency Distribution
1.2 Types of Frequency Distribution
1.3 Principles of Constructing Frequency Distribution
1.4 Graphs of Frequency Distributions
1.4.1 Histogram
1.4.2 Frequency Polygon
1.4.3 Smoothed Frequency Curves
1.4.4 Cumulative Frequency Curves or Ogives
1.5 Summary
1.6 Self Assessment Questions
1.0 OBJECTIVE
After reading this lesson, you should be able to :
(a) Explain what a frequency distribution is and distinguish the types of distributions
(b) Apply the principles and procedure of preparing a frequency distribution
(c) Present a distribution graphically with the help of histogram, frequency polygon,
smoothed frequency curves and ogives.
1.1 WHAT IS FREQUENCY DISTRIBUTION
Collected and classified data are presented in the form of a frequency distribution. A frequency
distribution is simply a table in which the data are grouped into classes on the basis of common
characteristics and the number of cases which fall in each class is recorded. It shows the frequency
of occurrence of different values of a single variable. A frequency distribution is constructed to
satisfy three objectives :
(i) to facilitate the analysis of data
(ii) to estimate frequencies of the unknown population distribution from the distribution of
sample data and
(iii) to facilitate the computation of various statistical measures.
1.2 TYPES OF FREQUENCY DISTRIBUTION
1. Univariate Frequency Distribution
2. Bivariate Frequency Distribution
In this lesson, we shall discuss the Univariate frequency distribution. Univariate distribution
incorporates different values of one variable only whereas the Bivariate frequency distribution
incorporates the values of two variables only. The Univariate frequency distribution is classified
further into three categories :
(i) Series of Individual observations
(ii) Discrete frequency distribution, and
(iii) Continuous frequency distribution.
Series of individual observations is a simple listing of the value of each observation. If the marks
of 20 students of a class in statistics are given individually, they form a series of individual
observations.
Marks obtained in Statistics:
Roll Nos. :  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Marks     : 60 71 80 41 94 33 81 41 78 66 85 35 61 55 98 52 50 91 30 88
Marks in Ascending Order    Marks in Descending Order
30                          98
33                          94
35                          91
41                          88
41                          85
50                          81
52                          80
55                          78
60                          71
61                          66
66                          61
71                          60
78                          55
80                          52
81                          50
85                          41
88                          41
91                          35
94                          33
98                          30
Discrete Frequency Distribution : In a discrete series, the data are presented in such a way
that exact measurements of units are indicated. In a discrete frequency distribution, we count the
number of times each value of the variable occurs in the given data. This is facilitated through the
technique of tally bars.
In the first column, we write all values of the variable. In the second column, we put a vertical
bar, called a tally bar, against the value each time it occurs. When a particular value has occurred
four times, for the fifth occurrence we put a cross tally mark (/) on the four tally bars to make a
block of 5. The technique of putting cross tally bars at every fifth repetition facilitates the counting
of the number of occurrences of the value. After putting tally bars for all the values in the data, we
count the number of times each value is repeated and write it against the corresponding value of the
variable in the third column entitled frequency. This type of representation of the data is called a
discrete frequency distribution.
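The counting step described above can be sketched in Python. This is only an illustration: the short list of marks is hypothetical, the helper names `tally` and `discrete_frequency_table` are made up for this sketch, and `||||/` is used as a plain-text stand-in for a crossed block of five tally bars.

```python
from collections import Counter

def tally(f):
    """Plain-text tally bars; '||||/' stands for a crossed block of five."""
    blocks = ["||||/"] * (f // 5)
    if f % 5:
        blocks.append("|" * (f % 5))
    return " ".join(blocks)

def discrete_frequency_table(values):
    """Return (value, tally bars, frequency) rows in ascending order of value."""
    counts = Counter(values)
    return [(v, tally(counts[v]), counts[v]) for v in sorted(counts)]

# Hypothetical marks, chosen only to illustrate the procedure.
marks = [40, 26, 33, 40, 15, 26, 40, 33, 26, 15, 40, 40, 40]
rows = discrete_frequency_table(marks)

for value, bars, f in rows:
    print(f"{value:>5}  {bars:<12} {f}")
print("Total", sum(f for _, _, f in rows))
```

The total of the frequency column should always equal the number of observations, which is a quick check on the tallying.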
We are given marks of 50 students :
70 55 51 42 57 40 26 43 46 41
46 48 33 40 26 40 40 41 43 53
45 60 47 63 53 33 50 40 33 40
26 53 59 33 65 78 39 55 48 15
26 43 59 51 39 15 45 26 61 15
We can construct a discrete frequency distribution from the above given marks.
Marks of 50 Students
Marks    Tally Bars    Frequency
15       |||           3
26       ||||          5
33       ||||          4
39       ||            2
40       ||||          5
41       ||            2
42       |             1
43       |||           3
45       ||            2
46       ||            2
47       |             1
48       ||            2
50       |             1
51       ||            2
53       |||           3
55       |||           3
57       |             1
59       ||            2
60       |             1
61       |             1
63       |             1
65       |             1
70       |             1
78       |             1
Total                  50
The presentation of the data in the form of a discrete frequency distribution is better than
a simple array, but it does not condense the data as needed and is quite difficult to grasp and
comprehend. It is useful only when the values of the variable are repeated; otherwise there will
be hardly any condensation.
Continuous Frequency Distribution : If the identity of the units about which a particular
information is collected is not relevant, nor is the order in which the observations occur, then the
first step of condensation is to classify the data into different classes by dividing the entire group of
values of the variable into a suitable number of groups and then recording the number of observations
in each group. Thus, if we divide the total range of values of the variable (marks of 50 students), i.e.
78 – 15 = 63, into groups of 10 each, then we shall get 7 groups (63/10 = 6.3, rounded up) and the
distribution of marks is displayed by the following frequency distribution :
Marks of 50 students
Marks (x)    Tally Bars        Number of Students (f)
15–25        |||               3
25–35        |||| ||||         9
35–45        |||| |||| |||     13
45–55        |||| |||| |||     13
55–65        |||| ||||         9
65–75        ||                2
75–85        |                 1
Total                          50
The various groups into which the values of the variable are classified are known as classes, and
the length of the class interval (10) is called the width or magnitude of the class. The two values
specifying the class are called the class limits. The presentation of the data into continuous classes
with the corresponding frequencies is known as continuous frequency distribution. There are two
methods of classifying the data according to class intervals :
(i) exclusive method
(ii) inclusive method
In an exclusive method, the class intervals are fixed in such a manner that upper limit of one
class becomes the lower limit of the following class. Moreover, an item equal to the upper limit of
a class would be excluded from that class and included subsequently in the next class. The following
data are classified on this basis.
Income (Rs.)    No. of Persons
200–250         50
250–300         100
300–350         70
350–400         130
400–450         50
450–500         100
Total           500
It is clear from the example that the exclusive method ensures continuity of the data inasmuch
as the upper limit of one class is the lower limit of the next class. Therefore, 50 persons have
their incomes between Rs. 200 and Rs. 249.99, and a person whose income is Rs. 250 shall be
included in the next class of 250–300.
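The exclusive rule can be expressed as a small Python sketch. The function name `exclusive_class` and the sample incomes are assumptions made for illustration; the class limits (200 to 500 in steps of 50) follow the table above.

```python
def exclusive_class(value, lower, width, n_classes):
    """Return the (lower, upper) limits of the exclusive class containing value.

    A value equal to an upper limit falls in the NEXT class.
    Returns None if the value lies outside all classes.
    """
    if value < lower:
        return None
    index = int((value - lower) // width)
    if index >= n_classes:
        return None
    lo = lower + index * width
    return (lo, lo + width)

# Hypothetical incomes checked against the classes 200-250, ..., 450-500.
for income in (200, 249.99, 250, 499, 500):
    print(income, exclusive_class(income, lower=200, width=50, n_classes=6))
```

Note that an income of exactly 500 falls outside the last class 450–500 under this rule, which is why the upper limit of the last class should exceed the largest observed value.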
According to the inclusive method, an item equal to upper limit of a class is included in that
class itself. The following table demonstrates this method.
Income (Rs.)    No. of Persons
200–249         50
250–299         100
300–349         70
350–399         130
400–449         50
450–499         100
Total           500
Hence in the class 200–249, we include persons whose income is between Rs. 200 and
Rs. 249.
1.3 PRINCIPLES OF CONSTRUCTING FREQUENCY DISTRIBUTIONS
In spite of the great importance of classification in statistical analysis, no hard and fast rules can
be laid down for it. A statistician relies on discretion, sound experience, wisdom and skill to arrive
at an appropriate classification of the data. However, the following guidelines must be considered
to construct a frequency distribution:
1. Types of classes : The classes should be clearly defined and should not lead to any ambiguity.
They should be exhaustive and mutually exclusive so that any value of the variable corresponds to
one and only one class.
2. Number of classes : The choice about the number of classes into which a given frequency
distribution should be divided depends upon
(i) The total frequency which means the total number of observations in the distribution.
(ii) The nature of the data which means the size or magnitude of the values of the variable.
(iii) The desired accuracy.
(iv) The convenience regarding computation of the various descriptive measures of the frequency
distribution such as means, variance etc.
The number of classes should neither be too small nor too large. In case the classes are few, the
classification becomes very broad and rough, which might obscure some important features and
characteristics of the data. The accuracy of the results decreases as the number of classes becomes
smaller. On the other hand, too many classes will result in very few frequencies in each class. This
will give an irregular pattern of frequencies in different classes and thus make the frequency
distribution irregular. Moreover, a large number of classes will render the distribution too unwieldy
to handle. The computational work for further processing of the data will become quite tedious and
time consuming without any proportionate gain in the accuracy of the results. Hence a balance
should be maintained between the loss of information in the first case and the irregularity of the
frequency distribution in the second case, to arrive at a compromise giving the optimum number of
classes. Normally, the number of classes should not be less than 5 or more than 20. Prof. Sturges
has given a formula :
k = 1+3.322 log n
where k refers to the number of classes, n is the total frequency or number of observations, and log
is the logarithm to base 10. The value of k is rounded to the nearest integer :
If n = 100
k = 1 + 3.322 log 100 = 1 + 6.644 = 7.644, i.e. 8 classes
If n = 10,000
k = 1 + 3.322 log 10,000 = 1 + 13.288 = 14.288, i.e. 14 classes
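As a quick check of the arithmetic, the formula can be evaluated in Python. This is a minimal sketch: the function name `sturges_k` is an assumption, the logarithm is taken to base 10, and the function returns the unrounded value so that the rounding step remains visible.

```python
import math

def sturges_k(n):
    """Unrounded number of classes from k = 1 + 3.322 log10(n)."""
    return 1 + 3.322 * math.log10(n)

for n in (100, 10_000):
    k = sturges_k(n)
    print(f"n = {n}: k = {k:.3f}")
```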
However, this rule should be applied only when the number of observations is not very small.
Moreover, the number of class intervals should be such that they give a uniform and unimodal
distribution, which means that the frequencies in the given classes increase and decrease steadily
and there are no sudden jumps. The number of classes should be an integer, preferably 5 or a
multiple of 5 (10, 15, 20, 25, etc.), which is quite convenient for numerical computations.
3. Size of class intervals : Because the size of the class interval is inversely proportional to the
number of classes in a given distribution, the choice about the size of the class interval will
also depend upon the sound subjective judgment of the statistician. An approximate value of
the magnitude of the class interval, say i, can be calculated with the help of Sturges' Rule :
$$ i = \frac{\text{Range}}{1 + 3.322 \log n} $$
where i stands for class magnitude or interval, Range is calculated by taking the difference between
largest and smallest value of the distribution, and n refers to total number of observations.
If we are given the following information: n = 400, Largest item = 1300 and Smallest
item = 340, then
$$ i = \frac{1300 - 340}{1 + 3.322 \log 400} = \frac{960}{1 + 3.322 \times 2.6021} = \frac{960}{9.644} = 99.54 \ (100 \text{ approx.}) $$
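The same calculation can be reproduced in a few lines of Python. The function name `class_width` is an assumption made for this sketch, and the logarithm is taken to base 10 as in the worked figures.

```python
import math

def class_width(largest, smallest, n):
    """Approximate class interval: Range / (1 + 3.322 log10(n))."""
    return (largest - smallest) / (1 + 3.322 * math.log10(n))

i = class_width(largest=1300, smallest=340, n=400)
print(round(i, 2))  # close to 99.54, taken as 100 for convenience
```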
Another rule of thumb for determining the size of the class interval is that the length of the
class interval should not be greater than 1/4th of the estimated population standard deviation. If σ
is the estimate of the population standard deviation, then the length of the class interval is given
by: i ≤ σ/4
The size of class intervals should be taken as 5 or a multiple of 5 (10, 15, 20, etc.) for easy
computation of the various statistical measures of the frequency distribution. Class intervals should
be so fixed that each class has a convenient mid-point around which all the observations in that class
cluster. It is assumed that the entire frequency of the class is concentrated at the mid value of the
class. This assumption will be true only if the frequencies of the different classes are uniformly
distributed in the respective class intervals. It is always desirable to take the class intervals of equal
or uniform magnitude throughout the frequency distribution.
4. Class boundaries : If in a grouped frequency distribution there are gaps between the upper
limit of any class and lower limit of the succeeding class (as in case of inclusive type of
classification), there is a need to convert the data into a continuous distribution by applying a
correction factor for continuity for determining new classes of exclusive type. The lower and
upper class limits of new exclusive type classes are called class boundaries.
If d is the gap between the upper limit of any class and lower limit of succeeding class, the
class boundaries for any class are given by :
$$ \text{Lower class boundary} = \text{Lower class limit} - \tfrac{1}{2} d $$
$$ \text{Upper class boundary} = \text{Upper class limit} + \tfrac{1}{2} d $$
d/2 is called the correction factor.
Let us consider the following example to understand:
Marks      Class Boundaries
20 – 24    (20 – 0.5, 24 + 0.5) i.e., 19.5 – 24.5
25 – 29    (25 – 0.5, 29 + 0.5) i.e., 24.5 – 29.5
30 – 34    (30 – 0.5, 34 + 0.5) i.e., 29.5 – 34.5
35 – 39    (35 – 0.5, 39 + 0.5) i.e., 34.5 – 39.5
40 – 44    (40 – 0.5, 44 + 0.5) i.e., 39.5 – 44.5
$$ \text{Correction factor} = \frac{d}{2} = \frac{35 - 34}{2} = \frac{1}{2} = 0.5 $$
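The conversion from inclusive class limits to class boundaries can be sketched as follows. The function name `class_boundaries` is an assumption, and the sketch supposes that the gap d is the same between every pair of adjacent classes, as it is in the marks example.

```python
def class_boundaries(limits):
    """Convert inclusive (lower, upper) limits to class boundaries.

    d is the gap between the upper limit of one class and the lower
    limit of the next; d/2 is the correction factor.
    """
    d = limits[1][0] - limits[0][1]
    half = d / 2
    return [(lo - half, hi + half) for lo, hi in limits]

marks_limits = [(20, 24), (25, 29), (30, 34), (35, 39), (40, 44)]
for (lo, hi), (lb, ub) in zip(marks_limits, class_boundaries(marks_limits)):
    print(f"{lo} - {hi}  ->  {lb} - {ub}")
```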
5. Mid-value or class mark: Mid value or class mark is the value of a variable which lies exactly
at the middle of a class. Mid-value of any class is obtained on dividing the sum of the upper
and lower class limits by 2.
$$ \text{Mid value of a class} = \frac{1}{2}\,[\text{Lower class limit} + \text{Upper class limit}] $$
The class limits should be selected in such a manner that the observations in any class are
evenly distributed throughout the class interval so that the actual average of the observations in any
class is very close to the mid-value of the class.
6. Open end classes : The classification is termed as open end classification if the lower limit of
the first class or the upper limit of the last class or both are not specified. Classes in which one
of the limits is missing are called open end classes, for example 'marks less than 20' or 'age
below 60 years'. As far as possible open end classes should be avoided because in such classes
the mid-value cannot be accurately obtained. But if open end classes are inevitable, then it is
customary to estimate the class mark or mid-value for the first class with reference to the
succeeding class. In other words, we assume that the magnitude of the first class is the same as
that of the second class.
Example 1 : Construct a frequency distribution from the following data by inclusive method taking
4 as the class interval :
10 17 15 22 11 16 19 24 29 18
25 26 32 14 17 20 23 27 30 12
15 18 24 36 18 15 21 28 33 38
34 13 10 16 20 22 29 19 23 31
Solution : Because the minimum value of the variable is 10 which is a very convenient figure for
taking the lower limit of the first class and the magnitude of the class interval is given to be 4, the
classes for preparing frequency distribution by the Inclusive Method will be 10–13, 14–17, 18–21,
22–25,............ 38–41.
Frequency Distribution
Class Interval    Tally Bars    Frequency (f)
10 – 13           ||||          5
14 – 17           |||| |||      8
18 – 21           |||| |||      8
22 – 25           |||| ||       7
26 – 29           ||||          5
30 – 33           ||||          4
34 – 37           ||            2
38 – 41           |             1
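The grouping in Example 1 can be checked with a short Python sketch. The function name `inclusive_frequencies` is an assumption; the data, the starting value 10 and the class width 4 are those of the example, and the values are assumed to be whole numbers.

```python
data = [
    10, 17, 15, 22, 11, 16, 19, 24, 29, 18,
    25, 26, 32, 14, 17, 20, 23, 27, 30, 12,
    15, 18, 24, 36, 18, 15, 21, 28, 33, 38,
    34, 13, 10, 16, 20, 22, 29, 19, 23, 31,
]

def inclusive_frequencies(values, start, width):
    """Count whole-number values in inclusive classes start..start+width-1, etc."""
    counts = {}
    for v in values:
        lo = start + ((v - start) // width) * width
        key = (lo, lo + width - 1)  # both limits belong to the class
        counts[key] = counts.get(key, 0) + 1
    return dict(sorted(counts.items()))

table = inclusive_frequencies(data, start=10, width=4)
for (lo, hi), f in table.items():
    print(f"{lo} - {hi}: {f}")
print("Total:", sum(table.values()))
```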
Example 2 : Prepare a statistical table from the following :
Weekly wages (Rs.) of 100 workers of Factory A
 88  23  27  28  86  96  94  93  86  99
 82  24  24  55  88  99  55  86  82  36
 96  39  26  54  87 100  56  84  83  46
102  48  27  26  29 100  59  83  84  48
104  46  30  29  40 101  60  89  46  49
106  33  36  30  40 103  70  90  49  50
104  36  37  40  40 106  72  94  50  60
 24  39  49  46  66 107  76  96  46  67
 26  78  50  44  43  46  79  99  36  68
 29  67  56  99  93  48  80 102  32  51
Solution : The lowest value is 23 and the highest 107. The difference between the lowest and highest
value is 84. If we take a class interval of 10, nine classes would be made. The first class should be
taken as 20–30 instead of 23–33 as per the guidelines of classification.
Frequency Distribution of the Wages of 100 Workers
Wages (Rs.)    Tally Bars               Frequency (f)
20 – 30        |||| |||| |||            13
30 – 40        |||| |||| |              11
40 – 50        |||| |||| |||| |||       18
50 – 60        |||| ||||                10
60 – 70        |||| |                   6
70 – 80        ||||                     5
80 – 90        |||| |||| ||||           14
90 – 100       |||| |||| ||             12
100 – 110      |||| |||| |              11
Total                                   100
1.4 GRAPHS OF FREQUENCY DISTRIBUTIONS
The guiding principles for the graphic representation of frequency distributions are precisely
the same as for the diagrammatic and graphic representation of other types of data. The information
contained in a frequency distribution can be shown in graphs which reveal the important
characteristics and relationships that are not easily discernible on a simple examination of the
frequency tables. The most commonly used graphs for charting a frequency distribution for the
general understanding of the details of the data are :
1. Histogram
2. Frequency polygon
3. Smoothed frequency curves
4. Ogives or cumulative frequency curves.
1.4.1 Histogram
The term ‘histogram’ must not be confused with the term ‘historigram’ which relates to time charts.
Histogram is the best way of presenting graphically a simple frequency distribution. The statistical
meaning of histogram is that it is a graph that represents the class frequencies in a frequency
distribution by vertical adjacent rectangles.
While constructing histogram the variable is always taken on the X-axis and the corresponding
frequencies on the Y-axis. Each class is then represented by a distance on the scale that is proportional
to its class-interval. The distance for each rectangle on the X-axis shall remain the same in case the
class-intervals are uniform throughout; if they are different the width of the rectangles shall also
change proportionately. The Y-axis represents the frequencies of each class which constitute the
height of its rectangle. We get a series of rectangles each having a class interval distance as its width
and the frequency distance as its height. The area of the histogram represents the total frequency.
The histogram should be clearly distinguished from a bar diagram. A bar diagram is
one-dimensional, i.e., only the length of the bar is important and not the width; a histogram is
two-dimensional, that is, in a histogram both the length and the width are important. However, a
histogram can be misleading if the distribution has unequal class-intervals and suitable adjustments
in frequencies are not made.
The technique of constructing histogram is explained for :
(i) distributions having equal class-intervals and
(ii) distributions having unequal class-intervals.
When class-intervals are equal, take frequency on the Y-axis, the variable on the X-axis and
construct rectangles. In such a case the heights of the rectangles will be proportional to the frequencies.
Example 3 : Draw a histogram from the following data :
Classes      Frequency
0 – 10       5
10 – 20      11
20 – 30      19
30 – 40      21
40 – 50      16
50 – 60      10
60 – 70      8
70 – 80      6
80 – 90      3
90 – 100     1
Solution:
[Figure: Histogram. X-axis: classes (0 to 100); Y-axis: frequency (0 to 25).]
When class-intervals are unequal the frequencies must be adjusted before constructing a
histogram. We take the class which has the smallest class-interval and adjust the frequencies of the
other classes accordingly. If one class-interval is twice as wide as the smallest, we divide the height
of its rectangle by two; if it is three times as wide, we divide it by three, and so on. The heights will
then be proportional to the ratios of the frequencies to the widths of the classes.
Example 4 : Represent the following data on a histogram.
Average monthly income of 1035 employees in a construction industry is given below:
Monthly Income (Rs.)    No. of Workers
600 – 700               25
700 – 800               100
800 – 900               150
900 – 1000              200
1000 – 1200             240
1200 – 1400             160
1400 – 1500             50
1500 – 1800             90
1800 or more            20
Solution : Histogram showing monthly incomes of workers
[Figure: Histogram. X-axis: monthly income (600 to 1800); Y-axis: number of workers (0 to 200).]
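The adjustment of frequencies for unequal class-intervals in Example 4 can be sketched in Python. The function name `adjusted_heights` is an assumption, and the open end class '1800 or more' is left out of the sketch because it has no definite width.

```python
# (lower limit, upper limit, frequency) for the closed classes of Example 4.
classes = [
    (600, 700, 25), (700, 800, 100), (800, 900, 150), (900, 1000, 200),
    (1000, 1200, 240), (1200, 1400, 160), (1400, 1500, 50), (1500, 1800, 90),
]

def adjusted_heights(classes):
    """Rectangle heights: frequency divided by (class width / smallest width)."""
    base = min(hi - lo for lo, hi, _ in classes)
    return [f * base / (hi - lo) for lo, hi, f in classes]

heights = adjusted_heights(classes)
for (lo, hi, f), h in zip(classes, heights):
    print(f"{lo} - {hi}: frequency {f}, height {h:g}")
```

For instance, the class 1000 – 1200 is twice as wide as the smallest class, so its frequency of 240 is drawn at a height of 120.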
When mid-points are given, we first ascertain the upper and lower limits of each class and
then construct the histogram in the same manner.
Example 5 : Draw a histogram of the following distribution :
Life of Electric Lamps in hours    Firm A    Firm B
1010                               10        287
1030                               130       105
1050                               482       26
1070                               360       230
1090                               18        352
Solution : Since we are given the mid points, we should ascertain the class limits. To calculate the
class limits of various classes, take difference of two consecutive mid-points and divide the difference
by 2, then add and subtract the value obtained from each mid-point to calculate lower and higher
class-limits.
Life of Electric Lamps    Frequency Firm A    Frequency Firm B
1000–1020                 10                  287
1020–1040                 130                 105
1040–1060                 482                 76
1060–1080                 360                 230
1080–1100                 18                  352
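The step of recovering class limits from mid-points can be sketched in Python. The function name `limits_from_midpoints` is an assumption, and the sketch supposes equally spaced mid-points, as in Example 5.

```python
def limits_from_midpoints(midpoints):
    """Class limits from equally spaced mid-points.

    Half the difference between two consecutive mid-points is
    subtracted from and added to each mid-point.
    """
    half = (midpoints[1] - midpoints[0]) / 2
    return [(m - half, m + half) for m in midpoints]

lamp_midpoints = [1010, 1030, 1050, 1070, 1090]
lamp_limits = limits_from_midpoints(lamp_midpoints)
for m, (lo, hi) in zip(lamp_midpoints, lamp_limits):
    print(f"mid-point {m}: {lo:g} - {hi:g}")
```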
[Figure: Histogram (Firm A). X-axis: life of lamps (1000 to 1100); Y-axis: frequency (0 to 500).]
[Figure: Histogram (Firm B). X-axis: life of lamps (1000 to 1100); Y-axis: frequency (0 to 400).]
1.4.2 Frequency Polygon
This is a graph of frequency distribution which has more than four sides. It is particularly
effective in comparing two or more frequency distributions. There are two ways of constructing a
frequency polygon.
(i) We may draw a histogram of the given data and then join by straight line the mid-points of
the upper horizontal side of each rectangle with the adjacent ones. The figure so formed shall be
frequency polygon. Both the ends of the polygon should be extended to the base line in order to
make the area under frequency polygons equal to the area under Histogram.
[Figure: Frequency polygon drawn over a histogram. X-axis: class mark; Y-axis: number of students (frequency).]
(ii) Another method of constructing a frequency polygon is to take the mid-points of the various
class-intervals, plot the frequency corresponding to each point and join all these points by
straight lines. The figures obtained by both the methods would be identical.
[Figure: Frequency polygon drawn from class mid-points. X-axis: class mark; Y-axis: number of students (frequency).]
Frequency polygon has an advantage over the histogram. The frequency polygons of several
distributions can be drawn on the same axis, which makes comparisons possible whereas histogram
can not be usefully employed in the same way. To compare histograms we draw them on separate
graphs.
1.4.3 Smoothed Frequency Curve
A smoothed frequency curve can be drawn through the various points of the polygon. The
curve is drawn by free hand in such a manner that the area included under the curve is approximately
the same as that of the polygon. The object of drawing a smoothed curve is to eliminate as far as
possible all accidental variations which exist in the original data. While smoothening, the top of
the curve would overtop the highest point of the polygon, particularly when the magnitude of the
class interval is large. The curve should look as regular as possible and all sudden turns should be
avoided. The extent of smoothening would depend upon the nature of the data. For drawing a
smoothed frequency curve it is necessary to first draw the polygon and then smoothen it. We must
keep in mind the following points to smoothen a frequency graph :
(i) Only frequency distribution based on samples should be smoothened.
(ii) Only continuous series should be smoothened.
(iii) The total area under the curve should be equal to the area under the histogram or polygon.
The diagram given below will illustrate the point:
[Figure: Histogram, frequency polygon and frequency curve. X-axis: length of leaves (cm), 6.5 to 14.5; Y-axis: no. of leaves (0 to 50).]
1.4.4 Cumulative Frequency Curves or Ogives
We have discussed the charting of simple distributions where each frequency refers to the
measurement of the class-interval against which it is placed. Sometimes it becomes necessary to
know the number of items whose values are greater or less than a certain amount. We may, for
example, be interested in knowing the number of students whose weight is less than 65 lbs. or more
than say 15.5 lbs. To get this information, it is necessary to change the form of frequency distribution
from a simple to a cumulative distribution. In a cumulative frequency distribution, the frequency
of each class is made to include the frequencies of all the lower or all the upper classes depending
upon the manner in which cumulation is done. The graph of such a distribution is called a cumulative
frequency curve or an Ogive. There are two methods of constructing ogives, namely :
(i) less than method and
(ii) more than method.
In the less than method, we start with the upper limit of each class and go on adding the
frequencies. When these frequencies are plotted we get a rising curve.
In the more than method, we start with the lower limit of each class and we subtract the
frequency of each class from total frequencies. When these frequencies are plotted, we get a declining
curve.
This example would illustrate both types of ogives.
Example 6 : Draw ogives by both the methods from the following data.
Distribution of weight of the students of a college (lbs.)
Weights          No. of Students
90.5–100.5       5
100.5–110.5      34
110.5–120.5      139
120.5–130.5      300
130.5–140.5      367
140.5–150.5      319
150.5–160.5      205
160.5–170.5      76
170.5–180.5      43
180.5–190.5      16
190.5–200.5      3
200.5–210.5      4
210.5–220.5      3
220.5–230.5      1
Solution : First of all we shall find out the cumulative frequencies of the given data by less than
method.
Less than (Weights)    Cumulative frequency
100.5                  5
110.5                  39
120.5                  178
130.5                  478
140.5                  845
150.5                  1164
160.5                  1369
170.5                  1445
180.5                  1488
190.5                  1504
200.5                  1507
210.5                  1511
220.5                  1514
230.5                  1515
Plot these frequencies and weights on a graph paper. The curve formed is called an Ogive.
[Figure: Ogive by less than method. X-axis: sizes (90.5 to 230.5); Y-axis: cumulative frequency (0 to 1500).]
Now we calculate the cumulative frequencies of the given data by more than method.
More than (Weights)    Cumulative frequencies
90.5                   1515
100.5                  1510
110.5                  1476
120.5                  1337
130.5                  1037
140.5                  670
150.5                  351
160.5                  146
170.5                  70
180.5                  27
190.5                  11
200.5                  8
210.5                  4
220.5                  1
By plotting these frequencies on a graph paper, we will get a declining curve which will be our
cumulative frequency curve or Ogive by More than method.
[Figure: Ogive by more than method. X-axis: sizes (90.5 to 230.5); Y-axis: cumulative frequency (0 to 1500).]
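Both sets of cumulative frequencies in Example 6 can be reproduced with a short Python sketch. The variable names are assumptions; the class frequencies are those of the weight distribution given in the example.

```python
from itertools import accumulate

# Class frequencies for 90.5-100.5, 100.5-110.5, ..., 220.5-230.5.
freqs = [5, 34, 139, 300, 367, 319, 205, 76, 43, 16, 3, 4, 3, 1]

# Less than method: running total, read against each upper limit.
less_than = list(accumulate(freqs))

# More than method: total minus everything below each lower limit.
total = less_than[-1]
more_than = [total - below for below in [0] + less_than[:-1]]

print("Less than:", less_than)
print("More than:", more_than)
```

The less than series rises to the total frequency, while the more than series starts at the total and declines, which is why the two ogives slope in opposite directions.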
Although graphs are a powerful and effective medium of presenting statistical data, they are
not under all circumstances and for all purposes complete substitutes for tabular and other forms of
presentation. The specialist in this field is one who recognizes not only the advantages but also the
limitations of these techniques. He knows when to use and when not to use these methods and from
his experience and expertise is able to select the most appropriate method for every purpose.
Example 7 : Draw an ogive by less than method and determine the number of companies getting
profits between Rs. 45 crores and Rs. 75 crores :
Profits (Rs. crores)    No. of Companies
10–20                   8
20–30                   12
30–40                   20
40–50                   24
50–60                   15
60–70                   10
70–80                   7
80–90                   3
90–100                  1
Solution :
OGIVE BY LESS THAN METHOD
Profit (Rs. crores)    No. of Companies
Less than 20           8
Less than 30           20
Less than 40           40
Less than 50           64
Less than 60           79
Less than 70           89
Less than 80           96
Less than 90           99
Less than 100          100
[Figure: Ogive by less than method. X-axis: profit (Rs. in crores); Y-axis: no. of companies. The curve is read at 45 (51 companies) and at 75 (92 companies); 92 – 51 = 41.]
It is clear from the graph that the number of companies getting profits less than Rs. 75 crores
is 92 and the number of companies getting profits less than Rs. 45 crores is 51. Hence the number
of companies getting profits between Rs. 45 crores and Rs. 75 crores is 92 – 51 = 41.
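As a rough numerical cross-check of the graphical reading, the less than ogive can be interpolated linearly between plotted points. This is a sketch under the assumption that frequencies are spread evenly within each class; the function name `less_than_count` is made up for illustration, and the result need not coincide exactly with values read off a hand-drawn curve.

```python
upper_limits = [20, 30, 40, 50, 60, 70, 80, 90, 100]
cumulative = [8, 20, 40, 64, 79, 89, 96, 99, 100]

def less_than_count(x, lowest, upper_limits, cumulative):
    """Linearly interpolated number of items with value less than x."""
    if x <= lowest:
        return 0.0
    prev_limit, prev_cum = lowest, 0
    for limit, cum in zip(upper_limits, cumulative):
        if x <= limit:
            share = (x - prev_limit) / (limit - prev_limit)
            return prev_cum + (cum - prev_cum) * share
        prev_limit, prev_cum = limit, cum
    return float(cumulative[-1])

below_75 = less_than_count(75, 10, upper_limits, cumulative)
below_45 = less_than_count(45, 10, upper_limits, cumulative)
print(below_75, below_45, below_75 - below_45)
```

Interpolation gives 92.5 and 52, a difference of 40.5, which agrees closely with the figure of 41 read from the graph.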
Example 8 : The following distribution is with regard to weight in grams of mangoes of a given
variety. If mangoes of weight less than 443 grams be considered unsuitable for foreign market, what
is the percentage of total yield suitable for it? Assume the given frequency distribution to be typical
of the variety:
Weight in gms.    No. of mangoes
410–419           10
420–429           20
430–439           42
440–449           54
450–459           45
460–469           18
470–479           7
Draw an ogive of ‘more than’ type of the above data and deduce how many mangoes will be
more than 443 grams.
Solution : Mangoes weighing more than 443 gms. are suitable for foreign market. The mangoes
weighing more than 443 gms. lie in the last four classes. Number of mangoes weighing
between 444 and 449 grams would be
$$ \frac{6}{10} \times 54 = \frac{324}{10} = 32.4 $$
Total number of mangoes weighing more than 443 gms. = 32.4 + 45 + 18 + 7 = 102.4
$$ \text{Percentage of mangoes} = \frac{102.4}{196} \times 100 = 52.24 $$
Therefore, the percentage of the total mangoes suitable for foreign market is 52.24.
OGIVE BY MORE THAN METHOD
Weight more than (gms.)    No. of Mangoes
410                        196
420                        186
430                        166
440                        124
450                        70
460                        25
470                        7
[Figure: Ogive by more than method. X-axis: weight in grams (410 to 470); Y-axis: no. of mangoes (0 to 200).]
From the graph it can be seen that there are 103 mangoes whose weight will be more than 443
gms and are suitable for foreign market.
1.5 SUMMARY
● A frequency distribution aims to reduce the size of the given set of data for a better
comprehension.
● An array, which is an arrangement of data in an ascending or descending order of magnitude,
is a useful step in preparing a frequency distribution.
● To prepare a frequency distribution, we have to decide about the class intervals to be taken.
The width of class intervals depends on the number of classes. The number of classes should
not be very small or very large.
● Given values are considered one by one and placed in appropriate class intervals. The number
of observations in each class is called the class frequency.
● The class intervals may be exclusive (overlapping) like 10–20, 20–30, etc. or inclusive like
10–19, 20–29, etc.
● Inclusive class intervals should be transformed into exclusive classes, depending on the way
the given data are recorded.
● Class mid-points are the points that lie halfway between the two class limits.
● The frequencies of a distribution can also be cumulated in ascending or descending order.
They are known as ACF and DCF respectively.
● The ACF are ‘less than’ cumulative frequencies while the DCF are ‘more than’ cumulative
frequencies.
● Absolute class frequencies may also be expressed as relative frequencies, either as proportions
or percentages.
● A frequency distribution may have class intervals with equal or unequal width.
● A frequency distribution may be shown graphically by a histogram and frequency polygon.
● A histogram consists of bars drawn over class limits with heights of bars such that the areas of
the bars are proportional to the frequencies of various class intervals.
● A frequency polygon is a line chart and is drawn by joining points given by the class
mid-points and class frequencies.
● Cumulative frequencies are shown graphically by means of ogives.
1.6 SELF ASSESSMENT QUESTIONS
Exercise 1 : True or False Statements
(i) Before constructing a frequency distribution, it is necessary that the data be arranged as an
array.
(ii) If the class intervals are given in the exclusive form as 10–20, 20–30, etc., then a value
exactly equal to 20 may be included in either of these classes.
(iii) In the case of inclusive class intervals, the class mid-points are determined only after
converting them into exclusive form.
(iv) The number and width of class intervals are determined independently of each other.
(v) A frequency distribution must have all class intervals of equal width.
(vi) A distribution can have both ends open.
(vii) A bivariate frequency distribution can be prepared only when both the variables involved
are discrete or are continuous.
(viii) Relative frequencies are obtained by dividing the frequencies of various classes by the width
of the respective classes.
(ix) Frequency density is another name for relative frequency.
(x) The proportionate frequencies facilitate comparison between distributions better than absolute
frequencies.
(xi) It is never possible to calculate absolute frequencies from the proportionate frequencies for
a distribution.
(xii) In presenting a distribution graphically, the variable is shown horizontally while the
frequencies are shown vertically.
(xiii) It is necessary that the widths of bars representing various class intervals of a frequency
distribution be always equal.
(xiv) The areas covered by a histogram and a frequency polygon are equal.
(xv) Strictly speaking, a histogram cannot be drawn for an open-ended distribution.
Ans. 1. F 2. F 3. T 4. F 5. F 6. T 7. F 8. F 9. F 10. T 11. F 12. T 13. F 14. T 15. T
Exercise 2 : Questions and Answers
(i) What is a frequency distribution? Explain the process of preparing a univariate frequency
distribution.
(ii) Explain the following:
(a) Grouping error
(b) Cumulative frequencies
(c) Relative frequencies
(d) Frequency density
(iii) What is a bivariate frequency distribution? How is it constructed? Can we prepare a bivariate
frequency distribution if one of the variables is discrete and the other is continuous?
(iv) Explain the drawing of histogram when class intervals are equal and when they are not equal.
(v) What are ogives? How are they constructed and what information do they provide?
(vi) From the time cards of a factory, the following information has been obtained about the number
of days each one of the 48 workers has reported late for the work during the last month:
3  0  5  0  6  2  1  0  4  6  5  2
1  1  1  3  4  2  2  5  6  3  0  2
2  3  2  5  4  2  4  3  5  2  2  2
4  6  4  0  3  1  1  4  5  2  1  1
Prepare a frequency distribution using this information. Also, indicate percentage frequencies.
(vii) XYZ Company collected data regarding the number of interviews required for each of its 40
sales persons to make their most recent sale. Following are those numbers:
102   95   90   90  101   60   80  113  102  110
126   66  121  116  139   72  101   93  114   99
112  105   97  100   99  115  129  111  119   81
 91   93  119  113  128  110   75   87  107  108
(a) Construct a frequency distribution with six class intervals.
(b) Construct a histogram from the data.
(viii) If the class mid-points in a frequency distribution of weights of a group of students are
125, 132, 139, 146, 153, 160, 167, 174 and 181 lbs. find
(a) Size of the class interval.
(b) Class limits assuming weights have been measured to the nearest pound.
(ix) Convert the following class intervals into exclusive form:
(a) Diameters (in cm)    (b) Age in years    (c) Height in inches
0.5–0.9                  5–9                 60–64
1.0–1.4                  10–14               65–69
1.5–1.9                  15–24               70–74
2.0–2.4                  25–39               75–79
2.5–2.9                  40–59               80–84
3.0–3.4                  60–79               85–89
(x) The monthly profits earned by 100 companies during the last financial year are given
below.
Monthly Profit (Rs. lakhs)    No. of Companies
20–30                         4
30–40                         8
40–50                         18
50–60                         30
60–70                         15
70–80                         10
80–90                         8
90–100                        7
(a) Draw an ogive by ‘less than’ method and ‘more than’ method.
(b) Obtain the limits of monthly profits of central 50 percent of the companies and
check these values against the formula calculated values.
(xi) The salary distribution of employees of a company is given below:
Salary (in ‘000 Rs.)    No. of Employees
8–10                    18
10–12                   32
12–14                   70
14–16                   88
16–18                   64
18–20                   44
20–22                   24
22–24                   10
(a) Show these data by means of a histogram and frequency polygon on the same graph.
(b) Draw a more-than ogive and using it estimate (i) the number of employees earning
more than Rs. 16,500; and (ii) the number of employees earning less than Rs. 13,000.
(xii) The following table gives the distribution of weekly income of 160 families:
Weekly Income (Rs.)    No. of Families
2,000–4,000            20
4,000–6,000            40
6,000–8,000            50
8,000–12,000           32
12,000–16,000          16
16,000–20,000          2
Draw a ‘less than’ ogive and answer the following from it:
(a) What are the limits within which incomes of the middle 50 percent of the families
lie?
(b) It is decided that 80 percent of the families should pay income tax. What is the
minimum taxable income?
(c) What is the minimum income of the richest 30 percent of the families?
Ans. 10. (b) = 47 and 70, 11. (b) = 126 and 86, 12. (a) = 5000 – 9250 (b) = 4600 (c) = 8250
LESSON : 2
MEASURES OF CENTRAL TENDENCY – MATHEMATICAL
AND POSITIONAL AVERAGES
2. STRUCTURE
2.0 Objective
2.1 What is Central Tendency?
2.2 What are the Objectives of Central Tendency?
2.3 Characteristics of a Good Average
2.4 Types of Averages
2.4.1 Arithmetic Mean
2.4.2 Mathematical Properties of Arithmetic Mean
2.4.3 Weighted Mean
2.5 Geometric Mean
2.5.1 Specific uses of G.M.
2.5.2 Weighted G.M.
2.6 Harmonic Mean
2.6.1 Application of Harmonic Mean
2.7 Median
2.8 Other Positional Averages
2.9 Calculation of Missing Frequencies
2.10 Mode
2.10.1 Determination of Mode by Graph
2.11 Summary
2.12 Self Assessment Questions
2.0 OBJECTIVE
After reading this lesson, you should be able to :
(a) Learn the meaning of central tendency and other averages
(b) Learn the process of computing arithmetic mean, weighted Mean, Harmonic mean, Geometric
mean, Median, Deciles, Quartiles, Percentiles and Mode under different situations
(c) Comprehend mathematical properties of Arithmetic average
(d) Learn specific uses of different averages.
2.1 WHAT IS CENTRAL TENDENCY
One of the important objectives of statistical analysis is to find various numerical values which explain
the inherent characteristics of a frequency distribution. The first of such measures are averages. The
averages are the measures which condense a huge unwieldy set of numerical data into single numerical
values which represent the entire distribution. The inherent inability of the human mind to grasp a large
body of numerical data compels us to look for a few constants that will describe the data. Averages provide us
the gist and give a bird’s eye view of the huge mass of unwieldy numerical data. Averages are the
typical values around which other items of the distribution congregate. These values lie between the
two extreme observations of the distribution and give us an idea about the concentration of the
values in the central part of the distribution. They are called the measures of central tendency.
Averages are also called measures of location since they enable us to locate the position or
place of the distribution in question. Averages are statistical constants which enable us to comprehend
in a single value the significance of the whole. According to Croxton and Cowden, an average value
is a single value within the range of the data that is used to represent all the values in that series.
Since an average is somewhere within the range of the data, it is sometimes called a measure of
central value. An average, known as the measure of central tendency, is the most typical representative
item of the group to which it belongs and which is capable of revealing all important characteristics
of that group or distribution.
2.2 WHAT ARE THE OBJECTS OF CENTRAL TENDENCY
The most important object of calculating an average or measuring central tendency is to determine
a single figure which may be used to represent a whole series involving magnitudes of the same
variable.
The second object is that, since an average represents the entire data, it facilitates comparison within one
group or between groups of data. Thus, the performance of the members of a group can be compared
with the average performance of different groups.
The third object is that an average helps in computing various other statistical measures such as
dispersion, skewness, kurtosis etc.
2.3 CHARACTERISTICS OF A GOOD AVERAGE
Since an average represents the statistical data and is used for purposes of comparison, it must possess
the following properties.
1. It must be rigidly defined and not left to the mere estimation of the observer. If the definition is
rigid, the computed value of the average obtained by different persons shall be similar.
2. The average must be based upon all values given in the distribution. If the average is not based on
all values it might not be representative of the entire group of data.
3. It should be easily understood. The average should possess simple and obvious properties. It
should not be too abstract for the common people.
4. It should be capable of being calculated with reasonable care and rapidity.
5. It should be stable and unaffected by sampling fluctuations.
6. It should be capable of further algebraic manipulation.
2.4 TYPES OF AVERAGES
Different methods of measuring “Central Tendency” provide us with different kinds of averages.
The following are the main types of averages that are commonly used :
1. Mean
(i) Arithmetic mean
(ii) Weighted mean
(iii) Geometric mean
(iv) Harmonic mean
2. Median
3. Mode
2.4.1 Arithmetic Mean
The arithmetic mean of a series is the quotient obtained by dividing the sum of the values by the
number of items. In algebraic language, if $X_1, X_2, X_3, \dots, X_n$ are the n values of a variate X, then
the Arithmetic Mean ($\bar{X}$) is defined by the following formula :
$$\bar{X} = \frac{1}{n}(X_1 + X_2 + X_3 + \dots + X_n) = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{\sum X}{N}$$
Example 1 : The following are the monthly salaries (Rs.) of ten employees in an office. Calculate
the mean salary of the employees: 250, 275, 265, 280, 400, 490, 670, 890, 1100, 1250.
Solution :
$$\bar{X} = \frac{\sum X}{N} = \frac{250 + 275 + 265 + 280 + 400 + 490 + 670 + 890 + 1100 + 1250}{10} = \frac{5870}{10} = \text{Rs. } 587$$
Short-cut Method : The direct method is suitable where the number of items is moderate and the
figures are small integers. But if the number of items is large and/or the values of the
variate are big, then the process of adding together all the values may be a lengthy process. To
overcome this difficulty of computations, a short-cut method may be used. The short-cut method of
computation is based on an important characteristic of the arithmetic mean, that is, the algebraic
sum of the deviations of a series of individual observations from their mean is always equal to zero.
Thus deviations of the various values of the variate from an assumed mean are computed and their sum
is divided by the number of items. The quotient obtained is added to the assumed mean to find the
arithmetic mean.
Symbolically,
$$\bar{X} = A + \frac{\sum dx}{N}$$
where A is the assumed mean and the deviations are dx = (X – A).
We can solve the previous example by short-cut method.
Computation of Arithmetic Mean
Serial Number    Salary (Rupees) X    Deviations from assumed mean, dx = (X – A), A = 400
1.               250                  – 150
2.               275                  – 125
3.               265                  – 135
4.               280                  – 120
5.               400                  0
6.               490                  + 90
7.               670                  + 270
8.               890                  + 490
9.               1100                 + 700
10.              1250                 + 850
N = 10                                Σdx = 1870
By substituting the values in the formula, we get
$$\bar{X} = A + \frac{\sum dx}{N} = 400 + \frac{1870}{10} = \text{Rs. } 587$$
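The two routes to the mean in Example 1 can be checked with a short Python sketch. The function names `mean_direct` and `mean_shortcut` are illustrative choices, not part of the lesson; the data and the assumed mean A = 400 are taken from the example.

```python
# Salaries (Rs.) of the ten employees from Example 1.
salaries = [250, 275, 265, 280, 400, 490, 670, 890, 1100, 1250]


def mean_direct(values):
    """Direct method: sum of the values divided by the number of items."""
    return sum(values) / len(values)


def mean_shortcut(values, assumed_mean):
    """Short-cut method: A + (sum of deviations from A) / N."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


print(mean_direct(salaries))         # direct method
print(mean_shortcut(salaries, 400))  # short-cut method with A = 400
```

Any value may be chosen as the assumed mean; a value close to the centre of the data simply keeps the deviations small.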
Computation of Arithmetic Mean in Discrete Series. In discrete series, arithmetic mean
may be computed by both direct and short cut methods. The formula according to direct method is :
$$\bar{X} = \frac{1}{N}(f_1X_1 + f_2X_2 + \dots + f_nX_n) = \frac{\sum fX}{N}$$
where the variable values $X_1, X_2, \dots, X_n$ have frequencies $f_1, f_2, \dots, f_n$ and $N = \sum f$.
Example 2 : The following table gives the distribution of 100 accidents during the seven days of the
week in a given month. During a particular month there were 5 Fridays and Saturdays and only four
each of other days. Calculate the average number of accidents per day.
Days :                   Sun    Mon    Tue    Wed    Thur    Fri    Sat    Total
Number of accidents :    20     22     10     9      11      8      20     = 100
Solution :
Calculation of Number of Accidents per Day
Day          No. of Accidents (X)    No. of days in month (f)    fX
Sunday       20                      4                           80
Monday       22                      4                           88
Tuesday      10                      4                           40
Wednesday    9                       4                           36
Thursday     11                      4                           44
Friday       8                       5                           40
Saturday     20                      5                           100
Total        100                     N = 30                      ΣfX = 428
$$\bar{X} = \frac{\sum fX}{N} = \frac{428}{30} = 14.27 \approx 14 \text{ accidents per day}$$
The formula for computation of arithmetic mean according to short cut method is
$$\bar{X} = A + \frac{\sum f\,dx}{N}$$
where A is the assumed mean, dx = (X – A) and N = Σf.
We can solve the previous example by short-cut method as given below :
Calculation of Average Accidents per day
Day          X     dx = X – A (where A = 10)    f     fdx
Sunday       20    + 10                         4     + 40
Monday       22    + 12                         4     + 48
Tuesday      10    + 0                          4     + 0
Wednesday    9     – 1                          4     – 4
Thursday     11    + 1                          4     + 4
Friday       8     – 2                          5     – 10
Saturday     20    + 10                         5     + 50
Total                                           30    + 128
$$\bar{X} = A + \frac{\sum f\,dx}{N} = 10 + \frac{128}{30} = 14.27 \approx 14 \text{ accidents per day}$$
Calculation of Arithmetic Mean for Continuous Series : The arithmetic mean can be
computed both by direct and short-cut method. In addition, a coding method or step deviation
method is also applied for simplification of calculations. In any case, it is necessary to find out the
mid-values of the various classes in the frequency distribution before arithmetic mean of the frequency
distribution can be computed. Once the mid-points of various classes are found out, then the process
of the calculation of arithmetic mean is the same as in the case of discrete series. In case of the direct
method, the formula to be used is :
$$\bar{X} = \frac{\sum fm}{N}$$
where m = mid-points of the various classes and N = the total frequency.
In the short-cut method, the following formula is applied:
$$\bar{X} = A + \frac{\sum f\,dx}{N}$$
where dx = (m – A) and N = Σf.
The short-cut method can further be simplified in practice and is named the coding method. The
deviations from the assumed mean are divided by a common factor to reduce their size. The sum of
the products of the deviations and frequencies is multiplied by this common factor and then it is
divided by the total frequency and added to the assumed mean. Symbolically
$$\bar{X} = A + \frac{\sum f\,d'x}{N} \times i$$
where $d'x = \frac{m - A}{i}$ and i = common factor.
Example 3 : Following is the frequency distribution of marks obtained by 50 students in a test of
Statistics :
Marks     Number of Students
0–10      4
10–20     6
20–30     20
30–40     10
40–50     7
50–60     3
Calculate arithmetic mean by:
(i) direct method,
(ii) short-cut method, and
(iii) coding method.
Solution :
Calculation of Arithmetic Mean
X        f     m     fm     dx = m – A        d'x = (m – A)/i    fdx     fd'x
                            (where A = 25)    (where i = 10)
0–10     4     5     20     – 20              – 2                – 80    – 8
10–20    6     15    90     – 10              – 1                – 60    – 6
20–30    20    25    500    0                 0                  0       0
30–40    10    35    350    + 10              + 1                100     + 10
40–50    7     45    315    + 20              + 2                140     + 14
50–60    3     55    165    + 30              + 3                90      + 9
N = 50         Σfm = 1440                                        Σfdx = 190    Σfd'x = + 19
Direct Method :
$$\bar{X} = \frac{\sum fm}{N} = \frac{1440}{50} = 28.8 \text{ marks.}$$
Short-cut Method:
$$\bar{X} = A + \frac{\sum f\,dx}{N} = 25 + \frac{190}{50} = 28.8 \text{ marks.}$$
Coding Method:
$$\bar{X} = A + \frac{\sum f\,d'x}{N} \times i = 25 + \frac{19}{50} \times 10 = 25 + 3.8 = 28.8 \text{ marks.}$$
We can observe that the average marks, i.e. 28.8, are identical by all three methods.
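The agreement of the three methods in Example 3 can be reproduced with a minimal Python sketch. The helper names are illustrative; the class limits, frequencies, assumed mean A = 25 and common factor i = 10 come from the example.

```python
# Class limits and frequencies from Example 3.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [4, 6, 20, 10, 7, 3]

# Mid-value of each class.
mids = [(lower + upper) / 2 for lower, upper in classes]
N = sum(freqs)


def mean_direct():
    """Direct method: sum(f * m) / N."""
    return sum(f * m for f, m in zip(freqs, mids)) / N


def mean_shortcut(A):
    """Short-cut method: A + sum(f * dx) / N with dx = m - A."""
    return A + sum(f * (m - A) for f, m in zip(freqs, mids)) / N


def mean_coding(A, i):
    """Coding (step-deviation) method: A + sum(f * d'x) / N * i with d'x = (m - A) / i."""
    return A + sum(f * ((m - A) / i) for f, m in zip(freqs, mids)) / N * i


print(mean_direct(), mean_shortcut(25), mean_coding(25, 10))
```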
2.4.2 Mathematical Properties of Arithmetic Mean
(i) The sum of the deviations of a given set of individual observations from the arithmetic
mean is always zero.
Symbolically, $\sum (X - \bar{X}) = 0$. It is due to this property that the arithmetic mean is
characterised as the centre of gravity, i.e., the sum of positive deviations from the mean is
equal to the sum of negative deviations.
(ii) The sum of squares of deviations of a set of observations is the minimum when deviations
are taken from the arithmetic average. Symbolically, $\sum (X - \bar{X})^2$ is smaller than
$\sum (X - \text{any other value})^2$.
We can verify the above properties with the help of the following data :
Values        Deviations from X̄              Deviations from assumed mean
X             (X – X̄)      (X – X̄)²          (X – A)      (X – A)²
3             – 6          36                – 7          49
5             – 4          16                – 5          25
10            1            1                 0            0
12            3            9                 2            4
15            6            36                5            25
Total = 45    0            98                – 5          103
$$\bar{X} = \frac{\sum X}{n} = \frac{45}{5} = 9$$
where A (assumed mean) = 10
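The verification in the table above can be repeated in Python. This is a small sketch with illustrative helper names; the five values and the assumed mean of 10 are the ones used in the table.

```python
values = [3, 5, 10, 12, 15]
mean = sum(values) / len(values)  # arithmetic mean of the five values


def sum_of_deviations(data, origin):
    """Sum of (X - origin) over all observations."""
    return sum(x - origin for x in data)


def sum_of_squared_deviations(data, origin):
    """Sum of (X - origin)^2 over all observations."""
    return sum((x - origin) ** 2 for x in data)


# Property (i): deviations from the mean add up to zero.
print(sum_of_deviations(values, mean))

# Property (ii): squared deviations are smallest when taken from the mean.
print(sum_of_squared_deviations(values, mean))
print(sum_of_squared_deviations(values, 10))
```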
(iii) If each value of a variable X is increased or decreased or multiplied by a constant k, the
arithmetic mean also increases or decreases or multiplies by the same constant.
(iv) If we are given the arithmetic mean and number of items of two or more groups, we can
compute the combined average of these groups by applying the following formula :
$$\bar{X}_{12} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2}$$
where $\bar{X}_{12}$ refers to combined average of two groups,
$\bar{X}_1$ refers to arithmetic mean of first group,
$\bar{X}_2$ refers to arithmetic mean of second group,
$N_1$ refers to number of items of first group, and
$N_2$ refers to number of items of second group.
We can understand the property with the help of the following examples.
Example 4 : The average marks of 25 male students in a section is 61 and average marks of 35
female students in the same section is 58. Find combined average marks of 60 students.
Solution : We are given the following information,
$\bar{X}_1 = 61$, $N_1 = 25$, $\bar{X}_2 = 58$, $N_2 = 35$
Apply
$$\bar{X}_{12} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2} = \frac{(25 \times 61) + (35 \times 58)}{25 + 35} = \frac{3555}{60} = 59.25 \text{ marks.}$$
Example 5 : The mean wage of 100 workers in a factory, running two shifts of 60 and 40 workers
respectively, is Rs. 38. The mean wage of 60 workers in the morning shift is Rs. 40. Find the mean wage
of 40 workers working in the evening shift.
Solution : We are given the following information,
$\bar{X}_1 = 40$, $N_1 = 60$, $\bar{X}_2 = ?$, $N_2 = 40$, $\bar{X}_{12} = 38$, and N = 100
Apply
$$\bar{X}_{12} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2}$$
$$38 = \frac{(60 \times 40) + (40 \times \bar{X}_2)}{60 + 40}$$
or
$$3800 = 2400 + 40\bar{X}_2$$
$$\bar{X}_2 = \frac{3800 - 2400}{40} = 35.$$
Example 6 : The mean age of a combined group of men and women is 30 years. If the mean age of
the group of men is 32 and that of the women group is 27, find out the percentage of men and women
in the group.
Solution : Let us take the group of men as the first group and women as the second group. Therefore,
$\bar{X}_1 = 32$ years, $\bar{X}_2 = 27$ years, and $\bar{X}_{12} = 30$ years. In the problem, we are not given the number of
men and women. We can assume $N_1 + N_2 = 100$ and therefore, $N_1 = 100 - N_2$.
Apply
$$\bar{X}_{12} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2}$$
$$30 = \frac{32N_1 + 27N_2}{100} \quad (\text{Substitute } N_1 = 100 - N_2)$$
$$30 \times 100 = 32(100 - N_2) + 27N_2$$
or
$$5N_2 = 200$$
$$N_2 = 200 / 5 = 40\%$$
$$N_1 = (100 - N_2) = (100 - 40) = 60\%$$
Therefore, the percentage of men in the group is 60 and that of women is 40.
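Examples 4 to 6 all rearrange the same combined-mean formula. The sketch below, with illustrative function names, solves it for the combined mean, for a missing group mean, and for the share of the first group; the last rearrangement follows from writing the first group's share as p, so that p·X̄1 + (1 – p)·X̄2 = X̄12.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Combined mean of two groups from their sizes and means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


def missing_group_mean(n1, mean1, n2, combined):
    """Mean of the second group when the combined mean is known."""
    return (combined * (n1 + n2) - n1 * mean1) / n2


def first_group_share(mean1, mean2, combined):
    """Proportion of items in the first group: (combined - mean2) / (mean1 - mean2)."""
    return (combined - mean2) / (mean1 - mean2)


print(combined_mean(25, 61, 35, 58))       # Example 4
print(missing_group_mean(60, 40, 40, 38))  # Example 5
print(first_group_share(32, 27, 30))       # Example 6, share of men
```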
(v) Because $\bar{X} = \frac{\sum X}{N}$,
∴ $\sum X = N\bar{X}$
If we replace each item in the series by the mean, the sum of these substitutions will be equal
to the sum of the individual items. This property is used to find out the aggregate values and corrected
averages. We can understand the property with the help of an example.
Example 7 : The mean of 100 observations is found to be 44. At the time of computation two items
were wrongly taken as 30 and 27 in place of 3 and 72. Find the corrected average.
Solution :
$$\bar{X} = \frac{\sum X}{N}$$
∴ $\sum X = N\bar{X} = 100 \times 44 = 4400$
Corrected $\sum X$ = $\sum X$ + correct items – wrong items = 4400 + 3 + 72 – 30 – 27 = 4418
$$\text{Corrected average} = \frac{\text{Corrected } \sum X}{N} = \frac{4418}{100} = 44.18$$
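The correction in Example 7 amounts to rebuilding the total from the reported mean and swapping the wrong items for the correct ones. A minimal sketch, with an illustrative function name:

```python
def corrected_mean(n, reported_mean, wrong_items, correct_items):
    """Recover the total as N * mean, replace the wrong items, and divide by N again."""
    total = n * reported_mean
    corrected_total = total - sum(wrong_items) + sum(correct_items)
    return corrected_total / n


# Example 7: 30 and 27 were recorded in place of 3 and 72.
print(corrected_mean(100, 44, wrong_items=[30, 27], correct_items=[3, 72]))
```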
Calculation of Arithmetic mean in Case of Open-End Classes :
Open-end classes are those in which lower limit of the first class and the upper limit of the last class
are not defined. In these series, we can not calculate mean unless we make an assumption about the
unknown limits. The assumption depends upon the class-interval following the first class and
preceding the last class. For example :
Marks       No. of students
Below 15    4
15–30       6
30–45       12
45–60       8
Above 60    7
In this example, because all defined class-intervals are same, the assumption would be that the
first and last class shall have same class-interval of 15 and hence the lower limit of the first class
shall be zero and upper limit of last class shall be 75. Hence first class would be 0–15 and the last
class 60–75.
What happens in this case ?
Marks        No. of students
Below 10     4
10–30        7
30–60        10
60–100       8
Above 100    4
In this problem the class interval is 20 in the second class, 30 in the third, 40 in the
fourth class and so on, i.e. the class interval is increasing by 10. Therefore the appropriate assumption
in this case would be that the lower limit of the first class is zero (an interval of 10) and the upper limit
of the last class is 150 (an interval of 50). In case of other open-end class distributions the first class
limit should be fixed on the basis of the succeeding class interval and the last class limit should be fixed
on the basis of the preceding class interval.
If the class intervals are of varying width, an effort should be made to avoid calculating mean
and mode. It is advisable to calculate median.
2.4.3 Weighted Mean
In the computation of arithmetic mean, we give equal importance to each item in the series.
Raja Toy Shop sells Toy Cars at Rs. 3 each, Toy Locomotives at Rs. 5 each, Toy Aeroplanes at
Rs. 7 each and Toy Double Deckers at Rs. 9 each.
What shall be the average price of the toys sold if the shop sells 4 toys, one of each kind ?
$$\bar{X} \text{ (Mean Price)} = \frac{\sum X}{N} = \frac{24}{4} = \text{Rs. } 6$$
In this case the importance of each toy is equal as one toy of each variety has been sold. While
computing the arithmetic mean this fact has been taken care of by including the price of each toy once
only.
But if the shop sells 100 toys, 50 cars, 25 locomotives, 15 aeroplanes and 10 double deckers,
the importance of the four toys to the dealer is not equal as a source of earning revenue. In fact their
respective importance is equal to the number of units of each toy sold, i.e. the importance of Toy car
is 50; the importance of Locomotive is 25; the importance of Aeroplane is 15; and the importance
of Double Decker is 10.
It may be noted that 50, 25, 15, 10 are the quantities of the various classes of toys sold. These
quantities are called as ‘weights’ in statistical language. Weight is represented by symbol W and
ΣW represents the sum of weights.
While determining the average price of toy sold these weights are of great importance and are
taken into account to compute weighted mean.
$$\bar{X}_w = \frac{W_1X_1 + W_2X_2 + W_3X_3 + W_4X_4}{W_1 + W_2 + W_3 + W_4} = \frac{\sum WX}{\sum W}$$
where $W_1, W_2, W_3, W_4$ are weights and $X_1, X_2, X_3, X_4$ represent the prices of the 4 varieties of toy.
Hence by substituting the values of $W_1, W_2, W_3, W_4$ and $X_1, X_2, X_3, X_4$, we get
$$\bar{X}_w = \frac{(50 \times 3) + (25 \times 5) + (15 \times 7) + (10 \times 9)}{50 + 25 + 15 + 10} = \frac{150 + 125 + 105 + 90}{100} = \frac{470}{100} = \text{Rs. } 4.70$$
The table given below demonstrates the procedure of computing the weighted mean.
Weighted Arithmetic Mean of Toys Sold by the Raja Toy Shop
Toy              Price per toy (Rs.) X    Number sold W    Price × weight WX
Car              3                        50               150
Locomotive       5                        25               125
Aeroplane        7                        15               105
Double Decker    9                        10               90
                                          ΣW = 100         ΣWX = 470
$$\therefore \bar{X}_w = \frac{\sum WX}{\sum W} = \frac{470}{100} = \text{Rs. } 4.70$$
Example 8 : The table below shows the number of skilled and unskilled workers in two localities
along with their average hourly wages.
                   Ram Nagar                     Shyam Nagar
Worker Category    Number    Wages (per hour)    Number    Wages (per hour)
Skilled            150       1.80                350       1.75
Unskilled          850       1.30                650       1.25
Determine the average hourly wage in each locality. Also give reasons why the results show
that the average hourly wage in Shyam Nagar exceeds the average hourly wage in Ram Nagar, even
though in Shyam Nagar the average hourly wage of each category of workers is lower. It is
required to compute the weighted mean.
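Solution :
             Ram Nagar                 Shyam Nagar
             X       W       WX       X       W       WX
Skilled      1.80    150     270      1.75    350     612.50
Unskilled    1.30    850     1105     1.25    650     812.50
Total                1000    1375             1000    1425
Ram Nagar :
$$\bar{X}_w = \frac{1375}{1000} = \text{Rs. } 1.375$$
Shyam Nagar :
$$\bar{X}_w = \frac{1425}{1000} = \text{Rs. } 1.425$$
It may be noted that weights are more evenly assigned to the different categories of workers in
Shyam Nagar than in Ram Nagar. Skilled workers, who earn more, form 35 per cent of the workers in
Shyam Nagar against 15 per cent in Ram Nagar, and this raises the overall average there.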
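The weighted mean of the toy prices and of the hourly wages in Example 8 can be checked with a short sketch. The function name `weighted_mean` is an illustrative choice; all figures come from the text above.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(W * X) / sum(W)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Toy shop: prices weighted by the number of units sold.
toy_prices = [3, 5, 7, 9]
toys_sold = [50, 25, 15, 10]
print(weighted_mean(toy_prices, toys_sold))

# Example 8: hourly wages weighted by the number of workers in each category.
ram_nagar = weighted_mean([1.80, 1.30], [150, 850])
shyam_nagar = weighted_mean([1.75, 1.25], [350, 650])
print(ram_nagar, shyam_nagar)
```

With equal weights the function reduces to the simple arithmetic mean, which is why one toy of each kind gives Rs. 6.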
2.5 GEOMETRIC MEAN
In general, if we have n numbers (none of them being zero), then the G.M. is defined as
$$\text{G.M.} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = (x_1 \cdot x_2 \cdots x_n)^{1/n}$$
In the case of a discrete series, if $x_1, x_2, \dots, x_n$ occur $f_1, f_2, \dots, f_n$ times respectively and N is
the total frequency (i.e. $N = f_1 + f_2 + \dots + f_n$), then
$$\text{G.M.} = \sqrt[N]{x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n}}$$
For convenience, use of logarithms is made extensively to calculate the nth root. In terms of
logarithms
$$\text{G.M.} = \text{AL}\left[\frac{\log x_1 + \log x_2 + \dots + \log x_n}{n}\right] = \text{AL}\left[\frac{\sum \log x}{N}\right],$$
where AL stands for anti log.
In discrete series,
$$\text{G.M.} = \text{AL}\left[\frac{\sum f \log x}{N}\right]$$
and in case of continuous series,
$$\text{G.M.} = \text{AL}\left[\frac{\sum f \log m}{N}\right]$$
Example 9 : Calculate G.M. of the following data : 2, 4, 8
Solution :
$$\text{G.M.} = \sqrt[3]{2 \times 4 \times 8} = \sqrt[3]{64} = 4$$
In terms of logarithms, the question can be solved as follows :
log 2 = 0.3010, log 4 = 0.6021, and log 8 = 0.9031
Apply the formula:
$$\text{G.M.} = \text{AL}\left[\frac{\sum \log x}{N}\right] = \text{AL}\left[\frac{1.8062}{3}\right] = \text{AL}(0.60206) = 4$$
Example 10. Calculate geometric mean of the following data :
x :    5    6    7    8     9    10    11
f :    2    4    7    10    9    6     2
Solution :
Calculation of G.M.
x     log x     f         f log x
5     0.6990    2         1.3980
6     0.7782    4         3.1128
7     0.8451    7         5.9157
8     0.9031    10        9.0310
9     0.9542    9         8.5878
10    1.0000    6         6.0000
11    1.0414    2         2.0828
                N = 40    Σf log x = 36.1281
$$\text{G.M.} = \text{AL}\left[\frac{\sum f \log x}{N}\right] = \text{AL}\left[\frac{36.1281}{40}\right] = \text{AL}(0.9032) = 8.002$$
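The logarithmic route to the geometric mean in Examples 9 and 10 can be sketched in Python as follows. The function name and the optional `freqs` argument are illustrative; `math.log10` plays the role of the log table and raising 10 to a power plays the role of the anti log. Results may differ slightly from the worked figures because the tables are rounded to four decimals.

```python
import math


def geometric_mean(values, freqs=None):
    """G.M. = antilog( sum(f * log x) / N ); every value must be positive."""
    if freqs is None:
        freqs = [1] * len(values)  # individual observations: each value occurs once
    n = sum(freqs)
    log_sum = sum(f * math.log10(x) for x, f in zip(values, freqs))
    return 10 ** (log_sum / n)


# Example 9: individual observations.
print(geometric_mean([2, 4, 8]))

# Example 10: discrete series with frequencies.
x = [5, 6, 7, 8, 9, 10, 11]
f = [2, 4, 7, 10, 9, 6, 2]
print(round(geometric_mean(x, f), 3))
```

For a continuous series the same function can be applied to the class mid-points in place of the values.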
Example 11 : Calculate G.M. from the following data :
X            f
9.5–14.5     10
14.5–19.5    15
19.5–24.5    17
24.5–29.5    25
29.5–34.5    18
34.5–39.5    12
39.5–44.5    8
Solution :
Calculation of G.M.
X            m     log m     f          f log m
9.5–14.5     12    1.0792    10         10.7920
14.5–19.5    17    1.2304    15         18.4560
19.5–24.5    22    1.3424    17         22.8208
24.5–29.5    27    1.4314    25         35.7850
29.5–34.5    32    1.5051    18         27.0918
34.5–39.5    37    1.5682    12         18.8184
39.5–44.5    42    1.6232    8          12.9856
                             N = 105    Σf log m = 146.7496
$$\text{G.M.} = \text{AL}\left[\frac{\sum f \log m}{N}\right] = \text{AL}\left[\frac{146.7496}{105}\right] = \text{AL}(1.3976) = 24.98$$
2.5.1 Specific uses of G.M.
The Geometric Mean has certain specific uses, some of them are :
(i) It is used in the construction of index numbers,
(ii) It is also helpful in finding out the compound rates of change such as the rate of growth of
population in a country.
(iii) It is suitable where the data are expressed in terms of rates, ratios and percentage.
(iv) It is quite useful in computing the average rates of depreciation or appreciation.
(v) It is most suitable when large weights are to be assigned to small items and small weights
to large items.
Example 12. The gross national product of a country was Rs. 1,000 crores 10 years earlier. It is Rs.
2,000 crores now. Calculate the rate of growth in G.N.P.
Solution : In this case the compound interest formula will be used for computing the average annual
per cent rate of growth.
$$P_n = P_0(1 + r)^n$$
where
$P_n$ = principal sum (or any other variate) at the end of the period.
$P_0$ = principal sum in the beginning of the period.
r = rate of increase or decrease.
n = number of years.
It may be noted that the above formula can also be written in the following form :
$$r = \sqrt[n]{\frac{P_n}{P_0}} - 1$$
Substituting the values given in the formula, we have
$$r = \sqrt[10]{\frac{2000}{1000}} - 1 = \sqrt[10]{2} - 1 = \text{AL}\left[\frac{\log 2}{10}\right] - 1 = \text{AL}\left[\frac{0.30103}{10}\right] - 1 = 1.0718 - 1 = 0.0718 = 7.18\%$$
Hence, the rate of growth in GNP is 7.18%.
Example 13 : The price of a commodity increased by 5 per cent from 1998 to 1999, 8 per cent from
1999 to 2000 and 77 per cent from 2000 to 2001. The average increase from 1998 to 2001 is quoted
at 26 per cent and not 30 per cent. Explain this statement and verify your result.
Solution : Taking $P_n$ as the price at the end of the period and $P_0$ as the price in the beginning, we can
substitute the values of $P_n$ and $P_0$ in the compound interest formula. Taking $P_0 = 100$, the price at the
end is $P_n = 100 \times 1.05 \times 1.08 \times 1.77 = 200.72$.
$$P_n = P_0(1 + r)^n$$
$$200.72 = 100(1 + r)^3$$
$$(1 + r)^3 = \frac{200.72}{100} \quad \text{or} \quad 1 + r = \sqrt[3]{\frac{200.72}{100}}$$
or
$$r = \sqrt[3]{\frac{200.72}{100}} - 1 = 1.260 - 1 = 0.260 = 26\%$$
Thus the average increase is not the arithmetic average (5 + 8 + 77)/3 = 30 per cent. It is 26% as found out by G.M.
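The growth-rate calculations of Examples 12 and 13 can be sketched as below. Both function names are illustrative; the first applies r = (Pn/P0)^(1/n) – 1 directly, and the second first compounds the yearly percentage changes into a single growth factor.

```python
def average_growth_rate(start, end, periods):
    """Average compound rate per period: (end / start) ** (1 / periods) - 1."""
    return (end / start) ** (1 / periods) - 1


def average_rate_from_changes(percent_changes):
    """Average compound rate implied by a sequence of percentage changes."""
    factor = 1.0
    for pct in percent_changes:
        factor *= 1 + pct / 100  # growth factor of each period
    return factor ** (1 / len(percent_changes)) - 1


# Example 12: G.N.P. doubles from 1,000 to 2,000 crores in 10 years.
print(round(average_growth_rate(1000, 2000, 10) * 100, 2))

# Example 13: price rises of 5%, 8% and 77% in successive years.
print(round(average_rate_from_changes([5, 8, 77]) * 100, 2))
```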
2.5.2 Weighted G.M.
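The weighted G.M. is calculated with the help of the following formula :
$$\text{G.M.} = \left(x_1^{w_1} \cdot x_2^{w_2} \cdots x_n^{w_n}\right)^{1/\sum w}$$
$$= \text{AL}\left[\frac{w_1 \log x_1 + w_2 \log x_2 + \dots + w_n \log x_n}{w_1 + w_2 + \dots + w_n}\right] = \text{AL}\left[\frac{\sum (w \log x)}{\sum w}\right]$$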
Example 14 : Find out weighted G.M. from the following data :
Group         Index number    Weights
Food          352             48
Fuel          220             10
Cloth         230             8
House Rent    160             12
Misc.         190             15
Solution :
Calculation of Weighted G.M.
Group         Index Number (x)    Weights (w)    log x     w log x
Food          352                 48             2.5465    122.2320
Fuel          220                 10             2.3424    23.4240
Cloth         230                 8              2.3617    18.8936
House Rent    160                 12             2.2041    26.4492
Misc.         190                 15             2.2788    34.1820
                                  Σw = 93                  Σw log x = 225.1808
$$\text{G.M. (weighted)} = \text{AL}\left[\frac{\sum w \log x}{\sum w}\right] = \text{AL}\left[\frac{225.1808}{93}\right] = 263.8$$
Example 15 : A machine depreciates at the rate of 35.5% per annum in the first year, at the rate of
22.5% per annum in the second year, and at the rate of 9.5% per annum in the third year, each
percentage being computed on the actual value. What is the average rate of depreciation?
Solution : Average rate of depreciation can be calculated by taking G.M.
Year    X (values taking 100 as base)    log X
I       100 – 35.5 = 64.5                1.8096
II      100 – 22.5 = 77.5                1.8893
III     100 – 9.5 = 90.5                 1.9566
                                         Σ log X = 5.6555
Apply
$$\text{G.M.} = \text{AL}\left[\frac{\sum \log X}{N}\right] = \text{AL}\left[\frac{5.6555}{3}\right] = \text{AL}(1.8852) = 76.77$$
∴ Average rate of depreciation = 100 – 76.77 = 23.23%.
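The weighted G.M. of Example 14 and the depreciation rate of Example 15 can both be checked with one small function. The name `weighted_geometric_mean` is illustrative; the depreciation case is treated as a weighted G.M. with equal weights, which is the same as the simple G.M. The output may differ in the last digit from the worked figures because the log tables are rounded.

```python
import math


def weighted_geometric_mean(values, weights):
    """Weighted G.M. = antilog( sum(w * log x) / sum(w) ); values must be positive."""
    log_sum = sum(w * math.log10(x) for x, w in zip(values, weights))
    return 10 ** (log_sum / sum(weights))


# Example 14: group index numbers with their weights.
index_numbers = [352, 220, 230, 160, 190]
weights = [48, 10, 8, 12, 15]
index_gm = weighted_geometric_mean(index_numbers, weights)
print(round(index_gm, 1))

# Example 15: value retained each year, out of 100, after depreciation.
retained = [100 - 35.5, 100 - 22.5, 100 - 9.5]
average_depreciation = 100 - weighted_geometric_mean(retained, [1, 1, 1])
print(round(average_depreciation, 2))
```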
Example 16 : The arithmetic mean and geometric mean of two values are 10 and 8 respectively.
Find the values.
Solution : If two values are taken as a and b, then
$$\frac{a + b}{2} = 10 \quad \text{and} \quad \sqrt{ab} = 8$$
or a + b = 20 and ab = 64,
then
$$a - b = \sqrt{(a + b)^2 - 4ab} = \sqrt{(20)^2 - 4 \times 64} = \sqrt{400 - 256} = \sqrt{144} = 12$$
Now, we have a + b = 20, a – b = 12
Solving for a and b, we get a = 16 and b = 4.
2.6 HARMONIC MEAN
The harmonic mean is defined as the reciprocal of the average of the reciprocals of all items in a
series. Symbolically,
$$\text{H.M.} = \frac{N}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \dots + \frac{1}{x_n}} = \frac{N}{\sum \left(\frac{1}{x}\right)}$$
In case of a discrete series,
$$\text{H.M.} = \frac{N}{\sum \left(f \times \frac{1}{x}\right)}$$
and in case of a continuous series,
$$\text{H.M.} = \frac{N}{\sum \left(f \times \frac{1}{m}\right)}$$
It may be noted that none of the values of the variable should be zero.
Example 17 : Calculate harmonic mean from the following data : 5, 15, 25, 35 and 45
Solution :
X        1/X
5        0.20
15       0.067
25       0.040
35       0.029
45       0.022
N = 5    Σ(1/X) = 0.358
$$\text{H.M.} = \frac{N}{\sum \left(\frac{1}{X}\right)} = \frac{5}{0.358} = 14 \text{ approx.}$$
Example 18 : From the following data compute the value of the harmonic mean :
x :    5    15    25    35    45
f :    5    15    10    15    5
Solution :
Calculation of Harmonic Mean
x     f          1/x      f × (1/x)
5     5          0.200    1.000
15    15         0.067    1.005
25    10         0.040    0.400
35    15         0.029    0.435
45    5          0.022    0.110
      Σf = 50             Σ(f × 1/x) = 2.950
$$\text{H.M.} = \frac{N}{\sum \left(f \times \frac{1}{x}\right)} = \frac{50}{2.95} = 17 \text{ approx.}$$
Example 19 : Calculate harmonic mean from the following distribution:
x        f
0–10     5
10–20    15
20–30    10
30–40    15
40–50    5
Solution : First of all, we shall find out mid points of the various classes. They are 5, 15, 25, 35 and
45. Then we will calculate the H.M. by applying the following formula :
$$\text{H.M.} = \frac{N}{\sum \left(f \times \frac{1}{m}\right)}$$
Calculation of Harmonic Mean
m (mid points)    f          1/m      f × (1/m)
5                 5          0.200    1.000
15                15         0.067    1.005
25                10         0.040    0.400
35                15         0.029    0.435
45                5          0.022    0.110
                  Σf = 50             Σ(f × 1/m) = 2.950
$$\text{H.M.} = \frac{N}{\sum \left(f \times \frac{1}{m}\right)} = \frac{50}{2.950} = 17 \text{ approximately}$$
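The harmonic mean calculations of Examples 17 to 19 can be sketched with a single function. The name and the optional `freqs` argument are illustrative; exact reciprocals are used instead of the three-decimal table values, so the results agree with the worked answers only approximately.

```python
def harmonic_mean(values, freqs=None):
    """H.M. = N / sum(f / x); no value may be zero."""
    if freqs is None:
        freqs = [1] * len(values)  # individual observations
    n = sum(freqs)
    return n / sum(f / x for x, f in zip(values, freqs))


# Example 17: individual observations.
print(round(harmonic_mean([5, 15, 25, 35, 45]), 2))

# Examples 18 and 19: the same values (or class mid-points) with frequencies.
mid_points = [5, 15, 25, 35, 45]
freqs = [5, 15, 10, 15, 5]
print(round(harmonic_mean(mid_points, freqs), 2))
```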
2.6.1. Application of Harmonic Mean
Like the geometric mean, the harmonic mean is also applicable to certain special types of problems.
Some of them are:
(i) If, in averaging time rates, distance is constant, then H.M. is to be calculated.
Example 20 : A man travels 480 km. a day. On the first day he travels for 12 hours @ 40 km. per
hour and second day for 10 hours @ 48 km. per hour. On the third day he travels for 15 hours @ 32
km. per hour. Find his average speed.
Solution: We shall use the harmonic mean,
$$\text{H.M.} = \frac{N}{\sum \left(\frac{1}{X}\right)} = \frac{3}{\frac{1}{40} + \frac{1}{48} + \frac{1}{32}} = \frac{3}{\frac{37}{480}} = 39 \text{ km. per hour (approx.)}$$
This agrees with the total distance of 1,440 km. divided by the total time of 37 hours.
The arithmetic mean would be $\frac{48 + 40 + 32}{3} = 40$ km. per hour.
(ii) If, in averaging the price data, the prices are expressed as “quantity per rupee”, then the
harmonic mean should be applied.
Example 21 : A man purchased one kilo of cabbage from each of four places at the rate of 20 kg.,
16 kg., 12 kg. and 10 kg. per rupee respectively. On the average how many kilos of cabbage has he
purchased per rupee?
Solution:
$$\text{H.M.} = \frac{N}{\sum \left(\frac{1}{x}\right)} = \frac{4}{\frac{1}{20} + \frac{1}{16} + \frac{1}{12} + \frac{1}{10}} = \frac{4}{\frac{71}{240}} = \frac{4 \times 240}{71} = 13.5 \text{ kg. per rupee.}$$
Example 22 : Find two numbers whose geometric mean is 18 and arithmetic mean is 19.5. Also,
calculate their harmonic mean.
Solution : Let us assume the two numbers are x and y. Now with the help of the given information,
$$\bar{X} = \frac{x + y}{2} = 19.5$$
or x + y = 39 ...(i)
Also,
$$\text{G.M.} = \sqrt{xy} = 18$$
or xy = 324 ...(ii)
Now,
$$(x - y)^2 = (x + y)^2 - 4xy = 39^2 - 4 \times 324 = 225$$
∴ x – y = ± 15 ...(iii)
With the help of equations (i) and (iii),
we get x = 27 and y = 12
Therefore the two numbers are 12 and 27.
Now,
$$\text{G.M.} = \sqrt{\text{A.M.} \times \text{H.M.}}$$
or $\text{G.M.}^2 = \text{A.M.} \times \text{H.M.}$
or $\text{H.M.} = \text{G.M.}^2 / \text{A.M.}$
$$\text{H.M.} = 18^2 / 19.5 = 16.62$$
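The applications in this sub-section can be checked numerically. The sketch below uses illustrative function names: `harmonic_mean` averages the rates of Examples 20 and 21, and `numbers_from_am_gm` recovers two numbers from their arithmetic and geometric means as in Examples 16 and 22, using x + y = 2·A.M. and xy = G.M.².

```python
import math


def harmonic_mean(values):
    """H.M. = N / sum(1 / x) for individual observations."""
    return len(values) / sum(1 / x for x in values)


def numbers_from_am_gm(am, gm):
    """Two numbers with the given A.M. and G.M., returned as (larger, smaller)."""
    total = 2 * am        # x + y
    product = gm ** 2     # x * y
    difference = math.sqrt(total ** 2 - 4 * product)  # x - y
    return (total + difference) / 2, (total - difference) / 2


# Example 20: the same distance covered at 40, 48 and 32 km. per hour.
print(round(harmonic_mean([40, 48, 32]), 2))

# Example 21: 20, 16, 12 and 10 kg. per rupee for one kilo bought at each place.
print(round(harmonic_mean([20, 16, 12, 10]), 2))

# Example 22: G.M. = 18 and A.M. = 19.5, then H.M. = G.M.^2 / A.M.
x, y = numbers_from_am_gm(19.5, 18)
print(x, y, round(18 ** 2 / 19.5, 2))
```

For two numbers the value G.M.²/A.M. equals the harmonic mean computed directly from the numbers, which gives a quick cross-check of the result.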
2.7 MEDIAN
The median is that value of the variable which divides the group into two equal parts, one part
comprising all values greater than the median and the other all values less than it. The median of a
distribution may be defined as that value of the variable which exceeds and is exceeded by the same number of
observations. It is the value such that the number of observations above it is equal to the number of
observations below it. Thus, whereas the arithmetic mean is based on all items of the distribution,
the median is a positional average, that is, it depends upon the position occupied by a value in the
frequency distribution.
When the items of a series are arranged in ascending or descending order of magnitude, the
value of the middle item in the series is known as the median in the case of individual observations.
Symbolically,
$$\text{Median} = \text{size of } \left(\frac{N + 1}{2}\right)\text{th item}$$
If the number of items is even, then there is no value exactly in the middle of the series. In such
a situation the median is arbitrarily taken to be halfway between the two middle items. Symbolically,
$$\text{Median} = \frac{\text{size of } \frac{N}{2}\text{th item} + \text{size of } \left(\frac{N}{2} + 1\right)\text{th item}}{2}$$
Example 23. Find the median of the following series:
(i) 8, 4, 8, 3, 4, 8, 6, 5, 10.
(ii) 15, 12, 5, 7, 9, 5, 11, 28.
Solution :
Computation of Median
          (i)                     (ii)
Serial No.     X        Serial No.     X
    1          3            1          5
    2          4            2          5
    3          4            3          7
    4          5            4          9
    5          6            5         11
    6          8            6         12
    7          8            7         15
    8          8            8         28
    9         10
  N = 9                   N = 8
For (i) series,
$$\text{Median} = \text{size of } \left(\frac{N + 1}{2}\right)\text{th item} = \text{size of } \frac{9 + 1}{2}\text{th item} = \text{size of 5th item} = 6$$
For (ii) series,
$$\text{Median} = \text{size of } \left(\frac{N + 1}{2}\right)\text{th item} = \text{size of } \frac{8 + 1}{2}\text{th item} = \frac{\text{size of 4th item} + \text{size of 5th item}}{2} = \frac{9 + 11}{2} = 10$$
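The rule for individual observations can be sketched in a few lines. This is an illustrative sketch only; `median_individual` is a hypothetical name, and the standard library function `statistics.median` does the same job.

```python
def median_individual(values):
    """Median of individual observations: middle item of the sorted series,
    or the average of the two middle items when N is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # the (N + 1)/2-th item
    return (ordered[mid - 1] + ordered[mid]) / 2   # halfway between the middle pair

series_i = [8, 4, 8, 3, 4, 8, 6, 5, 10]
series_ii = [15, 12, 5, 7, 9, 5, 11, 28]
median_i = median_individual(series_i)
median_ii = median_individual(series_ii)
print(median_i, median_ii)
```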
Location of Median in Discrete series: In a discrete series, the median is computed in the following manner:
(i) Arrange the given variable data in ascending or descending order.
(ii) Find cumulative frequencies.
(iii) Apply Median = size of $\left(\frac{N + 1}{2}\right)$th item.
(iv) Locate the median according to the size, i.e., take the value of the variable whose cumulative frequency equals the size of the item or, failing that, the next higher cumulative frequency.
Example 24 : Following are the number of rooms in the houses of a particular locality. Find the median of the data :
No. of rooms :     3     4     5     6     7     8
No. of houses :   38   654   311    42    12     2
Solution :
Computation of Median
No. of rooms (X)   No. of houses (f)   Cumulative frequency (Cf)
       3                  38                    38
       4                 654                   692
       5                 311                  1003
       6                  42                  1045
       7                  12                  1057
       8                   2                  1059
$$\text{Median} = \text{size of } \left(\frac{N + 1}{2}\right)\text{th item} = \text{size of } \frac{1059 + 1}{2}\text{th item} = \text{530th item.}$$
The 530th item lies within the cumulative frequency of 692, and the value corresponding to this is 4.
Therefore, Median = 4 rooms
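The four steps for a discrete series translate directly into code. The sketch below is an illustration under the rule stated above (take the first value whose cumulative frequency reaches the (N + 1)/2-th item); `median_discrete` is a hypothetical helper name.

```python
from itertools import accumulate

def median_discrete(values, freqs):
    """Median of a discrete series located through cumulative frequencies."""
    pairs = sorted(zip(values, freqs))                    # step (i): ascending order
    ordered_values = [v for v, _ in pairs]
    cum_freqs = list(accumulate(f for _, f in pairs))     # step (ii)
    position = (cum_freqs[-1] + 1) / 2                    # step (iii): (N + 1)/2
    for value, cf in zip(ordered_values, cum_freqs):      # step (iv)
        if cf >= position:
            return value

rooms = [3, 4, 5, 6, 7, 8]
houses = [38, 654, 311, 42, 12, 2]
median_rooms = median_discrete(rooms, houses)
print(median_rooms)
```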
In a continuous series, the median is computed in the following manner :
(i) Arrange the given variable data in ascending or descending order.
(ii) If an inclusive series is given, it must be converted into an exclusive series to find the real class intervals.
(iii) Find cumulative frequencies.
(iv) Apply Median = size of $\frac{N}{2}$th item to ascertain the median class.
(v) Apply the formula of interpolation to ascertain the value of the median.
$$\text{Median} = l_1 + \frac{\frac{N}{2} - cf_0}{f} \times (l_2 - l_1)$$
or
$$\text{Median} = l_2 - \frac{(cf_0 + f) - \frac{N}{2}}{f} \times (l_2 - l_1)$$
where, l1 refers to the lower limit of the median class
l2 refers to the upper limit of the median class
cf0 refers to the cumulative frequency of the class preceding the median class
f refers to the frequency of the median class.
(In the second form, cf0 + f is the cumulative frequency of the median class itself; both forms give the same value.)
Example 25 : The following table gives the distribution of marks secured by some students in an examination :
Marks :             0–20   21–30   31–40   41–50   51–60   61–70   71–80
No. of Students :    42      38     120      84      48      36      31
Find the median marks.
Solution :
Calculation of Median Marks
Marks (x)   No. of students (f)    cf
  0–20             42              42
 21–30             38              80
 31–40            120             200
 41–50             84             284
 51–60             48             332
 61–70             36             368
 71–80             31             399
$$\text{Median} = \text{size of } \frac{N}{2}\text{th item} = \text{size of } \frac{399}{2}\text{th item} = \text{199.5th item}$$
which lies in the (31–40) group; therefore the median class, in real limits, is 30.5–40.5.
Applying the formula of interpolation,
$$\text{Median} = l_1 + \frac{\frac{N}{2} - cf_0}{f} \times (l_2 - l_1) = 30.5 + \frac{199.5 - 80}{120} \times 10 = 30.5 + \frac{119.5}{12} = 40.46 \text{ marks.}$$
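The interpolation step for a continuous series can be sketched as follows. This is a minimal illustration, assuming the classes are already given in real (exclusive) limits; `median_continuous` is a hypothetical helper name. The frequencies are those of the solution table of Example 25 (cumulative total 399).

```python
def median_continuous(classes, freqs):
    """Interpolated median: l1 + (N/2 - cf0) / f * (l2 - l1).

    classes: list of (lower, upper) real class limits in ascending order.
    """
    half = sum(freqs) / 2            # position of the N/2-th item
    cf_prev = 0                      # cumulative frequency before the current class
    for (l1, l2), f in zip(classes, freqs):
        if f > 0 and cf_prev + f >= half:
            return l1 + (half - cf_prev) / f * (l2 - l1)
        cf_prev += f

# Inclusive classes 0-20, 21-30, ... converted to real limits
classes = [(-0.5, 20.5), (20.5, 30.5), (30.5, 40.5), (40.5, 50.5),
           (50.5, 60.5), (60.5, 70.5), (70.5, 80.5)]
freqs = [42, 38, 120, 84, 48, 36, 31]
median_marks = median_continuous(classes, freqs)
print(round(median_marks, 2))
```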
2.8 OTHER POSITIONAL AVERAGES
The median divides the series into two equal parts. Similarly there are certain other measures which divide the series into a certain number of equal parts. These are the first quartile, third quartile, deciles, percentiles etc. If the items are arranged in ascending or descending order of magnitude, Q1 is that value which covers 1/4th of the total number of items. Similarly, if the total number of items is divided into ten equal parts, then there shall be nine deciles.
Symbolically,
First quartile (Q1) = size of $\left(\frac{N + 1}{4}\right)$th item
Third quartile (Q3) = size of $\frac{3(N + 1)}{4}$th item
First decile (D1) = size of $\left(\frac{N + 1}{10}\right)$th item
Sixth decile (D6) = size of $\frac{6(N + 1)}{10}$th item
First percentile (P1) = size of $\left(\frac{N + 1}{100}\right)$th item
Once the positions of the items are found, the formulae of interpolation are applied for ascertaining the value of Q1, Q3, D1, D4, P40 etc.
Example 26. Calculate Q1, Q3, D2 and P5 from the following data :
Marks :             Below 10   10–20   20–40   40–60   60–80   Above 80
No. of Students :       8        10      22      25      10        5
Solution:
Calculation of Positional values
Marks        No. of Students (f)    c.f.
Below 10             8                8
10–20               10               18
20–40               22               40
40–60               25               65
60–80               10               75
Above 80             5               80
                  N = 80
Q1 = size of $\frac{N}{4}$th item = $\frac{80}{4}$ = 20th item
Hence Q1 lies in the class 20–40. Apply
$$Q_1 = l_1 + \frac{\frac{N}{4} - Cf_0}{f} \times i$$
where l1 = 20, N/4 = 20, Cf0 = 18, f = 22 and i = (l2 − l1) = 20.
By substituting the values, we get
$$Q_1 = 20 + \frac{20 - 18}{22} \times 20 = 20 + 1.8 = 21.8$$
Similarly, we can calculate
Q3 = size of $\frac{3N}{4}$th item = $\frac{3 \times 80}{4}$th item = 60th item.
Hence Q3 lies in the class 40–60.
$$Q_3 = l_1 + \frac{\frac{3N}{4} - Cf_0}{f} \times i$$
where l1 = 40, 3N/4 = 60, Cf0 = 40, f = 25, i = 20.
$$\therefore Q_3 = 40 + \frac{60 - 40}{25} \times 20 = 40 + 16 = 56$$
D2 = size of $\frac{2N}{10}$th item = 16th item. Hence D2 lies in the class 10–20.
$$D_2 = l_1 + \frac{\frac{2N}{10} - Cf_0}{f} \times i$$
where l1 = 10, 2N/10 = 16, Cf0 = 8, f = 10, i = 10.
$$D_2 = 10 + \frac{16 - 8}{10} \times 10 = 10 + 8 = 18$$
P5 = size of $\frac{5N}{100}$th item = $\frac{5 \times 80}{100}$th item = 4th item. Hence P5 lies in the class 0–10.
$$P_5 = l_1 + \frac{\frac{5N}{100} - Cf_0}{f} \times i$$
where l1 = 0, 5N/100 = 4, Cf0 = 0, f = 8, i = 10.
$$P_5 = 0 + \frac{4 - 0}{8} \times 10 = 0 + 5 = 5.$$
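All four results in Example 26 come from one interpolation rule with a different position (kN/4, kN/10 or kN/100). The following sketch generalises it; `partition_value` is a hypothetical helper, and the upper limit of 100 given to the open class "Above 80" is an assumption that does not affect any of the four answers.

```python
def partition_value(classes, freqs, k, parts):
    """k-th partition value when the series is cut into `parts` equal parts:
    l1 + (k*N/parts - cf0) / f * (l2 - l1)."""
    target = k * sum(freqs) / parts      # position of the required item
    cf_prev = 0
    for (l1, l2), f in zip(classes, freqs):
        if f > 0 and cf_prev + f >= target:
            return l1 + (target - cf_prev) / f * (l2 - l1)
        cf_prev += f

# "Below 10" taken as 0-10; "Above 80" closed at 100 (assumed)
classes = [(0, 10), (10, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
freqs = [8, 10, 22, 25, 10, 5]

q1 = partition_value(classes, freqs, 1, 4)
q3 = partition_value(classes, freqs, 3, 4)
d2 = partition_value(classes, freqs, 2, 10)
p5 = partition_value(classes, freqs, 5, 100)
print(round(q1, 1), round(q3, 1), round(d2, 1), round(p5, 1))
```

Calling the same function with k = 1 and parts = 2 returns the median, which is why the partition values are treated alongside it.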
2.9. CALCULATION OF MISSING FREQUENCIES
Example 27: In the frequency distribution of 100 families given below, the number of families corresponding to expenditure groups 20–40 and 60–80 are missing from the table. However, the median is known to be 50. Find out the missing frequencies.
Expenditure :       0–20   20–40   40–60   60–80   80–100
No. of families :    14      ?      27       ?       15
Solution: We shall assume the missing frequencies for the classes 20–40 to be x and 60–80 to be y.
Expenditure (Rs.)   No. of families    C.f.
    0–20                 14             14
   20–40                  x             14 + x
   40–60                 27             14 + 27 + x
   60–80                  y             41 + x + y
   80–100                15             41 + 15 + x + y
                N = 100 = 56 + x + y
From the table we have N = Σf = 56 + x + y = 100.
∴ x + y = 100 – 56 = 44
Median is given as 50 which lies in the class 40–60, which becomes the median class.
By using the median formula we get:
$$\text{Median} = l_1 + \frac{\frac{N}{2} - Cf_0}{f} \times i$$
$$\therefore 50 = 40 + \frac{50 - (14 + x)}{27} \times (60 - 40)$$
or
$$50 - 40 = \frac{36 - x}{27} \times 20$$
or 10 × 27 = 720 − 20x, or 270 = 720 – 20x
∴ 20x = 720 – 270
$$x = \frac{450}{20} = 22.5$$
By substituting the value of x in the equation
x + y = 44
we get,
22.5 + y = 44
∴ y = 44 – 22.5 = 21.5.
Hence the frequency for the class 20–40 is 22.5 and for 60–80 is 21.5.
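The algebra of Example 27 amounts to rearranging the median formula for the cumulative frequency below the median class. A small sketch under that reading follows; `missing_pair` and its parameter names are hypothetical and only illustrate the steps.

```python
def missing_pair(total, known_before, f_median, known_after, l1, width, median):
    """Unknown frequency x just before the median class and y just after it.

    From median = l1 + (N/2 - cf0) / f * i we get
    cf0 = N/2 - (median - l1) * f / i.
    """
    cf0 = total / 2 - (median - l1) * f_median / width
    x = cf0 - known_before                    # cf0 = known_before + x
    y = total - (known_before + x + f_median + known_after)
    return x, y

x, y = missing_pair(total=100, known_before=14, f_median=27,
                    known_after=15, l1=40, width=20, median=50)
print(x, y)
```

The fractional results reproduce the figures of the worked example; real frequencies would be whole numbers, so such data are best read as illustrative.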
2.10 MODE
Mode is that value of the variable which occurs or repeats itself maximum number of times. The
mode is the most “fashionable” size in the sense that it is the most common and typical and is
defined by Zizek as “the value occurring most frequently in series of items and around which the
other items are distributed most densely.” In the words of Croxton and Cowden, the mode of a
distribution is the value at the point where the items tend to be most heavily concentrated. According
to A.M. Tuttle, Mode is the value which has the greater frequency density in its immediate
neighbourhood. In the case of individual observations, the mode is that value which is repeated the
maximum number of times in the series. The value of mode can also be denoted by the letter Z.
Example 28 : Calculate mode from the following data:
Sr. Number :        1    2    3    4    5    6    7    8    9   10
Marks obtained :   10   27   24   12   27   27   20   18   15   30
Solution :
Marks    No. of Students
 10            1
 12            1
 15            1
 18            1
 20            1
 24            1
 27            3
 30            1
Since 27 marks is repeated the maximum number of times (3 times), the mode is 27.
Calculation of Mode in Discrete series. In a discrete series, the mode is quite often determined by inspection. We can understand this with the help of an example :
X :   1    2    3    4    5    6    7
f :   4    5   13    6   12    8    6
By inspection, the modal size is 3 as it has the maximum frequency. But this test of greatest frequency is not foolproof, as it is not only the frequency of a single class but also the frequencies of the neighbouring classes that decide the mode. In such cases, we shall be using the method of Grouping and Analysis table.
Size of shoe :   1    2    3    4    5    6    7
Frequency :      4    5   13    6   12    8    6
Solution : By inspection, the mode is 3, but the size of mode may be 5. This is so because the neighbouring frequencies of size 5 are greater than the neighbouring frequencies of size 3. This effect of neighbouring frequencies is seen with the help of the grouping and analysis table technique.
Grouping table
(Column 1 holds the original frequencies; columns 2 and 3 group them in twos, columns 4, 5 and 6 in threes. Each grouped total is written against the first size of its group.)
Size of Shoe    (1)    (2)    (3)    (4)    (5)    (6)
     1           4      9            22
     2           5            18            24
     3          13     19                          31
     4           6            18     26
     5          12     20                   26
     6           8            14
     7           6
When there exist two groups of frequencies of equal magnitude, then we should consider either both or omit both while analysing the sizes of items.
Analysis Table
Column    Size of items with maximum frequency
  1       3
  2       5, 6
  3       2, 3, 4, 5
  4       4, 5, 6
  5       5, 6, 7
  6       3, 4, 5
Item 5 occurs the maximum number of times, therefore, the mode is 5. We can note that by inspection we had determined 3 to be the mode.
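The grouping and analysis tables follow a mechanical rule, so they can be sketched in code. The version below is an illustration, assuming the usual six columns (singles, pairs from the first and second item, triples from the first, second and third item) and counting every tied group, as the text suggests; `grouping_mode` is a hypothetical helper name.

```python
from collections import Counter

def grouping_mode(sizes, freqs):
    """Return (modal sizes, vote counts) using the grouping and analysis method."""
    n = len(freqs)
    votes = Counter()
    # (group width, starting index) for columns 1 to 6 of the grouping table
    for width, start in [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]:
        groups = [(sum(freqs[i:i + width]), i)
                  for i in range(start, n - width + 1, width)]
        if not groups:
            continue
        best = max(total for total, _ in groups)
        for total, i in groups:
            if total == best:                      # ties: count every tied group
                votes.update(sizes[i:i + width])
    top = max(votes.values())
    return [s for s in sizes if votes[s] == top], votes

sizes = [1, 2, 3, 4, 5, 6, 7]
freqs = [4, 5, 13, 6, 12, 8, 6]
modal_sizes, votes = grouping_mode(sizes, freqs)
print(modal_sizes, votes[5], votes[3])
```

When the returned list contains more than one size, the analysis table has not singled out a mode and the distribution is treated as bimodal or multimodal.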
Determination of mode in continuous series : In a continuous series, the determination of mode requires one additional step. Once the modal class is determined by inspection or with the help of the grouping technique, then the following formula of interpolation is applied:
$$\text{Mode} = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} (l_2 - l_1)$$
or
$$\text{Mode} = l_2 - \frac{f_1 - f_2}{2f_1 - f_0 - f_2} (l_2 - l_1)$$
l1 = lower limit of the class, where mode lies.
l2 = upper limit of the class, where mode lies.
f0 = frequency of the class preceding the modal class.
f1 = frequency of the class, where mode lies.
f2 = frequency of the class succeeding the modal class.
Example 29 : Calculate mode of the following frequency distribution :
Variable :    0–10   10–20   20–30   30–40   40–50   50–60   60–70
Frequency :     5      10      15      14      10       5       3
Solution :
Grouping Table
(Each grouped total is written against the first class of its group.)
   X       (1)    (2)    (3)    (4)    (5)    (6)
 0–10       5     15            30
10–20      10            25            39
20–30      15     29                          39
30–40      14            24     29
40–50      10     15                   18
50–60       5             8
60–70       3
Analysis Table
Column    Size of item with maximum frequency
  1       20–30
  2       20–30, 30–40
  3       10–20, 20–30
  4       0–10, 10–20, 20–30
  5       10–20, 20–30, 30–40
  6       20–30, 30–40, 40–50
The modal group is 20–30 because it has occurred 6 times. Applying the formula of interpolation,
$$\text{Mode} = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} (l_2 - l_1) = 20 + \frac{15 - 10}{30 - 10 - 14} (30 - 20) = 20 + \frac{5}{6} (10) = 28.3$$
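The interpolation formula for the mode can be sketched as below. This is a minimal illustration, assuming equal class widths and a modal class picked by inspection (the class with the highest frequency); where the grouping method disagrees with inspection, the modal class index should be supplied from the analysis table instead. `mode_continuous` is a hypothetical helper name.

```python
def mode_continuous(classes, freqs, modal_index=None):
    """Mode = l1 + (f1 - f0) / (2*f1 - f0 - f2) * (l2 - l1)."""
    if modal_index is None:
        # Inspection: class with the highest frequency (assumption)
        modal_index = max(range(len(freqs)), key=freqs.__getitem__)
    k = modal_index
    l1, l2 = classes[k]
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                 # preceding class
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0    # succeeding class
    return l1 + (f1 - f0) / (2 * f1 - f0 - f2) * (l2 - l1)

classes = [(lo, lo + 10) for lo in range(0, 70, 10)]
freqs = [5, 10, 15, 14, 10, 5, 3]
mode_value = mode_continuous(classes, freqs)
print(round(mode_value, 1))
```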
Calculation of mode where it is ill defined. The above formula is not applied where there are
many modal values in a series or a distribution. For instance there may be two or more than two
items having the maximum frequency. In these cases, the series will be known as bimodal or
multimodal series. The mode is said to be ill-defined and in such cases the following formula is
applied.
Mode = 3 Median – 2 Mean.
Example 30. Calculate mode of the following frequency data :
Variate value :   10–20   20–30   30–40   40–50   50–60   60–70   70–80   80–90
Frequency :          5       9      13      21      20      15       8       3
Solution : First of all, ascertain the modal group with the help of the process of grouping.
Grouping Table
(Each grouped total is written against the first class of its group.)
   X       (1)    (2)    (3)    (4)    (5)    (6)
10–20       5     14            27
20–30       9            22            43
30–40      13     34                          54
40–50      21            41     56
50–60      20     35                   43
60–70      15            23                   26
70–80       8     11
80–90       3
Analysis Table
Column    Size of item with maximum frequency
  1       40–50
  2       50–60, 60–70
  3       40–50, 50–60
  4       40–50, 50–60, 60–70
  5       20–30, 30–40, 40–50, 50–60, 60–70, 70–80
  6       30–40, 40–50, 50–60
There are two groups which occur an equal number of times. They are 40–50 and 50–60. Therefore, we will apply the following formula:
Mode = 3 median – 2 mean
and for this purpose the values of mean and median are required to be computed.
Calculation of Mean and Median
Variate (X)   frequency (f)   mid values (m)   d′x = (m – 45)/10    fd′x     cf
  10–20             5               15                –3            – 15      5
  20–30             9               25                –2            – 18     14
  30–40            13               35                –1            – 13     27
  40–50            21               45                 0               0     48
  50–60            20               55                +1            + 20     68
  60–70            15               65                +2            + 30     83
  70–80             8               75                +3            + 24     91
  80–90             3               85                +4            + 12     94
                N = 94                                        Σfd′x = +40
Median is the value of the N/2th item, i.e. the 47th item, which lies in the (40–50) group.
$$\text{Med.} = l_1 + \frac{\frac{N}{2} - cf_0}{f} \times i = 40 + \frac{47 - 27}{21} (10) = 40 + \frac{200}{21} = 49.52$$
$$\bar{X} = A + \frac{\Sigma fd'_x}{N} \times i = 45 + \frac{40}{94} (10) = 45 + 4.26 = 49.26$$
Mode = 3 median – 2 mean
= 3 (49.52) – 2 (49.26) = 148.56 – 98.52 = 50.04
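The three calculations behind the relation Mode = 3 Median – 2 Mean can be chained in a short script. This is a sketch for the data of Example 30; `grouped_mean` and `grouped_median` are hypothetical helper names. Working with unrounded intermediate values gives about 50.06, so hand calculations that round the mean and median first can differ in the first or second decimal place.

```python
def grouped_mean(classes, freqs, assumed_mean, width):
    """Step-deviation mean: A + (sum of f*d' / N) * i, with d' = (m - A) / i."""
    n = sum(freqs)
    sum_fd = sum(f * ((lo + hi) / 2 - assumed_mean) / width
                 for (lo, hi), f in zip(classes, freqs))
    return assumed_mean + sum_fd / n * width

def grouped_median(classes, freqs):
    """Interpolated median: l1 + (N/2 - cf0) / f * (l2 - l1)."""
    half = sum(freqs) / 2
    cf_prev = 0
    for (lo, hi), f in zip(classes, freqs):
        if f > 0 and cf_prev + f >= half:
            return lo + (half - cf_prev) / f * (hi - lo)
        cf_prev += f

classes = [(lo, lo + 10) for lo in range(10, 90, 10)]   # 10-20, ..., 80-90
freqs = [5, 9, 13, 21, 20, 15, 8, 3]

mean_value = grouped_mean(classes, freqs, assumed_mean=45, width=10)
median_value = grouped_median(classes, freqs)
mode_value = 3 * median_value - 2 * mean_value          # empirical relation
print(round(mean_value, 2), round(median_value, 2), round(mode_value, 2))
```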
2.10.1. Determination of mode by graph
Mode can also be located graphically from the histogram. The following steps are to be taken:
(i) Draw a histogram of the data.
(ii) Draw two lines diagonally inside the modal class rectangle, starting from each upper corner of the rectangle to the upper corner of the adjacent rectangle.
(iii) Draw a perpendicular line from the intersection of the two diagonal lines to the X-axis. The abscissa of the point at which the perpendicular line meets the X-axis is the value of the mode.
Example 31 : Construct a histogram for the following distribution and determine the mode graphically :
X :   0–10   10–20   20–30   30–40   40–50
f :     5       8      15      12       7
Verify the result with the help of interpolation.
Solution :
$$\text{Mode} = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} (l_2 - l_1) = 20 + \frac{15 - 8}{30 - 8 - 12} (30 - 20) = 20 + \frac{7}{10} (10) = 27$$
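The graphical construction and the interpolation formula agree because the abscissa of the point where the two diagonals cross is exactly l1 + (f1 − f0)/(2f1 − f0 − f2) × (l2 − l1). The sketch below, with the hypothetical helper `graphical_mode`, finds that crossing point from the corner coordinates of the bars.

```python
def graphical_mode(l1, l2, f0, f1, f2):
    """Abscissa where the two diagonals drawn inside the modal rectangle cross."""
    width = l2 - l1
    # Diagonal A joins (l1, f1) to (l2, f2): the modal bar's top-left corner
    # to the top-left corner of the succeeding bar.
    # Diagonal B joins (l1, f0) to (l2, f1): the preceding bar's top-right
    # corner to the modal bar's top-right corner.
    slope_a = (f2 - f1) / width
    slope_b = (f1 - f0) / width
    # Heights are equal where f1 + slope_a * t = f0 + slope_b * t
    t = (f1 - f0) / (slope_b - slope_a)
    return l1 + t

crossing = graphical_mode(l1=20, l2=30, f0=8, f1=15, f2=12)
formula = 20 + (15 - 8) / (2 * 15 - 8 - 12) * (30 - 20)
print(round(crossing, 2), round(formula, 2))
```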
Example 32 : Calculate mode from the following data :
Marks       No. of Students
Below 10           4
Below 20           6
Below 30          24
Below 40          46
Below 50          67
Below 60          86
Below 70          96
Below 80          99
Below 90         100
Solution : Since we are given the cumulative frequency distribution of marks, first we shall convert it into the normal frequency distribution:
Marks      Frequencies
 0–10      4
10–20      6 – 4 = 2
20–30      24 – 6 = 18
30–40      46 – 24 = 22
40–50      67 – 46 = 21
50–60      86 – 67 = 19
60–70      96 – 86 = 10
70–80      99 – 96 = 3
80–90      100 – 99 = 1
It is evident from the table that the distribution is irregular and, in all likelihood, has more than one mode. You can verify this by applying the grouping and analysis table.
The formula to calculate the value of mode in cases of bimodal distributions is :
Mode = 3 median – 2 mean.
Computation of Mean and Median
Marks     Mid-value (x)   Frequency (f)    cf    dx = (X – 45)/10    fdx
 0–10           5               4            4         –4            – 16
10–20          15               2            6         –3             – 6
20–30          25              18           24         –2            – 36
30–40          35              22           46         –1            – 22
40–50          45              21           67          0               0
50–60          55              19           86          1              19
60–70          65              10           96          2              20
70–80          75               3           99          3               9
80–90          85               1          100          4               4
                           Σf = 100                            Σfdx = – 28
$$\text{Mean} = A + \frac{\Sigma fdx}{N} \times i = 45 + \frac{-28}{100} \times 10 = 42.2$$
Median = size of $\frac{N}{2}$th item = $\frac{100}{2}$ = 50th item.
Because 67 is the first cumulative frequency that is not less than 50, the median class is 40–50.
$$\text{Median} = l_1 + \frac{\frac{N}{2} - Cf_0}{f} \times i = 40 + \frac{50 - 46}{21} \times 10 = 40 + \frac{4}{21} \times 10 = 41.9$$
Apply, Mode = 3 median – 2 mean
Mode = 3 × 41.9 – 2 × 42.2 = 125.7 – 84.4 = 41.3.
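The first step of Example 32, turning "less than" cumulative frequencies into ordinary class frequencies, is a simple successive difference. A minimal sketch follows; `frequencies_from_less_than` is a hypothetical helper name.

```python
def frequencies_from_less_than(cumulative):
    """Class frequencies from 'less than' cumulative frequencies:
    each frequency is the cumulative figure minus the one before it."""
    previous = [0] + cumulative[:-1]
    return [cf - prev for prev, cf in zip(previous, cumulative)]

cumulative = [4, 6, 24, 46, 67, 86, 96, 99, 100]   # below 10, 20, ..., 90
freqs = frequencies_from_less_than(cumulative)
print(freqs)
```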
Example 33 : Median and mode of the wage distribution are known to be Rs. 33.5 and Rs. 34 respectively. Find the missing values.
Wages (Rs.) :       0–10   10–20   20–30   30–40   40–50   50–60   60–70
No. of workers :      4      16      ?       ?       ?       6       4
Total = 230
Solution : We assume the missing frequencies to be x for 20–30, y for 30–40, and 230 – (4 + 16 + x + y + 6 + 4) = 200 – x – y for 40–50.
We now proceed further to compute the missing frequencies :
Wages (Rs.) (X)   No. of workers (f)   Cumulative frequencies (cf)
     0–10                 4                      4
    10–20                16                     20
    20–30                 x                     20 + x
    30–40                 y                     20 + x + y
    40–50            200 – x – y               220
    50–60                 6                    226
    60–70                 4                    230
                      N = 230
The median, 33.5, lies in the class 30–40. Apply,
$$\text{Median} = l_1 + \frac{\frac{N}{2} - cf_0}{f} \times (l_2 - l_1)$$
$$33.5 = 30 + \frac{115 - (20 + x)}{y} \times (40 - 30)$$
y (33.5 – 30) = (115 – 20 – x) 10
3.5 y = 1150 – 200 – 10x
10x + 3.5y = 950 ...(i)
The mode, 34, also lies in the class 30–40, so that f1 = y, f0 = x and f2 = 200 – x – y. Apply,
$$\text{Mode} = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} (l_2 - l_1)$$
$$34 = 30 + \frac{y - x}{2y - x - (200 - x - y)} (40 - 30) = 30 + \frac{y - x}{3y - 200} \times 10$$
4(3y – 200) = 10 (y – x)
10x + 2y = 800 ...(ii)
Subtract equation (ii) from equation (i),
1.5y = 150,
$$y = \frac{150}{1.5} = 100$$
Substitute the value of y = 100 in equation (i), we get
10x + 3.5 (100) = 950
10x = 950 – 350
x = 600/10 = 60.
∴ Third missing frequency = 200 – x – y = 200 – 60 – 100 = 40.
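Equations (i) and (ii) form a pair of simultaneous linear equations, and the answer can be checked by substituting it back into the median and mode formulas. The sketch below uses Cramer's rule; `solve_2x2` is a hypothetical helper name.

```python
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# (i) 10x + 3.5y = 950   and   (ii) 10x + 2y = 800
x, y = solve_2x2(10, 3.5, 950, 10, 2, 800)
third = 200 - x - y                      # frequency of the class 40-50

# Substitute back into the interpolation formulas for the class 30-40
median_check = 30 + (230 / 2 - (20 + x)) / y * 10
mode_check = 30 + (y - x) / (2 * y - x - third) * 10
print(x, y, third, median_check, mode_check)
```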
2.11 SUMMARY
● Central tendency indicates the location of the centre of a set of data. It is the average value.
● An average is a typical value which is used to represent the entire set of values and is used as a benchmark to make comparisons.
● A good average is expected to be based on all values; not affected unduly by the presence of extremely large or small values in the data, amenable to further algebraic treatment and having sampling stability.
● Averages are distinguished as mathematical and positional.
● Arithmetic mean is a mathematical average which is most commonly used and understood and also very extensively used in statistical work. Obtained by dividing the sum of values by their number, it is easy to calculate. It enjoys well defined algebraic properties like zero-sum deviations, least squares, and combined mean. It meets most of the requisites of a good average.
● Geometric mean and harmonic mean are other mathematical averages but they have limited and specific uses. Their calculation is restricted only to positive values. It is possible to calculate combined average for two or more sets of data for each of these.
● Geometric mean, which is nth root of the product of n values, is basically applied to obtain average growth rates, price changes and depreciation rates.
● Harmonic mean is equal to the reciprocal of the arithmetic mean of reciprocals. It is used to average rates. Harmonic mean is used when the weights are in terms of the numerator factor of the given rates. Arithmetic mean is correct to use when weights are in terms of the denominator factor.
● Mathematical averages can be simple or weighted and used accordingly as all values enjoy an equal or unequal weightage. Their values can be calculated only by using well-defined formulae and cannot be obtained graphically. Being based on all values, they are affected in a larger measure by the presence of extreme values.
● The positional averages include median and mode. While median refers to the central value in a set of arrayed values, the mode is that value in a series which appears the maximum number of times. A given set of individual values or a frequency distribution may have one or more modal values. If values in a given set of data are all unique, there is no mode. Mode suffers from these drawbacks.
● The positional averages do not possess any mathematical properties, except that the sum of absolute deviations from median is the least.
● In addition to median, there are a number of partition values that divide given distribution into a certain number of parts. They include quartiles, deciles and percentiles. The partition values are not averages but they are discussed here for the reason that their calculation proceeds in the same manner as that of median. They are used to locate relative position of different values clearly (like use of percentiles in the CAT entrance examinations) and also to calculate measures of variation, skewness etc.
2.12 SELF ASSESSMENT QUESTIONS
Exercise 1 : True or False Statements
(i) An average serves as a benchmark for comparisons.
(ii) Mean, median and mode are called positional averages while geometric mean and harmonic
mean are designated as mathematical averages.
(iii) In the deviation method of calculating arithmetic mean, the mean is obtained by adding the
mean of the deviations to the assumed mean value.
(iv) Arithmetic mean is not suitable for open-ended frequency distributions.
(v) All averages can be distinguished as being simple and weighted.
(vi) In the weighted arithmetic mean calculation, it is immaterial whether the weights are
expressed as, say, 20% and 80% or as 4 and 16.
(vii) The sum of squares of deviations as well as the sum of deviations from mean is equal to
zero.
(viii) For calculating different measures of central tendency, it is necessary that all class intervals
have equal width.
(ix) Median cannot be calculated in open-ended class frequency distributions.
(x) In an array of 41 items, median is equal to (41 + 1 )/2 = 21.
(xi) Two sets of values, A and B, are identical except that their respective largest values are 80 and 8,000. The median of both the distributions shall be same.
(xii) The sum of absolute deviations from median is equal to zero.
(xiii) The quartiles divide a distribution into four equal parts.
(xiv) A distribution has 10 deciles and 100 percentiles.
(xv) In a distribution of wages of the workers of a factory, the 95th percentile indicates the
maximum wage earned by the top 95 percent of the workers.
(xvi) The lower quartile in a distribution with a total frequency of 800 is equal to n/4 = 200.
(xvii) The median, quartiles and percentiles can be determined graphically only by means of a ‘less than’ ogive.
(xviii) The lower and the upper quartiles mark off the limits within which the middle 50 percent
of the cases fall.
(xix) It is possible to have more than one median in a given distribution.
(xx) For every frequency distribution, the upper and lower quartiles are located at equal distance
from median.
(xxi) A distribution can have more than two modes.
(xxii) In an array of marks (out of 100) scored by students of a class, the mark 47 appears 48
times (total number of students is 90). The mode of the set of values is 48.
(xxiii) It is possible to estimate mode from median and mean of a distribution.
(xxiv) For every distribution, X > Median > Mode.
(xxv) Geometric mean is the nth root of the sum of n values.
(xxvi) Geometric mean is the appropriate measure for averaging ratios and percentages.
(xxvii) The geometric mean of two unequal values, x and y, is equal to the geometric mean of their arithmetic mean and harmonic mean values.
(xxviii) The positional averages are not amenable to algebraic manipulations.
(xxix) Only mathematical averages are distinguishable as being simple or weighted.
(xxx) Mathematical averages cannot be determined graphically.
Ans. 1. T, 2. F, 3. T, 4. T, 5. F, 6. T, 7. F, 8. F, 9. F, 10. F, 11. T, 12. F, 13. T, 14. F, 15. F, 16. F, 17. F,
18. T, 19. F, 20. F, 21. T, 22. F, 23. T, 24. F, 25. F, 26. T, 27. T, 28. T, 29. T, 30. T
Exercise 2 : Questions and Answers
(i) What do you understand by an average? Discuss the desirable properties of a good measure
of central tendency.
(ii) “An average is a number indicating the central value of a group of observations.” How
far is it true for mean, median and mode? Give illustrations.
(iii) State and explain the properties of arithmetic mean, geometric mean and harmonic mean.
(iv) Write a note on how you would decide whether arithmetic mean or harmonic mean should
be used to calculate average in a given case.
(v) Define median, quartiles, deciles and percentiles. State the property of median. Does it
have any practical application?
(vi) Write a detailed note on the choice of an average. Which average would be more suitable
in the following cases:
(a) Average size of ready-made garments sold by a store.
(b) Average intelligence level of students of a class.
(c) Average rate of growth of population per decade.
(vii) A taxi ride in New Delhi costs Rs. 20 for the first kilometer and Rs. 11 per kilometer
thereafter. Assume that the cost of each kilometer is incurred at the beginning of the
kilometer. The waiting charges are Rs. 30 per hour or a part thereof, subject to a minimum
of 15 minutes stay. Calculate the effective average cost per kilometer to a customer who
rides a taxi from the Railway Station for her home 21.7 kilometers away and chooses to
stay for a coffee for 25 minutes on the way.
(viii) Find the arithmetic mean of the first 100 natural numbers.
(ix) The arithmetic mean of a distribution is known to be 55.45. It is written below with the variate given in codified values. You are required to determine the class intervals.
d′ :   –3   –2   –1    0    1    2    3
f :    10   28   30   42   65   15   10
It is known that various d′ values have been calculated as (X – A)/10.
(x) In a hotel, a total of 500 bulbs were installed simultaneously and their failure over time was observed as detailed below.
End of week :       1     2     3     4     5     6     7
No. of failures :  12    40   108   242   346   428   500
You are required to calculate the mean life of the bulbs.
(xi) A factory employs 100 workers. The mean daily wages of 99 of these workers is Rs. 85
while the wages of the 100th worker are Rs. 99 more than the mean wages of all the
workers. Obtain mean wages of the workers of the factory.
(xii) The following data give the distribution of accidents in a large city over the days of the week during the last month :
Day :                           Sun.   Mon.   Tue.   Wed.   Thu.   Fri.   Sat.
Average number of accidents :    26     16     12     10      8     10      8
Over the particular month, there were 5 Mondays and 5 Tuesdays. Calculate the mean number of accidents per day.
(xiii) The table below shows the number of skilled and unskilled workers in two factories and their average daily wages.
Worker            Factory A                          Factory B
category      Number   Wages per day (Rs.)     Number   Wages per day (Rs.)
Skilled         150          180                 350          175
Unskilled       850          130                 650          125
Determine the average daily wages for each factory. Also, give reasons why the results show that the average daily wages in Factory B are higher than the average daily wages in Factory A, even though in Factory B the average wages for both categories of workers are lower.
(xiv) The average sales made by 18 salesmen in a company were reported to be Rs. 73,560 during the last month. It was discovered later that the sales made by one of the salesmen were recorded as Rs. 92,280 instead of the actual Rs. 22,280 and those of another salesman as Rs. 16,630 instead of the actual Rs. 76,630. Compute the actual average sales made.
(xv) The arithmetic mean of daily wages of 300 weavers and 250 spinning machine workers are
Rs. 198 and Rs. 179, respectively. It is given that the arithmetic mean of all the workers in
the factory is Rs. 187. Find the total number of workers in the factory if it is also given that
the arithmetic mean of the daily wages of the remaining workers is Rs. 180.5.
(xvi) Of a batch of 20 students, four students failed while 7 students passed with distinction.
The remaining students, who passed ordinarily, scored the following marks:
47, 49, 52, 52, 57, 58, 59, 64, 66
The students who failed secured 23 marks on an average while the students who passed
with distinction fetched 76 marks on an average. With this information, you are required to
obtain the mean and median marks.
(xvii) A survey of 350 families in a town yielded the following information :
No. of children :    0     1     2     3    4 or more
No. of families :   13    94   146    67       30
Find the median number of children in the families.
(xviii) The following table gives the distribution of monthly income of 600 families in a certain city:
Monthly income (in ‘000 Rs.)    No. of families
         10 – 20                       60
         20 – 30                      170
         30 – 40                      200
         40 – 50                       60
         50 – 60                       50
         60 – 70                       40
         70 – 80                       20
Draw a ‘less than’ ogive and a ‘more than’ ogive for these data on a graph. Read the median income and the limits within which the middle 50 percent of families have their income.
(xix) Find the missing frequencies in the following distribution if N = 100 and median = 30.
Marks      No. of students
 0 – 10          10
10 – 20           ?
20 – 30          25
30 – 40          30
40 – 50           ?
50 – 60          10
(xx) The number of sales made by a shoe store in the City Mall during the past 20 days is as follows:
 7    6   13   16    8    5    9    9   10   19
16    8   11   13    7   24   22   15   21   21
Find the 50th, 75th and 88th percentiles.
Ans. 7. Rs. 12.14/km 8. 50.5 9. 20–30, 30–40, etc. 10. 4.15 Weeks 11. Rs. 86 12. 14.27 13. A Rs. 138, B Rs. 143 14. Rs. 73,004 15. 750 16. Mean = 56.4, Median = 58.5 17. 2 18. Med = 33.5, 25–43 app. 19. 15 and 10 20. 12, 18.25 and 21.48
LESSON-3
MEASURES OF VARIATION – ABSOLUTE AND RELATIVE
3. STRUCTURE
3.0 Objective
3.1 Need of Variation
3.2 What is Variation?
3.3 Requisites of a Good Measure of Variation
3.4 Types of Variation
3.5 Methods of Computing Variation
3.5.1 Mathematical Methods – Range
3.5.2 Quartile Deviation
3.5.3 Average Deviation
3.5.4 Standard Deviation
3.5.5 Mathematical Properties of Standard Deviation
3.5.6 Graphic Method of Variation
3.6 Revisionary Problems
3.7 Summary
3.8 Self Assessment Questions
3.0 OBJECTIVE
After reading this lesson, you should be able to :
(a) Understand the meaning, need and requisites of a good measure of variation
(b) Differentiate between absolute and relative measure of variation
(c) Identify and compute different types of variation such as range, quartile deviation, average
deviation, standard deviation and variance
(d) Comprehend merits and demerits of different measures and properties of standard deviation
(e) Comment upon the variability of problems with the help of coefficient of variation.
3.1 NEED OF VARIATION
Measures of central tendency, Mean, Median, Mode, etc., indicate the central position of a series.
They indicate the general magnitude of the data but fail to reveal all the peculiarities and characteristics
of the series. In other words, they fail to reveal the degree of the spread out or the extent of the
variability in individual items of the distribution. This can be known by certain other measures,
known as ‘Measures of Dispersion’ or Variation.
We can understand variation with the help of the following example :
           Series I    Series II    Series III
              10           2            10
              10           8            12
              10          20             8
ΣX =          30          30            30
For each of the three series,
$$\bar{X} = \frac{\Sigma X}{N} = \frac{30}{3} = 10$$
In all three series, the value of the arithmetic mean is 10. On the basis of this average, we can say that the series are alike. If we carefully examine the composition of the three series, we find the following differences :
(i) In the case of the 1st series, the values are equal; but in the 2nd and 3rd series, the values are unequal and do not follow any specific order.
(ii) The magnitude of deviation, item-wise, is different for the 1st, 2nd and 3rd series. But these deviations cannot be ascertained if only the value of the ‘simple mean’ is taken into consideration.
(iii) In these three series, it is quite possible that the value of the arithmetic mean is 10 while the values of the median differ from each other. This can be understood as follows :
                  I      II     III
                 10       2       8
Median →         10       8      10
                 10      20      12
The value of ‘Median’ in the 1st series is 10, in the 2nd series it is 8 and in the 3rd series it is 10. Therefore, the values of the mean and the median are not identical in every series.
(iv) Even though the average remains the same, the nature and extent of the distribution of the
size of the items may vary. In other words, the structure of the frequency distributions
may differ even though their means are identical.
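The point of the example can be reproduced with the standard library. The sketch below computes the mean, the median and the spread between the largest and smallest value for the three series; the dictionary name `series` is chosen only for illustration.

```python
from statistics import mean, median

series = {
    "I": [10, 10, 10],
    "II": [2, 8, 20],
    "III": [10, 12, 8],
}

# (mean, median, largest minus smallest) for each series
summary = {name: (mean(values), median(values), max(values) - min(values))
           for name, values in series.items()}

for name, (avg, med, spread) in summary.items():
    print(name, avg, med, spread)
```

All three means are 10, yet the spreads are 0, 18 and 4, which is exactly the information an average alone cannot convey.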
3.2 WHAT IS VARIATION
The simplest meaning that can be attached to the word ‘dispersion’ is a lack of uniformity in the sizes or quantities of the items of a group or series. According to Reiglemen, “Dispersion is the extent to which the magnitudes or qualities of the items differ, the degree of diversity.” The word dispersion may also be used to indicate the spread of the data.
In all these definitions, we can find the basic property of dispersion as a value that indicates the extent to which all other values are dispersed about the central value in a particular distribution.
3.3 REQUISITES OF A GOOD MEASURE OF VARIATION
There are certain pre-requisites for a good measure of dispersion :
1. It should be simple to understand.
2. It should be easy to compute.
3. It should be rigidly defined.
4. It should be based on each individual item of the distribution.
5. It should be capable of further algebraic treatment.
6. It should have sampling stability.
7. It should not be unduly affected by the extreme items.
3.4 TYPES OF VARIATION
The measures of dispersion can be either ‘absolute’ or ‘relative’. Absolute measures of dispersion
are expressed in the same units in which the original data are expressed. For example, if the series
is expressed as Marks of the students in a particular subject; the absolute dispersion, will provide
the value in Marks. The only difficulty is that if two or more series are expressed in different units,
the series cannot be compared on the basis of dispersion.
‘Relative’ or ‘Coefficient’ of dispersion is the ratio or the percentage of a measure of absolute
dispersion to an appropriate average. The basic advantage of this measure is that two or more series
can be compared with each other, despite the fact they are expressed in different units.
Theoretically, ‘Absolute measure’ of dispersion is better. But from a practical point of view,
relative or coefficient of dispersion is considered better as it is used to make comparison between
series.
3.5 METHODS OF COMPUTING VARIATION
Methods of studying dispersion are divided into two types :
(i) Mathematical Methods : We can study the ‘degree’ and ‘extent’ of variation by these methods. In this category, commonly used measures of dispersion are :
(a) Range
(b) Quartile Deviation
(c) Average Deviation
(d) Standard deviation and coefficient of variation.
(ii) Graphic Methods : Where we want to study only the extent of variation, i.e. whether it is higher or lower, a Lorenz curve is used.
3.5.1 Mathematical Methods – Range:
It is the simplest method of studying dispersion. Range is the difference between the largest
value and the smallest value of a series. While computing range, we do not take into account the frequencies
of the different groups.
Formula:
Absolute Range = L – S
Coefficient of Range = $\frac{L - S}{L + S}$
where L represents the largest value in a distribution
S represents the smallest value in a distribution
We can understand the computation of range with the help of examples of different series.
(i) Raw Data
Example 1 : Marks out of 50 in a subject obtained by 12 students in a class are given as follows :
12, 18, 20, 12, 16, 14, 30, 32, 28, 12, 12 and 35.
In this example, the highest marks obtained by a candidate are ‘35’ and the
lowest marks obtained by a candidate are ‘12’. Therefore, we can calculate the range :
L = 35 and S = 12
Absolute Range = L – S = 35 – 12 = 23 marks
Coefficient of Range = $\frac{L - S}{L + S} = \frac{35 - 12}{35 + 12} = \frac{23}{47} = 0.49$ approx.
(ii) Discrete Series
Example 2 :
Marks of the students in Accounts (out of 50)   No. of students
(X)                                             (f)
10 (Smallest)                                   4
12                                              10
18                                              16
20 (Largest)                                    15
                                                Total 45
Absolute Range = 20 – 10 = 10 marks
Coefficient of Range = $\frac{20 - 10}{20 + 10} = \frac{10}{30} = 0.33$ approx.
(iii) Continuous Series
Example 3 :
X          frequencies
10 – 15    4
15 – 20    10
20 – 25    26
25 – 30    8
S = 10 (lower limit of the first class), L = 30 (upper limit of the last class)
Absolute Range = L – S = 30 – 10 = 20 marks
Coefficient of Range = $\frac{L - S}{L + S} = \frac{30 - 10}{30 + 10} = \frac{20}{40} = 0.5$
Range is the simplest method of studying dispersion. It takes less time to compute the ‘absolute’
and ‘relative’ range. However, range does not take into account all the values of a series, i.e. it considers only
the extreme items, and the middle items are not given any importance. Therefore, range cannot tell us
anything about the character of the distribution. Range also cannot be computed in the case of an ‘open
end’ distribution, i.e., a distribution where the lower limit of the first group or the upper limit of the
last group is not given.
The concept of range is useful in the field of quality control and in studying the variations in the
prices of shares etc.
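The range formulas above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name range_measures is an assumption made for this example, and the data are the marks of Example 1.

```python
def range_measures(values):
    """Return (absolute range, coefficient of range) for raw data."""
    largest, smallest = max(values), min(values)   # L and S in the text
    absolute = largest - smallest                  # L - S
    coefficient = absolute / (largest + smallest)  # (L - S) / (L + S)
    return absolute, coefficient


# Marks of the 12 students from Example 1
marks = [12, 18, 20, 12, 16, 14, 30, 32, 28, 12, 12, 35]
absolute, coefficient = range_measures(marks)
print(absolute, round(coefficient, 2))
```

For a discrete series the same function can be applied to the list of X values only, since the frequencies are ignored when computing the range.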
3.5.2 Quartile Deviation
‘Quartile Deviation’ takes into account only the values of the ‘Upper
quartile’ (Q3) and the ‘Lower quartile’ (Q1). It is based on the ‘inter-quartile range’ and, being half of that range, is also called the ‘semi-quartile range’.
It is a comparatively better method when we are interested in knowing the range within which
a certain proportion of the items fall.
‘Quartile Deviation’ can be obtained as :
(i) Inter-quartile range = Q3 – Q1
(ii) Semi-quartile range = $\frac{Q_3 - Q_1}{2}$
(iii) Coefficient of Quartile Deviation = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
Calculation of Inter-quartile Range, Semi-quartile Range and Coefficient of Quartile Deviation
in case of Raw Data
Example 4 : Suppose the values of X are : 20, 12, 18, 25, 32, 10
In the case of quartile deviation, it is necessary to calculate the values of Q1 and Q3 after arranging
the given data in ascending or descending order.
Therefore, the arranged data are (in ascending order) :
X = 10, 12, 18, 20, 25, 32
No. of items = 6
Q1 = the value of the $\left(\frac{N+1}{4}\right)$th item = the value of the $\left(\frac{6+1}{4}\right)$th item = the value of the 1.75th item
= the value of 1st item + 0.75 (value of 2nd item – value of 1st item)
= 10 + 0.75 (12 – 10) = 10 + 0.75 (2) = 10 + 1.50 = 11.50
Q3 = the value of the $3\left(\frac{N+1}{4}\right)$th item = the value of the $3\left(\frac{6+1}{4}\right)$th item
= the value of the 3(7/4)th item = the value of the 5.25th item
= the value of 5th item + 0.25 (value of 6th item – value of 5th item)
= 25 + 0.25 (32 – 25) = 25 + 0.25 (7) = 25 + 1.75 = 26.75
Therefore,
(i) Inter-quartile range = Q3 – Q1 = 26.75 – 11.50 = 15.25
(ii) Semi-quartile range = $\frac{Q_3 - Q_1}{2} = \frac{15.25}{2} = 7.625$
(iii) Coefficient of Quartile Deviation = $\frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{26.75 - 11.50}{26.75 + 11.50} = \frac{15.25}{38.25} = 0.40$ approx.
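The positional rule used in Example 4, i.e. the (N+1)/4 th and 3(N+1)/4 th items with linear interpolation between neighbouring items, can be sketched in Python as follows. The function names are assumptions made for this illustration, and the sketch assumes at least three observations so that both positions fall inside the data.

```python
def value_at_position(sorted_values, position):
    """Value of the position-th item (1-based); a fractional position is
    resolved by linear interpolation between the two neighbouring items."""
    whole = int(position)
    fraction = position - whole
    lower = sorted_values[whole - 1]
    if fraction == 0:
        return lower
    upper = sorted_values[whole]
    return lower + fraction * (upper - lower)


def quartile_measures(values):
    """Return Q1, Q3, inter-quartile range, semi-quartile range and
    coefficient of quartile deviation for raw data (assumes N >= 3)."""
    data = sorted(values)
    n = len(data)
    q1 = value_at_position(data, (n + 1) / 4)
    q3 = value_at_position(data, 3 * (n + 1) / 4)
    iqr = q3 - q1
    return q1, q3, iqr, iqr / 2, iqr / (q3 + q1)


q1, q3, iqr, semi, coefficient = quartile_measures([20, 12, 18, 25, 32, 10])
print(q1, q3, iqr, semi, round(coefficient, 2))
```

Statistical libraries often use slightly different positional conventions for quartiles, so their results may differ a little from the rule used in this lesson.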
Calculation of Inter-quartile Range, Semi-quartile Range and Coefficient of Quartile Deviation
in discrete series
Example 5 : Suppose a series consists of the salaries (Rs.) and number of the workers in a factory :
Salaries (Rs.)    No. of workers
60                4
100               20
120               21
140               16
160               9
Solution : In the problem, we will first compute the values of Q1 and Q3.
Salaries (Rs.)    No. of workers    Cumulative frequencies
(x)               (f)               (c.f.)
60                4                 4
100               20                24 – Q1 lies in this cumulative frequency
120               21                45
140               16                61
160               9                 70
                  N = Σf = 70
Calculation of Q1 :
Q1 = size of the $\left(\frac{N+1}{4}\right)$th item = size of the $\left(\frac{70+1}{4}\right)$th item = size of the 17.75th item
17.75 lies in the cumulative frequency 24, which corresponds to the value Rs. 100
∴ Q1 = Rs. 100
Calculation of Q3 :
Q3 = size of the $3\left(\frac{N+1}{4}\right)$th item = size of the $3\left(\frac{70+1}{4}\right)$th item = size of the 53.25th item
53.25 lies in the cumulative frequency 61, which corresponds to the value Rs. 140
∴ Q3 = Rs. 140
(i) Inter-quartile range = Q3 – Q1 = Rs. 140 – Rs. 100 = Rs. 40
(ii) Semi-quartile range = $\frac{Q_3 - Q_1}{2} = \frac{140 - 100}{2}$ = Rs. 20
(iii) Coefficient of Quartile Deviation = $\frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{140 - 100}{140 + 100} = \frac{40}{240} = 0.17$ approx.
Calculation of Inter-quartile Range, Semi-quartile Range and Coefficient of Quartile Deviation
in the case of continuous series
Example 6 : We are given the following data :
Salaries (Rs.)    No. of workers
10–20             4
20–30             6
30–40             10
40–50             5
                  Total 25
In this example, the values of Q3 and Q1 are obtained as follows :
Salaries (Rs.)    No. of workers    Cumulative frequencies
(x)               (f)               (c.f.)
10–20             4                 4
20–30             6                 10
30–40             10                20
40–50             5                 25
                  N = 25
$Q_1 = l_1 + \frac{\frac{N}{4} - cf_0}{f} \times i$
where N/4 is used to find out the Q1 group,
l1 = lower limit of Q1 group
f = frequency of Q1 group
i = magnitude of Q1 group (l2 – l1)
cf0 = cumulative frequency of the group preceding Q1 group.
$\frac{N}{4} = \frac{25}{4} = 6.25$. It lies in the cumulative frequency 10, which corresponds to
class 20–30.
Therefore, Q1 group is 20–30.
where l1 = 20, f = 6, i = 10, N/4 = 6.25, cf0 = 4
$Q_1 = 20 + \frac{6.25 - 4}{6} \times 10 = 20 + 3.75 = 23.75$
$Q_3 = l_1 + \frac{\frac{3N}{4} - cf_0}{f} \times i$
$\frac{3N}{4} = \frac{3 \times 25}{4} = \frac{75}{4} = 18.75$, which lies in the cumulative frequency 20, which corresponds
to class 30–40.
Therefore Q3 group is 30–40.
where l1 = 30, i = 10, 3N/4 = 18.75, cf0 = 10, f = 10
$Q_3 = 30 + \frac{18.75 - 10}{10} \times 10$ = Rs. 38.75
Therefore :
(i) Inter-quartile range = Q3 – Q1 = Rs. 38.75 – Rs. 23.75 = Rs. 15.00
(ii) Semi-quartile range = $\frac{Q_3 - Q_1}{2} = \frac{15.00}{2} = 7.50$
(iii) Coefficient of Quartile Deviation = $\frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{38.75 - 23.75}{38.75 + 23.75} = \frac{15}{62.50} = 0.24$
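The interpolation formula for a continuous series can be sketched as a small Python function. The name grouped_quantile and the representation of classes as (lower limit, upper limit, frequency) tuples are assumptions made for this illustration; the data are those of Example 6.

```python
def grouped_quantile(classes, fraction):
    """classes: list of (lower_limit, upper_limit, frequency) in ascending order.
    fraction: 0.25 for Q1, 0.5 for the median, 0.75 for Q3."""
    total = sum(freq for _, _, freq in classes)    # N
    target = fraction * total                      # N/4 for Q1, 3N/4 for Q3
    cumulative = 0                                 # cf0 of the current class
    for lower, upper, freq in classes:
        if cumulative + freq >= target:
            # l1 + (target - cf0) / f * i, with i = upper - lower
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("fraction must lie between 0 and 1")


salaries = [(10, 20, 4), (20, 30, 6), (30, 40, 10), (40, 50, 5)]
q1 = grouped_quantile(salaries, 0.25)
q3 = grouped_quantile(salaries, 0.75)
print(q1, q3, q3 - q1, (q3 - q1) / 2, round((q3 - q1) / (q3 + q1), 2))
```

The same function gives the median of a continuous series when called with fraction 0.5.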
Advantages of Quartile Deviation
Some of the important advantages of this measure of dispersion are :
(i) It is easy to calculate. We are required simply to find the values of Q1 and Q3 and then apply
the formulae of absolute quartile deviation and coefficient of quartile deviation.
(ii) It gives better results than the range method. While calculating range, we take only the extreme
values, which make the measure of dispersion erratic. In the case of quartile deviation, we take into account
the middle 50% of the items.
(iii) The quartile deviation is not affected by the extreme items.
Disadvantages
(i) It is completely dependent on the central items. If these values are irregular and abnormal,
the result is bound to be affected.
(ii) All the items of the frequency distribution are not given equal importance in finding the
values of Q1 and Q3.
(iii) Because it does not take into account all the items of the series, it is considered to be an inaccurate
measure of dispersion.
Similarly, sometimes we calculate the percentile range, say, between the 90th and 10th percentiles, as it gives a
slightly better measure of dispersion in certain cases. The calculations are :
(i) Absolute percentile range = P90 – P10
(ii) Coefficient of percentile range = $\frac{P_{90} - P_{10}}{P_{90} + P_{10}}$
This method of calculating dispersion can generally be applied in the case of open end series
where the importance of extreme values is not considered.
3.5.3 Average Deviation
‘Average deviation’ is defined as a value which is obtained by taking the average of the deviations
of various items from a measure of central tendency (Mean or Median or Mode), after ignoring
negative signs.
Generally, the measure of central tendency from which the deviations are taken is specified
in the problem. If nothing is mentioned regarding the measure of central tendency, then
deviations are taken from the median, because the sum of the deviations (after ignoring negative signs)
is minimum when they are taken from the median.
Computation in case of raw data
(i) Absolute Average Deviation about Mean or Median or Mode = $\frac{\Sigma |d|}{N}$
where: N = Number of observations,
|d| = deviations taken from Mean or Median or Mode ignoring signs.
(ii) Coefficient of A.D. = $\frac{\text{Average Deviation about Mean or Median or Mode}}{\text{Mean or Median or Mode}}$
Steps to Compute Average Deviation :
(i) Calculate the value of Mean or Median or Mode
(ii) Take deviations from the given measure of central-tendency and they are shown as d.
(iii) Ignore the negative signs of the deviation that can be shown as |d| and add them to find
Σ |d|.
(iv) Apply the formula to get Average Deviation about Mean or Median or Mode.
Example 7 : Suppose the values are 5, 5, 10, 15, 20. We want to calculate Average Deviation and
Coefficient of Average Deviation about Mean, Median and Mode.
Solution :
Average Deviation about mean (Absolute and Coefficient).
(X)        Deviation from mean    Deviations after ignoring signs
           d                      |d|
5          –6                     6
5          –6                     6
10         –1                     1
15         +4                     4
20         +9                     9
ΣX = 55                           Σ|d| = 26
$\bar{X} = \frac{\Sigma X}{N}$ where N = 5, ΣX = 55
$\bar{X} = \frac{55}{5} = 11$
Average Deviation about Mean = $\frac{\Sigma |d|}{N} = \frac{26}{5} = 5.2$
Coefficient of Average Deviation about Mean = $\frac{\text{Mean Deviation about Mean}}{\text{Mean}} = \frac{5.2}{11} = 0.47$
Average Deviation (Absolute and Coefficient) about Median
X              Deviation from median    Deviations after ignoring
               d                        negative signs |d|
5              –5                       5
5              –5                       5
Median 10      0                        0
15             +5                       5
20             +10                      10
N = 5                                   Σ|d| = 25
Average deviation about Median = $\frac{\Sigma |d|}{N} = \frac{25}{5} = 5$
Coefficient of Average Deviation about Median = $\frac{\text{A.D. about Median}}{\text{Median}} = \frac{5}{10} = 0.5$
Average Deviation (Absolute and Coefficient) about Mode
X            Deviation from mode d    |d|
5            0                        0
Mode 5       0                        0
10           +5                       5
15           +10                      10
20           +15                      15
N = 5                                 Σ|d| = 30
Average deviation about Mode = $\frac{\Sigma |d|}{N} = \frac{30}{5} = 6$
Coefficient of Average Deviation about Mode = $\frac{\text{A.D. about Mode}}{\text{Mode}} = \frac{6}{5} = 1.2$
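The three calculations of Example 7 can be reproduced with a short Python sketch. The helper name average_deviation is an assumption made for this illustration; mean, median and mode come from Python's standard statistics module.

```python
from statistics import mean, median, mode


def average_deviation(values, centre):
    """Average of the absolute deviations of the values from the given centre."""
    return sum(abs(x - centre) for x in values) / len(values)


values = [5, 5, 10, 15, 20]
for name, centre in (("mean", mean(values)),
                     ("median", median(values)),
                     ("mode", mode(values))):
    ad = average_deviation(values, centre)
    # The coefficient of A.D. is the average deviation divided by the chosen centre
    print(name, centre, ad, round(ad / centre, 2))
```

For these data the average deviation is smallest when the deviations are taken from the median, as stated above.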
Average deviation in case of discrete and continuous series
Average Deviation about Mean or Median or Mode = $\frac{\Sigma f|d|}{N}$
where N = No. of items (Σf)
|d| = deviations from Mean or Median or Mode, after ignoring negative signs.
Coefficient of A.D. about Mean or Median or Mode = $\frac{\text{A.D. about Mean or Median or Mode}}{\text{Value of Mean or Median or Mode}}$
Example 8 : Suppose we want to calculate the coefficient of Average Deviation about Mean from the
following discrete series :
X        Frequency
10       5
15       10
20       15
25       10
30       5
Solution : First of all, we shall calculate the value of the arithmetic mean.
Calculation of Simple Mean
X        f        fX
10       5        50
15       10       150
20       15       300
25       10       250
30       5        150
         N = 45   ΣfX = 900
$\bar{X} = \frac{\Sigma fX}{N} = \frac{900}{45} = 20$
Calculation of Coefficient of Average Deviation about Mean
X        f        Deviation from mean    Deviations after ignoring    f|d|
                  d                      negative signs |d|
10       5        –10                    10                           50
15       10       –5                     5                            50
20       15       0                      0                            0
25       10       +5                     5                            50
30       5        +10                    10                           50
         N = 45                                                       Σf|d| = 200
Average Deviation about Mean = $\frac{\Sigma f|d|}{N} = \frac{200}{45} = 4.44$ approx.
Coefficient of Average Deviation about Mean = $\frac{\text{A.D. about Mean}}{\text{Mean}} = \frac{4.44}{20} = 0.22$
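For a series with frequencies, each absolute deviation is weighted by its frequency. A minimal Python sketch of this calculation is given below; the function name weighted_average_deviation is an assumption made for this illustration, and by default the deviations are taken from the arithmetic mean.

```python
def weighted_average_deviation(values, freqs, centre=None):
    """Return (centre, average deviation, coefficient of A.D.) for a
    frequency distribution; the centre defaults to the arithmetic mean."""
    n = sum(freqs)                                             # N = sum of f
    if centre is None:
        centre = sum(f * x for x, f in zip(values, freqs)) / n # sum(fX) / N
    ad = sum(f * abs(x - centre) for x, f in zip(values, freqs)) / n
    return centre, ad, ad / centre


x = [10, 15, 20, 25, 30]
f = [5, 10, 15, 10, 5]
centre, ad, coefficient = weighted_average_deviation(x, f)
print(centre, round(ad, 2), round(coefficient, 2))
```

For a continuous series the mid-points of the class intervals are passed as the values, and a median computed separately can be supplied through the centre argument.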
Now suppose we want to calculate the coefficient of Average Deviation about Median from the following
data :
Class Interval    Frequency
10–14             5
15–19             10
20–24             15
25–29             10
30–34             5
                  N = 45
First of all we shall calculate the value of the Median, but for this it is necessary to find the ‘real limits’ of
the given class-intervals. This is done by subtracting 0.5 from the lower limits and adding 0.5 to the
upper limits of the given classes. Hence, the real limits shall be : 9.5–14.5, 14.5–19.5, 19.5–24.5,
24.5–29.5 and 29.5–34.5.
Calculation of Median
Class Intervals    f         c.f.
9.5–14.5           5         5
14.5–19.5          10        15
19.5–24.5          15        30
24.5–29.5          10        40
29.5–34.5          5         45
                   N = 45
$\text{Median} = l_1 + \frac{\frac{N}{2} - Cf_0}{f} \times i$
where
l1 = lower limit of median group
i = magnitude of median group
f = frequency of median group
Cf0 = cumulative frequency of the group preceding median group
N/2 = used to find out the median group
∴ Median size = $\frac{N}{2}$th item, i.e. $\frac{45}{2}$ = 22.5
It lies in the cumulative frequency 30, which corresponds to class interval 19.5–24.5.
Median group is 19.5–24.5
$\text{Median} = 19.5 + \frac{22.5 - 15}{15} \times 5 = 19.5 + \frac{7.5}{15} \times 5 = 19.5 + 2.5 = 22$
Calculation of Coefficient of Average Deviation about Median
Class         Frequency    Mid points    Deviation from    Deviations after ignoring    f|d|
Intervals     f            x             median (22) d     negative signs |d|
9.5–14.5      5            12            –10               10                           50
14.5–19.5     10           17            –5                5                            50
19.5–24.5     15           22            0                 0                            0
24.5–29.5     10           27            +5                5                            50
29.5–34.5     5            32            +10               10                           50
              N = 45                                                                    Σf|d| = 200
Average Deviation about Median = $\frac{\Sigma f|d|}{N} = \frac{200}{45} = 4.44$ approx.
Coefficient of Average Deviation about Median = $\frac{\text{A.D. about Median}}{\text{Median}} = \frac{4.44}{22} = 0.2$ approx.
Advantages of Average Deviations
1. Average deviation takes into account all the items of a series and hence, it provides
sufficiently representative results.
2. It simplifies calculations since all signs of the deviations are taken as positive.
3. Average Deviation may be calculated by taking deviations from Mean or Median or
Mode.
4. Average Deviation is less affected by extreme items than the standard deviation, because the deviations are not squared.
5. It is easy to calculate and understand.
6. Average deviation is used to make healthy comparisons.
Disadvantages of Average Deviations
1. It is illogical and mathematically unsound to treat all negative signs as positive signs.
2. Because the method is not mathematically sound, the results obtained by this method are not
reliable.
3. This method is unsuitable for making comparisons either of the series or structure of the series.
This method is more effective in reports presented to the general public or to groups
who are not familiar with statistical methods.
3.5.4 Standard Deviation
The standard deviation, which is denoted by the Greek letter σ (read as sigma), is extremely useful in
judging the representativeness of the mean. The concept of standard deviation, which was introduced
by Karl Pearson, has practical significance because it is free from most of the defects which exist in the case
of range, quartile deviation or average deviation.
Standard deviation is calculated as the square root of the average of the squared deviations taken from
the actual mean. It is also called root mean square deviation. The square of standard deviation, i.e. σ², is
called ‘variance’ in statistics.
Calculation of standard deviation in case of raw data
There are four ways of calculating standard deviation for raw data :
(i) When actual values are considered;
(ii) When deviations are taken from actual mean;
(iii) When deviations are taken from assumed mean; and
(iv) When ‘step deviations’ are taken from assumed mean.
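(i) When the actual values are considered :
$\sigma = \sqrt{\frac{\Sigma X^2}{N} - (\bar{X})^2}$ or $\sigma^2 = \frac{\Sigma X^2}{N} - (\bar{X})^2$
where N = Number of the items,
X = Given values in the series
$\bar{X}$ = Arithmetic mean of the values.
We can also write the formula as follows :
$\sigma = \sqrt{\frac{\Sigma X^2}{N} - \left(\frac{\Sigma X}{N}\right)^2}$ where $\bar{X} = \frac{\Sigma X}{N}$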
Steps to calculate σ
(i) Compute simple mean of the given values.
(ii) Square the given values and aggregate them
(iii) Apply the formula to find the value of standard deviation.
Example 9 : Suppose the values are given as 2, 4, 6, 8, 10. We want to apply the formula
$\sigma = \sqrt{\frac{\Sigma X^2}{N} - (\bar{X})^2}$
Solution : We are required to calculate the values of N, $\bar{X}$ and ΣX². They are calculated as follows :
X          X²
2          4
4          16
6          36
8          64
10         100
N = 5      ΣX² = 220
$\bar{X} = \frac{\Sigma X}{N} = \frac{30}{5} = 6$
$\sigma = \sqrt{\frac{220}{5} - (6)^2} = \sqrt{44 - 36} = \sqrt{8} = 2.828$
Variance (σ²) = $(\sqrt{8})^2$ = 8
There are certain specific problems, where the method can be applied. It is different type of
problem which is given as follows :
(ii) When the deviations are taken from actual mean
$\sigma = \sqrt{\frac{\Sigma x^2}{N}}$
where N = no. of items and x = $(X - \bar{X})$
Steps to Calculate σ
(i) Compute the deviations of given values from actual mean i.e., $(X - \bar{X})$ and represent
them by x.
(ii) Square these deviations and aggregate them
(iii) Use the formula, $\sigma = \sqrt{\frac{\Sigma x^2}{N}}$
Example 10 : We are given values as 2, 4, 6, 8, 10. We want to find out standard deviation.
X          x = $(X - \bar{X})$     x²
2          2 – 6 = –4              (–4)² = 16
4          4 – 6 = –2              (–2)² = 4
6          6 – 6 = 0               (0)² = 0
8          8 – 6 = +2              (2)² = 4
10         10 – 6 = +4             (4)² = 16
N = 5      Σx = 0                  Σx² = 40
$\bar{X} = \frac{\Sigma X}{N} = \frac{30}{5} = 6$
$\sigma = \sqrt{\frac{\Sigma x^2}{N}} = \sqrt{\frac{40}{5}} = \sqrt{8} = 2.828$
(iii) When the deviations are taken from assumed mean
$\sigma = \sqrt{\frac{\Sigma dx^2}{N} - \left(\frac{\Sigma dx}{N}\right)^2}$
where,
N = no. of items.
dx = deviations from assumed mean i.e., (X – A).
A = assumed mean
Steps to Calculate :
(i) We consider any value as assumed mean. The value may be given in the series or may not
be given in the series.
(ii) We take deviations from the assumed value i.e., (X – A), to obtain dx for the series and
aggregate them to find Σdx.
(iii) We square these deviations to obtain dx² and aggregate them to find Σdx².
(iv) Apply the formula given above to get standard deviation.
Example 11 : Suppose the values are given as 2, 4, 6, 8 and 10. We can obtain the standard deviation
as follows, taking the assumed mean (A) = 4 :
X          dx = (X – A)         dx²
2          (2 – 4) = –2         4
4          (4 – 4) = 0          0
6          (6 – 4) = +2         4
8          (8 – 4) = +4         16
10         (10 – 4) = +6        36
N = 5      Σdx = 10             Σdx² = 60
$\sigma = \sqrt{\frac{\Sigma dx^2}{N} - \left(\frac{\Sigma dx}{N}\right)^2} = \sqrt{\frac{60}{5} - \left(\frac{10}{5}\right)^2} = \sqrt{12 - 4} = \sqrt{8} = 2.828$
(iv) When step deviations are taken from assumed mean
$\sigma = \sqrt{\frac{\Sigma dx^2}{N} - \left(\frac{\Sigma dx}{N}\right)^2} \times i$
where i = Common factor, N = Number of items, dx = Step-deviations = $\frac{X - A}{i}$
Steps to Calculate σ :
(i) We consider any value as assumed mean from the given values or from outside.
(ii) We take deviations from the assumed mean i.e., (X – A).
(iii) We divide the deviations obtained in step (ii) by a common factor to find step deviations,
represent them as dx and aggregate them to obtain Σdx.
(iv) We square the step deviations to obtain dx² and aggregate them to find Σdx².
Example 12 : We continue with the same example to understand the computation of Standard
Deviation.
X          d = (X – A)     dx = d/i, i = 2     dx²
2          –2              –1                  1
A = 4      0               0                   0
6          +2              1                   1
8          +4              2                   4
10         +6              3                   9
N = 5                      Σdx = 5             Σdx² = 15
$\sigma = \sqrt{\frac{\Sigma dx^2}{N} - \left(\frac{\Sigma dx}{N}\right)^2} \times i$
where N = 5, i = 2, Σdx = 5, Σdx² = 15
$\sigma = \sqrt{\frac{15}{5} - \left(\frac{5}{5}\right)^2} \times 2 = \sqrt{3 - 1} \times 2 = \sqrt{2} \times 2 = 1.414 \times 2 = 2.828$
Note : We can notice an important point that the value of standard deviation is identical by all four
methods. Therefore, any of the four formulae can be applied to find the value of standard
deviation. But the suitability of a formula depends on the magnitude of items in a question.
Coefficient of Standard-deviation = $\frac{\sigma}{\bar{X}}$
In the above given example, σ = 2.828 and $\bar{X}$ = 6
Therefore, coefficient of standard deviation = $\frac{\sigma}{\bar{X}} = \frac{2.828}{6} = 0.471$
Coefficient of Variation or C.V. = $\frac{\sigma}{\bar{X}} \times 100 = \frac{2.828}{6} \times 100 = 47.1\%$
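That the four formulae give the same value can be checked with a short Python sketch. The function names below are assumptions made for this illustration; the data, the assumed mean A = 4 and the common factor i = 2 are those used in Examples 9 to 12.

```python
from math import sqrt


def sd_from_values(xs):
    """Method (i): sqrt(sum(X^2)/N - mean^2)."""
    n = len(xs)
    mean = sum(xs) / n
    return sqrt(sum(x * x for x in xs) / n - mean ** 2)


def sd_from_actual_mean(xs):
    """Method (ii): sqrt(sum(x^2)/N) with x = X - mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sqrt(sum((x - mean) ** 2 for x in xs) / n)


def sd_from_assumed_mean(xs, a):
    """Method (iii): deviations dx = X - A from an assumed mean A."""
    n = len(xs)
    dx = [x - a for x in xs]
    return sqrt(sum(d * d for d in dx) / n - (sum(dx) / n) ** 2)


def sd_from_step_deviations(xs, a, i):
    """Method (iv): step deviations dx = (X - A) / i, result multiplied by i."""
    n = len(xs)
    dx = [(x - a) / i for x in xs]
    return sqrt(sum(d * d for d in dx) / n - (sum(dx) / n) ** 2) * i


data = [2, 4, 6, 8, 10]
sigma = sd_from_values(data)
print(round(sigma, 3),
      round(sd_from_actual_mean(data), 3),
      round(sd_from_assumed_mean(data, 4), 3),
      round(sd_from_step_deviations(data, 4, 2), 3))

# Coefficient of variation = sigma / mean * 100
mean = sum(data) / len(data)
print(round(sigma / mean * 100, 1))
```

All four functions compute the standard deviation with N (not N – 1) in the denominator, in line with the formulae of this lesson.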
Generally, the coefficient of variation is used to compare two or more series. If the coefficient of
variation (C.V.) is higher in one series than in the other, there is more variation in that
series and less stability or consistency in its composition. If the coefficient of variation is lower than in the
other series, the series is more stable or consistent. Moreover, the series with the lower
coefficient of variation or lower coefficient of standard deviation is always considered better.
Example 13 : Suppose we want to compare two firms where the salaries of the employees are given
as follows :
                            Firm A    Firm B
No. of workers              100       100
Mean salary (Rs.)           100       80
Standard-deviation (Rs.)    40        45
Solution : We can compare these firms either with the help of coefficient of standard deviation or
coefficient of variation. If we use coefficient of variation, then we shall apply the formula :
$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100$
Firm A : $\bar{X}$ = 100, σ = 40, C.V. = $\frac{40}{100} \times 100 = 40\%$
Firm B : $\bar{X}$ = 80, σ = 45, C.V. = $\frac{45}{80} \times 100 = 56.25\%$
Because the coefficient of variation is lower for firm A than for firm B, firm A is
better, i.e. its salaries are more consistent.
Calculation of standard-deviation in discrete and continuous series
We use the same formula for calculating standard deviation for a continuous series and a discrete
series. The only difference is that in a discrete series, values and frequencies are given, whereas in a
continuous series, class-intervals and frequencies are given. When the mid-points of these class-intervals are obtained, a continuous series takes the shape of a discrete series. The letter X denotes
values in a discrete series and mid-points in a continuous series.
When the deviations are taken from actual mean
$\sigma = \sqrt{\frac{\Sigma fx^2}{N}}$
where
N = Number of the items (Σf)
f = Frequencies corresponding to different values or class-intervals.
x = Deviations from actual mean $(X - \bar{X})$
X = Values in a discrete series and mid-points in a continuous series.
Steps to calculate σ
(i) Compute the arithmetic mean by applying the required formula.
(ii) Take deviations from the arithmetic mean and represent these deviations by x.
(iii) Square the deviations to obtain values of x².
(iv) Multiply the frequencies of the different class-intervals with x² to find fx². Aggregate the fx²
column to obtain Σfx².
(v) Apply the formula to obtain the value of standard deviation.
If we want to calculate variance, then we can take $\sigma^2 = \frac{\Sigma fx^2}{N}$
Example 14 : We can understand the procedure by taking an example :
Class Intervals    Frequency (f)    Midpoints (m)    fm
10 – 14            5                12               60
15 – 19            10               17               170
20 – 24            15               22               330
25 – 29            10               27               270
30 – 34            5                32               160
                   N = 45                            Σfm = 990
Therefore, $\bar{X} = \frac{\Sigma fm}{N} = \frac{990}{45} = 22$
where, N = 45, Σfm = 990
Calculation of Standard Deviation
Class        f         Mid points    Deviations from          x²        fx²
Intervals              X             actual mean = 22 (x)
10 – 14      5         12            –10                      100       500
15 – 19      10        17            –5                       25        250
20 – 24      15        22            0                        0         0
25 – 29      10        27            +5                       25        250
30 – 34      5         32            +10                      100       500
             N = 45                                                     Σfx² = 1500
$\sigma = \sqrt{\frac{\Sigma fx^2}{N}}$ where, N = 45, Σfx² = 1500
$\sigma = \sqrt{\frac{1500}{45}} = \sqrt{33.33} = 5.77$ approx.
When the deviations are taken from assumed mean
In some cases, the value of the simple mean may be in fractions, and then it becomes time consuming to
take deviations and square them. Alternatively, we can take deviations from the assumed
mean.
$\sigma = \sqrt{\frac{\Sigma fdx^2}{N} - \left(\frac{\Sigma fdx}{N}\right)^2}$
where
N = number of items,
dx = deviations from assumed mean (X – A),
f = frequency of the different groups,
A = assumed mean and
X = values or mid points.
Steps to calculate σ
(i) Take the assumed mean from the given values or mid points.
(ii) Take deviations from the assumed mean and represent them by dx.
(iii) Square the deviations to get dx².
(iv) Multiply f with dx of different groups to obtain fdx and add them up to get Σfdx.
(v) Multiply f with dx² of different groups to obtain fdx² and add them up to get Σfdx².
(vi) Apply the formula to get the value of standard deviation.
Example 15 : We can understand the procedure with the help of an example.
Class        Frequency    Mid point    Deviations from             dx²       fdx       fdx²
Intervals    f            x            assumed mean (17) dx
10 – 14      5            12           –5                          25        –25       125
15 – 19      10           17           0                           0         0         0
20 – 24      15           22           +5                          25        75        375
25 – 29      10           27           +10                         100       100       1000
30 – 34      5            32           +15                         225       75        1125
             N = 45                                                          Σfdx = 225    Σfdx² = 2625
$\sigma = \sqrt{\frac{\Sigma fdx^2}{N} - \left(\frac{\Sigma fdx}{N}\right)^2}$
where, N = 45, Σfdx² = 2625, Σfdx = 225
∴ $\sigma = \sqrt{\frac{2625}{45} - \left(\frac{225}{45}\right)^2} = \sqrt{58.33 - 25} = \sqrt{33.33} = 5.77$ approx.
When the step deviations are taken from the assumed mean
$\sigma = \sqrt{\frac{\Sigma fdx^2}{N} - \left(\frac{\Sigma fdx}{N}\right)^2} \times i$
where
N = Number of the items (Σf),
i = common factor
f = frequencies corresponding to the different groups,
dx = step-deviations $\left(\frac{X - A}{i}\right)$
Steps to calculate σ
(i) Take deviations from the assumed mean of the calculated mid-points and divide all
deviations by a common factor (i) and represent these values by dx.
(ii) Square these step deviations dx to obtain dx² for different groups.
(iii) Multiply f with dx of different groups to find fdx and add them to obtain Σfdx.
(iv) Multiply f with dx² of different groups to find fdx² for different groups and add them to
obtain Σfdx².
(v) Apply the formula to get standard deviation.
Example 16 : Suppose we are given the following series and we want to calculate standard deviation with the
help of the step deviation method. According to the given formula, we are required to calculate the
values of i, N, Σfdx and Σfdx².
Class        Frequency    Mid point    Deviations from assumed    dx = (X – A)/i    dx²    fdx     fdx²
Intervals    f            X            mean (22) (X – A)          i = 5
10 – 14      5            12           –10                        –2                4      –10     20
15 – 19      10           17           –5                         –1                1      –10     10
20 – 24      15           22           0                          0                 0      0       0
25 – 29      10           27           +5                         +1                1      10      10
30 – 34      5            32           +10                        +2                4      10      20
             N = 45                                                                        Σfdx = 0    Σfdx² = 60
$\sigma = \sqrt{\frac{\Sigma fdx^2}{N} - \left(\frac{\Sigma fdx}{N}\right)^2} \times i$
where, N = 45, i = 5, Σfdx = 0, Σfdx² = 60
∴ $\sigma = \sqrt{\frac{60}{45} - \left(\frac{0}{45}\right)^2} \times 5 = \sqrt{\frac{4}{3}} \times 5 = \sqrt{1.33} \times 5 = 1.154 \times 5 = 5.77$ approx.
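The grouped-data calculation of Examples 14 to 16 can be sketched in Python as one function. The name grouped_sd and its parameters are assumptions made for this illustration; with step = 1 the function reduces to the assumed-mean method, and with the assumed mean equal to the actual mean it reduces to the actual-mean method.

```python
from math import sqrt


def grouped_sd(midpoints, freqs, assumed_mean, step=1):
    """Standard deviation of a frequency distribution by the
    step-deviation method: sqrt(sum(f*dx^2)/N - (sum(f*dx)/N)^2) * i."""
    n = sum(freqs)                                          # N = sum of f
    dx = [(x - assumed_mean) / step for x in midpoints]     # step deviations
    sum_fdx = sum(f * d for f, d in zip(freqs, dx))
    sum_fdx2 = sum(f * d * d for f, d in zip(freqs, dx))
    return sqrt(sum_fdx2 / n - (sum_fdx / n) ** 2) * step


midpoints = [12, 17, 22, 27, 32]
freqs = [5, 10, 15, 10, 5]
print(round(grouped_sd(midpoints, freqs, assumed_mean=22, step=5), 2))  # Example 16
print(round(grouped_sd(midpoints, freqs, assumed_mean=17), 2))          # Example 15
```

Both calls give the same value, which illustrates that the choice of assumed mean and common factor does not change the result.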
Advantages of Standard Deviation
(i) Standard deviation is the best measure of dispersion because it takes into account all the
items and is capable of further algebraic treatment and statistical analysis.
(ii) It is possible to calculate the combined standard deviation of two or more series.
(iii) The measure is most suitable for making comparisons among two or more series about
variability.
Disadvantages
(i) It is difficult to compute.
(ii) It assigns more weight to extreme items and less weight to items that are nearer to the mean.
This is because the squares of the deviations which are large in size are
proportionately greater than the squares of those deviations which are comparatively small.
3.5.5 Mathematical Properties of Standard Deviation
(i) If deviations of given items are taken from the arithmetic mean and squared, then the sum of
the squared deviations is minimum, i.e., $\Sigma (X - \bar{X})^2$ = Minimum.
(ii) If different values are increased or decreased by a constant, the standard deviation will
remain the same. Whereas if different values are multiplied or divided by a constant, then
the standard deviation will be multiplied or divided by that constant.
(iii) Combined standard deviation can be obtained for two or more series with the formula given
below :
$\sigma_{12} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$
where : N1 represents number of items in first series,
N2 represents number of items in second series,
$\sigma_1^2$ represents variance of first series,
$\sigma_2^2$ represents variance of second series,
d1 represents the difference $\bar{X}_{12} - \bar{X}_1$,
d2 represents the difference $\bar{X}_{12} - \bar{X}_2$,
$\bar{X}_1$ represents arithmetic mean of first series,
$\bar{X}_2$ represents arithmetic mean of second series,
$\bar{X}_{12}$ represents combined arithmetic mean of both the series.
Example 17 : Find the combined standard deviation of two series from the information given below :
                      First Series    Second Series
No. of items          10              15
Arithmetic means      15              20
Standard deviation    4               5
Solution : Since we are considering two series, the combined standard deviation is computed
by the following formula :
$\sigma_{12} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$
where : N1 = 10, N2 = 15, $\bar{X}_1$ = 15, $\bar{X}_2$ = 20, σ1 = 4, σ2 = 5
$\bar{X}_{12} = \frac{\bar{X}_1 N_1 + \bar{X}_2 N_2}{N_1 + N_2}$
or $\bar{X}_{12} = \frac{(15 \times 10) + (20 \times 15)}{10 + 15} = \frac{150 + 300}{25} = \frac{450}{25} = 18$
$d_1 = (\bar{X}_{12} - \bar{X}_1) = 18 - 15 = 3$
and $d_2 = (\bar{X}_{12} - \bar{X}_2) = 18 - 20 = -2$.
By applying the formula of combined standard deviation, we get :
$\sigma_{12} = \sqrt{\frac{10(4)^2 + 15(5)^2 + 10(18 - 15)^2 + 15(18 - 20)^2}{10 + 15}}$
$= \sqrt{\frac{(10 \times 16) + (15 \times 25) + (10 \times 9) + (15 \times 4)}{25}}$
$= \sqrt{\frac{160 + 375 + 90 + 60}{25}} = \sqrt{\frac{685}{25}} = \sqrt{27.4} = 5.2$ approx.
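The combined standard deviation of two series can be sketched in Python directly from the formula above. The function name combined_sd and its argument order are assumptions made for this illustration.

```python
from math import sqrt


def combined_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Return (combined mean, combined standard deviation) of two series."""
    combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = combined_mean - mean1          # d1 in the text
    d2 = combined_mean - mean2          # d2 in the text
    variance = (n1 * sd1 ** 2 + n2 * sd2 ** 2
                + n1 * d1 ** 2 + n2 * d2 ** 2) / (n1 + n2)
    return combined_mean, sqrt(variance)


mean12, sd12 = combined_sd(10, 15, 4, 15, 20, 5)
print(mean12, round(sd12, 1))
```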
(iv) Standard deviation of the first N natural numbers can be computed as :
$\sigma = \sqrt{\frac{1}{12}(N^2 - 1)}$ where, N represents number of items
(v) For a symmetrical (normal) distribution,
$\bar{X} \pm \sigma$ covers 68.27% of items,
$\bar{X} \pm 2\sigma$ covers 95.45% of items,
$\bar{X} \pm 3\sigma$ covers 99.73% of items.
Example 18 : You are heading a rationing department in a State affected by food shortage. Local
investigators submit the following report :
Daily calorie value of food available per adult during current period :
Area    Mean     Standard Deviation
A       2,500    400
B       2,000    200
The estimated requirement of an adult is taken as 2,800 calories daily and the absolute minimum
is 1,350. Comment on the reported figures, and determine which area, in your opinion, needs more
urgent attention.
Solution : We know that $\bar{X} \pm \sigma$ covers 68.27% of items, $\bar{X} \pm 2\sigma$ covers 95.45% of items, and
$\bar{X} \pm 3\sigma$ covers 99.73% of cases. In the given problem, if we take into consideration 99.73%, i.e., almost
the whole population, the limits would be $\bar{X} \pm 3\sigma$.
For Area A these limits are :
$\bar{X} + 3\sigma$ = 2,500 + (3 × 400) = 3,700
$\bar{X} - 3\sigma$ = 2,500 – (3 × 400) = 1,300
For Area B these limits are :
$\bar{X} + 3\sigma$ = 2,000 + (3 × 200) = 2,600
$\bar{X} - 3\sigma$ = 2,000 – (3 × 200) = 1,400
It is clear from the above limits that in Area A there are some persons who are getting as little as 1,300
calories, i.e. below the minimum, which is 1,350. But in the case of Area B there is no one who is getting
less than the minimum. Hence Area A needs more urgent attention.
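The limits used in Example 18 can be computed with a very small Python sketch. The function name sigma_limits and the dictionary of areas are assumptions made for this illustration; the reasoning relies, as in the text, on the distribution being approximately symmetrical (normal).

```python
def sigma_limits(mean, sd, k=3):
    """Return the limits mean - k*sd and mean + k*sd."""
    return mean - k * sd, mean + k * sd


minimum_calories = 1350
areas = {"A": (2500, 400), "B": (2000, 200)}
for name, (mean, sd) in areas.items():
    lower, upper = sigma_limits(mean, sd)
    # An area needs attention if its lower 3-sigma limit falls below the minimum
    print(name, lower, upper, lower < minimum_calories)
```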
(vi) Relationship between quartile deviation, average deviation and standard deviation is given
(approximately, for a normal or moderately skewed distribution) as :
Quartile deviation = 2/3 Standard deviation
Average deviation = 4/5 Standard deviation
(vii) We can also compute corrected standard deviation by using the following formula :
Corrected $\sigma = \sqrt{\frac{\text{Corrected } \Sigma X^2}{N} - (\text{corrected } \bar{X})^2}$
(a) Compute corrected $\bar{X} = \frac{\text{Corrected } \Sigma X}{N}$
where corrected ΣX = ΣX + correct items – wrong items
where ΣX = $N \cdot \bar{X}$
(b) Compute corrected ΣX² = ΣX² + (Each correct item)² – (Each wrong item)²
where ΣX² = $N\sigma^2 + N\bar{X}^2$
Example 19 : (a) Find out the coefficient of variation of a series for which the following results are
given :
N = 50, ΣX′ = 25, ΣX′² = 500
where : X′ = deviation from the assumed average 5.
(b) For a frequency distribution of marks in statistics of 100 candidates (grouped in class
intervals of 0–10, 10–20), the mean and standard deviation were found to be 45 and 20.
Later it was discovered that the score 54 was misread as 64 in obtaining the frequency
distribution. Find out the correct mean and correct standard deviation of the frequency
distribution.
(c) Can the coefficient of variation be greater than 100%? If so, when?
Solution : (a) We want to calculate the coefficient of variation, which is = $\frac{\sigma}{\bar{X}} \times 100$.
Therefore, we are required to calculate mean and standard deviation.
Calculation of simple mean
$\bar{X} = A + \frac{\Sigma X'}{N}$ where, A = 5, N = 50, ΣX′ = 25
∴ $\bar{X} = 5 + \frac{25}{50} = 5.5$
Calculation of standard deviation
$\sigma = \sqrt{\frac{\Sigma X'^2}{N} - \left(\frac{\Sigma X'}{N}\right)^2} = \sqrt{\frac{500}{50} - \left(\frac{25}{50}\right)^2} = \sqrt{10 - 0.25} = \sqrt{9.75} = 3.122$
Calculation of Coefficient of variation
$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{3.122}{5.5} \times 100 = \frac{312.2}{5.5} = 56.8\%$ approx.
(b) Given $\bar{X}$ = 45, σ = 20, N = 100, wrong value = 64, correct value = 54.
Since this is a case of continuous series, we will apply the formulae for mean and
standard deviation that are applicable in continuous series.
Calculation of correct Mean
$\bar{X} = \frac{\Sigma fX}{N}$ or $N\bar{X} = \Sigma fX$
By substituting the values, we get ΣfX = 100 × 45 = 4500
Correct ΣfX = 4500 – 64 + 54 = 4490
∴ Correct $\bar{X} = \frac{\text{Correct } \Sigma fX}{N} = \frac{4490}{100} = 44.9$
Calculation of correct σ
$\sigma = \sqrt{\frac{\Sigma fX^2}{N} - (\bar{X})^2}$ or $\sigma^2 = \frac{\Sigma fX^2}{N} - (\bar{X})^2$
where, σ = 20, N = 100, $\bar{X}$ = 45.
$(20)^2 = \frac{\Sigma fX^2}{100} - (45)^2$
or $400 = \frac{\Sigma fX^2}{100} - 2025$
or $400 + 2025 = \frac{\Sigma fX^2}{100}$
or ΣfX² = 2425 × 100 = 242500
∴ Correct ΣfX² = 242500 – (64)² + (54)² = 242500 – 4096 + 2916 = 242500 – 1180 = 241320
Correct $\sigma = \sqrt{\frac{\text{Correct } \Sigma fX^2}{N} - (\text{Correct } \bar{X})^2}$
$= \sqrt{\frac{241320}{100} - (44.9)^2} = \sqrt{2413.20 - 2016.01} = \sqrt{397.19} = 19.93$ approx.
(c) The formula for the computation of coefficient of variation is = $\frac{\sigma}{\bar{X}} \times 100$.
Hence, the coefficient of variation can be greater than 100% only when the value of standard
deviation is greater than the value of mean.
This will happen when the data contain a large number of small items and a few items that are quite
large. In such a case the value of the simple mean will be pulled down and the value of standard
deviation will go up.
Similarly, if there are negative items in a series, the value of the mean will come down, while the
value of standard deviation will not come down because the deviations are squared.
Example 20 : In a distribution of 10 observations, the value of mean and standard deviation are
given as 20 and 8. By mistake, two values are taken as 2 and 6 instead of 4 and 8. Find out the value
of correct mean and variance.
Solution : We are given; N = 10, X = 20, σ = 3
Wrong values = 2 and 6 and Correct values = 4 and 8
Calculation of correct Mean
ΣX
X = N or X = ΣX
∴ ΣX = 10 × 20 = 200
But ΣX is incorrect. Therefore we shall find correct ΣX.
Correct ΣX = 200 – 2 – 6 + 4 + 8 = 204
Correct Mean =
Correct ΣX 204
=
= 20.4
N
10
Calculation of correct variance
$\sigma = \sqrt{\frac{\Sigma X^2}{N} - (\bar{X})^2}$ or $\sigma^2 = \frac{\Sigma X^2}{N} - (\bar{X})^2$
$(8)^2 = \frac{\Sigma X^2}{10} - (20)^2$
or $64 = \frac{\Sigma X^2}{10} - 400$
or $64 + 400 = \frac{\Sigma X^2}{10}$
or ΣX² = 4640
But this is wrong and hence we shall compute correct ΣX²
Correct ΣX² = 4640 – 2² – 6² + 4² + 8²
= 4640 – 4 – 36 + 16 + 64
= 4680
Correct $\sigma^2 = \frac{\text{Correct } \Sigma X^2}{N} - (\text{Correct } \bar{X})^2$
$= \frac{4680}{10} - (20.4)^2 = 468 - 416.16 = 51.84$
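The correction procedure used in the two examples above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name corrected_stats and its parameters are assumptions made here, not part of the lesson. It rebuilds ΣX and ΣX² from the reported mean and standard deviation, swaps the wrong values for the correct ones, and recomputes the mean and variance.

```python
import math

def corrected_stats(n, mean, sd, wrong, right):
    """Return (corrected mean, corrected variance) after replacing wrong values.

    Hypothetical helper: rebuilds the totals from the reported mean and S.D.
    """
    total = n * mean                        # recorded sum of X
    total_sq = n * (sd ** 2 + mean ** 2)    # recorded sum of X^2, from var = sum(X^2)/N - mean^2
    total += sum(right) - sum(wrong)        # remove wrong items, add correct ones
    total_sq += sum(v * v for v in right) - sum(v * v for v in wrong)
    new_mean = total / n
    new_var = total_sq / n - new_mean ** 2
    return new_mean, new_var

# Example 20: N = 10, mean = 20, S.D. = 8; 2 and 6 were recorded instead of 4 and 8
mean20, var20 = corrected_stats(10, 20, 8, wrong=[2, 6], right=[4, 8])
print(round(mean20, 2), round(var20, 2))

# Earlier example: N = 100, mean = 45, S.D. = 20; 64 was recorded instead of 54
mean_b, var_b = corrected_stats(100, 45, 20, wrong=[64], right=[54])
print(round(mean_b, 2), round(math.sqrt(var_b), 2))
```

Working from the totals ΣX and ΣX² mirrors the hand calculation, so each intermediate value can be compared with the steps shown in the solution.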
3.5.6 Graphic Method of Variation
The concept of the Lorenz curve was devised by Max O. Lorenz. It is also called a cumulative percentage curve, in which the cumulative percentage of the items (such as wealth, profits, etc.) is plotted against the cumulative percentage of the frequencies.
The Lorenz curve can be drawn in the following manner:
(i) The sizes of the items and the frequencies are converted into percentages.
(ii) Cumulative percentages are obtained for both the items and the frequencies.
(iii) We take cumulative frequency percentages on the OX-axis. The values on the OX-axis start from 100% and decrease to 0%.
(iv) We take cumulative item percentages on the OY-axis and divide the axis into equal parts in such a manner that the divisions on both axes are equal. On the OY-axis, the values start from 0% and increase to 100%.
(v) Join 0% of the OX-axis and 100% of the OY-axis by a straight line, which is called the line of Equal Distribution.
(vi) We plot the points of the different series on the basis of cumulative item percentages and cumulative frequency percentages and join these points to draw a ‘Lorenz curve’ for each series.
(vii) The Lorenz curves are compared with the line of Equal Distribution, and the distance between them determines the variability. A Lorenz curve located farther away from the line of equal distribution indicates more variability.
Example 21 : Draw Lorenz curves for the profits of companies located in two different areas, given below, and interpret the results.
Profit earned      No. of companies
(Rs. lakhs)        Area A    Area B
10                 4         24
20                 10        18
30                 6         5
40                 5         3
Solution :
Profits earned     No. of Companies     Percentage             Cumulative Percentage
(Rs. lakhs) X      fA       fB          X      fA     fB       CX     CfA    CfB
10                 4        24          10     16     48       10     16     48
20                 10       18          20     40     36       30     56     84
30                 6        5           30     24     10       60     80     94
40                 5        3           40     20     6        100    100    100
Total: 100         25       50          100    100    100
Figure : Lorenz curve of Area A companies and Lorenz curve of Area B companies, drawn with the line of equal distribution. X-axis: cumulative frequency percentages (100 to 0); Y-axis: cumulative item percentages (0 to 100).
We notice in the above diagram that the Lorenz curve for Area B companies is away from the
line of equal distribution in comparison with Lorenz curve for Area A. Therefore, we can conclude
that there is more variability in Area B companies as compared to Area A companies.
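The cumulative percentages that are plotted on the two axes can be reproduced with a short Python sketch. The helper name cumulative_percentages is an assumption introduced here for illustration; the data are those of Example 21.

```python
from itertools import accumulate

def cumulative_percentages(values):
    """Convert values to cumulative percentages of their total (hypothetical helper)."""
    total = sum(values)
    return [round(100 * c / total, 2) for c in accumulate(values)]

profits = [10, 20, 30, 40]   # X, profit earned (Rs. lakhs)
area_a = [4, 10, 6, 5]       # fA, number of companies in Area A
area_b = [24, 18, 5, 3]      # fB, number of companies in Area B

cx = cumulative_percentages(profits)   # cumulative item percentages (OY-axis)
cfa = cumulative_percentages(area_a)   # cumulative frequency percentages, Area A (OX-axis)
cfb = cumulative_percentages(area_b)   # cumulative frequency percentages, Area B (OX-axis)

for row in zip(cx, cfa, cfb):
    print(row)
```

Each printed row gives one point (CfA, CX) on the Area A curve and one point (CfB, CX) on the Area B curve.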
3.6 REVISIONARY PROBLEMS
Example 22 : Compute (a) Inter-quartile range,
(b) Semi-quartile range, and
(c) Coefficient of quartile deviation from the following data :
Farm Size (acres)    No. of firms
0–40                 394
41–80                461
81–120               391
121–160              334
161–200              169
201–240              113
241 and over         148
Solution : In this case, the real limits of the class intervals can be obtained by subtracting 0.5 from
the lower limits of the class intervals and adding 0.5 to the upper limits of the different class intervals.
This adjustment is necessary to calculate median and quartiles of the series.
Farm Size (acres)    No. of firms    Cumulative frequency (c.f.)
0–40                 394             394
41–80                461             855
81–120               391             1246
121–160              334             1580
161–200              169             1749
201–240              113             1862
241 and over         148             2010
                     N = 2010
$Q_1 = l_1 + \frac{N/4 - c.f_0}{f} \times i$
$\frac{N}{4} = \frac{2010}{4}$ = 502.5th item
Q1 lies in the cumulative frequency of the group 41–80, where the real limits of the class interval are 40.5–80.5 and l1 = 40.5, f = 461, i = 40, c.f0 = 394, N/4 = 502.5
∴ $Q_1 = 40.5 + \frac{502.5 - 394}{461} \times 40 = 40.5 + 9.4 = 49.9$ acres
Similarly,
$Q_3 = l_1 + \frac{3N/4 - c.f_0}{f} \times i$
$\frac{3N}{4} = \frac{3 \times 2010}{4}$ = 1507.5th item
Q3 lies in the cumulative frequency of the group 121–160, where the real limits of the class interval are 120.5–160.5 and l1 = 120.5, i = 40, f = 334, 3N/4 = 1507.5, c.f0 = 1246
∴ $Q_3 = 120.5 + \frac{1507.5 - 1246}{334} \times 40 = 120.5 + 31.3 = 151.8$ acres
Inter-quartile range = Q3 – Q1 = 151.8 – 49.9 = 101.9 acres
Semi-quartile range $= \frac{Q_3 - Q_1}{2} = \frac{151.8 - 49.9}{2} = 50.95$ approx.
Coefficient of quartile deviation $= \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{151.8 - 49.9}{151.8 + 49.9} = \frac{101.9}{201.7} = 0.5$ approx.
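The interpolation used for Q1 and Q3 can be checked with the Python sketch below. The function grouped_quantile and its arguments are hypothetical names chosen for this illustration. It assumes the real class limits described in the solution; the width of the open-ended last class is unknown, so it is left as None, which is harmless here because neither quartile falls in that class.

```python
def grouped_quantile(classes, p):
    """Interpolated quantile for grouped data: l1 + (p*N - c.f0) / f * i.

    classes is a list of (real lower limit, class width, frequency) tuples.
    """
    n = sum(f for _, _, f in classes)
    target = p * n          # N/4 for Q1, 3N/4 for Q3
    cum = 0                 # cumulative frequency before the current class (c.f0)
    for lower, width, f in classes:
        if f > 0 and cum + f >= target:
            return lower + (target - cum) / f * width
        cum += f
    raise ValueError("p must lie between 0 and 1")

farms = [
    (-0.5, 41, 394),     # 0-40, real limits -0.5 to 40.5
    (40.5, 40, 461),     # 41-80
    (80.5, 40, 391),     # 81-120
    (120.5, 40, 334),    # 121-160
    (160.5, 40, 169),    # 161-200
    (200.5, 40, 113),    # 201-240
    (240.5, None, 148),  # 241 and over: open-ended, width unknown
]

q1 = grouped_quantile(farms, 0.25)
q3 = grouped_quantile(farms, 0.75)
print(round(q1, 1), round(q3, 1))                        # quartiles in acres
print(round(q3 - q1, 1), round((q3 - q1) / (q3 + q1), 2))  # inter-quartile range and coefficient
```

Small differences from the hand-worked figures in the last decimal place come from rounding the intermediate values in the solution.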
Example 23 : Calculate mean and coefficient of mean deviation about mean from the following
data :
Marks less than    No. of students
10                 4
20                 10
30                 20
40                 40
50                 50
60                 56
70                 60
Solution : In this question, we are given less than type series alongwith the cumulative frequencies.
Therefore, we are required first of all to find out class intervals and frequencies for calculating
mean and coefficient of mean deviation about mean.
Marks     No. of      Mid       Deviations from    Step Deviation      Deviations from      fdx     f|dx|
          students    points    assumed Mean       (i = 10)            mean (35)
          f           X         (A = 35) X′        dx = (X – A)/i      (ignoring signs) |dx|
0–10      4           5         – 30               –3                  3                    – 12    12
10–20     6           15        – 20               –2                  2                    – 12    12
20–30     10          25        – 10               –1                  1                    – 10    10
30–40     20          35        0                  0                   0                    0       0
40–50     10          45        + 10               +1                  1                    + 10    10
50–60     6           55        + 20               +2                  2                    + 12    12
60–70     4           65        + 30               +3                  3                    + 12    12
          N = 60                                                                            Σfdx = 0    Σf|dx| = 68
$\bar{X} = A + \frac{\Sigma fdx}{N} \times i$
where, N = 60, A = 35, i = 10, Σfdx = 0
∴ $\bar{X} = 35 + \left(\frac{0}{60}\right) \times 10 = 35$
M.D. about mean $= \frac{\Sigma f|dx|}{N} \times i = \frac{68}{60} \times 10 = 11.33$
Coefficient of M.D. about mean $= \frac{\text{M.D. about mean}}{\text{mean}} = \frac{11.33}{35} = 0.324$ approx.
Example 24 : Calculate standard deviation from the following data:
Class interval    Frequency
–30 to –20        5
–20 to –10        10
–10 to 0          15
0 to 10           10
10 to 20          5
                  N = 45
Solution :
Calculation of Standard Deviation
Class         Frequency    Mid       Deviations from    Step Deviations     dx²    fdx     fdx²
Intervals     f            points    assumed Mean       when i = 10
                           X         (A = –5) X′        dx = (X – A)/i
–30 to –20    5            – 25      – 20               –2                  4      – 10    20
–20 to –10    10           – 15      – 10               –1                  1      – 10    10
–10 to 0      15           – 5       0                  0                   0      0       0
0 to 10       10           5         + 10               1                   1      10      10
10 to 20      5            15        + 20               2                   4      10      20
              N = 45                                                               Σfdx = 0    Σfdx² = 60
$\sigma = \sqrt{\frac{\Sigma fdx^2}{N} - \left(\frac{\Sigma fdx}{N}\right)^2} \times i$
where N = 45, i = 10, Σfdx = 0, Σfdx² = 60
∴ $\sigma = \sqrt{\frac{60}{45} - \left(\frac{0}{45}\right)^2} \times 10 = \sqrt{\frac{60}{45}} \times 10 = \sqrt{1.33} \times 10 = 1.155 \times 10 = 11.55$
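The step-deviation method of Example 24 can be verified with the following Python sketch. The function name step_deviation_sd and its parameters are assumptions made for this illustration only.

```python
import math

def step_deviation_sd(midpoints, freqs, assumed_mean, width):
    """Mean and standard deviation of grouped data by the step-deviation method."""
    n = sum(freqs)
    d = [(x - assumed_mean) / width for x in midpoints]      # dx = (X - A) / i
    sum_fd = sum(f * di for f, di in zip(freqs, d))          # sum of f*dx
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))    # sum of f*dx^2
    mean = assumed_mean + sum_fd / n * width
    sd = math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * width
    return mean, sd

# Example 24: mid points of the classes -30 to -20, ..., 10 to 20
mean24, sd24 = step_deviation_sd([-25, -15, -5, 5, 15], [5, 10, 15, 10, 5],
                                 assumed_mean=-5, width=10)
print(round(mean24, 2), round(sd24, 2))
```

Because Σfdx = 0 in this example, the assumed mean coincides with the actual mean and the second term under the square root vanishes.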
Example 25 : For two firms A and B belonging to the same industry, the following details are available :
                                               Firm A     Firm B
Number of Employees :                          100        200
Average wage per month :                       Rs. 240    Rs. 170
Standard deviation of the wage per month :     Rs. 6      Rs. 8
Find (i) Which firm pays out larger amount as monthly wages?
(ii) Which firm shows greater variability in the distribution of wages?
(iii) Find average monthly wage and the standard deviation of the wages of all employees of the two firms taken together.
Solution : (i) For finding out which firm pays larger amount, we have to find out ΣX.
$\bar{X} = \frac{\Sigma X}{N}$ or $\Sigma X = N\bar{X}$
Firm A : N = 100, $\bar{X}$ = 240
∴ ΣX = 100 × 240 = 24000
Firm B : N = 200, $\bar{X}$ = 170
∴ ΣX = 200 × 170 = 34000
Hence firm B pays larger amount as monthly wages.
(ii) For finding out which firm shows greater variability in the distribution of wages, we have
to calculate coefficient of variation
Firm A : C.V. $= \frac{\sigma}{\bar{X}} \times 100 = \frac{6}{240} \times 100 = 2.50$
Firm B : C.V. $= \frac{\sigma}{\bar{X}} \times 100 = \frac{8}{170} \times 100 = 4.71$
Since coefficient of variation is greater for firm B, hence it shows greater variability in the
distribution of wages.
(iii) Combined wage : $\bar{X}_{12} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2}$
where, N1 = 100, $\bar{X}_1$ = 240, N2 = 200, $\bar{X}_2$ = 170
Hence $\bar{X}_{12} = \frac{(100 \times 240) + (200 \times 170)}{100 + 200} = \frac{24000 + 34000}{300} = 193.33$
Combined Standard Deviation
$\sigma_{12} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$
where N1 = 100, N2 = 200, σ1 = 6, σ2 = 8, $d_1 = (\bar{X}_1 - \bar{X}_{12})$ = 240 – 193.3 = 46.7 and $d_2 = (\bar{X}_2 - \bar{X}_{12})$ = 170 – 193.3 = – 23.3
$\sigma_{12} = \sqrt{\frac{(100)(36) + (200)(64) + (100)(46.7)^2 + (200)(-23.3)^2}{100 + 200}}$
$= \sqrt{\frac{3600 + 12800 + 218089 + 108578}{300}} = \sqrt{\frac{343067}{300}} = 33.81$
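The combined mean and combined standard deviation of part (iii) can be cross-checked with a short Python sketch. The function name combined_mean_sd is an assumption introduced here; it uses the same formula as the solution but carries the unrounded combined mean, so the result may differ slightly in the last decimal place.

```python
import math

def combined_mean_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combined mean and S.D. of two groups (hypothetical helper)."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    d1, d2 = mean1 - mean, mean2 - mean     # deviations of group means from combined mean
    var = (n1 * sd1 ** 2 + n2 * sd2 ** 2 + n1 * d1 ** 2 + n2 * d2 ** 2) / n
    return mean, math.sqrt(var)

# Example 25: Firm A (100 employees, Rs. 240, S.D. 6), Firm B (200 employees, Rs. 170, S.D. 8)
mean12, sd12 = combined_mean_sd(100, 240, 6, 200, 170, 8)
print(round(mean12, 2), round(sd12, 2))
```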
Example 26 : From the following frequency distribution of heights of 360 boys in the age-group
10–20 years, calculate the:
(i) arithmetic mean;
(ii) coefficient of variation; and
(iii) quartile deviation
Height (cms)    No. of boys    Height (cms)    No. of boys
126–130         31             146–150         60
131–135         44             151–155         55
136–140         48             156–160         43
141–145         51             161–165         28
Solution :
Calculation of $\bar{X}$, Q.D., and C.V.
Heights     m.p.    f      dx = (X – 143)/5    fdx     fdx²    c.f.
            X
126–130     128     31     –3                  –93     279     31
131–135     133     44     –2                  –88     176     75
136–140     138     48     –1                  –48     48      123
141–145     143     51     0                   0       0       174
146–150     148     60     +1                  +60     60      234
151–155     153     55     +2                  +110    220     289
156–160     158     43     +3                  +129    387     332
161–165     163     28     +4                  +112    448     360
                    N = 360                    Σfdx = 182    Σfdx² = 1618
(i) $\bar{X} = A + \frac{\Sigma fdx}{N} \times i$ where, N = 360, A = 143, i = 5, Σfdx = 182
∴ $\bar{X} = 143 + \frac{182}{360} \times 5 = 143 + 2.53 = 145.53$.
(ii) C.V. $= \frac{\sigma}{\bar{X}} \times 100$
$\sigma = \sqrt{\frac{\Sigma fdx^2}{N} - \left(\frac{\Sigma fdx}{N}\right)^2} \times i = \sqrt{\frac{1618}{360} - \left(\frac{182}{360}\right)^2} \times 5$
$= \sqrt{4.494 - 0.256} \times 5 = 2.059 \times 5 = 10.29$
C.V. $= \frac{10.29}{145.53} \times 100 = 7.07$ per cent
(iii) Q.D. $= \frac{Q_3 - Q_1}{2}$
Q1 = Size of $\frac{N}{4}$th observation $= \frac{360}{4}$ = 90th observation
Q1 lies in the class 136–140. But the real limit of this class is 135.5–140.5.
$Q_1 = l_1 + \frac{N/4 - cf_0}{f} \times i = 135.5 + \frac{90 - 75}{48} \times 5 = 135.5 + 1.56 = 137.06$
Q3 = Size of $\frac{3N}{4}$th observation $= 3 \times \frac{360}{4}$ = 270th observation
Q3 lies in the class 151–155. But the real limit of this class is 150.5–155.5.
$Q_3 = l_1 + \frac{3N/4 - cf_0}{f} \times i = 150.5 + \frac{270 - 234}{55} \times 5 = 150.5 + 3.27 = 153.77$.
Q.D. $= \frac{Q_3 - Q_1}{2} = \frac{153.77 - 137.06}{2} = 8.355$.
3.7 SUMMARY
● While averages summarize and present data in a single number, variation is studied to get a better idea of the nature of data.
● Variation can be absolute or relative. Absolute variation refers to the amount of variation in a set of data while relative variation serves to compare variability across different sets of data.
● The ‘distance’ measures of variation include range and partial ranges including inter-quartile range and inter-percentile range which are used in addition to or as surrogates for range. Range is commonly used in reporting price movements, quality control, etc. Coefficient of range is a relative measure.
● The measures involving deviations include quartile deviation and its coefficient; mean deviation and its coefficient; and standard deviation, variance and coefficient of variation.
● Quartile deviation is a quick, inspectional measure of variability and is used when there are scattered or extreme values included in the data.
● A measure based on each observation in the data is the mean deviation, which is equal to the average of the absolute deviations of the various observations from their mean or median. The relative measure related to this is the coefficient of mean deviation.
● Standard deviation is also based on all observations. It is regarded as the best measure of variation because it is amenable to further mathematical treatment.
● Coefficient of standard deviation is sometimes used instead of coefficient of variation.
● All coefficients are pure numbers and there are no units associated with them. Hence they are used for making comparisons of variability.
● Graphically, Lorenz curve is used to describe inequalities of income. The extent of departure of the curve of actual distribution of income from the line of equal distribution indicates the degree of inequalities of income.
● A well-defined relationship exists between the values of quartile deviation, mean deviation and standard deviation in the case of normal distributions. The relationship works well even for distributions which deviate moderately from normality.
3.8 SELF ASSESSMENT QUESTIONS
Exercise 1 : True and False Statements
(i) Measures of variation attempt to present in a single number the amount of variation in a set
of data.
(ii) All measures of relative variation are pure numbers with no units attached to them.
(iii) Range cannot be negative.
(iv) Range cannot be determined in open-ended frequency distributions.
(v) Coefficient of range cannot be greater than 1.
(vi) Quartile deviation is same as semi inter-quartile range.
(vii) Only absolute values of deviations are considered in the calculation of mean deviation.
(viii) Since the sum of absolute deviations measured from median is the minimum, this serves
as the most appropriate average for calculating mean deviation.
(ix) Since the sum of deviations of a set of values from their mean is equal to zero, it follows
that mean deviation from mean would always be equal to zero.
(x) Mean deviation can never be negative.
(xi) Mean deviation cannot be calculated for distributions with open-ended classes.
(xii) The arithmetic mean is used for measuring deviations in calculating standard deviation
due to its least squares property.
(xiii) Standard deviation cannot be equal to zero.
(xiv) Standard deviation can never exceed the arithmetic mean.
(xv) Standard deviation is positive or negative depending upon the sign of deviations of various
values from their mean.
(xvi) Variance is the square root of standard deviation.
(xvii) Coefficient of variation is always expressed as a percentage.
(xviii) Coefficient of standard deviation is equal to the ratio of standard deviation to arithmetic
mean of the data.
(xix) Coefficient of variation expresses arithmetic mean as a percentage of standard deviation.
(xx) If each of the values of a set of data is increased by 5, the mean and standard deviation
would both increase by 5.
(xxi) If each of the values of a set of data is multiplied by –5, the standard deviation would also
be multiplied by the same number and hence become negative.
(xxii) If each of the values of a set of data is increased by K, the coefficient of variation would
also increase by K.
(xxiii) When each value of a given set of data is multiplied by K, the revised coefficient of variation
would be K times the original coefficient value.
(xxiv) The combined standard deviation of two sets of data will always lie between the standard
deviation values of the two sets.
(xxv) The standard deviation of a distribution is approximately equal to 1.25 times the mean
deviation and 1.5 times the quartile deviation.
(xxvi) Standard deviation is exactly equal to one-sixth of the range.
(xxvii) In a normal distribution, the percentage of values included between X − 2σ and X + 2σ is
99.73.
(xxviii) Calculating the Z-scores is standardizing the data values.
(xxix) The sum total of all Z-scores for a given set of values ranges between ±3.
Ans. 1. T, 2. T, 3. T, 4. T, 5. T, 6. T, 7. T, 8. T, 9. F, 10. T, 11. T, 12. T, 13. F, 14. F, 15. F, 16. F, 17. T,
18. T, 19. F, 20. F, 21. F, 22. F, 23. F, 24. F, 25. T, 26. F, 27. F, 28. T, 29. F
Exercise - 2 : Questions and Answers
(i) What is range? What are its limitations as a measure of variation? Give examples where
range can be used satisfactorily for measuring variation.
(ii) What are quartiles? How are they used for measuring variation?
(iii) Define mean deviation. How does it differ from standard deviation?
(iv) Define mean deviation, standard deviation and inter-quartile range of a frequency
distribution. Why is standard deviation considered as the most appropriate measure of
variation? Give an example in which you would prefer an alternative measure of variation.
(v) State and explain the properties of standard deviation and variance. Do you agree that
standard deviation is always positive and never negative or zero?
(vi) What is coefficient of variation? How is it different from coefficient of standard deviation
and variance?
(vii) Explain the relationship between quartile deviation, mean deviation and standard deviation in the case of normal distribution. Also, discuss the empirical relationship between mean and standard deviation.
(viii) What is Lorenz curve? How is it obtained? Discuss its significance as a tool of studying
variation.
(ix) Determine (i) weekly and (ii) monthly range of gold prices (per 10 gm) from the following
data for a month:
Week    High      Low
1       28,122    27,880
2       29,208    28,890
3       28,890    28,706
4       29,225    28,930
(x) The heights of 11 men are measured as 65, 68, 70, 69, 58, 66, 71, 65, 67, 69 and 73 inches.
Calculate the range. If the shortest and the tallest of them are omitted, what is the percentage
change in range?
(xi) Draw a “less than ogive” from the following data and obtain the lower and upper quartiles therefrom. Also, calculate the values of quartile deviation and its coefficient.
Wages (in Rs.)    No. of workers
5,000 or more     Nil
4,500 or more     4
4,000 or more     18
3,500 or more     38
3,000 or more     60
2,500 or more     75
2,000 or more     85
1,500 or more     93
1,000 or more     100
(xii) The following table shows the percentage of different age groups to the total population of
a certain country:
Age group (years)    Percentage of the total population
0–14                 42.0
15–19                8.7
20–24                7.9
25–29                7.4
30–39                12.6
40–49                9.3
50–59                6.1
60 and above         6.0
You are required to find the age limits within which the middle 50 percent of the population
lies. Also, calculate (i) inter-quartile range, (ii) quartile deviation, and (iii) co-efficient of
quartile deviation. (Note that in census, the age is recorded as on last birthday).
(xiii) The distribution of marks of 1200 students who appeared in an entrance examination is given below:
Marks:     No. of students:
20–30      30
20–40      90
20–50      210
20–60      420
20–70      720
20–80      945
20–90      1080
20–100     1200
(xiv) In the following data, two class frequencies are missing:
Class interval :    100–110    110–120    120–130    130–140    140–150
Frequency :         4          7          15         ?          40
Class interval :    150–160    160–170    170–180    180–190    190–200
Frequency :         ?          16         10         6          3
However, it is possible to ascertain that the total frequency is 150 and that the median is
equal to 146.25. You are required to find the missing frequencies. Having obtained these,
calculate mean, standard deviation and the coefficient of variation.
(xv) Find two numbers whose mean is 12 and standard deviation is 4.
(xvi) Find the mean and standard deviation of the first 13 natural numbers.
(xvii) The mean and variance of the following continuous distribution are 61 and 15.9, respectively.
The distribution, after taking step-deviations, is as follows:
d′ :    –3    –2    –1    0     1     2     3
f :     10    15    25    25    10    10    5
You are required to determine actual class intervals.
(xviii) The standard deviation of a set of 10 numbers was calculated as equal to 5 while their
arithmetic mean was found to be 12. It was discovered later on that an item was recorded
as 5 instead of 15. Rectify the error and determine the correct value of standard deviation.
(xix) The mean of 5 observations is 4.4 and the variance is 8.24. If three of the observations are
1, 2 and 6, find the other two.
(xx) Mean, median and variance of a set of 5 numbers are known to be 12,11 and 9.2, respectively.
If two of the numbers are 8 and 16, determine the remaining numbers.
(xxi) The following is the record of the number of bricks laid each day for 10 days by two brick
layers A and B. Calculate the co-efficient of variation in each case and discuss the relative
consistency of the two brick layers.
A:    700    675    725    625    650    700    650    700    600    650
B:    550    600    575    550    650    600    550    525    625    600
If each of the values in respect of worker A is decreased by 10 and each of the values in
respect of worker B is increased by 50, how will it affect the results obtained earlier?
(xxii) A purchasing agent obtained samples of lamps from two suppliers. He had the samples
tested in his own laboratory for the length of life, with the following results:
Length of life      Samples from
(in hours)          Company A    Company B
700 – 900           10           3
900 – 1,100         16           42
1,100 – 1,300       26           12
1,300 – 1,500       8            3
(a) Which company’s lamps have greater average life?
(b) Which company’s lamps have more uniform life?
(xxiii) A sample of 35 values has mean 80 and standard deviation equal to 4. A second sample of
65 values has mean 70 and standard deviation equal to 3. Find the mean and standard
deviation of the combined set of 100 values.
(xxiv) Particulars regarding the incomes of employees of two factories are given below:
Factory    No. of Employees    Average Income (Rs.)    Variance
A          600                 475                     180
B          500                 586                     140
(a) In which factory is the variation in income greater?
(b) What is the wage bill of each of the factories?
(c) What is the average income of all employees put together?
(d) What is the combined standard deviation of incomes of the two sets of employees?
(xxv) The number of employees, wages per employee and the variance of wage for two factories
are given here :
                               Factory A    Factory B
No. of employees               50           100
Average wages per day (Rs.)    120          85
Variance of wages              9            16
(a) In which factory is there greater variation in the distribution of wages?
(b) Suppose in factory B, the wages of an employee were recorded as Rs. 120 instead of
Rs.100. What would be the corrected variance for factory B? Will it change the conclusion
drawn in (a)?
(xxvi) Fill in the blanks:
            Group 1    Group 2    Group 3    Combined
Number      70         30         ?          150
Mean        140        ?          146        145
Variance    ?          48         56         78
(xxvii) For a set of 100 values, the standard deviation is known to be 14.4 and the co-efficient of
variation is 40%. Calculate the arithmetic mean.
(xxviii) If 20 is subtracted from every observation in a data set, then the co-efficient of variation of the resulting data set is 20%. If 40 is added to every observation of the same data set, then the co-efficient of variation of the resulting set of data is 10%. Find the mean and standard deviation of the original set of data.
(xxix) A set of 40 numbers has mean and standard deviation equal to $\bar{X}$ and σ, respectively. If
each of the values of the set is multiplied by 16, the co-efficient of variation works out to
be 25% while if each value of the set is increased by 16, the co-efficient of variation
becomes 20%. Find the mean and standard deviation of the set of numbers.
(xxx) Two groups of workers, consisting of 30 and 50 persons, have the same mean wages but
different standard deviations. The respective standard deviations are Rs. 16 and Rs. 12.
Obtain the combined standard deviation of their wages.
Ans. 10. 15.6, 11. Q1 = 2500, Q3 = 3825, QD = 662.5 CQD = 0.209, 12. 28.214, 14.107, 0.612, 13.
13.575, 0.206, 14. 147.33, 19.198, 13.03%, 15. 8,16, 16. 7, 3.742, 17. 20-40, 40-60 etc. 18. 4.47,
19. 4, 9, 20. 15, 10, 21. 5.568%, 6.38%, 5.562%, 5.876%, 22. 1106.67, 1050, 16.65%, 11.86%, 23.
73.5, 5.85, 24. A, 285000, 293000, 525.45, 56.72, 25. 2.5%, 4.71%, 5.96, 2.88%, 26. 50,155, 38 27.
36, 28. 80, 12, 29. 64, 16, 30. 13.64
LESSON-4
SKEWNESS AND KURTOSIS
4. STRUCTURE
4.0 Objective
4.1 Tests of Skewness
4.2 Nature of Skewness
4.3 Characteristics of Skewness
4.4 Methods of Skewness
4.4.1 Bowley’s Method
4.4.2 Karl Pearson’s Method
4.4.3 Revisionary Problems
4.5 Measures of Kurtosis
4.6 Comparison among Variation, Skewness and Kurtosis
4.7 Summary
4.8 Self Assessment Questions
4.0 OBJECTIVE
After reading this lesson, you should be able to :
(a) Understand the meaning of skewness and kurtosis
(b) Distinguish between skewness, variation and kurtosis
(c) Compute the skewness and kurtosis
(d) Comment upon the nature of distribution with the help of measures of skewness and kurtosis.
4.1 TESTS OF SKEWNESS
Measures of Skewness and Kurtosis, like measures of central tendency and dispersion, study the
characteristics of a frequency distribution. Averages tell us about the central value of the distribution
and measures of dispersion tell us about the concentration of the items around a central value.
These measures do not reveal whether the dispersal of value on either side of an average is
symmetrical or not. If observations are arranged in a symmetrical manner around a measure of
central tendency, we get a symmetrical distribution, otherwise, it may be arranged in an asymmetrical
order which gives asymmetrical distribution. Thus, skewness is a measure that studies the degree
and direction of departure from symmetry.
When a symmetrical distribution is presented on graph paper, it gives a ‘symmetrical curve’, where the values of mean, median and mode are exactly equal. On the other hand, in an asymmetrical distribution, the values of mean, median and mode are not equal.
When two or more symmetrical distributions are compared, the differences between them are studied with ‘Kurtosis’. On the other hand, when two or more asymmetrical distributions are compared, they will show different degrees of Skewness. These measures are mutually exclusive i.e. the presence of skewness implies absence of kurtosis and vice-versa.
There are certain tests to know whether skewness does or does not exist in a frequency
distribution. They are :
1. In a skewed distribution, the values of mean, median and mode would not coincide. The values of mean and mode are pulled apart and the value of median lies between them. In a moderately skewed distribution, Median – Mode = 2/3 (Mean – Mode).
2. Quartiles will not be equidistant from median.
3. When the asymmetrical distribution is drawn on the graph paper, it will not give a bell
shaped-curve.
4. Sum of the positive deviations from the median is not equal to sum of negative deviations.
5. Frequencies are not equal at points of equal deviations from the mode.
4.2 NATURE OF SKEWNESS
Skewness can be positive or negative or zero.
1. When the values of mean, median and mode are equal, there is no skewness.
2. When mean > median > mode, skewness will be positive.
3. When mean < median < mode, skewness will be negative.
4.3 CHARACTERISTICS OF SKEWNESS
1. It should be a pure number in the sense that its value should be independent of the unit of
the series and also degree of variation in the series.
2. It should have zero-value, when the distribution is symmetrical.
3. It should have a meaningful scale of measurement so that we could easily interpret the
measured value.
4.4 METHODS OF SKEWNESS
Skewness can be studied graphically and mathematically. When we study skewness graphically, we
can find out whether skewness is positive or negative or zero. We cannot find out value of coefficient
of skewness. This can be shown with the help of a diagram :
Figure : Three frequency curves showing the positions of mean, median and mode: POSITIVE SKEWNESS ($\bar{X}$ > MEDIAN > MODE), NO SKEWNESS ($\bar{X}$ = MEDIAN = MODE) and NEGATIVE SKEWNESS ($\bar{X}$ < MEDIAN < MODE).
Mathematically skewness can be studied as :
(a) Absolute Skewness.
(b) Relative or coefficient of skewness.
When skewness is presented in absolute terms, i.e. in the units of the data, it is absolute skewness. If the value of skewness is obtained in ratios or percentages, it is called relative skewness or the coefficient of skewness.
When skewness is measured in absolute terms, we can compare one distribution with another only if the units of measurement are the same. When it is presented in ratios or percentages, comparison becomes easy.
Mathematical measures of skewness can be calculated on the basis of:
(a) Bowley’s Method
(b) Karl-Pearson’s Method
(c) Kelly’s method
4.4.1 Bowley’s Method
Bowley’s method of skewness is based on the values of median, lower and upper quartiles. This method suffers from the same limitations as the median and the quartiles.
Wherever positional measures are called for, skewness should be measured by Bowley’s method. This method is also used in the case of ‘open-end series’, where the importance of extreme values is ignored.
Absolute skewness = Q3 + Q1 – 2 Median
Coefficient of Skewness $= \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
Coefficient of skewness lies within the limit ±1. This method is quite convenient for determining skewness where one has already calculated quartiles.
For example, if the class intervals and frequencies are given as follows :
Class Intervals    Frequencies
Below 10           5
10 – 20            10
20 – 30            15
30 – 40            10
above 40           5
In this case, if we want to calculate the coefficient of skewness on the basis of this method, then we are required to calculate the values of Median, Q3 and Q1:
Calculation of Coefficient of Skewness on the basis of Median and Quartiles
Class Intervals    Frequency    Cumulative frequency
Below 10           5            5
10–20              10           15
20–30              15           30
30–40              10           40
40 and above       5            45
                   N = 45
Median $= l_1 + \frac{n/2 - c.f.}{f} \times i$
$n/2 = \frac{45}{2} = 22.5$. It lies in the cumulative frequency 30, corresponding to class (20–30).
∴ Median $= 20 + \frac{22.5 - 15}{15} \times 10 = 20 + \frac{7.5}{15} \times 10 = 25$
$Q_1 = l_1 + \frac{n/4 - c.f.}{f} \times i$
$n/4 = \frac{45}{4} = 11.25$, which lies in the cumulative frequency 15, corresponding to class interval (10–20).
$Q_1 = 10 + \frac{11.25 - 5}{10} \times 10 = 16.25$
$Q_3 = l_1 + \frac{3n/4 - c.f.}{f} \times i$
3n/4 = 33.75, which lies in the cumulative frequency 40, corresponding to group (30–40).
$Q_3 = 30 + \frac{33.75 - 30}{10} \times 10 = 33.75$
Absolute Skewness = Q3 + Q1 – 2 Median
where, Q3 = 33.75, Q1 = 16.25, Median = 25
∴ Absolute Skewness = 33.75 + 16.25 – 2(25) = 50 – 50 = 0
Coefficient of Skewness $= \frac{Q_3 + Q_1 - 2(\text{Median})}{Q_3 - Q_1}$
Now we have, Q3 = 33.75, Q1 = 16.25, Median = 25
∴ Coefficient of Skewness $= \frac{33.75 + 16.25 - 2(25)}{33.75 - 16.25} = \frac{0}{17.5} = 0$
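Bowley's absolute and relative measures can be expressed as a small Python sketch, applied here to the quartiles and median obtained above. The function names are assumptions made for this illustration.

```python
def bowley_absolute_skewness(q1, median, q3):
    """Absolute skewness: Q3 + Q1 - 2 * Median."""
    return q3 + q1 - 2 * median

def bowley_coefficient(q1, median, q3):
    """Bowley's coefficient of skewness; it always lies between -1 and +1."""
    return bowley_absolute_skewness(q1, median, q3) / (q3 - q1)

# Values from the worked example above
q1, median, q3 = 16.25, 25.0, 33.75
abs_sk = bowley_absolute_skewness(q1, median, q3)
coeff = bowley_coefficient(q1, median, q3)
print(abs_sk, coeff)
```

A result of zero indicates that the two quartiles are equidistant from the median, which is the second test of symmetry listed in Section 4.1.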
4.4.2 Karl-Pearson’s Method (Pearsonian Coefficient of Skewness)
Karl Pearson has suggested two formulae:
(i) where the relationship of mean and mode is established;
(ii) where the relationship between mean and median is established.
When the values of Mean and Mode are related:
Absolute skewness = Mean – Mode
Coefficient of skewness $= \frac{\text{Mean} - \text{Mode}}{\sigma}$
Coefficient of skewness generally lies within ±1
When the values of Mean and Median are related:
Absolute skewness = 3(Mean – Median)
Coefficient of skewness $= \frac{3(\text{Mean} - \text{Median})}{\sigma}$
Coefficient of skewness generally lies within ±3
Calculation of Coefficient of skewness by using the following formula:
Coefficient of skewness $= \frac{\text{Mean} - \text{Mode}}{\sigma}$
Given X values are = 12, 18, 18, 22, 35, and N = 5
∴ $\bar{X} = \frac{\Sigma X}{N} = \frac{105}{5} = 21$ and Mode = 18
X      x = (X – 21)    x²
12     – 9             81
18     – 3             9
18     – 3             9
22     + 1             1
35     + 14            196
N = 5                  Σx² = 296
$\sigma = \sqrt{\frac{\Sigma x^2}{N}}$ where $x = (X - \bar{X})$
∴ $\sigma = \sqrt{\frac{296}{5}} = \sqrt{59.2} = 7.7$
∴ Coefficient of skewness $= \frac{\text{Mean} - \text{Mode}}{\sigma}$
Substitute Mean = 21, Mode = 18, Standard deviation = 7.7.
∴ SK $= \frac{21 - 18}{7.7} = \frac{3}{7.7} = +0.4$
Calculation of Karl-Pearson’s coefficient of skewness by using the following formula:
Coefficient of skewness $= \frac{3(\text{Mean} - \text{Median})}{\sigma}$
For the given data X = 12, 18, 18, 22, 35
Mean = 21, Median = 18, σ = 7.7
∴ Coefficient of skewness $= \frac{3(21 - 18)}{7.7} = \frac{3 \times 3}{7.7} = \frac{9}{7.7} = 1.17$
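Both Pearsonian coefficients for the five values above can be reproduced with Python's standard statistics module. This is a sketch only; the function name pearson_skewness is an assumption, and pstdev is used because the lesson divides by N rather than N – 1.

```python
from statistics import mean, median, mode, pstdev

def pearson_skewness(values):
    """Return (mode-based, median-based) Karl Pearson coefficients of skewness."""
    m = mean(values)
    sd = pstdev(values)                        # population S.D.: sqrt(sum(x^2) / N)
    sk_mode = (m - mode(values)) / sd          # (Mean - Mode) / sigma
    sk_median = 3 * (m - median(values)) / sd  # 3 (Mean - Median) / sigma
    return sk_mode, sk_median

sk_mode, sk_median = pearson_skewness([12, 18, 18, 22, 35])
print(round(sk_mode, 2), round(sk_median, 2))
```

Here the median and the mode happen to be equal (both 18), so the median-based coefficient is exactly three times the mode-based one.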
4.4.3 Revisionary Problems
Example 1 : Calculate appropriate measure of skewness from the following income distribution:
Monthly income (Rs.)    Frequency
upto 100                9
101–150                 51
151–200                 120
201–300                 240
301–500                 136
501–750                 33
751–1000                9
above 1000              2
                        N = 600
Solution : In this problem, the open-ends series is given with inclusive class-intervals. Hence,
Bowley’s measure of skewness is better, because it is based on Quartiles and not affected by extreme
class intervals.
Calculation of coefficient of skewness based on quartiles and median
Monthly income (Rs.)    Frequency f    Cumulative Frequency c.f.
Upto 100                9              9
101–150                 51             60
151–200                 120            180
201–300                 240            420
301–500                 136            556
501–750                 33             589
751–1000                9              598
Above 1000              2              600
                        N = 600
Coefficient of skewness $= \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
Median $= L_1 + \frac{n/2 - cf}{f} \times i$
∴ $n/2 = \frac{600}{2} = 300$. It lies in the cumulative frequency 420, which corresponds to the group 201–300.
But the real limits of the class interval are 200.5–300.5
∴ Median $= 200.5 + \frac{300 - 180}{240} \times 100 = 200.5 + \frac{120}{240} \times 100 = 250.5$
$Q_1 = L_1 + \frac{n/4 - cf}{f} \times i$
$n/4 = \frac{600}{4} = 150$. It lies in the cumulative frequency 180, which corresponds to the class interval 151–200.
But the real limits of this class-interval are 150.5–200.5.
∴ $Q_1 = 150.5 + \frac{150 - 60}{120} \times 50 = 150.5 + \frac{90}{120} \times 50 = 150.5 + 37.5$ = Rs. 188
$Q_3 = L_1 + \frac{3n/4 - cf}{f} \times i$
where 3n/4 is used to find out the upper quartile group.
∴ $3n/4 = \frac{3 \times 600}{4} = 450$. It lies in the cumulative frequency 556, which corresponds to the group 301–500.
The real limits of this class interval are 300.5–500.5
∴ $Q_3 = 300.5 + \frac{450 - 420}{136} \times 200 = 300.5 + \frac{30}{136} \times 200 = 300.5 + 44.12$ = Rs. 344.62
Hence, Coefficient of skewness $= \frac{344.62 + 188 - 2(250.5)}{344.62 - 188} = \frac{532.62 - 501}{156.62} = \frac{31.62}{156.62} = +0.2$ approx.
Example 2 : Calculate the appropriate measure of skewness from the following cumulative frequency
distribution :
Age (under years) :    20    30    40    50    60    70
No. of persons :       12    29    48    75    94    106
Solution: In this problem, we are given the upper limits of classes along with the cumulative
frequency. Therefore, we have to find out the lower limits and frequencies for the given data.
Age (years)    Number of Persons    Cumulative
               Frequency (f)        Frequency (c.f.)
Below 20       12                   12
20–30          17                   29
30–40          19                   48
40–50          27                   75
50–60          19                   94
60–70          12                   106
               N = 106
Because the lower limit of the first group is not given, the appropriate measure of skewness is Bowley’s method. It is based on quartiles and median and is not influenced by extreme class-intervals.
Bowley’s coefficient of skewness $= \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
Thus, we have to calculate the values of Q3, Q1 and median.
Median $= L_1 + \frac{N/2 - c.f.}{f} \times i$
Median has $\frac{N}{2}$ items or $\frac{106}{2}$ or 53 items below it.
Therefore, it lies in the cumulative frequency 75, which corresponds to the class-interval (40–50). Hence, the median group is (40–50).
where L1 = 40, i = 10, f = 27, N/2 = 53, c.f. = 48
∴ Median $= 40 + \frac{53 - 48}{27} \times 10 = 40 + \frac{5}{27} \times 10 = 40 + 1.9 = 41.9$
$Q_1 = L_1 + \frac{N/4 - c.f.}{f} \times i$
Q1 has $\frac{N}{4}$ or $\frac{106}{4}$ or 26.5 items below it.
It lies in the cumulative frequency 29, which corresponds to the class-interval 20–30. Therefore, the Q1 group is 20–30.
where L1 = 20, N/4 = 26.5, i = 10, c.f. = 12, f = 17
∴ $Q_1 = 20 + \frac{26.5 - 12}{17} \times 10 = 20 + 8.53 = 28.53$
$Q_3 = L_1 + \frac{3N/4 - c.f.}{f} \times i$
Q3 has $\frac{3N}{4}$ or $\frac{3 \times 106}{4}$ or 79.5 items below it.
It lies in the cumulative frequency 94, which corresponds to the group 50–60. Therefore, the Q3 group is 50–60.
∴ $Q_3 = 50 + \frac{79.5 - 75}{19} \times 10 = 50 + \frac{4.5}{19} \times 10 = 50 + 2.37 = 52.37$
Coefficient of skewness $= \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
where Q3 = 52.37, Q1 = 28.53, median = 41.9
∴ Coefficient of Skewness $= \frac{52.37 + 28.53 - 2(41.9)}{52.37 - 28.53} = \frac{80.90 - 83.8}{23.84} = \frac{-2.90}{23.84} = -0.12$
Example 3 : Calculate the Karl-Pearson’s coefficient of skewness from the following data :
Marks (above)   :     0    10    20    30    40    50    60    70    80
No. of students :   150   140   100    80    80    70    30    14     0
Solution : Karl Pearson's formula is applied to find the coefficient of skewness.
$$SK = \frac{3(\text{Mean} - \text{Median})}{\sigma}$$
$$\text{Median} = L_1 + \frac{N/2 - \text{c.f.}}{f} \times i$$
Median has N/2 = 150/2 = 75 items below it. It lies in the cumulative frequency 80, which corresponds to the group (40–50). Therefore, the median group is 40–50,
where L1 = 40, N/2 = 75, f = 10, c.f. = 70, i = 10.
$$\text{Median} = 40 + \frac{75 - 70}{10} \times 10 = 45$$
Calculation of Mean and Standard Deviation
(Assumed mean A = 45, i = 10, dx = (X – 45)/10)

Marks    Frequency   Mid-point   Deviation    dx     dx2     fdx     fdx2
            f            X       (X – 45)
0–10        10           5         – 40       –4      16     – 40     160
10–20       40          15         – 30       –3       9    – 120     360
20–30       20          25         – 20       –2       4     – 40      80
30–40        0          35         – 10       –1       1        0       0
40–50       10          45            0        0       0        0       0
50–60       40          55         + 10       +1       1       40      40
60–70       16          65         + 20       +2       4       32      64
70–80       14          75         + 30       +3       9       42     126
N = 150                                            Σfdx = –86   Σfdx2 = 830
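$$\bar{X} = A + \frac{\Sigma f dx}{N} \times i, \quad \text{where } A = 45,\ N = 150,\ i = 10,\ \Sigma f dx = -86.$$
$$\therefore \bar{X} = 45 + \frac{-86}{150} \times 10 = 45 - 5.73 = 39.27$$
$$\sigma = \sqrt{\frac{\Sigma f dx^2}{N} - \left(\frac{\Sigma f dx}{N}\right)^2} \times i$$
where N = 150, i = 10, Σfdx2 = 830, Σfdx = –86.
$$\therefore \sigma = \sqrt{\frac{830}{150} - \left(\frac{-86}{150}\right)^2} \times 10 = \sqrt{5.5333 - (-0.5733)^2} \times 10$$
$$\text{or } \sigma = \sqrt{5.5333 - 0.3287} \times 10 = \sqrt{5.2046} \times 10 = 22.8$$
$$\text{Coefficient of skewness} = \frac{3(\text{Mean} - \text{Median})}{\sigma}$$
where σ = 22.8, Mean = 39.27, Median = 45.
$$\therefore \text{Coefficient of skewness} = \frac{3(39.27 - 45.00)}{22.8} = \frac{3(-5.73)}{22.8} = \frac{-17.19}{22.8} = -0.754$$
Hence, the coefficient of skewness is –0.754.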
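As a cross-check on the hand calculation, Karl Pearson's coefficient can be evaluated directly from the class mid-points and frequencies. The sketch below is illustrative only; the function name `pearson_skewness` is an assumption, and the median is passed in as already computed by interpolation.

```python
from math import sqrt


def pearson_skewness(mids, freqs, median):
    """Karl Pearson's coefficient 3 * (mean - median) / sigma for grouped data."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(mids, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(mids, freqs)) / n
    return 3 * (mean - median) / sqrt(variance)


# Marks data of the example: mid-points of 0-10, ..., 70-80 and their frequencies.
mids = [5, 15, 25, 35, 45, 55, 65, 75]
freqs = [10, 40, 20, 0, 10, 40, 16, 14]
print(round(pearson_skewness(mids, freqs, median=45), 3))
```

The code takes deviations from the actual mean rather than from an assumed mean; the step-deviation layout in the table is only a labour-saving device for hand work and leads to the same mean and standard deviation.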
Example 4 : (a) In a frequency distribution the coefficient of skewness based on quartiles is 0.6. If
the sum of the upper and lower quartile is 100 and median is 38, find the values of lower and upper
quartiles. Also find out the value of middle 50% items.
(b) In a certain distribution, the following results were obtained:
Coefficient of variation = 40%
Mean (X̄) = 25
Mode = 20
Find out the coefficient of skewness by applying the formula (Mean – Mode) / Standard Deviation.
Solution : (a) Since Bowley’s method is based on quartiles, we shall use the following formula :
$$\text{Coefficient of skewness} = \frac{Q_3 + Q_1 - 2(\text{Median})}{Q_3 - Q_1}$$
where coefficient of skewness = +0.6, median = 38 and (Q3 + Q1) = 100.
By substituting the values in the formula, we get
$$0.6 = \frac{100 - 2(38)}{Q_3 - Q_1}$$
By cross-multiplying, we get
0.6 (Q3 – Q1) = 100 – 76 = 24, or Q3 – Q1 = 24/0.6 = 40.
We can now solve the following simultaneous equations:
Q3 + Q1 = 100   ...(i)
Q3 – Q1 = 40    ...(ii)
Adding the equations, 2Q3 = 140, so Q3 = 70.
Since Q3 + Q1 = 100,
∴ Q1 = 100 – 70 = 30.
Hence the lower and upper quartiles are 30 and 70.
The range covered by the middle 50% of the items is given by (Q3 – Q1).
∴ The value of middle 50% items is (70 – 30) = 40.
(b) In this problem, the value of the standard deviation is missing. We can calculate σ by applying the following formula:
$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100$$
We are given C.V. = 40% and X̄ = 25.
$$\therefore 40 = \frac{\sigma}{25} \times 100 \quad \text{or} \quad \sigma = 10$$
$$\text{Coefficient of skewness} = \frac{\text{Mean} - \text{Mode}}{\sigma}$$
We are given Mean = 25, Mode = 20, σ = 10.
$$\text{Coefficient of skewness} = \frac{25 - 20}{10} = 0.5$$
Hence the coefficient of skewness is +0.5.
Example 5 : What is the relationship between Mean, Median and Mode in:
(a) Symmetrical curve.
(b) a negatively skewed curve.
(c) a positively skewed curve.
From the marks obtained by 120 students each in sections A and B of a class, the following measures are secured :

                        Section A       Section B
Mean                    47 marks        48 marks
Standard deviation      15 marks        15 marks
Mode                    52 marks        45 marks
Find out the coefficient of skewness and determine the degree of skewness and in which
distribution, the marks are more skewed.
Solution : The relationship between mean, median and mode, in different cases, can be established
as :
(a) In a symmetrical curve, there is no skewness. Therefore the value of mean = median =
mode.
(b) In a negatively skewed curve, the value of mean is less than median is less than mode. In
other words, mean < median < mode.
(c) In a positively skewed curve, the value of mean is greater than median is greater than
mode. In other words, mean > median > mode.
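In the given problem, to find the degree of skewness we have to compute the coefficient of skewness for each section. Since the mean, mode and standard deviation are given, the formula (Mean – Mode) / Standard Deviation is applied.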
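The worked figures for the two sections are not shown in the text at this point. A minimal sketch, assuming the mean-mode form of Karl Pearson's coefficient used in Example 4(b), would be the following; the function name `mode_skewness` is hypothetical.

```python
def mode_skewness(mean, mode, sd):
    """Karl Pearson's coefficient of skewness, (mean - mode) / standard deviation."""
    return (mean - mode) / sd


sk_a = mode_skewness(mean=47, mode=52, sd=15)  # Section A
sk_b = mode_skewness(mean=48, mode=45, sd=15)  # Section B
print(round(sk_a, 3), round(sk_b, 3))
# The sign gives the direction of skewness; the larger magnitude marks the more skewed section.
more_skewed = "A" if abs(sk_a) > abs(sk_b) else "B"
print(more_skewed)
```

Under this assumption Section A comes out negatively skewed and Section B positively skewed, and the degree of skewness is compared through the absolute values of the two coefficients.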
[Figure: mesokurtic curve (β2 = 3), platykurtic curve (β2 < 3) and leptokurtic curve (β2 > 3)]
4.5 MEASURES OF KURTOSIS
The measure of kurtosis is denoted by β2, and in a normal distribution β2 = 3. If β2 is greater than 3, the curve is more peaked than the normal curve and is named leptokurtic; if β2 is less than 3, the curve is flatter at the top than the normal curve and is named platykurtic. Thus kurtosis is measured by
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{\Sigma f x^4 / n}{\left(\Sigma f x^2 / n\right)^2}, \quad \text{where } x = (X - \bar{X})$$
R.A. Fisher introduced another notation, the Greek letter gamma. Symbolically,
$$\gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\mu_2^2} - 3$$
In the case of a normal distribution, γ2 is zero. If γ2 is more than zero (positive), the curve is leptokurtic, and if γ2 is less than zero (negative), the curve is platykurtic.
It may be noted that µ4 = Σfx4/n is an absolute measure of kurtosis, but β2 = µ4/µ2² is a relative measure of kurtosis. The larger the magnitude of γ2 in a frequency distribution, the greater is its departure from normality.
Skewness and kurtosis. β1 and β2 are measures of symmetry and normality respectively. If β1 = 0, the distribution is symmetrical and if β2 = 3, the distribution curve is mesokurtic.
4.6 COMPARISON AMONG VARIATION, SKEWNESS AND KURTOSIS
Dispersion, Skewness and Kurtosis are different characteristics of frequency distribution. Dispersion
studies the scatter of the items round a central value or among themselves. It does not show the
extent to which deviations cluster below an average or above it. This is studied by skewness. In
other words, this tells us about the cluster of the deviations above and below a measure of central
tendency. Kurtosis studies the concentration of the items at the central part of a series. If items
concentrate too much at the centre, the curve becomes ‘LEPTOKURTIC’ and if the concentration at
the centre is comparatively less, the curve becomes ‘PLATYKURTIC’.
Example 6 : From the following data given below, calculate the value of kurtosis and find out the
nature of distribution:
X :   0–10   10–20   20–30   30–40   40–50
f :     5      10      15      10       5
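Solution :
$$\text{Mean} = \frac{\Sigma f X}{N} = \frac{1125}{45} = 25$$
$$\beta_2 = \frac{\mu_4}{\mu_2^2}$$
where
$$\mu_4 = \frac{\Sigma f x^4}{N} = \frac{1800000}{45} = 40000$$
$$\mu_2 = \frac{\Sigma f x^2}{N} = \frac{6000}{45} = 133.33$$
$$\therefore \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{40000}{(133.33)^2} = \frac{40000}{17777.78} = 2.25$$
Since the value of β2 is less than 3, the distribution curve is platykurtic.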
(CALCULATION OF β2)

Classes   Frequency   Mid-point    fX    x = (X – 25)    x2      x3        x4      fx2      fx3       fx4
              f           X
0–10          5           5         25       – 20        400    –8000    160000    2000   –40000    800000
10–20        10          15        150       – 10        100    –1000     10000    1000   –10000    100000
20–30        15          25        375          0          0        0         0       0        0         0
30–40        10          35        350       + 10        100     1000     10000    1000    10000    100000
40–50         5          45        225       + 20        400     8000    160000    2000    40000    800000
N = 45               ΣfX = 1125                                               Σfx2 = 6000  Σfx3 = 0  Σfx4 = 1800000
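The table arithmetic can be checked with a short Python sketch that evaluates β2 and γ2 straight from the mid-points and frequencies. The function name `kurtosis_coefficients` is an assumption made for this illustration.

```python
def kurtosis_coefficients(mids, freqs):
    """Return (beta2, gamma2) for a grouped frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(mids, freqs)) / n
    mu2 = sum(f * (x - mean) ** 2 for x, f in zip(mids, freqs)) / n
    mu4 = sum(f * (x - mean) ** 4 for x, f in zip(mids, freqs)) / n
    beta2 = mu4 / mu2 ** 2
    return beta2, beta2 - 3


mids = [5, 15, 25, 35, 45]   # mid-points of 0-10, ..., 40-50
freqs = [5, 10, 15, 10, 5]
beta2, gamma2 = kurtosis_coefficients(mids, freqs)
if beta2 > 3:
    shape = "leptokurtic"
elif beta2 < 3:
    shape = "platykurtic"
else:
    shape = "mesokurtic"
print(beta2, gamma2, shape)
```

With floating-point results, an exact comparison with 3 is fragile; for real data a small tolerance around 3 is a more sensible test for the mesokurtic case.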
4.7 SUMMARY
● Both skewness and kurtosis are related to the shape of the frequency curve.
● Skewness means lack of symmetry, which implies that the mean, median and mode are unequal in such a case.
● Skewness is positive when the longer tail is to the right and negative when it is to the left.
● There are three measures of skewness: Karl Pearson's, which is based on averages and standard deviation; Bowley's, which uses median and quartiles; and Kelly's, based on median and the tenth and ninetieth percentiles.
● Kurtosis refers to the relative height of the frequency curve. Distributions can be mesokurtic, leptokurtic and platykurtic on this basis.
4.8 SELF ASSESSMENT QUESTIONS
Exercise 1 : True or False Statements
(i) Since mode is the point corresponding to maximum concentration of frequencies, its value
is always higher than the mean and median.
(ii) Skewness and kurtosis are both indicative of the nature of dispersion.
(iii) All distributions are either positively skewed or negatively skewed.
(iv) For a symmetrical distribution, the sum of positive and negative deviations from median is
always equal to zero.
(v) If the excess of mean over mode is negative, it implies that skewness is negative.
(vi) A longer tail to the right indicates positive skewness while a longer tail to the left indicates
negative skewness in the data.
(vii) For any skewed distribution, Mean – Mode = 3(Mean – Median).
(viii) Positive skewness is indicated when X > Me> Mo and negative skewness when X < Me
< Mo.
(ix) In a highly skewed distribution, the value of second quartile may be different from that of
the median
(x) In every distribution, the lower and upper quartiles are equidistant from median.
(xi) Bowley’s measure of skewness can vary between ±3.
(xii) Negatively skewed distributions are usually platykurtic.
(xiii) Bowley’s measure of skewness is more appropriate to use in an open-ended distribution.
(xiv) A distribution more peaked than normal distribution is called platykurtic distribution.
(xv) Kelly’s measure of kurtosis can vary between the limits of – 0.2631 to +0.2369.
(xvi) The five-point summary of a distribution includes mean, median, mode, lower quartile and
upper quartile.
(xvii) A distribution with lower quartile = 127.8, median = 135.2 and upper quartile = 148.8 has
negative skewness.
Ans. 1. F, 2. T, 3. F, 4. T, 5. T, 6. T, 7. F, 8. T, 9. F, 10. F, 11. F, 12. F, 13. T, 14. F, 15. T, 16. F, 17. F
Exercise II. Questions and Answers
(i) What is skewness? What are the tests of skewness? Distinguish between positive and
negative skewness. Give examples of cases where positively and negatively skewed
distributions may be obtained.
(ii) Draw rough sketches to show a symmetrical distribution, a negatively skewed distribution
and a positively skewed distribution. Also, show the relative location of mean, median and
mode in each case.
(iii) State the empirical relationship between mean, median and mode for unimodal frequency
curves that are moderately skewed.
(iv) Explain the measures of skewness given by Karl Pearson and Bowley.
(v) What is kurtosis? How is it measured in terms of Kelly’s formula and the beta co-efficient?
(vi) “Averages, measures of variation, skewness and kurtosis are complementary in
understanding a frequency distribution.” Explain.
(vii) For the distribution of daily wages of a factory employing 880 workers, the co-efficient of
quartile deviation is 3/5 and the co-efficient of skewness based on quartiles is 1/3. The
median wage is known to be Rs. 90. Calculate the lower and upper quartile wages.
(viii) In a symmetrical distribution, the mean, standard deviation and range of marks for a group
of 20 students are 40, 12 and 60. Find the standard deviation of marks if the students with
highest and lowest marks are excluded.
(ix) Given that median = 133.5 and mode = 134, obtain the missing frequencies for the
following distribution and then calculate Bowley’s co-efficient of skewness:
Class Interval    Frequency
100–110               8
110–120              32
120–130               ?
130–140               ?
140–150               ?
150–160              12
160–170               8
Total               460
(x) For a distribution, Bowley’s co-efficient of skewness is 0.6. If the sum of the upper and
the lower quartiles is 100 and median is 38, find the values of the upper and lower quartiles.
(xi) For a distribution, Bowley’s co-efficient of skewness is – 0.36, lower quartile is 8.6 and
median is 12.3. Calculate the co-efficient of quartile deviation for this distribution.
(xii) The following table gives the distribution of monthly wages of 500 workers in a factory :
Monthly Wages (in Rs.)    No. of Workers
1,500 – 2,000                  10
2,000 – 2,500                  25
2,500 – 3,000                 145
3,000 – 3,500                 220
3,500 – 4,000                  70
4,000 – 4,500                  30
Compute average monthly wage, mode, standard deviation and Karl Pearson’s co-efficient
of skewness.
(xiii) Given that median = 46 and mode = 37, find the missing frequencies of the following
distribution and also calculate Karl Pearson’s co-efficient of skewness:
Class Interval :  20–30   30–40   40–50   50–60   60–70   70–80   80–90   Total
Frequency      :    12      ?       ?       ?      12       9       7      100
(xiv) Consider the following data about two distributions:
                        Distribution A    Distribution B
Mean                         120               110
Median                       110               120
Standard deviation            10                10
Examine the following statements, stating with reasons whether each of them is true or false:
(a) Distribution A has the same degree of variation as distribution B has.
(b) Distribution A has the same degree of skewness as distribution B has.
(xv) Karl Pearson’s co-efficient of skewness of a distribution is 0.40. Its standard deviation is 8
and mean is 30. Find the median and mode of the distribution.
(xvi) For a moderately skewed distribution of the retail prices of children’s shoes, it is found that
the mean price is Rs. 180 and the median price is Rs. 164. If the co-efficient of variation is
20%, find the Karl Pearson’s co-efficient of skewness.
(xvii) For a distribution, mean = 65, median = 70 and co-efficient of skewness = –0.6. Find the
mode and co-efficient of variation.
(xviii) Using mean and median, calculate Karl Pearson’s co-efficient of skewness for the following
distribution:
Marks     :  10-100   10-80   10-60   10-50   10-40   10-30   10-20
Frequency :    100      88      73      57      35      17       5
(xix) Given, mean = 50, co-efficient of variation = 40% and J = –0.4. Find mode, median and
standard deviation.
(xx) The sum of 20 observations is 300 and the sum of their squares is 5000. Find the coefficient of variation and co-efficient of skewness, given further that median =15.
(xxi) Pearson’s co-efficient of skewness for a data distribution is 0.5 and co-efficient of variation
is 40%. Its mode is 80. Find the mean and median of the distribution.
Ans. 9. –0.115, 10. 70, 30, 11. 0.24, 12. 3.345, 3.167, 503.46, 0.354, 13. 26, 20, 14, 0.7, 14. (a)
False CVA < CVB (b) True 15. 38.93, 36.8, 16. 1.33, 17. 80, 38.46%, 18. 0.12, 19. 58, 52.67, 20, 20.
33.33%, 0, 21. 100, 93.33
LESSON : 5
MOMENTS
5.
STRUCTURE
5.0 Objective
5.1 Calculation of Central Moments
5.1.1 Direct Method
5.1.2 Short Cut Method
5.1.3 Step Deviation Method
5.2 Sheppard’s Correction for Grouping Errors
5.3 Coefficients Based on Moments
5.4 Summary
5.5 Self Assessment Questions
5.0 OBJECTIVE
After reading this lesson, you should be able to :
(a) Understand the meaning of moments
(b) Compute central moments by different methods
(c) Comprehend Sheppard’s corrections of moments for grouping errors
(d) Comment upon the nature of distribution with the help of α and β.
5.1 CALCULATION OF CENTRAL MOMENTS
The concept of moments has crept into the statistical literature from mechanics. In mechanics, this
concept refers to the turning or the rotating effect of a force whereas it is used to describe the
peculiarities of a frequency distribution in statistics. We can measure the central tendency of a set of
observations by using moments. Moments also help in measuring the scatteredness, asymmetry and
peakedness of a curve for a particular distribution.
Moments are the averages of the deviations from the mean, or from some other value, raised to a certain power. The arithmetic mean of the various powers of the deviations from the mean in any distribution is called a moment of the distribution about the mean (a central moment). Moments about the mean are generally used in statistics. We use the Greek letter µ (read as mu) for these moments. We shall study the first four moments about the mean in this lesson, i.e., µ1, µ2, µ3 and µ4.
We can compute central moments in the following ways :
1. Direct method
2. Short-cut method
3. Step-deviation method.
5.1.1 Direct Method
(i) Calculate the arithmetic mean (X̄).
(ii) Calculate the deviations x = (X – X̄) from the arithmetic mean and their sum (Σx).
(iii) Calculate the sums of x2, x3 and x4. In case of frequency distributions, multiply the individual values of x2, x3 and x4 by the corresponding frequencies and find the sums of fx2, fx3 and fx4.
(iv) Apply the following formulae.
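$$\mu_1 = \frac{\Sigma x}{N} = 0, \quad \mu_2 = \frac{\Sigma x^2}{N}, \quad \mu_3 = \frac{\Sigma x^3}{N}, \quad \mu_4 = \frac{\Sigma x^4}{N}$$
In case of frequency distribution apply:
$$\mu_1 = \frac{\Sigma f x}{N}, \quad \mu_2 = \frac{\Sigma f x^2}{N}, \quad \mu_3 = \frac{\Sigma f x^3}{N}, \quad \mu_4 = \frac{\Sigma f x^4}{N}$$
Let us take an example to understand the computation of the moments about mean.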
Example 1 : Calculate the first four moments about the mean from the following set of numbers 2,
3, 7, 8, 10
Solution :
Calculation of Moments

X          x = (X – X̄)     x2      x3      x4
2              –4           16     –64     256
3              –3            9     –27      81
7               1            1       1       1
8               2            4       8      16
10              4           16      64     256
ΣX = 30         0           46     –18     610

$$\bar{X} = \frac{\Sigma X}{N} = \frac{30}{5} = 6, \quad \text{where } N = 5$$
Moments of the data can be computed by using the values calculated above.
$$\mu_1 = \frac{\Sigma x}{N} = \frac{0}{5} = 0$$
$$\mu_2 = \frac{\Sigma x^2}{N} = \frac{46}{5} = 9.2$$
$$\mu_3 = \frac{\Sigma x^3}{N} = \frac{-18}{5} = -3.6$$
$$\mu_4 = \frac{\Sigma x^4}{N} = \frac{610}{5} = 122$$
Therefore, the first four moments about the mean are 0, 9.2, –3.6 and 122 respectively.
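The direct method for ungrouped data translates almost line for line into code. The following is a minimal sketch; the function name `central_moments` and its `max_order` parameter are assumptions made for illustration.

```python
def central_moments(values, max_order=4):
    """Moments about the mean, orders 1..max_order, by the direct method."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]  # x = X - mean
    return [sum(d ** r for d in deviations) / n for r in range(1, max_order + 1)]


mu1, mu2, mu3, mu4 = central_moments([2, 3, 7, 8, 10])
print(mu1, mu2, mu3, mu4)
```

The first central moment is zero for any data set, so in practice it serves only as a check on the deviations.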
Example 2 : From the marks distribution of 100 candidates, compute the first four moments about
mean.
Marks      Number of Candidates
0–10              10
10–20             15
20–30             25
30–40             25
40–50             10
50–60             10
60–70              5
Solution :
Calculation of Moments

Marks    Mid Value    f      fX     x = (X – X̄)    fx      fx2        fx3           fx4
             X
0–10         5        10      50       – 26       – 260    6,760   – 1,75,760     45,69,760
10–20       15        15     225       – 16       – 240    3,840     – 61,440      9,83,040
20–30       25        25     625        – 6       – 150      900      – 5,400        32,400
30–40       35        25     875          4         100      400        1,600         6,400
40–50       45        10     450         14         140    1,960       27,440      3,84,160
50–60       55        10     550         24         240    5,760     1,38,240     33,17,760
60–70       65         5     325         34         170    5,780     1,96,520     66,81,680
N = 100             ΣfX = 3100                      0     25,400     1,21,200   1,59,75,200

$$\therefore \bar{X} = \frac{\Sigma f X}{N} = \frac{3100}{100} = 31 \text{ marks}$$
Now, we can calculate the moments about mean as follows :
$$\mu_1 = \frac{\Sigma f x}{N} = \frac{0}{100} = 0$$
$$\mu_2 = \frac{\Sigma f x^2}{N} = \frac{25{,}400}{100} = 254$$
$$\mu_3 = \frac{\Sigma f x^3}{N} = \frac{1{,}21{,}200}{100} = 1{,}212$$
$$\mu_4 = \frac{\Sigma f x^4}{N} = \frac{1{,}59{,}75{,}200}{100} = 1{,}59{,}752$$
Therefore, the central moments are 0, 254, 1,212 and 1,59,752 respectively.
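For a frequency distribution the same direct method applies with each power of the deviation weighted by its frequency. A short sketch follows, with the hypothetical function name `grouped_central_moments`; class mid-points stand in for the individual values, as in the table.

```python
def grouped_central_moments(mids, freqs, max_order=4):
    """Central moments of a frequency distribution using class mid-points."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(mids, freqs)) / n
    return [
        sum(f * (x - mean) ** r for x, f in zip(mids, freqs)) / n
        for r in range(1, max_order + 1)
    ]


mids = [5, 15, 25, 35, 45, 55, 65]      # mid-points of 0-10, ..., 60-70
freqs = [10, 15, 25, 25, 10, 10, 5]     # number of candidates
moments = grouped_central_moments(mids, freqs)
print(moments)
```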
5.1.2 Short-cut Method
If the arithmetic mean is in fractions then, it is difficult to calculate deviations (x) from arithmetic
mean. Short-cut method is used in such cases.
(i) Take any value as an arbitrary mean (A).
(ii) Calculate deviations (d) from A and calculate the first four moments in the similar way as
done in direct method.
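These moments are called moments about an arbitrary origin and are represented by the Greek letter ν (read as nu), written here as v. The formulae for these moments are :
$$v_1 = \frac{\Sigma (X - A)}{N} = \frac{\Sigma d}{N}, \quad \text{where } d = X - A$$
$$v_2 = \frac{\Sigma (X - A)^2}{N} = \frac{\Sigma d^2}{N}$$
$$v_3 = \frac{\Sigma (X - A)^3}{N} = \frac{\Sigma d^3}{N}$$
$$v_4 = \frac{\Sigma (X - A)^4}{N} = \frac{\Sigma d^4}{N}$$
In case of frequency distribution,
$$v_1 = \frac{\Sigma f (X - A)}{N} = \frac{\Sigma f d}{N}$$
$$v_2 = \frac{\Sigma f (X - A)^2}{N} = \frac{\Sigma f d^2}{N}$$
$$v_3 = \frac{\Sigma f (X - A)^3}{N} = \frac{\Sigma f d^3}{N}$$
$$v_4 = \frac{\Sigma f (X - A)^4}{N} = \frac{\Sigma f d^4}{N}$$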
After calculating moments about an arbitrary origin convert them into Moments about mean
by using the following equations:
µ 1 = v1– v1 = 0
µ 2 = v2 – v12 = σ2
µ 3 = v3 – 3v2v1 + 2v13
µ 4 = v4 – 4v3.v1 + 6v2.v12 – 3v14
We can also calculate the moments about an arbitrary origin from the moments about the mean by these relationships:
v1 = µ1 + d
v2 = µ2 + d2
v3 = µ3 + 3µ2d + d3
v4 = µ4 + 4µ3d + 6µ2d2 + d4
where d is the difference between the mean and the origin about which the moments are to be calculated, i.e., d = X̄ – A.
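Both sets of conversion formulae can be written as two small functions. This is an illustrative sketch; the names `raw_to_central` and `central_to_raw` are assumptions, and `d` denotes the mean minus the arbitrary origin, which equals v1.

```python
def raw_to_central(v1, v2, v3, v4):
    """Moments about an arbitrary origin -> moments about the mean."""
    mu2 = v2 - v1 ** 2
    mu3 = v3 - 3 * v2 * v1 + 2 * v1 ** 3
    mu4 = v4 - 4 * v3 * v1 + 6 * v2 * v1 ** 2 - 3 * v1 ** 4
    return 0.0, mu2, mu3, mu4


def central_to_raw(mu2, mu3, mu4, d):
    """Moments about the mean -> moments about an origin lying d below the mean."""
    v1 = d
    v2 = mu2 + d ** 2
    v3 = mu3 + 3 * mu2 * d + d ** 3
    v4 = mu4 + 4 * mu3 * d + 6 * mu2 * d ** 2 + d ** 4
    return v1, v2, v3, v4


# Round trip with illustrative values v1 = 2, v2 = 16, v3 = 68, v4 = 416.8.
_, mu2, mu3, mu4 = raw_to_central(2, 16, 68, 416.8)
print(mu2, mu3, mu4)
print(central_to_raw(mu2, mu3, mu4, d=2))
```

Converting to central moments and back again should reproduce the starting values, which gives a quick check on hand calculations.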
Example 3 : We are given the following set of numbers 1, 3, 7, 9, 10. Calculate the first four
moments about the origin 4.
Solution :
Calculation of First Four Moments about A = 4

X        d = (X – A)     d2      d3       d4
1            –3           9     – 27       81
3            –1           1      – 1        1
7             3           9       27       81
9             5          25      125      625
10            6          36      216     1296
N = 5        10          80      340     2084

$$v_1 = \frac{\Sigma d}{N} = \frac{10}{5} = 2$$
$$v_2 = \frac{\Sigma d^2}{N} = \frac{80}{5} = 16$$
$$v_3 = \frac{\Sigma d^3}{N} = \frac{340}{5} = 68$$
$$v_4 = \frac{\Sigma d^4}{N} = \frac{2084}{5} = 416.8$$
Therefore the moments about the arbitrary origin 4 are 2, 16, 68 and 416.8 respectively.
Example 4 : Calculate first four moments about mean for the distribution of heights of the following
100 students.
Heights (Inches)      :   61   64   67   70   73
Number of Students    :    5   18   42   27    8
Solution :
Calculation of Central Moments (short-cut method), A = 67

Heights   No. of students   d = (X – 67)    fd      fd2       fd3       fd4
   X            f
  61            5               –6          –30     180     –1,080     6,480
  64           18               –3          –54     162      – 486     1,458
  67           42                0            0       0          0         0
  70           27               +3           81     243        729     2,187
  73            8               +6           48     288      1,728    10,368
            N = 100                          45     873        891    20,493
Now we can substitute the calculated values in the formulae:
$$v_1 = \frac{\Sigma f d}{N} = \frac{45}{100} = 0.45$$
$$v_2 = \frac{\Sigma f d^2}{N} = \frac{873}{100} = 8.73$$
$$v_3 = \frac{\Sigma f d^3}{N} = \frac{891}{100} = 8.91$$
$$v_4 = \frac{\Sigma f d^4}{N} = \frac{20493}{100} = 204.93$$
Moments about mean can be calculated as follows :
µ1 = v1 – v1 = 0.45 – 0.45 = 0
µ2 = v2 – v1² = σ² = 8.73 – (0.45)² = 8.73 – 0.2025 = 8.5275
µ3 = v3 – 3v2v1 + 2v1³ = 8.91 – 3(8.73 × 0.45) + 2(0.45)³ = 8.91 – 11.7855 + 0.18225 = –2.6932
µ4 = v4 – 4v3v1 + 6v2v1² – 3v1⁴
= 204.93 – 4 × 8.91 × 0.45 + 6 × 8.73 × (0.45)² – 3 × (0.45)⁴
= 204.93 – 16.038 + 10.60695 – 0.1230 = 199.3759
Hence the central moments are 0, 8.5275, –2.6932 and 199.3759.
If we are given the values of the central moments and are interested in finding the moments about an arbitrary origin (A = 67), we can calculate them as follows :
$$\bar{X} = A + \frac{\Sigma f d}{N} = 67 + \frac{45}{100} = 67.45$$
d = (X̄ – A) = 67.45 – 67 = 0.45
v1 = µ1 + d = 0 + 0.45 = 0.45
v2 = µ2 + d² = 8.5275 + (0.45)² = 8.5275 + 0.2025 = 8.73
v3 = µ3 + 3µ2d + d³
= –2.6932 + 3 × 8.5275 × (0.45) + (0.45)³ = –2.6932 + 11.512125 + 0.091125 = 8.91
v4 = µ4 + 4µ3d + 6µ2d² + d⁴
= 199.3759 + 4 × (–2.69325) × 0.45 + 6 × 8.5275 × (0.45)² + (0.45)⁴
= 199.3759 – 4.84785 + 10.36091 + 0.041006 = 204.93
∴ Moments about the arbitrary origin (67) are : 0.45, 8.73, 8.91 and 204.93.
5.1.3 Step-Deviation Method
It is the most appropriate method to calculate central moments in problems of continuous frequency
distributions with equal class-intervals. Step-deviation method is similar to short cut method. The
only difference is that in case of step-deviation method, we take a common factor from among the
deviations (d) which are taken from assumed mean (A).
(i) Calculate deviations (d) from arbitrary origin (A ).
(ii) Take a common factor from all the values of d and find out the sum of d′.
(iii) Find out the values of d′2, d′3 and d′4 and their aggregates.
(iv) Calculate the value of v1, v2, v3 and v4 by using the formulae.
(v) Convert the calculated Moments about an arbitrary origin into Moments about the mean
with the help of these relationship:
µ1 = v1 – v1 = 0
µ2 = v2 – v1² = σ²
µ3 = v3 – 3v2v1 + 2v1³
µ4 = v4 – 4v3v1 + 6v2v1² – 3v1⁴
We shall understand the computation of the first four moments about an arbitrary origin by
step deviation method.
Example 5 : Calculate first four moments about mean with the help of moments about an assumed
mean 35 from the following data :
Class      Frequency
0–10            4
10–20          10
20–30          21
30–40          32
40–50          21
50–60           7
60–70           5
Solution :
Calculation of Moments about arbitrary Mean
A = 35, C = 10, d′ = (X – A)/C

Class     Mid Point     f       d′      fd′     fd′2     fd′3     fd′4
              X
0–10          5          4      –3     – 12      36     – 108      324
10–20        15         10      –2     – 20      40      – 80      160
20–30        25         21      –1     – 21      21      – 21       21
30–40        35         32       0        0       0         0        0
40–50        45         21       1       21      21        21       21
50–60        55          7       2       14      28        56      112
60–70        65          5       3       15      45       135      405
                     N = 100            –3      191         3     1043
First of all, we shall calculate the first four moments about the arbitrary mean by substituting the values.
$$v_1 = \frac{\Sigma f d'}{N} \times C = \frac{-3}{100} \times 10 = -0.3$$
$$v_2 = \frac{\Sigma f d'^2}{N} \times C^2 = \frac{191}{100} \times 100 = 191$$
$$v_3 = \frac{\Sigma f d'^3}{N} \times C^3 = \frac{3}{100} \times 10^3 = 30$$
$$v_4 = \frac{\Sigma f d'^4}{N} \times C^4 = \frac{1043}{100} \times 10^4 = 1{,}04{,}300$$
From these we get the central moments as below :
µ1 = v1 – v1 = –0.3 – (–0.3) = 0
µ2 = v2 – v1² = σ² = 191 – (–0.3)² = 191 – 0.09 = 190.91
µ3 = v3 – 3v2v1 + 2v1³
= 30 – 3 × 191 × (–0.3) + 2(–0.3)³ = 30 + 171.9 – 0.054 = 201.846
µ4 = v4 – 4v3v1 + 6v2v1² – 3v1⁴
= 1,04,300 – 4 × 30 × (–0.3) + 6 × 191 × (–0.3)² – 3(–0.3)⁴
= 1,04,300 + 36 + 103.14 – 0.0243 = 1,04,439.12
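The step-deviation procedure can be sketched as a single function that first scales the deviations by the common factor C and then converts to central moments. The function name `step_deviation_moments` and its argument names are assumptions made for this illustration.

```python
def step_deviation_moments(mids, freqs, a, c):
    """Moments about origin a (via step deviations) and the central moments.

    Returns ((v1, v2, v3, v4), (mu1, mu2, mu3, mu4)).
    """
    n = sum(freqs)
    steps = [(x - a) / c for x in mids]  # d' = (X - A) / C
    # Multiply the r-th moment of d' by C**r to restore the original units.
    v1, v2, v3, v4 = (
        sum(f * s ** r for s, f in zip(steps, freqs)) / n * c ** r
        for r in (1, 2, 3, 4)
    )
    mu2 = v2 - v1 ** 2
    mu3 = v3 - 3 * v2 * v1 + 2 * v1 ** 3
    mu4 = v4 - 4 * v3 * v1 + 6 * v2 * v1 ** 2 - 3 * v1 ** 4
    return (v1, v2, v3, v4), (0.0, mu2, mu3, mu4)


mids = [5, 15, 25, 35, 45, 55, 65]
freqs = [4, 10, 21, 32, 21, 7, 5]
about_a, central = step_deviation_moments(mids, freqs, a=35, c=10)
print(about_a)
print(central)
```

The choice of A and C affects only the intermediate figures; the central moments come out the same for any origin and any common factor.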
5.2 SHEPPARD’S CORRECTION FOR GROUPING ERRORS
When data are grouped into frequency distributions, the individual values lose their identity. While
calculating moments, it is assumed that the frequencies are concentrated at the mid-points of the
classes for a continuous frequency distribution.
Let us understand this by assuming a class 10–20 whose frequency is 20. To compute moments or the arithmetic mean, we always take the mid-point of the class 10–20, which is (10 + 20)/2 = 15. But in reality it is quite possible that more than half of the values in the class 10–20 are greater than 15. Because of this assumption, grouping errors enter into the calculation of moments. To remove these errors, W.F. Sheppard introduced some corrections which are known as Sheppard’s corrections. These are :
The first and third moments need no correction.
$$\mu_2 \text{ (corrected)} = \mu_2 \text{ (uncorrected)} - \frac{1}{12} i^2$$
$$\mu_4 \text{ (corrected)} = \mu_4 \text{ (uncorrected)} - \frac{1}{2} \mu_2 \text{ (uncorrected)}\, i^2 + \frac{7}{240} i^4$$
where i is the class-interval.
Sheppard’s corrections are applied only under these conditions :
(i) When the frequency distribution is continuous.
(ii) When the frequency tapers off to zero in both directions.
(iii) When the total frequency is not less than 1000.
(iv) When the frequency distribution is not J-shaped, U-shaped or skewed.
(v) When the class-interval is uniform.
Let us understand with the help of an example.
Example 6 : Applying Sheppard’s corrections, find out the corrected values of the moments where
the class interval is 10 and µ1 = 0, µ 2 = 254, µ 3 = 1212 and µ 4 = 1,59,752.
Solution : We are given the values of all four moments and the class-interval.
µ1 and µ3 need no correction.
$$\mu_2 \text{ (corrected)} = \mu_2 - \frac{1}{12} i^2 = 254 - \frac{1}{12} \times 10^2 = 254 - \frac{100}{12} = 245.667$$
$$\mu_4 \text{ (corrected)} = \mu_4 - \frac{1}{2} \mu_2 i^2 + \frac{7}{240} i^4 = 1{,}59{,}752 - \frac{1}{2} \times 254 \times 10^2 + \frac{7}{240} \times 10^4$$
= 1,59,752 – 12,700 + 291.667 = 1,47,343.667
Therefore, the corrected values of the four moments are 0, 245.667, 1,212 and 1,47,343.667 respectively.
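Sheppard's corrections reduce to two lines of arithmetic, sketched below. The function name `sheppard_correct` is an assumption for illustration; it takes the uncorrected second and fourth central moments and the class width and returns the corrected pair. Whether the conditions listed above actually hold for a data set must still be judged separately.

```python
def sheppard_correct(mu2, mu4, width):
    """Apply Sheppard's corrections to the second and fourth central moments."""
    mu2_corrected = mu2 - width ** 2 / 12
    mu4_corrected = mu4 - 0.5 * width ** 2 * mu2 + 7 * width ** 4 / 240
    return mu2_corrected, mu4_corrected


mu2_c, mu4_c = sheppard_correct(mu2=254, mu4=159752, width=10)
print(round(mu2_c, 3), round(mu4_c, 3))
```

Note that the fourth-moment correction uses the uncorrected µ2, as in the formula, not the value already reduced by i²/12.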
5.3 COEFFICIENTS BASED ON MOMENTS
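There are three sets of coefficients which are used in practice. They are the α (Alpha), β (Beta) and γ (Gamma) coefficients. These coefficients are calculated from the relationships among the various moments. The formulae are :
Alpha-Coefficients
$$\alpha_1 = \frac{\mu_1}{\sigma} = 0, \quad \alpha_2 = \frac{\mu_2}{\sigma^2} = 1, \quad \alpha_3 = \frac{\mu_3}{\sigma^3} = \frac{\mu_3}{\mu_2^{3/2}}, \quad \alpha_4 = \frac{\mu_4}{\sigma^4} = \frac{\mu_4}{\mu_2^2}$$
Beta-Coefficients
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \alpha_3^2 \quad \text{and} \quad \beta_2 = \frac{\mu_4}{\mu_2^2} = \alpha_4$$
Gamma-Coefficients
$$\gamma_1 = \sqrt{\beta_1} = \alpha_3 \text{ (taking the sign of } \mu_3\text{)} \quad \text{and} \quad \gamma_2 = \beta_2 - 3 = \frac{\mu_4 - 3\mu_2^2}{\mu_2^2}$$
Beta-Coefficients (β1 and β2) are used to find the skewness and kurtosis of a distribution. Let us take an illustration to understand these coefficients.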
Example 7 : The values of µ1 , µ2 , µ3 , and µ 4 , are 0, 9.2, 3.6 and 122 respectively. Find out the
skewness and kurtosis of the distribution.
Solution : To comment upon the skewness and kurtosis of the distribution, we shall calculate the values of β1 and β2.
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(3.6)^2}{(9.2)^3} = \frac{12.96}{778.688} = 0.0166$$
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{122}{(9.2)^2} = \frac{122}{84.64} = 1.44$$
Since µ3 is positive and β2 is less than 3, the distribution is positively skewed and the curve is platykurtic, or flat at the top.
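The beta and gamma coefficients follow mechanically from µ2, µ3 and µ4. A minimal sketch is given below; the function name `shape_coefficients` is an assumption, and γ1 is computed as µ3 divided by µ2 to the power 3/2 so that it carries the sign of µ3.

```python
def shape_coefficients(mu2, mu3, mu4):
    """Return (beta1, beta2, gamma1, gamma2) from the central moments."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    gamma1 = mu3 / mu2 ** 1.5   # signed square root of beta1
    gamma2 = beta2 - 3
    return beta1, beta2, gamma1, gamma2


beta1, beta2, gamma1, gamma2 = shape_coefficients(mu2=9.2, mu3=3.6, mu4=122)
print(round(beta1, 4), round(beta2, 2), round(gamma1, 3), round(gamma2, 2))
```

Because β1 is a square it cannot be negative, so the direction of skewness has to be read from the sign of µ3 or of γ1.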
Example 8 : Calculate the first four Moments about an arbitrary origin. Convert them into Moments
about mean. Applying Sheppard’s corrections, calculate corrected Moments and beta coefficients
from the following data:
Experience (years)    No. of Employees
0–1                        15
1–2                        22
2–3                        45
3–4                        35
4–5                        30
5–6                        20
6–7                        16
7–8                        10
8–9                         5
9–10                        2
Solution :
Calculation of Moments (Let A = 4.5)

Experience   Mid Point   No. of Employees   d = (X – A)     fd      fd2      fd3      fd4
 (Years)         X              f
0–1             0.5            15               –4         – 60     240    – 960    3,840
1–2             1.5            22               –3         – 66     198    – 594    1,782
2–3             2.5            45               –2         – 90     180    – 360      720
3–4             3.5            35               –1         – 35      35     – 35       35
4–5             4.5            30                0            0       0        0        0
5–6             5.5            20                1           20      20       20       20
6–7             6.5            16                2           32      64      128      256
7–8             7.5            10                3           30      90      270      810
8–9             8.5             5                4           20      80      320    1,280
9–10            9.5             2                5           10      50      250    1,250
                           N = 200                        – 139     957    – 961    9,993
We can find out
$$v_1 = \frac{\Sigma f d}{N} = \frac{-139}{200} = -0.695$$
$$v_2 = \frac{\Sigma f d^2}{N} = \frac{957}{200} = 4.785$$
$$v_3 = \frac{\Sigma f d^3}{N} = \frac{-961}{200} = -4.805$$
$$v_4 = \frac{\Sigma f d^4}{N} = \frac{9993}{200} = 49.965$$
The computed moments are moments about the arbitrary point 4.5. The central moments are calculated below:
µ1 = v1 – v1 = –0.695 – (–0.695) = 0
µ2 = v2 – v1² = 4.785 – (–0.695)² = 4.785 – 0.483 = 4.302
µ3 = v3 – 3v2v1 + 2v1³
= –4.805 – 3(4.785) × (–0.695) + 2(–0.695)³ = –4.805 + 9.9767 – 0.6714 = 4.500
µ4 = v4 – 4v3v1 + 6v2v1² – 3v1⁴
= 49.965 – 4(–4.805) × (–0.695) + 6 × 4.785 × (–0.695)² – 3(–0.695)⁴
= 49.965 – 13.358 + 13.868 – 0.700 = 49.775
Applying Sheppard’s corrections, we have
$$\mu_2 \text{ (corrected)} = \mu_2 - \frac{1}{12} i^2 = 4.302 - \frac{1}{12} \times (1)^2 = 4.302 - 0.083 = 4.219$$
$$\mu_4 \text{ (corrected)} = \mu_4 - \frac{1}{2} \mu_2 i^2 + \frac{7}{240} i^4 = 49.775 - \frac{1}{2} \times 4.302 \times (1)^2 + \frac{7}{240} \times (1)^4$$
= 49.775 – 2.151 + 0.029 = 47.653
Beta Constants
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(4.500)^2}{(4.219)^3} = \frac{20.25}{75.10} = 0.27$$
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{47.653}{(4.219)^2} = \frac{47.653}{17.800} = 2.68$$
Therefore the central moments after correction are 0, 4.219, 4.500 and 47.653; β1 = 0.27 and β2 = 2.68.
5.4 SUMMARY
● Moments provide a useful method to study various characteristics of a set of data. Moments calculated about the mean are called central moments. There can also be moments about any given value A. When A = 0, the moments calculated are called moments about origin.
● It is possible to convert moments about A into central moments and vice-versa.
● The first moment about zero is equal to the mean and the first moment about the mean is equal to zero.
● The second central moment is the variance of the distribution.
● The beta and gamma statistics are calculated from central moments and are used to learn about skewness and kurtosis. β1 and γ1 are measures of skewness while β2 and γ2 measure kurtosis.
● If the third central moment is positive, the skewness is positive, while if it is negative, the skewness is negative. For a symmetrical distribution, it is equal to zero. β1 is never negative, and the greater its value, the greater the skewness.
● If β2 = 3, the distribution is mesokurtic; if β2 is less than 3, the distribution is platykurtic, while if it is greater than 3, the distribution in question is leptokurtic.
5.5 SELF ASSESSMENT QUESTIONS
Exercise 1 : True and False Statements
(i) The first moment about mean is always equal to zero.
(ii) The moments of even order can never assume negative values.
(iii) The fourth moment about mean always exceeds the third moment which, in turn, always
exceeds the second moment.
(iv) The second moment about origin is equal to the variance of the distribution.
(v) A negative value of the third moment about an arbitrary point does not necessarily mean
that the distribution has negative skewness.
(vi) Sheppard’s corrections aim at removing the effect of grouping error.
(vii) The corrected values of moments, obtained after applying Sheppard’s corrections, are always
higher than the corresponding uncorrected values of the moments.
(viii) All symmetrical distributions are not mesokurtic, but all mesokurtic distributions are
symmetrical.
(ix) A look at the value of the fourth central moment reveals whether the distribution is
mesokurtic or not.
(x) When β1 > 0, the skewness is positive and when β1 < 0, the skewness is negative.
Ans. 1. T, 2. T, 3. F, 4. F, 5. T, 6. F, 7. F, 8. F, 9. F, 10. F
Exercise II : Questions and Answers
(i) Define moments. Establish the relationship between moments about arbitrary origin and
central moments and vice versa.
(ii) Explain how moments help in determining the shape of a frequency distribution. In this
context, explain the calculation and interpretation of beta and gamma statistics.
(iii) Explain clearly the difference between skewness and kurtosis. Is it correct to say that a
distribution which has β2 equal to 3 must be symmetrical?
(iv) “A distribution, for which β1 equal to 0, is necessarily a normal distribution.” Do you
agree? Explain.
(v) What are Sheppard’s corrections for moments? State the conditions under which they are
applied.
(vi) The following is the distribution of heights of the students of a class:
Height (cms)      No. of Students
120–125           3
125–130           7
130–135           25
135–140           30
140–145           25
145–150           7
150–155           3
(a) Calculate Karl Pearson’s co-efficient of skewness using (i) mean and mode, and
(ii) mean and median.
(b) Calculate Bowley’s co-efficient of skewness.
(vii) Calculate (a) central moments, (b) moments about A = 6, and (c) moments about origin
for the following data:
9, 4, 6, 9, 11, 13, 8, 4
(viii) Calculate four central moments for the following distribution:
Class Interval    Frequency
10–20             8
20–30             12
30–40             15
40–50             8
50–60             5
60–70             2
Total             50
(ix) Given, n = 10, Σ(X – 10) = – 40, Σ(X – 10)² = 2,870 and Σ(X – 10)³ = – 480. Calculate the
following:
(a) Arithmetic mean
(b) Variance
(c) Third moment about 10
(d) A measure of relative skewness based on moments
(x) For the following distribution, calculate first four (i) moments about 45 and (ii) central
moments. Also, comment on the shape of the distribution using gamma statistics.
Class Interval    Frequency
10–20             2
20–30             5
30–40             12
40–50             20
50–60             8
60–70             2
70–80             1
(xi) Given the following distribution:
Class Interval    Frequency
10–20             2
20–30             6
30–40             12
40–50             20
50–60             12
60–70             6
70–80             2
(a) Calculate mean, median and mode and also find the value of Karl Pearson’s coefficient of skewness.
(b) Calculate the first four central moments and the beta co-efficient of skewness.
(c) Comment on the results.
(xii) You are given the following frequency distribution:
Class Interval    Frequency
80–100            12
100–120           15
120–140           20
140–160           38
160–180           60
180–200           33
200–220           14
220–240           8
(a) Calculate moments about 170.
(b) Convert these to central moments and obtain the values of beta and gamma statistics.
Also, comment on the shape of the distribution.
(xiii) For the following distribution, calculate first four moments about A = 150 and obtain
central moments from these. Also, calculate beta co-efficients and comment on the
skewness and kurtosis.
Class Interval    Frequency
80–100            3
100–120           5
120–140           20
140–160           16
160–180           10
180–200           4
200–220           2
(xiv) The first four moments of a distribution are 2, 20, 40, and 500, respectively. Comment on
the shape of the distribution.
(xv) The first four moments of a distribution are calculated as 6; 235; 1,248; and 24,680. You
are required to examine the skewness and kurtosis of the distribution.
(xvi) From the following information about two distributions, find which is more skewed?
Distribution      Second Moment     Third Moment
A                 16                –15.7
B                 40                25.8
(xvii) For a mesokurtic distribution, the fourth central moment is 768. Obtain its standard
deviation.
(xviii) The first four moments of a distribution about the value 4 of the variable are –1.5, 17, –30
and 108. Its mean is given to be 2.5. You are required to
(a) Calculate the central moments and moments about origin.
(b) Determine the co-efficient of variation and variance of the distribution.
(c) Examine the shape of the distribution.
(xix) For a mesokurtic distribution, β1 = 0.004 and µ3 = 16. Calculate the value of its fourth
central moment.
(xx) For a mesokurtic distribution, co-efficient of variation = 40% and arithmetic mean = 40.
Find the value of its fourth central moment.
(xxi) If variance = 42, then what values of µ4 would make a distribution (i) mesokurtic, (ii)
platykurtic, and (iii) leptokurtic?
(xxii) The first two moments about 40 for a set of 25 values were calculated as equal to 65 and
2,985, respectively. Test if the calculations are consistent.
(xxiii) You are given here the results of calculations in respect of a negatively skewed distribution:
N = 100, mean = 14, variance = 230, β1 = 0.8 and β2 = 2.4.
It was discovered later on that an item 12 was wrongly recorded as 2. Find the corrected
values of mean, variance and the two beta constants.
(xxiv) For a mesokurtic distribution, it is known that the first moment about 32 is 28 while the
fourth moment about 60 is 62,208. Determine the values of mean and co-efficient of
variation.
(xxv) Apply Sheppard’s corrections to the following moments :
First moment = – 7
Second moment = 193
Third moment = 873
Fourth moment = 87,504
Width of class interval = 10
Ans. 6. 0, 0, 7. 0, 9, 2.25, 154.5, 2, 13, 64.25, 404.5, 8, 73, 730.25, 7778.5 8. 0, 179.4, 951, 80,
354, 9. 6, 271, –48, 0.733 10. –2.6, 150, –1, 100, 75,000, 0, 143.24, 11. 45, 45, 45, 0, 0, 180, 0,
90000, 0; 12. –8.6, 1212, –40, 400, 44, 40000, 0, 1, 138.04, –10, 402.5, 35, 71,667, 0.0734, –0.271,
2.758, –0.242; 13. –5, 740, – 6000, 1544000, 0, 715, 4850, 15.33, 125, 0.0644, 3; 14. 1, 2.39; 15.
Low positive skewness, platykurtic; 16. A, 17. 4; 18. 0, 14.75, 39.75, 142.3125, 2.5, 21, 166, 1132,
183.3%, 21, 0.702, –2.35; 19. 4800; 20. 1,96,608; 21. 5.292, < 5.292, > 5.292; 22. No, 0; 23. 14.1,
228.59, 0.771, 2.042; 24. 60, 20%; 25. 1.357, 154.578
UNIT-2 PROBABILITY, PROBABILITY DISTRIBUTIONS
AND DECISION THEORY
LESSON-1 : THEORY OF PROBABILITY
1. STRUCTURE
1.0 Objective
1.1 Probability Foundations
1.2 Events
1.2.1 Mutually Exclusive and Overlapping Events
1.2.2 Complementary events
1.2.3 Independent and Dependent Events
1.3 Methods of Assigning Probability
1.3.1 Classical Approach
1.3.2 Relative Frequency Approach
1.3.3 Personalistic Approach
1.4 Computation of Probability
1.5 Laws of Probability
1.5.1 Addition Rule
1.5.2 Conditional Probability
1.5.3 Multiplication Rule
1.6 Bayes’ Theorem
1.7 Expected Value
1.8 Summary
1.9 Self Assessment Questions
1.0 OBJECTIVE
After reading this chapter, you should be able to :
(a) Define probability
(b) Understand the concept of experiment, sample space, events and their relationships
(c) Describe the classical, relative frequency and personalistic approaches of probability
(d) Compute probabilities with the help of addition, multiplication, conditional and Bayes’ theorem
1.1 PROBABILITY FOUNDATIONS
Most of the decision-making situations in business management involve uncertainty. Since
uncertainty is present and is an important aspect in determining the consequences of various
alternative courses of action, it is imperative to get proper appreciation of it, draw a mathematical
picture of it and attempt to measure it in numerical terms. There are many advantages in having a
numerical measure for uncertainty. Besides facilitating understanding and allowing analysis, it helps
in communication between executives. Verbally, a manager at a meeting might indicate that he is
“fairly sure” about the success of a particular project. This phrase might mean something quite
different to the other executives at the meeting. ‘Fairly sure’ might mean that success will occur 9
out of 10 times to one decision maker (implying that he is 90 percent sure) while the same phrase
might indicate 7 times out of 10 to another. Numbers remove such confusion. Besides this, an
important advantage of a numerical measure is the ability to use mathematics for analysis. Uncertainty
is expressed in numerical terms by the theory of probability, as probability is at once the language
and the measure of uncertainty. In this lesson, we shall also see how probability
provides a foundation for the whole of the analytical statistics that we are going to learn in the
course of these lessons.
The theory of probability takes on practical value when it is defined in relation to an
experiment. Such an experiment might be tossing a coin, drawing a card from a standard deck
of playing cards, tossing a six-faced die, observing the number of defectives in a lot of
electric bulbs, tossing a pair of dice, drawing a ball from an urn containing balls, and so on.
Once the experiment has been defined, all possible outcomes from the experiment are
identified. This exhaustive set of outcomes constitutes the sample space, S. The sample space
is a key concept and an important base of probability theory.
One of the simplest sample spaces is the set of outcomes when a pair of coins is tossed.
It consists of four outcomes which can be conveniently represented as :
S = {HH, HT, TH, TT}
where H denotes a head and T denotes a tail.
We can consider the case of a manufacturer who produces electric bulbs in large batches. From
each batch, a sample of 80 items is selected at random, and the number of defective items is
recorded. Although the number of defectives in any sample
cannot be predicted with certainty, all of the possible outcomes are known. The number of
defective items in a sample can be any integer from 0 to 80. Here the sample space is :
S = {0, 1, 2, 3, ..., 80}
In the same manner, when a pair of dice is tossed, a total of 36 outcomes are possible. This can
be represented as shown in Figure 1.
Figure 1 : Sample space for pair of dice experiment (First Die, 1 to 6, on the horizontal axis; Second Die, 1 to 6, on the vertical axis)
In all the three examples, the number of outcomes from the experiment is finite.
This is so in most cases, but it is not a rule: the number of outcomes can be infinite as well. For
example, if we consider the experiment of observing the lifetime of an electric bulb in hours, the
outcome can be any real, non-negative number. Thus, this sample space contains an infinite number
of sample points.
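The finite sample spaces described in this section can be listed mechanically. The following is a small Python sketch, not part of the original text; the variable names are illustrative.

```python
from itertools import product

# Two coins: each coin shows H or T.
two_coins = ["".join(outcome) for outcome in product("HT", repeat=2)]

# Defectives in a sample of 80 items: any integer from 0 to 80.
defectives = list(range(0, 81))

# Pair of dice: ordered pairs (first die, second die).
two_dice = list(product(range(1, 7), repeat=2))

print(two_coins)        # ['HH', 'HT', 'TH', 'TT']
print(len(defectives))  # 81
print(len(two_dice))    # 36
```

The lifetime of a bulb has no such listing, since its sample space is the set of all non-negative real numbers.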
1.2 EVENTS
An event refers to any set of possible outcomes in a sample space. If the sample space for an experiment
has the elements S1, S2, S3, ..., Sn, an event in the sample space S would be any one, or any collection, of
S1, S2, S3, ..., Sn. In a sample space, every combination of sample points may be defined as an event.
In an experiment of counting defective items in a sample, the set of all possible outcomes having
less than 10 defective items can be represented by the event A.
Each sample point does not have to identify a separate event. The faces of a die provide a
sample space of six outcomes. If the occurrence of each face identifies a different event, there are
six possible events. On the other hand, suppose that an even number represents a gain of Rs. 100 to
a person X and an odd number represents a loss of Rs. 100 to the person Y. In this case, there are six
outcomes but only two events – gain of Rs. 100 to X, and loss of Rs. 100 to Y.
Every event A is a subset of the sample space and every event is a collection of the elements of
a sample space. Events can be classified as being elementary or compound. Elementary events are
said to be those which have a single sample point whereas compound events are those which contain
more than one sample point. Thus, whereas the compound events can be decomposed, the elementary
events cannot be. The appearance of 3 on a die is an elementary event while the appearance of an
even number on the die is a compound event (as it contains three sample points).
Now, keeping in mind the definitions of experiment, sample space and events, we introduce
some more concepts.
1.2.1 Mutually Exclusive and Overlapping Events
Two events are mutually exclusive if the occurrence of one event precludes the occurrence of the
other. For example, the events that (i) an employee would be late, and (ii) the employee would be
absent, on a particular day, are mutually exclusive since both cannot occur simultaneously. An
employee cannot be both late and absent on a particular day. Similarly, suppose we consider a box
in which 20 cards, marked 1 to 20 are placed and a card is drawn at random. If A be the event that
the number on the card is divisible by 3 and B be the event that the number would be divisible by 7,
then the events A and B are mutually exclusive. This is because for A to occur, the number would be
one of 3, 6, 9, 12, 15 or 18, and B to occur, it should be one of 7 or 14. Since no number is common
to them, they are mutually exclusive.
On the other hand, two or more events which are not mutually exclusive are called overlapping
events. In the above example of cards, suppose A represents the event that the number on the card
chosen is divisible by 3 and B represents the event that the number is divisible by 5, then for A to
occur the number must be either 3,6,9,12,15 or 18, and for B to occur, it must be one of 5,10,15 and
20. Note that if the number 15 is obtained, it implies that both A and B have taken place. Thus A and
B are not mutually exclusive.
We can use a Venn diagram to depict mutually exclusive and overlapping events. This is shown
in figure 2. Part (a) of the figure shows the mutually exclusive events A and B, each of them defined
over the sample space S. Note that A and B have no sample points in common. On the other hand
part (b) of the figure shows overlapping events A and B, as they have some common sample points.
Figure 2 : Venn diagrams of (a) Mutually Exclusive Events and (b) Overlapping Events, each defined over the sample space S
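The card example of this section can be expressed with sets, where two events are mutually exclusive exactly when they have no sample point in common. This is an illustrative Python sketch; the helper name `mutually_exclusive` is an assumption and does not appear in the lesson.

```python
cards = set(range(1, 21))                 # cards marked 1 to 20

div_by_3 = {n for n in cards if n % 3 == 0}
div_by_5 = {n for n in cards if n % 5 == 0}
div_by_7 = {n for n in cards if n % 7 == 0}

def mutually_exclusive(a, b):
    """Two events are mutually exclusive when they share no sample point."""
    return a.isdisjoint(b)

print(mutually_exclusive(div_by_3, div_by_7))  # True
print(mutually_exclusive(div_by_3, div_by_5))  # False
print(div_by_3 & div_by_5)                     # {15}
```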
1.2.2 Complementary Events
Events are said to be complementary when the sample space
is partitioned into the segment that represents the occurrence
of an event A, and the segment that is not a part of A. Thus,
the complement of an event A is the collection of all possible
outcomes that are not contained in event A. For example,
in the toss of a coin, appearance of head and tail are
complementary to each other. Complementary events are
shown in figure 3. Here A and Ā are complementary to
each other. The events of a person being able to hit a target,
and not being able to hit the target are complementary, and
so are the events of the appearance of a head and a tail on
tossing a coin.
Figure 3 : Complementary Events
1.2.3 Independent and Dependent Events
Two events are said to be independent if the occurrence of one event in no way influences the
occurrence of the other event. For example, we toss a six-faced die and call the event of appearance
of an even number as the event A and the appearance of an odd number as the event B. Now,
suppose that in the first toss we get an even number. If we toss the die the second time, we can still
get an even or an odd number and their chances are not influenced by the result of the first trial.
Thus, the appearance of an even number in the first trial and the appearance of an even number in
the second trial is an example of independent events. Similarly, if we pick a card at random from a
deck of playing cards, note its suit and put it back and then draw one card from the deck, then the
chances of a king card, for example, in the second trial is not at all affected by the card we had
drawn at the first trial. But if we take out a card and do not replace it back, then the chances of
drawing a king card in the second trial are certainly affected by the card we had drawn in the first
trial. If it were a king card the first time, then only 3 king cards remain in the 51 cards, while if a non-king card was drawn then we would have 4 king cards in the lot of 51 cards. So the chances of a
king card in the second trial are dependent upon the results of the first trial. This time, the events of
a king card in the first trial and king card in the second trial are not independent because the outcome
in one trial is in some way influenced by the outcome of the previous trial.
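The king-card illustration can be put into numbers. The sketch below is an illustrative addition, with variable names chosen for readability; it compares the chance of a king on the second draw with and without replacement.

```python
from fractions import Fraction

KINGS, DECK = 4, 52

# With replacement: the second draw sees the full deck again.
p_king_second_with_replacement = Fraction(KINGS, DECK)

# Without replacement: 51 cards remain, and the number of kings left
# depends on what the first card was.
p_king_second_given_king_first = Fraction(KINGS - 1, DECK - 1)
p_king_second_given_other_first = Fraction(KINGS, DECK - 1)

print(p_king_second_with_replacement)    # 1/13
print(p_king_second_given_king_first)    # 1/17
print(p_king_second_given_other_first)   # 4/51
```

With replacement the value is the same whatever the first card was; without replacement it changes with the first card, which is what makes the two draws dependent.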
1.3 METHODS OF ASSIGNING PROBABILITY
There are three methods of assigning probability to an event. They are :
(i) Classical approach,
(ii) Relative frequency approach, and
(iii) Personalistic approach.
We now discuss these methods :
1.3.1 Classical Approach
The classical approach to determine probability is the oldest one. It originated with the games of
chance. According to this theory, if there are n outcomes of an experiment which are mutually
exclusive and equally likely to occur, then the probability of each sample point is 1/n. Thus, if a fair
die is tossed, each of six numbers 1, 2,... 6, is equally likely to occur and the probability that a given
number, say 5, would occur is 1/6. From this, the classical interpretation of probability is: if the
sample space of an experiment has n(S) equally likely outcomes and if an event A, defined on this
sample space has n(A) sample points, then the probability that event A would occur is the ratio of
n(A) to n(S).
Thus, $P(A) = \frac{n(A)}{n(S)}$
To illustrate, we consider the following examples.
Example 1 : A six-faced die is tossed once. Find the probability that the number obtained on
tossing is (i) an odd number, (ii) a number greater than 2.
Solution : Let A : the event that the number is an odd number, and
B : the event that the number is greater than 2.
From the given information, n(S) = 6 (as there are six possible outcomes)
n(A) = 3 (being numbers 1, 3 and 5), and
n(B) = 4 (being numbers 3, 4, 5 and 6)
∴ $P(A) = \frac{n(A)}{n(S)} = \frac{3}{6} = \frac{1}{2}$; and
$P(B) = \frac{n(B)}{n(S)} = \frac{4}{6} = \frac{2}{3}$
Example 2 : A card is drawn from a deck of playing cards at random. Find the chance that (i) it is a
face card, (ii) it is a black ace card.
Solution : Let
A : the event that the card is a picture card
B : the event that the card is black ace card
we have,
n(S) = 52 (there being 52 cards)
n(A) = 12 (there being 4J, 4Q and 4K cards with faces)
n(B) = 2 (there being 2 black aces)
∴ $P(A) = \frac{12}{52} = \frac{3}{13}$, and
$P(B) = \frac{2}{52} = \frac{1}{26}$
Example 3 : Find the probability that a leap year selected at random shall contain 53 Sundays.
Solution : Like every year, a leap year would have 52 full weeks. The remaining two days of the
year could be:
Sunday and Monday, Monday and Tuesday, Tuesday and Wednesday, Wednesday and Thursday,
Thursday and Friday, Friday and Saturday, or Saturday and Sunday.
We observe here that n(S) = 7. Since two of the above combinations have a Sunday included,
we have n(A) = 2.
Therefore, $P(A) = \frac{n(A)}{n(S)} = \frac{2}{7}$.
The classical theory, under the assumption of equally likely outcomes, depends on logical
reasoning. It does very well when we are concerned with balanced coins, perfect dice, a well-shuffled
pack of cards and all those situations where all outcomes are equally likely. However, problems are
immediately encountered when we have to deal with unbalanced coins, loaded dice and so on.
In such situations, we have to depend on the relative frequency approach.
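The classical ratio n(A)/n(S) used in Examples 1 and 3 can be sketched in Python as follows. The helper `classical_probability` is a hypothetical name introduced only for this illustration, and it assumes, as the classical approach does, that every outcome listed is equally likely.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(A) = n(A) / n(S), assuming all outcomes are equally likely."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

# Example 1: one toss of a six-faced die.
die = range(1, 7)
p_odd = classical_probability(lambda x: x % 2 == 1, die)
p_greater_than_2 = classical_probability(lambda x: x > 2, die)

# Example 3: the two extra days of a leap year form one of seven pairs.
days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
extra_days = [(days[i], days[(i + 1) % 7]) for i in range(7)]
p_53_sundays = classical_probability(lambda pair: "Sun" in pair, extra_days)

print(p_odd, p_greater_than_2, p_53_sundays)  # 1/2 2/3 2/7
```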
1.3.2 Relative Frequency Approach
It is based on the actual observation. For example, if we were interested in the probability of 50 or
fewer customers arriving at a super market before 10 a.m. we would pick a trial number of days (n)
and count how often 50 or fewer actually did arrive before 10 a.m. Our probability assessment
would be the ratio of days when 50 or fewer customers arrived to the total number of days observed.
Similarly, to have an idea of the probability that a head would appear on tossing a coin, we may
actually toss the coin a number of times, say 1000, and find the number of times a head appears. If
536 times a head has appeared, then the probability of head to occur will be taken to be the ratio of
the two : 536/1000. Naturally, if the coin is a fair one then the ratio will approach 0.5 as we
continue to increase the number of trials. And if the coin is not a fair one then the ratio
would tend to approach the true probability of a head occurring on this coin, which depends upon
how biased the coin is.
Formally, the probability assessment for event A using the relative frequency approach is given
by :
$P(A) = \frac{\text{Number of times A occurs}}{\text{Number of trials, } n}$
It can be easily visualised that when the number of trials increases, we get better and better
estimate of the true probability of the event in question.
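The way the estimate settles down as trials accumulate can be seen in a simulation. This is only a sketch: the coin is simulated with Python's pseudo-random generator, and the function name, the `p_head` parameter and the fixed seed are assumptions made for the illustration.

```python
import random

def relative_frequency_of_heads(n_trials, p_head=0.5, seed=0):
    """Toss a simulated coin n_trials times and return the share of heads."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n_trials) if rng.random() < p_head)
    return heads / n_trials

for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

Setting `p_head` to a value other than 0.5 imitates a biased coin, and the printed ratios should then drift towards that value instead.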
Both the classical and the relative frequency approaches to probability are objective in nature.
The classical definition is objective in the sense that it is based on deduction from a set of assumptions
while the relative frequency approach is objective because the probability is derived from repeated
empirical observations. However, both the theories fail when we are dealing with unique events.
For example to determine the probability that a certain student will succeed in a particular
examination, we can apply none of the two. This is because it cannot be ruled that for every student
the events of succeeding and not succeeding are equally likely. Similarly, we cannot subject the
candidate to appear in the examination several times to estimate his probability of success. In such
cases, we have the personalistic approach to probability.
1.3.3 Personalistic Approach
The approach views that the probability of an event is a measure of the degree of belief that an
investigator has in the happening of it. It grants that the probability of the same event may be
assigned differently by different investigators according to the confidence each one has in its
happening. Thus, whereas the chances of a candidate succeeding in an examination may be placed
at 80 percent by one person, another might estimate the chances to be 95 percent. Accordingly, the
two would assign a probability 0.80 and 0.95 respectively for the event to happen.
It may be mentioned that the three approaches to defining probability are not competitive;
rather, they are complementary in nature.
1.4 COMPUTATION OF PROBABILITY
As we have already seen, the probability of an event is defined as the ratio of the number of favorable
outcomes (for the event) to the total number of possible outcomes. Little difficulty is experienced
when the total and favorable outcomes are small in number but when they are large, we may require
the use of counting techniques to identify their number. Therefore, we first state the method of
obtaining permutations and combinations.
(1) If a job can be done in m ways and another job can be done in n ways, then the total number of
ways in which both of them can be done is m × n. This is the fundamental multiplication rule.
Example 4 : A man can go from city A to city B by three routes and come back by any of four
routes. In how many ways can he perform his to and fro journey ?
Solution : He can perform the journey in a total of 3 × 4 = 12 different ways.
Example 5 : Three balanced dice are tossed. Find the chance that the sum of digits on the three
would be equal to 10.
Solution : Total number of ways in which three dice can fall = 6 × 6 × 6 = 216. Total number of
ways in which a total of 10 can appear = 27 (as shown below)
(1, 3, 6), (1, 4, 5), (1, 5, 4), (1, 6, 3),
(2, 2, 6), (2, 3, 5), (2, 4, 4), (2, 5, 3), (2, 6, 2),
(3, 1, 6), (3, 2, 5), (3, 3, 4), (3, 4, 3), (3, 5, 2), (3, 6, 1),
(4, 1, 5), (4, 2, 4), (4, 3, 3), (4, 4, 2), (4, 5, 1),
(5, 1, 4), (5, 2, 3), (5, 3, 2), (5, 4, 1),
(6, 1, 3), (6, 2, 2), (6, 3, 1)
Accordingly,
$P(\text{total of 10}) = \frac{27}{216} = \frac{1}{8}$
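The count of 27 favourable outcomes in Example 5 can be confirmed by enumerating every throw of three dice. A short Python sketch, added for illustration:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=3))      # all 6 x 6 x 6 throws
favourable = [t for t in outcomes if sum(t) == 10]   # throws totalling 10

p_total_10 = Fraction(len(favourable), len(outcomes))
print(len(outcomes), len(favourable), p_total_10)    # 216 27 1/8
```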
(2) The total number of arrangements of n distinct objects considered all at a time is equal to n!
Thus, nPn = n !
Example 6 : In how many ways can the letters in the word DELHI be arranged ?
Solution : Since all the 5 letters are different, they can be arranged in 5! = 120 ways.
(3) The total number of arrangements of n distinct objects taken r at a time equals
${}^{n}P_{r} = \frac{n!}{(n-r)!}$
Example 7 : A car dealer has 4 places in his showroom. He has just received a consignment of 10
cars of different shades. In how many ways can he arrange cars in the showroom ?
Solution :
${}^{n}P_{r} = {}^{10}P_{4} = \frac{10!}{(10-4)!} = \frac{10 \times 9 \times 8 \times 7 \times 6!}{6!} = 5040$
(4) If out of n objects, k1 are alike, k2 are alike, k3 are alike, and so on, such that k1 + k2 + k3
+ ... = n, the number of arrangements of the n objects would be equal to
${}^{n}P_{k_1, k_2, k_3, \ldots} = \frac{n!}{k_1! \, k_2! \, k_3! \cdots}$
Example 8 : In how many ways can the letters in the word STATISTICS be arranged ?
Solution : Here n = 10, k1 (S) = 3, k2(T) = 3, k3(I) = 2, k4(A) = 1 and k5(C) = 1
Accordingly,
${}^{10}P_{3,3,2,1,1} = \frac{10!}{3! \, 3! \, 2! \, 1! \, 1!} = 50400$.
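The three permutation rules used in Examples 6 to 8 can be checked with Python's standard library. In this sketch `arrangements_with_repeats` is a hypothetical helper written for the illustration, while `factorial` and `perm` come from the `math` module (Python 3.8 or later).

```python
from collections import Counter
from math import factorial, perm

def arrangements_with_repeats(word):
    """n! / (k1! k2! ...) where the k's count the repeated letters."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)
    return total

print(factorial(len("DELHI")))                  # 120
print(perm(10, 4))                              # 5040
print(arrangements_with_repeats("STATISTICS"))  # 50400
```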
(5) Out of a total of n distinct objects, the number of combinations of r objects can be obtained as
follows :
${}^{n}C_{r} = \frac{n!}{(n-r)! \, r!}$
Example 9 : In how many ways can a committee of 3 persons be chosen out of a total of 10
persons ?
Solution : Here n = 10 and r = 3. The total number of committees would be :
${}^{n}C_{r} = {}^{10}C_{3} = \frac{10!}{(10-3)! \, 3!} = \frac{10 \times 9 \times 8 \times 7!}{7! \times 3 \times 2} = 120$
Example 10 : A committee of four is to be selected randomly out of a total of 10 executives, 3 of
which are chartered accountants. Find the probability that the committee would include exactly 2
CAs.
Solution : The committee of 4 executives can be selected out of a total of 10 executives in 10C4
ways. The number of ways in which 2 CAs can be selected out of 3 is equal to 3C2, while the number
of ways in which the other 2 members can be selected out of the 7 executives who are not CAs is equal to 7C2.
∴ $P(\text{committee includes exactly 2 CAs}) = \frac{{}^{3}C_{2} \times {}^{7}C_{2}}{{}^{10}C_{4}} = \frac{63}{210}$
Example 11 : Two cards are drawn at random from a well-shuffled deck of cards. Find the probability
that both are ace cards.
Solution : No. of ways in which 2 cards can be selected out of 52 cards
$= {}^{52}C_{2} = \frac{52 \times 51 \times 50!}{50! \times 2} = 1326$
No. of ways in which 2 aces can be selected out of 4 ace cards $= {}^{4}C_{2} = \frac{4!}{2! \, 2!} = 6$
∴ $P(\text{2 ace cards}) = \frac{6}{1326} = \frac{1}{221}$
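The combination counts in Examples 9 to 11 can be reproduced with `math.comb` (Python 3.8 or later). The sketch below is an illustrative addition; the fractions are printed in lowest terms, so 63/210 appears as 3/10.

```python
from fractions import Fraction
from math import comb

# Example 9: committees of 3 out of 10 persons.
committees = comb(10, 3)

# Example 10: exactly 2 of the 3 CAs in a committee of 4 out of 10.
p_two_cas = Fraction(comb(3, 2) * comb(7, 2), comb(10, 4))

# Example 11: both cards drawn are aces.
p_two_aces = Fraction(comb(4, 2), comb(52, 2))

print(committees, p_two_cas, p_two_aces)  # 120 3/10 1/221
```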
1.5 LAWS OF PROBABILITY
The probability associated with any event represents the likelihood of that event occurring on a
particular trial of an experiment. This probability also measures the perceived uncertainty about
whether the event will occur. If we are not uncertain at all, we assign the event a probability of zero
or one. If the event be A, then P(A) = 0 means that event A would not occur, while P(A) = 1
indicates that event A would definitely occur. Thus, for any event, the probability would range
between zero and one. Probability is a non-negative concept. Symbolically,
Rule 1 : 0 ≤ P (A) ≤ 1
Example 12 : (i) Determine the probability that a 7 would appear on a six-faced die tossed once, (ii)
Determine the probability that an even or an odd number would appear on tossing a die.
Solution :
(i) $P(7) = \frac{0}{6} = 0$ (since a 7 does not exist, therefore there is no question of its occurrence)
(ii) $P(\text{even or odd number}) = \frac{6}{6} = 1$ (since the number appearing must be an even or an odd one)
Rule 2 : The probability of the complement of event A is one minus the probability of event A.
Symbolically,
$P(\bar{A}) = 1 - P(A)$
To be able to hit and not be able to hit a target for example, are complementary events. If the
probability of a person to hit a target is given to be 3/5, then the probability that he would not be
able to hit the target would be :
$P(\text{not hitting the target}) = 1 - \frac{3}{5} = \frac{2}{5}$
1.5.1 Addition Rule
When making a decision involving probabilities, we often need to combine event probabilities with
some event of interest. Here we first consider the calculation of probability that event A or B, each
of them being defined on the sample space would occur. We use the addition rules of probability for
this purpose.
Rule 3 : When the events are mutually exclusive, the probability of occurrence of either of
them is given by the sum of their individual probabilities. For two events A and B which are mutually
exclusive,
P(A or B) = P(A) + P (B)
Alternatively,
P(A ∪ B) = P(A) + P(B)
where (A ∪ B) reads A union B and means A or B. Thus, for the mutually exclusive events, the
probability that either one of them would occur is given by the sum of their individual probabilities.
This rule is known as the special rule of addition. In general terms, the rule is
P(A ∪ B ∪ C ∪...K) = P(A) + P(B) + P(C) + ...+ P(K)
Example 13 : A box contains 20 discs numbered 1 to 20. A disc is selected at random. Find the
probability that the number on the disc is divisible by 5 or 7.
Solution : Let A be the event that the number is divisible by 5, and B be the event that the number
is divisible by 7. Since there is no number which is common in these, the events A and B are
mutually exclusive. Accordingly,
$P(A) = \frac{4}{20}$, $P(B) = \frac{2}{20}$
and
$P(A \cup B) = \frac{4}{20} + \frac{2}{20} = \frac{6}{20} = \frac{3}{10}$
Rule 4 : When the events are overlapping : when two events A and B are overlapping, then the
probability that either A or B or both of them would occur is given by the sum of individual
probabilities of events A and B to occur minus the probability of their joint occurrence. Symbolically,
P(A or B or Both) = P(A) + P (B) – P (A and B)
Alternatively,
P (A ∪ B) = P(A) + P (B) – P (A ∩ B)
When there are three overlapping events A, B, and C, we have,
P(A ∪ B ∪ C) = P(A) + P (B) + P (C) – P (A ∩ B) – P (A ∩ C) – P (B ∩ C) + P (A ∩ B ∩ C)
Example 14 : A box contains 20 discs numbered 1 through 20. A disc is selected at random. Find
the chance that its number is divisible by 3 or 5.
Solution : Let A be the event that the number is divisible by 3, and B the event that the number is
divisible by 5.
Here six numbers 3, 6, 9, 12, 15 and 18 are divisible by 3 and four numbers 5, 10, 15 and 20 are
divisible by 5. We notice that the number 15 is included in both the lists. Thus, we have,
$P(A) = \frac{6}{20}$, $P(B) = \frac{4}{20}$, and $P(A \cap B) = \frac{1}{20}$
Accordingly, $P(A \cup B) = \frac{6}{20} + \frac{4}{20} - \frac{1}{20} = \frac{9}{20}$
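Both addition rules can be verified on the box of 20 discs by treating events as sets. This is an illustrative Python sketch; `prob` and `multiples_of` are hypothetical helper names. The general rule is applied in both cases, and for mutually exclusive events the joint term is simply zero.

```python
from fractions import Fraction

discs = set(range(1, 21))

def prob(event):
    """Classical probability of an event (a set of disc numbers)."""
    return Fraction(len(event), len(discs))

def multiples_of(k):
    return {n for n in discs if n % k == 0}

three, five, seven = multiples_of(3), multiples_of(5), multiples_of(7)

# Example 13: mutually exclusive events, so the joint term is zero.
p_5_or_7 = prob(five) + prob(seven) - prob(five & seven)

# Example 14: overlapping events, the number 15 is counted once.
p_3_or_5 = prob(three) + prob(five) - prob(three & five)

print(p_5_or_7, p_3_or_5)  # 3/10 9/20
```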
Example 15 : A survey conducted to know the smoking habits of 500 persons yielded the following
results :
Cigarette Brand      No. of smokers
A                    140
B                    175
C                    100
A and B              45
A and C              38
B and C              44
A and B and C        18
Find the probability that a person selected at random from the above group would be
(i) a smoker of brand A or B,
(ii) a smoker of A or B or C.
(iii) a non-smoker.
Solution: From the given information, $P(A) = \frac{140}{500}$, $P(B) = \frac{175}{500}$, $P(C) = \frac{100}{500}$, $P(A \cap B) = \frac{45}{500}$,
$P(A \cap C) = \frac{38}{500}$, $P(B \cap C) = \frac{44}{500}$, and $P(A \cap B \cap C) = \frac{18}{500}$
Accordingly,
(i) $P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{140}{500} + \frac{175}{500} - \frac{45}{500} = \frac{270}{500} = 0.54$
(ii) $P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$
$= \frac{140}{500} + \frac{175}{500} + \frac{100}{500} - \frac{45}{500} - \frac{38}{500} - \frac{44}{500} + \frac{18}{500} = \frac{306}{500} = 0.612$
(iii) $P(\text{non-smoker}) = 1 - P(\text{smoker}) = 1 - \frac{306}{500} = \frac{194}{500} = 0.388$
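The arithmetic of Example 15 follows the three-event addition rule term by term. A minimal Python sketch, added for illustration, with the survey counts keyed by the brands involved:

```python
from fractions import Fraction

N = 500
counts = {
    "A": 140, "B": 175, "C": 100,
    "AB": 45, "AC": 38, "BC": 44,
    "ABC": 18,
}
p = {key: Fraction(value, N) for key, value in counts.items()}

p_a_or_b = p["A"] + p["B"] - p["AB"]
p_any = (p["A"] + p["B"] + p["C"]
         - p["AB"] - p["AC"] - p["BC"]
         + p["ABC"])
p_non_smoker = 1 - p_any

print(float(p_a_or_b), float(p_any), float(p_non_smoker))  # 0.54 0.612 0.388
```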
1.5.2 Conditional Probability
In dealing with probability we often need to determine the chances of two or more events occurring
either at the same time or in succession. For example, a quality control manager for a manufacturing
company may be interested in the probability of selecting two successive defectives from an assembly
line. In other instances, the decision maker may know that an event has occurred and may want to
know the chances of a second event occurring. For example, the market research organization
engaged by a company may give a favorable report for a high sales figure for a new product to be
introduced by the company. The company managing director might well be interested to know the
probability of making high sales given a favourable report.
These situations require tools different from those presented above in context of addition
rules. Specifically we need to understand the rules for conditional probability and multiplication of
probabilities. To understand, suppose that the employees of an organization are cross-classified
according to sex and rank as follows :
             Officer    Clerk    Total
Males        400        300      700
Females      200        100      300
Total        600        400      1000
If an employee is selected at random then the probability that the employee would be a male =
700/1000, since out of the total of 1000 employees, a total of 700 are males. This probability is
unconditional in the sense that we are not given any information about the type of employee selected.
Now, if it is given that an employee is selected at random and he is an officer, then the probability
that he would be a male shall be equal to 400/600, because the focus would be only on the officer
employees which are 600 in all and of which there are 400 who are males. This probability is
conditional. If we let the event A to represent the event that the employee would be a male, event A
to represent that an employee would be an officer, we can write the conditional probability as :
P(A/B) =
400 2
=
600 3
where (A/B) reads as event A given that event B has occurred. Upon a closer look we can represent
the probability as:
P(A/B) = P(A ∩ B) / P(B)
In this, P(A ∩ B) represents the probability that both events A and B would occur and P(B) is
the probability for the event B to occur. Thus, P(A/B) is the conditional probability of A given B and
is defined provided P(B) > 0.
For the above example, P(A ∩ B) = 400/1000 and P(B) = 600/1000. As such,
P(A/B) = (400/1000) / (600/1000) = 400/600 = 2/3
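The calculation above can be sketched in a few lines of Python. This is only an illustrative sketch: the dictionary `counts` and the helper `prob` are hypothetical names, and the count of 100 female clerks is the value implied by the row and column totals of the table.

```python
from fractions import Fraction

# Counts from the cross-classification table (1000 employees in all).
# The female-clerk count of 100 is implied by the row and column totals.
counts = {
    ("male", "officer"): 400,
    ("male", "clerk"): 300,
    ("female", "officer"): 200,
    ("female", "clerk"): 100,
}
total = sum(counts.values())


def prob(pred):
    """Probability that a randomly selected employee satisfies pred(sex, rank)."""
    favourable = sum(n for (sex, rank), n in counts.items() if pred(sex, rank))
    return Fraction(favourable, total)


p_b = prob(lambda sex, rank: rank == "officer")                          # P(B)
p_a_and_b = prob(lambda sex, rank: sex == "male" and rank == "officer")  # P(A ∩ B)
p_a_given_b = p_a_and_b / p_b                                            # P(A/B)

print(p_a_given_b)  # 2/3
```

Exact fractions are used so that the result matches the hand calculation without rounding.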
Example 16 : Consider an experiment in which two successive draws are to be made from an urn
containing three white balls and five black balls. Assume that the balls are drawn at random and that
the ball chosen on the first draw is not replaced. Find the probability that (i) the first ball drawn is
white, and (ii) the second one is black.
Solution : (i) Let A be the event that the first ball drawn is white and B be the event that the second
ball drawn is black. From the given information, P(A) = 3/8, since there are three white balls in a
total of eight balls. (ii) To determine the conditional probability of B given A, P (B/A), which is the
probability of drawing a black ball on the second draw after drawing a white ball for the first draw,
it should be noted that if A has already occurred, then there is a total of seven balls remaining and
five of them are black. Thus, P(B/A) = 5/7.
1.5.3 Multiplication Rule
From the conditional probability defined in the preceding paragraphs, since
P(A/B) = P(A ∩ B) / P(B)
we have
Rule 5 :
P(A ∩ B) = P (B) × P (A/B)
Also
P(A ∩ B) = P (A) × P (B/A)
This is called the multiplication rule for the non-independent events A and B, and states that
the joint probability of the events A and B is given by the probability of the event A multiplied by
the probability for event B given that event A has occurred (or the probability of event B, multiplied
by the probability for event A given that event B has occurred).
Similarly, for events A, B, and C which are not independent, we have
P(A ∩ B ∩ C) = P (A) × P (B/A) × P (C/A ∩ B)
Example 17 : Two balls are selected one after the other from an urn containing 7 black and 8 green
balls. The first ball is not replaced before the second one is drawn. Find the probability that both
would be green.
Solution : Let A be the event that the first ball drawn is green and B be the event that the second ball
drawn is green. From the given information,
P(A) = 8/15, P(B/A) = 7/14 (if a green ball is taken out there would be 7 green balls in a total of 14 balls)
∴ P(A ∩ B) = P (A) × P(B/A) = 8/15 × 7/14 = 4/15
Rule 6 : If the events A and B are independent, the probability that both events occur can be
determined by using P(A) and P(B). As mentioned earlier, two events are independent if the
occurrence of one has no effect upon the occurrence of the other. More formally, if A and B are
independent,
P(A/B) = P (A), and P(B/A) = P (B).
If A and B are independent, the conditional probability of A, given B, is the same as P(A),
since the occurrence of the event B does not affect the occurrence of the event A; P(A/B) = P(A).
The joint probability of independent events may be seen as the product of the probabilities of
the events A and B, since
P(A/B) = P(A ∩ B) / P(B) = P(A)
and
P(A ∩ B) = P(A) × P(B)
To generalize, for independent events A, B, C, ... we have
P(A ∩ B ∩ C ∩ ...) = P(A) × P(B) × P(C) × ...
Example 18 : Two balls are selected one after the other from an urn containing 7 black and 8 green
balls. The first ball is replaced before the second one is drawn. Find the probability that both would
be green.
Solution : Let A and B be the events that the first and the second ball, respectively, would be green.
From the given information,
P(A) = 8/15 and P(B) = 8/15
Accordingly,
P(A ∩ B) = P(A) × P(B) = 8/15 × 8/15 = 64/225
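The contrast between drawing without replacement (Rule 5) and drawing with replacement (Rule 6) can be sketched as follows. The function name `both_green` and its parameters are hypothetical, introduced only for illustration, and the urn is assumed to hold 8 green and 7 black balls as in the examples.

```python
from fractions import Fraction


def both_green(green, black, replace):
    """Probability that two successive draws from the urn are both green."""
    total = green + black
    p_first = Fraction(green, total)  # P(A)
    if replace:
        # Independent draws: P(B/A) = P(B)
        p_second = Fraction(green, total)
    else:
        # One green ball is gone, so both counts drop by one: P(B/A)
        p_second = Fraction(green - 1, total - 1)
    return p_first * p_second


print(both_green(8, 7, replace=False))  # 4/15
print(both_green(8, 7, replace=True))   # 64/225
```

Only the second factor changes between the two cases, which is exactly the difference between P(B/A) and P(B).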
It is significant to note that if the condition P(A ∩ B) = P(A) × P(B) is satisfied, then the events
A and B are said to be independent, just as this relation is satisfied whenever they are independent.
This condition can be employed to determine whether the given events A and B are independent.
Example 19 : For the employee data cross-classified by sex and rank in section 1.5.2, test whether the events of an employee selected being
male and an employee selected being a clerk are independent.
Solution : Let A be the event that an employee selected is male and B be the event that an employee
selected would be a clerk. From the given information,
the number of employees who are males = 700
the number of employees who are clerks = 400
the number of employees who are males and clerks = 300
Accordingly
P(A) = 700/1000, P(B) = 400/1000, and P (A ∩ B) = 300/1000
Here since 700/1000 × 400/1000 ≠ 300/1000, therefore the events A and B are not independent.
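The independence test used here is a single comparison and can be sketched as below. The helper name `are_independent` is hypothetical; exact fractions are assumed so that the equality check is not disturbed by rounding.

```python
from fractions import Fraction


def are_independent(p_a, p_b, p_a_and_b):
    """Return True when P(A ∩ B) equals P(A) × P(B)."""
    return p_a * p_b == p_a_and_b


p_male = Fraction(700, 1000)            # P(A)
p_clerk = Fraction(400, 1000)           # P(B)
p_male_and_clerk = Fraction(300, 1000)  # P(A ∩ B)

print(p_male * p_clerk)  # 7/25, i.e. 0.28, which differs from 0.30
print(are_independent(p_male, p_clerk, p_male_and_clerk))  # False
```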
Now we shall discuss the theorem of total probability, also called the theorem of elimination.
Rule 7 : If H1, H2, ....Hn be n mutually exclusive events, each with a non-zero probability, and
E be an event defined on the same sample space that can occur only in association with one of them, the total
probability of event E to occur is given by : P(E) = P(H1) × P(E/H1) + P(H2) × P(E/H2) +....+ P(Hn) × P(E/Hn).
Alternatively,
P(E) = ∑ [P(Hi) × P(E/Hi)], the sum being taken over i = 1 to n.
In this formulation, of course, P(Hi) and P(E/Hi) must all be given.
To illustrate the application of this theorem, consider the following example.
Example 20 : Two sets of candidates are competing for the positions of board of directors of a
company. The chances for the first set to win are 60% while the chances for the second set are 40%.
If the first set wins, the probability that a product will be introduced is 0.80 while if the second set
wins, the probability for the product to be introduced is 0.30. Determine the probability that the
product will be introduced.
Solution : If H1 is the event that the first set wins,
H2 is the event that the second set wins, and
E is the event that the product is introduced, then
P(H1) = 0.60, P(E/H1) = 0.80, P(H2) = 0.40, P(E/H2) = 0.30
Accordingly,
P(E) = P(H1) × P(E/H1) + P(H2) × P(E/H2)
= 0.6 × 0.8 + 0.4 × 0.3 = 0.48 + 0.12 = 0.60
Thus, there is a sixty per cent chance that the product shall be introduced.
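The theorem of total probability is a weighted sum and can be sketched as follows. The function name `total_probability` is hypothetical, and the check that the priors add up to 1 is an added assumption (it holds when the events Hi cover all the ways in which E can occur).

```python
def total_probability(priors, likelihoods):
    """P(E) = sum of P(Hi) × P(E/Hi) over all i."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors are assumed to add up to 1")
    return sum(p_h * p_e_h for p_h, p_e_h in zip(priors, likelihoods))


# Example 20: P(H1) = 0.60, P(H2) = 0.40, P(E/H1) = 0.80, P(E/H2) = 0.30
p_e = total_probability([0.60, 0.40], [0.80, 0.30])
print(round(p_e, 2))  # 0.6
```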
1.6 BAYES’ THEOREM
Conditional probabilities provide a lot of good information for decision makers. For instance, say
the medical researchers are interested in determining the probability of getting cancer by a person
supposing he was exposed to hazardous chemicals. That is, P(cancer/hazardous chemicals).
In such cases we use the conditional probability rule
P(A/B) =
P(A ∩ B)
P(B)
However, in many practical applications, decision makers may know that an event has occurred
but do not know what the chances were of that event before the fact. This cannot be known by the
use of conditional probability rule directly. In such cases we employ an extension of conditional
probability called Bayes’ Theorem. This theorem deals with the conditional probability of an event
Hi, given the probability of E, where E may have elements in each of the events H1, H2, H3, ....Hn
with no element of E in more than one Hi. The Venn diagram of the figure displays such a condition.
Figure 4 : Bayes’ Theorem (Venn diagram showing the event E spread over the events H1, H2, H3 and H4)
We shall illustrate the concept with an example and then make a generalization.
Example 21 : Box 1 contains 5 white balls and 3 red balls. Box 2 contains 4 white balls and 4 red
balls. A box is selected at random and one ball is randomly taken from that box. If the ball is white,
what is the probability that it came from box 1 ? box 2 ?
Solution : Let H1 : the box 1 is selected,
H2 : the box 2 is selected, and
E : the ball is white.
From the given information, P(H1) = 1/2, P(H2) = 1/2, P(E/H1) = 5/8, and P(E/H2) = 4/8.
Here we wish to calculate P(H1/E) and P(H2/E). From the theorem of conditional probability,
P(H1/E) = P(H1 ∩ E) / P(E) and P(H2/E) = P(H2 ∩ E) / P(E)
P(H1 ∩ E) is the probability of first selecting box 1 and then selecting one white ball from it.
P(H1 ∩ E) = P(H1) × P(E/H1) = 1/2 × 5/8 = 5/16
P(H2 ∩ E) is the probability of first selecting box 2 and then selecting one white ball from it.
P(H2 ∩ E) = P(H2) × P(E/H2) = 1/2 × 4/8 = 4/16
Since the ball selected can be from box 1 or box 2, we have,
P(E) = P(H1 ∩ E) + P(H2 ∩ E)
= P(H1) × P(E/H1) + P(H2) × P(E/H2)
= (1/2 × 5/8) + (1/2 × 4/8) = 5/16 + 4/16 = 9/16.
Accordingly,
P(H1/E) = P(H1 ∩ E) / P(E) = P(H1 ∩ E) / [P(H1 ∩ E) + P(H2 ∩ E)] = (5/16) / (9/16) = 5/9
Also, P(H2/E) = P(H2 ∩ E) / P(E) = (4/16) / (9/16) = 4/9
Notice here that naturally either box 1 or box 2 would have been selected. When no information
about the colour of the ball is known, the probability that box 1 is selected is 1/2 and so is the
probability that box 2 is selected. Thus, P(H1) = 1/2 and P(H2) = 1/2 are the prior probabilities.
Having learned later that the ball selected is white, we have revised these probabilities
to P(H1/E) = 5/9 and P(H2/E) = 4/9. These probabilities are known as posterior probabilities. Thus
the prior probabilities are transformed into posterior probabilities by incorporating the additional
information, with the help of conditional and joint probabilities. The information in the above
stated example can be restated as follows :
Event    Prior Prob.   Conditional Prob.   Joint Prob.   Posterior Prob.
(Hi)     P(Hi)         P(E/Hi)             P(Hi ∩ E)     P(Hi/E)
H1       1/2           5/8                 5/16          5/9
H2       1/2           4/8                 4/16          4/9
Total, P(E) = 9/16
We can formally state the Bayes’ theorem now as follows : If H1, H2, ...Hn be mutually exclusive
and collectively exhaustive events and E be an event which is arbitrarily defined on this sample
space such that P(E) > 0, then the Bayes’ Theorem states that
P(Hi/E) = P(Hi ∩ E) / P(E)
where P(E) = ∑ P(Hi ∩ E), the sum being taken over i = 1 to n.
Example 22 : A company has two suppliers of raw materials used in making cement. Vendor A
supplies 30 per cent of raw materials while vendor B supplies 70 per cent. Tests have shown that 40
per cent of vendor A’s materials are poor quality whereas 5 per cent of vendor B’s materials are poor
quality. The cement company’s manager has just found that there is a poor quality material in
inventory. Which vendor most probably supplied the material ?
Solution : Let
H1 be the event that the material is supplied by vendor A,
H2 be the event that the material is supplied by vendor B,
E be the event that the material is of the poor quality.
Given:
Prior probabilities: P(H1) = 0.30, P(H2) = 0.70
Conditional probabilities: P(E/H1) = 0.40, P(E/H2) = 0.05.
Joint probabilities: P(H1 ∩ E) = P(H1) × P(E/H1) = 0.30 × 0.40 = 0.120
P(H2 ∩ E) = P(H2) × P(E/H2) = 0.70 × 0.05 = 0.035
Total probability, P(E) = P(H1 ∩ E) + P (H2 ∩ E) = 0.120 + 0.035 = 0.155.
Posterior probabilities:
P(H1/E) = P(H1 ∩ E) / P(E) = 0.120/0.155 = 0.77
P(H2/E) = P(H2 ∩ E) / P(E) = 0.035/0.155 = 0.23
Thus, vendor A most likely supplied the poor quality material.
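The prior-to-posterior revision in Examples 21 and 22 follows the same three steps (joint probabilities, their total, and the ratio), which can be sketched as below. The function name `posteriors` is hypothetical, and exact fractions are an illustrative choice so that the results can be compared with the hand calculations.

```python
from fractions import Fraction


def posteriors(priors, likelihoods):
    """Return ([P(Hi/E) for each i], P(E)) from P(Hi) and P(E/Hi)."""
    joints = [p_h * p_e_h for p_h, p_e_h in zip(priors, likelihoods)]  # P(Hi ∩ E)
    p_e = sum(joints)                                                 # P(E)
    if p_e == 0:
        raise ValueError("P(E) must be greater than zero")
    return [joint / p_e for joint in joints], p_e


# Example 21: two boxes, white ball observed
post21, p_e21 = posteriors([Fraction(1, 2), Fraction(1, 2)],
                           [Fraction(5, 8), Fraction(4, 8)])
print(post21[0], post21[1], p_e21)  # 5/9 4/9 9/16

# Example 22: two vendors, poor quality material observed
post22, p_e22 = posteriors([Fraction(30, 100), Fraction(70, 100)],
                           [Fraction(40, 100), Fraction(5, 100)])
print(round(float(post22[0]), 2), round(float(post22[1]), 2))  # 0.77 0.23
```

In both cases the posterior probabilities add up to 1, as the prior probabilities do.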
1.7 EXPECTED VALUE
An important concept, which has its origin in gambling and to which the probability is applied is
the expected value. According to this, if an experiment has n outcomes that are assigned the payoffs
x1, x2,............ xn occurring with probabilities p1, p2,....pn respectively, then the expected value is
given by
E(x) = x1 × p1 + x2 × p2 + ............. + xn × pn
Example 23 : A player is engaged in the experiment of rolling a fair die. The player recovers an
amount of rupees equal to the number of dots on the face that turns up, except when face 5 or 6 turns
up in which case the player will lose Rs. 5 or Rs. 6 respectively. What is the expected value of the
game to the player ?
Solution : From the given information we have :
Outcome:       1     2     3     4     5     6
Probability:   1/6   1/6   1/6   1/6   1/6   1/6
Payoff:        1     2     3     4     –5    –6
The expected value of the game is :
E(x) = 1 × 1/6 + 2 × 1/6 + 3 × 1/6 + 4 × 1/6 – 5 × 1/6 – 6 × 1/6 = –1/6
Thus, the player would expect to lose on an average Re 1/6 or 17 p. on each throw.
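The expected value formula can be checked with a short sketch. The function name `expected_value` is hypothetical; the payoffs and probabilities are those of Example 23.

```python
from fractions import Fraction


def expected_value(payoffs, probabilities):
    """E(x) = x1 × p1 + x2 × p2 + ... + xn × pn."""
    if sum(probabilities) != 1:
        raise ValueError("probabilities must add up to 1")
    return sum(x * p for x, p in zip(payoffs, probabilities))


payoffs = [1, 2, 3, 4, -5, -6]       # rupees won (+) or lost (-) for faces 1 to 6
probabilities = [Fraction(1, 6)] * 6  # fair die

ev = expected_value(payoffs, probabilities)
print(ev)  # -1/6
```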
Example 24 : An oil company may bid for only one of two contracts for oil drilling in two different
areas A and B. It is estimated that a net profit of Rs. 4,00,000 would be realized from the first field
and Rs. 5,00,000 from the second field. Legal and other costs of bidding for the first oil field are Rs.
1,02,500 and for the second one are Rs. 1,05,000. The probability of discovering oil in the first field
is 0.60 and in the second is 0.70. For which oil field should the manager of the company bid ?
Solution : The expected values for the two contracts are calculated below :
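Calculation of Expected Value
Investment   Outcome   Amount       Probability   Expected Value
A            Success   4,00,000     0.6           2,40,000
             Failure   (1,02,500)   0.4           (41,000)
                                    Total         1,99,000
B            Success   5,00,000     0.7           3,50,000
             Failure   (1,05,000)   0.3           (31,500)
                                    Total         3,18,500
Since the expected value for field B (Rs. 3,18,500) is higher than that for field A (Rs. 1,99,000), the manager should bid for field B.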
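The two totals in the table can be reproduced with the sketch below. The function name `bid_expected_value` is hypothetical; it assumes, as the table does, that the net profit is earned on success and the bidding cost is lost on failure.

```python
from fractions import Fraction


def bid_expected_value(profit, cost, p_success):
    """Expected value of a bid: profit on success, loss of the bidding cost on failure."""
    p_failure = 1 - p_success
    return profit * p_success - cost * p_failure


ev_a = bid_expected_value(400_000, 102_500, Fraction(6, 10))
ev_b = bid_expected_value(500_000, 105_000, Fraction(7, 10))

print(ev_a, ev_b)  # 199000 318500
print("B" if ev_b > ev_a else "A")  # B
```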
Example 25 : A box contains 4 white, 8 green and 8 red marbles. A player selects one marble at
random. The player wins Rs. 6 if the marble he selects is white, Rs. 2 if it is green, but must pay if
it is red. How much should he pay for a red marble if this is to be a fair game ?
Solution:
Probability of selecting a white ball = 4/20,
Probability of selecting a green ball = 8/20,
Probability of selecting a red ball = 8/20.
For a game to be fair, its net pay off must be equal to zero. We have,
Colour of Ball   Payoff          Probability   Expected Value
White            6               4/20          24/20
Green            2               8/20          16/20
Red              –x (suppose)    8/20          –8x/20
Total                                          0
∴ 24/20 + 16/20 – 8x/20 = 0, or 40/20 = 8x/20,
or x = 40/8 = Rs. 5.
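The fair-game condition (expected value equal to zero) can be solved for the unknown payment directly, as in this sketch. The function name `fair_payment` is hypothetical.

```python
from fractions import Fraction


def fair_payment(winning_outcomes, p_loss):
    """Payment x on the losing outcome that makes the expected value zero.

    winning_outcomes is a list of (payoff, probability) pairs; the condition
    sum(payoff × probability) - x × p_loss = 0 is solved for x.
    """
    expected_gain = sum(payoff * p for payoff, p in winning_outcomes)
    return expected_gain / p_loss


x = fair_payment([(6, Fraction(4, 20)), (2, Fraction(8, 20))], Fraction(8, 20))
print(x)  # 5
```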
1.8 SUMMARY
● Probability is the likelihood that something will happen. When we calculate the probability of
an event, we assign it a number between zero and one, depicting how likely it is to happen.
● There are three approaches to calculating the probability of an event. These are: (i) classical approach,
where the probability of an event is the ratio of number of favourable outcomes to the number
of total possible outcomes; (ii) relative frequency approach, where an estimate of probability
is given by the ratio of the number of favourable outcomes to the number of trials made; and
(iii) personalistic approach, where the probability of an event is assigned by an individual
depending on his degree of belief in the occurrence of the event.
● There are several theorems of probability, which are used to calculate probabilities in different
situations.
● Theorem of complementary events : This is used to determine the probability of an event
happening by subtracting the probability of the event not happening from 1.
● Theorem of addition: It deals with the probability of occurrence of either of the events when
they are mutually exclusive or when they are overlapping. According to this, the probability
that either of the events will happen is equal to the sum of their individual probabilities less the
probability of their joint occurrence.
● Theorem of multiplication: This theorem deals with the calculation of the probability when
our interest is in the occurrence of the events jointly. For independent events, it uses
multiplication of individual probabilities while for events which are not independent, it uses
conditional probability. A conditional probability is the likelihood that an event will happen,
given that another event has already happened.
● A probability tree provides a useful way of handling and analysing conditional probabilities
occurring at multiple levels. It represents the given information through various branches on a
set of chance nodes.
● Bayes’ theorem: This theorem provides a method of revising given probabilities on the basis
of additional information. This involves transforming prior probabilities into posterior
probabilities with the help of conditional and joint probabilities.
1.9 SELF ASSESSMENT QUESTIONS
Exercise 1. True or False Statements
(i) The outcomes of an experiment are known as sample space.
(ii) Two events are said to be mutually exclusive if the happening of one does not affect the
probability of happening of the other.
(iii) Mutually exclusive events may or may not be collectively exhaustive.
(iv) Overlapping events are same as non-independent events.
(v) Appearance of a heads and appearance of a tails in single trial of a coin represent
independent events.
(vi) Two mutually exclusive events are not necessarily complementary events but two
complementary events are mutually exclusive.
(vii) Overlapping events are those which can occur severally and jointly.
(viii) The classical approach defines probability of an event as the ratio of number of favourable
outcomes to the total number of trials.
(ix) Personalistic approach can be used to obtain probabilities of unique events only.
(x) All three approaches to the definition of probability have one thing in common, the
probability is expressed as a ratio not exceeding 1.
(xi) For an experiment involving a toss of three coins, n(S) is equal to 6.
(xii) If P(A/B) = P(B), then A and B are said to be independent.
(xiii) In a toss of a die, the probability of getting a 5 shall be same as the probability of getting
5 given that the number is odd.
(xiv) In the statistical sense, E1 and E2 are independent only when P(E1 ∩ E2) = P(E1) × P(E2).
(xv) Bayes’ theorem is used to calculate revised probabilities called posterior probabilities
from prior probabilities, using conditional and joint probabilities.
(xvi) The sum of posterior probabilities is equal to 1 as is the sum of prior probabilities, in a
given problem.
Ans. 1. T, 2. F, 3. T, 4. F, 5. F, 6. T, 7. T, 8. F, 9. F, 10. T, 11. F, 12. F, 13. F, 14. T, 15. T, 16. T
Exercise 2 : Questions and Answers
(i) What is probability? Explain the calculation of probability under the classical
approach.
(ii) Which probability approach would you use to calculate the following probabilities? Give
reasons also.
(a) The next toss of a fair coin will land on heads.
(b) India will win the next match with England.
(c) The sum of the faces of two dice will be eight.
(d) The success of a new product launched in the market.
(iii) “Complementary events are mutually exclusive but mutually exclusive events may not
be complementary.” Discuss with examples.
(iv) Distinguish between mutually exclusive and overlapping events. How is the theorem of
addition applied in both these cases?
(v) Distinguish clearly between mutually exclusive and independent events. Can two events
be mutually exclusive and independent simultaneously? Do you agree that on tossing a
coin once, the appearance of heads and appearance of tails represent independent as well
as mutually exclusive events?
(vi) In each of the following cases, examine whether events are mutually exclusive, overlapping,
complementary, independent or not-independent:
(a) On a single toss of a die, appearance of 5 or 6 or appearance of a number smaller
than 4.
(b) A bank employee being an assistant manager or being a female.
(c) A claim adjuster in an insurance company being a male or above 50 years of age.
(d) An employee being a clerk or a sportsman.
(e) A person in a hospital being a heart specialist or over 45 years of age or a lab
technician.
(f) A two-shift factory employee working in morning shift or evening shift.
(g) A teacher in a college working in the commerce department or the chemistry
department.
(h) A college employee being a teaching faculty or a member of non-teaching staff.
(i) A factory employee being male or being a trade union member.
(j) Appearance of an odd number or appearance of a number greater than 4 on a single
toss of a die.
(k) Getting two defectives one after another from a lot of 50 units of an item.
(vii) Explain the meaning of marginal, joint and conditional probability. How can we obtain
marginal probability of event E from the given joint probabilities of events A and E, B and
E, C and E, and D and E, where A, B, C and D are the events to which E is related?
(viii) What is statistical independence? How can we ascertain whether events A and B are
statistically independent?
(ix) State and explain Bayes’ Theorem.
(x) A, B and C are competing for the award of a building contract. It is believed that the
chances of A’s getting the contract are one-half of the combined chances of B and C’s
getting it. Further B’s chances are believed to be one-half of C’s chances. What is the
probability of each one getting the contract?
(xi) A survey on MBA students provided the following data for 2,018 students:
Age group      Whether applied to more than one school
               Yes     No
23 or under    297     201
24–26          209     379
27–30          185     268
31–35          66      193
36 and over    51      169
(a) What is the probability that a randomly selected applicant is 23 or under?
(b) What is the probability that a randomly selected applicant is older than 26?
(c) What is the probability that a randomly selected applicant applied to more than one
school?
(d) What is the probability that a randomly selected applicant is above 36 and has not
applied to more than one school?
(e) What is the probability that a randomly selected applicant is under 27 and has applied
to more than one school?
(xii) There are twenty tickets in a bag, numbered consecutively from 1 to 20. A ticket is selected
at random. Find the chance that the number on the ticket is:
(a) Greater than 14 (b) Divisible by 4 (c) Between 8 and 15, both inclusive.
(xiii) There are four shops and four applicants each of whom applies for one shop at random.
Find the probability that
(a) Each of them applies for a different shop.
(b) Each of them applies for the same shop.
(xiv) A committee of 6 is to be chosen randomly from a group of 8 men and 4 women. Determine
the probability that it shall be composed of
(a) 4 men and 2 women
(b) 6 women
(c) 2 men and 4 women
(d) 6 men
(xv) Among the 90 pieces of mail delivered to an office, 50 are addressed to the accounting
department and 40 are addressed to the marketing department. If two of these pieces of
mail are delivered to the managers’ office by mistake, and the selection is random, what
are the probabilities that
(a) Both of them should have been delivered to the accounting department;
(b) Both of them should have been delivered to the marketing department;
(c) One should have been delivered to the accounting department and the other to the
marketing department?
(xvi) A bag contains 6 green, 7 blue and 2 red balls. Three balls are chosen at random. Find the
probability that (i) all of them are green, (ii) both the red balls are included and (iii) the
balls are all of different colours.
(xvii) Two unbiased dice are tossed. What is the probability that the total of numbers on them
would be a multiple of 3?
(xviii) A pack contains 30 tickets numbered consecutively from 1 to 30. A ticket is chosen at
random from this. Find the chance that the number on this would be (i) a multiple of 6 or 7
and (ii) a multiple of 3 or 5.
(xix) Five candidates A, B, C, D and E appear for an interview. Two candidates D and E are
eliminated in the first round of the interview. A has twice the chance of being selected
as B, and B has twice the chance of C, in the final interview. D bets that either A or B
will be selected and E bets that either B or C will be selected. Who is likely to win the
bet?
(xx) Given the following probability table of television viewing frequencies (X) and the income
levels (Y):
Income level (Y)   Viewing frequency (X)
                   Regular   Occasional   Rarely   Total
High               0.10      0.10         0.05     0.25
Middle             0.15      0.20         0.05     0.40
Low                0.05      0.10         0.20     0.35
Total              0.30      0.40         0.30     1.00
(a) What is the probability that a person is a low income individual and views TV
regularly?
(b) If an individual is at low income level, what is the probability that he/she views TV
regularly?
(c) What is the probability that given an individual does not have high income, he/she
rarely watches TV?
(d) If an individual occasionally watches TV, what is the probability that he/she is a
high income earner or a middle income earner?
(e) Is viewing TV regularly independent of earning high income? Explain.
(xxi) The probability that a contractor will not get a plumbing contract is 2/3 and the probability
that he will get an electric contract is 5/9. If the probability of getting at least one contract
is 4/5, what is the probability that he will get both the contracts?
(xxii) An unbiased die and a biased die are tossed together. Find the probability that the sum of
digits obtained on them is even, given that on the biased die, it is thrice as likely to show
an even number as an odd one when tossed once.
(xxiii) A six-faced die is so biased that the digits 1, 3 or 5 on it are thrice as likely as the digits 2,4
or 6, when tossed once. Find the probability that in two tosses of this die, the sum of
digits would be odd.
(xxiv) A husband and wife appear in an interview for two vacancies for the same post. The
probability of the husband’s selection is 1/7 and that of wife’s selection is 1/5. What is the
probability that
(a) Both of them will be selected?
(b) Only one of them will be selected?
(c) None of them will be selected?
(xxv) (a) In rolling a pair of dice, what is the probability of rolling a total of 21 on the first two
rolls?
(b) Given that P(A) = 0.65, P(B) = 0.80, P(A/B) = P(A) and P(B/A) = 0.85. Is this a
consistent assignment of probabilities?
(xxvi) An MBA applies for one job in two firms X and Y. The probability of his being selected in
firm X is 0.7 and his being rejected in firm Y is 0.5. The probability of at least one of his
applications being rejected is 0.6. What is the probability that he will be selected in one or
both of the firms?
(xxvii) During a survey of road safety, it was found that 60 percent of accidents occur at night, 52
percent are alcohol related, and 37 percent are alcohol related and occur at night.
(a) What is the probability that an accident was alcohol related given that it occurred at
night?
(b) What is the probability that an accident occurred at night given that it was alcohol
related?
(xxviii) An advertising executive is studying television-viewing habits of married men and women
during prime time hours. On the basis of past viewing records, the executive has determined
that during prime time, husbands are watching television 60% of the time. It has also
been determined that when the husband is watching television, 40% of the time the wife
is also watching. When the husband is not watching television, 30% of the time the wife
is watching television. Find the probability that,
(a) If the wife is watching television, the husband is also watching television.
(b) The wife is watching television during prime time.
(xxix) In a small factory, machines A, B and C manufacture 35%, 25% and 40% respectively of
the total output. Of their output, respectively, 0.5, 4 and 2 percent are defective. One item
is drawn and found to be defective. What are the respective probabilities that it was
produced by machines A, B and C?
(xxx) Reliance Industries Limited is determining whether it should submit a bid for oil exploration
contract. In the past, main competitor of RIL, ONGC has submitted bids 66 percent of the
time. If ONGC does not bid for oil exploration contract, the probability that RIL will get
the contract is 0.45. If ONGC does bid for oil exploration contract, the probability that
RIL will get the contract is 0.25.
(a) If Reliance Industries gets the contract, what is the probability that ONGC did not
bid?
(b) What is the probability that Reliance Industries will get the contract?
Ans. 10. 0.33, 0.22, 0.44, 11. 498/2018, 932/2018, 808/2018, 169/2018, 507/2018 12. 0.3, 0.25, 0.4
13. 3/32, 1/64 14. 0.4545, 0, 0.0303, 0.0303 15. 0.306, 0.195, 0.4993 16. 20/455, 13/455, 84/455
17. 0.33 18. 9/30, 14/30 19. D is likely to win 20. 0.0575, 0.01429, 0.333, 0.75, No 21. 14/45 22. 0.5
23. 0.375 24. 1/35, 10/35, 24/35 25. 20/1296, Not consistent 26. 0.8 27. 0.617, 0.712 28. 0.67, 0.36
29. 0.362, 0.406, 0.232 30. 0.519, 0.318
LESSON-2
PROBABILITY DISTRIBUTIONS
2. STRUCTURE
2.0 Objective
2.1 Binomial Distribution
2.1.1 Properties of Binomial Distribution
2.2 Poisson Distribution
2.3 Normal Distribution
2.3.1 Properties of Normal Distribution
2.3.2 Calculation of Probabilities
2.4 Summary
2.5 Self Assessment Questions
2.0 OBJECTIVE
After reading this lesson, you should be able to :
(a) Understand the concept of probability distribution and random variables
(b) Understand the characteristics and procedure of computing probabilities using binomial and
Poisson distribution
(c) Comprehend the difference between discrete and continuous probability distribution.
(d) Understand normal distribution, properties of a normal curve and computation of probabilities
using z-values.
(e) Analyse the situations under which a Poisson distribution is treated as a binomial or normal
distribution.
2.1 BINOMIAL DISTRIBUTION
Probability distributions refer to a set of mathematical models of the relative frequencies of a finite number of observations
of a variable. A probability distribution is a systematic arrangement of the probabilities of mutually exclusive and collectively
exhaustive elementary events of an experiment. Observed frequency distributions are based upon
actual observation and experimentation. We can deduce mathematically a frequency distribution of
a certain population based on the trend of the known values. This kind of distribution, based on experience
or theoretical considerations, is known as a theoretical distribution or probability distribution. These
distributions may not fully agree with actual observations or the empirical distributions based on
sample observations. If the number of experiments is increased sufficiently, the observed distributions
may come closer to the theoretical or probability distributions. Theoretical distributions are useful for
situations where actual observations or experiments are not possible. Moreover, they can be used to
test the goodness of fit. They provide decision makers with a logical basis for making decisions and
are useful in making predictions on the basis of limited information or theoretical considerations.
There are broadly three theoretical distributions which are generally applied in practice. They
are :
1. Binomial distribution
2. Poisson distribution
3. Normal distribution
We shall discuss them in detail one by one.
The binomial distribution is a discrete distribution. It was discovered by James Bernoulli in 1700 to
deal with dichotomous classification of events. It is a probability distribution expressing the
probability of one set of dichotomous alternatives, i.e., success or failure. The binomial probability
distribution is developed under some assumptions which are:
(i) An experiment is performed under similar conditions for a number of times.
(ii) Each trial has two possible outcomes, success or failure : S = {failure, success}.
(iii) The probability of a success denoted by p remains constant for all trials. The probability
of a failure denoted by q is equal to (1 – p).
(iv) All trials for an experiment are independent.
If a trial of an experiment can result in success with probability p and failure with probability
q = (1 – p), the probability of exactly x successes in n trials is given by
P(x) = nCx p^x q^(n–x), where x = 0, 1, 2, ..., n and P(x) = probability of x successes
nCx = n! / [x! (n – x)!] (! is termed factorial)
The entire probability distribution of x = 0, 1, 2,....n can be written as follows :
Binomial Probability Distribution
Number of successes (x)   Probability P(x)
0                         nC0 p^0 q^n
1                         nC1 p^1 q^(n–1)
2                         nC2 p^2 q^(n–2)
:                         :
x                         nCx p^x q^(n–x)
:                         :
n                         nCn p^n q^(n–n)
We should note that the variable x (number of successes) is discrete. It can take integer values
0, 1, 2, ..., n. The probabilities specified in the above table are in fact the successive terms of the Binomial
Expansion of (q + p)^n, which is
(q + p)^n = nC0 q^n p^0 + nC1 q^(n–1) p^1 + nC2 q^(n–2) p^2 + nC3 q^(n–3) p^3 + ... + nCn q^(n–n) p^n
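The whole binomial probability distribution can be generated term by term, as in the sketch below. The function name `binomial_pmf` is hypothetical, and n = 8 with p = 1/2 are illustrative values only. Because the terms are those of the expansion of (q + p)^n and q + p = 1, they must add up to 1.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Return [P(0), P(1), ..., P(n)] with P(x) = nCx p^x q^(n-x)."""
    q = 1 - p
    return [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]


pmf = binomial_pmf(8, Fraction(1, 2))  # illustrative values of n and p

print(pmf[0])    # 1/256
print(pmf[4])    # 35/128
print(sum(pmf))  # 1
```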
2.1.1 Properties of Binomial Distribution
(i) The shape and location of binomial distribution changes as p changes for a given n or n
changes for given p. If n increases for a fixed p, the binomial distribution moves to the
right, flattens and spreads out. The mean of the distribution (np) increases as n increases
for constant value of p.
(ii) The mode of the binomial distribution is equal to the value of x which has the largest
probability.
(iii) If n is large and p and q are not close to zero, the binomial distribution can be approximated
by a normal distribution with standardised variable
$z = \dfrac{X - np}{\sqrt{npq}}$
(iv) The mean and the standard deviation of the binomial distribution are np and $\sqrt{npq}$
respectively.
(v) The other constants of the distribution can be calculated.
$\mu_2 = npq$
$\mu_3 = npq\,(q - p)$
$\mu_4 = 3n^2p^2q^2 + npq\,(1 - 6pq)$
We can calculate the values of β1 and β2 to measure the nature of the distribution.
$\beta_1 = \dfrac{\mu_3^2}{\mu_2^3} = \dfrac{n^2p^2q^2(q - p)^2}{n^3p^3q^3} = \dfrac{(q - p)^2}{npq}$
$\beta_2 = \dfrac{\mu_4}{\mu_2^2} = \dfrac{3n^2p^2q^2 + npq(1 - 6pq)}{n^2p^2q^2} = 3 + \dfrac{1 - 6pq}{npq}$
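As a numerical check on these constants, the central moments can be computed directly from the binomial probabilities and compared with the closed forms. This is only a sketch: the helper `central_moment` and the values n = 12, p = 0.3 are assumptions chosen for illustration.

```python
from math import comb


def central_moment(n: int, p: float, k: int) -> float:
    """k-th moment about the mean np, summed over x = 0..n."""
    q = 1.0 - p
    mean = n * p
    return sum(
        (x - mean) ** k * comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)
    )


# Illustrative parameters (assumed).
n, p = 12, 0.3
q = 1.0 - p

mu2 = central_moment(n, p, 2)
mu3 = central_moment(n, p, 3)
mu4 = central_moment(n, p, 4)

beta1 = mu3**2 / mu2**3  # measure of skewness
beta2 = mu4 / mu2**2     # measure of kurtosis

print(mu2, n * p * q)
print(beta1, (q - p) ** 2 / (n * p * q))
print(beta2, 3 + (1 - 6 * p * q) / (n * p * q))
```

Each printed pair should agree up to floating-point rounding. When p = q the value of β1 is zero, which corresponds to a symmetrical distribution.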
The binomial distribution is useful in describing a variety of real life events. Binomial distribution
is useful to answer questions such as : If we conduct an experiment n times under the stated conditions,
what is the probability of obtaining exactly x successes? For example, if 10 coins are tossed
simultaneously what is the probability of getting 4 heads ? We shall explain the usefulness of
binomial distribution by certain examples.
Example : A coin is tossed eight times. What is the probability of obtaining 0, 1, 2, 3, 4, 5, 6, 7 and 8 (all)
heads ?
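Solution : Let us treat the occurrence of a head as a success, with probability p.
So that $p = \dfrac{1}{2}$
∴ $q = 1 - p = \dfrac{1}{2}$
and n = 8 (given)
We can calculate the various probabilities by expanding the binomial
$(q + p)^{8} = {}^{8}C_{0}\,q^{8}p^{0} + {}^{8}C_{1}\,q^{7}p^{1} + {}^{8}C_{2}\,q^{6}p^{2} + {}^{8}C_{3}\,q^{5}p^{3} + {}^{8}C_{4}\,q^{4}p^{4} + {}^{8}C_{5}\,q^{3}p^{5} + {}^{8}C_{6}\,q^{2}p^{6} + {}^{8}C_{7}\,q^{1}p^{7} + {}^{8}C_{8}\,q^{0}p^{8}$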
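Therefore the probability of obtaining 0 heads
$= {}^{8}C_{0}\,q^{8}p^{0} = \dfrac{8!}{8!\times 0!}\left(\dfrac{1}{2}\right)^{8}\left(\dfrac{1}{2}\right)^{0} = \dfrac{8!}{8!}\left(\dfrac{1}{2}\right)^{8} = \dfrac{1}{256}$ Ans.
The probability of obtaining 1 head
$= {}^{8}C_{1}\,q^{7}p^{1} = \dfrac{8!}{7!\times 1!}\left(\dfrac{1}{2}\right)^{7}\left(\dfrac{1}{2}\right)^{1} = 8\left(\dfrac{1}{2}\right)^{8} = \dfrac{8}{256}$ Ans.
The probability of getting 2 heads
$= {}^{8}C_{2}\,q^{6}p^{2} = \dfrac{8!}{2!\times 6!}\left(\dfrac{1}{2}\right)^{6}\left(\dfrac{1}{2}\right)^{2} = \dfrac{8\times 7\times 6!}{2!\times 6!}\left(\dfrac{1}{2}\right)^{8} = \dfrac{8\times 7}{2}\left(\dfrac{1}{2}\right)^{8} = \dfrac{28}{256}$ Ans.
The probability of getting 3 heads
$= {}^{8}C_{3}\,q^{5}p^{3} = \dfrac{8!}{3!\times 5!}\left(\dfrac{1}{2}\right)^{5}\left(\dfrac{1}{2}\right)^{3} = \dfrac{8\times 7\times 6\times 5!}{3\times 2\times 1\times 5!}\left(\dfrac{1}{2}\right)^{8} = \dfrac{56}{256}$ Ans.
The probability of getting 4 heads
$= {}^{8}C_{4}\,q^{4}p^{4} = \dfrac{8!}{4!\times 4!}\left(\dfrac{1}{2}\right)^{4}\left(\dfrac{1}{2}\right)^{4} = \dfrac{8\times 7\times 6\times 5\times 4!}{4\times 3\times 2\times 1\times 4!}\left(\dfrac{1}{2}\right)^{8} = \dfrac{70}{256}$ Ans.
The probability of getting 5 heads
$= {}^{8}C_{5}\,q^{3}p^{5} = \dfrac{8!}{3!\times 5!}\left(\dfrac{1}{2}\right)^{3}\left(\dfrac{1}{2}\right)^{5} = \dfrac{8\times 7\times 6\times 5!}{3\times 2\times 1\times 5!}\left(\dfrac{1}{2}\right)^{8} = \dfrac{56}{256}$ Ans.
The probability of getting 6 heads
$= {}^{8}C_{6}\,q^{2}p^{6} = \dfrac{8!}{2!\times 6!}\left(\dfrac{1}{2}\right)^{2}\left(\dfrac{1}{2}\right)^{6} = \dfrac{8\times 7\times 6!}{2\times 1\times 6!}\left(\dfrac{1}{2}\right)^{8} = \dfrac{28}{256}$ Ans.
The probability of getting 7 heads
$= {}^{8}C_{7}\,q^{1}p^{7} = \dfrac{8!}{7!\times 1!}\left(\dfrac{1}{2}\right)^{1}\left(\dfrac{1}{2}\right)^{7} = \dfrac{8\times 7!}{7!}\left(\dfrac{1}{2}\right)^{8} = \dfrac{8}{256}$ Ans.
The probability of getting 8 heads
$= {}^{8}C_{8}\,q^{0}p^{8} = \dfrac{8!}{8!\times 0!}\left(\dfrac{1}{2}\right)^{0}\left(\dfrac{1}{2}\right)^{8} = \dfrac{8!}{8!}\left(\dfrac{1}{2}\right)^{8} = \dfrac{1}{256}$ Ans.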
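We can also calculate the probability of 4 or more heads, or of more than 6 heads.
Probability of 4 or more heads
$= {}^{8}C_{4}\,q^{4}p^{4} + {}^{8}C_{5}\,q^{3}p^{5} + {}^{8}C_{6}\,q^{2}p^{6} + {}^{8}C_{7}\,q^{1}p^{7} + {}^{8}C_{8}\,q^{0}p^{8}$
$= \dfrac{70}{256} + \dfrac{56}{256} + \dfrac{28}{256} + \dfrac{8}{256} + \dfrac{1}{256} = \dfrac{163}{256}$ Ans.
Probability of getting more than 6 heads
$= {}^{8}C_{7}\,q^{1}p^{7} + {}^{8}C_{8}\,q^{0}p^{8} = \dfrac{8}{256} + \dfrac{1}{256} = \dfrac{9}{256}$ Ans.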
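The coin-tossing example can be reproduced with exact fractions, which avoids any rounding. The following is a minimal sketch; the variable names are illustrative and not part of the lesson.

```python
from fractions import Fraction
from math import comb

n = 8
p = Fraction(1, 2)  # probability of a head
q = 1 - p           # probability of a tail

# Exact probabilities of 0, 1, ..., 8 heads.
pmf = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]

# Numerators over the common denominator 256.
numerators = [int(prob * 256) for prob in pmf]

at_least_4 = sum(pmf[4:])   # 4 or more heads
more_than_6 = sum(pmf[7:])  # 7 or 8 heads

print(numerators)
print(at_least_4, more_than_6)
```

The list of numerators is the eighth row of Pascal's triangle, which is why the probabilities are symmetrical about 4 heads when p = q.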
Example 1 : A box contains 100 transistors, 20 of which are defective. 10 are selected at random.
Calculate the probability that (i) all 10 are defective, (ii) all 10 are good.
Solution : Let x represent the number of defective transistors selected. Then the value of x would
be x = 0, 1, 2, ..., 10.
Let p be the probability of a defective transistor, treating each selection as an independent trial with constant p.
∴ $p = \dfrac{20}{100} = \dfrac{1}{5}$ and $q = 1 - p = \dfrac{4}{5}$, n = 10
Using the formula for binomial expansion, the probability of x defective transistors is
$P(x) = {}^{n}C_{x}\,p^{x}q^{n-x}$
(i) Probability that all 10 are defective
$= {}^{10}C_{10}\,p^{10}q^{10-10} = \dfrac{10!}{10!\times 0!}\left(\dfrac{1}{5}\right)^{10}\left(\dfrac{4}{5}\right)^{0} = \left(\dfrac{1}{5}\right)^{10} = \dfrac{1}{5^{10}}$ Ans.
(ii) Probability that all 10 are good
$= {}^{10}C_{0}\,p^{0}q^{10} = \dfrac{10!}{0!\times 10!}\left(\dfrac{1}{5}\right)^{0}\left(\dfrac{4}{5}\right)^{10} = \left(\dfrac{4}{5}\right)^{10}$ Ans.
2.2 POISSON DISTRIBUTION
This is also a discrete distribution. It was originated by a French mathematician Simeon Denis
Poisson in 1837. The Poisson distribution is the limiting form of the binomial distribution as n becomes
infinitely large and p approaches zero such that np = m remains fixed. As a working rule it is
applied when n > 20 and p < 0.05. The Poisson distribution is useful for rare events. Suppose that in the binomial distribution
(a) p is very small
(b) n is so large that np = m is constant
Then, we would get the following distribution
x           Probability
0           $e^{-m}$
1           $e^{-m}\,\dfrac{m}{1!}$
2           $e^{-m}\,\dfrac{m^{2}}{2!}$
3           $e^{-m}\,\dfrac{m^{3}}{3!}$
:           :
x           $e^{-m}\,\dfrac{m^{x}}{x!}$
:           :
Total       1
It is a Poisson distribution. Under these conditions the probability of getting x successes is
$P(x) = e^{-m}\,\dfrac{m^{x}}{x!}$
The sum of the probabilities of 0, 1, 2, 3, ... successes is
$e^{-m}\left(1 + \dfrac{m}{1!} + \dfrac{m^{2}}{2!} + \dfrac{m^{3}}{3!} + \dots + \dfrac{m^{x}}{x!} + \dots\right) = e^{-m}\cdot e^{m} = 1$
where e is a constant whose value is 2.7183 and m is the parameter of the distribution i.e. the
average number of occurrences of an event.
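The statement that the Poisson distribution is the limiting form of the binomial can be illustrated numerically: keep np = m fixed, let n grow, and watch the binomial probability approach the Poisson value. The sketch below assumes m = 2 and x = 3 purely for illustration; the function names are hypothetical.

```python
from math import comb, exp, factorial


def poisson_pmf(x: int, m: float) -> float:
    """P(x) = e^(-m) * m^x / x!"""
    return exp(-m) * m**x / factorial(x)


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(x) = nCx * p^x * q^(n-x)"""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


m = 2.0  # fixed mean np (assumed value)
x = 3    # number of successes examined (assumed value)

gaps = []
for n in (20, 200, 2000, 20000):
    p = m / n  # p shrinks as n grows, so that np stays equal to m
    gap = abs(binomial_pmf(x, n, p) - poisson_pmf(x, m))
    gaps.append(gap)
    print(n, round(gap, 6))

# The Poisson probabilities themselves add up to 1 (terms beyond x = 50 are negligible here).
poisson_total = sum(poisson_pmf(k, m) for k in range(51))
```

The gap shrinks by roughly a factor of ten each time n is multiplied by ten, which is the sense in which the approximation improves for large n and small p.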
A classical example of the Poisson distribution is given by road accidents. As we know the
number of people travelling on the road is very large i.e. n is large. Probability that any specific
individual runs into an accident is very small. However,
np = average number of road accidents on any particular day is a finite constant.
Therefore, x (number of road accidents on a particular day) follows Poisson distribution.
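The various parameters of Poisson Distribution are :
Mean (m) = np
Variance = np = m
$\sigma = \sqrt{np} = \sqrt{m}$
$\mu_2 = np = m$
$\mu_3 = m$
$\mu_4 = m + 3m^{2}$
∴ $\beta_1 = \dfrac{\mu_3^{2}}{\mu_2^{3}} = \dfrac{m^{2}}{m^{3}} = \dfrac{1}{m}$
$\beta_2 = \dfrac{\mu_4}{\mu_2^{2}} = \dfrac{m + 3m^{2}}{m^{2}} = 3 + \dfrac{1}{m}$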
Example 2 : One house in 1000 in a district has a fire in a year. What is the probability that exactly
5 houses will have a fire during the year if there are 2000 houses ?
Solution : We shall apply the Poisson distribution with
m = np, where n = 2000 and $p = \dfrac{1}{1000}$
∴ $m = np = 2000 \times \dfrac{1}{1000} = 2$
$P(x) = e^{-m} \times \dfrac{m^{x}}{x!}$, where x = 5 and e = 2.7183
$P(5) = (2.7183)^{-2} \times \dfrac{2^{5}}{5!} = (2.7183)^{-2} \times \dfrac{2\times 2\times 2\times 2\times 2}{5\times 4\times 3\times 2\times 1} = (2.7183)^{-2} \times \dfrac{4}{15}$
Using logarithms, $(2.7183)^{-2}$ = Reciprocal (antilog (2 log 2.7183)) = Reciprocal (7.389) = 0.1353, so
$P(5) = 0.1353 \times \dfrac{4}{15} = 0.036$ Ans.
Example 3 : If 3% of the bulbs manufactured are defective, calculate the probability that a sample
of 100 bulbs will contain (i) no defective bulb and (ii) one defective bulb, using the Poisson distribution.
Solution : Given that 3% of the bulbs are defective, p = 3/100 and n = 100.
∴ $m = np = 100 \times \dfrac{3}{100} = 3$
The probability of x defective bulbs in a sample of 100 is
$P(x) = e^{-m} \times \dfrac{m^{x}}{x!}$, where m = 3 and e = 2.7183
Probability of no defective bulb in a sample of 100 is
$P(0) = (2.7183)^{-3} = 0.05$ Ans.
Probability of one defective bulb in a sample of 100 is
$P(1) = e^{-m} \times \dfrac{m^{1}}{1!} = (2.7183)^{-3} \times 3 = 0.15$ Ans.
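Since Example 3 uses the Poisson distribution as an approximation, it can be instructive to place the exact binomial probabilities for n = 100 and p = 0.03 beside the Poisson values. The comparison below is an added illustration, not part of the original solution, and the list names are assumptions.

```python
from math import comb, exp, factorial

n, p = 100, 0.03
m = n * p  # Poisson mean, equal to 3

# Probabilities of 0 and 1 defective bulbs under each model.
poisson = [exp(-m) * m**x / factorial(x) for x in (0, 1)]
binomial = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in (0, 1)]

for x in (0, 1):
    print(x, round(poisson[x], 4), round(binomial[x], 4))
```

The two models differ only in the third decimal place here, which is consistent with the rule that the approximation works when n is large and p is small.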
2.3 NORMAL DISTRIBUTION
The most important continuous probability distribution used in the entire field of statistics is the normal
distribution. The normal curve is a bell-shaped curve that extends infinitely in both directions, coming closer
and closer to the horizontal axis without touching it. The mathematical equation of the normal curve
was developed by De Moivre in 1733. A continuous random variable x is said to be normally
distributed if it has the probability density function represented by the equation of the normal curve
$y = \dfrac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}, \quad -\infty < x < +\infty$
where µ and σ are the mean and standard deviation, which are the two parameters, and e = 2.7183 and
π = 3.1416 are constants.
It may be understood that the normal distributions can have different shapes depending upon
values of µ and σ but there is one and only one normal distribution for any given pair values of µ
and σ.
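The equation of the normal curve translates directly into code. The sketch below evaluates the ordinate y for given µ and σ; the function name `normal_pdf` and the sample values are assumptions for illustration.

```python
from math import exp, pi, sqrt


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Ordinate of the normal curve at x for mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))


# Same mean, different standard deviations: the shape changes, the centre does not.
peak_narrow = normal_pdf(0.0, 0.0, 1.0)  # height at the mean when sigma = 1
peak_wide = normal_pdf(0.0, 0.0, 2.0)    # height at the mean when sigma = 2

# The curve is symmetrical about the mean.
left = normal_pdf(-1.5, 0.0, 1.0)
right = normal_pdf(1.5, 0.0, 1.0)

print(peak_narrow, peak_wide, left, right)
```

Doubling σ halves the height at the mean, since the ordinate there is 1/(σ√(2π)); each pair (µ, σ) gives exactly one curve.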
2.3.1 Properties of Normal Distribution
1. If the parameters µ and σ of the normal curve are specified, the normal curve is fully determined
and we can draw it by obtaining the value of y corresponding to different values of x (the
abscissa).
2. The normal curve tends to touch the x-axis only at infinity, i.e. the x-axis is an asymptote to
the normal curve. It is a continuous curve stretching from – ∞ to + ∞.
3. The mean, median and mode of the normal distribution are equal.
4. The height of the normal curve is maximum at x = µ. Hence the mode of the normal curve is
x = µ.
5. The two quartiles Q1 and Q3 are equidistant from the median.
Q1 = µ – 0.6745 σ
Q3 = µ + 0.6745 σ
Hence Quartile Deviation = $\dfrac{Q_3 - Q_1}{2}$ = 0.6745 σ
6. Mean deviation about mean is approximately $\dfrac{4}{5}\sigma$, or MD = 0.7979 σ
7. The points of inflexion of the normal curve occur at x = µ + σ and x = µ – σ
8. The tails of the curve extend to infinity on both sides of the mean. The maximum ordinate at
X = µ is given by $y = \dfrac{1}{\sigma\sqrt{2\pi}}$
9. Approximately 100% of the area under the curve is covered by µ ± 3σ.
Distance from the mean            % of total area under
ordinate in terms of ± σ          the normal curve
Mean ± 1σ                         68.27
Mean ± 2σ                         95.45
Mean ± 3σ                         99.73
10. All odd moments are equal to zero.
µ1 = µ3 = 0
β1 = 0 and β2 = 3. Thus the curve is mesokurtic.
11. The normal distribution is formed with a continuous variable.
12. The fourth moment is equal to $3\sigma^{4}$ for a normal distribution.
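The percentages of area within one, two and three standard deviations of the mean, and the quartile multiplier 0.6745, can be checked with the error function from Python's standard library. This is a sketch; `area_within` is an illustrative helper name, and the identity used (area within ±k standard deviations equals erf(k/√2)) is a standard result rather than something stated in the lesson.

```python
from math import erf, sqrt


def area_within(k: float) -> float:
    """Fraction of the area under a normal curve lying within mean ± k standard deviations."""
    return erf(k / sqrt(2))


percentages = {k: round(100 * area_within(k), 2) for k in (1, 2, 3)}
print(percentages)

# Half of the area lies between the quartiles Q1 and Q3, i.e. within ± 0.6745 sigma.
print(area_within(0.6745))
```

The result does not depend on µ or σ, because every normal curve reduces to the same standard curve once distances are measured in units of σ.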
The equation of the normal curve gives the ordinate of the curve corresponding to any given
value of x. But we are interested in finding out the area under the normal curve rather than its
ordinate (y). A normal curve with 0 mean and unit standard deviation is known as the standard
normal curve. Statistical tables give the areas and ordinates of the normal
curve corresponding to the standard normal variate
$z = \dfrac{x - \mu}{\sigma}$
and not corresponding to x.
Let us see the normal curve area under x-scale and z-scale.
Fig. 1
2.3.2 Calculation of Probabilities
Now we discuss the method of calculating probabilities where the distribution follows the normal
pattern. In fact, the probability for the variable to assume a value within a given range, say X1 to
X2, is equal to the ratio of the area under the curve in that range to the total area under the curve. To
obtain the relevant areas, we first transform a given value of the variable X into standardized variate
Z as follows :
$Z = \dfrac{X - \mu}{\sigma}$
Then we consult the normal area table. This table is constructed in a manner such that the areas
between mean (µ) and the particular values of Z are given. The first column of this table contains
values of Z from 0.0 to 3.0, while the top row of the table gives values 0.00; 0.01; 0.02; ....0.09. To
find the area (from mean) to a specific value of Z, we look up in the first column for the Z-value up to its
first decimal place while its second decimal place is read from the top row. To illustrate, if we want
to find the area between mean and Z = 1.42, then we look for 1.4 in the first column and 0.02 in the
top row. Corresponding to these, the value in the table reads 0.4222. Similarly, it can be verified
that the area up to Z = 0.10 is equal to 0.0398 while for Z = 2.59, it is 0.4952.
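The table lookup described above can be mimicked by computing the area between the mean and Z directly. The sketch below assumes the standard identity that this area equals 0.5 × erf(Z/√2); the helper name `area_mean_to_z` is illustrative.

```python
from math import erf, sqrt


def area_mean_to_z(z: float) -> float:
    """Area under the standard normal curve between Z = 0 and Z = z (z >= 0)."""
    return 0.5 * erf(z / sqrt(2))


# The three lookups quoted in the text.
lookups = {z: round(area_mean_to_z(z), 4) for z in (0.10, 1.42, 2.59)}
print(lookups)
```

Rounded to four decimal places, the computed areas should agree with the table entries read from row 1.4 and column 0.02, and so on.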
Let us understand a few more things :
(i) The area under the curve from Z = 0 (when X = µ) to a particular value of Z gives the
proportion of the area under this part of the curve to the total area under the curve. Thus,
for Z = 0 to Z = 1.42 the value is 0.4222. Naturally, this is taken as the probability that the
variable in question will assume a value within these limits.
(ii) Since the normal curve is symmetrical with respect to the mean, the area between µ (Z = 0) and a
particular value of Z to its right will be the same as the area for the same value of Z to its left. Thus, the area between
Z = 0 and Z = 1.5 is equal to the area between Z = 0 and Z = –1.5. Remember that for values of
X greater than µ, the Z value will be positive while for X < µ, the value of Z would be
negative.
(iii) The general procedure for calculating probabilities is like this :
(a) specify clearly the relevant area under the curve which is of interest.
(b) determine the Z value (s).
(c) obtain the required area (s) with reference to the normal area table.
Example 4 : Find the area under the normal curve :
(i) between Z = 0 and Z = 1.20
(ii) between Z = 1.0 and Z = 2.43
(iii) to the right of Z = 1.37
(iv) between Z = –1.3 and Z = 1.49
(v) to the right of Z = –1.78
Solution : For each of these, the relevant portions under the normal curve are shown shaded and the
areas determined with reference to the normal area table.
Fig. 2
(i) Area between Z = 0 and Z = 1.20 is 0.3849.
Fig. 3
(ii) Area between Z = 0 and Z = 1.0 is 0.3413
Area between Z = 0 and Z = 2.43 is 0.4925
∴ Area between Z = 1.0 and Z = 2.43 is 0.4925 – 0.3413 = 0.1512.
Fig. 4
(iii) Area between Z = 0 and Z = 1.37 is 0.4147.
Total area under the curve being equal to 1, the area to the right of Z = 0 is 0.5, as is the area
to the left of it.
∴ Area beyond Z = 1.37 is 0.5000 – 0.4147 = 0.0853.
Fig. 5
(iv) Area between Z = 0 and Z = –1.3 is 0.4032.
Area between Z = 0 and Z = 1.49 is 0.4319.
∴ Area between Z = –1.3 and Z = 1.49 is 0.4032 + 0.4319 = 0.8351.
Fig. 6
(v) Area between Z = 0 and Z = – 1.78 is 0.4625.
Area to the right of Z = 0 is 0.5.
∴ Area to the right of Z = – 1.78 is 0.4625 + 0.5 = 0.9625.
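The five areas of Example 4 can be cross-checked with `statistics.NormalDist` from the Python standard library, whose `cdf` method gives the area to the left of a value. This is a sketch for verification only; the dictionary keys are illustrative labels.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

areas = {
    "i": Z.cdf(1.20) - Z.cdf(0),       # between Z = 0 and Z = 1.20
    "ii": Z.cdf(2.43) - Z.cdf(1.0),    # between Z = 1.0 and Z = 2.43
    "iii": 1 - Z.cdf(1.37),            # to the right of Z = 1.37
    "iv": Z.cdf(1.49) - Z.cdf(-1.3),   # between Z = -1.3 and Z = 1.49
    "v": 1 - Z.cdf(-1.78),             # to the right of Z = -1.78
}

for part, area in areas.items():
    print(part, round(area, 4))
```

The computed values should agree with the table-based answers to within about one unit in the fourth decimal place; small differences arise because the table entries are themselves rounded before being added or subtracted.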
Example 5 : Balls are tested by dropping them from a certain height and measuring the bounce. A ball is said to be fast if
it rises above 36 inches. The height of the bounce may be taken to be normally distributed with
mean 33 inches and standard deviation of 1.2 inches. If a ball is drawn at random, what is the
chance that it would be fast ?
Solution : The given information is depicted in figure 7. Here we have to calculate the probability
that the height of the bounce, X, would be greater than 36. This is shown shaded in figure 7.
Fig. 7
We have,
X = 36, µ = 33 and σ = 1.2
$Z = \dfrac{X - \mu}{\sigma} = \dfrac{36 - 33}{1.2} = 2.5$
From the normal area table, area between Z = 0 and Z = 2.5 is equal to 0.4938. So area beyond
Z = 2.5 is 0.5 – 0.4938 = 0.0062. Therefore, P(X > 36) = 0.0062, the chance of getting a fast ball.
Example 6 : The life (x) of electric bulbs in hours is supposed to be normally distributed as
$y = \dfrac{1}{19\sqrt{2\pi}}\; e^{-\frac{(x-155)^{2}}{722}}$
What is the probability that the life of a bulb will be :
(i) less than 117 hours (ii) more than 193 hours (iii) between 117 and 193 hours.
Solution : Given µ = 155 and σ = 19 (since 2σ² = 722)
(i) Corresponding to x = 117, the standard normal variate is
$z = \dfrac{117 - 155}{19} = -2$
Fig. 8
We have to obtain the area to the left of Z = –2 [Pr(Z < –2)].
From the table we read the area between z = 0 and z = –2 and subtract it from 0.5.
∴ 0.5 – 0.4772 = 0.0228
Hence the probability that the life of a bulb is less than 117 hours is 0.0228.
(ii) To obtain the probability that the life of the bulb is more than 193 hours, we obtain the
corresponding standard normal variate
$z = \dfrac{193 - 155}{19} = +2$
Fig. 9
By symmetry, the area to the right of Z = +2 is 0.5 – 0.4772 = 0.0228, so the probability that the
life of a bulb is more than 193 hours is also 0.0228.
(iii) The area between 117 hours and 193 hours is the area between Z = –2 and Z = +2.
Fig. 10
Hence Pr(–2 < Z < +2) = Pr (117 < x < 193)
= 0.4772 + 0.4772 = 0.9544 Ans.
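The three probabilities of Example 6 can be verified without standardising by hand, since `statistics.NormalDist` accepts the mean and standard deviation directly. The variable names below are illustrative.

```python
from statistics import NormalDist

life = NormalDist(mu=155, sigma=19)  # bulb life in hours

p_less_117 = life.cdf(117)                    # (i)  P(x < 117)
p_more_193 = 1 - life.cdf(193)                # (ii) P(x > 193)
p_between = life.cdf(193) - life.cdf(117)     # (iii) P(117 < x < 193)

print(round(p_less_117, 4), round(p_more_193, 4), round(p_between, 4))
```

Because 117 and 193 lie exactly two standard deviations below and above the mean, parts (i) and (ii) are equal and the three probabilities add up to 1.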
Example 7 : The results of a particular examination are given below in a summary form :
Result                        Percentage of Candidates
Total Passed                  80
Passed with distinction       10
Failed                        20
It is known that a candidate fails if he obtains less than 40 marks (out of 100), while he must
obtain at least 75 marks in order to pass with distinction. Determine the mean and the standard
deviation of marks assuming distribution of marks to be normal.
Solution : According to the given information,
Percentage of students getting marks less than 40 = 20,
Percentage of students getting marks between 40 and 75 = 70, and
Percentage of students getting marks above 75 = 10.
The relevant area is shown in figure 11.
Fig. 11
Here P(X < 40) = 0.20, P (40 < X < 75) = 0.70 and P(X > 75) = 0.10
Let µ and σ represent the mean and standard deviation of the distribution. We have, area
between µ and X = 40 equal to 0.30, and area between µ and X = 75 as equal to 0.40.
Now we have,
For X = 40, $Z = \dfrac{40 - \mu}{\sigma}$, and
For X = 75, $Z = \dfrac{75 - \mu}{\sigma}$
Corresponding to the area 0.30 in the normal area table, Z = 0.84. Thus, for X = 40, we have
Z = –0.84 (since the value of 40 lies to the left of µ). Similarly, for the area equal to 0.40, we have Z = 1.28.
We have, then
$\dfrac{40 - \mu}{\sigma} = -0.84$ ....(i)
and
$\dfrac{75 - \mu}{\sigma} = 1.28$ ....(ii)
Rearranging the above equations, we get
µ – 0.84 σ = 40 ....(iii)
and
µ + 1.28 σ = 75 ....(iv)
Subtracting equation (iii) from equation (iv), we get
2.12 σ = 35
or σ = 35/2.12 = 16.51
Substituting the value of σ in equation (iii) and solving for µ, we get
µ – (0.84) (16.51) = 40
or µ = 40 + 13.87 = 53.87
Thus, Mean = 53.87 marks and standard deviation = 16.51 marks.
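Example 7 works backwards from areas to Z values. In code, the same step is an inverse lookup, which `statistics.NormalDist.inv_cdf` provides. The sketch below is a cross-check under that assumption; it uses unrounded Z values, so its answers differ slightly from the table-based 16.51 and 53.87.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal curve

z_fail = std.inv_cdf(0.20)  # 20% of candidates score below 40 (about -0.84)
z_dist = std.inv_cdf(0.90)  # 90% of candidates score below 75 (about 1.28)

# Two equations: 40 = mu + z_fail * sigma and 75 = mu + z_dist * sigma.
sigma = (75 - 40) / (z_dist - z_fail)
mu = 40 - z_fail * sigma

print(round(z_fail, 2), round(z_dist, 2))
print(round(mu, 2), round(sigma, 2))
```

Subtracting the two equations eliminates µ, exactly as in the hand solution; the remaining difference from the textbook figures comes only from rounding Z to two decimals.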
Example 8 : There are 900 students in B.Com (Hons.) course of a college and the probability of a
student needing a particular book on a day is 0.10. How many copies of the book should be kept in
the library so that there is at least a 0.90 chance that a student needing that book will not go
disappointed ? Assume normal approximation to the binomial distribution.
Solution : According to the given information, n = 900, p = 0.10, and q = 1 – p = 1 – 0.10 = 0.90.
Therefore, mean = np = 900 × 0.10 = 90, and $\sigma = \sqrt{npq} = \sqrt{900 \times 0.10 \times 0.90} = 9$. Here we are
required to determine X, to the right of which 10 per cent of the area under the curve lies.
Area between µ = 90 and X would be equal to 0.50 – 0.10 = 0.40. Now, Z value corresponding
to the area 0.40 equals 1.28.
Thus,
$Z = \dfrac{X - 90}{9} = 1.28$
Solving for X, we get X = 1.28 × 9 + 90 = 11.52 + 90 = 101.52.
Therefore 102 books should be kept in the library to meet the demand of students.
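The library problem can be sketched in a few lines: approximate the binomial demand by a normal curve with mean np and standard deviation √(npq), find the 90th percentile, and round up to a whole number of copies. The names below are illustrative assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist

n, p = 900, 0.10
q = 1 - p

mean = n * p              # expected daily demand, np
sigma = sqrt(n * p * q)   # standard deviation, sqrt(npq)

# Demand level that is exceeded on only 10% of days under the normal approximation.
cutoff = NormalDist(mean, sigma).inv_cdf(0.90)
copies = ceil(cutoff)     # books come in whole numbers, so round up

print(round(cutoff, 2), copies)
```

Rounding up rather than to the nearest integer is what guarantees at least a 0.90 chance of meeting demand.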
Table-1
Table of values of e–µ (0 ≤ µ ≤ 1)
µ       0       1       2       3       4       5       6       7       8       9
0.0   1.0000  0.9900  0.9802  0.9704  0.9608  0.9512  0.9418  0.9324  0.9231  0.9139
0.1   0.9048  0.8958  0.8869  0.8781  0.8694  0.8607  0.8521  0.8437  0.8353  0.8270
0.2   0.8187  0.8106  0.8025  0.7945  0.7866  0.7788  0.7711  0.7634  0.7558  0.7483
0.3   0.7408  0.7334  0.7261  0.7189  0.7118  0.7047  0.6977  0.6907  0.6839  0.6771
0.4   0.6703  0.6636  0.6570  0.6505  0.6440  0.6376  0.6313  0.6250  0.6188  0.6126
0.5   0.6065  0.6005  0.5945  0.5886  0.5827  0.5770  0.5712  0.5655  0.5599  0.5543
0.6   0.5488  0.5434  0.5379  0.5326  0.5273  0.5220  0.5169  0.5117  0.5066  0.5016
0.7   0.4966  0.4916  0.4868  0.4819  0.4771  0.4724  0.4677  0.4630  0.4584  0.4538
0.8   0.4493  0.4449  0.4404  0.4360  0.4317  0.4274  0.4232  0.4190  0.4148  0.4107
0.9   0.4066  0.4025  0.3985  0.3946  0.3906  0.3867  0.3829  0.3791  0.3753  0.3716
Note: e–0.4 = 0.6703, e–0.37 = 0.6907, e–0.99 = 0.3716
Table of values of e–µ (µ = 1, 2, ..., 10)
µ       1        2        3        4        5         6        7        8        9        10
e–µ   0.36788  0.13534  0.04979  0.01832  0.006738  0.00248  0.00091  0.00033  0.00012  0.00005
Note:
e–1.5 = e–1 × e–0.5 = 0.36788 × 0.6065 = 0.2231
e–4.34 = e–4 × e–0.34 = 0.01832 × 0.7118 = 0.01304
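The rule used in the notes, splitting µ into a whole part and a fractional part and multiplying the two table entries, can be sketched as follows. The helper name `exp_neg_from_parts` is hypothetical, and `math.exp` stands in for the printed tables.

```python
from math import exp


def exp_neg_from_parts(mu: float) -> float:
    """Compute e^(-mu) as e^(-whole) * e^(-fraction), mirroring the two-table lookup."""
    whole = int(mu)        # read from the second table (1, 2, ..., 10)
    fraction = mu - whole  # read from the first table (0.00 to 0.99)
    return exp(-whole) * exp(-fraction)


print(round(exp_neg_from_parts(1.5), 4))
print(round(exp_neg_from_parts(4.34), 5))
```

This works because e^(-(a+b)) = e^(-a) × e^(-b), so no separate table is needed for values of µ above 1.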
Table-2
Normal Curve Z-score
An entry in the table is the area under the curve between Z = 0 and a
positive value of Z. Areas for negative values of Z are obtained by
symmetry.
Z      0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1   0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2   0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3   0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4   0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5   0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6   0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7   0.2580  0.2611  0.2642  0.2673  0.2703  0.2734  0.2764  0.2794  0.2823  0.2852
0.8   0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9   0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1   0.3642  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1   0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2   0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3   0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
2.4   0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
2.5   0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
2.6   0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
2.7   0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
2.8   0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
2.9   0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
3.0   0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990
2.4 SUMMARY
2.4 SUMMARY
● Probability distributions are obtained primarily on theoretical considerations and describe how
the outcomes of an experiment are expected to vary.
● A probability distribution may involve a discrete or a continuous random variable.
● The two basic measures to describe a probability distribution are expected value and standard
deviation. For a discrete probability distribution, expected value, µ = Σpx and standard deviation,
$\sigma = \sqrt{\Sigma p\,(x - \mu)^{2}}$.
● The binomial and Poisson distributions involve discrete random variables while the normal
distribution involves a continuous random variable.
● A binomial distribution involves n independent trials, each of which can result in only two
possible outcomes, called success and failure. The probabilities of various numbers of successes
are given by the binomial expansion $(q + p)^{n}$. The binomial formula is used to calculate the
probability of x successes in n trials.
● The mean and standard deviation of a binomial distribution are given by np and $\sqrt{npq}$,
respectively. The distribution is symmetrical if p = q and skewed if p ≠ q. For a given number
of trials, the greater the difference between p and q, the greater the skewness.
● The Poisson distribution is normally used to analyse phenomena that produce rare occurrences.
So, it is called the distribution of rare events. It is defined by a single parameter m, which is its
mean value. Its mean and variance are equal.
● A Poisson distribution is positively skewed and skewness decreases as m increases.
● The Poisson distribution can also be used as an approximation to the binomial distribution when the
number of trials is large and the probability of success in a trial is very small.
● The normal distribution is an all important probability distribution. It involves a continuous
variable and has a curve that is unimodal, symmetrical and asymptotic to the x-axis.
● The normal distribution has two parameters – mean and standard deviation. For every pair of
values, there is a distinct normal distribution. The distribution with mean µ = 0 and standard
deviation σ = 1 is called standardised normal distribution.
● The proportion of area lying in a given interval to the total area under the normal curve gives
the probability that the variable in question will take a value within that interval.
● To obtain probabilities, the given values are transformed into standard normal distribution.
This is called z-transformation. It is defined as z = (x – µ)/σ. The normal area table is consulted
to determine the relevant area.
● Normal distribution can be used as an approximation to different discrete probability
distributions like binomial and Poisson distributions under appropriate conditions.
2.5 SELF ASSESSMENT QUESTIONS
Exercise 1 : True and False Statements
(i) A variable is said to be a random variable if it takes different values as a result of the
outcomes of a random experiment.
(ii) A probability distribution can also be obtained using historical data.
(iii) A discrete random variable can assume any value in a given range whereas a continuous
random variable can assume only isolated values.
(iv) The expected value, E(x), of a probability distribution is obtained by summation of the
products of the values of x and their corresponding probabilities.
(v) The two parameters of a binomial distribution are n and p.
(vi) A binomial distribution involves infinite number of trials.
(vii) A binomial distribution with n trials involves a total of n number of successes.
(viii) A binomial distribution can be completely identified by n and p.
(ix) If mean and standard deviation values are given, we can fit a binomial distribution.
(x) The variance of a binomial distribution can be smaller than, equal to or greater than its
mean value.
(xi) Regardless of the value of n, a binomial distribution is symmetrical if p = q.
(xii) A binomial distribution is positively skewed when p > q and negatively skewed when p < q.
(xiii) A Poisson distribution is positively skewed and the skewness decreases when m increases.
(xiv) The mean and standard deviation of a Poisson distribution are always equal.
(xv) A binomial distribution with p = 0.5 results in a uniform distribution involving discrete
variable.
(xvi) Binomial approximation to Poisson distribution is used when n is very large and p is very
small.
(xvii) A normal distribution is defined by mean and standard deviation.
(xviii) The standard normal distribution has µ = 1 and σ = 0.
(xix) The smaller the standard deviation, the greater the height of the normal curve.
(xx) For a normal distribution with µ = 110 and σ = 10, the area included between x = 120 and
x = 130 is equal to the area between x = 90 and x = 100.
(xxi) The curve of normal distribution is symmetrical and mesokurtic.
(xxii) µ ± 3σ covers 99.27 percent area under the normal curve.
(xxiii) In using normal approximation to the binomial distribution, the mean and standard deviation
are taken to be equal to np and npq, respectively.
Ans. 1. T, 2. T, 3. F, 4. T, 5. T, 6. F, 7. F, 8. T, 9. T, 10. F, 11. T, 12. F, 13. T, 14. F, 15. F, 16. F, 17. T,
18. F, 19. F, 20. T, 21. T, 22. F, 23. F
Exercise 2 : Questions and Answers
(i) Explain the concept of probability distributions. Give two examples of how a probability
distribution can be obtained.
(ii) Distinguish between discrete and continuous probability distributions.
(iii) What are the conditions under which a binomial distribution is used? How are the
probabilities calculated in case of a binomial distribution? Discuss the conditions under
which a binomial distribution can be approximated as (i) a normal distribution and (ii) a
Poisson distribution.
(iv) Write a note on the binomial distribution. In particular, mention its assumptions, its expected
value and variance, and the shape.
(v) Write a note on Poisson distribution. Under what conditions is it used as an approximation
to the binomial distribution?
(vi) State and explain the properties of a normal curve. Show that the height of a normal curve
at mean is the highest.
(vii) What are the parameters of (i) binomial distribution, (ii) Poisson distribution, (iii)
hypergeometric distribution, and (iv) uniform distribution?
(viii) Determine the probability of getting three heads in 6 tosses of a fair coin.
(ix) The probability that a student will graduate next year is 0.4. Determine the probability that
out of five students, each of which has the same chance of graduating, (i) none, (ii) one,
(iii) at least one, (iv) no more than one, and (v) all will graduate.
(x) The incidence of occupational disease in an industry is such that the workmen have a 30
percent chance of suffering from it. What is the probability that out of 8 workmen, 6 or
more will contract the disease?
(xi) A factory finds that, on an average, 20 percent bolts are defective from one machine. If 10
bolts are selected at random, find the probability that
(a) Exactly two bolts are defective,
(b) Less than two bolts are defective.
(c) More than two bolts are defective.
(d) More than eight bolts are defective.
(xii) A firm produces a product and finds that 10 percent of its output is defective. A small
sample of 5 items is taken from the production line. Find the probability of getting each of
the following number of defective items in the sample: 0, 1, 2, 3, 4, and 5.
(xiii) Find the probability of getting a 5 or 6 thrice in five tosses of an unbiased die.
(xiv) A sign on the gas pumps of a chain of gasoline stations encourages customers to have their
oil checked, claiming that one out of five cars needs to have oil added. If this is true, what
is the probability of the following events ?
(a) One of the next four cars needs oil.
(b) Two out of the next eight cars need oil.
(xv) The mean of a binomial distribution is 20 and its standard deviation is 4. Calculate the
values of n, p and q.
(xvi) Assuming the binomial distribution applies, find the expected value, variance, and standard
deviation of the distribution determined by n = 80 and p = 0.6.
(xvii) Calculate the probabilities of 0, 1, 2, 3, 4 and 5 heads on the toss of a set of five balanced
coins. Also, obtain the mean and standard deviation of the distribution.
(xviii) Bring out the fallacy, if any, in the following statements:
(a) The mean of a binomial distribution is 6 and its standard deviation is 3.
(b) The mean of a binomial distribution is 3 and its variance is 4.
(xix) For a binomial variable x, it is given that n = 8, and P(x = 2) = 16 P(x = 6). Determine the
values of p and q.
(xx) If the probability of a defective bolt is 0.2, find the mean and standard deviation of the
number of defective bolts in a total of 900 such bolts.
(xxi) (a) In a binomial distribution involving 5 independent trials, probabilities of 1 and 2
successes are 0.4096 and 0.2048, respectively. Find the parameter p of the distribution.
(b) In a binomial distribution with 6 independent trials, the probabilities of 3 and 4 successes
are found to be 0.2457 and 0.0819, respectively. Find mean and variance of this
distribution.
(xxii) The administrative officer of a nursing home reports that the number of patients admitted
to the ICU on any day follows a Poisson probability law, with a mean of 7. What is the
probability that on a given day
(a) No patient will be admitted?
(b) Exactly seven patients will be admitted?
(c) No more than three patients will be admitted?
(xxiii) Assume that the number of network errors experienced in a day on a local area network
(LAN) is distributed as Poisson random variable. The average number of network errors
experienced in a day is 2.4. What is the probability that in any given day
(a) Exactly one network error will result? (b) Two or more network errors will result?
(xxiv) Assume the mean height of soldiers to be 68.22 inches with variance of 10.8 inches, how
many soldiers in a regiment of 1400 would you expect to be taller than 72 inches?
(xxv) In an examination 10% of students passed with distinction, 60% passed and 30% failed. If
it is given that a candidate needs 40% marks to pass and 75% marks to pass with distinction,
determine mean and standard deviation of the distribution of marks assuming the marks
are distributed normally.
(xxvi) A project yields an average cash flow of Rs. 550 lakh and standard deviation of Rs. 110
lakh.
Compute the following probabilities :
(a) Cash flow will be more than Rs. 675 lakh,
(b) Cash flow will be less than Rs. 450 lakh and
(c) Cash flow will be between Rs. 425 and Rs. 750 lakh.
Ans. 8. 0.3125, 9. 0.0778, 0.2592, 0.92224, 0.3370, 0.01024, 10. 0.01129, 11. 0.302, 0.3758, 0.3222,
0.0000042, 12. 0.5905, 0.3281, 0.0729, 0.0081, 0.00045, 0.00001, 13. 0.165, 14. 0.4096, 0.1536,
15. 100, 0.2, 0.8, 16. 48, 19.2, 4.38, 17. 1/32, 5/32, 10/32, 10/32, 5/32, 1/32, 2.5, 1.12, 18. q > 1 in
each case, 19. 1/3 and 2/3, 20. 180, 12, 21. 0.25, 24/13, 216/169, 22. 0.0009, 0.149, 0.0818, 23.
0.2177, 0.6916, 24. 175, 25. 50.17, 19.4, 26. 0.1271, 0.1814, 0.8385
LESSON-3
STATISTICAL DECISION THEORY
3. STRUCTURE
3.0 Objective
3.1 Probability in Decision Making
3.2 Decision Making Process
3.3 Decision Under Uncertainty
3.3.1 Maximax Criterion
3.3.2 Maximin Criterion
3.3.3 Laplace Criterion
3.3.4 Minimax Regret Criterion
3.4 Decision Under Risk
3.4.1 Expectation Criterion
3.4.2 Expected Opportunity Loss
3.5 Expected Value of Perfect Information (EVPI)
3.6 Decision Tree
3.7 Summary
3.8 Self Assessment Questions
3.0 OBJECTIVE
After reading this lesson, you would be able to :
(a) Understand the steps of decision-making process
(b) Comprehend the concepts of states of nature and courses of action and compute payoff and
regret tables
(c) Learn about various decision-making rules
(d) Compute EVPI and take decisions with the help of Decision tree analysis
3.1 PROBABILITY IN DECISION MAKING
Statistical techniques are being widely used to solve business problems. These techniques are being
used to solve problems for which information is incomplete, uncertain or in some cases almost
completely lacking. This new area of statistics is known as statistical decision theory. In decision
theory we must decide among alternatives by taking into account the monetary considerations of
our actions. A manager who wants to select from among a number of available investment alternatives
should consider the profit or loss that might result from each alternative. Decision theory involves
selecting an alternative and having a reasonable idea of the economic consequences of choosing
that action.
Decision theory may be applied to problems whether the time span is one day or five years,
whether it involves financial management or a plant assembly line. Most of these problems have
common characteristics. The elements common to most decision theory problems are : (i) An
objective, (ii) Several courses of actions, (iii) A calculable measure of the benefit or worth of various
alternatives, (iv) Events beyond the control of the decision maker, and (v) Uncertainty about which
outcome or state of nature will actually happen.
Most complex managerial decisions are made with some uncertainty. Managers authorize
substantial capital investments with incomplete knowledge about product demand. When decisions
are made under uncertain future conditions, use of probabilities provides us with a rational technique
for making choices.
Example 1 : A bakery makes cakes at a cost of Rs. 6 each and sells them for Rs. 10 each. A cake not
sold on a particular day is worthless. The baker’s problem is to determine the optimum number of
cakes to be made each day. On days when his stock exceeds his sales, his profits are reduced by
the cost of unsold cakes. On days when demand exceeds his stock, he loses sales and makes smaller
profits than he could have. He has kept a record of his sales for the past 100 days to show the
historical pattern of sales.
Daily sales of cakes    No. of days sold
300                     15
400                     20
500                     45
600                     15
700                     5
Total                   100
Solution : On the basis of the above information we can assign probabilities to the sale of cakes in different
quantities. For example, a probability of 0.45 (= 45/100) is assigned to the sale figure of 500 cakes. The table
assigning probabilities to the various quantities can now be prepared.
Daily sales (events)    No. of days sold    Probability of each number being sold
E1  300                 15                  0.15
E2  400                 20                  0.20
E3  500                 45                  0.45
E4  600                 15                  0.15
E5  700                 5                   0.05
Total                   100                 1.00
We can make a table of net benefits that accrue to the decision maker from a given combination
of Act and Event. This is known as pay off matrix. For the given example we can make the pay off
table or matrix.
When the baker sells one cake he gets a profit of Rs. (10 – 6) = Rs. 4, and when a cake remains
unsold he loses Rs. 6. The pay off matrix is given below :
Pay off matrix (Rs.)
Events        Production (Decision maker’s alternative)
(Demand)      A1          A2          A3          A4          A5
Units         300         400         500         600         700
E1  300       1200        600         0           (600)       (1200) £
E2  400       1200        1600 *      1000        400         (200) £
E3  500       1200        1600        2000 *      1400        800 £
E4  600       1200 £      1600        2000        2400 *      1800
E5  700       1200 £      1600        2000        2400        2800 *
We have calculated the profits or losses for the various situations. When demand and production
match, profit is at its maximum; when production exceeds demand, profit falls because
the unsold quantity is thrown away and its cost is not recovered. For example, when both production and
demand are 300 units, there is a pay off (profit) of 300 × 4 = Rs. 1200. When demand is 400 but
production is 300 units, the profit is again Rs. 1200 because he can sell only 300 units and earn
300 × 4 = Rs. 1200. But when production is 400 units and demand is 300 units, he earns 300 × 4 =
Rs. 1200 but loses 100 × 6 = Rs. 600 on account of the unsold quantity of 100 cakes, and the net pay off is
just Rs. 600, i.e. (1200 – 600). When production is 500 units and sales are 300 units he neither earns nor
loses: by selling 300 units he makes a profit of 300 × 4 = Rs. 1200, but he loses Rs. 1200, the cost of
the 200 units that remained unsold, and the net result is zero profit. But when he produces 600 cakes and
sells only 300 he incurs a loss of Rs. 600 [(300 × 4) – (300 × 6) = 1200 – 1800 = – 600].
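The reasoning above can be checked with a short sketch. It assumes only the figures stated in Example 1 (price Rs. 10, cost Rs. 6, production and demand levels of 300 to 700 cakes); the names payoff, levels and matrix are illustrative and not part of the lesson.

```python
def payoff(produced, demanded, price=10, cost=6):
    """Net pay off (Rs.) for one day: margin on cakes sold minus cost of unsold cakes."""
    sold = min(produced, demanded)      # cannot sell more than is produced or demanded
    unsold = produced - sold            # leftover cakes are worthless
    return sold * (price - cost) - unsold * cost

levels = [300, 400, 500, 600, 700]

# Rows are events (demand), columns are acts (production), as in the pay off matrix.
matrix = {d: [payoff(q, d) for q in levels] for d in levels}

for d in levels:
    print(d, matrix[d])
```

Each printed row lists the pay offs of A1 to A5 for one level of demand; negative numbers correspond to the bracketed losses in the matrix.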
3.2 DECISION-MAKING PROCESS
As indicated, decision theory is used to determine the optimal strategy where a decision-maker is
faced with several decision alternatives and an uncertain pattern of future events. All decision-making situations are characterised by the fact that two or more alternative courses of action are
available to the decision-maker to choose from. Further, a decision may be defined as the selection
by the decision-maker of an act, considered to be best according to some pre-designated standard,
from among the available options. The decision-making process uses the following steps:
(a) Identification of the various possible outcomes, called states of nature or events. The events
are beyond the control of the decision-maker.
(b) Identification of all the courses of action, Aj, or the strategies which are available to the
decision-maker. The decision-maker has control over choice of these.
(c) Determination of the pay-off function which describes the consequences resulting from
different combinations of the acts and events.
(d) Choosing from among the various alternatives on the basis of some criterion, which may
involve the information given in step (c) only or some additional information.
Example 2 : A toy making company is bringing out a new type of toy. It is considering whether to
bring out a full, partial or minimal product line. The company has three levels of product acceptance.
The management will make its decision on the basis of anticipated profit from the first year of
production. The relevant data are given in the table below :
Anticipated Profit
Product         Product Line
Acceptance      Full            Partial         Minimal
Good            Rs. 80,000      Rs. 70,000      Rs. 50,000
Fair            Rs. 50,000      Rs. 45,000      Rs. 40,000
Poor            Rs. –25,000     Rs. –10,000     Rs. 0
Take the optimal decision using various decision criteria.
Solution : Let us first analyse the given information.
States of nature : The states of nature or events here are in terms of the product acceptance
which may be good, fair or poor. The management has no control over this aspect.
Courses of actions : The management has to accept one of the product lines: full, partial or
minimal. These are the choices available out of which one has to be adopted. The courses of action
are also called simply acts or strategies. The decision-maker has a control over these.
Pay-off table: A pay-off table explains the economics of the given problem. A pay-off is a
conditional value or profit/loss, or a conditional cost. It is conditional in the sense that a certain
profit/loss is associated with each course of action. Thus, the profit or loss resulting from the adoption
of a certain strategy is dependent upon particular event that may occur. To illustrate, if the management
goes in for full product line and the product acceptance is good, the company would make a profit
of Rs. 80,000.
The given pay-off values are shown in Table 1 with additional information.
Regret or Opportunity Loss Table: The outcomes of various combinations of acts and events
can also be expressed in terms of regret or opportunity loss values. The regret values are also
conditional as each one of them results from a certain combination of act and event. Regret is
defined as the amount of pay-off foregone by not adopting the optimal course of action – that which
would yield the highest pay-off, for each given event. To illustrate, in the context of our example, in
the event of good product acceptance, the full product line would yield a profit of Rs. 80,000; a
partial product line would result in a profit of Rs. 70,000; while a minimal product line would give
a profit of Rs. 50,000. The optimal course of action in this case would be full product line. Thus, at
the end of the year, if the product acceptance was good, the management would not regret if it had
gone for full product line. In case it had decided for partial product line, it would regret somewhat
while a decision of minimal product line would cause a greater regret. The regret would be Rs.
10,000 (= 80,000 – 70,000) and Rs. 30,000 (= 80,000 – 50,000), respectively, with these two policies.
Table 1 : Decision-making Using Different Criteria
Anticipated Profit (in Rs.)
Event                       Act
                            Full Product Line    Partial Product Line    Minimal Product Line
Good Product Acceptance     80,000               70,000                  50,000
Fair Product Acceptance     50,000               45,000                  40,000
Poor Product Acceptance     –25,000              –10,000                 0
Maximum                     80,000               70,000                  50,000
Minimum                     –25,000              –10,000                 0
Average                     35,000               35,000                  30,000
Table 2 : Conditional Regret Table
Event                       Act
                            Full Product Line    Partial Product Line    Minimal Product Line
Good Product Acceptance     0                    10,000                  30,000
Fair Product Acceptance     0                    5,000                   10,000
Poor Product Acceptance     25,000               10,000                  0
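As a rough illustration of how a regret table follows from a pay-off table, the sketch below rebuilds the regret values for Example 2. The dictionary layout and the function name regret_table are assumptions made for this illustration only.

```python
# Pay-offs (Rs.) from Example 2, keyed by event and then by act.
payoffs = {
    "Good": {"Full": 80000, "Partial": 70000, "Minimal": 50000},
    "Fair": {"Full": 50000, "Partial": 45000, "Minimal": 40000},
    "Poor": {"Full": -25000, "Partial": -10000, "Minimal": 0},
}

def regret_table(payoffs):
    """Regret = best pay-off available for the event minus the pay-off of each act."""
    table = {}
    for event, row in payoffs.items():
        best = max(row.values())
        table[event] = {act: best - value for act, value in row.items()}
    return table

regrets = regret_table(payoffs)
for event, row in regrets.items():
    print(event, row)
```

The regret is always measured within one event (one row), because the decision-maker can only regret not having chosen the best act for the event that actually occurred.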
The elements of a decision process are:
A decision-maker.
A set of possible outcomes, or events, in the decision situation.
A set of courses of action available to the decision-maker.
A set of conditional pay-offs corresponding to various possible combinations of events and actions.
Selection of a particular course of action based on some criterion.
After setting up the pay-off table and the regret table we proceed to take a decision. There are
several rules, or criteria, on the basis of which a decision may be taken. The selection of an appropriate
criterion depends on factors such as the nature of decision situation, attitude of the decision-maker
etc. We shall first discuss the decision rules for taking decisions in conditions of uncertainty and
then for conditions of risk.
3.3 DECISIONS UNDER UNCERTAINTY
Decision situations in which the decision-maker has no way of assessing the probabilities
of the various states of nature are called decisions under uncertainty. In such situations, the decision-maker has no idea as to which of the possible states of nature would occur, nor has he a reason to
believe that a given state is more, or less, likely to occur than another. With the probabilities of the
various outcomes not known, the actual decisions are based on specific criteria. The several principles
which may be employed for taking decisions in such conditions are discussed below.
3.3.1 Maximax Criterion
An optimist believes that, whatever course of action he chooses, the best of all the possible outcomes
will occur. This rule suggests that, for each strategy, the maximum pay-off should be considered.
Further, the maximum of these pay-offs should be chosen for decision. For our example, the maximum
profit associated with different courses of action is as follows:
Full product line: Rs. 80,000; Partial product line: Rs. 70,000; and Minimal product line: Rs.
50,000.
The highest of these corresponds to the full product line. Therefore, optimal decision is to go
for full product line.
3.3.2 Maximin Criterion
This principle is adopted by pessimistic decision-makers who are conservative in their approach. A
pessimist is one who believes that, whatever course of action he chooses, the worst of
all the possible outcomes will occur. Using this approach, the minimum pay-offs from various strategies are
considered and the maximum one is selected. It therefore chooses the best (the maximum) profit
from the set of worst (the minimum) profits.
The minimum pay-off associated with the full, partial and minimal product line is Rs. (–)
25,000, Rs. (–) 10,000 and Rs. 0, respectively. According to this criterion the management would
go for minimal product line.
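The maximax and maximin rules can be sketched in a few lines. This is only an illustration using the Example 2 pay-offs; the function names maximax and maximin are chosen here for convenience.

```python
# Pay-offs (Rs.) of each act for good, fair and poor product acceptance (Example 2).
payoffs = {
    "Full": [80000, 50000, -25000],
    "Partial": [70000, 45000, -10000],
    "Minimal": [50000, 40000, 0],
}

def maximax(payoffs):
    """Optimist's rule: pick the act whose best pay-off is the largest."""
    return max(payoffs, key=lambda act: max(payoffs[act]))

def maximin(payoffs):
    """Pessimist's rule: pick the act whose worst pay-off is the largest."""
    return max(payoffs, key=lambda act: min(payoffs[act]))

print(maximax(payoffs))  # Full
print(maximin(payoffs))  # Minimal
```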
3.3.3 Laplace Criterion
The Laplace principle is based on the simple rule that if we are uncertain about various events, then
we may treat them as equally probable. Therefore the expected value of pay-off for each strategy is
calculated and the strategy with the highest mean value is adopted. The expected pay-offs for various
courses of action are calculated as :
Full product line : (80,000 + 50,000 – 25,000)/3 = Rs. 35,000
Partial product line : (70,000 + 45,000 – 10,000)/3 = Rs. 35,000
Minimal product line : (50,000 + 40,000 + 0)/3 = Rs. 30,000
Since the highest expected pay-off is shared by the strategies of full product line and partial
product line, both could be adopted by the management.
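A minimal sketch of the Laplace rule follows, again on the Example 2 pay-offs. Returning every act that ties for the highest average is a design choice made for this illustration, since the example itself produces a tie.

```python
# Pay-offs (Rs.) of each act for good, fair and poor product acceptance (Example 2).
payoffs = {
    "Full": [80000, 50000, -25000],
    "Partial": [70000, 45000, -10000],
    "Minimal": [50000, 40000, 0],
}

def laplace(payoffs):
    """Treat all events as equally likely and average the pay-offs of each act."""
    averages = {act: sum(values) / len(values) for act, values in payoffs.items()}
    best = max(averages.values())
    # Keep every act that reaches the best average, so that ties are visible.
    best_acts = [act for act, avg in averages.items() if avg == best]
    return averages, best_acts

averages, best_acts = laplace(payoffs)
print(averages)
print(best_acts)
```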
3.3.4 Minimax Regret Criterion
The Minimax Regret principle is based on the concept of regret. According to this principle, the
course of action that minimises the maximum regret is selected. It is also known as the Savage principle.
First the regret matrix is derived from the pay-off matrix. Then the maximum regret value
corresponding to each of the strategies is determined and the strategy which minimises the maximum
regret is chosen. The principle of choice is also conservative in approach and is very close to the
minimax principle applied to the original matrix containing pay-off values.
From the regret matrix given in Table 2 we get the following maximum regret values associated
with the various courses of action :
Full product line : Rs. 25,000
Partial product line : Rs. 10,000
Minimal product line : Rs. 30,000
The maximum regret value is lowest for the strategy of partial product line, which is therefore the
optimal choice.
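The same selection can be written as a short sketch. It takes the regret values of Table 2 as given; the variable names are illustrative.

```python
# Regret values (Rs.) from Table 2, per act, for good, fair and poor acceptance.
regrets = {
    "Full": [0, 0, 25000],
    "Partial": [10000, 5000, 10000],
    "Minimal": [30000, 10000, 0],
}

# Step 1: the maximum regret of each act.
max_regret = {act: max(values) for act, values in regrets.items()}

# Step 2: the act with the smallest maximum regret.
choice = min(max_regret, key=max_regret.get)

print(max_regret)
print(choice)  # Partial
```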
When probabilities of various events are not given, the decision criteria could be:
1. Considering best of the best pay-off for every act.
2. Considering best of the worst pay-off for every act.
3. Considering best of the weighted average of the best and worst pay-offs under every act.
4. Considering best of the simple average of all pay-offs under every act.
5. Considering act with least of maximum regret values associated with all acts.
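Criterion 3 in the list above (the weighted average of the best and worst pay-offs) is not worked out in the text, so the sketch below is only a hedged illustration of it on the Example 2 pay-offs. The weight alpha and the value 0.5 used here are assumptions for the illustration, not figures from the lesson.

```python
# Pay-offs (Rs.) of each act for good, fair and poor product acceptance (Example 2).
payoffs = {
    "Full": [80000, 50000, -25000],
    "Partial": [70000, 45000, -10000],
    "Minimal": [50000, 40000, 0],
}

def weighted_best_worst(payoffs, alpha):
    """Score each act as alpha * (best pay-off) + (1 - alpha) * (worst pay-off)."""
    scores = {
        act: alpha * max(values) + (1 - alpha) * min(values)
        for act, values in payoffs.items()
    }
    return scores, max(scores, key=scores.get)

# alpha = 0.5 is a hypothetical weight chosen only for this example.
scores, choice = weighted_best_worst(payoffs, alpha=0.5)
print(scores)
print(choice)
```

With alpha = 1 the rule reduces to criterion 1 (best of the best), and with alpha = 0 it reduces to criterion 2 (best of the worst).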
3.4 DECISION UNDER RISK
The decision situations wherein the decision-maker chooses to consider several possible outcomes
and the probabilities of their occurrence can be stated are called decisions under risk. The probabilities
of various outcomes may be given or they may be determined from the past records.
Under conditions of risk, there are generally two criteria to choose from.
(a) Expectation Criterion (b) Expected opportunity loss or expected regret
3.4.1 Expectation Criterion
Decision making in situations of risk is on the basis of the expectation principle with the event
probabilities assigned. The expected pay off for each strategy is calculated by multiplying the pay
off values with their respective probabilities and then adding up these products. The strategy with
the highest expected pay off represents the optimal choice. Symbolically, $EP_j = \sum_{i=1}^{n} p_i a_{ij}$, where $a_{ij}$ is
the pay off resulting from the combination of the ith event and the jth act, and $p_i$ represents the probability of
the ith event.
In the bakery example (Example 1), we are given the probabilities of the various events and we can determine the expected
pay offs.
Events      Probabilities    A1       A2       A3       A4       A5
            (Pi)             300      400      500      600      700
E1  300     0.15             1200     600      0        –600     –1200
E2  400     0.20             1200     1600     1000     400      –200
E3  500     0.45             1200     1600     2000     1400     800
E4  600     0.15             1200     1600     2000     2400     1800
E5  700     0.05             1200     1600     2000     2400     2800
The calculation of expected pay off values EP for various acts is shown below :
for A1, EP1 = 0.15 × 1200 + 0.20 × 1200 + 0.45 × 1200 + 0.15 × 1200 + 0.05 × 1200 = Rs. 1200
for A2, EP2 = 0.15 × 600 + 0.20 × 1600 + 0.45 × 1600 + 0.15 × 1600 + 0.05 × 1600 = Rs. 1450
for A3, EP3 = 0.15 × 0 + 0.20 × 1000 + 0.45 × 2000 + 0.15 × 2000 + 0.05 × 2000 = Rs. 1500
for A4, EP4 = 0.15 × –600 + 0.20 × 400 + 0.45 × 1400 + 0.15 × 2400 + 0.05 × 2400 = Rs. 1100
for A5, EP5 = 0.15 × –1200 + 0.20 × –200 + 0.45 × 800 + 0.15 × 1800 + 0.05 × 2800 = Rs. 550
Since maximum expected pay off is associated with strategy A3, the best course of action is to
produce 500 cakes.
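The expected pay offs can be reproduced with a small sketch that recomputes each pay off from the unit profit of Rs. 4 and the unit loss of Rs. 6 given in Example 1. The helper names are illustrative.

```python
levels = [300, 400, 500, 600, 700]               # demand events and production acts
probabilities = [0.15, 0.20, 0.45, 0.15, 0.05]   # probability of each demand level

def payoff(produced, demanded):
    """Rs. 4 profit per cake sold, Rs. 6 loss per cake left unsold."""
    sold = min(produced, demanded)
    return 4 * sold - 6 * (produced - sold)

def expected_payoff(produced):
    """Sum of probability times pay off over all demand events."""
    return sum(p * payoff(produced, d) for p, d in zip(probabilities, levels))

expected = {q: round(expected_payoff(q), 2) for q in levels}
best_act = max(expected, key=expected.get)

print(expected)
print(best_act)  # 500
```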
3.4.2 Expected Opportunity Loss
The expected opportunity loss or expected regret criterion is another basis on which a decision may
be taken. For this purpose the opportunity loss matrix, along with the probability distribution, is
reproduced below:
Calculation of expected Regret:
Event       Probability      Acts
            (Pi)             A1       A2       A3       A4       A5
                             300      400      500      600      700
E1  300     0.15             0        600      1200     1800     2400
E2  400     0.20             400      0        600      1200     1800
E3  500     0.45             800      400      0        600      1200
E4  600     0.15             1200     800      400      0        600
E5  700     0.05             1600     1200     800      400      0
Now we can determine the expected regret (ER) for the various strategies :
for A1, ER1 = 0.15 × 0 + 0.20 × 400 + 0.45 × 800 + 0.15 × 1200 + 0.05 × 1600 = Rs. 700
for A2, ER2 = 0.15 × 600 + 0.20 × 0 + 0.45 × 400 + 0.15 × 800 + 0.05 × 1200 = Rs. 450
for A3, ER3 = 0.15 × 1200 + 0.20 × 600 + 0.45 × 0 + 0.15 × 400 + 0.05 × 800 = Rs. 400
for A4, ER4 = 0.15 × 1800 + 0.20 × 1200 + 0.45 × 600 + 0.15 × 0 + 0.05 × 400 = Rs. 800
for A5, ER5 = 0.15 × 2400 + 0.20 × 1800 + 0.45 × 1200 + 0.15 × 600 + 0.05 × 0 = Rs. 1350
Under this criterion, the optimal strategy is the one which minimizes the expected regret.
Since the minimum value occurs at A3, it represents the optimal decision. This is the same decision as under
the expected pay off criterion.
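A sketch of the expected regret calculation is given below. It derives each regret from the Example 1 pay offs rather than copying the table, and the function names are chosen for this illustration only.

```python
levels = [300, 400, 500, 600, 700]
probabilities = [0.15, 0.20, 0.45, 0.15, 0.05]

def payoff(produced, demanded):
    """Rs. 4 profit per cake sold, Rs. 6 loss per cake left unsold."""
    sold = min(produced, demanded)
    return 4 * sold - 6 * (produced - sold)

def regret(produced, demanded):
    """Best pay off attainable for this demand minus the pay off actually obtained."""
    best = max(payoff(q, demanded) for q in levels)
    return best - payoff(produced, demanded)

expected_regret = {
    q: round(sum(p * regret(q, d) for p, d in zip(probabilities, levels)), 2)
    for q in levels
}
best_act = min(expected_regret, key=expected_regret.get)

print(expected_regret)
print(best_act)  # 500
```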
3.5 EXPECTED VALUE OF PERFECT INFORMATION (EVPI)
Assuming that we can obtain a perfect prediction about the future, and that we know the cost of this information,
we can compare the cost of that information with the additional profit we would realize as a result
of having the information. The difference between the expected profit with perfect information (EPPI) and the
expected profit with the optimal policy is known as EVPI.
In our example, if the bakery shop knows in advance that the day’s demand is going to be 300 units, it will
produce 300 units and earn a profit of Rs. 1200. Since the probability of this event is 0.15, the expected
profit is 1200 × 0.15 = Rs. 180. When
the shop manager knows that demand will be 400 units he will earn Rs. 1600 and since the probability
of this event is 0.20 his expected profit will be 0.20 × 1600 = Rs. 320. Similarly the expected pay
off for each level of demand can be obtained and aggregated. The total of these expected profits is the
EPPI.
EPPI = 0.15 × 1200 + 0.20 × 1600 + 0.45 × 2000 + 0.15 × 2400 + 0.05 × 2800 = Rs. 1900
EVPI = EPPI – maximum expected pay off = 1900 – 1500 = Rs. 400
Alternatively, for every act,
EPPI = EPj + ERj
For A1, EP1 = 1200 and ER1 = 700 and EPPI = 1200 + 700 = Rs. 1900
For A2, EP2 = 1450 and ER2 = 450 and EPPI = 1450 + 450 = Rs. 1900
For A3, EP3 = 1500 and ER3 = 400 and EPPI = 1500 + 400 = Rs. 1900 and similarly for A4 and A5.
EVPI therefore equals the expected regret of the optimal act A3, i.e. Rs. 400.
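The following sketch works through the perfect-information calculation for the bakery. It takes EVPI as the expected pay off with perfect information (EPPI) minus the best expected pay off, which is the relation EVPI = EPPI – EMV used in Examples 3 and 5 of this lesson; the variable names are illustrative.

```python
levels = [300, 400, 500, 600, 700]
probabilities = [0.15, 0.20, 0.45, 0.15, 0.05]

def payoff(produced, demanded):
    """Rs. 4 profit per cake sold, Rs. 6 loss per cake left unsold."""
    sold = min(produced, demanded)
    return 4 * sold - 6 * (produced - sold)

def expected_payoff(produced):
    return sum(p * payoff(produced, d) for p, d in zip(probabilities, levels))

# With perfect information, production always matches the demand that occurs.
eppi = round(sum(p * payoff(d, d) for p, d in zip(probabilities, levels)), 2)

# Without it, the best the baker can do is the act with the highest expected pay off.
best_expected = round(max(expected_payoff(q) for q in levels), 2)

evpi = round(eppi - best_expected, 2)
print(eppi, best_expected, evpi)
```

The result is the most the baker should be willing to pay, per day, for a perfect forecast of demand.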
Example 3 : A grocery shop is faced with the problem of how many cakes to buy in order to meet
the day’s demand. The left over cakes are a total loss. If the customer’s demand is not satisfied, the
sales will be lost. The shopkeeper has got the information regarding past sales for past 200 days :
Sales per day    No. of days    Probability
25               20             0.10
26               60             0.30
27               100            0.50
28               20             0.10
(i) Prepare the payoff matrix and opportunity loss (regret) matrix.
(ii) Find the optimal number of cakes that should be bought each day.
(iii) Find EVPI.
The cost of a cake is Rs. 8 and it is sold for Rs. 10 each.
Solution :
(i) Profit per cake sold = Rs. 10 – Rs. 8 = Rs. 2
Profit = (Cakes sold × profit per cake) – (Cakes unsold × cost price)
Opportunity loss (regret) = Maximum profit in a row – Profit under each column in that row.
Pay off Table (Rs.)
Demand      Cakes bought                        Probability
            25       26       27       28
25          50       42       34       26       0.10
26          50       52       44       36       0.30
27          50       52       54       46       0.50
28          50       52       54       56       0.10
EMV         50       51       49       42
Regret Table (Rs.)
Demand      Cakes bought                        Probability
            25       26       27       28
25          0        8        16       24       0.10
26          2        0        8        16       0.30
27          4        2        0        8        0.50
28          6        4        2        0        0.10
EOL         3.20     2.20     4.20     11.20
(ii) Now we can calculate expected monetary value and expected opportunity loss
EMV = Σ (Profit Column × Probability column)
EOL = Σ (Regret Column × Probability column)
Max. EMV is 51 corresponding to 26 cakes and min. EOL is 2.20 corresponding to 26 cakes
∴ optimum number of cakes to be purchased is 26
(iii) EVPI = Σ (Max. Pay off in each row × Corresponding Probability) – Max. EMV
= (50 × .10 + 52 × .30 + 54 × .50 + 56 × .10) – 51
= (5 + 15.60 + 27 + 5.60) – 51 = 53.20 – 51 = Rs. 2.20
Example 4 : The payoff (in Rs.) of three acts A1, A2 and A3 and the possible states of nature S1, S2
and S3 are given below:
States of Nature    Acts
                    A1       A2       A3
S1                  –20      –50      200
S2                  200      –100     –50
S3                  400      600      300
The probabilities of the states of nature are 0.3, 0.4 and 0.3 respectively. Determine the optimal
act using the expectation principle.
Solution :
States of Nature    Acts                          Probabilities
                    A1       A2       A3
S1                  –20      –50      200         0.3
S2                  200      –100     –50         0.4
S3                  400      600      300         0.3
Expected profit for Act A1 = – 20 × 0.3 + 200 × 0.4 + 400 × 0.3
= – 6 + 80 + 120 = Rs. 194
Expected profit for Act A2 = – 50 × 0.3 – 100 × 0.4 + 600 × 0.3
= – 15 – 40 + 180 = Rs. 125
Expected profit for Act A3 = 200 × 0.3 – 50 × 0.4 + 300 × 0.3
= 60 – 20 + 90 = Rs. 130
∴ Act A1 is the Optimal Act.
Example 5 : Each unit of a product produced and sold yields a profit of Rs. 50 but a unit produced
but not sold results in a loss of Rs. 30. The probability distribution of the number of units demanded
is as follows :
No. of Units Demanded
Probability
0
0.20
1
0.20
2
0.25
3
0.30
4
0.05
How many units should be produced to maximise the expected profits? Also calculate EVPI.
Solution :
Given : Profit for units produced and sold = Rs. 50
Loss for units produced and not sold = Rs. 30
Pay Off Table (Rs.)
                         Production
Demand  Probability      0                1                2                3                4
                         Payoff   EMV     Payoff   EMV     Payoff   EMV     Payoff   EMV     Payoff   EMV
0       0.20             0        0       (30)     (6)     (60)     (12)    (90)     (18)    (120)    (24)
1       0.20             0        0       50       10      20       4       (10)     (2)     (40)     (8)
2       0.25             0        0       50       12.50   100      25      70       17.50   40       10
3       0.30             0        0       50       15      100      30      150      45      120      36
4       0.05             0        0       50       2.50    100      5       150      7.50    200      10
Total                             0                34               52               50               24
Note : Values in brackets are negative.
We should produce 2 units because EMV = Rs. 52 (maximum).
Further, EVPI = EPPI – EMV
EPPI Table
Demand    Probability    Max. pay off    Max. pay off × Probability
0         0.20           0               0 × 0.20 = 0
1         0.20           50              50 × 0.20 = 10
2         0.25           100             100 × 0.25 = 25
3         0.30           150             150 × 0.30 = 45
4         0.05           200             200 × 0.05 = 10
Total (EPPI)                             90
∴ EVPI = 90 – 52 = Rs. 38
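The figures of Example 5 can be recomputed from the stated profit of Rs. 50 per unit sold and loss of Rs. 30 per unit unsold. The sketch below is an independent check written for this purpose; its names (payoff, emv, eppi, evpi) are illustrative.

```python
demand = [0, 1, 2, 3, 4]
probability = [0.20, 0.20, 0.25, 0.30, 0.05]

def payoff(produced, demanded, profit=50, loss=30):
    """Profit on units sold minus loss on units produced but not sold."""
    sold = min(produced, demanded)
    return profit * sold - loss * (produced - sold)

# Expected monetary value of each production level.
emv = {
    q: round(sum(p * payoff(q, d) for p, d in zip(probability, demand)), 2)
    for q in demand
}
best = max(emv, key=emv.get)

# Expected pay off with perfect information: production always equals demand.
eppi = round(sum(p * payoff(d, d) for p, d in zip(probability, demand)), 2)
evpi = round(eppi - emv[best], 2)

print(emv)
print(best, eppi, evpi)
```

Producing 2 units gives the highest EMV (Rs. 52), and the gap between EPPI and that EMV is the EVPI of Rs. 38.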
Example 6 : A physician purchases a particular vaccine on Monday of each week. The vaccine
must be used within the week following, otherwise it becomes worthless. The vaccine costs Rs. 2
per dose and the physician charges Rs. 4 per dose. In the past 50 weeks, the physician has administered
the vaccine in the following quantities:
Doses per Week    Number of Weeks
20                5
25                15
50                25
60                5
On the basis of EMV, find how many doses the physician must purchase each week to maximise
his profit ?
Solution : Given : Cost = Rs. 2.00, Price = Rs. 4.00
Profit = Rs. 4 – Rs. 2 = Rs. 2.00 per dose
We shall calculate pay off for different actions :
P11 (Demand/Dose = 20) = 20 × 2 = Rs. 40
P12 = 40 – 10 = 30
P13 = 40 – 60 = –20
P14 = 40 – 80 = –40
P21 = 40 – 0 = 40
P22 = 50 – 0 = 50
P23 = 50 – 50 = 0
P24 = 50 – 70 = –20
P31 = 40 – 0 = 40
P32 = 50 = 50
P33 = 50 × 2 = 100
P34 = 100 – 20 = 80
P41 = 20 × 2 = 40
P42 = 25 × 2 = 50
P43 = 50 × 2 = 100
P44 = 60 × 2 = 120
Pay Off Table
Probability        Demand    Doses per week
                             A1       A2       A3       A4
                             20       25       50       60
(5/50) = 0.1       20        40       30       –20      –40
(15/50) = 0.3      25        40       50       0        –20
(25/50) = 0.5      50        40       50       100      80
(5/50) = 0.1       60        40       50       100      120
∴ EMV for A1 = $\sum_{i=1}^{n} p_i x_i$
= (0.1 × 40) + (0.3 × 40) + (0.5 × 40) + (0.1 × 40)
= 4 + 12 + 20 + 4 = Rs. 40
EMV for A2 = (0.1 × 30) + (0.3 × 50) + (0.5 × 50) + (0.1 × 50)
= 3 + 15 + 25 + 5 = Rs. 48
EMV for A3 = (0.1 × –20) + (0.3 × 0) + (0.5 × 100) + (0.1 × 100)
= – 2 + 0 + 50 + 10 = Rs. 58
EMV for A4 = (0.1 × –40) + (0.3 × –20) + (0.5 × 80) + (0.1 × 120)
= – 4 – 6 + 40 + 12 = Rs. 42
The physician should purchase 50 doses each week because EMV for A3 is Rs. 58 (Maximum).
3.6 DECISION TREE
It is quite useful to represent the structure of a decision problem under uncertainty by a ‘decision
tree diagram’ or by “decision tree.”
A decision tree is a graphic representation of decision process. We can introduce probabilities
into the analysis of complex decisions involving (i) many alternatives, and (ii) future conditions
that are not known but can be specified in terms of a set of probabilities. The decision tree analysis
helps in making decision concerning a wide variety of problems such as project management,
personnel, new product strategies, acquisition or disposal of physical properties, investments, etc.
Decision trees have standard symbols: squares symbolize decision points and circles represent chance
events. From each square and circle, branches are drawn. These represent each possible course of action
(from a square) or each possible outcome or state of nature (from a circle).
Steps in Decision Tree Analysis
In a decision tree analysis, the decision-maker follows the following six steps :
1. Define the Problem in Structured Terms. First of all, the factors relevant to the solution
should be determined. Then probability distributions that are appropriate to describe future
behaviour of those factors are estimated.
2. Model the Decision Process. A decision tree that illustrates all the alternatives in a problem is
constructed. The entire decision process is presented in an organised step-by-step procedure.
3. Apply the Appropriate Probability Values and Financial Data. To each of the branches
and sub-branches of the decision tree the appropriate probability values and financial data are
applied. This will help us to distinguish between the probability value and conditional monetary
value associated with each outcome.
4. “Solve” the Decision Tree. Using the method explained above locate that particular branch of
the tree that has the largest expected value or that maximises the decision criteria.
5. Perform Sensitivity Analysis. Determine how the solution reacts to changes in inputs. Changing
the probability values and conditional financial values enables the decision maker to test the
magnitude and the direction of the reaction.
6. List the Underlying Assumptions. The accounting, cost finding and other assumptions used
to arrive at a function should be explained. This will help others to know what risks they are
taking when they use the results of decision tree analysis.
Advantages of the Decision Tree Approach
The decision tree analysis is important because of the following:
1. It structures the decision process enabling decisions to be made in an orderly manner.
2. It requires the decision-maker to examine all possible outcomes – desirable and undesirable.
3. It communicates the decision-making process to others.
4. It allows a group to discuss alternatives by focusing on each financial figure, probability
value and underlying assumptions – one at a time to arrive at a consensus decision instead
of debating that decision entirely.
5. It can be used with a computer so that many different sets of assumptions can be simulated
and their effects on the final outcomes can be analysed.
Example 7 : There is 40% chance that a patient admitted to hospital is suffering from cancer. A
doctor has to decide whether an operation should be performed or not. If the patient is suffering
from cancer and the serious operation is performed, the chance that he will recover is 70%, otherwise
it is 35%. On the other hand, if the patient is not suffering from cancer and the operation is performed,
the chance that he will recover is 20% otherwise it is 100%. Assume that recovery and death are the
only two possible outcomes. What decision should the doctor take?
Solution :
The chance of recovery after operation = (0.40 × 0.70) + (0.60 × 0.20) = 0.28 + 0.12 = 0.40
The chance of recovery without operation = (0.40 × 0.35) + (0.60 × 1.00) = 0.14 + 0.60 = 0.74
Since the chance of recovery without operation is greater than the chance of recovery with operation,
the doctor should not undertake the operation.
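The two recovery chances in Example 7 are weighted averages over the two possible conditions of the patient. A minimal sketch, with illustrative names, is:

```python
p_cancer = 0.40  # chance that the patient has cancer

# Chance of recovery for each (condition, action) pair, as stated in Example 7.
recovery = {
    ("cancer", "operate"): 0.70,
    ("cancer", "no operation"): 0.35,
    ("no cancer", "operate"): 0.20,
    ("no cancer", "no operation"): 1.00,
}

def chance_of_recovery(action):
    """Weight the recovery chance of each condition by the chance of that condition."""
    return (p_cancer * recovery[("cancer", action)]
            + (1 - p_cancer) * recovery[("no cancer", action)])

for action in ("operate", "no operation"):
    print(action, round(chance_of_recovery(action), 2))
```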
Example 8 : A businessman has two independent investments A and B available to him but he lacks
the capital to undertake both of them simultaneously. Investment A requires capital of Rs. 30,000
and investment B Rs. 50,000. Market survey shows: high, medium and low demands with
corresponding probabilities of 0.4, 0.4 and 0.2 respectively in case of investment A and 0.3, 0.4 and
0.3 for investment B. Returns from investment A are Rs. 75,000, Rs. 55,000 and Rs. 35,000 and
corresponding figures for investment B are likely to be Rs. 100,000, Rs. 80,000 and Rs. 70,000 for
high, medium and low demand respectively. What decision should the businessman take? Decide by
constructing an appropriate decision-tree.
Solution :
Expected net profit for investment A = 0.4 × 75,000 + 0.4 × 55,000 + 0.2 × 35,000 – 30,000
= Rs. 29,000
Expected net profit for investment B = 0.3 × 100,000 + 0.4 × 80,000 + 0.3 × 70,000 – 50,000
= Rs. 33,000
Since the expected net profit for investment B is more than A, the businessman should invest
in B.
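The comparison in Example 8 can be sketched as follows; the dictionary layout and the function name expected_net_profit are assumptions made for the illustration.

```python
# Data from Example 8: probabilities and returns for high, medium and low demand.
investments = {
    "A": {"capital": 30000, "probabilities": [0.4, 0.4, 0.2],
          "returns": [75000, 55000, 35000]},
    "B": {"capital": 50000, "probabilities": [0.3, 0.4, 0.3],
          "returns": [100000, 80000, 70000]},
}

def expected_net_profit(investment):
    """Expected return over the demand levels, less the capital required."""
    expected_return = sum(
        p * r for p, r in zip(investment["probabilities"], investment["returns"])
    )
    return expected_return - investment["capital"]

results = {name: round(expected_net_profit(inv), 2) for name, inv in investments.items()}
best = max(results, key=results.get)

print(results)
print(best)  # B
```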
Example 9 : A person has two independent investments A and B available to him, but he can
undertake only one at a time due to certain constraints. He can choose A first and then stop, or if A
is successful, then take B or vice-versa. The probability of success of A is 0.6 while for B it is 0.4.
Both investments require an initial capital outlay of Rs. 10,000 and both return nothing if the venture
is unsuccessful. Successful completion of A will return Rs. 20,000 (over cost) and successful
completion of B will return Rs. 24,000 (over cost). Draw a decision-tree and determine the best
strategy.
EVALUATION OF DECISION POINTS
Decision point              Outcome    Probability    Conditional values    Expected values
D3 (i) Accept A             Success    0.6            20,000                12,000
                            Failure    0.4            –10,000               –4,000
                                                                            8,000
   (ii) Stop                -          -              -                     -
D2 (i) Accept B             Success    0.4            24,000                9,600
                            Failure    0.6            –10,000               –6,000
                                                                            3,600
   (ii) Stop                -          -              -                     -
D1 (i) Accept A, then B     Success    0.6            20,000 + 3,600        14,160
                            Failure    0.4            –10,000               –4,000
                                                                            10,160
   (ii) Accept B, then A    Success    0.4            24,000 + 8,000        12,800
                            Failure    0.6            –10,000               –6,000
                                                                            6,800
EMV is highest when he accepts investment A and on success invests money in B.
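The backward evaluation of the decision points in Example 9 can be sketched as below. The function accept and the treatment of "Stop" as a pay off of zero are modelling assumptions made for this illustration.

```python
def accept(p_success, gain, loss, continuation=0.0):
    """EMV of accepting a venture.

    On success the venture pays `gain` plus the value of the follow-on
    decision (`continuation`); on failure the outlay `loss` is lost.
    """
    return p_success * (gain + continuation) - (1 - p_success) * loss

# Later decision points: accept the remaining venture or stop (worth 0).
d3 = max(accept(0.6, 20000, 10000), 0)   # after B succeeds: accept A or stop
d2 = max(accept(0.4, 24000, 10000), 0)   # after A succeeds: accept B or stop

# First decision point D1: start with A or start with B.
a_then_b = accept(0.6, 20000, 10000, continuation=d2)
b_then_a = accept(0.4, 24000, 10000, continuation=d3)

print(round(d3), round(d2))              # 8000 3600
print(round(a_then_b), round(b_then_a))  # 10160 6800
```

The later decision points are evaluated first, and their values are then carried back as the extra pay off on the success branch of the earlier decision.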
3.7 SUMMARY
●
Decision theory is concerned with decision-making under conditions of uncertainty. The decision
process involves the steps of identification of states of nature, courses of action and pay-off
table depicting consequences resulting from their interaction, and then choosing the appropriate
action on the basis of given criterion, pay-offs, the outcome of event-action combinations can
also be expressed in terms of regret.
●
Among the rules for taking decisions are (a) maximax/minimin: where we select best of the
best; (b) maximin/minimax: selecting best of the worst; (c) Hurwicz criterion: choosing the
best using weighted average of the best and the worst; (d) Laplace principle: selecting the act
with best average; and (e) Savage principle: involving selection of act with least of the maximum
regret values.
●
Where probabilities are used, the decision rules include (a) maximum likelihood principle
where the best act of the most probable event is selected; (b) expectation principle: in which
the act with the best expected value is chosen; and (c) expected regret criterion, where the act
with least expected value is selected, leading to identical decision as under expectation principle.
●
In addition to the optimal course of action, expected value of perfect information, EVPI, is
also calculated. It equals the expected regret value of the optimal act and represents the maximum
amount that a decision-maker would be willing to pay in case he is provided with perfect
information.
199
● The decision analysis using expectation principle can be extended to cover situations where
given probabilities of various states of nature are revised on the basis of some additional
information about them (the states). The prior probabilities are converted into posterior
probabilities using the conditional, joint and total probabilities. Such an analysis is called
posterior analysis.
● The decision-tree approach to decision-making is used in situations where multi-stage decisions
are needed. The sequences of action-event combinations available to the decision-maker are
presented graphically in the form of a decision tree. In analysing such situations, alternatives
are evaluated by proceeding in a backward manner – by evaluating the best course at later
stages to decide the best action at the earlier stages. The decision criterion is expected monetary
value, EMV.
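The rules summarised above can be sketched on a small example. The pay-off table, the state probabilities and the Hurwicz weight below are hypothetical values chosen only for illustration, and pay-offs are treated as profits (so maximax and maximin apply rather than minimin and minimax). Exact fractions are used so that comparisons are not disturbed by rounding.

```python
from fractions import Fraction

# Hypothetical pay-off table (profits); rows are acts, columns are states of nature.
payoffs = {"A1": [50, 20, -10], "A2": [30, 25, 8], "A3": [15, 15, 15]}
# Assumed probabilities of the three states (used only by the expectation rules).
probs = [Fraction(3, 10), Fraction(5, 10), Fraction(2, 10)]
alpha = Fraction(1, 2)  # Hurwicz weight on the best pay-off, between 0 and 1

def pick(scores, lowest=False):
    """Return the act with the highest (or, if lowest=True, the lowest) score."""
    chooser = min if lowest else max
    return chooser(scores, key=scores.get)

def expected(row):
    """Probability-weighted average of one row of the table."""
    return sum(p * x for p, x in zip(probs, row))

# Regret: best pay-off attainable in a state minus the pay-off actually received.
col_best = [max(col) for col in zip(*payoffs.values())]
regrets = {a: [b - x for b, x in zip(col_best, row)] for a, row in payoffs.items()}

emv = {a: expected(row) for a, row in payoffs.items()}
exp_regret = {a: expected(row) for a, row in regrets.items()}

choices = {
    "maximax": pick({a: max(r) for a, r in payoffs.items()}),
    "maximin": pick({a: min(r) for a, r in payoffs.items()}),
    "hurwicz": pick({a: alpha * max(r) + (1 - alpha) * min(r) for a, r in payoffs.items()}),
    "laplace": pick({a: Fraction(sum(r), len(r)) for a, r in payoffs.items()}),
    "savage": pick({a: max(r) for a, r in regrets.items()}, lowest=True),
    "expected pay-off": pick(emv),
    "expected regret": pick(exp_regret, lowest=True),
}

# EPPI: expected pay-off if the best act could be chosen for each state.
eppi = sum(p * b for p, b in zip(probs, col_best))
evpi = eppi - max(emv.values())

for rule, act in choices.items():
    print(f"{rule}: {act}")
print("EVPI:", float(evpi), "least expected regret:", float(min(exp_regret.values())))
```

With this table the expected pay-off and expected regret rules select the same act, and the EVPI equals the least expected regret, in line with the summary points above.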
3.8 SELF ASSESSMENT QUESTIONS
Exercise 1 : True and False Statements
(i) Decision theory is concerned with determining optimal strategies where a decision-maker
is faced with a number of alternatives and a risky pattern of future events.
(ii) Regret is the amount of money paid for not adopting the optimal course of action.
(iii) Laplace principle is based on the premise of equal chances of occurrence of possible events.
(iv) Minimax is an optimist’s choice while Minimin is a pessimist’s criterion.
(v) In Hurwicz criterion, the maximum pay-off is multiplied by α, and the minimum by 1– α,
where α may be any number between zero and 100.
(vi) In situations of decisions under uncertainty, the Laplace criterion is the least conservative
while the minimax criterion is the most conservative.
(vii) In case of pay-offs represented as profits, the Savage criterion for selecting optimal course
of action will be based on the maximin principle.
(viii) Expected pay-off and expected regret criteria would both lead to identical decisions in a
given case.
(ix) All criteria for decisions under risk inherently assume that the particular (optimal) decision
reached will be repeated a large number of times.
(x) EPPI is the expected pay-off under certainty.
(xi) EVPI is the expected regret value of any strategy.
(xii) In Bayesian approach to decision making, the optimal strategy is determined using posterior
probabilities.
(xiii) Posterior probabilities are obtained by modifying prior probabilities taking into consideration
the information from a sample.
(xiv) The decision tree approach to decision-making is appropriate in those situations where a
sequence of decisions is involved.
(xv) In the decision trees, no more than two alternative courses of action can emanate from a
decision node.
(xvi) In decision trees, the probabilities of all events at chance nodes and the monetary evaluations
of different alternatives must all be known in advance.
(xvii) The probabilities of various outcomes at each chance node should always add up to one.
(xviii) A decision taken on the basis of expected monetary value would always prove to be the
right decision.
Ans. 1. T, 2. F, 3. T, 4. F, 5. F, 6. T, 7. F, 8. T, 9. T, 10. T, 11. F, 12. T, 13. T, 14. T, 15. F, 16. T, 17. F
Exercise 2 : Questions and Answers
(i) Describe the steps involved in the process of decision-making.
(ii) What are pay-off and regret functions? How can entries in a regret table be derived from a
pay-off table?
(iii) Explain and illustrate the following principles of decision-making: (a) Laplace, (b)
Maximax, (c) Maximin, (d) Hurwicz, and (e) Savage.
(iv) How are maximum likelihood and expectation principles of choice differentiated? Do they
always lead to same decisions?
(v) Define the term EPPI. How is it calculated? What does it signify?
(vi) What do you understand by EVPI? How is it calculated?
(vii) Explain the procedure of analysing a decision tree.
(viii) The research department of Hindustan Lever has recommended that the marketing department
launch one of three different types of shampoo. The marketing manager has to decide which
type of shampoo to launch, given the following estimated pay-offs (in millions of
Rs.) for various levels of sales:
Type of Shampoo        Estimated level of sale (Units)
                       15,000      10,000      5,000
Egg shampoo            30          10          10
Clinic shampoo         40          15          5
Deluxe shampoo         55          20          3
What will be the marketing manager’s decision if (a) Maximin, (b) Maximax, (c) Laplace,
and (d) Minimax regret criterion is applied?
(ix) A food products company is contemplating the introduction of a revolutionary new product
with new packaging to replace the existing product at much higher price (S1) or a moderate
change in the composition of the existing product with a new packaging at a small increase
in price (S2) or a small change in the composition of the existing product except the word
‘new’ with negligible increase in price (S3). The three possible states of nature are: (i) high
increase in sales (N1), (ii) No change in sales (N2) and (iii) decrease in sales (N3). The
marketing department of the company worked out pay-offs in terms of yearly net profits
for each of the three strategies of these events.
Pay-offs in Rs.
Strategies      States of Nature
                N1            N2            N3
S1              7,00,000      3,00,000      1,50,000
S2              5,00,000      4,50,000      0
S3              3,00,000      3,00,000      3,00,000
Which strategy should the executive concerned choose on the basis of:
(a) Maximin criterion;
(b) Maximax criterion;
(c) Minimax regret criterion; and
(d) Laplace criterion?
(x) The oil company of India is interested in acquiring a piece of land which is considered likely to
contain oil deposits. The company has the option of (a) buying the land outright, (b) obtaining
an option to buy, drill for oil and if found exercise the option, and (c) not buying or obtaining
option. There are three possibilities on such land: large oil reserves may be found; minor
reserves may be found, or there may be no oil. The pay-offs (in lacs of Rs.) resulting from
various combinations of acts and events are tabulated below:
Acts                Buy land      Obtain option      No action
Large Reserves      40            28                 0
Minor Reserves      10            1                  0
No Oil              –25           –2                 0
What action should be taken by the company when the decision criterion is:
(a) Laplace, (b) Maximin, (c) Maximax, (d) Minimax Regret, and (e) Expected pay-off (when
the probabilities of obtaining large, minor, and no reserves are estimated to be 0.2, 0.5 and 0.3,
respectively)?
(xi) A firm is considering the purchase of some complex equipment from either of the two suppliers
S1 and S2. Supplier S1 is capable of supplying the equipment on time to meet a certain desired
deadline. The price chargeable by S1 is, however, considerably higher than that of S2. It is felt
by the management of the firm that S2 may deliver the equipment or may not be able to deliver
on time. It is even suspected that supplier S2 may never be able to deliver the equipment to the
specifications. However, the management believes that if it waits for some months, it may get
better information on S2’s capabilities of supplying the equipment.
The management is considering three alternative courses of action.
A1 : Order from S1. If later on it is clear that S2 can supply, order from S1 can be cancelled. Of
course, delay would be caused when the order is given to S2.
A2: Order from supplier S2. If it is known later on that S2 cannot supply the equipment, the order
may be switched to S1.
A3: Wait until information on S2’s capabilities is known. This would obviously cause
delay. The outcomes (profits) in the various possible situations are:
Event      Course of action
           A1       A2       A3
E1         250      100      200
E2         250      125      300
E3         250      625      450
E1 : S2 fails to deliver
E2 : S2 delivers late
E3 : S2 delivers on time
What would be the management’s decision according to each of the following criteria:
(a) Laplace, (b) Maximin, (c) Hurwicz (with α = 0.5), and (d) Minimax Regret.
(xii) An investor is given the following investment alternatives and percentage rates of return:
                    States of Nature (Market Conditions)
                    Low        Medium      High
Regular shares      2%         5%          8%
Risky shares        –5%        7%          15%
Property            –10%       10%         20%
Over the past 300 days, 150 days have been medium market conditions and 60 days have
had high market increases.
On the basis of these data, state the optimal investment strategy for the investor.
(xiii) The manager of a small departmental store must place order every week for an item
which costs Rs. 15 and sells for Rs. 25. Units not sold during the week are disposed of for
Rs. 10 each. The demand for this item is estimated to be as follows:
Demand (Units):    25      26      27      28      29      30
Probability:       0.10    0.15    0.30    0.20    0.15    0.10
(a) Determine the optimal number of units to be produced, using the expected monetary
value criterion.
(b) Determine the expected value of perfect information.
(xiv) Three types of souvenirs can be sold outside a stadium. From the following conditional
pay-off table, construct the opportunity loss table. (Sales are dependent on the winning
team.)
Types of Souvenir      I (Rs.)      II (Rs.)      III (Rs.)
Team A wins            1,200        800           300
Team B wins            250          700           1,100
Point out which type of souvenir should be bought if probability of Team A’s winning is
0.6.
(xv) Chemical Products Ltd. produces a compound which must be sold within the month it is
produced, if the normal price of Rs. 100 per drum is to be obtained. Anything unsold in
that month is sold in a different market for Rs. 20 per drum. The variable cost is Rs. 55
per drum.
During the last three years, monthly demand was recorded and showed the following
frequencies:
Monthly demand (No. of drums):    2,000    3,000    6,000
Frequency (No. of months):        8        16       12
(a) Prepare an appropriate pay-off table.
(b) Advise the production management on the number of drums that should be produced
next month.
(xvi) A stockist of a particular commodity makes a profit of Rs. 30 on each sale made within
the same week of purchase; otherwise he incurs a loss of Rs. 30 on each item.
No. of items sold within the same week:    5    6    7     8     9    10    11
Frequency:                                 0    9    12    24    9    6     0
(a) Find out the optimum number of items the stockist should buy every week in order
to maximize the profit.
(b) Calculate the expected value of perfect information.
(xvii) A physician purchases a particular vaccine on Monday of each week. The vaccine must
be used within the week following, otherwise it becomes worthless. The vaccine costs
Rs. 20 per dose and the physician charges Rs. 60 per dose. In the past 50 weeks, the
physician has administered the vaccine in the following quantities:
Doses per week:    20    25    40    60
No. of weeks:      5     15    25    5
(a) Draw up a pay-off matrix.
(b) Obtain a regret matrix.
(c) Determine the optimum number of doses the physician should buy.
(d) Determine the maximum amount the physician would be willing to pay per week for perfect
information about the number of doses expected to be demanded in a week.
Ans. 8. Egg, Deluxe, Deluxe, Deluxe; 9. S3, S1, S1, S1; 10. Obtain option, No action, Buy land,
Obtain option, Buy land or Obtain option; 11. A3, A1, A2, A2 or A3; 12. Property, Exp. Return = 6%;
13. (i) (ii); 14. Type I, ER = 340; 15. 3,000 drums; 16. 8 units, EP = 210, EVPI = 25.50; 17. 40, EMV
= 1210, EVPI = 210.
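As an illustration of how such answers can be checked, the sketch below recomputes answer 16 from the data of question (xvi). It assumes that every item left unsold at the end of the week carries the stated loss of Rs. 30, and that the observed frequencies can be used as demand probabilities. The names `payoff`, `expected_profit` and `best_stock` are illustrative only.

```python
from fractions import Fraction

PROFIT, LOSS = 30, 30  # Rs. per item sold within the week / per item left unsold
freq = {5: 0, 6: 9, 7: 12, 8: 24, 9: 9, 10: 6, 11: 0}  # items sold -> no. of weeks
total = sum(freq.values())
prob = {d: Fraction(f, total) for d, f in freq.items()}

def payoff(stock, demand):
    """Conditional pay-off when `stock` items are bought and `demand` are asked for."""
    sold = min(stock, demand)
    return PROFIT * sold - LOSS * (stock - sold)

# Expected profit of each possible stock level.
expected_profit = {s: sum(p * payoff(s, d) for d, p in prob.items()) for s in freq}
best_stock = max(expected_profit, key=expected_profit.get)

# EPPI: expected pay-off when stock always matches demand exactly.
eppi = sum(p * payoff(d, d) for d, p in prob.items())
evpi = eppi - expected_profit[best_stock]

print(best_stock, float(expected_profit[best_stock]), float(evpi))  # 8 210.0 25.5
```

Under these assumptions the sketch reproduces the printed values for question 16: 8 units, an expected profit of 210 and an EVPI of 25.50.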
❑❑❑