21
Statistics and Probability
21.1
INTRODUCTION
Statistics is as old as human society itself.
It is difficult to imagine any facet of our life untouched by numerical data. Modern society
is essentially data-oriented. It is, therefore, essential to know how to extract useful information
from such data. This is the primary objective of statistics. Statistics concerns itself with the
collection, presentation, and drawing of inferences from numerical data that vary.
In a singular sense, statistics is used to describe the principles and methods that are
employed in collection, presentation, analysis, and interpretation of data. These devices help to
simplify the complex data and make it possible for a common man to understand it without much
difficulty. The human mind is unable to assimilate complicated data at a stretch. Statistical
methods make these figures intelligible and readily understandable.
In a plural sense, statistics is considered as a numerical description of the quantitative aspect
of things.
Definition. Statistics is the science that deals with methods of collecting, classifying,
presenting, comparing, and interpreting numerical data in order to throw light on any sphere of
enquiry.
21.2
VARIABLE (OR VARIATE)
A quantity that can vary from one individual to another is called a variable or variate, e.g.,
heights, weights, ages, wages of people, rainfall records of cities, etc.
Quantities that can take any numerical value within a certain range are called continuous
variables, e.g., as a child grows, his/her height takes all possible values from 50 cm to 100 cm.
Quantities that cannot take every value within a range are called discrete or discontinuous
variables, e.g., the number of children in a family can only be a positive integer 1, 2, 3, etc.
(it takes no value between two consecutive integers).
21.3
FREQUENCY DISTRIBUTIONS
Consider the grades obtained by 60 students in mathematics:
38, 11, 40, 0, 26, 15, 5, 45, 7, 32, 2, 18, 42, 8, 31, 27, 4, 12, 35, 15, 0, 7, 28, 46, 9, 16, 29,
34, 10, 7, 5, 1, 17, 22, 35, 8, 36, 47, 11, 30, 19, 0, 16, 14, 16, 18, 41, 38, 2, 17, 42, 45, 48, 28, 7,
21, 8, 28, 5, 20.
In this form the data convey little useful information and are rather confusing. Such data are
called raw data or ungrouped data.
We would like to bring out certain salient features of this data. If we express the data in
ascending or descending order of magnitude, this does not reduce the bulk of the data. We
condense the data into classes or groups as below:
(i) Determine the range of the data, i.e., the difference between the largest and smallest
numbers occurring in the data.
Here the range = 48 – 0 = 48.
(ii) Decide upon the number of classes or groups into which raw data is to be grouped.
There are no hard and fast rules for this. The insight of the experimenter determines this number.
However, the number of classes should not be less than 5 or more than 30. With a smaller
number of classes accuracy is lost and with a larger number of classes the computations become
tedious.
Let us make the number of classes = 7 here.
(iii) Divide the range by the desired number of classes to determine the approximate width
or size of class interval. If the quotient is a fraction, take the next integer. In the above example,
the size of the class interval is \(\frac{48}{7}\), which is rounded up to 7.
As far as possible, classes should be of the same size.
(iv) Using the size of the interval, set up the class limits, making sure that the minimum and
the maximum numbers occurring in the data are included in some class. As far as possible,
open-end classes (a < x < b) should be avoided since they create difficulty in analysis and
interpretation. Boundaries of each class are selected in such a way that there is no ambiguity as to
which class a particular item of the data belongs.
(v) The observations corresponding to the common point of two classes should always be
included in the higher class, e.g., if 20 is an element of the data and 10–20 and 20–30 are two
classes, then 20 is to be set in the class 20–30 and not 10–20. That is to say every class should be
regarded as open to the right.
(vi) Take each item from the data, one at a time, and place a tally mark (/) opposite the
class to which it belongs. Tally marks are recorded in bunches of five: after a class has occurred
four times, the fifth occurrence is represented by a cross-tally ( \ or / ) drawn through the first
four tallies. This technique facilitates the counting of the tally marks at the end.
(vii) The count of tally marks in a particular class provides us with the frequency in that
class. The word “frequency” is derived from “how frequently” a variable occurs.
(viii) Grades are called the variable (x) and the number of students in a class is known as the
frequency ( f ) or class frequency of the variable.
(ix) The total of all frequencies must equal the number of observations in the raw data.
(x) The table displaying the manner in which frequencies are distributed over various
classes is called the frequency table.
(xi) We are often interested in knowing, at a glance, the number of observations less than a
particular value. This is done by finding cumulative frequency. The cumulative frequency
corresponding to a class is the sum of frequencies of that class and of all classes prior to that
class.
(xii) The table displaying the manner in which cumulative frequencies are distributed is
called the cumulative frequency table.
Using the above steps, we have the following cumulative frequency table for the example
under consideration.
Class interval    Tally marks               Frequency    Cumulative
(grades x)        (number of students)      (f)          frequency
0–7               ||||/ ||||/               10           10
7–14              ||||/ ||||/ ||            12           22
14–21             ||||/ ||||/ ||            12           34
21–28             ||||                      4            38
28–35             ||||/ |||                 8            46
35–42             ||||/ ||                  7            53
42–49             ||||/ ||                  7            60
Total                                       60
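The grouping steps above can be sketched in Python. This is a minimal illustration, not part of the original text: the function name `frequency_table` and its parameters are hypothetical, and each class is treated as open to the right, as in step (v).

```python
from itertools import accumulate

# Grades of the 60 students listed at the start of this section.
grades = [38, 11, 40, 0, 26, 15, 5, 45, 7, 32, 2, 18, 42, 8, 31, 27, 4, 12, 35, 15,
          0, 7, 28, 46, 9, 16, 29, 34, 10, 7, 5, 1, 17, 22, 35, 8, 36, 47, 11, 30,
          19, 0, 16, 14, 16, 18, 41, 38, 2, 17, 42, 45, 48, 28, 7, 21, 8, 28, 5, 20]


def frequency_table(data, start, width, n_classes):
    """Count observations in the classes [start + k*width, start + (k+1)*width)."""
    freq = [0] * n_classes
    for x in data:
        k = (x - start) // width  # index of the class that contains x
        freq[k] += 1
    return freq


n_classes = 7
data_range = max(grades) - min(grades)   # step (i): 48 - 0
width = -(-data_range // n_classes)      # step (iii): quotient rounded up to an integer
freq = frequency_table(grades, start=0, width=width, n_classes=n_classes)
cum_freq = list(accumulate(freq))        # step (xi): running totals

for k, (f, cf) in enumerate(zip(freq, cum_freq)):
    print(f"{k * width}-{(k + 1) * width}: f = {f}, C.F. = {cf}")
```

The integer division places a boundary value such as 7 in the higher class 7–14, which matches the convention of step (v).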
ILLUSTRATIVE EXAMPLES
Example 1. The weights in grams of 50 apples picked at random from a market are as
follows:
106, 107, 76, 82, 109, 107, 115, 93, 187, 195, 123, 125, 111, 92, 86, 70, 126, 68, 130, 129,
139, 119, 115, 128, 100, 186, 84, 99, 113, 204, 111, 141, 136, 123, 90, 115, 98, 110, 78, 90, 107,
81, 131, 75, 84, 104, 110, 80, 118, 82.
Form the grouped frequency table by dividing the variate range into intervals of equal
width, each corresponding to 20 gms in such a way that the mid-value of the first class
corresponds to 70 gms.
Sol. Mid-value of first class = 70 and width of each class = 20 (given).
∴ The first class interval is (70 – 10) – (70 + 10), i.e., 60–80.

Weight in grams    No. of apples (tally)      Frequency
60–80              ||||/                      5
80–100             ||||/ ||||/ |||            13
100–120            ||||/ ||||/ ||||/ ||       17
120–140            ||||/ ||||/                10
140–160            |                          1
160–180                                       0
180–200            |||                        3
200–220            |                          1
Total                                         50
Example 2. Form an ordinary frequency table from the following table:
Grades             Above 0    Above 10    Above 20    Above 30    Above 40    Above 50
No. of students    40         30          25          18          12          0
Sol.
Grades     No. of students ( f )
0–10       40 – 30 = 10
10–20      30 – 25 = 5
20–30      25 – 18 = 7
30–40      18 – 12 = 6
40–50      12 – 0 = 12
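The subtraction used in this solution can be written as a short Python sketch. The function name `frequencies_from_more_than` is hypothetical and is introduced only for illustration; it assumes the "above" counts are listed for consecutive class boundaries.

```python
def frequencies_from_more_than(cumulative):
    """Turn 'above' (more-than) counts into ordinary class frequencies.

    Each class frequency is the count above its lower limit minus the
    count above its upper limit.
    """
    return [a - b for a, b in zip(cumulative, cumulative[1:])]


# Students scoring above 0, 10, 20, 30, 40, 50 (Example 2).
more_than = [40, 30, 25, 18, 12, 0]
freq = frequencies_from_more_than(more_than)
print(freq)  # frequencies of the classes 0-10, 10-20, 20-30, 30-40, 40-50
```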
Example 3. Form an ordinary frequency table from the following:

Grades             Below 10    Below 20    Below 30    Below 40    Below 50    Below 60
No. of students    5           7           13          22          30          38

Sol.
Grades     No. of students ( f )
0–10       5
10–20      7 – 5 = 2
20–30      13 – 7 = 6
30–40      22 – 13 = 9
40–50      30 – 22 = 8
50–60      38 – 30 = 8
21.4
“EXCLUSIVE” AND “INCLUSIVE” CLASS-INTERVALS
Class-intervals of the type { x : a ≤ x < b} = [a, b) are called “exclusive” since they exclude
the upper limit of the class. The following data are classified on this basis.
Income ($)    No. of people
50–100        88
100–150       70
150–200       52
200–250       30
250–300       23
In this method, the upper limit of one class is the lower limit of the next class. In this
example, there are 88 people whose income is from $50 to $99.99. A person whose income is
$100 is included in the class $100–$150.
Class-intervals of the type { x : a ≤ x ≤ b} = [ a, b ] are called “inclusive” since they include
the upper limit of the class. The following data are classified on this basis.
Income ($)    No. of people
50–99         60
100–149       38
150–199       22
200–249       16
250–299       7
However, to ensure continuity and to get correct class-limits, the exclusive method of classification should be adopted. To convert inclusive class-intervals into exclusive ones, we have to
make an adjustment.
Adjustment. Find the difference between the lower limit of the second class and the upper
limit of the first class. Divide it by 2. Subtract the value obtained from all the lower limits and
add the value to all the upper limits.
In the above example, the adjustment factor is \(\frac{100 - 99}{2} = 0.5\). The adjusted classes
would then be as follows:

Income ($)      No. of people
49.5–99.5       60
99.5–149.5      38
149.5–199.5     22
199.5–249.5     16
249.5–299.5     7
The size of the class interval is 50.
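The adjustment can be sketched in a few lines of Python. The function name `to_exclusive` is hypothetical, and the sketch assumes the inclusive classes are listed in ascending order with the same gap between every pair of neighbouring classes.

```python
def to_exclusive(classes):
    """Convert inclusive class limits [(lower, upper), ...] to exclusive ones.

    The adjustment is half the gap between the upper limit of the first
    class and the lower limit of the second class.
    """
    adjustment = (classes[1][0] - classes[0][1]) / 2
    return [(lower - adjustment, upper + adjustment) for lower, upper in classes]


inclusive = [(50, 99), (100, 149), (150, 199), (200, 249), (250, 299)]
exclusive = to_exclusive(inclusive)
for lower, upper in exclusive:
    print(f"{lower}-{upper} (width {upper - lower})")
```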
21.5
THREE TYPES OF SERIES
In this chapter, we will come across the following three types of series:
(a) Individual Observations (i.e., where frequencies are not given).
Form \(x : x_1, x_2, x_3, \ldots, x_n\).
(b) Discrete Series. It is a series of observations of the form
\(x : x_1, x_2, x_3, \ldots, x_n\)
\(f : f_1, f_2, f_3, \ldots, f_n\)
(c) Continuous Series. It is a series of observations of the form
Class interval : \(a_1 - a_2,\ a_2 - a_3,\ \ldots,\ a_n - a_{n+1}\)
\(f : f_1, f_2, \ldots, f_n\)
For the purpose of further calculations in statistical work, the mid-point of each class is
taken to represent the class.
Thus, if \(m_i\) is the mid-point of the ith class, then \(m_i = \frac{a_i + a_{i+1}}{2}\) and the
above series takes the form
Mid-value \(m : m_1, m_2, m_3, \ldots, m_n\)
Frequency \(f : f_1, f_2, f_3, \ldots, f_n\).
The mid-value of the ith class may also be denoted by \(x_i\). Thus, a continuous series is
reduced to the form of a discrete series.
21.6
GRAPHICAL REPRESENTATION
A frequency distribution when represented by means of a graph makes the unwieldy data
intelligible. A better perspective can be had by representing the frequency distribution
graphically since graphs, if drawn attractively, are eye-catching and leave a more lasting
impression on the mind of the observer. Graphs are a good visual aid. But graphs do not give
accurate measurements of the variable as are given by the tables. Another disadvantage is that by
taking different scales, the facts may be misrepresented.
Some important types of graphs are given below:
(A) Histogram
In drawing the histogram of a given grouped frequency distribution:
(a) Mark off along the x-axis all the class intervals on a suitable scale. (If class-intervals are
equal, then each = 1 cm is quite suitable.)
(b) Mark frequencies along the y-axis on a suitable scale.
(c) It must not be assumed that the scale for both the axes will be the same. We can have
different scales for the two axes. The determination of scale depends upon our convenience and
the type and nature of the data. The scale or scales should be so chosen as to fit the size of
the graph paper and to hold all the figures of the data.
(d) Construct rectangles with the class-intervals as bases and heights proportional (if the
class intervals are equal) to the frequencies.
A diagram with all these rectangles is called a histogram.
ILLUSTRATIVE EXAMPLES
Example 1. The weights (in grams) of 40 oranges picked at random from a basket are as
follows: 45, 55, 30, 110, 75, 100, 40, 60, 65, 40, 100, 75, 70, 60, 70, 95, 85, 80, 35, 45, 40, 50,
60, 65, 55, 45, 90, 85, 75, 85, 75, 70, 110, 100, 80, 70, 55, 30, 70.
Represent the data by means of a histogram.
Sol. Range = max. (110) – min. (30) = 80
Let the number of class intervals = 7
Width of the class interval = \(\frac{80}{7}\), rounded up to 12.

Wts. of oranges    No. of oranges    Frequency
(in gms.)          (tally)
30–42              ||||/ ||          7
42–54              ||||              4
54–66              ||||/ |||         8
66–78              ||||/ ||||        9
78–90              ||||/             5
90–102             ||||/             5
102–114            ||                2
Total                                40
The histogram of the above frequency distribution is given here:
[Figure: histogram of the above frequency distribution.]
(B) Frequency Polygon
For a grouped frequency distribution with equal class-intervals, a frequency polygon is
obtained by joining the middle points of the upper sides (tops) of the adjacent rectangles of the
histogram by means of straight lines. To complete the polygon, the mid-points at each end are
joined to the immediately lower and higher mid-points at zero frequency, i.e., on the x-axis.
Example 2. The following table gives the weights (to the nearest pound) of 40 students at a
university. Construct a frequency distribution with 7 classes and draw the histogram and
frequency polygon.
138, 164, 150, 132, 144, 125, 149, 157, 146, 158, 140, 147, 136, 148, 152, 144,
168, 126, 138, 176, 163, 119, 154, 165, 146, 173, 142, 147, 135, 140, 135, 102, 145,
135, 142, 150, 156, 145, 128.
Sol. Range of raw data = max. (176) – min. (102) = 74
Number of classes = 7
∴ Width of class interval = \(\frac{74}{7}\), rounded up to 11.

Weight                    Tally marks            Frequency
(to the nearest pound)
102–113                   |                      1
113–124                   |                      1
124–135                   ||||                   4
135–146                   ||||/ ||||/ ||||       14
146–157                   ||||/ ||||/ ||         12
157–168                   ||||/                  5
168–179                   |||                    3
Total                                            40
The histogram and frequency polygon are shown here:
[Figure: histogram (rectangles) with the frequency polygon shown dotted.]
(C) Cumulative Frequency Curve or the Ogive
The curve obtained by plotting the cumulative frequency is called a cumulative frequency
curve or an ogive (pronounced ojive). There are two types of ogives.
(i) Less-than ogive. Plot the points with the upper limits of the classes as abscissae and the
corresponding less-than cumulative frequency as ordinates. Join the points by a freehand smooth
curve to get the less-than ogive. It is a rising curve. (An ogive usually means a less-than ogive.)
(ii) More-than ogive. Plot the points with the lower limits of the classes as abscissae and
the corresponding more-than cumulative frequency as ordinates. Join the points by a freehand
smooth curve to get the more-than ogive. It is a falling curve.
Consider the following frequency distribution:

Grades     No. of students
10–20      4
20–30      6
30–40      10
40–50      20
50–60      18
60–70      2

Let us convert it first into a “less-than C.F.” distribution and then into a “more-than C.F.”
distribution.

Grades less-than    No. of students
20                  4
30                  (+ 6 = ) 10
40                  (+ 10 = ) 20
50                  (+ 20 = ) 40
60                  (+ 18 = ) 58
70                  (+ 2 = ) 60

Grades more-than    No. of students
10                  60
20                  (– 4 = ) 56
30                  (– 6 = ) 50
40                  (– 10 = ) 40
50                  (– 20 = ) 20
60                  (– 18 = ) 2
70                  (– 2 = ) 0
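Both conversions amount to running sums, as the following Python sketch shows. It uses the standard-library function `itertools.accumulate`; the variable names are chosen here for illustration only.

```python
from itertools import accumulate

# Frequencies of the classes 10-20, 20-30, ..., 60-70 from the table above.
lower_limits = [10, 20, 30, 40, 50, 60]
upper_limits = [20, 30, 40, 50, 60, 70]
freq = [4, 6, 10, 20, 18, 2]

# Less-than C.F.: running total, paired with the upper limit of each class.
less_than = list(accumulate(freq))

# More-than C.F.: start from the total and subtract each class in turn,
# paired with the lower limits followed by the last upper limit.
total = sum(freq)
more_than = [total] + [total - c for c in less_than]

print(list(zip(upper_limits, less_than)))                         # points of the less-than ogive
print(list(zip(lower_limits + upper_limits[-1:], more_than)))     # points of the more-than ogive
```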
Example 3. Draw the two ogives for the following distribution showing the number of
grades of 59 students:

Grades             0–10    10–20    20–30    30–40    40–50    50–60    60–70
No. of students    4       8        11       15       12       6        3

Sol.
Grades     No. of students    Less-than C.F.    More-than C.F.
0–10       4                  4                 59
10–20      8                  12                55
20–30      11                 23                47
30–40      15                 38                36
40–50      12                 50                21
50–60      6                  56                9
60–70      3                  59                3

Plotting the points (10, 4), (20, 12), (30, 23), (40, 38), (50, 50), (60, 56), (70, 59), and
joining them by freehand, the smooth rising curve obtained is the less-than ogive.
Plotting the points (0, 59), (10, 55), (20, 47), (30, 36), (40, 21), (50, 9), (60, 3), and joining
them by freehand, the smooth falling curve obtained is the more-than ogive.
21.7
COMPARISON OF FREQUENCY DISTRIBUTIONS
When two or more different series of the same type are compared, tabulation of observations
is not sufficient. It is often desirable to define quantitatively the characteristics of the
frequency distribution.
There are two fundamental characteristics in which similar frequency distributions may
differ:
(i) They may differ in measures of location or central tendency, i.e., in the value of the
variate x around which they center.
(ii) They may differ in the extent to which observations are scattered about the central
value. Measures of this kind are called measures of dispersion.
21.8
MEASURES OF CENTRAL TENDENCY
Tabulation arranges facts in a logical order and helps their understanding and comparison.
But often, the groups tabulated are still too large for their characteristics to be readily grasped.
What is desired is a numerical expression that summarizes the characteristic of the group.
Measures of central tendency or measures of location (also popularly called averages) serve this
purpose.
A figure that is used to represent a whole series should neither have the lowest value nor the
highest in the series, but a value somewhere between these two limits, possibly in the center,
where most of the items of the series cluster. Such figures are called Measures of Central
Tendency (or averages).
There are five types of averages in common use:
1. Arithmetic Average or Mean
2. Median
3. Mode
4. Geometric Mean
5. Harmonic Mean
We shall take them one by one.
21.8.1
Arithmetic Mean
In the case of Individual Observations (i.e., where frequency is not given):
1. Direct Method. If \(x : x_1, x_2, \ldots, x_n\), then the A.M. \(\bar{x}\) is given by
\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\Sigma x. \]
2. Short Cut Method. (Shift of origin.) Shifting the origin to an arbitrary point a, the
formula
\[ \bar{x} = \frac{1}{n}\Sigma x \quad \text{becomes} \quad \bar{x} - a = \frac{1}{n}\Sigma (x - a) \]
or
\[ \bar{x} = a + \frac{1}{n}\Sigma d_x, \quad \text{where } d_x = x - a. \]
Here, a = arbitrary number, called the Assumed Mean
\(\Sigma d_x = \Sigma (x - a) = (x_1 - a) + (x_2 - a) + \cdots + (x_n - a)\)
= sum of the deviations of the variate x from a
n = number of observations.
In the case of a Discrete Series:
1. Direct Method. If the frequency distribution is
\(x : x_1, x_2, \ldots, x_n\)
\(f : f_1, f_2, \ldots, f_n\),
then
\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{\Sigma fx}{N} \]
where \(N = f_1 + f_2 + \cdots + f_n = \Sigma f\).
2. Short Cut Method. (Shift of origin.) Shifting the origin to an arbitrary point a, the
formula
\[ \bar{x} = \frac{1}{N}\Sigma fx \quad \text{becomes} \quad \bar{x} - a = \frac{1}{N}\Sigma f(x - a) \]
or
\[ \bar{x} = a + \frac{1}{N}\Sigma f d_x, \quad \text{where } d_x = x - a. \]
Thus \(\bar{x} = a + \frac{1}{N}\Sigma f d_x\), where a = assumed mean
\(\Sigma f d_x = \Sigma f(x - a) = f_1(x_1 - a) + f_2(x_2 - a) + \cdots + f_n(x_n - a)\)
= sum of the products of f and the deviation of the corresponding variate x from a.
\(N = f_1 + f_2 + \cdots + f_n = \Sigma f\).
Note. If the frequencies are given in terms of class intervals, the mid-values of the class
intervals are considered as x and then the above formulae are applied.
In the case of Continuous Series having equal class intervals, say of width h, we use a
different formula (Shift of origin and change of scale; Step Deviation Method).
Let \(u = \frac{x - a}{h}\); then \(x = a + hu\).
∴
\[ \Sigma fx = \Sigma f(a + hu) = a\Sigma f + h\Sigma fu \]
Dividing both sides by \(N = \Sigma f\), we get
\[ \frac{\Sigma fx}{N} = a + \frac{h\Sigma fu}{N} \]
or
\[ \bar{x} = a + h\,\frac{\Sigma fu}{N}, \quad \text{where } u = \frac{x - a}{h}. \]
Weighted Arithmetic Mean. If the variate-values are not of equal importance, we may
attach weights \(w_1, w_2, \ldots, w_n\) to them as measures of their importance.
The weighted mean \(\bar{x}_w\) is defined as
\[ \bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\Sigma wx}{\Sigma w} \]
(i.e., write w for f ).
ILLUSTRATIVE EXAMPLES
Example 1. Find the mean from the following data:
Grades             Below 10    Below 20    Below 30    Below 40    Below 50
No. of students    5           9           17          29          45
Grades             Below 60    Below 70    Below 80    Below 90    Below 100
No. of students    60          70          78          83          85
Sol. The frequency distribution table can be written as:

Grades     Mid values (x)    f     x − 55    u = (x − 55)/10    fu
0–10       5                 5     –50       –5                 –25
10–20      15                4     –40       –4                 –16
20–30      25                8     –30       –3                 –24
30–40      35                12    –20       –2                 –24
40–50      45                16    –10       –1                 –16
50–60      55                15    0         0                  0
60–70      65                10    10        1                  10
70–80      75                8     20        2                  16
80–90      85                5     30        3                  15
90–100     95                2     40        4                  8
                     N = Σ f = 85                       Σ fu = −56

Here
\[ \bar{x} = a + h\,\frac{\Sigma fu}{N} = 55 + 10 \times \left(\frac{-56}{85}\right) \qquad [\text{Here } a = 55,\ h = 10] \]
\[ = 55 - \frac{112}{17} = 48.41. \]
Example 2. The mean of 200 items was 50. Later on it was discovered that two items were
misread as 92 and 8 instead of 192 and 88. Find the correct mean.
Sol. Here the incorrect value of \(\bar{x} = 50\), n = 200
Since \(\bar{x} = \frac{\Sigma x}{n}\), ∴ \(\Sigma x = n\bar{x}\)
Using the incorrect value of \(\bar{x}\),
Incorrect Σx = 200 × 50 = 10000
∴ Corrected value of Σx = 10000 − (92 + 8) + (192 + 88) = 10180
\[ \text{Correct mean} = \frac{\text{Corrected } \Sigma x}{n} = \frac{10180}{200} = 50.9. \]
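The step-deviation calculation of Example 1 can be checked against the direct method with a short Python sketch. The variable names are chosen here for illustration; the assumed mean a = 55 and width h = 10 are the values used in the solution.

```python
# "Below 10", "Below 20", ..., "Below 100" counts from Example 1.
cumulative = [5, 9, 17, 29, 45, 60, 70, 78, 83, 85]

# Ordinary frequencies are successive differences of the cumulative counts.
freq = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
mids = [5 + 10 * k for k in range(10)]   # mid-values 5, 15, ..., 95
N = sum(freq)

# Direct method: mean = (sum of f*x) / N.
direct_mean = sum(f * x for f, x in zip(freq, mids)) / N

# Step-deviation method: u = (x - a) / h, mean = a + h * (sum of f*u) / N.
a, h = 55, 10
u = [(x - a) // h for x in mids]         # exact, since every x - a is a multiple of h
sum_fu = sum(f * ui for f, ui in zip(freq, u))
step_mean = a + h * sum_fu / N

print(freq, sum_fu, round(direct_mean, 2), round(step_mean, 2))
```

Both methods give the same mean; the step-deviation form only keeps the intermediate numbers small, which matters for hand computation.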
Properties of the Arithmetic Mean
Property I. The algebraic sum of the deviations of all the variates from their arithmetic
mean is zero.
Proof. Let \(d_x\) be the deviation of the variate x from the mean \(\bar{x}\); then \(d_x = x - \bar{x}\)
∴
\[ \Sigma f d_x = \Sigma f(x - \bar{x}) = \Sigma fx - \bar{x}\,\Sigma f = N\bar{x} - N\bar{x} = 0 \qquad \left[\because \bar{x} = \frac{\Sigma fx}{N}, \text{ where } N = \Sigma f\right] \]
Property II. The sum of the squares of the deviations of a set of values is minimum when
taken about the mean.
Proof. Let the frequency distribution be \(x_i / f_i\), i = 1, 2, . . . , n. Let z be the sum of the squares
of the deviations of the given values from an arbitrary point a (say), i.e., let
\[ z = \sum_{i=1}^{n} f_i (x_i - a)^2. \]
We have to show that z is minimum when \(a = \bar{x}\).
z will be minimum when \(\frac{dz}{da} = 0\) and \(\frac{d^2 z}{da^2} > 0\).
Now
\[ \frac{dz}{da} = \sum_{i=1}^{n} 2 f_i (x_i - a)\cdot(-1) = -2\sum_{i=1}^{n} f_i (x_i - a) \]
∴
\[ \frac{dz}{da} = 0 \;\Rightarrow\; -2\,\Sigma f(x - a) = 0 \;\Rightarrow\; \Sigma fx - a\,\Sigma f = 0 \]
\[ \Rightarrow\; N\bar{x} - aN = 0 \qquad \left[\because \bar{x} = \frac{\Sigma fx}{N},\ \Sigma f = N\right] \]
\[ \Rightarrow\; \bar{x} - a = 0 \qquad (\because N = \Sigma f \neq 0) \]
\[ \Rightarrow\; a = \bar{x} \]
Also
\[ \frac{d^2 z}{da^2} = -2\sum_{i=1}^{n} f_i (-1) = 2\,\Sigma f = 2N > 0 \]
Hence z is minimum when \(a = \bar{x}\).
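Properties I and II can be checked numerically. The Python sketch below is only an illustration on one data set (the grade distribution of Example 1 above), not a proof; the function name `z` mirrors the symbol used in the proof.

```python
# Mid-values and frequencies of the grade distribution in Example 1.
mids = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
freq = [5, 4, 8, 12, 16, 15, 10, 8, 5, 2]
N = sum(freq)
mean = sum(f * x for f, x in zip(freq, mids)) / N


def z(a):
    """Sum of the squared deviations of the values from the point a."""
    return sum(f * (x - a) ** 2 for f, x in zip(freq, mids))


# Property I: the deviations from the mean sum to zero (up to rounding error).
deviation_sum = sum(f * (x - mean) for f, x in zip(freq, mids))
print(deviation_sum)

# Property II: z at the mean is smaller than z at other trial points.
for a in (mean - 5, mean - 0.5, mean, mean + 0.5, mean + 5):
    print(round(a, 2), round(z(a), 2))
```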
Property III. (Mean of the composite series.)
If \(\bar{x}_i\) (i = 1, 2, . . . , k) are the arithmetic means of k distributions with respective
frequencies \(n_i\) (i = 1, 2, . . . , k), then the mean \(\bar{x}\) of the whole distribution obtained by
combining the k distributions is given by
\[ \bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum_i n_i \bar{x}_i}{\sum_i n_i} \]
Proof. Let \(x_{11}, x_{12}, x_{13}, \ldots, x_{1n_1}\) be the variables of the first distribution,
\(x_{21}, x_{22}, \ldots, x_{2n_2}\) be the variables of the second distribution, and so on. Then by definition
\[ \bar{x}_1 = \frac{1}{n_1}\left(x_{11} + x_{12} + \cdots + x_{1n_1}\right) \]
\[ \bar{x}_2 = \frac{1}{n_2}\left(x_{21} + x_{22} + \cdots + x_{2n_2}\right) \]
\[ \cdots \]
\[ \bar{x}_k = \frac{1}{n_k}\left(x_{k1} + x_{k2} + \cdots + x_{kn_k}\right) \qquad \ldots (A) \]
The mean \(\bar{x}\) of the whole distribution of size \((n_1 + n_2 + \cdots + n_k)\) is given by
\[ \bar{x} = \frac{(x_{11} + x_{12} + \cdots + x_{1n_1}) + (x_{21} + x_{22} + \cdots + x_{2n_2}) + \cdots + (x_{k1} + x_{k2} + \cdots + x_{kn_k})}{n_1 + n_2 + \cdots + n_k} \]
\[ = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum_i n_i \bar{x}_i}{\sum_i n_i} \qquad [\text{by } (A)] \]
Example 3. The mean annual salary paid to all employees of a company was $50000. The
mean annual salaries paid to male and female employees were $52000 and $42000 respectively.
Determine the percentage of males and females employed by the company.
Sol. Let \(p_1\) and \(p_2\) represent the percentage of males and females respectively.
Then \(p_1 + p_2 = 100\)    . . . (1)
Mean annual salary of all employees \((\bar{x})\) = $50000
Mean annual salary of all males \((\bar{x}_1)\) = $52000
Mean annual salary of all females \((\bar{x}_2)\) = $42000
Using \(\bar{x} = \frac{p_1\bar{x}_1 + p_2\bar{x}_2}{p_1 + p_2}\), we get
\[ 50000 = \frac{52000\,p_1 + 42000\,p_2}{100} \]
or \(520p_1 + 420p_2 = 50000\) or \(260p_1 + 210p_2 = 25000\)
or \(260p_1 + 210(100 - p_1) = 25000\)    [Using (1)]
or \(50p_1 = 25000 - 21000 = 4000\) ∴ \(p_1 = 80\) and \(p_2 = 100 - 80 = 20\)
Hence the percentage of males and females is 80 and 20 respectively.
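The same result follows from the composite-mean formula of Property III, as this Python sketch shows. The function names `combined_mean` and `share_of_first` are hypothetical; the second one simply solves the two-group formula for the percentage of the first group.

```python
def combined_mean(sizes, means):
    """Mean of the combined series: sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


def share_of_first(overall_mean, mean_1, mean_2):
    """Percentage p1 of group 1, from overall = (p1*mean_1 + (100 - p1)*mean_2) / 100."""
    return 100 * (overall_mean - mean_2) / (mean_1 - mean_2)


p1 = share_of_first(50000, 52000, 42000)   # percentage of males
p2 = 100 - p1                              # percentage of females
print(p1, p2)

# Check: combining the two groups in these proportions reproduces the overall mean.
print(combined_mean([p1, p2], [52000, 42000]))
```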
21.8.2
Median
1. The median is the central value of the variable when the values are arranged in
ascending or descending order of magnitude. When the observations are arranged in the order
of their size, the median is the value of that item that has an equal number of observations on
either side. The median divides the distribution into two equal parts. The median is, thus, a
positional average.
For the computation of a median, it is necessary that the items be arranged in ascending or
descending order.
2. For an ungrouped frequency distribution, let the n values of the variate be arranged in
ascending or descending order of magnitude.
(a) When n is odd, the middle value, i.e., the \(\left(\frac{n+1}{2}\right)\)th value, gives the median.
(b) When n is even, there are two middle values, the \(\left(\frac{n}{2}\right)\)th and the
\(\left(\frac{n}{2} + 1\right)\)th. The arithmetic mean of these two values gives the median.
3. For a discrete frequency distribution, the median is obtained by considering cumulative
frequencies. Find \(\frac{N+1}{2}\), where \(N = \Sigma f_i\). Find the cumulative frequency just
\(\geq \frac{N+1}{2}\). The corresponding value of x is the median.
4. For a grouped frequency distribution, the median is given by the formula,
\[ \text{Median} = l + \frac{h}{f}\left(\frac{N}{2} - C\right) \]
where, l = lower limit of the median class, where the median class is the class corresponding
to the cumulative frequency just \(\geq \frac{N}{2}\)
h = width of the median class; f = frequency of the median class
N = Σf ; C = cumulative frequency of the class preceding the median class.
5. Partition values. These are the values of the variate that divide the total frequency into a
number of equal parts, the median being that value of the variate that divides the total frequency
into two equal parts.
(a) Quartiles. Quartiles are those values of the variate that divide the total frequency into
four equal parts. When the lower half before the median is divided into two equal parts, the value
of the dividing variate is called the Lower Quartile and is denoted by Q1. The value of the variate
dividing the upper half into two equal parts is called the Upper Quartile and is denoted by Q3.
(Q2 being the median.) The formulae for computation are
\[ Q_1 = l + \frac{h}{f}\left(\frac{N}{4} - C\right); \qquad Q_3 = l + \frac{h}{f}\left(\frac{3N}{4} - C\right) \]
(b) Deciles. Deciles are those values of the variate that divide the total frequency into 10
equal parts. D1, D2, . . . denote respectively the first, second, . . . deciles.
\[ D_1 = l + \frac{h}{f}\left(\frac{N}{10} - C\right), \quad D_4 = l + \frac{h}{f}\left(\frac{4N}{10} - C\right), \quad D_7 = l + \frac{h}{f}\left(\frac{7N}{10} - C\right) \]
(The fifth decile D5 is the median.)
(c) Percentiles. Percentiles are those values of the variate that divide the total frequency
into 100 equal parts. If P1, P2, . . . denote respectively the first, second, . . . percentiles, then
\[ P_9 = l + \frac{h}{f}\left(\frac{9N}{100} - C\right), \quad P_{72} = l + \frac{h}{f}\left(\frac{72N}{100} - C\right) \text{ etc.} \]
(The 50th percentile P50 is the median.)
In the above formulae for Quartiles, Deciles, and Percentiles, the letters l, h, f, N, C have
been used in the same sense in which they have been used in the formula for the median, except
that l, h, f, and C now refer to the class containing the partition value concerned.
ILLUSTRATIVE EXAMPLES
Example 1. Below are given the grades obtained by a group of 20 students in a certain
class in mathematics and physics:
Roll Nos.           :  1   2   3   4   5   6   7   8   9   10
Grades in Math      :  53  54  52  32  30  60  47  46  35  28
Grades in Physics   :  58  55  25  32  26  85  44  80  33  72
Roll Nos.           :  11  12  13  14  15  16  17  18  19  20
Grades in Math      :  25  42  33  48  72  51  45  33  65  29
Grades in Physics   :  10  42  15  46  50  64  39  38  30  36
In which subject is the level of knowledge of the students higher?
Sol. To find out the subject in which the level of knowledge of the students is higher, we
find out the medians of both the series. The subject for which the median value is higher will be
the subject in which the level of knowledge of the students is higher. Let us arrange the grades in
ascending order of magnitude.
S. No.    Grades in    Grades in    S. No.    Grades in    Grades in
          Math         Physics                Math         Physics
1         25           10           11        46           42
2         28           15           12        47           44
3         29           25           13        48           46
4         30           26           14        51           50
5         32           30           15        52           55
6         33           32           16        53           58
7         33           33           17        54           64
8         35           36           18        60           72
9         42           38           19        65           80
10        45           39           20        72           85

Number of items in each case = 20 (even)
Median grades in Mathematics
= A.M. of sizes of \(\left(\frac{20}{2}\right)\)th and \(\left(\frac{20}{2} + 1\right)\)th items
= A.M. of sizes of 10th and 11th items = \(\frac{45 + 46}{2} = 45.5\).
Median grades in Physics = A.M. of sizes of 10th and 11th items = \(\frac{39 + 42}{2} = 40.5\).
Since the median grade in mathematics is greater than the median grade in physics, the
level of knowledge in mathematics is higher.
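The rule for ungrouped data (odd n: middle value; even n: mean of the two middle values) can be sketched in Python and applied to the grades of this example. The function name `median` is chosen here for illustration; Python's standard `statistics.median` applies the same rule.

```python
def median(values):
    """Median of ungrouped data: sort, then take the middle value(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # the ((n + 1)/2)th value
    return (ordered[mid - 1] + ordered[mid]) / 2   # mean of the (n/2)th and (n/2 + 1)th


# Grades of roll numbers 1 to 20, in the order given in the example.
math_grades = [53, 54, 52, 32, 30, 60, 47, 46, 35, 28,
               25, 42, 33, 48, 72, 51, 45, 33, 65, 29]
physics_grades = [58, 55, 25, 32, 26, 85, 44, 80, 33, 72,
                  10, 42, 15, 46, 50, 64, 39, 38, 30, 36]

print(median(math_grades), median(physics_grades))
```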
Example 2. Obtain the median for the following frequency distribution:
x:   1    2    3    4    5    6    7    8    9
f:   8    10   11   16   20   25   15   9    6
Sol. The cumulative frequency distribution table is given below:

x     f     C.F.
1     8     8
2     10    18
3     11    29
4     16    45
5     20    65
6     25    90
7     15    105
8     9     114
9     6     120

Here N = 120 ∴ \(\frac{N+1}{2} = 60.5\)
The cumulative frequency just greater than \(\frac{N+1}{2}\) is 65 and the value of x corresponding to
C.F. 65 is 5. Hence the median is 5.
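The cumulative-frequency search used in this solution can be sketched in Python. The function name `discrete_median` is hypothetical; the sketch assumes the values of x are listed in ascending order.

```python
from itertools import accumulate


def discrete_median(xs, fs):
    """Median of a discrete series: first x whose C.F. reaches (N + 1)/2."""
    target = (sum(fs) + 1) / 2
    for x, cf in zip(xs, accumulate(fs)):
        if cf >= target:
            return x


xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
fs = [8, 10, 11, 16, 20, 25, 15, 9, 6]
print(list(accumulate(fs)))      # the C.F. column of the table
print(discrete_median(xs, fs))
```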
Example 3. Find the median, lower, and upper quartiles from the following table:
Grades             Below 10    Below 20    Below 30    Below 40
No. of students    15          35          60          84
Grades             Below 50    Below 60    Below 70    Below 80
No. of students    94          127         198         249
Sol. From the above table, we reconstruct the C.F. table with class intervals.
Grades     No. of students ( f )    C.F.
0–10       15                       15
10–20      20                       35
20–30      25                       60
30–40      24                       84
40–50      10                       94
50–60      33                       127
60–70      71                       198
70–80      51                       249

Here N = 249
(i) Calculation of Median
\(\frac{N}{2} = 124.5\) ∴ median class is 50–60, l = 50; h = 10, f = 33, C = 94
∴
\[ \text{Median} = l + \frac{h}{f}\left(\frac{N}{2} - C\right) = 50 + \frac{10}{33}(124.5 - 94) = 50 + \frac{305}{33} = 50 + 9.24 = 59.24 \]
(ii) Calculation of lower quartile \(Q_1\)
\(\frac{N}{4} = 62.25\) ∴ lower quartile class is 30–40, l = 30, h = 10, f = 24, C = 60
∴
\[ Q_1 = l + \frac{h}{f}\left(\frac{N}{4} - C\right) = 30 + \frac{10}{24}(62.25 - 60) = 30 + \frac{22.5}{24} = 30 + 0.94 = 30.94. \]
(iii) Calculation of upper quartile \(Q_3\)
\(\frac{3N}{4} = \frac{747}{4} = 186.75\) ∴ upper quartile class is 60–70, l = 60, h = 10, f = 71, C = 127
∴
\[ Q_3 = l + \frac{h}{f}\left(\frac{3N}{4} - C\right) = 60 + \frac{10}{71}(186.75 - 127) = 60 + \frac{597.5}{71} = 60 + 8.42 = 68.42. \]
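The median, quartile, decile, and percentile formulas share one pattern, which the following Python sketch makes explicit. The function name `partition_value` and its `fraction` parameter are hypothetical; the sketch assumes equal class widths and classes listed in ascending order.

```python
from itertools import accumulate


def partition_value(lower_limits, h, freqs, fraction):
    """Value l + (h/f) * (fraction*N - C) for a grouped distribution.

    fraction = 1/2 gives the median, 1/4 and 3/4 the quartiles,
    k/10 the kth decile, and k/100 the kth percentile.
    """
    target = fraction * sum(freqs)
    cum = list(accumulate(freqs))
    for k, cf in enumerate(cum):
        if cf >= target:                      # first class whose C.F. reaches the target
            C = cum[k - 1] if k > 0 else 0    # C.F. of the preceding class
            return lower_limits[k] + h / freqs[k] * (target - C)


lower_limits = [0, 10, 20, 30, 40, 50, 60, 70]
freqs = [15, 20, 25, 24, 10, 33, 71, 51]

med = partition_value(lower_limits, 10, freqs, 0.5)
q1 = partition_value(lower_limits, 10, freqs, 0.25)
q3 = partition_value(lower_limits, 10, freqs, 0.75)
print(round(med, 2), round(q1, 2), round(q3, 2))
```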
21.8.3
Mode
1. Mode. Mode is the value that occurs most frequently in a set of observations and around
which the other items of the set cluster densely. It is the point of maximum frequency or the
point of greatest density. In other words, the mode or modal value of the distribution is that value
of the variate for which frequency is maximum.
2. Calculation of the Mode.
(a) In the case of discrete frequency distribution, mode is the value of x corresponding to
maximum frequency.
But in any one (or more) of the following cases:
(i) if the maximum frequency is repeated
(ii) if the maximum frequency occurs in the very beginning or at the end of the distribution
(iii) if there are irregularities in the distribution, the value of the mode is determined by the
method of grouping (illustrated in the examples below).
(b) In the case of a continuous frequency distribution, the mode is given by the formula:
$$\text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h$$
where l is the lower limit, h is the width, and $f_m$ is the frequency of the modal class, and $f_1$ and $f_2$ are the frequencies of the classes preceding and succeeding the modal class respectively.
While applying the above formula, it is necessary to see that the class-intervals are of the same size. If they are unequal, they should first be made equal on the assumption that the frequencies are equally distributed throughout the class.
In case $f_m - f_1 < 0$ or $2f_m - f_1 - f_2 = 0$, use the formula
$$\text{Mode} = l + \frac{|\Delta_1|}{|\Delta_1| + |\Delta_2|} \times h$$
where $\Delta_1 = f_m - f_1$ and $\Delta_2 = f_m - f_2$.
(c) For a symmetrical distribution, the mean, median, and mode coincide.
(d) Where the mode is ill-defined, i.e., where the method of grouping also fails, its value
can be ascertained by the formula
Mode = 3 Median – 2 Mean
This measure is called the empirical mode.
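The interpolation formula in (b) and the empirical relation in (d) can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, a missing neighbouring class is treated as having frequency 0, and the fallback branch uses the absolute values of Δ1 and Δ2. The data are those of Example 2 of this section, using the actual class limits 0.5–5.5, 5.5–10.5, and so on.

```python
def grouped_mode(lower_limits, width, freqs, modal_index=None):
    """Mode of a continuous distribution with equal class widths."""
    # By default the modal class is the class with the greatest frequency;
    # pass modal_index when the method of grouping picks another class.
    m = freqs.index(max(freqs)) if modal_index is None else modal_index
    fm = freqs[m]
    f1 = freqs[m - 1] if m > 0 else 0               # assumption: 0 if absent
    f2 = freqs[m + 1] if m + 1 < len(freqs) else 0  # assumption: 0 if absent
    d1, d2 = fm - f1, fm - f2
    if d1 < 0 or d1 + d2 == 0:
        d1, d2 = abs(d1), abs(d2)                   # fallback formula
    return lower_limits[m] + d1 / (d1 + d2) * width


def empirical_mode(mean, median):
    """Mode = 3 Median - 2 Mean, for use when the mode is ill-defined."""
    return 3 * median - 2 * mean


lowers = [0.5 + 5 * i for i in range(9)]   # 0.5, 5.5, ..., 40.5
freqs = [7, 10, 16, 32, 24, 18, 10, 5, 1]
mode = grouped_mode(lowers, 5, freqs)
print(round(mode, 2))
```

Passing `modal_index` explicitly keeps the choice of modal class separate from the interpolation step, which is convenient when grouping selects a class other than the one with the largest single frequency.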
ILLUSTRATIVE EXAMPLES
Example 1. Calculate the mode from the following frequency distribution:
Size (x):       4   5   6   7   8    9    10   11   12   13
Frequency (f):  2   5   8   9   12   14   14   15   11   13
Sol. Method of Grouping:
Explanation:
In column I, original frequencies are written.
In column II, frequencies of column I are combined two by two.
In column III, leave the first frequency of column I and combine the others two by two.
In column IV, frequencies of column I are combined three by three.
In column V, leave the first frequency of column I and combine the others three by three.
In column VI, leave the first two frequencies in column I and combine the others three by three.
In all these columns, the maximum frequency is written in bold black type.
Note. All operations are done on column I.
Now we frame another table in which against every maximum item of columns I to VI, we write down the corresponding size or sizes. The size (x) that occurs the maximum number of times is the mode.
Columns    Size of item having max. frequency
I          11
II         10, 11
III        9, 10
IV         10, 11, 12
V          8, 9, 10
VI         9, 10, 11
Since the item 10 occurs a maximum number of times (i.e., 5 times), hence the mode is 10.
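The six grouping columns and the final tally can be reproduced mechanically. The sketch below is one possible Python rendering of the method of grouping as described in this solution; the function name `grouping_mode` is hypothetical, and incomplete groups at the end of a column are simply dropped, which is an assumption about how the table is laid out.

```python
from collections import Counter


def grouping_mode(sizes, freqs):
    """Return (mode, tally) using grouping columns I to VI."""
    # (group length, number of leading frequencies left out) for columns I..VI
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
    tally = Counter()
    for length, skip in columns:
        starts = range(skip, len(freqs) - length + 1, length)
        groups = [(sum(freqs[s:s + length]), sizes[s:s + length]) for s in starts]
        top = max(total for total, _ in groups)
        for total, members in groups:
            if total == top:          # every group attaining the maximum counts
                tally.update(members)
    return tally.most_common(1)[0][0], dict(tally)


sizes = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
freqs = [2, 5, 8, 9, 12, 14, 14, 15, 11, 13]
mode, tally = grouping_mode(sizes, freqs)
print(mode, tally)
```

For the data of Example 1 the tally gives size 10 five occurrences, matching the table above.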
Example 2. Find the mode of the following:
Grades:             1–5   6–10   11–15   16–20   21–25   26–30   31–35   36–40   41–45
No. of candidates:  7     10     16      32      24      18      10      5       1
Sol. Here the greatest frequency 32 lies in the class 16–20. Hence the modal class is 16–20. But the actual limits of this class are 15.5–20.5.
∴ $l = 15.5$, $f_m = 32$, $f_1 = 16$, $f_2 = 24$, $h = 5$
$$\text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h = 15.5 + \frac{32 - 16}{64 - 16 - 24} \times 5 = 15.5 + \frac{16}{24} \times 5 = 15.5 + \frac{10}{3} = 18.83.$$
21.8.4
Geometric Mean
Geometric Mean. (a) The geometric mean (G.M.) of n individual observations $x_1, x_2, \ldots, x_n$ ($x_i \neq 0$) is the nth root of their product. Thus
$$G = (x_1 x_2 \cdots x_n)^{1/n}$$
Taking logarithms of both sides
$$\log G = \frac{1}{n}(\log x_1 + \log x_2 + \ldots + \log x_n) = \frac{1}{n}\sum_{i=1}^{n} \log x_i$$
$$\therefore \quad G = \text{antilog}\left[\frac{1}{n}\sum_{i=1}^{n} \log x_i\right]$$
(b) If $x_1, x_2, \ldots, x_n$ occur $f_1, f_2, \ldots, f_n$ times respectively and $N = \sum_{i=1}^{n} f_i$, then the G.M. is given by
$$G = \left(x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n}\right)^{1/N}$$
Taking logarithms of both sides
$$\log G = \frac{1}{N}(f_1 \log x_1 + f_2 \log x_2 + \ldots + f_n \log x_n) = \frac{1}{N}\sum_{i=1}^{n} f_i \log x_i$$
$$G = \text{antilog}\left[\frac{1}{N}\sum_{i=1}^{n} f_i \log x_i\right]$$
(c) In the case of a continuous frequency distribution, x is taken to be the value corresponding to the mid-points of the class-intervals.
Example. Compute the geometric mean from the following data:
Grades:           0–10   10–20   20–30   30–40   40–50
No. of students:  10     5       8       7       20
Sol.
Grades    No. of students (f)    Mid-values (x)    log x     f log x
0–10      10                     5                 0.6990    6.9900
10–20     5                      15                1.1761    5.8805
20–30     8                      25                1.3979    11.1832
30–40     7                      35                1.5441    10.8087
40–50     20                     45                1.6532    33.0640
Total     50                                                 67.9264
$$\log G = \frac{1}{N}\Sigma f \log x = \frac{67.9264}{50} = 1.3585$$
G = antilog 1.3585 = 22.83.
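The logarithmic route used in this solution translates directly into code. Here is a short Python sketch, assuming base-10 logarithms as in the table; the function name `geometric_mean` is illustrative.

```python
import math


def geometric_mean(values, freqs):
    """G = antilog((1/N) * sum(f * log x)) with base-10 logarithms."""
    n = sum(freqs)
    log_g = sum(f * math.log10(x) for x, f in zip(values, freqs)) / n
    return 10 ** log_g


mid_values = [5, 15, 25, 35, 45]
students = [10, 5, 8, 7, 20]
g = geometric_mean(mid_values, students)
print(round(g, 2))
```

Working with the sum of logarithms rather than the raw product avoids overflow when N is large, which is the same reason the hand calculation uses log tables.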
21.8.5
Harmonic Mean
Harmonic Mean. The harmonic mean of a number of observations is the reciprocal of the arithmetic mean of the reciprocals of the given values. Thus, the harmonic mean H of n observations $x_1, x_2, \ldots, x_n$ is
$$H = \frac{1}{\dfrac{1}{n}\sum\limits_{i=1}^{n} \dfrac{1}{x_i}} = \frac{n}{\dfrac{1}{x_1} + \dfrac{1}{x_2} + \ldots + \dfrac{1}{x_n}}.$$
If $x_1, x_2, \ldots, x_n$ (none of them being zero) have the frequencies $f_1, f_2, \ldots, f_n$ respectively, then the harmonic mean is given by
$$H = \frac{1}{\dfrac{1}{N}\sum\limits_{i=1}^{n} \dfrac{f_i}{x_i}} = \frac{N}{\dfrac{f_1}{x_1} + \dfrac{f_2}{x_2} + \ldots + \dfrac{f_n}{x_n}}, \quad N = \sum_{i=1}^{n} f_i$$
In the case of class-intervals, x is taken to be the mid-value of the class-interval.
ILLUSTRATIVE EXAMPLES
Example 1. Find the harmonic mean of the following data:
Grades (out of 150):  10   20   40   60   120
No. of students:      2    3    6    5    4
Sol.
x        f     1/x     f/x
10       2     .100    .200
20       3     .050    .150
40       6     .025    .150
60       5     .017    .085
120      4     .008    .032
Total    20            .617
$$\text{H.M.} = \frac{N}{\Sigma \dfrac{f}{x}} = \frac{20}{.617} = 32.4.$$
Example 2. An airplane flies along the four sides of a square at speeds of 100, 200, 300,
and 400 km/hr respectively. What is the average speed of the airplane in its flight around the
square?
Sol. When equal distances are covered with unequal speeds, the harmonic mean is the proper average.
$$\therefore \quad \text{Average speed} = \frac{4}{\dfrac{1}{100} + \dfrac{1}{200} + \dfrac{1}{300} + \dfrac{1}{400}} = 192 \text{ km/hr}.$$
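Both harmonic-mean examples can be checked with a few lines of Python. The sketch below uses a hypothetical helper `harmonic_mean`; when no frequencies are supplied, each value is assumed to occur once.

```python
def harmonic_mean(values, freqs=None):
    """H = N / sum(f / x); all values must be non-zero."""
    if freqs is None:
        freqs = [1] * len(values)   # individual observations
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))


# Example 1: grades with frequencies.
h_grades = harmonic_mean([10, 20, 40, 60, 120], [2, 3, 6, 5, 4])

# Example 2: equal distances flown at four different speeds.
h_speed = harmonic_mean([100, 200, 300, 400])

print(round(h_grades, 1), round(h_speed, 1))
```

The speed case illustrates why the harmonic mean is the right average here: total distance 4d divided by total time d/100 + d/200 + d/300 + d/400 reduces to exactly this expression.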
TEST YOUR KNOWLEDGE
1. The minimum temperature in (°C) for Anytown for the month of July, 2006 as reported by the
Meteorological Department is given below. Construct a frequency distribution table for it.
30.3, 30.0, 25.8, 26.5, 24.2, 25.2, 28.0, 28.0, 29.5, 27.8, 30.0, 31.1, 27.2, 25.9, 27.6, 24.5, 24.4, 27.0,
28.1, 26.0, 25.4, 28.0, 26.9, 25.7, 27.2, 25.5, 26.6, 28.5, 28.0, 27.7, 24.0.
2. The following are the monthly rents (in dollars) of 40 stores. Tabulate the data by grouping in intervals
of $8.
380, 420, 490, 370, 820, 370, 750, 620, 540, 790, 840, 750, 630, 440, 740, 440, 360, 690, 540, 480, 740,
470, 520, 570, 620, 670, 720, 770, 820, 510, 310, 380, 430, 750, 670, 770, 470, 640, 840, 810.
3. Draw a histogram representing the following frequency distribution:
Monthly Wages (in $):  15   20   25   30   35   40   45
Number of Workers:     2    20   26   16   9    4    3
[Hint. Mid-values of class intervals of size 5 are given.]
4. Represent the following distribution by a (i) histogram and (ii) frequency polygon.
Scores:     90–99   80–89   70–79   60–69   50–59   40–49   30–39
Frequency:  2       12      22      20      14      3       1
5. Represent the following distribution by an ogive:
Grades:           0–10   10–20   20–30   30–40   40–50   50–60   60–70   70–80   80–90   90–100
No. of students:  5      13      12      11      8       4       1       3       1       2
6. Compute the arithmetic mean for the following data:
Height (in cm):  219   216   213   210   207   204   201   198   195
No. of people:   2     4     6     10    11    7     5     4     1
7. Find the average grades of students from the following data:
Grades      No. of students
Above 0     80
Above 10    77
Above 20    72
Above 30    65
Above 40    55
Above 50    43
Above 60    28
Above 70    16
Above 80    10
Above 90    8
Above 100   0
8. Two hundred people were interviewed by a public opinion polling agency. The frequency distribution gives the ages of the people interviewed.
Age Group:  80–89   70–79   60–69   50–59   40–49   30–39   20–29   10–19
Frequency:  2       2       6       20      56      40      42      32
Calculate the arithmetic mean of the data.
9. Calculate the arithmetic mean from the following data:
Class interval:  0–1   1–3   3–5   5–10   10–15   15–25   25–28   28–30   30–45   45–60
Frequency:       8     8     10    12     18      11      10      9       8       6
10. Find the class intervals if the arithmetic mean of the following distribution is 33 and assumed mean is 35.
Step deviation (u):  –3   –2   –1   0    1    2
Frequency (f):       5    10   25   30   20   10
11. The average height of a group of 25 children was calculated to be 78.4 cm. It was later discovered that
one value was misread as 69 cm instead of the correct value of 96 cm. Calculate the correct average.
12. A candidate obtains the following percentages in an examination: English 60, history 75, mathematics 63, physics 59, and chemistry 55. Find the weighted mean if weights 2, 1, 5, 5, 3 are allotted to the subjects.
13. From the following data calculate the missing frequency:
No. of pills:         4–8   8–12   12–16   16–20   20–24   24–28   28–32   32–36   36–40
No. of people cured:  11    13     16      14      ?       9       17      6       4
The average number of pills to cure a person is 20.
14. The frequencies of values 0, 1, 2, . . . , n of a variable are given by $q^n,\ {}^nC_1 q^{n-1}p,\ {}^nC_2 q^{n-2}p^2,\ \ldots,\ p^n$ where p + q = 1. Show that the mean is np.
15. The mean grades obtained by 300 students in the subject of statistics is 45. The mean of the top 100 of
them was found to be 70 and the mean of the last 100 was known to be 20. What is the mean of the
remaining 100 students?
16. In a certain examination, the average grade of all students in class A is 68.4 and that of all students in
class B is 71.2. If the average of both classes combined is 70, find the ratio of the number of students in
class A to the number in class B.
17. The following are the monthly salaries in dollars of 30 employees of a firm:
910 1390 1260 1190 1000 870 650 770 990 950 1080 1270 860 1480 1160 760 690 880 1120
1180 890 1160 970 1050 950 800 860 1060 930 1350
The firm gave bonuses of 100, 150, 200, 250, 300, 350, 400, 450, and 500 to employees in the respective
salary groups: exceeding 600 but not exceeding 700, exceeding 700 but not exceeding 800, and so on up
to exceeding 1400 but not exceeding 1500. Find the average bonus paid per employee.
18. According to the census of 2006, the following are the population figures in thousands of 10 cities:
2000, 1180, 1785, 1500, 560, 782, 1200, 385, 1123, 222.
Find the median.
19. Find the median from the following table:
x:  5   7   9   11   13   15   17   19
f:  1   2   7   9    11   8    5    4
20. Calculate the mean and median from the following table:
Class interval:  6.5–7.5   7.5–8.5   8.5–9.5   9.5–10.5   10.5–11.5   11.5–12.5   12.5–13.5
Frequency:       5         12        25        48         32          6           1
21. Compute the median from the following data:
Mid-value:  115   125   135   145   155   165   175   185   195
Frequency:  6     25    48    72    116   60    38    22    3
22. Find the median, quartiles, 7th decile, and 85th percentile from the following data:
Monthly Rent ($)    No. of families
200–400             6
400–600             9
600–800             11
800–1000            14
1000–1200           20
1200–1400           15
1400–1600           10
1600–1800           8
1800–2000           7
23. An incomplete frequency distribution is given as follows:
Variable:   10–20   20–30   30–40   40–50   50–60   60–70   70–80   Total
Frequency:  12      30      ?       65      ?       25      18      229
Given that the median value is 46, determine the missing frequencies using the median formula.
24. Find the median, lower and upper quartiles, 4th decile, and 60th percentile for the following distribution:
Grades:           0–4   4–8   8–12   12–14   14–18   18–20   20–25   25 and above
No. of students:  10    12    18     7       5       8       4       6
[Hint. Here the class-intervals are not all equal. To find any partition value, there is no need to make them equal.]
25. Find the mode of the following frequency distribution:
Size:       1   2   3    4    5    6    7    8    9    10   11   12
Frequency:  3   8   15   23   35   40   32   28   20   45   14   6
26. Find the mode and median from the following table:
Grades:           0–10   10–20   20–30   30–40   40–50   50–60   60–70   70–80
No. of students:  2      18      30      45      35      20      6       3
27. Calculate the mode of the following distribution:
Monthly wages (in $)    No. of workers
500–700                 4
700–900                 44
900–1100                38
1100–1300               28
1300–1500               6
1500–1700               8
1700–1900               12
1900–2100               2
2100–2300               2
[Hint. Use the method of grouping for finding the modal class.]
28. An incomplete distribution of families according to their expenditure per week is given below. The median and mode for the distribution are $250 and $240 respectively. Calculate the missing frequencies.
Expenditure:      0–100   100–200   200–300   300–400   400–500
No. of families:  14      ?         27        ?         15
29. Compute the geometric mean of the following data:
x:  10   15   18   20   25
y:  2    3    5    6    4
30. If $n_1$ and $n_2$ are the sizes, $G_1$ and $G_2$ the geometric means of two series respectively, then the geometric mean G of the combined series is given by
$$\log G = \frac{n_1 \log G_1 + n_2 \log G_2}{n_1 + n_2}.$$
31. The grades obtained by 25 students in a test are given below:
Grades:           11   12   13   14   15
No. of students:  3    7    8    5    2
Find the harmonic mean.
32. Compute the harmonic mean of the following data:
Class:      0–10   10–20   20–30   30–40   40–50
Frequency:  4      6       10      7       3
33. Three cities A, B, and C are equidistant from each other. A woman drives from A to B at 30 km/hr, from B to C at 40 km/hr, and from C to A at 50 km/hr. Determine her average speed.
34. Show that in finding the arithmetic mean of a set of readings on a thermometer, it does not matter whether we measure temperature in Centigrade or Fahrenheit, but that in finding the geometric mean, it does matter which scale we use.
Answers
6. 207.54 cm
7. 51.75
8. 35.8 years
9. 17.36
10. 0–10, 10–20, 20–30, 30–40, 40–50, 50–60
11. 79.48 cm
12. 60.63%
13. 14
15. 45
16. 3:4
17. $275
18. 1151.5 thousands
19. 13
20. Mean = 9.87, Median = 9.97
21. 153.8
22. ($) 1100, 781.80, 1400, 1333.30, 1600
23. 34, 45
24. 10.89, 6.5, 18.125, 9.33, 12.57
25. 6
26. 36, 36.6
27. $975.00
28. 25, 24
29. 18.20
31. 12.7
32. 16.03
33. 38.3 km/hr
________________________
________________________________________________________________________________________
21.9
DISPERSION
A measure of central tendency by itself can exhibit only one of the important characteristics of distribution. It can represent a series only as well as a single figure can. It is inadequate to give us a complete idea of the distribution. It must be supported and supplemented by some other measures. One such measure is Dispersion.
Two or more frequency distributions may have exactly identical averages but even then they may differ markedly in several ways. Further analysis is, therefore, essential to account for these differences. Consider the following example:
Distribution A:  75   85   95   105   115   125
Distribution B:  10   20   30   70    180   290
The A.M. of each distribution is $\frac{600}{6} = 100$. In distribution A, the values of the variate differ from 100 but the difference is small. In distribution B, the items are widely scattered and lie far from the mean. Although the A.M. is the same, the two distributions widely differ from each other in their formation.
Therefore, while studying a distribution, it is equally important to know how the variates are clustered around or scattered away from the point of central tendency. Such variation is called dispersion or spread or scatter or variability. Thus, dispersion is the extent to which the values are dispersed about the central value.
21.10
MEASURES OF DISPERSION
The following are the measures of dispersion:
(a) Range
(b) Quartile deviation or semi-inter-quartile range
(c) Average (or mean) deviation
(d) Standard deviation.
(a) Range. Range is the difference between the extreme values of the variate.
Range = L – S, where L = Largest and S = Smallest
$$\text{Coefficient of the Range} = \frac{L - S}{L + S}.$$
It is easily understood and computed. But it suffers from the drawback that it depends
exclusively on the two extreme values. It is not a reliable measure of dispersion.
(b) Quartile Deviation. The difference between the upper and lower quartiles, i.e., $Q_3 - Q_1$, is known as the inter-quartile range and half of it, i.e., $\frac{1}{2}(Q_3 - Q_1)$, is called the semi-inter-quartile range or the quartile deviation.
$$\text{Quartile Deviation} = \frac{1}{2}(Q_3 - Q_1).$$
It is definitely a better measure of dispersion than range as it makes use of 50% of the data. But since it ignores the other 50% of the data, it is also not a reliable measure of dispersion.
$$\text{Coefficient of the Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}.$$
Example. Calculate the quartile deviation of the grades of 39 students in statistics given below:
Grades:           0–5   5–10   10–15   15–20   20–25   25–30
No. of students:  4     6      8       12      7       2
Sol. The cumulative frequency table is given below:
Grades    No. of students (f)    C.F.
0–5       4                      4
5–10      6                      10
10–15     8                      18
15–20     12                     30
20–25     7                      37
25–30     2                      39
Here N = Σ f = 39;
$\frac{N}{4} = 9.75$ ∴ Class of Q1 is 5–10
$$Q_1 = l + \frac{h}{f}\left(\frac{N}{4} - C\right) = 5 + \frac{5}{6}(9.75 - 4) = 5 + \frac{5 \times 5.75}{6} = 9.79$$
$\frac{3N}{4} = 29.25$ ∴ Class of Q3 is 15–20
$$Q_3 = l + \frac{h}{f}\left(\frac{3N}{4} - C\right) = 15 + \frac{5}{12}(29.25 - 18) = 15 + \frac{5 \times 11.25}{12} = 19.69$$
$$\text{Quartile deviation} = \frac{1}{2}(Q_3 - Q_1) = \frac{1}{2}(19.69 - 9.79) = \frac{1}{2} \times 9.90 = 4.95.$$
(c) Average Deviation or Mean Deviation. If $x_1, x_2, x_3, \ldots, x_n$ occur $f_1, f_2, f_3, \ldots, f_n$ times respectively and $N = \sum_{i=1}^{n} f_i$, the mean deviation from the average A (usually mean or median) is given by
$$\text{Mean deviation} = \frac{1}{N}\sum_{i=1}^{n} f_i\,|x_i - A|,$$
where $|x_i - A|$ represents the modulus or the absolute value of the deviation $(x_i - A)$.
Since the mean deviation is based on all the values of the variate, it is a better measure of dispersion than range or quartile deviation. But some artificiality is created due to ignoring the signs of the deviations $(x_i - A)$. This renders it useless for further mathematical treatment.
$$\text{Coefficient of Mean Deviation} = \frac{\text{Mean Deviation}}{\text{Average from which it is calculated}}.$$
Example. Find the mean deviation from the median of the following frequency distribution:
Grades:           0–10   10–20   20–30   30–40   40–50
No. of students:  5      8       15      16      6
Sol.
Mid-value (x)    f     C.F.    |x − Md|    f |x − Md|
5                5     5       23          115
15               8     13      13          104
25               15    28      3           45
35               16    44      7           112
45               6     50      17          102
Total            50                        478
$\frac{N}{2} = 25$ ∴ The median class corresponds to c.f. 28, i.e., median class is 20–30
$$\text{Median } M_d = l + \frac{h}{f}\left(\frac{N}{2} - C\right) = 20 + \frac{10}{15}(25 - 13) = 20 + 8 = 28$$
$$\text{Mean deviation from median} = \frac{1}{N}\Sigma f\,|x - M_d| = \frac{478}{50} = 9.56 \text{ marks}.$$
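As a quick check of the arithmetic, the mean deviation about any chosen average can be computed as below. This is a minimal Python sketch; the function name `mean_deviation` is illustrative, and the median 28 is taken from the worked solution rather than recomputed.

```python
def mean_deviation(values, freqs, about):
    """(1/N) * sum(f * |x - A|) for a chosen average A."""
    n = sum(freqs)
    return sum(f * abs(x - about) for x, f in zip(values, freqs)) / n


mid_values = [5, 15, 25, 35, 45]
students = [5, 8, 15, 16, 6]
md = mean_deviation(mid_values, students, about=28)
print(md)

# Coefficient of mean deviation: divide by the average used.
coefficient = md / 28
```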
(d) Standard Deviation. Root-Mean Square Deviation. The root-mean square deviation, denoted by s, is defined as the positive square root of the mean of the squares of the deviations from an arbitrary origin A. Thus
$$s = +\sqrt{\frac{1}{N}\Sigma f_i (x_i - A)^2}$$
When the deviations are taken from the mean $\bar{x}$, the root-mean square deviation is called the standard deviation and is denoted by the Greek letter σ. Thus
$$\sigma = +\sqrt{\frac{1}{N}\Sigma f_i (x_i - \bar{x})^2}.$$
Note. The square of the standard deviation, $\sigma^2$, is called variance.
Short-cut methods for calculating Standard Deviation (σ).
(i) Direct Method
$$\sigma = \sqrt{\frac{1}{N}\Sigma f_i (x_i - \bar{x})^2}$$
$$\Rightarrow \quad \sigma^2 = \frac{1}{N}\Sigma f_i (x_i^2 - 2x_i\bar{x} + \bar{x}^2) = \frac{1}{N}\Sigma f_i x_i^2 - 2\bar{x} \cdot \frac{1}{N}\Sigma f_i x_i + \bar{x}^2 \cdot \frac{1}{N}\Sigma f_i$$
(taking the constants $\bar{x}$, $\bar{x}^2$ outside the summation sign)
$$= \frac{1}{N}\Sigma f_i x_i^2 - 2\bar{x} \cdot \bar{x} + \bar{x}^2 \cdot \frac{1}{N} \cdot N = \frac{1}{N}\Sigma f_i x_i^2 - \bar{x}^2$$
$$\Rightarrow \quad \sigma = \sqrt{\frac{1}{N}\Sigma f_i x_i^2 - \bar{x}^2} = \sqrt{\frac{1}{N}\Sigma f_i x_i^2 - \left(\frac{1}{N}\Sigma f_i x_i\right)^2}.$$
(ii) Change of Origin
Let the origin be shifted to an arbitrary point a. Let d = x – a denote the deviation of variate x from the new origin.
$$d = x - a \ \Rightarrow\ \bar{d} = \bar{x} - a \qquad \therefore \quad d - \bar{d} = x - \bar{x}$$
$$\sigma_x = \sqrt{\frac{1}{N}\Sigma f (x - \bar{x})^2} = \sqrt{\frac{1}{N}\Sigma f (d - \bar{d})^2} = \sigma_d$$
∴ The S.D. remains unchanged by shift of origin.
$$\sigma_x = \sigma_d = \sqrt{\frac{1}{N}\Sigma f d^2 - \left(\frac{1}{N}\Sigma f d\right)^2}.$$
Note. In the case of series of individual observations, if the mean is a whole number, take $a = \bar{x}$. In the case of discrete series, when the values of x are not equidistant, take a somewhere in the middle of the x-series.
(iii) Shift of Origin and Change of Scale (Step Deviation Method)
Let the origin be shifted to an arbitrary point a. Let the new scale be $\frac{1}{h}$ times the original scale.
Let $u = \dfrac{x - a}{h}$, then $hu = x - a \Rightarrow h\bar{u} = \bar{x} - a$ ∴ $h(u - \bar{u}) = x - \bar{x}$
$$\sigma_x = \sqrt{\frac{1}{N}\Sigma f (x - \bar{x})^2} = \sqrt{\frac{1}{N}\Sigma f h^2 (u - \bar{u})^2} = h\sqrt{\frac{1}{N}\Sigma f (u - \bar{u})^2} = h\sigma_u$$
which is independent of a but not h. Hence the S.D. is independent of the change of the origin but not of the change of scale.
$$\sigma_x = h\sigma_u = h\sqrt{\frac{1}{N}\Sigma f u^2 - \left(\frac{1}{N}\Sigma f u\right)^2}$$
Note. In the case of discrete series, when the values of x are equidistant at intervals of h or in the case of continuous series having equal class intervals of width h, use the Step Deviation Method.
Relation between σ and s
By definition, we have
$$s^2 = \frac{1}{N}\Sigma f_i (x_i - a)^2 = \frac{1}{N}\Sigma f_i (x_i - \bar{x} + \bar{x} - a)^2$$
$$= \frac{1}{N}\Sigma f_i (x_i - \bar{x} + d)^2 \quad \text{where } d = \bar{x} - a$$
$$= \frac{1}{N}\Sigma f_i \left[(x_i - \bar{x})^2 + d^2 + 2d(x_i - \bar{x})\right]$$
$$= \frac{1}{N}\Sigma f_i (x_i - \bar{x})^2 + \frac{d^2}{N}\Sigma f_i + \frac{2d}{N}\Sigma f_i (x_i - \bar{x}) = \sigma^2 + \frac{d^2}{N} \cdot N + \frac{2d}{N}(0)$$
[∵ $\Sigma f_i (x_i - \bar{x})$ = algebraic sum of the deviations from mean = 0]
$$= \sigma^2 + d^2$$
Hence $s^2 = \sigma^2 + d^2$. ∵ $d^2 \geq 0$ ∴ $s^2 \geq \sigma^2$
Clearly $s^2$ is least when d = 0, i.e., $\bar{x} = a$
∴ Mean square deviation ($s^2$) and consequently the root-mean square deviation (s) is least when the deviations are measured from the mean.
Hence standard deviation is the least possible root-mean square deviation.
21.11
RELATIONS BETWEEN MEASURES OF DISPERSION
$$\text{Mean Deviation} = \frac{4}{5}\,(\text{standard deviation}) = \frac{4}{5}\sigma$$
$$\text{Semi-interquartile range} = \frac{2}{3}\,(\text{standard deviation}) = \frac{2}{3}\sigma.$$
21.12
COEFFICIENT OF DISPERSION
Whenever we want to compare the variability of two series that differ widely in their
averages or which are measured in different units, we calculate the coefficients of dispersion,
which being ratios are numbers independent of the units of measurement. The coefficients of
dispersion (C.D.) based on different measures of dispersion are as follows:
(a) C.D. based on range: $\text{C.D.} = \dfrac{x_{\max} - x_{\min}}{x_{\max} + x_{\min}}$
(b) Based on quartile deviation: $\text{C.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
(c) Based on mean deviation: $\text{C.D.} = \dfrac{\text{mean deviation}}{\text{average from which it is calculated}}$
(d) Based on standard deviation: $\text{C.D.} = \dfrac{\text{S.D.}}{\text{Mean}} = \dfrac{\sigma}{\bar{x}}$
Coefficient of variation. It is the percentage variation in the mean, standard deviation being considered as the total variation in the mean.
$$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100.$$
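To see how the coefficient of variation is used to compare two series, here is a small Python sketch. The function name `coefficient_of_variation` is hypothetical, and the data are the goals-per-match frequencies of teams A and B from Example 2 of the illustrative examples in this section. Because the code carries full precision, its results differ slightly from hand calculations that round the mean and σ first.

```python
import math


def coefficient_of_variation(values, freqs):
    """C.V. = 100 * sigma / mean for a discrete frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return 100 * math.sqrt(variance) / mean


goals = [0, 1, 2, 3, 4]
cv_a = coefficient_of_variation(goals, [27, 9, 8, 5, 4])   # team A
cv_b = coefficient_of_variation(goals, [17, 9, 6, 5, 3])   # team B
print(round(cv_a, 1), round(cv_b, 1))

# The series with the smaller C.V. is the more consistent one.
more_consistent = "A" if cv_a < cv_b else "B"
```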
ILLUSTRATIVE EXAMPLES
Example 1. Find the mean and standard deviation of the following:
Series:     15–20   20–25   25–30   30–35   35–40   40–45   45–50   50–55   55–60   60–65   65–70   70–75
Frequency:  2       5       8       11      15      20      20      17      16      13      11      5
Sol.
Mid-values x    f      u = (x − 47.5)/5    fu      fu²
17.5            2      –6                  –12     72
22.5            5      –5                  –25     125
27.5            8      –4                  –32     128
32.5            11     –3                  –33     99
37.5            15     –2                  –30     60
42.5            20     –1                  –20     20
47.5            20     0                   0       0
52.5            17     1                   17      17
57.5            16     2                   32      64
62.5            13     3                   39      117
67.5            11     4                   44      176
72.5            5      5                   25      125
Total           N = 143                    5       1003
$$\bar{x} = a + h \cdot \frac{\Sigma fu}{N} = 47.5 + 5 \times \frac{5}{143} = 47.7$$
$$\sigma_x = h\sigma_u = h\sqrt{\frac{1}{N}\Sigma fu^2 - \left(\frac{\Sigma fu}{N}\right)^2} = 5\sqrt{\frac{1003}{143} - \left(\frac{5}{143}\right)^2} = 5 \times 2.65 = 13.25.$$
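The step-deviation method and the direct method must give the same mean and standard deviation, since σ is unaffected by the choice of origin and scales with h. The Python sketch below, with hypothetical function names, computes both for the data of Example 1 so the two routes can be compared.

```python
import math


def mean_sd_direct(values, freqs):
    """Mean and S.D. from sigma^2 = (1/N) sum f x^2 - mean^2."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    mean_sq = sum(f * x * x for x, f in zip(values, freqs)) / n
    return mean, math.sqrt(mean_sq - mean ** 2)


def mean_sd_step(values, freqs, a, h):
    """Mean and S.D. via u = (x - a) / h (step-deviation method)."""
    n = sum(freqs)
    u = [(x - a) / h for x in values]
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))
    sum_fu2 = sum(f * ui * ui for f, ui in zip(freqs, u))
    mean = a + h * sum_fu / n
    sd = h * math.sqrt(sum_fu2 / n - (sum_fu / n) ** 2)
    return mean, sd


mids = [17.5 + 5 * i for i in range(12)]          # 17.5, 22.5, ..., 72.5
freqs = [2, 5, 8, 11, 15, 20, 20, 17, 16, 13, 11, 5]

mean_step, sd_step = mean_sd_step(mids, freqs, a=47.5, h=5)
mean_direct, sd_direct = mean_sd_direct(mids, freqs)
print(round(mean_step, 2), round(sd_step, 2))
```

Carried to full precision the standard deviation comes out near 13.24; the hand solution obtains 13.25 because it rounds $\sigma_u$ to 2.65 before multiplying by h.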
Example 2. Goals scored by two teams A and B in a soccer season were as follows:
No. of goals scored in a match    No. of matches (A)    No. of matches (B)
0                                 27                    17
1                                 9                     9
2                                 8                     6
3                                 5                     5
4                                 4                     3
Find out which team is more consistent.
Sol. Calculation of coefficient of variation for team A:
No. of goals scored (x)    No. of matches (f)    dx = x − 2    f dx    f dx²
0                          27                    –2            –54     108
1                          9                     –1            –9      9
2                          8                     0             0       0
3                          5                     1             5       5
4                          4                     2             8       16
Total                      N = 53                              –50     138
$$\bar{x} = a + \frac{\Sigma f d_x}{N} = 2 + \frac{-50}{53} = 2 - 0.94 = 1.06$$
$$\sigma = \sqrt{\frac{1}{N}\Sigma f d_x^2 - \left(\frac{\Sigma f d_x}{N}\right)^2} = \sqrt{\frac{138}{53} - \left(\frac{-50}{53}\right)^2} = 1.31$$
$$\text{Coefficient of variation for team A} = \frac{\sigma}{\bar{x}} \times 100 = \frac{1.31 \times 100}{1.06} = 123.6$$
Calculation of coefficient of variation for team B:
No. of goals scored (x)    No. of matches (f)    dx = x − 2    f dx    f dx²
0                          17                    –2            –34     68
1                          9                     –1            –9      9
2                          6                     0             0       0
3                          5                     1             5       5
4                          3                     2             6       12
Total                      N = 40                              –32     94
$$\bar{x} = a + \frac{\Sigma f d_x}{N} = 2 - \frac{32}{40} = 2 - 0.8 = 1.2$$
$$\sigma = \sqrt{\frac{1}{N}\Sigma f d_x^2 - \left(\frac{\Sigma f d_x}{N}\right)^2} = \sqrt{\frac{94}{40} - \left(\frac{-32}{40}\right)^2} = 1.3$$
$$\text{Coefficient of variation for team B} = \frac{\sigma}{\bar{x}} \times 100 = \frac{1.3 \times 100}{1.2} = 108.3$$
Since the coefficient of variation is less for team B, team B is therefore more consistent.
21.13
THEOREM
The standard deviations of two series containing $n_1$ and $n_2$ members are $\sigma_1$ and $\sigma_2$ respectively, being measured from their respective means $\bar{x}_1$ and $\bar{x}_2$. If the two series are grouped together as one series of $(n_1 + n_2)$ members, show that the standard deviation σ of this series, measured from its mean $\bar{x}$, is given by
$$\sigma^2 = \frac{n_1\sigma_1^2 + n_2\sigma_2^2}{n_1 + n_2} + \frac{n_1 n_2}{(n_1 + n_2)^2}(\bar{x}_1 - \bar{x}_2)^2.$$
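Proof. Let $S_1^2$ and $S_2^2$ be the mean square deviations of the two series respectively and $S^2$ be the mean square deviation of the two series taken together.
Then if a is the assumed mean, we have
$$S^2 = \frac{1}{n_1 + n_2}\sum_{1}^{n_1 + n_2} f(x - a)^2 = \frac{1}{n_1 + n_2}\left[\sum_{1}^{n_1} f(x - a)^2 + \sum_{n_1 + 1}^{n_1 + n_2} f(x - a)^2\right]$$
$$= \frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2} \qquad \left[\because\ S_1^2 = \frac{1}{n_1}\sum_{1}^{n_1} f(x - a)^2 \text{ etc.}\right]$$
$$= \frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2} \qquad [\because\ S^2 = \sigma^2 + d^2 \text{ where } d = \bar{x} - a]$$
$$= \frac{n_1\sigma_1^2 + n_2\sigma_2^2}{n_1 + n_2} + \frac{n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2} \qquad \ldots (1)$$
Now $d_1 = \bar{x}_1 - a$, $d_2 = \bar{x}_2 - a$
If a is the mean of the two combined series, i.e., if $a = \bar{x}$, then $S^2 = \sigma^2$
Also $\bar{x} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$
$$\therefore \quad d_1 = \bar{x}_1 - \bar{x} = \bar{x}_1 - \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \frac{n_2(\bar{x}_1 - \bar{x}_2)}{n_1 + n_2}$$
$$d_2 = \bar{x}_2 - \bar{x} = \bar{x}_2 - \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \frac{n_1(\bar{x}_2 - \bar{x}_1)}{n_1 + n_2}$$
$$\therefore \quad n_1 d_1^2 + n_2 d_2^2 = \frac{n_1 n_2^2(\bar{x}_1 - \bar{x}_2)^2}{(n_1 + n_2)^2} + \frac{n_2 n_1^2(\bar{x}_2 - \bar{x}_1)^2}{(n_1 + n_2)^2} = \frac{n_1 n_2(\bar{x}_1 - \bar{x}_2)^2}{(n_1 + n_2)^2} \cdot (n_2 + n_1) = \frac{n_1 n_2}{n_1 + n_2}(\bar{x}_1 - \bar{x}_2)^2$$
∴ From (1),
$$\sigma^2 = \frac{n_1\sigma_1^2 + n_2\sigma_2^2}{n_1 + n_2} + \frac{n_1 n_2}{(n_1 + n_2)^2}(\bar{x}_1 - \bar{x}_2)^2. \qquad (\because\ S^2 = \sigma^2)$$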
Example. The first of the two samples has 100 items with mean 15 and standard deviation 3. If the whole group has 250 items with mean 15.6 and standard deviation $\sqrt{13.44}$, find the standard deviation of the second group.
Sol. Here $n_1 = 100$, $\bar{x}_1 = 15$, $\sigma_1 = 3$
$n = n_1 + n_2 = 250$, $\bar{x} = 15.6$, $\sigma = \sqrt{13.44}$
∴ $n_2 = 250 - 100 = 150$
Using $\bar{x} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$, we have $15.6 = \dfrac{100(15) + 150\,\bar{x}_2}{250}$
or $150\,\bar{x}_2 = 250 \times 15.6 - 1500 = 2400$ ∴ $\bar{x}_2 = 16$
$d_1 = \bar{x}_1 - \bar{x} = 15 - 15.6 = -0.6$
$d_2 = \bar{x}_2 - \bar{x} = 16 - 15.6 = 0.4$
The variance of the combined group $\sigma^2$ is given by the formula
$$\sigma^2 = \frac{n_1\sigma_1^2 + n_2\sigma_2^2}{n_1 + n_2} + \frac{n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}$$
or $(n_1 + n_2)\sigma^2 = n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)$
∴ $250 \times 13.44 = 100(9 + 0.36) + 150(\sigma_2^2 + 0.16)$
or $150\sigma_2^2 = 250 \times 13.44 - 100 \times 9.36 - 150 \times 0.16 = 3360 - 936 - 24 = 2400$
∴ $\sigma_2^2 = 16$. Hence $\sigma_2 = 4$.
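The combined-variance formula of Section 21.13 can be checked numerically against this example. The sketch below is illustrative Python with hypothetical function names: one function applies the theorem, the other rearranges it to recover the second group's variance as in the solution above.

```python
def combined_variance(n1, mean1, var1, n2, mean2, var2):
    """Variance of two series pooled into one (theorem of Section 21.13)."""
    n = n1 + n2
    return (n1 * var1 + n2 * var2) / n + n1 * n2 * (mean1 - mean2) ** 2 / n ** 2


def second_group_variance(n1, mean1, var1, n, mean, var):
    """Solve (n1 + n2) var = n1 (var1 + d1^2) + n2 (var2 + d2^2) for var2."""
    n2 = n - n1
    mean2 = (n * mean - n1 * mean1) / n2
    d1, d2 = mean1 - mean, mean2 - mean
    return (n * var - n1 * (var1 + d1 ** 2) - n2 * d2 ** 2) / n2


var2 = second_group_variance(100, 15, 9, 250, 15.6, 13.44)
check = combined_variance(100, 15, 9, 150, 16, 16)
print(round(var2, 2), round(check, 2))
```

Feeding the recovered values back through `combined_variance` returns the stated combined variance, which is a useful sanity check on the algebra.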
21.14
SKEWNESS
For a symmetrical distribution, the frequencies are symmetrically distributed about the
mean, i.e., variates equidistant from the mean have equal frequencies. Also, in the case of such a
distribution, the mean, mode, and median coincide and the median lies halfway between the two
quartiles.
Thus
M = M0 = Md and Q3 – M = M – Q1.
Skewness means a lack of symmetry or lopsidedness in a frequency distribution. The
object of measuring skewness is to estimate the extent to which a distribution is distorted from a
perfectly symmetrical distribution. Skewness indicates whether the curve is turned more to one
side than to the other, i.e., whether the curve has a longer tail on one side.
Skewness can be positive as well as negative. Skewness is positive if the longer tail of the
distribution lies toward the right and negative if it lies toward the left.
21.15
MEASURES OF SKEWNESS
Measures of skewness give us an idea about the extent of “lopsided-ness” in a series. Such
measures should be
(i) Pure numbers so as to be independent of the units in which the variable is measured.
(ii) Zero when the distribution is symmetrical.
Relative measures of skewness are called the coefficient of skewness. They are independent
of the units of measurement and as such, they are pure numbers.
Bowley’s coefficient of skewness based on quartiles is defined as
$$S_k = \frac{(Q_3 - M_d) - (M_d - Q_1)}{(Q_3 - M_d) + (M_d - Q_1)} = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1}$$
Karl Pearson’s coefficient of skewness is defined as
$$S_k = \frac{M - M_0}{\sigma} = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}}$$
If the mode is ill-defined, then using $M_0 = 3M_d - 2M$, we have $S_k = \dfrac{3(M - M_d)}{\sigma}$.
The value of Bowley’s coefficient of skewness lies between –1 and +1 and that of Karl
Pearson’s coefficient of skewness lies between –3 and +3.
Example. Find the coefficient of dispersion and a measure of skewness from the following table giving the wage bonuses of 230 people:
Wage bonuses (in $):  70–80   80–90   90–100   100–110   110–120   120–130   130–140   140–150
No. of people:        12      18      35       42        50        45        20        8
Sol.
Mid-values (x)    No. of people (f)    C.F.    u = (x − 105)/10    fu      fu²
75                12                   12      –3                  –36     108
85                18                   30      –2                  –36     72
95                35                   65      –1                  –35     35
105               42                   107     0                   0       0
115               50                   157     1                   50      50
125               45                   202     2                   90      180
135               20                   222     3                   60      180
145               8                    230     4                   32      128
Total             N = 230                                          125     753
$$\text{Mean } M = a + h\,\frac{\Sigma fu}{N} = 105 + 10 \times \frac{125}{230} = 105 + 5.4 = \$110.4.$$
The greatest frequency 50 lies in the class 110–120. Hence this is the modal class.
$f_m = 50$, $f_1 = 42$, $f_2 = 45$, $l = 110$, $h = 10$
$$\therefore \quad \text{Mode } M_0 = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h = 110 + \frac{50 - 42}{100 - 42 - 45} \times 10 = 110 + \frac{80}{13} = 110 + 6.2 = \$116.2$$
$$\text{Standard deviation } \sigma = h\sqrt{\frac{1}{N}\Sigma fu^2 - \left(\frac{1}{N}\Sigma fu\right)^2} = 10\sqrt{\frac{753}{230} - \left(\frac{125}{230}\right)^2} = \$17.3$$
$$\therefore \quad \text{Coefficient of dispersion} = \frac{\sigma}{M} = \frac{17.3}{110.4} = 0.16$$
$$\text{Measure of skewness } S_k = \frac{M - M_0}{\sigma} = \frac{110.4 - 116.2}{17.3} = -0.33.$$
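The whole calculation (mean, mode, standard deviation, coefficient of dispersion, Pearson's skewness) can be chained in a few lines. The following Python sketch reproduces it for the wage-bonus table; the variable and function names are illustrative, and the `bowley` helper is included only to show the quartile-based alternative from Section 21.15.

```python
import math

mids = [75, 85, 95, 105, 115, 125, 135, 145]
freqs = [12, 18, 35, 42, 50, 45, 20, 8]
a, h = 105, 10                      # assumed mean and class width

n = sum(freqs)
u = [(x - a) / h for x in mids]
sum_fu = sum(f * ui for f, ui in zip(freqs, u))
sum_fu2 = sum(f * ui * ui for f, ui in zip(freqs, u))

mean = a + h * sum_fu / n
sd = h * math.sqrt(sum_fu2 / n - (sum_fu / n) ** 2)

# Mode by interpolation inside the class with the greatest frequency.
m = freqs.index(max(freqs))
lower = mids[m] - h / 2
mode = lower + (freqs[m] - freqs[m - 1]) / (2 * freqs[m] - freqs[m - 1] - freqs[m + 1]) * h

coeff_dispersion = sd / mean
pearson_sk = (mean - mode) / sd


def bowley(q1, median, q3):
    """Bowley's coefficient (Q3 + Q1 - 2 Md) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


print(round(mean, 1), round(mode, 1), round(sd, 1),
      round(coeff_dispersion, 2), round(pearson_sk, 2))
```

A negative Pearson coefficient here indicates a longer tail toward the left, consistent with the mode lying above the mean.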
21.16
MOMENTS
The rth moment of a variable x about any point A is denoted by $\mu_r'$ and is defined as
$$\mu_r' = \frac{1}{N}\Sigma f (x - A)^r \quad \text{where } N = \Sigma f$$
The rth moment of a variable x about the mean M is denoted by $\mu_r$ and is defined as
$$\mu_r = \frac{1}{N}\Sigma f (x - M)^r$$
In particular
$$\mu_0' = \frac{1}{N}\Sigma f (x - A)^0 = \frac{1}{N}\Sigma f = \frac{1}{N} \cdot N = 1$$
Similarly, $\mu_0 = 1$
$$\mu_1 = \frac{1}{N}\Sigma f (x - M) = 0 \quad | \text{ being the algebraic sum of the deviations from the mean}$$
$$\mu_2 = \frac{1}{N}\Sigma f (x - M)^2 = \sigma^2, \text{ by definition.}$$
The results $\mu_0 = 1$, $\mu_1 = 0$, $\mu_2 = \sigma^2$ are of fundamental importance and should be committed to memory.
21.17
RELATION BETWEEN MOMENTS ABOUT THE MEAN IN TERMS OF
MOMENTS ABOUT ANY POINT AND VICE VERSA
By definition,
$$\mu_r' = \frac{1}{N}\Sigma f (x - A)^r \quad \text{where A is any point}$$
or
$$\mu_r' = \frac{1}{N}\Sigma f d^r \quad \text{where } d = x - A \qquad \ldots (i)$$
Setting r = 1, $\mu_1' = \dfrac{1}{N}\Sigma f d$
Now $M = A + \dfrac{1}{N}\Sigma f d = A + \mu_1'$
$$\therefore \quad \mu_1' = M - A \qquad \ldots (ii)$$
$$\mu_r = \frac{1}{N}\Sigma f (x - M)^r = \frac{1}{N}\Sigma f (x - A + A - M)^r = \frac{1}{N}\Sigma f (d - \mu_1')^r \quad | \text{ Using (ii)}$$
$$= \frac{1}{N}\Sigma f \left[d^r - {}^rC_1 d^{r-1}\mu_1' + {}^rC_2 d^{r-2}\mu_1'^2 - {}^rC_3 d^{r-3}\mu_1'^3 + \ldots + (-1)^r \mu_1'^r\right]$$
$$= \frac{1}{N}\Sigma f d^r - {}^rC_1 \mu_1' \cdot \frac{1}{N}\Sigma f d^{r-1} + {}^rC_2 \mu_1'^2\, \frac{1}{N}\Sigma f d^{r-2} - {}^rC_3 \mu_1'^3\, \frac{1}{N}\Sigma f d^{r-3} + \ldots + (-1)^r \mu_1'^r \cdot \frac{1}{N}\Sigma f$$
$$= \mu_r' - {}^rC_1 \mu_{r-1}'\mu_1' + {}^rC_2 \mu_{r-2}'\mu_1'^2 - {}^rC_3 \mu_{r-3}'\mu_1'^3 + \ldots + (-1)^r \mu_1'^r \quad | \text{ Using (i)}$$
In particular, setting r = 2, 3, 4, we get
$$\mu_2 = \mu_2' - 2\mu_1'^2 + \mu_0'\mu_1'^2 = \mu_2' - \mu_1'^2$$
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 3\mu_1'^3 - \mu_0'\mu_1'^3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3$$
$$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 4\mu_1'\mu_1'^3 + \mu_0'\mu_1'^4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4 \quad | \because\ \mu_0' = 1$$
Hence
$$\mu_1 = 0$$
$$\mu_2 = \mu_2' - \mu_1'^2$$
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3 \qquad (\mu_1' = M - A)$$
$$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4$$
1
1
Σ f ( x − M) r = Σ fd r where d = x − M
N
N
1
1
1
μ r′ = Σ f ( x − A) r = Σ f ( x − M + M − A) r = Σ f ( d + μ1′) r
N
N
N
1
= Σ f ( d r + r C1d r −1 μ1′ + r C 2 d r − 2 μ1′2 + r C 3 d r −3 μ1′3 + . . . + μ1′r )
N
1
1
1
1
= Σ fd r + r C1 μ1′ ⋅ Σ fd r −1 + r C 2 μ1′2 Σ fd r − 2 + . . . + μ1′r ⋅ Σ f
N
N
N
N
2
r
r
r
= μ r + C1 μ r −1 μ1′ + C 2 μ r − 2 μ1′ + . . . + μ1′
Conversely, μ r =
Now
. . . (iii)
| Using (ii )
| Using (iii )
In particular, setting r = 2, 3, 4 and noting that μ1 = 0, μ0 = 1, we get
μ2′ = μ2 + 2μ1μ1′ + μ0 μ1′2 = μ2 + μ1′2
μ3′ = μ3 + 3μ2 μ1′ + 3μ1μ1′2 + μ0 μ1′3 = μ3 + 3μ2 μ1′ + μ1′3
μ4′ = μ4 + 4μ3 μ1′ + 6μ2 μ1′2 + 4μ1μ1′3 + μ0 μ1′4 = μ4 + 4μ3 μ1′ + 6μ 2 μ1′2 + μ1′4 .
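The two binomial expansions of this section translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the input moments (2, 20, 40, 50 about an arbitrary point) are sample values.

```python
from math import comb

def central_from_raw(raw):
    """Moments about the mean from moments about a point A.

    raw[r] is the r-th moment about A, with raw[0] = 1.
    Implements mu_r = sum_k (-1)^k * C(r, k) * mu'_{r-k} * (mu'_1)^k.
    """
    m1 = raw[1]
    return [
        sum((-1) ** k * comb(r, k) * raw[r - k] * m1 ** k for k in range(r + 1))
        for r in range(len(raw))
    ]

def raw_from_central(central, m1):
    """Moments about A from moments about the mean, where m1 = M - A.

    Implements mu'_r = sum_k C(r, k) * mu_{r-k} * (mu'_1)^k.
    """
    return [
        sum(comb(r, k) * central[r - k] * m1 ** k for k in range(r + 1))
        for r in range(len(central))
    ]

raw = [1, 2, 20, 40, 50]          # mu'_0 .. mu'_4 (sample values)
central = central_from_raw(raw)   # mu_0 .. mu_4
print(central)
print(raw_from_central(central, raw[1]))
```

Converting to central moments and back should reproduce the original list, which gives a quick consistency check on both formulas.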
21.18
EFFECT OF A CHANGE OF ORIGIN AND SCALE ON MOMENTS
Let
\[ u = \frac{x-A}{h}, \quad \text{i.e., } x = A + hu \]
\[ \therefore\quad \bar{x} = A + h\bar{u}, \text{ where the bar denotes the mean of the respective variable} \]
\[ \therefore\quad x - \bar{x} = h(u - \bar{u}) \]
\[ \mu_r' = \frac{1}{N}\Sigma f(x-A)^r = \frac{1}{N}\Sigma fh^ru^r = h^r\cdot\frac{1}{N}\Sigma fu^r \]
Also
\[ \mu_r = \frac{1}{N}\Sigma f(x-\bar{x})^r = \frac{1}{N}\Sigma fh^r(u-\bar{u})^r = h^r\cdot\frac{1}{N}\Sigma f(u-\bar{u})^r \]
Hence the rth moment of the variable x is h^r times the corresponding moment of the
variable u.
21.19
SHEPPARD’S CORRECTIONS FOR MOMENTS
In the case of class intervals we assume that the frequencies are concentrated at mid-points
of class intervals. Since this assumption is not true in general, some error is likely to creep into
the calculation of moments. W.F. Sheppard gave the following formulae by which these errors
may be corrected.
\[ \mu_2\,(\text{corrected}) = \mu_2 - \frac{1}{12}h^2; \qquad \mu_3\,(\text{corrected}) = \mu_3 \]
\[ \mu_4\,(\text{corrected}) = \mu_4 - \frac{1}{2}h^2\mu_2 + \frac{7}{240}h^4 \]
where h is the width of class intervals.
21.20
CHARLIER’S CHECK
To check the accuracy in the calculation of the first four moments, we often use the
following identities known as Charlier checks:
\[ \Sigma f(x+1) = \Sigma fx + \Sigma f = \Sigma fx + N \]
\[ \Sigma f(x+1)^2 = \Sigma fx^2 + 2\Sigma fx + N \]
\[ \Sigma f(x+1)^3 = \Sigma fx^3 + 3\Sigma fx^2 + 3\Sigma fx + N \]
\[ \Sigma f(x+1)^4 = \Sigma fx^4 + 4\Sigma fx^3 + 6\Sigma fx^2 + 4\Sigma fx + N. \]
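As a small illustration, the four identities can be verified numerically for any frequency table. The values of x and f below are sample data chosen for the sketch, and `power_sum` is a hypothetical helper name.

```python
# Sample frequency distribution (illustrative values only).
x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
f = [1, 8, 28, 56, 70, 56, 28, 8, 1]

def power_sum(values, freqs, r):
    """Return the sum of f * value**r over the table."""
    return sum(fi * v ** r for fi, v in zip(freqs, values))

N = sum(f)
s1, s2, s3, s4 = (power_sum(x, f, r) for r in (1, 2, 3, 4))
shifted = [xi + 1 for xi in x]

checks = [
    power_sum(shifted, f, 1) == s1 + N,
    power_sum(shifted, f, 2) == s2 + 2 * s1 + N,
    power_sum(shifted, f, 3) == s3 + 3 * s2 + 3 * s1 + N,
    power_sum(shifted, f, 4) == s4 + 4 * s3 + 6 * s2 + 4 * s1 + N,
]
print(all(checks))  # prints True
```

In hand computation the left-hand sides come from an extra column f(x + 1)^r, so a mismatch points to an arithmetic slip in one of the columns.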
21.21
PEARSON’S β AND γ COEFFICIENTS
Karl Pearson defined the following four coefficients based upon the first four moments
about the mean:
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3}, \quad \gamma_1 = +\sqrt{\beta_1}; \qquad \beta_2 = \frac{\mu_4}{\mu_2^2}, \quad \gamma_2 = \beta_2 - 3 \]
These coefficients are independent of units of measurement and, therefore, are pure
numbers.
Based upon moments, the coefficient of skewness is
\[ S_k = \frac{\sqrt{\beta_1}\,(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}. \]
21.22
KURTOSIS
Given two frequency distributions that have the same variability as measured by the
standard deviation, they may be relatively more or less flat topped than the “normal curve”. A
frequency curve may be symmetrical but it may not be equally flat topped with the normal curve.
The relative flatness of the top is called kurtosis and is measured by β2.
Curves that are neither flat nor sharply peaked are called normal curves or mesokurtic
curves (see curve A in the figure). For such a curve β2 = 3 and hence γ2 = 0.
Curves that are flatter than the normal curve (see curve B in the figure) are called
platykurtic. For such a curve β2 < 3 and hence γ2 < 0.
Curves that are more sharply peaked than the normal curve (see curve C in the figure) are
called leptokurtic. For such a curve β2 > 3 and hence γ2 > 0.
21.23
β1 AS A MEASURE OF SKEWNESS
For a symmetrical distribution, all the moments of odd order about the mean vanish.
Let x̄ denote the mean of the variate x, then
\[ \mu_{2r+1} = \frac{1}{N}\sum_{i=1}^{n} f_i(x_i-\bar{x})^{2r+1}, \quad N = \Sigma f_i \]
In a symmetrical distribution, the values of the variate equidistant from the mean have equal
frequencies.
\[ \therefore\quad f_1(x_1-\bar{x})^{2r+1} + f_n(x_n-\bar{x})^{2r+1} = 0 \]
[∵ x1 − x̄ and xn − x̄ are equal in magnitude but opposite in sign. Also f1 = fn]
Similarly
\[ f_2(x_2-\bar{x})^{2r+1} + f_{n-1}(x_{n-1}-\bar{x})^{2r+1} = 0 \]
and so on.
If n is even, all the terms in \( \frac{1}{N}\sum_{i=1}^{n} f_i(x_i-\bar{x})^{2r+1} \) cancel in pairs. If n is odd, again the
terms cancel in pairs and the middle term vanishes, since the middle value of the variate equals x̄.
\[ \therefore\quad \mu_{2r+1} = 0 \]
Hence μ3 = 0 and hence
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0. \]
Thus, β1 gives a measure of departure from symmetry, i.e., of skewness.
Example. Calculate the first four moments of the following distribution about the mean and
hence find β1 and β2:

x :   0    1    2    3    4    5    6    7    8
f :   1    8   28   56   70   56   28    8    1

Sol. Let us first calculate moments about x = 4.
\[ \mu_r' = \frac{1}{N}\Sigma f(x-4)^r = \frac{1}{N}\Sigma fd^r, \quad \text{where } d = x - 4 \]

   x       f     d = x – 4     fd     fd²     fd³     fd⁴
   0       1        -4         -4      16     -64     256
   1       8        -3        -24      72    -216     648
   2      28        -2        -56     112    -224     448
   3      56        -1        -56      56     -56      56
   4      70         0          0       0       0       0
   5      56         1         56      56      56      56
   6      28         2         56     112     224     448
   7       8         3         24      72     216     648
   8       1         4          4      16      64     256
 Total  N = 256                 0     512       0    2816

In particular
\[ \mu_1' = \frac{1}{N}\Sigma fd = 0; \qquad \mu_2' = \frac{1}{N}\Sigma fd^2 = \frac{512}{256} = 2 \]
\[ \mu_3' = \frac{1}{N}\Sigma fd^3 = 0; \qquad \mu_4' = \frac{1}{N}\Sigma fd^4 = \frac{2816}{256} = 11 \]
Moments about the mean are
\[ \mu_1 = 0 \text{ (always)}; \qquad \mu_2 = \mu_2' - \mu_1'^2 = 2 \]
\[ \mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3 = 0; \qquad \mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4 = 11 \]
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0; \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{11}{4} = 2.75. \]
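The same quantities can be computed directly from the definitions, without the intermediate moments about x = 4. This is a minimal sketch using the distribution of the example; `central_moment` is a hypothetical helper name.

```python
from math import sqrt

x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
f = [1, 8, 28, 56, 70, 56, 28, 8, 1]

N = sum(f)
mean = sum(fi * xi for fi, xi in zip(f, x)) / N

def central_moment(r):
    """r-th moment about the mean: (1/N) * sum f * (x - mean)**r."""
    return sum(fi * (xi - mean) ** r for fi, xi in zip(f, x)) / N

mu2, mu3, mu4 = central_moment(2), central_moment(3), central_moment(4)

beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
gamma1 = sqrt(beta1)   # sign convention: taken as +sqrt(beta1)
gamma2 = beta2 - 3     # negative here, so the curve is platykurtic
print(mu2, mu3, mu4, beta1, beta2)
```

Since β2 < 3, the distribution is flatter than the normal curve in the sense of Section 21.22.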
TEST YOUR KNOWLEDGE
1. Calculate the quartile deviation of the grades of 63 students in Physics given below:
   Grades          :  0–10  10–20  20–30  30–40  40–50  50–60  60–70  70–80  80–90  90–100
   No. of students :     5      7     10     16     11      7      3      2      2       0
2. Find the mean deviation from the mean of the following distribution:
   Class     :  0–6  6–12  12–18  18–24  24–30
   Frequency :    8    10     12      9      5
3. Compute the mean deviation from the median of the following distribution:
   Grades          :  0–10  10–20  20–30  30–40  40–50
   No. of students :     5     10     20      5     10
4. Compute the standard deviation for the following data relating to grades obtained by 15 students:
12, 21, 21, 23, 27, 28, 30, 34, 37, 39, 39, 39, 40, 49, 54.
5. Calculate the mean and standard deviation for the following distribution:
   x :  56  63  70  77  84  91  98
   f :   3   6  14  16  13   6   2
6. Calculate the mean and standard deviation for the following:
   Size of item :  6   7   8   9  10  11  12
   Frequency    :  3   6   9  13   8   5   4
7. The following table shows the grades obtained by 100 candidates in an examination. Calculate the mean,
median, and standard deviation:
   Grades obtained   :  1–10  11–20  21–30  31–40  41–50  51–60
   No. of candidates :     3     16     26     31     16      8
8. Calculate the mean and standard deviation of the following frequency distribution:
   Weekly bonus wages in $    No. of workers
   4.5–12.5                    4
   12.5–20.5                  24
   20.5–28.5                  21
   28.5–36.5                  18
   36.5–44.5                   5
   44.5–52.5                   3
   52.5–60.5                   5
   60.5–68.5                   8
   68.5–76.5                   2
9. (i) The mean of five items of an observation is 4 and the variance is 5.2. If three of the items are 1, 2,
and 6, then find the other two.
(ii) Show that the variance of the first n positive integers is \( \frac{1}{12}(n^2 - 1). \)
10. Compute the quartile deviation and standard deviation for the following:
   x :  100–109  110–119  120–129  130–139  140–149  150–159  160–169  170–179
   f :       15       44      133      150      125       82       35       16
11. Find the standard deviation for the following data giving bonus wages of 230 people:
   Bonus wages (in $) :  70–80  80–90  90–100  100–110  110–120  120–130  130–140  140–150
   No. of people      :     12     18      35       42       50       45       20        8
12. A collar manufacturer is considering the production of a new type of collar to attract young men. The
following statistics of neck circumferences are available based upon the measurements of a typical group
of college students:
   Mid-value (inches) :  12.5  13.0  13.5  14.0  14.5  15.0  15.5  16.0  16.5
   No. of students    :     4    19    30    63    66    29    18     1     1
Compute the mean, standard deviation, and variance.
13. A student obtained the mean and standard deviation of 100 observations as 40 and 5 respectively. It was
later discovered that he had wrongly copied down an observation as 50 instead of 40. Calculate the
correct mean and standard deviation.
14. The scores of two golfers for 10 rounds each are:
   A :  58  59  60  54  65  66  52  75  69  52
   B :  84  56  92  65  86  78  44  54  78  68
Which may be regarded as the more consistent player?
15. The heights and weights of 10 people are given below. In which characteristic are they more variable?
   Height in cm :  170  172  168  177  179  171  173  178  173  179
   Weight in kg :   75   74   75   76   77   73   76   75   74   75
16. The following are the rushing yards of two high school football teams A and B in a series of games:
   A :  12  115   6  73   7  19  119  36  84  29
   B :  47   12  16  42   4  51   37  48  43   0
Which team has the better running game and which is more consistent?
17. An analysis of monthly bonus wages paid to the workers in two firms A and B belonging to the same
industry gives the following results:
                                               Firm A    Firm B
   Number of workers                            500       600
   Average monthly wage                         $186      $175
   Variance of distribution of bonus wages       81       100
(i) Which firm, A or B, has a larger bonus wage bill?
(ii) In which firm, A or B, is there greater variability in individual bonus wages?
(iii) Calculate the variance of the distribution of bonus wages of all the workers in the firms A and B taken
together.
18. Find the coefficient of skewness for the following distribution:
   Class     :  0–5  5–10  10–15  15–20  20–25  25–30  30–35  35–40
   Frequency :    2     5      7     13     21     16      8      3
19. Calculate the quartile coefficient of skewness for the following distribution:
   x :  1–5  6–10  11–15  16–20  21–25  26–30  31–35
   f :    3     4     68     30     10      6      2
20. Calculate the first four moments about the mean for the following data:
   Variate   :  1  2   3   4   5   6  7  8  9
   Frequency :  1  6  13  25  30  22  9  5  2
21. The first three moments of a distribution about the value 2 of the variable are 1, 16, and – 40. Show that
the mean is 3, variance is 15, and μ 3 = –86. Also show that the first three moments about x = 0 are 3,
24, and 76.
22. For a distribution, the mean is 10, variance is 16, γ 1 is +1 and β 2 is 4. Find the first four moments about
the origin.
23. The first four moments of a distribution about the value 5 of the variable are 2, 20, 40, and 50. Find the
moments about the mean.
24. Show that for a discrete distribution:
(i) β 2 > 1
(ii) β 2 > β1
Answers
 1. 12.32
 2. 6.3
 3. 9
 4. 10.9
 5. 75.53, 9.87
 6. 9, 1.61
 7. 32, 32.6, 12.4
 8. $31.35, $16.64
 9. (i) 4, 7
10. 10.9, 15.26
11. $17.10
12. 14.24, 0.72, 0.52
13. 39.9, 4.9
14. A
15. Height
16. A, B
17. (i) B (ii) B (iii) $180, 121.36
18. –1
19. 0.25
20. 0, 2.49, 0.68, 18.26
22. 10, 116, 1544, 23184
23. 0, 16, –64, 162
________________________________________________________________________________________________________
21.24
CORRELATION
In a bivariate distribution, if a change in one variable is accompanied by a change in the other variable, the variables are said to be correlated.
If the two variables deviate in the same direction, i.e., if the increase (or decrease) in one
results in a corresponding increase (or decrease) in the other, the correlation is said to be direct
or positive.
E.g., the correlation between income and expenditure is positive.
If the two variables deviate in opposite directions, i.e., if the increase (or decrease) in one
results in a corresponding decrease (or increase) in the other, the correlation is said to be inverse
or negative.
E.g., the correlation between volume and the pressure of a perfect gas or the correlation
between price and demand is negative.
Correlation is said to be perfect if the deviation in one variable is followed by a corresponding proportional deviation in the other.
21.25
SCATTER OR DOT DIAGRAMS
This is the simplest method of the diagrammatic representation of bivariate data. Let
( xi , yi ) i = 1, 2, 3, . . . , n be a bivariate distribution. Let the values of the variables x and y be
plotted along the x-axis and y-axis on a suitable scale. Then corresponding to every ordered pair,
there corresponds a point or dot in the xy-plane. The diagram of dots so obtained is called a dot
or scatter diagram.
If the dots are very close to each other and the number of observations is not very large, a
fairly good correlation is expected. If the dots are widely scattered, a poor correlation is
expected.
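21.26 KARL PEARSON’S COEFFICIENT OF CORRELATION (OR PRODUCT
MOMENT CORRELATION COEFFICIENT)
The correlation coefficient between two variables x and y, usually denoted by r(x, y) or r_xy,
is a numerical measure of the linear relationship between them and is defined as
\[ r_{xy} = \frac{\frac{1}{n}\Sigma(x_i-\bar{x})(y_i-\bar{y})}{\sigma_x\sigma_y} = \frac{\frac{1}{n}\Sigma(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\frac{1}{n}\Sigma(x_i-\bar{x})^2\cdot\frac{1}{n}\Sigma(y_i-\bar{y})^2}} = \frac{\Sigma(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\Sigma(x_i-\bar{x})^2\,\Sigma(y_i-\bar{y})^2}} \]
Note. The correlation coefficient is independent of change of origin and scale.
Let us define two new variables u and v as
\[ u = \frac{x-a}{h}, \quad v = \frac{y-b}{k} \]
where a, b, h, k are constants (h, k > 0), then r_xy = r_uv.
21.27
COMPUTATION OF THE CORRELATION COEFFICIENT
We know that
\[ r_{xy} = \frac{\frac{1}{n}\Sigma(x_i-\bar{x})(y_i-\bar{y})}{\sigma_x\sigma_y} \]
Now
\[ \frac{1}{n}\Sigma(x_i-\bar{x})(y_i-\bar{y}) = \frac{1}{n}\Sigma(x_iy_i - x_i\bar{y} - y_i\bar{x} + \bar{x}\bar{y}) \]
\[ = \frac{1}{n}\Sigma x_iy_i - \bar{y}\cdot\frac{1}{n}\Sigma x_i - \bar{x}\cdot\frac{1}{n}\Sigma y_i + \frac{1}{n}(n\bar{x}\bar{y}) \]
\[ = \frac{1}{n}\Sigma x_iy_i - \bar{y}\cdot\bar{x} - \bar{x}\cdot\bar{y} + \bar{x}\cdot\bar{y} = \frac{1}{n}\Sigma x_iy_i - \bar{x}\cdot\bar{y} \]
\[ \sigma_x^2 = \frac{1}{n}\Sigma(x_i-\bar{x})^2 = \frac{1}{n}\Sigma(x_i^2 - 2x_i\bar{x} + \bar{x}^2) \]
\[ = \frac{1}{n}\Sigma x_i^2 - 2\bar{x}\cdot\frac{1}{n}\Sigma x_i + \frac{1}{n}\,n\bar{x}^2 = \frac{1}{n}\Sigma x_i^2 - 2\bar{x}\cdot\bar{x} + \bar{x}^2 = \frac{1}{n}\Sigma x_i^2 - \bar{x}^2 \]
Similarly,
\[ \sigma_y^2 = \frac{1}{n}\Sigma y_i^2 - \bar{y}^2 \]
\[ \therefore\quad r_{xy} = \frac{\frac{1}{n}\Sigma x_iy_i - \bar{x}\bar{y}}{\sqrt{\left(\frac{1}{n}\Sigma x_i^2 - \bar{x}^2\right)\left(\frac{1}{n}\Sigma y_i^2 - \bar{y}^2\right)}} \]
If \( u = \frac{x-a}{h},\ v = \frac{y-b}{k} \), then
\[ r_{xy} = r_{uv} = \frac{\frac{1}{n}\Sigma u_iv_i - \bar{u}\bar{v}}{\sqrt{\left(\frac{1}{n}\Sigma u_i^2 - \bar{u}^2\right)\left(\frac{1}{n}\Sigma v_i^2 - \bar{v}^2\right)}}. \]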
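A short sketch of the computational formula, together with a numerical check that a change of origin and scale leaves r unchanged. The data and the constants a = 22, h = 4, b = 24, k = 6 are illustrative choices, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """r = (mean(xy) - mean(x)*mean(y)) / sqrt(var(x) * var(y))."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum(a * b for a, b in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum(a * a for a in xs) / n - mean_x ** 2
    var_y = sum(b * b for b in ys) / n - mean_y ** 2
    return cov / sqrt(var_x * var_y)

x = [10, 14, 18, 22, 26, 30]
y = [18, 12, 24, 6, 30, 36]

# Change of origin and scale with positive h and k.
u = [(xi - 22) / 4 for xi in x]
v = [(yi - 24) / 6 for yi in y]

r_xy = pearson_r(x, y)
r_uv = pearson_r(u, v)
print(round(r_xy, 4), round(r_uv, 4))
```

The two printed values agree, which is why hand computations are usually carried out on the smaller numbers u and v.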
ILLUSTRATIVE EXAMPLES
Example 1. Ten students got the following percentage of grades in Principles of Economics
and Statistics:

   Roll Nos.            :   1   2   3   4   5   6   7   8   9  10
   Grades in Economics  :  78  36  98  25  75  82  90  62  65  39
   Grades in Statistics :  84  51  91  60  68  62  86  58  53  47

Calculate the coefficient of correlation.
Sol. Let the grades in the two subjects be denoted by x and y respectively.

    x      y    u = x – 65   v = y – 66     u²      v²      uv
   78     84       13           18         169     324     234
   36     51      -29          -15         841     225     435
   98     91       33           25        1089     625     825
   25     60      -40           -6        1600      36     240
   75     68       10            2         100       4      20
   82     62       17           -4         289      16     -68
   90     86       25           20         625     400     500
   62     58       -3           -8           9      64      24
   65     53        0          -13           0     169       0
   39     47      -26          -19         676     361     494
  Total             0            0        5398    2224    2704

\[ \bar{u} = \frac{1}{n}\Sigma u_i = 0, \quad \bar{v} = \frac{1}{n}\Sigma v_i = 0 \]
\[ r_{uv} = \frac{\frac{1}{n}\Sigma u_iv_i - \bar{u}\bar{v}}{\sqrt{\left(\frac{1}{n}\Sigma u_i^2 - \bar{u}^2\right)\left(\frac{1}{n}\Sigma v_i^2 - \bar{v}^2\right)}} = \frac{\frac{1}{10}(2704)}{\sqrt{\frac{1}{10}(5398)\cdot\frac{1}{10}(2224)}} = \frac{2704}{\sqrt{5398\times 2224}} = 0.780 \]
Hence r_xy = r_uv = 0.780.
Example 2. Find the coefficient of correlation for the following table:

   x :  10  14  18  22  26  30
   y :  18  12  24   6  30  36

Sol. Let
\[ u = \frac{x-22}{4}, \quad v = \frac{y-24}{6}. \]

     x      y      u      v     u²     v²     uv
    10     18     -3     -1      9      1      3
    14     12     -2     -2      4      4      4
    18     24     -1      0      1      0      0
    22      6      0     -3      0      9      0
    26     30      1      1      1      1      1
    30     36      2      2      4      4      4
   Total          -3     -3     19     19     12

\[ \bar{u} = \frac{1}{n}\Sigma u_i = \frac{1}{6}(-3) = -\frac{1}{2}; \qquad \bar{v} = \frac{1}{n}\Sigma v_i = \frac{1}{6}(-3) = -\frac{1}{2} \]
\[ r_{uv} = \frac{\frac{1}{n}\Sigma u_iv_i - \bar{u}\bar{v}}{\sqrt{\left(\frac{1}{n}\Sigma u_i^2 - \bar{u}^2\right)\left(\frac{1}{n}\Sigma v_i^2 - \bar{v}^2\right)}} = \frac{\frac{1}{6}(12) - \frac{1}{4}}{\sqrt{\left[\frac{1}{6}(19) - \frac{1}{4}\right]\left[\frac{1}{6}(19) - \frac{1}{4}\right]}} = 0.6 \]
Hence r_xy = r_uv = 0.6.
Example 3. A computer, while calculating the correlation coefficient between two variables
X and Y from 25 pairs of observations, obtained the following results:
\[ n = 25, \quad \Sigma X = 125, \quad \Sigma Y = 100, \quad \Sigma X^2 = 650, \quad \Sigma Y^2 = 460, \quad \Sigma XY = 508. \]
It was, however, later discovered at the time of checking that two pairs had been copied
incorrectly as

   X    Y
   6   14
   8    6

while the correct values were

   X    Y
   8   12
   6    8

Obtain the correct value of the correlation coefficient.
Sol. Subtract the incorrect values and add the corresponding correct values:
\[ \text{Corrected } \Sigma X = 125 - 6 - 8 + 8 + 6 = 125 \]
\[ \text{Corrected } \Sigma Y = 100 - 14 - 6 + 12 + 8 = 100 \]
\[ \text{Corrected } \Sigma X^2 = 650 - 6^2 - 8^2 + 8^2 + 6^2 = 650 \]
\[ \text{Corrected } \Sigma Y^2 = 460 - 14^2 - 6^2 + 12^2 + 8^2 = 436 \]
\[ \text{Corrected } \Sigma XY = 508 - 6\times 14 - 8\times 6 + 8\times 12 + 6\times 8 = 520 \]
\[ \bar{X} = \frac{1}{n}\Sigma X = \frac{1}{25}\times 125 = 5; \qquad \bar{Y} = \frac{1}{n}\Sigma Y = \frac{1}{25}\times 100 = 4 \]
\[ \text{Corrected } r_{xy} = \frac{\frac{1}{n}\Sigma XY - \bar{X}\bar{Y}}{\sqrt{\left(\frac{1}{n}\Sigma X^2 - \bar{X}^2\right)\left(\frac{1}{n}\Sigma Y^2 - \bar{Y}^2\right)}} = \frac{\frac{1}{25}\times 520 - 5\times 4}{\sqrt{\left(\frac{1}{25}\times 650 - 25\right)\left(\frac{1}{25}\times 436 - 16\right)}} \]
\[ = \frac{\frac{4}{5}}{\sqrt{(1)\left(\frac{36}{25}\right)}} = \frac{4}{5}\times\frac{5}{6} = \frac{2}{3} = 0.67. \]
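The correction procedure (subtract the wrong pairs, add the right ones, then recompute r) can be sketched as follows. The dictionary keys and variable names are illustrative, not part of the original solution.

```python
from math import sqrt

n = 25
# Sums as first (incorrectly) obtained.
sums = {"x": 125, "y": 100, "xx": 650, "yy": 460, "xy": 508}

wrong_pairs = [(6, 14), (8, 6)]
right_pairs = [(8, 12), (6, 8)]

# Remove each wrong pair's contribution and add the right pair's.
for (xw, yw), (xr, yr) in zip(wrong_pairs, right_pairs):
    sums["x"] += xr - xw
    sums["y"] += yr - yw
    sums["xx"] += xr ** 2 - xw ** 2
    sums["yy"] += yr ** 2 - yw ** 2
    sums["xy"] += xr * yr - xw * yw

mean_x, mean_y = sums["x"] / n, sums["y"] / n
cov = sums["xy"] / n - mean_x * mean_y
var_x = sums["xx"] / n - mean_x ** 2
var_y = sums["yy"] / n - mean_y ** 2
r = cov / sqrt(var_x * var_y)
print(sums, round(r, 2))
```

Only the five running sums are needed, so the original 25 pairs never have to be re-entered.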
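Example 4. If z = ax + by and r is the correlation coefficient between x and y, show that
\[ \sigma_z^2 = a^2\sigma_x^2 + b^2\sigma_y^2 + 2abr\sigma_x\sigma_y. \]
Sol.
\[ z = ax + by \ \Rightarrow\ \bar{z} = a\bar{x} + b\bar{y}, \quad z_i = ax_i + by_i \]
\[ \therefore\quad z_i - \bar{z} = a(x_i-\bar{x}) + b(y_i-\bar{y}) \]
Now
\[ \sigma_z^2 = \frac{1}{n}\Sigma(z_i-\bar{z})^2 = \frac{1}{n}\Sigma[a(x_i-\bar{x}) + b(y_i-\bar{y})]^2 \]
\[ = \frac{1}{n}\Sigma\left[a^2(x_i-\bar{x})^2 + b^2(y_i-\bar{y})^2 + 2ab(x_i-\bar{x})(y_i-\bar{y})\right] \]
\[ = a^2\cdot\frac{1}{n}\Sigma(x_i-\bar{x})^2 + b^2\cdot\frac{1}{n}\Sigma(y_i-\bar{y})^2 + 2ab\cdot\frac{1}{n}\Sigma(x_i-\bar{x})(y_i-\bar{y}) \]
\[ = a^2\sigma_x^2 + b^2\sigma_y^2 + 2abr\sigma_x\sigma_y \qquad \left[\because\ r = \frac{\frac{1}{n}\Sigma(x_i-\bar{x})(y_i-\bar{y})}{\sigma_x\sigma_y}\right] \]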
21.28 CALCULATION OF THE COEFFICIENT OF CORRELATION FOR A
BIVARIATE FREQUENCY DISTRIBUTION
If the bivariate data on x and y is presented on a two-way correlation table and f is the
frequency of a particular rectangle in the correlation table, then
\[ r_{xy} = \frac{\Sigma fxy - \frac{1}{n}\Sigma fx\,\Sigma fy}{\sqrt{\left[\Sigma fx^2 - \frac{1}{n}(\Sigma fx)^2\right]\left[\Sigma fy^2 - \frac{1}{n}(\Sigma fy)^2\right]}} \]
Since the change of origin and scale do not affect the coefficient of correlation,
∴ r_xy = r_uv where the new variables u, v are properly chosen.
Example. The following table gives, according to age, the frequency of grades obtained by
100 students in an intelligence test:

                     Age (in years)
   Grades      18     19     20     21    Total
   10–20        4      2      2      –       8
   20–30        5      4      6      4      19
   30–40        6      8     10     11      35
   40–50        4      4      6      8      22
   50–60        –      2      4      4      10
   60–70        –      2      3      1       6
   Total       19     22     31     28     100

Calculate the coefficient of correlation between age and intelligence.
Sol. Let age and intelligence be denoted by x and y respectively.
Let us define two new variables u and v as
\[ u = \frac{y-45}{10}, \quad v = x - 20 \]

                 Mid            x
   y            value    18    19    20    21      f      u     fu    fu²    fuv
   10–20          15      4     2     2     –      8     -3    -24     72     30
   20–30          25      5     4     6     4     19     -2    -38     76     20
   30–40          35      6     8    10    11     35     -1    -35     35      9
   40–50          45      4     4     6     8     22      0      0      0      0
   50–60          55      –     2     4     4     10      1     10     10      2
   60–70          65      –     2     3     1      6      2     12     24     -2
   f                     19    22    31    28    100   Totals  -75    217     59
   v                     -2    -1     0     1
   fv                   -38   -22     0    28    Totals -32
   fv²                   76    22     0    28    Totals 126
   fuv                   56    16     0   -13    Totals  59

\[ r_{xy} = r_{uv} = \frac{\Sigma fuv - \frac{1}{n}\Sigma fu\,\Sigma fv}{\sqrt{\left[\Sigma fu^2 - \frac{1}{n}(\Sigma fu)^2\right]\left[\Sigma fv^2 - \frac{1}{n}(\Sigma fv)^2\right]}} \]
\[ = \frac{59 - \frac{1}{100}(-75)(-32)}{\sqrt{\left[217 - \frac{1}{100}(-75)^2\right]\left[126 - \frac{1}{100}(-32)^2\right]}} = \frac{59 - 24}{\sqrt{\frac{643}{4}\times\frac{2894}{25}}} = 0.26. \]
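The column and row totals in a two-way table are easy to get wrong by hand, so a sketch of the same computation in code may be useful. The table layout below (rows for grade classes, columns for ages 18 to 21, empty cells as 0) is an assumed reading of the example's data, and all names are illustrative.

```python
from math import sqrt

ages = [18, 19, 20, 21]
mid_grades = [15, 25, 35, 45, 55, 65]
# table[i][j] = frequency for grade class i and age j (empty cells as 0).
table = [
    [4, 2, 2, 0],
    [5, 4, 6, 4],
    [6, 8, 10, 11],
    [4, 4, 6, 8],
    [0, 2, 4, 4],
    [0, 2, 3, 1],
]

u = [(m - 45) // 10 for m in mid_grades]  # step deviations for grades
v = [a - 20 for a in ages]                # deviations for age

cells = [(table[i][j], u[i], v[j]) for i in range(len(u)) for j in range(len(v))]
n = sum(f for f, _, _ in cells)
s_fu = sum(f * ui for f, ui, _ in cells)
s_fv = sum(f * vj for f, _, vj in cells)
s_fu2 = sum(f * ui ** 2 for f, ui, _ in cells)
s_fv2 = sum(f * vj ** 2 for f, _, vj in cells)
s_fuv = sum(f * ui * vj for f, ui, vj in cells)

r = (s_fuv - s_fu * s_fv / n) / sqrt(
    (s_fu2 - s_fu ** 2 / n) * (s_fv2 - s_fv ** 2 / n)
)
print(n, s_fu, s_fv, s_fu2, s_fv2, s_fuv, round(r, 3))
```

Summing over individual cells avoids the separate row-wise and column-wise fuv totals, which in the tabular method serve as a cross-check on each other.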
21.29
RANK CORRELATION
Sometimes we have to deal with problems in which data cannot be quantitatively measured
but qualitative assessment is possible.
Let a group of n individuals be arranged in order of merit or proficiency in possession of
two characteristics A and B. The ranks in the two characteristics are, in general, different. For
example, if A stands for intelligence and B for beauty, it is not necessary that the most intelligent
individual may be the most beautiful and vice versa. Thus an individual who is ranked at the top
for the characteristic A may be ranked at the bottom for the characteristic B. Let ( xi , yi ), i = 1, 2,
. . . , n be the ranks of the n individuals in the group for the characteristics A and B respectively.
The Pearsonian coefficient of correlation between the ranks xi’s and yi’s is called the rank
correlation coefficient between the characteristics A and B for that group of individuals.
Thus the rank correlation coefficient
\[ r = \frac{\Sigma(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\Sigma(x_i-\bar{x})^2\,\Sigma(y_i-\bar{y})^2}} = \frac{\frac{1}{n}\Sigma(x_i-\bar{x})(y_i-\bar{y})}{\sigma_x\sigma_y} \qquad \ldots (1) \]
Now xi’s and yi’s are merely the permutations of n numbers from 1 to n. Assuming that no
two individuals are bracketed or tied in either classification, i.e., (x_i, y_i) ≠ (x_j, y_j) for i ≠ j,
both x and y take all integral values from 1 to n.
\[ \therefore\quad \bar{x} = \bar{y} = \frac{1}{n}(1 + 2 + 3 + \ldots + n) = \frac{1}{n}\cdot\frac{n(n+1)}{2} = \frac{n+1}{2} \]
\[ \Sigma x_i = 1 + 2 + 3 + \ldots + n = \frac{n(n+1)}{2} = \Sigma y_i \]
\[ \Sigma x_i^2 = 1^2 + 2^2 + \ldots + n^2 = \frac{n(n+1)(2n+1)}{6} = \Sigma y_i^2 \]
If d_i denotes the difference in ranks of the ith individual, then
\[ d_i = x_i - y_i = (x_i-\bar{x}) - (y_i-\bar{y}) \qquad [\because\ \bar{x} = \bar{y}] \]
\[ \frac{1}{n}\Sigma d_i^2 = \frac{1}{n}\Sigma\left[(x_i-\bar{x}) - (y_i-\bar{y})\right]^2 \]
\[ = \frac{1}{n}\Sigma(x_i-\bar{x})^2 + \frac{1}{n}\Sigma(y_i-\bar{y})^2 - 2\cdot\frac{1}{n}\Sigma(x_i-\bar{x})(y_i-\bar{y}) \]
\[ = \sigma_x^2 + \sigma_y^2 - 2r\sigma_x\sigma_y \qquad \ldots (2) \quad [\text{using (1)}] \]
But
\[ \sigma_x^2 = \frac{1}{n}\Sigma x_i^2 - \bar{x}^2 = \frac{1}{n}\Sigma y_i^2 - \bar{y}^2 = \sigma_y^2 \]
∴ From (2),
\[ \frac{1}{n}\Sigma d_i^2 = 2\sigma_x^2 - 2r\sigma_x^2 = 2(1-r)\sigma_x^2 = 2(1-r)\left[\frac{1}{n}\Sigma x_i^2 - \bar{x}^2\right] \]
\[ = 2(1-r)\left[\frac{1}{n}\cdot\frac{n(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4}\right] \]
\[ = (1-r)(n+1)\left[\frac{4n+2-3n-3}{6}\right] = \frac{(1-r)(n^2-1)}{6} \]
or
\[ 1 - r = \frac{6\Sigma d_i^2}{n(n^2-1)} \]
Hence
\[ r = 1 - \frac{6\Sigma d_i^2}{n(n^2-1)}. \]
Note. This is called Spearman’s Formula for Rank Correlation.
\[ \Sigma d_i = \Sigma(x_i - y_i) = \Sigma x_i - \Sigma y_i = 0 \]
always. This serves as a check on calculations.
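Spearman's formula, including the Σd = 0 check, can be sketched in a few lines. The scores below are illustrative, the function names are hypothetical, and the code assumes there are no tied values (rank 1 goes to the largest value).

```python
def ranks_desc(values):
    """Rank 1 for the largest value; assumes all values are distinct."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

def spearman(xs, ys):
    n = len(xs)
    d = [rx - ry for rx, ry in zip(ranks_desc(xs), ranks_desc(ys))]
    assert sum(d) == 0  # check on the ranking: the differences must sum to zero
    return 1 - 6 * sum(di * di for di in d) / (n * (n * n - 1))

X = [10, 15, 12, 17, 13, 16, 24, 14, 22]
Y = [30, 42, 45, 46, 33, 34, 40, 35, 39]
r = spearman(X, Y)
print(round(r, 2))
```

Ranking in ascending rather than descending order would give the same r, provided both series are ranked the same way.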
Example. The grades secured by recruits in the selection test (X) and in the proficiency test
(Y) are given below:

   Serial No. :   1   2   3   4   5   6   7   8   9
   X          :  10  15  12  17  13  16  24  14  22
   Y          :  30  42  45  46  33  34  40  35  39

Calculate the rank correlation coefficient.
Sol. Here the grades are given. Therefore, first of all, write down ranks. In each series, the
item with the largest size is ranked 1, next largest 2, and so on.

     X      Y    Ranks in X (x)   Ranks in Y (y)   d = x – y    d²
    10     30          9                9               0        0
    15     42          5                3               2        4
    12     45          8                2               6       36
    17     46          3                1               2        4
    13     33          7                8              -1        1
    16     34          4                7              -3        9
    24     40          1                4              -3        9
    14     35          6                6               0        0
    22     39          2                5              -3        9
   Total                                                0       72

Here n = 9.
\[ \therefore\quad r = 1 - \frac{6\Sigma d^2}{n(n^2-1)} = 1 - \frac{6\times 72}{9\times 80} = 1 - 0.6 = 0.4 \]
21.30
REPEATED RANKS
If any two or more individuals have the same rank or the same value in the series of grades,
then the above formula fails and requires an adjustment. In such cases, each individual is given
an average rank. This common average rank is the average of the ranks that these individuals
would have assumed if they were slightly different from each other. Thus, if two individuals are
ranked equal at the sixth place, they would have assumed the 6th and 7th ranks if they were
ranked slightly differently. Their common rank = \( \frac{6+7}{2} = 6.5 \). If three individuals are ranked
equal in fourth place, they would have assumed the 4th, 5th, and 6th ranks if they were ranked
slightly differently. Their common rank = \( \frac{4+5+6}{3} = 5 \).
Adjustment. Add \( \frac{1}{12}m(m^2-1) \) to Σd², where m stands for the number of times an item is
repeated.
This adjustment factor is to be added for each repeated item.
Thus
\[ r = 1 - \frac{6\left\{\Sigma d^2 + \frac{1}{12}m(m^2-1) + \frac{1}{12}m(m^2-1) + \ldots\right\}}{n(n^2-1)} \]
Example. Obtain the rank correlation coefficient for the following data:

   X :  68  64  75  50  64  80  75  40  55  64
   Y :  62  58  68  45  81  60  68  48  50  70

Sol. Here, grades are given, so write down the ranks.

     X      Y    Ranks in X (x)   Ranks in Y (y)   d = x – y    d²
    68     62         4                 5              -1        1
    64     58         6                 7              -1        1
    75     68         2.5               3.5            -1        1
    50     45         9                10              -1        1
    64     81         6                 1               5       25
    80     60         1                 6              -5       25
    75     68         2.5               3.5            -1        1
    40     48        10                 9               1        1
    55     50         8                 8               0        0
    64     70         6                 2               4       16
   Total                                                0       72

In the X-series, the value 75 occurs twice. Had these values been slightly different, they
would have been given the ranks 2 and 3. Therefore, the common rank given to them is \( \frac{2+3}{2} = 2.5 \).
The value 64 occurs three times. Had these values been slightly different, they would have
been given the ranks 5, 6, and 7. Therefore the common rank given to them is \( \frac{5+6+7}{3} = 6 \).
Similarly, in the Y-series, the value 68 occurs twice. Had these values been slightly different,
they would have been given the ranks 3 and 4. Therefore, the common rank given to them is
\( \frac{3+4}{2} = 3.5 \).
Thus, m has the values 2, 3, 2.
\[ \therefore\quad r = 1 - \frac{6\left\{\Sigma d^2 + \frac{1}{12}m(m^2-1) + \frac{1}{12}m(m^2-1) + \ldots\right\}}{n(n^2-1)} \]
\[ = 1 - \frac{6\left[72 + \frac{1}{12}\{2(2^2-1)\} + \frac{1}{12}\{3(3^2-1)\} + \frac{1}{12}\{2(2^2-1)\}\right]}{10(10^2-1)} \]
\[ = 1 - \frac{6\times 75}{990} = \frac{6}{11} = 0.545. \]
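The tie adjustment can be sketched in code as well. The helper names are hypothetical; the logic assigns each repeated value the average of the ranks it would otherwise occupy and adds m(m² − 1)/12 for every group of m tied values in either series.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 for the largest value; tied values share the average rank."""
    order = sorted(values, reverse=True)
    rank_of = {}
    for val in set(values):
        first = order.index(val) + 1   # first rank the tied group would occupy
        m = order.count(val)           # size of the tied group
        rank_of[val] = first + (m - 1) / 2
    return [rank_of[val] for val in values]

def spearman_with_ties(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = sum(
        m * (m * m - 1) / 12
        for series in (xs, ys)
        for m in Counter(series).values()
        if m > 1
    )
    return 1 - 6 * (sum_d2 + correction) / (n * (n * n - 1))

X = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
Y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]
r = spearman_with_ties(X, Y)
print(round(r, 3))
```

When no value is repeated the correction term is zero and the function reduces to the plain Spearman formula.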
21.31
REGRESSION
Regression is the estimation or prediction of unknown values of one variable from known
values of another variable.
After establishing the fact of correlation between two variables, it is natural to want to know
the extent to which one variable varies in response to a given variation in the other variable; one
is interested to know the nature of the relationship between the two variables.
Regression measures the nature and extent of correlation.
21.32
LINEAR REGRESSION
If two variates x and y are correlated, i.e., there exists an association or relationship between
them, then the scatter diagram will be more or less concentrated around a curve. This curve is
called the curve of regression and the relationship is said to be expressed by means of curvilinear
regression. In the particular case, when the curve is a straight line, it is called a line of regression
and the regression is said to be linear.
A line of regression is the straight line that gives the best fit in the least square sense to
the given frequency.
If the line of regression is so chosen that the sum of squares of deviation parallel to the axis
of y is minimized [See part (a) of the figure on the next page], it is called the line of regression of
y on x and it gives the best estimate of y for any given value of x.
If the line of regression is so chosen that the sum of squares of deviation parallel to the axis
of x is minimized [See part (b) of the figure on the next page], it is called the line of regression of
x on y and it gives the best estimate of x for any given value of y.
21.33
LINES OF REGRESSION
Let the equation of the line of regression of y on x be
\[ y = a + bx \qquad \ldots (1) \]
Then
\[ \bar{y} = a + b\bar{x} \qquad \ldots (2) \]
Subtracting (2) from (1), we have
\[ y - \bar{y} = b(x - \bar{x}) \qquad \ldots (3) \]
The normal equations are
\[ \Sigma y = na + b\Sigma x \]
\[ \Sigma xy = a\Sigma x + b\Sigma x^2 \qquad \ldots (4) \]
Shifting the origin to (x̄, ȳ), (4) becomes
\[ \Sigma(x-\bar{x})(y-\bar{y}) = a\Sigma(x-\bar{x}) + b\Sigma(x-\bar{x})^2 \qquad \ldots (5) \]
Since
\[ \frac{\Sigma(x-\bar{x})(y-\bar{y})}{n\sigma_x\sigma_y} = r, \qquad \Sigma(x-\bar{x}) = 0, \qquad \text{and} \qquad \frac{1}{n}\Sigma(x-\bar{x})^2 = \sigma_x^2 \]
∴ From (5),
\[ nr\sigma_x\sigma_y = a\cdot 0 + b\cdot n\sigma_x^2 \ \Rightarrow\ b = \frac{r\sigma_y}{\sigma_x} \]
Hence, from (3), the line of regression of y on x is
\[ y - \bar{y} = r\frac{\sigma_y}{\sigma_x}(x - \bar{x}) \]
Similarly, the line of regression of x on y is
\[ x - \bar{x} = r\frac{\sigma_x}{\sigma_y}(y - \bar{y}) \]
\( \frac{r\sigma_y}{\sigma_x} \) is called the regression coefficient of y on x and is denoted by b_yx.
\( \frac{r\sigma_x}{\sigma_y} \) is called the regression coefficient of x on y and is denoted by b_xy.
Note. If r = 0, the two lines of regression become y = ȳ and x = x̄, which are two straight lines parallel to the
X- and Y-axes respectively and passing through their means ȳ and x̄. They are mutually perpendicular.
If r = ±1, the two lines of regression will coincide.
21.34
PROPERTIES OF REGRESSION
Property I. The correlation coefficient is the geometric mean between the regression
coefficients.
Proof. The coefficients of regression are \( \frac{r\sigma_y}{\sigma_x} \) and \( \frac{r\sigma_x}{\sigma_y} \).
\[ \text{G.M. between them} = \sqrt{\frac{r\sigma_y}{\sigma_x}\times\frac{r\sigma_x}{\sigma_y}} = \sqrt{r^2} = r = \text{coefficient of correlation.} \]
Property II. If one of the regression coefficients is greater than 1, the other must be less
than 1.
Proof. The two regression coefficients are \( b_{yx} = \frac{r\sigma_y}{\sigma_x} \) and \( b_{xy} = \frac{r\sigma_x}{\sigma_y} \).
Let b_yx > 1, then
\[ \frac{1}{b_{yx}} < 1 \qquad \ldots (1) \]
Since \( b_{yx}\cdot b_{xy} = r^2 \le 1 \) (∵ −1 ≤ r ≤ 1),
\[ \therefore\quad b_{xy} \le \frac{1}{b_{yx}} < 1. \quad [\text{using (1)}] \]
Similarly, if b_xy > 1, then b_yx < 1.
Property III. The arithmetic mean of regression coefficients is not less than the correlation
coefficient (provided r > 0).
Proof. We have to prove that
\[ \frac{b_{yx} + b_{xy}}{2} \ge r \quad \text{or} \quad \frac{\frac{r\sigma_y}{\sigma_x} + \frac{r\sigma_x}{\sigma_y}}{2} \ge r \]
or
\[ \sigma_y^2 + \sigma_x^2 \ge 2\sigma_x\sigma_y \quad \text{or} \quad (\sigma_x - \sigma_y)^2 \ge 0, \text{ which is true.} \]
Property IV. Regression coefficients are independent of the origin but not of scale.
Proof. Let
\[ u = \frac{x-a}{h}, \quad v = \frac{y-b}{k}, \quad \text{where } a, b, h, \text{ and } k \text{ are constants} \]
\[ b_{yx} = \frac{r\sigma_y}{\sigma_x} = r\cdot\frac{k\sigma_v}{h\sigma_u} = \frac{k}{h}\left(\frac{r\sigma_v}{\sigma_u}\right) = \frac{k}{h}\,b_{vu} \]
Similarly,
\[ b_{xy} = \frac{h}{k}\,b_{uv}. \]
Thus, b_yx and b_xy are both independent of a and b but not of h and k.
Property V. The correlation coefficient and the two regression coefficients have the
same sign.
Proof.
\[ \text{Regression coefficient of } y \text{ on } x = b_{yx} = r\frac{\sigma_y}{\sigma_x} \]
\[ \text{Regression coefficient of } x \text{ on } y = b_{xy} = r\frac{\sigma_x}{\sigma_y} \]
Since σx and σy are both positive, b_yx, b_xy, and r have the same sign.
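21.35
ANGLE BETWEEN TWO LINES OF REGRESSION
If θ is the acute angle between the two regression lines in the case of two variables x and y,
show that
\[ \tan\theta = \frac{1-r^2}{r}\cdot\frac{\sigma_x\sigma_y}{\sigma_x^2+\sigma_y^2} \]
where r, σx, σy have their usual meanings.
Explain the significance of the formula when r = 0 and r = ±1.
Proof. Equations of the lines of regression of y on x and x on y are
\[ y - \bar{y} = \frac{r\sigma_y}{\sigma_x}(x - \bar{x}) \quad \text{and} \quad x - \bar{x} = \frac{r\sigma_x}{\sigma_y}(y - \bar{y}) \]
Their slopes are
\[ m_1 = \frac{r\sigma_y}{\sigma_x} \quad \text{and} \quad m_2 = \frac{\sigma_y}{r\sigma_x}. \]
\[ \therefore\quad \tan\theta = \pm\frac{m_2 - m_1}{1 + m_2m_1} = \pm\frac{\frac{\sigma_y}{r\sigma_x} - \frac{r\sigma_y}{\sigma_x}}{1 + \frac{\sigma_y^2}{\sigma_x^2}} \]
\[ = \pm\frac{1-r^2}{r}\cdot\frac{\sigma_y}{\sigma_x}\cdot\frac{\sigma_x^2}{\sigma_x^2+\sigma_y^2} = \pm\frac{1-r^2}{r}\cdot\frac{\sigma_x\sigma_y}{\sigma_x^2+\sigma_y^2} \]
Since r² ≤ 1 and σx, σy are positive,
∴ Positive sign gives the acute angle between the lines.
Hence
\[ \tan\theta = \frac{1-r^2}{r}\cdot\frac{\sigma_x\sigma_y}{\sigma_x^2+\sigma_y^2} \]
When r = 0, θ = π/2.
∴ The two lines of regression are perpendicular to each other.
Hence the estimated value of y is the same for all values of x and vice versa.
When r = ±1, tan θ = 0 so that θ = 0 or π.
Hence the lines of regression coincide and there is a perfect correlation between the two
variates x and y.
Note.
\[ \frac{r\sigma_y}{\sigma_x} = \frac{\frac{1}{n}\Sigma xy - \bar{x}\bar{y}}{\sigma_x\sigma_y}\cdot\frac{\sigma_y}{\sigma_x} = \frac{\frac{1}{n}\Sigma xy - \bar{x}\bar{y}}{\sigma_x^2} = \frac{\frac{1}{n}\Sigma xy - \bar{x}\bar{y}}{\frac{1}{n}\Sigma x^2 - \bar{x}^2}. \]
Similarly,
\[ \frac{r\sigma_x}{\sigma_y} = \frac{\frac{1}{n}\Sigma xy - \bar{x}\bar{y}}{\sigma_x\sigma_y}\cdot\frac{\sigma_x}{\sigma_y} = \frac{\frac{1}{n}\Sigma xy - \bar{x}\bar{y}}{\sigma_y^2} = \frac{\frac{1}{n}\Sigma xy - \bar{x}\bar{y}}{\frac{1}{n}\Sigma y^2 - \bar{y}^2} \]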
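The angle formula can be compared numerically with the angle obtained directly from the two slopes. In this sketch the values r = 0.6, σx = 3, σy = 4 are illustrative, the function names are hypothetical, and |r| is used (an assumption for this sketch) so that the result stays non-negative when r is negative.

```python
from math import atan, degrees

def tan_angle_formula(r, sx, sy):
    """tan(theta) = (1 - r^2)/|r| * sx*sy / (sx^2 + sy^2), for r != 0."""
    return (1 - r * r) / abs(r) * sx * sy / (sx ** 2 + sy ** 2)

def tan_angle_from_slopes(r, sx, sy):
    """tan(theta) = |m2 - m1| / |1 + m1*m2| using the two regression slopes."""
    m1 = r * sy / sx        # slope of the line of y on x
    m2 = sy / (r * sx)      # slope of the line of x on y
    return abs((m2 - m1) / (1 + m1 * m2))

r, sx, sy = 0.6, 3.0, 4.0
t_formula = tan_angle_formula(r, sx, sy)
t_slopes = tan_angle_from_slopes(r, sx, sy)
theta = degrees(atan(t_formula))
print(round(t_formula, 4), round(t_slopes, 4), round(theta, 2))
```

As r approaches ±1 the computed tangent tends to zero, matching the statement that the two lines then coincide.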
ILLUSTRATIVE EXAMPLES
Example 1. Calculate the coefficient of correlation and obtain the least square regression
line of y on x for the following data:

   x :  1  2   3   4   5   6   7   8   9
   y :  9  8  10  12  11  13  14  16  15

Also obtain an estimate of y that should correspond on the average to x = 6.2.
Sol.

     x      y    u = x – 5   v = y – 12    u²     v²     uv
     1      9       -4          -3         16      9     12
     2      8       -3          -4          9     16     12
     3     10       -2          -2          4      4      4
     4     12       -1           0          1      0      0
     5     11        0          -1          0      1      0
     6     13        1           1          1      1      1
     7     14        2           2          4      4      4
     8     16        3           4          9     16     12
     9     15        4           3         16      9     12
   Total             0           0         60     60     57

\[ r_{xy} = r_{uv} = \frac{\frac{1}{n}\Sigma uv - \bar{u}\bar{v}}{\sqrt{\left(\frac{1}{n}\Sigma u^2 - \bar{u}^2\right)\left(\frac{1}{n}\Sigma v^2 - \bar{v}^2\right)}} = \frac{\frac{1}{9}(57) - 0}{\sqrt{\left[\frac{1}{9}(60) - 0\right]\left[\frac{1}{9}(60) - 0\right]}} = \frac{19}{20} = 0.95 \]
Also
\[ \frac{r\sigma_y}{\sigma_x} = \frac{r\sigma_v}{\sigma_u} = \frac{\frac{1}{n}\Sigma uv - \bar{u}\bar{v}}{\frac{1}{n}\Sigma u^2 - \bar{u}^2} = \frac{\frac{1}{9}(57) - 0}{\frac{1}{9}(60) - 0} = \frac{19}{20} = 0.95 \]
\[ \bar{x} = 5 + \frac{1}{9}\Sigma u = 5, \qquad \bar{y} = 12 + \frac{1}{9}\Sigma v = 12 \]
Equation of the line of regression of y on x is
\[ y - \bar{y} = \frac{r\sigma_y}{\sigma_x}(x - \bar{x}) \]
or
\[ y - 12 = 0.95(x - 5) \]
or
\[ y = 0.95x + 7.25 \]
When x = 6.2, the estimated value of y = 0.95 × 6.2 + 7.25 = 5.89 + 7.25 = 13.14.
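The regression line of y on x can also be obtained directly from the raw data, without the change of origin. The sketch below uses the data of this example; `predict_y` is a hypothetical helper name.

```python
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [9, 8, 10, 12, 11, 13, 14, 16, 15]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
cov_xy = sum(xi * yi for xi, yi in zip(x, y)) / n - mean_x * mean_y
var_x = sum(xi * xi for xi in x) / n - mean_x ** 2

b_yx = cov_xy / var_x                 # regression coefficient of y on x
intercept = mean_y - b_yx * mean_x    # the line passes through (mean_x, mean_y)

def predict_y(value):
    """Estimate y for a given x from the line of regression of y on x."""
    return intercept + b_yx * value

print(round(b_yx, 2), round(intercept, 2), round(predict_y(6.2), 2))
```

The slope is computed as covariance over the variance of x, which is the same quantity as rσy/σx used in the worked solution.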
Example 2. In a partially destroyed laboratory record of an analysis of a correlation data,
only the following results are legible:
Variance of x = 9
Regression equations: 8x – 10y + 66 = 0, 40x – 18y = 214.
What were (a) the mean values of x and y, (b) the standard deviation of y, and (c) the
coefficient of correlation between x and y?
Sol. (i) Since both the lines of regression pass through the point $(\bar{x}, \bar{y})$, we have
        $8\bar{x} - 10\bar{y} + 66 = 0$        . . . (1)
        $40\bar{x} - 18\bar{y} - 214 = 0$      . . . (2)
Multiplying (1) by 5,       $40\bar{x} - 50\bar{y} + 330 = 0$      . . . (3)
Subtracting (3) from (2),   $32\bar{y} - 544 = 0$   ∴ $\bar{y} = 17$
∴ From (1),   $8\bar{x} - 170 + 66 = 0$   or   $8\bar{x} = 104$   ∴ $\bar{x} = 13$
Hence   $\bar{x} = 13$, $\bar{y} = 17$        . . . answer to (a)
(ii) Variance of x = $\sigma_x^2 = 9$ (given)   ∴ $\sigma_x = 3$
The equations of the lines of regression can be written as
        y = 0.8x + 6.6 and x = 0.45y + 5.35
∴ The regression coefficient of y on x is $\dfrac{r\sigma_y}{\sigma_x} = 0.8$        . . . (4)
The regression coefficient of x on y is $\dfrac{r\sigma_x}{\sigma_y} = 0.45$        . . . (5)
Multiplying (4) and (5), $r^2 = 0.8 \times 0.45 = 0.36$   ∴ r = 0.6        . . . answer to (c)
(Positive sign with square root is taken because regression coefficients are positive.)
From (4),   $\sigma_y = \dfrac{0.8\,\sigma_x}{r} = \dfrac{0.8 \times 3}{0.6} = 4$.        . . . answer to (b)
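The same steps can be sketched in code. The coefficient names below are illustrative, and the sketch assumes, as the solution does, that the first equation is the line of y on x and the second the line of x on y.

```python
import math

# Each line is written as a*x + b*y = c.
a1, b1, c1 = 8, -10, -66     # 8x - 10y + 66 = 0   (taken as y on x)
a2, b2, c2 = 40, -18, 214    # 40x - 18y = 214     (taken as x on y)
var_x = 9

# Both lines pass through (mean_x, mean_y): solve the 2x2 system by Cramer's rule.
det = a1 * b2 - a2 * b1
mean_x = (c1 * b2 - c2 * b1) / det
mean_y = (a1 * c2 - a2 * c1) / det

b_yx = -a1 / b1   # regression coefficient of y on x (0.8)
b_xy = -b2 / a2   # regression coefficient of x on y (0.45)

# r^2 is the product of the regression coefficients; r carries their common sign.
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
sigma_y = b_yx * math.sqrt(var_x) / r
```

If the two equations were assigned the other way round, the product of the regression coefficients would exceed 1, which is how the correct assignment is usually confirmed.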
TEST YOUR KNOWLEDGE
1. (a) Calculate the correlation coefficient for the following heights in inches of fathers (X) and their
sons (Y ) .
X:   65   66   67   67   68   69   70   72
Y:   67   68   65   68   72   72   69   71
(b) Find the correlation coefficient between x and y from the given data:
x:    78    89    97    69    59    79    68    57
y:   125   137   156   112   107   138   123   108
(c) Find the correlation coefficient from the following data:
x:   92   89   87   86   83   77   71   63   53   50
y:   86   88   91   77   68   85   52   82   37   57
2. Calculate the coefficient of correlation for the following ages of husbands and wives:
Husband's age x:   23   27   28   28   29   30   31   33   35   36
Wife's age    y:   18   20   22   27   21   29   27   29   28   29
3. Establish the formula
$$\sigma_{x-y}^2 = \sigma_x^2 + \sigma_y^2 - 2r\sigma_x\sigma_y$$
where r is the correlation coefficient between x and y.
4. (a) Calculate the coefficient of correlation for the following table:
  y \ x    16–18   18–20   20–22   22–24
  10–20      2       1       1       —
  20–30      3       2       3       2
  30–40      3       4       5       6
  40–50      2       2       3       4
  50–60      —       1       2       2
  60–70      —       1       2       1
(b) Find the correlation between x (grades in mathematics) and y (grades in Engineering Drawing) given
in the following data:
  y \ x    10–40   40–70   70–100   Total
  0–30       5      20       —       25
  30–60      —      28       2       30
  60–90      —      32      13       45
  Total      5      80      15      100
5. Ten students got the following percentage of grades in chemistry and physics:
Students            :    1    2    3    4    5    6    7    8    9   10
Grades in chemistry :   78   36   98   25   75   82   90   62   65   39
Grades in physics   :   84   51   91   60   68   62   86   58   63   47
Calculate the rank correlation coefficient.
6. Ten competitors in a musical test were ranked by the three judges x, y, and z in the following order:
Ranks by x :    1    6    5   10    3    2    4    9    7    8
Ranks by y :    3    5    8    4    7   10    2    1    6    9
Ranks by z :    6    4    9    8    1    2    3   10    5    7
Using the rank correlation method, discuss which pair of judges has the nearest approach to common
likings in music.
7. A sample of 12 fathers and their sons gave the following data about their heights in inches:
Father :   65   63   67   64   68   62   70   66   68   67   69   71
Son    :   68   66   68   65   69   66   68   65   71   67   68   70
Calculate the coefficient of rank correlation.
8. If r = 0, show that the two lines of regression are parallel to the axes.
9. If the two regression coefficients are 0.8 and 0.2, what would be the value of the coefficient of
correlation?
10. (a) Find the correlation coefficient and the equations of regression lines for the following values of x
and y:
x:   1   2   3   4   5
y:   2   5   3   8   7
(b) Find the correlation coefficient between x and y for the given values. Find also the two regression
lines.
x:    1    2    3    4    5    6    7    8    9   10
y:   10   12   16   28   25   36   41   49   40   50
11. The two regression equations of the variables x and y are x = 19.13 – 0.87y and y = 11.64 – 0.50x. Find
(i) mean of x’s, (ii) mean of y’s, and (iii) the correlation coefficient between x and y.
12. Two random variables have the regression lines with equations 3x + 2y = 26 and 6x + y = 31. Find the
mean values and the correlation coefficient between x and y.
13. In a partially destroyed sheet of laboratory data, only the equations giving the two lines of regression of
y on x and x on y are available and are respectively, 7x – 16y + 9 = 0, 5y – 4x – 3 = 0.
Calculate the coefficient of correlation, $\bar{x}$ and $\bar{y}$.
Answers
1. (a) 0.603   (b) 0.96   (c) 0.7291
2. 0.82
4. (a) 0.28   (b) 0.4517
5. 0.84
6. x and z
7. 0.722
9. 0.4
10. (a) r = 0.8; y = 1.3x + 1.1; x = 0.5y + 0.5
    (b) r = 0.96; y = 4.69x + 4.9; x = 0.2y – 0.64
11. (i) 15.79   (ii) 3.74   (iii) –0.6595
12. $\bar{x} = 4$, $\bar{y} = 7$; r = –0.5
13. r = 0.7395; $\bar{x} = -0.1034$; $\bar{y} = 0.5172$.
________________________________________________________________________________________________________
21.36
THEORY OF PROBABILITY
Here we define and explain certain terms that are used frequently.
(a) Trial and event. Let an experiment be repeated under essentially the same conditions
and let it result in any one of the several possible outcomes. Then, the experiment is called a trial
and the possible outcomes are known as events or cases.
For example:
(i) Tossing a coin is a trial and the turning up of heads or tails is an event.
(ii) Throwing a die is a trial and getting 1 or 2 or 3 or 4 or 5 or 6 is an event.
(b) Exhaustive events. The total number of all possible outcomes in any trial is known as
exhaustive events or exhaustive cases.
For example:
(i) In tossing a coin, there are two exhaustive cases, heads and tails.
(ii) In throwing a die, there are 6 exhaustive cases, for any one of the six faces that may
turn up.
(iii) In throwing two dice, the exhaustive cases are 6 × 6 = $6^2$ = 36, for any of the 6 numbers
from 1 to 6 on one die can be associated with any of the 6 numbers on the other die.
In general, in throwing n dice, the exhaustive cases are $6^n$.
(c) Favorable events or cases. The cases that entail the occurrence of an event are said to
be favorable to the event. It is the total number of possible outcomes in which the specified event
happens.
For example:
(i) In throwing a die, the number of cases favorable to the appearance of a multiple of 3
are two, viz. 3 and 6, while the number of cases favorable to the appearance of an even
number are three, viz., 2, 4, and 6.
(ii) In a throw of two dice, the number of cases favorable to getting a sum of 6 is 5, viz.,
(1, 5); (5, 1); (2,4); (4, 2); (3, 3).
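Exhaustive and favorable cases can be listed mechanically. The sketch below enumerates the outcomes for one and two dice; the variable names are illustrative.

```python
from itertools import product

faces = range(1, 7)

# Exhaustive cases for two dice: every ordered pair (first die, second die).
two_dice = list(product(faces, repeat=2))

# Favorable cases for the events discussed above.
sum_six = [roll for roll in two_dice if sum(roll) == 6]
multiples_of_3 = [f for f in faces if f % 3 == 0]
even_faces = [f for f in faces if f % 2 == 0]

# Exhaustive cases for n dice number 6**n; checked here for n = 3.
three_dice_count = len(list(product(faces, repeat=3)))
```

Ordered pairs are used so that (1, 5) and (5, 1) count as distinct cases, which is what makes the 36 cases equally likely.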
(d) Mutually exclusive events. Events are said to be mutually exclusive or incompatible if
the occurrence of any one of them precludes (i.e., rules out) the occurrence of all others, i.e., if
no two or more than two of them can happen simultaneously in the same trial.
For example:
(i) In tossing a coin, the events “heads” and “tails” are mutually exclusive, since if the
outcome is heads, the possibility of getting tails in the same trial is ruled out.
(ii) In throwing a die, all the six faces numbered, 1, 2, 3, 4, 5, 6 are mutually exclusive
since any outcome rules out the possibility of getting any other.
(e) Equally likely events. Events are said to be equally likely if there is no reason to expect
any one in preference to any other.
For example:
(i) When a card is drawn from a well-shuffled deck, any card may appear in the draw so
that the 52 different cases are equally likely.
(ii) In throwing a die, all six faces are equally likely to come up.
(f) Independent and dependent events. Two or more events are said to be independent if
the occurrence or non-occurrence of any one does not depend on (and is not affected by) the
occurrence or non-occurrence of any other. Otherwise they are said to be dependent.
For example: If a card is drawn from a deck of well-shuffled cards and replaced before
drawing the second card, the result of the second draw is independent of the first draw. However,
if the first card drawn is not replaced, then the second draw is dependent on the first draw.
21.37
(a) MATHEMATICAL (OR CLASSICAL) DEFINITION OF PROBABILITY
If a trial results in n exhaustive, mutually exclusive and equally likely cases and m of them
are favorable to the occurrence of an event E, then the probability of occurrence of E is given by
$$p \text{ or } P(E) = \frac{\text{Favorable number of cases}}{\text{Exhaustive number of cases}} = \frac{m}{n}.$$
Note 1. Since the number of cases favorable to the occurrence of E is m and the exhaustive number of cases is
n, therefore, the number of cases unfavorable to the occurrence of E are n – m.
Note 2. The probability that the event E will not happen is given by
$$q \text{ or } P(\bar{E}) = \frac{\text{Unfavorable number of cases}}{\text{Exhaustive number of cases}} = \frac{n - m}{n} = 1 - \frac{m}{n} = 1 - p$$
Obviously, p and q are non-negative and cannot exceed 1, i.e., 0 ≤ p ≤ 1, 0 ≤ q ≤ 1.
Note 3. If P(E) = 1, E is called a certain event, i.e., the chance of its occurrence is 100%.
If P(E) = 0, then E is an impossible event.
Note 4. If n cases are favorable to E and m cases are favorable to $\bar{E}$ (i.e., unfavorable to E), then exhaustive
number of cases = n + m.
$$P(E) = \frac{n}{n + m} \quad\text{and}\quad P(\bar{E}) = \frac{m}{n + m}$$
We say that the “odds in favor of E” are n : m and the “odds against E” are m : n.
21.37
(b) STATISTICAL (OR EMPIRICAL) DEFINITION OF PROBABILITY
If in n trials, an event E occurs m times, then the probability of the occurrence of E is given
by
$$p = P(E) = \lim_{n \to \infty} \frac{m}{n}.$$
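The limit cannot be evaluated by experiment, but the ratio m/n can be watched as n grows. The following sketch simulates die rolls with Python's pseudo-random generator; the helper names and the fixed seed are assumptions made for reproducibility, not part of the definition.

```python
import random

def relative_frequency(event, trial, n_trials, seed=0):
    """Return m/n: the fraction of n_trials simulated trials in which event occurs."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials) if event(trial(rng)))
    return hits / n_trials

def roll_die(rng):
    """One trial: the face shown by a fair six-sided die."""
    return rng.randint(1, 6)

def is_even(face):
    """The event E: an even number turns up."""
    return face % 2 == 0

# m/n for increasing n; the values should settle near 1/2.
estimates = {n: relative_frequency(is_even, roll_die, n) for n in (100, 10_000, 200_000)}
```

A simulation only illustrates the statistical definition; it does not prove that the limit exists.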
ILLUSTRATIVE EXAMPLES
Example 1. A bag contains 7 white, 6 red, and 5 black balls. Two balls are drawn at
random. Find the probability that they will both be white.
Sol. Total number of balls = 7 + 6 + 5 = 18.
Out of 18 balls, 2 can be drawn in ${}^{18}C_2$ ways.
∴ Exhaustive number of cases = ${}^{18}C_2 = \dfrac{18 \times 17}{2 \times 1} = 153$
Out of 7 white balls, 2 can be drawn in ${}^{7}C_2 = \dfrac{7 \times 6}{2 \times 1} = 21$ ways.
∴ Favorable number of cases = 21
Probability = $\dfrac{21}{153} = \dfrac{7}{51}$.
Example 2. Four cards are drawn from a deck of cards. Find the probability that (i) all are
diamonds, (ii) there is one card of each suit, and (iii) there are two spades and two hearts.
Sol. 4 cards can be drawn from a deck of 52 cards in ${}^{52}C_4$ ways.
∴ Exhaustive number of cases = ${}^{52}C_4 = \dfrac{52 \times 51 \times 50 \times 49}{4 \times 3 \times 2 \times 1} = 270725$.
(i) There are 13 diamonds in the deck and 4 can be drawn out of them in ${}^{13}C_4$ ways.
∴ Favorable number of cases = ${}^{13}C_4 = \dfrac{13 \times 12 \times 11 \times 10}{4 \times 3 \times 2 \times 1} = 715$.
Required probability = $\dfrac{715}{270725} = \dfrac{143}{54145} = \dfrac{11}{4165}$.
(ii) There are 4 suits, each containing 13 cards.
∴ Favorable number of cases = ${}^{13}C_1 \times {}^{13}C_1 \times {}^{13}C_1 \times {}^{13}C_1$ = 13 × 13 × 13 × 13.
Required probability = $\dfrac{13 \times 13 \times 13 \times 13}{270725} = \dfrac{2197}{20825}$.
(iii) 2 spades out of 13 can be drawn in ${}^{13}C_2$ ways.
2 hearts out of 13 can be drawn in ${}^{13}C_2$ ways.
∴ Favorable number of cases = ${}^{13}C_2 \times {}^{13}C_2$ = 78 × 78
Required probability = $\dfrac{78 \times 78}{270725} = \dfrac{468}{20825}$.
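These counts are easy to confirm with exact arithmetic. The sketch below uses `math.comb` for the combinations and `fractions.Fraction` so that the results reduce automatically; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

total = comb(52, 4)   # exhaustive cases: any 4 cards out of 52

# (i) all four cards are diamonds
p_all_diamonds = Fraction(comb(13, 4), total)

# (ii) one card from each of the four suits
p_one_each_suit = Fraction(comb(13, 1) ** 4, total)

# (iii) two spades and two hearts
p_two_spades_two_hearts = Fraction(comb(13, 2) ** 2, total)
```

Keeping the values as fractions avoids rounding and makes them directly comparable with the reduced fractions in the solution.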
Example 3. A bag contains 50 tickets numbered 1, 2, 3, . . . , 50, of which five are drawn at
random and arranged in ascending order of magnitude (x1 < x2 < x3 < x4 < x5). What is the
probability that x3 = 30?
Sol. Exhaustive number of cases = ${}^{50}C_5$.
If $x_3 = 30$, then the two tickets with numbers $x_1$ and $x_2$ must come out of the 29 tickets
numbered 1 to 29, and this can be done in ${}^{29}C_2$ ways. The other two tickets with numbers $x_4$ and
$x_5$ must come out of the 20 tickets numbered 31 to 50, and this can be done in ${}^{20}C_2$ ways.
∴ Favorable number of cases = ${}^{29}C_2 \times {}^{20}C_2$.
Required probability = $\dfrac{{}^{29}C_2 \times {}^{20}C_2}{{}^{50}C_5} = \dfrac{551}{15134}$.
21.38
RANDOM EXPERIMENT
Occurrences that can be repeated a number of times, essentially under the same conditions,
and whose result cannot be predicted beforehand are known as random experiments.
For example, the rolling of a die, or the tossing of a coin are random experiments.
Sample Space. Out of the several possible outcomes of a random experiment, one and only
one can take place in a trial. The set of all these possible outcomes is called the sample space for
the particular experiment and is denoted by S.
For example, if a coin is tossed, the possible outcomes are H (Heads) and T (Tails).
Thus
S = {H, T}.
Sample Point. The elements of S, the sample space, are called sample points.
For example, if a coin is tossed and H and T denote “Heads” and “Tails” respectively, then
S = {H, T}.
The two sample points are H and T.
Finite Sample Space. If the number of sample points in a sample space is finite, we call it a
finite sample space. (In this chapter, we shall deal with finite sample spaces only.)
Event. Every subset of S, the sample space, is called an event.
Since S ⊂ S, S itself is an event; called a certain event.
Also, φ ⊂ S, the null set is also an event, called an impossible event.
If e ∈ S, then e is called an elementary event. Every elementary event contains only one
sample point.
21.39
AXIOMS
(i) With each event E (i.e., a sample point) is associated a real number between 0 and 1,
called the probability of that event and is denoted by P(E). Thus 0 ≤ P(E) ≤ 1.
(ii) The sum of the probabilities of all simple (elementary) events constituting the sample
space is 1. Thus P(S) = 1.
(iii) The probability of a compound event (i.e., an event made up of two or more sample
events) is the sum of the probabilities of the simple events comprising the compound event.
Thus, if there are n equally likely possible outcomes of a random experiment, then the
sample space S contains n sample points and the probability associated with each sample point is
$\dfrac{1}{n}$.        [By Axiom (ii)]
Now, if an event E consists of m sample points, then the probability of E is
$$P(E) = \frac{1}{n} + \frac{1}{n} + \dots \text{ (m times)} = \frac{m}{n} = \frac{\text{Number of sample points in E}}{\text{Number of sample points in S}}.$$
This closely agrees with the classical definition of probability.
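For a finite sample space with equally likely sample points, events can be modelled as sets and P(E) as a ratio of set sizes. The following is a minimal sketch under that assumption; the function name `P` and the die example are illustrative.

```python
from fractions import Fraction

# Sample space for one throw of a die.
S = frozenset(range(1, 7))

def P(event, sample_space=S):
    """P(E) = (number of sample points in E) / (number of sample points in S)."""
    event = frozenset(event)
    if not event <= sample_space:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

even = {2, 4, 6}                       # a compound event
elementary = [P({e}) for e in S]       # probabilities of the elementary events
```

With this model the axioms can be checked directly: every value lies between 0 and 1, the elementary probabilities add up to 1, and P of a compound event equals the sum over its sample points.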
21.40
PROBABILITY OF THE IMPOSSIBLE EVENT IS ZERO, i.e., P ( φ ) = 0
The impossible event contains no sample point. As such, the sample space S and the impossible
event φ are mutually exclusive.
⇒        S ∪ φ = S
⇒        P(S ∪ φ) = P(S)
⇒        P(S) + P(φ) = P(S)
⇒        P(φ) = 0.
21.41
PROBABILITY OF THE COMPLEMENTARY EVENT $\bar{A}$ OF A IS GIVEN BY $P(\bar{A}) = 1 - P(A)$
A and $\bar{A}$ are disjoint events. Also $A \cup \bar{A} = S$
∴        $P(A \cup \bar{A}) = P(S)$
⇒        $P(A) + P(\bar{A}) = 1$. Hence $P(\bar{A}) = 1 - P(A)$.
21.42
FOR ANY TWO EVENTS A AND B, $P(\bar{A} \cap B) = P(B) - P(A \cap B)$
$\bar{A} \cap B = \{p : p \in B \text{ and } p \notin A\}$
Now $\bar{A} \cap B$ and $A \cap B$ are disjoint sets and
        $(\bar{A} \cap B) \cup (A \cap B) = B$
⇒        $P[(\bar{A} \cap B) \cup (A \cap B)] = P(B)$
⇒        $P(\bar{A} \cap B) + P(A \cap B) = P(B)$
⇒        $P(\bar{A} \cap B) = P(B) - P(A \cap B)$.
Note. Similarly, it can be proved that $P(A \cap \bar{B}) = P(A) - P(A \cap B)$.
21.43
IF B ⊂ A, THEN
(i) $P(A \cap \bar{B}) = P(A) - P(B)$
(ii) $P(B) \le P(A)$
Proof. When B ⊂ A, B and $A \cap \bar{B}$ are disjoint and their union is A.
⇒        $B \cup (A \cap \bar{B}) = A$
⇒        $P[B \cup (A \cap \bar{B})] = P(A)$
⇒        $P(B) + P(A \cap \bar{B}) = P(A)$
⇒        $P(A \cap \bar{B}) = P(A) - P(B)$        . . . (1)
Now, if E is any event, then 0 ≤ P(E) ≤ 1, i.e., P(E) ≥ 0
∴        $P(A \cap \bar{B}) \ge 0$ ⇒ P(A) – P(B) ≥ 0        [Using (1)]
⇒        P(B) ≤ P(A).
21.44
P(A ∩ B) ≤ P(A) AND P(A ∩ B) ≤ P(B)
Proof. By 21.43, B ⊂ A ⇒ P(B) ≤ P(A)
Since (A ∩ B) ⊂ A and (A ∩ B) ⊂ B
∴        P(A ∩ B) ≤ P(A) and P(A ∩ B) ≤ P(B).
21.45
ADDITION THEOREM OF PROBABILITIES (OR THEOREM OF TOTAL PROBABILITY)
Statement. If A and B are any two events, then
        P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
i.e.,   P(A or B) = P(A) + P(B) – P(A and B).
Proof. A and $\bar{A} \cap B$ are disjoint sets and their union is A ∪ B.
⇒        $A \cup B = A \cup (\bar{A} \cap B)$
⇒        $P(A \cup B) = P[A \cup (\bar{A} \cap B)] = P(A) + P(\bar{A} \cap B)$
        $= P(A) + [P(\bar{A} \cap B) + P(A \cap B) - P(A \cap B)]$
        $= P(A) + P[(\bar{A} \cap B) \cup (A \cap B)] - P(A \cap B)$        [∵ $\bar{A} \cap B$ and $A \cap B$ are disjoint]
        $= P(A) + P(B) - P(A \cap B)$        [∵ $(\bar{A} \cap B) \cup (A \cap B) = B$]
Hence   P(A ∪ B) = P(A) + P(B) – P(A ∩ B).
Note 1. If A and B are two mutually disjoint events, then A ∩ B = φ, so that P(A ∩ B) = P(φ) = 0.
∴        P(A ∪ B) = P(A) + P(B).
Note 2. P(A ∪ B) is also written as P(A + B). Thus, for mutually disjoint events A and B,
        P(A + B) = P(A) + P(B).
P(A ∩ B) is also written as P(AB).
21.46
IF A, B, AND C ARE ANY THREE EVENTS, THEN
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) – P(A ∩ B) – P(B ∩ C) – P(C ∩ A) + P(A ∩ B ∩ C)
or
P(A + B + C) = P(A) + P(B) + P(C) – P(AB) – P(BC) – P(CA) + P(ABC)
Proof. Using the above Article 21.45 for two events, we have
P(A ∪ B ∪ C) = P[(A ∪ B) ∪ C]
    = P(A ∪ B) + P(C) – P[(A ∪ B) ∩ C]
    = [P(A) + P(B) – P(A ∩ B)] + P(C) – P[(A ∩ C) ∪ (B ∩ C)]        [By the distributive law]
    = P(A) + P(B) + P(C) – P(A ∩ B) – [P(A ∩ C) + P(B ∩ C) – P{(A ∩ C) ∩ (B ∩ C)}]        [By Art. 21.45]
    = P(A) + P(B) + P(C) – P(A ∩ B) – P(A ∩ C) – P(B ∩ C) + P(A ∩ B ∩ C)        [∵ (A ∩ C) ∩ (B ∩ C) = A ∩ B ∩ C]
    = P(A) + P(B) + P(C) – P(A ∩ B) – P(B ∩ C) – P(C ∩ A) + P(A ∩ B ∩ C)        [∵ A ∩ C = C ∩ A]
or
P(A + B + C) = P(A) + P(B) + P(C) – P(AB) – P(BC) – P(CA) + P(ABC).
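The addition theorems for two and three events can be verified by direct counting on a small sample space. The sketch below builds a 52-card deck as a set of (rank, suit) pairs; the events A, B, C chosen here are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = frozenset(product(ranks, suits))   # sample space of 52 equally likely cards

def P(event):
    """Probability of an event (a set of cards) by counting sample points."""
    return Fraction(len(event), len(deck))

A = {card for card in deck if card[1] == "spades"}         # a spade
B = {card for card in deck if card[0] == "A"}              # an ace
C = {card for card in deck if card[0] in {"J", "Q", "K"}}  # a face card

# Right-hand sides of the two theorems.
two_events = P(A) + P(B) - P(A & B)
three_events = (P(A) + P(B) + P(C)
                - P(A & B) - P(B & C) - P(C & A)
                + P(A & B & C))
```

Python's set operators `|` and `&` play the roles of ∪ and ∩, so the left-hand sides are simply `P(A | B)` and `P(A | B | C)`.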
21.47
IF A1, A2, . . . , An ARE n MUTUALLY EXCLUSIVE EVENTS, THEN THE PROBABILITY OF THE OCCURRENCE OF ONE OF THEM IS
P(A1 ∪ A2 ∪ . . . ∪ An) = P(A1 + A2 + . . . + An) = P(A1) + P(A2) + . . . + P(An)
Proof. Let N be the total number of mutually exclusive, exhaustive and equally likely cases
of which $m_1$ are favorable to A1, $m_2$ are favorable to A2, and so on.
Probability of occurrence of event A1 = $P(A_1) = \dfrac{m_1}{N}$
Probability of occurrence of event A2 = $P(A_2) = \dfrac{m_2}{N}$
................
Probability of occurrence of event An = $P(A_n) = \dfrac{m_n}{N}$        . . . (1)
The events being mutually exclusive and equally likely, the number of cases favorable to
the event
A1 or A2 or . . . or An is $m_1 + m_2 + \dots + m_n$.
∴ Probability of occurrence of one of the events A1, A2, . . . , An is
$$P(A_1 + A_2 + \dots + A_n) = \frac{m_1 + m_2 + \dots + m_n}{N} = \frac{m_1}{N} + \frac{m_2}{N} + \dots + \frac{m_n}{N} = P(A_1) + P(A_2) + \dots + P(A_n)$$        | Using (1)
ILLUSTRATIVE EXAMPLES
Example 1. In a given race, the odds in favor of four horses A, B, C, D are 1 : 3, 1 : 4,
1 : 5, 1 : 6 respectively. Assuming that a dead heat is impossible, find the chance that one of them wins the race.
Sol. Let p1, p2, p3, p4 be the probabilities of the horses A, B, C, D winning, respectively.
Since a dead heat (in which all the four horses cover the same distance in the same time) is not
possible, the events are mutually exclusive.
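Odds in favor of A are 1 : 3   ∴ $p_1 = \dfrac{1}{1 + 3} = \dfrac{1}{4}$
Similarly,   $p_2 = \dfrac{1}{5}$, $p_3 = \dfrac{1}{6}$, $p_4 = \dfrac{1}{7}$.
If p is the chance that one of them wins, then
$$p = p_1 + p_2 + p_3 + p_4 = \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} = \frac{319}{420}.$$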
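Converting odds to probabilities and adding them for mutually exclusive events takes only a few lines. In this sketch the helper name `probability_from_odds_in_favor` is illustrative; odds of a : b in favor correspond to a probability of a/(a + b).

```python
from fractions import Fraction

def probability_from_odds_in_favor(a, b):
    """Odds a : b in favor of an event mean a favorable cases to b unfavorable ones."""
    return Fraction(a, a + b)

odds = {"A": (1, 3), "B": (1, 4), "C": (1, 5), "D": (1, 6)}
win = {horse: probability_from_odds_in_favor(*o) for horse, o in odds.items()}

# The wins are mutually exclusive (no dead heat), so the probabilities simply add.
p_one_of_them = sum(win.values())
```

The sum is less than 1, which leaves room for a horse other than A, B, C, D to win.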
Example 2. A card is drawn from a well-shuffled deck of playing cards. What is the
probability that it is either a spade or an ace?
Sol. Let   A = the event of drawing a spade
and        B = the event of drawing an ace.
A and B are not mutually exclusive.
AB = the event of drawing the ace of spades
∴        $P(A) = \dfrac{13}{52}$, $P(B) = \dfrac{4}{52}$, $P(AB) = \dfrac{1}{52}$
$$P(A + B) = P(A) + P(B) - P(AB) = \frac{13}{52} + \frac{4}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}.$$
21.48
CONDITIONAL PROBABILITY
The probability of the occurrence of an event E1 when another event E2 is known to have
already happened is called Conditional Probability and is denoted by P(E1/E2).
Mutually Independent Events. An event E1 is said to be independent of an event E2 if
P(E1/E2) = P(E1)
i.e., if the probability of the occurrence of E1 is independent of the occurrence of E2.
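For equally likely outcomes, a conditional probability can be computed by restricting the sample space to the outcomes in which E2 has happened. The following sketch does this for a throw of two dice; the helper names and the chosen events are illustrative.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # 36 equally likely outcomes of two dice

def P(event):
    """Unconditional probability by counting."""
    return Fraction(len(event), len(S))

def P_given(e1, e2):
    """P(E1/E2): among the outcomes in E2, the fraction that also lie in E1."""
    return Fraction(len(e1 & e2), len(e2))

sum_is_7 = {o for o in S if sum(o) == 7}
sum_is_8 = {o for o in S if sum(o) == 8}
first_is_3 = {o for o in S if o[0] == 3}
first_is_even = {o for o in S if o[0] % 2 == 0}

independent_case = P_given(sum_is_7, first_is_3)    # equals P(sum_is_7)
dependent_case = P_given(sum_is_8, first_is_even)   # differs from P(sum_is_8)
```

A sum of 7 is independent of the first die showing 3, whereas a sum of 8 is affected by knowing that the first die is even.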
21.49 MULTIPLICATIVE LAW OF PROBABILITY (OR THE THEOREM OF
COMPOUND PROBABILITY)
The probability of the simultaneous occurrence of two events is equal to the probability
of one of the events multiplied by the conditional probability of the other, i.e., for two events
A and B,
P(A ∩ B) = P(A) × P(B/A)
where P(B/A) represents the conditional probability of the occurrence of B when the event A has
already happened.
Proof. Suppose a trial results in n exhaustive, mutually exclusive and equally likely
outcomes, m of them being favorable to the occurrence of the event A.
∴ Probability of the occurrence of the event A = $P(A) = \dfrac{m}{n}$        . . . (1)
Out of m outcomes favorable to the occurrence of A, let $m_1$ be favorable to the occurrence
of the event B.
∴ Conditional probability of B, given that A has happened = $P(B/A) = \dfrac{m_1}{m}$        . . . (2)
Now, out of n exhaustive, mutually exclusive and equally likely outcomes, $m_1$ are favorable
to the occurrence of both A and B.
∴ Probability of simultaneous occurrence of A and B
$$= P(A \cap B) = \frac{m_1}{n} = \frac{m_1}{m} \times \frac{m}{n} = \frac{m}{n} \times \frac{m_1}{m} = P(A) \times P(B/A)$$        [Using (1) and (2)]
Hence P(A ∩ B) = P(A) × P(B/A).
Note. P(A ∩ B) is also written as P(AB).
Thus P(AB) = P(A) × P(B/A).
Cor. 1. Interchanging A and B
        P(BA) = P(B) × P(A/B)
or      P(AB) = P(B) × P(A/B)        [∵ B ∩ A = A ∩ B]
Cor. 2. If A and B are independent events, then P(B/A) = P(B)
∴       P(AB) = P(A) × P(B).
Generalization. If A1, A2, . . . , An are n independent events, then
P(A1A 2 . . . A n ) = P(A1 ) × P(A 2 ) × . . . × P(A n ).
Cor. 3. If p is the chance that an event will occur in one trial, then the chance that it will
occur in each of a succession of r independent trials is
        p ⋅ p ⋅ . . . ⋅ p (r times) = $p^r$.
Cor. 4. If $p_1, p_2, \dots, p_n$ are the probabilities that certain independent events occur, then the probabilities
of their non-occurrence are $1 - p_1, 1 - p_2, \dots, 1 - p_n$ and, therefore, the probability of all of these
failing is
        $(1 - p_1)(1 - p_2) \dots (1 - p_n)$.
Hence the chance that at least one of these events occurs is
        $1 - (1 - p_1)(1 - p_2) \dots (1 - p_n)$.
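Corollaries 2 to 4 translate directly into code when the events are independent. This is a minimal sketch; the function names are illustrative and the probabilities used in the example are arbitrary.

```python
from fractions import Fraction
from math import prod

def p_all(ps):
    """Probability that all of the independent events occur: p1 * p2 * ... * pn."""
    return prod(ps, start=Fraction(1))

def p_at_least_one(ps):
    """Probability that at least one occurs: 1 - (1 - p1)(1 - p2)...(1 - pn)."""
    return 1 - prod((1 - p for p in ps), start=Fraction(1))

chances = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]
all_three = p_all(chances)
at_least_one = p_at_least_one(chances)

# Cor. 3: the same event occurring in each of r = 3 independent trials.
three_in_a_row = p_all([Fraction(1, 6)] * 3)
```

Computing the complement (all events fail) is usually far shorter than adding up every way in which one or more events can occur.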
ILLUSTRATIVE EXAMPLES
Example 1. A problem in mechanics is given to three students A, B, C whose chances of
solving it are $\dfrac{1}{2}, \dfrac{1}{3}, \dfrac{1}{4}$ respectively. What is the probability that the problem will be solved?
Sol. The probabilities of A, B, C solving the problem are $\dfrac{1}{2}, \dfrac{1}{3}, \dfrac{1}{4}$.
The probabilities of A, B, C not solving the problem are $1 - \dfrac{1}{2}, 1 - \dfrac{1}{3}, 1 - \dfrac{1}{4}$, i.e., $\dfrac{1}{2}, \dfrac{2}{3}, \dfrac{3}{4}$.
∴ The probability that the problem is not solved by any of them = $\dfrac{1}{2} \times \dfrac{2}{3} \times \dfrac{3}{4} = \dfrac{1}{4}$.
Hence the probability that the problem is solved by at least one of them = $1 - \dfrac{1}{4} = \dfrac{3}{4}$.
Example 2. The odds that a book will be favorably reviewed by three independent critics
are 5 to 2, 4 to 3, and 3 to 4 respectively. What is the probability that, of the three reviews, a
majority will be favorable?
Sol. Let the three critics be A, B, C. The probabilities $p_1, p_2, p_3$ of the book being
favorably reviewed by A, B, C are $\dfrac{5}{7}, \dfrac{4}{7}, \dfrac{3}{7}$ respectively.
∴ The probabilities that the book is unfavorably reviewed by A, B, C are
        $1 - \dfrac{5}{7} = \dfrac{2}{7}$, $1 - \dfrac{4}{7} = \dfrac{3}{7}$, $1 - \dfrac{3}{7} = \dfrac{4}{7}$.
A majority will be favorable if the reviews of at least two are favorable.
(i) If A, B, C all review favorably, the probability is
        $\dfrac{5}{7} \times \dfrac{4}{7} \times \dfrac{3}{7} = \dfrac{60}{343}$        | $p_1 p_2 p_3$
(ii) If A, B review favorably and C reviews unfavorably, the probability is
        $\dfrac{5}{7} \times \dfrac{4}{7} \times \dfrac{4}{7} = \dfrac{80}{343}$        | $p_1 p_2 (1 - p_3)$
(iii) If A, C review favorably and B reviews unfavorably, the probability is
        $\dfrac{5}{7} \times \dfrac{3}{7} \times \dfrac{3}{7} = \dfrac{45}{343}$        | $p_1 (1 - p_2) p_3$
(iv) If B, C review favorably and A reviews unfavorably, the probability is
        $\dfrac{2}{7} \times \dfrac{4}{7} \times \dfrac{3}{7} = \dfrac{24}{343}$        | $(1 - p_1) p_2 p_3$
Hence the probability that a majority will be favorable is
$$\frac{60}{343} + \frac{80}{343} + \frac{45}{343} + \frac{24}{343} = \frac{209}{343}.$$
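The case-by-case addition above generalizes to any number of independent events by enumerating every pattern of occurrence. The sketch below is one way to do this; the function name `p_at_least_k` is illustrative.

```python
from fractions import Fraction
from itertools import product

def p_at_least_k(ps, k):
    """Probability that at least k of the independent events with chances ps occur."""
    total = Fraction(0)
    # Each outcome is a tuple of booleans: True where the event occurs.
    for outcome in product([True, False], repeat=len(ps)):
        if sum(outcome) >= k:
            weight = Fraction(1)
            for happened, p in zip(outcome, ps):
                weight *= p if happened else 1 - p   # multiplicative law
            total += weight                          # patterns are mutually exclusive
    return total

critics = [Fraction(5, 7), Fraction(4, 7), Fraction(3, 7)]
majority_favorable = p_at_least_k(critics, 2)
```

The enumeration visits 2^n patterns, so it is practical only for a handful of events, which is all that problems of this kind require.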
Example 3. A can hit a target 4 times in 5 shots; B can hit it 3 times in 4 shots; C can hit it
twice in 3 shots. They fire a volley. What is the probability that at least two shots hit?
Sol. Probability of A's hitting the target = $\dfrac{4}{5}$
Probability of B's hitting the target = $\dfrac{3}{4}$
Probability of C's hitting the target = $\dfrac{2}{3}$.
For at least two hits, we may have
(i) A, B, C all hit the target, the probability of which is
        $\dfrac{4}{5} \times \dfrac{3}{4} \times \dfrac{2}{3} = \dfrac{24}{60}$.
(ii) A, B hit the target and C misses it, the probability of which is
        $\dfrac{4}{5} \times \dfrac{3}{4} \times \left(1 - \dfrac{2}{3}\right) = \dfrac{4}{5} \times \dfrac{3}{4} \times \dfrac{1}{3} = \dfrac{12}{60}$.
(iii) A, C hit the target and B misses it, the probability of which is
        $\dfrac{4}{5} \times \left(1 - \dfrac{3}{4}\right) \times \dfrac{2}{3} = \dfrac{4}{5} \times \dfrac{1}{4} \times \dfrac{2}{3} = \dfrac{8}{60}$.
(iv) B, C hit the target and A misses it, the probability of which is
        $\left(1 - \dfrac{4}{5}\right) \times \dfrac{3}{4} \times \dfrac{2}{3} = \dfrac{1}{5} \times \dfrac{3}{4} \times \dfrac{2}{3} = \dfrac{6}{60}$.
Since these are mutually exclusive events, the required probability is
$$\frac{24}{60} + \frac{12}{60} + \frac{8}{60} + \frac{6}{60} = \frac{50}{60} = \frac{5}{6}.$$
Example 4. A has 2 shares in a lottery in which there are 3 prizes and 5 blanks; B has 3
shares in a lottery in which there are 4 prizes and 6 blanks. Show that A’s chance of success is to
B’s as 27 : 35.
Sol. A can draw two tickets (out of 3 + 5 = 8) in ${}^{8}C_2 = 28$ ways.
A will get all blanks in ${}^{5}C_2 = 10$ ways. ∴ A can win a prize in 28 – 10 = 18 ways.
Hence A's chance of success = $\dfrac{18}{28} = \dfrac{9}{14}$
B can draw 3 tickets in ${}^{10}C_3 = 120$ ways; B will get all blanks in ${}^{6}C_3 = 20$ ways.
∴ B can win a prize in 120 – 20 = 100 ways.
Hence B's chance of success = $\dfrac{100}{120} = \dfrac{5}{6}$.
∴ A's chance : B's chance = $\dfrac{9}{14} : \dfrac{5}{6}$ = 27 : 35.
Example 5. A and B throw alternately with a single die, A having the first throw. The
person who first throws a one wins. What are their respective chances of winning?
Sol. The chance of throwing a one with a single die = $\dfrac{1}{6}$
The chance of not throwing a one with a single die = $1 - \dfrac{1}{6} = \dfrac{5}{6}$.
If A is to win, he should throw a one in the first or third or fifth, . . . , throws.
If B is to win, he should throw a one in the second or fourth or sixth, . . . , throws.
The chances that a one is thrown in the first, second, third, fourth, . . . , throws are
$$\frac{1}{6},\ \frac{5}{6}\cdot\frac{1}{6},\ \frac{5}{6}\cdot\frac{5}{6}\cdot\frac{1}{6},\ \frac{5}{6}\cdot\frac{5}{6}\cdot\frac{5}{6}\cdot\frac{1}{6},\ \dots \quad\text{or}\quad \frac{1}{6},\ \frac{5}{6}\cdot\frac{1}{6},\ \left(\frac{5}{6}\right)^2\cdot\frac{1}{6},\ \left(\frac{5}{6}\right)^3\cdot\frac{1}{6},\ \dots$$
∴ A's chance
$$= \frac{1}{6} + \left(\frac{5}{6}\right)^2\cdot\frac{1}{6} + \left(\frac{5}{6}\right)^4\cdot\frac{1}{6} + \dots = \frac{\frac{1}{6}}{1 - \left(\frac{5}{6}\right)^2} = \frac{6}{11}$$
        [Sum of an infinite geometric progression = $\dfrac{a}{1 - r}$]
B's chance = $1 - \dfrac{6}{11} = \dfrac{5}{11}$.
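The geometric series can be summed in closed form and also checked against its partial sums. The sketch below assumes a single-throw success chance p and two players throwing alternately; the function names are illustrative.

```python
from fractions import Fraction

def first_player_chance(p):
    """Closed form a / (1 - r) with first term a = p and common ratio r = (1 - p)^2."""
    q = 1 - p
    return p / (1 - q * q)

def partial_sum(p, terms):
    """Chance that the first player wins within his first `terms` throws."""
    q = 1 - p
    return sum(p * q ** (2 * k) for k in range(terms))

p = Fraction(1, 6)
a_chance = first_player_chance(p)
b_chance = 1 - a_chance
approx = float(partial_sum(p, 100))   # approaches a_chance from below
```

Since the game ends with probability 1, B's chance is simply the complement of A's.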
Example 6. Cards are dealt one by one from a well-shuffled deck until an ace appears.
Show that the probability that exactly n cards are dealt before the first ace appears is
$$\frac{4(51 - n)(50 - n)(49 - n)}{52 \cdot 51 \cdot 50 \cdot 49}.$$
Sol. Let A be the event of drawing n non-ace cards and B, the event of drawing an ace in the
(n + 1)th draw.
Consider the event A
n cards can be drawn out of 52 cards in ${}^{52}C_n$ ways.
⇒ Exhaustive cases = ${}^{52}C_n$
n non-ace cards can be drawn out of the 48 non-ace cards in ${}^{48}C_n$ ways.
⇒ Favorable cases = ${}^{48}C_n$
$$\therefore\ P(A) = \frac{{}^{48}C_n}{{}^{52}C_n} = \frac{48!}{(48 - n)!\,n!} \times \frac{(52 - n)!\,n!}{52!}$$
$$= \frac{48! \cdot (52 - n)(51 - n)(50 - n)(49 - n)(48 - n)!}{(48 - n)! \cdot 52 \cdot 51 \cdot 50 \cdot 49 \cdot 48!} = \frac{(52 - n)(51 - n)(50 - n)(49 - n)}{52 \cdot 51 \cdot 50 \cdot 49}.$$
Consider the event B
n cards have already been drawn in the first n draws.
Exhaustive cases = ${}^{52-n}C_1$ = 52 – n; Favorable cases = ${}^{4}C_1$ = 4
∴        $P(B/A) = \dfrac{4}{52 - n}$
Reqd. Probability = P(A) ⋅ P(B/A)
$$= \frac{(52 - n)(51 - n)(50 - n)(49 - n)}{52 \cdot 51 \cdot 50 \cdot 49} \times \frac{4}{52 - n} = \frac{4(51 - n)(50 - n)(49 - n)}{52 \cdot 51 \cdot 50 \cdot 49}.$$
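As a sanity check, the closed form should agree with P(A) ⋅ P(B/A) for every possible n, and the probabilities over n = 0, 1, . . . , 48 should add up to 1. The sketch below verifies both with exact fractions; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def p_closed_form(n):
    """4(51 - n)(50 - n)(49 - n) / (52 * 51 * 50 * 49)."""
    return Fraction(4 * (51 - n) * (50 - n) * (49 - n), 52 * 51 * 50 * 49)

def p_product_rule(n):
    """P(A) * P(B/A): n non-aces first, then an ace on draw n + 1."""
    p_a = Fraction(comb(48, n), comb(52, n))
    p_b_given_a = Fraction(4, 52 - n)
    return p_a * p_b_given_a

# The first ace must appear after 0 to 48 non-aces.
distribution = [p_closed_form(n) for n in range(49)]
```

The total of 1 reflects the fact that an ace is certain to appear by the 49th card at the latest.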
Example 7. An urn contains 10 white and 3 black balls, while another urn contains 3 white
and 5 black balls. Two balls are drawn from the first urn and put into the second urn and then a
ball is drawn from the latter. What is the probability that it is a white ball?
Sol. The two balls drawn from the first urn may be
(i) both white (ii) both black (iii) one white and one black.
Let these events be denoted by A, B, C respectively.
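$$P(A) = \frac{{}^{10}C_2}{{}^{13}C_2} = \frac{10 \times 9}{13 \times 12} = \frac{15}{26}; \qquad P(B) = \frac{{}^{3}C_2}{{}^{13}C_2} = \frac{3 \times 2}{13 \times 12} = \frac{1}{26}$$
$$P(C) = \frac{{}^{10}C_1 \times {}^{3}C_1}{{}^{13}C_2} = \frac{10 \times 3}{\dfrac{13 \times 12}{2 \times 1}} = \frac{10}{26}$$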
When two balls are transferred from the first urn to the second urn, the second urn may
contain
(i) 5 white and 5 black balls
(ii) 3 white and 7 black balls
(iii) 4 white and 6 black balls.
Let W denote the event of drawing a white ball from the second urn in the three cases (i),
(ii), and (iii).
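Now   $P(W/A) = \dfrac{5}{10}$, $P(W/B) = \dfrac{3}{10}$, $P(W/C) = \dfrac{4}{10}$
∴ Reqd. probability = P(A) ⋅ P(W/A) + P(B) ⋅ P(W/B) + P(C) ⋅ P(W/C)
$$= \frac{15}{26}\cdot\frac{5}{10} + \frac{1}{26}\cdot\frac{3}{10} + \frac{10}{26}\cdot\frac{4}{10} = \frac{75 + 3 + 40}{260} = \frac{118}{260} = \frac{59}{130}.$$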
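The calculation sums, over the possible compositions of the transferred balls, the chance of that composition times the chance of then drawing white. The sketch below writes this out as a loop; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

white1, black1 = 10, 3   # first urn
white2, black2 = 3, 5    # second urn before the transfer
moved = 2                # balls transferred from the first urn to the second

p_white = Fraction(0)
for w in range(moved + 1):            # w = number of white balls transferred
    b = moved - w                     # number of black balls transferred
    # Chance of transferring exactly w white and b black balls.
    p_transfer = Fraction(comb(white1, w) * comb(black1, b),
                          comb(white1 + black1, moved))
    # Chance of drawing white from the second urn after that transfer.
    p_white_given = Fraction(white2 + w, white2 + black2 + moved)
    p_white += p_transfer * p_white_given
```

The three terms of the loop correspond to the cases B (w = 0), C (w = 1), and A (w = 2) of the solution.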
TEST YOUR KNOWLEDGE
1. In a class of 10 students, 4 are boys and the rest are girls. Find the probability that a student selected will
be a girl.
2. What is the chance that a (i) non-leap year (ii) leap year should have fifty-three Sundays?
3. A card is drawn from an ordinary deck and a gambler bets that it is a spade or an ace. What are the odds
against his winning the bet?
4. An integer is chosen at random from the first two hundred positive integers. What is the probability that
the integer chosen is divisible by 6 or 8?
5. Six cards are drawn at random from a deck of 52 cards. What is the probability that 3 will be red and 3
will be black?
6. From a set of raffle tickets numbered 1 to 100, three are drawn at random. What is the probability that all
are odd numbered?
7. (a) If from a lottery of 30 tickets, marked, 1, 2, 3, . . . , 30, four tickets are drawn, what is the chance that
those marked 1 and 2 are among them?
(b) An urn contains 5 red and 10 black balls. Eight of them are placed in another urn. What is the chance
that the latter then contains 2 red and 6 black balls?
8. A party of n people sit at a round table. Find the odds against two specified individuals sitting next to
each other.
9. A five-figured number is formed by the digits 0, 1, 2, 3, 4 (without repetition). Find the probability that
the number formed is divisible by 4.
10. Three newspapers A, B, C are published in a city and a survey of readers indicates the following: 20%
read A, 16% read B, 14% read C, 8% read both A and B, 5% read both A and C, 4% read both B and C,
and 2% read all three.
For a person chosen at random, find the probability that he reads none of the papers.
11. A problem in statistics is given to five students. Their chances of solving it are $\dfrac{1}{2}, \dfrac{1}{3}, \dfrac{1}{4}, \dfrac{1}{4}$, and $\dfrac{1}{5}$.
What is the probability that the problem will be solved?
12. A can hit a target 5 times in 6 shots, B hits it 4 times in 5 shots, and C hits it 3 times in 4 shots. They fire
a volley. What is the probability that at least two shots hit the target?
13. Three groups of children contain, respectively, 3 girls and 1 boy; 2 girls and 2 boys; 1 girl and 3 boys.
One child is selected at random from each group. Show that the chance that the three selected consist of
1 girl and 2 boys is $\dfrac{13}{32}$.
14. Four people are chosen at random from a group containing 3 men, 2 women, and 4 children. Show that
the chance that exactly two of them will be children is $\dfrac{10}{21}$.
15. A bag contains 10 balls, two of which are red, three are blue, and five are black. Three balls are drawn at
random from the bag. What is the probability that
(i) the three balls are of different colors, (ii) two balls are of the same color,
(iii) the balls are all of the same color.
16. It is 8 : 5 against a person who is 40 years old living until they are 70 and 4 : 3 against a person now 50
living until they are 80. Find the probability that at least one of these people will be alive 30 years from
now.
17. Find the chance of throwing 5 or 6 at least once in four throws of a die.
18. A has 3 shares in a lottery where there are 3 prizes and 6 blanks. B has one share in another, where there
is just one prize and two blanks. Show that A has a better chance of winning a prize than B in the ratio
16 : 7.
19. A, B, and C, in order, toss a coin. The first one to throw a head wins. If A starts, find their respective
chances of winning.
20. A speaks the truth in 60% of cases and B in 70% of cases. In what percentages of cases are they likely to
contradict each other in stating the same fact?
21. A and B throw alternately with a pair of ordinary dice. A wins if he throws 6 before B throws 7 and B
wins if he throws 7 before A throws 6. If A begins, find their respective chances of winning. (Huygens'
Problem)
22. (a) Two cards are randomly drawn from a deck of 52 cards and thrown away. What is the probability of
drawing an ace in a single draw from the remaining 50 cards?
(b) A box A contains 2 white and 4 black balls. Another box B contains 5 white and 7 black balls. A ball
is transferred from the box A to the box B; then a ball is drawn from box B. Find the probability that it is
white.
23. Of the cigarette-smoking population, 70% are men and 30% are women, 10% of these men and 20% of
these women smoke ABC Cigarettes. What is the probability that a person seen smoking an ABC
cigarette will be a man?
24. A committee consists of 9 students, two of which are in their 1st year, three are in their 2nd year, and
four are in their 3rd year. Three students are to be removed at random. What is the chance that
(i) the three students belong to different classes,
(ii) two belong to the same class and the third to the different class, and
(iii) the three belong to the same class?
25. Five workers in a company of twenty are graduates. If 3 workers are picked out of 20 at random, what is
the probability that
(i) they are all graduates?
(ii) at least one is a graduate?
26. If A, B, C are events such that
P(A) = 0.3, P(B) = 0.4, P(C) = 0.8, P(A ∩ B) = 0.08, P(A ∩ C) = 0.28, P(A ∩ B ∩ C) = 0.09
If P(A ∪ B ∪ C) ≥ 0.75, then show that 0.23 ≤ P(B ∩ C) ≤ 0.48.
27. For two events A and B, let P(A) = 0.4, P(B) = p and P(A ∪ B) = 0.6
(i) Find p so that A and B are independent events.
(ii) For what value of p are A and B mutually exclusive?
28. A husband and wife appear in an interview for two vacancies in the same position. The probability of the
husband’s selection is 1/7 and that of the wife’s selection is 1/5. What is the probability that
(i) both of them will be selected, (ii) only one of them will be selected, and
(iii) none of them will be selected?
29. Two dice are tossed once. Find the probability of getting an even number on the first throw or a total
of 8.
30. A drawer contains 50 bolts and 150 nuts. Half of the bolts and half of the nuts are rusted. If one item is
chosen at random, what is the probability that it is rusted or is a bolt?
31. An old purse contains 2 silver and 4 copper coins. A second purse contains 4 silver and 3 copper coins.
If a coin is pulled out at random from one of the two purses, what is the probability that it is a silver
coin?
32. A class consists of 80 students, 25 of which are girls and 55 are boys, 10 of which have blue eyes and
the remaining 20 have brown hair. What is the probability of selecting a brown-haired, blue-eyed girl?
33. Of the students attending a lecture, 50% could not see what was written on the board and 40% could not
hear what the lecturer was saying. The most unfortunate 30% fell into both of these categories. What is
the probability that a student picked at random was able to see and hear satisfactorily?
34. The probabilities of A, B, C solving a problem are 1/3, 2/7, and 3/8, respectively. If all three try to solve
the problem simultaneously, find the probability that exactly one of them will solve it.
35. A student takes his examinations in four subjects α, β, γ, δ. He estimates his chance of passing in α
as 4/5, in β as 3/4, in γ as 5/6, and in δ as 2/3. To qualify he must pass in α and at least two other
subjects. What is the probability that he qualifies?
36. For any two events A and B, prove that
P(A ∩ B) ≤ P(A) ≤ P(A ∪ B) ≤ P(A) + P(B).
Answers
1. 3/5
2. (i) 1/7 (ii) 2/7
3. 9 : 4
4. 1/4
5. 13000/39151
6. 4/33
7. (a) 2/145, (b) 140/429
8. (n − 3) : 2
9. 5/16
10. 13/20
11. 17/20
12. 107/120
15. (i) 1/4 (ii) 79/120 (iii) 11/120
16. 59/91
17. 65/81
19. 4/7, 2/7, 1/7
20. 46%
21. 30/61, 31/61
22. (a) 1/13, (b) 16/39
23. 7/13
24. (i) 2/7 (ii) 55/84 (iii) 5/84
25. (i) 1/114 (ii) 137/228
27. (i) 1/3 (ii) 0.2
28. (i) 1/35 (ii) 2/7 (iii) 24/35
29. 5/9
30. 5/8
31. 19/42
32. 5/512
33. 2/5
34. 25/56
35. 61/90
21.50
BAYES’ THEOREM
If E1, E2, . . . , En are mutually exclusive and exhaustive events of a random experiment with
P(Ei) ≠ 0 (i = 1, 2, . . . , n), then for any arbitrary event A of the sample space of the above
experiment with P(A) > 0, we have
$$P(E_i/A) = \frac{P(E_i)\,P(A/E_i)}{\sum_{i=1}^{n} P(E_i)\,P(A/E_i)}$$
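Proof. Let S be the sample space of the random experiment.
The events E1, E2, . . . , En being exhaustive,
$$S = E_1 \cup E_2 \cup \ldots \cup E_n$$
$$\begin{aligned}
\therefore\quad A &= A \cap S && [\because A \subset S] \\
&= A \cap (E_1 \cup E_2 \cup \ldots \cup E_n) \\
&= (A \cap E_1) \cup (A \cap E_2) \cup \ldots \cup (A \cap E_n) && [\text{Distributive Law}] \\
\Rightarrow\quad P(A) &= P(A \cap E_1) + P(A \cap E_2) + \ldots + P(A \cap E_n) && [\because \text{the events } A \cap E_i \text{ are mutually exclusive}] \\
&= P(E_1)P(A/E_1) + P(E_2)P(A/E_2) + \ldots + P(E_n)P(A/E_n) \\
&= \sum_{i=1}^{n} P(E_i)P(A/E_i) && \ldots (1)
\end{aligned}$$
Now
$$P(A \cap E_i) = P(A)\,P(E_i/A)$$
$$\Rightarrow\quad P(E_i/A) = \frac{P(A \cap E_i)}{P(A)} = \frac{P(E_i)P(A/E_i)}{\sum_{i=1}^{n} P(E_i)P(A/E_i)} \qquad [\text{Using (1)}]$$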
Note. The significance of Bayes’ Theorem may be understood in the following manner:
P(Ei) is the probability of the occurrence of Ei. The experiment is performed and we are told that the event A
has occurred. With this information, the probability P(Ei) is changed to P(Ei/A). Bayes’ Theorem enables us to
evaluate P(Ei/A) if all the P(Ei) and the conditional probabilities P(A/Ei) are known.
ILLUSTRATIVE EXAMPLES
Example 1. A bag X contains 2 white and 3 red balls and a bag Y contains 4 white and 5
red balls. One ball is drawn at random from one of the bags and is found to be red. Find the
probability that it was drawn from bag Y.
Sol. Let E1: the ball is drawn from bag X; E2: the ball is drawn from bag Y
and
A: the ball is red.
We have to find P(E2/A).
By Bayes’ Theorem,
$$P(E_2/A) = \frac{P(E_2)P(A/E_2)}{P(E_1)P(A/E_1) + P(E_2)P(A/E_2)} \qquad \ldots (1)$$
Since the two bags are equally likely to be selected, P(E1) = P(E2) = 1/2
Also P(A/E1) = P(a red ball is drawn from bag X) = 3/5
P(A/E2) = P(a red ball is drawn from bag Y) = 5/9
∴ From (1), we have
$$P(E_2/A) = \frac{\frac{1}{2} \times \frac{5}{9}}{\frac{1}{2} \times \frac{3}{5} + \frac{1}{2} \times \frac{5}{9}} = \frac{25}{52}.$$
Example 2. In a bolt factory, machines A, B, and C manufacture respectively 25%, 35%,
and 40% of the total. Of their output 5, 4, and 2 percent are defective bolts. A bolt is drawn at
random from the product and is found to be defective. What is the probability that it was manufactured by machine B?
Sol. Let E1, E2, and E3 denote the events that a bolt selected at random is manufactured by
the machines A, B, and C respectively and let H denote the event of its being defective. Then
P(E1) = 0.25, P(E2) = 0.35, P(E3) = 0.40
The probability of drawing a defective bolt manufactured by machine A is P(H/E1) = 0.05
Similarly,
P(H/E2) = 0.04 and P(H/E3) = 0.02
By Bayes’ Theorem, we have
$$\begin{aligned}
P(E_2/H) &= \frac{P(E_2)P(H/E_2)}{P(E_1)P(H/E_1) + P(E_2)P(H/E_2) + P(E_3)P(H/E_3)} \\
&= \frac{0.35 \times 0.04}{0.25 \times 0.05 + 0.35 \times 0.04 + 0.40 \times 0.02} = \frac{0.0140}{0.0345} \approx 0.41.
\end{aligned}$$
Example 3. The contents of bags I, II, and III are as follows:
1 white, 2 black, and 3 red balls,
2 white, 1 black, and 1 red balls, and
4 white, 5 black, and 3 red balls.
One bag is chosen at random and two balls are drawn from it. They happen to be white and
red.
What is the probability that they come from bags I, II, or III?
Sol. Let E1 : bag I is chosen; E2 : bag II is chosen; E3 : bag III is chosen
and
A : the two balls are white and red.
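We have to find P(E1/A), P(E2/A), and P(E3/A).
Now P(E1) = P(E2) = P(E3) = 1/3
$$P(A/E_1) = P(\text{a white and a red ball are drawn from bag I}) = \frac{{}^{1}C_1 \times {}^{3}C_1}{{}^{6}C_2} = \frac{1}{5}$$
$$P(A/E_2) = \frac{{}^{2}C_1 \times {}^{1}C_1}{{}^{4}C_2} = \frac{1}{3}; \qquad P(A/E_3) = \frac{{}^{4}C_1 \times {}^{3}C_1}{{}^{12}C_2} = \frac{2}{11}$$
By Bayes’ Theorem, we have
$$P(E_1/A) = \frac{P(E_1)P(A/E_1)}{P(E_1)P(A/E_1) + P(E_2)P(A/E_2) + P(E_3)P(A/E_3)} = \frac{\frac{1}{3} \times \frac{1}{5}}{\frac{1}{3} \times \frac{1}{5} + \frac{1}{3} \times \frac{1}{3} + \frac{1}{3} \times \frac{2}{11}} = \frac{33}{118}$$
Similarly, P(E2/A) = 55/118 and P(E3/A) = 15/59.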
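Note. The three examples above follow the same recipe: multiply each P(Ei) by the corresponding P(A/Ei), then divide by the sum of these products. The Python sketch below is one way to check the arithmetic; the function name bayes_posteriors is an illustrative choice and not part of the text, and exact fractions are used so that the results can be compared with the worked answers.

```python
from fractions import Fraction as F

def bayes_posteriors(priors, likelihoods):
    """Return the list of P(E_i/A), given the P(E_i) and the P(A/E_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(E_i) P(A/E_i)
    total = sum(joint)                                    # P(A)
    return [j / total for j in joint]

# Example 1: bags X and Y, A = "the ball is red"
ex1 = bayes_posteriors([F(1, 2), F(1, 2)], [F(3, 5), F(5, 9)])
# Example 2: machines A, B, C, H = "the bolt is defective"
ex2 = bayes_posteriors([F(25, 100), F(35, 100), F(40, 100)],
                       [F(5, 100), F(4, 100), F(2, 100)])
# Example 3: bags I, II, III, A = "one white and one red ball"
ex3 = bayes_posteriors([F(1, 3)] * 3, [F(1, 5), F(1, 3), F(2, 11)])

print(ex1[1])         # P(E2/A) in Example 1
print(float(ex2[1]))  # P(E2/H) in Example 2
print(ex3)            # P(E1/A), P(E2/A), P(E3/A) in Example 3
```

Because the posteriors are obtained by dividing by their common total, they always add up to 1, which is a quick check on any hand computation.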
TEST YOUR KNOWLEDGE
1. Two bags contain 4 white, 6 blue and 4 white, 5 blue balls, respectively. One of the bags is selected
at random and a ball is drawn from it. If the ball drawn is white, find the probability that it is drawn
from the
(i) first bag
(ii) second bag
2. Three bags contain 6 red, 4 black; 4 red, 6 black; and 5 red, 5 black balls, respectively. One of the bags
is selected at random and a ball is drawn from it. If the ball drawn is red, find the probability that it is
drawn from the first bag.
3. A factory has two machines A and B. Past records show that machine A produced 60% of the items of
output and machine B produced 40% of the items. Further, 2% of the items produced by machine A
were defective and 1% produced by machine B were defective. If a defective item is drawn at random,
what is the probability that it was produced by machine A?
4. An insurance company insured 2000 motorcycle drivers, 4000 car drivers, and 6000 truck drivers. The
probability of an accident is 0.01, 0.03, and 0.15 respectively. One of the insured persons has an
accident. What is the probability that he is a motorcycle driver?
5. A company has two plants to manufacture scooters. Plant I manufactures 70% of scooters and plant II
manufactures 30%. At plant I, 80% of the scooters are rated standard quality and at plant II, 90% of the
scooters are rated standard quality. A scooter is chosen at random and is found to be of standard quality.
What is the chance that it has come from plant II?
Answers
1. (i) 9/19 (ii) 10/19
2. 2/5
3. 3/4
4. 1/52
5. 27/83
________________________________________________________________________________________________________
21.51
RANDOM VARIABLE
If the numerical values assumed by a variable are the result of some chance factors, so that a
particular value cannot be exactly predicted in advance, the variable is then called a random
variable. A random variable is also called a chance variable or a stochastic variable.
Random variables are denoted by capital letters, usually from the last part of the alphabet,
for instance, X, Y, Z, etc.
Continuous and Discrete Random Variables
A continuous random variable is one that can assume any value within an interval, i.e., all
values of a continuous scale. For example (i) the weights (in kg) of a group of individuals, (ii)
the heights of a group of individuals.
A discrete random variable is one that can assume only isolated values. For example,
(i) the number of heads in 4 tosses of a coin is a discrete random variable as it cannot
assume values other than 0, 1, 2, 3, 4.
(ii) the number of aces in a draw of 2 cards from a well-shuffled deck is a random variable
as it can take the values 0, 1, 2 only.
21.52
DISCRETE PROBABILITY DISTRIBUTION
Let a random variable X assume values x1, x2, x3, . . . , xn with probabilities p1, p2, p3, . . . , pn
respectively, where P(X = xi) = pi ≥ 0 for each xi and
$$p_1 + p_2 + p_3 + \ldots + p_n = \sum_{i=1}^{n} p_i = 1.$$
Then
X    : x1, x2, x3, . . . , xn
P(X) : p1, p2, p3, . . . , pn
is called the discrete probability distribution for X and it spells out how a total probability of 1 is
distributed over several values of the random variable.
21.53
MEAN AND VARIANCE OF RANDOM VARIABLES
Let
X    : x1, x2, x3, . . . , xn
P(X) : p1, p2, p3, . . . , pn
be a discrete probability distribution.
We denote the mean by μ and define
$$\mu = \frac{\sum p_i x_i}{\sum p_i} = \sum p_i x_i \qquad (\because \sum p_i = 1)$$
Other names for the mean are average or expected value E(X).
We denote the variance by σ² and define
$$\sigma^2 = \sum p_i (x_i - \mu)^2$$
If μ is not a whole number, it is more convenient to use the equivalent form
$$\sigma^2 = \sum p_i x_i^2 - \mu^2$$
$$\text{Standard deviation } \sigma = +\sqrt{\text{Variance}}.$$
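Note. The two expressions for the variance can be compared numerically. The following Python sketch computes μ and σ² for a small distribution chosen purely for illustration (it does not come from the text); the function name mean_and_variance is likewise an assumption.

```python
from fractions import Fraction as F

def mean_and_variance(xs, ps):
    """Mean and variance of a discrete distribution with values xs and probabilities ps."""
    assert sum(ps) == 1, "the probabilities must add up to 1"
    mu = sum(p * x for x, p in zip(xs, ps))
    var_definition = sum(p * (x - mu) ** 2 for x, p in zip(xs, ps))
    var_shortcut = sum(p * x * x for x, p in zip(xs, ps)) - mu ** 2
    assert var_definition == var_shortcut  # both forms give the same value
    return mu, var_definition

# Hypothetical distribution, used only as an illustration
xs = [0, 1, 2, 3]
ps = [F(1, 8), F(3, 8), F(3, 8), F(1, 8)]
mu, var = mean_and_variance(xs, ps)
print(mu, var, float(var) ** 0.5)  # mean, variance, standard deviation
```

Here μ is not a whole number, so the second form avoids squaring fractional deviations term by term.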
ILLUSTRATIVE EXAMPLES
Example 1. Five defective bulbs are accidentally mixed with twenty good ones. It is not
possible to just look at a bulb and tell whether or not it is defective. Find the probability
distribution of the number of defective bulbs, if four bulbs are drawn at random from this lot.
Sol. Let X denote the number of defective bulbs out of four. Clearly, X can take the values
0, 1, 2, 3, or 4.
Number of defective bulbs = 5
Number of good bulbs = 20
Total number of bulbs = 25
$$P(X = 0) = P(\text{no defective}) = P(\text{all 4 good ones}) = \frac{{}^{20}C_4}{{}^{25}C_4} = \frac{20 \times 19 \times 18 \times 17}{25 \times 24 \times 23 \times 22} = \frac{969}{2530}$$
$$P(X = 1) = P(\text{1 defective and 3 good ones}) = \frac{{}^{5}C_1 \times {}^{20}C_3}{{}^{25}C_4} = \frac{1140}{2530}$$
$$P(X = 2) = P(\text{2 defectives and 2 good ones}) = \frac{{}^{5}C_2 \times {}^{20}C_2}{{}^{25}C_4} = \frac{380}{2530}$$
$$P(X = 3) = P(\text{3 defectives and 1 good one}) = \frac{{}^{5}C_3 \times {}^{20}C_1}{{}^{25}C_4} = \frac{40}{2530}$$
$$P(X = 4) = P(\text{all 4 defectives}) = \frac{{}^{5}C_4}{{}^{25}C_4} = \frac{1}{2530}$$
∴ The probability distribution of the random variable X is
X    :  0         1          2         3        4
P(X) :  969/2530  1140/2530  380/2530  40/2530  1/2530
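Note. The distribution in Example 1 can be reproduced by counting selections directly. The sketch below uses Python's math.comb for nCr; the function name defective_distribution and its parameter names are illustrative assumptions. Fractions are printed in lowest terms, so 1140/2530 appears as 114/253.

```python
from fractions import Fraction
from math import comb

def defective_distribution(defective, good, drawn):
    """P(X = x) for x defectives when `drawn` items are taken without replacement."""
    total_ways = comb(defective + good, drawn)
    return {
        x: Fraction(comb(defective, x) * comb(good, drawn - x), total_ways)
        for x in range(min(defective, drawn) + 1)
    }

dist = defective_distribution(defective=5, good=20, drawn=4)
for x, p in dist.items():
    print(x, p)
print(sum(dist.values()))  # total probability
```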
Example 2. A die is tossed three times. A success is “getting 1 or 6” on a toss. Find the
mean and the variance of the number of successes.
Sol. Let X denote the number of successes. Clearly X can take the values 0, 1, 2, or 3.
Probability of success = 2/6 = 1/3; Probability of failure = 1 − 1/3 = 2/3
$$P(X = 0) = P(\text{no success}) = P(\text{all 3 failures}) = \frac{2}{3} \times \frac{2}{3} \times \frac{2}{3} = \frac{8}{27}$$
$$P(X = 1) = P(\text{1 success and 2 failures}) = {}^{3}C_1 \times \frac{1}{3} \times \frac{2}{3} \times \frac{2}{3} = \frac{12}{27}$$
$$P(X = 2) = P(\text{2 successes and 1 failure}) = {}^{3}C_2 \times \frac{1}{3} \times \frac{1}{3} \times \frac{2}{3} = \frac{6}{27}$$
$$P(X = 3) = P(\text{all 3 successes}) = \frac{1}{3} \times \frac{1}{3} \times \frac{1}{3} = \frac{1}{27}$$
∴ The probability distribution of the random variable X is
X    :  0     1      2     3
P(X) :  8/27  12/27  6/27  1/27
To find the mean and variance
xi      pi      pi xi    pi xi²
0       8/27    0        0
1       12/27   12/27    12/27
2       6/27    12/27    24/27
3       1/27    3/27     9/27
Total           1        5/3
Mean μ = Σpi xi = 1
Variance σ² = Σpi xi² − μ² = 5/3 − 1 = 2/3.
Example 3. A random variable X has the following probability function:
Values of X, x :  0  1  2   3   4   5   6    7
p(x)           :  0  k  2k  2k  3k  k²  2k²  7k² + k
(i) Find k,
(ii) Evaluate P(X < 6), P(X ≥ 6), P(3 < X ≤ 6)
(iii) Find the minimum value of x so that P(X ≤ x) > 1/2.
Sol. (i) Since $\sum_{x=0}^{7} p(x) = 1$, we have
0 + k + 2k + 2k + 3k + k² + 2k² + 7k² + k = 1
⇒ 10k² + 9k – 1 = 0
⇒ (10k – 1)(k + 1) = 0
⇒ k = 1/10 [∵ p(x) ≥ 0]
(ii) P(X < 6) = P(X = 0) + P(X = 1) + . . . + P(X = 5)
= 0 + k + 2k + 2k + 3k + k² = 8k + k² = 8/10 + 1/100 = 81/100
P(X ≥ 6) = P(X = 6) + P(X = 7)
= 2k² + 7k² + k = 9/100 + 1/10 = 19/100
P(3 < X ≤ 6) = P(X = 4) + P(X = 5) + P(X = 6)
= 3k + k² + 2k² = 3/10 + 3/100 = 33/100
(iii) P(X ≤ 1) = k = 1/10 < 1/2; P(X ≤ 2) = k + 2k = 3/10 < 1/2
P(X ≤ 3) = k + 2k + 2k = 5/10 = 1/2; P(X ≤ 4) = k + 2k + 2k + 3k = 8/10 > 1/2
∴ The minimum value of x so that P(X ≤ x) > 1/2 is 4.
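Note. Part (iii) amounts to scanning the cumulative probabilities P(X ≤ x) until the threshold is first exceeded. A short Python sketch of that scan, taking k = 1/10 from part (i), is given below; the variable names are illustrative.

```python
from fractions import Fraction as F
from itertools import accumulate

k = F(1, 10)
xs = list(range(8))
ps = [0, k, 2 * k, 2 * k, 3 * k, k**2, 2 * k**2, 7 * k**2 + k]
assert sum(ps) == 1  # confirms the value of k found in part (i)

cdf = list(accumulate(ps))  # cdf[x] = P(X <= x)
smallest_x = next(x for x, c in zip(xs, cdf) if c > F(1, 2))
print(smallest_x, cdf[smallest_x])
```

The same list cdf also gives part (ii): P(X < 6) is cdf[5], and P(X ≥ 6) is 1 − cdf[5].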
TEST YOUR KNOWLEDGE
1. Find the probability distribution of the number of doubles in four throws of a pair of dice.
2. Two bad eggs are mixed accidentally with 10 good ones. Find the probability distribution of the number of
bad eggs in 3, drawn at random, without replacement, from this lot.
3. A die is tossed twice. Getting a number greater than 4 is considered a success. Find the variance of the
probability distribution of the number of successes.
4. Two cards are drawn simultaneously from a well-shuffled deck of 52 cards. Compute the variance for
the number of aces.
5. A bag contains 4 white and 3 red balls. Three balls are drawn, with replacement, from this bag. Find μ,
σ², and σ for the number of red balls drawn.
6. A random variable X has the following probability distribution:
Values of X, x :  0  1   2   3   4   5    6    7    8
p(x)           :  a  3a  5a  7a  9a  11a  13a  15a  17a
(i) Determine the value of a.
(ii) Find P(X < 3), P(X ≥ 3), P(2 ≤ X < 5)
(iii) What is the smallest value of x for which P(X ≤ x) > 0.5?
7. Find the standard deviation for the following discrete distribution:
x    :  8    12   16   20   24
p(x) :  1/8  1/6  3/8  1/4  1/12
Answers
1. X    :  0         1         2         3        4
   P(X) :  625/1296  500/1296  150/1296  20/1296  1/1296
2. X    :  0      1     2
   P(X) :  12/22  9/22  1/22
3. 4/9
4. 400/2873
5. 9/7, 36/49, 6/7
6. (i) a = 1/81 (ii) 1/9, 8/9, 7/27 (iii) 6
7. 2√5
________________________________________________________________________________________________________
21.54
THEORETICAL DISTRIBUTIONS
Frequency distributions can be classified under two heads:
(i) Observed Frequency Distributions.
(ii) Theoretical or Expected Frequency Distributions.
Observed frequency distributions are based on actual observation and experimentation. If a
certain hypothesis is assumed, it is sometimes possible to derive mathematically what the
frequency distribution of a certain universe should be. Such distributions are called Theoretical
Distributions.
There are many types of theoretical frequency distributions, but we shall consider only three
that are of great importance:
(i) Binomial Distribution (or Bernoulli’s Distribution);
(ii) Poisson’s Distribution;
(iii) Normal Distribution.
BINOMIAL (OR BERNOULLI’S) DISTRIBUTION
21.55
BINOMIAL PROBABILITY DISTRIBUTION
Let there be n independent trials in an experiment. Let a random variable X denote the
number of successes in these n trials. Let p be the probability of a success and q be that of a
failure in a single trial so that p + q = 1. Let the trials be independent and p be constant for every
trial.
Let us find the probability of r successes in n trials.
r successes can be obtained in n trials in nCr ways.
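$$\begin{aligned}
\therefore\quad P(X = r) &= {}^{n}C_r\, P(\underbrace{S\,S\,S \ldots S}_{r \text{ times}}\ \underbrace{F\,F\,F \ldots F}_{(n-r) \text{ times}}) \\
&= {}^{n}C_r\, \underbrace{P(S)P(S) \ldots P(S)}_{r \text{ factors}}\ \underbrace{P(F)P(F) \ldots P(F)}_{(n-r) \text{ factors}} \\
&= {}^{n}C_r\, \underbrace{p\,p\,p \ldots p}_{r \text{ factors}}\ \underbrace{q\,q\,q \ldots q}_{(n-r) \text{ factors}} \\
&= {}^{n}C_r\, p^{r} q^{n-r}
\end{aligned}$$
Hence
$$P(X = r) = {}^{n}C_r\, q^{n-r} p^{r}, \text{ where } p + q = 1 \text{ and } r = 0, 1, 2, \ldots, n. \qquad \ldots (1)$$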
The distribution (1) is called the binomial probability distribution and X is called the
binomial variate.
Note 1. P(X = r) is usually written as P(r).
Note 2. The successive probabilities P(r) in (1) for r = 0, 1, 2, . . . , n are
$${}^{n}C_0 q^{n},\ {}^{n}C_1 q^{n-1}p,\ {}^{n}C_2 q^{n-2}p^{2},\ \ldots,\ {}^{n}C_n p^{n}$$
which are the successive terms of the binomial expansion of $(q + p)^n$. That is why this distribution is called the
“binomial” distribution.
Note 3. n and p occurring in the binomial distribution are called the parameters of the distribution.
Note 4. In a binomial distribution:
(i) n, the number of trials is finite.
(ii) each trial has only two possible outcomes usually called success and failure.
(iii) all the trials are independent.
(iv) p (and hence q) is constant for all the trials.
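Note 5. The probabilities P(r) can be tabulated directly from the formula for P(X = r). The Python sketch below is one way to do so; the function name binomial_pmf and the parameter values n = 4, p = 1/2 are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Return [P(0), P(1), ..., P(n)] with P(r) = nCr * q**(n - r) * p**r."""
    q = 1 - p
    return [comb(n, r) * q ** (n - r) * p ** r for r in range(n + 1)]

# Illustrative parameters: n = 4 trials, p = 1/2
probs = binomial_pmf(4, Fraction(1, 2))
print(probs)
print(sum(probs))  # the terms of (q + p)**n add up to 1
```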
21.56 RECURRENCE OR RECURSION FORMULA FOR THE BINOMIAL
DISTRIBUTION
In a binomial distribution,
$$P(r) = {}^{n}C_r\, q^{n-r} p^{r} = \frac{n!}{(n-r)!\,r!}\, q^{n-r} p^{r}$$
$$P(r+1) = {}^{n}C_{r+1}\, q^{n-r-1} p^{r+1} = \frac{n!}{(n-r-1)!\,(r+1)!}\, q^{n-r-1} p^{r+1}$$
$$\begin{aligned}
\therefore\quad \frac{P(r+1)}{P(r)} &= \frac{(n-r)!}{(n-r-1)!} \times \frac{r!}{(r+1)!} \times \frac{p}{q} \\
&= \frac{(n-r) \times (n-r-1)!}{(n-r-1)!} \times \frac{r!}{(r+1) \times r!} \times \frac{p}{q} = \left(\frac{n-r}{r+1}\right) \cdot \frac{p}{q}
\end{aligned}$$
$$\Rightarrow\quad P(r+1) = \frac{n-r}{r+1} \cdot \frac{p}{q} \cdot P(r)$$
which is the required recurrence formula. Applying this formula successively, we can find P(1),
P(2), P(3), . . . , if P(0) is known.
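Note. The recurrence formula lends itself to computation, since each probability is obtained from the previous one by a single multiplication. The Python sketch below starts from P(0) = qⁿ and compares the result with the direct formula; the function name and the values n = 6, p = 1/3 are illustrative, and q is assumed to be non-zero.

```python
from fractions import Fraction
from math import comb

def binomial_by_recurrence(n, p):
    """Build P(0), ..., P(n) from P(0) = q**n using P(r+1) = (n-r)/(r+1) * p/q * P(r)."""
    q = 1 - p  # assumed non-zero
    probs = [q ** n]
    for r in range(n):
        probs.append(probs[-1] * Fraction(n - r, r + 1) * p / q)
    return probs

n, p = 6, Fraction(1, 3)  # illustrative values
recurrence = binomial_by_recurrence(n, p)
direct = [comb(n, r) * (1 - p) ** (n - r) * p ** r for r in range(n + 1)]
print(recurrence == direct)
```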
21.57
MEAN AND VARIANCE OF THE BINOMIAL DISTRIBUTION
For the binomial distribution, $P(r) = {}^{n}C_r\, q^{n-r} p^{r}$
$$\begin{aligned}
\text{Mean } \mu &= \sum_{r=0}^{n} rP(r) = \sum_{r=0}^{n} r \cdot {}^{n}C_r\, q^{n-r} p^{r} \\
&= 0 + 1 \cdot {}^{n}C_1 q^{n-1} p + 2 \cdot {}^{n}C_2 q^{n-2} p^{2} + 3 \cdot {}^{n}C_3 q^{n-3} p^{3} + \ldots + n \cdot {}^{n}C_n p^{n} \\
&= nq^{n-1}p + 2 \cdot \frac{n(n-1)}{2 \cdot 1} q^{n-2}p^{2} + 3 \cdot \frac{n(n-1)(n-2)}{3 \cdot 2 \cdot 1} q^{n-3}p^{3} + \ldots + np^{n} \\
&= nq^{n-1}p + n(n-1)q^{n-2}p^{2} + \frac{n(n-1)(n-2)}{2 \cdot 1} q^{n-3}p^{3} + \ldots + np^{n} \\
&= np\left[q^{n-1} + (n-1)q^{n-2}p + \frac{(n-1)(n-2)}{2 \cdot 1} q^{n-3}p^{2} + \ldots + p^{n-1}\right] \\
&= np\left[{}^{n-1}C_0 q^{n-1} + {}^{n-1}C_1 q^{n-2}p + {}^{n-1}C_2 q^{n-3}p^{2} + \ldots + {}^{n-1}C_{n-1} p^{n-1}\right] \\
&= np(q + p)^{n-1} = np \qquad (\because p + q = 1)
\end{aligned}$$
Hence the mean of the binomial distribution is np.
$$\begin{aligned}
\text{Variance } \sigma^2 &= \sum_{r=0}^{n} r^{2}P(r) - \mu^{2} = \sum_{r=0}^{n} [r + r(r-1)]P(r) - \mu^{2} \\
&= \sum_{r=0}^{n} rP(r) + \sum_{r=0}^{n} r(r-1)P(r) - \mu^{2} = \mu + \sum_{r=2}^{n} r(r-1)\, {}^{n}C_r\, q^{n-r}p^{r} - \mu^{2} \\
&\qquad (\text{since the contribution due to } r = 0 \text{ and } r = 1 \text{ is zero}) \\
&= \mu + \left[2 \cdot 1 \cdot {}^{n}C_2 q^{n-2}p^{2} + 3 \cdot 2 \cdot {}^{n}C_3 q^{n-3}p^{3} + \ldots + n(n-1)\, {}^{n}C_n p^{n}\right] - \mu^{2} \\
&= \mu + \left[2 \cdot 1 \cdot \frac{n(n-1)}{2 \cdot 1} q^{n-2}p^{2} + 3 \cdot 2 \cdot \frac{n(n-1)(n-2)}{3 \cdot 2 \cdot 1} q^{n-3}p^{3} + \ldots + n(n-1)p^{n}\right] - \mu^{2} \\
&= \mu + \left[n(n-1)q^{n-2}p^{2} + n(n-1)(n-2)q^{n-3}p^{3} + \ldots + n(n-1)p^{n}\right] - \mu^{2} \\
&= \mu + n(n-1)p^{2}\left[q^{n-2} + (n-2)q^{n-3}p + \ldots + p^{n-2}\right] - \mu^{2} \\
&= \mu + n(n-1)p^{2}\left[{}^{n-2}C_0 q^{n-2} + {}^{n-2}C_1 q^{n-3}p + \ldots + {}^{n-2}C_{n-2} p^{n-2}\right] - \mu^{2} \\
&= \mu + n(n-1)p^{2}(q + p)^{n-2} - \mu^{2} = \mu + n(n-1)p^{2} - \mu^{2} \qquad [\because q + p = 1] \\
&= np + n(n-1)p^{2} - n^{2}p^{2} \qquad [\because \mu = np] \\
&= np[1 + (n-1)p - np] = np[1 - p] = npq.
\end{aligned}$$
Hence the variance of the binomial distribution is npq.
Standard deviation of the binomial distribution is $\sqrt{npq}$.
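Note. The results μ = np and σ² = npq can be checked for particular values by evaluating the defining sums term by term. The Python sketch below does this with exact fractions; the function name binomial_moments and the values n = 10, p = 1/4 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def binomial_moments(n, p):
    """Mean and variance computed directly from the sums over r = 0, ..., n."""
    q = 1 - p
    probs = [comb(n, r) * q ** (n - r) * p ** r for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(probs))
    variance = sum(r * r * pr for r, pr in enumerate(probs)) - mean ** 2
    return mean, variance

n, p = 10, Fraction(1, 4)  # illustrative values
mean, variance = binomial_moments(n, p)
print(mean == n * p, variance == n * p * (1 - p))
```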
Similarly, we can prove that
$$\beta_1 = \frac{\mu_3^{2}}{\mu_2^{3}} = \frac{(q - p)^{2}}{npq} = \frac{(1 - 2p)^{2}}{npq}; \qquad \beta_2 = \frac{\mu_4}{\mu_2^{2}} = 3 + \frac{1 - 6pq}{npq}$$
Hence
$$\gamma_1 = \sqrt{\beta_1} = \frac{q - p}{\sqrt{npq}} = \frac{1 - 2p}{\sqrt{npq}}; \qquad \gamma_2 = \beta_2 - 3 = \frac{1 - 6pq}{npq}$$
Note. $\gamma_1 = \frac{1 - 2p}{\sqrt{npq}}$ gives a measure of skewness of the binomial distribution. If p < 1/2, skewness is
positive, if p > 1/2, skewness is negative and if p = 1/2, it is zero.
$\beta_2 = 3 + \frac{1 - 6pq}{npq}$ gives a measure of the kurtosis of the binomial distribution.
ILLUSTRATIVE EXAMPLES
Example 1. One ship out of 9 was sunk on an average in making a certain voyage. What
was the probability that exactly 3 out of a convoy of 6 ships would arrive safely?
Sol. p, the probability of a ship arriving safely = 1 − 1/9 = 8/9; q = 1/9, n = 6
Binomial distribution is $\left(\frac{1}{9} + \frac{8}{9}\right)^{6}$
The probability that exactly 3 ships arrive safely
$$= {}^{6}C_3 \left(\frac{1}{9}\right)^{3} \left(\frac{8}{9}\right)^{3} = \frac{10240}{9^{6}}.$$
Example 2. Assume that on the average one telephone number out of fifteen called between
2 P.M. and 3 P.M. on week-days is busy. What is the probability that if 6 randomly selected
telephone numbers are called (i) not more than three, (ii) at least three of them will be busy?
Sol. p, the probability of a telephone number being busy between 2 P.M. and 3 P.M. on
week-days = 1/15
q = 1 − 1/15 = 14/15, n = 6; Binomial distribution is $\left(\frac{14}{15} + \frac{1}{15}\right)^{6}$
The probability that not more than three will be busy
$$\begin{aligned}
&= p(0) + p(1) + p(2) + p(3) \\
&= {}^{6}C_0 \left(\frac{14}{15}\right)^{6} + {}^{6}C_1 \left(\frac{14}{15}\right)^{5} \left(\frac{1}{15}\right) + {}^{6}C_2 \left(\frac{14}{15}\right)^{4} \left(\frac{1}{15}\right)^{2} + {}^{6}C_3 \left(\frac{14}{15}\right)^{3} \left(\frac{1}{15}\right)^{3} \\
&= \frac{(14)^{3}}{(15)^{6}}\,[2744 + 1176 + 210 + 20] = \frac{2744 \times 4150}{(15)^{6}} = 0.9997
\end{aligned}$$
The probability that at least three of them will be busy
$$\begin{aligned}
&= p(3) + p(4) + p(5) + p(6) \\
&= {}^{6}C_3 \left(\frac{14}{15}\right)^{3} \left(\frac{1}{15}\right)^{3} + {}^{6}C_4 \left(\frac{14}{15}\right)^{2} \left(\frac{1}{15}\right)^{4} + {}^{6}C_5 \left(\frac{14}{15}\right) \left(\frac{1}{15}\right)^{5} + {}^{6}C_6 \left(\frac{1}{15}\right)^{6} = 0.005.
\end{aligned}$$
Example 3. Six dice are thrown 729 times. How many times do you expect at least three
dice to show a five or six?
Sol. p = the chance of getting 5 or 6 with one die = 2/6 = 1/3
q = 1 − 1/3 = 2/3, n = 6, N = 729
since dice are in sets of 6 and there are 729 sets.
The binomial distribution is $N(q + p)^{n} = 729\left(\frac{2}{3} + \frac{1}{3}\right)^{6}$
The expected number of times at least three dice will show five or six
$$\begin{aligned}
&= 729\left[{}^{6}C_3 \left(\frac{2}{3}\right)^{3} \left(\frac{1}{3}\right)^{3} + {}^{6}C_4 \left(\frac{2}{3}\right)^{2} \left(\frac{1}{3}\right)^{4} + {}^{6}C_5 \left(\frac{2}{3}\right) \left(\frac{1}{3}\right)^{5} + {}^{6}C_6 \left(\frac{1}{3}\right)^{6}\right] \\
&= \frac{729}{3^{6}}\,[160 + 60 + 12 + 1] = 233
\end{aligned}$$
Example 4. Out of 800 families with 4 children each, how many families would be expected
to have (i) 2 boys and 2 girls (ii) at least one boy (iii) no girl (iv) at most two girls? Assume
equal probabilities for boys and girls.
Sol. Since probabilities for boys and girls are equal
p = probability of having a boy = 1/2; q = probability of having a girl = 1/2
n = 4, N = 800 ∴ The binomial distribution is $800\left(\frac{1}{2} + \frac{1}{2}\right)^{4}$.
(i) The expected number of families having 2 boys and 2 girls
$$= 800 \cdot {}^{4}C_2 \left(\frac{1}{2}\right)^{2} \left(\frac{1}{2}\right)^{2} = 800 \times 6 \times \frac{1}{16} = 300.$$
(ii) The expected number of families having at least one boy
$$\begin{aligned}
&= 800\left[{}^{4}C_1 \left(\frac{1}{2}\right)^{3} \left(\frac{1}{2}\right) + {}^{4}C_2 \left(\frac{1}{2}\right)^{2} \left(\frac{1}{2}\right)^{2} + {}^{4}C_3 \left(\frac{1}{2}\right) \left(\frac{1}{2}\right)^{3} + {}^{4}C_4 \left(\frac{1}{2}\right)^{4}\right] \\
&= 800 \times \frac{1}{16}\,[4 + 6 + 4 + 1] = 750.
\end{aligned}$$
(iii) The expected number of families having no girl, i.e., having 4 boys
$$= 800 \cdot {}^{4}C_4 \left(\frac{1}{2}\right)^{4} = 50.$$
(iv) The expected number of families having at most two girls, i.e., having at least 2 boys
$$= 800\left[{}^{4}C_2 \left(\frac{1}{2}\right)^{2} \left(\frac{1}{2}\right)^{2} + {}^{4}C_3 \left(\frac{1}{2}\right) \left(\frac{1}{2}\right)^{3} + {}^{4}C_4 \left(\frac{1}{2}\right)^{4}\right] = 800 \times \frac{1}{16}\,[6 + 4 + 1] = 550.$$
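Note. Examples 3 and 4 both multiply the binomial probabilities by the number of sets N to obtain expected frequencies. The Python sketch below repeats those calculations; the function name expected_frequencies is an illustrative assumption.

```python
from fractions import Fraction
from math import comb

def expected_frequencies(N, n, p):
    """Expected counts N * P(r) for r = 0, ..., n successes in each of N sets."""
    q = 1 - p
    return [N * comb(n, r) * q ** (n - r) * p ** r for r in range(n + 1)]

# Example 4: 800 families, 4 children each, a "success" is a boy
boys = expected_frequencies(800, 4, Fraction(1, 2))
print(boys[2])        # (i) 2 boys and 2 girls
print(sum(boys[1:]))  # (ii) at least one boy
print(boys[4])        # (iii) no girl
print(sum(boys[2:]))  # (iv) at most two girls

# Example 3: 729 throws of six dice, a "success" is a five or six
dice = expected_frequencies(729, 6, Fraction(1, 3))
print(sum(dice[3:]))  # at least three dice show five or six
```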
TEST YOUR KNOWLEDGE
1. Ten coins are tossed simultaneously. Find the probability of getting at least seven heads.
2. The probability of any ship of a company being destroyed on a certain voyage is 0.02. The company
owns 6 ships for the voyage. What is the probability of:
(i) losing one ship
(ii) losing at most two ships
(iii) losing none.
3. The probability that a man aged 60 will live to be 70 is 0.65. What is the probability that out of ten men
now 60, at least 7 would live to be 70?
4. The incidence of occupational disease in an industry is such that the workers have a 20% chance of
suffering from it. What is the probability that out of six workers chosen at random, four or more will
suffer from the disease?
5. The probability that a pen manufactured by a company will be defective is 1/10. If 12 such pens are
manufactured, find the probability that
(i) exactly two will be defective
(ii) at least two will be defective
(iii) none will be defective.
6. If the chance that one of the ten telephone lines is busy at an instant is 0.2
(i) What is the chance that 5 of the lines are busy?
(ii) What is the probability that all the lines are busy?
7. If on an average 1 vessel in every 10 is wrecked, find the probability that out of 5 vessels expected to
arrive, at least 4 will arrive safely.
8. A product is 0.5% defective and is packed in cartons of 100. What percentage contains not more than 3
defectives?
9. A bag contains 5 white, 7 red, and 8 black balls. If four balls are drawn one by one, with replacement,
what is the probability that
(i) none is white
(ii) all are white
(iii) at least one is white
(iv) only 2 are white?
10. In a hurdle race, a player has to cross 10 hurdles. The probability that he will clear each hurdle is 5/6.
What is the probability that he will knock down fewer than 2 hurdles?
11. Fit a binomial distribution for the following data and compare the theoretical frequencies with the actual
ones:
x :  0  1   2   3   4   5
f :  2  14  20  34  22  8
12. If the sum of mean and variance of a binomial distribution is 4.8 for five trials, find the distribution.
13. If the mean of a binomial distribution is 3 and the variance is 3/2, find the probability of obtaining at least
4 successes.
14. In 800 families with 5 children each, how many families would be expected to have (i) 3 boys and 2
girls, (ii) 2 boys and 3 girls, (iii) no girl (iv) at the most two girls. (Assume probabilities for boys and
girls to be equal.)
15. In 100 sets of ten tosses of an unbiased coin, in how many cases do you expect to get
(i) 7 heads and 3 tails
(ii) at least 7 heads?
16. The following data are the number of seeds germinating out of 10 on a damp filter for 80 sets of seeds.
Fit a binomial distribution to this data:
x :  0  1   2   3   4  5  6  7  8  9  10  Total
f :  6  20  28  12  8  6  0  0  0  0  0   80
[Hint. Here n = 10, N = 80, Mean = Σfx/Σf = 2.175 ∴ np = 2.175 etc.]
17. A bag contains 10 balls each marked with one of the digits 0 to 9. If four balls are drawn successively
(with replacement) from the bag, what is the probability that none is marked with the digit 0?
18. A box contains 100 tickets each bearing one of the numbers from 1 to 100. If 5 tickets are drawn
successively (with replacement) from the box, find the probability that all the tickets bear numbers
divisible by 10.
19. The probability that a ball thrown by a child will strike a target is 1/5. If six balls are thrown find the
probability that (i) exactly two will strike the target, (ii) at least two will strike the target.
20. In sampling a large number of parts manufactured by a machine, the mean number of defectives in a
sample of 20 is 2. Out of 1000 such samples, how many would be expected to contain at least 3
defective parts?
Answers
1. 11/64
2. (i) 0.1085 (ii) 0.9997 (iii) 0.8858
3. 0.514
4. 53/3125
5. (i) 0.2301 (ii) 0.3412 (iii) 0.2833
6. (i) 0.02579 (ii) 1.024 × 10^–7
7. 0.91854
8. 99.83
9. (i) 81/256 (ii) 1/256 (iii) 175/256 (iv) 27/128
10. (5/2)(5/6)^9
11. 100 (0.432 + 0.568)^5
12. (1/5 + 4/5)^5
13. 11/32
14. (i) 250 (ii) 250 (iii) 25 (iv) 400
15. (i) 12 nearly (ii) 17 nearly
16. 80 (0.7825 + 0.2175)^10
17. (9/10)^4
18. 0.00001
19. (i) 0.246 (ii) 0.345
20. 323
________________________________________________________________________________________________________
POISSON DISTRIBUTION
21.58 POISSON DISTRIBUTION AS A LIMITING CASE OF BINOMIAL
DISTRIBUTION
If the parameters n and p of a binomial distribution are known, we can find the distribution.
But in situations where n is very large and p is very small, the application of the binomial
distribution is very laborious. However, if we assume that as n → ∞ and p → 0 such that np
always remains finite, say λ , we get the Poisson approximation to the binomial distribution.
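Now, for a binomial distribution
$$
\begin{aligned}
P(X = r) &= {}^{n}C_{r}\, q^{n-r} p^{r} \\
&= \frac{n(n-1)(n-2)\cdots(n-r+1)}{r!} \times (1-p)^{n-r} \times p^{r} \\
&= \frac{n(n-1)(n-2)\cdots(n-r+1)}{r!} \times \left(1-\frac{\lambda}{n}\right)^{n-r} \times \left(\frac{\lambda}{n}\right)^{r}
\qquad \left[\text{since } np = \lambda,\ \therefore\ p = \frac{\lambda}{n}\right] \\
&= \frac{\lambda^{r}}{r!} \times \frac{n(n-1)(n-2)\cdots(n-r+1)}{n^{r}} \times \frac{\left(1-\frac{\lambda}{n}\right)^{n}}{\left(1-\frac{\lambda}{n}\right)^{r}} \\
&= \frac{\lambda^{r}}{r!} \left(\frac{n}{n}\right)\left(\frac{n-1}{n}\right)\left(\frac{n-2}{n}\right)\cdots\left(\frac{n-r+1}{n}\right) \times \frac{\left(1-\frac{\lambda}{n}\right)^{n}}{\left(1-\frac{\lambda}{n}\right)^{r}} \\
&= \frac{\lambda^{r}}{r!} \left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{r-1}{n}\right) \times \frac{\left[\left(1-\frac{\lambda}{n}\right)^{-n/\lambda}\right]^{-\lambda}}{\left(1-\frac{\lambda}{n}\right)^{r}}
\end{aligned}
$$
As n → ∞, each of the (r – 1) factors
$\left(1-\frac{1}{n}\right), \left(1-\frac{2}{n}\right), \ldots, \left(1-\frac{r-1}{n}\right)$ tends to 1. Also
$\left(1-\frac{\lambda}{n}\right)^{r}$ tends to 1.
Since $\lim_{x \to \infty}\left(1+\frac{1}{x}\right)^{x} = e$, the Napierian base,
$$
\left[\left(1-\frac{\lambda}{n}\right)^{-n/\lambda}\right]^{-\lambda} \to e^{-\lambda} \quad \text{as } n \to \infty.
$$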
Hence in the limiting case when n → ∞, we have
$$
P(X = r) = \frac{\lambda^{r} e^{-\lambda}}{r!} \qquad (r = 0, 1, 2, 3, \ldots) \qquad \ldots (A)
$$
where λ is a finite number = np.
(A) represents the Poisson probability distribution.
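The limiting argument can be checked numerically. The following Python sketch is only an illustration: the function names `binomial_pmf`, `poisson_pmf`, and `max_gap` and the choice λ = 2 are assumptions made here, not part of the text. It holds np = λ fixed and shows the largest difference between the binomial and Poisson probabilities shrinking as n grows.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """P(X = r) for a binomial distribution with parameters n and p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def poisson_pmf(r, lam):
    """P(X = r) for a Poisson distribution with parameter lam."""
    return lam**r * exp(-lam) / factorial(r)

def max_gap(n, lam, r_max=10):
    """Largest |binomial - Poisson| over r = 0..r_max, with p = lam / n."""
    p = lam / n
    return max(abs(binomial_pmf(r, n, p) - poisson_pmf(r, lam))
               for r in range(r_max + 1))

# Keep np = 2 fixed while n grows; the gap should shrink toward 0.
for n in (10, 100, 10_000):
    print(n, max_gap(n, lam=2.0))
```

The gap falls roughly in proportion to 1/n, which is why the approximation is recommended only when n is large and p is small.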
Note 1. λ is called the parameter of the distribution.
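Note 2. $e^{x} = 1 + \frac{x}{1!} + \frac{x^{2}}{2!} + \cdots + \frac{x^{n}}{n!} + \cdots$ to ∞.
Note 3. The sum of the probabilities P(r) for r = 0, 1, 2, 3, . . . is 1, since
$$
\begin{aligned}
P(0) + P(1) + P(2) + P(3) + \cdots &= e^{-\lambda} + \frac{\lambda e^{-\lambda}}{1!} + \frac{\lambda^{2} e^{-\lambda}}{2!} + \frac{\lambda^{3} e^{-\lambda}}{3!} + \cdots \\
&= e^{-\lambda}\left(1 + \frac{\lambda}{1!} + \frac{\lambda^{2}}{2!} + \frac{\lambda^{3}}{3!} + \cdots\right) = e^{-\lambda} \cdot e^{\lambda} = 1.
\end{aligned}
$$
21.59
RECURRENCE FORMULA FOR THE POISSON DISTRIBUTION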
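For the Poisson distribution,
$$
P(r) = \frac{\lambda^{r} e^{-\lambda}}{r!} \quad \text{and} \quad P(r+1) = \frac{\lambda^{r+1} e^{-\lambda}}{(r+1)!}
$$
$$
\therefore\ \frac{P(r+1)}{P(r)} = \frac{\lambda\, r!}{(r+1)!} = \frac{\lambda}{r+1} \quad \text{or} \quad P(r+1) = \frac{\lambda}{r+1}\, P(r), \qquad r = 0, 1, 2, 3, \ldots
$$
This is called the recurrence formula for the Poisson distribution.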
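The recurrence formula gives a convenient way to tabulate Poisson probabilities without computing factorials. A minimal Python sketch follows; the function name `poisson_by_recurrence` is an assumption introduced for illustration.

```python
from math import exp

def poisson_by_recurrence(lam, r_max):
    """Return [P(0), P(1), ..., P(r_max)] using P(r+1) = lam / (r+1) * P(r)."""
    probs = [exp(-lam)]                 # P(0) = e^(-lam)
    for r in range(r_max):
        probs.append(probs[-1] * lam / (r + 1))
    return probs

# Probabilities for lam = 2, rounded to four places.
print([round(p, 4) for p in poisson_by_recurrence(2.0, 4)])
```

Each term costs one multiplication and one division, so the whole table up to r_max is produced in a single pass.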
21.60
MEAN AND VARIANCE OF THE POISSON DISTRIBUTION
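For the Poisson distribution, $P(r) = \frac{\lambda^{r} e^{-\lambda}}{r!}$
$$
\begin{aligned}
\text{Mean } \mu &= \sum_{r=0}^{\infty} r\,P(r) = \sum_{r=0}^{\infty} r \cdot \frac{\lambda^{r} e^{-\lambda}}{r!} = e^{-\lambda} \sum_{r=1}^{\infty} \frac{\lambda^{r}}{(r-1)!} \\
&= e^{-\lambda}\left(\lambda + \frac{\lambda^{2}}{1!} + \frac{\lambda^{3}}{2!} + \cdots\right) \\
&= \lambda e^{-\lambda}\left(1 + \frac{\lambda}{1!} + \frac{\lambda^{2}}{2!} + \cdots\right) = \lambda e^{-\lambda} \cdot e^{\lambda} = \lambda
\end{aligned}
$$
Thus, the mean of the Poisson distribution is equal to the parameter λ .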
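$$
\begin{aligned}
\text{Variance } \sigma^{2} &= \sum_{r=0}^{\infty} r^{2} P(r) - \mu^{2} = \sum_{r=0}^{\infty} r^{2} \cdot \frac{\lambda^{r} e^{-\lambda}}{r!} - \lambda^{2} = e^{-\lambda} \sum_{r=1}^{\infty} \frac{r^{2} \lambda^{r}}{r!} - \lambda^{2} \\
&= e^{-\lambda}\left[\frac{1^{2} \cdot \lambda}{1!} + \frac{2^{2} \cdot \lambda^{2}}{2!} + \frac{3^{2} \lambda^{3}}{3!} + \frac{4^{2} \lambda^{4}}{4!} + \cdots\right] - \lambda^{2} \\
&= \lambda e^{-\lambda}\left[1 + \frac{2\lambda}{1!} + \frac{3\lambda^{2}}{2!} + \frac{4\lambda^{3}}{3!} + \cdots\right] - \lambda^{2} \\
&= \lambda e^{-\lambda}\left[1 + \frac{(1+1)\lambda}{1!} + \frac{(1+2)\lambda^{2}}{2!} + \frac{(1+3)\lambda^{3}}{3!} + \cdots\right] - \lambda^{2} \\
&= \lambda e^{-\lambda}\left[\left(1 + \frac{\lambda}{1!} + \frac{\lambda^{2}}{2!} + \frac{\lambda^{3}}{3!} + \cdots\right) + \left(\frac{\lambda}{1!} + \frac{2\lambda^{2}}{2!} + \frac{3\lambda^{3}}{3!} + \cdots\right)\right] - \lambda^{2} \\
&= \lambda e^{-\lambda}\left[e^{\lambda} + \lambda\left(1 + \frac{\lambda}{1!} + \frac{\lambda^{2}}{2!} + \cdots\right)\right] - \lambda^{2} \\
&= \lambda e^{-\lambda}\left[e^{\lambda} + \lambda e^{\lambda}\right] - \lambda^{2} = \lambda e^{-\lambda} \cdot e^{\lambda}(1 + \lambda) - \lambda^{2} = \lambda(1 + \lambda) - \lambda^{2} = \lambda.
\end{aligned}
$$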
Hence, the variance of the Poisson distribution is also λ .
Thus, the mean and the variance of the Poisson distribution are each equal to the
parameter λ .
Note. The mean and the variance of the Poisson distribution can also be derived from those of the binomial
distribution in the limiting case when n → ∞, p → 0 and np = λ .
Mean of the binomial distribution is np.
∴ Mean of the Poisson distribution $= \lim_{n \to \infty} np = \lim_{n \to \infty} \lambda = \lambda$
Variance of the binomial distribution is npq = np (1 – p)
∴ Variance of the Poisson distribution $= \lim_{n \to \infty} np(1 - p) = \lim_{n \to \infty} \lambda\left(1 - \frac{\lambda}{n}\right) = \lambda$.
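The equality of the mean and the variance can also be confirmed numerically by summing the series directly. The sketch below is an illustration only; the name `poisson_moments`, the truncation at 200 terms, and the sample value λ = 3.5 are assumptions chosen here.

```python
from math import exp

def poisson_moments(lam, terms=200):
    """Approximate the mean and variance by summing r*P(r) and r^2*P(r)."""
    p = exp(-lam)                 # P(0)
    mean = 0.0
    second_moment = 0.0
    for r in range(terms):
        mean += r * p
        second_moment += r * r * p
        p *= lam / (r + 1)        # recurrence: P(r+1) = lam/(r+1) * P(r)
    return mean, second_moment - mean**2

mean, variance = poisson_moments(3.5)
print(mean, variance)             # both should be close to 3.5
```

For moderate λ the terms beyond r = 200 are negligible, so the truncated sums agree with λ to within floating-point error.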
ILLUSTRATIVE EXAMPLES
Example 1. If the variance of the Poisson distribution is 2, find the probabilities for r = 1,
2, 3, 4 from the recurrence relation of the Poisson distribution.
Sol. λ, the parameter of the Poisson distribution = Variance = 2
The recurrence relation for the Poisson distribution is
$$
P(r+1) = \frac{\lambda}{r+1}\, P(r) = \frac{2}{r+1}\, P(r) \qquad \ldots (1)
$$
Now $P(r) = \frac{\lambda^{r} e^{-\lambda}}{r!} \Rightarrow P(0) = \frac{e^{-2}}{0!} = e^{-2} = 0.1353$
Setting r = 0, 1, 2, 3 in (1), we get
$$
P(1) = 2P(0) = 2 \times 0.1353 = 0.2706; \qquad P(2) = \frac{2}{2} P(1) = 0.2706
$$
$$
P(3) = \frac{2}{3} P(2) = \frac{2}{3} \times 0.2706 = 0.1804; \qquad P(4) = \frac{2}{4} P(3) = \frac{1}{2} \times 0.1804 = 0.0902.
$$
Example 2. Assume that the probability of an individual coal miner being injured in a
certain way in a mine accident during a year is 1/2400. Use Poisson’s distribution to calculate
the probability that in a mine employing 200 miners there will be at least one such
accident in a year.
Sol. Here $p = \frac{1}{2400}$, n = 200; ∴ $\lambda = np = \frac{200}{2400} = \frac{1}{12} = 0.083$
∴ $P(r) = \frac{\lambda^{r} e^{-\lambda}}{r!} = \frac{(0.083)^{r} e^{-0.083}}{r!}$
P(at least one such accident) = 1 – P(no such accident)
$$
= 1 - P(0) = 1 - \frac{(0.083)^{0} e^{-0.083}}{0!} = 1 - 0.92 = 0.08.
$$
Example 3. Data was collected over a period of 10 years, showing the number of injuries
from horse kicks in each of the 200 army corps. The distribution of injuries was as follows:
No. of injuries :    0     1     2     3     4    Total
Frequency       :  109    65    22     3     1     200
Fit a Poisson distribution to the data and calculate the theoretical frequencies.
Sol. Mean of the given distribution $= \frac{\Sigma fx}{\Sigma f} = \frac{65 + 44 + 9 + 4}{200} = \frac{122}{200} = 0.61$
This is the parameter (m) of the Poisson distribution.
∴ The required Poisson distribution is
$$
N \cdot \frac{m^{r} e^{-m}}{r!} = 200 e^{-0.61} \cdot \frac{(0.61)^{r}}{r!} = 200 \times 0.5434 \times \frac{(0.61)^{r}}{r!} = 108.7 \times \frac{(0.61)^{r}}{r!}, \qquad \text{where } N = \Sigma f = 200.
$$
r     N · P(r)                          Theoretical frequency
0     108.7                             109
1     108.7 × 0.61 = 66.3               66
2     108.7 × (0.61)^2 / 2! = 20.2      20
3     108.7 × (0.61)^3 / 3! = 4.1       4
4     108.7 × (0.61)^4 / 4! = 0.6       1
                                        Total = 200
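The fitting procedure of Example 3 (estimate m by the sample mean, then scale the Poisson probabilities by N) can be sketched in Python as follows. The function name `fit_poisson` and the dictionary layout are assumptions made for this illustration.

```python
from math import exp, factorial

def fit_poisson(freq):
    """freq maps each observed count x to its frequency f.

    Returns the fitted parameter m (the sample mean) and the
    theoretical frequencies N * m^x * e^(-m) / x!.
    """
    n_total = sum(freq.values())
    m = sum(x * f for x, f in freq.items()) / n_total
    expected = {x: n_total * exp(-m) * m**x / factorial(x) for x in freq}
    return m, expected

horse_kicks = {0: 109, 1: 65, 2: 22, 3: 3, 4: 1}
m, expected = fit_poisson(horse_kicks)
print("m =", m)
for x, e in expected.items():
    print(x, round(e, 1))
```

Rounding the theoretical frequencies to whole numbers is left to the reader, since small adjustments are sometimes made so that they total N.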
Example 4. A car rental firm has two cars, which it hires out day by day. The number of
requests for a car on each day is distributed as a Poisson distribution with mean 1.5. Calculate
the proportion of days on which neither car is used and the proportion of days on which some
requests are refused. ($e^{-1.5} = 0.2231$)
Sol. The number of requests for a car is distributed as a Poisson distribution with mean
m = 1.5.
∴ Proportion of days on which neither car is used
= Probability of there being no requests for a car
$= \frac{m^{0} e^{-m}}{0!} = e^{-1.5} = 0.2231$
Proportion of days on which some requests are refused
= probability for the number of requests to be more than two
$$
\begin{aligned}
&= 1 - P(x \le 2) = 1 - \left(e^{-m} + \frac{m e^{-m}}{1!} + \frac{m^{2} e^{-m}}{2!}\right) \\
&= 1 - e^{-1.5}\left(1 + 1.5 + \frac{(1.5)^{2}}{2}\right) = 1 - 0.2231\,(1 + 1.5 + 1.125) \\
&= 1 - 0.2231 \times 3.625 = 1 - 0.8087375 = 0.1912625.
\end{aligned}
$$
Example 5. Six coins are tossed 6400 times. Using the Poisson distribution, determine the
approximate probability of getting six heads x times.
Sol. Probability of getting one head with one coin $= \frac{1}{2}$.
∴ The probability of getting six heads with six coins $= \left(\frac{1}{2}\right)^{6} = \frac{1}{64}$
∴ Average number of six heads with six coins in 6400 throws $= np = 6400 \times \frac{1}{64} = 100$
∴ The mean of the Poisson distribution = 100.
Approximate probability of getting six heads x times when the distribution is Poisson
$$
= \frac{m^{x} e^{-m}}{x!} = \frac{(100)^{x} \cdot e^{-100}}{x!}.
$$
TEST YOUR KNOWLEDGE
1. Fit a Poisson distribution to the following:
x :    0     1     2     3     4
f :  192   100    24     3     1
2. If the probability of a bad reaction from a certain injection is 0.001, determine the chance that out of
2000 individuals more than two will get a bad reaction.
3. If X is a Poisson variate such that P(X = 2) = 9P(X = 4) + 90P(X = 6), find the standard deviation.
4. If a random variable has a Poisson distribution such that P(1) = P(2), find
(i) mean of the distribution
(ii) P(4)
5. Suppose that X has a Poisson distribution. If $P(X = 2) = \frac{2}{3} P(X = 1)$, find (i) P(X = 0) (ii) P(X = 3).
6. A certain screw-making machine produces on average 2 defective screws out of 100, and packs them in
boxes of 500. Find the probability that a box contains 15 defective screws.
7. The incidence of occupational disease in an industry is such that the workmen have a 10% chance of
suffering from it. What is the probability that in a group of 7, five or more will suffer from it?
8. Fit a Poisson distribution to the following and calculate theoretical frequencies:
x :    0     1     2     3     4
f :  122    60    15     2     1
9. Fit a Poisson distribution to the following data giving the number of yeast cells per square for 400
squares:
No. of cells per sq. :    0     1     2     3     4     5     6     7     8     9    10
No. of squares       :  103   143    98    42     8     4     2     0     0     0     0
10. Show that in a Poisson distribution with unit mean, the mean deviation about the mean is $\left(\frac{2}{e}\right)$ times the
standard deviation.
11. In a certain factory turning razor blades, there is a small chance of 0.002 for any blade to be defective.
The blades are supplied in packets of 10. Use the Poisson distribution to calculate the approximate
number of packets containing no defective, one defective, and two defective blades respectively in a
shipment of 10000 packets.
12. The probability that a man aged 35 years will die before reaching the age of 40 years may be taken as
0.018. Out of a group of 400 men, now aged 35 years, what is the probability that 2 men will die within
the next 5 years?
13. Suppose a book of 585 pages contains 43 typographical errors. If these errors are randomly distributed
throughout the book, what is the probability that 10 pages, selected at random, will be free from errors?
Answers
1. $320 \times \frac{e^{-0.503} (0.503)^{r}}{r!}$
2. 0.32
3. 1
4. (i) 2 (ii) $\frac{2}{3e^{2}}$
5. (i) $e^{-4/3}$ (ii) $\frac{32}{81} e^{-4/3}$
6. $\frac{(10)^{15} e^{-10}}{(15)!} = 0.035$
7. 0.0008
8. $121.36 \times \frac{(0.5)^{r}}{r!}$, where r = 0, 1, 2, 3, 4. Theoretical frequencies are 121, 61, 15, 3, 0 respectively
9. Theoretical frequencies are 109, 142, 92, 40, 13, 3, 1, 0, 0, 0, 0
11. 9802, 196, 2
12. 0.01936
13. 0.4795
________________________
________________________________________________________________________________________
NORMAL DISTRIBUTION
21.61
NORMAL DISTRIBUTION
The normal distribution is a continuous distribution. It can be derived from the binomial
distribution in the limiting case when n, the number of trials, is very large and p, the probability
of a success, is close to $\frac{1}{2}$. The general equation of the normal distribution is given by
$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}
$$
where the variable x can assume all values from – ∞ to + ∞. μ and σ, called the parameters of
the distribution, are respectively the mean and the standard deviation of the distribution and
– ∞ < μ < ∞, σ > 0. x is called the normal variate and f(x) is called the probability density
function of the normal distribution.
If a variable x has the normal distribution with mean μ and standard deviation σ, we
briefly write x : N(μ, σ²).
The graph of the normal distribution is called the normal curve. It is bell-shaped and symmetrical about
the mean μ. The two tails of the curve extend to + ∞ and – ∞ toward the positive and negative directions of
the x-axis respectively and gradually approach the x-axis without ever meeting it. The curve is unimodal and the
mode of the normal distribution coincides with its mean μ. The line x = μ divides the area under the normal
curve above the x-axis into two equal parts. Thus, the median of the distribution also coincides with its mean
and mode. The area under the normal curve between any two given ordinates x = x1 and x = x2 represents the
probability of values falling into the given interval. The total area under the normal curve above the x-axis is 1.
21.62
BASIC PROPERTIES OF THE NORMAL DISTRIBUTION
The probability density function of the normal distribution is given by
$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}
$$
(i) $f(x) \ge 0$
(ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$, i.e., the total area under the normal curve above the x-axis is 1.
(iii) The normal distribution is symmetrical about its mean.
(iv) It is a unimodal distribution. The mean, mode, and median of this distribution coincide.
21.63
STANDARD FORM OF THE NORMAL DISTRIBUTION
If X is a normal random variable with mean μ and standard deviation σ, then the random variable
$Z = \frac{X - \mu}{\sigma}$ has the normal distribution with mean 0 and standard deviation 1. The random
variable Z is called the standardized (or standard) normal random variable.
The probability density function for the normal distribution in standard form is given by
$$
f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^{2}}
$$
It is free from any parameter. This helps us to compute areas under the normal probability
curve by making use of standard tables.
Note 1. If f(z) is the probability density function for the normal distribution, then
$$
P(z_1 \le Z \le z_2) = \int_{z_1}^{z_2} f(z)\,dz = F(z_2) - F(z_1), \qquad \text{where } F(z) = \int_{-\infty}^{z} f(x)\,dx = P(Z \le z)
$$
The function F(z) defined above is called the distribution function for the normal distribution.
Note 2. The probabilities P(z1 ≤ Z ≤ z2), P(z1 < Z ≤ z2), P(z1 ≤ Z < z2) and P(z1 < Z < z2) are all regarded
to be the same.
Note 3. $F(-z_1) = 1 - F(z_1)$.
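In place of printed tables, the distribution function F(z) can be evaluated with the error function from Python's standard library, using the identity F(z) = ½[1 + erf(z/√2)]. The sketch below is illustrative; the names `std_normal_cdf` and `normal_prob_between` are assumptions introduced here.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Distribution function F(z) = P(Z <= z) of the standard normal variate."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_prob_between(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), via F(z2) - F(z1)."""
    z1 = (a - mu) / sigma
    z2 = (b - mu) / sigma
    return std_normal_cdf(z2) - std_normal_cdf(z1)

print(round(std_normal_cdf(1.0), 4))            # area to the left of z = 1
print(round(normal_prob_between(-1.0, 1.0), 4)) # area between z = -1 and z = 1
```

The symmetry relation of Note 3 can be checked directly by comparing `std_normal_cdf(-z)` with `1 - std_normal_cdf(z)`.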
ILLUSTRATIVE EXAMPLES
Example 1. A sample of 100 dry battery cells tested to find the length of life produced the
following results:
$\bar{x}$ = 12 hours, σ = 3 hours.
Assuming the data to be normally distributed, what percentage of battery cells are expected
to have life
(i) more than 15 hours
(ii) less than 6 hours
(iii) between 10 and 14 hours?
Sol. Here x denotes the length of life of dry battery cells.
Also $z = \frac{x - \bar{x}}{\sigma} = \frac{x - 12}{3}$.
(i) When x = 15, z = 1
∴ P(x > 15) = P(z > 1)
= P(0 < z < ∞) − P(0 < z < 1)
= 0.5 − 0.3413 = 0.1587 = 15.87%.
(ii) When x = 6, z = – 2
∴ P(x < 6) = P(z < −2)
= P(z > 2) = P(0 < z < ∞) − P(0 < z < 2)
= 0.5 − 0.4772 = 0.0228 = 2.28%.
(iii) When x = 10, $z = -\frac{2}{3}$ = – 0.67
When x = 14, $z = \frac{2}{3}$ = 0.67
P(10 < x < 14) = P(−0.67 < z < 0.67)
= 2P(0 < z < 0.67) = 2 × 0.2487
= 0.4974 = 49.74%.
Example 2. In a normal distribution, 31% of the items are under 45 and 8% are over 64.
Find the mean and standard deviation of the distribution.
Sol. Let $\bar{x}$ and σ be the mean and S.D. respectively.
31% of the items are under 45.
⇒ Area to the left of the ordinate x = 45 is 0.31
When x = 45, let z = z1
P(z1 < z < 0) = 0.5 – 0.31 = 0.19
From the tables, the value of z corresponding to this area is 0.5
∴ z1 = −0.5 [z1 < 0]
When x = 64, let z = z2
P(0 < z < z2) = 0.5 – 0.08 = 0.42
From the tables, the value of z corresponding to this area is 1.4.
∴ z2 = 1.4
Since $z = \frac{x - \bar{x}}{\sigma}$,
$$
-0.5 = \frac{45 - \bar{x}}{\sigma} \quad \text{and} \quad 1.4 = \frac{64 - \bar{x}}{\sigma}
$$
⇒ $45 - \bar{x} = -0.5\sigma$ . . . (1)
and $64 - \bar{x} = 1.4\sigma$ . . . (2)
Subtracting, −19 = −1.9σ ∴ σ = 10
From (1), $45 - \bar{x} = -0.5 \times 10 = -5$ ∴ $\bar{x} = 50$.
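The method of Example 2 (read two z-values from the tables, then solve two linear equations for the mean and standard deviation) can be sketched in Python with the inverse distribution function from the standard library. The function name `mean_sd_from_two_quantiles` is an assumption made for this illustration; because it uses unrounded z-values rather than the two-figure table values 0.5 and 1.4, its results differ slightly from the hand computation.

```python
from statistics import NormalDist

def mean_sd_from_two_quantiles(x1, p1, x2, p2):
    """Solve x1 = mu + z1*sigma and x2 = mu + z2*sigma,
    where P(X < x1) = p1 and P(X < x2) = p2."""
    z1 = NormalDist().inv_cdf(p1)   # z-value with area p1 to its left
    z2 = NormalDist().inv_cdf(p2)
    sigma = (x2 - x1) / (z2 - z1)
    mu = x1 - z1 * sigma
    return mu, sigma

# 31% of items are under 45; 8% are over 64, i.e. 92% are under 64.
mu, sigma = mean_sd_from_two_quantiles(45, 0.31, 64, 0.92)
print(round(mu, 1), round(sigma, 1))
```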
TEST YOUR KNOWLEDGE
1. The mean height of 500 students in a certain college is 151 cm and the standard deviation is 15 cm.
Assuming the heights are normally distributed, how many students have heights between 120 and
155 cm?
2. An aptitude test for selecting officers in a bank is conducted on 1000 candidates. The average score is 42
and the standard deviation of score is 24. Assuming normal distribution for the scores, find
(i) The number of candidates whose scores exceed 60
(ii) The number of candidates whose scores lie between 30 and 60.
3. In a normal distribution, 7% of the items are under 35 and 89% are under 63. What are the mean and
standard deviation of the distribution?
4. Let X denote the number of scores on a test. If X is normally distributed with mean 100 and standard
deviation 15, find the probability that X does not exceed 130.
5. It is known from past experience that the number of telephone calls made daily in a certain community
between 3 P.M. and 4 P.M. have a mean of 352 and a standard deviation of 31. What percentage of the
time will there be more than 400 telephone calls made in this community between 3 P.M. and 4 P.M.?
6. Students of a class were given a mechanical aptitude test. Their grades were found to be normally
distributed with mean 60 and standard deviation 5. What percent of students scored
(i) more than 60 grades?
(ii) less than 56 grades?
(iii) between 45 and 65 grades?
7. In an examination taken by 500 candidates, the average and the standard deviation of grades obtained
(normally distributed) are 40% and 10%. Find approximately:
(i) How many will pass, if 50% is fixed as a minimum?
(ii) What should be the minimum if 350 candidates are to pass?
(iii) How many have scored above 60%?
Answers
1. 300
2. (i) 252 (ii) 533
3. $\bar{x}$ = 50.3, σ = 10.33
4. 0.9772
5. 6.06%
6. (i) 50% (ii) 21.2% (iii) 84%
7. (i) 79 (ii) 35% (iii) 11
________________________________________________________________________________________________________
SAMPLING AND TESTS OF SIGNIFICANCE
21.64
POPULATION OR UNIVERSE
An aggregate of objects (animate or inanimate) under study is called population or
universe. It is thus a collection of individuals or of their attributes (qualities) or of results of
operations that can be numerically specified.
A universe containing a finite number of individuals or members is called a finite universe:
for example, the universe of the weights of students in a particular class.
A universe with an infinite number of members is known as an infinite universe: for
example, the universe of pressures at various points in the atmosphere.
In some cases, we may even be ignorant whether or not a particular universe is infinite, e.g.,
the universe of stars.
The universe of concrete objects is an existent universe. The collection of all possible ways
in which a specified event can happen is called a hypothetical universe. The universe of heads
and tails obtained by tossing a coin an infinite number of times (provided that it does not wear
out) is a hypothetical one.
21.65
SAMPLING
The statistician is often confronted with the problem of discussing a universe of which he
cannot examine every member, i.e., of which complete enumeration is impracticable. For example,
if we want to have an idea of the average per capita income of the United States, enumeration of
every earning individual in the country is a very difficult task. Naturally, the question arises:
What can be said about a universe of which we can examine only a limited number of members?
This question is the origin of the Theory of Sampling.
A finite sub-set of a universe is called a sample. A sample is thus a small portion of the
universe. The number of individuals in a sample is called the sample size. The process of
selecting a sample from a universe is called sampling.
The theory of sampling is a study of the relationship existing between a population and
samples drawn from the population. The fundamental object of sampling is to get as much
information as possible about the whole universe by examining only a part of it. An attempt is
thus made through sampling to give the maximum information about the parent universe with the
minimum effort.
Sampling is quite often used in our day-to-day practical life. For example, in a store we
assess the quality of lettuce, apples, or any other commodity by taking only a handful of it from
the bag and then decide whether to purchase it or not. A chef normally tastes cooked products to
find if they have been properly cooked and contain the proper quantity of salt or sugar, by taking
a spoonful of it.
21.66
PARAMETERS OF STATISTICS
The statistical constants of the population, such as the mean, the variance, etc., are known as the
parameters. The corresponding constants computed from the members of a sample, which are used to estimate
the parameters of the population from which the sample has been drawn, are known as statistics.
Population mean and variance are denoted by μ and σ², while those of the sample are
denoted by $\bar{x}$ and $s^{2}$.
21.67
STANDARD ERROR (S.E.)
The standard deviation of the sampling distribution of a statistic is known as the standard
error (S.E.).
It plays an important role in the theory of large samples and it forms a basis of the testing of
hypotheses. If t is any statistic, for a large sample
$$
z = \frac{t - E(t)}{S.E.(t)}
$$
is normally distributed with mean 0 and variance 1.
For a large sample, the standard errors of some of the well-known statistics are listed below, with the notation:
n = sample size
σ² = population variance
s² = sample variance
P = population proportion
Q = 1 – P
n1, n2 = sizes of two independent random samples
No.   Statistic                                                         Standard error
1.    $\bar{x}$                                                         $\sigma/\sqrt{n}$
2.    s                                                                 $\sqrt{\sigma^{2}/2n}$
3.    Difference of two sample means $\bar{x}_1 - \bar{x}_2$            $\sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}}$
4.    Difference of two sample standard deviations $s_1 - s_2$          $\sqrt{\frac{\sigma_1^{2}}{2n_1} + \frac{\sigma_2^{2}}{2n_2}}$
5.    Difference of two sample proportions $p_1 - p_2$                  $\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}$
6.    Observed sample proportion p                                      $\sqrt{PQ/n}$
21.68
TEST OF SIGNIFICANCE
An important aspect of the sampling theory is to study the test of significance, which will
enable us to decide, on the basis of the results of the sample, whether
(i) the deviation between the observed sample statistic and the hypothetical parameter
value or
(ii) the deviation between two sample statistics is significant or might be attributed to
chance or the fluctuations of sampling.
To apply the tests of significance, we first set up a hypothesis that is a definite statement
about the population parameter called the Null hypothesis denoted by H0.
Any hypothesis that is complementary to the null hypothesis (H0) is called an Alternative
hypothesis denoted by H1.
For example, if we want to test the null hypothesis that the population has a specified mean
μ0 , then we have
H0 : μ = μ 0
Alternative hypotheses will be
(i) H1 : μ ≠ μ0 ( μ > μ0 or μ < μ0 ) (two-tailed alternative hypothesis).
(ii) H1 : μ > μ0 (right-tailed alternative hypothesis (or) single-tailed).
(iii) H1 : μ < μ0 (left-tailed alternative hypothesis (or) single-tailed).
Hence alternative hypotheses help to know whether the test is a two-tailed test or a one-tailed test.
21.69
CRITICAL REGION
A region corresponding to a statistic t, in the sample space S that amounts to rejection of the
null hypothesis H0, is called the critical region or the region of rejection. The region of the
sample space S that amounts to the acceptance of H0 is called the acceptance region.
21.70
LEVEL OF SIGNIFICANCE
The probability α that a random value of the statistic t belongs to the critical region ω is
known as the level of significance:
$$
P(t \in \omega \mid H_0) = \alpha
$$
i.e., the level of significance is the size of the type I error or the maximum producer’s risk.
21.71
ERRORS IN SAMPLING
The main goal of the sampling theory is to draw a valid conclusion about the population
parameters on the basis of the sample results. In doing this we may commit the following two
types of errors:
Type I Error. When H0 is true, we may reject it.
P(Reject H0 when it is true) = P(Reject H0 | H0) = α
α is called the size of the type I error, also referred to as producer’s risk.
Type II Error. When H0 is wrong, we may accept it.
P(Accept H0 when it is wrong) = P(Accept H0 | H1) = β
β is called the size of the type II error, also referred to as consumer’s risk.
Critical values or significant values
The values of the test statistic that separate the critical region and the acceptance region are
called the critical values or the significant values.
These values depend on (i) the level of significance used and (ii) the alternative
hypothesis, whether it is one-tailed or two-tailed.
For large samples, corresponding to the statistic t, the variable $z = \frac{t - E(t)}{S.E.(t)}$ is normally
distributed with mean 0 and variance 1. The value of z (as given previously) under the null
hypothesis is known as the test statistic.
The critical value $z_\alpha$ of the test statistic at level of significance α for a two-tailed test is
given by
$$
p(|z| > z_\alpha) = \alpha \qquad \ldots (1)
$$
i.e., $z_\alpha$ is the value of z so that the total area of the critical region on both tails is α. Since the
normal curve is symmetrical, from equation (1), we get
$$
p(z > z_\alpha) + p(z < -z_\alpha) = \alpha; \quad \text{i.e., } 2p(z > z_\alpha) = \alpha; \quad p(z > z_\alpha) = \alpha/2
$$
i.e., the area of each tail is α/2.
For the two-tailed test, the critical value $z_\alpha$ is the value such that the area to the right of $z_\alpha$ is α/2 and the area
to the left of $-z_\alpha$ is α/2.
In the case of the one-tailed test
$p(z > z_\alpha) = \alpha$ if it is right tailed; $p(z < -z_\alpha) = \alpha$ if it is left tailed.
The critical value of z for a single-tailed test (right or left) at the level of significance α is
the same as the critical value of z for a two-tailed test at the level of significance 2α.
Using these equations and the normal tables, the critical values of z at different levels of
significance (α) for both single-tailed and two-tailed tests are calculated and listed below. The
equations are
$$
p(|z| > z_\alpha) = \alpha; \qquad p(z > z_\alpha) = \alpha; \qquad p(z < -z_\alpha) = \alpha
$$
Level of significance    1% (0.01)         5% (0.05)          10% (0.1)
Two-tailed test          |zα| = 2.58       |zα| = 1.96        |zα| = 1.645
Right-tailed             zα = 2.33         zα = 1.645         zα = 1.28
Left-tailed              zα = −2.33        zα = −1.645        zα = −1.28
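The tabulated critical values can be reproduced from the inverse of the standard normal distribution function. The following Python sketch is an illustration; the function name `critical_value` and its `tails` argument are assumptions introduced here.

```python
from statistics import NormalDist

def critical_value(alpha, tails=2):
    """Positive critical value z_alpha for a test at level alpha.

    tails=2 puts alpha/2 in each tail; tails=1 puts all of alpha in one tail
    (use the negative of the result for a left-tailed test).
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)

for alpha in (0.01, 0.05, 0.10):
    print(alpha,
          round(critical_value(alpha, tails=2), 3),
          round(critical_value(alpha, tails=1), 3))
```

The output also illustrates the remark above: the one-tailed value at level α equals the two-tailed value at level 2α.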
Note. The following steps may be adopted to test statistical hypotheses:
Step 1: Null hypothesis. Set up H0 in clear terms.
Step 2: Alternative hypothesis. Set up H1 so that we can decide whether to use the one-tailed test or the two-tailed test.
Step 3: Level of significance. Select the appropriate level of significance in advance
depending on the reliability of the estimates.
Step 4: Test statistic. Compute the test statistic $z = \frac{t - E(t)}{S.E.(t)}$ under the null hypothesis.
Step 5: Conclusion. Compare the computed value of z with the critical value $z_\alpha$ at the level
of significance (α).
If $|z| > z_\alpha$, we reject H0 and conclude that there is a significant difference. If $|z| < z_\alpha$, we
accept H0 and conclude that there is no significant difference.
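The five steps above can be sketched as a single Python function. This is a minimal illustration under stated assumptions: the name `z_test`, the argument names, and the labels "two-sided", "greater", and "less" for the alternative hypothesis are choices made here, and the numbers in the usage lines are purely illustrative.

```python
from statistics import NormalDist

def z_test(t, expected, std_error, alpha=0.05, alternative="two-sided"):
    """Large-sample z test.

    t          : observed value of the statistic
    expected   : E(t) under the null hypothesis H0
    std_error  : S.E.(t) under H0
    Returns (z, critical value, True if H0 is rejected).
    """
    z = (t - expected) / std_error                    # Step 4
    if alternative == "two-sided":
        z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
        reject = abs(z) > z_alpha                     # Step 5, two-tailed
    elif alternative == "greater":
        z_alpha = NormalDist().inv_cdf(1 - alpha)
        reject = z > z_alpha                          # right-tailed
    elif alternative == "less":
        z_alpha = -NormalDist().inv_cdf(1 - alpha)
        reject = z < z_alpha                          # left-tailed
    else:
        raise ValueError("unknown alternative")
    return z, z_alpha, reject

# Illustrative numbers only: statistic 0.54, E(t) = 0.5, S.E.(t) = 0.025.
print(z_test(0.54, 0.5, 0.025))
```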
TEST OF SIGNIFICANCE FOR LARGE SAMPLES
If the sample size n > 30, the sample is taken as a large sample. For such a sample we apply
a normal test, as Binomial, Poisson, chi-square, etc. are closely approximated by normal
distributions assuming the population as normal.
Under a large sample test, the following are the important tests of significance.
1. Testing of significance for a single proportion.
2. Testing of significance for a difference of proportions.
3. Testing of significance for a single mean.
4. Testing of significance for a difference of means.
21.72
TESTING OF SIGNIFICANCE FOR A SINGLE PROPORTION
This test is used to find the significant difference between the proportion of the sample and
the population.
Let X be the number of successes in n independent trials with constant probability P of
success for each trial.
E(X) = nP; V(X) = nPQ; Q = 1 – P = Probability of failure.
Let p = X/n be the observed proportion of success. Then
$$
E(p) = E\left(\frac{X}{n}\right) = \frac{1}{n} E(X) = \frac{nP}{n} = P
$$
$$
V(p) = V\left(\frac{X}{n}\right) = \frac{1}{n^{2}} V(X) = \frac{nPQ}{n^{2}} = \frac{PQ}{n}
$$
$$
S.E.(p) = \sqrt{\frac{PQ}{n}}; \qquad z = \frac{p - E(p)}{S.E.(p)} = \frac{p - P}{\sqrt{PQ/n}} \sim N(0, 1)
$$
This z is called the test statistic that is used to test the significant difference of sample and
population proportion.
Note 1. The probable limits for the observed proportion of successes are $P \pm z_\alpha \sqrt{PQ/n}$, where $z_\alpha$ is the
significant value at level of significance α.
Note 2. If P is not known, the limits for the proportion in the population are $p \pm z_\alpha \sqrt{pq/n}$, q = 1 – p.
Note 3. If α is not given, we can safely take 3σ limits.
Hence, the confidence limits for the observed proportion p are $P \pm 3\sqrt{\frac{PQ}{n}}$.
The confidence limits for the population proportion P are $p \pm 3\sqrt{\frac{pq}{n}}$.
ILLUSTRATIVE EXAMPLES
Example 1. A coin was tossed 400 times and returned heads 216 times. Test the hypothesis
that the coin is unbiased.
Sol. H0: The coin is unbiased, i.e., P = 0.5.
H1: The coin is not unbiased (biased), i.e., P ≠ 0.5. We use the two-tailed test.
Here n = 400; X = No. of successes = 216
p = proportion of successes in the sample $= \frac{X}{n} = \frac{216}{400} = 0.54$
Population proportion = 0.5 = P; Q = 1 – P = 1 – 0.5 = 0.5
Under H0, the test statistic is
$$
z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.54 - 0.5}{\sqrt{\frac{0.5 \times 0.5}{400}}} = 1.6
$$
Conclusion. Since |z| = 1.6 < 1.96, i.e., $|z| < z_\alpha$, where $z_\alpha$ is the significant value of z at the 5% level
of significance, we accept H0,
i.e., the coin is unbiased, P = 0.5.
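The computation in Example 1 can be reproduced with a few lines of Python. The function name `proportion_z` is an assumption introduced for this sketch; it simply evaluates z = (p − P)/√(PQ/n) from the number of successes.

```python
from math import sqrt

def proportion_z(successes, n, P):
    """Test statistic for a single proportion: (p - P) / sqrt(P*Q/n)."""
    p = successes / n            # observed proportion of successes
    Q = 1 - P
    return (p - P) / sqrt(P * Q / n)

z = proportion_z(216, 400, 0.5)  # 216 heads in 400 tosses, H0: P = 0.5
print(round(z, 4))
```

The value of |z| is then compared with 1.96 for a two-tailed test at the 5% level of significance.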
Example 2. A certain cubical die was thrown 9000 times and a 5 or a 6 was obtained 3240
times. On the assumption of unbiased throwing, do the data indicate an unbiased die?
Sol. Here n = 9000
P = probability of success (i.e., getting a 5 or a 6 in the throw of the die)
P = 2/6 = 1/3, Q = 1 – 1/3 = 2/3
$p = \frac{X}{n} = \frac{3240}{9000} = 0.36$
H0 : The die is unbiased, i.e., P = 1/3
H1 : P ≠ 1/3 (two-tailed test)
The test statistic is
$$
z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.36 - 0.3333}{\sqrt{\frac{1}{3} \times \frac{2}{3} \times \frac{1}{9000}}} = \frac{0.0267}{0.00497} = 5.37
$$
|z| = 5.37 > 1.96
Conclusion. Since $|z| > z_\alpha$, where $z_\alpha$ = 1.96 is the tabulated value of z at the 5% level of significance,
H0 is rejected, and we conclude that the die is biased.
Example 3. A manufacturer claims that only 4% of his products supplied are defective. A
random sample of 600 products contained 36 defectives. Test the claim of the manufacturer.
Sol. p = observed proportion of defectives in the sample $= \frac{36}{600} = 0.06$
P = proportion of defectives in the population = 0.04; Q = 1 – P = 0.96
H0 : P = 0.04, i.e., the claim of the manufacturer is accepted.
H1 : (i) P ≠ 0.04 (two-tailed test)
(ii) If we want to reject H0 only if P > 0.04, then H1 : P > 0.04 (right-tailed test).
Under H0,
$$
z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.06 - 0.04}{\sqrt{\frac{0.04 \times 0.96}{600}}} = 2.5.
$$
Conclusion. Since |z| = 2.5 > 1.96, we reject the hypothesis H0 at the 5% level of significance
(two-tailed).
If H1 is taken as P > 0.04, we apply the right-tailed test.
z = 2.5 > 1.645 ( zα ), so we reject the null hypothesis here also.
In both cases, the manufacturer’s claim is not acceptable.
Example 4. A machine is producing bolts of which a certain fraction is defective. A random
sample of 400 is taken from a large batch and is found to contain 30 defective bolts. Does this
indicate that the proportion of defectives is larger than that claimed by the manufacturer who
claims that only 5% of his products are defective? Find the 95% confidence limits of the
proportion of defective bolts in the batch.
1242
CHAPTER 21: STATISTICS AND PROBABILITY
________________________________________________________________________________________________________
Sol. Null hypothesis. H0 : The manufacturer's claim is accepted, i.e., P = 5/100 = 0.05, so that Q = 1 − P = 1 − 0.05 = 0.95.
Alternative hypothesis. H1 : P > 0.05 (right-tailed test).
p = observed proportion of defectives in the sample = 30/400 = 0.075.
Under H0, the test statistic is
$$z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.075 - 0.05}{\sqrt{\dfrac{0.05 \times 0.95}{400}}} = 2.2941.$$
Conclusion. The tabulated value of z at 5% level of significance for the right-tailed test is zα = 1.645. Since z = 2.2941 > 1.645, H0 is rejected at 5% level of significance, i.e., the proportion of defective bolts is larger than the manufacturer claims.
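To find the 95% confidence limits of the proportion of defective bolts in the batch, we use the sample proportion p = 30/400 = 0.075 with q = 1 − p = 0.925, since the population proportion is unknown. The limits are given by
$$p \pm z_\alpha \sqrt{pq/n} = 0.075 \pm 1.96 \sqrt{\frac{0.075 \times 0.925}{400}} = 0.075 \pm 0.0258 = 0.0492,\ 0.1008.$$
Hence the 95% confidence limits for the proportion of defective bolts are (0.0492, 0.1008).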
Example 5. A bag contains defective articles, the exact number of which is not known. A
sample of 100 from the bag gives 10 defective articles. Find the limits for the proportion of
defective articles in the bag.
Sol. Here p = proportion of defective articles in the sample = 10/100 = 0.1; q = 1 − p = 1 − 0.1 = 0.9.
Since the confidence level is not given, we assume it is 95%.
∴ the level of significance is 5% and zα = 1.96.
Also the population proportion P is not given. To get the confidence limits, we use p in place of P, which gives
$$p \pm z_\alpha \sqrt{pq/n} = 0.1 \pm 1.96 \sqrt{\frac{0.1 \times 0.9}{100}} = 0.1 \pm 0.0588 = 0.0412,\ 0.1588.$$
Hence, the 95% confidence limits for the proportion of defective articles in the bag are (0.0412, 0.1588).
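As a rough check of such limits, the following Python sketch computes p ± zα√(pq/n) directly. The function name `proportion_confidence_limits` and the `confidence` parameter are illustrative assumptions; the critical value is taken from `statistics.NormalDist` (about 1.96 for 95%) instead of the printed table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_limits(defectives, n, confidence=0.95):
    """Large-sample confidence limits for a proportion, using the sample p for P."""
    p = defectives / n
    q = 1 - p
    # Two-tailed critical value of the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_alpha * sqrt(p * q / n)
    return p - margin, p + margin

# 10 defective articles in a sample of 100.
lower, upper = proportion_confidence_limits(10, 100)
print(round(lower, 4), round(upper, 4))
```

Passing `confidence=0.99` gives the wider limits that correspond to zα ≈ 2.58.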
TEST YOUR KNOWLEDGE
1. A sample of 600 people selected at random from a large city shows that the percentage of males in the
sample is 53. It is believed that the ratio of males to the total population in the city is 0.5. Test whether
the belief is confirmed by the observation.
2. In a city, a sample of 1000 people was taken, and out of them 540 are vegetarian and the rest are nonvegetarian. Can we say that both habits of eating (vegetarian or non-vegetarian) are equally popular in
the city at (i) 1% level of significance (ii) 5% level of significance?
3. 325 men out of 600 men chosen from a big city were found to be smokers. Does this information support
the conclusion that the majority of men in the city are smokers?
4. A random sample of 500 bolts was taken from a large shipment and 65 were found to be defective. Find
the percentage of defective bolts in the shipment.
5. In a hospital, 475 female and 525 male babies were born in a week. Do these figures confirm the
hypothesis that males and females are born in equal numbers?
6. 400 apples are taken at random from a large basket and 40 are found to be bad. Estimate the proportion
of bad apples in the basket and assign limits within which the percentage most probably lies.
Answers
1. H0 accepted at 5% level
2. H0 rejected at 5% level, accepted at 1% level
3. H0 rejected at 5% level
4. Between 8.49 and 17.51
5. H0 accepted at 5% level
6. 8.5 : 11.5
________________________________________________________________________________________________________
21.73
TEST OF DIFFERENCE BETWEEN PROPORTIONS
Consider two samples X1 and X2 of sizes n1 and n2 respectively, taken from two different populations. We test the significance of the difference between the sample proportions p1 and p2.
Under the null hypothesis H0, that there is no significant difference between the two sample proportions, the test statistic is
$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad \text{where } P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} \text{ and } Q = 1 - P.$$
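The pooled statistic above can be sketched in Python as follows. The function name `two_proportion_z` is a hypothetical helper, and it takes the observed counts X1 and X2 so that P = (X1 + X2)/(n1 + n2) is computed without rounding.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion P."""
    p1, p2 = x1 / n1, x2 / n2
    P = (x1 + x2) / (n1 + n2)       # same as (n1*p1 + n2*p2) / (n1 + n2)
    Q = 1 - P
    return (p1 - p2) / sqrt(P * Q * (1 / n1 + 1 / n2))

# Illustrative figures: 16 defectives in 500 articles, then 3 defectives in 100.
z = two_proportion_z(16, 500, 3, 100)
print(round(z, 3))
```

The value of `z` is then compared with 1.645 (one-tailed) or 1.96 (two-tailed) at the 5% level, as in the examples that follow.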
ILLUSTRATIVE EXAMPLES
Example 1. Before an increase in the excise duty on tea, 800 people out of a sample of
1000 people were found to be tea drinkers. After an increase in the duty, 800 people were known
to be tea drinkers in a sample of 1200 people. Do you think that there has been a significant
decrease in the consumption of tea after the increase in the excise duty?
Sol. Here n1 = 1000, n2 = 1200, X1 = 800, X2 = 800.
$$p_1 = \frac{X_1}{n_1} = \frac{800}{1000} = \frac{4}{5}; \quad p_2 = \frac{X_2}{n_2} = \frac{800}{1200} = \frac{2}{3}$$
$$P = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{800 + 800}{1000 + 1200} = \frac{8}{11}; \quad Q = \frac{3}{11}$$
Null hypothesis H0. p1 = p2, i.e., there is no significant difference in the consumption of tea before and after the increase of excise duty.
H1 : p1 > p2 (right-tailed test)
The test statistic is
$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.8 - 0.6667}{\sqrt{\dfrac{8}{11} \times \dfrac{3}{11}\left(\dfrac{1}{1000} + \dfrac{1}{1200}\right)}} = 6.99.$$
Conclusion. The calculated value of z exceeds both 1.645 and 2.33, the significant values of z at 5% and 1% levels of significance respectively. Hence H0 is rejected, i.e., there is a significant decrease in the consumption of tea due to the increase in excise duty.
Example 2. A machine produced 16 defective articles in a batch of 500. After overhauling
the machine it produced 3 defectives in a batch of 100. Has the machine improved?
Sol. p1 = 16/500 = 0.032, n1 = 500; p2 = 3/100 = 0.03, n2 = 100.
Null hypothesis H0. The machine has not improved due to overhauling, i.e., p1 = p2.
H1 : p1 > p2 (right-tailed)
$$\therefore P = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2} = \frac{19}{600} \cong 0.032, \quad Q = 1 - P = 0.968$$
Under H0, the test statistic is
$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.032 - 0.03}{\sqrt{(0.032)(0.968)\left(\dfrac{1}{500} + \dfrac{1}{100}\right)}} = 0.104.$$
Conclusion. The calculated value of z < 1.645, the significant value of z at 5% level of significance. H0 is accepted, i.e., the machine has not improved due to overhauling.
Example 3. In two large populations there are 30% and 25% respectively of fair-haired
people. Is this difference likely to be hidden in samples of 1200 and 900 respectively from the
two populations?
Sol. P1 = proportion of fair-haired people in the first population = 30% = 0.3; P2 = 25% = 0.25; Q1 = 0.7, Q2 = 0.75; n1 = 1200, n2 = 900.
H0 : Sample proportions are equal, i.e., the difference in population proportions is likely to be hidden in sampling.
H1 : p1 ≠ p2 (two-tailed test)
$$z = \frac{P_1 - P_2}{\sqrt{\dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}}} = \frac{0.3 - 0.25}{\sqrt{\dfrac{0.3 \times 0.7}{1200} + \dfrac{0.25 \times 0.75}{900}}} = 2.554.$$
Conclusion. Since |z| > 1.96, the significant value of z at 5% level of significance, H0 is rejected at the 5% level. However, |z| < 2.58, the significant value of z at 1% level of significance, so H0 is accepted at the 1% level. At 5% level these samples will reveal the difference in the population proportions.
Example 4. 500 articles from a factory are examined and found to be 2% defective. 800
similar articles from a second factory are only found to be 1.5% defective. Can it be reasonably
concluded that the products of the first factory are inferior to those of the second?
Sol. n1 = 500, n2 = 800
p1 = proportion of defective products from the first factory = 2% = 0.02
p2 = proportion of defective products from the second factory = 1.5% = 0.015
H0 : There is no significant difference between the two products, i.e., the products do not differ in quality.
H1 : p1 > p2 (one-tailed test), i.e., the products of the first factory are inferior.
$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{0.02(500) + (0.015)(800)}{500 + 800} = 0.01692; \quad Q = 1 - P = 0.9831$$
Under H0,
$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.02 - 0.015}{\sqrt{0.01692 \times 0.9831\left(\dfrac{1}{500} + \dfrac{1}{800}\right)}} = 0.68$$
Conclusion. As z < 1.645, the significant value of z at 5% level of significance, H0 is accepted, i.e., the products do not differ in quality.
TEST YOUR KNOWLEDGE
1. A random sample of 400 men and 600 women was asked whether they would like to have a school near
their residence. 200 men and 325 women were in favor of the proposal. Test the hypothesis that the
proportion of men and women in favor of the proposal is the same at 5% level of significance.
2. In a town A, there were 956 births of which 52.5% were males, while in towns A and B combined, this
proportion in a total of 1406 births was 0.496. Is there any significant difference in the proportion of
male births in the two towns?
3. In a referendum submitted to the student body at a university, 850 men and 560 women voted. 500 men
and 320 women voted yes. Does this indicate a significant difference of opinion between men and
women on this matter at 1% level?
4. A manufacturing firm claims that its brand A product outsells its brand B product by 8%. If it is found
that 42 out of a sample of 200 people prefer brand A and 18 out of another sample of 100 people prefer
brand B, test whether the 8% difference is a valid claim.
Answers
1. H0 : accepted
2. H0 : rejected
3. H0 : accepted
4. H0 : accepted.
________________________________________________________________________________________________________
21.74
TEST OF SIGNIFICANCE FOR THE SINGLE MEAN
To test whether the difference between the sample mean and the population mean is significant or not:
Let x1, x2, . . . , xn be a random sample of size n from a large population X1, X2, . . . , XN of size N with mean μ and variance σ². The standard error of the mean of a random sample of size n from a population with variance σ² is σ/√n.
We test whether the given sample of size n has been drawn from a population with mean μ, i.e., whether the difference between the sample mean and the population mean is significant or not. Under the null hypothesis that there is no difference between the sample mean and the population mean, the test statistic is
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}},$$
where σ is the standard deviation of the population.
If σ is not known, we use the test statistic
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}},$$
where s is the standard deviation of the sample.
Note. If the level of significance is α and zα is the critical value, H0 is accepted when
$$-z_\alpha < z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < z_\alpha.$$
The limits of the population mean μ are given by
$$\bar{x} - z_\alpha \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_\alpha \frac{\sigma}{\sqrt{n}}.$$
At 5% level of significance, the 95% confidence limits are
$$\bar{x} - 1.96 \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}}.$$
At 1% level of significance, the 99% confidence limits are
$$\bar{x} - 2.58 \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 2.58 \frac{\sigma}{\sqrt{n}}.$$
These limits are called confidence limits or fiducial limits.
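The test statistic and the confidence limits of this section can be sketched in Python as below. The helper names `z_for_mean` and `mean_confidence_limits` are hypothetical, and the figures in the usage lines (n = 100, sample mean 64, σ = 3, hypothetical μ = 67) are chosen only for illustration.

```python
from math import sqrt

def z_for_mean(sample_mean, mu, sigma, n):
    """z statistic for H0: the sample comes from a population with mean mu."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu) / standard_error

def mean_confidence_limits(sample_mean, sigma, n, z_alpha=1.96):
    """Confidence limits x_bar -/+ z_alpha * sigma / sqrt(n); 1.96 for 95%, 2.58 for 99%."""
    margin = z_alpha * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

z = z_for_mean(64, 67, 3, 100)
lower, upper = mean_confidence_limits(64, 3, 100, z_alpha=2.58)
print(round(z, 3), round(lower, 3), round(upper, 3))
```

When σ is unknown and the sample is large, the sample standard deviation s can be passed in place of `sigma`, as the text describes.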
ILLUSTRATIVE EXAMPLES
Example 1. A normal population has a mean of 6.8 and standard deviation of 1.5.
A sample of 400 members gave a mean of 6.75. Is the difference significant?
Sol. H0 : There is no significant difference between $\bar{x}$ and μ.
H1 : There is significant difference between $\bar{x}$ and μ.
Given μ = 6.8, σ = 1.5, $\bar{x}$ = 6.75, and n = 400,
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{6.75 - 6.8}{1.5/\sqrt{400}} = -0.67, \quad |z| = 0.67.$$
Conclusion. As the calculated value of |z| < zα = 1.96 at 5% level of significance, H0 is accepted, i.e., there is no significant difference between $\bar{x}$ and μ.
Example 2. A random sample of 900 wooden sticks has a mean of 3.4 cms. Can it be
reasonably regarded as a sample from a large population of mean 3.2 cms and S.D. 2.3 cms?
Sol. Here n = 900, $\bar{x}$ = 3.4, μ = 3.2, σ = 2.3.
H0 : Assume that the sample is drawn from a large population with mean 3.2 and S.D. = 2.3.
H1 : μ ≠ 3.2 (Apply two-tailed test.)
Under H0,
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{3.4 - 3.2}{2.3/\sqrt{900}} = 2.609.$$
Conclusion. As the calculated value of z = 2.609 > 1.96, the significant value of z at 5% level of significance, H0 is rejected, i.e., the sample cannot reasonably be regarded as drawn from the population with mean 3.2 and S.D. = 2.3.
Example 3. The mean weight obtained from a random sample of size 100 is 64 gms. The
S.D. of the weight distribution of the population is 3 gms. Test the statement that the mean weight
of the population is 67 gms at 5% level of significance. Also set up 99% confidence limits of the
mean weight of the population.
Sol. Here n = 100, μ = 67, $\bar{x}$ = 64, σ = 3.
H0 : There is no significant difference between sample and population mean, i.e., μ = 67; the sample is drawn from the population with μ = 67.
H1 : μ ≠ 67 (Two-tailed test)
Under H0,
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{64 - 67}{3/\sqrt{100}} = -10 \quad \therefore |z| = 10.$$
Conclusion. Since the calculated value of |z| > 1.96, the significant value of z at 5% level of significance, H0 is rejected, i.e., the sample is not drawn from the population with mean 67.
The 99% confidence limits are given by
$$\bar{x} \pm 2.58\,\sigma/\sqrt{n} = 64 \pm 2.58 \times 3/\sqrt{100} = 63.226,\ 64.774.$$
Example 4. The average grades in mathematics of a sample of 100 students was 51 with a
S.D. of 6. Could this have been a random sample from a population with average grades of 50?
Sol. Here n = 100, $\bar{x}$ = 51, s = 6, μ = 50; σ is unknown.
H0 : The sample is drawn from a population with mean 50, i.e., μ = 50
H1 : μ ≠ 50
Under H0,
$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{51 - 50}{6/\sqrt{100}} = \frac{10}{6} = 1.6666.$$
Conclusion. Since z = 1.666 < 1.96, the significant value zα of z at 5% level of significance, H0 is accepted, i.e., the sample is drawn from the population with mean 50.
TEST YOUR KNOWLEDGE
1. A sample of 1000 students from a university was taken and their average weight was found to be 112
pounds with a S.D. of 20 pounds. Could the mean weight of students in the population be 120 pounds?
2. A sample of 400 male students is found to have a mean height of 160 cms. Can it be reasonably regarded
as a sample from a large population with mean height 162.5 cms and standard deviation 4.5 cms?
3. A random sample of 200 measurements from a large population gave a mean value of 50 and a S.D. of
9. Determine 95% confidence interval for the mean of the population.
4. The guaranteed average life of a certain type of bulb is 1000 hours with a S.D. of 125 hours. It is decided
to sample the output so as to ensure that 90% of the bulbs do not fall short of the guaranteed average by
more than 2.5%. What must be the minimum size of the sample?
5. The heights of college students in a city are normally distributed with a S.D. of 6 cms. A sample of 1000
students has a mean height of 158 cms. Test the hypothesis that the mean height of college students in
the city is 160 cms.
Answers
1. H0 is rejected
2. H0 accepted
3. 48.8 and 51.2
4. n = 4
5. H0 rejected at both 1% and 5% levels of significance.
________________________________________________________________________________________________________
21.75 TEST OF SIGNIFICANCE FOR DIFFERENCE OF MEANS OF TWO LARGE
SAMPLES
Let $\bar{x}_1$ be the mean of a sample of size n1 from a population with mean μ1 and variance σ1². Let $\bar{x}_2$ be the mean of an independent sample of size n2 from another population with mean μ2 and variance σ2². The test statistic is given by
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}.$$
Under the null hypothesis that the samples are drawn from the same population, where σ1 = σ2 = σ and μ1 = μ2, the test statistic is given by
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}.$$
Note 1. If σ1, σ2 are not known and σ1 ≠ σ2, the test statistic in this case is
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}.$$
Note 2. If σ is not known and σ1 = σ2, we use
$$\sigma^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}$$
to calculate σ, so that
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}.$$
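A minimal Python sketch of the two forms in Notes 1 and 2 is given below. The function name `two_sample_z` and its `pooled` flag are illustrative assumptions, not part of the text; `pooled=False` uses s1²/n1 + s2²/n2, and `pooled=True` uses the combined estimate σ² = (n1s1² + n2s2²)/(n1 + n2).

```python
from math import sqrt

def two_sample_z(mean1, s1, n1, mean2, s2, n2, pooled=False):
    """z statistic for the difference of means of two large samples."""
    if pooled:
        # Note 2: common variance estimated from both samples.
        sigma_sq = (n1 * s1**2 + n2 * s2**2) / (n1 + n2)
        standard_error = sqrt(sigma_sq * (1 / n1 + 1 / n2))
    else:
        # Note 1: separate sample variances.
        standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / standard_error

# Illustrative figures: means 210 and 220, S.D.s 10 and 12, sizes 100 and 150.
print(round(two_sample_z(210, 10, 100, 220, 12, 150), 4))
print(round(two_sample_z(210, 10, 100, 220, 12, 150, pooled=True), 4))
```

For large samples the two forms usually lead to the same decision; they differ noticeably only when the sample sizes and sample variances are both very unequal.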
ILLUSTRATIVE EXAMPLES
Example 1. The average bonus income of people was $210 with a S.D. of $10 in a sample
of 100 people of a city. For another sample of 150 people, the average income was $220 with
S.D. of $12. The S.D. of bonus incomes of the people of the city was $11. Test whether there is
any significant difference between the average bonus incomes of the localities.
Sol. Here n1 = 100, n2 = 150, $\bar{x}_1$ = 210, $\bar{x}_2$ = 220, s1 = 10, s2 = 12.
Null hypothesis. The difference is not significant, i.e., there is no difference between the bonus incomes of the localities.
H0 : μ1 = μ2, H1 : μ1 ≠ μ2
Under H0,
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{210 - 220}{\sqrt{\dfrac{10^2}{100} + \dfrac{12^2}{150}}} = -7.1428 \quad \therefore |z| = 7.1428.$$
Conclusion. As the calculated value of |z| > 1.96, the significant value of z at 5% level of significance, H0 is rejected, i.e., there is significant difference between the average bonus incomes of the localities.
Example 2. Intelligence tests were given to two groups of boys and girls.
         Mean   S.D.   Size
Girls    75     8      60
Boys     73     10     100
Examine if the difference between mean scores is significant.
Sol. Null hypothesis H0. There is no significant difference between mean scores, i.e., μ1 = μ2.
H1 : μ1 ≠ μ2
Under the null hypothesis,
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{75 - 73}{\sqrt{\dfrac{8^2}{60} + \dfrac{10^2}{100}}} = 1.3912.$$
Conclusion. As the calculated value of z < 1.96, the significant value of z at 5% level of significance, H0 is accepted, i.e., there is no significant difference between mean scores.
Example 3. For sample I, n1 = 1000, Σx = 49,000, Σ(x − $\bar{x}$)² = 784,000.
For sample II, n2 = 1500, Σx = 70,500, Σ(x − $\bar{x}$)² = 2,400,000. Discuss the significance of the difference of the sample means.
Sol. Null hypothesis H0. There is no significant difference between the sample means.
H0 : μ1 = μ2; H1 : μ1 ≠ μ2
To calculate the sample variances:
$$s_1^2 = \frac{1}{n_1}\Sigma(X_1 - \bar{X}_1)^2 = \frac{784000}{1000} = 784$$
$$s_2^2 = \frac{1}{n_2}\Sigma(X_2 - \bar{X}_2)^2 = \frac{2400000}{1500} = 1600$$
$$\bar{x}_1 = \frac{\Sigma x_1}{n_1} = \frac{49000}{1000} = 49; \quad \bar{x}_2 = \frac{\Sigma x_2}{n_2} = \frac{70500}{1500} = 47$$
Under the null hypothesis, the test statistic is
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{49 - 47}{\sqrt{\dfrac{784}{1000} + \dfrac{1600}{1500}}} = 1.470.$$
Conclusion. As the calculated value of z = 1.47 < 1.96, the significant value of z at 5% level of significance, H0 is accepted, i.e., there is no significant difference between the sample means.
Example 4. From the data given below, compute the standard error of the difference of the
two sample means and find out if the two means significantly differ at 5% level of significance.
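            No. of items   Mean    S.D.
Group I     50             181.5   3.0
Group II    75             179     3.6
Sol. Null hypothesis H0. There is no significant difference between the samples, i.e., μ1 = μ2; H1 : μ1 ≠ μ2.
The standard error of the difference of the two sample means is
$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{9}{50} + \frac{(3.6)^2}{75}} = 0.594.$$
Under H0,
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{181.5 - 179.0}{\sqrt{\dfrac{9}{50} + \dfrac{(3.6)^2}{75}}} = 4.2089.$$
Conclusion. As z > 1.96, the tabulated value of z at 5% level of significance, H0 is rejected, i.e., there is significant difference between the samples.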
Example 5. A random sample of 200 towns in any state gives the mean population per town at 485 with a S.D. of 50. Another random sample of the same size from the same state gives the mean population per town at 510 with a S.D. of 40. Is the difference between the mean values given by the two samples statistically significant? Justify your answer.
Sol. Here n1 = 200, n2 = 200, $\bar{x}_1$ = 485, $\bar{x}_2$ = 510, s1 = 50, s2 = 40.
Null hypothesis H0. There is no significant difference between the mean values, i.e., μ1 = μ2; H1 : μ1 ≠ μ2 (Two-tailed test)
Under H0, the test statistic is given by
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{485 - 510}{\sqrt{\dfrac{50^2}{200} + \dfrac{40^2}{200}}} = -5.52 \quad \therefore |z| = 5.52.$$
Conclusion. As the calculated value of |z| > 1.96, the significant value of z at 5% level of significance, H0 is rejected, i.e., there is significant difference between the mean values of the two samples.
TEST YOUR KNOWLEDGE
1. Intelligence tests on two groups of boys and girls gave the following results. Examine whether the
difference is significant.
         Mean   S.D.   Size
Girls    70     10     70
Boys     75     11     100
2. Two random samples of sizes 1000 and 2000 of farms gave an average yield of 2000 kg and 2050 kg
respectively. The variance of wheat farms in the country may be taken as 100 kg. Examine whether the
two samples differ significantly in yield.
3. A sample of heights of 6400 soldiers has a mean of 67.85 inches and a S.D. of 2.56 inches while another
sample of heights of 1600 sailors has a mean of 68.55 inches with a S.D. of 2.52 inches. Do the data
indicate that the sailors are on the average taller than soldiers?
4. In a survey of buying habits, 400 shoppers are chosen at random in supermarket A. Their average
weekly food expenditure is $250 with a S.D. of $40. For 500 shoppers chosen at supermarket B, the
average weekly food expenditure is $220 with a S.D. of $45. Test at 1% level of significance whether
the average food expenditures of the two groups are equal.
5. The number of accidents per day was studied for 144 days in town A and for 100 days in town B and the
following information was obtained.
          Mean number of accidents   S.D.
Town A    4.5                        1.2
Town B    5.4                        1.5
Is the difference between the mean accidents of the two towns statistically significant?
6. An examination was given to 50 students of college A and to 60 students of college B. For A, the mean
grade was 75 with a S.D. of 9 and for B, the mean grade was 79 with a S.D. of 7. Is there any significant
difference between the performance of the students of college A and those of college B?
7. A random sample of 200 measurements from a large population gave a mean value of 50 and a S.D.
of 9. Determine the 95% confidence interval for the mean of the population.
8. The means of two large samples of 1000 and 2000 members are 168.75 cms and 170 cms respectively.
Can the samples be regarded as drawn from the same population of standard deviation 6.25 cms?
Answers
1. No significant difference
2. Highly significant
3. Highly significant
4. Highly significant
5. Highly significant
6. Not significant
7. 49.584, 50.416
8. Not significant
________________________________________________________________________________________________________
21.76 TEST OF SIGNIFICANCE FOR THE DIFFERENCE OF STANDARD
DEVIATIONS
If s1 and s2 are the standard deviations of two independent samples, then under the null hypothesis H0 : σ1 = σ2, i.e., the sample standard deviations don't differ significantly, the statistic is
$$z = \frac{s_1 - s_2}{\sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}},$$
where σ1 and σ2 are the population standard deviations. When the population standard deviations are not known, then
$$z = \frac{s_1 - s_2}{\sqrt{\dfrac{s_1^2}{2n_1} + \dfrac{s_2^2}{2n_2}}}.$$
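The second form of the statistic, with the sample standard deviations standing in for the unknown population values, can be sketched in Python as follows. The function name `sd_difference_z` is a hypothetical helper, and the usage figures are illustrative.

```python
from math import sqrt

def sd_difference_z(s1, n1, s2, n2):
    """Large-sample z statistic for H0: sigma1 = sigma2, with s1, s2 used for sigma1, sigma2."""
    standard_error = sqrt(s1**2 / (2 * n1) + s2**2 / (2 * n2))
    return (s1 - s2) / standard_error

# Illustrative figures: S.D.s 2.58 and 2.50 from samples of 1000 and 1200.
z = sd_difference_z(2.58, 1000, 2.50, 1200)
print(round(z, 4))
```

As with the other large-sample tests, |z| is compared with 1.96 at the 5% level for a two-tailed test.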
ILLUSTRATIVE EXAMPLES
Example 1. Random samples drawn from two countries gave the following data relating to
the heights of adult males.
                          Country A   Country B
Mean height (in inches)   67.42       67.25
Standard deviation        2.58        2.50
Number in samples         1000        1200
(i) Is the difference between the means significant?
(ii) Is the difference between the standard deviations significant?
Sol. Given: n1 = 1000, n2 = 1200, $\bar{x}_1$ = 67.42, $\bar{x}_2$ = 67.25, s1 = 2.58, s2 = 2.50.
Since the sample sizes are large we can take σ1 = s1 = 2.58; σ2 = s2 = 2.50.
(i) Null hypothesis. H0 : μ1 = μ2, i.e., sample means do not differ significantly.
Alternative hypothesis: H1 : μ1 ≠ μ2 (two-tailed test)
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{67.42 - 67.25}{\sqrt{\dfrac{(2.58)^2}{1000} + \dfrac{(2.50)^2}{1200}}} = 1.56$$
Since |z| < 1.96, we accept the null hypothesis at 5% level of significance.
(ii) We set up the null hypothesis H0 : σ1 = σ2, i.e., the sample S.D.'s do not differ significantly.
Alternative hypothesis: H1 : σ1 ≠ σ2 (two-tailed)
∴ The test statistic is given by
$$z = \frac{s_1 - s_2}{\sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}} = \frac{s_1 - s_2}{\sqrt{\dfrac{s_1^2}{2n_1} + \dfrac{s_2^2}{2n_2}}} \quad (\because \sigma_1 = s_1,\ \sigma_2 = s_2 \text{ for large samples})$$
$$= \frac{2.58 - 2.50}{\sqrt{\dfrac{(2.58)^2}{2 \times 1000} + \dfrac{(2.50)^2}{2 \times 1200}}} = \frac{0.08}{\sqrt{\dfrac{6.6564}{2000} + \dfrac{6.25}{2400}}} = 1.0387$$
Since |z| < 1.96, we accept the null hypothesis at 5% level of significance.
Example 2. An intelligence test of two groups of boys and girls gives the following results:
        Girls   Boys
Mean    84      81
S.D.    10      12
N       121     81
(a) Is the difference in mean scores significant?
(b) Is the difference between the standard deviations significant?
Sol. Given: n1 = 121, n2 = 81, $\bar{x}_1$ = 84, $\bar{x}_2$ = 81, s1 = 10, s2 = 12.
(a) Null hypothesis. H0 : μ1 = μ2, i.e., sample means do not differ significantly.
Alternative hypothesis: H1 : μ1 ≠ μ2 (two-tailed)
The test statistic is
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{84 - 81}{\sqrt{\dfrac{(10)^2}{121} + \dfrac{(12)^2}{81}}} = 1.859$$
Since |z| < 1.96, we accept the null hypothesis at 5% level of significance.
(b) We set up the null hypothesis H0 : σ1 = σ2, i.e., the sample S.D.'s do not differ significantly.
Alternative hypothesis: H1 : σ1 ≠ σ2 (two-tailed)
The test statistic is
$$z = \frac{s_1 - s_2}{\sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}} = \frac{s_1 - s_2}{\sqrt{\dfrac{s_1^2}{2n_1} + \dfrac{s_2^2}{2n_2}}} \quad (\because \sigma_1 = s_1,\ \sigma_2 = s_2 \text{ for large samples})$$
$$= \frac{10 - 12}{\sqrt{\dfrac{100}{2 \times 121} + \dfrac{144}{2 \times 81}}} = -1.7526 \quad \therefore |z| = 1.7526$$
Since |z| = 1.75 < 1.96, we accept the null hypothesis at 5% level of significance.
TEST YOUR KNOWLEDGE
1. The mean yield of two sets of plots and their variability are as given; examine
(i) whether the difference in the mean yield of the two sets of plots is significant;
(ii) whether the difference in the variability in yields is significant.
                   Mean yield per plot   S.D. per plot
Set of 40 plots    1258 lb               34
Set of 60 plots    1243 lb               28
2. The yield of wheat in a random sample of 1000 farms in a certain area has a S.D. of 192 kg. Another
random sample of 1000 farms gives a S.D. of 224 kg. Are the S.D.’s significantly different?
Answers
1. z = 2.321 Difference significant at 5% level; z = 1.31 Difference not significant at 5% level
2. z = 4.851 The S.D.’s are significantly different.
________________________________________________________________________________________________________
21.77
TEST OF SIGNIFICANCE OF SMALL SAMPLES
When the size of the sample is less than 30, the sample is called a small sample. For such a sample we can no longer assume that the random sampling distribution of a statistic is approximately normal, nor that the values given by the sample data are sufficiently close to the population values to be used in their place for the calculation of the standard error of the estimate.
t-TEST
21.78
STUDENT’S t-DISTRIBUTION
The t-distribution is used when the sample size is ≤ 30 and the population standard deviation is unknown.
The t-statistic is defined as
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \sim t(n - 1 \text{ d.f.}),$$
where d.f. denotes degrees of freedom and
$$s = \sqrt{\frac{\Sigma(X - \bar{X})^2}{n - 1}}.$$
The t-table
The t-table given at the end is the probability integral of the t-distribution. The t-distribution
has a different value for each degree of freedom and when the degrees of freedom are infinitely
large, the t-distribution is equivalent to normal distribution and the probabilities shown in the
normal distribution tables are applicable.
Application of t-distribution
Some of the applications of t-distribution are given below:
1. To test if the sample mean ( X ) differs significantly from the hypothetical value μ of
the population mean.
2. To test the significance between two sample means.
3. To test the significance of observed partial and multiple correlation coefficients.
Critical value of t
The critical value or significant value of t at level of significance α and degrees of freedom γ for the two-tailed test is given by
$$P\left[\,|t| > t_\gamma(\alpha)\right] = \alpha$$
$$P\left[\,|t| \le t_\gamma(\alpha)\right] = 1 - \alpha$$
The significant value of t at level of significance α for a single-tailed test can be determined from those of the two-tailed test by referring to the values at 2α.
21.79
TEST I: t-TEST OF SIGNIFICANCE OF THE MEAN OF A RANDOM SAMPLE
To test whether the mean of a sample drawn from a normal population deviates significantly from a stated value when the variance of the population is unknown.
H0 : There is no significant difference between the sample mean $\bar{x}$ and the population mean μ. We use the statistic
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}},$$
where $\bar{X}$ is the mean of the sample and
$$s^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(X_i - \bar{X})^2,$$
with (n − 1) degrees of freedom.
At the given level of significance α and (n − 1) degrees of freedom, we refer to the t-table for tα (two-tailed or one-tailed).
If the calculated t value is such that |t| < tα, the null hypothesis is accepted; if |t| > tα, H0 is rejected.
Fiducial limits of population mean
If tα is the table value of t at level of significance α for (n − 1) degrees of freedom, then
$$\left|\frac{\bar{X} - \mu}{s/\sqrt{n}}\right| < t_\alpha \text{ for acceptance of H0,}$$
$$\bar{x} - t_\alpha \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_\alpha \frac{s}{\sqrt{n}}.$$
95% confidence limits (level of significance 5%) are $\bar{X} \pm t_{0.05}\, s/\sqrt{n}$.
99% confidence limits (level of significance 1%) are $\bar{X} \pm t_{0.01}\, s/\sqrt{n}$.
Note. Instead of calculating s, we may calculate S for the sample. Since
$$s^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(X_i - \bar{X})^2 \quad \text{and} \quad S^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2,$$
we have
$$(n - 1)s^2 = nS^2, \quad \text{i.e.,} \quad s^2 = \frac{n}{n - 1}S^2.$$
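The two routes to the t statistic described above (from the sum of squared deviations, or from S with the conversion s² = nS²/(n − 1)) can be sketched in Python as follows. The helper names `t_from_sum_of_squares` and `t_from_S` are hypothetical, and the figures in the usage lines are illustrative. The critical value tα still has to be read from the t-table for (n − 1) degrees of freedom; it is not computed here.

```python
from math import sqrt

def t_from_sum_of_squares(sample_mean, sum_sq_dev, n, mu):
    """t statistic when the sum of squared deviations from the mean is given."""
    s = sqrt(sum_sq_dev / (n - 1))          # s uses the divisor n - 1
    return (sample_mean - mu) / (s / sqrt(n))

def t_from_S(sample_mean, S, n, mu):
    """t statistic when the S.D. with divisor n (S) is given."""
    s = sqrt(n / (n - 1)) * S               # from (n - 1) s^2 = n S^2
    return (sample_mean - mu) / (s / sqrt(n))

# n = 16, mean 53, sum of squared deviations 135, hypothetical mean 56.
print(round(t_from_sum_of_squares(53, 135, 16, 56), 3))
# n = 20, mean 42, S = 5, hypothetical mean 45.
print(round(t_from_S(42, 5, 20, 45), 3))
```

Each result is compared in absolute value with the tabulated tα for (n − 1) degrees of freedom.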
ILLUSTRATIVE EXAMPLES
Example 1. A random sample of size 16 has 53 as its mean. The sum of squares of the
deviation from mean is 135. Can this sample be regarded as taken from the population having 56
as its mean? Obtain 95% and 99% confidence limits of the mean of the population.
Sol. H0 : There is no significant difference between the sample mean and the hypothetical population mean.
H0 : μ = 56; H1 : μ ≠ 56 (Two-tailed test)
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t(n - 1 \text{ d.f.})$$
Given: $\bar{X}$ = 53, μ = 56, n = 16, Σ(X − $\bar{X}$)² = 135
$$s = \sqrt{\frac{\Sigma(X - \bar{X})^2}{n - 1}} = \sqrt{\frac{135}{15}} = 3; \quad t = \frac{53 - 56}{3/\sqrt{16}} = \frac{-3 \times 4}{3} = -4$$
|t| = 4, d.f. = 16 − 1 = 15.
Conclusion. For 15 d.f., t0.05 = 2.131 (two-tailed).
Since |t| = 4 > t0.05 = 2.131, i.e., the calculated value of t is more than the table value, the hypothesis is rejected. Hence the sample has not come from a population having 56 as its mean.
95% confidence limits of the population mean:
$$\bar{X} \pm \frac{s}{\sqrt{n}}\, t_{0.05} = 53 \pm \frac{3}{\sqrt{16}}(2.131) = 51.402;\ 54.598$$
99% confidence limits of the population mean (t0.01 = 2.947 for 15 d.f.):
$$\bar{X} \pm \frac{s}{\sqrt{n}}\, t_{0.01} = 53 \pm \frac{3}{\sqrt{16}}(2.947) = 50.790;\ 55.210.$$
Example 2. The lifetime of electric bulbs for a random sample of 10 from a large shipment
gave the following data:
Item                    1     2     3     4     5     6     7     8     9     10
Life in 1000s of hrs.   4.2   4.6   3.9   4.1   5.2   3.8   3.9   4.3   4.4   5.6
Can we accept the hypothesis that the average lifetime of a bulb is 4000 hrs?
Sol. H0 : There is no significant difference in the sample mean and population mean, i.e., μ = 4000 hrs.
Applying the t-test:
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t(10 - 1 \text{ d.f.})$$
X           4.2    4.6    3.9    4.1    5.2    3.8    3.9    4.3    4.4   5.6
X − X̄      −0.2   0.2    −0.5   −0.3   0.8    −0.6   −0.5   −0.1   0     1.2
(X − X̄)²   0.04   0.04   0.25   0.09   0.64   0.36   0.25   0.01   0     1.44
$$\bar{X} = \frac{\Sigma X}{n} = \frac{44}{10} = 4.4, \quad \Sigma(X - \bar{X})^2 = 3.12$$
Working in thousands of hours (μ = 4),
$$s = \sqrt{\frac{\Sigma(X - \bar{X})^2}{n - 1}} = \sqrt{\frac{3.12}{9}} = 0.589; \quad t = \frac{4.4 - 4}{0.589/\sqrt{10}} = 2.148$$
For γ = 9, t0.05 = 2.26.
Conclusion. Since the calculated value of t is less than the table value t0.05, the hypothesis μ = 4000 hrs is accepted, i.e., the average lifetime of the bulbs could be 4000 hrs.
Example 3. A sample of 20 items has mean 42 units and S.D. 5 units. Test the hypothesis
that it is a random sample from a normal population with mean 45 units.
Sol. H0 : There is no significant difference between the sample mean and the population mean, i.e., μ = 45 units.
H1 : μ ≠ 45 (Two-tailed test)
Given: n = 20, $\bar{X}$ = 42, S = 5; γ = 19 d.f.
$$s^2 = \frac{n}{n - 1}S^2 = \left[\frac{20}{20 - 1}\right](5)^2 = 26.31 \quad \therefore s = 5.129$$
Applying the t-test,
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{42 - 45}{5.129/\sqrt{20}} = -2.615; \quad |t| = 2.615$$
The tabulated value of t at 5% level for 19 d.f. is t0.05 = 2.09.
Conclusion. Since |t| > t0.05, the hypothesis H0 is rejected, i.e., there is significant difference between the sample mean and the population mean; the sample could not have come from this population.
Example 4. The 9 items of a sample have the following values: 45, 47, 50, 52, 48, 47, 49,
53, 51. Does the mean of these values differ significantly from the assumed mean 47.5?
Sol. H0 : μ = 47.5, i.e., there is no significant difference between the sample and the population mean.
H1 : μ ≠ 47.5 (two-tailed test); given: n = 9, μ = 47.5
X           45      47     50     52     48     47     49     53      51
X − X̄      −4.1    −2.1   0.9    2.9    −1.1   −2.1   −0.1   3.9     1.9
(X − X̄)²   16.81   4.41   0.81   8.41   1.21   4.41   0.01   15.21   3.61
$$\bar{X} = \frac{\Sigma X}{n} = \frac{442}{9} = 49.11; \quad \Sigma(X - \bar{X})^2 = 54.89; \quad s^2 = \frac{\Sigma(X - \bar{X})^2}{n - 1} = 6.86 \quad \therefore s = 2.619$$
Applying the t-test,
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{49.11 - 47.5}{2.619/\sqrt{9}} = \frac{(1.61) \times 3}{2.619} = 1.844$$
t0.05 = 2.31 for γ = 8.
Conclusion. Since |t| < t0.05, the hypothesis is accepted, i.e., there is no significant difference between the sample mean and the assumed mean.
Example 5. The following results are obtained from a sample of 10 boxes of biscuits.
Mean weight content = 490 gm.
S.D. of the weight 9 gm. Could the sample come from a population having a mean of
500 gm?
Sol. Given: n = 10, $\bar{X} = 490$ gm, S = 9 gm, μ = 500 gm
$$s = \sqrt{\frac{n}{n-1}\,S^2} = \sqrt{\frac{10}{9}\times 9^2} = 9.486$$
H0 : The difference is not significant, i.e., μ = 500; H1 : μ ≠ 500
Applying the t-test,
$$t = \frac{\bar{X}-\mu}{s/\sqrt{n}} = \frac{490-500}{9.486/\sqrt{10}} = -3.333$$
t0.05 = 2.26 for γ = 9.
Conclusion. Since |t| = 3.333 > t0.05, the hypothesis H0 is rejected, i.e., μ ≠ 500.
∴ The sample could not have come from the population having mean 500 gm.
TEST YOUR KNOWLEDGE
1. Ten individuals are chosen at random from a normal population of students and their grades are found to
be 63, 63, 66, 67, 68, 69, 70, 70, 71, 71. In light of these data, discuss the suggestion that the mean grade
of the population of students is 66.
2. The following values give the lengths of 12 samples of Egyptian cotton taken from a shipment: 48, 46,
49, 46, 52, 45, 43, 47, 47, 46, 45, 50. Test whether the mean length of the shipment can be taken as 46.
3. A sample of 18 items has a mean of 24 units and a standard deviation of 3 units. Test the hypothesis that
it is a random sample from a normal population with a mean of 27 units.
4. A random sample of 10 students had the following I.Q.’s 70, 120, 110, 101, 88, 83, 95, 98, 107, and 100.
Do these data support the assumption of a population mean I.Q. of 100?
5. A filling machine is expected to fill 5 kg of powder into bags. A sample of 10 bags gave the following
weights: 4.7, 4.9, 5.0, 5.1, 5.4, 5.2, 4.6, 5.1, 4.6, and 4.7. Test whether the machine is working properly.
Answers
1. accepted    2. accepted    3. rejected    4. accepted    5. accepted
________________________________________________________________________________________________________
21.80 TEST II: t-TEST FOR DIFFERENCE OF MEANS OF TWO SMALL SAMPLES
(FROM A NORMAL POPULATION)
This test is used to test whether two samples x1, x2, . . . , xn1 and y1, y2, . . . , yn2 of sizes n1, n2 have been drawn from two normal populations with means μ1 and μ2 respectively, under the assumption that the population variances are equal (σ1 = σ2 = σ).
H0 : The samples have been drawn from normal populations with the same mean, i.e., H0 : μ1 = μ2.
Let $\bar{X}$, $\bar{Y}$ be the means of the two samples.
Under this H0 the test statistic t is given by
$$t = \frac{\bar{X}-\bar{Y}}{s\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}} \sim t\,(n_1+n_2-2\ \text{d.f.})$$
Note 1. If the two sample standard deviations s1, s2 are given, then we have
$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1+n_2-2}.$$
Note 2. If n1 = n2 = n, then
$$t = \frac{\bar{X}-\bar{Y}}{\sqrt{\dfrac{s_1^2+s_2^2}{n-1}}}$$
can be used as a test statistic.
Note 3. If the pairs of values are in some way associated (correlated), we cannot use the test statistic given in Note 2. In this case we find the differences of the associated pairs of values and apply the test for a single mean, i.e., $t = \dfrac{\bar{X}-\mu}{s/\sqrt{n}}$ with n – 1 degrees of freedom.
I.e., the test statistic is
$$t = \frac{\bar{d}}{s/\sqrt{n}} \quad\text{or}\quad t = \frac{\bar{d}}{s/\sqrt{n-1}},$$
where $\bar{d}$ is the mean of the paired differences $d_i = x_i - y_i$, so that $\bar{d} = \bar{X}-\bar{Y}$, and $(x_i, y_i)$, i = 1, 2, . . . , n, are the paired data. The first form applies when s² is computed with divisor n – 1, the second when s² is computed with divisor n.
ILLUSTRATIVE EXAMPLES
Example 1. Two samples of sodium vapor bulbs were tested for length of life and the
following results were returned:
                 Type I      Type II
Size             8           7
Sample mean      1234 hrs    1036 hrs
Sample S.D.      36 hrs      40 hrs
Is the difference in the means significant enough to generalize that type I is superior to type
II regarding length of life?
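Sol. H0 : μ1 = μ2, i.e., the two types of bulbs have the same lifetime.
H1 : μ1 > μ2, i.e., type I is superior to type II.
$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1+n_2-2} = \frac{8\times(36)^2 + 7\times(40)^2}{8+7-2} = 1659.076 \quad \therefore\ s = 40.7317$$
The t-statistic
$$t = \frac{\bar{X}_1-\bar{X}_2}{s\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}} = \frac{1234-1036}{40.7317\sqrt{\dfrac{1}{8}+\dfrac{1}{7}}} = 9.39 \sim t\,(n_1+n_2-2\ \text{d.f.})$$
t0.05 at d.f. 13 is 1.77 (one-tailed test).
Conclusion. Since calculated t > t0.05, H0 is rejected, i.e., H1 is accepted.
∴ Type I is definitely superior to type II.
Note. When the individual observations are available, the statistic is $t = \dfrac{\bar{X}-\bar{Y}}{s\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}}$, where
$$\bar{X} = \sum_{i=1}^{n_1}\frac{X_i}{n_1}, \quad \bar{Y} = \sum_{j=1}^{n_2}\frac{Y_j}{n_2}, \quad s^2 = \frac{1}{n_1+n_2-2}\left[\Sigma(X_i-\bar{X})^2 + \Sigma(Y_j-\bar{Y})^2\right],$$
and s² is an unbiased estimate of the population variance σ².
t follows the t distribution with n1 + n2 – 2 degrees of freedom.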
Example 2. Samples of sizes 10 and 14 were taken from two normal populations with S.D.
3.5 and 5.2. The sample means were found to be 20.3 and 18.6. Test whether the means of the
two populations are the same at 5% level.
Sol.
H0 : μ1 = μ2 , i.e., the means of the two populations are the same.
H1 : μ1 ≠ μ2 .
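Given: $\bar{X}_1 = 20.3$, $\bar{X}_2 = 18.6$; n1 = 10, n2 = 14, s1 = 3.5, s2 = 5.2
$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1+n_2-2} = \frac{10(3.5)^2 + 14(5.2)^2}{10+14-2} = 22.775 \quad \therefore\ s = 4.772$$
$$t = \frac{\bar{X}_1-\bar{X}_2}{s\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}} = \frac{20.3-18.6}{4.772\sqrt{\dfrac{1}{10}+\dfrac{1}{14}}} = 0.8604$$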
The value of t at 5% level for 22 d.f. is t0.05 = 2.0739.
Conclusion. Since t = 0.8604 < t0.05 the hypothesis is accepted, i.e., there is no significant
difference between their means.
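The pooled computation of Example 2 can be reproduced in a few lines. This is a minimal sketch, not a definitive implementation: the helper name `pooled_t` is hypothetical, and it assumes that the given S.D.s are sample values computed with divisors n1 and n2, which is what the formula of Note 1 presupposes.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic from summary statistics (equal variances assumed).

    Assumption: sd1 and sd2 are sample S.D.s computed with divisors n1 and n2.
    """
    # Pooled estimate of the common variance, as in Note 1
    s2 = (n1 * sd1**2 + n2 * sd2**2) / (n1 + n2 - 2)
    s = math.sqrt(s2)
    t = (mean1 - mean2) / (s * math.sqrt(1 / n1 + 1 / n2))
    return s, t, n1 + n2 - 2

# Figures of Example 2
s, t, df = pooled_t(10, 20.3, 3.5, 14, 18.6, 5.2)
print(round(s, 3), round(t, 4), df)
```

Note that the square root of (1/n1 + 1/n2) multiplies s in the denominator; omitting that square root is an easy slip when working by hand.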
Example 3. The heights of 6 randomly chosen sailors in inches are 63, 65, 68, 69, 71, and
72. Those of 9 randomly chosen soldiers are 61, 62, 65, 66, 69, 70, 71, 72, and 73. Test whether
the sailors are, on the average, taller than the soldiers.
Sol. Let X1 and X2 be the two samples denoting the heights of sailors and soldiers. Given
the sample size n1 = 6, n2 = 9, H0 : μ1 = μ2 .
I.e., the means of both the population are the same.
H1 : μ1 > μ2 (one-tailed test)
Calculation of the two sample means:
$X_1$:                    63     65     68     69     71     72
$X_1 - \bar{X}_1$:        –5     –3     0      1      3      4
$(X_1 - \bar{X}_1)^2$:    25     9      0      1      9      16
$$\bar{X}_1 = \frac{\Sigma X_1}{n_1} = 68; \quad \Sigma(X_1-\bar{X}_1)^2 = 60$$
$X_2$:                    61      62      65      66      69      70      71       72       73
$X_2 - \bar{X}_2$:        –6.66   –5.66   –2.66   –1.66   1.34    2.34    3.34     4.34     5.34
$(X_2 - \bar{X}_2)^2$:    44.36   32.035  7.0756  2.7556  1.7956  5.4756  11.1556  18.8356  28.5156
$$\bar{X}_2 = \frac{\Sigma X_2}{n_2} = 67.66; \quad \Sigma(X_2-\bar{X}_2)^2 = 152.0002$$
$$s^2 = \frac{1}{n_1+n_2-2}\left[\Sigma(X_1-\bar{X}_1)^2 + \Sigma(X_2-\bar{X}_2)^2\right] = \frac{1}{6+9-2}\,[60 + 152.0002] = 16.3077 \quad \therefore\ s = 4.038$$
Under H0,
$$t = \frac{\bar{X}_1-\bar{X}_2}{s\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}} = \frac{68-67.666}{4.038\sqrt{\dfrac{1}{6}+\dfrac{1}{9}}} = 0.157 \sim t\,(n_1+n_2-2\ \text{d.f.})$$
The tabulated value of t for a one-tailed test at the 5% level of significance (i.e., the 10% column of a two-tailed table) for 13 d.f. is 1.77.
Conclusion. Since t = 0.157 < t0.05 = 1.77, the hypothesis H0 is accepted.
I.e., there is no significant difference between their averages.
I.e., the sailors are not, on the average, taller than the soldiers.
Example 4. A certain stimulus administered to each of 12 patients resulted in the following
increases of blood pressure: 5, 2, 8, –1, 3, 0, –2, 1, 5, 0, 4, 6. Can it be concluded that the
stimulus will in general be accompanied by an increase in blood pressure?
Sol. To test whether the mean increase in blood pressure of all patients to whom the
stimulus is administered will be positive, we have to assume that this population is normal with
mean μ and S.D. σ , which are unknown.
H0 : μ = 0; H1 : μ > 0
The test statistic under H0 is
$$t = \frac{\bar{d}}{s/\sqrt{n-1}} \sim t\,(n-1\ \text{degrees of freedom})$$
$$\bar{d} = \frac{5+2+8+(-1)+3+0+6+(-2)+1+5+0+4}{12} = 2.583$$
$$s^2 = \frac{\Sigma d^2}{n} - \bar{d}^2 = \frac{1}{12}\left[5^2+2^2+8^2+(-1)^2+3^2+0^2+6^2+(-2)^2+1^2+5^2+0^2+4^2\right] - (2.583)^2 = 8.744 \quad \therefore\ s = 2.9571$$
$$t = \frac{\bar{d}}{s/\sqrt{n-1}} = \frac{2.583}{2.9571/\sqrt{12-1}} = \frac{2.583\sqrt{11}}{2.9571} = 2.897 \sim t\,(n-1\ \text{d.f.})$$
Conclusion. The tabulated value of t0.05 at 11 d.f. is 2.2.
∵ t > t0.05, H0 is rejected.
I.e., the hypothesis that the stimulus does not increase the blood pressure is rejected. The stimulus in general will be accompanied by an increase in blood pressure.
Example 5. The memory capacity of 9 students was tested before and after a course of
medication for a month. State whether the course was effective or not from the data below (in the
same units).
Before:   10   15   9    3    7    12   16   17   4
After:    12   17   8    5    6    11   18   20   3
Sol. Since the data are correlated and concerned with the same set of students, we use the
paired t-test.
H0 : The medication was not effective, i.e., μ1 = μ2.
H1 : μ1 ≠ μ2 (two-tailed test).
Before medication (X):   10   15   9    3    7    12   16   17   4
After medication (Y):    12   17   8    5    6    11   18   20   3
d = X – Y:               –2   –2   1    –2   1    1    –2   –3   1
d²:                      4    4    1    4    1    1    4    9    1
Σd = −7, Σd² = 29
$$\bar{d} = \frac{\Sigma d}{n} = \frac{-7}{9} = -0.7778; \quad s^2 = \frac{\Sigma d^2}{n} - (\bar{d})^2 = \frac{29}{9} - (-0.7778)^2 = 2.617$$
$$t = \frac{\bar{d}}{s/\sqrt{n-1}} = \frac{-0.7778}{\sqrt{2.617}/\sqrt{8}} = \frac{-0.7778\times\sqrt{8}}{1.6177} = -1.359$$
The tabulated value of t0.05 at 8 d.f. is 2.31.
Conclusion. Since t = 1.359 < t0.05, H0 is accepted, i.e., medication was not effective in
improving performance.
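The paired computation of Example 5 can be sketched in Python as follows. The variable names are our own; the sketch uses the form of the statistic with s² computed with divisor n and the denominator s/√(n − 1), as in the solution above.

```python
import math

# Data of Example 5 (same units, same nine students)
before = [10, 15, 9, 3, 7, 12, 16, 17, 4]
after = [12, 17, 8, 5, 6, 11, 18, 20, 3]

# Paired differences d = X - Y
d = [x - y for x, y in zip(before, after)]
n = len(d)
d_bar = sum(d) / n

# Variance of the differences with divisor n: s^2 = (sum d^2)/n - d_bar^2
s2 = sum(v * v for v in d) / n - d_bar**2

# Paired t statistic with n - 1 degrees of freedom
t = d_bar / (math.sqrt(s2) / math.sqrt(n - 1))
print(sum(d), round(s2, 3), round(t, 3))
```

Using the divisor n − 1 for s² together with the denominator s/√n gives the same value of t, since both forms reduce to the same expression.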
Example 6. The following figures refer to observations in two independent samples.
Sample I:    25   30   28   34   24   20   13   32   22   38
Sample II:   40   34   22   20   31   40   30   23   36   17
Analyze whether the samples have been drawn from the populations of equal means.
Sol. H0 : The two samples have been drawn from the population of equal means, i.e., there
is no significant difference between their means,
i.e.,
μ1 = μ2
H1 : μ1 ≠ μ2 (Two-tailed test)
Given n1 = Sample I size = 10; n2 = Sample II size = 10
To calculate the two sample means and the sum of squares of deviation from the mean, let
X1 be the sample I and X2 be the sample II.
$X_1$:                    25      30      28     34     24     20     13      32     22     38
$X_1 - \bar{X}_1$:        –1.6    3.4     1.4    7.4    –2.6   –6.6   –13.6   5.4    –4.6   11.4
$(X_1 - \bar{X}_1)^2$:    2.56    11.56   1.96   54.76  6.76   43.56  184.96  29.16  21.16  129.96
$X_2$:                    40      34      22     20     31     40     30      23     36     17
$X_2 - \bar{X}_2$:        10.7    4.7     –7.3   –9.3   1.7    10.7   0.7     –6.3   6.7    –12.3
$(X_2 - \bar{X}_2)^2$:    114.49  22.09   53.29  86.49  2.89   114.49 0.49    39.69  44.89  151.29
$$\bar{X}_1 = \sum_{i=1}^{10}\frac{X_1}{n_1} = \frac{266}{10} = 26.6; \quad \Sigma(X_1-\bar{X}_1)^2 = 486.4$$
$$\bar{X}_2 = \sum_{i=1}^{10}\frac{X_2}{n_2} = \frac{293}{10} = 29.3; \quad \Sigma(X_2-\bar{X}_2)^2 = 630.10$$
$$s^2 = \frac{1}{n_1+n_2-2}\left[\Sigma(X_1-\bar{X}_1)^2 + \Sigma(X_2-\bar{X}_2)^2\right] = \frac{1}{10+10-2}\,[486.4 + 630.10] = 62.028 \quad \therefore\ s = 7.876$$
Under H0 the test statistic is given by
$$t = \frac{\bar{X}_1-\bar{X}_2}{s\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}} = \frac{26.6-29.3}{7.876\sqrt{\dfrac{1}{10}+\dfrac{1}{10}}} = -0.7666 \sim t\,(n_1+n_2-2\ \text{d.f.})$$
|t| = 0.7666.
Conclusion. The tabulated value of t at 5% level of significance for 18 d.f. is 2.1. Since the
calculated value t = 0.7666 < t0.05, H0 is accepted.
I.e., there is no significant difference between their means.
I.e., the two samples have been drawn from the populations of equal means.
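When the raw observations are available, as in Example 6, the whole calculation can be carried out directly from the data. The following is a minimal sketch; the function name `pooled_t_raw` is our own, and equal population variances are assumed, as the test requires.

```python
import math

def pooled_t_raw(x, y):
    """Two-sample t statistic from raw data, assuming equal population variances."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    # Sums of squared deviations from each sample mean
    ss1 = sum((v - m1) ** 2 for v in x)
    ss2 = sum((v - m2) ** 2 for v in y)
    # Pooled unbiased estimate of the common S.D.
    s = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))
    t = (m1 - m2) / (s * math.sqrt(1 / n1 + 1 / n2))
    return m1, m2, s, t

sample_1 = [25, 30, 28, 34, 24, 20, 13, 32, 22, 38]
sample_2 = [40, 34, 22, 20, 31, 40, 30, 23, 36, 17]

m1, m2, s, t = pooled_t_raw(sample_1, sample_2)
print(m1, m2, round(s, 3), round(t, 4))
```

The absolute value of t is compared with the two-tailed tabulated value for n1 + n2 − 2 = 18 degrees of freedom.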
TEST YOUR KNOWLEDGE
1. The mean life of 10 electric motors was found to be 1450 hrs with a S.D. of 423 hrs. A second sample of
17 motors chosen from a different batch showed a mean life of 1280 hrs with a S.D. of 398 hrs. Is there
a significant difference between the means of the two samples?
2. The grades obtained by a group of 9 regular course students and another group of 11 part-time course
students in a test are given below
Regular:     56   62   63   54   60   51   67   69   58
Part-time:   62   70   71   62   60   56   75   64   72   68   66
Examine whether the grades obtained by regular students and part-time students differ significantly at
5% and 1% levels of significance.
3. A group of 10 boys fed on diet A and another group of 8 boys fed on a different diet B; they recorded
the following increase in weight (kgs).
Diet A:   5   6   8   1   12   4   3   9   6   10
Diet B:   2   3   6   8   10   1   2   8
Does it show the superiority of diet A over diet B?
4. Two independent samples of sizes 7 and 9 have the following values:
Sample A:   10   12   10   13   14   11   10
Sample B:   10   13   15   12   10   14   11   12   11
Test whether the difference between the means is significant.
5. To compare the prices of a certain product in two cities, 10 shops were visited at random in each town.
The prices were noted below:
City 1:   61   63   56   63   56   63   59   56   44   61
City 2:   55   54   47   59   51   61   57   54   64   58
Test whether the average prices can be said to be the same in the two cities.
6. The average number of articles produced by two machines per day are 200 and 250 with standard
deviation 20 and 25 respectively on the basis of records of 25 days’ production. Can you regard both the
machines as equally efficient at 5% level of significance?
7. Two salesmen represent a firm in a certain company. One of them claims that he makes larger sales than
the other. A sample survey was made and the following results were obtained:
                   1st Salesman    2nd Salesman
No. of sales:      18              20
Average sales:     $210            $175
S.D.:              $25             $20
Find whether the average sales differ significantly.
Answers
1. accepted    2. rejected    3. accepted    4. accepted
5. accepted    6. rejected    7. rejected
________________________________________________________________________________________________________
21.81
SNEDECOR’S VARIANCE RATIO TEST OR F-TEST
In testing the significance of the difference of two means of two samples, we assumed
that the two samples came from the same population or a population with equal variance. The
object of the F-test is to discover whether two independent estimates of population variance
differ significantly or whether the two samples may be regarded as drawn from the normal
populations having the same variance. Hence before applying the t-test for the significance of the
difference of two means, we have to test for the equality of population variance by using the
F-test.
Let n1 and n2 be the sizes of two samples with variances $S_1^2$ and $S_2^2$ (computed with divisors n1 and n2). The estimates of the population variance based on these samples are
$$s_1^2 = \frac{n_1 S_1^2}{n_1-1} \quad\text{and}\quad s_2^2 = \frac{n_2 S_2^2}{n_2-1}.$$
The degrees of freedom of these estimates are v1 = n1 − 1, v2 = n2 − 1.
To test whether these estimates $s_1^2$ and $s_2^2$ are significantly different, or whether the samples may be regarded as drawn from the same population or from two populations with the same variance σ², we set up the null hypothesis H0 : $\sigma_1^2 = \sigma_2^2 = \sigma^2$.
I.e., the independent estimates of the common population variance do not differ significantly.
To carry out the test of significance of the difference of the variances we calculate the test statistic
$$F = \frac{s_1^2}{s_2^2},$$
where the numerator (Nr) is greater than the denominator (Dr), i.e., $s_1^2 > s_2^2$.
Conclusion. If the calculated value of F exceeds F0.05 for (n1 – 1), (n2 – 1) degrees of freedom given in the table, we conclude that the ratio is significant at the 5% level.
I.e., we conclude that the samples could not have come from two normal populations with the same variance.
The assumptions on which the F-test is based are:
1. The populations for each sample must be normally distributed.
2. The samples must be random and independent.
3. The ratio of σ 12 to σ 22 should be equal to 1 or greater than 1. That is why we take the
larger variance in the numerator of the ratio.
Applications. The F-test is used to test
(i) whether two independent samples have been drawn from the normal populations with
the same variance σ 2 .
(ii) Whether the two independent estimates of the population variance are homogeneous or
not.
ILLUSTRATIVE EXAMPLES
Example 1. In two independent samples of sizes 8 and 10 the sum of squares of deviations
of the sample values from the respective sample means were 84.4 and 102.6. Test whether the
difference of variances of the populations is significant or not.
Sol. Null hypothesis H0 : $\sigma_1^2 = \sigma_2^2 = \sigma^2$, i.e., there is no significant difference between the population variances.
Under H0 : $F = \dfrac{s_1^2}{s_2^2} \sim F(v_1, v_2\ \text{d.f.})$
where v1 = n1 – 1, n1 = Sample I size = 8; v2 = n2 – 1, n2 = Sample II size = 10.
$$\Sigma(X_1-\bar{X}_1)^2 = 84.4; \quad \Sigma(X_2-\bar{X}_2)^2 = 102.6$$
$$s_1^2 = \frac{\Sigma(X_1-\bar{X}_1)^2}{n_1-1} = \frac{84.4}{7} = 12.057; \quad s_2^2 = \frac{\Sigma(X_2-\bar{X}_2)^2}{n_2-1} = \frac{102.6}{9} = 11.4$$
$$\because\ s_1^2 > s_2^2, \quad F = \frac{s_1^2}{s_2^2} = \frac{12.057}{11.4} = 1.0576.$$
Conclusion. The tabulated value of F at 5% level of significance for (7, 9) d.f. is 3.29.
∴ F0.05 = 3.29 and F = 1.0576 < 3.29 = F0.05 ⇒ H0 is accepted.
∴ There is no significant difference between the variance of the populations.
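The variance ratio of Example 1 can be computed as below. This is a minimal sketch with a hypothetical helper name, `variance_ratio`; it takes the sums of squared deviations as given and always places the larger estimate in the numerator, returning the degrees of freedom in the matching order.

```python
def variance_ratio(ss1, n1, ss2, n2):
    """F statistic from sums of squared deviations about the sample means.

    Returns (F, (numerator d.f., denominator d.f.)) with the larger
    variance estimate in the numerator.
    """
    v1 = ss1 / (n1 - 1)  # unbiased estimate from sample I
    v2 = ss2 / (n2 - 1)  # unbiased estimate from sample II
    if v1 >= v2:
        return v1 / v2, (n1 - 1, n2 - 1)
    return v2 / v1, (n2 - 1, n1 - 1)

# Figures of Example 1
F, dfs = variance_ratio(84.4, 8, 102.6, 10)
print(round(F, 4), dfs)
```

Returning the degrees of freedom alongside F guards against looking up the table with the two values interchanged when the second sample has the larger variance.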
Example 2. Two random samples are drawn from two normal populations as follows:
A:   17   27   18   25   27   29   13   17
B:   16   16   20   27   26   25   21
Test whether the samples are drawn from the same normal population.
Sol. To test whether two independent samples have been drawn from the same population
we have to test (i) equality of the means by applying the t-test and (ii) equality of the population
variance by applying the F-test.
Since the t-test assumes that the population variances are equal, we shall first apply the F-test.
F-test. Null hypothesis H0 : $\sigma_1^2 = \sigma_2^2$, i.e., the population variances do not differ significantly.
Alternative hypothesis. H1 : $\sigma_1^2 \neq \sigma_2^2$
Test statistic: $F = \dfrac{s_1^2}{s_2^2}$ (if $s_1^2 > s_2^2$)
Computations for $s_1^2$ and $s_2^2$:
$X_1$    $X_1 - \bar{X}_1$    $(X_1 - \bar{X}_1)^2$    $X_2$    $X_2 - \bar{X}_2$    $(X_2 - \bar{X}_2)^2$
17       – 4.625              21.39                    16       – 5.571              31.04
27       5.375                28.89                    16       – 5.571              31.04
18       – 3.625              13.14                    20       – 1.571              2.47
25       3.375                11.39                    27       5.429                29.47
27       5.375                28.89                    26       4.429                19.61
29       7.375                54.39                    25       3.429                11.76
13       – 8.625              74.39                    21       – 0.571              0.33
17       – 4.625              21.39
$$\bar{X}_1 = 21.625; \quad n_1 = 8; \quad \Sigma(X_1-\bar{X}_1)^2 = 253.87$$
$$\bar{X}_2 = 21.571; \quad n_2 = 7; \quad \Sigma(X_2-\bar{X}_2)^2 = 125.71$$
$$s_1^2 = \frac{\Sigma(X_1-\bar{X}_1)^2}{n_1-1} = \frac{253.87}{7} = 36.267; \quad s_2^2 = \frac{\Sigma(X_2-\bar{X}_2)^2}{n_2-1} = \frac{125.71}{6} = 20.952$$
$$F = \frac{s_1^2}{s_2^2} = \frac{36.267}{20.952} = 1.731.$$
Conclusion. The table value of F for v1 = 7 and v2 = 6 degrees of freedom at 5% level is 4.21. The calculated value of F is less than the tabulated value of F. ∴ H0 is accepted. Hence we conclude that the variability in the two populations is the same.
t-test: Null hypothesis. H0 : μ1 = μ2, i.e., the population means are equal.
Alternative hypothesis. H1 : μ1 ≠ μ2
Test statistic:
$$s^2 = \frac{\Sigma(X_1-\bar{X}_1)^2 + \Sigma(X_2-\bar{X}_2)^2}{n_1+n_2-2} = \frac{253.87 + 125.71}{8+7-2} = 29.198 \quad \therefore\ s = 5.404$$
$$t = \frac{\bar{X}_1-\bar{X}_2}{s\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}} = \frac{21.625-21.571}{5.404\sqrt{\dfrac{1}{8}+\dfrac{1}{7}}} = 0.019 \sim t\,(n_1+n_2-2)\ \text{d.f.}$$
Conclusion. The tabulated value of t at 5% level of significance for 13 d.f. is 2.16.
The calculated value of t is less than the tabulated value. H0 is accepted, i.e., there is no significant difference between the population means, i.e., μ1 = μ2. ∴ We conclude that the two samples have been drawn from the same normal population.
Example 3. Two independent samples of sizes 7 and 6 had the following values:
Sample A:   28   30   32   33   31   29   34
Sample B:   29   30   30   24   27   28
Examine whether the samples have been drawn from normal populations having the same
variance.
Sol. H0 : The variances are equal, i.e., σ 12 = σ 22 .
I.e., the samples have been drawn from normal populations with the same variance.
H1 : σ 12 ≠ σ 22
Under the null hypothesis, the test statistic is $F = \dfrac{s_1^2}{s_2^2}$ $(s_1^2 > s_2^2)$.
Computations for $s_1^2$ and $s_2^2$:
$X_1$    $X_1 - \bar{X}_1$    $(X_1 - \bar{X}_1)^2$    $X_2$    $X_2 - \bar{X}_2$    $(X_2 - \bar{X}_2)^2$
28       –3                   9                        29       1                    1
30       –1                   1                        30       2                    4
32       1                    1                        30       2                    4
33       2                    4                        24       –4                   16
31       0                    0                        27       –1                   1
29       –2                   4                        28       0                    0
34       3                    9
Total                         28                                                     26
$$\bar{X}_1 = 31,\ n_1 = 7;\ \Sigma(X_1-\bar{X}_1)^2 = 28; \qquad \bar{X}_2 = 28,\ n_2 = 6;\ \Sigma(X_2-\bar{X}_2)^2 = 26$$
$$s_1^2 = \frac{\Sigma(X_1-\bar{X}_1)^2}{n_1-1} = \frac{28}{6} = 4.666; \quad s_2^2 = \frac{\Sigma(X_2-\bar{X}_2)^2}{n_2-1} = \frac{26}{5} = 5.2$$
$$F = \frac{s_2^2}{s_1^2} = \frac{5.2}{4.666} = 1.114 \quad (\because\ s_2^2 > s_1^2)$$
Conclusion. The tabulated value of F at v1 = 6 – 1 and v2 = 7 – 1 d.f. for 5% level of significance is 4.39.
Since the calculated value of F is less than the tabulated value, H0 is accepted, i.e., there is no significant difference between the variances, i.e., the samples have been drawn from normal populations with the same variance.
Example 4. The two random samples reveal the following data:
Sample no.    Size    Mean    Variance
I             16      440     40
II            25      460     42
Test whether the samples come from the same normal population.
Sol. A normal population has two parameters, namely, the mean μ and the variance σ 2 . To
test whether the two independent samples have been drawn from the same normal population, we
have to test
(i) the equality of means
(ii) the equality of variance.
Since the t-test assumes that the population variances are equal, we first apply the F-test.
F-test. Null hypothesis. H0 : $\sigma_1^2 = \sigma_2^2$, i.e., the population variances do not differ significantly.
Alternative hypothesis. H1 : $\sigma_1^2 \neq \sigma_2^2$
Under the null hypothesis the test statistic is the ratio of the two estimates of the population variance, with the larger estimate in the numerator.
Given n1 = 16, n2 = 25 and sample variances 40 and 42, the estimates of the population variance are
$$\frac{16\times 40}{16-1} = 42.667 \quad\text{and}\quad \frac{25\times 42}{25-1} = 43.75.$$
$$\therefore\ F = \frac{43.75}{42.667} = 1.025.$$
Conclusion. The calculated value of F is 1.025. The tabulated value of F at (25 – 1, 16 – 1) d.f. for 5% level of significance is 2.29.
Since the calculated value is less than that of the tabulated value, H0 is accepted, i.e., the
population variances are equal.
t-test. Null hypothesis. H0 : μ1 = μ2 , i.e., the population means are equal.
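Alternative hypothesis. H1 : μ1 ≠ μ2.
Given: n1 = 16, n2 = 25, $\bar{X}_1 = 440$, $\bar{X}_2 = 460$, sample variances 40 and 42.
The pooled estimate of the common variance is
$$s^2 = \frac{16\times 40 + 25\times 42}{16+25-2} = 43.333 \quad \therefore\ s = 6.582$$
Under the null hypothesis the test statistic is
$$t = \frac{\bar{X}_1-\bar{X}_2}{s\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}} = \frac{440-460}{6.582\sqrt{\dfrac{1}{16}+\dfrac{1}{25}}} = -9.490 \quad \text{for } (n_1+n_2-2)\ \text{d.f.}$$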
Conclusion. The calculated value of t is 9.490. The tabulated value of t at 39 d.f. for 5%
level of significance is 1.96.
Since the calculated value is greater than the tabulated value, H0 is rejected.
I.e., there is a significant difference between the means, i.e., μ1 ≠ μ2 .
Since there is a significant difference between the means, and no significant difference
between the variances, we conclude that the samples do not come from the same normal
population.
Example 5. Two random samples drawn from two normal populations have the variable
values as below:
Sample I:    19   17   16   28   22   23   19   24   26
Sample II:   28   32   40   37   30   35   40   28   41   45   30   36
Obtain the estimate of the variance of the population and test whether the two populations
have the same variance.
Sol. $\bar{X}_1 = \dfrac{\Sigma X_1}{n_1} = 21.55$; n1 = 9; $\bar{X}_2 = \dfrac{\Sigma X_2}{n_2} = 35.166$; n2 = 12
$X_1$    $d_1 = X_1 - 17$    $d_1^2$    $X_2$    $d_2 = X_2 - 28$    $d_2^2$
19       2                   4          28       0                   0
17       0                   0          32       4                   16
16       –1                  1          40       12                  144
28       11                  121        37       9                   81
22       5                   25         30       2                   4
23       6                   36         35       7                   49
19       2                   4          40       12                  144
24       7                   49         28       0                   0
26       9                   81         41       13                  169
                                        45       17                  289
                                        30       2                   4
                                        36       8                   64
$$\Sigma d_1^2 = 321; \quad \Sigma d_2^2 = 964$$
With A denoting the assumed mean (17 for Sample I, 28 for Sample II),
$$s_1^2 = \frac{\Sigma(X_1-\bar{X}_1)^2}{n_1-1} = \frac{\Sigma d_1^2 - n_1(\bar{X}_1-A)^2}{n_1-1} = \frac{321 - 9(21.55-17)^2}{9-1} = 16.834$$
$$s_2^2 = \frac{\Sigma(X_2-\bar{X}_2)^2}{n_2-1} = \frac{\Sigma d_2^2 - n_2(\bar{X}_2-A)^2}{n_2-1} = \frac{964 - 12(35.166-28)^2}{12-1} = 31.616$$
$$F = \frac{s_2^2}{s_1^2} = \frac{31.616}{16.834} = 1.878 \quad (\because\ s_2^2 > s_1^2)$$
Conclusion. The calculated value of F is 1.878. The tabulated value of F for v2 = 12 – 1 =
11, v1 = 9 – 1 = 8 d.f. at 5% level of significance is 3.315. Since the calculated value of F is
less than the tabulated value, H0 is accepted, i.e., there is no significant difference between the
population variance, i.e., the two populations have the same variance.
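The assumed-mean shortcut used in Example 5, Σ(X − X̄)² = Σd² − n(X̄ − A)² with d = X − A, can be checked numerically. The sketch below uses our own helper name, `variance_estimate`, and works with unrounded means, so its figures may differ from the hand computation in the third significant digit.

```python
def variance_estimate(values, assumed_mean):
    """Return (sum of d^2, unbiased variance estimate), where d = X - A."""
    n = len(values)
    mean = sum(values) / n
    sum_d_sq = sum((v - assumed_mean) ** 2 for v in values)
    # Shortcut: sum (X - mean)^2 = sum d^2 - n (mean - A)^2
    ss = sum_d_sq - n * (mean - assumed_mean) ** 2
    return sum_d_sq, ss / (n - 1)

sample_1 = [19, 17, 16, 28, 22, 23, 19, 24, 26]
sample_2 = [28, 32, 40, 37, 30, 35, 40, 28, 41, 45, 30, 36]

sum_d1_sq, s1_sq = variance_estimate(sample_1, 17)
sum_d2_sq, s2_sq = variance_estimate(sample_2, 28)

# Larger estimate in the numerator
F = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)
print(sum_d1_sq, sum_d2_sq, round(s1_sq, 3), round(s2_sq, 3), round(F, 3))
```

The choice of A does not affect the result; it only keeps the squared deviations small enough to handle comfortably by hand.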
TEST YOUR KNOWLEDGE
1. From the following two sample values find out whether they have come from the same population:
Sample 1:   17   27   18   25   27   29   27   23   17
Sample 2:   16   16   20   16   20   17   15   21
2. The daily wages in dollars of skilled workers in two cities are as follows:
           Size of sample of workers    S.D. of wages in the sample
City A     160                          250
City B     130                          320
3. The standard deviation calculated from two random samples of sizes 9 and 13 are 2.1 and 1.8
respectively. May the samples be regarded as drawn from normal populations with the same standard
deviation?
4. Two independent samples of size 8 and 9 had the following values of the variables:
Sample I:    20   30   23   25   21   22   23   24
Sample II:   30   31   32   34   35   29   28   27   26
Do the estimates of the population variance differ significantly?
Answers
1. rejected
2. accepted
3. accepted
4. accepted
________________________________________________________________________________________________________
21.82
CHI-SQUARE ( χ2 ) TEST
When a coin is tossed 200 times, the theoretical considerations lead us to expect 100 heads
and 100 tails. But in practice, these results are rarely achieved. The quantity χ2 (the Greek letter
chi squared, pronounced chi-square) describes the magnitude of discrepancy between theory and
observation. If χ2 = 0, the observed and expected frequencies completely coincide. The greater the
discrepancy between the observed and expected frequencies, the greater the value of χ2. Thus χ2
affords a measure of the correspondence between theory and observation.
If Oi (i = 1, 2, . . . , n) is a set of observed (experimental) frequencies and Ei (i = 1, 2, . . . , n) is the corresponding set of expected (theoretical or hypothetical) frequencies, then χ2 is defined as
$$\chi^2 = \sum_{i=1}^{n}\left[\frac{(O_i-E_i)^2}{E_i}\right]$$
where ΣOi = ΣEi = N (total frequency) and degrees of freedom (d.f.) = (n – 1).
Note.
(i) If χ2 = 0, the observed and theoretical frequencies agree exactly.
(ii) If χ2 > 0, they do not agree exactly.
21.82.1
Degrees of Freedom
While comparing the calculated value of χ2 with the table value, we have to determine the
degrees of freedom.
If we have to choose any four numbers whose sum is 50, we can exercise our independent
choice for any three numbers only, the fourth being 50 minus the total of the three numbers
selected. Thus, though we are to choose any four numbers, our choice is reduced to three because
of an imposed condition. There is only one restraint on our freedom and our degrees of freedom
are 4 – 1 = 3. If two restrictions are imposed, our freedom to choose will be further curtailed and
the degrees of freedom will be 4 – 2 = 2.
In general, the number of degrees of freedom is the total number of observations less the
number of independent constraints imposed on the observations. Degrees of freedom (d.f.) are
usually denoted by ν (the letter nu of the Greek alphabet).
Thus, ν = n – k, where k is the number of independent constraints in a set of data of n
observations.
Note. (i) For a p × q contingency table ( p columns and q rows), ν = ( p – 1) (q – 1).
(ii) In the case of a contingency table, the expected frequency of any class
$$= \frac{\text{Total of row in which it occurs} \times \text{Total of column in which it occurs}}{\text{Total number of observations}}$$
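The rule for expected frequencies in a contingency table can be sketched as follows. The 2 × 2 table of observed frequencies is hypothetical, chosen only for illustration, and the function name `expected_frequencies` is our own.

```python
def expected_frequencies(table):
    """Expected cell frequencies and degrees of freedom for a contingency table.

    table is a list of rows of observed frequencies.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected frequency = (row total x column total) / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    # d.f. = (number of rows - 1) x (number of columns - 1)
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, dof

observed = [[10, 20], [30, 40]]  # hypothetical data, not from the text
expected, dof = expected_frequencies(observed)
print(expected, dof)
```

By construction the expected table has the same row and column totals as the observed table, which is why only ( p – 1)(q – 1) cells can be chosen freely.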
The χ2 test is one of the simplest and the most general tests known. It is applicable to a very
large number of problems in practice, which can be summed up under the following heads:
(i) as a test of goodness of fit.
(ii) as a test of independence of attributes.
(iii) as a test of homogeneity of independent estimates of the population variance.
(iv) as a test of the hypothetical value of the population variance σ 2 .
(v) as a test of the homogeneity of independent estimates of the population correlation
coefficient.
21.82.2
Conditions for Applying the χ2 Test
Following are the conditions that should be satisfied before the χ 2 test can be applied.
(a) N, the total number of frequencies, should be large. It is difficult to say what constitutes
largeness, but as an arbitrary figure, we may say that N should be at least 50, however few the
cells.
(b) No theoretical cell-frequency should be small. Here again, it is difficult to say what
constitutes smallness, but 5 should be regarded as the very minimum and 10 is better. If small
theoretical frequencies occur (i.e., < 10), the difficulty is overcome by grouping two or more
classes together before calculating (O – E). It is important to remember that the number of
degrees of freedom is determined with the number of classes after regrouping.
(c) The constraints on the cell frequencies, if any, should be linear.
Note. If any one of the theoretical frequencies is less than 5, we apply a correction given by F. Yates, usually known as “Yates’s correction for continuity”: we add 0.5 to the cell frequency that is less than 5 and adjust the remaining cell frequencies suitably so that the marginal totals are not changed.
21.82.3
The χ2 Distribution
For large sample sizes, the sampling distribution of χ2 can be closely approximated by a
continuous curve known as the chi-square distribution. The probability function of χ2 distribution
is given by
$$f(\chi^2) = c\,(\chi^2)^{(\nu/2-1)}\, e^{-\chi^2/2}$$
where e = 2.71828, ν = number of degrees of freedom; c = a constant depending only on ν .
Symbolically, the degrees of freedom are denoted by the symbol ν or by d.f. and are
obtained by the rule ν = n – k, where k refers to the number of independent constraints.
In general, when we fit a binomial distribution the number of degrees of freedom is one less
than the number of classes; when we fit a Poisson distribution, the degrees of freedom are 2 less
than the number of classes, because we use the total frequency and the arithmetic mean to get the
parameter of the Poisson distribution. When we fit a normal curve, the number of degrees of
freedom are 3 less than the number of classes, because in this fitting we use the total frequency,
mean, and standard deviation.
If the data is given in a series of “n” numbers then degrees of freedom = n – 1.
In the case of Binomial distribution d.f. = n – 1.
In the case of Poisson distribution d.f. = n – 2.
In the case of Normal distribution d.f. = n – 3.
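The probability function above can be evaluated numerically. The text leaves the constant c unspecified; the sketch below assumes the standard normalizing value c = 1/(2^(ν/2) Γ(ν/2)), which is not stated in the text, and checks by a crude midpoint-rule integration that the total area under the curve is close to 1. The function names are our own.

```python
import math

def chi2_pdf(x, nu):
    """Chi-square density with nu degrees of freedom, for x > 0.

    Assumption: c = 1 / (2^(nu/2) * Gamma(nu/2)), the standard constant.
    """
    c = 1.0 / (2 ** (nu / 2) * math.gamma(nu / 2))
    return c * x ** (nu / 2 - 1) * math.exp(-x / 2)

def area_under_curve(nu, upper=100.0, steps=100_000):
    """Midpoint-rule approximation of the integral of the density on (0, upper)."""
    h = upper / steps
    return h * sum(chi2_pdf((i + 0.5) * h, nu) for i in range(steps))

print(round(chi2_pdf(2.0, 4), 5))
print(round(area_under_curve(5), 4))
```

Tabulated values such as χ2 at the 5% level are the points beyond which 5% of this area lies.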
21.82.4
The χ2 Test as a Test of Goodness of Fit
The χ2 test enables us to ascertain how well the theoretical distributions such as Binomial,
Poisson, or Normal, etc. fit empirical distributions, i.e., distributions obtained from sample data.
If the calculated value of χ2 is less than the table value at a specified level (generally 5%) of
significance, the fit is considered to be good, i.e., the divergence between actual and expected
frequencies is attributed to fluctuations of simple sampling. If the calculated value of χ2 is greater
than the table value, the fit is considered to be poor.
ILLUSTRATIVE EXAMPLES
Example 1. The following table gives the number of accidents that took place in an industry
during various days of the week. Test whether accidents are uniformly distributed over the week.
Day:                Mon   Tue   Wed   Thu   Fri   Sat
No. of accidents:   14    18    12    11    15    14
Sol. Null hypothesis H0. The accidents are uniformly distributed over the week.
Under this H0, the expected frequency of accidents on each of these days = 84/6 = 14.
Observed frequency Oi:   14   18   12   11   15   14
Expected frequency Ei:   14   14   14   14   14   14
(Oi − Ei)²:              0    16   4    9    1    0
$$\chi^2 = \frac{\Sigma(O_i-E_i)^2}{E_i} = \frac{30}{14} = 2.1428.$$
Conclusion. Table value of χ2 at 5% level for (6 – 1 = 5 d.f.) is 11.07.
Since the calculated value of χ2 is less than the tabulated value, H0 is accepted, i.e., the
accidents are uniformly distributed over the week.
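The goodness-of-fit statistic of Example 1 can be computed directly from the definition of χ2. This is a minimal sketch; the function name `chi_square` is our own.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Accidents on Mon..Sat from Example 1
accidents = [14, 18, 12, 11, 15, 14]

# Under H0 (uniform distribution) every day has the same expected frequency
expected = [sum(accidents) / len(accidents)] * len(accidents)

chi2 = chi_square(accidents, expected)
dof = len(accidents) - 1  # one constraint: the totals must agree
print(round(chi2, 4), dof)
```

The computed value is then compared with the tabulated χ2 at the chosen level of significance for the stated degrees of freedom.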
Example 2. A die is thrown 276 times and the results of these throws are given below:
No. appeared on the die:   1    2    3    4    5    6
Frequency:                 40   32   29   59   57   59
Test whether the die is biased or not.
Sol. Null hypothesis H0. Die is unbiased.
Under this H0, the expected frequency for each face is 276/6 = 46.
To find the value of χ2:
Oi:           40   32    29    59    57    59
Ei:           46   46    46    46    46    46
(Oi − Ei)²:   36   196   289   169   121   169
$$\chi^2 = \frac{\Sigma(O_i-E_i)^2}{E_i} = \frac{980}{46} = 21.30.$$
Conclusion. The tabulated value of χ2 at 5% level of significance for (6 – 1 = 5) d.f. is 11.07. Since the calculated value of χ2 = 21.30 > 11.07, the tabulated value, H0 is rejected.
I.e., the die is biased.
Example 3. The following table shows the distribution of digits in numbers chosen at
random from a telephone directory:
Digits          0      1     2     3      4     5      6     7     8     9
Frequency    1026   1107   997   966   1075   933   1107   972   964   853
Test whether the digits may be taken to occur equally frequently in the directory.
Sol. Null hypothesis H0. The digits taken in the directory occur with equal frequency, i.e.,
there is no significant difference between the observed and expected frequency.
Under H0, the expected frequency of each digit is 10,000/10 = 1000.
To find the value of χ2

Oi             1026    1107    997    966   1075    933    1107    972    964     853
Ei             1000    1000   1000   1000   1000   1000    1000   1000   1000    1000
(Oi − Ei)^2     676   11449      9   1156   5625   4489   11449    784   1296   21609

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{58542}{1000} = 58.542.$$
Conclusion. The tabulated value of χ2 at 5% level of significance for 9 d.f. is 16.919. Since
the calculated value of χ2 is greater than the tabulated value, H0 is rejected.
I.e., there is a significant difference between the observed and theoretical frequency.
I.e., the digits taken in the directory do not occur with equal frequency.
Example 4. Records taken of the number of male and female births in 800 families having
four children are as follows:
No. of male births       0     1     2     3     4
No. of female births     4     3     2     1     0
No. of families         32   178   290   236    94
Test whether the data are consistent with the hypothesis that the binomial law holds and the
chance of male birth is equal to that of female birth, namely p = q = 1/2.
Sol. H0 : The data are consistent with the hypothesis of equal probability for male and
female births, i.e., p = q = 1/2.
We use the binomial distribution to calculate the theoretical frequencies, given by
$$N(r) = N \times P(X = r),$$
where N is the total frequency and N(r) is the number of families with r male children, with
$$P(X = r) = {}^{n}C_r \, p^r q^{n-r},$$
where p and q are the probabilities of male and female births and n is the number of children.
$$N(0) = \text{No. of families with 0 male children} = 800 \times {}^{4}C_0 \left(\frac{1}{2}\right)^4 = 800 \times 1 \times \frac{1}{2^4} = 50$$
$$N(1) = 800 \times {}^{4}C_1 \left(\frac{1}{2}\right)^1 \left(\frac{1}{2}\right)^3 = 200; \qquad N(2) = 800 \times {}^{4}C_2 \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^2 = 300$$
$$N(3) = 800 \times {}^{4}C_3 \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^1 = 200; \qquad N(4) = 800 \times {}^{4}C_4 \left(\frac{1}{2}\right)^4 \left(\frac{1}{2}\right)^0 = 50$$
Observed frequency Oi     32      178    290      236     94
Expected frequency Ei     50      200    300      200     50
(Oi − Ei)^2               324     484    100      1296    1936
(Oi − Ei)^2 / Ei          6.48    2.42   0.333    6.48    38.72

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 54.433.$$
Conclusion. The table value of χ2 at 5% level of significance for 5 – 1 = 4 d.f. is 9.49.
Since the calculated value of χ2 is greater than the tabulated value, H0 is rejected,
i.e., the data are not consistent with the hypothesis that the binomial law holds with the
chance of a male birth equal to that of a female birth.
Note. Since p is specified by the hypothesis and not estimated from the data, the degrees of
freedom are ν = k – 1, where k is the number of classes, i.e., ν = 5 – 1 = 4.
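The binomial frequencies N(r) of Example 4 can be reproduced as follows. This is a sketch only: `binomial_expected` is a name introduced here, and the total of 800 families is taken from the statement of the example.

```python
from math import comb

def binomial_expected(total, n, p):
    """Expected frequencies N(r) = total * C(n, r) * p^r * (1 - p)^(n - r), r = 0..n."""
    return [total * comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)]

total_families = 800                      # as stated in Example 4
observed = [32, 178, 290, 236, 94]        # families with 0, 1, 2, 3, 4 male births
expected = binomial_expected(total_families, n=4, p=0.5)

# Chi-square statistic: sum of (O - E)^2 / E over the five classes.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1                   # no parameter is estimated from the data

print(expected)
print(round(chi2, 3), dof)
```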
Example 5. Verify whether the Poisson distribution can be assumed from the data given
below:
No. of defects     0     1     2     3     4     5
Frequency          6    13    13     8     4     3
Sol. H0 : The Poisson fit is a good fit to the data.
$$\text{Mean of the given distribution} = \frac{\sum f_i x_i}{\sum f_i} = \frac{94}{47} = 2$$
To fit a Poisson distribution we require the parameter m. Here m = x̄ = 2.
By the Poisson distribution, the frequency of r successes is
$$N(r) = N \times e^{-m} \cdot \frac{m^r}{r!},$$
where N is the total frequency.
$$N(0) = 47 \times e^{-2} \cdot \frac{(2)^0}{0!} = 6.36 \approx 6; \qquad N(1) = 47 \times e^{-2} \cdot \frac{(2)^1}{1!} = 12.72 \approx 13$$
$$N(2) = 47 \times e^{-2} \cdot \frac{(2)^2}{2!} = 12.72 \approx 13; \qquad N(3) = 47 \times e^{-2} \cdot \frac{(2)^3}{3!} = 8.48 \approx 9$$
$$N(4) = 47 \times e^{-2} \cdot \frac{(2)^4}{4!} = 4.24 \approx 4; \qquad N(5) = 47 \times e^{-2} \cdot \frac{(2)^5}{5!} = 1.696 \approx 2.$$

X                    0         1          2          3          4         5
Oi                   6         13         13         8          4         3
Ei                   6.36      12.72      12.72      8.48       4.24      1.696
(Oi − Ei)^2 / Ei     0.0204    0.00616    0.00616    0.02717    0.0136    1.0026

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 1.0761.$$
Conclusion. The calculated value of χ2 is 1.0761. The tabulated value of χ2 at 5% level of
significance for ν = 6 – 2 = 4 d.f. is 9.49. Since the calculated value of χ2 is less than the
tabulated value, H0 is accepted, i.e., the Poisson distribution provides a good fit to the data.
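The fitting steps of Example 5 (estimate m by the sample mean, then compute N(r)) can be sketched in Python as below. The helper name `poisson_expected` is an assumption of this sketch, not a library function.

```python
from math import exp, factorial

def poisson_expected(total, mean, max_count):
    """Expected frequencies N(r) = total * e^(-m) * m^r / r! for r = 0..max_count."""
    return [total * exp(-mean) * mean ** r / factorial(r) for r in range(max_count + 1)]

defects = [0, 1, 2, 3, 4, 5]
freq = [6, 13, 13, 8, 4, 3]

total = sum(freq)                                          # N = 47
mean = sum(x * f for x, f in zip(defects, freq)) / total   # estimate of m
expected = poisson_expected(total, mean, max(defects))

# One parameter (m) is estimated from the data, so one further degree of
# freedom is lost: d.f. = number of classes - 1 - 1.
dof = len(freq) - 1 - 1

print(mean, dof)
print([round(e, 3) for e in expected])
```

Because m is estimated from the sample, the degrees of freedom are 6 – 2 = 4 rather than 6 – 1 = 5.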
Example 6. The theory predicts the proportion of beans in the four groups, G1, G2, G3, G4
should be in the ratio 9 : 3 : 3 : 1. In an experiment with 1600 beans the numbers in the four
groups were 882, 313, 287, and 118. Does the experimental result support the theory?
Sol. H0. The experimental result supports the theory, i.e., there is no significant difference
between the observed and theoretical frequencies. Under H0, the theoretical frequencies can be
calculated as follows:
$$E(G_1) = \frac{1600 \times 9}{16} = 900; \qquad E(G_2) = \frac{1600 \times 3}{16} = 300;$$
$$E(G_3) = \frac{1600 \times 3}{16} = 300; \qquad E(G_4) = \frac{1600 \times 1}{16} = 100$$
To calculate the value of χ2

Observed frequency Oi    882     313       287       118
Expected frequency Ei    900     300       300       100
(Oi − Ei)^2 / Ei         0.36    0.5633    0.5633    3.24

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 4.7266.$$
Conclusion. The table value of χ2 at 5% level of significance for 3 d.f. is 7.815. Since the
calculated value of χ2 is less than the tabulated value, H0 is accepted.
I.e., the experimental results support the theory.
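When the theory specifies a ratio, as in Example 6, the expected frequencies follow by splitting the total in that ratio. A minimal sketch, with `expected_from_ratio` a hypothetical helper name:

```python
def expected_from_ratio(total, ratio):
    """Split `total` in the proportions given by `ratio`, e.g. 9 : 3 : 3 : 1."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]

observed = [882, 313, 287, 118]                      # groups G1..G4
expected = expected_from_ratio(sum(observed), [9, 3, 3, 1])

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1                              # 4 classes, nothing estimated

print(expected)
print(round(chi2, 4), dof)
```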
TEST YOUR KNOWLEDGE
1. The following table gives the frequency of occurrence of the digits 0, 1, . . . , 9 in the last place in
four-figure logarithms of numbers 10–99. Examine whether there is any peculiarity.

Digits    :    0    1    2    3    4    5    6    7    8    9
Frequency :    6   16   15   10   12   12    3    2    9    5

2. The sales in a supermarket during a week are given below. Test the hypothesis that the sales do not
depend on the day of the week, using a significance level of 0.05.

Days              :   Mon   Tues   Wed   Thurs   Fri   Sat
Sales (in $10000) :    65     54    60      56    71    84

3. A survey of 320 families with 5 children each revealed the following information:

No. of boys     :     5     4     3     2     1     0
No. of girls    :     0     1     2     3     4     5
No. of families :    14    56   110    88    40    12

Is this result consistent with the hypothesis that male and female births are equally probable?
4. Four coins were tossed at a time and this operation was repeated 160 times. It is found that 4 heads occur 6
times, 3 heads occur 43 times, 2 heads occur 69 times, and one head occurs 34 times. Discuss whether the
coins may be regarded as unbiased.
5. Fit a Poisson distribution to the following data and test the goodness of fit:

x :     0    1    2    3    4
f :   109   65   22    3    1

6. In the accounting department of a bank, 100 accounts are selected at random and examined for errors.
The following results were obtained:

No. of errors   :    0    1    2    3    4    5    6
No. of accounts :   35   40   19    2    0    2    2

Does this information verify that the errors are distributed according to the Poisson probability law?
7. In a sample analysis of examination results of 500 students, it was found that 280 students have failed,
170 have gotten C’s, 90 have gotten B’s, and the rest, A’s. Do these figures support the general belief
that the above categories are in the ratio 4 : 3 : 2 : 1 respectively?
Answers
1. no
2. accepted
3. accepted
4. unbiased
5. Poisson law fits the data
6. maybe
7. yes
________________________________________________________________________________________________________
21.82.5
The χ2 Test as a Test of Independence
With the help of the χ2 test, we can find whether or not two attributes are associated. We
take the null hypothesis that there is no association between the attributes under study, i.e., we
assume that the two attributes are independent. If the calculated value of χ2 is less than the
table value at a specified level (generally 5%) of significance, the hypothesis holds true, i.e., the
attributes are independent and do not bear any association. On the other hand, if the calculated
value of χ2 is greater than the table value at a specified level of significance, we say that the
results of the experiment do not support the hypothesis. In other words, the attributes are
associated. Thus a very useful application of the χ2 test is to investigate the relationship between
trials or attributes, which can be classified into two or more categories.
The sample data are set out into a two-way table, called a contingency table.
Let us consider two attributes A and B, with A divided into r classes A1, A2, A3, . . . , Ar and B
divided into s classes B1, B2, B3, . . . , Bs. Let (Ai), (Bj) represent the number of people possessing
the attributes Ai, Bj respectively (i = 1, 2, . . . , r; j = 1, 2, . . . , s), and let (Ai Bj) represent the
number of people possessing both attributes Ai and Bj. Also we have
$$\sum_{i=1}^{r} (A_i) = \sum_{j=1}^{s} (B_j) = N,$$
where N is the total frequency. The r × s contingency table is given below:
B \ A     A1        A2        A3        . . .   Ar        Total
B1        (A1B1)    (A2B1)    (A3B1)    . . .   (ArB1)    (B1)
B2        (A1B2)    (A2B2)    (A3B2)    . . .   (ArB2)    (B2)
B3        (A1B3)    (A2B3)    (A3B3)    . . .   (ArB3)    (B3)
...       ...       ...       ...       . . .   ...       ...
Bs        (A1Bs)    (A2Bs)    (A3Bs)    . . .   (ArBs)    (Bs)
Total     (A1)      (A2)      (A3)      . . .   (Ar)      N
H0 : Both the attributes are independent, i.e., A and B are independent. Under the null
hypothesis, we calculate the expected frequencies as follows:
$$P(A_i) = \text{Probability that a person possesses the attribute } A_i = \frac{(A_i)}{N}, \quad i = 1, 2, \ldots, r$$
$$P(B_j) = \text{Probability that a person possesses the attribute } B_j = \frac{(B_j)}{N}, \quad j = 1, 2, \ldots, s$$
$$P(A_i B_j) = \text{Probability that a person possesses both attributes } A_i \text{ and } B_j = \frac{(A_i B_j)}{N}$$
If (Ai Bj)0 is the expected number of people possessing both the attributes Ai and Bj, then
$$(A_i B_j)_0 = N P(A_i B_j) = N P(A_i) P(B_j) \quad (\because A \text{ and } B \text{ are independent})$$
$$= N \cdot \frac{(A_i)}{N} \cdot \frac{(B_j)}{N} = \frac{(A_i)(B_j)}{N}$$
Hence
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \left[ \frac{\left[ (A_i B_j) - (A_i B_j)_0 \right]^2}{(A_i B_j)_0} \right]$$
which is distributed as a χ2 variate with (r – 1)(s – 1) degrees of freedom.
Note 1. For a 2 × 2 contingency table where the frequencies are
    a   b
    c   d
χ2 can be calculated directly from the cell frequencies as
$$\chi^2 = \frac{(a + b + c + d)(ad - bc)^2}{(a + b)(c + d)(b + d)(a + c)}.$$
Note 2. If the contingency table is not 2 × 2, then the formula for calculating χ2 given in Note 1 cannot be
used. Hence, we use the formula for the expected frequency
$$(A_i B_j)_0 = \frac{(A_i)(B_j)}{N},$$
i.e., the expected frequency in each cell is
$$\frac{\text{Product of column total and row total}}{\text{whole total}}.$$
Note 3. If
    a   b
    c   d
is the 2 × 2 contingency table with two attributes, then
$$Q = \frac{ad - bc}{ad + bc}$$
is called the coefficient of association. If the attributes are independent, then
$$\frac{a}{b} = \frac{c}{d}.$$
Note 4. Yates's Correction. In a 2 × 2 table, if the frequency of a cell is small, we make Yates's correction to
make χ2 continuous.
Decrease by 1/2 those cell frequencies that are greater than the expected frequencies, and increase by 1/2 those that
are less than expected. This will not affect the marginal totals. This correction is known as Yates's correction for
continuity.
After Yates's correction,
$$\chi^2 = \frac{N \left( bc - ad - \frac{1}{2} N \right)^2}{(a + c)(b + d)(c + d)(a + b)} \quad \text{when } ad - bc < 0,$$
$$\chi^2 = \frac{N \left( ad - bc - \frac{1}{2} N \right)^2}{(a + c)(b + d)(c + d)(a + b)} \quad \text{when } ad - bc > 0.$$
ILLUSTRATIVE EXAMPLES
Example 1. What are the expected frequencies of the 2 × 2 contingency tables given below?

(i)    a    b          (ii)    2    10
       c    d                  6     6

Sol.
(i) Observed frequencies

a          b          a + b
c          d          c + d
a + c      b + d      a + b + c + d = N

Expected frequencies

(a + c)(a + b) / (a + b + c + d)      (b + d)(a + b) / (a + b + c + d)
(a + c)(c + d) / (a + b + c + d)      (b + d)(c + d) / (a + b + c + d)

(ii) Observed frequencies

2      10     12
6       6     12
8      16     24

Expected frequencies

8 × 12 / 24 = 4      16 × 12 / 24 = 8
8 × 12 / 24 = 4      16 × 12 / 24 = 8
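For a 2 × 2 table, the shortcut formula of Note 1 and the Yates-corrected form of Note 4 can be sketched together. The function name `chi_square_2x2` is introduced here for illustration; clamping the corrected difference at zero is an added safeguard assumed by this sketch, not something stated in the notes.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]] using the shortcut formula.

    With yates=True, |ad - bc| is reduced by N/2 before squaring
    (Yates's correction); the result is clamped at 0 as a safeguard.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Table (ii) of Example 1: observed frequencies 2, 10 / 6, 6.
plain = chi_square_2x2(2, 10, 6, 6)
corrected = chi_square_2x2(2, 10, 6, 6, yates=True)

print(plain, corrected)
```

With cell frequencies as small as 2, the corrected value is the more cautious one to compare with the table value 3.841 for 1 d.f.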
Example 2. From the following table regarding the color of eyes of fathers and sons test
whether the color of the son’s eye is associated with that of the father.
                          Eye color of son
Eye color of father     Light     Not light
Light                    471         51
Not light                148        230
Sol. Null hypothesis H0. The color of the son’s eye is not associated with that of the father,
i.e., they are independent.
Under H0, we calculate the expected frequency in each cell as
$$\frac{\text{Product of column total and row total}}{\text{whole total}}$$
Expected frequencies are:

                          Eye color of son
Eye color of father     Light                       Not light                   Total
Light                   619 × 522/900 = 359.02      281 × 522/900 = 162.98      522
Not light               619 × 378/900 = 259.98      281 × 378/900 = 118.02      378
Total                   619                         281                         900

$$\chi^2 = \frac{(471 - 359.02)^2}{359.02} + \frac{(51 - 162.98)^2}{162.98} + \frac{(148 - 259.98)^2}{259.98} + \frac{(230 - 118.02)^2}{118.02} = 266.35.$$
Conclusion. Tabulated value of χ2 at 5% level for 1 d.f. is 3.841.
Since the calculated value of χ2 > the tabulated value of χ2, H0 is rejected. They are
dependent, i.e., the color of the son’s eye is associated with that of the father.
Example 3. The following table gives the number of good and bad parts produced by each
of the three shifts in a factory:
                 Good parts    Bad parts    Total
Day shift           960           40        1000
Evening shift       940           50         990
Night shift         950           45         995
Total              2850          135        2985
Test whether or not the production of bad parts is independent of the shift on which they
were produced.
Sol. Null hypothesis H0. The production of bad parts is independent of the shift on which
they were produced.
I.e., the two attributes, production and shifts, are independent.
Under H0,
$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{3} \left[ \frac{\left[ (A_i B_j)_0 - (A_i B_j) \right]^2}{(A_i B_j)_0} \right]$$
Calculation of expected frequencies
Let A and B be two attributes, namely, production and shifts. A is divided into two classes
A1, A2, and B is divided into three classes B1, B2, B3.
$$(A_1 B_1)_0 = \frac{(A_1)(B_1)}{N} = \frac{(2850) \times (1000)}{2985} = 954.774$$
$$(A_1 B_2)_0 = \frac{(A_1)(B_2)}{N} = \frac{(2850) \times (990)}{2985} = 945.226$$
$$(A_1 B_3)_0 = \frac{(A_1)(B_3)}{N} = \frac{(2850) \times (995)}{2985} = 950$$
$$(A_2 B_1)_0 = \frac{(A_2)(B_1)}{N} = \frac{(135) \times (1000)}{2985} = 45.226$$
$$(A_2 B_2)_0 = \frac{(A_2)(B_2)}{N} = \frac{(135) \times (990)}{2985} = 44.774$$
$$(A_2 B_3)_0 = \frac{(A_2)(B_3)}{N} = \frac{(135) \times (995)}{2985} = 45.$$
To calculate the value of χ2

Class      Oi     Ei         (Oi − Ei)^2    (Oi − Ei)^2 / Ei
(A1B1)     960    954.774    27.312         0.0286
(A1B2)     940    945.226    27.312         0.0289
(A1B3)     950    950        0              0
(A2B1)     40     45.226     27.312         0.6039
(A2B2)     50     44.774     27.312         0.6100
(A2B3)     45     45         0              0
Total                                       1.2714
Conclusion. The tabulated value of χ2 at 5% level of significance for 2 degrees of freedom
(r – 1)(s – 1) is 5.991. Since the calculated value of χ2 is less than the tabulated value, we accept
H0, i.e., the production of bad parts is independent of the shift on which they were produced.
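For a general r × s table the expected frequencies and the statistic can be computed in one pass. The sketch below applies the rule (row total × column total) / whole total to the shift data of Example 3; `independence_test` is a name chosen here, and the critical value 5.991 is the tabulated figure quoted in the conclusion above.

```python
def independence_test(table):
    """Return (expected, chi2, dof) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequency of each cell under independence.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, dof

# Rows: day, evening, night shift; columns: good parts, bad parts.
shifts = [[960, 40], [940, 50], [950, 45]]
expected, chi2, dof = independence_test(shifts)

print(dof, chi2 < 5.991)   # 5.991: tabulated 5% value for 2 d.f.
```

Transposing the table leaves the statistic and the degrees of freedom unchanged, so it does not matter which attribute is placed in the rows.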
Example 4. From the following data, find whether hair color and sex are associated.
Sex \ Color    Fair    Red     Medium    Dark    Black    Total
Boys            592     849     504       119     36       2100
Girls           544     677     451        97     14       1783
Total          1136    1526     955       216     50       3883
Sol. Null hypothesis H0. The two attributes of hair color and sex are not associated, i.e.,
they are independent.
Let A and B be the attributes of hair color and sex, respectively. A is divided into 5 classes
(r = 5). B is divided into 2 classes (s = 2).
∴ Degrees of freedom = (r – 1)(s – 1) = (5 – 1)(2 – 1) = 4
Under H0, we calculate
$$\chi^2 = \sum_{i=1}^{5} \sum_{j=1}^{2} \frac{\left[ (A_i B_j)_0 - (A_i B_j) \right]^2}{(A_i B_j)_0}$$
Calculate the expected frequency (Ai Bj)0 as follows:
$$(A_1 B_1)_0 = \frac{(A_1)(B_1)}{N} = \frac{1136 \times 2100}{3883} = 614.37$$
$$(A_1 B_2)_0 = \frac{(A_1)(B_2)}{N} = \frac{1136 \times 1783}{3883} = 521.629$$
$$(A_2 B_1)_0 = \frac{(A_2)(B_1)}{N} = \frac{1526 \times 2100}{3883} = 825.289$$
$$(A_2 B_2)_0 = \frac{(A_2)(B_2)}{N} = \frac{1526 \times 1783}{3883} = 700.71$$
$$(A_3 B_1)_0 = \frac{(A_3)(B_1)}{N} = \frac{955 \times 2100}{3883} = 516.482$$
$$(A_3 B_2)_0 = \frac{(A_3)(B_2)}{N} = \frac{955 \times 1783}{3883} = 438.517$$
$$(A_4 B_1)_0 = \frac{(A_4)(B_1)}{N} = \frac{216 \times 2100}{3883} = 116.816$$
$$(A_4 B_2)_0 = \frac{(A_4)(B_2)}{N} = \frac{216 \times 1783}{3883} = 99.183$$
$$(A_5 B_1)_0 = \frac{(A_5)(B_1)}{N} = \frac{50 \times 2100}{3883} = 27.04$$
$$(A_5 B_2)_0 = \frac{(A_5)(B_2)}{N} = \frac{50 \times 1783}{3883} = 22.959$$
Calculation of χ2

Class    Oi     Ei         (Oi − Ei)^2    (Oi − Ei)^2 / Ei
A1B1     592    614.37     500.416        0.8145
A1B2     544    521.629    500.462        0.959
A2B1     849    825.289    562.2115       0.6812
A2B2     677    700.71     562.1641       0.8023
A3B1     504    516.482    155.800        0.3016
A3B2     451    438.517    155.825        0.3553
A4B1     119    116.816    4.7698         0.0408
A4B2     97     99.183     4.7654         0.0480
A5B1     36     27.04      80.2816        2.9689
A5B2     14     22.959     80.2636        3.495
Total                                     10.4666

χ2 = 10.467.
Conclusion. The table value of χ2 at 5% level of significance for 4 d.f. is 9.488.
Since the calculated value of χ2 > tabulated value, H0 is rejected, i.e., the two attributes are
not independent, i.e., the hair color and sex are associated.
Example 5. Can vaccination be regarded as a preventive measure against smallpox, as evidenced
by the following data on 1482 people exposed to smallpox in a locality? Of these 1482 people, 368
in all were attacked; 343 had been vaccinated, and of these only 35 were attacked.
Sol. For the given data we form the contingency table. Let the two attributes be vaccination
and attack by smallpox. Each attribute is divided into two classes.
                              Vaccination A
Disease smallpox B     Vaccinated     Not      Total
Attacked                    35        333       368
Not                        308        806      1114
Total                      343       1139      1482
Null hypothesis H0. The two attributes are independent, i.e., vaccination cannot be regarded
as a preventive measure of smallpox.
Degrees of freedom ν = (r − 1)(s − 1) = (2 − 1)(2 − 1) = 1
Under H0,
$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{\left[ (A_i B_j)_0 - (A_i B_j) \right]^2}{(A_i B_j)_0}$$
Calculation of expected frequency
$$(A_1 B_1)_0 = \frac{(A_1)(B_1)}{N} = \frac{343 \times 368}{1482} = 85.1713$$
$$(A_1 B_2)_0 = \frac{(A_1)(B_2)}{N} = \frac{343 \times 1114}{1482} = 257.828$$
$$(A_2 B_1)_0 = \frac{(A_2)(B_1)}{N} = \frac{1139 \times 368}{1482} = 282.828$$
$$(A_2 B_2)_0 = \frac{(A_2)(B_2)}{N} = \frac{1139 \times 1114}{1482} = 856.171$$
Calculation of χ2

Class      Oi     Ei          (Oi − Ei)^2    (Oi − Ei)^2 / Ei
(A1B1)     35     85.1713     2517.159       29.554
(A1B2)     308    257.828     2517.229       9.763
(A2B1)     333    282.828     2517.2295      8.900
(A2B2)     806    856.171     2517.1292      2.940
Total                                        51.157

Calculated value of χ2 = 51.157.
Conclusion. Tabulated value of χ2 at 5% level of significance for 1 d.f. is 3.841. Since the
calculated value of χ2 > tabulated value H0 is rejected.
I.e., the two attributes are not independent, i.e., the vaccination can be regarded as a
preventive measure of smallpox.
TEST YOUR KNOWLEDGE
1. In a locality 100 people were randomly selected and asked about their educational achievements. The
results are given below:

                    Education
Sex         Middle    High school    College
Male          10          15           25
Female        25          10           15

Based on this information, can you say that education depends on sex?
2. The following data is collected on two characteristics:

              Smokers    Nonsmokers
Literate         83          57
Illiterate       45          68

Based on this information, can you say that there is no relation between the habit of smoking and literacy?
3. 500 students at a school were graded according to their intelligence and the economic conditions of their
homes. Examine whether there is any association between economic condition and intelligence, from the
following data:

                           Intelligence
Economic conditions     Good      Bad
Rich                      85       75
Poor                     165      175

4. In an experiment on the immunization of goats from anthrax, the following results were obtained. Derive
your inferences on the efficiency of the vaccine.

                           Died from anthrax    Survived
Inoculated with vaccine            2               10
Not inoculated                     6                6

Answers
1. Yes
2. No
3. No
4. Not effective.
________________________________________________________________________________________________________
21.83
Z-TEST
This test is used to test the significance of the correlation coefficient in small samples. If r is
the correlation coefficient of the sample and ρ that of the population, calculate the value of
$$\frac{Z - \xi}{\dfrac{1}{\sqrt{n - 3}}}$$
where
$$Z = \tanh^{-1} r = \frac{1}{2} \log_e \left( \frac{1 + r}{1 - r} \right) \text{ or } 1.1513 \log_{10} \left( \frac{1 + r}{1 - r} \right),$$
$$\xi = \tanh^{-1} \rho = \frac{1}{2} \log_e \left( \frac{1 + \rho}{1 - \rho} \right) \text{ or } 1.1513 \log_{10} \left( \frac{1 + \rho}{1 - \rho} \right),$$
and
$$\frac{1}{\sqrt{n - 3}} = \text{S.E.}$$
If the absolute value of this ratio, difference/S.E., exceeds 1.96, the difference is significant at the 5%
level.
ILLUSTRATIVE EXAMPLES
Example 1. Test the significance of the correlation r = 0.5 from a sample of size 18 against
the hypothetical correlation ρ = 0.7.
Sol. We have to test the hypothesis that the correlation in the population is 0.7.
$$Z = \frac{1}{2} \log_e \left( \frac{1 + r}{1 - r} \right) = 1.1513 \log_{10} \left( \frac{1 + 0.5}{1 - 0.5} \right) = 1.1513 \log_{10} 3 = 1.1513 \times 0.4771 = 0.549$$
$$\xi = \frac{1}{2} \log_e \left( \frac{1 + \rho}{1 - \rho} \right) = 1.1513 \log_{10} \left( \frac{1 + 0.7}{1 - 0.7} \right) = 1.1513 \log_{10} 5.67 = 1.1513 \times 0.7536 = 0.868$$
$$Z - \xi = 0.549 - 0.868 = -0.319$$
$$\text{S.E.} = \frac{1}{\sqrt{n - 3}} = \frac{1}{\sqrt{18 - 3}} = \frac{1}{\sqrt{15}} = 0.26$$
The absolute value of
$$\frac{Z - \xi}{\text{S.E.}} = \frac{0.319}{0.26} = 1.23,$$
which is less than 1.96 (5% level of significance) and is, therefore, not significant. Hence the sample
may be regarded as coming from a population with ρ = 0.7.
Example 2. From a sample of 19 pairs of observations, the correlation is 0.5 and the
corresponding population value is 0.3. Is the difference significant?
Sol. Here n = 19, r = 0.5, ρ = 0.3
$$Z = \frac{1}{2} \log_e \left( \frac{1 + r}{1 - r} \right) = 1.1513 \log_{10} \left( \frac{1 + 0.5}{1 - 0.5} \right) = 1.1513 \log_{10} 3 = 1.1513 \times 0.4771 = 0.55$$
$$\xi = \frac{1}{2} \log_e \left( \frac{1 + \rho}{1 - \rho} \right) = 1.1513 \log_{10} \left( \frac{1 + 0.3}{1 - 0.3} \right) = 1.1513 \log_{10} 1.857 = 1.1513 \times 0.2695 = 0.31$$
$$Z - \xi = 0.55 - 0.31 = 0.24; \qquad \text{S.E.} = \frac{1}{\sqrt{n - 3}} = \frac{1}{\sqrt{19 - 3}} = \frac{1}{4} = 0.25$$
$$\therefore \quad \frac{Z - \xi}{\text{S.E.}} = \frac{0.24}{0.25} = 0.96$$
which is less than 1.96 (5% level of significance) and is, therefore, not significant. Hence the
sample may be regarded as coming from a population with ρ = 0.3.
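Both Z-test examples can be verified with the inverse hyperbolic tangent from Python's standard library. This is a minimal sketch: `z_test_ratio` is a name introduced here, and it returns the absolute value of (Z − ξ)/S.E. for comparison with 1.96.

```python
from math import atanh, sqrt

def z_test_ratio(r, rho, n):
    """Return |Z - xi| / S.E. with Z = atanh(r), xi = atanh(rho), S.E. = 1/sqrt(n - 3)."""
    z = atanh(r)              # equals 0.5 * ln((1 + r) / (1 - r))
    xi = atanh(rho)
    standard_error = 1 / sqrt(n - 3)
    return abs(z - xi) / standard_error

example_1 = z_test_ratio(r=0.5, rho=0.7, n=18)
example_2 = z_test_ratio(r=0.5, rho=0.3, n=19)

for ratio in (example_1, example_2):
    verdict = "significant" if ratio > 1.96 else "not significant"
    print(round(ratio, 2), verdict)
```

Small differences from the hand calculation in the third decimal place come from the rounded logarithms used in the worked examples.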
TEST YOUR KNOWLEDGE
1. A correlation coefficient of 0.72 is obtained from a sample of 29 pairs of observations. Can the sample
be regarded as drawn from a bivariate normal population in which the true correlation coefficient is 0.8?
Answer
1. Yes
________________________________________________________________________________________________________