21 Statistics and Probability

21.1 INTRODUCTION

Statistics is as old as human society itself. It is difficult to imagine any facet of our life untouched by numerical data. Modern society is essentially data-oriented. It is, therefore, essential to know how to extract useful information from such data. This is the primary objective of statistics.

Statistics concerns itself with the collection, presentation, and drawing of inferences from numerical data that vary. In a singular sense, statistics is used to describe the principles and methods that are employed in collection, presentation, analysis, and interpretation of data. These devices help to simplify the complex data and make it possible for a common man to understand it without much difficulty. The human mind is unable to assimilate complicated data at a stretch. Statistical methods make these figures intelligible and readily understandable. In a plural sense, statistics is considered as a numerical description of the quantitative aspect of things.

Definition. Statistics is the science that deals with methods of collecting, classifying, presenting, comparing, and interpreting numerical data in order to throw light on any sphere of enquiry.

21.2 VARIABLE (OR VARIATE)

A quantity that can vary from one individual to another is called a variable or variate, e.g., heights, weights, ages, wages of people, rainfall records of cities, etc.

Quantities that can take any numerical value within a certain range are called continuous variables, e.g., as a child grows, his/her height takes all possible values from 50 cm to 100 cm.

Quantities that are incapable of taking all possible values are called discrete or discontinuous variables, e.g., the number of children in a family takes only the positive integer values 1, 2, 3, etc. (no value between any two consecutive integers).
21.3 FREQUENCY DISTRIBUTIONS

Consider the grades obtained by 60 students in mathematics:

38, 11, 40, 0, 26, 15, 5, 45, 7, 32, 2, 18, 42, 8, 31, 27, 4, 12, 35, 15, 0, 7, 28, 46, 9, 16, 29, 34, 10, 7, 5, 1, 17, 22, 35, 8, 36, 47, 11, 30, 19, 0, 16, 14, 16, 18, 41, 38, 2, 17, 42, 45, 48, 28, 7, 21, 8, 28, 5, 20.

In this form the data convey little useful information and are rather confusing. Such data are called raw data or ungrouped data.

We would like to bring out certain salient features of this data. Arranging the data in ascending or descending order of magnitude does not reduce its bulk. We therefore condense the data into classes or groups as below:

(i) Determine the range of the data, i.e., the difference between the largest and smallest numbers occurring in the data. Here the range = 48 – 0 = 48.

(ii) Decide upon the number of classes or groups into which the raw data is to be grouped. There are no hard and fast rules for this. The insight of the experimenter determines this number. However, the number of classes should not be less than 5 or more than 30. With a smaller number of classes accuracy is lost, and with a larger number of classes the computations become tedious. Let us make the number of classes = 7 here.

(iii) Divide the range by the desired number of classes to determine the approximate width or size of the class interval. If the quotient is a fraction, take the next integer. In the above example, the size of the class interval is 48/7 ≈ 6.86, or 7. As far as possible, classes should be of the same size.

(iv) Using the size of the interval, set up the class limits, making sure that the minimum and the maximum numbers occurring in the data are included in some class. As far as possible, open-end classes (of the type x < a or x > b) should be avoided since they create difficulty in analysis and interpretation.
Boundaries of each class are selected in such a way that there is no ambiguity as to which class a particular item of the data belongs.

(v) The observations corresponding to the common point of two classes should always be included in the higher class, e.g., if 20 is an element of the data and 10–20 and 20–30 are two classes, then 20 is to be placed in the class 20–30 and not in 10–20. That is to say, every class should be regarded as open to the right.

(vi) Take each item from the data, one at a time, and place a tally mark (|) opposite the class to which it belongs. Tally marks are recorded in bunches of five: after four tallies, the fifth occurrence is represented by a cross-tally (\ or /) drawn across the first four. This technique facilitates the counting of the tally marks at the end.

(vii) The count of tally marks in a particular class provides us with the frequency in that class. The word "frequency" is derived from "how frequently" a variable occurs.

(viii) Grades are called the variable (x) and the number of students in a class is known as the frequency (f) or class frequency of the variable.

(ix) The total of all frequencies must equal the number of observations in the raw data.

(x) The table displaying the manner in which frequencies are distributed over various classes is called the frequency table.

(xi) We are often interested in knowing, at a glance, the number of observations less than a particular value. This is done by finding the cumulative frequency. The cumulative frequency corresponding to a class is the sum of the frequencies of that class and of all classes prior to that class.

(xii) The table displaying the manner in which cumulative frequencies are distributed is called the cumulative frequency table.

Using the above steps, we have the following cumulative frequency table for the example under consideration.
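The tallying in steps (vi) to (xi) can be cross-checked with a short program. The following is a minimal Python sketch, not part of the original text: the function name frequency_table and its parameters start, width, and n_classes are hypothetical, and it assumes integer data and equal classes that are open to the right, as in step (v).

```python
from itertools import accumulate

# The 60 grades from the example above.
grades = [
    38, 11, 40, 0, 26, 15, 5, 45, 7, 32, 2, 18, 42, 8, 31,
    27, 4, 12, 35, 15, 0, 7, 28, 46, 9, 16, 29, 34, 10, 7,
    5, 1, 17, 22, 35, 8, 36, 47, 11, 30, 19, 0, 16, 14, 16,
    18, 41, 38, 2, 17, 42, 45, 48, 28, 7, 21, 8, 28, 5, 20,
]


def frequency_table(data, start, width, n_classes):
    """Return rows (lower, upper, frequency, cumulative frequency).

    Each class is [lower, upper), i.e. open to the right, so a value
    equal to a common class limit is counted in the higher class.
    """
    freqs = [0] * n_classes
    for value in data:
        index = (value - start) // width
        if not 0 <= index < n_classes:
            raise ValueError(f"{value} falls outside the chosen classes")
        freqs[index] += 1
    cumulative = list(accumulate(freqs))  # running totals, step (xi)
    rows = []
    for i, (f, c) in enumerate(zip(freqs, cumulative)):
        lower = start + i * width
        rows.append((lower, lower + width, f, c))
    return rows


rows = frequency_table(grades, start=0, width=7, n_classes=7)
for lower, upper, f, c in rows:
    print(f"{lower:>2}-{upper:<2}  f = {f:>2}  c.f. = {c:>2}")
```

For the 60 grades this prints the frequencies 10, 12, 12, 4, 8, 7, 7 and the cumulative frequencies 10, 22, 34, 38, 46, 53, 60, which can be compared with the table for this example.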
Class interval    Tally marks              Frequency    Cumulative
(grades x)        (number of students)     (f)          frequency
0–7               ||||/ ||||/              10           10
7–14              ||||/ ||||/ ||           12           22
14–21             ||||/ ||||/ ||           12           34
21–28             ||||                      4           38
28–35             ||||/ |||                 8           46
35–42             ||||/ ||                  7           53
42–49             ||||/ ||                  7           60
Total                                      60

ILLUSTRATIVE EXAMPLES

Example 1. The weights in grams of 50 apples picked at random from a market are as follows:

106, 107, 76, 82, 109, 107, 115, 93, 187, 195, 123, 125, 111, 92, 86, 70, 126, 68, 130, 129, 139, 119, 115, 128, 100, 186, 84, 99, 113, 204, 111, 141, 136, 123, 90, 115, 98, 110, 78, 90, 107, 81, 131, 75, 84, 104, 110, 80, 118, 82.

Form the grouped frequency table by dividing the variate range into intervals of equal width, each of 20 grams, in such a way that the mid-value of the first class corresponds to 70 grams.

Sol. Mid-value of first class = 70 (given)
Width of each class = 20 (given)
∴ The first class interval is (70 – 10) to (70 + 10), i.e., 60–80.

Weight in grams    Tally marks (No. of apples)    Frequency
60–80              ||||/                           5
80–100             ||||/ ||||/ |||                13
100–120            ||||/ ||||/ ||||/ ||           17
120–140            ||||/ ||||/                    10
140–160            |                               1
160–180                                            0
180–200            |||                             3
200–220            |                               1
Total                                             50

Example 2. Form an ordinary frequency table from the following table:

Grades      No. of students
Above 0     40
Above 10    30
Above 20    25
Above 30    18
Above 40    12
Above 50     0

Sol.

Grades    No. of students (f)
0–10      40 – 30 = 10
10–20     30 – 25 = 5
20–30     25 – 18 = 7
30–40     18 – 12 = 6
40–50     12 – 0 = 12

Example 3. Form an ordinary frequency table from the following:

Grades      No. of students
Below 10     5
Below 20     7
Below 30    13
Below 40    22
Below 50    30
Below 60    38

Sol.

Grades    No. of students (f)
0–10      5
10–20     7 – 5 = 2
20–30     13 – 7 = 6
30–40     22 – 13 = 9
40–50     30 – 22 = 8
50–60     38 – 30 = 8

21.4 "EXCLUSIVE" AND "INCLUSIVE" CLASS-INTERVALS

Class-intervals of the type $\{x : a \le x < b\} = [a, b)$ are called "exclusive" since they exclude the upper limit of the class. The following data are classified on this basis.

Income ($)       50–100    100–150    150–200    200–250    250–300
No. of people    88        70         52         30         23

In this method, the upper limit of one class is the lower limit of the next class. In this example, there are 88 people whose income is from $50 to $99.99. A person whose income is $100 is included in the class $100–$150.

Class-intervals of the type $\{x : a \le x \le b\} = [a, b]$ are called "inclusive" since they include the upper limit of the class. The following data are classified on this basis.

Income ($)       50–99    100–149    150–199    200–249    250–299
No. of people    60       38         22         16         7

However, to ensure continuity and to get correct class-limits, the exclusive method of classification should be adopted. To convert inclusive class-intervals into exclusive ones, we have to make an adjustment.

Adjustment. Find the difference between the lower limit of the second class and the upper limit of the first class. Divide it by 2. Subtract the value obtained from all the lower limits and add the value to all the upper limits.

In the above example, the adjustment factor is $\frac{100 - 99}{2} = 0.5$. The adjusted classes would then be as follows:

Income ($)       49.5–99.5    99.5–149.5    149.5–199.5    199.5–249.5    249.5–299.5
No. of people    60           38            22             16             7

The size of the class interval is 50.

21.5 THREE TYPES OF SERIES

In this chapter, we will come across the following three types of series:

(a) Individual Observations (i.e., where frequencies are not given). Form
$$x : x_1, x_2, x_3, \ldots, x_n.$$

(b) Discrete Series.
It is a series of observations of the form
$$x : x_1, x_2, x_3, \ldots, x_n$$
$$f : f_1, f_2, f_3, \ldots, f_n$$

(c) Continuous Series. It is a series of observations of the form

Class interval :  $a_1 - a_2$,  $a_2 - a_3$,  ...,  $a_n - a_{n+1}$
f              :  $f_1$,  $f_2$,  ...,  $f_n$

For the purpose of further calculations in statistical work, the mid-point of each class is taken to represent the class. Thus, if $m_i$ is the mid-point of the ith class, then
$$m_i = \frac{a_i + a_{i+1}}{2}$$
and the above series takes the form

Mid-value m :  $m_1, m_2, m_3, \ldots, m_n$
Frequency f :  $f_1, f_2, f_3, \ldots, f_n$

The mid-value of the ith class may also be denoted by $x_i$. Thus, a continuous series is reduced to the form of a discrete series.

21.6 GRAPHICAL REPRESENTATION

A frequency distribution, when represented by means of a graph, makes unwieldy data intelligible. A better perspective can be had by representing the frequency distribution graphically since graphs, if drawn attractively, are eye-catching and leave a more lasting impression on the mind of the observer. Graphs are a good visual aid. But graphs do not give accurate measurements of the variable as are given by the tables. Another disadvantage is that by taking different scales, the facts may be misrepresented.

Some important types of graphs are given below:

(A) Histogram

In drawing the histogram of a given grouped frequency distribution:

(a) Mark off along the x-axis all the class intervals on a suitable scale. (If the class-intervals are equal, then taking each equal to 1 cm is quite suitable.)

(b) Mark frequencies along the y-axis on a suitable scale.

(c) It must not be assumed that the scale for both the axes will be the same. We can have different scales for the two axes. The determination of scale depends upon our convenience and the type and nature of the data.
The scale or scales should be so chosen as to fit the size of the graph paper and to hold all the figures of the data.

(d) Construct rectangles with the class-intervals as bases and heights proportional (if the class intervals are equal) to the frequencies.

A diagram with all these rectangles is called a histogram.

ILLUSTRATIVE EXAMPLES

Example 1. The weights (in grams) of 40 oranges picked at random from a basket are as follows:

45, 55, 30, 110, 75, 100, 40, 60, 65, 40, 100, 75, 70, 60, 70, 95, 85, 80, 35, 45, 40, 50, 60, 65, 55, 45, 90, 85, 75, 85, 75, 70, 110, 100, 80, 70, 55, 30, 70.

Represent the data by means of a histogram.

Sol. Range = max. (110) – min. (30) = 80
Let the number of class intervals = 7
Width of the class interval = 80/7, or 12.

Wts. of oranges    Tally marks          Frequency
(in grams)         (No. of oranges)
30–42              ||||/ ||              7
42–54              ||||                  4
54–66              ||||/ |||             8
66–78              ||||/ ||||            9
78–90              ||||/                 5
90–102             ||||/                 5
102–114            ||                    2
Total                                   40

The histogram of the above frequency distribution is given here:

[Figure: histogram of the weights of the oranges]

(B) Frequency Polygon

For a grouped frequency distribution with equal class-intervals, a frequency polygon is obtained by joining the middle points of the upper sides (tops) of the adjacent rectangles of the histogram by means of straight lines. To complete the polygon, the mid-points at each end are joined to the immediately lower and higher mid-points at zero frequency, i.e., on the x-axis.

Example 2. The following table gives the weights (to the nearest pound) of 40 students at a university. Construct a frequency distribution with 7 classes and draw the histogram and frequency polygon.

138, 164, 150, 132, 144, 125, 149, 157, 146, 158, 140, 147, 136, 148, 152, 144, 168, 126, 138, 176, 163, 119, 154, 165, 146, 173, 142, 147, 135, 140, 135, 102, 145, 135, 142, 150, 156, 145, 128.
Sol. Range of raw data = max. (176) – min. (102) = 74
Number of classes = 7
∴ Width of class interval = 74/7, or 11.

Weight (to the     Tally marks             Frequency
nearest pound)
102–113            |                        1
113–124            |                        1
124–135            ||||                     4
135–146            ||||/ ||||/ ||||        14
146–157            ||||/ ||||/ ||          12
157–168            ||||/                    5
168–179            |||                      3
Total                                      40

The histogram and frequency polygon are shown here:

[Figure: histogram (rectangles) and frequency polygon (shown dotted)]

(C) Cumulative Frequency Curve or the Ogive

The curve obtained by plotting the cumulative frequency is called a cumulative frequency curve or an ogive (pronounced ojive). There are two types of ogives.

(i) Less-than ogive. Plot the points with the upper limits of the classes as abscissae and the corresponding less-than cumulative frequency as ordinates. Join the points by a freehand smooth curve to get the less-than ogive. It is a rising curve. (An ogive usually means a less-than ogive.)

(ii) More-than ogive. Plot the points with the lower limits of the classes as abscissae and the corresponding more-than cumulative frequency as ordinates. Join the points by a freehand smooth curve to get the more-than ogive. It is a falling curve.

Consider the following frequency distribution:

Grades             10–20    20–30    30–40    40–50    50–60    60–70
No. of students    4        6        10       20       18       2

Let us convert it first into a "less-than C.F." distribution and then into a "more-than C.F." distribution.

Grades less than    No. of students
20                  4
30                  (+ 6 =) 10
40                  (+ 10 =) 20
50                  (+ 20 =) 40
60                  (+ 18 =) 58
70                  (+ 2 =) 60

Grades more than    No. of students
10                  60
20                  (– 4 =) 56
30                  (– 6 =) 50
40                  (– 10 =) 40
50                  (– 20 =) 20
60                  (– 18 =) 2
70                  (– 2 =) 0

Example 3. Draw the two ogives for the following distribution showing the grades of 59 students:

Grades             0–10    10–20    20–30    30–40    40–50    50–60    60–70
No. of students    4       8        11       15       12       6        3

Sol.

Grades    No. of students    Less-than C.F.    More-than C.F.
0–10       4                  4                59
10–20      8                 12                55
20–30     11                 23                47
30–40     15                 38                36
40–50     12                 50                21
50–60      6                 56                 9
60–70      3                 59                 3

Plotting the points (10, 4), (20, 12), (30, 23), (40, 38), (50, 50), (60, 56), (70, 59), and joining them by freehand, the smooth rising curve obtained is the less-than ogive.

Plotting the points (0, 59), (10, 55), (20, 47), (30, 36), (40, 21), (50, 9), (60, 3), and joining them by freehand, the smooth falling curve obtained is the more-than ogive.

21.7 COMPARISON OF FREQUENCY DISTRIBUTIONS

When two or more different series of the same type are compared, tabulation of observations is not sufficient. It is often desirable to define quantitatively the characteristics of the frequency distribution.

There are two fundamental characteristics in which similar frequency distributions may differ:

(i) They may differ in measures of location or central tendency, i.e., in the value of the variate x around which they center.

(ii) They may differ in the extent to which observations are scattered about the central value. Measures of this kind are called measures of dispersion.

21.8 MEASURES OF CENTRAL TENDENCY

Tabulation arranges facts in a logical order and helps their understanding and comparison.
But often, the groups tabulated are still too large for their characteristics to be readily grasped. What is desired is a numerical expression that summarizes the characteristic of the group. Measures of central tendency or measures of location (also popularly called averages) serve this purpose.

A figure that is used to represent a whole series should have neither the lowest nor the highest value in the series, but a value somewhere between these two limits, possibly in the center, where most of the items of the series cluster. Such figures are called Measures of Central Tendency (or averages). There are five types of averages in common use:

1. Arithmetic Average or Mean
2. Median
3. Mode
4. Geometric Mean
5. Harmonic Mean

We shall take them one by one.

21.8.1 Arithmetic Mean

In the case of Individual Observations (i.e., where frequency is not given):

1. Direct Method. If $x : x_1, x_2, \ldots, x_n$, then the A.M. $\bar{x}$ is given by
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\Sigma x.$$

2. Short Cut Method. (Shift of origin.) Shifting the origin to an arbitrary point a, the formula $\bar{x} = \frac{1}{n}\Sigma x$ becomes
$$\bar{x} - a = \frac{1}{n}\Sigma (x - a) \quad \text{or} \quad \bar{x} = a + \frac{1}{n}\Sigma d_x, \quad \text{where } d_x = x - a.$$
Here, a = arbitrary number, called the Assumed Mean
$\Sigma d_x = \Sigma (x - a) = (x_1 - a) + (x_2 - a) + \cdots + (x_n - a)$ = sum of the deviations of the variate x from a
n = number of observations.

In the case of a Discrete Series:

1. Direct Method. If the frequency distribution is
$$x : x_1, x_2, \ldots, x_n$$
$$f : f_1, f_2, \ldots, f_n,$$
then
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{\Sigma f x}{N}, \quad \text{where } N = f_1 + f_2 + \cdots + f_n = \Sigma f.$$

2. Short Cut Method. (Shift of origin.) Shifting the origin to an arbitrary point a, the formula $\bar{x} = \frac{1}{N}\Sigma f x$ becomes
$$\bar{x} - a = \frac{1}{N}\Sigma f (x - a) \quad \text{or} \quad \bar{x} = a + \frac{1}{N}\Sigma f d_x, \quad \text{where } d_x = x - a.$$
Here, a = assumed mean
$\Sigma f d_x = \Sigma f (x - a) = f_1 (x_1 - a) + f_2 (x_2 - a) + \cdots + f_n (x_n - a)$ = sum of the products of f and the deviation of the corresponding variate x from a
$N = f_1 + f_2 + \cdots + f_n = \Sigma f$.

Note. If the frequencies are given in terms of class intervals, the mid-values of the class intervals are considered as x and then the above formulae are applied.

In the case of a Continuous Series having equal class intervals, say of width h, we use a different formula (Shift of origin and change of scale; Step Deviation Method).

Let $u = \frac{x - a}{h}$; then $x = a + hu$.
$$\therefore \ \Sigma f x = \Sigma f (a + hu) = a \Sigma f + h \Sigma f u$$
Dividing both sides by $N = \Sigma f$, we get
$$\frac{\Sigma f x}{N} = a + \frac{h \Sigma f u}{N} \quad \text{or} \quad \bar{x} = a + h \frac{\Sigma f u}{N}, \quad \text{where } u = \frac{x - a}{h}.$$

Weighted Arithmetic Mean. If the variate-values are not of equal importance, we may attach to them weights $w_1, w_2, \ldots, w_n$ as measures of their importance. The weighted mean $\bar{x}_w$ is defined as
$$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\Sigma w x}{\Sigma w} \quad \text{(i.e., write w for f).}$$

ILLUSTRATIVE EXAMPLES

Example 1. Find the mean from the following data:

Grades             Below 10    Below 20    Below 30    Below 40    Below 50
No. of students    5           9           17          29          45

Grades             Below 60    Below 70    Below 80    Below 90    Below 100
No. of students    60          70          78          83          85

Sol. The frequency distribution table can be written as:

Grades    Mid-value (x)    f     x − 55    u = (x − 55)/10    fu
0–10       5                5    −50       −5                 −25
10–20     15                4    −40       −4                 −16
20–30     25                8    −30       −3                 −24
30–40     35               12    −20       −2                 −24
40–50     45               16    −10       −1                 −16
50–60     55               15      0        0                   0
60–70     65               10     10        1                  10
70–80     75                8     20        2                  16
80–90     85                5     30        3                  15
90–100    95                2     40        4                   8
                  N = Σf = 85                          Σfu = −56

Here
$$\bar{x} = a + h \frac{\Sigma f u}{N} = 55 + 10 \times \left(\frac{-56}{85}\right) \quad [\text{Here } a = 55,\ h = 10]$$
$$= 55 - \frac{112}{17} = 48.41.$$

Example 2.
The mean of 200 items was 50. Later on it was discovered that two items were misread as 92 and 8 instead of 192 and 88. Find the correct mean.

Sol. Here the incorrect value of $\bar{x}$ = 50, n = 200.
Since $\bar{x} = \frac{\Sigma x}{n}$, ∴ $\Sigma x = n\bar{x}$.
Using the incorrect value of $\bar{x}$, incorrect $\Sigma x$ = 200 × 50 = 10000
∴ Corrected value of $\Sigma x$ = 10000 − (92 + 8) + (192 + 88) = 10180
Correct mean = $\frac{\text{Corrected } \Sigma x}{n} = \frac{10180}{200} = 50.9$.

Properties of the Arithmetic Mean

Property I. The algebraic sum of the deviations of all the variates from their arithmetic mean is zero.

Proof. Let $d_x$ be the deviation of the variate x from the mean $\bar{x}$; then $d_x = x - \bar{x}$.
$$\therefore \ \Sigma f d_x = \Sigma f (x - \bar{x}) = \Sigma f x - \bar{x}\, \Sigma f = N\bar{x} - N\bar{x} = 0 \quad \left[\because \bar{x} = \frac{\Sigma f x}{N}, \text{ where } N = \Sigma f\right].$$

Property II. The sum of the squares of the deviations of a set of values is minimum when taken about the mean.

Proof. Let the frequency distribution be $x_i / f_i$, i = 1, 2, . . . , n. Let z be the sum of the squares of the deviations of the given values from an arbitrary point a (say), so that
$$z = \sum_{i=1}^{n} f (x - a)^2.$$
We have to show that z is minimum when $a = \bar{x}$.
z will be minimum when $\frac{dz}{da} = 0$ and $\frac{d^2 z}{da^2} > 0$.
Now
$$\frac{dz}{da} = \sum_{i=1}^{n} 2 f (x - a) \cdot (-1) = -2 \sum_{i=1}^{n} f (x - a)$$
$$\therefore \ \frac{dz}{da} = 0 \ \Rightarrow\ -2 \Sigma f (x - a) = 0 \ \Rightarrow\ \Sigma f x - a \Sigma f = 0$$
$$\Rightarrow\ N\bar{x} - aN = 0 \quad \left[\because \bar{x} = \frac{\Sigma f x}{N},\ \Sigma f = N\right]$$
$$\Rightarrow\ \bar{x} - a = 0 \quad (\because N = \Sigma f \neq 0) \ \Rightarrow\ a = \bar{x}$$
$$\frac{d^2 z}{da^2} = -2 \sum_{i=1}^{n} f (-1) = 2 \Sigma f = 2N > 0$$
Hence z is minimum when $a = \bar{x}$.

Property III. (Mean of the composite series.) If $\bar{x}_i$ (i = 1, 2, . . . , k) are the arithmetic means of k distributions with respective frequencies $n_i$ (i = 1, 2, . . . , k), then the mean $\bar{x}$ of the whole distribution obtained by combining the k distributions is given by
$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\Sigma_i\, n_i \bar{x}_i}{\Sigma_i\, n_i}.$$

Proof. Let $x_{11}, x_{12}, x_{13}, \ldots, x_{1n_1}$ be the variables of the first distribution, $x_{21}, x_{22}, \ldots, x_{2n_2}$ be the variables of the second distribution, and so on. Then by definition
$$\bar{x}_1 = \frac{1}{n_1}(x_{11} + x_{12} + \cdots + x_{1n_1})$$
$$\bar{x}_2 = \frac{1}{n_2}(x_{21} + x_{22} + \cdots + x_{2n_2})$$
$$\cdots \qquad \ldots (A)$$
$$\bar{x}_k = \frac{1}{n_k}(x_{k1} + x_{k2} + \cdots + x_{kn_k})$$
The mean $\bar{x}$ of the whole distribution of size $(n_1 + n_2 + \cdots + n_k)$ is given by
$$\bar{x} = \frac{(x_{11} + x_{12} + \cdots + x_{1n_1}) + (x_{21} + x_{22} + \cdots + x_{2n_2}) + \cdots + (x_{k1} + x_{k2} + \cdots + x_{kn_k})}{n_1 + n_2 + \cdots + n_k}$$
$$= \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\Sigma_i\, n_i \bar{x}_i}{\Sigma_i\, n_i} \quad [\text{Using } (A)]$$

Example 3. The mean annual salary paid to all employees of a company was $50000. The mean annual salaries paid to male and female employees were $52000 and $42000 respectively. Determine the percentage of males and females employed by the company.

Sol. Let $p_1$ and $p_2$ represent the percentage of males and females respectively. Then
$$p_1 + p_2 = 100 \qquad \ldots (1)$$
Mean annual salary of all employees ($\bar{x}$) = $50000
Mean annual salary of all males ($\bar{x}_1$) = $52000
Mean annual salary of all females ($\bar{x}_2$) = $42000
Using $\bar{x} = \frac{p_1 \bar{x}_1 + p_2 \bar{x}_2}{p_1 + p_2}$, we get
$$50000 = \frac{52000\, p_1 + 42000\, p_2}{100}$$
or $520 p_1 + 420 p_2 = 50000$
or $260 p_1 + 210 p_2 = 25000$
or $260 p_1 + 210 (100 - p_1) = 25000$ [Using (1)]
or $50 p_1 = 25000 - 21000 = 4000$
∴ $p_1 = 80$ and $p_2 = 100 - 80 = 20$
Hence the percentage of males and females is 80 and 20 respectively.

21.8.2 Median

1. The median is the central value of the variable when the values are arranged in ascending or descending order of magnitude. When the observations are arranged in the order of their size, the median is the value of that item that has an equal number of observations on either side. The median divides the distribution into two equal parts. The median is, thus, a positional average.
For the computation of a median, it is necessary that the items be arranged in ascending or descending order.

2. For an ungrouped frequency distribution, let the n values of the variate be arranged in ascending or descending order of magnitude.
(a) When n is odd, the middle value, i.e., the $\left(\frac{n+1}{2}\right)$th value, gives the median.
(b) When n is even, there are two middle values, the $\left(\frac{n}{2}\right)$th and the $\left(\frac{n}{2} + 1\right)$th. The arithmetic mean of these two values gives the median.

3. For a discrete frequency distribution, the median is obtained by considering cumulative frequencies. Find $\frac{N+1}{2}$, where $N = \Sigma f_i$. Find the cumulative frequency just ≥ $\frac{N+1}{2}$. The corresponding value of x is the median.

4. For a grouped frequency distribution, the median is given by the formula
$$\text{Median} = l + \frac{h}{f}\left(\frac{N}{2} - C\right)$$
where l = lower limit of the median class, the median class being the class corresponding to the cumulative frequency just ≥ $\frac{N}{2}$
h = width of the median class; f = frequency of the median class
$N = \Sigma f$; C = cumulative frequency of the class preceding the median class.

5. Partition values. These are the values of the variate that divide the total frequency into a number of equal parts, the median being that value of the variate that divides the total frequency into two equal parts.

(a) Quartiles. Quartiles are those values of the variate that divide the total frequency into four equal parts. When the lower half before the median is divided into two equal parts, the value of the dividing variate is called the Lower Quartile and is denoted by $Q_1$. The value of the variate dividing the upper half into two equal parts is called the Upper Quartile and is denoted by $Q_3$. ($Q_2$ being the median.) The formulae for computation are
$$Q_1 = l + \frac{h}{f}\left(\frac{N}{4} - C\right); \quad Q_3 = l + \frac{h}{f}\left(\frac{3N}{4} - C\right)$$

(b) Deciles.
Deciles are those values of the variate that divide the total frequency into 10 equal parts. $D_1, D_2, \ldots$ denote respectively the first, second, . . . deciles.
$$D_1 = l + \frac{h}{f}\left(\frac{N}{10} - C\right), \quad D_4 = l + \frac{h}{f}\left(\frac{4N}{10} - C\right), \quad D_7 = l + \frac{h}{f}\left(\frac{7N}{10} - C\right)$$
(The fifth decile $D_5$ is the median.)

(c) Percentiles. Percentiles are those values of the variate that divide the total frequency into 100 equal parts. If $P_1, P_2, \ldots$ denote respectively the first, second, . . . percentiles, then
$$P_9 = l + \frac{h}{f}\left(\frac{9N}{100} - C\right), \quad P_{72} = l + \frac{h}{f}\left(\frac{72N}{100} - C\right), \text{ etc.}$$
(The 50th percentile $P_{50}$ is the median.)

In the above formulae for Quartiles, Deciles, and Percentiles, the letters l, h, f, N, C have been used in the same sense in which they have been used in the formula for the median, applied to the class that contains the required partition value.

ILLUSTRATIVE EXAMPLES

Example 1. Below are given the grades obtained by a group of 20 students in a certain class in mathematics and physics:

Roll Nos.            :   1   2   3   4   5   6   7   8   9  10
Grades in Math       :  53  54  52  32  30  60  47  46  35  28
Grades in Physics    :  58  55  25  32  26  85  44  80  33  72

Roll Nos.            :  11  12  13  14  15  16  17  18  19  20
Grades in Math       :  25  42  33  48  72  51  45  33  65  29
Grades in Physics    :  10  42  15  46  50  64  39  38  30  36

In which subject is the level of knowledge of the students higher?

Sol. To find out the subject in which the level of knowledge of the students is higher, we find the medians of both the series. The subject for which the median value is higher will be the subject in which the level of knowledge of the students is higher. Let us arrange the grades in ascending order of magnitude.

S. No.               :   1   2   3   4   5   6   7   8   9  10
Grades in Math       :  25  28  29  30  32  33  33  35  42  45
Grades in Physics    :  10  15  25  26  30  32  33  36  38  39

S. No.               :  11  12  13  14  15  16  17  18  19  20
Grades in Math       :  46  47  48  51  52  53  54  60  65  72
Grades in Physics    :  42  44  46  50  55  58  64  72  80  85

Number of items in each case = 20 (even)
Median grades in Mathematics
= A.M. of sizes of the $\left(\frac{20}{2}\right)$th and $\left(\frac{20}{2} + 1\right)$th items
= A.M. of sizes of the 10th and 11th items = $\frac{45 + 46}{2}$ = 45.5.
Median grades in Physics = A.M. of sizes of the 10th and 11th items = $\frac{39 + 42}{2}$ = 40.5.
Since the median grade in mathematics is greater than the median grade in physics, the level of knowledge in mathematics is higher.

Example 2. Obtain the median for the following frequency distribution:

x :  1   2   3   4   5   6   7   8   9
f :  8  10  11  16  20  25  15   9   6

Sol. The cumulative frequency distribution table is given below:

x    f     C.F.
1     8      8
2    10     18
3    11     29
4    16     45
5    20     65
6    25     90
7    15    105
8     9    114
9     6    120

Here N = 120 ∴ $\frac{N+1}{2}$ = 60.5
The cumulative frequency just greater than $\frac{N+1}{2}$ is 65 and the value of x corresponding to C.F. 65 is 5. Hence the median is 5.

Example 3. Find the median, lower, and upper quartiles from the following table:

Grades             Below 10    Below 20    Below 30    Below 40
No. of students    15          35          60          84

Grades             Below 50    Below 60    Below 70    Below 80
No. of students    94          127         198         249

Sol. From the above table, we reconstruct the C.F. table with class intervals.

Grades    No. of students (f)    C.F.
0–10      15                      15
10–20     20                      35
20–30     25                      60
30–40     24                      84
40–50     10                      94
50–60     33                     127
60–70     71                     198
70–80     51                     249

Here N = 249

(i) Calculation of Median
$\frac{N}{2}$ = 124.5 ∴ the median class is 50–60, l = 50, h = 10, f = 33, C = 94
$$\therefore \ \text{Median} = l + \frac{h}{f}\left(\frac{N}{2} - C\right) = 50 + \frac{10}{33}(124.5 - 94) = 50 + \frac{305}{33} = 50 + 9.24 = 59.24$$

(ii) Calculation of lower quartile $Q_1$
$\frac{N}{4}$ = 62.25 ∴ the lower quartile class is 30–40, l = 30, h = 10, f = 24, C = 60
$$\therefore \ Q_1 = l + \frac{h}{f}\left(\frac{N}{4} - C\right) = 30 + \frac{10}{24}(62.25 - 60) = 30 + \frac{22.5}{24} = 30 + 0.94 = 30.94.$$

(iii) Calculation of upper quartile $Q_3$
$\frac{3N}{4} = \frac{747}{4}$ = 186.75 ∴ the upper quartile class is 60–70, l = 60, h = 10, f = 71, C = 127
$$\therefore \ Q_3 = l + \frac{h}{f}\left(\frac{3N}{4} - C\right) = 60 + \frac{10}{71}(186.75 - 127) = 60 + \frac{597.5}{71} = 60 + 8.41 = 68.41.$$

21.8.3 Mode

1. Mode. Mode is the value that occurs most frequently in a set of observations and around which the other items of the set cluster densely. It is the point of maximum frequency or the point of greatest density. In other words, the mode or modal value of the distribution is that value of the variate for which the frequency is maximum.

2. Calculation of the Mode.

(a) In the case of a discrete frequency distribution, the mode is the value of x corresponding to the maximum frequency. But in any one (or more) of the following cases:
(i) if the maximum frequency is repeated
(ii) if the maximum frequency occurs in the very beginning or at the end of the distribution
(iii) if there are irregularities in the distribution,
the value of the mode is determined by the method of grouping (illustrated in the examples below).

(b) In the case of a continuous frequency distribution, the mode is given by the formula:
$$\text{Mode} = l + \frac{f_m - f_1}{2 f_m - f_1 - f_2} \times h$$
where l is the lower limit, h is the width, and $f_m$ is the frequency of the modal class, and $f_1$ and $f_2$ are the frequencies of the classes preceding and succeeding the modal class respectively.
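The formula can be checked numerically with a few lines of Python. This is only a sketch and not part of the original text: the function name grouped_mode is hypothetical, the classes are assumed to be of equal width, and the sample figures (l = 15.5, h = 5, f_m = 32, f_1 = 16, f_2 = 24) are those of the grouped example worked later in this section.

```python
def grouped_mode(l, h, fm, f1, f2):
    """Mode of a continuous distribution: l + (fm - f1) / (2*fm - f1 - f2) * h.

    l  : lower limit of the modal class
    h  : width of the modal class
    fm : frequency of the modal class
    f1 : frequency of the class preceding the modal class
    f2 : frequency of the class succeeding the modal class
    """
    denominator = 2 * fm - f1 - f2
    if denominator == 0:
        # The formula is undefined here; this sketch simply refuses to guess.
        raise ValueError("2*fm - f1 - f2 is zero, so the formula does not apply")
    return l + (fm - f1) / denominator * h


mode = grouped_mode(l=15.5, h=5, fm=32, f1=16, f2=24)
print(round(mode, 2))  # 18.83
```

Raising an error when the denominator vanishes is a design choice of the sketch, made so that a degenerate table is noticed rather than silently mishandled.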
While applying the above formula, it is necessary to see that the class-intervals are of the same size. If they are unequal, they should first be made equal on the assumption that the frequencies are equally distributed throughout the class.
In case $f_m - f_1 < 0$ or $2f_m - f_1 - f_2 = 0$, use the formula
$$\text{Mode} = l + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h, \quad \text{where } \Delta_1 = |f_m - f_1| \text{ and } \Delta_2 = |f_m - f_2|.$$
(c) For a symmetrical distribution, the mean, median, and mode coincide.
(d) Where the mode is ill-defined, i.e., where the method of grouping also fails, its value can be ascertained by the formula
Mode = 3 Median − 2 Mean
This measure is called the empirical mode.

ILLUSTRATIVE EXAMPLES

Example 1. Calculate the mode from the following frequency distribution:

Size (x):        4   5   6   7   8    9    10   11   12   13
Frequency (f):   2   5   8   9   12   14   14   15   11   13

Sol. Method of Grouping:

Column   Groups of sizes                              Grouped frequencies
I        4, 5, ..., 13 (taken singly)                 2, 5, 8, 9, 12, 14, 14, 15*, 11, 13
II       (4,5), (6,7), (8,9), (10,11), (12,13)        7, 17, 26, 29*, 24
III      (5,6), (7,8), (9,10), (11,12)                13, 21, 28*, 26
IV       (4,5,6), (7,8,9), (10,11,12)                 15, 35, 40*
V        (5,6,7), (8,9,10), (11,12,13)                22, 40*, 39
VI       (6,7,8), (9,10,11)                           29, 43*

Explanation:
In column I, the original frequencies are written.
In column II, the frequencies of column I are combined two by two.
In column III, leave the first frequency of column I and combine the others two by two.
In column IV, the frequencies of column I are combined three by three.
In column V, leave the first frequency of column I and combine the others three by three.
In column VI, leave the first two frequencies in column I and combine the others three by three.
In all these columns, the maximum frequency is marked with an asterisk.
Note. All operations are done on column I.

Now we frame another table in which, against every maximum item of columns I to VI, we write down the corresponding size or sizes. The size (x) that occurs the maximum number of times is the mode.

Column   Size of item having max. frequency
I        11
II       10, 11
III      9, 10
IV       10, 11, 12
V        8, 9, 10
VI       9, 10, 11

Since the item 10 occurs the maximum number of times (i.e., 5 times), the mode is 10.

Example 2. Find the mode of the following:

Grades   No. of candidates
1–5      7
6–10     10
11–15    16
16–20    32
21–25    24
26–30    18
31–35    10
36–40    5
41–45    1

Sol. Here the greatest frequency 32 lies in the class 16–20. Hence the modal class is 16–20. But the actual limits of this class are 15.5–20.5.
$l = 15.5,\ f_m = 32,\ f_1 = 16,\ f_2 = 24,\ h = 5$
$$\therefore \text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h = 15.5 + \frac{32 - 16}{64 - 16 - 24} \times 5 = 15.5 + \frac{16}{24} \times 5 = 15.5 + \frac{10}{3} = 18.83.$$

21.8.4 Geometric Mean

Geometric Mean. (a) The geometric mean (G.M.) of n individual observations $x_1, x_2, \ldots, x_n$ ($x_i \neq 0$) is the nth root of their product. Thus
$$G = (x_1 x_2 \cdots x_n)^{1/n}$$
Taking logarithms of both sides,
$$\log G = \frac{1}{n}(\log x_1 + \log x_2 + \ldots + \log x_n) = \frac{1}{n}\sum_{i=1}^{n} \log x_i$$
$$\therefore G = \text{antilog}\left[\frac{1}{n}\sum_{i=1}^{n} \log x_i\right]$$
(b) If $x_1, x_2, \ldots, x_n$ occur $f_1, f_2, \ldots, f_n$ times respectively and $N = \sum_{i=1}^{n} f_i$, then the G.M. is given by
$$G = \left(x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n}\right)^{1/N}$$
Taking logarithms of both sides,
$$\log G = \frac{1}{N}(f_1 \log x_1 + f_2 \log x_2 + \ldots + f_n \log x_n) = \frac{1}{N}\sum_{i=1}^{n} f_i \log x_i$$
$$\therefore G = \text{antilog}\left[\frac{1}{N}\sum_{i=1}^{n} f_i \log x_i\right]$$
(c) In the case of a continuous frequency distribution, x is taken to be the value corresponding to the mid-points of the class-intervals.

Example. Compute the geometric mean from the following data:

Grades:            0–10   10–20   20–30   30–40   40–50
No. of students:   10     5       8       7       20

Sol.

Grades   No. of students (f)   Mid-values (x)   log x    f log x
0–10     10                    5                0.6990   6.9900
10–20    5                     15               1.1761   5.8805
20–30    8                     25               1.3979   11.1832
30–40    7                     35               1.5441   10.8087
40–50    20                    45               1.6532   33.0640
Total    50                                              67.9264

$$\log G = \frac{1}{N}\Sigma f \log x = \frac{67.9264}{50} = 1.3585$$
G = antilog 1.3585 = 22.83.

21.8.5 Harmonic Mean

Harmonic Mean. The harmonic mean of a number of observations is the reciprocal of the arithmetic mean of the reciprocals of the given values. Thus, the harmonic mean H of n observations $x_1, x_2, \ldots, x_n$ is
$$H = \frac{1}{\dfrac{1}{n}\sum_{i=1}^{n} \dfrac{1}{x_i}} = \frac{n}{\dfrac{1}{x_1} + \dfrac{1}{x_2} + \ldots + \dfrac{1}{x_n}}.$$
If $x_1, x_2, \ldots, x_n$ (none of them being zero) have the frequencies $f_1, f_2, \ldots, f_n$ respectively, then the harmonic mean is given by
$$H = \frac{1}{\dfrac{1}{N}\sum_{i=1}^{n} \dfrac{f_i}{x_i}} = \frac{N}{\dfrac{f_1}{x_1} + \dfrac{f_2}{x_2} + \ldots + \dfrac{f_n}{x_n}}, \quad N = \sum_{i=1}^{n} f_i.$$
In the case of class-intervals, x is taken to be the mid-value of the class-interval.

ILLUSTRATIVE EXAMPLES

Example 1. Find the harmonic mean of the following data:

Grades (out of 150):   10   20   40   60   120
No. of students:       2    3    6    5    4

Sol.

x       f    1/x     f/x
10      2    .100    .200
20      3    .050    .150
40      6    .025    .150
60      5    .017    .085
120     4    .008    .032
Total   20           .617

$$\text{H.M.} = \frac{N}{\Sigma \dfrac{f}{x}} = \frac{20}{.617} = 32.4.$$

Example 2. An airplane flies along the four sides of a square at speeds of 100, 200, 300, and 400 km/hr respectively. What is the average speed of the airplane in its flight around the square?

Sol. When equal distances are covered with unequal speeds, the harmonic mean is the proper average.
$$\therefore \text{Average speed} = \frac{4}{\dfrac{1}{100} + \dfrac{1}{200} + \dfrac{1}{300} + \dfrac{1}{400}} = 192 \text{ km/hr}.$$

TEST YOUR KNOWLEDGE

1. The minimum temperatures (in °C) for Anytown for the month of July, 2006, as reported by the Meteorological Department, are given below. Construct a frequency distribution table for them.
30.3, 30.0, 25.8, 26.5, 24.2, 25.2, 28.0, 28.0, 29.5, 27.8, 30.0, 31.1, 27.2, 25.9, 27.6, 24.5, 24.4, 27.0, 28.1, 26.0, 25.4, 28.0, 26.9, 25.7, 27.2, 25.5, 26.6, 28.5, 28.0, 27.7, 24.0.
2.
The following are the monthly rents (in dollars) of 40 stores. Tabulate the data by grouping in intervals of $8.
380, 420, 490, 370, 820, 370, 750, 620, 540, 790, 840, 750, 630, 440, 740, 440, 360, 690, 540, 480, 740, 470, 520, 570, 620, 670, 720, 770, 820, 510, 310, 380, 430, 750, 670, 770, 470, 640, 840, 810.

3. Draw a histogram representing the following frequency distribution:

Monthly wages (in $):   15   20   25   30   35   40   45
Number of workers:      2    20   26   16   9    4    3

[Hint. Mid-values of class intervals of size 5 are given.]

4. Represent the following distribution by a (i) histogram and (ii) frequency polygon.

Scores:      90–99   80–89   70–79   60–69   50–59   40–49   30–39
Frequency:   2       12      22      20      14      3       1

5. Represent the following distribution by an ogive:

Grades:            0–10   10–20   20–30   30–40   40–50   50–60   60–70   70–80   80–90   90–100
No. of students:   5      13      12      11      8       4       1       3       1       2

6. Compute the arithmetic mean for the following data:

Height (in cm):   219   216   213   210   207   204   201   198   195
No. of people:    2     4     6     10    11    7     5     4     1

7. Find the average grades of students from the following data:

Grades      No. of students
Above 0     80
Above 10    77
Above 20    72
Above 30    65
Above 40    55
Above 50    43
Above 60    28
Above 70    16
Above 80    10
Above 90    8
Above 100   0

8. Two hundred people were interviewed by a public opinion polling agency. The frequency distribution gives the ages of the people interviewed. Calculate the arithmetic mean of the data.

Age group:   80–89   70–79   60–69   50–59   40–49   30–39   20–29   10–19
Frequency:   2       2       6       20      56      40      42      32

9. Calculate the arithmetic mean from the following data:

Class interval:   0–1   1–3   3–5   5–10   10–15   15–25   25–28   28–30   30–45   45–60
Frequency:        8     8     10    12     18      11      10      9       8       6

10. Find the class intervals if the arithmetic mean of the following distribution is 33 and assumed mean is 35.
Step deviation (u):   –3   –2   –1   0    1    2
Frequency (f):        5    10   25   30   20   10

11. The average height of a group of 25 children was calculated to be 78.4 cm. It was later discovered that one value was misread as 69 cm instead of the correct value of 96 cm. Calculate the correct average.

12. A candidate obtains the following percentages in an examination: English 60, history 75, mathematics 63, physics 59, and chemistry 55. Find the weighted mean if weights 2, 1, 5, 5, 3 are allotted to the subjects.

13. From the following data calculate the missing frequency:

No. of pills:          4–8   8–12   12–16   16–20   20–24   24–28   28–32   32–36   36–40
No. of people cured:   11    13     16      14      ?       9       17      6       4

The average number of pills to cure a person is 20.

14. The frequencies of values 0, 1, 2, . . . , n of a variable are given by $q^n,\ {}^nC_1 q^{n-1}p,\ {}^nC_2 q^{n-2}p^2,\ \ldots,\ p^n$ where p + q = 1. Show that the mean is np.

15. The mean grade obtained by 300 students in the subject of statistics is 45. The mean of the top 100 of them was found to be 70 and the mean of the last 100 was known to be 20. What is the mean of the remaining 100 students?

16. In a certain examination, the average grade of all students in class A is 68.4 and that of all students in class B is 71.2. If the average of both classes combined is 70, find the ratio of the number of students in class A to the number in class B.

17. The following are the monthly salaries in dollars of 30 employees of a firm:
910, 1390, 1260, 1190, 1000, 870, 650, 770, 990, 950, 1080, 1270, 860, 1480, 1160, 760, 690, 880, 1120, 1180, 890, 1160, 970, 1050, 950, 800, 860, 1060, 930, 1350
The firm gave bonuses of 100, 150, 200, 250, 300, 350, 400, 450, and 500 to employees in the respective salary groups: exceeding 600 but not exceeding 700, exceeding 700 but not exceeding 800, and so on up to exceeding 1400 but not exceeding 1500.
Find the average bonus paid per employee.

18. According to the census of 2006, the following are the population figures in thousands of 10 cities: 2000, 1180, 1785, 1500, 560, 782, 1200, 385, 1123, 222. Find the median.

19. Find the median from the following table:

x:   5   7   9   11   13   15   17   19
f:   1   2   7   9    11   8    5    4

20. Calculate the mean and median from the following table:

Class interval:   6.5–7.5   7.5–8.5   8.5–9.5   9.5–10.5   10.5–11.5   11.5–12.5   12.5–13.5
Frequency:        5         12        25        48         32          6           1

21. Compute the median from the following data:

Mid-value:   115   125   135   145   155   165   175   185   195
Frequency:   6     25    48    72    116   60    38    22    3

22. Find the median, quartiles, 7th decile, and 85th percentile from the following data:

Monthly rent ($)   No. of families
200–400            6
400–600            9
600–800            11
800–1000           14
1000–1200          20
1200–1400          15
1400–1600          10
1600–1800          8
1800–2000          7

23. An incomplete frequency distribution is given as follows:

Variable:    10–20   20–30   30–40   40–50   50–60   60–70   70–80   Total
Frequency:   12      30      ?       65      ?       25      18      229

Given that the median value is 46, determine the missing frequencies using the median formula.

24. Find the median, lower and upper quartiles, 4th decile, and 60th percentile for the following distribution:

Grades:            0–4   4–8   8–12   12–14   14–18   18–20   20–25   25 and above
No. of students:   10    12    18     7       5       8       4       6

[Hint. Here the class-intervals are not all equal. To find any partition value, there is no need to make them equal.]

25. Find the mode of the following frequency distribution:

Size:        1   2   3    4    5    6    7    8    9    10   11   12
Frequency:   3   8   15   23   35   40   32   28   20   45   14   6

26. Find the mode and median from the following table:

Grades:            0–10   10–20   20–30   30–40   40–50   50–60   60–70   70–80
No. of students:   2      18      30      45      35      20      6       3

27. Calculate the mode of the following distribution:

Monthly wages (in $)   No. of workers
500–700                4
700–900                44
900–1100               38
1100–1300              28
1300–1500              6
1500–1700              8
1700–1900              12
1900–2100              2
2100–2300              2

[Hint. Use the method of grouping for finding the modal class.]

28. An incomplete distribution of families according to their expenditure per week is given below. The median and mode for the distribution are $250 and $240 respectively. Calculate the missing frequencies.

Expenditure:       0–100   100–200   200–300   300–400   400–500
No. of families:   14      ?         27        ?         15

29. Compute the geometric mean of the following data:

x:   10   15   18   20   25
y:   2    3    5    6    4

30. If $n_1$ and $n_2$ are the sizes, and $G_1$ and $G_2$ the geometric means, of two series respectively, then the geometric mean G of the combined series is given by
$$\log G = \frac{n_1 \log G_1 + n_2 \log G_2}{n_1 + n_2}.$$

31. The grades obtained by 25 students in a test are given below:

Grades:            11   12   13   14   15
No. of students:   3    7    8    5    2

Find the harmonic mean.

32. Compute the harmonic mean of the following data:

Class:       0–10   10–20   20–30   30–40   40–50
Frequency:   4      6       10      7       3

33. Three cities A, B, and C are equidistant from each other. A woman drives from A to B at 30 km/hr, from B to C at 40 km/hr, and from C to A at 50 km/hr. Determine her average speed.

34. Show that in finding the arithmetic mean of a set of readings on a thermometer, it does not matter whether we measure temperature in Centigrade or Fahrenheit, but that in finding the geometric mean, it does matter which scale we use.

Answers

6. 207.54 cm
7. 51.75
8. 35.8 years
9. 17.36
10. 0–10, 10–20, 20–30, 30–40, 40–50, 50–60
11. 79.48 cm
12. 60.63%
13. 14
15. 45
16. 3 : 4
17. $275
18. 1151.5 thousands
19. 13
20. Mean = 9.87, Median = 9.97
21. 153.8
22. ($) 1100, 781.80, 1400, 1333.30, 1600
23. 34, 45
24. 10.89, 6.5, 18.125, 9.33, 12.57
25. 6
26. 36, 36.6
27. $975.00
28. 25, 24
29. 18.20
31. 12.7
32. 16.03
33. 38.3 km/hr

21.9 DISPERSION

A measure of central tendency by itself can exhibit only one of the important characteristics of a distribution. It can represent a series only as well as a single figure can. It is inadequate to give us a complete idea of the distribution. It must be supported and supplemented by some other measures. One such measure is dispersion.

Two or more frequency distributions may have exactly identical averages but even then they may differ markedly in several ways. Further analysis is, therefore, essential to account for these differences. Consider the following example:

Distribution A:   75   85   95   105   115   125
Distribution B:   10   20   30   70    180   290

The A.M. of each distribution is $\frac{600}{6} = 100$.

In distribution A, the values of the variate differ from 100 but the difference is small. In distribution B, the items are widely scattered and lie far from the mean. Although the A.M. is the same, the two distributions differ widely from each other in their formation.

Therefore, while studying a distribution, it is equally important to know how the variates are clustered around or scattered away from the point of central tendency. Such variation is called dispersion or spread or scatter or variability. Thus, dispersion is the extent to which the values are dispersed about the central value.

21.10 MEASURES OF DISPERSION

The following are the measures of dispersion:
(a) Range
(b) Quartile deviation or semi-inter-quartile range
(c) Average (or mean) deviation
(d) Standard deviation.

(a) Range.
Range is the difference between the extreme values of the variate.
Range = L − S, where L = largest and S = smallest value.
$$\text{Coefficient of the Range} = \frac{L - S}{L + S}.$$
It is easily understood and computed. But it suffers from the drawback that it depends exclusively on the two extreme values. It is not a reliable measure of dispersion.

(b) Quartile Deviation. The difference between the upper and lower quartiles, i.e., $Q_3 - Q_1$, is known as the inter-quartile range, and half of it, i.e., $\frac{1}{2}(Q_3 - Q_1)$, is called the semi-inter-quartile range or the quartile deviation.
$$\text{Quartile Deviation} = \frac{1}{2}(Q_3 - Q_1).$$
It is definitely a better measure of dispersion than range as it makes use of 50% of the data. But since it ignores the other 50% of the data, it is also not a reliable measure of dispersion.
$$\text{Coefficient of the Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}.$$

Example. Calculate the quartile deviation of the grades of 39 students in statistics given below:

Grades:            0–5   5–10   10–15   15–20   20–25   25–30
No. of students:   4     6      8       12      7       2

Sol. The cumulative frequency table is given below:

Grades   No. of students (f)   C.F.
0–5      4                     4
5–10     6                     10
10–15    8                     18
15–20    12                    30
20–25    7                     37
25–30    2                     39

Here N = Σf = 39.
$\frac{N}{4} = 9.75$ ∴ the class of $Q_1$ is 5–10.
$$Q_1 = l + \frac{h}{f}\left(\frac{N}{4} - C\right) = 5 + \frac{5}{6}(9.75 - 4) = 5 + \frac{5 \times 5.75}{6} = 9.79$$
$\frac{3N}{4} = 29.25$ ∴ the class of $Q_3$ is 15–20.
$$Q_3 = l + \frac{h}{f}\left(\frac{3N}{4} - C\right) = 15 + \frac{5}{12}(29.25 - 18) = 15 + \frac{5 \times 11.25}{12} = 19.69$$
$$\text{Quartile deviation} = \frac{1}{2}(Q_3 - Q_1) = \frac{1}{2}(19.69 - 9.79) = \frac{1}{2} \times 9.90 = 4.95.$$

(c) Average Deviation or Mean Deviation. If $x_1, x_2, x_3, \ldots, x_n$ occur $f_1, f_2, f_3, \ldots, f_n$ times respectively and $N = \sum_{i=1}^{n} f_i$, the mean deviation from the average A (usually mean or median) is given by
$$\text{Mean deviation} = \frac{1}{N}\sum_{i=1}^{n} f_i\,|x_i - A|,$$
where $|x_i - A|$ represents the modulus or the absolute value of the deviation $(x_i - A)$.
Since the mean deviation is based on all the values of the variate, it is a better measure of dispersion than range or quartile deviation. But some artificiality is created due to ignoring the signs of the deviations $(x_i - A)$. This renders it useless for further mathematical treatment.
$$\text{Coefficient of Mean Deviation} = \frac{\text{Mean Deviation}}{\text{Average from which it is calculated}}.$$

Example. Find the mean deviation from the median of the following frequency distribution:

Grades:            0–10   10–20   20–30   30–40   40–50
No. of students:   5      8       15      16      6

Sol.

Mid-value (x)   f    C.F.   |x − Md|   f|x − Md|
5               5    5      23         115
15              8    13     13         104
25              15   28     3          45
35              16   44     7          112
45              6    50     17         102
Total           50                     478

$\frac{N}{2} = 25$ ∴ the median class corresponds to C.F. 28, i.e., the median class is 20–30.
$$\text{Median } M_d = l + \frac{h}{f}\left(\frac{N}{2} - C\right) = 20 + \frac{10}{15}(25 - 13) = 20 + 8 = 28$$
$$\text{Mean deviation from the median} = \frac{1}{N}\Sigma f\,|x - M_d| = \frac{478}{50} = 9.56.$$

(d) Standard Deviation.
Root-Mean Square Deviation. The root-mean square deviation, denoted by s, is defined as the positive square root of the mean of the squares of the deviations from an arbitrary origin A. Thus
$$s = +\sqrt{\frac{1}{N}\Sigma f_i (x_i - A)^2}$$
When the deviations are taken from the mean $\bar{x}$, the root-mean square deviation is called the standard deviation and is denoted by the Greek letter σ. Thus
$$\sigma = +\sqrt{\frac{1}{N}\Sigma f_i (x_i - \bar{x})^2}.$$
Note. The square of the standard deviation, $\sigma^2$, is called the variance.

Short-cut methods for calculating Standard Deviation (σ).
(i) Direct Method
$$\sigma = \sqrt{\frac{1}{N}\Sigma f_i (x_i - \bar{x})^2}$$
$$\Rightarrow \sigma^2 = \frac{1}{N}\Sigma f_i (x_i^2 - 2x_i\bar{x} + \bar{x}^2) = \frac{1}{N}\Sigma f_i x_i^2 - 2\bar{x} \cdot \frac{1}{N}\Sigma f_i x_i + \bar{x}^2 \cdot \frac{1}{N}\Sigma f_i$$
(taking the constants $\bar{x}$, $\bar{x}^2$ outside the summation sign)
$$= \frac{1}{N}\Sigma f_i x_i^2 - 2\bar{x} \cdot \bar{x} + \bar{x}^2 \cdot \frac{1}{N} \cdot N = \frac{1}{N}\Sigma f_i x_i^2 - \bar{x}^2$$
$$\Rightarrow \sigma = \sqrt{\frac{1}{N}\Sigma f_i x_i^2 - \bar{x}^2} = \sqrt{\frac{1}{N}\Sigma f_i x_i^2 - \left(\frac{1}{N}\Sigma f_i x_i\right)^2}.$$

(ii) Change of Origin
Let the origin be shifted to an arbitrary point a. Let d = x − a denote the deviation of the variate x from the new origin.
$$d = x - a \Rightarrow \bar{d} = \bar{x} - a \quad \therefore d - \bar{d} = x - \bar{x}$$
$$\sigma_x = \sqrt{\frac{1}{N}\Sigma f (x - \bar{x})^2} = \sqrt{\frac{1}{N}\Sigma f (d - \bar{d})^2} = \sigma_d$$
∴ The S.D. remains unchanged by a shift of origin.
$$\sigma_x = \sigma_d = \sqrt{\frac{1}{N}\Sigma f d^2 - \left(\frac{1}{N}\Sigma f d\right)^2}.$$
Note. In the case of a series of individual observations, if the mean is a whole number, take $a = \bar{x}$. In the case of a discrete series, when the values of x are not equidistant, take a somewhere in the middle of the x-series.

(iii) Shift of Origin and Change of Scale (Step Deviation Method)
Let the origin be shifted to an arbitrary point a. Let the new scale be $\frac{1}{h}$ times the original scale.
Let $u = \frac{x - a}{h}$; then $hu = x - a \Rightarrow h\bar{u} = \bar{x} - a$ ∴ $h(u - \bar{u}) = x - \bar{x}$
$$\sigma_x = \sqrt{\frac{1}{N}\Sigma f (x - \bar{x})^2} = \sqrt{\frac{1}{N}\Sigma f h^2 (u - \bar{u})^2} = h\sqrt{\frac{1}{N}\Sigma f (u - \bar{u})^2} = h\sigma_u$$
which is independent of a but not of h. Hence the S.D. is independent of the change of origin but not of the change of scale.
$$\sigma_x = h\sigma_u = h\sqrt{\frac{1}{N}\Sigma f u^2 - \left(\frac{1}{N}\Sigma f u\right)^2}$$
Note. In the case of a discrete series, when the values of x are equidistant at intervals of h, or in the case of a continuous series having equal class intervals of width h, use the Step Deviation Method.
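The two properties derived above (the S.D. is unchanged by a shift of origin, and is multiplied by h under a change of scale) can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the function names `freq_sd` and `step_deviation_sd` are hypothetical, the data are the mid-values and frequencies of the mean-deviation example earlier in this section, and the origins a = 25, a = 35 and the width h = 10 are assumed choices.

```python
from math import sqrt


def freq_sd(values, freqs):
    """S.D. of a frequency distribution: sqrt(sum(f*x^2)/N - (sum(f*x)/N)^2)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    mean_sq = sum(f * x * x for x, f in zip(values, freqs)) / n
    return sqrt(mean_sq - mean * mean)


def step_deviation_sd(values, freqs, a, h):
    """S.D. via step deviations u = (x - a)/h, then sigma_x = h * sigma_u."""
    u = [(x - a) / h for x in values]
    return h * freq_sd(u, freqs)


# Mid-values and frequencies taken from the mean-deviation example
mids = [5, 15, 25, 35, 45]
freqs = [5, 8, 15, 16, 6]

direct = freq_sd(mids, freqs)                          # direct method
stepped = step_deviation_sd(mids, freqs, a=25, h=10)   # assumed origin 25
shifted = step_deviation_sd(mids, freqs, a=35, h=10)   # a different assumed origin
print(direct, stepped, shifted)
```

For these data the variance works out to 132, so all three values should agree with $\sqrt{132}$ up to floating-point rounding, whichever origin is chosen.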
Relation between σ and s
By definition, we have
$$s^2 = \frac{1}{N}\Sigma f_i (x_i - a)^2 = \frac{1}{N}\Sigma f_i (x_i - \bar{x} + \bar{x} - a)^2$$
$$= \frac{1}{N}\Sigma f_i (x_i - \bar{x} + d)^2 \quad \text{where } d = \bar{x} - a$$
$$= \frac{1}{N}\Sigma f_i \left[(x_i - \bar{x})^2 + d^2 + 2d(x_i - \bar{x})\right]$$
$$= \frac{1}{N}\Sigma f_i (x_i - \bar{x})^2 + \frac{d^2}{N}\Sigma f_i + \frac{2d}{N}\Sigma f_i (x_i - \bar{x}) = \sigma^2 + \frac{d^2}{N} \cdot N + \frac{2d}{N}(0)$$
[∵ $\Sigma f_i (x_i - \bar{x})$ = algebraic sum of the deviations from the mean = 0]
$$= \sigma^2 + d^2$$
Hence $s^2 = \sigma^2 + d^2$. Since $d^2 \geq 0$, ∴ $s^2 \geq \sigma^2$.
Clearly $s^2$ is least when d = 0, i.e., $\bar{x} = a$.
∴ The mean square deviation ($s^2$), and consequently the root-mean square deviation (s), is least when the deviations are measured from the mean. Hence the standard deviation is the least possible root-mean square deviation.

21.11 RELATIONS BETWEEN MEASURES OF DISPERSION

The following relations hold approximately (they are close for a normal or moderately skewed distribution):
$$\text{Mean Deviation} = \frac{4}{5}\,(\text{standard deviation}) = \frac{4}{5}\sigma$$
$$\text{Semi-interquartile range} = \frac{2}{3}\,(\text{standard deviation}) = \frac{2}{3}\sigma.$$

21.12 COEFFICIENT OF DISPERSION

Whenever we want to compare the variability of two series that differ widely in their averages or which are measured in different units, we calculate the coefficients of dispersion, which, being ratios, are numbers independent of the units of measurement. The coefficients of dispersion (C.D.) based on different measures of dispersion are as follows:

(a) Based on range: $\text{C.D.} = \dfrac{x_{\max} - x_{\min}}{x_{\max} + x_{\min}}$
(b) Based on quartile deviation: $\text{C.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
(c) Based on mean deviation: $\text{C.D.} = \dfrac{\text{mean deviation}}{\text{average from which it is calculated}}$
(d) Based on standard deviation: $\text{C.D.} = \dfrac{\text{S.D.}}{\text{Mean}} = \dfrac{\sigma}{\bar{x}}$

Coefficient of variation. It is the percentage variation in the mean, the standard deviation being considered as the total variation in the mean.
$$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100.$$

ILLUSTRATIVE EXAMPLES

Example 1.
Find the mean and standard deviation of the following:

Series:      15–20   20–25   25–30   30–35   35–40   40–45   45–50   50–55   55–60   60–65   65–70   70–75
Frequency:   2       5       8       11      15      20      20      17      16      13      11      5

Sol.

Mid-values (x)   f     u = (x − 47.5)/5   fu     fu²
17.5             2     –6                 –12    72
22.5             5     –5                 –25    125
27.5             8     –4                 –32    128
32.5             11    –3                 –33    99
37.5             15    –2                 –30    60
42.5             20    –1                 –20    20
47.5             20    0                  0      0
52.5             17    1                  17     17
57.5             16    2                  32     64
62.5             13    3                  39     117
67.5             11    4                  44     176
72.5             5     5                  25     125
Total            143                      5      1003

$$\bar{x} = a + h \cdot \frac{\Sigma fu}{N} = 47.5 + 5 \times \frac{5}{143} = 47.7$$
$$\sigma_x = h\sigma_u = h\sqrt{\frac{1}{N}\Sigma fu^2 - \left(\frac{\Sigma fu}{N}\right)^2} = 5\sqrt{\frac{1003}{143} - \left(\frac{5}{143}\right)^2} = 5 \times 2.65 = 13.25.$$

Example 2. Goals scored by two teams A and B in a soccer season were as follows:

No. of goals scored in a match   No. of matches (A)   No. of matches (B)
0                                27                   17
1                                9                    9
2                                8                    6
3                                5                    5
4                                4                    3

Find out which team is more consistent.

Sol. Calculation of the coefficient of variation for team A:

No. of goals scored (x)   No. of matches (f)   d_x = x − 2   f d_x   f d_x²
0                         27                   –2            –54     108
1                         9                    –1            –9      9
2                         8                    0             0       0
3                         5                    1             5       5
4                         4                    2             8       16
Total                     53                                 –50     138

$$\bar{x} = a + \frac{\Sigma f d_x}{N} = 2 + \frac{-50}{53} = 2 - 0.94 = 1.06$$
$$\sigma = \sqrt{\frac{1}{N}\Sigma f d_x^2 - \left(\frac{\Sigma f d_x}{N}\right)^2} = \sqrt{\frac{138}{53} - \left(\frac{-50}{53}\right)^2} = 1.31$$
$$\text{Coefficient of variation for team A} = \frac{\sigma}{\bar{x}} \times 100 = \frac{1.31}{1.06} \times 100 = 123.6$$

Calculation of the coefficient of variation for team B:

No. of goals scored (x)   No. of matches (f)   d_x = x − 2   f d_x   f d_x²
0                         17                   –2            –34     68
1                         9                    –1            –9      9
2                         6                    0             0       0
3                         5                    1             5       5
4                         3                    2             6       12
Total                     40                                 –32     94

$$\bar{x} = a + \frac{\Sigma f d_x}{N} = 2 - \frac{32}{40} = 2 - 0.8 = 1.2$$
$$\sigma = \sqrt{\frac{1}{N}\Sigma f d_x^2 - \left(\frac{\Sigma f d_x}{N}\right)^2} = \sqrt{\frac{94}{40} - \left(\frac{-32}{40}\right)^2} = 1.3$$
$$\text{Coefficient of variation for team B} = \frac{\sigma}{\bar{x}} \times 100 = \frac{1.3}{1.2} \times 100 = 108.3$$
Since the coefficient of variation is less for team B, team B is more consistent.

21.13 THEOREM

The standard deviations of two series containing $n_1$ and $n_2$ members are $\sigma_1$ and $\sigma_2$ respectively, being measured from their respective means $\bar{x}_1$ and $\bar{x}_2$. If the two series are grouped together as one series of $(n_1 + n_2)$ members, show that the standard deviation σ of this series, measured from its mean $\bar{x}$, is given by
$$\sigma^2 = \frac{n_1\sigma_1^2 + n_2\sigma_2^2}{n_1 + n_2} + \frac{n_1 n_2}{(n_1 + n_2)^2}(\bar{x}_1 - \bar{x}_2)^2.$$

Proof. Let $S_1^2$ and $S_2^2$ be the mean square deviations of the two series respectively and $S^2$ be the mean square deviation of the two series taken together. Then if a is the assumed mean, we have
$$S^2 = \frac{1}{n_1 + n_2}\sum_{1}^{n_1 + n_2} f(x - a)^2 = \frac{1}{n_1 + n_2}\left[\sum_{1}^{n_1} f(x - a)^2 + \sum_{n_1 + 1}^{n_1 + n_2} f(x - a)^2\right]$$
$$= \frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2} \qquad \left[\because S_1^2 = \frac{1}{n_1}\sum_{1}^{n_1} f(x - a)^2 \text{ etc.}\right]$$
$$= \frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2} \qquad [\because S^2 = \sigma^2 + d^2 \text{ where } d = \bar{x} - a]$$
$$= \frac{n_1\sigma_1^2 + n_2\sigma_2^2}{n_1 + n_2} + \frac{n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2} \qquad \ldots (1)$$
where $d_1 = \bar{x}_1 - a$, $d_2 = \bar{x}_2 - a$.
Now if a is the mean of the two combined series, i.e., if $a = \bar{x}$, then $S^2 = \sigma^2$.
Also $\bar{x} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$
$$\therefore d_1 = \bar{x}_1 - \bar{x} = \bar{x}_1 - \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \frac{n_2(\bar{x}_1 - \bar{x}_2)}{n_1 + n_2}$$
$$d_2 = \bar{x}_2 - \bar{x} = \bar{x}_2 - \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \frac{n_1(\bar{x}_2 - \bar{x}_1)}{n_1 + n_2}$$
$$\therefore n_1 d_1^2 + n_2 d_2^2 = \frac{n_1 n_2^2(\bar{x}_1 - \bar{x}_2)^2}{(n_1 + n_2)^2} + \frac{n_2 n_1^2(\bar{x}_2 - \bar{x}_1)^2}{(n_1 + n_2)^2} = \frac{n_1 n_2(\bar{x}_1 - \bar{x}_2)^2}{(n_1 + n_2)^2} \cdot (n_2 + n_1) = \frac{n_1 n_2}{n_1 + n_2}(\bar{x}_1 - \bar{x}_2)^2$$
∴ From (1),
$$\sigma^2 = \frac{n_1\sigma_1^2 + n_2\sigma_2^2}{n_1 + n_2} + \frac{n_1 n_2}{(n_1 + n_2)^2}(\bar{x}_1 - \bar{x}_2)^2 \qquad (\because S^2 = \sigma^2).$$

Example. The first of two samples has 100 items with mean 15 and standard deviation 3. If the whole group has 250 items with mean 15.6 and standard deviation $\sqrt{13.44}$, find the standard deviation of the second group.

Sol. Here $n_1 = 100$, $\bar{x}_1 = 15$, $\sigma_1 = 3$
$n = n_1 + n_2 = 250$, $\bar{x} = 15.6$, $\sigma = \sqrt{13.44}$
∴ $n_2 = 250 - 100 = 150$
Using $\bar{x} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$, we have $15.6 = \dfrac{100(15) + 150(\bar{x}_2)}{250}$
or $150\bar{x}_2 = 250 \times 15.6 - 1500 = 2400$ ∴ $\bar{x}_2 = 16$
$d_1 = \bar{x}_1 - \bar{x} = 15 - 15.6 = -0.6$
$d_2 = \bar{x}_2 - \bar{x} = 16 - 15.6 = 0.4$
The variance of the combined group, $\sigma^2$, is given by the formula
$$\sigma^2 = \frac{n_1\sigma_1^2 + n_2\sigma_2^2}{n_1 + n_2} + \frac{n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}$$
or $(n_1 + n_2)\sigma^2 = n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)$
∴ $250 \times 13.44 = 100(9 + 0.36) + 150(\sigma_2^2 + 0.16)$
or $150\sigma_2^2 = 250 \times 13.44 - 100 \times 9.36 - 150 \times 0.16 = 3360 - 936 - 24 = 2400$
∴ $\sigma_2^2 = 16$. Hence $\sigma_2 = 4$.

21.14 SKEWNESS

For a symmetrical distribution, the frequencies are symmetrically distributed about the mean, i.e., variates equidistant from the mean have equal frequencies. Also, in the case of such a distribution, the mean, mode, and median coincide and the median lies halfway between the two quartiles. Thus $M = M_0 = M_d$ and $Q_3 - M = M - Q_1$.
Skewness means a lack of symmetry or lopsidedness in a frequency distribution. The object of measuring skewness is to estimate the extent to which a distribution is distorted from a perfectly symmetrical distribution. Skewness indicates whether the curve is turned more to one side than to the other, i.e., whether the curve has a longer tail on one side. Skewness can be positive as well as negative. Skewness is positive if the longer tail of the distribution lies toward the right and negative if it lies toward the left.

21.15 MEASURES OF SKEWNESS

Measures of skewness give us an idea about the extent of "lopsidedness" in a series. Such measures should be
(i) Pure numbers, so as to be independent of the units in which the variable is measured.
(ii) Zero when the distribution is symmetrical.
Relative measures of skewness are called the coefficient of skewness.
They are independent of the units of measurement and, as such, they are pure numbers.
Bowley's coefficient of skewness based on quartiles is defined as
$$S_k = \frac{(Q_3 - M_d) - (M_d - Q_1)}{(Q_3 - M_d) + (M_d - Q_1)} = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1}$$
Karl Pearson's coefficient of skewness is defined as
$$S_k = \frac{M - M_0}{\sigma} = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}}$$
If the mode is ill-defined, then using $M_0 = 3M_d - 2M$, we have
$$S_k = \frac{3(M - M_d)}{\sigma}.$$
The value of Bowley's coefficient of skewness lies between –1 and +1 and that of Karl Pearson's coefficient of skewness lies between –3 and +3.

Example. Find the coefficient of dispersion and a measure of skewness from the following table giving the wage bonuses of 230 people:

Wage bonuses (in $)   No. of people
70–80                 12
80–90                 18
90–100                35
100–110               42
110–120               50
120–130               45
130–140               20
140–150               8

Sol.

Mid-values (x)   No. of people (f)   C.F.   u = (x − 105)/10   fu     fu²
75               12                  12     –3                 –36    108
85               18                  30     –2                 –36    72
95               35                  65     –1                 –35    35
105              42                  107    0                  0      0
115              50                  157    1                  50     50
125              45                  202    2                  90     180
135              20                  222    3                  60     180
145              8                   230    4                  32     128
Total            N = 230                                       125    753

$$\text{Mean } M = a + h\,\frac{\Sigma fu}{N} = 105 + 10 \times \frac{125}{230} = 105 + 5.4 = \$110.4.$$
The greatest frequency 50 lies in the class 110–120. Hence this is the modal class.
$f_m = 50,\ f_1 = 42,\ f_2 = 45,\ l = 110,\ h = 10$
$$\therefore \text{Mode } M_0 = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h = 110 + \frac{50 - 42}{100 - 42 - 45} \times 10 = 110 + \frac{80}{13} = 110 + 6.2 = \$116.2$$
$$\text{Standard deviation } \sigma = h\sqrt{\frac{1}{N}\Sigma fu^2 - \left(\frac{1}{N}\Sigma fu\right)^2} = 10\sqrt{\frac{753}{230} - \left(\frac{125}{230}\right)^2} = \$17.3$$
$$\therefore \text{Coefficient of dispersion} = \frac{\sigma}{M} = \frac{17.3}{110.4} = 0.16$$
$$\text{Measure of skewness } S_k = \frac{M - M_0}{\sigma} = \frac{110.4 - 116.2}{17.3} = -0.33.$$

21.16 MOMENTS

The rth moment of a variable x about any point A is denoted by $\mu_r'$ and is defined as
$$\mu_r' = \frac{1}{N}\Sigma f(x - A)^r \quad \text{where } N = \Sigma f$$
The rth moment of a variable x about the mean M is denoted by $\mu_r$ and is defined as
$$\mu_r = \frac{1}{N}\Sigma f(x - M)^r$$
In particular,
$$\mu_0' = \frac{1}{N}\Sigma f(x - A)^0 = \frac{1}{N}\Sigma f = \frac{1}{N} \cdot N = 1$$
Similarly, $\mu_0 = 1$.
$$\mu_1 = \frac{1}{N}\Sigma f(x - M) = 0, \text{ being the algebraic sum of the deviations from the mean}$$
$$\mu_2 = \frac{1}{N}\Sigma f(x - M)^2 = \sigma^2, \text{ by definition.}$$
The results $\mu_0 = 1$, $\mu_1 = 0$, $\mu_2 = \sigma^2$ are of fundamental importance and should be committed to memory.

21.17 RELATION BETWEEN MOMENTS ABOUT THE MEAN IN TERMS OF MOMENTS ABOUT ANY POINT AND VICE VERSA

By definition,
$$\mu_r' = \frac{1}{N}\Sigma f(x - A)^r, \text{ where A is any point}$$
or
$$\mu_r' = \frac{1}{N}\Sigma f d^r, \text{ where } d = x - A \qquad \ldots (i)$$
Setting r = 1, $\mu_1' = \dfrac{1}{N}\Sigma fd$
Now $M = A + \dfrac{1}{N}\Sigma fd = A + \mu_1'$
$$\therefore \mu_1' = M - A \qquad \ldots (ii)$$
Now
$$\mu_r = \frac{1}{N}\Sigma f(x - M)^r = \frac{1}{N}\Sigma f(x - A + A - M)^r = \frac{1}{N}\Sigma f(d - \mu_1')^r \qquad [\text{using } (ii)]$$
$$= \frac{1}{N}\Sigma f\left[d^r - {}^rC_1 d^{r-1}\mu_1' + {}^rC_2 d^{r-2}\mu_1'^2 - {}^rC_3 d^{r-3}\mu_1'^3 + \ldots + (-1)^r \mu_1'^r\right]$$
$$= \frac{1}{N}\Sigma f d^r - {}^rC_1 \mu_1' \cdot \frac{1}{N}\Sigma f d^{r-1} + {}^rC_2 \mu_1'^2 \cdot \frac{1}{N}\Sigma f d^{r-2} - {}^rC_3 \mu_1'^3 \cdot \frac{1}{N}\Sigma f d^{r-3} + \ldots + (-1)^r \mu_1'^r \cdot \frac{1}{N}\Sigma f$$
$$= \mu_r' - {}^rC_1 \mu_{r-1}'\mu_1' + {}^rC_2 \mu_{r-2}'\mu_1'^2 - {}^rC_3 \mu_{r-3}'\mu_1'^3 + \ldots + (-1)^r \mu_1'^r \qquad [\text{using } (i)]$$
In particular, setting r = 2, 3, 4 and noting that $\mu_0' = 1$, we get
$$\mu_2 = \mu_2' - 2\mu_1'^2 + \mu_0'\mu_1'^2 = \mu_2' - \mu_1'^2$$
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 3\mu_1'\mu_1'^2 - \mu_0'\mu_1'^3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3$$
$$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 4\mu_1'\mu_1'^3 + \mu_0'\mu_1'^4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4$$
Hence, with $\mu_1' = M - A$,
$$\mu_1 = 0$$
$$\mu_2 = \mu_2' - \mu_1'^2$$
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3$$
$$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4$$
Conversely,
$$\mu_r = \frac{1}{N}\Sigma f(x - M)^r = \frac{1}{N}\Sigma f d^r \quad \text{where } d = x - M \qquad \ldots (iii)$$
Now
$$\mu_r' = \frac{1}{N}\Sigma f(x - A)^r = \frac{1}{N}\Sigma f(x - M + M - A)^r = \frac{1}{N}\Sigma f(d + \mu_1')^r \qquad [\text{using } (ii)]$$
$$= \frac{1}{N}\Sigma f\left(d^r + {}^rC_1 d^{r-1}\mu_1' + {}^rC_2 d^{r-2}\mu_1'^2 + {}^rC_3 d^{r-3}\mu_1'^3 + \ldots + \mu_1'^r\right)$$
$$= \frac{1}{N}\Sigma f d^r + {}^rC_1 \mu_1' \cdot \frac{1}{N}\Sigma f d^{r-1} + {}^rC_2 \mu_1'^2 \cdot \frac{1}{N}\Sigma f d^{r-2} + \ldots + \mu_1'^r \cdot \frac{1}{N}\Sigma f$$
$$= \mu_r + {}^rC_1 \mu_{r-1}\mu_1' + {}^rC_2 \mu_{r-2}\mu_1'^2 + \ldots + \mu_1'^r \qquad [\text{using } (iii)]$$
In particular, setting r = 2, 3, 4 and noting that $\mu_1 = 0$, $\mu_0 = 1$, we get
$$\mu_2' = \mu_2 + 2\mu_1\mu_1' + \mu_0\mu_1'^2 = \mu_2 + \mu_1'^2$$
$$\mu_3' = \mu_3 + 3\mu_2\mu_1' + 3\mu_1\mu_1'^2 + \mu_0\mu_1'^3 = \mu_3 + 3\mu_2\mu_1' + \mu_1'^3$$
$$\mu_4' = \mu_4 + 4\mu_3\mu_1' + 6\mu_2\mu_1'^2 + 4\mu_1\mu_1'^3 + \mu_0\mu_1'^4 = \mu_4 + 4\mu_3\mu_1' + 6\mu_2\mu_1'^2 + \mu_1'^4.$$

21.18 EFFECT OF A CHANGE OF ORIGIN AND SCALE ON MOMENTS

Let $u = \dfrac{x - A}{h}$, i.e., $x = A + hu$
∴ $\bar{x} = A + h\bar{u}$, where the bar denotes the mean of the respective variable
∴ $x - \bar{x} = h(u - \bar{u})$
$$\mu_r' = \frac{1}{N}\Sigma f(x - A)^r = \frac{1}{N}\Sigma f h^r u^r = h^r \cdot \frac{1}{N}\Sigma f u^r$$
Also
$$\mu_r = \frac{1}{N}\Sigma f(x - \bar{x})^r = \frac{1}{N}\Sigma f h^r (u - \bar{u})^r = h^r \cdot \frac{1}{N}\Sigma f(u - \bar{u})^r$$
Hence the rth moment of the variable x is $h^r$ times the corresponding moment of the variable u.

21.19 SHEPPARD'S CORRECTIONS FOR MOMENTS

In the case of class intervals, we assume that the frequencies are concentrated at the mid-points of the class intervals. Since this assumption is not true in general, some error is likely to creep into the calculation of moments. W.F. Sheppard gave the following formulae by which these errors may be corrected:
$$\mu_2\,(\text{corrected}) = \mu_2 - \frac{1}{12}h^2; \qquad \mu_3\,(\text{corrected}) = \mu_3$$
$$\mu_4\,(\text{corrected}) = \mu_4 - \frac{1}{2}h^2\mu_2 + \frac{7}{240}h^4$$
where h is the width of the class intervals.

21.20 CHARLIER'S CHECK

To check the accuracy in the calculation of the first four moments, we often use the following identities, known as Charlier's checks:
$$\Sigma f(x + 1) = \Sigma fx + \Sigma f = \Sigma fx + N$$
$$\Sigma f(x + 1)^2 = \Sigma fx^2 + 2\Sigma fx + N$$
$$\Sigma f(x + 1)^3 = \Sigma fx^3 + 3\Sigma fx^2 + 3\Sigma fx + N$$
$$\Sigma f(x + 1)^4 = \Sigma fx^4 + 4\Sigma fx^3 + 6\Sigma fx^2 + 4\Sigma fx + N.$$

21.21 PEARSON'S β AND γ COEFFICIENTS

Karl Pearson defined the following four coefficients based upon the first four moments about the mean:
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \quad \gamma_1 = +\sqrt{\beta_1}; \qquad \beta_2 = \frac{\mu_4}{\mu_2^2}, \quad \gamma_2 = \beta_2 - 3$$
These coefficients are independent of the units of measurement and are, therefore, pure numbers.
Based upon moments, the coefficient of skewness is
$$S_k = \frac{\sqrt{\beta_1}\,(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}.$$

21.22 KURTOSIS

Given two frequency distributions that have the same variability as measured by the standard deviation, they may be relatively more or less flat-topped than the "normal curve". A frequency curve may be symmetrical but it may not be equally flat-topped with the normal curve. The relative flatness of the top is called kurtosis and is measured by $\beta_2$.
Curves that are neither flat nor sharply peaked are called normal curves or mesokurtic curves (see curve A in the figure). For such a curve $\beta_2 = 3$ and hence $\gamma_2 = 0$.
Curves that are flatter than the normal curve (see curve B in the figure) are called platykurtic. For such a curve $\beta_2 < 3$ and hence $\gamma_2 < 0$.
Curves that are more sharply peaked than the normal curve (see curve C in the figure) are called leptokurtic.
For such a curve $\beta_2 > 3$ and hence $\gamma_2 > 0$.

21.23 β1 AS A MEASURE OF SKEWNESS

For a symmetrical distribution, all the moments of odd order about the mean vanish. Let $\bar{x}$ denote the mean of the variate x; then

$$\mu_{2r+1} = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^{2r+1}, \qquad N = \Sigma f_i.$$

In a symmetrical distribution, the values of the variate equidistant from the mean have equal frequencies.

∴ $f_1 (x_1 - \bar{x})^{2r+1} + f_n (x_n - \bar{x})^{2r+1} = 0$ [∵ $x_1 - \bar{x}$ and $x_n - \bar{x}$ are equal in magnitude but opposite in sign. Also $f_1 = f_n$]

Similarly, $f_2 (x_2 - \bar{x})^{2r+1} + f_{n-1} (x_{n-1} - \bar{x})^{2r+1} = 0$, and so on.

∴ If n is even, all the terms in $\frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^{2r+1}$ cancel in pairs. If n is odd, the terms again cancel in pairs and the middle term vanishes, since the middle value of the variate equals $\bar{x}$.

Hence $\mu_{2r+1} = 0$. In particular $\mu_3 = 0$ and hence $\beta_1 = \dfrac{\mu_3^2}{\mu_2^3} = 0$.

Thus, $\beta_1$ gives a measure of departure from symmetry, i.e., of skewness.

Example. Calculate the first four moments of the following distribution about the mean and hence find $\beta_1$ and $\beta_2$:

x :   0    1    2    3    4    5    6    7    8
f :   1    8   28   56   70   56   28    8    1

Sol. Let us first calculate the moments about x = 4:

$$\mu_r' = \frac{1}{N}\,\Sigma f (x - 4)^r = \frac{1}{N}\,\Sigma f d^r, \quad \text{where } d = x - 4.$$

x        f      d = x − 4     fd      fd²     fd³     fd⁴
0        1        −4          −4      16     −64     256
1        8        −3         −24      72    −216     648
2       28        −2         −56     112    −224     448
3       56        −1         −56      56     −56      56
4       70         0           0       0       0       0
5       56         1          56      56      56      56
6       28         2          56     112     224     448
7        8         3          24      72     216     648
8        1         4           4      16      64     256
Total  N = 256                  0     512       0    2816

$$\mu_1' = \frac{1}{N}\,\Sigma f d = 0; \qquad \mu_2' = \frac{1}{N}\,\Sigma f d^2 = \frac{512}{256} = 2$$
$$\mu_3' = \frac{1}{N}\,\Sigma f d^3 = 0; \qquad \mu_4' = \frac{1}{N}\,\Sigma f d^4 = \frac{2816}{256} = 11$$

Moments about the mean are

$$\mu_1 = 0 \text{ (always)}; \qquad \mu_2 = \mu_2' - \mu_1'^2 = 2$$
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3 = 0; \qquad \mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4 = 11$$

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0; \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{11}{4} = 2.75.$$
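As a cross-check on the worked example above, the conversion from moments about an arbitrary point to moments about the mean (Section 21.17) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original text: the helper names `raw_moments` and `central_from_raw` are hypothetical, and the code assumes the data are given as a simple frequency table.

```python
from math import comb

def raw_moments(xs, fs, a, order=4):
    """Moments mu'_0 .. mu'_order of a frequency table about the point a."""
    n = sum(fs)
    return [sum(f * (x - a) ** r for x, f in zip(xs, fs)) / n
            for r in range(order + 1)]

def central_from_raw(raw):
    """Moments about the mean from moments about any point (Section 21.17):
    mu_r = sum over k of (-1)^k * C(r, k) * mu'_(r-k) * (mu'_1)^k."""
    m1 = raw[1]  # mu'_1 = M - A
    return [sum((-1) ** k * comb(r, k) * raw[r - k] * m1 ** k
                for k in range(r + 1))
            for r in range(len(raw))]

# Frequency table from the worked example
xs = list(range(9))
fs = [1, 8, 28, 56, 70, 56, 28, 8, 1]

raw = raw_moments(xs, fs, a=4)      # moments about x = 4
mu = central_from_raw(raw)          # moments about the mean
beta1 = mu[3] ** 2 / mu[2] ** 3
beta2 = mu[4] / mu[2] ** 2
print(mu[1:], beta1, beta2)
```

Changing `a` to any other origin should leave `mu`, `beta1`, and `beta2` unchanged, which is a convenient check on hand calculations.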
TEST YOUR KNOWLEDGE

1. Calculate the quartile deviation of the grades of 63 students in Physics given below:

Grades          :  0–10  10–20  20–30  30–40  40–50  50–60  60–70  70–80  80–90  90–100
No. of students :    5      7     10     16     11      7      3      2      2      0

2. Find the mean deviation from the mean of the following distribution:

Class     :  0–6  6–12  12–18  18–24  24–30
Frequency :   8    10     12      9      5

3. Compute the mean deviation from the median of the following distribution:

Grades          :  0–10  10–20  20–30  30–40  40–50
No. of students :    5     10     20      5     10

4. Compute the standard deviation for the following data relating to grades obtained by 15 students: 12, 21, 21, 23, 27, 28, 30, 34, 37, 39, 39, 39, 40, 49, 54.

5. Calculate the mean and standard deviation for the following distribution:

x :  56  63  70  77  84  91  98
f :   3   6  14  16  13   6   2

6. Calculate the mean and standard deviation for the following:

Size of item :  6  7  8   9  10  11  12
Frequency    :  3  6  9  13   8   5   4

7. The following table shows the grades obtained by 100 candidates in an examination. Calculate the mean, median, and standard deviation:

Grades obtained   :  1–10  11–20  21–30  31–40  41–50  51–60
No. of candidates :    3     16     26     31     16      8

8. Calculate the mean and standard deviation of the following frequency distribution:

Weekly bonus wages in $ :  4.5–12.5  12.5–20.5  20.5–28.5  28.5–36.5  36.5–44.5  44.5–52.5  52.5–60.5  60.5–68.5  68.5–76.5
No. of workers          :     4         24         21         18          5          3          5          8          2

9. (i) The mean of five items of an observation is 4 and the variance is 5.2. If three of the items are 1, 2, and 6, then find the other two.
(ii) Show that the variance of the first n positive integers is $\frac{1}{12}(n^2 - 1)$.

10. Compute the quartile deviation and standard deviation for the following:

x :  100–109  110–119  120–129  130–139  140–149  150–159  160–169  170–179
f :     15       44      133      150      125       82       35       16

11. Find the standard deviation for the following data giving bonus wages of 230 people:

Bonus wages (in $) :  70–80  80–90  90–100  100–110  110–120  120–130  130–140  140–150
No. of people      :   12     18      35       42       50       45       20        8

12. A collar manufacturer is considering the production of a new type of collar to attract young men. The following statistics of neck circumferences are available based upon the measurements of a typical group of college students:

Mid-value (inches) :  12.5  13.0  13.5  14.0  14.5  15.0  15.5  16.0  16.5
No. of students    :    4    19    30    63    66    29    18     1     1

Compute the mean, standard deviation, and variance.

13. A student obtained the mean and standard deviation of 100 observations as 40 and 5 respectively. It was later discovered that he had wrongly copied down an observation as 50 instead of 40. Calculate the correct mean and standard deviation.

14. The scores of two golfers for 10 rounds each are:

A :  58  59  60  54  65  66  52  75  69  52
B :  84  56  92  65  86  78  44  54  78  68

Which may be regarded as the more consistent player?

15. The heights and weights of 10 people are given below. In which characteristic are they more variable?

Height in cm :  170  172  168  177  179  171  173  178  173  179
Weight in kg :   75   74   75   76   77   73   76   75   74   75

16. The following are the rushing yards of two high school football teams A and B in a series of games:

A :  12  115   6  73  7  19  119  36  84  29
B :  47   12  16  42  4  51   37  48  43   0

Which team has the better running game and which is more consistent?

17. An analysis of monthly bonus wages paid to the workers in two firms A and B belonging to the same industry gives the following results:

                                          Firm A   Firm B
Number of workers                           500      600
Average monthly wage                       $186     $175
Variance of distribution of bonus wages      81      100

(i) Which firm, A or B, has a larger bonus wage bill?
(ii) In which firm, A or B, is there greater variability in individual bonus wages?
(iii) Calculate the variance of the distribution of bonus wages of all the workers in the firms A and B taken together.

18. Find the coefficient of skewness for the following distribution:

Class     :  0–5  5–10  10–15  15–20  20–25  25–30  30–35  35–40
Frequency :   2     5      7     13     21     16      8      3

19. Calculate the quartile coefficient of skewness for the following distribution:

x :  1–5  6–10  11–15  16–20  21–25  26–30  31–35
f :   3     4     68     30     10      6      2

20. Calculate the first four moments about the mean for the following data:

Variate   :  1  2   3   4   5   6  7  8  9
Frequency :  1  6  13  25  30  22  9  5  2

21. The first three moments of a distribution about the value 2 of the variable are 1, 16, and −40. Show that the mean is 3, the variance is 15, and μ3 = −86. Also show that the first three moments about x = 0 are 3, 24, and 76.

22. For a distribution, the mean is 10, the variance is 16, γ1 is +1 and β2 is 4. Find the first four moments about the origin.

23. The first four moments of a distribution about the value 5 of the variable are 2, 20, 40, and 50. Find the moments about the mean.

24. Show that for a discrete distribution: (i) β2 > 1 (ii) β2 > β1

Answers

1. 12.32
2. 6.3
3. 9
4. 10.9
5. 75.53, 9.87
6. 9, 1.61
7. 32, 32.6, 12.4
8. $31.35, $16.64
9. (i) 4, 7
10. 10.9, 15.26
11. $17.10
12. 14.24, 0.72, 0.52
13. 39.9, 4.9
14. A
15. Height
16. A, B
17. (i) B (ii) B (iii) $180, 121.36
18. −1
19. 0.25
20. 0, 2.49, 0.68, 18.26
22. 10, 116, 1544, 23184
23. 0, 16, −64, 162

21.24 CORRELATION

In a bivariate distribution, if the change in one variable affects a change in the other variable, the variables are said to be correlated.

If the two variables deviate in the same direction, i.e., if the increase (or decrease) in one results in a corresponding increase (or decrease) in the other, the correlation is said to be direct or positive. E.g., the correlation between income and expenditure is positive.

If the two variables deviate in opposite directions, i.e., if the increase (or decrease) in one results in a corresponding decrease (or increase) in the other, the correlation is said to be inverse or negative. E.g., the correlation between volume and the pressure of a perfect gas or the correlation between price and demand is negative.

Correlation is said to be perfect if the deviation in one variable is followed by a corresponding proportional deviation in the other.

21.25 SCATTER OR DOT DIAGRAMS

This is the simplest method of the diagrammatic representation of bivariate data. Let $(x_i, y_i)$, i = 1, 2, 3, . . . , n be a bivariate distribution. Let the values of the variables x and y be plotted along the x-axis and y-axis on a suitable scale. Then to every ordered pair there corresponds a point or dot in the xy-plane. The diagram of dots so obtained is called a dot or scatter diagram.

If the dots are very close to each other and the number of observations is not very large, a fairly good correlation is expected. If the dots are widely scattered, a poor correlation is expected.
21.26 KARL PEARSON’S COEFFICIENT OF CORRELATION (OR PRODUCT MOMENT CORRELATION COEFFICIENT)

The correlation coefficient between two variables x and y, usually denoted by r(x, y) or $r_{xy}$, is a numerical measure of the linear relationship between them and is defined as

$$r_{xy} = \frac{\frac{1}{n}\,\Sigma (x_i - \bar{x})(y_i - \bar{y})}{\sigma_x \sigma_y} = \frac{\frac{1}{n}\,\Sigma (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\frac{1}{n}\,\Sigma (x_i - \bar{x})^2 \cdot \frac{1}{n}\,\Sigma (y_i - \bar{y})^2}} = \frac{\Sigma (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\Sigma (x_i - \bar{x})^2 \cdot \Sigma (y_i - \bar{y})^2}}$$

Note. The correlation coefficient is independent of change of origin and scale. Let us define two new variables u and v as

$$u = \frac{x - a}{h}, \qquad v = \frac{y - b}{k},$$

where a, b, h, k are constants (with h, k > 0); then $r_{xy} = r_{uv}$.

21.27 COMPUTATION OF THE CORRELATION COEFFICIENT

We know that

$$r_{xy} = \frac{\frac{1}{n}\,\Sigma (x_i - \bar{x})(y_i - \bar{y})}{\sigma_x \sigma_y}$$

Now

$$\frac{1}{n}\,\Sigma (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n}\,\Sigma (x_i y_i - x_i \bar{y} - y_i \bar{x} + \bar{x}\bar{y})$$
$$= \frac{1}{n}\,\Sigma x_i y_i - \bar{y} \cdot \frac{1}{n}\,\Sigma x_i - \bar{x} \cdot \frac{1}{n}\,\Sigma y_i + \frac{1}{n}(n\bar{x}\bar{y})$$
$$= \frac{1}{n}\,\Sigma x_i y_i - \bar{y}\bar{x} - \bar{x}\bar{y} + \bar{x}\bar{y} = \frac{1}{n}\,\Sigma x_i y_i - \bar{x}\bar{y}$$

$$\sigma_x^2 = \frac{1}{n}\,\Sigma (x_i - \bar{x})^2 = \frac{1}{n}\,\Sigma (x_i^2 - 2x_i\bar{x} + \bar{x}^2)$$
$$= \frac{1}{n}\,\Sigma x_i^2 - 2\bar{x} \cdot \frac{1}{n}\,\Sigma x_i + \frac{1}{n}\,n\bar{x}^2 = \frac{1}{n}\,\Sigma x_i^2 - 2\bar{x}\cdot\bar{x} + \bar{x}^2 = \frac{1}{n}\,\Sigma x_i^2 - \bar{x}^2$$

Similarly,

$$\sigma_y^2 = \frac{1}{n}\,\Sigma y_i^2 - \bar{y}^2$$

$$\therefore\ r_{xy} = \frac{\frac{1}{n}\,\Sigma x_i y_i - \bar{x}\bar{y}}{\sqrt{\left(\frac{1}{n}\,\Sigma x_i^2 - \bar{x}^2\right)\left(\frac{1}{n}\,\Sigma y_i^2 - \bar{y}^2\right)}}$$

If $u = \dfrac{x - a}{h}$, $v = \dfrac{y - b}{k}$, then

$$r_{xy} = r_{uv} = \frac{\frac{1}{n}\,\Sigma u_i v_i - \bar{u}\bar{v}}{\sqrt{\left(\frac{1}{n}\,\Sigma u_i^2 - \bar{u}^2\right)\left(\frac{1}{n}\,\Sigma v_i^2 - \bar{v}^2\right)}}.$$

ILLUSTRATIVE EXAMPLES

Example 1. Ten students got the following percentage of grades in Principles of Economics and Statistics:

Roll Nos.           :   1   2   3   4   5   6   7   8   9  10
Grades in Economics :  78  36  98  25  75  82  90  62  65  39
Grades in Statistics:  84  51  91  60  68  62  86  58  53  47

Calculate the coefficient of correlation.

Sol. Let the grades in the two subjects be denoted by x and y respectively.
x       y      u = x − 65    v = y − 66     u²      v²      uv
78      84        13            18          169     324     234
36      51       −29           −15          841     225     435
98      91        33            25         1089     625     825
25      60       −40            −6         1600      36     240
75      68        10             2          100       4      20
82      62        17            −4          289      16     −68
90      86        25            20          625     400     500
62      58        −3            −8            9      64      24
65      53         0           −13            0     169       0
39      47       −26           −19          676     361     494
Total               0             0         5398    2224    2704

$$\bar{u} = \frac{1}{n}\,\Sigma u_i = 0, \qquad \bar{v} = \frac{1}{n}\,\Sigma v_i = 0$$

$$r_{uv} = \frac{\frac{1}{n}\,\Sigma u_i v_i - \bar{u}\bar{v}}{\sqrt{\left(\frac{1}{n}\,\Sigma u_i^2 - \bar{u}^2\right)\left(\frac{1}{n}\,\Sigma v_i^2 - \bar{v}^2\right)}} = \frac{\frac{1}{10}(2704)}{\sqrt{\frac{1}{10}(5398) \cdot \frac{1}{10}(2224)}} = \frac{2704}{\sqrt{5398 \times 2224}} \approx 0.780$$

Hence $r_{xy} = r_{uv} \approx 0.780$.

Example 2. Find the coefficient of correlation for the following table:

x :  10  14  18  22  26  30
y :  18  12  24   6  30  36

Sol. Let $u = \dfrac{x - 22}{4}$, $v = \dfrac{y - 24}{6}$.

x       y      u     v     u²    v²    uv
10      18    −3    −1     9     1     3
14      12    −2    −2     4     4     4
18      24    −1     0     1     0     0
22       6     0    −3     0     9     0
26      30     1     1     1     1     1
30      36     2     2     4     4     4
Total         −3    −3    19    19    12

$$\bar{u} = \frac{1}{n}\,\Sigma u_i = \frac{1}{6}(-3) = -\frac{1}{2}; \qquad \bar{v} = \frac{1}{n}\,\Sigma v_i = \frac{1}{6}(-3) = -\frac{1}{2}$$

$$r_{uv} = \frac{\frac{1}{n}\,\Sigma u_i v_i - \bar{u}\bar{v}}{\sqrt{\left(\frac{1}{n}\,\Sigma u_i^2 - \bar{u}^2\right)\left(\frac{1}{n}\,\Sigma v_i^2 - \bar{v}^2\right)}} = \frac{\frac{1}{6}(12) - \frac{1}{4}}{\sqrt{\left[\frac{1}{6}(19) - \frac{1}{4}\right]\left[\frac{1}{6}(19) - \frac{1}{4}\right]}} = 0.6$$

Hence $r_{xy} = r_{uv} = 0.6$.

Example 3. A computer, while calculating the correlation coefficient between two variables X and Y from 25 pairs of observations, obtained the following results:

n = 25, ΣX = 125, ΣY = 100, ΣX² = 650, ΣY² = 460, ΣXY = 508.

It was, however, later discovered at the time of checking that two pairs had been copied incorrectly as

X :   6   8
Y :  14   6

while the correct values were

X :   8   6
Y :  12   8

Obtain the correct value of the correlation coefficient.

Sol. Subtract the incorrect values and add the corresponding correct values:

Corrected ΣX = 125 − 6 − 8 + 8 + 6 = 125
Corrected ΣY = 100 − 14 − 6 + 12 + 8 = 100
Corrected ΣX² = 650 − 6² − 8² + 8² + 6² = 650
Corrected ΣY² = 460 − 14² − 6² + 12² + 8² = 436
Corrected ΣXY = 508 − 6 × 14 − 8 × 6 + 8 × 12 + 6 × 8 = 520

$$\bar{X} = \frac{1}{n}\,\Sigma X = \frac{1}{25} \times 125 = 5; \qquad \bar{Y} = \frac{1}{n}\,\Sigma Y = \frac{1}{25} \times 100 = 4$$

$$\text{Corrected } r_{xy} = \frac{\frac{1}{n}\,\Sigma XY - \bar{X}\bar{Y}}{\sqrt{\left(\frac{1}{n}\,\Sigma X^2 - \bar{X}^2\right)\left(\frac{1}{n}\,\Sigma Y^2 - \bar{Y}^2\right)}} = \frac{\frac{1}{25} \times 520 - 5 \times 4}{\sqrt{\left(\frac{1}{25} \times 650 - 25\right)\left(\frac{1}{25} \times 436 - 16\right)}} = \frac{\frac{4}{5}}{\sqrt{(1)\left(\frac{36}{25}\right)}} = \frac{4}{5} \times \frac{5}{6} = \frac{2}{3} = 0.67.$$

Example 4. If z = ax + by and r is the correlation coefficient between x and y, show that

$$\sigma_z^2 = a^2\sigma_x^2 + b^2\sigma_y^2 + 2abr\sigma_x\sigma_y.$$

Sol. $z = ax + by$ ⇒ $\bar{z} = a\bar{x} + b\bar{y}$ and $z_i = ax_i + by_i$, so that

$$z_i - \bar{z} = a(x_i - \bar{x}) + b(y_i - \bar{y})$$

Now

$$\sigma_z^2 = \frac{1}{n}\,\Sigma (z_i - \bar{z})^2 = \frac{1}{n}\,\Sigma [a(x_i - \bar{x}) + b(y_i - \bar{y})]^2$$
$$= \frac{1}{n}\,\Sigma \left[a^2 (x_i - \bar{x})^2 + b^2 (y_i - \bar{y})^2 + 2ab(x_i - \bar{x})(y_i - \bar{y})\right]$$
$$= a^2 \cdot \frac{1}{n}\,\Sigma (x_i - \bar{x})^2 + b^2 \cdot \frac{1}{n}\,\Sigma (y_i - \bar{y})^2 + 2ab \cdot \frac{1}{n}\,\Sigma (x_i - \bar{x})(y_i - \bar{y})$$
$$= a^2\sigma_x^2 + b^2\sigma_y^2 + 2abr\sigma_x\sigma_y \qquad \left[\because\ r = \frac{\frac{1}{n}\,\Sigma (x_i - \bar{x})(y_i - \bar{y})}{\sigma_x\sigma_y}\right]$$

21.28 CALCULATION OF THE COEFFICIENT OF CORRELATION FOR A BIVARIATE FREQUENCY DISTRIBUTION

If the bivariate data on x and y are presented in a two-way correlation table and f is the frequency of a particular cell of the correlation table, then, with n = Σf,

$$r_{xy} = \frac{\Sigma fxy - \frac{1}{n}\,\Sigma fx\,\Sigma fy}{\sqrt{\left[\Sigma fx^2 - \frac{1}{n}(\Sigma fx)^2\right]\left[\Sigma fy^2 - \frac{1}{n}(\Sigma fy)^2\right]}}$$

Since a change of origin and scale does not affect the coefficient of correlation, $r_{xy} = r_{uv}$ where the new variables u, v are properly chosen.

Example.
The following table gives, according to age, the frequency of grades obtained by 100 students in an intelligence test:

                 Age (in years)
Grades      18     19     20     21    Total
10–20        4      2      2      -       8
20–30        5      4      6      4      19
30–40        6      8     10     11      35
40–50        4      4      6      8      22
50–60        -      2      4      4      10
60–70        -      2      3      1       6
Total       19     22     31     28     100

Calculate the coefficient of correlation between age and intelligence.

Sol. Let age and intelligence be denoted by x and y respectively. Let us define two new variables u and v as

$$u = \frac{y - 45}{10}, \qquad v = x - 20.$$

Row totals (grades, y):

Grades    Mid-value     u      f      fu     fu²    fuv
10–20        15        −3      8     −24     72      30
20–30        25        −2     19     −38     76      20
30–40        35        −1     35     −35     35       9
40–50        45         0     22       0      0       0
50–60        55         1     10      10     10       2
60–70        65         2      6      12     24      −2
Totals                        100     −75    217      59

Column totals (age, x):

Age     v      f      fv     fv²    fuv
18     −2     19     −38     76      56
19     −1     22     −22     22      16
20      0     31       0      0       0
21      1     28      28     28     −13
Totals       100     −32    126      59

$$r_{xy} = r_{uv} = \frac{\Sigma fuv - \frac{1}{n}\,\Sigma fu\,\Sigma fv}{\sqrt{\left[\Sigma fu^2 - \frac{1}{n}(\Sigma fu)^2\right]\left[\Sigma fv^2 - \frac{1}{n}(\Sigma fv)^2\right]}} = \frac{59 - \frac{1}{100}(-75)(-32)}{\sqrt{\left[217 - \frac{1}{100}(-75)^2\right]\left[126 - \frac{1}{100}(-32)^2\right]}} = \frac{59 - 24}{\sqrt{\frac{643}{4} \times \frac{2894}{25}}} \approx 0.26.$$

21.29 RANK CORRELATION

Sometimes we have to deal with problems in which data cannot be quantitatively measured but qualitative assessment is possible.

Let a group of n individuals be arranged in order of merit or proficiency in possession of two characteristics A and B. The ranks in the two characteristics are, in general, different. For example, if A stands for intelligence and B for beauty, it is not necessary that the most intelligent individual is also the most beautiful, and vice versa. Thus an individual who is ranked at the top for the characteristic A may be ranked at the bottom for the characteristic B. Let $(x_i, y_i)$, i = 1, 2, . . . , n be the ranks of the n individuals in the group for the characteristics A and B respectively.
The Pearsonian coefficient of correlation between the ranks $x_i$ and $y_i$ is called the rank correlation coefficient between the characteristics A and B for that group of individuals.

Thus the rank correlation coefficient is

$$r = \frac{\frac{1}{n}\,\Sigma (x_i - \bar{x})(y_i - \bar{y})}{\sigma_x\sigma_y} = \frac{\Sigma (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\Sigma (x_i - \bar{x})^2\,\Sigma (y_i - \bar{y})^2}} \qquad \ldots (1)$$

Now the $x_i$ and the $y_i$ are merely permutations of the n numbers from 1 to n. Assuming that no two individuals are bracketed or tied in either classification, i.e., $x_i \ne x_j$ and $y_i \ne y_j$ for i ≠ j, both x and y take all integral values from 1 to n.

$$\therefore\ \bar{x} = \bar{y} = \frac{1}{n}(1 + 2 + 3 + \ldots + n) = \frac{1}{n} \cdot \frac{n(n+1)}{2} = \frac{n+1}{2}$$
$$\Sigma x_i = 1 + 2 + 3 + \ldots + n = \frac{n(n+1)}{2} = \Sigma y_i$$
$$\Sigma x_i^2 = 1^2 + 2^2 + \ldots + n^2 = \frac{n(n+1)(2n+1)}{6} = \Sigma y_i^2$$

If $d_i$ denotes the difference in ranks of the ith individual, then

$$d_i = x_i - y_i = (x_i - \bar{x}) - (y_i - \bar{y}) \qquad [\because\ \bar{x} = \bar{y}]$$

$$\frac{1}{n}\,\Sigma d_i^2 = \frac{1}{n}\,\Sigma [(x_i - \bar{x}) - (y_i - \bar{y})]^2$$
$$= \frac{1}{n}\,\Sigma (x_i - \bar{x})^2 + \frac{1}{n}\,\Sigma (y_i - \bar{y})^2 - 2 \cdot \frac{1}{n}\,\Sigma (x_i - \bar{x})(y_i - \bar{y})$$
$$= \sigma_x^2 + \sigma_y^2 - 2r\sigma_x\sigma_y \qquad \ldots (2) \quad \text{[using (1)]}$$

But

$$\sigma_x^2 = \frac{1}{n}\,\Sigma x_i^2 - \bar{x}^2 = \frac{1}{n}\,\Sigma y_i^2 - \bar{y}^2 = \sigma_y^2$$

∴ From (2),

$$\frac{1}{n}\,\Sigma d_i^2 = 2\sigma_x^2 - 2r\sigma_x^2 = 2(1 - r)\sigma_x^2 = 2(1 - r)\left[\frac{1}{n}\,\Sigma x_i^2 - \bar{x}^2\right]$$
$$= 2(1 - r)\left[\frac{1}{n} \cdot \frac{n(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4}\right]$$
$$= (1 - r)(n + 1)\left[\frac{4n + 2 - 3n - 3}{6}\right] = \frac{(1 - r)(n^2 - 1)}{6}$$

or

$$1 - r = \frac{6\,\Sigma d_i^2}{n(n^2 - 1)}$$

Hence

$$r = 1 - \frac{6\,\Sigma d_i^2}{n(n^2 - 1)}.$$

Note. This is called Spearman’s Formula for Rank Correlation. Also, $\Sigma d_i = \Sigma (x_i - y_i) = \Sigma x_i - \Sigma y_i = 0$ always. This serves as a check on calculations.

Example. The grades secured by recruits in the selection test (X) and in the proficiency test (Y) are given below:

Serial No. :   1   2   3   4   5   6   7   8   9
X          :  10  15  12  17  13  16  24  14  22
Y          :  30  42  45  46  33  34  40  35  39

Calculate the rank correlation coefficient.

Sol. Here the grades are given. Therefore, first of all, write down the ranks. In each series, the item with the largest size is ranked 1, the next largest 2, and so on.

X       Y      Rank in X (x)    Rank in Y (y)    d = x − y     d²
10      30          9                9               0          0
15      42          5                3               2          4
12      45          8                2               6         36
17      46          3                1               2          4
13      33          7                8              −1          1
16      34          4                7              −3          9
24      40          1                4              −3          9
14      35          6                6               0          0
22      39          2                5              −3          9
Total                                                0         72

Here n = 9.

$$\therefore\ r = 1 - \frac{6\,\Sigma d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 72}{9 \times 80} = 1 - 0.6 = 0.4$$

21.30 REPEATED RANKS

If any two or more individuals have the same rank or the same value in the series of grades, then the above formula fails and requires an adjustment. In such cases, each individual is given an average rank. This common average rank is the average of the ranks that these individuals would have assumed if they were slightly different from each other. Thus, if two individuals are ranked equal at the sixth place, they would have assumed the 6th and 7th ranks if they were ranked slightly differently. Their common rank = $\frac{6 + 7}{2} = 6.5$. If three individuals are ranked equal in fourth place, they would have assumed the 4th, 5th, and 6th ranks if they were ranked slightly differently. Their common rank = $\frac{4 + 5 + 6}{3} = 5$.

Adjustment. Add $\frac{1}{12}m(m^2 - 1)$ to $\Sigma d^2$, where m stands for the number of times an item is repeated. This adjustment factor is to be added for each repeated item. Thus

$$r = 1 - \frac{6\left\{\Sigma d^2 + \frac{1}{12}m(m^2 - 1) + \frac{1}{12}m(m^2 - 1) + \ldots\right\}}{n(n^2 - 1)}$$

Example. Obtain the rank correlation coefficient for the following data:

X :  68  64  75  50  64  80  75  40  55  64
Y :  62  58  68  45  81  60  68  48  50  70

Sol. Here, grades are given, so write down the ranks.
X       Y      Rank in X (x)    Rank in Y (y)    d = x − y     d²
68      62          4                5              −1          1
64      58          6                7              −1          1
75      68          2.5              3.5            −1          1
50      45          9               10              −1          1
64      81          6                1               5         25
80      60          1                6              −5         25
75      68          2.5              3.5            −1          1
40      48         10                9               1          1
55      50          8                8               0          0
64      70          6                2               4         16
Total                                                0         72

In the X-series, the value 75 occurs twice. Had these values been slightly different, they would have been given the ranks 2 and 3. Therefore, the common rank given to them is $\frac{2 + 3}{2} = 2.5$. The value 64 occurs three times. Had these values been slightly different, they would have been given the ranks 5, 6, and 7. Therefore, the common rank given to them is $\frac{5 + 6 + 7}{3} = 6$.

Similarly, in the Y-series, the value 68 occurs twice. Had these values been slightly different, they would have been given the ranks 3 and 4. Therefore, the common rank given to them is $\frac{3 + 4}{2} = 3.5$.

Thus, m has the values 2, 3, 2.

$$\therefore\ r = 1 - \frac{6\left\{\Sigma d^2 + \frac{1}{12}m(m^2 - 1) + \frac{1}{12}m(m^2 - 1) + \ldots\right\}}{n(n^2 - 1)}$$
$$= 1 - \frac{6\left[72 + \frac{1}{12}\{2(2^2 - 1)\} + \frac{1}{12}\{3(3^2 - 1)\} + \frac{1}{12}\{2(2^2 - 1)\}\right]}{10(10^2 - 1)}$$
$$= 1 - \frac{6 \times 75}{990} = \frac{6}{11} = 0.545.$$

21.31 REGRESSION

Regression is the estimation or prediction of unknown values of one variable from known values of another variable.

After establishing the fact of correlation between two variables, it is natural to want to know the extent to which one variable varies in response to a given variation in the other variable; one is interested to know the nature of the relationship between the two variables.

Regression measures the nature and extent of correlation.

21.32 LINEAR REGRESSION

If two variates x and y are correlated, i.e., there exists an association or relationship between them, then the scatter diagram will be more or less concentrated around a curve. This curve is called the curve of regression and the relationship is said to be expressed by means of curvilinear regression.
In the particular case when the curve is a straight line, it is called a line of regression and the regression is said to be linear.

A line of regression is the straight line that gives the best fit in the least square sense to the given frequency.

If the line of regression is so chosen that the sum of squares of deviations parallel to the axis of y is minimized [see part (a) of the figure on the next page], it is called the line of regression of y on x and it gives the best estimate of y for any given value of x.

If the line of regression is so chosen that the sum of squares of deviations parallel to the axis of x is minimized [see part (b) of the figure on the next page], it is called the line of regression of x on y and it gives the best estimate of x for any given value of y.

21.33 LINES OF REGRESSION

Let the equation of the line of regression of y on x be

$$y = a + bx \qquad \ldots (1)$$

Then

$$\bar{y} = a + b\bar{x} \qquad \ldots (2)$$

Subtracting (2) from (1), we have

$$y - \bar{y} = b(x - \bar{x}) \qquad \ldots (3)$$

The normal equations are

$$\Sigma y = na + b\,\Sigma x, \qquad \Sigma xy = a\,\Sigma x + b\,\Sigma x^2 \qquad \ldots (4)$$

Shifting the origin to $(\bar{x}, \bar{y})$, the second equation of (4) becomes

$$\Sigma (x - \bar{x})(y - \bar{y}) = a\,\Sigma (x - \bar{x}) + b\,\Sigma (x - \bar{x})^2 \qquad \ldots (5)$$

Since

$$\frac{\Sigma (x - \bar{x})(y - \bar{y})}{n\sigma_x\sigma_y} = r, \qquad \Sigma (x - \bar{x}) = 0, \qquad \text{and} \qquad \frac{1}{n}\,\Sigma (x - \bar{x})^2 = \sigma_x^2,$$

∴ From (5), $nr\sigma_x\sigma_y = a \cdot 0 + b \cdot n\sigma_x^2$ ⇒ $b = \dfrac{r\sigma_y}{\sigma_x}$

Hence, from (3), the line of regression of y on x is

$$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}(x - \bar{x})$$

Similarly, the line of regression of x on y is

$$x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}(y - \bar{y})$$

$\dfrac{r\sigma_y}{\sigma_x}$ is called the regression coefficient of y on x and is denoted by $b_{yx}$.

$\dfrac{r\sigma_x}{\sigma_y}$ is called the regression coefficient of x on y and is denoted by $b_{xy}$.

Note. If r = 0, the two lines of regression become $y = \bar{y}$ and $x = \bar{x}$, which are two straight lines parallel to the X- and Y-axes respectively and passing through their means $\bar{y}$ and $\bar{x}$. They are mutually perpendicular. If r = ±1, the two lines of regression coincide.

21.34 PROPERTIES OF REGRESSION

Property I. The correlation coefficient is the geometric mean between the regression coefficients.

Proof. The coefficients of regression are $\dfrac{r\sigma_y}{\sigma_x}$ and $\dfrac{r\sigma_x}{\sigma_y}$.

$$\text{G.M. between them} = \sqrt{\frac{r\sigma_y}{\sigma_x} \times \frac{r\sigma_x}{\sigma_y}} = \sqrt{r^2} = |r|,$$

which is the coefficient of correlation in magnitude; by Property V below, r takes the common sign of the two regression coefficients.

Property II. If one of the regression coefficients is greater than 1, the other must be less than 1.

Proof. The two regression coefficients are $b_{yx} = \dfrac{r\sigma_y}{\sigma_x}$ and $b_{xy} = \dfrac{r\sigma_x}{\sigma_y}$.

Let $b_{yx} > 1$; then

$$\frac{1}{b_{yx}} < 1 \qquad \ldots (1)$$

Since $b_{yx} \cdot b_{xy} = r^2 \le 1$ (∵ −1 ≤ r ≤ 1),

$$\therefore\ b_{xy} \le \frac{1}{b_{yx}} < 1 \qquad \text{[using (1)]}$$

Similarly, if $b_{xy} > 1$, then $b_{yx} < 1$.

Property III. The arithmetic mean of the regression coefficients is greater than (or equal to) the correlation coefficient.

Proof. We have to prove that

$$\frac{b_{yx} + b_{xy}}{2} \ge r \qquad \text{or} \qquad \frac{1}{2}\left(\frac{r\sigma_y}{\sigma_x} + \frac{r\sigma_x}{\sigma_y}\right) \ge r.$$

For r > 0 this reduces to $\sigma_y^2 + \sigma_x^2 \ge 2\sigma_x\sigma_y$, or $(\sigma_x - \sigma_y)^2 \ge 0$, which is true. Equality holds only when $\sigma_x = \sigma_y$.

Property IV. Regression coefficients are independent of the origin but not of scale.

Proof. Let $u = \dfrac{x - a}{h}$, $v = \dfrac{y - b}{k}$, where a, b, h, and k are constants. Then

$$b_{yx} = \frac{r\sigma_y}{\sigma_x} = r \cdot \frac{k\sigma_v}{h\sigma_u} = \frac{k}{h}\left(\frac{r\sigma_v}{\sigma_u}\right) = \frac{k}{h}\,b_{vu}$$

Similarly, $b_{xy} = \dfrac{h}{k}\,b_{uv}$.

Thus, $b_{yx}$ and $b_{xy}$ are both independent of a and b but not of h and k.

Property V. The correlation coefficient and the two regression coefficients have the same sign.

Proof. Regression coefficient of y on x $= b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x}$

Regression coefficient of x on y $= b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y}$

Since $\sigma_x$ and $\sigma_y$ are both positive, $b_{yx}$, $b_{xy}$, and r have the same sign.
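The properties above can be checked numerically on any bivariate data set. The following Python sketch, offered only as an illustration, uses the grades of the ten students from Example 1 of Section 21.27; `regression_summary` is a hypothetical helper name, and the shift and scale constants (a = 65, h = 5, b = 66, k = 2) are arbitrary choices made for the check of Property IV.

```python
from math import sqrt

def regression_summary(xs, ys):
    """Return (r, b_yx, b_xy) for paired observations xs, ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    r = cov / sqrt(var_x * var_y)
    # cov / var_x = r * sigma_y / sigma_x and cov / var_y = r * sigma_x / sigma_y
    return r, cov / var_x, cov / var_y

x = [78, 36, 98, 25, 75, 82, 90, 62, 65, 39]
y = [84, 51, 91, 60, 68, 62, 86, 58, 53, 47]
r, b_yx, b_xy = regression_summary(x, y)

geometric_mean = sqrt(b_yx * b_xy)      # Property I: equals |r|
arithmetic_mean = (b_yx + b_xy) / 2     # Property III: at least r

# Property IV: u = (x - 65) / 5, v = (y - 66) / 2, so h = 5 and k = 2
u = [(xi - 65) / 5 for xi in x]
v = [(yi - 66) / 2 for yi in y]
r_uv, b_vu, b_uv = regression_summary(u, v)

print(r, b_yx, b_xy)
print(geometric_mean, arithmetic_mean)
print(b_yx, (2 / 5) * b_vu)             # b_yx = (k / h) * b_vu
```

For this data set one regression coefficient exceeds 1 and the other is below 1, which also illustrates Property II.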
21.35 ANGLE BETWEEN TWO LINES OF REGRESSION

If θ is the acute angle between the two regression lines in the case of two variables x and y, show that

$$\tan\theta = \frac{1 - r^2}{r} \cdot \frac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2},$$

where r, $\sigma_x$, $\sigma_y$ have their usual meanings. Explain the significance of the formula when r = 0 and r = ±1.

Proof. The equations of the lines of regression of y on x and x on y are

$$y - \bar{y} = \frac{r\sigma_y}{\sigma_x}(x - \bar{x}) \qquad \text{and} \qquad x - \bar{x} = \frac{r\sigma_x}{\sigma_y}(y - \bar{y})$$

Their slopes are $m_1 = \dfrac{r\sigma_y}{\sigma_x}$ and $m_2 = \dfrac{\sigma_y}{r\sigma_x}$.

$$\therefore\ \tan\theta = \pm\frac{m_2 - m_1}{1 + m_2 m_1} = \pm\frac{\dfrac{\sigma_y}{r\sigma_x} - \dfrac{r\sigma_y}{\sigma_x}}{1 + \dfrac{\sigma_y^2}{\sigma_x^2}} = \pm\frac{1 - r^2}{r} \cdot \frac{\sigma_y}{\sigma_x} \cdot \frac{\sigma_x^2}{\sigma_x^2 + \sigma_y^2} = \pm\frac{1 - r^2}{r} \cdot \frac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2}$$

Since $r^2 \le 1$ and $\sigma_x$, $\sigma_y$ are positive, the positive sign gives the acute angle between the lines when r > 0 (for r < 0, take the absolute value).

Hence

$$\tan\theta = \frac{1 - r^2}{r} \cdot \frac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2}$$

Note. When r = 0, $\theta = \dfrac{\pi}{2}$. ∴ The two lines of regression are perpendicular to each other. Hence the estimated value of y is the same for all values of x and vice versa.

When r = ±1, tan θ = 0 so that θ = 0 or π. Hence the lines of regression coincide and there is a perfect correlation between the two variates x and y.

For computation, the regression coefficients may be written as

$$\frac{r\sigma_x}{\sigma_y} = \frac{\frac{1}{n}\,\Sigma xy - \bar{x}\bar{y}}{\sigma_x\sigma_y} \cdot \frac{\sigma_x}{\sigma_y} = \frac{\frac{1}{n}\,\Sigma xy - \bar{x}\bar{y}}{\sigma_y^2} = \frac{\frac{1}{n}\,\Sigma xy - \bar{x}\bar{y}}{\frac{1}{n}\,\Sigma y^2 - \bar{y}^2}$$

Similarly,

$$\frac{r\sigma_y}{\sigma_x} = \frac{\frac{1}{n}\,\Sigma xy - \bar{x}\bar{y}}{\frac{1}{n}\,\Sigma x^2 - \bar{x}^2}.$$

ILLUSTRATIVE EXAMPLES

Example 1. Calculate the coefficient of correlation and obtain the least square regression line of y on x for the following data:

x :  1  2   3   4   5   6   7   8   9
y :  9  8  10  12  11  13  14  16  15

Also obtain an estimate of y that should correspond on the average to x = 6.2.

Sol.

x       y      u = x − 5    v = y − 12    u²    v²    uv
1       9        −4           −3          16     9    12
2       8        −3           −4           9    16    12
3      10        −2           −2           4     4     4
4      12        −1            0           1     0     0
5      11         0           −1           0     1     0
6      13         1            1           1     1     1
7      14         2            2           4     4     4
8      16         3            4           9    16    12
9      15         4            3          16     9    12
Total             0            0          60    60    57

$$r_{xy} = r_{uv} = \frac{\frac{1}{n}\,\Sigma uv - \bar{u}\bar{v}}{\sqrt{\left(\frac{1}{n}\,\Sigma u^2 - \bar{u}^2\right)\left(\frac{1}{n}\,\Sigma v^2 - \bar{v}^2\right)}} = \frac{\frac{1}{9}(57) - 0}{\sqrt{\left[\frac{1}{9}(60) - 0\right]\left[\frac{1}{9}(60) - 0\right]}} = \frac{19}{20} = 0.95$$

Also

$$\frac{r\sigma_y}{\sigma_x} = \frac{r\sigma_v}{\sigma_u} = \frac{\frac{1}{n}\,\Sigma uv - \bar{u}\bar{v}}{\frac{1}{n}\,\Sigma u^2 - \bar{u}^2} = \frac{\frac{1}{9}(57) - 0}{\frac{1}{9}(60) - 0} = \frac{19}{20} = 0.95$$

$$\bar{x} = 5 + \frac{1}{9}\,\Sigma u = 5, \qquad \bar{y} = 12 + \frac{1}{9}\,\Sigma v = 12$$

The equation of the line of regression of y on x is

$$y - \bar{y} = \frac{r\sigma_y}{\sigma_x}(x - \bar{x})$$

or y − 12 = 0.95(x − 5)

or y = 0.95x + 7.25

When x = 6.2, the estimated value of y = 0.95 × 6.2 + 7.25 = 5.89 + 7.25 = 13.14.

Example 2. In a partially destroyed laboratory record of an analysis of correlation data, only the following results are legible:

Variance of x = 9
Regression equations: 8x − 10y + 66 = 0, 40x − 18y = 214.

What were (a) the mean values of x and y, (b) the standard deviation of y, and (c) the coefficient of correlation between x and y?

Sol. (i) Since both the lines of regression pass through the point $(\bar{x}, \bar{y})$, we have

$$8\bar{x} - 10\bar{y} + 66 = 0 \qquad \ldots (1)$$
$$40\bar{x} - 18\bar{y} - 214 = 0 \qquad \ldots (2)$$

Multiplying (1) by 5,

$$40\bar{x} - 50\bar{y} + 330 = 0 \qquad \ldots (3)$$

Subtracting (3) from (2), $32\bar{y} - 544 = 0$ ∴ $\bar{y} = 17$

∴ From (1), $8\bar{x} - 170 + 66 = 0$ or $8\bar{x} = 104$ ∴ $\bar{x} = 13$

Hence $\bar{x} = 13$, $\bar{y} = 17$, which answers part (a).

(ii) Variance of x = $\sigma_x^2 = 9$ (given) ∴ $\sigma_x = 3$

The equations of the lines of regression can be written as

y = 0.8x + 6.6 and x = 0.45y + 5.35

∴ The regression coefficient of y on x is

$$\frac{r\sigma_y}{\sigma_x} = 0.8 \qquad \ldots (4)$$

The regression coefficient of x on y is

$$\frac{r\sigma_x}{\sigma_y} = 0.45 \qquad \ldots (5)$$

Multiplying (4) and (5), $r^2 = 0.8 \times 0.45 = 0.36$ ∴ r = 0.6, which answers part (c). (The positive sign of the square root is taken because the regression coefficients are positive.)

From (4),

$$\sigma_y = \frac{0.8\,\sigma_x}{r} = \frac{0.8 \times 3}{0.6} = 4,$$

which answers part (b).
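The two illustrative examples above can be reproduced with a short Python sketch. It is given here only as a hedged illustration: `fit_y_on_x` is a hypothetical helper name, the slope is computed from the formula $\frac{r\sigma_y}{\sigma_x} = \left(\frac{1}{n}\Sigma xy - \bar{x}\bar{y}\right) / \left(\frac{1}{n}\Sigma x^2 - \bar{x}^2\right)$, and the intercept follows from the fact that the line passes through $(\bar{x}, \bar{y})$.

```python
from math import sqrt

def fit_y_on_x(xs, ys):
    """Least squares line of regression of y on x: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    slope = cov / var_x                    # b_yx = r * sigma_y / sigma_x
    intercept = mean_y - slope * mean_x    # line passes through the means
    return slope, intercept

# Example 1: regression line of y on x and the estimate at x = 6.2
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [9, 8, 10, 12, 11, 13, 14, 16, 15]
slope, intercept = fit_y_on_x(xs, ys)
estimate = slope * 6.2 + intercept
print(slope, intercept, estimate)

# Example 2: recover r and sigma_y from the two regression coefficients
b_yx, b_xy, sigma_x = 0.8, 0.45, 3.0
r = sqrt(b_yx * b_xy)          # positive root, since both coefficients are positive
sigma_y = b_yx * sigma_x / r
print(r, sigma_y)
```

Because the arithmetic is done in floating point, the printed values may differ from 0.95, 7.25, 13.14, 0.6, and 4 in the last few decimal places.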
(c) TEST YOUR KNOWLEDGE 1. (a) Calculate the correlation coefficient for the following heights in inches of fathers (X) and their sons (Y ) . X: 65 66 67 67 68 69 70 72 Y: 67 68 65 68 72 72 69 71 (b) Find the correlation coefficient between x and y from the given data: x: y: 78 125 89 137 97 156 69 112 59 107 79 138 68 123 57 108 63 82 53 37 (c) Find the correlation coefficient from the following data: x: y: 92 86 89 88 87 91 86 77 83 68 77 85 71 52 50 57 1200 CHAPTER 21: STATISTICS AND PROBABILITY ________________________________________________________________________________________________________ 2. Calculate the coefficient of correlation for the following ages of husbands and wives: Husbands’s age Wife’s age x: y: 23 18 27 20 28 22 28 27 29 21 30 29 31 27 33 29 35 28 36 29 σ x − y = σ x + σ y − 2rσ xσ y 2 3. Establish the formula 2 2 where r is the correlation coefficient between x and y. 4. (a) Calculate the coefficient of correlation for the following table: x 16–18 18–20 20–22 10–20 2 1 1 20–30 3 2 3 2 30–40 3 4 5 6 40–50 2 2 3 4 50–60 1 2 2 60–70 1 2 1 y 22–24 (b) Find the correlation between x (grades in mathematics) and y (grades in Engineering Drawing) given in the following data: x 10–40 40–70 70–100 Total 0–30 30–60 60–90 5 — — 20 28 32 — 2 13 25 30 45 Total 5 80 15 100 y 5. Ten students got the following percentage of grades in chemistry and physics: Students Grades in chemistry Grades in physics : : : 1 78 84 2 36 51 3 98 91 4 25 60 5 75 68 6 82 62 7 90 86 8 62 58 9 65 63 10 39 47 Calculate the rank correlation coefficient. 6. Ten competitors in a musical test were ranked by the three judges x, y, and z in the following order: Ranks by x : Ranks by y : Ranks by z : 1 3 6 6 5 4 5 8 9 10 4 8 3 7 1 2 10 2 4 2 3 9 1 10 7 6 5 8 9 7 Using the rank correlation method, discuss which pair of judges has the nearest approach to common likings in music. 7. 
A sample of 12 fathers and their sons gave the following data about their heights in inches:
Father: 65 63 67 64 68 62 70 66 68 67 69 71
Son:    68 66 68 65 69 66 68 65 71 67 68 70
Calculate the coefficient of rank correlation.
8. If r = 0, show that the two lines of regression are parallel to the axes.
9. If the two regression coefficients are 0.8 and 0.2, what would be the value of the coefficient of correlation?
10. (a) Find the correlation coefficient and the equations of regression lines for the following values of x and y:
x: 1 2 3 4 5
y: 2 5 3 8 7
(b) Find the correlation coefficient between x and y for the given values. Find also the two regression lines.
x: 1 2 3 4 5 6 7 8 9 10
y: 10 12 16 28 25 36 41 49 40 50
11. The two regression equations of the variables x and y are x = 19.13 – 0.87y and y = 11.64 – 0.50x. Find (i) mean of x's, (ii) mean of y's, and (iii) the correlation coefficient between x and y.
12. Two random variables have the regression lines with equations 3x + 2y = 26 and 6x + y = 31. Find the mean values and the correlation coefficient between x and y.
13. In a partially destroyed sheet of laboratory data, only the equations giving the two lines of regression of y on x and x on y are available and are respectively, 7x – 16y + 9 = 0, 5y – 4x – 3 = 0. Calculate the coefficient of correlation, x̄ and ȳ.

Answers
1. (a) 0.603 (b) 0.96 (c) 0.7291
2. 0.82
4. (a) 0.28 (b) 0.4517
5. 0.84
6. x and z
7. 0.722
9. 0.4
10. (a) r = 0.8; y = 1.3x + 1.1; x = 0.5y + 0.5 (b) r = 0.96; y = 4.69x + 4.9; x = 0.2y – 0.64
11. (i) 15.79 (ii) 3.74 (iii) –0.6595
12. x̄ = 4, ȳ = 7; r = –0.5
13. r = 0.7395; x̄ = –0.1034; ȳ = 0.5172.

21.36 THEORY OF PROBABILITY
Here we define and explain certain terms that are used frequently.
(a) Trial and event. Let an experiment be repeated under essentially the same conditions and let it result in any one of several possible outcomes. Then the experiment is called a trial and the possible outcomes are known as events or cases.
For example:
(i) Tossing a coin is a trial and the turning up of heads or tails is an event.
(ii) Throwing a die is a trial and getting 1 or 2 or 3 or 4 or 5 or 6 is an event.
(b) Exhaustive events. The total number of all possible outcomes in any trial is known as exhaustive events or exhaustive cases.
For example:
(i) In tossing a coin, there are two exhaustive cases, heads and tails.
(ii) In throwing a die, there are 6 exhaustive cases, for any one of the six faces may turn up.
(iii) In throwing two dice, the exhaustive cases are 6 × 6 = 6² = 36, for any of the 6 numbers from 1 to 6 on one die can be associated with any of the 6 numbers on the other die.
In general, in throwing n dice, the exhaustive cases are 6ⁿ.
(c) Favorable events or cases. The cases that entail the occurrence of an event are said to be favorable to the event. It is the total number of possible outcomes in which the specified event happens.
For example:
(i) In throwing a die, the number of cases favorable to the appearance of a multiple of 3 is two, viz., 3 and 6, while the number of cases favorable to the appearance of an even number is three, viz., 2, 4, and 6.
(ii) In a throw of two dice, the number of cases favorable to getting a sum of 6 is 5, viz., (1, 5); (5, 1); (2, 4); (4, 2); (3, 3).
(d) Mutually exclusive events. Events are said to be mutually exclusive or incompatible if the occurrence of any one of them precludes (i.e., rules out) the occurrence of all others, i.e., if no two or more than two of them can happen simultaneously in the same trial.
For example:
(i) In tossing a coin, the events “heads” and “tails” are mutually exclusive, since if the outcome is heads, the possibility of getting tails in the same trial is ruled out.
(ii) In throwing a die, all the six faces numbered 1, 2, 3, 4, 5, 6 are mutually exclusive since any outcome rules out the possibility of getting any other.
(e) Equally likely events. Events are said to be equally likely if there is no reason to expect any one in preference to any other.
For example:
(i) When a card is drawn from a well-shuffled deck, any card may appear in the draw so that the 52 different cases are equally likely.
(ii) In throwing a die, all six faces are equally likely to come up.
(f) Independent and dependent events. Two or more events are said to be independent if the occurrence or non-occurrence of any one does not depend on (or is not affected by) the occurrence or non-occurrence of any other. Otherwise they are said to be dependent.
For example: If a card is drawn from a deck of well-shuffled cards and replaced before drawing the second card, the result of the second draw is independent of the first draw. However, if the first card drawn is not replaced, then the second draw is dependent on the first draw.

21.37 (a) MATHEMATICAL (OR CLASSICAL) DEFINITION OF PROBABILITY
If a trial results in n exhaustive, mutually exclusive and equally likely cases and m of them are favorable to the occurrence of an event E, then the probability of occurrence of E is given by
p or P(E) = (Favorable number of cases)/(Exhaustive number of cases) = m/n.
Note 1. Since the number of cases favorable to the occurrence of E is m and the exhaustive number of cases is n, the number of cases unfavorable to the occurrence of E is n – m.
Note 2. The probability that the event E will not happen is given by
q or P(Ē) = (Unfavorable number of cases)/(Exhaustive number of cases) = (n – m)/n = 1 – m/n = 1 – p.
Obviously, p and q are non-negative and cannot exceed 1, i.e., 0 ≤ p ≤ 1, 0 ≤ q ≤ 1.
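The classical definition can be checked by direct enumeration. The sketch below is only an illustration, not part of the text's method: the helper name `classical_probability` is hypothetical, and the two events are the die examples given under "favorable cases" (a multiple of 3 on one die, a sum of 6 on two dice). Exact fractions are used so that p = m/n and q = 1 – p are not blurred by rounding.

```python
from fractions import Fraction
from itertools import product


def classical_probability(sample_space, is_favorable):
    """Return m/n, assuming every listed outcome is equally likely."""
    outcomes = list(sample_space)                        # the n exhaustive cases
    m = sum(1 for outcome in outcomes if is_favorable(outcome))
    return Fraction(m, len(outcomes))


die = range(1, 7)

# One die: the faces 3 and 6 are favorable to "a multiple of 3".
p_multiple_of_3 = classical_probability(die, lambda face: face % 3 == 0)

# Two dice: (1,5), (5,1), (2,4), (4,2), (3,3) are favorable to "sum is 6".
p_sum_is_6 = classical_probability(product(die, repeat=2),
                                   lambda pair: sum(pair) == 6)

# Probability that the event does not happen: q = 1 - p.
q_sum_is_6 = 1 - p_sum_is_6

print(p_multiple_of_3, p_sum_is_6, q_sum_is_6)
```

Counting with a predicate keeps m and n visible as separate quantities, which mirrors the definition more closely than a closed-form count would.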
Note 3. If P(E) = 1, E is called a certain event, i.e., the chance of its occurrence is 100%. If P(E) = 0, then E is an impossible event.
Note 4. If n cases are favorable to E and m cases are favorable to Ē (i.e., unfavorable to E), then exhaustive number of cases = n + m.
P(E) = n/(n + m) and P(Ē) = m/(n + m)
We say that the “odds in favor of E” are n : m and the “odds against E” are m : n.

21.37 (b) STATISTICAL (OR EMPIRICAL) DEFINITION OF PROBABILITY
If in n trials, an event E occurs m times, then the probability of the occurrence of E is given by

$$p = P(E) = \lim_{n \to \infty} \frac{m}{n}.$$

ILLUSTRATIVE EXAMPLES
Example 1. A bag contains 7 white, 6 red, and 5 black balls. Two balls are drawn at random. Find the probability that they will both be white.
Sol. Total number of balls = 7 + 6 + 5 = 18.
Out of 18 balls, 2 can be drawn in ¹⁸C₂ ways.
∴ Exhaustive number of cases = ¹⁸C₂ = (18 × 17)/(2 × 1) = 153
Out of 7 white balls, 2 can be drawn in ⁷C₂ = (7 × 6)/(2 × 1) = 21 ways.
∴ Favorable number of cases = 21
Probability = 21/153 = 7/51.
Example 2. Four cards are drawn from a deck of cards. Find the probability that (i) all are diamonds, (ii) there is one card of each suit, and (iii) there are two spades and two hearts.
Sol. 4 cards can be drawn from a deck of 52 cards in ⁵²C₄ ways.
∴ Exhaustive number of cases = ⁵²C₄ = (52 × 51 × 50 × 49)/(4 × 3 × 2 × 1) = 270725.
(i) There are 13 diamonds in the deck and 4 can be drawn out of them in ¹³C₄ ways.
∴ Favorable number of cases = ¹³C₄ = (13 × 12 × 11 × 10)/(4 × 3 × 2 × 1) = 715.
Required probability = 715/270725 = 143/54145 = 11/4165.
(ii) There are 4 suits, each containing 13 cards.
∴ Favorable number of cases = ¹³C₁ × ¹³C₁ × ¹³C₁ × ¹³C₁ = 13 × 13 × 13 × 13.
Required probability = (13 × 13 × 13 × 13)/270725 = 2197/20825.
(iii) 2 spades out of 13 can be drawn in ¹³C₂ ways.
2 hearts out of 13 can be drawn in ¹³C₂ ways.
∴ Favorable number of cases = ¹³C₂ × ¹³C₂ = 78 × 78
Required probability = (78 × 78)/270725 = 468/20825.
Example 3. A bag contains 50 tickets numbered 1, 2, 3, . . . , 50, of which five are drawn at random and arranged in ascending order of magnitude (x₁ < x₂ < x₃ < x₄ < x₅). What is the probability that x₃ = 30?
Sol. Exhaustive number of cases = ⁵⁰C₅.
If x₃ = 30, then the two tickets with numbers x₁ and x₂ must come out of the 29 tickets numbered 1 to 29 and this can be done in ²⁹C₂ ways. The other two tickets with numbers x₄ and x₅ must come out of the 20 tickets numbered 31 to 50 and this can be done in ²⁰C₂ ways.
∴ Favorable number of cases = ²⁹C₂ × ²⁰C₂.
Required probability = (²⁹C₂ × ²⁰C₂)/⁵⁰C₅ = 551/15134.

21.38 RANDOM EXPERIMENT
Occurrences that can be repeated a number of times, essentially under the same conditions, and whose result cannot be predicted beforehand are known as random experiments. For example, the rolling of a die or the tossing of a coin are random experiments.
Sample Space. Out of the several possible outcomes of a random experiment, one and only one can take place in a trial. The set of all these possible outcomes is called the sample space for the particular experiment and is denoted by S. For example, if a coin is tossed, the possible outcomes are H (Heads) and T (Tails). Thus S = {H, T}.
Sample Point. The elements of S, the sample space, are called sample points. For example, if a coin is tossed and H and T denote “Heads” and “Tails” respectively, then S = {H, T}. The two sample points are H and T.
Finite Sample Space. If the number of sample points in a sample space is finite, we call it a finite sample space. (In this chapter, we shall deal with finite sample spaces only.)
Event. Every subset of S, the sample space, is called an event.
Since S ⊂ S, S itself is an event, called a certain event. Also, since φ ⊂ S, the null set is also an event, called an impossible event. If e ∈ S, then e is called an elementary event. Every elementary event contains only one sample point.

21.39 AXIOMS
(i) With each event E (i.e., a sample point) is associated a real number between 0 and 1, called the probability of that event and denoted by P(E). Thus 0 ≤ P(E) ≤ 1.
(ii) The sum of the probabilities of all simple (elementary) events constituting the sample space is 1. Thus P(S) = 1.
(iii) The probability of a compound event (i.e., an event made up of two or more simple events) is the sum of the probabilities of the simple events comprising the compound event.
Thus, if there are n equally likely possible outcomes of a random experiment, then the sample space S contains n sample points and the probability associated with each sample point is 1/n. [By Axiom (ii)]
Now, if an event E consists of m sample points, then the probability of E is
P(E) = 1/n + 1/n + . . . (m times) = m/n = (Number of sample points in E)/(Number of sample points in S).
This closely agrees with the classical definition of probability.

21.40 PROBABILITY OF THE IMPOSSIBLE EVENT IS ZERO, i.e., P(φ) = 0
The impossible event contains no sample point. As such, the sample space S and the impossible event φ are mutually exclusive.
⇒ S ∪ φ = S
⇒ P(S ∪ φ) = P(S)
⇒ P(S) + P(φ) = P(S)
⇒ P(φ) = 0.

21.41 PROBABILITY OF THE COMPLEMENTARY EVENT Ā OF A IS GIVEN BY P(Ā) = 1 – P(A)
A and Ā are disjoint events. Also A ∪ Ā = S
∴ P(A ∪ Ā) = P(S) ⇒ P(A) + P(Ā) = 1
Hence P(Ā) = 1 – P(A).
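The axioms and the two results above can be verified on a small finite sample space. The following is a minimal sketch, assuming a single fair die with equally likely sample points; the function `P` and the event `A` are hypothetical names chosen for the illustration.

```python
from fractions import Fraction

S = frozenset(range(1, 7))          # sample space of one die (assumed fair)


def P(event):
    """Probability of an event (a subset of S) with equally likely points."""
    if not event <= S:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(S))


A = frozenset({2, 4, 6})            # event "an even number turns up"
A_complement = S - A                # complementary event

# Axiom (ii): the elementary events together have probability 1.
total_of_elementary = sum(P(frozenset({e})) for e in S)

# Axiom (iii): a compound event is the sum of its elementary events.
sum_over_A = sum(P(frozenset({e})) for e in A)

print(total_of_elementary, P(frozenset()), P(A), P(A_complement))
```

Representing events as `frozenset` objects makes the set operations of the text (union, intersection, complement) available directly as `|`, `&`, and `-`.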
21.42 FOR ANY TWO EVENTS A AND B, P(Ā ∩ B) = P(B) – P(A ∩ B)
Ā ∩ B = {p : p ∈ B and p ∉ A}
Now Ā ∩ B and A ∩ B are disjoint sets and (Ā ∩ B) ∪ (A ∩ B) = B
⇒ P[(Ā ∩ B) ∪ (A ∩ B)] = P(B)
⇒ P(Ā ∩ B) + P(A ∩ B) = P(B)
⇒ P(Ā ∩ B) = P(B) – P(A ∩ B).
Note. Similarly, it can be proved that P(A ∩ B̄) = P(A) – P(A ∩ B).

21.43 IF B ⊂ A, THEN (i) P(A ∩ B̄) = P(A) – P(B) (ii) P(B) ≤ P(A)
Proof. When B ⊂ A, B and A ∩ B̄ are disjoint and their union is A.
⇒ B ∪ (A ∩ B̄) = A
⇒ P[B ∪ (A ∩ B̄)] = P(A)
⇒ P(B) + P(A ∩ B̄) = P(A)
⇒ P(A ∩ B̄) = P(A) – P(B) . . . (1)
Now, if E is any event, then 0 ≤ P(E) ≤ 1, i.e., P(E) ≥ 0
∴ P(A ∩ B̄) ≥ 0
⇒ P(A) – P(B) ≥ 0 [Using (1)]
⇒ P(B) ≤ P(A).

21.44 P(A ∩ B) ≤ P(A) AND P(A ∩ B) ≤ P(B)
Proof. By 21.43, B ⊂ A ⇒ P(B) ≤ P(A)
Since (A ∩ B) ⊂ A and (A ∩ B) ⊂ B
∴ P(A ∩ B) ≤ P(A) and P(A ∩ B) ≤ P(B).

21.45 ADDITION THEOREM OF PROBABILITIES (OR THEOREM OF TOTAL PROBABILITY)
Statement. If A and B are any two events, then
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
i.e., P(A or B) = P(A) + P(B) – P(A and B).
Proof. A and Ā ∩ B are disjoint sets and their union is A ∪ B.
⇒ A ∪ B = A ∪ (Ā ∩ B)
⇒ P(A ∪ B) = P[A ∪ (Ā ∩ B)]
= P(A) + P(Ā ∩ B)
= P(A) + [P(Ā ∩ B) + P(A ∩ B)] – P(A ∩ B)
= P(A) + P[(Ā ∩ B) ∪ (A ∩ B)] – P(A ∩ B) [∵ Ā ∩ B and A ∩ B are disjoint]
= P(A) + P(B) – P(A ∩ B) [∵ (Ā ∩ B) ∪ (A ∩ B) = B]
∴ P(A ∪ B) = P(A) + P(B) – P(A ∩ B).
Note 1. If A and B are two mutually disjoint events, then A ∩ B = φ, so that P(A ∩ B) = P(φ) = 0.
∴ P(A ∪ B) = P(A) + P(B).
Note 2. P(A ∪ B) is also written as P(A + B). Thus, for mutually disjoint events A and B, P(A + B) = P(A) + P(B). P(A ∩ B) is also written as P(AB).
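Articles 21.42 and 21.45 can be confirmed numerically by enumerating a concrete sample space. The sketch below assumes a standard 52-card deck with equally likely cards; the names `deck`, `P`, `A` (a spade), and `B` (an ace) are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from itertools import product

SUITS = ("spades", "hearts", "diamonds", "clubs")
RANKS = ("A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K")
deck = frozenset(product(SUITS, RANKS))        # 52 equally likely sample points


def P(event):
    """Probability of an event given as a set of cards."""
    return Fraction(len(event), len(deck))


A = frozenset(card for card in deck if card[0] == "spades")   # a spade
B = frozenset(card for card in deck if card[1] == "A")        # an ace

# Addition theorem: P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
lhs = P(A | B)
rhs = P(A) + P(B) - P(A & B)

# Article 21.42: P(Ā ∩ B) = P(B) – P(A ∩ B)
not_A_and_B = P((deck - A) & B)

print(lhs, rhs, not_A_and_B)
```

Because A and B share the ace of spades, simply adding P(A) and P(B) would count that card twice; the subtraction of P(A ∩ B) is what the enumeration makes visible.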
21.46 IF A, B, AND C ARE ANY THREE EVENTS, THEN
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) – P(A ∩ B) – P(B ∩ C) – P(C ∩ A) + P(A ∩ B ∩ C)
or P(A + B + C) = P(A) + P(B) + P(C) – P(AB) – P(BC) – P(CA) + P(ABC)
Proof. Using the above Article 21.45 for two events, we have
P(A ∪ B ∪ C) = P[(A ∪ B) ∪ C]
= P(A ∪ B) + P(C) – P[(A ∪ B) ∩ C]
= [P(A) + P(B) – P(A ∩ B)] + P(C) – P[(A ∩ C) ∪ (B ∩ C)] [By the distributive law]
= P(A) + P(B) + P(C) – P(A ∩ B) – [P(A ∩ C) + P(B ∩ C) – P{(A ∩ C) ∩ (B ∩ C)}] [By Art. 21.45]
= P(A) + P(B) + P(C) – P(A ∩ B) – P(A ∩ C) – P(B ∩ C) + P(A ∩ B ∩ C) [∵ (A ∩ C) ∩ (B ∩ C) = A ∩ B ∩ C]
= P(A) + P(B) + P(C) – P(A ∩ B) – P(B ∩ C) – P(C ∩ A) + P(A ∩ B ∩ C) [∵ A ∩ C = C ∩ A]
or P(A + B + C) = P(A) + P(B) + P(C) – P(AB) – P(BC) – P(CA) + P(ABC).

21.47 IF A₁, A₂, . . . , Aₙ ARE n MUTUALLY EXCLUSIVE EVENTS, THEN THE PROBABILITY OF THE OCCURRENCE OF ONE OF THEM IS
P(A₁ ∪ A₂ ∪ . . . ∪ Aₙ) = P(A₁ + A₂ + . . . + Aₙ) = P(A₁) + P(A₂) + . . . + P(Aₙ)
Proof. Let N be the total number of mutually exclusive, exhaustive and equally likely cases of which m₁ are favorable to A₁, m₂ are favorable to A₂, and so on.
Probability of occurrence of event A₁ = P(A₁) = m₁/N
Probability of occurrence of event A₂ = P(A₂) = m₂/N
. . . . . . . . . . . . . . . .
Probability of occurrence of event Aₙ = P(Aₙ) = mₙ/N . . . (1)
The events being mutually exclusive and equally likely, the number of cases favorable to the event A₁ or A₂ or . . . or Aₙ is m₁ + m₂ + . . . + mₙ.
∴ Probability of occurrence of one of the events A₁, A₂, . . . , Aₙ is
P(A₁ + A₂ + . . . + Aₙ) = (m₁ + m₂ + . . . + mₙ)/N
= m₁/N + m₂/N + . . . + mₙ/N
= P(A₁) + P(A₂) + . . . + P(Aₙ) [Using (1)]

ILLUSTRATIVE EXAMPLES
Example 1.
In a given race, the odds in favor of four horses A, B, C, D are 1 : 3, 1 : 4, 1 : 5, 1 : 6 respectively. Assuming that a dead heat is impossible, find the chance that one of them wins the race.
Sol. Let p₁, p₂, p₃, p₄ be the probabilities of the horses A, B, C, D winning, respectively. Since a dead heat (in which all the four horses cover the same distance in the same time) is not possible, the events are mutually exclusive.
Odds in favor of A are 1 : 3 ∴ p₁ = 1/(1 + 3) = 1/4
Similarly, p₂ = 1/5, p₃ = 1/6, p₄ = 1/7.
If p is the chance that one of them wins, then
p = p₁ + p₂ + p₃ + p₄ = 1/4 + 1/5 + 1/6 + 1/7 = 319/420.
Example 2. A card is drawn from a well-shuffled deck of playing cards. What is the probability that it is either a spade or an ace?
Sol. Let A = the event of drawing a spade
and B = the event of drawing an ace
A and B are not mutually exclusive.
AB = the event of drawing the ace of spades
∴ P(A) = 13/52, P(B) = 4/52, P(AB) = 1/52
P(A + B) = P(A) + P(B) – P(AB) = 13/52 + 4/52 – 1/52 = 16/52 = 4/13.

21.48 CONDITIONAL PROBABILITY
The probability of the occurrence of an event E₁ when another event E₂ is known to have already happened is called Conditional Probability and is denoted by P(E₁/E₂).
Mutually Independent Events. An event E₁ is said to be independent of an event E₂ if
P(E₁/E₂) = P(E₁)
i.e., if the probability of the occurrence of E₁ is independent of the occurrence of E₂.

21.49 MULTIPLICATIVE LAW OF PROBABILITY (OR THE THEOREM OF COMPOUND PROBABILITY)
The probability of the simultaneous occurrence of two events is equal to the probability of one of the events multiplied by the conditional probability of the other, i.e., for two events A and B,
P(A ∩ B) = P(A) × P(B/A)
where P(B/A) represents the conditional probability of the occurrence of B when the event A has already happened.
Proof.
Suppose a trial results in n exhaustive, mutually exclusive and equally likely outcomes, m of them being favorable to the occurrence of the event A.
∴ Probability of the occurrence of the event A = P(A) = m/n . . . (1)
Out of the m outcomes favorable to the occurrence of A, let m₁ be favorable to the occurrence of the event B.
∴ Conditional probability of B, given that A has happened = P(B/A) = m₁/m . . . (2)
Now, out of n exhaustive, mutually exclusive and equally likely outcomes, m₁ are favorable to the occurrence of A and B.
∴ Probability of simultaneous occurrence of A and B
= P(A ∩ B) = m₁/n = (m₁/m) × (m/n) = (m/n) × (m₁/m)
= P(A) × P(B/A) [Using (1) and (2)]
Hence P(A ∩ B) = P(A) × P(B/A).
Note. P(A ∩ B) is also written as P(AB). Thus P(AB) = P(A) × P(B/A).
Cor. 1. Interchanging A and B,
P(BA) = P(B) × P(A/B) or P(AB) = P(B) × P(A/B) [∵ B ∩ A = A ∩ B]
Cor. 2. If A and B are independent events, then P(B/A) = P(B)
∴ P(AB) = P(A) × P(B).
Generalization. If A₁, A₂, . . . , Aₙ are n independent events, then
P(A₁A₂ . . . Aₙ) = P(A₁) × P(A₂) × . . . × P(Aₙ).
Cor. 3. If p is the chance that an event will occur in one trial, then the chance that it will occur in a succession of r trials is p · p · . . . · p (r times) = pʳ.
Cor. 4. If p₁, p₂, . . . , pₙ are the probabilities that certain independent events occur, then the probabilities of their non-occurrence are 1 – p₁, 1 – p₂, . . . , 1 – pₙ and, therefore, the probability of all of these failing is (1 – p₁)(1 – p₂) . . . (1 – pₙ).
Hence the chance that at least one of these events occurs is
1 – (1 – p₁)(1 – p₂) . . . (1 – pₙ).

ILLUSTRATIVE EXAMPLES
Example 1. A problem in mechanics is given to three students A, B, C whose chances of solving it are 1/2, 1/3, 1/4 respectively. What is the probability that the problem will be solved?
Sol. The probabilities of A, B, C solving the problem are 1/2, 1/3, 1/4.
The probabilities of A, B, C not solving the problem are 1 – 1/2, 1 – 1/3, 1 – 1/4, i.e., 1/2, 2/3, 3/4.
∴ The probability that the problem is not solved by any of them = 1/2 × 2/3 × 3/4 = 1/4.
Hence the probability that the problem is solved by at least one of them = 1 – 1/4 = 3/4.
Example 2. The odds that a book will be favorably reviewed by three independent critics are 5 to 2, 4 to 3, and 3 to 4 respectively. What is the probability that, of the three reviews, a majority will be favorable?
Sol. Let the three critics be A, B, C. The probabilities p₁, p₂, p₃ of the book being favorably reviewed by A, B, C are 5/7, 4/7, 3/7 respectively.
∴ The probabilities that the book is unfavorably reviewed by A, B, C are
1 – 5/7 = 2/7, 1 – 4/7 = 3/7, 1 – 3/7 = 4/7.
A majority will be favorable if the reviews of at least two are favorable.
(i) If A, B, C all review favorably, the probability is
5/7 × 4/7 × 3/7 = 60/343 [p₁p₂p₃]
(ii) If A, B review favorably and C reviews unfavorably, the probability is
5/7 × 4/7 × 4/7 = 80/343 [p₁p₂(1 – p₃)]
(iii) If A, C review favorably and B reviews unfavorably, the probability is
5/7 × 3/7 × 3/7 = 45/343 [p₁(1 – p₂)p₃]
(iv) If B, C review favorably and A reviews unfavorably, the probability is
2/7 × 4/7 × 3/7 = 24/343 [(1 – p₁)p₂p₃]
Hence the probability that a majority will be favorable is
60/343 + 80/343 + 45/343 + 24/343 = 209/343.
Example 3. A can hit a target 4 times in 5 shots; B can hit it 3 times in 4 shots; C can hit it twice in 3 shots. They fire a volley. What is the probability that at least two shots hit?
Sol. Probability of A’s hitting the target = 4/5
Probability of B’s hitting the target = 3/4
Probability of C’s hitting the target = 2/3.
For at least two hits, we may have
(i) A, B, C all hit the target, the probability of which is
4/5 × 3/4 × 2/3 = 24/60.
(ii) A, B hit the target and C misses it, the probability of which is
4/5 × 3/4 × (1 – 2/3) = 4/5 × 3/4 × 1/3 = 12/60.
(iii) A, C hit the target and B misses it, the probability of which is
4/5 × (1 – 3/4) × 2/3 = 4/5 × 1/4 × 2/3 = 8/60.
(iv) B, C hit the target and A misses it, the probability of which is
(1 – 4/5) × 3/4 × 2/3 = 1/5 × 3/4 × 2/3 = 6/60.
Since these are mutually exclusive events, the required probability is
24/60 + 12/60 + 8/60 + 6/60 = 50/60 = 5/6.
Example 4. A has 2 shares in a lottery in which there are 3 prizes and 5 blanks; B has 3 shares in a lottery in which there are 4 prizes and 6 blanks. Show that A’s chance of success is to B’s as 27 : 35.
Sol. A can draw two tickets (out of 3 + 5 = 8) in ⁸C₂ = 28 ways.
A will get all blanks in ⁵C₂ = 10 ways.
∴ A can win a prize in 28 – 10 = 18 ways.
Hence A’s chance of success = 18/28 = 9/14.
B can draw 3 tickets (out of 4 + 6 = 10) in ¹⁰C₃ = 120 ways; B will get all blanks in ⁶C₃ = 20 ways.
∴ B can win a prize in 120 – 20 = 100 ways.
Hence B’s chance of success = 100/120 = 5/6.
∴ A’s chance : B’s chance = 9/14 : 5/6 = 27 : 35.
Example 5. A and B throw alternately with a single die, A having the first throw. The person who first throws a one wins. What are their respective chances of winning?
Sol. The chance of throwing a one with a single die = 1/6
The chance of not throwing a one with a single die = 1 – 1/6 = 5/6.
If A is to win, he should throw a one in the first or third or fifth, . . . , throws.
If B is to win, he should throw a one in the second or fourth or sixth, . . . , throws.
The chances that a one is thrown in the first, second, third, fourth, . . . , throws are
1/6, (5/6)(1/6), (5/6)(5/6)(1/6), (5/6)(5/6)(5/6)(1/6), . . . or 1/6, (5/6)(1/6), (5/6)²(1/6), (5/6)³(1/6), . . .
∴ A’s chance = 1/6 + (5/6)²(1/6) + (5/6)⁴(1/6) + . . . = (1/6)/[1 – (5/6)²] = 6/11
[Sum of an infinite geometric progression = a/(1 – r)]
B’s chance = 1 – 6/11 = 5/11.
Example 6. Cards are dealt one by one from a well-shuffled deck until an ace appears. Show that the probability that exactly n cards are dealt before the first ace appears is
4(51 – n)(50 – n)(49 – n)/(52 · 51 · 50 · 49).
Sol. Let A be the event of drawing n non-ace cards and B the event of drawing an ace in the (n + 1)th draw.
Consider the event A.
n cards can be drawn out of 52 cards in ⁵²Cₙ ways.
⇒ Exhaustive cases = ⁵²Cₙ
n non-ace cards can be drawn out of the 48 non-ace cards in ⁴⁸Cₙ ways.
⇒ Favorable cases = ⁴⁸Cₙ
∴ P(A) = ⁴⁸Cₙ/⁵²Cₙ = [48!/((48 – n)! n!)] × [(52 – n)! n!/52!]
= [48! · (52 – n)(51 – n)(50 – n)(49 – n)(48 – n)!]/[(48 – n)! · 52 · 51 · 50 · 49 · 48!]
= (52 – n)(51 – n)(50 – n)(49 – n)/(52 · 51 · 50 · 49).
Consider the event B.
n cards have already been drawn in the first n draws.
Exhaustive cases = ⁵²⁻ⁿC₁ = 52 – n; Favorable cases = ⁴C₁ = 4
∴ P(B/A) = 4/(52 – n)
Reqd. probability = P(A) · P(B/A)
= [(52 – n)(51 – n)(50 – n)(49 – n)/(52 · 51 · 50 · 49)] × [4/(52 – n)]
= 4(51 – n)(50 – n)(49 – n)/(52 · 51 · 50 · 49).
Example 7. An urn contains 10 white and 3 black balls, while another urn contains 3 white and 5 black balls. Two balls are drawn from the first urn and put into the second urn and then a ball is drawn from the latter. What is the probability that it is a white ball?
Sol. The two balls drawn from the first urn may be
(i) both white (ii) both black (iii) one white and one black.
Let these events be denoted by A, B, C respectively.
P(A) = ¹⁰C₂/¹³C₂ = (10 × 9)/(13 × 12) = 15/26; P(B) = ³C₂/¹³C₂ = (3 × 2)/(13 × 12) = 1/26
P(C) = (¹⁰C₁ × ³C₁)/¹³C₂ = (10 × 3)/[(13 × 12)/(2 × 1)] = 10/26
When two balls are transferred from the first urn to the second urn, the second urn may contain
(i) 5 white and 5 black balls (ii) 3 white and 7 black balls (iii) 4 white and 6 black balls.
Let W denote the event of drawing a white ball from the second urn in the three cases (i), (ii), and (iii).
Now P(W/A) = 5/10, P(W/B) = 3/10, P(W/C) = 4/10
∴ Reqd. probability = P(A) · P(W/A) + P(B) · P(W/B) + P(C) · P(W/C)
= (15/26)(5/10) + (1/26)(3/10) + (10/26)(4/10) = (75 + 3 + 40)/260 = 118/260 = 59/130.

TEST YOUR KNOWLEDGE
1. In a class of 10 students, 4 are boys and the rest are girls. Find the probability that a student selected will be a girl.
2. What is the chance that a (i) non-leap year (ii) leap year should have fifty-three Sundays?
3. A card is drawn from an ordinary deck and a gambler bets that it is a spade or an ace. What are the odds against his winning the bet?
4. An integer is chosen at random from the first two hundred positive integers. What is the probability that the integer chosen is divisible by 6 or 8?
5. Six cards are drawn at random from a deck of 52 cards. What is the probability that 3 will be red and 3 will be black?
6. From a set of raffle tickets numbered 1 to 100, three are drawn at random. What is the probability that all are odd numbered?
7. (a) If from a lottery of 30 tickets, marked 1, 2, 3, . . . , 30, four tickets are drawn, what is the chance that those marked 1 and 2 are among them?
(b) An urn contains 5 red and 10 black balls. Eight of them are placed in another urn. What is the chance that the latter then contains 2 red and 6 black balls?
8. A party of n people sit at a round table.
Find the odds against two specified individuals sitting next to each other.
9. A five-figured number is formed by the digits 0, 1, 2, 3, 4 (without repetition). Find the probability that the number formed is divisible by 4.
10. Three newspapers A, B, C are published in a city and a survey of readers indicates the following: 20% read A, 16% read B, 14% read C, 8% read both A and B, 5% read both A and C, 4% read both B and C, and 2% read all three. For a person chosen at random, find the probability that he reads none of the papers.
11. A problem in statistics is given to five students. Their chances of solving it are 1/2, 1/3, 1/4, 1/4, and 1/5. What is the probability that the problem will be solved?
12. A can hit a target 5 times in 6 shots, B hits it 4 times in 5 shots, and C hits it 3 times in 4 shots. They fire a volley. What is the probability that at least two shots hit the target?
13. Three groups of children contain, respectively, 3 girls and 1 boy; 2 girls and 2 boys; 1 girl and 3 boys. One child is selected at random from each group. Show that the chance that the three selected consist of 1 girl and 2 boys is 13/32.
14. Four people are chosen at random from a group containing 3 men, 2 women, and 4 children. Show that the chance that exactly two of them will be children is 10/21.
15. A bag contains 10 balls, two of which are red, three are blue, and five are black. Three balls are drawn at random from the bag. What is the probability that (i) the three balls are of different colors, (ii) two balls are of the same color, (iii) the balls are all of the same color?
16. It is 8 : 5 against a person who is 40 years old living until they are 70 and 4 : 3 against a person now 50 living until they are 80.
Find the probability that at least one of these people will be alive 30 years from now. 17. Find the chance of throwing 5 or 6 at least once in four throws of a die. 18. A has 3 shares in a lottery where there are 3 prizes and 6 blanks. B has one share in another, where there is just one prize and two blanks. Show that A has a better chance of winning a prize than B in the ratio 16 : 7. 19. A, B, and C, in order, toss a coin. The first one to throw a head wins. If A starts, find their respective chances of winning. 20. A speaks the truth in 60% of cases and B in 70% of cases. In what percentages of cases are they likely to contradict each other in stating the same fact? 21. A and B throw alternately with a pair of ordinary dice. A wins if he throws 6 before B throws 7 and B wins if he throws 7 before A throws 6. If A begins, find their respective chances of winning. (Huygen’s Problem) 22. (a) Two cards are randomly drawn from a deck of 52 cards and thrown away. What is the probability of drawing an ace in a single draw from the remaining 50 cards? (b) A box A contains 2 white and 4 black balls. Another box B contains 5 white and 7 black balls. A ball is transferred from the box A to the box B; then a ball is drawn from box B. Find the probability that it is white. 23. Of the cigarette-smoking population, 70% are men and 30% are women, 10% of these men and 20% of these women smoke ABC Cigarettes. What is the probability that a person seen smoking an ABC cigarette will be a man? 24. A committee consists of 9 students, two of which are in their 1st year, three are in their 2nd year, and four are in their 3rd year. Three students are to be removed at random. What is the chance that (i) the three students belong to different classes, (ii) two belong to the same class and the third to the different class, and (iii) the three belong to the same class? 25. Five workers in a company of twenty are graduates. 
If 3 workers are picked out of 20 at random, what is the probability that (i) they are all graduates? (ii) at least one is a graduate?
26. If A, B, C are events such that P(A) = 0.3, P(B) = 0.4, P(C) = 0.8, P(A ∩ B) = 0.08, P(A ∩ C) = 0.28, P(A ∩ B ∩ C) = 0.09. If P(A ∪ B ∪ C) ≥ 0.75, then show that 0.23 ≤ P(B ∩ C) ≤ 0.48.
27. For two events A and B, let P(A) = 0.4, P(B) = p and P(A ∪ B) = 0.6.
(i) Find p so that A and B are independent events.
(ii) For what value of p are A and B mutually exclusive?
28. A husband and wife appear in an interview for two vacancies in the same position. The probability of the husband’s selection is 1/7 and that of the wife’s selection is 1/5. What is the probability that (i) both of them will be selected, (ii) only one of them will be selected, and (iii) none of them will be selected?
29. Two dice are tossed once. Find the probability of getting an even number on the first throw or a total of 8.
30. A drawer contains 50 bolts and 150 nuts. Half of the bolts and half of the nuts are rusted. If one item is chosen at random, what is the probability that it is rusted or is a bolt?
31. An old purse contains 2 silver and 4 copper coins. A second purse contains 4 silver and 3 copper coins. If a coin is pulled out at random from one of the two purses, what is the probability that it is a silver coin?
32. A class consists of 80 students, 25 of which are girls and 55 are boys, 10 of which have blue eyes and the remaining 20 have brown hair. What is the probability of selecting a brown-haired, blue-eyed girl?
33. Of the students attending a lecture, 50% could not see what was written on the board and 40% could not hear what the lecturer was saying. The most unfortunate 30% fell into both of these categories.
What is the probability that a student picked at random was able to see and hear satisfactorily?

34. The probabilities of A, B, C solving a problem are $\frac{1}{3}$, $\frac{2}{7}$, and $\frac{3}{8}$, respectively. If all three try to solve the problem simultaneously, find the probability that exactly one of them will solve it.

35. A student takes his examinations in four subjects α, β, γ, δ. He estimates his chance of passing in α as $\frac{4}{5}$, in β as $\frac{3}{4}$, in γ as $\frac{5}{6}$, and in δ as $\frac{2}{3}$. To qualify he must pass in α and at least two other subjects. What is the probability that he qualifies?

36. For any two events A and B, prove that P(A ∩ B) ≤ P(A) ≤ P(A ∪ B) ≤ P(A) + P(B).

Answers

1. 5. 9. 3 2. 5 13000 39151 5 10. 16 1 15. (i ) 20. 46% 24. (i ) 29. 32. 6. 4 2 7 5 9 5 512 (ii ) (ii ) 79 120 55 84 (iii ) (iii ) 11 120 5 84 16. (i ) 1 7 (ii ) 33 13 20 59 25. (i ) 8 2 5 9:4 7. (a) 17. 91 5 3. 11. 30 31 , 61 61 33. 7 4 21. 30. 2 1 114 (ii ) 137 228 145 , (b ) 140 429 65 19. 81 27. (i ) 8. 12. 20 (a) 34. 2 17 22. 31. 4. 1 13 , (b ) 16 39 1 (ii ) 0.2 3 23. 28. 1 4 ( n − 3) : 2 107 120 4 2 1 , , 7 7 7 7 13 (i ) 42 56 35 (iii ) 19 25 1 35. 61 90 24 35 (ii ) 2 7

21.50 BAYES’ THEOREM

If $E_1, E_2, \ldots, E_n$ are mutually exclusive and exhaustive events of a random experiment with $P(E_i) \neq 0$ $(i = 1, 2, \ldots, n)$, then for any arbitrary event A of the sample space of the experiment with $P(A) > 0$, we have

$$P(E_i/A) = \frac{P(E_i)\,P(A/E_i)}{\sum_{j=1}^{n} P(E_j)\,P(A/E_j)}$$

Proof. Let S be the sample space of the random experiment. The events $E_1, E_2, \ldots, E_n$ being exhaustive,

$$S = E_1 \cup E_2 \cup \ldots \cup E_n$$

$$A = A \cap S \qquad [\because A \subset S]$$
$$\phantom{A} = A \cap (E_1 \cup E_2 \cup \ldots \cup E_n)$$
$$\phantom{A} = (A \cap E_1) \cup (A \cap E_2) \cup \ldots \cup (A \cap E_n) \qquad [\text{Distributive Law}]$$

The events $A \cap E_1, A \cap E_2, \ldots, A \cap E_n$ are mutually exclusive because the $E_j$ are, so

$$P(A) = P(A \cap E_1) + P(A \cap E_2) + \ldots + P(A \cap E_n)$$
$$\phantom{P(A)} = P(E_1)P(A/E_1) + P(E_2)P(A/E_2) + \ldots + P(E_n)P(A/E_n)$$
$$\phantom{P(A)} = \sum_{j=1}^{n} P(E_j)\,P(A/E_j) \qquad \ldots (1)$$

Now $P(A \cap E_i) = P(A)\,P(E_i/A)$

$$\Rightarrow\quad P(E_i/A) = \frac{P(A \cap E_i)}{P(A)} = \frac{P(E_i)\,P(A/E_i)}{\sum_{j=1}^{n} P(E_j)\,P(A/E_j)} \qquad [\text{Using (1)}]$$

Note. The significance of Bayes’ Theorem may be understood in the following manner: $P(E_i)$ is the probability of the occurrence of $E_i$. The experiment is performed and we are told that the event A has occurred. With this information, the probability $P(E_i)$ is changed to $P(E_i/A)$. Bayes’ Theorem enables us to evaluate $P(E_i/A)$ if all the $P(E_i)$ and the conditional probabilities $P(A/E_i)$ are known.

ILLUSTRATIVE EXAMPLES

Example 1. A bag X contains 2 white and 3 red balls and a bag Y contains 4 white and 5 red balls. One ball is drawn at random from one of the bags and is found to be red. Find the probability that it was drawn from bag Y.

Sol. Let $E_1$: the ball is drawn from bag X; $E_2$: the ball is drawn from bag Y; and A: the ball is red. We have to find $P(E_2/A)$. By Bayes’ Theorem,

$$P(E_2/A) = \frac{P(E_2)\,P(A/E_2)}{P(E_1)\,P(A/E_1) + P(E_2)\,P(A/E_2)} \qquad \ldots (1)$$

Since the two bags are equally likely to be selected, $P(E_1) = P(E_2) = \frac{1}{2}$.

Also $P(A/E_1)$ = P(a red ball is drawn from bag X) = $\frac{3}{5}$

$P(A/E_2)$ = P(a red ball is drawn from bag Y) = $\frac{5}{9}$

∴ From (1), we have

$$P(E_2/A) = \frac{\frac{1}{2} \times \frac{5}{9}}{\frac{1}{2} \times \frac{3}{5} + \frac{1}{2} \times \frac{5}{9}} = \frac{25}{52}.$$

Example 2. In a bolt factory, machines A, B, and C manufacture respectively 25%, 35%, and 40% of the total. Of their output 5, 4, and 2 percent are defective bolts. A bolt is drawn at random from the product and is found to be defective. What is the probability that it was manufactured by machine B?

Sol. Let $E_1$, $E_2$, and $E_3$ denote the events that a bolt selected at random is manufactured by the machines A, B, and C respectively and let H denote the event of its being defective.
Then $P(E_1) = 0.25$, $P(E_2) = 0.35$, $P(E_3) = 0.40$.

The probability of drawing a defective bolt manufactured by machine A is $P(H/E_1) = 0.05$.

Similarly, $P(H/E_2) = 0.04$ and $P(H/E_3) = 0.02$.

By Bayes’ Theorem, we have

$$P(E_2/H) = \frac{P(E_2)\,P(H/E_2)}{P(E_1)\,P(H/E_1) + P(E_2)\,P(H/E_2) + P(E_3)\,P(H/E_3)}$$

$$= \frac{0.35 \times 0.04}{0.25 \times 0.05 + 0.35 \times 0.04 + 0.40 \times 0.02} = \frac{0.0140}{0.0345} = 0.41.$$

Example 3. The contents of bags I, II, and III are as follows: 1 white, 2 black, and 3 red balls; 2 white, 1 black, and 1 red balls; and 4 white, 5 black, and 3 red balls. One bag is chosen at random and two balls are drawn from it. They happen to be white and red. What is the probability that they come from bags I, II, or III?

Sol. Let $E_1$: bag I is chosen; $E_2$: bag II is chosen; $E_3$: bag III is chosen; and A: the two balls are white and red. We have to find $P(E_1/A)$, $P(E_2/A)$, and $P(E_3/A)$.

Now $P(E_1) = P(E_2) = P(E_3) = \frac{1}{3}$

$$P(A/E_1) = P(\text{a white and a red ball are drawn from bag I}) = \frac{{}^{1}C_1 \times {}^{3}C_1}{{}^{6}C_2} = \frac{1}{5}$$

$$P(A/E_2) = \frac{{}^{2}C_1 \times {}^{1}C_1}{{}^{4}C_2} = \frac{1}{3}; \qquad P(A/E_3) = \frac{{}^{4}C_1 \times {}^{3}C_1}{{}^{12}C_2} = \frac{2}{11}$$

By Bayes’ Theorem, we have

$$P(E_1/A) = \frac{P(E_1)\,P(A/E_1)}{P(E_1)\,P(A/E_1) + P(E_2)\,P(A/E_2) + P(E_3)\,P(A/E_3)} = \frac{\frac{1}{3} \times \frac{1}{5}}{\frac{1}{3} \times \frac{1}{5} + \frac{1}{3} \times \frac{1}{3} + \frac{1}{3} \times \frac{2}{11}} = \frac{33}{118}$$

Similarly, $P(E_2/A) = \frac{55}{118}$ and $P(E_3/A) = \frac{15}{59}$.

TEST YOUR KNOWLEDGE

1. Two bags contain 4 white, 6 blue and 4 white, 5 blue balls, respectively. One of the bags is selected at random and a ball is drawn from it. If the ball drawn is white, find the probability that it is drawn from the (i) first bag (ii) second bag.

2. Three bags contain 6 red, 4 black; 4 red, 6 black; and 5 red, 5 black balls, respectively. One of the bags is selected at random and a ball is drawn from it. If the ball drawn is red, find the probability that it is drawn from the first bag.

3. A factory has two machines A and B. Past records show that machine A produced 60% of the items of output and machine B produced 40% of the items. Further, 2% of the items produced by machine A were defective and 1% produced by machine B were defective. If a defective item is drawn at random, what is the probability that it was produced by machine A?

4. An insurance company insured 2000 motorcycle drivers, 4000 car drivers, and 6000 truck drivers. The probability of an accident is 0.01, 0.03, and 0.15 respectively. One of the insured persons has an accident. What is the probability that he is a motorcycle driver?

5. A company has two plants to manufacture scooters. Plant I manufactures 70% of scooters and plant II manufactures 30%. At plant I, 80% of the scooters are rated standard quality and at plant II, 90% of the scooters are rated standard quality. A scooter is chosen at random and is found to be of standard quality. What is the chance that it has come from plant II?

Answers

1. (i) $\frac{9}{19}$ (ii) $\frac{10}{19}$
2. $\frac{2}{5}$
3. $\frac{3}{4}$
4. $\frac{1}{52}$
5. $\frac{27}{83}$

21.51 RANDOM VARIABLE

If the numerical values assumed by a variable are the result of some chance factors, so that a particular value cannot be exactly predicted in advance, the variable is then called a random variable. A random variable is also called a chance variable or a stochastic variable.

Random variables are denoted by capital letters, usually from the last part of the alphabet, for instance, X, Y, Z, etc.

Continuous and Discrete Random Variables

A continuous random variable is one that can assume any value within an interval, i.e., all values of a continuous scale. For example, (i) the weights (in kg) of a group of individuals, (ii) the heights of a group of individuals.

A discrete random variable is one that can assume only isolated values.
For example, (i) the number of heads in 4 tosses of a coin is a discrete random variable as it cannot assume values other than 0, 1, 2, 3, 4; (ii) the number of aces in a draw of 2 cards from a well-shuffled deck is a discrete random variable as it can take the values 0, 1, 2 only.

21.52 DISCRETE PROBABILITY DISTRIBUTION

Let a random variable X assume values $x_1, x_2, x_3, \ldots, x_n$ with probabilities $p_1, p_2, p_3, \ldots, p_n$ respectively, where $P(X = x_i) = p_i \geq 0$ for each $x_i$ and

$$p_1 + p_2 + p_3 + \ldots + p_n = \sum_{i=1}^{n} p_i = 1.$$

Then

$$\begin{array}{c|ccccc} X & x_1 & x_2 & x_3 & \ldots & x_n \\ \hline P(X) & p_1 & p_2 & p_3 & \ldots & p_n \end{array}$$

is called the discrete probability distribution for X and it spells out how a total probability of 1 is distributed over several values of the random variable.

21.53 MEAN AND VARIANCE OF RANDOM VARIABLES

Let

$$\begin{array}{c|ccccc} X & x_1 & x_2 & x_3 & \ldots & x_n \\ \hline P(X) & p_1 & p_2 & p_3 & \ldots & p_n \end{array}$$

be a discrete probability distribution.

We denote the mean by μ and define

$$\mu = \frac{\Sigma p_i x_i}{\Sigma p_i} = \Sigma p_i x_i \qquad (\because \Sigma p_i = 1)$$

Other names for the mean are average or expected value E(X).

We denote the variance by $\sigma^2$ and define

$$\sigma^2 = \Sigma p_i (x_i - \mu)^2$$

If μ is not a whole number, then the equivalent form

$$\sigma^2 = \Sigma p_i x_i^2 - \mu^2$$

is more convenient for computation.

Standard deviation $\sigma = +\sqrt{\text{Variance}}$.

ILLUSTRATIVE EXAMPLES

Example 1. Five defective bulbs are accidentally mixed with twenty good ones. It is not possible to just look at a bulb and tell whether or not it is defective. Find the probability distribution of the number of defective bulbs, if four bulbs are drawn at random from this lot.

Sol. Let X denote the number of defective bulbs out of four. Clearly, X can take the values 0, 1, 2, 3, or 4.

Number of defective bulbs = 5
Number of good bulbs = 20
Total number of bulbs = 25

$$P(X = 0) = P(\text{no defective}) = P(\text{all 4 good ones}) = \frac{{}^{20}C_4}{{}^{25}C_4} = \frac{20 \times 19 \times 18 \times 17}{25 \times 24 \times 23 \times 22} = \frac{969}{2530}$$

$$P(X = 1) = P(\text{1 defective and 3 good ones}) = \frac{{}^{5}C_1 \times {}^{20}C_3}{{}^{25}C_4} = \frac{1140}{2530}$$

$$P(X = 2) = P(\text{2 defectives and 2 good ones}) = \frac{{}^{5}C_2 \times {}^{20}C_2}{{}^{25}C_4} = \frac{380}{2530}$$

$$P(X = 3) = P(\text{3 defectives and 1 good one}) = \frac{{}^{5}C_3 \times {}^{20}C_1}{{}^{25}C_4} = \frac{40}{2530}$$

$$P(X = 4) = P(\text{all 4 defectives}) = \frac{{}^{5}C_4}{{}^{25}C_4} = \frac{1}{2530}$$

∴ The probability distribution of the random variable X is

$$\begin{array}{c|ccccc} X & 0 & 1 & 2 & 3 & 4 \\ \hline P(X) & \frac{969}{2530} & \frac{1140}{2530} & \frac{380}{2530} & \frac{40}{2530} & \frac{1}{2530} \end{array}$$

Example 2. A die is tossed three times. A success is “getting 1 or 6” on a toss. Find the mean and the variance of the number of successes.

Sol. Let X denote the number of successes. Clearly X can take the values 0, 1, 2, or 3.

Probability of success = $\frac{2}{6} = \frac{1}{3}$; Probability of failure = $1 - \frac{1}{3} = \frac{2}{3}$

$$P(X = 0) = P(\text{no success}) = P(\text{all 3 failures}) = \frac{2}{3} \times \frac{2}{3} \times \frac{2}{3} = \frac{8}{27}$$

$$P(X = 1) = P(\text{1 success and 2 failures}) = {}^{3}C_1 \times \frac{1}{3} \times \frac{2}{3} \times \frac{2}{3} = \frac{12}{27}$$

$$P(X = 2) = P(\text{2 successes and 1 failure}) = {}^{3}C_2 \times \frac{1}{3} \times \frac{1}{3} \times \frac{2}{3} = \frac{6}{27}$$

$$P(X = 3) = P(\text{all 3 successes}) = \frac{1}{3} \times \frac{1}{3} \times \frac{1}{3} = \frac{1}{27}$$

∴ The probability distribution of the random variable X is

$$\begin{array}{c|cccc} X & 0 & 1 & 2 & 3 \\ \hline P(X) & \frac{8}{27} & \frac{12}{27} & \frac{6}{27} & \frac{1}{27} \end{array}$$

To find the mean and variance:

$$\begin{array}{c|c|c|c} x_i & p_i & p_i x_i & p_i x_i^2 \\ \hline 0 & \frac{8}{27} & 0 & 0 \\ 1 & \frac{12}{27} & \frac{12}{27} & \frac{12}{27} \\ 2 & \frac{6}{27} & \frac{12}{27} & \frac{24}{27} \\ 3 & \frac{1}{27} & \frac{3}{27} & \frac{9}{27} \\ \hline \text{Total} & & 1 & \frac{5}{3} \end{array}$$

Mean $\mu = \Sigma p_i x_i = 1$

Variance $\sigma^2 = \Sigma p_i x_i^2 - \mu^2 = \frac{5}{3} - 1 = \frac{2}{3}$.

Example 3. A random variable X has the following probability function:

$$\begin{array}{c|cccccccc} \text{Values of X, } x & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline p(x) & 0 & k & 2k & 2k & 3k & k^2 & 2k^2 & 7k^2 + k \end{array}$$

(i) Find k.
(ii) Evaluate P(X < 6), P(X ≥ 6), P(3 < X ≤ 6).
(iii) Find the minimum value of x so that $P(X \leq x) > \frac{1}{2}$.

Sol.
(i) Since $\sum_{x=0}^{7} p(x) = 1$, we have

$$0 + k + 2k + 2k + 3k + k^2 + 2k^2 + 7k^2 + k = 1$$
$$\Rightarrow\quad 10k^2 + 9k - 1 = 0 \quad\Rightarrow\quad (10k - 1)(k + 1) = 0$$
$$\Rightarrow\quad k = \frac{1}{10} \qquad [\because p(x) \geq 0]$$

(ii) $P(X < 6) = P(X = 0) + P(X = 1) + \ldots + P(X = 5)$

$$= 0 + k + 2k + 2k + 3k + k^2 = 8k + k^2 = \frac{8}{10} + \frac{1}{100} = \frac{81}{100}$$

$P(X \geq 6) = P(X = 6) + P(X = 7)$

$$= 2k^2 + 7k^2 + k = \frac{9}{100} + \frac{1}{10} = \frac{19}{100}$$

$P(3 < X \leq 6) = P(X = 4) + P(X = 5) + P(X = 6)$

$$= 3k + k^2 + 2k^2 = \frac{3}{10} + \frac{3}{100} = \frac{33}{100}$$

(iii) $P(X \leq 1) = k = \frac{1}{10} < \frac{1}{2}$; $\quad P(X \leq 2) = k + 2k = \frac{3}{10} < \frac{1}{2}$

$P(X \leq 3) = k + 2k + 2k = \frac{5}{10} = \frac{1}{2}$; $\quad P(X \leq 4) = k + 2k + 2k + 3k = \frac{8}{10} > \frac{1}{2}$

∴ The minimum value of x so that $P(X \leq x) > \frac{1}{2}$ is 4.

TEST YOUR KNOWLEDGE

1. Find the probability distribution of the number of doubles in four throws of a pair of dice.

2. Two bad eggs are mixed accidentally with 10 good ones. Find the probability distribution of the number of bad eggs in 3, drawn at random, without replacement, from this lot.

3. A die is tossed twice. Getting a number greater than 4 is considered a success. Find the variance of the probability distribution of the number of successes.

4. Two cards are drawn simultaneously from a well-shuffled deck of 52 cards. Compute the variance for the number of aces.

5. A bag contains 4 white and 3 red balls. Three balls are drawn, with replacement, from this bag. Find μ, $\sigma^2$, and σ for the number of red balls drawn.

6. A random variable X has the following probability distribution:

$$\begin{array}{c|ccccccccc} \text{Values of X, } x & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline p(x) & a & 3a & 5a & 7a & 9a & 11a & 13a & 15a & 17a \end{array}$$

(i) Determine the value of a.
(ii) Find P(X < 3), P(X ≥ 3), P(2 ≤ X < 5).
(iii) What is the smallest value of x for which P(X ≤ x) > 0.5?

7. Find the standard deviation for the following discrete distribution:

$$\begin{array}{c|ccccc} x & 8 & 12 & 16 & 20 & 24 \\ \hline p(x) & \frac{1}{8} & \frac{1}{6} & \frac{3}{8} & \frac{1}{4} & \frac{1}{12} \end{array}$$

Answers

1.
$$\begin{array}{c|ccccc} X & 0 & 1 & 2 & 3 & 4 \\ \hline P(X) & \frac{625}{1296} & \frac{500}{1296} & \frac{150}{1296} & \frac{20}{1296} & \frac{1}{1296} \end{array}$$

2.
$$\begin{array}{c|ccc} X & 0 & 1 & 2 \\ \hline P(X) & \frac{12}{22} & \frac{9}{22} & \frac{1}{22} \end{array}$$

3. $\frac{4}{9}$
4. $\frac{400}{2873}$
5. $\frac{9}{7}$, $\frac{36}{49}$, $\frac{6}{7}$
6. (i) $a = \frac{1}{81}$ (ii) $\frac{1}{9}$, $\frac{8}{9}$, $\frac{7}{27}$ (iii) 6, since $P(X \leq 5) = \frac{36}{81} < 0.5$ and $P(X \leq 6) = \frac{49}{81} > 0.5$
7. $2\sqrt{5}$

21.54 THEORETICAL DISTRIBUTIONS

Frequency distributions can be classified under two heads:
(i) Observed Frequency Distributions.
(ii) Theoretical or Expected Frequency Distributions.

Observed frequency distributions are based on actual observation and experimentation. If a certain hypothesis is assumed, it is sometimes possible to derive mathematically what the frequency distribution of a certain universe should be. Such distributions are called Theoretical Distributions.

There are many types of theoretical frequency distributions, but we shall consider only three that are of great importance:
(i) Binomial Distribution (or Bernoulli’s Distribution);
(ii) Poisson’s Distribution;
(iii) Normal Distribution.

BINOMIAL (OR BERNOULLI’S) DISTRIBUTION

21.55 BINOMIAL PROBABILITY DISTRIBUTION

Let there be n trials in an experiment. Let a random variable X denote the number of successes in these n trials. Let p be the probability of a success and q be that of a failure in a single trial so that p + q = 1. Let the trials be independent and p be constant for every trial.

Let us find the probability of r successes in n trials. The r successes can be placed among the n trials in ${}^{n}C_r$ ways, and by independence each such arrangement has the same probability.

$$\therefore\quad P(X = r) = {}^{n}C_r \, P(\underbrace{S\,S\,S \ldots S}_{r \text{ times}}\ \underbrace{F\,F\,F \ldots F}_{(n-r) \text{ times}})$$

$$= {}^{n}C_r \, \underbrace{P(S)P(S) \ldots P(S)}_{r \text{ factors}}\ \underbrace{P(F)P(F) \ldots P(F)}_{(n-r) \text{ factors}}$$

$$= {}^{n}C_r \, \underbrace{p\,p\,p \ldots p}_{r \text{ factors}}\ \underbrace{q\,q\,q \ldots q}_{(n-r) \text{ factors}} = {}^{n}C_r \, p^r q^{n-r}$$

Hence

$$P(X = r) = {}^{n}C_r \, q^{n-r} p^r, \qquad \ldots (1)$$

where p + q = 1 and r = 0, 1, 2, . . . , n.
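The formula for P(X = r) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `binomial_pmf` and `binomial_distribution` are illustrative choices, and the code simply evaluates ${}^{n}C_r \, q^{n-r} p^r$ for given n and p using Python's standard library.

```python
from math import comb


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Return P(X = r) = nCr * q^(n-r) * p^r, where q = 1 - p."""
    if not 0 <= r <= n:
        return 0.0  # r successes are impossible outside 0..n
    q = 1.0 - p
    return comb(n, r) * q ** (n - r) * p ** r


def binomial_distribution(n: int, p: float) -> list[float]:
    """Return [P(0), P(1), ..., P(n)] for n trials with success probability p."""
    return [binomial_pmf(r, n, p) for r in range(n + 1)]


if __name__ == "__main__":
    dist = binomial_distribution(4, 0.5)
    print(dist)       # [0.0625, 0.25, 0.375, 0.25, 0.0625]
    print(sum(dist))  # 1.0
```

The probabilities over r = 0, 1, . . . , n add up to 1, as they must for any probability distribution.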
The distribution (1) is called the binomial probability distribution and X is called the binomial variate.

Note 1. P(X = r) is usually written as P(r).

Note 2. The successive probabilities P(r) in (1) for r = 0, 1, 2, . . . , n are

$${}^{n}C_0 q^n,\ {}^{n}C_1 q^{n-1}p,\ {}^{n}C_2 q^{n-2}p^2,\ \ldots,\ {}^{n}C_n p^n$$

which are the successive terms of the binomial expansion of $(q + p)^n$. That is why this distribution is called the “binomial” distribution.

Note 3. n and p occurring in the binomial distribution are called the parameters of the distribution.

Note 4. In a binomial distribution:
(i) n, the number of trials, is finite.
(ii) each trial has only two possible outcomes, usually called success and failure.
(iii) all the trials are independent.
(iv) p (and hence q) is constant for all the trials.

21.56 RECURRENCE OR RECURSION FORMULA FOR THE BINOMIAL DISTRIBUTION

In a binomial distribution,

$$P(r) = {}^{n}C_r \, q^{n-r} p^r = \frac{n!}{(n-r)!\,r!}\, q^{n-r} p^r$$

$$P(r+1) = {}^{n}C_{r+1} \, q^{n-r-1} p^{r+1} = \frac{n!}{(n-r-1)!\,(r+1)!}\, q^{n-r-1} p^{r+1}$$

$$\therefore\quad \frac{P(r+1)}{P(r)} = \frac{(n-r)!}{(n-r-1)!} \times \frac{r!}{(r+1)!} \times \frac{p}{q} = \frac{(n-r) \times (n-r-1)!}{(n-r-1)!} \times \frac{r!}{(r+1) \times r!} \times \frac{p}{q} = \left(\frac{n-r}{r+1}\right) \cdot \frac{p}{q}$$

$$\Rightarrow\quad P(r+1) = \frac{n-r}{r+1} \cdot \frac{p}{q}\, P(r)$$

which is the required recurrence formula. Applying this formula successively, we can find P(1), P(2), P(3), . . . , if P(0) is known.

21.57 MEAN AND VARIANCE OF THE BINOMIAL DISTRIBUTION

For the binomial distribution, $P(r) = {}^{n}C_r \, q^{n-r} p^r$.

$$\begin{aligned}
\text{Mean } \mu &= \sum_{r=0}^{n} r\,P(r) = \sum_{r=0}^{n} r \cdot {}^{n}C_r \, q^{n-r} p^r \\
&= 0 + 1 \cdot {}^{n}C_1 q^{n-1} p + 2 \cdot {}^{n}C_2 q^{n-2} p^2 + 3 \cdot {}^{n}C_3 q^{n-3} p^3 + \ldots + n \cdot {}^{n}C_n p^n \\
&= n q^{n-1} p + 2 \cdot \frac{n(n-1)}{2 \cdot 1}\, q^{n-2} p^2 + 3 \cdot \frac{n(n-1)(n-2)}{3 \cdot 2 \cdot 1}\, q^{n-3} p^3 + \ldots + n p^n \\
&= n q^{n-1} p + n(n-1) q^{n-2} p^2 + \frac{n(n-1)(n-2)}{2 \cdot 1}\, q^{n-3} p^3 + \ldots + n p^n \\
&= np \left[ q^{n-1} + (n-1) q^{n-2} p + \frac{(n-1)(n-2)}{2 \cdot 1}\, q^{n-3} p^2 + \ldots + p^{n-1} \right] \\
&= np \left[ {}^{n-1}C_0 q^{n-1} + {}^{n-1}C_1 q^{n-2} p + {}^{n-1}C_2 q^{n-3} p^2 + \ldots + {}^{n-1}C_{n-1} p^{n-1} \right] \\
&= np\,(q + p)^{n-1} = np \qquad (\because p + q = 1)
\end{aligned}$$

Hence the mean of the binomial distribution is np.

$$\begin{aligned}
\text{Variance } \sigma^2 &= \sum_{r=0}^{n} r^2 P(r) - \mu^2 = \sum_{r=0}^{n} [r + r(r-1)]\,P(r) - \mu^2 \\
&= \sum_{r=0}^{n} r\,P(r) + \sum_{r=0}^{n} r(r-1)\,P(r) - \mu^2 = \mu + \sum_{r=2}^{n} r(r-1)\, {}^{n}C_r \, q^{n-r} p^r - \mu^2
\end{aligned}$$

(since the contribution due to r = 0 and r = 1 is zero)

$$\begin{aligned}
&= \mu + \left[ 2 \cdot 1 \cdot {}^{n}C_2 q^{n-2} p^2 + 3 \cdot 2 \cdot {}^{n}C_3 q^{n-3} p^3 + \ldots + n(n-1)\, {}^{n}C_n p^n \right] - \mu^2 \\
&= \mu + \left[ 2 \cdot 1 \cdot \frac{n(n-1)}{2 \cdot 1}\, q^{n-2} p^2 + 3 \cdot 2 \cdot \frac{n(n-1)(n-2)}{3 \cdot 2 \cdot 1}\, q^{n-3} p^3 + \ldots + n(n-1) p^n \right] - \mu^2 \\
&= \mu + \left[ n(n-1) q^{n-2} p^2 + n(n-1)(n-2) q^{n-3} p^3 + \ldots + n(n-1) p^n \right] - \mu^2 \\
&= \mu + n(n-1) p^2 \left[ q^{n-2} + (n-2) q^{n-3} p + \ldots + p^{n-2} \right] - \mu^2 \\
&= \mu + n(n-1) p^2 \left[ {}^{n-2}C_0 q^{n-2} + {}^{n-2}C_1 q^{n-3} p + \ldots + {}^{n-2}C_{n-2} p^{n-2} \right] - \mu^2 \\
&= \mu + n(n-1) p^2 (q + p)^{n-2} - \mu^2 \\
&= \mu + n(n-1) p^2 - \mu^2 \qquad [\because q + p = 1] \\
&= np + n(n-1) p^2 - n^2 p^2 \qquad [\because \mu = np] \\
&= np\,[1 + (n-1)p - np] = np\,[1 - p] = npq.
\end{aligned}$$

Hence the variance of the binomial distribution is npq.

Standard deviation of the binomial distribution is $\sqrt{npq}$.

Similarly, we can prove that

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(q-p)^2}{npq} = \frac{(1-2p)^2}{npq}; \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = 3 + \frac{1 - 6pq}{npq}$$

Hence

$$\gamma_1 = \sqrt{\beta_1} = \frac{q - p}{\sqrt{npq}} = \frac{1 - 2p}{\sqrt{npq}}; \qquad \gamma_2 = \beta_2 - 3 = \frac{1 - 6pq}{npq}$$

Note. $\gamma_1 = \frac{1 - 2p}{\sqrt{npq}}$ gives a measure of skewness of the binomial distribution. If $p < \frac{1}{2}$, skewness is positive, if $p > \frac{1}{2}$, skewness is negative, and if $p = \frac{1}{2}$, it is zero.

$\beta_2 = 3 + \frac{1 - 6pq}{npq}$ gives a measure of the kurtosis of the binomial distribution.

ILLUSTRATIVE EXAMPLES

Example 1. One ship out of 9 was sunk on an average in making a certain voyage.
What was the probability that exactly 3 out of a convoy of 6 ships would arrive safely?

Sol. p, the probability of a ship arriving safely = $1 - \frac{1}{9} = \frac{8}{9}$; $q = \frac{1}{9}$, n = 6

Binomial distribution is $\left(\frac{1}{9} + \frac{8}{9}\right)^6$

The probability that exactly 3 ships arrive safely = ${}^{6}C_3 \left(\frac{1}{9}\right)^3 \left(\frac{8}{9}\right)^3 = \frac{10240}{9^6}$.

Example 2. Assume that on the average one telephone number out of fifteen called between 2 P.M. and 3 P.M. on week-days is busy. What is the probability that if 6 randomly selected telephone numbers are called (i) not more than three, (ii) at least three of them will be busy?

Sol. p, the probability of a telephone number being busy between 2 P.M. and 3 P.M. on week-days = $\frac{1}{15}$

$q = 1 - \frac{1}{15} = \frac{14}{15}$, n = 6; Binomial distribution is $\left(\frac{14}{15} + \frac{1}{15}\right)^6$

The probability that not more than three will be busy

$$= p(0) + p(1) + p(2) + p(3)$$

$$= {}^{6}C_0 \left(\frac{14}{15}\right)^6 + {}^{6}C_1 \left(\frac{14}{15}\right)^5 \left(\frac{1}{15}\right) + {}^{6}C_2 \left(\frac{14}{15}\right)^4 \left(\frac{1}{15}\right)^2 + {}^{6}C_3 \left(\frac{14}{15}\right)^3 \left(\frac{1}{15}\right)^3$$

$$= \frac{(14)^3}{(15)^6}\,[2744 + 1176 + 210 + 20] = \frac{2744 \times 4150}{(15)^6} = 0.9997$$

The probability that at least three of them will be busy

$$= p(3) + p(4) + p(5) + p(6)$$

$$= {}^{6}C_3 \left(\frac{14}{15}\right)^3 \left(\frac{1}{15}\right)^3 + {}^{6}C_4 \left(\frac{14}{15}\right)^2 \left(\frac{1}{15}\right)^4 + {}^{6}C_5 \left(\frac{14}{15}\right) \left(\frac{1}{15}\right)^5 + {}^{6}C_6 \left(\frac{1}{15}\right)^6 = 0.005.$$

Example 3. Six dice are thrown 729 times. How many times do you expect at least three dice to show a five or six?

Sol. p = the chance of getting 5 or 6 with one die = $\frac{2}{6} = \frac{1}{3}$

$q = 1 - \frac{1}{3} = \frac{2}{3}$, n = 6, N = 729

since dice are in sets of 6 and there are 729 sets.

The binomial distribution is $N(q + p)^n = 729 \left(\frac{2}{3} + \frac{1}{3}\right)^6$

The expected number of times at least three dice will show five or six

$$= 729 \left[ {}^{6}C_3 \left(\frac{2}{3}\right)^3 \left(\frac{1}{3}\right)^3 + {}^{6}C_4 \left(\frac{2}{3}\right)^2 \left(\frac{1}{3}\right)^4 + {}^{6}C_5 \left(\frac{2}{3}\right) \left(\frac{1}{3}\right)^5 + {}^{6}C_6 \left(\frac{1}{3}\right)^6 \right]$$

$$= \frac{729}{3^6}\,[160 + 60 + 12 + 1] = 233$$

Example 4. Out of 800 families with 4 children each, how many families would be expected to have (i) 2 boys and 2 girls (ii) at least one boy (iii) no girl (iv) at most two girls? Assume equal probabilities for boys and girls.

Sol. Since probabilities for boys and girls are equal,

p = probability of having a boy = $\frac{1}{2}$; q = probability of having a girl = $\frac{1}{2}$

n = 4, N = 800 ∴ The binomial distribution is $800 \left(\frac{1}{2} + \frac{1}{2}\right)^4$.

(i) The expected number of families having 2 boys and 2 girls

$$= 800 \cdot {}^{4}C_2 \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^2 = 800 \times 6 \times \frac{1}{16} = 300.$$

(ii) The expected number of families having at least one boy

$$= 800 \left[ {}^{4}C_1 \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right) + {}^{4}C_2 \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^2 + {}^{4}C_3 \left(\frac{1}{2}\right) \left(\frac{1}{2}\right)^3 + {}^{4}C_4 \left(\frac{1}{2}\right)^4 \right] = 800 \times \frac{1}{16}\,[4 + 6 + 4 + 1] = 750.$$

(iii) The expected number of families having no girl, i.e., having 4 boys

$$= 800 \cdot {}^{4}C_4 \left(\frac{1}{2}\right)^4 = 50.$$

(iv) The expected number of families having at most two girls, i.e., having at least 2 boys

$$= 800 \left[ {}^{4}C_2 \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^2 + {}^{4}C_3 \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right) + {}^{4}C_4 \left(\frac{1}{2}\right)^4 \right] = 800 \times \frac{1}{16}\,[6 + 4 + 1] = 550.$$

TEST YOUR KNOWLEDGE

1. Ten coins are tossed simultaneously. Find the probability of getting at least seven heads.

2. The probability of any ship of a company being destroyed on a certain voyage is 0.02. The company owns 6 ships for the voyage. What is the probability of: (i) losing one ship (ii) losing at most two ships (iii) losing none.
3. The probability that a man aged 60 will live to be 70 is 0.65. What is the probability that out of ten men now 60, at least 7 would live to be 70?

4. The incidence of occupational disease in an industry is such that the workers have a 20% chance of suffering from it. What is the probability that out of six workers chosen at random, four or more will suffer from the disease?

5. The probability that a pen manufactured by a company will be defective is $\frac{1}{10}$. If 12 such pens are manufactured, find the probability that (i) exactly two will be defective (ii) at least two will be defective (iii) none will be defective.

6. If the chance that one of the ten telephone lines is busy at an instant is 0.2,
(i) What is the chance that 5 of the lines are busy?
(ii) What is the probability that all the lines are busy?

7. If on an average 1 vessel in every 10 is wrecked, find the probability that out of 5 vessels expected to arrive, at least 4 will arrive safely.

8. A product is 0.5% defective and is packed in cartons of 100. What percentage contains not more than 3 defectives?

9. A bag contains 5 white, 7 red, and 8 black balls. If four balls are drawn one by one, with replacement, what is the probability that (i) none is white (ii) all are white (iii) at least one is white (iv) only 2 are white?

10. In a hurdle race, a player has to cross 10 hurdles. The probability that he will clear each hurdle is $\frac{5}{6}$. What is the probability that he will knock down fewer than 2 hurdles?

11. Fit a binomial distribution for the following data and compare the theoretical frequencies with the actual ones:

$$\begin{array}{c|cccccc} x & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline f & 2 & 14 & 20 & 34 & 22 & 8 \end{array}$$

12. If the sum of mean and variance of a binomial distribution is 4.8 for five trials, find the distribution.

13.
If the mean of a binomial distribution is 3 and the variance is $\frac{3}{2}$, find the probability of obtaining at least 4 successes.

14. In 800 families with 5 children each, how many families would be expected to have (i) 3 boys and 2 girls, (ii) 2 boys and 3 girls, (iii) no girl, (iv) at the most two girls? (Assume probabilities for boys and girls to be equal.)

15. In 100 sets of ten tosses of an unbiased coin, in how many cases do you expect to get (i) 7 heads and 3 tails (ii) at least 7 heads?

16. The following data are the number of seeds germinating out of 10 on a damp filter for 80 sets of seeds. Fit a binomial distribution to this data:

$$\begin{array}{c|ccccccccccc|c} x & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & \text{Total} \\ \hline f & 6 & 20 & 28 & 12 & 8 & 6 & 0 & 0 & 0 & 0 & 0 & 80 \end{array}$$

[Hint. Here n = 10, N = 80, Mean = $\frac{\Sigma f x}{\Sigma f}$ = 2.175 ∴ np = 2.175 etc.]

17. A bag contains 10 balls each marked with one of the digits 0 to 9. If four balls are drawn successively (with replacement) from the bag, what is the probability that none is marked with the digit 0?

18. A box contains 100 tickets each bearing one of the numbers from 1 to 100. If 5 tickets are drawn successively (with replacement) from the box, find the probability that all the tickets bear numbers divisible by 10.

19. The probability that a ball thrown by a child will strike a target is $\frac{1}{5}$. If six balls are thrown, find the probability that (i) exactly two will strike the target, (ii) at least two will strike the target.

20. In sampling a large number of parts manufactured by a machine, the mean number of defectives in a sample of 20 is 2. Out of 1000 such samples, how many would be expected to contain at least 3 defective parts?

Answers

1. $\frac{11}{64}$
2. (i) 0.1085 (ii) 0.9998 (iii) 0.8858
3. 0.514
4. $\frac{53}{3125}$
5. (i) 0.2301 (ii) 0.3410 (iii) 0.2824
6. (i) 0.02642 (ii) $1.024 \times 10^{-7}$
7. 0.91854
8. 99.83
9. (i) $\frac{81}{256}$ (ii) $\frac{1}{256}$ (iii) $\frac{175}{256}$ (iv) $\frac{27}{128}$
10. $\frac{5}{2}\left(\frac{5}{6}\right)^9$
11. $100\,(0.432 + 0.568)^5$
12. $\left(\frac{1}{5} + \frac{4}{5}\right)^5$
13. $\frac{11}{32}$
14. (i) 250 (ii) 250 (iii) 25 (iv) 400
15. (i) 12 nearly (ii) 17 nearly
16. $80\,(0.7825 + 0.2175)^{10}$
17. $\left(\frac{9}{10}\right)^4$
18. 0.00001
19. (i) 0.246 (ii) 0.345
20. 323

POISSON DISTRIBUTION

21.58 POISSON DISTRIBUTION AS A LIMITING CASE OF BINOMIAL DISTRIBUTION

If the parameters n and p of a binomial distribution are known, we can find the distribution. But in situations where n is very large and p is very small, the application of the binomial distribution is very laborious. However, if we assume that n → ∞ and p → 0 such that np always remains finite, say λ, we get the Poisson approximation to the binomial distribution.

Now, for a binomial distribution, with $p = \frac{\lambda}{n}$ since np = λ,

$$\begin{aligned}
P(X = r) &= {}^{n}C_r \, q^{n-r} p^r \\
&= \frac{n(n-1)(n-2) \ldots (n-r+1)}{r!} \times (1-p)^{n-r} \times p^r \\
&= \frac{n(n-1)(n-2) \ldots (n-r+1)}{r!} \times \left(1 - \frac{\lambda}{n}\right)^{n-r} \times \left(\frac{\lambda}{n}\right)^r \\
&= \frac{\lambda^r}{r!} \times \frac{n(n-1)(n-2) \ldots (n-r+1)}{n^r} \times \frac{\left(1 - \frac{\lambda}{n}\right)^n}{\left(1 - \frac{\lambda}{n}\right)^r} \\
&= \frac{\lambda^r}{r!} \left(\frac{n}{n}\right) \left(\frac{n-1}{n}\right) \left(\frac{n-2}{n}\right) \ldots \left(\frac{n-r+1}{n}\right) \times \frac{\left(1 - \frac{\lambda}{n}\right)^n}{\left(1 - \frac{\lambda}{n}\right)^r} \\
&= \frac{\lambda^r}{r!} \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \ldots \left(1 - \frac{r-1}{n}\right) \times \frac{\left[\left(1 - \frac{\lambda}{n}\right)^{-n/\lambda}\right]^{-\lambda}}{\left(1 - \frac{\lambda}{n}\right)^r}
\end{aligned}$$

As n → ∞, each of the (r – 1) factors $\left(1 - \frac{1}{n}\right), \left(1 - \frac{2}{n}\right), \ldots, \left(1 - \frac{r-1}{n}\right)$ tends to 1. Also $\left(1 - \frac{\lambda}{n}\right)^r$ tends to 1.

Since $\lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^x = e$, the Napierian base,

$$\therefore\quad \left[\left(1 - \frac{\lambda}{n}\right)^{-n/\lambda}\right]^{-\lambda} \to e^{-\lambda} \quad \text{as } n \to \infty$$

Hence in the limiting case when n → ∞, we have

$$P(X = r) = \frac{\lambda^r e^{-\lambda}}{r!} \qquad (r = 0, 1, 2, 3, \ldots) \qquad \ldots (A)$$

where λ is a finite number = np. (A) represents the Poisson probability distribution.

Note 1. λ is called the parameter of the distribution.

Note 2. $e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \ldots + \frac{x^n}{n!} + \ldots$ to ∞.

Note 3. The sum of the probabilities P(r) for r = 0, 1, 2, 3, . . . is 1, since

$$P(0) + P(1) + P(2) + P(3) + \ldots = e^{-\lambda} + \frac{\lambda e^{-\lambda}}{1!} + \frac{\lambda^2 e^{-\lambda}}{2!} + \frac{\lambda^3 e^{-\lambda}}{3!} + \ldots$$

$$= e^{-\lambda} \left(1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \ldots\right) = e^{-\lambda} \cdot e^{\lambda} = 1.$$

21.59 RECURRENCE FORMULA FOR THE POISSON DISTRIBUTION

For the Poisson distribution,

$$P(r) = \frac{\lambda^r e^{-\lambda}}{r!} \quad \text{and} \quad P(r+1) = \frac{\lambda^{r+1} e^{-\lambda}}{(r+1)!}$$

$$\therefore\quad \frac{P(r+1)}{P(r)} = \frac{\lambda\, r!}{(r+1)!} = \frac{\lambda}{r+1} \quad \text{or} \quad P(r+1) = \frac{\lambda}{r+1}\, P(r), \quad r = 0, 1, 2, 3, \ldots$$

This is called the recurrence formula for the Poisson distribution.

21.60 MEAN AND VARIANCE OF THE POISSON DISTRIBUTION

For the Poisson distribution, $P(r) = \frac{\lambda^r e^{-\lambda}}{r!}$.

$$\begin{aligned}
\text{Mean } \mu &= \sum_{r=0}^{\infty} r\,P(r) = \sum_{r=0}^{\infty} r \cdot \frac{\lambda^r e^{-\lambda}}{r!} = e^{-\lambda} \sum_{r=1}^{\infty} \frac{\lambda^r}{(r-1)!} \\
&= e^{-\lambda} \left(\lambda + \frac{\lambda^2}{1!} + \frac{\lambda^3}{2!} + \ldots\right) = \lambda e^{-\lambda} \left(1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \ldots\right) = \lambda e^{-\lambda} \cdot e^{\lambda} = \lambda
\end{aligned}$$

Thus, the mean of the Poisson distribution is equal to the parameter λ.

$$\begin{aligned}
\text{Variance } \sigma^2 &= \sum_{r=0}^{\infty} r^2 P(r) - \mu^2 = \sum_{r=0}^{\infty} r^2 \cdot \frac{\lambda^r e^{-\lambda}}{r!} - \lambda^2 = e^{-\lambda} \sum_{r=1}^{\infty} \frac{r^2 \lambda^r}{r!} - \lambda^2 \\
&= e^{-\lambda} \left[\frac{1^2 \cdot \lambda}{1!} + \frac{2^2 \cdot \lambda^2}{2!} + \frac{3^2 \lambda^3}{3!} + \frac{4^2 \lambda^4}{4!} + \ldots\right] - \lambda^2 \\
&= \lambda e^{-\lambda} \left[1 + \frac{2\lambda}{1!} + \frac{3\lambda^2}{2!} + \frac{4\lambda^3}{3!} + \ldots\right] - \lambda^2 \\
&= \lambda e^{-\lambda} \left[1 + \frac{(1+1)\lambda}{1!} + \frac{(1+2)\lambda^2}{2!} + \frac{(1+3)\lambda^3}{3!} + \ldots\right] - \lambda^2 \\
&= \lambda e^{-\lambda} \left[\left(1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \ldots\right) + \left(\frac{\lambda}{1!} + \frac{2\lambda^2}{2!} + \frac{3\lambda^3}{3!} + \ldots\right)\right] - \lambda^2 \\
&= \lambda e^{-\lambda} \left[e^{\lambda} + \lambda \left(1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \ldots\right)\right] - \lambda^2 \\
&= \lambda e^{-\lambda} \left[e^{\lambda} + \lambda e^{\lambda}\right] - \lambda^2 = \lambda e^{-\lambda} \cdot e^{\lambda} (1 + \lambda) - \lambda^2 \\
&= \lambda (1 + \lambda) - \lambda^2 = \lambda.
\end{aligned}$$

Hence, the variance of the Poisson distribution is also λ. Thus, the mean and the variance of the Poisson distribution are each equal to the parameter λ.

Note. The mean and the variance of the Poisson distribution can also be derived from those of the binomial distribution in the limiting case when n → ∞, p → 0 and np = λ.

Mean of the binomial distribution is np.

∴ Mean of the Poisson distribution = $\lim_{n \to \infty} np = \lim_{n \to \infty} \lambda = \lambda$

Variance of the binomial distribution is npq = np(1 – p).

∴ Variance of the Poisson distribution = $\lim_{n \to \infty} np(1 - p) = \lim_{n \to \infty} \lambda \left(1 - \frac{\lambda}{n}\right) = \lambda$.

ILLUSTRATIVE EXAMPLES

Example 1. If the variance of the Poisson distribution is 2, find the probabilities for r = 1, 2, 3, 4 from the recurrence relation of the Poisson distribution.

Sol. λ, the parameter of the Poisson distribution = Variance = 2

Recurrence relation for the Poisson distribution is

$$P(r+1) = \frac{\lambda}{r+1}\, P(r) = \frac{2}{r+1}\, P(r) \qquad \ldots (1)$$

Now $P(r) = \frac{\lambda^r e^{-\lambda}}{r!} \ \Rightarrow\ P(0) = \frac{e^{-2}}{0!} = e^{-2} = 0.1353$

Setting r = 0, 1, 2, 3 in (1), we get

$P(1) = 2P(0) = 2 \times 0.1353 = 0.2706$; $\quad P(2) = \frac{2}{2} P(1) = 0.2706$

$P(3) = \frac{2}{3} P(2) = \frac{2}{3} \times 0.2706 = 0.1804$; $\quad P(4) = \frac{2}{4} P(3) = \frac{1}{2} \times 0.1804 = 0.0902$.

Example 2. Assume that the probability of an individual coal miner being injured in a certain way in a mine accident during a year is 1/2400. Use Poisson’s distribution to calculate the probability that in a mine employing 200 miners there will be at least one such accident in a year.

Sol. Here $p = \frac{1}{2400}$, n = 200; ∴ $\lambda = np = \frac{200}{2400} = \frac{1}{12} = 0.083$

$$\therefore\quad P(r) = \frac{\lambda^r e^{-\lambda}}{r!} = \frac{(0.083)^r e^{-0.083}}{r!}$$
$$P(\text{at least one such accident}) = 1 - P(\text{no such accident}) = 1 - P(0) = 1 - \frac{(0.083)^{0} e^{-0.083}}{0!} = 1 - 0.92 = 0.08.$$

Example 3. Data was collected over a period of 10 years, showing the number of injuries from horse kicks in each of the 200 army corps. The distribution of injuries was as follows:

No. of injuries :   0     1     2     3     4     Total
Frequency       :   109   65    22    3     1     200

Fit a Poisson distribution to the data and calculate the theoretical frequencies.
Sol. Mean of the given distribution $= \dfrac{\Sigma f x}{\Sigma f} = \dfrac{65 + 44 + 9 + 4}{200} = \dfrac{122}{200} = 0.61$.
This is the parameter ($m$) of the Poisson distribution.
$\therefore$ The required Poisson distribution is $N \cdot \dfrac{m^{r} e^{-m}}{r!}$, where $N = \Sigma f = 200$:
$$200\, e^{-0.61} \cdot \frac{(0.61)^{r}}{r!} = 200 \times 0.5435 \times \frac{(0.61)^{r}}{r!} = 108.7 \times \frac{(0.61)^{r}}{r!}.$$

r     N P(r)                             Theoretical frequency
0     108.7                              109
1     108.7 × 0.61 = 66.3                66
2     108.7 × (0.61)^2 / 2! = 20.2       20
3     108.7 × (0.61)^3 / 3! = 4.1        4
4     108.7 × (0.61)^4 / 4! = 0.6        1
                                         Total = 200

Example 4. A car rental firm has two cars, which it hires out day by day. The number of requests for a car on each day is distributed as a Poisson distribution with mean 1.5. Calculate the proportion of days on which neither car is used and the proportion of days on which some requests are refused. ($e^{-1.5} = 0.2231$)
Sol. The number of requests for a car follows a Poisson distribution with mean $m = 1.5$.
$\therefore$ Proportion of days on which neither car is used = probability of there being no requests for a car
$$= \frac{m^{0} e^{-m}}{0!} = e^{-1.5} = 0.2231.$$
Proportion of days on which some requests are refused = probability that the number of requests is more than two
$$= 1 - P(x \le 2) = 1 - \left(e^{-m} + \frac{m e^{-m}}{1!} + \frac{m^{2} e^{-m}}{2!}\right) = 1 - e^{-1.5}\left(1 + 1.5 + \frac{(1.5)^{2}}{2}\right)$$
$$= 1 - 0.2231\,(1 + 1.5 + 1.125) = 1 - 0.2231 \times 3.625 = 1 - 0.8087375 = 0.1912625.$$

Example 5. Six coins are tossed 6400 times. Using the Poisson distribution, determine the approximate probability of getting six heads x times.
Sol. Probability of getting one head with one coin $= \tfrac{1}{2}$.
$\therefore$ The probability of getting six heads with six coins $= \left(\tfrac{1}{2}\right)^{6} = \tfrac{1}{64}$.
$\therefore$ Average number of six heads with six coins in 6400 throws $= np = 6400 \times \tfrac{1}{64} = 100$.
$\therefore$ The mean of the Poisson distribution = 100.
Approximate probability of getting six heads x times when the distribution is Poisson
$$= \frac{m^{x} e^{-m}}{x!} = \frac{(100)^{x} e^{-100}}{x!}.$$

TEST YOUR KNOWLEDGE
1. Fit a Poisson distribution to the following:
   x :   0     1     2     3     4
   f :   192   100   24    3     1
2. If the probability of a bad reaction from a certain injection is 0.001, determine the chance that out of 2000 individuals more than two will get a bad reaction.
3. If X is a Poisson variate such that P(X = 2) = 9P(X = 4) + 90P(X = 6), find the standard deviation.
4. If a random variable has a Poisson distribution such that P(1) = P(2), find (i) the mean of the distribution (ii) P(4).
5. Suppose that X has a Poisson distribution. If $P(X = 2) = \tfrac{2}{3} P(X = 1)$, find (i) P(X = 0) (ii) P(X = 3).
6. A certain screw-making machine produces on average 2 defective screws out of 100, and packs them in boxes of 500. Find the probability that a box contains 15 defective screws.
7. The incidence of occupational disease in an industry is such that the workmen have a 10% chance of suffering from it. What is the probability that in a group of 7, five or more will suffer from it?
8. Fit a Poisson distribution to the following and calculate theoretical frequencies:
   x :   0     1     2     3     4
   f :   122   60    15    2     1
9.
Fit a Poisson distribution to the following data giving the number of yeast cells per square for 400 squares:
   No. of cells per sq. :   0     1     2     3     4     5     6     7     8     9     10
   No. of squares       :   103   143   98    42    8     4     2     0     0     0     0
10. Show that in a Poisson distribution with unit mean, the mean deviation about the mean is $\left(\dfrac{2}{e}\right)$ times the standard deviation.
11. In a certain factory turning out razor blades, there is a small chance of 0.002 for any blade to be defective. The blades are supplied in packets of 10. Use the Poisson distribution to calculate the approximate number of packets containing no defective, one defective, and two defective blades respectively in a shipment of 10,000 packets.
12. The probability that a man aged 35 years will die before reaching the age of 40 years may be taken as 0.018. Out of a group of 400 men, now aged 35 years, what is the probability that 2 men will die within the next 5 years?
13. Suppose a book of 585 pages contains 43 typographical errors. If these errors are randomly distributed throughout the book, what is the probability that 10 pages, selected at random, will be free from errors?

Answers
1. $320 \times \dfrac{e^{-0.503}(0.503)^{r}}{r!}$
2. 0.32
3. 1
4. (i) 2 (ii) $\dfrac{2}{3e^{2}}$
5. (i) $e^{-4/3}$ (ii) $\dfrac{32}{81}\,e^{-4/3}$
6. $\dfrac{(10)^{15} e^{-10}}{15!} = 0.035$
7. 0.0008
8. $121.36 \times \dfrac{(0.5)^{r}}{r!}$, where r = 0, 1, 2, 3, 4. Theoretical frequencies are 121, 61, 15, 3, 0 respectively.
9. Theoretical frequencies are 109, 142, 92, 40, 13, 3, 1, 0, 0, 0, 0.
11. 9802, 196, 2
12. 0.01936
13. 0.4795

NORMAL DISTRIBUTION

21.61 NORMAL DISTRIBUTION
The normal distribution is a continuous distribution.
It can be derived from the binomial distribution in the limiting case when n, the number of trials, is very large and p, the probability of a success, is close to $\tfrac{1}{2}$. The general equation of the normal distribution is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
where the variable x can assume all values from $-\infty$ to $+\infty$. $\mu$ and $\sigma$, called the parameters of the distribution, are respectively the mean and the standard deviation of the distribution, with $-\infty < \mu < \infty$ and $\sigma > 0$. x is called the normal variate and $f(x)$ is called the probability density function of the normal distribution.
If a variable x has the normal distribution with mean $\mu$ and standard deviation $\sigma$, we briefly write $x \sim N(\mu, \sigma^{2})$.
The graph of the normal distribution is called the normal curve. It is bell-shaped and symmetrical about the mean $\mu$. The two tails of the curve extend to $+\infty$ and $-\infty$ toward the positive and negative directions of the x-axis respectively and gradually approach the x-axis without ever meeting it. The curve is unimodal and the mode of the normal distribution coincides with its mean $\mu$. The line $x = \mu$ divides the area under the normal curve above the x-axis into two equal parts. Thus, the median of the distribution also coincides with its mean and mode. The area under the normal curve between any two given ordinates $x = x_1$ and $x = x_2$ represents the probability of values falling into the given interval. The total area under the normal curve above the x-axis is 1.
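The density formula and the area interpretation above can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names `normal_pdf` and `area_between` and the values of `mu` and `sigma` are illustrative assumptions, and the area is obtained from the standard library's `statistics.NormalDist` rather than from printed tables.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Density f(x) = exp(-((x - mu)/sigma)**2 / 2) / (sigma * sqrt(2*pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def area_between(x1, x2, mu, sigma):
    """Area under the normal curve between the ordinates x = x1 and x = x2."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(x2) - dist.cdf(x1)


# Hypothetical parameters, chosen only for illustration.
mu, sigma = 50.0, 10.0

# Height of the curve at the mean (its mode).
print(normal_pdf(mu, mu, sigma))

# Probability of a value within one standard deviation of the mean (about 0.6827).
print(area_between(mu - sigma, mu + sigma, mu, sigma))

# The line x = mu splits the total area into two halves of 0.5 each.
print(area_between(-math.inf, mu, mu, sigma))
```

The last line illustrates the statement that the median coincides with the mean: half of the total area lies on each side of $x = \mu$.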
21.62 BASIC PROPERTIES OF THE NORMAL DISTRIBUTION
The probability density function of the normal distribution is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
(i) $f(x) \ge 0$.
(ii) $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = 1$, i.e., the total area under the normal curve above the x-axis is 1.
(iii) The normal distribution is symmetrical about its mean.
(iv) It is a unimodal distribution. The mean, mode, and median of this distribution coincide.

21.63 STANDARD FORM OF THE NORMAL DISTRIBUTION
If X is a normal random variable with mean $\mu$ and standard deviation $\sigma$, then the random variable $Z = \dfrac{X - \mu}{\sigma}$ has the normal distribution with mean 0 and standard deviation 1. The random variable Z is called the standardized (or standard) normal random variable.
The probability density function for the normal distribution in standard form is given by
$$f(z) = \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2}z^{2}}$$
It is free from any parameter. This helps us to compute areas under the normal probability curve by making use of standard tables.

Note 1. If $f(z)$ is the probability density function for the normal distribution, then
$$P(z_1 \le Z \le z_2) = \int_{z_1}^{z_2} f(z)\,dz = F(z_2) - F(z_1), \quad \text{where } F(z) = \int_{-\infty}^{z} f(t)\,dt = P(Z \le z).$$
The function F(z) defined above is called the distribution function for the normal distribution.
Note 2. The probabilities $P(z_1 \le Z \le z_2)$, $P(z_1 < Z \le z_2)$, $P(z_1 \le Z < z_2)$ and $P(z_1 < Z < z_2)$ are all regarded to be the same.
Note 3. $F(-z_1) = 1 - F(z_1)$.

ILLUSTRATIVE EXAMPLES

Example 1. A sample of 100 dry battery cells tested to find the length of life produced the following results: $\bar{x} = 12$ hours, $\sigma = 3$ hours.
Assuming the data to be normally distributed, what percentage of battery cells are expected to have life
(i) more than 15 hours
(ii) less than 6 hours
(iii) between 10 and 14 hours?
Sol. Here x denotes the length of life of dry battery cells.
Also $z = \dfrac{x - \bar{x}}{\sigma} = \dfrac{x - 12}{3}$.
(i) When x = 15, z = 1.
$$\therefore\ P(x > 15) = P(z > 1) = P(0 < z < \infty) - P(0 < z < 1) = 0.5 - 0.3413 = 0.1587 = 15.87\%.$$
(ii) When x = 6, z = -2.
$$\therefore\ P(x < 6) = P(z < -2) = P(z > 2) = P(0 < z < \infty) - P(0 < z < 2) = 0.5 - 0.4772 = 0.0228 = 2.28\%.$$
(iii) When x = 10, $z = -\tfrac{2}{3} = -0.67$. When x = 14, $z = \tfrac{2}{3} = 0.67$.
$$P(10 < x < 14) = P(-0.67 < z < 0.67) = 2P(0 < z < 0.67) = 2 \times 0.2487 = 0.4974 = 49.74\%.$$

Example 2. In a normal distribution, 31% of the items are under 45 and 8% are over 64. Find the mean and standard deviation of the distribution.
Sol. Let $\bar{x}$ and $\sigma$ be the mean and S.D. respectively.
31% of the items are under 45 $\Rightarrow$ the area to the left of the ordinate x = 45 is 0.31.
When x = 45, let $z = z_1$. Then $P(z_1 < z < 0) = 0.5 - 0.31 = 0.19$.
From the tables, the value of z corresponding to this area is 0.5.
$\therefore\ z_1 = -0.5$ [since $z_1 < 0$].
When x = 64, let $z = z_2$. Then $P(0 < z < z_2) = 0.5 - 0.08 = 0.42$.
From the tables, the value of z corresponding to this area is 1.4.
$\therefore\ z_2 = 1.4$.
Since $z = \dfrac{x - \bar{x}}{\sigma}$,
$$-0.5 = \frac{45 - \bar{x}}{\sigma} \quad \text{and} \quad 1.4 = \frac{64 - \bar{x}}{\sigma}$$
$$\Rightarrow\ 45 - \bar{x} = -0.5\sigma \quad \ldots (1) \qquad \text{and} \qquad 64 - \bar{x} = 1.4\sigma \quad \ldots (2)$$
Subtracting, $-19 = -1.9\sigma$ $\therefore\ \sigma = 10$.
From (1), $45 - \bar{x} = -0.5 \times 10 = -5$ $\therefore\ \bar{x} = 50$.

TEST YOUR KNOWLEDGE
1. The mean height of 500 students in a certain college is 151 cm and the standard deviation is 15 cm.
Assuming the heights are normally distributed, how many students have heights between 120 and 155 cm?
2. An aptitude test for selecting officers in a bank is conducted on 1000 candidates. The average score is 42 and the standard deviation of the scores is 24. Assuming a normal distribution for the scores, find
   (i) the number of candidates whose scores exceed 60
   (ii) the number of candidates whose scores lie between 30 and 60.
3. In a normal distribution, 7% of the items are under 35 and 89% are under 63. What are the mean and standard deviation of the distribution?
4. Let X denote the score on a test. If X is normally distributed with mean 100 and standard deviation 15, find the probability that X does not exceed 130.
5. It is known from past experience that the number of telephone calls made daily in a certain community between 3 P.M. and 4 P.M. has a mean of 352 and a standard deviation of 31. What percentage of the time will there be more than 400 telephone calls made in this community between 3 P.M. and 4 P.M.?
6. Students of a class were given a mechanical aptitude test. Their grades were found to be normally distributed with mean 60 and standard deviation 5. What percent of students scored
   (i) more than 60 grades?
   (ii) less than 56 grades?
   (iii) between 45 and 65 grades?
7. In an examination taken by 500 candidates, the average and the standard deviation of grades obtained (normally distributed) are 40% and 10%. Find approximately:
   (i) How many will pass, if 50% is fixed as a minimum?
   (ii) What should be the minimum if 350 candidates are to pass?
   (iii) How many have scored above 60%?

Answers
1. 300
2. (i) 252 (ii) 533
3. $\bar{x} = 50.3$, $\sigma = 10.33$
4. 0.9772
5. 6.06%
6. (i) 50% (ii) 21.2% (iii) 84%
7.
(i) 79 (ii) 35% (iii) 11

SAMPLING AND TESTS OF SIGNIFICANCE

21.64 POPULATION OR UNIVERSE
An aggregate of objects (animate or inanimate) under study is called a population or universe. It is thus a collection of individuals or of their attributes (qualities) or of results of operations that can be numerically specified.
A universe containing a finite number of individuals or members is called a finite universe: for example, the universe of the weights of students in a particular class. A universe with an infinite number of members is known as an infinite universe: for example, the universe of pressures at various points in the atmosphere.
In some cases, we may even be ignorant whether or not a particular universe is infinite, e.g., the universe of stars.
The universe of concrete objects is an existent universe. The collection of all possible ways in which a specified event can happen is called a hypothetical universe. The universe of heads and tails obtained by tossing a coin an infinite number of times (provided that it does not wear out) is a hypothetical one.

21.65 SAMPLING
The statistician is often confronted with the problem of discussing a universe of which he cannot examine every member, i.e., of which complete enumeration is impracticable. For example, if we want to have an idea of the average per capita income of the United States, enumeration of every earning individual in the country is a very difficult task. Naturally, the question arises: What can be said about a universe of which we can examine only a limited number of members? This question is the origin of the Theory of Sampling.
A finite subset of a universe is called a sample. A sample is thus a small portion of the universe.
The number of individuals in a sample is called the sample size. The process of selecting a sample from a universe is called sampling.
The theory of sampling is a study of the relationship existing between a population and samples drawn from the population. The fundamental object of sampling is to get as much information as possible about the whole universe by examining only a part of it. An attempt is thus made through sampling to give the maximum information about the parent universe with the minimum effort.
Sampling is quite often used in our day-to-day practical life. For example, in a store we assess the quality of lettuce, apples, or any other commodity by taking only a handful of it from the bag and then decide whether to purchase it or not. A chef normally tastes a spoonful of a cooked dish to find out whether it has been properly cooked and contains the proper quantity of salt or sugar.

21.66 PARAMETERS OF STATISTICS
The statistical constants of the population, such as the mean, the variance, etc., are known as the parameters. The statistical constants computed from the members of the sample to estimate the parameters of the population from which the sample has been drawn are known as statistics.
Population mean and variance are denoted by $\mu$ and $\sigma^{2}$, while those of the sample are given by $\bar{x}$ and $s^{2}$.

21.67 STANDARD ERROR (S.E.)
The standard deviation of the sampling distribution of a statistic is known as the standard error (S.E.). It plays an important role in the theory of large samples and it forms a basis of the testing of hypotheses. If t is any statistic, then for a large sample
$$z = \frac{t - E(t)}{\text{S.E.}(t)}$$
is normally distributed with mean 0 and variance 1.
For a large sample, the standard errors of some of the well-known statistics are listed below, where

$n$               sample size
$\sigma^{2}$      population variance
$s^{2}$           sample variance
$P$               population proportion
$Q$               $= 1 - P$
$n_1$, $n_2$      sizes of two independent random samples

No.   Statistic                                                          Standard error
1.    $\bar{x}$                                                          $\sigma/\sqrt{n}$
2.    $s$                                                                $\sqrt{\sigma^{2}/2n}$
3.    Difference of two sample means $\bar{x}_1 - \bar{x}_2$             $\sqrt{\dfrac{\sigma_1^{2}}{n_1} + \dfrac{\sigma_2^{2}}{n_2}}$
4.    Difference of two sample standard deviations $s_1 - s_2$           $\sqrt{\dfrac{\sigma_1^{2}}{2n_1} + \dfrac{\sigma_2^{2}}{2n_2}}$
5.    Difference of two sample proportions $p_1 - p_2$                   $\sqrt{\dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}}$
6.    Observed sample proportion $p$                                     $\sqrt{PQ/n}$

21.68 TEST OF SIGNIFICANCE
An important aspect of the sampling theory is the study of tests of significance, which enable us to decide, on the basis of the results of the sample, whether
(i) the deviation between the observed sample statistic and the hypothetical parameter value or
(ii) the deviation between two sample statistics
is significant or might be attributed to chance or the fluctuations of sampling.
To apply the tests of significance, we first set up a hypothesis, a definite statement about the population parameter, called the Null hypothesis and denoted by $H_0$.
Any hypothesis that is complementary to the null hypothesis ($H_0$) is called an Alternative hypothesis, denoted by $H_1$.
For example, if we want to test the null hypothesis that the population has a specified mean $\mu_0$, then we have
$$H_0 : \mu = \mu_0$$
The alternative hypothesis will be one of
(i) $H_1 : \mu \ne \mu_0$ ($\mu > \mu_0$ or $\mu < \mu_0$) (two-tailed alternative hypothesis).
(ii) $H_1 : \mu > \mu_0$ (right-tailed, i.e., single-tailed, alternative hypothesis).
(iii) $H_1 : \mu < \mu_0$ (left-tailed, i.e., single-tailed, alternative hypothesis).
Hence the alternative hypothesis tells us whether the test is a two-tailed test or a one-tailed test.
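As a numerical companion to the large-sample standard errors of Section 21.67 and the standardized statistic $z = (t - E(t))/\text{S.E.}(t)$, here is a minimal Python sketch. The function names are illustrative assumptions rather than notation from the text, and the figures used at the end (n = 400, P = 0.5, observed proportion 0.54) are hypothetical values chosen only to exercise the formulas.

```python
import math


def se_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def se_std(sigma, n):
    """Standard error of the sample standard deviation: sqrt(sigma^2 / 2n)."""
    return math.sqrt(sigma ** 2 / (2 * n))


def se_diff_means(sigma1, n1, sigma2, n2):
    """Standard error of the difference of two sample means."""
    return math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)


def se_diff_props(P1, n1, P2, n2):
    """Standard error of the difference of two sample proportions."""
    return math.sqrt(P1 * (1 - P1) / n1 + P2 * (1 - P2) / n2)


def se_proportion(P, n):
    """Standard error of an observed sample proportion: sqrt(PQ / n)."""
    return math.sqrt(P * (1 - P) / n)


def z_statistic(t, expected, se):
    """Standardized statistic z = (t - E(t)) / S.E.(t)."""
    return (t - expected) / se


# Hypothetical figures: 400 trials, hypothesised proportion 0.5, observed 0.54.
se = se_proportion(0.5, 400)
z = z_statistic(0.54, 0.5, se)
print(se, z)
```

Keeping each standard error in its own small function mirrors the layout of the table, so a different statistic can be tested by swapping only the function that supplies S.E.(t).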
21.69 CRITICAL REGION
A region corresponding to a statistic t in the sample space S that amounts to rejection of the null hypothesis $H_0$ is called the critical region or the region of rejection. The region of the sample space S that amounts to the acceptance of $H_0$ is called the acceptance region.

21.70 LEVEL OF SIGNIFICANCE
The probability $\alpha$ that a random value of the statistic t belongs to the critical region $\omega$ is known as the level of significance:
$$P(t \in \omega \mid H_0) = \alpha,$$
i.e., the level of significance is the size of the type I error or the maximum producer's risk.

21.71 ERRORS IN SAMPLING
The main goal of the sampling theory is to draw a valid conclusion about the population parameters on the basis of the sample results. In doing this we may commit the following two types of errors:
Type I Error. When $H_0$ is true, we may reject it.
$$P(\text{Reject } H_0 \text{ when it is true}) = P(\text{Reject } H_0 \mid H_0) = \alpha$$
$\alpha$ is called the size of the type I error, also referred to as producer's risk.
Type II Error. When $H_0$ is wrong, we may accept it.
$$P(\text{Accept } H_0 \text{ when it is wrong}) = P(\text{Accept } H_0 \mid H_1) = \beta$$
$\beta$ is called the size of the type II error, also referred to as consumer's risk.

Critical values or significant values
The value of the test statistic that separates the critical region from the acceptance region is called the critical value or the significant value. This value depends on (i) the level of significance used and (ii) the alternative hypothesis, whether it is one-tailed or two-tailed.
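The dependence of the critical value on the level of significance and on the form of the alternative hypothesis can be sketched in Python. This is an illustration under stated assumptions, not the book's method: it assumes the test statistic is standard normal (the large-sample case), it uses the standard library's `statistics.NormalDist.inv_cdf` in place of printed tables, and the names `critical_value` and `reject_h0` are hypothetical.

```python
from statistics import NormalDist


def critical_value(alpha, tail="two"):
    """Critical value of a standard normal test statistic.

    tail = "two":   value c with P(|z| > c) = alpha
    tail = "right": value c with P(z > c)   = alpha
    tail = "left":  value c with P(z < c)   = alpha (c is negative)
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")


def reject_h0(z, alpha, tail="two"):
    """Return True when z falls in the critical region for the given test."""
    c = critical_value(alpha, tail)
    if tail == "two":
        return abs(z) > c
    if tail == "right":
        return z > c
    return z < c


# Critical values at the 1%, 5% and 10% levels for each kind of test.
for alpha in (0.01, 0.05, 0.10):
    print(alpha,
          round(critical_value(alpha, "two"), 3),
          round(critical_value(alpha, "right"), 3),
          round(critical_value(alpha, "left"), 3))
```

Note that a one-tailed critical value at level $\alpha$ equals the two-tailed critical value at level $2\alpha$, because a two-tailed test splits its critical region equally between the two tails.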
For large samples, corresponding to the statistic t, the variable $z = \dfrac{t - E(t)}{\text{S.E.}(t)}$ is normally distributed with mean 0 and variance 1. The value of z (as given previously) under the null hypothesis is known as the test statistic.
The critical value $z_\alpha$ of the test statistic at level of significance $\alpha$ for a two-tailed test is given by
$$p(|z| > z_\alpha) = \alpha \qquad \ldots (1)$$
i.e., $z_\alpha$ is the value of z such that the total area of the critical region on both tails is $\alpha$. Since the normal curve is symmetrical, from equation (1) we get
$$p(z > z_\alpha) + p(z < -z_\alpha) = \alpha; \quad \text{i.e., } 2p(z > z_\alpha) = \alpha; \quad p(z > z_\alpha) = \alpha/2,$$
i.e., the area of each tail is $\alpha/2$.
The critical value $z_\alpha$ is that value such that the area to the right of $z_\alpha$ is $\alpha/2$ and the area to the left of $-z_\alpha$ is $\alpha/2$.
In the case of the one-tailed test,
$p(z > z_\alpha) = \alpha$ if it is right-tailed; $p(z < -z_\alpha) = \alpha$ if it is left-tailed.
The critical value of z for a single-tailed test (right or left) at the level of significance $\alpha$ is the same as the critical value of z for a two-tailed test at the level of significance $2\alpha$.
Using these equations and the normal tables, the critical values of z at different levels of significance ($\alpha$) for both single-tailed and two-tailed tests are calculated and listed below. The equations are
$$p(|z| > z_\alpha) = \alpha; \qquad p(z > z_\alpha) = \alpha; \qquad p(z < -z_\alpha) = \alpha$$

Level of significance    1% (0.01)             5% (0.05)              10% (0.1)
Two-tailed test          $|z_\alpha| = 2.58$   $|z_\alpha| = 1.96$    $|z_\alpha| = 1.645$
Right-tailed             $z_\alpha = 2.33$     $z_\alpha = 1.645$     $z_\alpha = 1.28$
Left-tailed              $z_\alpha = -2.33$    $z_\alpha = -1.645$    $z_\alpha = -1.28$

Note. The following steps may be adopted to test statistical hypotheses:
Step 1: Null hypothesis. Set up $H_0$ in clear terms.
Step 2: Alternative hypothesis. Set up $H_1$ so that we can decide whether to use the one-tailed test or the two-tailed test.
Step 3: Level of significance. Select the appropriate level of significance in advance depending on the reliability of the estimates.
Step 4: Test statistic. Compute the test statistic $z = \dfrac{t - E(t)}{\text{S.E.}(t)}$ under the null hypothesis.
Step 5: Conclusion. Compare the computed value of z with the critical value $z_\alpha$ at the level of significance ($\alpha$).
If $|z| > z_\alpha$, we reject $H_0$ and conclude that there is a significant difference.
If $|z| < z_\alpha$, we accept $H_0$ and conclude that there is no significant difference.

TEST OF SIGNIFICANCE FOR LARGE SAMPLES
If the sample size n > 30, the sample is taken as a large sample. For such a sample we apply the normal test, since the binomial, Poisson, chi-square, etc. distributions are closely approximated by normal distributions, assuming the population to be normal.
Under a large sample test, the following are the important tests of significance.
1. Testing of significance for a single proportion.
2. Testing of significance for a difference of proportions.
3. Testing of significance for a single mean.
4. Testing of significance for a difference of means.

21.72 TESTING OF SIGNIFICANCE FOR A SINGLE PROPORTION
This test is used to find the significant difference between the proportion of the sample and the population.
Let X be the number of successes in n independent trials with constant probability P of success for each trial.
$$E(X) = nP; \qquad V(X) = nPQ; \qquad Q = 1 - P = \text{probability of failure.}$$
Let $p = X/n$ be the observed proportion of success. Then
$$E(p) = E(X/n) = \frac{1}{n}E(X) = \frac{nP}{n} = P;$$
$$V(p) = V(X/n) = \frac{1}{n^{2}}V(X) = \frac{nPQ}{n^{2}} = \frac{PQ}{n};$$
$$\text{S.E.}(p) = \sqrt{\frac{PQ}{n}}; \qquad z = \frac{p - E(p)}{\text{S.E.}(p)} = \frac{p - P}{\sqrt{PQ/n}} \sim N(0, 1).$$
This z is called the test statistic that is used to test the significant difference of sample and population proportion.
Note 1.
The probable limits for the observed proportion of successes are $P \pm z_\alpha \sqrt{PQ/n}$, where $z_\alpha$ is the significant value at level of significance $\alpha$.
Note 2. If P is not known, the limits for the proportion in the population are $p \pm z_\alpha \sqrt{pq/n}$, where $q = 1 - p$.
Note 3. If $\alpha$ is not given, we can safely take $3\sigma$ limits.
Hence, the confidence limits for the observed proportion p are $P \pm 3\sqrt{\dfrac{PQ}{n}}$.
The confidence limits for the population proportion P are $p \pm 3\sqrt{\dfrac{pq}{n}}$.

ILLUSTRATIVE EXAMPLES

Example 1. A coin was tossed 400 times and returned heads 216 times. Test the hypothesis that the coin is unbiased.
Sol. $H_0$: The coin is unbiased, i.e., P = 0.5.
$H_1$: The coin is not unbiased (biased), i.e., P ≠ 0.5, so we use the two-tailed test.
Here n = 400; X = number of successes = 216.
p = proportion of successes in the sample $= \dfrac{X}{n} = \dfrac{216}{400} = 0.54$.
Population proportion P = 0.5; Q = 1 - P = 1 - 0.5 = 0.5.
Under $H_0$, the test statistic is
$$z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.54 - 0.5}{\sqrt{\dfrac{0.5 \times 0.5}{400}}} = 1.6.$$
Conclusion. Since $|z| = 1.6 < 1.96$, i.e., $|z| < z_\alpha$, where $z_\alpha$ is the significant value of z at the 5% level of significance, $H_0$ is accepted, i.e., the coin is unbiased (P = 0.5).

Example 2. A certain cubical die was thrown 9000 times and a 5 or a 6 was obtained 3240 times. On the assumption of unbiased throwing, do the data indicate an unbiased die?
Sol. Here n = 9000.
P = probability of success (i.e., getting a 5 or a 6 in the throw of the die) = 2/6 = 1/3, Q = 1 - 1/3 = 2/3.
$$p = \frac{X}{n} = \frac{3240}{9000} = 0.36$$
$H_0$: The die is unbiased, i.e., P = 1/3.
$H_1$: P ≠ 1/3 (two-tailed test).
The test statistic is
$$z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.36 - 0.3333}{\sqrt{\dfrac{1}{3} \times \dfrac{2}{3} \times \dfrac{1}{9000}}} = \frac{0.02667}{0.004969} \approx 5.37.$$
Conclusion. Since $|z| \approx 5.37 > 1.96$, where 1.96 is the tabulated value $z_\alpha$ of z at the 5% level of significance, $H_0$ is rejected, and we conclude that the die is biased.

Example 3. A manufacturer claims that only 4% of his products supplied are defective.
A random sample of 600 products contained 36 defectives. Test the claim of the manufacturer.
Sol. p = observed proportion of success, i.e.,
p = proportion of defectives in the sample $= \dfrac{36}{600} = 0.06$.
P = proportion of defectives in the population = 0.04; Q = 1 - P = 0.96.
$H_0$: P = 0.04 is true, i.e., the claim of the manufacturer is accepted.
$H_1$: (i) P ≠ 0.04 (two-tailed test).
(ii) If we want to reject only if P > 0.04, then $H_1$: P > 0.04 (right-tailed test).
Under $H_0$,
$$z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.06 - 0.04}{\sqrt{\dfrac{0.04 \times 0.96}{600}}} = 2.5.$$
Conclusion. Since $|z| = 2.5 > 1.96$, we reject the hypothesis $H_0$ at the 5% level of significance for the two-tailed test.
If $H_1$ is taken as P > 0.04, we apply the right-tailed test: $z = 2.5 > 1.645$ ($z_\alpha$), so we reject the null hypothesis here also.
In both cases, the manufacturer's claim is not acceptable.

Example 4. A machine is producing bolts of which a certain fraction is defective. A random sample of 400 is taken from a large batch and is found to contain 30 defective bolts. Does this indicate that the proportion of defectives is larger than that claimed by the manufacturer, who claims that only 5% of his products are defective? Find the 95% confidence limits of the proportion of defective bolts in the batch.
Sol. Null hypothesis. $H_0$: The manufacturer's claim is accepted, i.e., $P = \dfrac{5}{100} = 0.05$; Q = 1 - P = 1 - 0.05 = 0.95.
Alternative hypothesis. $H_1$: P > 0.05 (right-tailed test).
p = observed proportion in the sample $= \dfrac{30}{400} = 0.075$.
Under $H_0$, the test statistic is
$$z = \frac{p - P}{\sqrt{PQ/n}} \quad \therefore\ z = \frac{0.075 - 0.05}{\sqrt{\dfrac{0.05 \times 0.95}{400}}} = 2.2941.$$
Conclusion. The tabulated value of z at the 5% level of significance for the right-tailed test is $z_\alpha = 1.645$. Since $z = 2.2941 > 1.645$, $H_0$ is rejected at the 5% level of significance, i.e., the proportion of defective bolts is larger than the manufacturer claims.
The 95% confidence limits of the proportion are given by $P \pm z_\alpha \sqrt{PQ/n}$:
$$0.05 \pm 1.96\sqrt{\frac{0.05 \times 0.95}{400}} = 0.05 \pm 0.02135 = 0.07136,\ 0.02865.$$
Hence the 95% confidence limits for the proportion of defective bolts are (0.02865, 0.07136).

Example 5. A bag contains defective articles, the exact number of which is not known. A sample of 100 from the bag gives 10 defective articles. Find the limits for the proportion of defective articles in the bag.
Sol. Here p = proportion of defective articles in the sample $= \dfrac{10}{100} = 0.1$; q = 1 - p = 1 - 0.1 = 0.9.
Since the confidence level is not given, we assume it is 95%. $\therefore$ the level of significance is 5% and $z_\alpha = 1.96$.
Also the population proportion P is not given. To get the confidence limits, we use the sample proportion p; the limits are given by
$$p \pm z_\alpha\sqrt{pq/n} = 0.1 \pm 1.96\sqrt{\frac{0.1 \times 0.9}{100}} = 0.1 \pm 0.0588 = 0.1588,\ 0.0412.$$
Hence, the 95% confidence limits for the proportion of defective articles in the bag are (0.0412, 0.1588).

TEST YOUR KNOWLEDGE
1. A sample of 600 people selected at random from a large city shows that the percentage of males in the sample is 53. It is believed that the ratio of males to the total population in the city is 0.5. Test whether the belief is confirmed by the observation.
2. In a city, a sample of 1000 people was taken, and out of them 540 are vegetarian and the rest are non-vegetarian. Can we say that both habits of eating (vegetarian or non-vegetarian) are equally popular in the city at (i) 1% level of significance (ii) 5% level of significance?
3. 325 men out of 600 men chosen from a big city were found to be smokers. Does this information support the conclusion that the majority of men in the city are smokers?
4. A random sample of 500 bolts was taken from a large shipment and 65 were found to be defective. Find the percentage of defective bolts in the shipment.
5.
In a hospital, 475 female and 525 male babies were born in a week. Do these figures confirm the hypothesis that males and females are born in equal numbers?
6. 400 apples are taken at random from a large basket and 40 are found to be bad. Estimate the proportion of bad apples in the basket and assign limits within which the percentage most probably lies.

Answers
1. $H_0$ accepted at 5% level
2. $H_0$ rejected at 5% level, accepted at 1% level
3. $H_0$ rejected at 5% level
4. Between 8.49% and 17.51%
5. $H_0$ accepted at 5% level
6. Between 8.5% and 11.5%

21.73 TEST OF DIFFERENCE BETWEEN PROPORTIONS
Consider two samples $X_1$ and $X_2$ of sizes $n_1$ and $n_2$ respectively taken from two different populations. We test the significance of the difference between the sample proportions $p_1$ and $p_2$. Under the null hypothesis $H_0$ that there is no significant difference between the two sample proportions, the test statistic is
$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad \text{where } P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} \text{ and } Q = 1 - P.$$

ILLUSTRATIVE EXAMPLES

Example 1. Before an increase in the excise duty on tea, 800 people out of a sample of 1000 people were found to be tea drinkers. After an increase in the duty, 800 people were known to be tea drinkers in a sample of 1200 people. Do you think that there has been a significant decrease in the consumption of tea after the increase in the excise duty?
Sol. Here $n_1 = 1000$, $n_2 = 1200$.
$$p_1 = \frac{X_1}{n_1} = \frac{800}{1000} = \frac{4}{5}; \qquad p_2 = \frac{X_2}{n_2} = \frac{800}{1200} = \frac{2}{3}$$
$$P = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{800 + 800}{1000 + 1200} = \frac{8}{11}; \qquad Q = \frac{3}{11}$$
Null hypothesis $H_0$: $p_1 = p_2$, i.e., there is no significant difference in the consumption of tea before and after the increase of excise duty.
$H_1$: $p_1 > p_2$ (right-tailed test).
The test statistic is
$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.8 - 0.6667}{\sqrt{\dfrac{8}{11} \times \dfrac{3}{11}\left(\dfrac{1}{1000} + \dfrac{1}{1200}\right)}} \approx 6.99.$$
Conclusion.
Since the calculated value of z exceeds both 1.645 and 2.33, the significant values of z at the 5% and 1% levels of significance respectively, $H_0$ is rejected, i.e., there is a significant decrease in the consumption of tea due to the increase in excise duty.

Example 2. A machine produced 16 defective articles in a batch of 500. After overhauling the machine it produced 3 defectives in a batch of 100. Has the machine improved?
Sol. $p_1 = \dfrac{16}{500} = 0.032$; $n_1 = 500$; $p_2 = \dfrac{3}{100} = 0.03$; $n_2 = 100$.
Null hypothesis $H_0$: The machine has not improved due to overhauling, i.e., $p_1 = p_2$.
$H_1$: $p_1 > p_2$ (right-tailed).
$$\therefore\ P = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2} = \frac{19}{600} \cong 0.032$$
Under $H_0$, the test statistic is
$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.032 - 0.03}{\sqrt{(0.032)(0.968)\left(\dfrac{1}{500} + \dfrac{1}{100}\right)}} = 0.104.$$
Conclusion. Since the calculated value of z is less than 1.645, the significant value of z at the 5% level of significance, $H_0$ is accepted, i.e., the machine has not improved due to overhauling.

Example 3. In two large populations there are 30% and 25% respectively of fair-haired people. Is this difference likely to be hidden in samples of 1200 and 900 respectively from the two populations?
Sol. $P_1$ = proportion of fair-haired people in the first population = 30% = 0.3; $P_2$ = 25% = 0.25; $Q_1 = 0.7$, $Q_2 = 0.75$.
$H_0$: The sample proportions are equal, i.e., the difference in population proportions is likely to be hidden in sampling.
$H_1$: $p_1 \ne p_2$.
$$z = \frac{P_1 - P_2}{\sqrt{\dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}}} = \frac{0.3 - 0.25}{\sqrt{\dfrac{0.3 \times 0.7}{1200} + \dfrac{0.25 \times 0.75}{900}}} \approx 2.55.$$
Conclusion. Since $z > 1.96$, the significant value of z at the 5% level of significance, $H_0$ is rejected at the 5% level. However, $z < 2.58$, the significant value of z at the 1% level of significance, so $H_0$ is accepted at the 1% level. At the 5% level these samples will reveal the difference in the population proportions.

Example 4. 500 articles from a factory are examined and found to be 2% defective.
800 similar articles from a second factory are found to be only 1.5% defective. Can it be reasonably concluded that the products of the first factory are inferior to those of the second?
Sol. $n_1 = 500$, $n_2 = 800$
$p_1$ = proportion of defective products from the first factory = 2% = 0.02
$p_2$ = proportion of defective products from the second factory = 1.5% = 0.015
H0 : There is no significant difference between the two products, i.e., the products do not differ in quality.
H1 : $p_1 > p_2$ (one-tailed test), i.e., the first factory has the larger proportion of defectives.
$P = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{0.02(500) + (0.015)(800)}{500 + 800} = 0.01692$; $Q = 1 - P = 0.9831$
Under H0,
$z = \dfrac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{0.02 - 0.015}{\sqrt{0.01692 \times 0.9831\left(\dfrac{1}{500} + \dfrac{1}{800}\right)}} = 0.68$
Conclusion. As z < 1.645, the significant value of z at 5% level of significance, H0 is accepted, i.e., the products do not differ in quality.

TEST YOUR KNOWLEDGE
1. A random sample of 400 men and 600 women was asked whether they would like to have a school near their residence. 200 men and 325 women were in favor of the proposal. Test the hypothesis that the proportion of men and women in favor of the proposal is the same at 5% level of significance.
2. In a town A, there were 956 births of which 52.5% were males, while in towns A and B combined, this proportion in a total of 1406 births was 0.496. Is there any significant difference in the proportion of male births in the two towns?
3. In a referendum submitted to the student body at a university, 850 men and 560 women voted. 500 men and 320 women voted yes. Does this indicate a significant difference of opinion between men and women on this matter at 1% level?
4. A manufacturing firm claims that its brand A product outsells its brand B product by 8%.
If it is found that 42 out of a sample of 200 people prefer brand A and 18 out of another sample of 100 people prefer brand B, test whether the 8% difference is a valid claim.

Answers
1. H0 accepted
2. H0 rejected
3. H0 accepted
4. H0 accepted

21.74 TEST OF SIGNIFICANCE FOR THE SINGLE MEAN
This test decides whether the difference between the sample mean and the population mean is significant, i.e., whether a given sample of size n has been drawn from a population with mean $\mu$.
Let $X_1, X_2, \ldots, X_n$ be a random sample of size n from a large population $X_1, X_2, \ldots, X_N$ of size N with mean $\mu$ and variance $\sigma^2$. The standard error of the mean of a random sample of size n from a population with variance $\sigma^2$ is $\sigma/\sqrt{n}$.
Under the null hypothesis that there is no difference between the sample mean and the population mean, the test statistic is
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$, where $\sigma$ is the standard deviation of the population.
If $\sigma$ is not known, we use the test statistic
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$, where s is the standard deviation of the sample.
Note. If the level of significance is $\alpha$ and $z_\alpha$ is the critical value, then H0 is accepted when
$-z_\alpha < z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} < z_\alpha.$
The limits of the population mean $\mu$ are therefore given by
$\bar{x} - z_\alpha \dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_\alpha \dfrac{\sigma}{\sqrt{n}}.$
At 5% level of significance, the 95% confidence limits are $\bar{x} - 1.96\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\dfrac{\sigma}{\sqrt{n}}$.
At 1% level of significance, the 99% confidence limits are $\bar{x} - 2.58\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 2.58\dfrac{\sigma}{\sqrt{n}}$.
These limits are called confidence limits or fiducial limits.

ILLUSTRATIVE EXAMPLES
Example 1. A normal population has a mean of 6.8 and standard deviation of 1.5. A sample of 400 members gave a mean of 6.75. Is the difference significant?
Sol.
H0 : There is no significant difference between $\bar{x}$ and $\mu$.
H1 : There is significant difference between $\bar{x}$ and $\mu$.
Given $\mu = 6.8$, $\sigma = 1.5$, $\bar{x} = 6.75$, and $n = 400$,
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{6.75 - 6.8}{1.5/\sqrt{400}} = -0.67$, so $|z| = 0.67$.
Conclusion. As the calculated value of $|z| < z_\alpha = 1.96$ at 5% level of significance, H0 is accepted, i.e., there is no significant difference between $\bar{x}$ and $\mu$.

Example 2. A random sample of 900 wooden sticks has a mean of 3.4 cms. Can it be reasonably regarded as a sample from a large population of mean 3.2 cms and S.D. 2.3 cms?
Sol. Here $n = 900$, $\bar{x} = 3.4$, $\mu = 3.2$, $\sigma = 2.3$.
H0 : Assume that the sample is drawn from a large population with mean 3.2 and S.D. = 2.3.
H1 : $\mu \neq 3.2$ (Apply two-tailed test.)
Under H0, $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{3.4 - 3.2}{2.3/\sqrt{900}} = \dfrac{0.2}{0.0767} = 2.609.$
Conclusion. As the calculated value of z = 2.609 > 1.96, the significant value of z at 5% level of significance, H0 is rejected, i.e., the sample cannot reasonably be regarded as drawn from the population with mean 3.2 and S.D. = 2.3.

Example 3. The mean weight obtained from a random sample of size 100 is 64 gms. The S.D. of the weight distribution of the population is 3 gms. Test the statement that the mean weight of the population is 67 gms at 5% level of significance. Also set up 99% confidence limits of the mean weight of the population.
Sol. Here $n = 100$, $\mu = 67$, $\bar{x} = 64$, $\sigma = 3$.
H0 : There is no significant difference between sample and population mean, i.e., $\mu = 67$; the sample is drawn from the population with $\mu = 67$.
H1 : $\mu \neq 67$ (Two-tailed test)
Under H0, $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{64 - 67}{3/\sqrt{100}} = -10$ ∴ $|z| = 10$.
Conclusion. Since the calculated value of $|z| > 1.96$, the significant value of z at 5% level of significance, H0 is rejected, i.e., the sample is not drawn from the population with mean 67.
The 99% confidence limits are given by $\bar{x} \pm 2.58\,\sigma/\sqrt{n} = 64 \pm 2.58 \times 3/\sqrt{100}$, i.e., 63.226 and 64.774.

Example 4. The average grades in mathematics of a sample of 100 students was 51 with a S.D. of 6.
Could this have been a random sample from a population with average grades of 50?
Sol. Here $n = 100$, $\bar{x} = 51$, $s = 6$, $\mu = 50$; $\sigma$ is unknown.
H0 : The sample is drawn from a population with mean 50, i.e., $\mu = 50$.
H1 : $\mu \neq 50$
Under H0, $z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{51 - 50}{6/\sqrt{100}} = \dfrac{10}{6} = 1.667.$
Conclusion. Since z = 1.667 < 1.96, the significant value of z at 5% level of significance, H0 is accepted, i.e., the sample is drawn from the population with mean 50.

TEST YOUR KNOWLEDGE
1. A sample of 1000 students from a university was taken and their average weight was found to be 112 pounds with a S.D. of 20 pounds. Could the mean weight of students in the population be 120 pounds?
2. A sample of 400 male students is found to have a mean height of 160 cms. Can it be reasonably regarded as a sample from a large population with mean height 162.5 cms and standard deviation 4.5 cms?
3. A random sample of 200 measurements from a large population gave a mean value of 50 and a S.D. of 9. Determine 95% confidence interval for the mean of the population.
4. The guaranteed average life of a certain type of bulb is 1000 hours with a S.D. of 125 hours. It is decided to sample the output so as to ensure that 90% of the bulbs do not fall short of the guaranteed average by more than 2.5%. What must be the minimum size of the sample?
5. The heights of college students in a city are normally distributed with a S.D. of 6 cms. A sample of 1000 students has a mean height of 158 cms. Test the hypothesis that the mean height of college students in the city is 160 cms.

Answers
1. H0 is rejected
2. H0 accepted
3. 48.8 and 51.2
4. n = 4
5. H0 rejected at both the 1% and 5% levels of significance.
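The single-mean test of Section 21.74 reduces to two short formulas, so it is easy to check a worked example numerically. The following is a minimal sketch in Python, not part of the original text: the function names `z_single_mean` and `confidence_limits` are illustrative choices, and the figures are those of Example 3 above (n = 100, sample mean 64, hypothesized mean 67, σ = 3), with the critical values 1.96 and 2.58 taken from the text.

```python
import math

def z_single_mean(sample_mean, pop_mean, sd, n):
    """Large-sample z statistic for a single mean: (x_bar - mu) / (sd / sqrt(n))."""
    return (sample_mean - pop_mean) / (sd / math.sqrt(n))

def confidence_limits(sample_mean, sd, n, z_crit):
    """Limits x_bar -/+ z_crit * sd / sqrt(n) for the population mean."""
    half_width = z_crit * sd / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Figures from Example 3 of Section 21.74 (hypothetical helper names above).
z = z_single_mean(64, 67, 3, 100)
lower, upper = confidence_limits(64, 3, 100, 2.58)  # 99% limits
reject_at_5_percent = abs(z) > 1.96                  # two-tailed test

print(round(z, 3), round(lower, 3), round(upper, 3), reject_at_5_percent)
```

When σ is unknown, the same functions can be called with the sample standard deviation s in place of `sd`, as the text prescribes for large samples.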
21.75 TEST OF SIGNIFICANCE FOR DIFFERENCE OF MEANS OF TWO LARGE SAMPLES
Let $\bar{x}_1$ be the mean of a sample of size $n_1$ from a population with mean $\mu_1$ and variance $\sigma_1^2$. Let $\bar{x}_2$ be the mean of an independent sample of size $n_2$ from another population with mean $\mu_2$ and variance $\sigma_2^2$. The test statistic is given by
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}.$
Under the null hypothesis that the samples are drawn from the same population, where $\sigma_1 = \sigma_2 = \sigma$, i.e., $\mu_1 = \mu_2$, the test statistic is given by
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}.$
Note 1. If $\sigma_1$, $\sigma_2$ are not known and $\sigma_1 \neq \sigma_2$, the test statistic in this case is
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}.$
Note 2. If $\sigma$ is not known and $\sigma_1 = \sigma_2$, we use $\sigma^2 = \dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}$ to calculate $\sigma$;
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}.$

ILLUSTRATIVE EXAMPLES
Example 1. The average bonus income of people was $210 with a S.D. of $10 in a sample of 100 people of a city. For another sample of 150 people, the average income was $220 with S.D. of $12. The S.D. of bonus incomes of the people of the city was $11. Test whether there is any significant difference between the average bonus incomes of the localities.
Sol. Here $n_1 = 100$, $n_2 = 150$, $\bar{x}_1 = 210$, $\bar{x}_2 = 220$, $s_1 = 10$, $s_2 = 12$.
Null hypothesis. The difference is not significant, i.e., there is no difference between the bonus incomes of the localities.
H0 : $\bar{x}_1 = \bar{x}_2$, H1 : $\bar{x}_1 \neq \bar{x}_2$
Under H0, $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{210 - 220}{\sqrt{\dfrac{10^2}{100} + \dfrac{12^2}{150}}} = -7.1428$ ∴ $|z| = 7.1428$.
Conclusion. As the calculated value of $|z| > 1.96$, the significant value of z at 5% level of significance, H0 is rejected, i.e., there is significant difference between the average bonus incomes of the localities.

Example 2.
Intelligence tests were given to two groups of boys and girls.

         Mean   S.D.   Size
Girls     75      8      60
Boys      73     10     100

Examine if the difference between mean scores is significant.
Sol. Null hypothesis H0. There is no significant difference between mean scores, i.e., $\bar{x}_1 = \bar{x}_2$.
H1 : $\bar{x}_1 \neq \bar{x}_2$
Under the null hypothesis, $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{75 - 73}{\sqrt{\dfrac{8^2}{60} + \dfrac{10^2}{100}}} = 1.3912.$
Conclusion. As the calculated value of z < 1.96, the significant value of z at 5% level of significance, H0 is accepted, i.e., there is no significant difference between mean scores.

Example 3. For sample I, $n_1 = 1000$, $\Sigma x = 49{,}000$, $\Sigma(x - \bar{x})^2 = 784{,}000$. For sample II, $n_2 = 1500$, $\Sigma x = 70{,}500$, $\Sigma(x - \bar{x})^2 = 2{,}400{,}000$. Discuss the significance of the difference of the sample means.
Sol. Null hypothesis H0. There is no significant difference between the sample means.
H0 : $\bar{x}_1 = \bar{x}_2$; H1 : $\bar{x}_1 \neq \bar{x}_2$
To calculate the sample variances:
$s_1^2 = \dfrac{1}{n_1}\Sigma(X_1 - \bar{X}_1)^2 = \dfrac{784000}{1000} = 784$
$s_2^2 = \dfrac{1}{n_2}\Sigma(X_2 - \bar{X}_2)^2 = \dfrac{2400000}{1500} = 1600$
$\bar{x}_1 = \dfrac{\Sigma x_1}{n_1} = \dfrac{49000}{1000} = 49$; $\bar{x}_2 = \dfrac{\Sigma x_2}{n_2} = \dfrac{70500}{1500} = 47$
Under the null hypothesis, the test statistic is
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{49 - 47}{\sqrt{\dfrac{784}{1000} + \dfrac{1600}{1500}}} = 1.470.$
Conclusion. As the calculated value of z = 1.47 < 1.96, the significant value of z at 5% level of significance, H0 is accepted, i.e., there is no significant difference between the sample means.

Example 4. From the data given below, compute the standard error of the difference of the two sample means and find out if the two means significantly differ at 5% level of significance.

            No. of items   Mean    S.D.
Group I          50        181.5   3.0
Group II         75        179     3.6

Sol. Null hypothesis H0. There is no significant difference between the samples, i.e., $\bar{x}_1 = \bar{x}_2$; H1 : $\bar{x}_1 \neq \bar{x}_2$
Under H0, $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{181.5 - 179.0}{\sqrt{\dfrac{9}{50} + \dfrac{(3.6)^2}{75}}} = 4.2089.$
Conclusion. As z > 1.96, the tabulated value of z at 5% level of significance, H0 is rejected, i.e., there is significant difference between the samples.

Example 5. A random sample of 200 towns in a state gives the mean population per town at 485 with a S.D. of 50. Another random sample of the same size from the same state gives the mean population per town at 510 with a S.D. of 40. Is the difference between the mean values given by the two samples statistically significant? Justify your answer.
Sol. Here $n_1 = 200$, $n_2 = 200$, $\bar{x}_1 = 485$, $\bar{x}_2 = 510$, $s_1 = 50$, $s_2 = 40$.
Null hypothesis H0. There is no significant difference between the mean values, i.e., $\bar{x}_1 = \bar{x}_2$; H1 : $\bar{x}_1 \neq \bar{x}_2$ (Two-tailed test)
Under H0, the test statistic is given by
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{485 - 510}{\sqrt{\dfrac{50^2}{200} + \dfrac{40^2}{200}}} = -5.52$ ∴ $|z| = 5.52$.
Conclusion. As the calculated value of $|z| > 1.96$, the significant value of z at 5% level of significance, H0 is rejected, i.e., there is significant difference between the mean values of the two samples.

TEST YOUR KNOWLEDGE
1. Intelligence tests on two groups of boys and girls gave the following results. Examine whether the difference is significant.

         Mean   S.D.   Size
Girls     70     10      70
Boys      75     11     100

2. Two random samples of sizes 1000 and 2000 of farms gave an average yield of 2000 kg and 2050 kg respectively. The variance of wheat farms in the country may be taken as 100 kg. Examine whether the two samples differ significantly in yield.
3. A sample of heights of 6400 soldiers has a mean of 67.85 inches and a S.D. of 2.56 inches while another sample of heights of 1600 sailors has a mean of 68.55 inches with a S.D. of 2.52 inches. Do the data indicate that the sailors are on the average taller than soldiers?
4. In a survey of buying habits, 400 shoppers are chosen at random in supermarket A.
Their average weekly food expenditure is $250 with a S.D. of $40. For 500 shoppers chosen at supermarket B, the average weekly food expenditure is $220 with a S.D. of $45. Test at 1% level of significance whether the average food expenditures of the two groups are equal.
5. The number of accidents per day was studied for 144 days in town A and for 100 days in town B and the following information was obtained.

          Mean number of accidents   S.D.
Town A              4.5              1.2
Town B              5.4              1.5

Is the difference between the mean accidents of the two towns statistically significant?
6. An examination was given to 50 students of college A and to 60 students of college B. For A, the mean grade was 75 with a S.D. of 9 and for B, the mean grade was 79 with a S.D. of 7. Is there any significant difference between the performance of the students of college A and those of college B?
7. A random sample of 200 measurements from a large population gave a mean value of 50 and a S.D. of 9. Determine the 95% confidence interval for the mean of the population.
8. The means of two large samples of 1000 and 2000 members are 168.75 cms and 170 cms respectively. Can the samples be regarded as drawn from the same population of standard deviation 6.25 cms?

Answers
1. No significant difference
2. Highly significant
3. Highly significant
4. Highly significant
5. Highly significant
6. Not significant
7. 49.584, 50.416
8. Not significant

21.76 TEST OF SIGNIFICANCE FOR THE DIFFERENCE OF STANDARD DEVIATIONS
If $s_1$ and $s_2$ are the standard deviations of two independent samples, then under the null hypothesis H0 : $\sigma_1 = \sigma_2$, i.e., the sample standard deviations do not differ significantly, the test statistic is
$z = \dfrac{s_1 - s_2}{\sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}}$, where $\sigma_1$ and $\sigma_2$ are the population standard deviations.
When the population standard deviations are not known,
$z = \dfrac{s_1 - s_2}{\sqrt{\dfrac{s_1^2}{2n_1} + \dfrac{s_2^2}{2n_2}}}.$

ILLUSTRATIVE EXAMPLES
Example 1. Random samples drawn from two countries gave the following data relating to the heights of adult males.

                          Country A   Country B
Mean height (in inches)     67.42       67.25
Standard deviation           2.58        2.50
Number in samples            1000        1200

(i) Is the difference between the means significant?
(ii) Is the difference between the standard deviations significant?
Sol. Given: $n_1 = 1000$, $n_2 = 1200$, $\bar{x}_1 = 67.42$, $\bar{x}_2 = 67.25$, $s_1 = 2.58$, $s_2 = 2.50$.
Since the sample sizes are large we can take $\sigma_1 = s_1 = 2.58$; $\sigma_2 = s_2 = 2.50$.
(i) Null hypothesis. H0 : $\mu_1 = \mu_2$, i.e., sample means do not differ significantly.
Alternative hypothesis: H1 : $\mu_1 \neq \mu_2$ (two-tailed test)
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{67.42 - 67.25}{\sqrt{\dfrac{(2.58)^2}{1000} + \dfrac{(2.50)^2}{1200}}} = 1.56$
Since z < 1.96 we accept the null hypothesis at 5% level of significance.
(ii) We set up the null hypothesis. H0 : $\sigma_1 = \sigma_2$, i.e., the sample S.D.'s do not differ significantly.
Alternative hypothesis: H1 : $\sigma_1 \neq \sigma_2$ (two-tailed)
∴ The test statistic is given by
$z = \dfrac{s_1 - s_2}{\sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}} = \dfrac{s_1 - s_2}{\sqrt{\dfrac{s_1^2}{2n_1} + \dfrac{s_2^2}{2n_2}}}$ (∵ $\sigma_1 = s_1$, $\sigma_2 = s_2$ for large samples)
$= \dfrac{2.58 - 2.50}{\sqrt{\dfrac{(2.58)^2}{2 \times 1000} + \dfrac{(2.50)^2}{2 \times 1200}}} = \dfrac{0.08}{\sqrt{\dfrac{6.6564}{2000} + \dfrac{6.25}{2400}}} = 1.0387$
Since z < 1.96 we accept the null hypothesis at 5% level of significance.

Example 2. An intelligence test of two groups of boys and girls gives the following results:

         Mean   S.D.    N
Girls     84     10    121
Boys      81     12     81

(a) Is the difference in mean scores significant?
(b) Is the difference between the standard deviations significant?
Sol. Given: $n_1 = 121$, $n_2 = 81$, $\bar{x}_1 = 84$, $\bar{x}_2 = 81$, $s_1 = 10$, $s_2 = 12$.
(a) Null hypothesis. H0 : $\mu_1 = \mu_2$, i.e., sample means do not differ significantly.
Alternative hypothesis: H1 : $\mu_1 \neq \mu_2$ (two-tailed)
The test statistic is $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{84 - 81}{\sqrt{\dfrac{(10)^2}{121} + \dfrac{(12)^2}{81}}} = 1.859$
Since z < 1.96 we accept the null hypothesis at 5% level of significance.
(b) We set up the null hypothesis H0 : $\sigma_1 = \sigma_2$, i.e., the sample S.D.'s do not differ significantly.
Alternative hypothesis: H1 : $\sigma_1 \neq \sigma_2$ (two-tailed)
The test statistic is
$z = \dfrac{s_1 - s_2}{\sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}} = \dfrac{s_1 - s_2}{\sqrt{\dfrac{s_1^2}{2n_1} + \dfrac{s_2^2}{2n_2}}}$ (∵ $\sigma_1 = s_1$, $\sigma_2 = s_2$ for large samples)
$= \dfrac{10 - 12}{\sqrt{\dfrac{100}{2 \times 121} + \dfrac{144}{2 \times 81}}} = -1.7526$ ∴ $|z| = 1.7526$
Since $|z| = 1.75 < 1.96$ we accept the null hypothesis at 5% level of significance.

TEST YOUR KNOWLEDGE
1. The mean yield of two sets of plots and their variability are as given; examine (i) whether the difference in the mean yield of the two sets of plots is significant; (ii) whether the difference in the variability in yields is significant.

                   Mean yield per plot   S.D. per plot
Set of 40 plots         1258 lb               34
Set of 60 plots         1243 lb               28

2. The yield of wheat in a random sample of 1000 farms in a certain area has a S.D.
of 192 kg. Another random sample of 1000 farms gives a S.D. of 224 kg. Are the S.D.'s significantly different?

Answers
1. (i) z = 2.321, difference significant at 5% level; (ii) z = 1.31, difference not significant at 5% level
2. z = 4.851, the S.D.'s are significantly different.

21.77 TEST OF SIGNIFICANCE OF SMALL SAMPLES
When the size of the sample is less than 30, the sample is called a small sample. For such a sample we cannot assume that the random sampling distribution of a statistic is approximately normal, nor that the values given by the sample data are sufficiently close to the population values to be used in their place for the calculation of the standard error of the estimate.

t-TEST
21.78 STUDENT'S t-DISTRIBUTION
This t-distribution is used when the sample size is ≤ 30 and the population standard deviation is unknown.
The t-statistic is defined as
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} \sim t(n - 1 \text{ d.f.})$, where d.f. denotes degrees of freedom and $s = \sqrt{\dfrac{\Sigma(X - \bar{X})^2}{n - 1}}$.

The t-table
The t-table given at the end is the probability integral of the t-distribution. The t-distribution has a different value for each degree of freedom, and when the degrees of freedom are infinitely large, the t-distribution is equivalent to the normal distribution and the probabilities shown in the normal distribution tables are applicable.

Application of t-distribution
Some of the applications of the t-distribution are given below:
1. To test if the sample mean ($\bar{X}$) differs significantly from the hypothetical value $\mu$ of the population mean.
2. To test the significance between two sample means.
3. To test the significance of observed partial and multiple correlation coefficients.
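The t-statistic of Section 21.78 can be computed directly from raw observations. The sketch below is an illustration rather than part of the original text: the helper name `t_statistic` and the five-value sample are hypothetical, chosen only so the arithmetic is easy to follow by hand. The resulting value would still have to be compared with the tabulated t for n - 1 degrees of freedom.

```python
import math

def t_statistic(sample, pop_mean):
    """Return (t, degrees of freedom) for a one-sample t-test.

    s uses the divisor n - 1, matching s = sqrt(sum((X - mean)^2) / (n - 1)).
    """
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    t = (mean - pop_mean) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical data: mean = 14, sum of squared deviations = 40, s = sqrt(10).
sample = [10, 12, 14, 16, 18]
t, dof = t_statistic(sample, pop_mean=12)
print(round(t, 4), dof)
```

Here t works out to 2 / (sqrt(10) / sqrt(5)) = sqrt(2) with 4 degrees of freedom.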
Critical value of t
The critical value or significant value of t at level of significance $\alpha$ and degrees of freedom $\gamma$ for the two-tailed test is given by
$P[\,|t| > t_\gamma(\alpha)\,] = \alpha$, $P[\,|t| \le t_\gamma(\alpha)\,] = 1 - \alpha.$
The significant value of t at level of significance $\alpha$ for a single-tailed test can be determined from those of the two-tailed test by referring to the values at $2\alpha$.

21.79 TEST I: t-TEST OF SIGNIFICANCE OF THE MEAN OF A RANDOM SAMPLE
This test is used to decide whether the mean of a sample drawn from a normal population deviates significantly from a stated value when the variance of the population is unknown.
H0 : There is no significant difference between the sample mean $\bar{x}$ and the population mean $\mu$. We use the statistic
$t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$, where $\bar{X}$ is the mean of the sample and $s^2 = \dfrac{1}{n - 1}\sum_{i=1}^{n}(X_i - \bar{X})^2$, with $(n - 1)$ degrees of freedom.
At a given level of significance $\alpha$ and $(n - 1)$ degrees of freedom, we refer to the t-table for $t_\alpha$ (two-tailed or one-tailed).
If the calculated t value is such that $|t| < t_\alpha$, the null hypothesis is accepted; if $|t| > t_\alpha$, H0 is rejected.

Fiducial limits of population mean
If $t_\alpha$ is the table value of t at level of significance $\alpha$ for $(n - 1)$ degrees of freedom, then
$\left|\dfrac{\bar{X} - \mu}{s/\sqrt{n}}\right| < t_\alpha$ for acceptance of H0, i.e., $\bar{x} - t_\alpha \dfrac{s}{\sqrt{n}} < \mu < \bar{x} + t_\alpha \dfrac{s}{\sqrt{n}}.$
95% confidence limits (level of significance 5%) are $\bar{X} \pm t_{0.05}\, s/\sqrt{n}$.
99% confidence limits (level of significance 1%) are $\bar{X} \pm t_{0.01}\, s/\sqrt{n}$.
Note. Instead of calculating s, we may calculate S for the sample, where
$s^2 = \dfrac{1}{n - 1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ and $S^2 = \dfrac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$,
so that $(n - 1)s^2 = nS^2$, i.e., $s^2 = \dfrac{n}{n - 1}S^2$.

ILLUSTRATIVE EXAMPLES
Example 1. A random sample of size 16 has 53 as its mean. The sum of squares of the deviation from mean is 135. Can this sample be regarded as taken from the population having 56 as its mean? Obtain 95% and 99% confidence limits of the mean of the population.
Sol.
H0 : There is no significant difference between the sample mean and the hypothetical population mean.
H0 : $\mu = 56$; H1 : $\mu \neq 56$ (Two-tailed test)
$t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}} \sim t(n - 1 \text{ d.f.})$
Given: $\bar{X} = 53$, $\mu = 56$, $n = 16$, $\Sigma(X - \bar{X})^2 = 135$
$s = \sqrt{\dfrac{\Sigma(X - \bar{X})^2}{n - 1}} = \sqrt{\dfrac{135}{15}} = 3$; $t = \dfrac{53 - 56}{3/\sqrt{16}} = \dfrac{-3 \times 4}{3} = -4$
$|t| = 4$; d.f. = 16 - 1 = 15.
Conclusion. For 15 d.f. the two-tailed table value is $t_{0.05} = 2.131$. Since $|t| = 4 > t_{0.05} = 2.131$, i.e., the calculated value of t is more than the table value, the hypothesis is rejected. Hence the sample has not come from a population having 56 as its mean.
95% confidence limits of the population mean:
$\bar{X} \pm \dfrac{s}{\sqrt{n}}\,t_{0.05} = 53 \pm \dfrac{3}{\sqrt{16}}(2.131)$, i.e., 51.402 and 54.598.
99% confidence limits of the population mean (two-tailed $t_{0.01} = 2.947$ for 15 d.f.):
$\bar{X} \pm \dfrac{s}{\sqrt{n}}\,t_{0.01} = 53 \pm \dfrac{3}{\sqrt{16}}(2.947)$, i.e., 50.790 and 55.210.

Example 2. The lifetime of electric bulbs for a random sample of 10 from a large shipment gave the following data:

Item                     1    2    3    4    5    6    7    8    9    10
Life in 1000s of hrs.   4.2  4.6  3.9  4.1  5.2  3.8  3.9  4.3  4.4  5.6

Can we accept the hypothesis that the average lifetime of a bulb is 4000 hrs?
Sol. H0 : There is no significant difference in the sample mean and population mean, i.e., $\mu = 4000$ hrs.
Applying the t-test: $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}} \sim t(10 - 1 \text{ d.f.})$

X                 4.2    4.6    3.9    4.1    5.2    3.8    3.9    4.3    4.4    5.6
X - X̄           -0.2    0.2   -0.5   -0.3    0.8   -0.6   -0.5   -0.1    0      1.2
(X - X̄)^2        0.04   0.04   0.25   0.09   0.64   0.36   0.25   0.01   0      1.44

$\bar{X} = \dfrac{\Sigma X}{n} = \dfrac{44}{10} = 4.4$; $\Sigma(X - \bar{X})^2 = 3.12$
$s = \sqrt{\dfrac{\Sigma(X - \bar{X})^2}{n - 1}} = \sqrt{\dfrac{3.12}{9}} = 0.589$; $t = \dfrac{4.4 - 4}{0.589/\sqrt{10}} = 2.148$
For $\gamma = 9$, $t_{0.05} = 2.26$.
Conclusion. Since the calculated value of t is less than the table value $t_{0.05}$, the hypothesis $\mu = 4000$ hrs is accepted, i.e., the average lifetime of the bulbs could be 4000 hrs.

Example 3. A sample of 20 items has mean 42 units and S.D. 5 units. Test the hypothesis that it is a random sample from a normal population with mean 45 units.
Sol.
H0 : There is no significant difference between the sample mean and the population mean, i.e., $\mu = 45$ units.
H1 : $\mu \neq 45$ (Two-tailed test)
Given: $n = 20$, $\bar{X} = 42$, $S = 5$; $\gamma = 19$ d.f.
$s^2 = \dfrac{n}{n - 1}S^2 = \dfrac{20}{20 - 1}(5)^2 = 26.31$ ∴ $s = 5.129$
Applying the t-test, $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}} = \dfrac{42 - 45}{5.129/\sqrt{20}} = -2.615$; $|t| = 2.615$
The tabulated value of t at 5% level for 19 d.f. is $t_{0.05} = 2.09$.
Conclusion. Since $|t| > t_{0.05}$, the hypothesis H0 is rejected, i.e., there is significant difference between the sample mean and the population mean, and the sample could not have come from this population.

Example 4. The 9 items of a sample have the following values: 45, 47, 50, 52, 48, 47, 49, 53, 51. Does the mean of these values differ significantly from the assumed mean 47.5?
Sol. H0 : $\mu = 47.5$, i.e., there is no significant difference between the sample and the population mean.
H1 : $\mu \neq 47.5$ (two-tailed test); given: $n = 9$, $\mu = 47.5$

X             45      47      50     52     48      47      49      53      51
X - X̄       -4.1    -2.1     0.9    2.9   -1.1    -2.1    -0.1     3.9     1.9
(X - X̄)^2   16.81    4.41    0.81   8.41   1.21    4.41    0.01   15.21    3.61

$\bar{X} = \dfrac{\Sigma x}{n} = \dfrac{442}{9} = 49.11$; $\Sigma(X - \bar{X})^2 = 54.89$; $s^2 = \dfrac{\Sigma(X - \bar{X})^2}{n - 1} = 6.86$ ∴ $s = 2.619$
Applying the t-test, $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}} = \dfrac{49.11 - 47.5}{2.619/\sqrt{9}} = 1.845$
$t_{0.05} = 2.31$ for $\gamma = 8$.
Conclusion. Since $t < t_{0.05}$, the hypothesis is accepted, i.e., there is no significant difference between their means.

Example 5. The following results are obtained from a sample of 10 boxes of biscuits. Mean weight content = 490 gm. S.D. of the weight = 9 gm. Could the sample come from a population having a mean of 500 gm?
Sol. Given: $n = 10$, $\bar{X} = 490$; $S = 9$ gm, $\mu = 500$
$s = \sqrt{\dfrac{n}{n - 1}S^2} = \sqrt{\dfrac{10}{9} \times 9^2} = 9.486$
H0 : The difference is not significant, i.e., $\mu = 500$; H1 : $\mu \neq 500$
Applying the t-test, $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}} = \dfrac{490 - 500}{9.486/\sqrt{10}} = -3.333$
$t_{0.05} = 2.26$ for $\gamma = 9$.
Conclusion. Since $|t| = 3.333 > t_{0.05}$, the hypothesis H0 is rejected, i.e., $\mu \neq 500$. ∴ The sample could not have come from the population having mean 500 gm.

TEST YOUR KNOWLEDGE
1. Ten individuals are chosen at random from a normal population of students and their grades are found to be 63, 63, 66, 67, 68, 69, 70, 70, 71, 71. In light of these data, discuss the suggestion that the mean grade of the population of students is 66.
2. The following values give the lengths of 12 samples of Egyptian cotton taken from a shipment: 48, 46, 49, 46, 52, 45, 43, 47, 47, 46, 45, 50. Test whether the mean length of the shipment can be taken as 46.
3. A sample of 18 items has a mean of 24 units and a standard deviation of 3 units. Test the hypothesis that it is a random sample from a normal population with a mean of 27 units.
4. A random sample of 10 students had the following I.Q.'s: 70, 120, 110, 101, 88, 83, 95, 98, 107, and 100. Do these data support the assumption of a population mean I.Q. of 160?
5. A filling machine is expected to fill 5 kg of powder into bags. A sample of 10 bags gave the following weights: 4.7, 4.9, 5.0, 5.1, 5.4, 5.2, 4.6, 5.1, 4.6, and 4.7. Test whether the machine is working properly.

Answers
1. accepted
2. accepted
3. rejected
4. accepted
5. accepted

21.80 TEST II: t-TEST FOR DIFFERENCE OF MEANS OF TWO SMALL SAMPLES (FROM A NORMAL POPULATION)
This test is used to test whether the two samples $x_1, x_2, \ldots, x_{n_1}$ and $y_1, y_2, \ldots, y_{n_2}$ of sizes $n_1$, $n_2$ have been drawn from two normal populations with means $\mu_1$ and $\mu_2$ respectively, under the assumption that the population variances are equal ($\sigma_1 = \sigma_2 = \sigma$).
H0 : The samples have been drawn from normal populations with equal means, i.e., H0 : $\mu_1 = \mu_2$.
Let $\bar{X}$, $\bar{Y}$ be the means of the two samples. Under this H0 the test statistic t is given by
$t = \dfrac{\bar{X} - \bar{Y}}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t(n_1 + n_2 - 2 \text{ d.f.})$
where $\bar{X} = \sum_{i=1}^{n_1}\dfrac{X_i}{n_1}$, $\bar{Y} = \sum_{j=1}^{n_2}\dfrac{Y_j}{n_2}$, and
$s^2 = \dfrac{1}{n_1 + n_2 - 2}\left[\Sigma(X_i - \bar{X})^2 + \Sigma(Y_j - \bar{Y})^2\right]$
is an unbiased estimate of the population variance $\sigma^2$. t follows the t-distribution with $n_1 + n_2 - 2$ degrees of freedom.
Note 1. If the two sample standard deviations $s_1$, $s_2$ are given, then we have $s^2 = \dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}$.
Note 2. If $n_1 = n_2 = n$, $t = \dfrac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{s_1^2 + s_2^2}{n - 1}}}$ can be used as a test statistic.
Note 3. If the pairs of values are in some way associated (correlated), we cannot use the test statistic as given in Note 2. In this case we find the differences of the associated pairs of values and apply the test for a single mean, i.e., $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ with $n - 1$ degrees of freedom.
The test statistic is $t = \dfrac{\bar{d}}{s/\sqrt{n}}$ or $t = \dfrac{\bar{d}}{s/\sqrt{n - 1}}$, where $\bar{d}$ is the mean of the paired differences, i.e., $d_i = x_i - y_i$ and $\bar{d} = \bar{X} - \bar{Y}$, where $(x_i, y_i)$ are the paired data, $i = 1, 2, \ldots, n$.

ILLUSTRATIVE EXAMPLES
Example 1. Two samples of sodium vapor bulbs were tested for length of life and the following results were returned:

               Type I      Type II
Size              8           7
Sample mean    1234 hrs    1036 hrs
Sample S.D.      36 hrs      40 hrs

Is the difference in the means significant enough to generalize that type I is superior to type II regarding length of life?
Sol. H0 : $\mu_1 = \mu_2$, i.e., two types of bulbs have the same lifetime.
H1 : $\mu_1 > \mu_2$, i.e., type I is superior to type II.
$s^2 = \dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \dfrac{8 \times (36)^2 + 7(40)^2}{8 + 7 - 2} = 1659.076$ ∴ $s = 40.7317$
The t-statistic is
$t = \dfrac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \dfrac{1234 - 1036}{40.7317\sqrt{\dfrac{1}{8} + \dfrac{1}{7}}} = 9.39 \sim t(n_1 + n_2 - 2 \text{ d.f.})$
$t_{0.05}$ at 13 d.f. is 1.77 (one-tailed test).
Conclusion. Since calculated $t > t_{0.05}$, H0 is rejected, i.e., H1 is accepted. ∴ Type I is definitely superior to type II.

Example 2. Samples of sizes 10 and 14 were taken from two normal populations with S.D. 3.5 and 5.2. The sample means were found to be 20.3 and 18.6. Test whether the means of the two populations are the same at 5% level.
Sol. H0 : $\mu_1 = \mu_2$, i.e., the means of the two populations are the same.
H1 : $\mu_1 \neq \mu_2$.
Given $\bar{X}_1 = 20.3$, $\bar{X}_2 = 18.6$; $n_1 = 10$, $n_2 = 14$, $s_1 = 3.5$, $s_2 = 5.2$
$s^2 = \dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \dfrac{10(3.5)^2 + 14(5.2)^2}{10 + 14 - 2} = 22.775$ ∴ $s = 4.772$
$t = \dfrac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \dfrac{20.3 - 18.6}{4.772\sqrt{\dfrac{1}{10} + \dfrac{1}{14}}} = 0.8604$
The value of t at 5% level for 22 d.f. is $t_{0.05} = 2.0739$.
Conclusion. Since $t = 0.8604 < t_{0.05}$, the hypothesis is accepted, i.e., there is no significant difference between their means.

Example 3. The heights of 6 randomly chosen sailors in inches are 63, 65, 68, 69, 71, and 72. Those of 9 randomly chosen soldiers are 61, 62, 65, 66, 69, 70, 71, 72, and 73. Test whether the sailors are, on the average, taller than the soldiers.
Sol. Let $X_1$ and $X_2$ be the two samples denoting the heights of sailors and soldiers.
Given the sample sizes $n_1 = 6$, $n_2 = 9$, H0 : $\mu_1 = \mu_2$, i.e., the means of both the populations are the same.
H1 : $\mu_1 > \mu_2$ (one-tailed test)
Calculation of the two sample means:

X1                63    65    68    69    71    72
X1 - X̄1          -5    -3     0     1     3     4
(X1 - X̄1)^2      25     9     0     1     9    16

$\bar{X}_1 = \dfrac{\Sigma X_1}{n_1} = 68$; $\Sigma(X_1 - \bar{X}_1)^2 = 60$

X2                61       62       65      66      69      70      71       72       73
X2 - X̄2         -6.66    -5.66    -2.66   -1.66    1.34    2.34    3.34     4.34     5.34
(X2 - X̄2)^2     44.36    32.035   7.0756  2.7556  1.7956  5.4756  11.1556  18.8356  28.5156

$\bar{X}_2 = \dfrac{\Sigma X_2}{n_2} = 67.66$; $\Sigma(X_2 - \bar{X}_2)^2 = 152.0002$
$s^2 = \dfrac{1}{n_1 + n_2 - 2}\left[\Sigma(X_1 - \bar{X}_1)^2 + \Sigma(X_2 - \bar{X}_2)^2\right] = \dfrac{1}{6 + 9 - 2}[60 + 152.0002] = 16.3077$ ∴ $s = 4.038$
Under H0, $t = \dfrac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \dfrac{68 - 67.666}{4.038\sqrt{\dfrac{1}{6} + \dfrac{1}{9}}} = 0.157 \sim t(n_1 + n_2 - 2 \text{ d.f.})$
The value of t at 10% level of significance (∵ the test is one-tailed) for 13 d.f. is 1.77.
Conclusion. Since $t = 0.157 < t_{0.05} = 1.77$, the hypothesis H0 is accepted, i.e., there is no significant difference between their averages, and the sailors are not, on the average, taller than the soldiers.

Example 4. A certain stimulus administered to each of 12 patients resulted in the following increases of blood pressure: 5, 2, 8, -1, 3, 0, -2, 1, 5, 0, 4, 6. Can it be concluded that the stimulus will in general be accompanied by an increase in blood pressure?
Sol. To test whether the mean increase in blood pressure of all patients to whom the stimulus is administered will be positive, we have to assume that this population is normal with mean $\mu$ and S.D. $\sigma$, which are unknown.
H0 : μ = 0; H1 : μ > 0. The test statistic under H0 is
$$t = \frac{\bar d}{s/\sqrt{n - 1}} \sim t\ (n - 1\ \text{degrees of freedom}).$$
$$\bar d = \frac{5 + 2 + 8 + (-1) + 3 + 0 + 6 + (-2) + 1 + 5 + 0 + 4}{12} = 2.583$$
$$s^2 = \frac{\Sigma d^2}{n} - \bar d^{\,2} = \frac{1}{12}\left[5^2 + 2^2 + 8^2 + (-1)^2 + 3^2 + 0^2 + 6^2 + (-2)^2 + 1^2 + 5^2 + 0^2 + 4^2\right] - (2.583)^2 = 8.744 \quad \therefore\ s = 2.9571$$
$$t = \frac{\bar d}{s/\sqrt{n - 1}} = \frac{2.583}{2.9571/\sqrt{12 - 1}} = \frac{2.583\sqrt{11}}{2.9571} = 2.897 \sim t\ (n - 1\ \text{d.f.})$$

Conclusion. The tabulated value of t0.05 at 11 d.f. is 2.2. Since t > t0.05, H0 is rejected, i.e., the hypothesis that the stimulus does not increase the blood pressure is rejected. The stimulus in general will be accompanied by an increase in blood pressure.

Example 5. The memory capacity of 9 students was tested before and after a course of medication for a month. State whether the course was effective or not from the data below (in the same units).

Before : 10 15 9 3 7 12 16 17 4
After : 12 17 8 5 6 11 18 20 3

Sol. Since the data are correlated and concerned with the same set of students, we use the paired t-test.
H0 : Medication was not effective, μ1 = μ2. H1 : μ1 ≠ μ2 (two-tailed test).

Before medication (X) : 10 15 9 3 7 12 16 17 4
After medication (Y) : 12 17 8 5 6 11 18 20 3
d = X − Y : −2 −2 1 −2 1 1 −2 −3 1
d² : 4 4 1 4 1 1 4 9 1

Σd = −7, Σd² = 29.
$$\bar d = \frac{\Sigma d}{n} = \frac{-7}{9} = -0.7778; \qquad s^2 = \frac{\Sigma d^2}{n} - (\bar d)^2 = \frac{29}{9} - (-0.7778)^2 = 2.617$$
$$t = \frac{\bar d}{s/\sqrt{n - 1}} = \frac{-0.7778}{\sqrt{2.617/8}} = \frac{-0.7778 \times \sqrt 8}{1.6177} = -1.359$$
The tabulated value of t0.05 at 8 d.f. is 2.31.

Conclusion. Since |t| = 1.359 < t0.05, H0 is accepted, i.e., medication was not effective in improving performance.

Example 6. The following figures refer to observations in two independent samples.
Sample I : 25 30 28 34 24 20 13 32 22 38
Sample II : 40 34 22 20 31 40 30 23 36 17
Analyze whether the samples have been drawn from the populations of equal means.

Sol. H0 : The two samples have been drawn from populations of equal means, i.e., there is no significant difference between their means, i.e., μ1 = μ2. H1 : μ1 ≠ μ2 (two-tailed test).
Given n1 = Sample I size = 10; n2 = Sample II size = 10.
To calculate the two sample means and the sums of squares of deviations from the means, let X1 be sample I and X2 be sample II.

$X_1$ : 25 30 28 34 24 20 13 32 22 38
$X_1 - \bar X_1$ : −1.6 3.4 1.4 7.4 −2.6 −6.6 −13.6 5.4 −4.6 11.4
$(X_1 - \bar X_1)^2$ : 2.56 11.56 1.96 54.76 6.76 43.56 184.96 29.16 21.16 129.96

$X_2$ : 40 34 22 20 31 40 30 23 36 17
$X_2 - \bar X_2$ : 10.7 4.7 −7.3 −9.3 1.7 10.7 0.7 −6.3 6.7 −12.3
$(X_2 - \bar X_2)^2$ : 114.49 22.09 53.29 86.49 2.89 114.49 0.49 39.69 44.89 151.29

$\bar X_1 = \frac{1}{n_1}\sum_{i=1}^{10} X_1 = 26.6$; $\Sigma (X_1 - \bar X_1)^2 = 486.4$
$\bar X_2 = \frac{1}{n_2}\sum_{i=1}^{10} X_2 = \frac{293}{10} = 29.3$; $\Sigma (X_2 - \bar X_2)^2 = 630.10$

$$s^2 = \frac{1}{n_1 + n_2 - 2}\left[\Sigma (X_1 - \bar X_1)^2 + \Sigma (X_2 - \bar X_2)^2\right] = \frac{1}{10 + 10 - 2}\,[486.4 + 630.10] = 62.028 \quad \therefore\ s = 7.875$$
Under H0 the test statistic is given by
$$t = \frac{\bar X_1 - \bar X_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{26.6 - 29.3}{7.875\sqrt{\frac{1}{10} + \frac{1}{10}}} = -0.7666 \sim t\ (n_1 + n_2 - 2\ \text{d.f.}), \qquad |t| = 0.7666.$$

Conclusion. The tabulated value of t at 5% level of significance for 18 d.f. is 2.1. Since the calculated value |t| = 0.7666 < t0.05, H0 is accepted, i.e., there is no significant difference between their means; the two samples have been drawn from populations of equal means.

TEST YOUR KNOWLEDGE

1. The mean life of 10 electric motors was found to be 1450 hrs with a S.D. of 423 hrs. A second sample of 17 motors chosen from a different batch showed a mean life of 1280 hrs with a S.D. of 398 hrs. Is there a significant difference between the means of the two samples?

2.
The grades obtained by a group of 9 regular course students and another group of 11 part-time course students in a test are given below:
Regular : 56 62 63 54 60 51 67 69 58
Part-time : 62 70 71 62 60 56 75 64 72 68 66
Examine whether the grades obtained by regular students and part-time students differ significantly at 5% and 1% levels of significance.

3. A group of 10 boys was fed on diet A and another group of 8 boys on a different diet B; they recorded the following increases in weight (kg):
Diet A : 5 6 8 1 12 4 3 9 6 10
Diet B : 2 3 6 8 10 1 2 8
Does it show the superiority of diet A over diet B?

4. Two independent samples of sizes 7 and 9 have the following values:
Sample A : 10 12 10 13 14 11 10
Sample B : 10 13 15 12 10 14 11 12 11
Test whether the difference between the means is significant.

5. To compare the prices of a certain product in two cities, 10 shops were visited at random in each city. The prices were noted below:
City 1 : 61 63 56 63 56 63 59 56 44 61
City 2 : 55 54 47 59 51 61 57 54 64 58
Test whether the average prices can be said to be the same in the two cities.

6. The average number of articles produced by two machines per day are 200 and 250 with standard deviations 20 and 25 respectively on the basis of records of 25 days’ production. Can you regard both the machines as equally efficient at 5% level of significance?

7. Two salesmen represent a firm in a certain company. One of them claims that he makes larger sales than the other. A sample survey was made and the following results were obtained:
                 1st Salesman    2nd Salesman
No. of sales :        18              20
Average sales :      $210            $175
S.D. :                $25             $20
Find whether the average sales differ significantly.

Answers
1. accepted 2. rejected 3. accepted 5. accepted 6. rejected 7. rejected 4.
accepted

21.81 SNEDECOR’S VARIANCE RATIO TEST OR F-TEST

In testing the significance of the difference of two means of two samples, we assumed that the two samples came from the same population or from populations with equal variance. The object of the F-test is to discover whether two independent estimates of population variance differ significantly, or whether the two samples may be regarded as drawn from normal populations having the same variance. Hence, before applying the t-test for the significance of the difference of two means, we have to test for the equality of the population variances by using the F-test.

Let n1 and n2 be the sizes of two samples with variances $s_1^2$ and $s_2^2$. The estimates of the population variance based on these samples are
$$S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} \quad \text{and} \quad S_2^2 = \frac{n_2 s_2^2}{n_2 - 1}.$$
The degrees of freedom of these estimates are ν1 = n1 − 1, ν2 = n2 − 1. (In the examples that follow, these estimates are usually computed directly as Σ(X − X̄)²/(n − 1) and written $s_1^2$, $s_2^2$.)

To test whether these estimates are significantly different, or whether the samples may be regarded as drawn from the same population or from two populations with the same variance $\sigma^2$, we set up the null hypothesis H0 : $\sigma_1^2 = \sigma_2^2 = \sigma^2$, i.e., the independent estimates of the common population variance do not differ significantly.

To carry out the test of significance of the difference of the variances we calculate the test statistic
$$F = \frac{S_1^2}{S_2^2},$$
where the numerator (Nr) is taken greater than the denominator (Dr), i.e., $S_1^2 > S_2^2$.

Conclusion. If the calculated value of F exceeds F0.05 for (n1 − 1), (n2 − 1) degrees of freedom given in the table, we conclude that the ratio is significant at 5% level, i.e., the samples could not have come from two normal populations with the same variance.

The assumptions on which the F-test is based are:
1.
The populations for each sample must be normally distributed.
2. The samples must be random and independent.
3. The ratio of $\sigma_1^2$ to $\sigma_2^2$ should be equal to 1 or greater than 1. That is why we take the larger variance in the numerator of the ratio.

Applications. The F-test is used to test
(i) whether two independent samples have been drawn from normal populations with the same variance $\sigma^2$;
(ii) whether the two independent estimates of the population variance are homogeneous or not.

ILLUSTRATIVE EXAMPLES

Example 1. In two independent samples of sizes 8 and 10 the sums of squares of deviations of the sample values from the respective sample means were 84.4 and 102.6. Test whether the difference of variances of the populations is significant or not.

Sol. Null hypothesis H0 : $\sigma_1^2 = \sigma_2^2 = \sigma^2$, i.e., there is no significant difference between the population variances.
Under H0 : $F = \dfrac{s_1^2}{s_2^2} \sim F(\nu_1, \nu_2\ \text{d.f.})$, where ν1 = n1 − 1, n1 = Sample I size = 8; ν2 = n2 − 1, n2 = Sample II size = 10.
$\Sigma (X_1 - \bar X_1)^2 = 84.4$; $\Sigma (X_2 - \bar X_2)^2 = 102.6$
$$s_1^2 = \frac{\Sigma (X_1 - \bar X_1)^2}{n_1 - 1} = \frac{84.4}{7} = 12.057; \qquad s_2^2 = \frac{\Sigma (X_2 - \bar X_2)^2}{n_2 - 1} = \frac{102.6}{9} = 11.4$$
$$F = \frac{s_1^2}{s_2^2} = \frac{12.057}{11.4} = 1.0576 \qquad (\because\ s_1^2 > s_2^2)$$

Conclusion. The tabulated value of F at 5% level of significance for (7, 9) d.f. is 3.29, i.e., F0.05 = 3.29. Since F = 1.0576 < 3.29 = F0.05, H0 is accepted. ∴ There is no significant difference between the variances of the populations.

Example 2. Two random samples are drawn from two normal populations as follows:
A : 17 27 18 25 27 29 13 17
B : 16 16 20 27 26 25 21
Test whether the samples are drawn from the same normal population.

Sol. To test whether two independent samples have been drawn from the same population we have to test (i) equality of the means by applying the t-test and (ii) equality of the population variance by applying the F-test.
Since the t-test assumes that the population variances are equal, we shall first apply the F-test.

F-test. Null hypothesis H0 : $\sigma_1^2 = \sigma_2^2$, i.e., the population variances do not differ significantly. Alternative hypothesis H1 : $\sigma_1^2 \ne \sigma_2^2$.
Test statistic: $F = \dfrac{s_1^2}{s_2^2}$ (if $s_1^2 > s_2^2$).

Computations for $s_1^2$ and $s_2^2$:

$X_1$ : 17 27 18 25 27 29 13 17
$X_1 - \bar X_1$ : −4.625 5.375 −3.625 3.375 5.375 7.375 −8.625 −4.625
$(X_1 - \bar X_1)^2$ : 21.39 28.89 13.14 11.39 28.89 54.39 74.39 21.39

$X_2$ : 16 16 20 27 26 25 21
$X_2 - \bar X_2$ : −5.571 −5.571 −1.571 5.429 4.429 3.429 −0.571
$(X_2 - \bar X_2)^2$ : 31.041 31.041 2.469 29.469 19.612 11.755 0.327

$\bar X_1 = 173/8 = 21.625$; n1 = 8; $\Sigma (X_1 - \bar X_1)^2 = 253.87$
$\bar X_2 = 151/7 = 21.571$; n2 = 7; $\Sigma (X_2 - \bar X_2)^2 = 125.714$
$$s_1^2 = \frac{\Sigma (X_1 - \bar X_1)^2}{n_1 - 1} = \frac{253.87}{7} = 36.267; \qquad s_2^2 = \frac{\Sigma (X_2 - \bar X_2)^2}{n_2 - 1} = \frac{125.714}{6} = 20.952$$
$$F = \frac{s_1^2}{s_2^2} = \frac{36.267}{20.952} = 1.731.$$

Conclusion. The table value of F for ν1 = 7 and ν2 = 6 degrees of freedom at 5% level is 4.21. The calculated value of F is less than the tabulated value of F. ∴ H0 is accepted. Hence we conclude that the variability in the two populations is the same.

t-test. Null hypothesis H0 : μ1 = μ2, i.e., the population means are equal. Alternative hypothesis H1 : μ1 ≠ μ2.
Test statistic:
$$s^2 = \frac{\Sigma (X_1 - \bar X_1)^2 + \Sigma (X_2 - \bar X_2)^2}{n_1 + n_2 - 2} = \frac{253.87 + 125.714}{8 + 7 - 2} = 29.199 \quad \therefore\ s = 5.404$$
$$t = \frac{\bar X_1 - \bar X_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{21.625 - 21.571}{5.404\sqrt{\frac{1}{8} + \frac{1}{7}}} = \frac{0.054}{2.797} = 0.019 \sim t\ (n_1 + n_2 - 2\ \text{d.f.})$$

Conclusion. The tabulated value of t at 5% level of significance for 13 d.f. is 2.16. The calculated value of t is less than the tabulated value. H0 is accepted, i.e., there is no significant difference between the population means, i.e., μ1 = μ2. ∴ We conclude that the two samples have been drawn from the same normal population.

Example 3.
Two independent samples of sizes 7 and 6 had the following values:
Sample A : 28 30 32 33 31 29 34
Sample B : 29 30 30 24 27 28
Examine whether the samples have been drawn from normal populations having the same variance.

Sol. H0 : The variances are equal, i.e., $\sigma_1^2 = \sigma_2^2$, i.e., the samples have been drawn from normal populations with the same variance. H1 : $\sigma_1^2 \ne \sigma_2^2$.
Under the null hypothesis, the test statistic is $F = \dfrac{s_1^2}{s_2^2}$ ($s_1^2 > s_2^2$).

Computations for $s_1^2$ and $s_2^2$:

$X_1$ : 28 30 32 33 31 29 34
$X_1 - \bar X_1$ : −3 −1 1 2 0 −2 3
$(X_1 - \bar X_1)^2$ : 9 1 1 4 0 4 9 (total 28)

$X_2$ : 29 30 30 24 27 28
$X_2 - \bar X_2$ : 1 2 2 −4 −1 0
$(X_2 - \bar X_2)^2$ : 1 4 4 16 1 0 (total 26)

$\bar X_1 = 31$, n1 = 7; $\Sigma (X_1 - \bar X_1)^2 = 28$
$\bar X_2 = 28$, n2 = 6; $\Sigma (X_2 - \bar X_2)^2 = 26$
$$s_1^2 = \frac{\Sigma (X_1 - \bar X_1)^2}{n_1 - 1} = \frac{28}{6} = 4.666; \qquad s_2^2 = \frac{\Sigma (X_2 - \bar X_2)^2}{n_2 - 1} = \frac{26}{5} = 5.2$$
$$F = \frac{s_2^2}{s_1^2} = \frac{5.2}{4.666} = 1.1144 \qquad (\because\ s_2^2 > s_1^2)$$

Conclusion. The tabulated value of F at ν1 = 6 − 1 and ν2 = 7 − 1 d.f. for 5% level of significance is 4.39. Since the calculated value of F is less than the tabulated value, H0 is accepted, i.e., there is no significant difference between the variances; the samples have been drawn from normal populations with the same variance.

Example 4. The two random samples reveal the following data:
Sample no. : I II
Size : 16 25
Mean : 440 460
Variance : 40 42
Test whether the samples come from the same normal population.

Sol. A normal population has two parameters, namely, the mean μ and the variance $\sigma^2$. To test whether the two independent samples have been drawn from the same normal population, we have to test (i) the equality of means and (ii) the equality of variances. Since the t-test assumes that the population variances are equal, we first apply the F-test.

F-test. Null hypothesis. $\sigma_1^2 = \sigma_2^2$, i.e., the population variances do not differ significantly. Alternative hypothesis.
$\sigma_1^2 \ne \sigma_2^2$. Under the null hypothesis the test statistic is given by $F = \dfrac{S_1^2}{S_2^2}$ ($S_1^2 > S_2^2$), where $S_i^2 = n_i s_i^2/(n_i - 1)$ is the estimate of the population variance from sample i.
Given n1 = 16, n2 = 25; $s_1^2 = 40$, $s_2^2 = 42$.
$$F = \frac{n_1 s_1^2/(n_1 - 1)}{n_2 s_2^2/(n_2 - 1)} = \frac{16 \times 40}{15} \times \frac{24}{25 \times 42} = 0.9752.$$

Conclusion. The calculated value of F is 0.9752. The tabulated value of F at 16 − 1, 25 − 1 d.f. for 5% level of significance is 2.11. Since the calculated value is less than the tabulated value, H0 is accepted, i.e., the population variances are equal. (Placing the larger estimate in the numerator, as the rule requires, gives F = 1/0.9752 = 1.0254, which is likewise not significant.)

t-test. Null hypothesis H0 : μ1 = μ2, i.e., the population means are equal. Alternative hypothesis H1 : μ1 ≠ μ2.
Given n1 = 16, n2 = 25, $\bar X_1 = 440$, $\bar X_2 = 460$, under the null hypothesis the test statistic is computed as follows:
$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \frac{16 \times 40 + 25 \times 42}{16 + 25 - 2} = 43.333 \quad \therefore\ s = 6.582$$
$$t = \frac{\bar X_1 - \bar X_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{440 - 460}{6.582\sqrt{\frac{1}{16} + \frac{1}{25}}} = -9.490 \quad \text{for } (n_1 + n_2 - 2)\ \text{d.f.}$$

Conclusion. The calculated value of |t| is 9.490. The tabulated value of t at 39 d.f. for 5% level of significance is 1.96. Since the calculated value is greater than the tabulated value, H0 is rejected, i.e., there is a significant difference between the means, i.e., μ1 ≠ μ2. Since there is a significant difference between the means, although no significant difference between the variances, we conclude that the samples do not come from the same normal population.

Example 5. Two random samples drawn from two normal populations have the variable values as below:
Sample I : 19 17 16 28 22 23 19 24 26
Sample II : 28 32 40 37 30 35 40 28 41 45 30 36
Obtain the estimate of the variance of the population and test whether the two populations have the same variance.

Sol.
$\bar X_1 = \dfrac{\Sigma X_1}{n_1} = 21.55$; n1 = 9; $\bar X_2 = \dfrac{\Sigma X_2}{n_2} = 35.166$; n2 = 12.
Deviations are taken from the assumed means A = 17 (Sample I) and A = 28 (Sample II):

$X_1$ : 19 17 16 28 22 23 19 24 26
$d_1 = X_1 - 17$ : 2 0 −1 11 5 6 2 7 9
$d_1^2$ : 4 0 1 121 25 36 4 49 81

$X_2$ : 28 32 40 37 30 35 40 28 41 45 30 36
$d_2 = X_2 - 28$ : 0 4 12 9 2 7 12 0 13 17 2 8
$d_2^2$ : 0 16 144 81 4 49 144 0 169 289 4 64

$\Sigma d_1^2 = 321$; $\Sigma d_2^2 = 964$
$$s_1^2 = \frac{\Sigma (X_1 - \bar X_1)^2}{n_1 - 1} = \frac{\Sigma d_1^2 - n_1(\bar X_1 - A)^2}{n_1 - 1} = \frac{321 - 9(21.55 - 17)^2}{9 - 1} = 16.834$$
$$s_2^2 = \frac{\Sigma (X_2 - \bar X_2)^2}{n_2 - 1} = \frac{\Sigma d_2^2 - n_2(\bar X_2 - A)^2}{n_2 - 1} = \frac{964 - 12(35.166 - 28)^2}{12 - 1} = 31.616$$
$$F = \frac{s_2^2}{s_1^2} = \frac{31.616}{16.834} = 1.878 \qquad (\because\ s_2^2 > s_1^2)$$

Conclusion. The calculated value of F is 1.878. The tabulated value of F for ν2 = 12 − 1 = 11, ν1 = 9 − 1 = 8 d.f. at 5% level of significance is 3.315. Since the calculated value of F is less than the tabulated value, H0 is accepted, i.e., there is no significant difference between the population variances; the two populations have the same variance.

TEST YOUR KNOWLEDGE

1. From the following two sample values find out whether they have come from the same population:
Sample 1 : 17 27 18 25 27 29 27 23
Sample 2 : 16 16 20 16 20 17 15 21 17

2. The daily wages in dollars of skilled workers in two cities are as follows:
           Size of sample of workers    S.D. of wages in the sample
City A :            160                            250
City B :            130                            320

3. The standard deviations calculated from two random samples of sizes 9 and 13 are 2.1 and 1.8 respectively. May the samples be regarded as drawn from normal populations with the same standard deviation?

4. Two independent samples of sizes 8 and 9 had the following values of the variables:
Sample I : 20 30 23 25 21 22 23 24
Sample II : 30 31 32 34 35 29 28 27 26
Do the estimates of the population variance differ significantly?

Answers
1. rejected 2.
accepted 3. accepted 4. accepted

21.82 CHI-SQUARE (χ²) TEST

When a coin is tossed 200 times, theoretical considerations lead us to expect 100 heads and 100 tails. But in practice, these results are rarely achieved. The quantity χ² (the square of the Greek letter chi, pronounced chi-square) describes the magnitude of the discrepancy between theory and observation. If χ² = 0, the observed and expected frequencies completely coincide. The greater the discrepancy between the observed and expected frequencies, the greater the value of χ². Thus χ² affords a measure of the correspondence between theory and observation.

If Oi (i = 1, 2, . . . , n) is a set of observed (experimental) frequencies and Ei (i = 1, 2, . . . , n) is the corresponding set of expected (theoretical or hypothetical) frequencies, then χ² is defined as
$$\chi^2 = \sum_{i=1}^{n} \left[\frac{(O_i - E_i)^2}{E_i}\right]$$
where ΣOi = ΣEi = N (total frequency) and degrees of freedom (d.f.) = (n − 1).

Note. (i) If χ² = 0, the observed and theoretical frequencies agree exactly.
(ii) If χ² > 0, they do not agree exactly.

21.82.1 Degrees of Freedom

While comparing the calculated value of χ² with the table value, we have to determine the degrees of freedom. If we have to choose any four numbers whose sum is 50, we can exercise our independent choice for any three numbers only, the fourth being 50 minus the total of the three numbers selected. Thus, though we are to choose any four numbers, our choice is reduced to three because of an imposed condition. There is only one restraint on our freedom and our degrees of freedom are 4 − 1 = 3. If two restrictions are imposed, our freedom to choose will be further curtailed and the degrees of freedom will be 4 − 2 = 2. In general, the number of degrees of freedom is the total number of observations less the number of independent constraints imposed on the observations.
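The definition of χ² and the single constraint ΣOi = ΣEi can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square` and the coin-toss outcome (115 heads, 85 tails) are assumptions made for the example and do not come from the text.

```python
def chi_square(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for matched frequency lists."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if abs(sum(observed) - sum(expected)) > 1e-9:
        raise ValueError("observed and expected totals must agree")
    # Sum of (O - E)^2 / E over all classes.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # One linear constraint (equal totals), so d.f. = number of classes - 1.
    return statistic, len(observed) - 1


# Hypothetical outcome of 200 tosses of a coin: 115 heads and 85 tails.
stat, dof = chi_square([115, 85], [100, 100])
print(stat, dof)
```

Here each class contributes 15²/100, so the statistic is 4.5 with 1 degree of freedom; it would then be compared with the tabulated χ² value for 1 d.f.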
Degrees of freedom (d.f.) are usually denoted by ν (the letter nu of the Greek alphabet). Thus, ν = n − k, where k is the number of independent constraints in a set of data of n observations.

Note. (i) For a p × q contingency table (p columns and q rows), ν = (p − 1)(q − 1).
(ii) In the case of a contingency table, the expected frequency of any class
$$= \frac{\text{Total of row in which it occurs} \times \text{Total of column in which it occurs}}{\text{Total number of observations}}$$

The χ² test is one of the simplest and the most general tests known. It is applicable to a very large number of problems in practice, which can be summed up under the following heads:
(i) as a test of goodness of fit.
(ii) as a test of independence of attributes.
(iii) as a test of homogeneity of independent estimates of the population variance.
(iv) as a test of the hypothetical value of the population variance $\sigma^2$.
(v) as a test of the homogeneity of independent estimates of the population correlation coefficient.

21.82.2 Conditions for Applying the χ² Test

Following are the conditions that should be satisfied before the χ² test can be applied.
(a) N, the total number of frequencies, should be large. It is difficult to say what constitutes largeness, but as an arbitrary figure, we may say that N should be at least 50, however few the cells.
(b) No theoretical cell frequency should be small. Here again, it is difficult to say what constitutes smallness, but 5 should be regarded as the very minimum and 10 is better. If small theoretical frequencies occur (i.e., < 10), the difficulty is overcome by grouping two or more classes together before calculating (O − E). It is important to remember that the number of degrees of freedom is determined from the number of classes after regrouping.
(c) The constraints on the cell frequencies, if any, should be linear.

Note.
If any one of the theoretical frequencies is less than 5, we apply a correction given by F. Yates, usually known as “Yates’s correction for continuity”: we add 0.5 to the cell frequency that is less than 5 and adjust the remaining cell frequencies suitably so that the marginal totals are not changed.

21.82.3 The χ² Distribution

For large sample sizes, the sampling distribution of χ² can be closely approximated by a continuous curve known as the chi-square distribution. The probability function of the χ² distribution is given by
$$f(\chi^2) = c\,(\chi^2)^{(\nu/2 - 1)}\, e^{-\chi^2/2}$$
where e = 2.71828, ν = number of degrees of freedom, and c = a constant depending only on ν.

Symbolically, the degrees of freedom are denoted by the symbol ν or by d.f. and are obtained by the rule ν = n − k, where k refers to the number of independent constraints.

In general, when we fit a binomial distribution the number of degrees of freedom is one less than the number of classes; when we fit a Poisson distribution, the degrees of freedom are 2 less than the number of classes, because we use the total frequency and the arithmetic mean to get the parameter of the Poisson distribution. When we fit a normal curve, the number of degrees of freedom is 3 less than the number of classes, because in this fitting we use the total frequency, mean, and standard deviation.

If the data are given in a series of n numbers, then degrees of freedom = n − 1.
In the case of the Binomial distribution, d.f. = n − 1.
In the case of the Poisson distribution, d.f. = n − 2.
In the case of the Normal distribution, d.f. = n − 3.

21.82.4 The χ² Test as a Test of Goodness of Fit

The χ² test enables us to ascertain how well theoretical distributions such as the Binomial, Poisson, or Normal fit empirical distributions, i.e., distributions obtained from sample data.
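As a rough sketch of how such a fit might be checked numerically, the Python fragment below fits a Poisson law to a frequency table by equating the parameter m to the sample mean, computes the expected frequencies N·e^(−m)·m^r/r!, and applies the rule d.f. = (number of classes) − 2 stated in Section 21.82.3. The helper name `poisson_expected` and the frequency table are illustrative assumptions, and the sketch does not pool classes with small expected frequencies, which the conditions in Section 21.82.2 would normally call for.

```python
import math


def poisson_expected(values, freqs):
    """Fit a Poisson law by the sample mean; return (mean, expected frequencies)."""
    total = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / total
    expected = [total * math.exp(-mean) * mean ** x / math.factorial(x)
                for x in values]
    return mean, expected


values = [0, 1, 2, 3, 4, 5]       # number of defects per item
freqs = [6, 13, 13, 8, 4, 3]      # illustrative observed frequencies
mean, expected = poisson_expected(values, freqs)

# Chi-square statistic comparing observed and fitted frequencies.
chi2 = sum((o - e) ** 2 / e for o, e in zip(freqs, expected))
# Two constraints were used (total frequency and mean), so d.f. = classes - 2.
dof = len(values) - 2
print(round(mean, 3), [round(e, 2) for e in expected], round(chi2, 3), dof)
```

The computed statistic is then compared with the tabulated χ² value for `dof` degrees of freedom at the chosen level of significance.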
If the calculated value of χ² is less than the table value at a specified level (generally 5%) of significance, the fit is considered to be good, i.e., the divergence between actual and expected frequencies is attributed to fluctuations of simple sampling. If the calculated value of χ² is greater than the table value, the fit is considered to be poor.

ILLUSTRATIVE EXAMPLES

Example 1. The following table gives the number of accidents that took place in an industry during various days of the week. Test whether accidents are uniformly distributed over the week.
Day : Mon Tue Wed Thu Fri Sat
No. of accidents : 14 18 12 11 15 14

Sol. Null hypothesis H0. The accidents are uniformly distributed over the week.
Under this H0, the expected frequency of accidents on each of these days = 84/6 = 14.

Observed frequency $O_i$ : 14 18 12 11 15 14
Expected frequency $E_i$ : 14 14 14 14 14 14
$(O_i - E_i)^2$ : 0 16 4 9 1 0

$$\chi^2 = \frac{\Sigma (O_i - E_i)^2}{E_i} = \frac{30}{14} = 2.1428.$$

Conclusion. The table value of χ² at 5% level for (6 − 1 = 5) d.f. is 11.07. Since the calculated value of χ² is less than the tabulated value, H0 is accepted, i.e., the accidents are uniformly distributed over the week.

Example 2. A die is thrown 276 times and the results of these throws are given below:
No. appeared on the die : 1 2 3 4 5 6
Frequency : 40 32 29 59 57 59
Test whether the die is biased or not.

Sol. Null hypothesis H0. The die is unbiased.
Under this H0, the expected frequency for each digit is 276/6 = 46.
To find the value of χ²:

$O_i$ : 40 32 29 59 57 59
$E_i$ : 46 46 46 46 46 46
$(O_i - E_i)^2$ : 36 196 289 169 121 169

$$\chi^2 = \frac{\Sigma (O_i - E_i)^2}{E_i} = \frac{980}{46} = 21.30.$$

Conclusion. The tabulated value of χ² at 5% level of significance for (6 − 1 = 5) d.f. is 11.07.
Since the calculated value of χ² = 21.30 > 11.07, the tabulated value, H0 is rejected, i.e., the die is biased.

Example 3. The following table shows the distribution of digits in numbers chosen at random from a telephone directory:
Digits : 0 1 2 3 4 5 6 7 8 9
Frequency : 1026 1107 997 966 1075 933 1107 972 964 853
Test whether the digits may be taken to occur equally frequently in the directory.

Sol. Null hypothesis H0. The digits taken in the directory occur with equal frequency, i.e., there is no significant difference between the observed and expected frequencies.
Under H0, the expected frequency of each digit is 10,000/10 = 1000.
To find the value of χ²:

$O_i$ : 1026 1107 997 966 1075 933 1107 972 964 853
$E_i$ : 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000
$(O_i - E_i)^2$ : 676 11449 9 1156 5625 4489 11449 784 1296 21609

$$\chi^2 = \frac{\Sigma (O_i - E_i)^2}{E_i} = \frac{58542}{1000} = 58.542.$$

Conclusion. The tabulated value of χ² at 5% level of significance for 9 d.f. is 16.919. Since the calculated value of χ² is greater than the tabulated value, H0 is rejected, i.e., there is a significant difference between the observed and theoretical frequencies; the digits taken in the directory do not occur with equal frequency.

Example 4. Records taken of the number of male and female births in 800 families having four children are as follows:
No. of male births : 0 1 2 3 4
No. of female births : 4 3 2 1 0
No. of families : 32 178 290 236 94
Test whether the data are consistent with the hypothesis that the binomial law holds and the chance of male birth is equal to that of female birth, namely p = q = 1/2.

Sol. H0 : The data are consistent with the hypothesis of equal probability for male and female births, i.e., p = q = 1/2.
We use the binomial distribution to calculate the theoretical frequencies, given by N(r) = N × P(X = r), where N is the total frequency.
N(r) is the number of families with r male children, and $P(X = r) = {}^{n}C_r\, p^r q^{n-r}$, where p and q are the probabilities of male and female births and n is the number of children.
$$N(0) = \text{No. of families with 0 male children} = 800 \times {}^{4}C_0 \left(\tfrac{1}{2}\right)^4 = 800 \times 1 \times \tfrac{1}{2^4} = 50$$
$$N(1) = 800 \times {}^{4}C_1 \left(\tfrac{1}{2}\right)^1 \left(\tfrac{1}{2}\right)^3 = 200; \qquad N(2) = 800 \times {}^{4}C_2 \left(\tfrac{1}{2}\right)^2 \left(\tfrac{1}{2}\right)^2 = 300$$
$$N(3) = 800 \times {}^{4}C_3 \left(\tfrac{1}{2}\right)^3 \left(\tfrac{1}{2}\right)^1 = 200; \qquad N(4) = 800 \times {}^{4}C_4 \left(\tfrac{1}{2}\right)^4 \left(\tfrac{1}{2}\right)^0 = 50$$

Observed frequency $O_i$ : 32 178 290 236 94
Expected frequency $E_i$ : 50 200 300 200 50
$(O_i - E_i)^2$ : 324 484 100 1296 1936
$(O_i - E_i)^2 / E_i$ : 6.48 2.42 0.333 6.48 38.72

$$\chi^2 = \frac{\Sigma (O_i - E_i)^2}{E_i} = 54.433.$$

Conclusion. The table value of χ² at 5% level of significance for 5 − 1 = 4 d.f. is 9.49. Since the calculated value of χ² is greater than the tabulated value, H0 is rejected, i.e., the data are not consistent with the hypothesis that the binomial law holds with the chance of a male birth equal to that of a female birth.

Note. Since the fitting is binomial, the degrees of freedom ν = n − 1, i.e., ν = 5 − 1 = 4.

Example 5. Verify whether the Poisson distribution can be assumed from the data given below:
No. of defects : 0 1 2 3 4 5
Frequency : 6 13 13 8 4 3

Sol. H0 : The Poisson fit is a good fit to the data.
Mean of the given distribution $= \dfrac{\Sigma f_i x_i}{\Sigma f_i} = \dfrac{94}{47} = 2$.
To fit a Poisson distribution we require m. Parameter $m = \bar x = 2$.
By the Poisson distribution the frequency of r successes is
$$N(r) = N \times e^{-m} \cdot \frac{m^r}{r!}, \quad N \text{ is the total frequency.}$$
$$N(0) = 47 \times e^{-2} \cdot \frac{(2)^0}{0!} = 6.36 \approx 6; \qquad N(1) = 47 \times e^{-2} \cdot \frac{(2)^1}{1!} = 12.72 \approx 13$$
$$N(2) = 47 \times e^{-2} \cdot \frac{(2)^2}{2!} = 12.72 \approx 13; \qquad N(3) = 47 \times e^{-2} \cdot \frac{(2)^3}{3!} = 8.48 \approx 9$$
$$N(4) = 47 \times e^{-2} \cdot \frac{(2)^4}{4!} = 4.24 \approx 4; \qquad N(5) = 47 \times e^{-2} \cdot \frac{(2)^5}{5!} = 1.696 \approx 2.$$
X : 0 1 2 3 4 5
$O_i$ : 6 13 13 8 4 3
$E_i$ : 6.36 12.72 12.72 8.48 4.24 1.696
$(O_i - E_i)^2 / E_i$ : 0.0204 0.00616 0.00616 0.02717 0.0136 1.0026

$$\chi^2 = \frac{\Sigma (O_i - E_i)^2}{E_i} = 1.076.$$

Conclusion. The calculated value of χ² is 1.076. The tabulated value of χ² at 5% level of significance for ν = 6 − 2 = 4 d.f. is 9.49. Since the calculated value of χ² is less than the tabulated value, H0 is accepted, i.e., the Poisson distribution provides a good fit to the data.

Example 6. The theory predicts that the proportion of beans in the four groups G1, G2, G3, G4 should be in the ratio 9 : 3 : 3 : 1. In an experiment with 1600 beans the numbers in the four groups were 882, 313, 287, and 118. Does the experimental result support the theory?

Sol. H0. The experimental result supports the theory, i.e., there is no significant difference between the observed and theoretical frequencies. Under H0, the theoretical frequencies can be calculated as follows:
$$E(G_1) = \frac{1600 \times 9}{16} = 900; \quad E(G_2) = \frac{1600 \times 3}{16} = 300; \quad E(G_3) = \frac{1600 \times 3}{16} = 300; \quad E(G_4) = \frac{1600 \times 1}{16} = 100$$
To calculate the value of χ²:

Observed frequency $O_i$ : 882 313 287 118
Expected frequency $E_i$ : 900 300 300 100
$(O_i - E_i)^2 / E_i$ : 0.36 0.5633 0.5633 3.24

$$\chi^2 = \frac{\Sigma (O_i - E_i)^2}{E_i} = 4.7266.$$

Conclusion. The table value of χ² at 5% level of significance for 3 d.f. is 7.815. Since the calculated value of χ² is less than the tabulated value, H0 is accepted, i.e., the experimental results support the theory.

TEST YOUR KNOWLEDGE

1. The following table gives the frequency of occurrence of the digits 0, 1, . . . , 9 in the last place in four-figure logarithms of numbers 10–99. Examine whether there is any peculiarity.
Digits : 0 1 2 3 4 5 6 7 8 9
Frequency : 6 16 15 10 12 12 3 2 9 5

2. The sales in a supermarket during a week are given below.
Test the hypothesis that the sales do not depend on the day of the week, using a significance level of 0.05.
Days : Mon Tues Wed Thurs Fri Sat
Sales (in $10000) : 65 54 60 56 71 84

3. A survey of 320 families with 5 children each revealed the following information:
No. of boys : 5 4 3 2 1 0
No. of girls : 0 1 2 3 4 5
No. of families : 14 56 110 88 40 12
Is this result consistent with the hypothesis that male and female births are equally probable?

4. 4 coins were tossed at a time and this operation was repeated 160 times. It is found that 4 heads occur 6 times, 3 heads occur 43 times, 2 heads occur 69 times, and one head occurs 34 times. Discuss whether the coins may be regarded as unbiased.

5. Fit a Poisson distribution to the following data and test the goodness of fit:
x : 0 1 2 3 4
f : 109 65 22 3 1

6. In the accounting department of a bank, 100 accounts are selected at random and examined for errors. The following results were obtained:
No. of errors : 0 1 2 3 4 5 6
No. of accounts : 35 40 19 2 0 2 2
Does this information verify that the errors are distributed according to the Poisson probability law?

7. In a sample analysis of examination results of 500 students, it was found that 280 students have failed, 170 have gotten C’s, 90 have gotten B’s, and the rest, A’s. Do these figures support the general belief that the above categories are in the ratio 4 : 3 : 2 : 1 respectively?

Answers
1. no 2. accepted 3. accepted 4. unbiased 5. Poisson law fits the data 6. maybe 7. yes

21.82.5 The χ² Test as a Test of Independence

With the help of the χ² test, we can find whether or not two attributes are associated. We take the null hypothesis that there is no association between the attributes under study, i.e., we assume that the two attributes are independent.
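A minimal Python sketch of this test follows, assuming the expected-frequency rule stated in Section 21.82.1 (row total × column total ÷ total number of observations) and (rows − 1)(columns − 1) degrees of freedom. The function name `independence_chi_square` and the 2 × 2 table of counts are hypothetical and serve only to illustrate the arithmetic.

```python
def independence_chi_square(table):
    """table: list of rows of observed counts. Return (chi-square, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical 2 x 2 table of counts (not taken from the text).
stat, dof = independence_chi_square([[30, 20], [20, 30]])
print(stat, dof)
```

For this table every expected count is 25, each cell contributes 5²/25 = 1, and the statistic is 4.0 with 1 degree of freedom.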
If the calculated value of $\chi^2$ is less than the table value at a specified level (generally 5%) of significance, the hypothesis is accepted, i.e., the attributes are independent and do not bear any association. On the other hand, if the calculated value of $\chi^2$ is greater than the table value at a specified level of significance, we say that the results of the experiment do not support the hypothesis. In other words, the attributes are associated. Thus a very useful application of the $\chi^2$ test is to investigate the relationship between attributes that can be classified into two or more categories.

The sample data are set out in a two-way table, called a contingency table. Let us consider two attributes A and B, with A divided into r classes $A_1, A_2, A_3, \ldots, A_r$ and B divided into s classes $B_1, B_2, B_3, \ldots, B_s$. Let $(A_i)$ and $(B_j)$ denote the numbers of people possessing the attributes $A_i$ and $B_j$ respectively $(i = 1, 2, \ldots, r;\ j = 1, 2, \ldots, s)$, and let $(A_iB_j)$ denote the number of people possessing both attributes $A_i$ and $B_j$. Also we have

$$\sum_{i=1}^{r} (A_i) = \sum_{j=1}^{s} (B_j) = N,$$

where N is the total frequency. The r × s contingency table is given below:

B \ A    A1        A2        A3        . . .    Ar        Total
B1       (A1B1)    (A2B1)    (A3B1)    . . .    (ArB1)    (B1)
B2       (A1B2)    (A2B2)    (A3B2)    . . .    (ArB2)    (B2)
B3       (A1B3)    (A2B3)    (A3B3)    . . .    (ArB3)    (B3)
. . .    . . .     . . .     . . .     . . .    . . .     . . .
Bs       (A1Bs)    (A2Bs)    (A3Bs)    . . .    (ArBs)    (Bs)
Total    (A1)      (A2)      (A3)      . . .    (Ar)      N

$H_0$: Both the attributes are independent, i.e., A and B are independent. Under the null hypothesis, we calculate the expected frequencies as follows:

$$P(A_i) = \text{Probability that a person possesses the attribute } A_i = \frac{(A_i)}{N}, \quad i = 1, 2, \ldots, r$$

$$P(B_j) = \text{Probability that a person possesses the attribute } B_j = \frac{(B_j)}{N}, \quad j = 1, 2, \ldots, s$$

$$P(A_iB_j) = \text{Probability that a person possesses both attributes } A_i \text{ and } B_j = \frac{(A_iB_j)}{N}$$

If $(A_iB_j)_0$ is the expected number of people possessing both the attributes $A_i$ and $B_j$, then

$$(A_iB_j)_0 = N\,P(A_iB_j) = N\,P(A_i)\,P(B_j) \quad (\because A \text{ and } B \text{ are independent})$$

$$= N \cdot \frac{(A_i)}{N} \cdot \frac{(B_j)}{N} = \frac{(A_i)(B_j)}{N}.$$

Hence

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \left[ \frac{\left[(A_iB_j) - (A_iB_j)_0\right]^2}{(A_iB_j)_0} \right],$$

which is distributed as a $\chi^2$ variate with (r − 1)(s − 1) degrees of freedom.

Note 1. For a 2 × 2 contingency table where the frequencies are

a    b
c    d

$\chi^2$ can be calculated directly from the cell frequencies as

$$\chi^2 = \frac{(a + b + c + d)(ad - bc)^2}{(a + b)(c + d)(b + d)(a + c)}.$$

Note 2. If the contingency table is not 2 × 2, then the formula for calculating $\chi^2$ given in Note 1 cannot be used. Instead, we calculate each expected frequency from $(A_iB_j)_0 = \dfrac{(A_i)(B_j)}{N}$, i.e., the expected frequency in each cell is

$$\frac{\text{Product of column total and row total}}{\text{whole total}}.$$

Note 3. If

a    b
c    d

is the 2 × 2 contingency table with two attributes, $Q = \dfrac{ad - bc}{ad + bc}$ is called the coefficient of association. If the attributes are independent, then $\dfrac{a}{b} = \dfrac{c}{d}$.

Note 4. Yates’s Correction. In a 2 × 2 table, if the frequency of a cell is small, we make Yates’s correction to make $\chi^2$ continuous. Decrease by 1/2 those cell frequencies that are greater than the expected frequencies, and increase by 1/2 those that are less than expected. This will not affect the marginal totals. This correction is known as Yates’s correction for continuity.

After Yates’s correction,

$$\chi^2 = \frac{N\left(bc - ad - \tfrac{1}{2}N\right)^2}{(a + c)(b + d)(c + d)(a + b)} \quad \text{when } ad - bc < 0,$$

$$\chi^2 = \frac{N\left(ad - bc - \tfrac{1}{2}N\right)^2}{(a + c)(b + d)(c + d)(a + b)} \quad \text{when } ad - bc > 0.$$

ILLUSTRATIVE EXAMPLES

Example 1.
What are the expected frequencies of the 2 × 2 contingency tables given below:

(i)   a    b          (ii)   2    10
      c    d                 6     6

Sol. (i) Observed frequencies:

a        b        a + b
c        d        c + d
a + c    b + d    a + b + c + d = N

Expected frequencies:

(a + c)(a + b) / (a + b + c + d)    (b + d)(a + b) / (a + b + c + d)
(a + c)(c + d) / (a + b + c + d)    (b + d)(c + d) / (a + b + c + d)

(ii) Observed frequencies:

2     10    12
6      6    12
8     16    24

Expected frequencies:

8 × 12 / 24 = 4    16 × 12 / 24 = 8
8 × 12 / 24 = 4    16 × 12 / 24 = 8

Example 2. From the following table regarding the color of eyes of fathers and sons, test whether the color of the son’s eye is associated with that of the father.

                          Eye color of son
Eye color of father       Light     Not light
Light                     471       51
Not light                 148       230

Sol. Null hypothesis $H_0$. The color of the son’s eye is not associated with that of the father, i.e., they are independent. Under $H_0$, we calculate the expected frequency in each cell as

$$\frac{\text{Product of column total and row total}}{\text{whole total}}.$$

The column totals are 471 + 148 = 619 and 51 + 230 = 281, the row totals are 522 and 378, and N = 900. Expected frequencies are:

                          Eye color of son
Eye color of father       Light                       Not light                   Total
Light                     619 × 522 / 900 = 359.02    281 × 522 / 900 = 162.98    522
Not light                 619 × 378 / 900 = 259.98    281 × 378 / 900 = 118.02    378
Total                     619                         281                         900

$$\chi^2 = \frac{(471 - 359.02)^2}{359.02} + \frac{(51 - 162.98)^2}{162.98} + \frac{(148 - 259.98)^2}{259.98} + \frac{(230 - 118.02)^2}{118.02} = 266.35.$$

Conclusion. The tabulated value of $\chi^2$ at the 5% level for 1 d.f. is 3.841. Since the calculated value of $\chi^2$ is greater than the tabulated value, $H_0$ is rejected. The attributes are dependent, i.e., the color of the son’s eye is associated with that of the father.

Example 3.
The following table gives the number of good and bad parts produced by each of the three shifts in a factory:

                 Good parts    Bad parts    Total
Day shift        960           40           1000
Evening shift    940           50           990
Night shift      950           45           995
Total            2850          135          2985

Test whether or not the production of bad parts is independent of the shift on which they were produced.

Sol. Null hypothesis $H_0$. The production of bad parts is independent of the shift on which they were produced, i.e., the two attributes, production and shifts, are independent.

Under $H_0$,

$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{3} \left[ \frac{\left[(A_iB_j) - (A_iB_j)_0\right]^2}{(A_iB_j)_0} \right].$$

Calculation of expected frequencies

Let A and B be two attributes, namely, production and shifts. A is divided into two classes $A_1, A_2$ (good and bad parts), and B is divided into three classes $B_1, B_2, B_3$ (day, evening, and night shifts).

$$(A_1B_1)_0 = \frac{(A_1)(B_1)}{N} = \frac{2850 \times 1000}{2985} = 954.77$$
$$(A_1B_2)_0 = \frac{(A_1)(B_2)}{N} = \frac{2850 \times 990}{2985} = 945.226$$
$$(A_1B_3)_0 = \frac{(A_1)(B_3)}{N} = \frac{2850 \times 995}{2985} = 950$$
$$(A_2B_1)_0 = \frac{(A_2)(B_1)}{N} = \frac{135 \times 1000}{2985} = 45.226$$
$$(A_2B_2)_0 = \frac{(A_2)(B_2)}{N} = \frac{135 \times 990}{2985} = 44.773$$
$$(A_2B_3)_0 = \frac{(A_2)(B_3)}{N} = \frac{135 \times 995}{2985} = 45$$

To calculate the value of $\chi^2$:

Class     O_i    E_i        (O_i − E_i)^2    (O_i − E_i)^2 / E_i
(A1B1)    960    954.77     27.3529          0.02864
(A1B2)    940    945.226    27.3110          0.02889
(A1B3)    950    950        0                0
(A2B1)    40     45.226     27.3110          0.60388
(A2B2)    50     44.773     27.3215          0.61022
(A2B3)    45     45         0                0
Total                                        1.27163

Conclusion. The tabulated value of $\chi^2$ at the 5% level of significance for (r − 1)(s − 1) = (2 − 1)(3 − 1) = 2 degrees of freedom is 5.991. Since the calculated value of $\chi^2$ is less than the tabulated value, we accept $H_0$, i.e., the production of bad parts is independent of the shift on which they were produced.

Example 4. From the following data, find whether hair color and sex are associated.
Sex \ Color    Fair    Red     Medium    Dark    Black    Total
Boys           592     849     504       119     36       2100
Girls          544     677     451       97      14       1783
Total          1136    1526    955       216     50       3883

Sol. Null hypothesis $H_0$. The two attributes of hair color and sex are not associated, i.e., they are independent.

Let A and B be the attributes of hair color and sex, respectively. A is divided into 5 classes (r = 5). B is divided into 2 classes (s = 2).

∴ Degrees of freedom = (r − 1)(s − 1) = (5 − 1)(2 − 1) = 4.

Under $H_0$, we calculate

$$\chi^2 = \sum_{i=1}^{5} \sum_{j=1}^{2} \frac{\left[(A_iB_j) - (A_iB_j)_0\right]^2}{(A_iB_j)_0}.$$

Calculate the expected frequencies $(A_iB_j)_0$ as follows:

$$(A_1B_1)_0 = \frac{(A_1)(B_1)}{N} = \frac{1136 \times 2100}{3883} = 614.37 \qquad (A_1B_2)_0 = \frac{(A_1)(B_2)}{N} = \frac{1136 \times 1783}{3883} = 521.629$$
$$(A_2B_1)_0 = \frac{(A_2)(B_1)}{N} = \frac{1526 \times 2100}{3883} = 825.29 \qquad (A_2B_2)_0 = \frac{(A_2)(B_2)}{N} = \frac{1526 \times 1783}{3883} = 700.71$$
$$(A_3B_1)_0 = \frac{(A_3)(B_1)}{N} = \frac{955 \times 2100}{3883} = 516.482 \qquad (A_3B_2)_0 = \frac{(A_3)(B_2)}{N} = \frac{955 \times 1783}{3883} = 438.517$$
$$(A_4B_1)_0 = \frac{(A_4)(B_1)}{N} = \frac{216 \times 2100}{3883} = 116.816 \qquad (A_4B_2)_0 = \frac{(A_4)(B_2)}{N} = \frac{216 \times 1783}{3883} = 99.183$$
$$(A_5B_1)_0 = \frac{(A_5)(B_1)}{N} = \frac{50 \times 2100}{3883} = 27.04 \qquad (A_5B_2)_0 = \frac{(A_5)(B_2)}{N} = \frac{50 \times 1783}{3883} = 22.959$$

Calculation of $\chi^2$:

Class    O_i    E_i        (O_i − E_i)^2    (O_i − E_i)^2 / E_i
A1B1     592    614.37     500.416          0.8145
A1B2     544    521.629    500.462          0.959
A2B1     849    825.29     562.1641         0.6812
A2B2     677    700.71     562.1641         0.8023
A3B1     504    516.482    155.800          0.3016
A3B2     451    438.517    155.825          0.3553
A4B1     119    116.816    4.7698           0.0408
A4B2     97     99.183     4.7654           0.0480
A5B1     36     27.04      80.2816          2.9689
A5B2     14     22.959     80.2636          3.495
Total                                       10.4666

$\chi^2 = 10.467$.

Conclusion. The tabulated value of $\chi^2$ at the 5% level of significance for 4 d.f. is 9.488. Since the calculated value of $\chi^2$ is greater than the tabulated value, $H_0$ is rejected, i.e., the two attributes are not independent: hair color and sex are associated.

Example 5.
Can vaccination be regarded as a preventive measure against smallpox, as evidenced by the following data on 1482 people exposed to smallpox in a locality? Of these 1482 people, 368 in all were attacked; 343 were vaccinated, and of these only 35 were attacked.

Sol. For the given data we form the contingency table. Let the two attributes be vaccination (A) and attack by smallpox (B). Each attribute is divided into two classes.

                        Vaccination (A)
Disease smallpox (B)    Vaccinated    Not     Total
Attacked                35            333     368
Not                     308           806     1114
Total                   343           1139    1482

Null hypothesis $H_0$. The two attributes are independent, i.e., vaccination cannot be regarded as a preventive measure against smallpox.

Degrees of freedom $\nu = (r - 1)(s - 1) = (2 - 1)(2 - 1) = 1$.

Under $H_0$,

$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{\left[(A_iB_j) - (A_iB_j)_0\right]^2}{(A_iB_j)_0}.$$

Calculation of expected frequencies

$$(A_1B_1)_0 = \frac{(A_1)(B_1)}{N} = \frac{343 \times 368}{1482} = 85.1713 \qquad (A_1B_2)_0 = \frac{(A_1)(B_2)}{N} = \frac{343 \times 1114}{1482} = 257.828$$
$$(A_2B_1)_0 = \frac{(A_2)(B_1)}{N} = \frac{1139 \times 368}{1482} = 282.828 \qquad (A_2B_2)_0 = \frac{(A_2)(B_2)}{N} = \frac{1139 \times 1114}{1482} = 856.171$$

Calculation of $\chi^2$:

Class     O_i    E_i        (O_i − E_i)^2    (O_i − E_i)^2 / E_i
(A1B1)    35     85.1713    2517.159         29.554
(A1B2)    308    257.828    2517.229         9.7632
(A2B1)    333    282.828    2517.2295        8.9002
(A2B2)    806    856.171    2517.1292        2.9400
Total                                        51.1574

Calculated value of $\chi^2$ = 51.1574.

Conclusion. The tabulated value of $\chi^2$ at the 5% level of significance for 1 d.f. is 3.841. Since the calculated value of $\chi^2$ is greater than the tabulated value, $H_0$ is rejected, i.e., the two attributes are not independent, and vaccination can be regarded as a preventive measure against smallpox.

TEST YOUR KNOWLEDGE

1. In a locality 100 people were randomly selected and asked about their educational achievements.
The results are given below:

            Education
Sex         Middle    High school    College
Male        10        15             25
Female      25        10             15

Based on this information, can you say that education depends on sex?

2. The following data is collected on two characteristics:

              Smokers    Nonsmokers
Literate      83         57
Illiterate    45         68

Based on this information, can you say that there is no relation between the habit of smoking and literacy?

3. 500 students at a school were graded according to their intelligence and the economic conditions of their homes. Examine whether there is any association between economic condition and intelligence, from the following data:

                       Intelligence
Economic conditions    Good    Bad
Rich                   85      75
Poor                   165     175

4. In an experiment on the immunization of goats from anthrax, the following results were obtained. Derive your inferences on the efficiency of the vaccine.

                           Died from anthrax    Survived
Inoculated with vaccine    2                    10
Not inoculated             6                    6

Answers
1. Yes
2. No
3. No
4. Not effective.
________________________________________________________________________________________________________

21.83 Z-TEST

This test is used to test the significance of the correlation coefficient in small samples. If r is the correlation coefficient of the sample and ρ that of the population, calculate the value of

$$\frac{Z - \xi}{1/\sqrt{n - 3}}$$

where

$$Z = \tanh^{-1} r = \frac{1}{2}\log_e\left(\frac{1 + r}{1 - r}\right) \quad \text{or} \quad 1.1513\,\log_{10}\left(\frac{1 + r}{1 - r}\right),$$

$$\xi = \tanh^{-1} \rho = \frac{1}{2}\log_e\left(\frac{1 + \rho}{1 - \rho}\right) \quad \text{or} \quad 1.1513\,\log_{10}\left(\frac{1 + \rho}{1 - \rho}\right),$$

$$\text{S.E.} = \frac{1}{\sqrt{n - 3}}.$$

If the absolute value of $\dfrac{Z - \xi}{\text{S.E.}}$ exceeds 1.96, the difference is significant at the 5% level.

ILLUSTRATIVE EXAMPLES

Example 1. Test the significance of the correlation r = 0.5 from a sample of size 18 against the hypothetical correlation ρ = 0.7.

Sol. We have to test the hypothesis that the correlation in the population is 0.7.
$$Z = \frac{1}{2}\log_e\left(\frac{1 + r}{1 - r}\right) = 1.1513\,\log_{10}\left(\frac{1 + 0.5}{1 - 0.5}\right) = 1.1513\,\log_{10} 3 = 1.1513 \times 0.4771 = 0.549$$

$$\xi = \frac{1}{2}\log_e\left(\frac{1 + \rho}{1 - \rho}\right) = 1.1513\,\log_{10}\left(\frac{1 + 0.7}{1 - 0.7}\right) = 1.1513\,\log_{10} 5.67 = 1.1513 \times 0.7536 = 0.868$$

$$Z - \xi = 0.549 - 0.868 = -0.319$$

$$\text{S.E.} = \frac{1}{\sqrt{n - 3}} = \frac{1}{\sqrt{18 - 3}} = \frac{1}{\sqrt{15}} = 0.26$$

The absolute value of $\dfrac{Z - \xi}{\text{S.E.}} = \dfrac{0.319}{0.26} = 1.23$, which is less than 1.96 (5% level of significance) and is, therefore, not significant. Hence the sample may be regarded as coming from a population with ρ = 0.7.

Example 2. From a sample of 19 pairs of observations, the correlation is 0.5 and the corresponding population value is 0.3. Is the difference significant?

Sol. Here n = 19, r = 0.5, ρ = 0.3.

$$Z = \frac{1}{2}\log_e\left(\frac{1 + r}{1 - r}\right) = 1.1513\,\log_{10}\left(\frac{1 + 0.5}{1 - 0.5}\right) = 1.1513\,\log_{10} 3 = 1.1513 \times 0.4771 = 0.55$$

$$\xi = \frac{1}{2}\log_e\left(\frac{1 + \rho}{1 - \rho}\right) = 1.1513\,\log_{10}\left(\frac{1 + 0.3}{1 - 0.3}\right) = 1.1513\,\log_{10} 1.857 = 1.1513 \times 0.2688 = 0.31$$

$$Z - \xi = 0.55 - 0.31 = 0.24; \qquad \text{S.E.} = \frac{1}{\sqrt{n - 3}} = \frac{1}{\sqrt{19 - 3}} = \frac{1}{4} = 0.25$$

$$\therefore \quad \frac{Z - \xi}{\text{S.E.}} = \frac{0.24}{0.25} = 0.96,$$

which is less than 1.96 (5% level of significance) and is, therefore, not significant. Hence the sample may be regarded as coming from a population with ρ = 0.3.

TEST YOUR KNOWLEDGE

1. A correlation coefficient of 0.72 is obtained from a sample of 29 pairs of observations. Can the sample be regarded as drawn from a bivariate normal population in which the true correlation coefficient is 0.8?

Answer
1. Yes
________________________________________________________________________________________________________
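
The Z-test calculations of Section 21.83 can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the function name `fisher_z_test` and its return layout are illustrative assumptions. It uses `math.atanh` in place of the logarithm tables, so the intermediate values differ slightly from the hand-rounded figures in the examples.

```python
import math


def fisher_z_test(r, rho, n, critical=1.96):
    """Z-test for a sample correlation r against a population value rho.

    Hypothetical helper for illustration: returns (Z, xi, S.E., ratio, significant),
    where ratio = |Z - xi| / S.E. and `significant` compares it with the
    5% critical value 1.96 quoted in the text.
    """
    if n <= 3:
        raise ValueError("the sample size n must exceed 3")
    z = math.atanh(r)                # Z  = (1/2) log_e((1 + r) / (1 - r))
    xi = math.atanh(rho)             # xi = (1/2) log_e((1 + rho) / (1 - rho))
    se = 1.0 / math.sqrt(n - 3)      # S.E. = 1 / sqrt(n - 3)
    ratio = abs(z - xi) / se
    return z, xi, se, ratio, ratio > critical


# Data of Example 1 (r = 0.5, rho = 0.7, n = 18) and Example 2 (r = 0.5, rho = 0.3, n = 19).
for r, rho, n in [(0.5, 0.7, 18), (0.5, 0.3, 19)]:
    z, xi, se, ratio, significant = fisher_z_test(r, rho, n)
    print(f"r={r}, rho={rho}, n={n}: Z={z:.3f}, xi={xi:.3f}, "
          f"S.E.={se:.3f}, ratio={ratio:.2f}, significant={significant}")
```

For both examples the computed ratio stays below 1.96, in agreement with the conclusions reached by hand.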