INTRODUCTION TO STATISTICS
ST.PAULS UNIVERSITY

MEASURES OF DISPERSION

Definition of dispersion
- It is the degree to which numerical data tend to spread about an average value.
- It is the extent of the scatteredness of items around a measure of central tendency.

Significance of measuring dispersion
- To determine the reliability of an average
- To serve as a basis for the control of the variability
- To compare two or more series with regard to their variability
- To facilitate the use of other statistical measures

Properties of a good measure of dispersion
It should:
- Be simple to understand
- Be easy to compute
- Be rigidly defined
- Be based on each and every item in the distribution
- Be amenable to further algebraic calculations
- Have sampling stability
- Not be unduly affected by extreme values

Measures of dispersion
- Range
- Quartile deviation
- Mean deviation
- Standard deviation

The Range
It is the difference between the smallest value and the largest value of a series.

Advantages of the range
- It is the simplest to understand and compute.
- It takes the minimum time to calculate the value of the range.

Limitations
- It is not based on each and every value of the distribution.
- It is subject to fluctuations of considerable magnitude from sample to sample.
- It cannot be computed in the case of open-ended distributions.
- It does not explain or indicate anything about the character of the distribution within the two extreme observations.

Uses of the range
- Quality control
- Fluctuations of prices
- Weather forecasts
- Finding the difference between two values, e.g. wages earned by different employees

The standard deviation
It is the square root of the arithmetic average of the squares of the deviations measured from the mean. It measures how much "spread" or "variability" is present in the sample. A small standard deviation means a high degree of uniformity of the observations as well as homogeneity of the series, and vice versa.
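As a minimal sketch of the two definitions above, the following Python computes the range and the standard deviation of ungrouped data. The function names and the `marks` values are hypothetical illustrations and do not come from the notes; the standard deviation divides by n, following the definition given here.

```python
import math


def value_range(values):
    """Range: the largest value minus the smallest value of the series."""
    return max(values) - min(values)


def standard_deviation(values):
    """Square root of the arithmetic average of the squared deviations
    from the mean (divides by n, as in the definition above)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / n)


# Hypothetical data chosen so the arithmetic is easy to check by hand:
# the mean is 5 and the squared deviations sum to 32, so sigma = sqrt(32/8).
marks = [2, 4, 4, 4, 5, 5, 7, 9]

print(value_range(marks))         # 7
print(standard_deviation(marks))  # 2.0
```

The range uses only the two extreme observations, whereas the standard deviation uses every value, which is why the notes treat it as the more reliable measure.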
Ways of computing the standard deviation

Direct method

Ungrouped data:
\[ \sigma = \sqrt{\frac{\sum d_x^2}{n}} \]
where $\sum d_x^2$ = sum of squares of the deviations from the arithmetic mean.

Grouped data:
\[ \sigma = \sqrt{\frac{\sum f d_x^2}{n}} \]

Indirect method

Ungrouped data:
\[ \sigma = \sqrt{\frac{\sum D_x^2}{n} - \left(\frac{\sum D_x}{n}\right)^2} \]
where $D_x$ are deviations from an assumed mean.

Grouped data:
\[ \sigma = \sqrt{\frac{\sum f D_x^2}{n} - \left(\frac{\sum f D_x}{n}\right)^2} \]

Step deviation method:
\[ \sigma = i \times \sqrt{\frac{\sum f D_x'^2}{n} - \left(\frac{\sum f D_x'}{n}\right)^2} \]
where $i$ = common factor and $D_x'$ = step deviations from the assumed mean.

Coefficient of standard deviation
\[ \text{Coefficient of standard deviation} = \frac{\sigma}{\text{Mean}} \]
\[ \text{Coefficient of variation} = \frac{\sigma}{\text{Mean}} \times 100 \]

Combined arithmetic mean and combined standard deviation

The combined arithmetic mean for two sets of data with arithmetic means $\bar{x}_1$, $\bar{x}_2$ and numbers of observations $n_1$, $n_2$ is given by
\[ \bar{X} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} \]

The combined standard deviation of two series (with $n_1$ and $n_2$ large) is given by
\[ S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}} \]
where $d_1 = \bar{x}_1 - \bar{X}$ and $d_2 = \bar{x}_2 - \bar{X}$.

Example
An analysis of the monthly wages paid to workers of two firms, A and B, belonging to the same industry gives the following results:

                          Firm A    Firm B
No. of wage earners       586       648
Average monthly wage      52.5      47.5
Standard deviation        10        11

Compute the combined standard deviation.

Advantages of the standard deviation
- It is rigidly defined and is based on all the observations of the series.
- It is used in other statistical techniques such as correlation and regression analysis and sampling theory.
- It is possible to calculate the combined standard deviation of two or more groups.

Disadvantages of the standard deviation
- It cannot be used for comparing the dispersion of two or more series of observations given in different units.
- It gives more weight to extreme values.

Examples

1. The following marks belong to 99 students of a secondary school in Keroka Municipality.

Marks       Number of students
0 – 10      10
10 – 20     ?
20 – 30     25
30 – 40     30
40 – 50     ?
50 – 60     10

On later analysis, it was discovered that two class interval frequencies were missing.
The median score was found to be 30.
Required:
i. Find the missing frequencies.
ii. Determine the modal mark of the students.
iii. Find the mean mark.
iv. Find the standard deviation.

2. The following table indicates the marks obtained by students in a statistics test.

Marks       Number of students
0 – 20      5
20 – 40     7
40 – 60     ?
60 – 80     8
80 – 100    7

The arithmetic mean for the class was 52.5 marks. You are required to determine the value of:
i. The missing frequency
ii. The median mark
iii. The modal mark
iv. The standard deviation

3. The following are the daily wages in Ksh. of 30 workers of a flower farm in Ruiru, which grows flowers and exports them to European countries.

140 139 126 114 100  88  62  77  99 103
108 129 144 148 134  63  69 148 132 118
142 116 123 104  95  80  85 106 123 133

The company (flower farm) gives bonuses of Sh. 10, 15, 20, 25, 30 and 35 to individuals in the respective wage groups: exceeding 60 but not exceeding 75, exceeding 75 but not exceeding 90, and so on up to exceeding 135 but not exceeding 150.
Required to calculate:
i. The average wage and average bonus paid by the flower farm
ii. The median wage and median bonus
iii. The modal wage and modal bonus
iv. The standard deviation for the wages and also for the bonus
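Returning to the two-firm wage example given earlier in these notes, the combined mean, combined standard deviation and coefficient of variation can be sketched in Python as follows. This is one possible working under the formulas stated in the notes; the function names are hypothetical, and only the figures for Firm A and Firm B come from the example.

```python
import math


def combined_mean(n1, mean1, n2, mean2):
    """Combined arithmetic mean of two series."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


def combined_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combined standard deviation of two series, where d1 and d2 are the
    differences between each series mean and the combined mean."""
    overall = combined_mean(n1, mean1, n2, mean2)
    d1 = mean1 - overall
    d2 = mean2 - overall
    numerator = n1 * sd1**2 + n2 * sd2**2 + n1 * d1**2 + n2 * d2**2
    return math.sqrt(numerator / (n1 + n2))


def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100


# Figures from the two-firm example: (wage earners, average wage, std dev).
mean_ab = combined_mean(586, 52.5, 648, 47.5)
sd_ab = combined_sd(586, 52.5, 10, 648, 47.5, 11)
cv_a = coefficient_of_variation(10, 52.5)
cv_b = coefficient_of_variation(11, 47.5)

print(round(mean_ab, 2), round(sd_ab, 2))  # 49.87 10.83
print(round(cv_a, 2), round(cv_b, 2))      # 19.05 23.16
```

Under these formulas the combined standard deviation comes to roughly 10.83 around a combined mean of about 49.87. Firm B has the larger coefficient of variation, so its wages are relatively more variable than Firm A's.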