PROGRAMME 27
STATISTICS (contd)

STROUD Worked examples and exercises are in the text

Programme 27: Statistics

(REMINDER) Mean

The arithmetic mean of a set of n observations is their average:

\[ \text{mean} = \frac{\text{sum of observations}}{\text{number of observations}}, \quad \text{that is} \quad \bar{x} = \frac{\sum x}{n} \]

When calculating from a frequency distribution, this becomes:

\[ \bar{x} = \frac{\sum x f}{n} = \frac{\sum x f}{\sum f} \]

[Here x now means not the individual observations, but the different values for which frequencies are counted – J.A.B.]

(Simple) Coding Method for Calculating a Mean Manually or Mentally
[Slide added by J.A.B.]

The textbook mentions a “coding” method for calculating the mean. In class I go through a simplified, very useful version of this.

It’s easy: instead of averaging the values themselves directly, you take a convenient number, the “base”, that is roughly in the middle of, or near to, the values. You work out their (positive or negative) deviations from that base value, take the average of those deviations, and then add that average to the base. The result is the average of the original values.

Exercise: try it with 8 values between, say, 50 and 85, using, say, 60 or 70 as the base. Compare the difficulty of doing this with adding the values and dividing by 8. Also check that it doesn’t matter what base you choose, leaving you free to pick a convenient round number.

Exercise: explain why the method works in general.

(NEW) Mode of a set of data

The mode of a set of data is that value of the variable that occurs most often.

The mode of: 2, 2, 6, 7, 7, 7, 10, 13 is clearly 7.

The mode may not be unique. For instance, the modes of: 23, 25, 25, 25, 27, 27, 28, 28, 28 are 25 and 28.

Modal Class of a grouped frequency distribution

The modal class of grouped data is the class with the greatest population.
For example, the modal class of: [frequency table not reproduced] is the third class.

Mode of a grouped frequency distribution

Plotting the histogram of the data enables the mode to be found: [histogram not reproduced]

Mode of a grouped frequency distribution, contd

The mode can also be calculated algebraically. If:

L = lower boundary value
l = AB = difference in frequency on the lower boundary
u = CD = difference in frequency on the upper boundary
c = class interval

the mode is then:

\[ \text{mode} = L + \left( \frac{l}{l + u} \right) c \]

Mode of a grouped frequency distribution, contd

For example, for the modal class of: [frequency table not reproduced]

L = ......   l = ......   u = ......   c = ......

\[ \text{mode} = L + \left( \frac{l}{l + u} \right) c \]

Mode of a grouped frequency distribution, contd

For example, for the modal class of: [frequency table not reproduced]

L = 15.5
l = 16 − 7 = 9
u = 16 − 10 = 6
c = 3

\[ \text{mode} = L + \left( \frac{l}{l + u} \right) c = 15.5 + \frac{9}{9 + 6} \times 3 = 15.5 + 1.8 = 17.3 \]

Median of a set of data

The median is the value of the middle datum when the data are arranged in ascending or descending order. If there is an even number of values, the median is the average of the two middle data.
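The measures of central tendency met so far can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function names (`coded_mean`, `modes`, `grouped_mode`) and the `marks` list are illustrative assumptions, while the two mode data sets and the values L = 15.5, l = 9, u = 6, c = 3 are taken from the slides above.

```python
from collections import Counter
from statistics import median


def coded_mean(values, base):
    """Mean via the coding method: base plus the average deviation from it."""
    deviations = [v - base for v in values]
    return base + sum(deviations) / len(deviations)


def modes(values):
    """Return every value that attains the highest frequency, sorted."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def grouped_mode(L, l, u, c):
    """Mode of grouped data: L + l / (l + u) * c, with symbols as on the slide."""
    return L + l / (l + u) * c


# Hypothetical values between 50 and 85 (invented for illustration only).
marks = [52, 58, 63, 67, 70, 74, 79, 85]

# The base does not affect the result; all three figures agree.
print(coded_mean(marks, 60), coded_mean(marks, 70), sum(marks) / len(marks))

# The two data sets from the mode slide.
print(modes([2, 2, 6, 7, 7, 7, 10, 13]))
print(modes([23, 25, 25, 25, 27, 27, 28, 28, 28]))

# The worked grouped-mode example: l = 16 - 7, u = 16 - 10.
print(round(grouped_mode(L=15.5, l=16 - 7, u=16 - 10, c=3), 2))

# Median of an even-sized data set: the average of the two middle values.
print(median([2, 2, 6, 7, 7, 7, 10, 13]))
```

Returning a list from `modes` is a design choice made here so that a non-unique mode, as in the second data set, is reported in full rather than reduced to a single value.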
Median with grouped data

In the case of grouped data the median divides the population of the largest block of the histogram into two parts, A and B:

Class frequencies: 6, 12, 15, (A + B), 13, 9, 5 [histogram not reproduced]

In this frequency distribution A + B = 20. The total frequency is 80, so 40 observations lie below the median; the first three classes account for 6 + 12 + 15 = 33 of them, so that A = 40 − 33 = 7:

\[ \text{width of } A = \frac{7}{20} \times \text{class interval} = 0.35 \times 0.3 = 0.105 \]

Therefore, Median = 30.85 + 0.105 = 30.955 ≈ 30.96

Programme 27: Statistics (outline)

Introduction
Arrangement of data
Histograms
Measure of central tendency
Dispersion

Dispersion (outline)

Range
Standard deviation
Alternative formula for the standard deviation

Dispersion: Range

The mean, mode and median give important information about the central tendency of data, but they tell us nothing about the spread, or dispersion, about the centre.

For example, the two sets of data:

26, 27, 28, 29, 30 and 5, 19, 20, 36, 60

both have a mean of 28, but one is clearly more tightly arranged about the mean than the other.

The simplest measure of dispersion is the range: the difference between the highest and the lowest values.

Dispersion: Standard deviation

The standard deviation is the most widely used measure of dispersion.

The variance of a set of data is the average of the squared differences between each datum and the mean:

\[ \text{variance} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n} \]

This has the disadvantage of being measured in the square of the units of the data.
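The grouped-median interpolation and the range and variance comparison above can be reproduced with a short Python sketch. It is an illustration only: the function names are assumptions, and 30.85 and 0.3 are treated as the lower boundary and width of the median class, which is what the worked figures imply. The variance divides by n, as on the slide.

```python
def median_class_index(freqs):
    """Index of the class in which the cumulative frequency first reaches n/2."""
    half = sum(freqs) / 2
    running = 0
    for k, f in enumerate(freqs):
        running += f
        if running >= half:
            return k
    raise ValueError("empty frequency list")


def grouped_median(freqs, lower, width):
    """Interpolate inside the median class; `lower` is that class's lower boundary."""
    k = median_class_index(freqs)
    below = sum(freqs[:k])          # observations in the earlier classes
    a = sum(freqs) / 2 - below      # part A of the median class
    return lower + a / freqs[k] * width


def data_range(values):
    """Difference between the highest and the lowest values."""
    return max(values) - min(values)


def variance(values):
    """Average squared deviation from the mean (divides by n)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


# Grouped median: frequencies from the slide, with A + B = 20 as the fourth class.
print(round(grouped_median([6, 12, 15, 20, 13, 9, 5], lower=30.85, width=0.3), 3))

# The two data sets with the same mean but different spread.
tight = [26, 27, 28, 29, 30]
wide = [5, 19, 20, 36, 60]
for data in (tight, wide):
    print(sum(data) / len(data), data_range(data), round(variance(data), 1))
```

Both data sets report a mean of 28, while the range (4 against 55) and the variance (2 against 352.4) separate them clearly.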
The standard deviation is the square root of the variance:

\[ \text{standard deviation} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}} \]

Dispersion: Alternative formula for the standard deviation

Since:

\[ \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} = \frac{\sum_{i=1}^{n} \left( x_i^2 - 2 x_i \bar{x} + \bar{x}^2 \right)}{n} = \frac{\sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + n\bar{x}^2}{n} = \frac{\sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2}{n} = \frac{\sum_{i=1}^{n} x_i^2}{n} - \bar{x}^2 \]

That is:

\[ \text{standard deviation} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n} - \bar{x}^2} = \sqrt{\overline{x^2} - \bar{x}^2} \]
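As a numerical check that the two forms of the standard deviation agree, here is a minimal Python sketch. The function names are illustrative assumptions, and the data are the two sets from the Range slide.

```python
from math import isclose, sqrt


def std_dev_definition(values):
    """Square root of the average squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)


def std_dev_alternative(values):
    """Square root of (mean of the squares) minus (square of the mean)."""
    n = len(values)
    mean = sum(values) / n
    mean_of_squares = sum(x * x for x in values) / n
    return sqrt(mean_of_squares - mean ** 2)


for data in ([26, 27, 28, 29, 30], [5, 19, 20, 36, 60]):
    a = std_dev_definition(data)
    b = std_dev_alternative(data)
    print(a, b, isclose(a, b))
```

The alternative form is convenient by hand because it needs only the running totals of x and x². In floating-point arithmetic it can lose accuracy when the mean is large compared with the spread, so the definition form is usually the safer choice in code.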