S2
MATHEMATICS SUPPORT CENTRE

Title: Measures of Spread

Target: On completion of this worksheet you should understand what is meant by a measure of spread and be able to calculate range, interquartile range and standard deviation.

Suppose there are two football teams, A and B, and we need to choose one of them to take part in a competition. In this competition consistency is important. To make our decision we will use the number of goals each team scored in its last 11 matches.

A   4  7  0  1  2  0  6  7  4  2  0
B   3  3  2  3  3  3  2  4  4  3  3

We will first consider the mean number of goals for each team:

$$\text{mean A} = \frac{33}{11} = 3 \qquad \text{mean B} = \frac{33}{11} = 3$$

The means are equal so we must use other criteria. As consistency is important we need to find some way of measuring the spread of the data.

The Range

range = largest value - smallest value

range A = 7 - 0 = 7
range B = 4 - 2 = 2

As team B has a lower range than team A we would choose team B.

In general, although the range is very easy to find, it can be distorted by a single very high or very low value, particularly if there is a large amount of data.

The Interquartile Range

The median divides the data into two halves (see sheet S1). The quartiles, together with the median, divide the data into quarters. To find the median and quartiles the data must first be put in numerical order.

Mathematics Support Centre, Coventry University, 2001

Example

Team A: 0 0 0 1 2 2 4 4 6 7 7

The median is the 6th value, the lower quartile (Q1) is the 3rd value and the upper quartile (Q3) is the 9th value.

0  0  0  1  2  2  4  4  6  7  7
      Q1     median     Q3

We can see that these values divide the data into four equal parts. Another measure of spread is the

interquartile range = Q3 - Q1

interquartile range for A = 6 - 0 = 6

For team B we must first arrange the data in order as before:

2  2  3  3  3  3  3  3  3  4  4
      Q1       M        Q3

The quartiles and median are in the same positions as for team A.

interquartile range for B = 3 - 3 = 0

Exercise

Find the range and interquartile range for the following:

1. The test marks for a group of students are: 50, 42, 76, 38, 12, 56, 62.
2. Bolts are packed in boxes of 10. A sample of boxes was examined and the number of defective bolts in each was: 1, 0, 3, 0, 0, 1, 1, 1, 3, 1, 0, 1, 2, 0, 1.
3. The number of minutes late for a particular bus: 5, 2, 7, 5, 6, 4, 1, 0, 3, 4.

(Answers: 64, 24; 3, 1; 7, 4)

The Standard Deviation

The standard deviation uses all the values and gives a more useful measure of spread. The formula is

$$\text{standard deviation} = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$

where $\bar{x}$ is the mean.

Using the football teams data (mean = 3):

            A                            B
  x    x - mean   (x - mean)²      x    x - mean   (x - mean)²
  4        1           1           3        0           0
  7        4          16           3        0           0
  0       -3           9           2       -1           1
  1       -2           4           3        0           0
  2       -1           1           3        0           0
  0       -3           9           3        0           0
  6        3           9           2       -1           1
  7        4          16           4        1           1
  4        1           1           4        1           1
  2       -1           1           3        0           0
  0       -3           9           3        0           0
       Σ(x - mean)² = 76                Σ(x - mean)² = 4

$$\text{Standard deviation A} = \sigma = \sqrt{\frac{76}{11}} = 2.63$$

$$\text{Standard deviation B} = \sigma = \sqrt{\frac{4}{11}} = 0.60$$

Note the use of σ (sigma) for the standard deviation.

The formula can also be written as

$$\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$$

Using this with the above example gives, for team B (where Σx² = 103 and Σx = 33),

$$\text{Standard deviation B} = \sigma = \sqrt{\frac{103}{11} - \left(\frac{33}{11}\right)^2} = 0.60$$

Exercise

Find the standard deviation for the questions in the previous exercise using both formulae.

(Answers: 18.8, 0.966, 2.1)

For a frequency distribution the formula is

$$\text{standard deviation} = \sigma = \sqrt{\frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2}$$

This formula is also used for a grouped frequency distribution where x is the mid-point of the interval.

You can use your calculator to work out the mean and standard deviation. First put your calculator into statistics mode and then clear all memories. Each x value is entered (× frequency if needed) followed by the 'DATA' key. How to display the mean and standard deviation depends on your calculator.

Example

Find the standard deviation:

Goals scored   Number of matches
     x                 f              fx         fx²
     0                10               0           0
     1                29              29          29
     2                32              64         128
     3                23              69         207
     4                 6              24          96
                 Σf = 100       Σfx = 186   Σfx² = 460

$$\text{standard deviation} = \sqrt{\frac{460}{100} - \left(\frac{186}{100}\right)^2} = 1.068$$

This is the standard deviation for the population of 100 matches, i.e. for the data given. If we want to use this data as a sample to estimate the standard deviation for the population of all matches then we must use a slightly different formula which gives a better estimate. We use s to denote this standard deviation.

$$s = \sqrt{\frac{\sum f x^2 - \dfrac{\left(\sum f x\right)^2}{\sum f}}{\sum f - 1}}$$

For the example above,

$$s = \sqrt{\frac{460 - \dfrac{186^2}{100}}{100 - 1}} = 1.073$$

As the number of values is large there is little difference between the two standard deviations, but this second formula should be used when estimating a population standard deviation.

Exercise

Estimate the population standard deviation for the following sets of data:

1.   x     4     5     6     7     8
     f     2     7     8     3     2

2.   x    23    24    25    26    27
     f     9    21    24    13     3

3.   x    1-3   4-6   7-9   10-12  13-15
     f     5    24    27     15      9

(Answers: 1.10, 1.05, 3.29)

The variance is the square of the standard deviation, so the estimated population variance is

$$\text{variance} = s^2 = \frac{\sum f x^2 - \dfrac{\left(\sum f x\right)^2}{\sum f}}{\sum f - 1}$$
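The worked values in this worksheet can also be checked with a short script. The following is a minimal sketch in Python using only the standard library, not part of the original worksheet. The function names `spread_summary` and `frequency_sds` are hypothetical, chosen for illustration. It assumes the quartiles sit at positions (n+1)/4 and 3(n+1)/4 in the ordered data, which matches the 3rd and 9th values of 11 used for the football teams; other quartile conventions can give different interquartile ranges when these positions are not whole numbers.

```python
import math
import statistics


def spread_summary(values):
    """Return (range, interquartile range, population sd) for raw data."""
    ordered = sorted(values)
    data_range = ordered[-1] - ordered[0]
    # Assumption: method="exclusive" puts Q1 and Q3 at positions (n+1)/4 and
    # 3(n+1)/4, i.e. the 3rd and 9th of 11 ordered values.
    q1, _median, q3 = statistics.quantiles(ordered, n=4, method="exclusive")
    # pstdev divides by n, matching the sigma formula.
    return data_range, q3 - q1, statistics.pstdev(ordered)


def frequency_sds(xs, fs):
    """Return (sigma, s) for values xs with frequencies fs."""
    n = sum(fs)
    sum_fx = sum(f * x for x, f in zip(xs, fs))
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))
    # sigma: standard deviation of the data given (divide by n).
    sigma = math.sqrt(sum_fx2 / n - (sum_fx / n) ** 2)
    # s: estimate of the population standard deviation (divide by n - 1).
    s = math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))
    return sigma, s


team_a = [4, 7, 0, 1, 2, 0, 6, 7, 4, 2, 0]
team_b = [3, 3, 2, 3, 3, 3, 2, 4, 4, 3, 3]

for name, goals in (("A", team_a), ("B", team_b)):
    data_range, iqr, sd = spread_summary(goals)
    print(f"Team {name}: range={data_range}, IQR={iqr}, sd={sd:.2f}")

# Goals-scored frequency table from the worked example.
sigma, s = frequency_sds([0, 1, 2, 3, 4], [10, 29, 32, 23, 6])
print(f"sigma={sigma:.3f}, s={s:.3f}")
```

Rounded to the same number of places, the script should agree with the worked values: ranges 7 and 2, interquartile ranges 6 and 0, standard deviations 2.63 and 0.60 for the two teams, and 1.068 and 1.073 for the frequency table.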