Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Topic 3: Measures of Central Tendency With any set of numerical data, there is always a temptation to summarise the data to a single value that attempts to be a typical value for the data. There are three commonly used measures to describe the 'central' or 'typical' value of the data, the Mean, Median and Mode. Mean The mean is commonly referred to as the average. The mean is the most common measure used. It is found by adding up all the values and dividing by the number of values. Because of this, every value contributes to the mean. For a set of values written as X1, X2, X3, X4, X5,………. Xn, the sample mean is calculated using the equation: = X sum of the values = number of values in the sample ∑X n where X is the mean of the n values in the sample. The symbol Σ is shorthand for "the sum of". The equation for the population mean uses some different pronumerals. = m sum of the values = number of values in the population ∑X N where μ is the mean of the N values in the population. Because the calculation of the sample and population means is the same (except for the pronumerals used in the equation), calculators have just one key to cover both. Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 1 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy Finding the mean of Individual Values This data is the test results of a sample 15 students on a maths test (in order). 23, 45, 50, 54, 55, 57, 59, 59, 59, 61, 63, 75, 75, 81, 90. X= ∑X n 23 + 45 + 50 + ............. + 75 + 81 + 90 X= 15 X = 60.4 The mean can easily be calculated on a calculator, especially a scientific calculator with STATS mode (covered later). Finding the mean from a Frequency Distribution Table Let’s recall the data about the number of matches in 50 boxes which was organised in a FDT with no grouping. The table has a new column added headed f x x. This column is the score multiplied by the frequency. f ×X Number of Matches (X) Frequency (f) 46 2 2 x 46 = 92 47 3 3 x 47 = 141 48 5 240 49 6 294 50 16 800 51 6 306 52 4 208 53 3 159 54 3 162 55 2 110 Total 50 Σf = ΣfX = 2512 92 represents the total of all the 46s in the data. 141 represents the total of all the 47s in the data. 2512 is the total of all the values in the data. The mean number of matches per box is given by the equation: sum of the values number of values in the sample ΣfX X= Σf 2512 X= 50 X = 50.24 X= Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 2 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy The mean number per box is 50.24 matches. Now let’s look at the example about the money spent by children at the show. This was organised using a Grouped FDT. Finding the mean is performed in a similar manner to above except the Group Midpoint is used to represent the group. Group Midpoint (X) f ×X Amount Spent$ Frequency (f) $0 but less than $10 2 5 2 x 5 = 10 $10 but less than $20 3 15 3 x 15 = 45 $20 but less than $30 5 25 125 $30 but less than $40 4 35 140 $40 but less than $50 2 45 90 $50 but less than $60 8 55 440 $60 but less than $70 14 65 910 $70 but less than $80 18 75 1350 $80 but less than $90 12 85 1020 $90 but less than $100 2 95 190 Total 70 Σf = ΣfX = 4320 The mean amount spent is given by the equation: sum of the values number of values in the sample ΣfX X= Σf 4320 X= 70 X = $61.71 (to the nearest cent) X= Remember this method assumes that the values in each group are evenly spread. This assumption is not always true so the figure obtained for the mean using this method can slightly different to the mean if the original values where used. Finding the mean using a Calculator A scientific calculator is usually capable of performing this operation when STATS mode is used. Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 3 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy The instructions below apply to the Casio fx-82AU and the Sharp EL531 only. If you have a different scientific calculator to this, you will need to read the user's guide for your calculator. There are 3 steps to using a scientific calculator in STATS mode: 1. Getting the calculator into STATS mode. 2. Entering the values (data). 3. Obtaining the statistical measure required. On a Casio fx-82AU: On a Sharp EL531 To get this calculator into STATS mode To get this calculator into STATS mode Press Mode . The display will be: Press Mode . Press 1 for STAT Press 0 for Standard Deviation (+ more) The calculator should display Stat 0 in the top left of the display. Press 2 for STAT Press 1 for 1-VAR The calculator should display a small STAT on the top line of the display. Now your calculator is in STATs mode, the next stage is to enter the data. There are slight differences depending on how the data is organised. Finding the Mean of Individual Values This data is the test results of a sample 15 students on a maths test (in order). 23, 45, 50, 54, 55, 57, 59, 59, 59, 61, 63, 75, 75, 81, 90. Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 4 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy On a Casio fx-82AU: On a Sharp EL531 Each number is entered into the list, 23 =, 45 =: Each number from the list is entered as: 23 then press M+ the calculator display will show 45 then press If you make a mistake, use the big round blue ‘Replay’ button to navigate back to the position of the incorrect value and enter the correct value. 1 M+ the calculator display will show . . . . . 90 then press DATA SET = DATA SET= 2 DATA SET= 15 M+ the calculator display will show DATA SET= The display 15 informs you that 15 numbers have been entered into the STATs memory. If you make a mistake and need to start again, the STATs memory is cleared by pressing 2ndF CA Now the data is entered, the calculated value of the sample and population standard deviations can be obtained. On a Casio fx-82AU: On a Sharp EL531 When all the values have been entered, press AC This is important. To obtain the value of the population standard deviation; press Now press SHIFT 1 . This is the same as population standard deviation. SHIFT STAT . The calculator display will be as shown below. RCL σ x the symbol σ is used to represent the To obtain the value of the sample standard deviation; press RCL sx the symbol s is used to represent the sample standard deviation. (Think of RCL as recall) Other values are found by; To obtain the value of the mean; press RCL x the symbol x means the mean. To obtain the number of values entered; press Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 5 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy RCL n the symbol n means the number of numbers entered. To obtain the sum of the values entered; press RCL Σx the symbol Σx means the sum of the numbers entered. To obtain the sum of the squares of the values entered; press RCL Σx 2 the symbol Σx 2 means the sum of the squares of the numbers entered. Now press 4 to get more options. Press 2 = – for the mean Press 3 = – for the population standard dev. Press 4 = – for the sample standard dev. Always press = to get the value required. Do the calculation yourself, the values obtained should be: Mean = 60.4 Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 6 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy Finding the Mean from a Frequency Distribution Table Let’s recall the data about the number of matches in 50 boxes which was organised in a FDT with no grouping. Number of Matches (X) Frequency (f) 46 2 47 3 48 5 49 6 50 16 51 6 52 4 53 3 54 3 55 2 Total Σf = 50 It is worth recalling that this table is informing us that there are 2 occurrences of the value 46, 3 occurrences of 47, 5 occurrences of 48, …. Instead of entering the value of 50, 16 times a modified entry process is used. The process for finding the standard deviation is exactly the same. Remember to clear STATs memory first. With a Frequency Distribution Table, the data is entered into the Sharp EL531 calculator in the format: score, frequency . On the Casio fx-82AU, the table the values are entered into needs to be changed into a frequency like the example above. To do this: Press SHIFT MODE . Eight options will be displayed. Press the big blue key down . This will take you to another screen with 5 options. Press 3: STAT The screen will then display Press 1: ON This will add a frequency column Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 7 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy On a Casio fx-82AU: On a Sharp EL531 Get the calculator into STATS mode: Each row from the table is entered as: Press Mode . The display will be: 46 , 2 M + the calculator display will show DATA SET = 1 47 , 3 M + the calculator display will show DATA SET= 2 DATA SET= Press 2 for STAT The display 2 informs you that 2 rows have been entered into the STATs memory. 48 , 5 M + Press 1 for 1-VAR This time the calculator will display an X column and a FREQ column. Enter a value followed by =. Navigate to the frequency column using the big blue REPLAY key. the calculator display will show . . . . . . . DATA SET= 3 55 , 2 M + the calculator display will show DATA SET= 15 If you make a mistake and need to start again, the STATs memory is cleared by pressing 2ndF CA Obtaining values for the mean is exactly the same process as in the previous section. Do the calculation yourself, the values obtained should be: Mean = 50.24 Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 8 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy If the data is grouped, the process is almost identical except that the group midpoint is used. Amount Spent$ Group Midpoint (X) Frequency (f) $0 but less than $10 5 2 $10 but less than $20 15 3 $20 but less than $30 25 5 $30 but less than $40 35 4 $40 but less than $50 45 2 $50 but less than $60 55 8 $60 but less than $70 65 14 $70 but less than $80 75 18 $80 but less than $90 85 12 $90 but less than $100 95 2 Total Σf = 70 The data is entered into the calculator as Group Midpoint, Frequency or Group Midpoint; Frequency depending on the calculator used. Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 9 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy Median The median is the middle value of the data if the values are arranged in order. As a measure of centre, it is a value based on its position; it is not influenced by the size of the values that are above or below it. Therefore it is quite different to the mean because it is based on position and for the mean, every value contributes. Finding the median of Individual Values This data is the test results of a sample 15 students on a maths test (in order). 23, 45, 50, 54, 55, 57, 59, 59, 59, 61, 63, 75, 75, 81, 90. When finding the median for a set of individual values, the first step is to order the values. Usually this is done in ascending order, but descending order gives the same result. The position of the median can always be found by using a simple expression: The position of the median is n +1 . 2 As there are 15 values, the position of the median is: n + 1 15 + 1 16 = = = 8 2 2 2 th The 8 value is the median. The median is the 8th value, which is 59. The median is the 8th value, so there are 7 values before it and 7 values after it. 23, 45, 50, 54, 55, 57, 59 59 59, 61, 63, 75, 75, 81, 90 7 values median 7 values = 15 values The median cannot be calculated on a scientific calculator. Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 10 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy Finding the median from a Frequency Distribution Table Let’s recall the data about the number of matches in 50 boxes which was organised in a FDT with no grouping. To find the median, the cumulative frequency column is required. Number of Matches (X) Frequency (f) Cumulative Frequency 46 2 2 47 3 5 48 5 10 49 6 16 50 16 32 51 6 38 52 4 42 53 3 45 54 3 48 55 2 50 Total Σf = 50 th The 17 through to the nd 32 values are in this group. th As there are 50 values, the position of the median is: This means that the 25 th and 26 values are both 50. n + 1 50 + 1 51 = = = 25.5 2 2 2 In this case the 25th and 26th values are required. The median is the average of the 25th and 26th values. As the 25th and 26th values are both 50, the median is 50. Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 11 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy Now let’s look at the example using a Grouped FDT. Finding the median can be performed in one of two ways: interpolation method or graphical method. (a) Interpolation Method (Amount Spent at a Show Data) Amount Spent$ Frequency (f) Group Midpoint (X) Cumulative Frequency $0 but less than $10 2 5 2 $10 but less than $20 3 15 5 $20 but less than $30 5 25 10 $30 but less than $40 4 35 14 $40 but less than $50 2 45 16 $50 but less than $60 8 55 24 $60 but less than $70 14 65 38 $70 but less than $80 18 75 56 95 70 The group width $80 but =70-60 less than $90 =10 12 $90 but less than $100 2 Total Σf = 70 This means that the th 24 score is $60 This means that the th 38 score is $70 There are 14 values in this 85 group. 68 As there are 70 values, the position of the median is: n + 1 70 + 1 71 = = = 35.5 2 2 2 In this case the 35th and 36th values are required. Both of these values are located in the group ‘$60 but less than $70’. This means that the median is greater than 60 but less than 70. This means that the median is (35.5 – 24 =) 11.5/14 of the way through the ‘$60 but less than $70’ group. The median is: ( 35.5 − 24 ) ×10 (Group width) Median = 60 + 14 Median = 68.2 Or expressed more generally: n +1 -cf m -1 2 × Group width Median = Lm + fm where Lm is the lower limit of the median group f m is the frequency of the median group cf m -1 is the cumulative frequency of the previous group Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 12 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy (b) Graphical method (using the Ogive) There are 70 values in the table. This is shown on the vertical axis (Cumulative Frequency). The first step is to come half way up the vertical axis. This is to 35. The second step; from this position, move horizontally across the graph until the cumulative frequency Ogive is found. The third step is to move down (vertically) to the horizontal axis and read the value. Ogive of Money Spent 80 70 Cumulative Frequency 60 50 40 30 20 10 0 0 20 40 60 80 100 120 Amount Spent $ From the graph, the median is approximately 68. Remember this method assumes that the values in each group are evenly spread. This assumption is not always true, so the figure obtained for the median using this method can slightly different to the actual median if the original values where used. Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 13 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy Mode The Mode is the value that occurs the most often. The mode may not exist because values only occur once. The mode could even be quite different to the mean and median, it may be much higher or much lower depending on the meaning of the data. There may also be more than one mode. If there are two modes, the data is said to be ‘bimodal’. Finding the mode from Graphs Number of Children in 40 households 16 14 Frequency 12 10 8 6 4 2 0 0 1 2 3 4 5 Number of Children From the graph, it can be observed that '1 child per household' has the highest frequency. This is the mode of these values. Finding the mode of Individual Values This data is the test results of a sample 15 students on a maths test (in order). 23, 45, 50, 54, 55, 57, 59, 59, 59, 61, 63, 75, 75, 81, 90. When inspecting this data it can be seen that the 59 occurs 3 times. This is the value that occurs the most frequently, that is, it has the highest frequency. The mode for this data is 59. Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 14 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy Finding the mode from a Frequency Distribution Table Let’s recall the data about the number of matches in 50 boxes which was organised in a FDT with no grouping. To find the mode, the frequency column is inspected. The highest frequency of 16 is related to a box containing 50 matches. The mode of this data is 50 matches. Take care to give the mode as the score (50 matches) not the frequency (16). Number of Matches (X) Frequency (f) 46 2 47 3 48 5 49 6 50 16 51 6 52 4 53 3 54 3 55 2 Total Σf = 50 The mode for this data is 50 matches as it has the highest frequency (occurs the most often) When data is grouped, it is more relevant to refer to the modal group. From the scenario, ‘Amount Spent at the Show’, the group with the highest frequency is '$70 but less than $80' which has a frequency of 18. It can be stated that the modal group is '$70 but less than $80'. Amount Spent$ Frequency (f) Group Midpoint (X) $0 but less than $10 2 5 $10 but less than $20 3 15 $20 but less than $30 5 25 $30 but less than $40 4 35 $40 but less than $50 2 45 $50 but less than $60 8 55 $60 but less than $70 14 65 $70 but less than $80 18 75 $80 but less than $90 12 85 The modal group for this data is ‘$70 but less than $80’ as it has the highest frequency (occurs the most Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 15 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy $90 but less than $100 2 Total Σf = 70 95 Either the group ‘$70 but less than $80’ or the Group Midpoint ‘$75’ can be given as the mode. Which measure of centre should be used and when? Every value from the data is used to calculate the mean. As long as the data is spread out without excessive variation, the mean is usually used. A lecturer would use the mean to find a measure of centre because the marks would be usually spread out from (say) 30 to 100%. In this data the low or high values are not excessively different to a measure of centre. In real estate, the median is often used as a measure of centre because there are some excessively high values that are quite different to the bulk of the market. Real estate sales generally consist of many properties at the cheaper part of the market and fewer properties at the expensive part of the market. If the mean was to be used, the fewer, more expensive properties would have a huge effect on the mean. Because the median is a measure of centre based on location, variation in the price of high value properties and the number of high value properties will have no effect on the median but would have a significant effect on the mean. In the clothing industry, the mode is often used as a measure of centre. If a shop sells mostly size 12 clothing, then size 12 is the mode of the sizes sold. The mean is of little use because it could be a value that is not even a possible size. The median is a better measure but fails to reflect the nature of the sales environment. Video ‘Measures of Central Tendency’ Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 16 [last edited on 21 July 2015] CRICOS Provider: 01241G Centre for Teaching and Learning Numeracy Activity 1. The lifetime, in hours, of a sample of 15 light bulbs is: 351, 429, 885, 509, 317, 753, 827, 737, 487, 726, 395, 773, 926, 688, 485. Calculate the mean, mode and median of the values. Do not organise into a table. 2. The number of children in a 10 families is: 1, 5, 2, 2, 2, 3, 1, 4, 3, 2. Calculate the mean, mode and median of the values. Do not organise into a table. 3. For the Car Speed Data, calculate the mean, mode and median of the values after organising the data in Frequency Distribution Table (The FDT can be found in the answers to the Topic ‘Graphs’). 12 31 22 40 4. 41 35 50 52 44 31 46 57 Car Speed Data (in km/hr) 45 28 40 20 59 27 14 33 48 27 61 42 32 49 25 39 62 19 32 64 46 58 52 52 25 38 69 27 The number of students attending a class (maximum 25) for 30 lessons is given in the table below: 25 25 24 22 25 24 24 25 24 23 Students Attending Class 24 25 23 24 20 23 25 24 24 25 24 25 25 23 24 23 25 24 21 22 (a) Calculate the mean, mode and median using a Frequency Distribution Table. (The FDT can be found in the answers to the Topic ‘Graphs’). (b) Discuss the appropriateness of each measure of central tendency as a typical value. 5. The Shot Put distances thrown by 27 world champion shot putters are given in the table below. The unit is metres (m). 22.25 21.72 21.07 20.22 22.67 20.19 20.37 21.55 20.38 21.58 21.39 20.45 23.12 22.37 21.72 21.25 23.09 20.91 22.19 21.19 21.19 22.58 21.70 22.07 21.22 21.97 20.54 Calculate the mean, mode and median using a grouped Frequency Distribution Table. (The FDT can be found in the answers to the Topic ‘Graphs’) Centre for Teaching and Learning | Academic Practice | Academic Skills T +61 2 6626 9262 E [email protected] W www.scu.edu.au/teachinglearning Page 17 [last edited on 21 July 2015] CRICOS Provider: 01241G