Averages and spread 2

Worked examples: 1. Raw data

Raw data refers to a data set where each data value is known individually. All of the different measures of location and dispersion can be calculated from raw data.

Example

In an experiment, scientists from CERN were tested for their level of radiation exposure (millisievert, mSv) over a year. The results were as follows:

12.3  21.2  19.0  13.1  17.1  18.1  24.0  15.1  15.4  21.7  18.2

a. Calculate the following numerical measures:
   i. Mean
   ii. Median
   iii. Mode
   iv. Interquartile range
   v. Range
   vi. Variance
   vii. Standard deviation
b. Interpretation: The safe level for one year's exposure is 20.0 mSv. Explain whether the following statement is correct, using the measures you have just calculated.
   "The scientists at CERN are working within the safe levels of radioactive exposure."

Worked examples: 2. Frequency distribution

A frequency distribution is a compact way of describing raw data when some of the readings occur more than once. Instead of listing all the data values individually, a frequency distribution lists each different value along with its frequency (the number of times it occurs).

Averages and measures of spread can be calculated by reconstructing the raw data, but for the mean, variance and standard deviation it is easier to adapt the formulae to take frequencies into account:

$$\bar{x} = \frac{\sum fx}{n}, \quad \text{where } n = \sum f$$

$$\sigma^2 = \frac{\sum fx^2}{n} - \bar{x}^2$$

$$\sigma = \sqrt{\frac{\sum fx^2}{n} - \bar{x}^2}$$

These are best calculated from a table, as in the example following.

Example

Isabella went up and down the street to find out how many parking spaces each house has. Here are her results:

Number of parking spaces    Frequency
1                           15
2                           27
3                           8
4                           3

a. Calculate the mean, variance and standard deviation.
b. Calculate the median and interquartile range.
c. Calculate the mode and the range.

Worked examples: 3. Grouped data

Sometimes data is grouped, either because the data is continuous or because there are a lot of different data values.
For example, we might group together all the values between 0 and 10 in one group, those from 10 to 20 in another and so on. Grouped data looks a bit like a frequency distribution but, instead of having the frequency for each data value, we have the frequency for each group of data.

We describe the grouping using intervals: 0 ≤ x < 10, 10 ≤ x < 25, etc. The intervals can be different sizes, but we need to be careful that they do not overlap or leave gaps. For example:
the intervals 0 ≤ x ≤ 10, 10 ≤ x < 25 overlap, since 10 could go into either group;
the intervals 0 ≤ x ≤ 9, 10 ≤ x ≤ 24 leave a gap, since 9.5 does not fit into either group.

Once data has been grouped, we do not know the individual data values. This means we can only estimate the averages and measures of spread.

Grouped data is dealt with in much the same way as a frequency distribution, but there is one extra step: we have to decide what to use as the x-value of each group. So that all mathematicians do this in the same way, we agree to use the midpoint of each group. We do not know the data values, so we assume that they are all at the midpoint.

The midpoint is calculated by averaging the group boundaries. For example, using the intervals 0 ≤ x < 10, 10 ≤ x < 25, the first midpoint is at (0 + 10)/2 = 5, and the second midpoint is at (10 + 25)/2 = 17.5.

Example

A survey is conducted to look into the amount of money the average customer spends at a supermarket checkout. This was done with a sample of 100 people. The information was then grouped into the following intervals:

Amount spent (£)    Frequency
5 ≤ x < 25          10
25 ≤ x < 40         13
40 ≤ x < 70         12
70 ≤ x < 100        29
100 ≤ x < 150       23
150 ≤ x < 200       13

a. Estimate the mean and standard deviation of these data.
b. Estimate the median and interquartile range of these data.
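
The frequency formulae and the midpoint convention described in this worksheet can be checked with a short script. The following is a minimal sketch in Python, not part of the original worksheet: the function names `summarise` and `midpoints` are made up for illustration, and the code assumes the population form of the variance (dividing by n), as in the formulae for a frequency distribution. For grouped data the results are only estimates, because every value in a group is assumed to sit at the midpoint.

```python
from math import sqrt


def summarise(values, freqs):
    """Return (mean, variance, standard deviation) for x-values with frequencies.

    Uses mean = sum(fx) / n and variance = sum(fx^2) / n - mean^2, where n = sum(f).
    """
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    mean = sum_fx / n
    variance = sum_fx2 / n - mean ** 2
    return mean, variance, sqrt(variance)


def midpoints(intervals):
    """Return the midpoint of each (lower, upper) group boundary pair."""
    return [(lower + upper) / 2 for lower, upper in intervals]


# Frequency distribution: number of parking spaces per house.
parking_mean, parking_var, parking_sd = summarise([1, 2, 3, 4], [15, 27, 8, 3])

# Grouped data: amount spent (in pounds), with each group replaced by its midpoint.
intervals = [(5, 25), (25, 40), (40, 70), (70, 100), (100, 150), (150, 200)]
freqs = [10, 13, 12, 29, 23, 13]
spend_mean, spend_var, spend_sd = summarise(midpoints(intervals), freqs)

print(f"Parking spaces: mean {parking_mean:.2f}, variance {parking_var:.2f}, sd {parking_sd:.2f}")
print(f"Amount spent (estimates): mean {spend_mean:.2f}, sd {spend_sd:.2f}")
```

With the tables above, this gives a mean of about 1.98 parking spaces with a standard deviation of about 0.81, and an estimated mean spend of about £88.48 with an estimated standard deviation of about £48.63. The same `summarise` function serves both cases; the only difference for grouped data is the extra midpoint step.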