Averages and spread 2
Worked examples:
1. Raw data
Raw data refers to a data set where each data value is known individually. All of the
different measures of location and dispersion can be calculated from raw data.
Example
In an experiment, scientists from CERN were tested for their level of radiation
exposure (millisievert, mSv), over a year. The results were as follows:
12.3
21.2
19.0
13.1
17.1
18.1
24.0
15.1
15.4
21.7
18.2
a. Calculate the following numerical measures:
i. Mean
ii. Median
iii. Mode
iv. Interquartile range
v. Range
vi. Variance
vii. Standard deviation
b. Interpretation:
The safe level for one year’s exposure is 20.0 mSv. Explain whether the following
statement is correct, using the measures you have just calculated.
‘The scientists at CERN are working within the safe levels of radioactive exposure.’
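The measures in part a can be checked with a short script. This is a minimal sketch rather than a model answer, and it rests on two assumptions that the worksheet does not fix for raw data: quartiles are taken at the (n + 1)p positions (the default "exclusive" method of Python's statistics.quantiles), and the variance divides by n, in line with the σ formulae used later in this worksheet. Other quartile conventions can give slightly different answers.

```python
import math
import statistics
from collections import Counter

# Radiation exposure readings (mSv) from the example.
exposure = [12.3, 21.2, 19.0, 13.1, 17.1, 18.1, 24.0, 15.1, 15.4, 21.7, 18.2]


def modes(values):
    """Return the most frequent values, or [] if no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)


mean = statistics.mean(exposure)
median = statistics.median(exposure)
mode_values = modes(exposure)

# Assumption: quartiles at the (n + 1)p positions (method="exclusive").
q1, _, q3 = statistics.quantiles(exposure, n=4)
iqr = q3 - q1

data_range = max(exposure) - min(exposure)

# Assumption: population variance (divide by n), matching the sigma notation.
variance = statistics.pvariance(exposure)
std_dev = math.sqrt(variance)

# Readings above the 20.0 mSv safe level, for the interpretation in part b.
above_safe = [x for x in exposure if x > 20.0]

print(f"mean = {mean:.2f}, median = {median}, modes = {mode_values}")
print(f"IQR = {iqr:.1f}, range = {data_range:.1f}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
print(f"readings above 20.0 mSv: {above_safe}")
```

For part b, it may help to compare the averages with the list of individual readings above 20.0 mSv: an average below the safe level does not by itself show that every scientist is below it.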
Worked examples:
2. Frequency distribution
A frequency distribution is a compact way of describing raw data when some of the
readings occur more than once. Instead of listing all the data values individually, a
frequency distribution lists each different value along with its frequency (the number of
times it occurs). Averages and measures of spread can be calculated by
reconstructing the raw data, but for the mean, variance and standard deviation it is
easier to adapt the formulae to take frequencies into account:
\[ \bar{x} = \frac{\sum f x}{n}, \quad \text{where } n = \sum f \]
\[ \sigma^2 = \frac{\sum f x^2}{n} - \bar{x}^2 \]
\[ \sigma = \sqrt{\frac{\sum f x^2}{n} - \bar{x}^2}. \]
These are best calculated from a table, as in the example following.
Example
Isabella went up and down the street to find out how many parking spaces each house
has. Here are her results:
Number of parking spaces    Frequency
1                           15
2                           27
3                           8
4                           3
a. Calculate the mean, variance and standard deviation
b. Calculate the median and interquartile range
c. Calculate the mode and the range
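The table-based calculation can be sketched in a few lines of Python. The mean, variance and standard deviation follow the frequency formulae given before this example. The quartile rule is an assumption, since the worksheet does not state one: here the median is left out of both halves when n is odd, and each quartile is the median of its half. The helper names value_at and median_between are made up for this illustration.

```python
import math

# Parking-space data from the table: value -> frequency.
freq = {1: 15, 2: 27, 3: 8, 4: 3}

n = sum(freq.values())
sum_fx = sum(f * x for x, f in freq.items())
sum_fx2 = sum(f * x**2 for x, f in freq.items())

mean = sum_fx / n
variance = sum_fx2 / n - mean**2
std_dev = math.sqrt(variance)


def value_at(position):
    """Return the data value at a 1-based position in the ordered data."""
    running = 0
    for x in sorted(freq):
        running += freq[x]
        if position <= running:
            return x
    raise ValueError("position is beyond the end of the data")


def median_between(first, last):
    """Median of the ordered values in positions first..last (inclusive)."""
    count = last - first + 1
    lower = first + (count - 1) // 2
    upper = first + count // 2
    return (value_at(lower) + value_at(upper)) / 2


median = median_between(1, n)
# Assumption: the two halves exclude the middle value when n is odd.
q1 = median_between(1, n // 2)
q3 = median_between(n - n // 2 + 1, n)
iqr = q3 - q1

mode = max(freq, key=freq.get)
data_range = max(freq) - min(freq)

print(f"n = {n}, sum fx = {sum_fx}, sum fx^2 = {sum_fx2}")
print(f"mean = {mean:.3f}, variance = {variance:.3f}, sd = {std_dev:.3f}")
print(f"median = {median}, IQR = {iqr}, mode = {mode}, range = {data_range}")
```

Working from cumulative frequencies, as value_at does, avoids writing out all 53 data values to find the median and quartiles.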
Worked examples:
3. Grouped data
Sometimes data is grouped, either because the data is continuous or because there
are a lot of different data values. For example, we might group together all the values
between 0 and 10 in one group, those from 10 to 20 in another and so on. Grouped
data looks a bit like a frequency distribution but, instead of having the frequency for
each data value, we have the frequency for each group of data.
We describe the grouping using intervals: 0 ≤ x < 10, 10 ≤ x < 25, etc. The intervals
can be different sizes, but we need to be careful that they do not overlap or leave
gaps. For example: the intervals 0 ≤ x ≤ 10, 10 ≤ x < 25 overlap, since 10 could go
into either group; the intervals 0 ≤ x ≤ 9, 10 ≤ x ≤ 24 leave a gap, since 9.5 does not
fit into either group.
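The overlap and gap rule can be checked mechanically. The sketch below assumes every group is written in the form lower ≤ x < upper, so two neighbouring groups fit together exactly when the upper boundary of one equals the lower boundary of the next. The function name check_groups is hypothetical.

```python
def check_groups(groups):
    """Return a list of problems between neighbouring groups.

    Assumption: each group is a (lower, upper) pair meaning lower <= x < upper.
    """
    problems = []
    ordered = sorted(groups)
    for (lo1, hi1), (lo2, hi2) in zip(ordered, ordered[1:]):
        if lo2 < hi1:
            problems.append(f"overlap between {lo1}-{hi1} and {lo2}-{hi2}")
        elif lo2 > hi1:
            problems.append(f"gap between {hi1} and {lo2}")
    return problems


print(check_groups([(0, 10), (10, 25)]))   # boundaries meet exactly
print(check_groups([(0, 9), (10, 24)]))    # values such as 9.5 fit nowhere
print(check_groups([(0, 12), (10, 25)]))   # values such as 11 fit twice
```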
Once data has been grouped, we do not know the individual data values. This means
we can only estimate the averages and measures of spread. Grouped data is dealt
with in much the same way as a frequency distribution, but there is one extra step to
deal with: we have to decide what to use as the x-value of each group. So that all
mathematicians do this in the same way, we agree to use the midpoint of each
group. We do not know the data values, so we assume that they are all at the
midpoint.
The midpoint is calculated by averaging the group boundaries. For example, using
the intervals 0 ≤ x < 10, 10 ≤ x < 25, the first midpoint is at $\frac{0+10}{2} = 5$,
and the second midpoint is at $\frac{10+25}{2} = 17.5$.
Example
A survey is conducted to look into the amount of money the average customer spends
at a supermarket checkout. This was done with a sample of 100 people. The
information was then grouped into the following intervals:
Amount spent (£)    Frequency
5 ≤ x < 25          10
25 ≤ x < 40         13
40 ≤ x < 70         12
70 ≤ x < 100        29
100 ≤ x < 150       23
150 ≤ x < 200       13
a. Estimate the mean and standard deviation of these data.
b. Estimate the median and interquartile range of these data.
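Both estimates can be sketched in Python. Part a uses the midpoint of each group as its x-value, as described earlier in this section. Part b assumes linear interpolation within a group, with the median at cumulative frequency n/2 and the quartiles at n/4 and 3n/4. The worksheet does not name a method for part b, so treat that part as one common approach and not the required one.

```python
import math

# Groups from the table as (lower, upper, frequency), meaning lower <= x < upper.
groups = [(5, 25, 10), (25, 40, 13), (40, 70, 12),
          (70, 100, 29), (100, 150, 23), (150, 200, 13)]

n = sum(f for _, _, f in groups)

# Part a: treat every value in a group as if it sat at the group midpoint.
midpoints = [(lo + hi) / 2 for lo, hi, _ in groups]
sum_fx = sum(f * m for (_, _, f), m in zip(groups, midpoints))
sum_fx2 = sum(f * m**2 for (_, _, f), m in zip(groups, midpoints))

mean = sum_fx / n
std_dev = math.sqrt(sum_fx2 / n - mean**2)


def interpolate(target):
    """Estimate the value at cumulative frequency `target`.

    Assumption: values are spread evenly across each group.
    """
    below = 0
    for lo, hi, f in groups:
        if target <= below + f:
            return lo + (target - below) / f * (hi - lo)
        below += f
    raise ValueError("target is beyond the total frequency")


# Part b: median and quartiles at n/2, n/4 and 3n/4.
median = interpolate(n / 2)
q1 = interpolate(n / 4)
q3 = interpolate(3 * n / 4)
iqr = q3 - q1

print(f"midpoints = {midpoints}")
print(f"mean = {mean:.2f}, standard deviation = {std_dev:.2f}")
print(f"median = {median:.2f}, Q1 = {q1:.2f}, Q3 = {q3:.2f}, IQR = {iqr:.2f}")
```

All of these results are estimates, because the individual amounts spent are not known once the data has been grouped.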