KS3 Mathematics
D2 Processing data
1 of 42
© Boardworks Ltd 2004
Contents
D2 Processing data
D2.1 Finding the mode
D2.2 Calculating the mean
D2.3 Finding the median
D2.4 Finding the range
D2.5 Calculating statistics
Finding the mode
The mode or modal value in a set of data is the data value
that appears the most often.
For example, the number of goals scored by the local
football team in the last ten games is:
2, 1, 2, 0, 0, 2, 3, 1, 2, 1.
The modal score is 2.
Is it possible to have more than one modal value? Yes
Is it possible to have no modal value? Yes
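As a quick check, the modal value for the goals data can be found with Python's standard statistics module. This is only a minimal sketch: the variable names are illustrative, and the second data set is made up here to show two modes and does not come from the slides.

```python
from statistics import multimode

# Goals scored in the last ten games (data from the slide).
goals = [2, 1, 2, 0, 0, 2, 3, 1, 2, 1]

# multimode lists every value that shares the highest frequency.
modes = multimode(goals)
print(modes)  # [2]

# Illustrative data (an assumption, not from the slides) with two modes.
two_modes = multimode([1, 1, 2, 2, 3])
print(two_modes)  # [1, 2]
```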
Finding the mode
A dice was thrown ten times. These are the results:
What was the modal score?
3 is the modal score because it appears most often.
Finding the mode
The mode is the only average that can be used for
categorical or non-numerical data.
For example, 30 pupils are asked how they usually travel to
school. The results are shown in a frequency table.
Method of travel    Frequency
Bicycle             6
On foot             8
Car                 2
Bus                 6
Train               3
What is the modal method of travel?
More children travel on foot than by any other method.
Travelling on foot is therefore the modal method of travel.
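The same idea can be sketched in Python for categorical data. This assumes the table reads Bicycle 6, On foot 8, Car 2, Bus 6 and Train 3; the dictionary and variable names are illustrative only.

```python
# Frequency table: method of travel -> number of pupils
# (assumed reading of the table on the slide).
travel = {"Bicycle": 6, "On foot": 8, "Car": 2, "Bus": 6, "Train": 3}

# The modal category is the one with the largest frequency.
modal_method = max(travel, key=travel.get)
print(modal_method)  # On foot
```

No arithmetic is done on the categories themselves, which is why the mode still works for non-numerical data.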
Finding the mode from a bar chart
This bar chart shows the scores in a science test:
[Bar chart: number of pupils (0 to 9) plotted against mark out of ten (1 to 10)]
What was the modal score?
6 is the modal score because it has the highest bar.
Finding the mode from a pie chart
This pie chart shows the favourite food of a sample of people:
[Pie chart with sectors for chocolate, fruit, vegetables, sweets and other; the sector values shown are 23, 78, 55, 18 and 26]
What was the modal food type?
The biggest sector of the pie chart is for chocolate, so this
is the modal food type.
Finding the mode from a frequency table
This frequency table shows the frequency of different length
words in a given paragraph of text.
Word length    1    2    3    4    5    6    7    8    9    10
Frequency      3    16   12   16   7    3    11   6    2    1
What was the modal word length?
We need to look for the word lengths that occur most
frequently.
For this data there are two modal word lengths: 2 and 4.
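A short Python sketch of the same search, using the word-length frequencies from the table. The list names are illustrative; the point is that every length tied for the highest frequency counts as a mode.

```python
# Word lengths 1 to 10 and their frequencies (from the table).
word_lengths = list(range(1, 11))
frequencies = [3, 16, 12, 16, 7, 3, 11, 6, 2, 1]

# Find the highest frequency, then keep every length that reaches it.
highest = max(frequencies)
modal_lengths = [
    length for length, f in zip(word_lengths, frequencies) if f == highest
]
print(modal_lengths)  # [2, 4]
```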
Finding the modal class for continuous data
This grouped frequency table shows the times 50 girls and
50 boys took to complete one lap around a race track.
                          Frequency
Time (minutes:seconds)    Boys    Girls
2:00 ≤ 2:15               3       1
2:15 ≤ 2:30               7       6
2:30 ≤ 2:45               11      10
2:45 ≤ 3:00               13      9
3:15 ≤ 3:30               8       12
3:30 ≤ 3:45               7       10
3:45 ≤ 4:00               1       2
What is the modal class for the girls?
What is the modal class for the boys?
What is the modal class for the pupils regardless of whether they are a boy or a girl?
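One way to check answers to these questions is the sketch below. It assumes the first frequency in each row is for boys and the second for girls, and it copies the class labels exactly as they appear on the slide. The helper modal_class is a hypothetical name introduced for illustration.

```python
# Class labels and frequencies as they appear in the table.
classes = [
    "2:00 ≤ 2:15", "2:15 ≤ 2:30", "2:30 ≤ 2:45", "2:45 ≤ 3:00",
    "3:15 ≤ 3:30", "3:30 ≤ 3:45", "3:45 ≤ 4:00",
]
boys = [3, 7, 11, 13, 8, 7, 1]
girls = [1, 6, 10, 9, 12, 10, 2]


def modal_class(freqs):
    """Return the label(s) of the class with the highest frequency."""
    highest = max(freqs)
    return [label for label, f in zip(classes, freqs) if f == highest]


# For all pupils together, add the two frequencies class by class.
combined = [b + g for b, g in zip(boys, girls)]

print(modal_class(girls))     # ['3:15 ≤ 3:30']
print(modal_class(boys))      # ['2:45 ≤ 3:00']
print(modal_class(combined))  # ['2:45 ≤ 3:00']
```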
D2.2 Calculating the mean
The mean
The mean is the most commonly used average.
To calculate the mean of a set of values we add together the
values and divide by the total number of values.
Mean = Sum of values ÷ Number of values
For example, the mean of 3, 6, 7, 9 and 9 is
(3 + 6 + 7 + 9 + 9) ÷ 5 = 34 ÷ 5 = 6.8
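The calculation can be sketched directly in Python as the sum divided by the count. The variable names are illustrative.

```python
# Values from the worked example.
values = [3, 6, 7, 9, 9]

# Mean = sum of values divided by number of values.
mean = sum(values) / len(values)
print(mean)  # 6.8
```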
Problems involving the mean
A pupil scores 78%, 75% and 82% in three tests. What must
she score in the fourth test to get an overall mean of 80%?
To get a mean of 80% the four marks must add up to
4 × 80% = 320%
The three marks that the pupil has so far add up to
78% + 75% + 82% = 235%
The mark needed in the fourth test is
320% – 235% = 85%
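The same reasoning, working backwards from the target mean to the total required, can be sketched as follows. The variable names are illustrative.

```python
# Marks so far, as percentages.
scores = [78, 75, 82]
target_mean = 80
number_of_tests = 4

# The four marks must add up to target mean × number of tests.
required_total = target_mean * number_of_tests  # 320

# Subtract what has already been scored to find the missing mark.
needed = required_total - sum(scores)
print(needed)  # 85
```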
Calculating the mean from a frequency table
The following frequency table shows the scores obtained
when a dice is thrown 50 times.
What is the mean score?
Score    Frequency    Score × Frequency
1        8            8
2        11           22
3        6            18
4        9            36
5        9            45
6        7            42
Total    50           171
The mean score = 171 ÷ 50 = 3.42
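The frequency-table method can be sketched in Python: multiply each score by its frequency, add the products, and divide by the total frequency. The list names are illustrative.

```python
# Dice scores and how often each was thrown (from the table).
scores = [1, 2, 3, 4, 5, 6]
frequencies = [8, 11, 6, 9, 9, 7]

# Total number of throws is the sum of the frequencies.
total_throws = sum(frequencies)  # 50

# Total of all scores is the sum of score × frequency.
total_score = sum(s * f for s, f in zip(scores, frequencies))  # 171

mean_score = total_score / total_throws
print(mean_score)  # 3.42
```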
Calculating the mean using a spreadsheet
When processing large amounts of data it is often helpful
to use a spreadsheet to help us calculate the mean.
For example, 500 households were asked how many
children under the age of 16 lived in the home. The results
were collected in a spreadsheet.
Calculating the mean using a spreadsheet
The total number of households is found by entering
=SUM(B2:J2) in cell K2 as follows:
Pressing enter shows the number of households.
Calculating the mean using a spreadsheet
The total number of children in households with no
children is found by entering =B1*B2 into cell B3.
The total number of children in households with one child
is found by entering =C1*C2 into cell C3.
Calculating the mean using a spreadsheet
This can be repeated along the row to find the total
number of children in each type of household.
To find the total number of children altogether enter
=SUM(B3:J3) in cell K3.
Calculating the mean using a spreadsheet
The mean number of children in each household is now
found by dividing the number in cell K3 by the number in
cell K2.
Mean number of children = 1061 ÷ 500 = 2.122
Using an assumed mean
We can calculate the mean of a set of data using an
assumed mean.
For example, suppose the heights of ten year 8 pupils are
as follows:
148 cm, 155 cm, 145 cm, 157 cm, 156 cm,
142 cm, 168 cm, 152 cm, 150 cm, 138 cm.
To find the mean of these values using an assumed mean
we start by making a guess at what the mean might be.
This is the assumed mean.
For this set of values we can use an assumed mean of
150 cm.
Using an assumed mean
Subtract the assumed mean from each of the data values.
148 cm, 155 cm, 145 cm, 157 cm, 156 cm,
142 cm, 168 cm, 152 cm, 150 cm, 138 cm.
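Subtracting 150 cm from each value gives:
–2 cm, 5 cm, –5 cm, 7 cm, 6 cm,
–8 cm, 18 cm, 2 cm, 0 cm, –12 cm.
Next, find the mean of the new set of values.
The positive differences add up to 38 and the negative differences add up to –27, so the mean of the differences is
(38 – 27) ÷ 10 = 11 ÷ 10 = 1.1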
Using an assumed mean
To find the actual mean of the heights, add this value to the
assumed mean.
150 + 1.1 = 151.1
(assumed mean + mean of the differences = actual mean)
So, the actual mean of the heights is 151.1 cm.
This method is often used to find the mean of numbers that
are large or written to a large number of decimal places.
We can use this method to find the mean mentally, using
jottings.
Using an assumed mean
In summary, to find the mean of a set of values using an
assumed mean follow these steps:
1) Assume the mean of the values.
2) Subtract this assumed mean from each of the values.
3) Find the mean of the new values.
4) Add this mean to the assumed mean to find the actual
mean.
Actual mean = Assumed mean + (Sum of the differences ÷ Total number of values)
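The four steps can be sketched in Python using the heights from the worked example and an assumed mean of 150 cm. The variable names are illustrative, and the final line rounds to one decimal place only to tidy floating-point output.

```python
# Heights of ten pupils in cm (from the worked example).
heights = [148, 155, 145, 157, 156, 142, 168, 152, 150, 138]

# Step 1: assume a mean.
assumed_mean = 150

# Step 2: subtract the assumed mean from each value.
differences = [h - assumed_mean for h in heights]

# Step 3: find the mean of the differences.
mean_difference = sum(differences) / len(differences)

# Step 4: add this to the assumed mean.
actual_mean = assumed_mean + mean_difference
print(round(actual_mean, 1))  # 151.1
```

Any sensible guess for the assumed mean gives the same final answer; a guess close to the true mean just keeps the differences small.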
D2.3 Finding the median
Finding the median
The median is the middle value of a set of numbers
arranged in order.
For example, find the median of
10, 7, 9, 12, 7, 8, 6.
Write the values in order:
6, 7, 7, 8, 9, 10, 12.
The median is the middle value, 8.
Finding the median
When there is an even number of values, there will be two
values in the middle.
In this case, we have to find the mean of the two middle
values.
For example,
Find the median of 56, 42, 47, 51, 65 and 43.
The values in order are:
42, 43, 47, 51, 56, 65.
There are two middle values, 47 and 51.
Finding the median
To find the number that is half-way between 47 and 51 we
can add the two numbers together and divide by 2.
(47 + 51) ÷ 2 = 98 ÷ 2 = 49
Alternatively, find the difference between 47 and 51 and add
half this difference to the lower number.
51 – 47 = 4
½ of 4 = 2
2 + 47 = 49
The median of 42, 43, 47, 51, 56 and 65 is 49.
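Both cases, an odd and an even number of values, can be sketched with one small Python function. The function name and layout are illustrative; Python's statistics.median does the same job.

```python
def median(values):
    """Return the middle value of the data once it is put in order."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of values: there is a single middle value.
        return ordered[middle]
    # Even number of values: take the mean of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([10, 7, 9, 12, 7, 8, 6]))      # 8
print(median([56, 42, 47, 51, 65, 43]))     # 49.0
```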
Rogue values
The median is often used when there is a rogue value – that
is, a value that is much smaller or larger than the rest.
What is the rogue value in the following data set:
192, 183, 201, 177, 193, 197, 4, 186, 179?
Writing the values in order gives:
4, 177, 179, 183, 186, 192, 193, 197, 201.
The median of this data set is the middle value, 186.
The median of the data set is hardly affected by the rogue
value, 4.
The mean of the data set is 168. This is not representative
of the set because it is lower than almost all the data values.
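The effect of the rogue value can be seen by computing both averages with and without it. This is a minimal sketch with illustrative variable names; removing the value 4 is done here only for comparison and is not a step in the slides.

```python
from statistics import median

data = [192, 183, 201, 177, 193, 197, 4, 186, 179]

# Mean and median of the full data set.
print(sum(data) / len(data))  # 168.0
print(median(data))           # 186

# The same statistics with the rogue value left out, for comparison.
without_rogue = [v for v in data if v != 4]
print(sum(without_rogue) / len(without_rogue))  # 188.5
print(median(without_rogue))                    # 189.0
```

Leaving out the rogue value moves the mean by more than 20 but the median by only 3.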
Mean or median?
Would it be better to use the median or the mean to
represent the following data sets?
34.2, 36.8, 29.7, 356, 42.5, 37.1: median
0.4, 0.5, 0.3, 0.8, 0.7, 1.0: mean
892, 954, 1026, 908, 871, 930: mean
3.12, 3.15, 3.23, 9.34, 3.16, 3.20: median
97.85, 95.43, 102.45, 98.02, 97.92, 99.38: mean
87634, 9321, 78265, 83493, 91574, 90046: median
D2.4 Finding the range
Finding the range
The range of a set of data is a measure of how the data is
spread across the distribution.
To find the range we subtract the lowest value in the set from
the highest value.
Range = highest value – lowest value
What does it mean if the range is small?
When the range is small it tells us that the values are similar in size.
What does it mean if the range is large?
When the range is large it tells us that the values vary widely in size.
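A minimal Python sketch of the range, applied to two data sets that appear elsewhere in these slides (the values from the mean example and the data set with a rogue value). The function name data_range is illustrative.

```python
def data_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)


# Values that are similar in size give a small range.
similar = [3, 6, 7, 9, 9]
print(data_range(similar))  # 6

# A data set containing a rogue value gives a large range.
spread = [192, 183, 201, 177, 193, 197, 4, 186, 179]
print(data_range(spread))  # 197
```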
D2.5 Calculating statistics
The three averages and range
There are three different types of average:
MEAN: sum of values ÷ number of values
MEDIAN: middle value
MODE: most common
The range is not an average, but tells you how the data is
spread out:
RANGE: largest value – smallest value
The three averages
Each type of average has its purpose and sometimes one is
preferable to another.
The mode is easy to find and it eliminates some of the
effects of extreme values. It is the only type of average that
can be used for categorical (non-numerical) data.
The median is also fairly easy to find and has the advantage of
being hardly affected by rogue values or skewed data.
The mean is the most difficult to calculate but takes into
account all the values in the data set.
Calculating statistics
Look at the values on these five cards:
2, 4, 5, 8, 11
Choose three cards so that:
The mean is bigger than the median.
The median is bigger than the mean.
The mean and the median are the same.
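One way to explore this problem is to try every choice of three cards. The sketch below is an illustration, not part of the original activity; it compares the total with three times the median so that no rounding is involved.

```python
from itertools import combinations

cards = [2, 4, 5, 8, 11]

# Group every choice of three cards by how its mean compares
# with its median.
results = {"mean > median": [], "median > mean": [], "mean = median": []}

for trio in combinations(cards, 3):
    total = sum(trio)
    middle = sorted(trio)[1]  # median of three values
    # mean > median exactly when total > 3 × median.
    if total > 3 * middle:
        results["mean > median"].append(trio)
    elif total < 3 * middle:
        results["median > mean"].append(trio)
    else:
        results["mean = median"].append(trio)

for relation, trios in results.items():
    print(relation, trios)
```

For these cards the mean and median are equal for (2, 5, 8) and (5, 8, 11), where the three values are evenly spaced.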
Stem-and-leaf diagrams
Sometimes data is arranged in a stem-and-leaf diagram.
For example, this stem-and-leaf diagram shows the marks
scored by 21 pupils in a maths test.
stem = tens, leaves = units
0 | 6 7 9
1 | 4 5 5 8
2 | 0 1 3 5 6 6
3 | 0 2 2 2 5 8
4 | 0 0
Find the median, mode and range for the data.
There are 21 data values so the median will be the 11th value, that is 25.
The mode is 32.
The range is 40 – 6, which is 34.
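The stem-and-leaf diagram can be turned back into a list of marks and checked in Python. This is a minimal sketch: the dictionary layout (stem mapped to its leaves) is an illustrative choice, not something taken from the slides.

```python
from statistics import median, multimode

# Each stem (tens) maps to its leaves (units), as in the diagram.
stem_and_leaf = {
    0: [6, 7, 9],
    1: [4, 5, 5, 8],
    2: [0, 1, 3, 5, 6, 6],
    3: [0, 2, 2, 2, 5, 8],
    4: [0, 0],
}

# Rebuild each mark as 10 × stem + leaf.
marks = [
    10 * stem + leaf
    for stem, leaves in stem_and_leaf.items()
    for leaf in leaves
]

print(len(marks))               # 21
print(median(marks))            # 25
print(multimode(marks))         # [32]
print(max(marks) - min(marks))  # 34
```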