LEVEL 3 CERTIFICATE
Topic Exploration Pack
H866/H867
QUANTITATIVE PROBLEM SOLVING (MEI)
QUANTITATIVE REASONING (MEI)
The Normal distribution
March 2015
We will inform centres about any changes to the specification. We will also
publish changes on our website. The latest version of our specification will
always be the one on our website (www.ocr.org.uk) and this may differ from
printed versions.
Copyright © 2015 OCR. All rights reserved.
Copyright
OCR retains the copyright on all its publications, including the specifications.
However, registered centres for OCR are permitted to copy material from this
specification booklet for their own internal use.
Oxford Cambridge and RSA Examinations is a Company Limited by Guarantee.
Registered in England. Registered company number 3484466.
Registered office: 1 Hills Road
Cambridge
CB1 2EU
OCR is an exempt charity.
Contents
Introduction
Part A Properties of the Normal curve
Part B Calculation of z-scores
Part C Testing for Normality
Activity 2 Comparing distributions - Teacher Guidance
This Topic Exploration Pack should accompany the OCR resource ‘The Normal Distribution’
learner activities, which you can download from the OCR website.
This activity offers an
opportunity for maths
skills development.
Introduction
There are three aspects of the Normal Distribution that need to be taught in Introduction to
Quantitative Reasoning:
Firstly, learners need to recognise and use the basic properties of the Normal distribution. These
are that:
• it forms a bell-shaped curve;
• it is symmetrical about the mean, which equals the mode, although learners should
understand that real data samples are rarely perfectly symmetrical;
• 68% of the data items lie within 1 standard deviation of the mean, 95% lie within 2 standard
deviations and 99.7% lie within 3 standard deviations.
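The three percentages quoted above can be checked against the theoretical Normal curve. The following is a minimal sketch using Python's standard library; the function name `proportion_within` is an illustrative choice and is not part of the pack or its Excel workbook.

```python
from statistics import NormalDist


def proportion_within(k: float) -> float:
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {proportion_within(k):.2%}")
```

The printed values are slightly more precise than the rounded 68%, 95% and 99.7% figures that learners are expected to use.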
Secondly, learners need to be able to calculate z-scores, understand that a z-score represents a
number of standard deviations from the mean and use z-scores to make comparisons.
Finally, they should be able to interpret a Normal probability plot when testing for Normality using
statistical software. This will only require them to recognise that a straight line plot indicates
Normality.
Prior Knowledge
Students will need to be familiar with:
• Mean, median and mode, which should already have been covered by all students at
GCSE.
• The variability of sample data. The GCSE subject content includes, at both Foundation and
Higher, the limitations of sampling and the fact that empirical, unbiased samples tend
towards theoretical probability distributions with increasing sample size.
• Constructing and interpreting frequency charts from grouped and continuous data. All
students should already be able to interpret, analyse and compare the distributions of data
sets through appropriate graphical representation involving discrete, continuous and grouped
data. However, only those who study for Higher GCSE are required to construct frequency
charts from grouped continuous data.
• The difference between discrete and continuous variables. It is worth discussing the fact that
all recorded data is discrete because we have to measure to a given level of accuracy.
• The concept of standard deviation as a measure of spread, which is not covered at GCSE
and therefore needs to be taught.
Teaching points/Misconceptions
Many measurements, such as height and weight, approximately follow a Normal distribution and so
students do not generally find it difficult to recognise the shape of the Normal distribution and
identify where it might occur; it falls within their experience that most measurements will be
towards the middle and a few will be at the extremes.
Students are not required to calculate standard deviation in the examination. However, in order to
understand what standard deviation is, it would be useful for them to know how it is calculated. Its
calculation as the (square root of the) average of the squared differences from the mean is not
difficult to introduce and will show students that it is a fairly natural and transparent way to measure
spread.
Some students may believe that all z-scores lie between −3 and 3. This view can be discouraged
by drawing sketches showing that values fall outside this range. Others may believe that z-scores
can take any values, failing to consider that most real distributions will have natural limitations - for
example, length cannot be negative and test scores cannot be greater than 100%.
Part A Properties of the Normal curve
At the start of the topic, the idea of distribution will need to be revised. This could be done by using
the Standards Units card sets S5 and S6:
• S5 Card Set A: ask students in pairs to group the charts in as many ways as they can (they
might group by symmetry, skew, averages, range, total frequency or others they think of).
• S6 Card Sets A and B: ask students to match the frequency graphs with the statements.
Alternatively, students could generate data and compare the resulting distributions.
For example:
• Roll one six-sided dice 50 times and record the scores (rectangular).
• Roll a single dice until they get a six and count the number of rolls it takes - ask
several pairs to do this and collate the data as it takes a while (geometric).
• Roll 2 dice 50 times and record the sum each time (roughly Normal; strictly, triangular).
• Roll ten six-sided dice 30 times and record the number of 2s (binomial).
Many students will find the geometric distribution in example 2 very counter-intuitive; teachers
may wish to allow time to discuss this in more detail and explore the idea that probabilities can
often lead to unexpected outcomes.
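If dice are in short supply, the four experiments can be simulated. The sketch below is one possible way to do this in Python; the function names, the fixed seed and the text bar charts are illustrative assumptions rather than part of the pack, and the trial counts follow the bullet points above.

```python
import random
from collections import Counter


def roll(rng):
    """One roll of a fair six-sided dice."""
    return rng.randint(1, 6)


def single_dice(rng, trials=50):
    """Scores from rolling one dice (rectangular)."""
    return Counter(roll(rng) for _ in range(trials))


def rolls_until_six(rng, trials=50):
    """Number of rolls needed to get a six (geometric)."""
    counts = Counter()
    for _ in range(trials):
        n = 1
        while roll(rng) != 6:
            n += 1
        counts[n] += 1
    return counts


def sum_of_two(rng, trials=50):
    """Sum of two dice."""
    return Counter(roll(rng) + roll(rng) for _ in range(trials))


def twos_in_ten(rng, trials=30):
    """Number of 2s when ten dice are rolled (binomial)."""
    return Counter(sum(roll(rng) == 2 for _ in range(10)) for _ in range(trials))


def show(title, counts):
    """Print a simple text bar chart of the frequencies."""
    print(title)
    for value in sorted(counts):
        print(f"{value:>3} {'#' * counts[value]}")


rng = random.Random(2015)  # fixed seed so that a class sees the same output
show("One dice, 50 rolls", single_dice(rng))
show("Rolls until a six", rolls_until_six(rng))
show("Sum of two dice", sum_of_two(rng))
show("Number of 2s in ten dice", twos_in_ten(rng))
```

Changing the seed, or increasing `trials`, gives a quick way to discuss how much samples vary from one run to the next.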
The accompanying Excel workbook ‘The Normal distribution’ (worksheet ‘Distributions’) will create
these graphs for you. Enter the frequencies for an experiment in one of the blue columns.
Or, if you have data you can use, you could compare real distributions that might interest your
students.
Students should understand that standard deviation is the measure of spread most commonly
used by statisticians because it:
• relates to the mean,
• can be manipulated algebraically,
• takes into account all the data,
• is not too sensitive to outlying values.
They do not need to know the formulae or to calculate standard deviation but it will help them to
understand if the method is explained. This could be done by generating some class data (for
example, ask them to measure their pulse rates over 1 minute).
Then ask the following questions, getting the class to do the calculations with their data (in small
groups of 6-8 if possible):
(a) We want to find out how far the data is, on average, from the mean. Can you suggest a
good way to start?
Answer: Subtracting the mean from each item of data to find how far each measurement is from
the mean.
(b) We want to find the average difference - how could we do that?
Answer: Find the mean difference by adding up the differences and dividing by the number of data
items. (Students will find that the total is always zero - explore why).
(c) What could we do to avoid the total being zero?
Answer: Either ignore the negative signs or square the differences.
(d) Which method would be better?
Answer: This is genuinely debatable, but the absolute values are not generally used because,
algebraically, the modulus sign is difficult to manipulate. Standard deviation is the name of the
measure which uses the squared values and which is most commonly used.
(e) Finish by explaining that the mean of the squared differences is called the variance and that
the square root of this value is called the standard deviation. We square root because the
standard deviation needs to be in the same units as the original data, not in squared units
which is what the variance is measured in (because we squared the differences).
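Questions (a) to (e) can be followed step by step in a few lines of Python. This is a minimal sketch: the pulse rates are made-up values for illustration, not data from the pack, and the calculation divides by the number of data items, as described in the answers above.

```python
from math import sqrt

# Hypothetical pulse rates (beats per minute) for a group of 8 students
pulse_rates = [62, 70, 74, 68, 80, 66, 72, 76]

# (a) Subtract the mean from each item of data
mean = sum(pulse_rates) / len(pulse_rates)
differences = [x - mean for x in pulse_rates]

# (b) The differences always add up to zero (up to rounding)
total_difference = sum(differences)

# (c) Option 1: ignore the negative signs
mean_absolute_difference = sum(abs(d) for d in differences) / len(differences)

# (c) Option 2 and (e): square the differences, average them, then square root
variance = sum(d ** 2 for d in differences) / len(differences)
standard_deviation = sqrt(variance)

print(f"mean = {mean}")
print(f"total of differences = {total_difference}")
print(f"mean absolute difference = {mean_absolute_difference}")
print(f"variance = {variance}")
print(f"standard deviation = {standard_deviation:.2f}")
```

For this made-up data the mean is 71, the variance is 29 and the standard deviation is about 5.39 beats per minute. Python's `statistics.pstdev` gives the same result; `statistics.stdev` divides by one fewer than the number of items and so gives a slightly larger value.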
The Normal distribution could be introduced by showing the Normal distribution in fish populations
presented by Marcus du Sautoy.
Students can then do Activity 1 which involves them collecting data that is approximately Normally
distributed and finding out what proportion of the data lies within 1, 2 and 3 standard deviations of
the mean. There are many sets of data you could get your class to collect such as heights,
estimates, or length of time breath can be held. However, they will need about 100 results to do
this activity properly so they need to collect a large number of measurements quickly. Some
practical ideas suggested by the Centre for Innovation in Mathematics Teaching are:
1. Lengths of leaves. Evergreen bushes such as laurel are useful - though make sure all the
leaves are from the same year's growth.
2. Weights of crisp packets. Borrow a box of crisps from a canteen and weigh each packet
accurately on a balance from the science laboratory.
3. Pieces of string. Look at 10cm on a ruler and then take a ball of string and try to cut 100
lengths of 10cm by guessing. Measure the lengths of all the pieces in mm.
4. Weights of apples. If anyone has apple trees in their garden they are bound to have large
quantities in the autumn.
5. Size of pebbles on a beach. Geographers often look at these to study the movement of
beaches. Use a pair of callipers then measure on a ruler.
6. Game of bowls. Make a line with a piece of rope on the grass about 20 metres away. Let
everyone have several goes at trying to land a tennis ball on the line. Measure how far each
ball is from the line.
Students could work in groups, or as a whole class. The main teaching points are that:
• the graph forms a bell-shaped curve,
• the graph is symmetrical about the mean, and the mean and mode are roughly the same,
• roughly 68% of the data items lie within 1 standard deviation of the mean, 95% lie within 2
standard deviations and 99.7% lie within 3 standard deviations,
• real data samples are rarely perfectly symmetrical and the percentages from even a large
sample may not be exact,
• there are tests which statisticians use to check whether a set of data is close enough to
assume that it is Normal.
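For classes that prefer code to a spreadsheet, the proportions within 1, 2 and 3 standard deviations can be found as sketched below. The data here is simulated (100 guessed string lengths in mm, generated with an assumed mean of 100 and standard deviation of 8), so it stands in for whatever the class actually collects; the function name `proportions_within` is illustrative.

```python
import random
from statistics import mean, pstdev


def proportions_within(data, ks=(1, 2, 3)):
    """Proportion of the data within k standard deviations of the mean, for each k."""
    m = mean(data)
    s = pstdev(data)  # divides by the number of data items
    return {k: sum(abs(x - m) <= k * s for x in data) / len(data) for k in ks}


# Simulated stand-in for 100 class measurements (assumed values, not real data)
rng = random.Random(1)
lengths_mm = [round(rng.gauss(100, 8)) for _ in range(100)]

for k, proportion in proportions_within(lengths_mm).items():
    print(f"within {k} standard deviation(s): {proportion:.0%}")
```

The results will usually be close to, but not exactly, 68%, 95% and 99.7%, which illustrates the point that the percentages from even a large sample may not be exact.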
Students could use the Excel workbook ‘The Normal distribution’ (worksheet ‘Activity 1 Data’) to
plot their data and to calculate the mean and standard deviation. If you use the spreadsheet, enter
the raw data in the first green column. Then decide on some (equal) class intervals and enter the
upper bounds in the second green column. You may prefer them to group and plot the data by
hand, use a scientific calculator to find the mean and standard deviation, or set up their own
spreadsheet to plot their data and calculate the values.
Part B Calculation of z-scores
You could now watch the video clip Against All Odds: Normal Calculations which uses the context
of a club for tall people to compare the entry requirements for men and women by calculating z-scores. The suggested sections to use are:
00:21 to 04:00 revises the Normal distribution and introduces the Beanstalks. At 04:00 you could
stop and discuss with the group how many standard deviations away from the mean 5’10” is and
develop the idea of a z-score being $\frac{\text{value} - \text{mean}}{\text{standard deviation}}$, often written $\frac{x - \mu}{\sigma}$.
06:16 to 09:00 explains the calculation of z-scores. You may wish to stop the clip at 09:00 and
explain the use of tables to find exact percentages before continuing with the video. Students will
not be examined on this, but it is difficult to interpret z-scores without understanding that exact
percentages can be found for any z-score.
09:00 to 10:41 explains the use of z-scores to compare values from different distributions and
interprets the values for the male and female Beanstalks.
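The comparison made in the clip can be sketched in code. In the example below, only the 5'10" (70 inch) height comes from the text above; the men's entry height and all the means and standard deviations are assumed values chosen for illustration and should be replaced with the figures given in the video.

```python
def z_score(value, mean, standard_deviation):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    if standard_deviation <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / standard_deviation


# Heights in inches. All means and standard deviations are hypothetical.
women_z = z_score(70, mean=64, standard_deviation=2.5)    # 5'10" entry height
men_z = z_score(74, mean=69.5, standard_deviation=3)      # assumed 6'2" entry height

print(f"women: z = {women_z:.2f}")
print(f"men:   z = {men_z:.2f}")

if women_z > men_z:
    print("The women's requirement is further above the mean, so it is harder to meet.")
else:
    print("The men's requirement is at least as far above the mean.")
```

Because z-scores measure distance from the mean in standard deviations, they allow values from two different distributions to be compared on the same scale.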
Students could then do Activity 2 (see Activity 2 - Teacher Guidance below and Excel workbook
‘The Normal distribution’ (worksheet ‘Activity 2 Dice’)). Given two dice games, one where you
score the sum of three dice and one where you score the sum of 5 dice, in which game are you
more likely to score 10 or more? This activity requires the collection of discrete data but the
distributions will approximate Normal distributions.
Part C Testing for Normality
The data collected in Activities 1 and 2 may have formed a symmetrical, bell-shaped curve when
plotted or they may not: when we take even a large random sample, there is always a chance that
our sample is not itself distributed normally, even if the population is. Students may have
questioned whether or not their frequency graphs were Normal and this is a very good question to
ask.
Statisticians often need to test whether their data is Normal in order to decide whether the
assumptions they are making are correct and it is not always possible to do this by plotting the
sample data. They have therefore developed statistical methods and software to check data for
Normality. There are several ways in which this can be done and students will not be examined on
these methods. What they need to do is to be able to interpret a Normal probability plot. The
simple rule is that, if the plot is close to a straight line, the data may be considered to be Normally
distributed. If the plot does not resemble a straight line, then the data cannot be considered to be
Normally distributed. This is because the probability plots essentially plot observed probabilities
against those expected if the distribution were Normal, so a perfect straight line would indicate that
all the data was exactly as expected.


[Figure: two Normal probability plots, one labelled "Normally distributed" and one labelled "Not Normally distributed"]
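To see why a straight line indicates Normality, the points of a probability plot can be calculated directly. The sketch below uses one common version of the plot, which pairs each ordered data value with the value expected at that position in a standard Normal distribution; statistical software may use a slightly different convention. The plotting positions, function names and simulated data sets are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def probability_plot_points(data):
    """Pair each ordered data value with its expected standard Normal value."""
    ordered = sorted(data)
    n = len(ordered)
    standard_normal = NormalDist()
    # Plotting position (i + 0.5) / n is one common choice among several
    expected = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(expected, ordered))


def straightness(points):
    """Correlation between expected and observed values (1 means a perfect line)."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(3)
normal_like = [rng.gauss(50, 10) for _ in range(200)]   # simulated Normal data
skewed = [rng.expovariate(1 / 10) for _ in range(200)]  # simulated skewed data

print(f"Normal-like data: {straightness(probability_plot_points(normal_like)):.3f}")
print(f"Skewed data:      {straightness(probability_plot_points(skewed)):.3f}")
```

The closer the correlation is to 1, the closer the plotted points lie to a straight line. Students are only expected to judge this by eye from the plot, not to calculate it.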
To put this in a context, you could show the clip Against All Odds: Checking Assumption of
Normality which is about egg sizes and hen weights on a chicken farm. The clip refers to boxplots, which students should be familiar with. It also refers to ‘bin-sizes’ which we usually call
‘class-intervals’.
Students could then undertake Activity 3, using the Excel workbook ‘The Normal distribution’
(worksheet ‘Plots’) to check whether their data from Activities 1 and 2, or any other data they wish
to collect or make up, are distributed Normally.
Activity 2 Comparing distributions - Teacher Guidance
Initially, you could just pose the problem: “Given two dice games, one where you score the sum of
three dice and one where you score the sum of 5 dice, in which game are you more likely to score
10 or more? How much more likely?” It might be fairly obvious that you are more likely to get 10 or
more with 5 dice but how much more likely is not so easy to answer. Twice as likely? Three times?
The group might suggest rolling some dice and seeing what happens, which is a good place to
start.
1. Split the class into two, one group rolling three fair six-sided dice and the other group rolling
5 of the same dice. Each group will need to roll their dice at least 100 times so it would be
best, if you have enough dice, to have more than one sub-group doing each. Agree with
them how they are going to collect the data (tally chart including all possible scores).
2. See which group had more scores that were 10 or higher, but also enter the data for each
group into the spreadsheet and calculate the mean and standard deviation for each group.
Plot the data using the spreadsheet and compare the distributions. They should approximate
those below:
3. Discuss the difference between the theoretical distributions and the real data. Explain that we
are going to use the real data to estimate the mean and standard deviation of the theoretical
distributions because we don’t know how to work them out exactly (this is not difficult, but
beyond the scope of Core Maths - could set as a challenge for small $n$).
4. Ask class to find how many standard deviations away from the mean of each distribution 10
is (that is, find the z-scores for the value 10). The theoretical z-scores are −0.17 for three
dice and −1.96 for five dice.
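5. Taking about 97.5% of scores to lie above z = −1.96 and about 55% to lie above z = −0.17,
$\frac{97.5}{55} \approx 1.77$, so a score of 10 or more is one and a half to two times as likely
with 5 dice as with 3. Is this borne out by their experimental results?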
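For teachers who want to check the theoretical figures, or to set the exact calculation as a challenge, the distributions of the dice totals can be worked out exactly by listing every outcome. The following is a minimal sketch; the function names are illustrative and the approach (building up the distribution one dice at a time) is one of several that would work.

```python
from fractions import Fraction
from math import sqrt


def sum_distribution(n_dice):
    """Exact probability of each total when n_dice fair six-sided dice are rolled."""
    dist = {0: Fraction(1)}
    for _ in range(n_dice):
        next_dist = {}
        for total, p in dist.items():
            for face in range(1, 7):
                key = total + face
                next_dist[key] = next_dist.get(key, Fraction(0)) + p / 6
        dist = next_dist
    return dist


def mean_and_sd(dist):
    """Theoretical mean and standard deviation of a distribution."""
    mean = sum(value * p for value, p in dist.items())
    variance = sum((value - mean) ** 2 * p for value, p in dist.items())
    return float(mean), sqrt(variance)


def prob_at_least(dist, threshold):
    """Exact probability of scoring threshold or more."""
    return sum(p for value, p in dist.items() if value >= threshold)


for n in (3, 5):
    dist = sum_distribution(n)
    mean, sd = mean_and_sd(dist)
    z = (10 - mean) / sd
    p = prob_at_least(dist, 10)
    print(f"{n} dice: mean {mean:.1f}, SD {sd:.2f}, "
          f"z for 10 = {z:.2f}, P(10 or more) = {float(p):.3f}")

ratio = prob_at_least(sum_distribution(5), 10) / prob_at_least(sum_distribution(3), 10)
print(f"5 dice compared with 3 dice: {float(ratio):.2f} times as likely")
```

The z-scores agree with the theoretical values of −0.17 and −1.96 quoted in step 4. The exact probabilities give a ratio of about 1.57, which is a little lower than the estimate from the Normal curve but still within "one and a half to two times as likely".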
If you do not currently offer this OCR qualification but would like to do so, please complete the
Expression of Interest Form which can be found here: www.ocr.org.uk/expression-of-interest
OCR Resources: the small print
OCR’s resources are provided to support the teaching of OCR specifications, but in no way constitute an endorsed teaching method that is required by the Board,
and the decision to use them lies with the individual teacher. Whilst every effort is made to ensure the accuracy of the content, OCR cannot be held responsible for
any errors or omissions within these resources. We update our resources on a regular basis, so please check the OCR website to ensure you have the most up to
date version.
© OCR 2015 - This resource may be freely copied and distributed, as long as the OCR logo and this message remain intact and OCR is acknowledged as the
originator of this work.
OCR acknowledges the use of the following content: Thumbs up and down icons: alexwhite/Shutterstock.com
Please get in touch if you want to discuss the accessibility of resources we offer to support delivery of our qualifications: [email protected]
OCR customer contact centre
General qualifications
Telephone 01223 553998
Facsimile 01223 552627
Email [email protected]
For staff training purposes and as part of our quality assurance programme your call may be recorded or monitored.