Teach Probability and Statistics for understanding page
1
Teach probability and statistics for understanding
Statistics
1 Representing data
2 Collecting simple category data and count frequencies
3 Presenting data using pictograms and bar graphs
4 Gathering of frequency data for different data types
5 Displaying frequency data for different data types
6 Designing questions for data collecting
7 Collecting, displaying and interpreting data, including using software
8 Using questionnaires to obtain discrete and continuous data
9 Presenting data in appropriate summary statistics and displays
10 Using grouping of data and histograms
11 Interpreting summary statistics and displays to answer questions
12 Using technology to organise data tables and displays (dot plots, stem and leaf plots, column graphs, bar charts and histograms)
13 Calculating and interpreting summary statistics (mean, median, mode, range, difference)
14 Using technology to organise data tables and displays (dot plots, stem and leaf plots, column graphs, bar charts and histograms)
15 Calculating and interpreting summary statistics (mean, median, mode, range, difference)
16 Sampling and using it in a survey
17 Calculating and interpreting summary statistics from univariate data (mode, median and mean, box plot, inter-quartile range, outliers)
18 Interpreting summary statistics (mode, median and mean, box plot, inter-quartile range, outliers) and displays
19 Interpreting and predicting from association
Probability
20 Predicting the outcome of chance events in words
21 Using intuitive chance language to compare events
22 Planning chance experiments, including collecting and displaying data
23 Comparing chances, including ‘fairness’
24 Recognising equally likely events
25 Representing estimations of chance as fractions, decimals or percentages
26 Developing and testing conjectures; using a counter-example to disprove
27 Simulating chance events using random devices, including calculators and computers
28 Using variable chance experiments to get closer to true probabilities 'in the long run'
29 Calculating equally likely probabilities by listing all the possible outcomes of an event
30 Understanding probability as long-run relative frequency
31 Calculating simple probabilities of equally likely events
32 Generating random numbers for simulations using technology
33 Using tree diagrams to predict probabilities for two-event experiments
34 Estimating probability based on data tables
35 Estimating probability based on real experiments
36 Listing event spaces (for up to 3 events) to calculate probabilities
37 Calculating probabilities (for complementary, mutually exclusive, and compound events)
Resources for learning
The curriculum described in this section does not use textbooks. Instead it calls on the wealth of high quality
learning resources that are available, mainly through MAV.
Lesson plans
Maths300
This is available from MAV. An annual subscription provides a user name and password for on-line
access to over 170 lesson plans, many with high quality associated software.
RIME
This collection of lesson plans is also available from MAV. There are three books in the series, also available
on CD, with extra spreadsheets. Choose RIME (Measurement, Space, Chance & data).
RIME 5&6 A set of RIME-style lessons written specifically for upper primary. From MAV.
Chance & Data Investigations Volumes 1 & 2
Chance (Vol 1) and data (Vol 2) investigations designed to take many days, for many year levels, from MAV.
Teaching advice
Continuum
Guidelines in Measurement
People Count A book for teachers who want the maths explained to them clearly.
Problem solving
Maths With Attitude:
For each content dimension and for Years 3 to 10, this is a repackaging of the best Maths300 lessons and
the best 20 Mathematics tasks, with a useful guide. From MAV.
Mathematics Task Centre:
This collection of Mathematics tasks is available from www.blackdouglas.com.au/taskcentre, or as part
of the Maths With Attitude kits, from MAV.
Worksheets
Active Learning: This is a set of graded worksheets in three books from MAV. They are also available on one
CD containing the contents of all three books in the series, plus extra worksheets describing how to use the
hundreds of spreadsheets also on the CD.
Active Learning 2: More of the same in three books, and also on CD.
Tuning in with task cards – lower, middle, upper primary. Each book is a set of 150 work cards to guide
students into hands-on activities. From MAV
Certain Number – a book of hands-on learning combining the ideas of chance with those of number; MAV
Maths Investigations
This is a collection of open-ended questions with guidance for exploring real-life situations mainly in
measurement and data. Mainly middle years, from MAV.
Maths at Work: A classic book of real life applications for less academic secondary school students, it is
now only in worksheet form on CD, from MAV.
Computers
Interactive Learning
One CD from MAV, containing hundreds of spreadsheets requiring no knowledge of Excel, and covering all
levels and dimensions. Very useful for homework!
Learning objects (FUSE or Scootle)
Resources from MAV may be purchased on-line with a credit card or school order number,
using the MAV’s web site: www.mav.vic.edu.au/shop.
1
Representing data
Activities (from Guidelines in Measurement)
2
Collecting simple category data and count frequencies
Category data are qualities for which we do not use numbers: colours, size (big, small), names (of people,
songs, teams) etc. When we find them we often count how many there are of each. These are their
frequencies. At this level we might just organise groups, or make a list, and then count.
Suggested activities (from Guidelines in Measurement)
1. Count boys and girls
2. Count numbers with cats or dogs
3. Count those who walk, are driven or take the bus to school
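Where a computer is available, the same counting can be sketched in a few lines of Python. This is only an illustration: the list of responses is invented for the example and is not data from this document.

```python
from collections import Counter

# Hypothetical class responses to "How do you get to school?"
responses = ["walk", "bus", "car", "walk", "car", "walk", "bus", "walk"]

# Counter tallies how many times each category appears: its frequency.
frequencies = Counter(responses)

# Print the categories from most to least common.
for category, count in frequencies.most_common():
    print(category, count)
```

Each category appears once in the output, next to the number of children who gave that answer.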
3
Presenting data using pictograms and bar graphs
Once we have our list we can represent it graphically. This is a reorganisation of the data, and children
may need help to see it happening.
Suggested activities
1.
People graphs
Ask children to form lines based on birth month, favourite season, football team, etc. The finished
groups themselves are the graph, with the children as parts of it.
2.
Pictograms
Use coloured stickers with children’s names to reproduce a ‘people graph’ on a large sheet of paper.
3.
Bar or column graphs
If we remove the names and use rectangles we have a column graph (if vertical) or bar graph (if
horizontal).
4
Gathering of frequency data for different data types
This section is about getting the data. Sometimes we use questionnaires to ask, or we count or we measure.
Suggested activities
1.
Design a simple questionnaire
e.g. What are your favourite..., How many ___ do you have?
2.
Count objects with different characteristics
e.g. cars of different models, or colours
3.
Measure children
e.g. heights (nearest cm), masses (nearest kg), arm reach (nearest cm),...
Activities (from Guidelines in Measurement)
5
Displaying frequency data for different data types, including using
software
Once we have the data we are able to display it for others to see.
This is a good time to start to use suitable software.
Suggested activities
1.
Raw data displays
Children might use lengths of wool to represent their heights. Taping these to a wall shows the raw
data.
2.
Frequency displays (columns, bars)
The raw data can be combined into groups of the ‘same’ height, and the numbers at each height can
be counted. This leads to a column or bar graph.
3.
Software
The spreadsheets below are designed to let simple data exploration take place with minimum fuss.
6
Designing questions for data collecting
This time we try to manage a few really good questions about ourselves. This could be the start of a simple
data-base. In Chance & Data Investigations Volume 1, look up Database Debut.
Suggested activities
1.
Design the questions
Once children have the ideas of what information they want they need to learn to ask clear questions
– a literacy task. The other important idea is that the responses must be clear and able to be
tabulated and graphed.
7
Collecting, displaying and interpreting data
Getting a lot of data is one thing, but organising it in a form suitable for making a display, and then
interpreting what it all means are added skills. A computer data base (e.g. in Excel) allows the computer to
sort and organise the data in the ways you want. It makes counting easier, and then will draw simple graphs
for you. However these skills take time to learn. Again read Database Debut.
Suggested activities
1.
Asking the questions and keeping records
It will be helpful for students to design a table into which they can record the responses of each
person they ask.
2.
Sorting the data
Simpler tables will include total frequencies of each response.
3.
Graphing the data
Technology can be used to create column graphs to show the frequencies.
4.
Interpreting the data
The final stage is to write the answers to the original questions, using the data as evidence.
8
Obtaining discrete and continuous data
The distinction between discrete and continuous data needs care. Numbers of things obtained by counting
are discrete, as fractions are not possible. An example of continuous data is a measurement such as height,
where fraction values are possible, even if we round to the nearest cm.
Suggested activities
1.
Asking for opinions
One common form of discrete data is the opinion poll. For example: 1 strongly agree, 2 agree, 3 no
opinion, 4 disagree, 5 strongly disagree. Asking unbiased questions is an important skill.
2.
Asking for continuous data
These questions should require the person asked to make a measurement, and to round the data.
Examples are hand span, head circumference, arm stretch.
9
Presenting data in appropriate summary statistics and displays
It is no simple skill to judge the most useful display for different data. The same applies to summary
statistics, such as the mean (the average), median (the middle value) or mode (the most common value).
Suggested activities
1.
Choosing the best graph
We have used frequencies to create column graphs or bar graphs. If the data shows a trend over
time it is common to plot points (to show the value) and join these with a dotted line.
Line plots show a set of up to about 25 different values above a number line. This is useful for
showing how the values are clustered, for example the hand-spans for an entire class.
Excel allows students to create pie charts easily. However this only makes sense when there are up
to 5 or 6 different categories that form part of a recognisable whole – the pie.
2.
Understanding means (averages)
The mean is the result of sharing the values equally. Imagine everyone puts all the cash they have
into a box, and then shares it equally; to do this mathematically they would add the values and then
divide by the number of people.
3.
Who uses the median and why?
The median is the middle value, or in the case of an even number of values it is the average of the
middle two (if they are different). For ‘symmetrical’ data sets the median is close to the mean, and
could be much easier to find. For non-symmetrical sets, the median might be a much better
representative of the majority of values. ‘Typical’ house values are given as medians.
4.
Modes: one or more?
Modes are found from the frequencies. Many data sets will have more than one mode, but a
symmetrical set with a single peak will have similar mode, mean and median.
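The three summary statistics can be compared on one small data set. The sketch below uses Python's standard `statistics` module; the pocket-money values are invented for illustration, with one very large value included on purpose.

```python
import statistics

# Hypothetical pocket-money amounts (dollars) for nine students.
values = [5, 5, 5, 10, 10, 10, 15, 20, 100]

mean = sum(values) / len(values)       # add the values, then share equally
median = statistics.median(values)     # the middle value once sorted
modes = statistics.multimode(values)   # every value with the highest frequency

print("mean:", mean)      # 20.0
print("median:", median)  # 10
print("modes:", modes)    # [5, 10]
```

Here the single value of 100 pulls the mean up to 20, while the median stays at 10, and the set has two modes.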
10
Using grouping of data and histograms
Histograms are often confused with column graphs because they look similar. However histograms use
grouped data, whereas column graphs represent the frequencies of single categories. Grouping is most
natural where the data is continuous; for example, heights can be 120 - 129 cm, 130 - 139 cm, etc. In this
case, the group 120 - 129 cm includes every height up to (but not including) 130 cm, so 129.9 cm belongs in it.
You do not round the lengths up but take the whole number part of the measurement.
Suggested activities
1.
Grouping of data, using constant group width
Student heights, masses, or many other body measurements are suitable. The best way to group
data is in easy-to-manage chunks. Start just below the smallest (or at zero if it is close) and go up in
either 5s or 10s, 20s, etc. The result is a ‘grouped frequency table’.
2.
Histograms, using equal width columns
Histograms look like column graphs but have a number line as the horizontal scale. Each column
shows the number of values in the range of that column. A grouped frequency table is best
represented by a histogram.
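A grouped frequency table with a constant group width can be sketched in Python as follows. The heights and the width of 10 cm are assumptions chosen for the example; the code takes the whole number part of each height before grouping, as described above.

```python
from collections import Counter

# Hypothetical student heights in cm.
heights = [121.4, 129.9, 130.0, 134.2, 138.7, 141.5, 127.3, 133.3, 145.0, 139.9]
width = 10  # constant group width

# Take the whole number part, then find the start of the group it falls in.
group_starts = [int(h) // width * width for h in heights]
table = Counter(group_starts)

# Print the grouped frequency table in order of group.
for start in sorted(table):
    print(f"{start} - {start + width - 1} cm: {table[start]}")
```

Note that 129.9 cm is counted in the 120 - 129 cm group, not rounded up to 130.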
11
Interpreting summary statistics and displays to answer questions
The most important skill with data is the proper interpretation of other people’s data. The media use a lot of
statistics, which are frequently used to confuse rather than clarify, or to argue for a particular interpretation. We
all need to develop skills that will help us think critically about data presented to us.
Suggested activities
1.
Interpreting tables
Much data is presented in tables. The skills required involve understanding exactly what the data
means, and looking for the key summaries (totals or percentages).
2.
Interpreting graphs
Commercial graphs are often cluttered with material that obscures their message. Look at the
variables shown, look for numbers on at least one scale, and look at relationships or trends.
3.
Interpreting summary statistics
Means are by far the most common summaries. Make sure students understand how a mean is a
representative of the total data set, by being a ‘central’ value. Look for cases of ‘skewed’ data, with
some very high values; in these cases the mean (average) will be unduly affected by the extremely
high values.
12
Using technology to organise data tables and displays
(dot plots, column graphs, bar charts)
The spreadsheets from Interactive Learning provide experience in Excel, the most common statistical
program used in industry and commerce. The second-hand data in the spreadsheets below comes from the
internet: AFL, the Netball League, the Bureau of Meteorology, the Olympic Games, and the Australian
Bureau of Statistics. Once they have the idea, students can explore much data freely available on the
internet.
Suggested activities
1.
Using raw data in a spreadsheet
Type the variable names at the top of each column. Enter the values for each person across each
row. Then Excel can be used to sort the data, to graph it and to find means or medians.
2.
Census in schools
The Australian Bureau of Statistics holds a large database of responses from students across
Australia. Schools are welcome to access these files and use the data. Contact Australian Bureau of
Statistics, Phone: 1800 623 273 or 03 9615 7505.
13
Calculating and interpreting summary statistics (mean, median,
mode)
This first foray into summary statistics deals with means (just common averages), medians (the middle ones)
and modes (the most common).
Suggested activities
1.
Choosing the best summary statistics
At this stage students can choose means (averages) or median (the middle value, or the average of
the middle two values). The other type is the mode (the most common, with the highest frequency);
there may be several modes. The choice will depend on whether or not the set is reasonably
symmetrical. If it is, and if the data is in order, the median is easy to find. If you have frequencies, the
mode is the easy way. If the data is clearly non-symmetrical, use the median.
14
Using technology to organise data tables and displays
(dot plots, stem and leaf plots and histograms)
The tools provided in Interactive Learning make this section easy to manage.
Dot plots (sometimes called line plots)
Up to about 25 dots are placed above a number line to show the data. Repeated values are stacked.
Suggested activities
1.
Parallel dot plots
Two sets of data can be placed above and below the same number line. This clearly shows similarities
and differences.
2.
Stem and leaf plots
The basic stemplot uses the tens digit on the vertical stem and ones digits as horizontal ‘leaves’.
If the data on the leaves is in order (smallest closest to the stem) then the plot is ‘ordered’, otherwise
‘unordered’. An ordered stemplot is also a quick way to get up to about 50 values in order.
3.
Back-to-back stem plots
Two sets of ‘leaves’, one to the left and one to the right of the same ‘stem’ makes comparing two data
sets easy.
4.
Grouping of data, using constant grouping range
Grouped frequency tables are convenient for continuous data, where there are many values that can be
logically combined. The most useful tables have 4 to 6 columns.
5.
Histograms, using equal width columns
A histogram is a graph based on a grouped frequency table. The horizontal number line shows the
grouping ranges and the vertical scale shows the frequencies.
6.
The effect of varying the grouping range
For some data it is interesting to see the effect on a histogram of changing the width of each column.
This can be done in the spreadsheet ‘Histogram’ below.
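For readers without the spreadsheets, an ordered stem and leaf plot can be sketched in Python. The function name `stem_and_leaf` and the test scores are invented for this illustration, and the sketch assumes non-negative whole numbers; stems with no values are simply left out.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return an ordered stemplot as a dict: tens digit -> sorted ones digits."""
    plot = defaultdict(list)
    for v in sorted(values):          # sorting first keeps each row of leaves in order
        plot[v // 10].append(v % 10)  # stem = tens, leaf = ones
    return dict(plot)

# Hypothetical test scores.
scores = [34, 27, 45, 31, 38, 29, 45, 40, 22, 36]
plot = stem_and_leaf(scores)

for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Reading the leaves row by row gives the scores in order, which is why the stemplot is a quick sorting tool.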
15
Calculating and interpreting summary statistics
(mean, median, mode, range, mean difference)
Students have met mean, median and mode in sections 9 and 13, but now we also consider the spread of
the data.
Suggested activities
1.
Choosing between mean, median and mode
It is useful to use, or invent, non-symmetrical data sets, and compare the values of the median and
the mean.
2.
Measuring the spread with the range
Suppose you were measuring a length. You are either well over or well under the mean value, but
never actually on it. Compare this ability to measure with someone who is always close and has the
same mean value. The range of values (the difference between the highest and lowest) is a
measure of the critical difference in spread.
3.
Measuring the spread with mean difference
The full range only takes the extremes into account and can be greatly influenced by ‘outlying’
values or outliers. A better measure of spread is the mean (positive) difference between each value
and the mean.
To achieve this on a spreadsheet, list the values in one column. Calculate the average (=AVERAGE)
and use a formula to find the (positive) difference between each value and the average (=ABS(V-A)). Then find the
average of these differences.
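The same calculation can be sketched outside a spreadsheet. The Python below follows the three steps in order; the five measurements are invented for illustration.

```python
# Hypothetical repeated measurements of the same length (cm).
values = [12, 15, 11, 18, 14]

# Step 1: the average of the values.
average = sum(values) / len(values)

# Step 2: the positive difference between each value and the average.
differences = [abs(v - average) for v in values]

# Step 3: the average of those differences is the mean difference.
mean_difference = sum(differences) / len(differences)

# For comparison, the range uses only the two extreme values.
value_range = max(values) - min(values)

print(average, mean_difference, value_range)  # 14.0 2.0 7
```

The range of 7 depends only on the values 11 and 18, while the mean difference of 2 uses every value.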
16
Sampling and using it in a survey
Up to this point any surveys that have taken place have used a ‘convenience’ sample, whoever was
available.
Why use a sample? Why not a census?
Samples are used because it can be far too expensive to survey everyone. The problem is that no sample
can be guaranteed to be representative of the whole population.
Ways of choosing a sample
The most common method is to try to be completely random. This means that all possible sample members
have an equal chance of being chosen. Note that even this does not guarantee that the sample will
‘represent’ the population well.
Other methods include ‘stratified random sampling’, e.g. randomly choosing males and randomly choosing
females, in the correct percentages to match the population.
Sample size
The larger the sample the more chance there is that its mean is close to the population mean.
Suggested activities
1.
Choosing randomly
Number all the possible people to ask from 1 to n. Use a table of random numbers, or a spreadsheet.
To get a random number from 1 to n in a spreadsheet cell, type this formula (replacing n by the
desired number): ‘=INT(RAND()*n+1)’.
2.
Deciding how many to choose
The sample size is a question of how much time (and money) you have, and how accurate you need
to be. The spreadsheet ‘Sample size’ allows you to explore this, with a prepared population of 100.
3.
Stratified random sampling
Sometimes it is wise to make sure the fractions in the sample match the fractions in the overall
population. To do this, choose the appropriate number in the sample for each group (e.g. each
gender) and then use random sampling to choose each sub-sample.
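Both kinds of sampling can be sketched in Python with the standard `random` module. Everything specific here is an assumption made for the example: the population of 30, the sample sizes, the 18/12 split between the two groups, and the fixed seed (used only so the run can be repeated).

```python
import random

rng = random.Random(1)  # fixed seed: an assumption, only so the example repeats

n = 30  # people numbered 1 to n

# Same idea as the spreadsheet formula =INT(RAND()*n+1): a whole number from 1 to n.
one_pick = int(rng.random() * n + 1)

# Simple random sample: 6 different people, each equally likely to be chosen.
simple_sample = rng.sample(range(1, n + 1), 6)

# Stratified random sample: a hypothetical population of 18 girls and 12 boys
# (60% and 40%), so a sample of 10 takes 6 girls and 4 boys, each chosen randomly.
girls = [f"G{i}" for i in range(1, 19)]
boys = [f"B{i}" for i in range(1, 13)]
stratified_sample = rng.sample(girls, 6) + rng.sample(boys, 4)

print(one_pick, simple_sample, stratified_sample)
```

Using `sample` rather than repeated single picks avoids choosing the same person twice.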
17
Calculating and interpreting summary statistics from univariate
data (inter-quartile range, outliers, box plot)
‘Univariate data’ means data with one statistical variable. This has been the case to this point; bi-variate data
appears only in item 19. In this section we introduce quartiles, leading to inter-quartile range and box plots.
Further measures of spread: quartiles and the inter-quartile range
Imagine the data arranged in order. Starting from the lowest value, the median is at the 50% mark. Then the
two quartiles (lower and upper) are at the 25% and 75% marks. The ‘inter-quartile range’ is simply the
difference between the upper and lower quartiles. Because it uses only the middle half of the data, it is not affected by extreme values.
Outliers
Given a set of data, the first thing to do is to look for, and possibly remove, extreme values, or outliers. Some
will be genuine, but many may be typos.
Box plots
A box plot (or ‘box and whisker’ because of the vague appearance of a cat face) uses the five values
(lowest, highest, median, lower and upper quartiles) to draw a rectangle that contains the middle 50% of
values, and shows the median. The ‘whiskers’ extend to the extremes, unless they are so extreme as to be
called outliers. It is drawn over (or under) a number line so that values may be read.
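The five values behind a box plot can be computed with Python's standard `statistics` module, as in the sketch below. Two things are assumptions rather than part of this document: the data set is invented, and the rule that flags values more than 1.5 times the inter-quartile range beyond the quartiles is a common convention, not one stated here. Different textbooks and programs also place the quartiles slightly differently; `method="inclusive"` is just one choice.

```python
import statistics

# Hypothetical weekly pocket money (dollars); 60 looks like an extreme value.
data = sorted([0, 5, 5, 8, 10, 10, 12, 15, 20, 60])

median = statistics.median(data)

# quantiles() with n=4 returns three cut points: lower quartile, median, upper quartile.
lower_q, _, upper_q = statistics.quantiles(data, n=4, method="inclusive")
iqr = upper_q - lower_q

# Common convention (an assumption here): flag values more than 1.5 x IQR
# below the lower quartile or above the upper quartile as possible outliers.
low_fence = lower_q - 1.5 * iqr
high_fence = upper_q + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print("lowest:", data[0], "highest:", data[-1])
print("quartiles:", lower_q, median, upper_q, "IQR:", iqr)
print("possible outliers:", outliers)
```

With this data the box runs from 5.75 to 14.25, and only the value 60 is flagged.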
Parallel box plots
Two or more boxplots drawn near the same number line enable comparisons of the sets they represent.
Suggested activities
1.
Data sources
Sets of data that are likely to have a good spread, and probably also be non-symmetrical, are
surveys of students’ pocket money and earnings (see Active Learning CD11), surveys of distances
travelled, and skewed natural data such as rainfall (e.g. Darwin, from the Bureau of Meteorology).
2. For a small data set, do the analysis manually.
3. For larger data sets, use the spreadsheets or the free software to create boxplots.
4. Focus on correct interpretation of boxplots.
5. Compare two similar data sets, such as rainfall for Darwin and Perth (or money access for
males and females), using parallel boxplots.
18
Interpreting summary statistics and displays
This section aims to focus on interpreting second-hand data, particularly from the media.
Interpreting tables
Much data is presented in tables. The skills required involve understanding exactly what the data means,
and looking for the key summaries (totals or percentages).
Interpreting graphs
Commercial graphs are often cluttered with material that obscures their message. Look at the variables
shown, look for numbers on at least one scale, and look at relationships or trends.
Interpreting summary statistics
Means (averages) are by far the most common summaries. When percentages are used, it is not always
made clear what they are percentages of!
Suggested activities
1.
Census at school
A ready source of material is available from ABS in the CensusAtSchool project.
<www.abs.gov.au/websitedbs/cashome.nsf/Home/Home>
2.
Bureaux of Statistics and Meteorology
Search the ABS <www.abs.gov.au/ausstats>, BOM <www.bom.gov.au> and other web sites
for summary statistics and displays that are interesting to interpret.
3.
Road accidents
There have been a number of in-depth studies of road accidents. The one that led to the
change to 50 km/h limit and pleas to reduce speeds by 5 km/h is found at
http://casr.adelaide.edu.au/speed/.
19
Interpreting and predicting from association
Association is a concept involving the relationship between two variables. For example, for growing children,
height is related to age; but this is not true for adults. Association can be strong (close) or weak.
Scatterplots
Values of one variable (the independent one, e.g. time) are placed using the horizontal axis, and the other
on the vertical. The result is a scatter of points.
Degree of association
A scatterplot shows clearly the extent of the relationship between the two variables. Only if the association is
close can prediction be a sensible activity.
Association is not ‘cause and effect’
Even if association is close, it does not follow that changes in one variable cause changes in the other. (It
doesn’t follow that they don’t either!)
Lines of fit and prediction
If the association is fairly close, you can draw a line of fit (or Trend line) either by eye, or using a
spreadsheet (or other statistical program). Computers will probably only offer ‘lines of best fit’ that use the
least squares method. In Excel, the process is as follows.
Enter the data into two columns in a table. Select it, choose Graph Wizard, and Scatterplot. Select
the graph and under Chart, choose Add Trendline. Choose the line (providing the data is not
obviously a different shape).
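To show what a least squares 'line of best fit' involves, here is a minimal Python sketch that works the line out directly. The age and height pairs are invented, and the helper name `predict` is hypothetical; a spreadsheet trendline should give the same line for the same data.

```python
# Hypothetical (age in months, height in cm) pairs for five children.
ages = [60, 72, 84, 96, 108]
heights = [110, 116, 121, 128, 133]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(heights) / n

# Least squares slope: sum of (x - mean_x)(y - mean_y) divided by sum of (x - mean_x) squared.
sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, heights))
sum_xx = sum((x - mean_x) ** 2 for x in ages)
slope = sum_xy / sum_xx

# The fitted line always passes through the point (mean_x, mean_y).
intercept = mean_y - slope * mean_x

def predict(age):
    """Predicted height (cm) from the fitted line, for an age in months."""
    return intercept + slope * age

print(round(slope, 3), round(intercept, 1), round(predict(90), 1))
```

Predictions are only sensible inside the range of the data; the line says nothing reliable about adults.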
Suggested activities
1.
Data for scatterplots
The basic tool is the scatterplot, so any sets of data that provide two variables for a reasonable
number of cases will do. For example, get some age (in months) and height data, compare
handspan and height for students (males and females are likely to be different) or plot winning and
losing scores in many football games. There is data in the resources below, and on the ABS, BOM,
and other websites.
2.
Focus on sensible interpretations.
Don’t try to link variables where the association is not strong. Where it is, you can use lines of fit and
make predictions.
20
Predicting the outcome of chance events in words
The big idea here is that some events are ‘certain’ and others are ‘unpredictable’ or ‘uncertain’ in outcome.
Being able to tell the difference is the start of the ‘chance’ journey. Once children have the idea that some
events are uncertain, they can describe some common events as ‘more certain than others’.
Suggested activities
1.
Play and discuss outcomes
Roll dice, or play other uncertain games, and discuss the difference between certainty and
uncertainty.
2.
Sure thing, or impossible
The extremes of uncertainty are easier to identify.
3.
Likely and unlikely
After a while some events will seem to be more likely than others.
21
Using intuitive chance language to compare events
Once children have the idea that some events are uncertain, they can describe some common events as
‘more certain than others’.
Suggested activities
1.
Ordering the language of chance
There are many common words that describe levels of uncertainty. What is not so clear is what degree
of uncertainty they describe. Here are some; students could try to put them into order.
certain, impossible, odds against, one in a million, fluke, unlikely, unexpected, maybe, perhaps, surely,
unsure, improbable, possible, hopeful, lucky...
22
Planning chance experiments, including collecting and displaying
data
The ideas of data collection are useful to compare the frequencies of each of the possible outcomes for
simple experiments.
Suggested activities
1.
Listing the outcomes
The first task is to work out what could happen, making a list of outcomes. Use coloured marbles
(pegs or unifix), dice, coins, cards, spinners, etc.
2.
Experimenting with chance
The next step is to select randomly – blindly if needed, and record the results in a table.
Finally display in a graph and interpret the results. Which are more likely? Will it always happen?
23
Comparing chances, including ‘fairness’
As a result of intuition and experimenting, students grasp that some games are not ‘fair’. ‘Fairness’ is a basic
idea in chance – it means that all possible outcomes have the same chance of occurring.
The concept of fairness
“Heads I win, tails you lose”. Not fair? What makes a game ‘fair’? Both players must have an equal chance
of winning.
Suggested activities
1.
Testing for fairness
Roll two dice, and add the numbers. If the total is 5, 6, 7, 8 or 9 I win, if 2, 3, 4, 10, 11, 12 you win.
So you have more totals than I do. Is this fair? Play the game in pairs and record the results. It is
soon clear that I will win about two-thirds of the time. The game is not fair.
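The 'two-thirds' figure can be checked by listing all 36 equally likely outcomes for two dice. The Python sketch below does the listing and counts the totals that win for 'me'; the variable names are chosen for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice, recorded as totals.
totals = [a + b for a, b in product(range(1, 7), repeat=2)]

# 'I' win on five different totals; 'you' win on the other six.
my_totals = {5, 6, 7, 8, 9}
my_wins = sum(1 for t in totals if t in my_totals)
my_chance = Fraction(my_wins, len(totals))

print(my_wins, "out of", len(totals), "=", my_chance)  # 24 out of 36 = 2/3
```

Having more totals does not help, because the middle totals can be made in many more ways than the extreme ones.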
24
Recognising equally likely events
Calculations with chance are only possible (at this level) once we know that all the outcomes are equally
likely. So ‘equally likely’ is a basic idea, basically the same as fairness – it means that all possible outcomes
have the same chance of occurring.
Equally likely from ‘symmetry’
Symmetry is visual; we judge that a coin has no bias to heads or tails, that a die has no visible bias.
Equally likely from evidence
The only way to be sure is to run an experiment e.g. roll the die hundreds of times. If each child does 100
and results are combined the results will be impressive; even though the totals may differ by quite a bit, the
graphs should be very similar in height – the differences will be greatly outweighed by the totals.
Suggested activities
1.
Equally likely using random numbers
The spreadsheets use random numbers from a computer. Are these really ‘equally likely’?
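One way to explore that question is to simulate a large number of die rolls and compare the results for each face. The sketch below uses Python's `random` module rather than the spreadsheets mentioned above; the number of rolls and the seed are arbitrary choices for the example.

```python
import random
from collections import Counter

rng = random.Random(2024)  # arbitrary seed so the run can be repeated

# Simulate 60 000 rolls of a six-sided die.
rolls = [rng.randint(1, 6) for _ in range(60000)]
counts = Counter(rolls)

# Relative frequency of each face: its count divided by the number of rolls.
relative = {face: counts[face] / len(rolls) for face in range(1, 7)}

for face in range(1, 7):
    print(face, counts[face], round(relative[face], 3))
```

The counts will differ from each other, possibly by a hundred or more, yet each relative frequency should sit close to one sixth (about 0.167).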
25
Representing estimations of chance as fractions, decimals or
percentages
Once the students have some skill with fractions, decimals and percentages they can use them to describe
levels of chance.
Suggested activities
1.
Putting a number on your intuition
There are many common words that describe levels of uncertainty. Here are some; students could
try to put them into order first, then give each a decimal value (between 0 for impossible and 1 for
certain). Note that chance values outside this range are impossible.
certain, impossible, odds against, one in a million, fluke, unlikely, unexpected, maybe, perhaps,
surely, unsure, improbable, possible, hopeful, lucky...
2.
Simple experiments to check intuition
What is your intuition about the chance of a plastic spoon landing bowl up when dropped?
Drop a spoon from a good height many times and test your intuition. The fraction (or decimal) value
will be the number of ‘bowl up’ cases divided by the total number of tries.
Repeat with a plastic cup, matchbox or drawing pin.
26
Developing and testing conjectures; using a counter-example to
disprove
The example given in VELS is ‘a 6 is harder to roll than a 1’. This perception comes from playing dice
games, where you need a 6 for special moves, or to get started.
Suggested activities
1.
Developing conjectures
Students may offer many of the common misconceptions. Here are some:
•
After a run of heads, a tail is more likely. No, the coin can’t remember what happened last time.
•
Coincidences are unlikely. In fact given a large enough sample of events to look at,
‘coincidences’ are very common.
•
Lotto numbers are equally likely to occur, so those most often chosen are less likely in the
future. (This is often seen as a version of “the law of averages”.)
•
In Lotto, the run 1,2,3,4,5,6 is less likely than any other set of six numbers.
•
An iPod knows which tunes you like (or don’t like) in sequence – it is not truly random.
It is very worthwhile to disprove these, but because chance is involved, it will take more than one
counter-example to convince some students!
2.
Check using VERY large numbers of trials
It is not helpful to run an experiment with only a few trials – largely because you might find that, on
this occasion, 6 is harder to roll!
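The conjecture ‘a 6 is harder to roll than a 1’ can also be tested with a simulated die. The sketch below assumes Python's built-in pseudo-random generator is a fair stand-in for a real die; the function name count_faces, the seed and the trial sizes are choices made for this illustration only.

```python
import random


def count_faces(n_rolls, seed):
    """Roll a simulated die n_rolls times and count how often each face appears."""
    rng = random.Random(seed)                  # fixed seed so the class sees the same run
    counts = {face: 0 for face in range(1, 7)}
    for _ in range(n_rolls):
        counts[rng.randint(1, 6)] += 1
    return counts


# Compare a very short experiment with a VERY long one.
for n_rolls in (12, 60_000):
    counts = count_faces(n_rolls, seed=1)
    print(n_rolls, "rolls:",
          "ones =", counts[1],
          "sixes =", counts[6],
          "fraction of sixes =", round(counts[6] / n_rolls, 3))
```

With only a few rolls the counts of 1s and 6s can differ noticeably in either direction; with tens of thousands the two fractions should both sit close to 1/6.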
27
Simulating chance events using random devices, including
calculators and computers
The use of coins or dice to predict outcomes for other events (such as a baby’s sex) is a powerful idea,
called simulation. Once the use of hand-held random devices is understood, students can move towards
using calculators or computers.
Suggested activities
1.
Using coins or dice
Coins are easy; how do you get equally likely events from a die?
Two equally likely events might be 1, 2, 3 vs 4, 5, 6; or even vs odd, etc.
Three equally likely events might be 1 or 2, 3 or 4, 5 or 6, or other pairings such as ‘add to 7’.
To simulate months where there are 12 equally likely events, roll once for first half/second half, then
roll again for the month in that half year. So 6 then 3 makes third month in second half = September.
2.
Using a calculator
Enter a two digit number, then multiply by another. Repeat multiply until the calculator freezes. Use
the right end digit(s). Alternative: enter a four digit number, and take the square root seven times.
Use the right end digit(s).
3.
Using a computer
Excel has a pseudo-random number generator. In one cell type “=RAND()”. This will produce a
random decimal between 0 and 1, just like probabilities.
To get random digits for dice, type “=RANDBETWEEN(1,6)”. Similar functions are used in the
chance spreadsheets below.
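The two-roll method for simulating months (from ‘Using coins or dice’ above) can be sketched in Python as follows. This is an illustration only: the function names are made up, and the first roll is assumed to split as 1, 2, 3 for the first half-year and 4, 5, 6 for the second, which is one of several fair ways to split a die in two.

```python
import random

MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def month_from_rolls(first_roll, second_roll):
    """First roll picks the half-year (assumed 1-3 first, 4-6 second);
    second roll picks the month within that half."""
    half_offset = 0 if first_roll <= 3 else 6
    return MONTHS[half_offset + second_roll - 1]


def random_month(rng=random):
    """Simulate one month using two rolls of a simulated die."""
    return month_from_rolls(rng.randint(1, 6), rng.randint(1, 6))


print(month_from_rolls(6, 3))   # the worked example from the text: 6 then 3
print(random_month())
```

Each of the 36 pairs of rolls is equally likely and each month is reached by exactly three of them, so the 12 months come out equally likely.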
28
Using ‘long run’ chance experiments to get close to true
probabilities
The idea of probability is that the fraction of successful attempts from experiments (called the ‘relative
frequency’) will get closer to the true fraction only after a very large number of trials – ‘in the long run’.
Simulation relies on this result to estimate probabilities we are unable to calculate.
Suggested activities
1.
Die rolling, getting more and more results
If students roll a die only 6 times it is actually quite unlikely that one of them will get all numbers
different. If they roll 60 each, equal frequencies are still unlikely. But if each uses a calculator to find
the decimal fractions such as (number of 1s ÷ total rolls) they will find the decimals are rather similar.
If they combine their results and repeat the calculation the results will be even more similar.
2.
Using a computer to perform ‘long run’ experiments
The spreadsheets below allow students to set the probability (e.g. of the drawing pin landing point
up) and then to run very many trials based on it.
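The die-rolling activity above can be illustrated with a short Python sketch. It first works out the exact chance that six rolls show six different numbers (6! favourable orders out of 6 × 6 × 6 × 6 × 6 × 6), then prints relative frequencies for longer and longer runs. The function name, seed and run lengths are assumptions made for this example.

```python
import random
from fractions import Fraction
from math import factorial

# Exact chance that six rolls show all six numbers: 6! orders out of 6**6.
p_all_different = Fraction(factorial(6), 6 ** 6)


def relative_frequencies(n_rolls, seed):
    """Return the relative frequency of each face (1 to 6) after n_rolls rolls."""
    rng = random.Random(seed)
    counts = [0] * 6
    for _ in range(n_rolls):
        counts[rng.randint(1, 6) - 1] += 1
    return [round(count / n_rolls, 3) for count in counts]


print("P(all six different) =", p_all_different,
      "approx", round(float(p_all_different), 4))
for n_rolls in (60, 600, 60_000):
    print(n_rolls, "rolls:", relative_frequencies(n_rolls, seed=2))
```

The six decimals should become more and more alike as the number of rolls grows, even though the raw counts are almost never exactly equal.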
29
Calculating equally likely probabilities by listing all the possible
outcomes of an event
For calculation of probability at this level we assume that all possible events are equally likely. Then we list
all the possible outcomes (results) and work out the fraction of the total that are successes. This will be the
probability – in theory. Any testing that takes place must take account of natural variability.
Suggested activities
1.
Testing for equal likelihood
Complex events (such as numbers of heads from two coins, sum from two dice) will not have equally likely
outcomes; for example the totals of two die numbers range from 2 to 12, with 7 the most likely.
2.
Listing all the possible outcomes
Single events (such as dice, cards, coins) will be simple to list.
3.
Calculating the ‘success fraction’ or probability
With all the possible outcomes listed, we just count the number that meet the criterion of ‘success’
and divide by the total number of possible outcomes. This is the probability. For example, in a pack
of cards, the number of aces is 4, the number of cards is 52, so the probability is 4 ÷ 52 = 1/13 ≈ 0.077.
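The claim that the totals of two dice are not equally likely can be checked by listing all 36 equally likely pairs and counting. The Python sketch below is one way to do this; the variable names are chosen for the illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) pairs.
outcomes = list(product(range(1, 7), repeat=2))

# Count how many pairs give each total, then turn counts into probabilities.
total_counts = Counter(a + b for a, b in outcomes)
total_probabilities = {total: Fraction(count, len(outcomes))
                       for total, count in sorted(total_counts.items())}

for total, probability in total_probabilities.items():
    print(total, probability)
```

The pairs are equally likely but the totals are not: more pairs add to 7 than to any other total.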
30
Understanding probability as long-run relative frequency
The relative frequency means the number of successes divided by the total number of attempts (i.e. trials).
When you have only tried a few times, this fraction can vary a lot, but for very large numbers of trials it stays
close to a single value: the probability.
Suggested activities
1.
Doing some ‘long run’ experiments, by combining data
Arrange for each student or small group to perform a simple experiment to test the predicted
probability. They will need to perform about 50 trials, and divide the number of successes by the
number of trials. The results will differ, but when combined they produce much larger numbers, and
the ‘success fraction’ (relative frequency of success) will be closer to the calculated probability.
2.
Using a computer for ‘long run’ experiments
The spreadsheets below allow students to set the probability (e.g. of the drawing pin landing point
up) and then to run very many trials based on it.
31
Calculating simple probabilities of equally likely events
At this level only single events are considered. Two-event experiments appear in section 33.
Suggested activities
1.
Testing for equal likelihood
Complex events (such as numbers of heads from two coins, sum from two dice) will not have equally likely
outcomes; for example the totals of two die numbers range from 2 to 12, with 7 the most likely.
2.
Listing all the possible outcomes
Single events (such as dice, cards, coins) will be simple to list. You can include criteria that involve
more than one characteristic, such as ‘red picture cards’.
3.
Calculating the ‘success fraction’ or probability
With all the possible outcomes listed, we just count the number that meet the criterion of ‘success’
and divide by the total number of possible outcomes. This is the probability. For example, in a pack
of cards, the number of ‘red picture cards’ is 6 (K, Q, J of hearts and diamonds), the number of cards
is 52, so the probability is 6 ÷ 52 = 3/26 = 0.115.
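The ‘list, count, divide’ method for the card example can be sketched in Python. The deck layout and the helper names (probability, is_red_picture) are assumptions made for this illustration; any criterion can be swapped in.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]

# List all the possible outcomes: 52 equally likely cards.
deck = list(product(RANKS, SUITS))


def probability(is_success):
    """Count the cards that meet the criterion and divide by the total."""
    successes = [card for card in deck if is_success(card)]
    return Fraction(len(successes), len(deck))


def is_red_picture(card):
    """Criterion with two characteristics: a picture card AND a red suit."""
    rank, suit = card
    return rank in {"J", "Q", "K"} and suit in {"hearts", "diamonds"}


p = probability(is_red_picture)
print(p, "approx", round(float(p), 3))
```

Changing the criterion, for example to ‘any ace’, only requires a different test function; the counting and dividing stay the same.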
32
Generating random numbers for simulations using technology
In this section we revisit the idea of simulation. The core of the simulation is the random number generator,
be it a coin (0 or 1), die (1 to 6), cards (suits, numbers, etc.) or calculator or computer. This section deals
with the skills to use a calculator or spreadsheet to generate random numbers.
Suggested activities
1.
Using a calculator
Enter a two digit number, then multiply by another. Repeat multiply until the calculator freezes. Use
the right end digit(s). Alternative: enter a four digit number, and take the square root seven times.
Use the right end digit(s).
2.
Using a computer
Excel has a pseudo-random number generator. In one cell type “=RAND()”. This will produce a
random decimal between 0 and 1, just like probabilities.
To get random digits for dice, type “=RANDBETWEEN(1,6)”. Similar functions are used in ‘Dice
football’ and the other chance spreadsheets below.
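For classes using Python instead of Excel, the standard random module offers rough counterparts to the two spreadsheet functions. The sketch below is an illustration only; the seed value and variable names are arbitrary choices, and the numbers are pseudo-random, as in Excel.

```python
import random

rng = random.Random(2024)   # fixed seed so a class can reproduce the same numbers

random_decimal = rng.random()        # decimal from 0 up to (not including) 1, like =RAND()
die_roll = rng.randint(1, 6)         # whole number from 1 to 6, like =RANDBETWEEN(1,6)
coin = rng.choice(["H", "T"])        # a simulated coin
ten_rolls = [rng.randint(1, 6) for _ in range(10)]

print(random_decimal, die_roll, coin, ten_rolls)
```

Removing the seed (using random.Random() with no argument) gives different numbers on every run.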
33
Using tree diagrams to predict probabilities for two-event
experiments
The graphic organiser called ‘tree diagrams’ is used to make sure that all the possible outcomes are listed.
Suggested activities
1.
Tree diagrams for ‘two-event’ experiments
Tree diagrams are graphic organisers that help make sure all the outcomes are listed. For example,
if the two-event experiment involves tossing a coin and then rolling a die, the tree diagram would
show 12 outcomes, like this. (This could be used to simulate choosing from 12 equally likely
months.)
(Tree diagram: two branches, H and T, each splitting into six branches for the die.)
H: H1, H2, H3, H4, H5, H6
T: T1, T2, T3, T4, T5, T6
2.
Predicting probabilities using tree diagrams
With all the possible outcomes listed, we just count the number that meet the criterion of ‘success’
and divide by the total number of possible outcomes. This is the probability.
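The leaves of the coin-then-die tree can be generated in Python, which is a useful check that none has been missed. In the sketch below the ‘success’ criterion (a head followed by an even number) is a hypothetical example chosen for illustration; it is not taken from the activity.

```python
from fractions import Fraction
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Each leaf of the tree is one coin result followed by one die result.
leaves = [f"{c}{d}" for c, d in product(coin, die)]
print(leaves)

# Hypothetical success criterion: a head followed by an even number.
successes = [leaf for leaf in leaves if leaf[0] == "H" and int(leaf[1]) % 2 == 0]
p_success = Fraction(len(successes), len(leaves))
print(successes, p_success)
```

Because every leaf is equally likely, counting leaves and dividing by 12 gives the probability directly.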
34
Estimating probability based on data tables
If we consider a table of second-hand data as the results of an ‘experiment’ then we can use the data to
make predictions. If there are very many trials (as in census data) then the result can be quite close to
probability, but we always call these predictions ‘estimates’.
Suggested activities
1.
Choosing suitable data
The most useful data involves a complete population. For example in Victoria the percentage of
families that have one parent is about 15%. Some suitable examples are provided in the
spreadsheets below.
2.
Interpreting the data using chance
The interpretation must be sure to use the same ‘unit’, (for example, families) and stay with the same
population (Victorian families). If we were to randomly select one Victorian family, we can estimate
the probability that there would be only one parent is 0.15. This probability varies with location within
Victoria as the data shows. In each spreadsheet below, students can carefully construct chance
interpretations from the real data.
35
Estimating probability based on experiments, simulations or
surveys
This section involves hands-on approaches to getting data from which to estimate probabilities.
Suggested activities
1.
Estimating from experiments
Suppose you roll a die twice and find the difference between the two numbers. The possibilities are
0, 1, 2, 3, 4, 5. Are they equally likely? (No.) What is the probability of each? You can work it out
theoretically, and you can do an experiment.
2.
Estimating from simulations
The experiment above can be done as a computer simulation using ‘Die rolling’. You can run up to
10 000 trials at once and see the graph.
3.
Estimating from surveys
The use of a survey to form an estimate should be discussed carefully. How can you choose the
sample so that it represents the population well?
For example: the ‘Opinion survey’ (spreadsheet) represents 24 questions asked of 148 students in
Year 9. How typical are they of all Year 9 students?
The ‘Young adults’ whose ages, heights and masses are shown on that spreadsheet are from Year
11. How representative are they?
The ‘Normal samples’ spreadsheet selects five numbers from a ‘normal’ population. How much do
different samples differ from each other?
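The ‘difference between two dice’ experiment from activity 1 can be worked out both ways in Python: theoretically, by listing all 36 pairs, and experimentally, by simulation. This is a sketch only; the function name simulate, the seed and the choice of 10 000 trials are assumptions for the illustration.

```python
import random
from collections import Counter
from fractions import Fraction
from itertools import product

# Theory: list all 36 equally likely pairs and count each difference.
pairs = list(product(range(1, 7), repeat=2))
exact_counts = Counter(abs(a - b) for a, b in pairs)
exact_probabilities = {d: Fraction(exact_counts[d], len(pairs)) for d in range(6)}


def simulate(n_trials, seed):
    """Estimate the probability of each difference from n_trials simulated double rolls."""
    rng = random.Random(seed)
    counts = Counter(abs(rng.randint(1, 6) - rng.randint(1, 6))
                     for _ in range(n_trials))
    return {d: counts[d] / n_trials for d in range(6)}


estimates = simulate(10_000, seed=3)
for d in range(6):
    print(d, exact_probabilities[d],
          round(float(exact_probabilities[d]), 3), round(estimates[d], 3))
```

The estimates will not match the theoretical values exactly, which is a good prompt for discussing natural variability.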
36
Listing event spaces (for up to 3 events) to calculate probabilities
The various techniques for listing all the possible outcomes are learned here. They include lists, grids, tree
diagrams, Venn diagrams and two-way tables. There is more in the section on Sets and logic in Structure.
Suggested activities
1.
Grids
Grids are sometimes called Carroll diagrams, after Lewis Carroll, who invented their use in this
context. The essential idea is that there are always two possibilities: having a characteristic or not.
They are easier to manage if there are only two variables involved.
Examples: roll a die (odd or even) and toss a coin (head or tail); choosing students by sex and by
age (under or over an exact age).
Two variables create four outcomes; for example, by comparing each student to the average values
of age and height we get this. For three variables the grid is split to make eight outcomes.
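(Grid for two variables: four cells)
            shorter    taller
younger
older
(Grid for three variables: each column split by sex, eight cells)
            shorter    taller
            M    F     M    F
younger
older
2.
Tree diagrams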
All the possible combined outcomes from three separate decisions (or ‘events’) can be listed using
three-stage tree diagrams. For example, a student’s level, sex and the system of the school. Using a
tree diagram, 12 possible combinations will be listed.
Note: If selecting things is included, it is important to know whether or not the selected item is
replaced, so that all are available for each selection. For example, using only the digits 0 to 9, the
three decisions could be: choose a multiple of 3 (i.e. 3, 6, 9), choose a factor of 12 (i.e. 1, 2, 3, 4,
6), choose a prime number (i.e. 2, 3, 5, 7). Without replacement, this can be very involved.
3.
Venn diagrams
When there is overlap it is useful to use loop diagrams named after John Venn. This is also useful to
describe sub-set relationships. The digit sets described above look like this. For example if three
digit numbers are formed from these choices in the order multiples, factors then primes, there will be
3 choices for multiples, 5 for factors and 4 for primes: 60 three-digit numbers in all.
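(Venn diagram of the digits 0 to 9, with three overlapping loops)
factors of 12: 1, 2, 3, 4, 6
primes: 2, 3, 5, 7
multiples of 3: 3, 6, 9
outside all three loops: 0, 8
4.
Two-way tables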
They are like Carroll diagrams, but have more than two choices at each stage. For example, the
data in ‘Education’ uses a simple grid: level (primary, secondary) vs sex (male, female). The third variable is
the system (Govt, Catholic, Independent). This allows all the possible combinations to be listed:
2 levels x 2 sexes x 3 systems makes 12 possibilities.
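The two counts quoted in this section (60 three-digit numbers and 12 combinations) can be checked by listing every case in Python. The sketch below assumes the digit sets given in the text and treats the 60 as the ‘with replacement’ case, where a digit may be used more than once; the variable names are made up for the illustration.

```python
from itertools import product

multiples_of_3 = [3, 6, 9]
factors_of_12 = [1, 2, 3, 4, 6]
primes = [2, 3, 5, 7]

# With replacement: any digit may appear in more than one position.
digit_choices = list(product(multiples_of_3, factors_of_12, primes))
three_digit_numbers = [100 * m + 10 * f + p for m, f, p in digit_choices]

# Without replacement: keep only the choices where all three digits differ.
without_replacement = [choice for choice in digit_choices if len(set(choice)) == 3]

# The 'Education' example: level x sex x system.
levels = ["Primary", "Secondary"]
sexes = ["male", "female"]
systems = ["Govt", "Catholic", "Independent"]
combinations = list(product(levels, sexes, systems))

print(len(three_digit_numbers), len(without_replacement), len(combinations))
```

Listing the cases also shows why ‘without replacement’ is more involved: the number of choices at each stage depends on what was chosen earlier.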
37
Calculating probabilities
(for complementary, mutually exclusive, and compound events)
Whether or not the laws of probability apply depends on the relationship between the two or more events.
For two events there are three possibilities.
Two complementary events are two events that do not overlap and together contain all the possibilities. You
can find the probability of one of them by subtracting the other probability from 1.
Mutually exclusive events have no overlap, so the simple form of the addition law applies: the probability of
one OR the other is the sum of their probabilities (a sum that cannot exceed 1). The multiplication law gives
the probability of both of two independent events occurring, such as results from two separate dice.
Compound events involve overlapping outcomes. For example on rolling a die, the events might be ‘getting a
factor of 12’ and ‘getting a factor of 10’. The outcomes 1 and 2 are common to both events. Alternatively one
event might be a subset of the other; for example ‘getting a factor of 12’ and ‘getting a factor of 6’ have the
outcomes 1, 2, 3 and 6 in common. A more involved addition law (with subtraction!) applies. The simple
multiplication law does not apply.
Suggested activities
1.
Using Venn diagrams to determine relationships: examples with die numbers
(Four Venn diagrams of the die numbers 1 to 6, labelled Complementary, Subset, Mutually exclusive and
Overlapping.)
2.
Complementary events
The probability of a number from 1 to 6 being a factor of 6 is 2/3, so the probability of not being a
factor of 6 is 1/3.
3.
Mutually exclusive events
This applies where there is no overlap i.e. no outcomes belonging to more than one set. The
separate probabilities add to no more than 1 (exactly 1 if the events are also complementary).
To find the probability of getting one result OR another, just add their separate probabilities.
To find the probability that a particular combination occurs on two separate, independent trials (i.e. one
result on the first AND another on the second), multiply their probabilities.
4.
Compound events
This applies where there is overlap, including the subset case. In both of the die examples above, the
separate probabilities add to more than 1, because of the overlap.
To find the probability of getting one result OR another, add their separate probabilities and then
subtract the probability of the intersection (overlap). Compare the two ‘Addition law’ spreadsheets.
To find the probability that any particular combination occurs (i.e. one AND also another) list the
cases. The simple multiplication law does not hold. The ‘Multiplication law’ spreadsheet uses two
differently coloured dice so there can be no outcomes belonging to both.
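The laws in this section can be checked on the die examples by treating each event as a set of outcomes. The Python sketch below is an illustration only; the helper names prob and factors_on_die are made up, and the two-dice line assumes the dice do not affect each other.

```python
from fractions import Fraction

die = set(range(1, 7))


def prob(event):
    """Probability of an event on one fair die: favourable outcomes over 6."""
    return Fraction(len(event & die), len(die))


def factors_on_die(n):
    """The die numbers that divide n exactly."""
    return {k for k in die if n % k == 0}


factors_of_12 = factors_on_die(12)   # 1, 2, 3, 4, 6
factors_of_10 = factors_on_die(10)   # 1, 2, 5
factors_of_6 = factors_on_die(6)     # 1, 2, 3, 6

# Complementary events: subtract from 1.
p_not_factor_of_6 = 1 - prob(factors_of_6)

# Overlapping events: add, then subtract the overlap.
overlap = factors_of_12 & factors_of_10
p_either = prob(factors_of_12) + prob(factors_of_10) - prob(overlap)

# Two separate (independent) dice: multiply.
p_double_six = prob({6}) * prob({6})

print(p_not_factor_of_6, sorted(overlap), p_either, p_double_six)
print(p_either == prob(factors_of_12 | factors_of_10))
```

Forgetting to subtract the overlap would give 5/6 + 3/6 = 8/6, which is greater than 1 and so cannot be a probability.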