Teach Probability and Statistics for understanding

Statistics
1 Representing data
2 Collecting simple category data and count frequencies
3 Presenting data using pictograms and bar graphs
4 Gathering of frequency data for different data types
5 Displaying frequency data for different data types
6 Designing questions for data collecting
7 Collecting, displaying and interpreting data, including using software
8 Using questionnaires to obtain discrete and continuous data
9 Presenting data in appropriate summary statistics and displays
10 Using grouping of data and histograms
11 Interpreting summary statistics and displays to answer questions
12 Using technology to organise data tables and displays (dot plots, stem and leaf plots, column graphs, bar charts and histograms)
13 Calculating and interpreting summary statistics (mean, median, mode, range, difference)
14 Using technology to organise data tables and displays (dot plots, stem and leaf plots, column graphs, bar charts and histograms)
15 Calculating and interpreting summary statistics (mean, median, mode, range, difference)
16 Sampling and using it in a survey
17 Calculating and interpreting summary statistics from univariate data (mode, median and mean, box plot, inter-quartile range, outliers)
18 Interpreting summary statistics (mode, median and mean, box plot, inter-quartile range, outliers) and displays
19 Interpreting and predicting from association

Probability
20 Predicting the outcome of chance events in words
21 Using intuitive chance language to compare events
22 Planning chance experiments, including collecting and displaying data
23 Comparing chances, including ‘fairness’
24 Recognising equally likely events
25 Representing estimations of chance as fractions, decimals or percentages
26 Developing and testing conjectures; using a counter-example to disprove
27 Simulating chance events using random devices, including calculators and computers
28 Using variable chance experiments to get closer to true probabilities 'in the long run'
29 Calculating equally likely probabilities by listing all the possible outcomes of an event
30 Understanding probability as long-run relative frequency
31 Calculating simple probabilities of equally likely events
32 Generating random numbers for simulations using technology
33 Using tree diagrams to predict probabilities for two-event experiments
34 Estimating probability based on data tables
35 Estimating probability based on real experiments
36 Listing event spaces (for up to 3 events) to calculate probabilities
37 Calculating probabilities (for complementary, mutually exclusive, and compound events)

Resources for learning

The curriculum described in this section does not use textbooks. Instead it calls on the wealth of high quality learning resources that are available, mainly through MAV.

Lesson plans

Maths300: This is available from MAV. There is an annual subscription for a user name and password, giving on-line access to over 170 lesson plans, many with high quality associated software.

RIME: This collection of lesson plans is also available from MAV. There are three books in the series, also available on CD, with extra spreadsheets. Choose RIME (Measurement, Space, Chance & data).

RIME 5&6: A set of RIME-style lessons written specifically for upper primary. From MAV.

Chance & Data Investigations Volumes 1 & 2: Chance (Vol 1) and data (Vol 2) investigations designed to take many days, for many year levels, from MAV.

Teaching advice

Continuum
Guidelines in Measurement
People Count: A book for teachers who want the maths explained to them clearly.

Problem solving

Maths With Attitude: For each content dimension and for Years 3 to 10, this is a repackaging of the best Maths300 lessons and the best 20 Mathematics tasks, with a useful guide. From MAV.

Mathematics Task Centre: This collection of Mathematics tasks is available from www.blackdouglas.com.au/taskcentre, or as part of the Maths With Attitude kits, from MAV.

Worksheets

Active Learning: This is a set of graded worksheets in three books from MAV. They are also available on one CD containing the contents of all three books in the series, plus extra worksheets describing how to use the hundreds of spreadsheets also on the CD.

Active Learning 2: More of the same in three books, and also on CD.

Tuning in with task cards (lower, middle, upper primary): Each book is a set of 150 work cards to guide students into hands-on activities. From MAV.

Certain Number: A book of hands-on learning combining the ideas of chance with those of number. From MAV.

Maths Investigations: This is a collection of open-ended questions with guidance for exploring real-life situations, mainly in measurement and data. Mainly middle years, from MAV.

Maths at Work: A classic book of real life applications for less academic secondary school students. It is now available only in worksheet form on CD, from MAV.

Computers

Interactive Learning: One CD from MAV, containing hundreds of spreadsheets requiring no knowledge of Excel, and covering all levels and dimensions. Very useful for homework!

Learning objects (FUSE or Scootle)

Resources from MAV may be purchased with a credit card or school order number on-line, using the MAV’s web site: www.mav.vic.edu.au/shop.

1 Representing data

Activities (from Guidelines in Measurement)

2 Collecting simple category data and count frequencies

Category data are qualities for which we do not use numbers: colours, size (big, small), names (of people, songs, teams) etc. When we find them we often count how many there are of each. These are their frequencies. At this level we might just organise groups, or make a list, and then count.

Suggested activities (from Guidelines in Measurement)
1. Count boys and girls
2. Count numbers with cats or dogs
3. Count those who walk, are driven or take the bus to school

3 Presenting data using pictograms and bar graphs

Once we have our list we can represent it graphically. This is a reorganisation of the data, and children may need help to see it happening.

Suggested activities
1. People graphs
Ask children to form lines based on birth month, favourite season, football team, etc. The finished groups themselves are the graph, with the children as parts of it.
2. Pictograms
Use coloured stickers with children’s names to reproduce a ‘people graph’ on a large sheet of paper.
3. Bar or column graphs
If we remove the names and use rectangles we have a column graph (if vertical) or bar graph (if horizontal).

4 Gathering of frequency data for different data types

This section is about getting the data. Sometimes we use questionnaires to ask, sometimes we count, and sometimes we measure.

Suggested activities
1. Design a simple questionnaire
e.g. What are your favourite..., How many ___ do you have?
2. Count objects with different characteristics
e.g. cars of different models, or colours
3. Measure children
e.g. heights (nearest cm), masses (nearest kg), arm reach (nearest cm),...
Activities (from Guidelines in Measurement)

5 Displaying frequency data for different data types, including using software

Once we have the data we are able to display it for others to see. This is a good time to start to use suitable software.

Suggested activities
1. Raw data displays
Children might use lengths of wool to represent their heights. Taping these to a wall shows the raw data.
2. Frequency displays (columns, bars)
The raw data can be combined into groups of the ‘same’ height, and the numbers at each height can be counted. This leads to a column or bar graph.
3. Software
The spreadsheets below are designed to let simple data exploration take place with minimum fuss.

6 Designing questions for data collecting

This time we try to manage a few really good questions about ourselves. This could be the start of a simple database. In Chance & Data Investigations Volume 1, look up Database Debut.

Suggested activities
1. Design the questions
Once children have the ideas of what information they want, they need to learn to ask clear questions, which is a literacy task. The other important idea is that the responses must be clear and able to be tabulated and graphed.

7 Collecting, displaying and interpreting data

Getting a lot of data is one thing, but organising it in a form suitable for making a display, and then interpreting what it all means, are added skills. A computer database (e.g. in Excel) allows the computer to sort and organise the data in the ways you want. It makes counting easier, and then will draw simple graphs for you. However these skills take time to learn. Again read Database Debut.

Suggested activities
1. Asking the questions and keeping records
It will be helpful for students to design a table into which they can record the responses of each person they ask.
2. Sorting the data
Simpler tables will include total frequencies of each response.
3. Graphing the data
Technology can be used to create column graphs to show the frequencies.
4. Interpreting the data
The final stage is to write the answers to the original questions, using the data as evidence.

8 Obtaining discrete and continuous data

The distinction between discrete and continuous data needs care. Numbers of things obtained by counting are discrete, as fractions are not possible. An example of continuous data is a measurement such as height, where fractional values are possible, even if we round to the nearest cm.

Suggested activities
1. Asking for opinions
One common form of discrete data is the opinion poll. For example: 1 strongly agree, 2 agree, 3 no opinion, 4 disagree, 5 strongly disagree. Asking unbiased questions is an important skill.
2. Asking for continuous data
These questions should require the person asked to make a measurement, and to round the data. Examples are hand span, head circumference, arm stretch.

9 Presenting data in appropriate summary statistics and displays

It is no simple skill to judge the most useful display for different data. The same applies to summary statistics, such as the mean (the average), the median (the middle one) or the mode (the most common).

Suggested activities
1. Choosing the best graph
We have used frequencies to create column graphs or bar graphs. If the data shows a trend over time it is common to plot points (to show the value) and join these with a dotted line. Line plots show a set of up to about 25 different values above a number line. This is useful for showing how the values are clustered, for example the hand-spans for an entire class. Excel allows students to create pie charts easily.
However this only makes sense when there are up to 5 or 6 different categories that form part of a recognisable whole: the pie.
2. Understanding means (averages)
The mean is the result of sharing the values equally. Imagine everyone puts all the cash they have into a box, and then shares it equally; to do this mathematically they would add the values and then divide by the number of people.
3. Who uses the median and why?
The median is the middle value, or in the case of an even number of values it is the average of the middle two (if they are different). For ‘symmetrical’ data sets the median is close to the mean, and could be much easier to find. For non-symmetrical sets, the median might be a much better representative of the majority of values. ‘Typical’ house values are given as medians.
4. Modes: one or more?
Modes are found from the frequencies. Many data sets will have more than one mode, but a symmetrical set will have similar mode, mean and median.

10 Using grouping of data and histograms

Histograms are often confused with column graphs because they look similar. However histograms use grouped data, whereas column graphs represent the frequencies of single categories. Grouping can only occur where the data is continuous; for example, heights can be 120 - 129 cm, 130 - 139 cm, etc. In this case, the value 129 cm includes all the way up to 129.9 cm. You do not round the lengths up but take the whole number part of the measurement.

Suggested activities
1. Grouping of data, using constant group width
Student heights, masses, or many other body measurements are suitable. The best way to group data is in easy-to-manage chunks. Start just below the smallest (or at zero if it is close) and go up in 5s, 10s, 20s, etc. The result is a ‘grouped frequency table’.
2. Histograms, using equal width columns
Histograms look like column graphs but have a number line as the horizontal scale.
Each column shows the number of values in the range of that column. A grouped frequency table is best represented by a histogram.

11 Interpreting summary statistics and displays to answer questions

The most important skill with data is the proper interpretation of other people’s data. The media use a lot of statistics, and they are frequently used to confuse rather than clarify, or to argue for a particular interpretation. We all need to develop skills that will help us think critically about data presented to us.

Suggested activities
1. Interpreting tables
Much data is presented in tables. The skills required involve understanding exactly what the data means, and looking for the key summaries (totals or percentages).
2. Interpreting graphs
Commercial graphs are often cluttered with material that obscures their message. Look at the variables shown, look for numbers on at least one scale, and look at relationships or trends.
3. Interpreting summary statistics
Means are by far the most common summaries. Make sure students understand how a mean is a representative of the total data set, by being a ‘central’ value. Look for cases of ‘skewed’ data, with some very high values; in these cases the mean (average) will be unduly affected by the extremely high values.

12 Using technology to organise data tables and displays (dot plots, column graphs, bar charts)

The spreadsheets from Interactive Learning provide experience in Excel, the most common statistical program used in industry and commerce. The second-hand data in the spreadsheets below comes from the internet: AFL, the Netball League, the Bureau of Meteorology, the Olympic Games, and the Australian Bureau of Statistics. Once they have the idea, students can explore much data freely available on the internet.

Suggested activities
1. Using raw data in a spreadsheet
Type the variable names at the top of each column. Enter the values for each person across each row. Then Excel can be used to sort the data, to graph it and to find means or medians.
2. Census in schools
The Australian Bureau of Statistics holds a large database of responses from students across Australia. Schools are welcome to access these files and use the data. Contact Australian Bureau of Statistics, Phone: 1800 623 273 or 03 9615 7505.

13 Calculating and interpreting summary statistics (mean, median, mode)

This first foray into summary statistics deals with means (just common averages), medians (the middle ones) and modes (the most common).

Suggested activities
1. Choosing the best summary statistics
At this stage students can choose the mean (average) or the median (the middle value, or the average of the middle two values). The other type is the mode (the most common, with the highest frequency); there may be several modes. The choice will depend on whether or not the set is reasonably symmetrical. If it is, and if the data is in order, the median is easy to find. If you have frequencies, the mode is the easy way. If the data is clearly non-symmetrical, use the median.

14 Using technology to organise data tables and displays (dot plots, stem and leaf plots and histograms)

The tools provided in Interactive Learning make this section easy to manage.

Dot plots (sometimes called line plots)
Up to about 25 dots are placed above a number line to show the data. Repeated values are stacked.

Suggested activities
1. Parallel dot plots
Two sets of data can be placed above and below the same number line. This clearly shows similarities and differences.
2. Stem and leaf plots
The basic stemplot uses the tens digit on the vertical stem and ones digits as horizontal ‘leaves’.
If the data on the leaves is in order (smallest closest to the stem) then the plot is ordered; otherwise it is unordered. It is also a quick way to get up to about 50 values in order.
3. Back-to-back stem plots
Two sets of ‘leaves’, one to the left and one to the right of the same ‘stem’, make comparing two data sets easy.
4. Grouping of data, using constant grouping range
Grouped frequency tables are convenient for continuous data, where there are many values that can be logically combined. The most useful tables have 4 to 6 columns.
5. Histograms, using equal width columns
A histogram is a graph based on a grouped frequency table. The horizontal number line shows the grouping ranges and the vertical scale shows the frequencies.
6. The effect of varying the grouping range
For some data it is interesting to see the effect on a histogram of changing the width of each column. This can be done in the spreadsheet ‘Histogram’ below.

15 Calculating and interpreting summary statistics (mean, median, mode, range, mean difference)

Students have met mean, median and mode in sections 9 and 13, but now we also consider the spread of the data.

Suggested activities
1. Choosing between mean, median and mode
It is useful to use, or invent, non-symmetrical data sets, and compare the values of the median and the mean.
2. Measuring the spread with the range
Suppose you were measuring a length. You are either well over or well under the mean value, but never actually on it. Compare this ability to measure with that of someone who is always close and has the same mean value. The range of values (the difference between the highest and lowest) is a measure of the critical difference in spread.
3. Measuring the spread with mean difference
The full range only takes the extremes into account and can be greatly influenced by ‘outlying’ values or outliers. A better measure of spread is the mean (positive) difference between each value and the mean.
To achieve this on a spreadsheet, list the values in one column. Calculate the average A (=AVERAGE) and use a formula to find the positive difference between each value V and the average (=ABS(V-A)). Then find the average of these differences.

16 Sampling and using it in a survey

Up to this point any surveys that have taken place have used a ‘convenience’ sample: whoever was available.

Why use a sample? Why not a census?
Samples are used because it can be far too expensive to survey everyone. The problem is that no sample can be guaranteed to be representative of the whole population.

Ways of choosing a sample
The most common method is to try to be completely random. This means that all possible sample members have an equal chance of being chosen. Note that even this does not guarantee that the sample will ‘represent’ the population well. Other methods include ‘stratified random sampling’, e.g. randomly choosing males and randomly choosing females, in the correct percentages to match the population.

Sample size
The larger the sample the more chance there is that its mean is close to the population mean.

Suggested activities
1. Choosing randomly
Number all the possible people to ask from 1 to n. Use a table of random numbers, or a spreadsheet. To get a random number from 1 to n in a spreadsheet cell, type this formula (replacing n by the desired number): ‘=INT(RAND()*n+1)’.
2. Deciding how many to choose
The sample size is a question of how much time (and money) you have, and how accurate you need to be. The spreadsheet ‘Sample size’ allows you to explore this, with a prepared population of 100.
3. Stratified random sampling
Sometimes it is wise to make sure the fractions in the sample match the fractions in the overall population. To do this, choose the appropriate number in the sample for each group (e.g. each gender) and then use random sampling to choose each sub-sample.
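
For classes that prefer a programming language to a spreadsheet, the two sampling methods described in this section can be sketched in Python. This is only an illustrative sketch: the function names `simple_random_sample` and `stratified_sample`, and the made-up population of 60 girls and 40 boys, are assumptions introduced here and do not come from the MAV resources.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Choose k different members; every member has an equal chance."""
    rng = random.Random(seed)  # a fixed seed makes the choice repeatable
    return rng.sample(population, k)


def stratified_sample(groups, k, seed=None):
    """Sample each group in proportion to its share of the population.

    groups maps a group name to the list of its members.
    Rounding means the sub-samples may not add to exactly k
    when the shares are not whole numbers.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in groups.values())
    chosen = {}
    for name, members in groups.items():
        share = round(k * len(members) / total)
        chosen[name] = rng.sample(members, share)
    return chosen


# Hypothetical population: people numbered 1 to 100.
everyone = list(range(1, 101))
print(simple_random_sample(everyone, 10))

# Hypothetical strata: numbers 1-60 are girls, 61-100 are boys.
strata = {"girls": everyone[:60], "boys": everyone[60:]}
print(stratified_sample(strata, 10))
```

With a sample of 10 from this population, the stratified version always picks 6 girls and 4 boys, while the simple random version may pick any mix.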

17 Calculating and interpreting summary statistics from univariate data (inter-quartile range, outliers, box plot)

‘Univariate data’ means data with one statistical variable. This has been the case to this point; bi-variate data appears only in section 19. In this section we introduce quartiles, leading to inter-quartile range and box plots.

Further measures of spread: quartiles and the inter-quartile range
Imagine the data arranged in order. Starting from the lowest value, the median is at the 50% mark. Then the two quartiles (lower and upper) are at the 25% and 75% marks. The ‘inter-quartile range’ is simply the difference between the upper and lower quartiles. It also avoids extreme values.

Outliers
Given a set of data, the first thing to do is to look for, and possibly remove, extreme values, or outliers. Some will be genuine, but many may be typos.

Box plots
A box plot (or ‘box and whisker’ because of the vague appearance of a cat face) uses the five values (lowest, highest, median, lower and upper quartiles) to draw a rectangle that contains the middle 50% of values, and shows the median. The ‘whiskers’ extend to the extremes, unless they are so extreme as to be called outliers. It is drawn over (or under) a number line so that values may be read.

Parallel box plots
Two or more boxplots drawn near the same number line enable comparisons of the sets they represent.

Suggested activities
1. Data sources
Sets of data that are likely to have a good spread, and probably also be non-symmetrical, are surveys of students’ pocket money and earnings (see Active Learning CD11), surveys of distances travelled, and skewed natural data such as rainfall (e.g. Darwin, from the Bureau of Meteorology).
2. For a small data set, do the analysis manually.
3. For larger data sets, use the spreadsheets or the free software to create boxplots.
4. Focus on correct interpretation of boxplots.
5. Compare two similar data sets, such as rainfall for Darwin and Perth (or money access for males and females), using parallel boxplots.

18 Interpreting summary statistics and displays

This section aims to focus on interpreting second-hand data, particularly from the media.

Interpreting tables
Much data is presented in tables. The skills required involve understanding exactly what the data means, and looking for the key summaries (totals or percentages).

Interpreting graphs
Commercial graphs are often cluttered with material that obscures their message. Look at the variables shown, look for numbers on at least one scale, and look at relationships or trends.

Interpreting summary statistics
Means (averages) are by far the most common summaries. When percentages are used, it is not always made clear what they are percentages of!

Suggested activities
1. Census at school
A ready source of material is available from ABS in the CensusAtSchool project. <www.abs.gov.au/websitedbs/cashome.nsf/Home/Home>
2. Bureaux of Statistics and Meteorology
Search the ABS <www.abs.gov.au/ausstats>, BOM www.bom.gov.au and other web sites for summary statistics and displays that are interesting to interpret.
3. Road accidents
There have been a number of in-depth studies of road accidents. The one that led to the change to a 50 km/h limit and pleas to reduce speeds by 5 km/h is found at http://casr.adelaide.edu.au/speed/.

19 Interpreting and predicting from association

Association is a concept involving the relationship between two variables. For example, for growing children, height is related to age; but this is not true for adults. Association can be close or very little.

Scatterplots
Values of one variable (the independent one, e.g. time) are placed using the horizontal axis, and the other on the vertical. The result is a scatter of points.

Degree of association
A scatterplot shows clearly the extent of the relationship between the two variables. Only if the association is close can prediction be a sensible activity.

Association is not ‘cause and effect’
Even if association is close, it does not follow that changes in one variable cause changes in the other. (It doesn’t follow that they don’t either!)

Lines of fit and prediction
If the association is fairly close, you can draw a line of fit (or trend line) either by eye, or using a spreadsheet (or other statistical program). Computers will probably only offer ‘lines of best fit’ that use the least squares method. In Excel, this is the process. Enter the data into two columns in a table. Select it, choose Chart Wizard, and Scatterplot. Select the graph and under Chart, choose Add Trendline. Choose the line (providing the data is not obviously a different shape).

Suggested activities
1. Data for scatterplots
The basic tool is the scatterplot, so any sets of data that provide two variables for a reasonable number of cases will do. For example, get some age (in months) and height data, compare handspan and height for students (males and females are likely to be different) or plot winning and losing scores in many football games. There is data in the resources below, and on the ABS, BOM, and other websites.
2. Focus on sensible interpretations
Don’t try to link variables where the association is not strong. Where it is, you can use lines of fit and make predictions.

20 Predicting the outcome of chance events in words

The big idea here is that some events are ‘certain’ and others are ‘unpredictable’ or ‘uncertain’ in outcome. Being able to tell the difference is the start of the ‘chance’ journey. Once children have the idea that some events are uncertain, they can describe some common events as ‘more certain than others’.

Suggested activities
1. Play and discuss outcomes
Roll dice, or play other uncertain games, and discuss the difference between certainty and uncertainty.
2. Sure thing, or impossible
The extremes of uncertainty are easier to identify.
3. Likely and unlikely
After a while some events will seem to be more likely than others.

21 Using intuitive chance language to compare events

Once children have the idea that some events are uncertain, they can describe some common events as ‘more certain than others’.

Suggested activities
1. Ordering the language of chance
There are many common words that describe levels of uncertainty. What is not so clear is what degree of uncertainty they describe. Here are some; students could try to put them into order.
certain, impossible, odds against, one in a million, fluke, unlikely, unexpected, maybe, perhaps, surely, unsure, improbable, possible, hopeful, lucky...

22 Planning chance experiments, including collecting and displaying data

The ideas of data collection are useful to compare the frequencies of each of the possible outcomes for simple experiments.

Suggested activities
1. Listing the outcomes
The first task is to work out what could happen, making a list of outcomes. Use coloured marbles (pegs or unifix), dice, coins, cards, spinners, etc.
2. Experimenting with chance
The next step is to select randomly (blindly if needed) and record the results in a table. Finally display in a graph and interpret the results. Which are more likely? Will it always happen?

23 Comparing chances, including ‘fairness’

As a result of intuition and experimenting, students grasp that some games are not ‘fair’. ‘Fairness’ is a basic idea in chance: it means that all possible outcomes have the same chance of occurring.

The concept of fairness
“Heads I win, tails you lose”. Not fair?
What makes a game ‘fair’? Both players must have an equal chance of winning.

Suggested activities
1. Testing for fairness
Roll two dice, and add the numbers. If the total is 5, 6, 7, 8 or 9 I win; if it is 2, 3, 4, 10, 11 or 12 you win. So you have more totals than I do. Is this fair? Play the game in pairs and record the results. It is soon clear that I will win about two-thirds of the time. The game is not fair.

24 Recognising equally likely events

Calculations with chance are only possible (at this level) once we know that all the outcomes are equally likely. So ‘equally likely’ is a basic idea, basically the same as fairness: it means that all possible outcomes have the same chance of occurring.

Equally likely from ‘symmetry’
Symmetry is visual; we judge that a coin has no bias to heads or tails, that a die has no visible bias.

Equally likely from evidence
The only way to be sure is to run an experiment, e.g. roll the die hundreds of times. If each child does 100 rolls and the results are combined, the results will be impressive; even though the totals may differ by quite a bit, the columns of the graph should be very similar in height, because the differences will be greatly outweighed by the totals.

Suggested activities
1. Equally likely using random numbers
The spreadsheets use random numbers from a computer. Are these really ‘equally likely’?

25 Representing estimations of chance as fractions, decimals or percentages

Once the students have some skill with fractions, decimals and percentages they can use them to describe levels of chance.

Suggested activities
1. Putting a number on your intuition
There are many common words that describe levels of uncertainty. Here are some; students could try to put them into order first, then give each a decimal value (between 0 for impossible and 1 for certain). Note that chance values outside this range are impossible.
certain, impossible, odds against, one in a million, fluke, unlikely, unexpected, maybe, perhaps, surely, unsure, improbable, possible, hopeful, lucky...
2. Simple experiments to check intuition
What is your intuition about the chance of a plastic spoon landing bowl up when dropped? Drop a spoon from a good height many times and test your intuition. The fraction (or decimal) value will be the number of ‘bowl up’ cases divided by the total number of tries. Repeat with a plastic cup, matchbox or drawing pin.

26 Developing and testing conjectures; using a counter-example to disprove

The example given in VELS is ‘a 6 is harder to roll than a 1’. This perception comes from playing dice games, where you need a 6 for special moves, or to get started.

Suggested activities
1. Developing conjectures
Students may offer many of the common misconceptions. Here are some:
• After a run of heads, a tail is more likely. No, the coin can’t remember what happened last time.
• Coincidences are unlikely. In fact given a large enough sample of events to look at, ‘coincidences’ are very common.
• Lotto numbers are equally likely to occur, so those most often chosen are less likely in the future. (This is often seen as a version of “the law of averages”.)
• In Lotto, the run 1,2,3,4,5,6 is less likely than any other set of six numbers.
• An iPod knows which tunes you like (or don’t like) in sequence, so it is not truly random.
It is very worthwhile to disprove these, but because chance is involved, it will take more than one counter-example to convince some students!
2. Check using VERY large numbers of trials
It is not helpful to run an experiment with only a few trials, largely because you might find that, on this occasion, 6 is harder to roll!
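
The conjecture ‘a 6 is harder to roll than a 1’ can also be checked with a computer simulation. The sketch below is one possible way to do it in Python, not part of the MAV spreadsheets; the function name `roll_frequencies` and the trial counts are assumptions chosen for illustration, and the die is simulated with a pseudo-random number generator.

```python
import random
from collections import Counter


def roll_frequencies(trials, seed=None):
    """Roll a simulated fair die `trials` times.

    Returns the relative frequency (count / trials) of each face 1 to 6.
    """
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(trials))
    return {face: counts[face] / trials for face in range(1, 7)}


# Compare a short experiment with progressively longer ones.
for trials in (12, 600, 60000):
    freqs = roll_frequencies(trials)
    summary = "  ".join(f"{face}: {freq:.3f}" for face, freq in freqs.items())
    print(f"{trials:>6} rolls  {summary}")
```

In a short run one face (perhaps the 6) can easily look rarer than the others, while in the longest run every relative frequency should sit close to 1/6, which is about 0.167.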
27 Simulating chance events using random devices, including calculators and computers

The use of coins or dice to predict outcomes for other events (such as a baby’s sex) is a powerful idea, called simulation. Once the use of hand-held random devices is understood, students can move towards using calculators or computers.

Suggested activities

1. Using coins or dice
Coins are easy; how do you get equally likely events from a die? Two equally likely events might be 1, 2, 3 vs 4, 5, 6; or even vs odd, etc. Three equally likely events might be 1 or 2, 3 or 4, 5 or 6, or other pairings such as ‘add to 7’. To simulate months, where there are 12 equally likely events, roll once for first half/second half of the year, then roll again for the month in that half year. So 6 then 3 makes the third month in the second half: September.

2. Using a calculator
Enter a two-digit number, then multiply by another. Repeat the multiplication until the calculator freezes. Use the right-end digit(s). Alternative: enter a four-digit number, and take the square root seven times. Use the right-end digit(s).

3. Using a computer
Excel has a pseudo-random number generator. In one cell type “=RAND()”. This will produce a random decimal between 0 and 1, just like probabilities. To get random digits for dice, type “=RANDBETWEEN(1,6)”. Similar functions are used in the chance spreadsheets below.

28 Using ‘long run’ chance experiments to get close to true probabilities

The idea of probability is that the fraction of successful attempts from experiments (called the ‘relative frequency’) will get close to the true fraction only after a very large number of trials: ‘in the long run’. Simulation relies on this result to estimate probabilities we are unable to calculate.

Suggested activities

1.
Die rolling, getting more and more results
If students roll a die only 6 times it is actually quite unlikely that any one of them will get all numbers different. If they roll 60 each, equal frequencies are still unlikely. But if each uses a calculator to find the decimal fractions such as (number of 1s ÷ total rolls) they will find the decimals are rather similar. If they combine their results and repeat the calculation the results will be even more similar.

2. Using a computer to perform ‘long run’ experiments
The spreadsheets below allow students to set the probability (e.g. of the drawing pin landing point up) and then to run very many trials based on it.

29 Calculating equally likely probabilities by listing all the possible outcomes of an event

For calculation of probability at this level we assume that all possible outcomes are equally likely. Then we list all the possible outcomes (results) and work out the fraction of the total that are successes. This will be the probability, in theory. Any testing that takes place must take account of natural variability.

Suggested activities

1. Testing for equal likelihood
Complex events (such as numbers of heads from two coins, sum from two dice) will not have equally likely outcomes; for example the totals of two die numbers range from 2 to 12, with 7 the most likely.

2. Listing all the possible outcomes
Single events (such as dice, cards, coins) will be simple to list.

3. Calculating the ‘success fraction’ or probability
With all the possible outcomes listed, we just count the number that meet the criterion of ‘success’ and divide by the total number of possible outcomes. This is the probability. For example, in a pack of cards, the number of aces is 4, the number of cards is 52, so the probability is 4 ÷ 52 = 1/13 = 0.077.
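The ‘list the outcomes, count the successes, divide by the total’ procedure can be sketched in a few lines of code. This is a minimal illustration, assuming a standard 52-card pack and two fair dice; the helper name `success_fraction` and the card labels are hypothetical choices made for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def success_fraction(outcomes, is_success):
    """Number of outcomes meeting the criterion, divided by the total number."""
    outcomes = list(outcomes)
    successes = sum(1 for outcome in outcomes if is_success(outcome))
    return Fraction(successes, len(outcomes))


# A pack of cards: every (rank, suit) pair is one equally likely outcome.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
pack = list(product(ranks, suits))
p_ace = success_fraction(pack, lambda card: card[0] == "A")

# Two dice: the 36 ordered pairs are equally likely, but the totals are not.
two_dice = list(product(range(1, 7), repeat=2))
totals = Counter(a + b for a, b in two_dice)
p_seven = success_fraction(two_dice, lambda roll: sum(roll) == 7)

if __name__ == "__main__":
    print("P(ace) =", p_ace)              # 4 aces out of 52 cards
    print("P(total of 7) =", p_seven)     # 6 of the 36 pairs add to 7
    print("Ways to make each total:", dict(sorted(totals.items())))
```

Using `Fraction` keeps the answer as an exact fraction, so students can compare it directly with their own working before converting to a decimal.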
30 Understanding probability as long-run relative frequency

The relative frequency means the number of successes divided by the total number of attempts (i.e. trials). When you have only tried a few times, this fraction can vary a lot, but for very large numbers of trials it stays close to a single value: the probability.

Suggested activities

1. Doing some ‘long run’ experiments, by combining data
Arrange for each student or small group to perform a simple experiment to test the predicted probability. They will need to perform about 50 trials, and divide the number of successes by the number of trials. The results will differ, but when combined they produce much larger numbers, and the ‘success fraction’ (relative frequency of success) will be closer to the calculated probability.

2. Using a computer for ‘long run’ experiments
The spreadsheets below allow students to set the probability (e.g. of the drawing pin landing point up) and then to run very many trials based on it.

31 Calculating simple probabilities of equally likely events

At this level only single events are considered. Two-event experiments appear in section 33.

Suggested activities

1. Testing for equal likelihood
Complex events (such as numbers of heads from two coins, sum from two dice) will not have equally likely outcomes; for example the totals of two die numbers range from 2 to 12, with 7 the most likely.

2. Listing all the possible outcomes
Single events (such as dice, cards, coins) will be simple to list. You can include criteria that involve more than one characteristic, such as ‘red picture cards’.

3. Calculating the ‘success fraction’ or probability
With all the possible outcomes listed, we just count the number that meet the criterion of ‘success’ and divide by the total number of possible outcomes. This is the probability.
For example, in a pack of cards, the number of ‘red picture cards’ is 6 (K, Q, J of hearts and diamonds), the number of cards is 52, so the probability is 6 ÷ 52 = 3/26 = 0.115.

32 Generating random numbers for simulations using technology

In this section we revisit the idea of simulation. The core of the simulation is the random number generator, be it a coin (0 or 1), die (1 to 6), cards (suits, numbers, etc.) or calculator or computer. This section deals with the skills to use a calculator or spreadsheet to generate random numbers.

Suggested activities

1. Using a calculator
Enter a two-digit number, then multiply by another. Repeat the multiplication until the calculator freezes. Use the right-end digit(s). Alternative: enter a four-digit number, and take the square root seven times. Use the right-end digit(s).

2. Using a computer
Excel has a pseudo-random number generator. In one cell type “=RAND()”. This will produce a random decimal between 0 and 1, just like probabilities. To get random digits for dice, type “=RANDBETWEEN(1,6)”. Similar functions are used in ‘Dice football’ and the other chance spreadsheets below.

33 Using tree diagrams to predict probabilities for two-event experiments

The graphic organiser called a ‘tree diagram’ is used to make sure that all the possible outcomes are listed.

Suggested activities

1. Tree diagrams for ‘two-event’ experiments
Tree diagrams are graphic organisers that help make sure all the outcomes are listed. For example, if the two-event experiment involves tossing a coin and then rolling a die, the tree diagram would show 12 outcomes, like this. (This could be used to simulate choosing from 12 equally likely months.)
H: H1, H2, H3, H4, H5, H6
T: T1, T2, T3, T4, T5, T6

2. Predicting probabilities using tree diagrams
With all the possible outcomes listed, we just count the number that meet the criterion of ‘success’ and divide by the total number of possible outcomes. This is the probability.

34 Estimating probability based on data tables

If we consider a table of second-hand data as the results of an ‘experiment’ then we can use the data to make predictions. If there are very many trials (as in census data) then the result can be quite close to the probability, but we always call these predictions ‘estimates’.

Suggested activities

1. Choosing suitable data
The most useful data involves a complete population. For example, in Victoria the percentage of families that have one parent is about 15%. Some suitable examples are provided in the spreadsheets below.

2. Interpreting the data using chance
The interpretation must be sure to use the same ‘unit’ (for example, families) and stay with the same population (Victorian families). If we were to randomly select one Victorian family, we can estimate the probability that it has only one parent as 0.15. This probability varies with location within Victoria, as the data shows. In each spreadsheet below, students can carefully construct chance interpretations from the real data.

35 Estimating probability based on experiments, simulations or surveys

This section involves hands-on approaches to getting data from which to estimate probabilities.

Suggested activities

1. Estimating from experiments
Suppose you roll a die twice and find the difference between the two numbers. The possibilities are 0, 1, 2, 3, 4, 5. Are they equally likely? (No.) What is the probability of each? You can work it out theoretically, and you can do an experiment.

2.
Estimating from simulations
The experiment above can be done as a computer simulation using ‘Die rolling’. You can run up to 10 000 trials at once and see the graph.

3. Estimating from surveys
The use of a survey to form an estimate should be discussed carefully. How can you choose the sample so that it represents the population well? For example: the ‘Opinion survey’ (spreadsheet) represents 24 questions asked of 148 students in Year 9. How typical are they of all Year 9 students? The ‘Young adults’ whose ages, heights and masses are shown on that spreadsheet are from Year 11. How representative are they? The ‘Normal samples’ spreadsheet selects five numbers from a ‘normal’ population. How much do different samples differ from each other?

36 Listing event spaces (for up to 3 events) to calculate probabilities

The various techniques for listing all the possible outcomes are learned here. They include lists, grids, tree diagrams, Venn diagrams and two-way tables. There is more in the section on Sets and logic in Structure.

Suggested activities

1. Grids
Grids are sometimes called Carroll diagrams, after Lewis Carroll, who invented their use in this context. The essential idea is that there are always two possibilities: having a characteristic or not. They are easier to manage if there are only two variables involved. Examples: roll a die (odd or even) and toss a coin (head or tail); choosing students by sex and by age (under or over an exact age). Two variables create four outcomes; for example, by comparing each student to the average values of age and height we get this. For three variables the grid is split to make eight outcomes.
[Figure: two grids. The first crosses younger/older with shorter/taller (four outcomes); the second splits each cell into M and F (eight outcomes).]

2. Tree diagrams
All the possible combined outcomes from three separate decisions (or ‘events’) can be listed using three-stage tree diagrams. For example, a student’s level, sex and the system of the school.
Using a tree diagram, 12 possible combinations will be listed. Note: If selecting things is included, it is important to know whether or not the selected item is replaced, so that all are available for each selection. For example, using only the digits 0 to 9, the three decisions could be: choose a multiple of 3 (i.e. 3, 6, 9), choose a factor of 12 (i.e. 1, 2, 3, 4, 6), choose a prime number (i.e. 2, 3, 5, 7). Without replacement, this can be very involved.

3. Venn diagrams
When there is overlap it is useful to use loop diagrams named after John Venn. This is also useful to describe sub-set relationships. The digit sets described above look like this.
[Figure: Venn diagram of the digits 0 to 9 with three loops. Factors of 12 only: 1, 4. Primes only: 5, 7. Multiples of 3 only: 9. Factors of 12 and primes: 2. Factors of 12 and multiples of 3: 6. All three sets: 3. Outside all loops: 0, 8.]
For example, if three-digit numbers are formed from these choices in the order multiples, factors then primes, there will be 3 choices for multiples, 5 for factors and 4 for primes: 60 three-digit numbers in all.

4. Two-way tables
Two-way tables are like Carroll diagrams, but can have more than two choices at each stage. For example, the data in ‘Education’ uses a simple grid: level (primary, secondary) vs sex (male, female). The third variable is the school system (Govt, Catholic, Independent). This allows all the possible combinations to be listed: 2 levels × 2 sexes × 3 systems makes 12 possibilities.

37 Calculating probabilities (for complementary, mutually exclusive, and compound events)

Which laws of probability apply depends on the relationship between the two or more events. For two events there are three possibilities.

Complementary events are two events that do not overlap and together contain all the possibilities. You can find the probability of one of them by subtracting the other probability from 1.

Mutually exclusive events have no overlap, so the simple form of the addition law applies: the probability of one OR the other is the sum of their probabilities, which is at most 1. For events from separate trials (such as two different dice), the multiplication law gives the probability of both events occurring.
Compound events involve overlapping outcomes. For example, on rolling a die, the events might be ‘getting a factor of 12’ and ‘getting a factor of 10’. The outcomes 1 and 2 are common to both events. Alternatively one event might be a subset of the other; for example ‘getting a factor of 12’ and ‘getting a factor of 6’ have the outcomes 1, 2, 3 and 6 in common. A more involved addition law (with subtraction!) applies. The simple multiplication law does not apply.

Suggested activities

1. Using Venn diagrams to determine relationships: examples with die numbers
[Figure: four Venn diagrams of the die numbers 1 to 6, labelled Complementary, Subset, Mutually exclusive and Overlapping.]

2. Complementary events
The probability of a number from 1 to 6 being a factor of 6 is 4/6 = 2/3, so the probability of not being a factor of 6 is 1/3.

3. Mutually exclusive events
This applies where there is no overlap, i.e. no outcomes belonging to more than one set. The separate probabilities add to no more than 1 (exactly 1 only if the events are also complementary). To find the probability of getting one result OR another, just add their separate probabilities. To find the probability that a particular combination occurs across separate trials (i.e. one AND also another, as with two different dice), multiply their probabilities.

4. Compound events
This applies where there is overlap. In both examples above, the separate probabilities add to more than 1, because of the overlap. To find the probability of getting one result OR another, add their separate probabilities and then subtract the probability of the intersection (overlap). Compare the two ‘Addition law’ spreadsheets. To find the probability that any particular combination occurs (i.e. one AND also another), list the cases. The simple multiplication law does not hold. The ‘Multiplication law’ spreadsheet uses two differently coloured dice so there can be no outcomes belonging to both.
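The complement, addition and multiplication laws described in this section can be checked by listing the cases. The sketch below is one way to do that, assuming a fair six-sided die; the helper `prob` and the variable names are illustrative, and the ‘red die, blue die’ pairing is an assumed reading of the two differently coloured dice mentioned above.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)


def prob(event):
    """Probability of an event (a set of faces) on one fair die."""
    return Fraction(len(event), 6)


factors_of_12 = {n for n in FACES if 12 % n == 0}  # {1, 2, 3, 4, 6}
factors_of_10 = {n for n in FACES if 10 % n == 0}  # {1, 2, 5}
factors_of_6 = {n for n in FACES if 6 % n == 0}    # {1, 2, 3, 6}

# Complementary events: subtract from 1.
p_not_factor_of_6 = 1 - prob(factors_of_6)

# Overlapping events on ONE die: add, then subtract the overlap.
overlap = factors_of_12 & factors_of_10            # {1, 2}
p_either = prob(factors_of_12) + prob(factors_of_10) - prob(overlap)

# On one die, 'both' means the overlap; simply multiplying gives a different value.
p_both_one_die = prob(overlap)
p_simple_product = prob(factors_of_12) * prob(factors_of_10)

# Two separate dice (assumed: red then blue): list all 36 pairs and count the cases.
pairs = list(product(FACES, repeat=2))
hits = [(red, blue) for red, blue in pairs
        if red in factors_of_12 and blue in factors_of_10]
p_both_two_dice = Fraction(len(hits), len(pairs))

if __name__ == "__main__":
    print("P(not a factor of 6) =", p_not_factor_of_6)
    print("P(factor of 12 OR factor of 10) =", p_either)
    print("One die, P(both) =", p_both_one_die, "but product =", p_simple_product)
    print("Two dice, P(both) =", p_both_two_dice)
```

Listing the cases shows why the simple multiplication law fails for overlapping events on a single die, yet matches the count exactly when the two events come from separate dice.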