AS91586 Probability distributions (3.14)
Anna Martin, Avondale College

What are the big ideas and how do we effectively teach them?
Uncertainty... it’s all around us - deal with it!
★ Resources that are already available: http://new.censusatschool.org.nz/resources/3-14/
★ 2013 and 2014 NZQA exams: http://www.nzqa.govt.nz/ncea/assessment/viewdetailed.do?standardNumber=91586

Starter sketches...
Number 1 - 5 in your books. Ready?? For each distribution, you will get some descriptions of its features. You have to use these descriptions to try to sketch the shape of the distribution. You will need to draw an x/horizontal axis WITH the correct variable and an attempt at the scale.

Distribution 1
My distribution is of ages in years, e.g. 2.7 years. I am symmetrical. I have a mean and median of 10 years. I am uniformly distributed. My outcomes lie between 0 and 20 years inclusive. Sketch me!

Distribution 2
My distribution is of heights of flowers in cm. I am symmetrical. I have a mean and median of 50 cm. I am normally distributed. My standard deviation is around 10 cm. Sketch me!

Distribution 3
My distribution is of lengths of hair in cm. I am bimodal. I have a small peak at 5 cm and a larger peak at 25 cm. My outcomes pretty much range between 0 cm and 30 cm, although a few people have longer hair than 30 cm. Sketch me!

Distribution 4
My distribution is of hours worked per week. I am negatively skewed. My median (and peak) is at 40 hours, but my mean is lower at 35 hours. My outcomes pretty much range between 0 and 60 hours; it is unlikely to find anyone working longer than 60 hours per week. Sketch me!

Distributions 5
We are both distributions for the amount of water consumed in mL. We are both normally distributed. For athletes, my distribution has a mean of around 1500 mL and a standard deviation of around 200 mL. For non-athletes, my distribution has a mean of around 1200 mL and a standard deviation of around 400 mL. Sketch us!

Let’s see how you went!
Distribution 1
My distribution is of ages in years. I am symmetrical. I have a mean and median of 10 years. I am uniformly distributed. My outcomes lie between 0 and 20 years inclusive.
[Sketch: x-axis "Age (in years)", scale 0 to 20]

Distribution 2
My distribution is of heights of flowers in cm. I am symmetrical. I have a mean and median of 50 cm. I am normally distributed. My standard deviation is around 10 cm.
[Sketch: x-axis "Heights (in cm)", scale 20 to 80]

Distribution 3
My distribution is of lengths of hair in cm. I am bimodal. I have a small peak at 5 cm and a larger peak at 25 cm. My outcomes pretty much range between 0 cm and 30 cm, although a few people have longer hair than 30 cm.
[Sketch: x-axis "Hair lengths (in cm)", scale 0 to 30]

Distribution 4
My distribution is of hours worked per week. I am negatively skewed. My median (and peak) is at 40 hours, but my mean is lower at 35 hours. My outcomes pretty much range between 0 and 60 hours; it is unlikely to find anyone working longer than 60 hours per week.
[Sketch: x-axis "Hours worked (per week)", scale 0 to 60]

Distributions 5
We are both distributions for the amount of water consumed in mL. We are both normally distributed. For athletes, my distribution has a mean of around 1500 mL and a standard deviation of around 200 mL. For non-athletes, my distribution has a mean of around 1200 mL and a standard deviation of around 400 mL. The non-athletes' peak is lower because they are more spread out (imagine an ice-cream melting...).
[Sketch: x-axis "Water consumed (mL)", scale 0 to 2400]

What are the foundation ideas from Level 2?

AS91267 Probability
Models - what are the uniform and normal distributions?
Features of distributions - what are the key features of experimental or sampling distributions?
Locating values in a distribution - how do you use theoretical models or experimental distributions to estimate probabilities?
Comparing models with sample data - do I have enough data to identify the features of the random variable?
Expectation - what would be typical values for this distribution (middle 80%)? What would be unlikely? How many times would I expect to see this happen?

AS91268 Simulations
Randomness - what specifically is the random process for the situation?
Independence - what specific things are you assuming don’t influence each other? Why does independence matter?
Probabilities given - why are you assuming these will stay the same? Will these always stay the same? Will things run out or change? How would this affect your simulation?
Number values given - why are you assuming these will stay the same? Will these always be the same? Could they be higher or lower? How would this affect your simulation?
Estimates - why can’t you answer the problem with an exact number?

What are the key concepts for Level 3 probability distributions?
Model (theoretical)
Random variable
Solve problems involving uncertainty

Working with theoretical distributions or models...
★ Using contextual clues/information to select an appropriate model
★ Justifying the selection of the model
Consider how you can get students using distributions to solve problems sooner, rather than initially getting stuck on calculating probabilities and navigating the graphics calculator.

NZQA 2014

What are the key concepts for Level 3 probability distributions?
Model (theoretical)
Random variable
Solve problems involving uncertainty
Experimental (simulated or sample data)*

Beliefs, misconceptions or claims
★ A certain teacher is slightly obsessed with the TV show “The Block”
★ She notices that every year a big deal is made about the auction order
★ Watch this video from the most recent series of “The Block NZ” and consider the question below
YouTube video link (start at 14:40)
What do the presenter and contestants seem to believe about the auction order and winning the competition?
Initial exploration of data
This same teacher has looked at the Australian and NZ versions of “The Block” and recorded who won each series and which order they went in the auction. There have been 12 series of “The Block” in Australia and New Zealand. In what proportion of these series did the team that went 1st or 4th in the auction win?

Considering the random variable and expectations
If the auction order doesn’t make a difference (and no other factors influenced who was the winner - a big assumption!), then the theoretical probability distribution could be uniform. This means each different auction order has a 25% chance of resulting in a win. Using the uniform distribution, how many times out of 12 would you expect the team that went first to win?

Assessing reasoning skills
Ben and Quinn (from the NZ 2014 series) say that the past results show that going 1st is the way to win the whole competition, because 5 out of 12 times the couple that went first won the competition! Explain to Ben and Quinn, in simple terms that they can understand, why this may not be very good reasoning :-)

Visualising chance variation
This same teacher ran a simulation to investigate how many times certain auction orders win out of 12 seasons, if the chance of winning was 25% for each auction order. This is an animated gif of 10 of these simulations.
How many times does a result of 5 or more wins come up (regardless of auction order)?
How could you use these graphs to explain to the producers of “The Block” why there may be insufficient evidence to support a claim that the order of the auction makes a difference to who wins?
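The simulation idea above can be tried without the animated gif. The following is a minimal sketch, assuming the uniform model from the slides (p = 0.25 for the team that goes first, 12 independent seasons). The function names, the seed and the 1000 trials are illustrative choices, not the teacher's actual setup, and the code follows one auction position only rather than all four at once.

```python
import random
from math import comb

N_SEASONS = 12   # series of "The Block" recorded in the slides
P_WIN = 0.25     # uniform model: each of the 4 auction positions equally likely to win

def simulate_first_position_wins(n_trials=1000, seed=2014):
    """For each trial of 12 seasons, count how often the first position wins."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    tally = [0] * (N_SEASONS + 1)      # tally[k] = number of trials with exactly k wins
    for _ in range(n_trials):
        wins = sum(rng.random() < P_WIN for _ in range(N_SEASONS))
        tally[wins] += 1
    return tally

def binomial_tail(n, p, k):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

tally = simulate_first_position_wins()
expected_wins = N_SEASONS * P_WIN                  # expected wins out of 12
simulated_share = sum(tally[5:]) / sum(tally)      # share of trials with 5 or more wins
exact_share = binomial_tail(N_SEASONS, P_WIN, 5)   # same quantity from the binomial model

print(f"Expected wins out of {N_SEASONS}: {expected_wins}")
print(f"Simulated share of trials with 5+ wins: {simulated_share:.3f}")
print(f"Binomial P(X >= 5): {exact_share:.3f}")
```

Under these assumptions the exact tail probability is about 0.16, so five or more wins for one particular auction position is not a rare outcome under chance alone.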
Identifying likely and unlikely outcomes in the distribution
This same teacher ran a simulation through the computer (1000 trials), assuming going 1st has a fixed chance of winning (p = 0.25), to investigate the variation in how many times the team that went 1st would come up the winner in sets of 12 (to mimic the 12 different series of “The Block”) by chance alone. In how many seasons would you expect the team that went first to win by chance alone? What would be an unlikely outcome (or outcomes)?

Considering chance variation as part of solving a problem
Ben and Quinn (from the NZ 2014 series) say that the past results show that going 1st is the way to win the whole competition, because 5 out of 12 times the couple that went first won the competition! Explain to Ben and Quinn, using the experimental distribution, why this is not very good reasoning :-)

Can you tell young Paul Rudd from old Paul Rudd?
★ Paul Rudd (an actor) never seems to actually age!
★ In the following quiz, you'll see eight pairings of pictures of Rudd at different ages (some are four years apart, some are eleven, some are somewhere in between) and you have to guess in which of the two he is older.
★ Ready?
Link to source material

NZQA 2013 Question 3(c)

[Eight quiz slides, one per picture pair: "Which is the older Paul Rudd? Left / Right"]

Let’s look at the results!
For each pair, record if you correctly identified the older Paul Rudd :-)

[Eight slides repeating the picture pairs: "Which is the older Paul Rudd? Left / Right"]

Can you tell older Paul Rudd from younger Paul Rudd?
How many did you get right?
What is the theoretical probability of someone getting a successful outcome for each trial through guessing?
To apply the binomial distribution to this situation, what conditions do you need (or need to assume)?
Using the binomial distribution, is there sufficient evidence that you can tell "old Paul Rudd" from "young Paul Rudd"?
This could be extended by providing students with data from the website on which this survey was initially conducted. Does this data suggest people were just guessing?

Selecting models using data and conditions...
Summary sheet
Note: Sometimes we use probability distribution models because they are “fit for purpose” even if they do not technically meet all of the mathematical conditions. This is often the case with the Poisson distribution, because one of its properties is that as the mean of the random variable increases, the variation also tends to increase - this is a common feature of data and random processes. We also need to keep in mind how much data we have, and remind ourselves (and our students) about all the thinking we have around sample to population inferences...

World War 2 London bombing
During World War II, London was assaulted with German flying-bombs on V-2 rockets. The British were interested in whether or not the Germans could actually target their bomb hits or were limited to random hits with their flying-bombs.
Based on the work by the British statistician R.D. Clarke, "An Application of the Poisson Distribution" (1946)

It should be noted that this analysis is very important. If the Germans could only randomly hit targets, then deployment throughout the countryside of various security installations would serve quite well to protect them, as random bombing over a wide range was unlikely to hit a given target.
However, if the Germans could actually target their flying-bombs, then the British were faced with a more potent opponent and deployment of security installations would do little to protect them.

World War 2 London bombing
The British mapped off the central 24 km by 24 km region of London into 1/2 km by 1/2 km square areas. Then they recorded the number of bomb hits, noting their location, and this data is in the following table:

No. bombs    0     1     2     3     4     5 or over
No. areas    229   221   93    35    7     1

Imagine that you are a young Lieutenant in His Majesty's Service. You are charged with ascertaining if the British are up against an adversary who can target their flying-bombs or one who can only randomly toss these bombs at London.

Considering data and theoretical distributions...
The data collected (n = 586)
Estimate the mean number of bombs per area.
Check your estimate with a calculation.
Why could we use Poisson to model this random variable, and what would be its parameter?
Plot the values from the model distribution on the graph.

Considering data and theoretical distributions...
The data collected (n = 586)
What can you conclude?
Imagine that you are a young Lieutenant in His Majesty's Service. You are charged with ascertaining if the British are up against an adversary who can target their flying-bombs or one who can only randomly toss these bombs at London.
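For checking the estimate and plotting the model values, the following is a minimal sketch that computes the mean number of bombs per area from the table above and the counts a Poisson model with that mean would predict. It assumes the "5 or over" category can be treated as exactly 5 bombs when calculating the mean; the variable and function names are illustrative only.

```python
from math import exp, factorial

# Observed number of areas for each bomb count, taken from the table in the slides.
# Key 5 stands for the "5 or over" category (treated as exactly 5 for the mean).
observed = {0: 229, 1: 221, 2: 93, 3: 35, 4: 7, 5: 1}

n_areas = sum(observed.values())
mean_bombs = sum(k * count for k, count in observed.items()) / n_areas

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Expected number of areas under a Poisson model whose parameter is the sample mean.
expected = {k: n_areas * poisson_pmf(k, mean_bombs) for k in range(5)}
expected[5] = n_areas - sum(expected.values())   # remaining tail: 5 or more bombs

print(f"Areas: {n_areas}, mean bombs per area: {mean_bombs:.2f}")
for k in observed:
    label = "5+" if k == 5 else str(k)
    print(f"{label:>2} bombs: observed {observed[k]:3d}, Poisson model {expected[k]:6.1f}")
```

Comparing the two columns side by side is one way to judge whether the hits look like a random (untargeted) process; how close is "close enough" is left as the discussion question on the slide.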