Stick Tossing and Confidence Intervals

Bruce Cohen
Lowell High School, SFUSD
[email protected]
http://www.cgl.ucsf.edu/home/bic

David Sklar
San Francisco State University
[email protected]

Asilomar - December 2006
Ver. 0.5

Estimating a Probability

An Old Problem: When a thin stick of unit length is "randomly" tossed onto a grid of parallel lines spaced one unit apart, what is the probability that the stick lands crossing a grid line?

We would like to take a purely experimental and statistical approach to the problem of finding, or at least estimating, the desired probability.

Our experiments will consist of tossing a stick some fixed number of times, keeping track of how many times the stick lands crossing a grid line (the data), and computing the percentage of times this event occurs (a statistic). Basic statistical theory will help us understand how to interpret these results.

Plan

Estimating a simple probability
  - Toss sticks, gather data
  - Estimating the probability
  - Estimating the uncertainty in the estimate of the probability
  - Confidence intervals and what they mean

Background material
  - The average and standard deviation of a list of numbers
  - Histograms, what they are and what they aren't
  - The average and standard deviation of a histogram
  - The normal curve

Where does the procedure for finding confidence intervals come from? Why does it work?
  - A mathematical model for the data
  - The mathematics of the model
  - Box models and histograms for the sum of the draws
  - The Central Limit Theorem
  - Sketch of a proof of a special case of the Central Limit Theorem

Estimating the Probability: A Sample Calculation

Result: 20 line crossings in 36 tosses

\[ \text{estimated probability} = \frac{\#\text{ crossings}}{\#\text{ tosses}} = \frac{20}{36} \approx 0.5556 \approx 55.6\% \]

Standard Error (SE) for the estimated probability:

\[ \text{SE} = \sqrt{\frac{(\text{est. prob. of crossing})(\text{est. prob. of not crossing})}{\#\text{ tosses}}} = \sqrt{\frac{\frac{20}{36}\cdot\frac{16}{36}}{36}} \approx 0.0828 \approx 8.3\% \]

Conclusions: Based on this data, an approximate 68% confidence interval for the probability that the stick lands crossing a line is 55.6% ± 8.3%, that is, 47.3% to 63.9%. An approximate 95% confidence interval is 55.6% ± 16.6%, that is, 39.0% to 72.2%.

68% Confidence Intervals for 10 Experiments (36 tosses per experiment)

# crossings   estimated prob.   SE      68% confidence interval
20            55.6%             8.3%    47.3% to 63.9%
24            66.7%             7.9%    58.8% to 74.6%
19            52.8%             8.3%    44.5% to 61.1%
28            77.8%             6.9%    70.9% to 84.7%
23            63.9%             8.0%    55.9% to 71.9%
25            69.4%             7.7%    61.7% to 77.1%
24            66.7%             7.9%    58.8% to 74.6%
27            75.0%             7.2%    67.8% to 82.2%
23            63.9%             8.0%    55.9% to 71.9%
21            58.3%             8.2%    50.1% to 66.5%

[Figure: the ten intervals plotted on a common axis running from 40% to 80%]

Pooling the data

Result: 234 line crossings in 360 (independent) tosses

\[ \text{estimated probability} = \frac{\#\text{ crossings}}{\#\text{ tosses}} = \frac{234}{360} = 65.0\% \]

Standard Error (SE) for the estimated probability:

\[ \text{SE} = \sqrt{\frac{(\text{est. prob. of crossing})(\text{est. prob. of not crossing})}{\#\text{ tosses}}} = \sqrt{\frac{(.650)(.350)}{360}} \approx 0.025 = 2.5\% \]

Conclusions: Based on this data, an approximate 68% confidence interval for the probability that the stick lands crossing a line is 65.0% ± 2.5%, that is, 62.5% to 67.5%. An approximate 95% confidence interval is 65.0% ± 5.0%, that is, 60.0% to 70.0%.

68% Confidence Intervals for 10 Experiments (36 tosses per experiment), with the pooled result

The ten intervals are the same as in the table above. The pooled data adds one much narrower interval:

# crossings   estimated prob.   SE      68% confidence interval
234           65.0%             2.5%    62.5% to 67.5%

[Figure: the ten intervals and the pooled interval plotted on a common axis running from 40% to 80%]

Some 95% Confidence Intervals

[Figure: some 95% confidence intervals]

Where Does the Procedure for Finding Confidence Intervals Come From?

As with all "real world" applications of mathematics, we begin with a Mathematical Model.

Box Model

The number of line crossings in n tosses of the stick is like the sum of the values of n draws at random with replacement from a box with two kinds of numbered tickets.
Those numbered 1 correspond to the stick landing crossing a line, and those numbered 0 to not crossing. The percentage of tickets numbered 1 in the box is not known. This unknown percentage corresponds to the probability that a stick lands crossing a line.

[Figure: a box holding an unknown number of tickets numbered 1 and an unknown number of tickets numbered 0]

The set of tickets in the box is called the population, and the (unknown) % of 1's in the population is a parameter. The n drawn tickets are a sample, and the % of 1's in the sample is a statistic.

Note: this kind of box is called a zero-one box.

The Mathematics of the Model

The goal for the rest of the talk is to develop the mathematics of the box model. We first review some basic background material, which we then use to understand the behavior of the sum of the draws from a box of known composition. Finally, we use this understanding to see why the confidence levels come from areas under the normal curve.

The Average and Standard Deviation of a List of Numbers

Example List: 21, 28, 30, 30, 34, 37

The mean or average:

\[ \text{average} = \frac{\text{sum of the values}}{\text{number of elements}} = \frac{180}{6} = 30 \]

The mean measures the "center" of the list. The average is the balance point.

[Figure: the list marked on a number line, balancing at 30]

\[ \text{deviation} = \text{element} - \text{average} \]

Deviations: -9, -2, 0, 0, 4, 7

The Standard Deviation (SD):

\[ \text{SD} = \sqrt{\text{mean of the squared deviations}} = \sqrt{\frac{9^2 + 2^2 + 0^2 + 0^2 + 4^2 + 7^2}{6}} = 5 \]

The SD measures the spread of the list about the mean. It has the same units as the values in the list. It is a natural scale for the list: we are often more interested in how many SDs a value is from the mean than in the value itself.

The Average and Standard Deviation of a List of Numbers (zero-one lists)

For a list consisting of just 0's and 1's we have:

\[ \text{average} = \frac{\text{sum of the values}}{\text{number of elements}} = \frac{\text{number of ones}}{\text{number of elements}} = \text{fraction of 1's} \]

and with some algebra we can show that

\[ \text{SD} = \sqrt{\text{mean of the squared deviations}} = \sqrt{(\text{fraction of 1's})(\text{fraction of 0's})} \]

We can now re-interpret the procedure for estimating our probability:

\[ \text{estimated probability} = \frac{\#\text{ crossings}}{\#\text{ tosses}} = \frac{\text{sample \# of 1's}}{\text{sample size}} = \text{sample fraction of 1's} = \text{sample average} \]

\[ \text{SE} = \sqrt{\frac{(\text{est. prob. of crossing})(\text{est. prob. of not crossing})}{\#\text{ tosses}}} = \sqrt{\frac{(\text{sample fraction of 1's})(\text{sample fraction of 0's})}{\text{sample size}}} = \frac{\text{sample SD}}{\sqrt{\text{sample size}}} \]

Properties of The Average and Standard Deviation

1. If we add a constant, B, to each element of a list, the average of the new list is the old average + B.
2. If we multiply each element of a list by a constant, A, the average of the new list is A times the old average.
3. If we add a constant, B, to each element of a list, the SD of the new list is the old SD.
4. If we multiply each element of a list by a constant, A, the SD of the new list is |A| times the old SD.

Standard Units

We are often more interested in how many SDs a value is from the mean than in the value itself. For example: 37 is 1.4 SDs above the average, or 28 is 0.4 SDs below the average.

The value of an element in Standard Units is the number of SDs it is above (positive), or below (negative), the mean. To convert a value to standard units use

\[ \text{value in standard units } (z\text{-value}) = \frac{\text{deviation}}{\text{SD}} = \frac{\text{value} - \text{average}}{\text{SD}} \]

Example List: 21, 28, 30, 30, 34, 37 with average 30 and SD 5
In Standard Units: -1.8, -0.4, 0, 0, 0.8, 1.4

A list in standard units will have mean 0 and SD 1.

Adding a constant to each element of a list, or multiplying each element by a positive constant, will not change the values of the elements in standard units (a negative constant reverses their signs).

For many lists roughly 68% of the values lie within 1 SD of the mean and 95% lie within 2 SDs.

From Lists to Histograms

Example: 36 Exam Scores

23, 29, 30, 31, 35, 38, 40, 41, 42, 45, 46, 51, 52, 54, 55, 55, 57, 58, 59, 60, 61, 63, 69, 70, 70, 71, 71, 74, 75, 75, 82, 85, 86, 91, 91, 93.

Note: Av = 59.1, SD = 18.9

Endpoint convention: class intervals contain left endpoints, but not right endpoints.

A Histogram represents the percentages by areas (not by heights).
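The list computations and the area rule for histograms can be checked numerically. What follows is a minimal Python sketch, not part of the original talk. The function names (`mean_sd`, `standard_units`, `histogram_densities`) are hypothetical, and the class intervals 20-38, 38-50, 50-74, 74-90, 90-100 are the ones used for the exam-score histogram in this talk.

```python
import math

def mean_sd(values):
    """Average and SD of a list; the SD divides by n, as in the talk."""
    n = len(values)
    avg = sum(values) / n
    sd = math.sqrt(sum((v - avg) ** 2 for v in values) / n)
    return avg, sd

def standard_units(values):
    """Convert each value to standard units: (value - average) / SD."""
    avg, sd = mean_sd(values)
    return [(v - avg) / sd for v in values]

def histogram_densities(values, edges):
    """For each class interval return (area in %, density in % per point).

    Each interval contains its left endpoint but not its right endpoint.
    Density is area divided by width, so that area = width * height.
    """
    n = len(values)
    rows = []
    for left, right in zip(edges, edges[1:]):
        count = sum(1 for v in values if left <= v < right)
        percent = 100.0 * count / n
        rows.append((percent, percent / (right - left)))
    return rows

example = [21, 28, 30, 30, 34, 37]
scores = [23, 29, 30, 31, 35, 38, 40, 41, 42, 45, 46, 51, 52, 54, 55, 55,
          57, 58, 59, 60, 61, 63, 69, 70, 70, 71, 71, 74, 75, 75, 82, 85,
          86, 91, 91, 93]
edges = [20, 38, 50, 74, 90, 100]

avg, sd = mean_sd(example)
z_values = standard_units(example)
score_avg, score_sd = mean_sd(scores)
rows = histogram_densities(scores, edges)

print("example list: average", avg, "SD", sd)
print("in standard units:", [round(z, 1) for z in z_values])
print("exam scores: average", round(score_avg, 1), "SD", round(score_sd, 1))
for (left, right), (percent, density) in zip(zip(edges, edges[1:]), rows):
    print(f"{left:3d} - {right:3d}: area {percent:4.1f}%  density {density:.1f} % per point")
```

Dividing by n rather than n - 1 in the SD is a deliberate choice here, made to match the "mean of the squared deviations" definition used on the slides.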
area in % = (width in pts) × (height in % per pt)

\[ \text{density in \% per pt} = \frac{13.9\%}{18 \text{ pts}} \approx 0.8 \]

class interval   # of scores   area (%)   width (pts)   density (% per point)
20 - 38          5             13.9       18            0.8
38 - 50          6             16.7       12            1.4
50 - 74          16            44.4       24            1.9
74 - 90          6             16.7       16            1.0
90 - 100         3             8.3        10            0.8

[Figure: histogram of the scores; horizontal axis scores from 20 to 100, vertical axis density in % per point from 0.0 to 2.0, each block labeled with its area and its density]

A histogram is not a bar chart.

A Histogram is Not A Bar Chart

[Figure: two panels. "Bar Chart of Scores": % of total papers (0 to 40) for the five class intervals drawn as equal-width bars. "Histogram of Scores": density in % per point against scores from 20 to 100, with block areas 13.9%, 16.7%, 44.4%, 16.7%, 8.3%]

A Histogram represents the percentages by areas (not by heights). A histogram is not a bar chart.

The Average and Standard Deviation of a Histogram

To find the mean or average of a histogram, first list the center of each class interval, then multiply each by the area of the block above it, and finally sum.

Class intervals: 20 to 38, 38 to 50, 50 to 74, 74 to 90, 90 to 100
List of midpoints: 29, 44, 62, 82, 95

\[ \text{Histogram Av} = 29(.139) + 44(.167) + 62(.444) + 82(.167) + 95(.083) \approx 60.5 \]

To find the standard deviation of a histogram, find the squared deviations of the center of each class interval, then multiply each by the area of its corresponding block, then sum, and finally take the square root.

\[ \text{Histogram SD} = \sqrt{(29 - 60.5)^2(.139) + (44 - 60.5)^2(.167) + (62 - 60.5)^2(.444) + (82 - 60.5)^2(.167) + (95 - 60.5)^2(.083)} \approx 19.0 \]

[Note for the original data: Av = 59.1, SD = 18.9]

For many histograms roughly 68% of the area lies within 1 SD of the mean and 95% lies within 2 SDs.

[Figure: histogram of the scores with Av = 60.5 and SD = 19 marked]

Histograms and Standard Units

[Figure: the same histogram (Av = 60.5, SD = 19) with a second horizontal axis in standard units running from -3 to 3]

The Normal Curve

[Figure: the normal curve; horizontal axis in standard units, height in % per standard unit, areas in percent]

The normal curve was discovered by Abraham De Moivre around 1720.
Around 1870 Adolph Quetelet had the idea of using it as an ideal histogram to which histograms for data could be compared. Many histograms follow the normal curve and many do not.

The equation for the Standard Normal Curve is

\[ y = f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \]

The family of normal curves:

\[ g(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]

(Figure from: Freedman, Pisani, and Purves, Statistics, 3rd Ed.)

Histograms, Standard Units, and the Normal curve

[Figure: the histogram of scores (Av = 60.5, SD = 19) with the normal curve superimposed; second horizontal axis in standard units from -3 to 3]

Data Histograms and Probability Histograms

[Figure: data histograms and a probability histogram, using the discrete data convention. From: Freedman, Pisani, and Purves, Statistics, 3rd Ed.]

Data Histograms and Probability Histograms for the Sum of the Draws

[Figure: data histograms and the probability histogram for the sum of the draws]

The Central Limit Theorem

There are many Central Limit Theorems. We state two in terms of box models. The second is a special case of the first, and it covers the model we are dealing with in our stick tossing problem. It goes back to the early eighteenth century.

General version: When drawing at random with replacement from a box of numbered tickets (with bounded range), the probability histogram for the sum of the draws will follow the standard normal curve, even if the contents of the box do not. The histogram must be put into standard units, and the number of draws must be reasonably large.

De Moivre-Laplace version: When drawing at random with replacement from a zero-one box, the probability histogram for the sum of the draws will follow the standard normal curve, even if the contents of the box do not. The histogram must be put into standard units, and the number of draws must be reasonably large.

The Normal Curve and Probability Histograms for the Sum of the Draws

The box [ 1 0 ] provides a box model for counting the number of heads in n tosses of a fair coin.
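The De Moivre-Laplace statement can be explored numerically for the coin-tossing box. The sketch below is an illustration added here, not taken from the talk: it computes the exact probability histogram for the sum of n draws from a zero-one box, puts it into standard units, and reports the largest gap between the block heights and the standard normal curve. The function names and the choice of n values are assumptions made for the example.

```python
import math

def binomial_probability(k, n, p):
    """Chance that the sum of n draws from a zero-one box equals k,
    where p is the fraction of tickets numbered 1."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_density(z):
    """Height of the standard normal curve at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def max_gap(n, p):
    """Largest difference between the probability histogram for the sum
    (in standard units) and the standard normal curve."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    gap = 0.0
    for k in range(n + 1):
        z = (k - mean) / sd
        # In standard units each block has width 1/sd, so its height is
        # the probability multiplied by sd.
        height = binomial_probability(k, n, p) * sd
        gap = max(gap, abs(height - normal_density(z)))
    return gap

# Fair-coin box: half the tickets are 1's.
for n in (10, 100, 400):
    print(n, "draws: largest gap", round(max_gap(n, 0.5), 4))
```

The gap shrinks as the number of draws grows, which is the sense in which the histogram "follows" the normal curve. For a lopsided box the same function can be called with a different p; the agreement is then slower to set in.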
[Figure: histogram for the box, with tickets 0 and 1 at 50% each, and probability histograms for the sum of the draws. From: Freedman, Pisani, and Purves]

The Normal Curve and Probability Histograms for the Sum of the Draws

[Figure: further probability histograms for the sum of the draws. From: Freedman, Pisani, and Purves]

The Normal Curve and Probability Histograms for the Sum of the Draws

[Figure: histogram for a box with ticket values 1, 2, and 9, and probability histograms for the sum of the draws. From: Freedman, Pisani, and Purves]

The Central Limit Theorems

General version: When drawing at random with replacement from a box of numbered tickets (with bounded range), the probability histogram for the sum (and average) of the draws will follow the standard normal curve, even if the contents of the box do not. The histogram must be put into standard units, and the number of draws must be reasonably large.

De Moivre-Laplace version: When drawing at random with replacement from a zero-one box, the probability histogram for the sum (and average) of the draws will follow the standard normal curve, even if the contents of the box do not. The histogram must be put into standard units, and the number of draws must be reasonably large.

The probability histogram for the average of the draws, when put in standard units, is the same as for the sum, because multiplying each value of the sum by 1/(# of draws) won't change the corresponding values in standard units.

Where Does the 68% Confidence Level Come From?

\[ \text{Estimated SE for the average} = \frac{\text{SD of the sample}}{\sqrt{\#\text{ of draws}}} \]

\[ \text{True SE for the average of the draws} = \frac{\text{Pop. SD}}{\sqrt{\#\text{ of draws}}} \]

[Figure: normal curve for the sample average, centered at the true population average, with the region within 1 standard unit of the center marked]

Since the estimated SE for the average, computed from the sample, is on average about equal to the true SE, a 68% confidence interval will cover the true population mean whenever the sample mean is within 1 SE of the true mean. The probability of this happening is, by the central limit theorem, the area within 1 standard unit of 0 under the normal curve, and this area is about 68%.
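The argument above can be tried out by simulation. The following Python sketch is an illustration added here, under stated assumptions: it recomputes the estimate and SE from the sample calculation earlier in the talk (20 crossings in 36 tosses, and the pooled 234 in 360), then draws repeatedly from a zero-one box of known composition and counts how often the 1 SE and 2 SE intervals cover the true fraction of 1's. The value `P_TRUE = 0.65`, the seed, the number of simulated experiments, and all function names are arbitrary choices for the example, not values from the talk.

```python
import math
import random

def estimate_with_se(crossings, tosses):
    """Estimated probability and its estimated SE (sample SD / sqrt(n))."""
    p_hat = crossings / tosses
    se = math.sqrt(p_hat * (1 - p_hat) / tosses)
    return p_hat, se

def confidence_interval(crossings, tosses, n_se=1):
    """Estimate plus or minus n_se SEs (1 for about 68%, 2 for about 95%)."""
    p_hat, se = estimate_with_se(crossings, tosses)
    return p_hat - n_se * se, p_hat + n_se * se

def coverage(p_true, tosses, n_se, experiments, rng):
    """Fraction of simulated experiments whose interval covers p_true."""
    hits = 0
    for _ in range(experiments):
        # One experiment: draw `tosses` tickets from the zero-one box.
        crossings = sum(1 for _ in range(tosses) if rng.random() < p_true)
        low, high = confidence_interval(crossings, tosses, n_se)
        if low <= p_true <= high:
            hits += 1
    return hits / experiments

# The sample calculation: 20 of 36, and the pooled 234 of 360.
p_36, se_36 = estimate_with_se(20, 36)
p_360, se_360 = estimate_with_se(234, 360)
print("36 tosses: ", round(p_36, 4), "SE", round(se_36, 4))
print("360 tosses:", round(p_360, 4), "SE", round(se_360, 4))

# Assumed box composition for the simulation (hypothetical value).
P_TRUE = 0.65
rng = random.Random(2006)
cover_68 = coverage(P_TRUE, 36, 1, 5000, rng)
cover_95 = coverage(P_TRUE, 36, 2, 5000, rng)
print("coverage of 1 SE intervals:", cover_68)
print("coverage of 2 SE intervals:", cover_95)
```

With only 36 draws the sum takes just a few distinct values, so the simulated coverage should be expected to land near 68% and 95% rather than exactly on them; increasing `tosses` brings it closer to the normal-curve areas.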
How to Prove The De Moivre-Laplace Version of The Central Limit Theorem

Show that the probability that the sum of n draws at random with replacement from a zero-one box is exactly k is given by the binomial formula

\[ b(k; n, p) = \frac{n!}{k!\,(n-k)!}\, p^k q^{n-k}, \qquad \text{where } q = 1 - p. \]

Then, using "Stirling's Formula"

\[ n! \approx \sqrt{2\pi}\; n^{n+\frac{1}{2}}\, e^{-n}, \]

show

\[ b(k; n, p) \approx \sqrt{\frac{n}{2\pi\, k\,(n-k)}}\; \left(\frac{np}{k}\right)^{k} \left(\frac{nq}{n-k}\right)^{n-k}. \]

Letting x = k - np, and recalling that q = 1 - p (so that k = np + x and n - k = nq - x),

\[ b(k; n, p) \approx \frac{1}{\sqrt{2\pi npq\left(1 + \frac{x}{np}\right)\left(1 - \frac{x}{nq}\right)}} \cdot \frac{1}{\left(1 + \frac{x}{np}\right)^{x+np} \left(1 - \frac{x}{nq}\right)^{nq-x}}. \]

How to Prove The De Moivre-Laplace Version of The Central Limit Theorem -- continued

Use the series for the log to show that, for |x| < npq,

\[ \log\left[\left(1 + \frac{x}{np}\right)^{x+np} \left(1 - \frac{x}{nq}\right)^{nq-x}\right] \approx \frac{x^2}{2npq}, \]

which implies

\[ \left(1 + \frac{x}{np}\right)^{x+np} \left(1 - \frac{x}{nq}\right)^{nq-x} \approx e^{\frac{x^2}{2npq}}. \]

Hence

\[ b(k; n, p) \approx \frac{1}{\sqrt{2\pi npq}}\, e^{-\frac{x^2}{2npq}} = \frac{1}{\sqrt{2\pi npq}}\, e^{-\frac{1}{2}\left(\frac{k - np}{\sqrt{npq}}\right)^2} = \frac{1}{\sqrt{2\pi npq}}\, e^{-\frac{1}{2}z^2}, \qquad \text{where } z = \frac{k - np}{\sqrt{npq}}. \]

The limiting processes in these steps require some care. Both k and n must go to infinity together in a fixed relationship to each other, and we need to understand why values of x for which |x| > npq are unimportant.

Bibliography

1. Freedman, Pisani, & Purves, Statistics, 3rd Ed., W.W. Norton, New York, 1998.
2. W. Feller, An Introduction to Probability Theory and Its Applications, Volume I, 2nd Ed., John Wiley & Sons, New York, London, Sydney, 1957.
3. F. Mosteller, Fifty Challenging Problems in Probability with Solutions, Addison-Wesley, Palo Alto, 1965.
4. http://www-history.mcs.st-andrews.ac.uk/Biographies/De_Moivre.html
5. R Development Core Team, R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria, 2006, <http://www.R-project.org>.