Hitting 10

You toss a coin repeatedly. If heads is worth 1 and tails is worth 2, and you add up the score as you go along, what is the probability that you’ll hit 10? That is to say, 10 might be one of the sums you get, but it also might not: if you hit 9 and then toss tails, you’ll go straight to 11 and miss 10. The question asks, if you do this a large number of times, in what proportion of the times will you get 10 as one of your sums?

A note on notation: it makes sense to jettison the terminology “heads” and “tails” and use “1” and “2”, as if the coin had a 1 on one side and a 2 on the other. This is what I will do.

As usual, I start out collecting data. Each of 35 students runs 10 trials and keeps track of the number of “successes” (hitting 10). The scores range from 4/10 to 9/10 successes, with an overall success rate of 216/350 = 0.62. That’s an empirical estimate of our probability.

One student asserts right away that the answer should be 2/3. Here is her argument. Think of the possible sequences of sums that go “past” 10. We could have

8, 10
9, 10
9, 11

These are the only possibilities; that is, every sequence will contain one (and only one) of these pairs. Now two of these hit 10 and one misses 10, for a probability of 2/3.

An interesting argument. But a number of students are uncomfortable with it and there is a clear need for some care in the analysis. But where to begin? Someone suggests that if we knew the probability of hitting 8 or 9, then we might be able to figure out the probability of hitting 10. That sounds like induction: working up from small numbers. So let’s take some small numbers N and see what we get.

What’s wrong with the 2/3 argument? That’s not easy to see at first. In fact probabilistic arguments can be devilishly tricky. It is true that a probability can be regarded as the proportion of “successes” among all the “possibilities” but it’s important that these possibilities all be equally likely.
Are these three “sequences” equally likely to occur?

Some notation: we’ll denote by $P_N$ the probability of hitting the target N.

Why not start at the very beginning? What’s the probability of hitting N=1? Well, to hit 1, you’ll have to toss 1 right away, and that will happen with probability ½. So $P_1$ = ½.

Okay, let’s go for N=2. Here I get a verbal argument. Think about the first toss: half the time you toss 2 and succeed right away. What about the other half when you toss 1? Well, then you toss again and half the time you toss another 1 (success) and the other half you toss 2 (failure). So with probability ½ + ¼ you hit 2 and with probability ¼ you miss it. So $P_2$ = ¾.

hitting10 11/13/2007 1

Now on to N=3. This time a student comes to the board and tabulates all the possible sequences that would arise, stopping each time when it is known whether 3 has been hit. There are 5 possible sequences, of which 3 are successful, for a probability of 3/5.

The N=3 table
111  S
112  F
12   S
21   S
22   F

I let the table stand, and take a vote. Nearly everyone accepts the result. But there are a few abstainers with cross looks and wrinkled brows.

Buoyed up by this show of support, the student carries on. He has done the same for N=4 and found 8 possible sequences of which 5 are successes. Now 3, 5 and 8 are successive Fibonacci numbers, and he asserts that it continues in this way. Thus $P_N$ is always the ratio of two successive Fibonacci numbers.

The N=4 table
1111  S
1112  F
112   S
121   S
122   F
211   S
212   F
22    S

Well, what can I say? Wow. Certainly the class is cowed. “Comments?” I ask crisply. Silence, even from the wrinkled brows. We’ve encountered the Fibonacci numbers. What could be better? It just has to be right.

To trouble-shoot, I suggest we go back and do the same analysis for the case N=2 that we have already done. It shows three possibilities, two of which are successes. Thus $P_2$ should be 2/3. But we’ve just shown that $P_2$ = 3/4. So there’s certainly a flaw in the reasoning here.
The N=2 table
11  S
12  F
2   S

Let’s compare this with the original argument. There we noted that the initial 2 (the third row) occurs ½ the time, so this contributes ½ to the probability right away. And the other two cases each contribute ¼. That is, it would seem that the rows of the table should be weighted with their probability of occurrence. If we include these weights (as a third column), and add up all the S-weights, we do get the correct answer of ¾.

The N=2 table with weights
11  S  ¼
12  F  ¼
2   S  ½

It would appear that the flaw in the “Fibonacci” calculation is that the various rows in the table haven’t been weighted by their probability of occurrence. Let’s see what happens if we do that with the N=3 table.

The N=3 table with weights
111  S  1/8
112  F  1/8
12   S  1/4
21   S  1/4
22   F  1/4

The new weighted table now gives a success probability of 1/8 + 1/4 + 1/4 = 5/8 and this is in fact the correct answer. It also checks out with a couple of approaches that other students have been trying. Note that since the third column represents the probabilities of the different possible outcomes, its entries have to sum to 1. That’s a nice error check. [We’ve had to abandon that amusing Fibonacci excursion, but see problem 5.]

I give the class the job of constructing the weighted table for N=4, and we get the answer of 11/16. It starts to get hard to organize the entries in a systematic way. Some students are using tree diagrams instead of tables, and they provide an easy system, but they get large quickly.

The N=4 table with weights
1111  S  1/16
1112  F  1/16
112   S  1/8
121   S  1/8
122   F  1/8
211   S  1/8
212   F  1/8
22    S  1/4

[Tree diagram: each branch is a toss of 1 or 2, and each leaf is marked S or F.]

Can you track the correspondence between the tree diagram and the table?

Perhaps it’s time to start tabulating the $P_N$ values to see if we can see some useful patterns in the numbers. So far, we have calculated the first 4 values. Any guesses as to what comes next?
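The weighted tables can also be generated mechanically. What follows is a minimal Python sketch, not something from the class itself; the function names `weighted_table` and `hit_probability` are made up for illustration. It lists every toss sequence, stops as soon as the running sum reaches or passes the target, and attaches the weight (1/2) raised to the number of tosses, using exact fractions.

```python
from fractions import Fraction


def weighted_table(target):
    """Return rows (sequence, outcome, weight) for a given target.

    A sequence is extended until its running sum reaches or passes the
    target; the weight of a row is (1/2) ** (number of tosses in it).
    """
    rows = []

    def extend(seq, total):
        if total >= target:
            outcome = "S" if total == target else "F"
            rows.append((seq, outcome, Fraction(1, 2 ** len(seq))))
            return
        for toss in (1, 2):
            extend(seq + str(toss), total + toss)

    extend("", 0)
    return rows


def hit_probability(target):
    """Add up the weights of the successful (S) rows."""
    return sum(w for _, outcome, w in weighted_table(target) if outcome == "S")


for n in range(1, 5):
    table = weighted_table(n)
    # The error check from the text: the weights must sum to 1.
    assert sum(w for _, _, w in table) == 1
    print(n, len(table), hit_probability(n))
```

For N = 1 to 4 the row counts come out as 2, 3, 5, 8, while the weighted success probabilities are 1/2, 3/4, 5/8, 11/16, in agreement with the tables built by hand.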
N      1    2    3    4      5
$P_N$  1/2  3/4  5/8  11/16  ?

One clear pattern is seen in the denominators, 2, 4, 8, 16: at the Nth stage we get $2^N$. Well, an interesting pattern is observed in the numerators as well: 1, 3, 5 and 11. Add consecutive pairs:

1 + 3 = 4, which is the denominator of 3/4
3 + 5 = 8, which is the denominator of 5/8
5 + 11 = 16, which is the denominator of 11/16.

Can this be a coincidence? The prediction it gives for $P_5$ is 21/32. And someone has been constructing the table for N=5, and reports that this is correct! Wow. Nice.

Perhaps we should stop now and try to see why this interesting numerical pattern might hold (and thereby get a proof of its validity). But the class is impatient to complete the table, and so that’s what we do.

N      1  2  3    4      5      6      7       8        9        10
$P_N$  ½  ¾  5/8  11/16  21/32  43/64  85/128  171/256  341/512  683/1024

Okay! We have a candidate: $P_{10}$ = 683/1024. How can we be sure that it’s correct?

A hand goes up in the back. It is Devon.

It’s correct.
I beg your pardon?
It’s correct. The answer for $P_{10}$ is correct.
How do you know?
I worked it out.
You worked it out? You calculated $P_{10}$?
Yes.
How?
Uhmm…
You made a table of all the possibilities?
Not exactly…
A tree?
Not really…

I invite Devon to the front, and give him the chalk and a clean panel. He proceeds to display an extremely elegant accounting of all possible paths leading to a success for N=10. With a bit of reorganization on my part, this is the table he puts up.

Different routes to get to 10

sequence composition   prob. of sequence   number of sequences   total probability
10 ones                $(1/2)^{10}$        1                     $1/2^{10}$
8 ones & 1 two         $(1/2)^9$           9                     $9/2^9$
6 ones & 2 twos        $(1/2)^8$           8!/(6! 2!) = 28       $28/2^8$
4 ones & 3 twos        $(1/2)^7$           7!/(4! 3!) = 35       $35/2^7$
2 ones & 4 twos        $(1/2)^6$           6!/(2! 4!) = 15       $15/2^6$
5 twos                 $(1/2)^5$           1                     $1/2^5$

Having put the table on the board, he then calculates the sum:

$$\frac{1}{2^{10}} + \frac{9}{2^{9}} + \frac{28}{2^{8}} + \frac{35}{2^{7}} + \frac{15}{2^{6}} + \frac{1}{2^{5}} = \frac{1}{2^{10}}\left[1 + 2 \cdot 9 + 4 \cdot 28 + 8 \cdot 35 + 16 \cdot 15 + 32\right] = \frac{683}{2^{10}}$$

A convincing result!
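Devon’s accounting can be checked by machine. The sketch below is one possible way to do it, assuming Python 3.8 or later for `math.comb`; the function name `hit_probability_by_counting` is hypothetical. For each possible number of twos it counts the arrangements of ones and twos that add up to the target, and weights each arrangement by (1/2) raised to the number of tosses.

```python
from fractions import Fraction
from math import comb


def hit_probability_by_counting(target):
    """Sum over the number of twos: (arrangements) * (1/2) ** (tosses)."""
    total = Fraction(0)
    for twos in range(target // 2 + 1):
        ones = target - 2 * twos      # the ones make up the rest of the sum
        tosses = ones + twos          # length of each such sequence
        total += Fraction(comb(tosses, twos), 2 ** tosses)
    return total


print(hit_probability_by_counting(10))  # 683/1024
```

For a target of 10 the loop visits the same six rows as the table: 1, 9, 28, 35, 15 and 1 sequences, of lengths 10 down to 5.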
Needless to say, this “full calculation” is not the route I intended the class to follow. Indeed, many of them have not yet encountered the combinatorial coefficients. And I can feel a definite level of anxiety rising from the class.

What happened here really blew me away. Devon hadn’t wasted any time either: he had in fact obtained the answer before we had completed the table, and was just waiting for the chance to put it forward. It turns out he was taking OAC finite math that very term (the year was 1995), and this problem hit him right at a golden moment.

Most of the class has trouble seeing what Devon has done. It helps to look back at the N=4 table. Notice, for example, that there are three rows which have 2 ones and 1 two, each with probability 1/8. Well, in his accounting, these would all be listed on one row and there would be a 3 in his third column to say that he’s counted three rows of the “expanded” table. Similarly, the 28 in column 3 of his table says that there are 28 ways to arrange a string of 6 ones and 2 twos, and so these would have occupied 28 rows in the extended table. So his table gives us a very economical presentation.

“What we have here,” I say, “is a complete (and remarkably compact) listing of cases. Such an approach is of great theoretical importance, and you would encounter it in any introductory finite math course. So I’m very happy for you to have seen an example here of what it looks like. However the method has some disadvantages. It is computationally heavy. How well would it work if I had asked for the probability of hitting 100?” I look at Devon and he wrinkles his brow. “And even more: could this approach give us a general formula for $P_N$?”

In fact in problems of this type there is often an approach which does not require listing all the successful cases, and which uses a nice piece of reasoning to get around that. Let’s get back to some of those intriguing patterns.
By now, the class has noticed more patterns in the $P_N$ table. For example, if you double any numerator, you get the next numerator ±1, where you alternate the + and the −. The point is that these numerical patterns should correspond to algebraic relationships between the $P_N$. And if we can formulate these, we might be able to “see” why they must hold.

N      1    2    3    4      5
$P_N$  1/2  3/4  5/8  11/16  21/32

Let’s take, for example, the original numerator pattern: 1 + 3 = 4, 3 + 5 = 8, 5 + 11 = 16, etc. What does this tell us about relationships among the $P_N$? To have a specific case, let’s take 5 + 11 = 16. To get the $P_N$ into the picture, divide by 16:

$$\frac{5}{16} + \frac{11}{16} = 1$$

Now the 11/16 is clearly $P_4$. But what is the 5/16? Well, it would seem to be half of $P_3$. So we write this as:

$$\frac{P_3}{2} + P_4 = 1$$

Okay. That’s an intriguing relationship. Who can see a reason why it should be true? Well, since the two terms on the left sum to 1, and the second is the probability of hitting 4, the first has to be the probability of not hitting 4. So that becomes the question: can we see a reason why $P_3/2$ should be the probability of not hitting 4?

Well, it takes a minute, but it’s all too easy. The crucial observation is that there’s only one way to miss 4 and that’s to hit 3 and then toss a 2. And the chances of doing that are $P_3$ × ½, the first factor for hitting the 3, and the second for tossing the 2. That’s real nice.

This is a general result: the only way to miss N+1 is to hit N and then toss a 2. This can be written:

$$1 - P_{N+1} = \frac{1}{2} P_N$$

If we rewrite it as

$$P_{N+1} = 1 - \frac{1}{2} P_N$$

it is in recursive form, and this allows us to recursively calculate all the $P_N$ starting at $P_1$ = ½.

I give the students some time to find this lovely argument on their own, but not many succeed. They seem to find it a bit foreign. One might have thought that this recursive approach would have been tried at the very beginning of the problem.
But for all its simplicity and directness, it does not seem to occur naturally. It’s an approach that must be learned and experienced.

One more question: can we work out $P_{100}$ from this? Of course we could start with $P_1$ and iterate the recursion 99 times. That would be easy enough with a spreadsheet, but where would it get us? What we’d really like is a formula for $P_N$ in terms of N. And that’s really a question of solving the recursion. We have what’s called a first-order linear inhomogeneous recursion and there are lots of different “methods” around to solve it. But my favorite is the one that I find high school students doing if they’re not told what to do, and that’s just to iterate the recursion again and again but keeping the terms in expanded form until one sees a pattern.

We have a recursive formula for $P_N$:

$$P_{N+1} = 1 - \frac{1}{2} P_N$$

Can we solve it?

To make the calculations easier (and more transparent!) we write the recursion as $P_{N+1} = 1 + aP_N$ with a = −½. Then start at the beginning, with $P_0 = 1$ (the running sum starts at 0, so 0 is always hit):

$$
\begin{aligned}
P_0 &= 1 \\
P_1 &= 1 + aP_0 = 1 + a \\
P_2 &= 1 + aP_1 = 1 + a(1 + a) = 1 + a + a^2 \\
P_3 &= 1 + aP_2 = 1 + a(1 + a + a^2) = 1 + a + a^2 + a^3 \\
P_4 &= 1 + aP_3 = 1 + a(1 + a + a^2 + a^3) = 1 + a + a^2 + a^3 + a^4
\end{aligned}
$$

Notice what a difference that “a” makes. Using a general parameter there instead of the −½ really allows us to focus on the developing pattern as we move from one N to the next.

The pattern is clear and it is clear that it will continue (a formal proof of that can be made using mathematical induction).

$$P_N = 1 + a + a^2 + a^3 + \cdots + a^N = \frac{1 - a^{N+1}}{1 - a}$$

using the formula for the sum of a geometric series. Now putting a = −½:

$$P_N = \frac{1 - (-1/2)^{N+1}}{1 - (-1/2)} = \frac{1 - (-1/2)^{N+1}}{3/2} = \frac{2}{3}\left[1 - (-1/2)^{N+1}\right]$$

We have a formula! The probability of hitting 10 is then

$$P_{10} = \frac{2}{3}\left[1 - (-1/2)^{11}\right] = \frac{2}{3}\left[1 - \frac{-1}{2048}\right] = \frac{2}{3}\left[\frac{2049}{2048}\right] = \frac{683}{1024}$$

as we obtained with our combinatorial calculations. The general formula is nice.
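As a cross-check, the recursion and the closed formula can be compared directly. This is a minimal Python sketch using exact fractions; the names `by_recursion` and `by_formula` are illustrative only, and the recursion is started from $P_0 = 1$ as in the derivation.

```python
from fractions import Fraction


def by_recursion(n):
    """Iterate P_{N+1} = 1 - (1/2) P_N, starting from P_0 = 1."""
    p = Fraction(1)
    for _ in range(n):
        p = 1 - p / 2
    return p


def by_formula(n):
    """Closed form P_N = (2/3) * (1 - (-1/2) ** (N + 1))."""
    return Fraction(2, 3) * (1 - Fraction(-1, 2) ** (n + 1))


# The two routes should agree exactly for every N tried.
for n in (1, 2, 3, 4, 10, 100):
    assert by_recursion(n) == by_formula(n)

print(by_formula(10))            # 683/1024
print(float(by_formula(100)))    # decimal value of P_100
```

Iterating the recursion 100 times is no trouble for a machine, but only the formula shows at a glance how $P_N$ behaves as N grows.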
The formula tells us that the probabilities will be alternately above and below 2/3, and will approach 2/3 in the limit. You might already have managed to argue that the probability should get close to 2/3 for large N.

Why should the $P_N$ approach 2/3? Here’s a nice way of thinking about it. Suppose we make a large number of tosses, say 200. Then we expect 100 1’s and 100 2’s, so our total sum should be about (on average?) 300. But how many numbers between 1 and 300 will we have hit? Well, we get a new hit with every toss, so we’ll hit 200 numbers. That means we hit a proportion 200/300 of the numbers up to 300. So the probability of hitting any particular number should be around 2/3.

Question. There’s something a bit unexpected about the $P_N$ formula and that’s the 3 in the denominator. Isn’t the denominator supposed to be $2^N$? Argue that $P_N$ will always be able to be written as a fraction with $2^N$ in the denominator.

Problems

1. Harry and Sally play the following game with a biased coin. Sally flips first. If she gets heads she wins; if she gets tails she gives the coin to Harry. Then Harry flips and if he gets heads he wins; if he gets tails he gives the coin to Sally. The game continues until someone flips heads and wins.
(a) If heads comes up with probability 1/4, what is the probability Sally will win?
(b) If over many repetitions of the game, Sally wins 75% of the time, what would you say was the bias on the coin?
(c) Let the bias on the coin be such that heads occurs a proportion x of the time, and suppose Sally wins with probability p. Find simple formulas for p in terms of x and for x in terms of p.

2. Annie and Bill each have a fair die, which they throw alternately. Annie scores if she throws a multiple of three, and Bill scores if he throws a multiple of two. The winner is the first person to score. Show that the game is fair if Annie rolls first.
[Note: If I'm using this problem in class, it works well to put the class into groups of two to play this game a few times, and collect some data. We see if the game really seems to be fair, and we get a "hands-on" feel for the analysis.]

3. Here’s a generalization of 2. Suppose that Annie and Bill each spin a pointer with n equally likely outcomes numbered 1 to n. Annie scores if she spins any of the first k numbers 1, 2, …, k and Bill scores if he spins any of the numbers k+1, k+2, …, n. If Annie goes first, for what numbers n and k is the game fair?

4. A flips a fair coin 100 times, and B flips a fair coin 101 times. What is the probability that B gets more heads than A? [This is an awesome problem and works well with an exploratory approach. Start by solving simpler problems.]

5. Following Fibonacci. What the Fibonacci guy discovered (above) is that if you count the number of different sequences that you might generate in deciding whether a target n is “hit” you get a Fibonacci number, and if you count the number of those that are “successful,” in that they hit the target, you get the preceding Fibonacci number. In fact, that’s a very nice result, not about probabilities, but about numbers of sequences of a certain type. Track these ideas down and write a nice little essay about the subject.

6. Follow the methods developed above to solve the recursion

$$x_{n+1} = 3 - \frac{3}{4} x_n$$

starting with $x_0$ = 1. That is, find a general formula for $x_n$.