Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
The Progressive Monty Hall Problem S.K. Lucas James Madison University Harrisonburg, VA 22807 [email protected] Jason Rosenhouse James Madison University Harrisonburg, VA 22807 [email protected] Andrew Schepler 341 S. Highland Ave. Apt. A Pittsburgh, PA 15206 [email protected] In the classical Monty Hall problem, you are shown three identical doors. Behind one of them is a car. The other two conceal goats. You choose one of the doors, but do not open it. Monty now opens a door he knows to conceal a goat and gives you the options of either switching to the other unopened door, or sticking with your original choice. You then receive whatever is behind your door. What should you do? This problem has long been a mainstay of undergraduate probability and statistics courses. The ease with which it is stated coupled with the subtlety of its solution makes it an excellent pedagogical tool. Most people find the correct answer, that you double your chances of winning by switching doors, to be highly counterintuitive. Also counterintuitive is the finding that if Monty does not know the location of the car, and instead randomly chooses a door that happens to conceal a goat, then there is no longer any advantage to switching doors. Distinguishing between these seemingly identical scenarios requires some consideration of conditional probabilities and Bayes’ Theorem. A cogent discussion can be found in [1]. 1 Several n door versions of the problem are known. One of them assumes you are presented with n identical doors, with n ≥ 3. You select one, but do not open it. Monty now opens a door he knows to conceal a goat, and gives you the option of switching doors. After making your choice, Monty reveals another goat. He again gives you the option of switching. This process continues until only two doors remain (your current choice, and one other unopened door). You make your final choice, and receive whatever is behind your door. We assume throughout that Monty always chooses randomly from among the goat-concealing doors when more than one such door remains in play. What strategy maximizes your chances of success? We refer to this as the Progressive Monty Hall Problem. The case n = 4 was solved in [2]. By enumerating the sample space and working out the probability of each scenario, the authors established that your best strategy is to stick with your original door until only two doors remain, and then switch. They asserted, but did not prove, that this strategy is optimal for any value of n ≥ 4. No proof of this has ever appeared in the literature. Here we prove that this strategy is uniquely optimal for all values of n. In the next section we will use Bayes’ Theorem and induction to establish that you win with probability n−1 by following it. We will then discuss a few n subtle points about the problem before proving our result. Switching at the Last Minute We will refer to the strategy of sticking with your initial choice until only two doors remain and then switching as SLM (for Switching at the Last Minute). In this section we determine the probability of winning by following this strategy. First we define some terminology and notation. If E is an event in a probability space, we denote by E the event that E does not occur. We will denote by Di the event that the i-th door conceals the prize, and by Mi the event that Monty opens door i. We will write E1 ∩ E2 to denote the event in which both E1 and E2 occur. It will be convenient to speak simply of 2 the probability of a particular door. This should be understood to mean the probability that that door conceals the prize. Finally, we will sometimes refer to “a stage of the game.” This refers to any point in the game just after Monty has opened a door, but before you have made your decision about whether or not to switch. We also remind the reader that if A and B are events, then the conditional probability of A given B, denoted by P (A|B), is defined by the formula P (A|B) = P P(A∩B) . According to Bayes’ Theorem, we also have (B) P (A|B) = P (A)P (B|A) . P (B) To show that the SLM strategy wins with probability n−1 , we could argue n as follows: We can assume without loss of generality that you initially choose door one. Then P (D1 ) = n1 and P (D1 ) = n−1 . Since Monty only opens goatn concealing doors, the SLM strategy wins precisely when D1 occurs, which proves the result. This argument, while correct, leaves something to be desired. It does not adequately flesh out the significance of the fact that Monty only opens goat-concealing doors, and it provides no inkling of how to handle variations of the problem in which Monty has a nonzero probability of opening the door with the car. And if ever there was a situation where intuitively satisfying arguments should be regarded with suspicion, it is the Monty Hall problem. For these reasons we believe it is worthwhile to include a proper, probabilistic proof that SLM wins with probability n−1 . n We will do this by induction on the number of doors that Monty has eliminated. We again assume without loss of generality that you initially choose door one. This door has probability n1 at this time. Monty now opens door i, with 2 ≤ i ≤ n, to reveal a goat. We must show that this revelation provides no reason for altering the probability of door one. According to Bayes’ Theorem, we have P (D1 |Mi ) = P (D1 )P (Mi |D1 ) . P (Mi ) 3 Since Monty chooses randomly from among the n − 1 goat-concealing doors, 1 we have P (Mi|D1 ) = n−1 . Since the doors are assumed to be identical, we 1 also have P (Di ) = n . Finally, we have P (Mi ) = P (Mi |D1 )P (D1 ) + P (Mi |Di )P (Di ) + P (Mi |D1 ∩ Di )P (D1 ∩ Di ) 1 1 2 1 1 = + 1− = . n−1 n n−2 n n−1 (Note that P (Mi |Di ) = 0. If the prize is behind door i, then Monty will not open door i. We will omit this term from future calculations of this sort.) Plugging everything into Bayes’ Theorem now reveals that P (D1 |Mi ) = 1 , and we conclude that Monty’s actions have provided no reason for altering n the probability of door one. That is the base case of the induction. Now assume that Monty has eliminated x doors, with 1 ≤ x ≤ n − 3. Since we are following SLM, we assume that we have stuck with door one throughout. By the inductive hypothesis, we can assume that door one still has probability n1 . Monty will now open another goat-concealing door, we will refer to this as door j. We want to show that P (D1 |Mj ) = n1 . By Bayes’ Theorem, this is equivalent to showing that P (Mj |D1 ) = P (Mj ). Note that there are n−x−1 doors other than door one remaining in play. Since none of these doors has been chosen at any stage of the game, they must have the same probability, and the sum of their probabilities must be n−1 . n n−1 1 Consequently, we have P (Dj ) = n(n−x−1) . We also have P (Mj |D1 ) = n−x−1 . Finally, we compute P (Mj ) = P (Mj |D1 )P (D1 ) + P (Mj |D1 ∩ Dj )P (D1 ∩ Dj ) 1 1 1 1 n−1 = + 1− − n−x−1 n n−x−2 n n(n − x − 1) 1 = . n−x−1 It follows that P (D1 ) = n1 regardless of the number of doors Monty has opened. Consequently, by switching when only one other door remains, we will win with probability n−1 . n 4 A Case Study In the next two sections we will establish that SLM is the unique optimal strategy. However, a full appreciation of the subtleties of the problem will be useful before proceeding with the somewhat technical proof. So let us consider the progressive Monty Hall problem in the case where there are five doors. Throughout we will use a quintuple of rational numbers to represent the probabilities of the five doors, and we will refer to this quintuple as the probability vector for that stage of the game. Of course, as the game begins we have probability vector 1 1 1 1 1 , , , , . 5 5 5 5 5 Suppose we initially choose door one, and then Monty opens door two. According to our results from the previous section we ought to find that P (D1 |M2 ) = 51 . Indeed, since P (D1 ) = P (D2 ) = 51 , we compute P (D1 )P (M2 |D1 ) P (M2 |D1 )P (D1) + P (M2 |D1 ∩ D2 )P (D1 ∩ D2 ) 1 1 1 = 1 1 5 4 1 3 = . 5 +3 5 4 5 P (D1 |M2 ) = It follows that we now have the probability vector 1 4 4 4 , 0, , , . 5 15 15 15 What happens if we now switch to door three, and Monty then opens door five? How does that affect the probability of doors one and three? In working through the following calculations, it will be useful to keep in mind that there are two kinds of doors that Monty can not open: the door concealing the prize, and the door you have currently chosen. 4 Note that we now have P (D1) = 51 , P (D3) = P (D5) = 15 , P (M5 |D1 ) = P (M5 |D3 ) = 13 and P (M5 |D1 ∩ D3 ) = 12 . Note that we also have P (M5 ) = P (M5 |D1 )P (D1 ) + P (M5 |D3 )P (D3 ) + P (M5 |D4 )P (D4 ) 1 1 1 4 1 4 29 = + + = . 2 5 3 15 2 15 90 5 Consequently, we can compute that: P (D1 |M5 ) = and P (D3 |M5 ) = 4 15 1 3 1 5 1 2 29 90 4 15 1 3 + 1 2 = 9 . 29 1− 8 15 = 8 . 29 This produces the probability vector 9 8 12 , 0, , , 0 . 29 29 29 An interesting result. The probability of all three doors increases. In particular, the probability of door one increases. Had we stuck with door one, the opening of door five would not have altered the probability of this door. But the real surprise occurs in the following scenario: suppose that after we switched to door three, Monty opened door one instead of door five. How does that affect the probability of door three? We would now compute 4 1 1 15 3 = , P (D3 |M1 ) = 4 1 1 1 4 4 + 2 1 − 5 − 15 15 3 leading to the probability vector 1 3 3 0, 0, , , . 4 8 8 Consequently, the probability of door three goes down when we see Monty open door one. Very surprising! The fact that a door’s probability can decrease when Monty opens a different door makes it highly nontrivial to prove that SLM really is the best strategy. Perhaps there is some clever sequence of switches and subsequent door openings that drives the probability of at least one door below n1 . Even though we have no control over the doors Monty opens, this possibility can not be so readily dismissed. This explains the hard work required to establish the optimality of SLM. 6 Updating the Probabilities We will now establish the suboptimality of other strategies. The proof will have two parts. In this section we will derive formulas that tell us how to update the probabilities each time Monty opens a door. In the next section we will use these formulas to establish the impossibility either of obtaining a door of probability smaller than n1 , or of obtaining a door of probability equal to n1 by any strategy other than SLM. Let us now suppose that we are part way through the game and that k doors remain, with 3 ≤ k ≤ n. As before, we denote by Di the event that the car is behind door i, and by Mj the event that Monty opens door j. We also define Cℓ to denote the event that you choose door ℓ. We can write if j = i or j = ℓ, 0 1 P (Mj |Di ∩ Cℓ ) = k−1 (1) if i = ℓ and j 6= ℓ, 1 if i 6= ℓ, j 6= ℓ and j 6= i. k−2 It follows from the definition of conditional probability that P (Mj |Di ∩ Cℓ ) = P (Di ∩ Mj |Cℓ ) . P (Di|Cℓ ) (2) Note, however, that Di and Cℓ are independent. Since choosing a door provides no basis for altering the probability of the door, we conclude that P (Di|Cℓ ) = P (Di). Therefore, (2) becomes P (Di )P (Mj |Di ∩ Cℓ ) = P (Di ∩ Mj |Cℓ ). Combining this with (1) gives us 0 1 P (Di ∩ Mj |Cℓ ) = k−1 P (Di ) 1 P (Di ) k−2 if j = i or j = ℓ, if i = ℓ and j 6= ℓ, if i 6= ℓ, j 6= ℓ and j 6= i. Using the equation P (Di ∩ Mj |Cℓ ) = P (Di|Mj ∩ Cℓ )P (Mj |Cℓ ), 7 (3) (which again follows from the definition of conditional probability), we can recast equation (3) as 0 if i = j, 1 1 P (Di|Mj ∩ Cℓ ) = (4) P (Di ) if i = ℓ, k−1 P (Mj |Cℓ ) 1 P (Di ) if i 6= ℓ and i 6= j. k−2 Note that since it is impossible for Monty to open the same door you choose, in (4) we have implicitly assumed j 6= ℓ. If we throw out the case where i = j, then equation (4) is the complete list of updated probabilities for the k − 1 doors remaining after Monty opens a door. This completes the first part of the proof. SLM is Uniquely Optimal To complete the proof that SLM is uniquely optimal, we use equation (4) to derive certain results about the ratios of the probabilities of the remaining doors at any stage of the game. Assume that k doors remain in play. Further suppose that you now choose door ℓ and Monty subsequently opens door j 6= ℓ. It is a consequence of equation (4) that for h 6= i we have k−2 if i = ℓ, P (Di | Mj ∩ Cℓ ) P (Di ) k−1 k−1 = if h = ℓ, P (Dh | Mj ∩ Cℓ ) P (Dh ) k−2 1 if h 6= ℓ and i 6= ℓ. We have already shown that when following SLM beginning with door 1, at n−1 the stage with k doors remaining, P (D1 ) = n1 and P (Di) = n(k−1) for i 6= 1. k−1 1) It follows that PP (D = n−1 . (Di ) Furthermore, we claim that with k doors remaining, we have P (Di ) k−1 ≥ P (Dh ) n−1 (5) for any pair of doors remaining in play. We prove this by induction on the number of doors that have been opened. 8 At the beginning of the game k = n, and all doors have probability Since this is in accord with inequality (5), we see that the base case is satisfied. Now suppose k doors remain unopened, and all pairs of probabilities satisfy inequality (5). If you now choose door ℓ and Monty opens door j, the updated ratios are k − 2 P (Dℓ ) P (Dℓ | Mj ∩ Cℓ ) = and (6) P (Dh | Mj ∩ Cℓ ) k − 1 P (Dh ) P (Di | Mj ∩ Cℓ ) k − 1 P (Di ) = , (7) P (Dℓ | Mj ∩ Cℓ ) k − 2 P (Dℓ ) 1 . n (Dℓ ) P (Di ) k−1 if one of i or h is equal to ℓ. Since PP (D , ≥ n−1 , it is readily seen that h ) P (Dℓ ) these updated ratios satisfy the bound of inequality (5). If we have i, h 6= ℓ, then P (Di ) k−1 P (Di | Mj ∩ Cℓ ) = ≥ , P (Dh | Mj ∩ Cℓ ) P (Dh ) n−1 and again we see that inequality (5) is satisfied. This completes the induction. Now suppose that k = 2, and that door i and door h are the only ones remaining. Our previous result implies that P (Di ) P (Di ) 1 = ≥ . P (Dh ) 1 − P (Di) n−1 This implies that P (Di) ≥ n1 . Equivalently, P (Di ) ≤ n−1 . This establishes n the optimality of SLM. It remains only to establish that SLM is uniquely optimal. To do this, note that our previous calculations show that it is impossible, at any stage of the game, for a given door i to satisfy P (Di) < n1 . Such a probability would imply that at least one of the other k − 1 unopened doors must have probability 1 − n1 n−1 P (Dh ) > = . k−1 n(k − 1) But then P (Di ) P (Dh ) < k−1 , n−1 contradicting inequality (5). 9 Finally, note that in inequality (5) we obtain equality only if (a) our current door choice is placed in the numerator (as in equation (6)), and (b) at the previous stage the ratio of the probabilities for the same two doors k−1 . But (b) is possible only if door ℓ has been our choice all along. It was n−1 follows that only by following SLM can the optimal ratio be obtained, and the proof is complete. References 1. Richard Isaac, The Pleasures of Probability, Springer-Verlag, New York, 1995. pp. 1-11, 24-27. 2. V.V. Bapeswara Rao and M. Bhaskara Rao, A Three-Door Game Show and Some of its Variants, The Mathematical Scientist, 17, No. 2 (1992), 89-94. 10