Lecture 5

Agenda
1. Conditional Probability
2. Theorem of Total Probability
3. Bayes' Rule

Conditional Probability

Suppose you are going to have surgery, and before that you want to ask a doctor how likely it is that the surgery will be a success. The doctor informs you that about 60% of cases are successful, and you find this number too low and depressing. But if the doctor looked only at the proportion of successes among young patients, he would have observed that 85% of young patients have a successful surgery (and that number is not so bad!). So, when calculating the probability of an event (e.g. the probability that an operation succeeds), if you already have an extra piece of information (e.g. the patient is young), it is a good idea to use it. But how do we use that extra information?

Definition 1. If A and B are two events in an experiment with sample space S, and P(B) > 0 (i.e. B is not an impossible event), then the probability that A happens given that B has happened is defined as

    P(A|B) = P(A ∩ B) / P(B)

P(A|B) is read as "the probability of A given B".

Example. Toss a fair die. Let A denote the event that the outcome is 2, 4 or 6. If someone asks you to play the following game, will you play?

    "If A occurs you pay me $10, otherwise I will pay you $10."

Answer. S = {1, 2, 3, 4, 5, 6}, and all outcomes are equally likely, since the die is fair. With A = {2, 4, 6},

    P(A) = 3/6 = 1/2

So it seems like a fair game, and $10 is not a big amount, so I will play. Now suppose the die is cast in a secret chamber, and a helpful informer tells you that the result was 4, 5 or 6. Would you still play if you had the option to back out? Let B = {4, 5, 6}; we know B has occurred. So

    P(A|B) = P(A ∩ B) / P(B) = P({4, 6}) / P({4, 5, 6}) = 2/3

So the game is fair no more, and I wouldn't play.

Conditional probability satisfies the three axioms of probability.

Lemma 1. Let B be an event with P(B) > 0. Then
1. For any event A, 0 ≤ P(A|B) ≤ 1.
2. If S is the sample space, P(S|B) = 1.
3. If A1, A2, A3, .
. . are disjoint events, then

    P(∪_{i=1}^∞ Ai | B) = Σ_{i=1}^∞ P(Ai | B)

Proof.
1. A ∩ B ⊂ B, hence 0 ≤ P(A ∩ B) ≤ P(B). Dividing throughout by P(B) gives

    0 ≤ P(A ∩ B) / P(B) ≤ 1

2. P(S|B) = P(S ∩ B) / P(B) = P(B) / P(B) = 1.

3. If A1, A2, A3, . . . are disjoint events, then so are A1 ∩ B, A2 ∩ B, A3 ∩ B, . . .. Hence, by the third axiom of probability,

    P(∪_{i=1}^∞ (Ai ∩ B)) = Σ_{i=1}^∞ P(Ai ∩ B)

But ∪_{i=1}^∞ (Ai ∩ B) = (∪_{i=1}^∞ Ai) ∩ B. Hence

    P((∪_{i=1}^∞ Ai) ∩ B) = Σ_{i=1}^∞ P(Ai ∩ B)

We divide both sides by P(B) to get

    P(∪_{i=1}^∞ Ai | B) = Σ_{i=1}^∞ P(Ai | B)

Questions:
- If P(A) = 0, does that mean P(A|B) = 0?
- If P(A|B) = 0, does that mean P(A) = 0?
- For any event A, P(A|S) = P(A). TRUE or FALSE?

We know that a sequence of events B1, B2, . . ., Bk is called mutually exclusive (also called disjoint) if for any i ≠ j, Bi ∩ Bj = ∅. A sequence of events B1, B2, . . ., Bk is called mutually exhaustive if B1 ∪ B2 ∪ . . . ∪ Bk = S. If a sequence of events B1, B2, . . ., Bk is both mutually exclusive and mutually exhaustive, we call it a partition of the sample space. It is as if we are dividing the sample space into several disjoint pieces.

Figure 1: This sample space has been partitioned into B1, B2, B3, B4, B5, B6.

Theorem of Total Probability

Theorem 1. If B1, B2, . . ., Bk is a partition of the sample space S such that P(Bi) > 0 for all i, then for any event A

    P(A) = Σ_{i=1}^k P(A|Bi) P(Bi)

Proof. Since B1, B2, . . ., Bk are mutually exhaustive,

    S = B1 ∪ B2 ∪ . . . ∪ Bk

Hence

    A ∩ S = A ∩ (B1 ∪ B2 ∪ . . . ∪ Bk)
    ⇒ A = (A ∩ B1) ∪ (A ∩ B2) ∪ . . . ∪ (A ∩ Bk)    [by the distributive law]

Now B1, B2, . . ., Bk are mutually exclusive, hence so are A ∩ B1, A ∩ B2, . . ., A ∩ Bk. Thus

    P(A) = P(A ∩ B1) + P(A ∩ B2) + . . . + P(A ∩ Bk)

For each A ∩ Bi we can write P(A|Bi) = P(A ∩ Bi) / P(Bi), which implies P(A ∩ Bi) = P(A|Bi) P(Bi). Thus

    P(A) = P(A|B1) P(B1) + P(A|B2) P(B2) + . . . + P(A|Bk) P(Bk)

Bayes' Rule

Theorem 2.
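The die example and the Theorem of Total Probability can both be checked by direct enumeration, since every outcome of a fair die is equally likely. The sketch below is illustrative and not part of the lecture; the helper names `prob` and `cond_prob` are my own.

```python
from fractions import Fraction

# Sample space of a fair die; every outcome has probability 1/6.
S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(E) under the equally-likely model: |E ∩ S| / |S|."""
    return Fraction(len(event & S), len(S))

def cond_prob(a, b):
    """P(A|B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    return prob(a & b) / prob(b)

A = {2, 4, 6}   # the event from the example: outcome is 2, 4 or 6
B = {4, 5, 6}   # the informer's report

assert prob(A) == Fraction(1, 2)            # the game looks fair...
assert cond_prob(A, B) == Fraction(2, 3)    # ...but not after hearing B

# Theorem of Total Probability with the partition B1 = {1,2,3}, B2 = {4,5,6}:
B1, B2 = {1, 2, 3}, {4, 5, 6}
total = cond_prob(A, B1) * prob(B1) + cond_prob(A, B2) * prob(B2)
assert total == prob(A)                     # P(A) = Σ P(A|Bi) P(Bi)
```

Using `Fraction` keeps every probability exact, so the identities hold with `==` rather than up to floating-point error.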
If B1, B2, . . ., Bk is a partition of the sample space S with P(Bi) > 0 for all i, then for any event A with P(A) > 0,

    P(Bi|A) = P(A|Bi) P(Bi) / Σ_{j=1}^k P(A|Bj) P(Bj)

Proof.

    P(Bi|A) = P(Bi ∩ A) / P(A)
            = P(A ∩ Bi) / P(A)
            = P(A|Bi) P(Bi) / P(A)
            = P(A|Bi) P(Bi) / Σ_{j=1}^k P(A|Bj) P(Bj)    [apply the Theorem of Total Probability]

Homework: 2.124, 2.125, 2.129, 2.134, 2.135, 2.137
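Bayes' Rule can be sketched numerically with the surgery story from the introduction: given that a surgery succeeded, how likely is it that the patient was young? The lecture only gives 60% overall success and 85% success among the young, so the 40% proportion of young patients (and the implied 13/30 success rate among the rest) are hypothetical numbers chosen here purely to stay consistent with those two figures.

```python
from fractions import Fraction

def bayes(priors, likelihoods):
    """Posteriors P(Bi|A) from priors P(Bi) and likelihoods P(A|Bi)."""
    # Denominator is the Theorem of Total Probability: P(A) = Σ P(A|Bj) P(Bj)
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

# Partition: B1 = "patient is young", B2 = "patient is not young".
priors = [Fraction(2, 5), Fraction(3, 5)]            # assumed: 40% of patients are young
likelihoods = [Fraction(17, 20), Fraction(13, 30)]   # P(success|young) = 85%; 13/30 makes P(success) = 60%

# Sanity check against the lecture's overall success rate.
assert sum(p * l for p, l in zip(priors, likelihoods)) == Fraction(3, 5)

post = bayes(priors, likelihoods)
assert sum(post) == 1                  # posteriors over a partition sum to 1
assert post[0] == Fraction(17, 30)     # P(young | success) = 17/30, about 0.57
```

So learning that the surgery succeeded raises the probability that the patient was young from the assumed 40% prior to 17/30, exactly the mechanics of Theorem 2.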