Chapter 3 Lecture Notes: Conditional Probability and Independence
MATH 305-02 – Probability
October 1, 2015

Sections 3.1 – 3.2 Conditional Probabilities

Conditional probabilities are one of the most important concepts in probability theory. In most cases, partial information is known before we compute the probability of an event. The probability is then based on some condition; hence the name conditional probability.

Example Suppose we roll two dice. What is the probability that the dice sum to 8? The sample space is

S = {(i, j) | i, j = 1, 2, . . . , 6},

which has 36 elements. The event E = {(i, j) | i + j = 8} contains 5 elements. Since any roll of the dice is equally likely, it follows that

P(E) = 5/36.

Now, suppose we know that the first die rolled was a 3. What is the probability that the sum is 8? In this case, our sample space is different:

S = {(3, j) | j = 1, 2, . . . , 6},

which has only 6 elements, each equally likely with probability 1/6. Since only rolling a 5 with the second die will yield a sum of 8, the probability of rolling an 8 given that the first roll is a 3 is

P(8 is rolled | 3 was rolled first) = 1/6.

We note that the probability is a little higher in this case.

In general, if E and F are two events of an experiment, the conditional probability that E occurs given that F has occurred is denoted by P(E | F). We can derive a formula for this probability. Suppose F has occurred. Then for E to occur, the outcome must lie in both E and F, that is, in EF. Since F has occurred, the event F becomes our new sample space, and we need to see how often E occurs within this new space. Hence, the probability that E occurs given F equals P(EF) relative to P(F). We therefore arrive at the following definition:

Definition If P(F) > 0, then

P(E | F) = P(EF) / P(F).

Example (3.1) Two fair dice are rolled. What is the conditional probability that at least one lands on 6 given that the dice land on different numbers? Here, we let E be the event that at least one die lands on 6 and let F be the event that the two numbers are different. Then we seek

P(E | F) = P(EF) / P(F).

We can compute each probability on the right-hand side. The sample space has 36 elements, each equally likely. The event E, that at least one die shows a six, has 11 elements:

E = {(i, 6), (6, i) | i = 1, 2, . . . , 5} ∪ {(6, 6)}.

Of these 11 outcomes, 10 show two different numbers. Hence, P(EF) = 10/36. The event F = {(i, j) | i ≠ j, i, j = 1, 2, . . . , 6} has 30 elements, since the only rolls with identical numbers are the 6 doubles. Hence, P(F) = 30/36. Therefore, the desired probability is

P(E | F) = (10/36) / (30/36) = 1/3.

Exercise (2a) Suppose Joe is 80 percent certain that his missing key is in one of the two pockets of his hanging jacket, being 40 percent sure it is in the left pocket and 40 percent sure it is in the right. If he searches the left pocket and does not find the key, what is the conditional probability that it is in the right pocket?

Solution: Here, we let L denote the event that the key is in the left pocket and R the event that the key is in the right pocket. Then the probability we seek is

P(R | L^c) = P(RL^c) / P(L^c) = P(R) / (1 − P(L)) = 0.4 / 0.6 = 2/3.

Here, we have used the fact that P(RL^c) = P(R), since R, the event that the key is in the right pocket, is certainly a subset of L^c, the event that the key is not in the left pocket. (The rest of L^c is the event that the key is in neither pocket.)
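The definition P(E | F) = P(EF)/P(F) is easy to check by brute-force counting over an equally likely sample space. The sketch below (the helper name cond_prob is my own) recomputes Example (3.1) and the opening dice example:

```python
from fractions import Fraction
from itertools import product

def cond_prob(event, given, space):
    """P(event | given) over a finite space of equally likely outcomes."""
    n_given = sum(1 for s in space if given(s))
    n_both = sum(1 for s in space if given(s) and event(s))
    return Fraction(n_both, n_given)

dice = list(product(range(1, 7), repeat=2))   # 36 equally likely rolls

# Example (3.1): at least one six, given the two dice show different numbers
p31 = cond_prob(lambda r: 6 in r, lambda r: r[0] != r[1], dice)
print(p31)   # 1/3

# Opening example: P(sum is 8 | first die is 3)
p8 = cond_prob(lambda r: r[0] + r[1] == 8, lambda r: r[0] == 3, dice)
print(p8)    # 1/6
```

Counting outcomes this way mirrors the reduced-sample-space reasoning exactly: the denominator counts F, the numerator counts EF.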
Exercise (3.7) The king comes from a family of 2 children. What is the probability that the other child is his sister?

Solution: Let E be the event that the family has at least one boy (the king) and F be the event that the other child is a girl. The sample space for this problem is S = {(g, g), (g, b), (b, g), (b, b)}, where b denotes boy and g denotes girl. We wish to find

P(F | E) = P(FE) / P(E) = (2/4) / (3/4) = 2/3.

Sometimes working with the reduced sample space is the best way to go.

Example (3.4) What is the probability that at least one of a pair of fair dice lands on 6, given that the sum of the dice is i, for i = 2, 3, 4, . . . , 12? In this case, it is much easier to work in the reduced sample space given by the condition on the sum. Let E be the event that at least one die lands on 6 and let F_i be the event that the sum of the dice is i. Then we quickly see that

P(E | F_i) = 0,  i = 2, 3, 4, 5, 6.

This is because if the sum is 6 or less, no 6 could have been rolled. For i = 7, our reduced sample space is

F_7 = {(6, 1), (5, 2), (4, 3), (3, 4), (2, 5), (1, 6)},

of which only 2 elements contain a 6. Therefore,

P(E | F_7) = 2/6 = 1/3.

Using the same argument, for i = 8 we have F_8 = {(6, 2), (5, 3), (4, 4), (3, 5), (2, 6)}. This reduced sample space has only 5 elements, 2 of which contain a 6. Therefore,

P(E | F_8) = 2/5.

Continuing in this manner, we find

P(E | F_9) = 2/4 = 1/2,  P(E | F_10) = 2/3,  P(E | F_11) = 1,  P(E | F_12) = 1.

Exercise (3.10) Three cards are randomly drawn, without replacement, from an ordinary deck of 52 playing cards. Compute the conditional probability that the first card is a spade given that the second and third cards are spades.
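Before the analytic solution, a brute-force count over ordered three-card draws confirms the answer (a sketch; since only suits matter, the True/False deck encoding is my own):

```python
from fractions import Fraction
from itertools import permutations

# Encode only what matters: 13 spades (True) and 39 non-spades (False).
deck = [True] * 13 + [False] * 39

n_given = 0   # ordered draws in which cards 2 and 3 are spades
n_both = 0    # ...and card 1 is a spade as well
for first, second, third in permutations(deck, 3):
    if second and third:
        n_given += 1
        if first:
            n_both += 1

print(Fraction(n_both, n_given))   # 11/50
```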
Solution: Let E be the event that the first card is a spade and let F be the event that the second and third cards are spades. This problem is very easy if we look at the reduced sample space. If the second and third cards are already spades, then two of the 13 spades are gone, and the first card is equally likely to be any of the remaining 50 cards, 11 of which are spades. Hence,

P(E | F) = 11/50.

Exercise (2c) In the card game of bridge, the 52 cards are dealt out equally to 4 players, called East, West, North, and South, where East and West are on a team and North and South are on a team. If North and South have a total of 8 spades among them, what is the probability that East has 3 of the remaining spades?

Solution: Let E be the event that East has 3 spades and let F be the event that North and South have 8 spades between them. We can again think in terms of the reduced sample space. Assuming the North and South hands have been dealt and contain 8 spades, East's hand is equally likely to be any of the C(26, 13) possible hands formed from the remaining 26 cards, of which 5 are spades. (Here C(n, k) denotes the binomial coefficient "n choose k".) We seek the probability that East received 3 of the remaining 5 spades; there are C(5, 3) C(21, 10) hands in which this happens. Hence,

P(E | F) = C(5, 3) C(21, 10) / C(26, 13).

We can rearrange the conditional probability formula and write

P(EF) = P(F) P(E | F).

We can generalize the conditional probability formula to any number of events in succession. This is called the multiplication rule:

Theorem. (Multiplication Rule)

P(E_1 E_2 · · · E_n) = P(E_1) P(E_2 | E_1) P(E_3 | E_1 E_2) · · · P(E_n | E_1 E_2 · · · E_{n−1}).

Proof. We can show this by repeatedly applying the conditional probability formula.
Consider the right-hand side of the equation:

P(E_1) P(E_2 | E_1) · · · P(E_n | E_1 E_2 · · · E_{n−1}) = P(E_1) · [P(E_1 E_2) / P(E_1)] · · · [P(E_1 E_2 · · · E_n) / P(E_1 E_2 · · · E_{n−1})] = P(E_1 E_2 · · · E_n),

since the product telescopes.

Example (2d) Celine is undecided as to whether to take a French course or a chemistry course. She estimates that her probability of receiving an A grade would be 1/2 in a French course and 2/3 in a chemistry course. If Celine decides to base her decision on the flip of a fair coin, what is the probability that she gets an A in chemistry?

We can use the multiplication rule to answer this question. Let C be the event that she takes the chemistry class, and A the event that she gets an A in whatever course she chooses. We know that the probability she gets an A given that she takes chemistry is P(A | C) = 2/3, and the probability that she takes chemistry is P(C) = 1/2. We seek the probability that she takes chemistry and gets an A. Hence,

P(CA) = P(C) P(A | C) = (1/2)(2/3) = 1/3.

Example (3.13) Suppose an ordinary deck of 52 cards is randomly divided into 4 hands of 13 cards each. We wish to determine the probability p that each hand has an ace. Let E_i be the event that hand i has exactly one ace; then we can determine p = P(E_1 E_2 E_3 E_4) using the multiplication rule:

P(E_1 E_2 E_3 E_4) = P(E_1) P(E_2 | E_1) P(E_3 | E_1 E_2) P(E_4 | E_1 E_2 E_3).

We need the conditional probabilities on the right-hand side. Consider the probability that the first hand is dealt exactly one ace. There are C(52, 13) possible hands. There are C(4, 1) aces that can fill the one spot, and C(48, 12) ways to fill the remaining 12 spots from the non-aces. Therefore,

P(E_1) = C(4, 1) C(48, 12) / C(52, 13).

Now, assume the first hand has exactly one ace. Then the second hand is drawn from a reduced deck of 39 cards containing 3 aces. Hence,

P(E_2 | E_1) = C(3, 1) C(36, 12) / C(39, 13).
Now, assume the first two hands each contain an ace. Then our sample space is reduced further: only 26 cards remain, 2 of which are aces. This gives

P(E_3 | E_1 E_2) = C(2, 1) C(24, 12) / C(26, 13).

Finally, only 1 ace and 13 cards remain, so

P(E_4 | E_1 E_2 E_3) = C(1, 1) C(12, 12) / C(13, 13) = 1.

Therefore, p is given by

p = [C(4, 1) C(48, 12) / C(52, 13)] · [C(3, 1) C(36, 12) / C(39, 13)] · [C(2, 1) C(24, 12) / C(26, 13)] ≈ 0.1055.

Exercise (3.12) A recent college graduate is planning to take the first three actuarial exams in the coming summer. She will take the first exam in June. If she passes that exam, she will take the second exam in July, and if she passes that one, she will take the third exam in September. If she fails an exam, she cannot take any more. The probability that she passes exam 1 is 0.9. If she passes the first, the conditional probability that she passes exam 2 is 0.8, and if she passes both the first and second exams, the conditional probability that she passes exam 3 is 0.7.

a.) What is the probability that she passes all three exams?

b.) Given that she did not pass all three exams, what is the conditional probability that she failed the second exam?

Solution: a.) Let E_i be the event that she passes the ith exam. Then we seek P(E_1 E_2 E_3). From the information given, we know P(E_1) = 0.9, P(E_2 | E_1) = 0.8, and P(E_3 | E_1 E_2) = 0.7. By the multiplication rule, we have

P(E_1 E_2 E_3) = P(E_1) P(E_2 | E_1) P(E_3 | E_1 E_2) = (0.9)(0.8)(0.7) = 0.504.

b.) We seek P(E_2^c | (E_1 E_2 E_3)^c), which the conditional probability formula gives as

P(E_2^c | (E_1 E_2 E_3)^c) = P(E_2^c (E_1 E_2 E_3)^c) / P((E_1 E_2 E_3)^c) = P(E_1 E_2^c) / (1 − P(E_1 E_2 E_3)).

Here, we have used the fact that P(E_2^c (E_1 E_2 E_3)^c) = P(E_1 E_2^c), since the only way she could have failed exam 2 is if she passed exam 1 and failed exam 2.
This gives

P(E_2^c | (E_1 E_2 E_3)^c) = P(E_1) P(E_2^c | E_1) / (1 − 0.504) = (0.9)(0.2) / 0.496 ≈ 0.3629.

Exercise (3.14) An urn initially contains 5 white and 7 black balls. Each time a ball is selected, its color is noted and it is replaced in the urn along with 2 other balls of the same color. Compute the probability that

a.) the first 2 balls selected are black and the next 2 are white;

b.) of the first 4 balls selected, exactly 2 are black.

Solution: a.) Let B denote the event that a black ball was drawn and let W denote the event that a white ball was drawn. We seek P(BBWW), which by the multiplication rule is

P(BBWW) = P(B) P(B | B) P(W | BB) P(W | BBW).

The probability that the initial ball is black is simply P(B) = 7/12, since there are 7 black balls and 5 white balls initially. Since a black ball was selected, we replace it and add two more black balls. The probability P(B | B) is then

P(B | B) = 9/14.

Now, assume a second black ball was chosen, so that there are 11 black balls and still only 5 white. If a white ball is selected next, the probability is

P(W | BB) = 5/16.

Finally, with 7 white balls among 18, the probability that another white ball is chosen is

P(W | BBW) = 7/18.

Therefore, the probability that two black balls are chosen and then two white balls are chosen is

P(BBWW) = (7/12)(9/14)(5/16)(7/18) = 35/768.

b.) We note that there are exactly C(4, 2) = 6 orders in which 2 black balls and 2 white balls can be chosen, and in every order the probability is that found in part a. Therefore,

P(2 black, 2 white) = C(4, 2) · 35/768 = 210/768 = 35/128.

Recall that we may consider the probability of an event as a long-run relative frequency, i.e.,

P(E) = lim_{n→∞} n(E)/n,

where n(E) is the number of times event E occurs in n repetitions of an experiment. P(E | F) is consistent with this interpretation. Let n be large; if we only consider the experiments in which F occurs, then P(E | F) will equal the long-run proportion of them in which E also occurs. To verify:

n P(F) ≈ number of times F occurs,
n P(EF) ≈ number of times both E and F occur.

Then out of the n P(F) experiments in which F occurs, the proportion in which E also occurs is

n P(EF) / (n P(F)) = P(EF) / P(F),

which is in agreement with our definition of P(E | F) as n gets large.

Section 3.3 Bayes's Formula

Consider E = EF ∪ EF^c. Here, EF and EF^c are mutually exclusive. By Axiom 3, we have

P(E) = P(EF) + P(EF^c)
     = P(E | F) P(F) + P(E | F^c) P(F^c)
     = P(E | F) P(F) + P(E | F^c) (1 − P(F)).

This equation states that the probability of an event E is a weighted average of the conditional probability of E given that F has occurred and the conditional probability of E given that F has not occurred, each weight being the probability of the event on which it is conditioned. This allows us to find the probability of an event by first "conditioning" on whether or not some second event has occurred.

Example (3.23) Urn I contains 2 white and 4 red balls, whereas urn II contains 1 white and 1 red ball. A ball is randomly drawn from urn I and placed into urn II, and then a ball is randomly selected from urn II. What is

a.) the probability that the ball selected from urn II is white?

b.) the conditional probability that the transferred ball was white given that a white ball is selected from urn II?

To answer part a., we condition on whether a red or a white ball was transferred. Let R_t be the event that a red ball was drawn from urn I and transferred, let W_t be the event that a white ball was transferred, and let W be the event that a white ball was selected from urn II.
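With the events defined, we can sanity-check both answers by exact enumeration before deriving them (a sketch; the variable names and string encoding are my own):

```python
from fractions import Fraction

# Urn I holds 2 white and 4 red balls; urn II holds 1 white and 1 red.
# Enumerate the transfer, then the draw from urn II (which now has 3 balls).
p_white = Fraction(0)       # P(W): white ball drawn from urn II
p_wt_and_w = Fraction(0)    # P(Wt W): white transferred AND white drawn
for transfer, p_t in [("white", Fraction(2, 6)), ("red", Fraction(4, 6))]:
    whites_in_II = 2 if transfer == "white" else 1
    p_draw = Fraction(whites_in_II, 3)
    p_white += p_t * p_draw
    if transfer == "white":
        p_wt_and_w += p_t * p_draw

print(p_white)               # 4/9  (part a)
print(p_wt_and_w / p_white)  # 1/2  (part b)
```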
Then the probability we seek is P(W), which is given by

P(W) = P(W | R_t) P(R_t) + P(W | W_t) P(W_t)
     = (1/3)(4/6) + (2/3)(2/6)
     = 4/9.

For part b., we seek P(W_t | W), which by the conditional probability formula is

P(W_t | W) = P(W_t W) / P(W) = P(W_t) P(W | W_t) / P(W) = (2/6)(2/3) / (4/9) = 1/2.

Exercise (3a) An insurance company believes that people can be divided into two classes: those who are accident prone and those who are not. The company's statistics show that an accident-prone person will have an accident at some time within a fixed 1-year period with probability 0.4, whereas this probability decreases to 0.2 for a person who is not accident prone.

a.) If we assume that 30% of the population is accident prone, what is the probability that a new policyholder will have an accident within a year of purchasing a policy?

b.) Suppose that a new policyholder has an accident within a year of purchasing a policy. What is the probability that she is accident prone?

Solution: a.) Let A be the event that a person is accident prone, and let A_1 be the event that a person has an accident within a year. Then we seek P(A_1), which we can find by conditioning on whether or not the person is accident prone. We have

P(A_1) = P(A_1 | A) P(A) + P(A_1 | A^c) P(A^c) = (0.4)(0.3) + (0.2)(0.7) = 0.26.

b.) Now, we assume the person has had an accident and we want to know the probability that she is accident prone. In this case, we seek P(A | A_1), which is given by

P(A | A_1) = P(A A_1) / P(A_1) = P(A) P(A_1 | A) / P(A_1) = (0.3)(0.4) / 0.26 = 6/13.

Example (3d) A blood test is 95% effective in detecting a certain disease when it is, in fact, present. The test also yields a "false positive" result for 1% of the healthy persons tested.
If 0.5% of the population actually has the disease, what is the probability that a person has the disease given that the test result is positive?

Let D be the event that the person has the disease and let E be the event that the test result is positive. From the information given, we know

P(E | D) = 0.95,  P(D) = 0.005,  P(E | D^c) = 0.01.

We seek P(D | E), which is given by

P(D | E) = P(DE) / P(E)
         = P(E | D) P(D) / [P(E | D) P(D) + P(E | D^c) P(D^c)]
         = (0.95)(0.005) / [(0.95)(0.005) + (0.01)(0.995)]
         ≈ 0.323.

Exercise (3c) In answering a question on a multiple-choice test, a student either knows the answer or guesses. Let p be the probability that the student knows the answer and 1 − p the probability that the student guesses. Assume that a student who guesses at the answer will be correct with probability 1/m, where m is the number of multiple-choice alternatives. What is the conditional probability that a student knew the answer to a question given that he or she answered it correctly?

Solution: Let C be the event that the student gets the answer correct and let G be the event that the student guessed at the answer. Then we are given P(G^c) = p and P(C | G) = 1/m, and we seek P(G^c | C), the probability that the student knew the answer given that he or she got it correct. Using our definition of conditional probability gives

P(G^c | C) = P(G^c C) / P(C)
           = P(C | G^c) P(G^c) / [P(C | G^c) P(G^c) + P(C | G) P(G)]
           = (1)(p) / [(1)(p) + (1/m)(1 − p)]
           = p / [p + (1 − p)/m]
           = mp / (1 + (m − 1)p).

Exercise (3.19) A total of 48% of the women and 37% of the men who took a certain "quit smoking" class remained nonsmokers for at least one year after completing the class. These people then attended a success party at the end of the year. If 62% of the original class was male,

a.) what percentage of those at the party were women?

b.) what percentage of the original class attended the party?

Solution: Let A be the event that a class member attended the party and let W indicate that the class member is a woman. Then we are given the following probabilities: P(W) = 0.38, P(W^c) = 0.62, P(A | W) = 0.48, and P(A | W^c) = 0.37.

a.) Here, we are concerned with the probability P(W | A). Using the definition of conditional probability, we obtain

P(W | A) = P(WA) / P(A) = P(A | W) P(W) / [P(A | W) P(W) + P(A | W^c) P(W^c)],

where in the denominator we have conditioned on whether the person is a man or a woman. Therefore,

P(W | A) = (0.48)(0.38) / [(0.48)(0.38) + (0.37)(0.62)] ≈ 0.443.

b.) We have actually already answered this question. We seek P(A), which by conditioning on whether the class member is a man or a woman gives

P(A) = P(A | W) P(W) + P(A | W^c) P(W^c) = (0.48)(0.38) + (0.37)(0.62) = 0.4118.

Exercise (3f) At a certain stage of a criminal investigation, the inspector in charge is 60% convinced of the guilt of a certain suspect. Suppose, however, that a new piece of evidence showing that the criminal has a certain characteristic is uncovered. If 20% of the population possesses this characteristic, how certain of the guilt of the suspect should the inspector now be if it turns out that the suspect has the characteristic?

Solution: Let G be the event that the suspect is guilty and let E be the event that the suspect has this characteristic. Then the probability we seek is P(G | E), the probability that the suspect is guilty given that he has the characteristic. From conditional probabilities, we have

P(G | E) = P(GE) / P(E)
         = P(E | G) P(G) / [P(E | G) P(G) + P(E | G^c) P(G^c)]
         = (1)(0.6) / [(1)(0.6) + (0.2)(0.4)]
         ≈ 0.882.

Just as we did before with conditional probabilities, we can generalize the previous result. Suppose F_i, for i = 1, 2, . . . , n, are mutually exclusive events whose union is the sample space:

S = F_1 ∪ F_2 ∪ · · · ∪ F_n;

equivalently, exactly one of the F_i must occur. Then for any event E,

E = EF_1 ∪ EF_2 ∪ · · · ∪ EF_n,

and the events EF_i are mutually exclusive. [Venn Diagram] Then

P(E) = Σ_{i=1}^n P(EF_i) = Σ_{i=1}^n P(E | F_i) P(F_i).

This is called the law of total probability. It shows that we can compute P(E) by first conditioning on which one of the F_i occurs. Again, P(E) is a weighted average of the P(E | F_i), each term being weighted by the probability P(F_i).

Theorem. (Bayes's Formula)

P(F_j | E) = P(EF_j) / P(E) = P(E | F_j) P(F_j) / Σ_{i=1}^n P(E | F_i) P(F_i).

Proof. Follows directly from the definition of conditional probability and the law of total probability.

Example (3.32) A family has j children with probability p_j, where p_1 = 0.1, p_2 = 0.25, p_3 = 0.35, p_4 = 0.3. A child from this family is randomly chosen. Given that this child is the eldest child in the family, find the conditional probability that the family has

a.) only 1 child;

b.) 4 children.

To answer these questions, we first define each event. Let E be the event that the chosen child is the eldest and let F_j be the event that the family has j children. Given that the family has j children, the chosen child is the eldest with probability

P(E | F_j) = 1/j,

and we know P(F_j) = p_j as given in the problem. For both parts, we seek P(F_j | E). By Bayes's formula,

P(F_j | E) = P(EF_j) / P(E)
           = P(E | F_j) P(F_j) / Σ_{i=1}^4 P(E | F_i) P(F_i)
           = (p_j / j) / (p_1 + p_2/2 + p_3/3 + p_4/4).

Therefore, for part a.) we have

P(F_1 | E) = p_1 / (p_1 + p_2/2 + p_3/3 + p_4/4) = (1/10) / (5/12) = 6/25,

and for part b.) we have

P(F_4 | E) = (p_4/4) / (p_1 + p_2/2 + p_3/3 + p_4/4) = (3/40) / (5/12) = 9/50.

Exercise (3.36) Stores A, B, and C have 50, 75, and 100 employees, respectively, and 50, 60, and 70 percent of them, respectively, are women. Resignations are equally likely among all employees, regardless of sex. One woman employee resigns. What is the probability that she works in store C?

Solution: Here, let W refer to the event that the resignation came from a woman, and let A, B, and C represent the events that the person who resigned worked in store A, B, or C, respectively. Then we seek P(C | W). Using Bayes's formula, we obtain

P(C | W) = P(W | C) P(C) / [P(W | A) P(A) + P(W | B) P(B) + P(W | C) P(C)]
         = (0.7)(100/225) / [(0.5)(50/225) + (0.6)(75/225) + (0.7)(100/225)]
         = 0.5.

Exercise (3n) A bin contains 3 types of disposable flashlights. The probability that a type 1 flashlight will give more than 100 hours of use is 0.7, with the corresponding probabilities for type 2 and type 3 flashlights being 0.4 and 0.3, respectively. Suppose that 20% of the flashlights in the bin are type 1, 30% are type 2, and 50% are type 3.

a.) What is the probability that a randomly chosen flashlight will give more than 100 hours of use?

b.) Given that a flashlight lasted more than 100 hours, what is the conditional probability that it was a type j flashlight, for j = 1, 2, 3?

Solution: Let E be the event that the chosen flashlight gives more than 100 hours of use and let F_j be the event that a flashlight of type j is chosen. Consider the following solutions:

a.) We seek P(E), which we find by conditioning on which type of flashlight is chosen. The law of total probability gives

P(E) = P(E | F_1) P(F_1) + P(E | F_2) P(F_2) + P(E | F_3) P(F_3)
     = (0.7)(0.2) + (0.4)(0.3) + (0.3)(0.5)
     = 0.41.

b.) We seek the probabilities P(F_j | E), which are given by Bayes's formula:

P(F_j | E) = P(E | F_j) P(F_j) / 0.41.

This gives

P(F_1 | E) = 14/41,  P(F_2 | E) = 12/41,  P(F_3 | E) = 15/41.

Exercise (3k) A plane is missing, and it is presumed that it was equally likely to have gone down in any of 3 possible regions. Let 1 − β_i, for i = 1, 2, 3, denote the probability that the plane will be found upon a search of the ith region when the plane is, in fact, in that region. What is the conditional probability that the plane is in the ith region given that a search of region 1 is unsuccessful?

Solution: Here, the values β_i are called overlook probabilities, for obvious reasons. Let R_i be the event that the plane is in region i and let E be the event that a search of region 1 was unsuccessful. Then, using these events, we conclude the following:

P(R_i) = 1/3,  P(E | R_1) = β_1,  P(E | R_2) = 1,  P(E | R_3) = 1.

From this, we can find the desired probabilities. For i = 1, we seek

P(R_1 | E) = P(E | R_1) P(R_1) / [P(E | R_1) P(R_1) + P(E | R_2) P(R_2) + P(E | R_3) P(R_3)]
           = (β_1)(1/3) / (β_1/3 + 1/3 + 1/3)
           = β_1 / (2 + β_1).

For i = 2, 3, we obtain

P(R_i | E) = P(E | R_i) P(R_i) / [P(E | R_1) P(R_1) + P(E | R_2) P(R_2) + P(E | R_3) P(R_3)]
           = (1)(1/3) / (β_1/3 + 1/3 + 1/3)
           = 1 / (2 + β_1).

Definition The odds of an event E are defined as

odds = P(E) / P(E^c) = P(E) / (1 − P(E)).

The odds tell how much more likely it is that event E occurs than that it does not occur. If the odds are α, it is common to say the odds are "α to 1" in favor of the hypothesis.

We can now compute how the odds change when new evidence is introduced. Suppose H is true with probability P(H), and let E be new evidence. Then, given the new evidence,

P(H | E) = P(HE) / P(E) = P(E | H) P(H) / P(E),
P(H^c | E) = P(H^c E) / P(E) = P(E | H^c) P(H^c) / P(E).

Dividing the two expressions gives the "new odds" in light of this evidence:

P(H | E) / P(H^c | E) = [P(H) / P(H^c)] · [P(E | H) / P(E | H^c)],

where the left-hand side is the new odds and P(H)/P(H^c) is the old odds. We can see that the new odds increase if the new evidence is more likely when H is true than when it is false.

Example (3i) An urn contains two type A coins and one type B coin. When a type A coin is flipped, it comes up heads with probability 1/4, whereas when a type B coin is flipped, it comes up heads with probability 3/4. A coin is randomly chosen from the urn and flipped. Given that the flip landed on heads, what is the probability that it was a type A coin?

To answer this question, we define H as the event of flipping heads and A as the event that a type A coin is drawn. The odds of drawing a type A coin are

odds of drawing A = P(A) / P(A^c) = (2/3) / (1/3) = 2,

that is, two to one. Now, assume we know that the coin flipped came up heads. In light of this new evidence, the odds that it is a type A coin are

P(A | H) / P(A^c | H) = [P(A) / P(A^c)] · [P(H | A) / P(H | A^c)] = (2) · (1/4)/(3/4) = 2/3.

So the new odds are 2/3 to 1, which means the probability that the coin picked was a type A coin, given that heads was flipped, is P(A | H) = 2/5.

Section 3.4 Independence

We say E is independent of F if knowledge that F has occurred does not influence the probability that E occurs. In other words, if P(E | F) is the same as P(E), then E is independent of F, which means P(EF) = P(E) P(F). Furthermore, it follows that if E is independent of F, then F is independent of E.

Definition Two events E and F are said to be independent if

P(EF) = P(E) P(F).

Example (4a) A card is selected at random from an ordinary deck of 52 playing cards. If E is the event that the selected card is an ace and F is the event that it is a spade, then are the events independent?
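We can settle the question by counting. The sketch below (the rank/suit encoding and helper name are my own) enumerates the 52 equally likely cards and compares P(EF) with P(E)P(F):

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))   # 52 equally likely cards

def prob(event):
    # Probability of an event over the equally likely deck.
    return Fraction(sum(1 for c in deck if event(c)), len(deck))

p_ace = prob(lambda c: c[0] == "A")                          # P(E)
p_spade = prob(lambda c: c[1] == "spades")                   # P(F)
p_both = prob(lambda c: c[0] == "A" and c[1] == "spades")    # P(EF)

print(p_ace, p_spade, p_both)       # 1/13 1/4 1/52
print(p_both == p_ace * p_spade)    # True: the product rule holds
```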
In this case, yes, because P(E) = 4/52 = 1/13, and the probability of drawing an ace knowing the card is a spade is also P(E | F) = 1/13. Knowing the card is a spade gives you no extra information. On the flip side, P(F) = 13/52 = 1/4, and if you know the card you drew is an ace, the probability that it is a spade is P(F | E) = 1/4, since the ace is equally likely to be any of the four suits. You can also check directly: the probability of drawing the ace of spades is P(EF) = 1/52, while P(E) = 4/52 and P(F) = 13/52, so P(EF) = P(E) P(F).

Example (4b) Two coins are flipped, and all 4 outcomes are assumed to be equally likely. If E is the event that the first coin lands on heads and F the event that the second lands on tails, then are E and F independent? Yes, because P(E) = 1/2, P(F) = 1/2, and P(EF) = P({(H, T)}) = 1/4. Hence, P(EF) = P(E) P(F) and the two events are independent.

Exercise (4c) Suppose we toss 2 fair dice. Let E_1 denote the event that the sum of the dice is 6 and let E_2 denote the event that the sum of the dice is 7. Let F denote the event that the first roll was a 4. Is either E_1 or E_2 independent of F?

Solution: The sample space for this problem contains 36 elements. The events consist of the following outcomes:

E_1 = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}
E_2 = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
F = {(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6)}.

It follows easily that P(E_1) = 5/36 and P(F) = 1/6, since each outcome is equally likely. Also, P(E_1 F) = 1/36, as the two events have only one element, namely (4, 2), in common. Thus,

P(E_1 F) = 1/36 ≠ 5/216 = P(E_1) P(F),

and the two events are not independent. But we have P(E_2) = 1/6, P(F) = 1/6, and P(E_2 F) = 1/36, which means

P(E_2 F) = 1/36 = P(E_2) P(F),

so E_2 and F are independent.

Proposition If E and F are independent, then so are E and F^c.

Proof. Write E = EF ∪ EF^c; these two events are mutually exclusive. Therefore,

P(E) = P(EF) + P(EF^c) = P(E) P(F) + P(EF^c),

since E and F are independent. This equation can be rewritten as

P(E)(1 − P(F)) = P(EF^c)  ⇒  P(E) P(F^c) = P(EF^c),

which means E and F^c are independent.

For three events, say E, F, and G, to be independent, we must have

P(EFG) = P(E) P(F) P(G),

and

P(EF) = P(E) P(F),  P(EG) = P(E) P(G),  P(FG) = P(F) P(G).

In general, independence can be extended to more than three events. The events E_1, E_2, . . . , E_n are said to be independent if for every subset E_1', E_2', . . . , E_r', with r ≤ n, of these events,

P(E_1' E_2' · · · E_r') = P(E_1') P(E_2') · · · P(E_r').

Exercise (4f) An infinite sequence of independent trials is to be performed. Each trial results in a success with probability p and a failure with probability 1 − p. What is the probability that at least 1 success occurs in the first n trials? What is the probability that exactly k successes occur in the first n trials?

Solution: Here, we define S to be a success and F to be a failure. We know that each trial is independent with P(S) = p and P(F) = 1 − p. To find the probability of at least one success, we can find the probability of no successes and subtract it from one. Hence,

P(at least one success) = 1 − P(no successes) = 1 − (1 − p)^n,

where we have used the independence of the trials by multiplying the probability of failure n times. To answer the second question, consider one particular sequence of n trials with k successes followed by n − k failures:

SS · · · S (k successes) FF · · · F (n − k failures).

The probability of this sequence is simply p^k (1 − p)^{n−k}. But this is only one such sequence of k successes and n − k failures.
There are exactly C(n, k) = n!/(k!(n − k)!) sequences of this form, since we are placing k successes among n trials. Therefore, the probability is

P(k successes) = C(n, k) p^k (1 − p)^(n−k).

Exercise (4i) Suppose there are n types of coupons and that each new coupon collected is, independent of previous selections, a type i coupon with probability pi, where p1 + p2 + · · · + pn = 1. Suppose k coupons are to be collected. If Ai is the event that there is at least one type i coupon among those collected, then, for i ≠ j, find P(Ai), P(Ai ∪ Aj), and P(Ai | Aj).

Solution: We note here that the selection of each coupon is independent of the previous selections, but that the events Ai and Aj are not independent for i ≠ j. Since Ai is the event that at least one coupon of type i is picked, we again consider the complementary event that no coupon of type i is selected. Therefore,

P(Ai) = 1 − P(no coupon of type i) = 1 − (1 − pi)^k.

The event Ai ∪ Aj is the event that at least one coupon of type i or type j is selected. Again, we can find this probability by considering the complementary event that neither a coupon of type i nor one of type j is chosen:

P(Ai ∪ Aj) = 1 − P(Ai^c Aj^c) = 1 − (1 − pi − pj)^k.

Finally, to find the conditional probability P(Ai | Aj), we use the conditional probability formula, which gives

P(Ai | Aj) = P(Ai Aj) / P(Aj).

To determine P(Ai Aj), we can use the following identity:

P(Ai ∪ Aj) = P(Ai) + P(Aj) − P(Ai Aj).

Substituting, solving for P(Ai Aj), and dividing by P(Aj), we obtain the solution:

P(Ai | Aj) = [1 − (1 − pi)^k + 1 − (1 − pj)^k − (1 − (1 − pi − pj)^k)] / [1 − (1 − pj)^k].

Exercise (3.91) Suppose that n independent trials, each of which results in one of the outcomes 0, 1, or 2, with respective probabilities p0, p1, and p2, where p0 + p1 + p2 = 1, are performed. Find the probability that outcomes 1 and 2 both occur at least once.

Solution: Let Ei be the event that outcome i does not occur.
Then we seek P((E1 ∪ E2)^c), the probability that both outcomes 1 and 2 occur at least once. We have

P(E1 ∪ E2) = P(E1) + P(E2) − P(E1 E2)
= (1 − p1)^n + (1 − p2)^n − (1 − p1 − p2)^n
= (1 − p1)^n + (1 − p2)^n − p0^n.

Subtracting this from one yields the desired probability:

P((E1 ∪ E2)^c) = 1 + p0^n − (1 − p1)^n − (1 − p2)^n.

Section 3.5 P(· | F) is a Probability

We now consider the conditional probability P(· | F) as a probability function that satisfies the three axioms of probability.

Proposition For any event E of a sample space S, the probability P(E | F) satisfies the three axioms of probability:

1.] 0 ≤ P(E | F) ≤ 1.
2.] P(S | F) = 1.
3.] If events E1, E2, E3, ... are mutually exclusive (i.e., Ei Ej = ∅ when i ≠ j), then

P(∪_{i=1}^∞ Ei | F) = Σ_{i=1}^∞ P(Ei | F).

Proof. For part 1.], we must show that 0 ≤ P(E | F) ≤ 1. We use the formula for conditional probability to write

P(E | F) = P(EF) / P(F).

Here, P(EF) ≥ 0, and since EF ⊂ F, we have P(EF) ≤ P(F). Therefore,

0 ≤ P(EF) / P(F) ≤ 1,

and P(E | F) satisfies the first axiom. Part 2.] follows since

P(S | F) = P(SF) / P(F) = P(F) / P(F) = 1.

Finally, consider a sequence of mutually exclusive events E1, E2, .... Then

P(∪_{i=1}^∞ Ei | F) = P((∪_{i=1}^∞ Ei) F) / P(F) = P(∪_{i=1}^∞ Ei F) / P(F).

Since Ei Ej = ∅ for i ≠ j, it follows that (Ei F)(Ej F) = ∅, and we have

P(∪_{i=1}^∞ Ei F) = Σ_{i=1}^∞ P(Ei F).

This holds because all other terms in the inclusion-exclusion principle drop out, since the events are mutually exclusive. Therefore,

P(∪_{i=1}^∞ Ei | F) = Σ_{i=1}^∞ P(Ei F) / P(F) = Σ_{i=1}^∞ P(Ei | F),

and Axiom 3.] is satisfied.

If we define Q(E) = P(E | F), then by the above proposition, Q(E) may be regarded as a probability function on the events of S.
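As a concrete illustration that Q(E) = P(E | F) behaves like a probability, the sketch below verifies inclusion-exclusion for Q on the two-dice sample space. The events chosen here (sum at least 9, second die even) are arbitrary examples, not from the notes:

```python
# Check that Q(E) = P(E | F) satisfies inclusion-exclusion on a finite space.
from fractions import Fraction

S = [(i, j) for i in range(1, 7) for j in range(1, 7)]  # two fair dice

def P(event):
    # Probability of a set of outcomes under the uniform distribution on S.
    return Fraction(sum(1 for o in S if o in event), len(S))

F  = {o for o in S if o[0] == 4}          # conditioning event: first die is 4
E1 = {o for o in S if o[0] + o[1] >= 9}   # example event: sum at least 9
E2 = {o for o in S if o[1] % 2 == 0}      # example event: second die even

def Q(E):
    # Q(E) = P(EF) / P(F), the conditional probability given F.
    return P(E & F) / P(F)

lhs = Q(E1 | E2)                          # Q(E1 ∪ E2)
rhs = Q(E1) + Q(E2) - Q(E1 & E2)          # inclusion-exclusion for Q
print(lhs == rhs)  # True
```

Since Q is a genuine probability function, the identity holds for any choice of E1 and E2.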
In this sense, all propositions previously proved for probabilities hold for Q(E). For example, we have

Q(E1 ∪ E2) = Q(E1) + Q(E2) − Q(E1 E2),

or equivalently,

P(E1 ∪ E2 | F) = P(E1 | F) + P(E2 | F) − P(E1 E2 | F).

We can also define a conditional probability for the probability Q(E). Suppose we wish to find the probability Q(E1) by first conditioning on whether or not E2 occurs. Then

Q(E1) = Q(E1 | E2)Q(E2) + Q(E1 | E2^c)Q(E2^c).

If we apply the formula for conditional probability, Q(E1 | E2) = Q(E1 E2)/Q(E2), we obtain

Q(E1 | E2) = Q(E1 E2) / Q(E2)
= P(E1 E2 | F) / P(E2 | F)
= [P(E1 E2 F) / P(F)] / [P(E2 F) / P(F)]
= P(E1 E2 F) / P(E2 F)
= P(E1 | E2 F).

Therefore, conditioning the probability Q(E1) on whether or not E2 occurs is equivalent to

P(E1 | F) = P(E1 | E2 F)P(E2 | F) + P(E1 | E2^c F)P(E2^c | F).

Exercise (5a and 3a) An insurance company believes that people can be divided into two classes: those who are accident prone and those who are not. The company's statistics show that an accident prone person will have an accident at some time within a fixed 1-year period with probability 0.4, whereas this probability decreases to 0.2 for a person who is not accident prone.

a.) If we assume that 30% of the population is accident prone, what is the probability that a new policyholder will have an accident within a year of purchasing a policy?

b.) Suppose that a new policyholder has an accident within a year of purchasing a policy. What is the probability that she is accident prone?

c.) What is the conditional probability that a new policyholder will have an accident in his or her second year of policy ownership, given that the policyholder had an accident the first year?

Solution: Let A be the event that a person is accident-prone, and let A1 be the event that a person had an accident within a year.
Also, let A2 denote the event that a person had an accident during the second year of holding a policy. We are given P(A1 | A) = 0.4 and P(A1 | A^c) = 0.2. Consider the following:

a.) We seek the probability P(A1), which we can find by conditioning on whether or not the person is accident prone. We have

P(A1) = P(A1 | A)P(A) + P(A1 | A^c)P(A^c) = (0.4)(0.3) + (0.2)(0.7) = 0.26.

b.) Now, we assume the person has had an accident and we want to know whether she is actually accident prone. In this case, we seek P(A | A1). This is given by

P(A | A1) = P(A A1) / P(A1) = P(A)P(A1 | A) / P(A1) = (0.3)(0.4) / 0.26 = 6/13.

c.) Now, we wish to find the probability P(A2 | A1). We can find this by conditioning on whether or not the person is accident prone. This gives

P(A2 | A1) = P(A2 | A A1)P(A | A1) + P(A2 | A^c A1)P(A^c | A1).

Here, P(A2 | A A1) = 0.4, since the person is accident prone and the second year is a separate period from the first; similarly, P(A2 | A^c A1) = 0.2, and P(A^c | A1) = 1 − 6/13 = 7/13. This gives the following solution:

P(A2 | A1) = (0.4)(6/13) + (0.2)(7/13) ≈ 0.29.

Exercise (5e) There are k + 1 coins in a box. When flipped, the ith coin will turn up heads with probability i/k, for i = 0, 1, ..., k. A coin is randomly selected from the box and is then repeatedly flipped. If the first n flips all result in heads, what is the conditional probability that the (n + 1)st flip will do likewise?

Solution: Let Ci be the event that the ith coin is drawn, Fn be the event that the first n flips were heads, and H be the event that the (n + 1)st flip is heads. Then the probability we seek is P(H | Fn). The given probabilities are P(Ci) = 1/(k + 1), P(H | Ci) = i/k, and P(Fn | Ci) = (i/k)^n, as each flip is independent of the last.
To find the probability, we condition on which coin was drawn:

P(H | Fn) = Σ_{i=0}^k P(H | Ci Fn) P(Ci | Fn).

We can take P(H | Ci Fn) = P(H | Ci) = i/k, since once the coin is known, the previous flips do not influence the (n + 1)st flip; the only thing that matters is which coin was used. Therefore, we have

P(H | Fn) = Σ_{i=0}^k (i/k) P(Ci | Fn).

Now, we seek the probability P(Ci | Fn), which is the probability that the ith coin was drawn given that n heads were flipped. Using Bayes's formula, we obtain

P(Ci | Fn) = P(Fn | Ci)P(Ci) / Σ_{j=0}^k P(Fn | Cj)P(Cj)
= (i/k)^n (1/(k + 1)) / Σ_{j=0}^k (j/k)^n (1/(k + 1))
= (i/k)^n / Σ_{j=0}^k (j/k)^n.

From this, we update our probability to

P(H | Fn) = Σ_{i=0}^k (i/k)^(n+1) / Σ_{j=0}^k (j/k)^n.

If k is large, then multiplying the numerator and denominator by 1/k, we obtain the following approximation:

P(H | Fn) = [(1/k) Σ_{i=0}^k (i/k)^(n+1)] / [(1/k) Σ_{j=0}^k (j/k)^n] ≈ ∫_0^1 x^(n+1) dx / ∫_0^1 x^n dx = (n + 1)/(n + 2).

This approximation follows because each sum, as k gets large, is a Riemann sum approximation to the corresponding integral.
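The Riemann-sum limit in Exercise (5e) can be checked numerically: the exact ratio of sums should approach (n + 1)/(n + 2) as k grows. A minimal sketch:

```python
# Numerical check of Exercise (5e): the exact expression for P(H | F_n)
# approaches (n + 1)/(n + 2) as the number of coins k + 1 grows.
def p_heads_given_n_heads(n, k):
    # Exact ratio: sum_i (i/k)^(n+1) divided by sum_j (j/k)^n, i, j = 0..k.
    num = sum((i / k) ** (n + 1) for i in range(k + 1))
    den = sum((i / k) ** n for i in range(k + 1))
    return num / den

n = 3
approx = (n + 1) / (n + 2)                # Riemann-sum limit: 4/5 = 0.8
exact = p_heads_given_n_heads(n, 10_000)  # large k
print(abs(exact - approx) < 1e-3)         # True: the sums converge to the integrals
```

For k = 10,000 and n = 3 the exact value already agrees with 0.8 to about four decimal places, consistent with the O(1/k) error of a Riemann sum.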