Chapter 2. Conditional Probability

The probabilities assigned to various events depend on what is known about the experimental situation when the assignment is made. For a particular event A, we have used P(A) to represent the probability assigned to A; we now think of P(A) as the original or unconditional probability of the event A.

2.1 The definition of conditional probability

In this section, we examine how the information “an event B has occurred” affects the probability assigned to A. We will use the notation P(A|B) to represent the conditional probability of A given that the event B has occurred.

Conditioning is one of the fundamental tools of probability: probably the most fundamental tool. It is especially helpful for calculating the probabilities of intersections, such as $P(A \cap B)$, which themselves are critical for the useful Partition Theorem.

Additionally, the whole field of stochastic processes is based on the idea of conditional probability. What happens next in a process depends, or is conditional, on what has happened beforehand.

Dependent events

Suppose A and B are two events on the same sample space. There will often be dependence between A and B. This means that if we know that B has occurred, it changes our knowledge of the chance that A will occur.

Example 1. Toss a die once. However, if we know that B has occurred, then there is an increased chance that A has occurred:

Conditioning as reducing the sample space

Example 2. The car survey in Examples of basic probability calculations also asked respondents which they valued more highly in a car: ease of parking, or style/prestige. Here are the responses:

Suppose we pick a respondent at random from all those in the table. Let event A = “respondent thinks that prestige is more important”. Suppose we reduce our sample space from

This is our definition of conditional probability:

Definition: Let A and B be two events with P(B) > 0.
The conditional probability that event A occurs, given that event B has occurred, is written P(A|B), and is given by

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

Note: Follow the reasoning above carefully. It is important to understand why the conditional probability is the probability of the intersection within the new sample space. Conditioning on event B means changing the sample space to B. Think of P(A|B) as the chance of getting an A, from the set of B's only.

The Multiplication Rule

For any events A and B,

$$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A).$$

New statement of the Partition Theorem (The Law of Total Probability)

The Multiplication Rule gives us a new statement of the Partition Theorem (Total Probability Theorem): if $B_1, \dots, B_m$ partition the sample space, then for any event A,

$$P(A) = \sum_{i=1}^{m} P(A \cap B_i) = \sum_{i=1}^{m} P(A \mid B_i)\,P(B_i).$$

Both formulations of the Partition Theorem are very widely used, but especially the conditional formulation $\sum_{i} P(A \mid B_i)\,P(B_i)$.

Examples of conditional probability and partitions

Example 3. A news magazine publishes three columns entitled “Art” (A), “Books” (B), and “Cinema” (C). Reading habits of a randomly selected reader with respect to these columns are

Read regularly   A      B      C      A∩B    A∩C    B∩C    A∩B∩C
Probability      0.14   0.23   0.37   0.08   0.09   0.13   0.05

We thus have

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{0.08}{0.23} = 0.348,$$

$$P(A \mid B \cup C) = \frac{P(A \cap (B \cup C))}{P(B \cup C)} = \frac{0.04 + 0.05 + 0.03}{0.47} = 0.255,$$

$$P(A \mid \text{reads at least one}) = P(A \mid A \cup B \cup C) = \frac{P(A \cap (A \cup B \cup C))}{P(A \cup B \cup C)} = \frac{P(A)}{P(A \cup B \cup C)} = \frac{0.14}{0.49} = 0.286,$$

$$P(A \cup B \mid C) = \frac{P((A \cup B) \cap C)}{P(C)} = \frac{0.04 + 0.05 + 0.08}{0.37} = 0.459.$$

Example 4. Four individuals have responded to a request by a blood bank for blood donations. None of them has donated before, so their blood types are unknown. Suppose only type A+ is desired and only one of the four actually has this type. If the potential donors are selected in random order for typing, what is the probability that at least three individuals must be typed to obtain the desired type?

Solution. Making the identification B = {first type not A+} and A = {second type not A+}, P(B) = 3/4.
Given that the first type is not A+, two of the three individuals left are not A+, so P(A|B) = 2/3. The multiplication rule now gives

$$P(\text{at least three individuals are typed}) = P(A \cap B) = P(A \mid B)\,P(B) = \frac{2}{3} \cdot \frac{3}{4} = 0.5.$$

The multiplication rule is most useful when the experiment consists of several stages in succession. The conditioning event B then describes the outcome of the first stage and A the outcome of the second, so that P(A|B), which conditions on what occurs first, will often be known. The rule is easily extended to experiments involving more than two stages.

More than two events

To find $P(A_1 \cap A_2 \cap A_3)$, we can apply the multiplication rule successively:

$$P(A_1 \cap A_2 \cap A_3) = P(A_3 \mid A_1 \cap A_2)\,P(A_2 \mid A_1)\,P(A_1),$$

where $A_1$ occurs first, followed by $A_2$, and finally $A_3$.

Example 5. For the blood typing experiment of the above example,

$$P(\text{third type is A+}) = P(\text{third is} \mid \text{first isn't} \cap \text{second isn't}) \cdot P(\text{second isn't} \mid \text{first isn't}) \cdot P(\text{first isn't}) = \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} = \frac{1}{4} = 0.25.$$

Example 6. A box contains w white balls and r red balls. Draw 3 balls without replacement. What is the probability of getting the sequence white, red, white?

Solution. Writing $W_i$ and $R_i$ for a white or red ball on draw i, the multiplication rule gives

$$P(W_1 \cap R_2 \cap W_3) = P(W_1)\,P(R_2 \mid W_1)\,P(W_3 \mid W_1 \cap R_2) = \frac{w}{w+r} \cdot \frac{r}{w+r-1} \cdot \frac{w-1}{w+r-2}.$$

Example 7. Tom gets the bus to campus every day. The bus is on time with probability 0.6, and late with probability 0.4. The sample space can be written as $\Omega = \{\text{bus journeys}\}$. We can formulate events as follows: T = “on time”; L = “late”. From the information given, the events have probabilities: P(T) = 0.6; P(L) = 0.4.

Question (a) Do the events T and L form a partition of the sample space? Explain why or why not.

Solution. Yes. They cover all possible journeys (probabilities sum to 1), and there is no overlap in the events by definition.

The buses are sometimes crowded and sometimes noisy, both of which are problems for Tom as he likes to use the bus journeys to do his Stats assignments. When the bus is on time, it is crowded with probability 0.5. When it is late, it is crowded with probability 0.7. The bus is noisy with probability 0.8 when it is crowded, and with probability 0.4 when it is not crowded.
Question (b) Formulate events C and N corresponding to the bus being crowded and noisy. Do the events C and N form a partition of the sample space? Explain why or why not.

Solution. Let C = “crowded”, N = “noisy”. C and N do NOT form a partition of $\Omega$. It is possible for the bus to be noisy when it is crowded, so there must be some overlap between C and N.

Question (c) Write down probability statements corresponding to the information given above. Your answer should involve two statements linking C with T and L, and two statements linking N with C.

Solution. P(C|T) = 0.5; P(C|L) = 0.7; P(N|C) = 0.8; P(N|C') = 0.4, where C' denotes “not crowded”.

Question (d) Find the probability that the bus is crowded.

Solution. T and L partition the sample space, so by the Partition Theorem

$$P(C) = P(C \mid T)\,P(T) + P(C \mid L)\,P(L) = 0.5 \times 0.6 + 0.7 \times 0.4 = 0.58.$$

Question (e) Find the probability that the bus is noisy.

Solution. C and C' partition the sample space, so

$$P(N) = P(N \mid C)\,P(C) + P(N \mid C')\,P(C') = 0.8 \times 0.58 + 0.4 \times 0.42 = 0.632.$$

Example 8. A chain of video stores sells three different brands of VCRs. Of its VCR sales, 50% are brand 1 (the least expensive), 30% are brand 2, and 20% are brand 3. Each manufacturer offers a 1-year warranty on parts and labor. It is known that 25% of brand 1's VCRs require warranty repair work, whereas the corresponding percentages for brands 2 and 3 are 20% and 10%, respectively.

Question (a) What is the probability that a randomly selected purchaser has bought a brand 1 VCR that will need repair while under warranty?

Question (b) What is the probability that a randomly selected purchaser has a VCR that will need repair while under warranty?

Question (c) If a customer returns to the store with a VCR that needs warranty repair work, what is the probability that it is a brand 1 VCR? A brand 2 VCR? A brand 3 VCR?

Solution. Let $A_i$ = {brand i is purchased}, for i = 1, 2, 3, B = {needs repair}, and B' = {doesn't need repair}. Then

$P(A_1) = 0.5$, $P(A_2) = 0.3$, $P(A_3) = 0.2$,
$P(B \mid A_1) = 0.25$, $P(B \mid A_2) = 0.2$, $P(B \mid A_3) = 0.1$.
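
The three parts of Example 8 can also be cross-checked numerically. The following is a minimal Python sketch that applies the Multiplication Rule, the Partition Theorem, and the definition of conditional probability to the figures stated in the example. The variable names (`prior`, `repair_given_brand`, `joint`, `posterior`) are illustrative choices, not notation from these notes.

```python
# Figures stated in Example 8; the dictionary names are illustrative only.
prior = {1: 0.5, 2: 0.3, 3: 0.2}                   # P(A_i): share of sales of brand i
repair_given_brand = {1: 0.25, 2: 0.20, 3: 0.10}   # P(B | A_i): repair rate of brand i

# Multiplication Rule: P(A_i and B) = P(B | A_i) * P(A_i)
joint = {i: repair_given_brand[i] * prior[i] for i in prior}

# Partition Theorem: the brands partition the sample space, so
# P(B) is the sum of P(A_i and B) over all brands.
p_repair = sum(joint.values())

# Definition of conditional probability: P(A_i | B) = P(A_i and B) / P(B)
posterior = {i: joint[i] / p_repair for i in prior}

print(f"(a) P(A1 and B) = {joint[1]:.3f}")
print(f"(b) P(B)        = {p_repair:.3f}")
for i in sorted(posterior):
    print(f"(c) P(A{i} | B)   = {posterior[i]:.2f}")
```

Because the three brands partition the sample space, the values in `posterior` sum to 1, which is a quick sanity check on any calculation of this kind.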
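(a) $P(A_1 \cap B) = P(B \mid A_1)\,P(A_1) = 0.25 \times 0.5 = 0.125$.

(b) $P(B) = P(\text{(brand 1 and repair) or (brand 2 and repair) or (brand 3 and repair)}) = P(A_1 \cap B) + P(A_2 \cap B) + P(A_3 \cap B) = 0.125 + 0.06 + 0.02 = 0.205$.

(c) $P(A_1 \mid B) = \dfrac{P(A_1 \cap B)}{P(B)} = \dfrac{0.125}{0.205} = 0.61$, $P(A_2 \mid B) = \dfrac{P(A_2 \cap B)}{P(B)} = \dfrac{0.06}{0.205} = 0.29$, and $P(A_3 \mid B) = 1 - P(A_1 \mid B) - P(A_2 \mid B) = 0.10$.

2.2 Statistical Independence

Two events A and B are statistically independent if the occurrence of one does not affect the occurrence of the other. This means P(A|B) = P(A) and P(B|A) = P(B). Since $P(A \mid B) = P(A \cap B)/P(B)$, the condition P(A|B) = P(A) gives $P(A \cap B) = P(A)\,P(B)$. We use this as our definition of statistical independence.

Definition: Events A and B are statistically independent if $P(A \cap B) = P(A)\,P(B)$.

For more than two events, we say:

Definition: Events $A_1, \dots, A_n$ are mutually independent if $P(A_1 \cap \dots \cap A_n) = P(A_1) \cdots P(A_n)$, and the same multiplication rule holds for every subcollection of the events.

Statistical independence for calculating the probability of an intersection

We usually have two choices.

1. If A and B are statistically independent, then $P(A \cap B) = P(A)\,P(B)$.

2. If A and B are not known to be statistically independent, we usually have to use conditional probability and the multiplication rule: $P(A \cap B) = P(A \mid B)\,P(B)$. This still requires us to be able to calculate P(A|B).

Note: If events are physically independent, then they will also be statistically independent.

Pairwise independence does not imply mutual independence

Example 9. A jar contains 4 balls: one red, one white, one blue, and one red, white & blue. Draw one ball at random. Let A = “ball has red on it”, B = “ball has white on it”, and C = “ball has blue on it”. Two of the four balls have red on them, so P(A) = 1/2; likewise P(B) = P(C) = 1/2. Only one ball has both red and white on it, so $P(A \cap B) = 1/4 = P(A)\,P(B)$, and the same holds for the other two pairs: the events are pairwise independent. However, $P(A \cap B \cap C) = 1/4$, whereas $P(A)\,P(B)\,P(C) = 1/8$. So A, B and C are NOT mutually independent, despite being pairwise independent.

Example 10. It is known that 30% of a certain company's washing machines require service while under warranty, whereas only 10% of its dryers need such service. If someone purchases both a washer and a dryer made by this company, what is the probability that both machines need warranty service?

Let A denote the event that the washer needs service while under warranty, and let B be defined analogously for the dryer. Then P(A) = 0.3, P(B) = 0.1. Assuming that the two machines function independently of one another, the desired probability is

$$P(A \cap B) = P(A) \cdot P(B) = 0.3 \times 0.1 = 0.03.$$

The probability that neither machine needs service is

$$P(A' \cap B') = P(A') \cdot P(B') = (0.7)(0.9) = 0.63.$$

Example 11.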
A system consists of four components, as illustrated in Fig. The entire system will work if either the 1-2 subsystem works or if the 3-4 subsystem works (since the two subsystems are connected in parallel). Since the two components in each subsystem are connected in series, a subsystem will work only if both its components work. If components work or fail independently of one another and if each works with probability 0.9, what is the probability that the entire system will work (the system reliability coefficient)?

Letting $A_i$ (i = 1, 2, 3, 4) be the event that the ith component works, the $A_i$'s are mutually independent. The event that the 1-2 subsystem works is $A_1 \cap A_2$, and similarly, $A_3 \cap A_4$ denotes the event that the 3-4 subsystem works. The event that the entire system works is $(A_1 \cap A_2) \cup (A_3 \cap A_4)$, so

$$\begin{aligned}
P[(A_1 \cap A_2) \cup (A_3 \cap A_4)] &= P(A_1 \cap A_2) + P(A_3 \cap A_4) - P[(A_1 \cap A_2) \cap (A_3 \cap A_4)] \\
&= P(A_1)\,P(A_2) + P(A_3)\,P(A_4) - P(A_1)\,P(A_2)\,P(A_3)\,P(A_4) \\
&= (0.9)(0.9) + (0.9)(0.9) - (0.9)(0.9)(0.9)(0.9) = 0.9639.
\end{aligned}$$

Example 12. Suppose that a machine produces a defective item with probability p (0 < p < 1) and produces a nondefective item with probability 1 - p. Suppose further that six items produced by the machine are selected at random and inspected, and that the results (defective or nondefective) for these six items are independent. We shall determine the probability that exactly two of the six items are defective.

Solution. It can be assumed that the sample space S contains all possible arrangements of six items, each one of which might be either defective or nondefective. Let $D_j$ denote the event that the jth item in the sample is defective; then $D_j^c$ is the event that this item is nondefective. Since the outcomes for the six different items are independent, the probability of obtaining any particular sequence of defective and nondefective items will simply be the product of the individual probabilities for the items.
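For example,

$$P(D_1^c \cap D_2 \cap D_3^c \cap D_4^c \cap D_5 \cap D_6^c) = P(D_1^c)\,P(D_2)\,P(D_3^c)\,P(D_4^c)\,P(D_5)\,P(D_6^c) = (1-p)\,p\,(1-p)(1-p)\,p\,(1-p) = p^2(1-p)^4.$$

It can be seen that the probability of any other particular sequence in S containing two defective items and four nondefective items will also be $p^2(1-p)^4$. Since there are $\binom{6}{2}$ distinct arrangements of two defective items and four nondefective items, the probability of obtaining exactly two defectives is $\binom{6}{2}\,p^2(1-p)^4$.

2.3 Bayes' Theorem: inverting conditional probabilities

Consider $P(B \cap A) = P(A \cap B)$. Applying the multiplication rule to each side gives $P(B \mid A)\,P(A) = P(A \mid B)\,P(B)$. Then

$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}.$$

This is the simplest form of Bayes' Theorem, named after Thomas Bayes (1702-61), English clergyman and founder of Bayesian Statistics.

Bayes' Theorem allows us to “invert” the conditioning, i.e. to express P(B|A) in terms of P(A|B). This is very useful. For example, it might be easy to calculate P(later event | earlier event); but we might only observe the later event and wish to deduce the probability that the earlier event occurred, P(earlier event | later event).

Full statement of Bayes' Theorem: Let $B_1, \dots, B_m$ form a partition of the sample space. Then for any event A with P(A) > 0, and for any j = 1, ..., m,

$$P(B_j \mid A) = \frac{P(A \mid B_j)\,P(B_j)}{\sum_{i=1}^{m} P(A \mid B_i)\,P(B_i)}.$$

Example 13. The case of the Perfidious Gardener. Mr Smith owns a hysterical rosebush. It will die with probability 1/2 if watered, and with probability 3/4 if not watered. Worse still, Smith employs a perfidious gardener who will fail to water the rosebush with probability 2/3. Smith returns from holiday to find the rosebush . . . DEAD!! What is the probability that the gardener did not water it?

Solution. Let D = “rosebush dies”, W = “gardener waters the rosebush”, and W' = “gardener fails to water the rosebush”. We are given P(D|W) = 1/2, P(D|W') = 3/4, P(W') = 2/3, and so P(W) = 1/3. By Bayes' Theorem,

$$P(W' \mid D) = \frac{P(D \mid W')\,P(W')}{P(D \mid W')\,P(W') + P(D \mid W)\,P(W)} = \frac{\frac{3}{4} \cdot \frac{2}{3}}{\frac{3}{4} \cdot \frac{2}{3} + \frac{1}{2} \cdot \frac{1}{3}} = \frac{1/2}{2/3} = \frac{3}{4}.$$

So the gardener failed to water the rosebush with probability 3/4.

Example 14. The case of the Defective Ketchup Bottle. Ketchup bottles are produced in 3 different factories, accounting for 50%, 30%, and 20% of the total output respectively. The percentage of defective bottles from the 3 factories is respectively 0.4%, 0.6%, and 1.2%. A statistics lecturer who eats only ketchup finds a defective bottle in her wig. What is the probability that it came from Factory 1?

Information given: writing $F_i$ = “bottle comes from Factory i” and D = “bottle is defective”, we have $P(F_1) = 0.5$, $P(F_2) = 0.3$, $P(F_3) = 0.2$, and $P(D \mid F_1) = 0.004$, $P(D \mid F_2) = 0.006$, $P(D \mid F_3) = 0.012$.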
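
The full statement of Bayes' Theorem can be turned into a few lines of Python. The sketch below is a minimal illustration applied to the percentages quoted in Example 14, writing F1, F2, F3 for the three factories and D for “the bottle is defective” (labels chosen here for illustration). The function name `bayes_posterior` and the list names are hypothetical, and the sketch assumes the supplied events form a partition, so that the priors sum to 1.

```python
def bayes_posterior(priors, likelihoods):
    """Return the list of P(B_j | A), given P(B_j) and P(A | B_j) for each j.

    Assumes the B_j partition the sample space, so the priors must sum to 1.
    """
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 for a partition")
    # Multiplication Rule: P(A and B_j) = P(A | B_j) * P(B_j)
    joint = [p * lik for p, lik in zip(priors, likelihoods)]
    # Partition Theorem: P(A) is the sum of the joint probabilities
    total = sum(joint)
    if total == 0:
        raise ValueError("P(A) is zero, so the conditional probability is undefined")
    # Bayes' Theorem: divide each joint probability by P(A)
    return [j / total for j in joint]


# Figures from Example 14 (percentages converted to probabilities).
factory_share = [0.5, 0.3, 0.2]        # P(F1), P(F2), P(F3)
defect_rate = [0.004, 0.006, 0.012]    # P(D | F1), P(D | F2), P(D | F3)

posterior = bayes_posterior(factory_share, defect_rate)
for k, prob in enumerate(posterior, start=1):
    print(f"P(F{k} | D) = {prob:.3f}")
```

The same function covers the two-event case: `bayes_posterior([1/3, 2/3], [1/2, 3/4])[1]` evaluates to 0.75 (up to floating-point rounding), matching the 3/4 stated for the unwatered rosebush in Example 13.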
