Lecture 4
Selected material from: Ch. 6 Probability

Example: Music preferences
Suppose you want to know what types of CDs males and females are more likely to buy. The CDs are classified as Classical, Rock, and Pop.
Let M = male, F = female, C = Classical, R = Rock, P = Pop.
A chance experiment is any situation where two or more outcomes may result; here, which CDs males and females buy.
The collection of all possible outcomes of a chance experiment is the sample space for the experiment, {MC, MR, MP, FC, FR, FP}.
© 2008 Brooks/Cole, a division of Thomson Learning, Inc.

Example: Music preferences
Another way of illustrating the sample space is a picture called a tree diagram.

Male
    Classical
    Rock
    Pop
Female
    Classical
    Rock
    Pop

To identify any particular outcome of the sample space, you traverse the tree by first selecting a branch corresponding to gender and then a branch corresponding to the choice of music.

Example: Music preferences
An experiment will be performed where the shoppers in a CD store are observed. The sample space for the experiment = {MC, MR, MP, FC, FR, FP}.
An event is any collection of outcomes from the sample space of a chance experiment; for example, the event of a male in the CD store is {MC, MR, MP}.
A simple event is an event consisting of exactly one outcome, such as the event of a female buying classical music = {FC}.

Venn diagrams
A Venn diagram is an informal picture used to identify relationships. A rectangle is used to represent the sample space and a circle to represent the event A.
The complement of A is the region in the rectangle outside of A, and is referred to as A^C.
[Figure: Venn diagram with the event A drawn as a circle inside the sample-space rectangle and A^C as the region outside the circle.]

Events
The union of events A and B is all outcomes that belong to at least one of A or B and is referred to as A or B, or A ∪ B.
The intersection of events A and B is all outcomes that belong to both A and B and is referred to as A and B, or A ∩ B.
Two events A and B that are disjoint or mutually exclusive have no outcomes in common; their intersection is the empty set ∅.
[Figure: Venn diagrams with panels Union, Intersection, and Disjoint.]

The concept extends to more than 2 events
[Figure: Venn diagrams for three events A, B, and C, including panels labeled A ∩ B ∩ C; A, B and C are disjoint; and A ∩ B ∩ C^C.]

Basic rules of probability
Probability gives the chance that an event will occur and obeys the following basic rules.
1.) For any event E, 0 ≤ P(E) ≤ 1.
2.) If S is the sample space for an experiment, P(S) = 1.
3.) If two events E and F are disjoint, then P(E or F) = P(E) + P(F).
4.) For any event E, P(E) + P(E^C) = 1, so P(E^C) = 1 – P(E).
Regarding 3.), more generally, if E1, E2, ..., Ek are disjoint, then
P(E1 or E2 or ... or Ek) = P(E1 ∪ E2 ∪ ... ∪ Ek) = P(E1) + P(E2) + ... + P(Ek).

Bus example
A study was performed to look at the relationship between motion sickness and seat position in a bus. Specifically, 3256 people were asked where they sat in the bus and whether or not they got sick (nausea).

Results
                 Seat Position in Bus
              Front   Middle   Back
Nausea           58      166    193
No Nausea       870     1163    806

Source: “Motion Sickness in Public Road Transport: The Effect of Driver, Route and Vehicle” Ergonomics (1999): 1646 – 1664.

Bus example
Calculating totals for some of the events.

                 Seat Position in Bus
              Front   Middle   Back   Total
Nausea           58      166    193     417
No Nausea       870     1163    806    2839
Total           928     1329    999    3256

928 people sat in the front of the bus.
417 people had nausea.
806 people sat in the back and had no nausea.
N = Nausea, N^C = No Nausea, F = Front, M = Middle, B = Back
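
The totals in the bus table can be recomputed from the raw counts. The following is a minimal Python sketch; the dictionary layout and the variable names (`counts`, `row_totals`, `col_totals`, `grand_total`) are illustrative assumptions, not part of the original slides.

```python
# Counts from the bus study table.
# Outer keys: nausea status; inner keys: seat position.
counts = {
    "Nausea": {"Front": 58, "Middle": 166, "Back": 193},
    "No Nausea": {"Front": 870, "Middle": 1163, "Back": 806},
}
seats = ["Front", "Middle", "Back"]

# Row totals: number of people with / without nausea.
row_totals = {status: sum(cells.values()) for status, cells in counts.items()}

# Column totals: number of people in each seat position.
col_totals = {seat: sum(counts[status][seat] for status in counts) for seat in seats}

# Grand total: everyone in the study.
grand_total = sum(row_totals.values())

print("Row totals:   ", row_totals)
print("Column totals:", col_totals)
print("Grand total:  ", grand_total)
```

Dividing any cell or total by `grand_total` gives the corresponding proportion of people in the study.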
Bus example
The probability that an individual in the study gets nausea is
P(N) = 417/3256 = 0.128.

                 Seat Position in Bus
              Front   Middle   Back   Total
Nausea           58      166    193     417
No Nausea       870     1163    806    2839
Total           928     1329    999    3256

N = Nausea, N^C = No Nausea, F = Front, M = Middle, B = Back

Bus example
Other probabilities are easily calculated by dividing all numbers in the table by 3256:

                 Seat Position in Bus
              Front   Middle   Back   Total
Nausea        0.018    0.051  0.059   0.128
No Nausea     0.267    0.357  0.248   0.872
Total         0.285    0.408  0.307   1.000

For example, P(N and F) = 0.018, P(M and N^C) = 0.357, P(F) = 0.285, and P(N) = 0.128.

Conditional probability
Let E and F be two events with P(F) > 0. The conditional probability of E given F is
P(E|F) = P(E ∩ F)/P(F).

Bus example
Using the table of counts, the probability of getting nausea for someone who sat in the front of the bus is
P(Nausea|Front) = 58/928 = 0.0625.

Independence
Two events E and F are said to be independent if P(E|F) = P(E) and dependent if P(E|F) ≠ P(E).
If P(E|F) = P(E) then it is also true that P(F|E) = P(F).
Another definition of independence between E and F is that P(E ∩ F) = P(E)P(F).

Bus example
P(Nausea|Front) = (58/3256)/(928/3256) = 0.0625
P(Nausea) = 417/3256 = 0.128
Since P(Nausea|Front) ≠ P(Nausea), the events Nausea and Front are dependent.

Sampling schemes
Sampling is with replacement if, once selected, an individual or object is put back into the population before the next selection.
Sampling is without replacement if, once selected, an individual or object is not returned to the population prior to subsequent selections.
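
Returning to the bus example, the conditional probability and the independence check can be reproduced numerically. This is a minimal Python sketch under the assumption that the counts are entered by hand; the variable names are illustrative, and the tolerance passed to `math.isclose` is an arbitrary choice.

```python
import math

# Counts taken from the bus study table.
total = 3256             # everyone in the study
front = 928              # sat in the front
nausea = 417             # had nausea
nausea_and_front = 58    # had nausea AND sat in the front

# Conditional probability: P(N | F) = P(N and F) / P(F).
p_front = front / total
p_nausea_and_front = nausea_and_front / total
p_nausea_given_front = p_nausea_and_front / p_front

# Unconditional probability P(N).
p_nausea = nausea / total

# Independence would require P(N | F) = P(N),
# or equivalently P(N and F) = P(N) * P(F).
independent = math.isclose(p_nausea_given_front, p_nausea, rel_tol=1e-9)

print(f"P(Nausea | Front) = {p_nausea_given_front:.4f}")
print(f"P(Nausea)         = {p_nausea:.3f}")
print(f"P(N and F)        = {p_nausea_and_front:.4f}")
print(f"P(N) * P(F)       = {p_nausea * p_front:.4f}")
print("Independent?", independent)
```

Both forms of the check disagree here, which is what it means for nausea and sitting in the front to be dependent in this data set.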

Example: Cards
Suppose we are going to select three cards from an ordinary deck of cards. Consider the events:
E1 = event that the first card is a king
E2 = event that the second card is a king
E3 = event that the third card is a king.

Cards – with replacement
If we select the first card and then place it back in the deck before we select the second, and so on, the sampling will be with replacement and the events are independent.
P(E1) = P(E2) = P(E3) = 4/52
Under this sampling scheme the probability of getting three kings in a row is:
P(E1 ∩ E2 ∩ E3) = P(E1)P(E2)P(E3) (by independence) = (4/52)(4/52)(4/52) = 0.000455

Cards – without replacement
If we select the cards in the usual manner without replacing them in the deck, the sampling will be without replacement and the events are not independent.
P(E1 ∩ E2 ∩ E3) = P(E1) P(E2|E1) P(E3|E1 ∩ E2) = (4/52)(3/51)(2/50) = 0.000181
where P(E1) = P(king on first draw), P(E2|E1) = P(king on second draw | king on first draw), and P(E3|E1 ∩ E2) = P(king on third draw | king on first 2 draws).
Comparing chances: As expected, the probability of getting 3 kings in a row is higher if you sample with replacement (0.000455) than if you do not (0.000181); 0.000455/0.000181 = 2.5 times higher.

Addition rule for two events
For any two events E and F,
P(E ∪ F) = P(E) + P(F) – P(E ∩ F)

Bus example
Calculate the proportion of people who either sat in the back of the bus or got nausea.

                 Seat Position in Bus
              Front   Middle   Back   Total
Nausea           58      166    193     417
No Nausea       870     1163    806    2839
Total           928     1329    999    3256

N = Nausea, N^C = No Nausea, F = Front, M = Middle, B = Back

P(B ∪ N) = P(B) + P(N) – P(B ∩ N) = 999/3256 + 417/3256 – 193/3256 = 1223/3256 = 0.376.
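
The card probabilities and the addition-rule calculation can be checked with exact arithmetic. The following is a minimal Python sketch using the standard-library `fractions` module; the variable names are illustrative assumptions.

```python
from fractions import Fraction

# Three kings in a row, sampling WITH replacement:
# the draws are independent, so the probabilities multiply directly.
p_with = Fraction(4, 52) ** 3

# Three kings in a row, sampling WITHOUT replacement:
# each factor is conditional on the earlier draws being kings.
p_without = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)

print(f"With replacement:    {float(p_with):.6f}")
print(f"Without replacement: {float(p_without):.6f}")
print(f"Ratio:               {float(p_with / p_without):.2f}")

# Addition rule for the bus example:
# P(B or N) = P(B) + P(N) - P(B and N).
total = 3256
p_back = Fraction(999, total)
p_nausea = Fraction(417, total)
p_back_and_nausea = Fraction(193, total)
p_back_or_nausea = p_back + p_nausea - p_back_and_nausea

print(f"P(B or N) = {p_back_or_nausea} = {float(p_back_or_nausea):.3f}")
```

Using `Fraction` avoids rounding until the final display step, so intermediate values such as the numerator of P(B ∪ N) can be read off exactly.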

Multiplication rule
For any two events E and F,
P(E ∩ F) = P(E|F)P(F)
From symmetry we also have
P(E ∩ F) = P(F|E)P(E)

Law of total probability
If B1 and B2 are disjoint events with P(B1) + P(B2) = 1, then for any event E:
P(E) = P(E ∩ B1) + P(E ∩ B2)
     = P(E|B1)P(B1) + P(E|B2)P(B2)   (using the multiplication rule)

Example: Secretaries
Question: 18% of all employees in a large company are secretaries and furthermore, 35% of the secretaries are male. If an employee from this company is randomly selected, what is the probability the employee will be a secretary and also male?
Solution: M = event that the employee is male; S = event that the employee is a secretary. Then P(S) = 0.18, P(M|S) = 0.35, and we need P(S ∩ M).
P(S ∩ M) = P(M|S)P(S) = (0.35)(0.18) = 0.063.
6.3% of the company’s employees are male secretaries.

Bayes rule
If B1 and B2 are disjoint events with P(B1) + P(B2) = 1, then for any event E
P(B1|E) = P(E|B1)P(B1) / [P(E|B1)P(B1) + P(E|B2)P(B2)]
Bayes rule is good for getting one conditional probability when only others are available.
Proof:
P(B1|E) = P(B1 ∩ E) / P(E)   (definition of conditional probability)
        = P(E|B1)P(B1) / P(E)   (multiplication rule)
        = P(E|B1)P(B1) / [P(E|B1)P(B1) + P(E|B2)P(B2)]   (law of total probability)

Bayes rule
More generally, if B1, B2, ..., Bk are disjoint events with P(B1) + P(B2) + ... + P(Bk) = 1, then for any event E
P(Bi|E) = P(E|Bi)P(Bi) / [P(E|B1)P(B1) + P(E|B2)P(B2) + ... + P(E|Bk)P(Bk)]
for any of the events Bi for i = 1, ..., k.

Example: Radio switches
A company that makes radios uses three different suppliers (A, B and C) to supply on‐switches for the radio. 50% of the switches come from supplier A, 35% from supplier B and 15% from supplier C.
It is known that 1% of the switches from supplier A are defective, 2% from supplier B are defective, and 5% from supplier C are defective.
Question: If a radio from this company had a defective on‐switch, what is the probability that the switch came from supplier A?

Radio switches
First, define the events, including what you need to calculate.
A = event that the on‐switch came from supplier A
B = event that the on‐switch came from supplier B
C = event that the on‐switch came from supplier C
D = event the on‐switch was defective
The question asks for P(A|D).
We are given that P(A) = 0.5, P(B) = 0.35, P(C) = 0.15, P(D|A) = 0.01, P(D|B) = 0.02, P(D|C) = 0.05, but not P(A|D). That means that we need to use Bayes rule.

Radio switches
P(A|D) = P(D|A)P(A) / [P(D|A)P(A) + P(D|B)P(B) + P(D|C)P(C)]
       = (.01)(.5) / [(.01)(.5) + (.02)(.35) + (.05)(.15)]
       = .256
Conclusion: If the switch was defective, there is a 25.6% chance it came from supplier A.

Tutorial 4: Disease testing
Only 0.1% of individuals in a certain population have a particular disease. There exists a diagnostic test for the disease. Of those who have the disease 95% test positive, of those who do not have the disease 90% test negative.
1.) Construct a tree diagram for an individual who will take the test, starting with the branches disease versus not disease; write the appropriate probabilities on the branches.
2.) Calculate the probability of having the disease and testing positive and the probability of not having the disease and testing positive. Which is higher? Why? Would you have expected this?
3.) Calculate the probability of testing positive.
4.) The sensitivity of a test is defined as the probability of testing positive for individuals who have the disease; what is the sensitivity of this test?
5.) The specificity of a test is defined as the probability of testing negative for individuals who do not have the disease; what is the specificity of this test?
6.) For an individual who tests positive, calculate the probability that they actually have the disease. This quantity is called the positive predictive value. Is the result surprising? Give an intuitive explanation for the result.
7.) Calculate the negative predictive value of the test, which is the probability of not having the disease given that the test is negative.
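
The Bayes rule calculation from the radio switches example can be sketched in Python as follows. The helper `bayes_posterior` and the dictionary names are hypothetical, introduced only for illustration; the sketch assumes the events in `priors` are disjoint and their probabilities sum to 1, as the general form of Bayes rule requires.

```python
def bayes_posterior(priors, likelihoods, target):
    """Return P(target | E) using Bayes rule.

    priors:      P(Bi) for each disjoint event Bi (assumed to sum to 1)
    likelihoods: P(E | Bi) for each Bi
    target:      the key of the event Bi whose posterior is wanted
    """
    # Law of total probability: P(E) = sum of P(E | Bi) * P(Bi).
    p_event = sum(likelihoods[b] * priors[b] for b in priors)
    # Bayes rule: P(Bi | E) = P(E | Bi) * P(Bi) / P(E).
    return likelihoods[target] * priors[target] / p_event


# Radio switches example: supplier shares and defect rates.
priors = {"A": 0.50, "B": 0.35, "C": 0.15}        # P(A), P(B), P(C)
defect_rates = {"A": 0.01, "B": 0.02, "C": 0.05}  # P(D|A), P(D|B), P(D|C)

p_a_given_d = bayes_posterior(priors, defect_rates, "A")
print(f"P(A | D) = {p_a_given_d:.3f}")

# The posteriors over all suppliers should sum to 1.
posteriors = {s: bayes_posterior(priors, defect_rates, s) for s in priors}
print(posteriors)
```

The same hypothetical helper could be applied to the disease testing tutorial by supplying the disease prevalence as the priors and the test's positive rates as the likelihoods.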