Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
10. General rules of probability The Practice of Statistics in the Life Sciences Third Edition © 2014 W.H. Freeman and Company Objectives (PSLS Chapter 10) General rules of probability Independent events Conditional probability Multiplication rule Tree diagrams Diagnosis tests Independent events Two events are independent if knowing that one event is true or has happened does not change the probability of the other event. “male” and “get head when flipping a coin” independent “male” and “taller than 6 ft” not independent “male” and “high cholesterol” it’s not obvious (I’m guessing independent) “male” and “pregnant” not independent Sampling without replacement Pick one frog at random from your target population and don’t put it back. Then pick another frog, etc. Artificial pond with 10 male and 10 female frogs. P(1st frog is male) = 10/20 = 0.5. If 1st frog is male, P(2nd frog is male) = 9/19 = 0.47 (≠ 0.5). Here, successive picks are not independent. Survey of a whole county with thousands of frogs (half males, half females). P(1st frog is male) = 0.5. If 1st frog is male, P(2nd frog is male) ≈ 0.5. Here, successive picks are “nearly” independent. Conditional probability Conditional probabilities reflect how the probability of an event can be different if we know that some other event has occurred or is true. The conditional probability of event B, given event A is: (provided that P(A) ≠ 0) P ( A and B ) P ( B | A) P ( A) When two events A and B are independent, P(B | A) = P(B). No information is gained from the knowledge of event A. Probabilities of hearing impairment and blue eyes among Dalmatian dogs. HI = Dalmatian is hearing impaired B = Dalmatian is blue eyed Neither HI nor B 0.66 HI and not B 0.23 P(HI and B) = .05 P(HI) = .28 B and not HI 0.06 HI and B 0.05 P(B) = .11 P(HI | B) = P(HI and B) / P(B) = .05/.11≈ .45 P(HI | B) ≠ P(HI ), therefore HI and B are not independent. Large-scale surveys indicate that 11% of the population smokes. Medical researchers know that the probability that a smoker will get lung cancer is 0.34. The probability that a person will get lung cancer if the person doesn’t smoke is 0.03. The probability P(Lung cancer | Smoker) is A) 0.03 B) 0.0394 C) 0.34 D) 0.583 E) 0.97 Are “Smoker” and “Lung cancer” independent? A) Yes B) No Large-scale surveys indicate that 11% of the population smokes. Medical researchers know that the probability that a smoker will get lung cancer is 0.34. The probability that a person will get lung cancer if the person doesn’t smoke is 0.03. Lung cancer Non-smoker and no Lung cancer Smoker 0.11 Multiplication rule General multiplication rule: The probability that any two events, A and B, both occur is: P(A and B) = P(A)P(B|A) Multiplication rule for independent events: If A and B are independent, then: P(A and B) = P(A)P(B) Artificial pond with 10 male and 10 female frogs Successive captures are not independent. Probability of randomly capturing 2 male frogs in a row: P(male and male) = P(1st is male) P(2nd is male | 1st is male) = (10/20)(9/19) ≈ 0.237 Blood donation center Unrelated visitors are independent. Probability that the next two unrelated visitors are both type O: P(O and O) = P(O) * P(O) = (.45)(.45) ≈ .2025 Tree diagrams Tree diagrams are used to represent probabilities graphically and facilitate computations. Probabilities of skin cancer among men and women by body locations. 0.44 Man Head 0.15 0.41 In a random individual with skin cancer: P(head) = 0.15 0.56 Woman 0.63 Man 0.37 Woman 0.20 Man P(trunk) = 0.41 P(limbs) = 0.44 Trunk 0.44 % in each group who are women: P(woman | head) = 0.56 P(woman | trunk) = 0.37 Limbs 0.80 Woman P(woman | limbs) = 0.80 Diagnostic tests and screening Unknown medical status Known test result Test sensitivity P(+|disease) Rate of disease in target population P(disease) Positive True positive Negative False negative Positive False positive Disease Individual No disease If a person gets a positive test result, what is the probability that he/she actually has the disease? This is the positive predictive value: PPV = P(disease | positive test) Negative Test specificity P(–|nodisease) True negative Diagnosis tests P( pos | dis ) # individuals with disease total # individuals Disease prevalence Diagnosis sensitivity Positive True positive Negative False negative Positive False positive Negative True negative Disease Individual No disease If a person gets a positive test result, what is the probability that he/she actually has the disease? This is the positive predictive value: PPV = P(disease | positive test) Diagnosis specificity P(neg | no dis) HIV-AIDS: Suppose that about 1% of a large population has HIV-AIDS antibodies. The enzyme immuno-assay test has sensitivity .9985 and specificity .9940. Diagnosis sensitivity .9985 Disease incidence Positive True positive Negative False negative Positive False positive Negative True negative HIV antibodies .01 .0015 .99 .006 Random adult taking the test No HIV antibodies Incidence of HIV-AIDS in the general population .994 Diagnosis specificity HIV-test performance If a person who takes this test gets a positive test result, what is the probability that he or she actually has HIV-AIDS, P(HIV-AIDS | positive test)? What’s the positive Disease incidence predictive value of the enzyme immuno-assay .01 .99 Incidence of HIVAIDS in general population Positive HIV antibodies Negative False negative .0060 Positive False positive .9940 Negative .0015 Random adult taking the test test for HIV-AIDS? PPV = P(disease | positive) Diagnosis sensitivity .9985 No HIV antibodies Diagnosis specificity HIV-test performance P(true positive) = P(disease and positive) = P(disease)P(positive | disease) = (.01)(.9985) = .09985 P(false positive) = P(no disease and positive) = P(no disease)P(positive | no disease) = (.99)(.006) = .00594 PPV = P(disease | positive) = P(disease and positive) / P(positive) = P(true positive) / P(all positives) = .09985 / (.09985 + .00594) ≈ 62.7% What is the rate of prostate cancer among mature men? P(cancer)= ? How “sensitive” is the PSA test? P(+ | cancer)= ? How “specific” is the PSA test? P(– | no cancer)= ? If a man gets an elevated PSA test (+), what is the probability that he has prostate cancer? PPV = P(cancer | +)= ?