Probabilities and Frequency Ratios

[...] on the basis of new data can be significant.

[...] One has two black and white faces. [...] I put them [...] so you cannot see it. What is the probability [...]? Most people would [...]

[...] the outcome of 8 rolls is [...] and we seek P(W|A). [...] predominantly white one [...] probability that a white face ap[...] white), and the probability [...] to regard the 8 rolls as [...], and the event B (denoting [...] mutually exclusive and cover all [...] P(A) = P(AW or AB) = [...] analogous reasoning we can find [...]

P(AB) = P(A|B)P(B), and P(A|B) = (1/3)(1/3)(2/3)(1/3)(1/3)(1/3)(2/3)(1/3) = (1/3)^6 (2/3)^2 = 4/6561. Thus, P(AB) = (4/6561)(1/2) = 2/6561. And P(A) = 32/6561 + 2/6561 = 34/6561. Finally, we obtain

P(W|A) = P(AW) / P(A) = (32/6561) / (34/6561) = 32/34 = 0.941.

Bayes' theorem

The method used in Example 2-4 is an application of Bayes' theorem, named after the Reverend Thomas Bayes (1702-1761). Suppose the events B_1, B_2, ..., B_k constitute a partition of the sample space S. In other words, B_1, B_2, ..., B_k are mutually exclusive events which together cover all of S. Probabilities have been assigned to each event in the partition. Now, suppose event A occurs. How does the information that A has occurred affect the probabilities of B_1, B_2, ..., B_k? We need to find the conditional probabilities P(B_i|A), where i = 1, 2, ..., k. By definition, the conditional probability of event B_i given A is

P(B_i|A) = P(B_i and A) / P(A).

The probability that B_i and A occur simultaneously is equal to the probability that A occurs given B_i times the probability of B_i. That is,

P(B_i and A) = P(A|B_i)P(B_i).

Since B_1, B_2, ..., B_k form a partition of S, event (B_1 or B_2 or ... or B_k) is equivalent to S. When A occurs, one and only one of the events in the partition must occur, so

P(A) = P(B_1 and A) + P(B_2 and A) + ... + P(B_k and A).

As was seen earlier, the joint probability P(B_i and A) = P(A|B_i)P(B_i). So

P(A) = P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + ... + P(A|B_k)P(B_k).

Substituting in the definitional equation for the conditional probability, we obtain

P(B_i|A) = P(A|B_i)P(B_i) / [P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + ... + P(A|B_k)P(B_k)].

This is Bayes' theorem.
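The derivation above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own method: the function name `bayes_posterior` is hypothetical, and the inputs for the two-dice example (each die chosen with probability 1/2, white-face probabilities 2/3 and 1/3) are assumptions inferred from the surviving figures 32/6561 and 34/6561 rather than stated in the recoverable text.

```python
from fractions import Fraction


def bayes_posterior(priors, likelihoods):
    """Return P(B_i | A) for each event B_i in a partition.

    priors[i]      is P(B_i).
    likelihoods[i] is P(A | B_i).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Joint probabilities P(B_i and A) = P(A | B_i) P(B_i).
    joints = [p * l for p, l in zip(priors, likelihoods)]
    # Total probability P(A) is the sum of the joints over the partition.
    p_a = sum(joints)
    if p_a == 0:
        raise ValueError("P(A) is zero; the posterior is undefined")
    return [j / p_a for j in joints]


# Assumed inputs: die W shows white with probability 2/3, die B with 1/3,
# each die is picked with probability 1/2, and A is a fixed sequence of
# 8 rolls containing 6 white and 2 black faces.
third = Fraction(1, 3)
priors = [Fraction(1, 2), Fraction(1, 2)]
likelihoods = [(2 * third) ** 6 * third ** 2, third ** 6 * (2 * third) ** 2]
posterior_w, posterior_b = bayes_posterior(priors, likelihoods)
print(posterior_w, float(posterior_w))
```

Exact fractions are used so that the result can be compared directly with the ratio 32/34 in the text; floating-point inputs would work equally well with the same function.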
24 Probability and Statistical Inference

This theorem is basic to the approach to statistical and decision problems known as Bayesian statistics. But what is distinctive (and controversial) in the Bayesian approach is the subjective assignment of probabilities to nonrecurring events. Bayes' theorem is used to calculate how new information modifies these probabilities.

[Table (title truncated: "Annual Death Rat..."). Rows: Cancer of lung; Emphysema; Cirrhosis of the liver; Cancer of rectum; Influenza and pneumonia; All other causes; Totals. Values not recoverable.]

Basic definitions and rules for the calculation of probabilities have been presented and applied through examples. The organization of data to provide a reasonable test of a probabilistic hypothesis has also been demonstrated.

PROBLEMS

2.1. Consider an ordinary deck of playing cards. What is the probability of drawing, in a random drawing of a single card,
a) a spade
b) an ace
c) a face card (jack, queen, or king) or a diamond?

2.2. Consult Table 2-1.
a) What percentage of males 16 years old and over was in the labor force in May 1970? This is known as a labor force participation rate.
b) What was the probability that a male teen-ager (16-19) not in the labor force in May 1970 was going to school?
c) What was the labor force participation rate for male teen-agers (16-19) in May 1970?
d) What was the labor force participation rate for white male teen-agers (16-19) in May 1970? Was labor force participation of male teen-agers independent of race?

2.3. A randomly chosen group of 383,000 persons has been observed for a year. Of the group, 249,000 are cigarette smokers and 134,000 are nonsmokers. Deaths during the year are reported in Table 2-3.
a) Estimate the probability that a randomly selected smoker will die of lung cancer within a year.
b) Estimate the probability that a randomly selected nonsmoker will die of lung cancer within a year.
c) Estimate the probability that a randomly selected person is a cigarette smoker.
d) Estimate the probability that a person who has died of lung cancer was a cigarette smoker.
e) Estimate the probability [...] of the liver was a cigarette smoker.
f) Estimate the probability [...] was a cigarette smoker.
g) Is the event that a [...] independent of the [...]

2.4. A coin is biased so [...] If the coin is tossed [...] exactly 2 tails?

2.5. Suppose two cards [...] placed side-by-side [...] of the second.
a) What is the [...]
b) What is the [...]
c) What is the [...]
d) What is the [...]
e) If the first card is [...] the probability [...]

2.6. Find the probability [...] dice is 2, 3, 4, ..., [...] What is the probability [...]

2.7. Here are the rules [...] dice. Only the total [...] of 7 or 11 on [...] is 2, 3, or 12, he [...] first throw is a [...]
b) Find the probability that he wins on his first throw.
c) Find the probability that he wins given that his point is 10.
d) Find the probability that his first throw is 6 and he wins.
e) Find the probability that he wins.

2.8. Two balls are to be drawn from an urn containing 5 red and 3 black balls.
a) What is the probability that both balls will be red?
b) What is the probability that the first ball will be red and the second ball black?
c) What is the probability that one ball will be red and the other black?
d) If the first ball is replaced before the second drawing, what is the probability that both balls drawn will be red?

2.10. For a recent year only 15 percent of couples applying for divorce had three or more children. Is this evidence to support the contention that "children hold marriages together"? If so, explain the evidence. If not, explain why not and describe some data that might give better evidence.

2.11. Suppose Mr. Jones chooses at random one of the integers 1, 2, or 3. Then he throws as many dice as indicated by the chosen number. What is the probability that he will score a total of 4 points?

2.12. A fair die was thrown twice.
a) What is the probability that the first throw yielded a 5 given that the sum of the two throws was 6?
b) What is the probability that the sum of the two throws was 5 given that the first throw yielded an even number?

Probability has been in[...] the sample space of [...] outcomes may take [...] measurement of height [...] pair of dice; a comple[...] family, number of roo[...] education, occupation a[...] these examples suggest, [...] or include numerical in[...] the outcome in a nume[...] is male, 0 if female. It is also worthwhi[...] may be described in m[...] interest. For instance, [...]tion on family income, [...] then interviewing the p[...] be considered to be the [...] of the outcome might in[...] the person's name. Info[...]ployment status might b[...] code. Similarly, the ou[...] full box score, by the fi[...] done in Example 2-3), [...] a loss).

Suppose a test for a rare blood disease is known to be 95 percent reliable. In other words, if an individual has the disease, the test shows it with probability 0.95; if an individual does not have the disease, the test correctly shows it absent 95 percent of the time. Suppose the test is given to a large population, 1 percent of whom have the disease. What fraction of those whom the test shows to have the disease will actually have it?

Answer: 0.0095 / 0.0590 (about 0.161)

B1 = they have the disease
B2 = healthy - they don't have the disease
A = test positive

P(B1) = 0.01
P(B2) = 0.99
P(A/B1) = 0.95
P(A/B2) = 0.05

P(B1/A) = P(A and B1) / P(A)
P(A and B1) = P(A/B1)P(B1) = 0.95 * 0.01 = 0.0095
P(A) = P(A and B1) + P(A and B2) = P(A/B1)P(B1) + P(A/B2)P(B2) = (0.95)*(0.01) + (0.05)*(0.99) = 0.059
P(B1/A) = P(A and B1) / P(A) = 0.0095 / 0.059 (about 0.161)

The conditional probability of A given B is defined whenever P(B) > 0 and is

P(A|B) = P(A ∩ B) / P(B). (2.1)

Similarly, if P(A) > 0, then

P(B|A) = P(A ∩ B) / P(A). (2.2)

Rearranging (2.1) and (2.2) gives the very useful multiplicative laws:

P(A ∩ B) = P(A|B)P(B) = P(B|A)P(A). (2.3)

2.2.1 Independence

The events A_1, ..., A_n are independent if for any 1 ≤ i_1 < ... < i_k ≤ n,

P{A_{i_1} ∩ ... ∩ A_{i_k}} = P{A_{i_1}} ... P{A_{i_k}}.

2.2.2 Bayes' law

Suppose that B_1, ..., B_K is a partition of S, meaning that B_i ∩ B_j = ∅ if i ≠ j and B_1 ∪ B_2 ∪ ... ∪ B_K = S. Then for any set A, we have that A = (A ∩ B_1) ∪ ... ∪ (A ∩ B_K), and therefore

P(A) = P(A ∩ B_1) + ... + P(A ∩ B_K). (2.4)

It follows from (2.2) through (2.4) that

P(B_j|A) = P(A|B_j)P(B_j) / P(A) = P(A|B_j)P(B_j) / [P(A|B_1)P(B_1) + ... + P(A|B_K)P(B_K)]. (2.5)

Equation (2.5) is called Bayes' law, also known as Bayes' rule or Bayes' theorem. Bayes' law is a simple, almost trivial, mathematical result, but its implications are profound. In fact, there is an entire branch of statistics, called Bayesian statistics, that is based upon Bayes' law and is now playing a very wide role in applied statistics. The importance of Bayes' law comes from its usefulness when updating probabilities. Here is an example, one that is too simple to be realistic but that illustrates the basic idea behind applying Bayes' law.

Example 2.1. Suppose that our prior knowledge about a stock indicates that the probability the price will rise on any given day, which we denote by θ, is either 0.4 or 0.6. Based upon past data, say from similar stocks, we believe that θ is equally likely to be 0.4 or 0.6. Thus, we have the prior probabilities P(θ = 0.4) = 0.5 and P(θ = 0.6) = 0.5.

We observe the stock for three consecutive days and its price rises on all three days. Assume that the price changes are independent across days so that the probability that the price rises on each of three consecutive days is θ^3. Given this further information, we may suspect that θ = 0.6, not 0.4. Therefore the probability that θ is 0.6, given three consecutive price increases, should be greater than the prior probability of 0.5, but how much greater? As notation, let A be the event that the price rises on three consecutive days. Then, using Bayes' law we have

P(θ = 0.6|A) = P(A|θ = 0.6)P(θ = 0.6) / [P(A|θ = 0.6)P(θ = 0.6) + P(A|θ = 0.4)P(θ = 0.4)]
= (0.6)^3(0.5) / [(0.6)^3(0.5) + (0.4)^3(0.5)]
= (0.6)^3 / [(0.6)^3 + (0.4)^3]
= 0.2160 / (0.2160 + 0.0640)
= 0.7714.

Thus, our probability that θ is 0.6 was 0.5 before we observed three consecutive price increases but is 0.7714 after observing this event. Probabilities before observing data are called the prior probabilities and the probabilities conditional on observed data are called the posterior probabilities, so the prior probability that θ equals 0.6 is 0.5 and the posterior probability is 0.7714.

Bayes' law is so important because it tells us exactly how to update our beliefs in light of new information. Revising beliefs after receiving additional information is something that humans do poorly without the help of mathematics.[2] There is a human tendency to put either too little or too much emphasis on new information, but this problem can be mitigated by using Bayes' law for guidance.

[2] See Edwards (1982).

2.3 Probability [...]

2.3.1 Random variab[...]

A quantity such as the [...] many possible values, [...] such quantities random variable[...] and the proba[...] probability distribution [...]
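Returning to Example 2.1, the posterior calculation can be checked numerically. The sketch below is only an illustration: the function name `posterior_theta` and its parameters are hypothetical, and it assumes, as the example does, that price changes are independent across days so that the likelihood of n consecutive rises is θ^n.

```python
def posterior_theta(n_up_days, theta_values=(0.4, 0.6), prior=(0.5, 0.5)):
    """Posterior probability of each candidate theta after n consecutive rises.

    Assumes independent days, so P(A | theta) = theta ** n_up_days.
    """
    # Joint probabilities P(A | theta) * P(theta) for each candidate value.
    joints = [p * theta ** n_up_days for theta, p in zip(theta_values, prior)]
    # P(A) by the law of total probability, equation (2.4).
    total = sum(joints)
    # Bayes' law, equation (2.5).
    return {theta: j / total for theta, j in zip(theta_values, joints)}


post = posterior_theta(3)
print(round(post[0.6], 4))  # 0.7714, matching the value in Example 2.1

# The same function shows how the posterior moves with more data,
# for instance after five consecutive rises.
print(posterior_theta(5))
```

Calling the function with a different number of days or a different prior shows how quickly the posterior moves away from 0.5 as evidence accumulates.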
EXERCISES

Applications

6.25 Worked. Objective: To explore the implications of conditional probability on the interpretation of test results, especially for rare events. Tests for diseases such as AIDS involve errors. There are false positives and false negatives.
With false positives, the test indicates that an individual has AIDS when he or she does not, and with false negatives, the test indicates that a tested individual does not have AIDS when in fact he or she does. Let us consider the consequences of the first problem, false positives, when we contemplate testing a large class of people, or even the entire population. Suppose that 100 million people are tested.

What information have we been given? pr(test+|NoAIDS) = .05, pr(test−|AIDS) = .01, and pr(AIDS) = .005. So, in a population of 100 million, we are looking for 500,000 people. The test is presumed to be quite accurate in that there is only a 5% error for false positives and a 1% error for false negatives.

What do we want to know? What is the probability of having AIDS, given that a person has tested positive? We will use our concepts of conditional probability extensively in answering this question. We want pr(AIDS|test+). We derive

pr(AIDS|test+) = pr(AIDS, test+) / pr(test+)

And we can break up pr(test+) into its constituent states by

pr(test+) = pr(AIDS, test+) + pr(NoAIDS, test+)

where we recognize that the states AIDS and NoAIDS partition the state "test+." We can test positive if we have AIDS and if we do not have AIDS, and these two states are mutually exclusive. If we can work out these last two joint probabilities given our information, we will have the problem solved. We define

pr(AIDS, test+) = pr(test+|AIDS) pr(AIDS)
pr(NoAIDS, test+) = pr(test+|NoAIDS) pr(NoAIDS)

We have been given the information we need; pr(AIDS) is .005, so that pr(NoAIDS) is .995. We know that pr(test+|AIDS) is .99, because pr(test−|AIDS) is .01. It is also given that pr(test+|NoAIDS) is .05.
We can now calculate that

pr(AIDS, test+) = .99 × .005 = .00495
pr(NoAIDS, test+) = .05 × .995 = .04975

We can now solve our original problem:

pr(AIDS|test+) = .00495 / (.00495 + .04975) = .0905

This result, that the probability that one has AIDS given a positive reading is only about 10%, may be surprising at first sight, but if you experiment with the calculations a bit you will begin to see the logic of the situation. The bigger the probability of AIDS in the first place, the bigger the probability given the test. So the surprising nature of the result is due to the relatively low probability of AIDS in the whole population.

6.26 Given the facts stated in Exercise 6.25, determine the probability of not having AIDS, given that you have tested negative. Rework both exercises if the probability of AIDS in the population is .2. Draw some policy conclusions about testing schemes from these calculations.

Exr 6.25 page 209

Total population: 100,000,000, of whom 500,000 have AIDS and 99,500,000 do not.

B1 = they have the disease (AIDS)
B2 = healthy - they don't have AIDS
A = test positive

P(B1) = 0.005
P(B2) = 0.995
P(A/B2) = 0.05
P(A/B1) = 0.99 [P(negative/B1) = 0.01]

FIND P(AIDS/TEST POSITIVE):

P(B1/A) = P(A and B1) / P(A)
P(A and B1) = P(A/B1)P(B1) = 0.99 * 0.005 = 0.00495
P(A) = P(A and B1) + P(A and B2) = P(A/B1)P(B1) + P(A/B2)P(B2) = (0.99)*(0.005) + (0.05)*(0.995) = 0.00495 + 0.04975 = 0.0547
P(B1/A) = P(A and B1) / P(A) = 0.00495 / 0.0547 = 0.0904936

Exr 6.26 page 209

Total population: 100,000,000, of whom 500,000 have AIDS and 99,500,000 do not.

B1 = they have the disease (AIDS)
B2 = healthy - they don't have AIDS
N = test negative

P(B1) = 0.005
P(B2) = 0.995
P(N/B2) = 0.95 [P(negative/B2) = 0.95]
P(N/B1) = 0.01 [P(negative/B1) = 0.01]

FIND P(NO AIDS/TEST NEGATIVE):

P(B2/N) = P(N and B2) / P(N)
P(N and B2) = P(N/B2)P(B2) = 0.95 * 0.995 = 0.94525
P(N) = P(N and B1) + P(N and B2) = P(N/B1)P(B1) + P(N/B2)P(B2) = (0.01)*(0.005) + (0.95)*(0.995) = 0.00005 + 0.94525 = 0.9453
P(B2/N) = P(N and B2) / P(N) = 0.94525 / 0.9453 = 0.9999471
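The two worked solutions follow the same pattern, so they can be reproduced, and reworked for another prevalence as Exercise 6.26 asks, with a short Python sketch. The function name `screening_posteriors` and its parameter names are hypothetical, chosen here for illustration; the inputs are the error rates and prevalence given in Exercise 6.25.

```python
def screening_posteriors(prevalence, false_positive_rate, false_negative_rate):
    """Return (P(disease | test+), P(no disease | test-)) via Bayes' theorem."""
    sensitivity = 1.0 - false_negative_rate  # P(test+ | disease)
    specificity = 1.0 - false_positive_rate  # P(test- | no disease)
    healthy = 1.0 - prevalence               # P(no disease)

    # Total probabilities of each test result over the two-state partition.
    p_pos = sensitivity * prevalence + false_positive_rate * healthy
    p_neg = false_negative_rate * prevalence + specificity * healthy

    p_disease_given_pos = sensitivity * prevalence / p_pos
    p_healthy_given_neg = specificity * healthy / p_neg
    return p_disease_given_pos, p_healthy_given_neg


# Exercise 6.25/6.26 inputs, then the reworked case with prevalence .2.
for prevalence in (0.005, 0.2):
    pos, neg = screening_posteriors(prevalence, 0.05, 0.01)
    print(f"pr(AIDS) = {prevalence}: "
          f"pr(AIDS|test+) = {pos:.4f}, pr(NoAIDS|test-) = {neg:.7f}")
```

Comparing the two printed lines makes the point of the exercise concrete: with the error rates held fixed, the probability of disease given a positive result depends strongly on how common the disease is in the tested population.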