* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Bayes` theorem
Survey
Document related concepts
Transcript
The end of the world and life in a computer simulation 1 The doomsday argument In the Leslie introduces familiar reasoning: Leslie begreading ins his for exptoday, osition of the argumaent withsort an of exa mple: This The reasbasic oningidea is, here fairlyis clea a we kind of reas one rly, which employ allonin the time in nour ordinary the world. It ht g upo whi ch we reasoning rely all ofabout the tim e. It mig be sum med like this might beup summed up :asiffollows: we have two theories, and the first makes a certain event much mor e likely than the other, and we observe that event, that should lead us to favor the first theory. Note that this doesn’t mean that The we principle should alw of ays confirmation think that the first theory is true; rather, what it means is that, whatever our initial estimate of the probab ility of the first theory, we should increase that estimat E eis upo evidence T1 if the probability of n obsfor ervi ngover theT2eve nt in question. E given T1 > the probability of E given T2. Now consider the application of that line of reasoning to the exa mples of balls shot at random from Here a lott mac hine. Sup e tha youtwokno w thabetween t the ball E ery is our evidence, and pos T1 and T2tare theories which aremac trying When s inwe the hintoe decide. are num bered sequenti ally (wi th no rep eats) ofbeg inning we talk about the probability X given Y, wit we h are1,talking about the don probability that X will take place, if re but tha t you ’t know how man y ball s the are inYthe hine. Now we start the machine, and a ball comes out with ‘3’ on alsomac happens. it. You’re now asked: do you think that it is more likely that the machine has 10 balls in it, or 10,000 balls in it? The line of reasoning sketched above seems to favor the hyp othesis that there are just 10 balls in the machine. (Note that, whichever hypothesis you endorse , you could be wrong; this is not a form of reasoning which delivers results guaranteed to be correct. The question is just which of these hypotheses is most likely, given the evidence.) The basic idea here is one which we employ all the time in our ordinary reasoning about the world. It might be summed up as follows: The principle of confirmation E is evidence for T1 over T2 if the probability of E given T1 > the probability of E given T2. Here E is our evidence, and T1 and T2 are two theories between which we are trying to decide. When we talk about the probability of X given Y, we are talking about the probability that X will take place, if Y also happens. Now consider the application of that line of reasoning to the examples of balls shot at random from a lottery machine. Suppose that you know that the balls in the machine are numbered sequentially (with no repeats) beginning with 1, but that you don't know how many balls there are in the machine. Now we start the machine, and a ball comes out with “3” on it. You're now asked: do you think that it is more likely that the machine has 10 balls in it, or 10,000 balls in it? The principle of confirmation suggests - correctly, it seems - that our piece of evidence favors the hypothesis that there are just 10 balls in the machine. This is because the probability that “3” comes out given that balls 1-10 are in the machine is 10%, whereas the probability that this ball comes out given that balls numbered 1-10,000 are in the machine is only 0.01%. (Note that, whichever hypothesis you endorse, you could be wrong; this is not a form of reasoning which delivers results guaranteed to be correct. The question is just which of these hypotheses is most likely, given the evidence.) This gives us some information about how to reason about probabilities, but not very much. It tells us when evidence favors one theory over another, but does not tell us how much. It leaves unanswered questions like: if before I thought that the 10,000 ball hypothesis was 90% likely to be true, should I now think that the 10 ball hypothesis is more than 50% likely to be true? If I assigned each of the two hypotheses prior to the emergence of the “3” ball a probability of 0.5 (50% likely to be true), what probabilities should I assign to the theories after the ball comes out? This gives us some information about how to reason about probabilities, but not very much. It tells us when evidence favors one theory over another, but does not tell us how much. It leaves unanswered questions like: if before I thought that the 10,000 ball hypothesis was 90% likely to be true, should I now think that the 10 ball hypothesis is more than 50% likely to be true? If I assigned each of the two hypotheses prior to the emergence of the “3” ball a probability of 0.5 (50% likely to be true), what probabilities should I assign to the theories after the ball comes out? 1.1 Bay To understand Leslie’s argument, we’ll have to understand how to answer these sorts of questions. In fact, we can do better than just saying that i One way to answer these questions employs a widely assign to one theory. We can, using a widely ac accepted rule of reasoning called “Bayes’ theorem,” much you should raise your probabilit namedsay after how Thomas Bayes, an 18th century English mathematician Presbyterian minister. widelyand accepted is that following it enables one To arrive the theorem, we begin with the following Toat arrive at Bayes’ theorem, we can begin w definition of conditional probability, where, as is probability’: probability of one standard, we abbreviate the “the conditional probability of xclaim, given claims and b, we can say that given y” as “Pr(xa| y)”: P (a|b) = P (a&b) P (b) This says, in effect,words, that the probability of a given of b isa given b is t In other the probability the chance that a and b are both true, divided by the the chances that b is true. For example, let a = chances that b is true. that each of Obama, Hilary, and McCain have probability is that Obama wins, given that a m that a man would win, you should then (given there is a 0.5 probability that Obama will win. In fact, we can do better than just saying that in One way to answer these questions employs a widely assign one theory. We can, using a widely acc accepted rule ofto reasoning called “Bayes’ theorem,” much you should raise your probability namedsay after how Thomas Bayes, an 18th century English mathematician Presbyterian minister. widelyand accepted is that following it enables one t To arrive at the theorem, we begin with the following To arrive at Bayes’ theorem, we can begin wit definition of conditional probability, where, as is probability’: probability of one standard, we abbreviate the “the conditional probability of xclaim, given t given y” as “Pr(xa| y)”: claims and b, we can say that P (a|b) = P (a&b) P (b) This says, in effect, that the probability of a given b is In other words, the probability the chance that a and b are both true, divided byof thea given b is the the chances thatchances b is true. that b is true. For example, let a = ‘O that each of Obama, Hilary, and McCain have a Let’s work through an example. Suppose that this is some time before the 2008 election, and let a = probability isofthat Obama wins, given that a man ‘Obama wins’, and let b = ‘a man wins.’ Suppose that you think that each Obama, Hilary, and McCain that a manthat would you that should have a 1/3 chance of winning. Then what is the conditional probability Obamawin, wins, given a manthen (given wins, using the above formula? there is a 0.5 probability that Obama will win. The conditional probability is that Obama wins, given that a man wins, is ½, since in this case Pr(a&b)=⅓ Using this definition of conditional probability, and Pr(b)=⅔. Intuitively, if you found out only that a man would win, you should then (given the initial 6= 0: probability assignments) think that there is a 0.5 probability that Obama will win. P (a&b) Using this definition of conditional probability, we can then argue Bayes’ = theorem as follows, assuming 1. for P (a|b) P (b) that Pr(b)≠0. P (a&b) 2. P (b|a) = P (a) 3. P (a|b) ⇤ P (b) = P (a&b) w To arrive at Bayes’ theorem, we can begin wit t To arrive at Bayes’ theorem, we can begin with the definition of what is called ‘conditional probability’: probability of one claim, given probability’: the probability of one claim, given that another the is true. In particular, for arbitrary claims a and b, we can say that claims a and b, we can say that Definition of conditional probability P (a|b) = P (a&b) P (b) P (a|b) = P (a&b) P (b) In other words, the probability of a given b is the chance that a and b are both true, divided by other wins’, words, of aSuppose given b is the the chances that b is true. For example, let a =In‘Obama andthe let probability b = ‘a man wins.’ that each of Obama, Hilary, and McCain have a 1/3 chance of winning. the conditional the chances that b probability, is true.Then For example, Using this definition of conditional we can then arguelet for a = ‘O probability is that Obama wins, givenBayes’ that a man is of 1/2. Intuitively, if you and foundMcCain out only have a theorem as follows, assuming that Pr(b)≠0. thatwins, each Obama, Hilary, that a man would win, you should then (given the initial probability assignments) think that probability is that Obama wins, given that a man there is a 0.5 probability that Obama will win. that a man would win, you should then (given Using this definition of conditional probability,there we can argue as follows, that assuming that P (b) win. isthen a 0.5 probability Obama will 6= 0: Derivation of Bayes’ theorem 1. P (a|b) = 2. 3. 4. 5. C. P (a&b) P (b) P (a&b) P (a) P (b|a) = P (a|b) ⇤ P (b) = P (a&b) P (a&b) = P (b|a) ⇤ P (a) P (a|b) ⇤ P (b) = P (b|a) ⇤ P (a) P (a) P (a|b) = P (b|a) P (b) Using this definition of conditional probability, w def. of conditional probability 6= 0: def. of conditional probability (1), multiplication by =’;s 1. P (a|b) = PP(a&b) (b) (2), multiplication by =’s P (a&b) 2. P (b|a) = P (a) (3),(4) division by =’s 3. P (a|b) ⇤ (5), P (b) = P (a&b) 4. P (a&b) = P (b|a) ⇤ P (a) This conclusion is Bayes’ theorem. Often, what we want to know is, intuitively, the probability (a|b) ⇤is P (b) =theorem P (b|a) ⇤ Pthe (a)probability Theofconclusion of this argument is Bayes’ what it says write that if we want to know some hypothesis ‘h’ given some theorem. evidenceIntuitively, ‘e’; then5. wePwould the as: (b|a) P (a) of some theory given a bit of evidence, what we need to knowC. arePthree things: the probability of the evidence (a|b) = P(1) P (b) given the theory (i.e.,Phow likely (h) P (e|h)the evidence is to happen if the theory is true), (2) the prior probability of the theory, P prior (h|e)probability = and (3) the of the evidence. P (e) This conclusion is Bayes’ theorem. Often, what of some ‘h’ given some evidence ‘e’; t Consider what this would say about the example of thehypothesis lottery machine. Suppose for simplicity a man would win, then (given the Using this definition ofthat conditional probability, we canyou then should argue as follows, assuming thatinitial P (b) there isDerivation a 0.5 probability that Obama will win. 6= 0: of Bayes’ theorem 1. P (a|b) = 2. 3. 4. 5. C. P (a&b) P (b) P (a&b) P (a) prob def. of conditional probability Using this definition of conditional probability, we can then argu def. of conditional probability 6= 0: P (b|a) = P (a|b) ⇤ P (b) = P (a&b) P (a&b) = P (b|a) ⇤ P (a) 1. P (a|b) P (a|b) ⇤ P (b) = P (b|a) ⇤ P (a) P (a) 2. P (b|a) P (a|b) = P (b|a) P (b) = P (a&b) P (b) P (a&b) P (a) (1), multiplication by =’;s (2), multiplication by =’s (3),(4) (5), division by =’s def. of con = def. of con 3. P (a|b) ⇤ P (b) = P (a&b) (1), m This conclusion is Bayes’ theorem. Often, what we want to know is, intuitively, the probability 4. P (a&b) P then (b|a)what ⇤ would Pit (a) (2), m Theofconclusion of this argument is Bayes’ theorem. Intuitively, says write is that the if wetheorem want to know some hypothesis ‘h’ given some evidence= ‘e’; we as: the probability of some theory given a bit of evidence, what we need know three things: (1) the probability of the evidence 5. P (a|b) ⇤ P to(b) = are P (b|a) ⇤ P (a) given the theory (i.e.,Phow likely if the (h) P (e|h)the evidence is to happen P (b|a) P theory (a) is true), (2) the prior probability of the theory, P prior (h|e)probability = C. P (a|b) = and (3) the of the evidence. P (e) P (b) If we let “h” stand for the hypothesis to be tested and “e” for the relevant evidence, then the theorem can be Thissayconclusion is Bayes’ Often, whatforwesimplicity want to Consider what this would about the example of thetheorem. lottery machine. Suppose rewritten as follows: kno that you know going in there are only two‘h’ options, equally likely be correct: ofthat some hypothesis givenwhich someareevidence ‘e’; to then we would w that there are 10 balls in the machine, and that there are 10,000. Let e be the evidence that the Bayes’ theorem first ball to come out is #3, and let h be the hypothesis that there are 10 balls in the machine. Then we might say: P (h|e) = P (h) P (e|h) P (e) P (h) = 0.5 This theorem is very useful, since often it is easy to figure out the conditional probability of the evidence given the Consider whatprobability this would say about example of the lottery theory, butPvery to figure out the conditional of the theory given thethe evidence. (e|h)hard = 0.1 you know P (e) = 0.5(0.1 +that 0.0001) = 0.05005 going in that there are only two options, which A good example of this sort of situation is given by our example of the lottery balls. that there are 10 balls in the machine, and that there are 10,000 0.5 0.1 Then we find, via Bayes’ theorem, thatcome P (h|e)out = 0.05005 = 0.999. So, on the the basis hypothesis of the evidencethat th first ball to is #3, and let h be that the first ball to come out was you say: should revise your confidence in the 10-ball hypothesis Then we #3, might from 50% to 99.9% certainty. of some given some ‘e’; then we would w there is a hypothesis 0.5 probability‘h’ that Obama willevidence win. Bayes’ theorem Using this definition of conditional probability, we can then argue as follows, P (h) P (e|h) 6= 0: P (h|e) = P (e) P (a&b) 1. P (a|b) = def. of conditional prob (b) This theorem is very useful, since often it is easy toPfigure out the conditional probability of the evidence given the Consider what this would say about example the lottery theory, but very hard to figure out the conditional of the theory given thethe evidence. 2. P (b|a) = probability def. ofof conditional prob P (a) P (a&b) that3.you know going in that there are only two(1), options, which P (a|b) ⇤ P (b) = P (a&b) multiplication A good example of this sort of situation is given by our example of the lottery balls. P (a&b) P (b|a) ⇤ Pin (a)the machine, and that there (2), multiplication that4.there are=10 balls are 10,000 Remember that we have two hypotheses: that⇤there balls labeled 1-10 in the machine, and that there are balls 5.ball P (a|b) P (b)are = P (b|a) ⇤ P (a) first to come out is #3, and let h be the hypothesis that t labeled 1-10,000 in the machine. Let’s assume that, at the start, we think that these hypotheses have equal P (b|a) P (a) C.probability P (a|b) = (5),by division P (b) which sort of machine we are given is determined probability: they are each assigned 0.5. (Maybe Then we might say: coin flip.) This conclusion is Bayes’ theorem. Often, what we want to know is, intuitiv Now suppose that the first ball comes out, again‘h’ at given random, is a “3”. Bayes’ theorem us, given the the theo of that some hypothesis some evidence ‘e’; thencan wetellwould write (h) 0.5 before us contains 10 rather than 10,000 balls. According to foregoing information, how likely it is P that the= machine Bayesians, we figure this out by conditionalizing on our evidence, i.e. finding the conditional probability of the P (h) P (e|h) (e|h) 0.1 theory given the evidence. PP(h|e) == P (e) Pcontains (e) = 10 0.5(0.1 + 0.0001) Let “h” be the theory that the machine balls. Then to compute= the0.05005 relevant conditional probability, we need to know three values. Consider what this would say about the example of the lottery machine. Sup that you know going in that there are only two options, which are equally l 0.5 0.1 (h|e)are=10,000. =e0.999 that there are 10 balls in the machine, andthat thatPthere be th 0.05005Let theobviously, ball come out you should revise Next, we need to know Pr(e that |first h). Fairly this isis0.1. ball tofirst come outto #3, and letwas h be#3, the hypothesis that there your are 10con ba Then we might say: ofcertainty. from 50% to 99.9% But we also need to know the unconditional probability e, Pr(e). How should we figure this out? First, we need to know Pr(h).Then From the know that thistheorem, is 0.5. weabove, find,wevia Bayes’ A natural thought is that since we knowtheorem that each of the two hypotheses - 10 ball and 10,000 ball - have a Bayes’ P (h) = 0.5 can be restated in the following way: probability of 0.5, Pr(e) should be the mean of the conditional probabilities of e given those two hypotheses. In this case: P (e|h) = 0.1 P (e) = 0.5(0.1 + 0.0001) = 0.05005 0.5 0.1 2 4. P (a&b) = P (b|a) ⇤ (2), 4. P (a&b) = P (b|a) ⇤ P (a) of some multiplication by =’s‘e’; (2), hypothesis ‘h’P (a) given some evidence thenmultiplication we would w 5. P (a|b) ⇤ P (b) = P (b|a) ⇤ P (a) 5. P (a|b) ⇤ P (b) = P (b|a) ⇤ P (a) (3),(4) Bayes’ theorem P (b|a) P (a) C. P (a|b) = (5), division P (b|a) P (a) P (b) C. P (a|b) = P (h) P (e|h) (5), division by =’s P (b) P (h|e) = P (e) This conclusion is Bayes’ theorem. Often, what we want to know is, intuitive conclusion is Bayes’ theorem.of Often, what we ‘h’ want to know is, intuitively, the some hypothesis given some evidence ‘e’; then weprobability would write the theor Now suppose that the first ball that comes out, again at random, is a “3”. Bayes’ theorem can tell us, given the me hypothesis ‘h’ given some evidence ‘e’; then we would write the theorem as: foregoing information, how likely it is that the machine before us contains 10 ratherthe than 10,000 balls. of According to Consider what this would say about example the lottery P (h) (e|h) Bayesians, we figure this out by conditionalizing onPour evidence, i.e. finding the conditional probability of the P (h|e) = that you know going in that there are only two options, which P (e) theoryP given evidence. (h) Pthe (e|h) P (h|e) = P (e) that there are 10 balls in the machine, and that there are 10,000 Let “h” be the theory that the machine contains 10 balls. Then to compute the relevant conditional probability, we would aboutand the example the lottery machine. Sup first ballwhat to this come outsay is #3, let h beofthe hypothesis that t need to know three values. Consider thatthe youexample know going in lottery that there are onlySuppose two options, which are equally li sider what this would say about of the machine. for simplicity Then we might say: First, we need to know Pr(h).that Fromthere the above, weballs knowin that thismachine, is 0.5. are 10 the and that there you know going in that there are only two options, which are equally likely toare be10,000. correct:Let e be the ball tothat come out isare is0.1. #3, and let h be thethe hypothesis Next, needin to know Pr(e |first h). Fairly this there are 10weballs the machine, andobviously, there 10,000. Let e be evidencethat thatthere the are 10 ba Then wePmight say: (h) = 0.5 of e,that ball to But come out is #3, and let h be the hypothesis there are 10weballs we also need to know the unconditional probability Pr(e). How should figurein thisthe out?machine. n we might say: A natural thought is that since we know each the two hypotheses - 10 ball and 10,000 ball - have a P that (e|h) =of0.1 P (h) = 0.5 probability of 0.5, Pr(e) should be the mean of the conditional probabilities of e given those two hypotheses. In this P (e)==0.10.5(0.1 + 0.0001) = 0.05005 case: P (e|h) P (h) = 0.5 P (e|h) = 0.1 P (e) = 0.5(0.1 + 0.0001) = 0.05005 0.5 0.1 Then we find, via Bayes’ theorem, that P (h|e) = 0.05005 = 0.999 P (e) = 0.5(0.1 + 0.0001) 0.5 0.1 Plugging these numbers = into0.05005 Bayes’we theorem, we Bayes’ get: Then find, via theorem, that P (h|e) = = 0.999. So, your on thecon ba 0.05005 that the first ball to come out was #3, you should revise that the first ball to come out was #3, you should revise your confidence in the from 50% to 0.5 99.9% 0.1 certainty. from 50% to 99.9% certainty. n we find, via Bayes’ theorem, that P (h|e) = 0.05005 = 0.999. So, on the basis of the evidence the first ball to come out was #3, you should revise your confidence in the the 10-ball hypothesis Bayes’ theorem be restated in following way: Bayes’ theorem can can be restated in the following way: Hence, after you see the “3” ball come out of the lottery machine, you should revise the probability you assign to the 50% to 99.9% certainty. es’ 10-ball hypothesis from 0.5 to .999 - that is, you should switch from thinking that the machine has a 50% chance of being can a 10-ball machine toin thinking that it has away: 99.9% chance of being a 10-ball machine. theorem be restated the following 2 2 of some hypothesis ‘h’ given some evidence ‘e’; then we would w Bayes’ theorem P (h|e) = P (h) P (e|h) P (e) Hence, after you see the “3” ball come out of the lottery machine, you should revise the probability you assign to the what this would say about example ofchance the lottery 10-ball hypothesis from 0.5 to Consider .999 - that is, you should switch from thinking that thethe machine has a 50% of being a 10-ball machine to thinking it has a 99.9% chancein of being 10-ball machine. thatthat you know going thatathere are only two options, which that there are 10 balls in the machine, and that there are 10,000 An intuitive way to think about what this all means is in terms of what bets you would be willing to accept. If you first ball to comethen outyouisshould #3, be and letto haccept be the hypothesis think that something has a 50% chance of happening, willing all bets which give youthat t better than even odds, and willing to reject bets which Then we allmight say:give you worse odds. Analogous remarks apply to different probability assignments. This link between probability assignments and = bets0.5 is one way to bring out a strength of the Bayesian approach to P (h) belief formation. Following the Bayesian rule of conditionalization is the only way to avoid being subject to a Dutch book. P (e|h) = 0.1 P (e) no =matter 0.5(0.1 0.05005 A Dutch book is a combination of bets which, what+ the0.0001) outcome, = is sure to lose. For example, suppose that we have a 10 horse race, and you have the following views about the probabilities that certain horses will win. 0.5 0.1 Then we find, via Bayes’ theorem, that P (h|e) = #3 has at least a 40% chance of winning 0.05005 = 0.999 #6 has athe slightly better than chance winning that first ball toeven come outof was #3, you should revise your con #8 has a better than 1 in 3 chance of winning from 50% to 99.9% certainty. If these are your probability assignments, then you should be willing to make the following three bets: Bayes’ theorem can be restated in the following way: $2 on #3 to win, at 3-2 odds $3 on #6 to win, at even odds $1.50 on #8 to win, at 3-1 odds 2 of some hypothesis ‘h’ given some evidence ‘e’; then we would w Bayes’ theorem P (h|e) = P (h) P (e|h) P (e) This link between probability assignments and bets is one way to bring out a strength of the Bayesian approach to Consider what this would say about the example of the lottery belief formation. Following the Bayesian rule of conditionalization is the only way to avoid being subject to a Dutch that you know going in that there are only two options, which book. that there are 10 balls in the machine, and that there are 10,000 A Dutch book is a combination of bets which, no matter what the outcome, is sure to lose. For example, suppose first ball to come out is #3, and let h be the hypothesis that t that we have a 10 horse race, and you have the following views about the probabilities that certain horses will win. Then we might say: #3 has at least a 40% chance of winning #6 has a slightly better than even chance of winning #8 has a better than=1 in 3 chance of winning P (h) 0.5 P (e|h) 0.1 be willing to make the following three bets: If these are your probability assignments, then you=should P (e) = 0.5(0.1 + 0.0001) = 0.05005 $2 on #3 to win, at 3-2 odds $3 on #6 to win, at even odds $1.50 on #8 to win, at 3-1 odds 0.5 0.1 Then we find, via Bayes’ theorem, that P (h|e) = 0.05005 = 0.999 that But this is a Dutch book. Can you seethe why?first ball to come out was #3, you should revise your con from 50% to 99.9% certainty. One of the main arguments given in favor of updating your beliefs using Bayesian conditionalization, using Bayes’ theorem, is that other ways of updating your beliefs will leave you, in the above sense, subject to a Dutch book. And Bayes’ can be restated way: being subject to a Dutch book seems like theorem a paradigm of irrationality. It seems in sortthe of likefollowing the probabilistic version of having inconsistent beliefs. Hence it is a strength of Bayesianism that it avoids this situation. 2 of some hypothesis ‘h’ given some evidence ‘e’; then we would w Bayes’ theorem P (h|e) = P (h) P (e|h) P (e) would say With this Bayesian apparatusConsider in hand, let’swhat return tothis Leslie’s argument. about the example of the lottery that you know going in that there are only two options, which Leslie asks us to consider two hypotheses about the future course of human civilization: that there are 10 balls in the machine, and that there are 10,000 to go come is #3, and Doom Soon. The first humanball race will extinctout by 2150, with the totallet h be the hypothesis that t humans born by the time ofwe such extinction being 500 billion. Then might say: Doom Delayed. The human race will go on for several thousand centuries, with the total humans born before the race goes extinct being P (h) = 0.5 50 thousand billion. P (e|h) = 0.1 we have, along with Bayesian conditionalization, to Leslie thinks that we can use a certain kind of evidence show how likely these two hypotheses are. To do this, though, we’ll first have to think about how likely we P (e) = 0.5(0.1 + 0.0001) = 0.05005 think these two hypotheses are to begin with. Doom Soom is pretty grim; it means that the human race will go extinct during the lives of your grandchildren. 0.5 0.1 Then we find, via Bayes’ theorem, that P (h|e) = = 0.999 Let’s suppose that we think that this is pretty unlikely - maybe that it has a 1% chance of happening.0.05005 Let’s suppose (we’ll relax this assumption later) that Doom Delayed is the other possibility, so that it has a 99% that the first ball to come outonly was #3, you should revise your con chance of happening. from 50% to 99.9% certainty. What evidence could we possibly have now to help us decide between these hypotheses now? Some obvious candidates spring to mind: the proliferation of nuclear weapons; astronomical of the probabilities Bayes’ theorem can be restated in calculations the following way: of large asteroids colliding with the earth; prophecies involving the end of the Mayan calendar; etc. But the evidence that Leslie has in mind is of a different sort. 2 of some hypothesis ‘h’ given Doom Soon. The human race will go extinct by 2150, with the total humans born by the time of such extinction being 500 billion. Doom Delayed. The human race will go on for several thousand centuries, with the total humans born before the race goes extinct being 50 thousand billion. Bayes’ theorem P (h|e) = P (h) P (e|h) P (e) Consider what this would sa you going in that Doom Soom is pretty grim; it means that the human race will go extinct during thethat lives of your know grandchildren. Let’s suppose that we think that this is pretty unlikely - maybe that it has a 1% chance happening. thatofthere areLet’s 10 balls in the suppose (we’ll relax this assumption later) that Doom Delayed is the only other possibility, so that it has a 99% first ball to come out is #3, chance of happening. Then we might say: What evidence could we possibly have now to help us decide between these hypotheses now? Some obvious candidates spring to mind: the proliferation of nuclear weapons; astronomical calculations of the probabilities of large asteroids colliding with the earth; prophecies involving the end of the Mayan calendar; etc. But the P (h) = 0.5 evidence that Leslie has in mind is of a different sort. P (e|h) = 0.1 That evidence is: each of us is one of the first 50 billion human beings born. Let’s now ask, using the Bayesian method, what probability, in light of this evidence, we should assign to Doom Soon (DS) and Doom P (e) = Delayed 0.5(0.1 + (DD). 0.000 First, what is the probability of our evidence, conditional on the truth of DS? It seems that it should be 0.1; if there Then we50find, via Bayes’ will be 500 billion total human beings, then I have a 1 in 10 chance of being among the first billion born. the that the first ball to come ou Second, what is Pr(e | DD)? By analogous reasoning, it seems to be 1 in 1000, or .001. from 50% to 99.9% certainty But what is the probability of e, by itself? On the current assumption that either DS or DD is true, and that DD is 99 times more likely to be true, it seems that Pr(e) should be equal to: Bayes’ theorem can be [99 * Pr(e | DD) + 1 * Pr(e | DS)] / 100 = .00199 resta of some hypothesis ‘h’ given Doom Soon. The human race will go extinct by 2150, with the total humans born by the time of such extinction being 500 billion. Doom Delayed. The human race will go on for several thousand centuries, with the total humans born before the race goes extinct being 50 thousand billion. Bayes’ theorem P (h|e) = P (h) P (e|h) P (e) Consider what this would sa That evidence is: each of us is one of the first 50 billion human beings born. Let’sthat now ask, using the Bayesian you know going in that method, what probability, in light of this evidence, we should assign to Doom Soon (DS) and Doom Delayed that there are 10 balls in the (DD). first ball to come out is #3, So we have the following: Then we might say: P r(DS) = 0.01 P r(DD) = 0.99 P r(e) = 0.00199 P r(e|DS) = 0.1 P r(e|DD) = 0.001 P (h) = 0.5 P (e|h) = 0.1 And this is all we need to figure out the probabilities of DS and DD conditional on the evidence that we are P (e) = 0.5(0.1 among the first 50 billion human beings born. P (DS|e) = P (DD|e) = P (DS) ⇤ P (e|DS) .01 ⇤ .1 = = .503 P (e) .00199 P (DD) ⇤ P (e|DD) .99 ⇤ .001 = = .497 P (e) .00199 + 0.000 Then we find, via Bayes’ the that the first ball to come ou from 50% to 99.9% certainty Bayes’ theorem can be resta So, even if we begin by thinking that the probability of Doom Soon is only 1%, reflection on the simple fact that we are born among the first 50 billion humans shows that we should think that there is a greater than 50% chance that the human race will be extinct in the next 150 years. of some hypothesis ‘h’ given Doom Soon. The human race will go extinct by 2150, with the total humans born by the time of such extinction being 500 billion. Doom Delayed. The human race will go on for several thousand centuries, with the total humans born before the race goes extinct being 50 thousand billion. P r(DS) = 0.01 P r(DD) = 0.99 P r(e) = 0.00199 P r(e|DS) = 0.1 P r(e|DD) = 0.001 Bayes’ theorem P (h|e) = P (h) P (e|h) P (e) Consider what this would sa that you know going in that P (DS) ⇤ P (e|DS) .01 ⇤ .1 P (DS|e) = = = .503 P (e) that there .00199 are 10 balls in the P (DD) ⇤ P (e|DD) .99 to ⇤ .001 first ball come out is #3, P (DD|e) = = = .497 P (e) Then we.00199 might say: So, even if we begin by thinking that the probability of Doom Soon is only 1%, reflection on the simple fact that P (h) than = 0.5 we are born among the first 50 billion humans shows that we should think that there is a greater 50% chance that the human race will be extinct in the next 150 years. P (e|h) = 0.1 Varying our initial assumptions changes the outcome dramatically. Suppose that reflection upon the risk of P (e) = 0.5(0.1 + 0.000 climate change, nuclear war, etc. makes us think that we should assign DS and DD roughly equal initial probabilities - suppose we think, before considering the fact that we were born in the first 50 billion people, that the probability of each is approximately 0.5. On this assumption, the probability of Doom Soon after Then we find, via Bayes’ the conditionalizing on our evidence is just over 99%. that the first ball to come ou Alternatively, we might think (as Leslie says, p. 202) that if the human race survives past 50% 2050, then it will likely from to 99.9% certainty colonize other planets, making it quite likely that the population of humans will ultimately grow to be at least 50 million billion (50 quadrillion). On this assumption, even if we begin by assigning Doom Soon a chance of only theorem can be 1%, conditionalizing on the evidence that we were among the first 50 billion born Bayes’ gives Doom Soon a probability of 99.9%. resta of some hypothesis ‘h’ given Doom Soon. The human race will go extinct by 2150, with the total humans born by the time of such extinction being 500 billion. Doom Delayed. The human race will go on for several thousand centuries, with the total humans born before the race goes extinct being 50 thousand billion. Bayes’ theorem P (h|e) = P (h) P (e|h) P (e) Consider what this would sa that onyou knowfactgoing So, even if we begin by thinking that the probability of Doom Soon is only 1%, reflection the simple that in that we are born among the first 50 billion humans shows that we should think that there is a greater that therethan are50% 10 balls in the chance that the human race will be extinct in the next 150 years. first ball to come out is #3, Then upon we might Varying our initial assumptions changes the outcome dramatically. Suppose that reflection the risk ofsay: climate change, nuclear war, etc. makes us think that we should assign DS and DD roughly equal initial probabilities - suppose we think, before considering the fact that we were born in the first 50 billion people, that the probability of each is approximately 0.5. On this assumption, the probability of Doom Soon after P (h) = 0.5 conditionalizing on our evidence is just over 99%. P (e|h) = 0.1 Alternatively, we might think (as Leslie says, p. 202) that if the human race survives past 2050, then it will likely P (e) = 0.5(0.1 + 0.000 colonize other planets, making it quite likely that the population of humans will ultimately grow to be at least 50 million billion (50 quadrillion). On this assumption, even if we begin by assigning Doom Soon a chance of only 1%, conditionalizing on the evidence that we were among the first 50 billion born gives Doom Soon a probability Then we find, via Bayes’ the of 99.9%. that the first ball to come ou The numbers are surprising. But more surprising than the numbers is the way we arrived at them. One thinks of from 50% to 99.9% certainty revising one’s view about the likelihood of Doom Soon based on empirical claims about nuclear weapons, climate change, asteroids, etc. It seems crazy that such dramatic changes in our view about the extinction of the Bayes’ human race should result from mere reflection on how many human beings have been born.theorem can be This is why this argument deserves to be considered a paradox. We have a plausible argument from Leslie that we should radically change our view of the future based on the number of human beings who have lived, but it seems clear that it is unreasonable to change our views on this topic for this reason. resta This is why this argument deserves to be considered a paradox. We have a plausible argument from Leslie that we should radically change our view of the future based on the number of human beings who have lived, but it seems clear that it is unreasonable to change our views on this topic for this reason. We’ll move on shortly to consider some objections to Leslie’s argument. But first lets consider another, even more surprising, application of the same sort of reasoning, due to Nick Bostrom (in “Are you living in a computer simulation?”). Suppose we met some Martians, or other beings from another planet, who were made of entirely different materials from us - suppose, for example, that they were composed almost entirely of silicon. Could such beings be conscious? It seems clear that they could; beings could be conscious even if made of radically different materials than us. But what makes such beings conscious? Plausibly, what makes them conscious is not what they are made of, but how what they are made of is organized; even something made of silicon could be conscious if its silicon parts were arranged in a structure as complex as a human brain. If this is true, then it seems clear that computers could be conscious. Indeed, given reasonable assumptions about future increases in the computing power available to us, it seems quite plausible that in the relatively near future computers will be able to run simulations of human life which contain many, many conscious beings. With continuing increases in computing power, there will presumably come a time in which a vast, vast majority of all conscious beings to have existed will have existed only in computer simulations. Are you living in a computer simulation right now? It seems that reasoning analogous to that used in the doomsday argument suggests that it is reasonable for you to believe that you are. After all, if the hypothesis that the vast number of conscious beings to have existed exist only in a computer simulation is true, then the probability that you do not exist in a computer simulation is extremely low. Of course, there is another option: human beings could become extinct before the relevant advances in computing power are realized. So, it seems, it is overwhelmingly likely that either you are living in a computer simulation, or human beings are soon to become extinct. How might one respond to Leslie’s argument? (We’ll leave it an open question whether the responses we consider will also apply to the “computer simulation” argument.) One could of course respond that the Bayesian apparatus on which it depends is faulty. But that would seem hasty; let’s consider some other possibilities. One possibility is that the problem begins with the obviously false assumption that Doom Soon and Doom Delayed are the only relevant possibilities. Would Leslie’s argument still work if we relaxed this assumption, and took into account the fact that there are many possible futures for the human species? What would the analogous situation be with the example of the numbered balls and the lottery machine? Let’s consider a different line of objection: Look, we know that something must be wrong with this way of arguing. After all, couldn't someone in ancient Rome have used this reasoning to show that the end the world would come before 500 AD? And wouldn't they have been wrong? So mustn't there also be something wrong with our using this reasoning? Is this a good objection? A more promising line of objection focuses on the apparent assumption that one’s location in the birth order of human beings is random. Leslie asks you to, in effect, assimilate this case to the lottery ball example. But why think that the fact that I am born in the first 50 billion people is relevantly analogous to the number “50 billion” coming out of the lottery machine? To develop this objection, let’s think more closely about exactly which assumptions are involved in the lottery machine example. (This follows the discussion in Mark Greenberg’s “Apocalypse not just yet.”) Doom Soon. The human race will go extinct by 2150, with the total humans born by the time of such extinction being 500 billion. Doom Delayed. The human race will go on for several thousand centuries, with the total humans born before the race goes extinct being 50 thousand billion. A more promising line of objection focuses on the apparent assumption that one’s location in the birth order of human beings is random. Leslie asks you to, in effect, assimilate this case to the lottery ball example. But why think that the fact that I am born in the first 50 billion people is relevantly analogous to the number “50 billion” coming out of the lottery machine? To develop this objection, let’s think more closely about exactly which assumptions are involved in the lottery machine example. (This follows the discussion in Mark Greenberg’s “Apocalypse not just yet.”) An analogous case would be a lottery machine which we know to contain either 500 billion or 50 thousand billion balls. To make the numbers smaller, let’s think of a pair of lottery machines with, respectively, 500 and 50,000 balls. Suppose you are confronted with a lottery machine which you know to be of one of the types just described. Suppose now that the lottery machine spits out a ball which reads “50.” (This is supposed to be analogous to finding that you are among the first 50 billion people born.) Isn’t this evidence, as Leslie says, that the machine has 500 rather than 50,000 balls in it? It is - but only if we make two assumptions about the machines. First, we must assume that the ball which comes out is randomly selected by the lottery machine. If, for example, balls with numbers higher than “100” are slightly larger and never come out of the machine, then obviously the fact that a ball labeled “50” came out of the machine would be no evidence at all about how many balls are in the machine. Second, and just as important, we must assume that the ball with the label “50” on it would still be in the urn if it contained 500 balls. Doom Soon. The human race will go extinct by 2150, with the total humans born by the time of such extinction being 500 billion. Doom Delayed. The human race will go on for several thousand centuries, with the total humans born before the race goes extinct being 50 thousand billion. Suppose now that the lottery machine spits out a ball which reads “50.” (This is supposed to be analogous to finding that you are among the first 50 billion people born.) Isn’t this evidence, as Leslie says, that the machine has 500 rather than 50,000 balls in it? It is - but only if we make two assumptions about the machines. First, we must assume that the ball which comes out is randomly selected by the lottery machine. If, for example, balls with numbers higher than “100” are slightly larger and never come out of the machine, then obviously the fact that a ball labeled “50” came out of the machine would be no evidence at all about how many balls are in the machine. Second, and just as important, we must assume that the ball with the label “50” on it would still be in the urn if it contained 500 balls. This second condition is easy to miss, since we of course assume that a machine with N balls in it contains those balls labeled 1-N. But of course things don’t have to work this way. Suppose that the 500 balls in the 500-ball machine were randomly selected from the balls in the 50,000 ball machine. Then the “50” ball might not be in the 500-ball machine. In this case, would the emergence of the “50” ball from the machine count in favor of the hypothesis that the machine before us contains 500 rather than 50,000 balls? The problem is that any way of understanding the analogy between the Doomsday argument and the lottery machine example seems to violate one of the two assumptions needed to legitimate the reasoning used in the lottery machine case. To see this, think about the question: what is the analogue in the Doomsday argument of the numbers written on the balls in the lottery machine case? Doom Soon. The human race will go extinct by 2150, with the total humans born by the time of such extinction being 500 billion. Doom Delayed. The human race will go on for several thousand centuries, with the total humans born before the race goes extinct being 50 thousand billion. Suppose now that the lottery machine spits out a ball which reads “50.” (This is supposed to be analogous to finding that you are among the first 50 billion people born.) Isn’t this evidence, as Leslie says, that the machine has 500 rather than 50,000 balls in it? It is - but only if we make two assumptions about the machines. First, we must assume that the ball which comes out is randomly selected by the lottery machine. If, for example, balls with numbers higher than “100” are slightly larger and never come out of the machine, then obviously the fact that a ball labeled “50” came out of the machine would be no evidence at all about how many balls are in the machine. Second, and just as important, we must assume that the ball with the label “50” on it would still be in the urn if it contained 500 balls. To see this, think about the question: what is the analogue in the Doomsday argument of the numbers written on the balls in the lottery machine case? Here’s one idea: the number “written on you” is the number you happen to be in the birth order of the human species. So (let’s suppose) my number is “50 billion” because I just happened to be the 50 billionth person born. On this idea, people don’t have “built in” numbers: rather, they are assigned the numbers they get based on when they are born. But now imagine that the lottery machine worked this way. On this view, there are an undisclosed number of balls in the machine, none of which have numbers written on them. We write the numbers on the balls as they come out of the machine, beginning with “1.” Now suppose we get to “50.” Is the fact that a ball with such a low number came out evidence that we have a 500-ball machine before us? Clearly not, because the first assumption - the assumption of random selection - is violated. Doom Soon. The human race will go extinct by 2150, with the total humans born by the time of such extinction being 500 billion. Doom Delayed. The human race will go on for several thousand centuries, with the total humans born before the race goes extinct being 50 thousand billion. Suppose now that the lottery machine spits out a ball which reads “50.” (This is supposed to be analogous to finding that you are among the first 50 billion people born.) Isn’t this evidence, as Leslie says, that the machine has 500 rather than 50,000 balls in it? It is - but only if we make two assumptions about the machines. First, we must assume that the ball which comes out is randomly selected by the lottery machine. If, for example, balls with numbers higher than “100” are slightly larger and never come out of the machine, then obviously the fact that a ball labeled “50” came out of the machine would be no evidence at all about how many balls are in the machine. Second, and just as important, we must assume that the ball with the label “50” on it would still be in the urn if it contained 500 balls. So let’s suppose instead that every human being comes with a “built in” number, just like the lottery balls have numbers written on them prior to their emergence from the lottery machine. There are two problems with this suggestion. First, how do we know what anyone’s number is, on this way of viewing the analogy? What does it even mean to say that people have a certain number? Second, if we can come up with a way of assigning numbers to all the people that could have existed - we violate the second assumption. For there is no guarantee that, if N people exist, they will have the numbers 1-N. This is just like the case in which 500 balls are randomly selected from the 50,000 ball machine - in this case, even if we somehow knew that my number was 50 billion - this would provide us no evidence at all about how many human beings will eventually exist.