Staying Regular? Alan Hájek To be uncertain is to be uncomfortable, but to be certain is to be ridiculous. - Chinese proverb ALI G: So what is the chances that me will eventually die? C. EVERETT KOOP: That you will die? – 100%. I can guarantee that 100%: you will die. ALI G: You is being a bit of a pessimist… –Ali G, interviewing the Surgeon General, C. Everett Koop Satellite view • My campaign against the usual ratio analysis of conditional probability (“What CP Could Not Be”…) • Related, my misgivings about the usual formula(s) for independence. (“Declarations of Independence”) • Conditionals (“zero intolerance”) • Concerns about decision theory (‘Decision Theory In Crisis’ project) • ‘The Objects of Probability’ project • Bridges between traditional and formal epistemology (“A Tale of Two Epistemologies”) Satellite view • They all come together in my concerns about regularity. Satellite view • First instalment: “Is Strict Coherence Coherent?” • I’m back on the case. Aerial view • Many philosophers are scandalized by orthodox Bayesianism’s unbridled permissiveness… • So some Bayesians are less permissive... • I’ll canvass the fluctuating fortunes of a much-touted constraint, so-called regularity. Aerial view • Various versions. • I’ll massage it, offering what I take to be a more promising (or less unpromising!) version: a constraint that bridges doxastic possibility and doxastic (subjective) probability. • So understood, regularity promises to offer a welcome connection between traditional and Bayesian epistemology. Aerial view • I’ll give a general formulation of regularity in terms of a certain kind of internal harmony within a probability space. • I’ll argue that it is untenable. Aerial view • There will be two different ways to violate regularity – zero probabilities – no probabilities at all (probability gaps). • Both ways create trouble for pillars of Bayesian orthodoxy: – – – – the ratio formula for conditional probability conditionalization, characterized with that formula the multiplication formula for independence expected utility theory Aerial view • Whither our theory of rational credence? View from the trenches: regularity • Regularity conditions are bridge principles between modality and probability: If X is possible, then the probability of X is positive. • An unmnemonic name, but a commonsensical idea… • Versions of regularity as a rationality constraint have been suggested or advocated by Jeffreys, Jeffrey, Carnap, Shimony, Kemeny, Edwards, Lindman, Savage, Stalnaker, Lewis, Skyrms, Jackson, Appiah, ... Regularity • Muddy Venn diagram. No bald spots. Regularity If X is possible, then the probability of X is positive. • There are many senses of ‘possible’ in the antecedent... • There are also many senses of ‘probability’ in the consequent… Regularity • Pair them up, and we get many, many regularity conditions. • Some are interesting, and some are not; some are plausible, and some are not. • Focus on pairings that are definitely interesting, and somewhat plausible, at least initially. Regularity • In the consequent, let’s restrict our attention to rational subjective probabilities. • In the antecedent? … Regularity • Implausible: Logical Regularity If X is LOGICALLY possible, then C(X) > 0. (Shimony, Skyrms) Regularity • Problems: There are all sorts of propositions that are knowable a priori, but whose negations are logically possible... • A rational agent may (and perhaps must) give probability 0 to these negations. 
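The pairings just described can be displayed as instances of a single schema; the generic possibility operator and the credence function C in the LaTeX below are just bookkeeping of my own, introduced to keep the versions apart.

% Fix a sense of possibility (a reading of the diamond) and let C be the agent's credence function.
\[
\textbf{Regularity}_{\Diamond}:\qquad \forall X\,\bigl(\Diamond X \rightarrow C(X) > 0\bigr)
\]
% Logical, metaphysical, doxastic and epistemic regularity are the instances obtained
% by reading \Diamond as logical, metaphysical, doxastic or epistemic possibility.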
Regularity • More plausible: Metaphysical Regularity If X is METAPHYSICALLY possible, then C(X) > 0.

Regularity • This brings us to Lewis's (1980) characterization of "regularity": "C(X) is zero … only if X is the empty proposition, true at no worlds". (According to Lewis, X is metaphysically possible iff it is true at some world.) • Lewis regards regularity in this sense as a constraint on "initial" (prior) credence functions of agents as they begin their Bayesian odysseys—Bayesian Superbabies.

Regularity • Problems for metaphysical regularity: – It is metaphysically possible for no thinking thing to exist … – An infallible, omniscient God is metaphysically irregular … – Agents who are infallible over circumscribed domains: • Metaphysical regularity prohibits luminosity of one's credences: If C(X) = x, then C[ C(X) = x ] = 1. • Metaphysical regularity prohibits certainty about one's phenomenal experiences.

Regularity • However, doxastic possibility seems to be a promising candidate for pairing with subjective probability. • Doxastic regularity: If X is doxastically possible for a given agent, then the agent's subjective probability of X is positive.

Regularity • We can think of a doxastic possibility for an agent as something that is compatible with what that agent believes, though I can allow other understandings... • Plausibly, her beliefs and degrees of belief are, or at least should be, connected closely enough to guarantee this condition.

Regularity • If doxastic regularity is violated, then offhand two different attitudes are conflated: one's attitude to something one outright disbelieves, and one's less committal attitude to something that is still a live possibility given what one believes ...

Regularity • Doxastic regularity avoids the problems with the previous versions…

Regularity • If this version of regularity fails, then another interesting version will fail too. • Epistemic regularity: If X is epistemically possible for a given agent, then the agent's subjective probability of X is positive. • Epistemic regularity is stronger than doxastic regularity; if the latter fails, so does the former.

Regularity • Regularity provides a bridge between traditional and Bayesian epistemology. • (But responding to skepticism is one of the main traditional concerns; far from combating skepticism, regularity seems to sustain it.)

Regularity • And yet doxastic regularity appears to be untenable.

Regularity • I will characterize regularity more generally as a certain internal harmony of a probability space <Ω, F, P>. • Doxastic regularity will fall out as a special case, as will epistemic regularity and other related regularities. • Then we will be in a good position to undermine them.

A more general characterization of regularity • Philosophers tend to think that all of the action in probability theory concerns probability functions. • But for mathematicians, the fundamental object of probability theory is a probability space, a triple of mathematical entities: <Ω, F, P>.

A more general characterization of regularity – A set of possibilities, which we will designate 'Ω'—a set of worlds. (Think of the set of subsets of Ω – sets of worlds – as the agent's set of doxastic possibilities.) – A field F of subsets of Ω, thought of as propositions that will be the contents of the agent's credence assignments. – A probability function P defined on F.

A more general characterization of regularity • Regularity is a certain kind of harmony between Ω, F, and P: – Between Ω and F: F is the power set of Ω, so every subset of Ω appears in F. F recognizes every proposition that Ω recognizes. – Between F and P: P gives positive probability to every set in F except the empty set. P recognizes every proposition that F recognizes (except the empty set). – In this sense, P mirrors Ω: P's non-zero probability assignments correspond exactly to the non-empty subsets of Ω.

Three grades of probabilistic involvement • A non-empty set of possibilities may be recognized by the space by – (1st grade) being a subset of Ω; – (2nd grade) being an element of F; – (3rd grade) receiving positive probability from P. • These are non-decreasingly committal ways in which the space may countenance a proposition. • An agent's space is regular if these three grades collapse into one: every non-empty subset of Ω receives positive probability from P.

Three grades of probabilistic involvement • So far, this is all formalism; it cries out for an interpretation. Again, philosophers of probability have tended to focus on the interpretation of P … • But Ω and F deserve their day in the sun, too.

Three grades of probabilistic involvement • I will think of the elements of Ω as worlds, the singletons of Ω as an agent's maximally specific doxastic possibilities. • F will be the privileged sets of such doxastic possibilities that are contents of the agent's credence assignments.

Three grades of probabilistic involvement • Or start with the doxastic state of a rational agent; it cries out for a formalism so that we can model it. • Philosophers of probability have tended to focus on her probability assignments—real values in [0, 1] that obey the usual axioms. • But surprisingly little is said about the contents of these assignments, and they deserve their day in the sun too.

Three grades of probabilistic involvement • They are represented by a set F of privileged subsets of a set Ω.

Three grades of probabilistic involvement • Thus, two tunnels with opposite starting points and heading in opposite directions meet happily in the middle! …

Three grades of probabilistic involvement • We have a formalism looking for a philosophical interpretation and a philosophical interpretation looking for a formalism happily finding each other. • Where traditional and Bayesian epistemology have to a large extent proceeded on separate tracks, this way they are agreeably linked…

Three grades of probabilistic involvement • Doxastic regularity is a special case of regularity, one in which the three grades of probabilistic involvement collapse into one for a probability space interpreted as I have. • There are other interesting special cases, too.

Three grades of probabilistic involvement • Ω might be the agent's set of maximally specific epistemic possibilities, and F the privileged sets of these possibilities to which she bestows credences. • Define epistemic regularity in terms of the three grades of probabilistic involvement collapsing into one for this space.

Three grades of probabilistic involvement • Or you are free to give your own interpretation of Ω and F…

Three grades of probabilistic involvement • There are two ways in which an agent's space could fail to be regular: – 1) Her probability function assigns zero to some member of F (other than the empty set). Then her second and third grades come apart for this proposition. – 2) Her probability function fails to assign anything to some subset of Ω, because the subset is not an element of F. Then her first and second grades come apart for this proposition.
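In the finite case, the collapse of the three grades can be checked mechanically. Here is a minimal sketch of my own (purely illustrative; the names omega, F and P are assumptions, not anything from the talk): a finite space is regular just in case F is the full power set of omega and P is positive on every non-empty member of F.

from itertools import chain, combinations

def powerset(omega):
    """All subsets of omega, as frozensets."""
    items = list(omega)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))}

def is_regular(omega, F, P):
    """A finite space <omega, F, P> is regular iff the three grades collapse:
    (i) every subset of omega is in F, and
    (ii) P is positive on every non-empty member of F."""
    first_second = powerset(omega) == F           # no probability gaps
    second_third = all(P[A] > 0 for A in F if A)  # no zero assignments to non-empty sets
    return first_second and second_third

# Toy example: a three-world agent who zeroes out one live possibility.
omega = frozenset({"w1", "w2", "w3"})
F = powerset(omega)
P = {A: len(A - {"w3"}) / 2 for A in F}  # w3 gets credence 0 although it is in omega
print(is_regular(omega, F, P))           # False: second and third grades come apart at {w3}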
Three grades of probabilistic involvement • Those who regard regularity as a norm of rationality must insist that all instances of 1) and all instances of 2) are violations of rationality. • I will argue that there are rational instances of both 1) and 2).

Dart example Throw a dart at random at the [0, 1] interval…

Dart example [number line: 0 to 1]

Dart example • Any landing point outside this interval is not countenanced—the corresponding proposition does not make even the first grade for our space...

Dart example • Nor do various other possibilities …

Dart example [number line: 0 to 1] • Certain subsets of Ω—so-called non-measurable sets—get no probability assignments whatsoever. • They achieve the first, but not the second grade.

Dart example [number line: 0 to 1] • Various non-empty subsets get assigned probability 0: • All the singletons • Indeed, all the finite subsets • Indeed, all the countable subsets • Even various uncountable subsets (e.g. Cantor's 'ternary set') • They achieve the second but not the third grade.

Dart example • Examples like these pose a threat to regularity as a norm of rationality. • Any landing point in [0, 1] is doxastically possible for our ideal agent. • We thus get two routes to irregularity as before, now interpreted doxastically.

Arguments against regularity • In order for there to be the kind of harmony within <Ω, F, P> that is captured by regularity, there has to be a certain harmony between the cardinalities of P's domain—namely F—and P's range. • If F is too large relative to P's range, then a failure of regularity is guaranteed, and this is so without any further constraints on P.

Arguments against regularity • Kolmogorov's axiomatization requires P to be real-valued. • This means that any uncountable probability space is automatically irregular.

Arguments against regularity • It is curious that this axiomatization is restrictive on the range of all probability functions: the real numbers in [0, 1], and not a richer set; yet it is almost completely permissive about their domains: Ω can be any set you like, however large, and F can be any field on Ω, however large.

Arguments against regularity • We can apparently make the set of contents of an agent's thoughts as big as we like. • But we limit the attitudes that she can bear to those contents—the attitudes can only achieve a certain fineness of grain. • Put a rich set of contents together with a relatively impoverished set of attitudes, and you violate regularity.

Infinitesimals to the rescue? The friend of regularity replies: if you're going to have a rich domain of the probability function, you'd better have a rich range. Lewis: "You may protest that there are too many alternative possible worlds to permit regularity. But that is so only if we suppose, as I do not, that the values of the function C are restricted to the standard reals. Many propositions must have infinitesimal C-values, and C(A|B) often will be defined as a quotient of infinitesimals, each infinitely close but not equal to zero. (See Bernstein and Wattenberg (1969).)"

Infinitesimals to the rescue? [number line: 0 to 1]

Infinitesimals to the rescue? • I have seen Bernstein and Wattenberg (1969). But this article does not substantiate Lewis's strong claim. Bernstein and Wattenberg show that using the hyperreal numbers—in particular, infinitesimals—one can give a regular probability assignment to the landing points of a fair dart throw, modelled by a random selection from the [0, 1] interval of the reals.

Infinitesimals to the rescue? • But that's a very specific case, with a specific cardinality! • We need to be convinced that a similar result holds for each set of doxastic possibilities, whatever its cardinality. • Indeed, that may well be a proper class!

Infinitesimals to the rescue? • Pruss (MS) shows that if the cardinality of Ω is greater than that of the range of P, then regularity fails.

Infinitesimals to the rescue? • We can scotch regularity even for a hyperreal-valued probability function by correspondingly enriching the space of possibilities. • The dart is thrown at the [0, 1] interval of the hyperreals.

Infinitesimals to the rescue? [number line: 0 to 1, with nested intervals [x – ε/2, x + ε/2] around a point x; not to scale] • Each point x is strictly contained within nested intervals of the form [x – ε/2, x + ε/2] of width ε, for each infinitesimal ε, whose probabilities are their lengths, ε again. (This assumption can be somewhat weakened.) • So the point's probability is bounded above by all these ε, and thus it must be smaller than all of them—i.e. 0.

Arguments against regularity, even allowing infinitesimals • I envisage a kind of arms race: • We scotched regularity for real-valued probability functions by canvassing sufficiently large domains. • The friends of regularity fought back, enriching their ranges: making them hyperreal-valued. • The enemy of regularity counters by enriching the domain. • And so it goes. By Pruss's result, the enemy can always win (for anything that looks like Kolmogorov's probability theory).

Arguments against regularity, even allowing infinitesimals • So there are propositions that make it to the second, but not the third grade of probabilistic involvement for rational agents: non-empty subsets of Ω that get assigned probability 0.

Doxastically possible credence gaps • These are propositions that make it to the first but not the second grade of probabilistic involvement for rational agents: propositions that are subsets of Ω, but that are not elements of F.

Doxastically possible credence gaps • Decision theory recognizes the possibility of probability gaps in its distinction between decisions under risk and decisions under uncertainty: in the latter case, probabilities are simply not assigned to the relevant states of the world. • More generally: I will argue that you can rationally have credence gaps.

Examples of doxastically possible credence gaps • Non-measurable sets

Examples of doxastically possible credence gaps • Chance gaps • The Principal Principle says (roughly!!): your credence in X, conditional on it having chance x, should be x: C(X | chance(X) = x) = x.

Examples of doxastically possible credence gaps • A relative of the Principal Principle? Roughly: your credence in X, conditional on it being a chance gap, should be gappy: C(X | chance(X) is undefined) is undefined. • All I need is that rationality sometimes permits your credence to be gappy for a hypothesized chance gap.

Examples of doxastically possible credence gaps • There are arguably various cases of indeterminism without chanciness (Eagle).
Examples of doxastically possible credence gaps • Norton's dome

Examples of doxastically possible credence gaps • This is a case of indeterminism: In the time-reversed version of the story, the initial conditions and Newton's laws do not entail if, when, and where the ball will roll. • But there are no chances in this picture. • E.g. chance(ball rolls north on Monday) is undefined.

Examples of doxastically possible credence gaps • A rational agent who knows this could refuse to assign a credence to the ball rolling north on Monday.

Examples of doxastically possible credence gaps • One's own free choices • Kyburg, Gilboa, Spohn, Levi, Price, and Briggs contend that when I am making a choice, I must regard it as free. In doing so, I cannot assign probabilities to my acting in one way rather than another (even though onlookers may be able to do so).

Examples of doxastically possible credence gaps • To be sure, these cases of probability gaps are controversial; but it is noteworthy that these authors are apparently committed to there being further counterexamples to regularity due to credence gaps. • All I need is that it is permissible to leave them as credence gaps.

Ramifications of irregularity for Bayesian epistemology and decision theory • I have argued for two kinds of counterexamples to regularity: rational assignments of zero credences, and rational credence gaps, for doxastic possibilities. • I now want to explore some of the unwelcome consequences these failures of regularity have for traditional Bayesian epistemology and decision theory.

Problems for the conditional probability ratio formula • The ratio analysis of conditional probability: P(A | B) = P(A ∩ B) / P(B), provided P(B) > 0.

Problems for the conditional probability ratio formula • What is the probability that the dart lands on ½, given that it lands on ½? • 1, surely! • But the ratio formula cannot deliver that result, because P(dart lands on ½) = 0.

Problems for the conditional probability ratio formula • Gaps create similar problems. • What is the probability that the ball rolls north on Monday, given that the ball rolls north on Monday? • 1, surely! • But the ratio formula cannot deliver that result, because P(ball rolls north on Monday) is undefined.

Problems for the conditional probability ratio formula • We need a more sophisticated account of conditional probability. • I advocate taking conditional probability as primitive (in the style of Popper and Rényi).

Problems for conditionalization • The zero-probability problem for the conditional probability formula quickly becomes a problem for the updating rule of conditionalization, which is defined in terms of it: Pnew(X) = Pold(X | E) (provided Pold(E) > 0).

Problems for conditionalization • Suppose you learn that the dart lands on ½. What should be your new probability that the dart lands on ½? • 1, surely. • But Pold(dart lands on ½ | dart lands on ½) is undefined, so conditionalization (so defined) cannot give you this advice.

Problems for conditionalization • Gaps create similar problems. • Suppose you learn that the ball rolls north on Monday. What should be your new probability that the ball rolls north on Monday? • 1, surely. • But Pold(rolls north on Monday | rolls north on Monday) is undefined, so conditionalization cannot give you this advice.

Problems for conditionalization • We need a more sophisticated account of conditionalization. • Primitive conditional probabilities to the rescue!

Problems for conditionals • Bennett: "nobody has any use for A → C when for him P(A) = 0".
("Zero intolerance") • How about: "if the dart lands on ½, then it lands on a rational number"?

Problems for conditionals • He likes Adams' Thesis: P(A → C) = P(C | A). • "Believe A → C to the extent that you think A & C is nearly as likely as A … You can do nothing with this in the case where your P(A) = 0" • If P(A) = 0, then for any C, A & C is nearly as likely as A! • But we want to distinguish good instances of A → C from bad.

Problems for conditionals • Bennett could just as easily have insisted on "gap intolerance": • "nobody has any use for A → C when for him P(A) is a gap." • "Believe A → C to the extent that you think A & C is nearly as likely as A …"? • "If the ball rolls north on Monday, it rolls north on Monday."

Problems for independence • We want to capture the idea of A being probabilistically uninformative about B. • A and B are said to be independent just in case P(A ∩ B) = P(A) P(B).

Problems for independence • According to this account of probabilistic independence, anything with probability 0 is independent of itself: If P(X) = 0, then P(X ∩ X) = 0 = P(X)P(X). • But surely identity is the ultimate case of (probabilistic) dependence.

Problems for independence • Suppose you are wondering whether the dart landed on ½. Nothing could be more informative than your learning: the dart landed on ½. • But according to this account of independence, the dart landing on ½ is independent of the dart landing on ½!

Problems for independence • Gaps create similar problems. • Suppose you are wondering whether the ball started rolling north on Monday. Nothing could be more informative than your learning: the ball started rolling north on Monday. • But there is no verdict from this account of independence.

Problems for independence • We need a more sophisticated account of independence – e.g. using primitive conditional probabilities. • Branden Fitelson and I have written about this.

Problems for expected utility theory • Arguably the two most important foundations of decision theory are the notion of expected utility, and dominance reasoning.

Problems for expected utility theory • And yet probability 0 propositions apparently show that expected utility theory and dominance reasoning can give conflicting verdicts.

Problems for expected utility theory • Suppose that two options yield the same utility except on a proposition of probability 0; but if that proposition is true, option 1 is far superior to option 2.

Problems for expected utility theory • You can choose between these two options: – Option 1: If the dart lands on 1/2, you get a million dollars; otherwise you get nothing. – Option 2: You get nothing.

Problems for expected utility theory • Expected utility theory apparently says that these options are equally good: they both have an expected utility of 0. But dominance reasoning says that option 1 is strictly better than option 2. Which is it to be? • I say that option 1 is better. • I think that this is a counterexample to expected utility theory, as it is usually understood. (To be sure, there are replies …)

Problems for expected utility theory • Gaps create similar problems. • You can choose between these two options: – Option 1: If the ball starts rolling north on Monday, you get a million dollars; otherwise you get nothing. – Option 2: You get nothing.

Problems for expected utility theory • Expected utility theory goes silent. • I say that option 1 is better. • We need a more sophisticated decision theory.
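The conflict is easy to exhibit numerically. Here is a minimal sketch, under the stylized assumption (mine) that the dart case can be coarsened to two states with P(lands on 1/2) = 0; the state labels and utility figures are illustrative only.

# States: a coarse partition of the dart outcomes. The probability that the
# dart lands exactly on 1/2 is 0; everything else carries probability 1.
states = {"lands on 1/2": 0.0, "lands elsewhere": 1.0}

# Utilities (in dollars) of the two options in each state.
option1 = {"lands on 1/2": 1_000_000, "lands elsewhere": 0}  # pays off only on the zero-probability state
option2 = {"lands on 1/2": 0,         "lands elsewhere": 0}  # pays nothing, come what may

def expected_utility(option):
    return sum(states[s] * option[s] for s in states)

def weakly_dominates(a, b):
    """a is at least as good as b in every state and strictly better in some state."""
    return all(a[s] >= b[s] for s in states) and any(a[s] > b[s] for s in states)

print(expected_utility(option1), expected_utility(option2))  # 0.0 0.0 -- a tie
print(weakly_dominates(option1, option2))                    # True -- dominance favours option 1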
Closing sermon • Irregularity makes things go bad for the orthodox Bayesian; that is a reason to insist on regularity. • The trouble is that regularity appears to be untenable. • I think, then, that irregularity is a reason for the orthodox Bayesian to become unorthodox. Closing sermon • I have advocated replacing the orthodox theory of conditional probability, conditionalization, and independence with alternatives based on Popper/Rényi functions. Expected utility theory appears to be similarly in need of revision. Closing sermon • Or perhaps some genius will come along one day with an elegant theory that preserves regularity after all. Closing sermon • Or perhaps some genius will come along one day with an elegant theory that preserves regularity after all. • Fingers crossed! Closing sermon • And then there are some possibilities that really should be assigned zero probability … Thanks especially to Rachael Briggs, David Chalmers, John Cusbert, Kenny Easwaran, Branden Fitelson, Renée Hájek, Thomas Hofweber, Leon Leontyev, Aidan Lyon, John Maier, Daniel Nolan, Alexander Pruss, Wolfgang Schwarz, Mike Smithson, Weng Hong Tang, Peter Vranas, Clas Weber, and Sylvia Wenmackers for very helpful comments that led to improvements; to audiences at Stirling, the ANU, the AAP, UBC, Alberta, Rutgers, NYU, Berkeley, Miami, the Lofotens Epistemology conference; to Carl Brusse and Elle Benjamin for help with the slides; and to Tilly. Reply to Thomas •The minimal constraint (MC): events of lowest chance don't happen: – If X has chance 0, X does not happen. •It is reminiscent of Cournot’s Principle: “events of low probability don’t/will not happen”— “low” is understood as 0. Reply to Thomas • The minimal constraint (MC): events of lowest chance don't happen: If X has chance 0, X does not happen. • Some differences with (doxastic) regularity If X is (doxastically) possible, C(X) > 0: – MC makes no mention of ‘possibility’ – There’s nothing doxastic about it. – Gaps are no problem for it, as formulated above. (They may be a problem for a stronger formulation: If X happens, then chance(X) > 0.) Does Thomas think that this is also a conceptual truth? Reply to Thomas • According to MC, the radioactive decay laws are false if understood in terms of real-valued chances. (E.g. a particular radium atom decaying exactly when it does apparently has probability 0.) Reply to Thomas • Is MC a conceptual truth? • (Branden) The conceptual truth may be a claim of comparative probability: If X happens, then chance(X) > chance(contradiction). It would be nice to give a numerical representation of such comparative probability claims, but perhaps that cannot be done. Reply to Thomas • Thomas’s ‘wait-and-see’ approach to the range. • The details will matter: – How exactly does the choice of determine the choice of the range? – What will additivity look like? – Sylvia et al. deliver the details of such a proposal. – Does chance wait and see?! Reply to Thomas • Thomas is a fellow-traveller, insofar as he has to give an account of probability that departs significantly from Kolmogorov’s. Introductory sermon Bruno de Finetti The multiplicative destroyer: problems for independence • 0 is the multiplicative destroyer: multiply anything by it, and you get 0 back. This spells trouble for the usual definition of probabilistic independence. • We want to capture the idea of A being probabilistically uninformative about B. • A and B are said to be independent just in case P(A B) = P(A) P(B). 
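Since 0 annihilates under multiplication, the definition just restated trivializes for probability-zero propositions. Spelled out (a one-line derivation using monotonicity; notation mine):

\[
P(X) = 0 \;\Rightarrow\; P(X \cap Y) \le P(X) = 0 \;\Rightarrow\; P(X \cap Y) = 0 = P(X)\,P(Y) \quad\text{for every } Y,
\]
% so by the multiplication criterion a probability-zero proposition comes out
% "independent" of every proposition: its negation, itself, anything it entails.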
Arguments for regularity • Homage to Eric Clapton (“One Chance”): “If I take the chance of seeing you again, I just don't know what I would do, baby… You had one chance and you blew it. You may never get another chance.” • Clapton is not singing about probability, but rather opportunity – a modal notion. Problems for independence • More generally, according to this account of independence, any proposition with probability 0 is probabilistically independent of anything. This includes: – its negation; – anything that entails it, and anything that it entails. Arguments against regularity, even allowing infinitesimals • Could a single construction handle all ’s at once? • I doubt that, at least for anything recognizably like Kolmogorov’s axiomatization. For recall that a decision about the range of all probability functions is made once and for all, while allowing complete freedom about how large the domains can be. • However rich the range of the probability functions get, I could presumably run the trick above again: fashion a spinner whose landing points come from a domain that is so rich, threatening to thwart regularity once again. Arguments against regularity, even allowing infinitesimals • Could we tailor the range of the probability function to the domain, for each particular application? • The trouble is that the commitment to the range of P comes first: forever more, probability functions will be mappings from sigma algebras to the reals. Or to the hyperreals. Or to the hyperhyperreals. Or to some quite different system… • Try providing an axiomatization along the lines of Kolmogorov’s that has flexibility in the range built into it. “A probability function is a mapping from F to …”—to what? • So I doubt that there could be an argument, still less a proof, still less a proof already provided by Bernstein and Wattenberg, that regularity can be sustained, however much the cardinality of escalates. Examples of doxastically possible credence gaps • “Agents are unlike chances—for example, agents sometimes have to bet, while chances never do!” • If someone coerced you to bet on some chance gap, then sure enough we would witness some betting behaviour. • But it is doubtful that it would reveal anything about your state of mind prior to the coercion. So at best this shows that it would be rational for you to fill a gap, when coerced. • It does not impugn the rationality of having the gap in the first place—and that’s all we need for a counterexample to regularity. Arguments against regularity, even allowing infinitesimals ii. Symmetry constaints on P Williamson’s argument. Dart example • They are not always clear on exactly which version of regularity they are arguing for. Up to a point, it won’t matter. But I think their arguments go through especially well for doxastic possibility. Arguments for regularity “Keep the door open, or at least ajar” –Edwards, Lindman and Savage (1963) Arguments for regularity Lewis (1986, 175-176): [Some] say that things with no chance at all of occurring, that is with probability zero, do nevertheless happen; for instance when a fair spinner stops at one angle instead of another, yet any precise angle has probability zero. I think these people are making a rounding error: they fail to distinguish zero chance from infinitesimal chance. Zero chance is no chance, and nothing with zero chance ever happens. The spinner’s chance of stopping exactly where it did was not zero; it was infinitesimal, and infinitesimal chance is still some chance. 
(My bolding) Arguments for regularity 1. The argument from ordinary language. i) Zero chance is no chance – a zero chance event cannot happen. ii) Each landing point has positive chance, since each can happen, and the rational agent should know this. iii) By the Principal Principle, her corresponding credence should be positive. Hence, she should be regular. Arguments for regularity 1. The argument from ordinary language. I reply: I think that Lewis is trading on a pun. On the one hand, “X has no chance” literally means “X has zero chance”. But “X has no chance” also has the colloquial meaning “X cannot happen”—a modal claim. Arguments for regularity • Sliding from one meaning of “chance” to another is too quick an argument for regularity. • We can even say: “the landing point at noon has a chance (opportunity/possibility) of coming up; but its chance (probability) of doing so is zero.” Arguments for regularity 2. The argument from rounding error. I reply: I am not convinced that the chance is infinitesimal even for the stopping points of the spinner—more on that later—but if it is, then change the example. Arguments for regularity 3. The argument from learning. Lewis writes: “I should like to assume that it makes sense to conditionalize on any but the empty proposition. Therefore, I require that C is regular.” Arguments for regularity I reply: this presupposes that the conditional probabilities that figure in conditionalization are given by the usual ratio formula. But we should adopt a more powerful approach to conditional probability, according to which conditional probabilities can be defined even when the conditions have probability 0. More on that later. Arguments for regularity 4. (Related) The argument from stubbornness. Lewis continues: “The assumption that C is regular will prove convenient, but it is not justified only as a convenience. Also it is required as a condition of reasonableness: one who started out with an irregular credence function (and who then learned from experience by conditionalizing) would stubbornly refuse to believe some propositions no matter what the evidence in their favor.” Arguments for regularity We could strengthen this argument: Having zeroed out a possibility, an irregular agent could never even raise its probability by conditionalization, let alone raise it so high as to believe it, whatever the evidence in its favour. Arguments for regularity I reply: • As Kenny Easwaran observes, this argument presupposes that all evidence that is received initially had positive probability. But if something that initially had probability 0 is learned (by a more powerful form of conditionalization than using the ratio formula), then even zero probability propositions can have their probabilities raised—indeed, all the way to 1. • There are some propositions that you should stubbornly refuse to believe, since you could not get evidence in their favour—for example, that there are no Arguments for regularity 5. An irregular agent is susceptible to a semi-Dutch Book. While considering what it is like for us to be irregular, Skyrms (1980) argues: “If we interpret probability as a fair betting quotient there is a bet which we will consider fair even though we can possibly lose it but cannot possibly win it”. Arguments against regularity I reply: • This argument proves too much. It ‘shows’ that one must never conditionalize. But if you don’t conditionalize, your are susceptible to a DUTCH BOOK! • What is the sense of “possibly lose”? Merely logical or metaphysical? 
If it is doxastically impossible that you will lose the bet, you should not worry. Arguments for regularity 6. Conflating certainty and less than certainty. Williamson: “For subjective Bayesians, probability 1 is the highest possible degree of belief, which presumably is absolute certainty.” We may continue: an agent who violates regularity conflates mental states that should be distinguished, by regarding something that is doxastically possible as if it’s certainly false. Arguments for regularity I reply: As Easwaran points out, an agent’s attitudes are captured by more than just her probability assignments. A proper model of her will attribute to her a probability space <, F, C>. A doxastic possibility for her is represented by a non-empty subset of . It is thus distinguished from the empty set. Not all distinctions in her mental states need be captured solely by C. (I stressed the importance of and F before.) Arguments for regularity 7. Upholding the norm Any agent who violates regularity violates the norm that her credences should reflect her evidence. If X is doxastically possible, then her evidence does not rule it out; but an assignment of zero credence to X treats it as if it is ruled out. Her credence conflates two different evidential situations: one in which X is doxastically live, and another in which it is doxastically dead. Arguments for regularity I reply: This argument wrongly assumes that an agent’s evidential situation with regard to a proposition is revealed only by her unconditional probability assignment to that proposition. Other aspects of her probability function can reveal that she has different attitudes to them: her primitive conditional probabilities! Arguments for regularity 8. Pragmatic argument: the argument from crossed fingers. I offer the friend of regularity a pragmatic argument for it: Bayesian epistemology goes more smoothly if we assume it. Our theorizing about rational mental states faces serious difficulties if we allow irregularity. We had better hope, then, that we can maintain regularity! Arguments for regularity I will reply to this argument at the end, after I’ve made trouble for Bayesian orthodoxy! Arguments against regularity "I believe in an open mind, but not so open that your brains fall out" –Arthur Hays Sulzberger, former publisher of the New York Times Arguments against regularity, even allowing infinitesimals • Times: • First coin: • Second coin: 1 H 2 H H 3 H H 4… H… H… P(unitalicized sequence) = ½ P(bold sequence) P(unitalicized sequence) = P(italicized sequence) P(italicized sequence) = P(bold sequence) So P(unitalicized sequence) = P(bold sequence) So P(unitalicized sequence) = ½ P(unitalicized sequence) So P(unitalicized sequence) = 0. Arguments against regularity, even allowing infinitesimals • There is another answer to the probability of a coin landing heads forever: namely, there is no answer. That is, we may want to allow that this probability is simply undefined: a coin landing heads forever is a probability gap. • This response could suppose for reductio that the probability exists and is positive. The argument that it is half as big as itself forces us to reject this supposition. • Williamson responds by rejecting its second conjunct; this response rejects its first conjunct. Either way, regularity is frustrated. 
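The flattened table above compresses a symmetry argument of Williamson's. One way to spell it out, with event labels of my own: U is the first coin's landing heads at every time 1, 2, 3, …; I is its landing heads at every time from 2 on; B is the second coin's (tossed only from time 2 on) landing heads at every time it is tossed.

\begin{align*}
P(U) &= \tfrac{1}{2}\,P(I)  && \text{(toss 1 is fair and independent of the later tosses)}\\
P(I) &= P(B)                && \text{(heads at times 2, 3, 4, \dots, whichever coin produces them)}\\
P(B) &= P(U)                && \text{(each is an endless run of heads from a fair coin)}\\
\Rightarrow\quad P(U) &= \tfrac{1}{2}\,P(U) \;\Rightarrow\; P(U) = 0,
\end{align*}
% and the last step survives the move to hyperreal values: no positive
% infinitesimal equals half of itself.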
Arguments against regularity, even allowing infinitesimals I am not convinced that this is a good response, but it primes us to look for other cases of the same kind as counterexamples to regularity: cases in which a set of doxastic possibilities fails to get positive credence, because it fails to get credence at all. Staying Regular Alan Hájek Closing open-mindedness, even with infinitesimals • The problems for the analysis of conditional probability, for conditionalization, for the analysis of independence, and for decision theory, seem to be very much alive. Trouble for conditionalization • Previously I endorsed Easwaran’s reply to the argument from stubbornness: if something that initially had probability 0 is learned, then other zero probability propositions can have their probabilities raised. • Now add that if something that initially was a probability gap was learned, then other probability gaps can have their probability become defined. Arguments for regularity “Keep the door open, or at least ajar” –Edwards, Lindman and Savage (1963) Arguments for regularity “Keep the door open, or at least ajar” –Edwards, Lindman and Savage (1963) Examples of doxastically possible credence gaps • It seems, then, that classical mechanics is indeterministic but not chancy: starting from a fixed set of initial conditions, it allows multiple possible futures, but they have no associated chances. • Now perhaps a rational agent may correspondingly not assign them any credences, even though they may be doxastic possibilities for her. • To be sure, she may also assign them credence; but she is not rationally compelled to do so. Arguments for regularity “Keep the door open, or at least ajar” –Edwards, Lindman and Savage (1963) Probability 0 events • So there are various non-trivial and interesting examples of probability 0 events. • They create various philosophical problems, each associated with a peculiar property of the arithmetic of 0. Regularity • Modalities are puzzling. • Probabilities are puzzling twice over. • We start to gain a handle on both binary ‘box’/‘diamond’ modalities and numerical probabilities when we formalize them, with various systems of modal logic for the former, and with Kolmogorov’s axiomatization for the latter. • We would understand both still better if we could provide bridge principles linking them. Statistical significance testing • Null hypothesis H0 vs alternative hypothesis H1 • Reject H0 if, by its lights, the probability of data at least as extreme as that observed is too improbable (typically less than 0.05 or 0.01). • ‘Improbable’ must be relativized to a probability function. Data ‘too good to be true’ • As well as data fitting a given hypothesis too poorly, it can also fit it suspiciously well. – Fisher on Mendel cooking the books in his pea experiment. The probability that by chance alone the data would fit his theory of heredity that well was 0.00003. Closing sermon • One might object that a theory based on Popper functions is more complicated than the elegant, simple theory that we previously had. • But elegance and simplicity are no excuses for the theory’s inadequacies. • Moreover, the theory had to be complicated in any case by the introduction of infinitesimals; and it seems that even they are not enough to overcome the inadequacies. Cournot’s Principle • Cournot’s principle: an event of small probability singled out in advance will not happen. 
• Borel: “The principle that an event with very small probability will not happen is the only law of chance.” Cournot’s Principle • The principle still has some currency, having been recently rehabilitated and defended by Shafer. Cheney’s Principle • “If there’s a 1% chance that Pakistani scientists are helping Al Qaeda build or develop a nuclear weapon, we have to treat it as a certainty in terms of our response.” Cheney’s Principle • USA had to confront a new kind of threat, that of a “low-probability, high-impact event”. Cheney’s Principle • While Cournot effectively rounds down the event’s low probability, treating it as if it’s 0, Cheney rounds it up, treating it as if it’s 1. The improbable in philosophy • Improbable events have earned their keep in philosophy. A concern with improbable events has driven philosophical positions and insights. Examples of doxastically possible credence gaps • The time-reversal of the ball’s trajectory is a Newtonian possibility. • The initial conditions and Newton’s laws do not determine if, when, and in which direction the ball will roll down. • It may roll north on Monday. • But chance(the ball rolls north on Monday) is undefined. • A rational agent may accordingly not assign a credence to the ball rolls north on Monday. The lottery paradox • The lottery paradox puts pressure either on the ‘Lockean thesis’ that rational binary belief corresponds to subjective probability above a threshold, or the closure of rational beliefs under conjunction. The lottery paradox • For the Lockean thesis to have any plausibility, the putative threshold for belief must be high. Accordingly, the probabilities involved in the lottery paradox will be small. • Lotteries cast doubt on Cournot’s principle. • We see an interesting feature of small probabilities: they may accumulate, combining to yield large probabilities. Skepticism about knowledge • Vogel: I know where my car is parked right now. But I don’t know that I am not one of the unlucky people whose car has been stolen during the last few hours. The probable • The improbable is the flip-side of the probable: if p is probable, then not-p is improbable. • Probability 1 events have a special place in probability theory. The probable • Various classic limit theorems are so-called ‘almost sure’ results. They say that various convergences occur with probability 1, rather than with certainty. The probable • An example of the probable doing philosophical work: the Problem of Old Evidence. • According to Bayesian confirmation theory, E confirms H (according to P) iff P(H | E) > P(H). • But if P(E) = 1, then P(H | E) = P(H), and E has no confirmatory power (by these lights). The problem of old evidence • This seems wrong: for example, even when we are certain of the advance of the perihelion of Mercury, this fact seems to support general relativity. Why care about the improbable? • I have already pointed out many ways in which scientists and philosophers do care about the improbable. This does much to build my case that they should care—for many of the examples are important and well motivated. Closing open-mindedness, even with infinitesimals • Lewis: You may protest that there are too many alternative possible worlds to permit regularity. But that is so only if we suppose, as I do not, that the values of the function C are restricted to the standard reals. Many propositions must have infinitesimal C-values, and C(A|B) often will be defined as a quotient of infinitesimals, each infinitely close but not equal to zero. 
(See Bernstein and Wattenberg (1969).) • Brian also cites this important paper. • It shows there is an open-minded probability assignment to the dart experiment (with hyperreal values). Why care about the improbable? • There are specific problems that arise only in virtue of improbability. We want a fully general probability theory that can handle them. • We want a fully general philosophy of probability. Why care about the improbable? • Probability interacts with other things that we care about, and something being improbable can matter to these other things. Why care about the improbable? • There are problems created by low probability events that are similarly created by higher probability events; but when they are improbable we are liable to neglect them. – Skepticism about knowledge (as we saw) – Counterfactuals (as we will see) What is ‘improbable’? • I will count as improbable: 1. Events that have probability 0. 2. Events that have infinitesimal probability—positive, but smaller than every positive real number. 3. Events that have small real-valued probability. • ‘Small’ is vague and context-dependent, but we know clear cases when we see them, and my cases will be clear. 4. Events with imprecise probability, with an improbable upper limit (as above). What is ‘improbable’? • There are various peculiar properties of low probabilities. I want to use them to do some philosophical work. I will go through these properties systematically, showcasing each of them with a philosophical application, a philosophical payoff. Probability 0 events • Much of what’s philosophically interesting about probability 0 events derives from interesting facts about the arithmetic of 0. • Each of its idiosyncrasies motivates a deep philosophical problem. Open-mindedness • To be sure, we could reasonably dismiss probability zero events as 'don't cares' if we could be assured that all probability functions of interest assign 0 only to impossibilities—i.e. they are regular/strictly coherent/open-minded. Open-mindedness • Open-mindedness is part of the folk concept of probability: ‘if it can happen, then it has some chance of happening’. • Open-mindedness has support from some weighty philosophical figures (e.g. Lewis). • We will see how much havoc probability-zero-but-possible events wreak. It would be nice to banish them! Closing open-mindedness? • There are apparently events that have probability 0, but that can happen. Closing open-mindedness? • A fair coin is tossed infinitely many times. The probability that it lands heads every time HHH … is 0 • The probability of each infinite sequence is 0. (We will revisit this claim later, but assume it for now.) You can’t divide by 0: problems for the conditional probability ratio formula • The ratio analysis of conditional probability: … provided P(B) > 0 You can’t divide by 0: problems for the conditional probability ratio formula • What is the probability that the coin lands heads on every toss, given that the coin lands heads on every toss? You can’t divide by 0: problems for the conditional probability ratio formula • What is the probability that the coin lands heads on every toss, given that the coin lands heads on every toss? • 1, surely! You can’t divide by 0: problems for the conditional probability ratio formula • What is the probability that the coin lands heads on every toss, given that the coin lands heads on every toss? • 1, surely! • But the ratio formula cannot deliver that result, because P(coin lands heads on every toss) = 0. 
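To make the complaint vivid in miniature, here is a toy sketch of my own (the two-world coarsening and all names are illustrative assumptions, and the two-place function below is a hand-stipulated fragment, not the Popper–Rényi theory itself): the ratio formula goes silent when the condition has probability 0, while a function that takes conditional probability as primitive can still deliver P(heads forever | heads forever) = 1.

from fractions import Fraction

# A two-world coarsening of the infinite toss space (my own stand-in):
# either the coin lands heads on every toss, or it lands tails at least once.
HEADS_FOREVER = frozenset({"heads on every toss"})
TAILS_SOMETIME = frozenset({"tails on at least one toss"})
OMEGA = HEADS_FOREVER | TAILS_SOMETIME

# Unconditional probabilities, echoing the claim above that P(heads forever) = 0.
P1 = {frozenset(): Fraction(0), HEADS_FOREVER: Fraction(0),
      TAILS_SOMETIME: Fraction(1), OMEGA: Fraction(1)}

def ratio_conditional(A, B):
    """The orthodox ratio analysis: P(A | B) = P(A & B) / P(B)."""
    if P1[B] == 0:
        raise ZeroDivisionError("P(B) = 0, so the ratio formula goes silent")
    return P1[A & B] / P1[B]

# A hand-stipulated fragment of a two-place credence function, in which
# conditional probability is primitive rather than defined as a ratio.
P2 = {
    (HEADS_FOREVER, HEADS_FOREVER): Fraction(1),  # P(heads forever | heads forever) = 1
    (TAILS_SOMETIME, HEADS_FOREVER): Fraction(0),
    (HEADS_FOREVER, OMEGA): Fraction(0),          # agrees with P1 when the condition is OMEGA
}

print(P2[(HEADS_FOREVER, HEADS_FOREVER)])         # 1
try:
    print(ratio_conditional(HEADS_FOREVER, HEADS_FOREVER))
except ZeroDivisionError as err:
    print("ratio formula:", err)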
You can’t divide by 0: problems for the conditional probability ratio formula • There are less trivial examples, too. You can’t divide by 0: problems for the conditional probability ratio formula • There are less trivial examples, too. • The probability that the coin lands heads every toss, given that it lands heads on the second, third, fourth, … tosses is ½. You can’t divide by 0: problems for the conditional probability ratio formula • There are less trivial examples, too. • The probability that the coin lands heads on every toss, given that it lands heads on the second, third, fourth, … tosses is ½. • Again, the ratio formula cannot say this. Trouble for conditionalization • Suppose you learn that the coin landed heads on every toss after the first. What should be your new probability that the coin landed heads on every toss? ½, surely. But Pinitial(heads every toss | heads every toss after first) is undefined, so conditionalization (so defined) cannot give you this advice. Trouble for conditionalization • To be sure, there are some more sophisticated methods for solving these problems. • Bill has written about this topic. • So have various other authors, including myself. – Popper functions – Kolmogorov: conditional probability as a random variable (conditional on a sigma algebra) Trouble for conditionalization • But something must be done – we can’t retain the Bayesian orthodoxy in the face of such cases. • We have to assess the costs of these other approaches. • And there are other, less familiar problems with Bayesian orthodoxy … The multiplicative destroyer: problems for independence • According to this account of probabilistic independence, anything with probability 0 is independent of itself: If P(X) = 0, then P(X X) = 0 = P(X)P(X). • But surely identity is the ultimate case of (probabilistic) dependence. The multiplicative destroyer: problems for independence • Suppose you are wondering whether the coin landed heads on every toss. Nothing could be more informative than your learning: the coin landed heads on every toss. • But according to this account of independence, the coin landing heads on every toss is independent of the coin landing heads on every toss! The multiplicative destroyer: problems for independence • More generally, according to this account of independence, any proposition with probability 0 is probabilistically independent of anything. This includes: – its negation; – anything that entails it, and anything that it entails. The multiplicative destroyer: problems for independence • The ratio account of conditional probability was guilty of a sin of omission. But this account of independence is guilty of a sin of commission. The additive identity: problems for expected utility theory • While 0 is the most potent of all numbers when it comes to multiplication, it’s the most impotent when it comes to addition and subtraction. It’s the additive identity: adding it to any number makes no difference. • This creates problems for decision theory. The additive identity: problems for expected utility theory • Arguably the two most important foundations of decision theory are the notion of expected utility, and dominance reasoning. – Expected utility is a measure of choiceworthiness of an option: the weighted average of the utilities associated with that option in each possible state of the world, the weights given by corresponding probabilities that those states are realized. 
– (Weak) Dominance reasoning says that if one option is at least as good as another in every possible state, and strictly better in at least one possible state, then it is preferable (assuming independence of options and states). The additive identity: problems for expected utility theory • And yet probability 0 propositions show that expected utility theory and dominance reasoning can give conflicting verdicts. The additive identity: problems for expected utility theory • Suppose that two options yield the same utility except on a proposition of probability 0; but if that proposition is true, option 1 is far superior to option 2. Infinitesimal probabilities • Infinitesimals to the rescue? (E.g. from the hyperreals.) • We might say that the probability that the coin lands heads forever is not really 0, but rather an infinitesimal. The additive identity: problems for expected utility theory • Suppose that we toss the coin infinitely many times. You can choose between these two options: – Option 1: If it lands heads on every toss, you get a million dollars; otherwise you get nothing. – Option 2: You get nothing. The additive identity: problems for expected utility theory • Expected utility theory says that these options are equally good: they both have an expected utility of 0. But dominance reasoning says that option 1 is strictly better than option 2. Which is it to be? • I say that option 1 is better. • I think that this is a counterexample to expected utility theory, as it is usually understood. Infinitesimal probabilities • Infinitesimals to the rescue? (E.g. from the hyperreals.) • We might say that the probability that the coin lands heads forever is not really 0, but rather an infinitesimal. Infinitesimal probabilities • Lewis: “Zero chance is no chance, and nothing with zero chance ever happens.” • A version of Cournot’s principle, with zero probability counting as “small probability”. • “… infinitesimal chance is still some chance.” • Likewise, Brian advocates using infinitesimal probabilities. Infinitesimal probabilities • But in the cases considered, aren’t the probabilities really zero?! • Williamson has an argument that the probability of heads forever really is zero, even allowing infinitesimals. Infinitesimal probabilities • We should judge our total theory by its virtues and vices. • We have seen what I take to be some serious vices of orthodox Bayesianism (and of orthodox probability theory more generally). • One way or another, we need to go unorthodox. Real-valued, small positive probabilities • Next we turn to probabilities that are not especially strange in their own right—there is nothing weird about their mathematics. Yet they give rise to a host of philosophical problems in their own right. They are real-valued but small positive probabilities. Most counterfactuals are false • Stare in the face of chance … • ‘If the coin were tossed, it would land heads’ • I submit that it is false. There is no particular way that this chancy process would turn out, were it to be initiated. In the words of Jeffrey “that’s what chance is all about”. To think that there is a fact of the matter of how the coin would land is to misunderstand chance. Most counterfactuals are false • The argument goes through whatever the chance of Tails, as long as it is a possible outcome. Most counterfactuals are false • The argument goes through whatever the chance of Tails, as long as it is a possible outcome. • Or consider a fair lottery. 
‘If you were to play the lottery, you would lose’ is false no matter how many tickets there are in the lottery. Most counterfactuals are false • In an indeterministic world such as ours appears to be, lotteries—in a broad sense—abound. Most counterfactuals are false • In an indeterministic world such as ours appears to be, lotteries—in a broad sense—abound. • The indeterminism reaches medium-sized dry goods. Most counterfactuals are false • In an indeterministic world such as ours appears to be, lotteries—in a broad sense—abound. • The indeterminism reaches medium-sized dry goods. • Even billiard ball collisions, human jumps, … are indeterministic. Most counterfactuals are false • In an indeterministic world such as ours appears to be, lotteries—in a broad sense—abound. • The indeterminism reaches medium-sized dry goods. • Even billiard ball collisions, human jumps, … are indeterministic. • There are subtle issues here! Most counterfactuals are false • Now, there are various reasons why you may not be staring chance in the face… Most counterfactuals are false • Now, there are various reasons why you may not be staring chance in the face… • The trouble is that chance is staring at you. Most counterfactuals are false • Now, there are various reasons why you may not be staring chance in the face… • The trouble is that chance is staring at you. • It is heedless of your ignorance, defiant of your ignorings. Most counterfactuals are false • Once you take seriously what quantum mechanics says, you should see chanciness almost everywhere. The world looks like a huge collection of lotteries. But whether or not you take seriously what the theory says, that’s apparently how the world is. Multiplication by extremely large utilities • Even extremely small positive probabilities can be offset by multiplication by extremely large utilities when calculating expected utilities. Pascal’s Wager Wager for God Wager against God God exists God does not exist ∞ f1 f2 f3 f1, f2, and f3 are finite utilities (no need to specify) Your probability that God exists should be positive. Rationality requires you to maximize expected utility. Therefore, Rationality requires you to wager for God. Pascal’s Wager God exists (p) God does not exist (1 – p) Wager for God ∞ f1 Wager against God f2 f3 Let p be your positive probability for God's existence. Your expected utility of wagering for God is ∞p + f1(1 – p) = ∞ Your expected utility of wagering against God is f2p + f3(1 – p) = some finite value. Therefore, you should wager for God. Pascal’s Wager • But this argument is invalid! • Pascal's specious step is to assume that only the strategy of wagering for God gets the infinite expected utility. • He has ignored all the mixed strategies. Pascal’s Wager • But this still understates Pascal's troubles. For anything that an agent might choose to do may be a mixed strategy between wagering for and wagering against God, for some appropriate probability weights. • For whatever one does, one should apparently assign some positive probability to winding up wagering for God… Pascal’s Wager • By open-mindedness, one has to assign positive probability to such non-Pascalian routes to wagering for God! Pascal’s Wager • By open-mindedness, one has to assign positive probability to such non-Pascalian routes to wagering for God! • By Pascal's lights, it seems everybody enjoys maximal expected utility at all times! Pascal’s Wager • By open-mindedness, one has to assign positive probability to such non-Pascalian routes to wagering for God! 
Pascal’s Wager, reformulated • But Pascal’s Wager can apparently be rendered valid.
Pascal’s Wager, reformulated
                      God exists    God does not exist
  Wager for God           f               f1
  Wager against God       f2              f3
Let p be your positive probability for God's existence. Your expected utility of wagering for God is f·p + f1·(1 – p). Your expected utility of wagering against God is f2·p + f3·(1 – p) = some finite value. If f is large enough, you should wager for God. Pascal’s Wager, reformulated • Some real-world decision problems look rather like this, because they involve sufficiently high stakes (relative to the associated probabilities) … Imprecise probabilities with small upper limit • Sometimes our probabilities are imprecise – e.g. due to lack of relevant information, or conflicting information. • Think of imprecise probabilities as interval-valued: [x, y] • y may be very small. • But when the associated stakes are sufficiently high, there may still be cause for serious concern. Flying in Europe during the volcano eruption • All of Europe’s airports closed because of the risk to flights posed by the eruption of the volcano in Iceland. • The probability of crashes was “small”. • It was also imprecise. • But the stakes were so high that it was wise to cancel the flights. Global warming • Climate scientists differ in their probabilities of various scenarios of global warming. • It seems that our probabilities should be correspondingly imprecise. Global warming [graph] Global warming • We mainly hear about the most likely scenarios, which involve serious, but arguably not catastrophic, consequences. (Certainly, various people argue that they are not catastrophic.) • Perhaps we should be more concerned with much less likely scenarios, but ones that involve truly catastrophic consequences. • This is so even when the corresponding probabilities are imprecise. Global warming [graph] THE END The additive identity: problems for expected utility theory • (Re)interpret expected utility theory so that in the case of ties, the theory is silent? • That’s a different kind of defect: incompleteness. • What would you prefer: $1, or $1? • The theory is silent? • That’s an uncomfortable silence! The additive identity: problems for expected utility theory • Decision theory supplements expected utility theory with further rules? – If two options are identical, it doesn’t matter which you choose. – If option 1 (weakly) dominates option 2, then choose option 1. The additive identity: problems for expected utility theory • There will still be problems. • Which do you prefer? – Option 1: You get a million dollars iff the coin lands HHH … or THH … (you get two tickets) – Option 2: You get a million dollars iff the coin lands TTT … (you get one ticket) • Option 1 is surely better, but we get silence from expected utility theory, and silence from dominance reasoning.
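A minimal numerical sketch of why orthodoxy is silent between Option 1 and Option 2. The truncation lengths and dollar figure are my own illustrative choices.

def prob_specific_sequence(n):
    # P(one fully specified sequence of n fair-coin tosses) = (1/2)^n, which tends to 0;
    # so each fully specific infinite sequence gets real-valued probability 0.
    return 0.5 ** n

for n in (10, 100, 1000):
    print(n, prob_specific_sequence(n))

MILLION = 1_000_000
p_option_1_pays = 0.0   # P(HHH… or THH…) = 0 + 0 for real-valued probability
p_option_2_pays = 0.0   # P(TTT…) = 0

print(MILLION * p_option_1_pays, MILLION * p_option_2_pays)   # 0.0 0.0 — a tie, so silence

# Dominance is silent too: on TTT… Option 2 pays and Option 1 does not, while on HHH…
# and THH… Option 1 pays and Option 2 does not, so neither option weakly dominates the other.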
[Dart-strike table: Option 1 and Option 2 compared according to whether the dart hits a rational or an irrational number] [Number line from 0 to 1: infinitesimals] [Number line from 0 to 1: unmeasurable] [S-curve graphs] [Climate graph] [Equation objects: P(Sn/n → EX) = 1; P(H | E) = P(E | H)·P(H) / P(E); P(A ∩ B) = P(A | B)·P(B)] The Laws of Large Numbers • Let X1, X2, … be i.i.d. random variables. • Let EX be the expectation of these random variables. • Let Sn = X1 + X2 + … + Xn. The Laws of Large Numbers • Strong law of large numbers: – “Probability 1 of convergence” (almost sure convergence) From probability to possibility • There are various entailments in one direction, from positive probability to corresponding notions of possibility. – If something receives positive chance, then it is nomically possible. – If something receives positive credence, then it is epistemically possible (for the relevant agent). – If something receives positive logical probability (given some evidence sentence), then it is logically possible (and consistent with that sentence). From possibility to probability? • The folk seem to think that if something is possible, then it has some positive probability. • But ‘probability 0’ is a distinctive modality, irreducible to other, non-probabilistic modalities. Possibility talk and probability talk are not intertranslatable. From possibility to probability? • A probability function that dignifies every possibility with positive probability is usually said to be regular. • I prefer to call it open-minded. • We can go on to distinguish various senses of open-mindedness, corresponding to the various senses of possibility. • As we will see, it is hard to sustain any version of open-mindedness. 3 grades of probabilistic involvement • Recognition by Ω. • Recognition by F. • Recognition by P. • Nothing that lies outside Ω is countenanced by <Ω, F, P>. Ω is probabilistically certain.
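A toy probability space, with my own labels and numbers, just to make the three grades concrete: an event can fail to be recognized by Ω, be recognized by Ω but not by F (a probability gap), or be recognized by F and still receive probability 0.

omega = {'w1', 'w2', 'w3'}                      # grade 1: recognition by Ω

# grade 2: recognition by F — here a small algebra over Ω
F = {frozenset(), frozenset({'w1'}), frozenset({'w2', 'w3'}), frozenset(omega)}

# grade 3: recognition by P — and recognition can come with probability 0
P = {frozenset(): 0.0, frozenset({'w1'}): 0.0, frozenset({'w2', 'w3'}): 1.0, frozenset(omega): 1.0}

def status(event):
    event = frozenset(event)
    if not event <= omega:
        return 'not recognized by Ω'
    if event not in F:
        return 'in Ω but not in F: a probability gap'
    return P[event]

print(status({'w4'}))        # not recognized by Ω
print(status({'w2'}))        # in Ω but not in F: a probability gap
print(status({'w1'}))        # 0.0: recognized all the way up, yet given zero probability

Both routes to violating open-mindedness, zero probabilities and probability gaps, show up even in this tiny example.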
Jeffrey vs C. I. Lewis • C. I. Lewis: “if anything is to be probable, then something must be certain” • It is ironic that Jeffrey objected to Lewis’s dictum, insisting that there could be “probabilities all the way down to the roots”. • But this is an incoherent position. There couldn’t be probabilities without some underlying certainty. Far from embodying some philosophical error, Lewis’s dictum is a trivial truth! The pluriverse and credences • Let Ω consist of all of the worlds of the Lewisian pluriverse. • Let F consist of all propositions that you can entertain. Some subsets of Ω are too gerrymandered to be the contents of your credences, and so are excluded from F. • And among the propositions that you do entertain, you give some credence 0, and some positive credence. Only the latter reach the third grade of probabilistic involvement. Against regularity • Regularity falls foul of the three grades of probabilistic involvement. There is no easy passage from something being possible to its getting positive probability. Before it can get there, it has to get through three gatekeepers: it has to be recognized by Ω, then by F, then by P. Why care about the improbable? • I have already pointed out many ways in which scientists and philosophers do care about the improbable. This already does much to build my case that they should care—for many of the examples are important and well motivated. Why care about the improbable? • There are specific problems that arise only in virtue of extreme improbability. We want a fully general probability theory that can handle them. • We want a fully general philosophy of probability. • Assigning extremely low probabilities may help us solve paradoxes, and may drive philosophical positions – de Finetti’s lottery: drop countable additivity Why care about the improbable? • A concern for extremely improbable events can stimulate new theoretical developments – Kolmogorov on conditional probability – Popper on conditional probability – Bartha and Johns on relative probability Why care about the improbable? • Probability interacts with other things that we care about, and something being extremely improbable can matter to these other things – e.g. expected utility, Adams-style ‘probabilistic validity’, probabilistic causation, laws of nature Why care about the improbable? • There are problems that are shared with higher probability events, but when they are improbable we are liable to neglect them Elga on Lewisian chances and ‘fit’ • Lewis’s ‘best systems’ analysis of the laws of nature – The laws are the theorems of the theory of the universe that best combines simplicity and strength • Lewis’s ‘best systems’ analysis of chance – The chances are the probabilities assigned by the theory of the universe that best combines simplicity, strength, and fit (the probability that the theory assigns to the actual history). Elga on Lewisian chances and ‘fit’ • Elga: infinite histories will seem to get zero probability from various theories that should be in the running. • Infinitesimals to the rescue? • But for every theory whose chances accord well with the corresponding relative frequencies, there is another theory whose chances accord badly with them, but that fits the actual history better. • This is a strange fact about infinitesimal probabilities.
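To make Elga's first observation concrete, a minimal sketch with my own illustrative chances: any theory that assigns each toss an independent real-valued chance strictly between 0 and 1 gives a fully specific history a probability that shrinks toward 0 as the history lengthens, so an infinite history gets real-valued fit 0.

def fit(chance_per_toss, n_tosses):
    # The probability ('fit') such a theory assigns to one fully specific history of n tosses.
    return chance_per_toss ** n_tosses

for chance in (0.5, 0.9, 0.999):
    print(chance, [fit(chance, n) for n in (10, 100, 10_000)])
# Each row shrinks toward 0 as n grows; in the real-valued limit an infinite history gets fit 0.
# Infinitesimal fit is the proposed rescue that Elga's further argument then puts under pressure.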
Maier on the contingent a priori • Say that a coin is exhaustive just in case it (i) is fair and (ii) has been tossed an infinite number of times. Consider the proposition expressed by: ‘Either there are no exhaustive coins or an exhaustive coin comes up heads at least once.’ This proposition is contingent, yet it has probability 1, and (so the thought goes) it is knowable a priori. Probability 0 events • You can’t divide by 0: Problems for the ratio analysis of conditional probability Trouble for the conditional probability ratio formula • What is the probability that the randomly chosen point lies in the western hemisphere, given that it lies on the equator? • The probability that the point lies on the equator is surely 0, so the ratio formula would have us divide by 0. What is ‘extremely improbable’? • ‘Improbable’ does not mean ‘not probable’. – Something of middling probability is neither improbable nor probable. – We don’t want to conflate ‘low probability’ with ‘no probability’—that is, with the absence, the non-existence of a probability value. Most counterfactuals are false • This discussion recalls Hawthorne and Vogel on skepticism about knowledge. • But there is a crucial disanalogy between knowledge and counterfactuals. Knowledge is factive. This yields a key symmetry-breaker among the relevant possible outcomes. Perhaps you do know of each ticket that it will lose, except the ticket that in fact wins… Most counterfactuals are false • But there is no similar symmetry breaker for counterfactuals. There is no way of privileging a ticket that would have won had the lottery been played. • And so it goes for all the other natural ‘lotteries’ on which my arguments for the falsehood of most counterfactuals have relied. My fascination with the improbable • Conditional probability • Hume on miracles • Pascal’s Wager • The ‘Pasadena game’ • Indeterminate probabilities • Arguments against frequentism • Most counterfactuals are false • “A Poisoned Dart for Conditionals” The Strong Law of Large Numbers • Suppose you are betting on whether a coin lands heads or tails on repeated trials, and that you win $1 for each head and lose $1 for each tail. We keep track of your total earnings Sn over a long sequence of trials, and your average earnings Sn/n. • Roughly, in the long run, Sn/n will converge to the expectation of your winnings on each trial, namely 0. The Strong Law of Large Numbers • The strong law of large numbers says that the long run average converges to the expectation with probability 1. • This is called ‘almost sure’ convergence. • The convergence is not sure. (A short simulation sketch appears below.) Why care about the improbable? • Assigning low probabilities may help us solve paradoxes – Bartha and Hitchcock on ‘the shooting room’ paradox • These solutions to paradoxes may motivate philosophical positions – de Finetti’s lottery, and his argument for rejecting countable additivity What is ‘improbable’? • ‘Improbable’ does not mean ‘not probable’. – Middling probability – Non-existent probability • 3 grades of probabilistic involvement…
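The simulation sketch promised above, for the Strong Law of Large Numbers betting example. The seed and sample sizes are my own illustrative choices.

import random

random.seed(0)                              # illustrative seed, for repeatability
n_tosses = 100_000
total = 0                                   # S_n: total earnings at $1 per head, -$1 per tail
for n in range(1, n_tosses + 1):
    total += 1 if random.random() < 0.5 else -1
    if n in (10, 1_000, 100_000):
        print(n, total / n)                 # the running average S_n / n drifts toward 0

# The strong law says P(S_n / n -> 0) = 1: 'almost sure' convergence.
# It is not sure: the all-heads sequence, on which S_n / n stays at 1 forever, remains possible.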
Against regularity • The strong law of large numbers is only an ‘almost sure’ result. • The long run average can fail to converge to the expectation. – A fair coin tossed infinitely many times can land heads on every toss.
Pascal’s Wager
                      God exists    God does not exist
  Wager for God           ∞               f1
  Wager against God       f2              f3
(f1, f2, and f3 are finite utility values that need not be specified any further.)
• Your probability that God exists should be positive.
• Rationality requires you to perform the act of maximum expected utility (when there is one).
• Therefore, rationality requires you to wager for God.
Problems for open-mindedness • A point is chosen at random from the surface of the earth (thought of as a perfect sphere). Problems for open-mindedness • The probability that it lies on the equator is 0. A uniform probability measure over a sphere must award probabilities to regions in proportion to their area, and the equator has area 0. Trouble for the conditional probability ratio formula • What is the probability that the randomly chosen point lies in the western hemisphere, given that it lies on the equator? • Surely the answer is 1/2. • But the ratio formula cannot deliver that answer: P(western hemisphere | equator) = P(western hemisphere ∩ equator)/P(equator), and P(equator) = 0. Adams on probabilistic validity • Adams believed that indicative conditionals do not have truth conditions. • Thus he found inadequate the traditional account of validity of arguments in which conditionals appear. • But he was happy to speak of 'probabilities' attaching to conditionals. • Roughly, a probabilistically valid argument is one for which it is impossible for the premises to be probable while the conclusion is improbable. ‘Surprising’ evidence and Bayesian confirmation theory • One of the putative success stories of Bayesian confirmation theory is its explanation of why surprising evidence provides especially strong confirmation for a theory that predicts it. • Bayesians have cashed out “surprising” as “improbable”. ‘Surprising’ evidence and Bayesian confirmation theory • Bayes’ theorem: P(H | E) = P(E | H)·P(H) / P(E). • Holding fixed P(E | H) and P(H), there is an inverse relationship between the posterior and P(E). • The more surprising the evidence, the greater its confirmatory power. Why care about the improbable? • A concern for improbable events can stimulate new theoretical developments more generally • Kolmogorov on conditional probability • Popper on conditional probability • Bartha and Johns on relative probability Infinitesimal probabilities • But Williamson shows that the coin landing heads forever must get probability 0 even allowing infinitesimal probabilities. Pascal’s Wager, reformulated • To be sure, there are still problems with the reformulated Wager (e.g. the Many Gods objection). But at least it is valid. Easwaran • Kenny Easwaran’s APA talk. Striking point: the Dutch Book argument could be cast purely in terms of bets you find favorable. • Recast the account of independence, and decision theory, in terms of strict inequalities. • Bad idea for decision theory!
An argument for vegetarianism
                   Eating meat WRONG    Eating meat NOT WRONG
  Eat meat               –f                    f1
  Don’t eat meat         f2                    f3
Let p be your positive probability that eating meat is wrong. Your expected utility of eating meat is –f·p + f1·(1 – p). Your expected utility of not eating meat is f2·p + f3·(1 – p), which may be higher. In that case, you should not eat meat.
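A minimal numerical sketch of this finite-stakes wager structure. The probability and utility values are my own, purely illustrative.

def expected_utility(u_if_wrong, u_if_not_wrong, p_wrong):
    return u_if_wrong * p_wrong + u_if_not_wrong * (1 - p_wrong)

p = 0.01               # a small but positive credence that eating meat is wrong
f = 10_000             # a large (finite) disutility if it is wrong and you eat meat anyway
f1, f2, f3 = 5, 0, 1   # modest finite utilities for the other cells

eu_eat = expected_utility(-f, f1, p)       # -f*p + f1*(1 - p)
eu_dont = expected_utility(f2, f3, p)      # f2*p + f3*(1 - p)
print(eu_eat, eu_dont)                     # -95.05 vs 0.99: don't eat meat

# The same structure drives the reformulated Pascal's Wager and the high-stakes,
# low-probability policy cases (volcanic ash, reactor accidents, climate tails).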
Engineering and risky events • Jet Propulsion Laboratory: the probability of launch failure of the Cassini spacecraft (mission to Saturn) was 1.1 × 10^−3. • According to the US Nuclear Regulatory Commission, the probability of a severe reactor accident in one year is imprecise: [1.1 × 10^−6, 1.1 × 10^−5]. Why care about the improbable? • The improbable plays an important role in our commonsense view of the world. Such events happen all the time (Cournot’s principle notwithstanding). – The probability is allegedly 1/2375 that a golfer on the PGA tour will get a hole in one. Why care about the improbable? • We care about accurately describing and understanding our world – Taking our best science seriously—and it is full of improbable events. Why care about the improbable? • If the relevant probability space is large, there is no avoiding low probabilities. • Indeed, if the space is large enough (uncountable), there is no avoiding 0 probabilities. • In such a space, 0 and 1 are the only values that are guaranteed to be assigned infinitely many times. Popper’s philosophy of science • Popper maintained that the hallmark of a scientific claim was its falsifiability. • But many scientific claims are probabilistic, and probabilistic claims are not falsifiable. • He went on to say that if a piece of evidence is highly improbable by the lights of a theory, then the theory “in practice” rules out that evidence.
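A minimal sketch of the ‘in practice rules out’ idea. The tail event, sample size and cutoff are my own illustrative choices, not Popper's: a probabilistic hypothesis cannot be strictly falsified, but it assigns some outcomes probabilities so low that a methodological rule may treat them as ruled out.

from math import comb

def prob_at_least(n, k, p):
    # P(at least k successes in n independent trials, each with chance p)
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

tail = prob_at_least(100, 95, 0.5)   # what the fair-coin hypothesis gives to 95+ heads in 100 tosses
print(tail)                          # roughly 6e-23

threshold = 1e-9                     # an illustrative methodological cutoff
print(tail < threshold)              # True: the hypothesis 'in practice' rules the outcome out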