Download Slides - Rutgers Statistics


Staying Regular?
Alan Hájek
To be uncertain is to be uncomfortable,
but to be certain is to be ridiculous.
- Chinese proverb
ALI G: So what is the chances that me will eventually die?
C. EVERETT KOOP: That you will die? – 100%. I can
guarantee that 100%: you will die.
ALI G: You is being a bit of a pessimist…
–Ali G, interviewing the
Surgeon General, C. Everett Koop
Satellite view
• My campaign against the usual ratio analysis of
conditional probability (“What CP Could Not Be”…)
• Related, my misgivings about the usual formula(s)
for independence. (“Declarations of Independence”)
• Conditionals (“zero intolerance”)
• Concerns about decision theory (‘Decision Theory In
Crisis’ project)
• ‘The Objects of Probability’ project
• Bridges between traditional and formal epistemology
(“A Tale of Two Epistemologies”)
Satellite view
• They all come together in my concerns about
regularity.
Satellite view
• First instalment: “Is Strict Coherence Coherent?”
• I’m back on the case.
Aerial view
• Many philosophers are scandalized by orthodox
Bayesianism’s unbridled permissiveness…
• So some Bayesians are less permissive...
• I’ll canvass the fluctuating fortunes of a much-touted
constraint, so-called regularity.
Aerial view
• Various versions.
• I’ll massage it, offering what I take to be a more
promising (or less unpromising!) version: a
constraint that bridges doxastic possibility and
doxastic (subjective) probability.
• So understood, regularity promises to offer a
welcome connection between traditional and
Bayesian epistemology.
Aerial view
• I’ll give a general formulation of regularity in terms of
a certain kind of internal harmony within a
probability space.
• I’ll argue that it is untenable.
Aerial view
• There will be two different ways to violate regularity
– zero probabilities
– no probabilities at all (probability gaps).
• Both ways create trouble for pillars of Bayesian
orthodoxy:
– the ratio formula for conditional probability
– conditionalization, characterized with that formula
– the multiplication formula for independence
– expected utility theory
Aerial view
• Whither our theory of rational credence?
View from the trenches: regularity
• Regularity conditions are bridge principles between
modality and probability:
If X is possible, then the probability of X is positive.
• An unmnemonic name, but a commonsensical
idea…
• Versions of regularity as a rationality constraint have
been suggested or advocated by Jeffreys, Jeffrey,
Carnap, Shimony, Kemeny, Edwards, Lindman,
Savage, Stalnaker, Lewis, Skyrms, Jackson, Appiah,
...
Regularity
• Muddy Venn diagram. No bald spots.
Regularity
If X is possible, then the probability of X is positive.
• There are many senses of ‘possible’ in the
antecedent...
• There are also many senses of ‘probability’ in the
consequent…
Regularity
• Pair them up, and we get many, many regularity
conditions.
• Some are interesting, and some are not; some are
plausible, and some are not.
• Focus on pairings that are definitely interesting, and
somewhat plausible, at least initially.
Regularity
• In the consequent, let’s restrict our attention to
rational subjective probabilities.
• In the antecedent? …
Regularity
• Implausible:
Logical Regularity
If X is LOGICALLY possible, then C(X) > 0.
(Shimony, Skyrms)
Regularity
• Problems: There are all sorts of propositions that are
knowable a priori, but whose negations are logically
possible...
• A rational agent may (and perhaps must) give
probability 0 to these negations.
Regularity
• More plausible:
Metaphysical Regularity
If X is METAPHYSICALLY possible, then C(X) > 0.
Regularity
• This brings us to Lewis’s (1980) characterization of
“regularity”: “C(X) is zero … only if X is the empty
proposition, true at no worlds”. (According to Lewis,
X is metaphysically possible iff it is true at some
world.)
• Lewis regards regularity in this sense as a constraint
on “initial” (prior) credence functions of agents as
they begin their Bayesian odysseys—Bayesian
Superbabies.
Regularity
• Problems for metaphysical regularity:
– It is metaphysically possible for no thinking thing to exist …
– An infallible, omniscient God is metaphysically irregular …
– Agents who are infallible over circumscribed domains:
• Metaphysical regularity prohibits luminosity of one’s credences:
If C(X) = x, then C[ C(X) = x ] = 1.
• Metaphysical regularity prohibits certainty about one’s
phenomenal experiences.
Regularity
• However, doxastic possibility seems to be a
promising candidate for pairing with subjective
probability.
• Doxastic regularity:
If X is doxastically possible for a given agent, then
the agent’s subjective probability of X is positive.
Regularity
• We can think of a doxastic possibility for an agent as
something that is compatible with what that agent
believes, though I can allow other understandings...
• Plausibly, her beliefs and degrees of belief are, or at
least should be, connected closely enough to
guarantee this condition.
Regularity
• If doxastic regularity is violated, then offhand two
different attitudes are conflated: one’s attitude to
something one outright disbelieves, and one’s less
committal attitude to something that is still a live
possibility given what one believes ...
Regularity
• Doxastic regularity avoids the problems with the
previous versions…
Regularity
• If this version of regularity fails, then another
interesting version will fail too.
• Epistemic regularity:
If X is epistemically possible for a given agent, then
the agent’s subjective probability of X is positive.
• Epistemic regularity is stronger than doxastic
regularity; so if doxastic regularity fails, epistemic
regularity fails too.
Regularity
• Regularity provides a bridge between traditional and
Bayesian epistemology.
• (But responding to skepticism is one of the main
traditional concerns; far from combating skepticism,
regularity seems to sustain it.)
Regularity
• And yet doxastic regularity appears to be untenable.
Regularity
• I will characterize regularity more generally as a
certain internal harmony of a probability space
⟨Ω, F, P⟩.
• Doxastic regularity will fall out as a special case, as
will epistemic regularity and other related
regularities.
• Then we will be in a good position to undermine
them.
A more general characterization of regularity
• Philosophers tend to think that all of the action in
probability theory concerns probability functions.
• But for mathematicians, the fundamental object of
probability theory is a probability space, a triple of
mathematical entities: ⟨Ω, F, P⟩.
A more general characterization of regularity
– A set of possibilities, which we will designate ‘Ω’—a set of
worlds. (Think of the subsets of Ω—sets of worlds—as the
agent’s set of doxastic possibilities.)
– A field F of subsets of Ω, thought of as propositions that
will be the contents of the agent’s credence assignments.
– A probability function P defined on F.
A more general characterization of regularity
• Regularity is a certain kind of harmony between Ω, F,
and P:
– Between Ω and F: F is the power set of Ω, so every subset
of Ω appears in F. F recognizes every proposition that Ω
recognizes.
– Between F and P: P gives positive probability to every set
in F except the empty set. P recognizes every proposition
that F recognizes (except the empty set).
– In this sense, P mirrors Ω: P’s non-zero probability
assignments correspond exactly to the non-empty subsets
of Ω.
Three grades of probabilistic involvement
• A non-empty set of possibilities may be recognized by
the space by
– (1st grade) being a subset of Ω;
– (2nd grade) being an element of F;
– (3rd grade) receiving positive probability from P.
• These are non-decreasingly committal ways in which
the space may countenance a proposition.
• An agent’s space is regular if these three grades
collapse into one: every non-empty subset of Ω
receives positive probability from P.
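For a finite toy space, the two harmonies can be checked directly. A minimal sketch, assuming a two-world space; the worlds `w1`, `w2`, the uniform measure `P`, and the point-mass `Q` are all invented for illustration:

```python
from fractions import Fraction
from itertools import combinations

def powerset(omega):
    """All subsets of a finite set of worlds, as frozensets."""
    s = sorted(omega)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def is_regular(omega, F, P):
    """Regular iff F is the power set of omega (1st vs 2nd grade)
    and P is positive on every non-empty member of F (2nd vs 3rd grade)."""
    if set(F) != set(powerset(omega)):
        return False
    return all(P[A] > 0 for A in F if A)

omega = {"w1", "w2"}
F = powerset(omega)
# Uniform measure: every non-empty subset gets positive probability.
P = {A: Fraction(len(A), 2) for A in F}
# Point mass on w1: the non-empty set {w2} gets probability 0.
Q = {A: Fraction(1) if "w1" in A else Fraction(0) for A in F}
```

Here `is_regular(omega, F, P)` holds, while `is_regular(omega, F, Q)` fails: `Q` leaves a "bald spot" on the muddy Venn diagram.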
Three grades of probabilistic involvement
• So far, this is all formalism; it cries out for an
interpretation. Again, philosophers of probability have
tended to focus on the interpretation of P …
• But  and F deserve their day in the sun, too.
Three grades of probabilistic involvement
• I will think of the elements of Ω as worlds, the
singletons of Ω as an agent’s maximally specific
doxastic possibilities.
• F will be the privileged sets of such doxastic
possibilities that are contents of the agent’s credence
assignments.
Three grades of probabilistic involvement
• Or start with the doxastic state of a rational agent; it
cries out for a formalism so that we can model it.
• Philosophers of probability have tended to focus on
her probability assignments—real values in [0, 1] that
obey the usual axioms.
• But surprisingly little is said about the contents of
these assignments, and they deserve their day in the
sun too.
Three grades of probabilistic involvement
• They are represented by a set F of privileged subsets
of a set Ω.
Three grades of probabilistic involvement
• Thus, three tunnels with opposite starting points and
heading in opposite directions meet happily in the
middle! …
Three grades of probabilistic involvement
• We have a formalism looking for a philosophical
interpretation and a philosophical interpretation
looking for a formalism happily finding each other.
• Where traditional and Bayesian epistemology have to
a large extent proceeded on separate tracks, this way
they are agreeably linked…
Three grades of probabilistic involvement
• Doxastic regularity is a special case of regularity, one
in which the three grades of probabilistic involvement
collapse into one for a probability space interpreted as
I have.
• There are other interesting special cases, too.
Three grades of probabilistic involvement
• Ω might be the agent’s set of maximally specific
epistemic possibilities, and F the privileged sets of
these possibilities on which she bestows credences.
• Define epistemic regularity in terms of the three
grades of probabilistic involvement collapsing into one
for this space.
Three grades of probabilistic involvement
• Or you are free to give your own interpretation of Ω
and F…
Three grades of probabilistic involvement
• There are two ways in which an agent’s space could
fail to be regular:
– 1) Her probability function assigns zero to some member of
F (other than the empty set). Then her second and third
grades come apart for this proposition.
– 2) Her probability function fails to assign anything to some
subset of Ω, because the subset is not an element of F.
Then her first and second grades come apart for this
proposition.
Three grades of probabilistic involvement
• Those who regard regularity as a norm of rationality
must insist that all instances of 1) and all instances of
2) are violations of rationality.
• I will argue that there are rational instances of both 1)
and 2).
Dart example
Throw a dart at random at the [0, 1] interval…
Dart example
[Figure: the dart board—the number line from 0 to 1]
Dart example
• Any landing point outside this interval is not
countenanced—the corresponding proposition does
not make even the first grade for our space...
Dart example
• Nor do various other possibilities …
Dart example
• Certain subsets of Ω—so-called non-measurable
sets—get no probability assignments whatsoever.
• They achieve the first, but not the second grade.
Dart example
• Various non-empty subsets get assigned probability 0:
• All the singletons
• Indeed, all the finite subsets
• Indeed, all the countable subsets
• Even various uncountable subsets (e.g. Cantor’s ‘ternary set’)
• They achieve the second but not the third grade.
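The dart's measure can be sketched for the simplest events. A toy model, assuming only intervals and finite sets of points; `uniform_prob` is an invented helper, and an interval's probability is just its length (its Lebesgue measure):

```python
def uniform_prob(a, b):
    """P(dart lands in [a, b]) for a uniform dart on [0, 1]:
    the length of the overlap of [a, b] with [0, 1]."""
    lo, hi = max(a, 0.0), min(b, 1.0)
    return max(hi - lo, 0.0)

# An ordinary interval: positive probability (third grade).
p_interval = uniform_prob(0.25, 0.75)

# A singleton is an interval of width zero: second grade, not third.
p_half = uniform_prob(0.5, 0.5)

# A finite set of points is a finite sum of zeros: still probability 0.
p_finite = sum(uniform_prob(x, x) for x in (0.1, 0.25, 0.5))
```

Each landing point is doxastically possible for the ideal agent, yet each singleton gets probability 0, so the space is irregular.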
Dart example
• Examples like these pose a threat to regularity as a
norm of rationality.
• Any landing point in [0, 1] is doxastically possible for
our ideal agent.
• We thus get two routes to irregularity as before, now
interpreted doxastically.
Arguments against regularity
• In order for there to be the kind of harmony within
⟨Ω, F, P⟩ that is captured by regularity, there has to
be a certain harmony between the cardinalities of P’s
domain—namely F—and P’s range.
• If F is too large relative to P’s range, then a failure of
regularity is guaranteed, and this is so without any
further constraints on P.
Arguments against regularity
• Kolmogorov’s axiomatization requires P to be real-valued.
This means that any uncountable probability
space is automatically irregular.
Arguments against regularity
• It is curious that this axiomatization is restrictive
on the range of all probability functions: the real
numbers in [0, 1], and not a richer set;
• yet it is almost completely permissive about their
domains: Ω can be any set you like, however
large, and F can be any field on Ω, however large.
Arguments against regularity
• We can apparently make the set of contents of an
agent’s thoughts as big as we like.
• But we limit the attitudes that she can bear to
those contents—the attitudes can only achieve a
certain fineness of grain.
• Put a rich set of contents together with a relatively
impoverished set of attitudes, and you violate
regularity.
Infinitesimals to the rescue?
The friend of regularity replies: if you’re going to have a
rich domain of the probability function, you’d better have
a rich range.
Lewis:
“You may protest that there are too many alternative possible
worlds to permit regularity. But that is so only if we suppose, as I
do not, that the values of the function C are restricted to the
standard reals. Many propositions must have infinitesimal
C-values, and C(A|B) often will be defined as a quotient of
infinitesimals, each infinitely close but not equal to zero. (See
Bernstein and Wattenberg (1969).)”
Infinitesimals to the rescue?
• I have seen Bernstein and Wattenberg (1969). But this
article does not substantiate Lewis’ strong claim.
Bernstein and Wattenberg show that using the hyperreal
numbers—in particular, infinitesimals—one can give a
regular probability assignment to the landing points of a
fair dart throw, modelled by a random selection from the
[0, 1] interval of the reals.
Infinitesimals to the rescue?
• But that’s a very specific case, with a specific
cardinality!
• We need to be convinced that a similar result holds
for each set of doxastic possibilities, whatever its
cardinality.
• Indeed, that may well be a proper class!
Infinitesimals to the rescue?
• Pruss (MS) shows that if the cardinality of Ω is
greater than that of the range of P, then regularity
fails.
Infinitesimals to the rescue?
• We can scotch regularity even for a hyperreal-valued
probability function by correspondingly
enriching the space of possibilities.
• The dart is thrown at the [0, 1] interval of the
hyperreals.
Infinitesimals to the rescue?
[Figure: the number line from 0 to 1, with the interval
[x − ε/2, x + ε/2] marked around the point x. Not to scale!]
• Each point x is strictly contained within nested
intervals of the form [x – ε/2, x + ε/2] of width ε, for
each infinitesimal ε, whose probabilities are their
lengths, ε again. (This assumption can be somewhat
weakened.)
Infinitesimals to the rescue?
[Figure: the number line from 0 to 1, with the interval
[x − ε/2, x + ε/2] marked around the point x. Not to scale!]
• Each point x is strictly contained within nested
intervals of the form [x – ε/2, x + ε/2] of width ε, for
each infinitesimal ε, whose probabilities are their
lengths, ε again. (This assumption can be somewhat
weakened.)
• So the point’s probability is bounded above by all
these ε, and thus it must be smaller than all of them—
i.e. 0.
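The bounding step can be written out. A hedged reconstruction, assuming P is hyperreal-valued and monotone on nested intervals:

```latex
P(\{x\}) \;\le\; P\big(\,[x - \tfrac{\varepsilon}{2},\, x + \tfrac{\varepsilon}{2}]\,\big) \;=\; \varepsilon
\qquad \text{for every infinitesimal } \varepsilon > 0.
% Suppose P({x}) = d > 0.
% If d is appreciable (non-infinitesimal), it exceeds every infinitesimal e: contradiction.
% If d is infinitesimal, then d/2 is a positive infinitesimal, and the bound gives
% d <= d/2, which is false for d > 0: contradiction again.
% Either way, d = 0: the point's probability is zero after all.
```

So even with hyperreal values available, the singleton's probability collapses to 0 once the space of infinitesimals is rich enough.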
Arguments against regularity,
even allowing infinitesimals
• I envisage a kind of arms race:
– We scotched regularity for real-valued probability functions by
canvassing sufficiently large domains.
– The friends of regularity fought back, enriching their ranges:
making them hyperreal-valued.
– The enemy of regularity counters by enriching the domain.
– And so it goes.
• By Pruss’s result, the enemy can always win (for anything that
looks like Kolmogorov’s probability theory).
Arguments against regularity,
even allowing infinitesimals
• So there are propositions that make it to the
second, but not the third grade of probabilistic
involvement for rational agents: non-empty subsets
of Ω that get assigned probability 0.
Doxastically possible credence gaps
• These are propositions that make it to the first but not
the second grade of probabilistic involvement for
rational agents: propositions that are subsets of Ω,
but that are not elements of F.
Doxastically possible credence gaps
• Decision theory recognizes the possibility of
probability gaps in its distinction between decisions
under risk, and decisions under uncertainty: in the
latter case, probabilities are simply not assigned to
the relevant states of the world.
• More generally: I will argue that you can rationally
have credence gaps.
Examples of doxastically possible credence gaps
• Non-measurable sets
Examples of doxastically possible credence gaps
• Chance gaps
• The Principal Principle says (roughly!!):
your credence in X, conditional on it having chance x,
should be x:
C(X | chance(X) = x) = x.
Examples of doxastically possible credence gaps
• A relative of the Principal Principle? Roughly:
your credence in X, conditional on it being a chance
gap, should be gappy:
C(X | chance(X) is undefined) is undefined.
• All I need is that rationality sometimes permits your
credence to be gappy for a hypothesized chance gap.
Examples of doxastically possible credence gaps
• There are arguably various cases of indeterminism
without chanciness. (Eagle)
Examples of doxastically possible credence gaps
• Norton’s dome
Examples of doxastically possible credence gaps
• This is a case of indeterminism: in the time-reversed
version of the story, the initial conditions and Newton’s
laws do not entail whether, when, or where the ball will roll.
• But there are no chances in this picture.
• E.g. chance(ball rolls north on Monday) is undefined.
Examples of doxastically possible credence gaps
• A rational agent who knows this could refuse to assign
a credence to the ball rolling north on Monday.
Examples of doxastically possible credence gaps
• One’s own free choices
• Kyburg, Gilboa, Spohn, Levi, Price, and Briggs contend
that when I am making a choice, I must regard it as
free. In doing so, I cannot assign probabilities to my
acting in one way rather than another (even though
onlookers may be able to do so).
Examples of doxastically possible credence gaps
• To be sure, these cases of probability gaps are
controversial; but it is noteworthy that these authors
are apparently committed to there being further
counterexamples to regularity due to credence gaps.
• All I need is that it is permissible to leave them as
credence gaps.
Ramifications of irregularity for Bayesian
epistemology and decision theory
• I have argued for two kinds of counterexamples to
regularity: rational assignments of zero credences,
and rational credence gaps, for doxastic possibilities.
• I now want to explore some of the unwelcome
consequences these failures of regularity have for
traditional Bayesian epistemology and decision theory.
Problems for the
conditional probability ratio formula
• The ratio analysis of conditional probability:
P(A | B) = P(A ∩ B) / P(B), provided P(B) > 0
Problems for the
conditional probability ratio formula
• What is the probability that the dart lands on ½,
given that it lands on ½?
• 1, surely!
• But the ratio formula cannot deliver that result,
because P(dart lands on ½) = 0.
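The silence of the ratio formula can be made vivid in a two-world toy model of the dart; the event names and the measure below are invented for illustration:

```python
# Toy model: one world where the dart lands exactly on 1/2, one where it doesn't.
HALF = frozenset({"dart-on-half"})
ELSE = frozenset({"dart-elsewhere"})
OMEGA = HALF | ELSE
P = {frozenset(): 0.0, HALF: 0.0, ELSE: 1.0, OMEGA: 1.0}

def ratio_cp(P, A, B):
    """The ratio analysis: P(A | B) = P(A & B) / P(B),
    undefined when P(B) = 0."""
    if P[B] == 0:
        return None  # the formula goes silent
    return P[A & B] / P[B]

verdict = ratio_cp(P, HALF, HALF)  # None, though intuitively it should be 1
```

The formula delivers a verdict for positive-probability conditions (`ratio_cp(P, ELSE, ELSE)` is 1.0) but none at all for the zero-probability condition.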
Problems for the
conditional probability ratio formula
• Gaps create similar problems.
• What is the probability that the ball rolls north on
Monday, given that the ball rolls north on Monday?
• 1, surely!
• But the ratio formula cannot deliver that result,
because
P(ball rolls north on Monday) is undefined.
Problems for the
conditional probability ratio formula
• We need a more sophisticated account of
conditional probability.
• I advocate taking conditional probability as primitive
(in the style of Popper and Rényi).
Problems for conditionalization
• The zero-probability problem for the conditional
probability formula quickly becomes a problem for
the updating rule of conditionalization, which is
defined in terms of it:
Pnew(X) = Pold(X | E) (provided Pold(E) > 0)
Problems for conditionalization
• Suppose you learn that the dart lands on ½. What
should be your new probability that the dart lands on
½?
• 1 surely.
• But
Pold(dart lands on ½ | dart lands on ½)
is undefined, so conditionalization (so defined) cannot
give you this advice.
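The same toy model shows conditionalization going silent; as before, the event names and measure are invented for illustration, and updating is defined via the ratio formula:

```python
# Toy model: one world where the dart lands exactly on 1/2, one where it doesn't.
HALF = frozenset({"dart-on-half"})
ELSE = frozenset({"dart-elsewhere"})
OMEGA = HALF | ELSE
P_old = {frozenset(): 0.0, HALF: 0.0, ELSE: 1.0, OMEGA: 1.0}

def conditionalize(P_old, E):
    """P_new(X) = P_old(X | E) via the ratio formula;
    no advice when P_old(E) = 0."""
    if P_old[E] == 0:
        return None  # conditionalization, so defined, gives no advice
    return {X: P_old[X & E] / P_old[E] for X in P_old}

advice = conditionalize(P_old, HALF)  # None: you learned HALF, but get no new function
```

Learning the positive-probability evidence works fine (`conditionalize(P_old, ELSE)` returns a new function assigning 1 to `ELSE`); learning the zero-probability evidence leaves the rule mute.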
Problems for conditionalization
• Gaps create similar problems.
• Suppose you learn that the ball rolls north on Monday.
What should be your new probability that the ball rolls
north on Monday?
• 1 surely.
• But
Pold(rolls north on Monday | rolls north on Monday)
is undefined, so conditionalization cannot give you
this advice.
Problems for conditionalization
• We need a more sophisticated account of
conditionalization.
• Primitive conditional probabilities to the rescue!
Problems for conditionals
• Bennett: “nobody has any use for A → C when for him
P(A) = 0”. (“Zero intolerance”)
• How about: “if the dart lands on ½, then it lands on a
rational number”?
Problems for conditionals
• He likes Adams’ Thesis: P(A → C) = P(C | A).
• “Believe A → C to the extent that you think A & C is
nearly as likely as A … You can do nothing with this in
the case where your P(A) = 0”
• If P(A) = 0, then for any C, A & C is nearly as likely as
A!
• But we want to distinguish good instances of A → C
from bad.
Problems for conditionals
• Bennett could just as easily have insisted on “gap
intolerance”:
• “nobody has any use for AC when for him P(A) is a
gap.”
• “Believe A & C to the extent that you think A & C is
nearly as likely as A …”?
• “If the ball rolls north on Monday, it rolls north on
Monday.”
Problems for independence
• We want to capture the idea of A being
probabilistically uninformative about B.
• A and B are said to be independent just in case
P(A ∩ B) = P(A) P(B).
Problems for independence
• According to this account of probabilistic
independence, anything with probability 0 is
independent of itself:
If P(X) = 0, then P(X  X) = 0 = P(X)P(X).
• But surely identity is the ultimate case of
(probabilistic) dependence.
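The self-independence verdict can be computed in the same two-world toy model of the dart (event names and measure invented for illustration):

```python
# Toy model: one world where the dart lands exactly on 1/2, one where it doesn't.
HALF = frozenset({"dart-on-half"})
ELSE = frozenset({"dart-elsewhere"})
P = {frozenset(): 0.0, HALF: 0.0, ELSE: 1.0, HALF | ELSE: 1.0}

def product_independent(P, A, B):
    """Orthodox independence: A and B are independent
    iff P(A & B) = P(A) * P(B)."""
    return P[A & B] == P[A] * P[B]

self_independent = product_independent(P, HALF, HALF)  # True: 0 == 0 * 0
```

On the multiplication formula, the zero-probability event counts as independent of itself, even though identity is the ultimate case of dependence.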
Problems for independence
• Suppose you are wondering whether the dart
landed on ½. Nothing could be more informative
than your learning: the dart landed on ½.
• But according to this account of independence, the
dart landing on ½ is independent of the dart
landing on ½!
Problems for independence
• Gaps create similar problems.
• Suppose you are wondering whether the ball
started rolling north on Monday. Nothing could be
more informative than your learning: the ball
started rolling north on Monday.
• But there is no verdict from this account of
independence.
Problems for independence
• We need a more sophisticated account of
independence – e.g. using primitive conditional
probabilities.
• Branden Fitelson and I have written about this.
Problems for expected utility theory
• Arguably the two most important foundations of
decision theory are the notion of expected utility,
and dominance reasoning.
Problems for expected utility theory
• And yet probability 0 propositions apparently show
that expected utility theory and dominance
reasoning can give conflicting verdicts.
Problems for expected utility theory
• Suppose that two options yield the same utility
except on a proposition of probability 0; but if that
proposition is true, option 1 is far superior to option
2.
Problems for expected utility theory
• You can choose between these two options:
– Option 1: If the dart lands on 1/2, you get a million
dollars; otherwise you get nothing.
– Option 2: You get nothing.
Problems for expected utility theory
• Expected utility theory apparently says that these
options are equally good: they both have an
expected utility of 0. But dominance reasoning
says that option 1 is strictly better than option 2.
Which is it to be?
• I say that option 1 is better.
• I think that this is a counterexample to expected
utility theory, as it is usually understood. (To be
sure, there are replies …)
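The conflict between the two pillars can be computed directly; the state names, the measure, and the utility scale (a million dollars as 1,000,000 utiles) are all invented for illustration:

```python
def expected_utility(P, utilities):
    """Expected utility over a finite partition of states:
    sum of probability times utility."""
    return sum(P[s] * u for s, u in utilities.items())

P = {"dart-on-half": 0.0, "dart-elsewhere": 1.0}
option1 = {"dart-on-half": 1_000_000, "dart-elsewhere": 0}  # a million if the dart hits 1/2
option2 = {"dart-on-half": 0, "dart-elsewhere": 0}          # nothing, come what may

eu1 = expected_utility(P, option1)  # 0.0
eu2 = expected_utility(P, option2)  # 0.0

# Yet option 1 weakly dominates option 2: at least as good in every state,
# strictly better in the zero-probability state.
dominates = (all(option1[s] >= option2[s] for s in P)
             and any(option1[s] > option2[s] for s in P))
```

Expected utility theory ranks the options as equal; dominance reasoning ranks option 1 strictly better.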
Problems for expected utility theory
• Gaps create similar problems.
• You can choose between these two options:
– Option 1: If the ball starts rolling north on Monday, you
get a million dollars; otherwise you get nothing.
– Option 2: You get nothing.
Problems for expected utility theory
• Expected utility theory goes silent.
• I say that option 1 is better.
• We need a more sophisticated decision theory.
Closing sermon
• Irregularity makes things go bad for the orthodox
Bayesian; that is a reason to insist on regularity.
• The trouble is that regularity appears to be untenable.
• I think, then, that irregularity is a reason for the
orthodox Bayesian to become unorthodox.
Closing sermon
• I have advocated replacing the orthodox theory of
conditional probability, conditionalization, and
independence with alternatives based on
Popper/Rényi functions. Expected utility theory
appears to be similarly in need of revision.
Closing sermon
• Or perhaps some genius will come along one day with
an elegant theory that preserves regularity after all.
Closing sermon
• Or perhaps some genius will come along one day with
an elegant theory that preserves regularity after all.
• Fingers crossed!
Closing sermon
• And then there are some possibilities that really
should be assigned zero probability …
Thanks especially to Rachael Briggs, David Chalmers, John Cusbert, Kenny Easwaran,
Branden Fitelson, Renée Hájek, Thomas Hofweber, Leon Leontyev, Aidan Lyon, John Maier,
Daniel Nolan, Alexander Pruss, Wolfgang Schwarz, Mike Smithson, Weng Hong Tang, Peter
Vranas, Clas Weber, and Sylvia Wenmackers for very helpful comments that led to
improvements; to audiences at Stirling, the ANU, the AAP, UBC, Alberta, Rutgers, NYU,
Berkeley, Miami, the Lofotens Epistemology conference; to Carl Brusse and Elle Benjamin for
help with the slides; and to Tilly.
Reply to Thomas
• The minimal constraint (MC): events of lowest
chance don't happen:
– If X has chance 0, X does not happen.
• It is reminiscent of Cournot’s Principle: “events of low
probability don’t/will not happen”—where “low” is understood
as 0.
Reply to Thomas
• The minimal constraint (MC): events of lowest
chance don't happen:
If X has chance 0, X does not happen.
• Some differences with (doxastic) regularity
If X is (doxastically) possible, C(X) > 0:
– MC makes no mention of ‘possibility’
– There’s nothing doxastic about it.
– Gaps are no problem for it, as formulated above. (They
may be a problem for a stronger formulation:
If X happens, then chance(X) > 0.)
Does Thomas think that this is also a conceptual truth?
Reply to Thomas
• According to MC, the radioactive decay laws are
false if understood in terms of real-valued chances.
(E.g. a particular radium atom decaying exactly when
it does apparently has probability 0.)
Reply to Thomas
• Is MC a conceptual truth?
• (Branden) The conceptual truth may be a claim of
comparative probability:
If X happens, then chance(X) > chance(contradiction).
It would be nice to give a numerical representation
of such comparative probability claims, but perhaps
that cannot be done.
Reply to Thomas
• Thomas’s ‘wait-and-see’ approach to the range.
• The details will matter:
– How exactly does the choice of  determine the choice of
the range?
– What will additivity look like?
– Sylvia et al. deliver the details of such a proposal.
– Does chance wait and see?!
Reply to Thomas
• Thomas is a fellow-traveller, insofar as he has to give
an account of probability that departs significantly
from Kolmogorov’s.
Introductory sermon
Bruno de Finetti
The multiplicative destroyer: problems for
independence
• 0 is the multiplicative destroyer: multiply anything by
it, and you get 0 back. This spells trouble for the
usual definition of probabilistic independence.
• We want to capture the idea of A being
probabilistically uninformative about B.
• A and B are said to be independent just in case
P(A ∩ B) = P(A) P(B).
Arguments for regularity
• Homage to Eric Clapton (“One Chance”):
“If I take the chance of seeing you again,
I just don't know what I would do, baby…
You had one chance and you blew it.
You may never get another chance.”
• Clapton is not singing about probability, but rather
opportunity – a modal notion.
Problems for independence
• More generally, according to this account of
independence, any proposition with probability 0 is
probabilistically independent of anything. This
includes:
– its negation;
– anything that entails it, and anything that it entails.
Arguments against regularity,
even allowing infinitesimals
• Could a single construction handle all Ω’s at once?
• I doubt that, at least for anything recognizably like Kolmogorov’s
axiomatization. For recall that a decision about the range of all
probability functions is made once and for all, while allowing
complete freedom about how large the domains can be.
• However rich the range of the probability functions gets, I could
presumably run the trick above again: fashion a spinner whose
landing points come from a domain so rich that it threatens to
thwart regularity once again.
Arguments against regularity,
even allowing infinitesimals
• Could we tailor the range of the probability function to the domain,
for each particular application?
• The trouble is that the commitment to the range of P comes first:
forever more, probability functions will be mappings from sigma
algebras to the reals. Or to the hyperreals. Or to the
hyper-hyperreals. Or to some quite different system…
• Try providing an axiomatization along the lines of Kolmogorov’s
that has flexibility in the range built into it. “A probability function is
a mapping from F to …”—to what?
• So I doubt that there could be an argument, still less a proof, still
less a proof already provided by Bernstein and Wattenberg, that
regularity can be sustained, however much the cardinality of Ω
escalates.
Examples of doxastically possible credence gaps
• “Agents are unlike chances—for example, agents
sometimes have to bet, while chances never do!”
• If someone coerced you to bet on some chance gap,
then sure enough we would witness some betting
behaviour.
• But it is doubtful that it would reveal anything about
your state of mind prior to the coercion. So at best this
shows that it would be rational for you to fill a gap,
when coerced.
• It does not impugn the rationality of having the gap in
the first place—and that’s all we need for a
counterexample to regularity.
Arguments against regularity,
even allowing infinitesimals
ii. Symmetry constraints on P
Williamson’s argument.
Dart example
• They are not always clear on exactly which version of
regularity they are arguing for. Up to a point, it won’t
matter. But I think their arguments go through
especially well for doxastic possibility.
Arguments for regularity
“Keep the door open, or at least ajar” –Edwards,
Lindman and Savage (1963)
Arguments for regularity
Lewis (1986, 175-176):
[Some] say that things with no chance at all of occurring, that is
with probability zero, do nevertheless happen; for instance when a
fair spinner stops at one angle instead of another, yet any precise
angle has probability zero. I think these people are making a
rounding error: they fail to distinguish zero chance from
infinitesimal chance. Zero chance is no chance, and nothing
with zero chance ever happens. The spinner’s chance of
stopping exactly where it did was not zero; it was infinitesimal, and
infinitesimal chance is still some chance. (My bolding)
Arguments for regularity
1. The argument from ordinary language.
i) Zero chance is no chance – a zero chance event
cannot happen.
ii) Each landing point has positive chance, since each
can happen, and the rational agent should know this.
iii) By the Principal Principle, her corresponding
credence should be positive.
Hence, she should be regular.
Arguments for regularity
1. The argument from ordinary language.
I reply: I think that Lewis is trading on a pun.
On the one hand, “X has no chance” literally means
“X has zero chance”. But “X has no chance” also has
the colloquial meaning “X cannot happen”—a modal
claim.
Arguments for regularity
• Sliding from one meaning of “chance” to another is
too quick an argument for regularity.
• We can even say: “the landing point at noon has a
chance (opportunity/possibility) of coming up; but its
chance (probability) of doing so is zero.”
Arguments for regularity
2. The argument from rounding error.
I reply: I am not convinced that the chance is
infinitesimal even for the stopping points of the
spinner—more on that later—but if it is, then change
the example.
Arguments for regularity
3. The argument from learning.
Lewis writes:
“I should like to assume that it makes sense to
conditionalize on any but the empty proposition.
Therefore, I require that C is regular.”
Arguments for regularity
I reply: this presupposes that the conditional probabilities
that figure in conditionalization are given by the usual
ratio formula. But we should adopt a more powerful
approach to conditional probability, according to which
conditional probabilities can be defined even when the
conditions have probability 0.
More on that later.
Arguments for regularity
4. (Related) The argument from stubbornness.
Lewis continues:
“The assumption that C is regular will prove convenient,
but it is not justified only as a convenience. Also it is
required as a condition of reasonableness: one who
started out with an irregular credence function (and who
then learned from experience by conditionalizing) would
stubbornly refuse to believe some propositions no matter
what the evidence in their favor.”
Arguments for regularity
We could strengthen this argument: Having zeroed out a
possibility, an irregular agent could never even raise its
probability by conditionalization, let alone raise it so high
as to believe it, whatever the evidence in its favour.
Arguments for regularity
I reply:
• As Kenny Easwaran observes, this argument
presupposes that all evidence that is received initially
had positive probability. But if something that initially had
probability 0 is learned (by a more powerful form of
conditionalization than using the ratio formula), then
even zero probability propositions can have their
probabilities raised—indeed, all the way to 1.
• There are some propositions that you should
stubbornly refuse to believe, since you could not get
evidence in their favour—for example, that there are no
Arguments for regularity
5. An irregular agent is susceptible to a semi-Dutch
Book.
While considering what it is like for us to be irregular,
Skyrms (1980) argues: “If we interpret probability as a
fair betting quotient there is a bet which we will consider
fair even though we can possibly lose it but cannot
possibly win it”.
Arguments for regularity
I reply:
• This argument proves too much. It ‘shows’ that one
must never conditionalize. But if you don’t
conditionalize, you are susceptible to a DUTCH
BOOK!
• What is the sense of “possibly lose”? Merely logical
or metaphysical? If it is doxastically impossible that
you will lose the bet, you should not worry.
Arguments for regularity
6. Conflating certainty and less than certainty.
Williamson: “For subjective Bayesians, probability 1 is
the highest possible degree of belief, which presumably
is absolute certainty.”
We may continue: an agent who violates regularity
conflates mental states that should be distinguished, by
regarding something that is doxastically possible as if it’s
certainly false.
Arguments for regularity
I reply: As Easwaran points out, an agent’s attitudes are
captured by more than just her probability assignments.
A proper model of her will attribute to her a probability
space <, F, C>. A doxastic possibility for her is
represented by a non-empty subset of . It is thus
distinguished from the empty set. Not all distinctions in
her mental states need be captured solely by C.
(I stressed the importance of  and F before.)
Arguments for regularity
7. Upholding the norm
Any agent who violates regularity violates the norm that
her credences should reflect her evidence. If X is
doxastically possible, then her evidence does not rule it
out; but an assignment of zero credence to X treats it as
if it is ruled out. Her credence conflates two different
evidential situations: one in which X is doxastically live,
and another in which it is doxastically dead.
Arguments for regularity
I reply: This argument wrongly assumes that an agent’s
evidential situation with regard to a proposition is
revealed only by her unconditional probability
assignment to that proposition. Other aspects of her
probability function can reveal that she has different
attitudes to them: her primitive conditional probabilities!
Arguments for regularity
8. Pragmatic argument: the argument from crossed
fingers.
I offer the friend of regularity a pragmatic argument for it:
Bayesian epistemology goes more smoothly if we
assume it. Our theorizing about rational mental states
faces serious difficulties if we allow irregularity. We had
better hope, then, that we can maintain regularity!
Arguments for regularity
I will reply to this argument at the end, after I’ve made
trouble for Bayesian orthodoxy!
Arguments against regularity
"I believe in an open mind, but not so open that your
brains fall out"
–Arthur Hays Sulzberger,
former publisher of
the New York Times
Arguments against regularity,
even allowing infinitesimals
• Times:        1  2  3  4 …
• First coin:   H  H  H  H …
• Second coin:     H  H  H …
Let A = the first coin lands heads at times 1, 2, 3, 4, …
Let B = the first coin lands heads at times 2, 3, 4, …
Let C = the second coin lands heads at times 2, 3, 4, …
P(A) = ½ P(B) (A requires one extra fair toss to land heads)
P(A) = P(C) (the two infinite sequences of fair tosses are isomorphic)
P(C) = P(B) (same times, symmetric coins)
So P(A) = P(B)
So P(A) = ½ P(A)
So P(A) = 0.
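The halving step can be checked numerically; a minimal Python sketch (my illustration; `prob_all_heads` is a hypothetical helper, not from the slides):

```python
from fractions import Fraction

# The probability of n consecutive heads for a fair coin halves with every
# extra toss, and the only real number equal to half of itself is 0.
def prob_all_heads(n: int) -> Fraction:
    """Probability that a fair coin lands heads on each of the first n tosses."""
    return Fraction(1, 2) ** n

for n in (1, 10, 100):
    # Heads-from-toss-1 is half as probable as heads-from-toss-2-onwards.
    assert prob_all_heads(n) == Fraction(1, 2) * prob_all_heads(n - 1)

# If p = (1/2) p for a real number p, then p = 0: so the limiting value is 0.
p = 0.0
assert p == Fraction(1, 2) * p
```

Standard reals leave no room for a positive value equal to half of itself; hence the pressure toward infinitesimals, or toward probability 0.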
Arguments against regularity,
even allowing infinitesimals
• There is another answer to the probability of a coin
landing heads forever: namely, there is no answer.
That is, we may want to allow that this probability is
simply undefined: a coin landing heads forever is a
probability gap.
• This response could suppose for reductio that the
probability exists and is positive. The argument that it
is half as big as itself forces us to reject this
supposition.
• Williamson responds by rejecting its second conjunct;
this response rejects its first conjunct. Either way,
regularity is frustrated.
Arguments against regularity,
even allowing infinitesimals
I am not convinced that this is a good response, but it
primes us to look for other cases of the same kind as
counterexamples to regularity: cases in which a set of
doxastic possibilities fails to get positive credence,
because it fails to get credence at all.
Staying Regular
Alan Hájek
Closing open-mindedness, even with infinitesimals
• The problems for the analysis of conditional probability, for
conditionalization, for the analysis of independence, and for
decision theory, seem to be very much alive.
Trouble for conditionalization
• Previously I endorsed Easwaran’s reply to the
argument from stubbornness: if something that
initially had probability 0 is learned, then other zero
probability propositions can have their probabilities
raised.
• Now add that if something that initially was a
probability gap is learned, then other probability
gaps can have their probabilities become defined.
Examples of doxastically possible credence gaps
• It seems, then, that classical mechanics is
indeterministic but not chancy: starting from a fixed
set of initial conditions, it allows multiple possible
futures, but they have no associated chances.
• Now perhaps a rational agent may correspondingly
not assign them any credences, even though they may
be doxastic possibilities for her.
• To be sure, she may also assign them credence; but
she is not rationally compelled to do so.
Probability 0 events
• So there are various non-trivial and interesting
examples of probability 0 events.
• They create various philosophical problems, each
associated with a peculiar property of the
arithmetic of 0.
Regularity
• Modalities are puzzling.
• Probabilities are puzzling twice over.
• We start to gain a handle on both binary
‘box’/‘diamond’ modalities and numerical
probabilities when we formalize them, with various
systems of modal logic for the former, and with
Kolmogorov’s axiomatization for the latter.
• We would understand both still better if we could
provide bridge principles linking them.
Statistical significance testing
• Null hypothesis H0 vs alternative hypothesis H1
• Reject H0 if, by its lights, the probability of data at
least as extreme as that observed is too small
(typically less than 0.05 or 0.01).
• ‘Improbable’ must be relativized to a probability
function.
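The rejection rule can be made concrete with an exact binomial computation; a Python sketch with invented data (16 heads in 20 tosses) and H0 = fair coin:

```python
from math import comb

# One-sided p-value under H0 (fair coin): the probability of data at least
# as extreme as that observed. The observed data are invented for illustration.
def p_value(heads: int, tosses: int) -> float:
    """P(at least `heads` heads in `tosses` tosses of a fair coin)."""
    return sum(comb(tosses, k) for k in range(heads, tosses + 1)) / 2 ** tosses

pv = p_value(16, 20)
assert pv < 0.05  # improbable by H0's lights: reject at the 0.05 level
```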
Data ‘too good to be true’
• As well as data fitting a given hypothesis too poorly,
it can also fit it suspiciously well.
– Fisher on Mendel cooking the books in his pea
experiment. The probability that by chance alone the
data would fit his theory of heredity that well was
0.00003.
Closing sermon
• One might object that a theory based on Popper
functions is more complicated than the elegant,
simple theory that we previously had.
• But elegance and simplicity are no excuses for the
theory’s inadequacies.
• Moreover, the theory had to be complicated in any
case by the introduction of infinitesimals; and it
seems that even they are not enough to overcome the
inadequacies.
Cournot’s Principle
• Cournot’s principle: an event of small probability
singled out in advance will not happen.
• Borel: “The principle that an event with very small
probability will not happen is the only law of
chance.”
Cournot’s Principle
• The principle still has some currency, having been
recently rehabilitated and defended by Shafer.
Cheney’s Principle
• “If there’s a 1% chance
that Pakistani
scientists are helping
Al Qaeda build or
develop a nuclear
weapon, we have to
treat it as a certainty in
terms of our
response.”
Cheney’s Principle
• USA had to confront a
new kind of threat, that
of a “low-probability,
high-impact event”.
Cheney’s Principle
• While Cournot effectively rounds down the event’s
low probability, treating it as if it’s 0, Cheney
rounds it up, treating it as if it’s 1.
The improbable in philosophy
• Improbable events have earned their keep in
philosophy. A concern with improbable events has
driven philosophical positions and insights.
Examples of doxastically possible credence gaps
• The time-reversal of the ball’s trajectory is a
Newtonian possibility.
• The initial conditions and Newton’s laws do not
determine if, when, and in which direction the ball will
roll down.
• It may roll north on Monday.
• But chance(the ball rolls north on Monday) is
undefined.
• A rational agent may accordingly not assign a
credence to the ball rolls north on Monday.
The lottery paradox
• The lottery paradox puts pressure either on the
‘Lockean thesis’ that rational binary belief
corresponds to subjective probability above a
threshold, or the closure of rational beliefs under
conjunction.
The lottery paradox
• For the Lockean thesis to have any plausibility, the
putative threshold for belief must be high.
Accordingly, the probabilities involved in the lottery
paradox will be small.
• Lotteries cast doubt on Cournot’s principle.
• We see an interesting feature of small
probabilities: they may accumulate, combining to
yield large probabilities.
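The accumulation point can be made vivid with a toy computation (all numbers illustrative):

```python
# Toy lottery: each ticket's chance of winning is tiny, yet the tiny
# chances sum to 1 -- some ticket is certain to win.
n = 1_000_000
p_win = 1 / n                      # small probability for each ticket
total = n * p_win                  # the small probabilities accumulate
assert abs(total - 1.0) < 1e-9     # combining to a large probability

threshold = 0.99                   # a high Lockean threshold
assert (1 - p_win) > threshold     # yet each 'this ticket loses' clears it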
Skepticism about knowledge
• Vogel: I know where my car is parked right now.
But I don’t know that I am not one of the unlucky
people whose car has been stolen during the last
few hours.
The probable
• The improbable is the flip-side of the probable: if p
is probable, then not-p is improbable.
• Probability 1 events have a special place in
probability theory.
The probable
• Various classic limit theorems are so-called ‘almost
sure’ results. They say that various convergences
occur with probability 1, rather than with certainty.
The probable
• An example of the probable doing philosophical
work: the Problem of Old Evidence.
• According to Bayesian confirmation theory,
E confirms H (according to P) iff P(H | E) > P(H).
• But if P(E) = 1, then P(H | E) = P(H), and E has no
confirmatory power (by these lights).
The problem of old evidence
• This seems wrong: for example, even when we are
certain of the advance of the perihelion of Mercury,
this fact seems to support general relativity.
Why care about the improbable?
• I have already pointed out many ways in which
scientists and philosophers do care about the
improbable. This does much to build my case that
they should care—for many of the examples are
important and well motivated.
Closing open-mindedness, even with infinitesimals
• Lewis:
You may protest that there are too many alternative possible
worlds to permit regularity. But that is so only if we suppose, as
I do not, that the values of the function C are restricted to the
standard reals. Many propositions must have infinitesimal
C-values, and C(A|B) often will be defined as a quotient of
infinitesimals, each infinitely close but not equal to zero. (See
Bernstein and Wattenberg (1969).)
• Brian also cites this important paper.
• It shows there is an open-minded probability assignment to the
dart experiment (with hyperreal values).
Why care about the improbable?
• There are specific problems that arise only in
virtue of improbability. We want a fully general
probability theory that can handle them.
• We want a fully general philosophy of
probability.
Why care about the improbable?
• Probability interacts with other things that we care
about, and something being improbable can matter
to these other things.
Why care about the improbable?
• There are problems created by low probability
events that are similarly created by higher
probability events; but when they are improbable
we are liable to neglect them.
– Skepticism about knowledge (as we saw)
– Counterfactuals (as we will see)
What is ‘improbable’?
• I will count as improbable:
1. Events that have probability 0.
2. Events that have infinitesimal probability—positive, but
smaller than every positive real number.
3. Events that have small real-valued probability.
• ‘Small’ is vague and context-dependent, but we know clear cases
when we see them, and my cases will be clear.
4. Events with imprecise probability, with an improbable
upper limit (as above).
What is ‘improbable’?
• There are various peculiar properties of low
probabilities. I want to use them to do some
philosophical work. I will go through these
properties systematically, showcasing each of
them with a philosophical application, a
philosophical payoff.
Probability 0 events
• Much of what’s philosophically interesting about
probability 0 events derives from interesting facts
about the arithmetic of 0.
• Each of its idiosyncrasies motivates a deep
philosophical problem.
Open-mindedness
• To be sure, we could reasonably dismiss probability
zero events as 'don't cares' if we could be assured
that all probability functions of interest assign 0
only to impossibilities—i.e. they are regular/strictly
coherent/open-minded.
Open-mindedness
• Open-mindedness is part of the folk concept of probability:
‘if it can happen, then it has some chance of happening’.
• Open-mindedness has support from some weighty
philosophical figures (e.g. Lewis).
• We will see how much havoc probability-zero-but-possible
events wreak. It would be nice to banish them!
Closing open-mindedness?
• There are apparently events that have probability 0, but
that can happen.
Closing open-mindedness?
• A fair coin is tossed infinitely many times. The probability
that it lands heads every time
HHH …
is 0
• The probability of each infinite sequence is 0. (We will
revisit this claim later, but assume it for now.)
You can’t divide by 0: problems for the
conditional probability ratio formula
• The ratio analysis of conditional probability:
P(A | B) = P(A ∩ B) / P(B), provided P(B) > 0
You can’t divide by 0: problems for the
conditional probability ratio formula
• What is the probability that the coin lands heads on
every toss, given that the coin lands heads on every
toss?
• 1, surely!
• But the ratio formula cannot deliver that result,
because P(coin lands heads on every toss) = 0.
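The failure mode can be sketched in a few lines of Python (my illustration; `ratio_conditional` is a hypothetical helper, not a standard API):

```python
# Why the ratio analysis goes silent at probability 0.
def ratio_conditional(p_a_and_b: float, p_b: float):
    """Ratio analysis: P(A|B) = P(A and B) / P(B), defined only when P(B) > 0."""
    if p_b == 0:
        return None  # the analysis has nothing to say here
    return p_a_and_b / p_b

# P(heads on every toss) = 0, so P(all heads | all heads) is undefined by
# the ratio formula -- even though the answer should surely be 1.
assert ratio_conditional(0.0, 0.0) is None
# An unproblematic case for comparison: 0.25 / 0.5 = 0.5.
assert ratio_conditional(0.25, 0.5) == 0.5
```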
You can’t divide by 0: problems for the
conditional probability ratio formula
• There are less trivial examples, too.
• The probability that the coin lands heads on every
toss, given that it lands heads on the second, third,
fourth, … tosses is ½.
• Again, the ratio formula cannot say this.
Trouble for conditionalization
• Suppose you learn that the coin landed heads on
every toss after the first. What should be your new
probability that the coin landed heads on every toss?
½, surely. But
Pinitial(heads every toss | heads every toss after first)
is undefined, so conditionalization (so defined)
cannot give you this advice.
Trouble for conditionalization
• To be sure, there are some more sophisticated
methods for solving these problems.
• Bill has written about this topic.
• So have various other authors, including myself.
– Popper functions
– Kolmogorov: conditional probability as a random
variable (conditional on a sigma algebra)
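For concreteness, here is a toy Python sketch (my construction, not Popper's actual formalism) of treating conditional probability as a primitive two-place function, so that conditioning on probability-0 propositions need not be undefined:

```python
# Store P(A|B) directly as a two-place assignment rather than deriving it
# from unconditional probabilities. The propositions and values encode the
# coin examples from the slides.
P2 = {
    ("all_heads", "all_heads"): 1.0,             # P(A|A) = 1, as it surely should be
    ("all_heads", "heads_from_toss_2_on"): 0.5,  # the less trivial case
}

def cond_prob(a: str, b: str) -> float:
    """Primitive conditional probability: looked up, not computed by a ratio."""
    return P2[(a, b)]

assert cond_prob("all_heads", "all_heads") == 1.0
assert cond_prob("all_heads", "heads_from_toss_2_on") == 0.5
```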
Trouble for conditionalization
• But something must be done – we can’t retain the
Bayesian orthodoxy in the face of such cases.
• We have to assess the costs of these other
approaches.
• And there are other, less familiar problems with
Bayesian orthodoxy …
The multiplicative destroyer: problems for
independence
• According to this account of probabilistic
independence, anything with probability 0 is
independent of itself:
If P(X) = 0, then P(X ∩ X) = 0 = P(X)P(X).
• But surely identity is the ultimate case of
(probabilistic) dependence.
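The trivial satisfaction of the product definition is easy to verify (a minimal sketch; the function is my illustration):

```python
def independent(p_x: float, p_y: float, p_x_and_y: float) -> bool:
    """Standard account: X and Y are independent iff P(X and Y) = P(X) P(Y)."""
    return p_x_and_y == p_x * p_y

# If P(X) = 0, then P(X and X) = 0 = P(X) * P(X): the account counts X as
# independent of itself -- the ultimate case of dependence.
assert independent(0.0, 0.0, 0.0)
# For contrast: a fair-coin event is not independent of itself.
assert not independent(0.5, 0.5, 0.5)
```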
The multiplicative destroyer: problems for
independence
• Suppose you are wondering whether the coin
landed heads on every toss. Nothing could be more
informative than your learning: the coin landed
heads on every toss.
• But according to this account of independence, the
coin landing heads on every toss is independent of
the coin landing heads on every toss!
The multiplicative destroyer: problems for
independence
• More generally, according to this account of
independence, any proposition with probability 0 is
probabilistically independent of anything. This
includes:
– its negation;
– anything that entails it, and anything that it entails.
The multiplicative destroyer: problems for
independence
• The ratio account of conditional probability was
guilty of a sin of omission. But this account of
independence is guilty of a sin of commission.
The additive identity: problems for expected
utility theory
• While 0 is the most potent of all numbers when it
comes to multiplication, it’s the most impotent
when it comes to addition and subtraction. It’s the
additive identity: adding it to any number makes no
difference.
• This creates problems for decision theory.
The additive identity: problems for expected
utility theory
• Arguably the two most important foundations of
decision theory are the notion of expected utility,
and dominance reasoning.
– Expected utility is a measure of choiceworthiness of an
option: the weighted average of the utilities associated
with that option in each possible state of the world, the
weights given by corresponding probabilities that those
states are realized.
– (Weak) Dominance reasoning says that if one option is
at least as good as another in every possible state, and
strictly better in at least one possible state, then it is
preferable (assuming independence of options and
states).
The additive identity: problems for expected
utility theory
• And yet probability 0 propositions show that
expected utility theory and dominance reasoning
can give conflicting verdicts.
The additive identity: problems for expected
utility theory
• Suppose that two options yield the same utility
except on a proposition of probability 0; but if that
proposition is true, option 1 is far superior to option
2.
Infinitesimal probabilities
• Infinitesimals to the rescue? (E.g. from the
hyperreals.)
• We might say that the probability that the coin
lands heads forever is not really 0, but rather an
infinitesimal.
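What an infinitesimal is supposed to be can be sketched with a toy two-component number a + b·ε compared lexicographically (my simplification, not a genuine hyperreal construction such as Bernstein and Wattenberg's):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hyper:
    """A toy number real + eps_coeff * epsilon, ordered lexicographically."""
    real: float
    eps_coeff: float = 0.0  # coefficient of the infinitesimal part

    def __lt__(self, other: "Hyper") -> bool:
        return (self.real, self.eps_coeff) < (other.real, other.eps_coeff)

epsilon = Hyper(0.0, 1.0)   # a positive infinitesimal
zero = Hyper(0.0)

assert zero < epsilon                    # epsilon is positive...
for r in (1.0, 1e-9, 1e-300):
    assert epsilon < Hyper(r)            # ...yet smaller than every positive real
```

On such a picture, 'heads forever' could receive a positive but infinitesimal value rather than 0.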
The additive identity: problems for expected
utility theory
• Suppose that we toss the coin infinitely many
times. You can choose between these two options:
– Option 1: If it lands heads on every toss, you get a
million dollars; otherwise you get nothing.
– Option 2: You get nothing.
The additive identity: problems for expected
utility theory
• Expected utility theory says that these options are
equally good: they both have an expected utility of
0. But dominance reasoning says that option 1 is
strictly better than option 2. Which is it to be?
• I say that option 1 is better.
• I think that this is a counterexample to expected
utility theory, as it is usually understood.
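The conflict can be computed directly (a sketch; the payoffs come from the options above, the helper function is mine):

```python
def expected_utility(probs, utils):
    """Weighted average of utilities, weights given by state probabilities."""
    return sum(p * u for p, u in zip(probs, utils))

p_all_heads = 0.0  # probability-0 but possible state: heads on every toss
states = [p_all_heads, 1 - p_all_heads]

eu_option1 = expected_utility(states, [1_000_000, 0])  # million if all heads
eu_option2 = expected_utility(states, [0, 0])          # nothing either way

# Expected utility theory calls it a tie: 0 is the additive identity, so the
# million dollars in the probability-0 state contributes nothing.
assert eu_option1 == eu_option2 == 0.0
```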
Infinitesimal probabilities
• Lewis: “Zero chance is no chance, and nothing with
zero chance ever happens.”
• A version of Cournot’s principle, with zero
probability counting as “small probability”.
• “… infinitesimal chance is still some chance.”
• Likewise, Brian advocates using infinitesimal
probabilities.
Infinitesimal probabilities
• But in the cases considered, aren’t the
probabilities really zero?!
• Williamson has an argument that the probability of
heads forever really is zero, even allowing
infinitesimals.
Infinitesimal probabilities
• We should judge our total theory by its virtues and
vices.
• We have seen what I take to be some serious vices
of orthodox Bayesianism (and of orthodox
probability theory more generally).
• One way or another, we need to go unorthodox.
Real-valued, small positive probabilities
• Next we turn to probabilities that are not especially
strange in their own right—there is nothing weird
about their mathematics. Yet they give rise to a
host of philosophical problems in their own right.
They are real-valued but small positive
probabilities.
Most counterfactuals are false
• Stare in the face of chance …
• ‘If the coin were tossed, it would land heads’
• I submit that it is false. There is no particular way
that this chancy process would turn out, were it to
be initiated. In Jeffrey's words, “that’s what
chance is all about”. To think that there is a fact of
the matter of how the coin would land is to
misunderstand chance.
Most counterfactuals are false
• The argument goes through whatever the chance
of Tails, as long as it is a possible outcome.
• Or consider a fair lottery. ‘If you were to play the
lottery, you would lose’ is false no matter how
many tickets there are in the lottery.
Most counterfactuals are false
• In an indeterministic world such as ours appears to
be, lotteries—in a broad sense—abound.
• The indeterminism reaches medium-sized dry
goods.
• Even billiard ball collisions, human jumps, … are
indeterministic.
• There are subtle issues here!
Most counterfactuals are false
• Now, there are various reasons why you may not
be staring chance in the face…
• The trouble is that chance is staring at you.
• It is heedless of your ignorance, defiant of your
ignorings.
Most counterfactuals are false
• Once you take seriously what quantum mechanics
says, you should see chanciness almost
everywhere. The world looks like a huge collection
of lotteries. But whether or not you take seriously
what the theory says, that’s apparently how the
world is.
Multiplication by extremely large utilities
• Even extremely small positive probabilities can be
offset by multiplication by extremely large utilities
when calculating expected utilities.
Pascal’s Wager
                    God exists    God does not exist
Wager for God           ∞               f1
Wager against God       f2              f3
f1, f2, and f3 are finite utilities (no need to specify)
Your probability that God exists should be positive.
Rationality requires you to maximize expected utility.
Therefore,
Rationality requires you to wager for God.
Pascal’s Wager
                    God exists (p)    God does not exist (1 – p)
Wager for God            ∞                     f1
Wager against God        f2                    f3
Let p be your positive probability for God's existence.
Your expected utility of wagering for God is
∞p + f1(1 – p) = ∞
Your expected utility of wagering against God is
f2p + f3(1 – p) = some finite value.
Therefore, you should wager for God.
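The two expected utilities can be reproduced with Python's `math.inf` (the finite values for f1, f2, f3 are arbitrary placeholders; any positive p gives the same verdict):

```python
import math

p = 0.001                     # your positive probability that God exists
f1, f2, f3 = -10.0, 5.0, 0.0  # arbitrary finite utilities

eu_for = math.inf * p + f1 * (1 - p)      # inf times a positive real is inf
eu_against = f2 * p + f3 * (1 - p)

assert eu_for == math.inf                 # infinite expected utility
assert math.isfinite(eu_against)          # merely finite
assert eu_for > eu_against
```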
Pascal’s Wager
• But this argument is invalid!
• Pascal's specious step is to assume that only the
strategy of wagering for God gets the infinite
expected utility.
• He has ignored all the mixed strategies.
Pascal’s Wager
• But this still understates Pascal's troubles. For anything
that an agent might choose to do may be a mixed strategy
between wagering for and wagering against God, for some
appropriate probability weights.
• For whatever one does, one should apparently assign some
positive probability to winding up wagering for God…
Pascal’s Wager
• By open-mindedness, one has to assign positive probability
to such non-Pascalian routes to wagering for God!
• By Pascal's lights, it seems everybody enjoys maximal
expected utility at all times!
• A dilemma: If a Pascalian agent is open-minded, all
practical reasoning is useless; if not, the earlier theoretical
problems (for conditional probability, conditionalization,
independence and decision theory) are alive and well!
Pascal’s Wager, reformulated
• But Pascal’s Wager can apparently be rendered
valid.
Pascal’s Wager, reformulated
                    God exists    God does not exist
Wager for God            f               f1
Wager against God        f2              f3
Let p be your positive probability for God's existence.
Your expected utility of wagering for God is
fp + f1(1 – p)
Your expected utility of wagering against God is
f2p + f3(1 – p) = some finite value.
If f is large enough, you should wager for God.
Pascal’s Wager, reformulated
Some real-world decision problems look rather like
this, because they involve sufficiently high stakes
(relative to the associated probabilities) …
Imprecise probabilities with small upper limit
• Sometimes our probabilities are imprecise – e.g.
due to lack of relevant information, or conflicting
information.
• Think of imprecise probabilities as interval-valued:
[x, y]
• y may be very small.
• But when the associated stakes are sufficiently
high, there may still be cause for serious concern.
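A toy calculation of how a small-but-imprecise probability interacts with high stakes (all numbers invented for illustration):

```python
# Interval-valued probability [x, y] of a catastrophe, with a huge loss L.
x, y = 0.0, 1e-6        # even the upper limit y is 'small'
loss = 1e9               # but the stakes are enormous

lower_expected_loss = x * loss
upper_expected_loss = y * loss

assert lower_expected_loss == 0.0
assert abs(upper_expected_loss - 1000.0) < 1e-6  # serious cause for concern
```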
Flying in Europe during the volcano eruption
• All of Europe’s airports closed because of the risk
to flights posed by the eruption of the volcano in
Iceland.
• The probability of crashes was “small”.
• It was also imprecise.
• But the stakes were so high that it was wise to
cancel the flights.
Global warming
• Climate scientists differ in their probabilities of
various scenarios of global warming.
• It seems that our probabilities should be
correspondingly imprecise.
Global warming
Global warming
• We mainly hear about the most likely scenarios, which
involve serious consequences, but arguably not
catastrophic. (Certainly, various people argue that they are
not catastrophic.)
• Perhaps we should be more concerned with much less
likely scenarios, but ones that involve truly catastrophic
consequences.
• This is so even when the corresponding probabilities are
imprecise.
Global warming
THE END
The additive identity: problems for expected
utility theory
• (Re)interpret expected utility theory so that in the
case of ties, the theory is silent?
• That’s a different kind of defect: incompleteness.
• What would you prefer: $1, or $1?
• The theory is silent?
• That’s an uncomfortable silence!
The additive identity: problems for expected
utility theory
• Decision theory supplements expected utility
theory with further rules?
– If two options are identical, it doesn’t matter which you
choose.
– If option 1 (weakly) dominates option 2, then choose
option 1.
The additive identity: problems for expected
utility theory
• There will still be problems.
• Which do you prefer?:
– Option 1: You get a million dollars iff the coin lands
HHH … or THH … (you get two tickets)
– Option 2: You get a million dollars iff the coin lands
TTT ...
(you get one ticket)
• Option 1 is surely better, but we get silence from
expected utility theory, and silence from
dominance reasoning.
[Dart-strike table: payoffs for Option 1 and Option 2 in the states ‘dart hits a rational number’ and ‘dart hits an irrational number’]
[Number line from 0 to 1 – infinitesimals]
[Number line from 0 to 1 – unmeasurable]
[s-curves]
[climate graph]
[Eqn/object]
Strong law: P(Sn/n → EX) = 1
Weak law: for any ε > 0, P(|Sn/n – EX| < ε) → 1 as n → ∞
Bayes’ theorem: P(H | E) = P(E | H) P(H) / P(E)
Ratio formula: P(A | B) = P(A ∩ B) / P(B)
[Graphic]
Bayes’ theorem
• P(H | E) = P(E | H) P(H) / P(E)
The Laws of Large Numbers
• Let X1, X2… be i.i.d. random variables
• Let EX be the expectation of these random
variables.
• Let Sn = X1 + X2 + … + Xn
The Laws of Large Numbers
• Strong law of large numbers:
–  “Probability 1 of convergence” (almost sure convergence)
From probability to possibility
• There are various entailments in one direction,
from positive probability to corresponding notions
of possibility.
– If something receives positive chance, then it is nomically possible.
– If something receives positive credence, then it is epistemically
possible (for the relevant agent).
– If something receives positive logical probability (given some
evidence sentence), then it is logically possible (and consistent with
that sentence).
From possibility to probability?
• The folk seem to think that if something is
possible, then it has some positive probability.
• But ‘probability 0’ is a distinctive modality,
irreducible to other, non-probabilistic modalities.
Possibility talk and probability talk are not intertranslatable.
From possibility to probability?
• A probability function that dignifies every possibility
with positive probability is usually said to be
regular.
• I prefer to call it open-minded.
• We can go on to distinguish various senses of
open-mindedness, corresponding to the various
senses of possibility.
• As we will see, it is hard to sustain any version of
open-mindedness.
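In the finite case, open-mindedness (regularity) is straightforward to state and check. A minimal sketch (the function and example names are mine, not the slides'):

```python
from fractions import Fraction

def is_regular(credences):
    """A credence function over a finite outcome space is regular
    (open-minded) iff every outcome receives positive probability."""
    return all(prob > 0 for prob in credences.values())

# A regular credence function over a three-outcome space:
open_minded = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

# A non-regular one: outcome "c" is entertained but gets credence 0.
dogmatic = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}

print(is_regular(open_minded))  # True
print(is_regular(dogmatic))     # False
```

The finite sketch does not generalize: on an uncountable space, a uniform real-valued probability measure must assign 0 to each point, so no such measure can be regular.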
3 grades of probabilistic involvement
• Recognition by Ω
• Recognition by F.
• Recognition by P.
3 grades of probabilistic involvement
• Nothing that lies outside Ω is countenanced by ⟨Ω,
F, P⟩. Ω is probabilistically certain.
Jeffrey vs C. I. Lewis
• C. I. Lewis: “if anything is to be probable, then something
must be certain”
• It is ironic that Jeffrey objected to Lewis’s dictum, insisting
that there could be “probabilities all the way down to the
roots”.
• But this is an incoherent position. There couldn’t be
probabilities without some underlying certainty. Far from
embodying some philosophical error, Lewis’s dictum is a
trivial truth!
The pluriverse and credences
• Let  consist of all of the worlds of the Lewisian pluriverse.
• Let F consist of all propositions that you can entertain.
Some subsets of  are too gerrymandered to be the
contents of your credences, and so are excluded from F.
• And among the propositions that you do entertain, you give
some credence 0, and some positive credence. Only the
latter reach the third grade of probabilistic involvement.
Against regularity
• Regularity falls foul of the three grades of probabilistic
involvement. There is no easy passage from something
being possible to its getting positive probability. Before it
can get there, it has to get through three gatekeepers: it
has to be recognized by Ω, then by F, then by P.
Why care about the improbable?
• I have already pointed out many ways in which
scientists and philosophers do care about the
improbable. This already does much to build my
case that they should care—for many of the
examples are important and well motivated.
Why care about the improbable?
• There are specific problems that arise only in virtue of
extreme improbability. We want a fully general probability
theory that can handle them.
• We want a fully general philosophy of probability.
• Assigning extremely low probabilities may help us solve
paradoxes, and may drive philosophical positions
– de Finetti’s lottery: drop countable additivity
Why care about the improbable?
• A concern for extremely improbable events can
stimulate new theoretical developments
– Kolmogorov on conditional probability
– Popper on conditional probability
– Bartha and Johns on relative probability
Why care about the improbable?
• Probability interacts with other things that we care
about, and something being extremely improbable
can matter to these other things – e.g. expected
utility, Adams-style ‘probabilistic validity’,
probabilistic causation, laws of nature
Why care about the improbable?
• There are problems that are shared with higher
probability events, but when they are improbable
we are liable to neglect them
Elga on Lewisian chances and ‘fit’
• Lewis’s ‘best systems’ analysis of the laws of nature
– The laws are the theorems of the theory of the universe that
best combines simplicity and strength
• Lewis’s ‘best systems analysis of chance’
– The chances are the probabilities assigned by the theory of
the universe that best combines simplicity, strength, and fit
(the probability that the theory assigns to the actual history).
Elga on Lewisian chances and ‘fit’
• Elga: infinite histories will seem to get zero probability
from various theories that should be in the running.
• Infinitesimals to the rescue?
• But for every theory whose chances accord well with the
corresponding relative frequencies, there is another
theory whose chances accord badly with them, but that
fits the actual history better.
• This is a strange fact about infinitesimal probabilities.
Maier on the contingent a priori
• Say that a coin is exhaustive just in case it (i) is fair
and (ii) has been tossed an infinite number of
times. Consider the proposition expressed by:
‘Either there are no exhaustive coins or an
exhaustive coin comes up heads at least once.’
This proposition is contingent, it has probability
1, and it is knowable a priori.
Probability 0 events
• You can’t divide by 0: the ratio formula
P(A | B) = P(A ∩ B)/P(B) is undefined when P(B) = 0.
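The point can be exhibited directly: applied to a zero-probability condition, the ratio formula asks for a division by zero. A minimal illustration (the function name is mine):

```python
def conditional(p_a_and_b, p_b):
    """Ratio formula: P(A | B) = P(A ∩ B) / P(B)."""
    return p_a_and_b / p_b

print(conditional(0.25, 0.5))  # 0.5

# When P(B) = 0, the formula delivers no value at all:
try:
    conditional(0.0, 0.0)
except ZeroDivisionError:
    print("P(A | B) undefined when P(B) = 0")
```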
Problems for the ratio analysis of conditional
probability
Trouble for the conditional probability ratio formula
• What is the probability that
the randomly chosen point
lies in the western
hemisphere, given that it lies
on the equator?
Trouble for the conditional probability ratio formula
• What is the probability that
the randomly chosen point
lies in the western
hemisphere, given that it lies
on the equator?
• Surely the answer is 1/2.
What is ‘extremely improbable’?
• ‘Improbable’ does not mean ‘not probable’.
– Something of middling probability is neither improbable
nor probable.
– We don’t want to conflate ‘low probability’ with ‘no
probability’—that is, with the absence, the non-existence
of a probability value.
Most counterfactuals are false
• This discussion recalls Hawthorne and Vogel on
skepticism about knowledge.
• But there is a crucial disanalogy between
knowledge and counterfactuals. Knowledge is
factive. This yields a key symmetry-breaker among
the relevant possible outcomes. Perhaps you do
know of each ticket that it will lose, except the
ticket that in fact wins…
Most counterfactuals are false
• But there is no similar symmetry breaker for
counterfactuals. There is no way of privileging a
ticket that would have won had the lottery been
played.
• And so it goes for all the other natural ‘lotteries’ on
which my arguments for the falsehood of most
counterfactuals have relied.
My fascination with the improbable
• Conditional probability
• Hume on miracles
• Pascal’s Wager
• The ‘Pasadena game’
• Indeterminate probabilities
• Arguments against frequentism
• Most counterfactuals are false
• “A Poisoned Dart for Conditionals”
The Strong Law of Large Numbers
• Suppose you are betting on whether a coin lands
heads or tails on repeated trials, and that you win $1
for each head and lose $1 for each tail. We keep track
of your total earnings Sn over a long sequence of trials,
and your average earnings Sn/n.
• Roughly, in the long run, Sn/n will converge to the
expectation of your winnings on each trial, namely 0.
The Strong Law of Large Numbers
• The strong law of large numbers says that the long
run average converges to the expectation with
probability 1.
• This is called ‘almost sure’ convergence.
• The convergence is not sure.
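A quick simulation of the betting setup: the average winnings drift toward the expectation 0, though any single run could in principle do otherwise. A sketch (all names are mine):

```python
import random

random.seed(0)  # fixed seed for reproducibility

def average_winnings(n_tosses):
    """Bet $1 on each fair coin toss: +1 for heads, -1 for tails.
    Return the average winnings S_n / n."""
    total = sum(random.choice([1, -1]) for _ in range(n_tosses))
    return total / n_tosses

# The average tends toward the expectation, 0, as n grows:
for n in [10, 1000, 100000]:
    print(n, average_winnings(n))
```

Note that this illustrates, but cannot prove, 'almost sure' convergence: a finite simulation cannot distinguish probability-1 convergence from sure convergence.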
Why care about the improbable?
• Assigning low probabilities may help us solve
paradoxes
– Bartha and Hitchcock on ‘the shooting room’ paradox
• These solutions to paradoxes may motivate
philosophical positions
– de Finetti’s lottery, and his argument for rejecting countable
additivity
What is ‘improbable’?
• ‘Improbable’ does not mean ‘not probable’.
– Middling probability
– Non-existent probability
• 3 grades of probabilistic involvement…
Against regularity
• The strong law of large numbers is only an ‘almost
sure’ result.
• The long run average can fail to converge to the
expectation.
– A fair coin tossed infinitely many times can land heads
on every toss.
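The all-heads sequence shows how possibility and probability come apart: every finite prefix has positive probability 2⁻ⁿ, but that probability shrinks to 0 in the limit. A sketch:

```python
from fractions import Fraction

def prob_all_heads(n):
    """Probability that a fair coin lands heads on each of n tosses."""
    return Fraction(1, 2) ** n

print(prob_all_heads(10))          # 1/1024
print(float(prob_all_heads(100)))  # tiny, but still positive

# Each finite prefix gets positive probability, yet the limit is 0:
# the infinite all-heads sequence is possible but has probability 0.
```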
Pascal’s Wager
                    God exists   God does not exist
Wager for God           ∞               f1
Wager against God       f2              f3
(f1, f2, and f3 are finite utility values that need not be specified
any further.)
• Your probability that God exists should be positive.
• Rationality requires you to perform the act of
maximum expected utility (when there is one).
Therefore,
Rationality requires you to wager for God.
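With any positive credence p, the expected-utility comparison behind the Wager can be mechanized; IEEE infinity stands in for infinite utility, and the finite payoffs are arbitrary illustrative stand-ins for f1, f2, f3. A sketch:

```python
import math

def expected_utility(p_god, u_if_god, u_if_no_god):
    """EU of an act given credence p_god that God exists."""
    return p_god * u_if_god + (1 - p_god) * u_if_no_god

f1, f2, f3 = -1.0, 0.0, 2.0  # arbitrary finite payoffs (my choices)
p = 0.001                    # any positive credence will do

eu_wager_for = expected_utility(p, math.inf, f1)
eu_wager_against = expected_utility(p, f2, f3)

print(eu_wager_for)      # inf: any positive p makes this infinite
print(eu_wager_against)  # finite
```

However small p is, as long as it is positive the wager-for expectation is infinite and swamps any finite alternative, which is exactly why regularity matters to the argument.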
Problems for open-mindedness
• A point is chosen at random from the surface of
the earth (thought of as a perfect sphere).
Problems for open-mindedness
• The probability that it lies on the equator is 0. A
uniform probability measure over a sphere must
award probabilities to regions in proportion to their
area, and the equator has area 0.
Trouble for the conditional probability ratio formula
• What is the probability that
the randomly chosen point
lies in the western
hemisphere, given that it lies
on the equator?
• Surely the answer is 1/2.
• But the ratio formula cannot
deliver that answer.
Adams on probabilistic validity
• Adams believed that indicative conditionals do not
have truth conditions.
• Thus he found inadequate the traditional account of
validity of arguments in which conditionals appear.
• But he was happy to speak of 'probabilities' attaching
to conditionals.
• Roughly, a probabilistically valid argument is one for
which it is impossible for the premises to be probable
while the conclusion is improbable.
‘Surprising’ evidence and Bayesian
confirmation theory
• One of the putative success stories of Bayesian
confirmation theory is its explanation of why
surprising evidence provides especially strong
confirmation for a theory that predicts it.
• Bayesians have cashed out “surprising” as
“improbable”.
‘Surprising’ evidence and Bayesian
confirmation theory
• Bayes’ theorem: P(H | E) = P(E | H) P(H) / P(E)
• Holding fixed P(E | H) and P(H), there is an inverse
relationship between the posterior and P(E).
• The more surprising the evidence, the greater its
confirmatory power.
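The inverse relationship is easy to exhibit: hold P(E | H) and P(H) fixed and lower P(E), and the posterior rises. A sketch (names are mine):

```python
def posterior(p_e_given_h, p_h, p_e):
    """Bayes' theorem: P(H | E) = P(E | H) P(H) / P(E)."""
    return p_e_given_h * p_h / p_e

p_e_given_h, p_h = 1.0, 0.1  # H predicts E outright; modest prior

# The more surprising (less probable) the evidence,
# the higher the posterior:
for p_e in [0.9, 0.5, 0.2]:
    print(p_e, posterior(p_e_given_h, p_h, p_e))
```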
Why care about the improbable?
• A concern for improbable events can stimulate new
theoretical developments more generally
• Kolmogorov on conditional probability
• Popper on conditional probability
• Bartha and Johns on relative probability
Infinitesimal probabilities
• But Williamson shows that the coin landing heads
forever must get probability 0 even allowing
infinitesimal probabilities.
Pascal’s Wager, reformulated
• To be sure, there are still problems with the reformulated
Wager (e.g. the Many Gods objection). But at least it is
valid.
Easwaran
• In his APA talk, Kenny Easwaran made the striking
point that the Dutch Book argument could be cast purely
in terms of bets you find favorable.
• Recast the account of independence, and decision
theory, in terms of strict inequalities.
• Bad idea for decision theory!
An argument for vegetarianism
                  Eating meat WRONG   Eating meat NOT WRONG
Eat meat                 –f                    f1
Don’t eat meat           f2                    f3
Let p be your positive probability that eating meat is
wrong.
Your expected utility of eating meat is
–fp + f1(1 – p)
Your expected utility of not eating meat is
f2p + f3(1 – p), which may be higher.
In that case, you should not eat meat.
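This is the same expected-utility machinery as the Wager, with everything finite. A sketch with illustrative numbers (the payoffs and credence are my choices, not the slides'):

```python
def expected_utility(p_wrong, u_if_wrong, u_if_not_wrong):
    """EU of an act given credence p_wrong that eating meat is wrong."""
    return p_wrong * u_if_wrong + (1 - p_wrong) * u_if_not_wrong

# Illustrative values: -f is a large moral cost, f1..f3 modest payoffs.
f, f1, f2, f3 = 100.0, 1.0, 0.0, 0.5
p = 0.05  # your positive credence that eating meat is wrong

eu_eat = expected_utility(p, -f, f1)   # -f p + f1 (1 - p)
eu_dont = expected_utility(p, f2, f3)  # f2 p + f3 (1 - p)

print(eu_eat, eu_dont)
```

With these numbers the moral risk dominates and not eating meat has the higher expected utility; again, the argument needs p to be positive, which is where regularity enters.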
Engineering and risky events
• Jet Propulsion Laboratory: the probability of launch
failure of the Cassini spacecraft (mission to Saturn)
was 1.1 × 10⁻³.
• According to the US Nuclear Regulatory
Commission, the probability of a severe reactor
accident in one year is imprecise:
[1.1 × 10⁻⁶, 1.1 × 10⁻⁵]
Why care about the improbable?
• The improbable plays an important role in our
commonsense view of the world. Such events
happen all the time (Cournot’s principle
notwithstanding).
– The probability is allegedly 1/2375 that a golfer on the
PGA tour will get a hole in one.
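At p = 1/2375 per trial (the slides don't say per what; I take a trial to be one round for illustration), improbable events become near-certain across many trials. A sketch:

```python
def prob_at_least_one(p, n):
    """Probability of at least one success in n independent trials,
    each with success probability p."""
    return 1 - (1 - p) ** n

p = 1 / 2375  # alleged hole-in-one probability per trial

# Improbable per trial, but expected across a season of trials:
for n in [1, 1000, 10000]:
    print(n, round(prob_at_least_one(p, n), 4))
```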
Why care about the improbable?
• We care about accurately describing and
understanding our world
– Taking our best science seriously—and it is full of
improbable events.
Why care about the improbable?
• If the relevant probability space is large, there is no
avoiding low probabilities.
• Indeed, if the space is large enough (uncountable),
there is no avoiding 0 probabilities.
• In such a space, 0 and 1 are the only values that
are guaranteed to be assigned infinitely many
times.
Popper’s philosophy of science
• Popper maintained that the hallmark of a scientific
claim was its falsifiability.
• But many scientific claims are probabilistic, and
probabilistic claims are not falsifiable.
• He went on to say that if a piece of evidence is
highly improbable by the lights of a theory, then the
theory “in practice” rules out that evidence.