Expected Uncertain Utility Theory†
Faruk Gul
and
Wolfgang Pesendorfer
Princeton University
June 2010
Abstract
We introduce and analyze expected uncertain utility theory (EUU). A prior and an
interval utility characterize an EUU decision maker. The decision maker transforms each
uncertain prospect into an interval-valued prospect that assigns an interval of prizes to
each state. She then ranks prospects according to their expected interval utilities. We
define uncertainty aversion for EUU, use the EUU model to address the Ellsberg Paradox
and other ambiguity evidence, and relate EUU theory to existing models.
† This research was supported by a grant from the National Science Foundation (Grant number: SES0820101). We are grateful to Asen Kochov, Tomasz Strzalecki and Peter Wakker for their comments.
1. Introduction
We introduce and analyze expected uncertain utility theory (EUU). We consider preferences
over Savage (1954) acts that associate a monetary prize to every state of nature.
Our goal is to provide a flexible theory that can address three well-documented deviations
from expected utility theory:
(i) Ellsberg-style evidence that identifies behavior inconsistent with any single subjective
prior over the event space.
(ii) Source preference evidence showing that, ceteris paribus, decision makers prefer uncertain prospects that depend on familiar rather than unfamiliar events.
(iii) Allais-style evidence showing that even when decision makers’ preferences are consistent with a subjective probability assessment, they reveal systematic violations of the
independence axiom.
In this paper, we provide axioms for EUU theory, discuss how it relates to Ellsberg-style evidence and uncertainty aversion, and relate EUU theory to other models of decision
making under uncertainty. We leave the analysis of source preference and Allais-style
evidence to the companion paper, Gul and Pesendorfer (2010).
1.1 The Representation and Main Concepts
In EUU theory, as in subjective expected utility theory (SEU), two parameters describe
a decision maker: a subjective prior µ and a utility index u. The prior, defined over
some σ-algebra E, is a countably additive, complete and nonatomic probability measure.
We refer to the elements of E as ideal events. The utility index assigns a real number to
each pair (x, y) such that x ≤ y and is continuous and increasing. We refer to such an
index as an interval utility. As in SEU, the prior and the utility index are subjective; that
is, they are derived from preferences.
We illustrate the main ideas of EUU theory with bets; that is, acts that deliver y if
some event A occurs and x < y if A does not occur.1 Let yAx denote such a bet.
1 We describe and discuss the general representation in the next section.
For an arbitrary event A (not necessarily in E), define the inner probability µ_*(A) and
the outer probability µ^*(A) of A as follows:
\[ \mu_*(A) = \sup_{E \in \mathcal{E}_\mu,\; E \subset A} \mu(E) \]
\[ \mu^*(A) = \inf_{E \in \mathcal{E}_\mu,\; E \supset A} \mu(E) = 1 - \mu_*(A^c) \]
It is easy to see that inner and outer probabilities are attained. That is, for every A, there
exists E ∈ E, E ⊂ A such that µ_*(A) = µ(E). We call this E the core of A.
Let E1 be the core of A, E3 be the core of A^c and let E2 be the complement of E1 ∪ E3.
Call E1, E2, E3 an ideal split of A. Now, we can restate the inner and outer probabilities
of A in terms of the probabilities of E1 and E2:
\[ \mu_*(A) = \mu(E_1) \le \mu(E_1) + \mu(E_2) = \mu^*(A) \]
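A finite toy version may help make the ideal split concrete. The sketch below is an illustration only and departs from the paper's setting: it assumes the ideal events are exactly the unions of cells of a finite partition (so the prior is not nonatomic), and the names `ideal_split` and `inner_outer` are hypothetical.

```python
from fractions import Fraction

def ideal_split(cells, event):
    """Return (mu(E1), mu(E2), mu(E3)) for `event`.

    cells: list of (frozenset_of_states, probability) pairs. Toy assumption:
           the ideal events are exactly the unions of these cells.
    event: any set of states, not necessarily a union of cells.
    """
    event = set(event)
    # E1 (core of A): union of the cells contained in A.
    mu_e1 = sum((p for c, p in cells if c <= event), Fraction(0))
    # E3 (core of A^c): union of the cells disjoint from A.
    mu_e3 = sum((p for c, p in cells if not (c & event)), Fraction(0))
    # E2: the remaining cells, which meet both A and its complement.
    mu_e2 = 1 - mu_e1 - mu_e3
    return mu_e1, mu_e2, mu_e3

def inner_outer(cells, event):
    """Inner and outer probability: mu(E1) and mu(E1) + mu(E2)."""
    mu_e1, mu_e2, _ = ideal_split(cells, event)
    return mu_e1, mu_e1 + mu_e2

cells = [(frozenset({1, 2}), Fraction(1, 2)),
         (frozenset({3, 4}), Fraction(1, 4)),
         (frozenset({5, 6}), Fraction(1, 4))]
A = {1, 2, 3}
inner, outer = inner_outer(cells, A)
print(inner, outer)  # 1/2 3/4
```

In this toy case the cell {3, 4} plays the role of E2: it meets both A and its complement, so it contributes to the outer but not to the inner probability.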
In EUU theory, the decision maker assigns utilities to bets as follows:
\[ W(yAx) = u(y,y)\,\mu(E_1) + u(x,y)\,\mu(E_2) + u(x,x)\,\mu(E_3) \tag{1} \]
When µ(E2) = 0, the event A is ideal and expression (1) is simply the expected utility
of the bet yAx for the prior µ and the von Neumann-Morgenstern utility index v_u, where
v_u(x) = u(x, x).
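Expression (1) can be evaluated directly once the inner and outer probabilities of A are known, since µ(E1) is the inner probability, µ(E2) is the gap between outer and inner, and µ(E3) is one minus the outer probability. The following is a minimal sketch; the function name `bet_utility` and the particular interval utility u(x, y) = 0.75x + 0.25y are assumptions made for illustration and do not come from the paper.

```python
def u(x, y, alpha=0.75):
    """Hypothetical interval utility (an assumption, not the paper's):
    a weighted average that puts weight alpha on the lower bound x."""
    return alpha * x + (1 - alpha) * y

def bet_utility(u, x, y, inner, outer):
    """W(yAx) from expression (1), for prizes x <= y and an event A
    with the given inner and outer probabilities."""
    if not (x <= y and 0 <= inner <= outer <= 1):
        raise ValueError("need x <= y and 0 <= inner <= outer <= 1")
    mu_e1 = inner            # core of A
    mu_e2 = outer - inner    # part of the state space where A is unresolved
    mu_e3 = 1 - outer        # core of the complement of A
    return u(y, y) * mu_e1 + u(x, y) * mu_e2 + u(x, x) * mu_e3

print(bet_utility(u, 0, 100, 0.5, 0.5))    # ideal event: 50.0
print(bet_utility(u, 0, 100, 0.25, 0.75))  # same midpoint, wider gap: 37.5
```

With this particular u, widening the gap between inner and outer probability lowers the value of the bet; a different interval utility could reverse that ranking.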
When µ(E2) > 0, expression (1) differs from expected utility. From the definitions of
E1 and E3 it follows that µ_*(A ∩ E2) = 0 and µ^*(A ∩ E2) = µ(E2), and hence every nonnull
measurable subset of E2 must intersect both A and its complement. We refer to A ∩ E2 as a diffuse
subset of E2. A diffuse set D (of the entire state space) satisfies µ_*(D) = 0 and µ^*(D) = 1
and represents a situation of complete ignorance. EUU theory employs diffuse sets as the
building block of uncertainty, using the fact that every diffuse subset of an ideal set E has the form
E ∩ D for some diffuse set D.
In addition to the Savage axioms, EUU theory uses the following novel assumption:
the decision maker is indifferent between betting on any two diffuse subsets of any ideal
set E. That is,
\[ y(E \cap D)x \sim y(E \cap D')x \tag{2} \]
for every E ∈ E, all diffuse sets D, D' and all x, y. Thus, complete ignorance results in
indifference between bets.
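The indifference in (2) can be read off the bet formula W(yAx) = u(y, y)µ(E1) + u(x, y)µ(E2) + u(x, x)µ(E3). The following is a sketch rather than the paper's argument, and it takes for granted that the ideal split of A = E ∩ D is, up to null sets, E1 = ∅, E2 = E and E3 = E^c (neither D nor its complement contains a nonnull ideal event):

```latex
\[
W\bigl(y(E \cap D)x\bigr)
  = u(y,y)\cdot 0 \;+\; u(x,y)\,\mu(E) \;+\; u(x,x)\,\bigl(1-\mu(E)\bigr).
\]
```

The right-hand side depends on E, x and y only, so replacing D with any other diffuse set D' leaves the value unchanged.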
Considering situations of complete ignorance and quantifying the uncertainty of nonmeasurable sets by their inner and outer probabilities are two ideas that predate EUU
theory. Arrow-Hurwicz (1972) provides the first analysis of the former idea and the
Dempster-Shafer theory (Shafer (1976)) formalizes the latter. The novelty in EUU theory
is a representation for all bets and all acts, corresponding to subjective expected utility theory at one extreme (i.e., when betting on ideal events) and to the Arrow-Hurwicz
criterion2 at the other (i.e., when betting on diffuse events).
Put differently, EUU theory shows that every subjective expected utility model with
a countably additive prior provides a rich enough framework for a complete theory of
uncertainty provided we extend preferences to all bets via (2) above.3
Our representation theorem requires a rich state space to separate uncertainty attitude
and uncertainty perception (as discussed below). A similar implicit richness requirement
in the Savage model (that the state space is infinite) facilitates an analogous separation.
In applications, this richness is unnecessary and even implausible. To facilitate these applications, we provide a discrete formulation of EUU theory in section 3. The discrete
model is the appropriate framework for confronting the model with evidence. It may not
contain diffuse events or ideal events and, therefore, the finite model contains no analog
of assumption (2) above, nor does it require that the agent satisfy the expected utility
hypothesis over certain events.
1.2 Separation of Uncertainty Perception and Attitude
A novel feature of the EUU model is a separation between uncertainty perception
and uncertainty attitude that mirrors the same separation in subjective expected utility
theory (SEU). In SEU, a convex-valued prior4 describes the decision maker’s uncertainty
perception while the utility index describes her uncertainty attitude. In particular, decision
maker 1 perceives the same uncertainty in event A as decision maker 2 perceives in event
B if 1’s prior of A is equal to 2’s prior of B. The SEU prior and utility index separate
uncertainty perception and uncertainty attitude because they satisfy the following two
properties:
(i) For every event A, there exists some event B such that 1’s prior of A is equal to 2’s
prior of B. Hence, both decision makers perceive the same range of uncertainty.
(ii) Two decision makers have the same uncertainty attitude (the same von Neumann-Morgenstern index) if and only if 1’s certainty equivalent of the bet yAx is the same
as 2’s certainty equivalent of the bet yBx for all x, y, A, B such that 1 perceives the
same uncertainty in A as 2 does in B.
2 The relationship between the Arrow-Hurwicz criterion and EUU theory becomes clearer when we
consider more general diffuse acts; that is, all simple acts such that f^{-1}(x) is diffuse for all x in the range
of f.
3 This assertion relies on the Continuum Hypothesis.
4 A prior µ is convex-valued if for every event A and λ ∈ [0, 1] there is B ⊂ A such that µ(B) = λµ(A).
Property (i) ensures that decision makers with different uncertainty perceptions are
comparable while property (ii) ensures that agents’ uncertainty attitude measures the
ranking of comparable bets. The separation between uncertainty perception and uncertainty attitude enables comparisons of uncertainty attitudes of two decision makers without
demanding that their priors agree. In section 2.2, we demonstrate that EUU achieves the
same separation: the EUU prior measures uncertainty perception and the interval utility
measures uncertainty attitude.
Choquet expected utility theory (CEU) and maxmin expected utility theory (MEU)
also have two parameters, a capacity κ and a utility index v for the former; a set of
probabilities, ∆, and a utility index v for the latter. However, these parameters do not
achieve separation between uncertainty perception and uncertainty attitude.
To illustrate this failure for CEU, consider two capacities κ, κ' and an event A. If we
define perceiving the same uncertainty as κ(A) = κ'(B), then criterion (ii) will typically
fail. For example, if κ(A) = κ(B) and κ(A^c) ≠ κ(B^c) then yAx ≁ yBx for x > y. If
we define perceiving the same uncertainty as κ(A) = κ(B) and κ(A^c) = κ(B^c), then, in
general, (i) does not hold.
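The role of the complement's capacity can be seen in a small numerical sketch. It uses the standard Choquet formula for a two-outcome act (the utility of the worse prize plus the utility gain weighted by the capacity of the event on which the better prize is paid). The function name `ceu_bet`, the linear utility index and the capacity values are hypothetical choices for illustration.

```python
def ceu_bet(v, prize_on, prize_off, cap_event, cap_complement):
    """Choquet expected utility of a bet paying prize_on if the event
    occurs and prize_off otherwise.

    cap_event / cap_complement: capacity of the event and of its
    complement (for a non-additive capacity they need not sum to one).
    """
    if prize_on >= prize_off:
        # Better prize is paid on the event: weight the gain by kappa(event).
        return v(prize_off) + (v(prize_on) - v(prize_off)) * cap_event
    # Better prize is paid on the complement: weight by kappa(complement).
    return v(prize_on) + (v(prize_off) - v(prize_on)) * cap_complement

v = float  # hypothetical linear utility index

# kappa(A) = kappa(B) = 0.25, but kappa(A^c) = 0.5 and kappa(B^c) = 0.75.
y, x = 0, 100  # x > y, so yAx pays the better prize off the event
print(ceu_bet(v, y, x, 0.25, 0.5), ceu_bet(v, y, x, 0.25, 0.75))  # 50.0 75.0
# With the prizes swapped, only kappa(A) = kappa(B) matters.
print(ceu_bet(v, x, y, 0.25, 0.5), ceu_bet(v, x, y, 0.25, 0.75))  # 25.0 25.0
```

Matching κ(A) with κ(B) alone therefore does not pin down the ranking of the bets yAx and yBx when x > y.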
The separation of uncertainty perception and uncertainty attitude is central to the
construction of measures of risk and risk attitude in expected utility theory. Since EUU
facilitates an analogous separation, we can construct analogous measures. We use these
measures to address behavior in Ellsberg-style experiments.
Call z the certainty equivalent of a bet if the decision maker is indifferent between
getting z for sure and the bet.
(i) Decision maker 1 is more uncertainty averse than decision maker 2 if 1’s certainty
equivalent of yAx is lower than 2’s certainty equivalent of yBx whenever 1 perceives
the same uncertainty in A as 2 perceives in B.
(ii) Event A is more uncertain than event B if there are two preferences with identical
uncertainty perception, one more uncertainty averse than the other, such that the
more uncertainty averse preference prefers a bet on A while the less uncertainty averse
preference prefers a bet on B.
We show that the event A is more uncertain than the event B if A has a strictly lower
inner probability and a strictly higher outer probability than B. Hence, ideal events are
minimally uncertain and diffuse events are maximally uncertain.
Our definitions of uncertainty and uncertainty aversion require a separation between
uncertainty perception and uncertainty attitude as described above but are otherwise
model independent. For example, we can apply the definition to subjective expected
utility theory. In that case, “more uncertainty averse” reduces to the standard notion of
“more risk averse.” For subjective expected utility maximizers all events that have the
same prior are equally uncertain and events with different priors cannot be ranked. Thus,
the concept of “more uncertain than” is redundant for SEU.
1.3 Evidence
EUU theory is flexible enough to accommodate the various versions of the Ellsberg
paradox. More specifically, we show that Ellsberg paradoxes can be interpreted as situations in which decision makers perceive some events to be less uncertain than others
and make the Ellsberg-style choices whenever they are more uncertainty averse than a
benchmark decision maker with the same perception of uncertainty. Hence, we can relate
the propensity for Ellsberg-paradox behavior to the decision maker’s uncertainty aversion
parameter.
Recently, Machina (2009) showed that Choquet expected utility theory and related
models5 are unable to accommodate variations of the Ellsberg paradox that appear plausible and even natural. Recent experimental evidence reported in L’Haridon and Placido
(2010) confirms Machina’s intuition and documents a particular pattern of behavior. In
section 5.1, we discuss this evidence and show that EUU theory can accommodate the
observed pattern of preference if the interval utility is strictly supermodular, that is, if
\[ u(x_4, x_1) + u(x_3, x_2) > u(x_2, x_1) + u(x_4, x_3) \]
whenever x1 > x2 ≥ x3 > x4.
5 Baillon, L’Haridon and Placido (2010) extend Machina’s observation to α-maxmin expected utility
and Klibanoff, Marinacci and Mukerji’s (2005) smooth model of ambiguity. The authors confirm that
Siniscalchi’s (2009) vector valued expected utility model permits the behavior suggested by Machina.
1.4 Outline
Section 2 contains our representation theorem and demonstrates how EUU separates
uncertainty perception and uncertainty attitude. To axiomatize EUU, we require an infinite
state space. However, to address Ellsberg-style evidence and to relate the model to the
literature, it is more convenient to use a model with a discrete state space. In section
3, we introduce EUU for a discrete state space. In section 4, we define uncertainty and
uncertainty aversion and relate it to parameters of the EUU representation. In section 5,
we formulate a canonical Ellsberg-style experiment and show how the observed patterns
of behavior match up with the parameters of the EUU representation.
Special cases of EUU have been studied by Zhang (2002) and Lehrer (2007). In
addition, the EUU representation is related to Jaffray (1989). We describe the relation to
those and other papers in section 6. Section 7 concludes with a detailed comparison to
Choquet expected utility theory and α-maxmin expected utility theory.
2. Expected Uncertain Utility
The interval X = [l, m], l < m, is the set of monetary prizes and Ω is the state space.
The decision maker has preferences over acts; that is, functions f from Ω to X. Let F be
the set of all acts.
2.1 The Prior and Envelopes
A countably additive probability measure µ (on some σ-algebra Eµ) is a prior if it
is complete (i.e., A ⊂ E and µ(E) = 0 implies A ∈ Eµ) and nonatomic (i.e., µ(A) > 0
implies 0 < µ(B) < µ(A) for some B ⊂ A). Given any prior µ, let Fµ be the set of all
Eµ-measurable acts; that is,
\[ F_\mu = \{ f \in F \mid f^{-1}([x, y]) \in \mathcal{E}_\mu \text{ for all } x, y \in X \} \]
Let
\[ I = \{ (x, y) \mid l \le x \le y \le m \} \]
be the set of all prize intervals. We interpret the pair (x, y) as a single (subjective)
consequence. The pair (x, y) describes a situation that the decision maker interprets as
getting at least x and at most y.
Given a prior µ, a function $\mathbf{f} : \Omega \to I$ is a subjective interval act if it is measurable
with respect to Eµ. Let $\mathbf{F}$ denote the set of all subjective interval acts. For $\mathbf{f} \in \mathbf{F}$, let $\mathbf{f}_i$
denote the i-th coordinate of $\mathbf{f}$. That is, $\mathbf{f}(\omega) = (\mathbf{f}_1(\omega), \mathbf{f}_2(\omega))$ for all ω ∈ Ω. Lemma 1 below
reveals that given any prior µ, each act can be identified with a unique (up to a set of
measure 0) subjective interval act. All proofs are in the Appendix.
Lemma 1: For any prior µ and f ∈ F, there exists an $\mathbf{f} \in \mathbf{F}$ such that
\[ \mu(\{\omega \in \Omega \mid \mathbf{f}_1(\omega) \le f(\omega) \le \mathbf{f}_2(\omega)\}) = 1 \]
and if $\mathbf{g} \in \mathbf{F}$ also satisfies the equation above, then
\[ \mu(\{\omega \in \Omega \mid \mathbf{g}_1(\omega) \le \mathbf{f}_1(\omega) \le \mathbf{f}_2(\omega) \le \mathbf{g}_2(\omega)\}) = 1 \]
It is clear that any $\mathbf{f}$ with the property above is unique up to a set of measure 0. We
call the $\mathbf{f}$ corresponding to any f its envelope. Note that f ∈ Fµ if and only if $\mathbf{f}_1 = f = \mathbf{f}_2$
µ-almost surely. That is, an act is Eµ-measurable if and only if $f = \mathbf{f}_1 = \mathbf{f}_2$. Lemma 2
below is a converse of Lemma 1.
Lemma 2: For any prior µ and $\mathbf{f} \in \mathbf{F}$, there exists f ∈ F such that $\mathbf{f}$ is f’s envelope.
2.2 The Interval Utility and Representation
Henceforth, we let [f] = ([f]_1, [f]_2) denote the envelope of f. An interval utility is a
continuous function u : I → X such that u(x, y) > u(x', y') whenever x > x' and y > y'. A
preference ≽ is an expected uncertain utility (EUU) if there exists a prior µ and an interval
utility u such that the function W defined below represents ≽:
\[ W(f) = \int u[f] \, d\mu \]
Thus, a prior µ and an interval utility u characterize an EUU decision maker. We let ≽_µ^u
denote the EUU preference associated with (µ, u).
Define the function v_u : X → ℝ such that v_u(x) = u(x, x) for all x ∈ X. For f ∈ Fµ
we have
\[ W(f) = \int v_u(f) \, d\mu \]
The following example illustrates how the EUU of an arbitrary (not Eµ −measurable) act
is computed:
Example:
Let Ω = [0, 1] × [0, 1] and let E0 be the smallest σ-algebra that contains all events of the
form [a, b] × [0, 1] for 0 ≤ a ≤ b ≤ 1. That is, E0 is the set of all full-height rectangles; i.e.,
sets of the form B × [0, 1] for any Borel set B ⊂ [0, 1], as illustrated in Figure 1 below.
[Figure 1; labels in the figure: E, A]
Let µ0 be the unique measure on E0 that satisfies
\[ \mu_0([a, b] \times [0, 1]) = b - a \]
and let (E, µ) be the completion of (E0, µ0).
Consider the act f illustrated in Figure 2 below with prizes x < y < z. The act yields
prize x on the yellow shaded region, y on the light grey shaded region and z on the dark
grey region.
[Figure 2; labels in the figure: x, y, z, E1, E2]
The envelope [f] of f depicted in Figure 2 is [f]_1 = x, [f]_2 = yE1z and hence
\[ W(f) = \mu(E_1)\,u(x, y) + \mu(E_2)\,u(x, z). \]
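The same computation can be sketched for any simple act whose envelope is constant on each cell of a finite ideal partition. The sketch below assumes that every prize listed for a cell is taken on a part of that cell containing no nonnull ideal subset, so that the envelope on the cell is the pair (smallest prize, largest prize). The function name `euu_simple`, the interval utility and the numbers are hypothetical.

```python
def u(x, y, alpha=0.75):
    """Hypothetical interval utility (an assumption, not the paper's)."""
    return alpha * x + (1 - alpha) * y

def euu_simple(u, cells):
    """W(f) = sum_j mu(E_j) * u(min prize on E_j, max prize on E_j).

    cells: list of (mu(E_j), prizes taken on E_j); the probabilities are
           assumed to sum to one.
    """
    total = sum(p for p, _ in cells)
    if abs(total - 1.0) > 1e-12:
        raise ValueError("cell probabilities must sum to one")
    return sum(p * u(min(prizes), max(prizes)) for p, prizes in cells)

# Act from the example with x = 0, y = 50, z = 100 and mu(E1) = mu(E2) = 1/2:
# on E1 the act takes the prizes x and y, on E2 the prizes x and z.
x, y, z = 0, 50, 100
print(euu_simple(u, [(0.5, [x, y]), (0.5, [x, z])]))  # 18.75
```

Only the extreme prizes on each cell matter; a prize strictly between them on E2 would leave the result unchanged.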
2.3 Axioms and the Theorem
Theorem 1 below shows that ≽ is an EUU if and only if it satisfies the following six
axioms. Note that the axioms are analogous to their counterparts in Savage’s theorem.
We identify x ∈ X with the constant act that yields x in every state. Hence, the binary
relation ≽ on F induces a binary relation on X.
Axiom 1: The binary relation ≽ is complete and transitive.
Axiom 2: If f(s) > g(s) for all s ∈ Ω, then f ≻ g.
We interpret prizes as quantities of money and Axiom 2 is a natural consequence of
that interpretation.6 For any f, g ∈ F and A ⊂ Ω, let fAg denote the act h such that
h(s) = f(s) for all s ∈ A and h(s) = g(s) for all s ∈ A^c. Hence, xAy denotes the act that
yields x if A occurs and y otherwise.
6 Though natural, the assumption is not implied by the Savage axioms and cannot be satisfied in the
Savage model with a countable state space.
Our first goal is to identify an ideal environment from the decision maker’s preferences
and use it to calibrate her attitude towards uncertainty. Consider two acts that imply
different subacts on the event E but have a common subact on E^c. If the event E is ideal,
the ranking of acts does not depend on the common subact on E^c. Similarly, if two acts
differ on E^c but have a common subact on E then the ranking of acts does not depend on
the common subact.
Definition: An event E is ideal if fEh ≽ gEh and hEf ≽ hEg imply fEh' ≽ gEh'
and h'Ef ≽ h'Eg.
Thus, an event E is ideal if Savage’s sure thing principle holds with respect to E and
E^c. An event A is null if fAh ∼ gAh for all f, g, h ∈ F. If A is not null, we call it non-null.
Let ℰ be the set of all ideal events and let E, E', E_i etc. denote elements of ℰ. Let ℰ+ ⊂ ℰ
denote the set of ideal events that are not null.
An event is diffuse if both it and its complement intersect every non-null ideal event.
Diffuse events represent outcomes in situations of complete ignorance. The decision maker
cannot find any (non-null) ideal event contained in it or its complement and hence cannot
bound the probability of such events. Let 𝒟 be the set of all diffuse events and let D, D', D_i
etc. denote elements of 𝒟.
Definition: An event D is diffuse if E ∩ D ≠ ∅ ≠ E ∩ D^c for every E ∈ ℰ+.
In the example above, let D ⊂ Ω be such that both it and its complement intersect
every vertical line {s} × [0, 1]. Then, D is diffuse. Our main hypothesis (formalized in
Axiom 3) is that the decision maker cannot discriminate among the diffuse subsets of any
ideal event. That is, for any E, the decision maker is indifferent between betting on E ∩ D1
and E ∩ D2 when both events are diffuse. This indifference reflects the decision maker’s
complete ignorance over diffuse outcomes.
Axiom 3: y(E ∩ D)x ∼ y(E ∩ D')x for all x, y, E, D and D'.
One consequence of Axiom 3 is that it permits the partitioning of Ω into a finite
collection of sets D1, ..., Dn such that y(Dj ∪ Dk)x ∼ yDi x for all i, j and k. Note
that Savage’s theory allows for a similar possibility for countably infinite collections of
sets. Diffuse sets are limiting events that play a similar role in EUU theory as arbitrarily
unlikely events do in Savage’s theory. They allow us to calibrate the uncertainty of events
and thereby facilitate the separation of uncertainty perception and uncertainty attitude.
Our model requires a rich state space. The existence of diffuse events is a consequence of the countably additive and convex valued probability on that state space, just
as the existence of null events is a consequence of the convex valued (and finitely additive)
probability in Savage’s theory. Section 3, below, examines EUU in finite settings that may
contain no null or diffuse events. The finite model is the appropriate framework to confront
evidence since empirical tests typically employ simple, discrete state spaces. Because the
discrete setting may contain no diffuse events, Axiom 3 has no bite. Thus, Axiom 3 should
be interpreted more as a conceptual device than as a testable prediction of the theory.7
Axiom 4 below is Savage’s comparative probability axiom (P4) applied to ideal events.
Axiom 4: If y > x and w > z, then yEx ≽ yE'x implies wEz ≽ wE'z.
Axiom 5 is Savage’s divisibility axiom for ideal events. It serves the same role here as
in Savage. Its statement below is a little simpler than Savage’s original statement because
in our setting, there is a best and a worst prize.
Let F^o denote the set of simple acts; that is, acts such that f(Ω) is finite. The simple
act f ∈ F^o is ideal if f^{-1}(x) ∈ E for all x. Let F^e denote the set of ideal simple acts.
Axiom 5: If f, g ∈ F^e and f ≻ g, then there exists a partition E1, ..., En of Ω such
that lE_i f ≻ mE_i g for all i.
Axiom 6 below is a strengthening of Savage’s dominance condition adapted to our
setting. We use it to extend the representation from simple acts to all acts, to establish
continuity of u and to guarantee countable additivity of the prior µ. Notice that for ideal
acts f ∈ F^e, Axiom 6(i) implies Arrow’s (1970) monotone continuity axiom, the standard axiom used to establish countable additivity of the probability measure in subjective
expected utility theory.
7 The status of Axiom 3 is similar to that of P6 (small event continuity) in Savage’s theory. Both
axioms have no bite for a subset of acts that are measurable with respect to a fixed, finite collection
of events. Therefore, these axioms should be viewed as conceptual devices that connect the theory to
behavior in the idealized environment with a rich state space and a corresponding rich set of acts.
Axiom 6: (i) If fn ∈ F^e converges pointwise to f, then g ≽ fn ≽ h for all n implies
g ≽ f ≽ h. (ii) If fn ∈ F converges uniformly to f, then g ≽ fn ≽ h for all n implies
g ≽ f ≽ h.
Axiom 6(ii) is what would be required to get a continuous von Neumann-Morgenstern
utility index when proving Savage’s Theorem in a setting with real-valued prizes. Here, it
serves a similar role; it ensures the continuity of the interval utility. Theorem 1 below is our
main result. It establishes the equivalence of the six axioms to the existence of an EUU
representation. The uniqueness of this representation follows from standard arguments
and is omitted.
Theorem 1: The binary relation ≽ satisfies Axioms 1-6 if and only if there is a prior µ
and an interval utility u such that ≽ = ≽_µ^u. Moreover, the prior is unique and the interval
utility is unique up to a positive affine transformation.
The interval utility u measures the decision maker’s uncertainty attitude while the
prior µ measures the decision maker’s uncertainty perception. Next, we show that the
requirements for such a separation, as described in the introduction, are satisfied.
2.4 Uncertainty Perception and Uncertainty Attitude
In the introduction, we provide two criteria for the separation of uncertainty perception
and uncertainty attitude:
(i) For every event A, there exists some event B such that 2 perceives the same uncertainty
in B as 1 does in A.
(ii) Two decision makers have the same uncertainty attitude if and only if 1’s certainty
equivalent of yAx is the same as 2’s certainty equivalent of yBx for all x, y whenever 1
perceives the same uncertainty in A as 2 perceives in B.
Here, we present definitions of uncertainty perception and uncertainty attitude and
show that they satisfy the criteria above. Let ≽_{µ1}^{u1} and ≽_{µ2}^{u2} be two EUU decision makers.
We will refer to ≽_{µi}^{ui} as i. We say that 2 perceives the same uncertainty in B as 1 perceives
in A if
\[ \mu_{2*}(B) = \mu_{1*}(A) \quad \text{and} \quad \mu_2^*(B) = \mu_1^*(A) \]
Also, we say that 1 has the same uncertainty attitude as 2 if u1 is a positive affine transformation of u2. Hence, µ describes uncertainty perception and u describes uncertainty
attitude. To see how these notions meet the criteria (i) and (ii) note that Lemma 2 implies
that every prior exposes the decision maker to the same range of uncertainty perceptions.
That is, for any p, q ∈ [0, 1] with p + q ≤ 1 there exist A, B such that
\[ \mu_{2*}(B) = \mu_{1*}(A) = p \quad \text{and} \quad \mu_2^*(B) = \mu_1^*(A) = p + q \tag{3} \]
Thus, our definition of uncertainty perception satisfies (i).
To demonstrate property (ii), let 2 perceive the same uncertainty in B as 1 perceives
in A and let p and q satisfy (3). Also, let u1 be a positive affine transformation of u2. By
the representation theorem, yAx ∼_{µ1}^{u1} z if and only if
\[ u_1(z, z) = u_1(y, y)\,p + u_1(x, y)\,q + u_1(x, x)\,(1 - p - q) \tag{4} \]
Since u2 is a positive affine transformation of u1, (4) also holds for u2 and hence z is also
the certainty equivalent of yBx for 2.
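The invariance argument can be checked numerically. The sketch below solves the certainty-equivalent condition u(z, z) = u(y, y)p + u(x, y)q + u(x, x)(1 − p − q) for z by bisection and compares two interval utilities that differ by a positive affine transformation. The functional form of `u1`, the transformation in `u2` and the function name `certainty_equivalent` are assumptions made for illustration.

```python
import math

def certainty_equivalent(u, x, y, p, q, tol=1e-9):
    """Find z in [x, y] with u(z, z) = u(y,y)*p + u(x,y)*q + u(x,x)*(1-p-q).

    Assumes z -> u(z, z) is continuous and increasing, so bisection applies.
    """
    target = u(y, y) * p + u(x, y) * q + u(x, x) * (1 - p - q)
    lo, hi = x, y
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if u(mid, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def u1(x, y):
    """Hypothetical interval utility (an assumption, not the paper's)."""
    return 0.75 * math.sqrt(x) + 0.25 * math.sqrt(y)

def u2(x, y):
    """A positive affine transformation of u1."""
    return 3 * u1(x, y) + 2

z1 = certainty_equivalent(u1, 0, 100, 0.25, 0.5)
z2 = certainty_equivalent(u2, 0, 100, 0.25, 0.5)
print(round(z1, 4), round(z2, 4))  # both approximately 14.0625
```

Because the weights p, q and 1 − p − q sum to one, the affine transformation shifts and scales both sides of the condition in the same way, which is why the two certainty equivalents agree.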
For the converse, let vi (x) = ui (x, x) and note that for every (x, y) ∈ I, there exists
a unique zi such that vi (zi ) = ui (x, y). Let Di be a diffuse set for i; hence 2 perceives
the same uncertainty in D2 as 1 does in D1 . Then, z1 = z2 by hypothesis. Thus, u1 is a
positive affine transformation of u2 if and only if v1 is a positive affine transformation of
v2 . That two expected utility maximizers have the same preferences over binary gambles
only if their utility indices are the same, up to a positive affine transformation, is well
known. Thus, we have established (ii).
2.5 Outline of the Proof of Theorem 1
If we restrict attention to ideal events, Axioms 1-6 yield a standard expected utility
representation with a countably additive probability measure µ and a continuous utility
index v : X → ℝ. Fix any diffuse event D and, for (x, y) ∈ I, let u(x, y) = v(z) for
z such that yDx ∼ z. Axioms 2 and 6 ensure that z ∈ [x, y] exists and therefore
u is well defined. The proof of the Theorem shows that W represents ≽. For this, it is
enough to show that v(x*) = W(f) implies x* ∼ f.
Fix any partition A1 , . . . , An of Ω. In the proof of Lemma 1, we show that Ω can
be partitioned into any finite number of diffuse sets. To show this, we use a Theorem by
Birkhoff (1967) which in turn uses the continuum hypothesis.8 We use the fact that Ω
can be partitioned into any finite number of diffuse sets together with the fact that µ is
nonatomic to show that there are two partitions, one of ideal events E0 , E1 , . . . , Em , the
other of diffuse events D1, ..., Dl such that µ(E0) = 0 and
\[ E_i \cap D_j \subset A_k \tag{5} \]
for some k and for all i, j. Since µ(E0) = 0, we ignore E0.
Now, consider any simple act f: Let {w1, ..., wn} be the set of values that f takes, and
assume without loss of generality that wi < wi+1. Consider the partition
\[ \{A_i \mid A_i = f^{-1}(w_i) \text{ for } i = 1, \dots, n\} \]
and let {Ej}, {Dk} be ideal and diffuse partitions that satisfy (5).
Let x, y be the minimal and maximal values of f on E1. Let f1 be an act that agrees
with f on E1^c, takes on the values x and y on E1 and agrees with f when f is equal to
x or y. That f1 has the same envelope as f follows from the definition of a diffuse event.
To see that f1 is indifferent to f consider the simplest case: E1 = Ω and assume that
f = (xD1z)D2y for some diffuse partition D1, D2, D3. By monotonicity
\[ (xD_1z)D_2y \succsim x(D_1 \cup D_2)y \]
and
\[ xD_1y \succsim (xD_1z)D_2y \]
and by Axiom 3,
\[ x(D_1 \cup D_2)y \sim xD_1y \]
and therefore x(D1 ∪ D2)y ∼ (xD1z)D2y ∼ xD1y.
8 Birkhoff (1967), Theorem 13 (pg. 266) shows that no nontrivial (i.e., not identically equal to 0)
countably additive measure such that every singleton has measure 0 can be defined on the algebra of all
subsets of the continuum.
Then, by induction, f is indifferent to and has the same envelope as some act g that
takes at most two values on each Ej and agrees with f whenever f takes its maximal or
minimal value in Ej. Let yj and xj be these values respectively. Then, it follows from
the definition of an ideal event and Axiom 3 that g is indifferent to the act h such that
h(ω) = zj for all ω ∈ Ej for zj such that zj ∼ yjDxj. Since h is measurable with respect
to ideal events, x* ∼ h ∼ g ∼ f for some x* such that
\[ v(x^*) = \sum_j v(z_j)\,\mu(E_j) = \sum_j u(x_j, y_j)\,\mu(E_j) = W(g) = W(f) \]
as desired. The extension to all acts uses Axiom 6 and follows familiar arguments.
3. EUU in a Discrete Setting
To prove Theorem 1 above, we have ruled out the possibility that Ω is finite. However,
in applications and when comparing the EUU model to existing alternatives, it is
convenient to consider finite spaces.
Let S = {1, ..., n} be the discrete state space and let Φ be the corresponding collection
of discrete acts φ : S → X. The representation of a discrete EUU has two parameters, an
interval utility u and a probability π on the collection of non-empty subsets of S. For any
finite set Y, the function λ is a probability on Y if λ : Y → [0, 1] and $\sum_{y \in Y} \lambda(y) = 1$. Let
P be the set of all nonempty subsets of S and let Π be the set of all probabilities on P.
Given any φ ∈ Φ, define the function [φ] : P → I as follows:
\[ [\varphi](a) = \Bigl( \min_{s \in a} \varphi(s),\; \max_{s \in a} \varphi(s) \Bigr) \]
for all a ∈ P. A preference ≽ (on Φ) is a discrete EUU if there is an interval utility u and
a probability π ∈ Π such that
\[ U(\varphi) = \sum_{a \in P} u\bigl([\varphi](a)\bigr)\,\pi(a) \tag{6} \]
represents ≽. Henceforth, we write U = (π, u) if U satisfies equation (6) and we let ≽_π^u denote
the discrete EUU that this U represents.
If π(a) = 0 for all non-singleton a, then
\[ U(\varphi) = \sum_{s \in S} v_u(\varphi(s))\,\pi(\{s\}) \]
and therefore U reduces to expected utility. When π(a) > 0 for some non-singleton a,
π(a) reflects the decision maker’s inability to reduce the uncertainty of the event a to its
components.
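The discrete representation is straightforward to compute: each non-empty subset a of S contributes the interval utility of the worst and best prizes the act pays on a, weighted by π(a). The sketch below is an illustration under stated assumptions; the name `discrete_euu`, the interval utility and the two probabilities are hypothetical.

```python
def u(x, y, alpha=0.75):
    """Hypothetical interval utility (an assumption, not the paper's)."""
    return alpha * x + (1 - alpha) * y

def discrete_euu(u, pi, phi):
    """U(phi) = sum over subsets a of u(min_a phi, max_a phi) * pi(a).

    pi:  dict mapping frozenset subsets of S to weights summing to one
         (subsets that are absent are treated as having weight zero).
    phi: dict mapping each state to its prize.
    """
    if abs(sum(pi.values()) - 1.0) > 1e-12:
        raise ValueError("pi must sum to one")
    return sum(w * u(min(phi[s] for s in a), max(phi[s] for s in a))
               for a, w in pi.items())

phi = {1: 0, 2: 100}
# All weight on singletons: U is the expected utility of v_u(x) = u(x, x).
pi_eu = {frozenset({1}): 0.5, frozenset({2}): 0.5}
# Half of the weight on the non-singleton {1, 2}.
pi_amb = {frozenset({1}): 0.25, frozenset({2}): 0.25, frozenset({1, 2}): 0.5}
print(discrete_euu(u, pi_eu, phi), discrete_euu(u, pi_amb, phi))  # 50.0 37.5
```

Moving weight from the singletons to {1, 2} replaces part of the average of u(0, 0) and u(100, 100) with u(0, 100), so the effect on U depends entirely on where the interval utility places u(0, 100) between the two.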
Adapting the EUU representation theorem to the finite setting is not immediate because the EUU representation relies on the existence of a rich class of ideal sets. Imposing
the existence of such a collection of sets would be unduly restrictive in the finite setting.9
When the state space is finite, we only require that it be possible to interpret these states as
a partition of the state space Ω. This partition is described by an onto function ρ : Ω → S
and, therefore, the event ρ^{-1}(s) ⊂ Ω in the original model corresponds to state s in the
discrete model. An act φ ∈ Φ in the discrete model represents an act in the original model
that is measurable with respect to the partition ρ. The act f ∈ F is measurable with
respect to ρ if f = φ ∘ ρ for some φ ∈ Φ where φ ∘ ρ denotes the composition of ρ and φ.
Proposition 1 below shows that the preference ≽ on Φ is a restriction of an EUU preference
≽_µ^u in the original model if and only if ≽ is a discrete EUU.
Proposition 1: Fix a prior µ (on Ω) and a preference ≽ on Φ. Then, there is an interval
utility u and a partition ρ such that
\[ \varphi \succsim \psi \ \text{ if and only if } \ \varphi \circ \rho \succsim_\mu^u \psi \circ \rho \]
if and only if ≽ is a discrete EUU.
Proposition 1 shows that the prior in the original model does not constrain the range
of possible discrete EUUs. For a fixed prior µ we can extend any discrete EUU to an
original EUU by choosing an appropriate partition ρ and an appropriate interval utility u.
To illustrate the relationship between the original model and the discrete model, let
S = {1, 2}, let A = ρ^{-1}(1), A^c = ρ^{-1}(2) and let φ(1) = x, φ(2) = y with x ≤ y. Hence
f = φ ∘ ρ = xAy is the act in the original model corresponding to φ. In the previous
section, we observed that for any A, there exist unique (up to a µ-null event) ideal sets
E1, E2, E3 and a diffuse set D such that E1 ∪ (E2 ∩ D) = A and E3 ∪ (E2 ∩ D^c) = A^c.
Thus, if W represents ≽_µ^u, then
\[ W(f) = \mu(E_1)\,u(x, x) + \mu(E_2)\,u(x, y) + \mu(E_3)\,u(y, y) \]
If we set π({1}) = µ(E1), π({2}) = µ(E3) and π({1, 2}) = µ(E2), then we get the discrete
representation
\[ U(\varphi) = \pi(\{1\})\,u(x, x) + \pi(\{1, 2\})\,u(x, y) + \pi(\{2\})\,u(y, y) \]
for acts in Φ. As we showed in the previous section, µ(E1), µ(E2), µ(E3) describe the decision maker’s uncertainty perception in A, A^c. Thus, in the discrete model, the probability
measure π describes the decision maker’s uncertainty perception and the interval utility u
describes her uncertainty attitude.
9 Consider the following analogy: the proof of Savage’s theorem relies on equiprobable partitions; yet
it is needlessly restrictive to impose the existence of such partitions (i.e., to require that every nonnull
state has the same probability), when applying subjective expected utility theory to finite state spaces.
The discrete setting is not rich enough to achieve the separation between uncertainty
perception and uncertainty attitude according to the criterion in section 2. Therefore, we
need to appeal to the original preference (on F) that induces the discrete preference to
establish the desired separation. The same is true for subjective expected utility preferences
in finite settings.
4. Uncertainty Aversion
The goal of this section is to define comparative uncertainty aversion for EUU theory
and associate it with parameters of the utility function. Throughout, we consider discrete
EUU preferences º on Φ, the collection of discrete acts φ : S → X. By Proposition 1
above, º=ºuπ for some (u, π). We write x for the constant act φ(s) = x and we write xay
for the act φ that yields x if the event a ⊂ S occurs and y if a does not occur.
We say that one preference is more cautious than another if, to every act, the former
assigns a lower certainty equivalent (i.e., constant act) than the latter.
Definition: The preference $\succeq$ is more cautious than the preference $\hat{\succeq}$ if $x \mathrel{\hat{\succeq}} \phi$ implies
$x \succeq \phi$ for every φ ∈ Φ.
The interval utility u measures uncertainty attitude and thus we can define the following comparative measure of uncertainty aversion.
Definition: The interval utility u is more uncertainty averse than the interval utility û
if $\succeq^u_\pi$ is more cautious than $\succeq^{\hat u}_\pi$ for every π.
Given any interval utility u, recall that $v_u(x) = u(x, x)$ for all x ∈ X. For x, y ∈ X
such that x < y, let
$$\sigma_u^{xy} = \frac{y - v_u^{-1}(u(x, y))}{y - x}$$
The interval utility is monotone and therefore u(x, y) ∈ [u(x, x), u(y, y)]. This, in turn,
implies that $\sigma_u^{xy} \in [0, 1]$. For x < y, we have $u(x, y) = v_u(\sigma_u^{xy} x + (1 - \sigma_u^{xy})y)$. Hence,
$\sigma_u^{xy} x + (1 - \sigma_u^{xy})y$ is the certainty equivalent of the interval [x, y].
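The definition of the weight σ can be illustrated numerically. The sketch below inverts v_u by bisection and recovers both σ and the certainty equivalent of an interval. The interval utility `u_example` (a weighted sum of square roots) and the helper names are assumptions chosen for illustration, not objects from the paper.

```python
def inverse_increasing(v, target, lo, hi, tol=1e-12):
    """Invert a continuous, strictly increasing v on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if v(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sigma_and_ce(u, x, y):
    """Return (sigma, ce) with ce = v_u^{-1}(u(x, y)) and sigma = (y - ce) / (y - x), for x < y."""
    v = lambda t: u(t, t)                      # v_u(t) = u(t, t)
    ce = inverse_increasing(v, u(x, y), x, y)  # monotonicity puts ce in [x, y]
    return (y - ce) / (y - x), ce

# Hypothetical interval utility on nonnegative prizes.
def u_example(x, y):
    return 0.7 * x ** 0.5 + 0.3 * y ** 0.5

sigma, ce = sigma_and_ce(u_example, 0.0, 1.0)
print(sigma, ce)
```

For this u, v_u is the square root, so the interval [0, 1] has utility 0.3 and certainty equivalent close to 0.09; most of the weight falls on the lower endpoint.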
Proposition 2: Interval utility û is more uncertainty averse than u if and only if $v_{\hat u} \circ v_u^{-1}$
is concave and $\sigma_{\hat u}^{xy} \geq \sigma_u^{xy}$ for all x, y.
The function vu describes the interval utility for degenerate intervals [x, x]. As Proposition 2 shows, the more uncertainty averse preference has a more concave vu . This part
of the comparative measure mirrors the standard comparative measure of risk aversion for
expected utility maximizers. For non-degenerate intervals, the more uncertainty averse
interval utility has a lower certainty equivalent than the less uncertainty averse interval
utility. This is the novel part that generalizes risk aversion to uncertainty aversion.
Next, we use our definition of “more uncertainty averse” to derive a measure for “more
uncertain than.” Fix π and consider two events a, b ⊂ S. If ºuπ prefers betting on a to
betting on b for every u, then a is a better bet; that is, in a strong sense, a is more likely
than b. On the other hand, if some u prefer a and others prefer b, then uncertainty attitude
is determining the decision makers’ ranking of a and b; in particular, if more uncertainty
averse u’s prefer b to a while less uncertainty averse ones have the opposite ranking, then
we say a is more uncertain than b.
Definition: Event a is more uncertain than b for π if there exist û, u such that û is
more uncertainty averse than u and
$$yax \succ^u_\pi ybx \quad \text{and} \quad ybx \succ^{\hat u}_\pi yax$$
for x < y.
Next, we relate the comparative measure “more uncertain than” to the parameter π.
Given any probability π on P, set $\pi_*(\emptyset) = 0$. Then, let
$$\pi_*(a) = \sum_{b \subset a} \pi(b)$$
for all a ∈ P and let
$$\pi^*(a) = 1 - \pi_*(a^c)$$
for all a ⊂ S. Note that $\pi^*(a) - \pi_*(a) \geq 0$ for any a.
Proposition 3: Event a is more uncertain than event b for π if and only if $\pi_*(b) > \pi_*(a)$
and $\pi^*(a) > \pi^*(b)$.
To illustrate Proposition 3, consider the following example: there are three states,
S = {1, 2, 3}, π({1}) = 1/3, π({2}) = π({3}) = 0 and π({2, 3}) = 2/3. In this example,
$\pi_*(\{1\}) = \pi^*(\{1\}) = 1/3$ while $\pi_*(\{2\}) = 0$ and $\pi^*(\{2\}) = 2/3$ and therefore state
2 is more uncertain than state 1. Furthermore, if a = {1, 2} and b = {2, 3} then
$$\pi_*(b) = 2/3; \quad \pi^*(b) = 2/3$$
$$\pi_*(a) = 1/3; \quad \pi^*(a) = 1$$
and therefore a is more uncertain than b.
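The three-state example can be checked mechanically. The sketch below computes the lower and upper values of π for an event and applies the criterion of Proposition 3. The function names `inner`, `outer` and `more_uncertain` are hypothetical labels introduced here.

```python
from fractions import Fraction

def inner(pi, a):
    """Lower value of a: total mass of the events contained in a."""
    return sum((m for b, m in pi.items() if b <= a), Fraction(0))

def outer(pi, a, states):
    """Upper value of a: one minus the lower value of the complement of a."""
    return 1 - inner(pi, states - a)

def more_uncertain(pi, a, b, states):
    """Criterion of Proposition 3 for 'a is more uncertain than b'."""
    return inner(pi, b) > inner(pi, a) and outer(pi, a, states) > outer(pi, b, states)

S = frozenset({1, 2, 3})
pi = {frozenset({1}): Fraction(1, 3), frozenset({2, 3}): Fraction(2, 3)}

a, b = frozenset({1, 2}), frozenset({2, 3})
print(inner(pi, a), outer(pi, a, S), inner(pi, b), outer(pi, b, S))
print(more_uncertain(pi, frozenset({2}), frozenset({1}), S), more_uncertain(pi, a, b, S))
```

Events that carry zero mass, such as {2} and {3}, are simply omitted from the dictionary.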
5. Uncertainty and the Ellsberg Paradox
In this section, we will relate EUU theory to observed behavior in various versions of
the Ellsberg experiment. Our goal is not only to show that EUU theory is flexible enough
to accommodate the Ellsberg paradox but also to take advantage of the separation between
uncertainty perception and uncertainty attitude to relate a decision maker’s propensity for
Ellsberg-paradox behavior to his uncertainty aversion parameter.
Our general formulation of an Ellsberg experiment is as follows: there are two possible
prizes y = 1 and x = 0. Given any event b ⊂ S, a bet is an act that delivers 1 if b occurs
and 0 otherwise. Hence, we identify such an act with the event b. The experimenter elicits
the decision makers’ preferences over some collection of bets: $B \subset 2^S$. The subjects are
told that one or more urns have each been filled with m balls of various colors. More
specifically, the subjects are told how many different color balls are available and which
particular color configurations are not allowed.
An outcome s ∈ S is a color configuration (one color for each ball in each urn) and
a draw from each urn. For example, in one experiment the subjects may be told that
there is one urn with three balls colored red, green or white. Furthermore, ball 1 is always
red. With this description, the subjects understand that of the nine possible ($3^2$) color
configurations for balls 2 and 3, only four are permitted: both green; ball 2 green, ball 3
white; ball 2 white, ball 3 green; and both white. Given these four possible ways to fill the
urn with three balls, the experiment has 4 × 3 = 12 outcomes. A color event is the set of
all outcomes associated with a particular color for the ball drawn.
The defining feature of an Ellsberg experiment is that for some events a ∈ B, the number of possible outcomes associated with a in each feasible configuration of the urn is fixed.
For example, ex post (i.e., upon inspecting the contents of the urn) a = {green, white}
has a 2/3 chance of winning in every configuration. We call such events experimentally
unambiguous. In contrast, a bet on b = {red, green} has a 2/3 chance of winning in
two configurations, a 1/3 chance in one configuration and is a sure winner in the final
configuration. Hence, b is experimentally ambiguous.
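The ex post winning chances in this three-ball example can be tabulated by enumerating the permitted fillings of the urn. The sketch below assumes, as in the list of permitted configurations, that ball 1 is red and balls 2 and 3 are each green or white. The names `configs` and `win_chances` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Ball 1 is red; balls 2 and 3 are each green or white (the four permitted fillings).
configs = [("red",) + rest for rest in product(("green", "white"), repeat=2)]

def win_chances(colors):
    """Chance that the drawn ball has a color in `colors`, one entry per configuration."""
    return [Fraction(sum(ball in colors for ball in cfg), len(cfg)) for cfg in configs]

chances_a = win_chances({"green", "white"})
chances_b = win_chances({"red", "green"})
print(chances_a)
print(chances_b)
```

A bet is experimentally unambiguous exactly when its list of chances is constant across configurations.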
Note that the events b and a above are comparable in the sense that both contain
the same number of elements of S. Note also that both of the notions above, i.e., experimentally unambiguous and comparable, are closed under complements. That is, if a
is experimentally unambiguous, then so is $a^c$, and if a is comparable to b, then $a^c$ is comparable to $b^c$.
Hence, an Ellsberg paradox is a situation in which the decision maker prefers every experimentally unambiguous bet to any comparable experimentally ambiguous bet so that we
have
$$a \succ b \quad \text{and} \quad a^c \succ b^c$$
which is inconsistent with any betting preference that can be represented with a probability
on S.
We normalize u(1, 1) = 1, u(0, 0) = 0 and u(0, 1) = z and note that
$$z = v_u(1 - \sigma_u^{01})$$
measures the uncertainty aversion of the EUU decision maker in the Ellsberg experiment.
Lower values of z correspond to greater uncertainty aversion. In the Ellsberg experiment,
EUU preferences depend only on π and z. Hence, we write $\succeq^z_\pi$ rather than $\succeq^u_\pi$.
Let S = {1, . . . , n} where n = km. For t = 1, . . . , k and i = 1, . . . , m, the state
s = (t − 1)m + i ∈ S represents the outcome in which the i’th ball has been drawn from
an urn that has been filled according to the t’th configuration. Hence, we can identify S
with the matrix $(s_{it})$ as described in the figure below:
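$$\begin{array}{c|ccc}
 & \text{config. } 1 & \cdots & \text{config. } k \\ \hline
\text{ball } 1 & s_{11} & \cdots & s_{1k} \\
\vdots & \vdots & \ddots & \vdots \\
\text{ball } m & s_{m1} & \cdots & s_{mk}
\end{array}$$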
Let |a| denote the cardinality of the set a and B be an algebra of subsets of S. For
any event a, let $a_t = \{s \in a \mid s = (t - 1)m + i \text{ for some } i = 1, \ldots, m\}$ be the outcomes in
a associated with the t’th possible configuration; that is, the elements from the t’th column
of the above matrix that are in a. An event a ∈ B is experimentally unambiguous if
$\min_t |a_t| = \max_t |a_t|$; otherwise, it is experimentally ambiguous.
Note that complements of experimentally unambiguous events are experimentally unambiguous and disjoint unions of experimentally unambiguous events are experimentally
unambiguous. However, intersections of experimentally unambiguous events need not be
experimentally unambiguous.10
Let A be the collection of all experimentally unambiguous events in B. The collection
B is an Ellsberg experiment if there exist a ∈ A and b ∈ B\A such that |a| = |b|. Given
any Ellsberg experiment B and preference ≽ on Φ, (B, ≽) is an Ellsberg Paradox if for all
a, b ∈ B such that |a| = |b|,
$$a \sim b \ \text{ whenever } a, b \in A$$
$$a \succ b \ \text{ whenever } a \in A,\ b \notin A$$
Proposition 4 below shows that for any Ellsberg experiment, there is an uncertainty perception that renders each experimentally ambiguous event more uncertain than every comparable experimentally unambiguous event. Moreover, the experiment yields a paradox
for any decision maker with that perception and greater uncertainty aversion than a benchmark.
10 Epstein and Zhang (2001) define unambiguous events and argue that the set of all unambiguous events
need not be closed under intersections. Our notion of an experimentally unambiguous event supports their
argument. The Epstein-Zhang argument is based on Zhang (2002)’s four-color urn example, which we
discuss below.
Proposition 4: For any Ellsberg experiment B, there exist π and z* > 0 such that
(i) a ∈ A, b ∈ B\A implies b is more uncertain than a whenever |a| = |b|, and
(ii) $(B, \succeq^z_\pi)$ is an Ellsberg paradox whenever z < z*.
Next, we apply our definition of Ellsberg experiments and Proposition 4 to three
canonical versions of the Ellsberg paradox.
The One-Urn Paradox: One ball is drawn from an urn that contains 3 balls. It is known
that exactly one ball is red and the remaining 2 balls are either white or green. The exact
number of white balls is not known. Let S = {sit } for i = 1, 2, 3, t = 1, 2, 3, 4. Suppose
the three balls are numbered 1, 2, 3 and ball 1 is always red. Each column, t, depicts one
possible color scheme for the remaining two balls; each row corresponds to a particular
ball, 1, 2, or 3 being drawn. Hence,
$$S = \begin{bmatrix} r & r & r & r \\ w & w & g & g \\ w & g & w & g \end{bmatrix}$$
Let B = {r, g, w} be the three color events. All three color events have 4 elements
but $r = \{s_{11}, s_{12}, s_{13}, s_{14}\}$ is experimentally unambiguous since $|r_t| = |\{s_{1t}\}| = 1$ for every
t ∈ {1, 2, 3, 4}, while $w = \{s_{21}, s_{31}, s_{22}, s_{33}\}$ and $g = \{s_{32}, s_{23}, s_{24}, s_{34}\}$ are experimentally
ambiguous since $|w_1| = |\{s_{21}, s_{31}\}| = 2 = |g_4|$ and $|w_4| = 0 = |g_1|$.
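The column counts used in this argument are easy to reproduce. The sketch below stores the one-urn matrix as a list of rows, read off the events r, w and g listed above, and tests the condition that the smallest and largest column counts coincide. The function names are hypothetical.

```python
# Rows are balls 1-3, columns are the four configurations of the one-urn example.
S = [["r", "r", "r", "r"],
     ["w", "w", "g", "g"],
     ["w", "g", "w", "g"]]

def column_counts(color):
    """Number of balls of the given color in each configuration (column)."""
    return [sum(row[t] == color for row in S) for t in range(len(S[0]))]

def experimentally_unambiguous(color):
    """True when every column contains the same number of outcomes of the event."""
    counts = column_counts(color)
    return min(counts) == max(counts)

for color in "rwg":
    print(color, column_counts(color), experimentally_unambiguous(color))
```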
Next, we will find a probability π and a threshold z* to illustrate the statement in
Proposition 4. Let π be such that $\pi(\{s_{it}\}) = \alpha$ for all $s_{it}$. Moreover, if a is a 4 element set
that takes exactly one element from each column then π(a) = β. That is, if $|a_t| = 1$ for all
t then π(a) = β. Note that there are $3^4$ sets with this property. Thus, choose α ≥ 0, β > 0
such that
$$\alpha = \pi(a) \ \text{ whenever } |a| = 1$$
$$\beta = \pi(a) \ \text{ whenever } \min_t |a_t| = \max_t |a_t| = 1$$
$$1 = 12\alpha + 81\beta$$
Bets on events with $|a_t| = 1$ for all t yield a 1/3 chance of winning in every configuration.
Hence, those events are experimentally unambiguous. The construction of π above assigns
β to every experimentally unambiguous event.
Then,
$$\pi_*(r) = \sum_{a \subset r} \pi(a) = 4\alpha + \beta$$
$$\pi^*(r) = 1 - \sum_{a \subset r^c} \pi(a) = 1 - (8\alpha + 2^4\beta) = 4\alpha + 65\beta$$
$$\pi_*(w) = \pi_*(g) = 4\alpha$$
$$\pi^*(w) = \pi^*(g) = 4\alpha + 69\beta$$
Hence, the events g and w are more uncertain than the event r. A similar calculation
reveals that the event r ∪ w is more uncertain than the event w ∪ g. Recall that u(1, 1) =
1, u(0, 0) = 0 and u(0, 1) = z. The events that intersect both r and $r^c$ have total mass
$\pi^*(r) - \pi_*(r) = 64\beta$. Thus,
$$U(r) = (4\alpha + \beta)u(1, 1) + 64\beta u(0, 1) = 4\alpha + \beta + 64\beta z$$
$$U(g) = 4\alpha u(1, 1) + 69\beta u(0, 1) = 4\alpha + 69\beta z$$
and, therefore, the above example is an Ellsberg paradox for z < 1/5.
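The construction for the one-urn example can be verified by brute force. The sketch below builds π on the 12 singletons and the 81 sets that take one element from each column, then evaluates bets with u(1, 1) = 1, u(0, 0) = 0 and u(0, 1) = z. The particular masses α = 1/24 and β = 1/162 are an arbitrary choice satisfying 12α + 81β = 1, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Rows are balls 1-3, columns are the four configurations (read off the events r, w, g).
colors = [["r", "r", "r", "r"],
          ["w", "w", "g", "g"],
          ["w", "g", "w", "g"]]
states = frozenset((i, t) for i in range(3) for t in range(4))

# Illustrative masses (an assumption): any alpha >= 0, beta > 0 with 12*alpha + 81*beta = 1 works.
alpha, beta = Fraction(1, 24), Fraction(1, 162)
pi = {frozenset({s}): alpha for s in states}
for pick in product(range(3), repeat=4):            # one ball per column: 3**4 = 81 events
    pi[frozenset((pick[t], t) for t in range(4))] = beta

def event(color):
    return frozenset(s for s in states if colors[s[0]][s[1]] == color)

def inner(a):
    """Lower value: mass of the events contained in a."""
    return sum((m for b, m in pi.items() if b <= a), Fraction(0))

def outer(a):
    """Upper value: one minus the lower value of the complement."""
    return 1 - inner(states - a)

def bet_value(a, z):
    """Value of a bet on a: events inside a pay 1, events straddling a pay z, others pay 0."""
    return inner(a) + (outer(a) - inner(a)) * z

r, w, g = event("r"), event("w"), event("g")
print(inner(r), outer(r), inner(w), outer(w))
print(bet_value(r, Fraction(1, 5)), bet_value(g, Fraction(1, 5)))
```

The coefficient on z in the value of a bet is the mass of the events that meet both the event and its complement. For r this is 81 − 1 − 16 = 64 sets of mass β, which pins down the value of z at which the bets on r and on g are equally attractive.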
The Two-Urn Paradox: Urn I contains one red ball and one white ball; urn II contains
two balls that are red or white and no more is known. One ball will be drawn from each
urn.
Let m = 4 and k = 4 and therefore S = {1, . . . , 16}. Each column of S corresponds
to color choices for the two balls in urn II. In column 1, both balls are white, in column 2,
ball 1 is red and ball 2 is white, etc. Each row corresponds to a pair of draws, one from
each urn. Ball 1 is chosen from urn I in the first two rows, while ball 2 is chosen from Urn
I in the last two rows. In odd rows, ball 1 is chosen from urn II while in even rows ball
2 is chosen from urn II. Combining the information about the composition of urn II and
ball draws from both urns yields the following matrix:
$$S = \begin{bmatrix} rw & rr & rw & rr \\ rw & rw & rr & rr \\ ww & wr & ww & wr \\ ww & ww & wr & wr \end{bmatrix}$$
Let B be all combinations of color draws from the two urns. That is, B is the algebra
generated by the partition {rr, rw, wr, ww}. Since the color draw from the first urn depends
only on the number of the ball drawn, urn I color events are experimentally unambiguous;
that is, rr ∪ rw and wr ∪ ww both contain two elements from each column. In contrast,
urn II color events are experimentally ambiguous. For example, rw ∪ ww has the same
number of elements as rr ∪ rw, contains the entire first column but does not intersect the
fourth column.
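The two-urn matrix can be generated directly from the verbal description, which makes the column counts easy to audit. In the sketch below, the ordering of urn II fillings and of the draws follows the text; the variable and function names are hypothetical.

```python
from itertools import product

urn1 = ("r", "w")                                   # urn I: ball 1 is red, ball 2 is white
urn2_fillings = [("w", "w"), ("r", "w"), ("w", "r"), ("r", "r")]   # columns 1 to 4
draws = list(product((0, 1), repeat=2))             # (ball index from urn I, ball index from urn II)

# S[row][column]: first letter is the urn I color, second letter the urn II color.
S = [[urn1[i] + filling[j] for filling in urn2_fillings] for (i, j) in draws]

def column_counts(labels):
    """Number of outcomes in each column whose label belongs to `labels`."""
    return [sum(row[t] in labels for row in S) for t in range(len(urn2_fillings))]

for row in S:
    print(row)
print(column_counts({"rr", "rw"}), column_counts({"wr", "ww"}), column_counts({"rw", "ww"}))
```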
A construction analogous to the one for the single urn example above can be used to
illustrate Proposition 4 in this example.
Zhang’s Four-Color Urn: One ball is drawn from an urn with 2 balls; all balls are
either red, white, green or orange. It is known that there is exactly one ball in each of
the following two categories: (1) red or white and (2) red or green. It follows from this
information that there is also one ball in each of the following two categories: (3) orange or
green and (4) orange or white. Zhang defines each of the four events above as unambiguous
and concludes that the union of unambiguous events may not be unambiguous.
Let $S = \{s_{ti}\}$ for t = 1, 2, 3, 4 and i = 1, 2 where t indexes the color of ball 1 (r, o, w
or g) and i the number of the ball drawn (1 or 2). Let B be all color draws from the
urn. If ball 1 is red then ball 2 is orange. Conversely, if ball 1 is orange then ball 2 is red.
Therefore, columns 1 and 2 in the matrix below are r, o and o, r respectively. Similarly,
if ball 1 is w then ball 2 is g and if ball 1 is g then ball 2 is w. Columns 3 and 4 of the
matrix below describe the corresponding configurations of the urn.
$$S = \begin{bmatrix} r & o & w & g \\ o & r & g & w \end{bmatrix}$$
All two-color events have four elements. The events r ∪ o and w ∪ g are experimentally
ambiguous; that is, these events do not contain the same number of elements from each
column. All other two-color events contain exactly one element from each column.
Next, we find a probability π and a threshold z* to illustrate the statement in Proposition 4. Let π be the following probability. There is α ≥ 0, β > 0 such that
$$\alpha = \pi(a) \ \text{ whenever } |a| = 1$$
$$\beta = \pi(a) \ \text{ whenever } \min_t |a_t| = \max_t |a_t| = 1$$
$$1 = 8\alpha + 16\beta$$
since there are $2^4 = 16$ sets that take exactly one element from each column.
We must show that the two color event r ∪ o is more uncertain than any other two color
event. We have $\pi_*(r \cup o) = 4\alpha$, $\pi^*(r \cup o) = 4\alpha + 16\beta$ and $\pi_*(r \cup w) = 4\alpha + \beta$, $\pi^*(r \cup w) =
4\alpha + 15\beta$. Hence, the event r ∪ w is less uncertain than the event r ∪ o. The same is true
for other two color events. For this specification of the probability π, the four color urn is
an Ellsberg paradox if z < z* = 1/2.
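The same brute-force check applies to the four-color urn. The sketch below enumerates the sets that take exactly one element from each column of the 2 × 4 matrix and computes the lower and upper values of two-color events. The mass α = 1/16 is an arbitrary illustrative choice, β is then fixed by normalization, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Rows: ball drawn (1 or 2); columns: the four configurations of the urn.
colors = [["r", "o", "w", "g"],
          ["o", "r", "g", "w"]]
states = frozenset((i, t) for i in range(2) for t in range(4))
transversals = [frozenset((pick[t], t) for t in range(4))
                for pick in product(range(2), repeat=4)]

alpha = Fraction(1, 16)                         # illustrative choice (an assumption)
beta = (1 - 8 * alpha) / len(transversals)      # normalization: total mass equals 1
pi = {frozenset({s}): alpha for s in states}
pi.update({a: beta for a in transversals})

def event(*cs):
    return frozenset(s for s in states if colors[s[0]][s[1]] in cs)

def inner(a):
    return sum((m for b, m in pi.items() if b <= a), Fraction(0))

def outer(a):
    return 1 - inner(states - a)

def bet_value(a, z):
    """Value of a bet on a with u(1,1) = 1, u(0,0) = 0 and u(0,1) = z."""
    return inner(a) + (outer(a) - inner(a)) * z

ro, rw = event("r", "o"), event("r", "w")
print(len(transversals), inner(ro), outer(ro), inner(rw), outer(rw))
```

Under these assumptions the bets on r ∪ w and r ∪ o have equal value exactly at z = 1/2, and the bet on r ∪ w is worth more for smaller z.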
5.1 Machina Reversals
Recently, Machina (2009) showed that Choquet expected utility theory (and related
models) cannot accommodate variations of the Ellsberg paradox that appear plausible
and even natural. In this subsection, we will show that within EUU theory the behavior
described by Machina is synonymous with the nonseparability of u.
Next, we describe Machina’s urn experiment: Let S = {1, 2, 3, 4}; to be concrete,
suppose a ball will be drawn from an urn that is known to have 20 balls. It is also known
that 10 of these balls are marked 1 or 2 and the other 10 balls are marked 3 or 4.
We identify each φ ∈ Φ with (φ(1), φ(2), φ(3), φ(4)) ∈ $X^4$. Machina (2008) observes
that if ≽ is any Choquet expected utility such that
$$(w, x, y, z) \sim (x, w, y, z) \sim (y, z, x, w) \qquad (7)$$
for all w, x, y, z ∈ X, then
$$(x_1, x_3, x_2, x_4) \sim (x_1, x_4, x_2, x_3)$$
whenever x1 ≥ x2 ≥ x3 ≥ x4 . He notes that this indifference may be an undesirable
restriction for a flexible model of decision making under uncertainty.
Call it an M-reversal if a preference ≽ on Φ is not indifferent between $(x_1, x_3, x_2, x_4)$
and $(x_1, x_4, x_2, x_3)$ for some $x_1 \geq x_2 \geq x_3 \geq x_4$ despite satisfying (7). Then, Machina
notes that CEU theory permits no M-reversals and argues that this may be an unwarranted
restriction on a model of uncertainty.
Baillon, L’Haridon and Placido (2010) observe that other well-known models of uncertainty also preclude M-reversals. In particular, they note that α−maxmin expected utility
and Klibanoff, Marinacci and Mukerji’s (2005) smooth model of ambiguity also rule them
out. Finally, Baillon et al. verify that Siniscalchi’s (2009) vector valued expected utility
model does permit M-reversals.
Recent experimental evidence reported in L’Haridon and Placido (2010) confirms
Machina’s intuition. L’Haridon and Placido show that over 70 percent of subjects reveal M-reversals. Below, we show that an EUU preference generates no M-reversals if and
only if its interval utility is separable.
A function w : $X^2$ → ℝ is symmetric if w(x, y) = w(y, x) for all x, y ∈ X. We identify
any symmetric w with its restriction to I. Conversely, we identify any interval utility u with
$w_u$, its symmetric extension to $X^2$. That is, $w_u(x, y) = u(\min\{x, y\}, \max\{x, y\})$.
A binary relation ≽ on Φ is a Machina preference if there exists a continuous,
increasing and symmetric function w such that the function V defined by
$$V(x_1, x_2, x_3, x_4) = w(x_1, x_2) + w(x_3, x_4)$$
represents it. We write $\succeq_w$ to denote the Machina preference associated with w. Let
$\pi^o(\{1, 2\}) = \pi^o(\{3, 4\}) = 1/2$ and let $\pi^o(a) = 0$ for all a ∈ P such that {1, 2} ≠ a ≠ {3, 4}.
Hence, for every interval utility u, the Machina preference $\succeq_{w_u}$ is identical to the EUU
preference $\succeq^u_{\pi^o}$.
The interval utility u is separable if there are $v_1, v_2$ : X → ℝ such that
$$u(x, y) = v_1(x) + v_2(y)$$
for all (x, y) ∈ I.
Proposition 5: The EUU $\succeq^u_{\pi^o}$ has no M-reversals if and only if u is separable.
Proposition 5 shows that Machina reversals occur if the interval utility is not separable.
The L’Haridon and Placido experiments show that the majority (roughly 2/3) of the
subjects that reveal M-reversals prefer “packaging” the two extreme outcomes together.
That is, for $x_1 > x_2 = x_3 > x_4$,
$$(x_1, x_4, x_2, x_3) \succ (x_1, x_3, x_2, x_4)$$
This pattern of preference is implied by an EUU preference with a strictly supermodular
interval utility; that is,
$$u(x_4, x_1) + u(x_3, x_2) > u(x_3, x_1) + u(x_4, x_2)$$
whenever $x_1 > x_2 \geq x_3 > x_4$.
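A small numerical sketch illustrates the link between separability and M-reversals. It evaluates the Machina functional V for a separable and a non-separable interval utility. Both utilities and the prizes 4 > 3 > 1 > 0 are hypothetical choices for illustration; the non-separable one is increasing in both arguments on the range of prizes used.

```python
def machina_value(act, u):
    """V(x1, x2, x3, x4) = w(x1, x2) + w(x3, x4), with w the symmetric extension of u."""
    w = lambda x, y: u(min(x, y), max(x, y))
    x1, x2, x3, x4 = act
    return w(x1, x2) + w(x3, x4)

# Separable: u(x, y) = v1(x) + v2(y) with v1(x) = 2x and v2(y) = y.
def u_separable(x, y):
    return 2 * x + y

# Hypothetical non-separable utility; the squared term rewards wide intervals.
def u_nonseparable(x, y):
    return 20 * x + 10 * y + (y - x) ** 2

x1, x2, x3, x4 = 4, 3, 1, 0
inner_pairing = (x1, x3, x2, x4)   # pairs x1 with x3 and x2 with x4
outer_pairing = (x1, x4, x2, x3)   # packages the two extreme outcomes together

for u in (u_separable, u_nonseparable):
    print(u.__name__, machina_value(inner_pairing, u), machina_value(outer_pairing, u))
```

The separable utility is indifferent between the two acts, while the non-separable one strictly prefers the act that packages the extreme outcomes, which is an M-reversal.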
6. Related Literature
One possible way to organize the literature on uncertainty and uncertainty aversion
is to group models according to the extent to which uncertainty/ambiguity is built into
the choice object. At one extreme, there are papers such as Gilboa (1987), Casadesus-Masanell, Klibanoff and Ozdenoren (2000), Epstein and Zhang (2001), the current paper
and a number of others that study preferences over Savage acts over an unstructured state
space. At the other extreme, there are papers that introduce novel choice objects designed
to reflect the decision makers’ perception of uncertainty. For example, in Olszewski (2007)
and Ahn (2008), the choice objects are sets of lotteries (i.e., probability distributions
over prizes). Sets with a single lottery correspond to situations in which the decision maker can reduce all uncertainty to risk while sets with multiple lotteries depict Knightian
uncertainty.
Similarly, Jaffray (1989) investigates preferences over belief functions (i.e., totally
monotone capacities) over prizes. Hence, unlike expected utility maximizers and the general class of nonexpected utility maximizers considered in Machina and Schmeidler (1992),
Ahn’s, Olszewski’s and Jaffray’s decision makers choose within a class of objects that contain lotteries but are richer. The new objects enable these authors to model perceptions
of uncertainty that cannot be reduced to risk. These models are silent on how “real-life”
prospects are reduced to these choice objects. That is, these models do not describe how
Savage acts are identified with the new choice objects.
In between these two classes of models, there are those that partially build in ambiguity or at least a distinction between uncertainty and risk. Segal (1990), Klibanoff,
Marinacci and Mukerji (2005), Nau (2006) and Ergin and Gul (2009) achieve the desired
effect by structuring the state space; in the first two papers, uncertainty resolves in two
stages; the first stage represents ambiguity, the second stage is risk. The remaining two
papers assume that the state space has a product structure and identify one dimension
with ambiguity and the other with risk.
The extensive literature on ambiguity models in the Anscombe-Aumann framework
also falls into this intermediate category. This literature includes Schmeidler (1989), who
introduces Choquet expected utility, Gilboa and Schmeidler (1989), who introduce maxmin
expected utility, the generalizations of maxmin expected utility, such as α-maxmin expected utility preferences (see Ghirardato and Marinacci (2001b)), variational preferences
of Maccheroni, Marinacci and Rustichini (2006), the general uncertainty averse preferences
of Cerreia-Vioglio, Maccheroni, Marinacci, and Montrucchio (2008), as well as Zhang’s
(2002) model of Choquet expected utility with inner probabilities, Lehrer’s (2007) model
of partially specified probabilities, and Siniscalchi’s (2006) vector expected utility theory.
A full comparison between our model and each of the many papers in the ambiguity
literature is not feasible. However, we will relate our model to the closest existing alternatives. Many of these comparisons are complicated by relatively minor differences11 or
differences in “technical” assumptions such as finite versus countable additivity. In the
following section we provide a detailed comparison between EUU, CEU and MEU in finite
settings. Here, we discuss only the three models that are most closely related to EUU
theory:
As noted above, Jaffray (1989) provides a utility theory for belief functions over a
finite set of prizes. Hence, Jaffray’s model extends von Neumann-Morgenstern’s expected
utility theory by incorporating the decision maker’s perception of ambiguity into the choice
objects. A belief function that assigns probability 0 to both singleton sets {x}, {y} but assigns probability 1 to the set {x, y} depicts a situation in which the decision maker knows
that he will end up with either x or y but views any remaining uncertainty as irreducible
to risk. Jaffray adopts the von Neumann-Morgenstern assumptions and imposes one additional assumption to characterize a class of preferences, call them Jaffray preferences, over
capacities.
11 One such difference is that in many models, the set of prizes is arbitrary while many other papers,
including the current one, consider an interval of real numbers.
To see the relationship between Jaffray’s model and EUU theory, take any EUU
preference $\succeq^u_\mu$, fix a finite subset of prizes $X^o$ and consider only acts that yield prizes in
$X^o$. Then, we can associate a capacity κ over the set of all nonempty subsets of $X^o$ with
each act f in a natural way by letting κ(Y) be the inner probability of the event $f^{-1}(Y)$
with respect to the prior µ. It can be shown that the EUU preference $\succeq^u_\mu$ will be indifferent
between two acts that yield the same capacity through this procedure.
Hence, an EUU preference induces a preference over capacities over prizes. It can
also be shown that this induced preference will satisfy Jaffray’s assumptions. Hence, EUU
theory and Jaffray’s model stand roughly in the same relationship as Savage’s theory and
von Neumann-Morgenstern theory: one takes as given lotteries (probability distributions in
vNM theory, capacities in the Jaffray model) as the choice objects, the other starts with
acts and shows that each act can be identified with a lottery in a natural way and ensures
that each preference (EU or EUU) over acts induces a preference over lotteries (EU or
Jaffray).
Zhang (2002) studies preferences in the Savage setting and Lehrer (2007) considers the
Anscombe-Aumann framework with a finite state space. Ignoring the difference in the sets
of prizes, in the state spaces, and in the underlying axioms, we can state the relationship
among these two models and EUU theory as follows: each Lehrer representation can
be identified with a Zhang representation (and vice versa) by identifying every partially
specified probability with its inner probability extension to the set of all subsets of the
state space. Then, it can be verified that every Lehrer/Zhang representation is equivalent
to an EUU representation for some interval utility u such that u(x, y) = u(x, y′) for all
x, y, y′. Hence, Lehrer/Zhang preferences correspond to the subclass of EUU preferences
for which the interval utility depends only on the lower end of the interval.
7. EUU, Choquet and α-Maxmin Expected Utility
In this section we provide a detailed comparison between EUU, Choquet expected
utility theory (CEU) and α-maxmin expected utility theory (α-MEU) in finite settings.
Propositions 6 and 7, below, show that a discrete EUU $\succeq^u_\pi$ with an interval utility u such
that
$$u(x, y) = \beta v_u(x) + (1 - \beta)v_u(y)$$
for some β ∈ [0, 1] is both a CEU preference and an α-MEU preference.
7.1 Choquet Expected Utility
A function κ : $2^S$ → [0, 1] is a capacity if (i) κ(∅) = 0, κ(S) = 1 and (ii) a ⊂ b implies
κ(a) ≤ κ(b). A binary relation ≽ on Φ is a CEU preference if there exist a capacity κ and
a continuous strictly increasing function v : X → ℝ such that the function V : Φ → ℝ
defined by
$$V(\phi) = \int v(\phi)\, d\kappa$$
represents ≽, where the integral above denotes the Choquet integral. Let $\succeq_{\kappa v}$ denote the
CEU associated with capacity κ and utility index v.
The proposition below establishes that any CEU with a capacity of the form κ =
απ∗ + (1 − α)π ∗ and utility index v is identical to the EUU with probability π and interval
utility u such that u(x, y) = αv(x) + (1 − α)v(y).
Proposition 6: If $\kappa = \alpha\pi_* + (1 - \alpha)\pi^*$ and u(x, y) = αv(x) + (1 − α)v(y) for all (x, y) ∈ I
then $\succeq_{\kappa v} = \succeq^u_\pi$.
It follows from Proposition 6 and our analysis of uncertainty aversion for EUU preferences that among CEUs that are also EUUs, $\succeq_{\hat\kappa\hat v}$ is more cautious than $\succeq_{\kappa v}$ whenever
$\hat v \circ v^{-1}$ is concave, $\hat\kappa = \hat\alpha\pi_* + (1 - \hat\alpha)\pi^*$, $\kappa = \alpha\pi_* + (1 - \alpha)\pi^*$ and $\hat\alpha \geq \alpha$.
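The identity in Proposition 6 can be spot-checked numerically on a small example. The sketch below computes a finite Choquet integral with respect to the capacity that mixes the lower and upper values of π with weights α and 1 − α, and compares it with the EUU value for u(x, y) = αv(x) + (1 − α)v(y). The state space, π, the act, α = 3/4 and the identity utility index are illustrative assumptions, and the function names are hypothetical.

```python
from fractions import Fraction

def inner(pi, a):
    """Lower value of a: mass of the events contained in a."""
    return sum((m for b, m in pi.items() if b <= a), Fraction(0))

def kappa(pi, a, states, alpha):
    """Capacity alpha * (lower value of a) + (1 - alpha) * (upper value of a)."""
    return alpha * inner(pi, a) + (1 - alpha) * (1 - inner(pi, states - a))

def choquet(phi, v, pi, states, alpha):
    """Choquet integral of v(phi): utilities in decreasing order, weighted by capacity increments."""
    order = sorted(states, key=lambda s: v(phi[s]), reverse=True)
    total, previous, upper = Fraction(0), Fraction(0), set()
    for s in order:
        upper.add(s)
        current = kappa(pi, frozenset(upper), states, alpha)
        total += v(phi[s]) * (current - previous)
        previous = current
    return total

def euu(phi, v, pi, alpha):
    """EUU value with u(x, y) = alpha * v(x) + (1 - alpha) * v(y)."""
    return sum(m * (alpha * v(min(phi[s] for s in a)) + (1 - alpha) * v(max(phi[s] for s in a)))
               for a, m in pi.items())

S = frozenset({1, 2, 3})
pi = {frozenset({1}): Fraction(1, 3), frozenset({2, 3}): Fraction(2, 3)}
phi = {1: 4, 2: 1, 3: 9}
v = lambda x: x                     # identity utility index, for simplicity
alpha = Fraction(3, 4)

ceu_value = choquet(phi, v, pi, S, alpha)
euu_value = euu(phi, v, pi, alpha)
print(ceu_value, euu_value)
```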
7.2 α-Maxmin Expected Utility
For α ∈ [0, 1], the binary relation ≽ on Φ is an α-MEU preference if there exist a
compact set of probabilities ∆ and a continuous strictly increasing v : X → ℝ such that
the function V defined by
$$V(\phi) = \alpha \min_{\lambda \in \Delta} \sum_{s \in S} v(\phi(s))\lambda(s) + (1 - \alpha) \max_{\lambda \in \Delta} \sum_{s \in S} v(\phi(s))\lambda(s)$$
represents ≽. We let $\succeq_{\alpha\Delta v}$ denote the α-MEU with parameters α, ∆, v.
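Let $\Delta_S$ be the set of all probabilities on S. For any nonempty a ⊂ S, let
$$\Delta_a = \Big\{\lambda \in \Delta_S \ \Big|\ \sum_{s \in a} \lambda(s) = 1\Big\}$$
For any π ∈ Π, define $\Delta_\pi \subset \Delta_S$ as follows:
$$\Delta_\pi = \sum_{a \in P} \pi(a)\Delta_a$$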
The proposition below establishes that any α-MEU with the set of probabilities ∆π is
identical to the EUU with the probability π and the interval utility u such that u(x, y) =
αv(x) + (1 − α)v(y).
Proposition 7: If $\Delta = \Delta_\pi$ and u(x, y) = αv(x) + (1 − α)v(y) for all (x, y) ∈ I, then
$\succeq_{\alpha\Delta v} = \succeq^u_\pi$.
Note that Propositions 6 and 7 identify the same class of EUU preferences and therefore identify preferences that are both CEU and α-MEU. Again, it follows from Proposition
7 and our analysis of uncertainty aversion for EUU preferences that among α-MEUs that
are also EUUs, $\succeq_{\hat\alpha\Delta\hat v}$ is more cautious than $\succeq_{\alpha\Delta v}$ whenever $\hat v \circ v^{-1}$ is concave and $\hat\alpha \geq \alpha$.
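Proposition 7 can be spot-checked in the same way. Because expected utility is linear in the probability λ, its minimum and maximum over the weighted sum of the sets of probabilities concentrated on each event are attained at points that move each event's mass π(a) onto a single state of a. The sketch below enumerates those points; the example data and function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def selections(pi):
    """Probabilities obtained by moving each event's mass onto one of its states."""
    events = list(pi)
    for choice in product(*[sorted(a) for a in events]):
        lam = {}
        for a, s in zip(events, choice):
            lam[s] = lam.get(s, Fraction(0)) + pi[a]
        yield lam

def alpha_meu(phi, v, pi, alpha):
    """alpha * (min expected utility) + (1 - alpha) * (max expected utility) over the selections."""
    values = [sum(v(phi[s]) * p for s, p in lam.items()) for lam in selections(pi)]
    return alpha * min(values) + (1 - alpha) * max(values)

def euu(phi, v, pi, alpha):
    """EUU value with u(x, y) = alpha * v(x) + (1 - alpha) * v(y)."""
    return sum(m * (alpha * v(min(phi[s] for s in a)) + (1 - alpha) * v(max(phi[s] for s in a)))
               for a, m in pi.items())

pi = {frozenset({1}): Fraction(1, 3), frozenset({2, 3}): Fraction(2, 3)}
phi = {1: 4, 2: 1, 3: 9}
v = lambda x: x                     # identity utility index, for simplicity
alpha = Fraction(3, 4)

meu_value = alpha_meu(phi, v, pi, alpha)
euu_value = euu(phi, v, pi, alpha)
print(meu_value, euu_value)
```

The enumeration grows exponentially in the number of events with positive mass, so it is meant only for small examples.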
8. Appendix A: Preliminary Results
8.1 Ideal Splits
For the prior µ, let
$$\mu_*(A) = \sup_{E \in \mathcal{E}_\mu,\ E \subset A} \mu(E)$$
Since µ is a prior this sup is attained. That is, there exists E ⊂ A such that µ∗ (A) =
µ(E). Note that the E with this property is unique up to a set of measure 0. Call E ∈ Eµ ,
the core of A if it has this property; i.e., E ⊂ A, E ∈ Eµ and µ(E) = µ∗ (A).
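Definition: Let $E \in \mathcal{E}_\mu$, N = {1, . . . , n} and $\{A_i\}_{i \in N}$ be a finite partition of E. Let $\mathcal{N}$
be the set of all nonempty subsets of N and for $J \in \mathcal{N}$, let $\mathcal{N}(J) = \{L \in \mathcal{N} \mid L \subset J\}$. A
pairwise disjoint collection $\{E_*^J\}_{J \in \mathcal{N}} \subset \mathcal{E}_\mu$ of subsets of E such that
$$\bigcup_{L \in \mathcal{N}(J)} E_*^L \subset \bigcup_{i \in J} A_i \qquad (A1)$$
and
$$\mu_*\Big(\bigcup_{i \in J} A_i\Big) = \sum_{L \in \mathcal{N}(J)} \mu(E_*^L) \qquad (A2)$$
is called an ideal split of $\{A_i\}_{i \in N}$.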
Lemma A0: (i) Any partition $A_1, A_2$ of $E \in \mathcal{E}_\mu$ has an ideal split. (ii) Any partition
$A_1, A_2, A_3$ of $E \in \mathcal{E}_\mu$ such that $A_3 = E \cap D$ for some $D \in \mathcal{D}$ has an ideal split such that
$E_*^J = \emptyset$ whenever $J \neq \{1, 3\}, \{2, 3\}, \{1, 2, 3\}$.
Proof: For both (i) and (ii), assume µ(E) ≠ 0 or else there is nothing to prove. (i) Let
$E_*^{\{1\}}$ and $E_*^{\{2\}}$ be the cores of $A_1$ and $A_2$ respectively and let $E_*^{\{1,2\}} = E \setminus (E_*^{\{1\}} \cup E_*^{\{2\}})$.
It is easy to check that $E_*^{\{1\}}, E_*^{\{2\}}, E_*^{\{1,2\}}$ is the desired ideal split.
(ii) Let $\hat E_*^{\{1\}}$ and $\hat E_*^{\{2\}}$ be the cores of $A_1 \cup D$ and $A_2 \cup D$ respectively. Note that
$\hat E_*^{\{1\}} \cap \hat E_*^{\{2\}} \subset D$ and therefore $\mu(\hat E_*^{\{1\}} \cap \hat E_*^{\{2\}}) = 0$. Define $E_*^{\{1,3\}} = \hat E_*^{\{1\}} \setminus \hat E_*^{\{2\}}$, $E_*^{\{2,3\}} =
\hat E_*^{\{2\}} \setminus \hat E_*^{\{1\}}$, $E_*^{\{1,2,3\}} = E \setminus (\hat E_*^{\{1\}} \cup \hat E_*^{\{2\}})$, $E_*^J = \emptyset$ for all other J. Note that $\{E_*^J\}$ is the
desired ideal split.
Lemma A1: Every finite partition $\{A_i\}_{i \in N}$ of every $E \in \mathcal{E}_\mu$ has an ideal split. If $\{E_*^J\}$
and $\{\tilde E_*^J\}$ are two ideal splits of $\{A_i\}_{i \in N}$, then $\mu(E_*^J \setminus \tilde E_*^J) = 0$ for all J.
Proof: Assume µ(E) ≠ 0 and |N| > 1 or else there is nothing to prove. We will prove
the result by induction. First, let n = |N| = 2 and note that the result follows from part
(i) of Lemma A0.
Now suppose the result is true for n ≥ 2 and consider a partition $\{A_i\}_{i \in K}$ of E
for some K = {1, . . . , n + 1}. Define $\hat A_i = A_i$ for i < n and $\hat A_n = A_n \cup A_{n+1}$. Let
N = {1, . . . , n}. By the inductive hypothesis, there exists an ideal split $\{\hat E_*^J\}$ of $\{\hat A_i\}_{i \in N}$.
For J such that n ∉ J, define $E_*^J = \hat E_*^J$. For J ∈ {{n}, {n + 1}, {n, n + 1}}, define
$B_1 = \hat E_*^{\{n\}} \cap A_n$, $B_2 = \hat E_*^{\{n\}} \cap A_{n+1}$ and let $\tilde E_*^{\{1\}}, \tilde E_*^{\{2\}}, \tilde E_*^{\{1,2\}}$ be the ideal split of $B_1, B_2$.
Then, set $E_*^{\{n\}} = \tilde E_*^{\{1\}}$, $E_*^{\{n+1\}} = \tilde E_*^{\{2\}}$, and $E_*^{\{n,n+1\}} = \tilde E_*^{\{1,2\}}$.
For J such that |J| ≥ 2 and n ∈ J, let $J^+ = J \cup \{n + 1\}$ and $J^- = J \setminus \{n\}$. Choose
any $\hat D \in \mathcal{D}$ and let
$$D_J = \big((\Omega \setminus \hat E_*^J) \cap \hat D\big) \cup \Big(\hat E_*^J \cap \bigcup_{i \in J^-} \hat A_i\Big)$$
and note that $D_J \in \mathcal{D}$. Then, let $\{\bar E_*^{\{1,3\}}(J), \bar E_*^{\{2,3\}}(J), \bar E_*^{\{1,2,3\}}(J)\}$ be the ideal split of
$B_1 = A_n \cap \hat E_*^J$, $B_2 = A_{n+1} \cap \hat E_*^J$, and $B_3 = D_J \cap \hat E_*^J$ that is guaranteed by Lemma A0(ii).
Let $E_*^J = \bar E_*^{\{1,3\}}(J)$, $E_*^{J^- \cup \{n+1\}} = \bar E_*^{\{2,3\}}(J)$ and $E_*^{J^+} = \bar E_*^{\{1,2,3\}}(J)$.
Verifying that $\{E_*^J\}_{J \subset K,\ J \neq \emptyset}$ is the desired ideal split is tedious but straightforward. That
two ideal splits are identical up to a set of measure 0 follows from the definition of an ideal
split.
Definition:
For $f \in F^o$, let $\{x_1, \ldots, x_n\}$ be the range of $f$, with $x_i > x_{i+1}$. Then, let $A_i(f) = f^{-1}(x_i)$ for $i \in N = \{1, \ldots, n\}$ and let $\{E_*^J(f)\}$ be the ideal split of $\{A_i(f)\}$.
Lemma A2:
For all $f \in F^o$ and all $\omega \in E_*^J(f)$,
$$[f]_1(\omega) = \min\{f(\hat\omega) \mid \hat\omega \in E_*^J(f)\}$$
$$[f]_2(\omega) = \max\{f(\hat\omega) \mid \hat\omega \in E_*^J(f)\}.$$
Proof: Let $\mathbf{f}(\omega) = (\min_{i \in J} x_i, \max_{i \in J} x_i)$ whenever $\omega \in E_*^J$. For $\omega \notin \bigcup_J E_*^J$, let $\mathbf{f}(\omega) = (x_1, x_1)$. That $\mathbf{f} = [f]$ follows immediately from the definition of an ideal split.
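The envelope in Lemma A2 can be illustrated in a finite toy setting. The sketch below assumes, purely for illustration, that the ideal-split events are given as a list of finite "cells" of states; the names `interval_envelope`, `act`, and `cells` are hypothetical and do not come from the paper, whose ideal splits live on a nonatomic probability space.

```python
def interval_envelope(act, cells):
    """Map each state to (min, max) of the act over the cell containing it.

    act   : dict state -> prize (a simple act)
    cells : list of sets of states, standing in for the events E_*^J(f)
    """
    envelope = {}
    for cell in cells:
        values = [act[s] for s in cell]
        lo, hi = min(values), max(values)   # ([f]_1, [f]_2) on this cell
        for s in cell:
            envelope[s] = (lo, hi)
    return envelope


# Hypothetical example: the act is constant on the first and last cell,
# but takes two values on the middle cell.
act = {"w1": 3, "w2": 3, "w3": 1, "w4": 5, "w5": 2}
cells = [{"w1", "w2"}, {"w3", "w4"}, {"w5"}]
env = interval_envelope(act, cells)
```

On a cell where the act is constant the interval is degenerate, which corresponds to the case $|J| = 1$; a nondegenerate interval appears only where the act takes several values on the same cell.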
8.2
Proof of Lemma 1
Lemma A2 proves the result for simple acts. Consider a general act $f \in F$. Let $w = m - l$ and $z_i^n = l + w i 2^{-n}$ for all $i = 0, 1, \ldots, 2^n$. For any $x, y \in X$, let $i(n, x) = \max\{i \mid z_i^n \leq x\}$ and $j(n, y) = \min\{j \mid z_j^n \geq y\}$. The function $i$ is increasing in both arguments while $j$ is decreasing in the first argument and increasing in the second argument. Let $g^n(\omega) = i(n, f(\omega))$ and $h^n(\omega) = j(n, f(\omega))$. Since $g^n$ and $h^n$ are simple functions, $[g^n]$ and $[h^n]$ exist by Lemma A2. Since $h^n(\omega)$ is a decreasing sequence, so is $[h^n]_i(\omega)$ for $i = 1, 2$, which therefore has a limit $[h]_i(\omega)$. Similarly, for $i = 1, 2$, $[g^n]_i(\omega)$ is an increasing sequence and therefore has a limit $[g]_i(\omega)$. Since $[h^n]_i(\omega) - [g^n]_i(\omega) \leq w 2^{-n}$ for $i = 1, 2$, it follows that $([g]_1, [g]_2) = ([h]_1, [h]_2)$ where $[g]_i, [h]_i$ are the pointwise limits of $[g^n]_i, [h^n]_i$ for $i = 1, 2$. We claim that $[f] = ([h]_1, [h]_2)$. To see this, first note that for all $n$
$$\mu(\{\omega \in \Omega \mid [g^n]_1(\omega) \leq f(\omega) \leq [h^n]_2(\omega)\}) = 1$$
since $h^n \geq f \geq g^n$. Then, $([g]_1, [g]_2) = ([h]_1, [h]_2)$ implies
$$\mu(\{\omega \in \Omega \mid [h]_1(\omega) \leq f(\omega) \leq [h]_2(\omega)\}) = 1$$
Next, observe that if $[g]$ satisfies
$$\mu(\{\omega \in \Omega \mid [g]_1(\omega) \leq f(\omega) \leq [g]_2(\omega)\}) = 1$$
then for all $n$
$$\mu(\{\omega \in \Omega \mid [g]_1(\omega) \leq [h^n]_1(\omega),\ [g^n]_2(\omega) \leq [g]_2(\omega)\}) = 1$$
and therefore
$$\mu(\{\omega \in \Omega \mid [g]_1(\omega) \leq [h]_1(\omega) \leq [h]_2(\omega) \leq [g]_2(\omega)\}) = 1$$
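The dyadic approximation used in this proof can be sketched numerically. The Python fragment below is only an illustration under stated assumptions: it reads $g^n$ and $h^n$ as the grid points $z^n_{i(n,x)}$ and $z^n_{j(n,x)}$ that bracket a prize $x$, and the function name `dyadic_bounds` is hypothetical. Exact rational arithmetic is used so that the bounds can be compared without rounding error.

```python
from fractions import Fraction
import math


def dyadic_bounds(x, l, m, n):
    """Return the grid points z_i^n <= x <= z_j^n closest to x.

    The grid is z_i^n = l + w * i * 2**(-n) with w = m - l, i = 0, ..., 2**n.
    """
    x, l, m = Fraction(x), Fraction(l), Fraction(m)
    step = (m - l) / 2**n
    i = math.floor((x - l) / step)   # i(n, x) = max{i | z_i^n <= x}
    j = math.ceil((x - l) / step)    # j(n, x) = min{j | z_j^n >= x}
    return l + i * step, l + j * step


# Hypothetical example: approximate the prize 1/3 on X = [0, 1].
x = Fraction(1, 3)
bounds = [dyadic_bounds(x, 0, 1, n) for n in range(1, 8)]
```

In this example the lower bounds increase in $n$, the upper bounds decrease in $n$, and the gap never exceeds $w 2^{-n}$, which is the inequality the proof relies on.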
Lemma A3:
A countably additive probability is convex-ranged if it is nonatomic.
Proof: Let $\mathcal{E}_+ = \{E \in \mathcal{E} \mid \pi(E) > 0\}$. For $E \in \mathcal{E}_+$, let
$$r(E) = \inf_{E_1, \ldots, E_n} \Bigl\{ \max_i \frac{\pi(E_i)}{\pi(E)} \Bigr\}, \qquad q = \sup_{E \in \mathcal{E}_+} r(E)$$
where the inf is taken over all finite partitions of $E$. First, we note that $q$ is either 1 or 0. To see this, note that if $q \in (0, 1)$, then we can find $\delta \in (0, 1)$ such that $\delta^2 < q < \delta$. By definition, there must exist $E$ such that $r(E) > \delta^2$ and a partition $E_1, \ldots, E_n$ of $E$ such that $\frac{\pi(E_i)}{\pi(E)} < \delta$. Also, there must exist partitions $E_{i1}, \ldots, E_{im}$ for each $i$ such that $\frac{\pi(E_{ij})}{\pi(E_i)} < \delta$. Then, note that $\{E_{ij}\}$ is a partition of $E$ such that $\frac{\pi(E_{ij})}{\pi(E)} < \delta^2$, contradicting the fact that $r(E) > \delta^2$.
Suppose $q = 1$, then there exists $E \in \mathcal{E}_+$ such that $r(E) > 0$ and hence $r(\Omega) \geq r(E)\pi(E) > 0$. Then, we can repeat the argument above to conclude that $r(\Omega) = 1$, contradicting the fact that $\pi$ is nonatomic. Hence, we conclude that $q = 0$.
Note that if $\{E_i\}$ is a partition of $E$ such that $\pi(E_i) < \epsilon$ for all $i$, then there exists $E' = \bigcup_{i=1}^{k} E_i$ such that $\pi(E') \leq r < \pi(E') + \epsilon$. Since $q = 0$, we can construct such a partition for any $E$ and $\epsilon > 0$. Hence, we can construct a sequence of disjoint sets $\{E'_k\}$ such that $E'_k \subset E$ and $\pi(\bigcup_{k=1}^{m} E'_k) > \pi(E) - \epsilon_m$. Since $\pi$ is countably additive, we have $\pi(\bigcup_{k=1}^{\infty} E'_k) = r$ and $\bigcup_{k=1}^{\infty} E'_k \subset E$ as desired.
A set $D$ is diffuse if $\mu^*(D) = \mu^*(D^c) = 1$. Let $\mathcal{D}$ be the set of all diffuse sets.
Lemma A4:
Assume the continuum hypothesis holds and µ is a prior. Then, for any
natural number n, there exists a partition (D1 , . . . , Dn ) of Ω such that Di ∈ D for i =
1, . . . , n.
Proof: Birkhoff (1967) page 266, Theorem 13 proves the following: under the continuum hypothesis, no nontrivial (i.e., not identically equal to 0) measure such that every singleton has measure 0 can be defined on the algebra of all subsets of the continuum. We will use Birkhoff’s result to establish that $\Omega$ must have a nonmeasurable subset. That is, there exists $A \subset \Omega$ such that $A \notin \mathcal{E}_\mu$.
Since $\mu$ is a prior, by Lemma A3 above, it is convex-ranged. Hence, we can construct a random variable, $\psi$, that has a uniform distribution on the interval $[0, 1]$ on this probability space. Define $\hat\mu(R) = \mu(\psi^{-1}(R))$ for every $R \subset [0, 1]$. If $\mathcal{E}_\mu$ contains every subset of $\Omega$, $\hat\mu$ defines a measure on the set of all subsets of the unit interval. Moreover, since $\psi$ has a uniform distribution, $\hat\mu(\{x\}) = 0$ for all $x \in [0, 1]$, contradicting Birkhoff’s result.
Let $E_*^{\{1\}}(A), E_*^{\{2\}}(A), E_*^{\{1,2\}}(A)$ be an ideal split of $\{A_1, A_2\}$ for $A_1 = A$ and $A_2 = A^c$ and let $\alpha = \sup_{A \subset \Omega} \mu(E_*^{\{1,2\}}(A))$. By Lemma A1, $\alpha$ is well defined. Since $\mu$ is a prior, by the argument above, there exists $A \subset \Omega$ such that $A \notin \mathcal{E}_\mu$. Hence, $\alpha > 0$.
To establish that $\alpha$ is attained, consider a sequence of sets $A(n)$ such that
$$\mu(E_*^{\{1,2\}}(A(n))) > \alpha - \frac{1}{n}$$
for all $n = 1, 2, \ldots$. Let $E(n) = \bigcup_{j \leq n} E_*^{\{1,2\}}(A(j))$ and set $E(0) = \emptyset$. Define $B(n) = [E(n) \cap A(n)] \setminus E(n-1)$ and let $A = \bigcup_n B(n)$. Note that $\bigcup_n E_*^{\{1,2\}}(B(n)) \subset E_*^{\{1,2\}}(A)$ and therefore $\mu(E_*^{\{1,2\}}(A)) = \alpha$ as desired.
But then, if $\alpha < 1$, choose $A$ such that $\mu(E_*^{\{1,2\}}(A)) = \alpha$ and define $A(1) = A \cap E_*^{\{1,2\}}(A)$. Let $B$ be any nonmeasurable subset of $\Omega \setminus E_*^{\{1,2\}}(A)$ and note that $\mu(E_*^{\{1,2\}}(A \cup B)) > \mu(E_*^{\{1,2\}}(A)) = \alpha$, a contradiction. Hence, there exists $A$ such that $\mu(E_*^{\{1,2\}}(A)) = 1$. Clearly, such an $A$ is diffuse.
Next, we will show that any diffuse set can be partitioned into two diffuse sets. Then, a simple inductive argument concludes the proof. Let $D$ be any diffuse set and define $\Sigma_1 = \{E \cap D \mid E \in \mathcal{E}_\mu\}$ and $\mu_1(E \cap D) = \mu(E)$ for all $E \in \mathcal{E}_\mu$. Note that since $D$ is diffuse, $E \cap D = E' \cap D$ implies that $E, E'$ differ by a set of measure 0. Hence, $\mu_1$ is well-defined. It is easy to check that $\mu_1$ is a countably additive probability measure on $\Sigma_1$ and $\mu_1(\{s\}) = 0$ for $s \in D$. Therefore, $D$ cannot be countable. Then, by the Continuum Hypothesis, $D$ must have the cardinality of the continuum. Repeating the argument yields a diffuse subset $D_1$ of $D$. Then, for any $E$ such that $\mu(E) > 0$, we have $\mu_1(E \cap D) > 0$ and therefore $E \cap D_1 \neq \emptyset$. A symmetric argument yields $E \cap (D \setminus D_1) \neq \emptyset$. Hence, $D_1, D \setminus D_1$ are diffuse in $\Omega$.
8.3
Proof of Lemma 2
Since µ is a prior, by Lemma A4, there exists a diffuse set D. Let f ∈ F and f = f1 Df2 .
We claim that [f ] = f. Note that f1 (ω) ≤ f (ω) ≤ f2 (ω) for all ω. For any real-valued
function g on Ω, if there exists E ∈ Eµ such that µ(E) > 0 and g(ω) > f1 (ω) for all ω ∈ E,
then, since D is diffuse, we have g(ω) > f1 (ω) = f (ω) for some ω ∈ D ∩ E. Therefore,
g ∈ F and g1 (ω) ≤ f (ω) for all ω implies µ({ω | g1 (E) ≤ f1 (E)}) = 1. A symmetric
argument yields g ∈ F and g2 (ω) ≥ f (ω) for all ω implies µ({ω | g1 (E) ≤ f2 (E)}) = 1.
9.
Appendix B: Proof of Theorem 1
The proof is divided into a series of Lemmas. It is understood that Axioms 1-6 hold
throughout.
Definition:
A set $E$ is left (right) ideal if $fEh \succeq gEh$ implies $fEh' \succeq gEh'$ ($hEf \succeq hEg$ implies $h'Ef \succeq h'Eg$). Let $\mathcal{E}^l$ and $\mathcal{E}^r$ be the collections of left and right ideal sets respectively.
Lemma B0:
$\mathcal{E}^l \cap \mathcal{E}^r = \mathcal{E}$.
Proof: That $\mathcal{E}^l \cap \mathcal{E}^r \subset \mathcal{E}$ is obvious. Suppose $E \in \mathcal{E}$ and assume $fEh \succeq gEh$. Let $f^* = fEh$ and $g^* = gEh$. Then, $f^*Eh = fEh \succeq gEh = g^*Eh$ and hence $f^*Eh \succeq g^*Eh$. Also, $hEf^* = h = hEg^*$ and hence $hEf^* \succeq hEg^*$ and therefore $f^*Eh' \succeq g^*Eh'$ and $h'Ef^* \succeq h'Eg^*$ since $E \in \mathcal{E}$. That is, $fEh' = f^*Eh' \succeq g^*Eh' = gEh'$ and hence $E \in \mathcal{E}^l$. A symmetric argument establishes that $E \in \mathcal{E}^r$ and therefore $\mathcal{E} = \mathcal{E}^l \cap \mathcal{E}^r$.
Lemma B1:
(i) $f(s) \geq g(s)$ for all $s \in \Omega$ implies $f \succeq g$. (ii) $f \succ g$ implies $f \succ z \succ g$ for some $z \in X$. (iii) $f_n, g_n \in F$, $f_n$ converges uniformly to $f$, $g_n$ converges uniformly to $g$, $f \succ g$ implies $f_n \succ g_n$ for some $n$. (iv) $f_n, g_n \in F^e$, $f_n$ converges pointwise to $f$, $g_n$ converges pointwise to $g$, $f \succ g$ implies $f_n \succ g_n$ for some $n$. (v) If $E \in \mathcal{E}_+$ and $y > x$, then $yEh \succ xEh$ for all $h \in F^e$. (vi) If $E \in \mathcal{E}_+$ and $f \in F$, then there exists a unique $c_E(f) \in X$ such that $c_E(f)Ef \sim f$.
Proof: To prove (i), let $f_n = \frac{1}{n} m + \bigl(\frac{n-1}{n}\bigr) f$ and $g_n = \frac{1}{n} l + \bigl(\frac{n-1}{n}\bigr) g$. Then, $f_n$ converges to $f$ uniformly and $g_n$ converges to $g$ uniformly. By Axiom 2, $f_n \succ g_n$. Then, by Axiom 6, $f \succeq g_n$ and applying Axiom 6 again yields $f \succeq g$ as desired.
To prove (ii), assume $f \succ g$ and let $y = \inf\{z \in X \mid z \succeq f\}$ and let $x = \sup\{z \in X \mid g \succeq z\}$. By (i) above, $x$ and $y$ are well-defined. Axiom 6 ensures that $y \sim f$ and $x \sim g$ and therefore $y \succ x$. Then, for $z = \frac{x+y}{2}$, we have $f \succ z \succ g$.
To prove (iii), let $f \succ g$ and apply (ii) three times to get $z, y, x$ such that $f \succ z \succ y \succ x \succ g$. Axiom 6 ensures that $f_n \succ y$ and $y \succeq g_n$ for all $n$ large enough. Therefore, $f_n \succ g_n$ for all such $n$. An analogous argument proves (iv).
To prove (v), consider $E \in \mathcal{E}_+$, $h \in F^e$ and $x < y$. Then, there exist $f, g, h'$ such that $fEh' \succ gEh'$ which implies that $mEh' \succ lEh'$ by part (i) above. Hence, $m \succ lEm$ which implies $y \succ xEy$ by Axiom 4, which then implies $yEh \succ xEh$ as desired.
Finally, let $z = \inf\{x \in X \mid xEf \succeq f\}$. By part (i), $mEf \succeq f$ and hence $z$ is well-defined. Axiom 6 and part (v) ensure that $zEf \sim f$ and also that $y \neq z$ implies $yEf \not\sim f$. Hence, $z = c_E(f)$.
Lemma B2:
The collection E is a σ−field.
Proof: First, we will show that $\mathcal{E}$ is a field. That $E \in \mathcal{E}$ implies $E^c \in \mathcal{E}$ is obvious as is the fact that $\emptyset \in \mathcal{E}$. Hence, to show that $\mathcal{E}$ is a field, we need to verify that $E, E' \in \mathcal{E}$ implies $E \cap E' \in \mathcal{E}$.
Suppose $E, \hat E \in \mathcal{E}$. Then, by Lemma B0, $E, \hat E \in \mathcal{E}^l \cap \mathcal{E}^r$. We will first show that $E \cap \hat E \in \mathcal{E}^l$. Suppose $f(E \cap \hat E)h \succeq g(E \cap \hat E)h$. Note that $f(E \cap \hat E)h = (fEh)\hat E h$. Since $\hat E \in \mathcal{E}$, we have $(fEh)\hat E h' \succeq (gEh)\hat E h'$. Next, observe that $(fEh)\hat E h' = (f\hat E h')E(h\hat E h')$. Since $E \in \mathcal{E}$, we have $f(E \cap \hat E)h' = (f\hat E h')E(h'\hat E h') \succeq (g\hat E h')E(h'\hat E h') = g(E \cap \hat E)h'$ and hence $E \cap \hat E \in \mathcal{E}^l$ as required.
To conclude the proof, we will show that $E \cap \hat E \in \mathcal{E}^r$. But $E \cap \hat E \in \mathcal{E}^r$ if and only if $(E \cap \hat E)^c \in \mathcal{E}^l$; that is, $E^c \cup \hat E^c \in \mathcal{E}^l$. Since we know that $E^c \in \mathcal{E}$ if and only if $E \in \mathcal{E}$, to conclude the proof it is enough to show that $E \cup \hat E \in \mathcal{E}^l$ whenever $E, \hat E \in \mathcal{E}$. Moreover, since $E \cup \hat E = (E \cap \hat E^c) \cup \hat E$ and since we have already shown that $\hat E \in \mathcal{E}$ implies $\hat E^c \in \mathcal{E}$ and $E, \hat E^c \in \mathcal{E}$ implies $E \cap \hat E^c \in \mathcal{E}^l$, to complete the proof that $E, \hat E \in \mathcal{E}$ implies $E \cup \hat E \in \mathcal{E}^l$, it is enough to consider disjoint $E, \hat E$.
Let $E, \hat E \in \mathcal{E}$, $E \cap \hat E = \emptyset$ and let $\tilde E = E \cup \hat E$. We will show that $\tilde E \in \mathcal{E}^l$. We assume that $E, \hat E \in \mathcal{E}_+$ since otherwise $\tilde E \in \mathcal{E}$ is immediate. Throughout the remainder of the proof of this lemma, we let $(h_1, h_2, h_3)$ denote the act $(h_1 E h_2)\tilde E h_3$ for all $h_1, h_2, h_3 \in F$.
Fact 1: Suppose (i) $(h_1, h_2, h_3) \sim (h_1', h_2', h_3)$ and (ii) $(h_1', h_2', h_3') \sim (h_1, h_2', h_3)$, then $(h_1', h_2', h_3') \sim (h_1, h_2, h_3')$.
Observe that since $\hat E \in \mathcal{E}$, (ii) implies (iii) $(h_1', h_2, h_3') \sim (h_1, h_2, h_3)$. Then, (i) and (iii) yield (iv) $(h_1', h_2, h_3') \sim (h_1', h_2', h_3)$. Since $E \in \mathcal{E}$, (iv) implies (v) $(h_1, h_2, h_3') \sim (h_1, h_2', h_3)$. Then, (ii) and (v) yield $(h_1, h_2, h_3') \sim (h_1', h_2', h_3')$ as desired.
Next, we will prove that if $\tilde E \notin \mathcal{E}$, then there exist $h_i, h_i'$ for $i = 1, 2, 3$ such that (i) and (ii) of fact 1 hold and $(h_1, h_2, h_3') \not\sim (h_1', h_2', h_3')$. This contradiction will establish the desired result. Suppose there exist $f = (f_1, f_2, h)$, $g = (g_1, g_2, h)$, $f' = (f_1, f_2, h')$, $g' = (g_1, g_2, h')$ such that $f \succeq g$ and $g' \succ f'$. By Lemma B1(vi), we can assume, without loss of generality, that $f_1 = x_1$, $f_2 = x_2$, $g_1 = y_1$, $g_2 = y_2$ for some $x_i, y_i \in X$ for $i = 1, 2$. We can also assume, by renaming $E, \hat E$ if necessary, that $x_2 \leq y_2$. Then, it follows from Lemma B1(i) and (v) that $x_2 < y_2$ and $x_1 > y_1$.
We endow $F$ with the topology of uniform convergence and for any $h \in F$, let $c(h) = c_\Omega(h)$ where $c_\Omega$ is the function defined in Lemma B1(vi).
Fact 2: $c_E : F \to X$ is a continuous function for every $E \in \mathcal{E}$.
Fact 2 follows immediately from Lemma B1(vi) and Axiom 6(ii).
Fact 3: Let $f_\alpha = (x_1, x_2, \alpha h' + (1-\alpha)h)$ and $g_\alpha = (y_1, y_2, \alpha h' + (1-\alpha)h)$. Then, there exists $\bar\alpha \in [0, 1)$ such that $c(f_{\bar\alpha}) = c(g_{\bar\alpha})$ and $c(g_\alpha) > c(f_\alpha)$ for all $\alpha \in (\bar\alpha, 1]$.
Fact 3 follows from the continuity of $\alpha \to f_\alpha$ and $\alpha \to g_\alpha$.
Without loss of generality, assume $\bar\alpha = 0$ (otherwise, we can rename $\bar\alpha h' + (1-\bar\alpha)h$ and call it $h'$). For any $t \in [0, 1]$, choose $\phi(t)$ such that
$$c(x_1 + t \cdot (y_1 - x_1),\ x_2 + \phi(t)(y_2 - x_2),\ h) = c(f) \qquad \text{(B1)}$$
To see that $\phi(t)$ exists and is unique for every $t$, note that
$$c(x_1 + t \cdot (y_1 - x_1),\ y_2,\ h) \geq c(f) \geq c(x_1 + t \cdot (y_1 - x_1),\ x_2,\ h).$$
Hence, the existence of a $\phi(t)$ satisfying (B1) follows from the continuity of $c$ (fact 2). The uniqueness of this $\phi(t)$ follows from Lemma B1(v). Henceforth, let $x_1(t) = x_1 + t \cdot (y_1 - x_1)$ and $x_2(t) = x_2 + \phi(t) \cdot (y_2 - x_2)$. Let
$$f_\alpha^t = (x_1 + t \cdot (y_1 - x_1),\ x_2 + \phi(t)(y_2 - x_2),\ \alpha h' + (1-\alpha)h)$$
and set $\psi(\alpha) = c(f_\alpha^1)$.
Fact 4: $\psi : [0, 1] \to X$ is continuous.
The map $\alpha \to f_\alpha^1$ is continuous. Then, fact 2 yields fact 4.
Note that $f_0^0 = f$, $f_0^1 = g$, $f_1^0 = f'$, $f_1^1 = g'$ and $c(f_0^t) = c(f)$ for all $t \in [0, 1]$. First, we observe that $c(f_\alpha^1) \neq c(f_0^1)$ for all $\alpha \neq 0$. This follows since $c(f_\alpha^1) = c(f_0^1)$ and $E \in \mathcal{E}$ imply $c(x_1, y_2, \alpha h' + (1-\alpha)h) = c(x_1, y_2, h)$. Then, $\hat E \in \mathcal{E}$ implies $c(x_1, x_2, \alpha h' + (1-\alpha)h) = c(x_1, x_2, h)$ and hence $c(x_1, x_2, \alpha h' + (1-\alpha)h) = c(y_1, y_2, \alpha h' + (1-\alpha)h)$ which contradicts fact 3 whenever $\alpha \neq 0$.
Hence, $c(f_\alpha^1) \neq c(f_0^1)$ for all $\alpha \neq 0$ implies either (i) $c(f_\alpha^1) > c(f_0^1)$ whenever $\alpha \neq 0$ or (ii) $c(f_\alpha^1) < c(f_0^1)$ whenever $\alpha \neq 0$. First, consider case (i). Let $\hat f = (x_1, y_2, h)$ and note that by Lemma B1(v), $c(\hat f) > c(f_0^1)$. Then, choose $\alpha > 0$ small enough so that
$$\min\{c(\hat f), c(f_1^1)\} > c(f_\alpha^1) > c(f_0^1) \qquad \text{(B2)}$$
By fact 4, such $\alpha$ exist. Then, fact 2 ensures that $c(t f_0^1 + (1-t)\hat f) = c(f_\alpha^1)$ for some $t \in [0, 1]$.
To complete the proof of case (i), let $h_1 = x_1(t)$, $h_2 = x_2(t)$, $h_3 = h$ and $h_1' = y_1 = x_1(1)$, $h_2' = y_2 = x_2(1)$ and $h_3' = \alpha h' + (1-\alpha)h$. Then, $(h_1, h_2, h_3) = f_0^t$, $(h_1', h_2', h_3) = f_0^1$, $(h_1, h_2', h_3) = t f_0^1 + (1-t)\hat f$ and $f_\alpha^1 = (h_1', h_2', h_3')$. Hence, we have $c(f_0^t) = c(f_0^1)$; that is, $(h_1, h_2, h_3) \sim (h_1', h_2', h_3)$. Also, $c(t f_0^1 + (1-t)\hat f) = c(f_\alpha^1)$; that is, $(h_1, h_2', h_3) \sim (h_1', h_2', h_3')$. Then, fact 1 yields $(h_1, h_2, h_3') \sim (h_1', h_2', h_3')$; that is, $c(f_\alpha^t) = c(f_1^1)$ which contradicts equation (B2).
For case (ii); that is, if $c(f_\alpha^1) < c(f_0^1)$ whenever $\alpha \neq 0$, rename $E$ and $\hat E$ so that $x_1 < y_1$ and $x_2 > y_2$ and let $\hat f = (x_1, y_2, h)$ as before. Then, note that by Lemma B1(v), $c(\hat f) < c(f_0^1)$. Then, choose $\alpha > 0$ small enough so that
$$\max\{c(\hat f), c(f_1^1)\} < c(f_\alpha^1) < c(f_0^1).$$
By fact 4, such $\alpha$ exist. Then, repeat the arguments of case (i) to obtain the desired conclusion.
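To prove that the field $\mathcal{E}$ is a $\sigma$-field, it is enough to show that if $E_i \in \mathcal{E}$ and $E_i \subset E_{i+1}$, then $\bigcup E_i \in \mathcal{E}$. Let $E_i \subset E_{i+1}$ for all $i$. Note that $\hat f E_i \hat g$ converges pointwise to $\hat f (\bigcup E_i) \hat g$ for all $\hat f, \hat g \in F$. Hence, if $g(\bigcup E_i)h' \succ f(\bigcup E_i)h'$ or $h'(\bigcup E_i)g \succ h'(\bigcup E_i)f$ for some $f, g, h, h' \in F^e$, by Lemma B1(iv), we have $gE_nh' \succ fE_nh'$ or $h'E_ng \succ h'E_nf$ for some $n$, proving that $E_i \in \mathcal{E}$ for all $i$ implies $\bigcup_i E_i \in \mathcal{E}$.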
Lemma B3:
There exists a finitely additive, convex-ranged probability measure $\mu$ on $\mathcal{E}$ and a function $v : X \to \mathbb{R}$ such that (i) the function $V : F^e \to \mathbb{R}$ defined by
$$V(f) = \sum_{x \in X} v(x)\mu(f^{-1}(x))$$
represents the restriction of $\succeq$ to $F^e$ and (ii) for any $f, g \in F$, $\mu(\{\omega \mid f(\omega) = g(\omega)\}) = 1$ implies $f \sim g$.
Proof: Note that once we restrict attention to $F^e$, Axiom 1 is Savage’s P1, Axiom 2 is P2 and Lemma B1(v) is P3. By definition P4 is satisfied, Axiom 4 is P5, Axiom 5 is P6, and finally, Axiom 6(ii) yields P7. Then, applying the proof of Savage’s Theorem to all acts in $F^e$ yields the desired conclusion. This is true despite the fact that Savage’s theorem assumes that the underlying $\sigma$-field is the set of all subsets of $\Omega$; the arguments work for any $\sigma$-field. This proves (i). To prove (ii), note that by hypothesis, there exists $E \in \mathcal{E}$ such that $\mu(E) = 1$ and $g = fEg$. But $m \sim mEl$ by part (i) and since $E \in \mathcal{E}$, we have $fEm \sim fEl$. Then, Lemma B1(i) yields $f = fEf \sim fEg$.
Lemma B4:
The probability measure $\mu$ on $\mathcal{E}$ is a prior.
Proof: To show that $\mu$ is countably additive, we need to prove that given any sequence $E_i$ such that $E_{i+1} \subset E_i$ for all $i$ and $E^* := \bigcap_i E_i = \emptyset$, $\lim \mu(E_i) = 0$. Suppose $\lim \mu(E_i) > 0$. Then, convex-rangedness ensures the existence of $E$ such that $\lim \mu(E_i) > \mu(E) > 0$. Hence, $\mu(E_i) > \mu(E)$ for all $i$; that is, $mE_il \succ mEl$ for all $i$. But $mE_il \in F^e$ and converges pointwise to $mE^*l$. Hence, $mE^*l \succeq mEl \succ l$. Therefore, $\mu(E^*) > 0$, contradicting $E^* = \emptyset$.
To prove that $\mu$ is complete, we will assume $\mu(E) = 0$ and $A \subset E$, then show that this implies $fAh \sim h$ for all $f \in F$. This means that $A \in \mathcal{E}$. By Lemma B1, $fAh \succeq lEh$. Since $\mu(E) = 0$, Lemma B3 implies $lEm \sim m$ and since $E \in \mathcal{E}$, we conclude that $lEh \sim mEh$. But $mEh \succeq h$ by Lemma B1, so we have $fAh \succeq h$. A symmetric argument yields $h \succeq fAh$ and hence $fAh \sim h$ as desired.
Since $\mu$ is convex-ranged it is obviously nonatomic. Hence, $\mu$ is a prior.
Lemma B5:
The function v is strictly increasing and continuous.
Proof: That $v$ is strictly increasing follows from $y \succ x$ whenever $y > x$. To prove continuity, assume, without loss of generality, that $v(m) = 1$ and $v(l) = 0$ and suppose $r = \lim v(x_n) < v(x)$ for some sequence $x_n$ in $X$. Choose $E_r \in \mathcal{E}$ such that $\mu(E_r) = r$. Then, note that $x \succ hE_rl \succeq x_n$ for $n$ large. Therefore, $x \succ hE_rl \succeq \lim x_n = x$, a contradiction. Hence, $r \geq v(x)$. A symmetric argument proves $v(x) \geq r$ and yields the continuity of $v$.
Lemma B6:
For any diffuse act yDx, there exists a unique z ∈ X such that yDx ∼ z.
Proof: Let $z = \sup\{w \in X \mid yDx \succeq w\}$. Since $yDx \succeq l$ by Axiom 2, $z$ is well-defined. Then, we can construct two sequences $y_n \geq z$ and $z \geq x_n$ such that both sequences converge to $z$ and $y_n \succeq yDx$, $yDx \succeq x_n$. Hence, by Axiom 6, $z \succeq yDx \succeq z$ as desired.
Lemma B7:
Let $D_1, \ldots, D_n \in \mathcal{D}$ be a partition of $\Omega$ and $y_{i+1} \geq y_i$ for $i = 1, \ldots, n-1$ and define $f : \Omega \to X$ as follows: $f(s) = y_i$ whenever $s \in D_i$. Then, $f \sim y_n D y_1$ for all $D \in \mathcal{D}$.
Proof: By monotonicity, $y_n [D_2 \cup \ldots \cup D_n] y_1 \succeq f \succeq y_n D_n y_1$. By Axiom 5, $y_n [D_2 \cup \ldots \cup D_n] y_1 \sim y_n D_n y_1 \sim y_n D y_1$.
Definition:
Let $u : I \to \mathbb{R}$ be defined as follows: $u(x, y) = v(z)$ for $z$ such that $yDx \sim z$.
It follows from Lemma B6 that $u$ is well defined.
Lemma B8:
The function $u$ is increasing and continuous.
Proof: Suppose $yDx \sim z$ and $\hat yD\hat x \sim \hat z$. If $\hat y > y$ and $\hat x > x$, then Axiom 2 implies $\hat z \succ z$ and applying Axiom 2 again yields $\hat z > z$ as desired. If $\hat y \geq y$ and $\hat x \geq x$, then by Lemma B1(i), $\hat z \succeq z$. Then, applying Axiom 2 again yields $\hat z \geq z$.
To prove continuity, assume $y_iDx_i \sim z_i$ for $i = 1, \ldots$ and $\lim(x_i, y_i) = (x, y)$. Since $X$ is compact, we can assume without loss of generality that $z_i$ converges to some $z$. Suppose $yDx \succ z$ and note that since $y_iDx_i$ converges uniformly to $yDx$ and the act $z_i$ converges uniformly to $z$, we have by Lemma B1(iii), $y_iDx_i \succ z_i$ for some $i$, a contradiction. A symmetric argument yields $yDx \sim z$ and establishes continuity.
Define
$$W(f) = \int u[f]\,d\mu$$
Lemma B9:
The function $W$ represents the restriction of $\succeq$ to $F^o$.
Proof: Let $N^+(f) = \{J \mid \mu(E_*^J(f)) > 0 \text{ and } |J| > 1\}$ and $F^n = \{f \in F^o \mid n = |N^+(f)|\}$. The proof is by induction on $n$. Note that for $f \in F^0$
$$W(f) = \sum_{x \in X} v(x)\mu(f^{-1}(x)) = v(\bar x)$$
for $\bar x$ such that $\bar x \sim f$. Hence, the restriction of $W$ to $F^0$ represents $\succeq$. Suppose $W$ represents the restriction of $\succeq$ to $F^n$ and choose $f \in F^{n+1}$. Define $h_f$ as follows: if $f \in F^n$, then $h_f = f$.
Otherwise, choose $E_*^J(f)$ such that $|J| > 1$ and $\mu(E_*^J(f)) > 0$. Let $y = \max f(E_*^J(f))$ and $x = \min f(E_*^J(f))$. Hence, $y > x$. Also choose $D \in \mathcal{D}$ and define $f^*$ as follows: $f^*(\omega) = f(\omega)$ for all $\omega \notin E_*^J(f)$; $f^*(\omega) = y$ for all $\omega \in D \cap E_*^J(f)$ and finally, $f^*(\omega) = x$ for all $\omega \in D^c \cap E_*^J(f)$.
By Lemma B6 and Axiom 3, $f^* \sim f$. Next, choose $z$ such that $u(x, y) = v(z)$ and let $h_f(\omega) = f^*(\omega)$ for all $\omega \notin E_*^J(f)$ and $h_f(\omega) = z$ for all $\omega \in E_*^J(f)$. Again, Axiom 3 ensures that $h_f \sim f^* \sim f$. Note that $h_f \in F^n$ and by construction, $W(h_f) = W(f^*)$. Note also that $[f^*] = [f]$ and therefore $W(f^*) = W(f)$. Thus, $W(f) = W(h_f)$ for some $h_f \in F^n$ such that $h_f \sim f$. Then, the induction hypothesis implies that $W$ represents $\succeq$ on $F^{n+1}$.
Lemma B10:
The function W represents º.
Proof: Note that for all $f$, there exists $x_f$ such that $W(x_f) = u(x_f, x_f) = W(f)$. This follows from the fact that $u$ is continuous and $u(m, m) \geq W(f) \geq u(l, l)$. Hence, by the intermediate value theorem $u(x_f, x_f) = W(f)$ for some $x_f \in [l, m]$. The monotonicity of $u$ ensures that this $x_f$ is unique. Next, we show that $f \sim x_f$.
Without loss of generality, assume $l = 0$ (if not, let $l^* = 0$ and $m^* = m - l$, identify each $f$ with $f^* = f - l$ and apply all previous results to the acts $F^* = \{f - l \mid f \in F\}$). Define for any $x \geq 0$ and $\epsilon > 0$, $z^*(x, \epsilon) = \min\{n\epsilon \mid n = 0, 1, \ldots \text{ such that } n\epsilon \geq x\}$. Similarly, let $z_*(x, \epsilon) = \max\{n\epsilon \mid n = 0, 1, \ldots \text{ such that } n\epsilon \leq x\}$. Clearly,
$$0 \leq z^*(x, \epsilon) - x \leq z^*(x, \epsilon) - z_*(x, \epsilon) < \epsilon \qquad \text{(B3)}$$
and the first two inequalities above are equalities if and only if $x$ is a multiple of $\epsilon$.
Set $f^n(\omega) = z^*(f(\omega), m2^{-n})$ and $f_n(\omega) = z_*(f(\omega), m2^{-n})$ for all $n = 0, 1, \ldots$. Equation (B3) above ensures that $f^n \geq f \geq f_n$ and $f^n, f_n$ converge uniformly to $f$. Note also that $f^n, f_n \in F^o$ with $f^n \downarrow f$. This implies that (for a measure 1 subset) $[f^n] \downarrow [f]$ and therefore $\int u[f^n]\,d\mu \to \int u[f]\,d\mu$.
Since $f^n \geq f$, we have $W(f^n) \geq W(f) = W(x_f)$ for all $n$. Since $W$ represents the restriction of $\succeq$ to $F^o$, we conclude that $f^n \succeq x_f$ for all $n$. Then, Axiom 6 implies $f \succeq x_f$. A symmetric argument with $f_n$ replacing $f^n$ yields $x_f \succeq f$ and therefore $x_f \sim f$ as desired.
To conclude the proof of the Lemma, suppose $f \succeq g$; then $W(x_f) = W(f)$ and $W(x_g) = W(g)$ and $x_f \sim f \succeq g \sim x_g$. Since $W$ represents the restriction of $\succeq$ to $F^o$, we conclude that $W(x_f) \geq W(x_g)$ and hence $W(f) \geq W(g)$. Similarly, if $W(f) \geq W(g)$ we conclude $f \sim x_f \succeq x_g \sim g$ and therefore $f \succeq g$.
Lemma B10 establishes sufficiency. To prove that the preference $\succeq_{u\mu}$ satisfies Axioms 1-6 for every prior $\mu$ and interval utility $u$, note that by Lemma A3, $\mu$ is convex-ranged. Then, verifying Axioms 1, 2, 4 and 5 involves nothing more than repeating familiar arguments from Savage’s theorem. Axiom 3 and Axiom 4 follow immediately from the representation. Note that for any $f^n \in F^e$, if $[f^n]$ converges pointwise to $[g]$, then $W(f^n)$ converges to $W(g)$. Hence, Axiom 6(i) follows from the fact that $f^n$ converges to $f$ pointwise implies $[f^n]$ converges pointwise to $[f]$ while Axiom 6(ii) follows from the fact that for any $f^n$, if $f^n$ converges to $f$ uniformly, then $[f^n]$ converges pointwise to $[f]$.
The uniqueness of the representation follows from standard arguments and is therefore
omitted.
10.
Appendix C: Proofs of Propositions 1-7
Proof of Proposition 1: Suppose $\phi \succeq \phi'$ if and only if $\phi \circ \rho \succeq_{u\mu} \phi' \circ \rho$ for some partition $\rho : \Omega \to \{1, \ldots, n\}$, prior $\mu$ and interval utility $u$.
Assume $\phi$ is one-to-one with (distinct) values $x_1, \ldots, x_n \in X$. Let $f = \phi \circ \rho$. For any $a \subset S$ such that $a \neq \emptyset$, let $\pi(a) = \mu(E_*^a(f))$. We have shown in the proof of Theorem 1 that
$$W(f) = \sum_a u\Bigl(\min_{x \in E_*^a(f)} f(x), \max_{y \in E_*^a(f)} f(y)\Bigr)\mu(E_*^a(f)) = \sum_a u[\phi](a)\pi(a).$$
Note that any $\phi$ is the uniform limit of $\phi_n$’s that are one-to-one and $\lim W(\phi_n \circ \rho) = W(\phi \circ \rho)$. Thus, $\lim \sum_a u([\phi_n](a))\pi(a) = \sum_a u[\phi](a)\pi(a)$ proving that
$$U(\phi) = \sum_a u[\phi](a)\pi(a)$$
as desired.
For the converse, let $\mu$, $\pi$ and $u$ be given. Choose an ideal-event partition $\{E_a\}_{a \in \mathcal{S}}$ of $\Omega$ such that $\mu(E_a) = \pi(a)$, where $\mathcal{S}$ is the set of all nonempty subsets of $S$. For each $a$, let $\{D_i^a\}_{i \in a}$ be a partition of $\Omega$ into diffuse sets. Define $\rho$ such that $\rho(\omega) = i$ for $\omega \in E_a \cap D_i^a$. Consider acts $\phi \in \Phi$ that are one-to-one and let $f = \phi \circ \rho$. Then, $E_*^a(f) = E_a$ and therefore $W(\phi \circ \rho) = \sum_a u[\phi](a)\pi(a)$. The extension to all acts $\phi \in \Phi$ is as above.
Proof of Proposition 2: If $v_{\hat u} \circ v_u^{-1}$ is not concave, then familiar arguments ensure the existence of $x < z < y$ and $p \in (0, 1)$ such that $p v_{\hat u}(y) + (1-p) v_{\hat u}(x) > v_{\hat u}(z)$ and $p v_u(y) + (1-p) v_u(x) < v_u(z)$. Then, let $S = \{1, 2\}$, $\pi(\{1\}) = p$, $\pi(\{2\}) = 1 - p$ and $\pi(\{1, 2\}) = 0$, set $U_1 = (\pi, \hat u)$ and $U_0 = (\pi, u)$. Then,
$$p v_{\hat u}(y) + (1-p) v_{\hat u}(x) = U_1(y\{1\}x) > U_1(z) = v_{\hat u}(z)$$
$$p v_u(y) + (1-p) v_u(x) = U_0(y\{1\}x) < U_0(z) = v_u(z)$$
proving that $y\{1\}x \succ_{\hat u\pi} z$ and $z \succ_{u\pi} y\{1\}x$ and hence $\succeq_{\hat u\pi}$ is not more uncertainty averse than $\succeq_{u\pi}$.
If $\sigma_{\hat u}^{xy} < \sigma_u^{xy}$, then choose $\sigma$ strictly between these two numbers, let $S = \{1, 2\}$, $\pi(\{1, 2\}) = 1$ and $\pi(\{1\}) = \pi(\{2\}) = 0$. Again, set $U_1 = (\pi, \hat u)$ and $U_0 = (\pi, u)$ and note that $U_1(y\{1\}x) > U_1(\sigma x + (1-\sigma)y)$ and $U_0(y\{1\}x) < U_0(\sigma x + (1-\sigma)y)$, again proving that $\hat u$ is not more uncertainty averse than $u$. Hence, these properties are necessary for $\hat u$ to be more uncertainty averse than $u$.
To prove sufficiency, let $U_0 = (\pi, u)$ and $U_1 = (\pi, \hat u)$ and assume $\sigma_{\hat u}^{xy} \geq \sigma_u^{xy}$ for all $x, y$ and $v_{\hat u} \circ v_u^{-1}$ is concave. Note that for any $u^*$ and $U = (\pi, u^*)$,
$$U(y\,\hat a\,x) = \pi_*(\hat a) v_{u^*}(y) + (\pi^*(\hat a) - \pi_*(\hat a)) v_{u^*}(\sigma_{u^*}^{xy} x + (1 - \sigma_{u^*}^{xy}) y) + (1 - \pi^*(\hat a)) v_{u^*}(x) \qquad \text{(C1)}$$
Hence, $v_u^{-1}(U_1(y\,\hat a\,x)) > v_{\hat u}^{-1}(U_0(y\,\hat a\,x))$ for all $\pi$, proving that $\hat u$ is more uncertainty averse than $u$.
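Equation (C1) can be checked numerically on a small example. The Python sketch below is an illustration only: it assumes a finite state set, a probability `pi` on its nonempty subsets, and an interval utility `u` with $u(x, y) = v(\sigma x + (1-\sigma)y)$ substituted directly into the middle term; all function and variable names are hypothetical. It compares the three-term formula with a direct evaluation of $\sum_b u[\phi](b)\pi(b)$ for the bet that pays $y$ on $a$ and $x$ otherwise.

```python
from fractions import Fraction as Fr


def lower_prob(pi, a):
    """pi_*(a): total mass of the subsets b contained in a."""
    return sum(p for b, p in pi.items() if b <= a)


def upper_prob(pi, a, states):
    """pi^*(a) = 1 - pi_*(complement of a)."""
    return 1 - lower_prob(pi, states - a)


def bet_value_c1(pi, a, states, x, y, u):
    """Three-term formula (C1), with u(x, y) in place of v(sigma*x + (1-sigma)*y)."""
    lo, hi = lower_prob(pi, a), upper_prob(pi, a, states)
    return lo * u(y, y) + (hi - lo) * u(x, y) + (1 - hi) * u(x, x)


def bet_value_direct(pi, a, x, y, u):
    """Direct evaluation: sum over b of u(min, max of the bet on b) * pi(b)."""
    total = 0
    for b, p in pi.items():
        prizes = [y if s in a else x for s in b]
        total += p * u(min(prizes), max(prizes))
    return total


# Hypothetical example with three states and an assumed interval utility.
states = frozenset({1, 2, 3})
pi = {frozenset({1}): Fr(1, 4), frozenset({2, 3}): Fr(1, 4), states: Fr(1, 2)}
u = lambda x, y: (3 * x + y) / 4          # u(z, z) = z, so v is the identity
a = frozenset({1, 2})
via_c1 = bet_value_c1(pi, a, states, Fr(0), Fr(1), u)
direct = bet_value_direct(pi, a, Fr(0), Fr(1), u)
```

Here $\pi_*(a) = 1/4$ and $\pi^*(a) = 1$, so three quarters of the mass is evaluated at the interval utility $u(0, 1)$ rather than at either sure prize.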
Proof of Proposition 3: Suppose $\hat u$ is more uncertainty averse than $u$, let $U_0 = (\pi, u)$, $U_1 = (\pi, \hat u)$ and assume $U_0(y\,a\,x) > U_0(y\,b\,x)$ and $U_1(y\,a\,x) < U_1(y\,b\,x)$. Since $0 \leq \sigma_u^{xy} \leq 1$, equation (C1) implies that if
$$[\pi_*(a) - \pi_*(b)][\pi^*(a) - \pi^*(b)] \geq 0$$
then either $U(y\,a\,x) \geq U(y\,b\,x)$ for all $U = (\pi, u^*)$ or $U(y\,a\,x) \leq U(y\,b\,x)$ for all $U = (\pi, u^*)$. Therefore,
$$[\pi_*(a) - \pi_*(b)][\pi^*(a) - \pi^*(b)] < 0 \qquad \text{(C2)}$$
Next, without loss of generality, set $v_{\hat u}(y) = v_u(y) = 1$ and $v_{\hat u}(x) = v_u(x) = 0$ and let $\delta_1 = v_{\hat u}(\sigma_{\hat u}^{xy} x + (1 - \sigma_{\hat u}^{xy}) y)$ and $\delta_0 = v_u(\sigma_u^{xy} x + (1 - \sigma_u^{xy}) y)$ and note that $\hat u$ is more uncertainty averse implies $\delta_1 \leq \delta_0$. Applying (C1) to $U_1(y\,a\,x) < U_1(y\,b\,x)$ and $U_0(y\,a\,x) > U_0(y\,b\,x)$ and rearranging terms yields that $\delta_1 > \delta_0$ if and only if $\pi_*(a) - \pi_*(b) < 0$ and hence (C2) proves $a$ is more uncertain than $b$.
Conversely, if $\pi_*(a) - \pi_*(b) < 0 < \pi^*(a) - \pi^*(b)$, then let $u(x, y) = y$ and $\hat u(x, y) = x$ for all $(x, y) \in I$. Clearly, $\hat u$ is more uncertainty averse than $u$. Also, $U_1(y\,\hat a\,x) = y\pi_*(\hat a) + (1 - \pi_*(\hat a))x$, $U_0(y\,\hat a\,x) = y\pi^*(\hat a) + (1 - \pi^*(\hat a))x$ for all $\hat a$ and therefore $U_1(y\,a\,x) < U_1(y\,b\,x)$ and $U_0(y\,a\,x) > U_0(y\,b\,x)$ as desired.
Proof of Proposition 4: Let
$$\pi(a) = \begin{cases} \epsilon & \text{if } a_t = 1 \text{ for all } t \\ 0 & \text{otherwise} \end{cases}$$
Let $\epsilon = (\frac{1}{m})^k$ so that $\pi$ is a probability on $\mathcal{P}$. Without loss of generality, let $x = 0$, $y = 1$, $u(0, 0) = 0$ and $u(1, 1) = 1$ and set $\sigma = \sigma_u^{01}$ and $U = (\pi, u)$. Then, for any $b \in B$, let $\delta_t(b) = \frac{b_t}{m}$. Then,
$$\pi_*(b) = \sum_{a \subset b} \pi(a) = \prod_t \delta_t(b)$$
$$\pi^*(b) = 1 - \sum_{a \subset b^c} \pi(a) = 1 - \prod_t \delta_t(b^c) \qquad \text{(C3)}$$
When $|a| = |b|$, we have $\sum_t (a_t) = \sum_t (b_t)$. Furthermore, for $a \in A$, $\delta_t(a) = \delta_1(a)$ for all $t$. Hence, equation (C3) implies that $\pi_*(a) > \pi_*(b)$ and $\pi^*(a) < \pi^*(b)$ whenever $a \in A$, $b \in B \setminus A$ and $|a| = |b|$, proving (i).
It follows from equation (C1) in the proof of Proposition 2 that
$$U(y\,a\,x) = (1 - z)\pi_*(a) + z\pi^*(a)$$
Hence, $U(y\,a\,x) - U(y\,b\,x) = (1 - z)[\pi_*(a) - \pi_*(b)] + z[\pi^*(a) - \pi^*(b)]$. Let $T$ be the set of all $(a, b)$ such that $a \in A$, $b \in B \setminus A$ and $|a| = |b|$ and define
$$z^* = \min_{(a,b) \in T} \frac{\pi_*(a) - \pi_*(b)}{\pi_*(a) - \pi_*(b) + \pi^*(b) - \pi^*(a)}$$
Since $B$ is finite and $b$ is more uncertain than $a$ whenever $(a, b) \in T$, $z^* \in (0, 1)$ is well-defined by part (i).
Proof of Proposition 5: That separability precludes M-reversals is obvious. To conclude the proof, we will show that if there are no M-reversals, then the $u$ that satisfies the above equation must be separable. No M-reversals implies
$$u(x_1, y_1) + u(x_2, y_2) = u(x_1, y_2) + u(x_2, y_1) \qquad \text{(C4)}$$
whenever $(x_1, y_2), (x_2, y_1) \in I$. Define $v_2(y) = u(l, y)$ and $v_1(x) = u(x, m) - u(l, m)$. Then, $v_1(x) + v_2(y) = u(x, m) - u(l, m) + u(l, y)$ and equation (C4) ensures that $u(x, m) - u(l, m) = u(x, y) - u(l, y)$. Therefore, $v_1(x) + v_2(y) = u(x, y)$ for all $x, y$, proving the separability of $u$.
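The construction of $v_1$ and $v_2$ from (C4) can be tried out on a grid. The Python sketch below is illustrative only: the grid, the two test utilities and the function names `satisfies_c4` and `decompose` are assumptions, with $l = 0$ and $m = 4$ chosen for convenience. It checks (C4) on all grid quadruples whose pairs lie in $I = \{(x, y) \mid x \leq y\}$ and then rebuilds $u$ from $v_1(x) = u(x, m) - u(l, m)$ and $v_2(y) = u(l, y)$.

```python
from itertools import product


def satisfies_c4(u, grid):
    """Check u(x1,y1) + u(x2,y2) == u(x1,y2) + u(x2,y1) on the grid.

    Only quadruples with max(x1, x2) <= min(y1, y2) are tested, so that
    all four pairs belong to I = {(x, y) | x <= y}.
    """
    for x1, x2, y1, y2 in product(grid, repeat=4):
        if max(x1, x2) <= min(y1, y2):
            if u(x1, y1) + u(x2, y2) != u(x1, y2) + u(x2, y1):
                return False
    return True


def decompose(u, l, m):
    """Return (v1, v2) with v1(x) = u(x, m) - u(l, m) and v2(y) = u(l, y)."""
    v1 = lambda x: u(x, m) - u(l, m)
    v2 = lambda y: u(l, y)
    return v1, v2


l, m = 0, 4
grid = range(l, m + 1)
u_sep = lambda x, y: 2 * x + y * y   # assumed separable interval utility
u_non = lambda x, y: x * y           # assumed non-separable counterexample
v1, v2 = decompose(u_sep, l, m)
```

For the separable example the reconstruction $v_1(x) + v_2(y)$ reproduces $u$ exactly on the grid, while the product utility violates (C4), for instance at $x_1 = 0$, $x_2 = 1$, $y_1 = 1$, $y_2 = 2$.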
Proof of Proposition 6: Assume that κ = απ∗ + (1 − α)π ∗ and u = v0 . For any φ ∈ Φ
order S = {s1 , s2 , . . . sn } so that φ(si ) ≥ φ(si+1 ) and let a0 = ∅, ai = {s1 , . . . , si } for i ≥ 1.
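Then, for the $V$ that represents the CEU $\succeq_{\kappa v}$ we have,
$$\begin{aligned}
V(\phi) &= \sum_{i=1}^{n} v(\phi(s_i))[\kappa(a_i) - \kappa(a_{i-1})] \\
&= \sum_{i=1}^{n} v(\phi(s_i))\{\alpha[\pi_*(a_i) - \pi_*(a_{i-1})] + (1-\alpha)[\pi^*(a_i) - \pi^*(a_{i-1})]\} \\
&= \sum_{i=1}^{n} v(\phi(s_i)) \sum_{b \in A_i} \alpha\pi(b) + \sum_{i=1}^{n} v(\phi(s_i)) \sum_{b \in B_i} (1-\alpha)\pi(b)
\end{aligned}$$
where $A_i = \{b \subset S \mid b \subset a_i, b \not\subset a_{i-1}\}$ and $B_i = \{b \subset S \mid b \subset a_{i-1}^c, b \not\subset a_i^c\}$. Hence, we have
$$\begin{aligned}
V(\phi) &= \sum_{i=1}^{n} v(\phi(s_i)) \sum_{b \in A_i} \alpha\pi(b) + \sum_{i=1}^{n} v(\phi(s_i)) \sum_{b \in B_i} (1-\alpha)\pi(b) \\
&= \sum_{i=1}^{n} \sum_{b \in A_i} v(\min_{s \in b} \phi(s))\alpha\pi(b) + \sum_{i=1}^{n} \sum_{b \in B_i} v(\max_{s \in b} \phi(s))(1-\alpha)\pi(b) \\
&= \sum_{b \in \mathcal{P}} v(\min_{s \in b} \phi(s))\alpha\pi(b) + \sum_{b \in \mathcal{P}} v(\max_{s \in b} \phi(s))(1-\alpha)\pi(b) \\
&= \sum_{b \in \mathcal{P}} v_\alpha[\phi](b)\pi(b) = U(\phi)
\end{aligned}$$
where $U$ represents $\succeq_{u\pi}$.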
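The identity derived in this proof can be verified numerically. The Python sketch below is a hedged illustration: it assumes a three-state example, the capacity $\kappa = \alpha\pi_* + (1-\alpha)\pi^*$, and an increasing $v$; the names `choquet`, `euu_value`, and the example data are hypothetical. It computes the Choquet sum $\sum_i v(\phi(s_i))[\kappa(a_i) - \kappa(a_{i-1})]$ and compares it with $\sum_b [\alpha v(\min_b \phi) + (1-\alpha) v(\max_b \phi)]\pi(b)$.

```python
from fractions import Fraction as Fr


def lower_prob(pi, a):
    """pi_*(a): total mass of the subsets b contained in a."""
    return sum(p for b, p in pi.items() if b <= a)


def upper_prob(pi, a, states):
    """pi^*(a) = 1 - pi_*(complement of a)."""
    return 1 - lower_prob(pi, states - a)


def choquet(phi, v, kappa):
    """Sum of v(phi(s_i)) * [kappa(a_i) - kappa(a_{i-1})], prizes in decreasing order."""
    order = sorted(phi, key=lambda s: phi[s], reverse=True)
    total, previous, top = 0, 0, set()
    for s in order:
        top.add(s)                       # a_i = {s_1, ..., s_i}
        current = kappa(frozenset(top))
        total += v(phi[s]) * (current - previous)
        previous = current
    return total


def euu_value(phi, v, pi, alpha):
    """Sum over b of [alpha*v(min of phi on b) + (1-alpha)*v(max of phi on b)] * pi(b)."""
    return sum(
        p * (alpha * v(min(phi[s] for s in b)) + (1 - alpha) * v(max(phi[s] for s in b)))
        for b, p in pi.items()
    )


# Hypothetical example data.
states = frozenset({1, 2, 3})
pi = {frozenset({1}): Fr(1, 4), frozenset({2, 3}): Fr(1, 4), states: Fr(1, 2)}
alpha = Fr(1, 3)
kappa = lambda a: alpha * lower_prob(pi, a) + (1 - alpha) * upper_prob(pi, a, states)
phi = {1: 5, 2: 2, 3: 7}
v = lambda z: z                          # any increasing v would do
ceu = choquet(phi, v, kappa)
euu = euu_value(phi, v, pi, alpha)
```

In this example both computations give $21/4$. The weight $\alpha$ attaches to the worst prize on each subset because the increments of $\pi_*$ pick out the subsets whose minimum is attained at $s_i$, exactly as in the definition of $A_i$.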
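Proof of Proposition 7: Take any $\phi \in \Phi$ and for all $a \in \mathcal{P}$ choose $s_*, s^* \in a$ such that $\min_{s \in a} \phi(s) = \phi(s_*)$ and $\max_{s \in a} \phi(s) = \phi(s^*)$. Then, define $\lambda_*^a, \lambda_a^* \in \Delta_a$ such that $\lambda_*^a(s_*) = \lambda_a^*(s^*) = 1$. Then, for the $V$ that represents $\succeq^{\alpha}_{\Delta v}$, we have
$$\begin{aligned}
V(\phi) &= \alpha \min_{\lambda \in \Delta} \sum_{s \in S} v(\phi(s))\lambda(s) + (1-\alpha) \max_{\lambda \in \Delta} \sum_{s \in S} v(\phi(s))\lambda(s) \\
&= \alpha \sum_{a \in \mathcal{P}} v(\phi(s_*))\tau(a) + (1-\alpha) \sum_{a \in \mathcal{P}} v(\phi(s^*))\tau(a) \\
&= \sum_{a \in \mathcal{P}} [\alpha v(\phi(s_*)) + (1-\alpha) v(\phi(s^*))]\tau(a) \\
&= \sum_{a \in \mathcal{P}} v_\alpha(\phi(s_*), \phi(s^*))\tau(a) = U(\phi)
\end{aligned}$$
where $U$ represents $\succeq_{u\pi}$.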
References
Ahn, D. S., (2008) “Ambiguity without a State Space,” Review of Economic Studies, 75,
3–28.
Anscombe, F. J. and R. J. Aumann (1963) “A Definition of Subjective Probability,” Annals
of Mathematical Statistics, 34, 199–205.
Arrow, K. J. and L. Hurwicz (1972) “An Optimality Criterion for Decision-Making under Ignorance,” in C. F. Carter and J. F. Ford (eds.), Uncertainty and Expectations in Economics, Oxford: Basil Blackwell.
Baillon, A., O. L’Haridon and L. Placido (2010) “Ambiguity Model and the Machina Paradoxes,” forthcoming, American Economic Review.
Casadesus-Masanell, R., P. Klibanoff and E. Ozdenoren (2000) “Maxmin Expected Utility over Savage Acts with a Set of Priors,” Journal of Economic Theory, 92, 33–65.
Cerreia-Vioglio, S., F. Maccheroni, M. Marinacci and L. Montrucchio (2008) “Uncertainty Averse Preferences,” Manuscript.
Dempster, A. P. (1967) “Upper and Lower Probabilities Induced by a Multivalued Mapping.” The Annals of Mathematical Statistics, 38, 325–339.
Ellsberg, D. (1961): “Risk, Ambiguity and the Savage Axioms,” Quarterly Journal of Economics, 75, 643–669.
Epstein, L. G., (1999) “A Definition of Ambiguity Aversion,” Review of Economic Studies,
66, 579–608.
Epstein, L. G., and J. Zhang (2001) “Subjective Probabilities on Subjectively Unambiguous
Events,” Econometrica, 69, 265–306.
Ghirardato, P. and M. Marinacci (2001a): “Ambiguity Made Precise: A Comparative Foundation,” Journal of Economic Theory.
Ghirardato, P. and M. Marinacci (2001b) “Risk, Ambiguity and the Separation of Utility
and Beliefs,” Mathematics of Operations Research, 26, 4, 864–890.
Ghirardato, P., F. Maccheroni and M. Marinacci (2004) “Differentiating Ambiguity and Ambiguity Attitude,” Journal of Economic Theory, 118, 133–173.
Gilboa, I. (1987) “Expected Utility with Purely Subjective Non-Additive Probabilities,”
Journal of Mathematical Economics, 16, 65–88.
Gilboa, I, and D. Schmeidler (1989), “Maxmin Expected Utility with a Non-Unique Prior,”
Journal of Mathematical Economics, 18, 141–153.
Gilboa, I, and D. Schmeidler (1994), “Additive Representations of Non-additive Measures
and the Choquet Integral,” Annals of Operations Research, 52, 43–65.
Gul, F, and W. Pesendorfer (2010), “Expected Uncertain Utility and Multiple Sources,”
mimeo, Princeton University.
Jaffray, J.Y. (1989) “Linear Utility Theory for Belief Functions,” Operations Research
Letters, 8, 107–112.
Jaffray, J.-Y., and P. P. Wakker, (1993): “Decision Making with Belief Functions: Compatibility and Incompatibility with the Sure-Thing Principle,” Journal of Risk and Uncertainty,
7, 255–71.
Klibanoff, P., Marinacci, M. and Mukerji, S. (2005), “A Smooth Model of Decision Making
under Ambiguity,” Econometrica, 73, 1849–1892.
Lehrer, E. (2007) “Partially Specified Probabilities: Decisions and Games,” mimeo.
L’Haridon, O. and L. Placido (2010) “Betting on Machina’s Reflection Example: an Experiment on Ambiguity,” forthcoming, Theory and Decision.
Maccheroni, F., M. Marinacci and A. Rustichini (2006) “Ambiguity Aversion, Robustness,
and the Variational Representation of Preferences,” Econometrica, 74, 1447–1498.
Machina, M. J. (2009): “Risk, Ambiguity, and the Rank-Dependence Axioms,” American
Economic Review, 99, 385–392.
Machina, M. J. and D. Schmeidler (1992): “A More Robust Definition of Subjective Probability,” Econometrica, 60, 745–780.
Nau, R. F. (2006), “Uncertainty Aversion with Second-Order Utilities and Probabilities,”
Management Science, 52, 136–145.
Olszewski, W. B. (2007) “Preferences over Sets of Lotteries,” Review of Economic Studies,
74, 567–595.
Savage, L. J. (1954) The Foundations of Statistics, Wiley, New York.
Schmeidler, D. (1989) “Subjective Probability and Expected Utility without Additivity,” Econometrica, 57, 571–587.
Segal, U. (1990) “Two-Stage Lotteries without the Reduction Axiom,” Econometrica, 58,
349–77.
Shafer, G. (1976) A Mathematical Theory of Evidence, Princeton University Press.
Siniscalchi, M., (2009) “Vector Expected Utility and Attitudes toward Variation,” Econometrica, 77, 801–855.
Zhang, J., (2002) “Subjective Ambiguity, Expected Utility and Choquet Expected Utility,”
Economic Theory, 20, 159–181.