Expected Uncertain Utility Theory†

Faruk Gul and Wolfgang Pesendorfer
Princeton University
June 2010

Abstract

We introduce and analyze expected uncertain utility theory (EUU). A prior and an interval utility characterize an EUU decision maker. The decision maker transforms each uncertain prospect into an interval-valued prospect that assigns an interval of prizes to each state. She then ranks prospects according to their expected interval utilities. We define uncertainty aversion for EUU, use the EUU model to address the Ellsberg Paradox and other ambiguity evidence, and relate EUU theory to existing models.

† This research was supported by a grant from the National Science Foundation (Grant number: SES0820101). We are grateful to Asen Kochov, Tomasz Strzalecki and Peter Wakker for their comments.

1. Introduction

We introduce and analyze expected uncertain utility theory (EUU). We consider preferences over Savage (1954) acts that associate a monetary prize to every state of nature. Our goal is to provide a flexible theory that can address three well-documented deviations from expected utility theory:

(i) Ellsberg-style evidence that identifies behavior inconsistent with any single subjective prior over the event space.
(ii) Source preference evidence showing that, ceteris paribus, decision makers prefer uncertain prospects that depend on familiar rather than unfamiliar events.
(iii) Allais-style evidence showing that even when decision makers' preferences are consistent with a subjective probability assessment, they reveal systematic violations of the independence axiom.

In this paper, we provide axioms for EUU theory, discuss how it relates to Ellsberg-style evidence and uncertainty aversion, and relate EUU theory to other models of decision making under uncertainty. We leave the analysis of source preference and Allais-style evidence to the companion paper, Gul and Pesendorfer (2010).
1.1 The Representation and Main Concepts

In EUU theory, as in subjective expected utility theory (SEU), two parameters describe a decision maker: a subjective prior $\mu$ and a utility index $u$. The prior, defined over some σ-algebra $\mathcal{E}$, is a countably additive, complete and nonatomic probability measure. We refer to the elements of $\mathcal{E}$ as ideal events. The utility index assigns a real number to each pair $(x, y)$ such that $x \le y$ and is continuous and increasing. We refer to such an index as an interval utility. As in SEU, the prior and the utility index are subjective; that is, they are derived from preferences.

We illustrate the main ideas of EUU theory with bets; that is, acts that deliver $y$ if some event $A$ occurs and $x < y$ if $A$ does not occur.¹ Let $yAx$ denote such a bet.

¹ We describe and discuss the general representation in the next section.

For an arbitrary event $A$ (not necessarily in $\mathcal{E}$), define the inner probability $\mu_*(A)$ and the outer probability $\mu^*(A)$ of $A$ as follows:

$$\mu_*(A) = \sup_{E \in \mathcal{E},\; E \subset A} \mu(E)$$

$$\mu^*(A) = \inf_{E \in \mathcal{E},\; E \supset A} \mu(E) = 1 - \mu_*(A^c)$$

It is easy to see that inner and outer probabilities are attained. That is, for every $A$, there exists $E \in \mathcal{E}$, $E \subset A$ such that $\mu_*(A) = \mu(E)$. We call this $E$ the core of $A$. Let $E_1$ be the core of $A$, $E_3$ be the core of $A^c$ and let $E_2$ be the complement of $E_1 \cup E_3$. Call $E_1, E_2, E_3$ an ideal split of $A$. Now, we can restate the inner and outer probabilities of $A$ in terms of the probabilities of $E_1$ and $E_2$:

$$\mu_*(A) = \mu(E_1) \le \mu(E_1) + \mu(E_2) = \mu^*(A)$$

In EUU theory, the decision maker assigns utilities to bets as follows:

$$W(yAx) = u(y, y)\mu(E_1) + u(x, y)\mu(E_2) + u(x, x)\mu(E_3) \tag{1}$$

When $\mu(E_2) = 0$, the event $A$ is ideal and expression (1) is simply the expected utility of the bet $yAx$ for the prior $\mu$ and the von Neumann-Morgenstern utility index $v_u$ where $v_u(x) = u(x, x)$. When $\mu(E_2) > 0$, expression (1) differs from expected utility.
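Expression (1) can be evaluated numerically once an interval utility is fixed. The sketch below is only an illustration: the functional form u(x, y) = αx + (1 − α)y and the names `hurwicz_interval_utility` and `bet_utility` are assumptions introduced here, not part of the theory. It uses the identities µ(E1) = inner probability, µ(E2) = outer minus inner probability, and µ(E3) = 1 minus outer probability.

```python
from typing import Callable

IntervalUtility = Callable[[float, float], float]


def hurwicz_interval_utility(alpha: float) -> IntervalUtility:
    """Hypothetical interval utility u(x, y) = alpha*x + (1 - alpha)*y.

    This form is an assumption for illustration only; the text merely
    requires u to be continuous and increasing on pairs with x <= y.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    def u(x: float, y: float) -> float:
        if x > y:
            raise ValueError("an interval (x, y) needs x <= y")
        return alpha * x + (1.0 - alpha) * y

    return u


def bet_utility(u: IntervalUtility, x: float, y: float,
                inner: float, outer: float) -> float:
    """Evaluate expression (1) for the bet yAx.

    inner and outer are the inner and outer probabilities of A, so that
    mu(E1) = inner, mu(E2) = outer - inner and mu(E3) = 1 - outer.
    """
    if not x < y:
        raise ValueError("a bet yAx needs x < y")
    if not 0.0 <= inner <= outer <= 1.0:
        raise ValueError("need 0 <= inner <= outer <= 1")
    mu_e1, mu_e2, mu_e3 = inner, outer - inner, 1.0 - outer
    return u(y, y) * mu_e1 + u(x, y) * mu_e2 + u(x, x) * mu_e3


u = hurwicz_interval_utility(0.75)
ideal_bet = bet_utility(u, 0.0, 100.0, inner=0.5, outer=0.5)    # 50.0
diffuse_bet = bet_utility(u, 0.0, 100.0, inner=0.0, outer=1.0)  # 25.0
print(ideal_bet, diffuse_bet)
```

With this assumed utility, a bet on an ideal event of probability 1/2 is valued at its expected value, while a bet on a diffuse event is valued at u(x, y) alone.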
From the definitions of $E_1$ and $E_3$ it follows that $\mu_*(A \cap E_2) = 0$ and $\mu^*(A \cap E_2) = \mu(E_2)$ and hence every nonnull measurable subset of $E_2$ must intersect $A$ and its complement. We refer to a set of the form $E \cap A$ with this property as a diffuse subset of the ideal event $E$. A diffuse set $D$ (of the entire state space) satisfies $\mu_*(D) = 0$ and $\mu^*(D) = 1$ and represents a situation of complete ignorance. EUU theory employs diffuse sets as the building block of uncertainty, using the fact that every diffuse subset of $E$ has the form $E \cap D$ for some diffuse set $D$.

In addition to the Savage axioms, EUU theory uses the following novel assumption: the decision maker is indifferent between betting on any two diffuse subsets of any ideal set $E$. That is,

$$y(E \cap D)x \sim y(E \cap D')x \tag{2}$$

for every $E \in \mathcal{E}$, all diffuse sets $D, D'$ and all $x, y$. Thus, complete ignorance results in indifference between bets.

Considering situations of complete ignorance and quantifying the uncertainty of nonmeasurable sets by their inner and outer probabilities are two ideas that predate EUU theory. Arrow-Hurwicz (1972) provides the first analysis of the former idea and the Dempster-Shafer theory (Shafer (1976)) formalizes the latter. The novelty in EUU theory is a representation for all bets and all acts, corresponding to subjective expected utility theory at one extreme (i.e., when betting on ideal events) and to the Arrow-Hurwicz criterion² at the other (i.e., when betting on diffuse events). Put differently, EUU theory shows that every subjective expected utility model with a countably additive prior provides a rich enough framework for a complete theory of uncertainty provided we extend preferences to all bets via (2) above.³

Our representation theorem requires a rich state space to separate uncertainty attitude and uncertainty perception (as discussed below). A similar implicit richness requirement (that the state space is infinite) in the Savage model facilitates an analogous separation.
In applications, this richness is not necessary and is even implausible. To facilitate these applications, we provide a discrete formulation of EUU theory in section 3. The discrete model is the appropriate framework for confronting the model with evidence. It may not contain diffuse events or ideal events and, therefore, the finite model contains no analog of assumption (2) above, nor does it require that the agent satisfy the expected utility hypothesis over certain events.

² The relationship between the Arrow-Hurwicz criterion and EUU theory becomes clearer when we consider more general diffuse acts; that is, all simple acts such that $f^{-1}(x)$ is diffuse for all $x$ in the range of $f$.
³ This assertion relies on the Continuum Hypothesis.

1.2 Separation of Uncertainty Perception and Attitude

A novel feature of the EUU model is a separation between uncertainty perception and uncertainty attitude that mirrors the same separation in subjective expected utility theory (SEU). In SEU, a convex-valued prior⁴ describes the decision maker's uncertainty perception while the utility index describes her uncertainty attitude. In particular, decision maker 1 perceives the same uncertainty in event $A$ as decision maker 2 perceives in event $B$ if 1's prior of $A$ is equal to 2's prior of $B$. The SEU prior and utility index separate uncertainty perception and uncertainty attitude because they satisfy the following two properties:

(i) For every event $A$, there exists some event $B$ such that 1's prior of $A$ is equal to 2's prior of $B$. Hence, both decision makers perceive the same range of uncertainty.
(ii) Two decision makers have the same uncertainty attitude (the same von Neumann-Morgenstern index) if and only if 1's certainty equivalent of the bet $yAx$ is the same as 2's certainty equivalent of the bet $yBx$ for all $x, y, A, B$ such that 1 perceives the same uncertainty in $A$ as 2 does in $B$.

⁴ A prior $\mu$ is convex-valued if for every event $A$ and $\lambda \in [0, 1]$ there is $B \subset A$ such that $\mu(B) = \lambda\mu(A)$.
Property (i) ensures that decision makers with different uncertainty perception are comparable while property (ii) ensures that agents' uncertainty attitude measures the ranking of comparable bets. The separation between uncertainty perception and uncertainty attitude enables comparisons of uncertainty attitudes of two decision makers without demanding that their priors agree. In section 2.4, we demonstrate that EUU achieves the same separation: the EUU prior measures uncertainty perception and the interval utility measures uncertainty attitude.

Choquet expected utility theory (CEU) and maxmin expected utility theory (MEU) also have two parameters: a capacity $\kappa$ and a utility index $v$ for the former; a set of probabilities $\Delta$ and a utility index $v$ for the latter. However, these parameters do not achieve separation between uncertainty perception and uncertainty attitude. To illustrate this failure for CEU, consider two capacities $\kappa, \kappa'$ and an event $A$. If we define perceiving the same uncertainty as $\kappa(A) = \kappa'(B)$, then criterion (ii) will typically fail. For example, if $\kappa(A) = \kappa(B)$ and $\kappa(A^c) \ne \kappa(B^c)$ then $yAx \not\sim yBx$ for $x > y$. If we define perceiving the same uncertainty as $\kappa(A) = \kappa(B)$ and $\kappa(A^c) = \kappa(B^c)$, then, in general, (i) does not hold.

The separation of uncertainty perception and uncertainty attitude is central to the construction of measures of risk and risk attitude in expected utility theory. Since EUU facilitates an analogous separation, we can construct analogous measures. We use these measures to address behavior in Ellsberg-style experiments. Call $z$ the certainty equivalent of a bet if the decision maker is indifferent between getting $z$ for sure and the bet.

(i) Decision maker 1 is more uncertainty averse than decision maker 2 if 1's certainty equivalent of $yAx$ is lower than 2's certainty equivalent of $yBx$ whenever 1 perceives the same uncertainty in $A$ as 2 perceives in $B$.
(ii) Event $A$ is more uncertain than event $B$ if there are two preferences with identical uncertainty perception, one more uncertainty averse than the other, such that the more uncertainty averse preference prefers a bet on $B$ while the less uncertainty averse preference prefers a bet on $A$.

We show that the event $A$ is more uncertain than the event $B$ if $A$ has a strictly lower inner probability and a strictly higher outer probability than $B$. Hence, ideal events are minimally uncertain and diffuse events are maximally uncertain.

Our definitions of uncertainty and uncertainty aversion require a separation between uncertainty perception and uncertainty attitude as described above but are otherwise model independent. For example, we can apply the definition to subjective expected utility theory. In that case, "more uncertainty averse" reduces to the standard notion of "more risk averse." For subjective expected utility maximizers all events that have the same prior are equally uncertain and events with different priors cannot be ranked. Thus, the concept of "more uncertain than" is redundant for SEU.

1.3 Evidence

EUU theory is flexible enough to accommodate the various versions of the Ellsberg paradox. More specifically, we show that Ellsberg paradoxes can be interpreted as situations in which decision makers perceive some events to be less uncertain than others and make the Ellsberg-style choices whenever they are more uncertainty averse than a benchmark decision maker with the same perception of uncertainty. Hence, we can relate the propensity for Ellsberg-paradox behavior to the decision maker's uncertainty aversion parameter.

Recently, Machina (2009) showed that Choquet expected utility theory and related models⁵ are unable to accommodate variations of the Ellsberg paradox that appear plausible and even natural. Recent experimental evidence reported in L'Haridon and Placido (2010) confirms Machina's intuition and documents a particular pattern of behavior. In section 5.1, we discuss this evidence and show that EUU theory can accommodate the observed pattern of preference if the interval utility is strictly supermodular, that is, if $u(x_4, x_1) + u(x_3, x_2) > u(x_2, x_1) + u(x_4, x_3)$ whenever $x_1 > x_2 \ge x_3 > x_4$.

⁵ Baillon, L'Haridon and Placido (2010) extend Machina's observation to α-maxmin expected utility and Klibanoff, Marinacci and Mukerji's (2005) smooth model of ambiguity. The authors confirm that Siniscalchi's (2009) vector valued expected utility model permits the behavior suggested by Machina.

1.4 Outline

Section 2 contains our representation theorem and demonstrates how EUU separates uncertainty perception and uncertainty attitude. To axiomatize EUU, we require an infinite state space. However, to address Ellsberg-style evidence and to relate the model to the literature, it is more convenient to use a model with a discrete state space. In section 3, we introduce EUU for a discrete state space. In section 4, we define uncertainty and uncertainty aversion and relate them to parameters of the EUU representation. In section 5, we formulate a canonical Ellsberg-style experiment and show how the observed patterns of behavior match up with the parameters of the EUU representation. Special cases of EUU have been studied by Zhang (2002) and Lehrer (2007). In addition, the EUU representation is related to Jaffray (1989). We describe the relation to those and other papers in section 6. Section 7 concludes with a detailed comparison to Choquet expected utility theory and α-maxmin expected utility theory.

2. Expected Uncertain Utility

The interval $X = [l, m]$, $l < m$, is the set of monetary prizes and $\Omega$ is the state space. The decision maker has preferences over acts; that is, functions $f$ from $\Omega$ to $X$. Let $\mathcal{F}$ be the set of all acts.
2.1 The Prior and Envelopes

A countably additive probability measure $\mu$ (on some σ-algebra $\mathcal{E}_\mu$) is a prior if it is complete (i.e., $A \subset E$ and $\mu(E) = 0$ implies $A \in \mathcal{E}_\mu$) and nonatomic (i.e., $\mu(A) > 0$ implies $0 < \mu(B) < \mu(A)$ for some $B \subset A$). Given any prior $\mu$, let $\mathcal{F}_\mu$ be the set of all $\mathcal{E}_\mu$-measurable acts; that is,

$$\mathcal{F}_\mu = \{f \in \mathcal{F} \mid f^{-1}[x, y] \in \mathcal{E}_\mu \text{ for all } x, y \in X\}$$

Let $I = \{(x, y) \mid l \le x \le y \le m\}$ be the set of all prize intervals. We interpret the pair $(x, y)$ as a single (subjective) consequence. The pair $(x, y)$ describes a situation that the decision maker interprets as getting at least $x$ and at most $y$. Given a prior $\mu$, a function $\mathbf{f} : \Omega \to I$ is a subjective interval act if it is measurable with respect to $\mathcal{E}_\mu$. Let $\mathbf{F}$ denote the set of all subjective interval acts. For $\mathbf{f} \in \mathbf{F}$, let $\mathbf{f}_i$ denote the $i$'th coordinate of $\mathbf{f}$. That is, $\mathbf{f}(\omega) = (\mathbf{f}_1(\omega), \mathbf{f}_2(\omega))$ for all $\omega \in \Omega$. Lemma 1 below reveals that given any prior $\mu$, each act can be identified with a unique (up to a set of measure 0) subjective interval act. All proofs are in the Appendix.

Lemma 1: For any prior $\mu$ and $f \in \mathcal{F}$, there exists an $\mathbf{f} \in \mathbf{F}$ such that

$$\mu(\{\omega \in \Omega \mid \mathbf{f}_1(\omega) \le f(\omega) \le \mathbf{f}_2(\omega)\}) = 1$$

and if $\mathbf{g} \in \mathbf{F}$ also satisfies the equation above, then

$$\mu(\{\omega \in \Omega \mid \mathbf{g}_1(\omega) \le \mathbf{f}_1(\omega) \le \mathbf{f}_2(\omega) \le \mathbf{g}_2(\omega)\}) = 1$$

It is clear that any $\mathbf{f}$ with the property above is unique up to a set of measure 0. We call the $\mathbf{f}$ corresponding to any $f$ its envelope. Note that $f \in \mathcal{F}_\mu$ if and only if $\mathbf{f}_1 = f = \mathbf{f}_2$ $\mu$-almost surely. That is, an act is $\mathcal{E}_\mu$-measurable if and only if $f = \mathbf{f}_1 = \mathbf{f}_2$. Lemma 2 below is a converse of Lemma 1.

Lemma 2: For any prior $\mu$ and $\mathbf{f} \in \mathbf{F}$, there exists $f \in \mathcal{F}$ such that $\mathbf{f}$ is $f$'s envelope.

2.2 The Interval Utility and Representation

Henceforth, we let $[f] = ([f]_1, [f]_2)$ denote the envelope of $f$. An interval utility is a continuous function $u : I \to X$ such that $u(x, y) > u(x', y')$ whenever $x > x'$ and $y > y'$.
A preference $\succeq$ is an expected uncertain utility (EUU) if there exists a prior $\mu$ and an interval utility $u$ such that the function $W$ defined below represents $\succeq$:

$$W(f) = \int u[f]\, d\mu$$

Thus, a prior $\mu$ and an interval utility $u$ characterize an EUU decision maker. We let $\succeq^u_\mu$ denote the EUU preference associated with $(\mu, u)$. Define the function $v_u : X \to \mathbb{R}$ such that $v_u(x) = u(x, x)$ for all $x \in X$. For $f \in \mathcal{F}_\mu$ we have

$$W(f) = \int v_u(f)\, d\mu$$

The following example illustrates how the EUU of an arbitrary (not $\mathcal{E}_\mu$-measurable) act is computed:

Example: Let $\Omega = [0, 1] \times [0, 1]$ and let $\mathcal{E}_0$ be the smallest σ-algebra that contains all events of the form $[a, b] \times [0, 1]$ for $0 \le a \le b \le 1$. That is, $\mathcal{E}_0$ is the set of all full-height rectangles, i.e., sets of the form $B \times [0, 1]$ for any Borel set $B \subset [0, 1]$, as illustrated in Figure 1 below.

[Figure 1: the unit square with sets labeled E and A]

Let $\mu_0$ be the unique measure on $\mathcal{E}_0$ that satisfies $\mu_0([a, b] \times [0, 1]) = b - a$ and let $(\mathcal{E}, \mu)$ be the completion of $(\mathcal{E}_0, \mu_0)$.

Consider the act $f$ illustrated in Figure 2 below with prizes $x < y < z$. The act yields prize $x$ on the yellow shaded region, $y$ on the light grey shaded region and $z$ on the dark grey region.

[Figure 2: the act f on the unit square, with regions labeled x, y and z and ideal events E1 and E2]

The envelope $[f]$ of $f$ depicted in Figure 2 is $[f]_1 = x$, $[f]_2 = yE_1z$ and hence $W(f) = \mu(E_1)u(x, y) + \mu(E_2)u(x, z)$.

2.3 Axioms and the Theorem

Theorem 1 below shows that $\succeq$ is an EUU if and only if it satisfies the following 6 axioms. Note that the axioms are analogous to their counterparts in Savage's theorem. We identify $x \in X$ with the constant act that yields $x$ in every state. Hence, the binary relation $\succeq$ on $\mathcal{F}$ induces a binary relation on $X$.

Axiom 1: The binary relation $\succeq$ is complete and transitive.

Axiom 2: If $f(s) > g(s)$ for all $s \in \Omega$, then $f \succ g$.

We interpret prizes as quantities of money and Axiom 2 is a natural consequence of that interpretation.⁶ For any $f, g \in \mathcal{F}$ and $A \subset \Omega$, let $fAg$ denote the act $h$ such that $h(s) = f(s)$ for all $s \in A$ and $h(s) = g(s)$ for all $s \in A^c$.
Hence, $xAy$ denotes the act that yields $x$ if $A$ occurs and $y$ otherwise.

⁶ Though natural, the assumption is not implied by the Savage axioms and cannot be satisfied in the Savage model with a countable state space.

Our first goal is to identify an ideal environment from the decision maker's preferences and use it to calibrate his attitude towards uncertainty. Consider two acts that imply different subacts on the event $E$ but have a common subact on $E^c$. If the event $E$ is ideal, the ranking of acts does not depend on the common subact on $E^c$. Similarly, if two acts differ on $E^c$ but have a common subact on $E$ then the ranking of acts does not depend on the common subact.

Definition: An event $E$ is ideal if $fEh \succeq gEh$ and $hEf \succeq hEg$ implies $fEh' \succeq gEh'$ and $h'Ef \succeq h'Eg$.

Thus, an event $E$ is ideal if Savage's sure thing principle holds with respect to $E$ and $E^c$. An event $A$ is null if $fAh \sim gAh$ for all $f, g, h \in \mathcal{F}$. If $A$ is not null, we call it non-null. Let $\mathcal{E}$ be the set of all ideal events and let $E, E', E_i$ etc. denote elements of $\mathcal{E}$. Let $\mathcal{E}_+ \subset \mathcal{E}$ denote the set of ideal events that are not null.

An event is diffuse if it and its complement intersect every non-null ideal event. Diffuse events represent outcomes in situations of complete ignorance. The decision maker cannot find any (non-null) ideal event contained in it or its complement and hence cannot bound the probability of such events. Let $\mathcal{D}$ be the set of all diffuse events and let $D, D', D_i$ etc. denote elements of $\mathcal{D}$.

Definition: An event $D$ is diffuse if $E \cap D \ne \emptyset \ne E \cap D^c$ for every $E \in \mathcal{E}_+$.

In the example above, let $D \subset \Omega$ be such that both it and its complement intersect every vertical line $\{s\} \times [0, 1]$. Then, $D$ is diffuse. Our main hypothesis (formalized in Axiom 3) is that the decision maker cannot discriminate among the diffuse subsets of any ideal event. That is, for any $E$, the decision maker is indifferent between betting on $E \cap D_1$ and $E \cap D_2$ when both events are diffuse.
This indifference reflects the decision maker's complete ignorance over diffuse outcomes.

Axiom 3: $y(E \cap D)x \sim y(E \cap D')x$ for all $x, y, E, D$ and $D'$.

One consequence of Axiom 3 is that it permits the partitioning of $\Omega$ into a finite collection of sets $D_1, \dots, D_n$ such that $y(D_j \cup D_k)x \sim yD_ix$ for all $i, j$ and $k$. Note that Savage's theory allows for a similar possibility for countably infinite collections of sets. Diffuse sets are limiting events that play a similar role in EUU theory as arbitrarily unlikely events do in Savage's theory. They allow us to calibrate the uncertainty of events and thereby facilitate the separation of uncertainty perception and uncertainty attitude.

Our model requires a rich state space. The existence of diffuse events is a consequence of the countably additive and convex valued probability on that state space, just as the existence of null events is a consequence of the convex valued (and finitely additive) probability in Savage's theory. Section 3, below, examines EUU in finite settings that may contain no null or diffuse events. The finite model is the appropriate framework to confront evidence since empirical tests typically employ simple, discrete state spaces. Because the discrete setting may contain no diffuse events, Axiom 3 has no bite there. Thus, Axiom 3 should be interpreted more as a conceptual device than as a testable prediction of the theory.⁷

Axiom 4 below is Savage's comparative probability axiom (P4) applied to ideal events.

Axiom 4: If $y > x$ and $w > z$, then $yEx \succeq yE'x$ implies $wEz \succeq wE'z$.

Axiom 5 is Savage's divisibility axiom for ideal events. It serves the same role here as in Savage. Its statement below is a little simpler than Savage's original statement because in our setting, there is a best and a worst prize. Let $\mathcal{F}^o$ denote the set of simple acts; that is, acts such that $f(\Omega)$ is finite. The simple act $f \in \mathcal{F}^o$ is ideal if $f^{-1}(x) \in \mathcal{E}$ for all $x$. Let $\mathcal{F}^e$ denote the set of ideal simple acts.
Axiom 5: If $f, g \in \mathcal{F}^e$ and $f \succ g$, then there exists a partition $E_1, \dots, E_n$ of $\Omega$ such that $lE_if \succ mE_ig$ for all $i$.

Axiom 6 below is a strengthening of Savage's dominance condition adapted to our setting. We use it to extend the representation from simple acts to all acts, to establish continuity of $u$ and to guarantee countable additivity of the prior $\mu$. Notice that for ideal acts $f \in \mathcal{F}^e$ Axiom 6(i) implies Arrow's (1970) monotone continuity axiom, the standard axiom used to establish countable additivity of the probability measure in subjective expected utility theory.

⁷ The status of Axiom 3 is similar to that of P6 (small event continuity) in Savage's theory. Both axioms have no bite for a subset of acts that are measurable with respect to a fixed, finite collection of events. Therefore, these axioms should be viewed as conceptual devices that connect the theory to behavior in the idealized environment with a rich state space and a corresponding rich set of acts.

Axiom 6:
(i) If $f_n \in \mathcal{F}^e$ converges pointwise to $f$, then $g \succeq f_n \succeq h$ for all $n$ implies $g \succeq f \succeq h$.
(ii) If $f_n \in \mathcal{F}$ converges uniformly to $f$, then $g \succeq f_n \succeq h$ for all $n$ implies $g \succeq f \succeq h$.

Axiom 6(ii) is what would be required to get a continuous von Neumann-Morgenstern utility index when proving Savage's Theorem in a setting with real-valued prizes. Here, it serves a similar role; it ensures the continuity of the interval utility.

Theorem 1 below is our main result. It establishes the equivalence of the six axioms to the existence of an EUU representation. The uniqueness of this representation follows from standard arguments and is omitted.

Theorem 1: The binary relation $\succeq$ satisfies Axioms 1-6 if and only if there is a prior $\mu$ and an interval utility $u$ such that $\succeq\, =\, \succeq^u_\mu$. Moreover, the prior is unique and the interval utility is unique up to a positive affine transformation.
The interval utility $u$ measures the decision maker's uncertainty attitude while the prior $\mu$ measures the decision maker's uncertainty perception. Next, we show that the requirements for such a separation, as described in the introduction, are satisfied.

2.4 Uncertainty Perception and Uncertainty Attitude

In the introduction, we provide two criteria for the separation of uncertainty perception and uncertainty attitude:

(i) For every event $A$, there exists some event $B$ such that 2 perceives the same uncertainty in $B$ as 1 does in $A$.
(ii) Two decision makers have the same uncertainty attitude if and only if 1's certainty equivalent of $yAx$ is the same as 2's certainty equivalent of $yBx$ for all $x, y$ whenever 1 perceives the same uncertainty in $A$ as 2 perceives in $B$.

Here, we present definitions of uncertainty perception and uncertainty attitude and show that they satisfy the criteria above. Let $\succeq^{u_1}_{\mu_1}$ and $\succeq^{u_2}_{\mu_2}$ be two EUU decision makers. We will refer to $\succeq^{u_i}_{\mu_i}$ as $i$. We say that 2 perceives the same uncertainty in $B$ as 1 perceives in $A$ if

$$\mu_{2*}(B) = \mu_{1*}(A) \quad \text{and} \quad \mu_2^*(B) = \mu_1^*(A)$$

Also, we say that 1 has the same uncertainty attitude as 2 if $u_1$ is a positive affine transformation of $u_2$. Hence, $\mu$ describes uncertainty perception and $u$ describes uncertainty attitude.

To see how these notions meet the criteria (i) and (ii), note that Lemma 2 implies that every prior exposes the decision maker to the same range of uncertainty perceptions. That is, for any $p, q \in [0, 1]$ with $p + q \le 1$ there exist $A, B$ such that

$$\mu_{2*}(B) = \mu_{1*}(A) = p \quad \text{and} \quad \mu_2^*(B) = \mu_1^*(A) = p + q \tag{3}$$

Thus, our definition of uncertainty perception satisfies (i). To demonstrate property (ii), let 2 perceive the same uncertainty in $B$ as 1 perceives in $A$ and let $p$ and $q$ satisfy (3). Also, let $u_1$ be a positive affine transformation of $u_2$.
By the representation theorem, $yAx \sim^{u_1}_{\mu_1} z$ if and only if

$$u_1(z, z) = u_1(y, y)p + u_1(x, y)q + u_1(x, x)(1 - p - q) \tag{4}$$

Since $u_2$ is a positive affine transformation of $u_1$, (4) also holds for $u_2$ and hence $z$ is also the certainty equivalent of $yBx$ for 2. For the converse, let $v_i(x) = u_i(x, x)$ and note that for every $(x, y) \in I$, there exists a unique $z_i$ such that $v_i(z_i) = u_i(x, y)$. Let $D_i$ be a diffuse set for $i$; hence 2 perceives the same uncertainty in $D_2$ as 1 does in $D_1$. Then, $z_1 = z_2$ by hypothesis. Thus, $u_1$ is a positive affine transformation of $u_2$ if and only if $v_1$ is a positive affine transformation of $v_2$. That two expected utility maximizers have the same preferences over binary gambles only if their utility indices are the same, up to a positive affine transformation, is well known. Thus, we have established (ii).

2.5 Outline of the Proof of Theorem 1

If we restrict attention to ideal events, Axioms 1-6 yield a standard expected utility representation with a countably additive probability measure $\mu$ and a continuous utility index $v : X \to \mathbb{R}$. Fix any diffuse event $D$ and for $(x, y) \in I$, let $u(x, y) = v(z)$ for $z$ such that $yDx \sim z$. Axioms 2 and 6 ensure that $z \in [x, y]$ exists and therefore $u$ is well defined. The proof of the Theorem shows that $W$ represents $\succeq^u_\mu$. For this, it is enough to show that $v(x^*) = W(f)$ implies $x^* \sim f$.

Fix any partition $A_1, \dots, A_n$ of $\Omega$. In the proof of Lemma 1, we show that $\Omega$ can be partitioned into any finite number of diffuse sets. To show this, we use a Theorem by Birkhoff (1967) which in turn uses the continuum hypothesis.⁸ We use the fact that $\Omega$ can be partitioned into any finite number of diffuse sets together with the fact that $\mu$ is nonatomic to show that there are two partitions, one of ideal events $E_0, E_1, \dots, E_m$, the other of diffuse events $D_1, \dots, D_l$, such that $\mu(E_0) = 0$ and

$$E_i \cap D_j \subset A_k \tag{5}$$

for some $k$ and for all $i, j$. Since $\mu(E_0) = 0$, we ignore $E_0$.
Now, consider any simple act $f$: Let $\{w_1, \dots, w_n\}$ be the set of values that $f$ takes, and assume without loss of generality that $w_i < w_{i+1}$. Consider the partition $\{A_i \mid A_i = f^{-1}(w_i) \text{ for } i = 1, \dots, n\}$ and let $\{E_j\}, \{D_k\}$ be ideal and diffuse partitions that satisfy (5). Let $x, y$ be the minimal and maximal values of $f$ on $E_1$. Let $f_1$ be an act that agrees with $f$ on $E_1^c$, takes on the values $x$ and $y$ on $E_1$ and agrees with $f$ when $f$ is equal to $x$ or $y$. That $f_1$ has the same envelope as $f$ follows from the definition of a diffuse event. To see that $f_1$ is indifferent to $f$ consider the simplest case: $E_1 = \Omega$ and assume that $f = (xD_1z)D_2y$ for some diffuse partition $D_1, D_2, D_3$. By monotonicity $(xD_1z)D_2y \succeq x(D_1 \cup D_2)y$ and $xD_1y \succeq (xD_1z)D_2y$ and by Axiom 3, $x(D_1 \cup D_2)y \sim xD_1y$ and therefore $x(D_1 \cup D_2)y \sim (xD_1z)D_2y \sim xD_1y$.

⁸ Birkhoff (1967), Theorem 13 (pg. 266) shows that no nontrivial (i.e., not identically equal to 0) countably additive measure such that every singleton has measure 0 can be defined on the algebra of all subsets of the continuum.

Then, by induction, $f$ is indifferent to and has the same envelope as some act $g$ that takes at most two values on each $E_j$ and agrees with $f$ whenever $f$ takes its maximal or minimal value in $E_j$. Let $y_j$ and $x_j$ be these values respectively. Then, it follows from the definition of an ideal event and Axiom 3 that $g$ is indifferent to the act $h$ such that $h(\omega) = z_j$ for all $\omega \in E_j$ for $z_j$ such that $z_j \sim y_jDx_j$. Since $h$ is measurable with respect to ideal events, $x^* \sim h \sim g \sim f$ for some $x^*$ such that

$$v(x^*) = \sum_j v(z_j)\mu(E_j) = \sum_j u(x_j, y_j)\mu(E_j) = W(g) = W(f)$$

as desired. The extension to all acts uses Axiom 6 and follows familiar arguments.

3. EUU in a Discrete Setting

To prove Theorem 1 above, we have ruled out the possibility that $\Omega$ is finite. However, in applications and when comparing the EUU model to existing alternatives, it is convenient to consider finite spaces. Let $S = \{1, \dots, n\}$ be the discrete state space and let $\Phi$ be the corresponding collection of discrete acts $\phi : S \to X$. The representation of a discrete EUU has two parameters, an interval utility $u$ and a probability $\pi$ on the collection of non-empty subsets of $S$. For any finite set $Y$, the function $\lambda$ is a probability on $Y$ if $\lambda : Y \to [0, 1]$ and $\sum_{y \in Y} \lambda(y) = 1$. Let $\mathcal{P}$ be the set of all nonempty subsets of $S$ and let $\Pi$ be the set of all probabilities on $\mathcal{P}$. Given any $\phi \in \Phi$, define the function $[\phi] : \mathcal{P} \to I$ as follows:

$$[\phi](a) = \left(\min_{s \in a} \phi(s),\; \max_{s \in a} \phi(s)\right)$$

for all $a \in \mathcal{P}$. A preference $\succeq$ (on $\Phi$) is a discrete EUU if there is an interval utility $u$ and a probability $\pi \in \Pi$ such that

$$U(\phi) = \sum_{a \in \mathcal{P}} u[\phi](a)\,\pi(a) \tag{6}$$

represents $\succeq$. Henceforth, we write $U = (\pi, u)$ if $U$ satisfies equation (6) and we let $\succeq^u_\pi$ denote the discrete EUU that this $U$ represents.

If $\pi(a) = 0$ for all non-singleton $a$, then

$$U(\phi) = \sum_{s \in S} v_u(\phi(s))\,\pi(\{s\})$$

and therefore $U$ reduces to expected utility. When $\pi(a) > 0$ for some non-singleton $a$, $\pi(a)$ reflects the decision maker's inability to reduce the uncertainty of the event $a$ to its components.

Adapting the EUU representation theorem to the finite setting is not immediate because the EUU representation relies on the existence of a rich class of ideal sets. Imposing the existence of such a collection of sets would be unduly restrictive in the finite setting.⁹ When the state space is finite, we only require that it be possible to interpret these states as a partition of the state space $\Omega$. This partition is described by an onto function $\rho : \Omega \to S$ and, therefore, the event $\rho^{-1}(s) \subset \Omega$ in the original model corresponds to state $s$ in the discrete model. An act $\phi \in \Phi$ in the discrete model represents an act in the original model that is measurable with respect to the partition $\rho$. The act $f \in \mathcal{F}$ is measurable with respect to $\rho$ if $f = \phi \circ \rho$ for some $\phi \in \Phi$, where $\phi \circ \rho$ denotes the composition of $\rho$ and $\phi$.
Proposition 1 below shows that the preference $\succeq$ on $\Phi$ is a restriction of an EUU preference $\succeq^u_\mu$ in the original model if and only if $\succeq$ is a discrete EUU.

Proposition 1: Fix a prior $\mu$ (on $\Omega$) and a preference $\succeq$ on $\Phi$. Then, $\succeq$ is a discrete EUU if and only if there is an interval utility $u$ and a partition $\rho$ such that $\phi \succeq \psi$ if and only if $\phi \circ \rho \succeq^u_\mu \psi \circ \rho$.

Proposition 1 shows that the prior in the original model does not constrain the range of possible discrete EUUs. For a fixed prior $\mu$ we can extend any discrete EUU to an original EUU by choosing an appropriate partition $\rho$ and an appropriate interval utility $u$.

⁹ Consider the following analogy: the proof of Savage's theorem relies on equiprobable partitions; yet it is needlessly restrictive to impose the existence of such partitions (i.e., to require that every nonnull state has the same probability), when applying subjective expected utility theory to finite state spaces.

To illustrate the relationship between the original model and the discrete model, let $S = \{1, 2\}$, let $A = \rho^{-1}(1)$, $A^c = \rho^{-1}(2)$ and let $\phi(1) = x$, $\phi(2) = y$ with $x \le y$. Hence $f = \phi \circ \rho = xAy$ is the act in the original model corresponding to $\phi$. In the previous section, we observed that for any $A$, there exist unique (up to a $\mu$-null event) ideal sets $E_1, E_2, E_3$ and a diffuse set $D$ such that $E_1 \cup (E_2 \cap D) = A$ and $E_3 \cup (E_2 \cap D^c) = A^c$. Thus, if $W$ represents $\succeq^u_\mu$, then

$$W(f) = \mu(E_1)u(x, x) + \mu(E_2)u(x, y) + \mu(E_3)u(y, y)$$

If we set $\pi(\{1\}) = \mu(E_1)$, $\pi(\{2\}) = \mu(E_3)$ and $\pi(\{1, 2\}) = \mu(E_2)$, then we get the discrete representation

$$U(\phi) = \pi(\{1\})u(x, x) + \pi(\{1, 2\})u(x, y) + \pi(\{2\})u(y, y)$$

for acts in $\Phi$. As we showed in the previous section, $\mu(E_1), \mu(E_2), \mu(E_3)$ describe the decision maker's uncertainty perception in $A, A^c$. Thus, in the discrete model, the probability measure $\pi$ describes the decision maker's uncertainty perception and the interval utility $u$ describes his uncertainty attitude.
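The discrete representation in equation (6) is straightforward to compute. The following is a minimal sketch, assuming that π is stored as a dictionary from frozensets of states to weights and that the interval utility has the illustrative form u(x, y) = 0.75x + 0.25y; the names `discrete_euu`, `phi` and `pi` and the numerical weights are hypothetical choices made for this example.

```python
from typing import Callable, Dict, FrozenSet


def discrete_euu(phi: Dict[int, float],
                 pi: Dict[FrozenSet[int], float],
                 u: Callable[[float, float], float]) -> float:
    """U(phi) = sum over nonempty events a of u(min phi on a, max phi on a) * pi(a)."""
    if abs(sum(pi.values()) - 1.0) > 1e-9:
        raise ValueError("the weights pi(a) must sum to 1")
    total = 0.0
    for event, weight in pi.items():
        if not event:
            raise ValueError("pi is defined on nonempty events only")
        prizes = [phi[s] for s in event]
        # [phi](a) is the interval from the worst to the best prize on a.
        total += u(min(prizes), max(prizes)) * weight
    return total


def u(x: float, y: float) -> float:
    """Assumed interval utility, chosen only for illustration."""
    return 0.75 * x + 0.25 * y


# Two-state example from the text: phi(1) = x = 0, phi(2) = y = 100.
phi = {1: 0.0, 2: 100.0}
pi = {frozenset({1}): 0.3, frozenset({1, 2}): 0.5, frozenset({2}): 0.2}
value = discrete_euu(phi, pi, u)  # about 32.5

# With all weight on singletons, U reduces to expected utility.
pi_eu = {frozenset({1}): 0.4, frozenset({2}): 0.6}
value_eu = discrete_euu(phi, pi_eu, u)  # about 60.0
print(value, value_eu)
```

Storing π only on the events with positive weight keeps the sum short; events absent from the dictionary are treated as having weight zero.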
The discrete setting is not rich enough to achieve the separation between uncertainty perception and uncertainty attitude according to the criterion in section 2. Therefore, we need to appeal to the original preference (on F) that induces the discrete preference to establish the desired separation. The same is true for subjective expected utility preferences in finite settings.

4. Uncertainty Aversion

The goal of this section is to define comparative uncertainty aversion for EUU theory and associate it with parameters of the utility function. Throughout, we consider discrete EUU preferences ≽ on Φ, the collection of discrete acts φ : S → X. By Proposition 1 above, ≽ = $\succeq^u_\pi$ for some (u, π). We write x for the constant act φ(s) = x and we write xay for the act φ that yields x if the event a ⊂ S occurs and y if a does not occur.

We say that one preference is more cautious than another if the former assigns every act a lower certainty equivalent (i.e., constant act) than the latter.

Definition: The preference ≽ is more cautious than the preference $\hat{\succeq}$ if $x \mathrel{\hat{\succeq}} \phi$ implies x ≽ φ for every φ ∈ Φ.

The interval utility u measures uncertainty attitude and thus we can define the following comparative measure of uncertainty aversion.

Definition: The interval utility u is more uncertainty averse than the interval utility û if $\succeq^u_\pi$ is more cautious than $\succeq^{\hat u}_\pi$ for every π.

Given any interval utility u, recall that $v_u(x) = u(x, x)$ for all x ∈ X. For x, y ∈ X such that x < y, let

$$\sigma_u^{xy} = \frac{y - v_u^{-1}(u(x, y))}{y - x}$$

The interval utility is monotone and therefore u(x, y) ∈ [u(x, x), u(y, y)]. This, in turn, implies that $\sigma_u^{xy} \in [0, 1]$. For x < y, we have $u(x, y) = v_u(\sigma_u^{xy} x + (1 - \sigma_u^{xy}) y)$. Hence, $\sigma_u^{xy} x + (1 - \sigma_u^{xy}) y$ is the certainty equivalent of the interval [x, y].

Proposition 2: Interval utility û is more uncertainty averse than u if and only if $v_{\hat u} \circ v_u^{-1}$ is concave and $\sigma_{\hat u}^{xy} \geq \sigma_u^{xy}$ for all x, y.

The function $v_u$ describes the interval utility for degenerate intervals [x, x].
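To make the weight σ concrete, the sketch below computes it and the implied certainty equivalent of an interval for a hypothetical family of interval utilities. The functional form u(x, y) = β v(x) + (1 − β) v(y) with v(x) = √x, and the parameter values 0.5 and 0.8, are assumptions made only for this illustration.

```python
import math


def v(x):
    """Hypothetical utility for degenerate intervals: v_u(x) = u(x, x)."""
    return math.sqrt(x)


def v_inv(t):
    """Inverse of v on the nonnegative reals."""
    return t * t


def make_u(beta):
    """Hypothetical interval utility u(x, y) = beta*v(x) + (1 - beta)*v(y)."""
    def u(x, y):
        return beta * v(x) + (1 - beta) * v(y)
    return u


def sigma(u, x, y):
    """sigma_u^{xy} = (y - v_u^{-1}(u(x, y))) / (y - x), defined for x < y."""
    return (y - v_inv(u(x, y))) / (y - x)


def certainty_equivalent(u, x, y):
    """The prize sigma*x + (1 - sigma)*y whose utility equals u(x, y)."""
    s = sigma(u, x, y)
    return s * x + (1 - s) * y


u_half = make_u(0.5)
u_cautious = make_u(0.8)

s_half = sigma(u_half, 0.0, 1.0)
s_cautious = sigma(u_cautious, 0.0, 1.0)
ce_half = certainty_equivalent(u_half, 0.0, 1.0)
```

Both utilities in the sketch share the same v, so v̂ ∘ v⁻¹ is the identity and the comparison in Proposition 2 reduces to comparing the two values of σ.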
As Proposition 2 shows, the more uncertainty averse preference has a more concave $v_u$. This part of the comparative measure mirrors the standard comparative measure of risk aversion for expected utility maximizers. For non-degenerate intervals, the more uncertainty averse interval utility has a lower certainty equivalent than the less uncertainty averse interval utility. This is the novel part that generalizes risk aversion to uncertainty aversion.

Next, we use our definition of "more uncertainty averse" to derive a measure for "more uncertain than." Fix π and consider two events a, b ⊂ S. If $\succeq^u_\pi$ prefers betting on a to betting on b for every u, then a is a better bet; that is, in a strong sense, a is more likely than b. On the other hand, if some u prefer a and others prefer b, then uncertainty attitude determines the decision maker's ranking of a and b; in particular, if more uncertainty averse u's prefer b to a while less uncertainty averse ones have the opposite ranking, then we say a is more uncertain than b.

Definition: Event a is more uncertain than b for π if there exist û, u such that û is more uncertainty averse than u and

$$y a x \succ^u_\pi y b x \quad \text{and} \quad y b x \succ^{\hat u}_\pi y a x$$

for x < y.

Next, we relate the comparative measure "more uncertain than" to the parameter π. Given any probability π on P, set $\pi_*(\emptyset) = 0$. Then, let

$$\pi_*(a) = \sum_{b \in P,\ b \subset a} \pi(b)$$

for all a ∈ P and let $\pi^*(a) = 1 - \pi_*(a^c)$ for all a ⊂ S. Note that $\pi^*(a) - \pi_*(a) \geq 0$ for any a.

Proposition 3: Event a is more uncertain than event b for π if and only if $\pi_*(b) > \pi_*(a)$ and $\pi^*(a) > \pi^*(b)$.

To illustrate Proposition 3, consider the following example: there are three states, S = {1, 2, 3}, π({1}) = 1/3, π({2}) = π({3}) = 0 and π({2, 3}) = 2/3. In this example, $\pi_*(\{1\}) = \pi^*(\{1\}) = 1/3$ while $\pi_*(\{2\}) = 0$ and $\pi^*(\{2\}) = 2/3$ and therefore state 2 is more uncertain than state 1.
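The lower and upper probabilities and the criterion of Proposition 3 are easy to check mechanically. The sketch below reproduces the three-state example with exact fractions; the function names `lower`, `upper` and `more_uncertain` are illustrative choices and do not come from the paper.

```python
from fractions import Fraction

STATES = frozenset({1, 2, 3})

# pi from the three-state example; events with zero mass are omitted.
PI = {
    frozenset({1}): Fraction(1, 3),
    frozenset({2, 3}): Fraction(2, 3),
}


def lower(a):
    """pi_*(a): total mass of the nonempty events contained in a."""
    return sum((w for b, w in PI.items() if b <= a), Fraction(0))


def upper(a):
    """pi^*(a) = 1 - pi_*(complement of a)."""
    return 1 - lower(STATES - a)


def more_uncertain(a, b):
    """Criterion of Proposition 3: a is more uncertain than b."""
    return lower(b) > lower(a) and upper(a) > upper(b)


state_1 = frozenset({1})
state_2 = frozenset({2})
```

The same functions apply unchanged to non-singleton events, since `lower` only tests set inclusion.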
Furthermore, if a = {1, 2} and b = {2, 3} then

$$\pi_*(b) = 2/3, \quad \pi^*(b) = 2/3, \qquad \pi_*(a) = 1/3, \quad \pi^*(a) = 1$$

and therefore a is more uncertain than b.

5. Uncertainty and the Ellsberg Paradox

In this section, we will relate EUU theory to observed behavior in various versions of the Ellsberg experiment. Our goal is not only to show that EUU theory is flexible enough to accommodate the Ellsberg paradox but also to take advantage of the separation between uncertainty perception and uncertainty attitude to relate a decision maker's propensity for Ellsberg-paradox behavior to his uncertainty aversion parameter.

Our general formulation of an Ellsberg experiment is as follows: there are two possible prizes y = 1 and x = 0. Given any event b ⊂ S, a bet is an act that delivers 1 if b occurs and 0 otherwise. Hence, we identify such an act with the event b. The experimenter elicits the decision maker's preferences over some collection of bets: $B \subset 2^S$. The subjects are told that one or more urns have each been filled with m balls of various colors. More specifically, the subjects are told how many different colors of balls are available and which particular color configurations are not allowed. An outcome s ∈ S is a color configuration (one color for each ball in each urn) and a draw from each urn.

For example, in one experiment the subjects may be told that there is one urn with three balls colored red, green or white. Furthermore, ball 1 is always red. With this description, the subjects understand that of the nine possible ($3^2$) color configurations for balls 2 and 3, only four are permitted: both green; ball 2 green and ball 3 white; ball 2 white and ball 3 green; and both white. Given these four possible ways to fill the urn with three balls, the experiment has 4 × 3 = 12 outcomes. A color event is the set of all outcomes associated with a particular color for the ball drawn.
The defining feature of an Ellsberg experiment is that for some events a ∈ B, the number of possible outcomes associated with a in each feasible configuration of the urn is fixed. For example, ex post (i.e., upon inspecting the contents of the urn) a = {green, white} has a 2/3 chance of winning in every configuration. We call such events experimentally unambiguous. In contrast, a bet on b = {red, green} has a 1/3 chance of winning in one configuration, a 2/3 chance in two configurations and is a sure winner in the final configuration. Hence, b is experimentally ambiguous. Note that the events b and a above are comparable in the sense that both contain the same number of elements of S. Note also that both of the notions above, i.e., experimentally unambiguous and comparable, are closed under complements. That is, if a is experimentally unambiguous (a is comparable to b), then so is $a^c$ (so are $a^c$ and $b^c$). Hence, an Ellsberg paradox is a situation in which the decision maker prefers every experimentally unambiguous bet to any comparable experimentally ambiguous bet so that we have a ≻ b and $a^c \succ b^c$, which is inconsistent with any betting preference that can be represented with a probability on S.

We normalize u(1, 1) = 1, u(0, 0) = 0 and u(0, 1) = z and note that $z = v_u(1 - \sigma_u^{01})$ measures the uncertainty aversion of the EUU decision maker in the Ellsberg experiment. Lower values of z correspond to greater uncertainty aversion. In the Ellsberg experiment, EUU preferences depend only on π and z. Hence, we write $\succeq^z_\pi$ rather than $\succeq^u_\pi$.

Let S = {1, . . . , n} where n = km. For t = 1, . . . , k and i = 1, . . . , m, the state s = (t − 1)m + i ∈ S represents the outcome in which the i'th ball has been drawn from an urn that has been filled according to the t'th configuration. Hence, we can identify S with the matrix $(s_{it})$ as described in the figure below:

$$\begin{array}{c|ccc} & \text{config. } 1 & \cdots & \text{config. } k \\ \hline \text{ball } 1 & s_{11} & \cdots & s_{1k} \\ \vdots & \vdots & \ddots & \vdots \\ \text{ball } m & s_{m1} & \cdots & s_{mk} \end{array}$$
Let |a| denote the cardinality of the set a and B be an algebra of subsets of S. For any event a, let

$$a_t = \{s \in a \mid s = (t - 1)m + i \text{ for some } i = 1, \ldots, m\}$$

be the outcomes in a associated with the t'th possible configuration; that is, the elements from the t'th column of the above matrix that are in a. An event a ∈ B is experimentally unambiguous if $\min_t |a_t| = \max_t |a_t|$; otherwise, it is experimentally ambiguous. Note that complements of experimentally unambiguous events are experimentally unambiguous and disjoint unions of experimentally unambiguous events are experimentally unambiguous. However, intersections of experimentally unambiguous events need not be experimentally unambiguous.¹⁰

¹⁰ Epstein and Zhang (2001) define unambiguous events and argue that the set of all unambiguous events need not be closed under intersections. Our notion of an experimentally unambiguous event supports their argument. The Epstein-Zhang argument is based on Zhang (2002)'s four-color urn example, which we discuss below.

Let A be the collection of all experimentally unambiguous events in B. The collection B is an Ellsberg experiment if there exist a ∈ A and b ∈ B\A such that |a| = |b|. Given any Ellsberg experiment B and preference ≽ on Φ, (B, ≽) is an Ellsberg Paradox if for all a, b ∈ B such that |a| = |b|,

$$a \sim b \text{ whenever } a, b \in A, \qquad a \succ b \text{ whenever } a \in A,\ b \notin A.$$

Proposition 4 below shows that for any Ellsberg experiment, there is an uncertainty perception that renders each experimentally ambiguous event more uncertain than every comparable experimentally unambiguous event. Moreover, the experiment yields a paradox for any decision maker with that perception and greater uncertainty aversion than a benchmark.

Proposition 4: For any Ellsberg experiment B, there exist π and $z^* > 0$ such that (i) a ∈ A, b ∈ B\A implies b is more uncertain than a whenever |a| = |b|, and (ii) $(B, \succeq^z_\pi)$ is an Ellsberg paradox whenever $z < z^*$.
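The definition of an experimentally unambiguous event amounts to counting, column by column, how many outcomes of the event lie in each configuration. The sketch below applies this test to the three-ball urn described earlier in this section (ball 1 always red, balls 2 and 3 green or white). The encoding of outcomes as (ball, configuration) index pairs and the helper names are assumptions of the sketch.

```python
# Color of each ball (rows) in each permitted configuration (columns):
# both green; ball 2 green, ball 3 white; ball 2 white, ball 3 green; both white.
COLORS = [
    ["r", "r", "r", "r"],  # ball 1
    ["g", "g", "w", "w"],  # ball 2
    ["g", "w", "g", "w"],  # ball 3
]
K = 4  # number of configurations


def color_event(colors):
    """All outcomes (ball index, configuration index) whose drawn ball has a color in `colors`."""
    return {
        (i, t)
        for i, row in enumerate(COLORS)
        for t, c in enumerate(row)
        if c in colors
    }


def column_counts(event):
    """|a_t| for each configuration t."""
    return [sum(1 for (_, t) in event if t == col) for col in range(K)]


def is_unambiguous(event):
    """Experimentally unambiguous: the same number of outcomes in every column."""
    counts = column_counts(event)
    return min(counts) == max(counts)


a = color_event({"g", "w"})
b = color_event({"r", "g"})
```

Here `a` and `b` have the same cardinality, so they are comparable in the sense used above, yet only `a` passes the column-count test.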
Next, we apply our definition of Ellsberg experiments and Proposition 4 to three canonical versions of the Ellsberg paradox.

The One-Urn Paradox: One ball is drawn from an urn that contains 3 balls. It is known that exactly one ball is red and the remaining 2 balls are either white or green. The exact number of white balls is not known.

Let S = {$s_{it}$} for i = 1, 2, 3, t = 1, 2, 3, 4. Suppose the three balls are numbered 1, 2, 3 and ball 1 is always red. Each column, t, depicts one possible color scheme for the remaining two balls; each row corresponds to a particular ball, 1, 2, or 3 being drawn. Hence,

$$S = \begin{pmatrix} r & r & r & r \\ w & w & g & g \\ w & g & w & g \end{pmatrix}$$

Let B = {r, g, w} be the three color events. All three color events have 4 elements but $r = \{s_{11}, s_{12}, s_{13}, s_{14}\}$ is experimentally unambiguous since $|r_t| = |\{s_{1t}\}| = 1$ for every t ∈ {1, 2, 3, 4}, while $w = \{s_{21}, s_{31}, s_{22}, s_{33}\}$ and $g = \{s_{32}, s_{23}, s_{24}, s_{34}\}$ are experimentally ambiguous since $|w_1| = |\{s_{21}, s_{31}\}| = 2 = |g_4|$ and $|w_4| = 0 = |g_1|$.

Next, we will find a probability π and a threshold $z^*$ to illustrate the statement in Proposition 4. Let π be such that $\pi(\{s_{it}\}) = \alpha$ for all $s_{it}$. Moreover, if a is a 4 element set that takes exactly one element from each column then π(a) = β. That is, if $|a_t| = 1$ for all t then π(a) = β. Note that there are $3^4 = 81$ sets with this property. Thus, choose α ≥ 0, β > 0 such that

$$\alpha = \pi(a) \text{ whenever } |a| = 1$$
$$\beta = \pi(a) \text{ whenever } \min_t |a_t| = \max_t |a_t| = 1$$
$$1 = 12\alpha + 81\beta$$

Bets on events with $|a_t| = 1$ for all t yield a 1/3 chance of winning in every configuration. Hence, those events are experimentally unambiguous. The construction of π above assigns β to every such event. Then,

$$\pi_*(r) = \sum_{a \subset r} \pi(a) = 4\alpha + \beta$$
$$\pi^*(r) = 1 - \sum_{a \subset r^c} \pi(a) = 1 - (8\alpha + 2^4 \beta) = 4\alpha + 65\beta$$
$$\pi_*(w) = \pi_*(g) = 4\alpha$$
$$\pi^*(w) = \pi^*(g) = 4\alpha + 69\beta$$

Hence, by Proposition 3, the events g and w are more uncertain than the event r. A similar calculation reveals that the event r ∪ w is more uncertain than the event w ∪ g.
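The four lower and upper probabilities above can be verified by enumerating the 12 singletons and the 81 sets that take one element from each column. The sketch below does this with exact fractions; the particular value α = 1/24 (which forces β = 1/162) is a hypothetical choice, and any α ≥ 0, β > 0 satisfying 12α + 81β = 1 would serve equally well.

```python
from fractions import Fraction
from itertools import product

ROWS, COLS = 3, 4
COLORS = [
    ["r", "r", "r", "r"],  # ball 1
    ["w", "w", "g", "g"],  # ball 2
    ["w", "g", "w", "g"],  # ball 3
]
STATES = frozenset((i, t) for i in range(ROWS) for t in range(COLS))

alpha = Fraction(1, 24)           # hypothetical mass on each singleton
beta = (1 - 12 * alpha) / 81      # implied by 1 = 12*alpha + 81*beta

pi = {}
for state in STATES:
    pi[frozenset({state})] = alpha
# One element from each column: choose a row for every configuration.
for picks in product(range(ROWS), repeat=COLS):
    pi[frozenset((i, t) for t, i in enumerate(picks))] = beta


def color_event(color):
    """All outcomes in which the drawn ball has the given color."""
    return frozenset((i, t) for (i, t) in STATES if COLORS[i][t] == color)


def lower(a):
    """pi_*(a): total mass of the events contained in a."""
    return sum((w for b, w in pi.items() if b <= a), Fraction(0))


def upper(a):
    """pi^*(a) = 1 - pi_*(complement of a)."""
    return 1 - lower(STATES - a)


r, w, g = color_event("r"), color_event("w"), color_event("g")
```

Because the enumeration never uses the specific value of α, changing it only rescales the results along the constraint 12α + 81β = 1.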
Recall that u(1, 1) = 1, u(0, 0) = 0 and u(0, 1) = z. The events that contain elements of both r and $r^c$ have total mass $\pi^*(r) - \pi_*(r) = 64\beta$, while those that contain elements of both g and $g^c$ have total mass $\pi^*(g) - \pi_*(g) = 69\beta$. Thus,

$$U(r) = (4\alpha + \beta)\,u(1, 1) + 64\beta\,u(0, 1) = 4\alpha + \beta + 64\beta z$$
$$U(g) = 4\alpha\,u(1, 1) + 69\beta\,u(0, 1) = 4\alpha + 69\beta z$$

so that U(r) > U(g) if and only if 1 > 5z and, therefore, the above example is an Ellsberg paradox for z < 1/5.

The Two-Urn Paradox: Urn I contains one red ball and one white ball; urn II contains two balls that are red or white and no more is known. One ball will be drawn from each urn.

Let m = 4 and k = 4 and therefore S = {1, . . . , 16}. Each column of S corresponds to color choices for the two balls in urn II. In column 1, both balls are white, in column 2, ball 1 is red and ball 2 is white, etc. Each row corresponds to a pair of draws, one from each urn. Ball 1 is chosen from urn I in the first two rows, while ball 2 is chosen from urn I in the last two rows. In odd rows, ball 1 is chosen from urn II while in even rows ball 2 is chosen from urn II. Combining the information about the composition of urn II and ball draws from both urns yields the following matrix:

$$S = \begin{pmatrix} rw & rr & rw & rr \\ rw & rw & rr & rr \\ ww & wr & ww & wr \\ ww & ww & wr & wr \end{pmatrix}$$

Let B be all combinations of color draws from the two urns. That is, B is the algebra generated by the partition {rr, rw, wr, ww}. Since the color draw from the first urn depends only on the number of the ball drawn, urn I color events are experimentally unambiguous; that is, rr ∪ rw and wr ∪ ww both contain two elements from each column. In contrast, urn II color events are experimentally ambiguous. For example, rw ∪ ww has the same number of elements as rr ∪ rw, contains the entire first column but does not intersect the fourth column. A construction analogous to the one for the single urn example above can be used to illustrate Proposition 4 in this example.

Zhang's Four-Color Urn: One ball is drawn from an urn with 2 balls; all balls are either red, white, green or orange. It is known that there is exactly one ball in each of the following two categories: (1) red or white and (2) red or green.
It follows from this information that there is also one ball in each of the following two categories: (3) orange or green and (4) orange or white. Zhang defines each of the four events above as unambiguous and concludes that the union of unambiguous events may not be unambiguous.

Let S = {$s_{ti}$} for t = 1, 2, 3, 4 and i = 1, 2 where t indexes the color of ball 1 (r, o, w or g) and i the number of the ball drawn (1 or 2). Let B be all color draws from the urn. If ball 1 is red then ball 2 is orange. Conversely, if ball 1 is orange then ball 2 is red. Therefore, columns 1 and 2 in the matrix below are r, o and o, r respectively. Similarly, if ball 1 is w then ball 2 is g and if ball 1 is g then ball 2 is w. Columns 3 and 4 of the matrix below describe the corresponding configurations of the urn.

$$S = \begin{pmatrix} r & o & w & g \\ o & r & g & w \end{pmatrix}$$

All two-color events have two elements. The events r ∪ o and w ∪ g are experimentally ambiguous; that is, these events do not contain the same number of elements from each column. All other two-color events contain exactly one element from each column.

Next, we find a probability π and a threshold $z^*$ to illustrate the statement in Proposition 4. Let π be the following probability. There is α ≥ 0, β > 0 such that

$$\alpha = \pi(a) \text{ whenever } |a| = 1$$
$$\beta = \pi(a) \text{ whenever } \min_t |a_t| = \max_t |a_t| = 1$$
$$1 = 8\alpha + 8\beta$$

We must show that the two-color event r ∪ o is more uncertain than any other two-color event. We have $\pi_*(r \cup o) = 4\alpha$, $\pi^*(r \cup o) = 4\alpha + 8\beta$ and $\pi_*(r \cup w) = 4\alpha + 2\beta$, $\pi^*(r \cup w) = 4\alpha + 6\beta$. Hence, the event r ∪ w is less uncertain than the event r ∪ o. The same is true for other two-color events. For this specification of the probability π, the four-color urn is an Ellsberg paradox if $z < z^* = 1/2$.

5.1 Machina Reversals

Recently, Machina (2009) showed that Choquet expected utility theory (and related models) cannot accommodate variations of the Ellsberg paradox that appear plausible and even natural.
In this subsection, we will show that within EUU theory the behavior described by Machina is synonymous with the nonseparability of u.

Next, we describe Machina's urn experiment: Let S = {1, 2, 3, 4}; to be concrete, suppose a ball will be drawn from an urn that is known to have 20 balls. It is also known that 10 of these balls are marked 1 or 2 and the other 10 balls are marked 3 or 4. We identify each φ ∈ Φ with $(\phi(1), \phi(2), \phi(3), \phi(4)) \in X^4$. Machina (2008) observes that if ≽ is any Choquet expected utility such that (w, x, y, z) ∼ (x, w, y, z) ∼ (y, z, x, w) for all w, x, y, z ∈ X, then

$$(x_1, x_3, x_2, x_4) \sim (x_1, x_4, x_2, x_3) \tag{7}$$

whenever $x_1 \geq x_2 \geq x_3 \geq x_4$. He notes that this indifference may be an undesirable restriction for a flexible model of decision making under uncertainty. Call it an M-reversal if a preference ≽ on Φ is not indifferent between $(x_1, x_3, x_2, x_4)$ and $(x_1, x_2, x_3, x_4)$ for some $x_1 \geq x_2 \geq x_3 \geq x_4$ despite satisfying (7). Then, Machina notes that CEU theory permits no M-reversals and argues that this may be an unwarranted restriction on a model of uncertainty.

Baillon, L'Haridon and Placido (2010) observe that other well-known models of uncertainty also preclude M-reversals. In particular, they note that α-maxmin expected utility and Klibanoff, Marinacci and Mukerji's (2005) smooth model of ambiguity also rule them out. Finally, Baillon et al. verify that Siniscalchi's (2009) vector valued expected utility model does permit M-reversals.

Recent experimental evidence reported in L'Haridon and Placido (2010) confirms Machina's intuition. L'Haridon and Placido show that over 70 percent of subjects reveal M-reversals. Below, we show that an EUU preference generates no M-reversals if and only if its interval utility is separable.

A function $w : X^2 \to \mathbb{R}$ is symmetric if w(x, y) = w(y, x) for all x, y ∈ X. We identify any symmetric w with its restriction to I.
Conversely, we identify any interval utility with $w_u$, its symmetric extension to $X^2$. That is, $w_u(x, y) = u(\min\{x, y\}, \max\{x, y\})$. A binary relation ≽ on Φ is a Machina preference if there exists a continuous, increasing and symmetric function w such that the function V defined by

$$V(x_1, x_2, x_3, x_4) = w(x_1, x_2) + w(x_3, x_4)$$

represents it. We write $\succeq_w$ to denote the Machina preference associated with w.

Let $\pi^o(\{1, 2\}) = \pi^o(\{3, 4\}) = 1/2$ and let $\pi^o(a) = 0$ for all a ∈ P such that {1, 2} ≠ a ≠ {3, 4}. Hence, for every interval utility u, the Machina preference $\succeq_{w_u}$ is identical to the EUU preference $\succeq^u_{\pi^o}$.

The interval utility u is separable if there are $v_1, v_2 : X \to \mathbb{R}$ such that

$$u(x, y) = v_1(x) + v_2(y)$$

for all (x, y) ∈ I.

Proposition 5: The EUU $\succeq^u_{\pi^o}$ has no M-reversals if and only if u is separable.

Proposition 5 shows that Machina reversals occur if the interval utility is not separable. The L'Haridon and Placido experiments show that the majority (roughly 2/3) of the subjects that reveal M-reversals prefer "packaging" the two extreme outcomes together. That is, for $x_1 > x_2 = x_3 > x_4$,

$$(x_1, x_4, x_2, x_3) \succ (x_1, x_3, x_2, x_4)$$

This pattern of preference is implied by an EUU preference with a strictly supermodular interval utility; that is, $u(x_4, x_1) + u(x_3, x_2) > u(x_3, x_1) + u(x_4, x_2)$ whenever $x_1 > x_2 \geq x_3 > x_4$.

6. Related Literature

One possible way to organize the literature on uncertainty and uncertainty aversion is to group models according to the extent to which uncertainty/ambiguity is built into the choice object. At one extreme, there are papers such as Gilboa (1987), Casadesus-Masanell, Klibanoff and Ozdenoren (2000), Epstein and Zhang (2001), the current paper and a number of others that study preferences over Savage acts over an unstructured state space. At the other extreme, there are papers that introduce novel choice objects designed to reflect the decision makers' perception of uncertainty.
For example, in Olszewski (2007) and Ahn (2008), the choice objects are sets of lotteries (i.e., probability distributions over prizes). Sets with a single lottery correspond to situations in which the decision maker can reduce all uncertainty to risk while sets with multiple lotteries depict Knightian uncertainty. Similarly, Jaffray (1989) investigates preferences over belief functions (i.e., totally monotone capacities) over prizes. Hence, unlike expected utility maximizers and the general class of nonexpected utility maximizers considered in Machina and Schmeidler (1992), Ahn's, Olszewski's and Jaffray's decision makers choose within a class of objects that contain lotteries but are richer. The new objects enable these authors to model perceptions of uncertainty that cannot be reduced to risk. These models are silent on how "real-life" prospects are reduced to these choice objects. That is, these models do not describe how Savage acts are identified with the new choice objects.

In between these two classes of models, there are those that partially build in ambiguity or at least a distinction between uncertainty and risk. Segal (1990), Klibanoff, Marinacci and Mukerji (2005), Nau (2006) and Ergin and Gul (2009) achieve the desired effect by structuring the state space; in the first three papers, uncertainty resolves in two stages; the first stage represents ambiguity, the second stage is risk. The remaining two papers assume that the state space has a product structure and identify one dimension with ambiguity and the other with risk.

The extensive literature on ambiguity models in the Anscombe-Aumann framework also falls into this intermediate category.
This literature includes Schmeidler (1989), who introduces Choquet expected utility, Gilboa and Schmeidler (1989), who introduce maxmin expected utility, the generalizations of maxmin expected utility, such as α-maxmin expected utility preferences (see Ghirardato and Marinacci (2001b)), variational preferences of Maccheroni, Marinacci and Rustichini (2006), the general uncertainty averse preferences of Cerreia-Vioglio, Maccheroni, Marinacci, and Montrucchio (2008), as well as Zhang's (2002) model of Choquet expected utility with inner probabilities, Lehrer's (2007) model of partially specified probabilities, and Siniscalchi's (2006) vector expected utility theory.

A full comparison between our model and each of the many papers in the ambiguity literature is not feasible. However, we will relate our model to the closest existing alternatives. Many of these comparisons are complicated by relatively minor differences¹¹ or differences in "technical" assumptions such as finite versus countable additivity. In the following section we provide a detailed comparison between EUU, CEU and MEU in finite settings.

¹¹ One such difference is that in many models, the set of prizes is arbitrary while many other papers, including the current one, consider an interval of real numbers.

Here, we discuss only the three models that are most closely related to EUU theory: As noted above, Jaffray (1989) provides a utility theory for belief functions over a finite set of prizes. Hence, Jaffray's model extends von Neumann-Morgenstern's expected utility theory by incorporating the decision maker's perception of ambiguity into the choice objects. A belief function that assigns probability 0 to both singleton sets {x}, {y} but assigns probability 1 to the set {x, y} depicts a situation in which the decision maker knows that he will end up with either x or y but views any remaining uncertainty as irreducible to risk.
Jaffray adopts the von Neumann-Morgenstern assumptions and imposes one additional assumption to characterize a class of preferences, call them Jaffray preferences, over capacities. To see the relationship between Jaffray's model and EUU theory, take any EUU preference $\succeq^u_\mu$, fix a finite subset of prizes $X^o$ and consider only acts that yield prizes in $X^o$. Then, we can associate a capacity κ over the set of all nonempty subsets of $X^o$ with each act f in a natural way by letting κ(Y) be the inner probability of the event $f^{-1}(Y)$ with respect to the prior µ. It can be shown that the EUU preference $\succeq^u_\mu$ will be indifferent between two acts that yield the same capacity through this procedure. Hence, an EUU preference induces a preference over capacities over prizes. It can also be shown that this induced preference will satisfy Jaffray's assumptions. Hence, EUU theory and Jaffray's model stand roughly in the same relationship as Savage's theory and von Neumann-Morgenstern theory: one takes as given lotteries (probability distributions in vNM theory, capacities in the Jaffray model) as the choice objects, the other starts with acts and shows that each act can be identified with a lottery in a natural way and ensures that each preference (EU or EUU) over acts induces a preference over lotteries (EU or Jaffray).

Zhang (2002) studies preferences in the Savage setting and Lehrer (2007) considers the Anscombe-Aumann framework with a finite state space. Ignoring the difference in the sets of prizes, in the state spaces, and in the underlying axioms, we can state the relationship among these two models and EUU theory as follows: each Lehrer representation can be identified with a Zhang representation (and vice versa) by identifying every partially specified probability with its inner probability extension to the set of all subsets of the state space.
Then, it can be verified that every Lehrer/Zhang representation is equivalent to an EUU representation for some interval utility u such that u(x, y) = u(x, y′) for all x, y, y′. Hence, Lehrer/Zhang preferences correspond to the subclass of EUU preferences for which the interval utility depends only on the lower end of the interval.

7. EUU, Choquet and α-Maxmin Expected Utility

In this section we provide a detailed comparison between EUU, Choquet expected utility theory (CEU) and α-maxmin expected utility theory (α-MEU) in finite settings. Propositions 6 and 7, below, show that a discrete EUU $\succeq^u_\pi$ with an interval utility u such that $u(x, y) = \beta v_u(x) + (1 - \beta) v_u(y)$ for some β ∈ [0, 1] is both a CEU preference and an α-MEU preference.

7.1 Choquet Expected Utility

A function $\kappa : 2^S \to [0, 1]$ is a capacity if (i) κ(∅) = 0, κ(S) = 1 and (ii) a ⊂ b implies κ(a) ≤ κ(b). A binary relation ≽ on Φ is a CEU preference if there exist a capacity κ and a continuous strictly increasing function $v : X \to \mathbb{R}$ such that the function $V : \Phi \to \mathbb{R}$ defined by

$$V(\phi) = \int v(\phi)\,d\kappa$$

represents ≽, where the integral above denotes the Choquet integral. Let $\succeq_{\kappa v}$ denote the CEU associated with capacity κ and utility index v.

The proposition below establishes that any CEU with a capacity of the form $\kappa = \alpha \pi_* + (1 - \alpha) \pi^*$ and utility index v is identical to the EUU with probability π and interval utility u such that u(x, y) = αv(x) + (1 − α)v(y).

Proposition 6: If $\kappa = \alpha \pi_* + (1 - \alpha) \pi^*$ and u(x, y) = αv(x) + (1 − α)v(y) for all (x, y) ∈ I, then $\succeq_{\kappa v} = \succeq^u_\pi$.

It follows from Proposition 6 and our analysis of uncertainty aversion for EUU preferences that among CEUs that are also EUUs, $\succeq_{\hat\kappa \hat v}$ is more cautious than $\succeq_{\kappa v}$ whenever $\hat v \circ v^{-1}$ is concave, $\hat\kappa = \hat\alpha \pi_* + (1 - \hat\alpha) \pi^*$, $\kappa = \alpha \pi_* + (1 - \alpha) \pi^*$ and $\hat\alpha \geq \alpha$.
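The equality asserted in Proposition 6 can be checked numerically on a small example. The sketch below computes a finite Choquet integral with respect to κ = απ_* + (1 − α)π^* and compares it with the discrete EUU value for u(x, y) = αv(x) + (1 − α)v(y). The three-state π, the linear index v, the weight α = 7/10 and the act `phi` are hypothetical choices made only for illustration.

```python
from fractions import Fraction

STATES = frozenset({1, 2, 3})
PI = {
    frozenset({1}): Fraction(1, 3),
    frozenset({2, 3}): Fraction(2, 3),
}
ALPHA = Fraction(7, 10)


def v(x):
    """Hypothetical utility index (linear)."""
    return x


def lower(a):
    """pi_*(a): total mass of the events contained in a."""
    return sum((w for b, w in PI.items() if b <= a), Fraction(0))


def upper(a):
    """pi^*(a) = 1 - pi_*(complement of a)."""
    return 1 - lower(STATES - a)


def capacity(a):
    """kappa = alpha * pi_* + (1 - alpha) * pi^*."""
    return ALPHA * lower(a) + (1 - ALPHA) * upper(a)


def choquet(phi):
    """Choquet integral of v(phi): rank states from best to worst and weight
    each utility by the capacity increment of the growing upper set."""
    ordered = sorted(STATES, key=lambda s: v(phi[s]), reverse=True)
    total, previous, top = Fraction(0), Fraction(0), set()
    for s in ordered:
        top.add(s)
        current = capacity(frozenset(top))
        total += v(phi[s]) * (current - previous)
        previous = current
    return total


def euu(phi):
    """Discrete EUU value with u(x, y) = alpha*v(x) + (1 - alpha)*v(y)."""
    total = Fraction(0)
    for a, weight in PI.items():
        low = min(phi[s] for s in a)
        high = max(phi[s] for s in a)
        total += weight * (ALPHA * v(low) + (1 - ALPHA) * v(high))
    return total


phi = {1: Fraction(3), 2: Fraction(1), 3: Fraction(2)}
```

Exact fractions are used so that the two values can be compared with `==` rather than with a floating-point tolerance.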
7.2 α-Maxmin Expected Utility

For α ∈ [0, 1], the binary relation ≽ on Φ is an α-MEU preference if there exists a compact set of probabilities ∆ and a continuous strictly increasing $v : X \to \mathbb{R}$ such that the function V defined by

$$V(\phi) = \alpha \min_{\lambda \in \Delta} \sum_{s \in S} v(\phi(s))\lambda(s) + (1 - \alpha) \max_{\lambda \in \Delta} \sum_{s \in S} v(\phi(s))\lambda(s)$$

represents ≽. We let $\succeq_{\alpha \Delta v}$ denote the α-MEU with parameters α, ∆, v.

Let $\Delta_S$ be the set of all probabilities on S. For any nonempty a ⊂ S, let

$$\Delta_a = \Big\{\lambda \in \Delta_S \ \Big|\ \sum_{s \in a} \lambda(s) = 1\Big\}$$

For any π ∈ Π, define $\Delta_\pi \subset \Delta_S$ as follows:

$$\Delta_\pi = \sum_{a \in P} \pi(a)\Delta_a$$

The proposition below establishes that any α-MEU with the set of probabilities $\Delta_\pi$ is identical to the EUU with the probability π and the interval utility u such that u(x, y) = αv(x) + (1 − α)v(y).

Proposition 7: If $\Delta = \Delta_\pi$ and u(x, y) = αv(x) + (1 − α)v(y) for all (x, y) ∈ I, then $\succeq_{\alpha \Delta v} = \succeq^u_\pi$.

Note that Propositions 6 and 7 identify the same class of EUU preferences and therefore identify preferences that are both CEU and α-MEU. Again, it follows from Proposition 7 and our analysis of uncertainty aversion for EUU preferences that among α-MEUs that are also EUUs, $\succeq_{\hat\alpha \Delta \hat v}$ is more cautious than $\succeq_{\alpha \Delta v}$ whenever $\hat v \circ v^{-1}$ is concave and $\hat\alpha \geq \alpha$.

8. Appendix A: Preliminary Results

8.1 Ideal Splits

For the prior µ, let

$$\mu_*(A) = \sup_{E \in \mathcal{E}_\mu,\ E \subset A} \mu(E)$$

Since µ is a prior this sup is attained. That is, there exists E ⊂ A such that $\mu_*(A) = \mu(E)$. Note that the E with this property is unique up to a set of measure 0. Call $E \in \mathcal{E}_\mu$ the core of A if it has this property; i.e., E ⊂ A, $E \in \mathcal{E}_\mu$ and $\mu(E) = \mu_*(A)$.

Definition: Let $E \in \mathcal{E}_\mu$, N = {1, . . . , n} and $\{A_i\}_{i \in N}$ be a finite partition of E. Let $\mathcal{N}$ be the set of all nonempty subsets of N and for $J \in \mathcal{N}$, let $\mathcal{N}(J) = \{L \in \mathcal{N} \mid L \subset J\}$. A pairwise disjoint collection $\{E_*^J\}_{J \in \mathcal{N}} \subset \mathcal{E}_\mu$ of subsets of E such that

$$\bigcup_{L \in \mathcal{N}(J)} E_*^L \subset \bigcup_{i \in J} A_i \tag{A1}$$

and

$$\mu_*\Big(\bigcup_{i \in J} A_i\Big) = \sum_{L \in \mathcal{N}(J)} \mu(E_*^L) \tag{A2}$$

is called an ideal split of $\{A_i\}_{i \in N}$.

Lemma A0: (i) Any partition $A_1, A_2$ of $E \in \mathcal{E}_\mu$ has an ideal split.
(ii) Any partition $A_1, A_2, A_3$ of $E \in \mathcal{E}_\mu$ such that $A_3 = E \cap D$ for some $D \in \mathcal{D}$ has an ideal split such that $E_*^J = \emptyset$ whenever J ≠ {1, 3}, {2, 3}, {1, 2, 3}.

Proof: For both (i) and (ii), assume µ(E) ≠ 0 or else there is nothing to prove. (i) Let $E_*^{\{1\}}$ and $E_*^{\{2\}}$ be the cores of $A_1$ and $A_2$ respectively and let $E_*^{\{1,2\}} = E \setminus (E_*^{\{1\}} \cup E_*^{\{2\}})$. It is easy to check that $E_*^{\{1\}}, E_*^{\{2\}}, E_*^{\{1,2\}}$ is the desired ideal split.

(ii) Let $\hat E_*^{\{1\}}$ and $\hat E_*^{\{2\}}$ be the cores of $A_1 \cup D$ and $A_2 \cup D$ respectively. Note that $\hat E_*^{\{1\}} \cap \hat E_*^{\{2\}} \subset D$ and therefore $\mu(\hat E_*^{\{1\}} \cap \hat E_*^{\{2\}}) = 0$. Define $E_*^{\{1,3\}} = \hat E_*^{\{1\}} \setminus \hat E_*^{\{2\}}$, $E_*^{\{2,3\}} = \hat E_*^{\{2\}} \setminus \hat E_*^{\{1\}}$, $E_*^{\{1,2,3\}} = E \setminus (\hat E_*^{\{1\}} \cup \hat E_*^{\{2\}})$, $E_*^J = \emptyset$ for all other J. Note that $\{E_*^J\}$ is the desired ideal split.

Lemma A1: Every finite partition $\{A_i\}_{i \in N}$ of every $E \in \mathcal{E}_\mu$ has an ideal split. If $\{E_*^J\}$ and $\{\tilde E_*^J\}$ are two ideal splits of $\{A_i\}_{i \in N}$, then $\mu(E_*^J \setminus \tilde E_*^J) = 0$ for all J.

Proof: Assume µ(E) ≠ 0 and |N| > 1 or else there is nothing to prove. We will prove the result by induction. First, let n = |N| = 2 and note that the result follows from part (i) of Lemma A0. Now suppose the result is true for n ≥ 2 and consider a partition $\{A_i\}_{i \in K}$ of E for K = {1, . . . , n + 1}. Define $\hat A_i = A_i$ for i < n and $\hat A_n = A_n \cup A_{n+1}$. Let N = {1, . . . , n}. By the inductive hypothesis, there exists an ideal split $\{\hat E_*^J\}$ of $\{\hat A_i\}_{i \in N}$. For J such that n ∉ J, define $E_*^J = \hat E_*^J$. For J ∈ {{n}, {n + 1}, {n, n + 1}}, define $B_1 = \hat E_*^{\{n\}} \cap A_n$, $B_2 = \hat E_*^{\{n\}} \cap A_{n+1}$ and let $\tilde E_*^{\{1\}}, \tilde E_*^{\{2\}}, \tilde E_*^{\{1,2\}}$ be the ideal split of $B_1, B_2$. Then, set $E_*^{\{n\}} = \tilde E_*^{\{1\}}$, $E_*^{\{n+1\}} = \tilde E_*^{\{2\}}$, and $E_*^{\{n,n+1\}} = \tilde E_*^{\{1,2\}}$.

For J such that |J| ≥ 2 and n ∈ J, let $J^+ = J \cup \{n + 1\}$ and $J^- = J \setminus \{n\}$. Choose any $\hat D \in \mathcal{D}$ and let

$$D_J = \big((\Omega \setminus \hat E_*^J) \cap \hat D\big) \cup \Big(\hat E_*^J \cap \bigcup_{i \in J^-} \hat A_i\Big)$$

and note that $D_J \in \mathcal{D}$. Then, let $\{\bar E_*^{\{1,3\}}(J), \bar E_*^{\{2,3\}}(J), \bar E_*^{\{1,2,3\}}(J)\}$ be the ideal split of $B_1 = A_n \cap \hat E_*^J$, $B_2 = A_{n+1} \cap \hat E_*^J$, and $B_3 = D_J \cap \hat E_*^J$ that is guaranteed by Lemma A0(ii).
Let $E_*^J = \bar E_*^{\{1,3\}}(J)$, $E_*^{J^- \cup \{n+1\}} = \bar E_*^{\{2,3\}}(J)$ and $E_*^{J^+} = \bar E_*^{\{1,2,3\}}(J)$. Verifying that $\{E_*^J\}_{J \subset K,\ J \neq \emptyset}$ is the desired ideal split is tedious but straightforward. That two ideal splits are identical up to a set of measure 0 follows from the definition of an ideal split.

Definition: For $f \in F^o$ and $x_i > x_{i+1}$, let $\{x_1, \ldots, x_n\}$ be the range of f. Then, let $A_i(f) = f^{-1}(x_i)$ for i ∈ N = {1, . . . , n} and let $\{E_*^J(f)\}$ be the ideal split of $\{A_i\}$.

Lemma A2: For all $f \in F^o$,

$$[f]_1(\omega) = \min\{f(\hat\omega) \mid \omega \in E_*^J(f)\}$$
$$[f]_2(\omega) = \max\{f(\hat\omega) \mid \omega \in E_*^J(f)\}.$$

Proof: Let $\mathbf{f}(\omega) = (\min_{i \in J} z_i, \max_{i \in J} z_i)$ whenever $\omega \in E_*^J$. For $\omega \notin \bigcup_J E_*^J$, let $\mathbf{f}(\omega) = (z_1, z_1)$. That $\mathbf{f} = [f]$ follows immediately from the definition of an ideal split.

8.2 Proof of Lemma 1

Lemma A2 proves the result for simple acts. Consider a general act f ∈ F. Let w = m − l and $z_i^n = l + w i 2^{-n}$ for all $i = 0, 1, \ldots, 2^n$. For any x, y ∈ X, let $i(n, x) = \max\{i \mid z_i^n \leq x\}$ and $j(n, y) = \min\{j \mid z_j^n \geq y\}$. The function i is increasing in both arguments while j is decreasing in the first argument and increasing in the second argument. Let $g^n(\omega) = i(n, f(\omega))$ and $h^n(\omega) = j(n, f(\omega))$. Since $g^n$ and $h^n$ are simple functions, $[g^n]$ and $[h^n]$ exist by Lemma A2. Since $h^n(\omega)$ is a decreasing sequence, so is $[h^n]_i(\omega)$ for i = 1, 2 and therefore has a limit $[h]_i(\omega)$. Similarly, for i = 1, 2, $[g^n]_i(\omega)$ is an increasing sequence and therefore has a limit $[g]_i(\omega)$. Since $[h^n]_i(\omega) - [g^n]_i(\omega) \leq w 2^{-n}$ for i = 1, 2, it follows that $([g]_1, [g]_2) = ([h]_1, [h]_2)$ where $[g]_i, [h]_i$ are the pointwise limits of $[g^n]_i, [h^n]_i$ for i = 1, 2.

We claim that $[f] = ([h]_1, [h]_2)$. To see this, first note that for all n

$$\mu(\{\omega \in \Omega \mid [g^n]_1(\omega) \leq f(\omega) \leq [h^n]_2(\omega)\}) = 1$$

since $h^n \geq f \geq g^n$.
Then, ([g]1 , [g]2 ) = ([h]1 , [h]2 ) implies µ({ω ∈ Ω | [h]1 (ω) ≤ f (ω) ≤ [h]2 (ω)}) = 1 Next, observe that if [g] satisfies µ({ω ∈ Ω | [g]1 (ω) ≤ f (ω) ≤ [g]2 (ω)}) = 1 then for all n µ({ω ∈ Ω | [g]1 (ω) ≤ [h]n1 (ω), [g]n2 (ω) ≤ [g]2 (ω)}) = 1 and therefore µ({ω ∈ Ω | [g]1 (ω) ≤ [h]1 (ω) ≤ [h]2 (ω) ≤ [g]2 (ω)}) = 1 Lemma A3: A countably additive probability is convex-ranged if it is nonatomic. Proof: Let E+ = {E ∈ E | π(E) > 0}. For E ∈ E, let ½ r(E) = inf E1 ,...,En max i π(Ei ) π(E) ¾ q = sup r(E) E∈E+ where the inf is taken over all finite partitions of E. First, we note that q is either 1 or 0. To see this, note that if q ∈ (0, 1), then we can find δ ∈ (0, 1) such that δ 2 < q < δ. By definition, there must exist E such that r(E) > δ 2 and a partition E1 , . . . , En of E such that π(Eij ) π(Ei ) π(Ei ) π(E) < δ. Also, there must exists partitions Ei1 , . . . , Eim for each i such that < δ. Then, note that {Eij } is a partition of E such that π(Eij ) π(E) < δ 2 , contradicting the fact that r(E) > δ 2 . Suppose q = 1, then there exists E ∈ E+ , such that r(E) > 0 and hence r(Ω) ≥ r(E)π(E) > 0. Then, we can repeat the argument above to conclude that r(Ω) = 1, contradicting the fact that π is nonatomic. Hence, we conclude that q = 0. Note that if {Ei } is a partition of E such that π(E) < ², then there exist E 0 = Sk i=1 Ei such that π(E 0 ) ≤ r < π(E 0 ) + ². Since q = 0, we can construct such a partition for any 34 E and ² > 0. Hence, we can construct a sequence of disjoint sets {Ek0 } such that Ek0 ⊂ E Sm S∞ and k=1 Ek0 > π(E) − ²m . Since π is countably additive, we have π( k=1 Ek0 ) = r and S∞ 0 k=1 Ek ⊂ E as desired. A set D is diffuse if µ∗ (D) = µ∗ (Dc ) = 1. Let D be the set of all diffuse sets. Lemma A4: Assume the continuum hypothesis holds and µ is a prior. Then, for any natural number n, there exists a partition (D1 , . . . , Dn ) of Ω such that Di ∈ D for i = 1, . . . , n. 
Proof: Birkhoff (1967) page 266, Theorem 13 proves the following: under the continuum hypothesis, no nontrivial (i.e., not identically equal to 0) measure such that every singleton has measure 0 can be defined on the algebra of all subsets of the continuum. We will use Birkoff’s result to establish that Ω must have a nonmeasurable subset. That is, there exists A ⊂ Ω such that A ∈ / Eµ . Since µ is a prior, by Lemma A3 above, it is convex valued. Hence, we can construct a random variable, ψ, that has a uniform distribution on the interval [0, 1] on this probability space. Define µ̂(R) = µ(ψ −1 (R)) for every R ⊂ [0, 1]. If Eµ contains every subset of Ω, µ̂ defines a measure on the set of all subsets of the unit interval. Moreover, since ψ has a uniform distribution, µ̂({x}) = 0 for all x ∈ [0, 1], contradicting Birkoff’s result. {1} {1,2} {2} Let E∗ (A), E∗ (A), E∗ {1,2} and let α = supA⊂Ω µ(E∗ (A) be an ideal split of {A1 , A2 } for A1 = A and A2 = Ac (A)). By Lemma A1, α is well defined. Since µ is a prior by the argument above, there exists A ⊂ Ω such that A ∈ / Eµ . Hence, α > 0. To establish that α is attained, consider a sequence of sets A(n) such that {1,2} µ(E∗ S (A(n))) > α − 1 n {1,2} (A(j))) and set E(0) = ∅. Define B(n) = E∗ S S {1,2} {1,2} (A) (B(n)) ⊂ E∗ [E(n) ∩ A(n)]\E(n − 1) and let A = n B(n). Note that n E∗ for all n = 1, 2, . . .. Let E(n) = j≤n and therefore µ(E{1,2} (A)) = α as desired. {1,2} But then, if α < 1, choose A such that µ(E∗ {1,2} E∗ (A)) = α and define A(1) = A ∩ {1,2} (A). Let B be any nonmeasurable subset of Ω\E∗ 35 {1,2} and note that µ(E∗ (A ∪ {1,2} B)) > µ(E∗ {1,2} (A)) = α a contradiction. Hence, there exists A such that µ(E∗ (A)) = 1. Clearly, such an A is diffuse. Next, we will show any diffuse set can be partitioned into two diffuse sets. Then, a simple inductive argument yields part concludes the proof. Let D be any diffuse set and define Σ1 = {E ∩ D | E ∈ E} and µ1 (E ∩ D) = µ(E) for all E ∈ Eµ . 
Note that since $D$ is diffuse, $E \cap D = E' \cap D$ implies that $E$ and $E'$ differ by a set of measure 0. Hence, $\mu_1$ is well-defined. It is easy to check that $\mu_1$ is a countably additive probability measure on $\Sigma_1$ and $\mu_1(\{s\}) = 0$ for $s \in D$. Therefore, $D$ cannot be countable. Then, by the continuum hypothesis, $D$ must have the cardinality of the continuum. Repeating the argument yields a diffuse subset $D_1$ of $D$. Then, for any $E$ such that $\mu(E) > 0$, we have $\mu_1(E \cap D) > 0$ and therefore $E \cap D_1 \neq \emptyset$. A symmetric argument yields $E \cap (D \setminus D_1) \neq \emptyset$. Hence, $D_1$ and $D \setminus D_1$ are diffuse in $\Omega$.

8.3 Proof of Lemma 2

Since $\mu$ is a prior, by Lemma A4, there exists a diffuse set $D$. Let $\mathbf{f} = (f_1, f_2) \in \mathbf{F}$ and $f = f_1 D f_2$. We claim that $[f] = \mathbf{f}$. Note that $f_1(\omega) \le f(\omega) \le f_2(\omega)$ for all $\omega$. For any real-valued function $g$ on $\Omega$, if there exists $E \in \mathcal{E}_\mu$ such that $\mu(E) > 0$ and $g(\omega) > f_1(\omega)$ for all $\omega \in E$, then, since $D$ is diffuse, we have $g(\omega) > f_1(\omega) = f(\omega)$ for some $\omega \in D \cap E$. Therefore, $\mathbf{g} \in \mathbf{F}$ and $g_1(\omega) \le f(\omega)$ for all $\omega$ implies $\mu(\{\omega \mid g_1(\omega) \le f_1(\omega)\}) = 1$. A symmetric argument shows that $\mathbf{g} \in \mathbf{F}$ and $g_2(\omega) \ge f(\omega)$ for all $\omega$ implies $\mu(\{\omega \mid g_2(\omega) \ge f_2(\omega)\}) = 1$.

9. Appendix B: Proof of Theorem 1

The proof is divided into a series of lemmas. It is understood that Axioms 1-6 hold throughout.

Definition: A set $E$ is left (right) ideal if $fEh \succeq gEh$ implies $fEh' \succeq gEh'$ ($hEf \succeq hEg$ implies $h'Ef \succeq h'Eg$). Let $\mathcal{E}^l$ and $\mathcal{E}^r$ be the collections of left and right ideal sets respectively.

Lemma B0: $\mathcal{E}^l \cap \mathcal{E}^r = \mathcal{E}$.

Proof: That $\mathcal{E}^l \cap \mathcal{E}^r \subset \mathcal{E}$ is obvious. Suppose $E \in \mathcal{E}$ and assume $fEh \succeq gEh$. Let $f^* = fEh$ and $g^* = gEh$. Then, $f^*Eh = fEh \succeq gEh = g^*Eh$ and hence $f^*Eh \succeq g^*Eh$. Also, $hEf^* = h = hEg^*$ and hence $hEf^* \succeq hEg^*$ and therefore $f^*Eh' \succeq g^*Eh'$ and $h'Ef^* \succeq h'Eg^*$ since $E \in \mathcal{E}$. That is, $fEh' = f^*Eh' \succeq g^*Eh' = gEh'$ and hence $E \in \mathcal{E}^l$. A symmetric argument establishes that $E \in \mathcal{E}^r$ and therefore $\mathcal{E} = \mathcal{E}^l \cap \mathcal{E}^r$.

Lemma B1: (i) $f(s) \ge g(s)$ for all $s \in \Omega$ implies $f \succeq g$. (ii) $f \succ g$ implies $f \succ z \succ g$ for some $z \in X$.
(iii) $f_n, g_n \in \mathcal{F}$, $f_n$ converges uniformly to $f$, $g_n$ converges uniformly to $g$, $f \succ g$ implies $f_n \succ g_n$ for some $n$. (iv) $f_n, g_n \in \mathcal{F}^e$, $f_n$ converges pointwise to $f$, $g_n$ converges pointwise to $g$, $f \succ g$ implies $f_n \succ g_n$ for some $n$. (v) If $E \in \mathcal{E}_+$ and $y > x$, then $yEh \succ xEh$ for all $h \in \mathcal{F}^e$. (vi) If $E \in \mathcal{E}_+$ and $f \in \mathcal{F}$, then there exists a unique $c_E(f) \in X$ such that $c_E(f)Ef \sim f$.

Proof: To prove (i), let $f_n = \frac{1}{n} m + \frac{n-1}{n} f$ and $g_n = \frac{1}{n} l + \frac{n-1}{n} g$. Then, $f_n$ converges to $f$ uniformly and $g_n$ converges to $g$ uniformly. By Axiom 2, $f_n \succ g_n$. Then, by Axiom 6, $f \succeq g_n$ and applying Axiom 6 again yields $f \succeq g$ as desired.

To prove (ii), assume $f \succ g$ and let $y = \inf\{z \in X \mid z \succeq f\}$ and $x = \sup\{z \in X \mid g \succeq z\}$. By (i) above, $x$ and $y$ are well-defined. Axiom 6 ensures that $y \sim f$ and $x \sim g$ and therefore $y \succ x$. Then, for $z = \frac{x+y}{2}$, we have $f \succ z \succ g$.

To prove (iii), let $f \succ g$ and apply (ii) three times to get $z, y, x$ such that $f \succ z \succ y \succ x \succ g$. Axiom 6 ensures that $f_n \succ y$ and $y \succeq g_n$ for all $n$ large enough. Therefore, $f_n \succ g_n$ for all such $n$. An analogous argument proves (iv).

To prove (v), consider $E \in \mathcal{E}_+$, $h \in \mathcal{F}^e$ and $x < y$. Then, there exist $f, g, h'$ such that $fEh' \succ gEh'$, which implies that $mEh' \succ lEh'$ by part (i) above. Hence, $m \succ lEm$, which implies $y \succ xEy$ by Axiom 4, which then implies $yEh \succ xEh$ as desired.

Finally, to prove (vi), let $z = \inf\{x \in X \mid xEf \succeq f\}$. By part (i), $mEf \succeq f$ and hence $z$ is well-defined. Axiom 6 and part (v) ensure that $zEf \sim f$ and also that $y \neq z$ implies $yEf \not\sim f$. Hence, $z = c_E(f)$.

Lemma B2: The collection $\mathcal{E}$ is a $\sigma$-field.

Proof: First, we will show that $\mathcal{E}$ is a field. That $E \in \mathcal{E}$ implies $E^c \in \mathcal{E}$ is obvious, as is the fact that $\emptyset \in \mathcal{E}$. Hence, to show that $\mathcal{E}$ is a field, we need to verify that $E, E' \in \mathcal{E}$ implies $E \cap E' \in \mathcal{E}$. Suppose $E, \hat{E} \in \mathcal{E}$. Then, by Lemma B0, $E, \hat{E} \in \mathcal{E}^l \cap \mathcal{E}^r$. We will first show that $E \cap \hat{E} \in \mathcal{E}^l$. Suppose $f(E \cap \hat{E})h \succeq g(E \cap \hat{E})h$. Note that $f(E \cap \hat{E})h = (fEh)\hat{E}h$. Since $\hat{E} \in \mathcal{E}$, we have $(fEh)\hat{E}h' \succeq (gEh)\hat{E}h'$. Next, observe that $(fEh)\hat{E}h' = (f\hat{E}h')E(h\hat{E}h')$.
Since $E \in \mathcal{E}$, we have $f(E \cap \hat{E})h' = (f\hat{E}h')E(h'\hat{E}h') \succeq (g\hat{E}h')E(h'\hat{E}h') = g(E \cap \hat{E})h'$ and hence $E \cap \hat{E} \in \mathcal{E}^l$ as required. To conclude the proof, we will show that $E \cap \hat{E} \in \mathcal{E}^r$. But $E \cap \hat{E} \in \mathcal{E}^r$ if and only if $(E \cap \hat{E})^c \in \mathcal{E}^l$; that is, $E^c \cup \hat{E}^c \in \mathcal{E}^l$. Since we know that $E^c \in \mathcal{E}$ if and only if $E \in \mathcal{E}$, to conclude the proof it is enough to show that $E \cup \hat{E} \in \mathcal{E}^l$ whenever $E, \hat{E} \in \mathcal{E}$. Moreover, since $E \cup \hat{E} = (E \cap \hat{E}^c) \cup \hat{E}$ and since we have already shown that $\hat{E} \in \mathcal{E}$ implies $\hat{E}^c \in \mathcal{E}$ and that $E, \hat{E}^c \in \mathcal{E}$ implies $E \cap \hat{E}^c \in \mathcal{E}^l$, to complete the proof that $E, \hat{E} \in \mathcal{E}$ implies $E \cup \hat{E} \in \mathcal{E}^l$, it is enough to consider disjoint $E, \hat{E}$.

Let $E, \hat{E} \in \mathcal{E}$, $E \cap \hat{E} = \emptyset$ and let $\tilde{E} = E \cup \hat{E}$. We will show that $\tilde{E} \in \mathcal{E}^l$. We assume that $E, \hat{E} \in \mathcal{E}_+$ since otherwise $\tilde{E} \in \mathcal{E}$ is immediate. Throughout the remainder of the proof of this lemma, we let $(h_1, h_2, h_3)$ denote the act $(h_1 E h_2)\tilde{E}h_3$ for all $h_1, h_2, h_3 \in \mathcal{F}$.

Fact 1: Suppose (i) $(h_1, h_2, h_3) \sim (h_1', h_2', h_3)$ and (ii) $(h_1', h_2', h_3') \sim (h_1, h_2', h_3)$. Then $(h_1', h_2', h_3') \sim (h_1, h_2, h_3')$.

Observe that since $\hat{E} \in \mathcal{E}$, (ii) implies (iii) $(h_1', h_2, h_3') \sim (h_1, h_2, h_3)$. Then, (i) and (iii) yield (iv) $(h_1', h_2, h_3') \sim (h_1', h_2', h_3)$. Since $E \in \mathcal{E}$, (iv) implies (v) $(h_1, h_2, h_3') \sim (h_1, h_2', h_3)$. Then, (ii) and (v) yield $(h_1, h_2, h_3') \sim (h_1', h_2', h_3')$ as desired.

Next, we will prove that if $\tilde{E} \notin \mathcal{E}$, then there exist $h_i, h_i'$ for $i = 1, 2, 3$ such that (i) and (ii) of fact 1 hold and $(h_1, h_2, h_3') \not\sim (h_1', h_2', h_3')$. This contradiction will establish the desired result.

Suppose there exist $f = (f_1, f_2, h)$, $g = (g_1, g_2, h)$, $f' = (f_1, f_2, h')$, $g' = (g_1, g_2, h')$ such that $f \succeq g$ and $g' \succ f'$. By Lemma B1(vi), we can assume, without loss of generality, that $f_1 = x_1$, $f_2 = x_2$, $g_1 = y_1$, $g_2 = y_2$ for some $x_i, y_i \in X$ for $i = 1, 2$. We can also assume, by renaming $E, \hat{E}$ if necessary, that $x_2 \le y_2$. Then, it follows from Lemma B1(i) and (v) that $x_2 < y_2$ and $x_1 > y_1$.
We endow $\mathcal{F}$ with the topology of uniform convergence and for any $h \in \mathcal{F}$, let $c(h) = c_\Omega(h)$ where $c_\Omega$ is the function defined in Lemma B1(vi).

Fact 2: $c_E : \mathcal{F} \to X$ is a continuous function for every $E \in \mathcal{E}$.

Fact 2 follows immediately from Lemma B1(vi) and Axiom 6(ii).

Fact 3: Let $f_\alpha = (x_1, x_2, \alpha h' + (1 - \alpha)h)$ and $g_\alpha = (y_1, y_2, \alpha h' + (1 - \alpha)h)$. Then, there exists $\bar{\alpha} \in [0, 1)$ such that $c(f_{\bar{\alpha}}) = c(g_{\bar{\alpha}})$ and $c(g_\alpha) > c(f_\alpha)$ for all $\alpha \in (\bar{\alpha}, 1]$.

Fact 3 follows from the continuity of $\alpha \to f_\alpha$ and $\alpha \to g_\alpha$. Without loss of generality, assume $\bar{\alpha} = 0$ (otherwise, we can rename $\bar{\alpha}h' + (1 - \bar{\alpha})h$ and call it $h'$). For any $t \in [0, 1]$, choose $\phi(t)$ such that
\[
c(x_1 + t(y_1 - x_1),\; x_2 + \phi(t)(y_2 - x_2),\; h) = c(f). \tag{B1}
\]
To see that $\phi(t)$ exists and is unique for every $t$, note that $c(x_1 + t(y_1 - x_1), y_2, h) \ge c(f) \ge c(x_1 + t(y_1 - x_1), x_2, h)$. Hence, the existence of a $\phi(t)$ satisfying (B1) follows from the continuity of $c$ (fact 2). The uniqueness of this $\phi(t)$ follows from Lemma B1(v). Henceforth, let $x_1(t) = x_1 + t(y_1 - x_1)$ and $x_2(t) = x_2 + \phi(t)(y_2 - x_2)$. Let $f_\alpha^t = (x_1 + t(y_1 - x_1),\; x_2 + \phi(t)(y_2 - x_2),\; \alpha h' + (1 - \alpha)h)$ and set $\psi(\alpha) = c(f_\alpha^1)$.

Fact 4: $\psi : [0, 1] \to X$ is continuous.

The map $\alpha \to f_\alpha^1$ is continuous. Then, fact 2 yields fact 4.

Note that $f_0^0 = f$, $f_0^1 = g$, $f_1^0 = f'$, $f_1^1 = g'$ and $c(f_0^t) = c(f)$ for all $t \in [0, 1]$. First, we observe that $c(f_\alpha^1) \neq c(f_0^1)$ for all $\alpha \neq 0$. This follows since $c(f_\alpha^1) = c(f_0^1)$ and $E \in \mathcal{E}$ imply $c(x_1, y_2, \alpha h' + (1 - \alpha)h) = c(x_1, y_2, h)$. Then, $\hat{E} \in \mathcal{E}$ implies $c(x_1, x_2, \alpha h' + (1 - \alpha)h) = c(x_1, x_2, h)$ and hence $c(x_1, x_2, \alpha h' + (1 - \alpha)h) = c(y_1, y_2, \alpha h' + (1 - \alpha)h)$, which contradicts fact 3 whenever $\alpha \neq 0$. Hence, $c(f_\alpha^1) \neq c(f_0^1)$ for all $\alpha \neq 0$ implies either (i) $c(f_\alpha^1) > c(f_0^1)$ whenever $\alpha \neq 0$ or (ii) $c(f_\alpha^1) < c(f_0^1)$ whenever $\alpha \neq 0$.

First, consider case (i). Let $\hat{f} = (x_1, y_2, h)$ and note that by Lemma B1(v), $c(\hat{f}) > c(f_0^1)$.
Then, choose $\alpha > 0$ small enough so that
\[
\min\{c(\hat{f}), c(f_1^1)\} > c(f_\alpha^1) > c(f_0^1). \tag{B2}
\]
By fact 4, such an $\alpha$ exists. Then, fact 2 ensures that $c(tf_0^1 + (1 - t)\hat{f}) = c(f_\alpha^1)$ for some $t \in [0, 1]$. To complete the proof of case (i), let $h_1 = x_1(t)$, $h_2 = x_2(t)$, $h_3 = h$ and $h_1' = y_1 = x_1(1)$, $h_2' = y_2 = x_2(1)$ and $h_3' = \alpha h' + (1 - \alpha)h$. Then, $(h_1, h_2, h_3) = f_0^t$, $(h_1', h_2', h_3) = f_0^1$, $(h_1, h_2', h_3) = tf_0^1 + (1 - t)\hat{f}$ and $f_\alpha^1 = (h_1', h_2', h_3')$. Hence, we have $c(f_0^t) = c(f_0^1)$; that is, $(h_1, h_2, h_3) \sim (h_1', h_2', h_3)$. Also, $c(tf_0^1 + (1 - t)\hat{f}) = c(f_\alpha^1)$; that is, $(h_1, h_2', h_3) \sim (h_1', h_2', h_3')$. Then, fact 1 yields $(h_1, h_2, h_3') \sim (h_1', h_2', h_3')$; that is, $c(f_\alpha^t) = c(f_1^1)$, which contradicts equation (B2).

For case (ii), that is, if $c(f_\alpha^1) < c(f_0^1)$ whenever $\alpha \neq 0$, rename $E$ and $\hat{E}$ so that $x_1 < y_1$ and $x_2 > y_2$ and let $\hat{f} = (x_1, y_2, h)$ as before. Then, note that by Lemma B1(v), $c(\hat{f}) < c(f_0^1)$. Then, choose $\alpha > 0$ small enough so that $\max\{c(\hat{f}), c(f_1^1)\} < c(f_\alpha^1) < c(f_0^1)$. By fact 4, such an $\alpha$ exists. Then, repeat the arguments of case (i) to obtain the desired conclusion.

To prove that the field $\mathcal{E}$ is a $\sigma$-field, it is enough to show that if $E_i \in \mathcal{E}$ and $E_i \subset E_{i+1}$, then $\bigcup_i E_i \in \mathcal{E}$. Let $E_i \subset E_{i+1}$ for all $i$. Note that $\hat{f}E_i\hat{g}$ converges pointwise to $\hat{f}(\bigcup_i E_i)\hat{g}$ for all $\hat{f}, \hat{g} \in \mathcal{F}$. Hence, if $g(\bigcup_i E_i)h' \succ f(\bigcup_i E_i)h'$ or $h'(\bigcup_i E_i)g \succ h'(\bigcup_i E_i)f$ for some $f, g, h, h' \in \mathcal{F}^e$, then by Lemma B1(iv), we have $gE_nh' \succ fE_nh'$ or $h'E_ng \succ h'E_nf$ for some $n$, proving that $E_i \in \mathcal{E}$ for all $i$ implies $\bigcup_i E_i \in \mathcal{E}$.

Lemma B3: There exists a finitely additive, convex-ranged probability measure $\mu$ on $\mathcal{E}$ and a function $v : X \to \mathbb{R}$ such that (i) the function $V : \mathcal{F}^e \to \mathbb{R}$ defined by
\[
V(f) = \sum_{x \in X} v(x)\mu(f^{-1}(x))
\]
represents the restriction of $\succeq$ to $\mathcal{F}^e$ and (ii) for any $f, g \in \mathcal{F}$, $\mu(\{\omega \mid f(\omega) = g(\omega)\}) = 1$ implies $f \sim g$.

Proof: Note that once we restrict attention to $\mathcal{F}^e$, Axiom 1 is Savage's P1, Axiom 2 is P2 and Lemma B1(v) is P3.
By definition P4 is satisfied, Axiom 4 is P5, Axiom 5 is P6, and finally, Axiom 6(ii) yields P7. Then, applying the proof of Savage's theorem to all acts in $\mathcal{F}^e$ yields the desired conclusion. This is true despite the fact that Savage's theorem assumes that the underlying $\sigma$-field is the set of all subsets of $\Omega$; the arguments work for any $\sigma$-field. This proves (i). To prove (ii), note that by hypothesis, there exists $E \in \mathcal{E}$ such that $\mu(E) = 1$ and $g = fEg$. But $m \sim mEl$ by part (i) and since $E \in \mathcal{E}$, we have $fEm \sim fEl$. Then, Lemma B1(i) yields $f = fEf \sim fEg$.

Lemma B4: The probability measure $\mu$ on $\mathcal{E}$ is a prior.

Proof: To show that $\mu$ is countably additive, we need to prove that given any sequence $E_i$ such that $E_{i+1} \subset E_i$ for all $i$ and $E^* := \bigcap_i E_i = \emptyset$, $\lim \mu(E_i) = 0$. Suppose $\lim \mu(E_i) > 0$. Then, convex-valuedness ensures the existence of $E$ such that $\lim \mu(E_i) > \mu(E) > 0$. Hence, $\mu(E_i) > \mu(E)$ for all $i$; that is, $mE_il \succ mEl$ for all $i$. But $mE_il \in \mathcal{F}^e$ and converges pointwise to $mE^*l$. Hence, $mE^*l \succeq mEl \succ l$. Therefore, $\mu(E^*) > 0$, contradicting $E^* = \emptyset$.

To prove that $\mu$ is complete, we will assume $\mu(E) = 0$ and $A \subset E$, then show that this implies $fAh \sim h$ for all $f \in \mathcal{F}$. This means that $A \in \mathcal{E}$. By Lemma B1, $fAh \succeq lEh$. Since $\mu(E) = 0$, Lemma B3 implies $lEm \sim m$ and since $E \in \mathcal{E}$, we conclude that $lEh \sim mEh$. But $mEh \succeq h$ by Lemma B1, so we have $fAh \succeq h$. A symmetric argument yields $h \succeq fAh$ and hence $fAh \sim h$ as desired. Since $\mu$ is convex-ranged, it is obviously nonatomic. Hence, $\mu$ is a prior.

Lemma B5: The function $v$ is strictly increasing and continuous.

Proof: That $v$ is strictly increasing follows from $y \succ x$ whenever $y > x$. To prove continuity, assume, without loss of generality, that $v(m) = 1$ and $v(l) = 0$ and suppose $r = \lim v(x_n) < v(x)$ for some sequence $x_n$ in $X$ converging to $x$. Choose $E_r \in \mathcal{E}$ such that $\mu(E_r) = r$. Then, note that $x \succ mE_rl \succeq x_n$ for $n$ large. Therefore, $x \succ mE_rl \succeq \lim x_n = x$, a contradiction. Hence, $r \ge v(x)$. A symmetric argument proves $v(x) \ge r$ and yields the continuity of $v$.
Lemma B6: For any diffuse act $yDx$, there exists a unique $z \in X$ such that $yDx \sim z$.

Proof: Let $z = \sup\{w \in X \mid yDx \succeq w\}$. Since $yDx \succeq l$ by Axiom 2, $z$ is well-defined. Then, we can construct two sequences $y_n \ge z$ and $z \ge x_n$ such that both sequences converge to $z$ and $y_n \succeq yDx$, $yDx \succeq x_n$. Hence, by Axiom 6, $z \succeq yDx \succeq z$ as desired.

Lemma B7: Let $D_1, \ldots, D_n \in \mathcal{D}$ be a partition of $\Omega$ and $y_{i+1} \ge y_i$ for $i = 1, \ldots, n - 1$ and define $f : \Omega \to X$ as follows: $f(s) = y_i$ whenever $s \in D_i$. Then, $f \sim y_nDy_1$ for all $D \in \mathcal{D}$.

Proof: By monotonicity, $y_n[D_2 \cup \ldots \cup D_n]y_1 \succeq f \succeq y_nD_ny_1$. By Axiom 5, $y_n[D_2 \cup \ldots \cup D_n]y_1 \sim y_nD_ny_1 \sim y_nDy_1$.

Definition: Let $u : I \to \mathbb{R}$ be defined as follows: $u(x, y) = v(z)$ for $z$ such that $yDx \sim z$. It follows from Lemma B6 that $u$ is well defined.

Lemma B8: The function $u$ is increasing and continuous.

Proof: Suppose $yDx \sim z$ and $\hat{y}D\hat{x} \sim \hat{z}$. If $\hat{y} > y$ and $\hat{x} > x$, then Axiom 2 implies $\hat{z} \succ z$ and applying Axiom 2 again yields $\hat{z} > z$ as desired. If $\hat{y} \ge y$ and $\hat{x} \ge x$, then by Lemma B1(i), $\hat{z} \succeq z$. Then, applying Axiom 2 again yields $\hat{z} \ge z$. To prove continuity, assume $y_iDx_i \sim z_i$ for $i = 1, 2, \ldots$ and $\lim(x_i, y_i) = (x, y)$. Since $X$ is compact, we can assume, without loss of generality, that $z_i$ converges to some $z$. Suppose $yDx \succ z$ and note that since $y_iDx_i$ converges uniformly to $yDx$ and the act $z_i$ converges uniformly to $z$, we have, by Lemma B1(iii), $y_iDx_i \succ z_i$ for some $i$, a contradiction. A symmetric argument yields $yDx \sim z$ and establishes continuity.

Define
\[
W(f) = \int u[f]\,d\mu.
\]

Lemma B9: The function $W$ represents the restriction of $\succeq$ to $\mathcal{F}^o$.

Proof: Let $N^+(f) = \{J \mid \mu(E_*^J(f)) > 0 \text{ and } |J| > 1\}$ and $\mathcal{F}^n = \{f \in \mathcal{F}^o \mid n = |N^+(f)|\}$. The proof is by induction on $\mathcal{F}^n$. Note that for $f \in \mathcal{F}^0$,
\[
W(f) = \sum_{x \in X} v(x)\mu(f^{-1}(x)) = v(x)
\]
for $x$ such that $x \sim f$. Hence, the restriction of $W$ to $\mathcal{F}^0$ represents $\succeq$. Suppose $W$ represents the restriction of $\succeq$ to $\mathcal{F}^n$ and choose $f \in \mathcal{F}^{n+1}$. Define $h_f$ as follows: if $f \in \mathcal{F}^n$, then $h_f = f$.
Otherwise, choose $E_*^J(f)$ such that $|J| > 1$ and $\mu(E_*^J(f)) > 0$. Let $y = \max f(E_*^J(f))$ and $x = \min f(E_*^J(f))$. Hence, $y > x$. Also choose $D \in \mathcal{D}$ and define $f^*$ as follows: $f^*(\omega) = f(\omega)$ for all $\omega \notin E_*^J(f)$; $f^*(\omega) = y$ for all $\omega \in D \cap E_*^J(f)$ and finally, $f^*(\omega) = x$ for all $\omega \in D^c \cap E_*^J(f)$. By Lemma B6 and Axiom 3, $f^* \sim f$. Next, choose $z$ such that $u(x, y) = v(z)$ and let $h_f(\omega) = f^*(\omega)$ for all $\omega \notin E_*^J(f)$ and $h_f(\omega) = z$ for all $\omega \in E_*^J(f)$. Again, Axiom 3 ensures that $h_f \sim f^* \sim f$. Note that $h_f \in \mathcal{F}^n$ and by construction, $W(h_f) = W(f^*)$. Note also that $[f^*] = [f]$ and therefore $W(f^*) = W(f)$. Thus, $W(f) = W(h_f)$ for some $h_f \in \mathcal{F}^n$ such that $h_f \sim f$. Then, the induction hypothesis implies that $W$ represents $\succeq$ on $\mathcal{F}^{n+1}$.

Lemma B10: The function $W$ represents $\succeq$.

Proof: Note that for all $f$, there exists $x_f$ such that $W(x_f) = u(x_f, x_f) = W(f)$. This follows from the fact that $u$ is continuous and $u(m, m) \ge W(f) \ge u(l, l)$. Hence, by the intermediate value theorem, $u(x_f, x_f) = W(f)$ for some $x_f \in [l, m]$. The monotonicity of $u$ ensures that this $x_f$ is unique.

Next, we show that $f \sim x_f$. Without loss of generality, assume $l = 0$ (if not, let $l^* = 0$ and $m^* = m - l$, identify each $f$ with $f^* = f - l$ and apply all previous results to the acts $\mathcal{F}^* = \{f - l \mid f \in \mathcal{F}\}$). Define for any $x \ge 0$ and $\epsilon > 0$, $z^*(x, \epsilon) = \min\{n\epsilon \mid n = 0, 1, \ldots \text{ such that } n\epsilon \ge x\}$. Similarly, let $z_*(x, \epsilon) = \max\{n\epsilon \mid n = 0, 1, \ldots \text{ such that } n\epsilon \le x\}$. Clearly,
\[
0 \le z^*(x, \epsilon) - x \le z^*(x, \epsilon) - z_*(x, \epsilon) \le \epsilon \tag{B3}
\]
and the first two inequalities above are equalities if and only if $x$ is a multiple of $\epsilon$. Set $f^n(\omega) = z^*(f(\omega), m2^{-n})$ and $f_n(\omega) = z_*(f(\omega), m2^{-n})$ for all $n = 0, 1, \ldots$. Equation (B3) above ensures that $f^n \ge f \ge f_n$ and that $f^n, f_n$ converge uniformly to $f$. Note also that $f^n, f_n \in \mathcal{F}^o$ with $f^n \downarrow f$. This implies that (for a measure 1 subset) $[f^n] \downarrow [f]$ and therefore $\int u[f^n]\,d\mu \to \int u[f]\,d\mu$. Since $f^n \ge f$, we have $W(f^n) \ge W(f) = W(x_f)$ for all $n$.
Since $W$ represents the restriction of $\succeq$ to $\mathcal{F}^o$, we conclude that $f^n \succeq x_f$ for all $n$. Then, Axiom 6 implies $f \succeq x_f$. A symmetric argument with $f_n$ replacing $f^n$ yields $x_f \succeq f$ and therefore $x_f \sim f$ as desired.

To conclude the proof of the lemma, suppose $f \succeq g$. Then $W(x_f) = W(f)$, $W(x_g) = W(g)$ and $x_f \sim f \succeq g \sim x_g$. Since $W$ represents the restriction of $\succeq$ to $\mathcal{F}^o$, we conclude that $W(x_f) \ge W(x_g)$ and hence $W(f) \ge W(g)$. Similarly, if $W(f) \ge W(g)$, we conclude $f \sim x_f \succeq x_g \sim g$ and therefore $f \succeq g$.

Lemma B10 establishes sufficiency. To prove that the preference $\succeq_\mu^u$ satisfies Axioms 1-6 for every prior $\mu$ and interval utility $u$, note that by Lemma A3, $\mu$ is convex-ranged. Then, verifying Axioms 1, 2, 4 and 5 involves nothing more than repeating familiar arguments from Savage's theorem. Axiom 3 and Axiom 4 follow immediately from the representation. Note that for any $f^n \in \mathcal{F}^e$, if $[f^n]$ converges pointwise to $[g]$, then $W(f^n)$ converges to $W(g)$. Hence, Axiom 6(i) follows from the fact that pointwise convergence of $f^n$ to $f$ implies pointwise convergence of $[f^n]$ to $[f]$, while Axiom 6(ii) follows from the fact that for any $f^n$, if $f^n$ converges to $f$ uniformly, then $[f^n]$ converges pointwise to $[f]$. The uniqueness of the representation follows from standard arguments and is therefore omitted.

10. Appendix C: Proofs of Propositions 1-7

Proof of Proposition 1: Suppose $\phi \succeq \phi'$ if and only if $\phi \circ \rho \succeq_\mu^u \phi' \circ \rho$ for some partition $\rho : \Omega \to \{1, \ldots, n\}$, prior $\mu$ and interval utility $u$. Assume $\phi$ is one-to-one with (distinct) values $x_1, \ldots, x_n \in X$. Let $f = \phi \circ \rho$. For any nonempty $a \subset S$, let $\pi(a) = \mu(E_*^a(f))$. We have shown in the proof of Theorem 1 that
\[
W(f) = \sum_a u\Big(\min_{x \in E_*^a(f)} f(x),\; \max_{y \in E_*^a(f)} f(y)\Big)\mu(E_*^a(f)) = \sum_a u[\phi](a)\pi(a).
\]
Note that any $\phi$ is the uniform limit of $\phi_n$'s that are one-to-one and $\lim W(\phi_n \circ \rho) = W(\phi \circ \rho)$. Thus, $\lim \sum_a u([\phi_n](a))\pi(a) = \sum_a u[\phi](a)\pi(a)$, proving that
\[
U(\phi) = \sum_a u[\phi](a)\pi(a)
\]
as desired.
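The finite representation $U(\phi) = \sum_a u[\phi](a)\pi(a)$ can be evaluated directly once $\pi$ and $u$ are given. The following Python sketch is only an illustration under stated assumptions: the function name `euu_value`, the two-state weights in `pi`, the act `phi`, and the three interval utilities are hypothetical choices and do not appear in the paper. It takes $[\phi](a)$ to be the pair (minimum prize on $a$, maximum prize on $a$).

```python
def euu_value(phi, pi, u):
    """Return sum over events a of u(min_a phi, max_a phi) * pi(a).

    phi: dict mapping each state to a prize (hypothetical act).
    pi:  dict mapping frozensets of states (nonempty events) to weights.
    u:   interval utility, called as u(low, high) with low <= high.
    """
    total = 0.0
    for event, weight in pi.items():
        prizes = [phi[s] for s in event]
        total += u(min(prizes), max(prizes)) * weight
    return total


# Hypothetical two-state example: half of the mass sits on the event {1, 2}.
pi = {frozenset({1}): 0.3, frozenset({2}): 0.2, frozenset({1, 2}): 0.5}
phi = {1: 10.0, 2: 0.0}


def u_low(x, y):
    return x  # evaluates each interval by its lower end


def u_high(x, y):
    return y  # evaluates each interval by its upper end


def u_mid(x, y):
    return 0.5 * x + 0.5 * y  # equal weight on both ends


values = {
    "low": euu_value(phi, pi, u_low),
    "high": euu_value(phi, pi, u_high),
    "mid": euu_value(phi, pi, u_mid),
}
```

In this hypothetical example the three utilities give values of 3, 8 and 5.5. The singleton events contribute the same amount in each case, so the spread comes entirely from the weight that $\pi$ places on the event $\{1, 2\}$.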
For the converse, let $\mu$, $\pi$ and $u$ be given. Choose an ideal-event partition $\{E_a\}_{a \in \mathcal{S}}$ of $\Omega$ such that $\mu(E_a) = \pi(a)$, where $\mathcal{S}$ is the set of all nonempty subsets of $S$. For each $a$, let $\{D_i^a\}_{i \in a}$ be a partition of $\Omega$ into diffuse sets. Define $\rho$ such that $\rho(\omega) = i$ for $\omega \in E_a \cap D_i^a$. Consider acts $\phi \in \Phi$ that are one-to-one and let $f = \phi \circ \rho$. Then, $E_*^a(f) = E_a$ and therefore $W(\phi \circ \rho) = \sum_a u[\phi](a)\pi(a)$. The extension to all acts $\phi \in \Phi$ is as above.

Proof of Proposition 2: If $v_{\hat{u}} \circ v_u^{-1}$ is not concave, then familiar arguments ensure the existence of $x < z < y$ and $p \in (0, 1)$ such that $pv_{\hat{u}}(y) + (1 - p)v_{\hat{u}}(x) > v_{\hat{u}}(z)$ and $pv_u(y) + (1 - p)v_u(x) < v_u(z)$. Then, let $S = \{1, 2\}$, $\pi(\{1\}) = p$, $\pi(\{2\}) = 1 - p$ and $\pi(\{1, 2\}) = 0$, and set $U_1 = (\pi, \hat{u})$ and $U_0 = (\pi, u)$. Then,
\[
\begin{aligned}
pv_{\hat{u}}(y) + (1 - p)v_{\hat{u}}(x) &= U_1(y\{1\}x) > U_1(z) = v_{\hat{u}}(z) \\
pv_u(y) + (1 - p)v_u(x) &= U_0(y\{1\}x) < U_0(z) = v_u(z)
\end{aligned}
\]
proving that $y\{1\}x \succ_\pi^{\hat{u}} z$ and $z \succ_\pi^u y\{1\}x$ and hence $\succeq_\pi^{\hat{u}}$ is not more uncertainty averse than $\succeq_\pi^u$. If $\sigma_{\hat{u}}^{xy} < \sigma_u^{xy}$, then choose $\sigma$ strictly between these two numbers, let $S = \{1, 2\}$, $\pi(\{1, 2\}) = 1$ and $\pi(\{1\}) = \pi(\{2\}) = 0$. Again, set $U_1 = (\pi, \hat{u})$ and $U_0 = (\pi, u)$ and note that $U_1(y\{1\}x) > U_1(\sigma x + (1 - \sigma)y)$ and $U_0(y\{1\}x) < U_0(\sigma x + (1 - \sigma)y)$, again proving that $\hat{u}$ is not more uncertainty averse than $u$. Hence, these properties are necessary for $\hat{u}$ to be more uncertainty averse than $u$.

To prove sufficiency, let $U_0 = (\pi, u)$ and $U_1 = (\pi, \hat{u})$ and assume $\sigma_{\hat{u}}^{xy} \ge \sigma_u^{xy}$ for all $x, y$ and $v_{\hat{u}} \circ v_u^{-1}$ is concave. Note that for any $u^*$ and $U = (\pi, u^*)$,
\[
U(y\,\hat{a}\,x) = \pi_*(\hat{a})v_{u^*}(y) + (\pi^*(\hat{a}) - \pi_*(\hat{a}))v_{u^*}(\sigma_{u^*}^{xy}x + (1 - \sigma_{u^*}^{xy})y) + (1 - \pi^*(\hat{a}))v_{u^*}(x). \tag{C1}
\]
Hence, $v_u^{-1}(U_1(y\,\hat{a}\,x)) > v_{\hat{u}}^{-1}(U_0(y\,\hat{a}\,x))$ for all $\pi$, proving that $\hat{u}$ is more uncertainty averse than $u$.

Proof of Proposition 3: Suppose $\hat{u}$ is more uncertainty averse than $u$, let $U_0 = (\pi, u)$, $U_1 = (\pi, \hat{u})$ and assume $U_0(y\,a\,x) > U_0(y\,b\,x)$ and $U_1(y\,a\,x) < U_1(y\,b\,x)$.
Since $0 \le \sigma_u^{xy} \le 1$, equation (C1) implies that if $[\pi_*(a) - \pi_*(b)][\pi^*(a) - \pi^*(b)] \ge 0$ then either $U(y\,a\,x) \ge U(y\,b\,x)$ for all $U = (\pi, u^*)$ or $U(y\,a\,x) \le U(y\,b\,x)$ for all $U = (\pi, u^*)$. Therefore,
\[
[\pi_*(a) - \pi_*(b)][\pi^*(a) - \pi^*(b)] < 0. \tag{C2}
\]
Next, without loss of generality, set $v_{\hat{u}}(y) = v_u(y) = 1$ and $v_{\hat{u}}(x) = v_u(x) = 0$, let $\delta_1 = v_{\hat{u}}(\sigma_{\hat{u}}^{xy}x + (1 - \sigma_{\hat{u}}^{xy})y)$ and $\delta_0 = v_u(\sigma_u^{xy}x + (1 - \sigma_u^{xy})y)$ and note that $\hat{u}$ being more uncertainty averse implies $\delta_1 \le \delta_0$. Applying (C1) to $U_1(y\,a\,x) < U_1(y\,b\,x)$ and $U_0(y\,a\,x) > U_0(y\,b\,x)$ and rearranging terms yields that $\delta_1 < \delta_0$ if and only if $\pi_*(a) - \pi_*(b) < 0$ and hence (C2) proves that $a$ is more uncertain than $b$.

Conversely, if $\pi_*(a) - \pi_*(b) < 0 < \pi^*(a) - \pi^*(b)$, then let $u(x, y) = y$ and $\hat{u}(x, y) = x$ for all $(x, y) \in I$. Clearly, $\hat{u}$ is more uncertainty averse than $u$. Also, $U_1(y\,\hat{a}\,x) = y\pi_*(\hat{a}) + (1 - \pi_*(\hat{a}))x$ and $U_0(y\,\hat{a}\,x) = y\pi^*(\hat{a}) + (1 - \pi^*(\hat{a}))x$ for all $\hat{a}$ and therefore $U_1(y\,a\,x) < U_1(y\,b\,x)$ and $U_0(y\,a\,x) > U_0(y\,b\,x)$ as desired.

Proof of Proposition 4: Let
\[
\pi(a) = \begin{cases} \epsilon & \text{if } a_t = 1 \text{ for all } t \\ 0 & \text{otherwise.} \end{cases}
\]
Let $\epsilon = (\frac{1}{m})^k$ so that $\pi$ is a probability on $\mathcal{P}$. Without loss of generality, let $x = 0$, $y = 1$, $u(0, 0) = 0$ and $u(1, 1) = 1$ and set $\sigma = \sigma_u^{01}$ and $U = (\pi, u)$. Then, for any $b \in B$, let $\delta_t(b) = \frac{b_t}{m}$. Then,
\[
\begin{aligned}
\pi_*(b) &= \sum_{a \subset b} \pi(a) = \prod_t \delta_t(b) \\
\pi^*(b) &= 1 - \sum_{a \subset b^c} \pi(a) = 1 - \prod_t \delta_t(b^c).
\end{aligned} \tag{C3}
\]
When $|a| = |b|$, we have $\sum_t a_t = \sum_t b_t$. Furthermore, for $a \in A$, $\delta_t(a) = \delta_1(a)$ for all $t$. Hence, equation (C3) implies that $\pi_*(a) > \pi_*(b)$ and $\pi^*(a) < \pi^*(b)$ whenever $a \in A$, $b \in B \setminus A$ and $|a| = |b|$, proving (i).

It follows from equation (C1) in the proof of Proposition 2 that
\[
U(y\,a\,x) = (1 - z)\pi_*(a) + z\pi^*(a).
\]
Hence, $U(y\,a\,x) - U(y\,b\,x) = (1 - z)[\pi_*(a) - \pi_*(b)] + z[\pi^*(a) - \pi^*(b)]$.
Let $T$ be the set of all $(a, b)$ such that $a \in A$, $b \in B \setminus A$ and $|a| = |b|$ and define
\[
z^* = \min_{(a,b) \in T} \frac{\pi_*(a) - \pi_*(b)}{\pi_*(a) - \pi_*(b) + \pi^*(b) - \pi^*(a)}.
\]
Since $B$ is finite and $b$ is more uncertain than $a$ whenever $(a, b) \in T$, $z^* \in (0, 1)$ is well-defined by part (i).

Proof of Proposition 5: That separability precludes M-reversals is obvious. To conclude the proof, we will show that if there are no M-reversals, then the $u$ that satisfies the above equation must be separable. No M-reversals implies
\[
u(x_1, y_1) + u(x_2, y_2) = u(x_1, y_2) + u(x_2, y_1) \tag{C4}
\]
whenever $(x_1, y_2), (x_2, y_1) \in I$. Define $v_2(y) = u(l, y)$ and $v_1(x) = u(x, m) - u(l, m)$. Then, $v_1(x) + v_2(y) = u(x, m) - u(l, m) + u(l, y)$ and equation (C4) ensures that $u(x, m) - u(l, m) = u(x, y) - u(l, y)$. Therefore, $v_1(x) + v_2(y) = u(x, y)$ for all $x, y$, proving the separability of $u$.

Proof of Proposition 6: Assume that $\kappa = \alpha\pi_* + (1 - \alpha)\pi^*$ and $u = v_\alpha$. For any $\phi \in \Phi$, order $S = \{s_1, s_2, \ldots, s_n\}$ so that $\phi(s_i) \ge \phi(s_{i+1})$ and let $a_0 = \emptyset$, $a_i = \{s_1, \ldots, s_i\}$ for $i \ge 1$. Then, for the $V$ that represents the CEU $\succeq_{\kappa v}$ we have
\[
\begin{aligned}
V(\phi) &= \sum_{i=1}^n v(\phi(s_i))[\kappa(a_i) - \kappa(a_{i-1})] \\
&= \sum_{i=1}^n v(\phi(s_i))\{\alpha[\pi_*(a_i) - \pi_*(a_{i-1})] + (1 - \alpha)[\pi^*(a_i) - \pi^*(a_{i-1})]\} \\
&= \sum_{i=1}^n v(\phi(s_i)) \sum_{b \in A_i} \alpha\pi(b) + \sum_{i=1}^n v(\phi(s_i)) \sum_{b \in B_i} (1 - \alpha)\pi(b)
\end{aligned}
\]
where $A_i = \{b \subset S \mid b \subset a_i, b \not\subset a_{i-1}\}$ and $B_i = \{b \subset S \mid b \subset a_{i-1}^c, b \not\subset a_i^c\}$. Hence, we have
\[
\begin{aligned}
V(\phi) &= \sum_{i=1}^n \sum_{b \in A_i} v(\phi(s_i))\alpha\pi(b) + \sum_{i=1}^n \sum_{b \in B_i} v(\phi(s_i))(1 - \alpha)\pi(b) \\
&= \sum_{i=1}^n \sum_{b \in A_i} v(\min_{s \in b} \phi(s))\alpha\pi(b) + \sum_{i=1}^n \sum_{b \in B_i} v(\max_{s \in b} \phi(s))(1 - \alpha)\pi(b) \\
&= \sum_{b \in \mathcal{P}} v(\min_{s \in b} \phi(s))\alpha\pi(b) + \sum_{b \in \mathcal{P}} v(\max_{s \in b} \phi(s))(1 - \alpha)\pi(b) \\
&= \sum_{b \in \mathcal{P}} v_\alpha[\phi](b)\pi(b) = U(\phi)
\end{aligned}
\]
where $U = \succeq_\pi^u$.

Proof of Proposition 7: Take any $\phi \in \Phi$ and for all $a \in \mathcal{P}$ choose $s_*, s^* \in a$ such that $\min_{s \in a} \phi(s) = \phi(s_*)$ and $\max_{s \in a} \phi(s) = \phi(s^*)$. Then, define $\lambda_{a*}, \lambda_a^* \in \Delta_a$ such that $\lambda_{a*}(s_*) = \lambda_a^*(s^*) = 1$.
Then, for the $V$ that represents $\succeq_{\Delta v}^\alpha$, we have
\[
\begin{aligned}
V(\phi) &= \alpha \min_{\lambda \in \Delta} \sum_{s \in S} v(\phi(s))\lambda(s) + (1 - \alpha) \max_{\lambda \in \Delta} \sum_{s \in S} v(\phi(s))\lambda(s) \\
&= \alpha \sum_{a \in \mathcal{P}} v(\phi(s_*))\tau(a) + (1 - \alpha) \sum_{a \in \mathcal{P}} v(\phi(s^*))\tau(a) \\
&= \sum_{a \in \mathcal{P}} [\alpha v(\phi(s_*)) + (1 - \alpha)v(\phi(s^*))]\tau(a) \\
&= \sum_{a \in \mathcal{P}} v_\alpha(\phi(s_*), \phi(s^*))\tau(a) = U(\phi)
\end{aligned}
\]
where $U = \succeq_\pi^u$.

References

Ahn, D. S. (2008) “Ambiguity without a State Space,” Review of Economic Studies, 75, 3–28.

Anscombe, F. J. and R. J. Aumann (1963) “A Definition of Subjective Probability,” Annals of Mathematical Statistics, 34, 199–205.

Arrow, K. J. and L. Hurwicz (1972) “An Optimality Criterion for Decision-Making under Ignorance,” in C. F. Carter and J. F. Ford (eds.), Uncertainty and Expectations in Economics, Oxford: Basil Blackwell.

Baillon, A., O. L’Haridon and L. Placido (2010) “Ambiguity Model and the Machina Paradoxes,” forthcoming, American Economic Review.

Casadesus-Masanell, R., P. Klibanoff and E. Ozdenoren (2000) “Maxmin Expected Utility over Savage Acts with a Set of Priors,” Journal of Economic Theory, 92, 33–65.

Cerreia-Vioglio, S., F. Maccheroni, M. Marinacci and L. Montrucchio (2008) “Uncertainty Averse Preferences,” Manuscript.

Dempster, A. P. (1967) “Upper and Lower Probabilities Induced by a Multivalued Mapping,” Annals of Mathematical Statistics, 38, 325–339.

Ellsberg, D. (1961) “Risk, Ambiguity and the Savage Axioms,” Quarterly Journal of Economics, 75, 643–669.

Epstein, L. G. (1999) “A Definition of Ambiguity Aversion,” Review of Economic Studies, 66, 579–608.

Epstein, L. G. and J. Zhang (2001) “Subjective Probabilities on Subjectively Unambiguous Events,” Econometrica, 69, 265–306.

Ghirardato, P. and M. Marinacci (2001a) “Ambiguity Made Precise: A Comparative Foundation,” Journal of Economic Theory.

Ghirardato, P. and M. Marinacci (2001b) “Risk, Ambiguity and the Separation of Utility and Beliefs,” Mathematics of Operations Research, 26, 4, 864–890.

Ghirardato, P., F. Maccheroni and M. Marinacci (2004) “Differentiating Ambiguity and Ambiguity Attitude,” Journal of Economic Theory, 118, 133–173.

Gilboa, I. (1987) “Expected Utility with Purely Subjective Non-Additive Probabilities,” Journal of Mathematical Economics, 16, 65–88.

Gilboa, I. and D. Schmeidler (1989) “Maxmin Expected Utility with a Non-Unique Prior,” Journal of Mathematical Economics, 18, 141–153.

Gilboa, I. and D. Schmeidler (1994) “Additive Representations of Non-additive Measures and the Choquet Integral,” Annals of Operations Research, 52, 43–65.

Gul, F. and W. Pesendorfer (2010) “Expected Uncertain Utility and Multiple Sources,” mimeo, Princeton University.

Jaffray, J.-Y. (1989) “Linear Utility Theory for Belief Functions,” Operations Research Letters, 8, 107–112.

Jaffray, J.-Y. and P. P. Wakker (1993) “Decision Making with Belief Functions: Compatibility and Incompatibility with the Sure-Thing Principle,” Journal of Risk and Uncertainty, 7, 255–71.

Klibanoff, P., M. Marinacci and S. Mukerji (2005) “A Smooth Model of Decision Making under Ambiguity,” Econometrica, 73, 1849–1892.

Lehrer, E. (2007) “Partially Specified Probabilities: Decisions and Games,” mimeo.

L’Haridon, O. and L. Placido (2010) “Betting on Machina’s Reflection Example: an Experiment on Ambiguity,” forthcoming, Theory and Decision.

Maccheroni, F., M. Marinacci and A. Rustichini (2006) “Ambiguity Aversion, Robustness, and the Variational Representation of Preferences,” Econometrica, 74, 1447–1498.

Machina, M. J. (2009) “Risk, Ambiguity, and the Rank-Dependence Axioms,” American Economic Review, 99, 385–392.

Machina, M. J. and D. Schmeidler (1992) “A More Robust Definition of Subjective Probability,” Econometrica, 60, 745–780.

Nau, R. F. (2006) “Uncertainty Aversion with Second-Order Utilities and Probabilities,” Management Science, 52, 136–145.

Olszewski, W. B. (2007) “Preferences over Sets of Lotteries,” Review of Economic Studies, 74, 567–595.

Savage, L. J. (1954) The Foundations of Statistics, Wiley, New York.

Schmeidler, D. (1989) “Subjective Probability and Expected Utility without Additivity,” Econometrica, 57, 571–587.

Segal, U. (1990) “Two-Stage Lotteries without the Reduction Axiom,” Econometrica, 58, 349–77.

Shafer, G. (1976) A Mathematical Theory of Evidence, Princeton University Press.

Siniscalchi, M. (2009) “Vector Expected Utility and Attitudes toward Variation,” Econometrica, 77, 801–855.

Zhang, J. (2002) “Subjective Ambiguity, Expected Utility and Choquet Expected Utility,” Economic Theory, 20, 159–181.