Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
4. 198 DIFFERENCE MEASUil.EM~ Order A x A by the ordering of the lengths of chords joining pairs of points in A. Which axioms of an absolute-difference structure (Definition 8) are (4.IU) satisfied (proof or counterexample)? 19. Show that the axioms of an absolute-difference structure (Definition 8) are all necessary for the representation of Theorem 6, except for the solvability axiom. (4.1 0) Chapter 5 Probability Representations 20. Suppose that dissimilarities of the 15 pairs of six stimuli, a, b, c, d, e,J, are ranked (from 15, highest dissimilarity, to l, lowest) as in the f'bt1owing matrix. a b c d e f 7 1 4 a b c 8 5 14 ll 212 9t 6 d e 15 13 12 9t 2t Show that no absolute-difference representation is possible. Show that with only one inversion of order, a solution is possible. In the resulting representation, the distance must be more than three times as large as the cd distance. (4.10) 21. Suppose that Re 2 is ordered as follows: y) <: , y') I x - y I :;:o I x' - iff y' J. Does every pair of points in Re have a midpoint (Definition 11)? If so, what is it? Answer the same questions with respect to the lexicographic (4.!2) ordering of the plane (Exercise 6). 22. Suppose that four subjects, s, t, u, v, exhibit the following preference orderings over objects a, b, c, d, where S is subject and R is rank. iZ s u v 2 3 4 a b d c c b b c c d b a a d d a 1------ Show that no one-dimensional unfolding representation is possible. 0 5.1 INTRODUCTION The debate about what probability is and about how probabilities shall be calculated has been prolonged and is, in many respects, unresolved; nonetheless few disagreements exist about the mathematical properties of numerical probability. It is accepted that probabilities are numbers between 0 and 1 and that they are attached to entities called events. Moreover, it is widely agreed that events in a given context are subsets-although not necessarily all subsets-of a set known as a sample space. Sample spaces are intended to represent all possible observations that one might make in particular situations. The classical assumptions about events are embodied in the following definition: DEFINITION l. Suppose that X is a nonempty set (sample space) and that tff is a nonempty family of subsets of X. Then tff is an algebra of sets on X iff, for every A, BE .f?: (4.12) l. -AEtff. 2. Au BE C. Furthermore, if tff is closed under countable unions, i.e., whenever A; E tff, i = l, 2, ... , it follows that u;:l Ai E .f?, then tff is called a a-algebra on X. The elements of 0 are called events. 199 5. PROBABILITY REPRESENTATIONS 5.1. INTRODUCnON 2·01 200 words to be called an algebra of sets, g must have. the two pro:erties ln . , 1 mentation and unions. As lS shown m ectwn of bemg closed under comp ethat g JS also closed under intersections and 5.3.1 (Lemma I), it follows differences. . b f ets and that it is a subalgebra of f IS an a 1ge ra o s 0 b serve tha t l 0 , A · · g nd so X - A u -A ~very algebra on X siXnce if A iOnnCJy, ~:e~o~~:~x al~ebras ar~-usually of JS m g and 0 = IS m <o • !s mterest. S t 54 1 we shall have occasion to consider a somewhat Later, m ec wn · · ' B · · g only when weaker definition in which we are assured that A u IS m o A n B = 0 and A, B are in C.. . . 1 robability is the basis of The following axiomatic defimtwn of numenca p d r .tl by most current work in probability theory. It was first state exp ICI y Kolmogorov (1933). . t t that g is an algebra of DEFINITION 2. Suppose that X zs a nonemp y se' Th tri le t on X and that p is a junction from g znto the real numbers Be / se s C, P), is a (finitely additive) probability space iff, for every ' E . A L P(A) ~ 0. 2. P(X) = 1. If A n B = 0' then P(A u B) = P(A) 3. + P(B). It is a countably additive probability space if in addition: 5. g is a a-algebra on X. If A; E g and A; n A; p = (U i=l 0' i c/= j, i, j = l, 2, ... , then A;) = I P(A;). t=l . . f Definition 2 · the interested We do not study the surpnsmg consequenceso ' F Her (1957 ' l d book on probabihty theory, e.g,, e reader should consu t a goo D fi . . 7 as (part of) a representation 1Ion l . tead JS to treat e m 1966). 0 ur pan, ms , . . . conditions under which an ordering theorem; specifically, we mqmre m~o f p that satisfies Definition 2. > f <ff has an order-preservmg unc IOn . " r ~ o . . be interpreted empirically as meanmg qua !Obviously, the orderm; ~~to "Put another way, we shall attempt to treat tatively at least as pro a _e, as. t a measurement problem of the the assignment of probabiht!es to evens as t of e g mass or momentum. same fundamental character as the measuremehn , . ."' g of probability are . . h d b t s about t e meanm ' From this pomt of view, t e e a e l thods to determine >, It is not · 1·t b t acceptable empmca me ~ f , b bTt hould have been the focus o m rea I y, a ou evident why the measurement of pro a I I y s more philosophic contr.oversy than the measurement of mass, of length, or of any other scientifically significant attribute; but it has been. We are not suggesting that the controversies over probability have been unjustified, but merely that other controversial issues in the theory of measurement may have been neglected to a degree. To those familiar with the debate about probability (see, for example, Carnap, 1950; de Finetti, 1937; Keynes, 1921; Nagel, 1939; Savage, 1954, 1961) these last r,cmarks may seem slightly strange. Theories about the representation of a qualitative ordering of events have generally been classed as subjective (de Finetti, 1937), intuitive (Koopman, 1940a,b; 1941), or personal (Savage, 1954), with the intent of emphasizing that the ordering relation <:; may be peculiar to an individual and that he may determine it by any means at his disposal, including his personal judgment.I But these "mays" in no way preclude orderings that are determined by well-defined, public, and scientifically agreed upon procedures, such as counting relative frequencies under well-specified conditions. Even these objective procedures often contain elements of personal judgment; for example, counting the relative frequency of heads depends on our judgment that the events "heads on trial I" and "heads on trial 2" are equiprobable. This equiprobability judgment is part of a partly "objective," partly "subjective" ordering of events. Presumably, as science progresses, objective procedures will come to be developed in domains for which we now have little alternative but to accept the considered judgments of informed and experienced individuals. Ellis pointed out that the development of a probability ordering is ... analogous to that of finding a thermometric property, which ... was the first step towards devising a tern perature scale. The comparison between probability and temperature may be illuminating in other ways. The first thermometers were useful mainly for comparing atmospheric temperatures. The air thermometers of the seventeenth century, for example, were not adaptable for cc:nparing or measuring the temperatures of small solid objects. Consequently, in the early history of thermometry, there were many things which possessed temperature which could not be fitted into an objective temperature order. Similarly, then, we should not necessarily expect to find any single objective procedure capable of ordering all propositions in respect of probability, even if we assume that all propositions possess probability. Rather, we should expect there to be certain kinds of propositions that are much easier to fit into an objective probability order than others ... (Ellis, 1966, p. ! 72). Since virtually all representation theorems yield a unique probability measure, agreement about the probability ordering of an algebra of events ' Many of the relevant papers are collected together in Kyburg and Smokier (1964). 5. PROBABILITY REPRESENTATIONS 20J 5.2. A REPRESENTATION BY UNCONDITIONAL PROBABILITY 202 t bout a method to determine that ordering is or, as is more usual, agreemen a . 1 bability-an objective probability. hich two related probability sufficient to define a umque numenca pro d studied condmons un er w Various auth ors h ave th ught of as objective and the h 1uebra of events, one o p h the most interesting results are those measures on t e same a "' other subjective, must agree. er aps . 842 included in Edwards (1962); ~lso see S~ctJOn ba.bility do not appear to be d·fficulties m measunng pro d In any case, the 1 h . hen we apply extensive an 1 d""'" t from those t at anse w l inherenLy rueren 1 h In nleasuring lengths, for examp e, t thods e sew ere. d other measurernen me lativel short distances; certainly other meth~ s Y d "derable disagreement exists rods are practrcal only for re · ·cal research an consr must be used m astronomi hi h of' the alternatives is appropriate (for a among astronomers about w c f N th 1965) . · e Chapter 15 o or , · . general drscusswn, se . b !ative frequencies or by dtrect Besides the methods of order~ndg. evetnts ;h~~s of inferring the ordering. . d ts there are m tree me d . human JU gmen , . d . oifStatistics introduced an or enng , ( 19 - 4) b 0 ok The Foun atwns ' · Savage s ) . '. , . d from decisions among acts with uncertam of "personal probabthty mferre d A is an event we denote by ' . fi ·f a b are outcomes an outcomes. Bne y, 1 ' . .fA curs and b if A does not occur. aA u b __ A an act whose outcome Isna ~ill b~~ome clear in Chapter 8.) If a is (The reason for the. umon notatm b . preferred to the act aB u b_B ' then · · k pref erre d to b, an d Jf the. act a A \.) -A b IS bl than B (for the dectsJon rna er we may infer that A JS ~ore h~:idaea ~s discussed further in Section 5.2.4, whose preferences are studted). T . U lly the idea is introduced Ill where more complete references are given. t s~\o~h probability and utility conjunction with another: the rneasurernen t od utilities This was Savage's such that acts are ordered by their exhpe~ e 8 to a ~ew version of it. A d devote a separate c ap er, ' (1970) approach , an we . . .1 bl in the book by Fishburn · fine survey of work on this toptc lS ~var a d·~·ons under which an ordering of Our concern in this chapter is Wit cbo.nl. 1 1 asure Whether or not these ted by a proba 1 1ty me · h events can be represen . t blished by one or anot er ( 96. fi d by event ordenngs es a conditions are sans e . . d here· see Luce and Suppes l ), . tal ' · procedure JS· not dJscusse . expenrnen pp. 321·-327). os.2 ESENTATION BY UNCONDITIONAL A REPR PROBABILITY ;t• ons. Qualitative Probability 5 2 1 Necessary C on d• 1 · . . . h that we want to prove, we may Since we know the represen~a~r~n ~o e~::e necessary conditions on the proceed as m Chapters 3 a . . . that the representatiOn exists. assump t ron As always, if Pis to be order preserving, ?::; must be a weak ordering of t!J'. Since this property is familiar and its difficulties as an empirical law ha¥e already been discussed earlier (Section 1.3.1), albeit in other contexts, we need not repeat ourselves; slight rewording suits those comments to thili context. We next show that the certain event X is strictly more probable than the impossible event 0, the empty set, and that any event A is at least as probable as 0. Since, by Axiom 2 of Definition 2, P(X) = I and since Xu 0 = X and X n 0 = 0, it follows from Axiom 3 that 1 = P(X) = P(Xu 0) = P(X) + P(0) = 1 + P(0), and so P(0) = 0. Since Pis order preserving and P(X) = 1 > 0 = P(0), X 0. Moreover, by Axiom 1, P(A) ;:?. 0 = P( 0 ), so A <:; 0. Observe that A ,....., 0 does not imply A = 0. That is to say, nonernpty events may be equivalent in probability to the impossible event, and these null events, as they are called, correspond to sets with zero probability. We can also derive that X<:; A and a number of other properties, but it is unnecessary to state them separately because they turn out to be direct consequences of the properties that we list (see Section 5.3.1). Next, we wish to consider how the order relation and the operation of union relate. Since Axiom 3 of Definition 2 is our main tool and since it is conditional on the two events being disjoint, we may anticipate disjointness playing a role in the corresponding qualitative property. Consider three events A, B, C, with A disjoint from B and C. ff B ?::; C, then by the orderpreserving property and Axiom 3, > P(A u B) = P(A) + P(B) ? P(A) whence A u B ?::; A u C. Conversely, if A P(A) + P(B) = U + P(C) = B ?::; A P(A u C), u C, P(A u B) ? P(A u C) = P(A) + P(C), and so subtracting P(A), P(B) ;;?: P(C), whence B <:; C. Summarizing, if A n B = A n C = 0, then B ?::; C if and only if A u B ?::; A u C. Intuitively, this condition seems plausible. If B is at least as probable as C, then adjoining to each the same disjoint event should not alter the ordering, and conversely, deleting such an event also should not alter it. The primary relation between complementation and ordering is: If A ?::; B, then -B?::; -A. This is not stated as a separate axiom because it follows from the others (Lemma 4, Section 5.3.1). The final necessary condition that we derive in this section is an Archirnedean property. We first derive a special case of it. Suppose that A 1 , ... , Ai , ... 5.2. 5. PROBABILITY REPRESENTATIONS 205 A REPRESENTATION BY UNCONDITIONAL PROBABILITY 204 Furthermore, the structure is Archimedean . f . -1- · A n A ~ 0) eat:h of which . _' 2 A.~ A. are mutually disjoint events (Le., or t -,--- J, i . ' is equivalent in probability to_ some fixed even~ A, J.e., f~r t A- i~ 'a~-~~edt. If a B fi ·te induction on Axwm 2 of Defimtwn 1, Ui~1 ' _ f p;o~ab7tity measure exists, then by a fmite induction on Axwm 3 o 4. 5.2.2 Definition 2, I i=l p / 1 it follows that n cannot exceed . e if A '>- 0 th en smce ~ 1-f .P(A) > 0 , i .. , / b, t st finitely many mutually disjoint events l/P(A) Thus there can e a mo 11 :h~c~ni~u;~r:~~~~e~~:~oli~:~ A1 B1 and B1 (ii) B; n C; (iii) B; (iv) C; -~A; AH 1 = B; v C; . = r-.o = r-..; P(a) 0 ; A; ; ~~:~m~u!h:s~~~te~:~u~~~~~ s~afn~~ed l:~~~~~:~~t~:i%?~:a~~~~pl~::~ every strictly bounded standard sequence e , here since every standard sequence is bounded by X. We summarize these properties as: . is an algebra of 1y se t ' that ,ff . . DEFINI TION 4 · Suppose that X zs a,_.nonemp Th · 1 (X 0 >) zs a structure v d that > is a relation on "' · e trzp e ' ' ~ sets on A, an --: . . A B C E $· of qualitative probability if.ffor every ' ' · 3. > P(b) and {b, e} >{a, c}. (!) + P(c), P(c) + P(d) > P(a) + P(b), P(b) + P(e) > P(a) + P(c). Adding these three inequalities and canceling P(a) + P(b) P(c), we obtain P(d) + P(e) > P(a) + P(b) + P(c), whence {d, e} :>-{a, b, Therefore, if we can construct an order that simultaneously satisfies inequality {a, b, c} > {d, and the axioms of qualitative probability, then we know that this order cannot possibly be represented by a numerical probability. The trick is to find a way to avoid the tedious task of exhaustively verifying Axiom 3 for the example. Suppose we can find a probability measure P on X such that inequality (I) holds (and necessarily {d, e} >{a, b, c}) and such that all subsets A of X other than {d, e} and {a, b, c} are either strictly more probable than {d, e} or strictly less probable than {a, b, c}, i.e., not[P({d, e}) ?: P(A) ?: P({a, b, Of course the ordering induced by P automatically satisfies the Axioms of Definition 4. Now, change that order only to the extent of changing e} >{a, b, c} to {a, b, c} > {d, e}. Since no set lies between these two, their relations to all other sets are unaffected by the inversion, and so the new ordering, which cannot possibly have a probability representation, satisfies Definition 4. Therefore, we need only construct a measure P for which inequality (l) holds and no event separates {d, e} and {a, b, Let 0 < E < ±, and ignoring the normalizing factor P(X) = 16- 3<, let P(a) = 4P(d) = 2, <:;> is a weak order. X> 0 and A ;2:: 0. SupposethatAnB=A n C {c, d} :>-{a, b} + A; i~~~~~e: ~oal~o;~:s~~~:i~~ ~n~~~o:rt ;;:~~i~~ 1~ 2. The Nonsufficiency of Qualitative Probability If an order-preserving representation P exists, then We see that if event A; then 1t follows that Ai+I q . . . t A-corresponds . l t t A and of C which IS eqmva1en 10 turn is eqmva en o. ; , h; .' . sible inductive definition which to i + 1 disjoint copies of A. So t IS IS a send . P(A ) - z·P(A) we must · .· · s· by m uctwn · ;t - A >, 0 Note generalizes our ongmal notiOn. mce 1 l. 0, any standard sequence relative to A is finite. {a} >{b, c}, -"' . l b a + sets and that ,.._, is an DEFINITION 3. Suppose that (0 ts an age r 0'J • d d A A where A . E $ zs a stan ar equivalence relation on $. A sequence 1 , ... , ;h, ... , . t B' C- ~ $ such that: . to A E g lifffior i -- 1' 2 , ... , t ere exzs ;, ' sequence relative (i) :>- The question we now ask and answer in the negative is: Does every (finite) structure of qualitative probability have a finitely additive, order-preserving probability representation? The following ingenious counterexample is due to Kraft, Pratt, and Seidenberg (1959). Let X= {a, b, c, d, e} and let 6 be all subsets of X. Consider any order for which P(A;) = nP(A). each of which is equivalent infprhobability ttyo need a slightly stronger form o t IS proper For every A iff = 0 . Then B .2:; C iff A v B .2: A v C. E, P(b) = 1 - E, P(c) = 3 - E, P(e) = 6. Using additivity, it is easy to verify that inequality (1) holds, that P({d, e}) ~·· 8, 206 5. PROBABILITY REPRESENTATIONS 5.2. and that P({a, b, c}) = 8 - 3E. Since 3E < 1, the only p~ssibility for finding a distinct set A such that P({d, e}) ~ P(A) ~ P({a, b, c}) 1s for P(A) to be of t h e f orm 8 - ZE · 1· -- 0 , 1, 2, 3 • However, the number 8 can be expressed as a sum of distin;t elements from {1, 2, 3, 4, 6} in only two ways, 1 + 3 + 4 or 2 + 6. These correspond to {a, b, c} and {d, e}, respectively. T~us, n~ A distinct from those two sets can have probability between 7 and 8 mclus1ve. This completes the example. So further properties are needed in order to construct a representation. 5.2.3 Quite a number of conditions are known which, along ';ith the properties of qualitative probability (Definition 4), are e~ch sufficient to prove the existence of an additive probability representatiOn. As .would b~ expected, each condition postulates the existence of events With c~rtam (strong) properties, and therefore limits the algebras of sets ~or which th~ theory establishes a representation. We present three conditiOns here which lead to unique, finitely additive measures, another in s.ection .5.4.2 which_leads to a unique, countably additive measure, and a fifth m Sectwn 5.4.3 which leads to a unique measure when X is finite. Suppose (X, C, ;;::;> is a structure of qualitative probability. and A B, then there exists a partition {Cl ,... , Cn} of X such B U C;. that, for i = 1, ... , n, C; E C and A If A, BE ,ff > 201 THEOREM 1. Suppose that {X, C, :;::;) is a structure of qualitative probabil· ity. Axiom 5" is true iff the structure is both fine and tight, This is Savage's Theorem 4, p. 38. Although Savage's axiom is, i:n the presence of the axioms for qualitative probability, actually stronger than the next condition, which was invoked both by de Finetti (1937) and Koopman (1940a,b), Savage felt that his was the easier to justify intuitively. AXIOM 5'. Suppose that (X, C, ;;::;> is a structure of qualitative probability. For every positive integer n, there exists a partition C1 , ... , Cn of X such that, for i,j = 1, ... , n, C; E C and C; ,...._, C1 • Sufficient Conditions AXIOM 5". A REPRESENTATION BY UNCONDITIONAL PROBABILITY > Savage (1954) pointed out that if tff does not seem to ~u~fi~l Axiom ~", one can always enlarge it to an algebra that does by adJommg ~ll fimte sequences of heads and tails generated by a coin of one's own choice. For any n, this leads to a partition into 2n events, and, as he sa1d, It seems to me that you could easily choose such a coin and choose n sufficiently large 50 that you would continue to prefer to stake your gam on A, rather than o n the union of B and any particular sequence of n heads and tails. For you to be ab 1e to do so, you need by no means consider every sequence of heads and tails equally probable (Savage, 1954, p. 38). Savage established a reformulation of his axiom which is interesting ~nd which we shall need again in Section 5.4.2. Suppose that (X, C, ;;::;> IS a system of qualitative probability. It is called fine if, for ever~ A 0, there · t S a part.t. > C; for 1 = 1, ... , n._ For eXJS I lOll {C1 , .•. , C} n of X such that A ~ A Bin ,ff define A ,-...,* B iffor all C, D 0 such that All C = B_ 1\ D- 0, then A ~ c;;::; B and BuD;;::; A. The system is called tight 1f whenever A~* B, then A ,_._,B. > > If C includes all finite sequences of heads and tails generated by what you believe to be independent tosses of a fair coin, i.e., a head is just as probable as a tail, then Axiom 5' must be fulfilled for you. A common drawback of both axioms is that they force X to be infinite-the illustrative coins suggest why. The following condition, which is due to Luce (1967), is fulfilled in many infinite structures and in some finite ones, although not in most. AXIOM 5. Suppose that (X, tff, ;;::;> is a structure of qualitative probability. If A, B, C, DE C are such that A n B = 0, A > C, and B ;;::; D, then there exist C', D', E E tff such that: (i) (ii) (iii) (iv) E ""Au B; C' n D' = 0; E=:J C' u D'; C' ~ C and D' ,...._,D. It is difficult to say just what this means other than to restate it in words. If A and B are disjoint, if A is more probable than C, and if B is at least as probable as D, then the axiom asserts that somewhere in the structure there is an event E that is equivalent in probability to A u B such that E includes disjoint subsets C' and D' that are equivalent, respectively, to C and D. This axiom can be satisfied by some finite probability spaces, e.g., let X = {a, b, c, d}, let P(a) = P(b) = P(c) = 0.2, P(d) = 0.4, and ,ff is all subsets. In some sense, however, most finite structures that have a probability representation violate the axiom, e.g., any structure whose equivalence classes fail to form a single standard sequence. As was mentioned, Savage showed that, in the presence of the axioms for qualitative probability, Axiom 5" is strictly stronger than 5', and under the same conditions Luce (1967) showed that 5" is strictly stronger than 5. Fine (I97la,b) used a still weaker structural condition-the existence, for every n, of ann-fold "almost uniform" partition--in conjunction with the (necessary) 208 5. PROBABILITY REPRESENTATIONS condition that the order topology on rff have a countable base. None of these axiom systems permits finite models, so none is weaker than Axioms 1-5. Fine (1971a) proved that all are strictly stronger than Axioms 1-5. Assuming that Axioms 1-5 are weaker than the other systems that have been mentioned and that adding the (necessary) Archimedean condition to those of qualitative probability is not objectionable, the appropriate result to prove is the following. THEOREM 2. Suppose that (X, rff, ;2::) is an Archimedean structure of qualitative probability for which Axiom 5 holds, then there exists a unique order-preserving function P from t!f' into the unit interval [0, 1] such that (X, rff, P) is a finitely additive probability space. The proof, which is given in Section 5.3.2, involves reducing this structure to one of extensive measurement. It is evident that the crucial property, if A n B = 0, then P(A u B) = P(A) + P(B), is formally similar to the corresponding one in extensive measurement provided that U is treated as a concatenation operation o. Clearly the condition A n B = 0 will have to be rephrased as a restriction on the concatenation operation of the form: (A, B) in fJiJ, where :!1J C tff X C. In this connection, it should be noted that an ordering of an algebra of sets can be interpreted as a type of extensive measurement provided that the sets are collections of objects rather than events. Such an interpretation is especially natural for masses, in which case any collection of objects placed on one pan of a balance is an element of the algebra. Such an interpretation is less natural, though nonetheless possible, for length measurement. Observe that when we take one of the primitives to be an algebra of sets, we automatically assume the commutativity and associativity of concatenation, since both properties hold for unions of sets; these properties had to be listed more or less explicitly as axioms in the theories of Chapter 3. This is, of course, a general mathematical phenomena: the more structured the primitives, the weaker the axiomatic system need be in order to describe a given structure. 5.2.4 Preference Axioms for Qualitative Probability If the ordering;?:; on tff is inferred from choices among acts, then the axioms of qualitative probability can be reformulated as assumptions concerning observable choices. For simplicity, we restrict attention to acts that have either two or three uncertain outcomes. Let rff be an algebra of sets on X and '1f! a set of outcomes or consequences; let a A u bE u Ce be the act that has consequence a if A occurs, b if B occurs, 5.2. A REPRESENTATION BY UNCONDITIONAL PROBABILITY 209 and c if C occurs, where it is understood that a, b, c are in '1f/, A, B, C are in tff', and A, B, C partition X. Denote the preference relation as ;2:: *. This is a weak ordering of both '1f! and of the set Ot of all acts, i.e., formally, a subset of ('1f! X '1f/) u (Ot X Ot). We do not need to assume that a consequence and an act are compared. We also identify acts that have the same event-outcome interpretation, e.g., permutations do not matter, aA u bE u ce = bE u ce u aA , and a three-outcome act with two identical outcomes reduces to a two-outcome act, aA U bE u be = aA u baue = aA U b_A . We define a relation ;2:: on tff as follows: A ;2:: B ifffor all a, bE '1f/, if a ;2::;* b, then aA u b_A ;2::;* aE u b_ 8 . The following three axioms are sufficient (in the presence of the assumptions made informally above) to make (X, rff, ;2::) a structure of qualitative probability: !. !fa>* b, c ;2::* d, and aA u b_A ;2::;* as u b_ 8 CA U d_A ;2::;* C8 U , then d_E. 2. If a>* b, 3. lfaA U ba U be ;2::;* aA· U bE U be·, then then ax u b"' >* aq, u bx and aA U b_A ;2::;* a'" u bx. aA u aE U be ;2::;* aA· U aE U be· . We also need to assume that ;2::* is nontrivial on '1f!, i.e., there exist a, b with a >*b. The first axiom implies that ;2:: is connected: for take a >* b, then for any A, B, the connectedness of ;2::* yields some relation between aA u b_A and as U b_a , and this same relation then holds with c, d substituted for a b provided that c ;2:: * d. Transitivity of ;2:: follows from transitivity of ;2:: s~ < ;(:;is a weak order. The second axiom obviously implies X > 0 and A ;2:: 0. The third axiom is based on the idea that if two acts have a common consequence b when B occurs, then the contingency bE is irrelevant to the decision between them, and so b can just as well be replaced by a. The axiom only assumes this in a very restricted situation: when the other two outcomes are b and a. The general principle, which this axiom exemplifies, is called the extended sure-thing principle. It leads to an expected utility representation (see Chapter 8, Savage's book, or Fishburn's book, referred to above). In this highly restricted form, however, it is already sufficient to yield Axiom 3 of qualitative probability: If A is disjoint from B u C, then B ;2:: C if and only if A u B ;2:: A u C. The proof is fairly obvious: Letting a >* b, B ;2:: C translates into aB U bA U b_(AUE) <=* Ge U bAU b--(Aue) 210 5. PROBABILITY REPRESENTATIONS while A u B ?:; A u C translates into Thus the axiom (which is equivalent to its own converse) gives the desired result. Axioms 1 and 3 are highly interesting propositions about choices. If Axiom 1 is violated, there would seem to be some kind of special interaction between outcomes and events, e.g., such that even though a >* b, c ?:; * d, and aA u b_A ?:;* aB u b_B, the combination cA is somehow much less valuable than cB . Violations of the extended sure-thing principle have been much discussed (see Chapter 8). From the present standpoint, the interesting thing to ~~t~ is the light that Axiom 3 sheds on the principle of additivity of probabilities. If a>* band A ?:; A', then we must have aA U bB U be?:;* aA· U bB U 5.3. PROOFS 211 The objective probability of winning a is now two-thirds in each case, but the tables are turned. In the first scheme, the probability could be as low as one-third, depending on the coin toss, but in the second scheme, the probability is surely two-thirds. The point is that black adds a great deal more unambiguity to white than it does to red. Only if such considerations play no role-either because of the constancy of ambiguity over events or because of sophisticated decision-makers-may we expect the probability ordering inferred from decisions under uncertainty to lead to an additive representation. Other discussions of the inference of a qualitative probability from decisions are found in Section 8.7.1 and in Anscombe and Aumann (1963), Davidson and Suppes (1956), Edwards (1962), Fishburn (1967g), Pratt, Raiffa, and Schlaifer (1964), Ramsey (1931), Savage (1954), and Suppes (1956). he·. But if, somehow, B adds more to A' than it does to A, we could have the reversal, a A· u aB u be· >*aAu aB U be. An example of a possible violation is offered by the following game (adapting an idea of Ellsberg, 1961). Consider three urns, containing 200 white balls, 200 black balls, and 100 red balls, respectively. One of the first two urns is selected by tossing a fair coin, and without informing the player which one was selected, white or black, its 200 balls are thoroughly mixed with the 100 red balls. Then a single ball is drawn from the mixture, and a prize is awarded or not depending on its color. Let a denote a valuable prize and b denote no prize. The player may prefer the payoff scheme 5.3 5.3.1 PROOFS Preliminary Lemmas LEMMA 1. If rff is an algebra of sets on X (Definition I, Section 5.1), then 0, X E rff and rff is closed under set difference and intersection. If tff is a a-algebra, then it is closed under countable intersections. PROOF. This is a standard result, easily proved using the formulas X= Au -A, An B =-(-Au -B), etc. 0 The common hypotheses of Lemmas 2-4 is that <X, rff, ?:;) is a structure of qualitative probability (Definition 4, Section 5.2.1) and that A, B, C, DE rff. ared U bblack U bwhite to the scheme awhite U bblack U bred · In each case, the objective probability of winning the prize a is exactly onethird. The difference is that in the first case, he knows there are one-third red balls, whereas, in the second case, depending on the coin toss, there may be no white balls at alL Now suppose that we change b to a on black, i.e., compare ared U ablack U bwhite to awhite u abiack u bred . LEMMA 2. Suppose that A n B = C n D = 0. If A ?:; C and B ?:; D, then A u B ?:; C u D; moreover, if either hypothesis is >, then the conclusion is>. PROOF. Let A'= A-D, D' = D- A. Note that A', D' are in C (Lemma 1) and that A' n B =A' n D =AnD'= C n D' = and that A' u D = A u D'. Using Axiom 3 twice, A' u B ?:; A' u D = Au D' ?:; CuD'. 0, 5. 212 PROBABILITY REPRESENTATIONS If either hypothesis is strict, then by the "if" part of Axiom 3, A' uB >CuD'. Observe that A n D is disjoint from both A' u B and C u D'; thus, by Axiom 3, Au B and > = (A' u B) u (A n D) ?::; (CuD') u (A n D) = CuD; 0 holds if A' u B >CuD'. COROLLARY. If An B AuB~Cu D. LEMMA 3. Cn D = = A ~ C, and B ~ D, 0, then If A :J B, then A ?::; B. pROOF. Use Axiom 3 on A - B ?::; 0 (Axiom 2), noting that B is disjoint from A Band from 0 and that (A - B) u B = A. () COROLLARY 1. X?::; A. COROLLARY 2. If A ':J B, then A> B iff A- B LEMMA 4. > 0 · If (X, tff, ?::;) is an Archimedean structure of qualitative prob~~ility for which Axiom 5 is true, then it has a unique, finitely additive probabzl!ty representation. PROOF. Let A denote the equivalence class that includes A, and let ,g be the set of all equivalence classes, excluding l'lJ. By Axiom 2, X E tff. If X is the only element of tff, then we let P(A) = 0 if A ,..__, 0, P(A) = 1 1f A ~X, and p fulfills the assertions of the theorem (note that (lJ IS closed under disjoint union, by Lemma 2). We assume, therefore, that ,g contains A c:F X. We construct an extensive structure on ,g letting = {(A, B) A 1 A'EA, A v B, if A n B = 0. By the corollary of Lemma 2, o is well defined. We let ~ be the induced simple order on C. We shall now prove that (<ff, ~, fllJ, o) is an extensive structure with no essential maximum (Definition 3.3, Section 3.4.3). The six axioms are established in corresponding numbered paragraphs. It is convenient in some places to use boldface Greek capitals for elements of tff. ~ is a weak order; in fact, it is a simple order. Positivity follows from Corollary 2 of Lemma 3. 2. Associativity. Suppose that (r, Ll), (roLl, A) E fllJ. The essence of the proof is to construct A, B, C pairwise disjoint such that A = r, B =Ll, C = A. For then, the rest will follow via the associativity of the union operation. The construction is as follows. Take disjoint sets D' E r o Ll, C' E A. By positivity D' > Ll, thus, by Axiom 5 we can take E ,...._, D' v C' and B, C disjoint with E :J B u C and BE A, C E A. Then let A = E- (B u C). It remains only to show that A E r; but since (A o Ll) o A = E = (r o A) o A, this follows from Lemma 2, applied twice. 3. We need to show that if (A, C) E fllJ and A~ B, then (C, B) E fllJ; the rest follows by the obvious commutativity of o and by Lemma 2. If A = B, there is nothing to show; if A > B, apply Axiom 5. 4. Solvability. If A> B, apply Axiom 5 to A B and 0 ~ 0 to obtain A', B' with A' :J B', A-~ A', B ~ B', and let C =A'- B'; then A=BoC. 6. Finally, we show that {n I B > nA} is finite, for A E C, where lA = A, nA = (n - l) A o A. We do this by showing that the existence of nA implies 0. For suppose the existence of an n-term standard sequence relative to A that nA exists and that we have a k-term standard sequence relative to A, A 1 , ... , Ak, with k < n, and with A, E iA for i ~ k. Since (kA, A) E fJJ, we have Bk E kA and ck E A such that Bk n ck = 0. Let Ak+l = Bk u ck Then clearly, A1 , ... , Ak+1 is a (k + 1)-term standard sequence relative to A, and A1c+1 E (k + 1) A. Since a !-term standard sequence can be obtained by A 1 = A, we have the required result by induction. 1. 5. > Theorem 2 (p. 208) fllJ 213 PROOFS > If A?::; B, then --B?::; -A. PROOF. If A?::; B, -A> -B, then by Lemma 2, X> X, contradicting Axiom 1. 0 5.3.2 5.3. > 0, B B'EB > 0, with ~ Now, by Theorem 3.3 (Section 3.4.3), there is a positive-valued ratio scale on ,g such that A~B for and there exist (A, B) Choose the unit so rp(X) A'nB'= > Thus fllJ is nonempty because if A c:ft both X and l'lJ , then -A 0 (Lemma so the pair (A, -A) E fllJ. We define o on B by letting A o B = rp(A) ;;? rp(B), iff and E = P(A) fllJ, rp(A o B) L For A = lrp(A), 0, E = rp(A) tff, let if A> 0, if A~ 0. + rp(B). 5. 214 PROBABILITY REPRESENTATIONS It is easy to see that P fulfills the assertions of the the~rem. Moreover p is unique, since if another such function P' existed, then </;'(A) = P'(A) ;ould be a representatiOn of (tff, ~, PA, o). Since¢;' = a,P and ,P'(X) = </;(X)= 1 </>' = </; and P' = P. 0 5.4 MODIFICATIONS OF THE AXIOM SYSTEM 5.4.1 QM-Algebra of Sets The notions of event and probability given in Definitions I and 2 have proved satisfactory for almost all scientific purposes. The one outstanding excepuon JS quantum mechanics. In that theory both P(A) and P(B) may exist and yet P(A n B) need not. For example, suppose that A is the event that, at some time t, a certain elementary particle is located in some region a of space and B is the event that, at the same time t, this same particle has a momentum whose value lies in some interval {3. Then the event A n B means that, at time t, the particle is both in the region a and has a momentum in {3. One basic feature of quantum mechanics is that we may be able to observe whether event A occurred or whether event B occurred, but not necessarily be able to observe whether both events A and B, i.e., event A n B, occurred. In the theory this means that no probability is assignable to A n Beven when P(A) and P(B) are specified. Such an incompatibility between two very fundamental sets of ideas must be resolved, presumably by modifying probability theory in some way. Those interested in a full understanding of the physical issues involved and in some of the proposed modifications of both classical logic and probability theory should consult Birkhoff and von Neumann (1936), Kochen and Specker (1965), Reichenbach (1944), Suppes (1961, 1963, 1965, 1966), and Varadarajan (1962). Suffice it to say that the simplest proposal (Suppes, 1966) is to restrict further our notion of an algebra of sets (Definition l) to: DE~INITION 5. .suppose that X is a nonempty set and that Cis a nonempty famzly of subsets of X. Then C is a QM-algebra of sets on X iff, for every A, BE C: 1. -AEC; 2. If A n B 5.4. 0, then A u BE C. Furthermore, if C is closed under countable unions of mutually disjoint sets, then C is called a QM a-algebra . Since the definition is still stated in terms of unions, and the difficulty 215 seemed to be about intersections, it is not clear that we have coped with the problem; however, as is shown in Lemma 5 (Section 5.5.1), all is well in the sense that when A, Bare in C, An B is in C if and only if A u B is in C, and if A :) B, then A - B is in C. As a simple example, if (X, C, P) is a finitely additive probability space, then for any A in ·C, the set of all events probabilistically independent of A is a QM-algebra of sets on X; see Section 5.8. What is not clear about this suggestion is just how badly probability theory suffers. For one thing, how many of the important theorems of probability theory cannot now be proved ? This is not the place to enter into that discussion. For another, to what extent have we impaired our ability to construct a numerical probability representation from a qualitative ordering ? If one carefully reexamines Axioms l-4 of Definition 4 and Axiom 5 of Section 5.2.3, the proofs of the lemmas used in the proof of Theorem 2, and the proof of that theorem, then one finds that, with the exception of Lemmas 1 and 2, all unions are of disjoint sets and that all differences are of one set that includes another. As we mentioned above, Lemma 5 (Section 5.5.1) replaces Lemma 1. As for Lemma 2, we note that only disjoint unions enter into its statement, that it is a necessary condition for the representation (the argument for necessity is very similar to the argument for Axiom 3), and that it is strictly stronger than Axiom 3. Therefore, we adopt it as Axiom 3', in place of Axiom 3. AXIOM 3'. Suppose that A n B = C n D = 0. If A ;?:; C and B;?:; D, then A u B ;?:; C U D; moreover, if either hypothesis is)-, then the conclusion is)-. We shall encounter this axiom again in the treatment of qualitative conditional probability (Sections 5.6 and 5.7). Except for the proof of Lemma 5, we have established the following: THEOREM 3. If Cis a QM-algebra of sets on X and if (X, tff', ;?:;) satisfies Axioms 1, 2, 3', 4, and 5, then there is a unique function P on C that satisfies the Kolmogorov axioms (Definition 2) and preserves the order ;?:; on C. 5.4.2 = MODIFICATIONS OF THE AXIOM SYSTEM Countable Additivity Theorem 2, and others like it, establish conditions under which finite additivity hold, but nothing has been said about countable additivity (see Definition 2), which is needed in many parts of probability theory. Countably additive representations of qualitative probability structures have been studied by Villegas (1964, 1967) and Fine (l97la). The main result (from our 216 5. PROBABILITY REPRESENTATIONS 5<5. PROOFS 217 point of view) is embodied in the following definition and theorem. The definition follows Villegas (1964). DEFINITION 6. Suppose that (X, <ff, is a structure of qualitative probability and that ,ff is a a-algebra. We say that <:; is monotonically continuous on ,ff if/for any sequence A1 , A 2 , ••• in <ff and any B in !%', if Ai C Ai+l and B <:; A;/or all i, then B <:; U~1 Ai. THEOREM 4. A finitely additive probability representation of a structure of qualitative probability, on a a-algebra, is countably additive iff the structure is monotonically continuous. This theorem is proved in Section 5.5.2. An immediate corollary is that Theorem 2 gives a set of sufficient conditions for a countably additive probability representation for a binary relation<:; on a a-algebra: Axioms 1-5 and monotone continuity. Given Axioms 1-·3 (qualitative probability) and monotone continuity, Axioms 4 and 5 can be dispensed with if a different structural condition is used. To formulate the condition, we need a qualitative definition of a probability atom. The numerical definition is that A is an atom if P(A) > 0 and if A -:J B, then P(B) = P(A) or P(B) = 0. Correspondmg to this, we have: DEFINITION 7. Let event A E t' is an atom > be a weak ordering and for any iff A > 0 BE an algebra of sets t'< An 1%', if A -:J B, then A ~ B orB~ .0 The structural condition which replaces Axioms 4 and 5 is that ,g has no atoms. Recall the definitions of fine and tight (Section 5.2.3). THEOREM 5< Suppose that (X, !%', is a structure of qualitative probability, ,ff is a a-algebra, <:; is monotonically continuous on !%',and there are no atoms. Then the structure is both fine and tight, there is a unique orderpreserving probability, and it is countably additive. single finite standard sequence. To do so, we replace the structural restriction (Axiom 5) by some sort of equal-spacing axiom. In the case of probability structures, it is impossible for a single standard sequence to ex~aust ,g- { For if A 1 , A 2 , .. ,,An is a standard sequence, then by Defimtwn 3, we have A 2 ~ B 1 u C1 , where B1 , C1 are disjoint and both are ~AI; thus, either B1 or C1 must be outside the sequence. The most that one can expect is for a single standard sequence to exhaust the set of nonnull equivalence classes iff. However, there are a great variety of structures of ,g compatible with a single standard sequence in 6', e.g., any finite structure sat1sfymg Axwm 5. Perhaps the simplest structure is one with a finite number of equivalent atoms-corresponding to a uniform distribution on a finite set of categories. This structure can be characterized by the following axiom: AXIOM 6. If A <:; B, then there exists C E C such that A~ B u C. This axiom, and the following theorem, are due to Suppes (1969, pp. 5-8). THEOREM 6. Suppose that (X, <ff, :C:> is a structure of qualitative probabtlzty (Axwms l-3 of Definition 4) such that X is finite and Axiom 6 holds. Then there exists a unique order-preserving function P on ,ff such that (X, <ff', P) zs afimte/y additive probability space. Moreover, all atoms in ,ff are equivalent. The proof is found in Section 5.5.3. Since any event in a finite probability space is a union of disjoint atoms and a null event, the probabilities are all multiples of the atomic probability. 5.5 PROOFS 5.5.1 Structure of QM-Algebras of Sets LEMMA 5. Suppose that then for all A, BE ,ff; For the proof of this and other related results, we refer to Villegas (:964). This structural condition turns out, of course, to be a specml case of Axwm 5 (given a-algebras and monotone continuity), since a structure of qualitative probability that is both fine and tight satisfies Axiom 5 (see SectiOn 5.2.3). (iii) 5.4.3 Finite Probability Structures with Equivalent Atoms PROOF. As in the chapters on extensive measurement and difference measurement, it is possible here to give a separate and elementary axiomatization of a (i) (ii) and X 0 E ,ff is a QM-algebra of sets on X (Definition 5), <ff; if A -:J B, then A - BE <ff; A B U E ,ff (i) iff A n B E 6". The proof is the same as Lemma 1, since A n -A (iii) If A u BE <ff, then by part (ii), A - B = 0. = (ii) Since A -:J B implies -A n B = 0 and since -A it follows that -A u BE ,ff, So A - B = -(-A u B) E 1%'. E ,g ' (A u B) - Band 5. 218 B - A = (A u B) - A E PROBABILITY REPRESENTATIONS tff. Thus, the symmetric difference, (A - B) u (B - A) E 5.5. PROOFS 219 For each k, Bk C Bk+I , and 00 tff; I P(Bk) ,s;_ P(A;) i=l and again, by part (ii), the intersection, which is the union minus the symmetric difference, is in tff. Conversely, if the intersection is in tff, then A - B = A - (A n B) and B - A = B - (A n B) E tff by part (ii), so A u B, which is the disjoint union of A n B, A - B, and B - A, is in tff. 0 ,s;_ p (U Ai) -- P(Am) ~=1 5.5.2 Theorem 4 (p. 216) P(B). = Suppose that a structure of qualitative probability <X, tff, :?::;) has an orderpreserving probability measure P and that tff is a a-algebra. Then <X, tff, P) is countably additive iff:?::; is monotonically continuous on tff. PROOF. If Pis countably additive, let {A;} satisfy A; C Ai+1 and A; By countable additivity, Thus, B :?::; Bk . But since Am U Bk ;:5 B. Since the partial sums of the right-hand expression are P(Ai+I), they are bounded by P(B), and so P(U::r Ai) ~ P(B), implying B :?::; Ai . For the converse, note first that by finite additivity alone, are pairwise disjoint, then k=l > 0, we have "' = U A; = B u Am Thus, monotone continuity is contradicted. The other case is where no such Am exists. Then clearly, P(A;) = 0 for all but fimtely many i. Let J be the subset of integers i for which P(A) = 0 and Jet ' ' A= UA;. iEJ By finite additivity, I P(A;) ~ p (U A;). i=l i=l =I P(A;) This is true because each partial sum on the left is P(U~~r A;) ~ P(U;':1 Ai). Suppose that for some disjoint union, strict inequality holds, in violation of countable additivity, i.e., let p ( u Ai) - i=l I P(Ai) i¢J w = I P(Ai) i=l = E > 0. t=l We consider two cases. First, suppose that for some Am in the sequence, E ?:; P(Am) > Therefore, P(A) = E. Let ck 0. = U A;. iEJ Let i<_k k Bk = U A;, i=l B = (U A;)- Am· t=l > B. i=l Then P(Ck) = 0 by finite additivity; so we have 5. PROBABILITY REPRESENTATIONS 5.6. A REPRESENTATION BY CONDITIONAL PROBABILITY 221 220 but 0 violating monotone continuity. 5.5.3 Theorem 6 P(A (p. 217) . . b b ·1· fi hich X is .finite and ,g o~" quahtatzve pro a t zty or w . A structure , ' 'J b bT entatwn · which satisfies Axiom 6 (Section 5A.3) has a unique pro a I zty repres ' <X moreover, all atoms are equivalent. X' ,ff' >') in which the nonempty null PROOF. Define a new structure < ' '~ N ~ ut . . d (If N 15 . the union of all null events, 0' p , N ·ff A >B) It is events are ehmmate · ' ~ - N rf!' = {A - N I A E A - N;?: B t ~ • • X - X ,h t ,ff' also satisfies the hypotheses of the theorem, easy to show t a ne~pty, event in rf!' is expressible in a unique way as the moreover, any no . . . union of finitely many pairwise diSJOillt atoms. ( uch exists by the finiteness L A be a minimal atom With respect to s . et l J be the set of distinct atoms eqmvalent to Al. of and et ,... , . rf!' If there are others, let A be mmiinal We show there are no other atoms m among the atoms >' Al . Define B and C by B C =Au (A2 u ... u = A1 u u · · u An)- - = A . B Axiom 3, B >'C. By Axiom 6, thereexists (If n 1, Bh ~ A,BC~· C ~)DyWith no loss in generality, take D dtsJomt D E rf! sue t at _D A· but then from C. Since D contains no mimmal atoms, , =: B ~· CuD ;?:' C u A = B u Al a contradiction. Therefore Now construct 0 on rff >' B, is the set of all atoms. ~~~~'ing 0(A) be the cardinality of the set {Ai I Ai C A-- Let P(A) = of the elements of the event B-in other words, when we know that the event B occurred. Thus, the outcome must be in both A and B, that is, in An B. This immediately suggests P(A n B)= P(A I B); however, this cannot be correct since, given B, either A or --A must certainly occur so P(A I B)+ P( -A I B)= l, whereas, P(A n B)+ P( --A n B)= P(B). However, if we set p(A)jn. [t is easy to verify that p is an order-preserving prob; bility and that P is unique. A REPRESENTATION BY CONDITIONAL PROBABILITY Most work in the theory of probability requires the import;~:b~~~n~ ( n of conditional probability. Intuitively, P(A I B) !S the pr . y :~e~~vent A when we have the added information that the outcome JS one = P(A n B)/P(B), (2) all is well, provided that P(B) > 0. This is the accepted definition of conditional probability. Because this concept is so important and because ordinary probability is the special case of it in which B = X, several authors have treated it as the basic notion to be axiomatized and have given generalizations of Kolmogorov's axioms (Definition 2). Copeland (1941, 1956) and Renyi (1955) are of particular interest. Csaszar (I 955) studied conditions under which a real-valued function of two set-valued arguments, which is what conditional probability is, can be expressed in the quotient form of Equation (2) in terms of a function of one set-valued argument, which is what unconditional probability is. Our concerns are somewhat different. We wish to know conditions under which a qualitative relation of the form A j B;?: C I D, meaning "A given B is qualitatively at least as probable as C given D," can be represented by a probability measure of the form of Equation in the sense that AjB;?:CID iff P(A n B)/P(B) ?o P(C n D)/P(D). To our knowledge, the only published attempts to do this are Koopman (1940 a,b), Aczel (1961, 1966, p. 319), Luce (1968), and Domotor (1969). Domoter gave necessary and sufficient conditions in the finite case similar to the axiom systems in Chapter 9. The other three systems have much in common. Koopman treated 0 as the only null event and did not assume that every pair A i B and C I D are necessarily comparable; whereas Aczel and Luce admitted other null events and did require comparability of all pairs. The lack of comparability forced Koopman to introduce some extra axioms. Aczel and Luce postulated an additivity requirement, similar to Axiom 3 of Definition 4; whereas, Koopman invoked the property that if A I B <:; C j D, then -C j D <:;--A B, to serve an analogous function. The major differences lie in the choice of sufficient conditions and in the proofs. Aczei postulated a real-valued function on the pairs A I B, which gives the ordering, and imposed con~inuity conditions. His proof used results in functional equations. Koopman used a kind of uniform-partition postulate and constructed the probability function a process. Luce, whose necessary 1 5.6 I B) 5. 222 223 5.6. A REPRESENTATION BY CONDITIONAL PROBABILITY PROBABILITY REPRESENTATIONS The structure is Archimedean provided that: conditions were essentially the same as those of Aczel, reduced the result to known results in measurement theory, using a solvability axiom. He also used a functional-equation argument. Here, we present a modification of Luce's system. In addition to the above studies, Copeland (1956) stated a system of axioms that is a good deal more transparent than Koopman's, but he did not construct a numerical representation. Moreover, he assumed that his basic system of elements is closed under the conditioning operation, which is to say, if A and Bare events, where B is nonnull (see below), then A I B is also treated as an event. This seems a trifle odd intuitively, although, as a matter of fact, our structural restriction entails that every A I B is equivalent in probability to some event. 5.6.1 Necessary Conditions: Qualitative Conditional Probability At first glance, one might expect to begin with an algebra of sets rff and a relation ?:; on rff x rff'; however, the proviso that P(B) > 0 in Equation (2) makes it clear that matters are not quite this simple. Events with probability 0-~null events--must be excluded from the conditioning position. Such events form a subset .Y of rff, and the relation <; is on tff X (rff' --..¥).We use the suggestive notion A I B for a typical element of rff X (tff- ..¥). Since we are familiar with the general procedure for finding necessary conditions, we first state the axioms in the form of a definition and then discuss those that are novel to this context. DEFINITION 8. Suppose that X is a nonempty set, rff is an algebra of sets on X, .Y is a subset of rff, and <: is a binary relation on rff X ( rff - ..¥). The is a structure of qualitative conditional probability quadruple (X, tff, Jil', iff for every A, B, C, A', B', and C' E tff (orE rff- ..¥,whenever the symbol appears to the right of I) the following six axioms hold: 1. (rff' X (rff'- ?:;) is a weak order. Every standard sequence is finite where {A} i 7. t d d iff for all i, Ai E ,g- .Y' A t+r :1 A ~' a~d X I X>' AslaAs an ar sequence i i+l ,....._, A1 j A2 . Of course, as in all our axiom systems, the choice of necessary axioms depends somewhat o~ :he nature of the nonnecessary structural restrictions. . In the case of Defimtwn 4 (qualitative probability) the thr extrem ly t dh , ee axwms are b ~ . na ura1an ave long been ~ccepted as the definition of qualitative ~ro ability. We added only the Archiinedean axiom and a structural restric~on. Here, we must be somewhat more tentative about the choice of axioms. .e~m~s 9-12 (Sect:on 5.7.1) and their corollaries are also necessary conditions, they are denved from Axioms 1-6 only with the h l fth axiom int d db ePo e structural ro uce ~1ow. Some of these properties could well be included in the concept of qua!ItatJVe conditional probability. Indeed in Section 5 6 3 we shall relabel Lemma 12 as Axiom 6' and use it inste:d of Axiom the of a nonadditive conditional probability m (SectiOn 5.6.4). representation 6: ~evelopment So~e properties, which it would seem natural to include in qualitative conditiO. nal probability, follow from Axioms 1-6 WI.thout any structural res t nctwn. For example, Axiom 3 could be expanded into XJX?:;AIB?:; 0jX, but this can, be proved (Corollary 1 to Lemma 6). . The proOI that Axioms l-5 are necessary for the desired representation ~~ quite ~asy. Note, fo.r example, that P(A I B) c= P(A n B I B), which YAie~ds Axwm 4. ~~e denvatwn of Axiom 5 is similar to that of its analog x1om 3 of Defimtwn 4. ' To show that Axiom 6 is necessary, note that if B I A<: C' I B', with A :1 B and B' :1 C', then P(B)/P(A) ;? P(C')/P(B'). Similarly, C I B?:; B' I A', B :1 C, and A' :1 B' yield E .Y iff A I X ,..__, 0 i X. I A and X I X ?:; A i B. 4. A I B ~ A n B I B. 5. Suppose that A n B = A' n B' = 0. If A I C <: A' I C' and B I C <: B' I C', then A u B I C?:; A' U B' \ C'; moreover, if either hypoth- ~:~~e esis is >, then the conclusion is >· 6. Suppose that A :1 B :1 C and A' :1 B' :1 C'. If B I A ?:; C' I B' and C I B?:; B' I A', then C I A ?:; C' I A'; moreover, if either hypothesis is >, or C I A.. > · ......, C' 1 A' , as reqmred. Moreover, we know that P(B) and P(B') ~re pos~tiv~, he~ce, by the second inequality, P(C) is positive. Thus if either mequahty IS stnct, multiplication preserves the strict inequality. , 2. X 3. X I X~ A E tff -- "'¥; and A then the conclusion is >- P(C)/P(B) ;? P(B')/P(A'). the P-values are nonnegative, these inequalities can be multiplied to P(C)/P(A) ;? P(C)/P(A'), 5. PROBABILITY REPRESENTATIONS 5.6. A REPRESENTATION BY CONDITIONAL PROBABILITY 225 224 The Archimedean property, 7, Since P(An) ::( 1 and P(A1) > O, 0 < P(A1) :S:: . also derived by 'multiplying fractions: 18 P(A1) P(An) P(A 1) = P(A 2 ) = (-:fc~:~ P(A2) ... P(A3). r-1. . eX I X'-- A I A. it follows that P(A1)/P(A2) < L Thus, the inequality S me , / 1 2 ' fl · 1 n [P(A )/P(A 2)]n·-l can hold for only m1te Y many · . . ,) P(A 1) < ~ 1 · 6( t asted with Axwm 6 See also the further discussion of Axwm as con r . S . and in Sections 5.6.3 and 5.6.4 (also, following Lemma 12 m ectiOn of Axiom 7 in Section 5.7.2. The proof (Section 5.7) involves three major steps. First, a number of elementary properties, such as A I B ;2': C I D implies -C I D ;2': -A I B, are shown to follow from the axioms (Section 5.7.1). Second, we introduce an ordering ?::;' on If, by letting A ;:;:;' B if and only if A I X ;:;:; B I X (Section 5.7.2). The structure <X, rff, ?::;') is shown to satisfy the hypotheses of Theorem 2. By the proofs of Theorems 2 and 3.3 (reduction to extensive measurement, and thence, to Holder's theorem) we obtain an Archimedean, positive, regular, ordered local semigroup (Definition 2.2), whose elements are equivalence classes in rff - AI. Third, we obtain a semiring structure by using the conditional structure to define a multiplication operation (Section 5.7.3). If A I X~ C 1 B, with B ::J C, then the representation that we aim for gives P(A) P(B) = P( C). Therefore we define A * B = C (where boldface letters denote equivalence classes). It is then easy to show that the axioms of Definition 2.4 hold. The required representation follows from the semiring theorem, 2.6. 5.6.3 5.6.2 Sufficient Conditions . of Definition 8, we add the nonnecessary Continuing the num benng property: AXIOM 8. !fA ! B <: C I D, then there exists C' in rff such that C n DCC' and A I B "'"-' C' I D. This simply says that rff is sufficiently rich so that whlentevterAA'i!BB T>hC,·s iisD~ k C ' D eqmva en o · we can add just enough to C to rna e I. ·oms !-6 and 8 yield rather strong structural restriction. In particular, Axl b hown thar Axiom 5 of Section 5.2.3 (see 5.~.2), and mo~e;~e~o~~~a;ha~ ~s X can b~ apart from trivial cases, Axwm 5 . of SectJOn . . . ' ste:n included partitioned into arbitrarily fine eqwvalent even:s. Ko~~~ra; :h?:.e is a nonnull the nonnecessary postulate that for every po~a~~~~~~ents. It would, of course, event that can be partitiOned mto nb eqmpr~h- g weaker even if additional be desirable to replace Axtom 8 y some m , necessary axioms had to be added. h /X @" AI is an Archimedean structure of THEOREM 7. Suppose t at\ , ' '.. 8 for which Axiom 8 holds. {f h that for all qualitative conditional probabzbty (Defimtwn ) p on ' sue Then there exists a unique real-valued function A C E {f and all B, DE rff - AI: ' (X, t!f, P) is afi.nitely additive probability space; A A E AI iff P(A) = 0; P(A n B)/P(B) ~ P(C n D)/P(D). I B ;2': C I D Further Discussion of Definition 8 and Theorem 7 In this subsection we discuss three topics: the role of various axioms in the proof of Theorem 7; alternative proofs and axiomatizations; and the testability of the axioms. Axioms 1-3 of Definition 8 are obvious ground rules for qualitative conditional probability, and are often used in the proof without explicit reference. For exam;-le, we may say, "since A ! A ,-...., B ! B ... ," without noting that this uses Axioms 3 (X X~ A A) and 1 (transitivity). By restricting the relation ?: to hold between A I B and C I D only if B ::J A and D ::J C, we could bypass Axiom 4 altogether. In effect, this is what we do during the proof, establishing the representation conditions for B::JA, D~ C: (iii)' A I B;:;:; C I D iff P(A)/P(B) ~ P(C)jP(D). We then use A I B r-..; A n B I B only to pass from this to the more usual representation, involving condition (iii) of Theorem 7. Axiom 5 is, of course, the key assumption for an additive probability measure. It is analogous to Lemma 2 of Section 5.3.1 and Axiom 3' of Section 5.4.1. From the standpoint of semirings, it has two principal uses. First, it is used to show that ® is we!I defined and monotone, where if A n B = 0, A $ B = A v B. This only uses part of the strength of the axiom, i.e., if A ! X ;2': A' I X and B I X;:;:; B' I X, then Au B I X ;2: A' u B' I X (of course, An B =A' n B' = This use of Axiom 5 is not really "·~"'Y"'v.. , being buried in the reductions to Theorems 2 and 3.3. Rather, 5 is used heavily in the preliminary lemmas to derive other needed features of qualitative conditional probability. The second major use in the semiring 1 1 226 5. PROBABILITY REPRESENTATIONS development is to prove; right distributivity of multiplication * over addition @. Here, we use the form of the axiom with I C instead of I X. The structure of the axiom suggests right distributivity, i.e., if A I X,....., A' I C (A * C = A') and B I X,....., B' I C (B * C = B'), then Au B I X~ A' u B' I C [(A @ B) * C = A' @ B' = (A * C) @ (B *C)]. . . Axiom 6 also plays two main roles in the semiring development, besides 1ts use in some preliminary lemmas. First, it is used to show that multiplication *is commutative. This leads immediately to the left distributive law. Second, Axiom 6leads to Lemma 12 and its corollary (Section 5.7.1), which are used to prove left and right monotonicity of multiplication. . Except for commutativity of multiplication, we would be better off usmg Lemma 12 as an axiom, instead of Axiom 6. To further this discussion, we state Lemma 12 here as an alternative axiom, 6'. AXIOM 6'. Suppose that A :::l B :::l C and A' :::l B' :::l C'. If B I A<:; B' l A' and C j B <:; C' j B:, then C l A<:; C' I A'; moreover, if either hypothesis is> and C E If -AI, then the conclusion is )>-. It can be seen that Axiom 6' is very similar to Axiom 6. The hypotheses of Axiom 6' are "uncrossed," with B I A dominating B' I A', etc. Axiom 6' is very analogous to the weak monotonicity condition in difference structures (Chapter 4), whereas Axiom 6 is analogous to what has been called the strong 6-tuple condition (Block & Marschak, 1960). Here, we use Axiom 6, plus other conditions, including Axiom 8, to derive 6'; then 6' does most of the work, except for commutativity of multiplication. We could use _6' ~s .an axiom, if we were willing to add another axiom to guarantee left distnbutiV!ty. For example, we could assume: Suppose that A :::l A', B :::l B', and A n B = 0. If A' 1 A ~ B' I B, then A' u B' I A u B .~A' I A ,......, B' ! B. Alternatively, we could try to push through the analogy with difference structures, deriving 6 from 6' via a representation theorem from Chapter 4. This is done in the next subsection (proof in 5.7.4), but we seem to reqmre a stronger structural condition (see Axiom 8', below) in order to obtain the solvability axiom of positive-difference structures. .. Luce' s ( 1968) proof relied on the positive-difference structure of conditiOn~! probability. The exponential of a difference representation gave a r~t10 representation Q. Normalizing so that Q(X) = 1, Q ful_frlled all the as~ertwns of Theorem 7, except possibly additivity. The normalized Q was un1que up to Q--.. Qa, and Luce used a functional-equation argument to show that o.: could be chosen such that additivity also held. Our use of the ordered local serniring incorporates the essence of a slightly different functional-equation argu.ment, since the proof of Theorem 2.6 is based precise!~ o~ :fi~dmg the unique additive representation for a semiring that is also multJphcatlve. 5.6. A REPRESENTATION BY CONDITIONAL PROBABILITY 227 Archimedean axioms are familiar to the reader by this time, so we need not comment on the role of Axiom 7; but we do note that we have defined standard sequence in terms of the ratio structure, rather than in terms of the additive structure. We do so because the resulting definition is much simpler (compare Definition 3) and it leads directly to the Archimedean axiom for positive differences (see Section 5.7.4)0 However, the proof that Axwms 1-8 1mply the Archimedean axiom for the additive structure is a bit tricky (5.7.2). . !he structural condition, 8, plays many different roles. We note here that 1t IS used to define multiplication: A * B ~= C where C is the solution to A ! X,..._, C ! B, with B :::l C. The solution exists by Axiom 8 (note Axiom 4 is also used), since A I X<:; 0 I_ B (Corollaries 1 and 2 of Lenima 6, below). Th1s completes the d1scussron of the roles of the various axioms and of alternative axiomatizations and proofs. We now comment briefly, on the testability of the axioms. The status of the various axioms depends critically on the method used to establish the empirical ordering over If x (If ·- AI). If the ordering is estimated from relative frequency counts-by the proportion of the times that A occurs when B occurs-we would most likely attribute any failure of the axwms to error. If the ordering is obtained by having a subject rate the likelihood of A, given B,_ then it would seem that the most interesting axioms to test are 5 and 6. Axwm 4 could fail if the subject did not perceive set relatwns correctly: but this would not bear strongly on the issue of a probability reoresentatwno The possibility that Axiom 5 might fail while 6 holds motivates our development, in the next section, of a nonadditive conditional representation. If Axiom 6 fails, then the properties of "subjective conditiOnal probability" are indeed very far removed from the representation here axiomatized. in the case where the relation <:; is inferred from decisions under uncert~inty, Axiom l also comes into question, for the same reasons as in uncondJtJOnal representations (see Section 5.2.4 and Chapter 8). It seems hard to envisage situations in which tests of Axioms 2 3 7 or 8 would be empirica!ly i~teresting. In most, if not all, practical 'si;ua~ions, % = { whJCh Simplifies matters considerably. 5.6.4 A Nonadditive Conditional Representation As we mentioned in the preceding subsection, there may be cases where an ordermg of "conditional probability" satisfies the multiplicative property expressed by Axwm 6 or 6', but does not satisfy the additive property of Axwm 5. We therefore want a nonadditive representation to be obtained Without using Axiom 5. 228 5. PROBABILITY REPRESENTATIONS We retain Axioms 1-4 and 7 of Definition 8; w~ replace Axiom 6 by Axiom 6', introduced in the preceding subsection; and we replace Axioms 5 and 8 by the following: 5'. If A .AI and A :J B, then BE%. 8'. If A I B ?.::; C I D, then there exists C' E tff such that C n D C C' and A I B ,...._, C' I D; if, in addition, C E tff- %, then there exists D' such that D :J D' :J C n D and A I B ,__, C I D'. E Axiom 5' corresponds to Lemma 8 (iii) below, i.e., it can be proved from Axioms l-6 of Definition 8. The first part of Axiom 8' is just a restatement of Axiom 8; the second part has the additional proviso that just enough can be removed from D- C to obtain a subset D' such that A I B ,...._ C I D'. Naturally, this must be restricted to the case where Cis not in%. THEOREM 8. Let d', %, ?.::;) satisfy the conditions of Theorem 7, except that Axioms 5, 6, and 8 are replaced by 5', 6', and 8', respectively. Then there exists a function Q from tff - .AI to (0, l}, such that; (i) Q(X) = I; 5.7. Q(A n B)/Q(B) ~ = Qa, a > 0. Note that this theorem deals entirely with the restriction of ?.::; to (6x - %); we do not have Q defined on elements of%. In order to extend Q to .A'", so that (ii) still holds, and so that A is in % if and only if Q(A) = 0, we would need to add certain additional conditions, which are provable with the help of Axiom 5, but not provable from the present axioms. For example, we would need properties such as A I B ?.::; 0 i X, 0 I X~ 0 i B, and A l B ~ 0 I B if and only if A is in%. Since the extension to % complicates matters and seems of little interest, we omit it. Theorem 8 is proved in Section 5.7.4. PROOF. Suppose that, on the contrary, -A 1 B > -C 1 D. By the stnct mequahty clause of Axiom 5, (Au -A) 1 B (C u -C) D. By Axwms l and 4, BiB> DID, contradicting B 1 B ~ D 1 D derived from Axwms l and 3. 0 > COROLLARY 1. COROLLARY 2. A I B?.::; 0 I X. 0 IA r-o 0 0 I B. PROOF. By Axioms 1 and 3, A I A ""B I B; by the lemma, -A I A ,...,_, - B I B; and the conclusion follows by Axioms 4 and I. 0 COROLLARY 3. If A ! C ~-A I C and BID""" -B 1 D, then A I c,..._,BID. If A 1 LEMMA 7. The common hypothesis of the following four lemmas is that (X, C, %, ?.::;) a structure of qualitative conditional probability (Definition 8, Axioms Axiom 7 is not used, and Axiom 8 is assumed only in Lemma 9. 1 = !. If A :J B, then A I C;::: B I C. PROOF. By Corollaries 1 and 2 of Lemma 6, (A -B) 1 C?.::; 0 Also, B I C ,....., B I C. By Axiom 5, A I C ?.::; B I C. 1 C. 0 We collect together a number of properties of the set AI in one lemma: LEMMA 8. (i) PR'OOFS Preliminary Lemmas I Of course, in the representation, A ! C ~ -A C will yield P(A C) Corollary 3 is the qualitative version of that result (ii) IS 1 Axiom 3 and the lemma. PROOF. (iii) 5.7.1 If A I B ?.::; C I D, then - C I D ?.::; -A I B. C >BID, we can use Axiom I to derive --A!C>-B\D; this contradicts Lemma 6. Q(C n D)/Q(D). If Q' is any other function with the same properties, then Q' 229 LEMMA 6. PROOF. if A n B, C n DE 6 - .AI, then A I B ?.::; C i D iff 5.7 PROOFS 0 E%. If A E %, then -A E tff - %. If A E %, and A :J B, then BE %. If A, BE%, then A u BE%. PROOF. (i) 0 IX ""' 0 I X. (ii) If A, -A E .AI, then A i X ,....., .0 i X.~ -A I X. By Lemma 6, -A I X~ X I X~ A I X, i.e., X I X~ 0 j X, contradicting Axiom 2. 5. 230 PROBABILITY REPRESENTATIONS (iii) By Lemma 7, A I X,....., 0 I X, A:)~ imply 0 I X;?: B I X. By Corollary l to Lemma 6, B I X~ 0 I X, orB ISm .AI. . · ( ... ) (A B) 1X 0 I X this with B I X -~ 0 I X, y1elds (IV) By part m , ,....., ' ' 0 (A u B) I X~ 0 I X, by Axiom 5. LEMMA 9. If A 1 B ~ 0 1 B and B:) A, then A E .AI. Conversely, if Axiom 8 holds, BE If' -.A', and A E .AI, then A I B ~ 0 I B. pROOF. Suppose that A I B 0 I B, B:) A, but A E ,g ~ .AI. By Corollary 2 of Lemma 6, 0 I A ~ 0 I X. Since B E ,g -- .AI, Axwm 2 and Corollary I of Lemma 6 yield 0 I X< B I X. We have r-.J 0 I A <BI X, A IB,....., 0IB, ·om 6 0 B < 0 X contradicting Corollary 2 of Lemma 6. wh ence, b y Ax l , ' B w a 1 For the converse, suppose that BE C --.AI and A I B > 0 I · e PP Y Axiom 8 to A 1 B > 0 I X to obtain 1 '1 AIB~CIX>0IX. Thus c E ,g -.AI. Since also BE C --.AI, we have B I X> 0 I C. By Axiom 4, 1 B .~ c 1 X. By Axiom 6, A n B I X > 0 I X, so A n B E C .AI, A n and by Lemma 8 (iii), A E If' - .AI. 0 satisfies Axioms For the remaining lemmas, we assume that (X, B 1-6 and Axiom 8. · · diff The next two lemmas have the content of Axiom 3 of positive-. erence · · n 4 1)· if ab be are positive intervals, then ac IS stnctly struct ures (Defin111 0 · · , .. ·f A :) B greater than either of them. Here, AB will be a positive mterval 1 _ and X 1 x > B 1 A. Thus, we want to establish that 1f A:) B :J C,, w1th AB, BC positive, then C i A is strictly smaller than both B l A and C B. 1 LEMMA 10. If A:)B:)C and BEC--.AI, then CjB;?:C!A, and> holds unless either C or A ·- BE .AI. A E 0 - JV. If C PROOF. By Lemma 8 c IB r--.o 0 IB ~ 0 E .AI, then Lemma 9 gives I A ,...._, C I A, as required. So assume that C E C - .AI. . . We have c 1 B ~ c 1 B and c 1 C;::: B 1 A, whence, Axwm 6 y1elds C I B > C I A. Moreover, holds unless C I C r--- B I A; but m that case, Lemm'; 6 gives 0 1 ,...._,(A - B)\ A, and now, from Lemma 9 and c > (A - B) A ,..._, 0 \ A, we have A - BE .AI. J 5.7. 231 PROOFS LEMMA 11. If A:) B:) C and A E If'-.#', then B I A<:; CIA, and> holds unless either B or B -- C E %. PROOF. If BE%, thenCE .AI and B I A ~ 0 I A ~ C I A, as required. So assume that B E 0 - .AI. We have A I A ;?: C I B, B I A ~ B I A, so Axiom 6 yields B I A <:; C I A, with > unless A I A ,...._. C I B. ln the latter case, B - C E %. 0 The next lemma (Axiom 6', Section 5.6.3) corresponds to Axiom 4 of positive differences, the weak monotonicity condition, with an additional strict inequality proviso. LEMMA 12. If A :JB:J C, A':JB':) C',B I A;?: B' I A', andCI B <:; C' I B', then CIA ;?: C' I A'. Moreover, the conclusion is > unless either C E .AI or both hypotheses are ,...._._ PROOF. Suppose that the hypotheses of the lemma hold, but that C' I A';?: CIA. We must show that in fact C' I A'"" CIA and that either C E .AI or both hypotheses are "'-'· By Lemma 10, we have C' I B' ;?: C I A. By Axiom 8, choose B" with A :J B" :J C and C' I B',...., B" I A. ff B" E %, then Lemma 9 yields that C, C' E .AI and C' I A' ~ C I A ,_,_, 0 I X, so we are done. Assume B" E 0 %. We show that C: B";?: B I A. Otherwise, B I A > C I B" and C I B ;?: C' I B' ~ B" I A yield, by Axiom 6, that C I A > C I A. But now, C I B" ;?: B' 1 A' and B" I A,....._, C' I B' yield CIA ;?: C' I A', by Axiom 6. Thus, we have CIA,...__ C' I A', and by the strict clause of Axiom 6, C I B" ~ B' 1 A' We now have C I B" ~ B I A, and can now infer, by the strict clause of Axiom 6, that B" I A~ C \ B (otherwise, C! A >CIA). Hence, ~ holds in both hypotheses. O COROLLARY. Suppose that C, DE Then A I C ;?: B I C iff A I D ;?: B I D. rff - .A"' and that D:) C :J A u B. PROOF. If A I C;?: B I C, then since C I D ~ C i D, the lemma implies AID;?:BID. Conversely, if B I C >A I C, then the strict clause in the lemma yields BJD>AJD. 0 This corollary is one of the main results to be used in the proof of 232 5. PROBABILITY REPRESENTATIONS 5.7. PROOFS 233 Theorem 7. It is used first to establish the Archimedean' axiom for a structure of qualitative (unconditional) probability, in Section 5.7 .2, on the basis of the Archimedean axiom of Definition 8. Subsequently, it is used to show monotonicity of multiplication in the ordered local semiring. The main line of development thus consists of Lemmas 6, 7, 8 (iii), 9, !0, and 12, with corollaries. One more lemma is required, that will be used to obtain the solvability axiom for unconditional probability. (in the sense of Definition 8) of <X ,ff A' >) Let c. = _ . .' ' .. '.~ · , A; Ai-l . The C; are pmrw1se diSJOmt and c1 -- A 1 d .- · an ,or 1 > I, z' Az, U Ck, = k~I 2-i+I Az, Az'+' - U = Ck . k=2t-H LEMMA 13. If A I X;?::: B I X, then there exists B' C A such that B' iX~BI X. PROOF. By Lemma 6, -B i X;?::: -A i X. By Axiom 8, there exists -B' ~-A such that -B I X~ --B' I X. We have B' C A and, by Lemma 6, B'iX""'BIX. 0 5.7.2 By Lemma 8, each Ai A~' B iff AjX ;?:B[X. Then <X, ,ff, is an Archimedean structure of qualitative probability for which Axiom 5 of Section 5.2.3 holds. PROOF We verify the five axioms. ;?:::'is a weak order on ,ff, since ;?::: is a weak order. 2. A;?:::' 0 by Corollary 1 of Lemma 6; X>-' 0 by Axiom 2 (if X~· 0, then X E A', but X E If 3. This follows as a special case of Axiom 5 of Definition 8 (note th~t the strict clause of that Axiom corresponds to the "if" clause in Axiom 3 of Definition 4). 4. Suppose that A 1 ', A 2 ', ... is a standard sequence in (X, ,ff, relative to some A >' 0 (see Definition 3). We first construct another standard sequence A 1 , A 2 , ... , with the properties that, for all i, A; ~A/ and A, C Ai+1. To do this, let A 1 = A and assuming that A;_1 has been defined, choose Ai by applying Axiom 8. We have A/>' A;_1 (use Axiom 5, Definition 3, and A >' hence, A/ I X > Ai-l i X. By Axiom 8, there exists A,-:) Ai-l with A/ \ X.~ A; I X. Next we show that the subsequence A 2 , = 0, 1, ... , is a standard sequence E 0' -- A'. By the corollary of Lemma 12, and Axiom 4, A 2 , I Azi+r ~ (A 2 ,+I - A 2 ,) I A 2 ,+I ~ -A 2~. 1 A 2i+l An Additive Unconditional Representation THEOREM 9. Suppose that <X, If, A', is an Archimedean structure of qualitative conditional probability (Definition 8) for which Axiom 8 holds. Let on If be defined by 1. By Definition 3 and Axiom 5 we have c ,. . _,' A for all · Th b A · d · · ' ' I. US, y XIOm 5 an mductwn, any two unions of m C.'s · part1cu · 1ar, , are ,......, ' , a n d m ~ 3 to Lemm. a 6 we have A 2 . A . ~A A il · B A · ' 3By Corollary ' 2•+1 1 . 2 , a 1. y x1oms 5 • ;~ have Xi X~ Az 1_ Az > A1 I A2 (note that by Lemma 9, 2 I . 2 >. . I , • Thus, IS mdeed a standard sequence (Definition 8) and smce 1t 1s fimte, so must be the sequences {A;} and {A/}. 1 1 C au: 5. . Suppose that A n B = 0 A >' C and B >' D A 1 · L , , ~ . pp ymg emma l3 /, , , we have C' C A with C' I X.- C 1 X. Similarly we have D C B w1t,: D I X ~ D 1 X. Since A n B = 0 C' n D' 1' C' D' dE , - 0 a so, and so ' ' an = A U B fulfill the condition. O t 0 A I X "'- C I X , COROLLARY. Suppose that the hypotheses of Theorem 9 hold Let ,g b the set of ~' equivalence classes, excluding A'; and for A E ,g __·A' let denote the equivalence class of A. Let ;t be the induced simple order' on If· and if A,_B E tf/, with An B = 0, let A@ B beAu B. Let f1A be the set ~ (A, B) wzth A n B = 0 Then <,g > !A 12.> · · c; ¢1 ts an Archzmedean positive regular, ordered local semigroup (Definition 2.2). ' ' I 0 0 , ,..., , . PROOF. . From Section 5.3.2 and Theorem 9 we know that ( 0', ;t, PA, @) 1~ an extensive structure With no essential maximum (Definition 3 2)· b t smce > Js alread · 1 d · ' u ,... . Y a Simp e or er, the argument in Section 3.5.3 shows that the structure IS an ArchJmedean, positive, regular ordered local group. ' semi- 0 5.7,3 Theorem 7 (p. 224) An Archimedean structure of qualitative probability <.X, <o, "" has a . umque representation p such that (X, ,ff, P) is a finitely additive probability 234 5. PROBABILITY REPRESENTATIONS space, JV is the class of events with probability zero, and A I B?::; C I D P(A n B)/P(B);;;; P(C n iff D)/P(D). PROOF. We start by introducing a semiring operation into the semigroup ?:;, &J, €)) defined in the corollary to Theorem 9, in the preceding subsection. Take A, BE C. We have A I X> 0 ! B (since %¢: tff and since 0 X ~ k) B by Corollary 2 to Lemma By Axioms 8 and 4, there exists C with B :::J C and A i X r-J C j B. We now define A* B = C. To show that* is well defined, suppose that A I X,......, A' I X, B I X,......, B' I X, C C B, C' C B', A 1 X r-; C i B, and A' i X C' 1 B'. Applying Lemma 12 to C C B C X, C' C B' C X, we have C i X~ C' I X; thus, C = C'. By Lemma 9, A * B E Iff, since in the above definition, C I B > 0 I X ,..._, 0 I B. The set /!ll* for which * is defined consists of all of Iff X tff (this is because the product of two numbers less than or equal to 1 is again less than or equal to 1; X is a natural identity for * ). This greatly simplifies the proof of the other axioms of Definition 2.4. To show that * is associative, choose A, B, C E Iff. Let A* B = C', C C B; and let C' * C = D, DC C. Thus, (A* B) * C = D. Similarly, choose A' = B * C, A' C C, and D' =A* A', D' C A', so that A * (B *C)= D'. We have A • X~ C' i B and A! X~ D' I A', hence D' i A'~ C'! B. Also A' 1 C ~ B i X. Applying Lemma 12 to D' C A' C C and C' C B C X, we have D' j C ~ C' ! X. Since C' I X~ D I C, we have D' i C ~ D : C. By the corollary to Lemma 12, this implies D' = D, as required. Before establishing that the other axioms of Definition 2.4 are satisfied, we note that * is commutative: if A ! X r--.o C I B, with C C B, and B I X ,......., D I A, with D C A, then by Axiom 6 applied to D C A C X, C C B C X, we have D l X~ C I X, or D = C. We now show that * is right distributive over €). Left distributivity then follows by commutativity of *· Let A n B = 0, A v B i X~ C' I C, C' C C. Then (A €) B)* C~= C'. By Axiom 8, choose A' C C' such that A I A v B.~ A' I C' (Note that A v B, C' E IffLet B' = C' -- A'. By Lemma 6, B i A v B ,..._, B' I C'. Now apply Lemma 12 to A C Au B C X, A' C C' C C, obtaining A 1 X~ A' I C; and similarly, by Lemma 12, X B' C. Hence, A* C =A', B * C = B'. By construction, A' n B' = 0, therefore, 1 I N B: C'-.J 1 (A * C) $ (B * C) = A' v B' =C' =(A@ B)* C. Finally, we note that A ?:; B implies A* C ?: B * C follows immediately from the corollarv of Lemma 12; left monotonicity follows by commutativity; , be proved directly from Lemma 12. also it could 5.7. PROOFS 235 We now assert that the first three axioms of Definition 2.4 hold. Axiom 1 that< Iff, ?:;, .fllJ, @) is an ordered local semigroup, was proved in the previou~ sect1_on; Axwm 2, ~hat (C, ~, tff X C, *) is an ordered local semigroup, ~as JUst been estabhshed (the assertions about pairs in g x g are all trivial, smce * IS defined everywhere); and Axiom 3, distributivity, was also just proved. The final axiom asserts that for any A E C, there exists (B, q in /!ll such that (A, B ® C) E C X Iff. This is trivial unless &J is empty. But fllJ can be em~ty only if tff contains exactly one element, X; in which case, Theorem 7 Js tnv1ally true, letting P(A) ~~ I or 0 accordingly as A ""'' X or A ,...._,' 0 If _&J is nonempty, then (<ff', ~, &J, Iff x C, €), *) is an Archimede~n, positive, regular, ordered local semiring. By Theorem 2.6, there exists a unique fu~ct:on </> from <ff to Re+ which IS order preserving, additive, and multiphcatJVe. Define P(A) = l rp(A), if A E <ff JV, if AE%. lo, We must show that P satisfies the conclusions of Theorem 7. Suppose that A _i B?::; C I D. We want to show that P(A n B)/P(B)? P(C n D)/P(D). Without loss of generality, suppose A n B, C n DE ,g _ JV (smce C n D E% is trivial, and A n BE% implies C n D E .AI, using Lemma Choose E, FE C ,AI such that E!X~A/B?::;CiD~FjX. We have B * E =An B and F * D = C n D. Therefore since multiplicative and order preserving and since E :=; F, we have ' </>(A n B)/<P(B) = </> is ¢(E) ? ¢(F) = </>(C n D)/,P(D). > The required inequality for P follows immediately. The strict case Similarly leads to > in the P-inequality; so A 1 B?::; C 1 D if and only if P(A n B)/P(B) ? P(C n D)/P(D). We need only show now that (X, tff, P) is a finitely additive probability space. S~nce X 1s a multiplicative identity, we have P(X) = 1; dearly p ;;;; O; and addJtlVlty of P follows from additivity of rp on< C, &J, €)). For umqueness, note that another representation P' would lead to another representation 4>' of (Iff,?:;, &J, Iff x C, €), *), by letting rp'(A) = P'(A). For example, to show that rp' is multiplicative, let A* B = C. We can choose from the representation C such that A I X~ C I B and C C B; and P'(A) P'(B) = P'(C), implying ¢'(A* B) = </>'(A)rp'(B). 0 236 5.7.4 5. PROBABILITY REPRESENTATIONS Theorem 8 (p. 22&) If (X, rff, .AI, satisfies Axioms 1-4, 5', 6', 7, and 8' (Section 5.6), then there exists Q : ( rff - AI')__,. (0, I], such that (i) Q(X) = I and (ii) A I B ;?:: C ID iffQ(A n B)/Q(B);? Q(C n D)/Q(D). This Q is unique up to Q __,. Q", o: > 0. PROOF. We construct a positive-difference structure <rff -- .AI, ot*, ;?:: *) (see Definition 4.1) as follows. Let ABE Ot* iff A, BE rff- .AI, A~ B, and X I X>- B l A. For AB, CD E Ot*, let AB ;?:;*CD iff D I C;?:: B I A. We verify the six axioms of Definition 4.1. 1. Weak ordering is immediate. 2 & 3. If AB, BC E Ot*, then clearly, A, C E JJ - .AI and A~ C. We shall show that B 1 A and C 1 Bare both C j A; this will show that X I X C I A, so ACE Ot*, and also, that AB, BC < * A C. By the strict clause of Axiom 6', B! B C I B and B I A~ B I A yield B iA C I A. Similarly, C I B ~ C I Band B l B B I A yield C I B C I A. (Note that Axioms 1 and 3 were used implicitly in the above proof.) >- >- >- >- 4. >- >- 5.7. PROOFS This completes the verification of the axioms for positive differences. By Theorem 4.1, there exists a positive-valued function ,P on JJ -- .At such that AB ;?:;*CD iff f(AB) ;;; f(CD), and for AB, BC E ot*, f(AC) = if;(AB) + f(BC). We define a function Q as Q(A) -- )1, --- lexp[-!f(XA)], >- ~ D IC r--< D' >- i X, 1 1 For if XA E Ot*, then XB E 1 ot*, so Q(B)/Q(A) = exp[--f(XB) l A. Axiom 5' or 8', C' E JJ -.AI; by Axiom 5', D' E JJ- JV. Thus, AD', C'B E Ot*, and AD' ,-...,* CD ~ C' B. To complete the proof we must show that AC', and D'B E Ot*, i.e., that X I X>- both C' I A and B I D' If X! X~ C' l A, then apply the strict clause of Axiom 6' to B I C' B I A, C' A A j A, to obtain B ! A B A, a contradiction. Similarly, B I D' ~ B I B and D' i A > B ! A would yield B I A B l A; hence, X! X>Bl D'. r--J X ~A E Ot*. By definition of Ot*, XA E ot* excludes both X [ X,.__, A X and A E .Y. Also, X I X,...__ A I X excludes A E .AI (otherwise, X 1 X~' A 1 X~ 0 1 X yields X E.%). Conversely, if A¢ .AI and X X>- A X, then XA E ot* Thus, Q is well defined on JJ - %. Next we note that if ABE Ot*, then = 1 XA Q(B)/Q(A) = exp[- f(AB)]. 5. Suppose that AB, CD E Ot*, with AB >-*CD. By Axiom 8', applied to D i C B! A, we have C' and D', with A~ C', D' and C', D' ~ B, such that l C' XI if if We must show that Q is well defined and that its domain is ,fJ' that is, the two conditions in its definition are mutually exclusive, and A E JJ- .At iff one of them holds. This follows immediately from Axiom 6' B 237 >- I >- 6. is a standard sequence, in the sense of Definition 4.1, if and only if it is a standard sequence in the sense of Definition 8. (This is immediate from the definitions.) Thus, any standard sequence in - .AI, Ot*, <:; *) is finite. 2 2 The reader may be puzzled by the fact that all standard sequences are finite. This is because all standard sequences are ascending, hence, all are strictly bounded by XA, . If we permitted descending standard sequences, A, , A,, ... , such that A, ::J A,+l and A,+l 1 A, ~A, 1 A,_ 1 , there could very well be infinite standard sequences. But Jt would still be true that any strictly bounded standard sequence is finite. By solvability, any strictly bounded descending standard sequence could be converted into an ascending standard sequence. This is really the reason why only one direction needs to be constdered. + f(XA)] exp[-f(AB)]. Or if XA ¢ ot*, then X I X~ A I X, and by Axiom 6', B I X~ B fA (otherwise, the strict clause of Axiom 6' yields B 1 X> B 1X). Consequently, XB E Ot* and f(AB) = f(XB); the equation follows, since Q(A) = L We can now show that (ii) of Theorem 8 is satisfied First if AB CD E ot* then the fact that exp[- f(AB)] is a strictly decrea~ing f~nctio; of ,P(AB) gives the required result (since f is order preserving on Ot*, and the ordering <:;* on Ot* is the converse of the corresponding ;?:: ordering). This extends easily to A, B, C, DE ,fJ' - •.¥with A ~ B, C ~ D, but AB or CD if Ol*. For if AB ¢ Ct*, then X I X~ B I A; but this entails A I X~ B I X (If A I X'--- B; X then B I A,..__, BIB and Axiom 6' yield B I X>- B I X; ~tc.) Hen,ce, Q(A) ~ Q(B). Similarly, if CD¢ Ol*, X I X~ D I C and Q( C) = Q(D). Thus, if both are not in ot*, B I A~ D I C and Q(B)/Q(A) = 1 '= Q(D)/Q(C); if only AB¢0t*, then BIA>-D!C and Q(B)/Q(A)=l>exp[-!f(CD)]= Q(D)/Q( C). Finally, the representation extends to A, B, C, D arbitrary in t!- .AI by using the representation for A n B, A, C n D, C, and employing Axiom4. To prove uniqueness, note that if Q' is another representation, then ~'(AB) = -log[Q'(B)/Q'(A)] gives another representation for (JJ- .At, 5. 238 Ot*, ;?::*),hence, by Theorem 4.1, if/ = PROBABILITY REPRESENTATIONS 1. Q'(A) = Q'(A)/Q'(X) if X I X 5.8 > A I X. Since !• = exp[ -<f'(XA)] = exp[ -rx<f(XA)] = [Q(A)]~, l, we have Q' = Q• on tff -- .AI. INDEPENDENT EVENTS 239 relation on ,{]'_ Then j_ is an independence relation 3 a<f. Thus, = 5.8. j_ is iff symmetric. 2. For A E C, the set {B I BE tff and A j_ B} is a QM-algebra of sets on X (Definition 5), i.e., 0 (i) A l_ X; (ii) if A j_ B, then A J. ·-B; and (iii) if A J. B, A l_ C, and B n C = 0, then A J. B u C. It is easy to demonstrate that these axioms are necessary for the representation, e.g., for 2 (ii), if P(A n B) = P(A) P(B), then INDEPENDENT EVENTS P(A Intimately connected with the concept of conditional probability is that of (statistical) independence. Given a probability measure P, events A, Bare independent if and only if P(A n B) = P(A) P(B) [or, for P(B) :# 0, if and only if P(A B) = P(A)]. In qualitative terms, using the primitives of the preceding sections, we can define A, B to- be ,_,-independent if and only if either B is in .AI or A B ~A i X. (Or again, using *defined in Section 5.7, A, Bare ~-independent when A * B = A f'l B.) An interesting idea, introduced by Domotor (1969), is to accept independence of events as a primitive notion, in which case the primitives for a theory of conditional probability consist of a set X, an algebra tff on X, an ordering :<; * on C, and a new binary relation (interpreted as independence) _l on C. Of course, Theorem 2 is then to be enriched by adding to the constraints on the representation the following one: n -·B) = P(A) -- P(A n B) = P(A) - P(A) P(B) = P(A)[J - = P(A) P(- B). P(B)] 1 1 Al_B iff P(A n B) = P(A) P(B). It will be noted that this property of the representation together with the axiom A _l X (see Definition 9 below) implies P(X) = 1. Without such an additional constraint, we can obtain an absolute scale of probability only by the flat of normalization. From the standpoint of unconditional probability, only ratios of probabilities are meaningful. This further constraint, based on the new primitive 1_, yields a genuine absolute scale. Another way to understand the absolute scale is from the standpoint of conditional probabilin Section 5.7.3, X is the natural multiplicative identity with respect to *, forcing P(X) = 1. We start by introducing the axioms of a pure independence structure tff, 1_). <X. DEFINITION 9. Suppose tff is an algebra of sets on X and _l is a binary It is definitely not true that {B I A J. B} is an ordinary subalgebra of there can be events A, B, and C for which A _l B and A _j_ C but not A J. B n C and not A J. B u C. This corresponds to the fact that pairwise independence of events, in the usual probabilistic sense, does not entail independence of the entire set of three or more events. Only if A, B, C form an independent family of sets is each event independent of the subalgebra generated tw the other events. This motivates the definition of an abstract independence relation for sets of three or more events, based on _l, using subalgebras. If .? is a subset of tff, then the smallest subalgebra containing Y is given as the intersection of all subalgebras of tff that contain Y. (Note that C itself is a subalgebra containing Y, so this intersection is defined and it contains Y· and the intersection of arbitrarily many subalgebras is easily seen to be ~ subalgebra.) The smallest subalgebra containing A consists of { 0, A, -A, Two subalgebras of tff, ~ and tff2 are said to be 1_-independent if and only if A1 l_ Az whenever A 1 is in ~ and A 2 is in tff2 • Note that if A _l B, then {,P, A, -A, X} and { 0, B, --B, XJ are J. -independent. This leads to the generalization: A1 , ... ,Am are _j_-independent if and only if the smallest subalgebras containing every proper subset of the Ai and its complement are 1_-independent. Formally, we have: DEFINITION 10. Let tff be an algebra of sets and J. an independence 'This concept is different from that of a relation being independent implicitly defined in Definition 1.3 of Section 1.3.2. , 5. 240 PROBABILITY REPRESENTATIONS 5.9. PROOF OF THEOREM 10 241 relation on C. For m ~ 2, A 1 , •.. ,Am E 0 are j_-independent iff, for every proper subset M of {I, ... , m}, every B in the smallest subalgebra containing {Ai I i E M}, and every C in the smallest subalgebra containing {Ai I i ¢' M}, we have B j_ C. DEFINITION 12. Suppose (X, 0, ;(:;, j_) is a structure of qualitative probability with independence. Let .AI = {A 1 A E 0, A ,....., If A, C E 0 and B, D E 0 - .AI, define As we noted in motivating this definition, if m = 2, we have A and B are j_ -independent if and only if A j_ B. We can now define what we mean by a structure of qualitative probability with independence. We take a structure of qualitative probability (X, 0, an independence relation j_ on 0; and an axiom interlocking;(:; and j_. The interlocking axiom is very natural, given the representation: probabilities of independent events multiply, and multiplication is order preserving, therefore, intersections of independent events preserve order. [n addition, we include in the following definition a very strong structural condition that uses the defined notion of _L -independence of sets of events. iff there exist A', B', C', D' E 0, with DEFINITION 11. Suppose that X is a nonempty set, 0 is an algebra of sets on X, and ;(:; and j_ are binary relations on rff. The quadruple (X, 0, ;(:;, .l) is a structure of qualitative probability with independence rhe following axioms hold: l. 2. (X, 0, is a structure of qualitative probability (Definition 4). is an independence relation (Definition 3. Suppose that A, B, C, DE 0, A 1.. B, and C j_ D. If A ;(:; C and B ;(:; D, then A n B ;(:; C n D; moreover, if A > C, B ;(:; D, and B > 0, then AnB > CnD. The structure (X, 0, holds: j_) is complete iff the following additional axiom 4. For any A 1 , ... , Am , A E 0, there exists A' E 0 with A' ,___,A and A' j_ Ai, i = l, ... , m. Moreover, , ... ,Am are j_-independent, then A' can be chosen so that A 1 , ... ,Am, A' are also In a complete structure of qualitative probability with independence, we can introduce a defined relation ;(:;' on pairs A I B and C! D. Clearly, if A C B, C CD, A .l D, and C j_ B, we will want to assert that A B ;(:;' C l D if and only if P(A) P(D) ?; P(B) P(C), or, in qualitative terms, when A n D ;(:; C n B. If the requisite independence relations for A, D and C, B do not hold, then using completeness (Axiom 4), we may replace these sets by the equi.a!ent ones for which independence does hold. The theorem we obtain is that, with suitable definitions of .A' and , a complete structure of qualitative probability with independence gives rise to a structure of qualitative conditional probability. 1 A!B;(:;'CID A' '"-' A B' ~ B, n B, A' _L D' C' ,...., C and n D, D' ,...., D; C' j_ B'; and A' n D' ;(:; C' n B'. TH~O~EM 10. .~uppose that (X, 0, ;(:;, j_) is a complete structure of qualzt~~we probabzlzty with independence and that .AI and ;(:;' are given by Defimtwn 12. Then (X, 0, .AI, ;(:;') is a structure of qualitative conditional probability (Definition 8). If we now make the additional assumptions that the Archimedean and solvability axioms of Section 5.6 are satisfied by (X, 0, .AI, ;(:;'), then Theorem I 0 and these assumptions allow us to deduce the representation p of Theorem 7. This gives the desired representation for j_. For if A 1.. B, with B not in .Y, then Definition 12 gives A 1 B ,..._,' A 1 X, hence, P(A n B)/ P(B) = P(A). (If B are bovth ~ 13_, then A n B ,.,.._, 0, so the desired result is trivial.) Of course, the Arch1medean and solvability axioms, stated in terms of >' can be translated into terms of <::; and j_. The resulting axioms are not .;;; natural. A simple axiomatization directly in terms of > and j_ would be preferable. ~ 5.9 PROOF OF THEOREM 10 A com?!ete structure of conditional probability with independence induces, by Defimtwn 12, a structure of qualitative conditional probability. PROOF. We remark first that by Axiom 3 of Definition 11 the definition of;(:;' is independent of the particular choice of A', B', C', D' Definition 12. For if A'~ A", B' ~ B", C' ~ C", and D' ~ D", with A' j_ D', A" ..L D" C' 1.. B', and C" j_ B", then A' n D' ~A" n D" and C' n B',...., C" n B".' By completeness (Axiom 4), ;(:;'is a connected relation; for let A' = A n B C' = C n D, and use the axiom twice with m = 1 to obtain D' ,__, D witl~ A' 1.. D' and B',....., B with C' J... B'; then compare A' n D' with C' n B'. in' 5. PROBABILITY REPRESENTATIONS 242 To show that;?:' is transitive, use completeness successively with m = 1, 2, 3, 4, and 5 to obtain A' = A n B, B' r--oB, C' '""'"' C n D, D' ~ D, E' ~ E n F, and F' ~ F, such that A', B', C', D', E', F' are _L -independent. By the remark above on the definition of;?:', we infer, from A I B ;?:' C I D and C I D ElF, that A' n D';?: C' n B' and C' n F';?: D' n E'. By the definition of j_ -independence, we have A' n D' j_ C' nF', 243 EXERCISES Finally, suppose that A ~ B ~ C, D ~ E ~ F, B I A ;?: F i E and C i B ;?:' E i D. Choose A', B', C', D', E', F', respectively ~A, B, C, D, E, F and _[_-independent (completeness with m = 1, 2, 3, 4, 5). We have B' n E' ;?: F' n A' and C' n D' ;?: E' n B' (with strict;?:' going to strict;?:). By of ;?:, C' n D' .?: F' n A' strict inequality, if there is a strict antecedent), hence C I A ;?: (or >. as required) F I D. 0 C' n B' .L D' n E'. EXERCISES Therefore, Axiom 3 yields A' n F' n C' n D' ;?: E' n B' n C' n D'. > If not A I B ;?:' E IF, then E' n B' >A' n F'. If C' n D' 0, then the A' n F' n C' n D' strict clause of Axiom 3 yields E' n B' n C' n D' (since _[_-independence gives E' n B' j_ C' n D' and A' n F' j_ C' n D'). This contradiction shows that A I B ;?: E I F as required. On the other hand, if C' n D' ~ 0, then the strict clause of Axiom 3, applied to C' j_ D', 0 l 0, and D' > 0 (D' E /f yields C' ,....., 0 ; thence, > C' n F' ,...._ 0 ;::: D' n E'; and since D' > 0, the same arguments give E' ,....., 0 and E' n B' ~ 0 . Thus, in this case also, A' n F';?: E' n B', as required. The second axiom of Definition 8, that X E Iff - A and A E A'" iff A I X,-../ 0 I is immediate; obviously, A I X ~' B I X iff A ~ B. Similarly, the third axiom of Definition 8, that X I X ;?:' A I B and Xi X,....,' A I A, is immediate; and the fourth axiom, A i B ,.._,' A n B I B, follows from the definition of ;?:'. Thus, only the additivity and strong 6-tuple conditions need to be established. Suppose that A ! C ;?:' D I F and B j C ;?:' E I F, where A n B = 0 = D n E. Without loss of generality, assume A u B C C, D u E C F. By D, E and choose F' ""F with completeness, choose C' ~ C with C' F' _LA, R By definition of we have A nF';?: Dn C' B n F' ;?: En C'. 1. For each of the following three families Iff of sets, either show that Iff is an algebra of sets (Definition 1) or show a violation of one of the axioms. (i) Let X be a nonempty set and A a nonempty subset of X. Then /f consists of all subsets of X that either include A or are disjoint from A. Let X be a nonempty set and A a nonempty subset of X. Then /f consists of 0 and all subsets of X that intersect both A and -A. (iii) Let iF and tfJ be algebras of sets on X and Y, respectively. Then t%' consists of those subsets of Xu Y of the form A u B where A is in SF and B is in tfJ. (5.1) 2. Suppose that (X, Iff, P) is a finitely additive probability space (Definition Prove that, for all A, BE Iff P(A u B)= + P(B) -- n B). 1) 3. Prove the independence of the axioms of qualitative probability (Definition 4). (5.2.!) 4. In a structure of qualitative probability (Definition 4), either prove that the relation ~* (defined on p. 206 in connection with the definition of tight) is an equivalence relation or a counterexample. 5. Suppose that (X, Iff, ;?: ) is a structure of (Definition 4) and that A, B, C, D E 6 are such that A u C Prove that if A;?: Band C;?: D, then An C;?: B n D. = probability B u D = X. (5.2.1, 5.3.1) 6. Suppose that (X, Iff, <:) is a structure of qualitative probability (Definition 4) and that A = {A I A E Iff and A ~ ,P}. Prove that A has the following properties: By Lemma 2, which uses only qualitative probability, (Au B) n F';?: (DuE) n C', and if either antecedent inequality is strict, so is the conclusion. By the disjoint union property of _L [Axiom 2 (iii) of Definition 9] we have C' _L D u E and F' j_ A u B. Hence, A u B i C ;?:' D u E I F, with holding if either antecedent is strict. (i) If A (iii) 1f A >' E A, BE 6, and B C A, then BE A'. If A, BE A, then A u B, A n BE A. E A', then -A E Iff-%. (5.2.!, 5.3.1) 5. 244 7. Let X = {a, b, c, d, e} and C of C such that: (i) = PROBABILITY REPRESENTATIONS 2x. Suppose that ;?::; is a weak ordering Chapter 6 Additive Conjoint Measurement {a}~ (iii) {a, b} ,....., {c, d, (iv) {a, e} ~ {c, d}. Find a probability representation consistent with ;?::;. Is it unique? (5.4. 3, 5.5.3) 8. Does the probability space in Exercise 7 satisfy the hypotheses of Theorem 6? Can you define general circumstances in which a finite structure of qualitative probability has a unique representation? (5.4.3, 9.2.2). 9. Suppose that if there exists an A withO < P(A) < 1, then (X, <ff, P) is a finitely additive probability space (Definition 2) with the following property: for every A in ,ff with 0 < P(A) < 1, there exists a B in <ff with 0 < P(B) < 1 that is independent of A (p. 238). Prove that this space has no atoms (p. 216). (5.1, 5.4.2, 5.8) In the following three exercises we suppose that <X, C, JV, ;?::;) is a structure of qualitative conditional probability (Definition 8, Section 5.6.1 ). In the proofs you may use the axioms of Definition 8 and the lemmas of Section 5.7.1, but not Theorem 7. C, then A I B <: A I C. (5.6.1, 5.7.1) 11. If A, A', B, B' E C, C, DEC- JV, A i C;?::; A'! D, and B I C :::S B' I D, then (A- B)l C;?::; (A'- B')l D. (5.6.1, 5.7.1) 10. If A E<ff, B, C e<ff and B 12. Suppose that A, B, C E C and that C n B, C - BE C -- JV. Prove that A I C ,...._.A! C n B iff A I C ~A I (C -- B). Thus, letting C =X, if A is independent of B (see Section then A l B ,...._, A i -B. (5.6.1, 5.7.1, 5.8) 13. Prove Exercises 10-12 using the representation of Theorem 7. (5.6.2) 0 6.1 SEVERAL NOTIONS OF INDEPENDENCE Not all attributes that we wish to measure have an internal, additive structure of the sort we studied in Chapters 3 and 5 (temperature and density are just two examples), and when no extensive concatenation is apparent we must use some other structure in measuring the attribute. One of the most common alternative structures is for the underlying entities to be composed of two or more components, each of which affects the attribute in question. A simple example is the attribute momentum which is exhibited by physical objects and which is affected both by their mass and by their velocity. Another example, discussed in Section 1.3.2, is the judged comfort of various humidity and temperature combinations. The objects in these examples cannot readily be concatenated, nevertheless, they can be treated as composite entities. The rest of this volume, except for the last chapter, concerns theories leading to construction of measurement scales for composite objects which preserve their observed order with respect to the relevant attribute (e.g., preference, comfort, momentum) and where the scale value of each object is a function of the scale values of its components. Since such theories lead to simultaneous measurement of the objects and their components, they are called conjoint measurement theories. The present chapter deals with the additive case; more complicated rules of combination are studied in Chapters 7, 8, and 9. The first part of this chapter (Sections 6.1-6.3) deals with the two-component case. Empirical examples from physics and psychology are presented 245