Download Chapter 5 Probability Representations

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

History of randomness wikipedia , lookup

Randomness wikipedia , lookup

Probabilistic context-free grammar wikipedia , lookup

Probability box wikipedia , lookup

Infinite monkey theorem wikipedia , lookup

Birthday problem wikipedia , lookup

Inductive probability wikipedia , lookup

Ars Conjectandi wikipedia , lookup

Probability interpretations wikipedia , lookup

Transcript
4.
198
DIFFERENCE MEASUil.EM~
Order A x A by the ordering of the lengths of chords joining pairs of points
in A. Which axioms of an absolute-difference structure (Definition 8) are
(4.IU)
satisfied (proof or counterexample)?
19. Show that the axioms of an absolute-difference structure (Definition 8)
are all necessary for the representation of Theorem 6, except for the
solvability axiom.
(4.1 0)
Chapter 5 Probability Representations
20. Suppose that dissimilarities of the 15 pairs of six stimuli, a, b, c, d, e,J,
are ranked (from 15, highest dissimilarity, to l, lowest) as in the f'bt1owing
matrix.
a b c d e f
7
1
4
a
b
c
8
5
14
ll
212
9t
6
d
e
15
13
12
9t
2t
Show that no absolute-difference representation is possible. Show that with
only one inversion of order, a solution is possible. In the resulting representation, the
distance must be more than three times as large as the cd
distance.
(4.10)
21.
Suppose that Re 2 is ordered as follows:
y)
<:
, y')
I x - y I :;:o I x' -
iff
y' J.
Does every pair of points in Re have a midpoint (Definition 11)? If so,
what is it? Answer the same questions with respect to the lexicographic
(4.!2)
ordering of the plane (Exercise 6).
22. Suppose that four subjects, s, t, u, v, exhibit the following preference
orderings over objects a, b, c, d, where S is subject and R is rank.
iZ
s
u
v
2
3 4
a b
d c
c b
b c
c d
b a
a d
d a
1------
Show that no one-dimensional unfolding representation is possible.
0 5.1
INTRODUCTION
The debate about what probability is and about how probabilities shall
be calculated has been prolonged and is, in many respects, unresolved;
nonetheless few disagreements exist about the mathematical properties of
numerical probability. It is accepted that probabilities are numbers between 0
and 1 and that they are attached to entities called events. Moreover, it is
widely agreed that events in a given context are subsets-although not
necessarily all subsets-of a set known as a sample space. Sample spaces
are intended to represent all possible observations that one might make in
particular situations. The classical assumptions about events are embodied
in the following definition:
DEFINITION l. Suppose that X is a nonempty set (sample space) and that
tff is a nonempty family of subsets of X. Then tff is an algebra of sets on X iff,
for every A, BE .f?:
(4.12)
l.
-AEtff.
2.
Au BE C.
Furthermore, if tff is closed under countable unions, i.e., whenever A; E tff,
i = l, 2, ... , it follows that u;:l Ai E .f?, then tff is called a a-algebra on X.
The elements of 0 are called events.
199
5. PROBABILITY REPRESENTATIONS
5.1.
INTRODUCnON
2·01
200
words to be called an algebra of sets, g must have. the two pro:erties
ln .
,
1 mentation and unions. As lS shown m ectwn
of bemg closed under comp ethat g JS also closed under intersections and
5.3.1 (Lemma I), it follows
differences.
.
b
f ets and that it is a subalgebra of
f
IS an a 1ge ra o s
0 b serve tha t l 0 ,
A · · g nd so X - A u -A
~very algebra on X siXnce if A iOnnCJy, ~:e~o~~:~x al~ebras ar~-usually of
JS m g and 0 = IS m <o •
!s
mterest.
S t
54 1 we shall have occasion to consider a somewhat
Later, m ec wn · · '
B · · g only when
weaker definition in which we are assured that A u
IS m o
A n B = 0 and A, B are in C.. .
. 1 robability is the basis of
The following axiomatic defimtwn of numenca p
d
r .tl by
most current work in probability theory. It was first state exp ICI y
Kolmogorov (1933).
.
t
t that g is an algebra of
DEFINITION 2. Suppose that X zs a nonemp y se'
Th tri le
t on X and that p is a junction from g znto the real numbers Be /
se s C, P), is a (finitely additive) probability space iff, for every ' E .
A
L
P(A) ~ 0.
2.
P(X) = 1.
If A n B = 0' then P(A u B) = P(A)
3.
+ P(B).
It is a countably additive probability space if in addition:
5.
g is a a-algebra on X.
If A; E g and A; n A;
p
=
(U
i=l
0' i c/= j, i, j = l, 2, ... , then
A;) =
I
P(A;).
t=l
. .
f Definition 2 · the interested
We do not study the surpnsmg consequenceso
' F Her (1957
'
l
d book on probabihty theory, e.g,, e
reader should consu t a goo
D fi . . 7 as (part of) a representation
1Ion l
.
tead
JS to treat
e
m
1966). 0 ur pan, ms
, .
. .
conditions under which an ordering
theorem; specifically, we mqmre m~o f
p that satisfies Definition 2.
> f <ff has an order-preservmg unc IOn
. "
r
~ o
.
.
be interpreted empirically as meanmg qua !Obviously, the orderm; ~~to "Put another way, we shall attempt to treat
tatively at least as pro a _e, as.
t
a measurement problem of the
the assignment of probabiht!es to evens as t of e g mass or momentum.
same fundamental character as the measuremehn
, . ."' g of probability are
.
.
h d b t s about t e meanm
'
From this pomt of view, t e e a e l
thods to determine >, It is not
·
1·t
b t acceptable empmca me
~
f
,
b bTt hould have been the focus o
m rea I y, a ou
evident why the measurement of pro a I I y s
more philosophic contr.oversy than the measurement of mass, of length, or
of any other scientifically significant attribute; but it has been. We are not
suggesting that the controversies over probability have been unjustified,
but merely that other controversial issues in the theory of measurement
may have been neglected to a degree.
To those familiar with the debate about probability (see, for example,
Carnap, 1950; de Finetti, 1937; Keynes, 1921; Nagel, 1939; Savage, 1954,
1961) these last r,cmarks may seem slightly strange. Theories about the
representation of a qualitative ordering of events have generally been classed
as subjective (de Finetti, 1937), intuitive (Koopman, 1940a,b; 1941), or
personal (Savage, 1954), with the intent of emphasizing that the ordering
relation <:; may be peculiar to an individual and that he may determine it
by any means at his disposal, including his personal judgment.I But these
"mays" in no way preclude orderings that are determined by well-defined,
public, and scientifically agreed upon procedures, such as counting relative
frequencies under well-specified conditions. Even these objective procedures
often contain elements of personal judgment; for example, counting the
relative frequency of heads depends on our judgment that the events "heads
on trial I" and "heads on trial 2" are equiprobable. This equiprobability
judgment is part of a partly "objective," partly "subjective" ordering of
events.
Presumably, as science progresses, objective procedures will come to be
developed in domains for which we now have little alternative but to accept
the considered judgments of informed and experienced individuals.
Ellis pointed out that the development of a probability ordering is
... analogous to that of finding a thermometric property, which ... was the first step
towards devising a tern perature scale.
The comparison between probability and temperature may be illuminating in
other ways. The first thermometers were useful mainly for comparing atmospheric
temperatures. The air thermometers of the seventeenth century, for example, were
not adaptable for cc:nparing or measuring the temperatures of small solid objects.
Consequently, in the early history of thermometry, there were many things which
possessed temperature which could not be fitted into an objective temperature
order. Similarly, then, we should not necessarily expect to find any single objective
procedure capable of ordering all propositions in respect of probability, even if we
assume that all propositions possess probability. Rather, we should expect there to
be certain kinds of propositions that are much easier to fit into an objective probability
order than others ... (Ellis, 1966, p. ! 72).
Since virtually all representation theorems yield a unique probability
measure, agreement about the probability ordering of an algebra of events
' Many of the relevant papers are collected together in Kyburg and Smokier (1964).
5.
PROBABILITY REPRESENTATIONS
20J
5.2. A REPRESENTATION BY UNCONDITIONAL PROBABILITY
202
t bout a method to determine that ordering is
or, as is more usual, agreemen a . 1
bability-an objective probability.
hich two related probability
sufficient to define a umque numenca pro d
studied condmons un er w
Various auth ors h ave
th ught of as objective and the
h
1uebra of events, one o
p h
the most interesting results are those
measures on t e same a "'
other subjective, must agree. er aps
.
842
included in Edwards (1962); ~lso see S~ctJOn ba.bility do not appear to be
d·fficulties m measunng pro
d
In any case, the 1
h
.
hen we apply extensive an
1
d""'"
t from those t at anse w
l
inherenLy rueren
1 h
In nleasuring lengths, for examp e,
t
thods e sew ere.
d
other measurernen me
lativel short distances; certainly other meth~ s
Y
d
"derable disagreement exists
rods are practrcal only for re
·
·cal research an consr
must be used m astronomi
hi h of' the alternatives is appropriate (for a
among astronomers about w c f N th 1965)
.
·
e Chapter 15 o
or ,
·
.
general drscusswn, se
.
b
!ative frequencies or by dtrect
Besides the methods of order~ndg. evetnts ;h~~s of inferring the ordering.
. d
ts there are m tree me
d .
human JU gmen ,
. d .
oifStatistics introduced an or enng
, ( 19 - 4) b 0 ok The Foun atwns
'
·
Savage s
)
. '. , .
d from decisions among acts with uncertam
of "personal probabthty mferre
d A is an event we denote by
'
. fi ·f a b are outcomes an
outcomes. Bne y, 1 '
.
.fA
curs and b if A does not occur.
aA u b __ A an act whose outcome Isna ~ill b~~ome clear in Chapter 8.) If a is
(The reason for the. umon notatm b . preferred to the act aB u b_B ' then
· ·
k
pref erre d to b, an d Jf the. act a A \.) -A
b IS
bl than B (for the dectsJon rna er
we may infer that A JS ~ore h~:idaea ~s discussed further in Section 5.2.4,
whose preferences are studted). T
.
U
lly the idea is introduced Ill
where more complete references are given. t s~\o~h probability and utility
conjunction with another: the rneasurernen t od utilities This was Savage's
such that acts are ordered by their exhpe~ e 8 to a ~ew version of it. A
d
devote a separate c ap er, '
(1970)
approach , an we
. .
.1 bl in the book by Fishburn
·
fine survey of work on this toptc lS ~var a d·~·ons under which an ordering of
Our concern in this chapter is Wit cbo.nl. 1 1 asure Whether or not these
ted by a proba 1 1ty me
·
h
events can be represen
.
t blished by one or anot er
( 96. fi d by event ordenngs es a
conditions are sans e .
.
d here· see Luce and Suppes l ),
.
tal
'
· procedure JS· not dJscusse .
expenrnen
pp. 321·-327).
os.2
ESENTATION BY UNCONDITIONAL
A REPR
PROBABILITY
;t• ons. Qualitative Probability
5 2 1 Necessary C on d• 1
·
. .
.
h
that we want to prove, we may
Since we know the represen~a~r~n ~o e~::e necessary conditions on the
proceed as m Chapters 3 a .
.
.
that the representatiOn exists.
assump t ron
As always, if Pis to be order preserving, ?::; must be a weak ordering of t!J'.
Since this property is familiar and its difficulties as an empirical law ha¥e
already been discussed earlier (Section 1.3.1), albeit in other contexts, we
need not repeat ourselves; slight rewording suits those comments to thili
context.
We next show that the certain event X is strictly more probable than the
impossible event 0, the empty set, and that any event A is at least as probable
as 0. Since, by Axiom 2 of Definition 2, P(X) = I and since Xu 0 = X
and X n 0 = 0, it follows from Axiom 3 that
1 = P(X) = P(Xu 0) = P(X)
+ P(0) =
1
+ P(0),
and so P(0) = 0. Since Pis order preserving and P(X) = 1 > 0 = P(0),
X
0. Moreover, by Axiom 1, P(A) ;:?. 0 = P( 0 ), so A <:; 0. Observe
that A ,....., 0 does not imply A = 0. That is to say, nonernpty events may
be equivalent in probability to the impossible event, and these null events,
as they are called, correspond to sets with zero probability. We can also
derive that X<:; A and a number of other properties, but it is unnecessary to
state them separately because they turn out to be direct consequences of the
properties that we list (see Section 5.3.1).
Next, we wish to consider how the order relation and the operation of
union relate. Since Axiom 3 of Definition 2 is our main tool and since it is
conditional on the two events being disjoint, we may anticipate disjointness
playing a role in the corresponding qualitative property. Consider three
events A, B, C, with A disjoint from B and C. ff B ?::; C, then by the orderpreserving property and Axiom 3,
>
P(A u B) = P(A)
+ P(B)
? P(A)
whence A u B ?::; A u C. Conversely, if A
P(A)
+ P(B) =
U
+ P(C) =
B ?::; A
P(A u C),
u C,
P(A u B) ? P(A u C) = P(A)
+ P(C),
and so subtracting P(A), P(B) ;;?: P(C), whence B <:; C. Summarizing, if
A n B = A n C = 0, then B ?::; C if and only if A u B ?::; A u C.
Intuitively, this condition seems plausible. If B is at least as probable as C,
then adjoining to each the same disjoint event should not alter the ordering,
and conversely, deleting such an event also should not alter it.
The primary relation between complementation and ordering is: If A ?::; B,
then -B?::; -A. This is not stated as a separate axiom because it follows
from the others (Lemma 4, Section 5.3.1).
The final necessary condition that we derive in this section is an Archirnedean property. We first derive a special case of it. Suppose that A 1 , ... , Ai , ...
5.2.
5. PROBABILITY REPRESENTATIONS
205
A REPRESENTATION BY UNCONDITIONAL PROBABILITY
204
Furthermore, the structure is Archimedean
.
f
. -1- · A n A ~ 0) eat:h of which
. _'
2
A.~ A.
are mutually disjoint events (Le., or t -,--- J, i . '
is equivalent in probability to_ some fixed even~ A, J.e., f~r t A- i~ 'a~-~~edt. If a
B
fi ·te induction on Axwm 2 of Defimtwn 1, Ui~1 '
_
f
p;o~ab7tity measure exists, then by a fmite induction on Axwm 3 o
4.
5.2.2
Definition 2,
I
i=l
p / 1 it follows that n cannot exceed
.
e if A '>- 0 th en smce ~
1-f .P(A) > 0 , i ..
,
/ b, t
st finitely many mutually disjoint events
l/P(A) Thus there can e a mo
11
:h~c~ni~u;~r:~~~~e~~:~oli~:~
A1
B1 and B1
(ii)
B; n C;
(iii)
B;
(iv)
C; -~A;
AH 1 = B; v C; .
=
r-.o
=
r-..;
P(a)
0 ;
A; ;
~~:~m~u!h:s~~~te~:~u~~~~~ s~afn~~ed l:~~~~~:~~t~:i%?~:a~~~~pl~::~
every strictly bounded standard sequence e
,
here since every standard sequence is bounded by X.
We summarize these properties as:
.
is an algebra of
1y se t ' that ,ff
.
.
DEFINI TION 4 · Suppose that X zs a,_.nonemp
Th
· 1 (X 0 >) zs a structure
v
d that > is a relation on "' ·
e trzp e
' ' ~
sets on A, an
--: . .
A B C E $·
of qualitative probability if.ffor every ' '
·
3.
> P(b)
and
{b, e} >{a, c}.
(!)
+ P(c), P(c) + P(d) > P(a) + P(b),
P(b) + P(e) > P(a) + P(c).
Adding these three inequalities and canceling P(a) + P(b)
P(c), we
obtain P(d) + P(e) > P(a) + P(b) + P(c), whence {d, e} :>-{a, b,
Therefore, if we can construct an order that simultaneously satisfies inequality
{a, b, c} > {d,
and the axioms of qualitative probability, then we know
that this order cannot possibly be represented by a numerical probability.
The trick is to find a way to avoid the tedious task of exhaustively verifying
Axiom 3 for the example.
Suppose we can find a probability measure P on X such that inequality (I)
holds (and necessarily {d, e} >{a, b, c}) and such that all subsets A of X
other than {d, e} and {a, b, c} are either strictly more probable than {d, e} or
strictly less probable than {a, b, c}, i.e., not[P({d, e}) ?: P(A) ?: P({a, b,
Of course the ordering induced by P automatically satisfies the Axioms of
Definition 4. Now, change that order only to the extent of changing
e} >{a, b, c} to {a, b, c} > {d, e}. Since no set lies between these two, their
relations to all other sets are unaffected by the inversion, and so the new
ordering, which cannot possibly have a probability representation, satisfies
Definition 4. Therefore, we need only construct a measure P for which
inequality (l) holds and no event separates {d, e} and {a, b,
Let 0 < E < ±, and ignoring the normalizing factor P(X) = 16- 3<, let
P(a) = 4P(d) = 2,
<:;> is a weak order.
X> 0 and A ;2:: 0.
SupposethatAnB=A n C
{c, d} :>-{a, b}
+
A;
i~~~~~e: ~oal~o;~:s~~~:i~~ ~n~~~o:rt ;;:~~i~~ 1~
2.
The Nonsufficiency of Qualitative Probability
If an order-preserving representation P exists, then
We see that if event A;
then 1t follows that Ai+I
q
. .
.
t
A-corresponds
. l t t A and of C which IS eqmva1en 10
turn is eqmva en o. ; ,
h; .' .
sible inductive definition which
to i + 1 disjoint copies of A. So t IS IS a send . P(A ) - z·P(A) we must
·
.·
· s· by m uctwn · ;t - A >, 0 Note
generalizes our ongmal notiOn. mce
1
l.
0, any standard sequence relative to A is finite.
{a} >{b, c},
-"' .
l b a + sets and that ,.._, is an
DEFINITION 3. Suppose that (0 ts an age r 0'J
•
d d
A
A
where A . E $ zs a stan ar
equivalence relation on $. A sequence 1 , ... , ;h, ... ,
. t B' C- ~ $ such that:
. to A E g lifffior i -- 1' 2 , ... , t ere exzs
;, '
sequence relative
(i)
:>-
The question we now ask and answer in the negative is: Does every (finite)
structure of qualitative probability have a finitely additive, order-preserving
probability representation? The following ingenious counterexample is due
to Kraft, Pratt, and Seidenberg (1959). Let X= {a, b, c, d, e} and let 6 be
all subsets of X. Consider any order for which
P(A;) = nP(A).
each of which is equivalent infprhobability ttyo
need a slightly stronger form o t IS proper
For every A
iff
=
0 . Then B .2:; C
iff A v
B
.2: A v
C.
E,
P(b) = 1 -
E,
P(c) = 3 -
E,
P(e) = 6.
Using additivity, it is easy to verify that inequality (1) holds, that P({d, e})
~··
8,
206
5.
PROBABILITY REPRESENTATIONS
5.2.
and that P({a, b, c}) = 8 - 3E. Since 3E < 1, the only p~ssibility for finding
a distinct set A such that P({d, e}) ~ P(A) ~ P({a, b, c}) 1s for P(A) to be of
t h e f orm 8 - ZE
· 1· -- 0 , 1, 2, 3 • However, the number 8 can be expressed as
a sum of distin;t elements from {1, 2, 3, 4, 6} in only two ways, 1 + 3 + 4
or 2 + 6. These correspond to {a, b, c} and {d, e}, respectively. T~us, n~ A
distinct from those two sets can have probability between 7 and 8 mclus1ve.
This completes the example.
So further properties are needed in order to construct a representation.
5.2.3
Quite a number of conditions are known which, along ';ith the properties
of qualitative probability (Definition 4), are e~ch sufficient to prove the
existence of an additive probability representatiOn. As .would b~ expected,
each condition postulates the existence of events With c~rtam (strong)
properties, and therefore limits the algebras of sets ~or which th~ theory
establishes a representation. We present three conditiOns here which lead
to unique, finitely additive measures, another in s.ection .5.4.2 which_leads to
a unique, countably additive measure, and a fifth m Sectwn 5.4.3 which leads
to a unique measure when X is finite.
Suppose (X, C, ;;::;> is a structure of qualitative probability.
and A
B, then there exists a partition {Cl ,... , Cn} of X such
B U C;.
that, for i = 1, ... , n, C; E C and A
If A, BE
,ff
>
201
THEOREM 1. Suppose that {X, C, :;::;) is a structure of qualitative probabil·
ity. Axiom 5" is true iff the structure is both fine and tight,
This is Savage's Theorem 4, p. 38.
Although Savage's axiom is, i:n the presence of the axioms for qualitative
probability, actually stronger than the next condition, which was invoked
both by de Finetti (1937) and Koopman (1940a,b), Savage felt that his was
the easier to justify intuitively.
AXIOM 5'. Suppose that (X, C, ;;::;> is a structure of qualitative probability.
For every positive integer n, there exists a partition C1 , ... , Cn of X such that,
for i,j = 1, ... , n, C; E C and C; ,...._, C1 •
Sufficient Conditions
AXIOM 5".
A REPRESENTATION BY UNCONDITIONAL PROBABILITY
>
Savage (1954) pointed out that if tff does not seem to ~u~fi~l Axiom ~",
one can always enlarge it to an algebra that does by adJommg ~ll fimte
sequences of heads and tails generated by a coin of one's own choice. For
any n, this leads to a partition into 2n events, and, as he sa1d,
It seems to me that you could easily choose such a coin and choose n sufficiently
large 50 that you would continue to prefer to stake your gam on A, rather than o n
the union of B and any particular sequence of n heads and tails. For you to be ab 1e
to do so, you need by no means consider every sequence of heads and tails equally
probable (Savage, 1954, p. 38).
Savage established a reformulation of his axiom which is interesting ~nd
which we shall need again in Section 5.4.2. Suppose that (X, C, ;;::;> IS a
system of qualitative probability. It is called fine if, for ever~ A
0, there
· t S a part.t.
> C; for 1 = 1, ... , n._ For
eXJS
I lOll {C1 , .•. , C}
n of X such that A ~
A Bin ,ff define A ,-...,* B iffor all C, D
0 such that All C = B_ 1\ D- 0,
then A ~ c;;::; B and BuD;;::; A. The system is called tight 1f whenever
A~* B, then A ,_._,B.
>
>
If C includes all finite sequences of heads and tails generated by what you
believe to be independent tosses of a fair coin, i.e., a head is just as probable
as a tail, then Axiom 5' must be fulfilled for you.
A common drawback of both axioms is that they force X to be infinite-the
illustrative coins suggest why. The following condition, which is due to
Luce (1967), is fulfilled in many infinite structures and in some finite ones,
although not in most.
AXIOM 5. Suppose that (X, tff, ;;::;> is a structure of qualitative probability.
If A, B, C, DE C are such that A n B = 0, A > C, and B ;;::; D, then there
exist C', D', E E tff such that:
(i)
(ii)
(iii)
(iv)
E ""Au B;
C' n D' = 0;
E=:J C' u D';
C' ~ C and D' ,...._,D.
It is difficult to say just what this means other than to restate it in words.
If A and B are disjoint, if A is more probable than C, and if B is at least as
probable as D, then the axiom asserts that somewhere in the structure there
is an event E that is equivalent in probability to A u B such that E includes
disjoint subsets C' and D' that are equivalent, respectively, to C and D. This
axiom can be satisfied by some finite probability spaces, e.g., let X =
{a, b, c, d}, let P(a) = P(b) = P(c) = 0.2, P(d) = 0.4, and ,ff is all subsets.
In some sense, however, most finite structures that have a probability
representation violate the axiom, e.g., any structure whose equivalence classes
fail to form a single standard sequence.
As was mentioned, Savage showed that, in the presence of the axioms for
qualitative probability, Axiom 5" is strictly stronger than 5', and under the
same conditions Luce (1967) showed that 5" is strictly stronger than 5. Fine
(I97la,b) used a still weaker structural condition-the existence, for every n,
of ann-fold "almost uniform" partition--in conjunction with the (necessary)
208
5. PROBABILITY REPRESENTATIONS
condition that the order topology on rff have a countable base. None of
these axiom systems permits finite models, so none is weaker than Axioms
1-5. Fine (1971a) proved that all are strictly stronger than Axioms 1-5.
Assuming that Axioms 1-5 are weaker than the other systems that have
been mentioned and that adding the (necessary) Archimedean condition to
those of qualitative probability is not objectionable, the appropriate result to
prove is the following.
THEOREM 2. Suppose that (X, rff, ;2::) is an Archimedean structure of
qualitative probability for which Axiom 5 holds, then there exists a unique
order-preserving function P from t!f' into the unit interval [0, 1] such that
(X, rff, P) is a finitely additive probability space.
The proof, which is given in Section 5.3.2, involves reducing this structure
to one of extensive measurement. It is evident that the crucial property, if
A n B = 0, then P(A u B) = P(A) + P(B), is formally similar to the
corresponding one in extensive measurement provided that U is treated as
a concatenation operation o. Clearly the condition A n B = 0 will have
to be rephrased as a restriction on the concatenation operation of the form:
(A, B) in fJiJ, where :!1J C tff X C.
In this connection, it should be noted that an ordering of an algebra of
sets can be interpreted as a type of extensive measurement provided that the
sets are collections of objects rather than events. Such an interpretation is
especially natural for masses, in which case any collection of objects placed
on one pan of a balance is an element of the algebra. Such an interpretation
is less natural, though nonetheless possible, for length measurement. Observe
that when we take one of the primitives to be an algebra of sets, we automatically assume the commutativity and associativity of concatenation,
since both properties hold for unions of sets; these properties had to be listed
more or less explicitly as axioms in the theories of Chapter 3. This is, of
course, a general mathematical phenomena: the more structured the primitives, the weaker the axiomatic system need be in order to describe a given
structure.
5.2.4
Preference Axioms for Qualitative Probability
If the ordering;?:; on tff is inferred from choices among acts, then the axioms
of qualitative probability can be reformulated as assumptions concerning
observable choices. For simplicity, we restrict attention to acts that have
either two or three uncertain outcomes.
Let rff be an algebra of sets on X and '1f! a set of outcomes or consequences;
let a A u bE u Ce be the act that has consequence a if A occurs, b if B occurs,
5.2.
A REPRESENTATION BY UNCONDITIONAL PROBABILITY
209
and c if C occurs, where it is understood that a, b, c are in '1f/, A, B, C
are in tff', and A, B, C partition X. Denote the preference relation as ;2:: *.
This is a weak ordering of both '1f! and of the set Ot of all acts, i.e., formally,
a subset of ('1f! X '1f/) u (Ot X Ot). We do not need to assume that a consequence and an act are compared.
We also identify acts that have the same event-outcome interpretation,
e.g., permutations do not matter, aA u bE u ce = bE u ce u aA , and a
three-outcome act with two identical outcomes reduces to a two-outcome
act, aA U bE u be = aA u baue = aA U b_A .
We define a relation ;2:: on tff as follows:
A ;2:: B ifffor all a, bE '1f/, if a ;2::;* b, then aA u b_A ;2::;* aE u b_ 8 .
The following three axioms are sufficient (in the presence of the assumptions
made informally above) to make (X, rff, ;2::) a structure of qualitative
probability:
!.
!fa>* b, c ;2::* d, and aA u b_A ;2::;* as u b_ 8
CA
U
d_A ;2::;* C8
U
,
then
d_E.
2.
If a>* b,
3.
lfaA U ba U be ;2::;* aA· U bE U be·, then
then ax u b"' >* aq, u bx and aA
U
b_A ;2::;* a'" u bx.
aA u aE U be ;2::;* aA· U aE U be· .
We also need to assume that ;2::* is nontrivial on '1f!, i.e., there exist a, b with
a >*b.
The first axiom implies that ;2:: is connected: for take a >* b, then for any
A, B, the connectedness of ;2::* yields some relation between aA u b_A and
as U b_a , and this same relation then holds with c, d substituted for a b
provided that c ;2:: * d. Transitivity of ;2:: follows from transitivity of ;2:: s~
<
;(:;is a weak order.
The second axiom obviously implies X > 0 and A ;2:: 0.
The third axiom is based on the idea that if two acts have a common
consequence b when B occurs, then the contingency bE is irrelevant to the
decision between them, and so b can just as well be replaced by a. The axiom
only assumes this in a very restricted situation: when the other two outcomes
are b and a. The general principle, which this axiom exemplifies, is called the
extended sure-thing principle. It leads to an expected utility representation
(see Chapter 8, Savage's book, or Fishburn's book, referred to above). In
this highly restricted form, however, it is already sufficient to yield Axiom 3
of qualitative probability: If A is disjoint from B u C, then B ;2:: C if and
only if A u B ;2:: A u C.
The proof is fairly obvious: Letting a >* b, B ;2:: C translates into
aB U
bA U b_(AUE)
<=* Ge U
bAU b--(Aue)
210
5.
PROBABILITY REPRESENTATIONS
while A u B ?:; A u C translates into
Thus the axiom (which is equivalent to its own converse) gives the desired
result.
Axioms 1 and 3 are highly interesting propositions about choices. If
Axiom 1 is violated, there would seem to be some kind of special interaction
between outcomes and events, e.g., such that even though a >* b, c ?:; * d,
and aA u b_A ?:;* aB u b_B, the combination cA is somehow much less
valuable than cB .
Violations of the extended sure-thing principle have been much discussed
(see Chapter 8). From the present standpoint, the interesting thing to ~~t~ is
the light that Axiom 3 sheds on the principle of additivity of probabilities.
If a>* band A ?:; A', then we must have
aA
U
bB
U
be?:;* aA·
U
bB
U
5.3. PROOFS
211
The objective probability of winning a is now two-thirds in each case, but
the tables are turned. In the first scheme, the probability could be as low as
one-third, depending on the coin toss, but in the second scheme, the probability is surely two-thirds. The point is that black adds a great deal more
unambiguity to white than it does to red. Only if such considerations play
no role-either because of the constancy of ambiguity over events or
because of sophisticated decision-makers-may we expect the probability
ordering inferred from decisions under uncertainty to lead to an additive
representation.
Other discussions of the inference of a qualitative probability from
decisions are found in Section 8.7.1 and in Anscombe and Aumann (1963),
Davidson and Suppes (1956), Edwards (1962), Fishburn (1967g), Pratt,
Raiffa, and Schlaifer (1964), Ramsey (1931), Savage (1954), and Suppes
(1956).
he·.
But if, somehow, B adds more to A' than it does to A, we could have the
reversal,
a A· u aB u be· >*aAu aB U be.
An example of a possible violation is offered by the following game
(adapting an idea of Ellsberg, 1961). Consider three urns, containing 200
white balls, 200 black balls, and 100 red balls, respectively. One of the first
two urns is selected by tossing a fair coin, and without informing the player
which one was selected, white or black, its 200 balls are thoroughly mixed
with the 100 red balls. Then a single ball is drawn from the mixture, and a
prize is awarded or not depending on its color. Let a denote a valuable
prize and b denote no prize. The player may prefer the payoff scheme
5.3
5.3.1
PROOFS
Preliminary Lemmas
LEMMA 1. If rff is an algebra of sets on X (Definition I, Section 5.1),
then 0, X E rff and rff is closed under set difference and intersection. If tff is a
a-algebra, then it is closed under countable intersections.
PROOF. This is a standard result, easily proved using the formulas
X= Au -A, An B =-(-Au -B), etc.
0
The common hypotheses of Lemmas 2-4 is that <X, rff, ?:;) is a structure
of qualitative probability (Definition 4, Section 5.2.1) and that A, B, C, DE rff.
ared U bblack U bwhite
to the scheme
awhite U bblack U bred ·
In each case, the objective probability of winning the prize a is exactly onethird. The difference is that in the first case, he knows there are one-third
red balls, whereas, in the second case, depending on the coin toss, there may
be no white balls at alL
Now suppose that we change b to a on black, i.e., compare
ared U ablack U bwhite
to
awhite
u
abiack
u
bred .
LEMMA 2. Suppose that A n B = C n D = 0. If A ?:; C and B ?:; D,
then A u B ?:; C u D; moreover, if either hypothesis is >, then the conclusion
is>.
PROOF. Let A'= A-D, D' = D- A. Note that A', D' are in C
(Lemma 1) and that
A' n B =A' n D =AnD'= C n D' =
and that A' u D = A u D'.
Using Axiom 3 twice,
A' u B ?:; A' u D = Au D' ?:; CuD'.
0,
5.
212
PROBABILITY REPRESENTATIONS
If either hypothesis is strict, then by the "if" part of Axiom 3,
A' uB >CuD'.
Observe that A n D is disjoint from both A' u B and C u D'; thus, by
Axiom 3,
Au B
and
>
=
(A' u B) u (A n D) ?::; (CuD') u (A n D)
=
CuD;
0
holds if A' u B >CuD'.
COROLLARY. If An B
AuB~Cu D.
LEMMA 3.
Cn D
=
=
A ~ C, and B ~ D,
0,
then
If A :J B, then A ?::; B.
pROOF. Use Axiom 3 on A - B ?::; 0 (Axiom 2), noting that B is
disjoint from A
Band from 0 and that (A - B) u B = A.
()
COROLLARY
1.
X?::; A.
COROLLARY
2.
If A ':J B, then A> B iff A- B
LEMMA 4.
>
0 ·
If (X, tff, ?::;) is an Archimedean structure of qualitative prob~~ility for
which Axiom 5 is true, then it has a unique, finitely additive probabzl!ty representation.
PROOF. Let A denote the equivalence class that includes A, and let ,g
be the set of all equivalence classes, excluding l'lJ. By Axiom 2, X E tff. If X
is the only element of tff, then we let P(A) = 0 if A ,..__, 0, P(A) = 1 1f A ~X,
and p fulfills the assertions of the theorem (note that (lJ IS closed under
disjoint union, by Lemma 2). We assume, therefore, that ,g contains A c:F X.
We construct an extensive structure on ,g letting
=
{(A, B) A
1
A'EA,
A v B, if A n B = 0. By the corollary of Lemma 2, o is well defined.
We let ~ be the induced simple order on C.
We shall now prove that (<ff, ~, fllJ, o) is an extensive structure with no
essential maximum (Definition 3.3, Section 3.4.3). The six axioms are
established in corresponding numbered paragraphs. It is convenient in some
places to use boldface Greek capitals for elements of tff.
~ is a weak order; in fact, it is a simple order.
Positivity follows from Corollary 2 of Lemma 3.
2. Associativity. Suppose that (r, Ll), (roLl, A) E fllJ. The essence of
the proof is to construct A, B, C pairwise disjoint such that A = r, B =Ll,
C = A. For then, the rest will follow via the associativity of the union
operation. The construction is as follows. Take disjoint sets D' E r o Ll,
C' E A. By positivity D' > Ll, thus, by Axiom 5 we can take E ,...._, D' v C'
and B, C disjoint with E :J B u C and BE A, C E A. Then let A =
E- (B u C). It remains only to show that A E r; but since (A o Ll) o A =
E = (r o A) o A, this follows from Lemma 2, applied twice.
3. We need to show that if (A, C) E fllJ and A~ B, then (C, B) E fllJ; the
rest follows by the obvious commutativity of o and by Lemma 2. If A = B,
there is nothing to show; if A > B, apply Axiom 5.
4. Solvability. If A> B, apply Axiom 5 to A
B and 0 ~ 0 to
obtain A', B' with A' :J B', A-~ A', B ~ B', and let C =A'- B'; then
A=BoC.
6. Finally, we show that {n I B > nA} is finite, for A E C, where lA = A,
nA = (n - l) A o A. We do this by showing that the existence of nA implies
0. For suppose
the existence of an n-term standard sequence relative to A
that nA exists and that we have a k-term standard sequence relative to A,
A 1 , ... , Ak, with k < n, and with A, E iA for i ~ k. Since (kA, A) E fJJ, we
have Bk E kA and ck E A such that Bk n ck = 0. Let Ak+l = Bk u ck
Then clearly, A1 , ... , Ak+1 is a (k + 1)-term standard sequence relative to A,
and A1c+1 E (k + 1) A. Since a !-term standard sequence can be obtained by
A 1 = A, we have the required result by induction.
1.
5.
>
Theorem 2 (p. 208)
fllJ
213
PROOFS
>
If A?::; B, then --B?::; -A.
PROOF. If A?::; B, -A> -B, then by Lemma 2, X> X, contradicting Axiom 1.
0
5.3.2
5.3.
>
0, B
B'EB
> 0,
with
~
Now, by Theorem 3.3 (Section 3.4.3), there is a positive-valued ratio scale
on ,g such that
A~B
for
and there exist
(A, B)
Choose the unit so rp(X)
A'nB'=
>
Thus fllJ is nonempty because if A c:ft both X and l'lJ , then -A
0
(Lemma
so the pair (A, -A) E fllJ. We define o on B by letting A o B =
rp(A) ;;? rp(B),
iff
and
E
=
P(A)
fllJ,
rp(A o B)
L For A
=
lrp(A),
0,
E
=
rp(A)
tff, let
if A> 0,
if A~ 0.
+ rp(B).
5.
214
PROBABILITY REPRESENTATIONS
It is easy to see that P fulfills the assertions of the the~rem. Moreover p is
unique, since if another such function P' existed, then </;'(A) = P'(A) ;ould
be a representatiOn of (tff, ~, PA, o). Since¢;' = a,P and ,P'(X) = </;(X)= 1
</>' = </; and P' = P.
0
5.4 MODIFICATIONS OF THE AXIOM SYSTEM
5.4.1
QM-Algebra of Sets
The notions of event and probability given in Definitions I and 2 have
proved satisfactory for almost all scientific purposes. The one outstanding
excepuon JS quantum mechanics. In that theory both P(A) and P(B) may
exist and yet P(A n B) need not. For example, suppose that A is the event
that, at some time t, a certain elementary particle is located in some region a
of space and B is the event that, at the same time t, this same particle has a
momentum whose value lies in some interval {3. Then the event A n B means
that, at time t, the particle is both in the region a and has a momentum in {3.
One basic feature of quantum mechanics is that we may be able to observe
whether event A occurred or whether event B occurred, but not necessarily
be able to observe whether both events A and B, i.e., event A n B, occurred.
In the theory this means that no probability is assignable to A n Beven when
P(A) and P(B) are specified.
Such an incompatibility between two very fundamental sets of ideas must
be resolved, presumably by modifying probability theory in some way. Those
interested in a full understanding of the physical issues involved and in some
of the proposed modifications of both classical logic and probability theory
should consult Birkhoff and von Neumann (1936), Kochen and Specker
(1965), Reichenbach (1944), Suppes (1961, 1963, 1965, 1966), and Varadarajan
(1962). Suffice it to say that the simplest proposal (Suppes, 1966) is to restrict
further our notion of an algebra of sets (Definition l) to:
DE~INITION
5. .suppose that X is a nonempty set and that Cis a nonempty
famzly of subsets of X. Then C is a QM-algebra of sets on X iff, for every
A, BE C:
1.
-AEC;
2.
If A n B
5.4.
0,
then A u BE C.
Furthermore, if C is closed under countable unions of mutually disjoint sets,
then C is called a QM a-algebra .
Since the definition is still stated in terms of unions, and the difficulty
215
seemed to be about intersections, it is not clear that we have coped with the
problem; however, as is shown in Lemma 5 (Section 5.5.1), all is well in the
sense that when A, Bare in C, An B is in C if and only if A u B is in C,
and if A :) B, then A - B is in C.
As a simple example, if (X, C, P) is a finitely additive probability space,
then for any A in ·C, the set of all events probabilistically independent of A
is a QM-algebra of sets on X; see Section 5.8.
What is not clear about this suggestion is just how badly probability theory
suffers. For one thing, how many of the important theorems of probability
theory cannot now be proved ? This is not the place to enter into that discussion. For another, to what extent have we impaired our ability to construct
a numerical probability representation from a qualitative ordering ? If one
carefully reexamines Axioms l-4 of Definition 4 and Axiom 5 of Section
5.2.3, the proofs of the lemmas used in the proof of Theorem 2, and the proof
of that theorem, then one finds that, with the exception of Lemmas 1 and 2,
all unions are of disjoint sets and that all differences are of one set that
includes another. As we mentioned above, Lemma 5 (Section 5.5.1) replaces
Lemma 1. As for Lemma 2, we note that only disjoint unions enter into its
statement, that it is a necessary condition for the representation (the argument for necessity is very similar to the argument for Axiom 3), and that it
is strictly stronger than Axiom 3. Therefore, we adopt it as Axiom 3', in
place of Axiom 3.
AXIOM 3'. Suppose that A n B = C n D = 0. If A ;?:; C and B;?:; D,
then A u B ;?:; C U D; moreover, if either hypothesis is)-, then the conclusion
is)-.
We shall encounter this axiom again in the treatment of qualitative conditional probability (Sections 5.6 and 5.7). Except for the proof of Lemma 5,
we have established the following:
THEOREM 3. If Cis a QM-algebra of sets on X and if (X, tff', ;?:;) satisfies
Axioms 1, 2, 3', 4, and 5, then there is a unique function P on C that satisfies
the Kolmogorov axioms (Definition 2) and preserves the order ;?:; on C.
5.4.2
=
MODIFICATIONS OF THE AXIOM SYSTEM
Countable Additivity
Theorem 2, and others like it, establish conditions under which finite
additivity hold, but nothing has been said about countable additivity (see
Definition 2), which is needed in many parts of probability theory. Countably
additive representations of qualitative probability structures have been
studied by Villegas (1964, 1967) and Fine (l97la). The main result (from our
216
5.
PROBABILITY REPRESENTATIONS
5<5.
PROOFS
217
point of view) is embodied in the following definition and theorem. The
definition follows Villegas (1964).
DEFINITION 6. Suppose that (X, <ff,
is a structure of qualitative
probability and that ,ff is a a-algebra. We say that <:; is monotonically continuous on ,ff if/for any sequence A1 , A 2 , ••• in <ff and any B in !%', if Ai C Ai+l
and B <:; A;/or all i, then B <:; U~1 Ai.
THEOREM 4. A finitely additive probability representation of a structure of
qualitative probability, on a a-algebra, is countably additive iff the structure is
monotonically continuous.
This theorem is proved in Section 5.5.2. An immediate corollary is that
Theorem 2 gives a set of sufficient conditions for a countably additive
probability representation for a binary relation<:; on a a-algebra: Axioms 1-5
and monotone continuity.
Given Axioms 1-·3 (qualitative probability) and monotone continuity,
Axioms 4 and 5 can be dispensed with if a different structural condition is
used. To formulate the condition, we need a qualitative definition of a
probability atom. The numerical definition is that A is an atom if P(A) > 0
and if A -:J B, then P(B) = P(A) or P(B) = 0. Correspondmg to this, we have:
DEFINITION 7. Let
event A E t' is an atom
>
be a weak ordering
and for any
iff A > 0
BE
an algebra of sets t'< An
1%', if A -:J B, then A ~ B
orB~ .0
The structural condition which replaces Axioms 4 and 5 is that ,g has no
atoms. Recall the definitions of fine and tight (Section 5.2.3).
THEOREM 5< Suppose that (X, !%',
is a structure of qualitative
probability, ,ff is a a-algebra, <:; is monotonically continuous on !%',and there are
no atoms. Then the structure is both fine and tight, there is a unique orderpreserving probability, and it is countably additive.
single finite standard sequence. To do so, we replace the structural restriction
(Axiom 5) by some sort of equal-spacing axiom.
In the case of probability structures, it is impossible for a single standard
sequence to ex~aust ,g- {
For if A 1 , A 2 , .. ,,An is a standard sequence,
then by Defimtwn 3, we have A 2 ~ B 1 u C1 , where B1 , C1 are disjoint and
both are ~AI; thus, either B1 or C1 must be outside the sequence. The most
that one can expect is for a single standard sequence to exhaust the set of
nonnull equivalence classes iff. However, there are a great variety of structures
of ,g compatible with a single standard sequence in 6', e.g., any finite structure
sat1sfymg Axwm 5. Perhaps the simplest structure is one with a finite number
of equivalent atoms-corresponding to a uniform distribution on a finite set
of categories. This structure can be characterized by the following axiom:
AXIOM 6. If A <:; B, then there exists C E C such that A~ B u C.
This axiom, and the following theorem, are due to Suppes (1969, pp. 5-8).
THEOREM 6. Suppose that (X, <ff, :C:> is a structure of qualitative probabtlzty (Axwms l-3 of Definition 4) such that X is finite and Axiom 6 holds.
Then there exists a unique order-preserving function P on ,ff such that (X, <ff', P)
zs afimte/y additive probability space. Moreover, all atoms in ,ff are equivalent.
The proof is found in Section 5.5.3.
Since any event in a finite probability space is a union of disjoint atoms
and a null event, the probabilities are all multiples of the atomic probability.
5.5 PROOFS
5.5.1
Structure of QM-Algebras of Sets
LEMMA 5. Suppose that
then for all A, BE ,ff;
For the proof of this and other related results, we refer to Villegas (:964).
This structural condition turns out, of course, to be a specml case of Axwm 5
(given a-algebras and monotone continuity), since a structure of qualitative
probability that is both fine and tight satisfies Axiom 5 (see SectiOn 5.2.3).
(iii)
5.4.3 Finite Probability Structures with Equivalent Atoms
PROOF.
As in the chapters on extensive measurement and difference measurement,
it is possible here to give a separate and elementary axiomatization of a
(i)
(ii)
and X
0
E
,ff
is a QM-algebra of sets on X (Definition 5),
<ff;
if A
-:J B, then A - BE <ff;
A
B
U
E ,ff
(i)
iff A n
B
E
6".
The proof is the same as Lemma 1, since A
n -A
(iii)
If A u BE <ff, then by part (ii), A - B
=
0.
=
(ii) Since A -:J B implies -A n B = 0 and since -A
it follows that -A u BE ,ff, So A - B = -(-A u B) E 1%'.
E
,g
'
(A u B) - Band
5.
218
B - A = (A u B) - A
E
PROBABILITY REPRESENTATIONS
tff. Thus, the symmetric difference,
(A - B) u (B -
A)
E
5.5.
PROOFS
219
For each k, Bk C Bk+I , and
00
tff;
I
P(Bk) ,s;_
P(A;)
i=l
and again, by part (ii), the intersection, which is the union minus the symmetric difference, is in tff. Conversely, if the intersection is in tff, then A - B =
A - (A n B) and B - A = B - (A n B) E tff by part (ii), so A u B, which
is the disjoint union of A n B, A - B, and B - A, is in tff.
0
,s;_ p
(U Ai) -- P(Am)
~=1
5.5.2
Theorem 4
(p. 216)
P(B).
=
Suppose that a structure of qualitative probability <X, tff, :?::;) has an orderpreserving probability measure P and that tff is a a-algebra. Then <X, tff, P) is
countably additive iff:?::; is monotonically continuous on tff.
PROOF. If Pis countably additive, let {A;} satisfy A; C Ai+1 and A;
By countable additivity,
Thus, B :?::; Bk . But since Am
U Bk
;:5 B.
Since the partial sums of the right-hand expression are P(Ai+I), they are
bounded by P(B), and so P(U::r Ai) ~ P(B), implying B :?::;
Ai .
For the converse, note first that by finite additivity alone,
are pairwise
disjoint, then
k=l
>
0, we have
"'
=
U A;
=
B u Am
Thus, monotone continuity is contradicted.
The other case is where no such Am exists. Then clearly, P(A;) = 0 for
all but fimtely many i. Let J be the subset of integers i for which P(A) = 0
and Jet
'
'
A= UA;.
iEJ
By finite additivity,
I P(A;) ~ p (U A;).
i=l
i=l
=I P(A;)
This is true because each partial sum on the left is P(U~~r A;) ~ P(U;':1 Ai).
Suppose that for some disjoint union, strict inequality holds, in violation of
countable additivity, i.e., let
p (
u
Ai) -
i=l
I P(Ai)
i¢J
w
=
I
P(Ai)
i=l
=
E
> 0.
t=l
We consider two cases. First, suppose that for some Am in the sequence,
E
?:; P(Am)
>
Therefore, P(A) =
E.
Let
ck
0.
=
U
A;.
iEJ
Let
i<_k
k
Bk
=
U A;,
i=l
B
=
(U A;)- Am·
t=l
> B.
i=l
Then P(Ck)
=
0 by finite additivity; so we have
5.
PROBABILITY REPRESENTATIONS
5.6.
A REPRESENTATION BY CONDITIONAL PROBABILITY
221
220
but
0
violating monotone continuity.
5.5.3
Theorem 6
P(A
(p. 217)
. .
b b ·1· fi
hich X is .finite and
,g
o~" quahtatzve pro a t zty or w
.
A structure
, '
'J
b bT
entatwn ·
which satisfies Axiom 6 (Section 5A.3) has a unique pro a I zty repres
'
<X
moreover, all atoms are equivalent.
X' ,ff' >') in which the nonempty null
PROOF. Define a new structure < ' '~
N ~
ut
. .
d (If N 15
. the union of all null events,
0' p
,
N ·ff A >B) It is
events are ehmmate ·
' ~
- N rf!' = {A - N I A E
A - N;?: B t
~ •
•
X - X
,h t
,ff'
also satisfies the hypotheses of the theorem,
easy to show t a ne~pty, event in rf!' is expressible in a unique way as the
moreover, any no
. . .
union of finitely many pairwise diSJOillt atoms. ( uch exists by the finiteness
L A be a minimal atom With respect to
s
.
et l
J
be the set of distinct atoms eqmvalent to Al.
of
and et
,... ,
. rf!' If there are others, let A be mmiinal
We show there are no other atoms m
among the atoms >' Al . Define B and C by
B
C
=Au (A2 u ... u
= A1 u
u · · u An)-
-
= A . B Axiom 3, B >'C. By Axiom 6, thereexists
(If n
1, Bh ~ A,BC~· C ~)DyWith no loss in generality, take D dtsJomt
D E rf! sue t at
_D
A· but then
from C. Since D contains no mimmal atoms,
,
=:
B ~· CuD ;?:' C u A = B u Al
a contradiction. Therefore
Now construct 0 on rff
>' B,
is the set of all atoms.
~~~~'ing 0(A) be the cardinality of the set
{Ai I Ai C A--
Let P(A)
=
of the elements of the event B-in other words, when we know that the
event B occurred. Thus, the outcome must be in both A and B, that is,
in An B. This immediately suggests P(A n B)= P(A I B); however, this
cannot be correct since, given B, either A or --A must certainly occur so
P(A I B)+ P( -A I B)= l, whereas, P(A n B)+ P( --A n B)= P(B).
However, if we set
p(A)jn.
[t
is easy to verify that p is an order-preserving prob;
bility and that P is unique.
A
REPRESENTATION BY CONDITIONAL PROBABILITY
Most work in the theory of probability requires the import;~:b~~~n~
( n of conditional probability. Intuitively, P(A I B) !S the pr
. y
:~e~~vent A when we have the added information that the outcome JS one
=
P(A
n B)/P(B),
(2)
all is well, provided that P(B) > 0. This is the accepted definition of conditional probability.
Because this concept is so important and because ordinary probability is
the special case of it in which B = X, several authors have treated it as the
basic notion to be axiomatized and have given generalizations of Kolmogorov's axioms (Definition 2). Copeland (1941, 1956) and Renyi (1955)
are of particular interest. Csaszar (I 955) studied conditions under which a
real-valued function of two set-valued arguments, which is what conditional
probability is, can be expressed in the quotient form of Equation (2) in
terms of a function of one set-valued argument, which is what unconditional
probability is.
Our concerns are somewhat different. We wish to know conditions under
which a qualitative relation of the form A j B;?: C I D, meaning "A given B
is qualitatively at least as probable as C given D," can be represented by a
probability measure of the form of Equation
in the sense that
AjB;?:CID
iff
P(A n B)/P(B) ?o P(C n D)/P(D).
To our knowledge, the only published attempts to do this are Koopman
(1940 a,b), Aczel (1961, 1966, p. 319), Luce (1968), and Domotor (1969).
Domoter gave necessary and sufficient conditions in the finite case similar
to the axiom systems in Chapter 9. The other three systems have much in
common. Koopman treated 0 as the only null event and did not assume that
every pair A i B and C I D are necessarily comparable; whereas Aczel and
Luce admitted other null events and did require comparability of all pairs.
The lack of comparability forced Koopman to introduce some extra axioms.
Aczel and Luce postulated an additivity requirement, similar to Axiom 3 of
Definition 4; whereas, Koopman invoked the property that if A I B <:; C j D,
then -C j D <:;--A B, to serve an analogous function. The major differences lie in the choice of sufficient conditions and in the proofs. Aczei
postulated a real-valued function on the pairs A I B, which gives the ordering,
and imposed con~inuity conditions. His proof used results in functional
equations. Koopman used a kind of uniform-partition postulate and constructed the probability function
a
process. Luce, whose necessary
1
5.6
I B)
5.
222
223
5.6. A REPRESENTATION BY CONDITIONAL PROBABILITY
PROBABILITY REPRESENTATIONS
The structure is Archimedean provided that:
conditions were essentially the same as those of Aczel, reduced the result to
known results in measurement theory, using a solvability axiom. He also
used a functional-equation argument. Here, we present a modification of
Luce's system.
In addition to the above studies, Copeland (1956) stated a system of axioms
that is a good deal more transparent than Koopman's, but he did not
construct a numerical representation. Moreover, he assumed that his basic
system of elements is closed under the conditioning operation, which is to
say, if A and Bare events, where B is nonnull (see below), then A I B is also
treated as an event. This seems a trifle odd intuitively, although, as a matter
of fact, our structural restriction entails that every A I B is equivalent in
probability to some event.
5.6.1
Necessary Conditions: Qualitative Conditional Probability
At first glance, one might expect to begin with an algebra of sets rff and a
relation ?:; on rff x rff'; however, the proviso that P(B) > 0 in Equation (2)
makes it clear that matters are not quite this simple. Events with probability
0-~null events--must be excluded from the conditioning position. Such
events form a subset .Y of rff, and the relation <; is on tff X (rff' --..¥).We
use the suggestive notion A I B for a typical element of rff X (tff- ..¥).
Since we are familiar with the general procedure for finding necessary conditions, we first state the axioms in the form of a definition and then discuss
those that are novel to this context.
DEFINITION 8. Suppose that X is a nonempty set, rff is an algebra of sets
on X, .Y is a subset of rff, and <: is a binary relation on rff X ( rff - ..¥). The
is a structure of qualitative conditional probability
quadruple (X, tff, Jil',
iff for every A, B, C, A', B', and C' E tff (orE rff- ..¥,whenever the symbol
appears to the right of I) the following six axioms hold:
1.
(rff' X (rff'-
?:;) is a weak order.
Every standard sequence is finite where {A} i
7.
t
d
d
iff for all i, Ai E ,g- .Y' A t+r :1 A ~' a~d X I X>' AslaAs
an ar sequence
i
i+l ,....._, A1 j A2 .
Of course, as in all our axiom systems, the choice of necessary axioms
depends somewhat o~ :he nature of the nonnecessary structural restrictions.
.
In the case of Defimtwn 4 (qualitative probability) the thr
extrem ly t
dh
,
ee axwms are
b ~ . na ura1an ave long been ~ccepted as the definition of qualitative
~ro ability. We added only the Archiinedean axiom and a structural restric~on. Here, we must be somewhat more tentative about the choice of axioms.
.e~m~s 9-12 (Sect:on 5.7.1) and their corollaries are also necessary conditions, they are denved from Axioms 1-6 only with the h l fth
axiom int d
db
ePo
e structural
ro uce ~1ow. Some of these properties could well be included in
the concept of qua!ItatJVe conditional probability. Indeed in Section 5 6 3
we shall relabel Lemma 12 as Axiom 6' and use it inste:d of Axiom
the
of a nonadditive conditional probability
m
(SectiOn 5.6.4).
representation
6:
~evelopment
So~e properties, which it would seem natural to include in qualitative
conditiO. nal probability, follow from Axioms 1-6 WI.thout any structural
res t nctwn. For example, Axiom 3 could be expanded into
XJX?:;AIB?:; 0jX,
but this can, be proved (Corollary 1 to Lemma 6).
. The proOI that Axioms l-5 are necessary for the desired representation
~~ quite ~asy. Note, fo.r example, that P(A I B) c= P(A n B I B), which
YAie~ds Axwm 4. ~~e denvatwn of Axiom 5 is similar to that of its analog
x1om 3 of Defimtwn 4.
'
To show that Axiom 6 is necessary, note that if B I A<: C' I B', with
A :1 B and B' :1 C', then
P(B)/P(A) ;? P(C')/P(B').
Similarly, C I B?:; B' I A', B :1 C, and A' :1 B' yield
E .Y iff A I X ,..__, 0 i X.
I A and X I X ?:; A i B.
4. A I B ~ A n B I B.
5. Suppose that A n B = A' n B' = 0. If A I C <: A' I C' and
B I C <: B' I C', then A u B I C?:; A' U B' \ C'; moreover, if either hypoth-
~:~~e
esis is >, then the conclusion is >·
6. Suppose that A :1 B :1 C and A' :1 B' :1 C'. If B I A ?:; C' I B' and
C I B?:; B' I A', then C I A ?:; C' I A'; moreover, if either hypothesis is >,
or C I A.. >
·
......, C' 1 A' , as reqmred.
Moreover, we know that P(B) and P(B')
~re pos~tiv~, he~ce, by the second inequality, P(C) is positive. Thus if either
mequahty IS stnct, multiplication preserves the strict inequality.
,
2.
X
3.
X I X~ A
E
tff -- "'¥; and A
then the conclusion is
>-
P(C)/P(B) ;? P(B')/P(A').
the P-values are nonnegative, these inequalities can be multiplied to
P(C)/P(A) ;? P(C)/P(A'),
5. PROBABILITY REPRESENTATIONS
5.6. A REPRESENTATION BY CONDITIONAL PROBABILITY
225
224
The Archimedean property, 7,
Since P(An) ::( 1 and P(A1) > O,
0
< P(A1) :S::
. also derived by 'multiplying fractions:
18
P(A1)
P(An)
P(A 1)
= P(A 2 )
=
(-:fc~:~
P(A2) ...
P(A3).
r-1.
. eX I X'-- A I A. it follows that P(A1)/P(A2) < L Thus, the inequality
S me
, /
1
2 '
fl · 1
n
[P(A )/P(A 2)]n·-l can hold for only m1te Y many · .
.
,)
P(A 1) <
~
1
·
6(
t asted with Axwm 6
See also the further discussion of Axwm
as con r . S .
and
in Sections 5.6.3 and 5.6.4 (also, following Lemma 12 m ectiOn
of Axiom 7 in Section 5.7.2.
The proof (Section 5.7) involves three major steps. First, a number of
elementary properties, such as A I B ;2': C I D implies -C I D ;2': -A I B, are
shown to follow from the axioms (Section 5.7.1). Second, we introduce an
ordering ?::;' on If, by letting A ;:;:;' B if and only if A I X ;:;:; B I X (Section
5.7.2). The structure <X, rff, ?::;') is shown to satisfy the hypotheses of
Theorem 2. By the proofs of Theorems 2 and 3.3 (reduction to extensive
measurement, and thence, to Holder's theorem) we obtain an Archimedean,
positive, regular, ordered local semigroup (Definition 2.2), whose elements
are equivalence classes in rff - AI.
Third, we obtain a semiring structure by using the conditional structure
to define a multiplication operation (Section 5.7.3). If A I X~ C 1 B, with
B ::J C, then the representation that we aim for gives P(A) P(B) = P( C).
Therefore we define A * B = C (where boldface letters denote equivalence
classes). It is then easy to show that the axioms of Definition 2.4 hold. The
required representation follows from the semiring theorem, 2.6.
5.6.3
5.6.2
Sufficient Conditions
.
of Definition 8, we add the nonnecessary
Continuing the num benng
property:
AXIOM 8.
!fA ! B
<: C I D, then there exists C' in rff such that C n
DCC'
and A I B "'"-' C' I D.
This simply says that rff is sufficiently rich so that whlentevterAA'i!BB T>hC,·s iisD~
k C ' D eqmva en o
·
we can add just enough to C to rna e
I.
·oms !-6 and 8 yield
rather strong structural restriction. In particular, Axl
b hown thar
Axiom 5 of Section 5.2.3 (see 5.~.2), and mo~e;~e~o~~~a;ha~ ~s X can b~
apart from trivial cases, Axwm 5 . of SectJOn . .
. '
ste:n included
partitioned into arbitrarily fine eqwvalent even:s. Ko~~~ra; :h?:.e is a nonnull
the nonnecessary postulate that for every po~a~~~~~~ents. It would, of course,
event that can be partitiOned mto nb eqmpr~h- g weaker even if additional
be desirable to replace Axtom 8 y some m
,
necessary axioms had to be added.
h /X @" AI
is an Archimedean structure of
THEOREM 7. Suppose t at\ , '
'..
8 for which Axiom 8 holds.
{f
h that for all
qualitative conditional probabzbty (Defimtwn ) p
on ' sue
Then there exists a unique real-valued function
A C E {f and all B, DE rff - AI:
'
(X, t!f, P) is afi.nitely additive probability space;
A
A
E
AI iff P(A) = 0;
P(A n B)/P(B) ~ P(C n D)/P(D).
I B ;2': C I D
Further Discussion of Definition 8 and Theorem 7
In this subsection we discuss three topics: the role of various axioms in the
proof of Theorem 7; alternative proofs and axiomatizations; and the
testability of the axioms.
Axioms 1-3 of Definition 8 are obvious ground rules for qualitative conditional probability, and are often used in the proof without explicit reference.
For exam;-le, we may say, "since A ! A ,-...., B ! B ... ," without noting that this
uses Axioms 3 (X X~ A A) and 1 (transitivity).
By restricting the relation ?: to hold between A I B and C I D only if
B ::J A and D ::J C, we could bypass Axiom 4 altogether. In effect, this is
what we do during the proof, establishing the representation conditions for
B::JA, D~ C:
(iii)' A I B;:;:; C I D iff P(A)/P(B) ~ P(C)jP(D).
We then use A I B r-..; A n B I B only to pass from this to the more usual
representation, involving condition (iii) of Theorem 7.
Axiom 5 is, of course, the key assumption for an additive probability
measure. It is analogous to Lemma 2 of Section 5.3.1 and Axiom 3' of
Section 5.4.1. From the standpoint of semirings, it has two principal uses.
First, it is used to show that ® is we!I defined and monotone, where if
A n B = 0, A $ B = A v B. This only uses part of the strength of the
axiom, i.e., if A ! X ;2': A' I X and B I X;:;:; B' I X, then Au B I X ;2: A' u B' I X
(of course, An B =A' n B' =
This use of Axiom 5 is not really
"·~"'Y"'v.. , being buried in the reductions to Theorems 2 and 3.3. Rather, 5 is
used heavily in the preliminary lemmas to derive other needed features of
qualitative conditional probability. The second major use in the semiring
1
1
226
5. PROBABILITY REPRESENTATIONS
development is to prove; right distributivity of multiplication * over addition
@. Here, we use the form of the axiom with I C instead of I X. The structure
of the axiom suggests right distributivity, i.e., if A I X,....., A' I C (A * C = A')
and
B I X,....., B' I C (B * C = B'), then Au B I X~ A' u B' I C
[(A @ B) * C = A' @ B' = (A * C) @ (B *C)].
.
.
Axiom 6 also plays two main roles in the semiring development, besides 1ts
use in some preliminary lemmas. First, it is used to show that multiplication
*is commutative. This leads immediately to the left distributive law. Second,
Axiom 6leads to Lemma 12 and its corollary (Section 5.7.1), which are used
to prove left and right monotonicity of multiplication.
.
Except for commutativity of multiplication, we would be better off usmg
Lemma 12 as an axiom, instead of Axiom 6. To further this discussion, we
state Lemma 12 here as an alternative axiom, 6'.
AXIOM 6'. Suppose that A :::l B :::l C and A' :::l B' :::l C'. If B I A<:; B' l A'
and C j B <:; C' j B:, then C l A<:; C' I A'; moreover, if either hypothesis is>
and C E If -AI, then the conclusion is )>-.
It can be seen that Axiom 6' is very similar to Axiom 6. The hypotheses
of Axiom 6' are "uncrossed," with B I A dominating B' I A', etc. Axiom 6'
is very analogous to the weak monotonicity condition in difference structures
(Chapter 4), whereas Axiom 6 is analogous to what has been called the strong
6-tuple condition (Block & Marschak, 1960). Here, we use Axiom 6, plus
other conditions, including Axiom 8, to derive 6'; then 6' does most of the
work, except for commutativity of multiplication. We could use _6' ~s .an
axiom, if we were willing to add another axiom to guarantee left distnbutiV!ty.
For example, we could assume:
Suppose that A :::l A', B :::l B', and A n B
=
0. If A'
1
A
~ B' I B, then
A' u B' I A u B .~A' I A ,......, B' ! B.
Alternatively, we could try to push through the analogy with difference
structures, deriving 6 from 6' via a representation theorem from Chapter 4.
This is done in the next subsection (proof in 5.7.4), but we seem to reqmre a
stronger structural condition (see Axiom 8', below) in order to obtain the
solvability axiom of positive-difference structures.
..
Luce' s ( 1968) proof relied on the positive-difference structure of conditiOn~!
probability. The exponential of a difference representation gave a r~t10
representation Q. Normalizing so that Q(X) = 1, Q ful_frlled all the as~ertwns
of Theorem 7, except possibly additivity. The normalized Q was un1que up
to Q--.. Qa, and Luce used a functional-equation argument to show that o.:
could be chosen such that additivity also held. Our use of the ordered local
serniring incorporates the essence of a slightly different functional-equation
argu.ment, since the proof of Theorem 2.6 is based precise!~ o~ :fi~dmg the
unique additive representation for a semiring that is also multJphcatlve.
5.6.
A REPRESENTATION BY CONDITIONAL PROBABILITY
227
Archimedean axioms are familiar to the reader by this time, so we need
not comment on the role of Axiom 7; but we do note that we have defined
standard sequence in terms of the ratio structure, rather than in terms of
the additive structure. We do so because the resulting definition is much
simpler (compare Definition 3) and it leads directly to the Archimedean
axiom for positive differences (see Section 5.7.4)0 However, the proof that
Axwms 1-8 1mply the Archimedean axiom for the additive structure is a bit
tricky (5.7.2).
. !he structural condition, 8, plays many different roles. We note here that
1t IS used to define multiplication: A * B ~= C where C is the solution to
A ! X,..._, C ! B, with B :::l C. The solution exists by Axiom 8 (note Axiom 4 is
also used), since A I X<:; 0 I_ B (Corollaries 1 and 2 of Lenima 6, below).
Th1s completes the d1scussron of the roles of the various axioms and of
alternative axiomatizations and proofs. We now comment briefly, on the
testability of the axioms.
The status of the various axioms depends critically on the method used to
establish the empirical ordering over If x (If ·- AI). If the ordering is
estimated from relative frequency counts-by the proportion of the times
that A occurs when B occurs-we would most likely attribute any failure of
the axwms to error. If the ordering is obtained by having a subject rate the
likelihood of A, given B,_ then it would seem that the most interesting axioms
to test are 5 and 6. Axwm 4 could fail if the subject did not perceive set
relatwns correctly: but this would not bear strongly on the issue of a probability reoresentatwno The possibility that Axiom 5 might fail while 6 holds
motivates our development, in the next section, of a nonadditive conditional
representation. If Axiom 6 fails, then the properties of "subjective conditiOnal probability" are indeed very far removed from the representation
here axiomatized.
in the case where the relation <:; is inferred from decisions under uncert~inty, Axiom l also comes into question, for the same reasons as in uncondJtJOnal representations (see Section 5.2.4 and Chapter 8).
It seems hard to envisage situations in which tests of Axioms 2 3 7 or 8
would be empirica!ly i~teresting. In most, if not all, practical 'si;ua~ions,
% = {
whJCh Simplifies matters considerably.
5.6.4
A Nonadditive Conditional Representation
As we mentioned in the preceding subsection, there may be cases where an
ordermg of "conditional probability" satisfies the multiplicative property
expressed by Axwm 6 or 6', but does not satisfy the additive property of
Axwm 5. We therefore want a nonadditive representation to be obtained
Without using Axiom 5.
228
5. PROBABILITY REPRESENTATIONS
We retain Axioms 1-4 and 7 of Definition 8; w~ replace Axiom 6 by
Axiom 6', introduced in the preceding subsection; and we replace Axioms 5
and 8 by the following:
5'.
If A
.AI and A :J B, then BE%.
8'. If A I B ?.::; C I D, then there exists C' E tff such that C n D C C' and
A I B ,...._, C' I D; if, in addition, C E tff- %, then there exists D' such that
D :J D' :J C n D and A I B ,__, C I D'.
E
Axiom 5' corresponds to Lemma 8 (iii) below, i.e., it can be proved from
Axioms l-6 of Definition 8. The first part of Axiom 8' is just a restatement
of Axiom 8; the second part has the additional proviso that just enough can
be removed from D- C to obtain a subset D' such that A I B ,...._ C I D'.
Naturally, this must be restricted to the case where Cis not in%.
THEOREM 8. Let
d', %, ?.::;) satisfy the conditions of Theorem 7,
except that Axioms 5, 6, and 8 are replaced by 5', 6', and 8', respectively.
Then there exists a function Q from tff - .AI to (0, l}, such that;
(i)
Q(X) = I;
5.7.
Q(A
n B)/Q(B)
~
=
Qa, a
> 0.
Note that this theorem deals entirely with the restriction of ?.::; to
(6x
- %); we do not have Q defined on elements of%. In
order to extend Q to .A'", so that (ii) still holds, and so that A is in % if and
only if Q(A) = 0, we would need to add certain additional conditions,
which are provable with the help of Axiom 5, but not provable from the
present axioms. For example, we would need properties such as
A I B ?.::; 0 i X, 0 I X~ 0 i B, and A l B ~ 0 I B if and only if A is in%.
Since the extension to % complicates matters and seems of little interest,
we omit it.
Theorem 8 is proved in Section 5.7.4.
PROOF. Suppose that, on the contrary, -A 1 B > -C 1 D. By the
stnct mequahty clause of Axiom 5, (Au -A) 1 B
(C u -C) D. By
Axwms l and 4, BiB> DID, contradicting B 1 B ~ D 1 D derived from
Axwms l and 3.
0
>
COROLLARY 1.
COROLLARY 2.
A I B?.::; 0 I X.
0
IA
r-o
0
0
I B.
PROOF. By Axioms 1 and 3, A I A ""B I B; by the lemma,
-A I A ,...,_, - B I B; and the conclusion follows by Axioms 4 and I.
0
COROLLARY 3.
If A ! C ~-A I C and BID""" -B 1 D, then
A I c,..._,BID.
If A
1
LEMMA 7.
The common hypothesis of the following four lemmas is that (X, C, %, ?.::;)
a structure of qualitative conditional probability (Definition 8, Axioms
Axiom 7 is not used, and Axiom 8 is assumed only in Lemma 9.
1
=
!.
If A :J B, then A I C;::: B I C.
PROOF. By Corollaries 1 and 2 of Lemma 6, (A -B) 1 C?.::; 0
Also, B I C ,....., B I C. By Axiom 5, A I C ?.::; B I C.
1
C.
0
We collect together a number of properties of the set AI in one lemma:
LEMMA 8.
(i)
PR'OOFS
Preliminary Lemmas
I
Of course, in the representation, A ! C ~ -A C will yield P(A C)
Corollary 3 is the qualitative version of that result
(ii)
IS
1
Axiom 3 and the lemma.
PROOF.
(iii)
5.7.1
If A I B ?.::; C I D, then - C I D ?.::; -A I B.
C >BID, we can use Axiom I to derive
--A!C>-B\D;
this contradicts Lemma 6.
Q(C n D)/Q(D).
If Q' is any other function with the same properties, then Q'
229
LEMMA 6.
PROOF.
if A n B, C n DE 6 - .AI, then A I B ?.::; C i D iff
5.7
PROOFS
0 E%.
If A E %, then -A E tff - %.
If A
E
%, and A :J B, then BE %.
If A, BE%, then A u BE%.
PROOF.
(i)
0
IX
""' 0
I X.
(ii) If A, -A E .AI, then A i X ,....., .0 i X.~ -A I X. By Lemma 6,
-A I X~ X I X~ A I X, i.e., X I X~ 0 j X, contradicting Axiom 2.
5.
230
PROBABILITY REPRESENTATIONS
(iii) By Lemma 7, A I X,....., 0 I X, A:)~ imply 0 I X;?: B I X. By
Corollary l to Lemma 6, B I X~ 0 I X, orB ISm .AI.
.
·
(
...
)
(A
B) 1X
0 I X this with B I X -~ 0 I X, y1elds
(IV) By part m ,
,.....,
'
'
0
(A u B) I X~ 0 I X, by Axiom 5.
LEMMA 9. If A 1 B ~ 0 1 B and B:) A, then A E .AI. Conversely, if
Axiom 8 holds, BE If' -.A', and A E .AI, then A I B ~ 0 I B.
pROOF. Suppose that A I B
0 I B, B:) A, but A E ,g ~ .AI. By
Corollary 2 of Lemma 6, 0 I A ~ 0 I X. Since B E ,g -- .AI, Axwm 2 and
Corollary I of Lemma 6 yield 0 I X< B I X. We have
r-.J
0
I A <BI X,
A
IB,....., 0IB,
·om 6 0 B < 0 X contradicting Corollary 2 of Lemma 6.
wh ence, b y Ax l
,
'
B w a 1
For the converse, suppose that BE C --.AI and A I B > 0 I · e PP Y
Axiom 8 to A 1 B > 0 I X to obtain
1
'1
AIB~CIX>0IX.
Thus c E ,g -.AI. Since also BE C --.AI, we have B I X> 0 I C. By Axiom 4,
1 B .~ c 1 X. By Axiom 6, A n B I X > 0 I X, so A n B E C .AI,
A n
and by Lemma 8 (iii), A E If' - .AI.
0
satisfies Axioms
For the remaining lemmas, we assume that (X,
B
1-6 and Axiom 8.
· · diff
The next two lemmas have the content of Axiom 3 of positive-. erence
· · n 4 1)· if ab be are positive intervals, then ac IS stnctly
struct ures (Defin111 0
· ·
,
.. ·f A :) B
greater than either of them. Here, AB will be a positive mterval 1 _
and X 1 x > B 1 A. Thus, we want to establish that 1f A:) B :J C,, w1th AB,
BC positive, then C i A is strictly smaller than both B l A and C B.
1
LEMMA 10.
If A:)B:)C and BEC--.AI, then CjB;?:C!A, and>
holds unless either C or A ·- BE .AI.
A E 0 - JV. If C
PROOF. By Lemma 8
c IB
r--.o
0
IB
~ 0
E
.AI, then Lemma 9 gives
I A ,...._, C I A,
as required. So assume that C E C - .AI.
.
.
We have c 1 B ~ c 1 B and c 1 C;::: B 1 A, whence, Axwm 6 y1elds
C I B > C I A. Moreover,
holds unless C I C r--- B I A; but m that case,
Lemm'; 6 gives 0 1 ,...._,(A - B)\ A, and now, from Lemma 9 and
c
>
(A - B) A ,..._, 0 \ A, we have A - BE .AI.
J
5.7.
231
PROOFS
LEMMA 11. If A:) B:) C and A E If'-.#', then B I A<:; CIA, and>
holds unless either B or B -- C E %.
PROOF. If BE%, thenCE .AI and B I A ~ 0 I A ~ C I A, as required.
So assume that B E 0 - .AI.
We have A I A ;?: C I B, B I A ~ B I A, so Axiom 6 yields B I A <:; C I A,
with > unless A I A ,...._. C I B. ln the latter case, B - C E %.
0
The next lemma (Axiom 6', Section 5.6.3) corresponds to Axiom 4 of
positive differences, the weak monotonicity condition, with an additional
strict inequality proviso.
LEMMA 12. If A :JB:J C, A':JB':) C',B I A;?: B' I A', andCI B <:; C' I B',
then CIA ;?: C' I A'. Moreover, the conclusion is > unless either C E .AI or
both hypotheses are ,...._._
PROOF. Suppose that the hypotheses of the lemma hold, but that
C' I A';?: CIA. We must show that in fact C' I A'"" CIA and that either
C E .AI or both hypotheses are "'-'·
By Lemma 10, we have C' I B' ;?: C I A. By Axiom 8, choose B" with
A :J B" :J C and C' I B',...., B" I A. ff B" E %, then Lemma 9 yields that C,
C' E .AI and C' I A' ~ C I A ,_,_, 0 I X, so we are done. Assume B" E 0
%.
We show that C: B";?: B I A. Otherwise, B I A > C I B" and
C I B ;?: C' I B' ~ B" I A
yield, by Axiom 6, that C I A > C I A. But now, C I B" ;?: B' 1 A' and
B" I A,....._, C' I B' yield CIA ;?: C' I A', by Axiom 6. Thus, we have
CIA,...__ C' I A', and by the strict clause of Axiom 6, C I B" ~ B' 1 A' We
now have C I B" ~ B I A, and can now infer, by the strict clause of Axiom 6,
that B" I A~ C \ B (otherwise, C! A >CIA). Hence, ~ holds in both
hypotheses.
O
COROLLARY. Suppose that C, DE
Then A I C ;?: B I C iff A I D ;?: B I D.
rff -
.A"' and that D:) C :J A u B.
PROOF. If A I C;?: B I C, then since C I D ~ C i D, the lemma implies
AID;?:BID.
Conversely, if B I C >A I C, then the strict clause in the lemma yields
BJD>AJD.
0
This corollary is one of the main results to be used in the proof of
232
5.
PROBABILITY REPRESENTATIONS
5.7.
PROOFS
233
Theorem 7. It is used first to establish the Archimedean' axiom for a structure
of qualitative (unconditional) probability, in Section 5.7 .2, on the basis of
the Archimedean axiom of Definition 8. Subsequently, it is used to show
monotonicity of multiplication in the ordered local semiring. The main line
of development thus consists of Lemmas 6, 7, 8 (iii), 9, !0, and 12, with
corollaries.
One more lemma is required, that will be used to obtain the solvability
axiom for unconditional probability.
(in the sense of Definition 8) of <X ,ff A' >) Let
c. =
_
.
.' ' .. '.~ ·
,
A;
Ai-l . The C; are pmrw1se diSJOmt and
c1 --
A
1
d .- ·
an ,or 1 > I,
z'
Az,
U Ck,
=
k~I
2-i+I
Az,
Az'+' -
U
=
Ck .
k=2t-H
LEMMA 13.
If A I X;?::: B I X, then there exists
B' C A such
that
B' iX~BI X.
PROOF. By Lemma 6, -B i X;?::: -A i X. By Axiom 8, there exists
-B' ~-A such that -B I X~ --B' I X. We have B' C A and, by Lemma 6,
B'iX""'BIX.
0
5.7.2
By Lemma 8, each Ai
A~' B
iff
AjX ;?:B[X.
Then <X, ,ff,
is an Archimedean structure of qualitative probability for
which Axiom 5 of Section 5.2.3 holds.
PROOF
We verify the five axioms.
;?:::'is a weak order on
,ff,
since ;?::: is a weak order.
2. A;?:::' 0 by Corollary 1 of Lemma 6; X>-' 0 by Axiom 2 (if X~· 0,
then X E A', but X E If 3. This follows as a special case of Axiom 5 of Definition 8 (note th~t
the strict clause of that Axiom corresponds to the "if" clause in Axiom 3 of
Definition 4).
4. Suppose that A 1 ', A 2 ', ... is a standard sequence in (X, ,ff,
relative
to some A >' 0 (see Definition 3). We first construct another standard
sequence A 1 , A 2 , ... , with the properties that, for all i, A; ~A/ and
A, C Ai+1. To do this, let A 1 = A and assuming that A;_1 has been defined,
choose Ai by applying Axiom 8. We have A/>' A;_1 (use Axiom 5,
Definition 3, and A >'
hence, A/ I X > Ai-l i X. By Axiom 8, there
exists A,-:) Ai-l with A/ \ X.~ A; I X.
Next we show that the subsequence A 2 ,
= 0, 1, ... , is a standard sequence
E
0' -- A'. By the corollary of Lemma 12, and Axiom 4,
A 2 , I Azi+r ~ (A 2 ,+I - A 2 ,) I A 2 ,+I ~ -A 2~. 1 A 2i+l
An Additive Unconditional Representation
THEOREM 9. Suppose that <X, If, A',
is an Archimedean structure of
qualitative conditional probability (Definition 8) for which Axiom 8 holds. Let
on If be defined by
1.
By Definition 3 and Axiom 5 we have c ,. . _,' A for all · Th
b A ·
d ·
·
'
'
I.
US, y
XIOm 5
an mductwn, any two unions of m C.'s
· part1cu
· 1ar,
, are ,......, ' , a n d m
~
3 to Lemm. a 6 we have A 2 . A . ~A A
il · B A ·
'
3By Corollary
'
2•+1
1 .
2 , a
1.
y x1oms
5
• ;~ have Xi X~ Az 1_ Az > A1 I A2 (note that by Lemma 9,
2 I . 2 >. . I , • Thus,
IS mdeed a standard sequence (Definition 8)
and smce 1t 1s fimte, so must be the sequences {A;} and {A/}.
1
1
C au:
5. . Suppose that A n B = 0 A >' C and B >' D A 1 · L
,
,
~
. pp ymg emma l3
/,
, , we have C' C A with C' I X.- C 1 X. Similarly we have
D C B w1t,: D I X ~ D 1 X. Since A n B = 0 C' n D' 1'
C' D'
dE
,
- 0 a so, and so
'
' an
= A U B fulfill the condition.
O
t 0 A I X "'- C I X
,
COROLLARY. Suppose that the hypotheses of Theorem 9 hold Let ,g b
the set of ~' equivalence classes, excluding A'; and for A E ,g __·A' let
denote the equivalence class of A. Let ;t be the induced simple order' on If·
and if A,_B E tf/, with An B = 0, let A@ B beAu B. Let f1A be the set ~
(A, B) wzth A n B = 0 Then <,g > !A 12.> ·
·
c;
¢1
ts an Archzmedean positive
regular, ordered local semigroup (Definition 2.2).
'
'
I
0
0
,
,...,
,
. PROOF. . From Section 5.3.2 and Theorem 9 we know that ( 0', ;t, PA, @)
1~ an extensive structure With no essential maximum (Definition 3 2)· b t
smce > Js alread
· 1
d
· ' u
,...
. Y a Simp e or er, the argument in Section 3.5.3 shows that
the structure IS an ArchJmedean, positive, regular ordered local
group.
'
semi-
0
5.7,3 Theorem 7
(p. 224)
An
Archimedean structure of qualitative probability <.X, <o,
""
has a
.
umque representation p such that (X, ,ff, P) is a finitely additive probability
234
5.
PROBABILITY REPRESENTATIONS
space, JV is the class of events with probability zero, and A I B?::; C I D
P(A
n
B)/P(B);;;; P(C
n
iff
D)/P(D).
PROOF. We start by introducing a semiring operation into the semigroup
?:;, &J, €)) defined in the corollary to Theorem 9, in the preceding
subsection. Take A, BE C. We have A I X> 0 ! B (since %¢: tff and since
0
X ~ k) B by Corollary 2 to Lemma
By Axioms 8 and 4, there
exists C with B :::J C and A i X r-J C j B. We now define A* B = C.
To show that* is well defined, suppose that A I X,......, A' I X, B I X,......, B' I X,
C C B, C' C B', A 1 X r-; C i B, and A' i X
C' 1 B'. Applying Lemma 12 to
C C B C X, C' C B' C X, we have C i X~ C' I X; thus, C = C'. By Lemma 9,
A * B E Iff, since in the above definition, C I B > 0 I X ,..._, 0 I B.
The set /!ll* for which * is defined consists of all of Iff X tff (this is because
the product of two numbers less than or equal to 1 is again less than or
equal to 1; X is a natural identity for * ). This greatly simplifies the proof of
the other axioms of Definition 2.4.
To show that * is associative, choose A, B, C E Iff. Let A* B = C',
C C B; and let C' * C = D, DC C. Thus, (A* B) * C = D. Similarly,
choose A' = B * C, A' C C, and D' =A* A', D' C A', so that A * (B *C)=
D'. We have A • X~ C' i B and A! X~ D' I A', hence D' i A'~ C'! B.
Also A' 1 C ~ B i X. Applying Lemma 12 to D' C A' C C and C' C B C X,
we have D' j C ~ C' ! X. Since C' I X~ D I C, we have D' i C ~ D : C. By
the corollary to Lemma 12, this implies D' = D, as required.
Before establishing that the other axioms of Definition 2.4 are satisfied,
we note that * is commutative: if A ! X r--.o C I B, with C C B, and
B I X ,......., D I A, with D C A, then by Axiom 6 applied to D C A C X,
C C B C X, we have D l X~ C I X, or D = C.
We now show that * is right distributive over €). Left distributivity then
follows by commutativity of *· Let A n B = 0, A v B i X~ C' I C,
C' C C. Then (A €) B)* C~= C'. By Axiom 8, choose A' C C' such that
A I A v B.~ A' I C' (Note that A v B, C' E IffLet B' = C' -- A'.
By Lemma 6, B i A v B ,..._, B' I C'. Now apply Lemma 12 to A C Au B C X,
A' C C' C C, obtaining A 1 X~ A' I C; and similarly, by Lemma 12,
X
B' C. Hence, A* C =A', B * C = B'. By construction, A' n B' =
0, therefore,
1
I
N
B:
C'-.J
1
(A * C) $ (B
* C)
=
A' v B'
=C'
=(A@ B)* C.
Finally, we note that A ?:; B implies A* C ?: B * C follows immediately
from the corollarv of Lemma 12; left monotonicity follows by commutativity;
, be proved directly from Lemma 12.
also it could
5.7.
PROOFS
235
We now assert that the first three axioms of Definition 2.4 hold. Axiom 1
that< Iff, ?:;, .fllJ, @) is an ordered local semigroup, was proved in the previou~
sect1_on; Axwm 2, ~hat (C, ~, tff X C, *) is an ordered local semigroup,
~as JUst been estabhshed (the assertions about pairs in g x g are all trivial,
smce * IS defined everywhere); and Axiom 3, distributivity, was also just
proved. The final axiom asserts that for any A E C, there exists (B, q in /!ll
such that (A, B ® C) E C X Iff. This is trivial unless &J is empty. But fllJ can
be em~ty only if tff contains exactly one element, X; in which case, Theorem 7
Js tnv1ally true, letting P(A) ~~ I or 0 accordingly as A ""'' X or A ,...._,'
0
If _&J is nonempty, then (<ff', ~, &J, Iff x C, €), *) is an Archimede~n,
positive, regular, ordered local semiring. By Theorem 2.6, there exists a unique
fu~ct:on </> from <ff to Re+ which IS order preserving, additive, and multiphcatJVe. Define
P(A) =
l rp(A),
if A E <ff
JV,
if AE%.
lo,
We must show that P satisfies the conclusions of Theorem 7.
Suppose that A _i B?::; C I D. We want to show that P(A n B)/P(B)?
P(C n D)/P(D). Without loss of generality, suppose A n B, C n DE ,g _ JV
(smce C n D E% is trivial, and A n BE% implies C n D E .AI, using
Lemma
Choose E, FE C
,AI such that
E!X~A/B?::;CiD~FjX.
We have B * E =An B and F * D = C n D. Therefore since
multiplicative and order preserving and since E :=; F, we have '
</>(A n B)/<P(B)
=
</>
is
¢(E)
? ¢(F)
= </>(C n D)/,P(D).
>
The required inequality for P follows immediately. The strict case
Similarly leads to > in the P-inequality; so A 1 B?::; C 1 D if and only if
P(A
n
B)/P(B) ? P(C
n
D)/P(D).
We need only show now that (X, tff, P) is a finitely additive probability
space. S~nce X 1s a multiplicative identity, we have P(X) = 1; dearly p ;;;; O;
and addJtlVlty of P follows from additivity of rp on< C, &J, €)).
For umqueness, note that another representation P' would lead to another
representation 4>' of (Iff,?:;, &J, Iff x C, €), *), by letting rp'(A) = P'(A).
For example, to show that rp' is multiplicative, let A* B = C. We can choose
from the representation
C such that A I X~ C I B and C C B; and
P'(A) P'(B) = P'(C), implying ¢'(A* B) = </>'(A)rp'(B).
0
236
5.7.4
5. PROBABILITY REPRESENTATIONS
Theorem 8
(p. 22&)
If (X, rff, .AI,
satisfies Axioms 1-4, 5', 6', 7, and 8' (Section 5.6), then
there exists Q : ( rff - AI')__,. (0, I], such that (i) Q(X) = I and (ii) A I B ;?:: C ID
iffQ(A n B)/Q(B);? Q(C n D)/Q(D). This Q is unique up to Q __,. Q", o: > 0.
PROOF. We construct a positive-difference structure <rff -- .AI, ot*, ;?:: *)
(see Definition 4.1) as follows. Let ABE Ot* iff A, BE rff- .AI, A~ B, and
X I X>- B l A. For AB, CD E Ot*, let AB ;?:;*CD iff D I C;?:: B I A. We
verify the six axioms of Definition 4.1.
1.
Weak ordering is immediate.
2 & 3. If AB, BC E Ot*, then clearly, A, C E JJ - .AI and A~ C. We shall
show that B 1 A and C 1 Bare both
C j A; this will show that X I X
C I A,
so ACE Ot*, and also, that AB, BC < * A C.
By the strict clause of Axiom 6', B! B
C I B and B I A~ B I A yield
B iA
C I A. Similarly, C I B ~ C I Band B l B
B I A yield C I B
C I A.
(Note that Axioms 1 and 3 were used implicitly in the above proof.)
>-
>-
>-
>-
4.
>-
>-
5.7.
PROOFS
This completes the verification of the axioms for positive differences.
By Theorem 4.1, there exists a positive-valued function ,P on JJ -- .At such
that AB ;?:;*CD iff f(AB) ;;; f(CD), and for AB, BC E ot*, f(AC) =
if;(AB) + f(BC). We define a function Q as
Q(A) -- )1,
--- lexp[-!f(XA)],
>-
~
D
IC
r--<
D'
>-
i X,
1
1
For if XA
E
Ot*, then XB
E
1
ot*, so
Q(B)/Q(A) = exp[--f(XB)
l A.
Axiom 5' or 8', C' E JJ -.AI; by Axiom 5', D' E JJ- JV. Thus, AD',
C'B E Ot*, and AD' ,-...,* CD ~ C' B. To complete the proof we must show
that AC', and D'B E Ot*, i.e., that X I X>- both C' I A and B I D' If
X! X~ C' l A, then apply the strict clause of Axiom 6' to B I C'
B I A,
C' A
A j A, to obtain B ! A
B A, a contradiction. Similarly,
B I D' ~ B I B and D' i A > B ! A would yield B I A
B l A; hence,
X! X>Bl D'.
r--J
X ~A
E Ot*.
By definition of Ot*, XA E ot* excludes both X [ X,.__, A X and A E .Y.
Also, X I X,...__ A I X excludes A E .AI (otherwise, X 1 X~' A 1 X~ 0 1 X
yields X E.%). Conversely, if A¢ .AI and X X>- A X, then XA E ot*
Thus, Q is well defined on JJ - %.
Next we note that if ABE Ot*, then
=
1
XA
Q(B)/Q(A) = exp[- f(AB)].
5. Suppose that AB, CD E Ot*, with AB >-*CD. By Axiom 8', applied to
D i C
B! A, we have C' and D', with A~ C', D' and C', D' ~ B, such that
l C'
XI
if
if
We must show that Q is well defined and that its domain is ,fJ' that
is, the two conditions in its definition are mutually exclusive, and A E JJ- .At
iff one of them holds.
This follows immediately from Axiom 6'
B
237
>-
I
>-
6.
is a standard sequence, in the sense of Definition 4.1, if and only
if it is a standard sequence in the sense of Definition 8. (This is immediate
from the definitions.) Thus, any standard sequence in
- .AI, Ot*, <:; *) is
finite. 2
2 The reader may be puzzled by the fact that all standard sequences are finite. This is
because all standard sequences are ascending, hence, all are strictly bounded by XA, .
If we permitted descending standard sequences, A, , A,, ... , such that A, ::J A,+l and
A,+l 1 A, ~A, 1 A,_ 1 , there could very well be infinite standard sequences. But Jt would
still be true that any strictly bounded standard sequence is finite. By solvability, any
strictly bounded descending standard sequence could be converted into an ascending
standard sequence. This is really the reason why only one direction needs to be constdered.
+ f(XA)]
exp[-f(AB)].
Or if XA ¢ ot*, then X I X~ A I X, and by Axiom 6', B I X~ B fA
(otherwise, the strict clause of Axiom 6' yields B 1 X> B 1X). Consequently,
XB E Ot* and f(AB) = f(XB); the equation follows, since Q(A) = L
We can now show that (ii) of Theorem 8 is satisfied First if AB CD E ot*
then the fact that exp[- f(AB)] is a strictly decrea~ing f~nctio; of ,P(AB)
gives the required result (since f is order preserving on Ot*, and the ordering
<:;* on Ot* is the converse of the corresponding ;?:: ordering). This extends
easily to A, B, C, DE ,fJ' - •.¥with A ~ B, C ~ D, but AB or CD if Ol*. For if
AB ¢ Ct*, then X I X~ B I A; but this entails A I X~ B I X (If A I X'--- B; X
then B I A,..__, BIB and Axiom 6' yield B I X>- B I X; ~tc.) Hen,ce, Q(A) ~
Q(B). Similarly, if CD¢ Ol*, X I X~ D I C and Q( C) = Q(D). Thus, if both
are not in ot*, B I A~ D I C and Q(B)/Q(A) = 1 '= Q(D)/Q(C); if only
AB¢0t*, then BIA>-D!C and Q(B)/Q(A)=l>exp[-!f(CD)]=
Q(D)/Q( C). Finally, the representation extends to A, B, C, D arbitrary in
t!- .AI by using the representation for A n B, A, C n D, C, and employing
Axiom4.
To prove uniqueness, note that if Q' is another representation, then
~'(AB) = -log[Q'(B)/Q'(A)] gives another representation for (JJ- .At,
5.
238
Ot*, ;?::*),hence, by Theorem 4.1,
if/ =
PROBABILITY REPRESENTATIONS
1.
Q'(A) = Q'(A)/Q'(X)
if X I X
5.8
> A I X.
Since !•
=
exp[ -<f'(XA)]
=
exp[ -rx<f(XA)]
=
[Q(A)]~,
l, we have Q'
=
Q• on tff -- .AI.
INDEPENDENT EVENTS
239
relation on ,{]'_ Then j_ is an independence relation 3
a<f. Thus,
=
5.8.
j_ is
iff
symmetric.
2. For A E C, the set {B I BE tff and A j_ B} is a QM-algebra of sets on X
(Definition 5), i.e.,
0
(i)
A l_ X;
(ii)
if A j_ B, then A J. ·-B; and
(iii)
if A J. B, A
l_ C, and B
n C
=
0, then A
J. B u C.
It is easy to demonstrate that these axioms are necessary for the representation, e.g., for 2 (ii), if P(A n B) = P(A) P(B), then
INDEPENDENT EVENTS
P(A
Intimately connected with the concept of conditional probability is that
of (statistical) independence. Given a probability measure P, events A, Bare
independent if and only if P(A n B) = P(A) P(B) [or, for P(B) :# 0, if and
only if P(A B) = P(A)]. In qualitative terms, using the primitives of the
preceding sections, we can define A, B to- be ,_,-independent if and only if
either B is in .AI or A B ~A i X. (Or again, using *defined in Section 5.7,
A, Bare ~-independent when A * B = A f'l B.)
An interesting idea, introduced by Domotor (1969), is to accept independence of events as a primitive notion, in which case the primitives for a
theory of conditional probability consist of a set X, an algebra tff on X, an
ordering :<; * on C, and a new binary relation (interpreted as independence)
_l on C. Of course, Theorem 2 is then to be enriched by adding to the constraints on the representation the following one:
n -·B)
=
P(A) -- P(A
n B)
=
P(A) - P(A) P(B)
=
P(A)[J -
=
P(A) P(- B).
P(B)]
1
1
Al_B
iff
P(A
n
B) = P(A) P(B).
It will be noted that this property of the representation together with the
axiom A _l X (see Definition 9 below) implies P(X) = 1. Without such an
additional constraint, we can obtain an absolute scale of probability only by
the flat of normalization. From the standpoint of unconditional probability,
only ratios of probabilities are meaningful. This further constraint, based
on the new primitive 1_, yields a genuine absolute scale. Another way to
understand the absolute scale is from the standpoint of conditional probabilin Section 5.7.3, X is the natural multiplicative identity with respect to
*, forcing P(X) = 1.
We start by introducing the axioms of a pure independence structure
tff, 1_).
<X.
DEFINITION 9.
Suppose tff is an algebra of sets on X and _l is a binary
It is definitely not true that {B I A J. B} is an ordinary subalgebra of
there can be events A, B, and C for which A _l B and A _j_ C but not
A J. B n C and not A J. B u C. This corresponds to the fact that pairwise
independence of events, in the usual probabilistic sense, does not entail
independence of the entire set of three or more events. Only if A, B, C form
an independent family of sets is each event independent of the subalgebra
generated tw the other events. This motivates the definition of an abstract
independence relation for sets of three or more events, based on _l, using
subalgebras.
If .? is a subset of tff, then the smallest subalgebra containing Y is given
as the intersection of all subalgebras of tff that contain Y. (Note that C itself
is a subalgebra containing Y, so this intersection is defined and it contains Y·
and the intersection of arbitrarily many subalgebras is easily seen to be ~
subalgebra.) The smallest subalgebra containing A consists of { 0, A, -A,
Two subalgebras of tff, ~ and tff2 are said to be 1_-independent if and only
if A1 l_ Az whenever A 1 is in ~ and A 2 is in tff2 • Note that if A _l B, then
{,P, A, -A, X} and { 0, B, --B, XJ are J. -independent. This leads to the
generalization: A1 , ... ,Am are _j_-independent if and only if the smallest
subalgebras containing every proper subset of the Ai and its complement
are 1_-independent. Formally, we have:
DEFINITION 10.
Let tff be an algebra of sets and J. an independence
'This concept is different from that of a relation being independent implicitly defined
in Definition 1.3 of Section 1.3.2.
,
5.
240
PROBABILITY REPRESENTATIONS
5.9. PROOF OF THEOREM 10
241
relation on C. For m ~ 2, A 1 , •.. ,Am E 0 are j_-independent iff, for every
proper subset M of {I, ... , m}, every B in the smallest subalgebra containing
{Ai I i E M}, and every C in the smallest subalgebra containing {Ai I i ¢' M}, we
have B j_ C.
DEFINITION 12. Suppose (X, 0, ;(:;, j_) is a structure of qualitative
probability with independence. Let .AI = {A 1 A E 0, A ,.....,
If A, C E 0 and
B, D E 0 - .AI, define
As we noted in motivating this definition, if m = 2, we have A and B are
j_ -independent if and only if A j_ B.
We can now define what we mean by a structure of qualitative probability
with independence. We take a structure of qualitative probability (X, 0,
an independence relation j_ on 0; and an axiom interlocking;(:; and j_. The
interlocking axiom is very natural, given the representation: probabilities of
independent events multiply, and multiplication is order preserving, therefore,
intersections of independent events preserve order. [n addition, we include
in the following definition a very strong structural condition that uses the
defined notion of _L -independence of sets of events.
iff there exist A', B', C', D' E 0, with
DEFINITION 11. Suppose that X is a nonempty set, 0 is an algebra of sets
on X, and ;(:; and j_ are binary relations on rff. The quadruple (X, 0, ;(:;, .l)
is a structure of qualitative probability with independence
rhe following
axioms hold:
l.
2.
(X, 0,
is a structure of qualitative probability (Definition 4).
is an independence relation (Definition
3. Suppose that A, B, C, DE 0, A 1.. B, and C j_ D. If A ;(:; C and B ;(:; D,
then A n B ;(:; C n D; moreover, if A > C, B ;(:; D, and B > 0, then
AnB
> CnD.
The structure (X, 0,
holds:
j_)
is complete
iff
the following additional axiom
4. For any A 1 , ... , Am , A E 0, there exists A' E 0 with A' ,___,A and
A' j_ Ai, i = l, ... , m. Moreover,
, ... ,Am are j_-independent, then A' can
be chosen so that A 1 , ... ,Am, A' are also
In a complete structure of qualitative probability with independence, we
can introduce a defined relation ;(:;' on pairs A I B and C! D. Clearly, if
A C B, C CD, A .l D, and C j_ B, we will want to assert that A B ;(:;' C l D
if and only if P(A) P(D) ?; P(B) P(C), or, in qualitative terms, when
A n D ;(:; C n B. If the requisite independence relations for A, D and C, B
do not hold, then using completeness (Axiom 4), we may replace these sets
by the equi.a!ent ones for which independence does hold. The theorem we
obtain is that, with suitable definitions of .A' and
, a complete structure of
qualitative probability with independence gives rise to a structure of qualitative conditional probability.
1
A!B;(:;'CID
A' '"-' A
B' ~ B,
n B,
A'
_L D'
C' ,...., C
and
n D,
D' ,...., D;
C' j_ B';
and
A' n D' ;(:; C' n B'.
TH~O~EM 10. .~uppose that (X, 0, ;(:;, j_) is a complete structure of
qualzt~~we probabzlzty with independence and that .AI and ;(:;' are given by
Defimtwn 12. Then (X, 0, .AI, ;(:;') is a structure of qualitative conditional
probability (Definition 8).
If we now make the additional assumptions that the Archimedean and
solvability axioms of Section 5.6 are satisfied by (X, 0, .AI, ;(:;'), then
Theorem I 0 and these assumptions allow us to deduce the representation p
of Theorem 7. This gives the desired representation for j_. For if A 1.. B,
with B not in .Y, then Definition 12 gives A 1 B ,..._,' A 1 X, hence,
P(A n B)/ P(B)
=
P(A).
(If
B are bovth ~ 13_, then A n B ,.,.._, 0, so the desired result is trivial.)
Of course, the Arch1medean and solvability axioms, stated in terms of >'
can be translated into terms of <::; and j_. The resulting axioms are not .;;;
natural. A simple axiomatization directly in terms of > and j_ would be
preferable.
~
5.9 PROOF OF THEOREM 10
A com?!ete structure of conditional probability with independence induces,
by Defimtwn 12, a structure of qualitative conditional probability.
PROOF. We remark first that by Axiom 3 of Definition 11 the definition
of;(:;' is independent of the particular choice of A', B', C', D' Definition 12.
For if A'~ A", B' ~ B", C' ~ C", and D' ~ D", with A' j_ D', A" ..L D"
C' 1.. B', and C" j_ B", then A' n D' ~A" n D" and C' n B',...., C" n B".'
By completeness (Axiom 4), ;(:;'is a connected relation; for let A' = A n B
C' = C n D, and use the axiom twice with m = 1 to obtain D' ,__, D witl~
A' 1.. D' and B',....., B with C' J... B'; then compare A' n D' with C' n B'.
in'
5. PROBABILITY REPRESENTATIONS
242
To show that;?:' is transitive, use completeness successively with m = 1, 2,
3, 4, and 5 to obtain A' = A n B, B' r--oB, C' '""'"' C n D, D' ~ D,
E' ~ E n F, and F' ~ F, such that A', B', C', D', E', F' are _L -independent.
By the remark above on the definition of;?:', we infer, from A I B ;?:' C I D
and C I D
ElF, that A' n D';?: C' n B' and C' n F';?: D' n E'. By the
definition of j_ -independence, we have
A' n D' j_ C' nF',
243
EXERCISES
Finally, suppose that A ~ B ~ C, D ~ E ~ F, B I A ;?: F i E and
C i B ;?:' E i D. Choose A', B', C', D', E', F', respectively ~A, B, C, D, E, F
and _[_-independent (completeness with m = 1, 2, 3, 4, 5). We have
B' n E' ;?: F' n A' and C' n D' ;?: E' n B' (with strict;?:' going to strict;?:).
By
of ;?:, C' n D' .?: F' n A'
strict inequality, if there is a
strict antecedent), hence C I A ;?: (or >. as required) F I D.
0
C' n B' .L D' n E'.
EXERCISES
Therefore, Axiom 3 yields
A' n F' n C' n D' ;?: E' n B' n C' n D'.
>
If not A I B ;?:' E IF, then E' n B' >A' n F'. If C' n D'
0, then the
A' n F' n C' n D'
strict clause of Axiom 3 yields E' n B' n C' n D'
(since _[_-independence gives E' n B' j_ C' n D' and A' n F' j_ C' n D').
This contradiction shows that A I B ;?: E I F as required. On the other hand,
if C' n D' ~ 0, then the strict clause of Axiom 3, applied to C' j_ D',
0 l 0, and D' > 0 (D' E /f yields C' ,....., 0 ; thence,
>
C' n F' ,...._ 0 ;::: D' n E';
and since D' > 0, the same arguments give E' ,....., 0 and E' n B' ~ 0 .
Thus, in this case also, A' n F';?: E' n B', as required.
The second axiom of Definition 8, that X E Iff - A and A E A'" iff
A I X,-../ 0 I
is immediate; obviously, A I X ~' B I X iff A ~ B. Similarly,
the third axiom of Definition 8, that X I X ;?:' A I B and Xi X,....,' A I A, is
immediate; and the fourth axiom, A i B ,.._,' A n B I B, follows from the
definition of ;?:'. Thus, only the additivity and strong 6-tuple conditions need
to be established.
Suppose that A ! C ;?:' D I F and B j C ;?:' E I F, where A n B = 0 =
D n E. Without loss of generality, assume A u B C C, D u E C F. By
D, E and choose F' ""F with
completeness, choose C' ~ C with C'
F' _LA, R By definition of
we have
A nF';?: Dn C'
B
n
F' ;?: En C'.
1. For each of the following three families Iff of sets, either show that Iff is
an algebra of sets (Definition 1) or show a violation of one of the axioms.
(i) Let X be a nonempty set and A a nonempty subset of X. Then /f
consists of all subsets of X that either include A or are disjoint from A.
Let X be a nonempty set and A a nonempty subset of X. Then /f
consists of 0 and all subsets of X that intersect both A and -A.
(iii) Let iF and tfJ be algebras of sets on X and Y, respectively. Then t%'
consists of those subsets of Xu Y of the form A u B where A is in SF and
B is in tfJ.
(5.1)
2.
Suppose that (X, Iff, P) is a finitely additive probability space (Definition
Prove that, for all A, BE Iff
P(A u B)=
+ P(B) --
n B).
1)
3. Prove the independence of the axioms of qualitative probability
(Definition 4).
(5.2.!)
4. In a structure of qualitative probability (Definition 4), either prove that
the relation ~* (defined on p. 206 in connection with the definition of tight)
is an equivalence relation or
a counterexample.
5. Suppose that (X, Iff, ;?: ) is a structure of
(Definition 4) and that A, B, C, D E 6 are such that A u C
Prove that if A;?: Band C;?: D, then An C;?: B n D.
=
probability
B u D = X.
(5.2.1, 5.3.1)
6. Suppose that (X, Iff, <:) is a structure of qualitative probability
(Definition 4) and that A = {A I A E Iff and A ~ ,P}. Prove that A has the
following properties:
By Lemma 2, which uses only qualitative probability,
(Au B) n F';?: (DuE) n C',
and if either antecedent inequality is strict, so is the conclusion. By the disjoint
union property of _L [Axiom 2 (iii) of Definition 9] we have C' _L D u E
and F' j_ A u B. Hence, A u B i C ;?:' D u E I F, with
holding if either
antecedent is strict.
(i)
If A
(iii)
1f A
>'
E
A, BE 6, and B C A, then BE A'.
If A, BE A, then A u B, A n BE A.
E
A', then -A
E
Iff-%.
(5.2.!, 5.3.1)
5.
244
7. Let X = {a, b, c, d, e} and C
of C such that:
(i)
=
PROBABILITY REPRESENTATIONS
2x. Suppose that ;?::; is a weak ordering
Chapter 6 Additive Conjoint Measurement
{a}~
(iii)
{a, b} ,....., {c, d,
(iv)
{a, e} ~ {c, d}.
Find a probability representation consistent with ;?::;. Is it unique?
(5.4. 3, 5.5.3)
8. Does the probability space in Exercise 7 satisfy the hypotheses of
Theorem 6? Can you define general circumstances in which a finite structure
of qualitative probability has a unique representation?
(5.4.3, 9.2.2).
9. Suppose that if there exists an A withO < P(A) < 1, then (X, <ff, P) is a
finitely additive probability space (Definition 2) with the following property: for every A in ,ff with 0 < P(A) < 1, there exists a B in <ff with 0 <
P(B) < 1 that is independent of A (p. 238). Prove that this space has no
atoms (p. 216).
(5.1, 5.4.2, 5.8)
In the following three exercises we suppose that <X, C, JV, ;?::;) is a structure
of qualitative conditional probability (Definition 8, Section 5.6.1 ). In the proofs
you may use the axioms of Definition 8 and the lemmas of Section 5.7.1, but
not Theorem 7.
C, then A I B <: A I C.
(5.6.1, 5.7.1)
11. If A, A', B, B' E C, C, DEC- JV, A i C;?::; A'! D, and B I C :::S B' I D,
then (A- B)l C;?::; (A'- B')l D.
(5.6.1, 5.7.1)
10.
If A E<ff, B, C e<ff
and B
12. Suppose that A, B, C E C and that C n B, C - BE C -- JV. Prove
that A I C ,...._.A! C n B iff A I C ~A I (C -- B). Thus, letting C =X, if A
is independent of B (see Section
then A l B ,...._, A i -B.
(5.6.1, 5.7.1, 5.8)
13. Prove Exercises 10-12 using the representation of Theorem 7.
(5.6.2)
0 6.1
SEVERAL NOTIONS OF INDEPENDENCE
Not all attributes that we wish to measure have an internal, additive
structure of the sort we studied in Chapters 3 and 5 (temperature and
density are just two examples), and when no extensive concatenation is
apparent we must use some other structure in measuring the attribute. One
of the most common alternative structures is for the underlying entities to be
composed of two or more components, each of which affects the attribute in
question. A simple example is the attribute momentum which is exhibited by
physical objects and which is affected both by their mass and by their velocity.
Another example, discussed in Section 1.3.2, is the judged comfort of various
humidity and temperature combinations.
The objects in these examples cannot readily be concatenated, nevertheless,
they can be treated as composite entities. The rest of this volume, except for
the last chapter, concerns theories leading to construction of measurement
scales for composite objects which preserve their observed order with respect
to the relevant attribute (e.g., preference, comfort, momentum) and where
the scale value of each object is a function of the scale values of its components. Since such theories lead to simultaneous measurement of the objects
and their components, they are called conjoint measurement theories. The
present chapter deals with the additive case; more complicated rules of
combination are studied in Chapters 7, 8, and 9.
The first part of this chapter (Sections 6.1-6.3) deals with the two-component case. Empirical examples from physics and psychology are presented
245