REMARKS ON FOUNDATIONS OF PROBABILITY

SEMANTICAL INTERPRETATION OF THE PROBABILITY OF FORMULAS

By JERZY ŁOŚ

Every Boolean algebra may be isomorphically represented as an algebra of sets, therefore every probability function on a Boolean algebra may be represented as a probability on some algebra of sets. This follows from the representation theory due to M. Stone. If one considers the probabilities defined for formulas of a given formalism, which as is well known form a Boolean algebra, then such a representation seems to be insufficient, because it has no semantical meaning.

The following paper contains a representation of probability functions defined on formulas by means of semantical notions. The starting point of this representation is in the two different interpretations of probability of formulas, according to the syntactical structure of the formula. If the formula is an open one, i.e. no quantifiers occur in it, then by its probability we may understand the chance of finding in a given model such elements as fulfil this formula. If the formula is a sentence, i.e. no free variables occur in it, then by its probability we may understand the chance of finding a model (from among models of a given class) in which this sentence is true.

The main result obtained in this paper (Theorem 2) is the demonstration that, under very general conditions (which are always satisfied by elementary theories), the probability of sentences may be understood as a result of two consecutive drawings. First a model is chosen at random from among the models of a given class $\{\mathfrak{M}_t\}_{t \in T}$, following a given probability $\mu$ in the set $T$; and then, having the model $\mathfrak{M}_t$ so obtained, a sequence of elements is selected from it, following a probability $\nu_t$. The probability $p(\alpha)$ of the formula $\alpha$ is equal to the probability of fulfilling $\alpha$ by a sequence of elements obtained in this way.
More precisely, this theorem states that we can represent the probability of formulas $p(\alpha)$ in the form

$$p(\alpha) = \int_T \nu_t(\mathfrak{M}_t(\alpha))\, d\mu(t),$$

where $\{\mathfrak{M}_t\}_{t \in T}$ is a class of models, $\mu$ a probability in $T$, $\nu_t$ a probability in the set of sequences of elements of the model $\mathfrak{M}_t$, and $\mathfrak{M}_t(\alpha)$ denotes the set of sequences which fulfil $\alpha$ in the model $\mathfrak{M}_t$.

It is not the aim of this paper to explain the importance of research on the probability of formulas. We would like to remark only that the theory of information yields more and more interest in such investigations.

1. Formulas, Boolean algebras and models

We shall here be interested in those formal theories in which only relational constants occur. Variables of higher orders (e.g. running over sets or classes of sets) may be allowed, but in such cases we shall assume that in the formulas each such variable is bound by a quantifier. Only individual variables may be free in these formulas. The set of all formulas is denoted by $S$.

Let us denote the relational constants by $r_i$ and the individual variables occurring in formulas in $S$ by $x_j$. We shall assume that $i$ runs over a set of natural numbers and $j$ over the set of all natural numbers. This implies that the set $S$ is countable.

By a sentence is meant a formula in which no free variable occurs. The set of sentences will be denoted by $Z$; it is a subset of $S$.

Let $Cn$ be the operation of the usual (syntactical) consequence in the set $S$. A subset $X$ of $S$ with $X = Cn(X)$ is a system, a subset $X$ with $Cn(X) \neq S$ is a consistent set, and a maximal consistent system is a complete system.

Let us remember the following properties of the set $Z$:

(i) The set $Z$ is closed with respect to the propositional operations such as implication ($\to$), disjunction ($\vee$), conjunction ($\wedge$), equivalence ($\leftrightarrow$) and negation ($\sim$).

(ii) For every system $X$, $Cn(X \cap Z) = X$.

As is well known, the set of formulas forms a Boolean algebra with disjunction, conjunction and negation as Boolean operations.
Instead of equality we have to accept in $S$ the equivalence with respect to the set $Cn(\emptyset)$, i.e. two formulas $\alpha$ and $\beta$ are considered to be equal if $\alpha \leftrightarrow \beta$ belongs to $Cn(\emptyset)$. In this algebra $\alpha \cap \beta = 0$ means that $\sim(\alpha \wedge \beta)$ belongs to $Cn(\emptyset)$. It follows from (i) that $Z$ is a subalgebra of $S$, and so it is called the algebra of sentences.

By a model for $S$ we shall mean every sequence $\mathfrak{M} = \langle A, R_1, R_2, \ldots \rangle$, where $A$ is a non-empty set and the $R_i$ are relations in $A$ such that it is possible to interpret the constants $r_i$ by them. It is clear what is meant by the statement that the sequence $\langle a_1, a_2, \ldots \rangle$ of elements of $A$ fulfils the formula $\alpha$. Let us remark only that the individual variables have natural numbers as indices and therefore we use here an infinite sequence of elements of $A$. Moreover, the variables of higher orders (if they occur in formulas in $S$) have to be interpreted in an absolute sense, e.g. a set variable runs over all subsets of $A$.

Let us denote by $\bar{A}$ the set of all sequences $\langle a_1, a_2, \ldots \rangle$ of elements of $A$ and, for a given $\alpha$ in $S$, by $\mathfrak{M}(\alpha)$ the set of all sequences in $\bar{A}$ which fulfil $\alpha$. The mapping $\alpha \mapsto \mathfrak{M}(\alpha)$ is a homomorphism of $S$ into an algebra of subsets of $\bar{A}$. By $S(\mathfrak{M})$ we denote the set of all formulas "true" in $\mathfrak{M}$, i.e. the set of $\alpha$'s with $\mathfrak{M}(\alpha) = \bar{A}$.

If for a system $X$ there exists a model $\mathfrak{M}$ such that $X$ is contained in $S(\mathfrak{M})$, then $X$ is called an $\omega$-consistent system, and in other cases an $\omega$-inconsistent system. If for some class of models $\{\mathfrak{M}_t\}_{t \in T}$ we have $X = \bigcap_{t \in T} S(\mathfrak{M}_t)$, then $X$ is called $\omega$-regular. If for every complete system $Y$ including the given system $X$ there exists a model $\mathfrak{M}$ with $S(\mathfrak{M}) = Y$, then $X$ is called strongly $\omega$-regular.

If $S$ is an elementary class, i.e. no variables of higher orders occur in the formulas of $S$, then every system in $S$ is strongly $\omega$-regular.

2.
Probability functions

By a probability function, or simply a probability, is meant a real function $p$ defined on a Boolean algebra $B$ such that $0 \le p(a) \le 1$ and $p(a \cup b) = p(a) + p(b)$ for every pair $a, b$ of elements of $B$ with $a \cap b = 0$. If $B$ is a $\sigma$-algebra and for every sequence of pairwise disjoint elements $a_i$ we have $p\left(\bigcup_{i=1}^{\infty} a_i\right) = \sum_{i=1}^{\infty} p(a_i)$, then $p$ is called $\sigma$-additive.

Let $B^*$ be a $\sigma$-algebra and $B$ a subalgebra of $B^*$ such that $B^*$ is the least $\sigma$-algebra containing $B$. If $p$ is a probability on $B$ and $p(a_1 \cap a_2 \cap \ldots \cap a_n) \to 0$ whenever $\bigcap_{i=1}^{\infty} a_i = 0$, then $p$ may be extended to a $\sigma$-additive probability function on $B^*$. This result is very well known as Kolmogoroff's theorem.

For the Boolean algebra of formulas and its subalgebras the definition of the probability function is obviously applicable. The only restriction which follows from the definition of such algebras is that for two formulas $\alpha$ and $\beta$ equivalent with respect to the set $Cn(\emptyset)$ we must have $p(\alpha) = p(\beta)$.

A probability function $p$ on the algebra of formulas $S$ is called continuous if the following two conditions hold:

There exists an $\omega$-regular system $X_0$ such that $p(\alpha) = 1$ for every $\alpha$ in $X_0$.

If for some $\varepsilon > 0$ and some system $X$, $p(\zeta) \ge \varepsilon$ for every sentence $\zeta$ belonging to $X$, then $X$ is an $\omega$-consistent system.

3. Semantical introduction of probability

Let $\mathfrak{M}$ be a model and $A$ its set of elements. By $\bar{A}$ we denote, as before, the set of sequences of elements from $A$. A probability function $\nu$ defined for an algebra of subsets of $\bar{A}$ is called a probability in the model $\mathfrak{M}$ if $\nu$ is defined for all sets $\mathfrak{M}(\alpha)$ with $\alpha$ in $S$.

Suppose that $\{\mathfrak{M}_t\}_{t \in T}$ is a family of models and that for every $t$ in $T$, $\nu_t$ is a probability in $\mathfrak{M}_t$. Suppose, moreover, that there is a $\sigma$-additive probability function $\mu$ on a $\sigma$-algebra of subsets of $T$ such that all functions $f_\alpha(t) = \nu_t(\mathfrak{M}_t(\alpha))$, with $\alpha$ in $S$, are $\mu$-measurable.

THEOREM 1. Under the above assumptions, the function

$$p(\alpha) = \int_T f_\alpha(t)\, d\mu(t)$$

is a probability on $S$.

Proof.
The inequality $0 \le p(\alpha) \le 1$ follows from the assumption that $\nu_t$ and $\mu$ are probability functions. If $\alpha \cap \beta = 0$, then $\sim(\alpha \wedge \beta)$ belongs to $Cn(\emptyset)$. Therefore $\mathfrak{M}_t(\alpha) \cap \mathfrak{M}_t(\beta) = \emptyset$ for every $t$. Since $\mathfrak{M}_t$ is a homomorphism, we obtain:

$$\begin{aligned}
p(\alpha \cup \beta) &= \int_T f_{\alpha \vee \beta}(t)\, d\mu(t) = \int_T \nu_t(\mathfrak{M}_t(\alpha \vee \beta))\, d\mu(t) \\
&= \int_T \nu_t(\mathfrak{M}_t(\alpha) \cup \mathfrak{M}_t(\beta))\, d\mu(t) = \int_T \left[\nu_t(\mathfrak{M}_t(\alpha)) + \nu_t(\mathfrak{M}_t(\beta))\right] d\mu(t) \\
&= \int_T \nu_t(\mathfrak{M}_t(\alpha))\, d\mu(t) + \int_T \nu_t(\mathfrak{M}_t(\beta))\, d\mu(t) \\
&= \int_T f_\alpha(t)\, d\mu(t) + \int_T f_\beta(t)\, d\mu(t) = p(\alpha) + p(\beta).
\end{aligned}$$

This theorem shows the possibility of introducing probability functions in the algebra $S$ by means of the probability functions on models and a probability function in the set of models. In the following section we shall prove that every continuous probability function on $S$ may be represented in the form of an integral as in Theorem 1.

4. Representation theorem

Let $p$ be a continuous probability on $S$, $X_0$ an $\omega$-regular system with $p(\zeta) = 1$ for all sentences $\zeta$ in $X_0$, and $\{\mathfrak{M}_t\}_{t \in T}$ a class of models such that $X_0 = \bigcap_{t \in T} S(\mathfrak{M}_t)$ and, moreover, such that for every system $X$ containing $X_0$ with $p(\zeta) \ge \varepsilon$ for every sentence $\zeta$ in $X$ and a suitable $\varepsilon > 0$ depending on $X$, there exists a $t$ in $T$ such that $X \subset S(\mathfrak{M}_t)$. The existence of such an $X_0$ and $\{\mathfrak{M}_t\}_{t \in T}$ follows from the assumption on continuity of $p$.

For $\zeta$ in $Z$, let $T(\zeta)$ denote the set of $t$ in $T$ with $\zeta$ belonging to $S(\mathfrak{M}_t)$. The mapping $\zeta \mapsto T(\zeta)$ maps the algebra of sentences $Z$ homomorphically into the algebra of subsets of $T$. Let us set $\mu(T(\zeta)) = p(\zeta)$. This defines a probability function $\mu$ on the algebra of subsets of $T$ of the form $T(\zeta)$. We shall prove that $\mu$ may be extended to a $\sigma$-additive probability function.

Let $\mu(T(\zeta_1 \wedge \ldots \wedge \zeta_n)) \ge \varepsilon > 0$ for every $n$. Let us form the system $Cn(X_0 \cup \{\zeta_1, \zeta_2, \ldots\}) = X_1$. Obviously for every $\zeta$ in $X_1$, $\mu(T(\zeta)) = p(\zeta) \ge \varepsilon$ and, therefore, there exists a $t_1$ in $T$ such that $X_1 \subset S(\mathfrak{M}_{t_1})$. It follows that $t_1$ belongs to $T(\zeta_n)$ for every $n$. Finally, $\bigcap_{n=1}^{\infty} T(\zeta_n) \neq \emptyset$.

Let us, applying Kolmogoroff's theorem, extend $\mu$ to a $\sigma$-additive probability, denoting this extension by the same letter $\mu$.
Now, for every $\alpha$ in $S$, let $\mu_\alpha$ be a $\sigma$-additive probability function such that $\mu_\alpha(T(\zeta)) = p(\zeta \wedge \alpha)$ for every $\zeta$ in $Z$. Such a probability does exist, because the equation above defines a probability on the sets of the form $T(\zeta)$, which may be extended (by the same argument as for $\mu$) to a $\sigma$-additive probability. Every $\mu_\alpha$ is obviously absolutely continuous with respect to $\mu$, because $\mu_\alpha(T(\zeta)) \le \mu(T(\zeta))$ for every $\alpha$ in $S$ and $\zeta$ in $Z$. From the well-known Radon-Nikodym theorem it follows that for every $\alpha$ in $S$ there exists a function $f_\alpha(t)$ such that

$$\mu_\alpha(T(\zeta)) = \int_{T(\zeta)} f_\alpha(t)\, d\mu(t).$$

Suppose $\alpha$ and $\beta$ are disjoint. Since

$$\mu_\alpha(T(\zeta)) + \mu_\beta(T(\zeta)) = p(\zeta \wedge \alpha) + p(\zeta \wedge \beta) = p(\zeta \wedge (\alpha \vee \beta)) = \mu_{\alpha \vee \beta}(T(\zeta)),$$

we have

$$\int_{T(\zeta)} f_\alpha(t)\, d\mu(t) + \int_{T(\zeta)} f_\beta(t)\, d\mu(t) = \int_{T(\zeta)} \left[f_\alpha(t) + f_\beta(t)\right] d\mu(t) = \int_{T(\zeta)} f_{\alpha \vee \beta}(t)\, d\mu(t).$$

This shows that $f_\alpha + f_\beta = f_{\alpha \vee \beta}$ almost everywhere in the sense of $\mu$. As we have only countably many $\alpha$'s and $\beta$'s in $S$, we can improve the functions $f_\alpha$ so that the equations $f_\alpha + f_\beta = f_{\alpha \vee \beta}$ hold everywhere in $T$ for disjoint $\alpha$'s and $\beta$'s.

Now, if we set for every $t$ in $T$ and $\alpha$ in $S$

$$\nu_t(\mathfrak{M}_t(\alpha)) = f_\alpha(t),$$

we obtain a probability function on the subsets $\mathfrak{M}_t(\alpha)$ of $\bar{A}_t$ such that

$$p(\zeta \wedge \alpha) = \int_{T(\zeta)} \nu_t(\mathfrak{M}_t(\alpha))\, d\mu(t).$$

For $\zeta$ in $X_0$ we have $p(\zeta \wedge \alpha) = p(\alpha)$ and $T(\zeta) = T$, and therefore

$$p(\alpha) = \int_T \nu_t(\mathfrak{M}_t(\alpha))\, d\mu(t).$$

We have proved the following representation theorem:

THEOREM 2. Every continuous probability on the algebra of formulas $S$ may be represented in the form of an integral

$$p(\alpha) = \int_T \nu_t(\mathfrak{M}_t(\alpha))\, d\mu(t),$$

where $\mathfrak{M}_t$ are suitable models, $\nu_t$ probabilities in these models, and $\mu$ a $\sigma$-additive probability function in the set of models.
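
The integral formula may be easier to follow in a finite case, where the integral reduces to a weighted sum. The Python sketch below is only an illustration under assumptions that are not taken from the paper: a hypothetical family of two finite models with one binary relation, formulas that mention only the variables $x_1$ and $x_2$ (so pairs stand in for infinite sequences), and a uniform choice of each $\nu_t$. It checks the additivity used in the proof of Theorem 1 and shows that for a sentence the value of $p$ is just the $\mu$-measure of the set of models in which the sentence is true.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy family {M_t}, t in T = {"t1", "t2"} (an assumption for
# illustration): each model has a finite domain A and one binary relation r.
MODELS = {
    "t1": {"A": [0, 1], "r": {(0, 1)}},
    "t2": {"A": [0, 1, 2], "r": {(0, 1), (1, 2), (0, 2)}},
}
# Assumed probability mu on the index set T.
MU = {"t1": Fraction(1, 3), "t2": Fraction(2, 3)}


def fulfilling(model, formula):
    """Pairs (a1, a2) of domain elements that fulfil the formula in the model.

    Pairs replace infinite sequences because the formulas below mention
    only the variables x1 and x2.
    """
    return {s for s in product(model["A"], repeat=2) if formula(model, s)}


def nu(model, subset):
    """Uniform probability on pairs: an assumed choice of nu_t."""
    return Fraction(len(subset), len(model["A"]) ** 2)


def p(formula):
    """Finite form of the integral: sum over t of mu(t) * nu_t(M_t(formula))."""
    return sum(MU[t] * nu(m, fulfilling(m, formula)) for t, m in MODELS.items())


def alpha(m, s):
    """Open formula r(x1, x2)."""
    return (s[0], s[1]) in m["r"]


def beta(m, s):
    """Open formula ~r(x1, x2) & r(x2, x1); logically disjoint from alpha."""
    return (s[0], s[1]) not in m["r"] and (s[1], s[0]) in m["r"]


def alpha_or_beta(m, s):
    """Disjunction of the two disjoint formulas."""
    return alpha(m, s) or beta(m, s)


def zeta(m, s):
    """Sentence: there exist x, y, z with r(x, y) and r(y, z); ignores s."""
    return any(
        (x, y) in m["r"] and (y, z) in m["r"]
        for x, y, z in product(m["A"], repeat=3)
    )


if __name__ == "__main__":
    print(p(alpha), p(beta), p(alpha_or_beta))  # 11/36 11/36 11/18
    print(p(zeta))  # 2/3, the mu-weight of t2, the only model where zeta holds
```

Exact rational arithmetic is used so that the additivity $p(\alpha \vee \beta) = p(\alpha) + p(\beta)$ can be compared for equality rather than up to rounding.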