PROBABILITY AS A NORMALIZED MEASURE
A. BEJAN†
Abstract. The axiomatic foundations of probability theory, based
on the notions of a σ-algebra (σ-field) and of normalized measures on
this structure, not only give a formal basis for the development of
the theory, but also eliminate some contradictions which
might otherwise lead to catastrophic consequences.
“Probability is a degree of certainty, and differs from
certainty as a part from a whole” J. Bernoulli, Ars Conjectandi (1713)
1. Elementary Probability Theory
1.1. Space of elementary events. Events. Operations in the
class of events. Ω is the space of elementary events (the sample space). It contains all possible outcomes of the experiment. The result of the experiment, however, is just one of these outcomes. The elements of Ω are
also called sample points.
Any subset of Ω as a set of some possible outcomes can be naturally
called an event: if A ⊆ Ω, then A is an event. We say that event A
has occurred if the outcome of the experiment is contained in A. ∅ is
the impossible event, Ω is the certain event.
Events are sets. For any given events, their unions, intersections,
complements, differences and symmetric differences are, again, events:
A ∪ B, A ∩ B, A\B, Ā = Aᶜ = Ω\A, A △ B.
[Margin note: As we shall see later, this is not actually true: not
every subset of Ω may be regarded as an event! That is why we need
that σ-zoo!]
Events A and B are said to be inconsistent if A ∩ B = ∅.
If A ⊆ B ⊆ Ω, then B occurs whenever A occurs. One says that B
follows from A.
Date: February 2006, HWU, young researchers seminar meeting.
Example 1. Write in set-theoretic terms the event D that exactly two
of the events A, B, C occur.
D =
Try events A = {2, 3, 5}, B = {2, 4, 6}, C = {1, 3, 5} to check your
formula.
[Margin note: B and C are the sets of even and odd numbers between 1
and 6; what is A?]
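One way to check a candidate answer to Example 1 is to compare it with a brute-force count of how many of the three events contain each outcome. The sketch below assumes Ω = {1, ..., 6} (the sets in the example suggest a die, but the notes do not say so) and tests one candidate formula, D = (A ∩ B ∩ C̄) ∪ (A ∩ B̄ ∩ C) ∪ (Ā ∩ B ∩ C). The function names are illustrative only.

```python
# Sample space assumed to be the faces of a die (an assumption, not stated in the notes).
OMEGA = frozenset(range(1, 7))

def exactly_two_formula(a, b, c, omega=OMEGA):
    """Candidate set-theoretic formula for 'exactly two of A, B, C occur'."""
    not_a, not_b, not_c = omega - a, omega - b, omega - c
    return (a & b & not_c) | (a & not_b & c) | (not_a & b & c)

def exactly_two_by_counting(a, b, c, omega=OMEGA):
    """Reference answer: outcomes contained in exactly two of the events."""
    return frozenset(w for w in omega if (w in a) + (w in b) + (w in c) == 2)

A = frozenset({2, 3, 5})
B = frozenset({2, 4, 6})
C = frozenset({1, 3, 5})

D = exactly_two_formula(A, B, C)
print(sorted(D))
```

If the candidate formula is right, it must agree with the counting version for every choice of A, B and C, not only for the three sets above.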
1.2. Discrete sample space. This is nothing but a sample space
which is countable:
$$|\Omega| = \operatorname{Card}\,\Omega \le \aleph_0,$$
i.e. $\Omega = \{\omega_1, \omega_2, \ldots, \omega_n, \ldots\}$.
Definition 1.1. Any function
$$p : \Omega \to [0, 1]$$
which satisfies the condition
$$\sum_{\omega_i \in \Omega} p(\omega_i) = 1$$
is called a probability function. The value $p(\omega_i)$ is called the probability of the elementary event $\omega_i$. The probability of an event A ⊆ Ω is
understood to be the quantity
$$P(A) \overset{\text{def}}{=} \sum_{\omega_i \in A} p(\omega_i).$$
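Definition 1.1 translates directly into code for a finite sample space. The sketch below assumes a fair die, p(ω) = 1/6, which is an illustrative choice rather than part of the notes. Exact fractions are used to avoid rounding error.

```python
from fractions import Fraction

# Illustrative probability function: a fair six-sided die (assumed).
p = {omega: Fraction(1, 6) for omega in range(1, 7)}

def is_probability_function(p):
    """Check that every p(w) lies in [0, 1] and that the values sum to 1."""
    return all(0 <= v <= 1 for v in p.values()) and sum(p.values()) == 1

def prob(event, p):
    """P(A) = sum of p(w) over the sample points w in A."""
    return sum((p[w] for w in event), Fraction(0))

even = {2, 4, 6}
print(is_probability_function(p), prob(even, p))
```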
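Example 2. $\Omega = \{0, 1\}^n = \{\omega = (\delta_1, \delta_2, \ldots, \delta_n) \mid \delta_i \in \{0, 1\}\}$. Let
$\Sigma\omega = \delta_1 + \ldots + \delta_n$, and put
$$p(\omega) \overset{\text{def}}{=} \gamma^{\Sigma\omega} (1 - \gamma)^{n - \Sigma\omega},$$
for some γ ∈ [0, 1]. It is a matter of checking that p(ω) defined as above
is a probability function on Ω.
...
[Margin note: no space is left for this exercise, why?]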
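The check requested in Example 2 can be illustrated numerically for small n by enumerating all 2^n points of Ω. This is a sketch under assumed values n = 5 and γ = 1/3. It illustrates the claim and is not a proof.

```python
from fractions import Fraction
from itertools import product

def p(omega, gamma):
    """p(w) = gamma^(sum w) * (1 - gamma)^(n - sum w) for w in {0,1}^n."""
    s = sum(omega)
    return gamma ** s * (1 - gamma) ** (len(omega) - s)

n = 5                      # assumed for illustration
gamma = Fraction(1, 3)     # assumed for illustration

# Sum p over every point of {0,1}^n.
total = sum(p(w, gamma) for w in product((0, 1), repeat=n))
print(total)
```

In general the sum equals (γ + (1 − γ))^n = 1 by the binomial theorem, since exactly C(n, k) points have Σω = k.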
Example 3. Try to find a probability function on $\Omega = \{0, 1\}^{\aleph_0}$ by
assigning probability values to each of the elementary events ω from Ω.
See the thorough discussion of this example in [4], Example 1.10.
[Margin note: this is a second argument for why we are moving towards
the σ-zoo.]
Remark 1. Note that summation can be used to assign (define) probabilities only to events which are at most countable! Otherwise the
notion of summation is not defined.
1.3. Geometric probability. The notion of geometric probability
can be introduced as follows.
Suppose the outcomes of the experiment can be put into one-to-one
correspondence with the points of some region Ω in $\mathbb{R}^n$, so that the
probability for the point-outcome to lie in any part A ⊆ Ω does not
depend on the form of A and its position in Ω, but depends only on
the measure of A, and hence is proportional to this measure:
$$P(\text{“}\,\cdot\,\text{”} \in A) = \frac{\mu(A)}{\mu(\Omega)},$$
where µ(·) is just a geometric measure of a region: length, area, volume,
etc.
Example 4. Close your eyes, take a needle and mark some point on
[0, 1] with it. Suppose that the nib of your needle has no diameter - it
is a perfect point. What is the probability that you marked 0.29? At
the same time this is not an event which is impossible - it is one of the
elementary events in this experiment!
Example 5. Pete and John have agreed to meet in the city centre on
Friday between 19-00 and 20-00. Each has decided to wait for the
other for only 10 minutes and then, in any case, to leave for a pub. What
is the probability that they meet between 19-00 and 20-00?
...
...
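Example 5 is a geometric probability on the unit square: with arrival times x and y measured in hours after 19-00, the two meet when |x − y| ≤ 1/6. The sketch below assumes independent, uniformly distributed arrival times (the notes do not state this explicitly) and compares a Monte Carlo estimate with the area 1 − (5/6)² = 11/36.

```python
import random

def meeting_probability_exact(wait=1/6):
    """Area of {|x - y| <= wait} inside the unit square: 1 - (1 - wait)^2."""
    return 1 - (1 - wait) ** 2

def meeting_probability_mc(trials=200_000, wait=1/6, seed=0):
    """Monte Carlo estimate with uniform, independent arrival times (assumed)."""
    rng = random.Random(seed)
    hits = sum(abs(rng.random() - rng.random()) <= wait for _ in range(trials))
    return hits / trials

exact = meeting_probability_exact()
estimate = meeting_probability_mc()
print(exact, estimate)
```

The estimate fluctuates with the seed and the number of trials; only the area computation gives the exact value.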
Non-uniform geometric probability ≡ physical notion (density, center
of mass).
2. “Non-elementary” Probability Theory
Above we saw that it is not always possible to assign probabilities to
sample points. Furthermore, some subsets of Ω may not be measurable.
The way out is to specify the subsets which are going to be considered
as events and then define probability only on this set of events.
2.1. σ-algebra of a sample space.
Definition 2.1. The set $F \subseteq 2^{\Omega}$ is called a σ-algebra of Ω if the
following conditions are satisfied:
(1) Ω ∈ F;
(2) A ∈ F ⇒ Ā ∈ F;
(3) if A1 , A2 , . . . ∈ F then $\bigcup_{i=1}^{\infty} A_i \in F$.
These are the axioms.
One can check that these axioms are sufficient for F to be closed with
respect to other basic set operations (difference, symmetric difference
and intersection).
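On a finite Ω a countable union reduces to a finite one, so the axioms of Definition 2.1 can be checked mechanically. The following sketch (the helper name `is_sigma_algebra` is hypothetical) tests the axioms and then confirms closure under intersection and difference for one small family chosen for illustration.

```python
from itertools import combinations

def is_sigma_algebra(omega, family):
    """Check the axioms of Definition 2.1 for subsets of a finite omega."""
    omega = frozenset(omega)
    family = {frozenset(s) for s in family}
    if omega not in family:                      # axiom (1)
        return False
    if any(omega - a not in family for a in family):   # axiom (2)
        return False
    # Axiom (3): on a finite space, closure under pairwise unions
    # already gives closure under all countable unions.
    return all(a | b in family for a, b in combinations(family, 2))

omega = {1, 2, 3, 4, 5, 6}
F = [set(), omega, {1, 2}, {3, 4, 5, 6}]
not_F = [set(), omega, {1}, {2}]       # complements of {1} and {2} are missing

fam = {frozenset(s) for s in F}
closed = all(a & b in fam and a - b in fam for a in fam for b in fam)
print(is_sigma_algebra(omega, F), is_sigma_algebra(omega, not_F), closed)
```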
[Margin note: note again that not all subsets of a geometric region are
measurable in this sense, see Example 10.]
[Margin note to Example 4: “A point is that of which there is no part”
(Euclid).]
[Margin note to Example 5: answer: 11/36.]
Remark 2. The first axiom can be replaced by the requirement
that F is not empty.
Property 1. ∅ ∈ F .
Property 2. If A1 , A2 , . . . ∈ F then $\bigcap_{i=1}^{\infty} A_i \in F$. Proof:
$$\overline{\bigcup_{i=1}^{\infty} \bar{A}_i} = \bigcap_{i=1}^{\infty} A_i.$$
Property 3. If A, B ∈ F, then A\B ∈ F. Proof: A\B = A ∩ B̄ ∈ F .
Theorem 2.2. The intersection $\bigcap_{\lambda \in \Lambda} F_\lambda$ (countable or uncountable) of
σ-algebras $\{F_\lambda\}_{\lambda \in \Lambda}$ over some set Ω is again a σ-algebra over Ω, and
it is contained in every $F_\lambda$.
Definition 2.3. Let $G \subseteq 2^{\Omega}$. The intersection of all σ-algebras
containing G, which is the smallest σ-algebra containing all elements of G, is called the σ-algebra generated by G and is denoted
by σ(G).
Example 6. Let Ω = {1, 2, 3, 4, 5, 6}. The following families of subsets
of Ω are σ-algebras:
(1) F = {∅, Ω}, the so-called trivial σ-algebra.
(2) F = {∅, Ω, {1}, $\overline{\{1\}}$}.
(3) F = {∅, Ω, A, Ā}, where A is some proper subset of Ω.
(4) F = P(Ω) = $2^{\Omega}$.
Now find σ(G) if G = {{3, 4}}.
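For a finite Ω the generated σ-algebra σ(G) of Definition 2.3 can be computed by closing G under complements and unions until nothing new appears. This is a brute-force sketch (the function name `generate_sigma_algebra` is an assumption), applied to G = {{3, 4}} from Example 6.

```python
def generate_sigma_algebra(omega, generators):
    """Smallest family containing the generators, closed under complement and union."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    while True:
        new = {omega - a for a in family}                 # complements
        new |= {a | b for a in family for b in family}    # pairwise unions
        if new <= family:        # nothing new: the family is closed
            return family
        family |= new

sigma = generate_sigma_algebra({1, 2, 3, 4, 5, 6}, [{3, 4}])
print(sorted(sorted(s) for s in sigma))
```

For this G the closure stops after one pass with four sets: ∅, {3, 4}, its complement {1, 2, 5, 6}, and Ω. The approach only works for finite Ω, where the family of subsets is itself finite.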
Example 7. Consider the following examples.
(1) Let Ω = [0, 1] and $G = \{[0, \tfrac{1}{3}], [\tfrac{1}{2}, 1]\}$. What is σ(G) then?
[Margin note: |σ(G)| = 8.]
(2) Let Ω be the interval (0, 1] and let $\tilde{F}$ be the class of all sets of
the form $(a_0, a_1] \cup (a_2, a_3] \cup \ldots \cup (a_{n-1}, a_n]$, where $0 \le a_0 \le a_1 \le \ldots \le a_n \le 1$.
Show that $\tilde{F}$ is not a σ-algebra.
2.2. Probability as a normalized measure.
Definition 2.4. Let Ω be some set and let F be a σ-algebra of its
subsets. A function
$$\mu : F \to \mathbb{R} \cup \{\infty\}$$
is called a measure on (Ω, F) if it satisfies the following conditions:
(1) the measure is a non-negative function: µ(A) ≥ 0 ∀A ∈ F.
(2) the measure is a sigma-additive function:
$$\mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} \mu(A_i) \quad \forall A_1, A_2, \ldots \in F \ \text{s.t.}\ A_i \cap A_j = \emptyset,\ i \ne j.$$
Definition 2.5. Let Ω be some set and let F be a σ-algebra of its
subsets. A measure µ on (Ω, F) is called normalized if µ(Ω) = 1. A normalized measure is also called a probability, or a probability
measure.
Definition 2.4 can be rewritten for probability measure.
Definition 2.6. Let Ω be some set and let F be a σ-algebra of its
subsets. A function
$$P : F \to \mathbb{R} \cup \{\infty\}$$
is called a probability measure on (Ω, F) if it satisfies the following
conditions:
(1) it is a non-negative function: P(A) ≥ 0 ∀A ∈ F.
(2) it is a sigma-additive function:
$$P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i) \quad \forall A_1, A_2, \ldots \in F \ \text{s.t.}\ A_i \cap A_j = \emptyset,\ i \ne j.$$
(3) P(Ω) = 1.
Definition 2.7. A triple (Ω, F, P) is called a probability space if F is
a σ-algebra on Ω and P is a probability measure on F.
Property 4. P(∅) = 0.
Property 5.
$$P\Big(\bigcup_{i=1}^{n} A_i\Big) = \sum_{i=1}^{n} P(A_i) \quad \forall A_1, A_2, \ldots, A_n \in F \ \text{s.t.}\ A_i \cap A_j = \emptyset,\ 1 \le i < j \le n.$$
Property 6. P(Ā) = 1 − P(A).
Property 7. If A ⊆ B then P(B\A) = P(B) − P(A).
Property 8. 0 ≤ P(A) ≤ 1.
Property 9. P(B ∪ A) = P(A) + P(B) − P(A ∩ B).
Property 10. P(B ∪ A) ≤ P(A) + P(B).
Property 11. P(A1 ∪ A2 ∪ . . . ∪ An ) ≤ P(A1 ) + P(A2 ) + . . . + P(An ).
Property 12.
$$P(A_1 \cup A_2 \cup \ldots \cup A_n) = \sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<m} P(A_i \cap A_j \cap A_m) - \ldots + (-1)^{n-1} P(A_1 \cap A_2 \cap \ldots \cap A_n).$$
[Margin note: induction works perfectly, though it takes some time.]
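Property 12 (the inclusion-exclusion formula) can be verified numerically on a finite probability space. The sketch below assumes the uniform measure on Ω = {1, ..., 12} and three arbitrary events; all of these are illustrative choices, not part of the notes.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset(range(1, 13))          # assumed sample space

def P(event):
    """Uniform probability measure on OMEGA (an illustrative choice)."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def inclusion_exclusion(events):
    """Right-hand side of Property 12: alternating sum over k-fold intersections."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = (-1) ** (k - 1)
        for subset in combinations(events, k):
            total += sign * P(frozenset.intersection(*subset))
    return total

events = [frozenset({1, 2, 3, 4}), frozenset({3, 4, 5, 6, 7}), frozenset({1, 4, 7, 10})]
lhs = P(frozenset.union(*events))
rhs = inclusion_exclusion(events)
print(lhs, rhs)
```

A numerical check of this kind covers only the chosen events; the general statement still needs the induction argument.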
Example 8. A clerk has to put n letters into n envelopes. However, for some reason the letters were placed into the envelopes at random. What is the probability that at least one letter has been placed
into its correct envelope? What is the limit of this probability as n →
∞?
[Margin note: $1 - \frac{1}{2!} + \frac{1}{3!} - \ldots + (-1)^{n-1} \frac{1}{n!} \to 1 - e^{-1}$.]
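For Example 8 the answer can be checked by brute force for small n. The sketch below assumes that all n! arrangements are equally likely, enumerates them, and compares the result with the alternating sum 1 − 1/2! + 1/3! − ... that follows from Property 12. The function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import exp, factorial

def p_match_enumerated(n):
    """P(at least one letter in its own envelope), by listing all n! arrangements."""
    hits = sum(any(perm[i] == i for i in range(n)) for perm in permutations(range(n)))
    return Fraction(hits, factorial(n))

def p_match_formula(n):
    """Inclusion-exclusion value: sum over k = 1..n of (-1)^(k-1) / k!."""
    return sum(Fraction((-1) ** (k - 1), factorial(k)) for k in range(1, n + 1))

# The two computations agree exactly for every small n tried here.
for n in range(1, 8):
    assert p_match_enumerated(n) == p_match_formula(n)

limit = 1 - exp(-1)
print(float(p_match_formula(10)), limit)
```

The enumeration grows like n!, so it is only practical for small n; the formula is what gives the limit 1 − 1/e.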
Example 9. A and B play a game until one wins once (and is declared
the winner of the match). The probability that A wins each game is
0.3, the probability that B wins each game is 0.2. What is a suitable
probability space, sigma algebra and the probability that A wins the
match?
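One possible model for Example 9, stated here as an assumption: games are independent, and each game is won by A with probability 0.3, by B with probability 0.2, and is otherwise drawn and replayed (probability 0.5). An elementary event is then the number k of the deciding game together with the winner, and A wins at game k with probability 0.5^(k−1) · 0.3. A sketch of the resulting geometric series:

```python
from fractions import Fraction

P_A = Fraction(3, 10)          # A wins a single game
P_B = Fraction(2, 10)          # B wins a single game
P_DRAW = 1 - P_A - P_B         # assumed: otherwise the game is drawn and replayed

def p_a_wins_by_game(k_max):
    """Partial sum of P(A wins the match at game k) for k = 1..k_max."""
    return sum(P_DRAW ** (k - 1) * P_A for k in range(1, k_max + 1))

closed_form = P_A / (P_A + P_B)     # limit of the geometric series
print(float(p_a_wins_by_game(50)), closed_form)
```

Since the sample space is countable, the σ-algebra of all subsets is a natural choice under this model.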
2.3. Borel σ-algebra and Lebesgue measure.
Definition 2.8. The Borel sigma algebra is defined on a topological
space (Ω, O) and is B = σ(O).
The Borel σ-algebra on R is σ(C), where C is any of the following
classes of sets:
(1) C = {(a, b) | a ≤ b, a, b ∈ R},
(2) C = {(a, b] | a ≤ b, a, b ∈ R},
(3) C = {[a, b) | a ≤ b, a, b ∈ R},
(4) C = {[a, b] | a ≤ b, a, b ∈ R},
(5) C = {(−∞, a] | a ∈ R},
(6) C = {(−∞, a) | a ∈ R}.
The Borel σ algebra on R is denoted by B(R).
Note that all the usual subsets of R are in B(R). In particular,
R, any interval, any one-point set, N and Q are in B(R). Question:
does B(R) contain the set of all irrational numbers?
Lemma 2.9. There exists a unique measure λ on (R, B(R)) which assigns to each interval its length. This measure is called the Lebesgue
measure.
Finally, we justify the introduction of probability on σ-algebras by considering an example of a subset of a segment whose Lebesgue measure
does not exist.
Example 10. Vitali’s set. Consider a unit circle (which is essentially
a segment :). Take some irrational number α. The number nα cannot
be an integer for any n ∈ N (why?). Therefore, if we take any point x from
[0, 2π], i.e. a point on the circle, and mark all the points which are
obtained by a rotation of x through the angle 2πnα, n = ±1, ±2, . . ., we will
never come back to x. There is a countable set Kx of all such points for
any x from [0, 2π]. The circle is thus naturally divided into disjoint classes
{Kx }. Take from each Kx one and only one point and form the
set A0 . Denote by An the set of points obtained by rotation of the set
A0 through the angle 2πnα, n ∈ Z. The union $\bigcup_{n=-\infty}^{\infty} A_n$ is then nothing but the
segment [0, 2π]. Also, these sets are disjoint, therefore the measure
of their union is just the sum of their measures.
Suppose that A0 is measurable. Noting that all An have the same
measure, equal to the measure of A0 , we obtain a contradiction:
$$2\pi = \mu\Big(\bigcup_{n=-\infty}^{\infty} A_n\Big) = \sum_{n=-\infty}^{\infty} \mu(A_n) = \sum_{n=-\infty}^{\infty} \mu(A_0) \ne 2\pi,$$
since the last sum is 0 if µ(A0 ) = 0 and ∞ if µ(A0 ) > 0.
The assumption that the set A0 is measurable leads us to a contradiction. The set A0 is not measurable.
Exercise 1. Find a non-measurable (in the sense of Lebesgue measure)
set on R which is unbounded.
Remark 3. One may notice that the family of different classes
Kx is uncountable¹, which may raise doubts about the
existence of the set A0 (recall its definition).
Indeed, historically, the term non-measurable set was established in
the theory after Vitali proved in 1905 a theorem stating that
any Lebesgue measurable set of non-zero measure contains an uncountable subset which is not Lebesgue measurable. Vitali relied heavily on the invariance of the Lebesgue measure with respect to
parallel shifts in Euclidean space. Some time later new constructions were proposed to show the existence of non-measurable sets (F.
Bernstein, 1908; S. Ulam, 1930). However, all the new methods essentially used the so-called axiom of uncountable choice, see [2] for
more information. A long series of debates between mathematicians
about the nature of non-measurable sets obtained in this way came to
an end in 1970, when R. Solovay proved that it is impossible to prove
the existence of non-measurable sets without the axiom of uncountable
choice.
Acknowledgement. I am thankful to Tom for revising these notes.
His observations and our discussion have led to some corrections and
improvements. I am responsible for any mistakes/misprints/whatever
which you can find here.
¹ Thomas Dodd drew my attention to this fact.
You may also find the exposition in [3] useful and interesting.
3. FURTHER READING
A.N. Kolmogorov. Grundbegriffe der Wahrscheinlichkeitsrechnung.
Springer, Berlin, 1933. English translation (1950): Foundations of the
theory of probability. Chelsea, New-York.
References
1. Evans, L.C. An introduction to stochastic differential equations. Lecture Notes,
version 1.2.
2. http://planetmath.org/encyclopedia/AxiomOfChoice.html
3. http://www.stats.uwaterloo.ca/ dlmcleis/s901/
4. http://www.cs.cmu.edu/ chal/Shreve/chap1.ps
† School of Mathematical and Computer Sciences, Heriot-Watt University