A Note on Probability, Frequency
and Countable Additivity
Zvonimir Šikić
Thousands of students are taught every year that the probability of an
event is the proportion of times the event would occur in a long run of
repeated experiments. In other words, probability is long-run frequency.
The benefit of this definition is that it makes the proof of the main
properties of probability (the axioms of probability) very easy. Even Kolmogorov used it in his “Empirical deduction of the axioms”, a chapter
of his [K].
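Namely, if n(A) is the number of times an event A occurs in n repeated experiments and $f_n(A) = n(A)/n$ is the corresponding frequency, then
1. $0 \le f_n(A) \le 1$,
2. $f_n(A) + f_n(-A) = 1$,
3. $f_n(A \cup B) = f_n(A) + f_n(B)$, if A and B exclude each other,
4. $f_n(A \cap B) = f_n(A)\, f_n(B|A)$
(where $f_n(B|A)$ is the frequency of B given A).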
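Of course, (1)–(4) are the axioms of probability:
1. $0 \le \mathrm{pr}(A) \le 1$,
2. $\mathrm{pr}(A) + \mathrm{pr}(-A) = 1$,
3. $\mathrm{pr}(A \cup B) = \mathrm{pr}(A) + \mathrm{pr}(B)$, if A and B exclude each other,
4. $\mathrm{pr}(A \cap B) = \mathrm{pr}(A)\,\mathrm{pr}(B|A)$, where $\mathrm{pr}(B|A)$ is the probability of B given A.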
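That finite frequencies obey properties (1)–(4) can be checked directly on any finite record of outcomes. The following is a minimal sketch: the die rolls, the events `even`, `low` and `six`, and the helper names `freq` and `cond_freq` are all assumptions made for illustration and do not come from the text.

```python
from fractions import Fraction

def freq(outcomes, event):
    """Relative frequency f_n(event) over the n recorded outcomes."""
    return Fraction(sum(1 for x in outcomes if event(x)), len(outcomes))

def cond_freq(outcomes, b, a):
    """f_n(b | a): frequency of b among the outcomes where a occurred."""
    given_a = [x for x in outcomes if a(x)]
    if not given_a:
        raise ValueError("conditioning event never occurred")
    return freq(given_a, b)

# Hypothetical data: n = 12 rolls of a die, chosen only for illustration.
rolls = [3, 6, 2, 5, 1, 4, 2, 6, 3, 1, 5, 2]

def even(x): return x % 2 == 0      # event A
def low(x): return x <= 2           # event B
def six(x): return x == 6           # an event that excludes B

checks = {
    1: 0 <= freq(rolls, even) <= 1,
    2: freq(rolls, even) + freq(rolls, lambda x: not even(x)) == 1,
    3: freq(rolls, lambda x: low(x) or six(x)) == freq(rolls, low) + freq(rolls, six),
    4: freq(rolls, lambda x: even(x) and low(x))
       == freq(rolls, even) * cond_freq(rolls, low, even),
}
print(checks)
```

Exact fractions are used instead of floats so that the equalities in (2)–(4) are tested without rounding error. The checks hold for any finite record, not only for this one, because they are identities about counting.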
But there is a big problem here. Which $f_n$ is the probability? Is the
probability of heads given by 100 tosses, i.e. by $f_{100}$, by 1000
tosses, i.e. by $f_{1000}$, or by something else? How long should the long run be?
The longest run possible would circumvent the problem, and the
longest run possible is an infinite run. Hence:
$$\mathrm{pr}(A) = \lim_{n \to \infty} f_n(A).$$
This solution to the problem creates new problems. In contrast to finite
frequencies, limiting frequencies are unobservable (as Keynes said, “in
the long run we are all dead”). The limiting frequency has no empirical
content. And we should not forget that two infinite sequences which
differ only in a finite initial segment have the same long-run frequencies, so they can be treated as the same sequence if long-run frequencies are all we are
interested in (hence, there is no connection between limiting frequencies and finite observable frequencies).
Of course, we could be interested in the mathematical content of
limiting frequencies. After all, it may be the mathematical foundation of probability that interests us, and not its applications.
So, let us explore the mathematics of pr, which is defined as
$\mathrm{pr}(A) = \lim_{n \to \infty} f_n(A)$. First of all, this pr satisfies the probability axioms (1)-(4),
because all $f_n$ satisfy them, but only if $\lim_{n \to \infty} f_n(A)$ exists. And it is easy
to construct examples of infinite runs with non-existent limiting frequencies.
Here is a sequence of heads and tails with no limiting frequency of
heads (or tails, for that matter):
HT HT HHTT HHHHTTTT HHHHHHHHTTTTTTTT . . .
It starts with HT HT, and after that we have blocks with $2^n$ H and $2^n$ T,
for every n > 0. If we stop after the n-th block, the frequency of heads will
be 1/2 (because every block has the same number of heads and tails). The n-th block is preceded by $2^{n+1}$ tosses, $2^n$ of which are heads, so if
we stop in the middle of the n-th block (after its heads and before its tails), the frequency of heads will be:
$$\frac{2^n + 2^n}{2^{n+1} + 2^n} = \frac{2}{3}.$$
The frequency of heads oscillates between 1/2 and 2/3, i.e. the limiting
frequency of heads does not exist.
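The oscillation can be observed numerically. The sketch below generates the sequence described above and records the running frequency of heads. The function names `block_sequence` and `running_frequency` and the cut-off of $3 \cdot 2^{10}$ tosses are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import islice

def block_sequence():
    """Yield HT HT, then blocks of 2**n heads followed by 2**n tails (n = 1, 2, ...)."""
    yield from "HTHT"
    n = 1
    while True:
        yield from "H" * 2**n
        yield from "T" * 2**n
        n += 1

def running_frequency(tosses, length):
    """Return [f_1(H), ..., f_length(H)] for the first `length` tosses."""
    heads, freqs = 0, []
    for i, toss in enumerate(islice(tosses, length), start=1):
        heads += toss == "H"
        freqs.append(Fraction(heads, i))
    return freqs

freqs = running_frequency(block_sequence(), 3 * 2**10)
for n in (1, 5, 10):
    # Frequency after 2**(n+1) tosses (a block has just ended) and after
    # 3 * 2**n tosses (all heads of the n-th block seen, none of its tails yet).
    print(n, freqs[2**(n + 1) - 1], freqs[3 * 2**n - 1])
```

Each printed line shows the frequency returning to 1/2 at the end of a block and climbing back to 2/3 halfway through the next one, however far the run is extended.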
Furthermore, even if an infinite sequence of heads and tails has a
limiting frequency, there are (provided both heads and tails occur infinitely often) infinitely many subsequences of this sequence with whatever limiting frequency you like (and, of course, there
are infinitely many of them with no limiting frequency). It means that
by appropriately neglecting some tosses you can get whatever you choose (so
you’d better be sure you’ve seen all the tosses).
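One way to see how such a subsequence can be picked is a greedy selection rule, sketched below. It is only one possible construction, not one given in the text. The name `select_subsequence`, the alternating base sequence HTHT… and the targets 9/10 and 1/10 are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import cycle

def select_subsequence(tosses, target, length):
    """Greedily keep tosses (in their original order) so that the frequency
    of heads among the kept tosses tracks `target`.

    Assumes both H and T keep occurring in `tosses`; otherwise the loop
    may never collect `length` tosses from an infinite source.
    """
    kept, heads = [], 0
    for toss in tosses:
        if len(kept) == length:
            break
        # Ask for a head while the kept heads lag behind the target share.
        want = "H" if heads < target * (len(kept) + 1) else "T"
        if toss == want:            # keep it; every other toss is neglected
            kept.append(toss)
            heads += toss == "H"
    return kept

base = "HT"                          # base sequence HTHTHT..., limiting frequency 1/2
mostly_heads = select_subsequence(cycle(base), Fraction(9, 10), 1000)
mostly_tails = select_subsequence(cycle(base), Fraction(1, 10), 1000)
print(Fraction(mostly_heads.count("H"), 1000), Fraction(mostly_tails.count("H"), 1000))
```

With this rule the number of kept heads never differs from `target` times the number of kept tosses by more than 1, so the frequency in the selected subsequence converges to the chosen target even though the base sequence has limiting frequency 1/2.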
Suppose, further, that the results of repeated “head-tail” experiments are
distributed in space and time in the following way, each result being a point with coordinates (space, time):
Heads are represented by the white points. Their coordinates are the partial
sums of the series:
(2,3) + (2,3) + (2,3) + (2,3) + (2,3) + …
Tails are represented by the black points. Their coordinates are the partial
sums of the series:
(1,1) + (2,1) + (2,2) + (2,1) + (2,2) + (2,1) + (2,2) + …
If you were tossing the coin, your time sequence of heads and tails is:
TTH TTH TTH TTH …
The limiting frequency of heads, in this sequence, is 1/3. This is your
probability of heads.
If I am inspecting the field of coins you tossed, my space sequence
of heads and tails is:
TH TH TH TH TH TH …
The limiting frequency of heads, in this sequence, is 1/2. This is my
probability of heads.
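The two orderings can be reproduced from the coordinates alone. The sketch below assumes that the first coordinate of each point is its position in space and the second its position in time, which is the reading under which the two sequences above come out. The names `partial_sums`, `time_order` and `space_order` and the cut-off `k` are assumptions made for illustration.

```python
def partial_sums(steps):
    """Running sums of a list of (space, time) steps."""
    x = y = 0
    points = []
    for dx, dy in steps:
        x, y = x + dx, y + dy
        points.append((x, y))
    return points

k = 300  # number of heads generated (an arbitrary cut-off for this sketch)
heads = [(p, "H") for p in partial_sums([(2, 3)] * k)]
tails = [(p, "T") for p in partial_sums([(1, 1)] + [(2, 1), (2, 2)] * k)]
points = heads + tails

# Read the same field of results in order of time, then in order of space.
by_time = "".join(label for _, label in sorted(points, key=lambda q: q[0][1]))
by_space = "".join(label for _, label in sorted(points, key=lambda q: q[0][0]))

# Only a finite window is generated, so compare the parts where both
# heads and tails have been produced.
time_order = by_time[:3 * k]
space_order = by_space[:2 * k]
print(time_order[:12], time_order.count("H") / len(time_order))
print(space_order[:12], space_order.count("H") / len(space_order))
```

The same set of points yields TTH TTH … when sorted by time and TH TH … when sorted by space, so the frequency of heads is 1/3 in one reading and 1/2 in the other.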
Should one answer be right and the other wrong? If you prefer one
of them, think of Einstein’s special relativity.
The solution proposed by R. von Mises, in [M], is to rule problematic
sequences out. The sequences of experimental results should be random
(von Mises’ term was collective) which means that:
1. they should have limiting frequencies,
2. these limiting frequencies should remain the same in every recursive
subsequence of the given sequence (“recursive” was Church’s clarification in [C]).
Our “sequence of heads and tails with no limiting frequency of heads
(or tails)” is ruled out by (1). The subsequences “with whatever limiting
frequency you like” are ruled out by (2). Space-time sensitive limits
are not ruled out. I suppose that the above example, which is folk
knowledge today, was not folk knowledge in von Mises’ days. If it had been,
von Mises and Church would have added:
3. the limiting frequencies should remain the same in every recursive
reordering of the given sequence.
But there is no explanation of why an infinite sequence of repeated
experiments should produce collectives. Why should an infinite sequence of
heads and tails satisfy (1)-(3)?
A further problem for frequentists is Kolmogorov’s axiom of continuity (which is equivalent to his theorem of countable additivity).
Kolmogorov said in [K] that “it is almost impossible to elucidate its
empirical meaning, as has been done for [other] axioms”. Kolmogorov
was thinking of frequencies $f_n$ as something with empirical meaning.
If we move to limiting frequencies, the common opinion is that they
violate countable additivity.
Van Fraassen in [F] and many others offer counterexamples of the
following kind. Consider an infinite lottery with tokens 1,2,3,4,… and
let $D_j$ = “token j is drawn”. Suppose that in an infinite sequence of draws
(with replacement) none of the tokens is drawn infinitely many times.
Then $\mathrm{pr}(D_j) = \lim_{n \to \infty} f_n(D_j) = 0$, for every j, and it follows that
$$\mathrm{pr}(D_1) + \mathrm{pr}(D_2) + \mathrm{pr}(D_3) + \mathrm{pr}(D_4) + \ldots = 0.$$
On the other hand
$$\mathrm{pr}(D_1 \cup D_2 \cup D_3 \cup D_4 \cup \ldots) = 1,$$
because $D_1 \cup D_2 \cup D_3 \cup D_4 \cup \ldots$ is a necessary event. Hence
$$\mathrm{pr}(D_1 \cup D_2 \cup D_3 \cup D_4 \cup \ldots) = 1 \ne 0 = \mathrm{pr}(D_1) + \mathrm{pr}(D_2) + \mathrm{pr}(D_3) + \mathrm{pr}(D_4) + \ldots$$
and this violates countable additivity.
But why should pr(D1) + pr(D2) + pr(D3) + pr(D4) + … be 0? It is an
indeterminate form ∞∙0 which could be anything (if you still remember
your first course in calculus). As a matter of fact, in this particular case,
it is easy to prove that it is 1 (as it should be, according to countable
additivity).
Suppose, for example, that our infinite sequence of draws is:
D4, D1, D9, D2, D4, D1, D7, D4, …
The corresponding probabilities are:
pr(D1) = lim (0/1, 1/2, 1/3, 1/4, 1/5, 2/6, 2/7, 2/8, …) = 0
pr(D2) = lim (0/1, 0/2, 0/3, 1/4, 1/5, 1/6, 1/7, 1/8, …) = 0
pr(D3) = lim (0/1, 0/2, 0/3, 0/4, 0/5, 0/6, 0/7, 0/8, …) = 0
pr(D4) = lim (1/1, 1/2, 1/3, 1/4, 2/5, 2/6, 2/7, 3/8, …) = 0 etc.
If we sum all the limits up we get:
$\sum_n \mathrm{pr}(D_n) = \lim\,(1, 1, 1, 1, 1, 1, 1, 1, \ldots) = 1$
The calculation is exactly the same for every other sequence of
draws. Hence,
$$\mathrm{pr}(D_1 \cup D_2 \cup D_3 \cup D_4 \cup \ldots) = 1 = \mathrm{pr}(D_1) + \mathrm{pr}(D_2) + \mathrm{pr}(D_3) + \mathrm{pr}(D_4) + \ldots,$$
i.e. limiting frequencies satisfy countable additivity.
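The term-by-term sums used in the calculation above can be tabulated for the eight draws that are listed. The sketch below covers only that finite prefix and computes, for each n, the frequencies $f_n(D_j)$ and their sum over j; it makes no claim about what lies beyond the eighth draw. The names `frequency_rows` and `row_sums` are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction

draws = [4, 1, 9, 2, 4, 1, 7, 4]     # the first eight draws listed above

def frequency_rows(draws):
    """For each n, return {token j: f_n(D_j)} over the first n draws.

    Tokens not drawn so far have frequency 0 and are simply left out.
    """
    counts, rows = Counter(), []
    for n, token in enumerate(draws, start=1):
        counts[token] += 1
        rows.append({j: Fraction(c, n) for j, c in counts.items()})
    return rows

rows = frequency_rows(draws)
row_sums = [sum(row.values()) for row in rows]   # sum over j of f_n(D_j), for each n

print([str(row.get(4, Fraction(0))) for row in rows])  # the D4 row of the table
print(row_sums)
```

For every n the frequencies of the individual tokens add up to 1, since each of the n draws produces exactly one token. This is the sequence (1, 1, 1, …) whose limit is taken in the text.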
My final conclusion is that limiting frequencies satisfy the probability axioms (1)-(4) (this was well known), but that they also satisfy
countable additivity (this is a new result). Hence, limiting frequencies
have no problems with the probability axioms. Their problem is that they
may not exist. It is possible that an infinite sequence of experimental
results has no limiting frequency. (An appeal to the law of large numbers, “the probability of sequences with no limiting frequencies is 0”, is
not available to the frequentist, because it presupposes that probability
is defined independently of limiting frequencies. If it is not, the “law”
becomes a complete triviality.)
Bibliography
[C] A. Church, “On the Concept of a Random Sequence”, Bull. of the
American Math. Society 46 (1940), pp. 130-135.
[F] B.C. van Fraassen, “Relative Frequencies”, in W. C. Salmon (ed.), Hans
Reichenbach, Logical Empiricist, Reidel, 1979, pp. 133-166.
[K] A. N. Kolmogorov, Foundations of the Theory of Probability, Chelsea,
1956 (German original 1933).
[M] R. von Mises, Probability, Statistics and Truth, Macmillan, 1957 (German original 1936).