Introduction to Probability
Classical Concept:
◦ requires finitely many and equally likely outcomes
◦ probability of event defined as number of favorable outcomes (s) divided by number of total outcomes (N):
Probability of event = s / N
◦ can be determined by counting outcomes
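As a quick illustration (a sketch of our own, not from the original slides), the counting can be carried out directly in Python; here the event is "roll an even number" with a fair six-sided die.

# Classical probability by counting: Probability of event = s / N
outcomes = [1, 2, 3, 4, 5, 6]                      # all N equally likely outcomes
favorable = [s for s in outcomes if s % 2 == 0]    # the s favorable outcomes
print(len(favorable) / len(outcomes))              # s / N = 3/6 = 0.5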
In many practical situations the different outcomes are not equally
likely:
◦ Success of treatment
◦ Chance to die of a heart attack
◦ Chance of snowfall tomorrow
It is not immediately clear how to measure chance in each of these
cases.
Three Concepts of Probability
◦ Frequency interpretation
◦ Subjective probabilities
◦ Mathematical probability concept
The Frequentist Approach
In the long run, we are all dead.
John Maynard Keynes (1883-1946)
The Frequency Interpretation of Probability
The probability of an event is the proportion of time that events
of the same kind (repeated independently and under the same
conditions) will occur in the long run.
Example:
Suppose we collect data on the weather in Chicago on Jan 21 and
we note that in the past 124 years it snowed in 34 years on Jan 21,
that is 34/124 · 100% = 27.4% of the time.
Thus we would estimate the probability of snowfall on Jan 21 in
Chicago as 0.274.
The frequency interpretation of probability is based on the following theorem:
The Law of Large Numbers
If a situation, trial, or experiment is repeated again and again, the
proportion of successes will converge to the probability of any one
outcome being a success.
[Figure: Relative Frequency of Heads versus Number of Tosses, shown in three panels (tosses 1-1000, tosses 1000-100000, and tosses 100000-1000000); the relative frequency settles near 0.5 as the number of tosses grows.]
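The figure can be reproduced with a small simulation (a sketch of our own, not the code behind the original plots): toss a fair coin repeatedly and track the relative frequency of heads, which settles near 0.5 as the number of tosses grows.

import random

# Law of Large Numbers sketch: the relative frequency of heads approaches 0.5
random.seed(1)
heads = 0
for n in range(1, 100_001):
    heads += random.random() < 0.5          # one toss: heads with probability 0.5
    if n in (10, 100, 1_000, 10_000, 100_000):
        print(f"{n:>7} tosses: relative frequency of heads = {heads / n:.4f}")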
The Subjectivist (Bayesian) Approach
Not all events are repeatable:
◦ Will it snow tomorrow?
◦ Will Mr Jones, 42, live to 65?
◦ Will the Dow Jones rise tomorrow?
◦ Does Iraq have weapons of mass destruction?
To all these questions the answer is either “yes” or “no”, but we
are uncertain about the right answer.
Need to quantify our uncertainty about an event A:
Game with two players:
◦ 1st player determines p such that he will “win” $c · (1 − p) if
event A occurs and otherwise he will “lose” $c · p.
◦ 2nd player chooses c which can be positive or negative.
The Bayesian interpretation of probability is that probability
measures the personal (subjective) uncertainty of an event.
Example: Weather forecast
Meteorologist says that the probability of snowfall tomorrow is
90%.
He should be willing to bet $90 against $10 that it snows tomorrow
and $10 against $90 that it does not snow.
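To see why the forecaster should state p equal to his actual degree of belief, one can compute the first player's expected winnings in the betting game above. The following sketch is our own illustration (the belief q and the function name are not from the slides); the expected gain works out to c · (q − p), which vanishes for every stake c exactly when p = q.

def expected_gain(p, q, c):
    # Player 1 states probability p but believes the event occurs with chance q;
    # player 2 chooses the stake c (positive or negative).
    # Player 1 wins c*(1-p) if the event occurs and loses c*p otherwise.
    return q * c * (1 - p) - (1 - q) * c * p    # simplifies to c * (q - p)

print(expected_gain(p=0.9, q=0.9, c=100))    # 0.0: fair bet for any stake c
print(expected_gain(p=0.99, q=0.9, c=100))   # -9.0: an overstated p can be exploited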
The Elements of Probability
A (statistical) experiment is a process of observation or measurement. For a mathematical treatment we need:
Sample Space S - set of possible outcomes
Example: An urn contains five balls, numbered from 1 through
5. We choose two at random and at the same time. What is the
sample space?
S = { {1, 2}, {1, 3}, {1, 4}, {1, 5}, {2, 3}, {2, 4}, {2, 5}, {3, 4}, {3, 5}, {4, 5} }.
Events A ⊆ S - an event is a subset of the sample space S
Example: In the example above the event A that two balls with odd numbers are chosen is
A = { {1, 3}, {1, 5}, {3, 5} }.
Probability Function
P - assigns each A a value in [0, 1]
Example: Assuming that all outcomes are equally likely we obtain
P(A) = 3/10.
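The counting in this example can be checked by enumerating the sample space; a small sketch of our own using itertools:

from itertools import combinations

S = list(combinations(range(1, 6), 2))                           # choose 2 balls out of 5
A = [pair for pair in S if all(ball % 2 == 1 for ball in pair)]  # both numbers odd

print(len(S))            # 10 outcomes
print(A)                 # [(1, 3), (1, 5), (3, 5)]
print(len(A) / len(S))   # P(A) = 3/10 = 0.3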
The Elements of Probability
Why not assign probabilities to outcomes?
Example: Spinner labeled from 0 to 1.
◦ Suppose that all outcomes s ∈ S = [0, 1) are equally likely.
◦ Assign probabilities uniformly on S:
◦ P({s}) = c > 0 ⇒ P(S) = ∞
◦ P({s}) = 0 ⇒ P(S) = 0
Solution: Assign to each subset of S a probability equal to the
“length” of that subset:
◦ Probability that the spinner lands in [0, 1/4) is 1/4.
◦ Probability that the spinner lands in [1/2, 3/4) is 1/4.
◦ Probability that the spinner lands on 1/2 is 0.
In integral notation we have
P(spinner lands in [a, b]) = ∫_a^b dx = b − a.
Remark:
Strictly speaking, we can define the above probability only on a collection A of subsets A ⊆ S, which however contains all subsets that are important and relevant for this class.
In the case of finite or countably infinite sample spaces S there are no such exceptions
and A covers all subsets of S.
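A Monte Carlo check of the “length” assignment (a sketch of our own, assuming the spinner behaves like a uniform random number in [0, 1)):

import random

# Probability of landing in [a, b) should equal its length b - a
random.seed(1)
a, b = 0.5, 0.75
n = 100_000
hits = sum(a <= random.random() < b for _ in range(n))
print(hits / n)          # observed proportion, close to 0.25
print(b - a)             # exact probability 0.25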
A Set Theory Primer
A set is “a collection of definite, well distinguished objects of our perception
or of our thought”. (Georg Cantor, 1845-1918)
Some important sets:
◦ N = {1, 2, 3, . . .}, the set of natural numbers
◦ Z = {. . . , −2, −1, 0, 1, 2, . . .}, the set of integers
◦ R = (−∞, ∞), the set of real numbers
Intervals are denoted as follows:
[0, 1] the interval from 0 to 1 including 0 and 1
[0, 1) the interval from 0 to 1 including 0 but not 1
(0, 1) the interval from 0 to 1 not including 0 and 1
If a is an element of the set A then we write a ∈ A.
If a is not an element of the set A then we write a ∉ A.
Suppose that A and B are subsets of S (denoted as A, B ⊆ S).
The empty set is denoted by ∅ (Note: ∅ ⊆ A for all subsets A of S).
Difference of A and B (A\B): Set of all elements in A which are not in B.
Intersection of A and B (A ∩ B): Set of all elements in S which are both
in A and in B.
Union of A and B (A ∪ B): Set of all elements in S that are in A or in B.
Complement of A (Aᶜ or A′): Set of all elements in S that are not in A.
Note that A ∩ Aᶜ = ∅ and A ∪ Aᶜ = S.
A and B are disjoint if A and B have no common elements, that is A ∩ B = ∅. Two events A and B with this property are said to be mutually exclusive.
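Python's built-in set type mirrors these operations; the sample space S = {1, ..., 6} below is our own illustrative example.

S = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3}
B = {3, 4}

print(A - B)                 # difference A \ B        -> {1, 2}
print(A & B)                 # intersection A ∩ B      -> {3}
print(A | B)                 # union A ∪ B             -> {1, 2, 3, 4}
print(S - A)                 # complement Aᶜ in S      -> {4, 5, 6}
print(A & (S - A))           # A ∩ Aᶜ = ∅              -> set()
print(A.isdisjoint({5, 6}))  # A and {5, 6} are mutually exclusive -> True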
The Postulates of Probability
A probability on a sample space S (and a set A of events) is a
function which assigns each event A a value in [0, 1] and satisfies
the following rules:
Axiom 1: All probabilities are nonnegative:
P(A) ≥ 0
for all events A.
Axiom 2: The probability of the whole sample space is 1:
P(S) = 1.
Axiom 3 (Addition Rule): If two events A and B are mutually exclusive then
P(A ∪ B) = P(A) + P(B),
that is the probability that one or the other occurs is the sum
of their probabilities.
More generally, if countably many events Ai, i ∈ N, are mutually exclusive (i.e. Ai ∩ Aj = ∅ whenever i ≠ j) then
P( ∪_{i=1}^∞ Ai ) = Σ_{i=1}^∞ P(Ai).
The Postulates of Probability
Classical Concept of Probability
The probability of an event A is defined as
P(A) = #A/#S,
where #A denotes the number of elements (outcomes) in A.
It satisfies
◦ P(A) ≥ 0
◦ P(S) = #S/#S = 1
◦ If A and B are mutually exclusive then
P(A ∪ B) = #(A ∪ B)/#S = #A/#S + #B/#S = P(A) + P(B).
The Postulates of Probability
Frequency Interpretation of Probability
The probability of an event A is defined as
P(A) = lim_{n→∞} n(A)/n,
where n(A) is the number of times event A occurred in n repetitions.
It satisfies
◦ P(A) ≥ 0
◦ P(S) = lim_{n→∞} n/n = 1
◦ If A and B are mutually exclusive then n(A ∪ B) = n(A) + n(B).
Hence
P(A ∪ B) = lim_{n→∞} n(A ∪ B)/n
= lim_{n→∞} (n(A)/n + n(B)/n)
= lim_{n→∞} n(A)/n + lim_{n→∞} n(B)/n = P(A) + P(B).
The Postulates of Probability
Example: Toss of one die
The events A = {1} and B = {4, 5} are mutually exclusive.
Since all outcomes are equiprobable we obtain
P(A) = 1/6 and P(B) = 1/3.
The addition rule yields
P(A ∪ B) = 1/6 + 1/3 = 3/6 = 1/2.
On the other hand we get for C = A ∪ B = {1, 4, 5}
P(C) = 3/6 = 1/2.
The first two axioms can be summarized by the
Cardinal Rule: For any subset A of S
0 ≤ P(A) ≤ 1.
In particular
◦ P(∅) = 0
◦ P(S) = 1