Chapter 2
Probability
2.1
Basic ideas of probability
One of the most basic concepts in probability and statistics is that of a random
experiment. Although a more precise definition is possible, we will restrict ourselves here to understanding a random experiment as a procedure which is carried
out under a certain set of conditions; it can be repeated any number of times under the same set of conditions, and upon the completion of the procedure certain
results are observed. The results obtained are denoted by s and are called sample
points. The set of all possible sample points is denoted by S and is called a sample
space. Subsets of S are called events and are denoted by capital letters A, B,
C, etc. An event consisting of one sample point only, {s}, is called a simple event;
any other event is called composite. An event A occurs (or happens) if the outcome of the
random experiment (that is, the sample point s) belongs to A, s ∈ A; A does not
occur (or does not happen) if s ∉ A. The event S always occurs and is called the
sure or certain event. On the other hand, the event ∅ never happens and is called
the impossible event.
The complement of the event A, denoted by A^c, is the event defined by
A^c = {s ∈ S; s ∉ A}. So A^c occurs whenever A does not, and vice versa.
The union of the events A1, ..., An, denoted by A1 ∪ ... ∪ An or $\bigcup_{j=1}^{n} A_j$, is
the event defined by $\bigcup_{j=1}^{n} A_j = \{s \in S;\ s \in A_j \text{ for at least one } j = 1, \dots, n\}$.
So the event $\bigcup_{j=1}^{n} A_j$ occurs whenever at least one of Aj, j = 1, ..., n occurs. The
definition extends to an infinite number of events. Thus, for countably infinitely
many events Aj, j = 1, 2, ..., one has
$\bigcup_{j=1}^{\infty} A_j = \{s \in S;\ s \in A_j \text{ for at least one } j = 1, 2, \dots\}$.
The intersection of the events Aj, j = 1, ..., n is the event denoted by A1 ∩ ... ∩ An
or $\bigcap_{j=1}^{n} A_j$ and is defined by
$\bigcap_{j=1}^{n} A_j = \{s \in S;\ s \in A_j \text{ for all } j = 1, \dots, n\}$.
Thus, $\bigcap_{j=1}^{n} A_j$ occurs whenever all Aj, j = 1, ..., n occur simultaneously. This
definition extends to an infinite number of events. Thus, for countably infinitely
many events Aj, j = 1, 2, ..., one has
$\bigcap_{j=1}^{\infty} A_j = \{s \in S;\ s \in A_j \text{ for all } j = 1, 2, \dots\}$.
If A1 ∩ A2 = ∅, the events A1 and A2 are called disjoint. The events Aj,
j = 1, 2, ..., are said to be mutually or pairwise disjoint if Ai ∩ Aj = ∅ whenever
i ≠ j.
The differences A1 − A2 and A2 − A1 are the events defined by A1 − A2 =
{s ∈ S; s ∈ A1, s ∉ A2}, A2 − A1 = {s ∈ S; s ∈ A2, s ∉ A1}.
From the definition of the preceding operations, the following properties follow
immediately:
1. S^c = ∅, ∅^c = S, (A^c)^c = A.
2. S ∪ A = S, ∅ ∪ A = A, A ∪ A^c = S, A ∪ A = A.
3. S ∩ A = A, ∅ ∩ A = ∅, A ∩ A^c = ∅, A ∩ A = A.
4. Commutative laws:
A1 ∪ A2 = A2 ∪ A1
A1 ∩ A2 = A2 ∩ A1
5. Associative laws:
A1 ∪ (A2 ∪ A3) = (A1 ∪ A2) ∪ A3
A1 ∩ (A2 ∩ A3) = (A1 ∩ A2) ∩ A3
6. An identity:
∪j Aj = A1 ∪ (A1^c ∩ A2) ∪ (A1^c ∩ A2^c ∩ A3) ∪ ...
7. Distributive laws:
A ∩ (∪j Aj) = ∪j (A ∩ Aj)
A ∪ (∩j Aj) = ∩j (A ∪ Aj)
8. De Morgan's laws:
(∪j Aj)^c = ∩j Aj^c
(∩j Aj)^c = ∪j Aj^c
In the last relations, when the range of the index j is not indicated explicitly,
it is assumed to be a finite set, such as 1, ..., n, or a countably infinite set, such as
1, 2, ....
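The set identities above can be checked mechanically on a small finite sample space. The following is a minimal sketch using Python's built-in set type; the sample space S and the events A1, A2, A3 are arbitrary illustrative choices, not taken from the text.

```python
# Illustrative sample space and events (assumed values, chosen only for this demo).
S = set(range(1, 11))
A1, A2, A3 = {1, 2, 3, 4}, {3, 4, 5, 6}, {6, 7, 8}
events = [A1, A2, A3]

def complement(A):
    """Complement of A relative to the sample space S."""
    return S - A

union = set().union(*events)
intersection = set(S).intersection(*events)

# De Morgan's laws: (∪ Aj)^c = ∩ Aj^c and (∩ Aj)^c = ∪ Aj^c.
de_morgan_1 = complement(union) == set(S).intersection(*(complement(A) for A in events))
de_morgan_2 = complement(intersection) == set().union(*(complement(A) for A in events))

# Identity 6: the union rewritten as a union of pairwise disjoint pieces.
pieces = [A1, complement(A1) & A2, complement(A1) & complement(A2) & A3]
disjoint_union_ok = set().union(*pieces) == union
pieces_disjoint = all(not (p & q) for i, p in enumerate(pieces) for q in pieces[i + 1:])

print(de_morgan_1, de_morgan_2, disjoint_union_ok, pieces_disjoint)
```

A check on one example is not a proof, but it is a quick way to catch a mis-stated identity.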
Formally, a random variable, to be shortened to r.v., is simply a function
defined on a sample space S and taking values in the real line R = (−∞, ∞).
Random variables are denoted by capital letters, such as X, Y , Z, with or without
subscripts. Thus, the value of the r.v. X at the sample point s is X(s), and the
set of all values of X, that is, the range of X, is usually denoted by X(S).
Two kinds of r.v.’s emerge: discrete r.v.’s (or r.v.’s of the discrete type), and
continuous r.v.’s (or r.v.’s of the continuous type). An r.v. X is called discrete if X
takes on countably many values; i.e., either finitely many values such as x1, ..., xn,
or countably infinitely many values such as x1, x2, .... On the other hand, X is
called continuous (or of the continuous type) if X takes all values in a proper
interval I ⊆ R.
Example 2.1
Let S = {(x, y) ∈ R^2; −3 ≤ x ≤ 3, 0 ≤ y ≤ 4, x and y integers} and define the r.v.
X by: X((x, y)) = x + y. Determine the values of X, as well as the following
events: (X ≤ 2), (3 < X ≤ 5), (X > 6).
Solution
X takes on the values: -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7.
(X ≤ 2) = { (-3, 0), (-3, 1), (-3, 2), (-3, 3), (-3, 4), (-2, 0), (-2, 1), (-2, 2), (-2,
3), (-2, 4), (-1, 0), (-1, 1), (-1, 2), (-1, 3), (0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0) }
(3 < X ≤ 5) = (4 ≤ X ≤ 5) = (X = 4 or X = 5) = { (0, 4), (1, 3), (1, 4), (2,
2), (2, 3), (3, 1), (3, 2) }
(X > 6) = (X = 7) = { (3, 4) }
△
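Events defined through a random variable can be listed by brute force when the sample space is finite. The sketch below enumerates the sample space of Example 2.1; the helper names X and event are illustrative choices.

```python
from itertools import product

# Sample space of Example 2.1: integer points with -3 <= x <= 3 and 0 <= y <= 4.
S = list(product(range(-3, 4), range(0, 5)))

def X(point):
    """The random variable X((x, y)) = x + y."""
    x, y = point
    return x + y

def event(condition):
    """Return the event {s in S : condition(X(s)) holds} as a sorted list."""
    return sorted(s for s in S if condition(X(s)))

values = sorted({X(s) for s in S})
print(values)
print(event(lambda v: v <= 2))
print(event(lambda v: 3 < v <= 5))
print(event(lambda v: v > 6))
```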
Probability is a function, denoted by P , defined for each event of a sample
space S , taking on values in the real line R, and satisfying the following three
properties:
(P1) P(A) ≥ 0 for every event A (nonnegativity of P).
(P2) P(S) = 1 (P is normed).
(P3) For countably infinitely many pairwise disjoint events Ai, i = 1, 2, ...,
that is, Ai ∩ Aj = ∅ for i ≠ j, it holds that
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i).$$
Next, we present some basic results following immediately from the defining
properties of the probability:
1. For an empty set P (∅) = 0
Proof
From the obvious fact that S = S ∪ ∅ ∪ ∅ ∪ ... and property (P3),
P (S) = P (S ∪ ∅ ∪ ∅ ∪ ...) = P (S) + P (∅) + P (∅) + ...
or P (∅) + P (∅) + ... = 0. By (P1), this can only happen when P (∅) = 0.
△
2. For any pairwise disjoint events A1, ..., An, $P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)$.
Proof
Take Ai = ∅ for i ≥ n + 1, consider the following obvious relation, and use (P3)
and #1 to obtain:
$$P\left(\bigcup_{i=1}^{n} A_i\right) = P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i) = \sum_{i=1}^{n} P(A_i).$$
△
3. For any event A, P(A^c) = 1 − P(A).
Proof
From (P2) and #2, P(A ∪ A^c) = P(S) = 1, or P(A) + P(A^c) = 1, so that
P(A^c) = 1 − P(A).
△
4. A1 ⊆ A2 implies P (A1 ) ≤ P (A2 ) and P (A2 − A1 ) = P (A2 ) − P (A1 ).
Proof
The relation A1 ⊆ A2 implies A2 = A1 ∪ (A2 − A1), where A1 and A2 − A1 are
disjoint, so that, by #2, P(A2) = P(A1) + P(A2 − A1). Solving for P(A2 − A1), we
obtain P(A2 − A1) = P(A2) − P(A1), so that, by (P1), P(A1) ≤ P(A2). Note
that P(A2 − A1) need not equal P(A2) − P(A1) if A1 is not contained in A2.
△
5. 0 ≤ P (A) ≤ 1 for every event A.
Proof
Clearly, ∅ ⊆ A ⊆ S for any event A. Then (P1), #1 and #4 give:
0 = P (∅) ≤ P (A) ≤ P (S) = 1.
△
6. (a) For any two events A1 and A2 :
P (A1 ∪ A2 ) = P (A1 ) + P (A2 ) − P (A1 ∩ A2 ).
Proof
It is clear (by means of a Venn diagram, for example) that
A1 ∪ A2 = A1 ∪ (A2 ∩ A1^c) = A1 ∪ (A2 − (A1 ∩ A2)),
where the two events on the right-hand side are disjoint and A1 ∩ A2 ⊆ A2.
Then, by means of #2 and #4:
P(A1 ∪ A2) = P(A1) + P(A2 − (A1 ∩ A2)) = P(A1) + P(A2) − P(A1 ∩ A2).
△
6. (b) For any three events A1 , A2 , and A3 :
P (A1 ∪ A2 ∪ A3 ) = P (A1 ) + P (A2 ) + P (A3 ) − (P (A1 ∩ A2 )
+P (A1 ∩ A3 ) + P (A2 ∩ A3 )) + P (A1 ∩ A2 ∩ A3 ).
Proof
Apply part (a) to obtain:
P (A1 ∪ A2 ∪ A3 ) = P [(A1 ∪ A2 ) ∪ A3 ]
= P (A1 ∪ A2 ) + P (A3 ) − P [(A1 ∪ A2 ) ∩ A3 ]
= P (A1 ) + P (A2 ) − P (A1 ∩ A2 ) + P (A3 ) − P [(A1 ∩ A3 ) ∪ (A2 ∩ A3 )]
= P (A1 ) + P (A2 ) + P (A3 ) − P (A1 ∩ A2 )
− [P (A1 ∩ A3 ) + P (A2 ∩ A3 ) − P (A1 ∩ A2 ∩ A3 )]
= P (A1 ) + P (A2 ) + P (A3 ) − P (A1 ∩ A2 ) − P (A1 ∩ A3 ) − P (A2 ∩ A3 ) + P (A1 ∩ A2 ∩ A3 ).
△
7. For any events A1, A2, ..., $P\left(\bigcup_{i=1}^{\infty} A_i\right) \le \sum_{i=1}^{\infty} P(A_i)$, and
$P\left(\bigcup_{i=1}^{n} A_i\right) \le \sum_{i=1}^{n} P(A_i)$.
Proof
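By the identity given at the start of the section (property 6 of the set operations) and (P3):
$$\begin{aligned}
P\left(\bigcup_{i=1}^{\infty} A_i\right) &= P\left(A_1 \cup (A_1^c \cap A_2) \cup \dots \cup (A_1^c \cap \dots \cap A_{n-1}^c \cap A_n) \cup \dots\right) \\
&= P(A_1) + P(A_1^c \cap A_2) + \dots + P(A_1^c \cap \dots \cap A_{n-1}^c \cap A_n) + \dots \\
&\le P(A_1) + P(A_2) + \dots + P(A_n) + \dots \quad \text{(by \#4)}.
\end{aligned}$$
For the finite case:
$$\begin{aligned}
P\left(\bigcup_{i=1}^{n} A_i\right) &= P\left(A_1 \cup (A_1^c \cap A_2) \cup \dots \cup (A_1^c \cap \dots \cap A_{n-1}^c \cap A_n)\right) \\
&= P(A_1) + P(A_1^c \cap A_2) + \dots + P(A_1^c \cap \dots \cap A_{n-1}^c \cap A_n) \\
&\le P(A_1) + P(A_2) + \dots + P(A_n).
\end{aligned}$$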
△
Generalization of property #6 to more than three events is given as a theorem
below:
Theorem 2.1
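The probability of the union of any n events, A1, ..., An, is given by:
$$\begin{aligned}
P\left(\bigcup_{j=1}^{n} A_j\right) = {} & \sum_{j=1}^{n} P(A_j) - \sum_{1 \le j_1 < j_2 \le n} P(A_{j_1} \cap A_{j_2}) \\
& + \sum_{1 \le j_1 < j_2 < j_3 \le n} P(A_{j_1} \cap A_{j_2} \cap A_{j_3}) - \dots \\
& + (-1)^{n+1} P(A_1 \cap \dots \cap A_n).
\end{aligned}$$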
First, sum up the probabilities of the individual events, then subtract the probabilities of the intersections of the events, taken two at a time (in the ascending
order of indices), then add the probabilities of the intersections of the events,
taken three at a time as before, and continue like this until you add or subtract
(depending on n) the probability of the intersection of all n events.
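The alternating sum of Theorem 2.1 translates directly into a loop over all sub-collections of events. The following is a sketch under the equal-likelihood model; the sample space and the four events are hypothetical data chosen only to compare the formula with a direct computation of the union.

```python
from fractions import Fraction
from itertools import combinations

def prob(event, sample_space):
    """Probability under the equal-likelihood model: |event| / |S|."""
    return Fraction(len(event), len(sample_space))

def union_probability(events, sample_space):
    """P(A1 ∪ ... ∪ An) computed with the inclusion-exclusion formula."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = (-1) ** (k + 1)  # add for odd k, subtract for even k
        for subset in combinations(events, k):
            total += sign * prob(set.intersection(*subset), sample_space)
    return total

# Hypothetical data for the check.
S = set(range(1, 13))
events = [{1, 2, 3, 4}, {3, 4, 5, 6, 7}, {1, 7, 8}, {2, 4, 6, 8, 10}]

direct = prob(set.union(*events), S)
via_formula = union_probability(events, S)
print(direct, via_formula)
```

The formula visits 2^n − 1 intersections, so for a finite sample space the direct union is cheaper; the formula is mainly useful when only the intersection probabilities are known.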
Example 2.2
Students in a certain college subscribe to three news magazines A, B, and C
according to the following proportions: A: 20%, B : 15%, C : 10%, both A and
B : 5%, both A and C : 4%, both B and C : 3%, all three A, B, and C : 2%. If a
student is chosen at random, what is the probability he/she subscribes to none of
the news magazines?
Solution
P(A^c ∩ B^c ∩ C^c) = P((A ∪ B ∪ C)^c) = 1 − P(A ∪ B ∪ C)
= 1 − [P(A) + P(B) + P(C) − P(A ∩ B)
− P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)]
= 1 − (0.20 + 0.15 + 0.10 − 0.05 − 0.04 − 0.03 + 0.02)
= 1 − 0.35 = 0.65
△
Answer: 0.65
We first briefly describe two somewhat classical methods for assigning probabilities to events: the equal likelihood model and the relative frequency
method. When we have an experiment where each of n outcomes is equally likely,
we assign a probability mass of 1/n to each outcome. When the equal likelihood assumption is not valid, the relative frequency method can be used.
With this technique, we conduct the experiment n times and record the outcomes.
The probability of event E is then assigned by P (E) = f /n, where f denotes the number
of experimental outcomes that satisfy event E.
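The relative frequency method can be sketched as a short simulation. The experiment below (rolling a fair six-sided die, with E = "an even number shows") is a hypothetical example, chosen because the equal-likelihood value 3/6 is available for comparison; the function name and the fixed seed are assumptions made for reproducibility.

```python
import random

def relative_frequency(experiment, in_event, n, seed=0):
    """Estimate P(E) as f/n, where f counts how many of n runs land in E."""
    rng = random.Random(seed)
    f = sum(1 for _ in range(n) if in_event(experiment(rng)))
    return f / n

def roll(rng):
    """One run of the hypothetical experiment: a fair six-sided die."""
    return rng.randint(1, 6)

estimate = relative_frequency(roll, lambda outcome: outcome % 2 == 0, n=100_000)
print(estimate)  # close to the equal-likelihood value 3/6 = 0.5
```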
Example 2.3
Consider a well-shuffled deck of 52 cards, and suppose we draw at random three
cards. What is the probability that at least one is an ace?
Solution
Let A be the required event, and let Ai be defined by:
Ai = "exactly i cards are aces," i = 0, 1, 2, 3. Then,
P(A) = P(A1 ∪ A2 ∪ A3) = 1 − P(A0),
$$P(A_0) = \frac{48}{52} \times \frac{47}{51} \times \frac{46}{50} = \frac{4{,}324}{5{,}525},$$
so that P(A) = 1,201/5,525 ≈ 0.217.
△
Answer: 0.217
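The arithmetic of Example 2.3 can be reproduced with exact fractions. As a cross-check, the sketch below also computes P(A0) by counting three-card hands with math.comb (available from Python 3.8); the counting route is an alternative not used in the text.

```python
from fractions import Fraction
from math import comb

# P(A0): none of the 3 drawn cards is an ace, computed in two equivalent ways.
p_no_ace_sequential = Fraction(48, 52) * Fraction(47, 51) * Fraction(46, 50)
p_no_ace_counting = Fraction(comb(48, 3), comb(52, 3))

# Complement rule: P(at least one ace) = 1 - P(no ace).
p_at_least_one_ace = 1 - p_no_ace_sequential
print(p_no_ace_sequential, p_at_least_one_ace, round(float(p_at_least_one_ace), 3))
```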
Another way to find the desired probability that an event occurs is to use a
probability density function (pdf) which will be discussed in Chapters 4 and
5.
Exercises
Exercise 2.1
In terms of the events A1 , A2 , A3 in a sample space S and their complements,
express the following events:
1. B0 = {s ∈ S; s belongs to none of A1 , A2 , A3 },
2. B1 = {s ∈ S; s belongs to exactly one of A1 , A2 , A3 },
3. B2 = {s ∈ S; s belongs to exactly two of A1 , A2 , A3 },
4. B3 = {s ∈ S; s belongs to all of A1 , A2 , A3 },
5. C = {s ∈ S; s belongs to at most two of A1 , A2 , A3 },
6. D = {s ∈ S; s belongs to at least one of A1 , A2 , A3 }.
Solution
1. B0 = A1^c ∩ A2^c ∩ A3^c
2. B1 = (A1 ∩ A2^c ∩ A3^c) ∪ (A1^c ∩ A2 ∩ A3^c) ∪ (A1^c ∩ A2^c ∩ A3)
3. B2 = (A1 ∩ A2 ∩ A3^c) ∪ (A1 ∩ A2^c ∩ A3) ∪ (A1^c ∩ A2 ∩ A3)
4. B3 = A1 ∩ A2 ∩ A3
5. C = B0 ∪ B1 ∪ B2
6. D = B1 ∪ B2 ∪ B3 = A1 ∪ A2 ∪ A3
△
Exercise 2.2
If the events A, B, and C are related as follows: A ⊂ B ⊂ C and P(A) = 1/4,
P(B) = 5/12, and P(C) = 7/12, compute the probabilities of the following events:
1. A^c ∩ B
2. A ∩ B^c ∩ C^c
3. A^c ∩ B^c ∩ C^c
Solution
1. A^c ∩ B = B ∩ A^c = B − A and A ⊂ B. Therefore P(A^c ∩ B) = P(B − A) =
P(B) − P(A) = 5/12 − 1/4 = 1/6 ≈ 0.167.
2. A ∩ B^c ∩ C^c = A ∩ (B^c ∩ C^c) = A ∩ (B ∪ C)^c = A ∩ C^c = A − C = ∅, so that
P(A ∩ B^c ∩ C^c) = 0.
3. A^c ∩ B^c ∩ C^c = (A ∪ B ∪ C)^c = C^c, so that P(A^c ∩ B^c ∩ C^c) = P(C^c) =
1 − P(C) = 1 − 7/12 = 5/12 ≈ 0.417.
△
Answer: 0.167; 0; 0.417.
Exercise 2.3
Let A and B be the respective events that two contracts I and II, say, are completed
by certain deadlines, and suppose that: P (at least one contract is completed by
its deadline) = 0.9 and P ( both contracts are completed by their deadlines) = 0.5.
Calculate the probability: P (exactly one contract is completed by its deadline).
Solution
We have P(A ∪ B) = 0.9 and P(A ∩ B) = 0.5.
We need to calculate
P((A ∩ B^c) ∪ (A^c ∩ B)) = P(A ∩ B^c) + P(A^c ∩ B).
Clearly, A = (A ∩ B) ∪ (A ∩ B^c) and B = (A ∩ B) ∪ (A^c ∩ B), so that P(A) =
P(A ∩ B) + P(A ∩ B^c) and P(B) = P(A ∩ B) + P(A^c ∩ B). Hence, P(A ∩ B^c) =
P(A) − P(A ∩ B) and P(A^c ∩ B) = P(B) − P(A ∩ B). Then P(A ∩ B^c) + P(A^c ∩
B) = P(A) + P(B) − 2P(A ∩ B) = [P(A) + P(B) − P(A ∩ B)] − P(A ∩ B) =
P(A ∪ B) − P(A ∩ B) = 0.9 − 0.5 = 0.4.
△
Answer: 0.4
Exercise 2.4
A four-sided die has the numbers 1 through 4 written on its sides, one on each
side. If the die is rolled twice:
1. Write out a suitable sample space S.
2. If X is the r.v. denoting the sum of numbers appearing, determine the values
of X.
3. Determine the events: X ≤ 3, 2 ≤ X < 5, X > 8.
Solution
1. S = { (1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (2, 3), (2, 4), (3, 1), (3, 2),
(3, 3), (3, 4), (4, 1), (4, 2), (4, 3), (4, 4) }
2. The values of X are: 2, 3, 4, 5, 6, 7, 8.
3. (X ≤ 3) = (X = 2 or X = 3) = {(1, 1), (1, 2), (2, 1)},
(2 ≤ X < 5) = (2 ≤ X ≤ 4) = (X = 2 or X = 3 or X = 4) =
{(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1)}
(X > 8) = ∅
△
2.2
Conditional probability
Conditional probability is a probability in its own right, and it is an extremely
useful tool in calculating probabilities. Essentially, it amounts to suitably modifying a sample space S, associated with a random experiment, on the evidence that
a certain event has occurred.
Example 2.4
Tossing three distinct coins once.
Then, with H and T standing for heads and tails respectively, a sample space
is:
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.
Consider the events B = "exactly 2 heads occur" = {HHT, HTH, THH}, A =
"coins #1 and #2 show heads" = {HHH, HHT}. Then P(B) = 3/8 and P(A) =
2/8 = 1/4. Now, suppose we are told that event B has occurred and we are asked to
evaluate the probability of A on the basis of this evidence.
Solution
Clearly, what really matters here is the event B, and, given that B has occurred,
the event A occurs only if the sample point HHT appeared; that is, the event
{HHT} = A ∩ B occurred. The required probability is then
$$\frac{1}{3} = \frac{1/8}{3/8} = \frac{P(A \cap B)}{P(B)} > \frac{1}{4} = P(A).$$
△
The conditional probability of an event A, given the event B with P (B) > 0,
is denoted by P (A|B) and is defined by: P (A|B) = P (A ∩ B)/P (B).
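The definition can be applied directly to the three-coin experiment of Example 2.4. The sketch below assumes the equal-likelihood model on the eight outcomes; the function names P and conditional are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for three distinct coins: 8 equally likely outcomes such as "HHT".
S = {"".join(toss) for toss in product("HT", repeat=3)}

def P(event):
    """Equal-likelihood probability of an event (a subset of S)."""
    return Fraction(len(event), len(S))

def conditional(A, B):
    """P(A|B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    if P(B) == 0:
        raise ValueError("P(B) must be positive")
    return P(A & B) / P(B)

B = {s for s in S if s.count("H") == 2}   # exactly 2 heads occur
A = {s for s in S if s.startswith("HH")}  # coins #1 and #2 show heads

print(P(A), P(B), conditional(A, B))
```

Conditioning on the whole sample space, conditional(A, S), returns P(A) again.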
Replacing B by the entire sample space S, we are led back to the (unconditional)
probability of A, as P(A ∩ S)/P(S) = P(A). Thus, the conditional probability is a
generalization of the concept of probability where S is restricted to an event B.
That the conditional probability is a probability can be seen formally as follows:
P(A|B) ≥ 0 for every A by definition;
$$P(S|B) = \frac{P(S \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1;$$
and if A1, A2, ... are pairwise disjoint, then
$$\begin{aligned}
P\left(\bigcup_{j=1}^{\infty} A_j \,\Big|\, B\right) &= \frac{P\left(\left(\bigcup_{j=1}^{\infty} A_j\right) \cap B\right)}{P(B)} = \frac{P\left(\bigcup_{j=1}^{\infty} (A_j \cap B)\right)}{P(B)} \\
&= \frac{\sum_{j=1}^{\infty} P(A_j \cap B)}{P(B)} = \sum_{j=1}^{\infty} \frac{P(A_j \cap B)}{P(B)} = \sum_{j=1}^{\infty} P(A_j|B).
\end{aligned}$$
Note, furthermore, that P(A|B) can be smaller than, larger than, or equal to
P(A). The case P(A|B) = P(A) is of special interest.
If P(A|B) = P(A), then the occurrence of event B provides no information for
reevaluating the probability of A. Under such a circumstance, it is only fitting
to say that A is independent of B. If, in addition, P(A) > 0, then B is also
independent of A because
$$P(B|A) = \frac{P(B \cap A)}{P(A)} = \frac{P(A \cap B)}{P(A)} = \frac{P(A|B)P(B)}{P(A)} = \frac{P(A)P(B)}{P(A)} = P(B).$$
Because of this symmetry, we then say that A and B are independent. From the
definition of either P (A|B) or P (B|A), it follows then that P (A∩B) = P (A)P (B).
Two events A1 and A2 are said to be independent (statistically or in the
probability sense) if P(A1 ∩ A2) = P(A1)P(A2). When P(A1 ∩ A2) ≠ P(A1)P(A2),
they are said to be dependent. The definition of independence can be extended to
any number n of events A1, ..., An by requiring any combination of the events to
be independent.
The events A1, ..., An are said to be independent (statistically or in the probability sense) if, for all possible choices of k out of n events (2 ≤ k ≤ n), the probability of their intersection equals the product of their probabilities. More formally,
for any k with 2 ≤ k ≤ n and any integers j1, ..., jk with 1 ≤ j1 < ... < jk ≤ n, we
have:
$$P\left(\bigcap_{i=1}^{k} A_{j_i}\right) = \prod_{i=1}^{k} P(A_{j_i}).$$
Example 2.5
Let S = {1, 2, 3, 4} and let P ({1}) = P ({2}) = P ({3}) = P ({4}) = 1/4. Define
the events A1 , A2 , A3 by: A1 = {1, 2}, A2 = {1, 3}, A3 = {1, 4}. Check if events
A1 , A2 and A3 are independent.
Solution
P(A1) = P(A2) = P(A3) = 1/2,
P(A1 ∩ A2) = P({1}) = 1/4 = P(A1) × P(A2),
P(A1 ∩ A3) = P({1}) = 1/4 = P(A1) × P(A3),
P(A2 ∩ A3) = P({1}) = 1/4 = P(A2) × P(A3).
However,
P(A1 ∩ A2 ∩ A3) = P({1}) = 1/4 ≠ P(A1)P(A2)P(A3) = 1/8.
The events A1, A2, A3 are pairwise independent, but they are not independent.
△
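The definition of independence for n events requires the product rule for every sub-collection of two or more events, which is easy to check exhaustively on a finite sample space. The sketch below does this for Example 2.5; the function are_independent is an illustrative helper, and math.prod requires Python 3.8 or later.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

S = {1, 2, 3, 4}

def P(event):
    """Each of the four sample points has probability 1/4."""
    return Fraction(len(event), len(S))

def are_independent(events):
    """Check the product rule for every sub-collection of size k >= 2."""
    for k in range(2, len(events) + 1):
        for subset in combinations(events, k):
            if P(set.intersection(*subset)) != prod(P(A) for A in subset):
                return False
    return True

A1, A2, A3 = {1, 2}, {1, 3}, {1, 4}
pairwise = all(are_independent([E, F]) for E, F in combinations([A1, A2, A3], 2))
print(pairwise, are_independent([A1, A2, A3]))
```

Pairwise checks alone are not enough: here every pair passes, yet the triple fails.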
We can calculate the probability of the intersection of n events, step by step,
by means of conditional probabilities using the following theorem:
Theorem 2.2 Multiplicative Theorem
For any n events A1, ..., An with $P\left(\bigcap_{j=1}^{n-1} A_j\right) > 0$, it holds:
$$P\left(\bigcap_{j=1}^{n} A_j\right) = P(A_n | A_1 \cap \dots \cap A_{n-1})\, P(A_{n-1} | A_1 \cap \dots \cap A_{n-2}) \cdots P(A_2 | A_1)\, P(A_1).$$
Proof
For n = 2, the theorem is true since P(A2|A1) = P(A1 ∩ A2)/P(A1) yields P(A1 ∩ A2) =
P(A2|A1)P(A1).
Next, assume P(A1 ∩ ... ∩ Ak) = P(Ak|A1 ∩ ... ∩ Ak−1)...P(A2|A1)P(A1) and show
that P(A1 ∩ ... ∩ Ak+1) = P(Ak+1|A1 ∩ ... ∩ Ak)P(Ak|A1 ∩ ... ∩ Ak−1)...P(A2|A1)P(A1).
Indeed, P(A1 ∩ ... ∩ Ak+1) = P((A1 ∩ ... ∩ Ak) ∩ Ak+1) = P(Ak+1|A1 ∩ ... ∩ Ak)P(A1 ∩
... ∩ Ak) = P(Ak+1|A1 ∩ ... ∩ Ak)P(Ak|A1 ∩ ... ∩ Ak−1)...P(A2|A1)P(A1) by the
induction hypothesis.
△
Example 2.6
An urn contains 10 identical balls of which 5 are black, 3 are red, and 2 are white.
Four balls are drawn one at a time and without replacement. Find the probability
that the first ball is black, the second red, the third white, and the fourth black.
Solution
Denoting by B1 the event that the first ball is black, and likewise for R2 , W3 , and
B4 , the required probability is:
P (B1 ∩ R2 ∩ W3 ∩ B4 ) = P (B4 |B1 ∩ R2 ∩ W3 )P (W3 |B1 ∩ R2 )P (R2 |B1 )P (B1 ).
Assuming equally likely outcomes at each step, we have:
P(B1) = 5/10, P(R2|B1) = 3/9, P(W3|B1 ∩ R2) = 2/8, P(B4|B1 ∩ R2 ∩ W3) = 4/7. Therefore,
P(B1 ∩ R2 ∩ W3 ∩ B4) = 4/7 × 2/8 × 3/9 × 5/10 = 1/42 ≈ 0.024.
△
Answer: 0.024
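The step-by-step use of the multiplicative theorem in Example 2.6 can be sketched as a loop that updates the urn after each draw. The function name and the dictionary representation of the urn are illustrative assumptions; each draw is assumed equally likely among the balls remaining.

```python
from fractions import Fraction

def sequence_probability(urn, colors):
    """P(the colors are drawn in this order) without replacement.

    Multiplies the conditional probability of each draw given the
    draws before it, as in the multiplicative theorem.
    """
    remaining = dict(urn)
    p = Fraction(1)
    for color in colors:
        total = sum(remaining.values())
        p *= Fraction(remaining[color], total)
        remaining[color] -= 1  # the drawn ball is not replaced
    return p

urn = {"black": 5, "red": 3, "white": 2}
p = sequence_probability(urn, ["black", "red", "white", "black"])
print(p, round(float(p), 3))
```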
The events {A1, A2, ..., An} form a partition of S if these events are pairwise
disjoint, Ai ∩ Aj = ∅, i ≠ j, and their union is S, $\bigcup_{j=1}^{n} A_j = S$. Then it is obvious
that any event B in S may be expressed as follows, in terms of a partition of S;
namely, $B = \bigcup_{j=1}^{n} (A_j \cap B)$. Since the events Aj ∩ B are pairwise disjoint, it
follows that $P(B) = \sum_{j=1}^{n} P(A_j \cap B)$.
The concept of partition is defined similarly for countably infinitely many events,
and the probability P(B) is expressed likewise. In the sequel, by writing j = 1, 2, ...
and $\sum_j$ we mean to include both cases, finitely many indices and countably infinitely
many indices.
Thus, we have the following result.
Theorem 2.3 Total Probability Theorem
Let {A1 , A2 , ...} be a partition of S, and let P (Aj ) > 0 for all j. Then, for any
event B,
$$P(B) = \sum_{j} P(B|A_j)P(A_j).$$
The significance of this result is that, if it happens that we know the probabilities of the partitioning events P (Aj ), as well as the conditional probabilities of
B, given Aj , then these quantities may be combined, according to the preceding
formula, to produce the probability P (B).
Example 2.7
The proportion of motorists in a given gas station using regular unleaded gasoline,
extra unleaded, and premium unleaded over a specified period of time are 40%,
35%, and 25%, respectively. The respective proportions of filling their tanks are
30%, 50%, and 60%. What is the probability that a motorist selected at random
from among the patrons of the gas station under consideration and for the specified
period of time will fill his/her tank?
Solution
Denote by R, E, and P the events of a motorist using unleaded gasoline which is
regular, extra unleaded, and premium, respectively, and by F the event of having
the tank filled. Then the translation into terms of probabilities of the proportions
given above is:
P (R) = 0.40, P (E) = 0.35, P (P ) = 0.25,
P (F |R) = 0.30, P (F |E) = 0.50, P (F |P ) = 0.60.
Then the required probability is:
P (F ) = P (F |R)P (R) + P (F |E)P (E) + P (F |P )P (P )
= 0.30 × 0.40 + 0.50 × 0.35 + 0.60 × 0.25 = 0.445.
△
Answer: 0.445.
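The Total Probability Theorem is a weighted sum, which the following sketch applies to the data of Example 2.7. The function name and the dictionary keys are illustrative; the check that the priors sum to 1 reflects the partition assumption.

```python
def total_probability(priors, likelihoods):
    """P(B) = sum over j of P(B|A_j) P(A_j) for a partition {A_j}.

    priors[j] is P(A_j); likelihoods[j] is P(B|A_j).
    """
    if abs(sum(priors.values()) - 1.0) > 1e-9:
        raise ValueError("the priors of a partition must sum to 1")
    return sum(likelihoods[j] * priors[j] for j in priors)

priors = {"regular": 0.40, "extra": 0.35, "premium": 0.25}  # P(R), P(E), P(P)
fill = {"regular": 0.30, "extra": 0.50, "premium": 0.60}    # P(F|R), P(F|E), P(F|P)
p_fill = total_probability(priors, fill)
print(round(p_fill, 3))
```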
Exercises
Exercise 2.5
For the events A, B, C and their complements, suppose that: P(A ∩ B ∩ C) = 1/16,
P(A^c ∩ B ∩ C) = 2/16, P(A ∩ B^c ∩ C) = 5/16, P(A ∩ B ∩ C^c) = 3/16,
P(A ∩ B^c ∩ C^c) = 2/16, P(A^c ∩ B ∩ C^c) = 1/16, P(A^c ∩ B^c ∩ C) = 1/16,
and P(A^c ∩ B^c ∩ C^c) = 1/16.
1. Calculate the probabilities: P (A), P (B), P (C).
2. Determine whether or not the events A, B, and C are independent.
3. Determine whether or not the events A and B are independent.
Solution
1. A = (A ∩ B ∩ C) ∪ (A ∩ B^c ∩ C) ∪ (A ∩ B ∩ C^c) ∪ (A ∩ B^c ∩ C^c) and hence
P(A) = 0.6875. Likewise, P(B) = 0.4375, P(C) = 0.5625.
2. 0.0625 = 1/16 = P(A ∩ B ∩ C) ≠ P(A)P(B)P(C) ≈ 0.1692. Thus A, B, and
C are not independent.
3. P(A ∩ B) = P(A ∩ B ∩ C) + P(A ∩ B ∩ C^c) = 1/16 + 3/16 = 1/4 ≠ P(A)P(B) ≈ 0.3008,
so A and B are not independent.
△
Answer: 0.6875, 0.4375, 0.5625.
2.3
Bayes theorem
The question arises of whether experimentation may lead to reevaluation of the
prior probabilities on the basis of new evidence. To put it more formally, is it
possible to use P (Aj ) and P (B|Aj ), j = 1, 2, ... in order to calculate P (Aj |B)?
Theorem 2.4 Bayes Formula
Let {A1, A2, ...} be a partition of S, and let P(Aj) > 0 for all j. Then, for any
event B with P(B) > 0 and any j = 1, 2, ...:
$$P(A_j|B) = \frac{P(B|A_j)P(A_j)}{\sum_{i} P(B|A_i)P(A_i)}.$$
Proof
Indeed, $P(A_j|B) = \frac{P(A_j \cap B)}{P(B)} = \frac{P(B|A_j)P(A_j)}{P(B)}$, and then the Total Probability Theorem
completes the proof.
△
Example 2.8
A student sits a multiple-choice exam. Consider a single question with 5 possible
answers. Let C be the event that the student answers the question correctly.
Suppose you think the student has not been revising properly and that there is
only a 30% chance that he will know the answer. Call this event K, i.e. P(K) =
0.30. Note that C ≠ K: even if he does know the right answer, under exam
conditions he may give the wrong answer (although this is unlikely), and if he does
not know the right answer he might still answer correctly by guessing. Suppose you
assess P(C | K) = 0.95 and P(C | K^c) = 0.20 (since there are 5 answers, a
guess will be correct 20% of the time). Suppose the candidate answers the
question correctly. What is the probability that he knew the right answer?
Solution
P(C) = P(C|K)P(K) + P(C|K^c)P(K^c) = 0.95 × 0.3 + 0.2 × 0.7 = 0.425
$$P(K|C) = \frac{P(C|K)P(K)}{P(C)} = \frac{0.95 \times 0.3}{0.425} \approx 0.671$$
△
Answer: 0.671
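When the partition consists of a hypothesis and its complement, Bayes' formula reduces to a two-term denominator. The sketch below applies it to the numbers of Example 2.8; the function and parameter names are illustrative.

```python
def bayes_two_hypotheses(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H|E) for a hypothesis H and its complement H^c."""
    # Total probability of the evidence over the partition {H, H^c}.
    p_evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / p_evidence

# H = K (the student knows the answer), E = C (the answer is correct).
p_knew = bayes_two_hypotheses(prior=0.30,
                              p_evidence_given_h=0.95,
                              p_evidence_given_not_h=0.20)
print(round(p_knew, 3))
```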
Exercises
Exercise 2.6
Three machines I, II, and III manufacture 30%, 30%, and 40%, respectively, of
the total output of certain items. Of them, 4%, 3%, and 2%, respectively, are
defective. One item is drawn at random from the total output and is tested.
1. What is the probability that the item is defective?
2. If it is found to be defective, what is the probability the item was produced
by machine I?
3. Same question as in part 2 for each one of the machines II and III.
Solution
1. P(D) = P(I)P(D|I) + P(II)P(D|II) + P(III)P(D|III) = 0.3 × 0.04 + 0.3 ×
0.03 + 0.4 × 0.02 = 0.029;
2. P(I|D) = P(D|I)P(I)/P(D) = (0.04 × 0.3)/0.029 ≈ 0.414;
3. P(II|D) = P(D|II)P(II)/P(D) = (0.03 × 0.3)/0.029 ≈ 0.310,
P(III|D) = P(D|III)P(III)/P(D) = (0.02 × 0.4)/0.029 ≈ 0.276.
△
Answer: 0.029; 0.414; 0.310; 0.276
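For a partition with more than two events, all posterior probabilities can be obtained in one pass by normalizing the products P(B|Aj)P(Aj). The sketch below does this for the three machines of Exercise 2.6; the function name posterior and the dictionary layout are illustrative assumptions.

```python
def posterior(priors, likelihoods):
    """Bayes' formula over a partition: returns (P(B), {j: P(A_j|B)})."""
    joint = {j: likelihoods[j] * priors[j] for j in priors}  # P(B ∩ A_j)
    p_b = sum(joint.values())                                # total probability
    return p_b, {j: joint[j] / p_b for j in joint}

priors = {"I": 0.30, "II": 0.30, "III": 0.40}        # share of total output
defect_rate = {"I": 0.04, "II": 0.03, "III": 0.02}   # P(D | machine)
p_defective, post = posterior(priors, defect_rate)
print(round(p_defective, 3), {m: round(p, 3) for m, p in post.items()})
```

The posteriors sum to 1, which is a convenient sanity check on hand calculations.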
Exercise 2.7
A bag contains 3 red and 4 white balls, a second bag contains 1 red and 5 white
balls. A bag is selected at random. What is the probability that a ball drawn from
this bag is white? If the ball is white what is the probability that the first bag was
selected?
Solution
Let R be the event that we draw a red ball, W that we draw a white ball, F that
we select the first bag, and S that we select the second bag. Then:
P(R|F) = 3/7, P(W|F) = 4/7, P(R|S) = 1/6, P(W|S) = 5/6.
Also, we select a bag at random, so each bag is equally likely to be picked:
P(F) = 1/2, P(S) = 1/2.
We require P(W) and P(F|W):
P(W) = P(W|F)P(F) + P(W|S)P(S) = 4/7 × 1/2 + 5/6 × 1/2 = 59/84.
Applying Bayes’ theorem:
$$P(F|W) = \frac{P(W|F)P(F)}{P(W)} = \frac{4/7 \times 1/2}{59/84} = \frac{24}{59}.$$
△
Answer: 59/84; 24/59.
Exercise 2.8
Bag A contains two white and two black balls; Bag B contains three white and
two black balls. One ball is drawn from A and is transferred to B, one ball is
then drawn from B and turns out to be white. What is the probability that the
transferred ball was white?
Solution
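Let W = "ball drawn is white", T = "ball transferred is white". If a white ball is
transferred, bag B holds four white and two black balls; if a black ball is
transferred, it holds three white and three black balls. Hence we know
P(W|T) = 2/3, P(W|T^c) = 1/2, P(T) = 1/2.
We require P(T|W). By Bayes’ theorem,
$$P(T|W) = \frac{P(W|T)P(T)}{P(W|T)P(T) + P(W|T^c)P(T^c)} = \frac{2/3 \times 1/2}{2/3 \times 1/2 + 1/2 \times 1/2} = \frac{4}{7}.$$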
△
Answer: 4/7.
Exercise 2.9
A screening programme for a disease is proposed. It is thought that 1 in 10000
of the population has the disease. The screening test gives either a positive or
negative result with
P(+|D) = 0.98
P(+|D^c) = 0.05
where D denotes the event "the disease is present".
Suppose an individual gives a positive result. What is the probability he/she has
the disease, i.e. what is P (D|+)?
Solution
Applying Bayes’ rule with P(D) = 0.0001:
$$P(D|+) = \frac{P(+|D)P(D)}{P(+|D)P(D) + P(+|D^c)P(D^c)} = \frac{0.98 \times 0.0001}{0.98 \times 0.0001 + 0.05 \times 0.9999} \approx 1.96 \times 10^{-3}$$
△
Answer: 1.96 × 10^{-3}