Chapter 3 Lecture Notes
Conditional Probability and Independence
October 1, 2015
MATH 305-02 – Probability
Lecture Notes
October 1, 2015
Sections 3.1 – 3.2
Conditional Probabilities
Conditional probabilities are one of the most important concepts in probability theory. In most
cases, partial information is known before we compute the probability of an event. This means the
probability is based on some condition; hence, conditional probability.
Example Suppose we roll two dice. What is the probability that the dice sums to 8? The sample
space is
S = {(i, j) | i, j = 1, 2, . . . , 6} ,
which has 36 elements. The event E = {(i, j) | i + j = 8} contains 5 elements. Since any roll of the
dice is equally likely, it follows that
P(E) = 5/36.
Now, suppose we know that the first dice rolled was a 3. What is the probability that the sum is
8? In this case, our sample space is different:
S = {(3, j) | j = 1, 2, . . . , 6} ,
which only has 6 elements. Furthermore, each element in the space is equally likely with probability
1/6. Since only rolling a 5 with the second dice will yield an 8 in this case, it follows that the
probability of rolling an 8 given that the first roll is a 3 is
P(8 is rolled | 3 was rolled first) = 1/6.
We note that the probability is actually a little bit higher in this case.
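Both counts in this example can be verified by brute-force enumeration. A minimal Python sketch (the variable names are our own, for illustration):

```python
from fractions import Fraction

# Enumerate the 36 equally likely outcomes of rolling two dice.
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]

# Unconditional probability that the dice sum to 8.
sum8 = [o for o in outcomes if o[0] + o[1] == 8]
p_sum8 = Fraction(len(sum8), len(outcomes))            # 5/36

# Conditional probability: restrict to the reduced sample space
# where the first die shows 3, then count outcomes summing to 8.
first3 = [o for o in outcomes if o[0] == 3]
p_sum8_given_3 = Fraction(len([o for o in first3 if o[0] + o[1] == 8]),
                          len(first3))                 # 1/6
```

Restricting the list of outcomes is exactly the "reduced sample space" idea used throughout these notes.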
In general, if E and F are two events of an experiment, then the conditional probability that
E occurs given that F has already occurred is denoted by
P (E | F ).
We can derive a formula for this probability. Suppose F has occurred. Then for E to occur, it is
necessary that the occurrence is in both E and F , so it is in EF . Since F has occurred, the event
F becomes our new sample space, and we need to see how often E occurs within this new space.
Hence, the probability that EF occurs will equal P (EF ) relative to P (F ). Therefore, we arrive at
the following definition:
Definition If P(F) > 0, then
P(E | F) = P(EF)/P(F).
Example (3.1) Two fair dice are rolled. What is the conditional probability that at least one
lands on 6 given that the dice land on different numbers? Here, we let E be the event that at least
one dice lands on 6 and let F be the event that the two numbers are different. Then we seek the
P(E | F), which is given by
P(E | F) = P(EF)/P(F).
We can compute each probability on the right hand side. We first note that the sample space has
36 elements, each equally likely. Consider the event E, which is the event that at least one dice is
a six. This event has 11 elements because
E = {(i, 6), (6, i), (6, 6) | i = 1, 2, . . . , 5}.
Of these 11 events, 10 have numbers that are different. Hence,
P(EF) = 10/36.
The event F has 30 elements which we can denote by
F = {(i, j) | i ≠ j for i, j = 1, 2, . . . , 6}.
There are only 30 elements because the only dice rolls that yield identical numbers are doubles and
there are only 6 doubles. Hence,
P(F) = 30/36.
Therefore, we can conclude that the desired probability is
P(E | F) = (10/36)/(30/36) = 1/3.
Exercise (2a) Suppose Joe is 80 percent certain that his missing key is in one of the two pockets
of his hanging jacket, being 40 percent sure it is in the left pocket and 40 percent sure it is in the
right. If he searches the left pocket and does not find the key, what is the conditional probability
that it is in the right pocket?
Solution: Here, we let L denote the event that the key is in the left pocket and let R denote the
event that the key is in the right pocket. Then the probability that we seek is P (R | Lc ). Hence,
P(R | Lc) = P(RLc)/P(Lc) = P(R)/(1 − P(L)) = 0.4/0.6 = 2/3.
Here, we have used the fact that P (RLc ) = P (R) since R, the event that the key is in the right
pocket, is most certainly a subset of Lc , the event that the key is not in the left pocket. (The rest
of Lc is filled with the event that it is in neither pocket).
Exercise (3.7) The king comes from a family of 2 children. What is the probability that the other
child is his sister?
Solution: Let E be the event that at least one child is a boy (the king) and let F be the event that the other child is a girl.
We first note that the sample space to this problem is
S = {(g, g), (g, b), (b, g), (b, b)},
where b denotes boy and g denotes girl. We wish to find the probability P (F | E), which is
P(F | E) = P(FE)/P(E) = (2/4)/(3/4) = 2/3.
Sometimes working with the reduced sample space is the best way to go.
Example (3.4) What is the probability that at least one of a pair of fair dice lands on 6, given
that the sum of the dice is i, for i = 2, 3, 4, . . . , 12? In this case, it is much easier to consider the
reduced sample space given by the condition of the sum. Let E be the event that at least one dice
lands on 6 and let Fi be the event that the sum of the dice is i. Then we quickly see that the
desired probability is
P(E | Fi) = 0, for i = 2, 3, 4, 5, 6.
This is because if the sum is 6 or less, there is no way a 6 could have been rolled. For i = 7, we see
that our reduced sample space is restricted to
F7 = {(6, 1), (5, 2), (4, 3), (3, 4), (2, 5), (1, 6)},
of which only 2 elements contain a 6. Therefore,
P(E | F7) = 2/6 = 1/3.
Using the same argument, we see that when i = 8,
F8 = {(6, 2), (5, 3), (4, 4), (3, 5), (2, 6)}.
Our reduced sample space only has 5 elements, 2 of which contain a 6. Therefore,
P(E | F8) = 2/5.
Continuing in this manner, we find
P(E | F9) = 2/4 = 1/2,
P(E | F10) = 2/3,
P(E | F11) = 1,
P(E | F12) = 1.
Exercise (3.10) Three cards are randomly drawn, without replacement, from an ordinary deck of
52 playing cards. Compute the conditional probability that the first card is a spade given that the
second and third cards are spades.
Solution: Let E be the event that the first card is a spade and let F be the event that the second
and third cards are spades. This problem is very easy to do if we look at the reduced sample space.
Suppose we draw three cards. If two of them are already spades, then two of the 13 spades are gone
and we are left with 11 spades to choose from a remaining deck of 50 cards. Hence, the probability
is
P(E | F) = 11/50.
Exercise (2c) In the card game of bridge, the 52 cards are dealt out equally to 4 players - called
East, West, North, and South, where East and West are on a team and North and South are on a
team. If North and South have a total of 8 spades among them, what is the probability that East
has 3 of the remaining spades?
Solution: Let the event E be East having 3 spades and let F be the event that North and South
have 8 spades among the two of them. We can think in the sense of the reduced sample space.
Assuming the North and South hands have been dealt and they received 8 spades, it follows that
East could have a possible (26 choose 13) hands. We seek the probability that he received 3 of the
remaining 5 spades. There are (5 choose 3) · (21 choose 10) ways this could happen. Hence,
P(E | F) = (5 choose 3)(21 choose 10)/(26 choose 13).
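Assuming Python's standard-library `math.comb` for the binomial coefficients, the value can be evaluated directly:

```python
from math import comb

# East draws 13 of the 26 cards left after North-South are dealt;
# a favorable hand has 3 of the 5 remaining spades and 10 of the 21 non-spades.
p = comb(5, 3) * comb(21, 10) / comb(26, 13)   # ≈ 0.339
```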
We can rearrange the conditional probability formula and write
P (EF ) = P (F )P (E | F ).
We can generalize the conditional probability formula to include any number of events in succession.
This is called the multiplication rule:
Theorem. (Multiplication Rule)
P (E1 E2 · · · En ) = P (E1 )P (E2 | E1 )P (E3 |E1 E2 ) · · · P (En | E1 E2 · · · En−1 ).
Proof. We can show this is the case by continually applying the conditional probability formula.
Consider the right-hand side of this equation:
P(E1)P(E2 | E1) · · · P(En | E1E2 · · · En−1)
= P(E1) · [P(E1E2)/P(E1)] · [P(E1E2E3)/P(E1E2)] · · · [P(E1E2 · · · En)/P(E1E2 · · · En−1)]
= P(E1E2 · · · En).
Example (2d) Celine is undecided as to whether to take a French course or a chemistry course.
She estimates that her probability of receiving an A grade would be 1/2 in a French course and
2/3 in a chemistry course. If Celine decides to base her decision on the flip of a fair coin, what is
the probability that she gets an A in chemistry? We can use the multiplicative rule to answer this
question. Let C be the event that she takes a chemistry class, and A be the event that she gets an
A in whatever course she chooses. Then, we know that the probability that she gets an A assuming
she takes chemistry is P (A | C) = 2/3 and we know the probability that she takes chemistry is
P(C) = 1/2. We seek the probability that she takes chemistry and gets an A. Hence,
P(CA) = P(C)P(A | C) = (1/2)(2/3) = 1/3.
Example (3.13) Suppose an ordinary deck of 52 cards is randomly divided into 4 hands of 13 each.
We wish to determine the probability p that each hand has an ace. Let Ei be the event that hand
i has an ace, then we can determine p = P (E1 E2 E3 E4 ) using the multiplication rule. Here, we can
write
P (E1 E2 E3 E4 ) = P (E1 )P (E2 | E1 )P (E3 | E1 E2 )P (E4 |E1 E2 E3 ).
We need the conditional probabilities on the right-hand side of this equation. Consider the
probability that the first hand is dealt an ace. There are a total of (52 choose 13) possible hands.
There are (4 choose 1) possible aces to fill the one spot and there are (48 choose 12) remaining
cards to fill the remaining 12 spots in the hand. Therefore,
P(E1) = (4 choose 1)(48 choose 12)/(52 choose 13).
Now, assume we know that the first hand has an ace. Then we have a reduced sample space of 39
cards and there are only 3 aces left. Hence,
P(E2 | E1) = (3 choose 1)(36 choose 12)/(39 choose 13).
Now, assume we know that first two hands contain aces. Then our sample space is reduced further
and there are only 26 remaining cards, 2 of which are aces. This gives
P(E3 | E1E2) = (2 choose 1)(24 choose 12)/(26 choose 13).
Finally, there is only 1 ace left and there are 13 cards to choose from. Hence,
P(E4 | E1E2E3) = (1 choose 1)(12 choose 12)/(13 choose 13) = 1.
Therefore, p is given by
p = [(4 choose 1)(48 choose 12)/(52 choose 13)] · [(3 choose 1)(36 choose 12)/(39 choose 13)] · [(2 choose 1)(24 choose 12)/(26 choose 13)] ≈ 0.1055.
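The three nontrivial factors share one form, so the product can be checked with a small helper (`one_ace_factor` is our own illustrative name):

```python
from math import comb

def one_ace_factor(aces, cards):
    # Probability the next 13-card hand gets exactly one of the remaining
    # aces: (aces choose 1)(cards-aces choose 12)/(cards choose 13).
    return comb(aces, 1) * comb(cards - aces, 12) / comb(cards, 13)

# The fourth factor is (1 choose 1)(12 choose 12)/(13 choose 13) = 1.
p = one_ace_factor(4, 52) * one_ace_factor(3, 39) * one_ace_factor(2, 26)
```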
Exercise (3.12) A recent college graduate is planning to take the first three actuarial exams in the
coming summer. She will take the first exam in June. If she passes that exam, then she will take
the second exam in July, and if she passes that one, she will take the third exam in September. If
she fails an exam, she cannot take any more. The probability that she passes the first exam is 0.9. If
she passes the first, the conditional probability that she passes the second is 0.8, and if she passes
both the first and second exams, the conditional probability that she passes the third is 0.7.
a.) What is the probability that she passes all three exams?
b.) Given that she did not pass all three exams, what is the conditional probability that she
failed the second exam?
Solution:
a.) Let Ei be the event that she passed the ith exam. Then we seek the probability P (E1 E2 E3 ).
From the information given, we know P (E1 ) = 0.9, P (E2 | E1 ) = 0.8, and P (E3 | E1 E2 ) = 0.7.
By the multiplication rule, we have
P (E1 E2 E3 ) = P (E1 )P (E2 | E1 )P (E3 | E1 E2 ) = (0.9)(0.8)(0.7) = 0.504.
b.) We seek the probability P(E2c | (E1E2E3)c), which the conditional probability formula gives as
P(E2c | (E1E2E3)c) = P(E2c(E1E2E3)c)/P((E1E2E3)c) = P(E1E2c)/(1 − P(E1E2E3)).
Here, we have used the fact that P(E2c(E1E2E3)c) = P(E1E2c) since the only way she could
have failed exam 2 is if she passed exam 1 and failed exam 2. This gives
P(E2c | (E1E2E3)c) = P(E1)P(E2c | E1)/(1 − 0.504) = (0.9)(0.2)/0.496 ≈ 0.3629.
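The arithmetic for both parts, as a quick sketch:

```python
# Pass probabilities from the problem statement.
p1, p2_given, p3_given = 0.9, 0.8, 0.7

# Part a: multiplication rule along the chain of exams.
p_all = p1 * p2_given * p3_given                      # 0.504

# Part b: she failed exam 2 iff she passed exam 1 and failed exam 2,
# conditioned on not passing all three.
p_failed_2nd = p1 * (1 - p2_given) / (1 - p_all)      # ≈ 0.3629
```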
Exercise (3.14) An urn initially contains 5 white and 7 black balls. Each time a ball is selected,
its color is noted and it is replaced in the urn along with 2 other balls of the same color. Compute
the probability that
a.) the first 2 balls selected are black and the next 2 are white;
b.) of the first 4 balls selected, exactly 2 are black.
Solution:
a.) Let B denote the event that a black ball was drawn and let W denote the event that a white
ball was drawn. We seek the probability P (BBW W ), which by the multiplication rule is
P (BBW W ) = P (B)P (B | B)P (W | BB)P (W | BBW ).
The probability that the initial ball is black is simply
P(B) = 7/12,
since there are 7 black balls and 5 white balls initially. Since a black ball was selected, we
put it back and add two more black balls. Now, we seek the probability P (B | B), which is
given by
P(B | B) = 9/14.
Now, we assume a second black ball was chosen so that there are 11 black balls and still only
5 white. If a white ball is selected next, then the probability would be
P(W | BB) = 5/16.
Finally, the probability that another white ball is chosen would be
P(W | BBW) = 7/18.
Therefore, the probability that two black balls are chosen then two white balls are chosen is
given by
P(BBWW) = (7/12)(9/14)(5/16)(7/18) = 35/768.
b.) We note that there are exactly (4 choose 2) = 6 orderings in which 2 black balls and 2 white
balls can be chosen. In every case, the probability is that found in part a. Therefore, the
probability is
P(2 black, 2 white) = (4 choose 2) · (35/768) = 210/768.
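Both parts can be checked with exact rational arithmetic. The helper `draw_sequence` below (our own illustrative name) applies the multiplication rule to this self-reinforcing urn; note that every ordering of 2 blacks and 2 whites gives the same probability, which is why part b is just a multiple of part a:

```python
from fractions import Fraction
from math import comb

def draw_sequence(seq, white=5, black=7, add=2):
    # Probability of drawing exactly this color sequence, replacing each
    # drawn ball along with `add` extra balls of the same color.
    p = Fraction(1)
    for c in seq:
        total = white + black
        if c == 'B':
            p *= Fraction(black, total)
            black += add
        else:
            p *= Fraction(white, total)
            white += add
    return p

p_bbww = draw_sequence('BBWW')        # 35/768
p_two_black = comb(4, 2) * p_bbww     # 210/768
```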
Recall that we may consider the probability of an event as a long-run relative frequency, i.e.,
lim_{n→∞} n(E)/n,
where n(E) is the number of times event E occurs in n repetitions of an experiment. P (E | F )
is consistent with this interpretation. Let n be large, then if we only consider the experiments in
which F occurs, then P (E | F ) will equal the long-run proportion of them in which E also occurs.
To verify:
nP (F ) ≈ number of times F occurs
nP (EF ) ≈ number of times both E and F occur.
Then out of nP (F ) experiments in which F occurs, the proportion in which E occurs is
nP(EF)/(nP(F)) = P(EF)/P(F),
which is in agreement with our definition of P (E | F ) as n gets large.
Section 3.3
Bayes’s Formula
Consider
E = EF ∪ EF c .
Here, EF and EF c are mutually exclusive. By Axiom 3, we have
P(E) = P(EF) + P(EF c)
⇒ P(E) = P(E | F)P(F) + P(E | F c)P(F c)
⇒ P(E) = P(E | F)P(F) + P(E | F c)(1 − P(F)).
This equation states that the probability of an event E is a weighted average of the conditional
probability of E given that F has occurred and the conditional probability of E given that F has
not occurred. In fact, each weight is the probability of the event on which it is conditioned. This
allows us to find the probability of an event by first “conditioning” on whether or not some second
event has occurred.
Example (3.23) Urn I contains 2 white and 4 red balls, whereas urn II contains 1 white and 1 red
ball. A ball is randomly drawn from urn I and placed into urn II, then a ball is randomly selected
from urn II. What is
a.) the probability that the ball selected from urn II is white?
b.) the conditional probability that the transferred ball was white given that a white ball is
selected from urn II?
To answer part a., we condition on whether a red ball or a white ball was initially drawn and transferred. Let Rt be the event that a red ball was drawn
and transferred and let Wt be the event that a white ball was transferred.
white ball was selected from urn II. Then the probability we seek is P (W ), which is given by
P(W) = P(W | Rt)P(Rt) + P(W | Wt)P(Wt) = (1/3)(4/6) + (2/3)(2/6) = 4/9.
For part b., we seek the probability P (Wt | W ), which using the conditional probability formula,
gives
P(Wt | W) = P(WtW)/P(W) = P(Wt)P(W | Wt)/P(W) = (2/6)(2/3)/(4/9) = 1/2.
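Both answers can be verified exactly; a sketch of the conditioning argument (variable names are our own):

```python
from fractions import Fraction

# Transfer step: urn I holds 2 white and 4 red balls.
p_Wt = Fraction(2, 6)            # white ball transferred
p_Rt = Fraction(4, 6)            # red ball transferred

# Urn II then holds 3 balls (originally 1 white, 1 red, plus the transfer).
p_W_given_Wt = Fraction(2, 3)
p_W_given_Rt = Fraction(1, 3)

# Part a: law of total probability.
p_W = p_W_given_Wt * p_Wt + p_W_given_Rt * p_Rt      # 4/9

# Part b: Bayes's formula.
p_Wt_given_W = p_W_given_Wt * p_Wt / p_W             # 1/2
```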
Exercise (3a) An insurance company believes that people can be divided into two classes; those
who are accident prone and those who are not. The company’s statistics show that an accident
prone person will have an accident at some time within a fixed 1-year period with probability 0.4,
whereas this probability decreases to 0.2 for a person who is not accident prone.
a.) If we assume that 30% of the population is accident prone, what is the probability that a new
policyholder will have an accident within a year of purchasing a policy?
b.) Suppose that a new policyholder has an accident within a year of purchasing a policy. What is the
probability that she is accident prone?
Solution:
a.) Let A be the event that a person is accident-prone, and let A1 be the event that a person
had an accident within a year. Then, we seek the probability P (A1 ), which we can find by
conditioning on whether or not that person is accident prone. We have
P (A1 ) = P (A1 | A)P (A) + P (A1 | Ac )P (Ac )
= (0.4)(0.3) + (0.2)(0.7)
= 0.26.
b.) Now, we assume the person has had an accident and we want to know whether she was
actually accident prone. In this case, we seek the probability P(A | A1). This is given by
P(A | A1) = P(AA1)/P(A1) = P(A)P(A1 | A)/P(A1) = (0.3)(0.4)/0.26 = 6/13.
Example (3d) A blood test is 95% effective in detecting a certain disease when it is, in fact,
present. The test also yields a “false positive” result for 1% of healthy persons tested. If 0.5% of
the population actually has the disease, what is the probability that a person has the disease given
that the test result is positive? Let D be the event that the person has the disease and let E be
the event that the result is positive. Then from the information given we know
P (E | D) = 0.95,
P (D) = 0.005,
P (E | Dc ) = 0.01.
We seek the probability P(D | E), which is given by
P(D | E) = P(DE)/P(E)
= P(E | D)P(D)/[P(E | D)P(D) + P(E | Dc)P(Dc)]
= (0.95)(0.005)/[(0.95)(0.005) + (0.01)(0.995)]
≈ 0.323.
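This computation is worth packaging, since the same pattern recurs in the exercises below. A hedged sketch (`posterior_positive` is our own name, not a library function):

```python
def posterior_positive(prior, sensitivity, false_pos):
    # P(D | E) = P(E|D)P(D) / [P(E|D)P(D) + P(E|D^c)P(D^c)],
    # where the denominator is the law of total probability.
    evidence = sensitivity * prior + false_pos * (1 - prior)
    return sensitivity * prior / evidence

p = posterior_positive(prior=0.005, sensitivity=0.95, false_pos=0.01)  # ≈ 0.323
```

The low prior (0.5%) is what drags the posterior down to about 32% despite the test's 95% sensitivity.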
Exercise (3c) In answering a question on a multiple choice test, a student either knows the answer
or guesses. Let p be the probability that the student knows the answer and 1 − p be the probability
that the student guesses. Assume that a student who guesses at the answer will be correct with
probability 1/m, where m is the number of multiple-choice alternatives. What is the conditional
probability that a student knew the answer to a question given that he or she answered it correctly?
Solution: Let C be the event that the student gets the answer correct and let G be the event
that the student guessed at the answer. Then we are given the probabilities P (Gc ) = p and
P (C | G) = 1/m and we seek the probability P (Gc | C), which is the probability that the student
knew the answer assuming he/she got it correct. Using our definition for conditional probability
gives
P(Gc | C) = P(GcC)/P(C)
= P(C | Gc)P(Gc)/[P(C | Gc)P(Gc) + P(C | G)P(G)]
= (1)(p)/[(1)(p) + (1/m)(1 − p)]
= p/(p + (1 − p)/m)
= mp/(1 + (m − 1)p).
Exercise (3.19) A total of 48% of the women and 37% of the men who took a certain “quit
smoking” class remained nonsmokers for at least one year after completing the class. These people
then attended a success party at the end of a year. If 62% of the original class was male,
a.) what percentage of those at the party were women?
b.) what percentage of the original class attended the party?
Solution: Let A be the event that a class member attended the party and let W be the event that the
member was a woman. Then we are given the following probabilities: P(W) = 0.38, P(W c) = 0.62,
P(A | W) = 0.48, and P(A | W c) = 0.37. Consider the following solutions:
a.) Here, we are concerned with the probability P (W | A). Using the definition of conditional
probabilities, we obtain
P(W | A) = P(WA)/P(A) = P(A | W)P(W)/[P(A | W)P(W) + P(A | W c)P(W c)],
where we have conditioned the denominator term over whether or not the person selected is
a male or female. Therefore,
P(W | A) = (0.48)(0.38)/[(0.48)(0.38) + (0.37)(0.62)] ≈ 0.443.
b.) We’ve actually already answered this question. We seek the probability P (A), which by
conditioning on whether the smoker is a male or female gives
P (A) = P (A | W )P (W ) + P (A | W c )P (W c )
= (0.48)(0.38) + (0.37)(0.62)
= 0.4118.
Exercise (3f) At a certain stage of a criminal investigation, the inspector in charge is 60% convinced of the guilt of a certain suspect. Suppose, however, that a new piece of evidence which shows
that the criminal has a certain characteristic is uncovered. If 20% of the population possesses this
characteristic, how certain of the guilt of the suspect should the inspector now be if it turns out
that the suspect has the characteristic?
Solution: Let G be the event that the suspect is guilty and let E be the event that the suspect
has this characteristic. Then the probability we seek is P (G | E), which is the probability that the
suspect is guilty given that he has the characteristic. From conditional probabilities, we have
P(G | E) = P(GE)/P(E)
= P(E | G)P(G)/[P(E | G)P(G) + P(E | Gc)P(Gc)]
= (1)(0.6)/[(1)(0.6) + (0.2)(0.4)]
≈ 0.882.
Just as we did before with conditional probabilities, we can generalize the previous result.
Suppose Fi for i = 1, 2, . . . , n are all mutually exclusive events. Then the sample space can be
written as
S = F1 ∪ F2 ∪ · · · ∪ Fn,
or equivalently, one of Fi must occur. Then we have for any event E,
E = EF1 ∪ EF2 ∪ · · · ∪ EFn,
and each event EFi is mutually exclusive. [Venn Diagram] Then
P(E) = Σ_{i=1}^{n} P(EFi) = Σ_{i=1}^{n} P(E | Fi)P(Fi).
This is called the law of total probability. This shows that we can compute P (E) by first conditioning
on which one of the Fi that occurs. Again, P (E) is a weighted average of P (E | Fi ), each term
being weighted by the probability P (Fi ).
Theorem. (Bayes’s Formula)
P(Fj | E) = P(EFj)/P(E) = P(E | Fj)P(Fj) / Σ_{i=1}^{n} P(E | Fi)P(Fi).
Proof. Follows directly from the definition of conditional probability and the law of total probability.
Example (3.32) A family has j children with probability pj , where p1 = 0.1, p2 = 0.25, p3 = 0.35,
p4 = 0.3. A child from this family is randomly chosen. Given that this child is the eldest child in
the family, find the conditional probability that the family has
a.) only 1 child;
b.) 4 children.
To answer these questions, we first start off defining each event. Let E be the event that the
child selected is the oldest and let Fj be the event that the family has j children. From this, we
may conclude that the probability that the child is the oldest, given that there are j children, is
P(E | Fj) = 1/j. Furthermore, we know P(Fj) = pj as given in the problem. To answer parts a.)
and b.), we seek the probability P(Fj | E). Hence, by Bayes's formula,
P(Fj | E) = P(EFj)/P(E)
= P(E | Fj)P(Fj) / Σ_{i=1}^{4} P(E | Fi)P(Fi)
= (pj/j) / Σ_{i=1}^{4} (pi/i)
= (pj/j)/(p1 + p2/2 + p3/3 + p4/4).
Therefore, for part a.), we have
P(F1 | E) = p1/(p1 + p2/2 + p3/3 + p4/4) = (1/10)/(5/12) = 6/25,
and for part b.) we have
P(F4 | E) = (p4/4)/(p1 + p2/2 + p3/3 + p4/4) = (3/40)/(5/12) = 9/50.
Exercise (3.36) Stores A, B, and C have 50, 75, and 100 employees, respectively, and 50, 60, and
70 percent of them respectively are women. Resignations are equally likely among all employees,
regardless of sex. One woman employee resigns. What is the probability that she works in store C?
Solution: Here, let W refer to the event that the resignation came from a woman and let A, B,
and C represent the event that the person who resigned worked in store A, B, or C, respectively.
Then we seek the probability P(C | W). Using Bayes's formula, we obtain
P(C | W) = P(W | C)P(C)/[P(W | A)P(A) + P(W | B)P(B) + P(W | C)P(C)]
= (0.7)(100/225)/[(0.5)(50/225) + (0.6)(75/225) + (0.7)(100/225)]
= 0.5.
Exercise (3n) A bin contains 3 types of disposable flashlights. The probability that a type 1
flashlight will give more than 100 hours of use is 0.7, with the corresponding probabilities for type
2 and type 3 flashlights being 0.4 and 0.3, respectively. Suppose that 20% of the flashlights in the
bin are type 1, 30% are type 2, and 50% are type 3.
a.) What is the probability that a randomly chosen flashlight will give more than 100 hours of
use?
b.) Given that a flashlight lasted more than 100 hours, what is the conditional probability that
it was a type j flashlight, for j = 1, 2, 3.
Solution: Let E be the event that the chosen flashlight gives more than 100 hours of light and
let Fi be the event that a flashlight of type i is chosen. Consider the following solutions:
a.) We seek the probability P (E). We find this probability by conditioning on which flashlight
is chosen. The law of total probability gives
P (E) = P (E | F1 )P (F1 ) + P (E | F2 )P (F2 ) + P (E | F3 )P (F3 )
= (0.7)(0.2) + (0.4)(0.3) + (0.3)(0.5)
= 0.41.
b.) We seek the probabilities P(Fi | E), which are given by Bayes's formula:
P(Fi | E) = P(E | Fi)P(Fi)/0.41.
This gives the following:
P(F1 | E) = 14/41,
P(F2 | E) = 12/41,
P(F3 | E) = 15/41.
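The law of total probability and the three posteriors can be computed in one pass (the dictionary layout is our own):

```python
priors = {1: 0.2, 2: 0.3, 3: 0.5}   # P(F_i): type mix in the bin
lasts  = {1: 0.7, 2: 0.4, 3: 0.3}   # P(E | F_i): lasts > 100 hours

# Part a: total probability over flashlight types.
p_E = sum(lasts[i] * priors[i] for i in priors)            # 0.41

# Part b: Bayes's formula for each type.
posteriors = {i: lasts[i] * priors[i] / p_E for i in priors}
```

Note that the posteriors necessarily sum to 1, which is a useful sanity check on hand computations.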
Exercise (3k) A plane is missing, and it is presumed that it was equally likely to have gone down
in any of 3 possible regions. Let 1 − βi , for i = 1, 2, 3, denote the probability that the plane
will be found upon a search of the ith region when the plane is, in fact, in that region. What is
the conditional probability that the plane is in the ith region given that a search of region 1 is
unsuccessful?
Solution: Here, the values βi are called overlook probabilities for obvious reasons. Let Ri be the
event that the plane is in location i and let E be the event that a search of region 1 was unsuccessful.
Then, using these events, we conclude the following: P (Ri ) = 1/3, P (E | R1 ) = β1 , P (E | R2 ) = 1,
and P (E | R3 ) = 1. From this, we can find the desired probabilities. Let i = 1, then we seek the
following:
P(R1 | E) = P(E | R1)P(R1)/[P(E | R1)P(R1) + P(E | R2)P(R2) + P(E | R3)P(R3)]
= β1(1/3)/(β1/3 + 1/3 + 1/3)
= β1/(2 + β1).
For i = 2, 3, we obtain the following:
P(Ri | E) = P(E | Ri)P(Ri)/[P(E | R1)P(R1) + P(E | R2)P(R2) + P(E | R3)P(R3)]
= (1)(1/3)/(β1/3 + 1/3 + 1/3)
= 1/(2 + β1).
Definition The odds of an event E are defined as
[odds] = P(E)/P(E c) = P(E)/(1 − P(E)).
The odds tell how much more likely it is that event E occurs than it is that it doesn’t occur. If the
odds are α, then it is common to say “α to 1” in favor of the hypothesis.
We can now compute the odds when new evidence is introduced. Suppose H is true with
probability P (H) and let E be new evidence. Then, given the new evidence,
P(H | E) = P(HE)/P(E) = P(E | H)P(H)/P(E)
P(H c | E) = P(H cE)/P(E) = P(E | H c)P(H c)/P(E).
Dividing the two expressions gives the “new odds” in light of this evidence:
P(H | E)/P(H c | E) = [P(H)/P(H c)] · [P(E | H)/P(E | H c)],
where the left-hand side is the new odds and P(H)/P(H c) is the old odds.
We can see that the new odds increase if the new evidence is more likely when H is true than when
it is false.
Example (3i) An urn contains two type A coins and one type B coin. When a type A coin is
flipped it comes up heads with probability 1/4, whereas when a type B coin is flipped, it comes up
heads with probability 3/4. A coin is randomly chosen from the urn and flipped. Given that the
flip landed on heads, what is the probability that it was a type A coin? To answer this question,
we define H as the event of flipping a heads and A the event that coin A is drawn. We find the
odds of drawing an A coin as
[odds of drawing A] = P(A)/P(Ac) = (2/3)/(1/3) = 2.
The odds of drawing an A coin are two to one. Now, assume we know that the coin flipped came
up heads. Then in light of this new evidence, we wish to find the odds that it is an A coin. This
means
[new odds of drawing A] = P(A | H)/P(Ac | H) = [P(A)/P(Ac)] · [P(H | A)/P(H | Ac)] = (2) · (1/4)/(3/4) = 2/3.
So the new odds are 2/3 to one, which means the probability that the coin picked was the A coin
given that heads was flipped is P (A | H) = 2/5.
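The odds-update rule above can be sketched with exact fractions; `update_odds` is our own illustrative helper, and the final line converts odds back to a probability via p = odds/(1 + odds):

```python
from fractions import Fraction

def update_odds(prior_odds, likelihood_ratio):
    # new odds = old odds × P(E | H) / P(E | H^c)
    return prior_odds * likelihood_ratio

old = Fraction(2, 3) / Fraction(1, 3)                     # odds of type A: 2
new = update_odds(old, Fraction(1, 4) / Fraction(3, 4))   # 2/3 after heads
p_A_given_H = new / (1 + new)                             # 2/5
```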
Section 3.4
Independence
We say E is independent of F if knowledge that F has occurred does not influence the probability
that E occurs. In other words, if P (E | F ) is the same as P (E), then E is independent of F , which
means
P (EF ) = P (E)P (F ).
Furthermore, it follows that if E is independent of F , then F is independent of E.
Definition Two events E and F are said to be independent if P (EF ) = P (E)P (F ).
Example (4a) A card is selected at random from an ordinary deck of 52 playing cards. If E is
the event that the selected card is an ace and F is the event that it is a spade, then are the events
independent? In this case, yes, because P(E) = 4/52 = 1/13 and the probability of drawing
an ace knowing the card is a spade is exactly P(E | F) = 1/13. Therefore, knowing the card is a
spade gives you no extra benefit. Conversely, you could say P(F) = 13/52 = 1/4. Suppose
you know the card you drew was an ace; what is the probability that it is a spade? Then
P(F | E) = 1/4 since there are four possible suits of the ace, one of which is a spade. You can
also note that the probability of drawing the ace of spades is P(EF) = 1/52 and the probability of
each event separately is P(E) = 4/52 and P(F) = 13/52, which gives P(EF) = P(E)P(F).
Example (4b) Two coins are flipped, and all 4 outcomes are assumed to be equally likely. If E is
the event that the first coin lands on heads and F the event that the second lands on tails, then are
E and F independent? Yes, because P (E) = 1/2, P (F ) = 1/2, and P (EF ) = P ({(H, T )}) = 1/4.
Hence, P (EF ) = P (E)P (F ) and the two events are independent.
Exercise (4c) Suppose we toss 2 fair dice. Let E1 denote the event that the sum of the dice is 6
and let E2 denote the event that the sum of the two dice is 7. Let F denote the event that the
first roll was a 4. Is either E1 or E2 independent of F ?
Solution: The sample space for this problem contains 36 elements. The events consist of the
following outcomes:
E1 = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}
E2 = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
F = {(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6)} .
It follows easily that P (E1 ) = 5/36 and P (F ) = 1/6 since each outcome is equally likely. Also,
P(E1F) = 1/36, as there is only one element, namely (4, 2), in common. Thus,
P(E1F) = 1/36 ≠ 5/216 = P(E1)P(F),
and the two events are not independent. But, we have P (E2 ) = 1/6, P (F ) = 1/6, and P (E2 F ) =
1/36, which means
P(E2F) = 1/36 = P(E2)P(F).
Hence, E2 and F are independent.
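The two independence checks can be sketched by enumerating the sample space (event and helper names are our own):

```python
from fractions import Fraction

# All 36 equally likely outcomes of two dice.
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def prob(event):
    # Exact probability of an event given as a predicate on outcomes.
    return Fraction(len([o for o in outcomes if event(o)]), len(outcomes))

E1 = lambda o: o[0] + o[1] == 6   # sum is 6
E2 = lambda o: o[0] + o[1] == 7   # sum is 7
F  = lambda o: o[0] == 4          # first roll is 4

# Independence test: P(EF) == P(E)P(F)?
indep_E1 = prob(lambda o: E1(o) and F(o)) == prob(E1) * prob(F)   # False
indep_E2 = prob(lambda o: E2(o) and F(o)) == prob(E2) * prob(F)   # True
```

The asymmetry comes from the fact that every first roll leaves exactly one way to complete a sum of 7, but not a sum of 6.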
Proposition If E and F are independent, then so are E and F c .
Proof. We consider the fact that E = EF ∪ EF c . These two events are mutually exclusive.
Therefore,
P (E) = P (EF ) + P (EF c )
= P (E)P (F ) + P (EF c )
since E and F are independent. This equation can be rewritten as
P(E)(1 − P(F)) = P(EF c) ⇒ P(E)P(F c) = P(EF c),
which means E and F c are independent.
For three events, say E, F , and G, to be independent, we must have
P (EF G) = P (E)P (F )P (G),
and
P (EF ) = P (E)P (F ),
P (EG) = P (E)P (G),
P (F G) = P (F )P (G).
In general, independence can be extended to more than three events. The events E1, E2, . . ., En
are said to be independent if, for every subset E1′, E2′, . . ., Er′ of these events with r ≤ n,
P(E1′E2′ · · · Er′) = P(E1′)P(E2′) · · · P(Er′).
Exercise (4f) An infinite sequence of independent trials is to be performed. Each trial results in
a success with probability p and a failure with probability 1 − p. What is the probability that at
least 1 success occurs in the first n trials? What is the probability that k successes occur in the
first n trials?
Solution: Here, we define S to be a success and F to be a failure. We know that each trial is
independent and P (S) = p and P (F ) = 1 − p. To find the probability of at least one success, we
can find the probability of no successes and subtract it from one. Hence,
P (at least one success) = 1 − P (no successes) = 1 − (1 − p)^n ,
where we have used the fact that the trials are independent by multiplying the probability of failure
n times. To answer the second question, we consider a sequence of n trials, k of which are successes
and n − k of which are failures:
S S · · · S F F · · · F
(the first k trials are successes, the last n − k are failures)
20
MATH 305-02 – Probability
Lecture Notes
October 1, 2015
The probability of this particular sequence is simply p^k (1 − p)^(n−k). But this is only one such
sequence of k successes and n − k failures; there are exactly C(n, k) = n!/[k!(n − k)!] such sequences,
since we are choosing which k of the n trials are the successes. Therefore, the probability is

P (k successes) = C(n, k) p^k (1 − p)^(n−k).
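As a sanity check on this formula, one can compare it against a brute-force sum over all 2^n success/failure sequences for a small case; the helper names and the example values n = 6, k = 2, p = 0.3 below are illustrative choices, not from the text:

```python
from itertools import product
from math import comb

def p_k_successes(n, k, p):
    # P(exactly k successes in n independent trials), success probability p.
    return comb(n, k) * p**k * (1 - p)**(n - k)

def brute_force(n, k, p):
    # Sum the probabilities of every success/failure sequence with k successes.
    total = 0.0
    for seq in product([True, False], repeat=n):
        if sum(seq) == k:
            w = 1.0
            for s in seq:
                w *= p if s else (1 - p)
            total += w
    return total

n, k, p = 6, 2, 0.3
assert abs(p_k_successes(n, k, p) - brute_force(n, k, p)) < 1e-12
print(round(p_k_successes(n, k, p), 6))   # 0.324135
```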
Exercise (4i) Suppose there are n types of coupons and that each new coupon collected is, independent of previous selections, a type i coupon with probability pi , where p1 + p2 + · · · + pn = 1. Suppose k coupons are to be collected. If Ai is the event that there is at least one type i coupon among those collected, then, for i ≠ j, find P (Ai ), P (Ai ∪ Aj ), and P (Ai | Aj ).
Solution: We note here that the process of selecting a coupon is independent of the previous
selection and that the actual event defined by Ai is not independent of Aj for i ≠ j. Since Ai is
defined as the event that at least one coupon of type i is picked, we again consider the event that
no coupon of type i is selected. Therefore, to find P (Ai ), we obtain
P (Ai ) = 1 − P (no coupon of type i) = 1 − (1 − pi )^k .
The event Ai ∪ Aj is the event that at least one of the coupons i or j is selected. Again, we can
find this by considering the event that neither of coupon i nor j is chosen:
P (Ai ∪ Aj ) = 1 − P ((Ai ∪ Aj )c ) = 1 − (1 − pi − pj )^k .
Finally, to find the conditional probability P (Ai | Aj ), we use the conditional probability formula,
which gives
P (Ai | Aj ) = P (Ai Aj ) / P (Aj ).
To determine P (Ai Aj ), we can use the following identity:
P (Ai ∪ Aj ) = P (Ai ) + P (Aj ) − P (Ai Aj ).
Substituting and evaluating, we obtain the following solution:

P (Ai | Aj ) = { [1 − (1 − pi )^k ] + [1 − (1 − pj )^k ] − [1 − (1 − pi − pj )^k ] } / [1 − (1 − pj )^k ].
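All three closed forms can be checked by exact enumeration of a small instance. In the sketch below, the type probabilities (1/2, 1/3, 1/6) and k = 4 are hypothetical values chosen purely for illustration:

```python
from fractions import Fraction
from itertools import product

# Hypothetical instance: n = 3 coupon types, k = 4 draws.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
k = 4
i, j = 0, 1   # the two coupon types of interest (0-indexed)

P_Ai = P_Aj = P_union = Fraction(0)
for seq in product(range(len(p)), repeat=k):
    w = Fraction(1)
    for t in seq:
        w *= p[t]                  # draws are independent
    if i in seq:
        P_Ai += w
    if j in seq:
        P_Aj += w
    if i in seq or j in seq:
        P_union += w

# Closed forms from the text.
assert P_Ai == 1 - (1 - p[i])**k
assert P_Aj == 1 - (1 - p[j])**k
assert P_union == 1 - (1 - p[i] - p[j])**k

# P(Ai | Aj) via inclusion-exclusion, exactly as derived above.
cond = (P_Ai + P_Aj - P_union) / P_Aj
print(cond)   # 12/13
```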
Exercise (3.91) Suppose that n independent trials, each of which results in any of the outcomes 0, 1, or 2, with respective probabilities p0 , p1 , and p2 , where p0 + p1 + p2 = 1, are performed. Find
the probability that outcomes 1 and 2 both occur at least once.
Solution: Let Ei be the event that outcome i does not occur. Then we seek P ((E1 ∪ E2 )c ), since (E1 ∪ E2 )c is the event that both outcomes 1 and 2 occur at least once. We have

P (E1 ∪ E2 ) = P (E1 ) + P (E2 ) − P (E1 E2 )
= (1 − p1 )^n + (1 − p2 )^n − (1 − p1 − p2 )^n
= (1 − p1 )^n + (1 − p2 )^n − p0^n .
Subtracting this from one yields the desired probability:

P ((E1 ∪ E2 )c ) = 1 + p0^n − (1 − p1 )^n − (1 − p2 )^n .
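A quick exact check of this formula, summing over all 3^n trial sequences for hypothetical values p0 = 1/2, p1 = 1/3, p2 = 1/6 and n = 5:

```python
from fractions import Fraction
from itertools import product

# Hypothetical outcome probabilities p0, p1, p2 and trial count n.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n = 5

# Exact P(outcomes 1 and 2 both occur) by summing over every trial sequence.
total = Fraction(0)
for seq in product(range(3), repeat=n):
    if 1 in seq and 2 in seq:
        w = Fraction(1)
        for t in seq:
            w *= p[t]
        total += w

closed = 1 + p[0]**n - (1 - p[1])**n - (1 - p[2])**n
assert total == closed
print(closed)   # 215/432
```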
Section 3.5
P (· | F ) is a Probability
We now consider the conditional probability P (· | F ) as a probability function that satisfies the
three axioms of probability.
Proposition For any event E of a sample space S, the probability P (E | F ) satisfies the three
axioms of probability:
1.] 0 ≤ P (E | F ) ≤ 1.

2.] P (S | F ) = 1.

3.] If events E1 , E2 , E3 , . . . are mutually exclusive (i.e., Ei Ej = ∅ for i ≠ j), then

P ( ∪_{i=1}^{∞} Ei | F ) = Σ_{i=1}^{∞} P (Ei | F ).
Proof. For part 1.], we must show that 0 ≤ P (E | F ) ≤ 1. Using the formula for conditional probability, we write

P (E | F ) = P (EF ) / P (F ).

Here, P (EF ) ≥ 0, and since EF ⊂ F , we have P (EF ) ≤ P (F ). Therefore,

0 ≤ P (EF ) / P (F ) ≤ 1,

so P (E | F ) satisfies the first axiom. Part 2.] follows since

P (S | F ) = P (SF ) / P (F ) = P (F ) / P (F ) = 1.
P (F )
Finally, consider a sequence of mutually exclusive events E1 , E2 , . . .. Then

P ( ∪_{i=1}^{∞} Ei | F ) = P (( ∪_{i=1}^{∞} Ei )F ) / P (F ) = P ( ∪_{i=1}^{∞} Ei F ) / P (F ).

From here, since Ei Ej = ∅ for i ≠ j, it follows that (Ei F )(Ej F ) = ∅, and we have

P ( ∪_{i=1}^{∞} Ei F ) = Σ_{i=1}^{∞} P (Ei F ).
This holds because the events Ei F are mutually exclusive, so all cross terms in the inclusion-exclusion expansion drop out. Therefore,
P ( ∪_{i=1}^{∞} Ei | F ) = [ Σ_{i=1}^{∞} P (Ei F ) ] / P (F ) = Σ_{i=1}^{∞} P (Ei | F ),

and Axiom 3.] is satisfied.
If we define Q(E) = P (E | F ), then from the above proposition, Q(E) may be regarded as
a probability function on the events of S. In this sense, all propositions previously proved for
probabilities hold for Q(E). For example, we have
Q(E1 ∪ E2 ) = Q(E1 ) + Q(E2 ) − Q(E1 E2 )
or equivalently
P (E1 ∪ E2 | F ) = P (E1 | F ) + P (E2 | F ) − P (E1 E2 | F ).
Also, we can define a conditional probability for the probability Q(E). Suppose we wish to find
the probability Q(E1 ) by first conditioning on whether or not E2 occurs, then
Q(E1 ) = Q(E1 | E2 )Q(E2 ) + Q(E1 | E2c )Q(E2c ).
If we substitute our equation for conditional probability, Q(E1 | E2 ) = Q(E1 E2 )/Q(E2 ), we obtain
Q(E1 | E2 ) = Q(E1 E2 ) / Q(E2 )
= P (E1 E2 | F ) / P (E2 | F )
= [ P (E1 E2 F ) / P (F ) ] / [ P (E2 F ) / P (F ) ]
= P (E1 E2 F ) / P (E2 F )
= P (E1 | E2 F ).
Therefore, if we condition the probability Q(E1 ) on whether or not E2 occurs, it is equivalent to
P (E1 | F ) = P (E1 | E2 F )P (E2 | F ) + P (E1 | E2c F )P (E2c | F ).
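This identity can be tested numerically on a concrete experiment. In the sketch below, the dice and the events (F : first die at most 3; E1 : even sum; E2 : first die shows 1) are my own illustrative choices:

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # two fair dice

def prob(ev):
    return Fraction(sum(1 for o in rolls if ev(o)), len(rolls))

def cond(ev, given):
    # P(ev | given) = P(ev and given) / P(given)
    return prob(lambda o: ev(o) and given(o)) / prob(given)

F   = lambda o: o[0] <= 3                 # first die is at most 3
E1  = lambda o: (o[0] + o[1]) % 2 == 0    # sum is even
E2  = lambda o: o[0] == 1                 # first die shows 1
E2c = lambda o: not E2(o)

lhs = cond(E1, F)
rhs = (cond(E1, lambda o: E2(o) and F(o)) * cond(E2, F)
       + cond(E1, lambda o: E2c(o) and F(o)) * cond(E2c, F))
assert lhs == rhs
print(lhs)   # 1/2
```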
Exercise (5a and 3a) An insurance company believes that people can be divided into two classes:
those who are accident prone and those who are not. The company’s statistics show that an accident
prone person will have an accident at some time within a fixed 1-year period with probability 0.4,
whereas this probability decreases to 0.2 for a person who is not accident prone.
a.) If we assume that 30% of the population is accident prone, what is the probability that a new
policyholder will have an accident within a year of purchasing a policy?
b.) Suppose that a new policyholder has an accident within a year of purchasing a policy. What is the
probability that she is accident prone?
c.) What is the conditional probability that a new policyholder will have an accident in his or her
second year of policy ownership, given that the policyholder had an accident the first year?
Solution: Let A be the event that a person is accident-prone, and let A1 be the event that a
person had an accident within a year. Also, let A2 denote the event that a person had an accident
during the second year of holding a policy. We are given P (A1 | A) = 0.4 and P (A1 | Ac ) = 0.2.
Consider the following:
a.) We seek the probability P (A1 ), which we can find by conditioning on whether or not that
person is accident prone. We have
P (A1 ) = P (A1 | A)P (A) + P (A1 | Ac )P (Ac )
= (0.4)(0.3) + (0.2)(0.7)
= 0.26.
b.) Now, we assume the person has had an accident and we want to know whether she was
actually accident prone. In this case, we seek the probability P (A | A1 ). This is given by

P (A | A1 ) = P (AA1 ) / P (A1 ) = P (A)P (A1 | A) / P (A1 ) = (0.3)(0.4) / 0.26 = 6/13.
c.) Now, we wish to find the probability P (A2 | A1 ). We can find this by conditioning on whether
or not the person is accident prone. This gives
P (A2 | A1 ) = P (A2 | AA1 )P (A | A1 ) + P (A2 | Ac A1 )P (Ac | A1 ).
Here, the probability P (A2 | AA1 ) = 0.4 since the person is accident prone and the second
year is a separate period from the first year. Similarly, P (A2 | Ac A1 ) = 0.2. This gives the
following solution:

P (A2 | A1 ) = (0.4)(6/13) + (0.2)(7/13) ≈ 0.29.
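The three parts of this example reduce to a few lines of arithmetic; a sketch with variable names of my own choosing:

```python
# Worked arithmetic for the insurance example.
pA = 0.3                         # P(accident prone)
p1_A, p1_Ac = 0.4, 0.2           # P(A1 | A), P(A1 | A complement)

# a) Total probability: P(A1)
pA1 = p1_A * pA + p1_Ac * (1 - pA)

# b) Bayes: P(A | A1)
pA_A1 = p1_A * pA / pA1

# c) Condition on accident-proneness given a first-year accident.
pA2_A1 = p1_A * pA_A1 + p1_Ac * (1 - pA_A1)

print(round(pA1, 2), round(pA_A1, 4), round(pA2_A1, 4))   # 0.26 0.4615 0.2923
```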
Exercise (5e) There are k + 1 coins in a box. When flipped, the ith coin will turn up heads with
probability i/k, for i = 0, 1, . . . , k. A coin is randomly selected from the box and is then repeatedly
flipped. If the first n flips all result in heads, what is the conditional probability that the (n + 1)st
flip will do likewise?
Solution: Let Ci be the event that the ith coin is drawn, Fn be the event that the first n flips
were heads, and H be the event that the (n + 1)st flip is heads. Then the probability we seek is
P (H | Fn ). The given probabilities are P (Ci ) = 1/(k + 1), P (H | Ci ) = i/k, and P (Fn | Ci ) = (i/k)^n ,
as each flip is independent of the others. To find the probability, we condition on which coin was
drawn:

P (H | Fn ) = Σ_{i=0}^{k} P (H | Ci Fn )P (Ci | Fn ).

We can assume that P (H | Ci Fn ) = P (H | Ci ) = i/k, since the previous flips do not influence the
(n + 1)st flip; the only thing that matters is which coin was used. Therefore, we have

P (H | Fn ) = Σ_{i=0}^{k} (i/k) P (Ci | Fn ).

Now, we seek the probability P (Ci | Fn ), which is the probability that the ith coin was drawn given
that n heads were flipped. Using Bayes’s formula, we obtain

P (Ci | Fn ) = P (Fn | Ci )P (Ci ) / Σ_{j=0}^{k} P (Fn | Cj )P (Cj )
= (i/k)^n [1/(k + 1)] / Σ_{j=0}^{k} (j/k)^n [1/(k + 1)]
= (i/k)^n / Σ_{j=0}^{k} (j/k)^n .

From this, we update our probability as

P (H | Fn ) = [ Σ_{i=0}^{k} (i/k)^{n+1} ] / [ Σ_{j=0}^{k} (j/k)^n ].
If k is large, then multiplying the numerator and denominator by 1/k, we obtain the approximation

P (H | Fn ) = [ (1/k) Σ_{i=0}^{k} (i/k)^{n+1} ] / [ (1/k) Σ_{j=0}^{k} (j/k)^n ] ≈ [ ∫_0^1 x^{n+1} dx ] / [ ∫_0^1 x^n dx ] = (n + 1)/(n + 2).
This approximation follows as the sum, as k gets big, is a Riemann sum approximation to the
integral.
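One can compare the exact sum with the limiting value (n + 1)/(n + 2) numerically; the values k = 200 and n = 10 below are arbitrary illustrative choices:

```python
from fractions import Fraction

def p_next_head(k, n):
    # Exact P(H | Fn) for k + 1 coins: sum_i (i/k)^(n+1) / sum_j (j/k)^n.
    num = sum(Fraction(i, k)**(n + 1) for i in range(k + 1))
    den = sum(Fraction(j, k)**n for j in range(k + 1))
    return num / den

k, n = 200, 10
exact = float(p_next_head(k, n))
approx = (n + 1) / (n + 2)       # the Riemann-sum limit as k grows
print(round(exact, 4), round(approx, 4))
```

For k = 200 the exact value already agrees with 11/12 to within about a percent, illustrating the Riemann-sum argument.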