Probabilities and Frequency Ratios
on the basis of new
data can be significant.
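P(AB). $P(AB) = P(A \mid B)P(B)$, and $P(A \mid B) = (\tfrac{1}{3})(\tfrac{1}{3})(\tfrac{2}{3})(\tfrac{1}{3})(\tfrac{1}{3})(\tfrac{1}{3})(\tfrac{2}{3})(\tfrac{1}{3})
= (\tfrac{1}{3})^6(\tfrac{2}{3})^2 = \tfrac{4}{6561}$. Thus, $P(AB) = \tfrac{4}{6561} \cdot \tfrac{1}{2} = \tfrac{2}{6561}$. And, $P(A) = \tfrac{32}{6561} + \tfrac{2}{6561} = \tfrac{34}{6561}$.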
One has two black and
white faces. I put them
it so you cannot see it.
What is the probability
? Most people would
Finally, we obtain
$$P(W \mid A) = \frac{P(AW)}{P(A)} = \frac{32/6561}{34/6561} = \frac{32}{34} = 0.941.$$
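The arithmetic of this example can be checked with exact fractions. The sketch below is not taken from the text: it assumes, as the surviving figures 32/6561 and 34/6561 suggest, that the predominantly white die shows white with probability 2/3, that the predominantly black die shows white with probability 1/3, that either die is picked with probability 1/2, and that the observed outcome A is one fixed sequence of six white and two black faces in 8 rolls. All names are illustrative.

```python
from fractions import Fraction

# Assumed setup (inferred from the figures in the text, not stated in it):
# die W shows white with probability 2/3, die B with probability 1/3,
# and either die is picked with probability 1/2.
P_WHITE_FACE = {"W": Fraction(2, 3), "B": Fraction(1, 3)}
PRIOR = {"W": Fraction(1, 2), "B": Fraction(1, 2)}


def joint(die: str, whites: int, blacks: int) -> Fraction:
    """P(A and die) for one fixed sequence with the given face counts."""
    p = P_WHITE_FACE[die]
    return PRIOR[die] * p**whites * (1 - p) ** blacks


p_aw = joint("W", whites=6, blacks=2)  # P(AW)
p_ab = joint("B", whites=6, blacks=2)  # P(AB)
p_a = p_aw + p_ab                      # P(A) = P(AW) + P(AB)
p_w_given_a = p_aw / p_a               # P(W|A)

print(p_aw, p_ab, p_a, p_w_given_a, float(p_w_given_a))
```

Working with `Fraction` keeps the common denominator 6561 visible, so each intermediate value can be compared directly with the figures above.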
Bayes' theorem
The method used in Example 2-4 is an application of Bayes' theorem,
named after the Reverend Thomas Bayes (1702-1761).
Suppose the events $B_1, B_2, \ldots, B_k$ constitute a partition of the sample
space S. In other words, $B_1, B_2, \ldots, B_k$ are mutually exclusive events
which together cover all of S. Probabilities have been assigned to each
event in the partition. Now, suppose event A occurs. How does the information
that A has occurred affect the probabilities of $B_1, B_2, \ldots, B_k$? We
need to find the conditional probabilities $P(B_i \mid A)$, where $i = 1, 2, \ldots, k$.
By definition, the conditional probability of event $B_i$ given A is
$$P(B_i \mid A) = \frac{P(B_i \text{ and } A)}{P(A)}.$$
The probability that $B_i$ and A occur simultaneously is equal to the probability
that A occurs given $B_i$ times the probability of $B_i$. That is,
$$P(B_i \text{ and } A) = P(A \mid B_i)P(B_i).$$
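Since $B_1, B_2, \ldots, B_k$ form a partition of S, event ($B_1$ or $B_2$ or ... or $B_k$)
is equivalent to S. When A occurs, one and only one of the events in the
partition must occur, so
$$P(A) = P(B_1 \text{ and } A) + P(B_2 \text{ and } A) + \cdots + P(B_k \text{ and } A).$$
As was seen earlier, the joint probability $P(B_i \text{ and } A) = P(A \mid B_i)P(B_i)$. So
$$P(A) = P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2) + \cdots + P(A \mid B_k)P(B_k).$$
Substituting in the definitional equation for the conditional probability,
we obtain
$$P(B_i \mid A) = \frac{P(A \mid B_i)P(B_i)}{P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2) + \cdots + P(A \mid B_k)P(B_k)}.$$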
This is Bayes' theorem.
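As a minimal sketch of the theorem in computational form, the function below takes the prior probabilities P(B_i) and the conditional probabilities P(A|B_i) for a partition and returns every P(B_i|A). The function name and the three-event numbers are hypothetical, chosen only to exercise the formula.

```python
from typing import Sequence


def bayes_posterior(priors: Sequence[float], likelihoods: Sequence[float]) -> list[float]:
    """Return P(B_i | A) for each event B_i in a partition.

    priors[i] is P(B_i); likelihoods[i] is P(A | B_i).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors over a partition must sum to 1")
    joints = [p * l for p, l in zip(priors, likelihoods)]  # P(B_i and A)
    p_a = sum(joints)  # P(A), summed over the partition
    if p_a == 0:
        raise ValueError("P(A) is zero, so P(B_i | A) is undefined")
    return [j / p_a for j in joints]


# Hypothetical three-event partition, used only as an illustration.
print(bayes_posterior([0.5, 0.3, 0.2], [0.1, 0.4, 0.7]))
```

The denominator is computed once and reused, which mirrors the derivation: P(A) is the same for every B_i, so the posteriors differ only through the joint probabilities in the numerator.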
Probability and Statistical Inference
This theorem is basic to the approach to statistical and decision
problems known as Bayesian statistics. But what is distinctive (and con­
troversial) in the Bayesian approach is the subjective assignment of
probabilities to nonrecurring events. Bayes' theorem is used to calculate
how new information modifies these probabilities.
Basic definitions and rules for the calculation of probabilities have
been presented and applied through examples. The organization of data
to provide a reasonable test of a probabilistic hypothesis has also been
demonstrated.
[Table: annual death rate by cause of death. Rows: cancer of lung, emphysema, cirrhosis of the liver, cancer of rectum, influenza and pneumonia, all other causes, totals. Values not recovered.]
PROBLEMS
2.1. Consider an ordinary deck of playing cards. What is the probability
of drawing, in a random drawing of a single card,
a) a spade
b) an ace
c) a face card (jack, queen, or king) or a diamond?
2.2. Consult Table 2-1.
a) What percentage of males 16 years old and over was in the labor
force in May 1970? This is known as a labor force participation
rate.
b) What was the probability that a male teen-ager (16-19) not in the
labor force in May 1970 was going to school?
c) What was the labor force participation rate for male teen-agers
(16-19) in May 1970?
d) What was the labor force participation rate for white male teen-agers
(16-19) in May 1970? Was labor force participation of male
teen-agers independent of race?
2.3. A randomly chosen group of 383,000 persons has been observed for
a year. Of the group, 249,000 are cigarette smokers and 134,000 are
nonsmokers. Deaths during the year are reported in Table 2-3.
a) Estimate the probability that a randomly selected smoker will
die of lung cancer within a year.
b) Estimate the probability that a randomly selected nonsmoker will
die of lung cancer within a year.
c) Estimate the probability that a randomly selected person is a
cigarette smoker.
d) Estimate the probability that a person who has died of lung cancer
was a cigarette smoker.
e) Estimate the probab
of the liver was a cil
f) Estimate the probai;
was a cigarette smo)
g) Is the event that a ~
independent of the
2.4. A coin is biased so
If the coin is tossed
exactly 2 tails?
2.5. Suppose two cards
placed side-by-side
of the second.
a) What is the
b) What is the
c) What is the
d) What is the
e) If the first card is
the probability
2.6. Find the probability
dice is 2, 3, 4, . . . ,
What is the probabilt
2.7. Here are the rules
dice. Only the
total of 7 or 11 on
is 2, 3, or 12, he
first throw is a
b) Find the probability that he wins on his first throw.
c) Find the probability that he wins given that his point is 10.
d) Find the probability that his first throw is 6 and he wins.
e) Find the probability that he wins.
2.8. Two balls are to be drawn from an urn containing 5 red and 3 black
balls.
a) What is the probability that both balls will be red?
b) What is the probability that the first ball will be red and the
second ball black?
c) What is the probability that one ball will be red and the other
black?
d) If the first ball is replaced before the second drawing, what is
the probability that both balls drawn will be red?
2.10. For a recent year only 15 percent of couples applying for divorce
had three or more children. Is this evidence to support the contention
that "children hold marriages together"? If so, explain the
evidence. If not, explain why not and describe some data that might
give better evidence.
2.11. Suppose Mr. Jones chooses at random one of the integers 1, 2, or 3.
Then he throws as many dice as indicated by the chosen number.
What is the probability that he will score a total of 4 points?
2.12. A fair die was thrown twice.
a) What is the probability that the first throw yielded a 5 given that
the sum of the two throws was 6?
b) What is the probability that the sum of the two throws was 5
given that the first throw yielded an even number?
Probability has been
in the sample space of i
outcomes may take
measurement of height
pair of dice; a complel
family, number of roOl
education, occupation a
these examples suggest,
or include numerical in
the outcome in a numeJ
is male, 0 if female.
It is also worthwhi
may be described in m;
interest. For instance, t:
tion on family income, !
then interviewing the p~
be considered to be the ,
of the outcome might in
the person's name. Infc
ployment status might b
code. Similarly, the oub
full box score, by the fi
done in Example 2-3), (
a loss).
Suppose a test for a rare blood disease is known to be 95 percent reliable. In other words,
if an individual has the disease, the test shows it with probability 0.95; when the disease
is not present, the test correctly shows its absence 95 percent of the time.
Suppose the test is given to a large population, 1 percent of whom have the disease.
What fraction of those whom the test shows to have the disease will actually have it?
Answer: 0.0095 / 0.0590 (about 0.161)
B1 = they have the disease
B2 = healthy - they don't have the disease
A = test positive
P(B1) = 0.01
P(B2) = 0.99
P(A|B1) = 0.95
P(A|B2) = 0.05
P(B1|A) = P(A and B1) / P(A)
P(A and B1) = P(A|B1)P(B1) = 0.95 * 0.01
= 0.0095
P(A) = P(A and B1) + P(A and B2)
= P(A|B1)P(B1) + P(A|B2)P(B2)
= (0.95)*(0.01) + (0.05)*(0.99)
= 0.059
P(B1|A) = P(A and B1) / P(A)
= 0.0095 / 0.059
(about 0.161)
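The same answer can be read off from head counts. The sketch below assumes a hypothetical population of 100,000 people (the size is arbitrary; only the 1 percent prevalence and the two 95 percent figures come from the problem) and counts the expected true and false positives.

```python
# Hypothetical population size; any size gives the same ratio.
POPULATION = 100_000


def positive_counts(population: int) -> tuple[float, float]:
    """Return (true positives, false positives) expected in the population."""
    diseased = 0.01 * population      # P(B1) = 0.01
    healthy = population - diseased   # P(B2) = 0.99
    true_positives = 0.95 * diseased  # P(A given B1) = 0.95
    false_positives = 0.05 * healthy  # P(A given B2) = 0.05
    return true_positives, false_positives


true_pos, false_pos = positive_counts(POPULATION)
fraction_diseased = true_pos / (true_pos + false_pos)
print(round(true_pos), round(false_pos), round(fraction_diseased, 3))
```

Roughly 950 true positives are swamped by roughly 4,950 false positives, which is why only about 16 percent of positive results belong to people who have the disease.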
The conditional probability of A given B is defined whenever $P(B) > 0$ and is
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}. \tag{2.1}$$
Similarly, if $P(A) > 0$, then
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}. \tag{2.2}$$
Rearranging (2.1) and (2.2) gives the very useful multiplicative laws:
$$P(A \cap B) = P(A \mid B)P(B) = P(B \mid A)P(A). \tag{2.3}$$
2.2.1 Independence
The events $A_1, \ldots, A_n$ are independent if for any $1 \le i_1 < \cdots < i_k \le n$,
$$P\{A_{i_1} \cap \cdots \cap A_{i_k}\} = P\{A_{i_1}\} \cdots P\{A_{i_k}\}.$$
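The definition requires the product rule to hold for every subcollection of the events, not just for pairs. The sketch below illustrates this by brute-force enumeration on a hypothetical sample space of two fair dice; the three events and all function names are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

# Sample space: ordered outcomes of two fair dice, all equally likely.
OUTCOMES = list(product(range(1, 7), repeat=2))


def prob(*events) -> Fraction:
    """Probability of the intersection of events (predicates on an outcome)."""
    favorable = sum(1 for o in OUTCOMES if all(e(o) for e in events))
    return Fraction(favorable, len(OUTCOMES))


def independent(events) -> bool:
    """True if every subcollection of two or more events obeys the product rule."""
    return all(
        prob(*subset) == prod(prob(e) for e in subset)
        for k in range(2, len(events) + 1)
        for subset in combinations(events, k)
    )


def first_even(o):
    return o[0] % 2 == 0


def second_even(o):
    return o[1] % 2 == 0


def sum_even(o):
    return (o[0] + o[1]) % 2 == 0


print(independent([first_even, second_even]))            # True
print(independent([first_even, second_even, sum_even]))  # False
```

Each pair of these three events satisfies the product rule, but all three together do not: when both dice are even the sum is certainly even, so the triple intersection has probability 1/4 rather than 1/8.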
2.2.2 Bayes' law
Suppose that $B_1, \ldots, B_K$ is a partition of S, meaning that $B_i \cap B_j = \emptyset$ if $i \ne j$
and $B_1 \cup B_2 \cup \cdots \cup B_K = S$. Then for any set A, we have that
$$A = (A \cap B_1) \cup \cdots \cup (A \cap B_K),$$
and therefore
$$P(A) = P(A \cap B_1) + \cdots + P(A \cap B_K). \tag{2.4}$$
It follows from (2.2) through (2.4) that
$$P(B_j \mid A) = \frac{P(A \mid B_j)P(B_j)}{P(A)} = \frac{P(A \mid B_j)P(B_j)}{P(A \mid B_1)P(B_1) + \cdots + P(A \mid B_K)P(B_K)}. \tag{2.5}$$
Equation (2.5) is called Bayes' law, also known as Bayes' rule or Bayes'
theorem. Bayes' law is a simple, almost trivial, mathematical result, but its
implications are profound. In fact, there is an entire branch of statistics, called
Bayesian statistics, that is based upon Bayes' law and is now playing a very
wide role in applied statistics. The importance of Bayes' law comes from its
usefulness when updating probabilities. Here is an example, one that is too
simple to be realistic but that illustrates the basic idea behind applying Bayes'
law.
2.3 Probability Distributions
2.3.1 Random variables
A quantity such as the
many possible values, b~
such quantities random
variable and the proba
probability distribution
² See Edwards (1982).
nAc
from these two basic
0,1 it follows from
Example 2.1. Suppose that our prior knowledge about a stock indicates that
the probability the price will rise on any given day, which we denote by $\theta$, is
either 0.4 or 0.6. Based upon past data, say from similar stocks, we believe
that $\theta$ is equally likely to be 0.4 or 0.6. Thus, we have the prior probabilities
$$P(\theta = 0.4) = 0.5 \quad \text{and} \quad P(\theta = 0.6) = 0.5.$$
We observe the stock for three consecutive days and its price rises on all three
days. Assume that the price changes are independent across days so that the
probability that the price rises on each of three consecutive days is $\theta^3$. Given
this further information, we may suspect that $\theta = 0.6$, not 0.4. Therefore the
probability that $\theta$ is 0.6, given three consecutive price increases, should be
greater than the prior probability of 0.5, but how much greater? As notation,
let A be the event that the price rises on three consecutive days. Then, using
Bayes' law we have
$$P(\theta = 0.6 \mid A) = \frac{P(A \mid \theta = 0.6)P(\theta = 0.6)}{P(A \mid \theta = 0.6)P(\theta = 0.6) + P(A \mid \theta = 0.4)P(\theta = 0.4)}$$
$$= \frac{(0.6)^3(0.5)}{(0.6)^3(0.5) + (0.4)^3(0.5)} = \frac{(0.6)^3}{(0.6)^3 + (0.4)^3} = \frac{0.2160}{0.2160 + 0.0640} = 0.7714.$$
Thus, our probability that $\theta$ is 0.6 was 0.5 before we observed three consecutive
price increases but is 0.7714 after observing this event. Probabilities
before observing data are called the prior probabilities and the probabilities
conditional on observed data are called the posterior probabilities, so the prior
probability that $\theta$ equals 0.6 is 0.5 and the posterior probability is 0.7714.
Bayes' law is so important because it tells us exactly how to update our
beliefs in light of new information. Revising beliefs after receiving additional
information is something that humans do poorly without the help of
mathematics.² There is a human tendency to put either too little or too much
emphasis on new information, but this problem can be mitigated by using
Bayes' law for guidance.
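The updating in Example 2.1 can also be carried out one day at a time, with each day's posterior serving as the next day's prior. The sketch below relies on the example's assumption that price changes are independent across days; the function and variable names are illustrative.

```python
def update(prior: dict[float, float], price_rose: bool) -> dict[float, float]:
    """One application of Bayes' law after observing a single day."""
    unnormalized = {
        theta: p * (theta if price_rose else 1 - theta)  # P(day's outcome | theta) * prior
        for theta, p in prior.items()
    }
    total = sum(unnormalized.values())  # probability of the day's outcome
    return {theta: u / total for theta, u in unnormalized.items()}


# Prior from the example: theta is equally likely to be 0.4 or 0.6.
belief = {0.4: 0.5, 0.6: 0.5}
for day in range(1, 4):
    belief = update(belief, price_rose=True)
    print(day, round(belief[0.6], 4))
```

After the third rise the probability that theta equals 0.6 agrees with the 0.7714 obtained in one step from the three-day event, since the three daily likelihoods multiply to the same product.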
EXERCISES
Applications
6.25 Worked. Objective: To explore the implications of conditional probability on the interpretation of test results, especially for rare events.
Tests for diseases such as AIDS involve errors.
There are false positives and false negatives. With
false positives, the test indicates that an individual has AIDS when he or she does not, and the
false negative indicates that a tested individual
does not have AIDS when in fact he or she does.
Let us consider the consequences of the first
problem, false positives, when we contemplate
testing a large class of people, or even the entire
population. Suppose that 100 million people are
tested. What information have we been given?
pr(test + |NoAIDS) = .05, pr(test–|AIDS) = .01,
and pr(AIDS) = .005. So, in a population of 100
million, we are looking for 500,000 people. The
test is presumed to be quite accurate in that there
is only a 5% error for false positives and a 1%
error for false negatives.
What do we want to know? What is the probability
of having AIDS, given that a person has tested
positive? We will use our concepts of conditional
probability extensively in answering this question.
We want pr(AIDS|test+). We derive
pr(AIDS|test+) = pr(AIDS, test+) / pr(test+)
And we can break up the pr(test+) into its constituent states by
pr(test+) = pr(AIDS, test+) + pr(NoAIDS, test+)
where we recognize that the states AIDS and
NoAIDS partition the state “test +.” We can test
positive if we have AIDS and if we do not have
AIDS, and these two states are mutually exclusive.
If we can work out these last two joint probabilities given our information, we will have the problem solved. We define
pr(AIDS, test+) = pr(test + |AIDS)pr(AIDS)
pr(NoAIDS, test+) = pr(test + |NoAIDS)pr(NoAIDS)
We have been given the information we need;
pr(AIDS) is .005, so that pr(NoAIDS) is .995.
We know that pr(test + |AIDS) is .99, because
pr(test − |AIDS) is .01. It is also given that
pr(test + |NoAIDS) is .05. We can now calculate
that
pr(AIDS, test+) = .99 × .005 = .00495
pr(NoAIDS, test+) = .05 × .995 = .04975
We can now solve our original problem:
pr(AIDS|test+) = .00495 / (.00495 + .04975) = .0905
This result, that the probability of having AIDS
given a positive reading is only about 9%, may be
surprising at first sight, but if you experiment with the
calculations a bit you will begin to see the logic of the
situation. The bigger the probability of AIDS in the
first place, the bigger the probability given the test. So
the surprising nature of the result is due to the relatively
low probability of AIDS in the whole population.
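One way to experiment with the calculations is to hold the two error rates fixed and vary the probability of AIDS in the population. In the sketch below the error rates .05 and .01 are those of the exercise; the function name and the particular prevalence values other than .005 and .2 are illustrative assumptions.

```python
def prob_aids_given_positive(prevalence: float,
                             false_positive: float = 0.05,
                             false_negative: float = 0.01) -> float:
    """pr(AIDS | test+) from the prevalence and the two error rates."""
    true_pos = (1 - false_negative) * prevalence   # pr(AIDS, test+)
    false_pos = false_positive * (1 - prevalence)  # pr(NoAIDS, test+)
    return true_pos / (true_pos + false_pos)


# Illustrative prevalences; .005 is the value used in the exercise.
for prevalence in (0.001, 0.005, 0.05, 0.2, 0.5):
    print(f"{prevalence:.3f}  {prob_aids_given_positive(prevalence):.4f}")
```

The printed column rises steadily with the prevalence, which is the pattern the paragraph above describes.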
6.26 Given the facts stated in Exercise 6.25, determine the probability of not having AIDS, given that
you have tested negatively. Rework both exercises if
the probability of AIDS in the population is .2. Draw
some policy conclusions about testing schemes from
these calculations.
Exr 6.25 page 209
total population 100,000,000
500,000 have AIDS
99,500,000 do not have AIDS
B1 = they have the disease (AIDS)
B2 = healthy - they don't have AIDS
A = test positive
P(B1) = 0.005
P(B2) = 0.995
P(A|B2) = 0.05
P(A|B1) = 0.99 [P(negative|B1) = 0.01]
FIND P(AIDS|TEST POSITIVE):
P(B1|A) = P(A and B1) / P(A)
P(A and B1) = P(A|B1)P(B1)
= 0.99 * 0.005
= 0.00495
P(A) = P(A and B1) + P(A and B2)
= P(A|B1)P(B1) + P(A|B2)P(B2)
= (0.99)*(0.005) + (0.05)*(0.995)
= 0.00495 + 0.04975
= 0.0547
P(B1|A) = P(A and B1) / P(A)
= 0.00495 / 0.0547
= 0.0904936
Exr 6.26 page 209
total population 100,000,000
500,000 have AIDS
99,500,000 do not have AIDS
B1 = they have the disease (AIDS)
B2 = healthy - they don't have AIDS
N = test negative
P(B1) = 0.005
P(B2) = 0.995
P(N|B2) = 0.95 [P(negative|B2) = 0.95]
P(N|B1) = 0.01 [P(negative|B1) = 0.01]
FIND P(NO AIDS|TEST NEGATIVE):
P(B2|N) = P(N and B2) / P(N)
P(N and B2) = P(N|B2)P(B2)
= 0.95 * 0.995
= 0.94525
P(N) = P(N and B1) + P(N and B2)
= P(N|B1)P(B1) + P(N|B2)P(B2)
= (0.01)*(0.005) + (0.95)*(0.995)
= 0.00005 + 0.94525
= 0.9453
P(B2|N) = P(N and B2) / P(N)
= 0.94525 / 0.9453
= 0.9999471
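Exercise 6.26 also asks for both exercises to be reworked with the probability of AIDS in the population set to .2. A minimal sketch of that rework follows; it builds the four joint probabilities from the error rates given in Exercise 6.25 and then conditions on the test result. The function names and the table layout are illustrative assumptions.

```python
def joint_table(prevalence: float,
                false_positive: float = 0.05,
                false_negative: float = 0.01) -> dict[tuple[str, str], float]:
    """Joint probabilities pr(state, test result) for the four combinations."""
    return {
        ("AIDS", "+"): (1 - false_negative) * prevalence,
        ("AIDS", "-"): false_negative * prevalence,
        ("NoAIDS", "+"): false_positive * (1 - prevalence),
        ("NoAIDS", "-"): (1 - false_positive) * (1 - prevalence),
    }


def conditional(table: dict[tuple[str, str], float], state: str, result: str) -> float:
    """pr(state | result): a joint probability divided by pr(result)."""
    p_result = sum(p for (_, r), p in table.items() if r == result)
    return table[(state, result)] / p_result


for prevalence in (0.005, 0.2):
    table = joint_table(prevalence)
    print(prevalence,
          round(conditional(table, "AIDS", "+"), 4),
          round(conditional(table, "NoAIDS", "-"), 4))
```

With the prevalence at .2 the probability of AIDS given a positive test rises to about .83, while the probability of no AIDS given a negative test falls slightly to about .997.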