Stat I Notes on Contingency Tables and Bayes Theorem
Prof. Vinod, April 2015
Let us measure “Political affiliation” along columns and “attitude toward
federal healthcare” along the rows of a Contingency Table. We use abbreviations:
D=democrat, I=indep., R=republican, F =favors federal healthcare, N=does not
favor a federal role. (file http://www.fordham.edu/economics/vinod/st1cntn.doc )
A survey of 671 persons yielded the following information, which is readily
tabulated in a 2 by 3 setup as follows.
          D     I     R    Total
F       161    40   130     331
N       110    40   190     340
Total   271    80   320     671 = GT = grand total
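The table and its margins can be reproduced in R as a quick check (a minimal sketch; the object name tab is ours):
#R commands
tab = matrix(c(161,110, 40,40, 130,190), nrow=2,
             dimnames=list(c("F","N"), c("D","I","R")))
addmargins(tab)   # appends row totals, column totals, and the grand total 671
prop.table(tab)   # divides every cell by GT, giving the joint probabilities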
The learning objective is to answer the following types of questions.
QUIZ: Please answer the following questions (answers are given later in this file):
1) Find the probability that a randomly chosen person is Republican
1b) Find the probability that a randomly chosen person is either an Independent OR favors federal
healthcare (Hint: I or F, addition rule; Numerator = 161+40+130+40)
2) Find conditional probability P( F| I)
3) Give the formula for a test of statistical independence.
Are Political affiliation and attitude toward federal healthcare statistically independent?
Note that we compute the row and column totals and the grand total (GT).
These are needed in the probability computations below. This is a contingency
table because a person's attitude toward federal healthcare may be contingent
upon (depend on) his or her political affiliation.
Probability of Event A:
P(A) = (number of simple events in A) / (total number of simple events in the sample space)
Find the probability of a column characteristic
D denotes the set along the first column (being a democrat).
P(D) =271/671 = 0.40387 =0.4039 when rounded to four places.
The P(D) is computed by dividing the column total at bottom margin by the GT.
Find the probability of a row characteristic (being in favor of fed. hlthcare)
P(F) = 331/671 = 0.49329 = 0.4933 (rounded to 4 places)
The P(F) is computed by dividing the row total at the right margin by the GT.
Probability of a Complement: P(A^c) = 1 - P(A)
Similarly P(N) = 340/671. In fact we also have P(F) + P(N) = 1, since N is the
complement of F.
Note that the column/row totals are placed along the margins of the table.
Probabilities computed from margins of contingency table are called
marginal or unconditional probabilities.
Addition Rule for probability from the contingency table.
The addition rule is for computing the probability of a union of 2 sets
(being a democrat OR being an Independent, i.e., the union of D and I).
P(I ∪ D) = P(D) + P(I) - P(D ∩ I)
Since D and I are mutually exclusive columns (nobody is both), P(D ∩ I) = 0, so
P(I or D) = [(161+110)/GT] + [(40+40)/GT] - 0 = 351/GT = 0.5231
Union of Mutually Exclusive Events: P(A ∪ B) = P(A) + P(B)
Addition Rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
We apply this to Quiz problem 1b as follows.
P(I or F) = P(I) + P(F) - P(I ∩ F) = (80/GT) + (331/GT) - (40/GT) = 371/GT
Probability of Intersection of two sets. This is the probability of belonging
to two sets simultaneously. This is also called the joint probability.
We used it above for P(D ∩ I) and also for P(I ∩ F).
In a contingency table the numerator for a joint probability comes from the
main body of the table (not the margins).
For example, let the two sets be D (first column) and F (first row).
Now the joint probability is simply an element from the body of the
contingency table (not margin) divided by the GT.
P(D and F) = P(D ∩ F) = 161/GT =0.23994 or 0.2399
Definition of Conditional probability. Here we restrict attention to
only a subset of sample space where some condition is satisfied:
The condition is described by one or more sets (rows or columns). When we
restrict attention to a set, the denominator in probability calculations
becomes the number of elements in that set (the total for that row or column)
instead of the grand total.
Example: Let the condition be that we restrict attention to set R
for republicans. There are 320 republicans in the data. What is
the probability (pr) of F "favoring federal healthcare" among these folks?
There are 130 Republicans who favor federal healthcare. Hence by
direct computation the conditional probability of F given R is
130/320 or 0.4063. Instead of doing this kind of tedious conditioning
argument, it is easier to use a general formula for conditional probability.
Conditional probability = joint pr. / (marginal pr. for the specified condition)
FORMULA: conditional = Joint / Marginal
Conditional Probability: P(A | B) = P(A ∩ B) / P(B)
The word "conditional" is often also described by the word "given" and written
as a vertical bar. P(F | R) is pronounced as Prob. of (F given R).
For our example, let us verify that the formula gives the correct result.
The two sets under consideration are F (first row) and R (third column).
P(F given R)= Joint/marginal, where
Joint=P(F and R)=P(F ∩ R)=130/GT
and Marginal prob.=P(R)=(130+190)/GT .
(Joint / Marginal) =130/320=0.40625 or 0.4063.
Thus we verified that the conditional prob. formula is right for this example.
Example 2: find conditional pr. of being indep. given that we restrict
attention to folks who do NOT favor federal healthcare (set N, second row)
P(I given N) = Joint/marginal for second row
Joint=40/GT and Marginal=(110+40+190)/GT. Now P(I|N) = 40/340 = 0.1176
Example 3: find conditional pr. of D given the person favors fed healthcare.
P(D given F)= P(D |F)=Jt/marg with Joint=161/GT,
Marginal=(161+40+130)/GT. Hence P(D|F)=161/331=0.486
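All three conditional probabilities can be checked in R by dividing a body cell by the appropriate margin (a sketch; we re-enter the table as the matrix tab):
#R commands
tab = matrix(c(161,110, 40,40, 130,190), nrow=2,
             dimnames=list(c("F","N"), c("D","I","R")))
tab["F","R"]/sum(tab[,"R"])   # P(F|R) = 130/320 = 0.40625
tab["N","I"]/sum(tab["N",])   # P(I|N) = 40/340  = 0.1176
tab["F","D"]/sum(tab["F",])   # P(D|F) = 161/331 = 0.4864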
STATISTICAL INDEPENDENCE
FROM CONTINGENCY TABLES
Are the column characteristic and row characteristic statistically independent?
In our example, we are considering whether Political affiliation and attitude
toward federal healthcare are statistically independent.
To answer this, we have to make the following formal test:
We pick any one row from the set of rows (say F) and one column from the set of
columns, say D.
The formal test is to check if conditional probability equals the
unconditional (marginal) probability so that the effect of the condition is
moot. It is intuitively clear that if the condition is moot we have
independence.
Then for our choice of D and F the test for independence is:
P(D|F) =? P(D)
P(D|F) = 161/331 = 0.486. Is this equal to P(D) = 0.404? The answer is NO.
This means row and column characteristics are NOT statistically independent.
Since P(D given F)=0.486 is not equal to P(D)=0.404 the row characteristic and
column characteristic are dependent. We reject independence.
In general, the left side of the test has conditional probability of some column
given some row and the right side of the test is probability of some column
irrespective of row condition.
To summarize, having chosen D and F, the formal test is: P(D|F) =? P(D)
Let us think about the formula a bit more. There are two things on the left,
D and F. Note that on the right hand side we must choose the symbol before the
conditioning bar, here D!
If both probabilities are numerically equal (a very rare thing) we
satisfy the formal test and conclude that there is independence
between row and column characteristics.
At the risk of confusing you, consider what happens if we switch the
conditioning. An equivalent test is to check if P(F|D)=? P(F)
Notice we have the symbol F before the | conditioning symbol. Hence we must
choose P(F) on the right side of the testing equation. Verify that we reject
independence here.
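The same comparison is easy in R. As a side note, R's built-in chisq.test performs the classical chi-squared test of independence, which is not covered in these notes but reaches the same verdict (a sketch):
#R commands
tab = matrix(c(161,110, 40,40, 130,190), nrow=2,
             dimnames=list(c("F","N"), c("D","I","R")))
pD.given.F = tab["F","D"]/sum(tab["F",])  # 0.4864
pD = sum(tab[,"D"])/sum(tab)              # 0.4039, not equal: reject independence
chisq.test(tab)   # very small p-value, so dependence is confirmed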
EXAMPLE WHERE ROW AND COLUMN ARE STATISTICALLY INDEPENDENT.
Take another example: B= baby is happy, M=mother is present
U= baby is unhappy, F= father is present.
The baby was observed for 100 hours
          M     F   Total
B        30    30      60
U        20    20      40
Total    50    50     100 = GT = grand total
Is the baby independent of the mother? Yes if P(B given M) = P(B)
Check if the baby is happy whether the mother is present or not!
The conditional probability P(B given M) = joint/marginal= (30/GT) / (50/GT).
i.e., (30/50)= 0.6
Now the unconditional (marginal) probability that the baby is happy is 60/GT
This is also 0.6
Conclude that conditional probability P(B given M) is exactly the same as
unconditional or marginal probability P(B) NUMERICALLY.
Hence the baby is statistically independent of the mother.
ROW characteristic is independent of column characteristic.
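The same check in R (a sketch; the object name baby is ours):
#R commands
baby = matrix(c(30,20, 30,20), nrow=2,
              dimnames=list(c("B","U"), c("M","F")))
baby["B","M"]/sum(baby[,"M"])   # P(B|M) = 30/50 = 0.6
sum(baby["B",])/sum(baby)       # P(B) = 60/100 = 0.6, equal: independent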
Independent Events Rule: P(A | B) = P(A) or P(B | A) = P(B)
QUIZ ANSWERS:
1) Find the probability that a randomly chosen person is Republican:
320/671 = 0.4769
1b) P(F or I) = P(F)+P(I)-P(F and I)
=(331/GT)+(80/GT)-(40/GT)=371/GT
2) Find conditional probability P( F|I) = Joint / marginal , joint= (40/GT), marginal = 80/GT
Cancel GT to yield P(F|I)=40/80=0.5
3) Give the formula for a test of statistical independence: P(F|I) =? P(F)
Are Political affiliation and attitude toward federal healthcare statistically independent? No, since P(F|I) ≠ P(F),
where P(F) = 331/671 = 0.49329 or 0.4933. This is clearly not 0.5, so political affiliation matters; they
are Dependent.
BAYES THEOREM. A good source these days is:
http://en.wikipedia.org/wiki/Bayes'_theorem
Example 1 statement of the problem
A city XYZ has reported 6 in 1000 HIV-positive cases. A person from that city was tested for
HIV and the test was positive. The HIV test is known to be subject to two types of errors. False
negative (person declared as HIV negative even though he really has HIV) error rate is 1 %. The
False positive (person declared as HIV positive even though he really is free of HIV) error rate is 1
in 1000. What is the probability that he has HIV given that he tested positive on the test with these
known error rates?
H1=being HIV positive H2=being HIV negative. These are competing hypotheses.
Since the person is from city XYZ we can assume that there is P(H1)=0.006 probability that he is
HIV positive. (This is the prior probability, or what frequentists call the prejudice of the researcher.)
What about evidence or events? Define E1 = test is positive, E2 = test is negative.
We know the error rates of the tests, which are the following conditional probabilities.
Given that the person has HIV, the probability that the test wrongly declares that he is disease-free
(false negative) happens 1 in 100 times. That is P(E2|H1)=0.01
Given that the person has no HIV, the probability that the test wrongly declares that he has the
disease (false positive) happens 1 in 1000 times. That is P(E1|H2)=0.001
This is all the information we have and we have to compute posterior probability that the person
from city XYZ has HIV knowing that he tested positive. Find P(H1|E1)
I suggest a setup with a long horizontal line, above which is H1 and below which is H2.
Note how things must add up to 1: the priors, the conditionals above the line, and the conditionals
below the line. Finally the two posteriors should also add up to unity: P(H1|E1)+P(H2|E1)=1
P(H1)=0.006
P(E1 | H1)= 0.99
P(E2| H1) = 0.01
------------------------------------------------------------------long horizontal line
P(H2)=0.994
P(E1| H2) = 0.001
P(E2 |H2) = 0.999
Bayes Theorem says that
P(H1 | E1) = [P(E1 | H1) * P(H1)] / [Σ_{i=1}^{k} P(E1 | Hi) * P(Hi)],
where k is the number of alternative hypotheses; here k = 2.
numerator of Bayes theorem right side= P(E1|H1)*P(H1) = 0.99*0.006
#R command
num=0.99*0.006
#(=0.00594)
Note that the first term of the denominator is the same as the numerator
The second term is obtained by replacing H1 by H2 while keeping E1 the same!
P(E1|H2)*P(H2)=0.001*0.994.
#R command to compute second term in the denominator
den2=0.001*0.994 #(=0.000994)
The posterior probability answer for H1 by Bayes theorem then is
0.00594/ (0.00594+0.000994)= 0.00594/ 0.006934
P(H1|E1)= 0.8566484
#R command
num/(num+den2)
The second posterior probability for H2 or P(H2|E1) by the Bayes theorem then is
#R command
den2/(num+den2) # 0.1433516
Verify that the two posteriors should also add up to unity: P(H1|E1)+P(H2|E1)=1
0.8566484 + 0.1433516 = 1
Let ∝ denote "proportional to"; then in words (upon ignoring the
denominator) Bayes theorem says:
(Posterior probability) ∝ (Prior probability)*(Likelihood),
where the posterior probability is the revised probability. The prior probability is always the
probability of a hypothesis, and the likelihood is the probability of an event given
that hypothesis.
Note that the likelihood is observable (objective), while the prior probability can be
subjective (prejudice). Some scientists, the so-called frequentists, think that we
should not revise probabilities obtained from objective data. Bayesians on the
other hand argue that only the revised probabilities are reliable.
Derivation of Bayes theorem:
The following left side (LHS) equals right side (RHS) by definition of
conditional probability. We fix E1 as the event of interest and ignore E2.
P(H1 | E1) = Joint(H1 & E1) / Marginal(E1) = P(H1 ∩ E1) / P(E1)    (1)
Multiply both sides by P(E1) to yield
P(H1 & E1) = P(H1 | E1) * P(E1)    (2)
Conversely, the other conditional probability, the so-called likelihood, is
P(E1 | H1) = Joint(H1 & E1) / Marginal(H1) = P(H1 ∩ E1) / P(H1)    (3)
Now multiply both sides of (3) by P(H1) to yield
P(H1 & E1) = P(E1 | H1) * P(H1)    (4)
Note that the left sides of (2) and (4) are exactly the same, so we can equate the
right sides also to yield
P(H1 | E1) * P(E1) = P(E1 | H1) * P(H1)    (5)
Note that the right side of (5) is the numerator of Bayes Theorem. Left side of
(5) has 2 terms of which first is the left side of Bayes Theorem. So we are
almost there in proving it, provided P(E1) equals the denominator of Bayes
Theorem.
Rewrite (5) as:
P(H1 | E1) = [P(E1 | H1) * P(H1)] / P(E1),    (6)
where the left side is the posterior probability of H1 given event E1. For philosophical
discussion it is appropriate to ignore the denominator on the right hand side of (6), and write the
theorem by stating that the left side is proportional to (∝) the numerator of the right hand side.
Then Bayes theorem states
posterior = P(H1|E1) ∝ P(E1|H1) * P(H1) = Likelihood * Prior.    (6b)
For numerical computation of posterior probabilities one does need the
denominator P(E1) and it is derived and explained next. The contingency table
by definition is:
            E1          E2          Row total
H1          P(H1&E1)    P(H1&E2)    P(H1)
H2          P(H2&E1)    P(H2&E2)    P(H2)
Col Total   P(E1)       P(E2)       1
Inserting numerical values from the HIV example we have the table below. In filling it
we use the formula joint prob. = (conditional prob.) * (marginal prob.)
            E1          E2          Row total
H1          0.00594     0.00006     0.006
H2          0.000994    0.993006    0.994
Col Total   0.006934    0.993066    1
The following is seen from the column entitled E1 in the contingency table.
P(E1) = P(E1 & H1) + P(E1 & H2), a sum of joint probabilities. We can split
each joint probability in that column into a conditional probability times a
marginal probability (again by definition of conditional prob.):
P( E1 & H1)= P(E1| H1)* P(H1)
P( E1 & H2)= P(E1| H2)* P(H2)
Hence the denominator of Bayes Theorem is written as a summation with as many terms
as there are hypotheses. Here there are two hypotheses, so we have:
P(E1) = P(E1 | H1)*P(H1) + P(E1 | H2)*P(H2) = Σ_i P(E1 | Hi) * P(Hi)    (7)
This finishes the derivation of the denominator. Now substituting (7) in (6) we have the usual form of
Bayes theorem:
P(H1 | E1) = [P(E1 | H1) * P(H1)] / [Σ_{i=1}^{2} P(E1 | Hi) * P(Hi)]    (8)
QED PROOF completed.
Without loss of generality this theorem can also be proved for the other conditional probability of
the hypothesis H2 as
P(H2 | E1) = [P(E1 | H2) * P(H2)] / [Σ_{i=1}^{2} P(E1 | Hi) * P(Hi)]    (9)
Example 1 (Salesman)
There is a 0.4 probability that a salesman will be sent by Morgan Co. to call on Mr. Smith. If
Morgan does not send a salesman, there is a 0.7 probability that Smith will buy paper from
Morgan’s competitor Xerox Co. If on the other hand Morgan does send a salesman, there is only
0.2 probability that Smith will buy from Xerox. If Smith does buy from Xerox, what is the
probability that Morgan did not send a salesman?
ANSWER: EVENT E1 IS THAT SMITH BUYS FROM XEROX (Comes from the last sentence)
E2 is that Smith buys from Morgan
THERE ARE TWO HYPOTHESES INVOLVED: H1: MORGAN sent a salesman
H2: MORGAN did not send a salesman
We are asked to find the probability of H2 given that event E1 occurred, i.e., P(H2 | E1).
Since we are looking for the probability of a hypothesis we need the Bayes theorem as in equation
(9), because here we are asked to find the conditional probability P(H2|E1). In general, the right
hand sides of equations (8) and (9) show that BAYES THEOREM computations need two kinds of
probabilities: (i) Prior probabilities of the exhaustive set of all hypotheses,
AND
(ii) Conditional probabilities of the event E1 given the hypotheses, P(E1|H1) and
P(E1|H2). However, the conditional probability for E2 (the opposite of event E1) does not appear at all
in Bayes formulas (8) or (9). Once we compute (i) and (ii), we are ready to apply Bayes theorem.
LET US COMPUTE THE prior PROBABILITIES FIRST:
P(H1)=0.4 from the FIRST SENTENCE OF THE PROBLEM.
P(H2)=0.6 BECAUSE THIS IS A COMPLEMENT OF 0.4.
NOTE THAT 0.4+0.6=1.0
H1 AND H2 are mutually exclusive and exhaustive
P(H1) + P(H2) must be a certainty, i.e. its probability must be 1.0
I like to organize the information as follows, separately for the two hypotheses, with a long
horizontal line separating them. The probabilities conditional on the two hypotheses
are then conveniently written separated by the long horizontal line. The conditional probabilities
add up to 1 for each hypothesis. Their values are discussed below in detail.
Sent salesman:
P(H1)=0.4
P(E1|H1)=0.2
P(E2|H1)=0.8
------------------------------------------------------------------long horizontal line
Salesman not sent:
P(H2)=0.6
P(E1|H2)=0.7
P(E2|H2)=0.3
Now we turn to the second part, conditional probability needed for Bayes theorem.
The phrase 'if the salesman is sent' means conditional on H1
The phrase 'if the salesman is not sent' means conditional on H2
The second sentence of the original problem is conditional on H2, so this information goes
below the horizontal line. The probability that Smith will buy from Xerox conditional on H2 is 0.7:
P(E1 | H2) = 0.7. See this written below the line in my setup.
The third sentence says that if the salesman is sent (conditional on H1), this goes
above the line; the probability that Smith will buy from Xerox is only 0.2:
P(E1 | H1) = 0.2. See this written above the line.
Now we are ready to plug into the statement of the BAYES THEOREM of eq. (9)
P(H2 | E1) = [P(E1 | H2) P(H2)] / [P(E1 | H1) P(H1) + P(E1 | H2) P(H2)]
= (0.7 * 0.6) / (0.2 * 0.4 + 0.7 * 0.6)
= 0.42 / (0.08 + 0.42)
= 0.84
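The same arithmetic in R (a standalone check):
#R commands
num = c(sent=0.2*0.4, not.sent=0.7*0.6)  # P(E1|Hi)*P(Hi) for H1 and H2
num/sum(num)    # 0.16 0.84, so P(H2|E1) = 0.84 as found above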
Example 2 (WHO-DONE-IT?)
From past records, the chance that the accounting mistake was made by Tom is 0.5, by Dick 0.25, and
by Jane 0.25. For the kind of mistake just made, the evidence points to Tom and Jane (1 in 1000 each),
while for Dick it is 2 in 1000. What is the posterior probability for each of them? Use Bayes Theorem.
Two useful computational tricks: the priors add up to 1, and the
conditionals add up to 1 for each hypothesis. E1 = mistake was made, E2 = no mistake made.
Tom
P(H1)=0.50
P(E1|H1)=0.001
P(E2|H1)=0.999
---------------------------------------------------long horizontal line
Dick
P(H2)=0.25
P(E1|H2)=0.002
P(E2|H2)=0.998
----------------------------------------------------long horizontal line
Jane
P(H3)=0.25
P(E1|H3)=0.001
P(E2|H3)=0.999
Numerator of Bayes Theorem for Tom P(E1|H1) * P(H1) = 0.001*0.5=0.0005
Numerator of Bayes Theorem for Dick P(E1|H2) * P(H2) = 0.002*0.25=0.0005
Numerator of Bayes Theorem for Jane P(E1|H3) * P(H3) = 0.001*0.25=0.00025
Denominator for all = grand sum of the above = 0.00125
Posterior for Tom = 0.0005/0.00125 = 0.4, and likewise 0.4 for Dick.
Posterior for Jane = 0.2, so Tom or Dick most likely did it!
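A standalone R check with three hypotheses:
#R commands
num = c(Tom=0.001*0.5, Dick=0.002*0.25, Jane=0.001*0.25)  # Bayes numerators
num/sum(num)    # posteriors 0.4 0.4 0.2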
EXAMPLE 3
You are an investor. The IRS audits corporations, and audits have an effect on stock prices. We know that
20% of corporations will file an incorrect return. The IRS itself makes errors. Sometimes IRS auditors
claim there is an error when in reality there was none, which happens 10% of the time. Conversely,
IRS auditors miss real errors 30% of the time. News reports say that the IRS has just notified XYZ
corporation that there is an error in their corporate tax return. Use Bayes theorem to determine the
posterior probability of an erroneous return by XYZ Corporation.
Step 1: Define the Event! News of IRS audit and E1=error found, E2=no error found.
Step 2: Prior probabilities?
Step 3: What can affect the prior probability? Determine the conditional probabilities.
Step 4: Compute posterior probability numerators and then posterior probabilities.
H1 = XYZ corp. return does have an error
H2 = XYZ corp. return does Not have an error
The IRS audits: E1 = IRS finds an error,
E2 = IRS does NOT find any error
As investors we are interested in finding whether XYZ corporation actually has an
error in its tax return, now that the news has broken that they are being audited. That is, we want
to know the posterior probability P(H1 | E1).
Above the first horizontal line we have P(H1)
Given that XYZ Corp has submitted an erroneous return (i.e., given H1), the probability that the IRS
will find it is P(E1|H1) = 0.70. [We know this from the statement "IRS auditors miss real errors
30% of the time", which means the IRS will find them 70% of the time; this is called the likelihood.] Hence
P(E2|H1) = 0.30
Below the first horizontal line we have P(H2), the case that the XYZ corp. return does NOT have an error.
Given that the returns are good (no error), the IRS does make a mistake and alleges errors in 10% of
cases, which eventually turn out to be wrong: P(E1 IRS finds error | H2 really no error) = 0.10
P(H1)=0.20
P(E1|H1)=0.7
P(E2|H1)=0.3
--------------------------------------------long horizontal line
P(H2)=0.80
P(E1|H2)=0.1
P(E2|H2)=0.9
If we want posterior probability P(H1|E1), numerator has P(E1|H1)*P(H1)=0.7*0.2=0.14
If we want posterior probability P(H2|E1), numerator has P(E1|H2)*P(H2)=0.1*0.8=0.08
Summation of all numerators = 0.14+0.08=0.22 is the denominator for posterior probability.
So the posterior probability by Bayes Theorem that the XYZ corporate return actually has an error
is P(H1 | E1) = Bayes numerator / Bayes denominator = 0.14/0.22 = 0.6364
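A standalone R check:
#R commands
num = c(error=0.7*0.2, no.error=0.1*0.8)  # numerators for H1 and H2
num/sum(num)    # 0.6364 0.3636, so P(H1|E1) = 0.14/0.22 = 0.6364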
EXAMPLE 4
A firm screens prospective employees by using a test. Among those who perform their jobs
satisfactorily 65% passed the test. Among those who DID NOT perform their jobs satisfactorily
and were fired, 25% passed the test. According to the recorded data, 90% of the employees
perform satisfactorily. What is the probability that a prospective employee who passed the test will
not perform satisfactorily?
ANSWER: we define event E1 = passing the test, E2 = failing, H1 = satisfactory performance on
the job, and H2 = unsatisfactory performance.
H1, Satisfactory:
Passed: 0.65 = P(E1|H1)
Failed test: 0.35 = P(E2|H1)
----------------------------------------------------------------long horizontal line
H2, Unsatisfactory:
Passed: 0.25 = P(E1|H2)
Failed test: 0.75 = P(E2|H2)
Find P(UN | Pass) = P(H2 | E1) = [P(E1|H2) P(H2)] / [P(E1|H2) P(H2) + P(E1|H1) P(H1)]
= (0.25 * 0.1) / (0.25 * 0.1 + 0.65 * 0.9)
= 0.0409. This is the answer.
Note that P(E2|H1) and P(E2|H2) do not appear in the Bayes formula at all. Sometimes the
problem statement contains information about P(E2|H1) instead of the needed information about
P(E1|H1). Since the two conditional probabilities on either side of the long horizontal line
always add up to 1, we can readily find the needed value by using: P(E1|H1) = 1 - P(E2|H1).
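A standalone R check of Example 4:
#R commands
num = c(sat=0.65*0.9, unsat=0.25*0.1)   # P(E1|Hi)*P(Hi) for each hypothesis
num["unsat"]/sum(num)   # P(H2|E1) = 0.025/0.61 = 0.0409836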
EXAMPLE 5
A town suspects that teens are out to steal. Prior data show that the prob. that a teen commits a theft is 0.8
and the prob. that an adult commits one is 0.2. A theft was reported, and a teen and an adult were
accused. Investigation showed that the prob. that the accused teen is guilty is 0.6, while the prob. that the
accused adult is guilty is 0.7. Find the probability that the accused teen did it.
Answer
Look at the last sentence. It should have P(H1|E1), the left side of a typical Bayes problem.
Clearly here H1 = teen did it and E1 = the event that a theft was committed.
Now we just have to fill the available information as conditional and marginal probabilities.
P(H1) and P(H2) are prior probabilities
H1, teen (0.8):
guilty: 0.6 = P(E1|H1)
innocent: 0.4 = P(E2|H1)
----------------------------------------------------------------long horizontal line
H2, adult (0.2):
guilty: 0.7 = P(E1|H2)
innocent: 0.3 = P(E2|H2)
Find P(teen | theft) = P(H1 | E1) = [P(E1|H1) P(H1)] / [P(E1|H1) P(H1) + P(E1|H2) P(H2)]
= (0.6 * 0.8) / (0.6 * 0.8 + 0.7 * 0.2)
Find P(adult | theft) = P(H2 | E1) = [P(E1|H2) P(H2)] / [P(E1|H1) P(H1) + P(E1|H2) P(H2)]
= (0.7 * 0.2) / (0.6 * 0.8 + 0.7 * 0.2)
#R commands
a=0.6*0.8; a   # Bayes numerator for the teen
b=0.7*0.2; b   # Bayes numerator for the adult
teen=a/(a+b); teen
#posterior prob. that the accused teen did it = a/(a+b) = 0.7741935
adult=b/(a+b); adult
#posterior prob. that the accused adult did it = b/(a+b) = 0.2258065
If there were no age profiling, the relevant probability for the teen would be only the likelihood 0.6,
instead of 0.7741935. This shows that a Bayes Theorem application can be unfair. Frequentists are
opposed to Bayes; they say that only P(E1|H1) = 0.6 should be relevant.
            E1          E2          Row total
H1          P(H1&E1)    P(H1&E2)    P(H1)
H2          P(H2&E1)    P(H2&E2)    P(H2)
Col Total   P(E1)       P(E2)       1
Contingency table for this problem:

            E1      E2      Row total
H1          0.48    0.32    0.8
H2          0.14    0.06    0.2
Col Total   0.62    0.38    1
We filled this table by using the following definitional relations.
P(E1|H1)=0.6 means (Joint E1&H1)/(marginal 0.8) is 0.6, so the joint is 0.8*0.6=0.48
P(E1|H2)=0.7 means (Joint E1&H2)/(marginal 0.2) is 0.7, so the joint is 0.7*0.2=0.14
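The whole joint table can also be generated at once in R from the priors and the conditionals (a sketch; the object names are ours):
#R commands
prior = c(H1=0.8, H2=0.2)
lik = rbind(H1=c(E1=0.6, E2=0.4),
            H2=c(E1=0.7, E2=0.3))   # each row adds up to 1
joint = lik * prior     # multiplies row i of lik by prior[i], giving the joints
addmargins(joint)       # margins reproduce 0.62, 0.38 and 0.8, 0.2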
Example (Defective Product, Which plant produced it?)
A company's output is split between two plants, Plant A producing 60% and Plant B producing 40%.
Past records show that A's products are generally 90% good and B's are 95% good. Given that a
defective product is complained about, what is the probability that plant A produced it?
The event: a complaint about a defective product is received.
Prior is P(HA)=0.6 and P(HB)=0.4 (the supply propensity of each plant, no hidden prejudice here)
E1=good product, E2=defective product. We are asked to find P(HA |E2)
[By the way, the fact that 60% output comes from plant A does not mean 60% defectives also come
from plant A. Frequentists say that this prior should be irrelevant and that what matters is how
many defectives are produced by each plant. But Bayes Theorem says that both should matter. I
agree that as long as there is not unfair prejudice, Bayes’s processing of information is optimal]
HA (0.6):
good: 0.9 = P(E1|HA)
defective: 0.1 = P(E2|HA)
----------------------------------------------------------------long horizontal line
HB (0.4):
good: 0.95 = P(E1|HB)
defective: 0.05 = P(E2|HB)
posterior prob. that plant A produced it = prior*conditional/denom = 0.6*0.1/denom
posterior prob. that plant B produced it = 0.4*0.05/denom
R code:
a=0.6*0.1; b=0.4*0.05;den=a+b; postA=a/den;postA
#0.75 is the answer
The posterior for B is 0.25. That is, plant A is the likely culprit here. When we are not considering the guilt
or innocence of an individual, Bayes Theorem is useful in business.
Example (Marketing choice)
AB Electronics Inc. wants to sell new models of TV sets. In the past, 40% of models were successfully sold
and 60% failed to sell. If a consumer report recommends the new TV model, it matters. In the past,
80% of successful sales had a favorable recommendation from consumer reports while 30% of
unsuccessful models had also received a favorable recommendation. The new model has received
a favorable recommendation from consumer reports. What is the probability that it will be
successful in the end?
H1 = successful model (past prejudice prob.), prior P(H1) = 0.40
H2 = unsuccessful model, prior P(H2) = 0.60
E1= consumer report recommends the model
E2= consumer report DOES NOT recommend it
H1 (0.4):
CR recommends | succ = P(E1|H1) = 0.80
CR rejects | succ = P(E2|H1) = 0.20
----------------------------------------------------------------long horizontal line
H2 (0.6):
CR recommends | unsucc = P(E1|H2) = 0.30
CR rejects | unsucc = P(E2|H2) = 0.70
Find the posterior P(H1|E1) = num/denom
num1 = P(E1|H1)*P(H1) = 0.8*0.4
den1 = P(E1|H1)*P(H1) = 0.8*0.4 (the first denominator term, same as the numerator)
den2 = P(E1|H2)*P(H2) = 0.30*0.6
ans = 0.8*0.4 / (0.8*0.4 + 0.3*0.6) = 0.32/0.50 = 0.64
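A standalone R check of the marketing answer:
#R commands
num = c(succ=0.8*0.4, unsucc=0.3*0.6)   # Bayes numerators for H1 and H2
num/sum(num)    # 0.64 0.36, so P(H1|E1) = 0.32/0.50 = 0.64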