Decision Analysis
Lecture 5
Tony Cox
My e-mail: [email protected]
Course web site: http://cox-associates.com/DA/
Agenda
• Assignment 5
– Bayes’ Rule, simulation-optimization
• Readings 5: Decision psychology
• Probability and Bayes’ Rule
– Monty Hall problem
– Conditioning. Statistical independence. Laws of probability. Joint,
marginal, and conditional probabilities. Bayes’ Rule
– HIV and Problem set 4 solutions via Bayes’ Rule
• Decision psychology: Heuristics and biases
• Probabilistic expert systems (Netica)
2
Homework #5
(Due by 4:00 PM, February 21)
• Problems: (a) Soft pretzels, (b) Optimal R&D intensity
• Readings
– Required: Tversky and Kahneman (1974) on Judgment under uncertainty: Heuristics
and biases, http://psiexp.ss.uci.edu/research/teaching/Tversky_Kahneman_1974.pdf
• Know these: Representativeness, Availability, Anchoring and Adjustment
– Required: Charniak (1991), pages 50-53,
http://www.aaai.org/ojs/index.php/aimagazine/article/viewFile/918/836
– Ok to skim for main points: Slovic et al. (2002) on affect heuristic, decision utility vs.
experience utility, http://faculty.psy.ohiostate.edu/peters/lab/pubs/publications/2002_Slovic_Finucane_etal._Rational_actors_or_rational_fools.pdf
– Ok to skim: Thaler (1981) on dynamic inconsistency, inconsistent discounting,
http://faculty.chicagobooth.edu/richard.thaler/research/pdf/Some%20Empirical%20Evidence%20on%20Dynamic%20Inconsistency.pdf
– Optional: Camerer et al. (2012) on Neuroeconomics and ambiguity aversion,
http://neuroecon.berkeley.edu/public/papers/Ambiguity-Chapter.pdf
3
Assignment 5, Problem 1:
Soft pretzels
• Your decision model:
– If your new pretzel is a hit, market share will be
30%
– If flop, then market share = 10%
– Prior probabilities: Pr(hit) = 0.5, Pr(flop) = 0.5
• Evidence: 5 of 20 people preferred your
pretzel in a taste test
• Find: Pr(hit | 5 of 20 preferred new pretzel)
– First step toward getting probabilities from data
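A minimal R sketch of one way to set up this calculation, assuming (our reading of the problem, not stated on the slide) that each taster independently prefers the new pretzel with probability equal to its market share (0.3 if hit, 0.1 if flop):
prior_hit <- 0.5                  # Pr(hit)
like_hit <- dbinom(5, 20, 0.3)    # Pr(5 of 20 prefer | hit), assuming preference prob. = market share
like_flop <- dbinom(5, 20, 0.1)   # Pr(5 of 20 prefer | flop)
like_hit*prior_hit/(like_hit*prior_hit + like_flop*(1 - prior_hit))  # Pr(hit | 5 of 20 preferred)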
4
Assignment 5, Problem 2:
Optimal R&D effort
• Each new employee a company hires has a 10%
probability (independently of anyone else) of
solving a certain R&D problem in the next year
• If solution is obtained in the next year, it is worth
$1M (else $0).
• Each new employee costs $0.05M.
• To maximize EMV, how many new employees
should the company hire to work on this R&D
problem?
– Suggested approach: Simulation-optimization in R
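One possible simulation-optimization sketch in R (the Monte Carlo setup and the search range of 1 to 30 hires are our assumptions; the exact calculation EMV(n) = 1 - 0.9^n - 0.05n, in $M, would work too):
set.seed(1)
EMV <- sapply(1:30, function(n) {        # candidate numbers of new hires
  solved <- rbinom(10000, n, 0.1) >= 1   # simulate: does at least one of n employees solve it?
  mean(solved)*1 - 0.05*n                # expected value of solution ($1M) minus hiring cost ($0.05M each)
})
which.max(EMV)                           # EMV-maximizing number of new hires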
5
Probability and Bayes’ Rule
Monty Hall problem
Conditioning
Statistical independence
Laws of probability
Joint, marginal, and conditional distributions
Bayes’ Rule
Solutions to problems
6
Where are we going?
1. Monty Hall problem and intuitive solution
2. Bayes’ Rule
– Probability background: joint, marginal, and conditional
probabilities. Statistical independence.
3. Where do models, Pr(E | H), come from?
– Data for inferring causality
– Valid vs. non-valid statistical models
• Causal inference: Biases and solutions
– Simpson’s Paradox
4. Causal Graphs (and paper/presentation topics)
– DAG models show conditional independence relations
– Topics: Path analysis, Structural Equation Models
(SEMs), and Simon causal ordering
7
Example: Monty Hall problem
• A prize is hidden in one of three boxes. The
location of the prize is selected at random
• You pick any one of three boxes.
• Before you open it, Monty opens an empty
box. He offers to let you switch to the
remaining unopened box. Should you?
– If you picked the one with the prize, he opens
another box at random
– Else, he opens the box without the prize
8
Example: Monty Hall problem
• Pr(you picked the right box) = 1/3
• Pr(the remaining box contains the prize) =
(1/3)(0) + (2/3)(1) = 2/3
• Thus, you should switch.
– Doing so doubles your chances of getting the
prize, from 1/3 to 2/3
– It is not true that, after Monty opens the empty
box, each of the other two is equally likely to
contain the prize
9
Intuitive solution to Monty Hall Problem
• In effect, the host “points to” the correct
box (by opening all incorrect ones) if the
contestant has not already selected it.
(Otherwise, the host points to a randomly
selected incorrect box.)
• So, with probability 2/3, we should accept
the box that the host points to (i.e., Box 3).
• How to discover this without intuition?
10
Our goal: Bayes’ Rule
Conditional probability: Pr(H | E) = Pr(H & E)/Pr(E)
Total probability: Pr(E) = Pr(E | H)Pr(H) + Pr(E | not-H)Pr(not-H)
Bayes’ Rule: Pr(H | E) = Pr(E | H)Pr(H)/Pr(E), with
Pr(E) calculated via Total Probability formula.
• H = hypothesis or theory
• E = evidence (observed data)
• Pr(H) = prior probability for H
• Pr(H | E) = posterior probability for H, given evidence E (or conditioned on evidence E) (“|” = “conditioned on”)
• Pr(E | H) = likelihood of E, given H.
• Data base interpretation:
– E tells you which rows to look at
– (H & E) tells you how many of those rows H holds in.
11
DAG form of Bayes Rule
• Simple DAG model: H → E
– DAG = directed acyclic graph
– H is now a random variable (e.g., with one value for
each alternative hypothesis)
• Inference problem: Given a prior probability
for each possible value of H and given an
observed value of E, what are the posterior
probabilities for each value of H, conditioned on
the data E?
• “Arc-reversal”: See E, infer H. Find Pr(H = h | E = e).
• Generalization: From one arc to a whole DAG!
12
Conditioning
13
Notation: Conditioning
• Pr(A | B) = conditional probability of A,
given B
– A and B are events
– Events are subsets of a “sample space”
– We will sometimes condition on acts/decisions
• “|” is the sign for “conditioned on” or “given”
• Example: For a fair die, what is Pr(3)?
What is Pr(3 | odd)?
14
Interpretation of conditioning
• Let A and B be two events
– An “event” is usually interpreted as a subset of a
sample space
– A sample space is often interpreted as a set of
possible observed outcomes
• (A | B) is the symbol for “A given B” or “A
conditioned on B”
• Pr(A | B) = Pr(A is true given that B is true)
• Data set interpretation: Pr(A | B) = fraction of
(rows in which B holds) for which (A holds too)
– Rows = random cases; columns = fields/variables
15
Conditioning in a data set
Record  Gender  Age  Smoker?  COPD?
1       M       31   Yes      No
2       F       41   No       No
3       F       59   Yes      Yes
4       M       26   No       No
5       F       53   No       No
6       M       58   Yes      Yes
• For a randomly sampled record, what is…
– Pr(smoker), Pr(COPD | smoker), Pr(smoker | COPD)
– Pr(COPD | male smoker), Pr(male smoker | COPD & > 50)
– Pr(smoker & COPD)
16
Conditioning in a data set
Record  Gender  Age  Smoker?  COPD?
1       M       31   Yes      No
2       F       41   No       No
3       F       59   Yes      Yes
4       M       26   No       No
5       F       53   No       No
6       M       58   Yes      Yes
• For a randomly sampled record, what is…
– Pr(smoker) = 3/6 = 1/2, Pr(COPD | smoker) = 2/3
– Pr(smoker | COPD) = 1, Pr(COPD | male smoker) = 1/2
– Pr(male smoker | COPD & > 50) = 1/2, Pr(smoker & COPD) = 1/3
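These answers can be checked by conditioning-as-row-filtering in R; a sketch using the six records above:
d <- data.frame(Gender = c("M","F","F","M","F","M"),
                Age = c(31, 41, 59, 26, 53, 58),
                Smoker = c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE),
                COPD = c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE))
mean(d$Smoker)                            # Pr(smoker) = 1/2
mean(d$COPD[d$Smoker])                    # Pr(COPD | smoker) = 2/3
mean(d$Smoker[d$COPD])                    # Pr(smoker | COPD) = 1
mean(d$COPD[d$Smoker & d$Gender == "M"])  # Pr(COPD | male smoker) = 1/2
mean((d$Smoker & d$Gender == "M")[d$COPD & d$Age > 50])  # Pr(male smoker | COPD & > 50) = 1/2
mean(d$Smoker & d$COPD)                   # Pr(smoker & COPD) = 1/3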
17
Why does conditioning matter?
• Learning (or “updating of beliefs”) takes place by
conditioning, in traditional DA
• Value of information is determined by how much it
increases the conditional expected utility of the
best decision
• Causality: Suggests (but does not prove) how
changing some choices (e.g., smoking) might
change the probabilities of consequences (e.g.,
COPD)
• Inference: Can be used to answer queries (Netica)
18
Examples of conditioning
Case  A  B  C
1     3  1  4
2     1  5  9
3     2  6  5
4     3  5  8
• All rows equally likely
• Find: Pr(B = 5 | C > 4)?
• Find: Pr(B > C)? (unconditional probability)
• Pr(B > C | A + B > C)? (conditional probability)
19
Examples of conditioning
Case  A  B  C
1     3  1  4
2     1  5  9
3     2  6  5
4     3  5  8
• All rows equally likely
• Pr(B = 5 | C > 4) = 2/3
• Pr(B > C) = 1/4 (unconditional probability)
• Pr(B > C | A + B > C) = 1 (conditional probability)
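The same row-filtering idea handles conditioning on compound events; a sketch in R using the four cases above:
df <- data.frame(A = c(3, 1, 2, 3), B = c(1, 5, 6, 5), C = c(4, 9, 5, 8))
mean(df$B[df$C > 4] == 5)                # Pr(B = 5 | C > 4) = 2/3
mean(df$B > df$C)                        # Pr(B > C) = 1/4
mean((df$B > df$C)[df$A + df$B > df$C])  # Pr(B > C | A + B > C) = 1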
20
Conditioning can be used to
answer queries
• A universal rule for data-based inference?
– Sound? (Only correct conclusions?)
– Complete? (Can we use it to answer any
answerable question, as well as possible?)
• How to use conditioning to draw causal
inferences?
– Causal graph models will address this
• What other inference rules are there?
– Deduction, induction, analogy, insight
21
Statistical independence
Laws of probability
Joint, marginal, and conditional
probabilities
Bayes’ Rule
22
Statistical Independence
• Events A and B are statistically independent if and
only if: Pr(A and B) = Pr(A)*Pr(B)
– (Thus, since Pr(A and B) = Pr(A)Pr(B | A), we must have
Pr(B | A) = Pr(B) = Pr(B | not-A).)
• Random variables X and Y are statistically
independent if Pr(x, y) = Pr(x)Pr(y) (“joint PDF
equals product of marginals”)
– Justification: Pr(x, y) = Pr(x)*Pr(y | x); if X and Y are
independent, then Pr(y | x) = Pr(y)
• If X and Y are not statistically independent, then
each provides information about the other
23
Example
• Joint probability table. Are X and Y
independent?
• (No!)
       Y = 0  Y = 1  Pr(X)
X = 0  0      0.2    0.2
X = 1  0.3    0.5    0.8
Pr(Y)  0.3    0.7
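A quick numerical independence check in R, using the joint table above (a sketch):
J <- matrix(c(0, 0.2,
              0.3, 0.5), nrow = 2, byrow = TRUE)  # rows: X = 0, 1; columns: Y = 0, 1
px <- rowSums(J); py <- colSums(J)                # marginals Pr(X) and Pr(Y)
isTRUE(all.equal(J, outer(px, py)))               # FALSE: joint != product of marginals, so X, Y dependent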
24
Probability Basics
• Let X be a random variable with possible values of a, b, c,…
• Then a probability density function (PDF) for X assigns numbers (“probabilities”) to possible values of X such that:
1. Pr(X = a) + Pr(X = b) + Pr(X = c) + … = 1
   Summation notation: Σx Pr(X = x) = 1; Σx Pr(x) = 1
2. Each probability number is non-negative
3. If X is continuous, then replace summation with integration: ∫ Pr(X = x) dx = 1 (integral over all x)
For more tutorial, www.sci.utah.edu/~gerig/CS6640F2010/prob-tut.pdf
25
Probability Basics (Cont.)
Rules of probability
• Pr(A or B) = Pr(A) + Pr(B) – Pr(A and B)
– A and B are “events” (subsets of possible
outcomes in sample space)
– ∩ = intersection of events; “∩” at the semantic level corresponds to “&” at the syntactic level
• Total probability: Pr(A) + Pr(not-A) = 1
• For additive probability measures, summation
of probs. represents disjoint union of events
• Pr(A & B) = Pr(A | B)*Pr(B) = Pr(B | A)*Pr(A)
26
Application: “Law of Total Probability”
• If exactly one of {H1, H2, H3, …, Hn} is true,
• Then Pr(E) = Pr(E & H1) + Pr(E & H2) + … + Pr(E
& Hn)
• Q: Why?
• A: Because for E to happen, it must happen in
exactly one of these ways (i.e., with exactly one of
these hypotheses being true.)
– E here represents any event, not only “evidence”
• We will soon see that the above implies:
Pr(E) = Pr(E | H1)Pr(H1) + … + Pr(E | Hn)Pr(Hn)
27
Joint, Marginal, and Conditional Probabilities
Let X and Y be two random variables.
Their joint PDF is:
Pr(x, y) = Pr(X = x and Y = y) = f(x, y)
Joint distribution can be factored as a product of a
marginal PDF and a conditional PDF:
Key formula:
Pr(x, y)
= Pr(x)Pr(y | x) = marginal * conditional
= Pr(X = x)Pr(Y = y | X = x)
28
Example of Joint and Marginal Distributions
Q: Given the joint PDF for X and Y below,
what is the PDF for X? Find Pr(X > 1).
        X = 1  X = 2  X = 3
Y = 4   0      0.2    0.1
Y = 8   0.1    0.2    0
Y = 16  0.1    0      0.3
29
Solution to Marginalization Problem
• To marginalize out Y and obtain Pr(X), just sum down each column!
• Pr(X > 1) = 0.8.
                X = 1  X = 2  X = 3
Y = 4           0      0.2    0.1
Y = 8           0.1    0.2    0
Y = 16          0.1    0      0.3
Marginal Pr(X)  0.2    0.4    0.4
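The same marginalization in R (a sketch):
J <- matrix(c(0, 0.2, 0.1,
              0.1, 0.2, 0,
              0.1, 0, 0.3), nrow = 3, byrow = TRUE)  # rows: Y = 4, 8, 16; columns: X = 1, 2, 3
colSums(J)            # marginal Pr(X) = 0.2, 0.4, 0.4
sum(colSums(J)[2:3])  # Pr(X > 1) = 0.8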
30
Why does Pr(y | x) = Pr(x, y)/Pr(x)?
• Pr(x, y) = Pr(X = x and Y = y)
= Pr(X = x)Pr(Y = y | X = x) = Pr(x)Pr(y | x)
• Dividing both sides by Pr(x) gives
Pr(y | x) = Pr(x, y)/Pr(x)
By symmetry,
Pr(y | x)Pr(x) = Pr(x | y)Pr(y) = Pr(x, y)
So, Pr(y | x) = Pr(x | y)Pr(y)/Pr(x).
This is almost Bayes’ Rule!
31
Bayes’ Rule
Start with Pr(y | x) = Pr(x | y)Pr(y)/Pr(x).
Now, write Pr(x) as:
Pr(x) = Σv Pr(X = x, Y = v) (“marginalize out” Y)
Pr(x) = Σv Pr(x, v) = Σv Pr(x | v)Pr(v).
Then, Pr(y | x) = Pr(x | y)Pr(y)/[Σv Pr(x | v)Pr(v)]
This is Bayes’ Rule. (v ranges over all possible
values of Y.)
32
Bayes’ Rule for Conditioning on Evidence
Bayes’ Rule: Pr(H | E) = Pr(E | H)Pr(H)/Pr(E)
• H = hypothesis or theory – or query!
• E = evidence (observed data)
• Pr(H) = prior probability for H (from model)
• Pr(H | E) = posterior probability for H, given evidence E (or conditioned on (“|”) evidence E)
• Pr(E | H) = likelihood of E, given H (from
model).
33
Solving Monty Hall with Bayes’ Rule
• H = prize is in Box 1 (the box you picked), H2 = prize is in Box 2, H3 = prize is in Box 3
• E = Host shows that the prize is not in Box 2
• Pr(H | E) = Pr(E | H)Pr(H)/Pr(E) = Pr(E | H)Pr(H)/[Σv Pr(E | v)Pr(v)]
• Pr(E | H) = 0.5
• Pr(H) = 1/3
• Pr(E) = Pr(E | H)Pr(H) + Pr(E | H2)Pr(H2) + Pr(E | H3)Pr(H3) = (1/6) + 0 + (1/3) = 0.5
• Pr(H | E) = 1/3, Pr(H3 | E) = 2/3.
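A Monte Carlo check of this answer in R (a sketch; the setup is ours):
set.seed(1)
n <- 100000
prize <- sample(1:3, n, replace = TRUE)  # prize location, selected at random
pick <- 1                                # contestant's initial pick (Box 1, without loss of generality)
# Switching wins exactly when the first pick was wrong, since the host always opens an empty box
mean(prize != pick)                      # approx. 2/3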
34
Bayesian learning of p(s):
Conditional probabilities
• Pr(s | y) = Pr(y | s)Pr(s)/Pr(y)
– Definition of conditional probability
– s = state, Pr(s) = prior probability of s
– y = evidence, data, signal, etc.
– Pr(y | s) = likelihood of y given s
– Pr(s | y) = posterior probability of s given y
• Pr(y) = Σs′ Pr(y | s′)Pr(s′) (law of total probability)
• Bayes’ Rule: Pr(s | y) = Pr(y | s)Pr(s)/Σs′ Pr(y | s′)Pr(s′)
35
Example: HIV screening
manual solution
• Pr(s) = 0.01 = fraction of population with HIV
– s = has HIV, s′ = does not have HIV
– y = test is positive
• Pr(test positive | HIV) = 0.99
• Pr(test positive | no HIV) = 0.02
• Find: Pr(HIV | test positive) = Pr(s | y)
– Subjective probability estimates?
36
Example: Screening for HIV
• Bayes’ Rule: Pr(s | y) = Pr(y | s)Pr(s)/Σs′ Pr(y | s′)Pr(s′)
– Pr(s) = 0.01
– s = has HIV, s′ = does not have HIV
– Pr(y | s) = 0.99 Pr(y | not-s) = 0.02
– Pr(y) = Pr(y | s)Pr(s) + Pr(y | not-s)Pr(not-s)
• = (0.99*0.01)/(0.99*0.01 + 0.02*0.99) = 1/3
• If test is positive, Pr(HIV | positive test)= 1/3
– Twice as many false positives as true positives
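The same posterior in R (a sketch):
prior <- 0.01   # Pr(HIV)
sens <- 0.99    # Pr(test positive | HIV)
fpr <- 0.02     # Pr(test positive | no HIV)
sens*prior/(sens*prior + fpr*(1 - prior))  # Pr(HIV | test positive) = 1/3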
37
Assignment 4, Problem 1:
Fair Coin Problem (due 2-14-17)
• A box contains two coins: (a) A fair coin; and
(b) A coin with a head on each side. One coin
is selected at random (we don’t know which)
and tossed once. It comes up heads.
• Q1: What is the probability that the coin is the
fair coin?
• Q2: If the same coin is tossed again and
shows heads again, then what is the new
(posterior) probability that it is the fair coin?
Solve manually and/or using Netica.
38
Manual Solution to Fair Coin Problem,
Part 1
• H1 = coin is fair
• H2 = coin has 2 heads
• E1 = head is obtained on first toss
• Posterior = Likelihood*Prior/Pr(evidence)
• Pr(H1 | E1) = Pr(E1 | H1)Pr(H1)/Pr(E1)
• = Pr(E1 | H1)Pr(H1)/[Pr(E1 | H1)Pr(H1) + Pr(E1 |
H2)Pr(H2)]
• = (0.5)(0.5)/[(0.5)(0.5) + (1)(0.5)] = 1/3
39
Solution to Coin Tossing Part 2
Pr(Fair |HH) = Pr(HH | Fair)Pr(Fair)/Pr(HH)
= (0.25)(0.5)/[Pr(HH | Fair)Pr(Fair) + Pr(HH | Unfair)Pr(Unfair)]
= (0.125)/[(0.125) + (1)(0.5)] = 1/(1 + 4) = 0.2
Alternative (sequential approach):
Pr(Fair | 2nd toss is H) = Pr(2nd toss is H | Fair)Pr(Fair)/Pr(H)
where all information is conditioned on first toss being H
= (0.5)(1/3)/[(0.5)(1/3) + (1)(2/3)] = (0.5)/[0.5 + 2] = 1/(1 + 4) =
1/5 = 0.2.
Using posteriors from first stage as priors for second stage
gives the same answer as conditioning on entire history.
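The sequential-updating calculation in R (a sketch):
prior <- c(fair = 0.5, twoheads = 0.5)
likeH <- c(fair = 0.5, twoheads = 1)   # Pr(head on a toss | coin type)
post1 <- prior*likeH/sum(prior*likeH)  # after first head: 1/3, 2/3
post2 <- post1*likeH/sum(post1*likeH)  # after second head: 0.2, 0.8
post2["fair"]                          # Pr(fair | HH) = 0.2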
40
Using Netica to solve fair coin
problem
• Step 1: Create DAG model.
(Q: What is its root?)
A: Root node is “Coin is fair”
• Step 2: Use “Enter Findings” (right-click) to
specify observations (i.e., histories of
observations on which answers are to be
conditioned, e.g., “Head on first toss” or “Heads on first
two tosses”)
• Step 3: View the “Coin is fair” root node to see the answer (i.e., Pr(Coin is fair | Observations)).
41
Using Netica to solve fair coin
problem
• Step 1: Create DAG model.
(Q: What is its root?)
A: Root node is “Coin is fair”
[Netica screenshot (before any findings): CoinIsFair: Yes 50.0, No 50.0; FirstToss: Head 75.0, Tail 25.0; SecondToss1: Head 75.0, Tail 25.0]
42
Using Netica to solve fair coin
problem
• Step 1: Create DAG model.
(Q: What is its root?)
A: Root node is “Coin is fair”
• Step 2: Use “Enter Findings”
• Step 3: View the “Coin is fair” root node to see the answer (i.e., Pr(Coin is fair | Observations)).
[Netica screenshot (after the finding “Head on first toss”): CoinIsFair: Yes 33.3, No 66.7; FirstToss: Head 100, Tail 0; SecondToss1: Head 83.3, Tail 16.7]
43
Using Netica to solve fair coin
problem
• Step 1: Create DAG model.
(Q: What is its root?)
A: Root node is “Coin is fair”
• Step 2: Use “Enter Findings”
• Step 3: View the “Coin is fair” root node to see the answer (i.e., Pr(Coin is fair | Observations)).
[Netica screenshot (after the finding “Heads on first two tosses”): CoinIsFair: Yes 20.0, No 80.0; FirstToss: Head 100, Tail 0; SecondToss1: Head 100, Tail 0]
44
Assignment 4, Problem 2: Defective
Items (due 2-14-17)
• Machines 1, 2, and 3 produced (20%, 30%,
50%) of items in a large batch, respectively.
• The defect rates for items produced by these
machines are (1%, 2%, 3%), respectively.
• A randomly sampled item is found to be
defective. What is the probability that it was
produced by Machine 2?
• Exercise: (a) Solve using Netica; (b) solve manually
• E-mail your answer (a single number) to
[email protected]
45
Manual solution to defective items
problem
• Machines 1, 2, and 3 produced (20%, 30%, 50%) of items
in a large batch, respectively.
• The defect rates for items produced by these machines are
(1%, 2%, 3%), respectively.
• A randomly sampled item is found to be defective. What is
the probability that it was produced by Machine 2?
• Pr(machine 2 | defective) = Pr(defective | machine 2)*Pr(machine 2)/[Σv Pr(defective | machine = v)Pr(machine = v)]
• = (0.02)*(0.30)/[(0.01)*(0.20) + (0.02)*(0.30) + (0.03)*(0.50)]
• = 0.261 (answer)
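The same calculation, vectorized in R (a sketch):
prior <- c(0.20, 0.30, 0.50)   # Pr(machine 1, 2, 3)
defect <- c(0.01, 0.02, 0.03)  # Pr(defective | machine)
post <- prior*defect/sum(prior*defect)
post                           # posteriors: 0.087, 0.261, 0.652 (matching the Netica solution below)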
46
Netica solution
[Netica screenshot (after entering the finding “Defect”): Machine: A 8.70, B 26.1, C 65.2; Defect: Defect 100, NoDefect 0]
47
Simulation-optimization:
One-dimensional choice set A
and state set S
48
Mockingbird Airline’s
overbooking decision
• Plane has 16 seats
• Each seat reserved generates $225 revenue.
– Total revenue = $225*Res
– Res = number of reservations sold = decision variable
• C1 = operations costs: Cost of flight to Mockingbird =
$900 + $100*min(arrivals,16)
• C2 = penalty costs: If more than 16 passengers with
reservations show up, each passenger beyond 16 costs
Mockingbird $325, so C2 = 325*max(0, arrivals - 16)
• Pr(passenger with reservation arrives/shows up) = 0.96
• How many reservations should Mockingbird sell to
maximize EMV?
49
Structuring the decision problem
• What is the set of acts, A?
• What is the set of states, S?
• What is the probability of each state, s, if
act a is taken?
• What is the consequence of each act-state
pair, c(a, s)?
50
Mockingbird’s decision problem
in normal form
• A = choice set = number of reservations to
sell = res (rows of decision table)
• S = state of nature = number of arrivals =
arrivals (columns of decision table)
• C = set of consequences = profit
c(a, s) = c(Res, arrivals)
= 225*Res – (900 + 100*min(arrivals,16)) –
(325*max(0, arrivals -16))
• P(arrivals | res) = dbinom(arrivals, res, 0.96)
51
Mockingbird Overbooking
Expected profit calculation when Res = 16
• R = 225*16 = 3600
• E(C1 | Res = 16) = 900 + 100*E(arrivals) =
900 + 100*np = 900 + 100*16*0.96 = 2436
• E(C2 | Res = 16) = E[325*max(0, arrivals - 16)]
= 0, since arrivals ≤ 16
• E(Profit | Res = 16) = 3600 - 2436-0 = 1164
52
Solution using R
Profit = Rev = Ecost1 = Ecost2 = NULL        # variables to hold answers
for (Res in 1:20) {                          # loop over all acts
  p = c1 = c2 = NULL                         # reset variables to empty
  for (arrivals in 0:Res) {                  # loop over all states
    i = arrivals + 1                         # shift index by 1 so arrivals = 0 is not dropped (R vectors start at 1)
    p[i] = dbinom(arrivals, Res, 0.96)       # calculate Pr(s | a)
    c1[i] = 900 + 100*min(arrivals, 16)      # calculate operations cost c1(a, s)
    c2[i] = 325*max(0, arrivals - 16)        # calculate penalty cost c2(a, s)
  }
  Revenue = Res*225; EC1 = sum(p*c1); EC2 = sum(p*c2)
  Profit[Res] = Revenue - EC1 - EC2          # store EMV(a) = E[c(a, s)]
  Rev[Res] = Revenue; Ecost1[Res] = EC1; Ecost2[Res] = EC2
}
plot(Profit)
best_act <- which(Profit == max(Profit))     # act(s) with the highest EMV
print(best_act)
53
Profit as a function of Res
(reservations sold)
Inspecting the answers:
> Profit[16]
[1] 1164
> Profit[17]
[1] 1180.593
> Profit[18]
[1] 1125.245
> Profit[19]
[1] 1045.438
[Plot of Profit vs. Index (Res = 1 to 20): expected profit rises to a peak at Res = 17, then declines]
So, Mockingbird maximizes expected profit by selling 17 reservations, overbooking by 1
54
Bayesian analysis
and subjective probabilities
55
How to get needed probabilities?
1. Derive from other probabilities and models;
condition on data
– Bayes’ rule, decomposition and logic, event
trees, fault trees, probability theory & models
– Monte Carlo simulation models
2. Make them up (subjective probabilities),
ask others (elicitation)
– Calibration, biases (e.g., over-confidence)
3. Estimate them from data
56
Subjective EU theory
• Theory: If preferences are coherent, then
one should have coherent subjective
probabilities for all events.
– In theory, subjective probabilities can be
quantified from preferences for betting on
events vs. betting on outcomes of random
numbers with known probabilities.
• These subjective probabilities can be used
to calculate subjective EU (SEU) for acts
57
Subjective conditional
probabilities
• Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice and also participated in antinuclear demonstrations. Use your judgment, conditioned on this information, to rank the following statements by their probability, from 1 = most probable to 8 = least probable…
a. Linda is a teacher in an elementary school
b. Linda works in a bookstore and takes Yoga classes
c. Linda is active in the feminist movement
d. Linda is a psychiatric social worker
e. Linda is a member of the League of Women Voters
f. Linda is a bank teller
g. Linda is an insurance salesperson
h. Linda is a bank teller and is active in the feminist movement
58
How do you rank the
(conditional) probabilities?
• Compare c, f, h
– c Linda is active in the feminist movement
– f Linda is a bank teller
– h Linda is a bank teller and is active in the feminist
movement
• Pr(h) = Pr(c & f) = Pr(c)*Pr(f | c) = Pr(f)*Pr(c | f)
• So Pr(h) cannot exceed Pr(f)
• But subjectively assessed probabilities can (and often
do) violate such logical constraints
• Training improves consistency and calibration of
subjective probability judgments
59
Tversky and Kahneman (1974)
• Subjective estimates of probabilities are
often/usually wrong
– Overconfidence and poor calibration
– Representativeness, availability, and anchoring
heuristics bias subjective probability assessments
• Incorrect priors
– Example: Profit rates of companies
• Ignoring relevant information (and base rates)
– Scope insensitivity, affect heuristic
• Seeking and using irrelevant information
– Confirmation bias
60
Wrap-up on subjective
probabilities
• In theory, coherence of preferences implies subjective
probabilities that can be used to calculate subjective
expected utilities to optimize choices
• In reality, heuristics and biases make many/most
subjective probabilities unreliable
– Superforecasting discusses exceptions
• We will emphasize probability models and
statistical methods that rely on data rather than
subjective judgments
61