Decision Analysis
Lecture 4
Tony Cox
My e-mail: [email protected]
Course web site: http://cox-associates.com/DA/
Agenda
• Problem set 3 solutions
– Simulation-optimization for Joe’s pills
• Assignment 4: Bayesian inference
• Introduction to Netica for Bayesian inference
• Wrap-up on decision trees
• Binomial distribution
2
Homework #4
(Due by 4:00 PM, February 14)
• Problems
– Machines
– Fair coin
• Readings
– Required: Clinical vs. statistical predictions,
http://emilkirkegaard.dk/en/?p=6085
– Recommended: Important probability distributions (Binomial):
https://www.utdallas.edu/~scniu/OPRE6301/documents/Important_Probability_Distributions.pdf
– Recommended: Binomial distribution in R:
http://www.r-tutor.com/elementary-statistics/probability-distributions/binomial-distribution
http://www.stats.uwo.ca/faculty/braun/RTricks/basics/BasicRIV.pdf
3
Assignment 3, Problem 1
(ungraded)
• A fair coin is tossed once. Draw the risk
profile (cumulative distribution function) for
the number of heads.
– Purpose: Be able to draw, interpret risk profiles
• Practice!
– You do not have to turn this in, but we will go over
the solution next class
– Helpful background on discrete CDFs:
www.probabilitycourse.com/chapter3/3_2_1_cdf.php
4
Assignment 3, Solution 1
• A fair coin is tossed once. Draw the risk
profile (cumulative distribution function) for
the number of heads.
• Solution:
http://www.gaussianwaves.com/2008/04/probability/
5
Assignment 3, Problem 2
Joe’s medicine
• Joe takes pills to reduce his risk of heart attack
• Pharmacist can prescribe for him either 1 pill per day at full strength,
or 2 pills per day, each at half strength
• The probability that Joe forgets to take any given pill on any occasion
is p. Its value is uncertain.
• Here is how pills affect daily heart attack risk:
– If he takes the full-strength pill, multiply his risk by 0.5 (it is cut in half)
– If he takes 1 half-strength pill, multiply risk by 0.7
– If he takes both half-strength pills, multiply risk by 0.5
– If he takes no pill, multiply risk by 1
• What should the pharmacist prescribe?
– Please submit answer as two ranges (intervals) of p values for which
the best choice is (A) Prescribe 1 full-strength pill; (B) Prescribe 2 half-strength pills
6
Binomial distribution with
parameters p and N = 2
• The probability that Joe takes 0 pills is p²
• Pr(takes 2 pills) = (1 − p)²
• Pr(takes 1 pill) = p(1 − p) + (1 − p)p = 2p(1 − p)
7
Assignment 3, Solution 2
Joe’s medicine
• The probability that Joe forgets to take any pill is p.
• Here is how pills affect daily heart attack risk:
– If he takes the full-strength pill, multiply his risk by 0.5 (it is cut in half)
– If he takes 1 half-strength pill, multiply risk by 0.7
– If he takes both half-strength pills, multiply risk by 0.5
– If he takes no pill, multiply risk by 1
• What should the pharmacist prescribe?
– 1 full pill per day multiplies Joe’s risk by p·1 + (1 − p)·0.5 = 0.5 + 0.5p
– 2 half pills per day multiply risk by p²·1 + 2p(1 − p)·0.7 + (1 − p)²·0.5
= p² + 1.4p − 1.4p² + (1 − 2p + p²)·0.5 = 0.1p² + 0.4p + 0.5
– 2 pills are better than 1 (i.e., give lower risk for Joe) if 0.5 + 0.5p >
0.1p² + 0.4p + 0.5 ⟺ 0.5p > 0.1p² + 0.4p ⟺ 0.1p > 0.1p² ⟺ p > p²
– But p > p² for all 0 < p < 1. If p = 0 or 1, the two prescriptions are equally good.
– Otherwise, 2 pills are always better than 1, for all p in (0, 1).
8
Graphical solution
The 2-pill option deterministically
dominates the 1-pill option over the
whole (infinite) set of states in (0, 1).
If EU(a) lines crossed, then we
would have to assess
probabilities for the values of p.
Simulation-optimization:
Could solve using decision tree
and simulation of EU(a), given
model {u(c), Pr(s), and Pr(c | a,
s)}. (Here, state s is p.)
For each a:
• Draw s from Pr(s)
• Draw c from Pr(c | a, s)
• Evaluate u(c)
• Repeat and average u(c) values
• Select a with greatest mean(u(c))
9
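The simulation-optimization steps above can be sketched in Python for Joe’s pills. This is a minimal sketch, not the course’s own code: it assumes a uniform prior for the state s = p and treats the daily risk multiplier as the (negative) consequence, so the best act is the one with the smallest mean multiplier.

```python
import random

def risk_multiplier(action, p, rng):
    """Daily heart-attack risk multiplier, given forget-probability p."""
    if action == "full":
        # One full-strength pill: taken with probability 1 - p
        return 0.5 if rng.random() >= p else 1.0
    # Two half-strength pills, each forgotten independently with probability p
    taken = sum(rng.random() >= p for _ in range(2))
    return {0: 1.0, 1: 0.7, 2: 0.5}[taken]

def expected_multiplier(action, trials=200_000, seed=1):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        p = rng.random()                          # draw state s = p from the assumed uniform prior
        total += risk_multiplier(action, p, rng)  # draw c from Pr(c | a, s) and evaluate it
    return total / trials                         # average over trials

# Closed forms for comparison: E[0.5 + 0.5p] = 0.75 and
# E[0.1p^2 + 0.4p + 0.5] = 0.1/3 + 0.2 + 0.5 ≈ 0.733 under a uniform prior.
print(expected_multiplier("full"))
print(expected_multiplier("half"))
```

The simulated means should land near 0.75 and 0.733, reproducing the analytic conclusion that 2 half-strength pills dominate.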
Assignment 3, Problem 3
Certainty Equivalent calculation
• If you buy a raffle ticket for $2.00 and win, you will get
$19.00; else, you will receive nothing from the ticket.
• The probability of winning is 1/3
• Your utility function for final wealth x is u(x) = log(x)
• Your initial wealth (before deciding whether to buy the
ticket) is $10
• What is your certainty equivalent (selling price) for the
opportunity to buy this raffle ticket?
– Please submit one number
• Should you buy it?
– Please answer Yes or No
10
Assignment 3, Solution 3
Certainty Equivalent calculation
• If you buy a raffle ticket for $2 and win, you will get
$19.00; else, you will receive nothing from the ticket.
• The probability of winning is 1/3
• X = random variable for final wealth if you buy ticket = 10
- 2 +19 = $27 with probability 1/3, else 10 - 2 = $8.
• Your utility function for final wealth x is u(x) = log(x)
• Your initial wealth is $10
• Let CE = CE(X) = CE of final wealth if you buy ticket
• u(CE) = EU(X) = (1/3)*log(10 - 2 + 19) + (2/3)*log(10 - 2)
= 2.4849. CE = exp(2.4849) = $12. So, deciding to buy
the ticket increases your CE(wealth) from $10 to $12.
This transaction is worth $2 to you.
11
Assignment 3, Solution 3
Certainty Equivalent calculation
• u(CE) = EU(X) = (1/3)*log(10 - 2 + 19) + (2/3)*log(10 - 2)
= 2.4849. CE = exp(2.4849) = $12. So, ticket increases
your CE(wealth) from $10 to $12 and is worth $2 to you
• Note: EMV(X) = (1/3)*(10 - 2 + 19) + (2/3)*(10 - 2) =
$14.33, so your risk premium is $14.33 - $12.00 = $2.33
• Note: Suppose initial wealth is 1000: Then CE(final
wealth) is exp((1/3)*log(1000 - 2 + 19) + (2/3)*log(1000 -
2)) = 1004.29 (compared to EMV = 1000 + (1/3)*17 +
(2/3)*(-2) = 1004.33). Risk premium is $0.04.
12
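The CE arithmetic on this slide is easy to check mechanically. A minimal Python sketch (the function name and parameters are illustrative, not from the course):

```python
import math

def certainty_equivalent(wealth, price, prize, p_win):
    """CE of final wealth for a log utility, u(x) = log(x)."""
    # Expected utility of final wealth if you buy the ticket
    eu = p_win * math.log(wealth - price + prize) + (1 - p_win) * math.log(wealth - price)
    return math.exp(eu)   # invert u to recover the certainty equivalent

ce = certainty_equivalent(10, 2, 19, 1/3)
emv = (1/3) * (10 - 2 + 19) + (2/3) * (10 - 2)
print(round(ce, 2))        # CE of final wealth: 12.0
print(round(emv - ce, 2))  # risk premium: 2.33
```

Note that exp((1/3)·log 27 + (2/3)·log 8) = 27^(1/3)·8^(2/3) = 3·4 = 12 exactly, which is why the CE comes out so cleanly.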
Assignment 4, Problem 1:
Fair Coin Problem (due 2-14-17)
• A box contains two coins: (a) A fair coin; and
(b) A coin with a head on each side. One coin
is selected at random (we don’t know which)
and tossed once. It comes up heads.
• Q1: What is the probability that the coin is the
fair coin?
• Q2: If the same coin is tossed again and
shows heads again, then what is the new
(posterior) probability that it is the fair coin?
Solve manually and/or using Netica.
13
Assignment 4, Problem 2: Defective
Items (due 2-14-17)
• Machines 1, 2, and 3 produced (20%, 30%,
50%) of items in a large batch, respectively.
• The defect rates for items produced by these
machines are (1%, 2%, 3%), respectively.
• A randomly sampled item is found to be
defective. What is the probability that it was
produced by Machine 2?
• Exercise: (a) Solve using Netica (b) Solve
manually
• E-mail answer (a single number) to
[email protected]
14
Introduction to Bayesian
inference with Netica®
15
Example: HIV screening
• Pr(s) = 0.01 = fraction of population with HIV
– s = has HIV, s′ = does not have HIV
– y = test is positive
• Pr(test positive | HIV) = 0.99
• Pr(test positive | no HIV) = 0.02
• Find: Pr(HIV | test positive) = Pr(s | y)
– Subjective probability estimates?
16
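Before turning to Netica, the screening numbers can be checked directly with Bayes’ rule. A minimal sketch (function and argument names are mine, not Netica’s):

```python
def posterior(prior, sensitivity, false_pos_rate):
    """Bayes' rule: Pr(s | y) = Pr(y | s) Pr(s) / Pr(y)."""
    p_y = sensitivity * prior + false_pos_rate * (1 - prior)  # total Pr(test positive)
    return sensitivity * prior / p_y

# Pr(s) = 0.01, Pr(y | s) = 0.99, Pr(y | s') = 0.02
print(round(posterior(0.01, 0.99, 0.02), 4))  # 0.3333
```

This 1/3 posterior is exactly the 33.3% that Netica reports after entering the positive-test finding.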
Solution via Bayesian Network
(BN) Solver
• DAG model: “True state → Observation”
– DAG = “directed acyclic graph”: Nodes and arrows,
no cycles allowed
• Store “marginal probabilities” at input nodes
(having output arrows only)
• Store “conditional probability tables” at all other
nodes.
• Make observations
• Enter query
– Solver calculates conditional probabilities
17
Solution in Netica
• Step 1: Build model, compile network
HIV_status: HIV present 1.0, HIV not present 99.0
Test_result: test positive 2.97, test negative 97.0
18
Solution in Netica
• Step 2: Condition on observation (right-click, choose “Enter findings”), view conditional probabilities
HIV_status: HIV present 33.3, HIV not present 66.7
Test_result: test positive 100, test negative 0
19
Wrap-up on Netica introduction
• User just needs to enter model and
observations (“findings”)
• Netica uses Bayesian Network algorithms
to update all probabilities (conditioning
them on findings)
• We will learn to do this manually for small
problems
• Algorithms and software are essential for
large, complex inference problems
20
Review and wrap-up on decision
trees and probabilities
21
Decision tree ingredients
• Three types of nodes
– Choice nodes (squares)
– Chance nodes (circles)
– Terminal nodes / value nodes
• Arcs show how decisions and chance
events can unfold over time
– Uncertainties are resolved as time passes
and choices are made
22
Solving decision trees
• “Backward induction”
• “Stochastic dynamic programming”
– “Average out and roll back” → implicitly, the tree
determines Pr(c | a)
• Procedure:
– Start at tips of tree, work backward
– Compute expected value at each chance node
• “Averaging out”
– Choose maximum expected value at each
choice node
23
Obtaining Pr(s) from Decision trees
http://www.eogogics.com/talkgogics/tutorials/decision-tree
Decision 1: Develop or Do Not Develop
Development Successful + Development Unsuccessful
(70% × $172,000) + (30% × (−$500,000))
$120,400 + (−$150,000)
What happened to act a and state s?
http://www.eogogics.com/talkgogics/tutorials/decision-tree
What are the 3 possible acts in this tree?
(a) Don’t develop; (b) Develop, then rebuild if
successful; (c) Develop, then new line if successful.
Optimize decisions!
Key points
• Solving decision trees (with decisions)
requires embedded optimization
– Make future decisions optimally, given the
information available when they are made
• Event trees = decision trees with no decisions
– Can be solved, to find outcome probabilities,
by forward Monte-Carlo simulation, or by
multiplication and addition
• In general, sequential decision-making
cannot be modeled well using event trees.
– Must include (optimal choice | information)
What happened to state s?
http://www.eogogics.com/talkgogics/tutorials/decision-tree
What are the 4 possible states?
C1 can succeed or not; C2 can be high or low demand
Acts and states cause consequences
http://www.eogogics.com/talkgogics/tutorials/decision-tree
Key theoretical insight
• A complex decision model can be viewed as a
(possibly large) simple Pr(c | a) model.
– s = selection of branch at each chance node
– a = selection of branch at each choice node
– c = outcome at terminal node for (a, s)
– Pr(c | a) = ∑_s Pr(c | a, s)·Pr(s)
• Other complex decision models can also be
interpreted as c(a, s), Pr(c | a, s), or Pr(c | s) models
– s = system state & information signal
– a = decision rule (information  act)
– c may include changes in s and in possible a.
Real decision trees can quickly
become “bushy messes”
(Raiffa, 1968) with many
duplicated sub-trees
[Figure: a BSE-testing decision tree with decision nodes D1 (Track Imports / Don’t Track Imports) and D2 (Test All / Test CA only / Repeat Test), chance outcomes Y1, Y2 (No BSE / BSE in CA / BSE in US from US / BSE in US from CA), and many duplicated sub-trees labeled A, B, C.]
Influence Diagrams help to avoid
large trees
http://en.wikipedia.org/wiki/Decision_tree
Often much
more compact
than decision
trees
Limitations of decision trees
• Combinatorial explosion
– Example: Searching for a prize in one of N
boxes or locations involves a tree with
N! = N(N − 1)···2·1 possible inspection orders.
• Infinite trees
– Continuous variables
– When to stop growing a tree?
• How to evaluate utilities and probabilities?
39
Optimization formulations of
decision problems
• Example: Prize is in location j with prior
probability p(j), j = 1, 2, …, N
• It costs c(j) to inspect location j
• What search strategy minimizes expected
cost of finding prize?
– What is a strategy? Order in which to inspect
– How many are there? N!
40
With two locations, 1 and 2
Strategy 1: Inspect 1, then 2 if needed:
– Expected cost: c1 + (1 – p1)c2 = c1 + c2 – p1c2
Strategy 2: Inspect 2, then 1 if needed:
– Expected cost: c2 + (1 – p2)c1 = c1 + c2 – p2c1
Strategy 1 has lower expected cost if:
• p1c2 > p2c1, or p1/c1 > p2/c2
• So, look first at location with highest
success probability per unit cost
41
With N locations
• Optimal decision rule: Always inspect next
the (as-yet uninspected) location with the
greatest success probability-to-cost ratio
– Example of an “index policy,” “Gittins index”
– If M players take turns, competing to find prize,
each should still use this rule.
• A decision table or tree can be unwieldy
even for such simple optimization problems
42
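The two-location argument generalizes, and the index policy can be checked against brute-force enumeration in a few lines. A minimal Python sketch with hypothetical priors and costs (the numbers are mine, for illustration):

```python
from itertools import permutations

def expected_cost(order, p, c):
    """Expected total inspection cost when locations are checked in the given order."""
    total, pr_not_found_yet = 0.0, 1.0
    for j in order:
        total += pr_not_found_yet * c[j]  # pay c[j] only if the prize wasn't found earlier
        pr_not_found_yet -= p[j]
    return total

def greedy_order(p, c):
    """Index policy: always inspect the highest success-probability-per-unit-cost location next."""
    return sorted(range(len(p)), key=lambda j: p[j] / c[j], reverse=True)

p = [0.5, 0.3, 0.2]   # hypothetical prior probabilities p(j), summing to 1
c = [4.0, 1.0, 2.0]   # hypothetical inspection costs c(j)
best = min(permutations(range(3)), key=lambda o: expected_cost(o, p, c))
print(best, greedy_order(p, c))   # brute force over all 3! orders agrees with the index policy
```

With two locations, `expected_cost((0, 1), p, c)` reduces to exactly the c1 + (1 − p1)c2 formula on the previous slide.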
Other optimization formulations
• max_{a ∈ A} EU(a)
– Typically, a is a vector, A is the feasible set
– More generally, a is a strategy/policy/decision
rule, A is the choice set of feasible strategies
– In previous example, A = set of permutations
• max_{a ∈ A} EU(a)
s.t. EU(a) = ∑_c Pr(c | a)·u(c)
Pr(c | a) = ∑_s Pr(c | a, s)·p(s)
g(a) ≤ 0 (feasible set, A)
43
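The abstract program above is easy to instantiate for a tiny discrete problem. A minimal sketch with a hypothetical two-act, two-state example (all names and numbers are mine, for illustration):

```python
def expected_utility(a, states, p_s, p_c_given, u):
    """EU(a) = sum_s p(s) * sum_c Pr(c | a, s) * u(c)."""
    return sum(p_s[s] * sum(pc * u(c) for c, pc in p_c_given[(a, s)].items())
               for s in states)

states = ["good", "bad"]
p_s = {"good": 0.6, "bad": 0.4}          # prior over states
p_c_given = {                            # Pr(c | a, s): consequence -> probability
    ("risky", "good"): {100: 1.0},
    ("risky", "bad"):  {0: 1.0},
    ("safe",  "good"): {50: 1.0},
    ("safe",  "bad"):  {50: 1.0},
}
u = lambda c: c                          # risk-neutral utility, for illustration
best = max(["risky", "safe"], key=lambda a: expected_utility(a, states, p_s, p_c_given, u))
print(best)   # "risky": EU = 0.6*100 = 60 vs 50 for "safe"
```

Replacing `u` with a concave function (e.g. log) is how risk aversion enters, and replacing the act list with permutations recovers the search example.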
Advanced decision tree analysis
• Game trees
– Different decision-makers
• Monte Carlo tree search
(MCTS) in games with
risk and uncertainty
– https://jeffbradberry.com/posts/2015/09/intro-to-monte-carlo-tree-search/
– http://www.cameronius.com/research/mcts/about/index.html
– http://www.cameronius.com/cv/mcts-survey-master.pdf
– Generating trees:
http://stackoverflow.com/questions/23803186/monte-carlo-tree-search-implementation-for-tic-tac-toe
• Apply rules to expand and
evaluate nodes
• Learning trees from data
• Sequential testing
44
Summary on decision trees
• Decision trees show sequences of
choices, chance nodes, observations, and
final consequences.
– Mix observations, acts, optimization, causality
• Good for very small problems; less good
for medium-sized problems; unwieldy for
large problems → use IDs instead
• Can view decision trees and other
decision models as simple c(a, s) models
– But need good optimization solvers!
Road map: Filling in the normal
form matrix
• Assessing probabilities
– Eliciting well-calibrated probabilities
– Deriving probabilities from models
– Estimating probabilities from data
• Assessing utilities
– Utility elicitation
– Single-attribute utility theory
– Multi-attribute utility theory
46
Binomial probability model
47
Some useful probability models
• Uniform = unif
• Binomial (n trials, 2 outcomes) = binom
• Poisson (“rare events” law) = pois
• Exponential (waiting time) = exp
• Normal (sums, random errors) = norm
• Beta (proportions) = beta
• p = distribution function (cdf)
• r = random sample (simulation)
• q = quantile, d = density
48
Binomial model
pbinom(x, n, p)
• Two outcomes on each of n independent
trials, “success” and “failure”
• Probability of success = p for each trial
independently
• Expected number of successes in n trials
with success probability p = ?
• Probability of no more than x successes in
n trials with success probability p =
pbinom(x, n, p)
49
pbinom(x, n, p) includes
probability of x successes
• Expected number of successes in n trials
with success probability p = np
• Probability of no more than x successes
in n trials with success probability p =
pbinom(x, n, p)
• Note that pbinom is for less than or equal
to x successes in n trials
50
Binomial model
pbinom(x, n, p)
• 2 outcomes on each of n independent trials
• P(success) = p for each trial independently
• E(successes in n trials) = np = mean
• Pr(x successes in n trials) = C(n, x)·p^x·(1 − p)^(n−x) =
dbinom(x,n,p)
– C(n, x) = “n choose x” = number of combinations of
n things taken x at a time
= n(n−1)…(n − x + 1)/x!
• Example: Pr(1 or 2 heads in 4 tosses of a fair
coin) = ?
51
Binomial model
pbinom(x,n,p), dbinom(x,n,p)
• Pr(1 or 2 heads in 4 tosses of a fair coin) =
Pr(1 head) + Pr(2 heads)
= C(4,1)·p¹·(1 − p)³ + C(4,2)·p²·(1 − p)² = (4 + 6)·0.5⁴
= 10/16 = 5/8 = 0.625
= dbinom(1,4,0.5) + dbinom(2,4,0.5)
= pbinom(2,4,0.5) − pbinom(0,4,0.5)
52
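The coin calculation can also be reproduced from scratch. A Python cross-check of R’s dbinom/pbinom (the function names here mimic R’s, but the implementations are mine, built from the pmf formula above):

```python
from math import comb

def dbinom(x, n, p):
    """Binomial pmf: C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def pbinom(x, n, p):
    """Binomial cdf: Pr(no more than x successes), inclusive of x."""
    return sum(dbinom(k, n, p) for k in range(x + 1))

# Pr(1 or 2 heads in 4 tosses of a fair coin), two ways:
print(dbinom(1, 4, 0.5) + dbinom(2, 4, 0.5))   # 0.625
print(pbinom(2, 4, 0.5) - pbinom(0, 4, 0.5))   # 0.625
```

The cdf difference subtracts Pr(0 heads) rather than pbinom(1, …) precisely because pbinom is inclusive of its first argument.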
Example of binomial model
pbinom(x, n, p)
• Susan goes skiing each weekend if the
weather is good
– n = 12 weekends in ski season
– Probability of good weather = 0.65 for each
weekend independently
• What is the probability that she will ski for 8
or more weekends? (Use pbinom)
• Find her expected number of ski weekends
53
Do it!
54
Example of binomial model
pbinom(x, n = 12, p = 0.65)
• Expected number of weekends she skis is
np = ?
• Probability of skiing for 8 or more
weekends = 1 – Pr(no more than 7 ski
trips in 12 weekends, with p = 0.65 for
each) = ?
55
Example of binomial model
pbinom(x, n = 12, p = 0.65)
• Expected number of weekends she skis is
np = 12*0.65 = 7.8
• Probability of skiing for 8 or more
weekends = 1 – Pr(no more than 7 ski
trips in 12 weekends, with p = 0.65 for
each) = 1- pbinom(7, 12, 0.65)
> 1- pbinom(7, 12, 0.65)
[1] 0.583345
56
Optional practice problems on binomial
calculations: Do using R
1. Ten percent of computer parts produced by a certain supplier are
defective. What is the probability that a sample of 10 parts contains
more than 3 defective ones?
2. On the average, two tornadoes hit major U.S. metropolitan areas
every year. What is the probability that more than five tornadoes
occur in major U.S. metropolitan areas next year?
3. A lab network consisting of 20 computers was attacked by a
computer virus. This virus enters each computer with probability
0.4, independently of other computers. a) Find the probability that
the virus enters at least 10 computers. b) A computer manager
checks the lab computers, one after another, to see if they were
infected by the virus. What is the probability that she has to test at
least 6 computers to find the first infected one?
• Check answers at www.utdallas.edu/~mbaron/3341/Practice4.pdf
• E-mail any questions on R solutions to [email protected]
57
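The practice problems ask for R, but the same tail probabilities can be cross-checked in Python. These helper functions are my own stand-ins for R’s pbinom/ppois (names and signatures are illustrative); problem 3b uses the fact that needing at least 6 tests means the first 5 computers were uninfected.

```python
from math import comb, exp, factorial

def binom_tail(k_min, n, p):
    """Pr(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

def pois_tail(k_min, lam):
    """Pr(X >= k_min) for X ~ Poisson(lam)."""
    return 1 - sum(exp(-lam) * lam**k / factorial(k) for k in range(k_min))

print(binom_tail(4, 10, 0.1))   # 1. Pr(more than 3 of 10 parts defective)
print(pois_tail(6, 2))          # 2. Pr(more than 5 tornadoes), Poisson mean 2
print(binom_tail(10, 20, 0.4))  # 3a. Pr(virus enters at least 10 of 20 computers)
print(0.6 ** 5)                 # 3b. Pr(first 5 computers tested are uninfected)
```

Compare the printed values against the answer sheet at the URL above.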
Plotting a binomial distribution
(probability density function)
> x = c(0:12)
> y = dbinom(x, 12, 0.65);
plot(x,y)
Probability distribution for
number of ski weekends
This “probability density”
or “probability mass”
function lets us calculate
expected utility of season
pass if its utility is
determined by the number
of ski weekends.
58
Plotting a binomial distribution
> barplot(dbinom(x, 12, 0.65))
59
Risk profile (CDF) for binomial
R: x <- c(0:12)
R: y <- pbinom(x, 12, 0.65)
G: plot(x, y)
60
Using the binomial model to
calculate probabilities
• A company will remain solvent if at least 3
of its 8 markets are profitable.
• The probability that each market is
profitable is 25%.
• What is the probability that the company
remains solvent?
61
Using the binomial model to
calculate probabilities
• A company will remain solvent if at least 3
of its 8 markets are profitable.
• The probability that each market is
profitable is 25%.
• What is the probability that the company
remains solvent?
• Pr(no more than 5 failures) =
pbinom(5, 8, 0.75)
[1] 0.3214569
62
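The solvency probability can be computed either by counting unprofitable markets (as on the slide) or by counting profitable ones. A minimal Python cross-check (the pbinom here is my own cdf implementation mimicking R’s):

```python
from math import comb

def pbinom(x, n, p):
    """Pr(no more than x successes in n trials with success probability p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# Solvent iff at least 3 of 8 markets are profitable (p = 0.25 each),
# i.e. no more than 5 markets are unprofitable (failure probability 0.75).
print(pbinom(5, 8, 0.75))        # counting failures, as on the slide
print(1 - pbinom(2, 8, 0.25))    # counting successes: the same event
```

Both expressions give 0.3214569, agreeing with the R output.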
Bayesian analysis
and probability basics
63
How to get needed probabilities?
1. Derive from other probabilities and
models; condition on data
– Bayes’ rule, decomposition and logic, event
trees, fault trees, probability theory & models
– Monte Carlo simulation models
2. Make them up (subjective probabilities),
ask others (elicitation)
– Calibration, biases (e.g., over-confidence)
3. Estimate them from data
64