Monte Carlo Analysis
David M. Hassenzahl
Copyright © 2004 David M. Hassenzahl
Purpose of lecture
• Introduce Monte Carlo Analysis as a
tool for managing uncertainty
• Demonstrate how it can be used in the
policy setting
• Discuss its uses and shortcomings, and
how they are relevant to policy making
processes
What is Monte Carlo Analysis?
It is a tool for combining distributions, and
thereby propagating more than just
summary statistics
It uses random number generation, rather
than analytic calculations
It is increasingly popular thanks to
high-speed personal computers
Background/History
• “Monte Carlo” from the gambling town of the
same name (no surprise)
• First applied in 1947 to model diffusion of
neutrons through fissile materials
• Limited use at first because it was time consuming
• Much more common since the late 1980s
• Too easy now?
• The name raises a concern: is EPA “gambling” with people’s lives?
(anecdotal, but reasonable)
Why Perform Monte Carlo
Analysis?
• Combining distributions
• With more than two distributions,
solving analytically is very difficult
• Simple calculations lose information
– Mean × mean = mean
– 95th %ile × 95th %ile ≠ 95th %ile!
– Gets “worse” with 3 or more distributions
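The point about percentiles can be checked numerically. The sketch below is only an illustration: it assumes two independent lognormal inputs with arbitrary parameters (hypothetical, not taken from the lecture) and compares the product of the two 95th percentiles with the 95th percentile of the product.

```python
import random

rng = random.Random(1)          # fixed seed so the run is repeatable
N = 20_000

def percentile(values, q):
    """Return the q-th percentile (0-100) by nearest rank on sorted data."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q / 100 * len(ordered)))
    return ordered[index]

# Two hypothetical, independent inputs (parameters chosen only for illustration).
a = [rng.lognormvariate(0.0, 0.5) for _ in range(N)]
b = [rng.lognormvariate(0.0, 0.5) for _ in range(N)]
product = [x * y for x, y in zip(a, b)]

product_of_p95 = percentile(a, 95) * percentile(b, 95)
p95_of_product = percentile(product, 95)
share_above = sum(v > product_of_p95 for v in product) / N

print(f"95th %ile x 95th %ile        = {product_of_p95:.2f}")
print(f"95th %ile of the product     = {p95_of_product:.2f}")
print(f"share of products above that = {share_above:.2%}")
```

Under these assumptions, multiplying the two 95th percentiles lands well above the 95th percentile of the product, which is the "excess conservatism" the bullet warns about.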
Monte Carlo Analysis
• Takes an equation
– example: Risk = probability × consequence
• Instead of simple numbers, draws
randomly from defined distributions
• Multiplies the two, stores the answer
• Repeats this over and over and over…
• Then the set of results is displayed as a
new, combined distribution
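The procedure in the bullets above can be sketched in a few lines of Python. This is a minimal sketch, not the lecture's software: the function name `monte_carlo` and both input distributions are hypothetical, chosen only to make the example runnable.

```python
import random

def monte_carlo(model, samplers, iterations, seed=0):
    """Draw one value from each sampler, apply the model, and repeat."""
    rng = random.Random(seed)
    results = []
    for _ in range(iterations):
        draws = [sample(rng) for sample in samplers]   # one random draw per input
        results.append(model(*draws))                  # combine and store the answer
    return results

# Hypothetical inputs, chosen only for illustration.
def draw_probability(rng):
    return rng.uniform(0.01, 0.05)          # chance that the event occurs

def draw_consequence(rng):
    return rng.triangular(10, 100, 30)      # size of the loss: low, high, mode

def risk_model(probability, consequence):
    return probability * consequence

risks = monte_carlo(risk_model, [draw_probability, draw_consequence], iterations=5_000)
mean_risk = sum(risks) / len(risks)
print(f"{len(risks)} results, mean risk = {mean_risk:.2f}")
```

The list `risks` is the "new, combined distribution": it can be summarized, sorted for percentiles, or plotted as a histogram.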
Simple (hypothetical) example
• Skin cream additive is an irritant
• Many samples of cream provide information
on concentration:
– mean 0.02 mg chemical
– standard dev. 0.005 mg chemical
• Two tests show probability of irritation given
application
– low freq of effect per mg exposure = 5/100/mg
– high freq of effect per mg exposure = 10/100/mg
Analytical results
• Risk = exposure × potency
– Mean risk = 0.02 mg × 0.075/mg = 0.0015,
or 15 out of 10,000 applications will result in irritation
Analytical results
• “Conservative estimate”
– Use upper 95th %ile
– Risk = 0.03 mg × 0.0975/mg = 0.0029
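The two point estimates can be reproduced as follows. This is a sketch under one stated assumption: the slide's 0.03 mg upper exposure is read here as the mean plus two standard deviations, which is slightly above the exact 95th percentile of the normal distribution.

```python
from statistics import NormalDist

exposure = NormalDist(mu=0.02, sigma=0.005)     # mg chemical per application
potency_low, potency_high = 0.05, 0.10          # probability of irritation per mg

mean_potency = (potency_low + potency_high) / 2                     # 0.075 /mg
potency_p95 = potency_low + 0.95 * (potency_high - potency_low)     # 0.0975 /mg

mean_risk = exposure.mean * mean_potency

# Assumption: the slide's 0.03 mg is the mean plus two standard deviations.
exposure_upper = exposure.mean + 2 * exposure.stdev
conservative_risk = exposure_upper * potency_p95

# For comparison, the exact 95th percentile of the normal exposure distribution.
exposure_p95 = exposure.inv_cdf(0.95)

print(f"mean risk                = {mean_risk:.4f}")
print(f"conservative risk        = {conservative_risk:.4f}")
print(f"exact 95th %ile exposure = {exposure_p95:.4f} mg")
```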
Monte Carlo: Visual example
[Figure: the two input distributions. Exposure (mg chemical), axis from 0.01 to 0.03; potency (probability of irritation per mg chemical), axis from 0.05 to 0.10.]
Exposure = normal (mean = 0.02 mg, s.d. = 0.005 mg)
Potency = uniform (range 0.05/mg to 0.10/mg)
Random draw one
[Figure: draw one marked on the two input distributions: exposure = 0.0165 mg, potency = 0.063/mg]
p(irritate) = 0.0165 mg × 0.063/mg = 0.0010
Random draw two
[Figure: draw two marked on the two input distributions: exposure = 0.0175 mg, potency = 0.089/mg]
p(irritate) = 0.0175 mg × 0.089/mg = 0.0016
Summary: {0.0010, 0.0016}
Random draw three
[Figure: draw three marked on the two input distributions: exposure = 0.0152 mg, potency = 0.057/mg]
p(irritate) = 0.0152 mg × 0.057/mg = 0.00087
Summary: {0.0010, 0.0016, 0.00087}
Random draw four
[Figure: draw four marked on the two input distributions: exposure = 0.0238 mg, potency = 0.085/mg]
p(irritate) = 0.0238 mg × 0.085/mg = 0.0020
Summary: {0.0010, 0.0016, 0.00087, 0.0020}
After ten random draws
Summary: {0.0010, 0.0016, 0.00087, 0.0020, 0.0011,
0.0018, 0.0024, 0.0016, 0.0015, 0.00062}
mean 0.0014
standard deviation 0.00055
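The summary statistics for the ten draws can be recomputed directly. A small sketch, assuming the standard deviation quoted is the sample standard deviation (n - 1 denominator):

```python
from statistics import mean, stdev

draws = [0.0010, 0.0016, 0.00087, 0.0020, 0.0011,
         0.0018, 0.0024, 0.0016, 0.0015, 0.00062]

draw_mean = mean(draws)
draw_sd = stdev(draws)      # sample standard deviation (n - 1 denominator)
print(f"mean = {draw_mean:.4f}, standard deviation = {draw_sd:.5f}")
```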
Using software
• Could write this program using a
random number generator
• But several software packages are
available
• I use Crystal Ball
– user friendly
– customizable
– random number generator (r.n.g.) good up to about 10,000 iterations
100 iterations (about two seconds)
• Monte Carlo results
– Mean: 0.0016
– Standard deviation: 0.00048
– “Conservative” estimate: 0.0026
• Compare to analytical results
– Mean: 0.0015
– Standard deviation: n/a
– “Conservative” estimate: 0.0029
Summary chart - 100 trials
[Figure: Crystal Ball frequency chart, Forecast: P(Irritation). 100 trials, 1 outlier; labeled x-axis values include 0.00103, 0.00161, and 0.00311.]
Summary - 10,000 trials
• Monte Carlo results
– Mean: 0.0015
– Standard deviation: 0.000472
– “Conservative” estimate: 0.0024
• Compare to analytical results
– Mean: 0.0015
– Standard deviation: n/a
– “Conservative” estimate: 0.0029
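The skin cream example can also be run without a commercial package. The sketch below uses Python's standard random number generator rather than Crystal Ball, so its numbers will differ slightly from those reported here; it also assumes that the "conservative" Monte Carlo estimate is the 95th percentile of the simulated results.

```python
import random
from statistics import mean, stdev

def simulate(iterations, seed=2004):
    """Return simulated P(irritation) values for the skin cream example."""
    rng = random.Random(seed)
    results = []
    for _ in range(iterations):
        exposure = rng.gauss(0.02, 0.005)     # mg chemical, normal
        potency = rng.uniform(0.05, 0.10)     # per mg, uniform
        results.append(exposure * potency)
    return results

def summarize(results):
    """Return (mean, standard deviation, nearest-rank 95th percentile)."""
    ordered = sorted(results)
    p95 = ordered[int(0.95 * len(ordered))]
    return mean(results), stdev(results), p95

summary_100 = summarize(simulate(100))
summary_10000 = summarize(simulate(10_000))
for label, (m, s, p95) in (("100", summary_100), ("10,000", summary_10000)):
    print(f"{label:>6} trials: mean={m:.4f}  sd={s:.5f}  95th %ile={p95:.4f}")
```

The normal exposure distribution can in principle produce a negative draw; with these parameters that is four standard deviations below the mean and rare enough to ignore in a sketch.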
Summary chart - 10,000 trials
[Figure: Crystal Ball frequency chart, Forecast: P(Irritation). 10,000 trials, 88 outliers; labeled x-axis values include 0.00069, 0.00150, and 0.00331.]
About 1.5 minutes run time
Policy applications
• When there are many distributional
inputs
• Concern about “excessive
conservatism”
– multiplying 95th percentiles
– multiple exposures
• Because we can
• Bayesian calculations
Issues: Sensitivity Analysis
• Sensitivity analysis looks at which input
distributions have the greatest effect on
the eventual distribution
• Helps to understand which parameters
can both be influenced by policy and
reduce risks
• Helps understand when better data can
be most valuable (information isn’t
free…nor even cheap)
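One common way to carry out a sensitivity analysis is to rank-correlate each input with the output. The sketch below applies that idea to the skin cream example; the choice of Spearman rank correlation and the helper names are assumptions for illustration, not necessarily what any particular package does.

```python
import random

def ranks(values):
    """Rank of each value (0 = smallest); ties are not expected for continuous draws."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def rank_correlation(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

rng = random.Random(7)
exposure = [rng.gauss(0.02, 0.005) for _ in range(5_000)]
potency = [rng.uniform(0.05, 0.10) for _ in range(5_000)]
risk = [e * p for e, p in zip(exposure, potency)]

sensitivity = {
    "exposure": rank_correlation(exposure, risk),
    "potency": rank_correlation(potency, risk),
}
for name, value in sorted(sensitivity.items(), key=lambda item: -item[1]):
    print(f"{name:<9} rank correlation with risk = {value:.2f}")
```

The input with the larger rank correlation is the one whose uncertainty drives more of the spread in the result, and so the one where better data would help most.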
Issues: Correlation
• Two distributions are correlated when a
change in one is associated with a change in
the other
• Example: People who eat lots of peas
may eat less broccoli (or may eat
more…)
• Usually doesn’t have much effect unless
the correlation is strong (|ρ| > 0.75)
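To see how much a given correlation changes a result, correlated inputs can be generated by mixing a shared random draw with an independent one. The sketch below uses the peas and broccoli example with hypothetical normal intakes (mean 50 g/day, s.d. 10 g/day, both invented for illustration) and reports the spread of the total for three values of ρ.

```python
import random
from statistics import stdev

def total_intake_sd(rho, iterations=10_000, seed=11):
    """Std. dev. of peas + broccoli intake when the two inputs have correlation rho."""
    rng = random.Random(seed)
    totals = []
    for _ in range(iterations):
        z1 = rng.gauss(0, 1)
        # Mix z1 with an independent draw so that corr(z1, z2) = rho.
        z2 = rho * z1 + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        peas = 50 + 10 * z1          # hypothetical g/day
        broccoli = 50 + 10 * z2      # hypothetical g/day
        totals.append(peas + broccoli)
    return stdev(totals)

for rho in (-0.9, 0.0, 0.9):
    print(f"rho = {rho:+.1f}: sd of total intake = {total_intake_sd(rho):.1f} g/day")
```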
Generating Distributions
• Invalid distributions create invalid
results, which leads to inappropriate
policies
• Two options
– empirical
– theoretical
Empirical Distributions
• Most appropriate when developed for
the issue at hand.
• Example: local fish consumption
– survey individuals or otherwise estimate
– data from individuals elsewhere may be
very misleading
• A number of very large data sets have
been developed and published
Empirical Distributions
• Challenge: when there’s very little data
• Example of two data points
– uniform distribution?
– triangular distribution?
– not a hypothetical issue…is an ongoing
debate in the literature
• Key is to state clearly your assumptions
• Better yet…do it both ways!
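"Doing it both ways" is cheap once the simulation exists. The sketch below reruns the skin cream example with the two potency data points treated first as the ends of a uniform distribution and then as the ends of a triangular one. The triangular mode (the midpoint, 0.075/mg) is an assumption made only for this illustration.

```python
import random
from statistics import mean, stdev

def simulate_risk(draw_potency, iterations=20_000, seed=3):
    """Simulate exposure x potency with a caller-supplied potency distribution."""
    rng = random.Random(seed)
    return [rng.gauss(0.02, 0.005) * draw_potency(rng) for _ in range(iterations)]

def uniform_potency(rng):
    return rng.uniform(0.05, 0.10)

def triangular_potency(rng):
    # Assumption: the mode sits midway between the two data points.
    return rng.triangular(0.05, 0.10, 0.075)

results = {
    "uniform": simulate_risk(uniform_potency),
    "triangular": simulate_risk(triangular_potency),
}
for name, risks in results.items():
    p95 = sorted(risks)[int(0.95 * len(risks))]
    print(f"{name:<10} mean={mean(risks):.4f}  sd={stdev(risks):.5f}  95th %ile={p95:.4f}")
```

With these assumptions the two means agree, while the triangular choice gives a narrower spread, so the choice matters most for upper-percentile ("conservative") estimates.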
Which Distribution?
[Figure: four candidate distributions for potency (probability of irritation per mg chemical), each drawn over the range 0.05 to 0.10.]
Random number generation
• Shouldn’t be an issue…@Risk and
Crystal Ball are both good to at least
10,000 iterations
• 10,000 iterations is typically enough,
even with many input distributions
Theoretical Distributions
• Appropriate when there’s some
mechanistic or probabilistic basis
• Example: small sample (say 50 test
animals) establishes a binomial
distribution
• Lognormal distributions show up often
in nature
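One probabilistic reason lognormal shapes are common is that a quantity formed by multiplying many independent positive factors tends toward a lognormal distribution. The sketch below illustrates this with hypothetical uniform factors (the range 0.5 to 1.5 and the count of 8 are arbitrary choices): the product is right-skewed, while its logarithm is close to symmetric.

```python
import math
import random
from statistics import mean, median

rng = random.Random(5)

def product_of_factors(count=8):
    """Multiply several independent positive factors (hypothetical uniform draws)."""
    value = 1.0
    for _ in range(count):
        value *= rng.uniform(0.5, 1.5)
    return value

products = [product_of_factors() for _ in range(10_000)]
logs = [math.log(p) for p in products]

# Right skew on the original scale: the mean sits well above the median.
print(f"products: mean={mean(products):.2f}  median={median(products):.2f}")
# Near symmetry on the log scale, as a lognormal distribution would show.
print(f"logs:     mean={mean(logs):.2f}  median={median(logs):.2f}")
```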
Some Caveats
• Beware believing that you’ve really
“understood” uncertainty
• Beware: misapplication
– ignorance at best
– fraudulent at worst…porcine hoof blister
Example (after Finkel)
Alar “versus” aflatoxin
Exposure has two elements
Peanut butter consumption
aflatoxin residue
Juice consumption
Alar/UDMH residue
Potency has one element
aflatoxin potency
UDMH potency
Risk = (consumption × residue × potency) / body weight
Inputs for Alar & aflatoxin
Variable                   Units      Mean    5th %ile  95th %ile  Percentile location of the mean
Peanut butter consumption  g/day      11.38   2.00      31.86      66
Apple juice consumption    g/day      136.84  16.02     430.02     69
Aflatoxin residue          μg/g       2.82    1.00      6.50       61
UDMH residue               μg/g       13.75   0.5       42.00      67
Aflatoxin potency          kg-day/mg  17.5    4.02      28.23      61
UDMH potency               kg-day/mg  0.49    0.00      0.85       43
Alar and aflatoxin point estimates
• aflatoxin estimates:
– Mean = (11.38 g/day × 2.82 μg/g × 17.5 kg-day/mg) / 20 kg × (1 mg / 1000 μg) = 0.028
– Conservative = 0.29
• Alar (UDMH) estimates:
– Mean = 0.046
– Conservative = 0.77
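All four point estimates follow from the inputs table and the risk equation. The sketch below assumes residues are in μg/g (which the factor of 1000 in the mean calculation implies), a 20 kg body weight as in that calculation, and that "conservative" means multiplying the three 95th percentiles.

```python
BODY_WEIGHT_KG = 20.0
UG_PER_MG = 1000.0

def risk(consumption_g_day, residue_ug_g, potency_kg_day_mg):
    """Risk = (consumption x residue x potency) / body weight, with ug converted to mg."""
    dose_mg_per_kg_day = consumption_g_day * residue_ug_g / UG_PER_MG / BODY_WEIGHT_KG
    return dose_mg_per_kg_day * potency_kg_day_mg

# (mean, 95th percentile) pairs from the inputs table.
inputs = {
    "aflatoxin": {"consumption": (11.38, 31.86), "residue": (2.82, 6.50), "potency": (17.5, 28.23)},
    "Alar (UDMH)": {"consumption": (136.84, 430.02), "residue": (13.75, 42.00), "potency": (0.49, 0.85)},
}

estimates = {}
for name, v in inputs.items():
    mean_risk = risk(v["consumption"][0], v["residue"][0], v["potency"][0])
    conservative = risk(v["consumption"][1], v["residue"][1], v["potency"][1])
    estimates[name] = (mean_risk, conservative)
    print(f"{name:<12} mean = {mean_risk:.3f}  conservative = {conservative:.2f}")
```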
Alar and aflatoxin Monte Carlo
• 10,000 runs
• Generate distributions
– (don’t allow 0)
• Don’t expect correlation
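The lecture does not say which distributions were generated, so the sketch below makes a loud assumption: each aflatoxin input is lognormal, fitted so that its 5th and 95th percentiles match the inputs table. Lognormal draws are strictly positive (no zeros), and the three inputs are drawn independently (no correlation). The results will therefore not match the Crystal Ball run exactly.

```python
import math
import random
from statistics import mean

Z95 = 1.645   # standard normal 95th percentile

def lognormal_from_percentiles(p5, p95):
    """Return (mu, sigma) of a lognormal whose 5th and 95th percentiles are p5 and p95."""
    mu = (math.log(p5) + math.log(p95)) / 2
    sigma = (math.log(p95) - math.log(p5)) / (2 * Z95)
    return mu, sigma

# 5th and 95th percentiles for aflatoxin from the inputs table.
consumption = lognormal_from_percentiles(2.00, 31.86)   # g/day
residue = lognormal_from_percentiles(1.00, 6.50)        # ug/g (unit assumed)
potency = lognormal_from_percentiles(4.02, 28.23)       # kg-day/mg

rng = random.Random(1995)
risks = []
for _ in range(10_000):
    c = rng.lognormvariate(*consumption)
    r = rng.lognormvariate(*residue)
    q = rng.lognormvariate(*potency)
    risks.append(c * r * q / 1000 / 20)   # ug -> mg, then divide by 20 kg body weight

p95 = sorted(risks)[int(0.95 * len(risks))]
print(f"mean = {mean(risks):.3f}, 95th %ile = {p95:.3f}")
print(f"product of input 95th %iles = {31.86 * 6.50 * 28.23 / 1000 / 20:.2f}")
```

Even under these rough assumptions, the simulated 95th percentile falls well below the product of the three input 95th percentiles.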
Aflatoxin and Alar Monte Carlo results (point values)
              Aflatoxin               Alar
              Mean    Conservative    Mean    Conservative
Analytical    0.028   0.29            0.046   0.77
Monte Carlo   0.028   0.095           0.046   0.18
Aflatoxin and Alar Monte Carlo results (distributions)
[Figure: frequency chart, Forecast: peanut butter risk. 10,000 trials, 192 outliers; x-axis from 0 to 0.15. Certainty is 98.05% from -Infinity to 0.1495.]
[Figure: frequency chart, Forecast: apple juice risk. 10,000 trials, 125 outliers; x-axis from 0 to 0.45. Certainty is 93.93% from -Infinity to 0.15.]
[Figure: cumulative chart, Forecast: peanut butter risk. 10,000 trials, 192 outliers; x-axis from 0 to 0.15. Certainty is 98.04% from -Infinity to 0.1495.]
[Figure: cumulative chart, Forecast: apple juice risk. 10,000 trials, 125 outliers; x-axis from 0 to 0.45. Certainty is 93.93% from -Infinity to 0.15.]
[Figure: overlay chart, frequency distribution comparison of peanut butter risk and apple juice risk; x-axis from 0 to 0.45.]
[Figure: overlay chart, cumulative distribution comparison of peanut butter risk and apple juice risk; x-axis from 0 to 0.45.]
References and Further Reading
Burmaster, D.E. and Anderson, P.D. (1994). “Principles of good practice for the use of Monte Carlo techniques in human health and ecological risk assessments.” Risk Analysis 14(4):447-81.
Finkel, A. (1995). “Towards less misleading comparisons of uncertain risks: the example of aflatoxin and Alar.” Environmental Health Perspectives 103(4):376-85.
Kammen, D.M. and Hassenzahl, D.M. (1999). Should We Risk It? Exploring Environmental, Health and Technological Problem Solving. Princeton University Press, Princeton, NJ.
Thompson, K.M., Burmaster, D.E., et al. (1992). “Monte Carlo techniques for uncertainty analysis in public health risk assessments.” Risk Analysis 12(1):53-63.
Vose, D. (1997). “Monte Carlo Risk Analysis Modeling.” In Molak, Ed., Fundamentals of Risk Analysis and Risk Management.