Monte Carlo Analysis
David M. Hassenzahl
Copyright © 2004 David M. Hassenzahl

Purpose of lecture
• Introduce Monte Carlo Analysis as a tool for managing uncertainty
• Demonstrate how it can be used in the policy setting
• Discuss its uses and shortcomings, and how they are relevant to policy-making processes

What is Monte Carlo Analysis?
• It is a tool for combining distributions, and thereby propagating more than just summary statistics
• It uses random number generation rather than analytic calculations
• It is increasingly popular due to high-speed personal computers

Background/History
• “Monte Carlo” from the gambling town of the same name (no surprise)
• First applied in 1947 to model diffusion of neutrons through fissile materials
• Limited use because time consuming
• Much more common since the late 80’s
• Too easy now?
• The name raises a question: is EPA “gambling” with people’s lives? (anecdotal, but reasonable)

Why Perform Monte Carlo Analysis?
• Combining distributions
• With more than two distributions, solving analytically is very difficult
• Simple calculations lose information
  – mean × mean = mean (for independent inputs)
  – 95th %ile × 95th %ile ≠ 95th %ile!
  – Gets “worse” with 3 or more distributions

Monte Carlo Analysis
• Takes an equation
  – example: Risk = probability × consequence
• Instead of simple numbers, draws randomly from defined distributions
• Multiplies the two, stores the answer
• Repeats this over and over and over…
• Then the set of results is displayed as a new, combined distribution

Simple (hypothetical) example
• Skin cream additive is an irritant
• Many samples of cream provide information on concentration:
  – mean 0.02 mg chemical
  – standard deviation 0.005 mg chemical
• Two tests show probability of irritation given application
  – low frequency of effect per mg exposure = 5/100/mg
  – high frequency of effect per mg exposure = 10/100/mg

Analytical results
• Risk = exposure × potency
  – Mean risk = 0.02 mg × 0.075/mg = 0.0015, or 15 out of 10,000 applications will result in irritation

Analytical results
• “Conservative estimate”
  – Use upper 95th %ile of each input
  – Risk = 0.03 mg × 0.0975/mg = 0.0029

Monte Carlo: Visual example
[Figure: the two input distributions. Exposure (mg chemical), axis 0.01 to 0.03; potency (probability of irritation per mg chemical), axis 0.05 to 0.10]
• Exposure = normal (mean = 0.02 mg, s.d. = 0.005 mg)
• Potency = uniform (range 0.05/mg to 0.10/mg)

Random draw one
• Exposure draw = 0.0165 mg; potency draw = 0.063/mg
• p(irritate) = 0.0165 mg × 0.063/mg = 0.0010

Random draw two
• Exposure draw = 0.0175 mg; potency draw = 0.089/mg
• p(irritate) = 0.0175 mg × 0.089/mg = 0.0016
• Summary: {0.0010, 0.0016}

Random draw three
• Exposure draw = 0.0152 mg; potency draw = 0.057/mg
• p(irritate) = 0.0152 mg × 0.057/mg = 0.00087
• Summary: {0.0010, 0.0016, 0.00087}

Random draw four
• Exposure draw = 0.0238 mg; potency draw = 0.085/mg
• p(irritate) = 0.0238 mg × 0.085/mg = 0.0020
• Summary: {0.0010, 0.0016, 0.00087, 0.0020}

After ten random draws
• Summary: {0.0010, 0.0016, 0.00087, 0.0020, 0.0011, 0.0018, 0.0024, 0.0016, 0.0015, 0.00062}
• Mean: 0.0014
• Standard deviation: 0.00055
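The draw, multiply, store, repeat loop of the skin cream example can be sketched in a few lines of Python. This is a minimal illustration, not the Crystal Ball procedure used in the lecture: the function names, the seed, and the use of Python's standard-library generator are assumptions. The input distributions are the ones stated in the example: exposure = normal (mean 0.02 mg, s.d. 0.005 mg) and potency = uniform (0.05/mg to 0.10/mg).

```python
import random
import statistics


def simulate_irritation_risk(n_trials, seed=1):
    """Return n_trials Monte Carlo draws of risk = exposure x potency.

    The function name and default seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    risks = []
    for _ in range(n_trials):
        # Exposure (mg): normal, mean 0.02, s.d. 0.005. A normal can in
        # principle return a negative value; zero is 4 s.d. below the mean
        # here, so this sketch does not truncate.
        exposure = rng.gauss(0.02, 0.005)
        # Potency (probability of irritation per mg): uniform on 0.05 to 0.10.
        potency = rng.uniform(0.05, 0.10)
        risks.append(exposure * potency)
    return risks


def summarize(risks):
    """Return mean, sample standard deviation, and 95th percentile."""
    # quantiles(..., n=20) returns 19 cut points; the last is the 95th %ile.
    p95 = statistics.quantiles(risks, n=20)[-1]
    return statistics.mean(risks), statistics.stdev(risks), p95


if __name__ == "__main__":
    for n in (100, 10_000):
        mean, sd, p95 = summarize(simulate_irritation_risk(n))
        print(f"{n} trials: mean={mean:.4f} sd={sd:.5f} 95th %ile={p95:.4f}")
```

With enough trials the simulated mean should settle near the analytical 0.0015, while the simulated 95th percentile should come out below the 0.0029 obtained by multiplying two upper-bound inputs.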
Using software
• Could write this program using a random number generator
• But there are several software packages available
• I use Crystal Ball
  – user friendly
  – customizable
  – r.n.g. good up to about 10,000 iterations

100 iterations (about two seconds)
• Monte Carlo results
  – Mean: 0.0016
  – Standard deviation: 0.00048
  – “Conservative” estimate: 0.0026
• Compare to analytical results
  – Mean: 0.0015
  – Standard deviation: n/a
  – “Conservative” estimate: 0.0029

Summary chart - 100 trials
[Figure: frequency chart. Forecast: P(Irritation), 100 trials, 1 outlier]

Summary - 10,000 trials
• Monte Carlo results
  – Mean: 0.0015
  – Standard deviation: 0.000472
  – “Conservative” estimate: 0.0024
• Compare to analytical results
  – Mean: 0.0015
  – Standard deviation: n/a
  – “Conservative” estimate: 0.0029

Summary chart - 10,000 trials
[Figure: frequency chart. Forecast: P(Irritation), 10,000 trials, 88 outliers]
• About 1.5 minutes run time

Policy applications
• When there are many distributional inputs
• Concern about “excessive conservatism”
  – multiplying 95th percentiles
  – multiple exposures
• Because we can
• Bayesian calculations

Issues: Sensitivity Analysis
• Sensitivity analysis looks at which input distributions have the greatest effect on the eventual distribution
• Helps to understand which parameters can both be influenced by policy and reduce risks
• Helps to understand when better data can be most valuable (information isn’t free…nor even cheap)
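One simple way to rank inputs by their effect on the output is to correlate each input's draws with the resulting risk. The sketch below does this for the skin cream example. It is only an illustration under stated assumptions: the Pearson correlation is one possible sensitivity measure (not necessarily what any particular package reports), and the function names and seed are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def input_sensitivities(n_trials=10_000, seed=1):
    """Correlate each input's draws with risk = exposure x potency."""
    rng = random.Random(seed)
    # Same input distributions as the skin cream example.
    exposures = [rng.gauss(0.02, 0.005) for _ in range(n_trials)]
    potencies = [rng.uniform(0.05, 0.10) for _ in range(n_trials)]
    risks = [e * p for e, p in zip(exposures, potencies)]
    return {
        "exposure": pearson(exposures, risks),
        "potency": pearson(potencies, risks),
    }


if __name__ == "__main__":
    for name, corr in input_sensitivities().items():
        print(f"{name}: correlation with risk = {corr:.2f}")
```

In this example exposure has the larger spread relative to its mean, so it should show the stronger correlation with risk; that is the input where better data would narrow the result most.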
Issues: Correlation
• Two distributions are correlated when a change in one is associated with a change in the other
• Example: People who eat lots of peas may eat less broccoli (or may eat more…)
• Usually doesn’t have much effect unless the correlation is strong (absolute value of the correlation coefficient above 0.75)

Generating Distributions
• Invalid distributions create invalid results, which lead to inappropriate policies
• Two options
  – empirical
  – theoretical

Empirical Distributions
• Most appropriate when developed for the issue at hand
• Example: local fish consumption
  – survey individuals or otherwise estimate
  – data from individuals elsewhere may be very misleading
• A number of very large data sets have been developed and published

Empirical Distributions
• Challenge: when there’s very little data
• Example of two data points
  – uniform distribution?
  – triangular distribution?
  – not a hypothetical issue…it is an ongoing debate in the literature
• Key is to state your assumptions clearly
• Better yet…do it both ways!

Which Distribution?
[Figure: four candidate distribution shapes for potency (probability of irritation per mg chemical), each on the range 0.05 to 0.10]

Random number generation
• Shouldn’t be an issue…@Risk and Crystal Ball are both good to at least 10,000 iterations
• 10,000 iterations is typically enough, even with many input distributions

Theoretical Distributions
• Appropriate when there’s some mechanistic or probabilistic basis
• Example: small sample (say 50 test animals) establishes a binomial distribution
• Lognormal distributions show up often in nature
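The advice to "do it both ways" when only two data points are available can be tried directly on the skin cream potency range. The sketch below runs the same simulation with a uniform and with a triangular potency distribution. Placing the triangular mode at the midpoint (0.075/mg) is an assumption made here for illustration, as are the function names and the seed.

```python
import random
import statistics


def uniform_potency(rng):
    """Potency drawn uniformly between the two test results."""
    return rng.uniform(0.05, 0.10)


def triangular_potency(rng):
    """Potency drawn from a triangular shape; mode 0.075 is an assumption."""
    return rng.triangular(0.05, 0.10, 0.075)


def risk_draws(potency_sampler, n_trials=10_000, seed=1):
    """Monte Carlo draws of risk = exposure x potency for a given sampler."""
    rng = random.Random(seed)
    return [
        rng.gauss(0.02, 0.005) * potency_sampler(rng)
        for _ in range(n_trials)
    ]


def compare_potency_shapes(n_trials=10_000, seed=1):
    """Summarize the risk distribution under each potency assumption."""
    results = {}
    samplers = (("uniform", uniform_potency), ("triangular", triangular_potency))
    for name, sampler in samplers:
        draws = risk_draws(sampler, n_trials, seed)
        results[name] = {
            "mean": statistics.mean(draws),
            "sd": statistics.stdev(draws),
            # Last of 19 cut points from n=20 is the 95th percentile.
            "p95": statistics.quantiles(draws, n=20)[-1],
        }
    return results


if __name__ == "__main__":
    for name, stats in compare_potency_shapes().items():
        print(name, {k: round(v, 5) for k, v in stats.items()})
```

Both shapes share the same mean potency, so the mean risk should barely move; the triangular shape concentrates potency near the middle of the range, so the spread of the risk distribution should be narrower. Reporting both makes the effect of the assumption visible.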
Some Caveats
• Beware believing that you’ve really “understood” uncertainty
• Beware: misapplication
  – ignorance at best
  – fraudulent at worst…porcine hoof blister

Example (after Finkel): Alar “versus” aflatoxin
• Exposure has two elements
  – aflatoxin: peanut butter consumption, aflatoxin residue
  – Alar: juice consumption, Alar/UDMH residue
• Potency has one element
  – aflatoxin potency
  – UDMH potency
• Risk = (consumption × residue × potency) / body weight

Inputs for Alar & aflatoxin
(The last column is the percentile location of the mean. Residue units are read as µg/g, consistent with the factor of 1000 in the point estimates.)

Variable                     Units       Mean     5th %ile   95th %ile   %ile of mean
Peanut butter consumption    g/day       11.38    2.00       31.86       66
Apple juice consumption      g/day       136.84   16.02      430.02      69
Aflatoxin residue            µg/g        2.82     1.00       6.50        61
UDMH residue                 µg/g        13.75    0.5        42.00       67
Aflatoxin potency            kg-day/mg   17.5     4.02       28.23       61
UDMH potency                 kg-day/mg   0.49     0.00       0.85        43

Alar and aflatoxin point estimates
• Aflatoxin estimates:
  – Mean = (11.38 g/day × 2.82 µg/g × 17.5 kg-day/mg) / (20 kg × 1000 µg/mg) = 0.028
  – Conservative = 0.29
• Alar (UDMH) estimates:
  – Mean = 0.046
  – Conservative = 0.77

Alar and aflatoxin Monte Carlo
• 10,000 runs
• Generate distributions
  – (don’t allow 0)
• Don’t expect correlation

Aflatoxin and Alar Monte Carlo results (point values)

                Aflatoxin                 Alar
                Mean     Conservative     Mean     Conservative
Analytical      0.028    0.29             0.046    0.77
Monte Carlo     0.028    0.095            0.046    0.18

Aflatoxin and Alar Monte Carlo results (distributions)
[Figure: frequency chart. Forecast: peanut butter risk, 10,000 trials, 192 outliers. Certainty is 98.05% from -Infinity to 0.1495]
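The analytical point estimates for the two risks can be reproduced with a short calculation. The sketch below is a check on the arithmetic only, under two assumptions: residue is in µg/g (suggested by the factor of 1000 in the aflatoxin calculation) and body weight is the 20 kg used there. The dictionary layout and function names are hypothetical.

```python
BODY_WEIGHT_KG = 20.0   # body weight used in the aflatoxin point estimate
UG_PER_MG = 1000.0      # assumed unit conversion: residue is read as ug/g

# (mean, 95th percentile) pairs taken from the inputs table.
INPUTS = {
    "aflatoxin": {
        "consumption_g_per_day": (11.38, 31.86),   # peanut butter
        "residue_ug_per_g": (2.82, 6.50),
        "potency_kg_day_per_mg": (17.5, 28.23),
    },
    "alar": {
        "consumption_g_per_day": (136.84, 430.02),  # apple juice
        "residue_ug_per_g": (13.75, 42.00),
        "potency_kg_day_per_mg": (0.49, 0.85),
    },
}


def point_risk(consumption, residue, potency):
    """Risk = (consumption x residue x potency) / body weight."""
    dose_mg_per_kg_day = consumption * residue / UG_PER_MG / BODY_WEIGHT_KG
    return dose_mg_per_kg_day * potency


def point_estimates(chemical):
    """Return (mean estimate, conservative estimate) for one chemical.

    The conservative estimate multiplies the 95th percentiles of all inputs.
    """
    values = INPUTS[chemical]
    names = ("consumption_g_per_day", "residue_ug_per_g", "potency_kg_day_per_mg")
    mean_est = point_risk(*(values[name][0] for name in names))
    conservative_est = point_risk(*(values[name][1] for name in names))
    return mean_est, conservative_est


if __name__ == "__main__":
    for chemical in INPUTS:
        mean_est, conservative_est = point_estimates(chemical)
        print(f"{chemical}: mean={mean_est:.3f} conservative={conservative_est:.2f}")
```

Multiplying three 95th percentiles is what produces the large gap between the conservative analytical values and the Monte Carlo 95th percentiles reported in the results table.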
Aflatoxin and Alar Monte Carlo results (distributions)
[Figure: frequency chart. Forecast: apple juice risk, 10,000 trials, 125 outliers. Certainty is 93.93% from -Infinity to 0.15]
[Figure: cumulative chart. Forecast: peanut butter risk, 10,000 trials, 192 outliers. Certainty is 98.04% from -Infinity to 0.1495]
[Figure: cumulative chart. Forecast: apple juice risk, 10,000 trials, 125 outliers. Certainty is 93.93% from -Infinity to 0.15]
[Figure: overlay chart. Frequency distribution comparison of peanut butter risk and apple juice risk]
[Figure: overlay chart. Cumulative distribution comparison of peanut butter risk and apple juice risk]

References and Further Reading
• Burmaster, D.E. and Anderson, P.D. (1994). “Principles of good practice for the use of Monte Carlo techniques in human health and ecological risk assessments.” Risk Analysis 14(4):447-81.
• Finkel, A. (1995). “Towards less misleading comparisons of uncertain risks: the example of aflatoxin and Alar.” Environmental Health Perspectives 103(4):376-85.
• Kammen, D.M. and Hassenzahl, D.M. (1999). Should We Risk It? Exploring Environmental, Health and Technological Problem Solving. Princeton University Press, Princeton, NJ.
• Thompson, K.M., Burmaster, D.E., et al. (1992). “Monte Carlo techniques for uncertainty analysis in public health risk assessments.” Risk Analysis 12(1):53-63.
• Vose, D. (1997). “Monte Carlo Risk Analysis Modeling.” In Molak, Ed., Fundamentals of Risk Analysis and Risk Management.