13 Developing Simulation Models

Materials for Lecture 13
• Chapter 2 pages 6-12, Chapter 6,
Chapter 16 Section 3.1 and 4
• Lecture 13 Probability of Revenue.xlsx
• Lecture 13 Flow Chart.xlsx
• Lecture 13 Farm Simulator.xlsx
• Lecture 13 Uniform.xlsx
• Lecture 13 Theta UPES.xlsx
• Lecture 13 View Distributions.xlsx
What is a Simulation Model?
• A Model is a mathematical representation of any
system of equations
– When you think through the many steps to solve a
problem you are constructing a model
– When you think or plan your way through a complex
situation you are making a virtual model
– Computer games are models
– Econometric equations can be part of a model
• We build models so we do not have to
experiment on the actual economic system
– Will the business be successful if we change
management practices, etc.?
Outline for the Lecture
• Organization of a model in an Excel Workbook
• Steps for model development
• Parts in a simulation model
• Generating random variables from uniform distributions
• Estimating parameters for other distributions
– Parameters are the numbers that define the center
and the dispersion about the center of the random
variable
– For a Normally distributed random variable, the
parameters are the Mean & Std Dev
– For Empirical ….
Organization of Models in Excel
Input Data:
• Costs, inflation & interest rates
• Production functions
• Assets & liabilities
• Scenarios to analyze, etc.
Historical Data for Stochastic Variables:
• Prices
• Production levels
• Other variables not controlled by management
Equations to calculate variables:
• Production, Receipts, Costs, Amortize Loans, Update Asset values, etc.
Tables to report financial results:
• Income statement, cash flow, balance sheet, financial ratios
KOV Table:
• List all output variables of interest
Model Outputs:
• Statistics for KOVs
• Probability charts
• Decision summary
• Final report tables
Organization of Models in Excel
• Sheet 1 (Model)
– Assumptions and all Input Data
– Control variables for managing the system
– Logical flow of all calculations
– Table of intermediate results
– Pro Forma financial tables of results
– Key Output Variables (KOVs) Table to send to SimData
• Sheet 2 (Stoch)
– Historical data for all random variables
– Calculations to estimate the parameters for random
variables
– Simulate all random values to be mapped to the Model
• Sheets 3-N (SimData, Stoplite, SERF, STODOM, etc)
– Simulation results and charts
Model Design Steps
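[Figure: model pyramid. Layers from top to bottom: KOVs; Intermediate Results; Tables and Reports; Equations and Calculations to Get Values for Reports; Stochastic Variables; Exogenous and Control Variables. "Design" runs from the top down; "Build" runs from the bottom up.]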
• Model development is like building a
pyramid
– Design the model from the top down
– Build from the bottom up
Steps for Model Development
• Determine the purpose of the model and KOVs
• Draw a sketch of how data will interact to
calculate the KOVs
• Determine the variables necessary to calculate
the KOVs
– For example to calculate Net Present Value (NPV)
we need:
• Annual net cash withdrawals which are a function of net
returns
• Ending net worth which is a function of assets and liabilities
– This means you need a balance sheet and a cash flow
statement to calculate annual cash reserves
– An annual income statement is needed as input into a cash
flow
– Annual net returns are calculated from an income statement
Flow Chart for Simulating NPV
Inputs:
• Control variables for the manager, such as levels of production, debt levels, market share
• Macro data such as inflation rates and interest rates
• Annual projected mean prices
• Budgets for each of the enterprises
• Stochastic variables -- need the historical data to estimate parameters for random variables
Sections and Equations for the Model:
• Generate the stochastic values
– Use projected means and historical data for random variables
• Use the stochastic values in the equations for the model
Equations for the System to Model:
Production = f( scale of the farm and stochastic values)
Price = f( stochastic values)
Revenue = Price * Production for each enterprise
Variable Costs by Enterprise = Production * Unit Cost
Costs = Variable Costs for each enterprise + Fixed Costs
Net Returns = Revenue - Costs
Balance Sheet Information:
• Asset Valuation
• Liabilities
• Net Worth
Key Output Variables:
• Net Present Value
• Probability of Net Returns > 0
• Probability of NPV > 0 (or Prob of Success)
• Probability of Increasing Real Net Worth
Analyze KOVs
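The equations in the flow chart can be sketched in a few lines of Python. This is only an illustration for a single enterprise: the function name, the use of normal draws for yield and price, and all numbers are assumptions made here, not part of the lecture workbooks.

```python
import random

def simulate_net_returns(acres, mean_yield, yield_sd, mean_price, price_sd,
                         unit_cost, fixed_cost, rng):
    """One iteration of the flow-chart equations for a single enterprise."""
    # Stochastic values: normal draws are an assumption made for this sketch.
    yield_per_acre = max(0.0, rng.gauss(mean_yield, yield_sd))
    price = max(0.0, rng.gauss(mean_price, price_sd))
    production = acres * yield_per_acre        # Production = f(scale, stochastic values)
    revenue = price * production               # Revenue = Price * Production
    variable_costs = production * unit_cost    # Variable Costs = Production * Unit Cost
    costs = variable_costs + fixed_cost        # Costs = Variable Costs + Fixed Costs
    return revenue - costs                     # Net Returns = Revenue - Costs

rng = random.Random(13)
# Hypothetical farm: 1,000 acres, 40 bu./acre, $5/bu., $2/bu. cost, $100,000 fixed.
draws = [simulate_net_returns(1000, 40.0, 8.0, 5.0, 1.0, 2.0, 100_000.0, rng)
         for _ in range(500)]
prob_positive = sum(nr > 0 for nr in draws) / len(draws)
print(f"P(Net Returns > 0) = {prob_positive:.3f}")
```

Repeating the calculation for 500 iterations is what lets a KOV such as "Probability of Net Returns > 0" be read off as a simple share of the draws.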
Steps for Model Development
• Write out the equations by hand
– This organizes your thoughts and the model’s structure
– Avoids problem of forgetting important sections
– Example of equations to simulate receipts at this point:
• Output/hour = a stochastic variable
• Hours Operated = management control value (scenario)
• Production = Output/hour * Hours Operated
• Price = forecast mean each year with a risk component
• Receipts = Price * Production
• Define input variables
– Exogenous variables are outside the control of management and are deterministic; they are usually policy driven
– Stochastic variables are outside the control of management and are random in nature, such as weather, market prices, and interest rates
– Control variables are ones the manager can manipulate; they are usually used for sensitivity and/or scenario analyses
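The receipts equations written out by hand above translate directly into code. The sketch below is a minimal illustration in Python; the function name, the normal risk components, and the scenario values are hypothetical.

```python
import random

def simulate_receipts(hours_operated, mean_output_per_hour, output_sd,
                      forecast_price, price_sd, rng):
    """Receipts for one iteration; hours_operated is the control variable."""
    # Output/hour is stochastic (a normal draw is assumed for this sketch).
    output_per_hour = rng.gauss(mean_output_per_hour, output_sd)
    production = output_per_hour * hours_operated
    # Price = forecast mean plus a risk component.
    price = forecast_price + rng.gauss(0.0, price_sd)
    return price * production

rng = random.Random(1)
# Scenario analysis: change only the control variable and compare mean receipts.
for hours in (1500, 2000):
    receipts = [simulate_receipts(hours, 10.0, 1.5, 3.0, 0.4, rng) for _ in range(500)]
    print(hours, round(sum(receipts) / len(receipts), 2))
```

Keeping the control variable as a function argument makes each scenario a single changed input, which mirrors how scenarios are set up in the Model sheet.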
Steps for Model Development
• Stochastic variables (most time is spent here)
– Identify key random variables that affect the system
– Estimate parameters for the assumed distributions
• Normality – means and standard deviations
• Empirical – sorted deviates and probabilities
• Other distributions should be tested
– Use the best possible econometric model to forecast the deterministic part of stochastic variables; this reduces risk
• Model validation starts here
– Use statistical tests of the simulated stochastic variables to ensure that random variables are simulated correctly
• Correlation tests, means tests, variance tests
• CDF and PDF charts to compare history to simulated values
• Statistical tests are the key to validating the model
Stochastic Variables?
• What are Stochastic Variables?
– Random variables we can not control, such as:
• Prices, yields, interest rates, rates of inflation, sickness, etc.
– Represented by the residuals from regression equations
as this is the part of a variable we did not predict
• Why include stochastic variables?
– To get a more robust simulation answer
– Draw random values from a PDF rather than a single or
deterministic value
– The result is that we can assign probabilities to KOVs
– We can incorporate risk in our decisions of selecting
between scenarios
Simple Economic Model
• Supply and Demand Model
– You learned there is one Demand and one Supply
– But there are many, due to the risk on the equations
Q_x = a + b_1 P_x + b_2 Y + b_3 P_y gives a single line for Demand
Q_x = a + b_1 P_x + b_2 Y + b_3 P_y + ẽ gives infinite Demands
– After harvest Supply is a constant, so we get an infinite
number of Prices as we draw ẽ values at random
• Demand is stochastic so we can have an infinite number of Demand functions passing through the QD distribution
[Figure: Supply and Demand curves; vertical axis Price/U, horizontal axis Quantity/UT.]
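To see how a fixed supply and a stochastic demand produce a distribution of prices, the demand equation can be solved for the price at each random draw of ẽ. The Python sketch below uses hypothetical coefficients and a normal ẽ, neither of which comes from the lecture.

```python
import random

def clearing_price(supply, a, b1, b2, b3, income, other_price, e):
    """Solve Qx = a + b1*Px + b2*Y + b3*Py + e for Px when Qx equals a fixed supply."""
    return (supply - a - b2 * income - b3 * other_price - e) / b1

rng = random.Random(42)
# Hypothetical coefficients; b1 < 0 so demand slopes downward.
a, b1, b2, b3 = 120.0, -8.0, 0.5, 2.0
income, other_price, supply = 40.0, 5.0, 100.0

# Each draw of e shifts the demand line and gives a different market price.
prices = [clearing_price(supply, a, b1, b2, b3, income, other_price, rng.gauss(0.0, 4.0))
          for _ in range(500)]
print(round(min(prices), 2), round(max(prices), 2))
```

With ẽ set to zero these numbers give a single price of 6.25; the 500 draws spread the price around that value.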
The Basic Business Model
• Profit is generally our Key Output Variable of interest
π = Total Receipts – Variable Cost – Fixed Cost
π = ∑(P̃i * Ỹi) – ∑(VCi * Ỹi * Qi) – FC
Where P̃i is the stochastic price for product i, as $/bu.
Ỹi is the stochastic production level, as yield or bu./acre
VCi is the variable cost per unit of production for i, or $/bu.
Qi is the level of resources committed to i, as acres
FC is the fixed cost
Univariate Random Variables
• More than 50 Univariate Distributions in Simetar
– Uniform Distribution
– Normal and Truncated Normal Distribution
– Empirical, Discrete Empirical Distribution
– GRKS Distribution
– Triangle Distribution
– Bernoulli Distribution
– Conditional Distribution
• Excel probability distributions have been made
Simetar compatible, e.g.,
– Beta, Gamma, Exponential, Log Normal, Weibull
– See Chapter 16 in Sections 3.1 and 4
Uniform Distribution
• A continuous distribution where each range of equal length has an equal probability of being observed
– For a Uniform(0,1), there is a 20% chance of seeing a value between 0 and 0.2, and the same chance between 0.8 and 1.0
• Parameters for the uniform are the minimum and maximum values, and the domain includes all real numbers between them
=UNIFORM(minimum, maximum)
• The mean and variance of this distribution are:
$$\mu = \frac{\min + \max}{2}$$
$$\sigma^2 = \frac{(\max - \min)^2}{12}$$
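The mean and variance formulas for the uniform distribution can be checked against simulated draws. The short Python sketch below uses arbitrary limits of 10 and 20; it illustrates the formulas and is not Simetar's implementation.

```python
import random
import statistics

lo, hi = 10.0, 20.0          # arbitrary minimum and maximum for the check
rng = random.Random(1)
draws = [rng.uniform(lo, hi) for _ in range(100_000)]

theory_mean = (lo + hi) / 2          # (min + max) / 2
theory_var = (hi - lo) ** 2 / 12     # (max - min)^2 / 12

sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)
print(theory_mean, round(sample_mean, 3))
print(round(theory_var, 3), round(sample_var, 3))
```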
PDF and CDF for a Uniform Dist.
[Figure: Probability Density Function, f(x) against X between min and max; Cumulative Distribution Function, F(x) rising from 0.0 to 1.0 against X between min and max.]
When to Use the Uniform Distribution
• Use the uniform distribution when every range of length
“n” between the minimum and maximum values has an
equal chance of occurrence
• Use this distribution when you have no idea what type of
distribution to use
• Uniform distribution is used to simulate all random
variables via the Inverse Transform procedure and USD
[Figure: Inverse Transform for Generating a SND from a USD. An example of how a USD is used to simulate a Standard Normal Distribution. Vertical axis: Uniform Deviate (USDi) from 0 to 1.0; horizontal axis: Std. Normal Dev. (SNDi) from -3 to +3.]
Uniform Standard Deviate (USD)
• In Simetar we simulate the USD as:
=UNIFORM(0,1) or =UNIFORM()
– Produces a Uniform Standard Deviate (USD) 0 to 1
– Special case of the Uniform distribution
• The USD is the building block for all random number generation using the Inverse Transform method for simulation. The Inverse Transform uses a USD to simulate a Uniform distribution as:
X = Min + (Max-Min) * Uniform(0,1)
X = Min + (Max-Min) * USD
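The formula X = Min + (Max-Min) * USD is easy to reproduce outside Simetar. The Python sketch below is a minimal version; the function name is hypothetical, and Python's own random generator stands in for =UNIFORM(0,1).

```python
import random

def uniform_from_usd(minimum, maximum, usd):
    """Inverse transform for the uniform: rescale a 0-1 deviate to [minimum, maximum]."""
    return minimum + (maximum - minimum) * usd

rng = random.Random(5)
usds = [rng.random() for _ in range(500)]             # 500 uniform standard deviates
xs = [uniform_from_usd(10.0, 20.0, u) for u in usds]  # rescaled to 10-20
print(round(min(xs), 3), round(max(xs), 3))
```

Passing the USD in as an argument, rather than drawing it inside the function, matches the optional [USD] argument in the Simetar formulas.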
Simulate a Uniform Distribution
• Alternative ways to program the
Uniform( ) distribution function
= Uniform(Min, Max,[USD])
= Uniform(10,20) Not recommended method
= Uniform(A1,A2) This is the preferred method
= Uniform(A1,A2,A3) where a USD is calculated
in cell A3
Uses for a Uniform Standard Deviate
• USD can be used in all random number
formulas in Simetar to facilitate correlating
random variables
• For example in Simetar we can add USDs:
=NORM(mean, std dev, [USD1])
=TRIANGLE(min, middle, max, [USD2])
= EMP( Si, F(Si), [USD3])
=EMP(values , , [USD4])
NOTE: every variable has its own unique USD
• Note the [ ] means that USD is optional
Generating Random Numbers
• Generate a Uniform Standard Deviate (USD)
=UNIFORM(0,1)
Simetar defaults to simulate 500 values (can be changed to 1,000s)
These are called iterations or draws
Iterations are separate, uncorrelated draws of random variables
[Figure: two charts for USD = UNIFORM(0,1) over the range 0 to 1: a chart of Prob by interval and the CDF for Uniform(0,1).]
• Equal chance of observing a number in each of the intervals; both charts are for the same output
USD Output in SimData
• Simetar saves the 500 samples in
SimData and calculates summary statistics
Simetar Simulation Results for 500 Iterations. 9:36:20 AM 2/17/2013 (1 sec.). © 2011.
Variable    Sheet1!B7
Mean        0.499985
StDev       0.288988
CV          57.79939
Min         0.000895
Max         0.999165
Iteration   USD
1           0.512793
2           0.307316
3           0.581277
4           0.787495
5           0.94209
6           0.735971
7           0.048923
8           0.23733
Inverse Transform
• Use the 500 USDs to simulate random values for your variable Y
• This involves translating the USDs from a
0 to 1 scale to the scale for your random
variable
• This is done using the Inverse Transform
method shown on the next slide.
• NOTE: you must have a separate USD for
every random variable Y
Inverse Transform
• The 500 USDs are converted from the 0 to 1 scale on the Y axis to the scale of the random variable by direct interpolation
• Each random USD is associated with a unique “random” Y value to get 500 Ỹs
[Figure: CDF of a Random Variable. Vertical axis: USD or F(x) from 0.00 to 1.00; horizontal axis: the random variable from 55.00 to 75.00.]
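The direct interpolation on the CDF can be sketched as follows in Python. The CDF points below are hypothetical values chosen to span the 55 to 75 range of the chart, and the function is an illustration rather than the routine Simetar uses.

```python
import random
from bisect import bisect_right

def inverse_transform(usd, values, probs):
    """Map a USD to the variable's scale by linear interpolation on its CDF.

    values are sorted ascending; probs are the matching F(x), strictly increasing.
    """
    if usd <= probs[0]:
        return values[0]
    if usd >= probs[-1]:
        return values[-1]
    j = bisect_right(probs, usd)          # first CDF point above the USD
    x0, x1 = values[j - 1], values[j]
    p0, p1 = probs[j - 1], probs[j]
    return x0 + (x1 - x0) * (usd - p0) / (p1 - p0)

# Hypothetical CDF points for the random variable.
values = [55.0, 60.0, 65.0, 70.0, 75.0]
probs = [0.0, 0.1, 0.5, 0.9, 1.0]

rng = random.Random(9)
y_tilde = [inverse_transform(rng.random(), values, probs) for _ in range(500)]
print(round(min(y_tilde), 2), round(max(y_tilde), 2))
```

Each USD is read across from the vertical axis to the CDF and then down to the horizontal axis, which is what the interpolation line computes.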
Inverse Transform
• Results of 500 iterations for Y using Inverse Transform
• USDs and their resulting Ỹs
Simetar Simulation Results for 500 Iteratio
Variable    Sheet1!G33   Sheet1!G34
Mean        0.499985     65.19666
StDev       0.288988     3.136123
CV          57.79939     4.810251
Min         0.000895     56.38011
Max         0.999165     74.43161
Iteration   USD          Y-Tilda
1           0.512793     65.22607
2           0.307316     63.61534
3           0.581277     65.7939
4           0.787495     67.72464
5           0.94209      70.20308
6           0.735971     67.17892
7           0.048923     60.03664
8           0.23733      62.91843
9           0.955568     70.68873
10          0.634662     66.23654
Simulate the Normal Distribution
• Parameters for a Normal Distribution
– Mean or Ŷ from OLS
– Std Dev or σ of residuals
• Simulated using the formula for a Normal
Ỹ = Ŷ + σ * SND
Where the SND is a “standard normal deviate”
We generate 500 SNDs and thus simulate
(calculate) 500 random Y’s
Simulate the Standard Normal Deviate
(SND)
• SND is a random value between -∞ and +∞
• SND has a mean of zero
• SND has a standard deviation of one
• SND is simulated by =NORM(0,1)
• SNDs are the “number of standard deviations from the mean”, or the number of σ’s Ỹ is from the Ŷ or Ῡ
[Figure: Inverse Transform for Generating a SND from a USD. Vertical axis: Uniform Deviate (USDi) from 0 to 1.0; horizontal axis: Std. Normal Dev. (SNDi) from -3 to +3.]
Simulate Normal Distribution
• Next apply the random SNDs to the
Normal distribution formula
Ỹ = Ŷ + σ * SND
In Simetar all of these steps are done for
you: = NORM(Ŷ, σ)
or
= NORM(Ŷ, σ, USD)
• Remember where to get Ŷ and σ ?
– In forecasting we estimated
Ŷ = a + b1X1 + b2X2
σ = Standard Deviation of residuals
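The three steps (USD, then SND, then Ỹ = Ŷ + σ * SND) can be written out in Python. This sketch uses `statistics.NormalDist().inv_cdf` from the standard library for the USD-to-SND step; the Ŷ and σ values are hypothetical, and the code is not Simetar's =NORM() implementation.

```python
import random
from statistics import NormalDist

def simulate_normal(y_hat, sigma, usd):
    """Inverse transform: USD -> SND -> Y-tilde = Y-hat + sigma * SND."""
    snd = NormalDist().inv_cdf(usd)   # defined only for 0 < usd < 1
    return y_hat + sigma * snd

rng = random.Random(2013)
y_hat, sigma = 65.0, 4.0              # hypothetical forecast and residual std dev
# Keep each USD strictly inside (0, 1) so inv_cdf never receives 0 or 1.
usds = [min(max(rng.random(), 1e-12), 1 - 1e-12) for _ in range(500)]
y_tilde = [simulate_normal(y_hat, sigma, u) for u in usds]
print(round(sum(y_tilde) / len(y_tilde), 3))
```

As a check on the method, a USD of 0.512793 converts to an SND of about 0.0321, which agrees with the first iteration reported in the Simetar output for this lecture.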
Normal Distribution: Simetar
Code and Output
• The USD is used to calculate the SND
• The SND is used to simulate Ỹ
• Simetar gives same result in one step
Simetar Simulation Results for 500 Iterations. 7:56:32
Variable    Sheet1!B47   Sheet1!B48   Sheet1!B49   Sheet1!B50
Mean        0.499985     -0.00015     65.48175     65.48175
StDev       0.288988     1.001471     3.946465     3.946465
CV          57.79939     -650265      6.026817     6.026817
Min         0.000895     -3.12303     53.1755      53.1755
Max         0.999165     3.143506     77.86988     77.86988
Iteration   USD          SND          Y Tilda      Simetar
1           0.512793     0.032072     65.60874     65.60874
2           0.307316     -0.50347     63.49834     63.49834
3           0.581277     0.20516      66.29082     66.29082
4           0.787495     0.797758     68.62605     68.62605
5           0.94209      1.572561     71.6793      71.6793
6           0.735971     0.630975     67.96882     67.96882
7           0.048923     -1.65539     58.95901     58.95901
Steps for Simulating Random Variables
• Must assume a probability distribution (shape)
– Normal, Beta, Empirical, etc.
• Estimate parameters required to define and
simulate the assumed distribution
• Here are the parameters for selected distributions
– Normal ( Mean, Std Deviation )
– Beta ( Alpha, Beta, Min, Max )
– Uniform ( Min, Max )
– Empirical ( Si, F(Si) )
• Often we assume several distribution forms, estimate their parameters, simulate them, and pick the one that best fits the data
Steps for Parameter Estimation
• Step 1: Check for the presence of a trend, cycle or
structural pattern
– If present remove it & work with the residuals (ẽt)
– If no trend or structural pattern, use actual data (X’s)
• Step 2: Estimate parameters for several assumed
distributions using the X’s or the residuals (ẽt)
• Step 3: Simulate the different distributions
• Step 4: Pick the best match based on
– Mean, Standard Deviation -- use validation tests
– Minimum and Maximum
– Shape of the CDF vs. historical series
– Penalty function =CDFDEV() to quantify differences
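Steps 1 and 2 can be illustrated with a short Python sketch: fit a linear trend by OLS, take the residuals, and estimate the parameters of two candidate distributions from them. The historical series is made up for the example, and a real application would test whether the trend is statistically significant before removing it.

```python
import statistics

def linear_trend(y):
    """OLS intercept and slope for y regressed on t = 1, 2, ..., n."""
    t = list(range(1, len(y) + 1))
    t_bar, y_bar = statistics.fmean(t), statistics.fmean(y)
    slope = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
             / sum((ti - t_bar) ** 2 for ti in t))
    return y_bar - slope * t_bar, slope

# Hypothetical historical series with an upward trend.
history = [52.0, 55.5, 54.0, 58.5, 61.0, 59.5, 64.0, 66.5, 65.0, 69.5]

# Step 1: remove the trend and keep the residuals.
intercept, slope = linear_trend(history)
residuals = [yi - (intercept + slope * ti) for ti, yi in enumerate(history, start=1)]

# Step 2: estimate parameters for two assumed distributions of the residuals.
normal_params = (statistics.fmean(residuals), statistics.stdev(residuals))
uniform_params = (min(residuals), max(residuals))
print(round(slope, 3), normal_params, uniform_params)
```

The residual mean is zero by construction, so the normal distribution's only free parameter here is the standard deviation of the residuals.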
Parameter Estimator in Simetar
• Use Theta Icon in Simetar
– Estimate parameters for 16 parametric distributions
– Select MLE method of parameter estimation
– Provides equations for simulating distributions
Parameter Estimator in Simetar
• Results from the Theta icon: estimated parameters for 16 distributions
– Selected MLE in this example
– Provides equations for simulating distributions based on a common USD
Parameter Estimation, 2/28/2016 2:35:06 PM
Maximum Likelihood Estimates (MLEs)
Distribution         Parameter                   Random Variable
Beta                 α ; α>0, A≤x≤B              1.194356
                     β ; β>0                     1.323942
Double Exponential   μ ; -∞<μ<∞, -∞<x<∞          2.27
                     σ ; σ>0                     0.28625
Exponential          α ; -∞<α<∞, ≤x<∞            1.5
                     β ; β>0                     0.74375
Gamma                α ; α>0, 0≤x<∞              36.56802
                     β ; β>0                     0.061358
Inverse Gaussian     μ ; μ>0, 0≤x<∞              2.24375
                     σ ; σ>0                     0.112584
Logistic             μ ; -∞<μ<∞, -∞<x<∞          2.242846
                     σ ; σ>0                     0.208322
Log-Log              μ ; -∞<μ<∞, -∞<x<∞          2.061776
                     σ ; σ>0                     0.35179
Log-Logistic         μ ; -∞<μ<∞, 0≤x<∞           10.58852
                     σ ; σ>0                     2.235521
Lognormal            μ ; -∞<μ<∞, 0≤x<∞           0.794413
                     σ ; σ>0                     0.16745
Normal               μ ; -∞<μ<∞, -∞<x<∞          2.24375
                     σ ; σ>0                     0.366774
Pareto               α ; α>0, α≤x≤∞              1.5
                     β ; β>0                     2.571038
Uniform              A ; A<B, A≤x≤B              1.5
                     B ; B>A                     3.05
Weibull              α ; α>0, 0≤x<∞              6.554942
                     β ; β>0                     310.5307
Binomial             n ; x=0,1,2,...,n           4
                     p ; 0≤p≤1                   0.560938
Geometric            p ; x=1,2,...; 0≤p≤1        0.308285
Poisson              λ ; 0≤λ<∞, x=0,1,...
Negative Binomial    s ; x=1,2,...;              0
                     p ; 0≤p≤1                   2.24375
Which is the Best Distribution?
• Use Simetar function =CDFDEV(History, SimData)
– Perfect fit has a CDFDEV value of Zero
– Pick the distribution with the lowest CDFDEV
Distributions        CDFDEV   Formula
Beta                 0.02     =CDFDEV(Sheet1!$A$2:$A$21,SimData!B9:B108)
Double Exponential   0.37     =CDFDEV(Sheet1!$A$2:$A$21,SimData!C9:C108)
Exponential          2.66     =CDFDEV(Sheet1!$A$2:$A$21,SimData!D9:D108)
Gamma                0.07     =CDFDEV(Sheet1!$A$2:$A$21,SimData!E9:E108)
Logistic             0.16     =CDFDEV(Sheet1!$A$2:$A$21,SimData!F9:F108)
Log-Log              0.43     =CDFDEV(Sheet1!$A$2:$A$21,SimData!G9:G108)
Log-Logistic         0.34     =CDFDEV(Sheet1!$A$2:$A$21,SimData!H9:H108)
Lognormal            0.10     =CDFDEV(Sheet1!$A$2:$A$21,SimData!I9:I108)
Normal               0.05     =CDFDEV(Sheet1!$A$2:$A$21,SimData!J9:J108)
Pareto               79.04    =CDFDEV(Sheet1!$A$2:$A$21,SimData!K9:K108)
Uniform              0.03     =CDFDEV(Sheet1!$A$2:$A$21,SimData!L9:L108)
Weibull              0.08     =CDFDEV(Sheet1!$A$2:$A$21,SimData!M9:M108)
Binomial             1.04     =CDFDEV(Sheet1!$A$2:$A$21,SimData!N9:N108)
Geometric            33.81    =CDFDEV(Sheet1!$A$2:$A$21,SimData!O9:O108)
Poisson              4.19     =CDFDEV(Sheet1!$A$2:$A$21,SimData!P9:P108)
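The idea of scoring each candidate against history and picking the lowest score can be sketched in Python. The distance used below (sum of squared gaps between the two empirical CDFs at the historical points) is a stand-in chosen for illustration; it is not claimed to be the formula behind Simetar's =CDFDEV(). The data and parameter values are hypothetical.

```python
import random
from bisect import bisect_right

def ecdf(sample):
    """Return the empirical CDF of a sample as a function of x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

def cdf_distance(history, simulated):
    """Sum of squared gaps between the two empirical CDFs at the historical points."""
    f_hist, f_sim = ecdf(history), ecdf(simulated)
    return sum((f_hist(x) - f_sim(x)) ** 2 for x in history)

rng = random.Random(7)
history = [1.5, 1.8, 2.0, 2.1, 2.2, 2.3, 2.4, 2.6, 2.8, 3.05]   # hypothetical data
candidates = {
    "Normal": [rng.gauss(2.24, 0.37) for _ in range(100)],
    "Uniform": [rng.uniform(1.5, 3.05) for _ in range(100)],
    "Exponential": [1.5 + rng.expovariate(1 / 0.74) for _ in range(100)],
}
scores = {name: cdf_distance(history, sim) for name, sim in candidates.items()}
best = min(scores, key=scores.get)   # lowest score = closest match to history
print(scores, best)
```

As with CDFDEV, a perfect match scores zero, so the candidates are ranked from lowest to highest.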
Use the “View Distributions.xlsx”
• For a random variable with 10 observations, you can estimate the parameters and view the shape of the distribution