Download overhead - 08 Stochastic Variables

Materials for Lecture 08
• Chapters 4 and 5
• Chapter 16 Sections 3.2-3.7.3
• Lecture 08 Bernoulli.xlsx
• Lecture 08 Normality Test.xls
• Lecture 08 Simulation Model with Simetar.xlsx
• Lecture 08 Normal.xls
• Lecture 08 Simulate a Reg Model.xls
Stochastic Simulation
• The purpose of simulation is to estimate the unknown
probability distribution for a key output variable (KOV) so decision makers can
make a better decision
– Simulate because we cannot observe and measure the KOV
distribution directly
– Want to test alternative values for control variables
• Sample PDFs for random variables, calculate values of
KOV for many iterations
• Record KOV
• Analyze KOV distribution
[Figure: model diagram. Manageable Variables (Mi), Exog Var (Zi), and the Stochastic Variables X̃1 and X̃2 feed the Model:
Ỹ1 = f(M, Z, X̃1)
Ỹ2 = f(M, X̃2, Y1)
Ỹ3 = f(Ỹ1, Ỹ2)
Outputs are Y1, Y2, Y3 and the distribution P(Y3).]
Stochastic Variables
• Any variable the decision maker cannot
control is thought to be stochastic
• In agriculture we think of yield as
stochastic as it is subject to weather
• For most businesses the prices of inputs
and outputs are not directly controlled by
management so they are stochastic.
– Production may be random as well.
• Include the most important stochastic
variables in simulation models
– Your model cannot include all random variables
Stochastic Simulation
• In economics we use simulation because we cannot
experiment on live subjects, a business, or the
economy without injury
• In other fields they can fabricate an experiment
– Health scientists feed/treat multiple rats with different
chemicals
– Animal scientists feed multiple pens of steers, chickens,
cows, etc.
– Engineers run a motor under different controlled
situations (temp, RPMs, lubricants, fuel mixes)
– Vets treat different pens of animals with different meds
– Agronomists set up randomized block treatments for a
particular seed variety
• All of these are just different iterations of “models”
Iterations, How Many are Enough?
[Figure: Simetar simulation engine. Callouts: specify the output variables’ names and location; specify the number of iterations.]
• Change the number of iterations based on the nature of the
problem; 500 is adequate.
– Some studies use 1,000s of iterations because they use a
Monte Carlo sampling procedure, which is less
precise than Latin hypercube
– Simetar uses a Latin hypercube, so 500 is an
adequate sample size
Definitions
• Stochastic Model – means the model has at least one
random variable
• Monte Carlo simulation model – same as a stochastic
model
• Two ways to simulate random values
– Monte Carlo (MC) – draw random values for the variables purely at random
– Latin hypercube (LHC) – draw random values using a systematic approach so
we are certain that we sample ALL regions of the probability distribution
– Monte Carlo sampling requires a larger number of iterations to ensure that
we sampled all regions of the probability distribution
– For a U(0,1) the CDF is a straight line
– The MC sample deviates from the straight line
– The LHC sample lies on the straight line
– This is with 500 iterations
– Simetar default is LHC
[Figure: CDFs of U(0,1) samples drawn with Latin Hypercube and Monte Carlo; vertical axis Prob, both axes run from 0 to 1]
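The difference between the two sampling procedures can be sketched in a few lines of Python. This is only an illustration for a single U(0,1) variable, assuming a simple stratified draw (one value per equal-probability interval); it is not Simetar's actual routine, and the function names and the seed are made up for the example.

```python
import random

def monte_carlo_uniform(n, rng):
    """Draw n U(0,1) values purely at random."""
    return [rng.random() for _ in range(n)]

def latin_hypercube_uniform(n, rng):
    """Draw one value from each of n equal-width intervals of (0,1), then shuffle."""
    draws = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(draws)  # random order, but every interval is still sampled once
    return draws

def max_cdf_gap(sample):
    """Largest vertical gap between the sample CDF and the straight line F(x) = x."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

rng = random.Random(8)          # fixed seed so the run is repeatable
iterations = 500                # same number of iterations as on the slide
mc = monte_carlo_uniform(iterations, rng)
lhc = latin_hypercube_uniform(iterations, rng)

mc_gap = max_cdf_gap(mc)
lhc_gap = max_cdf_gap(lhc)
print("Monte Carlo max gap from straight line:    ", round(mc_gap, 4))
print("Latin hypercube max gap from straight line:", round(lhc_gap, 4))
```

Because the stratified draw places exactly one value in each interval of width 1/500, its sample CDF can never be more than 1/500 away from the straight line, while the purely random draw has no such guarantee.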
Normal Distribution
• Normal distribution – a continuous random variable
that produces a bell shaped distribution with set
probabilities
• Parameters are
– Mean
– Standard Deviation
• Normal distribution reaches to + and - infinity.
– Can produce negative values so be careful
– Can produce extremely high values
• Most of us have memorized several probabilities for
the normal distribution:
– 68% of observations lie within +/- 1 standard deviation of the mean
– 95% of observations lie within +/- 2 standard deviations of the mean
– 50% of observations lie above the mean and 50% lie below it.
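The rule-of-thumb probabilities for the Normal distribution can be checked with Python's standard library. The sketch below uses `statistics.NormalDist`; the mean of 10 and standard deviation of 3 in the last step are illustrative values only.

```python
from statistics import NormalDist

z = NormalDist(mu=0.0, sigma=1.0)       # standard normal distribution

within_1_sd = z.cdf(1) - z.cdf(-1)      # P(-1 < Z < 1)
within_2_sd = z.cdf(2) - z.cdf(-2)      # P(-2 < Z < 2)
above_mean = 1 - z.cdf(0)               # P(Z > 0)

print(f"{within_1_sd:.4f} {within_2_sd:.4f} {above_mean:.4f}")
# prints: 0.6827 0.9545 0.5000

# Chance of a negative draw when mean = 10 and std dev = 3 (illustrative values)
prob_negative = NormalDist(mu=10.0, sigma=3.0).cdf(0.0)
print(prob_negative)
```

The last line shows why negative values need watching: the probability is small but not zero, so a long simulation can still produce an infeasible draw.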
PDF and CDF for a Normal Dist.
[Figure: Probability Density Function f(x) and Cumulative Distribution Function F(x) for a Normal distribution; the horizontal axis runs from - to + infinity and F(x) rises from 0.0 to 1.0]
Simulating Random Variables
• The Normal distribution is used frequently, particularly when
simulating residuals for a regression model
• Parameters for a Normal distribution
– Mean expressed as Ῡ or Ŷ
– Standard Deviation σ (or SEP from a regression model)
• Assume yield is a random variable and we have production
function data, such as:
– Ỹ = a + b1 Fert + b2 Water + ẽ
– Deterministic component is: a + b1 Fert + b2 Water
– Stochastic component is: ẽ
• Stochastic component, ẽ, is assumed to be distributed Normal
– Mean of zero
– Standard deviation of σe
• See Lecture 8 Simulate a Reg Model.XLS
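A minimal Python sketch of simulating the regression model Ỹ = a + b1 Fert + b2 Water + ẽ follows. The coefficients, the standard deviation of the residual, and the input levels are hypothetical numbers chosen for illustration; in practice they would come from the estimated regression.

```python
import random
from statistics import mean, stdev

# Hypothetical regression results (assumptions, not taken from the lecture files)
A, B1, B2 = 50.0, 0.4, 1.5   # intercept and slopes for Fert and Water
SIGMA_E = 8.0                # std dev of the residual (SEP from the regression)

def simulate_yield(fert, water, rng):
    """One stochastic yield: deterministic component plus a Normal residual."""
    deterministic = A + B1 * fert + B2 * water
    e = rng.gauss(0.0, SIGMA_E)          # stochastic component, mean zero
    return deterministic + e

rng = random.Random(8)
fert, water = 100.0, 20.0                # hypothetical input levels
draws = [simulate_yield(fert, water, rng) for _ in range(5000)]

deterministic = A + B1 * fert + B2 * water
print("Deterministic component:", deterministic)
print("Mean of simulated yield:", round(mean(draws), 2))
print("Std dev of simulated yield:", round(stdev(draws), 2))
```

The simulated mean should sit close to the deterministic component and the simulated standard deviation close to σe, which is a quick check that the stochastic component was added correctly.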
Use the Normal Distribution When:
• Use the Normal distribution if you have lots of
observations and have tested for normality
• Watch for infeasible values from a Normal
distribution (negative yields and prices)
[Figure: Normal distribution plotted on a horizontal axis from -10.00 to 25.00]
Problems with the Normal
• It is easy to use, so it is often used when it is not
appropriate
• It does not allow for extreme events (Black
Swans)
– No way to account for record-breaking outliers because
the distribution is defined by Mean and Std Dev.
• Std Dev is the “average” deviation from the mean and
averages out Black Swans
• Market outliers are washed away in the average
• It is the foundation for Six Sigma
– So Six Sigma suffers from all of the problems above
– Creates a false sense of security because it never sees
a record-breaking outlier
Test for Normality
• Simetar provides an easy to use procedure for testing
Normality that includes:
– S-W – Shapiro-Wilk
– A-D – Anderson-Darling
– CvM – Cramér-von Mises
– K-S – Kolmogorov-Smirnov
– Chi-Squared
• Simetar’s Hypothesis Testing Icon provides a tab to “Test for
Normality”
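To show what one of these tests measures, here is a rough Python sketch of the K-S statistic: the largest gap between the sample CDF and a Normal CDF. It is not Simetar's procedure. It assumes the Normal's mean and standard deviation are specified in advance rather than estimated from the same data, and it uses the common large-sample 5% critical value of 1.36/sqrt(n); the two samples are artificial.

```python
import math
import random
from statistics import NormalDist

def ks_statistic(sample, dist):
    """Largest vertical gap between the sample CDF and dist's CDF."""
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - dist.cdf(x), dist.cdf(x) - i / n)
        for i, x in enumerate(xs)
    )

def ks_critical_5pct(n):
    """Approximate large-sample 5% critical value for a fully specified null."""
    return 1.36 / math.sqrt(n)

rng = random.Random(8)
n = 500
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]     # truly Normal data
skewed_sample = [rng.expovariate(1.0) for _ in range(n)]    # skewed, never negative

d_normal = ks_statistic(normal_sample, NormalDist(0.0, 1.0))
d_skewed = ks_statistic(skewed_sample, NormalDist(1.0, 1.0))
crit = ks_critical_5pct(n)

print("Critical value:", round(crit, 4))
print("Normal sample D:", round(d_normal, 4))
print("Skewed sample D:", round(d_skewed, 4), "reject Normality:", d_skewed > crit)
```

A statistic above the critical value means the sample CDF is too far from the Normal CDF to accept Normality at the 5% level. The skewed sample is rejected here because it has no negative values, whereas the Normal it is compared against puts about 16% of its probability below zero.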
Simulating a Normal Distribution
• Normal Distribution
=NORM( Mean, Standard Deviation)
=NORM( 10,3)
=NORM( A1, A2)
• Standard Normal Deviate (SND)
=NORM(0,1) or =NORM()
• SND is the Z-score for a standard normal distribution
allowing you to simulate any Normal distribution
• SND is used as follows:
Ỹ = Mean + Standard Deviation * NORM(0,1)
Ỹ = Mean + Standard Deviation * SND
Ỹ = A1 + (A2 * A3) where a SND is in cell A3
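The SND formula can be mirrored outside Excel. The Python sketch below is an illustration only: `random.gauss(0, 1)` stands in for Simetar's =NORM(0,1), and the helper name `simulate_normal` is made up for the example.

```python
import random
from statistics import mean, stdev

def simulate_normal(mean_value, std_dev, snd):
    """Y = Mean + Standard Deviation * SND."""
    return mean_value + std_dev * snd

rng = random.Random(8)
snds = [rng.gauss(0.0, 1.0) for _ in range(10000)]      # standard normal deviates
draws = [simulate_normal(10.0, 3.0, z) for z in snds]   # same as =NORM(10, 3)

print("Mean of draws:", round(mean(draws), 2))
print("Std dev of draws:", round(stdev(draws), 2))
```

Because the SND is generated separately from the mean and standard deviation, the same list of SNDs can be reused to simulate a Normal with any other parameters.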
Truncated Normal Distribution
• General formula for the Truncated Normal
=TNORM( Mean, Std Dev, [Min], [Max], [USD] )
The values in [ ] are optional
• Truncated Downside only
=TNORM( 10, 3, 5)
• Truncated Upside only
=TNORM( 10, 3, , 15)
• Truncated Both ends
=TNORM( 10, 3, 5, 15)
• Truncated both ends with a USD in general form
=TNORM( 10, 3, 5, 15, [USD])
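One way to draw from a truncated Normal is the inverse transform method: squeeze a uniform draw into the part of the CDF that lies between Min and Max, then invert the CDF. The Python sketch below assumes that approach and treats the USD as a uniform draw on (0,1); how Simetar computes =TNORM internally is not stated in these notes, and the function `tnorm` here is a stand-in with the same argument order.

```python
import random
from statistics import NormalDist

def tnorm(mean, std_dev, minimum=None, maximum=None, usd=None, rng=random):
    """Truncated Normal draw by inverse transform; minimum, maximum, usd are optional."""
    dist = NormalDist(mean, std_dev)
    lo = 0.0 if minimum is None else dist.cdf(minimum)   # probability below Min
    hi = 1.0 if maximum is None else dist.cdf(maximum)   # probability below Max
    u = rng.random() if usd is None else usd             # uniform draw on [0, 1)
    p = lo + u * (hi - lo)                               # rescale into [lo, hi)
    p = min(max(p, 1e-12), 1.0 - 1e-12)                  # keep inv_cdf away from 0 and 1
    return dist.inv_cdf(p)

rng = random.Random(8)
downside = [tnorm(10, 3, 5, rng=rng) for _ in range(1000)]            # like =TNORM(10, 3, 5)
upside = [tnorm(10, 3, maximum=15, rng=rng) for _ in range(1000)]     # like =TNORM(10, 3, , 15)
both = [tnorm(10, 3, 5, 15, rng=rng) for _ in range(1000)]            # like =TNORM(10, 3, 5, 15)
midpoint = tnorm(10, 3, 5, 15, usd=0.5)                               # supplying the USD directly

print(min(downside), max(upside), min(both), max(both), midpoint)
```

Up to floating-point rounding, every draw stays inside the stated bounds, which is the reason to use a truncated Normal when negative yields or prices would be infeasible.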
Example Model of Net Returns for a Business Model
- Stochastic Variables -- Yield and Price
- Management Variables -- Acreage and Costs (fixed and variable)
- KOV -- Net Returns
- Write out the equations and exogenous values
Equations and their order
Ỹ = Ῡ + σY * SND1
P̃ = P̄ + σP * SND2
Rec = Ỹ * P̃ * Ac
Cost = (Ac * 150) + (0.25 * Ỹ * Ac) + 10
NR = Rec - Cost
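The net returns model can be sketched in Python as follows. The equations are the ones listed above; the means, standard deviations, and acreage are hypothetical numbers picked for illustration, and `random.gauss(0, 1)` stands in for the two SNDs.

```python
import random
from statistics import mean

# Hypothetical exogenous values (assumptions for illustration only)
MEAN_YIELD, SD_YIELD = 100.0, 15.0
MEAN_PRICE, SD_PRICE = 4.0, 0.5
ACRES = 10.0

def net_returns(yield_per_acre, price, acres):
    """NR = Rec - Cost, using the equations in the order listed."""
    receipts = yield_per_acre * price * acres
    cost = (acres * 150) + (0.25 * yield_per_acre * acres) + 10
    return receipts - cost

def simulate(iterations, rng):
    """Simulate the KOV (net returns) for the given number of iterations."""
    results = []
    for _ in range(iterations):
        y = MEAN_YIELD + SD_YIELD * rng.gauss(0.0, 1.0)   # stochastic yield (SND1)
        p = MEAN_PRICE + SD_PRICE * rng.gauss(0.0, 1.0)   # stochastic price (SND2)
        results.append(net_returns(y, p, ACRES))
    return results

rng = random.Random(8)
nr = simulate(500, rng)
prob_loss = sum(1 for x in nr if x < 0) / len(nr)

print("NR at the means:", net_returns(MEAN_YIELD, MEAN_PRICE, ACRES))
print("Mean simulated NR:", round(mean(nr), 2))
print("Min and max NR:", round(min(nr), 2), round(max(nr), 2))
print("Estimated P(NR < 0):", prob_loss)
```

The list `nr` is the recorded KOV; summary statistics or a CDF of that list give the estimate of the net returns distribution that the decision maker needs.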
Program a Simulation Model in Excel/Simetar -- Input Data Section of the Worksheet
- See Lecture 08 Simulation Model with Simetar.XLS
Program Model in Excel/Simetar -- Generate Random
Variables and Simulate Profit
Bernoulli Distribution
[Figure: PDF for Bernoulli B(0.75), with probability .25 at X = 0 and .75 at X = 1; CDF for Bernoulli B(0.75), with .25 at X = 0 and 1 at X = 1]
PDF and CDF for a Bernoulli Distribution.
• Parameter is ‘p’ or the probability that the
random variable is 1 or TRUE
• Simulate Bernoulli in Simetar as
= Bernoulli(p)
= Bernoulli(0.25)
Lecture 8 Bernoulli.XLSX examples follow
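A Bernoulli draw can be sketched in Python by comparing a uniform random number with p. This is an illustration of the idea, not Simetar's code; the function name and the optional `usd` argument are assumptions made for the example.

```python
import random

def bernoulli(p, usd=None, rng=random):
    """Return 1 (TRUE) with probability p, otherwise 0."""
    u = rng.random() if usd is None else usd   # uniform draw on [0, 1)
    return 1 if u < p else 0

rng = random.Random(8)
draws = [bernoulli(0.25, rng=rng) for _ in range(10000)]   # like =Bernoulli(0.25)
share_of_ones = sum(draws) / len(draws)

print("Share of 1s:", share_of_ones)
```

Over many iterations the share of 1s should settle near p, so the output above should be close to 0.25.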
Bernoulli Distribution Application