09 Univariate Probability Distributions

Welcome Back From Spring Break
• Brief Review
– Forecasting for 3 weeks
– Simulation
• Motivation for building simulation models
• Steps for developing simulation models
• Stochastic variables and why they are included in models
• What a financial simulation model is used for
• Parametric Distributions (N, U, Bernoulli)
Test Results
Mean 80.4, Std Dev 13.1, Range 45-100
Materials for Lecture 9
• Chapter 6
• Chapter 16 Sections 3.2 - 3.7.3, 4.0
• Lecture 10 Demo Distributions.xlsx
• Lecture 10 Empirical Distributions.xlsx
Non-Parametric and Parametric Distributions
• Non-Parametric Distributions – do not have a fixed,
parameter-dependent form
– Discrete Uniform
– Empirical
– GRKS
– Triangle
• Parametric Distributions (covered last lecture)
– Fixed form, shape dependent on parameters
– Uniform, Normal, Beta, Gamma
Discrete (Uniform) Empirical
• Discrete Empirical distribution used where only
fixed values can occur
– Each value has an equal probability of being drawn
– No interpolation between observed values
• Examples of Discrete Empirical distributions
– Number of laborers who show up to work
– Number of steers on a cattle truck
– Simulating a fair die: 1, 2, 3, 4, 5, 6
– Letter grades: A, B, C, D, F
Discrete (Uniform) Empirical Distribution
[Figure: PDF for DE(3, 4, 6, 7) and CDF for DE(3, 4, 6, 7), each plotted against X at the values 3, 4, 6, 7; the CDF axis is marked 0, .25, .5, .75, 1]
PDF and CDF for a Discrete Uniform Distribution.
- Parameters for a DE(x1, x2, x3, …, xn) are based on history
- Discrete Empirical means that each observed value, Xi, has
an equal probability of being observed
Row   A    B    C
1     10
2     12
3     20
4     15
5     13
=DEMPIRICAL(A1:A5)
Discrete Uniform Empirical
• Simulate this type of random variable two
ways in Simetar
– Discrete empirical with equal probabilities
=DEMPIRICAL(A1:A5)
=RANDSORT(A1:A5)
Discrete Empirical -- Alphanumeric
• =RANDSORT(I1:I5)
• Random shuffle of names: highlight 5 cells,
type =RANDSORT(I1:I5), then
press and hold Ctrl+Shift and press Enter
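The two discrete cases above (an equal-probability draw and a random shuffle) can be sketched outside the spreadsheet. This is only an illustration in Python: the helper names `discrete_empirical` and `random_shuffle` are hypothetical, and nothing here is meant to describe how Simetar implements =DEMPIRICAL() or =RANDSORT().

```python
import random

def discrete_empirical(values, rng):
    """Draw one observed value; each is equally likely, with no interpolation."""
    return rng.choice(values)

def random_shuffle(values, rng):
    """Return the values in random order, leaving the input list unchanged."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled

rng = random.Random(42)            # fixed seed only so the sketch is repeatable
history = [10, 12, 20, 15, 13]     # the A1:A5 values from the example table

# 1,000 equal-probability draws from the observed values
draws = [discrete_empirical(history, rng) for _ in range(1000)]

# Alphanumeric case: shuffle letter grades
grades = random_shuffle(["A", "B", "C", "D", "F"], rng)
```

Every draw is one of the five observed values, and the shuffle returns each label exactly once.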
Empirical Distribution
• An empirical distribution is defined entirely by the observations
in the data; no distributional form is assumed
• Parameters to simulate an empirical distribution
– Forecasted values: means (Ῡ) or forecasts (Ŷ)
– Calculate percentage deviation from the mean or forecast = (Yi – Ŷi) / Ŷi
– Sort the deviations from the mean or forecast from low to high
– Assign a cumulative probability to each sorted deviate (usually
assume equal probability for each data point).
• Cumulative probabilities go from 0.0 to 1.0; named F(x)
– Assume the distribution is continuous, so interpolate between the
observed points
• Use the Inverse Transform formula to simulate the
distribution
• This requires simulation of a USD (uniform standard deviate) for use in interpolation
• Use the Emp icon to estimate parameters
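The parameter steps listed above can be sketched in Python. The yield values are hypothetical, the function name `emp_parameters` is made up for illustration, and the rule used for F(x) (equal increments with the smallest deviate at 0.0 and the largest at 1.0) is an assumption; the Emp icon may assign probabilities differently.

```python
import numpy as np

def emp_parameters(observed, forecast):
    """Return sorted fractional deviates and their cumulative probabilities."""
    observed = np.asarray(observed, dtype=float)
    # A scalar mean or a per-observation forecast both work here.
    forecast = np.broadcast_to(np.asarray(forecast, dtype=float), observed.shape)
    deviates = (observed - forecast) / forecast      # (Yi - Yhat_i) / Yhat_i
    sorted_dev = np.sort(deviates)                   # low to high
    # Assumption: equal probability steps running from 0.0 to 1.0
    cum_prob = np.linspace(0.0, 1.0, len(sorted_dev))
    return sorted_dev, cum_prob

# Hypothetical history of 10 yields with a mean of 50
yields = [42.0, 55.0, 48.0, 61.0, 50.0, 39.0, 58.0, 45.0, 53.0, 49.0]
sorted_dev, cum_prob = emp_parameters(yields, np.mean(yields))
```

The two returned arrays play the role of the Si and F(Si) columns used later in the lecture.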
PDF and CDF for an Empirical Dist.
[Figure: two panels. Probability Density Function: f(x) plotted against X from min to max. Cumulative Distribution Function: F(x) from 0.0 to 1.0 plotted against X from min to max.]
We interpolate the dark black line in the CDF based on the discrete CDF and
use it as the approximation for a continuous distribution using the Inverse
Transform method
Using the Empirical Distribution
• Empirical distribution should be used if
– Random variable is continuous over its range,
– You have < 20 observations for the variable, and/or
– You cannot easily estimate parameters for the true PDF
• Simulate crop yields as an Empirical distribution when
you have fewer than 20 historical values
– Assume we have 10 observed yields:
• Yield can be any positive value, not discrete values
• We don’t have enough observations to test for
normality
• We know the 10 random values were observed with
a probability of 1/10, or one observation each year
– So F(x) goes from 0.0 to 1.0 in equal increments
Simulating Empirical Distributions
• Empirical distribution is usually simulated as percent
deviations from mean or trend:
percent deviates from mean = (Yt – Ῡt )/Ῡt
• Parameters are:
– Mean of the data is either Ῡt or Ŷt
– Sorted deviations from mean or forecasted Ŷ are
St = Sort [(Yt – Ῡt )/Ῡt ]
or
St = Sort [(Yt – Ŷt)/ Ŷt ]
– Probabilities for the St’s are called F(St) or F(x) values and
MUST range from 0.0 to 1.0
• Use the parameters to simulate random variable Ỹ:
Ỹ = Ῡt * (1 + EMP(St, F(St), [USD]) )
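A minimal Python sketch of the last formula, assuming EMP() behaves as a linear interpolation of the sorted deviates at a uniform draw. The function `emp`, the deviates, and the mean of 50 are illustrative assumptions, not Simetar's implementation or data from the lecture.

```python
import numpy as np

def emp(sorted_dev, cum_prob, usd):
    """Inverse transform: read the interpolated CDF at the uniform draws."""
    return np.interp(usd, cum_prob, sorted_dev)

# Hypothetical parameters: sorted fractional deviates St and their F(St)
sorted_dev = np.array([-0.22, -0.16, -0.10, -0.04, -0.02,
                        0.00,  0.06,  0.10,  0.16,  0.22])
cum_prob = np.linspace(0.0, 1.0, len(sorted_dev))   # must run from 0.0 to 1.0
mean_y = 50.0                                       # deterministic component

rng = np.random.default_rng(seed=1)
usd = rng.uniform(0.0, 1.0, size=5000)              # uniform standard deviates

# Y~ = Ybar * (1 + EMP(St, F(St), USD))
y_tilde = mean_y * (1.0 + emp(sorted_dev, cum_prob, usd))
```

Because the draws are interpolated between observed deviates, the simulated values stay between `mean_y * (1 + sorted_dev[0])` and `mean_y * (1 + sorted_dev[-1])`.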
Empirical Distribution -- No Trend
• Given a random variable, Ỹ, with 11 observations
• Develop the parameters if simulating the variable using the mean to forecast
the deterministic component:
• Parameter for deterministic component is
the mean or the second column
• Calculate the stochastic component or ê as:
êi = Yi – Ῡ
• Convert the residual to fractional deviation
of forecast mean value: Devi = êi / Ῡ
• Sort the Devi values from low to high (Si)
and assign the probabilities of Si or F(Si)
• Simulate Ỹ in two steps:
Stoch Devi = EMP(Sort Dev, Prob Dev, USD)
Stoch ỸT+i = ῩT+i * (1 + Stoch Devi)
• Recall: Devi = (Yi – Ῡ) / Ῡ; rearrange terms
(Ῡ * Devi) = Yi – Ῡ
so
Yi = Ῡ + (Ῡ * Devi)
Empirical Dist. -- With Trend
Parameters for EMP() if deterministic component is the trend forecast
• Calculate the stochastic component or ê as:
êi = Yi – Ŷi
• Convert residual to fractional deviate of
forecast value: Devi = êi / Ŷi
• Sort the Devi values from low to high (Si)
and calculate the probabilities of Si or F(Si)
• Simulate Ỹ as follows:
Stoch Devi = EMP(Si, F(Si), USD )
ỸT+i = ŶT+i * (1 + Stoch Devi)
• Derived from: Stoch Devi = (Yi - Ŷi) / Ŷi
or
Yi – Ŷi = (Ŷi * Stoch Devi)
or
Y Stochi = Ŷi + (Ŷi * Stoch Devi)
• ỸT+i could have been developed from a
structural or time series equation; in that case the êi
are the residuals from the regression
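The with-trend steps can be sketched the same way. Here the trend forecast comes from an ordinary least squares line fitted with `numpy.polyfit`; the ten observations are hypothetical, and the equal-increment F(Si) and the use of linear interpolation for EMP() are assumptions made for illustration.

```python
import numpy as np

# Hypothetical history: 10 years of an upward-trending variable
years = np.arange(1, 11)
observed = np.array([40.0, 44.0, 41.0, 47.0, 50.0, 46.0, 53.0, 55.0, 52.0, 58.0])

# Deterministic component: OLS linear trend (polyfit returns slope first)
slope, intercept = np.polyfit(years, observed, deg=1)
trend = intercept + slope * years

residuals = observed - trend            # e_i = Yi - Yhat_i
frac_dev = residuals / trend            # Dev_i = e_i / Yhat_i
sorted_dev = np.sort(frac_dev)          # Si, low to high
cum_prob = np.linspace(0.0, 1.0, len(sorted_dev))   # assumed F(Si)

# Forecast for year T+1 and 1,000 stochastic draws around it
forecast_next = intercept + slope * 11
rng = np.random.default_rng(seed=7)
usd = rng.uniform(0.0, 1.0, size=1000)
stoch_dev = np.interp(usd, cum_prob, sorted_dev)    # stand-in for EMP(Si, F(Si), USD)
y_tilde_next = forecast_next * (1.0 + stoch_dev)
```

The same sorted deviates can be reused for T+2, T+3, and so on by swapping in the trend forecast for each year.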
Simulate Emp Distribution with Simetar
• Let: Si be in B1:B10 and F(Si) in A1:A10
• If Si are expressed as actual values
=EMP(Si) or =EMP(B1:B10)
• If Si are expressed as residuals from the mean or OLS
= Ῡ + EMP(B1:B10, A1:A10)
• If Si are expressed as fractional deviates from
mean or trend: Si = (ê / Ŷ)
= Ŷ * (1 + EMP(B1:B10, A1:A10))
Memorize these formulas. They are important.
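As a quick check on how the three forms relate, the sketch below simulates the same hypothetical data all three ways with the mean as the forecast. It assumes EMP() is a linear interpolation and passes the probabilities explicitly in every case; the data and the `emp` helper are illustrative only.

```python
import numpy as np

def emp(sorted_values, cum_prob, usd):
    """Stand-in for EMP(): interpolate sorted values at the uniform draws."""
    return np.interp(usd, cum_prob, sorted_values)

observed = np.array([42.0, 55.0, 48.0, 61.0, 50.0, 39.0, 58.0, 45.0, 53.0, 49.0])
mean_y = observed.mean()
cum_prob = np.linspace(0.0, 1.0, len(observed))     # assumed F(Si)
usd = np.random.default_rng(seed=11).uniform(size=1000)

# Form 1: Si are the actual values
from_actual = emp(np.sort(observed), cum_prob, usd)

# Form 2: Si are residuals from the mean, added back to the mean
from_residuals = mean_y + emp(np.sort(observed - mean_y), cum_prob, usd)

# Form 3: Si are fractional deviates from the mean, scaled by the mean
from_frac_dev = mean_y * (1.0 + emp(np.sort((observed - mean_y) / mean_y),
                                    cum_prob, usd))
```

With a constant mean the three forms give the same draws for the same USD values. The fractional-deviate form differs only when the forecast mean changes, which is the case the next slides discuss.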
Simulating an Emp Distribution
• Advantages of Emp Distribution
– It lets the data define the shape of the distribution
– Does not force an assumed distribution shape on the
variable
– The larger the number of observations in the sample,
the more closely Emp will approximate the true distribution
• Disadvantages of Emp Distribution
– It has finite min and max values
– It does not adhere to known probabilities and
parameters
– Parameters can be difficult to estimate without Simetar
Simulating an Emp Distribution
• Advantages of specifying the Si’s as
fractional deviates of forecasted values
– Guarantees the “relative risk” for a random
variable is the same as the historical period
• Coefficient of variation for the sample data is
constant over time CVt = (σ / Ῡt) * 100
– Allows you to use any mean (Ŷ or Ῡ) for the
simulated planning horizon and it will have the
same CV as the historical period
• Historical Ῡ can be 100 and the mean for the
forecast period Ŷ can be 150 and the Ỹ values will
have the same CV as the historical data.
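The constant-CV claim can be checked numerically. In this sketch the deviates are hypothetical and EMP() is again assumed to be a linear interpolation; the same stochastic deviates are applied to a mean of 100 and a mean of 150.

```python
import numpy as np

# Hypothetical sorted fractional deviates and assumed equal-step probabilities
sorted_dev = np.array([-0.22, -0.16, -0.10, -0.04, -0.02,
                        0.00,  0.06,  0.10,  0.16,  0.22])
cum_prob = np.linspace(0.0, 1.0, len(sorted_dev))

rng = np.random.default_rng(seed=3)
usd = rng.uniform(0.0, 1.0, size=10000)
stoch_dev = np.interp(usd, cum_prob, sorted_dev)

def cv(values):
    """Coefficient of variation in percent: (sigma / mean) * 100."""
    return values.std() / values.mean() * 100.0

y_hist_mean = 100.0 * (1.0 + stoch_dev)   # simulated around the historical mean
y_fcst_mean = 150.0 * (1.0 + stoch_dev)   # simulated around the forecast mean

cv_hist = cv(y_hist_mean)
cv_fcst = cv(y_fcst_mean)
```

Scaling the mean by 1.5 scales the standard deviation by 1.5 as well, so `cv_hist` and `cv_fcst` agree.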
Inverse Transform for Simulating an Empirical Distribution
[Figure: CDF with F(x) from 0.0 to 1.0 on the vertical axis and the sorted values Y1 through Y7 on the horizontal axis. Start with a random USD, U(0,1) = 0.45, and interpolate the Ỹ axis using the USD value; the stochastic Ỹi is derived by linear interpolation.]
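The interpolation step itself can be written out by hand. The sketch below uses seven hypothetical sorted values with equal-step probabilities (an assumption) and the USD of 0.45 from the figure; `inverse_transform` is an illustrative name.

```python
from bisect import bisect_right

def inverse_transform(sorted_y, cum_prob, u):
    """Linearly interpolate the sorted values at cumulative probability u."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    # Upper bracketing point, clamped so u == 1.0 falls in the last segment
    hi = min(bisect_right(cum_prob, u), len(cum_prob) - 1)
    lo = hi - 1
    # Fraction of the way from F(lo) to F(hi)
    weight = (u - cum_prob[lo]) / (cum_prob[hi] - cum_prob[lo])
    return sorted_y[lo] + weight * (sorted_y[hi] - sorted_y[lo])

# Hypothetical Y1..Y7, sorted low to high, with F(x) = 0, 1/6, ..., 1
sorted_y = [30.0, 34.0, 41.0, 45.0, 52.0, 60.0, 71.0]
cum_prob = [i / 6 for i in range(7)]

# U = 0.45 lies between F(Y3) = 2/6 and F(Y4) = 3/6, 70% of the way across
y_stochastic = inverse_transform(sorted_y, cum_prob, 0.45)   # about 43.8
```

A USD of 0.0 returns Y1 and a USD of 1.0 returns Y7, which is why the simulated distribution has finite min and max values.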