Selecting Input Probability Distributions
Introduction
• Part of modeling—what input probability distributions to use as
input to simulation for:
– Interarrival times
– Service/machining times
– Demand/batch sizes
– Machine up/down times
• Inappropriate input distribution(s) can lead to incorrect output, bad
decisions
• Given observed data on input quantities, we can use them in
different ways
Data Usage
• Trace-driven: use actual data values to drive the simulation
– Pros: valid vis-à-vis the real world; direct
– Cons: not generalizable
• Empirical distribution: use data values to define a “connect-the-dots” distribution (several specific ways)
– Pros: fairly valid; simple; fairly direct
– Cons: may limit range of generated variates (depending on form)
• Fitted “standard” distribution: use data to fit a classical distribution (exponential, uniform, Poisson, etc.)
– Pros: generalizable—fills in “holes” in the data
– Cons: may not be valid; may be difficult
Parameterization of Distributions - 1
• There are alternative ways to parameterize most distributions
• Typically, parameters can be classified as one of:
– Location parameter γ (also called shift parameter): specifies
an abscissa (x axis) location point of a distribution’s range of
values, often some kind of midpoint of the distribution
• Example: μ for normal distribution
• As γ changes, distribution just shifts left or right without
changing its spread or shape
• If X has location parameter 0, then X + γ has location
parameter γ
Parameterization of Distributions - 2
– Scale parameter β: determines scale, or units of measurement,
or spread, of a distribution
• Example: σ for normal distribution, β for exponential
distribution
• As β changes, the distribution is compressed or expanded
without changing its shape
• If X has scale parameter 1, then βX has scale parameter β
Parameterization of Distributions - 3
– Shape parameter α: determines, separately from location and
scale, the basic form or shape of a distribution
• Examples: the normal and exponential distributions have no
shape parameter; α for the gamma and Weibull distributions
• May have more than one shape parameter (Beta distribution
has two shape parameters)
• Change in shape parameter(s) alters distribution’s shape
more fundamentally than changes in scale or location
parameters
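The location and scale relationships above (if X has location 0, then X + γ has location γ; if X has scale 1, then βX has scale β) can be checked by simulation. A minimal sketch in Python, with arbitrary illustrative values γ = 5 and β = 2 (my choices, not from the slides):

```python
import random

random.seed(1)

# Draw from a distribution with location 0 and scale 1 (here N(0, 1)).
xs = [random.gauss(0, 1) for _ in range(100_000)]

gamma, beta = 5.0, 2.0             # illustrative location and scale values
shifted = [x + gamma for x in xs]  # X + gamma: location becomes gamma
scaled = [beta * x for x in xs]    # beta * X: scale becomes beta

def mean(v):
    return sum(v) / len(v)

def std(v):
    m = mean(v)
    return (sum((x - m) ** 2 for x in v) / len(v)) ** 0.5

shifted_mean = mean(shifted)  # close to gamma = 5; spread unchanged
scaled_std = std(scaled)      # close to beta = 2; shape unchanged
```

The shift leaves the standard deviation near 1, while the scaling leaves the mean near 0, matching the claim that each parameter acts separately.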
Continuous and Discrete Distributions
• Compendium of 13 continuous and 6 discrete distributions given
in the textbook with details on
– Possible applications
– Density and distribution functions (where applicable)
– Parameter definitions and ranges
– Range of possible values
– Mean, variance, mode
– Maximum-likelihood estimator formula or method
– General comments, including relationships to other distributions
– Plots of densities
Summary Measures from Moments
• Mean and variance
– Coefficient of variation is a measure of variability relative to the mean: CV(X) = σ_X/μ_X.
• Higher moments also give useful information
– Skewness coefficient gives information about the shape:
E[(X − μ)³] / σ³
– Kurtosis coefficient gives information about the tail weight (likelihood of extreme values):
E[(X − μ)⁴] / σ⁴
Example
Find the mean, variance, coefficient of variation, median, and skewness coefficient of

f_X(x) = 3x² if 0 ≤ x ≤ 1, f_X(x) = 0 otherwise

μ_X = E[X] = ∫₀¹ x · 3x² dx = 3/4

σ_X² = Var[X] = E[X²] − μ_X² = 3/5 − (3/4)² = 3/80

CV = σ_X/μ_X = 1/√15 ≈ 0.26

Median m̃_X: ∫₀^m̃ 3x² dx = m̃³ = 1/2, so m̃_X = 2^(−1/3) ≈ 0.79

m̃_X > μ_X ⇒ likely to be left skewed (mass concentrated on the right)

Skewness = E[(X − μ)³]/σ³ = ∫₀¹ (x − 3/4)³ · 3x² dx / σ³ = −(2/3)√(5/3) ≈ −0.86

Negative skewness ⇒ left skewed
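The moments in this example can be verified numerically. A sketch using midpoint-rule integration (the grid size is an arbitrary choice):

```python
# Check the moments of f(x) = 3x^2 on [0, 1] by midpoint-rule integration.
N = 100_000
dx = 1.0 / N

def integrate(g):
    # midpoint rule on [0, 1]
    return sum(g((i + 0.5) * dx) for i in range(N)) * dx

f = lambda x: 3 * x ** 2
mean = integrate(lambda x: x * f(x))                  # 3/4
var = integrate(lambda x: x ** 2 * f(x)) - mean ** 2  # 3/5 - 9/16 = 3/80
cv = var ** 0.5 / mean                                # 1/sqrt(15), about 0.26
skew = integrate(lambda x: (x - mean) ** 3 * f(x)) / var ** 1.5  # about -0.86
median = 0.5 ** (1 / 3)                               # solves m^3 = 1/2, about 0.79
```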
Exponential Expo(β)
[Plot: Expo(1) density function]
Exponential: Properties
• Coefficient of variation is a measure of variability relative to the mean: CV(X) = σ_X/μ_X.
• Its coefficient of variation is 1 (unless it is shifted).
• The density function is monotonically decreasing (at an exponential rate).
• Times of events: most likely to be small, but can be large with small probability.
• Skewness E[(X − μ)³]/(σ²)^(3/2) = 2; kurtosis (tail weight) E[(X − μ)⁴]/σ⁴ = 9.
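These exponential summary measures (CV = 1, skewness = 2, kurtosis = 9) can be checked numerically for Expo(1); a sketch where the integration range [0, 40] truncates a negligible tail:

```python
import math

N = 400_000
b = 40.0      # upper integration limit; the tail mass beyond 40 is negligible
dx = b / N

def integrate(g):
    # midpoint rule on [0, b]
    return sum(g((i + 0.5) * dx) for i in range(N)) * dx

pdf = lambda x: math.exp(-x)  # Expo(1) density
mean = integrate(lambda x: x * pdf(x))                             # 1
var = integrate(lambda x: (x - mean) ** 2 * pdf(x))                # 1
cv = var ** 0.5 / mean                                             # CV = 1
skew = integrate(lambda x: (x - mean) ** 3 * pdf(x)) / var ** 1.5  # 2
kurt = integrate(lambda x: (x - mean) ** 4 * pdf(x)) / var ** 2    # 9
```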
Poisson(λ)
[Plot: probability mass function. Bimodal: two modes]
Poisson: Properties
• Counts the number of events over time.
• If arrivals occur according to a Poisson process with rate λ, times between arrivals are exponential with mean 1/λ.
• Its coefficient of variation is 1/√λ.
• Events are generated by a large potential population where each customer chooses to arrive in a given small interval with a very small probability.
• Examples: number of outbreaks of war over time, number of goals scored in World Cup games.
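The CV = 1/√λ property follows from the Poisson's mean and variance both equaling λ, and can be checked directly from the pmf (λ = 4 here is an arbitrary illustrative choice):

```python
import math

lam = 4.0
pmf = lambda k: math.exp(-lam) * lam ** k / math.factorial(k)

# Truncate the infinite support; the mass beyond k = 100 is negligible for lam = 4.
ks = range(100)
mean = sum(k * pmf(k) for k in ks)               # lambda = 4
var = sum((k - mean) ** 2 * pmf(k) for k in ks)  # lambda = 4
cv = var ** 0.5 / mean                           # 1/sqrt(lambda) = 0.5
```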
Normal Distribution: Properties
• Supported by the Central Limit Theorem: the random variable is a sum of several small random variables (e.g., total consumer demand).
• It is symmetric (skewness = 0, mean = median).
• Kurtosis = 3.
• It is usually not appropriate for modeling times between events (it can take negative values).
[Plot: standard normal density function]
Gamma Distribution: Properties

f_X(x) = β^(−α) x^(α−1) e^(−x/β) / Γ(α) if x > 0, and 0 otherwise,
where Γ(z) = ∫₀^∞ t^(z−1) e^(−t) dt

• Shape parameter α > 0, scale parameter β > 0
• Special cases: α = 1 corresponds to Expo(β); for integer α, the sum of α independent Expo(β) random variables.
• In general, skewness is positive.
• The CV is less than one if the shape parameter α > 1.
[Plots: density functions for scale = 1, shape = 2 and scale = 1, shape = 20]
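The CV claim above follows from the gamma's moments: the mean is αβ and the variance is αβ², so the CV is 1/√α regardless of the scale. A tiny sketch (the helper name is mine):

```python
import math

def gamma_cv(alpha):
    # mean = alpha * beta, var = alpha * beta**2  =>  CV = 1/sqrt(alpha)
    return 1.0 / math.sqrt(alpha)

# CV = 1 in the exponential special case (alpha = 1), below 1 once alpha > 1
cvs = {a: gamma_cv(a) for a in (0.5, 1, 2, 20)}
```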
Weibull Distribution: Properties

f_X(x) = αβ^(−α) x^(α−1) e^(−(x/β)^α) if x > 0, and 0 otherwise

• Shape parameter α > 0, scale parameter β > 0
• Very versatile
[Plots: density functions for scale = 1, shape = 1.5 and scale = 1, shape = 10]
Lognormal Distribution: Properties

f_X(x) = (1 / (x√(2πσ²))) exp(−(ln x − μ)² / (2σ²)) if x > 0, and 0 otherwise

• Y = ln(X) is Normal(μ, σ²).
• Models the product of several independent random factors (X = X₁X₂…Xₙ).
• Very versatile: like the gamma and Weibull, but can have a spike near zero.
[Plots: density functions for scale = 1, shape = 0.5 and scale = 2, shape = 0.1]
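Since Y = ln(X) is normal, lognormal variates can be generated by exponentiating normal ones. A sketch with illustrative parameters μ = 0 and σ = 0.5 (my choices, not from the slides):

```python
import math
import random

random.seed(42)
mu, sigma = 0.0, 0.5  # parameters of the underlying normal (illustrative)

# X = exp(Y) with Y ~ Normal(mu, sigma^2) is lognormal
xs = [math.exp(random.gauss(mu, sigma)) for _ in range(50_000)]

all_positive = all(x > 0 for x in xs)              # lognormal support is x > 0
log_mean = sum(math.log(x) for x in xs) / len(xs)  # close to mu
```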
Empirical Distributions
• There may be no standard distribution that fits the
data adequately: use the observed data themselves to
specify an empirical distribution directly
• There are many different ways to specify empirical
distributions, resulting in different distributions with
different properties.
Continuous Empirical Distributions
• If original individual data points are available (i.e., data are not
grouped)
– Sort data X1, X2, ..., Xn into increasing order: X(i) is ith smallest
– Define F(X(i)) = (i – 1)/(n – 1), approximately (for large n) the
proportion of the data less than X(i), and interpolate linearly
between observed data points:
0

x  X (i )
 i 1
F ( x)  

n

1
(n  1)( X (i 1)  X (i ) )


1
if x  X (1)
if X (i )  x  X (i 1) for i  1,2,..., n  1
if X ( n )  x
21
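The piecewise-linear definition translates directly to code; a minimal sketch (the helper name empirical_cdf is mine), evaluated on the sample 3, 5, 6, 7, 9, 12 used in the slides:

```python
def empirical_cdf(data):
    """Continuous empirical CDF: F(X(i)) = (i - 1)/(n - 1) at the i-th
    smallest point (0-based i below), linear in between."""
    xs = sorted(data)
    n = len(xs)

    def F(x):
        if x < xs[0]:
            return 0.0
        if x >= xs[-1]:
            return 1.0
        for i in range(n - 1):  # find X(i) <= x < X(i+1)
            if xs[i] <= x < xs[i + 1]:
                return i / (n - 1) + (x - xs[i]) / ((n - 1) * (xs[i + 1] - xs[i]))

    return F

F = empirical_cdf([3, 5, 6, 7, 9, 12])
```

F rises most steeply where the observations are dense, and F(3) = 0, F(5) = 1/5, ..., F(12) = 1 as on the slide.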
Continuous Empirical Distributions
Rises most steeply over
regions where
observations are dense,
as desired.
Sample: 3,5,6,7,9,12
F(3)=0, F(5)=1/5, F(6)=2/5, F(7)=3/5, F(9)=4/5, F(12)=1,
Continuous Empirical Distributions
• Potential disadvantages:
– Generated data will be within range of observed data
– Expected value of this distribution is not the sample mean
• There are other ways to define continuous empirical distributions,
including putting an exponential tail on the right to make the range
infinite on the right
• If only grouped data are available
– Don’t know individual data values, but counts of observations in
adjacent intervals
– Define empirical distribution function G(x) with properties similar
to F(x) above for individual data points
Discrete Empirical Distributions
• If original individual data points are available (i.e., data are not
grouped)
– For each possible value x, define p(x) = proportion of the data
values that are equal to x
• If only grouped data are available
– Define a probability mass function such that the sum of the p(x)’s
for the x’s in an interval is equal to the proportion of the data in
that interval
– Allocation of p(x)’s for x’s in an interval is arbitrary
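For ungrouped data, the definition p(x) = proportion of values equal to x is a one-liner over the observations; a sketch with hypothetical batch-size data:

```python
from collections import Counter

def discrete_empirical_pmf(data):
    """p(x) = proportion of the data values equal to x."""
    n = len(data)
    return {x: c / n for x, c in Counter(data).items()}

# Hypothetical observed batch sizes
p = discrete_empirical_pmf([2, 3, 3, 5, 3, 2])  # p[3] = 0.5, p[2] = 1/3, p[5] = 1/6
```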