Download Models

Selecting Input Probability
Distribution
Simulation Machine
• Simulation can be considered as an engine with input and output, as follows:
Input → Simulation Engine → Output
Realizing Simulation
• Input Analysis is the analysis of the random variables involved in the model, such as:
– The distribution of interarrival times (IAT)
– The distribution of service times
• The Simulation Engine is the way of realizing the model. This includes:
– Generating the random variables involved in the model
– Evaluating the required formulas
• Output Analysis is the study of the data produced by the simulation engine.
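The three stages can be sketched in a few lines of Python. This is only an illustration under assumptions the slides do not state: a single-server, first-come-first-served queue with exponential interarrival and service times. The function name `simulate_waits` and the rate values are hypothetical.

```python
import random

def simulate_waits(n_customers, arrival_rate, service_rate, seed=0):
    """Return the waiting time in queue of each customer (single server, FIFO)."""
    rng = random.Random(seed)
    waits = [0.0]  # the first customer finds the system empty and never waits
    for _ in range(n_customers - 1):
        # Input: random variates for the service time and the next interarrival time
        service = rng.expovariate(service_rate)
        iat = rng.expovariate(arrival_rate)
        # Engine: Lindley recursion W[n+1] = max(0, W[n] + S[n] - A[n+1])
        waits.append(max(0.0, waits[-1] + service - iat))
    return waits

waits = simulate_waits(1000, arrival_rate=1.0, service_rate=1.25)
# Output analysis: summarize the data produced by the engine
print("average wait in queue:", sum(waits) / len(waits))
```

Changing the two distributions passed to the engine is exactly what input analysis is meant to justify.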
Input Analysis
• Collect data from the field.
• Analyze these data.
• Two ways to analyze the data:
– Build an empirical distribution and then sample from it.
– Fit the data to a theoretical distribution (such as Normal, Exponential, etc.). See Chapter 6 of the text for more distributions.
How to select an Input
Probability distribution
1. Hypothesize a family of distributions.
2. Estimate the parameters of the fitted distribution.
3. Determine how representative the fitted distribution is.
Repeat steps 1-3 until you get a fitted distribution for the collected data. Otherwise, go with an empirical distribution.
Hypothesizing a
Theoretical Distribution
To Fit a Theoretical
Distribution
• Need a good background in the theoretical distributions (consult your text, Section 6.2).
• A histogram may not provide much insight into the nature of the distribution.
• Need summary statistics.
Summary Statistics
• Mean $\mu$
• Median
• Variance $\sigma^2$
• Coefficient of variation ($cv = \sigma/\mu$) for continuous distributions
• Lexis ratio ($\tau = \sigma^2/\mu$) for discrete distributions
• Skewness index $\nu = \dfrac{E[(X-\mu)^3]}{(\sigma^2)^{3/2}}$
Summary Stats. Cont.
• If the mean and the median are close to each other and the coefficient of variation is low, we would expect normally distributed data.
• If the median is less than the mean and the standard deviation is very close to the mean (cv close to 1), we expect an exponential distribution.
• If the skewness is very low (close to 0), then the data are approximately symmetric.
Example

Consider the following data
5.076808
4.895876
6.77878
6.909572
6.474918
7.607923
6.699065
6.019929
5.249301
4.653011
5.050842
5.300643
6.236305
6.829625
4.524959
4.913438
4.965261
5.505035
5.170052
4.132489
6.398492
4.615494
6.091197
6.121048
4.927547
6.651687
5.968593
4.147587
5.46882
6.241657
Example Cont.
• Mean 5.654198
• Median 5.486928
• Standard Deviation 0.910188
• Skewness 0.173392
• Range 3.475434
• Minimum 4.132489
• Maximum 7.607923
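The summary statistics can be recomputed with Python's standard library. This is a minimal sketch: the variable names are illustrative, the standard deviation uses the sample (n − 1) divisor, and the skewness is the plain moment ratio, so it need not match a spreadsheet figure that applies a small-sample adjustment.

```python
import statistics

data = [
    5.076808, 4.895876, 6.77878, 6.909572, 6.474918, 7.607923,
    6.699065, 6.019929, 5.249301, 4.653011, 5.050842, 5.300643,
    6.236305, 6.829625, 4.524959, 4.913438, 4.965261, 5.505035,
    5.170052, 4.132489, 6.398492, 4.615494, 6.091197, 6.121048,
    4.927547, 6.651687, 5.968593, 4.147587, 5.46882, 6.241657,
]

n = len(data)
mean = statistics.mean(data)
median = statistics.median(data)
s = statistics.stdev(data)        # sample standard deviation (n - 1 divisor)
cv = s / mean                     # coefficient of variation
data_range = max(data) - min(data)

# Moment-ratio skewness: E[(X - mean)^3] / (variance)^(3/2), with 1/n moments
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n
skew = m3 / m2 ** 1.5

print(f"mean={mean:.6f} median={median:.6f} s={s:.6f}")
print(f"cv={cv:.3f} skew={skew:.3f} range={data_range:.6f}")
```

A mean close to the median together with a small cv is the pattern the previous slide associates with normally distributed data.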
Example Cont.
We might take these data and construct a histogram. The given summary statistics and the histogram suggest a Normal distribution.
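A text histogram of the example data can be built as follows. This is a sketch only: the helper `histogram` and the choice of 6 equal-width bins are assumptions, since the slides do not say how the histogram was binned.

```python
def histogram(values, n_bins):
    """Count values in n_bins equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in values:
        # The maximum would land in bin n_bins, so clamp it into the last bin
        j = min(int((x - lo) / width), n_bins - 1)
        counts[j] += 1
    return lo, width, counts

data = [
    5.076808, 4.895876, 6.77878, 6.909572, 6.474918, 7.607923,
    6.699065, 6.019929, 5.249301, 4.653011, 5.050842, 5.300643,
    6.236305, 6.829625, 4.524959, 4.913438, 4.965261, 5.505035,
    5.170052, 4.132489, 6.398492, 4.615494, 6.091197, 6.121048,
    4.927547, 6.651687, 5.968593, 4.147587, 5.46882, 6.241657,
]

lo, width, counts = histogram(data, 6)
for j, c in enumerate(counts):
    left = lo + j * width
    print(f"[{left:.2f}, {left + width:.2f})  {'*' * c}")
```

With only 30 observations the shape changes noticeably with the number of bins, which is one reason to read the histogram together with the summary statistics.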
Empirical Distribution
[Figure: empirical distribution of the example data; vertical axis from 0 to 1, horizontal axis labeled 1 to 29]
Disadvantages of Empirical
distribution
• The empirical data may not adequately represent the true underlying population because of sampling error.
• The generated random variates are bounded by the range of the observed data.
• To overcome these two problems, we attempt to fit a theoretical distribution.
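The boundedness problem is easy to see in code. The sketch below uses the simplest form of empirical sampling, drawing observed values with replacement, which may differ from the interpolated empirical distribution used in the text. The helper names are hypothetical, and the sample is the first ten observations of the earlier example.

```python
import random

def empirical_cdf(sample):
    """Return (x, F(x)) pairs of the empirical distribution function."""
    xs = sorted(sample)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

def sample_empirical(sample, size, seed=0):
    """Draw variates by picking observed values with replacement."""
    rng = random.Random(seed)
    return [rng.choice(sample) for _ in range(size)]

observed = [5.076808, 4.895876, 6.77878, 6.909572, 6.474918,
            7.607923, 6.699065, 6.019929, 5.249301, 4.653011]

ecdf = empirical_cdf(observed)
draws = sample_empirical(observed, 1000)

# No generated value can fall outside the observed range
print(min(draws) >= min(observed), max(draws) <= max(observed))
```

A fitted theoretical distribution, by contrast, can generate values beyond the smallest and largest observations.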
Estimation of Parameters of
the fitted distributions
Suppose we have hypothesized a distribution. Then use the Maximum Likelihood Estimator (MLE) to estimate the parameters of the hypothesized distribution.
• Suppose that $\theta$ is the only parameter involved in the distribution (for example, the mean $1/\lambda$ in the exponential distribution). Then construct the likelihood function:
• Let $L(\theta) = f_\theta(X_1)\, f_\theta(X_2) \cdots f_\theta(X_n)$
• Find the $\theta$ that maximizes $L(\theta)$; this is the required estimate.
• Example: the exponential distribution. Do in class.
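As a sketch of how the exponential example works out, assume the density $f_\lambda(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ and treat the rate $\lambda$ as the single parameter (this parameterization is an assumption; the slides leave the derivation to the class):

```latex
\begin{align*}
L(\lambda) &= \prod_{i=1}^{n} \lambda e^{-\lambda X_i}
            = \lambda^{n} \exp\Bigl(-\lambda \sum_{i=1}^{n} X_i\Bigr) \\
\ln L(\lambda) &= n \ln \lambda - \lambda \sum_{i=1}^{n} X_i \\
\frac{d}{d\lambda} \ln L(\lambda) &= \frac{n}{\lambda} - \sum_{i=1}^{n} X_i = 0
\quad\Longrightarrow\quad
\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} X_i} = \frac{1}{\bar{X}}
\end{align*}
```

The second derivative, $-n/\lambda^2$, is negative, so this stationary point is a maximum. The estimated mean $1/\hat{\lambda}$ is simply the sample mean $\bar{X}$.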
Determine how representative
the fitted distributions are
• Goodness of Fit (Chi Squared method)
Goodness of Fit (Chi Square
method)
1. Divide the range of the fitted distribution into $k$ ($k < 30$) intervals $[a_0, a_1), [a_1, a_2), \ldots, [a_{k-1}, a_k]$. Let $N_j$ = the number of data points that belong to $[a_{j-1}, a_j)$.
2. Compute the expected proportion $p_j$ of the data that fall in the $j$th interval using the fitted distribution.
3. Compute the chi-square statistic, where $n$ is the total number of data points:
$\chi^2 = \sum_{j=1}^{k} \dfrac{(N_j - n p_j)^2}{n p_j}$
Chi-square cont.
• Note that $n p_j$ represents the expected number of data points that would fall in the $j$th interval if the fitted distribution is correct.
• If $\chi^2 \le \chi^2_{k-r-1,\,1-\alpha}$,
• where $r$ is the number of parameters in the distribution (in the exponential distribution $r = 1$, the parameter $\lambda$),
• then do not reject the distribution at confidence level $(1-\alpha)100\%$.
Example:
• Consider the following data:
0.01, 0.07, 0.03, 0.23, 0.04,
0.10, 0.31, 0.10, 0.31, 1.17,
1.50, 0.93, 1.54, 0.19, 0.17,
0.36, 0.27, 0.46, 0.51, 0.11,
0.56, 0.72, 0.39, 0.04, 0.78
Suppose we hypothesize an exponential distribution. Use the chi-square test, dividing the range into 5 subintervals.
• The estimate of $\lambda$ is 2.5.
• Since $k = 5$, we have $p_j = 0.2$.
• For the exponential distribution, $F(t) = 1 - e^{-2.5t}$.
• Therefore
$1 - e^{-2.5 a_1} = 0.2$
$1 - 0.2 = e^{-2.5 a_1}$
$a_1 = \dfrac{\ln(1 - 0.2)}{-2.5} = 0.089$
• Therefore chi-square = 0.4.
• From the tables of chi-square, $\chi^2_{3,\,0.05} = 7.81$.
• Since $0.4 < 7.81$, we can accept the hypothesis at significance level 5%.
The Chi-square table

Degrees of    Probability, p
Freedom       0.99     0.95     0.05     0.01     0.001
1             0.000    0.004    3.84     6.64     10.83
2             0.020    0.103    5.99     9.21     13.82
3             0.115    0.352    7.82     11.35    16.27
4             0.297    0.711    9.49     13.28    18.47
5             0.554    1.145    11.07    15.09    20.52
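The chi-square example can be checked numerically. The sketch below takes the rate 2.5 and the critical value 7.81 directly from the slides; the variable names are illustrative. For reference, the sample mean of the 25 values is 0.436, so $1/\bar{X}$ is about 2.29; the slide's 2.5 is kept here so the numbers agree with the worked example.

```python
import math

data = [
    0.01, 0.07, 0.03, 0.23, 0.04,
    0.10, 0.31, 0.10, 0.31, 1.17,
    1.50, 0.93, 1.54, 0.19, 0.17,
    0.36, 0.27, 0.46, 0.51, 0.11,
    0.56, 0.72, 0.39, 0.04, 0.78,
]

lam = 2.5          # rate estimate quoted on the slide
k = 5              # number of equal-probability intervals
n = len(data)
p = 1.0 / k        # expected proportion in each interval

# Interior endpoints a_1 .. a_{k-1} solve 1 - exp(-lam * a_j) = j * p
edges = [-math.log(1.0 - j * p) / lam for j in range(1, k)]

# N_j = number of observations in [a_{j-1}, a_j), with a_0 = 0 and a_k = infinity
counts = [0] * k
for x in data:
    j = sum(1 for a in edges if x >= a)   # how many endpoints lie at or below x
    counts[j] += 1

chi2 = sum((N - n * p) ** 2 / (n * p) for N in counts)

critical = 7.81    # 3 degrees of freedom (k - r - 1 = 5 - 1 - 1), 5% level, from the slide
print("a1 =", round(edges[0], 3))
print("counts =", counts)
print("chi-square =", round(chi2, 3), "do not reject:", chi2 <= critical)
```

Equal-probability intervals make every expected count equal to $n p_j = 5$, which keeps the statistic simple to compute by hand.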