Simulation Modeling and
Analysis
Input Modeling
1
Outline
• Introduction
• Data Collection
• Matching Distributions with Data
• Parameter Estimation
• Goodness of Fit Testing
• Input Models without Data
• Multivariate and Time Series Input Models
2
Introduction
• Steps in Developing Input Data Model
– Data collection from the real system
– Identification of a probability distribution
representing the data
– Select distribution parameters
– Goodness of fit testing
3
Data Collection
• Useful Suggestions
– Plan, practice, preobserve
– Analyze data as it is collected
– Combine homogeneous data sets
– Watch out for censoring
– Build scatter diagrams
– Check for autocorrelation
4
Identifying the Distribution
• Construction of Histograms
– Divide range of data into equal subintervals
– Label horizontal and vertical axes appropriately
– Determine frequency occurrences within each
subinterval
– Plot frequencies
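The histogram steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `histogram_counts`, the choice of k = 4 cells, and the sample values are all assumptions made for the example.

```python
def histogram_counts(data, k):
    """Count observations in k equal-width subintervals covering the data range."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / k          # assumes hi > lo, i.e. the data are not all equal
    counts = [0] * k
    for x in data:
        # The maximum lands exactly on the right edge, so clamp it into the last cell.
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(k + 1)]
    return edges, counts


# Hypothetical service times in minutes, made up for illustration.
sample = [0.2, 0.5, 0.9, 1.1, 1.4, 1.5, 2.3, 2.8, 3.6, 4.0]
edges, counts = histogram_counts(sample, k=4)
for left, right, c in zip(edges, edges[1:], counts):
    print(f"{left:5.2f} to {right:5.2f} | {'*' * c}")
```

The number of cells is a judgment call: too few cells hide the shape of the data, too many leave the plot ragged.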
5
Physical Basis of Common
Distributions
• Binomial: Number of successes in n
independent trials each of probability p .
• Negative Binomial: Number of
trials required to achieve k successes (Geometric when k = 1).
• Poisson: Number of independent events
occurring in a fixed amount of time or
space (time between events is Exponential).
6
Physical Basis of Common
Distributions - contd
• Normal: Processes which are the sum of
component processes.
• Lognormal: Processes which are the product
of component processes.
• Exponential: Times between independent
events (Number of events is Poisson).
• Gamma: Many applications. Non-negative
random variables only.
7
Physical Basis of Common
Distributions - contd
• Beta: Many applications. Bounded random
variables only.
• Erlang: Processes which are the sum of
several exponential component processes.
• Weibull: Time to failure.
• Uniform: Complete uncertainty.
• Triangular: When only minimum, most
likely and maximum values are known.
8
Quantile-Quantile Plots
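• If X is a RV with cdf F, the q-quantile of X
is the value $\gamma$ such that $F(\gamma) = P(X \le \gamma) = q$, for $0 < q < 1$
• Raw data $\{x_i\}$, $i = 1, \dots, n$
• Data rearranged by magnitude $\{y_j\}$, with $y_1 \le y_2 \le \dots \le y_n$
• Then $y_j$ is an estimate of the $(j - 1/2)/n$
quantile of X, i.e.
$y_j \approx F^{-1}[(j - 1/2)/n]$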
9
Quantile-Quantile Plots -contd
• If F is a member of an appropriate family,
then a plot of $y_j$ vs. $F^{-1}[(j - 1/2)/n]$ is approximately a
straight line.
• If F also has the appropriate parameter
values, the line has a slope of approximately 1.
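As a sketch of how the plotted points are obtained, the code below builds the pairs for a candidate exponential distribution, whose inverse cdf is F^{-1}(q) = -mean * ln(1 - q). The function name, the sample values, and the use of the sample mean as the exponential mean are assumptions made for illustration.

```python
import math


def qq_points_exponential(data, mean):
    """Return (theoretical quantile, ordered observation) pairs for an exponential fit."""
    y = sorted(data)                               # the data rearranged by magnitude
    n = len(y)
    points = []
    for j, yj in enumerate(y, start=1):
        q = (j - 0.5) / n                          # the j-th ordered value estimates this quantile
        theoretical = -mean * math.log(1.0 - q)    # exponential inverse cdf at q
        points.append((theoretical, yj))
    return points


# Hypothetical interarrival times, made up for illustration.
sample = [0.3, 1.9, 0.7, 4.2, 1.1, 0.1, 2.6, 0.9]
sample_mean = sum(sample) / len(sample)
for theoretical, observed in qq_points_exponential(sample, sample_mean):
    print(f"{theoretical:6.3f}  {observed:6.3f}")
```

Plotting the second column against the first and judging how close the points lie to a straight line is the visual check described above.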
10
Parameter Estimation
• Once a distribution family has been
determined, its parameters must be
estimated.
• Sample Mean and Sample Standard
Deviation.
11
Parameter Estimation -contd
• Suggested Estimators
– Poisson: $\hat{\alpha} = \bar{X}$ (sample mean)
– Exponential: $\hat{\lambda} = 1/\bar{X}$
– Uniform (on [0,b]): $\hat{b} = (n+1) \max(X)/n$
– Normal: $\hat{\mu} = \bar{X}$; $\hat{\sigma}^2 = S^2$
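A minimal sketch of the suggested estimators, using only the Python standard library. The function name and dictionary keys are hypothetical, and the data are made up; in practice only the estimator for the chosen family would be used.

```python
import statistics


def suggested_estimators(x):
    """Compute the suggested estimators from one sample (all shown for illustration)."""
    n = len(x)
    mean = statistics.mean(x)
    s2 = statistics.variance(x)            # sample variance S^2, divisor n - 1
    return {
        "poisson_mean": mean,              # Poisson parameter estimated by the sample mean
        "exponential_rate": 1.0 / mean,    # exponential rate estimated by 1 / sample mean
        "uniform_b": (n + 1) * max(x) / n, # upper limit of a uniform on [0, b]
        "normal_mean": mean,
        "normal_variance": s2,
    }


# Hypothetical observations, made up for illustration.
print(suggested_estimators([2.0, 4.0, 4.0, 6.0]))
```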
12
Goodness of Fit Tests
• Test the hypothesis that a random sample of
size n of the random variable X follows a
specific distribution.
– Chi-Square Test (large n; continuous and
discrete distributions)
– Kolmogorov-Smirnov Test (small n;
continuous distributions only)
13
Chi-Square Test
• Statistic
$\chi_0^2 = \sum_{i=1}^{k} (O_i - E_i)^2 / E_i$
• Approximately follows the chi-square distribution with $k - s - 1$ degrees of freedom (s = number of parameters of the given
distribution estimated from the data)
• Here $E_i = n p_i$ is the expected frequency
while $O_i$ is the observed frequency.
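The statistic is simple to compute once the cell counts and cell probabilities are known. The sketch below assumes hypothetical counts in four equally likely cells and one estimated parameter; the function names are illustrative.

```python
def chi_square_statistic(observed, probabilities):
    """Sum of (O - E)^2 / E over the cells, with E = n * p for each cell."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


def degrees_of_freedom(k, s):
    """k cells, s parameters estimated from the data."""
    return k - s - 1


# Hypothetical counts for n = 100 observations in k = 4 equally likely cells.
chi2, expected = chi_square_statistic([18, 22, 30, 30], [0.25] * 4)
print(f"statistic = {chi2:.2f}, d.o.f. = {degrees_of_freedom(4, 1)}")
```

The resulting statistic is then compared with the tabulated critical value for the chosen significance level and degrees of freedom.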
14
Chi-Square Test -contd
• Steps
– Arrange the n observations into k cells
– Compute the statistic $\chi_0^2 = \sum_{i=1}^{k} (O_i - E_i)^2 / E_i$
– Find the critical value of $\chi^2$ (Handout)
– Accept or reject the null hypothesis based on
the comparison
• Example: Stat::Fit
15
Chi-Square Test - contd
• If the test involves a discrete distribution,
each value of the RV should be its own class
interval, unless adjacent values must be combined
to keep the expected frequencies large enough.
• If the test involves a continuous distribution
class intervals must be selected which are
equal in probability rather than width.
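For a continuous distribution, equal-probability cells come from inverting the cdf at i/k. The sketch below does this for an exponential with a given rate, where the i-th boundary is -ln(1 - i/k) / rate. Function names and data are assumptions for illustration, and the data are assumed non-negative.

```python
import math


def equal_probability_edges(rate, k):
    """Boundaries of k cells that each have probability 1/k under an exponential(rate)."""
    edges = [0.0]
    edges += [-math.log(1.0 - i / k) / rate for i in range(1, k)]
    edges.append(math.inf)                 # the last cell is unbounded on the right
    return edges


def count_in_cells(data, edges):
    """Observed frequency in each cell [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    return counts


edges = equal_probability_edges(rate=1.0, k=4)
# Hypothetical observations, made up for illustration.
print(edges)
print(count_in_cells([0.1, 0.5, 1.0, 2.0, 0.2], edges))
```

With equal-probability cells every expected frequency is simply n/k, which makes the minimum expected frequency easy to control through the choice of k.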
16
Chi-Square Test - contd
• Example: Exponential distribution.
• Example: Weibull distribution.
• Example: Normal distribution.
17
Kolmogorov-Smirnov Test
• Identify the maximum absolute difference D
between the empirical cdf of a random
sample and the cdf of a specified theoretical
distribution.
• Compare against the critical value of D
(Handout).
• Accept or reject H0 accordingly
• Example.
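A minimal sketch of the D computation. Because the empirical cdf jumps at each ordered observation, the largest gap is checked just after and just before every jump. The function name is illustrative, and the five values tested against a uniform on [0,1] are hypothetical.

```python
def ks_statistic(data, cdf):
    """Maximum absolute difference between the empirical cdf and a theoretical cdf."""
    y = sorted(data)
    n = len(y)
    # Gap just after each jump of the empirical cdf.
    d_plus = max(j / n - cdf(yj) for j, yj in enumerate(y, start=1))
    # Gap just before each jump of the empirical cdf.
    d_minus = max(cdf(yj) - (j - 1) / n for j, yj in enumerate(y, start=1))
    return max(d_plus, d_minus)


def uniform_cdf(x):
    """cdf of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


# Hypothetical sample, made up for illustration.
d = ks_statistic([0.44, 0.81, 0.14, 0.05, 0.93], uniform_cdf)
print(f"D = {d:.2f}")
```

The value of D is then compared with the tabulated critical value for the sample size and significance level.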
18
Input Models without Data
• When hard data are not available, use:
– Engineering data (specs)
– Expert opinion
– Physical and/or conventional limitations
– Information on the nature of the process
– Uniform, triangular or beta distributions
• Check sensitivity!
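As a small illustration of a data-free input model, the sketch below samples a triangular distribution from an expert's minimum, most likely, and maximum values, then varies the most likely value to see how much the input mean moves. All numbers and the function name are hypothetical; a real sensitivity check would rerun the full simulation rather than look at the input alone.

```python
import random
import statistics


def triangular_sample_mean(low, mode, high, n, rng):
    """Average of n triangular draws; note random.triangular takes (low, high, mode)."""
    draws = [rng.triangular(low, high, mode) for _ in range(n)]
    return statistics.mean(draws)


rng = random.Random(42)                    # fixed seed so the run is repeatable
for mode in (4.0, 5.0, 6.0):               # hypothetical "most likely" values to compare
    m = triangular_sample_mean(2.0, mode, 11.0, 10_000, rng)
    print(f"mode = {mode}: sample mean = {m:.2f}")
```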
19
Multivariate and Time-Series
Input Models
• If input variables are not independent, their
relationship must be taken into
consideration (multivariate input model).
• If input variables constitute a sequence (in
time) of related random variables, their
relationship must be taken into account
(time-series input model).
20
Covariance and Correlation
• Measure the linear dependence between two
random variables X1 (mean $\mu_1$, std dev $\sigma_1$)
and X2 (mean $\mu_2$, std dev $\sigma_2$)
$X_1 - \mu_1 = \beta (X_2 - \mu_2) + \varepsilon$
• Covariance:
$\mathrm{cov}(X_1, X_2) = E(X_1 X_2) - \mu_1 \mu_2$
• Correlation:
$\rho = \mathrm{cov}(X_1, X_2) / (\sigma_1 \sigma_2)$
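When only paired observations are available, the covariance and correlation are estimated from the sample. The sketch below uses the usual n - 1 divisor; the function name and the paired data are assumptions for illustration.

```python
import math


def sample_cov_corr(x1, x2):
    """Sample covariance and sample correlation of paired observations."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / (n - 1)
    s1 = math.sqrt(sum((a - m1) ** 2 for a in x1) / (n - 1))
    s2 = math.sqrt(sum((b - m2) ** 2 for b in x2) / (n - 1))
    return cov, cov / (s1 * s2)


# Hypothetical paired observations (e.g. lead time and demand), made up for illustration.
lead_time = [6.5, 4.3, 6.9, 6.0, 6.9, 6.9, 5.8, 7.3, 4.5, 6.3]
demand = [103, 83, 116, 97, 112, 104, 106, 109, 92, 96]
cov, rho = sample_cov_corr(lead_time, demand)
print(f"cov = {cov:.2f}, corr = {rho:.2f}")
```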
21
Multivariate Input Models
• If X1 and X2 are normally distributed and
interrelated, they can be modeled by a
bivariate normal distribution
• Steps
– Generate Z1 and Z2, independent standard
normal RVs
– Set $X_1 = \mu_1 + \sigma_1 Z_1$
– Set $X_2 = \mu_2 + \sigma_2 (\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2)$
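The generation steps above can be sketched as follows. The function name and the parameter values in the usage line are assumptions for illustration; `rng.gauss(0.0, 1.0)` supplies the independent standard normal values.

```python
import math
import random


def bivariate_normal_pair(mu1, sigma1, mu2, sigma2, rho, rng):
    """Generate one correlated normal pair from two independent standard normals."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x1 = mu1 + sigma1 * z1
    # Mixing z1 into x2 with weight rho produces correlation rho between x1 and x2.
    x2 = mu2 + sigma2 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return x1, x2


rng = random.Random(2024)                  # fixed seed so the run is repeatable
# Hypothetical parameters, chosen only for illustration.
pairs = [bivariate_normal_pair(10.0, 2.0, 5.0, 1.0, 0.8, rng) for _ in range(5)]
for x1, x2 in pairs:
    print(f"{x1:7.3f}  {x2:7.3f}")
```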
22
Time-Series Input Models
• Let $X_1, X_2, X_3, \dots$ be a sequence of
identically distributed and covariance-stationary RVs. The lag-h correlation is
$\rho_h = \mathrm{corr}(X_t, X_{t+h}) = \rho^h$
• If all Xt are normal: AR(1) model.
• If all Xt are exponential: EAR(1) model.
23
AR(1) model
• For a time series model
$X_t = \mu + \phi (X_{t-1} - \mu) + \varepsilon_t$
where
$\varepsilon_t$ are normal with mean = 0 and var = $\sigma_\varepsilon^2$,
and $-1 < \phi < 1$
24
AR(1) model -contd
1.- Generate $X_1$ from a normal with mean $\mu$
and variance $\sigma_\varepsilon^2 / (1 - \phi^2)$. Set t = 2.
2.- Generate $\varepsilon_t$ from a normal with mean = 0
and variance $\sigma_\varepsilon^2$.
3.- Set $X_t = \mu + \phi (X_{t-1} - \mu) + \varepsilon_t$
4.- Set t = t+1 and go to 2.
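The four steps above can be sketched as a short generator. The function name and the parameter values in the usage lines are assumptions for illustration; the autoregressive coefficient must lie strictly between -1 and 1 for the starting variance to be defined.

```python
import math
import random


def ar1_series(mu, phi, sigma_eps, n, rng):
    """Generate n values of a stationary AR(1) series with normal noise."""
    # Step 1: start from the stationary distribution (variance sigma_eps^2 / (1 - phi^2)).
    x = [rng.gauss(mu, sigma_eps / math.sqrt(1.0 - phi ** 2))]
    for _ in range(n - 1):
        eps = rng.gauss(0.0, sigma_eps)               # step 2: normal noise term
        x.append(mu + phi * (x[-1] - mu) + eps)       # step 3: the AR(1) recursion
    return x


rng = random.Random(7)                     # fixed seed so the run is repeatable
# Hypothetical parameters, chosen only for illustration.
series = ar1_series(mu=20.0, phi=0.7, sigma_eps=1.0, n=10, rng=rng)
print([round(v, 2) for v in series])
```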
25
EAR(1) model
• For a time series model
$X_t = \phi X_{t-1}$ with prob $\phi$
$X_t = \phi X_{t-1} + \varepsilon_t$ with prob $1 - \phi$
where
$\varepsilon_t$ are exponential with mean = $1/\lambda$ and
$0 \le \phi < 1$
26
EAR(1) model - contd
1.- Generate $X_1$ from an exponential with
mean $1/\lambda$. Set t = 2.
2.- Generate U from a uniform on [0,1]. If
$U < \phi$, set $X_t = \phi X_{t-1}$. Otherwise generate
$\varepsilon_t$ from an exponential with mean $1/\lambda$ and set
$X_t = \phi X_{t-1} + \varepsilon_t$
3.- Set t = t+1 and go to 2.
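A minimal sketch of the EAR(1) procedure above. It assumes the first value is drawn from an exponential with the same mean as the noise terms, which is what keeps the series stationary; the function name and parameter values are illustrative. Note that `random.expovariate` takes the rate, so its mean is 1 / rate.

```python
import random


def ear1_series(rate, phi, n, rng):
    """Generate n values of an EAR(1) series whose values are exponential with mean 1/rate."""
    x = [rng.expovariate(rate)]                       # step 1: exponential starting value
    for _ in range(n - 1):
        if rng.random() < phi:                        # step 2: with probability phi, no noise
            x.append(phi * x[-1])
        else:                                         # otherwise add an exponential noise term
            x.append(phi * x[-1] + rng.expovariate(rate))
    return x


rng = random.Random(11)                    # fixed seed so the run is repeatable
# Hypothetical parameters, chosen only for illustration.
series = ear1_series(rate=2.0, phi=0.5, n=10, rng=rng)
print([round(v, 3) for v in series])
```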
27