Analyzing Input and
Output Simulation Data
MIO 310 Optimering och Simulering
2012
(Operations Research, Basic Course)
The main reference for this material is chapter 9 in the book Business Process Modeling, Simulation and Design by M. Laguna
and J. Marklund, Prentice Hall 2005.
Overview
• Analysis of Input Data
– Identification of field data distributions
▪ Goodness-of-fit tests
▪ Random number generation
• Analysis of Simulation Output Data
– Non-terminating vs. terminating processes
– Confidence intervals
– Hypothesis testing for comparing designs
Why Input and Output Data Analysis?
Random input data → Simulation model → Random output data
• Analysis of input data
– Necessary for building a valid model
– Three aspects
▪ Identification of (time) distributions
▪ Random number generation (integrated into Extend)
▪ Generation of random variates (integrated into Extend)
• Analysis of output data
– Necessary for drawing correct conclusions
– The reported performance measures are typically random variables!
Example from IKEA
• To develop a general method to determine the most
appropriate statistical distribution to describe average
customer demand during the lead time from DC to
store
Project description
• Optimization of safety stock
– High service level and low costs
– Important to know customer demand
• Today the normal distribution is used
Developed method – process flow chart [figure not reproduced]
Tools
– Matlab
– Extend
– Excel
The outcomes of the project
• A method to find the most appropriate
distribution to describe average customer
demand during the lead time from DC to
store
• The normal distribution should not be used
– The gamma distribution seems to be a better fit
Capturing Randomness in Input Data
1. Collect raw field data and use as input for the simulation
+ No question about relevance
– Expensive/impossible to retrieve a large enough data set
– Not available for new processes
– Not available for multiple scenarios ⇒ No sensitivity analysis
+ Very valuable for model validation
2. Generate artificial data to use as input data
⇒ Must capture the characteristics of the real data
1) Collect a sufficient sample of field data
2) Characterize the data statistically – Distribution type and parameters
3) Generate random artificial data mimicking the real data
+ High flexibility – easy to handle new scenarios
+ Cheap
– Requires proper statistical analysis to ensure model validity
Procedure for Modeling Input Data
1. Gather data from the real system
2. Identify an appropriate distribution family
• Plot histograms of the data
• Compare the histogram graphically (“eye-balling”) with shapes of well-known distribution functions
– How about the tails of the distribution, limited or unlimited?
– How to handle negative outcomes?
3. Estimate distribution parameters and pick an “exact” distribution
4. Perform goodness-of-fit test (Reject the hypothesis that the picked distribution is correct?)
• Informal test – “eye-balling”
• Formal tests, for example
– χ²-test
– Kolmogorov-Smirnov test
• Distribution hypothesis rejected ⇒ return to step 2
If a known distribution cannot be accepted
⇒ Use an empirical distribution
Example – Modeling Interarrival Times (I)
1. Data gathering from the real system
Interarrival time (t)    Frequency
0 ≤ t < 3                23
3 ≤ t < 6                10
6 ≤ t < 9                5
9 ≤ t < 12               1
12 ≤ t < 15              1
15 ≤ t < 18              2
18 ≤ t < 21              0
21 ≤ t < 24              1
24 ≤ t < 27              1
Etc.
Example – Modeling Interarrival Times (II)
2. Identify an appropriate distribution type/family
– Plot a histogram
1) Divide the data material into appropriate intervals
▪ Usually of equal size
2) Determine the event frequency for each interval (or bin)
3) Plot the frequency (y-axis) for each interval (x-axis)
[Figure: histogram of the interarrival time frequencies (y-axis, 0 to 25) for the bins 0-3, 3-6, 6-9, 9-12, <15, <18, <21, <24, <27 (x-axis)]
The Exponential distribution seems to be a good first guess!
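The three histogram steps can be sketched in Python as follows. This is only an illustration and not part of the original slides: the function name `bin_frequencies` and the sample values are hypothetical, and equal-width bins starting at 0 are assumed.

```python
def bin_frequencies(times, bin_width, n_bins):
    """Count observations in the equal-width bins [0, w), [w, 2w), ..."""
    counts = [0] * n_bins
    for t in times:
        index = int(t // bin_width)
        if 0 <= index < n_bins:  # values outside the covered range are ignored
            counts[index] += 1
    return counts


# Hypothetical interarrival times (NOT the data set from the slides)
sample = [0.4, 1.2, 2.9, 3.0, 4.5, 7.1, 10.8, 2.2, 0.9, 5.5]
counts = bin_frequencies(sample, bin_width=3, n_bins=4)

# Crude text histogram: one '#' per observation in each bin
for i, count in enumerate(counts):
    print(f"{3 * i:>2}-{3 * (i + 1):<2} {'#' * count}")
```

A frequency profile that falls off quickly from the first bin, as in the slide's table, is what suggests the Exponential distribution as a first guess.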
Example – Modeling Interarrival Times (III)
3. Estimate the parameters defining the chosen distribution
– In the current example Exp(λ) has been chosen ⇒ need to estimate the parameter λ
▪ ti = the ith interarrival time in the collected sample of n observations
\[ \frac{1}{\lambda} = \bar{t} = \frac{1}{n}\sum_{i=1}^{n} t_i \quad\Rightarrow\quad \lambda = \frac{1}{\bar{t}} = \dots = 0.084 \]
Example – Modeling Interarrival Times (III)
4. Perform Goodness-of-fit test
– The purpose is to test the hypothesis that the data material is adequately described by the “exact” distribution chosen in steps 1–3.
– Two of the most well-known standardized tests are
• The χ²-test
– Should not be applied if the sample size n<20
• The Kolmogorov-Smirnov test
– A relatively simple but imprecise test
– Often used for small sample sizes
– The χ²-test will be applied for the current example
Performing a 2-Test (I)
 In principle
 A statistical test comparing the relative frequencies for the
intervals/bins in a histogram with the theoretical probabilities of
the chosen distribution
• Assumptions
– The distribution involves k parameters estimated from the sample
– The sample contains n observations (sample size=n)
– F0(x) denotes the chosen/hypothesized CDF
Data:
x1, x2, …, xn
(n observations from the real
system)
Null hypothesis
Alternative hypothesis
Model: X1, X2,…, Xn
(Random variables, independent and
identically distributed with CDF F(x))
H0: F(x) = F0(x)
HA: F(x)  F0(x)
14
Performing a 2-Test (II)
1. Take the entire data range and divide it into r non
overlapping intervals or bins
f0(x)
The area = p2 = F0(a2) - F0(a1)
Min=a0
Bin:
a1
1
a2
2
a3
3
…
Data values
ar-2
ar-1
r-1
ar=Max
r
• pi = The probability that an observation X belongs to bin i
 The Null Hypothesis  pi = F0(ai) - F0(ai-1)
• To improve the accuracy of the test
– choose the bins (intervals) so that the probabilities pi (i=1,2, …r)
are equal for all bins
15
Performing a 2-Test (III)
2. Define r random variables Oi, i=1, 2, …r
– Oi=number of observations in bin i (= the interval (ai-1, ai])
– If H0 is true  the expected value of Oi = n*pi
• Oi is Binomially distributed with parameters n and pi
3. Define the test variable T
r
T
i 1
Oi  n  pi 2
n  pi
– If H0 is true  T follows a 2(r-k-1) distribution
 k = # of estimated parameters in the theoretical distribution being tested
– T = The critical value of T corresponding to a significance level 
obtained from a 2(r-k-1) distribution table
– Tobs = The value of T computed from the data material
 If Tobs > T  H0 can be rejected on the significance level 
16
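As a rough sketch, the test variable T and the rejection rule can be written in a few lines of Python. The function names and the observed counts below are hypothetical (they are not the slide data), and the critical value is assumed to be looked up in a χ² table by the user rather than computed.

```python
def chi_square_statistic(observed, probabilities):
    """T = sum over bins of (O_i - n*p_i)^2 / (n*p_i)."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, probabilities))


def reject_h0(observed, probabilities, critical_value):
    """Reject H0 when the observed statistic exceeds the table value."""
    return chi_square_statistic(observed, probabilities) > critical_value


# Hypothetical counts: n = 50 observations in r = 5 equiprobable bins
observed = [18, 12, 10, 6, 4]
probabilities = [1 / 5] * 5
t_obs = chi_square_statistic(observed, probabilities)

# One estimated parameter (k = 1) gives r - k - 1 = 3 degrees of freedom;
# 7.815 is the tabulated 5% critical value of the chi-square(3) distribution.
rejected = reject_h0(observed, probabilities, critical_value=7.815)
print(t_obs, rejected)
```

With these made-up counts the expected count per bin is n*pi = 10, which satisfies the usual requirement that it exceeds 5.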
Validity of the 2-Test
• Depends on the sample size n and on the bin selection (the
size of the intervals)
• Rules of thumb
– The 2-test is acceptable for ordinary significance levels (=1%,
5%) if the expected number of observations in each interval is
greater than 5 (n*pi>5 for all i)
– In the case of continuous data and a bin selection such that pi is
equal for all bins 




n20
20<n 50
50<n 100
n >100




Do not use the 2-test
5-10 bins recommendable
10-20 bins recommendable
n0.5 – 0.2n bins recommendable
17
Example – Modeling Interarrival Times (IV)
• Hypothesis – the interarrival time Y is Exp(0.084) distributed
H0: Y ∈ Exp(0.084)
HA: Y ∉ Exp(0.084)
• Bin sizes are chosen so that the probability pi is equal for all r bins and n*pi>5 for all i
– Equal pi ⇒ pi=1/r
– n*pi>5 ⇒ n/r > 5 ⇒ r<n/5
– n=50 ⇒ r<50/5=10 ⇒ Choose for example r=8 ⇒ pi=1/8
• Determining the interval limits ai, i=0, 1, …, 8
\[ H_0 \Rightarrow F(a_i) = 1 - e^{-0.084 a_i} \;\Rightarrow\; i \cdot p_i = 1 - e^{-0.084 a_i} \;\Rightarrow\; a_i = \frac{\ln(1 - i \cdot p_i)}{-0.084} \]
i=1 ⇒ a1=ln(1-(1/8))/(-0.084)=1.590
i=2 ⇒ a2=ln(1-(2/8))/(-0.084)=3.425
⋮
i=8 ⇒ a8=ln(1-(8/8))/(-0.084)=∞
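The interval limits can be reproduced with a short Python sketch. The function name `exponential_bin_limits` is hypothetical. The code simply evaluates ai = ln(1 - i*pi)/(-λ) for pi = 1/r, sets a0 = 0, and treats the last limit as infinity because ln(0) is undefined.

```python
import math


def exponential_bin_limits(rate, n_bins):
    """Limits a_0..a_r of r equiprobable bins for an Exp(rate) distribution."""
    limits = [0.0]                      # a_0: i = 0 gives ln(1) = 0
    for i in range(1, n_bins):
        limits.append(-math.log(1 - i / n_bins) / rate)
    limits.append(math.inf)             # a_r: F(a_r) = 1 only in the limit
    return limits


limits = exponential_bin_limits(rate=0.084, n_bins=8)
for i, a in enumerate(limits):
    print(f"a{i} = {a:.3f}")
```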
Example – Modeling Interarrival Times (V)
• Computing the test statistic Tobs
\[ T_{obs} = \sum_{i=1}^{8} \frac{(o_i - 50/8)^2}{50/8} = 39.6 \]
Note: oi = the actual number of observations in bin i
• Determining the critical value Tα
– If H0 is true ⇒ T ∈ χ²(8-1-1) = χ²(6)
– If α=0.05 ⇒ P(T ≤ T0.05) = 1-α = 0.95 ⇒ /χ² table/ ⇒ T0.05 = 12.60
• Rejecting the hypothesis
– Tobs = 39.6 > 12.6 = T0.05
⇒ H0 is rejected at the 5% level
Distribution Choice in Absence of Sample Data
• Common situation especially when designing new processes
– Try to draw on expert knowledge from people involved in similar tasks
▪ When estimates of interval lengths are available
– Ex. The service time ranges between 5 and 20 minutes
⇒ Plausible to use a Uniform distribution with min=5 and max=20
▪ When estimates of the interval and most likely value exist
– Ex. min=5, max=20, most likely=12
⇒ Plausible to use a Triangular distribution with those parameter values
▪ When estimates of min=a, most likely=c, max=b and the average value=x-bar are available
⇒ Use a β-distribution with parameters α and β
\[ \alpha = \frac{(\bar{x} - a)(2c - a - b)}{(c - \bar{x})(b - a)}, \qquad \beta = \frac{\alpha\,(b - \bar{x})}{\bar{x} - a} \]
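A small Python sketch of the β-distribution parameter formulas is given below. The function name is hypothetical, and the average value 12.2 is an assumed expert estimate added to the slide's min=5, most likely=12, max=20 example. The formulas break down when the average equals the most likely value (division by zero).

```python
def beta_parameters(a, c, b, mean):
    """Shape parameters from min a, most likely c, max b and average value."""
    alpha = (mean - a) * (2 * c - a - b) / ((c - mean) * (b - a))
    beta = alpha * (b - mean) / (mean - a)
    return alpha, beta


# min = 5, most likely = 12, max = 20; the mean 12.2 is an assumed estimate
alpha, beta = beta_parameters(a=5, c=12, b=20, mean=12.2)
print(alpha, beta)

# Consistency check: the mean implied by the fitted parameters on [a, b]
implied_mean = 5 + (20 - 5) * alpha / (alpha + beta)
```

The implied mean recovers the input average, which is a quick way to catch sign mistakes in the two formulas.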
Random Number Generators
• Needed to create artificial input data to the simulation model
• Generating truly random numbers is difficult
– Computers use pseudo-random number generators based on
mathematical algorithms – not truly random but good enough
• A popular algorithm is the “linear congruential method”
1. Define a random seed x0 from which the sequence is started
2. The next “random” number in the sequence is obtained from the previous through the relation
\[ x_{n+1} = (a \cdot x_n + c) \bmod m \]
where a, c, and m are integers > 0
Example – The Linear Congruential Method
• Assume that m=8, a=5, c=7 and x0=4
\[ x_{n+1} = (5 \cdot x_n + 7) \bmod 8 \]
n    xn    5xn+7    (5xn+7)/8    xn+1
0    4     27       3 + 3/8      3
1    3     22       2 + 6/8      6
2    6     37       4 + 5/8      5
3    5     32       4 + 0/8      0
4    0     7        0 + 7/8      7
5    7     42       5 + 2/8      2
6    2     17       2 + 1/8      1
7    1     12       1 + 4/8      4
Larger m ⇒ longer sequence before it starts repeating itself
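The table above can be reproduced with a minimal Python sketch of the linear congruential method. The function name is hypothetical. Dividing each value by m to obtain numbers in [0, 1) is a common convention added here as an assumption; it is not stated on the slide.

```python
def linear_congruential(seed, a, c, m, count):
    """Return the next `count` values x_1, x_2, ... of x_{n+1} = (a*x_n + c) mod m."""
    values = []
    x = seed
    for _ in range(count):
        x = (a * x + c) % m
        values.append(x)
    return values


sequence = linear_congruential(seed=4, a=5, c=7, m=8, count=8)
print(sequence)

# Scale to [0, 1) so the values can serve as uniform random numbers
uniforms = [x / 8 for x in sequence]
```

After eight steps the sequence returns to the seed 4, so with m=8 it repeats with period 8.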
Generating Random Variates
• Assume random numbers, r, from a Uniform(0, 1) distribution are available
⇒ Random numbers from any distribution can be obtained by applying the “inverse transformation technique”
The Inverse Transformation Technique
1. Generate a U[0, 1] distributed random number r
2. T is a random variable with a CDF FT(t) from which we would like to obtain a sequence of random numbers
– Note: 0 ≤ FT(t) ≤ 1 for all values of t
\[ \text{Let } F_T(t) = r \text{ and solve for } t \;\Rightarrow\; t = F_T^{-1}(r) \]
⇒ t is a random number from the distribution of T, i.e., a realization of T
• See Example – The Exponential distribution
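For the Exponential distribution, FT(t) = 1 - e^(-λt), so solving FT(t) = r gives t = -ln(1 - r)/λ. The Python sketch below illustrates this; the function name, the seed and the sample size are arbitrary choices, and Python's built-in generator stands in for the U[0, 1] source.

```python
import math
import random


def exponential_variate(r, rate):
    """Invert F_T(t) = 1 - exp(-rate * t) = r, i.e. t = -ln(1 - r) / rate."""
    return -math.log(1 - r) / rate


rng = random.Random(1)  # fixed seed so the run is repeatable
samples = [exponential_variate(rng.random(), rate=0.084)
           for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)
print(sample_mean, 1 / 0.084)  # the two values should be close
```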
Analysis of Simulation Output Data
• The output data collected from a simulation model are realizations of stochastic variables
– Results from random input data and random processing times
⇒ Statistical analysis is required to
1. Estimate performance characteristics
– Mean, variance, confidence intervals etc. for output variables
2. Compare performance characteristics for different designs
• The validity of the statistical analysis and the design conclusions are contingent on a careful sampling approach
– Sample sizes – run length and number of runs
– Inclusion or exclusion of “warm-up” periods?
– One long simulation run or several shorter ones?
Terminating vs. Non-Terminating Processes
Process simulation
• Non-terminating
– Steady state analysis
– Transient state analysis
• Terminating
– Time-controlled termination
– Event-controlled termination
Non-Terminating Processes
• Do not end naturally within a particular time horizon
– Ex. Inventory systems
• Usually reach steady state after an initial transient period
– Assumes that the input data is stationary
• To study the steady state behavior it is vital to determine the duration of the transient period
– Examine line plots of the output variables
• To reduce the duration of the transient (= “warm-up”) period
– Initialize the process with appropriate average values
Illustration of Transient and Steady State
[Figure: line plot of cycle times and average cycle time; y-axis: cycle time (0 to 30), x-axis: simulation time (0 to 50); an initial transient state is followed by steady state]
Terminating Processes
• End after a predetermined time span
– Typically the system starts from an empty state and ends in an empty state
– Ex. A grocery store, a construction project, …
• Terminating processes may or may not reach steady state
– Usually the transient period is of great interest for these processes
• Output data usually obtained from multiple independent simulation runs
– The length of a run is determined by the natural termination of the process
– Each run needs a different stream of random numbers
– The initial state of each run is typically the same
Confidence Intervals and Point Estimates
• Statistical estimation of measures from a data material is typically done in two ways
– Point estimates (single values)
– Confidence intervals (intervals)
• The significance level α (confidence level = 1-α)
– Indicates the probability of not finding the true value within the interval (Type I error)
– Chosen by the analyst/manager
• Determinants of confidence interval width
– The chosen level α
▪ Lower α ⇒ wider confidence interval
– The sample size and the standard deviation (σ)
▪ Larger sample ⇒ smaller standard deviation of the estimate ⇒ narrower interval
Important Point Estimates
• In simulation the most commonly used statistics are the mean and standard deviation (σ)
– From a sample of n observations
▪ Point estimate of the mean:
\[ \bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} \]
▪ Point estimate of σ:
\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]
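The two point estimates can be computed directly from their definitions, as in the Python sketch below. The function names and the observations are hypothetical. Python's standard library functions `statistics.mean` and `statistics.stdev` give the same results.

```python
import math


def sample_mean(xs):
    """Point estimate of the mean: (x_1 + ... + x_n) / n."""
    return sum(xs) / len(xs)


def sample_std(xs):
    """Point estimate of sigma, using the n - 1 divisor."""
    mean = sample_mean(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))


# Hypothetical output observations, e.g. cycle times from 8 runs
observations = [2, 4, 4, 4, 5, 5, 7, 9]
x_bar = sample_mean(observations)
s = sample_std(observations)
print(x_bar, s)
```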
Confidence Interval for Population Means (I)
• Characteristics of the point estimate for the population mean
– Xi = Random variable representing the value of the ith observation in a sample of size n, (i=1, 2, …, n)
– Assume that all observations Xi are independent random variables
– The population mean = E[Xi] = μ
– The population standard deviation = (Var[Xi])^0.5 = σ
– Point estimate of the population mean:
\[ \bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} \]
– Mean and Std. Dev. of the point estimate for the population mean
\[ E[\bar{X}] = \frac{E[X_1] + E[X_2] + \dots + E[X_n]}{n} = \frac{n\mu}{n} = \mu \]
\[ \sigma_{\bar{x}} = \sqrt{Var(\bar{X})} = \sqrt{\frac{Var(X_1) + Var(X_2) + \dots + Var(X_n)}{n^2}} = \sqrt{\frac{n\sigma^2}{n^2}} = \frac{\sigma}{\sqrt{n}} \]
Confidence Interval for Population Means (II)
• Distribution of the point estimate for population means
– X-bar ∈ N(μ, σ_x-bar)
▪ Approximately, for any distribution of Xi (i=1, 2, …, n), when n is large (n ≥ 30), due to the Central Limit Theorem
▪ If all Xi (i=1, 2, …, n) are normally distributed, for any n
• A standard transformation:
\[ Z = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}} \in N(0, 1) \]
• Defining a symmetric two sided confidence interval
– P(-Zα/2 ≤ Z ≤ Zα/2) = 1-α
– α is known ⇒ Zα/2 can be found from a N(0, 1) probability table
⇒ Confidence interval for the population mean μ
\[ -Z_{\alpha/2} \le \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \le Z_{\alpha/2} \;\Leftrightarrow\; \bar{x} - Z_{\alpha/2}\,\sigma_{\bar{x}} \le \mu \le \bar{x} + Z_{\alpha/2}\,\sigma_{\bar{x}} \]
Confidence Interval for Population Means (III)
\[ \bar{x} - Z_{\alpha/2}\,\sigma_{\bar{x}} \le \mu \le \bar{x} + Z_{\alpha/2}\,\sigma_{\bar{x}} \]
• In case the population standard deviation, σ, is known
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]
• In case σ is unknown we need to estimate it
– Use the point estimate s
⇒ The test variable Z is no longer Normally distributed, it follows a Student's t-distribution with n-1 degrees of freedom
\[ \bar{x} - t_{(n-1),\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{(n-1),\alpha/2}\,\frac{s}{\sqrt{n}} \]
In practice when n is large (≥ 30) the t-distribution is often approximated with the Normal distribution!
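For large samples, the confidence interval can be sketched in Python with the Normal approximation mentioned above. The function name and the data are hypothetical. `statistics.NormalDist().inv_cdf` supplies Zα/2; for small n a t-table value should be used instead, which this sketch does not do.

```python
import math
from statistics import NormalDist, mean, stdev


def confidence_interval(xs, alpha=0.05):
    """Approximate (1 - alpha) interval for the mean; assumes n is large."""
    n = len(xs)
    x_bar = mean(xs)
    s = stdev(xs)                            # point estimate of sigma
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    half_width = z * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical output data: 35 observations cycling through 10, 11, ..., 16
observations = [10 + (i % 7) for i in range(35)]
lower, upper = confidence_interval(observations)
print(lower, upper)
```

Calling the function with a smaller `alpha`, for example 0.01, widens the interval, in line with the determinants of interval width listed earlier.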
Determining an Appropriate Sample Size
• A common problem in simulation
– How many runs and how long should they be?
• Depends on the variability of the sought output variables
• If a symmetric confidence interval of width 2d is desired for a mean performance measure μ
\[ \bar{x} - d \le \mu \le \bar{x} + d \]
– If x-bar is normally distributed
\[ d = \frac{\sigma\, Z_{\alpha/2}}{\sqrt{n}} \;\Rightarrow\; n = \left(\frac{\sigma\, Z_{\alpha/2}}{d}\right)^2 \]
▪ If σ is unknown and estimated with s
\[ n = \left(\frac{s\, Z_{\alpha/2}}{d}\right)^2 \]
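A minimal Python sketch of the sample size calculation follows. The function name and the numbers (s=10 from a pilot run, d=2) are hypothetical, and the result is rounded up since n must be an integer.

```python
import math
from statistics import NormalDist


def required_sample_size(s, d, alpha=0.05):
    """Smallest integer n with Z_{alpha/2} * s / sqrt(n) <= d."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((s * z / d) ** 2)


# Hypothetical pilot estimate s = 10 and desired half-width d = 2
n_required = required_sample_size(s=10, d=2)
print(n_required)
```

Halving d roughly quadruples the required n, which is why tight intervals quickly become expensive in simulation time.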
Hypothesis Testing (I)
1. Testing if a population mean (μ) is equal to, larger than or smaller than a given value
– Suppose that in a sample of n observations the point estimate of μ = x-bar
Hypothesis            Reject H0 if …                                                  Type of test
H0: μ=a, HA: μ≠a      (x-bar - a)/(s/√n) < -Zα/2 or (x-bar - a)/(s/√n) > Zα/2         Symmetric two tail test
H0: μ≥a, HA: μ<a      (x-bar - a)/(s/√n) < -Zα                                        One tail test
H0: μ≤a, HA: μ>a      (x-bar - a)/(s/√n) > Zα                                         One tail test
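The three rejection rules can be sketched as one Python function. The function name, the `alternative` labels and the example numbers are hypothetical. The Normal quantiles come from `statistics.NormalDist`, so the sketch assumes a sample large enough for the Normal approximation.

```python
import math
from statistics import NormalDist


def reject_h0_mean(x_bar, s, n, a, alpha=0.05, alternative="two-sided"):
    """Return True if H0 about the population mean is rejected."""
    z = (x_bar - a) / (s / math.sqrt(n))
    if alternative == "two-sided":   # H0: mu = a,  HA: mu != a
        return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)
    if alternative == "less":        # H0: mu >= a, HA: mu < a
        return z < -NormalDist().inv_cdf(1 - alpha)
    if alternative == "greater":     # H0: mu <= a, HA: mu > a
        return z > NormalDist().inv_cdf(1 - alpha)
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")


# Hypothetical sample: x-bar = 10.5, s = 2, n = 100, tested against a = 10
print(reject_h0_mean(10.5, 2, 100, 10))                      # two tail test
print(reject_h0_mean(10.5, 2, 100, 10, alternative="less"))  # one tail test
```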
Hypothesis Testing (II)
2. Testing if two sample means are significantly different
– Useful when comparing process designs
• A two tail test when σ1=σ2=s
– H0: μ1-μ2=a /typically a=0/
HA: μ1-μ2≠a
– The test statistic Z belongs to a Student-t distribution (with μ1-μ2 set to a under H0)
\[ Z = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \in t(n_1 + n_2 - 2) \]
– Reject H0 on the significance level α if it is not true that
\[ -t_{(n_1+n_2-2),(1-\alpha/2)} \le Z \le t_{(n_1+n_2-2),(1-\alpha/2)} \]
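A possible Python sketch of the two-sample test statistic is shown below. The slide does not say how the common s is obtained; the pooled estimate used here is the standard textbook choice and is an assumption, as are the function name and the example numbers. The critical value is taken from a t-table by the user.

```python
import math


def pooled_test_statistic(x1_bar, s1, n1, x2_bar, s2, n2, a=0.0):
    """Z for H0: mu1 - mu2 = a, assuming equal population variances."""
    # Assumed pooled variance estimate (not given on the slide)
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    s = math.sqrt(pooled_var)
    return (x1_bar - x2_bar - a) / (s * math.sqrt(1 / n1 + 1 / n2))


# Hypothetical results from 10 runs of each of two process designs
z = pooled_test_statistic(x1_bar=12.0, s1=2.0, n1=10,
                          x2_bar=10.0, s2=2.0, n2=10)

critical = 2.101  # t-table value for 18 degrees of freedom, alpha = 0.05
reject = abs(z) > critical
print(z, reject)
```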
Hypothesis Testing (III)
• If the sample sizes are large (n1+n2-2>30)
⇒ Z is approximately N(0, 1) distributed
⇒ Reject H0 if it is not true that
\[ -Z_{\alpha/2} \le Z \le Z_{\alpha/2} \]
• In practice, when comparing designs, non-overlapping 3σ intervals are often used as a criterion
– H0: μ1-μ2>0
– HA: μ1-μ2≤0
Reject H0 if
\[ \bar{x}_1 - \bar{x}_2 + 3\sigma(\bar{x}_1 - \bar{x}_2) = \bar{x}_1 - \bar{x}_2 + 3\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < 0 \]
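The 3σ criterion can be sketched in Python as follows. The function name and the example values are hypothetical. The sketch reads the criterion as: reject H0: μ1-μ2>0 when the whole interval around x1-bar - x2-bar lies below zero, which is an interpretation of the slide rather than a quotation of it.

```python
import math


def three_sigma_interval(x1_bar, s1, n1, x2_bar, s2, n2):
    """Interval (x1_bar - x2_bar) +/- 3 * sqrt(s1^2/n1 + s2^2/n2)."""
    diff = x1_bar - x2_bar
    sigma_diff = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return diff - 3 * sigma_diff, diff + 3 * sigma_diff


# Hypothetical average cycle times for two designs
lower, upper = three_sigma_interval(x1_bar=20.0, s1=3.0, n1=36,
                                    x2_bar=23.0, s2=4.0, n2=64)

# The whole interval below zero suggests design 1 has the lower mean
reject_h0 = upper < 0
print(lower, upper, reject_h0)
```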