Download Random Numbers - University of Calgary

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Mathematics of radio engineering wikipedia , lookup

Large numbers wikipedia , lookup

Addition wikipedia , lookup

Infinite monkey theorem wikipedia , lookup

Karhunen–Loève theorem wikipedia , lookup

Elementary mathematics wikipedia , lookup

Expected value wikipedia , lookup

Law of large numbers wikipedia , lookup

Transcript
CPSC 531
Systems Modeling and Simulation
Random Numbers
Dr. Anirban Mahanti
Department of Computer Science
University of Calgary
[email protected]
Introduction
In simulations, we generate random values for
variables with a specified distribution
E.g., model service times using the exponential
distribution
Generation of random values is a two step
process
Random number generation: Generate random
numbers uniformly distributed between 0 and 1
Random variate generation: Transform the abovegenerated random numbers to obtain numbers
satisfying the desired distribution
Random Numbers
2
Pseudo Random Numbers
Common random number generators determine
the next random number as a function of the
previously generated random number (i.e.,
recursive calculations are applied)
X n = f ( X n −1 , X n − 2 , X n −3 ,...)
Random numbers generated, are therefore,
deterministic. That is, sequence of random
numbers is known a priori given the starting
number (called the seed). For this reason,
random numbers are known as pseudo random.
Random Numbers
3
1
Example
Consider this example:
X n = (5 X n −1 + 1) mod 16
If X 0 = 5, we get X 1 = (5 × 5 + 1) mod 16 = 10
The next 8 numbers in the sequence are
3, 0, 1, 6, 15, 12, 13, 2, ...
Random number generators have a cycle length
(or period) that is measured by the count of
unique numbers generated before the cycle
repeats itself. The example has a cycle of 16.
Some generators do not repeat an initial portion of the
sequence referred to as the “tail” of the sequence.
Random Numbers
4
Important Considerations
Random number generation routines
should be:
Fast
Portable
Have sufficiently long cycle
Replicable (given the same seed)
Closely approximate the ideal statistical
properties of uniformity and independence.
Random Numbers
5
Uniformity and Independence
Uniformity: if the interval [0, 1] is divided
into n sub-intervals of equal length, the
expected number of observations in each
interval N/n, where N is the total number of
observations.
Independence: the probability of observing
a value in a particular interval is
independent of the previous values drawn.
Random Numbers
6
2
Random Number Generators
Linear Congruential Generators (LCG)
Combined LCG
Tausworthe Generators
Extended Fibonacci Generators
Random Numbers
7
Linear Congruential Generator
To produce a sequence of integers, X1, X2, …
between 0 and m-1 by following a recursive
relationship:
X i +1 = ( aX i + c ) mod m ,
The
multiplier
The
increment
i = 0 ,1, 2 ,...
The
modulus
The selection of the values for a, c, m, and X0
drastically affects the statistical properties and the
cycle length.
The random integers are being generated [0,m-1],
and to convert the integers to random numbers:
Ri =
Xi
,
m
i = 1, 2 ,...
Random Numbers
8
LCG Example
Use X0 = 27, a = 17, c = 43, and m = 100.
The Xi and Ri values are:
X1 = (17*27+43) mod 100 = 502 mod 100 = 2, R1 = 0.02;
X2 = (17*2+32) mod 100 = 77, R2 = 0.77;
X3 = (17*77+32) mod 100 = 52,
R3 = 0.52;
…
Random Numbers
9
3
More on LCG
Selection of the values of a, c, m, and X0
affects the statistical properties of the
generator and its cycle length.
If c = 0, the generator is called
Multiplicative LCG.
If c ≠ 0, the generator is called Mixed LCG.
Random Numbers
10
Even more on LCG
Modulus m should be large
Period can be at most m-1
Choose m of the form 2k for efficient computation
If c ≠ 0, the maximum possible period is obtained
iff
Integers m and c are relatively prime
Every prime number that is a factor of m is also a factor
of a-1
If integer m is a multiple of 4, a-1 is also a multiple of 4
Random Numbers
11
Combined LCG
Reason: Longer period generator is needed because of the
increasing complexity of simulated systems.
Approach: Combine two or more multiplicative congruential
generators.
Let Xi,1, Xi,2, …, Xi,k, be the ith output from k different
multiplicative congruential generators.
The jth generator:
Has prime modulus mj and multiplier aj and period is mj-1
Produces integers Xi,j is approx ~ Uniform on integers in [1,
m-1]
Wi,j = Xi,j -1 is approx ~ Uniform on integers in [1, m-2]
Random Numbers
12
4
Combined LCG (continued …)
Suggested form:
⎞
⎛ k
X i = ⎜⎜ ∑ (−1) j −1 X i , j ⎟⎟ mod m1 − 1
⎠
⎝ j =1
⎧ Xi
⎪⎪ m ,
Hence, Ri = ⎨ 1
m −1
⎪ 1 ,
⎩⎪ m1
Xi f 0
Xi = 0
The coefficient:
Performs the
subtraction Xi,1-1
The maximum possible period is:
P=
(m1 − 1)(m2 − 1)...( mk − 1)
2 k −1
Random Numbers
13
Seed Selection
Often we need random numbers for more
than one variable in a simulation
E.g., inter-arrival times, service times
Then, we need to use multiple random
number streams such that we do not
introduce correlations between the two
random variables owing to our choice of
random numbers
Random Numbers
14
Seed Selection (continued …)
Do not use zero
Avoid even values
Do not use a single stream for multiple random
variables
E.g., do not generate {u1, u2, u3, u4, …} and use {u1,
u3, …} for inter-arrival times and the remaining for
service times
Use a separate seed for a separate stream such
that random numbers do not overlap
Do not use random seeds because they are
hard to replicate
Random Numbers
15
5
Testing Random Number Generators
Two categories of test
Test for uniformity
Test for independence
Passing a test is only a necessary
condition and not a sufficient condition
i.e., if a generator fails a test it implies it is bad
but if a generator passes a test it does not
necessarily imply it is good.
Random Numbers
16
More on Testing …
Testing is not necessary if a well-known
simulation package is used or if a welltested generator is used
In what follows, we focus on “empirical”
tests, that is tests that are applied to an
actual sequence of random numbers
Chi-Square Test
KS Test
Random Numbers
17
Chi-Square Test
Prepare a histogram of the empirical data with k
cells
Let Oi and Ei be the observed and expected
frequency of the ith cell, respectively. Compute
the following:
(Oi − Ei ) 2
Ei
i =1
k
χ 02 = ∑
χ 02 has a Chi-Square distribution with (k-1)
degrees of freedom
Random Numbers
18
6
Chi-Square Test (continued …)
Define a null hypothesis, H(0), that
observations come from a specified
distribution
The null hypothesis cannot be rejected at a
significance level of α if
χ 02 < χ[21−α , k − s −1]
Obtained from a table
meaning of significance level
α = P(reject H(0) | H(0) is true)
Random Numbers
19
More on Chi-Square Test
Errors in cells with small Ei’s affect the test
statistics more than cells with large Ei’s.
Minimum size of Ei debated: [BCNN05]
recommends a value of 3 or more; if not
combine adjacent cells.
Test designed for discrete distributions and large
sample sizes only. For continuous distributions,
Chi-Square test is only an approximation (i.e.,
level of significance holds only for n→∞).
Random Numbers
20
Chi-Square Test Example
Example: 500 random numbers generated using a
random number generator; observations categorized
into cells at intervals of 0.1, between 0 and 1. At level of
significance of 0.1, are these numbers IID U(0,1)?
Interval
Oi
1
2
3
4
5
6
7
8
9
10
Ei
50
48
49
42
52
45
63
54
50
47
500
[(Oi-Ei)^2]/Ei
50
50
50
50
50
50
50
50
50
50
0
0.08
0.02
1.28
0.08
0.5
3.38
0.32
0
0.18
5.84
χ 02 = 5.85; from the table χ[20.9,9] = 14.68;
Hypothesis accepted at significance level of 0.10.
Random Numbers
21
7
Kolmogorov-Smirnov (KS) Test
Difference between observed CDF F0(x) and
expected CDF Fe(x) should be small; formalizes
the idea behind the Q-Q plot.
Step 1: Rank observations from smallest to
largest:
Y1 ≤ Y2 ≤ Y3 ≤ … ≤ Yn
Step 2: Define Fe(x) = (#i: Yi ≤ x)/n
Step 3: Compute K as follows:
K = max | Fe ( x) − Fo ( x) |
x
j − 1⎫
⎧j
K = max ⎨ − Fe (Y j ), Fe (Y j ) −
⎬
n ⎭
1≤ j ≤n ⎩ n
Random Numbers
22
KS Test
Example: Test if given population is exponential with
parameter β = 0.01; that is Fe(x) = 1 – e–βx; K[0.9,15] =
1.0298.
KS Test for exponential distribution
beta
Y_j
0.01 N
(j/n)-F(Yj)
1 0.017896
2 0.075098
3 0.141765
4 0.110331
5 0.112134
6 0.077057
7 0.015478
8 0.076684
9 0.086752
10 0.143781
11 0.086788
12 0.02313
13 0.04935
14 0.075607
15 0.101266
MAX
0.143781
j
5
6
6
17
25
39
60
61
72
74
104
150
170
195
229
15
F(Yj)-(j-1)/n
0.048771
-0.00843
-0.0751
-0.04366
-0.04547
-0.01039
0.051188
-0.01002
-0.02009
-0.07711
-0.02012
0.043537
0.017316
-0.00894
-0.0346
0.051188
Random Numbers
23
KS Test
KS test suitable for small samples,
continuous as well as discrete distributions
KS test, unlike the Chi-Square test, uses
each observation in the given sample
without grouping data into cells (intervals).
KS test is exact provided all parameters of
the expected distribution are known.
Random Numbers
24
8
Random Variate Generation
Talk about generating both discrete and
continuous random variates
Inverse Transform Method
The Rejection Method
The Composition Method
Focus on Inverse Transform Method
Discrete Random Variates
Continuous Random Variate
Random Numbers
25
Discrete Random Variates
Suppose we want to generate the value of a
discrete random variable X that has the following
PMF P{ X = x } = p , j = 0,1,...; p = 1
∑ j
j
j
j
The Inverse Transform Method is as follows:
Generate u = U(0,1) and set X as follows:
⎧ x0
⎪
⎪ x1
⎪.
⎪
⎪.
⎪
X =⎨ .
⎪x
⎪ j
⎪.
⎪
⎪.
⎪.
⎩
if u < p0
if p0 ≤ u < p0 + p1
j −1
j
i =1
i =1
if ∑ pi ≤ u < ∑ pi
Random Numbers
26
Does X have the Desired Distribution?
Let 0 < a < b < 1.
P{a ≤ u < b} = b − a because F ( x) = x
for the uniform distribution for all valid x.
j
⎧⎪ j −1
⎫⎪
P{ X = x j } = P ⎨∑ pi ≤ u <∑ pi ⎬
⎪⎩ i =1
i =1 ⎪
⎭
= pj
Thus X has the desired distribution.
Random Numbers
27
9
Discrete Random Variate Algorithm
Algorithm
Generate u=U(0,1)
If u<p0, set X=x0, return;
If u<p0+p1, set X=x1, return;
…
After generating a random number u, we find the
value of X by determining the interval
[F(xj-1),F(xj)] in which u lies. That is, the inverse
of F(u) is determined.
Time complexity of generating a random
variable is proportional to the number of
intervals to be searched.
Random Numbers
28
Example
Assume p1= 0.2, p2 = 0.1, p3 = 0.25, p4 = 0.45.
Write an algorithm to generate random variates
such that P{X=j} = pj?
Version 2 (efficient)
Version 1
Generate u=U(0,1)
If u<0.45, set X=4,
return;
If u<0.7, set X=3,
return;
If u<0.9, set X=1,
return;
Set X=2, return;
Generate u=U(0,1)
If u<0.2, set X=1,
return;
If u<0.3, set X=2,
return;
If u<0.55, set X=3,
return;
Set X=4, return;
Random Numbers
29
Geometric Random Variables
X is a geometric random variable with parameter
p if it has the following probability mass function:
P{ X = i} = pq i −1, i ≥ 1, where q = 1 − p
Note that :
j −1
F ({ X = j − 1}) = ∑ P{X = i} = 1 − P{ X > j − 1}
i =1
= 1 − P{first j - 1 trials are all failures}
= 1 − q j −1, j ≥ 1
Generate value of X by generating random
number u and setting X to j such that:
1 - qj-1 ≤ u < 1 – qj
Random Numbers
30
10
Geometric Random Variables
Note that 1 - qj-1 ≤ u < 1 – qj is equivalent to
qj < 1 - u ≤ qj-1
Therefore, X = min{j: qj < 1 – u}
⎛ log(1 − u ) ⎞
⎟ +1
X = Int ⎜⎜
⎟
⎝ log(q) ⎠
Note that (1 - u) is also uniformly distributed on (0,1). Thus,
⎛ log(u ) ⎞
⎟⎟ + 1
X = Int ⎜⎜
⎝ log(q) ⎠
Random Numbers
31
Generating Poisson Random Variables
Poisson Distributioni with rate λ.
pi = P{ X = i} = e − λ
λ
,i ≥ 0
i!
A recursive relation for pi can be derived :
pi +1 =
λ
pi , i ≥ 0
i +1
Poisson RV Algorithm (λ)
1. Generate u=U(0,1);
2. i=0, p = e-λ, F=p;
3. if u < F, set X = i and return;
4. p= λp/(i+1), F = F+p, i=i+1;
5. Go to step 3;
Random Numbers
32
Continuous Random Variates
The inverse transform method relies on
the following proposition: Let U be a
uniform random variable. For any
continuous random variable X, the CDF of
X has the following relationship with U:
X = F-1(U)
Here F-1(u) is defined as the value of x at
which F(x)=u.
Random Numbers
33
11
Exponential Random Variables
Suppose X is an exponential random variable with rate
λ. The CDF of X is defined as follows:
F(x) = 1 – e-λx for x ≥ 0
Generate X1, X2, X3 … as follows:
Ui = F ( X i ) = 1 − e
e
−λX i
ln(e
− λX i
= 1 − Ui
−λ X i
) = ln(1 − U i )
− λX i ln(e) = ln(1 − U i )
Xi = −
1
λ
ln(1 − U i )
Random Numbers
34
Weibull Random Variable
PDF of a Weibull random variable X is:
⎧ β β −1
⎪
x . −( x / α ) β
,x≥0
f (x ) = ⎨ α β
e
⎪⎩0, otherwise
The above form assumes that location parameter is zero.
β
F ( X i ) = 1 − e −( X i / α ) . Set
U i = 1 − e −( X i / α )
β
X i = −α [ln(1 − U i )]1 / β .
Random Numbers
35
Acceptance-Rejection Method
Another technique for generating random
variates. Consider the discrete case first.
If we have a technique for simulating a
random variable Y with PMF qj, j ≥ 0, we
can use it for generating random variable
X with PMF pj, j ≥ 0, by generating values
of Y with PMF {qj} and accepting these
values with a probability proportional to
pY /qY.
Random Numbers
36
12
Acceptance-Rejection Algorithm
To generate a value do the following:
Step 0: Choose a constant c, pY /qY ≤ c for all y
Step 1: Simulate the value of Y (where Y has
PMF qj ).
Step 2: Generate a uniformly distributed random
number between 0 and 1, U.
Step 3: If U ≤ pY /(cqY), set X = Y. Otherwise,
return to step 1.
The equivalent method for continuous
random variables is given next.
Random Numbers
37
Acceptance-Rejection for
Continuous Random Variables
Step 0: Find a constant c such that
f(x) / g(x) ≤ c for all x
Step 1: Generate Y having density
function g (i.e., we have a method for
generating continuous random variates
that follow the distribution function g).
Step 2: Generate a random number U
Step 3: If U ≤ f(Y) / (c.g(Y)), set X = Y.
Otherwise, return to step 1.
Random Numbers
38
Acceptance-Rejection Method
Useful particularly when inverse CDF does not exist in
closed form.
Illustration: To generate random variates, X ~ U(1/4, 1)
Generate R
Procedures:
Step 1. Generate R ~ U[0,1]
Step 2a. If R >= ¼, accept X=R.
Step 2b. If R < ¼, reject R, return
to Step 1
no
Condition
yes
Output R’
R does not have the desired distribution, but R conditioned
(R’) on the event {R ≥ ¼} does.
Efficiency: Depends heavily on the ability to minimize the
number of rejections.
Random Numbers
39
13
Special Properties
Based on features of particular family of
probability distributions
For example:
Direct Transformation for normal and
lognormal distributions
Convolution
Beta distribution (from gamma distribution)
Random Numbers
Direct Transformation
40
[BCNN05]
Approach for normal(0,1):
Consider two standard normal random variables, Z1
and Z2, plotted as a point in the plane:
In polar coordinates:
Z1 = B cos φ
Z2 = B sin φ
B2 = Z21 + Z22 ~ chi-square distribution with 2
degrees of freedom = Exp(λ = 2). Hence, B = (−2 ln R)1/ 2
The radius B and angle φ are mutually independent.
Z1 = (−2 ln R)1/ 2 cos(2πR2 )
Z 2 = (−2 ln R)1/ 2 sin(2πR2 )
Random Numbers
Direct Transformation
41
[BCNN05]
Approach for normal(μ,σ2):
Generate Zi ~ N(0,1)
Xi = μ + σ Zi
Approach for lognormal(μ,σ2):
Generate X ~ N((μ,σ2)
Yi = eXi
Random Numbers
42
14
References
Random Number Generation notes are
based on the famous book by Raj Jain
“The Art of Computer Systems
Performance Evaluation”. This book is
published by Wiley.
Random variate generation notes are
based on the famous Sheldon Ross
textbook “Simulation” . This book is
published by Academic Press.
Random Numbers
43
15