Random Number Generation
Random Number Generators
Without random numbers, we cannot do Stochastic
Simulation
Most computer languages have a subroutine, object or
function generating random numbers (uniformly
distributed)
Simulation languages provide more than that (you can get
random samples from many distributions)
How do they generate it?
How can we test their randomness?
Properties of Random Numbers
A sequence of random numbers, R1, R2, . . ., must have
two important statistical properties: uniformity and
independence
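Each random number Ri should be an independent sample
from the continuous uniform distribution between 0 and 1:
\[ f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases} \]
\[ E(R_i) = \int_0^1 x \, dx = \frac{1}{2} \]
\[ \mathrm{Var}(R_i) = \int_0^1 x^2 \, dx - [E(R_i)]^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12} \]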
Properties of Random Numbers
Some consequences of the uniformity and
independence properties are


If the interval (0,1) is divided into n subintervals of
equal length, the expected number of observations in
each interval is N/n, where N is the total number of
observations
The probability of observing a value in a particular
interval is independent of the previous values drawn
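The first consequence can be checked empirically. The following is a minimal sketch, assuming Python's built-in `random` module as a stand-in generator; the helper name `bin_counts` and the values N = 10,000 and n = 10 are illustrative choices, not part of the original slides.

```python
import random

def bin_counts(samples, n_bins):
    """Count how many samples fall in each of n_bins equal subintervals of [0, 1]."""
    counts = [0] * n_bins
    for r in samples:
        # min() keeps a sample equal to 1.0 inside the last subinterval
        index = min(int(r * n_bins), n_bins - 1)
        counts[index] += 1
    return counts

rng = random.Random(12345)   # fixed seed so the run can be repeated
N, n = 10_000, 10
samples = [rng.random() for _ in range(N)]
counts = bin_counts(samples, n)

print("expected per interval:", N / n)
print("observed per interval:", counts)
```

The observed counts should scatter around N/n without matching it exactly; how much scatter is acceptable is what a formal uniformity test decides.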
Pseudo-Random Numbers
pseu·do adj. False or counterfeit; fake.
[American Heritage Dictionary]
These numbers are called pseudo-random because they are
produced by a deterministic algorithm
Given the method, the set of random numbers produced
can be replicated
Thus they are not truly random
They simply imitate the properties of uniform distribution
and independence
A statistical test should conclude they are indistinguishable
from true random numbers
Thus, they can be used for all practical purposes instead of
true random numbers
Considerations in Generating
Pseudo-Random Numbers
The routine should be fast (a simulation may
require billions of random numbers)
The routine should be portable to different
environments
The routine should have a sufficiently long cycle
The random numbers generated should be
replicable (Useful in debugging and comparing
systems)
The random numbers generated should imitate the
statistical properties of uniformity and
independence
Linear Congruential Method
This method produces a sequence of integers between 0
and m-1 according to the following recursive relationship:
\[ X_{i+1} = (a X_i + c) \bmod m, \quad i = 0, 1, 2, \ldots \]
The initial value “X0” is called the seed, “a” the constant
multiplier, “c” the increment, and “m” the modulus
If c ≠ 0, the form is called the mixed congruential method;
if c = 0, the form is called the multiplicative congruential
method
The choice of the parameters affects the statistical
properties and the cycle length
To convert the integers to random numbers use Ri = Xi /m
Example
X0 = 27, a = 17, c = 43, and m = 100 (Ri = Xi/100)

X0 = 27
X1 = (17*27 + 43) mod 100 = 502 mod 100 = 2     (R1 = 0.02)
X2 = (17*2 + 43) mod 100 = 77 mod 100 = 77      (R2 = 0.77)
X3 = (17*77 + 43) mod 100 = 1352 mod 100 = 52   (R3 = 0.52)
...
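The recursion in this example can be sketched in a few lines of Python. The function name `lcg` is an illustrative choice; the parameters are the ones from the example (X0 = 27, a = 17, c = 43, m = 100).

```python
def lcg(seed, a, c, m):
    """Yield X1, X2, ... from the recursion X_{i+1} = (a * X_i + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = lcg(seed=27, a=17, c=43, m=100)
xs = [next(gen) for _ in range(3)]   # X1, X2, X3
rs = [x / 100 for x in xs]           # Ri = Xi / m

print(xs)  # [2, 77, 52]
print(rs)  # [0.02, 0.77, 0.52]
```

A generator function fits here because the sequence has no natural end: the caller draws as many values as the simulation needs.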
Linear Congruential Method
Maximum density (no gaps in the distribution):



The numbers generated with this method can only assume values from
the set I = {0, 1/m, 2/m, . . ., (m-1)/m}
Thus Ri’s are actually distributed with a discrete distribution defined
over I
This can be accepted as an approximation given that m is large enough
Maximum period (avoid cycling, autocorrelation)




Cycle length (P) depends on the choice of parameters (it can never
exceed m)
For m = 2^b, c ≠ 0 and relatively prime to m, and a = 1+4k, the longest
possible period is P = m.
For m = 2^b, c = 0, X0 (seed) odd, and a = 3+8k or a = 5+8k, the longest
possible period is P = m/4
For m a prime number, c = 0, and a chosen so that the smallest integer k
with a^k - 1 divisible by m is k = m-1, the longest possible period is P = m-1
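For small moduli the period rules can be checked by brute force. This is a sketch only: the function name `period` and the specific parameter sets (c = 7, a = 13, m = 64, and a = 3 with m = 31) are assumptions picked to satisfy the three cases, and the approach is practical only when m is small enough to walk a whole cycle.

```python
def period(seed, a, c, m):
    """Return the cycle length of X_{i+1} = (a * X_i + c) mod m from the given seed."""
    seen = {}                 # value -> step at which it first appeared
    x, step = seed, 0
    while x not in seen:
        seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - seen[x]     # length of the loop, excluding any lead-in

# m = 2^6, c odd (relatively prime to m), a = 1 + 4k with k = 3
print(period(seed=0, a=13, c=7, m=64))   # 64 = m
# m = 2^6, c = 0, odd seed, a = 5 + 8k with k = 1
print(period(seed=1, a=13, c=0, m=64))   # 16 = m/4
# m = 31 (prime), c = 0, a = 3
print(period(seed=1, a=3, c=0, m=31))    # 30 = m - 1
```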
Example
Use multiplicative congruential method with a = 13,
m = 2^6 = 64 and X0 = 1, 2, 3, 4

i     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
Xi    1 13 41 21 17 29 57 37 33 45  9 53 49 61 25  5  1
Xi    2 26 18 42 34 58 50 10  2
Xi    3 39 59 63 51 23 43 47 35  7 27 31 19 55 11 15  3
Xi    4 52 36 20  4

For X0 = 1, I = {1, 5, 9, 13, …, 53, 57, 61}; Gap = 4/64 = 0.0625
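The table can be regenerated with a short script. The helper name `orbit` is an illustrative choice; it relies on the fact that, with a odd and m a power of two, the multiplicative recursion always returns to its seed.

```python
def orbit(seed, a, m):
    """Return the values X0, X1, ... of X_{i+1} = (a * X_i) mod m up to, not including, the first repeat of X0."""
    values = [seed]
    x = (a * seed) % m
    while x != seed:
        values.append(x)
        x = (a * x) % m
    return values

a, m = 13, 64
for seed in (1, 2, 3, 4):
    values = orbit(seed, a, m)
    print(f"X0 = {seed}: period {len(values)}, values {values}")

# Spacing between neighbouring values reached from X0 = 1
reachable = sorted(orbit(1, a, m))
gap = (reachable[1] - reachable[0]) / m
print(gap)  # 0.0625
```

Only the odd seeds 1 and 3 reach the maximum period m/4 = 16; the even seeds 2 and 4 cycle after 8 and 4 values.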
Combined Linear Congruential
Generators
To simulate more complex systems, the simulation runs
need to go through larger numbers of elementary events
Such runs therefore consume more
random numbers
For reliable runs, pseudo-random generators
with longer periods are needed (so that cycling is
avoided during the run)
It is possible to combine two or more multiplicative
congruential generators in such a way that the combined
generator has good statistical properties and a longer
period
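The slides do not say which generators are combined or how. As one possible illustration, the sketch below assumes the two-generator scheme usually attributed to L'Ecuyer (1988): two multiplicative generators with a1 = 40014, m1 = 2147483563 and a2 = 40692, m2 = 2147483399, combined by subtraction modulo m1 - 1. These constants and the function name `combined_lcg` are not taken from this document.

```python
M1, A1 = 2147483563, 40014   # first multiplicative generator (assumed parameters)
M2, A2 = 2147483399, 40692   # second multiplicative generator (assumed parameters)

def combined_lcg(seed1, seed2):
    """Yield uniform numbers in (0, 1) from two combined multiplicative generators.

    seed1 should lie in [1, M1 - 1] and seed2 in [1, M2 - 1].
    """
    x1, x2 = seed1, seed2
    while True:
        x1 = (A1 * x1) % M1
        x2 = (A2 * x2) % M2
        x = (x1 - x2) % (M1 - 1)          # Python's % is non-negative for a positive modulus
        # Map x = 0 to the top of the range so that 0.0 is never returned
        yield x / M1 if x > 0 else (M1 - 1) / M1

gen = combined_lcg(seed1=12345, seed2=67890)
print([round(next(gen), 6) for _ in range(5)])
```

The combined period is much longer than either component's because the two cycles have different lengths and fall back into step only after a very large number of draws.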
Random-Numbers Streams and
Seeds
The seed for a random-number generator:


Is the integer value X0 that initializes the random-number sequence.
Any value in the sequence can be used to “seed” the generator.
A random-number stream:


Refers to random numbers obtained by using a starting seed
If the streams are b values apart, then stream i can be defined by the starting
seed:
\[ S_i = X_{b(i-1)} \]

Older generators: b = 10^5; newer generators: b = 10^37.
A single random-number generator with k streams can act like
k distinct virtual random-number generators
When comparing two or more alternative systems, it is
advantageous to dedicate portions of the pseudo-random number
sequence to the same purpose in each of the simulated systems.
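The stream seeds S_i = X_{b(i-1)} can be computed without stepping through every intermediate value. The sketch below is a hedged illustration that assumes a multiplicative generator with a = 16807 and m = 2^31 - 1 and a small spacing b; the function name `stream_seeds` is hypothetical. It relies on the identity X_{n+b} = (a^b mod m) * X_n mod m, which holds only when c = 0.

```python
A, M = 16807, 2**31 - 1   # assumed multiplicative generator, c = 0

def stream_seeds(x0, b, k):
    """Return [S_1, ..., S_k] with S_i = X_{b(i-1)}, starting from X_0 = x0."""
    jump = pow(A, b, M)        # a^b mod m advances the sequence by b steps at once
    seeds = []
    x = x0
    for _ in range(k):
        seeds.append(x)
        x = (jump * x) % M
    return seeds

# Four streams spaced b = 100,000 values apart
print(stream_seeds(x0=1, b=100_000, k=4))
```

When two alternative systems are compared, giving stream 1 to arrivals and stream 2 to service times in both models (for example) means both systems see the same random inputs for the same purpose.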
Pseudo-Random Number
Generation in SIMAN
SIMAN employs the multiplicative congruential method
with a = 16807, m = 2^31 - 1 = 2,147,483,647
This is an almost full-period generator
For any initial seed between 1 and 2^31 - 2, all unnormalized
random numbers between 1 and 2^31 - 2 are generated
exactly once before the generator cycles again (P = 2^31 - 2)
A SIMAN model may employ several different random-number streams
Each stream is generated by using the same generator with
different initial seed values
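A generator with these parameters can be sketched as follows. This is an illustration of the recursion with the constants quoted above, not SIMAN's actual code; the function name `multiplicative_lcg` and the seed values are assumptions.

```python
A = 16807
M = 2**31 - 1   # 2,147,483,647

def multiplicative_lcg(seed):
    """Yield (X_i, R_i) pairs from X_{i+1} = 16807 * X_i mod (2^31 - 1), R_i = X_i / m."""
    if not 1 <= seed <= M - 1:
        raise ValueError("seed must lie between 1 and 2^31 - 2")
    x = seed
    while True:
        x = (A * x) % M
        yield x, x / M

# Two streams from the same generator, distinguished only by their seeds
stream_1 = multiplicative_lcg(seed=1)
stream_2 = multiplicative_lcg(seed=123456789)

print([next(stream_1)[0] for _ in range(2)])  # [16807, 282475249]
print(next(stream_2))
```

Python integers do not overflow, so the product 16807 * X_i needs no special handling; fixed-width languages need a 64-bit intermediate or a decomposition trick.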
Pseudo-Random Number
Generation in ARENA
With today’s computing power, cycling can
occur within minutes of simulation with a cycle
length of about 2 billion (2.1x10^9)
ARENA thus uses a combined multiple recursive
generator which combines two separate generators
\[ A_n = (1403580 \, A_{n-2} - 810728 \, A_{n-3}) \bmod 4294967087 \]
\[ B_n = (527612 \, B_{n-1} - 1370589 \, B_{n-3}) \bmod 4294944443 \]
\[ X_n = (A_n - B_n) \bmod 4294967087 \]
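The two recursions and their combination can be sketched as below. The operators between terms did not survive in this transcript, so the sketch assumes subtraction throughout, as in L'Ecuyer's published MRG32k3a generator, whose multipliers and moduli match these. The function name `combined_mrg` and the default seed values of 12345 are illustrative assumptions, and this is not Arena's actual implementation.

```python
from collections import deque

M_A = 4294967087
M_B = 4294944443

def combined_mrg(seed_a=(12345, 12345, 12345), seed_b=(12345, 12345, 12345)):
    """Yield X_n = (A_n - B_n) mod M_A.

    seed_a holds (A_{n-3}, A_{n-2}, A_{n-1}); seed_b holds the same lags for B.
    """
    a = deque(seed_a, maxlen=3)   # a[0] = A_{n-3}, a[1] = A_{n-2}, a[2] = A_{n-1}
    b = deque(seed_b, maxlen=3)
    while True:
        a_n = (1403580 * a[1] - 810728 * a[0]) % M_A
        b_n = (527612 * b[2] - 1370589 * b[0]) % M_B
        a.append(a_n)             # the oldest lag is dropped automatically
        b.append(b_n)
        yield (a_n - b_n) % M_A

gen = combined_mrg()
print([next(gen) for _ in range(3)])
```

Python's `%` always returns a non-negative result for a positive modulus, so the subtractions need no extra correction; languages where the remainder takes the sign of the dividend would have to add the modulus back.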
Pseudo-Random Number
Generation in ARENA
The cycle length of this generator is 3.1x10^57
This is inexhaustible at current computing
speeds
Just generating them would take about 10^40 millennia
(a millennium is a thousand years) on a 2 GHz PC
The Arena generator has a facility to split this cycle
into 1.8x10^19 separate streams, each of length
1.7x10^38
Each stream is further subdivided into 2.3x10^15
separate substreams of length 7.6x10^22 apiece
Tests for Random Numbers
Two categories:

Testing for uniformity:
H0: Ri ~ U[0,1]
H1: Ri ≁ U[0,1]

Failure to reject the null hypothesis, H0, means that evidence of non-uniformity has not been detected.
Testing for independence:
H0: the Ri are independent
H1: the Ri are not independent

Failure to reject the null hypothesis, H0, means that evidence of
dependence has not been detected.
Level of significance α, the probability of rejecting H0 when it is
true:
α = P(reject H0 | H0 is true)
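The slides name the two hypotheses but no specific test. As a hedged illustration, the sketch below uses a chi-square statistic for uniformity and a lag-1 sample autocorrelation as a rough indicator of dependence; both are choices made here, as are the function names, the 10 subintervals, and the use of Python's `random` module as the generator under test. The critical value 16.919 is the tabulated chi-square value for 9 degrees of freedom at α = 0.05.

```python
import random

CHI2_CRITICAL = 16.919   # chi-square table value, 9 degrees of freedom, alpha = 0.05

def chi_square_uniformity(samples, n_bins=10):
    """Return the sum over bins of (observed - expected)^2 / expected."""
    expected = len(samples) / n_bins
    counts = [0] * n_bins
    for r in samples:
        counts[min(int(r * n_bins), n_bins - 1)] += 1
    return sum((observed - expected) ** 2 / expected for observed in counts)

def lag1_autocorrelation(samples):
    """Return the lag-1 sample autocorrelation (near 0 for independent values)."""
    mean = sum(samples) / len(samples)
    numerator = sum((samples[i] - mean) * (samples[i + 1] - mean)
                    for i in range(len(samples) - 1))
    denominator = sum((r - mean) ** 2 for r in samples)
    return numerator / denominator

rng = random.Random(2024)
samples = [rng.random() for _ in range(1000)]

statistic = chi_square_uniformity(samples)
print("chi-square statistic:", round(statistic, 3))
print("reject H0 (uniformity) at alpha = 0.05:", statistic > CHI2_CRITICAL)
print("lag-1 autocorrelation:", round(lag1_autocorrelation(samples), 4))
```

A statistic below the critical value means only that non-uniformity was not detected. The autocorrelation printed here is a descriptive estimate; a formal independence test would compare it against its sampling distribution.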
Tests for Random Numbers
When to use these tests:


If a well-known simulation language or random-number
generator is used, it is probably unnecessary to test
If the generator is not explicitly known or documented,
e.g., spreadsheet programs, symbolic/numerical calculators,
tests should be applied to many sample numbers.
Types of tests:


Theoretical tests: evaluate the choices of m, a, and c
without actually generating any numbers
Empirical tests: applied to actual sequences of numbers
produced.