* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Random Numbers - University of Calgary
Survey
Document related concepts
Transcript
CPSC 531 Systems Modeling and Simulation Random Numbers Dr. Anirban Mahanti Department of Computer Science University of Calgary [email protected] Introduction In simulations, we generate random values for variables with a specified distribution E.g., model service times using the exponential distribution Generation of random values is a two step process Random number generation: Generate random numbers uniformly distributed between 0 and 1 Random variate generation: Transform the abovegenerated random numbers to obtain numbers satisfying the desired distribution Random Numbers 2 Pseudo Random Numbers Common random number generators determine the next random number as a function of the previously generated random number (i.e., recursive calculations are applied) X n = f ( X n −1 , X n − 2 , X n −3 ,...) Random numbers generated, are therefore, deterministic. That is, sequence of random numbers is known a priori given the starting number (called the seed). For this reason, random numbers are known as pseudo random. Random Numbers 3 1 Example Consider this example: X n = (5 X n −1 + 1) mod 16 If X 0 = 5, we get X 1 = (5 × 5 + 1) mod 16 = 10 The next 8 numbers in the sequence are 3, 0, 1, 6, 15, 12, 13, 2, ... Random number generators have a cycle length (or period) that is measured by the count of unique numbers generated before the cycle repeats itself. The example has a cycle of 16. Some generators do not repeat an initial portion of the sequence referred to as the “tail” of the sequence. Random Numbers 4 Important Considerations Random number generation routines should be: Fast Portable Have sufficiently long cycle Replicable (given the same seed) Closely approximate the ideal statistical properties of uniformity and independence. Random Numbers 5 Uniformity and Independence Uniformity: if the interval [0, 1] is divided into n sub-intervals of equal length, the expected number of observations in each interval N/n, where N is the total number of observations. Independence: the probability of observing a value in a particular interval is independent of the previous values drawn. Random Numbers 6 2 Random Number Generators Linear Congruential Generators (LCG) Combined LCG Tausworthe Generators Extended Fibonacci Generators Random Numbers 7 Linear Congruential Generator To produce a sequence of integers, X1, X2, … between 0 and m-1 by following a recursive relationship: X i +1 = ( aX i + c ) mod m , The multiplier The increment i = 0 ,1, 2 ,... The modulus The selection of the values for a, c, m, and X0 drastically affects the statistical properties and the cycle length. The random integers are being generated [0,m-1], and to convert the integers to random numbers: Ri = Xi , m i = 1, 2 ,... Random Numbers 8 LCG Example Use X0 = 27, a = 17, c = 43, and m = 100. The Xi and Ri values are: X1 = (17*27+43) mod 100 = 502 mod 100 = 2, R1 = 0.02; X2 = (17*2+32) mod 100 = 77, R2 = 0.77; X3 = (17*77+32) mod 100 = 52, R3 = 0.52; … Random Numbers 9 3 More on LCG Selection of the values of a, c, m, and X0 affects the statistical properties of the generator and its cycle length. If c = 0, the generator is called Multiplicative LCG. If c ≠ 0, the generator is called Mixed LCG. Random Numbers 10 Even more on LCG Modulus m should be large Period can be at most m-1 Choose m of the form 2k for efficient computation If c ≠ 0, the maximum possible period is obtained iff Integers m and c are relatively prime Every prime number that is a factor of m is also a factor of a-1 If integer m is a multiple of 4, a-1 is also a multiple of 4 Random Numbers 11 Combined LCG Reason: Longer period generator is needed because of the increasing complexity of simulated systems. Approach: Combine two or more multiplicative congruential generators. Let Xi,1, Xi,2, …, Xi,k, be the ith output from k different multiplicative congruential generators. The jth generator: Has prime modulus mj and multiplier aj and period is mj-1 Produces integers Xi,j is approx ~ Uniform on integers in [1, m-1] Wi,j = Xi,j -1 is approx ~ Uniform on integers in [1, m-2] Random Numbers 12 4 Combined LCG (continued …) Suggested form: ⎞ ⎛ k X i = ⎜⎜ ∑ (−1) j −1 X i , j ⎟⎟ mod m1 − 1 ⎠ ⎝ j =1 ⎧ Xi ⎪⎪ m , Hence, Ri = ⎨ 1 m −1 ⎪ 1 , ⎩⎪ m1 Xi f 0 Xi = 0 The coefficient: Performs the subtraction Xi,1-1 The maximum possible period is: P= (m1 − 1)(m2 − 1)...( mk − 1) 2 k −1 Random Numbers 13 Seed Selection Often we need random numbers for more than one variable in a simulation E.g., inter-arrival times, service times Then, we need to use multiple random number streams such that we do not introduce correlations between the two random variables owing to our choice of random numbers Random Numbers 14 Seed Selection (continued …) Do not use zero Avoid even values Do not use a single stream for multiple random variables E.g., do not generate {u1, u2, u3, u4, …} and use {u1, u3, …} for inter-arrival times and the remaining for service times Use a separate seed for a separate stream such that random numbers do not overlap Do not use random seeds because they are hard to replicate Random Numbers 15 5 Testing Random Number Generators Two categories of test Test for uniformity Test for independence Passing a test is only a necessary condition and not a sufficient condition i.e., if a generator fails a test it implies it is bad but if a generator passes a test it does not necessarily imply it is good. Random Numbers 16 More on Testing … Testing is not necessary if a well-known simulation package is used or if a welltested generator is used In what follows, we focus on “empirical” tests, that is tests that are applied to an actual sequence of random numbers Chi-Square Test KS Test Random Numbers 17 Chi-Square Test Prepare a histogram of the empirical data with k cells Let Oi and Ei be the observed and expected frequency of the ith cell, respectively. Compute the following: (Oi − Ei ) 2 Ei i =1 k χ 02 = ∑ χ 02 has a Chi-Square distribution with (k-1) degrees of freedom Random Numbers 18 6 Chi-Square Test (continued …) Define a null hypothesis, H(0), that observations come from a specified distribution The null hypothesis cannot be rejected at a significance level of α if χ 02 < χ[21−α , k − s −1] Obtained from a table meaning of significance level α = P(reject H(0) | H(0) is true) Random Numbers 19 More on Chi-Square Test Errors in cells with small Ei’s affect the test statistics more than cells with large Ei’s. Minimum size of Ei debated: [BCNN05] recommends a value of 3 or more; if not combine adjacent cells. Test designed for discrete distributions and large sample sizes only. For continuous distributions, Chi-Square test is only an approximation (i.e., level of significance holds only for n→∞). Random Numbers 20 Chi-Square Test Example Example: 500 random numbers generated using a random number generator; observations categorized into cells at intervals of 0.1, between 0 and 1. At level of significance of 0.1, are these numbers IID U(0,1)? Interval Oi 1 2 3 4 5 6 7 8 9 10 Ei 50 48 49 42 52 45 63 54 50 47 500 [(Oi-Ei)^2]/Ei 50 50 50 50 50 50 50 50 50 50 0 0.08 0.02 1.28 0.08 0.5 3.38 0.32 0 0.18 5.84 χ 02 = 5.85; from the table χ[20.9,9] = 14.68; Hypothesis accepted at significance level of 0.10. Random Numbers 21 7 Kolmogorov-Smirnov (KS) Test Difference between observed CDF F0(x) and expected CDF Fe(x) should be small; formalizes the idea behind the Q-Q plot. Step 1: Rank observations from smallest to largest: Y1 ≤ Y2 ≤ Y3 ≤ … ≤ Yn Step 2: Define Fe(x) = (#i: Yi ≤ x)/n Step 3: Compute K as follows: K = max | Fe ( x) − Fo ( x) | x j − 1⎫ ⎧j K = max ⎨ − Fe (Y j ), Fe (Y j ) − ⎬ n ⎭ 1≤ j ≤n ⎩ n Random Numbers 22 KS Test Example: Test if given population is exponential with parameter β = 0.01; that is Fe(x) = 1 – e–βx; K[0.9,15] = 1.0298. KS Test for exponential distribution beta Y_j 0.01 N (j/n)-F(Yj) 1 0.017896 2 0.075098 3 0.141765 4 0.110331 5 0.112134 6 0.077057 7 0.015478 8 0.076684 9 0.086752 10 0.143781 11 0.086788 12 0.02313 13 0.04935 14 0.075607 15 0.101266 MAX 0.143781 j 5 6 6 17 25 39 60 61 72 74 104 150 170 195 229 15 F(Yj)-(j-1)/n 0.048771 -0.00843 -0.0751 -0.04366 -0.04547 -0.01039 0.051188 -0.01002 -0.02009 -0.07711 -0.02012 0.043537 0.017316 -0.00894 -0.0346 0.051188 Random Numbers 23 KS Test KS test suitable for small samples, continuous as well as discrete distributions KS test, unlike the Chi-Square test, uses each observation in the given sample without grouping data into cells (intervals). KS test is exact provided all parameters of the expected distribution are known. Random Numbers 24 8 Random Variate Generation Talk about generating both discrete and continuous random variates Inverse Transform Method The Rejection Method The Composition Method Focus on Inverse Transform Method Discrete Random Variates Continuous Random Variate Random Numbers 25 Discrete Random Variates Suppose we want to generate the value of a discrete random variable X that has the following PMF P{ X = x } = p , j = 0,1,...; p = 1 ∑ j j j j The Inverse Transform Method is as follows: Generate u = U(0,1) and set X as follows: ⎧ x0 ⎪ ⎪ x1 ⎪. ⎪ ⎪. ⎪ X =⎨ . ⎪x ⎪ j ⎪. ⎪ ⎪. ⎪. ⎩ if u < p0 if p0 ≤ u < p0 + p1 j −1 j i =1 i =1 if ∑ pi ≤ u < ∑ pi Random Numbers 26 Does X have the Desired Distribution? Let 0 < a < b < 1. P{a ≤ u < b} = b − a because F ( x) = x for the uniform distribution for all valid x. j ⎧⎪ j −1 ⎫⎪ P{ X = x j } = P ⎨∑ pi ≤ u <∑ pi ⎬ ⎪⎩ i =1 i =1 ⎪ ⎭ = pj Thus X has the desired distribution. Random Numbers 27 9 Discrete Random Variate Algorithm Algorithm Generate u=U(0,1) If u<p0, set X=x0, return; If u<p0+p1, set X=x1, return; … After generating a random number u, we find the value of X by determining the interval [F(xj-1),F(xj)] in which u lies. That is, the inverse of F(u) is determined. Time complexity of generating a random variable is proportional to the number of intervals to be searched. Random Numbers 28 Example Assume p1= 0.2, p2 = 0.1, p3 = 0.25, p4 = 0.45. Write an algorithm to generate random variates such that P{X=j} = pj? Version 2 (efficient) Version 1 Generate u=U(0,1) If u<0.45, set X=4, return; If u<0.7, set X=3, return; If u<0.9, set X=1, return; Set X=2, return; Generate u=U(0,1) If u<0.2, set X=1, return; If u<0.3, set X=2, return; If u<0.55, set X=3, return; Set X=4, return; Random Numbers 29 Geometric Random Variables X is a geometric random variable with parameter p if it has the following probability mass function: P{ X = i} = pq i −1, i ≥ 1, where q = 1 − p Note that : j −1 F ({ X = j − 1}) = ∑ P{X = i} = 1 − P{ X > j − 1} i =1 = 1 − P{first j - 1 trials are all failures} = 1 − q j −1, j ≥ 1 Generate value of X by generating random number u and setting X to j such that: 1 - qj-1 ≤ u < 1 – qj Random Numbers 30 10 Geometric Random Variables Note that 1 - qj-1 ≤ u < 1 – qj is equivalent to qj < 1 - u ≤ qj-1 Therefore, X = min{j: qj < 1 – u} ⎛ log(1 − u ) ⎞ ⎟ +1 X = Int ⎜⎜ ⎟ ⎝ log(q) ⎠ Note that (1 - u) is also uniformly distributed on (0,1). Thus, ⎛ log(u ) ⎞ ⎟⎟ + 1 X = Int ⎜⎜ ⎝ log(q) ⎠ Random Numbers 31 Generating Poisson Random Variables Poisson Distributioni with rate λ. pi = P{ X = i} = e − λ λ ,i ≥ 0 i! A recursive relation for pi can be derived : pi +1 = λ pi , i ≥ 0 i +1 Poisson RV Algorithm (λ) 1. Generate u=U(0,1); 2. i=0, p = e-λ, F=p; 3. if u < F, set X = i and return; 4. p= λp/(i+1), F = F+p, i=i+1; 5. Go to step 3; Random Numbers 32 Continuous Random Variates The inverse transform method relies on the following proposition: Let U be a uniform random variable. For any continuous random variable X, the CDF of X has the following relationship with U: X = F-1(U) Here F-1(u) is defined as the value of x at which F(x)=u. Random Numbers 33 11 Exponential Random Variables Suppose X is an exponential random variable with rate λ. The CDF of X is defined as follows: F(x) = 1 – e-λx for x ≥ 0 Generate X1, X2, X3 … as follows: Ui = F ( X i ) = 1 − e e −λX i ln(e − λX i = 1 − Ui −λ X i ) = ln(1 − U i ) − λX i ln(e) = ln(1 − U i ) Xi = − 1 λ ln(1 − U i ) Random Numbers 34 Weibull Random Variable PDF of a Weibull random variable X is: ⎧ β β −1 ⎪ x . −( x / α ) β ,x≥0 f (x ) = ⎨ α β e ⎪⎩0, otherwise The above form assumes that location parameter is zero. β F ( X i ) = 1 − e −( X i / α ) . Set U i = 1 − e −( X i / α ) β X i = −α [ln(1 − U i )]1 / β . Random Numbers 35 Acceptance-Rejection Method Another technique for generating random variates. Consider the discrete case first. If we have a technique for simulating a random variable Y with PMF qj, j ≥ 0, we can use it for generating random variable X with PMF pj, j ≥ 0, by generating values of Y with PMF {qj} and accepting these values with a probability proportional to pY /qY. Random Numbers 36 12 Acceptance-Rejection Algorithm To generate a value do the following: Step 0: Choose a constant c, pY /qY ≤ c for all y Step 1: Simulate the value of Y (where Y has PMF qj ). Step 2: Generate a uniformly distributed random number between 0 and 1, U. Step 3: If U ≤ pY /(cqY), set X = Y. Otherwise, return to step 1. The equivalent method for continuous random variables is given next. Random Numbers 37 Acceptance-Rejection for Continuous Random Variables Step 0: Find a constant c such that f(x) / g(x) ≤ c for all x Step 1: Generate Y having density function g (i.e., we have a method for generating continuous random variates that follow the distribution function g). Step 2: Generate a random number U Step 3: If U ≤ f(Y) / (c.g(Y)), set X = Y. Otherwise, return to step 1. Random Numbers 38 Acceptance-Rejection Method Useful particularly when inverse CDF does not exist in closed form. Illustration: To generate random variates, X ~ U(1/4, 1) Generate R Procedures: Step 1. Generate R ~ U[0,1] Step 2a. If R >= ¼, accept X=R. Step 2b. If R < ¼, reject R, return to Step 1 no Condition yes Output R’ R does not have the desired distribution, but R conditioned (R’) on the event {R ≥ ¼} does. Efficiency: Depends heavily on the ability to minimize the number of rejections. Random Numbers 39 13 Special Properties Based on features of particular family of probability distributions For example: Direct Transformation for normal and lognormal distributions Convolution Beta distribution (from gamma distribution) Random Numbers Direct Transformation 40 [BCNN05] Approach for normal(0,1): Consider two standard normal random variables, Z1 and Z2, plotted as a point in the plane: In polar coordinates: Z1 = B cos φ Z2 = B sin φ B2 = Z21 + Z22 ~ chi-square distribution with 2 degrees of freedom = Exp(λ = 2). Hence, B = (−2 ln R)1/ 2 The radius B and angle φ are mutually independent. Z1 = (−2 ln R)1/ 2 cos(2πR2 ) Z 2 = (−2 ln R)1/ 2 sin(2πR2 ) Random Numbers Direct Transformation 41 [BCNN05] Approach for normal(μ,σ2): Generate Zi ~ N(0,1) Xi = μ + σ Zi Approach for lognormal(μ,σ2): Generate X ~ N((μ,σ2) Yi = eXi Random Numbers 42 14 References Random Number Generation notes are based on the famous book by Raj Jain “The Art of Computer Systems Performance Evaluation”. This book is published by Wiley. Random variate generation notes are based on the famous Sheldon Ross textbook “Simulation” . This book is published by Academic Press. Random Numbers 43 15