* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Random Number Generation
Georg Cantor's first set theory article wikipedia , lookup
List of prime numbers wikipedia , lookup
Large numbers wikipedia , lookup
Vincent's theorem wikipedia , lookup
Karhunen–Loève theorem wikipedia , lookup
System of polynomial equations wikipedia , lookup
Factorization of polynomials over finite fields wikipedia , lookup
Central limit theorem wikipedia , lookup
Fundamental theorem of algebra wikipedia , lookup
Quadratic reciprocity wikipedia , lookup
Infinite monkey theorem wikipedia , lookup
Factorization wikipedia , lookup
Law of large numbers wikipedia , lookup
Collatz conjecture wikipedia , lookup
1 How does the computer generate observations from various distributions specified after input analysis? There are two main components to the generation of observations from probability distributions. 1. Random number generation. 2. Random variate generation. 2 Random number generation – The generation of U(0,1) random variates (observations from Uniform (0,1) distribution). ◦ This serves as the foundation for the generation of observations from other distributions, which is called random variate generation. Random Number Generator is the term used to describe the procedure and parameters used to generate the U(0,1) observations. 3 Since the “stream” of random numbers generated is reproducible, random number generation procedures are also referred to as pseudo random number generators. The stream or sequence of numbers produced by a generator should pass statistical tests for randomness. ◦ An outside observer should not be able to tell the difference (statistically) between a stream of pseudo random numbers and an actual random number stream. 4 A pseudorandom process appears random, but isn’t Pseudorandom sequences exhibit statistical randomness ◦ but generated by a deterministic process Pseudorandom sequences are easier to produce than a genuine random sequences Pseudorandom sequences can reproduce exactly the same numbers ◦ useful for testing and fixing software. 5 Random number generators typically compute the next number in the sequence from the previous number The first number in a sequence is called the seed ◦ to get a new sequence, supply a new seed (current machine time is useful) ◦ to repeat a sequence, repeat the seed 6 Desirable Attributes: ◦ Uniformity ◦ Independence ◦ Efficiency ◦ Replicability ◦ Long Cycle Length 7 Each random number Rt is an independent sample drawn from a continuous uniform distribution between 0 and 1 1 , 0 x 1 pdf: f(x) = 0 , otherwise 8 1 f(x) PDF: 0 x 1 E ( R ) xdx [ x 2 / 2]10 1 / 2 0 1 V ( R ) x 2 dx [ E ( R )] 2 0 1 0 [ x 3 / 3] (1 / 2) 2 1 / 3 1 / 4 1 / 12 9 One early method – the midsquare method (von Neumann and Metropolis 1940) ◦ Start with a four digit positive integer Z0. ◦ Square Z0 to get an integer with up to eight digits (append zeros if less than eight). ◦ Take the middle four digits as the next four digit integer Z1. ◦ Place a decimal point to the left of Z1 to form the first “U(0,1)” observation. ◦ Repeat 10 MidSquare Example: X0 = 7182 (seed) X 02 = 51581124 ==> R1 = 0.5811 X 02 = (5811) 2 = 33767721 ==> R2 = 0.7677 etc. 11 Note: Cannot choose a seed that guarantees that the sequence will not degenerate and will have a long period. Also, zeros, once they appear, are carried in subsequent numbers. Ex1: ==> ==> Ex2: ==> ==> X0 R1 R2 X0 R1 R2 = = = = = = 5197 (seed) 0.0088 0.0077 4500 (seed) 0.2500 0.2500 = 27008809 X = 00007744 X 2 0 2 1 = 20250000 X = 06250000 X 2 0 2 1 12 The prior method does not work well. ◦ Degenerates to zero. What are good methods? ◦ Linear Congruential Generators (LCGs). ◦ Composite generators. ◦ Tausworthe generators. 13 Linear Congruential Generators (LCGs). A LCG generates a sequence of integers Z1, Z2, Z3,… using the following recursive formula, Z i (aZ i 1 c) mod m mod m is short for modulo m or the remainder when divided by m. 14 Since the mod m operation is used, all Zi’s will be between 0 and m-1. To get the “U(0,1)” random observations each Zi generated is divided by m. Z1 Z2 U1 ,U 2 , m m So are the Ui’s really U(0,1) random observations ? 15 Let m=63, a=22, c=4 and Z0 =19. ◦ Generate the first five “U(0,1)” observations. 16 i 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 22*Zi-1+ 4 422 972 598 686 1236 862 950 114 1126 1214 378 4 92 642 268 Zi 19 44 27 31 56 39 43 5 51 55 17 0 4 29 12 16 Ui 0.6984 0.4286 0.4921 0.8889 0.6190 0.6825 0.0794 0.8095 0.8730 0.2698 0.0000 0.0635 0.4603 0.1905 0.2540 i 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 22*Zi-1+ 4 356 906 532 620 1170 796 884 48 1060 1148 312 1324 26 576 202 290 Zi 41 24 28 53 36 40 2 48 52 14 60 1 26 9 13 38 Ui 0.6508 0.3810 0.4444 0.8413 0.5714 0.6349 0.0317 0.7619 0.8254 0.2222 0.9524 0.0159 0.4127 0.1429 0.2063 0.6032 i 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 22*Zi-1+ 4 840 466 554 1104 730 818 1368 994 1082 246 1258 1346 510 136 224 774 Zi 21 25 50 33 37 62 45 49 11 57 61 23 6 10 35 18 Ui 0.3333 0.3968 0.7937 0.5238 0.5873 0.9841 0.7143 0.7778 0.1746 0.9048 0.9683 0.3651 0.0952 0.1587 0.5556 0.2857 i 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 22*Zi-1+ 4 400 488 1038 664 752 1302 928 1016 180 1192 1280 444 70 158 708 334 Zi 22 47 30 34 59 42 46 8 54 58 20 3 7 32 15 19 Ui 0.3492 0.7460 0.4762 0.5397 0.9365 0.6667 0.7302 0.1270 0.8571 0.9206 0.3175 0.0476 0.1111 0.5079 0.2381 0.3016 17 What will happen after the 63rd number is generated? m, a, and c are the parameters of the random number generator. ◦ There can be an infinite number of different implementations of a LCG. The values used for m, a, and c determine whether the generator is good or bad. 18 The example LCG demonstrates cycling in the prior table. ◦ Since m=63, it can generate at most 63 numbers before it repeats the same sequence. This small random number generator has full period since it generates all possible (m=63) numbers before cycling. ◦ A long period (full if possible) is desirable since more observations can be generated before cycling. ◦ No “gaps”. 19 The example generator has full period but bad statistical properties (next slide). A good random number generator will have values for m, a, and c such that full or close to full period is obtained, as well as good statistical properties. ◦ Crystal Ball m = 231 – 1 a = 62089911 c=0 Period = 231 – 2 20 Theorem (Hull and Dobell 1962) ◦ The LCG Zi = (aZi-1 + c) mod m has full period if and only if the following three conditions hold. 1. The only positive integer that exactly divides both m and c is 1. 2. If q is a prime number that divides m, then q divides a-1. 3. If 4 divides m, then 4 divides a-1. The parameters of the LCG dictate the period length of the LCG as well as other properties of the numbers generated. 21 A generator that has the maximum possible period is called a full-period generator. Lower autocorrelations between successive numbers are preferable. Both generators have the same full period, but the first one has a correlation of 0.25 between xn-1 and xn, whereas the second one has a negligible correlation of less than 2-18. 22 Types of LCGs ◦ When c = 0, the LCG is called a multiplicative generator. ◦ When c ≠ 0, the LCG is called a mixed generator. Most LCGs implemented are multiplicative ◦ Can’t have full period. How is m selected. ◦ A large period is desired → m=231 (based on a 32 bit word size). With m=231 it has been proven that the period can be at most 229 (25% of the values are cycled and gaps may be present). 23 Multiplicative LCG: c=0 xi a xi 1 mod m Two types: m2 k m2 k 24 Example: Using the multiplicative congruential method, find the period of the generator for a = 13, m = 26, and X0 = 1, 2, 3, and 4. The solution is given in next slide. When the seed is 1 and 3, the sequence has period 16. However, a period of length eight is achieved when the seed is 2 and a period of length four occurs when the seed is 4. 25 Period Determination Using Various seeds i Xi Xi Xi Xi 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 1 13 41 21 17 29 57 37 33 45 9 53 49 61 25 5 1 2 26 18 42 34 58 50 10 2 3 39 59 63 51 23 43 47 35 7 27 31 19 55 11 15 3 4 52 36 20 4 26 Maximum possible period 2k-2 Period achieved if multiplier a is of the form 8i± 3, and the initial seed is an odd integer One-fourth the maximum possible may not be too small Low order bits of random numbers obtained using multiplicative LCG's with m=2k have a cyclic pattern. 27 28 29 When the modulus m is a prime number and a >1, the maximum period is m-1, no matter whether c=0 or not. The maximum period m-1 is obtained if and only if ‘a’ is a primitive element modulo m. If m is prime then ‘a’ is a primitive element modulo m (or primitive root of m) if and only if an mod m ≠ 1 for n=1, 2, 3, …,m-2. Recommended. (Prime moduli are best in terms of sequence randomness.) 30 Example: xi 3xi 1 mod 31 Starting with a seed of x0=1: 1, 3, 9, 27, 19, 26, 16, 17, 20, 29, 25, 13, 8, 24, 10, 30, 28, 22, 4, 12, 5, 15, 14, 11, 2, 6, 18, 23, 7, 21, 1, … The period is 30 ⇒ 3 is a primitive root of 31 With a multiplier of a = 5: 1, 5, 25, 1,… The period is only 3 ⇒ 5 is not a primitive root of 31 Primitive roots of 31= 3, …????????. 31 LCGs are a special case of the form Zi = g(Zi-1, Zi-2, ...) (mod m), Ui = Zi /m, for some function g Examples: g(Zi-1) = aZi-1 + c LCG g(Zi-1, Zi-2, ..., Zi-q) = a1Zi-1 + a2Zi-2 + ... + aqZi-q multiple recursive generator g(Zi-1) = a'Z 2i-1 + aZi-1 + c quadratic CG g(Zi-1, Zi-2) = Zi-1 + Zi-2 Fibonacci (bad) 32 Composite Generators Combine two (or more) individual generators in some way. Differencing LCGs ◦ Z1i and Z2i from LCGs with different moduli ◦ Let Zi = (Z1i – Z2i ) (mod m); Ui = Zi / m ◦ Very good statistical properties ◦ Very portable (micros, different languages) Wichmann/Hill ◦ Use three LCGs to get U1i, U2i, and U3i sequences ◦ Let Ui = fractional part of U1i + U2i + U3i ◦ Long period, good statistics, portability 33 Originated in cryptography Can achieve very long periods Theoretical appeal: for properly chosen parameters, can prove that over a cycle, ◦ mean 1/2 (as for true U(0,1)) ◦ Variance 1/12 (as for true U(0,1)) ◦ Autocorrelation 0 (as for true IID sequence) Define a sequence of binary digits B1,B2, . . ., by q bi c j bi j mod 2 j 1 where cj = 0 or 1. 34 Looks a bit like a generalization of LCG’s. Let D = delay operator such that Db(n)=b(n+1) Dqb(i q) cq1Dq1b(i q) cq2 Dq2b(i q) c0b(i q) mod 2 or Dq cq1Dq1 cq2 Dq2 c0 mod 2 Dq cq1Dq1 cq2 Dq2 c0 0 mod 2 Since in mod 2 arithmetic subtraction is equivalent to addition, the preceding equation is equivalent to q q 1 q 2 D cq1D cq2 D c0 0 mod 2 35 The polynomial on the left-hand side of this equation is called a characteristic polynomial and is traditionally written using x in place of D x cq1x q q 1 cq2 x q 2 c0 The period of a Tausworthe generator depends upon the characteristic polynomial. In particular, the period is the smallest positive integer n for which xn - 1 is divisible by the characteristic polynomial. The maximum possible period with a polynomial of order q is 2q - 1. The polynomials that give this period are called primitive polynomials. 36 Example: Consider the following polynomial: x7 + x3 + 1 Using the D operator in place of x, we get D7b(n) D3b(n) b(n) 0 mod 2 bn7 bn3 bn 0 mod 2 or or using the XOR operator or bn7 bn3 bn 0 bn7 bn3 bn n 0,1,2, n 0,1,2, Substituting n-7 for n, we get bn bn4 bn7 n 7,8,9, 37 Starting with b0 = b1 = ... = b6 = 1, we get the following bit sequence: b7 b3 b0 1 1 0 b8 b4 b1 1 1 0 b9 b5 b2 1 1 0 b10 b6 b3 1 1 0 b11 b7 b4 0 1 1 38 The complete sequence is: 1111111 0000111 0111100 1011001 0010000 0010001 0011000 1011101 0110110 0000110 0110101 0011100 1111011 0100001 0101011 1110100 1010001 1011100 0111111 1000011 1000000. Period = 127 or 27-1 bits ⇒ The polynomial x7+x3+1 is a primitive polynomial. 39 A Tausworthe sequence can be easily generated in hardware using Linear-Feedback Shift Registers (LFSRs). For example, the polynomial x5 + x3 + 1 results in the generator bn = bn-2 bn-5. This can be implemented using the LFSR shown in the Figure presented next slide. The circuit consists of six registers, each holding one bit. On every clock cycle, each register’s content is shifted out, and the new content is determined by the input to the register. 40 Linear Feedback Shift Register: x5+x3+1 ⇒ bn= bn-2 ⊕ bn-5 This can be easily implemented using shift registers: 41 Generating U(0,1): Divide the sequence into successive groups of s bits and use the first l bits of each group as a binary fraction: xn = 0.bsnbsn+1bsn+2bsn+3 ...bsn+l-1 Here, s is a constant greater than or equal to l and is relatively prime to 2q-1. s ≥ l ⇒ xn and xj for n ≠ j have no bits in common. Relative prime-ness guarantees a full period 2q-1 for xn. 42 Example: bn = bn-4 ⊕ bn-7 The period 27-1=127 l=8, s=8: x0 = 0.111111102 = 0.9921910 x1 = 0.000111012 = 0.1132810 x2 = 0.111001012 = 0.8945310 x3 = 0.100100102 = 0.2968810 x4 = 0.000001002 = 0.3632810 x5 = 0.010011002 = 0.4218810 …. 43 List of Primitive Trinomials x2 + x + 1 x3 + x + 1 x4 + x + 1 x5 + x2 + 1 x6 + x + 1 x7 + x + 1 x7 + x3 + 1 x9 + x 4 + 1 x10 + x3 + 1 x11 + x2 + 1 x15 + x + 1 x15 + x4 + 1 x15 + x7 + 1 x17 + x3 + 1 x17 + x5 + 1 x17 + x6 + 1 x18 + x7 + 1 x20 + x3 + 1 x21 + x2 + 1 x22 + x + 1 x23 + x5 + 1 x23 + x9 + 1 x25 + x3 + 1 x25 + x7 + 1 x28 + x3 + 1 x28 + x9 + 1 x28 + x13 + 1 x29 + x2 + 1 x31 + x3 + 1 x31 + x6 + 1 x31 + x7 + 1 x31 + x13 +1 If xq + xr + 1 is listed, then xq + xq-r +1 is also primitive. 44 Homework: Generate random numbers using the primitive polynomial x5+x2+1. (use l=4) Generate the same sequence using LFSR. 45