Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
School of Engineering University of Guelph FPGA-Optimised High-Quality Uniform Random Number Generators By David Barrie Thomas and Wayne Luk of the Imperial College, London, England -Paper ReviewCédric De Brito Table of Contents • Introduction • Background • The paper • Explanations • Conclusion Introduction • This paper describes a new method to create Random Number Generators (RNG) using only the most basic primitives of FPGAs: Flip-Flops (FF), Lookup Tables (LUT), Shift Registers (SR) and Random Access Memory (RAM) What is a RNG ? • Two types: i. True RNG (TRNG): of physical phenomena, ii. Use Pseudo RNG (PRNG):e.g the jitter between oscillators Background depends on initial seed fundamentallyfunctions unpredictable characteristic are deterministic : a given always give rise to the same output seed greatmust interest in cryptographic applications sequence X cannot produce high abit-rate streams X advances state using recurrence function X state-space must ahave a fixed with cardinality -> must X the impossible to repeat simulation the same stimuli eventually repeat -> Quality ? Background Obstacles to FPGA implementation • Limited resources • Most algorithms are developed for CPUs ->usually optimized for 32 bits ->mostly sequential • FPGAs are fundamentally binary • Parallel -> huge increase in bit rate Background What is currently available Paper published in 2008 • Most commonly used in software: MersenneTwister RNG (MT19937) – adapted to FPGA – runs on an Opteron at 2.2GHz for 4Gb/s • Most efficient on an FPGA: LUT recurrence – MAXIMUM area efficiency – no software equivalent Motivations 1 2 3 Poor Good Best Area Throughput Ease of implementation 1 2 3 2 3 2 LUT recurrence 3 1 1 LUT - FIFO 3 4 2 Software Software-based on Reconfigurables The paper The paper Motivations “ Many scientific and industrial problems have no tractable closed-form solution, and can only be solved through MonteCarlo simulations. However, such simulations require huge amounts of computational power, and the power and size limits of conventional compute-farms have led to significant interest in the use of FPGAs in such applications.” The paper Claims Compared to the MT19937 at 2.2GHz for a output of 4Gb/s • Less area utilisation – > less power • Low clock frequency : 550 Mhz • High speed generation : 48 Gb/s On a Virtex-5 The paper Claims • Extremely long periods: ~211213 -1 ,758,426 e+3345 • VHDL description is platform independent • By nature, its critical path is obvious -> can easily be adapted/optimized for any platform Basics • A RNG relies on a polynomial of degree n to determine the values of the bits of the next generated number. n - a zn-1 +…+a z0 Pz = zExplanation 1 n • This characteristic polynomial is translated into a characteristic recurrence function Si+1T = [si,5 XOR si,6 , si,6 , si,4 , si,5 , si,2 , si,0 , si,1 ] Active bit FIFO bits Explanation A simple example Binary Linear Recurrence (BLR) 1) Choose a polynomial of the form Px = x6 + x5 + x0 2) Find its equivalent recurrence function Si+1T = [si,5 XOR si,6 , si,6 , si,1 , si,2 , si,3 , si,4 , si,5 ] 3) Create a matrix to save these coefficients Explanation A simple example Binary Linear Recurrence (BLR) • Only produces 1 new value per cycle • For a w-length word, w SRs are needed. • Maximum period of 2n-1 instead of 2nw-1 Explanation How is the period calculated ? • Why? Because the state-space defining the characteristic polynomial must be finite/bounded -> must eventually repeat • Without too much math, maximum period is 2n-1 if (special but common case) 2n-1 is a Mersenne prime Prime if n in 2n-1 is prime Explanation How is the quality measured? • The quality is a measure of “how random” the numbers produced are. • Test of quality : • Run Generator twice with several "batteries" (Diehard, SmallCrush, Alphabit) and both runs must produce enough values in [10-6 .. 1-10-6] and one in [0.01..0.99] -> Empirical randomness Conclusion • Critical path : I -> FF -> RAM -> FF -> O • Limitation : speed of RAM – NOT logic/routing Personal Opinion • VERY interesting paper • Condensed but complete • Clear • Can be reproduced • About a rather unusual topic Questions ?