EE 5324 – VLSI Design II
Part VII: Floating Point Arithmetic
Kia Bazargan
University of Minnesota
Spring 2006
Floating-Point vs. Fixed-Point Numbers
• Fixed point has limitations
x = 0000 0000.0000 1001₂
y = 1001 0000.0000 0000₂
Rounding?
Overflow? (x² and y² under/overflow)
• Floating point: represent numbers in two fixed-width fields: “magnitude” and “exponent”
Magnitude: more bits = more accuracy
Exponent: more bits = wider range of numbers
X = [ s | e | m ]   (sign bit s, ± exponent e, magnitude m)
Floating Point Number Representation
• Sign field:
When 0: positive number, when 1, negative
• Exponent:
Usually represented as unsigned by adding an offset
Example: 4 bits of exponent, offset=8
o Exp = 1001₂ → e = 1001₂ − 1000₂ = 0001₂ = 1
o Exp = 0010₂ → e = 0010₂ − 1000₂ = 1010₂ = −6
• Magnitude (also called significand, mantissa)
Shift the number to get: 1.xxxx
Magnitude is the fractional part (hidden ‘1’)
Example: 6 bits of mantissa
o Number = 110.0101 → shift: 1.100101 (× 2²) → mantissa = 100101
o Number = 0.0001011 → shift: 1.011 (× 2⁻⁴) → mantissa = 011000
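As a concrete illustration of this normalize-and-extract step, a minimal Python sketch for the 6-bit mantissa examples above (the function name, the use of Python floats, and truncation instead of rounding are illustrative assumptions):

```python
def encode_mantissa(x, m_bits=6):
    """Normalize x > 0 to 1.xxxx * 2^e; return (e, mantissa bits after the hidden '1')."""
    assert x > 0
    e = 0
    while x >= 2.0:          # shift right until the value drops below 2
        x /= 2.0
        e += 1
    while x < 1.0:           # shift left until the hidden '1' appears
        x *= 2.0
        e -= 1
    mantissa = int((x - 1.0) * (1 << m_bits))    # keep m_bits fractional bits (truncate)
    return e, format(mantissa, '0{}b'.format(m_bits))

# The two examples from this slide:
print(encode_mantissa(0b1100101 / 2**4))   # 110.0101   -> (2, '100101')
print(encode_mantissa(0b1011 / 2**7))      # 0.0001011  -> (-4, '011000')
```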
Floating Point Numbers: Example
Format: X = [ s | e (+bias) | m ],   value X = ±1.m × 2^e

X1 = 0 | 1010 | 0011101   →   X1 = +1.0011101 × 2²
X2 = 0 | 0010 | 1000000   →   X2 = +1.1000000 × 2⁻⁶
X3 = 1 | 1011 | 0000001   →   X3 = −1.0000001 × 2³
X4 = 0 | 0000 | 0000000   →   X4 = +1.0000000 × 2⁻⁸ = 0
X5 = 0 | 1111 | 0000000   →   X5 = +1.0000000 × 2⁷ = +∞
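A matching decode sketch for the format above (4-bit exponent with bias 8, 7-bit mantissa, hidden '1'); the special all-0s/all-1s exponent codes used for X4 and X5 are not handled, and the helper name is an illustrative assumption:

```python
def decode(sign, exp_bits, man_bits, man_width=7, bias=8):
    """Decode a (sign, biased exponent, mantissa) triple into a Python float."""
    e = int(exp_bits, 2) - bias               # remove the bias
    m = int(man_bits, 2) / (1 << man_width)   # fraction that follows the hidden '1'
    return (-1) ** sign * (1.0 + m) * 2.0 ** e

print(decode(0, '1010', '0011101'))   # X1 = +1.0011101 * 2^2  =  4.90625
print(decode(0, '0010', '1000000'))   # X2 = +1.1000000 * 2^-6 =  0.0234375
print(decode(1, '1011', '0000001'))   # X3 = -1.0000001 * 2^3  = -8.0625
```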
Floating Point Number Range
• Range: [−max, −min] ∪ [min, max]
Min = smallest magnitude × 2^(smallest exponent)
Max = largest magnitude × 2^(largest exponent)
• What happens if:
We increase # bits for exponent?
Increase # bits for magnitude?
• Ref:
http://steve.hollasch.net/cgindex/coding/ieeefloat.html
ftp://download.intel.com/technology/itj/q41999/pdf/ia64fpbf.pdf
[Figure: the real number line, with FLP⁻ = [−max, −min] and FLP⁺ = [min, max]; values beyond ±max fall in the negative/positive overflow regions, and nonzero values between −min and min fall in the underflow region. Representable numbers are denser near ±min and sparser near ±max. © Oxford U Press]
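A small sketch of how the range bounds follow from the field widths; the usable exponent range is an assumption (here −7 … 6, since the all-0s and all-1s exponent codes are reserved for 0 and +∞ as in the earlier examples):

```python
def flp_range(man_width, e_min, e_max):
    """Smallest and largest positive magnitudes of a hidden-'1' floating-point format."""
    smallest = 1.0                          # smallest significand: 1.000...0
    largest = 2.0 - 2.0 ** (-man_width)     # largest significand:  1.111...1
    return smallest * 2.0 ** e_min, largest * 2.0 ** e_max

# 7-bit mantissa, 4-bit exponent, bias 8, exponents -7..6 usable:
print(flp_range(7, -7, 6))   # (0.0078125, 127.5)
```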
Floating Point Operations
• Addition/subtraction, multiplication/division,
function evaluations, ...
• Basic operations
Adding exponents / magnitudes
Multiplying magnitudes
Aligning magnitudes (shifting, adjusting the
exponent)
Rounding
Checking for overflow/underflow
Normalization (shifting, adjusting the exponent)
Floating Point Addition
• More difficult than multiplication!
• Operations:
Align magnitudes (so that exponents are equal)
Add (and round)
Normalize (result in the form of 1.xxx)
x   = 0 | 1011 | 0011101      x = +1.0011101 × 2³
y   = 0 | 1000 | 1010011      y = +1.1010011 × 2⁰
y   = 0 | 1011 | 0011010      y = +0.0011010 × 2³   (aligned to x's exponent)
x+y = 0 | 1011 | 0110111      x+y = +1.0110111 × 2³
No need to normalize in this case
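A minimal sketch of the align/add/normalize flow, modeling the significands (hidden '1' plus 7 fractional bits) as integers; this is an illustrative software model of positive-operand addition that truncates the shifted-out bits instead of rounding:

```python
def fp_add(e1, sig1, e2, sig2, frac_bits=7):
    """Add two positive operands given as (exponent, significand * 2^frac_bits)."""
    if e1 < e2:                       # keep the larger-exponent operand first
        e1, sig1, e2, sig2 = e2, sig2, e1, sig1
    sig2 >>= (e1 - e2)                # align: shift the smaller operand right (truncate)
    s, e = sig1 + sig2, e1            # add magnitudes
    if s >= (2 << frac_bits):         # sum >= 2.0: normalize with a 1-bit right shift
        s >>= 1
        e += 1
    return e, s

# Slide example: x = +1.0011101 * 2^3,  y = +1.1010011 * 2^0
e, s = fp_add(3, 0b10011101, 0, 0b11010011)
print(e, format(s, '08b'))   # 3 10110111  ->  x + y = +1.0110111 * 2^3
```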
Floating Point Adder Architecture
[Block diagram: Unpack → subtract exponents / complement and swap operands → align magnitudes → add magnitudes (with sign logic driven by the adder's C_in/C_out) → normalize and adjust exponent → round/complement → adjust exponent and normalize → Pack. © Oxford U Press]
Floating Point Adder Components
• Unpacking
Inserting the “hidden 1”
Checking for special inputs (NaN, zero)
• Exponent difference
Used in aligning the magnitudes
A few bits are enough for the subtraction
o e.g., with a 32-bit magnitude adder and 8 exponent bits, only 5 bits take part in the
subtraction (any alignment shift larger than 31 positions flushes the smaller operand anyway)
If the difference is negative: swap the operands and use the positive difference
o How to compute the positive diff? (a small sketch follows this slide)
• Pre-shifting and swap
Shift/complement provided for one operand only
Swap if needed
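The sketch referred to above: one way to obtain the positive exponent difference and the swap decision (purely illustrative; real designs often use one subtractor plus a conditional complement instead):

```python
def exp_diff_and_swap(e1, m1, e2, m2):
    """Return the alignment shift amount plus the operands ordered larger-exponent first."""
    d = e1 - e2
    if d < 0:                               # negative difference: swap, use the positive diff
        return -d, (e2, m2), (e1, m1)
    return d, (e1, m1), (e2, m2)

print(exp_diff_and_swap(0, 0b11010011, 3, 0b10011101))   # (3, (3, 157), (0, 211))
```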
Floating Point Adder Components (cont.)
• Rounding
Three extra bits used for rounding
• Post-shifting
Result in the range (−4, 4):  z = C_out z₁ z₀ . z₋₁ z₋₂ …
Right shift: 1 bit max
o If C_out or z₁ → right shift
Left shift: up to # of bits in magnitude
o Determine # of consecutive 0's (1's) in z, beginning with z₁; adjust the exponent
accordingly (see the sketch after this slide)
• Packing
Check for special results (zero, under-/overflow)
Remove the hidden 1
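The sketch referred to above: determining the left-shift amount from the leading zeros of the (positive) sum and adjusting the exponent; the field width and function name are illustrative assumptions:

```python
def normalize_left(sig, e, width):
    """Left-shift a positive sum until its MSB is 1, decreasing the exponent accordingly."""
    if sig == 0:
        return sig, e                  # exact zero: nothing to normalize
    lz = width - sig.bit_length()      # number of leading zeros in a width-bit field
    return sig << lz, e - lz

# e.g. 0.0011010 * 2^3 (8-bit field 00011010) normalizes to 1.1010000 * 2^0
print(normalize_left(0b00011010, 3, 8))   # (208, 0), i.e. 11010000
```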
Counting vs. Predicting Leading Zeros/Ones
Two alternatives for determining the post-shift amount:
Counting: magnitude adder → count leading 0s/1s → shift amount → post-shifter (with the exponent adjusted by the count). Simpler, but the count sits on the critical path.
Predicting: a leading 0/1 predictor operates in parallel with the magnitude adder, so the shift amount and exponent adjustment are ready when the sum is. More complex architecture.
[© Oxford U Press]
Floating Point Multiplication
• Simpler than floating-point addition
• Operation:
Inputs: z1 = ±1.m1 × 2^e1,   z2 = ±1.m2 × 2^e2
Output = ±(1.m1 × 1.m2) × 2^(e1+e2)
Sign: XOR
Exponent:
o Tentatively computed as e1+e2
o Subtract the bias (=127) HOW?
o Adjusted after normalization
Magnitude
o Result in the range [1, 4)  (inputs are in the range [1, 2))
o Normalization: 1- or 2-bit right shift, depending on rounding
o Result is 2(1+m) bits; should be rounded to (1+m) bits
o Rounding can gradually discard bits, instead of in one last stage
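A minimal sketch of this flow (sign, exponent, magnitude), reusing the toy bias-8, 7-fractional-bit format from the earlier slides instead of the IEEE bias of 127, and truncating instead of rounding; the function name is an illustrative assumption:

```python
def fp_mul(s1, e1, sig1, s2, e2, sig2, frac_bits=7, bias=8):
    """Multiply two (sign, biased exponent, significand) operands; truncating model."""
    sign = s1 ^ s2                      # sign: XOR
    e = e1 + e2 - bias                  # tentative exponent: e1 + e2, bias subtracted once
    p = (sig1 * sig2) >> frac_bits      # product of significands, extra fraction bits dropped
    if p >= (2 << frac_bits):           # product >= 2.0: 1-bit right shift to normalize
        p >>= 1
        e += 1
    return sign, e, p

# 1.1000000 * 1.1000000 = 1.5 * 1.5 = 2.25:
print(fp_mul(0, 8, 0b11000000, 0, 8, 0b11000000))   # (0, 9, 144) = +1.0010000 * 2^1
```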
Floating Point Multiplier Architecture
Floating-point operands
[Block diagram: Unpack → XOR (sign), Add Exponents, and Multiply Magnitudes in parallel → Adjust Exponent / Normalize → Round → Adjust Exponent / Normalize → Pack → Product.
Note: pipelining is used inside the magnitude multiplier as well as at block boundaries.
© Oxford U Press]
Square-Rooting
• Most important elementary function
• In the IEEE standard, square root is specified as a basic operation (alongside +, −, ×, /)
• Very similar to division
• Pencil-and-paper method:
Radicand:            z = z₂ₖ₋₁ z₂ₖ₋₂ … z₁ z₀     (2k digits)
Square root:         q = qₖ₋₁ qₖ₋₂ … q₁ q₀       (k digits)
Remainder (z − q²):  s = sₖ sₖ₋₁ … s₁ s₀         (k+1 digits)
Square Rooting: Example
• Example: sqrt(9 52 41)    (radicand digits grouped in pairs; root q = q₂q₁q₀)

q(0) = 0
q₂ = 3:  largest digit with q₂ × q₂ ≤ 9;  9 − 9 = 0;                          q(1) = 3
q₁ = 0:  bring down 52; largest q₁ with 6q₁ × q₁ ≤ 52 (double 3, append q₁);  q(2) = 30
q₀ = 8:  bring down 41; largest q₀ with 60q₀ × q₀ ≤ 5241 is 8 (608 × 8 = 4864);
         5241 − 4864 = 377;                                                    q(3) = 308

q = 308,  remainder s = 377
(At each step the partial root is doubled and the next root digit is appended to it.)
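A Python sketch of this pencil-and-paper recurrence (two radicand digits consumed per root digit; the partial root is doubled and the trial digit appended before the comparison); the function name is an illustrative assumption:

```python
def decimal_sqrt(z):
    """Digit-by-digit (restoring) square root in base 10: returns (root, remainder)."""
    digits = str(z)
    if len(digits) % 2:                        # group the radicand into pairs of digits
        digits = '0' + digits
    q, rem = 0, 0
    for i in range(0, len(digits), 2):
        rem = rem * 100 + int(digits[i:i+2])   # bring down the next two digits
        d = 9
        while (20 * q + d) * d > rem:          # largest d with (2q appended with d) * d <= rem
            d -= 1
        rem -= (20 * q + d) * d
        q = 10 * q + d                         # append the new root digit
    return q, rem

print(decimal_sqrt(95241))   # (308, 377)
```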
Square Rooting: Example (cont.)
• Why double the partial root?
Partial root after step 2 is: q(2) = 30
Appending the next digit q₀ gives 10 × q(2) + q₀,
whose square is 100 × (q(2))² + 20 × q(2) × q₀ + q₀²
The term 100 × (q(2))² has already been subtracted
Find q₀ such that (10 × (2 × q(2)) + q₀) × q₀ is the largest such value ≤ the partial remainder
• The binary case:
The square of 2 × q(2) + q₀ is: 4 × (q(2))² + 4 × q(2) × q₀ + q₀²
Find q₀ such that (4 × q(2) + q₀) × q₀ ≤ the partial remainder
For q₀ = 1, the expression becomes 4 × q(2) + 1 (i.e., append “01” to the partial root); a
sketch of the resulting algorithm follows.
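The sketch referred to above: the same recurrence in base 2, where a trial digit of 1 means subtracting the partial root with “01” appended (4q + 1); the function name is an illustrative assumption:

```python
def binary_sqrt(z, k):
    """Binary digit-by-digit square root of a 2k-bit radicand: returns (root, remainder)."""
    q, rem = 0, 0
    for i in range(k - 1, -1, -1):
        rem = (rem << 2) | ((z >> (2 * i)) & 0b11)   # bring down the next two radicand bits
        trial = (q << 2) | 1                         # 4q + 1: partial root with '01' appended
        if trial <= rem:                             # can we afford a 1 digit?
            rem -= trial
            q = (q << 1) | 1
        else:
            q <<= 1                                  # 0 digit; remainder is kept (restoring)
    return q, rem

print(binary_sqrt(0b01110110, 4))   # sqrt(118) -> (10, 18): q = 1010, s = 10010
```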
Square Rooting: Example Base 2
• Example: sqrt(01110110₂) = sqrt(118)    (radicand bits grouped in pairs: 01 11 01 10)

q(0) = 0,   z = (118)₁₀
q₃ = 1:  01 − 01 = 0;                                                 q(1) = 1
q₂ = 0:  bring down 11;  101 ≤ 011?  No;                              q(2) = 10
q₁ = 1:  bring down 01;  1001 ≤ 01101?  Yes → 01101 − 1001 = 00100;   q(3) = 101
q₀ = 0:  bring down 10;  10101 ≤ 10010?  No;                          q(4) = 1010

q = 1010₂ = 10₁₀,   remainder s = 10010₂ = 18₁₀
Sequential Shift/Subtract Square Rooter Architecture
[Block diagram: a partial remainder register (loaded with z − 1 at the outset) and the developing square-root register feed an (l+2)-bit adder that forms the trial difference; the “select root digit” logic uses the MSB of 2s(j−1) and the adder's C_out to choose the root digit q₋ⱼ, which is shifted into the square-root register and drives the adder's subtract/complement control (C_in). © Oxford U Press]
Other Methods for Square Rooting
• Restoring vs. non-restoring
We looked at the restoring algorithm
(after subtraction, restore partial remainder if the
result is negative)
Non-restoring:
Use a different encoding (use digits {-1,1} instead of
{0,1}) to avoid restoring
• High-radix
Similar to modified Booth encoding in multiplication: handle more bits at a time
More complex circuit, but faster
Other Methods for Square Rooting (cont.)
• Convergence methods
Use the Newton method to find a root of the function
  f(x) = x² − z, whose root is x = √z
OR
  f(x) = 1/x² − z, whose root is x = 1/√z;
  multiply the result by z to get √z
Iteratively improve the accuracy
Can use lookup table for the first iteration
Square Rooting: Abstract Notation
        z
− q₃ × (q(0) 0q₃) × 2⁶
− q₂ × (q(1) 0q₂) × 2⁴
− q₁ × (q(2) 0q₁) × 2²
− q₀ × (q(3) 0q₀) × 2⁰
──────────────────────
        s        (the root digits q = q₃q₂q₁q₀ are written above z, as in long division)
Floating point format:
- Shift left (not right)
- Powers of 2 decreasing
Restoring Floating-Point Square Root Calc.
z = 01.110110  (118/64)

q(0) = 1.        s(0) = z − 1 = 00.110110
At each step, form 2s(j−1) and try subtracting [2 × q(j−1) + 2⁻ʲ]:
  if the trial difference is ≥ 0, the root digit q₋ⱼ is 1 and s(j) is that difference;
  if it is negative, q₋ⱼ is 0 and the partial remainder is restored: s(j) = 2s(j−1).

  q₋₁ = 0 (restore)    q(1) = 1.0
  q₋₂ = 1              q(2) = 1.01
  q₋₃ = 0 (restore)    q(3) = 1.010
  (continued on the next slide)
[© Oxford U Press]
Restoring Floating-Point Sq. Root Calc. (cont.)
(continuing)
  q₋₄ = 1              q(4) = 1.0101
  q₋₅ = 1              q(5) = 1.01011
  q₋₆ = 0 (restore)    q(6) = 1.010110   (86/64)

s(6) = 010.011100  (156/64)
True remainder:  z − q² = 2⁻⁶ × s(6) = 0.000010011100  (156/64²)
[© Oxford U Press]
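A sketch of the restoring recurrence above using exact rationals (the invariant is s(j) = 2^j × (z − q(j)²), so the true remainder is s(l)/2^l); the function name is an illustrative assumption:

```python
from fractions import Fraction

def restoring_sqrt(z, l):
    """Restoring square root of z in [1,2), developing l fractional root bits exactly."""
    q = Fraction(1)                    # q(0) = 1.
    s = z - 1                          # s(0) = z - 1
    for j in range(1, l + 1):
        w = Fraction(1, 2 ** j)
        trial = 2 * s - (2 * q + w)    # 2s(j-1) - [2 q(j-1) + 2^(-j)]
        if trial >= 0:
            s, q = trial, q + w        # root digit 1
        else:
            s = 2 * s                  # root digit 0: restore, keep 2s(j-1)
    return q, s / 2 ** l               # undo the 2^l scaling of the remainder

q, rem = restoring_sqrt(Fraction(118, 64), 6)
print(q, rem)   # 43/32 39/1024, i.e. q = 86/64 = 1.010110 and remainder 156/64^2
```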
Nonrestoring Floating-Point Square Root Calc.
z = 01.110110  (118/64)

q(0) = 1.        s(0) = z − 1
At each step the root digit is +1 or −1, chosen from the sign of the partial remainder:
  if s(j−1) ≥ 0:  q₋ⱼ = 1   and  s(j) = 2s(j−1) − [2 × q(j−1) + 2⁻ʲ]
  if s(j−1) < 0:  q₋ⱼ = −1  and  s(j) = 2s(j−1) + [2 × q(j−1) − 2⁻ʲ]

  q₋₁ = 1     q(1) = 1.1
  q₋₂ = −1    q(2) = 1.01
  q₋₃ = 1     q(3) = 1.011
  q₋₄ = −1    q(4) = 1.0101
  q₋₅ = 1     q(5) = 1.01011
  (continued on the next slide)
Nonrestoring FP Square Root Calc. (cont.)
(continuing)
  q₋₆ = 1     q(6) = 1.010111

s(6) is negative (−17/64), so the result must be corrected:
  q (signed-digit)     = 1 . 1 −1 1 −1 1 1
  q (corrected binary) = 1.010111   (87/64)
  drop the last ‘1’:   q = 1.010110   (86/64)
  s(6) (corrected) = 010.011100  (156/64),  true remainder = 156/64²

If the final s is negative, drop the last ‘1’ in q, and restore the remainder to the
last positive value.
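A matching sketch of the nonrestoring recurrence, with root digits in {−1, 1} and the final correction described above; the function name and the use of exact rationals are illustrative assumptions:

```python
from fractions import Fraction

def nonrestoring_sqrt(z, l):
    """Nonrestoring square root of z in [1,2): digits in {-1,1}, with final correction."""
    q = Fraction(1)                           # q(0) = 1.
    s = z - 1                                 # s(0) = z - 1
    for j in range(1, l + 1):
        w = Fraction(1, 2 ** j)
        if s >= 0:                            # digit +1: subtract [2 q(j-1) + 2^(-j)]
            s, q = 2 * s - (2 * q + w), q + w
        else:                                 # digit -1: add [2 q(j-1) - 2^(-j)]
            s, q = 2 * s + (2 * q - w), q - w
    if s < 0:                                 # final remainder negative: drop the last '1'
        q -= Fraction(1, 2 ** l)
        s += 2 * q + Fraction(1, 2 ** l)      # restore the remainder to the last positive value
    return q, s / 2 ** l                      # undo the 2^l scaling of the remainder

q, rem = nonrestoring_sqrt(Fraction(118, 64), 6)
print(q, rem)   # 43/32 39/1024 -- same root and remainder as the restoring version
```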
Square Root Through Convergence
• Newton-Raphson method:
Choose f(x) = x² − z
x(i+1) = x(i) − f(x(i)) / f′(x(i))
x(i+1) = 0.5 × (x(i) + z / x(i))
• Example: compute the square root of z = (2.4)₁₀
x(0) read out from a table = 1.5                      accurate to 10⁻¹
x(1) = 0.5 × (x(0) + 2.4/x(0)) = 1.550 000 000        accurate to 10⁻²
x(2) = 0.5 × (x(1) + 2.4/x(1)) = 1.549 193 548        accurate to 10⁻⁴
x(3) = 0.5 × (x(2) + 2.4/x(2)) = 1.549 193 338        accurate to 10⁻⁸
[Par00] p354
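A minimal Python sketch of this iteration (the table-lookup seed is passed in as a parameter; the names are illustrative assumptions):

```python
def newton_sqrt(z, x0, iters=3):
    """Newton-Raphson square root: x(i+1) = 0.5 * (x(i) + z / x(i))."""
    x = x0                        # seed, e.g. read out of a small lookup table
    for _ in range(iters):
        x = 0.5 * (x + z / x)     # each step roughly doubles the number of correct digits
    return x

print(newton_sqrt(2.4, 1.5, 1))   # ~1.55
print(newton_sqrt(2.4, 1.5, 3))   # ~1.549193338, sqrt(2.4) = 1.549193338...
```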
Non-Restoring Parallel Square Rooter
[Figure: an array square rooter built from identical cells, each containing a full adder and an XOR gate (controlled add/subtract); the radicand bits z₋₁ … z₋₈ enter the array, each row produces one root bit q₋₁ … q₋₄, and the bottom row leaves the remainder bits s₋₁ … s₋₈. © Oxford U Press]
Function Evaluation
• We looked at square root calculation
Direct hardware implementation (binary, BSD, high-radix)
o Serial
o Parallel
Approximation (Newton method)
• What about other functions?
Direct implementation
o Example: log2 x can be directly implemented in hardware
(using square root as a sub-component)
Polynomial approximation
Table look-up
o Either as part of the calculation or for the full calculation
Table Lookup
[Figure: two schemes.
Direct table-lookup implementation: the u-bit operand(s) directly address a 2^u × v table whose entries are the v-bit result(s).
Table lookup with pre- and post-processing: preprocessing logic transforms the u-bit operand(s), one or more smaller tables are consulted, and postprocessing logic combines their outputs into the v-bit result(s).
© Oxford U Press]
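A minimal sketch of the direct scheme: every possible u-bit operand gets a precomputed entry, so evaluation is a single read. The example function (reciprocal of 1.f on [1, 2)) and the names are illustrative assumptions:

```python
U = 8                                                        # operand width u in bits
TABLE = [1.0 / (1.0 + i / 2 ** U) for i in range(2 ** U)]    # 2^u precomputed results

def reciprocal_lookup(frac_bits):
    """Approximate 1/(1.f) by indexing the table with the u-bit fraction f."""
    return TABLE[frac_bits]

print(reciprocal_lookup(0b10000000))   # 1/1.5 = 0.666..., to the table's precision
```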
Linear Interpolation Using Four Subintervals
[Figure: the interval [x_min, x_max] is divided into four subintervals, and f(x) is approximated on subinterval i by a(i) + b(i)·x. The two leading bits of x form a 2-bit address into two 4-entry tables holding a(i) and b(i)/4; the remaining bits of x (4x, after moving the radix point) drive a multiplier whose output is added to a(i) to produce f(x). © Oxford U Press]
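A sketch of the four-subinterval scheme: the two leading bits of x select the table entries a(i) and b(i), and one multiply-add produces the approximation. The example function (sin on [0, 1)) and the chord-based table construction are illustrative assumptions:

```python
import math

F = math.sin                                   # example function to approximate (assumption)
XS = [i / 4 for i in range(5)]                 # subinterval endpoints 0, .25, .5, .75, 1
B = [(F(XS[i + 1]) - F(XS[i])) * 4 for i in range(4)]   # slope of each chord
A = [F(XS[i]) - B[i] * XS[i] for i in range(4)]         # intercept of each chord

def interp(x):
    """Approximate F(x) on [0,1): top 2 bits of x pick the subinterval, then one multiply-add."""
    i = int(x * 4)                             # 2-bit 'address' of the subinterval
    return A[i] + B[i] * x                     # a(i) + b(i) * x

print(interp(0.3), math.sin(0.3))   # ~0.2938 vs 0.29552 (four chords give a coarse fit)
```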
Piecewise Table Lookup
[Figure: piecewise table-lookup schemes in which the b-bit input z is split into fields that address two tables (Table 1 and Table 2); their d-bit (or d*-bit) outputs are combined with adders, sign logic, and a multiplexer to form the d-bit output. One of the variants shown computes z mod p. © Oxford U Press]
Accuracy vs. Lookup Table Size Trade-off
[Plot: worst-case absolute error (log scale, 10⁻¹ down to 10⁻⁹) versus the number of address bits h (0 to 10), with one curve each for linear, 2nd-degree, and 3rd-degree interpolation; the error shrinks as h grows and as the degree of the approximation increases. © Oxford U Press]
Useful Links
• M. E. Phair, “Free Floating-Point Madness!”,
http://www.hmc.edu/chips/