ECE 645: Lecture 1
Basic Adders and Counters
Implementation of Adders in FPGAs

Required Reading
Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design
Chapter 5, Basic Addition and Counting, Sections 5.1-5.5, pp. 75-85.

Required Reading
Spartan-3 Generation FPGA User Guide
http://www.xilinx.com/support/documentation/spartan-3_user_guides.htm
Chapter 9, Using Carry and Arithmetic Logic

Half-adder

$x + y = (c\,s)_2$

  x  y | c  s
  0  0 | 0  0
  0  1 | 0  1
  1  0 | 0  1
  1  1 | 1  0

Half-adder: Alternative implementations (1)
a) $c = xy$, $s = x \oplus y$
b) $c = \overline{\bar{x} + \bar{y}}$, $s = x\bar{y} + \bar{x}y$

Half-adder: Alternative implementations (2)
c) $\bar{c} = \overline{xy}$, $s = x\bar{c} + y\bar{c} = \overline{\overline{x\bar{c}}\;\overline{y\bar{c}}}$

Full-adder

$x + y + c_{in} = (c_{out}\,s)_2$

  x  y  cin | cout  s
  0  0  0   |  0    0
  0  0  1   |  0    1
  0  1  0   |  0    1
  0  1  1   |  1    0
  1  0  0   |  0    1
  1  0  1   |  1    0
  1  1  0   |  1    0
  1  1  1   |  1    1

Full-adder: Alternative implementations (1)
a) $s = (x \oplus y) \oplus c_{in}$
   $c_{out} = xy + c_{in}(x \oplus y)$

Full-adder: Alternative implementations (2)
b) $c_{out} = xy + x c_{in} + y c_{in}$
   $s = x \oplus y \oplus c_{in} = x y c_{in} + x \bar{y} \bar{c}_{in} + \bar{x} y \bar{c}_{in} + \bar{x} \bar{y} c_{in}$

Full-adder: Alternative implementations (3)
c)
  x  y | cout | s
  0  0 | 0    | cin
  0  1 | cin  | NOT cin
  1  0 | cin  | NOT cin
  1  1 | 1    | cin

Full-adder: Alternative implementations (4)
Implementation used to generate fast carry logic in Xilinx FPGAs

  x  y | cout
  0  0 | y
  0  1 | cin
  1  0 | cin
  1  1 | y

$p = x \oplus y$
$g = y$
$s = p \oplus c_{in} = x \oplus y \oplus c_{in}$

[Figure: carry multiplexer with select input p that passes g to Cout when p = 0 and Cin to Cout when p = 1, plus an XOR gate producing S from p and Cin.]

Latency of a k-bit ripple-carry adder

$d_{\text{ripple-add}}(k) = d_{FA}(x, y \to c_{out}) + (k-2)\, d_{FA}(c_{in} \to c_{out}) + d_{FA}(c_{in} \to s)$

$\text{Latency} = a + k\,b$
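The full-adder forms above are easy to compare in software. The following is a minimal sketch, not part of the original slides: the function names are hypothetical, and the adder is modelled bit by bit only to mirror the ripple-carry structure, not to estimate delay.

```python
from itertools import product

def full_adder_two_ha(x, y, cin):
    """Form (a): two half-adders plus an OR gate."""
    p = x ^ y
    s = p ^ cin
    cout = (x & y) | (cin & p)
    return cout, s

def full_adder_majority(x, y, cin):
    """Form (b): carry-out as the majority of the three inputs."""
    cout = (x & y) | (x & cin) | (y & cin)
    s = x ^ y ^ cin
    return cout, s

def full_adder_mux(x, y, cin):
    """Form (4): p = x XOR y selects cin, otherwise g = y is the carry-out."""
    p = x ^ y
    g = y
    cout = cin if p else g
    s = p ^ cin
    return cout, s

def ripple_carry_add(x, y, k, cin=0, fa=full_adder_majority):
    """Add two k-bit unsigned integers one bit at a time.

    Returns (carry_out, k_bit_sum).
    """
    carry, total = cin, 0
    for i in range(k):
        carry, s = fa((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return carry, total

# Every form satisfies x + y + cin = (cout s)_2 on all eight input rows.
for x, y, cin in product((0, 1), repeat=3):
    for fa in (full_adder_two_ha, full_adder_majority, full_adder_mux):
        cout, s = fa(x, y, cin)
        assert 2 * cout + s == x + y + cin
```

For example, `ripple_carry_add(5, 3, 4)` returns `(0, 8)`: no carry out of the top bit and the 4-bit sum 1000.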
Unsigned addition vs. signed addition

[Figure: an 8-bit addition X + Y = S with bit weights 128, 64, 32, 16, 8, 4, 2, 1 and a carry row. The machine performs a single bit-level addition; the programmer reads the operands and the result with either an "unsigned mind" or a "signed mind".]

[Figure: 8-bit ripple-carry adder built from eight full adders (FA) with inputs x7..x0 and y7..y0, carries c1..c8, and sum bits s7..s0.]

Out of range flags

Carry flag - C: out-of-range for unsigned numbers
  C = 1 if result > MAX_UNSIGNED or result < 0
  C = 0 otherwise
  where MAX_UNSIGNED = $2^8 - 1$ for 8-bit operands
                       $2^{16} - 1$ for 16-bit operands

Overflow flag - V: out-of-range for signed numbers
  V = 1 if result > MAX_SIGNED or result < MIN_SIGNED
  V = 0 otherwise
  where MAX_SIGNED = $2^7 - 1$ for 8-bit operands
                     $2^{15} - 1$ for 16-bit operands
        MIN_SIGNED = $-2^7$ for 8-bit operands
                     $-2^{15}$ for 16-bit operands

Overflow for signed numbers

Indication of overflow
  Positive + Positive = Negative
  Negative + Negative = Positive

Formula
  $\text{Overflow}_{\text{2's complement}} = x_{k-1}\, y_{k-1}\, \bar{s}_{k-1} + \bar{x}_{k-1}\, \bar{y}_{k-1}\, s_{k-1}$

Addition of Signed and Unsigned Numbers

C = 1 and V = 0
  Unsigned (weights 8 4 2 1)      Signed (weights -8 4 2 1)
     1101    13                      1101    -3
   + 1011    11                    + 1011    -5
    11000    24 ≠ 8                 11000    -8

C = 0 and V = 1
  Unsigned (weights 8 4 2 1)      Signed (weights -8 4 2 1)
     0011     3                      0011     3
   + 0110     6                    + 0110     6
     1001     9                      1001     9 ≠ -7

C = 0 and V = 0
  Unsigned (weights 8 4 2 1)      Signed (weights -8 4 2 1)
     0101     5                      0101     5
   + 1001     9                    + 1001    -7
     1110    14                      1110    -2

C = 1 and V = 1
  Unsigned (weights 8 4 2 1)      Signed (weights -8 4 2 1)
     1100    12                      1100    -4
   + 1011    11                    + 1011    -5
   1 0111    23 ≠ 7                1 0111    -9 ≠ 7

[Figure: Two's complement representation of signed integers]

Overflow for signed numbers (1)

  $\text{Overflow}_{\text{2's complement}} = x_{k-1}\, y_{k-1}\, \bar{s}_{k-1} + \bar{x}_{k-1}\, \bar{y}_{k-1}\, s_{k-1} = c_k \oplus c_{k-1}$

Overflow for signed numbers (2)

  x_{k-1}  y_{k-1}  c_{k-1} | c_k  s_{k-1} | overflow | c_k XOR c_{k-1}
     0        0        0    |  0      0    |    0     |       0
     0        0        1    |  0      1    |    1     |       1
     0        1        0    |  0      1    |    0     |       0
     0        1        1    |  1      0    |    0     |       0
     1        0        0    |  0      1    |    0     |       0
     1        0        1    |  1      0    |    0     |       0
     1        1        0    |  1      0    |    1     |       1
     1        1        1    |  1      1    |    0     |       0

Implementation of Adders in FPGAs

Xilinx FPGA Devices

  Technology | Low-cost  | High-performance
  120/150 nm |           | Virtex 2, 2 Pro
  90 nm      | Spartan 3 | Virtex 4
  65 nm      |           | Virtex 5
  45 nm      | Spartan 6 |
  40 nm      |           | Virtex 6

Altera FPGA Devices

  Technology | Low-cost    | Mid-range | High-performance
  130 nm     | Cyclone     |           | Stratix
  90 nm      | Cyclone II  |           | Stratix II
  65 nm      | Cyclone III | Arria I   | Stratix III
  40 nm      | Cyclone IV  | Arria II  | Stratix IV

General structure of an FPGA

[Figure: programmable logic blocks surrounded by programmable interconnect.]
The Design Warrior's Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
ECE 448 – FPGA and ASIC Design with VHDL

Xilinx Spartan 3 FPGAs

[Figure: array of configurable logic blocks (CLBs); each CLB contains four slices and each slice contains two logic cells.]

[Figure: CLB Structure]

CLB Slice Structure
• Each slice contains two sets of the following:
  • Four-input LUT
    • Any 4-input logic function,
    • or 16-bit x 1 sync RAM (SLICEM only)
    • or 16-bit shift register (SLICEM only)
  • Carry & Control
    • Fast arithmetic logic
    • Multiplier logic
    • Multiplexer logic
  • Storage element
    • Latch or flip-flop
    • Set and reset
    • True or inverted inputs
    • Sync. or async. control

LUT (Look-Up Table) Functionality

Two example functions of the same four inputs, each stored in one LUT:

  x1 x2 x3 x4 | y (example 1) | y (example 2)
  0  0  0  0  |       1       |       0
  0  0  0  1  |       1       |       1
  0  0  1  0  |       1       |       0
  0  0  1  1  |       1       |       0
  0  1  0  0  |       1       |       0
  0  1  0  1  |       1       |       1
  0  1  1  0  |       1       |       0
  0  1  1  1  |       1       |       1
  1  0  0  0  |       1       |       0
  1  0  0  1  |       1       |       1
  1  0  1  0  |       1       |       0
  1  0  1  1  |       1       |       0
  1  1  0  0  |       0       |       1
  1  1  0  1  |       0       |       1
  1  1  1  0  |       0       |       0
  1  1  1  1  |       0       |       0

[Figure: LUT block with inputs x1, x2, x3, x4 and output y.]

• Look-Up tables are primary elements for logic implementation
• Each LUT can implement any function of 4 inputs

Carry & Control Logic

[Figure: one slice with two look-up tables (inputs G4..G1 and F4..F1), two carry & control logic blocks, and two storage elements; slice signals include COUT, CIN, YB, Y, XB, X, BY, SR, F5IN, CLK, and CE.]

Carry & Control Logic in Xilinx FPGAs

  x  y | COUT
  0  0 | y
  0  1 | CIN
  1  0 | CIN
  1  1 | y

Propagate = $x \oplus y$
Generate = $y$
Sum = Propagate $\oplus$ CIN = $x \oplus y \oplus \text{CIN}$

Carry & Control Logic in Spartan 3 FPGAs

[Figure: the carry and control logic split between the LUT and the hardwired (fast) logic.]
[Figure: Simplified View of Spartan-3 FPGA Carry and Arithmetic Logic in One Logic Cell]
[Figure: Simplified View of Carry Logic in One Spartan 3 Slice]
[Figure: Critical Path for an Adder Implemented Using Xilinx Spartan 3 FPGAs]
[Figure: Number and Length of Carry Chains for Spartan 3 FPGAs]

  Delay                                             | Symbol | Spartan 3
  Bottom Operand Input to Carry Out Delay           | TOPCYF | 0.9 ns
  Carry Propagation Delay                           | tBYP   | 0.2 ns
  Carry Input to Top Sum Combinational Output Delay | TCINY  | 1.2 ns

[Table: Critical Path Delays and Maximum Clock Frequencies (taking into account surrounding registers)]

Major Differences between Xilinx Families

                                       | Spartan 3, Virtex 4 | Virtex 5, Virtex 6, Spartan 6
  Look-Up Tables                       | 4-input             | 6-input
  Number of CLB slices per CLB         | 4                   | 2
  Number of LUTs per CLB slice         | 2                   | 4
  Number of adder stages per CLB slice | 2                   | 4

[Figure: Altera Cyclone III Logic Element (LE), Normal Mode]
[Figure: Altera Cyclone III Logic Element (LE), Arithmetic Mode]
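Returning to the Xilinx carry logic and the C and V flags described earlier, the two ideas can be combined in a small software model. This is a sketch under stated assumptions, not the vendor's implementation: `carry_chain_add` and `to_signed` are hypothetical names, each bit position computes Propagate in a "LUT" step and selects COUT with a multiplexer, and the flags are taken as C = c_k and V = c_k XOR c_{k-1}.

```python
def carry_chain_add(x, y, k, cin=0):
    """k-bit addition using the multiplexer-based carry cell.

    Returns (sum_bits, C, V): the k-bit sum, the carry flag c_k, and the
    overflow flag c_k XOR c_(k-1).
    """
    carries = [cin]                      # carries[i] is the carry into bit i
    total = 0
    for i in range(k):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        propagate = xi ^ yi              # Propagate = x XOR y
        generate = yi                    # Generate = y, used only when x == y
        total |= (propagate ^ carries[i]) << i
        carries.append(carries[i] if propagate else generate)
    c_flag = carries[k]
    v_flag = carries[k] ^ carries[k - 1]
    return total, c_flag, v_flag

def to_signed(value, k):
    """Interpret a k-bit pattern as a two's complement integer."""
    return value - (1 << k) if value & (1 << (k - 1)) else value

# The four 4-bit examples from the slides on signed and unsigned addition.
examples = [(0b1101, 0b1011), (0b0011, 0b0110), (0b0101, 0b1001), (0b1100, 0b1011)]
for x, y in examples:
    s, c, v = carry_chain_add(x, y, 4)
    print(f"{x:04b} + {y:04b} = {s:04b}  C={c} V={v}  "
          f"unsigned {x}+{y}->{s}, "
          f"signed {to_signed(x, 4)}+{to_signed(y, 4)}->{to_signed(s, 4)}")
```

The same bit pattern is produced in every case; only the interpretation and the flag that matters (C for unsigned, V for signed) differ.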
[Figure: Altera Stratix III, Stratix IV Adaptive Logic Modules (ALM), Normal Mode]
[Figure: Altera Stratix III, Stratix IV Adaptive Logic Modules (ALM), Arithmetic Mode]

Addition of a Constant

Addition of a constant (1)

General case:
    $x_{k-1}\; x_{k-2}\; \ldots\; x_1\; x_0$    (variable)
  + $y_{k-1}\; y_{k-2}\; \ldots\; y_1\; y_0$    (constant)
  = $s_{k-1}\; s_{k-2}\; \ldots\; s_1\; s_0$

When the lowest-order 1 of the constant is at position h:
    $x_{k-1}\; x_{k-2}\; \ldots\; x_{h+1}\; x_h\; x_{h-1}\; \ldots\; x_0$    (variable)
  + $y_{k-1}\; y_{k-2}\; \ldots\; y_{h+1}\; 1\; 0\; \ldots\; 0$    (constant)
  = $s_{k-1}\; s_{k-2}\; \ldots\; s_{h+1}\; \bar{x}_h\; x_{h-1}\; \ldots\; x_0$

Addition of a constant (2)

[Figure: chain of HA/MHA cells for positions k-1 down to h+1, with inputs x_{k-1}..x_{h+1}, outputs s_{k-1}..s_{h+1}, and carry-out c_k; bits x_h, x_{h-1}, ..., x_0 bypass the chain.]

  If y_i = 0: Half-adder (HA)
  If y_i = 1: Modified half-adder (MHA)

Modified half-adder

$x + y + 1 = (c\,s)_2$

  x  y | c  s
  0  0 | 0  1
  0  1 | 1  0
  1  0 | 1  0
  1  1 | 1  1

Incrementer

[Figure: chain of half-adders (HA) on inputs x_{k-1}..x_1 producing s_{k-1}..s_1 and carry-out c_k, driven from x_0.]

Decrementer

[Figure: chain of modified half-adders (MHA) on inputs x_{k-1}..x_1 producing s_{k-1}..s_1 and carry-out c_k, driven from x_0.]

Bit-Serial & Digit-Serial Adders

[Figure: bit-serial adder with one-bit inputs x_i and y_i, one-bit output s_i, a stored carry c_{i+1} initialized to c_0, and start and clk controls.]
[Figure: digit-serial adder with d-bit inputs x_i and y_i, d-bit output s_i, a stored carry c_{i+1} initialized to c_0, and start and clk controls.]

Asynchronous Adders

Possible solutions to the carry propagate problem
1. Detect the end of propagation rather than wait for the worst-case time
2. Speed up propagation via
   • look-ahead
   • carry skip
   • carry select, etc.
3. Limit carry propagation to within a small number of bits
4. Eliminate carry propagation through the redundant number representation

Analysis of carry propagation

  Probability of carry generation   = 1/4   ($x_i y_i = 11$)
  Probability of carry propagation  = 1/2   ($x_i y_i = 01$ or $10$)
  Probability of carry annihilation = 1/2   ($x_i y_i = 00$ or $11$)
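The three probabilities can be checked by enumerating the four equally likely values of one bit pair. This is a minimal sketch; the helper name `classify` is hypothetical, and "annihilation" follows the slide's usage, meaning any bit pair (00 or 11) at which an incoming carry stops propagating.

```python
from fractions import Fraction
from itertools import product

def classify(x, y):
    """Carry events at one bit position with operand bits x and y."""
    return {
        "generate": x == 1 and y == 1,   # 11 creates a carry
        "propagate": x != y,             # 01 or 10 passes an incoming carry on
        "annihilate": x == y,            # 00 or 11 ends an incoming carry chain
    }

pairs = list(product((0, 1), repeat=2))
counts = {"generate": 0, "propagate": 0, "annihilate": 0}
for x, y in pairs:
    for event, happened in classify(x, y).items():
        counts[event] += int(happened)

# Each of the four bit pairs is assumed equally likely (uniform random operands).
probabilities = {event: Fraction(n, len(pairs)) for event, n in counts.items()}
print(probabilities)
```

Note that the pair 11 counts both as generation and as annihilation, which is why the three values do not sum to 1.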
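[Figure: operand bits in positions j, j-1, ..., i+1, i. Position i holds 11 (the carry is generated), positions i+1 to j-1 hold 01 or 10 (propagation), and position j holds 11 or 00 (annihilation).]

Probability of carry propagating from position i to position j

  $\left(\tfrac{1}{2}\right)^{j-i-1} \cdot \tfrac{1}{2} = \left(\tfrac{1}{2}\right)^{j-i}$

  where $\left(\tfrac{1}{2}\right)^{j-i-1}$ is the probability of propagation through positions i+1 to j-1 and $\tfrac{1}{2}$ is the probability of annihilation at position j.

Expected length of the carry chain that starts at position i (1)

  $\text{Expected length}(i, k) = \sum_{j=i+1}^{k-1} (j-i)\left(\tfrac{1}{2}\right)^{j-i} + (k-i)\left(\tfrac{1}{2}\right)^{k-1-i}$

  $(j-i)$: length of the carry chain
  $\left(\tfrac{1}{2}\right)^{j-i}$: probability of the given length
  $(k-i)$: distance till the end of the adder
  $\left(\tfrac{1}{2}\right)^{k-1-i}$: probability of propagation till the end of the adder

Expected length of the carry chain that starts at position i (2)

  $\text{Expected length}(i, k) = 2 - 2^{-(k-i-1)}$

  For $i \ll k$, the expected length of the carry propagation is approximately 2.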
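The closed form can be checked against the sum numerically. The sketch below uses exact rational arithmetic so that the comparison is not blurred by rounding; the function names are hypothetical and the code only evaluates the two expressions given above.

```python
from fractions import Fraction

def expected_length_sum(i, k):
    """Evaluate the sum form of Expected length(i, k) term by term."""
    half = Fraction(1, 2)
    # Chains that stop at some position j inside the adder.
    stops_inside = sum((j - i) * half ** (j - i) for j in range(i + 1, k))
    # Chains that propagate all the way to the end of the adder.
    reaches_end = (k - i) * half ** (k - 1 - i)
    return stops_inside + reaches_end

def expected_length_closed(i, k):
    """Closed form: 2 - 2^-(k-i-1)."""
    return 2 - Fraction(1, 2 ** (k - i - 1))

# The two forms agree for every starting position in several adder widths.
for k in (4, 8, 16, 32):
    for i in range(k):
        assert expected_length_sum(i, k) == expected_length_closed(i, k)

print(float(expected_length_closed(0, 32)))
```

For a 32-bit adder and a chain starting at position 0, the printed value is within 2^-31 of 2, which is the sense in which the expected length "is 2" for i much smaller than k.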