VLSI Design II
Introduction
[Adapted from presentation by Kia Bazargan, University of Minnesota]
EE415 VLSI Design

Section Outline
Administrative Issues
Semiconductor industry trends
Chip implementation methodologies
Design methodologies

What is This Course All About?
Prerequisite
  » Basic CMOS design
  » Static/dynamic circuit design
  » CAD (Computer Aided Design) tools
What is different from “VLSI Design I”?
  » Higher level of design (closer to architecture)
  » Emphasis on performance, processor cores, fault tolerance
What is covered?
  » Sequential logic
  » Arithmetic circuits and subsystem design
  » Parasitics, timing, synchronization, pipelining
  » Memories
  » Test and testability
  » New issues and design techniques

Course Outline
Sequential Logic
  » Static sequential circuits
  » Dynamic sequential circuits
  » Non-bistable sequential circuits
CMOS Designs
  » Arithmetic & logic unit (ALU)
    – Bitwise operations
    – Datapath layout
  » Adders
    – Basic adders: carry propagation, Carry Look-ahead, Manchester Carry Chain
    – More complex adders: Carry Save Adder, Brent-Kung
    – Fast adders: Carry-Select adder, Wallace tree

Course Outline (cont)
CMOS Designs
  » Multipliers
    – Shift/Add multiplication
    – Booth encoding
    – Multiplication by constants
    – Floating point multiplication
Interconnect and parasitics
  » Parasitic capacitances
  » Parasitic resistances
  » Parasitic inductances
  » Packaging

Course Outline (cont)
Timing
  » Clock generation
  » Clock skew
  » Self-timed design
  » Synchronization
  » Pipelining
  » Asynchronous design
System Architecture and Power
  » Low Power Design in CMOS

Course Outline (cont)
CMOS Designs (cont)
  » Shift/Rotate operations
  » Memories
    – Memory cells: static and dynamic
    – Memory arrays: address decoders, sensors and amplifiers
Test and testability
  » Fault models
  » Design techniques: scan design, built-in self-test
New design techniques/platforms
  » CORDIC algorithms
  » Bit-serial computations
  » [Recent circuit examples]

IC Products
Processors
  » CPU, DSP, Controllers
Memory chips
  » RAM, ROM, EEPROM
Analog
  » Mobile communication, audio/video processing
Programmable
  » PLA, FPGA
Embedded systems
  » Used in cars, factories
  » Network cards
System-on-chip (SoC)
Images: amazon.com

IC Product Market Shares
[Figure: IC product market shares; labels: Analog, Programmability]
Source: Electronic Business

Brick Wall of Nanotechnology
[Figure]

Semiconductor Industry Growth Rates
[Figure]
Source: http://www.icinsight.com/ (McClean Report)

More Demand for EDA
CAE = Computer Aided Engineering
[Figure]
Source: http://www.edat.com/edac

Growth in System Size
CAGR = Compound Annual Growth Rate
[Figure]
Source: http://www.edat.com/edac

Example: Intel Processor Sizes
[Figure: processor dies by silicon process technology (1.5 µm, 1.0 µm, 0.8 µm, 0.6 µm, 0.35 µm, 0.25 µm); labels: Intel386™ DX Processor, Intel486™ DX Processor, Pentium® Processor, Pentium® Pro & Pentium® II Processors]
Source: http://www.intel.com/

Implementation Methodologies

Digital Circuit Implementation Approaches
Custom
Semi-custom
  » Cell-Based
    – Standard Cells
    – Compiled Cells
    – Macro Cells
  » Array-Based
    – Pre-diffused (Gate Arrays)
    – Pre-wired (FPGA)
[© Prentice Hall]

Custom Design
Using custom design we can get exactly what we want. However:
  » Complex to design
  » Takes weeks to fabricate
  » High design costs
  » High overhead (nonrecurring engineering, NRE) costs
  » How do we automate the mapping?
[© Hauck]

Standard Cells
Use regular layout
Can automate the mapping process, but
  » Takes weeks to fabricate
  » No economies of scale
[Figure: rows of standard cells (CELL 1 to CELL 16) between PWR and GND rails, separated by routing channels]
[© Hauck]

Combined Standard Cell and Full Custom
Use full custom for regular structures & critical paths
Standard cells handle complex logic & non-critical logic
[© Hauck]

Macrocell Design Methodology
Floorplan: defines overall topology of design, relative placement of modules, and global routes of busses, supplies, and clocks
[Figure labels: Macrocell, Interconnect Bus, Routing Channel]

Macrocell-Based Design Example
[Figure: video-encoder chip; labels: SRAM, SRAM, Data paths, Standard cells] [Brodersen92]

Mask-Programmable Gate Array (MPGA)
Prefabricate all but the metal layers
[© Hauck]

Discrete Components
Prefabricate lots of small, simple parts. Wire them together.
[Figure: discrete flip-flops (D, Q) wired together]
[© Hauck]

Gate Array: Sea-of-gates
[Figure: rows of uncommitted cells separated by routing channels; labels: polysilicon, metal, possible contact, VDD, GND; committed cell (4-input NOR) with inputs In1 to In4 and output Out; uncommitted cell]

Sea-of-gate Primitive Cells
Prefabricate all but the metal layers and the contacts
[Figure: PMOS/NMOS primitive cells, one using oxide-isolation and one using gate-isolation]

Sea-of-gates
[Figure: chip with random logic and memory subsystem; LSI Logic LEA300K (0.6 µm CMOS)]

Programmable Logic Devices
Categories of prewired arrays (or field-programmable devices):
  » Fuse-based (program-once)
  » Non-volatile EPROM based
  » RAM based
Recently:
  » VPGA (Via-Programmable Gate Array)
  » Structured ASIC
[© Prentice-Hall]

Programmable Logic Devices
[Figure: PAL, PLA, and PROM structures]
[© Prentice-Hall]

Programming Technologies
Mask-programmed
Antifuse
EPROM
EEPROM
SRAM
[Figure labels: polysilicon, field oxide, N+ diffusion, ONO dielectric, access gate, floating gate, n+ source, n+ drain, P-type silicon, Write, Q, ~Q]
[© Hauck]

RAMs, ROMs
Given an 8 Kbit RAM/ROM in a 1K × 8-bit organization:
  » 10 address lines
  » Can implement 8 arbitrary 10-input functions (but inefficiently)
[Figure: ROM with inputs I1, I2, I3 addressing locations 000 to 111, which hold entries A to H]
[© Hauck]

Field Programmable Gate Arrays (FPGAs)
Logic cells embedded in a general routing structure
Logic cells usually contain:
  » 5-input function calculator
  » Flip-flops
All features electronically (re)programmable
[Figure: array of RAM-based logic cells]
[© Hauck]

Multi-Mode Reconfigurable Systems
Tektronix PhaserCard printer controllers
  » Different configurations for different printers
Andromeda Systems disk controller
  » Field upgrades performed by modem
Radius pivoting monitor
  » Different configurations for landscape & portrait
Honeywell tape drive
  » Different configurations for read & write operations
[Figure: ROM holding Config1 to Config4 for an FPGA]
[© Hauck]

Microprocessors & Microcontrollers
Microcontrollers are simple 1-chip computers optimized for embedded control
Cheap, can handle complex control flow (relatively slowly)
[Figure: microcontroller (CPU, RAM, ROM, I/O) connected to a sensor and an actuator]
[© Hauck]

Digital Signal Processors (DSPs)
Fast multiply-accumulate for signal filtering, etc.
[Figure: DSP block diagram; labels: program controller, PC, program ROM, data RAM, registers, muxes, multiplier, shifters, ALU, accumulator, I/O controller, address lines, program bus, data bus]
[© Hauck]

Implementation Alternatives
Full Custom
Standard Cells
Gate Arrays
Field-Programmable Gate Arrays (FPGAs)
Programmable Logic Devices
Discrete Components

Design synthesis

Circuit synthesis
derivation of the transistor schematics from logic functions
  - complementary CMOS
  - pass transistor
  - dynamic
  - DCVSL (differential cascode voltage switch logic)
transistor sizing
  - performance modeling using RC equivalent circuits
  - layout generation
synthesis not popular due to designers' reluctance

Logic synthesis
state transition diagrams, FSM, schematics, Boolean equations, truth tables, and HDL used
synthesis
  - combinational or sequential
  - multi-level, PLA, or FPGA
logic optimization for
  - area, speed, power
  - technology mapping

Logic optimization
Espresso - two-level minimization tool (UCB)
state minimization and state encoding
MIS - multilevel logic synthesis (UCB)
Example:
  S = A ⊕ B ⊕ Ci
  Co = AB + ACi + BCi

Architecture synthesis
behavioral or high-level synthesis
optimizing translation, e.g. pipelining
Cathedral and HYPER tools
HYPER tutorial and synthesis example: http://infopad.eecs.berkeley.edu/~hyper

IC Design Steps (cont.)
[Figure: design flow; labels: Specifications, High-level Description (behavioral VHDL, C), Structural Description (structural VHDL)]
Figs. [©Sherwani]

IC Design Steps (cont.)
[Figure: design flow; labels: Specifications, High-level Description, Structural Description, Synthesis, Logic Description, Technology Mapping, Gate-level Design, Physical Design, Placed & Routed Design, Fabrication, Packaging]

IC Design Steps (cont.)
[Figure: same design flow, with an example logic description]
  X = (AB*CD) + (A+D) + (A(B+C))
  Y = (A(B+C) + AC + D + A(BC+D))
Figs. [©Sherwani]

The Big Picture: IC Design Methods
Design Methods
  » Full Custom
  » Standard Cell Library Design
  » ASIC – Standard Cell Design
  » RTL-Level Design
[Figure axes: Cost / Development Time, Quality, % Companies involved]

Optimization: Levels of Abstraction
Algorithmic
  » Encoding data, computation scheduling, balancing delays of components, etc.
Gate-level
  » Reduce fan-out, capacitance
  » Gate duplication, buffer insertion
Layout
  » Move transistors driven by late inputs closer to the output
[Figure axes: Level of detail, Effectiveness]

Full Custom Design
[Figure labels: Structural/RTL Description (Ctrl, Mem, Reg File, Comp. Unit), Component Design, Place & Route; floorplan blocks: I/O, PLA, comp, RAM, A/D]
Floorplan [©Sherwani]
Layouts [© Prentice Hall]

ASIC Design
[Figure labels: HDL Programming, Structural/RTL Description (Ctrl, Mem, Reg File, Comp. Unit), Cell library, floorplan of cells A to D]
HDL fragments shown on the slide:
    P_Inp: process (Reset, Clock)
    begin
      if (Reset = '1') then
        sum <= (others => '0');
        input_nums_read <= '0';
        sum_ready <= '0';

    add82 : kadd8 port map (a => add_i1, b => add_i2, ci => carry, s => sum_o);
    Mult_i1 <= sum_o(7 downto 0);
Floorplan [©Sherwani]

More Issues to Consider
Area/speed trade-off
[Figure: two plots versus N, one of tp (sec) and one of Area (mm²); curves labeled static, look-ahead, mirror, manchester, bypass, select]
[© Prentice Hall]

More Issues to Consider (cont.)
Aspect ratio, area budgets, datapath layout
Power and clock grid
  » Approach I: Signal and power lines parallel
  » Approach II: Signal and power lines perpendicular
[Figure labels: Wires (M1), Signal wires (M2), Control wires (M1), Well, VDD, GND]
Figures: [© Prentice Hall]

Datapath Layout Example: Adder
[Figure: standard cell layout and bit-slice cell layout] [WE92] p.521

Architecture of a CPU
[Figure labels: Control, Mem (read/write), Register File, Data path; flags: overflow, zero, etc.]

Arithmetic and Logic Unit (ALU)
Functions
  » Arithmetic (add, sub, inc, dec)
  » Logic (and, or, not, xor)
  » Comparison (<, >, <=, >=, !=)
Control signals
  » Function selection
  » Operation mode (signed, unsigned)
Output
  » Operation result (data)
  » Flags (overflow, zero, negative)

Simple ALU Example
Tile identical processing elements
[Figure: bit slices Bit 3 to Bit 0; labels: Register, Adder, Shifter, Multiplexer, Data in, Data Out, Control]
[© Prentice Hall]

FPGA Architecture - Layout
Island FPGAs
  » Array of functional units
  » Horizontal and vertical routing channels connecting the functional units
  » Versatile switch boxes
  » Example: Xilinx, Altera
Row-based FPGAs
  » Like standard cell design
  » Rows of logic blocks
  » Routing channels (fixed width) between rows of logic
  » Example: Actel FPGAs

FPGA Architecture: Functional Units
Functional units
  » RAM blocks (Xilinx): implement function truth table
  » Multiplexers (Actel): build Boolean functions using muxes
  » Logic gates, flip-flops: such as carry chains; used for high-performance computations
[Figure labels: Address lines (input), output]

Programmable Switch Elements
Used in connecting:
  » The I/O of functional units to the wires
  » A horizontal wire to a vertical wire
  » Two wire segments to form a longer wire segment

Programmable Switch Elements: Implementation
SRAM connected to the gate of a transistor (Xilinx)
Fuse / Anti Fuse (Actel)
[Figure: symbol and implementation of each switch type]
Note: Switches degrade the signals and slow them down

Routing Channels
Note: fixed channel widths (tracks)
Channel -> track -> segment
Segment length?
  » Long: carries the signal farther with fewer “concatenation” switches, but might waste track
  » Short: suits local connections, slow for longer connections
[Figure labels: channel, track, segment]

Switch Boxes
Ideally, provide switches for all possible connections
Trade-off:
  » Too many switches:
    – Large area
    – Complex to program
  » Too few switches:
    – Cannot route signals
[Figure: one possible solution]

Xilinx 4000 Operation Example
4-bit ripple-carry adder

Programming
How to access all programmable elements?
  » Pin limitation
    - Chain all config bits in a shift register or use pipelining
  » Feasibility of access (Actel example)
    - Partition the elements into subsets, treat each as a memory block
    - Consider the problem when designing the FPGA architecture
    - Carefully schedule the programming
Are there “invalid” configurations?
  - Yes! If two functional units drive the same line
  - Avoid at architectural design or when programming

Programming (cont.)
Too much detail! (tens of bits for each cell/switch block)
  » Automated placement, routing and programming
  » Design a simple structure so that tools can handle it
Partially reconfigurable?
  » Extra control circuitry, more flexibility
  » Runtime reconfigurable?
    (avoid conflicts with running components)

Pros and Cons
General architecture
  » Slower than ASIC
  » Less logic capacity (solution: reuse silicon area through reconfiguration)
  » Flexible
Customization helps
  » Instantiate many small processing elements → parallel processing
  » Some operations faster (e.g., constant multiplication, bit-wise operations)
  » More operations in parallel → reduce clock speed → reduce power consumption

New Challenges
Balance between elements
  » Data memory
  » Configuration memory
  » Special-purpose functional units
  » Fine- vs. coarse-grain functional units
Communication bandwidth
Fast automatic tools
Versatile libraries
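
Illustration: lookup-table functional units
A minimal Python sketch of the lookup-table idea from the FPGA functional-unit slides: a RAM-based functional unit stores a truth table and uses its inputs as address lines, and four full-adder cells chained through the carry give a 4-bit ripple-carry adder. The names `make_lut`, `lut_read`, `SUM_LUT`, `CARRY_LUT`, and `ripple_add4` are illustrative assumptions and do not come from the slides or from any vendor tool. The tables are filled from the full-adder equations S = A ⊕ B ⊕ Ci and Co = AB + ACi + BCi.

```python
from itertools import product


def make_lut(func, n_inputs):
    """Fill a truth table: one stored bit per input combination."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]


def lut_read(lut, *bits):
    """Use the input bits as the address lines of the table (first bit is MSB)."""
    addr = 0
    for b in bits:
        addr = (addr << 1) | b
    return lut[addr]


# Two 3-input tables model one full-adder cell (hypothetical names).
SUM_LUT = make_lut(lambda a, b, ci: a ^ b ^ ci, 3)
CARRY_LUT = make_lut(lambda a, b, ci: (a & b) | (a & ci) | (b & ci), 3)


def ripple_add4(x, y, carry_in=0):
    """Add two 4-bit values with four chained full-adder cells."""
    carry, total = carry_in, 0
    for i in range(4):
        a, b = (x >> i) & 1, (y >> i) & 1
        # The sum bit uses the incoming carry; the carry is updated afterwards.
        total |= lut_read(SUM_LUT, a, b, carry) << i
        carry = lut_read(CARRY_LUT, a, b, carry)
    return total, carry
```

For example, `ripple_add4(9, 8)` returns `(1, 1)`: the 4-bit sum wraps around to 1 and the carry-out is set. Real FPGA logic cells typically add dedicated carry logic so the carry does not have to pass through a general-purpose table at every bit; this sketch ignores that and models only the truth-table behaviour.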