EEE5026; 943/U0280
Physical Design for Nanometer ICs
張耀文 Yao-Wen Chang
[email protected]
http://cc.ee.ntu.edu.tw/~ywchang
Graduate Institute of Electronics Engineering
Department of Electrical Engineering
National Taiwan University
Spring 2017

Administrative Matters
․ Time/Location: Thursdays 2:20 pm--5:30 pm; BL-114
․ Instructor: Yao-Wen Chang
․ E-mail: [email protected]
․ URL: http://cc.ee.ntu.edu.tw/~ywchang
․ Office: BL-428. (Tel) 3366-3556; (Fax) 2364-1972
․ Office Hours: Wednesdays 5--6 pm; other times by appointment
․ Teaching Assistant: Yu-Sheng Lu ([email protected]); office hours: 12:30--1:30 pm, Wednesdays
․ Prerequisites: data structures, algorithms, and logic design
․ Required Text: Either of the following two books:
    Wang, Chang, and Cheng (Ed.), Electronic Design Automation: Synthesis, Verification, and Test, Morgan Kaufmann, 2009
    Sait and Youssef, VLSI Physical Design Automation: Theory and Practice, World Scientific Publishing Co., 1999
․ References: Selected reading materials from recent publications

Unit 1, Y.-W. Chang

Teaching Assistant
․ Yu-Sheng Lu 呂祐昇
․ Email: [email protected]
․ Office: BL-406; Tel: 3366-3700 # 6406
․ Office Hours: 12:30--1:30 pm, Wednesdays
․ 1st-year Ph.D. student

Course Objectives
․ Study techniques/algorithms for physical design (converting a circuit description into a geometric description) and their comparisons
․ Study nanometer process/electrical effects and their impacts on the development of physical design tools
․ Study problem-solving (-finding) techniques!!!
[Figure: solutions S1--S5 mapped to problems P1--P6]

Course Contents
․ VLSI design flow/styles and technology roadmap
․ Physical design processes
    Partitioning
    Floorplanning
    Placement
    Routing (global, detailed, clock, and power/ground routing)
    Post-layout optimization
․ Signal/power integrity: crosstalk, IR drop
․ Timing: timing modeling, performance-driven design
․ Design methodology: large-scale design, interconnect-centric design flow, buffer/wiring planning
․ Design for manufacturability & reliability: process variation, antenna effect, redundant via, optical proximity correction (OPC), chemical mechanical polishing (CMP), multiple patterning, e-beam, extreme ultraviolet (EUV), directed self-assembly (DSA), nanowire, electromigration, thermal issues, etc.

Grading Policy
․ Grading:
    Homework assignments + quizzes: 25%
    Programming assignments + lab: 25%
    One in-class open-book, open-note exam: 30% (June 22)
    Final project + presentation + demo: 20% (due June 29)
        A 1-page project proposal is due in class on May 18
        Could be research work, implementation, and/or literature survey
        Default project: Problem B, C, or E of the 2017 IC/CAD Contest at http://cad-contest-2017.el.cycu.edu.tw/ (E for undergraduate students)
        Teamwork is permitted (1--3 persons; preferably 2 persons)
    Bonus for class participation
․ Homework: 30% per day penalty for late submission
․ WWW: http://cc.ee.ntu.edu.tw/~ywchang/Courses/PD/pd.html
․ Academic Honesty: Avoid cheating at all costs

Unit 1: Introduction
․ Course contents:
    Introduction to VLSI design flow/styles
    Introduction to physical design automation
    Semiconductor technology roadmap
․ Readings
    W&C&C: Chapter 1
    S&Y: Chapter 1
[Figure: physical design and fabrication stages]

IC Design & Manufacturing Process
[Figure: IC design and manufacturing process]

From Wafer to Chip
[Figure labels: 2, 4, 6, 8-inch wafers; 12-inch wafer; 18-inch wafer; 8-inch vs. 1-inch ingot; wafer dicing; wire bonding; chips; Apple A10 die (iPhone 7), TSMC 16nm FinFET, 3.3B transistors]

Die, Package, and Board
[Figure labels: Apple A9 die for iPhone 6s (1.85GHz; 5B+ transistors), TSMC 16nm FinFET 104.5 mm^2, Samsung 14nm FinFET 96 mm^2; packages; boards]

IC Design Considerations
․ Several conflicting considerations:
    Complexity: large number of devices/transistors
    Power: low-power consumption
    Performance: high-speed requirements
    Cost: die area, packaging, testing, etc.
    Time-to-market: about a 15% gain for early birds
    Others: manufacturability, reliability, testability, etc.

“Moore’s” Law: Driving Technology Advances
․ Logic capacity doubles per IC at a regular interval (say, 18 months).
    G. Moore: Logic capacity doubles per IC every two years (1975).
    D. House: Computer performance doubles every 18 months (1975)
[Figure labels: Intel µP 4004, 8086, 80386, Pentium Pro, Pentium 4, Itanium 2; 4Gb]

Design Productivity Crisis
[Figure: logic transistors per chip and productivity in transistors per staff-month, 1980--2015; 58%/yr compound complexity growth rate (the complexity limiter) versus 21%/yr compound productivity growth rate]
․ Human factors may limit design more than technology.
․ Keys to solving the productivity crisis: CAD (tool & methodology), hierarchical design, abstraction, IP reuse, platform-based design, etc.

“Old” (1997) Technology Roadmap for Semiconductors
․ Source: International Technology Roadmap for Semiconductors (easier to see the past & trend with the older version; for more recent update, see http://www.itrs.net/).
․ Deep submicron technology: node (feature size) < 0.25 µm.
․ Nanometer Technology: node < 100 nm.
․ 14/16 nm technology was in production in 2015.
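
The two compound growth rates quoted on the Design Productivity Crisis slide (58%/yr for complexity, 21%/yr for productivity) can be turned into doubling times and a widening gap with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes both rates stay constant over the period.

```python
import math


def doubling_time_years(annual_growth):
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


def gap_factor(complexity_growth, productivity_growth, years):
    """How much faster complexity has grown than productivity after `years`."""
    return ((1 + complexity_growth) / (1 + productivity_growth)) ** years


COMPLEXITY_GROWTH = 0.58    # from the slide: 58%/yr compound complexity growth
PRODUCTIVITY_GROWTH = 0.21  # from the slide: 21%/yr compound productivity growth

if __name__ == "__main__":
    print("complexity doubles every %.2f years"
          % doubling_time_years(COMPLEXITY_GROWTH))
    print("productivity doubles every %.2f years"
          % doubling_time_years(PRODUCTIVITY_GROWTH))
    print("gap after 10 years: %.1fx"
          % gap_factor(COMPLEXITY_GROWTH, PRODUCTIVITY_GROWTH, 10))
```

Under these assumptions, 58%/yr corresponds to a doubling time of roughly 1.5 years, in line with the 18-month interval on the Moore's Law slide, while the gap between complexity and productivity grows to roughly 14x over a decade.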

Nanometer Design Challenges
․ Apple A10 (iPhone 7): feature size = 16 nm FinFET, µP frequency 2.34 GHz, die size 125 mm^2, transistor count per chip 3.3B, wiring level 10+ layers, supply voltage < 1.0 V, power consumption 20 W (?)
    Feature size ↓: sub-wavelength lithography (impacts of process variation)? reliability? noise? wire coupling?
    Frequency ↑, dimension ↑: interconnect delay? electromagnetic field effects? timing closure?
    Chip complexity ↑: large-scale system design methodology?
    Supply voltage ↓: signal integrity (noise, IR drop, etc.)?
    Wiring level ↑: manufacturability? yield? 3D layout?
    Power consumption/density ↑ (?): power & thermal issues?

Design Complexity Increases Dramatically!!
[Figure panels: mixed-size placement; routing & interconnect]

Power Is a Key Limiting Factor for IC Design!
․ Power density increases exponentially!
[Figure: power density in Watts/cm^2 (1 to 1000) for i386, i486, Pentium, Pentium Pro, Pentium II, Pentium III, Pentium 4, Itanium 2, and Itanium 2-DC, compared with a hot plate, a nuclear reactor, and a rocket nozzle. Annotations: power doubles every 4 years; 5-year projection: 200 W total, 125 W/cm^2; power & performance trade-off!!]
Source: Fred Pollack, “New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies,” 1999 Micro32 Conference keynote. Courtesy Avi Mendelson, Intel.

Interconnect Dominates Circuit Performance!!
[Figure: delay (ps, 10 to 70) versus technology node (650, 500, 350, 250, 180, 150, 100, 70 nm) for gate delay, interconnect delay, and worst-case interconnect delay due to crosstalk; CW and CS mark the wire-to-wire and wire-to-substrate capacitances. Source: Synopsys]
․ At ≦ 0.18 μm, wire-to-wire capacitance dominates (CW >> CS)

Manufacturing with Optical Lithography
․ Patterns on a mask are transferred onto a wafer
[Figure labels: light source, illumination lens, mask, projection lens, immersion (water), wafer]
    R = k1 · λ / NA, e.g., 0.25 × 193 nm / 1.35 ≈ 36 nm
    R: resolution; k1: resolution constant (>= 0.25); λ: wavelength; NA: numerical aperture = f(lens, refractive index)
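
The resolution formula on the optical lithography slide is easy to check numerically. The sketch below reproduces the slide's 193 nm immersion example; the function name is hypothetical, and the second call uses a numerical aperture of 0.33 purely as an assumed value for a 13.5 nm EUV source, since the deck does not give one.

```python
def resolution_nm(k1, wavelength_nm, numerical_aperture):
    """Minimum printable feature size R = k1 * lambda / NA, in nanometers."""
    if numerical_aperture <= 0:
        raise ValueError("numerical aperture must be positive")
    return k1 * wavelength_nm / numerical_aperture


if __name__ == "__main__":
    # Values from the slide: k1 = 0.25, lambda = 193 nm, NA = 1.35 (immersion).
    print("193i: %.1f nm" % resolution_nm(0.25, 193.0, 1.35))
    # Assumption (not from the slide): NA = 0.33 for a 13.5 nm EUV source.
    print("EUV:  %.1f nm" % resolution_nm(0.25, 13.5, 0.33))
```

The first call gives about 36 nm, matching the slide. Because R scales linearly with wavelength, features smaller than this limit need either a shorter wavelength or additional patterning steps.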

Sub-Wavelength Lithography Gap
․ Sub-wavelength lithography: use light of larger wavelength (193nm) to print features of smaller sizes
[Figure: sub-wavelength lithography gap, with EUV and e-beam marked; S. Borkar, MICRO’04]

Technology Roadmap [Aitken, 2014]
[Figure: log(complexity) versus year (1975--2025) with tracks for patterning, interconnect, and transistors, node markers at 10nm, 7nm, 5nm, and 3nm, and a "We Are Here" marker]
Source: R. Aitken @ ISPD’14 Keynote & S. Segars @ 2014 Kaufman Award dinner (with revision by Y.-W. Chang)

Most Expected Patterning Technologies
    Multiple patterning lithography (MPL)
    Extreme ultraviolet lithography (EUVL)
    Electron beam lithography (EBL)
    Directed Self-Assembly (DSA)
Each technology has different difficulties and requires solutions for a breakthrough

Litho-Etch-Litho-Etch (LELE) Double Patterning
․ Pro: Simpler layout decomposition into masks
․ Con: Overlay error with misalignment between masks
[Figure: a pattern with 20nm features at 40nm pitch is split into mask 1 and mask 2, each at 80nm pitch; the 1st and 2nd exposure-etching processes are shown on a stack of mask, photoresist, target film, and substrate]

Extreme Ultraviolet Lithography (EUVL)
․ EUVL is the most invested next-generation lithography technology
    Its wavelength is only 13.5 nm
    Reflective optical components and masks are used
[Figure labels: EUV source, reflective illuminator optics (mirrors), reflective mask, reflective projection optics (mirrors), wafer]

ASML EUV Lithography System
[Figure labels: reflective mask, EUV source, vacuum chamber, mirror, wafer. Source: ASML]

Electron Beam Lithography (EBL)
․ EBL is a maskless next-generation lithography technology
    Maskless: no more diffraction limitation of light
    Can define very fine patterns
[Figure: Mapper Lithography]
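
The "layout decomposition into masks" mentioned on the LELE slide is commonly modeled as two-coloring a conflict graph: features that sit too close together to share a mask are joined by an edge, and the layout is decomposable if the graph is bipartite. The sketch below is a minimal illustration of that idea, not the method of any particular tool. Features are simplified to 1-D intervals, and the function names, feature names, and the 40 nm same-mask spacing threshold are all assumptions made for the example.

```python
from collections import deque


def build_conflict_graph(features, min_same_mask_spacing):
    """Connect every pair of features whose gap is below the spacing limit.

    `features` maps a name to an (x_min, x_max) interval in nanometers.
    """
    names = list(features)
    graph = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            a_lo, a_hi = features[a]
            b_lo, b_hi = features[b]
            gap = max(b_lo - a_hi, a_lo - b_hi)  # negative if they overlap
            if gap < min_same_mask_spacing:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def assign_masks(graph):
    """Two-color the conflict graph by BFS; return None if it is impossible."""
    mask = {}
    for start in graph:
        if start in mask:
            continue
        mask[start] = 1
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in mask:
                    mask[v] = 3 - mask[u]  # alternate between mask 1 and 2
                    queue.append(v)
                elif mask[v] == mask[u]:
                    return None  # odd cycle: not decomposable into two masks
    return mask


if __name__ == "__main__":
    # Hypothetical layout: 20 nm wide wires at a 40 nm pitch.
    wires = {"w0": (0, 20), "w1": (40, 60), "w2": (80, 100), "w3": (120, 140)}
    print(assign_masks(build_conflict_graph(wires, 40)))
```

With these assumed numbers, neighboring wires conflict and alternate between the two masks, so each mask ends up at twice the original pitch. When `assign_masks` returns `None`, the layout would need a third mask or a layout change.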

Directed Self-Assembly (DSA)
․ Block copolymer DSA for contact/via patterning
    Groups of contacts/vias are patterned by guiding templates with traditional 193i lithography
[Figure labels: self-assembly block copolymer; topographical and chemical patterns; mask; template; contacts; smaller and denser patterns. Captions: contact patterning with DSA; contact patterns formed by various DSA templates. Xiao, et al., ASP-DAC’15]

Cut/Via Patterning with DSA
․ A large template can be used to pattern multiple close contacts even for sub-7nm nodes
[Figure labels: layout, close vias, vias, templates; 22nm, 7nm. Xiao, et al., DAC’14]

Reliability Becomes a 1st-Order Effect!!
․ Reliability with 10-layer metal?
[Figure: cross-section of metal layers m1--m5 over transistors (s, g, d) on a Si substrate. Source: Patrick Groeneveld]

“3D” Integration Adds Complexity!
[Figure: 3D IC with heat sink, tiers 1--3, device and metal layers, inter-layer dielectric, substrate, thermal TSVs, signal TSVs, and TSV-IO; a TSV layout with device, dielectric, substrate, and routing region; and a 2.5D interposer (Xilinx Virtex-7 FPGA)]

3D Transistor: FinFET
․ Lower leakage power
․ Performance gain at lower voltage
․ Higher drive current
[Figure: FinFET with 3 fins. Source: Intel]

Traditional VLSI Design Cycles
1. System specification
2. Functional design
3. Logic synthesis
4. Circuit design
5. Physical design
6. Fabrication
7. Packaging
․ Other tasks involved: verification, simulation, testing, etc.
․ Design metrics: area, speed, power dissipation, manufacturability, reliability, testability, design time, etc.
․ Design revolution: interconnect (not gate) delay dominates circuit performance in the deep submicron era.
    Interconnects are determined in physical design.
    Interconnections should therefore be considered in early design stages.

Traditional VLSI Design Cycle
[Figure: traditional VLSI design cycle, annotated with verification and simulation steps]

Traditional VLSI Design Flow (Cont'd)
[Figure: traditional VLSI design flow, continued]

Physical Design (PD)
[Figure: physical design followed by fabrication]
․ PD converts a circuit description into a geometric description.
․ The description is used to manufacture a chip.
․ Physical design cycle:
    1. Partitioning
    2. Floorplanning
    3. Placement
    4. Routing (clock, power/ground, signal nets)
    5. Post-layout optimization (buffering, sizing, etc.)
․ Others: circuit extraction, timing verification and design rule checking

Physical Design Flow
[Figure labels: B*-tree based floorplanning system; routing system]

Floorplan Examples
[Figure labels: Apple A5 with dual ARM cores; Intel Pentium 4; a floorplan with interconnections]

VLSI Placement
․ Place objects into a die s.t. no objects overlap with each other & some cost metric (e.g., wirelength) is optimized
[Figure labels: chip; ISPD98 ibm01: 12,752 cells, 247 macros, Amax/Amin = 8416; second example: 842K cells, 646 macros, 868K nets; wirings among cells/macros are not shown here!!]

Routing Example
• 0.18um technology, two layers, pitch = 1 um, 8109 nets.

Modern EDA & Circuit Design Challenges
    Scalability
    Multi-dimension
    Heterogeneity
    Technology

Example: Modern Placement
․ High complexity (scalability)
    Millions of objects
․ Placement constraints (multi-dimension)
    Blockage, routability, density, timing, region, etc.
․ Mixed-size placement (heterogeneity)
    Thousands of big macros with millions of small cells
    Macros have revolutionized SoC design
․ More: 3D IC, datapath, FPGA, etc. (technology)
[Figure labels: mixed-size design with 2.5M placeable objects; TSV layout with device, TSV, dielectric, substrate, and routing region]

Design Styles
[Figure: taxonomy of design styles; surviving labels include Structured ASIC, FPGA, SPLD, and Others]

SSI/SPLD Design Style
[Figure: SSI/SPLD design style]

Full-Custom Design Style
• Designers can control the shape of all mask patterns.
• Designers can specify the design up to the level of individual transistors.

Terminology
․ Cell: a logic block used to build larger circuits.
․ Pin: a wire (metal or polysilicon) to which another external wire can be connected.
․ Nets: a collection of pins which must be electrically connected.
․ Netlist: a list of all nets in a circuit.
[Figure labels: nets, cells, pin]

Standard-Cell Design Style
• Selects pre-designed cells (typically, of the same height) to implement logic
• Over-the-cell routing is pervasive in modern designs
• Modern designs often contain cells of different row heights (esp. with FinFET transistors)

Standard Cell Example
[Figure: standard cell example. Courtesy of Newton/Pister, UC-Berkeley]
Trend: Channelless structure for standard cells and gate arrays

Gate Array Design Style
• Prefabricates a transistor array
• Needs wiring customization to implement logic

FPGA Design Style
․ Logic and interconnects are both prefabricated.
․ Illustrated by a symmetric array-based field-programmable gate array (FPGA)

FPGA/CPLD Examples
[Figure labels: Xilinx XC4413 FPGA (0.35 um); Altera Stratix IV FPGA (40 nm)]

FPGA Design Process
․ Illustrated by a symmetric array-based FPGA
․ No fabrication is needed

Comparisons of Design Styles
                              Full custom   Standard cell   Gate array   FPGA
Cell size                     variable      fixed height    fixed        fixed
Cell type                     variable      variable        fixed        programmable
Cell placement                variable      in row          fixed        fixed
Interconnection               variable      variable        variable     programmable

                              Full custom   Standard cell   Gate array   FPGA
Fabrication time              ---           --              +            +++
Packing density               +++           ++              +            ---
Unit cost (large quantity)    +++           ++              +            ---
Unit cost (small quantity)    ---           --              -            +++
Easy design & simulation      ---           --              -            ++
Easy design change            ---           --              -            +++
Timing simulation accuracy    --            -               -            ++
Chip speed                    +++           ++              +            ---

Design Style Trade-offs
[Figure: turnaround time (days, 1 to 10^4) versus logic capacity (gates, 1 to 10^8) for full custom, semicustom, FPGA, CPLD, SPLD, and SSI, with the optimal solution marked]
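
The Terminology slide (cells, pins, nets, netlist) and the VLSI Placement slide (no overlaps, minimize a cost metric such as wirelength) fit together naturally in code. The sketch below is one possible way to represent them, not the course's reference implementation. It assumes each pin sits at the center of its cell and uses half-perimeter wirelength (HPWL), a common wirelength estimate, as the cost; all class and function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Cell:
    name: str
    x: float       # lower-left corner
    y: float
    width: float
    height: float

    def center(self):
        # Assumption: the cell's pin is modeled at its center.
        return (self.x + self.width / 2, self.y + self.height / 2)


def hpwl(net, cells):
    """Half-perimeter of the bounding box around the pins of one net."""
    xs, ys = zip(*(cells[name].center() for name in net))
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(netlist, cells):
    """Sum of HPWL over every net in the netlist."""
    return sum(hpwl(net, cells) for net in netlist.values())


def overlaps(a, b):
    """True if two cells share interior area (touching edges are allowed)."""
    return (a.x < b.x + b.width and b.x < a.x + a.width and
            a.y < b.y + b.height and b.y < a.y + a.height)


def is_legal(cells):
    """A placement is legal here if no pair of cells overlaps."""
    return not any(overlaps(a, b) for a, b in combinations(cells.values(), 2))


if __name__ == "__main__":
    cells = {
        "a": Cell("a", 0, 0, 2, 2),
        "b": Cell("b", 4, 0, 2, 2),
        "c": Cell("c", 0, 3, 2, 2),
    }
    # Netlist: each net lists the cells whose pins must be connected.
    netlist = {"n1": ["a", "b"], "n2": ["a", "b", "c"]}
    print("legal:", is_legal(cells))
    print("total HPWL:", total_hpwl(netlist, cells))
```

A placer would search over cell positions to reduce `total_hpwl` while keeping `is_legal` true. The pairwise overlap check is quadratic in the number of cells, which is fine for a toy example but not for designs with millions of objects like those on the placement slides.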

Appendix: Structured ASIC

Structured ASIC
․ A structured ASIC consists of predefined metal and via layers, plus a few layers left for customization.
․ The predefined layers support power distribution and local communications among the building blocks of the device.
․ Advantages: fewer masks (lower cost); easier physical extraction and analysis.
․ Popular for engineering change orders (ECOs)
[Figure captions: a structured ASIC (M5 & M6 can be customized); Faraday’s 3MPCA structured ASIC (M4--M6 can be customized)]

Comparisons of Design Styles
                              Full custom   Standard cell   Gate array   Structured ASIC   FPGA
Cell size                     variable      fixed height    fixed        fixed             fixed
Cell type                     variable      variable        fixed        fixed             programmable
Cell placement                variable      in row          fixed        fixed             fixed
Interconnection               variable      variable        variable     variable/fixed    programmable

                              Full custom   Standard cell   Gate array   Structured ASIC   FPGA
Fabrication time              ---           --              +            ++                +++
Packing density               +++           ++              +            -                 ---
Unit cost (large quantity)    +++           ++              +            -                 ---
Unit cost (small quantity)    ---           --              -            +                 +++
Easy design & simulation      ---           --              -            +                 ++
Easy design change            ---           --              -            +                 +++
Timing simulation accuracy    --            -               -            +                 ++
Chip speed                    +++           ++              +            -                 ---