High-Speed and Low-Power On-Chip Global Link Using Continuous-Time Linear Equalizer
Yulei Zhang (1), James F. Buckwalter (1), and Chung-Kuan Cheng (2)
(1) Dept. of ECE, (2) Dept. of CSE, UC San Diego, La Jolla, CA
19th Conference on Electrical Performance of Electronic Packaging and Systems
Oct 25, 2010, Austin, USA

Outline
- Introduction
- Equalized On-Chip Global Link
  - Overall structure
  - Basic working principle
- Driver Design for On-Chip Transmission-Line
  - Guideline for tapered CML driver
  - Driver design example
- Continuous-Time Linear Equalizer (CTLE) Design
  - CTLE modeling
  - CTLE design example
- Driver-Receiver Co-Design for Low Energy per Bit
  - Methodology
  - Overall link design example
- Conclusion

Research Motivation
- Global interconnect planning becomes a challenge in ultra-deep sub-micron (UDSM) processes
  - Performance gap between global wires and logic gates
  - Conventional buffer insertion brings a large extra power overhead
- Uninterrupted wire configurations are used to tackle on-chip global communication issues
  - On-chip T-lines to reduce interconnect power
  - Equalization to improve the bandwidth
  - State of the art [Kim2009]: 2Gb/s/um, < 1pJ/b, signaling over 10mm global wire in 90nm

Our Contributions
- Contributions
  - Build up a novel equalized on-chip T-line structure for global communication: tapered CML driver + CTLE receiver
  - Accurate small-signal modeling of the CTLE receiver to improve the optimization quality
  - A design methodology that co-optimizes driver, wire, and receiver to reduce the total energy per bit
- Results of our design
  - 20Gbps signaling over 10mm, 2.2um-pitch on-chip T-line
  - 11ps/mm latency and 0.2pJ/b energy per bit in 45nm

Equalized On-Chip Global Link
- Overall structure
  - Tapered current-mode logic (CML) drivers
  - Terminated differential on-chip T-line
  - Continuous-time linear equalizer (CTLE) receiver
  - Sense-amplifier based latch

Basic Working Principle
- Tapered CML driver
  - Provides low-swing differential signals to drive the T-line
  - Design variables: tapered factor u, number of stages N, fan-out X, final stage current ISS, driver resistance RS
- T-line
  - Differential wire with P/G shielding
  - Design variables: geometries (width, pitch) and termination resistance RT
- CTLE receiver
  - Recovers the signal and improves eye quality
  - Design variables: load resistance RL, source degeneration resistance RD and capacitance CD, over-drive voltage Vod
- Sense-amplifier based latch
  - Synchronizes the signal and converts it back to digital levels

Tapered CML Driver Design
- Need to design: 1) output resistance RS, 2) tail current ISS, 3) size of transistors W
- Design guideline [Tsuchiya2006, Heydari2004]
  - Begin from the final stage
  - Output swing constraint
  - For a given VSW, output resistance RS is optimized with RT to increase eye-opening
  - Transistor size
  - Tapered factor u = 2.7 for delay reduction
  - Number of stages
  - Each previous stage is designed backward by scaling with the factor u

CML Driver Study w/ Loaded T-line
- Assume 45nm 1P11M CMOS
- T-line built on M9 with M1 as reference
- T = 1.2um, H = 3.5um (fixed)
- Optimize W and S for eye-opening
- Figure: change of the eye-opening with width for fixed 2um pitch
- Figure: change of the eye-opening with pitch for equal width/spacing

CML Driver Design Example
- Experimental observations
  - Optimal eye happens when width = spacing
  - Eye-opening improves with larger pitch
- Design methodology
  - Choose the minimum pitch that satisfies the wire-end eye-opening requirement
- Design example

Accurate CTLE Modeling
- Design variables: RL, RD, CD, Vod (size)
- Device parameters are treated as functions of Vod, including K = K(Vod)
- [Bias relations for gm, rds, IBias and W/L in terms of Vod, K, RL, Vdd and Vic: not recoverable from the extracted text]
- Parasitics and total capacitances:
  \[ C_{Spara} = 1.5\,\mathrm{fF/\mu m} \times W, \qquad C_{Dpara} = 1.5\,\mathrm{fF/\mu m} \times W \]
  \[ C_D = C_{Dex} + C_{Spara}, \qquad C_L = C_{Lex} + C_{Dpara} \]
- Small-signal circuit to derive H(s) [Hanumolu2005]:
  \[ H(s) = \mathrm{Gain}_{DC}\,\frac{1 + s R_D C_D}{1 + a s + b s^2} \]
  \[ \mathrm{Gain}_{DC} = \frac{g_m r_{ds} R_L}{(g_m r_{ds} + 1) R_D + r_{ds} + R_L} \]
  \[ a = \frac{r_{ds}(R_L C_L + R_D C_D) + (g_m r_{ds} + 1) R_D R_L C_L + R_L R_D C_D}{(g_m r_{ds} + 1) R_D + r_{ds} + R_L} \]
  \[ b = \frac{r_{ds} R_D C_D R_L C_L}{(g_m r_{ds} + 1) R_D + r_{ds} + R_L} \]
  \[ \omega_z = \frac{1}{R_D C_D}, \qquad \omega_{p1} \approx \frac{1}{a}, \qquad \omega_{p2} \approx \frac{a}{b} \]

CTLE Modeling Validation
- Test case: 10mm link, 16mV eye at wire-end
- Blue lines: simple modeling that ignores rds and parasitics
- Red line: modeling that considers only rds
- Black line: the proposed accurate model

CTLE Design Example
- Observations of CTLE study
  - Eye-opening improves with relaxed power constraints but tends to saturate
- Design example
  - Based on the pre-optimized CML driver + T-line design
  - Eye-opening improved by 4X after CTLE

Driver-Receiver Co-Design Methodology
- Optimization flow
  - Optimize driver, wire, and receiver together by setting Veye/Power as the cost function
  - Choose the pre-designed CML/T-line/CTLE as the initial solution
  - Driver-to-receiver step-response generation based on SPICE simulation and CTLE modeling
  - Eye-opening estimation based on the step response
  - SQP-based non-linear optimization
  - Variables: [ISS, RT, RL, RD, CD, Vod]
- Performance comparison
  - Option A: driver/receiver independent design
  - Option B: low-power driver/receiver co-design

Low Energy-per-Bit Optimization Flow
1. Pre-designed CML driver and pre-designed CTLE receiver form the driver-receiver co-design initial solution
2. Co-design cost function estimation
   - SPICE-generated T-line step response
   - Receiver step response using CTLE modeling
   - Step-response based eye estimation
   - Cost function: Veye/Power
3. Internal SQP (Sequential Quadratic Programming) routine changes the variables [ISS, RT, RL, RD, CD, Vod] to generate the best solution
4. Output: best set of design variables in terms of overall energy per bit

Simulated Eye Diagrams
- Methodology A: driver/receiver separate design
- Methodology B: driver/receiver co-design for low power

Summary of Performance Comparison
Methodology A: driver/receiver separate design
Methodology B: driver/receiver co-design for low power

Parameter                   Methodology A   Methodology B
RS (ohm)                    47              148
RT (ohm)                    94              1100
RL (ohm)                    440             890
RD (ohm)                    110             1430
CD (fF)                     680             150
Vod (mV)                    60              58
Eye-opening at CTLE (mV)    91              113
Power consumption (mW)      8.1             3.8

Note: the driver/receiver co-design methodology uses much larger driver and termination resistances to reduce power, which closes the eye at the driver output and at the wire-end. The final eye is recovered by fully utilizing the CTLE.
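
The power figures in the comparison can be related to the deck's energy-per-bit claim and to the Veye/Power cost function with a few lines of arithmetic. The sketch below is illustrative only: it assumes both methodologies run at the 20 Gb/s data rate quoted in the results and that the reported power covers the whole link. The function and variable names are hypothetical and not from the original slides.

```python
# Sketch: energy per bit and the Veye/Power figure of merit derived from the
# summary table. Assumption: both designs run at the deck's 20 Gb/s rate.

DATA_RATE_BPS = 20e9  # 20 Gb/s (assumed identical for A and B)

# (eye opening at CTLE output in mV, power in mW), copied from the table.
DESIGNS = {
    "A (separate design)": (91.0, 8.1),
    "B (co-design)": (113.0, 3.8),
}


def energy_per_bit_pj(power_mw: float, data_rate_bps: float) -> float:
    """Energy per bit in pJ: convert mW to W, divide by bit/s, convert J to pJ."""
    return power_mw * 1e-3 / data_rate_bps * 1e12


def eye_per_power(eye_mv: float, power_mw: float) -> float:
    """Veye/Power figure of merit in mV per mW (higher is better)."""
    return eye_mv / power_mw


energy = {name: energy_per_bit_pj(p, DATA_RATE_BPS) for name, (_, p) in DESIGNS.items()}
fom = {name: eye_per_power(eye, p) for name, (eye, p) in DESIGNS.items()}

for name in DESIGNS:
    print(f"{name}: {energy[name]:.3f} pJ/b, {fom[name]:.1f} mV/mW")
```

Under these assumptions Methodology B comes out at roughly 0.19 pJ/b, in line with the 0.2pJ/b quoted in the results, while Methodology A is a little over twice that.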

Conclusion
- We propose a novel equalized on-chip global link using a CML driver and a CTLE receiver
- Accurate modeling of the CTLE is provided to achieve <10% correlation error and will improve eye-opening optimization quality
- Our design achieves
  - 20Gbps signaling over 10mm, 2.2um-pitch on-chip T-line
  - 11ps/mm latency and 0.2pJ/b energy

Thank You! Q&A
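
As a supplementary sketch for the Accurate CTLE Modeling slide, the second-order model H(s) = GainDC (1 + s RD CD) / (1 + a s + b s^2) can be evaluated numerically to see the high-frequency peaking that recovers the eye. The coefficient expressions follow the half-circuit of a source-degenerated stage as read from the slide's small-signal derivation. RL, RD and CD are taken from the Methodology A column of the summary table, and it is an assumption that the tabulated CD can be used directly as the total degeneration capacitance. The values of gm, rds and CL are hypothetical placeholders and do not come from the slides.

```python
import math


def ctle_coefficients(gm, rds, r_l, r_d, c_d, c_l):
    """Return (gain_dc, a, b) of H(s) = gain_dc*(1 + s*r_d*c_d)/(1 + a*s + b*s^2)."""
    denom = (gm * rds + 1.0) * r_d + rds + r_l
    gain_dc = gm * rds * r_l / denom
    a = (rds * (r_l * c_l + r_d * c_d)
         + (gm * rds + 1.0) * r_d * r_l * c_l
         + r_l * r_d * c_d) / denom
    b = rds * r_d * c_d * r_l * c_l / denom
    return gain_dc, a, b


def ctle_response(freq_hz, gm, rds, r_l, r_d, c_d, c_l):
    """Complex CTLE gain at freq_hz under the second-order model."""
    gain_dc, a, b = ctle_coefficients(gm, rds, r_l, r_d, c_d, c_l)
    s = 2j * math.pi * freq_hz
    return gain_dc * (1 + s * r_d * c_d) / (1 + a * s + b * s * s)


PARAMS = dict(
    gm=5e-3,      # hypothetical transconductance (S), not from the slides
    rds=5e3,      # hypothetical output resistance (ohm), not from the slides
    r_l=440.0,    # RL from the Methodology A column (ohm)
    r_d=110.0,    # RD from the Methodology A column (ohm)
    c_d=680e-15,  # CD from the Methodology A column (F), assumed to be the total
    c_l=20e-15,   # hypothetical load capacitance (F), not from the slides
)

gain_dc, a, b = ctle_coefficients(**PARAMS)
zero_rad = 1.0 / (PARAMS["r_d"] * PARAMS["c_d"])  # zero at 1/(RD*CD)
p1_rad = 1.0 / a                                  # dominant-pole approximation
p2_rad = a / b                                    # second-pole approximation

for f_ghz in (0.1, 1.0, 5.0, 10.0, 20.0):
    mag_db = 20 * math.log10(abs(ctle_response(f_ghz * 1e9, **PARAMS)))
    print(f"{f_ghz:5.1f} GHz: {mag_db:6.2f} dB")
```

With these placeholder values the zero sits below the first pole, so the gain at 10 GHz (the Nyquist frequency for 20 Gb/s) exceeds the DC gain. The approximations p1 = 1/a and p2 = a/b are only reasonable when the two poles are well separated.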