Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Load-Sensitive Flip-Flop Characterization Seongmoo Heo and Krste Asanović MIT Laboratory for Computer Science http://www.cag.lcs.mit.edu/scale WVLSI 2001 April 19, 2001 Motivation • Flip-flops are one of the most important components in synchronous VLSI designs. o Critical effect on cycle time o Large fraction of total system power • Previously published work has failed to consider the effect of circuit loading on the relative ranking of flip-flop structures. [Kawaguchi et al. ’98] [Ko and Balsara ’00] [Kong et al ’00] [Lang et al ’97] [Nikolic et al ’00] [Nogawa and Ohtomo ’98] [Stojanovic and Oklobdzija ’99] [Stollo et al ’00] [Yuan and Svensson ’91] [Zyuban and Kogge ’99] [H.P. et al ’96] [J.M. et al ’96] o o o Fixed and usually overly large output load Large or non-specified input drive No output buffering Observation 1. Different flip-flop designs have different inherent parasitics and output drive strength. o Different number and complexity of logic gates o Different kinds of feedback Observation 1. Different flip-flop designs have different inherent parasitics and output drive strength. o Different number and complexity of logic gates o Different kinds of feedback D Q D D Q Q Observation 2. Output loads in a circuit vary significantly. 120 # of instances 100 80 Flip-flop output load instances in a microprocessor datapath (A custom-designed 32-bit MIPS CPU in 0.25mm process) 7.2fF (4 min inv gate cap) 60 40 115.2fF (64 min inv gate cap) 28.8fF (16 min inv gate cap) 20 0 1.8fF (min inv gate cap) Our Proposal Load effects must be considered in flip-flop characterization to avoid sub-optimal selection. • We will present energy and delay measurements for various flipflops across a range of output loading conditions(EE and absolute load size) and show that the relative rankings of structures vary. • We will show that output buffering at high load can lead to the better performance and energy consumption for some structures. Related Work • Traditional Buffer Sizing • Logical Effort [Sutherland and Sproull] o Logical Effort: drive strength of a circuit structure o Electrical Effort: the ratio of output load to input load o Delay = intrinsic parasitic delay + LE x EE Overview • Flip-Flop Designs • Test Bench & Simulation Setup • Delay and Energy Characterization • Delay Analysis • Energy-versus-Delay Analysis • Summary Flip-Flop Designs Fully static and single-ended [Nikolic et al ’00] Test Bench • Sized clock buffer to give equal rise/fall time • Used a fixed, realistic input driver • Varied output load from 4 min inv cap(7.2fF) to 64 min inv cap(115.2fF). FF • 4 Load and Drive Configurations EE4-min: min input drive, 4 min inv load (7.2fF) o EE16-min: min input drive, 16 min inv load (28.8fF) o EE64-min: min input drive, 64 min inv load (115.2fF) o EE4-big: 16x min input drive, 64 min inv load (115.2fF) o 4 min inv cap 16 min inv cap 64 min inv cap Simulation Setup • 0.25μm TSMC CMOS process, Vdd=2.5V, T=25°C • Hspice Levenberg-Marquardt method was used for transistor size optimization. o Transistor widths optimized for each load and drive conf. to give min delay or min energy for a given delay (transistor lengths were fixed at minimum.) o Parasitic capacitances included in the circuit netlists. Delay and Energy Characterization • Minimum D-Q delay [Stojanovic et al. ’99] (.Measure command) • Total energy = input energy + internal energy + clock energy – output energy • A single test waveform with ungated clock and data toggling every cycle o For a full characterization of energy dissipation, more realistic activity patterns should be considered [Heo, Krashinsky, Asanovic ARVLSI’01]. FF 4 min inv load 16 min inv load 64 min inv load Speed Ranking Without Buffering 3.5 (Transistors sized at each load point, but only for min delay) 3.0 • Delay = const. intrinsic parasitic delay + output drive delay (= load size × driving capability) • Driving Capability = f(# of stages, complexity) 1.5 1.0 0.5 PPCFF SAFF MSAFF HLFF SSAPL Influence of Buffering on Performance (Assuming no penalty for inverting output) Delay (ns) 1.5 SAFF PPCFF MSAFF 1 0.5 0 0 20 40 80 Load (min inv cap) HLFF 1.5 SSAPL : unbuffered : one inverter : two inverters 1 (Min. input drive was used.) 0.5 0 0 20 40 80 Speed Ranking With Buffering Allowed 3.5 • Less speed variation compared to original flip-flops 3.0 1.5 1.0 0.5 PPCFF SAFF MSAFF HLFF SSAPL Energy-Delay Curve : EE4-min Each point sized for min energy for a given delay Delay(ns) EE4-min: min. drive + 4 min inv load(7.2fF) Energy-Delay Curve : EE4-min PPCFF-unbuf Delay(ns) EE4-min: min. drive + 4 min inv load(7.2fF) Energy-Delay Curve : EE16-min SSAPL-unbuf Delay(ns) EE16-min: min. drive + 16 min inv load(28.8fF) Energy-Delay Curve : EE16-min SSAPL-unbuf SSAPL-buf Delay(ns) EE16-min: min. drive + 16 min inv load(28.8fF) Energy-Delay Curve : EE16-min HLFF-unbuf Delay(ns) EE16-min: min. drive + 16 min inv load(28.8fF) Energy-Delay Curve : EE16-min HLFF-unbuf HLFF-buf Delay(ns) EE16-min: min. drive + 16 min inv load(28.8fF) Energy-Delay Curve : EE16-min PPCFF-unbuf Delay(ns) EE16-min: min. drive + 16 min inv load(28.8fF) Energy-Delay Curve : EE64-min MSAFF-unbuf Delay(ns) EE64-min: min. drive + 64 min inv load(115.2fF) Energy-Delay Curve : EE64-min HLFF-unbuf HLFF-buf Delay(ns) EE64-min: min. drive + 64 min inv load(115.2fF) Energy-Delay Curve : EE4-big Delay(ns) EE4-big: 16x min. drive + 64 min inv load(115.2fF) Energy-Delay Curve : EE4-min vs EE4-big EE4-min PPCFF-unbuf MSAFF-unbuf SAFF-unbuf EE4-big Energy-Delay Curve : EE4-min vs EE4-big EE4-min PPCFF-unbuf MSAFF-unbuf SAFF-unbuf EE4-big MSAFF-unbuf SAFF-unbuf PPCFF-unbuf Summary • Different flip-flops have different gains and parasitics. • Real VLSI designs exhibit a variety of flip-flop output loads. • The output load size affects the relative performance and energy consumption of different flip-flop designs. • Therefore, output load effects should be accounted for when comparing flip-flops. 1. Electrical effort 2. Absolute output load size 3. Output buffering