On-Line Adjustable Buffering for Runtime Power Reduction
Andrew B. KahngΨ, Sherief Reda†, Puneet SharmaΨ
ΨUniversity of California, San Diego
†Brown University

Outline
• Introduction
• Adjustable Buffering Methodology
• Experiments & Results
• Conclusions

Power: First-Class Objective
• Power is a bottleneck to Moore’s law
• Power-frequency tradeoff exists in CMOS circuits
• Techniques to exploit the power-frequency tradeoff are of interest
    • Allow high-frequency operation
    • Can give significant power reduction when max. performance is not required
• Mainstream approach: dynamic voltage and frequency scaling (DVFS)
[Figure: power vs. frequency, showing the power-frequency tradeoff with VDD scaling. Annotation: much higher power required to operate at high frequency.]

Dynamic VDD & Freq. Scaling
• Scale down VDD and frequency when high performance is not needed
• Limitations of DVFS
    • VDD cannot be scaled down indefinitely
    • Range of VDD scaling is small and diminishing
        • Extremely high power at high VDD → reduce max. VDD
        • High Vth (to reduce leakage), noise margins, variability, soft errors → increase min. VDD
    • Discrete allowed voltages
[Figure: power vs. frequency, comparing the ideal frequency-power curve from VDD scaling with the actual frequency-power points from DVFS.]
• Our objective: enable additional modes to exploit the frequency-power tradeoff
    • Useable when VDD cannot be scaled further
    • Useable without DVFS

Proposal: Adjustable Buffering
• Our approach, like DVFS, provides runtime-selectable low-power modes → supplement or replace DVFS
• Key idea:
    • A lot of logic is added for performance, not functionality
    • Turn this logic off when high performance is not needed
• Poor interconnect scaling → large number of repeaters
    • 20-30% of cells are repeaters
    • Fat repeaters are used to improve delay but consume a lot of power
• We modify repeaters to dynamically adjust their driving capacity
[Figure: transform of a 32X repeater into a 16X / 32X repeater with a Select input.]

Adjustable Repeater Design
• We add a PMOS-NMOS pair to turn half the devices off dynamically
[Figure: traditional inverter (INVX8) next to the adjustable inverter, with its two control gates marked.]
• “LPM” = ON → only half of the devices operational (low-power mode)
• “LPM” = OFF → all devices operational (high-performance mode)
• What power components are likely to reduce in low-power mode?
• Short-circuit power: during switching, PMOS & NMOS are both ON momentarily → short circuit between VDD and VSS
    • High when transition time (slew) is large
• Subthreshold leakage: flows when only one device of a PMOS-NMOS pair between VDD and VSS is ON (the current leaks through the device that is OFF)

Adjustable Repeater Requirements
• Low area overhead
    • Added PMOS-NMOS pair (LPM devices) takes area
    • LPM signal has to be routed or locally generated
    • Layout of the new cell must be simple and have low area overhead
• High performance in high-performance mode
    • On-resistance of LPM devices may reduce performance
• Good power reduction in low-power mode

Area Overhead
• Problem: high performance needed when LPM signal is OFF → use large control gates → large area overhead
• Delay overhead: increase in delay of the adjustable repeater over the traditional repeater
[Figure: adjustable repeater with its two control gates.]
• Solution: share control gates among multiple repeaters

Control Gate Sharing
[Figure: LPM devices shared by two inverters through virtual rails V’DD and V’SS.]
• Fewer control gates, but virtual VDD (V’DD) and VSS (V’SS) need routing
• How many control gates are needed?
    • Compute the simultaneous switching rate (SSR) by finding the max. number of repeaters that have overlapping timing windows. Time = O(R log R) (R = number of repeaters)
    • Find the total width of all repeater devices controlled by the control gates (= WR)
    • For good performance, width of control gates = 4 × SSR × WR
    • Typical SSR ≈ 10% → small area overhead

Ensuring High Performance
• Problem: adjustable repeaters are ~5% slower when LPM signal is OFF → up to ~5% reduction in circuit performance
• Solution: do not use adjustable repeaters on timing-critical paths
• Additional constraint: slew constraints must not be violated when LPM signal is OFF or ON
• We characterize adjustable repeaters (i.e., find delay, slew, power, input capacitance) and then substitute traditional repeaters with adjustable repeaters subject to delay and slew constraints
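The control-gate sizing steps listed under Control Gate Sharing can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: timing windows are assumed to be (start, end) pairs per repeater, SSR is assumed to be the maximum overlap count divided by R (consistent with the "typical SSR ≈ 10%" figure), and all function names are hypothetical. The sort dominates the runtime, which matches the O(R log R) bound quoted on the slide.

```python
from typing import List, Tuple

Window = Tuple[float, float]  # (start, end) of a repeater's switching window


def max_overlap(windows: List[Window]) -> int:
    """Largest number of timing windows that overlap at any instant."""
    events = []
    for start, end in windows:
        events.append((start, 1))   # window opens
        events.append((end, -1))    # window closes
    # Sort by time; at equal times, closings (-1) come before openings (+1),
    # so windows that merely touch are not counted as overlapping.
    events.sort()
    active = best = 0
    for _, delta in events:
        active += delta
        best = max(best, active)
    return best


def simultaneous_switching_rate(windows: List[Window]) -> float:
    """Assumed definition: max. overlapping repeaters as a fraction of R."""
    if not windows:
        return 0.0
    return max_overlap(windows) / len(windows)


def control_gate_width(windows: List[Window], repeater_widths: List[float]) -> float:
    """Sizing rule from the slide: width of control gates = 4 x SSR x WR."""
    w_r = sum(repeater_widths)  # WR: total width of the controlled repeater devices
    return 4.0 * simultaneous_switching_rate(windows) * w_r


# Made-up example: ten repeaters of unit width, three of which switch together.
example_windows = [(0.0, 1.0), (0.5, 1.5), (2.0, 3.0), (2.5, 3.5), (2.8, 4.0),
                   (5.0, 6.0), (7.0, 8.0), (9.0, 10.0), (11.0, 12.0), (13.0, 14.0)]
example_widths = [1.0] * len(example_windows)
print(max_overlap(example_windows))
print(control_gate_width(example_windows, example_widths))
```

The sweep over sorted window endpoints avoids comparing every pair of repeaters, which is what keeps the estimate cheap enough to run on designs with tens of thousands of cells.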
• Result: no loss in circuit performance & no slew violations

Power Reduction in Low-Power Mode
• Short-circuit energy and leakage reduce
[Figure: traditional inverter and adjustable inverter, with the devices that are OFF in low-power mode marked.]

Reduction in short-circuit energy and leakage for INVX8:

         Short-Circuit Energy    Leakage
LVT      43%                     28%
SVT      35%                     26%
HVT      22%                     22%

Experimental Validation
• Circuits: s38417 (8,890 cells), AES (15,272), OpenRisc (46,732)
• Tools: Synopsys HSPICE (SPICE), Design Compiler (synthesis, timing and power analysis); Cadence SoC Encounter (P&R), SignalStorm (library characterization); Artisan TSMC 90nm library models
• Other settings: power and timing analysis at slow corner, VDD of 1.1V and 0.9V, activity factor of 0.01

Results: Power Reduction
• We perform comparative analysis of:
    • Circuit with DVFS
    • Circuit with DVFS + LPM
[Figure: total power (mW) vs. frequency (MHz) for AES and OpenRisc, DVFS vs. DVFS+LPM. Labeled operating points: VDD=1.1 LPM=0, VDD=1.1 LPM=1, VDD=0.9 LPM=0, VDD=0.9 LPM=1.]
• Both dynamic and leakage power reduce
• 6-12% reduction in total power at low-power mode

Results: Area Overhead
• Logic area overhead due to control gates
    • Depends on SSR
    • Smaller if control gates can be placed in whitespace
[Figure: bar chart of logic area overhead (axis 0% to 6%) for s38417 (SSR=13.5%), AES (SSR=7.95%), OpenRisc (SSR=9.77%).]
• Routing overhead
    • LPM routed to control gates: routing overhead depends on locations of control gates; # control gates small → overhead small
    • V’DD, V’SS routed to all repeaters
    • For overhead estimation, nets assumed to be Steiner trees
[Figure: bar chart of routing overhead for s38417, AES, OpenRisc (axis ticks from 3.32% to 3.46%).]

Conclusions
• Presented a novel technique that dynamically trades off power and performance by turning off devices not needed at less than max. performance
• Both leakage and dynamic power reduce; total power reduction is 6-12% on our testcases
• By sharing of control gates, area overhead reduced to <5.57%
• No adverse effect on performance of the circuit when LPM signal is OFF
• Future work:
    • Actual layout of adjustable repeaters with routing of V’DD, V’SS, LPM nets to accurately estimate power, performance, area impacts
    • Customization of more cells, especially clock repeaters, to further improve the power-performance tradeoff

Thank You
Questions?
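Supplementary sketch: the substitution step from Ensuring High Performance (replace traditional repeaters with adjustable ones only where delay and slew constraints still hold) could look roughly like the greedy filter below. Everything here is an assumption made for illustration: the Repeater record, the per-path slack dictionary, the function name, and the use of the ~5% slowdown as a flat delay penalty are hypothetical, and an actual flow would rely on characterized library data and rerun static timing analysis rather than this bookkeeping.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Repeater:
    name: str
    delay_ps: float    # delay of the traditional repeater (hypothetical units)
    paths: List[str]   # timing paths that pass through this repeater
    slew_ok: bool      # True if slew limits hold with LPM both ON and OFF


def select_adjustable(repeaters: List[Repeater],
                      path_slack_ps: Dict[str, float],
                      penalty: float = 0.05) -> List[str]:
    """Greedily pick repeaters that can become adjustable without timing loss.

    A repeater is substituted only if every path through it has enough slack
    left to absorb the assumed slowdown (penalty x delay) and its slew checks
    pass in both modes. Repeaters on zero-slack (timing-critical) paths are
    therefore never substituted.
    """
    slack = dict(path_slack_ps)  # work on a copy; the caller's data is untouched
    chosen = []
    for rep in repeaters:
        extra = penalty * rep.delay_ps
        if rep.slew_ok and all(slack[p] >= extra for p in rep.paths):
            for p in rep.paths:
                slack[p] -= extra  # this slack is now consumed on each path
            chosen.append(rep.name)
    return chosen


# Made-up example: path A is timing-critical, path B has 8 ps of slack.
example_repeaters = [
    Repeater("r1", 100.0, ["A"], True),   # on critical path: kept traditional
    Repeater("r2", 100.0, ["B"], True),   # needs 5 ps, 8 ps available: chosen
    Repeater("r3", 100.0, ["B"], True),   # needs 5 ps, only 3 ps left: kept
    Repeater("r4", 40.0, ["B"], False),   # slew violation: kept
    Repeater("r5", 40.0, ["B"], True),    # needs 2 ps, 3 ps left: chosen
]
print(select_adjustable(example_repeaters, {"A": 0.0, "B": 8.0}))
```

The greedy order matters in this sketch: visiting the largest repeaters first would favor the cells with the most power to save, which is one plausible ordering but not something the slides specify.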