Low Power Design of Integrated Systems
Assoc. Prof. Dimitrios Soudris
[email protected]

Technology Directions: SIA Roadmap

Year                 1999   2002   2005   2008   2011   2014
Feature size (nm)    180    130    100    70     50     35
Logic trans/cm2      6.2M   18M    39M    84M    180M   390M
Cost/trans (mc)      1.735  .580   .255   .110   .049   .022
#pads/chip           1867   2553   3492   4776   6532   8935
Clock (MHz)          1250   2100   3500   6000   10000  16900
Chip size (mm2)      340    430    520    620    750    900
Wiring levels        6-7    7      7-8    8-9    9      10
Power supply (V)     1.8    1.5    1.2    0.9    0.6    0.5
High-perf pow (W)    90     130    160    170    175    183
Battery pow (W)      1.4    2      2.4    2.8    3.2    3.7

Technology Directions: Technology Process Evolution
[Figure: trends in number of transistors, frequency (performance), and power consumption]

Power Terminology
• Power is the rate at which energy is delivered or exchanged
  » electrical energy is converted to heat energy during operation
• Power dissipation: the rate at which energy is taken from the source (Vdd) and converted into heat

Why Smaller Power?
• Large market of portable devices
  – e.g. laptops, mobile phones
• Achieve larger transistor integration
  – Pentium IV contains 42 million transistors
  – Teraflops chip contains 1.9 billion transistors
• Need for “green” computers
  – 10% of total electrical energy consumed by PCs

Battery Technology Improvements
[Figure]

The Industry’s Reaction
• Reduce chip capacitance through process scaling ==> Expensive
• Reduce voltage levels from 5V to 3.3V to 2V ==> Industry is hard to move (microprocessors, memory, ...)
• Better circuit techniques ==> Gated clocks, power-down of non-operational units, ...
• Example: IBM 80 MHz PowerPC RISC (3 W @ 3.3V)
  – Power management logic determines activity on a per-cycle basis
  – Clocks of idle blocks are turned off ==> 12-30% savings
  – Doze, Nap and Sleep modes (5 mW)

Example: Intel Pentium-II processor
• Pentium-1: 15 Watt (5V, 66 MHz)
• Pentium-2: 8 Watt (3.3V, 133 MHz)

Where Does Power Go in CMOS?
• The power consumption in digital CMOS circuits: $P_{avg} = P_{dynamic} + P_{short-circuit} + P_{leakage}$
• Dynamic power consumption: charging and discharging capacitors
• Short-circuit currents: short-circuit path between the supply rails during switching
• Leakage (static): leaking diodes and transistors

Present & Future in Power Consumption
[Figure]

Dynamic Power Consumption (1)
$P_{dynamic} = C_L \, V_{dd}^2 \, N \, f$
• where $V_{dd}$ is the supply voltage, $C_L$ the load capacitance, $N$ the average number of transitions per clock cycle, and $f$ the operating frequency
[Figure: charging and discharging currents of the load capacitance CL between IN and OUT, panels (a)-(c)]

Dynamic Power Consumption (2)
• For technologies down to 0.35 μm, the dynamic consumption is about 80% of the total consumption
• Goal ===> reduce dynamic power consumption
  – reduction of capacitance
  – reduction of supply voltage
  – reduction of frequency
  – reduction of switching activity
  – or a combination of the above factors

Leakage current consumption
• the reverse-bias diode leakage at the transistor drains, and
• the sub-threshold current through a turned-off transistor channel
[Figure: The leakage of a reverse-biased pMOS transistor (reverse-biased drain-substrate diode, p+ regions in an n-type substrate).]
[Figure: Subthreshold leakage with respect to gate-source voltage (log ID versus VGS in volts, showing the subthreshold and saturated regions for decreasing VDS, Vdd).]

The Design Flow
[Figure: (a) design flow from System Specifications through System-Level Design, Architecture-Level Design, Logic-Level Design, and Circuit-Level Design / Layout synthesis; (b) the same flow with analysis/estimation at each level, supported by power models for system-level components, for macrocells and control logic, and for gates and cells]

Power savings in terms of the design level
[Figure: design levels (System, Behavior, RT, Logic, Transistor, Layout) along an arrow labeled "Increasing power savings", with quoted savings of 10-20x, 2-5x, and 20-50%]

Lower Vdd Increases Delay
$T_d = \frac{C_L \, V_{dd}}{I}$, with $I \sim (V_{dd} - V_t)^2$
$\frac{T_d(V_{dd}=2)}{T_d(V_{dd}=5)} = \frac{(2)\,(5 - 0.7)^2}{(5)\,(2 - 0.7)^2} \approx 4$
[Figure: normalized delay versus Vdd (volts) in a 2.0 μm technology for a multiplier, clock generator, ring oscillator, microcoded DSP chip, adder, and adder (SPICE)]
Relatively independent of logic function and style.

Reducing Vdd
$P \times t_d = E_t = C_L \, V_{dd}^2$
$\frac{E(V_{dd}=2)}{E(V_{dd}=5)} = \frac{C_L \, (2)^2}{C_L \, (5)^2}$, so $E(V_{dd}=2) \approx 0.16 \, E(V_{dd}=5)$
[Figure: normalized power-delay product versus Vdd (volts) for a 51-stage ring oscillator and an 8-bit adder, showing the quadratic dependence]
Strong function of voltage ($V^2$ dependence). Relatively independent of logic function and style.
Power-delay product improves with lowering Vdd.
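
The voltage-scaling relations above can be checked numerically. What follows is a minimal Python sketch, not part of the original slides: the function names, the unit load capacitance, and the current scale factor `k` are illustrative assumptions, and the delay model is only the first-order approximation I ~ (Vdd - Vt)^2 quoted above, with Vt = 0.7 V as in the worked ratio.

```python
def dynamic_power(c_load, vdd, n_transitions, freq):
    """Dynamic power P = C_L * Vdd^2 * N * f, in watts for SI inputs."""
    return c_load * vdd ** 2 * n_transitions * freq


def gate_delay(c_load, vdd, vt, k=1.0):
    """First-order delay Td = C_L * Vdd / I with I = k * (Vdd - Vt)^2.

    k is a hypothetical device constant; it cancels in any delay ratio.
    """
    if vdd <= vt:
        raise ValueError("model only applies for Vdd above the threshold Vt")
    return c_load * vdd / (k * (vdd - vt) ** 2)


def switching_energy(c_load, vdd):
    """Energy per transition (power-delay product) E = C_L * Vdd^2."""
    return c_load * vdd ** 2


# Ratios between operation at Vdd = 2 V and Vdd = 5 V; C_L cancels, so use 1.0.
delay_ratio = gate_delay(1.0, 2.0, 0.7) / gate_delay(1.0, 5.0, 0.7)
energy_ratio = switching_energy(1.0, 2.0) / switching_energy(1.0, 5.0)

print(f"Td(2V)/Td(5V) = {delay_ratio:.2f}")   # prints 4.38, the "about 4" on the slide
print(f"E(2V)/E(5V)  = {energy_ratio:.2f}")   # prints 0.16
```

Under this model, dropping the supply from 5 V to 2 V costs roughly a factor of four in speed while cutting the energy per operation to about one sixth, which is the trade-off the architecture-level techniques later in the deck try to exploit.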
Lowering the Threshold
[Figure: delay versus Vdd (with 2Vt marked) and drain current ID versus VGS for Vt = 0 and Vt = 0.2]
Reduces the speed loss, but increases leakage.
Interesting design approach: DESIGN FOR PLeakage == PDynamic

Transistor Sizing for Power Minimization
• Small W/L’s: lower capacitance, higher voltage
• Large W/L’s: higher capacitance, lower voltage
Larger sized devices are useful only when interconnect dominated.
Minimum sized devices are usually optimal for low power.

Techniques to reduce supply voltage
Algorithm        Transformation to exploit concurrency
Architecture     Parallelism and pipelining
Circuit/Logic    Transistor sizing, fast logic structures
Technology       Threshold voltage reduction, feature size scaling

Techniques for minimizing the switched capacitance
System           Partitioning, power-down, power states
Algorithm        Complexity, concurrency, regularity, locality, data representation
Architecture     Concurrency, instruction set selection, signal correlations, data representation, data encoding
Circuit/Logic    Transistor sizing, logic optimization, power down, layout optimization
Technology       Advanced packaging, SOI

Power consumption of transfer and storage over datapath operations, both in hardware [Men95] and software [Tiw94, Gon96].
[Figure: bar charts of relative energy and relative energy per operation; bar labels not recoverable]

Architecture Power Optimization Techniques
• Architecture-driven voltage reduction: The key idea is to speed up the circuit so that the supply voltage can be reduced while still meeting throughput constraints. Voltage reduction can be achieved by introducing parallelism in hardware or by inserting flip-flops (pipelining)
• Switching activity minimization: Try to prevent the generation and propagation of spurious transitions, or to reduce the number of transitions, e.g.
retiming, path balancing, data representation
• Switched capacitance minimization: Aim at minimizing the switched capacitance
• Dynamic power management: Under certain conditions a circuit part becomes inactive and can be shut off to avoid unnecessary calculations, e.g. gated clocks, operand isolation, precomputation, and guarded evaluation

Architecture Trade-offs: Reference Data Path
• Critical path delay = Tadder + Tcomparator (= 25 ns), fref = 40 MHz
• Total capacitance being switched = Cref
• Vdd = Vref = 5V
• Power for reference datapath: $P_{ref} = C_{ref} \, V_{ref}^2 \, f_{ref}$

Voltage Reduction Technique: Parallelism
• The clock rate can be reduced by half with the same throughput: fpar = fref / 2
• Vpar = Vref / 1.7, Cpar = 2.15 Cref
• $P_{par} = (2.15 \, C_{ref}) \, (V_{ref}/1.7)^2 \, (f_{ref}/2) \approx 0.36 \, P_{ref}$

Voltage Reduction Technique: Pipeline
• fpipe = fref, Cpipe = 1.1 Cref, Vpipe = Vref / 1.7
• Voltage can be dropped while maintaining the original throughput
• $P_{pipe} = C_{pipe} \, V_{pipe}^2 \, f_{pipe} = (1.1 \, C_{ref}) \, (V_{ref}/1.7)^2 \, f_{ref} \approx 0.37 \, P_{ref}$

Comparisons
[Figure]

Logic Style and Power Consumption
• Power-delay product improves as voltage decreases
• The “best” logic style minimizes power-delay for a given delay constraint

The concept of gating clock signals
[Figure: panels (a)-(c): a comparator (X < Y, A < B) whose output gates the clock of a register under two gating schemes, with waveforms of clock, gated clock (scheme 1), and gated clock (scheme 2) over a clock period]

Resource Sharing Can Increase Activity
[Figure]

Reducing Effective Capacitance
[Figure: global bus architecture versus local bus architecture]
Shared resources incur switching overhead.

Data representation
• Sign-extension activity is significantly reduced using sign-magnitude representation

Switching Activity in Adders
[Figure]

Switching Activity in Multipliers
[Figure]

Signals and Operations Reordering
• Example: complex multiplication
Trading a multiplication for an addition
[Figure: complex multiplication with inputs Xr, Xi, coefficients Ar, Ai, and outputs Yr, Yi: (a) form with four multipliers, (b) reordered form with three multipliers using Ai-Ar and Ai+Ar]

Module Selection
[Figure: panels (a), (b), (d): alternative assignments of the multiplications *i, *ii, *iii and additions +i, +ii to library modules; panel (c): RTL library]

RTL Library
Module                   Area    Latency   Power
ripple adder             2744    30 ns     1199 μW
carry lookahead adder    3959    20 ns     1467 μW
array multiplier         16185   60 ns     18540 μW
wallace multiplier       18443   40 ns     23545 μW

Glitching activity reduction (3)
Function: if (x < y) then z = c + d else z = a + b
[Figure: two datapath architectures implementing the function with inputs x, y, a, b, c, d and output z]
• ARCHITECTURE 1 power consumption: without glitches 823.9 μW, with glitches 1650 μW
• ARCHITECTURE 2 power consumption: without glitches 951.7 μW, with glitches 1357.7 μW

Two-Level Logic Circuits Switching Activity Minimization (1)
• Taking into account the static and transition probabilities (i.e. temporal correlation) of the primary inputs, we can insert additional input signals into certain gates of the first logic level (i.e. AND gates), resulting in reduced switching activity
• Appropriately selected input signals force the outputs of the AND gates to logic level zero for a number of combinations of the binary input signals

Two-Level Logic Circuits Switching Activity Minimization (2)
• Example: $F = x_0 x_1 + x_0 x_2 + x_0 x_3$
• Signal x3 exhibits low transition probability and high static-1 probability, while the signals x0, x1, and x2 are characterized by high transition probabilities
[Figure: Initial logic circuit: AND gates g1, g2, g3 with inputs (x0, x1), (x0, x2), (x0, x3) and outputs y1, y2, y3 feeding gate g4 to produce F. Modified logic circuit: the same gates with x3 as an additional input, outputs y1', y2', y3', and g4 producing F'.]

Additional Info
• A. Chandrakasan and R. Brodersen, “Low Power CMOS Design”, Kluwer Academic Publishers, 1995
• Christian Piguet, Editor, “Low-Power Electronics Design”, CRC Press, November 2004
• D. Soudris, C. Piguet, C. Goutis, “Designing CMOS Circuits for Low Power”, Kluwer Academic Press, October 2002
• F. Catthoor, K. Danckaert, et al., “Data Access and Storage Management for Embedded Programmable Processors”, Kluwer Academic Publishers, 2002
• Stamatis Vassiliadis and Dimitrios Soudris, “Fine- and Coarse-Grain Reconfigurable Computing”, Springer, Dordrecht/London/Boston, August 2007
• http://vlsi.ee.duth.gr/~dsoudris
• AMDREL website: http://vlsi.ee.duh.gr/amdrel