Download Leakage Power Analysis and Comparison of Deep Submicron Logic

Leakage Power Analysis and Comparison of Deep Submicron Logic Gates* Geoff Merrett and Bashir M. Al-Hashimi ESD Group, School of Electronics and Computer Science, University of Southampton, UK [email protected] Abstract. Basic combinational gates, including NAND, NOR and XOR, are fundamental building blocks in CMOS digital circuits. This paper analyses and compares the power consumption due to transistor leakage of low-order and high-order basic logic gates. The NAND and NOR gates have been designed using different design styles and circuit topologies, including complementary CMOS, partitioned logic and complementary pass-transistor logic. The XOR gate has been designed using a variety of additional circuit topologies, including double pass-transistor logic, differential cascade voltage switch logic and a gate designed specifically for low power. The investigation has been carried out with HSPICE using the Berkeley Predictive Technology Models (BTPM) for three deep submicron technologies (0.07 m, 0.1 m and 0.13 m). 1 Introduction The rising demand for portable systems is increasing the importance of low power as a design consideration. Considerable research is being performed into various techniques for lowering the power consumption of digital circuits [1]. As MOS transistors enter deep submicron sizes, undesirable consequences regarding power consumption arise. Decreasing the dimensions of the transistor requires a reduction in the supply voltage to keep the dynamic power consumption reasonable [2]. In turn, this demands a reduction of the threshold voltage to maintain performance, which causes an exponential increase in the subthreshold leakage current [3]. Recent research has shown that, with ever shrinking dimensions, the leakage current will become even greater than the dynamic current in the overall power dissipation [4]. Recently, numerous techniques have been proposed which aim to reduce leakage power. These include supply voltage reduction [5], [6], supply voltage gating [2], multiple or increased threshold voltages [2], [6], [7], [8], [9] and minimising leakage power consumption in sleep states [10], [11]. However, from a leakage power perspective, little work has been reported comparing different design styles and circuit topologies of the basic gates. Investigating this will allow designers to choose the correct design style for a gate to achieve low leakage power consumption. This paper presents a systematic comparison between NAND, NOR and XOR gates, at three DSM process technologies (0.07 m, 0.1 m and 0.13 m), implemented using differ* This research is supported in part by EPSRC(UK) grant GR/595770 ent design styles and circuit topologies. Furthermore, this paper demonstrates how the stacking effect [12] can play an important role at the design stage in reducing leakage by considering transistor ordering. 2 Preliminaries This section briefly analyses the leakage power of a 2-input NAND gate and reviews a number of design styles and circuit topologies that can be used to implement basic gates. This information will subsequently be used to explain the investigations and simulation results outlined in this paper. Fig. 1. 2-input NAND, (a) COMP [13], (b) CPL [15] There are numerous design styles that can be used to realise digital CMOS gates. The three most commonly employed are examined in this paper: complementary CMOS [13], partitioned logic [13] and various pass-transistor designs [13], [14], [15] – focusing primarily on complementary pass logic (CPL). Complementary CMOS (denoted from here on as ‘COMP’) consists of a PMOS pull-up network and NMOS pull-down network. Fig. 1(a) shows a 2-input COMP NAND gate, and Fig. 1(b) shows the same gate designed using CPL. The advantages and disadvantages of CPL are well documented [13], [15], [16], [17]. The numbers next to the transistors in the figures represent their widths in relation to their lengths; the lengths are set to the smallest dimensions of the technology used. In designing high-order gates (>2 inputs), COMP and partitioning design styles can be employed. Fig. 2 shows an 8-input NAND gate implemented using the partitioned designed style. Different design styles and circuit topologies have been proposed to implement a 2-input XOR gate. These include COMP [15], CPL [15], transmission gates [13], differential cascade voltage switch logic (DCVSL) [13], double pass logic (DPL) [15] and a gate specifically designed for low power (LP) [14]. Fig. 3 shows the two circuits for a 2-input XOR gate produced using COMP and LP design styles. Fig. 2. 8-input partitioned NAND [13] Fig. 3. 2-input XOR, (a) COMP [15], (b) LP [14] In CMOS digital circuits, power dissipation consists of two main components: dynamic and static. As opposed to dynamic power (which is a result of switching activity), leakage power occurs as a result of the subthreshold leakage current present in deep submicron MOS transistors acting in the weak inversion region [5]. The leakage current is exponentially dependant on the gate-source voltage of the transistor, causing a drain current to flow even when the gate-source voltage is zero [3]. The leakage current for an NMOS transistor operating with Vgs 0 is given in equation (1), where the parameters have their usual meanings [7]. I s = I 0 exp Vgs − VT nVth 1− e −Vds Vth (1) In [3], analytical equations that model the leakage current for a series of stacked transistors were given. These equations give the subthreshold current for a stack of one transistor (IS1), two transistors (IS2) and three transistors (IS3). It was also shown in [3] that the more transistors there are in a stack, the smaller is the leakage current (i.e. IS1 > IS2 > IS3). The leakage current for transistors in parallel is given by the sum of the individual currents through each transistor. The leakage current in a CMOS gate depends on the input pattern applied; for example, in the case of the 2input COMP NAND gate (shown in Fig. 1(a)), the leakage current is highest when A=1 and B=1, since both PMOS transistors are off. In this case, the total leakage current is given by the sum of the individual currents through each of the parallel transistors (IS1+IS1=2*IS1). The leakage current is smallest when A=0 and B=0, since both NMOS transistors are off, and the total leakage current is given by IS2. The leakage power results presented throughout this paper were obtained using HSPICE [18] with the Berkeley Predictive Technology Models (BPTM) [19] and minimum-size transistors [13]. The results were obtained from a transient analysis over an exhaustive set of input patterns. The average leakage power was calculated by averaging the power over all possible input combinations (i.e. all inputs have an equal probability of occurring). The leakage power consumption was calculated by multiplying the simulated leakage current by the supply voltage. 3 Design of Low Order Gates 14000 14000 12000 12000 Leakage Power (pW) Leakage Power (pW) Section 2 has shown that there are many design styles and circuit topologies to realise the functions of basic gates. Whilst comparisons exist between the different designs in terms of speed, area and dynamic power [15], there is little reported work giving a systematic comparison between different gate designs from a leakage power perspective – the main aim of this paper. Fig. 4(a) shows the simulated leakage power performance of 2-input NAND and NOR gates using the COMP and CPL design styles (Fig. 1) at 0.07 m. As can be seen, the leakage power of a CPL based gate is nearly four times that of the equivalent COMP gate. This is because, as outlined in Section 2, the highest leakage current for the COMP NAND gate is 2*IS1 and occurs when the input is “1,1”. The lowest leakage current is IS2, and occurs when the input is “0,0”. In the case of the CPL NAND gate, the highest leakage current is 5*IS1 and occurs when the input is “0,1” or “1,0”. The lowest leakage current is 3*IS1 and occurs when the input is “0,0” or “1,1”. The leakage current is higher for the CPL design style, as there are no stacks (meaning all of the leakage currents are IS1s), and extra leakage current is drawn through level-restoring and output-driving circuitry. 10000 8000 6000 4000 2000 0 NAND NAND NOR NOR (COMP) (CPL) (COMP) (CPL) (a) 10000 2-input COMP NAND 2-input CPL NAND 8000 6000 4000 2000 0 0.13 m 0.1 m 0.07 m (b) Fig. 4. Leakage power of COMP and CPL 2-input, (a) NAND and NOR at 0.07 m, (b) NAND across DSM technologies Fig. 4(b) shows the leakage power performance of the 2-input NAND gate at different DSM technologies. It should be noted that, as the 2-input NOR gate has almost the same leakage as the NAND, the results are not shown in this paper. Fig. 4(b) reinforces the prediction that the leakage power in a CMOS circuit increases as the technology shrinks. It can be observed from Fig. 4 that leakage power is an issue in DSM gates and, to minimise this, designers should select the COMP design style in preference to CPL when implementing 2-input NAND and NOR gates. 20000 Leakage Power (pW) 18000 16000 14000 12000 10000 8000 6000 4000 2000 0 LP [r9] TG [r17] COMP DCVSL [r10] [r17] CPL [r10] DPL [r10] Fig. 5. Leakage power of 2-input XOR at 0.07 m Up to this point, only NAND and NOR gates have been considered. Numerous design styles for 2-input XOR gates (a selection is shown in Fig. 3) have been reported in the literature, each with merits and shortcomings in terms of area, speed and dynamic power [15]. Fig. 5 shows the leakage power comparison of six different design styles using a 0.07 m technology. As can be seen, the LP design (Fig. 3(b)) consumes the least leakage power, the COMP design consumes less leakage power than CPL, and the DPL design [15] has the highest leakage power. The LP gate performs best through the use of only six transistors, compared to 12 for COMP, 10 for CPL and 12 for DPL. The LP gate has a leakage current given by 3*IS1 for inputs “0,0”, “0,1” and “1,0”, and IS1 for input “1,1”. This is in contrast to COMP having a leakage current given by 4*IS1, CPL given by 5*IS1, and DPL given by 6*IS1 (for all inputs). It is informative to note at this stage that the subthreshold current through an NMOS transistor is not equal to that of a PMOS transistor. It is for this reason that, in Fig. 5, the difference between the leakage power of CMOS and CPL (5*IS1 – 4*IS1) is not equal to the difference between DPL and CPL (6*IS1 – 5*IS1). Table 1 examines the leakage power performance of the 2-input XOR gate, implemented at different DSM technologies. As expected, the leakage power increases as the process is scaled down; for example, the leakage power of the COMP XOR at 0.1 m is 3050pW but increases to 10600pW at 0.07 m. It is known that DPL is worse than the other design styles for area, dynamic power and speed, and this has been reinforced by these results for leakage power. From Fig. 5 and Table 1, it can be observed that the LP design reported in [14] has the least leakage power. This design also has the fewest transistors compared to the other topologies outlined in Table 1, and has low delays and dynamic power [14]. This indicates that the LP XOR gate provides an attractive choice in terms of area, speed and power (dynamic and leakage). Table 1. Leakage power of six designs for 2-input XOR gates at three DSM technologies Gate Design Style 2-Input XOR COMP [15] DCVSL [13] CPL [15] DPL [15] TG [13] LP [14] No. of Trans. 12 8 10 12 6 6 Average Leakage Power (pW) 0.07 m 0.1 m 0.13 m 10600 3050 709 10700 2620 659 15700 4300 922 18600 5340 1150 9320 2670 577 7570 2070 448 We believe that, in applications where a unified design style is an important issue for basic gates, the COMP design style should be given serious consideration. This is because the COMP design style produces NAND and NOR gates with low leakage power across DSM technologies (NAND shown in Fig. 4(b)), while the XOR gate also performs comparatively well from a leakage power perspective – CPL is around 50% worse than COMP, while LP is less than 30% better than COMP. 4 Design of Higher Order Gates To design higher-order basic gates (> 2 inputs), the COMP (Fig. 1(a)) or partitioned (Fig. 2) design style can be chosen. As partitioning simply breaks down a high order gate into lower order gates, it could be implemented using CPL gates (Fig. 1(b)). However, as it was shown in Fig. 4 that the CPL leakage power is greater than COMP for individual gates, it is expected that the partitioned CPL gates would also have a higher leakage power consumption. For this reason, in this paper, only complementary CMOS and partitioned CMOS (denoted from hereon as ‘partitioned’) design styles are analysed. Higher-order XOR gates were not investigated as they have limited practical use. 8000 Leakage Power (pW) 7000 COMP Partitioned 6000 5000 4000 3000 2000 1000 0 0.13 m 0.1 m 0.07 m Fig. 6. Leakage power of COMP and partitioned 8-input NAND at three DSM technologies Fig. 6 shows the leakage power of an 8-input NAND gate using COMP (Fig. 7) and partitioning (Fig. 2) at three DSM technologies. For a given technology, it can be seen that the COMP gate consumes less power through leakage than the equivalent partitioned gate. This is because, as outlined in Section 2, the more transistors there are in a stack, the lower the leakage current is through the stack. A COMP NAND or NOR gate has one deep stack – eight in Fig. 7 (leading to one leakage current). Partitioning, however, introduces several smaller stacks (giving rise to more, larger leakage currents). This means that the average leakage power in the partitioned gate is higher than that of the COMP gate. The leakage power performance of 4, 6 and 8 input COMP and partitioned NAND designs are given in Table 2. Table 2. Leakage power of two designs for high order gates at three DSM technologies Partitioned COMP Design Style Gate 4ip NAND 6ip NAND 8ip NAND 4ip NOR 6ip NOR 8ip NOR 4ip NAND 6ip NAND 8ip NAND 4ip NOR 6ip NOR 8ip NOR Average Leakage Power (pW) 0.07 m 0.1 m 0.13 m 1590 416 116 723 173 58 350 73 32 1390 417 91 557 181 40 221 83 20 10600 2880 656 8580 2260 531 6940 1770 424 10300 3040 686 8130 2450 562 6440 2000 464 Fig. 7. 8-input COMP NAND gate We have analysed the leakage power performance of an 8-input NOR gate for a number of DSM technologies, and the results are given in Table 2. Again, this shows that the COMP design is more efficient than partitioning from a leakage power perspective. Fig. 6 and Table 2 indicate that designers should choose the COMP design style to obtain leakage power efficient high order NAND and NOR gates. However, this choice is achieved with an inferior speed performance [13], particularly when the number of gate inputs is greater than six. This trade-off needs to be given careful consideration. It can also be seen from Table 2 that, as the number of inputs to a gate increases, the average leakage power decreases. This is because a higher number of inputs causes a longer stack, and hence a lower average leakage power. 5 Input Pattern Ordering As outlined in Section 2, the leakage power consumed in a CMOS circuit is dependant on the inputs to the gate. Previous research demonstrated that the leakage power is the same for inputs “0,1” and “1,0” at 0.35 m [7], and that leakage current in a stack is ‘almost’ independent of ordering for a constant number of ‘off’ transistors [12]. 0.13 m Leakage Power (pW) 450 400 350 0.13 m 0.07 m 300 6000 5000 4000 250 200 3000 150 2000 100 1000 50 0 0.07 m Leakage Power (pW) 7000 500 0 11 01 10 00 Input Pattern (m.s.b at Top of Stack) Fig. 8. Leakage power of 0.07 m and 0.13 m 2-input COMP NAND gate for all input patterns The 2-input COMP NAND gate (Fig. 1(a)) was simulated for a number of DSM processes. Fig. 8 shows that the leakage current for the inputs “ 0,1” and “ 1,0” are no longer equal, and the difference increases as the DSM process shrinks. It can be seen that the input combination “ 0,1” produces a larger leakage current than “ 1,0” . Exhaustive simulations were carried out for high order NAND gates, and it was observed that the leakage power varied considerably for patterns containing only one ‘zero’ – shown in Fig. 9 for an 8-input NAND. It can be seen that the input pattern “ 01111111” produces the largest current whilst “ 11111110” produces the least. We believe this observation should be exploited. Similarly, the equivalent observation for NOR gates was found; for input combinations containing a single ‘one’ , “ 10000000” gives rise to the smallest leakage current, while “ 00000001” gives the largest. This observation can be explained by the fact that, for an 8-input COMP NAND, the input “ 01111111” gives rise to a greater body effect in the transistor acting in the weak inversion region than the input “ 11111110” . This is investigated further without reference to ordering in [11]. Table 3 shows the saving that can be made for IS1 leakage currents by taking note of input ordering. For example, with an 8-input NAND gate at 0.07 m, a saving of 1230pA can be obtained by using the input “ 01111111” rather than “ 11111110” , and this equates to a percentage saving of 34%. In order to exploit this observation, the following guidelines should be followed: • For NMOS stacks (NAND gates), a ‘zero’ closest to the output node of the gate will give rise to the largest IS1 leakage current, while a ‘zero’ closest to ground will give the smallest IS1 leakage current. For PMOS stacks (NOR gates), a ‘one’ closest to the output node of the gate will give rise to the largest IS1 leakage current, while a ‘one’ closest to Vdd will give the smallest IS1 leakage current. Leakage Power (pW) 3500 3000 2500 2000 1500 1000 500 0 11111110 11111101 11111011 11110111 11101111 11011111 10111111 01111111 Input Pattern (m.s.b at Top of Stack) Fig. 9. Leakage power of an 0.07 m 8-input NAND gate for all inputs containing a single zero Table 3. Leakage power savings through transistor ordering for three DSM technologies Gate 2ip NAND 3ip NAND 4ip NAND 6ip NAND 8ip NAND 2ip NOR 3ip NOR 4ip NOR 6ip NOR 8ip NOR Average Leakage Power (pW) 0.07 m 0.1 m 0.13 m 969 (26%) 305 (41%) 11.1 (8%) 1070 (29%) 330 (44%) 11.8 (8%) 1130 (31%) 342 (46%) 11.8 (8%) 1200 (33%) 354 (47%) 11.8 (8%) 1230 (34%) 361 (48%) 11.8 (8%) 1240 (39%) 373 (36%) 74 (36%) 1370 (42%) 408 (40%) 81 (39%) 1420 (44%) 426 (41%) 84 (40%) 1490 (46%) 445 (43%) 85 (41%) 1530 (48%) 456 (44%) 85 (41%) 6 Conclusions We have presented a systematic comparison of the leakage power for basic gates, designed using different design styles and circuit topologies, and implemented at three DSM technologies. The results have shown that complementary CMOS is favoured over CPL and gate partitioning for implementing both low order and high order NAND and NOR gates. The low power XOR gate design developed in [14] provides a leakage power saving of 50% over CPL. These findings need to be considered carefully when choosing a particular design style due to the trade-offs that exist. Whilst complementary CMOS has the least leakage power for high-order gates, this is achieved at the expense of speed that the partitioning design style permits. We have shown how input pattern ordering may be exploited to reduce leakage power. References 1. Piguet, et al.: “ Extremely Low-Power Logic,” in Proc. Design Automation and Test in Europe Conf., Feb 2004, pp. 656-661 2. Y-F. Tsai, D. Duarte, N. Vijaykrishnan and M. J. Irwin: “ Implications of Technology Scaling on Leakage Reduction Techniques” in Proc. Conf. Des. Automation, 2003, pp. 187-190 3. R. X. Gu and M. I. Elmasry: “ Power Dissipation Analysis and Optimisation of Deep Submicron CMOS Digital Circuits,” IEEE J. Solid-State Circuits, May 1996, vol. 31, pp. 707-713 4. N. S. Kim, et al.: “ Leakage Current: Moore’ s Law Meets Static Power,” IEEE J. Computer, Dec. 2003, vol. 36, pp. 68-75 5. L. Wei, K. Roy and V. K. De: “ Low Voltage Low Power CMOS Design Techniques for Deep Submicron ICs,” Int. Conf. VLSI Design, Jan. 2000, pp. 24-29 6. D. Liu and C. Svensson: “ Trading Speed for Low Power by Choice of Supply and Threshold Voltages,” IEEE J. Solid-State Circuits, Jan 1993, vol. 28, pp. 10-17 7. K. S. Khouri and N. K. Jha: “ Leakage Power Analysis and Reduction During Behavioural Synthesis,” IEEE Trans. VLSI Systems, Dec. 2002, vol. 10, pp. 876-885 8. K. Roy: “ Leakage Power Reduction in Low-Voltage CMOS Designs,” in Proc. Int. Conf. Electron., Circuits Syst., Sept 1998, pp. 167-173 9. N. Sirisantana and K. Roy: “ Low-Power Design Using Multiple Channel Lengths and Oxide Thicknesses,” in IEEE Design & Test of Computers, Jan. 2004, vol. 21, pp. 56-63 10.J. P. Halter and Farid N. Najm: “ A Gate-Level Leakage Power Reduction Method for UltraLow-Power CMOS Circuits,” in IEEE Custom Integrated Circuits Conf., 1997, pp. 475-478 11.Z. Chen, L. Wei, M. Johnson and K. Roy: “ Estimation of Standby Leakage Power in CMOS Circuits Considering Accurate Modelling of Transistor Stacks,” in Proc. Symp. Low Power Design Electron., 1998, pp. 239-244 12.W. Jiang, V. Tiwari, E. de la Iglesia and A. Sinha: “ Topological Analysis for Leakage Power Prediction of Digital Circuits,” in Proc. Int. Conf. VLSI Design, Jan. 2002, pp. 39-44 13.J. M. Rabaey: “ Digital Integrated Circuits (A Design Perspective).” New Jersey, Prentice Hall, 1996 14.J. M. Wang, S. C. Fang and W. S. Feng: “ New Efficient Designs for XOR and XNOR Functions on the Transistor Level,” IEE Solid-State Circuits, July 1994, vol. 29, pp. 780-786 15.R. Zimmermann and W. Fichtner: “ Low-Power Logic Styles: CMOS Versus Pass-Transistor Logic,” IEEE J. Solid-State Circuits, July 1997, vol. 32, pp. 1079-1090 16.M. Alioto and G. Palumbo: “ Analysis and Comparison on Full Adder Block in Submicron Technology,” IEEE Trans. VLSI Systems, Dec. 2002, vol. 10, pp. 806-823 17.A. P. Chandrakasan, S. Sheng and R. W. Bordersen: “ Low-Power CMOS Digital Design,” in IEEE J. Solid-State Circuits, Apr. 1992, vol. 27, pp. 473-484 18.Avant!: “ Star-Hspice Manual (release 1999.2).” June 1999 19.Berkeley Predictive Technology Model (BPTM): Device Group, Univ. California at Berkeley [online] Available: http://www-device.eecs.berkeley.edu/~ptm/.

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Leakage Power Analysis and Comparison of Deep Submicron Logic