Low Power Lookup-Table Design Using Power Gating and
Body Biasing Techniques
Yang Fu, Kejia Li
ECE 6332 – Fall 2010
University of Virginia
[email protected], [email protected]
ABSTRACT
In this paper, we describe a low-power lookup-table (LUT) design
using power gating and body biasing techniques. The LUT can
operate in high speed / low power / sleep mode, adapting to
system requirements.
1. INTRODUCTION
The lookup table (LUT) is an important component of digital ICs.
A LUT can encode Boolean functions efficiently, and it is one of
the key components of an FPGA. As VLSI technology scales down,
reducing the power consumption of LUTs in FPGA applications
while maintaining speed becomes increasingly important [1].
Various techniques, including dual-Vt, dual-VDD, body biasing,
and sleep transistors, can be used to reduce the power
consumption of a LUT. Anderson and Najm studied different
footer / header designs for the multiplexer routing circuitry [2].
Arifur and Vijay compared various leakage reduction techniques
for FPGA circuits [1]. A dual-Vdd/dual-Vt technique was proposed
in [3]. A reduced-swing technique can also be used for this
purpose [4].
In our design, we use high-Vt devices together with forward
body biasing (FBB) to reduce the power consumption of the mux
in the LUT while maintaining its speed. We also use power gating
and a reduced-swing technique to reduce the power consumption
by operating the device in low power / sleep mode.
2. SYSTEM CONFIGURATION
The system configuration of our design is illustrated in Figure 1.
[Figure 1 System Configuration. Blocks: input driver, mux, output
driver, bus, level converter, output, well driver, power gating
selector. Inputs: control signal, SRAM configuration bits.]
The well driver provides the FBB voltage for the NMOS pass
transistors in the mux. The power gating circuitry controls the
virtual VDD (VVDD) of the input driver / mux / output driver.
There is a bus with a large capacitive load (100 fF) between the
output driver and the level converter. In low power mode, the
voltage swing on the bus is reduced to conserve power. The level
converter restores the full-swing signal and thus prevents
excessive leakage in the following stages due to the reduced
swing signal. A power gating selection circuit controls the
operation of the power gating headers in each component.
3. BODY BIASING
3.1 Determine the optimal FBB voltage
First we need to determine the appropriate FBB voltage to use.
The major metrics are static power consumption, propagation
delay, and energy per switch. We also need to choose the
threshold voltage Vth of the NMOS pass transistors in the mux. A
lower-Vth pass transistor speeds up the circuit but results in
larger static power consumption. We start with a 4x1 mux. The
comparison is shown below in Figure 2.
[Figure 2 Comparison of static power consumption. Static power
consumption (W, log scale) versus forward body biasing (V, 0.0 to
1.0) for NMOS_VTL, NMOS_VTG, and NMOS_VTH devices.]
The NMOS_VTL devices show large static power consumption at
increased FBB voltage. The static power of both the NMOS_VTG
and NMOS_VTH devices increases only slightly with larger FBB
voltage.
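The link between body bias and threshold voltage behind this tradeoff can be sketched with the standard first-order body-effect equation, Vth = Vth0 + gamma * (sqrt(2*phi_F + V_SB) - sqrt(2*phi_F)), where forward biasing the body makes V_SB negative and therefore lowers Vth. This is only a minimal sketch: the function name and the values of vth0, gamma, and phi_f are illustrative assumptions, not parameters of the devices simulated in this work.

```python
import math

def vth_with_body_bias(v_fbb, vth0=0.45, gamma=0.3, phi_f=0.4):
    """First-order body-effect model for an NMOS transistor.

    v_fbb is the forward body bias (body above source), so V_SB = -v_fbb.
    vth0, gamma and phi_f are hypothetical illustrative values.
    """
    v_sb = -v_fbb
    surface = 2.0 * phi_f
    if surface + v_sb < 0:
        raise ValueError("model not valid: forward bias exceeds 2*phi_f")
    return vth0 + gamma * (math.sqrt(surface + v_sb) - math.sqrt(surface))

# Sweep a few bias points; Vth falls as the forward bias grows.
for v in (0.0, 0.2, 0.4, 0.6):
    print(f"FBB = {v:.1f} V -> Vth = {vth_with_body_bias(v):.3f} V")
```

A lower Vth gives more drive current (shorter delay) but also more subthreshold leakage, which is the qualitative trend seen in the static power comparison.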
[Figure 3 Comparison of propagation delay. Propagation delay (ps)
versus forward body biasing (V, 0.0 to 1.0) for NMOS_VTL,
NMOS_VTG, and NMOS_VTH devices.]
The comparison of propagation delay is shown in Figure 3. The
NMOS_VTH devices show excessive delay, while the delays of the
NMOS_VTG and NMOS_VTL devices are acceptable. The energy per
switch is about the same for all three types of devices.
So we choose NMOS_VTG devices for the pass transistors in the
mux. As we can see from the plots, an FBB voltage of 0.6~0.7 V
gives the best tradeoff between static power consumption and
propagation delay, so this is the voltage that our FBB
generation circuit provides.
Next we break down the static power / delay / energy per switch
for each component in the circuit.
[Figure 4 Static power consumption in each component. Static
power consumption (W, log scale) versus forward body biasing (V,
0.0 to 1.0) for the level converter, input driver, mux, output
driver, and total.]
3.2 High power mode operation
In a 4x1 mux the static power consumption is dominated by the
output driver. The power consumption of the mux (including the
input driver) increases quickly with increasing FBB voltage. The
delay of each component is shown in Figure 5. Tables 1 and 2
summarize the FBB operation of the 4x1 and 16x1 muxes in high
power mode. We can see that FBB reduces the propagation delay
by 9.8% in a 16x1 mux while increasing the static power
consumption by 32%.
[Figure 5 Comparison of propagation delay. Propagation delay (ps)
versus forward body biasing (V, 0.0 to 1.0) for the level
converter, input driver, mux, and output driver.]
[Figure 5 Comparison of energy per switch. Energy per switch (fJ,
log scale) versus forward body biasing (V, 0.0 to 1.0) for the
output driver, mux, and level converter.]

                          FBB off (0V)   FBB on (0.64V)   Change
Static power (µW)         0.37           0.45             +21.6%
Mux delay (ps)            69             37               -46.4%
Total delay (ps)          312            297              -4.8%
Energy per switch (fJ)    148            148              0%
Table 1. FBB operation of 4x1 mux in high power mode

                          FBB off (0V)   FBB on (0.64V)   Change
Static power (µW)         0.59           0.78             +32.0%
Mux power (µW)            0.23           0.44             +47.7%
Mux delay (ps)            137            86               -37.2%
Total delay (ps)          381            347              -9.8%
Energy per switch (fJ)    151            151              0%
Table 2. FBB operation of 16x1 mux in high power mode
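As a quick check of the arithmetic, the Change column of Table 1 follows from the relative change with respect to the FBB-off value. The sketch below uses the values from Table 1; the helper name percent_change is a hypothetical one introduced here.

```python
def percent_change(off_value, on_value):
    """Relative change from the FBB-off value to the FBB-on value, in percent."""
    return (on_value - off_value) / off_value * 100.0

# (FBB off, FBB on) pairs copied from Table 1 (4x1 mux, high power mode).
table1 = {
    "Static power (uW)": (0.37, 0.45),
    "Mux delay (ps)": (69, 37),
    "Total delay (ps)": (312, 297),
    "Energy per switch (fJ)": (148, 148),
}

for metric, (off_value, on_value) in table1.items():
    print(f"{metric}: {percent_change(off_value, on_value):+.1f}%")
```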
3.3 The well driver
The structure of the well driver is shown below. It consists of an
inverter and four NMOS transistors. The upper two NMOS
transistors are diode-connected. When the input is low, the
output of the inverter is high, so M2 is off and M3 is on. The
output is low and no FBB voltage is supplied. When the input is
high, M3 is off and M2 is on. The output is VDD – 2Vtn, because
each diode-connected transistor drops one threshold voltage Vtn.
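The two output levels described above can be written as a small idealized model. This is a sketch under stated assumptions: the function name is hypothetical, the transistors are treated as ideal switches and ideal threshold drops, and the VDD and Vtn values in the usage lines are illustrative only, since the text does not give them.

```python
def well_driver_output(input_high, vdd, vtn):
    """Idealized well-driver output voltage in volts.

    Input low  -> output pulled low, no FBB supplied (0 V).
    Input high -> two diode-connected NMOS drops below VDD.
    """
    if not input_high:
        return 0.0
    return max(vdd - 2.0 * vtn, 0.0)

# Hypothetical values: VDD = 1.0 V and Vtn = 0.2 V are assumptions.
print(well_driver_output(False, vdd=1.0, vtn=0.2))
print(well_driver_output(True, vdd=1.0, vtn=0.2))
```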
3.4 Effect of pass transistor sizing
A parameter sweep of the NMOS pass transistor width is used to
determine the optimal size of the NMOS transistors in the mux.
The result is shown below. (The energy per switch stays about the
same; the data is not shown here.)
[Figure: Delay (ps) and static power (µW) versus pass transistor
width (nm, 0 to 1000).]
We can see that the static power increases with increasing pass
transistor width. The delay curve has a minimum, and the mux is
designed at this point. If the pass transistor is too narrow, its
limited current drive causes a large delay; if it is too wide,
the capacitive loading becomes too large.
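One way to see why a minimum-delay width exists is a first-order RC model. The model is an assumption made here for illustration and is not the simulation setup used above: a driver resistance r_drv in series with a pass transistor whose on-resistance scales as k_r / W, charging a fixed load c_load plus a parasitic capacitance c_w * W that grows with width. All parameter names and values are hypothetical.

```python
import math

def rc_delay(width_nm, r_drv=5e3, k_r=2e6, c_load=2e-15, c_w=2e-18):
    """First-order RC delay (seconds) of a driver plus pass transistor.

    r_drv  : driver resistance in ohms (hypothetical)
    k_r    : pass-transistor resistance * width, in ohm*nm (hypothetical)
    c_load : fixed load capacitance in farads (hypothetical)
    c_w    : parasitic capacitance per nm of width, in F/nm (hypothetical)
    """
    resistance = r_drv + k_r / width_nm
    capacitance = c_load + c_w * width_nm
    return resistance * capacitance

def optimal_width_nm(r_drv=5e3, k_r=2e6, c_load=2e-15, c_w=2e-18):
    """Width that minimizes rc_delay, from d(delay)/dW = 0."""
    return math.sqrt(k_r * c_load / (r_drv * c_w))

# Coarse sweep, similar in spirit to the parameter sweep in the text.
widths = range(100, 1001, 100)
best = min(widths, key=rc_delay)
print("swept minimum at", best, "nm")
print("analytic optimum at", round(optimal_width_nm(), 1), "nm")
```

In this model the k_r * c_load / W term dominates for narrow devices and the r_drv * c_w * W term dominates for wide ones, which mirrors the two failure cases described in the text.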
4. POWER GATING
4.1 Header Circuit
The header circuit used for power gating is shown in the above
graph. It consists of a PMOS and an NMOS transistor. In high
speed mode, both the PMOS and the NMOS are on, and VVDD = VDD.
In low power mode, the PMOS is turned off and only the NMOS
supplies power, so VVDD = VDD – Vtn. The voltage swing in the
circuit is reduced, especially on the bus, which reduces the
dynamic power consumption. The level converter then restores the
signal to full swing. Finally, in sleep mode, both the PMOS and
the NMOS are off and the circuit is cut off from the power
supply.
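The three operating modes map onto the header switches as in the sketch below. It is an idealized model: the names Mode, HEADER_SWITCHES, and virtual_vdd are hypothetical, and the VDD and Vtn values in the usage loop are assumptions chosen for illustration.

```python
from enum import Enum

class Mode(Enum):
    HIGH_SPEED = "high speed"
    LOW_POWER = "low power"
    SLEEP = "sleep"

# (PMOS on, NMOS on) for each operating mode of the header.
HEADER_SWITCHES = {
    Mode.HIGH_SPEED: (True, True),
    Mode.LOW_POWER: (False, True),
    Mode.SLEEP: (False, False),
}

def virtual_vdd(mode, vdd, vtn):
    """Idealized VVDD in volts, or None when the header is fully off."""
    pmos_on, nmos_on = HEADER_SWITCHES[mode]
    if pmos_on:
        return vdd            # PMOS passes the full supply
    if nmos_on:
        return vdd - vtn      # NMOS alone loses one threshold voltage
    return None               # sleep: circuit cut off from the supply

# Hypothetical supply and threshold values.
for mode in Mode:
    print(mode.value, virtual_vdd(mode, vdd=1.0, vtn=0.3))
```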
4.2 Low power mode operation
The graph below shows the waveform on the bus and final output
during the low power mode (FBB is off). The voltage swing on
the bus is reduced to ~0.7 V.
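The saving from the reduced bus swing can be estimated with the usual charge argument: charging a capacitance C through a swing V_swing draws a charge C * V_swing from the supply, so the energy taken from a supply at VDD is C * VDD * V_swing. The sketch below applies this to the 100 fF bus and the ~0.7 V swing quoted in the text. The supply voltage of 1.0 V is an assumption, the function name is hypothetical, and the result covers the bus only, not the total energy per switch of the circuit.

```python
def bus_energy_per_charge_fj(c_bus_ff, v_supply, v_swing):
    """Energy (fJ) drawn from the supply to charge the bus once.

    Charging a capacitance C through a swing V_swing pulls a charge
    Q = C * V_swing from a supply at V_supply, so E = C * V_supply * V_swing.
    With C in fF and voltages in V, the result is in fJ.
    """
    return c_bus_ff * v_supply * v_swing

C_BUS_FF = 100.0   # bus loading stated in the text (100 fF)
VDD = 1.0          # assumed supply voltage; not given in the text
SWING_LOW = 0.7    # reduced bus swing reported for low power mode

full = bus_energy_per_charge_fj(C_BUS_FF, VDD, VDD)
reduced = bus_energy_per_charge_fj(C_BUS_FF, VDD, SWING_LOW)
print(f"full swing:    {full:.1f} fJ")
print(f"reduced swing: {reduced:.1f} fJ")
```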
Power gating can be combined with FBB. The table below
summarizes the result. We can see that the effect of FBB is more
pronounced in low power mode.

                          FBB off   FBB on   Change
Static power (µW)         0.44      0.52     +18.2%
Delay (ps)                829       654      -21.2%
Energy per switch (fJ)    63.6      76.5     +20.1%

4.3 Effect of Header Sizing
In low power mode, power is supplied to the circuit through the
NMOS in the header, so the sizing of this NMOS affects the
performance of the circuit. A parameter sweep of the header NMOS
width is used to determine the effect. The result is shown
below.
[Figure: Delay (ps) and energy per switch (fJ) versus header NMOS
width (nm, 0 to 1000).]
The delay decreases with increasing NMOS width, while the
energy per switch follows the opposite trend. The static power
remains about the same for different NMOS widths (data not
shown here). A wider NMOS can supply more current, so VVDD is
closer to VDD. The transitions therefore become faster and the
energy per switch increases.
4.4 Sleep Mode
Both the NMOS and the PMOS in the headers are turned off in
sleep mode. The FBB is also off. The static power consumption in
sleep mode is ~46 nW.
5. CONCLUSION
We have developed a low-power LUT circuit using power gating
and body biasing techniques. The circuit can operate in high
speed / low power / sleep mode for different tasks. This study
provides useful information for low-power LUT design in FPGA
applications.
6. REFERENCES
[1] Arifur R. and Vijay P. Evaluation of Low-Leakage Design
Techniques for Field Programmable Gate Arrays. FPGA’04,
February 22-24, 2004, Monterey, California, USA.
[2] Anderson, J.H. and Najm, F.N. Low-Power Programmable
FPGA Routing Circuitry. IEEE Transactions on Very Large
Scale Integration (VLSI) Systems, vol. 17, no. 8,
pp. 1048-1060, Aug. 2009.
[3] Fei L., Yan L., Lei H. and Jason C. Low Power FPGA Using
Predefined Dual-Vdd/Dual-Vt Fabrics. FPGA’04, February
22-24, 2004, Monterey, California, USA.
[4] Garcia, J.C., Montiel-Nelson, J.A. and Nooshabadi, S. Adaptive
Low/High Voltage Swing CMOS Driver for On-Chip
Interconnects. IEEE International Symposium on Circuits and
Systems (ISCAS 2007), pp. 881-884, 27-30 May 2007.