Subthreshold FinFET for Low Power Circuit Operation

Anupama Bowonder, Pratik Patel, University of California, Berkeley

Abstract: FinFET subthreshold operation will be analyzed for low power applications. With current technology scaling, FinFETs appear to be the most likely replacement for bulk CMOS in the near future. Further, since they have a swing near 60 mV/dec, they are better suited to subthreshold operation than bulk CMOS. Subthreshold FinFET SRAM operation is studied to determine the limit of supply voltage scaling, since that limit is set by SRAM functionality. Finally, the role of process variations in FinFET SRAM operation in the subthreshold regime is also studied.

MOTIVATION FOR SUBTHRESHOLD OPERATION

Steady scaling of CMOS has enabled the realization of very high performance VLSI circuits. However, for several applications such high speeds are not required. Medical equipment such as hearing aids and pacemakers, and wireless devices such as cellular phones and PDAs, for example, require extremely low power consumption. For such applications, subthreshold operation of circuits provides greatly reduced power consumption in return for a speed penalty. Subthreshold circuits are the most effective way to achieve low power because they reduce both dynamic and static power consumption. In addition to the quadratic decrease in dynamic power consumption, a decreased VDD also decreases DIBL and hence the transistor leakage currents, which may at some point even exceed the active power consumption and dominate the total power consumption of a transistor. Thus, for battery-driven wireless devices, subthreshold operation is ideal, as reduced leakage power helps extend battery life.

ENERGY MINIMIZATION IN SUBTHRESHOLD DESIGN

The total energy dissipated in a circuit operating in subthreshold can be broken into active energy and leakage energy.
The active or switching component is the energy consumed in charging the output capacitance to VDD and is given as

$$E_{active} = \alpha N C V_{DD}^{2}$$

Note that the activity factor is given by α and the total number of clock cycles is given as N. On the other hand, the component due to leakage arises from source-to-drain diffusion currents when the gate of the transistor is "off", i.e. VGS = 0 V. This current is drawn statically from the supply during logic computation and is modeled as

$$E_{leakage} = V_{DD}\, I_{S}\, e^{\frac{V_{gs} - V_{th}}{n V_{T}}} \left(1 - e^{-\frac{V_{ds}}{V_{T}}}\right) T_{delay}$$

For a fixed threshold voltage, scaling the supply voltage VDD into the subthreshold range substantially reduces both active and leakage energy, although at a significant penalty in performance. Wang et al. have simulated constant energy and performance contours in (VDD, Vt) space for a 16-b, 1024-pt FFT in a 0.18 µm process, as shown in Fig. 1. This figure confirms that for constant Vt, reduction in overall energy follows from reduction in supply voltage. However, reducing VDD beyond a certain point increases energy consumption because delay increases exponentially with decreasing supply voltage, thereby increasing the total leakage energy component. This implies that the absolute minimum supply voltage for correct functionality does not necessarily coincide with the minimum-energy voltage.

Fig. 1. Minimum energy point and constant energy and performance lines for 16-b and 1024-pt FFT.

Fig. 2. PMOS sizing constraints on subthreshold inverter for minimum sized NMOS for 0.18 µm process.

Fig. 3. Double-Gate Cgate vs. VGS characteristic for various gate lengths.

For fixed VDD, increasing the threshold voltage decreases the leakage energy since the off-current reduces exponentially, while the delay increases only moderately in a quadratic fashion. When Vt approaches the supply voltage, however, the leakage energy begins to increase because the impact on delay becomes significant.
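The supply-voltage trend described above can be illustrated numerically. The following is a minimal sketch, not the simulation of Wang et al.: it evaluates the two energy expressions on a grid of VDD values at a fixed Vt, using the first-order C·V/I delay model. Every parameter value (threshold voltage, slope factor, capacitance, logic depth, and the 0.10 V lower bound taken as a functional limit) is an assumption chosen only to make the trend visible.

```python
import math

# All values below are illustrative assumptions, not data from the paper.
V_T = 0.02585      # thermal voltage kT/q near 300 K, volts
N_SLOPE = 1.5      # subthreshold slope factor n
V_TH = 0.40        # threshold voltage, volts
I_S = 1e-6         # drain current at Vgs = Vth, amperes
C_LOAD = 1e-15     # switched capacitance C, farads
ALPHA = 0.1        # activity factor
N_CYCLES = 1000    # number of clock cycles N
DEPTH = 20         # gate delays per clock cycle


def drain_current(v_gs, v_ds):
    """I_S * exp((Vgs - Vth) / (n VT)) * (1 - exp(-Vds / VT))."""
    gate_term = math.exp((v_gs - V_TH) / (N_SLOPE * V_T))
    drain_term = 1.0 - math.exp(-v_ds / V_T)
    return I_S * gate_term * drain_term


def total_energy(v_dd):
    """Active plus leakage energy over N_CYCLES at supply voltage v_dd."""
    e_active = ALPHA * N_CYCLES * C_LOAD * v_dd ** 2
    i_on = drain_current(v_dd, v_dd)    # device fully on
    i_off = drain_current(0.0, v_dd)    # device off, Vgs = 0 V
    t_delay = N_CYCLES * DEPTH * C_LOAD * v_dd / i_on
    e_leakage = v_dd * i_off * t_delay
    return e_active + e_leakage


def minimum_energy_point(v_lo=0.10, v_hi=0.40, steps=301):
    """Grid search for the supply voltage with the lowest total energy."""
    grid = [v_lo + i * (v_hi - v_lo) / (steps - 1) for i in range(steps)]
    return min(grid, key=total_energy)


v_min = minimum_energy_point()
print(f"minimum-energy VDD = {v_min:.3f} V")
```

With these assumed values the minimum falls strictly inside the searched range: below it the exponentially growing delay inflates the leakage term, and above it the quadratic active term dominates.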
The competing effects of off-current and delay result in a minimum energy point with respect to the threshold voltage Vt. In general, for a given design there exists an optimal (VDD, Vt) pair for which total energy consumption is minimized. Typically, however, the performance penalty for operating at this minimum is severe. By backing off slightly from the absolute lowest energy, the gains in performance can be substantial. In Fig. 1, for example, increasing energy by 1.5X results in a performance improvement of 100X.

Fig. 4. Double-Gate ION and S vs. Lgate (VDD = 0.2 V).

SIZING CONSIDERATION IN SUBTHRESHOLD DESIGN

When scaling the supply voltage into the subthreshold regime, proper transistor sizing and process variation control become critical for ensuring correct circuit operation. The reduced ION/IOFF ratio and unwanted leakage paths have to be taken into account when designing logic circuits at the lower limit of VDD. For a subthreshold inverter with a minimum sized NMOS pull-down, supply voltage reduction places upper and lower limits on the sizing of the PMOS pull-up device for proper functionality. With a '0' input at the gate, the PMOS must pull the output node up to VDD by overpowering the IOFF idle current of the NMOS device. If the PMOS is not sized strongly enough, the output will not rise to the top rail, resulting in reduced noise margins. With process variations, the worst case is the fast NMOS/slow PMOS (FS) corner, because the pull-down has the largest leakage current and the pull-up has the smallest drive current. Similarly, with a '1' input, the minimum sized NMOS must discharge the output to 0 V while overpowering the PMOS leakage current, with the worst case at the slow NMOS/fast PMOS corner. Wang et al. have performed a sizing analysis of the subthreshold inverter in a 0.18 µm process technology, as shown in Fig. 2, where the minimum and maximum PMOS width Wp for proper logic swing is plotted as a function of supply voltage.
With continued voltage scaling, Wp is seen to increase substantially, imposing a significant area penalty when operating at the minimum allowable limit.

Fig. 5. Double-Gate Inverter Delay vs. Lgate.

DEVICE DESIGN FOR SUBTHRESHOLD OPERATION

In subthreshold CMOS circuit operation, Ion is set by Ioff and the subthreshold swing S: the current rises by one decade for every S volts of gate drive, so to first order Ion is Ioff multiplied by 10^(VDD/S). For a fixed Ioff, the ideal device for subthreshold operation therefore needs a swing close to the ideal 60 mV/dec. Prior studies analyzed the use of devices such as DTMOS in the subthreshold regime, since DTMOS has a near-ideal subthreshold slope. DTMOS is a device in which the body of a planar MOSFET is tied to its gate. Thus, when Vg = 0 V, the leakage and S of the device are exactly the same as those of a regular MOSFET, but when Vg > 0, Vt decreases as the source-body junction becomes forward biased. This leads to increased performance due to increased on current for the same off current as a regular MOSFET. The subthreshold DTMOS circuits exhibited lower delay due to increased Ion, higher power consumption due to the forward biased junction, and a lower power-delay product. Also, the delay of subthreshold DTMOS as a function of fan-out was found to be almost constant, unlike regular CMOS.

In our study, we will analyze FinFETs in subthreshold operation, since FinFETs and double gate devices achieve near-ideal swing because of enhanced back gate control, making them strong candidates for subthreshold operation. In addition, as transistor scaling continues, FinFETs are seen as the most likely replacement for bulk CMOS in the near future.

IMPACT OF FINFET PROCESS VARIATION ON SUBTHRESHOLD OPERATION

While FinFETs are more easily scaled down to 20 nm gate lengths and below due to improved subthreshold swing, building such devices poses several problems. Because the fabrication process for double gate devices is more complicated than that for planar devices, the chances of non-uniformity across a wafer during fabrication are increased.
These increased process variations could adversely affect the electrical characteristics of FinFETs, since they introduce variations in the Vt of the device, and in subthreshold operation Ion is exponentially dependent on Vt. Process variations in double gate fabrication can be induced by various factors, such as body thickness variation, Tox variation and gate length variation. Studies of the sensitivity of FinFETs to process variations have already been performed. We intend to use the knowledge from these studies to aid our analysis of how process variations affect subthreshold FinFET operation of logic circuits and SRAM circuits.

Since FinFETs have fully depleted bodies and use work function tuning to set Vt, they are inherently immune to Vt variations due to random dopant fluctuations.

In order to get swings close to the ideal 60 mV/dec, the silicon body has to be extremely thin, with Tsi < 0.5Leff - 6Tox. Variations in body thickness affect not only the Vt of the device but also leakage due to DIBL and the subthreshold slope of the device. Thinner bodies help improve S and increase Vt due to enhanced back gate control. It has been shown that 1 nm variations in body thickness can result in an order of magnitude difference in off currents and a 100 µA difference in on currents. In order to keep variation of electrical parameters to a minimum, the 3σ variation of body thickness needs to be less than 1 nm for a device with 20 nm body thickness. Thus body thickness variation could affect subthreshold circuit operation adversely by affecting S (and thus Ion/Ioff), leakage currents, and Vt (and thus subthreshold Ion).

The nature of the fin etch determines the roughness of the fin surfaces and thus the variations in Tox of a device. However, unlike planar MOS devices, which rely on thinner oxides to improve S, FinFETs use the back gate to control short channel effects.
Further, the drain current of FinFETs in subthreshold is independent of Tox, as seen from the equation below.

$$I_{DS} \propto \frac{W}{L} \left(\frac{kT}{q}\right)^{2} e^{\frac{V_{g} - V_{th}}{m\,kT/q}} \left(1 - e^{-\frac{V_{ds}}{kT/q}}\right)$$

Thus, to first order, Tox variations in FinFETs can be considered not to affect subthreshold circuit operation.

Kim et al. have shown that for reliable subthreshold operation, under the same Ioff condition, a longer channel length device should be used instead of the minimum gate length (Lg) device. To first order, delay can be modeled by CgV/Ion. The study shows that in subthreshold, Cg is almost independent of Lg, since the main components of Cg are the gate overlap capacitance and the fringing gate capacitance, neither of which depends on Lg (Fig. 3). As seen in Fig. 4, however, Ion depends on Lg if Ioff is fixed across all gate lengths: Ion then depends only on the S of the device, and since S is worse at shorter channel lengths, Ion decreases and delay increases. Thus, from Fig. 5, immunity to length variations in the subthreshold regime can be obtained by using longer channel length devices, for a small penalty in area and increased gate capacitance.

Thus variations due to dopant fluctuations, gate length and Tox can be dealt with in the subthreshold regime by using work function tuning and longer channel devices, with no significant variation in the electrical parameters of the FinFET. Body thickness variation is the only process-induced variation that is difficult to control and thus affects subthreshold operation of FinFETs.

IMPACT OF FINFET WIDTH QUANTIZATION ON SUBTHRESHOLD OPERATION

Apart from process-induced variations brought about by the inherently difficult process of fabricating a FinFET, another inherent drawback of FinFETs is width quantization. Unlike planar CMOS, FinFETs can only have widths that are integer multiples of the width of a single fin, as the required width is obtained by placing several fins in parallel.
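The effect of the integer-fin constraint on the achievable Wp/Wn can be illustrated with a short enumeration. This is a minimal sketch under stated assumptions: PMOS and NMOS fins are taken to have the same width per fin, and the target ratio of 1.4 and the limit of four fins per device are hypothetical values chosen for illustration; the function name is likewise not from the source.

```python
from fractions import Fraction


def nearest_fin_ratio(target, max_fins):
    """Return (p_fins, n_fins, ratio) with Wp/Wn closest to target.

    Ties are broken in favor of the smaller total fin count.
    """
    best_key = None
    best_choice = None
    for n_fins in range(1, max_fins + 1):
        for p_fins in range(1, max_fins + 1):
            ratio = Fraction(p_fins, n_fins)
            key = (abs(ratio - target), p_fins + n_fins)
            if best_key is None or key < best_key:
                best_key = key
                best_choice = (p_fins, n_fins, ratio)
    return best_choice


target = Fraction(7, 5)  # hypothetical optimum Wp/Wn of 1.4
p_fins, n_fins, ratio = nearest_fin_ratio(target, max_fins=4)
error = abs(ratio - target) / target
print(f"{p_fins} PMOS fins / {n_fins} NMOS fins, "
      f"ratio {float(ratio):.3f}, relative error {float(error):.1%}")
```

The residual mismatch between the target and the best achievable ratio is the quantity whose delay cost depends on how sharp the delay minimum is.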
Width quantization makes designing circuits difficult, as it restricts Wp/Wn to ratios of integer fin counts, which may exclude the value needed for proper circuit functioning. Soeleman et al. have shown that the delay minimum as a function of Wp/Wn for a planar CMOS inverter is much shallower in subthreshold operation, and hence a much wider range of Wp/Wn can be accommodated to achieve minimum delay in subthreshold operation. If the minimum were sharp, then for any Wp/Wn other than the optimum value the delay penalty would be large, and in the case of FinFETs, if the optimum Wp/Wn could not be realized with integer fin counts, the delay penalty would be inevitable. A shallow minimum, however, would ensure that no such penalty occurs. This, if true for FinFETs, could prove very useful for subthreshold FinFET operation.

SUBTHRESHOLD SCALING LIMITATIONS IMPOSED BY SRAM FUNCTIONALITY

Scaling VDD to voltages below the threshold voltage of a transistor presents many challenges, several of which have been studied in the recent past. It is clear from these studies that ultra low voltage SRAM operation is what ultimately limits VLSI system voltage scaling. Subthreshold logic gates function normally except for degraded speed. However, SRAM read stability, static noise margin and data retention voltage are all degraded as VDD is lowered. Further, in subthreshold operation Ion is set by Ioff and the swing S, so the degraded Ion/Ioff ratio becomes a problem, especially on memory bitlines, where the drains of several NMOS transistors are connected to the drain of a single PMOS transistor (for charging the bitline up to VDD). Thus the design of subthreshold memory circuits will require knowledge of the maximum fan-out that ensures proper circuit functioning, such as charging up of the bitline.

In many chip designs, SRAM arrays occupy a large fraction of the chip area. Scaling of memory density needs to track scaling of logic circuits; however, this implies dealing with increased transistor leakage and process-induced parameter variations. Scaling VDD is the easiest way to reduce leakage, but a low VDD coupled with process variations severely degrades the SNM of an SRAM cell. Guo et al. show the possible use of FinFETs in both 6T and 4T SRAM cells. FinFETs, with their inherently lower leakage and smaller footprint, aid memory scaling where bulk CMOS cannot. FinFETs also provide the possibility of using back gate biasing to dynamically adjust the threshold voltage to help tune performance (SNM). That study uses built-in feedback to improve cell read margin while simultaneously consuming low standby leakage power.

SRAM DATA RETENTION VOLTAGE (DRV)

Defined as the minimum supply voltage required to retain data, the DRV of an SRAM cell can be determined from the following relationship.

$$\left.\frac{\partial V_{1}}{\partial V_{2}}\right|_{\text{left inverter}} = \left.\frac{\partial V_{1}}{\partial V_{2}}\right|_{\text{right inverter}}, \quad \text{when } V_{DD} = DRV$$

Fig. 6. SRAM DRV vs. transistor width scaling.

Fig. 7. Deterioration of inverter VTC as function of VDD.

As VDD is scaled down to DRV, the VTCs of the inverters of the SRAM cell degrade until, at a supply voltage equal to DRV, the noise margin of the cell degrades to zero (Fig. 7). If VDD is reduced below this voltage, the inverters flip and lose the capability to preserve the stored data.

Qin et al. have shown that the DRV of an SRAM cell can be determined by solving the subthreshold VTC equations of the two inverters of the SRAM cell. That work also explores the effect of sizing the various transistors in the SRAM cell to tune DRV. It is shown that appropriate sizing of the PMOS devices can be used effectively to tune the DRV of the cell, as seen in Fig. 6.
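The DRV condition above can be made concrete for an idealized cell. The following is a minimal sketch, not the formulation of Qin et al.: it assumes two identical inverters whose NMOS and PMOS devices are perfectly matched and follow the simple exponential subthreshold current model, with no process variation and no access transistor leakage. For such a symmetric cell the two VTC slopes become equal when the slope at the midpoint VDD/2 reaches magnitude one, so the sketch searches for that supply voltage. The slope factor n = 1.5 and all function names are assumptions.

```python
import math

V_T = 0.02585   # thermal voltage kT/q near 300 K, volts
N_SLOPE = 1.5   # assumed subthreshold slope factor n


def inverter_vout(v_in, v_dd):
    """Output of an ideal matched subthreshold inverter, by bisection."""
    def current_mismatch(v_out):
        i_n = math.exp(v_in / (N_SLOPE * V_T)) * (1.0 - math.exp(-v_out / V_T))
        i_p = (math.exp((v_dd - v_in) / (N_SLOPE * V_T))
               * (1.0 - math.exp(-(v_dd - v_out) / V_T)))
        return i_n - i_p  # rises monotonically with v_out

    lo, hi = 0.0, v_dd
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if current_mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def midpoint_gain(v_dd, h=1e-5):
    """Magnitude of the VTC slope at v_in = v_dd / 2 (central difference)."""
    v_mid = 0.5 * v_dd
    rise = inverter_vout(v_mid + h, v_dd) - inverter_vout(v_mid - h, v_dd)
    return abs(rise) / (2.0 * h)


def data_retention_voltage(v_lo=0.01, v_hi=0.30):
    """Supply voltage at which the midpoint gain magnitude equals one."""
    for _ in range(60):
        mid = 0.5 * (v_lo + v_hi)
        if midpoint_gain(mid) < 1.0:
            v_lo = mid
        else:
            v_hi = mid
    return 0.5 * (v_lo + v_hi)


drv = data_retention_voltage()
print(f"ideal symmetric-cell DRV = {drv * 1e3:.1f} mV")
```

For this idealized model the unity-gain condition can also be solved by hand, giving 2·V_T·ln(1 + n), or about 47 mV for n = 1.5, which provides a check on the numerical search. A real cell would be expected to need a higher supply because of device mismatch and leakage through the access transistors.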
OUR PROPOSAL

In our study we will simulate the subthreshold FinFET inverter and compare its energy, delay and energy-delay product across a range of supply voltages against those of a 45 nm CMOS inverter to determine the advantages FinFETs may offer. From this analysis we will determine the supply voltage we intend to use for all our subthreshold simulations, with regard to the energy/performance tradeoff. We will then determine tp0 and gamma of a FinFET inverter and further determine the ideal fan-out for the obtained gamma.

Next, it will be interesting to see whether in subthreshold the delay as a function of Wp/Wn does indeed have a shallow minimum. This will be very important if and when work function tuning is not available to make the PMOS and NMOS drive strengths equal. If a range of Wp/Wn values provides near-minimum delay, then width quantization will not prove to be a problem in subthreshold operation.

Further, we will study the relationship between delay and gate length in subthreshold to see if a length tradeoff can be made to reduce delay. Transistors with longer gate length should exhibit increased Ion and thus reduced delay because of improved S. In addition, it will be interesting to compare the energy-delay of FinFET logic gates, such as two-input NAND and NOR gates, to that of regular bulk CMOS logic gates in the subthreshold regime.

For the bulk of our study we hope to understand the tradeoffs that need to be made to make 6T and 4T FinFET SRAMs functional and as high performance as possible in the subthreshold regime. We will study the design tradeoffs that need to be made to achieve high hold stability, high read stability and good write margin.

Fig. 8. Schematic of 6T SRAM Cell.

Fig. 9. Hold/Write Margin vs. VDD.

Hold Stability: This can be quantified by the cell static noise margin in standby mode. The PMOS load transistor needs to compensate for the leakage of all the NMOS transistors connected to the storage node Vl (Fig. 8).
This will be challenging in subthreshold due to the degraded Ion/Ioff ratio. Longer channel devices may be preferred in spite of the area tradeoff, as longer channel FinFETs have better S and thus better Ion for a fixed Ioff than short channel FinFETs. We intend to study whether upsizing the PMOS transistors can be used to improve hold stability while also decreasing DRV, as explained earlier in the paper.

Read Stability: This can be quantified by the cell SNM during a read access. During a read operation, the node Vr rises above 0 by an amount set by the resistive divider between the access transistor AXR and the pull-down transistor NR. To ensure read stability, the ratio of the W/L of these two transistors needs to be set carefully so that Vr does not flip the bit in the other inverter (formed by NL and PL) of the SRAM cell. In our study we intend to vary the widths of the transistors in each inverter of the SRAM cell to determine what area tradeoff can be made to achieve high read stability.

Write Margin: A successful write operation is performed if the voltage divider pulls Vl below the trip point of the inverter formed by NR and PR. Write margin can usually be improved by keeping the PMOS device at minimum size and upsizing the access transistor. We need to determine the PMOS device sizing that ensures that neither hold stability nor write margin is compromised.

Designing a cell with good read stability, write margin and hold stability through transistor sizing alone will prove challenging. The role of back gate biasing in the FinFET to dynamically control Vt will be studied to determine if the cell beta ratio can be adjusted dynamically. By connecting the back gate of the access transistor to the storage node, the strength of the access transistor can be selectively degraded. If the storage node is at 0, the access transistor FinFET has degraded drive strength, or in other words an increased beta ratio, during the read cycle.
The tradeoff made here, though, is reduced write margin because of the degraded drive current of the back-gated FinFET access transistor.

Since prior work (Fig. 9) indicates that lowering VDD increases write margin without a significant hit to hold margin, it will be interesting to determine whether subthreshold SRAM design inherently allows for a large write margin. A large write margin obtained without upsizing the access transistor would then imply no degradation of the cell read margin of the kind induced by a large access transistor.

Subthreshold FinFET operation is challenging, however, because not only do we need to work out the sizing of the various transistors to design a stable cell, but we also have to deal with inherent process variations and their impact on Vt and, in turn, on the electrical parameters of the transistors. We intend to study the effect of process variations as best we can through simulations. Although simulations of SRAM cells at various supply voltages can determine the minimum VDD for data retention, or DRV, ultimately the minimum DRV that can be used will also depend on process variations. Having determined from our study of past work that body thickness variation is the most significant effect of process variation, we intend to study how random body thickness variations of the transistors in an SRAM cell will affect the mean SNM and DRV. To study this we will use a random generator to generate random variations in the body thicknesses of the devices in the SRAM cell. These variations in body thickness will translate to random variations in Vt, and hence variations in on current in the subthreshold regime if all transistors have a fixed Ioff. We will then simulate these SRAM cells with these random variations and plot the SNM and DRV distribution functions as a function of body thickness to determine the mean DRV and SNM and also the σ or 3σ variation in DRV and SNM.

REFERENCES

A. Wang and A. Chandrakasan, "A 180-mV Subthreshold FFT Processor Using a Minimum Energy Design Methodology," IEEE J. Solid-State Circuits, vol. 40, no. 1, pp. 310-319, Jan. 2005.

J. Kim and K. Roy, "Double Gate-MOSFET Subthreshold Circuit for Ultralow Power Applications," IEEE Trans. Electron Devices, vol. 51, no. 9, pp. 1468-1474, 2004.

S. Xiong and J. Bokor, "Sensitivity of double-gate and Fin-FET devices to process variations," IEEE Trans. Electron Devices, vol. 50, no. 11, pp. 2255-2261, Nov. 2003.

Z. Guo, S. Balasubramanian, R. Zlatanovici, T.-J. King, and B. Nikolić, "FinFET-based SRAM design," Proceedings of the 2005 International Symposium on Low Power Electronics and Design, pp. 2-7, 2005.

H. Soeleman, K. Roy, and B. C. Paul, "Robust subthreshold logic for ultra-low power operation," IEEE Transactions on Very Large Scale Integration, vol. 9, no. 1, pp. 90-97, Feb. 2001.

J. Chen, L. T. Clark, and Y. Cao, "Ultra-Low Voltage Circuit Design in the Presence of Variations," IEEE Circuits and Devices Magazine, pp. 12-20, Nov/Dec 2005.

H. Qin, Y. Cao, D. Markovic, A. Vladimirescu, and J. Rabaey, "SRAM Leakage Suppression by Minimizing Standby Supply Voltage," IEEE Computer Society, June 2004.