A Variation-tolerant Subthreshold Design Approach
Nikhil Jayakumar, Sunil P. Khatri
Texas A&M University, College Station, TX

Motivation
- In recent times, chip power consumption has become a significant hurdle
- Higher power consumption leads to
  - Shortened battery life
  - Higher on-chip temperatures, which reduce the operating life of the chip
- There is a large and growing class of applications where power reduction, not speed, is paramount
  - Such applications are ideal candidates for subthreshold logic

Sub-threshold Leakage
$$I_{ds} = \frac{W}{L}\, I_{D0}\, e^{\frac{V_{gs} - V_T - V_{off}}{n v_t}} \left(1 - e^{-\frac{V_{ds}}{v_t}}\right)$$
- As supply voltage scales down, the VT of the devices is scaled down as well
- A lower VT results in exponentially higher leakage
  - Leakage power is becoming comparable with dynamic power
- A larger VT would reduce leakage but increase delay
- We can turn this dilemma into an opportunity!

The Opportunity
| Process | Traditional Ckt: Delay (ps) | Traditional Ckt: Power (W) | Traditional Ckt: P-D-P (J) | Sub-threshold Ckt (Vb = 0V): Delay | Sub-threshold Ckt (Vb = 0V): Power | Sub-threshold Ckt (Vb = 0V): P-D-P | Sub-threshold Ckt (Vb = VDD): Delay | Sub-threshold Ckt (Vb = VDD): Power | Sub-threshold Ckt (Vb = VDD): P-D-P |
|---|---|---|---|---|---|---|---|---|---|
| bsim70 | 14.157 | 4.08E-05 | 5.82E-07 | 17.01X | 308.82X | 18.50X | 9.93X | 141.10X | 14.43X |
| bsim100 | 17.118 | 6.39E-05 | 1.08E-06 | 24.60X | 497.54X | 20.08X | 12.00X | 100.96X | 8.20X |
- Performed simulations for 2 different processes on a 21-stage ring oscillator
- Impressive power reduction (100X to 500X)
- PDP improves by as much as 20X
- Delay penalty can be reduced by several means
  - Applying forward body bias
  - Decreasing VT
  - Circuit approaches

The Opportunity
bsim70:
| VT | Delay | Power | P-D-P |
|---|---|---|---|
| 0.18 | 16.15X | 167.52X | 10.41X |
| 0.17 | 14.88X | 151.99X | 10.09X |
| 0.16 | 13.78X | 137.73X | 9.95X |
| 0.15 | 13.15X | 124.59X | 8.86X |
| 0.14 | 12.43X | 112.73X | 9.40X |
| 0.13 | 12.32X | 101.85X | 8.02X |
bsim100:
| VT | Delay | Power | P-D-P |
|---|---|---|---|
| 0.27 | 23.32X | 479.85X | 20.60X |
| 0.25 | 22.43X | 464.33X | 20.16X |
| 0.23 | 21.02X | 444.23X | 20.05X |
| 0.21 | 18.69X | 400.89X | 20.27X |
| 0.19 | 18.42X | 366.28X | 18.98X |
| 0.17 | 17.51X | 323.26X | 17.98X |
- We also performed experiments with lower VT values
- Delays improved with decreasing VT values, as expected
- PDP remained high
- Power gains decreased with decreasing VT values
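The exponential VT dependence stated on the Sub-threshold Leakage slide can be explored numerically. The sketch below evaluates that leakage expression for the bsim70 VT sweep listed above. The values of `i_d0`, `n`, `v_off`, `w_over_l`, the 27 °C temperature, and the 0.2 V drain bias are illustrative assumptions, not parameters taken from the slides, and the function names are hypothetical. The temperature dependence of I_D0 and VT themselves is ignored.

```python
import math

K_OVER_Q = 8.617333262e-5  # Boltzmann constant over electron charge, in V/K


def thermal_voltage(temp_c):
    """Thermal voltage v_t = kT/q in volts for a temperature in Celsius."""
    return K_OVER_Q * (temp_c + 273.15)


def subthreshold_ids(vgs, vds, vth, temp_c=27.0, w_over_l=1.0,
                     i_d0=1e-7, n=1.5, v_off=0.0):
    """Sub-threshold drain current from the slide's expression.

    i_d0, n, v_off and w_over_l are placeholder values (assumptions),
    chosen only to make the trend visible.
    """
    v_t = thermal_voltage(temp_c)
    gate_term = math.exp((vgs - vth - v_off) / (n * v_t))
    drain_term = 1.0 - math.exp(-vds / v_t)
    return w_over_l * i_d0 * gate_term * drain_term


# Off-state leakage (Vgs = 0) across the bsim70 VT sweep from the slides.
for vth in (0.18, 0.17, 0.16, 0.15, 0.14, 0.13):
    leak = subthreshold_ids(vgs=0.0, vds=0.2, vth=vth)
    print(f"VT={vth:.2f} V  Ids_off={leak:.3e} A")
```

With these assumed parameters, each reduction of VT multiplies the off-state current by a fixed factor, `exp(dVT / (n * v_t))`, which is the exponential leakage growth the slide refers to.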
Sub-threshold Logic
Advantages
- Circuits get faster at higher temperature, hence a reduced need for expensive cooling techniques
- Device transconductance is an exponential function of Vgs, which results in a high ratio of on to off current, hence near-ideal noise margins
Disadvantages
- Ids exhibits an exponential dependence on temperature
- Ids also has a strong dependency on process variations (such as VT variations)
- Ids is small

Previous Approaches
- Paul et al. (2001) reported a sub-threshold multiplier
  - Compensation of Ids over P/T variations
- Tschanz et al. (2002) discuss a dynamic body bias technique to make a design tolerant to process variation
  - Applied in the context of regular CMOS technologies
  - Circuit delay matched to critical delay (hard to determine)
  - Matching is performed for the entire design monolithically
- In contrast to these, we:
  - Compensate sub-threshold delay over P/V/T variations
  - Apply our compensation to a network-of-PLA design
    - Critical delay is trivially determined
  - Perform compensation separately for clusters of spatially nearby PLAs

Our Solution
- We propose a technique that uses self-adjusting body bias to phase-lock the circuit delay to a beat clock
- Use a network of dynamic NOR-NOR PLAs to implement circuits
  - Regular, area- and delay-efficient approach
- PLAs partitioned into clusters of 1000 PLAs each
  - All PLAs in a cluster share a bulkn node
- A representative PLA in the cluster is chosen to phase-lock the delay of the cluster to the beat clock
  - Beat clock period determines circuit speed
  - bulkn voltage is modulated via a charge pump
  - If the delay is too high, a forward body bias is applied to speed up the PLA, and vice versa

Dynamic NOR-NOR PLA
[Figure: dynamic NOR-NOR PLA; labels: Inputs, Outputs, clk, completion]
- We use precharged NOR-NOR PLAs as the structure of choice
  - Wordlines run horizontally
  - Inputs / their complements and outputs run vertically
- Several PLAs in a cluster share a common bulkn node
- Each PLA has a “completion” signal that switches low after all the outputs switch

The Charge Pump
[Figure: charge pump; labels: pullup, pulldown]
- PLA “completion” signal lags beat clock: bulkn node gets forward biased
- PLA “completion” signal leads beat clock: bulkn goes back to zero bias

Effectiveness of the Approach
- We simulated a single PLA over 0 to 100 °C. We also applied VT variations (10%) and VDD variations (10%)
- The light region shows the variations in delay over all the corners
- The red region shows the delays with the self-adjusting body-bias circuit

Example Showing Phase Locking
[Figure annotations: VDD changed from 0.2 V to 0.22 V; VDD changed from 0.22 V to 0.18 V]
- This figure shows how the body bias (and hence the delay of the PLA) changes with changes in VDD
- Note that the PLA delay remains relatively constant
- The adjustment is very quick (within a clock cycle)

Summary
- Sub-threshold circuit design is promising due to extremely low power consumption
  - 100X to 500X power reduction, 10X to 25X speed penalty
  - Appealing for a widening class of applications
- However, it is inherently not tolerant to PVT variations
- Our approach dynamically compensates for PVT variations
  - Lock the delay of a representative PLA in a cluster to a beat clock
  - Use a charge pump that modulates the bulkn bias voltage
  - Dramatic reduction in sensitivity to PVT variations
  - This can help achieve a significant yield improvement

Thank you!!
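
The self-adjusting body-bias loop summarized above (forward-bias bulkn when the “completion” signal lags the beat clock, relax toward zero bias when it leads) can be illustrated with a small behavioral model. This is only a toy sketch, not the authors' circuit: the exponential delay model, the constants `N_VT`, `GAMMA`, `BASE_DELAY`, the charge-pump step size, and both function names are assumptions made for illustration. Only the VDD values (0.2 V, 0.22 V, 0.18 V) come from the phase-locking example in the slides.

```python
import math

N_VT = 0.04        # assumed n * v_t slope factor, in volts
GAMMA = 0.3        # assumed fraction of forward body bias that lowers VT
VDD_NOM = 0.2      # nominal supply used in the slides' example, in volts
BASE_DELAY = 2.0   # assumed zero-bias delay at VDD_NOM, in beat-clock periods


def pla_delay(vdd, vbulk):
    """Toy model: delay falls exponentially with VDD and forward body bias."""
    return BASE_DELAY * math.exp(-((vdd - VDD_NOM) + GAMMA * vbulk) / N_VT)


def lock_to_beat_clock(vdd, vbulk=0.0, beat_period=1.0, step=0.004,
                       vbulk_max=0.2, cycles=200):
    """Bang-bang loop: one charge-pump nudge of bulkn per beat-clock cycle."""
    for _ in range(cycles):
        if pla_delay(vdd, vbulk) > beat_period:
            # "completion" lags the beat clock: pull bulkn up (forward bias).
            vbulk = min(vbulk + step, vbulk_max)
        else:
            # "completion" leads the beat clock: relax bulkn toward zero bias.
            vbulk = max(vbulk - step, 0.0)
    return vbulk


# Replay the VDD changes from the phase-locking example.
vbulk = 0.0
for vdd in (0.20, 0.22, 0.18):
    vbulk = lock_to_beat_clock(vdd, vbulk)
    print(f"VDD={vdd:.2f} V  bulkn={vbulk:.3f} V  "
          f"delay={pla_delay(vdd, vbulk):.3f} beat periods")
```

In this model the loop settles to within one charge-pump step of the bias at which the delay equals the beat period, so a lower VDD ends up with a larger forward bias and a nearly unchanged delay. The lock time in the sketch depends on the assumed step size and says nothing about the within-one-cycle adjustment reported in the slides.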