Download DAC Presentation kit

Fine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization

De-Shiuan Chiou, Da-Cheng Juan, Yu-Ting Chen, Shih-Chieh Chang
Department of CS, National Tsing Hua University, Taiwan
NTHU-CS VLSI/CAD LAB

Slide 2: Outline
• Sleep Transistor Sizing Problem
• MIC Estimation Mechanism
• Partitioned Time-Frame for MIC Estimation
• Experimental Results and Conclusions

Slide 3: Power Gating
• Leakage increases exponentially
  - reaches 50% of total power in 65nm technology
• Power Gating
  - one of the most effective ways to reduce leakage
  - use a high Vth sleep transistor to reduce the leakage current
[Figure: low Vth logic device between VDD and VGND; a sleep transistor controlled by SL between VGND and GND.]

Slide 4: Implementation of Power Gating
• Distributed Sleep Transistor Network (DSTN)
[Figure: low Vth logic clusters C1, C2, C3 between VDD and VGND, with three sleep transistors controlled by SL.]

Slide 5: Leakage Saving
• In standby mode:
  - Leakage: proportional to the ST's size
  - Small ST to reduce leakage
[Figure: Ileakage through each of the three sleep transistors between VGND and GND.]

Slide 6: Voltage Drop across the ST
• In active mode:
  - Voltage drop across a ST degrades the speed
  - Voltage drop: inversely proportional to the ST's size
  - Large ST to bound the voltage drop
[Figure: voltage drop VST across each of the three sleep transistors.]

Slide 7: Sleep Transistor (ST) Sizing
• Dilemma scenario:
  - Large ST to bound the voltage drop (active mode)
  - Small ST to reduce leakage (standby mode)
=> Objective: minimize ST size (leakage) under a specified voltage drop constraint, VST*
[Figure: each sleep transistor is bounded by the same constraint VST*.]

Slide 8: Estimate Voltage Drop by MIC
• Maximum Instantaneous Current (MIC) through the ST
  - determines the worst case voltage drop
• Estimating the upper bound of MIC(ST)
  - for sizing the ST appropriately to meet the voltage drop constraint
[Figure: clusters C1, C2, C3 under VDD. MIC(ST): MIC across a ST.]
[Figure labels: MIC(ST1), MIC(ST2), MIC(ST3), VGND.]

Slide 9: Estimate Voltage Drop by MIC
• MIC(C) (MIC of a cluster) is easy to measure
• Due to the current balancing effect
  - MIC(ST) (MIC through the ST) is hard to predict
[Figure: finding the MIC of a cluster, MIC(C1), is fast; finding the MIC across a ST, MIC(ST1) to MIC(ST3), is time-consuming.]

Slide 10: Temporal Perspective of Clusters' MIC
• Traditional ways
  - use the entire clock period's MIC to determine the ST size
[Figure: MIC(Ci) waveform over one clock cycle (current vs. time unit) for Cluster 1 and Cluster 2. MIC(C1) occurs at T6; MIC(C2) occurs at T9.]

Slide 11: Temporal Perspective of Clusters' MIC
• Smaller time frames lead to:
  - a more accurate MIC estimation
  - high computation complexity
[Figure: MIC(Ci) waveform over one clock cycle (current in mA vs. time unit) for Cluster 1 and Cluster 2.]

Slide 12: Difficulties
• Current balancing effect complicates the sizing problem
• Time-frame partitioning leads to high computation complexity
[Figure: MIC values within one clock cycle.]

Slide 13: Contributions
• A more accurate MIC prediction in a temporal perspective
• A variable-length partitioning to reduce computation complexity
• Heuristics to minimize the size of sleep transistors
• Achieving 21% reduction in sleep transistor area

Slide 14: Outline
• Sleep Transistor Sizing Problem
• MIC Estimation Mechanism
• Partitioned Time-Frame for MIC Estimation
• Experimental Results and Conclusions

Slide 15: Resistance Network
[Figure: clusters C1, C2, C3 inject currents I(C1), I(C2), I(C3) into the virtual ground; adjacent virtual-ground nodes are connected by RV; currents I(ST1), I(ST2), I(ST3) flow to ground through the sleep transistor resistances R(STi).]

Slide 16: Discharging Ratio
[Figure (example values): virtual-ground resistances 2 and 2; sleep transistor resistances 9, 8 and 10; I(C1) discharges as 0.43 I(C1), 0.34 I(C1) and 0.23 I(C1) through ST1, ST2 and ST3.]
The discharging ratio can be calculated by
  - Kirchhoff's Current Law
  - Ohm's Law

Slide 17: Discharging Matrix Ψ
\[
\begin{bmatrix} I(ST_1) \\ I(ST_2) \\ I(ST_3) \end{bmatrix}
= \Psi
\begin{bmatrix} I(C_1) \\ I(C_2) \\ I(C_3) \end{bmatrix},
\quad \text{where} \quad
\Psi =
\begin{bmatrix}
\psi_{11} & \psi_{12} & \psi_{13} \\
\psi_{21} & \psi_{22} & \psi_{23} \\
\psi_{31} & \psi_{32} & \psi_{33}
\end{bmatrix}
\]

Slide 18: MIC(ST) Estimation Mechanism
\[
\begin{bmatrix} \mathrm{MIC}(ST_1) \\ \mathrm{MIC}(ST_2) \\ \mathrm{MIC}(ST_3) \end{bmatrix}
= \Psi
\begin{bmatrix} \mathrm{MIC}(C_1) \\ \mathrm{MIC}(C_2) \\ \mathrm{MIC}(C_3) \end{bmatrix},
\quad \text{where} \quad
\Psi =
\begin{bmatrix}
\psi_{11} & \psi_{12} & \psi_{13} \\
\psi_{21} & \psi_{22} & \psi_{23} \\
\psi_{31} & \psi_{32} & \psi_{33}
\end{bmatrix}
\]

Slide 19: Outline
• Sleep Transistor Sizing Problem
• MIC Estimation Mechanism
• Partitioned Time-Frame for MIC Estimation
• Experimental Results and Conclusions

Slide 20: Temporal Perspective of Clusters' MIC
• Different MIC(Ci) occur at different time points
[Figure: MIC(Ci) waveform over one clock cycle (current vs. time unit) for Cluster 1 and Cluster 2. MIC(C1) occurs at T6; MIC(C2) occurs at T9.]

Slide 21: Temporal Perspective of Clusters' MIC
• Different MIC(Ci) occur at different time points within a clock period
• The traditional way to estimate MIC(STi) is overly pessimistic
\[
\begin{bmatrix} \mathrm{MIC}(ST_1) \\ \mathrm{MIC}(ST_2) \\ \mathrm{MIC}(ST_3) \end{bmatrix}
= \Psi
\begin{bmatrix} \mathrm{MIC}(C_1) \\ \mathrm{MIC}(C_2) \\ \mathrm{MIC}(C_3) \end{bmatrix}
\]

Slide 22: Time-Frame Partitioning for MIC(ST) Estimation
• Expand MIC(Ci) into MIC(Ci,Tj)
[Figure: MIC(Ci,Tj) waveform over one clock cycle (current vs. time frame) for Cluster 1 and Cluster 2, with MIC(C1,T1), MIC(C1,T3), MIC(C1,T6), MIC(C2,T1), MIC(C2,T3) and MIC(C2,T6) marked.]

Slide 23: Time-Frame Partitioning for MIC(ST) Estimation
• For each time frame Tj, use MIC(Ci,Tj) to obtain MIC(STi,Tj)
\[
\begin{bmatrix} \mathrm{MIC}(ST_1, T_1) \\ \mathrm{MIC}(ST_2, T_1) \\ \mathrm{MIC}(ST_3, T_1) \end{bmatrix}
= \Psi
\begin{bmatrix} \mathrm{MIC}(C_1, T_1) \\ \mathrm{MIC}(C_2, T_1) \\ \mathrm{MIC}(C_3, T_1) \end{bmatrix}
\]

Slide 24: Time-Frame Partitioning for MIC(ST) Estimation
• For ST1, the maximum MIC(ST1,Tj) among all Tj is the upper bound of MIC(ST1) after partitioning
[Figure: MIC(STi,Tj) waveform over one clock cycle (current vs. time frame) for ST 1 (Cluster 1) and ST 2 (Cluster 2), with MIC(ST1) and MIC(ST2) marked.]

Slide 25: Time-Frame Partitioning for MIC(ST) Estimation
• Time-frame partitioning leads to a better MIC(ST) estimation!
[Figure: MIC(STi,Tj) waveform over one clock cycle; ORIGINAL_MIC(ST1) is marked 37% larger and ORIGINAL_MIC(ST2) is marked 27% larger.]
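The discharging matrix and the two MIC(ST) estimates described on these slides can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tool: it assumes the virtual ground is a simple chain of nodes (one per cluster) joined by RV resistors, reads the numbers on the Discharging Ratio slide as resistances (RV = 2, R(ST) = 9, 8, 10), and uses made-up MIC(Ci,Tj) values. The function names `discharging_matrix`, `mic_st_whole_cycle` and `mic_st_partitioned` are hypothetical.

```python
def solve(a, b):
    """Solve a*x = b with Gauss-Jordan elimination and partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def discharging_matrix(r_st, r_v):
    """Return psi with psi[i][j] = fraction of I(Cj) that flows through STi.

    Assumed topology: node i connects to ground through r_st[i] and to
    node i+1 through r_v[i]. Built from Kirchhoff's Current Law (nodal
    conductance matrix) and Ohm's Law (I = V / R).
    """
    n = len(r_st)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        g[i][i] += 1.0 / r_st[i]
    for i, r in enumerate(r_v):
        c = 1.0 / r
        g[i][i] += c
        g[i + 1][i + 1] += c
        g[i][i + 1] -= c
        g[i + 1][i] -= c
    psi = [[0.0] * n for _ in range(n)]
    for j in range(n):
        unit = [1.0 if k == j else 0.0 for k in range(n)]
        v = solve(g, unit)  # node voltages for a unit current from cluster j
        for i in range(n):
            psi[i][j] = v[i] / r_st[i]
    return psi


def mat_vec(psi, vec):
    return [sum(p * x for p, x in zip(row, vec)) for row in psi]


def mic_st_whole_cycle(psi, frames):
    """Traditional estimate: use each cluster's peak over the whole cycle."""
    peaks = [max(f[i] for f in frames) for i in range(len(psi))]
    return mat_vec(psi, peaks)


def mic_st_partitioned(psi, frames):
    """Per-frame estimate, then the maximum over all frames for each ST."""
    per_frame = [mat_vec(psi, f) for f in frames]
    return [max(v[i] for v in per_frame) for i in range(len(psi))]


psi = discharging_matrix(r_st=[9.0, 8.0, 10.0], r_v=[2.0, 2.0])
print([round(row[0], 2) for row in psi])  # split of I(C1): [0.43, 0.34, 0.23]

# frames[j][i] = MIC(Ci, Tj); illustrative values only.
frames = [[4.0, 1.0, 1.0], [1.0, 5.0, 1.0], [1.0, 1.0, 3.0]]
print(mic_st_whole_cycle(psi, frames))
print(mic_st_partitioned(psi, frames))
```

Because every entry of Ψ is non-negative, the partitioned estimate can never exceed the whole-cycle one, which is why partitioning tightens the bound when the clusters peak in different time frames.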
Slide 26: Reduce the Computation Complexity
• Increasing the number of time frames leads to
  - more accurate voltage drop estimation
  - high computation complexity
• Reduce the computation complexity:
  - dominated time-frame removal
  - variable length time-frame partitioning

Slide 27: Dominated Time-Frame Removal
• T3 is dominated by T6
  - MIC(C1,T6) > MIC(C1,T3)
  - MIC(C2,T6) > MIC(C2,T3)
• Neglect T3 and all dominated time frames
[Figure: MIC(C1,T6), MIC(C2,T6), MIC(C1,T3) and MIC(C2,T3) marked on the waveforms of Cluster 1 and Cluster 2.]

Slide 28: Variable Length Time-Frame Partitioning
(1) uniform two-way partition
(2) variable length two-way partition
(Tb dominates Tc) and (Tb dominates Td) => the estimated upper bound will be smaller
• If all the MIC(Ci) are separated, the MIC(STi) can be better estimated!
[Figure: time frames Ta, Tb, Tc, Td with MIC(C1,Tb), MIC(C1,Tc), MIC(C1,Td), MIC(C2,Tb), MIC(C2,Tc) and MIC(C2,Td) marked.]

Slide 29: Problem Formulation of ST Sizing
• Inputs:
  1. Voltage-drop constraint
  2. MIC(Ci,Tj): clusters' MIC information
• Objective: minimize the total ST width
• Voltage drops must meet the constraint

Slide 30: ST Sizing Algorithm
1. Initialize ST size with a large value.
2. Update the discharging matrix Ψ.
3. Update MIC(STi,Tj) and voltage drops.
\[
\mathrm{MIC}(ST_i, T_j) = \Psi \cdot \mathrm{MIC}(C_i, T_j), \qquad
V(ST_i, T_j) = \mathrm{MIC}(ST_i, T_j) \cdot R(ST_i)
\]
4. Resize the ST with the worst drop.
\[
W_{ST}^{*} \approx \left( \frac{\mathrm{MIC}(ST_i, T_j)}{V_{ST}^{*}} \right) k
\]
[Figure (example values): initial ST sizes 99, 99, 99, 99 and
Ψ =
  0.38 0.30 0.21 0.18
  0.27 0.30 0.21 0.18
  0.21 0.24 0.35 0.28
  0.14 0.16 0.23 0.36 ]
Voltage drops ok? If no, repeat the update and resize steps.
If yes, return the ST size. [Figure (example values): ST sizes 99, 99, 73, 99.]

Slide 31: Outline
• Sleep Transistor Sizing Problem
• MIC Estimation Mechanism
• Partitioned Time-Frame for MIC Estimation
• Experimental Results and Conclusions

Slide 32: Environment Setup
• TSMC 130nm CMOS technology
• VDD = 1.3 V
• Specified tolerable IR drop: 5% of the ideal supply voltage
• MIC(Ci,Tj) is obtained via PrimePower simulations with 10,000 random patterns

Slide 33: Implementation Flow
[Figure: flow chart.
 Stages: Synthesis, Placement, Simulation, Gate Positioning, VCD Partitioning, MIC Estimation, V-length Partitioning (optional), ST Sizing.
 Files passed between stages: RTL netlist, gate-level netlist, SDF file, DEF file, VCD file, gate location, partitioned VCD file, ST size.
 Legend: commercial tools; our tools.]

Slide 34: Experimental Results

          Total Area (Width in μm)                 Runtime (Sec.)
Circuit   [8]      [2]      TP       V-TP          TP       V-TP
C432      12817    8491     6775     7086          4262     495
C499      10741    8347     6684     7229          3644     568
C880      15050    11296    9233     9676          2561     345
C1355     19352    13056    10591    11496         2514     422
C3540     29808    23020    18650    20282         16856    942
C5315     29794    23773    18785    19534         13830    2190
C7552     41016    29500    24269    25621         17216    2896
dalu      3468     2904     2110     2283          3816     483
frg2      3632     2835     2232     2255          701      136
i8        13247    9931     7836     8141          7720     1080
t481      9405     7389     5024     5402          16289    1514
des       11804    9766     7850     8145          8321     1180
AES       44378    33965    27229    28137         28379    3524
Avg.      1.70     1.26     1        1.06          8.09     1

Previous works: [2] Chiou et al., DAC'06; [8] Long et al., DAC'03

Slide 35: Conclusions
• Propose an efficient sleep transistor sizing method for DSTN power gating designs
• Present theorems based on a temporal perspective for estimating a tight upper bound of the voltage drop
• Achieving 21% size (leakage) reduction

Slide 36: Thank You!

Slide 37: Sleep Transistor (ST) Sizing
• Relations between WST and VST
  - I(ST): the current through the sleep transistor
  - VST: the voltage drop across the sleep transistor
• Sleep transistors operate in the linear region in active mode.
\[
W_{ST} \approx \left( \frac{I(ST)}{V_{ST}} \right) k
\]
[Figure: sleep transistor between VGND and GND below VDD, with I(ST) and VST marked.]

Slide 38: Sleep Transistor (ST) Sizing
• Determine the minimum required size (WST*) based on:
  1. MIC(ST): Maximum Instantaneous Current (MIC) through the ST
  2. VST*: IR-drop constraint
\[
W_{ST} \approx \left( \frac{I(ST)}{V_{ST}} \right) k
\quad \Rightarrow \quad
W_{ST}^{*} \approx \left( \frac{\mathrm{MIC}(ST)}{V_{ST}^{*}} \right) k
\]
• Smaller MIC(ST) leads to a better ST size!
[Figure: sleep transistor between VGND and GND below VDD, with MIC(ST) marked.]
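Two of the rules in this deck are small enough to sketch directly: the dominated time-frame removal from the earlier slide and the minimum-width relation WST* ≈ (MIC(ST) / VST*) k shown above. The Python below is only an illustration under stated assumptions: the function names `remove_dominated` and `required_width` are hypothetical, the MIC values are made up, and k is set to 1.0 because the slides do not give its value. The voltage limit uses the setup from the Environment Setup slide (5% of 1.3 V).

```python
def remove_dominated(frames):
    """Drop every time frame that another frame dominates.

    frames[j][i] = MIC(Ci, Tj). Frame b dominates frame a when
    MIC(Ci, Tb) > MIC(Ci, Ta) for every cluster i (strict inequality,
    as written on the slide), so identical frames never remove each other.
    """
    def dominates(b, a):
        return all(x > y for x, y in zip(b, a))

    return [a for a in frames if not any(dominates(b, a) for b in frames)]


def required_width(mic_st, v_limit, k):
    """Minimum ST width from W_ST* ~ (MIC(ST) / V_ST*) * k."""
    return mic_st / v_limit * k


# Illustrative MIC(Ci, Tj) values for two clusters and three time frames.
frames = [
    [1.0, 1.0],  # dominated by both frames below
    [3.0, 2.0],
    [2.0, 4.0],
]
print(remove_dominated(frames))  # the first frame is dropped

# IR-drop constraint from the setup slide: 5% of VDD = 1.3 V.
v_limit = 0.05 * 1.3
k = 1.0  # assumed constant; not given in the slides

# A smaller MIC(ST) estimate gives a smaller required width.
for mic_st in (0.26, 0.13):  # made-up estimates, arbitrary current units
    print(mic_st, required_width(mic_st, v_limit, k))
```

Removing dominated frames does not change the final bound, since a dominated frame can never produce the maximum MIC(STi,Tj) when all entries of Ψ are non-negative; it only reduces the number of frames the sizing loop has to evaluate.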
