Die-Hard SRAM Design Using Per-Column Timing Tracking
Shi-Yu Huang and Ya-Chun Lai
Feb. 10, 2007 @ Las Vegas (IC-DFN)
Design Technology Center (DTC), National Tsing-Hua University, HsinChu, Taiwan

Outline
• Introduction
• Timing Tracking Scheme
– Traditional Replica-Based Scheme
– Our Scheme
• Experimental Results
• Conclusion

Nanometer Effects on SRAMs
• Worse Device Mismatch
• Larger Leakage Current
• Wider Variations of R and C
• Lower VDD (smaller noise margins)
• Worse Supply & Coupling Noise
• Uncertain Delay
Could trigger a yield crisis!

SRAM Memory Architecture
[Figure: SRAM block diagram. Labels: CS, WE, OE; address bits A0 ... A9 at the Row Decoder; address bits A10 ... A19 at the Column Decoder; word line; bit line; Sense Amplifier / Drivers; Input-Output (M bits).]

Reading An SRAM Cell
[Figure: an SRAM cell holding Q' = 0 and Q = 1, read with a pulsed wordline. Labels: Wordline, cell current, BL, BL'. Lower panel: the bitlines' waveforms (BL, BL').]

Two Types of Sense Amplifiers
[Figure: schematics of a continuous-type and a latch-type sense amplifier. Signals shown: VDD, sa_in, sa_in', saout, se (Sense Enable).]

Three Major Problems for SRAM
• Mismatch in Bit Cells and Sense Amplifiers
– Vt mismatch shrinks the noise margin
• Bitline Leakage Current
– Could cause failure for READ operations
• Timing Tracking
– When to turn on sense amplifiers?
– When to turn off wordline? (pulsed wordline)

X-Calibration for Leakage Tolerance (Presented in Last IC-DFN)
Leakage is calibrated in two steps:
• Transform the effects of the bitline leakage to a Voffset between (BL, BL')
• Deduct Voffset from the input of the sense amplifier when performing sense amplification
[Figure: a bitline pair with a column of cells storing 1 1 1 1 1 1 1 0 on one side and 0 0 0 0 0 0 0 1 on the other; leakage current on the bitline; bitline voltages of 1.5V and 1.8V; X-calibration circuit; S.A.]
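The two calibration steps above can be illustrated with a toy numeric sketch. It assumes a simple linear model in which constant currents discharge a bitline capacitance during the sensing window; the function names, the capacitance, the currents and the sensing margin are all hypothetical and are not taken from the test chip.

```python
def bitline_differential(i_cell, i_leak, c_bl, t_sense):
    """Differential voltage (V) at the sense amplifier input.

    Toy assumption: the accessed cell discharges one bitline with i_cell,
    while leakage from unaccessed cells discharges the other with i_leak,
    so leakage eats into the useful signal.
    """
    return (i_cell - i_leak) * t_sense / c_bl


def leakage_offset(i_leak, c_bl, t_sense):
    """Step 1 (assumed model): leakage alone appears as an offset Voffset."""
    return i_leak * t_sense / c_bl


def read_ok(i_cell, i_leak, c_bl, t_sense, margin, calibrated):
    """Return True if the sensed differential meets the required margin."""
    dv = bitline_differential(i_cell, i_leak, c_bl, t_sense)
    if calibrated:
        # Step 2 (assumed model): deduct the leakage-induced offset
        # at the sense amplifier input before sensing.
        dv += leakage_offset(i_leak, c_bl, t_sense)
    return dv >= margin


# Hypothetical operating point: 50 uA cell current, 40 uA leakage,
# 200 fF bitline, 1 ns sensing window, 100 mV required margin.
params = dict(i_cell=50e-6, i_leak=40e-6, c_bl=200e-15,
              t_sense=1e-9, margin=0.100)
uncalibrated = read_ok(calibrated=False, **params)
calibrated = read_ok(calibrated=True, **params)
```

Under these assumed numbers the uncalibrated read sees only about 50mV and misses the margin, while the calibrated read recovers the full cell signal. A real circuit would measure the offset in analog form rather than compute it.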

Die Photo of Test Chip
SRAM Type              Conventional                 Our X-Calibration
Array Organization     1Kb cells (32 rows × 32 columns)
Technology             TSMC 0.18um CMOS 1P6M
Area                   486um × 265um (100%)         486um × 285um (107.6%)
Access Time (1.8V)     1.89 ns (100%)               1.93 ns (102%)
Supply Current (mA)    3.7 mA (100%)                4.15 mA (112%)
[Die photo: BIST, Conventional and X-Calibration blocks; annotated dimensions 1.373mm and 1.108mm.]

Shmoo Plots
Target speed: 150MHz @ 25°C
Measurement result: Leakage tolerance improved by 317%
[Figure: two Shmoo plots of Supply Voltage (V) versus Injected Leakage Current (uA), each with Pass and Fail regions. Conventional: Ileak = 76.6uA. Ours with X-Calibration: Ileak = 320uA.]

Outline
• Introduction
• Timing Tracking Scheme
– Traditional Replica-Based Scheme
– Per-Column Timing Tracking Scheme
• Experimental Results
• Conclusion

Traditional Scheme – Replica Bitline
Property: the replica bitline pair develops a logic signal (i.e., sense enable) when an accessed bitline pair builds up a 100mV signal
[Figure: block diagram with CLK, decoder, logic, active wordline, replica bitline pair, accessed bitline pair and sense amps.]
Ref: B. S. Amrutur et al., “A replica technique for wordline and sense control in low-power SRAMs,” IEEE Journal of Solid-State Circuits, Vol. 33, No. 8, pp. 1208-1219, Aug. 1998.

Problems of Replica Bitline Based Timing Control
• The factors on the speed of a bitline pair: leakage, RC, driving of cell
• Each column could have its own bitline development speed
• A single sense enable control is susceptible to sensing errors
[Figure: Voltage (V) waveforms of BL / BL' and SE over two read cycles.]

Adaptive Sensing Control
Each sense amp. adapts to the bitline pair it is currently driven by!
[Figure: Voltage (V) waveforms of BL / BL' and SE over two read cycles.]

Operating Flow
• Typical READ control steps
– Row address decoding
– Wordline activation
– Bitline discharging
– Sense enable generation
– Sense amplification
• Added timing tracking steps
– Timing tracker start-up
– Timing tracker monitoring
– ΔVBL > 100mV? (N: keep monitoring; Y: proceed to sense enable generation)
– S.E. active? (N: keep waiting; Y: proceed to timing tracker disabling)
– Timing tracker disabling

Overall Architecture
[Figure: block diagram. Row Decoder and WL Driver drive the wordlines (WL) of the Cell Array of memory cells (MC); each bitline pair (BL, BL') passes through a MUX2 to a Timing Tracker (controlled by det_en, producing se), followed by the SA and a Latch & Buffer; remaining blocks: Controller, Input Buffer, Address Buffer, I/O Circuitry.]

Transient Waveforms for Read
[Figure: read path (Row Decoder, WL Driver, MC, MUX2, Timing Tracker, SA, Latch & Buffer) with transient waveforms of CLK, WL, BL / BL', det_en and se.]
Desired property: SE goes high when the bitline pair has developed 100mV!

Outline
• Introduction
• Timing Tracking Scheme
– Traditional replica-based scheme
– Per-Column Timing Tracking
• Experimental Results
• Conclusion

Effect of Variation on Sense Amp. Vt
• As Vt mismatch in the sense amplifier becomes excessive, the probability of read failure increases.
[Plot: Pass Rate (axis 0.4 to 1.2) versus local standard deviation of Vt for transistors in SA (0 to 60 mV); curves: proposed, dummy bitline replica-based.]

Effect of Variation on Bitline Capacitance
• Our scheme is insensitive to bitline capacitance variation.
• On the contrary, the replica-based method is vulnerable.
[Plot: Pass Rate (axis 0.4 to 1.2) versus local standard deviation of Vt for transistors in SA (0 to 60 mV); curves: proposed, dummy bitline replica-based; bitline capacitances of 100fF, 300fF and 500fF.]

Layout of Test Chip
• Technology: TSMC 0.18um CMOS 1P6M
• Capacitor (Creating Nanometer Effects): We used different loadings on different bitlines so as to mimic the different operating speeds in deeper nanometer technologies
[Layout: Proposed and Compared SRAM macros; annotated dimensions 1.208mm and 1.108mm.]

Layout of Compared SRAM
• Row decoder
• Cell array
• IO circuitry
• Column decoder & Output buffer
• Control & Input buffer & Row address buffer
• Column address buffer

Layout of Proposed SRAM
• Row decoder
• Cell array
• IO circuitry & Timing tracker
• Column decoder & Output buffer
• Control & Input buffer & Row address buffer
• Column address buffer

Test Chip Characteristics
Technology                             TSMC 0.18um CMOS 1P6M
Package                                40-pin S/B
SRAM macro organization                32 rows x 64 columns
Test chip area                         1.108 mm x 1.208 mm
Power supply voltage                   1.8 V
Operating clock frequency              200 MHz
Power dissipation for compared SRAM    13.185 mW (100%)
Power dissipation for proposed SRAM    17.930 mW (136.8%)
Access time for compared SRAM          1.969 ns
Access time for proposed SRAM          2.301 ns (116.8%)

Conclusion
• Why Timing Control in an SRAM?
– (1) for latch-based sense amplifier enabling
– (2) for pulsed wordline control
– So as to achieve lower power dissipation
• Drawback of Existing Replica-Based Scheme
– Replica simply cannot track every bitline pair
• Proposed Per-Column Timing Tracking
– Adaptive on-the-fly
– More tolerant to process variation
– Suitable for deeper nanometer technologies
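
The contrast between a single replica-generated sense enable and per-column tracking can be sketched with a small Monte Carlo model. This is only an illustration under stated assumptions, not the simulation behind the plots: the bitline differential is modelled as ΔV = I·t/C, the replica is assumed to be matched to the lightest-loaded column (100fF), the sense amplifier offset is drawn from a zero-mean Gaussian, and all function names and parameter values are hypothetical.

```python
import random

V_TARGET = 0.100  # 100 mV bitline differential at which SE should fire


def bitline_dv(i_cell, c_bl, t):
    """Toy linear model of the bitline differential after time t."""
    return i_cell * t / c_bl


def replica_se_time(i_cell, c_replica):
    """One SE time for all columns, set by the replica bitline."""
    return V_TARGET * c_replica / i_cell


def per_column_se_time(i_cell, c_bl):
    """SE time chosen per column, when that column reaches V_TARGET."""
    return V_TARGET * c_bl / i_cell


def pass_rates(sigma_offset, caps, i_cell=50e-6, c_replica=100e-15,
               trials=2000, seed=0):
    """Return (replica_pass_rate, per_column_pass_rate).

    A read passes when the differential at SE time exceeds the
    magnitude of the sense amplifier's random input offset.
    """
    rng = random.Random(seed)
    t_replica = replica_se_time(i_cell, c_replica)
    replica_pass = tracked_pass = total = 0
    for _ in range(trials):
        for c_bl in caps:
            # Same offset sample is applied to both schemes.
            offset = abs(rng.gauss(0.0, sigma_offset))
            dv_replica = bitline_dv(i_cell, c_bl, t_replica)
            dv_tracked = bitline_dv(
                i_cell, c_bl, per_column_se_time(i_cell, c_bl))
            replica_pass += dv_replica > offset
            tracked_pass += dv_tracked > offset
            total += 1
    return replica_pass / total, tracked_pass / total


# Columns loaded with 100fF, 300fF and 500fF; 30 mV offset sigma (assumed).
replica_rate, tracked_rate = pass_rates(0.030, [100e-15, 300e-15, 500e-15])
```

In this model the tracked columns always sense at about 100mV, whereas the heavily loaded columns under the replica scheme sense at a fraction of that. The price is that heavily loaded columns fire their sense enable later, which is consistent in spirit with the longer access time reported for the proposed SRAM.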