Micro-Architecture Techniques for Sensor Network Processors
Amir Javidi
EECS 598
Feb 25, 2010

Motivation
• Low performance tasks
• Long duration
• Small energy supplies

Papers
• [1] L. Nazhandali, B. Zhai, J. Olson, A. Reeves, M. Minuth, R. Helfand, S. Pant, T. Austin, and D. Blaauw, "Energy optimization of subthreshold-voltage sensor network processors," in Proc. Int. Symp. Computer Architecture, 2005, pp. 197–207.
• [2] S. Hanson, M. Seok, Y.-S. Lin, Z. Foo, D. Kim, Y. Lee, N. Liu, D. Sylvester, and D. Blaauw, "A low-voltage processor for sensing applications with picowatt standby mode," IEEE Journal of Solid-State Circuits, pp. 1145–1155, April 2009.

Energy Budget
• 2 g vanadium oxide battery: 720 mAh
  • Powers an ARM 720T processor at 100 MHz for 45 hrs
• Thin-film zinc/silver oxide battery: 100 μAh/cm², 1.55 V
  • For an area of 1 mm², average current must be 114 pA (power consumption of 177 pW) for a 1-year lifetime

Performance Requirement
• Blood pressure monitoring (low rate):
  • Sensing @ 800 bps, 10,000 inst/sec
• EEG brain signal monitoring (high rate):
  • 3200 bps, 56,000 inst/sec for filtering, analysis, compression, and storage

Architecture/Circuit Techniques
• Sub-threshold implementation (Vdd < Vth)
• ISA optimization
• Voltage scaling
• Power gating
• Stack forcing
• Data/instruction compression

Subthreshold Design
• Why subthreshold?
  • Processors operating at even the lowest super-threshold voltages deliver too much performance
[Figure: Performance of sensor network processor applications on embedded targets. Number of times faster than real time the processor can handle the worst-case data stream rate [1].]

Subthreshold Circuit Design

Subthreshold Energy Optimization
$E_{inst} = E_{cycle} \times CPI$
$E_{cycle} = \frac{1}{2} C_s V_{dd}^2 + V_{dd} I_{leak} t_{clk}$
$t_{clk,sub} \propto e^{-k V_{dd}}$
$V_{min}$: energy-optimal supply voltage
[Figure: Energy as a function of voltage [1].]

Subthreshold ISA Optimization
• Why ISA optimization?
  • Memory dissipates static/dynamic energy
  • Memory leakage scales with memory size
  • Tradeoff between memory size and control logic size
[Figure: Logic vs. memory energy tradeoff [1].]

ISA Optimization
• Impact of ISA optimization on code size and control logic complexity

Micro-Architecture
• Sensor network processor micro-architecture

Performance vs. Energy

Results
• Sensor network processor
  • ROM/RAM memory
  • 8-bit data path
  • 235 mV supply
  • 182 kHz
  • 1.38 pJ/inst
  • 4.1x faster than necessary for mid-bandwidth
  • 25-year lifetime with 2 g vanadium oxide battery (720 mAh)

Phoenix Processor
• Focus on lowering standby power
• Older 0.18 μm technology
• Custom leakage-optimized instruction set
• Simple data memory compression
• Ultra-low-leakage memory cell
• Large tradeoff between standby power on one side and area and active energy on the other

Phoenix Processor

CMOS Technology
• Newer technology (65 nm)
  • High subthreshold leakage
  • Small capacitance
• Older technology (180 nm)
  • 7.7x larger
  • 647x less total energy consumption

Voltage Scaling
• Supply voltage of 0.5 V
  • Mix of subthreshold and near-subthreshold devices
  • Retentive gates: high-Vth ~ 0.7 V
  • Non-retentive gates: medium-Vth ~ 0.5 V
  • High-Vth consumes ~1000x less leakage power

Power Gating
• Medium-Vth power switch
  • Smaller switch, ~1000x
    • Less area overhead
    • Less charging/discharging power overhead

CPU Architecture
• 2-stage, 8-bit data width, 10-bit instruction width
• ALU (add, subtract, shift)
• No multiplier
• Simple decoder (minimum set of operations)

ISA Optimization
• Minimized instruction width (10 bit)
  • Reduces IMEM standby power dissipation
• Efficient operand encoding
  • Explicit operand: more flexibility, more frequently used
  • Implicit operand: less flexibility, less frequently used

Memory Design
• 64x10b SRAM (IMEM)
  • Application-specific instructions
  • No power gating
• 64x10b ROM (IROM)
  • Commonly used instructions
  • Power gated
• 52x40b SRAM (DMEM)
  • Data compression
  • Fine-grain power gating

Memory Design
• Leakage reduction
  • High-Vth bitcell transistors
  • Cross-coupled inverters:
    • Stacked transistors
    • Increased length (0.35 μm to 0.50 μm), ~2x leakage reduction
• Robustness
  • Full-swing read buffer
  • Power gated

Results
• Phoenix processor
  • 0.5 V power supply
  • 106 kHz
  • 2.8 pJ/cycle, 297 nW
  • 226 nW active mode
  • 35.4 pW standby mode
  • 915 x 915 μm²

Questions?
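
Backup: checking the numbers
The Energy Budget and Subthreshold Energy Optimization slides rest on a few lines of arithmetic, which the rough Python sketch below replays. The battery capacity, cell voltage, 2.8 pJ/cycle, and 106 kHz come from the slides. Everything else is an assumption made for illustration: the function names, the 8760-hour year, and in particular the values chosen for C_s, I_leak, t0, and k, which are hypothetical and not taken from [1] or [2]. The model also treats leakage current as independent of Vdd, which is a simplification.

```python
import math

HOURS_PER_YEAR = 365 * 24  # 8760 h; leap years ignored (assumption)


def average_current_budget(capacity_uah_per_cm2, area_mm2, lifetime_years):
    """Average current in amperes that the battery can sustain for the lifetime."""
    area_cm2 = area_mm2 / 100.0  # 1 cm^2 = 100 mm^2
    capacity_ah = capacity_uah_per_cm2 * 1e-6 * area_cm2
    return capacity_ah / (lifetime_years * HOURS_PER_YEAR)


def energy_per_cycle(vdd, c_s, i_leak, t0, k):
    """E_cycle = 1/2 * C_s * Vdd^2 + Vdd * I_leak * t_clk, with t_clk = t0 * exp(-k * Vdd)."""
    t_clk = t0 * math.exp(-k * vdd)
    return 0.5 * c_s * vdd ** 2 + vdd * i_leak * t_clk


def find_vmin(c_s, i_leak, t0, k, v_lo=0.10, v_hi=0.60, steps=501):
    """Grid search for the supply voltage that minimizes energy per cycle."""
    voltages = [v_lo + i * (v_hi - v_lo) / (steps - 1) for i in range(steps)]
    return min(voltages, key=lambda v: energy_per_cycle(v, c_s, i_leak, t0, k))


# Thin-film battery from the slides: 100 uAh/cm^2 at 1.55 V, 1 mm^2, 1 year.
i_avg = average_current_budget(100.0, 1.0, 1.0)
p_avg = i_avg * 1.55
print(f"average current: {i_avg * 1e12:.0f} pA")  # 114 pA
print(f"average power:   {p_avg * 1e12:.0f} pW")  # 177 pW

# Phoenix: energy per cycle times clock frequency gives power while running.
phoenix_active_w = 2.8e-12 * 106e3
print(f"2.8 pJ/cycle at 106 kHz: {phoenix_active_w * 1e9:.0f} nW")  # 297 nW

# Hypothetical device parameters, chosen only so the curve has an interior minimum.
PARAMS = dict(c_s=10e-12, i_leak=1.2e-7, t0=2e-3, k=25.0)
vmin = find_vmin(**PARAMS)
print(f"Vmin for the assumed parameters: {vmin:.3f} V")
```

The sweep shows the shape of the tradeoff rather than any measured value: below Vmin the clock period grows exponentially and leakage energy dominates, while above it the quadratic switching term takes over.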