Power Reduction Techniques for Microprocessor Systems
by Timothy Goldberg
Paper by: Vasanth Venkatachalam and Michael Franz
Published 2005

Power Consumption and its Importance
• Saving Power – save money, save electricity, save the planet
• Heat Dissipation – heat density and cooling
• Battery Life – use less energy, extend battery running time

Outline
• Definition of Power and Energy
• Power Reduction Techniques
  – From the circuit level through hardware to compiler and application level techniques
• Commercial Systems
• Emerging Technologies

Power and Energy
• Need to reduce both
  – Power = Work / Time – affects heat
  – Energy = Power * Time – affects battery
• Dynamic Power Consumption: circuit activity
  – Switched capacitance (depends on V, f, C, a)
    • Clock gating
  – Short-circuit current: transistors of opposite polarity briefly conducting together (10-15% of total power)

Power and Energy
• Leakage Power Consumption: static/idle power
  – Depends on voltage and leakage current
  – Sub-threshold leakage: supply voltage, threshold voltage, temperature
    • Reduce voltage, use fewer transistors, increase threshold voltage

Power Reduction
• From low-level circuit changes
• Low-Power Interconnect
• Memories and Memory Hierarchies
• Hardware/Architecture
• Dynamic Voltage Scaling
• Resource Hibernation
• Compiler
• Application
• Cross-layer

Circuit and Logic Level Techniques
• Transistor Sizing: reduce width
  – Less dynamic power consumption, but increases delay
• Transistor Reordering: minimize switching activity
  – Place frequently switching transistors closer to the circuit's outputs
• Logic Gate Restructuring: reduce switching
  – Gates must receive inputs at the same time

Circuit and Logic Level Techniques
• Technology Mapping: software tools
  – Find best configuration, based on constraints
  – Design circuit out of logic gates to minimize total power consumption
  – NP-hard DAG problem
• Low Power Flip-Flops:
  – Self-gating flip-flop: reduce switching activity
  – Dual-edge triggered: reduce power dissipated by clock signal

Circuit and Logic Level Techniques
• Low Power Control: processor as an FSM
  – Activate only the circuitry needed for the currently executing sub-FSM
• Delay-Based Dynamic Supply Voltage
  – Look-up table of voltages and clock speeds assumes the worst case
  – Adjust voltage based on the delay and monitor errors
    • Requires more hardware (shadow latches)

Low-Power Interconnect
• Bus Encoding: inversion to reduce switching
• Crosstalk: activity in neighbor wires (shield wire)
• Low Swing Buses: +300mV and -300mV instead of +5V and -5V
  – Immune to crosstalk, but increased hardware at encoder and decoder
• Bus Segmentation: allows most of bus to remain powered down when not communicating

Low-Power Interconnect
• Adiabatic Buses: reuse existing charge
  – Reduce total capacitance
  – Delay in transferring charge
• Network-On-Chip:
  – Functional units sharing buses: lack speed and volume of transfers
  – Generic interconnection networks replace buses
    • Concurrent connections

Low-Power Memories and Memory Hierarchies
• Reduce power regardless of type (ROM/RAM)
• Split memories into smaller sub-systems: activate only the needed circuits in accesses
• Specialized cache to reduce accesses
  – Before first cache level, store application's working set
  – Block Buffering – store most recently accessed cache set
  – Scratch Pad Memories – determined by compiler
• Trace cache: store instructions in executed order
  – Dynamic direction prediction-based trace cache
  – Selective Trace Cache: compiler helps

Low-Power Processor Architecture Adaptations
• Adaptive Caches: lines, blocks, or sets selectively activated based on miss threshold
  – Lost data and delay with no voltage
• Cache Decay: turns off unused cache lines after an interval
• Hot Spot Detection: count branches taken, activate cache lines within hotspot
• Dead Block: powers down cache lines containing basic blocks that have reached final use (compiler-directed)

Architecture Adaptations
• Adaptive Instruction Queues: partitions powered down when instructions aren't needed
  – Heuristics: measure IPC, with thresholds
• Algorithms for reconfiguring Multiple Structures:
  – Adjust pipeline width and register update unit for hotspots
  – Test configurations within hotspot
  – Offline Profiling
• Occupancy-based Selective Way Caches: measure cache hits in each way

Dynamic Voltage Scaling
• Modulate clock frequency and supply voltage
• Dynamic, depending on workload
• Difficulties:
  – Unpredictable workloads (tasks and I/O requests, predicting runtime)
  – Indeterminism – how to decide how fast?
  – Running an application at slowest speed may not be best
  – Non-linear effect of frequency

Dynamic Voltage Scaling
• Interval-Based Approaches: measure how busy, and estimate future
  – Workloads are not regular
  – Idling with a threshold, thrashing
  – Aged Averages, weighted intervals
• Intertask Approaches: assign speeds for different tasks
  – Monitor hardware events
  – Frequency for tasks generated in offline mode, cannot be known perfectly beforehand
  – Unaware of program structure, such as memory access

Dynamic Voltage Scaling
• Intratask Approaches: adjust processor speed and voltage within tasks
  – Split a task into fixed-length time slots
  – Slow down away from critical path, help from compiler
• Memory-Bounded Code: memory accesses limit how fast program can execute
  – Heuristics through experimentation
  – Cache miss counter
  – Stall cycle counter, PC marked as hot
  – Measure rate of instructions, compute-intensive

Dynamic Voltage Scaling
• Multiple Clock Domain Architectures:
  – Globally Asynchronous Locally Synchronous chip: chip split into multiple domains with independent clock rates
  – Allows certain sections of CPU to scale down when not needed
  – Needs to be divided such that communication between domains doesn't waste more energy
  – Can scale voltage based on instruction issue queues

Resource Hibernation
• Disk Drives: stop rotating platter during idle
  – An acceptable threshold
  – Delay non-urgent requests in a queue
  – Dynamic RPM drives for servers
• Network Interfaces: can it be turned off?
  – Track idleness of devices, enter listening or sleep mode
  – Allows network card to remain idle before shutting down
• Displays: dim display with no input
  – Face-off to recognize a face in front of display
  – Zoned Backlighting: adjust brightness of display regions

Compiler-Level Power Management
• Code that reduces execution time
  – No fixed relationship between performance and power
  – Reduce memory accesses
• Remote Compilation and Remote Execution
  – Server compiles and mobile device downloads
  – Cost of download must be less than compiling
• Statically Optimized Compilers
  – Program's runtime behavior may differ from expected
  – Process will run on an unpredictable system

Compiler-Level Power Management
• Dynamic Compilation:
  – Program recompiled as runtime environment changes
  – Resource levels such as battery capacity and energy budgets
  – Trade-off of recompilation

Application-Level Power Management
• Enable application to adapt to runtime environment
  – Trading off fidelity or quality of data to users
  – Lower QoS when resources are low
• Interfaces to allow applications to provide hints
  – Allow application to communicate with OS, and OS with hardware
  – Expected execution of tasks, deadlines
  – Better DVS, power down disk for longer periods of time

Cross-Layer Adaptations
• Forge: integrated power management framework
  – Streams videos at most efficient QoS level
  – Frequency and voltage scaling, network card interface
• Grace: adaptation framework
  – Global and local adaptations
• Compiler and Operating System interaction
  – Compiler has a worst-case deadline
  – OS adjusts processor speed to meet deadline

Conclusion of Techniques
• Multifaceted effort from various disciplines
• From transistors to applications, and across all layers
• Still ongoing research, new algorithms and heuristics
• Impossible to tell what new technologies will prove most successful

Commercial Systems
• Pentium 4: high performance goal
  – Internal temperature cap
  – Intel SpeedStep – 2 frequency and voltage settings
• Pentium M: mobile performance and low power
  – Reduce switching activity in circuit, idle units and buses
  – Low leakage transistors in cache
  – Enhanced SpeedStep with 6 frequency/voltage settings
• Intel PXA27x: wireless handheld devices
  – Uses memory boundedness to manage power modes

Emerging Radical Technologies
• Fuel Cells to replace batteries
  – Chemical reaction, but can supply energy indefinitely as long as fuel is supplied
  – Fuel enters anode, splits into proton + electron and generates charge
  – Fuel is abundantly available, such as hydrogen
• Micro-Electro-Mechanical Systems (MEMS)
  – Convert mechanical to electrical energy
  – Millimeter-scale turbine engines, ignite air with fuel
  – Produce hot exhaust gases; flammability
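
Worked example (not part of the original slides): the Power and Energy and Dynamic Voltage Scaling slides say that dynamic power depends on V, f, C, and a, and that both power and energy need to be reduced. A minimal sketch, assuming the standard switched-capacitance relation P = a * C * V^2 * f and a task with a fixed cycle count, shows why lowering the clock alone cuts power but not energy, while lowering voltage and frequency together cuts both. The function names, the operating points, and the numeric values are hypothetical, and leakage is ignored.

```python
def dynamic_power(a, c, v, f):
    """Switched-capacitance power in watts: activity * capacitance * V^2 * f."""
    return a * c * v ** 2 * f


def task_energy(cycles, a, c, v, f):
    """Dynamic energy in joules for a task of `cycles` clock cycles at (v, f)."""
    runtime = cycles / f  # seconds; fewer cycles per second means a longer run
    return dynamic_power(a, c, v, f) * runtime


# Hypothetical values, chosen only to make the arithmetic easy to follow.
A = 0.2                  # activity factor (fraction of capacitance switched)
C = 1e-9                 # switched capacitance in farads
CYCLES = 2_000_000_000   # work to be done, in clock cycles

FULL = (1.2, 2.0e9)       # (volts, hertz): full speed
SLOW_CLOCK = (1.2, 1.0e9) # frequency halved, voltage unchanged
SCALED = (0.6, 1.0e9)     # voltage and frequency both halved

for name, (v, f) in [("full", FULL), ("slow clock", SLOW_CLOCK), ("scaled", SCALED)]:
    p = dynamic_power(A, C, v, f)
    e = task_energy(CYCLES, A, C, v, f)
    print(f"{name:10s} power = {p:.3f} W   energy = {e:.3f} J")
```

Under these assumptions, halving only the frequency halves power but leaves the task's dynamic energy unchanged, because the task runs twice as long. Halving voltage and frequency together reduces power by a factor of eight and energy by a factor of four. Once leakage is included, the longer runtime costs additional static energy, which is one reason the slowest setting is not always the best choice.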