Relaxing Constraints: Thoughts on the Evolution of Computer Architecture
Joel Emer
Alpha Development Group
Compaq Computer Corporation
Better answers

Moore’s Law Alpha-style
[Figure: SPECint95 (log scale, 1 to 100) versus date of introduction for EV4-200, EV45-275, EV5-300, EV56-400, EV56-500, EV56-600, EV6-575, and EV67-730]

Iron Law of Performance
Performance = Frequency / (Instructions × CPI)
• Frequency – largely circuit design/technology
• CPI – largely organization
• Instructions – largely architecture/compiler

Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation

Power Dissipation Trends
[Figure: power (W), supply voltage (V), and supply current (A) for the 21064, 21164, 21264, and 21364]
• Power consumption is increasing
• Supply current is increasing faster!

Coping With Power Growth
Technology techniques
• Better cooling technology needed
• Accelerate Vdd scaling
• SOI
• Clock distribution
Architectural possibilities
• Use less power-hungry structures
• Reduce useless speculation

Clock Distribution Trends
[Figure: pie chart of 21264 peak power. Legend: Global Clock Networks, Instruction Issue Units, Caches, Floating Execution Units, Integer Execution Units, Memory Management Unit, I/O, Miscellaneous Logic. Slice labels: 2%, 5%, 8%, 32%, 10%, 10%, 15%, 18%]
• Frequencies will continue to scale
• Clock edge rates are not scaling

Coping With Clock Distribution
Technology solution
• Low swing differential clocks
• Adiabatic clocking
Architectural possibilities
• Multiple clock zones
• Asynchronous design

Communication Delay
[Figure: microprocessor chip, not drawn to scale]
• 21064 ~ 1 cycle
• 21164 ~ 1.5 cycles
• 21264 ~ 3 cycles
• 21464 ~ 6 cycles

Coping With Communication Delay
Technology solutions
• Low-k dielectrics
• Thinner (Cu) interconnect
Architectural possibilities
• Deeper pipelining
• Replication/clustering of structures
• More autonomous computation

SIA Roadmap
                                           1997      1999      2002      2005      2008      2012
Technology Node (nm)                       250       180       130       100       70        50
Memory (bit/chip)                          256M      1G        4G        16G       64G       256G
Transistors/chip (MPU)                     11M       21M       76M       200M      520M      1.4G
Chip Frequency (MHz)                       750       1250      2100      3500      6000      10,000
Wiring Levels (max)                        6         6 to 7    7         7 to 8    8 to 9    9
Power Supply Voltage, Vdd (V)              1.8-2.5   1.5-1.8   1.2-1.5   0.9-1.2   0.6-0.9   0.5-0.6
Power - High Performance (W), w/Heat sink  70        90        130       160       170       175
Power - Hand-held (W)                      1.2       1.4       2         2.4       2.8       3.2
* The 2012 column is taken directly from the SIA 1997 National Technology Roadmap

Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation

Disclaimer
The names used and events depicted in this talk are meant to be real. The events are, however, not an exhaustive enumeration of significant milestones. The misrepresentations of fact and omission of contributors are unintentional and solely the responsibility of the presenter. Finally, the interpretations are just that and are mine as well.
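
As a rough check on the power trend slides and the roadmap table earlier in the talk, the sketch below estimates the supply current implied by the high-performance power and Vdd rows. It assumes, purely for illustration, that all of the power is drawn at the core supply voltage, so current = power / Vdd. The names ROADMAP and supply_current_range are hypothetical and not part of the talk.

```python
# SIA 1997 roadmap figures quoted in the talk:
# year -> (high-performance power in W, (Vdd min, Vdd max) in V)
ROADMAP = {
    1997: (70, (1.8, 2.5)),
    1999: (90, (1.5, 1.8)),
    2002: (130, (1.2, 1.5)),
    2005: (160, (0.9, 1.2)),
    2008: (170, (0.6, 0.9)),
    2012: (175, (0.5, 0.6)),
}


def supply_current_range(power_w, vdd_range):
    """Return (min, max) supply current in amperes.

    Assumption: every watt is delivered at Vdd, so I = P / V.
    The lowest current corresponds to the highest voltage and vice versa.
    """
    vdd_low, vdd_high = vdd_range
    return power_w / vdd_high, power_w / vdd_low


if __name__ == "__main__":
    for year, (power, vdd) in ROADMAP.items():
        i_min, i_max = supply_current_range(power, vdd)
        print(f"{year}: {power:4d} W, {i_min:6.1f} to {i_max:6.1f} A")
```

Under this assumption, power grows 2.5 times between 1997 and 2012 while the implied current grows roughly nine- to tenfold, which is the sense in which supply current increases faster than power.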
Early quantitative method - 1981

uPC Histogram Chart – 1981-5

Paper counts
                  ISCA 1    ISCA 24
No model          22        1
Analytic model    5         ½
Simulation        1         21½
Measurement       0         7

Scientific Method
• Make hypothesis about behavior
• Design experiment
• Run experiment and quantify
• Interpret results
• New hypothesis

Scientific Method
• Make hypothesis about behavior
• Pick baseline design and workload
• Run experiment and quantify
• Interpret results
• New hypothesis

Scientific Method
• Make hypothesis about behavior
• Pick baseline design and workload
• Run simulation model or measure hardware
• Interpret results
• New hypothesis

Scientific Method
• Make hypothesis about behavior
• Pick baseline design and workload
• Run simulation model or measure hardware
• Interpret results
• Propose new design

Making and Testing Hypothesis
Cache experiment (Schlansker)
• 64K word cache
• 32-way set associative cache/LRU replacement
• 200x200 matrix subblock of an N x N matrix
• Read twice
Sizes
• N=2727: 0 misses
• N=2729: 24160 misses
• N=2731: 36382 misses

Propose new design
Skewed associative (Seznec)
[Figure: direct mapped, 4-way associative, and 4-way skewed caches]

Quantitative Approach Problems
• Too much abstraction
• Intra-chip latencies
• Memory subsystem
• Poor workloads
• Too incremental…

Quantitative -> Incremental
[Figure: bar chart, vertical axis 0 to 4, for designs a through l]

Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation

Relaxing Constraints
• Select a constraint to relax
• Generate design
• Employ quantitative method
• Evaluate results

Important Steps…
Before
• Carefully pick a constraint to relax
After
• Find contributions without constraint
• Preserve results after reinstating the constraint

Extrapolate From Current Trends
Personal Workstation – Xerox PARC – late 70’s
VAX 11/780       Dorado
5 MHz            15 MHz
512 Kilobytes    8 Megabytes
40+ Users        1 User
Results
• Accelerate innovation

Throw Out Standards
Distributed file system - 1985

Use a Simpler Starting Point
RISC out-of-order (Johnson, Torng)
[Figure: pipeline stages Fetch, Decode/Map, Queue, Reg Read, Execute, Dcache/Store Buffer, Reg Write, Retire; structures PC, Icache, Register Map, Regs, Dcache]

CISC-based O-O-O
• K6 (Johnson)
• Pentium Pro (Colwell, Papworth…)
[Figure: PC and Icache feeding a "Convert CISC to RISC" stage, followed by a RISC O-O-O core]

Abandon conventions
VLIW (Fisher)
• Relieve hardware of all dependency responsibility
• Give that responsibility to compiler
Expected consequences
• Much simpler implementation
• Faster cycle time

Sometimes not what you expect
Compiler scheduling for hardware is a great idea
• For 21064 - narrow in-order
• For 21164 - wider in-order
• For 21264 - wider out-of-order

Issue Logic Critical Loop
[Figure: Instruction Slot (S2) and Instruction Issue (S3) stages, with an issue conflict checker feeding the floating point multiply pipeline, floating point add pipeline, integer pipeline 0, and integer pipeline 1]

Make a Radical Departure
Multiscalar research (Sohi, Smith…)

New Mechanism Required
Dependence prediction (Moshovos)
[Figure: loads and stores in program order versus execution order, with a "Trap!" marker]

What Was Really Important
Full hardware management (Sohi)
• Sequencing
• Register dependencies
• Memory dependencies
Refinement (Mowry and Olukotun)
• Compiler managed – registers, sequencing
• Hardware managed memory dependence only

Ignoring Implementation Realities
SMT - in-order (Tullsen, Eggers, Levy)
[Figure: pipeline stages Fetch, Issue, Reg Read, Execute, Dcache/Store Buffer, Reg Write; structures PC, Icache, Regs, Dcache]

Solution Already Available
SMT out-of-order
[Figure: pipeline stages Fetch, Decode/Map, Queue, Reg Read, Execute, Dcache/Store Buffer, Reg Write, Retire; structures PC, Icache, Register Map, Regs, Dcache]

Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation

Pay Attention to Reality
Look at technology trends
• Power
• Latency
Use more realistic models
• More organizational details
• Better workloads

Ignore Reality
Look for revolutionary contributions
• Decide on a constraint to relax
• Apply the scientific method
Revolutionary contributions may arise because
• Constraint will be relaxed in time
• Constraint wasn’t fundamental
• New avenues of exploration will be opened

Acknowledgments
• Bill Bowhill
• Paul Gronowski
• Bill Herrick
• Toni Juan
• Geoff Lowney
• Ellen Piccioli
• Andre Seznec
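
The cache experiment from the "Making and Testing Hypothesis" slide (a 200x200 subblock of an N x N matrix, read twice through a 64K-word, 32-way set-associative LRU cache) can be explored with a small simulation. The sketch below is only an illustration under stated assumptions: the slide does not give the line size, the address layout, or the indexing function, so it assumes one-word lines, row-major storage starting at address 0, and modulo set indexing. The function name second_pass_misses and its parameters are hypothetical, and the sketch is not claimed to reproduce the miss counts quoted on the slide.

```python
from collections import OrderedDict


def second_pass_misses(n, block=200, cache_words=64 * 1024, ways=32,
                       line_words=1):
    """Count misses on the second read of a block x block subblock.

    Assumptions (not stated in the talk): the N x N matrix is stored
    row-major from address 0, lines hold `line_words` words, and the
    set index is the line number modulo the number of sets.
    """
    num_sets = cache_words // (ways * line_words)
    # One OrderedDict per set; insertion order tracks LRU (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for pass_no in range(2):
        for row in range(block):
            for col in range(block):
                line = (row * n + col) // line_words
                lru = sets[line % num_sets]
                if line in lru:
                    lru.move_to_end(line)      # hit: mark most recent
                    continue
                if pass_no == 1:
                    misses += 1                # count second-pass misses only
                if len(lru) >= ways:
                    lru.popitem(last=False)    # evict least recently used
                lru[line] = True
    return misses


if __name__ == "__main__":
    for n in (2727, 2729, 2731):
        print(n, second_pass_misses(n))
```

Varying n, ways, or line_words shows how strongly the miss count depends on the matrix dimension relative to the number of sets, which is the kind of hypothesis the slide's experiment was testing.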