Opportunities and Challenges for Better Than Worst-Case Design

Todd Austin (presenter), Valeria Bertacco, David Blaauw, Trevor Mudge
University of Michigan
[email protected]

Traditional Worst-Case Design
* Design-time verification and optimization
[Figure: low-to-high (L to H) scales for time-to-market and performance]

Better Than Worst-Case Design
* Run-time verification
* Typical-case optimization
[Figure: low-to-high (L to H) scales for time-to-market and performance]

Addressing Challenges in the Nanometer Regime
* Design complexity
  * Billions and billions of transistors lead to untenable designs…
* Soft error upsets in logic and memory
  * Cosmic rays, alpha particles, neutrons, etc…
* Uncertainty in design parameters
  * Process and temperature variation, supply noise…
* Power/performance demands
  * Bounding performance, area, and battery life

Example BTWC Design: DIVA Checker
[Figure: a performance core (IF, ID, REN, REG, SCHEDULER, EX/MEM) passes speculative instructions, in order, with PC, inst, inputs, and addr, to a checker (CHK, CT) responsible for correctness]
* All core function is validated by checker
  * Simple checker detects and corrects faulty results, restarts core
* Checker relaxes burden of correctness on core processor
  * Tolerates design errors, electrical faults, defects, and failures
* Core has burden of accurate prediction, as checker is 15x slower
  * Core does heavy lifting, removes hazards that slow checker

Another BTWC Design: Razor Logic
[Figure: Razor flip-flop between logic stages; labels: Main FF (clk), Shadow Latch (clk_del)]
* Double-sampling metastability tolerant latches detect timing errors
  * Second sample is correct-by-design
* Microarchitectural support restores state
  * Timing errors treated like branch mispredictions

Distributed Pipeline Recovery
[Figure: pipeline stages PC, IF, ID, EX, MEM (read-only), and WB (reg/mem) separated by Razor FFs, with a Stabilizer FF; each stage exchanges error, recover, bubble, and flushID signals with the flush control; a cycle-by-cycle timeline shows inst1 through inst8 and inserted bubbles]
* Builds on existing branch prediction framework
* Multiple cycle penalty for timing failure
* Scalable design as all communication is local
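The double-sampling idea on the Razor slides can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the Razor circuit itself: the function names, the timing parameters, and the single-number `arrival_time` abstraction are hypothetical, and the model simply assumes (as the slide states) that the delayed second sample is correct by design.

```python
from dataclasses import dataclass


@dataclass
class RazorSample:
    main: int    # value captured by the main flip-flop on clk
    shadow: int  # value captured by the shadow latch on clk_del

    @property
    def error(self) -> bool:
        # A timing error is flagged when the two samples disagree.
        return self.main != self.shadow


def razor_sample(old_value, new_value, arrival_time, clock_period, shadow_delay):
    """Hypothetical model: the stage output switches from old_value to
    new_value at arrival_time (same units as clock_period)."""
    if arrival_time > clock_period + shadow_delay:
        # Assumption: the design guarantees data settles before clk_del,
        # so this case is outside what the model covers.
        raise ValueError("data arrives after the shadow latch samples")
    main = new_value if arrival_time <= clock_period else old_value
    return RazorSample(main=main, shadow=new_value)


def recover(sample, penalty_cycles):
    """Return (value_used, cycles_lost). On an error the shadow value is
    restored and a recovery penalty is paid, much like a branch mispredict."""
    if sample.error:
        return sample.shadow, penalty_cycles
    return sample.main, 0


on_time = razor_sample(0, 1, arrival_time=0.8, clock_period=1.0, shadow_delay=0.5)
late = razor_sample(0, 1, arrival_time=1.2, clock_period=1.0, shadow_delay=0.5)
print(on_time.error, recover(on_time, penalty_cycles=1))  # False (1, 0)
print(late.error, recover(late, penalty_cycles=1))        # True (1, 1)
```

Note that a late arrival only produces an error when the late value differs from the stale one, which is one reason timing errors can stay infrequent in the typical case.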
Opportunities for CAD
Key observation: Infrequent faults in the core design are tolerable.
Opportunities:
* Focus only on the critical components, no need to verify ad infinitum
* Optimize performance/power for the most common scenarios (typical-case optimization)

Razor Opportunity: Typical-Case Energy Reduction
[Figure: control loop in which pipeline error signals are collected into E_sample (with a reset input), E_diff = E_ref - E_sample is fed to a voltage control function, and a voltage regulator sets Vdd for the pipeline]
* Energy reduction can be realized with a simple proportional control function
* Control algorithm implemented in software

Energy/Performance Characteristics
[Figure: energy and pipeline throughput (IPC) versus decreasing supply voltage. Curves: total energy, E_total = E_proc + E_recovery, with the optimal E_total marked; energy of processor operations, E_proc; energy of pipeline recovery, E_recovery; energy of processor without Razor support. Annotations: 1%, 50%]

Razor Opportunity: Typical-Case Optimized Adder
[Figure: Kogge-Stone adder; inputs G15/P15 down to G0/P0, and Cin]
[Figure: Carry Propagations for Random Data; probability (axis up to 0.01) versus carry start (0 to 48) and carry propagation (0 to 64)]
[Figure: Carry Propagations for Typical Data; probability (axis up to 0.16) versus carry start (0 to 48) and carry propagation (0 to 64)]

Typical Case Optimized Adder
[Figure: inputs G15/P15 down to G0/P0, and Cin, feeding a ripple carry circuit and a carry-lookahead circuit]

Benefits of Typical Case Optimization
Latency (in gate delays)
Adder Topology   Worst-Case   Typical-Case   Random
Kogge-Stone      8            5.08           7.09
TCO Adder        128          3.03           3.69
* Typical-case performance much better than worst case
  * Especially for typical-case optimized design

Core CAD Requirement: Observability of Circuit-Level Characteristics
[Figure: an architectural simulator (speed and scope; App and Arch Config in; pipeline IF, ID, EX, MEM, WB; Arch Metrics and Output out) coupled to a circuit simulator (fidelity and observability; Tech Models and module circuit models in; Circuit Metrics out). Inputs, voltage, and constraints flow to the circuit side; delay, power, and switching flow back]
* Circuit-aware architectural simulator efficiently melds circuit simulation with architectural simulation

Additional CAD Opportunities
For synthesis:
* Typical-case library characterization (e.g., pdf of delay)
* Synthesize design for target performance, power, etc…
* TCO-style optimizations possible for macro-modules
For verification:
* Full formal verification for checker components
* Profile-directed simulation-based verification for core
For testing:
* Checker component can facilitate software-based manufacturing test of core components

Conclusions
Better than worst-case design abandons traditional worst-case design constraints
* Couples complex designs with checkers
* Enables CAD opportunities for typical-case optimization
* Requires tool support for observability, synthesis and verification

For more information: http://www.eecs.umich.edu/razor
First tutorial at DATE, Munich, March 2005
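
As a rough illustration of the "simple proportional control function" from the typical-case energy reduction slide, the sketch below performs one software control step using E_diff = E_ref - E_sample. Several things here are assumptions rather than statements from the slides: E is read as a measured pipeline error rate, a positive E_diff is taken to mean the voltage can be lowered, and the function name, gain, and voltage limits are hypothetical values chosen only for the example.

```python
def next_vdd(vdd, e_sample, e_ref, gain, vdd_min, vdd_max):
    """One step of a proportional voltage controller (illustrative only).

    vdd       current supply voltage
    e_sample  error rate observed from the pipeline error signals (assumed meaning)
    e_ref     target error rate (assumed meaning)
    gain      proportional gain in volts per unit of error rate (hypothetical)
    """
    e_diff = e_ref - e_sample  # relation given on the slide
    # Assumed sign convention: fewer errors than the target (e_diff > 0)
    # lowers Vdd to save energy; more errors (e_diff < 0) raises it.
    vdd_new = vdd - gain * e_diff
    # Keep the request inside the regulator's range (hypothetical limits).
    return min(vdd_max, max(vdd_min, vdd_new))


# No errors observed, 1% target: the controller nudges the voltage down.
print(next_vdd(1.5, e_sample=0.00, e_ref=0.01, gain=2.0, vdd_min=1.0, vdd_max=1.8))
# 5% errors observed, 1% target: the controller raises the voltage.
print(next_vdd(1.5, e_sample=0.05, e_ref=0.01, gain=2.0, vdd_min=1.0, vdd_max=1.8))
```

Because the controller tolerates a small nonzero error rate instead of driving it to zero, it trades recovery energy (E_recovery) against processor energy (E_proc), which is the trade-off the energy/performance slide depicts.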