Designing a Processor from the Ground Up to Allow Voltage/Reliability Tradeoffs

Andrew Kahng (UCSD), Seokhyeong Kang (UCSD), Rakesh Kumar (Illinois), John Sartori (Illinois)

Timing Errors
• Power is a first-order design constraint
• Voltage scaling can significantly reduce power
• Voltage scaling may result in timing errors
[Figure: relative path delay vs. operating voltage for Paths A, B, and C, starting from the nominal voltage]

Research Questions
• How does a conventional processor behave when we fix frequency and scale down voltage?
• How can we reduce the voltage at which timing errors are observed?
• Reduce power while maintaining the same performance level

Limitation of Voltage Scaling
• At some voltage, the circuit breaks down
[Figure: errors per cycle (0.0 to 0.5) vs. voltage (1.0 V down to 0.5 V)]
Voltage scaling must halt after only 10% scaling.

Limitation of Voltage Scaling
Error rate (%) by module and supply voltage:

Module            1.0 V   0.9 V   0.8 V   0.7 V   0.6 V   0.5 V
lsu_dctl           0.00    0.23    8.60   29.46   45.13   54.90
lsu_qctl1          0.00    5.94   10.85   16.99   16.56   37.53
lsu_stb_ctl        0.00    0.08    0.65    5.19   11.79   22.38
sparc_exu_div      0.00    0.15    0.23    0.35    0.49    1.10
sparc_exu_ecl      0.00    3.31   10.97   87.08   88.93   73.03
sparc_ifu_dec      0.00    0.08    0.87    7.09   15.22   20.48
sparc_ifu_errdp    0.00    0.00    0.00    0.00    0.00    9.21
sparc_ifu_fcl      0.00   10.56   22.25   50.04   55.06   56.95
spu_ctl            0.00    0.00    0.00    1.30    2.96   35.53
tlu_mmu_ctl        0.00    0.01    0.02    0.06    0.14    0.19

What problems are caused by steep error degradation?

Problems with Steep Error Degradation
[Figure: error rate and power consumption vs. voltage (lower voltage along the axis), marking the maximum error rate, Pmin, P'min, Vmin, and V'min]
Voltage scaling limited in traditional designs.
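As a rough illustration of how the per-module error-rate table under "Limitation of Voltage Scaling" can be read, the Python sketch below scans each module downward from the nominal voltage and reports the lowest voltage it reaches before its error rate first exceeds a tolerance. Only the error rates come from the table. The function names, the tolerance values, and the idea that all modules share one supply voltage are assumptions made for illustration.

```python
# Error rate (%) per module at each supply voltage, copied from the
# "Limitation of Voltage Scaling" table.
VOLTAGES = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
ERROR_RATE_PCT = {
    "lsu_dctl":        [0.00, 0.23, 8.60, 29.46, 45.13, 54.90],
    "lsu_qctl1":       [0.00, 5.94, 10.85, 16.99, 16.56, 37.53],
    "lsu_stb_ctl":     [0.00, 0.08, 0.65, 5.19, 11.79, 22.38],
    "sparc_exu_div":   [0.00, 0.15, 0.23, 0.35, 0.49, 1.10],
    "sparc_exu_ecl":   [0.00, 3.31, 10.97, 87.08, 88.93, 73.03],
    "sparc_ifu_dec":   [0.00, 0.08, 0.87, 7.09, 15.22, 20.48],
    "sparc_ifu_errdp": [0.00, 0.00, 0.00, 0.00, 0.00, 9.21],
    "sparc_ifu_fcl":   [0.00, 10.56, 22.25, 50.04, 55.06, 56.95],
    "spu_ctl":         [0.00, 0.00, 0.00, 1.30, 2.96, 35.53],
    "tlu_mmu_ctl":     [0.00, 0.01, 0.02, 0.06, 0.14, 0.19],
}


def lowest_voltage_within(rates, tolerance_pct):
    """Scan down from the nominal voltage and return the last voltage whose
    error rate is still within tolerance_pct (None if even nominal fails).

    Scanning stops at the first violation because some rows in the table
    are not monotonic (for example sparc_exu_ecl between 0.6 V and 0.5 V).
    """
    lowest = None
    for voltage, rate in zip(VOLTAGES, rates):
        if rate > tolerance_pct:
            break
        lowest = voltage
    return lowest


def design_voltage(tolerance_pct):
    """Hypothetical helper: assumes one shared supply and a non-negative
    tolerance, so the design is limited by its worst module."""
    return max(lowest_voltage_within(rates, tolerance_pct)
               for rates in ERROR_RATE_PCT.values())


if __name__ == "__main__":
    for tol in (0.0, 1.0, 11.0):
        print(f"tolerance {tol:5.2f}% -> design voltage {design_voltage(tol)} V")
```

With zero tolerance, nearly every module already shows errors at 0.9 V, so under these assumptions the shared supply cannot move below nominal at all.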
Problems with Steep Error Degradation
• No power savings as error rate increases
[Figure: power consumption vs. error rate for a traditional design, with error tolerance Technique 1 and Technique 2 (PTech1, PTech2)]
No reliability/power tradeoff

Problems with Steep Error Degradation
• Reliability/power tradeoffs enabled
• Allows switching between error tolerance techniques at different voltages/error rates
[Figure: power consumption vs. error rate, with error tolerance Technique 1 and Technique 2 (PTech1, PTech2); higher error rate, lower power]

Why do circuits fail catastrophically?

Reason for Steep Error Degradation
• Critical paths are bunched up in traditional designs.
Question: How can we change the slack distribution to achieve a graceful failure characteristic?

Design Objectives and Insight
[Figure: number of paths vs. timing slack, contrasting the 'wall' of slack of a power-optimized design with the 'gradual slope' of a slack-optimized design; zero slack after voltage scaling is marked]
• Optimize frequently exercised critical paths.
• De-optimize rarely exercised paths.
• Slack-optimized design: make the slack distribution gradual by re-distributing slack between paths.
• Reclaim excess timing slack; optimize critical paths.
Both gradual failure and low power can be achieved.
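The contrast between a 'wall' of slack and a 'gradual slope' can be made concrete with a small Python sketch. It counts the fraction of paths whose slack turns negative as slack is lost. The slack values are hypothetical, not measured data, and voltage scaling is modeled as removing the same amount of slack from every path, which is a simplifying assumption.

```python
def failing_fraction(slacks, slack_loss):
    """Fraction of paths whose slack is negative after every path loses
    slack_loss (uniform loss is a simplifying assumption)."""
    return sum(1 for s in slacks if s < slack_loss) / len(slacks)


# Hypothetical slack values in arbitrary time units.
# 'wall': most paths bunched just above zero slack.
wall = [0.05, 0.05, 0.06, 0.06, 0.07, 0.07, 0.08, 0.50]
# 'gradual': the same number of paths, spread out evenly.
gradual = [0.05, 0.12, 0.19, 0.26, 0.33, 0.40, 0.47, 0.54]

if __name__ == "__main__":
    for loss in (0.00, 0.10, 0.20, 0.30):
        print(f"slack loss {loss:.2f}: "
              f"wall {failing_fraction(wall, loss):.3f}, "
              f"gradual {failing_fraction(gradual, loss):.3f}")
```

In this toy data, a slack loss of 0.10 pushes seven of the eight bunched paths negative at once, while the spread-out distribution loses one path at that point and degrades step by step afterwards.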
Slack Re-distribution Example
[Figure: paths P1 and P2 between A, B, and flip-flops (FF), each shown on a negative/positive slack axis]
• TG(P1) = 0.25, Slack(P1): -0.2 → 0.0
• TG(P2) = 0.01, Slack(P2): 0.1 → -0.1
• Error rate: 25% → 1%

Proposed Design Flow
• Input: RTL description
• Output: Gradual slack design
• Objective: Minimize voltage for a given error rate over a range of error rates
[Flow chart, with stages Voltage Scaling, Path Optimization, and Area Reduction]
• Choose New (Lower) Target Voltage
• Estimate Error Rate at Target Voltage
• Decision: Error Rate > Target Rate? (YES/NO)
• Optimize Negative Slack Paths at Target Voltage by Resizing Cells
• Decision: Power > Current Power? (YES: Undo Optimization)
• Decision: Error Rate > Target Rate? (YES/NO)
• Reduce Power by Resizing Non-critical Cells (optional)
• Place and Route

Iterative Optimization
[Figure: Paths A, B, and C vs. voltage, marking the nominal voltage, the (fixed) target voltage with its estimated error rate, the new target voltage, the actual voltage at the target error rate, the negative slack of Path A at the target voltage, and unnecessary cell sizing]
• Find target voltage and optimize iteratively
• Using a fixed target results in over-optimization.
• Iterative optimization avoids unnecessary swaps.

Error Rate Forecasting
[Figure: paths P1, P2, and P3 between flip-flops A and B, clocked by CLK; a timing error occurs on the negative slack path]
• TG(P1) = 0.3, Slack(P1) = pos
• TG(P2) = 0.2, Slack(P2) = neg
• TG(P3) = 0.1, Slack(P3) = pos
• TG(B) = 0.6
• $ER(B) = TG(B) \cdot \dfrac{TG(P2)}{TG(P1) + TG(P2) + TG(P3)} = 0.2$

Design-level Methodology
Steps: Library characterization, Benchmark generation, Functional simulation, Slack Optimization, ECO P&R
• Cadence SignalStorm – Synopsys Liberty generation for each voltage
• Virtutech Simics – Test vector generation
• Cadence NC Verilog – Gate-level simulation
• C++ with Synopsys PrimeTime interface
• SOCEncounter – Placement and Routing

Gradual Slack Distribution
Slack optimization achieves gradual slack distribution.

Processor Module Optimization
Slack optimized design has lowest power for all error rates.
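The flip-flop example on the Error Rate Forecasting slide can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' tool: TG is taken here to mean toggle rate, the function name `forecast_ff_error_rate` is hypothetical, and because the slide gives slack only as "pos" or "neg", the numeric slack values below are placeholders that carry nothing but the sign.

```python
def forecast_ff_error_rate(tg_ff, paths):
    """Estimate the error rate of one flip-flop.

    tg_ff -- toggle rate of the flip-flop, TG(B) on the slide
    paths -- list of (toggle_rate, slack) pairs for paths ending at it

    Implements ER = TG_ff * (sum of TG over negative slack paths)
                          / (sum of TG over all paths).
    """
    total_tg = sum(tg for tg, _ in paths)
    if total_tg == 0:
        return 0.0  # no path ever toggles, so no timing errors are forecast
    negative_tg = sum(tg for tg, slack in paths if slack < 0)
    return tg_ff * negative_tg / total_tg


if __name__ == "__main__":
    # Values from the slide; +1.0 / -1.0 are placeholder slacks (sign only).
    paths_to_b = [(0.3, +1.0),   # P1: positive slack
                  (0.2, -1.0),   # P2: negative slack
                  (0.1, +1.0)]   # P3: positive slack
    print(forecast_ff_error_rate(0.6, paths_to_b))
```

Only P2 has negative slack, so the forecast is TG(B) scaled by P2's share of the total toggle rate, which matches the 0.2 stated on the slide up to floating-point rounding.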
Processor Error Rate and Power
Designs with comparable error rates have much higher power/area overheads.

Reliability/Power Tradeoff
Slack-optimized design enjoys continued power reduction as error rate increases.

Enhancing Razor-based Design
Slack optimization extends range of voltage scaling and reduces Razor recovery cost.

Summary and Conclusion
• Showed limitations of traditional processors w.r.t. voltage scaling
  • Traditional designs break down
• Presented design technique that enables voltage/reliability tradeoffs
  • Optimize frequently exercised critical paths
  • De-optimize rarely exercised paths
• Demonstrated significant power benefits of gradual slack design
  • Reduced power 29% for 2% error rate, 27% on average

Bonus Slides

Slack Optimization Techniques
• Path Optimization and Power Reduction
  1. Optimize Paths
  2. Reduce Power
[Figure: error rate and power consumption vs. voltage, marking the operating point, the maximum error rate, Pmin, and Vmin; extended voltage scaling toward lower voltage]

Path Optimization
• Focus on frequently exercised negative slack paths
• Reduce error rate while minimizing cell swaps (power overhead)

Path       TG     Opt. Rank
A-B-C-D    0.22   1
A-B-D      0.15   2
A-C-D      0.05   3

[Figure: paths through A, B, C, and D with gates OR1 and OR2; one cell is marked SWAP]
• Rank paths by error rate contribution.
• Upsize cells in paths to increase slack.

Power Reduction
• Downsize cells on rarely exercised paths
• Reduce leakage power while leaving error rate unaffected
[Figure: INV4 swapped for INV1 on a path with toggle rate ≈ 0; check path slack]

Error Rate Forecasting
• Error rate contribution of one FF: $ER_{ff} = TG_{ff} \cdot \dfrac{\sum TG_{P_{NEG}}}{\sum TG_{P_{ALL}}}$
• Error rate of design: $ER_D = \sum_{ff \in D} ER_{ff}$

Significance of Processor Power
[Figure: power savings (%), 0 to 100, vs. voltage]
• Power is a first-order design constraint
• Voltage scaling can significantly reduce power
Voltage Scaling: 50%, Power Reduction: 80%

DVFS Benefit and Cost
[Figure: relative power and relative performance (0.0 to 1.0) vs. voltage (1.0 down to 0.5)]
How effective is voltage scaling when frequency is fixed?

Alternatives – Blueshift
• Goal: Optimize paths that cause errors to enable more frequency overscaling
• Techniques: PCT/OSB
• Uses iterative simulation loop – infeasible for large designs

Alternatives – Tightly Constrained SP&R
• Goal: Optimize all paths aggressively
• Technique: Traditional SP&R with aggressive target
• Some paths will not meet tight constraint, and slack distribution becomes more gradual

Insight
• Optimize frequently exercised paths at the expense of rarely exercised paths
• Optimizing frequently exercised paths enables deeper voltage scaling
• De-optimizing rarely exercised paths keeps power overhead low

Iterative Optimization Flow
[Flow chart, with stages Scale Voltage, Optimize Paths, and Iterate]
• Choose New (Lower) Target Voltage
• Estimate Error Rate at Target Voltage
• Decision: Error Rate > Target Rate? (YES/NO)
• Optimize Negative Slack Paths at Target Voltage by Resizing Cells
• Decision: Power > Current Power? (YES: Undo Optimization)
• Decision: Error Rate > Target Rate? (YES/NO)
• Reduce Power by Resizing Non-critical Cells (optional)
• Place and Route

A New Processor Design Goal
• Reshape the slack distribution so processor fails gracefully
[Figure: number of paths vs. timing slack, contrasting a 'wall' of slack with a 'gradual slope'; zero slack after voltage scaling is marked]

Moore's Law
• Power consumption of processor node doubles every 18 months.

Power Scaling
• With current design techniques, processor power soon on par with nuclear power plant

Outline
• Background and Motivation
• Insight
• Power Reduction Techniques
• Design Flow
• Results
• Summary