DVSleak: Combining Leakage Reduction and Voltage Scaling in Feedback EDF Scheduling
Yifan Zhu, Frank Mueller
North Carolina State University
Center for Efficient, Secure and Reliable Computing

Background
- Dyn. voltage scaling (DVS): lowers dyn. power
- Dynamic power was dominating
  - $P \sim p_t C_L V_{dd}^2 f_{clk} + I_{leak} V_{dd} + P_{short}$
- Leakage becoming dominant
- Sleep: lowers leakage (static) power

Real-Time Systems
- Hard real-time systems
  - periodic, preemptive, independent tasks [Liu, Layland]
    - w/ known worst-case execution time (WCET)
  - jobs: periodically released instances of a task
  - WCET: measured at the max. freq., w/o DVS
  - most practical systems: U << 1
- Earliest-deadline-first (EDF) scheduling
  - $\sum_{i=1}^{n} C_i / P_i \le 1$, $C_i$ = WCET, $P_i$ = period
  - $\sum_{i=1}^{n} C_i / P_i \le \alpha$, $\alpha = f_{act} / f_{max}$ ($0 < \alpha \le 1$): DVS scaling factor

Motivation
- Embedded systems with limited power supply
- DVS for real-time systems
  - trade-off: energy saving vs. timing requirements
  - lower CPU voltage/frequency → longer to complete
- Task workloads change dynamically
  - WCET overestimates actual execution time
  - wide variation of execution times
    - longest vs. shortest times
- Sleeping: 1-2 orders of magnitude less power
  - DVS below threshold → more energy than sleeping
  - Long idle → more energy than sleeping
  - But wakeup overhead (cold misses in cache)

Motivation
- Real-world examples:
  - graphics: 78% of WCET [Wegener/Mueller]
  - defense: 87%; automotive: 74%
  - benchmarks: 30-89%; image recognition: 85% [Wolf]
- Prior DVS algorithms: lack adaptability to dynamic workloads
[Figure: Look-ahead DVS [Pillai/Shin], normalized energy vs. WCET utilization, for c = 50% WCET and c in [20% WCET, 80% WCET]]

Contribution
- A feedback-based framework for dynamic workloads [LCTES’02, RTAS’04, LCTES’05]
- New: A hybrid sleep+DVS scheme, 2 observations:
  1. Limit to DVS → use sleep below certain threshold
  2. Trade-off idle vs. sleep depends on length of inactivity
  3. Feedback helps in these decisions
- Simulation experiments
- Comparison with prior work

Related Work
- Dynamic Voltage Scaling
  - General purpose DVS: Weiser, Govil, Pering, Grunwald
  - Real-time DVS: Lee, Pillai, Aydin
  - Optimality of DVS: Ishihara, Qu, Lorch, Xie, Saewong
- Feedback Real-time Scheduling
  - Stankovic, Lu, Varma, Poellabauer, Minerick
- Leakage-aware DVS scheduling
  - Lee, Quan, Jejurikar ’04/’05, Zhang
  - We compare with Jejurikar’05 (closest related, best scheme)

Feedback-DVS Framework
- V/f selector: (V, f) = func(error)
- Maximum EDF schedule
  - determines slack in EDF schedule
  - assumes: c = WCET
[Fig.: Feedback-DVS Framework, with the error derived from $c_i$ and $C_i^A$]

Voltage-Frequency Selector
- Frequency for task $T_k$: $f = \alpha \cdot f_{max}$, from the utilization terms $C_k / P_k$ and $\sum_{i \in \{1,\dots,n\} \setminus \{k\}} C_i / P_i$ (bound: 1)
- Task splitting with WCET: $C_i = C_i^A + C_i^B$
  - $C_i^A$ at freq. $\alpha$ (between 0 and 100%); $C_i^B$ at max. freq.
  - More aggressive:
    - $\alpha$ < uniform frequency w/o splitting
  - Objective:
    - T finishes within the 1st portion → lower energy consumption
- $\alpha = C_i^A / (C_i^A + slack)$
- Still guaranteed to meet deadline (proof in prior paper)
[Figure: frequency f (up to 100%) over time t for the split portions Ta ($C_i^A$) and Tb ($C_i^B$)]

Extension to Leakage-aware DVS
- $P \sim p_t C_L V_{dd}^2 f_{clk} + I_{leak} V_{dd} + P_{short}$
- Dynamic power does NOT dominate anymore!
- Static power exceeds dynamic power when the voltage is reduced below a threshold value, the critical speed
  - Voltage below threshold not energy efficient anymore
  - Sleeping may be better
- But need to consider wakeup overhead
  - Mostly due to cache refill
  - Calculated statically based on time to refill reused lines

Speed Reduction vs. Task Delaying
[Figure: task T on timeline t under (a) speed reduction and (b) delaying the start time]
- Why delay the start time of a task?
  - To maximize the CPU sleeping time

Delay Dispatching a Task
[Figure: timelines of T1, T2, T3 with idle1, idle2, WCET and CB marked, in three panels: (i) Consider Schedule, (ii) No Delay, (iii) Delay (with sleep)]
1. If idle1+idle2 > $t_{th}$ before DVS but < $t_{th}$ afterwards → no DVS
2. idle1+idle2 < $t_{th}$ → no delay
3. If idle1 < CB → no delay
4. Otherwise → delay
- $t_{th}$: threshold for sleep
- Still guaranteed to meet deadline (proof in paper)

Scaling below the Critical Speed
- Pure DVS: should never scale frequency below critical speed
- DVS combined with sleeping:
  - sleep if idle slot > threshold $t_{th}$
  - If idle slot is too short (< $t_{th}$), scale below critical speed
    - No other work to do (in contrast to non real-time)
    - Lower frequency/voltage → power savings

Experimental Framework
- Scheduling simulator
  - Accurately reflects energy trends [Zhu’05] (PPC405LP)
- Use the same power model as [Jejurikar’04]
  - Critical speed, wakeup cost
- Assume four discrete frequency levels:
  - 25%, 50%, 75%, 100% of $f_{max}$
- Compare energy in hyperperiod (const. amount of work) for
  - Pure Feedback-DVS
  - DVS+sleep: Feedback-DVS w/ sleep policy (no delay policy)
  - DSR-DP: dyn. procrastination + slack reclamation [Jejurikar’05]
  - DVSleak: feedback-DVS w/ sleep & delay now/later policies
  - Lower-bound schedule: best frequency + sleep for max. idle

3 Tasks, Const. Execution, 25% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1-0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, Lower Bound]
- Significant savings w/ sleep, more for low utilizations
- DVSleak: Delay has most impact for medium to high utilizations
  - Close to lower bound

3 Tasks, Const. Execution, 75% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1-0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, Lower Bound]
- All schemes: resilient to actual/WCET ratio
- DVSleak never worse than other schemes, savings:
  - 50% over pure, 20% over DVS+sleep, 8.5% over DSR-DP

3 Tasks, Var. Execution (pat1), 75% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1-0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, Lower Bound]
- DVSleak: more resilient to fluctuating exec. times (unchanged) → feedback helps!
- All others: 5-10% more energy consumption than for const. exec.

10 Tasks, Const. Execution, 25% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1-0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, Lower Bound]
- More tasks → 5-10% higher energy cost (switching)
- DVSleak still best of all (~ same margin)

Length of Task Periods
- U=60%, E normalized to hyperperiod of task set 2, c=50% WCET
- Harmonic (1) vs. non-harmonic (2):
  - 10-27% more energy for non-harmonic → cannot fold jobs released at same time → more uncertainty
- Longer (2) vs. shorter (3) periods for non-harmonic:
  - 2-28% more energy for shorter periods → more job releases, less sleep time
  - DVSleak ~ 15% lower energy than DSR-DP
- Feedback more important for shorter periods

Conclusion
- DVSleak: Novel Feedback DVS + leakage (sleep), benefits for
  - fluctuating execution times
  - shorter task periods → can scale below critical speed
  - medium utilizations (most common) → sleep policy by itself enough for high/low utilizations (always sleep/never sleep)
- DVSleak energy savings over other schemes:
  - avg. 50% over DVS-only
  - avg. 20% more over DVS+sleep
  - avg. 8.5% more over [Jejurikar’05]
  - Sleep now/later important when actual exec. << WCET
- Prior: Evaluation on a real embedded platform
  - $V^2 f$ model works for OS scheduling

Future Work
- Implementation on IBM PPC 405LP test board
- Has been used for DVS experiments
  - Oscilloscope, data acquisition card for voltage / current
- Assessing sleep modes
  1. Clock suspend → same power, all still up
  2. Suspend → 1/10 power, SDRAM up
  3. Hibernate → N/A (SDRAM → NVRAM)
  4. Standby → N/A (APM over I2C)
- Need faster resume (reactivating devices is slow → low-power modes)
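
Illustrative sketch (not from the slides)

The EDF utilization test, the task-splitting factor alpha = CiA / (CiA + slack), and the four delay-dispatching rules from the slides can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the names `Task`, `static_scaling_factor`, `split_alpha` and `dispatch_decision` are hypothetical, the `idle_after_dvs` parameter is an assumed way to express "idle time after DVS" in rule 1, and returning the first rule that fires is a simplification. The slides do not give the authors' actual implementation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Task:
    wcet: float    # C_i: worst-case execution time, measured at f_max
    period: float  # P_i: task period


def edf_utilization(tasks: Sequence[Task]) -> float:
    """Sum of C_i / P_i over all tasks."""
    return sum(t.wcet / t.period for t in tasks)


def static_scaling_factor(tasks: Sequence[Task],
                          levels: Sequence[float] = (0.25, 0.5, 0.75, 1.0)) -> float:
    """Lowest discrete alpha = f_act / f_max with sum(C_i / P_i) <= alpha.

    The four default levels mirror the frequency levels assumed in the
    experimental framework slide.
    """
    u = edf_utilization(tasks)
    if u > 1.0:
        raise ValueError("task set is not EDF-schedulable at f_max")
    # Small tolerance so that floating-point noise does not skip a level.
    return min(a for a in levels if a >= u - 1e-9)


def split_alpha(c_a: float, slack: float) -> float:
    """Task-splitting factor alpha = C_i^A / (C_i^A + slack)."""
    if c_a <= 0 or slack < 0:
        raise ValueError("need c_a > 0 and slack >= 0")
    return c_a / (c_a + slack)


def dispatch_decision(idle1: float, idle2: float, t_th: float, c_b: float,
                      idle_after_dvs: Optional[float] = None) -> str:
    """Return the first of the four slide rules that applies (a simplification).

    idle_after_dvs is a hypothetical input: the combined idle time that
    would remain if DVS were applied.
    """
    total = idle1 + idle2
    # Rule 1: enough idle time to sleep before DVS, but not afterwards.
    if idle_after_dvs is not None and total > t_th and idle_after_dvs < t_th:
        return "no DVS"
    # Rule 2: combined idle time is below the sleep threshold.
    if total < t_th:
        return "no delay"
    # Rule 3: first idle slot is shorter than CB.
    if idle1 < c_b:
        return "no delay"
    # Rule 4: otherwise, delay the task to lengthen the sleep interval.
    return "delay"


if __name__ == "__main__":
    tasks = [Task(wcet=1, period=4), Task(wcet=1, period=8)]
    print(static_scaling_factor(tasks))        # utilization 0.375
    print(split_alpha(c_a=2.0, slack=2.0))
    print(dispatch_decision(idle1=3, idle2=4, t_th=5, c_b=2))
```

A real scheduler would re-evaluate these decisions at every job release and completion, using the feedback-corrected execution-time estimate instead of fixed inputs.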