DVSleak: Combining Leakage Reduction and Voltage Scaling in Feedback EDF Scheduling
Yifan Zhu, Frank Mueller
North Carolina State University
Center for Efficient, Secure and Reliable Computing
Background
Dyn. Voltage scaling (DVS): lowers dyn. power
Dynamic power was dominating
— Power ~ $p_t C_L V_{dd}^2 f_{clk} + I_{leak} V_{dd} + P_{short}$
Leakage becoming dominant
Sleep: lowers leakage (static) power
Real-Time Systems
Hard real-time systems
— periodic, preemptive, independent tasks [Liu, Layland]
– w/ known worst-case execution time (WCET)
— jobs: periodically released instances of a task
— WCET: measured at the max. freq., w/o DVS
— most practical system: U << 1
Earliest-deadline-first (EDF) scheduling
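— $\sum_{i=1}^{n} \frac{C_i}{P_i} \le 1$, $C_i$ = WCET, $P_i$ = period
— $\sum_{i=1}^{n} \frac{C_i}{P_i} \le \alpha$, $\alpha = \frac{f_{act}}{f_{max}}$ ($0 < \alpha \le 1$): DVS scaling factor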
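The EDF utilization test and the choice of a scaling factor can be sketched as follows. This is a minimal illustration, not the authors' scheduler: the task set is hypothetical, the function names are made up for this example, and the four levels are the discrete frequencies assumed later in the experiments.

```python
from fractions import Fraction

# Discrete scaling factors (fractions of f_max), as assumed in the experiments.
LEVELS = (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1))

def utilization(tasks):
    """Worst-case utilization: sum of C_i / P_i over (WCET, period) pairs."""
    return sum(Fraction(c) / Fraction(p) for c, p in tasks)

def edf_feasible(tasks, alpha=Fraction(1)):
    """EDF test at scaling factor alpha = f_act / f_max (0 < alpha <= 1)."""
    return utilization(tasks) <= alpha

def lowest_feasible_level(tasks, levels=LEVELS):
    """Smallest discrete scaling factor that passes the EDF test, or None."""
    for alpha in sorted(levels):
        if edf_feasible(tasks, alpha):
            return alpha
    return None

# Hypothetical task set: (WCET at f_max, period)
tasks = [(1, 4), (1, 5), (2, 10)]
print(utilization(tasks), lowest_feasible_level(tasks))
```

With U = 13/20 for this task set, the 50% level fails the test and the 75% level is the lowest one that still guarantees all deadlines under worst-case execution.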
Motivation
Embedded systems with limited power supply
DVS for real-time system
— trade-off: energy saving vs. timing requirements
— lower CPU voltage/frequency → longer to complete
Task workloads change dynamically
— WCET overestimates actual execution time
— wide variation of execution times
–Longest vs. shortest times
Sleeping: 1-2 orders of magnitude less power
— DVS below threshold → more energy than sleeping
— Long idle → more energy than sleeping
— But wakeup overhead (cold misses in cache)
Motivation
Real-world examples:
— graphics: 78% of WCET [Wegener/Mueller]
— defense: 87%; automotive: 74%
— benchmarks: 30-89%; image recognition: 85% [Wolf]
Prior DVS algorithms: lack adaptability to dynamic workloads
[Figure: Look-ahead DVS [Pillai/Shin]. Energy (normalized) vs. WCET utilization (0.1 to 1) for c = 50% WCET and c in [20% WCET, 80% WCET]]
Contribution
A feedback-based framework for dynamic workloads
[LCTES’02, RTAS’04, LCTES’05]
New: A hybrid sleep+DVS scheme, 2 observations:
1. Limit to DVS → use sleep below a certain threshold
2. Trade-off idle vs. sleep depends on length of inactivity
Feedback helps in these decisions
Simulation experiments
Comparison with prior work
Related Work
Dynamic Voltage Scaling
— General purpose DVS: Weiser, Govil, Pering, Grunwald
— Real-time DVS: Lee, Pillai, Aydin
— Optimality of DVS: Ishihara, Qu, Lorch, Xie, Saewong
Feedback Real-time Scheduling
— Stankovic, Lu, Varma, Poellabauer, Minerick
Leakage-aware DVS scheduling
— Lee, Quan, Jejurikar ’04/’05, Zhang
— We compare with Jejurikar’05 (closest related, best scheme)
Feedback-DVS Framework
V/f selector: (V,f) = func(error), with error = $c_i - C_i^A$
Maximum EDF schedule: determines slack in EDF schedule; assumes c = WCET
Fig. Feedback-DVS Framework
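One way to picture the feedback loop is a simple proportional update of the estimated first portion of a task after each job. This is only a sketch: the slides do not specify the controller, so the proportional gain `kp`, the clamping bounds, and the reading of the error as actual execution time minus the current estimate are all assumptions of this example.

```python
def next_estimate(c_a, actual, wcet, kp=0.5, min_est=1e-6):
    """Proportional feedback on the estimated first portion C_A (assumed controller).

    error = actual execution time - current estimate; the new estimate is
    clamped to (0, WCET] so that a scaling factor can still be derived from it.
    """
    error = actual - c_a
    updated = c_a + kp * error
    return min(max(updated, min_est), wcet)

# Hypothetical job history for a task with WCET = 10 time units at f_max.
estimate = 8.0
for actual in (4.0, 4.0, 5.0):
    estimate = next_estimate(estimate, actual, wcet=10.0)
print(estimate)
```

The estimate drifts toward recent execution times, which is what lets the V/f selector react to workloads that run well below WCET.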
Voltage-Frequency Selector
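Scaling factor $\alpha = f / f_{max}$ for task $T_k$:
$\frac{C_k}{\alpha P_k} + \sum_{i \in \{1,\dots,n\} \setminus \{k\}} \frac{C_i}{P_i} \le 1$
Task splitting with WCET: $C_i = C_i^A + C_i^B$
— $C_i^A$ at freq. $\alpha$ ($0 < \alpha \le 100\%$); $C_i^B$ at max. freq.
— More aggressive:
–$\alpha$ < uniform frequency w/o splitting
— Objective:
–T finishes within the 1st portion → lower energy consumption
— $\alpha = C_i^A / (C_i^A + slack)$
Still guaranteed to meet deadline → proof in prior paper
[Figure: frequency f (up to 100%) over time t for a split task: portion $T_a$ of length $C_i^A / \alpha$, then portion $T_b$ of length $C_i^B$]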
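The task-splitting arithmetic can be checked with a small worked example. The numbers and function names are hypothetical; the sketch only restates the relation between the first portion, the slack, and the scaling factor, and compares it with stretching the whole WCET uniformly.

```python
def split_scaling_factor(c_a, slack):
    """Scaling factor for the first portion: C_A / (C_A + slack)."""
    if c_a <= 0 or slack < 0:
        raise ValueError("need c_a > 0 and slack >= 0")
    return c_a / (c_a + slack)

def uniform_scaling_factor(wcet, slack):
    """Scaling factor if the whole WCET were stretched over WCET + slack."""
    return wcet / (wcet + slack)

def worst_case_time(c_a, c_b, alpha):
    """Wall-clock time if the job needs its full WCET: C_A at alpha, C_B at f_max."""
    return c_a / alpha + c_b

# Hypothetical job: WCET = 10 (C_A = 6, C_B = 4), slack = 6 time units.
alpha = split_scaling_factor(6, 6)
print(alpha, uniform_scaling_factor(10, 6), worst_case_time(6, 4, alpha))
```

Here the split runs the first portion at 50% instead of the uniform 62.5%, yet the worst case still finishes in exactly WCET + slack = 16 time units. A job that completes within the first portion never runs at maximum frequency.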
Extension to Leakage-aware DVS
Power ~ $p_t C_L V_{dd}^2 f_{clk} + I_{leak} V_{dd} + P_{short}$
Dynamic power does NOT dominate anymore!
Static power exceeds dynamic power when the voltage is reduced below a threshold value, the critical speed
— Voltage below threshold not energy efficient anymore
— Sleeping may be better
But need to consider wakeup overhead
— Mostly due to cache refill
— Calculated statically based on time to refill reused lines
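Why a critical speed exists can be illustrated with a toy power model. This is not the power model used in the talk: the cubic dynamic term, the constant static term, and the parameter values below are assumptions chosen so the arithmetic is easy to follow.

```python
def energy_per_work(s, p_dyn_max=1.0, p_static=0.25):
    """Energy per unit of work at normalized speed s (0 < s <= 1).

    Assumed model: power = p_dyn_max * s**3 + p_static, and a unit of work
    takes 1/s time, so energy = p_dyn_max * s**2 + p_static / s.
    """
    return p_dyn_max * s ** 2 + p_static / s

def critical_speed(p_dyn_max=1.0, p_static=0.25):
    """Speed that minimizes energy_per_work under the assumed model."""
    return min(1.0, (p_static / (2.0 * p_dyn_max)) ** (1.0 / 3.0))

# Discrete levels as fractions of f_max.
levels = (0.25, 0.5, 0.75, 1.0)
best_level = min(levels, key=energy_per_work)
print(critical_speed(), best_level)
```

Below the critical speed the job runs longer, so the static term grows faster than the dynamic term shrinks, and energy per unit of work rises again.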
Speed Reduction vs. Task Delaying
[Figure: two timelines for task T over time t: speed reduction vs. delaying the start time]
Why delay the start time of a task?
— To maximize the CPU sleeping time
Delay Dispatching a Task
[Figure: three timelines over time t for tasks T1, T2, T3 (WCET and CB marked): (i) Consider Schedule, with idle slots idle1 and idle2; (ii) No Delay; (iii) Delay, with sleep periods]
Rules (tth: threshold for sleep):
1. If idle1+idle2 > tth before DVS but < tth afterwards → no DVS
2. If idle1+idle2 < tth → no delay
3. If idle1 < CB → no delay
4. Otherwise → delay
Still guaranteed to meet deadline → proof in paper
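The four rules can be read as a small decision procedure. The sketch below is one possible reading, not the authors' implementation: it assumes the idle slots are known both before and after DVS, and that rules 2 to 4 are evaluated on whichever idle times result from the DVS decision in rule 1. All names are hypothetical.

```python
from typing import NamedTuple

class Decision(NamedTuple):
    use_dvs: bool
    delay: bool

def dispatch_decision(idle1, idle2, idle1_dvs, idle2_dvs, t_th, c_b):
    """Apply the four rules; *_dvs are the idle slots that remain after scaling."""
    # Rule 1: if DVS would shrink the idle time below the sleep threshold, skip DVS.
    use_dvs = not (idle1 + idle2 > t_th and idle1_dvs + idle2_dvs < t_th)
    i1, i2 = (idle1_dvs, idle2_dvs) if use_dvs else (idle1, idle2)
    if i1 + i2 < t_th:      # Rule 2: combined idle too short to sleep
        delay = False
    elif i1 < c_b:          # Rule 3: first idle slot shorter than C_B
        delay = False
    else:                   # Rule 4: merge the idle slots by delaying the task
        delay = True
    return Decision(use_dvs, delay)

print(dispatch_decision(4, 4, 1, 1, t_th=5, c_b=2))
```

In the printed case the idle time is long enough to sleep only without DVS, so the job runs unscaled and its start is delayed to merge the two idle slots.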
Scaling below the Critical Speed
Pure DVS: should never scale frequency below critical speed
DVS combined with sleeping:
— sleep if idle slot > threshold tth
— If idle slot is too short (< tth), scale below critical speed
–No other work to do (in contrast to non real-time)
–Lower frequency/voltage → power savings
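The choice between sleeping and scaling below the critical speed hinges on the threshold tth. One common way to obtain such a threshold is a break-even calculation between idling and sleeping. The formula, parameter names, and numbers below are assumptions for illustration; the slides only state that a threshold and a wakeup overhead exist.

```python
def break_even_time(wakeup_energy, idle_power, sleep_power):
    """Idle length at which sleeping (plus wakeup cost) matches idling.

    Idling for t costs idle_power * t; sleeping costs
    sleep_power * t + wakeup_energy. Both are equal at the returned t.
    """
    if idle_power <= sleep_power:
        raise ValueError("sleeping must draw less power than idling")
    return wakeup_energy / (idle_power - sleep_power)

def idle_slot_action(idle_len, t_th):
    """Sleep in long slots; otherwise stretch work below the critical speed."""
    return "sleep" if idle_len > t_th else "scale_below_critical"

# Hypothetical numbers: sleep draws 10x less power than idle.
t_th = break_even_time(wakeup_energy=90.0, idle_power=10.0, sleep_power=1.0)
print(t_th, idle_slot_action(25.0, t_th), idle_slot_action(4.0, t_th))
```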
Experimental Framework
Scheduling simulator
— Accurately reflects energy trends on the PPC405LP [Zhu’05]
Use the same power model as [Jejurikar’04]
— Critical speed, wakeup cost
Assume four discrete frequency levels:
— 25%, 50%, 75%, 100% of fmax
Compare energy in hyperperiod (const. amount of work) for
— Pure Feedback-DVS
— DVS+sleep: Feedback-DVS w/ sleep policy (no delay policy)
— DSR-DP: dyn. procrastination+slack reclamation [Jejurikar’05]
— DVSleak: feedback-DVS w/ sleep & delay now/later policies
— Lower-bound schedule: best frequency + sleep for max. idle
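Comparing energy over one hyperperiod keeps the amount of work constant because every task releases a fixed number of jobs in that window. A minimal sketch with hypothetical periods:

```python
from functools import reduce
from math import gcd

def hyperperiod(periods):
    """Least common multiple of all task periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)

def jobs_per_hyperperiod(periods):
    """Number of jobs each task releases within one hyperperiod."""
    h = hyperperiod(periods)
    return [h // p for p in periods]

periods = [4, 5, 10]  # hypothetical task periods
print(hyperperiod(periods), jobs_per_hyperperiod(periods))
```

Since the schedule repeats after each hyperperiod, the energy measured over one window is representative for every scheme under comparison.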
3 Tasks, Const. Execution, 25% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1 to 0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, and Lower Bound]
Significant savings w/ sleep, more for low utilizations
DVSleak: delay has the most impact for medium to high utilizations
— Close to lower bound
3 Tasks, Const. Execution, 75% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1 to 0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, and Lower Bound]
All schemes: resilient to actual/WCET ratio
DVSleak never worse than other schemes, savings:
— 50% over pure, 20% over DVS+sleep, 8.5% over DSR-DP
3 Tasks, Var. Execution (pat1), 75% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1 to 0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, and Lower Bound]
DVSleak: more resilient to fluctuating exec. times (unchanged) → feedback helps!
All others: 5-10% more energy consumption than for const. exec.
10 Tasks, Const. Execution, 25% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1 to 0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, and Lower Bound]
More tasks → 5-10% higher energy cost (switching)
DVSleak still best of all (~ same margin)
Length of Task Periods
U=60%, E normalized to hyperperiod task set 2, c=50% WCET
Harmonic (1) vs. non-harmonic (2):
— 10-27% more energy for non-harmonic → cannot fold jobs released at same time → more uncertainty
Longer (2) vs. shorter (3) periods for non-harmonic:
— 2-28% more energy for shorter periods → more job releases, less sleep time
— DVSleak ~ 15% lower energy than DSR-DP
Feedback more important for shorter periods
Conclusion
DVSleak: Novel Feedback DVS + leakage (sleep), benefits for
— fluctuating execution times
— shorter task periods
can scale below critical speed
— medium utilizations (most common)
sleep policy by itself enough for low/high utilizations (always sleep/never sleep)
DVSleak energy savings over other schemes:
— avg. 50% over DVS-only
— avg. 20% over DVS+sleep
— avg. 8.5% over [Jejurikar’05]
—Sleep now/later important when actual exec. << WCET
Prior: Evaluation on a real embedded platform
— $V^2 f$ model works for OS scheduling
Future Work
Implementation on IBM PPC 405LP test board
Has been used for DVS experiments
— Oscilloscope, data acquisition card for voltage / current
Assessing sleep modes
1. Clock suspend: same power, all still up
2. Suspend: 1/10 power, SDRAM up
3. Hibernate: N/A (SDRAM → NVRAM)
4. Standby: N/A (APM over I2C)
Need faster resume (reactivating devices is slow → low-power modes)