Energy Prediction for I/O Intensive Workflow Applications
MASc Exam
Hao Yang
NetSysLab
The Electrical and Computer Engineering Department
The University of British Columbia
1
Background - Workflow Applications
Computation
Characteristics:
• File based communication
• Large number of tasks
• Large amount of I/O
• Common data access patterns
File Dependency
Montage Workflow
2
Background - Application Execution
[Figure: workflow runtime engine; app. tasks, each with local storage; file based communication and large I/O volume through the central storage system (e.g., GPFS, NFS), marked as the I/O bottleneck]
3
Background - Intermediate Storage System
[Figure: workflow runtime engine; compute nodes (app. tasks, each with local storage); intermediate storage; stage in / stage out; central storage system (e.g., GPFS, NFS)]
4
Background - Context of this thesis
This work focuses on workflow application execution on
intermediate storage systems.
5
Research Problem – Energy Consumption
Computing Equipment
Energy Bill
• The pursuit of performance used to dominate the conventional
computing area.
• Energy efficiency is the new concern.
6
Research Problem - Configuration Decisions
Configuring the runtime system is complex
(Example: resource allocation decision)
Montage Workload
Energy Delay Product (EDP)
7
Research Problem - Questions
• Q1: What performance optimizations in storage systems lead
to energy savings?
• Q2: What is the performance and energy impact of power-centric tuning techniques?
• Q3: How can users balance time-to-solution and energy
consumption when given a target application?
8
Outline
• Background
• Research Problem
• Methodology
• Evaluation
• Conclusion
9
Methodology – Building Energy Consumption Predictor
The goal of this work is to build an energy consumption predictor
to aid system configuration and provisioning decisions.
• Answer what-if questions (e.g., is configuration A better than
configuration B from the energy perspective?)
• Customize the optimization metric (e.g., energy consumption,
performance-energy product)
10
Methodology – Energy Model
[Figure: workflow runtime engine; compute nodes (app. tasks, each with local storage); intermediate storage; four points labeled A, B, C, D]
Execution States:
• Idle
• Network Transfer
• Storage I/O
• Task Processing
Power Profiles:
11
Methodology – Energy Model
Energy = Power Profile * Predicted Times, summed over the execution states:
• Idle
• Network Transfer
• I/O ops (read, write)
• Task Processing
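The energy model on this slide can be read as a sum, over the execution states, of the power drawn in a state multiplied by the predicted time spent in it. The snippet below is a minimal sketch of that reading, not the thesis implementation: the function name, the dictionary layout, and all numeric values are hypothetical and chosen only for illustration.

```python
# Execution states named on the slide (identifiers are illustrative).
STATES = ("idle", "network_transfer", "storage_io", "task_processing")


def predict_energy(power_w, time_s):
    """Return predicted energy in joules.

    power_w: watts drawn in each state (the power profile).
    time_s:  predicted seconds spent in each state.
    """
    return sum(power_w[state] * time_s[state] for state in STATES)


# Hypothetical inputs, NOT measurements from the thesis.
power_w = {
    "idle": 100.0,
    "network_transfer": 130.0,
    "storage_io": 130.0,
    "task_processing": 150.0,
}
time_s = {
    "idle": 10.0,
    "network_transfer": 2.0,
    "storage_io": 3.0,
    "task_processing": 5.0,
}

energy_j = predict_energy(power_w, time_s)
print(f"Predicted energy: {energy_j} J")
```

In this reading, the power profile would come from benchmarks and the per-state times from a performance predictor, so the two inputs can be refined independently.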
12
Methodology – Energy Model
How to seed the energy model?
• Power states: use synthetic benchmarks to get
the power consumption in each state.
• Time estimates: augment a performance predictor
to track the time spent in each state.
13
Methodology – Building Energy Consumption Predictor
Sources of inaccuracies
Model Simplification
(metadata, scheduling, …)
Time Prediction
homogeneity,
Power meter
L. B. Costa, S. Al-Kiswany, H. Yang, and M. Ripeanu, “Supporting Storage Configuration for
I/O Intensive Workflows”, In Proceedings of the 28th ACM International Conference on
Supercomputing, ICS'14, (Acceptance Rate: 20%) June 2014.
L. B. Costa, S. Al-Kiswany, A. Barros, H. Yang, and M. Ripeanu, “Predicting Intermediate
Storage Performance for Workflow Applications”, In Proceedings PDSW'13, 2013.
14
Evaluation Outline
• Synthetic benchmarks: Workflow Patterns
• Real workflow applications
• Predicting Energy Impact of Power-tuning Techniques
• Predicting Energy-Performance Tradeoffs
15
Evaluation - Platform
Grid5000 Lyon site
[Figure legend: Idle, App, Storage I/O, Net transfer]
• Taurus Cluster (11 nodes)
two 2.3GHz Intel Xeon E5-2630 CPUs (each with 6 cores),
32GB memory, 10 Gbps NIC
• Sagittaire Cluster (16 nodes)
two 2.4GHz AMD Opteron CPUs (each with one core),
2GB RAM and 1 Gbps NIC
• SME Omegawatt power-meter per Node
0.01W power resolution at 1Hz sampling rate
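With a per-node power meter sampling at 1 Hz, the measured energy of a run can be approximated by summing the power samples and multiplying by the sampling period. The following is a minimal sketch under that assumption; the function name and the sample trace are hypothetical and do not reflect the meter's actual output format.

```python
def energy_from_samples(power_samples_w, sample_period_s=1.0):
    """Approximate energy in joules from evenly spaced power samples.

    Each sample (watts) is assumed to hold for one sampling period,
    which is 1 second for a 1 Hz meter.
    """
    return sum(power_samples_w) * sample_period_s


# Hypothetical 3-second trace for one node, NOT real measurements.
trace_w = [100.0, 120.0, 110.0]
measured_j = energy_from_samples(trace_w)
print(f"Measured energy: {measured_j} J")
```

At 1 Hz, power changes shorter than one second are averaged out, which is one reason very short tasks are harder to attribute accurately.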
16
Evaluation – Synthetic benchmarks: Workflow Patterns
Montage Workflow
Reduce
Pipeline
17
Evaluation – Synthetic benchmarks: Workflow Patterns
18
Evaluation – Synthetic benchmarks: Workflow Patterns
• Average 88% accuracy
• 20x-30x faster than
running the actual benchmark
• 200x-300x fewer resources
(machines * runtime)
Using Default Storage System Configuration (DSS)
19
Evaluation – Synthetic benchmarks: Workflow Patterns
Q1: What are the energy savings that performance
optimizations in storage can bring?
DSS – Default Storage System Configuration
WOSS – Workflow Optimized Storage System Configuration
• Accurate in both configurations.
• Suggests the configuration from the energy perspective.
[Figure: Pipeline Energy Consumption]
S. Al-Kiswany, L. B. Costa, H. Yang, E. Vairavanathan, M. Ripeanu, “The Case for Cross-Layer Optimizations
in Storage: A Workflow-Optimized Storage System”, IEEE Transactions on Parallel and Distributed Systems
(TPDS), Under Review, Submitted in June 2014
L.B. Costa, H. Yang, E. Vairavanathan, A. Barros, K. Maheshwari, G. Fedak, D.S. Katz, M. Wilde, M. Ripeanu
and S. Al-Kiswany, “The Case for Workflow-Aware Storage: An Opportunity Study using MosaStore”,
Journal of Grid Computing 2014.
20
Evaluation – Real Workflow Applications
BLAST workflow
Montage workflow
21
Evaluation – Real Workflow Applications
BLAST Result (Energy 89%, Time 95%)
Montage Result (Energy 84%, Time 86%)
22
Evaluation – CPU Throttling
• CPU throttling is an important technique where processors
run at less-than-maximum frequency to conserve power.
• This technique can prolong the execution time while
conserving instantaneous power.
Q2: What is the energy and performance impact of
CPU throttling? Is it application-specific?
CPU bound application: BLAST
I/O bound application: pipeline benchmark
23
Evaluation – CPU Throttling
Frequency Level: 1200MHz, 1800MHz, 2300MHz
[Figure: BLAST Result (Energy, Time)] 96% cost when using maximum CPU throttling
[Figure: Pipeline Result (Energy, Time)] 17% savings when using maximum throttling
Conclusion:
• The computational and I/O characteristics determine
energy savings / energy costs.
• The predictor can be used to make the decisions.
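Why throttling can either cost or save energy follows from energy being average power multiplied by execution time: throttling lowers power but may lengthen the run. The sketch below illustrates this with hypothetical numbers; the function name and all values are assumptions for illustration and are not the measurements behind the BLAST or pipeline results.

```python
def energy_change_percent(base_power_w, base_time_s, thr_power_w, thr_time_s):
    """Percent change in energy when throttling.

    Positive means throttling costs energy, negative means it saves energy.
    """
    base_energy = base_power_w * base_time_s
    throttled_energy = thr_power_w * thr_time_s
    return 100.0 * (throttled_energy - base_energy) / base_energy


# Hypothetical CPU-bound case: power drops, but runtime grows a lot.
cpu_bound = energy_change_percent(100.0, 10.0, 80.0, 15.0)

# Hypothetical I/O-bound case: power drops, runtime stays the same.
io_bound = energy_change_percent(100.0, 10.0, 80.0, 10.0)

print(f"CPU-bound: {cpu_bound:+.1f}%  I/O-bound: {io_bound:+.1f}%")
```

In this toy setting the CPU-bound run pays an energy cost while the I/O-bound run saves energy, which matches the direction of the two results on the slide.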
24
Evaluation – Predicting Energy Delay Product
User’s optimization metric
• Performance (use more machines)
• Energy
• Energy-Delay Product (EDP, energy * time)
Q3: How can users balance time-to-solution and energy
consumption when given a target application?
• Consider allocation decision.
• Use Montage workload on two clusters to demonstrate
prediction.
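Given predicted energy and time for each candidate allocation, a user can rank the candidates by whichever metric they care about. The snippet below is a minimal sketch of that selection step; the function names, the allocation sizes, and the predicted values are hypothetical and are not Montage results.

```python
def edp(energy_j, time_s):
    """Energy-Delay Product: energy * time."""
    return energy_j * time_s


def best_allocation(predictions, metric):
    """Return the allocation (number of nodes) that minimizes the metric.

    predictions: {num_nodes: (predicted_energy_j, predicted_time_s)}
    metric:      function of (energy_j, time_s); lower is better.
    """
    return min(predictions, key=lambda nodes: metric(*predictions[nodes]))


# Hypothetical predictor output, NOT results from the thesis.
predictions = {
    2: (4000.0, 100.0),
    4: (5000.0, 60.0),
    8: (8000.0, 50.0),
}

best_for_time = best_allocation(predictions, lambda energy, time: time)
best_for_energy = best_allocation(predictions, lambda energy, time: energy)
best_for_edp = best_allocation(predictions, edp)
print(best_for_time, best_for_energy, best_for_edp)
```

With these made-up numbers each metric prefers a different allocation, which is the situation where a predictor helps most: the user can compare options without running the workload on each one.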
25
Evaluation – Predicting Energy Delay Product
Montage EDP at Sagittaire
Montage EDP at Taurus
26
Conclusion
• This thesis presents an energy consumption predictor in the
workflow application domain.
• The proposed energy model and prediction framework
achieve adequate accuracy to be useful for the energy-oriented configurations this work targets.
27
Resulting Publications
Energy Prediction
• H. Yang, L. B. Costa and M. Ripeanu, “Energy Prediction for I/O Intensive Workflows Applications”,
submitted to 7th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers
(MTAGS) 2014 (Co-located with Supercomputing/SC 2014), under review.
Performance Prediction and Provisioning
• L. B. Costa, S. Al-Kiswany, H. Yang, and M. Ripeanu, “Supporting Storage Configuration and
Provisioning for I/O Intensive Workflows”, In Preparation.
• L. B. Costa, S. Al-Kiswany, H. Yang, and M. Ripeanu, “Supporting Storage Configuration for I/O
Intensive Workflows”, In Proceedings of ICS'14, Acceptance rate: 20%. June 2014.
• L. B. Costa, S. Al-Kiswany, A. Barros, H. Yang, and M. Ripeanu, “Predicting Intermediate Storage
Performance for Workflow Applications”, In Proceedings PDSW'13, 2013.
A Workflow-Optimized Storage System
• S. Al-Kiswany, L. B. Costa, H. Yang, E. Vairavanathan, M. Ripeanu, “A Software Defined Storage for
Scientific Workflow Applications”, In Preparation.
• S. Al-Kiswany, L. B. Costa, H. Yang, E. Vairavanathan, M. Ripeanu, “The Case for Cross-Layer Optimizations
in Storage: A Workflow-Optimized Storage System”, IEEE Transactions on Parallel and Distributed Systems
(TPDS), Under Review, Submitted in June 2014
• L.B. Costa, H. Yang, E. Vairavanathan, A. Barros, K. Maheshwari, G. Fedak, D.S. Katz, M. Wilde, M. Ripeanu
and S. Al-Kiswany, “The Case for Workflow-Aware Storage: An Opportunity Study using MosaStore”,
accepted by Journal of Grid Computing, 2014.
Evaluating Storage Systems for Scientific Data in the Cloud
• K. Maheshwari, J. Wozniak, H. Yang, D. S. Katz, M. Ripeanu, V. Zavala, M. Wilde, “Evaluating Storage
Systems for Scientific Data in the Cloud”, In Proceedings of the 5th Workshop on Scientific Cloud
Computing (ScienceCloud), Co-located with ACM HPDC 2014 (Best Paper Award)
28
Backup Slides
• The system model
• Model seeding
• Workload description
I/O traces
Task Dependency Graph
System Deployment Configuration
• Number of Storage Nodes: N_st
• Number of Client Nodes: N_cli
• Chunk Size: S_chunk
• Replication Level: R
• …
Platform Performance Parameters
• Manager Service Time: μ_ma
• Storage Service Time: μ_sm
• Client Service Time: μ_cli
• Remote Network Service Time: μ_re-net
• Local Network Service Time: μ_lo-net
L. B. Costa, S. Al-Kiswany, H. Yang, and M. Ripeanu, “Supporting Storage Configuration for
I/O Intensive Workflows”, In Proceedings of the 28th ACM International Conference on
Supercomputing, ICS'14, June 2014.
29
Backup Slides
Limitations:
• Simplification of the model
• Short tasks/ small workload
• Not validated using new devices (e.g., SSD)
30
Backup Slides
Alternative Approaches:
• Utilization
• Detailed simulation
• Machine learning
31
Backup Slides
Combined states
Apply benchmarks in parallel to get combined power state:
E.g., perform storage and network benchmarks in parallel
𝑃𝑐𝑜𝑚𝑏𝑖𝑛𝑒 ≈ 𝑃𝑖𝑑𝑙𝑒 + (𝑃𝑠𝑡𝑜𝑟𝑎𝑔𝑒 − 𝑃𝑖𝑑𝑙𝑒 ) + (𝑃𝑛𝑒𝑡 − 𝑃𝑖𝑑𝑙𝑒 )
𝑃𝑐𝑜𝑚𝑏𝑖𝑛𝑒 : 160.5W, 𝑃𝑖𝑑𝑙𝑒 :91.6W, 𝑃𝑠𝑡𝑜𝑟𝑎𝑔𝑒 :129.0W, 𝑃𝑛𝑒𝑡 : 127.7W
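The additive approximation above can be checked against the measured values listed on this slide. The snippet below is a small sketch of that arithmetic; the function and variable names are illustrative, while the four power values are the ones given above.

```python
# Power values from the slide, in watts.
P_IDLE = 91.6
P_STORAGE = 129.0
P_NET = 127.7
P_COMBINED_MEASURED = 160.5


def combined_power(p_idle, *p_states):
    """Estimate combined power as idle power plus each state's increment over idle."""
    return p_idle + sum(p - p_idle for p in p_states)


estimate_w = combined_power(P_IDLE, P_STORAGE, P_NET)
relative_error = abs(estimate_w - P_COMBINED_MEASURED) / P_COMBINED_MEASURED
print(f"Estimated: {estimate_w:.1f} W, relative error: {relative_error:.1%}")
```

The estimate comes to about 165.1 W, roughly 3% above the measured 160.5 W, which is consistent with the "≈" in the formula.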
32
Backup Slides
Energy Composition (pipeline benchmark):
• Idle energy: 64%
• App processing: 9.2%
• Storage operations: 15.8%
• Network transfer: 10.6%
33
Backup Slides
Sagittaire power profiles
175W
25W
8W
7W
34