Programmability
Hiroshi Nakashima
Thomas Sterling
Key Challenges (1)
• Parallelism
– Expose sufficient parallelism (multi billion-way)
– Manage the massive parallelism in ensemble (hierarchy)
– Reveal rich form and granularity of parallelism
– Efficient exploitation of fine grained parallelism
• Distribution and resource assignment
– Enables exploitation of separate concurrency of action
– Need for some kind of global name space
• Locality management
– Reduces latency of access and control
– Exposure of object and control affinity
Key Challenges (2)
• Management of memory hierarchy
– Transparent cache misses
– Finite cache size and structure
– Copy semantics and consistency(?)
• Latency hiding
– Locality management (see above)
– Intrinsic overlap of communication with computation to mitigate impact
• Hardware idiosyncrasies
– E.g., TLB misses
– Non-deterministic resolution of shared resource contention
– Branch prediction, register renaming, etc.
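The latency-hiding bullet above (overlap of communication with computation) can be sketched as a double-buffered loop. This is only an illustration under assumptions: `fetch_block`, `compute`, and `overlapped_sum` are hypothetical names, and the "communication" is simulated with a short sleep rather than a real network transfer.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_block(i, delay=0.01):
    """Hypothetical stand-in for a remote data transfer of block i."""
    time.sleep(delay)  # simulated communication latency (assumption)
    return list(range(i * 4, i * 4 + 4))


def compute(block):
    """Hypothetical local work on one block."""
    return sum(x * x for x in block)


def overlapped_sum(num_blocks):
    """Prefetch block i+1 while computing on block i (double buffering)."""
    total = 0
    if num_blocks == 0:
        return total
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch_block, 0)
        for i in range(num_blocks):
            block = pending.result()  # wait for the current block
            if i + 1 < num_blocks:
                # Start the next transfer before computing on this block.
                pending = pool.submit(fetch_block, i + 1)
            total += compute(block)  # runs while the next transfer is in flight
    return total
```

With this structure the transfer time is hidden only to the extent that the computation on one block takes at least as long as fetching the next one.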
Key Challenges (3)
• Legacy codes may not meet requirements for future Exascale systems
– Rewrite only once, please.
• What is the paradigm or execution model for the programming model to satisfy and cooperate with remaining system components?
– Distribution of responsibilities across system components
• Libraries
– Code reuse
– Decouples performance issues from logical function
– Can adapt to your program requirements
• The library should learn about your data structure, not you about the library
Key Challenges (4)
• Interoperability
– Between cooperating concurrently executing functionality
– Exploit existing legacy codes during transitional periods
• Minimization of performance sensitivity
• Robust guarantees of correctness of result
• Elimination of over-constraining synchronization bottlenecks
– e.g., global barriers
– Lightweight synchronization
• Re-empower strong scaling
• Portability
– Different systems
– Different scale
– Different generations
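As a small illustration of the synchronization bullet above (avoiding global barriers in favor of lightweight synchronization), the sketch below lets each worker wait only on the one predecessor it depends on. The function name `run_chain` and the prefix-sum workload are assumptions chosen for the example, not anything the slides prescribe.

```python
import threading


def run_chain(n):
    """Worker i waits only on worker i-1, never on a global barrier."""
    values = [0] * n
    ready = [threading.Event() for _ in range(n)]

    def worker(i):
        if i > 0:
            ready[i - 1].wait()  # point-to-point dependency only
            values[i] = values[i - 1] + i
        else:
            values[i] = 0
        ready[i].set()  # signal just the dependents of this worker

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in reversed(threads):  # start order does not affect the result
        t.start()
    for t in threads:
        t.join()
    return values
```

A global barrier would force every worker to stall until the slowest one arrives. Here a worker proceeds as soon as its own input is ready, which is the property the bullet is after.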
Potential Impact on Software Component
• Need for new model of computation
• Programming model reflects user program parallelism
• Runtime system makes runtime information available for the decision chain
• Architecture and runtime minimize overhead to enable useful, rich mechanisms for control, cooperation, and sharing
• Asynchrony management for out-of-order arrival of data transfers and service completion
• Guaranteed compound atomic operations for user-programmed segments with efficient protection
• OS protocol to inform runtime system (bidirectional exchange)
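To make the "compound atomic operations for user programmed segments" item concrete, here is a minimal sketch in which two fields must always change together. The class `RunningStats` and the helper `demo` are hypothetical, and a software lock stands in for whatever efficient protection mechanism the architecture or runtime would actually provide.

```python
import threading


class RunningStats:
    """Count and total are updated together as one compound atomic step."""

    def __init__(self):
        self._lock = threading.Lock()  # software stand-in (assumption)
        self.count = 0
        self.total = 0

    def add(self, value):
        with self._lock:  # no reader can see one field updated without the other
            self.count += 1
            self.total += value

    def snapshot(self):
        with self._lock:
            return self.count, self.total


def demo(num_threads=8, per_thread=1000):
    """Hammer the shared object from several threads and return the final state."""
    stats = RunningStats()

    def work():
        for _ in range(per_thread):
            stats.add(2)

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats.snapshot()
```

The user marks the segment (`add`) and the system guarantees it executes indivisibly. The open research question in the slides is how to provide that guarantee with far less overhead than a conventional lock.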
Summary of Research Directions
• Separation of logical functionality from performance attributes
• New model of computation
• Diversity of parallelism forms and sizes
• Data directed execution
• Dynamic graph-based problems, encoding, and control
• New programming models that interoperate with old
• Dealing with memory hierarchies
• Advanced runtime systems
• Requirements for new ultra massive architecture
• Automatic runtime tuning for heterogeneous architectures
Potential Impact on Usability, Capability, & Breadth of Community
• Enormous
• Essential
• Ease of use
• Eternal
• Everyone
4.x Programmability
Cross-cutting property of concurrency as it relates to programmability
[Figure: exposed concurrency by year, 2010 to 2019, with levels labeled from 100 thousand-way up to 10 billion-way parallelism]
4.x Programmability
• Technology drivers
– Programming models and languages
– Compiler analysis, distribution, and allocation
– Runtime system software
– OS
– Architecture structure, semantics, and mechanisms
4.x Programmability
• Alternative R&D strategies
– Models of computation
• Message passing with multi threaded processes
• Message-driven work-queue multithreaded
– Programming models
• MPI-8
• Event-driven multithreaded with GAS
• DSP and Declarative
– Runtime system software
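The "message-driven work-queue multithreaded" model listed above can be sketched roughly as follows: work is started by the arrival of a message, and a pool of threads pulls messages from a shared queue. The function `run_message_driven` and the `(handler, payload)` message shape are assumptions made for this example only.

```python
import queue
import threading


def run_message_driven(messages, num_workers=4):
    """Workers pull (handler, payload) messages from a shared work queue."""
    inbox = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            msg = inbox.get()
            if msg is None:  # sentinel: shut this worker down
                break
            handler, payload = msg
            out = handler(payload)  # work is triggered by message arrival
            with lock:
                results.append(out)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for msg in messages:
        inbox.put(msg)
    for _ in workers:
        inbox.put(None)  # one sentinel per worker
    for w in workers:
        w.join()
    return results
```

Results come back in completion order, not submission order, so a caller that needs ordering has to tag its messages. That contrasts with the message-passing model in the same list, where each process decides when to receive.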
4.x Programmability
• Recommended research agenda
– Model of computation
– Decision chain across system layers
– Protocols between successive layers
4.x Programmability
• Crosscutting considerations
– Programmability is itself a crosscutting consideration
– Performance
• Major hazard for programmability
– Reliability
• Does the application program play a role in determining the response to faults?