Download (slides)

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Single Assignment C:
A High Productivity Language
for High Performance Computing
Clemens Grelck
Workshop
Trends in High-Performance Distributed Computing
Amsterdam, Netherlands
March 14, 2012
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
The HPˆ3 Triangle in High Performance Computing
High
Productivity
High
Performance
High Portability
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
HPC in the Age of Multi- and Many-Core
Welcome to the Cluster Node Hardware Zoo!!
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
HPC in the Age of Multi- and Many-Core
Welcome to the Cluster Node Hardware Zoo!!
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
HPC in the Age of Multi- and Many-Core
Welcome to the Cluster Node Hardware Zoo!!
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
HPC in the Age of Multi- and Many-Core
Welcome to the Cluster Node Hardware Zoo!!
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
HPC in the Age of Multi- and Many-Core
Welcome to the Cluster Node Hardware Zoo!!
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
HPC in the Age of Multi- and Many-Core
Welcome to the Cluster Node Hardware Zoo!!
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
HPC in the Age of Multi- and Many-Core
Welcome to the Cluster Node Hardware Zoo!!
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
HPC in the Age of Multi- and Many-Core
Compute node hardware is a zoo:
I
Vastly different numbers of cores
I
Vastly different core architectures: general, restricted
I
Vastly different memory architectures
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
HPC in the Age of Multi- and Many-Core
Compute node hardware is a zoo:
I
Vastly different numbers of cores
I
Vastly different core architectures: general, restricted
I
Vastly different memory architectures
Programming diverse hardware is uneconomic:
I
Various low-level programming models
I
Each requires different expert knowledge
I
Heterogeneous combinations of the above ?
I
Cumbersome, error-prone and inefficient
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
An Alternative: Single Assignment C (SAC)
Credo: abstraction, abstraction, abstraction
I
Program what to compute, not exactly how
I
Leave concrete organisation of execution to
compiler and runtime system
I
Put expert knowledge into tools, not into applications
I
Architecture-agnostic programming for portability
I
Compile one source to diverse target hardware
I
Automatically manage resources: memory, cores, etc
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
SAC — Design Space
SAC
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
SAC — Design Space
High-level functional, data-parallel
programming with vectors, matrices, arrays
SAC
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
SAC — Design Space
High-level functional, data-parallel
programming with vectors, matrices, arrays
SAC
Suitability to achieve high performance
in sequential and parallel execution
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
SAC — Design Space
High-level functional, data-parallel
programming with vectors, matrices, arrays
SAC
Easy to adopt for programmers
with an imperative background
Suitability to achieve high performance
in sequential and parallel execution
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
+
FFT
(rofu)
FFT
roots of unity
Case Study: 1-Dimensional Complex FFT (NAS-FT)
complex [.] FFT ( complex [.] v , complex [.] rofu )
{
even = condense (2 , v );
odd = condense (2 , drop ( [1] , v ));
even = FFT ( even , rofu );
odd = FFT ( odd , rofu );
*
rofu = condense ( len ( rofu ) / len ( odd ) , rofu );
−
left = even + odd * rofu ;
right = even - odd * rofu ;
return left ++ right ;
}
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
+
FFT
(rofu)
FFT
roots of unity
Case Study: 1-Dimensional Complex FFT (NAS-FT)
complex [.] FFT ( complex [.] v , complex [.] rofu )
{
even = condense (2 , v );
odd = condense (2 , drop ( [1] , v ));
even = FFT ( even , rofu );
odd = FFT ( odd , rofu );
*
rofu = condense ( len ( rofu ) / len ( odd ) , rofu );
−
left = even + odd * rofu ;
right = even - odd * rofu ;
return left ++ right ;
}
Now, what is functional array programming ?
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
+
FFT
(rofu)
FFT
roots of unity
Functional Array Programming
complex [.] FFT ( complex [.] v , complex [.] rofu )
{
even = condense (2 , v );
odd = condense (2 , drop ( [1] , v ));
even = FFT ( even , rofu );
odd = FFT ( odd , rofu );
*
rofu = condense ( len ( rofu ) / len ( odd ) , rofu );
−
left = even + odd * rofu ;
right = even - odd * rofu ;
return left ++ right ;
}
Role of Functions:
I
Map argument values to result values
I
No side effects
I
Call-by-value parameter passing
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
+
FFT
(rofu)
FFT
roots of unity
Functional Array Programming
complex [.] FFT ( complex [.] v , complex [.] rofu )
{
even = condense (2 , v );
odd = condense (2 , drop ( [1] , v ));
even = FFT ( even , rofu );
odd = FFT ( odd , rofu );
*
rofu = condense ( len ( rofu ) / len ( odd ) , rofu );
−
left = even + odd * rofu ;
right = even - odd * rofu ;
return left ++ right ;
}
Role of Variables:
I
Variables are placeholders for values
I
Variables do not denote memory locations
I
Automatic memory management
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
+
FFT
(rofu)
FFT
roots of unity
Functional Array Programming
complex [.] FFT ( complex [.] v , complex [.] rofu )
{
even = condense (2 , v );
odd = condense (2 , drop ( [1] , v ));
even = FFT ( even , rofu );
odd = FFT ( odd , rofu );
*
rofu = condense ( len ( rofu ) / len ( odd ) , rofu );
−
left = even + odd * rofu ;
right = even - odd * rofu ;
return left ++ right ;
}
Execution Model:
I
Contextfree substitution of expressions
I
No side-effects
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
+
FFT
(rofu)
FFT
roots of unity
Functional Array Programming
complex [.] FFT ( complex [.] v , complex [.] rofu )
{
even = condense (2 , v );
odd = condense (2 , drop ( [1] , v ));
even = FFT ( even , rofu );
odd = FFT ( odd , rofu );
*
rofu = condense ( len ( rofu ) / len ( odd ) , rofu );
−
left = even + odd * rofu ;
right = even - odd * rofu ;
return left ++ right ;
}
Control flow constructs:
I
Branches are syntactic sugar for conditional expressions
I
Loops are syntactic sugar for tail-end recursive functions
I
Data flow determines execution order
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
+
FFT
(rofu)
FFT
roots of unity
Functional Array Programming
complex [.] FFT ( complex [.] v , complex [.] rofu )
{
even = condense (2 , v );
odd = condense (2 , drop ( [1] , v ));
even = FFT ( even , rofu );
odd = FFT ( odd , rofu );
*
rofu = condense ( len ( rofu ) / len ( odd ) , rofu );
−
left = even + odd * rofu ;
right = even - odd * rofu ;
return left ++ right ;
}
Nature of Arrays:
I
Pure values, mapping indices to (other) values
I
No state, no fixed memory representation
I
Maybe no memory manifestation at all
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
Case Study: 3-Dimensional Complex FFT (NAS-FT)
Algorithmic idea:
FFT1d
FFT1d
FFT1d
Implementation:
complex [. ,. ,.] FFT ( complex [. ,. ,.] a , complex [.] rofu )
{
b = { [. ,y , z ] -> FFT ( a [. ,y , z ] , rofu ) };
c = { [x ,. , z ] -> FFT ( b [x ,. , z ] , rofu ) };
d = { [x ,y ,.] -> FFT ( c [x ,y ,.] , rofu ) };
return d ;
}
typedef double [2] complex ;
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
The Same in Fortran
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
The Power of Abstraction
I
Programming by composition of building blocks:
I
I
I
I
I
Opportunities for compiler and runtime system:
I
I
I
I
I
Rapid prototyping
High confidence in correctness
Good readability of code
Plenty of code reuse opportunities
Target-independent optimisation
Code generation for variety of target architectures
Automatic parallelisation
Automatic memory management
Result:
I
I
I
Sufficient performance
For a range of architectures
Without extra effort
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
Compilation Challenge
SAC
Functional Array Programming
SAC2C
Advanced Compiler Technology
FPGAs
Amsterdam
MicroGrid
Architecure
Multi−Core
Multi−
Processors
Systems
on a
Chip
GPGPU
Accelerators
And achieve reasonable performance....
Goal: achieve 90% with 10% of the effort
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
Clusters
SAC as a Compiler Technology Project
Some Figures:
I SAC compiler + runtime library:
I
I
I
300,000 lines of code, about 1000 files
about 250 compiler passes
+ standard prelude + standard library
I More than 15 years of research and development
I More than 30 people involved over the years
Involved Universities:
I University of Kiel, Germany (1994–2005)
I University of Toronto, Canada (since 2000)
I University of Lübeck, Germany (2001–2008)
I University of Hertfordshire, England (2004–2012)
I University of Amsterdam, Netherlands (since 2008)
I Heriot-Watt University, Scotland (since 2011)
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
Conclusion
High
Productivity
High
Performance
High Portability
Single Assignment C:
I reconcile productivity, portability and performance
I one source for all architectures
I functional array programming
I exposure of fine-grained concurrency
I automatically sequentialising compiler
I automatic resource management
www.sac-home.org
Clemens Grelck, Universiteit van Amsterdam
Single Assignment C
Related documents