Novel,
Emerging
Computing System Technologies
DIPARTIMENTO DI ELETTRONICA E INFORMAZIONE
Smart Technologies
for Effective Reconfiguration:
The FASTER approach
July 10th 2012
7th International Workshop on
Reconfigurable Communication-centric Systems-on-Chip
York, UK
M. D. Santambrogio, D. Pnevmatikatos, K. Papadimitriou, C. Pilato, G.
Gaydadjiev, D. Stroobandt, T. Davidson, T. Becker, T. Todman, W. Luk, A.
Bonetto, A. Cazzaniga, G. C. Durelli, D. Sciuto
Reconfiguration
The process of physically altering the location or
functionality of network or system elements.
Automatic configuration describes the way
sophisticated networks can readjust themselves
in the event of a link or device failing, enabling
the network to continue operation.
Gerald Estrin, 1960
2
Reconfigurable Technology
• Technology for adaptable hardware systems
 Can add/remove components at run-time/product
lifetime
 Flexibility at hardware speed (not quite ASIC)
 Parallelism at hardware level (depending on
application)
 Ideally: alter function & interconnection of blocks
• Implementation in:
 FPGAs: fine grain, complex gate plus memory and
DSP blocks
 Coarse Grain (custom) chips: multiple ALUs, multiple
(simple) programmable processing blocks, etc.
3
An issue as a new opportunity
• Programming has become very difficult
 Impossible to balance all constraints manually
• More computational horse-power than ever before
 Cores are free, reconfigurable logic available on chip, cores
can be heterogeneous
• Energy is new constraint
 Software must become energy and space aware
• Modern computing systems need to be flexible and
adaptive
 To optimize and meet their requirements taking
advantage as much as possible of the underlying
complex heterogeneous architectures
4
FASTER participants
5
FASTER Motivation
• Focus on fine-grain reconfiguration (but not limited to it)
• Creating reconfigurable systems is not
straightforward!
 The designer has to:
• Identify portions to be reconfigured
• Establish a schedule that (a) respects dependencies and (b)
meets performance and other constraints
• Manage the system resources (reconfiguration area mainly)
• Reconfiguration cost is substantial (use wisely)
• Verify a changing system!
• Tool support for these tasks is still quite basic
• Resource management is up to the user
• Verification: any support today?
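As an illustration of the scheduling task listed above, the following Python sketch builds a serial schedule that respects dependencies and charges a reconfiguration cost whenever the single reconfigurable region has to load a different module. The task graph, execution times, module names and reconfiguration time are all hypothetical; none of them come from the FASTER tools.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: task -> set of tasks it depends on.
DEPS = {"load": set(), "filter": {"load"}, "fft": {"load"}, "merge": {"filter", "fft"}}
# Hypothetical execution times (s) and the hardware module each task needs.
EXEC = {"load": 1.0, "filter": 2.0, "fft": 2.5, "merge": 0.5}
MODULE = {"load": "io", "filter": "fir", "fft": "fft", "merge": "io"}
RECONF_TIME = 0.3  # assumed cost (s) of loading a module into the region


def schedule(deps, exec_time, module, reconf_time):
    """Serial schedule on one reconfigurable region, in dependency order."""
    t, loaded, plan = 0.0, None, []
    for task in TopologicalSorter(deps).static_order():
        if module[task] != loaded:      # region holds the wrong module
            t += reconf_time            # pay the reconfiguration cost
            loaded = module[task]
        plan.append((task, t, t + exec_time[task]))
        t += exec_time[task]
    return plan, t


if __name__ == "__main__":
    plan, total = schedule(DEPS, EXEC, MODULE, RECONF_TIME)
    for name, start, end in plan:
        print(f"{name}: {start:.1f}s -> {end:.1f}s")
    print(f"total: {total:.1f}s")
```

In this toy case every task needs a different module than its predecessor, so reconfiguration is paid four times; a scheduler that groups tasks sharing a module would pay less, which is the "use wisely" point above.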
6
FASTER Goals and Innovation
• Include reconfigurability as an explicit design
concept in computing systems design, along
with methods and tools that support run-time
reconfiguration in the entire design
methodology
• Provide a framework for analysis, synthesis
and verification of a reconfigurable system
• Provide efficient and transparent runtime
support for partial and dynamic reconfiguration,
including micro-reconfiguration
• Demonstrate usability & performance on
commercial applications (Maxeler, ST
Microelectronics, Synelixis)
7
FASTER: Overall Methodology
8
High-level Analysis & Reconfigurable System Definition
(led by PDM)
• Analyze each application and:
 Define its components
• HW/SW, reconfigurable HW modules, …
 Estimate, identify, and optimize its
performance and constraints on the target
reconfigurable computing system
• Execution time, floorplanning and placement,
HW/SW execution, …
• How to achieve these goals?
9
How to achieve these goals?
• by Identifying:
 The partitioning of the input specification in
HW/SW components
 The implementation(s) of the modules to be
realized as HW accelerators
 The corresponding level of reconfigurability for
HW components
• none, micro, region based
 The power constraints
 The floorplanning constraints
• size and shape
 The placement requirements
 The baseline schedule for application’s execution
10
A bird’s eye view on the design phase
[Figure: design-phase flow from Input to Output. Labels: App Designers; Front-end; Description of the Architecture; TG generator; XML Designer GUI; Pre-Processing.]
• Tasks shown in the flow:
 T2.1: High-level analysis
 T2.2: App task profiling and identification of rec. cores
 T2.3: Optimization of app for micro-rec. core implementation
 T2.4: Compile-time baseline scheduling and core mapping onto rec. regions
11
Micro-reconfiguration (led by Gent)
• In some applications we can identify fast
changing inputs vs.
slow‐changing “parameters”
• Parameters trigger a small-scale
reconfiguration
• We want to:
 Identify parameters
 Create bitfile with “holes”
 Parameter values => reconfiguration bits for
missing “holes”
 Fine grain, faster reconfiguration time!
 Extend the idea from logic (TLUT) to wires
(TCON)
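The "bitfile with holes" idea can be sketched in Python as follows. This is a toy model, not the TLUT tool flow: the logic function `g`, the 2-input LUT layout and all function names are assumptions made for illustration. Each truth-table bit of the LUT is kept as a function of the slow-changing parameter `p`; specializing the template for a parameter value yields the concrete configuration bits.

```python
from itertools import product


def g(a, b, p):
    """Hypothetical original logic: parameter p selects between inputs a and b."""
    return (a & p) | (b & (1 - p))


def make_template():
    """Template configuration: every LUT bit is a 'hole', i.e. a function of p."""
    return [lambda p, a=a, b=b: g(a, b, p) for a, b in product((0, 1), repeat=2)]


def specialize(template, p):
    """Evaluate every hole for a parameter value to obtain concrete LUT bits."""
    return [hole(p) for hole in template]


def lut_eval(bits, a, b):
    """Look up a 2-input LUT configured with `bits` (index = 2*a + b)."""
    return bits[2 * a + b]


if __name__ == "__main__":
    template = make_template()
    for p in (0, 1):
        print(p, specialize(template, p))
```

Only the fast-changing inputs `a` and `b` reach the LUT at run time; a change of `p` triggers a small re-specialization instead of a full reconfiguration.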
12
Micro-reconfiguration (led by Gent)
[Figure: CPU side: Application software issuing Reconfiguration Requests, e.g. FIR(2,8) and FIR(4,9); Configuration Manager with config. DBs. FPGA side: Configuration Interface; FIR with Static and Dynamic parts.]
13
Verifying Reconfigurable Systems
(led by Imperial)
• Study design validation approaches: simulation,
emulation and formal verification
• Extend symbolic simulation to dynamic aspects
of reconfigurable design
• In some cases static approaches may not be
able to verify the entire RC system
 We will use run‐time verification. Address and
minimize impact on:
• Speed, area and power
• Light‐weight architectural support
14
Run-time System (Chalmers/FORTH)
• Provide support for partial & dynamic
reconfiguration
 Extend the OS capabilities, integrate in existing
systems
 Efficient on-line scheduling and placement of task
modules
• Evaluate reconfiguration overhead
• Propose advanced mechanisms to support
 Scheduling
 Relocation
 Fragmentation = f(relocation, scheduling)
 Area allocation
• Bottom line: extend the flexibility of run-time
support
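To make the relation between placement, relocation and fragmentation concrete, here is a minimal Python sketch. It assumes a one-dimensional reconfigurable area made of equal columns, a first-fit placement policy and a simple fragmentation metric; all of these are illustrative choices, not the mechanisms of the FASTER run-time system.

```python
def free_runs(area):
    """Lengths of maximal runs of free (None) columns."""
    runs, n = [], 0
    for cell in area:
        if cell is None:
            n += 1
        elif n:
            runs.append(n)
            n = 0
    if n:
        runs.append(n)
    return runs


def first_fit(area, name, width):
    """Place a module in the first free run wide enough; return start or None."""
    n = 0
    for i, cell in enumerate(area):
        n = n + 1 if cell is None else 0
        if n == width:
            start = i - width + 1
            area[start:i + 1] = [name] * width
            return start
    return None


def remove(area, name):
    """Free every column occupied by module `name`."""
    for i, cell in enumerate(area):
        if cell == name:
            area[i] = None


def fragmentation(area):
    """0 when all free columns are contiguous, closer to 1 as they scatter."""
    runs = free_runs(area)
    return 0.0 if not runs else 1 - max(runs) / sum(runs)


def compact(area):
    """Relocate all modules to the left (assumes modules can be relocated)."""
    used = [c for c in area if c is not None]
    area[:] = used + [None] * (len(area) - len(used))


if __name__ == "__main__":
    area = [None] * 10
    for name in "ABC":
        first_fit(area, name, 3)
    remove(area, "B")
    print(fragmentation(area), first_fit(area, "D", 4))
    compact(area)
    print(fragmentation(area), first_fit(area, "D", 4))
```

In the usage example, a 4-column module cannot be placed although 4 columns are free; after relocating the placed modules the free columns become contiguous and the request succeeds.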
15
FASTER Runtime System
[Figure: runtime-system functions: Area allocation, Scheduling, Fragmentation, Placement, Routing, Relocation, Prefetching. Improve: Speed, Power, Temperature.]
16
Design phase and runtime support
[Figure: FASTER tool flow across work packages. Labels recovered from the diagram:]
• Inputs (App Designers):
 List of HDL functions + C description + parallelism annotations (OpenMP)
 XML description: App, Platform, HW/SW partitioning
 Library w/ SW/HW I/F modules
• WP2 (front-end):
 High-level analysis. Contains rough/fast estimation of a) Power, b) Resources, c) Computation time
 Identification of PR cores + Application profiling
 Optimization for micro-reconfiguration (TLUT, TCON) (identify annotations needed on C/HDL)
 Baseline scheduling + Floorplanning
• WP3 (back-end):
 Static; Region-based; Micro-reconfig.
 Reference design; Reuse
 Vendor-flow; Vendor-flow + relocation; UGent + Vendor-flow
• WP4 (runtime):
 System: Static Area; Reconfigurable Area (RR1, RR2); GPP; SW; RTSM
 Verification; Param. change; SW tools
• Define a reconfigurable design methodology that exploits FPGAs:
 Design automation flow and tools to generate hardware and software components and runtime support
 Dynamically reconfigurable hardware and software architectures
 Runtime swap of reconfigurable cores
• Exploit dynamic reconfigurability for different target reconfigurable architectures.
• Define and implement a new generation of self-reconfigurable architectures based on Linux
• Increase the reconfiguration performance via novel techniques, i.e. runtime reconfigurable cores
relocation, reconfigurable cores identification, reconfigurable cores reuse
17
Runtime Reconfiguration Management
• Reconfigurable architecture
 Static Area: used to control the reconfiguration process
 Reconfigurable Area: used to swap at runtime different cores
 Reconfiguration-oriented communication infrastructure
• Runtime reconfiguration managed via SW (slide 21)
 Standalone, Operating System
• Increased portability of user applications
• Inherited multitasking capabilities
• Simplified software development process
• Bitstream relocation technique (slides: 19, 20) to
 speed up the overall system execution
 achieve preemptive execution of cores
 assign the placement of bitstreams at runtime
 reduce the amount of memory used to store partial bitstreams
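The memory argument for relocation can be sketched in Python under a toy model. Here a partial bitstream is just a list of frames addressed relative to a base column, and relocation retargets it to another compatible region; the class, field and function names are hypothetical, and device-specific details of real bitstreams are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialBitstream:
    core: str
    base_column: int   # column the bitstream currently targets
    frames: tuple      # (column_offset, payload) pairs, toy model


def relocate(bs, target_column):
    """Return a copy of `bs` retargeted to another, compatible region."""
    return PartialBitstream(bs.core, target_column, bs.frames)


def frame_addresses(bs):
    """Absolute column address of every frame in the toy model."""
    return [bs.base_column + offset for offset, _ in bs.frames]


def bitstreams_needed(n_cores, n_regions, relocation):
    """Stored partial bitstreams: one per core with relocation,
    otherwise one per (core, region) pair."""
    return n_cores if relocation else n_cores * n_regions


if __name__ == "__main__":
    jpeg = PartialBitstream("JPEG", 0, ((0, b"f0"), (1, b"f1"), (2, b"f2")))
    print(frame_addresses(relocate(jpeg, 6)))
    print(bitstreams_needed(4, 3, False), bitstreams_needed(4, 3, True))
```

With four cores and three assumed regions, the store shrinks from one bitstream per (core, region) pair to one per core, and the placement can be decided at runtime.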
18
Relocation: Virtual homogeneity
19
Runtime scenario: an example
[Figure: Reconfigurable Functional Units (RFUs) placed on the reconfigurable area (x, y coordinates) over time t, for (a) no relocation and (b) relocation]
• FPU1: required for 3.65s
• JPEG: required for 3s
• FPU2: required for 3.13s
• 3DES: required for 1s
• Incoming requests (RFU, request time): (FPU1, 0), (FPU1, 0.1s), (JPEG, 0.1s), (3DES, 0.3s)
(a) Total Time = R(FPU1) + R(JPEG) + E(JPEG) + R(FPU2) + E(FPU2) = 7.53s
(b) Total Time = R(FPU1) + R(JPEG) + R2(FPU2) + E(FPU2) = 4.53s
Speed-Up=1.66
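The speed-up figure follows directly from the two totals; a short Python check, using only the numbers stated on the slide:

```python
t_no_relocation = 7.53   # total time (a), from the slide
t_relocation = 4.53      # total time (b), from the slide

speedup = t_no_relocation / t_relocation
saved = t_no_relocation - t_relocation
print(f"speed-up = {speedup:.2f}, time saved = {saved:.2f} s")
```

The 3.00 s saved matches the E(JPEG) term that is absent from (b), given that JPEG is required for 3 s; this reading assumes R2(FPU2) takes as long as R(FPU2), since the slide does not state the individual R values.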
20
OS-based management of dynamic
reconfiguration
• Provide software support for dynamic partial reconfiguration on
Systems-on-Chip running an operating system (i.e., Linux).
 OS customization for specific architectures
 Partial reconfiguration process management from the OS
 Addition and removal of reconfigurable components
 Automatic loading and unloading of specific drivers for the IP-Cores upon
component configuration and/or de-configuration
• Hardware-independent interface for software developers based on
GNU/Linux
• Easier programming interface for specific drivers
21
Demonstration and Use
• Use three complex applications from different
application domains and on commercial platforms:
 Reverse Time Migration (RTM), a computational
seismography algorithm (Maxeler, high-performance)
 Global Illumination and Image Analysis (ST,
desktop)
 Network Intrusion Detection System (Synelixis,
embedded)
• Evaluate the FASTER tool flow on designer
productivity in the design and verification process.
• Metrics: application speed, cost, and power
consumption
22
Expected Results and Conclusions
• FASTER is a focused project that builds on
the partners' combined expertise as well as on past
research work and projects
• We focus on (and hope to demonstrate):
 productivity improvement in implementation and
verification of dynamically changing systems
 total ownership cost reduction (NIDS and RTM
systems)
 performance improvement under power
constraints for Global Illumination and Image
Analysis application
23
Challenges & Opportunities
• Tool support for analysis and system
definition
• Specification of changing system(s)
• Reconfigurable granularity: influenced
by (influences???) tools and
applications
• Architectural support for reconfiguration
(vendor?)
• Metrics: include design effort/time, total
ownership cost
24
End…
http://www.fp7-faster.eu/
25