Application design flow for the MORPHEUS
heterogeneous dynamically reconfigurable platform
Philippe BONNOT - THALES
CASTNESS’07 workshop – Rome – 2007-01-15
MORPHEUS project
• EU FP6 IST project 027342
• Goals: a reconfigurable architecture chip and associated toolset
  - improving computing density, flexibility (reconfiguration time) and time-to-market
• Partners are:
  - THALES, THOMSON, ALCATEL-LUCENT, THALES Optronics, INTRACOM, ST, PACT, M2000, ACE, CRITICALBLUE
  - CEA, Universities of KARLSRUHE, DELFT, Bretagne Occ., BOLOGNA, BRAUNSCHWEIG, CHEMNITZ, ARTTIC
• Started 1st January 2006
• Duration 3 years
[Figure: application code containing a function f(.) enters the associated toolset; labelled outputs: SW code (CPU), configuration bitstreams (RU 1, RU 2, RU 3), programming communication]
This document and any data included are the property of Thales. They cannot be reproduced, disclosed or used without Thales' prior written approval.
©THALES 2005. Template trtcoen version 1.0.2
Contents
• Introduction
• MORPHEUS execution model
• MORPHEUS programming model
• MORPHEUS toolset
• Conclusion and perspectives
MORPHEUS architecture
[Figure: platform block diagram; labels: IO interfaces / peripherals, on-chip SRAM(s) memories, ARM9 general-purpose processor core(s), SRAM/DRAM controller, configuration manager, reconfiguration control, DMA control, AMBA AHB bus (I/O), AMBA AHB bus (Config), interconnections: Network on Chip (circuit switched, pipelined), data exchange buffers, reconfigurable units]
Reconfigurable units:
• Coarse-grain: PACT XPP
  - Data flow algorithm
  - Huge computational demand
• Medium-grained: PiCoGA
  - Reconf. array of 4-bit oriented ALU
  - Target instruction level parallelism
• Fine grain: M2000 eFPGA
  - Arbitrary logic
Data Flow view
[Figure: platform block diagram; labels: external memory device (SRAM/DRAM/FLASH), Ext. DDR, on-chip SRAM, XPP HRE, PicoGA HRE, M2000 HREs, ARM + OS, CM, IO periph., IO pads, on-chip reconfiguration RAM, Master AMBA AHB/APB + DMA, NOC + DNA, Reconfiguration AMBA AHB + DMA]
• A data stream lives only during the life of a configuration
• Streams are under the control of HRE
• HRE are under the control of ARM (see control flow) for exec and config (and of CM for config)
Execution Control Flow view
[Figure: platform block diagram with the same labels (Ext. DDR, on-chip SRAM, XPP HRE, PicoGA HRE, M2000 HREs, ARM + OS, CM, IO periph., IO pads, on-chip reconfiguration RAM, Master AMBA AHB/APB + DMA, NOC + DNA, Reconfiguration AMBA AHB + DMA)]
Configuration Flow view
[Figure: platform block diagram with the same labels]
Reconfiguration Control Flow view
[Figure: platform block diagram with the same labels]
The application description: What programmer must do
• C programming application at global level (sequences of tasks, etc)
  - with manual annotations to identify « HW » accelerated tasks and their synchronisation (parallel execution)
• Detailing accelerated tasks
  - these are generally complex data-streaming processing functions that require data-parallelism techniques to be described and mapped
• A graphical tool is proposed for that.
  - Engineers who usually design such systems should easily handle it.
  - However, the task must be split into sub-tasks that are easily interconnected with the proposed tool.
  - Sub-tasks have to be described in C.
• A direct path is available when:
  - the accelerated task is not complex
  - optimisation is not expected
Design Flow view
[Figure: design flow placed over the platform block diagram (DDR controller, on-chip SRAM, XPP HRE, PicoGA HRE, M2000 HREs, ARM + OS, CM, IO periph., Master AMBA AHB/APB + DMA, NOC + DNA, Reconfiguration AMBA AHB + DMA). Flow labels:
  - Sequential C-based description of the application
  - Graphical parallel + C kernels description of accelerated function
  - DMA/DNA parameters
  - Configuration (bitstream, …)
  - compilation-time scheduling of accelerated functions setting and execution
  - Run-time scheduling of the application
  - Run-time scheduling of the configuration]
WP2 toolset
[Figure: toolset block diagram; labels:
  - Sequential C-based description of the application
  - Graphical parallel + C kernels description of accelerated function
  - Formal verification
  - MOLEN paradigm and compiler: sequential C-based description of the application (compilation-time scheduling of accelerated functions setting and execution)
  - ECOS-based dynamic reconfiguration control: run-time scheduling of the application
  - Accelerated function synthesis (including memory to memory communication aspects)
  - Exchanged data: information on accelerated function implementations, DMA/DNA parameters
  - Configuration manager
  - Communication mechanisms (DNA, DMA, DDR controller)
  - Reconfigurable Units (M2000 blocks, XPP, PicoGA)
  - Configuration (bitstream, …)]
MOLEN paradigm and compiler
• An extension of the instruction set for the reconfigurable processing elements with:
  - configuration
  - parameter passing
  - execution instructions
• Made by the university of Delft and ACE company
[Figure: compiler flow: C with MOLEN annotations → Expansion to MOLEN instructions → Optimized placement of configuration instructions → ARM assembly with MOLEN abstraction library]
MOLEN
[Figure: MOLEN architecture; labels: retargeted compiler, binary code, call f(.), HDL, f(.), reconfigurable array]
Example:
C code: res = alpha(param1, param2);
  movtx XR1 ← param1            Send param.
  movtx XR2 ← param2            Send param.
  set <address_alpha_set>       HW reconfiguration
  exec <address_alpha_exec>     HW execution
  movfx res ← XR3               Return result
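The instruction sequence in the example can be mimicked in plain C to make the order of operations concrete. This is only a minimal sketch under stated assumptions: the exchange registers are a simulated array, the helper names (`molen_movtx`, `molen_set`, `molen_exec`, `molen_movfx`) are hypothetical and not part of the MOLEN compiler or library, and `alpha_hw` is a stand-in that simply adds its two parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated exchange registers (XR1..XR3 in the slide); index 0 is unused. */
enum { XR_COUNT = 4 };
static int32_t xr[XR_COUNT];

/* Hypothetical model of a configuration: a function acting on the registers. */
typedef void (*hw_kernel)(int32_t *regs);
static hw_kernel configured_kernel = NULL;

/* Assumed stand-in for the hardware version of alpha: XR3 = XR1 + XR2. */
static void alpha_hw(int32_t *regs) { regs[3] = regs[1] + regs[2]; }

/* movtx: send a parameter from the CPU to an exchange register. */
static void molen_movtx(int reg, int32_t value) {
    assert(reg > 0 && reg < XR_COUNT);
    xr[reg] = value;
}

/* set: load a configuration (modelled here as selecting a function). */
static void molen_set(hw_kernel kernel) { configured_kernel = kernel; }

/* exec: run the configured kernel; a configuration must be loaded first. */
static void molen_exec(void) {
    assert(configured_kernel != NULL);
    configured_kernel(xr);
}

/* movfx: read a result back from an exchange register. */
static int32_t molen_movfx(int reg) {
    assert(reg > 0 && reg < XR_COUNT);
    return xr[reg];
}

/* Same order as the slide for: res = alpha(param1, param2); */
int32_t call_alpha(int32_t param1, int32_t param2) {
    molen_movtx(1, param1);   /* Send param.         */
    molen_movtx(2, param2);   /* Send param.         */
    molen_set(alpha_hw);      /* HW reconfiguration  */
    molen_exec();             /* HW execution        */
    return molen_movfx(3);    /* Return result       */
}
```

Calling `call_alpha(2, 3)` walks through the five steps in the slide's order; in this model the `set` step could be hoisted earlier without changing the result, which is the freedom the "optimized placement of configuration instructions" step relies on.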

C Source Program Annotations
  MOLEN_FUNCTION id          Declares the next function to correspond with PE task id
  MOLEN_PARALLEL on/off      Starts/ends scope for parallel tasks execution
  MOLEN_CONFLICT id1 id2     Declares configuration conflict for tasks with id1 and id2
MOLEN Instructions
  SET (id)                   Configure PE for task id
  MOVTX(id,val)              Move value to task id
  EXEC (id)                  Execute task id
  BREAK                      Wait for all executing tasks
  MOVFX(id,reg)              Move data from task to reg
  RELEASE (id)               Release configuration id
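To show how the annotations and the task-id based instructions fit together, here is a small C model. It is a sketch under assumptions, not the MOLEN toolchain: the slides do not give the concrete annotation syntax, so the annotations are modelled as empty macros; the instruction helpers (`task_set`, `task_movtx`, `task_exec`, `task_break`, `task_movfx`, `task_release`) and the two kernels are hypothetical; and parallel execution is simulated by deferring the work until `task_break`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Annotations modelled as empty macros (assumed form, for illustration only). */
#define MOLEN_FUNCTION(id)
#define MOLEN_PARALLEL(state)

enum { MAX_TASKS = 4 };

typedef int (*kernel_fn)(int);

typedef struct {
    kernel_fn kernel;   /* non-NULL once SET has configured the PE */
    bool pending;       /* EXEC issued, not yet completed by BREAK */
    int input;          /* value moved in by MOVTX */
    int output;         /* value read back by MOVFX */
} pe_slot;

static pe_slot slots[MAX_TASKS];

MOLEN_FUNCTION(1)
static int square(int x) { return x * x; }

MOLEN_FUNCTION(2)
static int negate(int x) { return -x; }

static pe_slot *slot(int id) {
    assert(id >= 0 && id < MAX_TASKS);
    return &slots[id];
}

void task_set(int id, kernel_fn k) { slot(id)->kernel = k; }        /* SET     */
void task_movtx(int id, int val)   { slot(id)->input = val; }       /* MOVTX   */
void task_exec(int id) {                                            /* EXEC    */
    assert(slot(id)->kernel != NULL);
    slot(id)->pending = true;
}
void task_break(void) {                                             /* BREAK   */
    for (int id = 0; id < MAX_TASKS; id++) {
        if (slots[id].pending) {
            slots[id].output = slots[id].kernel(slots[id].input);
            slots[id].pending = false;
        }
    }
}
int task_movfx(int id) {                                            /* MOVFX   */
    assert(!slot(id)->pending);
    return slot(id)->output;
}
void task_release(int id) { slot(id)->kernel = NULL; }              /* RELEASE */

/* Two tasks started inside a parallel scope, then joined with BREAK. */
int run_two_tasks(int a, int b) {
    task_set(1, square);
    task_set(2, negate);
    task_movtx(1, a);
    task_movtx(2, b);
    MOLEN_PARALLEL(on)
    task_exec(1);
    task_exec(2);
    MOLEN_PARALLEL(off)
    task_break();
    int sum = task_movfx(1) + task_movfx(2);
    task_release(1);
    task_release(2);
    return sum;
}
```

In this model a result may only be read after `task_break`, which mirrors the slide's description of BREAK as waiting for all executing tasks.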
Dynamic reconfiguration RTOS structure
• eCos extension made by university of Karlsruhe
Dynamic reconfiguration RTOS relationships
[Figure: relationship diagram; labels: Spatial design, Retargetable compilation, Compiled application binary code, RTOS (dynamic reconfiguration), Configuration Manager, Reconfigurable units; link labels: reconfiguration and execution system call, reconfiguration directives, reconfiguration and execution control, reconfiguration control, HW status (three times)]
• The RTOS performs:
  - Priority calculation
  - Tasks execution status management
  - Resource request to the Configuration Manager for fine dynamic scheduling
  - Allocation decision (on the various reconfigurable units) (only in the second phase of the project)
• The Configuration Manager performs:
  - Configuration priority management
  - Configuration cache management
  - Prefetch prediction
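Configuration cache management can be illustrated with a very small model. The slides do not say which replacement policy the Configuration Manager uses, so the sketch below assumes a least-recently-used policy purely for illustration; the type `config_cache`, the functions `cache_init` and `cache_request`, and the two-slot size are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { CACHE_SLOTS = 2, NO_CONFIG = -1 };

typedef struct {
    int config_id[CACHE_SLOTS];       /* configuration held in each slot */
    unsigned last_used[CACHE_SLOTS];  /* logical time of the last request */
    unsigned clock;                   /* incremented on every request */
    unsigned misses;                  /* number of configurations loaded */
} config_cache;

void cache_init(config_cache *c) {
    for (int i = 0; i < CACHE_SLOTS; i++) {
        c->config_id[i] = NO_CONFIG;
        c->last_used[i] = 0;
    }
    c->clock = 0;
    c->misses = 0;
}

/* Returns true on a hit. On a miss, the least recently used slot
 * (empty slots count as oldest) is overwritten with the requested id. */
bool cache_request(config_cache *c, int id) {
    assert(id >= 0);
    c->clock++;
    int victim = 0;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (c->config_id[i] == id) {
            c->last_used[i] = c->clock;
            return true;
        }
        if (c->last_used[i] < c->last_used[victim]) {
            victim = i;
        }
    }
    c->config_id[victim] = id;
    c->last_used[victim] = c->clock;
    c->misses++;
    return false;
}
```

In such a model, a prefetch is simply a `cache_request` issued before the task actually needs its configuration, so that the later request becomes a hit.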
Applications in SPEAR DE
• mostly regular data streaming applications
• captured as acyclic graphs of tasks
• each task is represented by the way it (linearly) accesses data from/to its input/output arrays, and as a nest of loops
• SPEAR DE does not take part in creating the code within a task
• SPEAR DE helps the user to select and implement a mapping of the application on the computing architecture
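A task described as a nest of loops with linear access to its input and output arrays can be pictured as follows. This is a sketch, not the SPEAR DE representation: the `access_pattern` structure, `run_task`, and the element function `double_value` are hypothetical names introduced for illustration, and the two-level loop nest is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Linear access pattern: index = base + i * stride_i + j * stride_j. */
typedef struct {
    size_t base;
    size_t stride_i;
    size_t stride_j;
} access_pattern;

static size_t access_index(const access_pattern *p, size_t i, size_t j) {
    return p->base + i * p->stride_i + j * p->stride_j;
}

/* A task: a two-level loop nest around an elementary function (the C code
 * written by the user), reading and writing through linear access patterns. */
void run_task(const int *in, int *out, size_t rows, size_t cols,
              access_pattern read, access_pattern write, int (*body)(int)) {
    assert(in != NULL && out != NULL && body != NULL);
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            out[access_index(&write, i, j)] = body(in[access_index(&read, i, j)]);
        }
    }
}

/* Example elementary function. */
int double_value(int x) { return 2 * x; }
```

With `read = {0, cols, 1}` the task scans the input row by row; choosing `write = {0, 1, rows}` makes the same loop nest write its result transposed, without touching the code inside the task.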
SPATIAL DESIGN: framework architecture
• Joint work of University of Bretagne Occidentale, Critical Blue and THALES
• Application capture and system optimizations: SPEAR DE
• CDFG generation: Cascade (CriticalBlue)
• Technology mapping: MADEO (UBO)
[Figure: tool chain; labels: SPEAR DE, SPEAR sub-function C files (ANSI C subset), Cascade (CriticalBlue), CDFG, global CDFG (data flow of the process), MADEO (UBO), Bitstream]
Mapping: Fusion of tasks
[Figure: two mappings of tasks F and G, before and after fusion; labels: MOtoM, Mtom, mtoM, MtoOtherSeg, CPU, F, G, M0, M, m, Fusion, Do 4 times]
• Fusion reduces memory needs
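Why fusion reduces memory needs can be seen on a two-task chain. The sketch below is an assumption-laden illustration rather than the SPEAR DE transformation: tasks `f` and `g` are arbitrary stand-ins, the sizes are invented, and the choice of four blocks is only meant to echo the "Do 4 times" label of the figure.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 16, BLOCK = 4 };   /* N / BLOCK = 4 iterations of the fused loop */

static int f(int x) { return x + 1; }   /* hypothetical task F */
static int g(int x) { return x * 2; }   /* hypothetical task G */

/* Unfused mapping: F produces a full intermediate array before G starts.
 * Returns the number of intermediate elements that had to be stored. */
size_t run_unfused(const int *in, int *out) {
    int tmp[N];
    for (size_t i = 0; i < N; i++) tmp[i] = f(in[i]);
    for (size_t i = 0; i < N; i++) out[i] = g(tmp[i]);
    return sizeof tmp / sizeof tmp[0];
}

/* Fused mapping: F and G run block by block, so the intermediate buffer
 * only needs to hold one block at a time. */
size_t run_fused(const int *in, int *out) {
    int tmp[BLOCK];
    for (size_t b = 0; b < N; b += BLOCK) {
        for (size_t k = 0; k < BLOCK; k++) tmp[k] = f(in[b + k]);
        for (size_t k = 0; k < BLOCK; k++) out[b + k] = g(tmp[k]);
    }
    return sizeof tmp / sizeof tmp[0];
}
```

Both functions compute the same output; the fused version needs an intermediate buffer of `BLOCK` elements instead of `N`.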
Design on reconfigurable units
• Synthesis of subtasks from C code
• Automatic generation of interconnections and control logic
[Figure: two HREs containing blocks labelled F and C, with buffers, connected through NOC, AMBA]
MADEO: framework architecture
• Behavioral & physical synthesis
• Open framework
[Figure: MADEO flow; labels: Global CDFG (from SPEAR), Subtasks CDFG (from Cascade), Global and subtasks CDFG generation, CDFG HLL, compilation, synthesis, MADEO: behavioral & physical synthesis, CDFG LL rewriting, Archi. 1: M2000, Archi. 2: XPP, Archi. 3: PicoGA; output formats: EDIF, e.g. NML, e.g. Griffy-C]
Conclusion and perspectives
• A reconfigurable heterogeneous architecture is in development
• Associated toolset based on C language, composed of 3 modules:
  - Retargetable compiler based on MOLEN paradigm
  - Reconfiguration control added to eCos OS
  - Accelerated function synthesis abstracts the architecture heterogeneity
• Developments of application test cases in progress (Work Package 5)
• A comprehensive toolset:
  - allows application developers to fully exploit MORPHEUS architecture
  - reduced time-to-market, improving flexibility
• Second phase of the project:
  - Parallel extensions to MOLEN instruction set
  - Dynamic reconfiguration control
  - Function synthesis optimizations