Architectures for Multimedia Systems
TILERA – TILE64™ PROCESSOR
Mondello Filippo
722955
Index
• Tile Processor Architecture
• Tile64 implementation
• Tile Processor Architecture innovations:
  ◦ Large number of tiles on a chip
  ◦ iMesh
  ◦ Multicore coherent cache
  ◦ Multicore Hardwall technology
  ◦ Multicore Development Environment Tools Suite
Tile Processor Architecture
• MIMD machine
• 2D grid of 64 homogeneous, general-purpose compute elements: tiles
• Tilera’s iMesh on-chip network
• 4 DDR2 controllers + I/O controllers
• TILES:
  ◦ Processor
  ◦ L1 & L2 cache
  ◦ Non-blocking switch
Tile Processor Architecture: Tile64 implementation
• Cores: 32-bit, RISC, VLIW, 90nm technology
  ◦ 192 billion 32-bit ops
  ◦ 256 billion 16-bit ops
  ◦ half a teraops for 8-bit operations
• Memory
  ◦ L1 cache: 8KB I, 8KB D, 1 cycle latency
  ◦ L2 cache: 64KB unified, 7 cycle latency
  ◦ Off-chip main memory, ~70 cycle latency
  ◦ 32-bit virtual address space per process
  ◦ 64-bit physical address space
  ◦ Instruction and data TLBs
  ◦ Cache integrated 2D DMA engine
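The latency figures above can be combined into a rough average memory access time. The following is only a sketch: the hit rates are hypothetical workload parameters, not Tilera figures, and the model assumes that each listed latency is the full cost of an access served at that level and that no accesses are served from another tile's cache.

```python
# Latencies in cycles, taken from the slide above.
L1_LATENCY = 1
L2_LATENCY = 7
DRAM_LATENCY = 70  # the slide gives "~70"


def average_access_cycles(l1_hit_rate, l2_hit_rate):
    """Expected cycles per access for a simple two-level hierarchy.

    l1_hit_rate and l2_hit_rate are hypothetical workload parameters;
    l2_hit_rate is the hit rate among accesses that already missed in L1.
    Assumes each latency is the total cost of an access served at that level.
    """
    if not (0.0 <= l1_hit_rate <= 1.0 and 0.0 <= l2_hit_rate <= 1.0):
        raise ValueError("hit rates must be between 0 and 1")
    l1_miss = 1.0 - l1_hit_rate
    l2_miss = 1.0 - l2_hit_rate
    return (l1_hit_rate * L1_LATENCY
            + l1_miss * l2_hit_rate * L2_LATENCY
            + l1_miss * l2_miss * DRAM_LATENCY)
```

For example, `average_access_cycles(0.9, 0.8)` weighs 90% of accesses at 1 cycle, 8% at 7 cycles and 2% at 70 cycles, which shows how quickly the off-chip term dominates once both caches miss.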
iMesh network
• Using multiple processors requires a system that allows communication among them.
  ◦ Old solution: bus interconnection.
    Problem: as more cores are added to a chip → the bus creates data congestion, limiting performance scalability as the number of cores increases
  ◦ Tilera’s solution: iMesh
• iMesh:
  ◦ user dynamic network (UDN)
  ◦ I/O dynamic network (IDN)
  ◦ static network (STN)
  ◦ memory dynamic network (MDN)
  ◦ tile dynamic network (TDN)
iMesh network
• Each tile uses a fully connected crossbar → all-to-all five-way communication.
• Dynamic networks:
  ◦ packetized, fire-and-forget interface, dimension-ordered wormhole-routed.
  ◦ Packet = header word + up to 128 words per packet
  ◦ Hop latency: one cycle if packets are going straight; one extra cycle for route calculation when a packet must make a turn at a switch.
• Static network:
  ◦ static configuration of the routing decisions at each switch point.
  ◦ auxiliary processor for reconfiguring the network in a programmatic manner.
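The hop-latency rule for the dynamic networks can be illustrated with a small model. This is a sketch under stated assumptions: tiles are addressed as hypothetical (x, y) coordinates, the X dimension is routed before Y (the slide only says "dimension-ordered"), and the estimate ignores injection, ejection, contention and the time to stream the remaining packet words.

```python
def xy_route(src, dst):
    """Return the list of tiles visited from src to dst, X dimension first.

    Tiles are (x, y) grid coordinates. Routing X before Y is an assumption;
    the slide only states that routing is dimension-ordered.
    """
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    while x != dx:
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:
        y += 1 if dy > y else -1
        path.append((x, y))
    return path


def route_latency_cycles(src, dst):
    """Estimate header latency: one cycle per hop, plus one cycle at a turn.

    With dimension-ordered routing there is at most one turn, and it occurs
    only when source and destination differ in both coordinates.
    """
    hops = len(xy_route(src, dst)) - 1
    turns = 1 if src[0] != dst[0] and src[1] != dst[1] else 0
    return hops + turns
```

On an 8x8 grid, a packet travelling along one row from (0, 0) to (7, 0) is estimated at 7 cycles, while corner to corner from (0, 0) to (7, 7) takes 14 hops plus one turn, giving 15 cycles under this model.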
iMesh network
• UDN → userland processes or threads.
• IDN → direct communication with I/O devices.
• MDN → communication with off-chip DRAM.
• TDN → direct tile-to-tile cache transfers. Works in concert with the MDN.
• STN → low-latency, high-bandwidth channelized network, great for streaming data.
Multicore coherent cache
• Cache subsystem → high-performance, two-level, non-blocking cache hierarchy.
• Each tile's cache can be shared with other tiles
  → each tile can access the aggregate multi-megabyte cache.
  → each tile can view the collection of on-chip caches of all tiles, serving as an L3 cache.
• Neighborhood caching to provide an on-chip distributed shared cache.
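The "multi-megabyte" aggregate follows from the per-tile cache sizes listed on the Tile64 implementation slide. A minimal sketch of the arithmetic, assuming all 64 tiles contribute their full caches (the function and constant names are illustrative only):

```python
# Per-tile cache sizes in KB, from the Tile64 implementation slide.
TILES = 64
L1_I_KB = 8
L1_D_KB = 8
L2_KB = 64


def aggregate_cache_kb(tiles=TILES):
    """Total on-chip cache in KB, split by level, summed over all tiles."""
    l1 = tiles * (L1_I_KB + L1_D_KB)
    l2 = tiles * L2_KB
    return {"l1_kb": l1, "l2_kb": l2, "total_kb": l1 + l2}
```

With 64 tiles this gives 1024 KB of L1 and 4096 KB of L2, so 5120 KB (5 MB) of cache in total, of which the 4 MB of L2 is the part a tile can view as a distributed L3.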
Multicore Hardwall technology
• Enables the user to define one or many cores as a processing island, eliminating communication between it and other cores unless specified.
• If a packet attempts to cross the established boundary, an interrupt is signaled and control is passed on to the hypervisor.
• The Tile Processor architecture is therefore well suited to hosting multiple operating systems running independent applications, or multiple instances of the same application, on a single-chip platform.
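The island idea can be illustrated with a purely conceptual software model. This is not Tilera's hardware mechanism or API: the island names, the `allowed` set and the exception are all hypothetical, and the check looks only at the two endpoints of a packet, whereas the hardware acts when a packet reaches the boundary itself.

```python
class HardwallViolation(Exception):
    """Stands in for the interrupt that hands control to the hypervisor."""


def island_of(tile, islands):
    """Return the name of the island containing tile, or None if unassigned."""
    for name, tiles in islands.items():
        if tile in tiles:
            return name
    return None


def check_packet(src, dst, islands, allowed=frozenset()):
    """Allow a packet only within one island or across an explicitly allowed pair.

    islands maps a hypothetical island name to a set of (x, y) tiles.
    allowed is a set of (source_island, destination_island) pairs that the
    user has explicitly permitted. Tiles in no island are treated as one
    unprotected group.
    """
    a, b = island_of(src, islands), island_of(dst, islands)
    if a == b or (a, b) in allowed:
        return True
    raise HardwallViolation(f"packet {src} -> {dst} crosses the {a}/{b} boundary")
```

For instance, with `islands = {"os_a": {(0, 0), (1, 0)}, "os_b": {(2, 0), (3, 0)}}`, a packet from (0, 0) to (1, 0) is accepted, while one from (1, 0) to (2, 0) raises `HardwallViolation` unless the pair `("os_a", "os_b")` is listed in `allowed`.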
Multicore Development Environment Tools Suite
• The Tilera MDE includes a powerful Eclipse-based integrated development environment (IDE), an ANSI-standard 'C' compiler, a full-system simulation model and a set of flexible command-line interfaces.
• To achieve optimum performance on the chip, the MDE includes an optimized user communication library (iLib) offering standard mechanisms such as process management, socket-like streaming channels, message passing, and shared-memory communication.
• Tilera defined the Gentle Slope Programming model, which enables the user to begin with familiar programming tools and move easily to advanced, large-scale multicore programming.