NETWORK-ON-CHIP (NOC): A New SoC Paradigm
Dr. Konstantinos Tatas
PRESENTATION OUTLINE
Introduction
Part A
 Motivation – SoC Communication
 Current Solutions
 NoC Concept
Part B
 Work@MicroLab
Summary
THE MANY CORES ERA
Source: International Technology Roadmap for Semiconductors, 2007 edition (http://www.itrs.net/)
THE GROWING GAP: COMPUTATION VS. COMMUNICATION
[Figure: ratios of 2:1 and 9:1. Taken from ITRS, 2001]
GROWING CHIP DENSITY
1998: ASIC - 0.35 µm
2012: SoC - 22 nm
Memory, I/O
P
Design complexity - high IP reuse
Efficient high performance interconnect
Scalability of communication architecture
TRADITIONAL SOC NIGHTMARE
The “Board-on-a-Chip” Approach
[Figure: CPU, DSP, DMA, memory controller, MPEG and I/O blocks attached to a system bus and a peripheral bus joined by a bridge, with additional control wires]
Variety of dedicated interfaces
Poor separation between computation and communication
Design complexity
Unpredictable performance
The architecture is tightly coupled
COMPUTATIONAL DEMANDS OF FUTURE MULTIMEDIA APPLICATIONS
Memory bandwidth scales proportionally
[Figure: bandwidth (BW), today vs. 2012-2015]
Sources:
K. Uchiyama, “Power-Efficient Heterogeneous Parallelism for Digital Convergence”, VLSI Circuit Digest of Technical Papers, IEEE, pp. 6-9, June 2008
Jian Li, “3D Integration opportunities and challenges”, ISCAS 2008 tutorial on 3D
SHARED ADDRESS SPACE COMMUNICATIONS
SYSTEM BUS
CROSS-BAR
MULTI-STAGES NETWORK ON CHIP
AN NOC EXAMPLE
•Source: ossum, Intel @ MPSoC’07
NOC TOPOLOGIES
Regular topologies: general-purpose on-chip multiprocessors
Custom topologies:
NOC VS. “OFF-CHIP” NETWORKS
What is Different?
[Figure: grid of modules connected through routers]
 Routers on Planar Grid Topology
 Short Point-To-Point Links between routers
 Unique VLSI Cost Sensitivity:
 Area-Routers and Links
 Power
NOC VS. “OFF-CHIP” NETWORKS
No legacy protocols to be compliant with …
No software, hence simple and hardware-efficient protocols
Different operating environment (no dynamic changes and failures)
Custom Network Design: you design what you need!
Example 1: Replace modules
[Figure: grid of modules in which some modules are replaced]
Example 2: Adapt links
[Figure: grid of modules in which links are adapted]
NOC COST SCALABILITY VS. ALTERNATIVES
•Compare the cost of:
NoC
Non-Segmented Bus (NS-Bus)
Segmented Bus (S-Bus)
Point-To-Point (PTP)
WHY NOC?
Bus:
• Longer connections lead to higher parasitic capacitance
• Arbitration grows and becomes a bottleneck
• Bandwidth is limited and shared by all cores
NoC:
• Performance does not degrade with network scaling
• Arbitration and routing are distributed
• Aggregate bandwidth scales with network size
WHICH ARE THE MAIN CHALLENGES?
Communication infrastructure
Communication paradigm selection
Application mapping optimization
Programming model
Physical design
Design automation/tool-flow integration
BASIC SWITCHING TECHNIQUES
• Circuit Switching: A real or virtual circuit establishes a direct connection between source and destination.
• Packet Switching: Each packet of a message is routed independently. The destination address has to be provided with each packet.
• Store and Forward Packet Switching: The entire packet is stored and then forwarded at each switch.
• Cut Through Packet Switching: The flits of a packet are pipelined through the network. The packet is not completely buffered in each switch.
• Virtual Cut Through Packet Switching: The entire packet is stored in a switch only when the header flit is blocked due to congestion.
• Wormhole Switching: Cut-through switching in which all flits are blocked in place when the header flit is blocked.
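The difference between store-and-forward and cut-through switching can be illustrated with a simple contention-free latency model. This is only a sketch: the function names and the one-cycle router and flit delays are assumptions made for illustration, and an unblocked wormhole packet behaves like the cut-through case.

```python
def store_and_forward_latency(hops, packet_flits, flit_cycles=1, router_cycles=1):
    """Cycles when every switch stores the whole packet before forwarding it."""
    per_hop = router_cycles + packet_flits * flit_cycles
    return hops * per_hop


def cut_through_latency(hops, packet_flits, flit_cycles=1, router_cycles=1):
    """Cycles when flits are pipelined: only the header pays the per-hop cost."""
    header = hops * (router_cycles + flit_cycles)
    body = (packet_flits - 1) * flit_cycles  # remaining flits stream behind the header
    return header + body


if __name__ == "__main__":
    print(store_and_forward_latency(hops=4, packet_flits=8))  # 36
    print(cut_through_latency(hops=4, packet_flits=8))        # 15
```

Under this model the store-and-forward latency grows with the product of hop count and packet length, while the cut-through latency grows with their sum.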
CIRCUIT SWITCHING (ARE THEY NOC?)
• Phases:
– Circuit Setup
– Transmission
– Tear Down
• Disadvantages:
– Exclusive allocation of resources
– Long setup phase
• Advantages:
– High performance - throughput and latency
– Low power consumption
– Low overhead during transmission phase
– Predictable transmission
PACKET SWITCHING VS CIRCUIT SWITCHING
NOC ROUTER
NoC-based MPSoC
• Nodes
– Processing Elements (PEs), such as CPUs, custom IPs, DSPs, etc.
– Storage elements (embedded memory blocks)
• Routers
• Links
• Network Interfaces (NIs)
• Often a switch together with its host node memory is referred to as a tile.
NoC Topologies
• Regular/irregular
• Direct/indirect
– In a direct topology, each node has a direct point-to-point link to a subset of other nodes in the system, called neighboring nodes
2D Mesh
• Simplest and most popular topology for NoCs.
• Every switch, except those at the edges, is connected to four neighboring switches and one node.
[Figure: 2D mesh of routers (R), each attached to an IP core]
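As a small illustration of the mesh connectivity described above, the sketch below lists the neighboring switches of a switch in a mesh. The function name and the (x, y) coordinate convention are assumptions made for this example.

```python
def mesh_neighbors(x, y, width, height):
    """Return the switches directly linked to switch (x, y) in a width x height mesh."""
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    # Edge and corner switches simply have fewer neighbors.
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < width and 0 <= cy < height]


if __name__ == "__main__":
    print(mesh_neighbors(1, 1, 3, 3))  # interior switch: four neighbors
    print(mesh_neighbors(0, 0, 3, 3))  # corner switch: two neighbors
```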
2D Torus
• Same layout as a regular mesh, except that nodes at the edges are connected to switches at the opposite edge via wrap-around routing channels.
• Every switch has five ports.
• The limitation of this topology lies in the long end-around connections.
[Figure: 2D torus of routers (R), each attached to an IP core, with wrap-around links]
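The effect of the wrap-around channels can be sketched by comparing minimal hop counts in a mesh and a torus of the same size. The helper names and the (x, y) coordinates are assumptions for illustration, and the sketch assumes at least three switches per dimension.

```python
def torus_neighbors(x, y, width, height):
    """Every torus switch has four neighbors; edges wrap to the opposite edge."""
    return [((x + 1) % width, y), ((x - 1) % width, y),
            (x, (y + 1) % height), (x, (y - 1) % height)]


def mesh_hops(a, b):
    """Minimal hop count between switches a and b in a mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def torus_hops(a, b, width, height):
    """Minimal hop count in a torus: each dimension may go either way round."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return min(dx, width - dx) + min(dy, height - dy)


if __name__ == "__main__":
    print(mesh_hops((0, 0), (3, 3)))         # 6
    print(torus_hops((0, 0), (3, 3), 4, 4))  # 2
```

The hop count says nothing about wire length: the wrap-around links that shorten the path are exactly the long connections noted above.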
Octagon
• Well-established direct topology found in NoCs.
• Ring of 8 nodes connected by 12 bi-directional links.
• The links provide at most two-hop communication between any pair of nodes in the ring.
• Simple algorithms for fast yet efficient shortest-path routing.
• If a platform consists of more than eight nodes, the octagon is extended to multidimensional space.
[Figure: octagon of eight routers (R), each attached to an IP core]
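One commonly described shortest-path scheme for the octagon decides the next hop from the relative address of the destination. The sketch below assumes nodes are numbered 0 to 7 around the ring and that the four extra links connect each node i to node i + 4; the numbering and function names are assumptions for illustration.

```python
def octagon_next_hop(current, dest):
    """Pick the next node from the relative address (dest - current) mod 8."""
    rel = (dest - current) % 8
    if rel == 0:
        return current            # already at the destination
    if rel in (1, 2):
        return (current + 1) % 8  # clockwise ring link
    if rel in (6, 7):
        return (current - 1) % 8  # counter-clockwise ring link
    return (current + 4) % 8      # cross link to the opposite node


def octagon_route(src, dest):
    """List of nodes visited from src to dest, both included."""
    path = [src]
    while path[-1] != dest:
        path.append(octagon_next_hop(path[-1], dest))
    return path


if __name__ == "__main__":
    print(octagon_route(0, 3))  # [0, 4, 3]
    worst = max(len(octagon_route(s, d)) - 1 for s in range(8) for d in range(8))
    print(worst)                # 2
```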
Fat-tree and butterfly fat-tree
• Nodes are connected to an external switch.
• Switches have point-to-point links to other switches.
• Processing units and memory modules are assigned to the leaves of the tree.
• Switches are placed at the vertices.
• Communication involves climbing up and down some part of the tree.
• A pair of coordinates, ($l$, $p$), is used to label each node, where $l$ denotes a node's level and $p$ gives its position within this level.
[Figure: tree of routers (R) with IP cores at the leaves]
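The "climbing up and down" can be sketched with the ($l$, $p$) labels. This is a simplified illustration, not the butterfly fat-tree itself: it assumes a plain tree in which leaves sit at level 0, every switch has `arity` children, and the parent of ($l$, $p$) is ($l$+1, $p$ // arity). The function name and these conventions are assumptions.

```python
def tree_route(src_pos, dst_pos, arity=4):
    """Labels (level, position) visited between two leaves of a plain tree."""
    if src_pos == dst_pos:
        return [(0, src_pos)]
    up, down = [(0, src_pos)], [(0, dst_pos)]
    level, s, d = 0, src_pos, dst_pos
    # Climb from both leaves until they meet at a common ancestor switch.
    while s != d:
        level += 1
        s //= arity
        d //= arity
        up.append((level, s))
        if s != d:
            down.append((level, d))
    return up + down[::-1]


if __name__ == "__main__":
    print(tree_route(0, 1))  # [(0, 0), (1, 0), (0, 1)]
    print(tree_route(0, 5))  # [(0, 0), (1, 0), (2, 0), (1, 1), (0, 5)]
```

Leaves that share a low-level switch communicate without loading the upper levels of the tree.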
Polygon
• Widely accepted topology.
• Packets travel in a loop from one router to the next.
• Chords can be added to the circle.
• If chords are inserted only between opposite routers, the topology is called a spidergon.
[Figure: ring of routers (R), each attached to an IP core]
Star
[Figure: star topology with a central core router and IP cores attached to routers (R) on the spikes]
• central router in the middle of the star,
• computational resources, or subnetworks, in the spikes of the star.
• The capacity requirements of the central router are quite large,
• significant possibility of congestion in the middle of the star
Flow Control
• intra-switch
• switch-to-switch
– Buffered
– Bufferless
• end-to-end
ACK/NACK
• Handshaking protocol.
• When a sender puts data on the link, it activates a VALID signal.
• When the receiver is ready to consume the valid data, it activates the corresponding ACK signal.
• If the data is corrupt or there is no buffer space to store it, a NACK signal is activated instead.
• Upon receipt of a NACK, the sender starts resending flits, starting from the unacknowledged one.
• Inherently supports fault tolerance.
• Additional buffer space is required to keep sent flits in case retransmission is required.
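The retransmission behaviour above can be sketched as a small cycle-by-cycle simulation. All names are hypothetical, and several simplifications are assumed: feedback returns after a fixed `round_trip` delay, the receiver accepts flits only in order, corruption is injected by listing the transmission attempts that fail, and feedback for flits sent before a go-back is discarded.

```python
from collections import deque


def ack_nack_transfer(num_flits, corrupt_attempts=(), round_trip=2):
    """Return the sequence of flit indices placed on the link."""
    sent_log = []        # every transmission, including retransmissions
    feedback = deque()   # (cycle the feedback arrives, flit index, accepted?)
    next_flit = 0        # next flit the sender will transmit
    expected = 0         # next flit the receiver will accept
    acked = 0            # flits below this index have been acknowledged
    cycle = 0
    while acked < num_flits:
        while feedback and feedback[0][0] <= cycle:
            _, seq, accepted = feedback.popleft()
            if accepted:
                acked = seq + 1       # ACK: the sender may drop its copy
            else:
                next_flit = seq       # NACK: resend from this flit onwards
                feedback.clear()      # later flits were rejected anyway
        if next_flit < num_flits:
            attempt = len(sent_log)
            accepted = next_flit == expected and attempt not in corrupt_attempts
            if accepted:
                expected += 1
            sent_log.append(next_flit)
            feedback.append((cycle + round_trip, next_flit, accepted))
            next_flit += 1
        cycle += 1
    return sent_log


if __name__ == "__main__":
    print(ack_nack_transfer(4))                        # [0, 1, 2, 3]
    print(ack_nack_transfer(4, corrupt_attempts={1}))  # [0, 1, 2, 1, 2, 3]
```

In the second run, flit 2 is sent twice even though only flit 1 was corrupted, which is why the sender must keep unacknowledged flits buffered.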
Stall/go
• Requires just two control wires:
– one going forward, signifying data availability,
– one going backward, signaling either that the buffers are filled ("STALL") or that buffers are free ("GO").
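A minimal sketch of the stall/go handshake is shown below. It assumes the backward wire has no latency and that the receiver frees one buffer slot every `drain_period` cycles; the function and parameter names are hypothetical.

```python
def stall_go_transfer(num_flits, buffer_slots, drain_period):
    """Return the cycle in which each flit is sent over the link."""
    occupancy = 0   # flits currently held in the receiver buffer
    sent_at = []
    cycle = 0
    while len(sent_at) < num_flits:
        go = occupancy < buffer_slots   # backward wire: GO if space, else STALL
        if go:
            sent_at.append(cycle)       # forward wire: data is valid this cycle
            occupancy += 1
        # The receiver consumes one flit at the end of every drain period.
        if cycle % drain_period == drain_period - 1 and occupancy > 0:
            occupancy -= 1
        cycle += 1
    return sent_at


if __name__ == "__main__":
    print(stall_go_transfer(num_flits=5, buffer_slots=2, drain_period=3))
    # [0, 1, 3, 6, 9]
```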
Credit-based
• The transmitter has a "credit" counter, initialized to the number of empty buffer slots of the receiver.
• The transmitter decrements the counter every time a flit is sent.
• The credit counter must be updated when the receiver consumes or forwards a flit and therefore frees buffer space.
• A credit value is sent back to the transmitter, to be added to the current value of the credit counter.
• The transmitter stalls when the credit value is zero and resumes when its value increases again.
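The credit counter described above can be sketched as a small class. The class and method names are hypothetical and only model the transmitter side.

```python
class CreditSender:
    """Transmitter-side credit counter for one link."""

    def __init__(self, receiver_slots):
        # Start with one credit per empty buffer slot at the receiver.
        self.credits = receiver_slots

    def can_send(self):
        return self.credits > 0

    def send_flit(self):
        if not self.can_send():
            raise RuntimeError("stalled: no credits left")
        self.credits -= 1  # one receiver slot is now occupied

    def receive_credits(self, count=1):
        # The receiver consumed or forwarded `count` flits and returned credits.
        self.credits += count


if __name__ == "__main__":
    sender = CreditSender(receiver_slots=2)
    sender.send_flit()
    sender.send_flit()
    print(sender.can_send())   # False: the transmitter stalls
    sender.receive_credits(1)
    print(sender.can_send())   # True: the transmitter resumes
```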
NI Design
• logic required to connect the nodes to the NoC.
• NIs can differ significantly depending on the nature of the
node
• Using a NI allows IPs and communication infrastructure
to be designed independently
• One end of a NI is connected to a router using the
selected flow control protocol
• the other to the node IP
• Since most IPs are designed to communicate through a
bus, the NI uses a bus interface
• NI is not simply a protocol adapter from a processor bus
to a router port.
• Ideally, the NI must offer the processing cores the view
of a shared memory system, and the network itself
should be transparent.
NI services
• adaptation services
– packetization/depacketization
– protocol conversion and clock domain crossing.
– absolute minimum services required of the NI so that data can
be sent and received on the NoC
• transaction reordering services,
• error and flow control services
– error detection and/or correction
– request retransmission when required
• route computation services
– Source routing
• upper layer services
– Cache coherence
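The adaptation and reordering services can be illustrated with a minimal packetization sketch. The `Packet` fields, the function names and the four-byte payload size are assumptions chosen for this example only.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dest: int       # destination node address
    src: int        # source node address
    seq: int        # sequence number used for reordering
    payload: bytes


def packetize(message, src, dest, max_payload=4):
    """Split a message from the node into packets for the NoC."""
    chunks = range(0, len(message), max_payload)
    return [Packet(dest, src, seq, message[start:start + max_payload])
            for seq, start in enumerate(chunks)]


def depacketize(packets):
    """Rebuild the message, restoring the order from the sequence numbers."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


if __name__ == "__main__":
    packets = packetize(b"hello noc!", src=1, dest=2)
    print(len(packets))                    # 3
    print(depacketize(reversed(packets)))  # b'hello noc!'
```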
Typical NoC Packet Format
[Figure: packet made of Header, Payload and Tail. Header: packet type, routing (either DA and SA, or SA and HOP 0, HOP 1, ..., HOP N), length. Payload: control, address, data. Tail: SN, error control.]
• Header
– Routing and network control information.
– In the case of distributed routing, the information required is the destination and source addresses.
– In the case of source routing, the complete routing information is written.
– In the case of variable packet size, a length field is required.
• Payload
• Tail
– Sequence number
– Error control fields such as Hamming code or CRC fields
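A header/payload/tail layout of this kind can be sketched with Python's `struct` and `zlib` modules. The field widths, the field order and the choice of CRC-32 are assumptions for illustration; the slides do not specify them.

```python
import struct
import zlib

HEADER = struct.Struct(">BBBH")  # packet type, destination, source, payload length
TAIL = struct.Struct(">HI")      # sequence number, CRC-32 over header and payload


def build_packet(ptype, dest, src, seq, payload):
    header = HEADER.pack(ptype, dest, src, len(payload))
    crc = zlib.crc32(header + payload)
    return header + payload + TAIL.pack(seq, crc)


def parse_packet(raw):
    ptype, dest, src, length = HEADER.unpack_from(raw, 0)
    body_end = HEADER.size + length
    payload = raw[HEADER.size:body_end]
    seq, crc = TAIL.unpack_from(raw, body_end)
    if zlib.crc32(raw[:body_end]) != crc:
        raise ValueError("CRC mismatch: packet is corrupt")
    return ptype, dest, src, seq, payload


if __name__ == "__main__":
    raw = build_packet(ptype=1, dest=2, src=3, seq=7, payload=b"data")
    print(parse_packet(raw))  # (1, 2, 3, 7, b'data')
```

A receiving NI that gets a `ValueError` here would be the point at which it requests retransmission.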
Source vs Distributed Routing
• In source routing the entire routing path is computed at
the source and appended to the packet.
– The routers do not make any routing decisions,
• In distributed routing, the routing path is decided on a hop-by-hop basis at each router, even for deterministic routing algorithms.
– The only information required to be found in the packet is the
destination address.
• The advantage of source routing is that it requires simple
routers and can easily support irregular architectures. Its
disadvantage is that it does not provide adaptiveness
and requires more complex NIs and packets.
[Figure: distributed routing (DR) principle: the packet carries the destination address (2,2) and crosses routers (0,1), (1,1), (2,1), (2,2). Source routing (SR) principle: the packet carries the route E, E, S, PE.]
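The two principles can be contrasted in a short sketch. XY (dimension-ordered) routing is used here as an assumed example of a deterministic algorithm; the slides do not name one. The direction names, with E increasing x and S increasing y, are also assumptions.

```python
MOVES = {"E": (1, 0), "W": (-1, 0), "S": (0, 1), "N": (0, -1)}


def xy_next_direction(current, dest):
    """Distributed routing: the decision each router makes from the DA alone."""
    if current[0] != dest[0]:
        return "E" if dest[0] > current[0] else "W"
    if current[1] != dest[1]:
        return "S" if dest[1] > current[1] else "N"
    return "PE"  # arrived: deliver to the local processing element


def compute_source_route(src, dest):
    """Source routing: the whole route is computed once and put in the header."""
    route, pos = [], src
    while True:
        step = xy_next_direction(pos, dest)
        route.append(step)
        if step == "PE":
            return route
        pos = (pos[0] + MOVES[step][0], pos[1] + MOVES[step][1])


def follow_source_route(src, route):
    """Routers only read the next entry of the route; they decide nothing."""
    path, pos = [src], src
    for step in route:
        if step == "PE":
            break
        pos = (pos[0] + MOVES[step][0], pos[1] + MOVES[step][1])
        path.append(pos)
    return path


if __name__ == "__main__":
    route = compute_source_route((0, 1), (2, 2))
    print(route)                               # ['E', 'E', 'S', 'PE']
    print(follow_source_route((0, 1), route))  # [(0, 1), (1, 1), (2, 1), (2, 2)]
```

The source-routed header grows with the path length, which is the packet and NI complexity cost mentioned above, while the distributed version needs only the destination address.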