Networks-on-chip
On-chip Communication
Rene van Leuken, EEMCS-CAS
05/10/15
Delft University of Technology
Challenge the future
On-chip Networks
What is Interconnect?
What are the problems?
Interconnect
•  In telecommunications, interconnection is the physical linking
of a carrier's network with equipment or facilities not
belonging to that network.
•  On ‘chips’, interconnects serve as the streets and highways of
the integrated circuit (IC), connecting elements of the IC into
a functioning whole and to the outside world. Interconnect
levels (or metal layers) vary in numbers depending on the
complexity of the device.
•  Major difference: dimensions/length/width
Scalability of Interconnect on-Chip
‘Standard’ interconnect throughput decline
Scalability – Area and Power in NoCs
For the same performance, compare wire area and power:
Architecture       Wire area            Power
NoC                $O(n)$               $O(n)$
Point-to-point     $O(n^2\sqrt{n})$     $O(n\sqrt{n})$
Simple bus         $O(n^3\sqrt{n})$     $O(n\sqrt{n})$
Segmented bus      $O(n^2\sqrt{n})$     $O(n\sqrt{n})$
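To get a feel for how quickly these asymptotic costs diverge, the following Python sketch evaluates the wire-area growth terms for a few values of n. It assumes the reading of this slide in which wire area grows as n for a NoC, as n²√n for point-to-point and segmented-bus interconnect, and as n³√n for a simple bus. Constant factors are dropped, so only the ratios are meaningful; the names `AREA_GROWTH` and `area_relative_to_noc` are illustrative.

```python
import math

# Asymptotic wire-area growth terms with constants dropped
# (an assumed reading of the slide, not measured data).
AREA_GROWTH = {
    "NoC": lambda n: n,
    "Point-to-point": lambda n: n**2 * math.sqrt(n),
    "Simple bus": lambda n: n**3 * math.sqrt(n),
    "Segmented bus": lambda n: n**2 * math.sqrt(n),
}


def area_relative_to_noc(n):
    """Return each architecture's area growth term divided by the NoC term."""
    noc = AREA_GROWTH["NoC"](n)
    return {name: growth(n) / noc for name, growth in AREA_GROWTH.items()}


if __name__ == "__main__":
    for n in (16, 64, 256):
        ratios = area_relative_to_noc(n)
        print(n, {name: round(ratio) for name, ratio in ratios.items()})
```

Under these assumptions the gap between the NoC term and the bus terms widens rapidly as n grows, which is the point the slide makes.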
Similarities and differences between NoCs and
Computer Networks (CN)
Similarities
•  Consist of network nodes (router/switch, link, PE)
•  Use packet switching
•  Flits/packets use a header flit that contains protocol information such as routing
•  Implement communication protocols such as routing, arbitration and flow control
Differences
•  NoCs are designed toward an application domain, while CNs are general purpose
•  NoC topology is fixed by design, while CNs support plug-and-play routers
•  Energy is an important constraint in NoCs, so low-power techniques are needed
•  NoCs can't support heavy communication protocols
1. Systems-on-Chip
Multi-core/Many-core
MPSoC
Chip MultiProcessors (CMPs)
IBM Cell:
Parameter: Value
Technology process: 90 nm SOI with low-κ dielectrics and 8 metal layers of copper interconnect
Chip area: 235 mm^2
Number of transistors: ~234M
Operating clock frequency: 4 GHz
Power dissipation: ~100 W
Percentage of power dissipation due to global interconnect: 30-50%
Intra-chip, inter-core communication bandwidth: 1.024 Tbps, 2 Gb/sec/lane (four shared buses, 128 bits data + 64 bits address each)
I/O communication bandwidth: 0.819 Tbps (includes external memory)
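The quoted intra-chip bandwidth can be checked with a few lines of arithmetic. The Python sketch below assumes that the 1.024 Tbps figure counts only the 128 data lanes of each of the four shared buses, each lane running at 2 Gb/s; the constant and function names are illustrative.

```python
# Values taken from the IBM Cell parameters listed above.
NUM_BUSES = 4            # four shared buses
DATA_LANES_PER_BUS = 128  # data bits per bus (address lanes are assumed excluded)
GBPS_PER_LANE = 2        # 2 Gb/s per lane


def aggregate_bandwidth_tbps(buses, lanes, gbps_per_lane):
    """Aggregate data bandwidth in Tb/s, using 1 Tb/s = 1000 Gb/s."""
    return buses * lanes * gbps_per_lane / 1000


if __name__ == "__main__":
    print(aggregate_bandwidth_tbps(NUM_BUSES, DATA_LANES_PER_BUS, GBPS_PER_LANE))
```

With these inputs the product is 4 × 128 × 2 Gb/s = 1024 Gb/s, which matches the 1.024 Tbps in the table.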
Today’s heterogeneous SOCs
•  The System-on-Chip (SoC) today
– Heterogeneous: ~10 IPs
– Homogeneous (MP-SoC): ~10 uP (with exceptions)
– On-chip bus (AMBA, CoreConnect, Wishbone, …)
•  Near and long-term forecast
– ≥ 100 IP/uP: buses are not scalable!
– Physical design issues: signal integrity, power consumption, timing closure
– Need for “more regular” design
[Figure: SoC with DMA, DSP, CPU, MEM, dedicated IP (MPEG) and I/O blocks on an interconnection network (bus), with locally synchronous clock domains]
Computation vs Communication:
A growing gap
•  Focus on communication-centric design
– Poor wire scaling: interconnect power and delay become more dominant as the technology improves
– High performance
– Energy efficiency: the communication architecture takes a large proportion of the energy budget
2. Interconnect
Busses
Shared bus interconnection
infrastructure for SoC
What Is a Bus?
• A Bus is:
• Shared communication link.
• Single set of wires used to connect multiple
subsystems.
• A bus is also a fundamental tool for composing large,
complex systems.
• Systematic means of abstraction.
What Defines a Bus?
3. On-chip Interconnect
2D Networks on Chip
Bus vs Networks-on-Chip (NoCs)
[Figure labels: bus-based architectures, irregular architectures, regular architectures]
•  Bus-based interconnect
– Low cost
– Easier to implement
– Flexible
•  Networks on Chip
– Layered approach
– Buses replaced with networked architectures: better electrical properties, higher bandwidth, energy efficiency, scalable
Networks-on-Chip
• It is clear that even with significant design effort the bus-style interconnect is not going to be sufficient for large SoCs:
– the physical implementation does not scale: bus fan-out, loading,
arbitration depth all reduce operating frequency
– the available bandwidth does not scale: the single bus must be
shared by all masters and slaves
• Let's start again: leverage research from data networking
Packet switched network
communication infrastructure
Introduction
•  Network-on-chip (NoC) is a packet switched on-chip
communication network designed using a layered methodology
•  “routes packets, not wires”
•  NoCs use packets to route data from the source to the
destination PE via a network fabric that consists of
•  switches (routers)
•  interconnection links (wires)
Regular Network on Chip
[Figure: regular network of nine processing elements (PE) connected through routers]
Typical NoC Router
[Figure: router with buffers (each marked H), a crossbar switch, and routing and arbitration blocks]
•  This example uses a centralized arbiter for all I/O ports
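As a rough illustration of what a centralized arbiter does, the Python sketch below decides once per cycle which input port may use each output port of the crossbar. The round-robin policy and the names `CentralArbiter` and `arbitrate` are assumptions made for this example; the slide does not state which arbitration policy the router uses.

```python
class CentralArbiter:
    """One arbiter serving every output port of the crossbar.

    The round-robin policy is an assumption for illustration only.
    """

    def __init__(self, num_ports):
        self.num_ports = num_ports
        # Per output port: the input that was granted most recently.
        self.last_granted = [num_ports - 1] * num_ports

    def arbitrate(self, requests):
        """requests[i] is the output port wanted by input i, or None.

        Returns {output: input} with at most one input granted per output.
        """
        grants = {}
        for out in range(self.num_ports):
            # Start searching just after the input granted last time.
            for offset in range(1, self.num_ports + 1):
                inp = (self.last_granted[out] + offset) % self.num_ports
                if requests[inp] == out:
                    grants[out] = inp
                    self.last_granted[out] = inp
                    break
        return grants


if __name__ == "__main__":
    arbiter = CentralArbiter(4)
    # Inputs 0 and 1 both want output 2; input 3 wants output 0.
    contended = [2, 2, None, 0]
    print(arbiter.arbitrate(contended))
    print(arbiter.arbitrate(contended))
```

When two inputs keep requesting the same output, successive calls alternate the grant between them, so neither input is starved.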
Introduction
• NoCs are an attempt to scale down the concepts of
large-scale networks and apply them to the
embedded system-on-chip (SoC) domain
• NoC Properties
•  Regular geometry that is scalable
•  Flexible QoS guarantees
•  Higher bandwidth
•  Reusable components
•  Buffers, arbiters, routers, protocol stack
•  No long global wires (or global clock tree)
•  No problematic global synchronization
•  GALS: Globally asynchronous, locally synchronous design
•  Reliable and predictable electrical and physical properties
ISO/OSI network protocol stack model
4. 2D Networks on Chip
Topology
NoC Topologies
Mesh topology
General binary tree topology used in NoC
Torus topology
Irregular topology
General ring topology used in NoC
Mixed topology
5. 2D Networks on Chip
Switching strategy
Routing algorithms
Switching and Router
General switching process
General NoC communication architecture
General 2-D mesh router architecture
Packets/Flits
•  A message is broken into multiple packets (each packet
has header information that allows the receiver to
re-construct the original message)
•  A packet may itself be broken into flits – flits do not
contain additional headers
•  Two packets can follow different paths to the destination;
the flits of a packet are always ordered and follow the same path
•  Such an architecture allows the use of a large packet
size (low header overhead) and yet allows fine-grained
resource allocation on a per-flit basis
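The message, packet and flit hierarchy described above can be sketched in a few lines of Python. The header layout (source, destination, sequence number, one byte each) and the names `Packet`, `packetize`, `to_flits` and `reassemble` are hypothetical choices for this example, not a format defined in the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    src: int        # source node id (hypothetical header field, 0..255)
    dst: int        # destination node id (hypothetical header field, 0..255)
    seq: int        # position of this packet in the original message (0..255)
    payload: bytes


def packetize(message: bytes, src: int, dst: int, payload_size: int) -> List[Packet]:
    """Split a message into packets; every packet carries its own header fields."""
    return [
        Packet(src, dst, seq, message[start:start + payload_size])
        for seq, start in enumerate(range(0, len(message), payload_size))
    ]


def to_flits(packet: Packet, flit_size: int) -> List[bytes]:
    """Split a packet into flits: one head flit with the header, then headerless body flits."""
    head = bytes([packet.src, packet.dst, packet.seq])
    body = [
        packet.payload[start:start + flit_size]
        for start in range(0, len(packet.payload), flit_size)
    ]
    return [head] + body


def reassemble(packets: List[Packet]) -> bytes:
    """Rebuild the message at the receiver, even if packets arrived out of order."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))


if __name__ == "__main__":
    packets = packetize(b"networks-on-chip!", src=1, dst=5, payload_size=8)
    print(len(packets), [len(f) for f in to_flits(packets[0], flit_size=4)])
    print(reassemble(list(reversed(packets))))
```

Because only the head flit carries routing information, the body flits have to follow it along the same path, while separate packets can be reordered and are put back in sequence by `reassemble`.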
Switching strategies
• Determine how data flows through routers in the network
• Define granularity of data transfer and applied switching
technique
•  phit is a unit of data that is transferred on a link in a single cycle
•  typically, phit size = flit size
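When the phit is narrower than the flit, a flit needs several link cycles to cross a link. The small Python sketch below computes that count; the function name `link_cycles_per_flit` and the bit widths used in the example are illustrative assumptions.

```python
def link_cycles_per_flit(flit_bits: int, phit_bits: int) -> int:
    """Number of link cycles needed to move one flit, one phit per cycle."""
    if flit_bits <= 0 or phit_bits <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: a partially filled last phit still costs a full cycle.
    return (flit_bits + phit_bits - 1) // phit_bits


if __name__ == "__main__":
    print(link_cycles_per_flit(128, 128))  # typical case: phit size = flit size
    print(link_cycles_per_flit(128, 32))   # narrower link
```

In the typical case noted above, where phit size equals flit size, every flit takes exactly one cycle per link.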
Routing algorithms
•  Responsible for correctly and efficiently routing packets or circuits from the source
to the destination
•  Choice of a routing algorithm depends on trade-offs between several potentially
conflicting metrics
•  minimizing power required for routing
•  minimizing logic and routing tables to achieve a lower area footprint
•  increasing performance by reducing delay and maximizing traffic utilization of
the network
•  improving robustness to better adapt to changing traffic needs
•  Routing schemes can be classified into several categories
•  static or dynamic routing
Routing algorithms
•  Static and dynamic routing
•  static routing: fixed paths are used to transfer data between a
particular source and destination
•  does not take into account current state of the network
•  advantages of static routing:
•  easy to implement, since very little additional router logic is required
•  in-order packet delivery if single path is used
•  dynamic routing: routing decisions are made according to the
current state of the network
•  considering factors such as availability and load on links
•  path between source and destination may change over time
•  as traffic conditions and requirements of the application change
•  more resources needed to monitor state of the network and
dynamically change routing paths
•  able to better distribute traffic in a network
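The contrast between static and dynamic routing can be sketched for a 2D mesh. The Python example below uses dimension-ordered XY routing as the static scheme and a minimal adaptive scheme that picks the least-loaded productive link as the dynamic one. Both are common textbook choices that the slides do not name, so treat them, together with the `link_load` dictionary and all function names, as assumptions. Deadlock avoidance is not addressed.

```python
def productive_hops(cur, dst):
    """Neighbors of cur that are one step closer to dst in a 2D mesh."""
    (cx, cy), (dx, dy) = cur, dst
    hops = []
    if cx != dx:
        hops.append((cx + (1 if dx > cx else -1), cy))
    if cy != dy:
        hops.append((cx, cy + (1 if dy > cy else -1)))
    return hops


def xy_next_hop(cur, dst):
    """Static routing: always correct x first, then y, ignoring network state."""
    hops = productive_hops(cur, dst)
    return hops[0] if hops else None


def adaptive_next_hop(cur, dst, link_load):
    """Dynamic routing: among productive hops, take the least-loaded link.

    link_load maps (from_node, to_node) to a load estimate (hypothetical metric).
    """
    hops = productive_hops(cur, dst)
    if not hops:
        return None
    return min(hops, key=lambda nxt: link_load.get((cur, nxt), 0))


def route(src, dst, next_hop):
    """Follow next_hop decisions from src until dst is reached."""
    path = [src]
    while path[-1] != dst:
        path.append(next_hop(path[-1], dst))
    return path


if __name__ == "__main__":
    print(route((0, 0), (2, 1), xy_next_hop))
    congested = {((0, 0), (1, 0)): 5}  # the x-direction link out of (0, 0) is busy
    print(route((0, 0), (2, 1), lambda c, d: adaptive_next_hop(c, d, congested)))
```

The static route is the same every time, which gives in-order delivery, while the adaptive route changes with `link_load` and needs that load information to be collected somewhere, which is the extra cost mentioned above.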
6. 3D Networks on Chip
Many-core processors
•  Single-core performance reaching its limits
•  Custom ASICs expensive and time-consuming to design
•  Focus is on programmable arrays with multiple processing elements
•  Each processing element executes a software task
•  Changing application standards now only require software to be updated; the hardware platform remains constant
[Figure: UC Davis AsAP2 (2008)]
Mixed Integration
•  High integration density: n times as many processors/logic blocks in the same area
•  Smaller dies: lower cost, smaller thermal gradients
•  3D interconnect structure: smaller hop count = lower latency and higher performance
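The hop-count argument can be illustrated by comparing the average distance between nodes in a 2D mesh and in a 3D mesh with the same number of nodes. The Python sketch below assumes mesh topologies, minimal routing, and a vertical hop that costs the same as a horizontal one; the 8×8 and 4×4×4 sizes and the function name `average_hops` are chosen only for illustration.

```python
from itertools import product


def average_hops(dims):
    """Average Manhattan distance between distinct nodes of a mesh with the given dimensions."""
    nodes = list(product(*(range(d) for d in dims)))
    total = sum(
        sum(abs(a - b) for a, b in zip(u, v))
        for u in nodes
        for v in nodes
    )
    ordered_pairs = len(nodes) * (len(nodes) - 1)
    return total / ordered_pairs


if __name__ == "__main__":
    print("2D 8x8   :", average_hops((8, 8)))      # 64 nodes in a single layer
    print("3D 4x4x4 :", average_hops((4, 4, 4)))   # 64 nodes stacked in four layers
```

Under these assumptions the stacked arrangement gives a noticeably lower average hop count for the same number of nodes.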
Challenge 1: Architecture
•  2D or 3D, architectural challenges are similar
•  Raw execution performance:
– Task scheduling/mapping/load-balancing: where to execute what, and when
•  Memory hierarchy:
– What data to keep on-chip?
– How to improve the efficiency of on-chip caches?
•  Interconnect:
– How to reduce transfer latency and maintain high throughput
– How to build in adaptability, redundancy, security
Challenge 3: Physical Effects
•  3D stacking reduces lateral thermal gradients: vertical conduction is 16x more!
•  TSV topology affects vertical conduction
•  Processors on lower tiers run hot, those close to the heat sink run cooler
•  Thermal TSVs to evacuate heat from lower tiers
•  Active power management strategy is required!
[Figure: 3D-ICE (EPFL)]
7. Wrap-up
Trends
•  Move towards hybrid interconnection fabrics
•  NoC-bus based
•  Custom, heterogeneous topologies
•  New interconnect paradigms
•  Optical
•  Wireless
•  Carbon nanotube
Status and Open Problems
• Power
•  complex NI and switching/routing logic blocks are power hungry
•  several times greater than for current bus-based approaches
• Latency
•  additional delay to packetize/de-packetize data at NIs
•  flow/congestion control and fault tolerance protocol overheads
•  delays at the numerous switching stages encountered by
packets
•  even circuit switching has overhead (e.g. SOCBUS)
•  lags behind what can be achieved with bus-based/dedicated
wiring
• Lack of tools and benchmarks
• Simulation speed
•  GHz clock frequencies, large network complexity, greater
number of PEs slow down simulation
8. Questions
Sources
Adaptive NoC for reconfigurable SoC, Istas Pratomo
Courtesy: G. Konstadinidis et al., “Architecture and Physical Implementation of a Third Generation 65 nm, 16 Core, 32 Thread Chip-Multithreading SPARC Processor”
J. Hennessy and D. Patterson, Computer Architecture: A Quantitative Approach, 5th edition, Morgan Kaufmann, San Francisco, 2011.
HPEC 2007, Lexington, MA, 2007
E. Bolotin et al., “Cost Considerations in Network on Chip”, Integration, special issue on Network on Chip, October 2004
ICS 295, Sudeep Pasricha and Nikil Dutt
Slides based on book chapter 12