NETWORK-ON-CHIP (NOC): A New SoC Paradigm
Dr. Konstantinos Tatas

PRESENTATION OUTLINE
• Introduction
• Part A
  – Motivation – SoC Communication
  – Current Solutions
  – NoC Concept
• Part B
  – Work@MicroLab
• Summary

THE MANY CORES ERA
Source: International Roadmap for Semiconductors 2007 edition (http://www.itrs.net/)

THE GROWING GAP: COMPUTATION VS. COMMUNICATION
(Figure labels: 2:1, 9:1. Taken from ITRS, 2001)

GROWING CHIP DENSITY
• 1998: ASIC, 0.35 µm
• 2012: SoC, 22 nm (figure labels: Memory, I/O, P)
• Design complexity: high IP reuse
• Efficient high-performance interconnect
• Scalability of communication architecture

TRADITIONAL SOC NIGHTMARE
The “Board-on-a-Chip” Approach
(Figure: System Bus, DMA, CPU, Mem Ctrl., MPEG, DSP, Bridge, Peripheral Bus, Control Wires)
• Variety of dedicated interfaces
• Poor separation between computation and communication
• Design complexity
• Unpredictable performance
  – The architecture is tightly coupled

COMPUTATIONAL DEMANDS OF FUTURE MULTIMEDIA APPLICATIONS

MEMORY BANDWIDTH SCALES PROPORTIONALLY
(Figure: bandwidth (BW), today vs. 2012-2015)
Sources:
• K. Uchiyama, “Power-Efficient Heterogeneous Parallelism for Digital Convergence”, VLSI Circuit Digest of Technical Papers, IEEE, pp. 6-9, June 2008
• Jian Li, “3D Integration opportunities and challenges”, ISCAS 2008 tutorial on 3D

SHARED ADDRESS SPACE COMMUNICATIONS
• System bus
• Cross-bar
• Multi-stage
• Network on chip

AN NOC EXAMPLE
Source: ossum, Intel @ MPSoC’07

NOC TOPOLOGIES
• Regular topologies: general-purpose on-chip multiprocessors
• Custom topologies:

NOC VS. “OFF-CHIP” NETWORKS
What is different?
• Routers on a planar grid topology
• Short point-to-point links between routers
• Unique VLSI cost sensitivity:
  – Area: routers and links
  – Power
• No legacy protocols to be compliant with …
• No software: simple and hardware-efficient protocols
• Different operating environment (no dynamic changes and failures)
• Custom network design: you design what you need!
  – Example 1: Replace modules
  – Example 2: Adapt links

NOC COST SCALABILITY VS. ALTERNATIVES
• Compare the cost of:
  – NoC
  – Non-Segmented Bus (NS-Bus)
  – Segmented Bus (S-Bus)
  – Point-To-Point (PTP)

WHY NOC?
Bus                                               | NoC
Longer connections, higher parasitic capacitance  | Performance does not degrade with network scaling
Arbitration grows and becomes a bottleneck        | Arbitration and routing are distributed
Bandwidth is limited and shared by all cores      | Aggregated bandwidth scales with network size

WHAT ARE THE MAIN CHALLENGES?
• Communication infrastructure
• Communication paradigm selection
• Application mapping optimization
• Programming model
• Physical design
• Design automation/tool-flow integration

BASIC SWITCHING TECHNIQUES
• Circuit Switching: A real or virtual circuit establishes a direct connection between source and destination.
• Packet Switching: Each packet of a message is routed independently. The destination address has to be provided with each packet.
• Store and Forward Packet Switching: The entire packet is stored and then forwarded at each switch.
• Cut Through Packet Switching: The flits of a packet are pipelined through the network. The packet is not completely buffered in each switch.
• Virtual Cut Through Packet Switching: The entire packet is stored in a switch only when the header flit is blocked due to congestion.
• Wormhole Switching: Cut-through switching in which all flits are blocked on the spot when the header flit is blocked.

CIRCUIT SWITCHING (ARE THEY NOC?)
Phases:
• Circuit setup
• Transmission
• Tear down
Disadvantages:
• Exclusive allocation of resources
• Long setup phase
Advantages:
• High performance: throughput and latency
• Low power consumption
• Low overhead during transmission phase
• Predictable transmission

PACKET SWITCHING VS CIRCUIT SWITCHING

NOC ROUTER

NoC-based MPSoC
• Nodes
  – Processing Elements (PEs), such as CPUs, custom IPs, DSPs, etc.
  – Storage elements (embedded memory blocks)
• Routers
• Links
• Network Interfaces (NIs)
• Often a switch together with its host node memory is referred to as a tile.

NoC Topologies
• Regular/irregular
• Direct/indirect
  – Direct: each node has a direct point-to-point link to a subset of other nodes in the system, called neighboring nodes

2D Mesh
(Figure: grid of routers (R), each attached to an IP core)
• Simplest and most popular topology for NoCs.
• Every switch, except those at the edges, is connected to four neighboring switches and one node.

2D Torus
(Figure: grid of routers (R) and IP cores with wrap-around links)
• Layout of a regular mesh, except that nodes at the edges are connected to switches at the opposite edge via wrap-around routing channels.
• Every switch has five ports.
• The limitation of this topology lies in the long end-around connections.

Octagon
(Figure: ring of routers (R) and IP cores)
• Well-established direct topology found in NoCs.
• Ring of 8 nodes connected by 12 bi-directional links.
• Links provide at most two-hop communication between any pair of nodes in the ring.
• Simple algorithms for fast yet efficient shortest-path routing.
• If a platform consists of more than eight nodes, the octagon is extended to multidimensional space.

Fat-tree and butterfly fat-tree
(Figure: tree of routers (R) with IP cores at the leaves)
• Nodes are connected to an architecture's external switch.
• Switches have point-to-point links to other switches.
• Processing units and memory modules are assigned to the leaves of the tree.
• Switches are placed at the vertices.
• Communication involves climbing up and down some part of the tree.
• A pair of coordinates ($l$, $p$) is used to label each node, where $l$ denotes a node's level and $p$ gives its position within this level.

Polygon
(Figure: ring of routers (R) and IP cores)
• Widely accepted topology.
• Packets travel in a loop from one router to the next.
• Chords can be added to the circle.
• If chords are inserted only between opposite routers, the topology is called a spidergon.

Star
(Figure: central CORE router surrounded by routers (R) and IP cores)
• Central router in the middle of the star.
• Computational resources, or subnetworks, in the spikes of the star.
• The capacity requirements of the central router are quite large.
• Significant possibility of congestion in the middle of the star.

Flow Control
• Intra-switch
• Switch-to-switch
  – Buffered
  – Bufferless
• End-to-end

ACK/NACK
• Handshaking protocol.
• When a sender puts data on the link, it activates a VALID signal.
• When the receiver is ready to consume the valid data, it activates the corresponding ACK signal.
• If the data is corrupt or there is no buffer space to store it, a NACK signal is activated instead.
• Upon receipt of a NACK, the sender resends flits, starting from the unacknowledged one.
• Inherently supports fault tolerance.
• Additional buffer space is required to keep sent flits in case retransmission is required.
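
The ACK/NACK handshake above can be sketched as a small cycle-by-cycle simulation. This is only an illustration under stated assumptions, not the protocol of any particular NoC: the function name `simulate_ack_nack`, the fixed `round_trip` delay, and the rule that a flit is corrupted only on its first transmission are all hypothetical choices made for the example. NACKs caused by a full receiver buffer are not modelled.

```python
def simulate_ack_nack(flits, corrupt_first_try=(), round_trip=2):
    """Toy model of ACK/NACK flow control with resend-from-unacknowledged.

    flits: payloads to deliver in order.
    corrupt_first_try: sequence numbers corrupted on their first transmission
        (assumption made for this example).
    round_trip: cycles between sending a flit and seeing its ACK/NACK
        (assumption; must be >= 1).
    Returns (delivered, sent_log, max_outstanding).
    """
    corrupt_pending = set(corrupt_first_try)
    delivered = []        # receiver side: flits accepted in order
    sent_log = []         # sequence numbers in the order they hit the link
    responses = []        # (arrival_cycle, seq, "ACK" or "NACK") heading back
    base = 0              # oldest flit not yet acknowledged
    next_to_send = 0
    max_outstanding = 0   # largest number of sent-but-unacknowledged flits
    cycle = 0

    while base < len(flits):
        # 1. The sender handles the response arriving in this cycle, if any.
        if responses and responses[0][0] == cycle:
            _, seq, kind = responses.pop(0)
            if kind == "ACK":
                base = seq + 1
            else:
                # NACK: go back to the unacknowledged flit. Every later
                # response in flight is a NACK for an out-of-order flit,
                # so it can be discarded.
                next_to_send = base
                responses.clear()

        # 2. The sender puts the next flit on the link (VALID asserted).
        if next_to_send < len(flits):
            seq = next_to_send
            sent_log.append(seq)
            corrupt = seq in corrupt_pending
            corrupt_pending.discard(seq)
            # The receiver accepts only an uncorrupted, in-order flit.
            if not corrupt and seq == len(delivered):
                delivered.append(flits[seq])
                kind = "ACK"
            else:
                kind = "NACK"
            responses.append((cycle + round_trip, seq, kind))
            next_to_send += 1
            max_outstanding = max(max_outstanding, next_to_send - base)

        cycle += 1

    return delivered, sent_log, max_outstanding


if __name__ == "__main__":
    result = simulate_ack_nack(["a", "b", "c", "d"], corrupt_first_try={1})
    print(result)
```

In this model `max_outstanding` is the number of flits the sender must keep buffered for possible retransmission, which is the extra buffer cost the last bullet refers to.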

Stall/go
• Requires just two control wires:
  – one going forward, signifying data availability
  – one going backward, signaling either a condition of buffers filled ("STALL") or of buffers free ("GO")

Credit-based
• The transmitter has a "credit" counter, initialized to the number of empty buffer slots of the receiver.
• It decrements the counter every time a flit is sent.
• The credit counter must be updated when the receiver consumes or forwards a flit and therefore increases its buffer space.
• A credit value is sent back to the transmitter to be added to the current value of the credit counter.
• The transmitter stalls when the credit value is zero and resumes when its value increases again.

NI Design
• Logic required to connect the nodes to the NoC.
• NIs can differ significantly depending on the nature of the node.
• Using an NI allows IPs and communication infrastructure to be designed independently.
• One end of an NI is connected to a router using the selected flow control protocol, the other to the node IP.
• Since most IPs are designed to communicate through a bus, the NI uses a bus interface.
• The NI is not simply a protocol adapter from a processor bus to a router port.
• Ideally, the NI must offer the processing cores the view of a shared memory system, and the network itself should be transparent.

NI services
• Adaptation services
  – packetization/depacketization
  – protocol conversion and clock domain crossing
  – absolute minimum services required of the NI so that data can be sent and received on the NoC
• Transaction reordering services
• Error and flow control services
  – error detection and/or correction
  – request retransmission when required
• Route computation services
  – source routing
• Upper layer services
  – cache coherence

Typical NoC Packet Format
(Figure: packet layout. Header: packet type, routing, length. Payload: control, address, data. Tail: SN, error control. The routing field holds either DA and SA, or SA followed by HOP 0, HOP 1, ..., HOP N.)
• Header
  – routing and network control information
  – In the case of distributed routing, the information required is the destination and source addresses.
  – In the case of source routing, the complete routing information is written.
  – In the case of variable packet size, a length field is required.
• Payload
• Tail
  – sequence number
  – error control fields such as Hamming code or CRC fields

Source vs Distributed Routing
• In source routing, the entire routing path is computed at the source and appended to the packet.
  – The routers do not make any routing decisions.
• In distributed routing, the routing path is decided on a hop-by-hop basis at each router, even for deterministic routing algorithms.
  – The only information required to be found in the packet is the destination address.
• The advantage of source routing is that it requires simple routers and can easily support irregular architectures. Its disadvantage is that it does not provide adaptiveness and requires more complex NIs and packets.
(Figure: DR principle, a packet carrying the destination address (2,2) through routers (R); SR principle, a packet carrying the hop list (0,1) (1,1) (2,1) (2,2) from the source PE.)
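
The difference between the two routing styles can be sketched on a 2D mesh. The slides do not name a routing algorithm, so dimension-ordered XY routing is assumed here purely for illustration; the function names and the (x, y) coordinate convention are hypothetical. The example coordinates are the ones visible in the figure.

```python
def xy_next_hop(current, dest):
    """Distributed routing: the local decision of a single router.

    Assumes XY (dimension-ordered) routing on a 2D mesh: move along x
    until the column matches, then along y. Returns None on arrival.
    """
    (x, y), (dx, dy) = current, dest
    if x != dx:
        return (x + (1 if dx > x else -1), y)
    if y != dy:
        return (x, y + (1 if dy > y else -1))
    return None  # arrived: hand the packet to the local node


def route_distributed(source, dest):
    """The packet carries only `dest`; every visited router picks the next hop."""
    path, current = [source], source
    while (nxt := xy_next_hop(current, dest)) is not None:
        path.append(nxt)
        current = nxt
    return path


def build_source_route(source, dest):
    """Source routing: the NI computes the full hop list for the header."""
    return route_distributed(source, dest)[1:]


def route_source(source, hops):
    """Routers make no decision: each one consumes the next hop from the header."""
    header = list(hops)
    path = [source]
    while header:
        path.append(header.pop(0))
    return path


if __name__ == "__main__":
    src, dst = (0, 1), (2, 2)
    print(route_distributed(src, dst))
    hops = build_source_route(src, dst)
    print(hops, route_source(src, hops))
```

Both functions yield the same path here because the source route was built with the same deterministic algorithm. The trade-off from the slide shows up in where the work happens: `xy_next_hop` runs in every router, while `build_source_route` runs once in the NI and makes the header grow with the hop count.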