Communication Synthesis: Buses and Network-on-Chip (NoC)
Dr. Eng. Amr T. Abdel-Hamid
Spring 2008
System-On-a-Chip Design, ELECT 1002

The SoC nightmare
[Figure: the "Board-on-a-Chip" approach. DMA, CPU, DSP, MPEG, and memory controller blocks share a system bus; a bridge connects it to a peripheral bus.]
• The architecture is tightly coupled
Source: Prof Jan Rabaey, CS-252-2000, UC Berkeley

Very long wires
• Year 2005: 1 ns (1 GHz)
• Year 2010: 0.1 ns (10 GHz)
[Figure: a wire running from block A to block B at each of the two technology points]

Why NoC?
• Global wire delays increase exponentially, or linearly when repeaters are inserted
• The delay may exceed one clock cycle even after repeater insertion
• In ultra-deep submicron processes, 80% or more of the delay of critical paths will be due to interconnections
• Communication structures need to be designed first, followed by the functional blocks

Homogeneous SoC (MP-SoC)
[Figure: eight CPU/MEM pairs attached to an interconnection network (bus, crossbar)]

Why not bus?
• The shared-medium arbitrated bus is the most frequently used on-chip interconnect architecture
Pros
• Simple, low area cost, and extensible
Cons
• The intrinsic parasitic resistance and capacitance can be quite high for a long bus line
• Every additional IP block adds to the parasitic capacitance and increases propagation delay
• The number of IP blocks that can be connected to the bus is limited

On-Chip Communication
[Figure: bus-based architectures, irregular architectures, regular architectures]
Bus-based interconnect
• Low cost
• Easier to implement
• Flexible
Networks on Chip
• Layered approach
• Buses replaced with networked architectures
• Better electrical properties
• Higher bandwidth
• Energy efficiency
• Scalable

Network on Chip
[Figure: layered stack: Software, Transport, Network, Data Link Layer, Wiring]
• Separation of concerns
• Communication-based design
• Orthogonalizes function and communication
• Builds on well-known models of computation and a correct-by-construction synthesis flow
• Parallels the layered approach exploited by the communications community

NoC
What is Network-on-Chip (NoC)?
• Leverages existing computer networking principles to improve inter-component, intra-chip communication for SoCs
• Each on-chip component is connected by a switch to particular communication wire(s)
• Improves on standard bus-based interconnections for SoC architectures in terms of throughput

SoC Current Trend
• Explicitly parallel SoC architectures
• Integrating huge amounts of memory in chip designs
• Distributed shared memory environments
• Should allow an interconnection-centric design flow and better predictability
• Physical design closure
• Wire delay dominates gate delay

Design goal of NoC
• High throughput
• Low latency
• Less energy consumption
• Small area requirements

Network-on-Chip Basics
• Architectures
• Routing strategies
• Evaluation
Figure 1: NoC Architecture (IP core, CNI, router logic, links to/from the network)

Routing: Circuit/Packet Switching
Circuit Switching
• A dedicated path, or circuit, is established over which data packets will travel
• Naturally lends itself to time-sensitive guaranteed service due to resource allocation
• Reservation of bandwidth decreases overall throughput and increases average delays
Packet Switching
• Intermediate routers are responsible for routing individual packets through the network, rather than all packets following a single path
• Provides so-called best-effort service

Routing: Wormhole/Virtual Cut Through
Wormhole Switching
• The message is divided into smaller, fixed-length flow units called flits
• Only the first flit contains routing information; subsequent flits follow it
• Buffer size is significantly reduced because only a limited number of flits needs to be buffered at any given time
Virtual Cut Through Switching
• Much like wormhole switching
• The header flit can travel ahead and undergo processing while the remaining flits are still navigating the network
• Higher acceptance rates and lower latencies than wormhole

Wormhole Switching
[Figure: wormhole switching]

Routing: Contention
• Contention occurs when routers or IP blocks attempt to send data over the same link at the same time
• For circuit switching, contention is resolved at connection setup time
• For packet switching, contention resolution is handled at a much finer level, by the router buffering and scheduling individual packets
• Packet-switched networks give better overall performance, at the cost of no service guarantee

Architectures: SPIN
• SPIN: Scalable, Programmable, Integrated Network
• Every level has the same number of switches
• Network grows like (N log N)/8
• Trades area overhead and decreased power efficiency for higher throughput
• Illustrative of the performance vs. power consumption trade-off

Architectures: CLICHE
• CLICHÉ: Chip-Level Integration of Communicating Heterogeneous Elements
• Two-dimensional mesh network layout for NoC design
• Every switch is connected to its four closest neighbouring switches and to a target resource block, except the switches on the edge of the layout
• Each connection consists of two unidirectional links

Architectures: Torus
• Similar to mesh-based architectures
• Wires wrap around from the top component to the bottom and from the rightmost to the leftmost
• Smaller hop count
• Higher bandwidth
• Decreased contention
• Increased chip space usage

Architectures: Folded Torus
• Similar to the torus
• In a torus, the long end-around connections can yield excessive delays
• This is avoided by folding the torus

Architectures: Octagon
• Standard model: 8 components, 12 interconnects
• Design complexity increases linearly with the number of nodes
• Largest packet travel distance is two hops
• High throughput
• Shortest-path routing is easy to implement

Architectures: BFT
• BFT: Butterfly Fat Tree
• Each node in the tree model has coordinates (level, position), where level is the depth and position counts from left to right
• Leaves are component blocks
• Interior nodes are switches
• Four child ports and two parent ports per switch
• log N levels; the ith level has n/2^(i+1) switches, where n is the number of leaves (blocks)
• Uses traffic aggregation to reduce congestion

Network interface
• Open Core Protocol (OCP)
• An interface standard between IP cores and the interconnection fabric

Packet Format
[Figure: packet fields]
• Type: Head, Data, Tail, and Complete
• VCID: Virtual Channel Identifier
• Route: N-bit route field, with the last 2 bits specifying the route to be used in the next controller
    00 - Left
    01 - Right
    10 - Straight
    11 - Extract
• Data: actual data field

Routing Example
[Figure: routing example]
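
The Route field in the packet format above can be read as a source route that each controller consumes two bits at a time. The sketch below is only an illustration of that idea: the slides give the four 2-bit codes, but the function names and the assumption that each controller shifts the field right by two bits after reading it are hypothetical.

```python
# Hypothetical decoder for the 2-bit-per-hop Route field.
# Assumption (not stated in the slides): each controller reads the two
# least significant bits, then shifts the field right by two bits before
# forwarding the packet.

DIRECTIONS = {0b00: "Left", 0b01: "Right", 0b10: "Straight", 0b11: "Extract"}


def next_hop(route):
    """Return (direction at this controller, route field for the next one)."""
    return DIRECTIONS[route & 0b11], route >> 2


def trace(route, max_hops=16):
    """Follow the route field until an Extract code or max_hops is reached."""
    hops = []
    for _ in range(max_hops):
        direction, route = next_hop(route)
        hops.append(direction)
        if direction == "Extract":
            break
    return hops


if __name__ == "__main__":
    # Read right to left: 10 (Straight), 01 (Right), 11 (Extract).
    print(trace(0b110110))
```

Under these assumptions the header only needs two bits per hop, and a controller never has to look past the low end of the field to make its decision.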

Simulation
A simulator is used to investigate various metrics:
• Each system consists of 256 functional IP blocks
• Wormhole routing is used
• The user can choose uniform or localized traffic
• Both Poisson and self-similar message injection distributions are supported
A flit is only one word (36 bits, of which 4 bits are for packet framing).

Area comparison
[Figure: area comparison]
SPIN and Octagon have a considerably higher silicon area overhead.

Projected performance
[Figure: projected performance]
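
To make the 36-bit flit from the Simulation slide concrete, here is a minimal sketch of how a message might be cut into flits for wormhole routing. It assumes a layout of 4 framing bits followed by 32 payload bits, invents numeric codes for the Head, Data, Tail, and Complete types, and treats Complete as a single-flit packet; none of these details are given in the slides.

```python
# Hypothetical flit layout: [4 framing bits | 32 payload bits] = 36 bits.
PAYLOAD_BITS = 32
FRAME_BITS = 4

# Assumed framing codes; the slides name the types but not their encoding.
HEAD, DATA, TAIL, COMPLETE = 0x1, 0x2, 0x3, 0x4


def make_flits(words):
    """Wrap each 32-bit payload word in a 36-bit flit with a framing code."""
    flits = []
    last = len(words) - 1
    for i, word in enumerate(words):
        if not 0 <= word < (1 << PAYLOAD_BITS):
            raise ValueError("payload word must fit in 32 bits")
        if last == 0:
            kind = COMPLETE  # assumed meaning: single-flit packet
        elif i == 0:
            kind = HEAD
        elif i == last:
            kind = TAIL
        else:
            kind = DATA
        flits.append((kind << PAYLOAD_BITS) | word)
    return flits


def split_flit(flit):
    """Return (framing code, payload word) for one 36-bit flit."""
    return flit >> PAYLOAD_BITS, flit & ((1 << PAYLOAD_BITS) - 1)


if __name__ == "__main__":
    for flit in make_flits([0xDEADBEEF, 0x0, 0x12345678]):
        print(f"{flit:09x}", split_flit(flit))
```

With this framing a router can tell from the top four bits alone whether a flit opens a packet, continues it, or closes it, which is what lets wormhole switching buffer only a few flits per port.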