ECE 526 – Network Processing Systems Design
Network Processor Introduction
Chapters 11 and 12: D. E. Comer

Goal
• Understand the inefficiency of 1st, 2nd, and 3rd generation network processing systems
  ─ Scalability plus flexibility
• Recognize the need for a new solution: the 4th generation (network processor technology)
• Learn
  ─ the courage to appreciate the challenges
  ─ the skill to characterize the “real” problem
  ─ the art of proposing an engineering solution
• Be aware that "network processor" is currently a conceptual and general term

Ning Weng ECE 526

Recall: 1st Generation
• 1st generation network processing system
• Feasibility study
  ─ Design a software router
    • Data rate: 10 Gbps
    • Assume small packets (64 B)
    • Assume each packet needs 10,000 instructions to process
  ─ Can Intel 80986@2007 do the job?
    • CPU: 24 GHz
    • MIPS: 125,000 (million instructions per second)
    • 1 billion transistors …
  ─ Conclusion: not feasible (about 19.5 million packets/s × 10,000 instructions ≈ 195,000 MIPS required, versus 125,000 MIPS available)
• What is the real problem here?

Real Problem is
• Technology push: uneven
  ─ Link bandwidth is scaling much faster than CPU and memory technology
  ─ Transistor scaling and VLSI technology help, but not enough
• Application pull: harder
  ─ More complex applications are required
  ─ Processing complexity is defined as the number of instructions and the number of memory accesses needed to process one packet
[Figure: growth (1x to 10^7x) versus year, 1975 to 2005. Link bandwidth: 2x / year; CPU: 2x / two years; memory latency improvement: 10% / year.]
[Figure: processing complexity. Hundreds of instructions per packet: Layer 2 switching, IPv4 routing. Thousands of instructions per packet: flow classification, intrusion detection, encryption.]

What is the ideal platform?
• Structured ASIC
• Network Processor
• Reconfigurable Coprocessors
• FPGA

2nd and 3rd Generations
• 2nd generation: offloading and decentralized
• 3rd generation: further offloading and use of specialized devices (ASIC + embedded processors)
• Problems: loss of flexibility and very high cost. Why?

Why not ASIC?
• High cost to develop
  ─ Network processing is a moderate-quantity market
• Long time to market
  ─ Network processing services change quickly
• Difficult to simulate
  ─ Complex protocols
• Expensive and time-consuming to change
• Little reuse across products
• Limited reuse across versions
• No consensus on framework or supporting chips
• Requires expertise

Network Processors
• Question: where does an NP gain higher performance from, compared with a conventional processor?

Instruction Set: minimality
• Not as general as a RISC or CISC processor
  ─ E.g., no floating-point instructions
  ─ Optimized for packet processing functions only
• Not specific to a protocol or part of a protocol
• Seek a minimal set of instructions sufficient to handle arbitrary protocols,
  ─ plus specific instructions for protocol processing
• Example: atomic operation
  ─ Hard problem; will be covered later

Architecture: multiprocessor
• Parallelism
  ─ The network processing workload is by nature highly parallel
    • Flow-level
    • Queue-level
    • Packet-level
    • Protocol-level
• Pipelining
  ─ Pipelining helps system performance at the cost of longer delay
  ─ Is this acceptable?
• System-on-chip
  ─ Processing: RISC core
  ─ Memory: registers, cache, instruction store, scratch pad, SRAM, and SDRAM
  ─ I/O: network / switch fabric interfaces
• Question: how hard is it to build and use these NPs?
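
To connect the 1st generation feasibility study with the parallelism argument, the following is a minimal sketch of the arithmetic, not a sizing method. It uses the figures from the feasibility study (10 Gbps, 64 B packets, 10,000 instructions per packet, 125,000 MIPS) and ignores link framing overhead. The per-engine rate `ENGINE_MIPS` and all function names are hypothetical, chosen only for illustration.

```python
import math

def packets_per_second(link_bps: float, packet_bytes: int) -> float:
    """Worst-case packet arrival rate when every packet has the minimum size."""
    return link_bps / (packet_bytes * 8)

def required_mips(link_bps: float, packet_bytes: int, instr_per_packet: int) -> float:
    """Instruction rate (in MIPS) needed to keep up with the link."""
    return packets_per_second(link_bps, packet_bytes) * instr_per_packet / 1e6

def engines_needed(required: float, per_engine_mips: float) -> int:
    """Number of packet-parallel engines needed, assuming perfect load balancing."""
    return math.ceil(required / per_engine_mips)

# Figures taken from the feasibility study.
LINK_BPS = 10e9            # 10 Gbps
PACKET_BYTES = 64          # small packets
INSTR_PER_PACKET = 10_000
CPU_MIPS = 125_000         # the single CPU in the study

# Hypothetical assumption: each simple processing engine delivers 1,000 MIPS.
ENGINE_MIPS = 1_000

need = required_mips(LINK_BPS, PACKET_BYTES, INSTR_PER_PACKET)
print(f"packets per second : {packets_per_second(LINK_BPS, PACKET_BYTES):,.0f}")
print(f"required MIPS      : {need:,.1f}")
print(f"single CPU feasible: {need <= CPU_MIPS}")
print(f"engines needed     : {engines_needed(need, ENGINE_MIPS)}")
```

Under these assumptions the required rate is 195,312.5 MIPS, which exceeds the 125,000 MIPS of the single CPU. Packet-level parallelism spreads that load across many simpler engines instead of relying on one faster core.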

Typical Processing
[Figure]

Case Study: IPv4 Packet Forwarding
• 2-port router (2 Gbps)
• Xilinx Virtex-II Pro FPGA (2VP30)
• IP lookup: longest prefix match (trie lookup algorithm)
[Figure: task graph with nodes From (0), From (1), Lookup, IPRoute, To (0), To (1).]
[Figure: lookup trie with a root and nodes a to e; memory accesses are labeled along the lookup path.]
Prefix (hex : binary)
  : 0*
  002 : *
  002F : *
  FFE : 000*
  FFF : *

Multiprocessor for Header Processing
[Figure: block diagram. Packet Reception; four rows of Verify, Lookup-1, Lookup-2, Transmit; Packet Transmission; FIFO queues; BRAM; FSL; OPB; RS232; Timer; LEDs.]

Typical Use of NPs
[Figure: router with ports, packets, and switching fabric; network processor with processor cores, coprocessors, interconnect, network interface, and I/O.]

System Implementation Space
[Figure]

Memory Architecture
• Memory access bottleneck
• Memory is area consuming
  ─ Limited on-chip memory
  ─ Limited bandwidth to off-chip memory: pin and package cost
  ─ Off-chip memory access is slow: 100 cycles
• Possible solutions
  ─ Profile the application's memory access pattern
  ─ Propose a heterogeneous memory architecture
  ─ Memory-aware mapping
  ─ Transactional memory (project topic)

Application Mapping
[Figure: Mapping]
• Current approach: fixed topology, assembly coding, and hand-tuning

Basic Steps for Mapping
[Figure: task graph (From (0), From (1), Lookup, IPRoute, To (0), To (1)) mapped onto an architecture of MEM, PE, and FPGA blocks.]
• Application description
• High-level optimizations
• Task graph
• (platform specific)
• Profile
• Architecture configuration
• HW / SW partitioning
• Task allocation
• Data layout
• Communication assignment
• Compilation / Synthesis

Summary
• Network Processor
  ─ Special purpose, programmable hardware device
  ─ Optimized for network processing
  ─ Building blocks of network processing systems
  ─ Fundamental ideas
    • Flexibility through programmability
    • Scalability with parallelism and pipelining
• Here, NP is a concept
  ─ We will learn about an example network processor soon

For Next Class & Announcement
• Read Comer: chapters 13 and 14
• Lab 1 total grade reduced to 82
• HW 1 due Wed.
• Project topic will be announced after Wed.
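
The longest-prefix-match lookup named in the IPv4 packet forwarding case study can be sketched with a binary trie. This is a minimal illustration, assuming one address bit per trie level and counting each visited node as one memory access. The prefixes, port names, and class names below are hypothetical and are not the ones shown in the case study's figure.

```python
from typing import List, Optional, Tuple

class TrieNode:
    """One trie node: two children (bit 0, bit 1) and an optional next hop."""
    def __init__(self) -> None:
        self.children: List[Optional["TrieNode"]] = [None, None]
        self.next_hop: Optional[str] = None

class PrefixTrie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, prefix_bits: str, next_hop: str) -> None:
        """Store next_hop under a prefix given as a string of '0'/'1' bits."""
        node = self.root
        for bit in prefix_bits:
            i = int(bit)
            if node.children[i] is None:
                node.children[i] = TrieNode()
            node = node.children[i]
        node.next_hop = next_hop

    def lookup(self, addr_bits: str) -> Tuple[Optional[str], int]:
        """Return (next hop of the longest matching prefix, nodes visited)."""
        node = self.root
        best = node.next_hop      # a default route would be stored at the root
        visits = 1                # reading the root is the first memory access
        for bit in addr_bits:
            child = node.children[int(bit)]
            if child is None:
                break             # no longer prefix exists along this path
            node = child
            visits += 1
            if node.next_hop is not None:
                best = node.next_hop   # remember the longest match so far
        return best, visits

# Hypothetical routing table for a 2-port router.
table = PrefixTrie()
table.insert("0", "port 0")
table.insert("10", "port 1")
table.insert("1011", "port 0")

print(table.lookup("10110000"))
print(table.lookup("10000000"))
print(table.lookup("11000000"))
```

Each level of the trie costs one memory access in this model, so the number of visited nodes grows with the length of the matched prefix. That is one reason the slow off-chip access noted under Memory Architecture matters for lookup performance.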