Commercial Network Processor Architectures
Agere PayloadPlus
Vahid Tabatabaee
Fall 2007
ENTS689L: Packet Processing and Switching

References
- Title: Network Processors: Architectures, Protocols, and Platforms. Author: Panos C. Lekkas. Publisher: McGraw-Hill.
- Agere PayloadPlus Family White Papers.
- "Payload+: Fast Pattern Matching & Routing for OC-48", David Kramer, Roger Bailey, David Brown, Sean Mcgee, Jim Greene, Robert Corley, David Sonnier (Agere Systems), in Hot Chips: A Symposium on High Performance Chips, Aug. 19-21, 2001.
- Agere Product Brief documents for FPP, RSP, ASI and FPL.
- Agere white paper: "The Case for a Classification Language", Feb. 2003.

General Information
- Agere PayloadPlus is a comprehensive network processor solution for OC-48.
- It was expanded to support OC-192 through the NP10/TM10 (renamed APP750NP and APP750TM). This product has since been discontinued.
- Originally a three-chip solution, it was later integrated into a single chip.
- We review the original solution and the APP550 (single chip), for which information is available on the Agere website.
The Big Picture
The network processor family has a pipeline architecture and includes (in the original three-chip solution):
- Fast Pattern Processor (FPP)
  - Takes data from the PHY chip
  - Protocol recognition
  - Classification based on layers 2 to 7
  - Table lookup with millions of entries and variable lengths
  - Reassembly
- Routing Switch Processor (RSP)
  - Queueing
  - Packet modification
  - Traffic shaping
  - QoS processes
  - Segmentation
- Agere System Interface (ASI)
  - Management
  - Tracks state information
  - Support for RMON (Remote Monitoring)

The 3 Chip Solution
- POS-PHY: Packet Over SONET – PHYsical
- UTOPIA: Universal Test & Operations PHY Interface for ATM
- FBI: Functional Bus Interface
[Figure: block diagram of the three-chip solution. Labels: Physical Interface, FPP, RSP, Fabric Interface Controller, Switch Fabric, ASI, Configuration Bus, FBI, 8-bit POS-PHY (two links), microP, PCI to Host CPU.]
Source: http://nps.agere.com/support/non-nda/docs/FPP_Product_Brief.pdf

Main Responsibilities and Interfaces
- The FPP receives data from the PHY over a standard interface, which can be POS-PHY Level 3 (POS-PL3) or a UTOPIA 2 or 3 interface.
- The FPP classifies traffic based on the information contained at layers 2 to 7.
- The FPP sends packets to the RSP over POS-PL3.
- The RSP is responsible for queueing, packet modification, shaping, QoS tagging, and segmentation.
- The ASI chip handles exceptions, maintains state information, interfaces to the host processor, and configures the FPP and RSP over the CBI interface.
- The Management-Path Interface (MPI) enables the FPP to receive management frames from the local host.
- The Functional Bus Interface (FBI) connects the FPP to the ASI so that function calls can be processed externally.
Memory
- 64-bit standard PC-133 synchronous dynamic random access memory (SDRAM)
- 133 MHz pipelined zero bus turnaround (ZBT) synchronous static random access memory (SSRAM)
- PayloadPlus can use standard off-the-shelf DRAM for table lookups and does not need expensive, power-hungry Content Addressable Memory (CAM).
- The typical power limit for a line card is 150 W.

FPP Features
- Programmable classification from layers 2 to 7
- Pipelined, multi-threaded processing of PDUs
- High-level Functional Programming Language (FPL) that implicitly takes care of multiple threads
- ATM reassembly at OC-48 rates (eliminates external SAR)
- Table lookup with millions of entries
- Eliminates the need for external CAMs
- Deterministic performance regardless of the table size
- Configurable UTOPIA/POS interfaces

FPP Protocol Data Unit (PDU)
- The FPP is a pipelined, multithreaded processor that can simultaneously analyze and classify up to 64 protocol data units (PDUs).
- Each incoming PDU is assigned its own processing thread, which is called a context.
- Each PDU consists of one or more 64-byte blocks.
- The context is a processing path that keeps track of:
  - All blocks of the PDU
  - Input port number of the PDU
  - Data offset for the PDU
  - The last block information
  - Program variables associated with the PDU
  - Classification information of the PDU

FPP Block Diagram
Source: http://www.hotchips.org/archives/hc13/3_Tue/13agere.pdf
Source: http://nps.agere.com/support/non-nda/docs/FPP_Product_Brief.pdf

FPP Functional Description
- The input framer frames incoming data into 64-byte blocks.
- It writes the blocks into the data buffer (SDRAM), the block buffers, and the context memory.
- The block buffers hold the data being processed, together with the associated context data that the FPP operations on that data need.
- The output interface sends the PDUs and their classification information to the RSP.
- The Pattern Process Engine (PPE) performs pattern matching to determine how the incoming PDUs are classified.
- The Queue Engine manages FPP replay contexts, provides addresses for the block buffers, and maintains information on blocks, PDUs, and connection queues.

FPP Functional Description (two pass)
The FPP processes bit streams in two passes.
- In the first pass, the PDU blocks are read into the queue engine memory:
  - The data is produced as separate 64-byte blocks.
  - The data offset of each block is determined.
  - Links between the individual blocks that compose a PDU are established.
  - The PDU type is identified.
- In the second pass (replay phase), the PDU is replayed from the queue engine:
  - The PDU is processed as a whole entity.
  - Pattern matching is executed.
  - At the same time, the PDU is transmitted toward the output unit.
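The two passes can be illustrated with a small software model. This is only a sketch under stated assumptions, not the FPP's actual logic: the names `Context`, `first_pass`, `replay`, and `identify_type` are hypothetical, the type rule (a leading version nibble of 4 means IPv4) is a toy stand-in, and pattern matching is reduced to a byte-substring search.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 64  # the input framer frames incoming data into 64-byte blocks


@dataclass
class Context:
    """Per-PDU state, loosely modeled on the context fields listed in the slides."""
    input_port: int
    blocks: list = field(default_factory=list)   # linked blocks, in arrival order
    offsets: list = field(default_factory=list)  # data offset of each block
    pdu_type: str = "unknown"
    classification: str = "unclassified"


def identify_type(first_block: bytes) -> str:
    # Toy rule (assumption): a leading version nibble of 4 is treated as IPv4.
    return "ipv4" if first_block and first_block[0] >> 4 == 4 else "other"


def first_pass(pdu: bytes, input_port: int) -> Context:
    """Pass 1: cut the PDU into 64-byte blocks, record offsets, identify the type."""
    ctx = Context(input_port=input_port)
    for offset in range(0, len(pdu), BLOCK_SIZE):
        ctx.offsets.append(offset)
        ctx.blocks.append(pdu[offset:offset + BLOCK_SIZE])
    ctx.pdu_type = identify_type(ctx.blocks[0] if ctx.blocks else b"")
    return ctx


def replay(ctx: Context, patterns: dict) -> list:
    """Pass 2: match patterns against the whole PDU and hand its blocks onward."""
    whole = b"".join(ctx.blocks)  # the PDU is processed as one entity
    for label, pattern in patterns.items():
        if pattern in whole:
            ctx.classification = label
            break
    return list(ctx.blocks)  # blocks sent toward the output interface, in order
```

Because the replay pass sees the PDU as a whole, a pattern that straddles a 64-byte block boundary is still found; for example, calling `first_pass` on a 110-byte PDU and then `replay(ctx, {"http-get": b"GET "})` classifies it even when `GET ` starts at byte 61.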
FPP Top Level Flow
Source: http://nps.agere.com/support/non-nda/docs/FPL_Product_Brief.pdf

RSP (Traffic Manager) Features
- 64K queues
- Programmable shaping (such as VBR, UBR, CBR)
- Programmable discard policies (RED, WRED, EPD)
- Programmable QoS (CBR, VBR, UBR)
- Programmable CoS (Fixed Priority, Round Robin, WRR, WFQ, GFR)
- Programmable packet modification
- Support for multicast
- Generates required checksums/CRC

RSP overview
Source: http://www.hotchips.org/archives/hc13/3_Tue/13agere.pdf
Source: http://nps.agere.com/support/non-nda/docs/RSP_Product_Brief.pdf

RSP Functional Description
- The RSP acts on the FPP's classification and analysis results for each incoming PDU.
- It supports up to 64 logical input ports.
- For each PDU, a command from the FPP instructs the RSP how to handle the PDU.
- The PDU is added to a queue and stored in the PDU SDRAM.
- The RSP supports up to 64K programmable queues.
- Processed data is output on a configurable 32-bit interface.
- There is also an 8-bit POS-PHY Level 3 management interface.
- The RSP uses custom logic and three Very Long Instruction Word (VLIW) compute engines to process PDUs.

VLIW Compute Engines
- The compute engines operate in a pipelined fashion.
- Each compute engine is dedicated to a processing function:
  - The Traffic Management Engine enforces discard policies and keeps queue statistics.
  - The Traffic Shaper Engine ensures QoS and CoS for each queue.
  - The Stream Editor Engine performs any necessary PDU modifications.
- Each queue definition in the RSP includes the destination, scheduling information, and pointers to programs for each of the three VLIW compute engines.
- Therefore, the RSP can run multiple protocols at the same time.
- The external CPU can also add queue definitions, for example to set up ATM virtual circuits.

RSP Data Flow
The RSP has three major processing stages:
1. Prepares and queues the PDU for scheduling
  - Assembles the blocks into a PDU in SDRAM
  - Determines the destination queue
  - Determines whether the PDU should be queued. If it should, it is added to the appropriate queue for scheduling
2. Selects the next PDU block to be scheduled
  - Selects the physical port
  - Selects the logical port
  - Selects the scheduler
  - Selects the QoS queue
  - Selects the CoS queue
3. Modifies and transmits the PDU on the appropriate output ports
  - Adjusts the QoS transmit intervals and CoS priority
  - Performs PDU modifications
  - Performs the AAL5 CRC if necessary
Source: http://nps.agere.com/support/non-nda/docs/RSP_Product_Brief.pdf

Hierarchical Scheduling (Internal Scheduling Logic)
- Channels: The output interface provides a 32-bit data channel, which supports 1-4 POS-PHY or UTOPIA channels. It also has an 8-bit management output.
- Physical ports: Physical output ports are assigned to channels. There are up to 32 physical ports, since there are 32 back-pressure signals.
- Logical ports: The RSP supports up to 256 logical output ports.
- Schedulers: A set of schedulers is defined for each logical port. The RSP supports CBR, VBR and UBR schedulers.
- QoS queues: Each QoS queue is assigned to a single scheduler.
- CoS queues: Up to 16 CoS queues feed a single QoS queue.
Source: http://nps.agere.com/support/non-nda/docs/RSP_Product_Brief.pdf
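The lower levels of this hierarchy (logical port, scheduler, QoS queue, CoS queue) can be modeled as nested containers. The sketch below is illustrative only and makes several assumptions that are not in the slides: the class names are hypothetical, CoS selection uses fixed priority (one of the several CoS policies the RSP lists), and schedulers are served in the order CBR, VBR, UBR. Channels, physical ports, and back pressure are left out.

```python
from collections import deque
from dataclasses import dataclass, field

MAX_COS_PER_QOS = 16  # up to 16 CoS queues feed a single QoS queue


@dataclass
class QosQueue:
    cos_queues: list = field(default_factory=list)  # index 0 = highest priority

    def add_cos_queue(self) -> deque:
        if len(self.cos_queues) >= MAX_COS_PER_QOS:
            raise ValueError("a QoS queue accepts at most 16 CoS queues")
        queue = deque()
        self.cos_queues.append(queue)
        return queue

    def pop_next(self):
        # Assumption: fixed-priority selection among the CoS queues.
        for queue in self.cos_queues:
            if queue:
                return queue.popleft()
        return None


@dataclass
class Scheduler:
    kind: str  # "CBR", "VBR" or "UBR"
    qos_queues: list = field(default_factory=list)

    def pop_next(self):
        for qos in self.qos_queues:
            pdu = qos.pop_next()
            if pdu is not None:
                return pdu
        return None


@dataclass
class LogicalPort:
    schedulers: list = field(default_factory=list)

    def pop_next(self):
        # Assumption: serve CBR schedulers first, then VBR, then UBR.
        order = {"CBR": 0, "VBR": 1, "UBR": 2}
        for sched in sorted(self.schedulers, key=lambda s: order[s.kind]):
            pdu = sched.pop_next()
            if pdu is not None:
                return pdu
        return None
```

Walking from the logical port down to a CoS queue mirrors the selection order in stage 2 of the RSP data flow; a real traffic shaper would also track transmit intervals per queue, which this model omits.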
ASI
- The ASI seamlessly integrates the FPP and RSP with the host processor. It lets the designer:
  - Centralize initialization and configuration of the NP system and its physical interfaces.
  - Send routing and VPI/VCI updates to the system.
  - Implement various routing and management protocols.
  - Handle any exceptions that occur.
- The ASI enables high-speed, flow-oriented state maintenance:
  - Gathering Remote Network Monitoring (RMON) statistics
  - Time stamping packets
  - Checking packet sequence
  - Policing ATM and frame relay at up to OC-48 rates
- 8-bit POS-PHY interface over which the ASI sends packets to the FPP and receives them from the RSP

How Does ASI Work?
- It has a PCI interface for communication with the host processor.
- A 32-bit high-speed interface (FBI) receives function calls from the FPP.
- Two ALUs process the FPP's external function requests for:
  - Maintaining state and statistics
  - Policing (leaky bucket)
- Two SSRAM interfaces allow memory access for different tasks without contention.
Source: http://nps.agere.com/support/non-nda/docs/ASIProductBrief.pdf

ASI Configuration Capabilities
- The ASI enables the host processor to configure up to 8 devices.
- The configuration bus is compatible with both Intel and Motorola bus formats.
- It is used to:
  - Initialize and configure the FPP and RSP
  - Load the program code for the FPP and RSP
  - Load dynamic updates to the FPP tables and RSP queues
  - Configure third-party external framers and physical interfaces

Policy and Conformance Checking
- The ASI performs conformance checking (policing) for up to 64K connections at the OC-48 rate.
- The ASI only marks traffic; it does not schedule or shape it.
- Several variations of the GCRA (leaky-bucket) algorithm can be used.
- In the dual leaky-bucket case, the ASI indicates whether cells or frames are compliant and, if not, which bucket the nonconformance came from.

FPL
- FPL is a functional language for classification.
- In a functional language, the programmer tells the computing resources what to do rather than how to do it.
- In FPL you describe the protocols and the actions to process them.
- In C you have to say how to process the protocols.
- FPL code is much shorter and easier to debug and modify.

FPL Main Features
- Fast pattern matching and classification of the data stream
- Defining functions for the FPP to execute based on the recognized patterns
- Easy-to-read semantics
- Dynamic updating of the code in the FPP
- Software development tool set

Two Pass Processing
- Recall the two-pass processing in the FPP.
- The first pass does preliminary processing, such as identifying the PDU type.
- In the second pass (replay), it can simply transmit the PDU and the conclusions, or process a higher-level protocol.
- The queue engine allows you to process PDUs embedded in higher-layer protocols in the replay phase.
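Returning to the ASI's Policy and Conformance Checking slide: the dual leaky-bucket case can be sketched with the virtual-scheduling form of GCRA(I, L), where I is the increment and L the limit. This is a generic textbook formulation written as a minimal sketch, not the ASI's implementation. The names `Gcra` and `police`, the bucket labels "peak" and "sustained", and the choice to report the peak bucket first when both fail are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Gcra:
    """GCRA(I, L) in its virtual-scheduling form."""
    increment: float   # I: expected spacing between conforming arrivals
    limit: float       # L: how early an arrival may be and still conform
    tat: float = 0.0   # theoretical arrival time of the next cell

    def conforms(self, now: float) -> bool:
        return now >= self.tat - self.limit

    def update(self, now: float) -> None:
        self.tat = max(now, self.tat) + self.increment


def police(now: float, peak: Gcra, sustained: Gcra) -> str:
    """Mark one arrival: 'conforming', or the name of the bucket it violates."""
    if not peak.conforms(now):
        return "peak"
    if not sustained.conforms(now):
        return "sustained"
    # Both buckets advance only when the arrival conforms to both of them.
    peak.update(now)
    sustained.update(now)
    return "conforming"
```

The function only returns a mark, in line with the statement that the ASI marks traffic and leaves scheduling and shaping to other stages; what to do with a nonconforming cell or frame (tag it, drop it) is decided elsewhere.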
Sample FPL Program Flow
Source: http://nps.agere.com/support/non-nda/docs/FPL_Product_Brief.pdf

FPL code example
Source: http://www.hotchips.org/archives/hc13/3_Tue/13agere.pdf

Dynamic FPL Program Changes
- You can dynamically add and delete certain types of FPL statements in the FPL code image.
- FPL supports two types of pattern statement structures:
  - Single-rule patterns have a single pattern to match, with one or two functions to perform.
    - These are called flows.
    - These cannot be added or removed dynamically.
  - Multiple-rule pattern statements allow you to define tables.
    - This is used to define IP routing tables.
    - These are called trees.
    - You can add or delete statements from existing trees.
    - You cannot add a tree dynamically.

Performance of the Network Processor
- Drop in performance due to the N+1 problem

Network Processor Performance
- Performance evaluation for a mixture of packet sizes
- Performance drops when the number of computations per packet increases
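As a rough illustration of the "trees" described under Dynamic FPL Program Changes (tables such as IP routing tables whose entries can be added or deleted at run time), here is a generic binary prefix trie with longest-prefix-match lookup. It is written in Python rather than FPL and is not how the FPP stores or compiles trees; the class `PrefixTree` and its methods are hypothetical names, and deleted entries leave their empty nodes in place for simplicity.

```python
import ipaddress


class PrefixTree:
    """Binary trie keyed on IPv4 prefix bits; an illustrative stand-in only."""

    def __init__(self):
        # Each node is a dict with optional children "0"/"1" and an optional "hop".
        self.root = {}

    @staticmethod
    def _bits(prefix: str) -> str:
        net = ipaddress.ip_network(prefix)
        return format(int(net.network_address), "032b")[:net.prefixlen]

    def add(self, prefix: str, next_hop: str) -> None:
        """Add (or overwrite) one routing entry in the existing tree."""
        node = self.root
        for bit in self._bits(prefix):
            node = node.setdefault(bit, {})
        node["hop"] = next_hop

    def delete(self, prefix: str) -> bool:
        """Remove one entry; return False if the prefix was not present."""
        node = self.root
        for bit in self._bits(prefix):
            if bit not in node:
                return False
            node = node[bit]
        return node.pop("hop", None) is not None

    def lookup(self, address: str):
        """Return the next hop of the longest matching prefix, or None."""
        bits = format(int(ipaddress.ip_address(address)), "032b")
        node, best = self.root, self.root.get("hop")
        for bit in bits:
            node = node.get(bit)
            if node is None:
                break
            best = node.get("hop", best)
        return best
```

With entries for `10.0.0.0/8` and `10.1.0.0/16`, a lookup of `10.1.2.3` returns the /16 entry's next hop; after that entry is deleted, the same lookup falls back to the /8 entry, which is the kind of in-place table update the slide describes for existing trees.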