FCoE Overview
IEEE CommSoc/SP Chapter, Austin, Texas, May 21 2009
Tony Hurson
[email protected]

Networked Storage History
NAS (Network Attached Storage): clients and servers reach a fileserver over an Ethernet (TCP/IP) network.
- File-based data transfer (e.g., NFS, CIFS)
- Fabric characteristics: packet drop on buffer full; high-low latency; high-low throughput; no multipathing
SAN (Storage Area Network): hosts reach an FC target system over a Fibre Channel Storage Area Network.
- SCSI block data transfer
- Fabric characteristics: lossless; low latency; high throughput; reliable; redundant paths (failover)
Diagram labels: CPU, Data.

SCSI Read, Write over FC
SCSI Read (Host to Target and back):
- Host sends FCP_CMND
- Target returns FCP_DATA frames
- Target sends FCP_RSP
SCSI Write (Host to Target and back):
- Host sends FCP_CMND, optionally followed by unsolicited FCP_DATA (modest amount)
- Target sends FCP_XFER_RDY
- Host sends FCP_DATA frames
- Target sends FCP_RSP
Diagram annotations: Exchange; Sequence (may be out of order).

FC Fabric Port Terminology
Diagram: hosts and targets (N_Ports) attach to switch F_Ports; switches interconnect through E_Ports.
- N_Port: Host or Target endpoint
- F_Port: Endpoint-facing switch port
- E_Port: Inter-switch port
- Virtualization adds a 'V' prefix to all of these

FC Routing
Fabric Shortest Path First
- Based on OSPF (IP)
- "Static" routing tables per switch
- Chooses shortest paths (hop counts)
- Load balances multiple paths
- Handles link failover automatically

Ethernet Routing
Dynamic scheme: Source Learning
- If unicast DstMAC is not in lookup table, flood frame to all ports except its source port.
- Note source port of SrcMAC in lookup table, if not already present.
- Age/invalidate lookup entries.
- Similar flooding behavior for multicast.
- Precludes loops in fabric.

FC Frame Format
Frame layout: SOF | Frame Header | Opt. header | Payload (2KB + Markers) | CRC | EOF
Frame Header fields, one 32-bit word per row (bits 31 to 0; the ID fields occupy bits 23 to 0):
  R_CTL | D_ID
  S_ID
  Type | F_CTL
  SEQ_ID | DF_CTL | SEQ_CNT
  OX_ID | RX_ID
  Parameter
- D_ID, S_ID: fabric-assigned (Fabric Login) destination and source [V]N_Port identifiers
- SEQ_ID, SEQ_CNT: Sequence trackers
- OX_ID, RX_ID: Local, Remote Exchange Identifiers, used to look up Exchange state at endpoints

Protocol Stack History and Comparison
Stacks listed top to bottom, in chronological order of development:
- FC: SCSI; FCP (Mapping); FC-3 (Discovery, Services, Recovery); FC-2V (Transport); FC-1; FC-0
- iSCSI: SCSI; iSCSI (Mapping); TCP (Transport); IP (Network); Ethernet (Link); PHY
- FCoE: SCSI; FCP (Mapping); FC-3 (Discovery, Services, Recovery); FC-2V (Transport); FCoE (Encap/decap); Lossless Ethernet (Link); PHY

Lossless Ethernet via PAUSE
- When port receive buffer fills to a high watermark (HWM), issue PAUSE XOFF to link peer.
- When buffer drains to low watermark (LWM), issue PAUSE XON to peer.
Diagram: two devices (each a Switch or Endpoint) joined by an Ethernet link. Each has Eth Rx feeding a port receive buffer (with HWM and LWM), an outbound PAUSE generator, and a port transmit buffer feeding Eth Tx that is gated by inbound PAUSE.

FCoE Early Deployment Example
- Firewall: to/from internet
- Presentation Tier: 20, 4-way SMP diskless blades
- Application Tier: 8, 16-way SMP diskless blades
- Database Tier: Large SMP
- Lossless, Converged Ethernet Fabric connecting the tiers
- FCoE-FC gateway, leading to FC fabric and FC Storage Array

FCoE Frame Format
Fields in order (32-bit words, bits 31 to 0):
  EtherType = FCoE_TYPE | Version
  SOF
  Encapsulated FC Frame (n words)
  EOF

FCoE Endpoint Model
Stack, top to bottom: FC-3/FC-4; FC-2V; VN_Port; FCoE_LEP alongside FIP mgmt. protocol; Lossless Ethernet MAC; to lossless Eth. fabric.
- FIP (FCoE Initialization Protocol) initiates Fabric Logins with FCoE switch (FCF).
- Each Fabric Login establishes a VN_Port and a VN_Port to VF_Port logical connection.
- Each VN_Port has a unique MAC address, server- or fabric-provided.
- FCoE_LEP: link endpoint, performs encapsulation/decapsulation of FC frame.

FCoE Switch Functional Model
- FC Switch (FC-SW-5) at the core
- Native FC side: E_Port (to FC fabric), F_Port (to FC endpoint)
- FCoE side: VF_Port, VF_Port, VE_Port, each backed by FCoE_LEP instances
- Three Lossless Ethernet MACs (FCF-MAC), each with its own FIP mgmt. protocol instance, each leading to the lossless Eth. fabric

Converged Ethernet
AKA Data Center Bridging (DCB). Run up to four major traffic classes on single 10 GbE fabric. In order of market prevalence:
1. Networking (TCP/IP, lossy).
2. Block Storage (lossless FCoE, or lossless/lossy iSCSI).
3. Management ("heartbeat" traffic, low bandwidth, but must get through).
4. Inter-Process Communication (clustered computing: high bandwidth, low latency, lossless preferred).

Groundwork for DCB
- IEEE 802.1Qaz (ETS & DCBX): bandwidth allocation to major traffic classes (Priority Groups), plus DCB management protocol.
- IEEE 802.1Qbb (Priority PAUSE): selectively PAUSE traffic on link by Priority Group.
- IEEE 802.1Qau: Dynamic Congestion Notification.

IEEE 802.1Qaz Enhanced Transmission Selection
- Support at least 3 Priority Groups/traffic classes.
- PGs identified by Priority field of existing 802.1Q VLAN Tag.
- Configured bandwidth per PG has 1% resolution.
- PG15 has limitless bandwidth (use sparingly, for Management).
- Work Conservation: if the wire's free, use it.

ETS Configuration Example
- PG0 (Storage): 40% of port b/w
- PG1 (Networking): 20% of port b/w
- PG2 (IPC): 40% of port b/w
- PG15 (mgmt): limitless
If a PG underutilizes, others can fill the space. Typical implementation: DWRR.

IEEE 802.1Qbb Priority PAUSE
Diagram: a sending Switch or Endpoint holds output queues by traffic class (PG0 Storage, PG1 Networking, PG2 IPC, PG15 Management) drained by an ETS DWRR scheduler onto a 10 GbE link. The receiving Switch or Endpoint holds receive buffers by traffic class: PG0 Storage (lossless buffer), PG1 Networking (lossy buffer), PG2 IPC, PG15 Management. The receiver issues "Priority PAUSE!!" for PG0 only.
- Generally, Networking (TCP/IP) should NEVER be PAUSEd.

IEEE 802.1Qau Dynamic Congestion Control Background
- Lossless fabrics are prone to congestion spreading (congestion trees).
- Ethernet-FC gateways with their different port speeds (10 GbE; 8 Gbps) are natural bottlenecks.
- ETS Work Conservation model adds fuel to fire.
- Solution: switches/endpoints notify traffic sources of incipient congestion, via feedback messages; sources reduce rates accordingly.

Congestion Notification in Action
1. Source endpoint, supporting 'n' Congestion Controlled Flows, tags each outbound packet with CCF#.
2. Switch (or destination endpoint) detects incipient congestion; issues Congestion Notification Message (CNM) back to data source.
3. Source reacts to CNM, reducing tx rate. Source recovers its rate over time and via byte counting.
Diagram labels: Data (source to destination endpoint), CNM (back to source).

Congestion Control at Endpoint Transmit
IEEE 802.1Qaz/Qau Endpoint
- Per-class queues (PG0 Storage, PG1 Networking, PG2 IPC, PG15 Management) pass through 802.1Qau Congestion Control rate limiters ("Reaction Points", RPs), then the 802.1Qaz ETS DWRR scheduler, onto the 10 GbE link.
- Typical implementation: byte-based token buckets.
- Shallow buckets and queues (2 to 6 packets) for rapid CNM response.
- Incoming Congestion Notification Messages (CNMs): "Slow Down!"
- CNMs only slow down RPs. Rate recovery is internal (byte- and time-based).

FCoE Summary
- Presents new, but very familiar, PHY and Link Layers for FC.
- Core switching discipline remains FC-SW-5.
- Higher FC layers almost completely unchanged (that's the legacy value!).
- Biggest Ethernet-level requirement: lossless fabric.
- Part of Converged Ethernet initiative: lots of ancillary activity at IEEE.
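
Backup sketch: Ethernet source learning
The flood/learn/age behavior listed on the "Ethernet Routing" slide can be illustrated with a toy model. This is only a sketch under stated assumptions, not how any particular switch is built: the class name LearningSwitch, the 300-second max_age default, and the choice to refresh an entry's timestamp on every frame are all illustrative.

```python
import time


class LearningSwitch:
    """Toy model of Ethernet source learning (all names are illustrative)."""

    def __init__(self, ports, max_age=300.0):
        self.ports = set(ports)
        self.max_age = max_age  # assumed aging time in seconds
        self.table = {}         # MAC -> (port, time last seen)

    def _age_out(self, now):
        # Invalidate lookup entries that have not been refreshed recently.
        stale = [mac for mac, (_, seen) in self.table.items()
                 if now - seen > self.max_age]
        for mac in stale:
            del self.table[mac]

    @staticmethod
    def _is_multicast(mac):
        # Group bit: least significant bit of the first octet.
        return int(mac.split(":")[0], 16) & 1 == 1

    def receive(self, src_mac, dst_mac, in_port, now=None):
        """Return the sorted list of ports the frame is forwarded to."""
        now = time.monotonic() if now is None else now
        self._age_out(now)
        # Learn: note which port SrcMAC was seen on (refreshing the
        # timestamp each time is an assumption of this sketch).
        self.table[src_mac] = (in_port, now)
        entry = None if self._is_multicast(dst_mac) else self.table.get(dst_mac)
        if entry is None:
            # Unknown unicast or multicast: flood to all ports but the source.
            return sorted(self.ports - {in_port})
        out_port = entry[0]
        return [] if out_port == in_port else [out_port]
```

A frame to an unknown destination is flooded; once the destination has itself transmitted, later frames go to its learned port only. Flooding is also why the scheme needs a loop-free fabric.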
Further Reading
- FCoE: www.t11.org
- IEEE 802.1Q(az|au|bb): www.ieee.org

Thank you! Questions?
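
Backup sketch: PAUSE watermarks
The high/low watermark rule from the "Lossless Ethernet via PAUSE" slide is a simple hysteresis loop. The sketch below is a minimal model, assuming byte-counted buffers; the class name PauseReceiveBuffer and the numbers in the usage note are hypothetical, and real hardware sizes the gap between HWM and capacity to absorb frames already in flight when XOFF is sent.

```python
class PauseReceiveBuffer:
    """Toy port receive buffer that emits PAUSE XOFF/XON (illustrative names)."""

    def __init__(self, capacity, hwm, lwm):
        if not 0 <= lwm < hwm <= capacity:
            raise ValueError("need 0 <= lwm < hwm <= capacity")
        self.capacity = capacity
        self.hwm = hwm            # high watermark: ask peer to stop
        self.lwm = lwm            # low watermark: let peer resume
        self.fill = 0
        self.peer_paused = False

    def enqueue(self, nbytes):
        """Frame arrives from the link. Returns "XOFF" if PAUSE must be sent."""
        if self.fill + nbytes > self.capacity:
            # In a lossless fabric this must never happen; headroom above
            # the HWM exists to absorb frames already in flight.
            raise OverflowError("receive buffer overrun")
        self.fill += nbytes
        if not self.peer_paused and self.fill >= self.hwm:
            self.peer_paused = True
            return "XOFF"
        return None

    def dequeue(self, nbytes):
        """Frame is drained onward. Returns "XON" if the peer may resume."""
        self.fill = max(0, self.fill - nbytes)
        if self.peer_paused and self.fill <= self.lwm:
            self.peer_paused = False
            return "XON"
        return None
```

With capacity=10, hwm=8, lwm=3, filling to 8 yields one XOFF, and no XON is issued until the buffer has drained back down to 3. The gap between the two watermarks keeps the link from flapping between paused and running.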
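
Backup sketch: ETS bandwidth sharing with DWRR
The "ETS Configuration Example" slide names DWRR (deficit weighted round robin) as the typical implementation. The following is a simplified sketch of that scheduler, not the 802.1Qaz text: the function name dwrr_schedule and the bytes_per_percent quantum are assumptions made for illustration, and PG15 (limitless bandwidth) is deliberately left out.

```python
from collections import deque


def dwrr_schedule(queues, weights, bytes_per_percent=100, rounds=10):
    """Run DWRR over per-PG queues of packet sizes; return bytes sent per PG.

    queues:  dict of PG name -> deque of packet sizes in bytes
    weights: dict of PG name -> configured bandwidth share in percent
    """
    deficit = {pg: 0 for pg in queues}
    sent = {pg: 0 for pg in queues}
    for _ in range(rounds):
        for pg, q in queues.items():
            if not q:
                # Work conservation: an idle PG keeps no credit and takes
                # no link time, so the others fill the space.
                deficit[pg] = 0
                continue
            deficit[pg] += bytes_per_percent * weights[pg]
            # Send while the head-of-line packet fits in the deficit.
            while q and q[0] <= deficit[pg]:
                size = q.popleft()
                deficit[pg] -= size
                sent[pg] += size
    return sent
```

With all three PGs backlogged and weights of 40/20/40, the bytes sent come out in a 2:1:2 ratio. If PG2 is idle, PG0 and PG1 split the wire 2:1 between them, which is the "others can fill the space" behavior.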