STORAGE AREA NETWORKING PROTOCOLS AND ARCHITECTURE
SESSION OPT-2T01
OPT-2T01 9899_06_2004_X © 2004 Cisco Systems, Inc. All rights reserved.

Morning Schedule
• 9:00am–10:30am: Introduction to Storage Area Networking
  - Storage Terms and Acronyms
  - Storage Networking Devices (Switches, HBAs, Disk)
  - Storage Networking Applications
  - Storage Networking Topologies
  - Intro to Storage Protocols (SCSI, FC, FCIP, iSCSI)
• 10:30am–10:45am: Break
• 10:50am–12:30pm: Storage Protocols In-Depth
  - Introduction to the Standards
  - SCSI
  - Fibre Channel
• 12:30pm–1:30pm: Lunch

Afternoon Schedule
• 1:45pm–3:30pm: Storage Protocols In-Depth (Cont.)
  - Fibre Channel Services
  - iSCSI
  - FCIP
  - iFCP
  - iSNS and SLP
• 3:30pm–3:45pm: Break
• 3:50pm–6:00pm
  - Storage Network Troubleshooting
  - Required Tools
  - Required Technical Skill Sets
  - Storage Network Architecture
  - Design Practices
  - FC Network Designs
  - IP SANs
  - SAN Extension
  - Implementation and Management

Associated Sessions
• OPT-1051 Introduction to Storage Topologies and Applications
• OPT-2051 Fibre Channel Storage Area Network Design
• OPT-2052 FCIP Design and Implementation
• OPT-2053 iSCSI Design and Implementation
• OPT-2054 Storage Networking Security
• OPT-3051 Troubleshooting MDS9000 Fibre Channel SAN
• OPT-3052 Troubleshooting MDS9000 IP Storage Area SAN
• OPT-4051 Design and Architecture of Storage Networking Platforms
• OPT-4052 Case Study: Cisco IT Storage Strategy

Reference Materials
• Cisco Storage Networking: www.cisco.com/go/storagenetworking
• Cisco AVVID Storage Networking Partner Program: www.cisco.com/go/partners
• Cisco Metro Optical Product Information: www.cisco.com/go/comet
• Storage Network Industry Association (SNIA): www.snia.org
• IETF, IP Storage: www.ietf.org/html.charters/ips-charter.html
• ANSI T11, Fibre Channel: www.t11.org/index.htm

INTRODUCTION TO STORAGE AREA NETWORKING

Section Agenda
• Storage Terms and Acronyms
• Storage Networking Devices
• Storage Networking Applications
• Storage Networking Topologies
• Introduction to Storage Protocols

STORAGE TERMS AND ACRONYMS

Technologies Overview (or "Storage in a Nutshell")
[Diagram: servers and mainframes reach storage through IP clouds and FC switches. Labeled groups: Storage Area Network (SAN) Technologies (Generic Fibre Channel SAN, Enhanced Fibre Channel, Virtual SAN, FSPF, FC HA, Embedded Management, Call Home Support Center); SAN Protocols (iSCSI, FCIP, iSCSI Drivers); SAN Applications (Databases, Backup Apps, Storage Virtualization, RAID and Virtual RAID Mirroring); storage devices (JBODs and NAS, Tape).]

Introduction to SAN Terminology
• Block Level I/O
• File Level I/O
• SCSI: Small Computer Systems Interface
• FC: Fibre Channel
• RAID: Redundant Array of Inexpensive Disks
• iSCSI: Internet SCSI
• FCIP: Fibre Channel over TCP/IP
• iFCP: Internet Fibre Channel Protocol
• iSNS: Internet Storage Name Service

RAID Levels
RAID Level   Description                                      Min Disks
0            Striping/Concatenation                           2
1            Mirror                                           2
0+1          Striping/Concatenation, then Mirror              4
1+0          Mirror, then Striping/Concatenation              4
2            Hamming Code                                     N/A
3            Fixed parity with concerted I/O                  N/A
4            Fixed parity with random I/O                     N/A
5            Striping with distributed parity, random I/O     3 without log, 4 with log

Terminology: Direct Attached Storage (DAS)
• Block level I/O
• Can be internal or external
• Typically SCSI or FC
• Limited scalability
• High cost due to management

Terminology: Network Attached Storage (NAS)
• File level I/O
• Used for file sharing applications
• IP-based
• Deployed over existing low-cost Ethernet networks
• Redundant links
• Scalable
• Multiple servers can share the same file system

Terminology: Storage Area Network (SAN)
• Block level I/O
• Deployed as a separate network
• Servers share the storage subsystem
• Scalable
• Multiple paths for high availability

STORAGE NETWORKING DEVICES

SAN Components: Host Bus Adapter (HBA)
• Interface between host and storage
• Supports copper or optical
• Typically one port; can be multiple ports
• 1 Gbps, 2 Gbps, and 4 Gbps

SAN Components: Fabric Switch
• 1 Gbps, 2 Gbps, and 4 Gbps
• 8-40 ports
• Low latency
• Can be copper or optical

SAN Components: Director Class Switch
• 1 Gbps, 2 Gbps, 4 Gbps, and 10 Gbps
• FC and FICON
• 256 ports and growing
• Low latency
• Can be copper or optical
• Multi-service platforms

SAN Components: JBOD
• Just a bunch of disks
• Limited scalability
• Typically 2 FC ports
• SCSI or FC disks
• Basic controllers
• No caches

SAN Components: Storage Arrays
• 36 GB to many TB
• Typically 2 to many interfaces
• Subsystems may mix interfaces
• ESCON/FICON, SCSI, FC, or iSCSI
• SCSI or FC disks
• Intelligent controllers
• Large caches

SAN Components: Tape Arrays
• Tape speeds vary: 5 MB/s to 30 MB/s+
• Capacities vary: 20 GB to 300 GB+
• Deployed in servers or external libraries
• SCSI, FC, or Ethernet interface
• DLT most common; LTO gaining traction

STORAGE NETWORKING APPLICATIONS

IT Storage Requirements
• Scalability
  - Meet high growth demand for storage capacity (>80% per year)
  - Increase capacity utilization rates
• Availability
  - Share data across distributed data centers over high-speed, long-distance links
  - Provide effective disaster recovery
  - Improve interoperability across heterogeneous equipment
  - Enhance security
• Manageability
  - Automate storage management functions
  - Provide cross-vendor management tools
  - Manage heterogeneous environments

Storage Network Build-Out (Starting Point)
• Application-specific islands of networked storage
  - Homogeneous infrastructure ("isolated islands")
• iSCSI SAN
  - Convenient extension of existing FC SAN to IP-attached servers
• Extensive IP services for NAS environments
[Diagram labels: DAS, NAS]

Storage Network Interconnection (Present Trend)
• SAN interconnection for
  - SAN interconnectivity
  - Business continuance
  - Unified management
  - Remote backup
• Metro DWDM solutions
  - Low-latency option for synchronous replication
• FCIP
  - Lower-cost option for asynchronous replication and backup consolidation
[Diagram labels: FCIP, Optical]

Intelligent SAN
• Intelligent services in the network
• Common management framework
• Content, file, and block awareness
• Transport independent
[Diagram labels: Storage Utility, SAN Data Mgmt Services, Storage Routing, Content Delivery, Storage Switching, Host Awareness, Storage Virtualization, Storage Management]

STORAGE NETWORKING TOPOLOGIES

SCSI I/O Topology
• SCSI is the protocol used to communicate between servers and storage devices
• The SCSI I/O channel provides a half-duplex pipe for SCSI commands and data
• Parallel implementation
  - Bus width: 8 or 16 bits
  - Bus speed: 5–80 MHz
  - Throughput: 5–320 MBps
  - Devices per bus: 2–16
  - Cable length: 1.5 m–25 m
• A network approach can scale the I/O channel in many areas (length, devices, speed)
[Diagram: Host System (Initiator) with SCSI Adapter connected over the SCSI bus to a Target]

Fibre Channel Topology
• Very common method for networking SCSI
• Fibre Channel provides high-speed transport for the SCSI payload
• Fibre Channel overcomes many shortcomings of DAS, including:
  - Addressing for up to 16 million nodes (24 bits)
  - Loop (shared) and fabric (switched) transport
  - Speeds of 100 or 200 MBps (1 or 2 Gbps)
  - Distance of up to 10 km (without extenders)
  - Support for multiple protocols
• Combines the best attributes of a channel and a network
[Diagram: Host System (Initiator) with Fibre Channel HBA, carrying SCSI over a Fibre Channel Fabric to a Target]

iSCSI Storage Topology
• IP access to open storage subsystems
• An iSCSI driver is loaded onto hosts on the Ethernet network
• Able to consolidate servers via iSCSI onto existing storage arrays
• Able to build Ethernet-based SANs using iSCSI arrays
• Storage assigned by iSCSI instance
[Diagram: iSCSI-enabled hosts (initiators) on an IP network reach an iSCSI array (target) directly, or reach a storage pool (target) on the FC fabric through an iSCSI router; an FC HBA-attached host (initiator) connects to the same FC fabric.]

FCIP SAN Extension Topology
• FCIP gateways encapsulate Fibre Channel frames into IP packets and reverse that process at the other end
• FC switches connect to the FCIP gateways through an E_Port to extend the SAN fabric to a remote location
• A tunnel connection is set up through the existing IP network routers and switches across the LAN/WAN/MAN
[Diagram: at the production site, database servers, a backup server, and other servers attach to an FC SAN with storage; an FC switch and FCIP gateway carry traffic (for example EMC SRDF) over the existing IP network (LAN/WAN/MAN) to a remote FCIP gateway, FC switch, and storage used for backup, R&D, shared storage, standby, data warehousing, etc.]

FCIP and iSCSI: Complementary
• FCIP: SAN-to-SAN over IP
• iSCSI: Host to storage over IP
[Diagram: hosts reach FC SANs over the IP network through storage routers using iSCSI; two FC SANs are joined across the IP network by a pair of FCIP gateways.]

INTRODUCTION TO STORAGE PROTOCOLS

Introducing SCSI
• SCSI = Small Computer System Interface
• SCSI is a standard that defines an interface between an initiator (usually a computer) and a target (usually a storage device such as a hard disk)
• "Interface" refers to the connectors, cables, electrical signals, optical signals, and command protocol that allow initiators and targets to communicate

SCSI Example
In this case, a file is being written to the hard drive by an application on the workstation.
[Diagram: a workstation (Initiator) connects through a SCSI connector and SCSI cable to Target 1 (disk) and Target 2 (tape).]
The SCSI command protocol is used to communicate between SCSI devices. SCSI command fields, in order:
  Opcode (2A = Write 10) | Reserved | LBA (0010E43) | Reserved | Len (128) | Control

Why Is SCSI Important for SANs?
• The SCSI command protocol is the de facto standard and is used extensively in high-performance storage applications
• The command part of SCSI can be encapsulated in FCP (Fibre Channel Protocol) or IP and carried across internetworks; this is the core concept behind storage area networking
• To understand the finer points of transporting SCSI across a network with FC or Ethernet, the basics of SCSI must be well understood

Standards
• SCSI has evolved since it was introduced as SASI in 1979 by Shugart Associates; it was approved as a standard by ANSI in 1986 and is now referred to as SCSI-1
• SCSI-2 was approved by X3 in 1990 and by ANSI in 1994
• SCSI-3 refers to a collection of standards, each of which defines a very specific part of SCSI: physical interface, transport interface, command interface, architecture model, programming interface, etc.

Sample SCSI Standard Components
SCSI Parallel Interface: SPI
[Diagram: Initiator connected to Target 1 and Target 2 over the parallel interface]
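
The command block shown on the SCSI Example slide can be sketched in code. This is a minimal illustration, assuming the standard 10-byte WRITE(10) layout (opcode, flags byte, 4-byte big-endian LBA, group byte, 2-byte big-endian transfer length, control byte) and reading the slide's LBA as hexadecimal 0x10E43. The function name `build_write10_cdb` is hypothetical and not part of any library.

```python
import struct

WRITE_10 = 0x2A  # opcode shown on the slide


def build_write10_cdb(lba: int, num_blocks: int, control: int = 0) -> bytes:
    """Pack a 10-byte WRITE(10) command descriptor block (illustrative only)."""
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("LBA must fit in 32 bits")
    if not 0 <= num_blocks <= 0xFFFF:
        raise ValueError("transfer length must fit in 16 bits")
    # > = big-endian; B = 1 byte, I = 4 bytes, H = 2 bytes
    # opcode, flags/reserved, LBA, group/reserved, length, control
    return struct.pack(">BBIBHB", WRITE_10, 0, lba, 0, num_blocks, control)


# LBA and length taken from the slide (LBA interpreted as hex, an assumption).
cdb = build_write10_cdb(lba=0x10E43, num_blocks=128)
print(cdb.hex())
```

The same 10 bytes are what a transport such as FCP or iSCSI would carry on behalf of the initiator; only the wrapping around them changes.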

Sample SCSI Standard Components
SCSI Primary Commands: SPC
[Diagram: Initiator, Target 1, and Target 2 all implement SCSI Primary Commands (SPC-2); one target adds SCSI Block Commands (SBC), the other SCSI Stream Commands (SSC).]

SCSI Standards: The Big Picture
[Diagram labels: CAM, ASPI, Generic; SBC, SSC, SES, More...; SPC-2 / SPC-3; ATAPI, FCP, SBP; FC-xx, 1394, SPI-x]

SCSI Architecture Model
"This specification describes a reference model for the coordination of standards applicable to SCSI-3 I/O systems and a set of common behavioral requirements which are essential for the development of host software and device firmware that can interoperate with any SCSI-3 interconnect or protocol."
SCSI Architecture Model, November 1995

SCSI Architecture Model
• The SCSI Architecture Model (SAM) defines generic requirements and implementation requirements
• Each SCSI implementation standard must fulfill the requirements set forth by SAM

SAM Highlights: Client-Server
• SCSI is a client-server protocol
• The client is called the initiator (usually the OS I/O subsystem) and issues requests to the server
• The server is called the target (usually the SCSI controller that is part of a storage device); it receives and executes initiator requests and returns the associated responses

SAM Highlights: Initiator and Target
• A single initiator can have multiple application clients
• Targets have ONE task manager and one OR MORE Logical Units (LU), which are numbered (LUN)
• The task manager has the authority to modify service requests that have already been received by the target

SAM Highlights: Logical Units
• Each logical unit within a target is numbered; that number is called a LUN and is the only way to refer to that logical unit
• The device server is the entity that receives, executes, and returns requests that are made to its logical unit
• The concept of a task set is beyond the scope of this presentation

SAM Highlights: Command Model
• SAM defines two categories of protocol services:
  - Execute command/confirmation services
  - Data transfer services
• This leads to the three main phases of a data transfer:
  1. Execute: Send the required command and parameters via the CDB
  2. Data: Transfer data in accordance with the command
  3. Confirmation: Receive confirmation of command execution

SAM Highlights: Sample Data READ
1. Send SCSI Command is issued by the initiator; the command sent is READ
2. SCSI Command Received by the target; data transfers occur during the "working" phase between initiator and target
   ...
3. Send Command Complete is returned by the target
4. Command Complete Received by the initiator
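
The three phases above (execute, data, confirmation) can be sketched as a toy in-memory model of one initiator reading from one target. This is only an illustration of the client-server flow: the class names, the 512-byte block size, and the string opcode are assumptions made for the example, while GOOD and CHECK CONDITION are the usual SCSI status names.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 512  # assumed block size for the example


@dataclass
class LogicalUnit:
    """Device server for one logical unit; unwritten blocks read as zeros."""
    blocks: dict = field(default_factory=dict)

    def read(self, lba: int, count: int) -> bytes:
        return b"".join(
            self.blocks.get(lba + i, bytes(BLOCK_SIZE)) for i in range(count)
        )


@dataclass
class Target:
    """A target exposes one or more logical units, addressed by LUN."""
    logical_units: dict

    def execute(self, lun: int, opcode: str, lba: int, count: int):
        lu = self.logical_units.get(lun)
        if lu is None or opcode != "READ":
            return b"", "CHECK CONDITION"
        return lu.read(lba, count), "GOOD"


def initiator_read(target: Target, lun: int, lba: int, count: int):
    """Run one READ and record the three phases of the command model."""
    trace = [f"1. execute: READ lun={lun} lba={lba} count={count}"]
    data, status = target.execute(lun, "READ", lba, count)
    trace.append(f"2. data: {len(data)} bytes transferred")
    trace.append(f"3. confirmation: status={status}")
    return data, status, trace


target = Target({0: LogicalUnit({5: b"A" * BLOCK_SIZE})})
data, status, trace = initiator_read(target, lun=0, lba=5, count=2)
print("\n".join(trace))
```

A request to a LUN the target does not expose returns CHECK CONDITION with no data, which mirrors the rule that the LUN is the only way to refer to a logical unit.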

SAM Highlights: Parameters
• The data transfer model reflects parameters that will be used by SCSI commands
• This model illustrates that a complete data transfer (right) can be broken up into multiple parts (left)

SAM Highlights: Communication Model
• SAM defines a hierarchy of protocols
• Let's expand on this portion

SCSI Transport Protocol (Today's In-Depth Protocol Discussions)
[Diagram: the SCSI protocol at the top is carried by the parallel bus, by FCP over Fibre Channel, or by iSCSI, iFCP, and FCIP over TCP, IP, and Ethernet.]

STORAGE PROTOCOLS IN-DEPTH

Section Agenda
• Introduction to Standards
• SCSI Protocol
• Fibre Channel Protocol
• Internet SCSI (iSCSI)
• Fibre Channel over IP (FCIP)
• Internet Fibre Channel Protocol (iFCP)
• iSNS and SLP

INTRODUCTION TO STANDARDS

Standards Groups: Storage
• ISO / IEC JTC-1
• American National Standards Institute (ANSI)
• Information Technology Industry Council (ITI)
• InterNational Committee for Information Technology Standards (INCITS)
  - J11: C
  - J16: C++
  - T10, Technical Committee on Lower-Level Interfaces (www.t10.org): SCSI
  - T11, Technical Committee on Device-Level Interfaces (www.t11.org): Fibre Channel, HIPPI, IPI
  - T13, Technical Committee on AT Attachment Interfaces (www.t13.org): ATA (IDE), ATAPI

Standards Process
• Technical committees (such as T10) write drafts
• Drafts are sent to INCITS for approval
• Once approved by INCITS, drafts become standards and are published by ANSI
• ANSI promotes American National Standards to ISO as a Joint Technical Committee member (JTC-1)

Standards Work Group: IP Storage
• The IP Storage Technical Work Group acts as the primary technical focal point of the Storage Networking Industry Association (SNIA) on IP storage issues, coordinating with the SNIA IP Storage Forum
• ISOC: Internet Society
• IESG: Internet Engineering Steering Group
• IETF: Internet Engineering Task Force
• The Transport Area has 23 WGs, one of which is the IP Storage WG
• The IETF is the organization ratifying the IPS standards

FIBRE CHANNEL IN-DEPTH

Fibre Channel Protocol Agenda
• FC Introduction
• Fibre Channel Communications Model
• Protocol Constructs
• FC-PH (Fibre Channel: Physical and Signaling Interface)
• Login Parameters
• Frame Processing
• Arbitrated Loop
• Switch Fabric Operation
• Switch and Hub Mixed Topology Network Operations
• FC Error Management

Fibre Channel Environment
• Channel reliability
  - Multiprotocol support
  - Over shared serial media
  - With networking capability and functionality

Fibre Channel Environment
• High bandwidth
• High data integrity
• Highly reliable
• Scalable
• Shared media
• Circuit/packet
• Multiple protocol support
• Destination paced
  - Buffer credits
• High availability
• Transport flexibility
  - Dedicated connection: Class 1
  - Multiplexed: Class 2
  - Datagram: Class 3
• Configuration flexibility
  - Switch
  - Loop

What Is It?
Channels                 Networks
• Connection service     • Connectionless
• Physical circuits      • Logical circuits
• Reliable transfers     • Unreliable transfers
• High speed             • High connectivity
• Low latency            • Higher latency
• Short distance         • Longer distance
• Hardware intense       • Software intense

What Is It?
Channels                 Networks                 Fibre Channel
• Connection service     • Connectionless         • Circuit and packet switched
• Physical circuits      • Logical circuits       • Reliable transfers
• Reliable transfers     • Unreliable transfers   • High data integrity
• High speed             • High connectivity      • High data rates
• Low latency            • Higher latency         • Low latency
• Short distance         • Longer distance        • High connectivity
• Hardware intense       • Software intense       • Long distance

Fibre Channel Protocol Levels
Level   Function
FC-4    Upper-level protocol mappings: SCSI, IP, ATM, HIPPI, 370, Cluster, OEM
FC-3    Common Services
FC-2    Signaling Protocol    (FC-PH)
FC-1    Transmission Code     (FC-PH)
FC-0    Physical Interface    (FC-PH)
FC-PH = Physical and Signaling Layer
[Diagram labels: N_Port, F_Port]

Fibre Channel Functions
The structure is divided into 5 levels of functionality:
• FC-0 defines the physical interface characteristics
  - Signaling rates, cables, connectors, distance capabilities, etc.
• FC-1 defines how characters are encoded/decoded for transmission
  - Transmission characters are given desirable characteristics
• FC-2 defines how information is transported
  - Frames, sequences, exchanges, login sessions
• FC-3 is a placeholder for future functions
• FC-4 defines how different protocols are mapped to use Fibre Channel
  - SCSI, IP, virtual interface architecture, others

Fibre Channel Topologies
• Point to point (N_Port to N_Port)
• Arbitrated loop (L_Ports, with NL_Ports)
• Switched fabric (N_Ports attached to F_Ports)

Point to Point
• Dedicated connection between 'N' port Fibre Channel devices
• All link bandwidth is dedicated to communication between the two nodes
• Suitable for small-scale scenarios where storage devices are dedicated to file servers

Arbitrated Loop (FC-AL)
• The TX of each node is connected to the RX of the next node until a closed loop is formed
• Maximum bandwidth: 100 MB/sec (shared among all nodes on the loop)
• 126 nodes maximum on a loop
• Not a token-passing scheme: no limit on how long a device may retain control
• Operational sequence:
  - Arbitrate for control of the loop
  - Open channel to target
  - Transfer data
  - Close
• The number of nodes on the loop directly affects performance
[Diagram: L_Ports connected in a loop through a Fibre Channel hub]

Data Integrity
Level                  Mechanism
Upper Level Protocol   Operation control and byte counts
Signaling Protocol     Operation, frame counts, CRC (32 bit), frame delimiters
Transmission Code      8b/10b code
Physical Media         Fibre reliability

Flow Control
• Back pressure technique
• Frame credit
  - Established by the receiver during LOGIN
• Transmitter
  - Must have credit to transmit
• Receiver
  - Reinstates credit with ACK

FIBRE CHANNEL COMMUNICATIONS MODEL
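
Before moving into the communications model, the credit-based back pressure described on the Flow Control slide can be sketched as a small simulation. It is a simplified model under stated assumptions: one credit per frame, credit granted once at login, and each ACK returning exactly one credit. The class name `CreditedLink` is hypothetical.

```python
from collections import deque


class CreditedLink:
    """Toy model of frame-credit flow control between two ports."""

    def __init__(self, credits: int):
        self.credits = credits      # granted by the receiver at login
        self.in_flight = deque()    # frames sent but not yet acknowledged

    def try_send(self, frame) -> bool:
        """Transmit only if a credit is available; otherwise hold the frame."""
        if self.credits == 0:
            return False            # back pressure: transmitter must wait
        self.credits -= 1
        self.in_flight.append(frame)
        return True

    def receive_and_ack(self):
        """Receiver consumes the oldest frame and reinstates one credit."""
        frame = self.in_flight.popleft()
        self.credits += 1
        return frame


link = CreditedLink(credits=2)
sent = [link.try_send(f) for f in ("a", "b", "c")]  # third send is refused
first = link.receive_and_ack()                      # ACK frees one credit
retry = link.try_send("c")                          # now succeeds
print(sent, first, retry)
```

Because the receiver sets the credit count, the transmitter can never overrun the receiver's buffers; the cost is that a small credit count on a long link leaves the pipe idle while ACKs are in transit.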

The Model
• The Fibre Channel communications model is based on the definition of:
  - Physical objects
  - Protocol constructs
• These objects and constructs:
  - Define the behavior of the physical elements
  - Control the transfer of information
  - Provide for "link" management
  - Provide the basis for hardware, firmware, and software

Physical
• The fundamental physical objects in Fibre Channel are:
  - Ports
  - Link
  - Nodes
  - Fabric
• Some logical items used in these discussions are:
  - Addressing
  - Communications model

Fibre Channel: Port Types
• 'N' port: Node ports are used for connecting peripheral storage devices to a switch fabric or for point-to-point configurations; can be considered the end port
• 'F' port: Fabric ports reside on switches and allow connection of storage peripherals ('N' port devices)
• 'L' port: Loop ports are used in arbitrated loop configurations to build storage peripheral networks without FC switches; these ports often also have 'N' port capabilities and are called 'NL' ports
• 'E' port: Expansion ports are essentially trunk ports used to connect two Fibre Channel switches
• 'G' port: A generic port capable of operating as either an 'E' or 'F' port; it is also capable of acting in an 'L' port capacity (auto discovery)

N_Port
[Diagram: a host or device connects through its host/device interface to an N_Port, which has a serial data out line and a serial data in line.]

Link
• A link consists of 2 unidirectional "fibers" transmitting in opposite directions
• May be either:
  - Optical fiber
  - Copper
• Transmitters may be:
  - Long wave laser
  - Short wave laser
  - LED
  - Electrical

Link Transfer Rates
Clock (Mbaud)   Mbytes/sec
1062.5          100
265.625         25

Link
[Diagram: a host or device connects through its host/device interface to an N_Port; the N_Port's serial data out and serial data in lines together form the link.]

Node
• The equipment which contains one or more N_Ports or NL_Ports (topology dependent)
• May be a:
  - Computer
  - Controller
  - Device
• Is NOT a switch fabric

Node
[Diagram: a controller node containing four N_Ports, each with its own link]

Communications Model
• Point to point
• N_Port to N_Port
• Flow control
• Acknowledged
[Diagram: two nodes, each with an N_Port; the transmitter of each N_Port connects over the link to the receiver of the other.]

Fabric
• The entity which interconnects N_Ports
• Provides routing based on destination address
• A fabric may be:
  - Point to point: no routing required
  - Switched: routing provided by the switch
  - Arbitrated loop: routing is distributed throughout the attached L_Ports
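
A quick arithmetic check shows how the link clock relates to the byte rate in the Link Transfer Rates slide. The sketch assumes 8b/10b encoding (10 line bits per data byte) and the standard full-speed and quarter-speed signaling rates of 1062.5 and 265.625 Mbaud; the function name is hypothetical.

```python
def encoded_byte_rate(mbaud: float) -> float:
    """Data bytes per second (in MB/s) carried by an 8b/10b-encoded link."""
    data_mbits = mbaud * 8 / 10   # 8 data bits for every 10 line bits
    return data_mbits / 8         # 8 data bits per byte


for clock in (1062.5, 265.625):
    print(f"{clock} Mbaud -> {encoded_byte_rate(clock)} MB/s before framing overhead")
```

The results are slightly above the nominal 100 and 25 Mbytes/sec figures; the difference is consumed by frame delimiters, headers, CRC, and idle words, so the nominal figures are closer to usable payload rates.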

Terms
• Topology
  - The physical structure of the interconnect of ports
  - Defines the logical behavior of transactions
  - Fibre Channel has 3 topologies: point to point, switched, and arbitrated loop
• Fabric
  - The fabric is the generic item that interconnects nodes
  - A fabric is made of Fibre Channel topologies such as point to point, switches, and loops

Point to Point
[Diagram: two nodes, each with an N_Port; the transmitter of each connects to the receiver of the other, and this link is the "fabric".]
Communications model:
• Source to destination
• Based on address routing through the fabric

Switched Fabric
[Diagram: six N_Ports attached to a switch fabric; traffic flows from A to B.]
Communications model: source to destination, based on address routing through the fabric

Arbitrated Loop
[Diagram: NL_Node "A" and NL_Node "B" connected by links in a loop; traffic flows from A to B.]
Communications model: source to destination, based on address routing distributed in the NL_Ports on the loop

FC PROTOCOL CONSTRUCTS

What Are Protocol Constructs
• The fundamental protocol structures in Fibre Channel are called constructs, and they are:
  - Frames
  - Sequences
  - Exchanges
  - Information Units (IU)
  - Procedures
  - Upper Layer Protocols (ULPs)

Construct Introduction
• FC-2 defines these constructs, which allow related information to be:
  - Grouped together
  - Coordinated
  - Handled in an efficient manner
• To accomplish this we define the notion of:
  - Frames
  - Sequences
  - Exchanges
• Also defined are the means for the Upper Level Protocols (ULPs) to communicate with FC-2:
  - Information Units (IU)
• A procedure called the login defines the operating environment between the N_Ports
  - Exchange of the data describing the characteristics of the ports

Chunks
• The ULPs deal with "chunks" of data that are moved across the network
• These chunks of data may be either:
  - Control
  - Status
  - Real data

Frames
• The FC-2 layer takes this chunk of data and moves it from the transmitting node to the receiving node in units that Fibre Channel calls frames
• FC-2 determines the size of the frames based on the operating environment established between the two communicating nodes
[Diagram labels: FC-3 Common Services, FC-2 Signaling Protocol, FC-1 Transmission Code, FC-0 Physical Interface]

Frame Structure
General FC-2 frame format:
Field    Idles   SOF   Frame Header   Data Field   CRC   EOF   Idles
Bytes    24*     4     24             0-2114       4     4
• The CRC is calculated on the frame header and data field only
• * 6 idle words (24 bytes) are required by the transmitter; 2 idle words (8 bytes) are guaranteed to the receiver
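
The way FC-2 cuts a ULP chunk into frames can be sketched as follows. This is a simplified illustration, not the FC-2 algorithm itself: it assumes a fixed maximum payload of 2048 bytes per frame (in practice the size comes from login parameters), ignores optional headers and fill bytes, and counts 36 bytes of per-frame overhead from the format above (SOF 4 + header 24 + CRC 4 + EOF 4). The function names are hypothetical.

```python
FRAME_OVERHEAD = 4 + 24 + 4 + 4  # SOF + frame header + CRC + EOF, in bytes


def segment_chunk(chunk: bytes, max_payload: int = 2048) -> list:
    """Split one ULP chunk into frame payloads of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [chunk[i:i + max_payload] for i in range(0, len(chunk), max_payload)]


def wire_bytes(payloads: list) -> int:
    """Total bytes transmitted for the frames, excluding inter-frame idles."""
    return sum(len(p) + FRAME_OVERHEAD for p in payloads)


payloads = segment_chunk(bytes(5000))
print([len(p) for p in payloads], wire_bytes(payloads))
```

A 5000-byte chunk needs three frames here, and the fixed overhead per frame is why larger negotiated frame sizes use the link more efficiently.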
Frame Header
    Word   Bits 31-24        Bits 23-16        Bits 15-0
    0      R_CTL (routing)   D_ID, 24 bits (destination)
    1      CS_CTL (class specific), 8 bits     S_ID, 24 bits (source)
    2      TYPE (data structure), 8 bits       F_CTL, 24 bits (frame control)
    3      SEQ_ID, 8 bits    DF_CTL, 8 bits (data field)    SEQ_CNT, 16 bits (sequence count)
    4      OX_ID, 16 bits (originator exchange ID)          RX_ID, 16 bits (responder exchange ID)
    5      Parameter (specific to frame type)

Data Field (0-2112 bytes)
    Optional headers   0-64 bytes
    Payload            0-2112 bytes (0-2048 typical MTU)
    Fill               1-3 bytes

Sequence
• Sequences
    Each chunk of Upper Level Protocol (ULP) data is moved within the envelope of what Fibre Channel calls a Sequence (SEQ)
    A sequence consists of a set of related frames
    As expected, there are lots of rules governing sequences
• Information Units (IU)
    The ULP tells FC-2 how to transfer these chunks of data through a structure called an information unit
    Very few rules for IUs
    The IU is a convention defined outside of FC-PH
    IUs are unique to each upper level protocol

Sequence
• Sequence Initiator (SI)
    The N_Port which is transmitting the data frames
• Sequence Recipient (SR)
    The N_Port which is receiving the data frames
Diagram: the SI N_Port breaks a data chunk into data frames that cross the fabric to the SR N_Port, which rebuilds the data chunk.

Sequence
Diagram: an Initiator and a Target connected through the fabric. The Read Command chunk travels as one sequence from the Initiator (SI) to the Target (SR); the Data chunk returns as a sequence of data frames from the Target (now SI) to the Initiator (now SR), followed by a Status sequence.
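The 24-byte frame header introduced above can be unpacked with a few shifts and masks. This is a minimal sketch rather than a reference decoder: it assumes big-endian byte order on the wire and the 16-bit SEQ_CNT and RX_ID widths given in the field-by-field slides later in this section. The `FcHeader` and `parse_header` names are hypothetical.

```python
import struct
from typing import NamedTuple


class FcHeader(NamedTuple):
    r_ctl: int
    d_id: int
    cs_ctl: int
    s_id: int
    fc_type: int
    f_ctl: int
    seq_id: int
    df_ctl: int
    seq_cnt: int
    ox_id: int
    rx_id: int
    parameter: int


def parse_header(raw: bytes) -> FcHeader:
    """Split a 24-byte FC-2 frame header into its fields."""
    if len(raw) != 24:
        raise ValueError("an FC-2 frame header is exactly 24 bytes")
    # Words 0-2 each pair an 8-bit field with a 24-bit field.
    w0, w1, w2, seq_id, df_ctl, seq_cnt, ox_id, rx_id, param = struct.unpack(
        ">IIIBBHHHI", raw
    )
    return FcHeader(
        r_ctl=w0 >> 24, d_id=w0 & 0xFFFFFF,
        cs_ctl=w1 >> 24, s_id=w1 & 0xFFFFFF,
        fc_type=w2 >> 24, f_ctl=w2 & 0xFFFFFF,
        seq_id=seq_id, df_ctl=df_ctl, seq_cnt=seq_cnt,
        ox_id=ox_id, rx_id=rx_id, parameter=param,
    )
```

A named tuple keeps the example short; a real implementation would likely also validate field combinations against the frame type.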
Sequence Identifier
• The sequence initiator assigns an "identifier" to each sequence
    This identifier is called the Sequence_Identifier or SEQ_ID
    The SEQ_ID uniquely identifies a given sequence within the context of the operation
    Each frame is identified within this operation by SEQ_ID and SEQ_CNT

Sequences: Active and Open
• Sequence Initiator (SI)
    A sequence is ACTIVE from the time the first frame of the sequence is transmitted until the frame with the end sequence flag is sent
    A sequence is OPEN from the time the first frame is transmitted until the reception of the ACK to the last frame
• Sequence Recipient (SR)
    A sequence is ACTIVE and OPEN from the time the first frame of the sequence is received until the transmission of the ACK to the last frame of that sequence

Sequences: Active and Open
Timeline diagram between the Originator (SI) and the Responder (SR). Labels: First Data_Frame, SOF Received, ACK to first Frame, Frame with End_SEQ set, ACK to last Frame, EOF Transmitted, EOT Received. The SI side is marked Active and Open; the SR side is marked Active & Open.

Sequence Streaming
• Sequence streaming is the ability to begin transmission of the next sequence while one or more previous sequences are OPEN
• The Sequence Recipient (SR) grants permission to have up to "n" streaming sequences
    This is determined at N_Port login time
    Must support "n=1" sequence status blocks (state info)
    (This allows for more data in the pipe for distant connections)
Exchange
• Upper level protocols frequently deal with related bits of data as:
    Request/reply
    Command/data/status
• These relationships are called "operations"
• Exchanges
    "Operations" of data are grouped together into what Fibre Channel calls exchanges
    An exchange consists of a set of related sequences
    Exchanges are bi-directional
    Sequences are unidirectional and sequential
• There are other rules that govern exchanges

Exchange
Diagram: the same Initiator/Target read example, with the Read Command sequence, the Data sequence and the Status sequence grouped together as one Exchange.

Exchange
• Exchange originator
    The N_Port which transmitted the FIRST data frame of this exchange
• Exchange responder
    The N_Port which is the destination of the FIRST data frame of this exchange
• The designations of originator and responder are fixed for the duration of the exchange, unlike the SI and SR, which change roles within the exchange

Exchange Identifiers (X_ID)
• An exchange has two "identifiers" associated with it
    Exchange originator: assigns an OX_ID which is meaningful to it
    Exchange responder: assigns an RX_ID which is meaningful to it
    In general terms either one is called the X_ID
• "Meaningful" means that the identifier refers to exchange "context": information such as state, control, and status with regard to the exchange
• An N_Port will create, save and update this information throughout the exchange based on the assigned X_IDs
Information Unit
• Upper Level Protocols (ULPs) know about Information Units (IUs) but know nothing about:
    Frames
    Sequences
    Exchanges
• A ULP deals with units like:
    Order of events within the operation
    Which node will transmit in the next "phase" (command phase, data phase, status phase)
    Is required to have some knowledge about Fibre Channel
• An information unit is a Fibre Channel sequence

Information Unit
• The IU contains information sets with such items as LUN, task attributes, CDB and the command byte count
• IUs are used in protocol mapping from FC-4 to FC-2 and are assigned an identifier that is useful to humans but not used by the machine
• All the information needed to support a ULP is formed into an IU table, in which each IU is listed as a first, middle or last IU in the exchange
• We will see more of these tables when we cover SCSI mapping onto Fibre Channel

FC-2 Hierarchy: The Hierarchy of Constructs
    Construct          Meaning                                                                       Frame Fields
    Exchange           Consists of one or more Sequences for a ULP operation                         OX_ID / RX_ID
    Information Unit   The structure used by the ULP to define a Sequence (not visible over link)
    Sequence           Consists of one or more related Frames                                        SEQ_ID
    Frame              Contains in its payload a ULP "chunk" of data                                 SEQ_CNT

FC-2 Hierarchy
Diagram: an EXCHANGE (frame fields OX_ID & RX_ID) contains SEQUENCEs (SEQ_ID); each sequence contains Frames (SEQ_CNT); in ULP terms each sequence is an Information Unit.

FC-PH (FIBRE CHANNEL: PHYSICAL AND SIGNALING INTERFACE) STRUCTURE, PROCEDURES, AND PROTOCOLS
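The hierarchy of constructs above (exchange, sequence, frame) maps naturally onto nested data structures. The sketch below is an illustration only: the class names, the `iu_to_sequence` helper and the 2048-byte payload limit are assumptions chosen for the example, with the actual limit coming from login in practice.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Frame:
    seq_id: int
    seq_cnt: int
    payload: bytes
    end_of_sequence: bool = False


@dataclass
class Sequence:
    seq_id: int
    frames: List[Frame] = field(default_factory=list)


@dataclass
class Exchange:
    ox_id: int
    rx_id: int
    sequences: List[Sequence] = field(default_factory=list)


def iu_to_sequence(iu: bytes, seq_id: int, max_payload: int = 2048) -> Sequence:
    """Carry one Information Unit as one sequence of numbered frames."""
    chunks = [iu[i:i + max_payload] for i in range(0, len(iu), max_payload)]
    chunks = chunks or [b""]  # an empty IU still needs one frame
    seq = Sequence(seq_id)
    for cnt, chunk in enumerate(chunks):
        last = cnt == len(chunks) - 1
        seq.frames.append(Frame(seq_id, cnt, chunk, end_of_sequence=last))
    return seq
```

A read operation would then be one `Exchange` holding a command sequence, a data sequence and a status sequence, each built from its own IU.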
Transmission Code
• Fibre Channel uses an 8b/10b transmission code
    Each 8 bit data byte to be transmitted is converted into a 10 bit quantity
    The 10 bit quantity is then transmitted over the FC media
    The 10 bit quantity is then converted back to the 8 bit data byte by the receiving node
• The 10 bit quantities are called transmission characters
• Transmission characters come in two forms
    Data characters
    Special characters

8b/10b Code: Why 8b/10b
1. To ensure that sufficient transitions are present in the serial bit stream to make clock recovery possible at the receiver
2. To increase the likelihood of detecting any single or multiple bit errors
3. To provide special characters that are distinctive and easily recognizable, in order to achieve word alignment on the incoming bit stream

8b/10b Code: Characteristics of 8b/10b
• The 10 bit transmission code
    Supports all 256 values of the 8 bit data byte
    Contains unused code points: illegal codes (called code violations)
    Detection of a code violation may occur on the transmission character in which the error occurred, or it may be detected on a subsequent character
    Contains "special" characters
    Running "disparity" with DC balance (the counts of 0's and 1's are equal over time)
8b/10b Code: Running Disparity
• Disparity: the difference between the number of ones and zeros in a transmission character
• Running disparity: a binary parameter indicating the cumulative disparity of all previously issued transmission characters
• Transmission characters always have either:
    6 ones and 4 zeros = positive disparity
    4 ones and 6 zeros = negative disparity
    5 ones and 5 zeros = neutral disparity
• Rules:
    A positive disparity transmission character cannot be followed by another positive transmission character
    A negative disparity transmission character cannot be followed by another negative transmission character
    At transmission character boundaries the difference between the number of ones and zeros is + or - 1

8b/10b Code: Code Notation
• Each valid transmission character has been assigned a name in the form Zxx.y
    "Z" = K or D (D = data, K = special character)
    "xx" = decimal value of the 5 least significant bits
    "y" = decimal value of the 3 most significant bits

Conversion Table
Diagram: FC-2 bits 7 6 5 4 3 2 1 0 (MSB to LSB) map to the FC-1 code bits H G F E D C B A; the example shows D1.0 and its FC-1 transmission character (negative disparity value).
j and i are added as part of the 10b conversion process

Special Characters
• K28.5 is the only special character used in Fibre Channel out of the 12 set aside
    Has no 8 bit representation
    The only FC transmission character with 5 consecutive 1's or 0's
    Used to find word boundaries and sync
    Used in ordered sets
    Two encodings, 0011111010 and 1100000101, one for each current running disparity (+ or -)
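The Dxx.y notation and the disparity rules above can be checked with a few lines of code. This is a sketch only: transmission characters are modeled as 10-character bit strings, the starting running disparity of negative is an assumption, and none of the function names come from the presentation.

```python
def data_code_name(byte: int) -> str:
    """Name a data byte in Dxx.y form: xx = low 5 bits, y = high 3 bits."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    return f"D{byte & 0x1F}.{byte >> 5}"


def disparity(char10: str) -> int:
    """Ones minus zeros for one 10-bit transmission character."""
    ones = char10.count("1")
    return ones - (len(char10) - ones)


def follows_disparity_rules(chars, start_negative: bool = True) -> bool:
    """Check that non-neutral characters alternate between + and -."""
    running_positive = not start_negative
    for c in chars:
        d = disparity(c)
        if d not in (-2, 0, 2):
            return False  # not a 6/4, 4/6 or 5/5 character
        if d == 0:
            continue      # neutral characters leave running disparity alone
        if (d > 0) == running_positive:
            return False  # same sign twice in a row breaks DC balance
        running_positive = d > 0
    return True
```

For example, the two K28.5 bit patterns shown on the Special Characters slide have disparities of +2 and -2, so they may alternate but neither may directly follow itself.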
Transmission
• A transmission word consists of 4 contiguous transmission characters treated as a unit
    40 bits long
    Aligned on a word boundary
    There are two kinds: the ordered set and the data word
Diagram (transmission order, bytes 3 to 0): a data word is four encoded data bytes; an ordered set is K28.5 followed by three encoded data bytes.

Ordered Set
• A transmission word starting with the K28.5 special character
• Three classifications of ordered sets are defined
    Delimiters
    Primitive signals
    Primitive sequences
Layout (MSB to LSB): K28.5, Dxx.y, Dxx.y, Dxx.y
The three data characters define the meaning of the ordered set and are repeated for the third and fourth character

Primitive Signals
• Primitive signals are ordered sets
    Transmission of primitive signals is interrupted occasionally to transmit frames
• Three basic types
    Receiver_Ready (R_RDY)
    Idle (Idle or I)
    Arbitrate (ARBx)

Delimiters
• Delimiters are ordered sets that delineate a frame
    They immediately precede and follow the contents of a frame
• Two basic types
    Start_of_Frame (SOF)
    End_of_Frame (EOF)
• SOF delimiters
    Identify the start of a frame
    Identify the transmission class
    Are used to establish a Class_1 connection
    Identify the beginning and continuation of a sequence
• EOF delimiters
    Terminate frames
    Identify the end of a sequence
    Terminate connections
    Indicate known frame errors
FC-1 Synchronization
• Procedures
    Sync acquire
    Initialization
    Loss of sync procedure
• Primitive sequences

Sync Procedures
• Bit synchronization
    The state in which a receiver is delivering retimed serial data at the required bit error rate
• Transmission word synchronization
    Achieved when the receiver identifies the same transmission word boundary on the received bit stream as that established by the transmitter at the other end of the link
    Acquired by detection of three consecutive ordered sets without errors
• Loss of synchronization procedure
    The receiver shall enter the loss-of-sync state upon detection of the fourth invalid transmission word
• Synchronization acquired procedure
    The receiver shall enter the synchronization-acquired state when it has achieved both bit and transmission word sync

Sync Acquired
State diagram: Loss of Sync state (waiting on bit synchronization) -> bit sync acquired -> ordered set #1 received -> ordered set #2 received -> ordered set #3 received -> Sync Acquired. Data words may be received between the ordered sets.

Loss-of-Sync Procedure
State diagram: Sync Acquired state (no invalid words detected) -> First Invalid Word -> Second Invalid Word -> Third Invalid Word -> Fourth Invalid Word -> Loss of Sync. From each intermediate state, one invalid word in the next 2 words moves one state toward Loss of Sync, and two consecutive valid words move one state back toward Sync Acquired.
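The loss-of-sync procedure above is a small state machine, and a sketch makes its hysteresis easier to see. The version below is a simplification under stated assumptions: it counts valid words as a running streak rather than in fixed two-word windows, and the `SyncMonitor` class is hypothetical, not taken from any standard or product.

```python
class SyncMonitor:
    """Tracks how close a receiver is to declaring loss of sync."""

    LOSS_LEVEL = 4  # the fourth invalid word causes loss of sync

    def __init__(self) -> None:
        self.level = 0       # 0 = sync acquired, no invalid words pending
        self.valid_run = 0   # consecutive valid words seen at this level
        self.in_sync = True

    def on_word(self, valid: bool) -> bool:
        """Feed one transmission word; return True while still in sync."""
        if not self.in_sync:
            return False
        if not valid:
            # An invalid word moves one state closer to loss of sync.
            self.level += 1
            self.valid_run = 0
            if self.level >= self.LOSS_LEVEL:
                self.in_sync = False
        elif self.level > 0:
            # Two consecutive valid words move one state back.
            self.valid_run += 1
            if self.valid_run == 2:
                self.level -= 1
                self.valid_run = 0
        return self.in_sync
```

The point of the design is that isolated errors are forgiven, while errors arriving faster than one per two good words walk the receiver down to loss of sync.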
FC-1 Constructs
• Port states
• Primitive sequences
    NOS/OLS/LR/LRR
• Primitive sequence protocols
    Sequence flows
• Relationships
• Port state transition table

Port States
• Four primary operational states
    Active state
    Link recovery state
    Link failure state
    Offline state
• Operational states of a port
    N_Ports
    F_Ports
• Port state changes occur
    As a result of conditions detected within the port
    In response to reception of primitive sequences
    In response to the upper level controlling entity

Primitive Sequences
• An ordered set that is transmitted continuously to indicate that specific conditions within the port are encountered
• Transmitted while the condition exists
• Four primitive sequences
    Not Operational Sequence (NOS)
    Offline Sequence (OLS)
    Link Reset Sequence (LR)
    Link Reset Response Sequence (LRR)

Primitive Sequence NOS: Not_Operational Sequence
• Transmitted by the port to indicate that
    Link failure has been detected
    Loss of sync
    Loss of signal
    Port is offline
Ordered set: K28.5 D21.1 D31.5 D5.2

Primitive Sequence OLS: Offline Sequence
• Transmitted by the port to indicate that it is:
    Initiating the link initialization protocol
    Receiving NOS
    Entering the Offline state
Ordered set: K28.5 D21.2 D10.4 D21.2
Primitive Sequence LR: Link Reset Sequence
• Transmitted by the port to indicate that it is:
    Initiating the link reset protocol
    Recovering from a link timeout
    Removing a Class_1 connection
Ordered set: K28.5 D9.2 D31.5 D9.2

Primitive Sequence LRR: Link Reset Response Sequence
• Transmitted by the port to indicate that:
    Link reset is being received
Ordered set: K28.5 D21.1 D31.5 D9.2

Primitive Sequence Protocols
• Link initialization protocol, required after
    Port power-on
    Port internal reset
    Port has been in the offline state
• Online to offline protocol
    Required to enter the offline state

Primitive Sequence Protocols
• Link failure protocol, required after
    Detection of loss of synchronization for a period of time greater than 100 ms, which is the receiver-transmitter timeout value (R_T_TOV)
    Loss of signal while not in the offline state
• Link reset protocol, required after
    Link reset
    Link timeout

Primitive Sequence Flows
State diagram: the Active state (AC), Link Failure state (LF), Link Recovery state (LR) and Offline state (OL), with transitions driven by the link failure protocol (NOS), the link initialization protocol, the link reset protocol, the online to offline protocol, and Idle.
Primitive Sequence Meanings
    Currently Transmitting       Meaning                                                               Transmit in Response
    NOS (Not Operational)        Link failure                                                          OLS
    OLS (Offline State)          Internal port failure; transmitter power down, perform diags,         LR
                                 or perform initialization; receiver shall ignore link error
                                 or link failure
    LR (Link Reset)              Remove Class_1 connection; reset F_Port; OLS recognized               LRR
    LRR (Link Reset Response)    Link reset recognized                                                 Idles
    Idle (Operational Link)      Idles and R_RDY recognized                                            Idles or R_RDY

Link Failure
Ladder diagram between Port A and Port B, both starting in AC:
    1. Link failure condition: Port A enters LF and sends NOS
    2. Port B enters LF and sends OLS
    3. Port A enters OL and sends LR
    4. Port B enters LR and sends LRR
    5. Port A enters LR and sends Idle
    6. Port B enters AC and sends Idle
    7. Port A enters AC
AC = Active state, LR = Link Recovery state, LF = Link Failure state, OL = Offline state

Offline
Ladder diagram between Port A and Port B, both starting in AC:
    1. Request to go offline: Port A enters OL and sends OLS
    2. Port B enters OL
    3. After 5 ms minimum, diags may be performed
    4. Request to go online: Port A sends LR
    5. Port B enters LR and sends LRR
    6. Port A enters LR and sends Idle
    7. Port B enters AC and sends Idle
    8. Port A enters AC
AC = Active state, LR = Link Recovery state, LF = Link Failure state, OL = Offline state

Frame Header Detail
• Routing control (R_CTL)
• Addressing (D_ID) (S_ID)
• Type (TYPE)
• Frame control (F_CTL)
• Sequence identifier (SEQ_ID)
• Sequence count (SEQ_CNT)
• Exchange identifiers (OX_ID) (RX_ID)
• Parameter field
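Returning to the primitive sequence meanings table earlier in this part, its "transmit in response" column is enough to replay the link failure recovery handshake. The following is a minimal sketch under simplifying assumptions: each port answers every primitive immediately, timers and port states are ignored, and the `RESPONSE` table and `handshake` function are illustrative names only.

```python
# "Transmit in response" column of the Primitive Sequence Meanings table.
RESPONSE = {
    "NOS": "OLS",
    "OLS": "LR",
    "LR": "LRR",
    "LRR": "IDLE",
    "IDLE": "IDLE",
}


def handshake(first: str, max_steps: int = 10) -> list:
    """List the primitives exchanged, alternating between the two ports."""
    trace = [first]
    while len(trace) < max_steps:
        reply = RESPONSE[trace[-1]]
        trace.append(reply)
        # Once Idle is answered with Idle, the link is operational.
        if trace[-2] == "IDLE" and reply == "IDLE":
            break
    return trace
```

Starting from NOS this reproduces the order of primitives in the Link Failure ladder: NOS, OLS, LR, LRR, Idle, Idle.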
Frame Detail: Routing Control
    Word   Bits 31-24        Bits 23-16        Bits 15-0
    0      R_CTL (routing)   D_ID, 24 bits (destination)
    1      CS_CTL (class specific), 8 bits     S_ID, 24 bits (source)
    2      TYPE (data structure), 8 bits       F_CTL, 24 bits (frame control)
    3      SEQ_ID, 8 bits    DF_CTL, 8 bits (data field)    SEQ_CNT, 16 bits (sequence count)
    4      OX_ID, 16 bits (originator exchange ID)          RX_ID, 16 bits (responder exchange ID)
    5      Parameter (specific to frame type)

Routing Control
• The routing control field is an 8 bit field
• R_CTL consists of two 4 bit sub-fields
    Routing (bits 31-28)
    Information category (bits 27-24)

Routing Control
• R_CTL is used to direct the frame to the process that should handle it; for example:
    Frames directed to the fabric for extended link services (0x22)
    Indication of the function or purpose of the frame payload from the upper level protocol at FC-4 (0x01)

Port Addressing
• The D_ID and S_ID fields are 24 bits each
• They provide the address or identifier of the source and destination port of a frame
• Although the address map is flat, there are several formats depending on:
    Topology
    Location
Port Address Identifiers
• Applicable to all topologies
    Point to point
    Switched
    Loop
• Dynamically assigned or administratively assigned
• Used for frame routing
    Unique within the Fibre Channel network
• Assigned by the "fabric"
• Some addresses are reserved for special functions

Port Address Identifiers
    Topology          Assignment
    Point to point    By the N_Port with the higher Worldwide Name (MAC)
    Switched          By the switch during fabric logon; bound to the physical port on the switch
    Arbitrated loop   Acquired during loop initialization

Address Identifiers
    Topology Model                               8 bits          8 bits   8 bits
    Switch                                       Switch Domain   Area     Device
    Private loop (not connected to a switch)     00              00       Arbitrated Loop Physical Address (AL_PA)
    Public loop (connected to a switch)          Domain          Area     AL_PA

Reserved Addresses
• FC-PH has defined a block of addresses for special functions:
    The high order 16 addresses in the 24 bit address space
    Called the well known addresses
• Main addresses used today
    FF FF FC   Directory Server
    FF FF FD   Fabric Controller
    FF FF FE   Fabric F_Port to which the N_Port is attached
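The address layouts above reduce to byte extraction from the 24-bit identifier. Below is a small sketch; the function names and the returned description strings are illustrative assumptions, while the field boundaries and well known addresses come from the tables above.

```python
WELL_KNOWN = {
    0xFFFFFC: "Directory Server",
    0xFFFFFD: "Fabric Controller",
    0xFFFFFE: "Fabric F_Port",
}


def split_fcid(fcid: int) -> tuple:
    """Return the (domain, area, device or AL_PA) bytes of a 24-bit address."""
    if not 0 <= fcid <= 0xFFFFFF:
        raise ValueError("address identifiers are 24 bits")
    return (fcid >> 16) & 0xFF, (fcid >> 8) & 0xFF, fcid & 0xFF


def is_well_known(fcid: int) -> bool:
    """True for the high order 16 addresses of the 24-bit space."""
    return 0xFFFFF0 <= fcid <= 0xFFFFFF


def describe(fcid: int) -> str:
    if fcid in WELL_KNOWN:
        return WELL_KNOWN[fcid]
    domain, area, low = split_fcid(fcid)
    if domain == 0 and area == 0:
        return f"private loop, AL_PA 0x{low:02X}"
    return f"domain 0x{domain:02X}, area 0x{area:02X}, device/AL_PA 0x{low:02X}"
```

Whether the low byte of a fabric address is a device or an AL_PA cannot be told from the address alone, which is why the sketch reports it as either.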
Type
• The TYPE is an 8 bit field
• Indicates the upper level protocol carried in the payload of the frame
• Examples:
    SCSI '08h'
    IP '05h'
    SNMP '24h'
    Fibre Channel services '20h'

Frame_Control
• The frame control is a 24 bit field
• It contains a number of flags that are used to control the flow of the sequence
• The more common flags cover exchange and sequence management, acknowledgement control and error conditions
    Bits 23-16 deal with the sequence and exchange settings
    Bits 15-14 deal with the X_ID
    Bits 13-12 form the ACK level for class 1 and 2
    Bits 5-4 are used for aborting the sequence

Frame Control Bits 13-12
• Acknowledgment capability
    Provides assistance to the Sequence Recipient (SR) by translating the ACK capability bits in the N_Port class parameters
    Meaningful only in Class 1 and 2 data frames
    0 0 = No ACK
    0 1 = ACK level 1: one for every frame
    1 0 = ACK level "N": N = number of frames
    1 1 = ACK level 0: single ACK for the complete sequence, used in video streaming
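Decoding the TYPE value and the two acknowledgment bits of F_CTL described above takes only a mask and a lookup. This sketch uses hypothetical helper names and covers only the values listed on the slides; anything else is reported as unknown rather than guessed.

```python
# TYPE examples listed on the Type slide.
TYPE_NAMES = {
    0x08: "SCSI",
    0x05: "IP",
    0x24: "SNMP",
    0x20: "Fibre Channel services",
}

# F_CTL bits 13-12, as listed on the Frame Control slide.
ACK_FORM = {
    0b00: "No ACK",
    0b01: "ACK_1",
    0b10: "ACK_N",
    0b11: "ACK_0",
}


def type_name(fc_type: int) -> str:
    return TYPE_NAMES.get(fc_type, f"unknown (0x{fc_type:02X})")


def ack_form(f_ctl: int) -> str:
    """Extract bits 13-12 from the 24-bit F_CTL field."""
    return ACK_FORM[(f_ctl >> 12) & 0b11]
```

For instance, an F_CTL value with only bit 12 set decodes to ACK_1, one acknowledgment for every frame.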
Sequences
• Deal with chunks of upper level protocol data
• Are made up of one or more frames which transport the ULP data
• The data phase may be subdivided into multiple sequences
• Uniquely identifiable with SEQ_ID
• The command, data, and status phases of SCSI are examples of sequences

Sequence Identifier
• The Sequence Identifier (SEQ_ID) is an 8 bit field
• All frames of a sequence carry the same SEQ_ID value
    The data content of these frames is related in some way by the ULP
Sequence Count
• Sequence count (SEQ_CNT) is a 16 bit field
• Identifies the order of transmission of frames within this sequence
• Used by the Sequence Recipient (SR) to account for all transmitted frames
• Used by the Sequence Initiator (SI) to account for all transmitted acknowledgements (ACKs) in Class 1 and 2

Sequence Count
• Within a Sequence_Initiative
    The SEQ_CNT of the first data frame will be zero
    The SEQ_CNT of each subsequent data frame in the sequence will be incremented by 1
    The SEQ_CNT of the first data frame of the next sequence may be either zero or one more than that of the last data frame; the latter is called "continuously increasing SEQ_CNT"
    If streamed sequences are used, continuously increasing SEQ_CNT is required

Sequence Count
• Sequence initiator
    Assigns SEQ_CNT to data frames
    Keeps a record of ACK frames received
• Sequence recipient
    Records the SEQ_CNT of data frames
    Transmits an ACK frame for each valid data frame when an Rx buffer is available
    Knows that the sequence was received without error if all frames are received without errors and are accounted for
• Sequence initiator
    Knows the sequence was received without error if it has received an ACK frame for all frames within the sequence
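The SEQ_CNT bookkeeping described above can be sketched in a few lines. The helper names are hypothetical, and the gap check assumes the counts of one sequence do not wrap past the 16-bit limit, which keeps the example short.

```python
SEQ_CNT_MODULUS = 1 << 16  # SEQ_CNT is a 16 bit field


def missing_frames(received_cnts, first_cnt: int, last_cnt: int) -> list:
    """SEQ_CNT values the recipient has not yet accounted for.

    Assumption: first_cnt <= last_cnt, i.e. no wrap within the sequence.
    """
    seen = set(received_cnts)
    return [c for c in range(first_cnt, last_cnt + 1) if c not in seen]


def next_first_cnt(last_cnt: int, continuously_increasing: bool) -> int:
    """SEQ_CNT of the first data frame of the following sequence."""
    if continuously_increasing:
        return (last_cnt + 1) % SEQ_CNT_MODULUS
    return 0
```

A recipient that finds `missing_frames` empty once the last frame has arrived knows every frame of the sequence is accounted for.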
OX_ID and RX_ID
• 2 byte fields each
• Contain the originator exchange identifier and the responder exchange identifier
• They point to state and context information regarding the exchange in the originator port and the responder port
    OX_IDs are reused after each exchange is over

Parameter Field
• The parameter is a 4 byte field
• The content of the parameter field depends on the specific frame type as identified in the routing field
    FC-4 data frames
    ACK link control
    Port reject and frame reject frames
    Port busy and fabric busy frames

LOGIN PARAMETERS

Login
Procedure to determine the operating environment for communications between two ports
• The exchange of "service parameters" is done with a login frame, PLOGI or FLOGI
• Required before communications can be established between the two ports
• Applies to all topologies
• Applies to all ports, node and fabric
• Bi-directional
    The ACCEPT frame contains the service parameters of the port addressed
Login
Service parameters contain the following "type" of information
• Version of Fibre Channel supported
• N_Port or F_Port functionality
• Service classes supported
• Size of receive buffers
• Number of sequences supported
• Support for Intermix
• ACK capability
• Error policy supported
• Others

ACKs
Inform the transmitter that:
• One or more valid data frames were received by the sequence recipient for the corresponding sequence qualifier
• An interface buffer is available for another data frame
    This only applies to class 1 and class 2; class 3 frames are not ACKed
• Flow control
    Re-instates end-to-end credit

ACKs
• Frame header
    Constructed from the data frame which is being acknowledged
    S_ID and D_ID are swapped
    F_CTL is copied with both the exchange and sequence context bits inverted
    SEQ_ID is unchanged
    SEQ_CNT is set to the sequence count of the highest data frame being replied to by the ACK
• Parameter field
    Bit 16 = history bit
    Bits 0-15 are ACK type specific

ACKs
• Again there are three types of ACKs
    ACK_1: default for class 1 and 2; one ACK sent for each SEQ_CNT
    ACK_N: class 1 or 2; N = ACK sent by the recipient for the support indicated during port login
    ACK_0: class 1 or 2; single ACK sent at the end of the sequence
• We could spend a lot more time discussing ACKs, but there is little or no class 1 or 2 used in networks today and we doubt we will see any soon
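The ACK header rules above (swap the addresses, invert the two context bits, keep SEQ_ID) translate directly into code. This is a sketch with loudly stated assumptions: headers are plain dictionaries, and the positions of the exchange context and sequence context bits (F_CTL bits 23 and 22) are assumed here, since the slides only say that bits 23-16 hold the sequence and exchange settings.

```python
# Assumed bit positions within the 24-bit F_CTL field.
EXCHANGE_CONTEXT = 1 << 23
SEQUENCE_CONTEXT = 1 << 22


def build_ack_header(data_hdr: dict, highest_seq_cnt=None) -> dict:
    """Derive an ACK frame header from the data frame being acknowledged."""
    ack = dict(data_hdr)  # SEQ_ID, OX_ID, RX_ID and the rest are copied
    ack["s_id"], ack["d_id"] = data_hdr["d_id"], data_hdr["s_id"]
    ack["f_ctl"] = data_hdr["f_ctl"] ^ (EXCHANGE_CONTEXT | SEQUENCE_CONTEXT)
    if highest_seq_cnt is not None:
        # When one ACK covers several frames, report the highest SEQ_CNT.
        ack["seq_cnt"] = highest_seq_cnt
    return ack
```

The input dictionary is left untouched, so the same data frame header can be acknowledged more than once in a test.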
Busy and Reject
Port Reject (P_RJT), Fabric Reject (F_RJT)
• Transmitted by the destination port or fabric in response to a specific data frame
• Applicable only to Class 1 and 2
• Sent in reply to valid frames
• Transmitted by the "receiver" of the data frame with a reason code
• Indicates that the corresponding data frame was NOT delivered to the ULP

Busy and Reject
• Busy is sent by the fabric if it is unable to deliver a frame due to a busy condition
• Busy is sent by a port if it is temporarily busy and unable to process a frame
• If F_BSY or P_BSY is sent, the fabric or port gives a reason code
  - Class 1: busy is only allowed on the connection request
  - Class 2: any frame may receive a busy
  - Class 3: busy is not sent; if a frame cannot be delivered it is discarded without notification

Flow Control and Credit
• Flow model
  - Frames are moved from one buffer to another buffer
  - Frame flow is from the source buffer to the destination buffer
  - Depending on the class of service, multiple intermediate buffers may be involved
  - Applies to all topologies

Flow Control and Credit
• Frame flow is controlled by the receiver
  - Back-pressure mechanism
  - ACKs for Class 1 and 2, R_RDYs for Class 3
• Flow control is based on frame flow
  - Which frames are flow controlled depends on the class of service
• The receiver defines parameters during the login procedure
  - Maximum frame size
  - Number of buffers
Flow Control and Credit
Receiver
• Establishes the operating environment through login
  - Size of buffers
  - Number of buffers (credits) allocated to this transmitting port
• Pumps up these credits
  - By ACKs when a buffer is available
• A receive buffer is available after
  - The frame was verified to be valid, with no errors
  - And the frame has been moved off the interface buffer

Flow Control and Credit
Transmitter
• Keeps the maximum credit value and a running count, Credit_CNT
• Consumes one credit for each "frame" it transmits
  - Credit_CNT = Credit_CNT - 1 for each data frame transmitted
• Regenerates credit for each ACK received
  - Credit_CNT = Credit_CNT + N
• Stops transmitting when Credit_CNT = 0

Flow Control and Credit
• FC-2 defines two types of credit
  - Buffer-to-Buffer (BB)
  - End-to-End (EE)
• BB credit controls the flow of connectionless traffic
  - Over a link from Tx to Rx
  - Class 2 and 3
  - Signal used = R_RDY
• EE credit controls the flow of connection traffic
  - Source to destination node
  - Class 1 and 2
  - Signal used = ACK
• Both are based on a credit count, Credit_CNT
• They differ in the frames controlled and the acknowledgement signal

Flow Control and Credit
[Figure: sequence initiator, fabric, and sequence recipient, each with Tx and Rx buffers. R_RDY returns buffer-to-buffer credit (BB_C) on each link; ACK returns end-to-end credit (EE_Credit) between initiator and recipient.]
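The transmitter-side bookkeeping on the credit slides can be sketched as a small counter. The class and method names below are hypothetical. The same counter shape serves for either credit type: the caller returns credit on R_RDY for buffer-to-buffer credit, or on ACK for end-to-end credit. The sketch follows the slides' convention of counting down from the maximum granted at login.

```python
class CreditCounter:
    """Hypothetical model of the transmitter's Credit_CNT from the slides."""

    def __init__(self, max_credit: int) -> None:
        if max_credit < 1:
            raise ValueError("login must grant at least one credit")
        self.max_credit = max_credit   # value learned during login
        self.credit_cnt = max_credit   # working count, starts full

    def can_send(self) -> bool:
        # The transmitter stops when Credit_CNT reaches 0.
        return self.credit_cnt > 0

    def on_frame_sent(self) -> None:
        # Credit_CNT = Credit_CNT - 1 for each data frame transmitted.
        if self.credit_cnt == 0:
            raise RuntimeError("no credit available; transmission must wait")
        self.credit_cnt -= 1

    def on_credit_returned(self, n: int = 1) -> None:
        # Credit_CNT = Credit_CNT + N for each ACK (or R_RDY) received,
        # never exceeding what the receiver granted at login.
        self.credit_cnt = min(self.max_credit, self.credit_cnt + n)
```

Clamping at the login value is a defensive choice in this sketch; the slides do not say what a port does if it receives more credit than it was granted.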
Class of Service
• Applicable to all fabric topologies
  - Switched
  - Point to point
  - Arbitrated loop
• The three classes of service are
  - Class 1: dedicated connection
  - Class 2: connectionless, multiplexed
  - Class 3: datagram
• Delimiters are used to set the required class for a sequence

Class of Service
• SOF delimiter
  - The required class of service, along with basic sequence management, is specified in the SOF delimiter of every frame
  - The SOF delimiter dictates basic link management functions within the fabric
  - The SOF delimiter identifies basic sequence management functions within the destination N_Port in the initial frame and the last frame of the sequence
• EOF delimiter
  - The last frame of a sequence is terminated by a special EOF
  - Dedicated connections are removed by a special EOF

Class of Service
Class 1
• Dedicated connection service
  - Connection-oriented service between two N_Ports
  - Frames are received in the order transmitted
  - Guaranteed delivery with notification of non-delivery
  - Guaranteed throughput
  - Optional Intermix: can mix in Class 2 and 3 frames if allowed

Class of Service
Class 1
• Requires explicit connection establishment
  - SOF(C1) delimiter
• Requires explicit removal of the connection
  - ACK with EOF(DT) delimiter
• Once the connection is established, BSY and RJT will not occur
• Flow control
  - Buffer-to-buffer on the SOF(C1) frame: R_RDY
  - End-to-end for all other data frames: ACK
Class of Service: Class 1 Flow
[Figure: the initiator sends an SOF(C1) frame through the fabric to request a connection; R_RDY is returned on each link and the recipient's ACK establishes the connection. Each following SOF(n1) frame is answered by an ACK. EOF(t) removes the connection.]

Class of Service
Class 2
• Multiplexed connectionless service
  - Connectionless service between two N_Ports
  - Order of frame reception not guaranteed
  - Guaranteed delivery
  - Notification of non-delivery
  - No throughput guarantees
  - Optional intermix

Class of Service
Class 2
• Multiplex on a frame-by-frame basis
  - Between different destination N_Ports
  - Among different sequences
• BSY and RJT may occur on any frame
• Flow control
  - Buffer-to-buffer for all frames: R_RDY
  - End-to-end for all data frames: ACK

Class of Service: Class 2 Flow
[Figure: the initiator sends SOF(C2) and SOF(n2) frames through the fabric to the recipient. Every frame, data or ACK, is answered by an R_RDY on each link it crosses; each data frame is also answered end to end by an ACK.]

Class of Service
Class 3
• Datagram multiplexed connectionless service
  - Connectionless service between two N_Ports
  - Order of frame reception not guaranteed
  - Unacknowledged
  - Delivery NOT guaranteed
  - No throughput guarantees
  - Optional intermix

Class of Service
Class 3
• Multiplex on a frame-by-frame basis
  - Between different destination N_Ports
  - Among different sequences
• BSY and RJT will not occur on any frame
• Flow control
  - Buffer-to-buffer for all data frames: R_RDY
Class of Service: Class 3 Flow
[Figure: the initiator sends data frames through the fabric to the recipient. Each data frame is answered only by an R_RDY on each link; there is no ACK.]

EE Credit
[Figure: NL_Node "A" and NL_Node "B" attached to a switch; EE_Credit runs end to end between the two nodes.]
Applies only to Class 1 and Class 2 frames, for all topologies

BB Credit
[Figure: NL_Node "A" and NL_Node "B" attached to a switch; BB_Credit runs separately on each link between a node and the switch.]
For all Class 2 and Class 3 frames, for all topologies

FRAME PROCESSING

Tables
• The N_Port will keep the following information
  - Available X_ID table
  - Exchange context table
  - Login table

Tables
Available X_ID Table
• This table contains a list of available X_IDs
  - Can be used for OX_IDs or RX_IDs
  - A given implementation may choose to keep two tables, one for OX_IDs and one for RX_IDs
• When a device driver sends a request to transmit a frame, a value will be taken for the OX_ID
• When a port receives a frame for a new exchange, a value will be taken for the RX_ID
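The available X_ID table described above behaves like a free list. A minimal sketch, assuming a single shared pool for OX_IDs and RX_IDs and a hypothetical `XidTable` class. The range bounds are arbitrary, and the value FFFF hex is kept out of the pool because the deck uses it later for an RX_ID that has not yet been assigned.

```python
from collections import deque

UNASSIGNED_X_ID = 0xFFFF  # marks "no RX_ID yet"; never handed out here


class XidTable:
    """Hypothetical free list of exchange identifiers for one N_Port."""

    def __init__(self, first: int, last: int) -> None:
        if not 0 <= first <= last < UNASSIGNED_X_ID:
            raise ValueError("X_ID range must lie within 0x0000-0xFFFE")
        self._free = deque(range(first, last + 1))

    def allocate(self) -> int:
        # Taken when a transmit request needs an OX_ID, or when a frame
        # for a new exchange arrives and needs an RX_ID.
        if not self._free:
            raise RuntimeError("no X_ID available")
        return self._free.popleft()

    def release(self, x_id: int) -> None:
        # X_IDs are reused after the exchange is over.
        self._free.append(x_id)
```

Appending released values to the tail delays reuse of a just-finished identifier, which is one possible policy; an implementation keeping two tables would simply create two instances.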
Tables
Exchange Context Table
• Each exchange ID points to a unique entry in the exchange context table
• Each entry contains the context and state information for the particular exchange
  - Port_ID involved in the exchange
  - X_ID it assigned to the exchange
  - ULP and phase within the operation
  - Data source or destination address
  - Data frames transmitted or received (SEQ_CNT)
  - ACK frames transmitted or received (SEQ_CNT)

Tables
Login Table
• This table contains one entry for each port that this port is logged in with
• Each entry contains the service parameters and the working EE_Credit count value

Data Frames: Putting It All Together
Data Frame Transmission
• Request from a ULP
  - Initiate some operation with a specific destination port
• Login process
  - If you are not logged in, initiate the login process
  - Build the login table entry for the destination port
• Assign an OX_ID if needed
  - Get a value from the available X_ID table
  - Build the exchange context table entry

Data Frames
Data Frame Transmission (Cont.)
• Gather information
  - Exchange context table
  - Receive buffer size and destination port
  - Login table
  - Working credit count of the destination port
  - Set up the frame header
• Data frame transmission
  - Segmentation process
  - Credit management
Data Frames
Transmit Request
• The ULP passes a request to transmit a chunk of data to the N_Port
  - The destination port (D_ID) is specified in the request
• The N_Port must access the "login table" to determine the service parameters of the destination port
  - Number of Rx buffers
  - Value of the working credit count
  - And the rest

Data Frames
The Data Transmission
• The ULP data chunk is moved in frames with the use of the sequence
  - All within the context of the exchange
• A number of processes are involved
  - Initialization of the frame header fields
  - Segmentation and reassembly

First and Last Data Frames
• The first data frame of a sequence is identified by the SOF(Ix) delimiter, where 'x' is the class of service
• The last data frame of a sequence is identified by F_CTL bit 19, End_SEQ = '1'
• A sequence consists of all data frames
  - Starting at the SEQ_CNT of the first frame through the SEQ_CNT of the last frame

Sequence Processing
Sequence Count
• The ULP chunk of data is transmitted IN ORDER
  - All frames are sent in order
• Sequence count (SEQ_CNT)
  - Frames are assigned sequentially increasing numbers as they are sent
  - The receiving N_Port uses the SEQ_CNT to ensure that frames are reassembled in order, back into the original chunk
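Segmentation and reassembly by SEQ_CNT, as described above, can be sketched in a few lines. The function names are hypothetical, frames are reduced to (SEQ_CNT, payload) pairs, and the sketch assumes a sequence short enough that the 16-bit SEQ_CNT does not wrap.

```python
from typing import Iterable, List, Tuple

Frame = Tuple[int, bytes]  # (SEQ_CNT, payload); header fields omitted


def segment(chunk: bytes, max_payload: int, first_seq_cnt: int = 0) -> List[Frame]:
    """Split a ULP chunk into frames with sequentially increasing SEQ_CNT."""
    if max_payload < 1:
        raise ValueError("max_payload must be positive")
    frames = []
    for i, offset in enumerate(range(0, len(chunk), max_payload)):
        frames.append((first_seq_cnt + i, chunk[offset:offset + max_payload]))
    return frames


def reassemble(frames: Iterable[Frame]) -> bytes:
    """Rebuild the original chunk, ordering frames by SEQ_CNT."""
    return b"".join(payload for _, payload in sorted(frames, key=lambda f: f[0]))
```

In the deck's model the maximum payload would come from the receive buffer size the destination advertised at login, and the protocol chip performs both steps without the upper layers being aware of them.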
Sequence Initiator (SI)
• Sets F_CTL bit 23
  - "0" if it is the exchange originator
  - "1" if it is the exchange responder
• OX_ID and RX_ID are set to their assigned values
  - RX_ID = "FFFF" hex if this is the first sequence of the exchange
• Routing field (R_CTL) is set to "0000" to indicate an FC-4 data frame
• Information category field of R_CTL is set according to the payload

Sequence Initiator: Frame Header
• Sequence ID (SEQ_ID)
  - Any value that is not currently in use
• Sequence count (SEQ_CNT)
  - Assigned sequentially as frames are sent
  - Starts with "0" on the first frame of the sequence
  - Increments by '1' while sequence initiative is held
• Parameter
  - Set to the 'offset' of the first byte of the payload with respect to the entire chunk
  - Offset = '0' on the first frame and '1' or more for the second and subsequent frames

Sequence Initiator: Frame Header
• Other important F_CTL bits
  - Bit 23, exchange context
  - Bit 21, first sequence
  - Bit 20, last sequence
  - Bit 19, end sequence
  - Bit 16, sequence initiative: used to pass the initiative to the other device

Automatic Processes
• These processes are automatic and are performed by the protocol chip
  - Segmentation and reassembly
  - SEQ_CNT assignment
• Higher layers are unaware of these processes

ULP Processing
• The Upper Level Protocol (ULP) uses these fields
  - Routing: '0000' = FC-4 data frame
  - Type: '08' = SCSI/FCP
  - Info category: identifies the specific function of the payload
    '01' = Solicited Data
    '06' = Unsolicited Command
    '05' = Data Descriptor
    '07' = Command Status
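The F_CTL bit assignments listed on the sequence initiator slides can be turned into masks. The sketch below composes an F_CTL value from those five bits only; the function and parameter names are hypothetical, and every other F_CTL bit is left at zero, which a real sequence initiator would not necessarily do.

```python
EXCHANGE_CONTEXT = 1 << 23     # 0 = exchange originator, 1 = exchange responder
FIRST_SEQUENCE = 1 << 21       # first sequence of the exchange
LAST_SEQUENCE = 1 << 20        # last sequence of the exchange
END_SEQUENCE = 1 << 19         # last data frame of this sequence
SEQUENCE_INITIATIVE = 1 << 16  # pass the initiative to the other port


def f_ctl_for(is_responder: bool, first_sequence: bool, last_sequence: bool,
              last_frame: bool, pass_initiative: bool) -> int:
    """Compose the F_CTL bits named in the slides; all other bits stay 0."""
    f_ctl = 0
    if is_responder:
        f_ctl |= EXCHANGE_CONTEXT
    if first_sequence:
        f_ctl |= FIRST_SEQUENCE
    if last_sequence:
        f_ctl |= LAST_SEQUENCE
    if last_frame:
        f_ctl |= END_SEQUENCE
    if pass_initiative:
        f_ctl |= SEQUENCE_INITIATIVE
    return f_ctl
```

For instance, an originator sending the final frame of the first sequence and handing over the initiative would set bits 21, 19, and 16.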
ARBITRATED LOOP

Fibre Channel Arbitrated Loop (FC-AL)
[Figure: L_Ports attached in a loop through a Fibre Channel hub.]
• Maximum bandwidth: 100 MB/sec (shared among all nodes on the loop)
• 126 nodes maximum on a loop
• Can be combined with switches
• Attaches "NL_Ports"
• The number of nodes on the loop directly affects performance
• Defined in its own standard, FC-AL

Loop Advantages
• Low-cost solution with copper transceivers
• Eliminates the need for a discrete "fabric"
  - Fabric routing decision is distributed around the loop
• Compatible with all FC-0 variants
  - Copper within a box
  - Optical between boxes
• Self-discovery procedure
• Simple additions to FC-PH

Loop Advantages
• Port bypass network
• High-availability configurations possible
• Supports both public and private loops
• Provides access fairness

NL_Port
• N_Port
  - Attaches to the physical transport media
  - Provides the Fibre Channel control and protocol
  - Provides the termination point for Fibre Channel
  - Resides within the node
• NL_Port
  - Provides all the functionality of an N_Port plus the additional functions of the loop
  - An NL_Port can function as an N_Port

FL_Port
• F_Port
  - Attaches to the physical transport media at the edge of the switched fabric
• FL_Port
  - The switched fabric port that attaches to a loop
  - F_Port functionality plus the additional functions of the loop
G and GL Ports Will Do Both N and F
Private and Public
• Private loop
  - Contains no FL_Port
  - Communication outside the loop via Fibre Channel is not possible
• Public loop
  - Contains an FL_Port
  - Communication outside the loop via Fibre Channel is possible
• Private devices
  - Devices on a public loop may be private, i.e. they do not log in

Addressing
• Arbitrated Loop Physical Address (AL_PA or PA)
  - Assigned during loop initialization (soft addressing)
  - A unique 8-bit value
  - 127 valid values
• Arbitrated Loop Destination Address (AL_PD or PD)
  - The AL_PA used to identify the destination L_Port
  - Target of a primitive signal or D_ID of a frame
• Arbitrated Loop Source Address (AL_PS or PS)
  - The AL_PA used to identify the source L_Port
  - Source of a primitive signal or S_ID of a frame

The Fabric
Definition
• The entity that interconnects attached N_Ports
• Provides 'routing' based on destination address
• The fabric may be:
  - Point to point: no routing required
  - Switched: routing provided by the switch
  - Arbitrated loop: routing is distributed throughout the attached NL_Ports

Switched Fabric
[Figure: six N_Ports attached to a switch fabric.]

Loop = Arbitrated Loop
Additional Function
[Figure: six nodes, each with an NL_Port, attached to a loop that acts as the fabric.]
Routing Process: Loop
• The routing function is distributed
  - Each L_Port performs a portion of the routing
• Routing is performed through out-of-band signaling using primitive signals
• Connection oriented, independent of class of service
  - Obtain ownership of the loop (Arbitration)
  - Establish a connection (Open)
  - Transfer frames (Data)
  - Remove the connection (Close)
  - Relinquish the loop

Processes and Procedures
• Initialization
  - The process by which addresses are assigned and recovery is performed
• Arbitration
  - The process by which an L_Port acquires ownership of the loop
• Open
  - The process by which the L_Port that owns the loop selects the L_Port it wants to communicate with
• Close
  - The process by which the L_Port that owns the loop releases control

Fill Words
• FC-PH defines two signals that may be transmitted between frames (when no other information is being transmitted)
  - Idle
  - R_RDY
• FC-AL defines several additional signals that may be transmitted between frames
• FC-AL defines the "fill word" to be
  - ARB(F0)
  - ARB(x)
  - Idle

Primitive Signals and Sequences
FC-AL Defines the Following Unique Signals and Sequences
• Primitive signals
  - Arbitrate
  - Open
  - Close
  - Mark
• Primitive sequences
  - Port bypass enable
  - Port bypass disable
  - Loop initialization
Credits and Buffers
Loop Uses the Same Credit Method as Previously Discussed but Also Has an Alternate Credit Model
• Alternate BB_Credit management is requested during login
• When activated, the service parameter BB_Credit = number of buffers available when the circuit is established
• The receiving L_Port shall transmit R_RDYs for the additional buffers at any time while "opened"
  - Used to pump up BB_Credit_CNT
• Transmitting L_Port
  - Decrements BB_Credit by '1' for each data frame transmitted
  - Increments BB_Credit by '1' for each R_RDY received
  - Stops transmitting when BB_Credit = '0'

Arbitrated Loop Initialization Procedure
Purpose
• An L_Port will perform the loop initialization procedure to:
  - Determine the operating environment for the L_Port: is this a loop?
  - Acquire an address, the AL_PA (physical address)
  - Report that an error has been detected

Loop Commands
Loop Initialization Procedure: LIP Is an Ordered Set

Command  Meaning                                   Bytes  Payload contents
LISM     Loop Initialization, Select Master        12     Command & WWN
LIFA     Loop Initialization, Fabric Assigned      20     Command & AL_PA bit map
LIPA     Loop Initialization, Previously Assigned  20     Command & AL_PA bit map
LIHA     Loop Initialization, Hard Assigned        20     Command & AL_PA bit map
LISA     Loop Initialization, Soft Assigned        20     Command & AL_PA bit map
LIRP     Loop Initialization, Report Position      132    Command & AL_PA position map (collect)
LILP     Loop Initialization, Loop Position        132    Command & AL_PA position map (distribute)
LIP: Initialization Procedure
Phase A: LIP. Start the initialization procedure.
Phase B: LISM. Select a temporary loop master; an FL_Port wins if present, otherwise the lowest WWN wins.
Phase C: LIFA, LIPA, LIHA, LISA. AL_PA mapping phase; build the AL_PA bit map in 4 steps.
Phase D: LIRP. Reporting phase; collect the AL_PA position map.
Phase E: LILP. Distribute the AL_PA position map.
Close

LIP: Phase A
Loop Initialization Primitive Sequence
• Transmitted continuously by the L_Port until it receives the same LIP configuration
  - LIP (F7F7): the L_Port is attempting to determine if this is a loop and to acquire an AL_PA
  - LIP (F8F7): the L_Port has detected a loop failure at its receiver prior to acquiring an AL_PA
  - LIP (F8): the L_Port (AL_PS) has detected a loop failure at its receiver
  - LIP (F7): the L_Port (AL_PS) has detected a performance degradation

LIP: Phase B
• Each L_Port will build the LISM with:
  - AL_PA = '00' hex if FL_Port, 'EF' hex if NL_Port
  - D_ID = '0000' hex + AL_PA, for example 0000EF
  - S_ID = '0000' hex + AL_PA
  - Payload = Command + WWN
  - Current fill word = Idle
• Each L_Port will continuously transmit a LISM
• Normal flow control rules are not in effect during initialization

LIP: Phase B (Cont.)
• Each L_Port monitors its receiver
  - It continues to transmit its own LISM if its AL_PA + WWN is less than the received AL_PA + WWN
  - Otherwise it passes on the received LISM
• You are the temporary loop master
  - If the device receives a LISM identical to the one it transmitted
  - FL_Ports always win; if there are two or more FL_Ports, the lowest WWN wins and the others go non-participating
  - If there is no FL_Port, the NL_Port with the lowest WWN wins
• Loop master
  - Current fill word becomes ARB(F0)
  - When ARB(F0)s are received, go to Phase C

LIP: Phase C
Loop Master Will Form the Initial "Bit" Map as Shown:

Word  Bits 31-24  23-16      15-8       7-0
0     L000 0000   0000 0000  0000 0000  0000 0000
1     0000 0000   0000 0000  0000 0000  0000 0000
2     0000 0000   0000 0000  0000 0000  0000 0000
3     0000 0000   0000 0000  0000 0000  0000 0000

Where
  - L = 1: requesting F_Login of all NL_Ports
  - The remaining bit positions form a 127-bit vector corresponding to the valid AL_PAs
  - Word 0 bit 30 = lowest AL_PA value, '00' hex
  - Word 3 bit 0 = highest AL_PA value, 'EF' hex
  - The port sets the bit = 1 that corresponds to its fabric-assigned AL_PA
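Two rules from Phases B and C lend themselves to a short sketch: the loop master is the port with the lowest AL_PA + WWN, where an FL_Port uses AL_PA 00 hex and an NL_Port uses EF hex, and the bit map holds the L flag in word 0 bit 31 followed by 127 AL_PA bits in priority order. The names `LoopPort`, `select_loop_master`, and `bitmap_position` are hypothetical. The index passed to `bitmap_position` is a position in the ordered list of valid AL_PAs (0 for '00' hex through 126 for 'EF' hex); the AL_PA value table itself is not reproduced here.

```python
from typing import List, NamedTuple, Tuple


class LoopPort(NamedTuple):
    """Hypothetical description of one L_Port taking part in LISM."""
    name: str
    is_fl_port: bool
    wwn: int


def lism_key(port: LoopPort) -> Tuple[int, int]:
    # AL_PA used in the LISM: '00' hex for an FL_Port, 'EF' hex for an NL_Port.
    return (0x00 if port.is_fl_port else 0xEF, port.wwn)


def select_loop_master(ports: List[LoopPort]) -> LoopPort:
    """Lowest AL_PA + WWN wins, so any FL_Port beats every NL_Port."""
    return min(ports, key=lism_key)


def bitmap_position(index: int) -> Tuple[int, int]:
    """Map an AL_PA priority index (0..126) to (word, bit) in the bit map."""
    if not 0 <= index <= 126:
        raise ValueError("there are 127 valid AL_PA positions")
    position = index + 1  # word 0 bit 31 is taken by the L flag
    return position // 32, 31 - (position % 32)
```

On a real loop the comparison happens hop by hop as each port either forwards or replaces the LISM it receives; taking the minimum over a list gives the same winner without modelling the ring.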
LIP: Phase C
• The loop master will transmit the following three commands, allowing an L_Port to choose a desired AL_PA
  - LIFA: bit map primed with the initial value
  - LIPA: bit map primed with the results of LIFA
  - LIHA: bit map primed with the results of LIPA
• The loop master will then transmit the LISA command
  - LISA: bit map primed with the results of LIHA, allowing L_Ports that were unable to obtain their desired AL_PA to get a "soft assigned" AL_PA

LIP: Phase C
• Each NL_Port will
  - Receive, possibly modify, and retransmit the four initialization command frames
  - Set the current fill word (CFW) = ARB(F0)
• Modify the AL_PA bit map as follows
  - Set one bit of the initialization command AL_PA bit maps based on the history of AL_PA assignment
  - If the bit corresponding to the "desired" AL_PA has already been set by an upstream L_Port, this L_Port assumes a soft AL_PA by setting the first "0" bit = 1 in the bit map of the LISA frame
  - If no bit positions were available in the LISA bit map, the L_Port will remain in non-participating mode
  - At most the bit map of one command will be modified by each L_Port

LIP: Phase D
• The loop master will prime the AL_PA position map to:
  - Byte 0 = '01' hex
  - Byte 1 = its AL_PA
  - Bytes 2-127 = 'FF' hex
  - Then transmit the LIRP with this position map
• Each NL_Port will:
  - Increment the offset by one and store the offset
  - Store its AL_PA at the offset
  - Retransmit the updated LIRP frame
• The loop master will save the resulting loop position map

LIP: Phase E
• The loop master will transmit the LILP command with
  - Payload = AL_PA position map
• Each NL_Port will
  - Save the loop position map
  - Retransmit the LILP command
• When the loop master receives the LILP command it will
  - Transmit a CLS and go to the monitoring state
• When each NL_Port receives a CLS it will
  - Retransmit the CLS and go to the monitoring state
Initialization Complete

LIP: Summary
A. LIP starts the initialization procedure
B. Select a temporary loop master
   - Lowest AL_PA | WWN wins
C. Build an AL_PA bit map
   - Each L_Port indicates the AL_PA it selected in one of 4 requests by the loop master
D.
Collect an AL_PA position map
   - Each L_Port reports its relative position from the master and its AL_PA
E. Distribute the resulting AL_PA position map to each L_Port

Arbitration
• The process by which an L_Port requests ownership of the loop, based on primitive signals
• ARB(x) ordered set, MSB to LSB: K28.5, D20.4, AL_PA, AL_PA

Arbitration
Loop Owner
• The current loop owner
  - Seeds the arbitration process with ARB(F0)
  - Blocks propagation of the received ARB(x) until it relinquishes the loop
• Initiates a new arbitration "window"
  - If ARB(F0) is received, by setting the current fill word = IDLE
• Fairness variables
  - Access
  - ARB_WON

Arbitration Process
• When a port is arbitrating it enters the arbitrating state
• The CFW is updated to the port's ARB(AL_PA) if the CFW is:
  1. IDLE
  2. ARB(F0)
  3. ARB(FF)
  4. A lower-priority ARB (higher value AL_PA)
• Arbitration occurs even if a loop circuit exists between another pair of ports
• Once a port starts arbitrating it
  - Must continue to arbitrate until it wins
  - May withdraw if it knows that another port is arbitrating
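The four-case rule for updating the current fill word can be sketched as a single function. The representation is an assumption made for illustration: `None` stands for IDLE and an integer x stands for ARB(x), so ARB(F0) is 0xF0 and ARB(FF) is 0xFF. The function name is hypothetical.

```python
from typing import Optional

IDLE = None    # the IDLE fill word
ARB_F0 = 0xF0  # ARB(F0)
ARB_FF = 0xFF  # ARB(FF)


def next_fill_word(cfw: Optional[int], my_al_pa: int, arbitrating: bool) -> Optional[int]:
    """Return the fill word this port transmits for a received fill word.

    An arbitrating port substitutes its own ARB(AL_PA) when the received
    CFW is IDLE, ARB(F0), ARB(FF), or a lower-priority ARB, meaning one
    with a higher AL_PA value. Anything else is passed on unchanged.
    """
    if not arbitrating:
        return cfw
    if cfw is IDLE or cfw in (ARB_F0, ARB_FF) or cfw > my_al_pa:
        return my_al_pa
    return cfw
```

Since F0 and FF are both numerically above the highest valid AL_PA mentioned in the deck ('EF' hex), the last comparison would already cover them; they are listed separately only to mirror the slide.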
Fairness
Access Fairness
• Ports with higher-priority AL_PA values could lock out lower-priority ports
  - When they ARB they will always win
  - Lower-priority ports might never win arbitration
• Access fairness limits how often a port can arbitrate
  - A port does not arbitrate for the loop again until all other arbitrating ports on the loop have won; such a port is called a fair port
• Access fairness is based on "access", not "duration of usage"
  - It does not limit how long a port uses the loop
• Fairness is recommended by the standard but not mandatory
  - FL_Ports may be unfair but NL_Ports should be fair

Fairness
• Fairness is controlled by the FC-AL fairness algorithm, called a fairness window
  - The window begins when the first port wins arbitration
  - It ends when a port discovers that it was the last arbitrating port
  - IDLE resets the fairness window
  - The variables used are: Access = 0 for fairness window open; Access = 1 when the NL_Port has won arbitration
• Fair ports can only arbitrate once per window
  - After winning arbitration they wait for the end of the window before arbitrating again
• Unfair ports can arbitrate at any time

Open
If the Port Requires the Loop When It Wins ARB
• It sends an OPN(yx) or OPN(yy), where y = destination port and x = source port
  - Full duplex, OPN(yx): establishes a point-to-point-like circuit between the loop ports
  - Half duplex, OPN(yy): restricts the open recipient to transmitting link control frames only; it cannot transmit device data frames
  - Half duplex is used by designs that cannot support simultaneous transmission and reception of data frames
Open
Selecting the Destination Port
• Is the intended destination port on the same loop or connected via a fabric switch?
  - If the upper 16 bits of the destination field (D_ID) are all zeros, the port is on this private loop
  - If the upper 16 bits of the source (S_ID) are all zeros, then the source port is a private port and can only talk to ports on the same loop
  - If the upper 16 bits of the D_ID are the same as the upper 16 bits of the S_ID, then both ports are on the same loop, or both are public and attached to the same FL_Port
• If none of these are true, the destination port is not on the same loop and must be accessed via the FL_Port

Opening a Port on Same Loop
• The open originator inserts the destination AL_PD in the OPN
• The AL_PD is obtained from the low-order 8 bits of the destination address in the frame header
• This process can be performed entirely by hardware

Opening a Port Off the Loop
• The originator inserts the AL_PA of the FL_Port, '00', in the AL_PD field of the OPN
• The FL_Port is opened and frames are sent to the FL_Port
• The FL_Port and fabric forward the frames using the destination address field
• The FL_Port can send to multiple destination ports on the loop during this OPN

SWITCH FABRIC OPERATION

Switch Model
[Figure: switch ports attached to a connection matrix and a connectionless switch matrix, both managed by the fabric controller.]
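The destination-selection rules from the Open slides reduce to a comparison of the upper 16 bits of the two 24-bit addresses. A minimal sketch with a hypothetical function name; it returns the AL_PD to place in the OPN and treats the private-source case the way the slides state it, as able to reach only ports on the same loop.

```python
FL_PORT_AL_PA = 0x00  # AL_PA of the FL_Port, per the slides


def al_pd_for(s_id: int, d_id: int) -> int:
    """Choose the AL_PD for an OPN from the frame's S_ID and D_ID."""
    s_upper = (s_id >> 8) & 0xFFFF
    d_upper = (d_id >> 8) & 0xFFFF
    same_loop = (
        d_upper == 0            # destination is on this private loop
        or s_upper == 0         # private source: same-loop traffic only
        or d_upper == s_upper   # both behind the same FL_Port
    )
    if same_loop:
        return d_id & 0xFF      # low-order 8 bits of the destination address
    return FL_PORT_AL_PA        # off-loop: open the FL_Port instead
```

The check uses only shifts and comparisons, which fits the slides' remark that the on-loop case can be handled entirely in hardware.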
Worldwide Names

• Each switch element is assigned a WWN at the time of manufacture
• Each switch port is assigned a WWN at the time of manufacture
• During FLOGI the switch reports the WWNs of the fabric port and the switch element in the service parameters of the accept frame
• These names can then be used to correlate each fabric port with its switch element

Switch Ports

• Four basic types of switch ports
  - F_Port: uses NOS/LOS to attach to a single N_Port
  - FL_Port: uses LIP to attach 1 to 126 NL_Ports
  - E_Port: uses NOS/LOS to interconnect switches (inter-switch link, ISL)
  - G_Port: uses NOS/LOS; can be an F_Port or an E_Port

Fabric Addressing

• The 24-bit address is partitioned into 3 fields of 8 bits each: Domain, Area, Device
• This partitioning helps speed up routing
• The switch element assigns the address to N_Ports
• Address partitioning is transparent to N_Ports

[Figure: switch topology model showing the 24-bit address as Switch Domain (8 bits), Area (8 bits), Device (8 bits).]

Directory Server

• Repository of information regarding the components that make up the Fibre Channel network
• Located at address ‘FF FF FC’ (some readings call this the name server)
• Components can register their characteristics with the directory server
• An N_Port can query the directory server for specific information
  - A query can return the address identifier, WWN and volume names for all SCSI targets
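The 8/8/8 partitioning of the 24-bit fabric address described above can be made concrete with a few lines of bit arithmetic. The type and function names are hypothetical, chosen only for this sketch.

```python
from typing import NamedTuple


class FcId(NamedTuple):
    domain: int  # identifies the switch
    area: int
    device: int


def split_fc_id(fc_id: int) -> FcId:
    """Split a 24-bit fabric address into its three 8-bit fields."""
    if not 0 <= fc_id <= 0xFFFFFF:
        raise ValueError("fabric addresses are 24 bits wide")
    return FcId((fc_id >> 16) & 0xFF, (fc_id >> 8) & 0xFF, fc_id & 0xFF)


def join_fc_id(domain: int, area: int, device: int) -> int:
    """Rebuild the 24-bit address from its fields."""
    return (domain << 16) | (area << 8) | device


print(split_fc_id(0x0A1B2C))
```

Because the domain occupies the top byte, a switch can decide whether a frame is local or must be forwarded by looking at that byte alone.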
Directory Server Command Requests

Some of the more commonly used commands for querying the directory server:

• Get objects
  - GA_NXT: Get all next
  - GFT_ID: Get FC-4 types
• Register objects
  - RFT_ID: Register FC-4 types
• Deregister objects
  - DA_ID: Deregister all

Fabric Controller

• Each switch has a fabric controller
• Assigned address ‘FF FF FD’
  - Every fabric controller in the fabric has the same address
  - It is the N_Port within the switch
  - Responsible for managing the fabric: initialization, routing, and setup and teardown of Class 1 connections
• Responsible for receiving requests and generating responses for the switch fabric
  - Information must be consistent regardless of which fabric controller responds to a request

Extended Link Services

• Extended link services (ELS) provide a set of protocol functions used by a port to request a function or service at another port
  - Usually sent from an N_Port to an F_Port to perform a needed request
  - The R_CTL field in the first word of the frame header is set to 0x22 to indicate an extended link service request
  - Many ELS requests return a payload in the response; some have no reply

Extended Link Services

• Some of the more important and most used ELS commands are:
  - FLOGI: F_Port Login
  - PLOGI: N_Port Login
  - FAN: Fabric Address Notification
  - PRLI: Process Login
  - PRLO: Process Logout
  - SCN: State Change Notification
  - SCR: State Change Registration
  - RSCN: Registered State Change Notification
ELS: FLOGI

• FLOGI: Fabric login
  - Issued by an N_Port to destination ‘FF FF FE’ to:
    determine whether a fabric is present,
    establish a session with the fabric,
    exchange service parameters with the fabric
  - FLOGI assigns the 24-bit address to an N_Port, or the AL_PA to loop ports

ELS: PLOGI

• PLOGI: N_Port login
  - Establishes a session between two N_Ports
  - Required before upper-level protocol operations can begin
  - The N_Port registers with the name server (‘FF FF FC’) in the fabric with all required login parameters
  - The N_Port then queries the name server for other N_Ports on the fabric

ELS: PRLI

• PRLI: Process Login
  - Allows the FC-4 levels to exchange service parameters for communication with each other
  - The process is protocol specific (type field)
  - The SCSI-3 FCP mapping requires PRLI

ELS: FAN

FAN: Fabric Address Notification

• Used in fabric-attached loop topologies
• Provides a mechanism for the FL_Port to notify NL_Ports of the addresses and names of FL_Ports, along with the fabric name
• Allows NL_Ports to verify the configuration following a loop initialization

ELS: SCN

SCN: State Change Notification

• Notifies ports of events that may affect logins or process logins to ports on the fabric
• SCN can be sent from
  - N_Port to N_Port
  - N_Port to fabric controller
  - Fabric controller to N_Ports
• Notification may indicate that a login session is no longer valid
  - Loss of signal (NOS, LOS, FLOGI)
  - LIP has occurred
  - SCN sent to fabric controller
ELS: RSCN

RSCN: Registered State Change Notification

• Similar to SCN, but sends the change notice only to ports that have registered
• SCN did not define a registration method

Class_F Service

• Communication between switch elements uses Class_F service
  - Unique SOF delimiter and normal EOF delimiter
• Used to pass control information within the switch
• Highest priority within the switch
• Connectionless service
• Has no meaning outside the switch; an N_Port will discard it if received

Inter-Switch Link

• The interconnection between switches is called the inter-switch link (ISL)
  - E_Port to E_Port
• Supports all classes of service
  - Class 1, 2, 3, and switch-to-switch control traffic (Class F)
• FC-PH permits consecutive frames of a sequence to be routed over different ISLs for maximum throughput

Inter-Switch Links (ISLs)

• An inter-switch link (ISL) connects switches
• Fabric parameters must match on both switches; otherwise the link will not come up and the fabric will be segmented

Principal Switch Selection

• Only one switch is designated principal switch in a fabric
• The switch with the lowest WWN initially becomes the principal switch
• The principal switch makes sure that no new switch is added to the fabric if it has a domain ID conflict with an existing switch in the fabric

[Figure: a fabric of six interconnected switches, Switch 1 to Switch 6.]
Fabric Configuration Process

• The fabric configuration process enables a switch port to determine its operating mode and exchange operating parameters, and provides for the distribution of addresses
• The process is summarized in the following steps
  - Establish link parameters and switch port operating mode
  - Principal switch selection
  - Domain ID distribution
  - Path selection

Fabric Configuration Stages

Operation: Establish Link Parameters and Switch Port Operating Mode
  Starting condition: Switch port has achieved word synchronization
  Process: The switch port attempts to discover whether it is an FL, F, or E port
  Ending condition: Switch port mode is known. If the port is an E port, link parameters have been exchanged and credit has been initialized

Operation: Select Principal Switch
  Starting condition: BF or RCF SW_ILS transmitted or received
  Process: Switch_Names are exchanged over all ISLs to select a principal switch, which becomes the Domain Address Manager
  Ending condition: The principal switch is selected

Operation: Domain ID Acquisition
  Starting condition: Domain Address Manager has been selected
  Process: Switch requests a Domain_ID from the Domain Address Manager
  Ending condition: Switch has a Domain_ID

Operation: Path Selection
  Starting condition: Switch has a Domain_ID
  Process: Path selection (FSPF) is defined in the next section
  Ending condition: Switch is operational with routes established

Fabric Configuration: PS Selection

• A principal switch shall be selected whenever at least one inter-switch link (a link between two E_Ports) is established
• The selection process chooses a principal switch, which is then designated to assign domain identifiers to all the switches in the fabric and to any that join the fabric later
Fabric Configuration: PS Selection

• Principal switch selection can be triggered by any one of the following events
  - Switch boot and EFP
  - Build Fabric (BF)
  - Reconfigure Fabric (RCF)

Fabric Build Process

• When the switch boots and its first E_Port becomes operational, the switch starts a 2 x F_S_TOV timer and then sends an Exchange Fabric Parameters (EFP) request from that port, containing its own Domain ID (DoID) in the list, in an attempt to become principal switch (PS)
• The switch receiving the EFP replies with either accept (SW_ACC) or reject (SW_RJT) after comparing the priority and WWN

[Figure: EFP payload layout (0x11, Record Len 0x10, Payload Len, Reserved, Priority, Principal Switch WWN words 0 and 1, Domain_ID records 0 to M) and four switches exchanging EFPs over their E_Ports. Each switch is labelled with its Domain_ID, priority and WWN: A (0) (FF, Aa), B (0) (FF, Bb), C (0) (FF, Cc), D (0) (FF, Dd).]
Fabric Build Process

• If the received information has a lower value, the switch keeps the received information, treats the sending switch as the potential principal switch, and treats that link as the potential upstream link
• At that point the switch generates another EFP on all other links with the updated potential principal switch
• When 2 x F_S_TOV expires, all switches in the fabric consider the information collected about the principal switch to be definitive; at that point the principal switch is responsible for assigning the Domain_IDs

[Figure: the same four switches after the EFP exchange, with SW_ACC and SW_RJT replies and the potential upstream ports marked. Labels (Domain_ID, priority, WWN): A (0) (FF, Aa), D (0) (FF, Bb), B (0) (128, Aa), C (0) (FF, Aa).]

Fabric Configuration Details

• After principal switch selection, the PS changes its priority to 0x02 (PS priority), assigns itself a domain ID, and then starts the domain ID distribution
• The principal switch starts the process by sending a Domain ID Assigned (DIA) SW_REQ out of all its E_Ports
• Intermediate switches are actively involved in this process
• Each switch replies with a Request Domain ID (RDI)
  - This allows each switch to request one or more domain IDs
  - The neighboring switch receiving the RDI can identify its downstream principal ISL
• Each switch can send many RDIs, but once the principal switch has granted domain IDs to the switch, any following RDI from that switch must request the same set of domain IDs
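The comparison at the heart of principal switch selection can be sketched as follows. This is a simplified model, not the SW_ILS exchange itself: candidates are represented as hypothetical `(priority, wwn)` tuples, and the lower value wins, as the build process above describes.

```python
def process_efp(current_best, received):
    """Compare a received (priority, wwn) pair with the best one known so far.

    Returns (new_best, adopted). adopted is True when the received value is
    lower, meaning the sender becomes the potential principal switch and the
    receiving link becomes the potential upstream link.
    """
    if received < current_best:   # tuples compare priority first, then WWN
        return received, True
    return current_best, False


def elect_principal(candidates):
    """Return the name of the switch with the lowest (priority, wwn)."""
    return min(candidates, key=candidates.get)


# Hypothetical values loosely modelled on the labels in the figure.
fabric = {"A": (0xFF, 0xAA), "B": (0xFF, 0xBB), "C": (0xFF, 0xCC), "D": (0xFF, 0xDD)}
print(elect_principal(fabric))
print(process_efp(fabric["B"], fabric["A"]))
```

In a real fabric no switch sees the whole candidate list at once; each switch only runs the pairwise comparison on the EFPs it receives until 2 x F_S_TOV expires.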
Fabric Configuration Flows: ID Assignment

[Figure: domain ID assignment flows among switches A, B, C and D. DIA (SW_REQ) is answered with SW_ACC or SW_RJT; RDI (SW_REQ) is sent on the upstream port and answered with SW_ACC or SW_RJT; EFP (SW_REQ) containing the DoID list is answered with SW_ACC. After assignment the switches are labelled A (1) (XX, Aa), B (2) (FF, Aa), D (3) (FF, Aa), C (4) (FF, Aa).]

Fabric Configuration: The PS Battle

• After principal switch selection and domain ID assignment, all switches in the fabric start two processes
  - FC_ID assignment
  - FSPF path selection
• When a new switch is added to the fabric, it sends out an EFP with its local value (“I am PS”); the fabric rejects that EFP and replies with a DIA telling the new switch to send an RDI; the RDI is then routed to the current PS
• If the new switch is part of another fabric (it also has a PS), both fabrics send out an EFP, and after comparing the DoID lists the fabric enters one of the following states
  - BF state: if the DoID lists do not overlap
  - RCF state: if the DoID lists overlap
  - Isolation: no auto-reconfigure, or RCF disabled
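The outcome of joining two fabrics, as listed above, reduces to a set-overlap check on the two domain ID lists. The sketch below assumes that isolation applies when the lists overlap and automatic reconfiguration is not permitted; the function and parameter names are hypothetical.

```python
def merge_action(doids_a, doids_b, auto_reconfigure=True):
    """Decide what happens when two fabrics with these domain ID lists meet."""
    overlap = set(doids_a) & set(doids_b)
    if not overlap:
        return "BF"        # non-disruptive Build Fabric
    if auto_reconfigure:
        return "RCF"       # disruptive Reconfigure Fabric
    return "ISOLATE"       # assumption: overlap with RCF disabled isolates the link


print(merge_action([1, 2, 3], [4, 5]))
print(merge_action([1, 2, 3], [3, 4]))
print(merge_action([1, 2, 3], [3, 4], auto_reconfigure=False))
```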
Fabric Configuration: Disruptive/Non-Disruptive

• One of the following three conditions can trigger BF (non-disruptive) or RCF (disruptive)
  - Two disjoint fabrics are combined
  - A principal ISL fails (upstream or downstream)
  - A switch that already has a Domain_ID requests another Domain_ID
• Whenever a switch receives a BF/RCF, it starts the F_S_TOV timer and enters the BF/RCF state; it forwards the BF/RCF out of all E_Ports except the incoming port (only once) and waits for the timer to expire
• When the timer expires, the BF/RCF propagation state is left and principal switch selection begins
• BF is not a disruptive process
• RCF is a disruptive process

Fabric Configuration: Distribution

[Figure: propagation of BF or RCF requests through the fabric, starting from the switch that starts the reconfiguration.]

Fabric Configuration: Reserved IDs

• N_Ports and E_Ports get one port ID; F_Ports do not get any ID; FL_Ports in a public AL get port ID 0x00

Domain_ID  Area_ID  Port_ID    Description
00         00       00         Used during FLOGI
00         00       AL_PA      Private loop NL_Port
00         00       Non-AL_PA  Reserved
00         01-FF    00-FF      Reserved
01-EF      00-FF    00-FF      N_Port and E_Port; Port ID = 00 for the FL_Port for public devices (255 addresses)
F0-FE      00-FF    00-FF      Reserved
FF         00-FA    00-FF      Reserved
FF         FB       00-FF      Multicast and broadcast
FF         FC       00         Reserved
FF         FC       01-EF      N_Port of domain controller; Port ID is the domain ID
FF         FC       F0-FF      Reserved
FF         FD-FE    00-FF      Reserved
FF         FF       00-EF      Reserved
FF         FF       F0-FC,FF   Well-known address
FF         FF       FD         N_Port of fabric controller
FF         FF       FE         Fabric F_Port, fabric login database
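A few rows of the reserved-ID table can be turned into a lookup to show how a 24-bit address is classified. This sketch covers only some of the rows, and the function name and returned strings are illustrative assumptions.

```python
def classify_fc_id(fc_id: int) -> str:
    """Classify a 24-bit address using a subset of the reserved-ID table."""
    domain, area, port = (fc_id >> 16) & 0xFF, (fc_id >> 8) & 0xFF, fc_id & 0xFF
    if fc_id == 0x000000:
        return "used during FLOGI"
    if fc_id == 0xFFFFFD:
        return "fabric controller"
    if fc_id == 0xFFFFFE:
        return "fabric F_Port (fabric login)"
    if domain == 0xFF and area == 0xFF and (0xF0 <= port <= 0xFC or port == 0xFF):
        return "well-known address"
    if domain == 0xFF and area == 0xFC and 0x01 <= port <= 0xEF:
        return f"domain controller of domain 0x{port:02X}"
    if domain == 0xFF and area == 0xFB:
        return "multicast and broadcast"
    if 0x01 <= domain <= 0xEF:
        return "N_Port or E_Port address"
    return "reserved or loop-private"


for address in (0x000000, 0x0A1B2C, 0xFFFC0A, 0xFFFFFC, 0xFFFFFE):
    print(f"{address:06X}: {classify_fc_id(address)}")
```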
Fabric Configuration: FSPF

• FSPF stands for Fabric Shortest Path First
• Based on a link-state protocol
• Begins after domain ID assignment is completed
• Conceptually based on the Open Shortest Path First (OSPF) internet routing protocol
• Currently a standard defined in FC-SW-2

Fabric Configuration: FSPF

• FSPF has four major components
  - Hello protocol
  - Replicated topology database
  - A path computation algorithm
  - Routing table update
• FSPF discovers the paths to switches using Domain_IDs
• Each switch performs its own shortest path calculations

Fabric Configuration: FSPF

• For FSPF a domain ID identifies a single switch
  - This limits the maximum number of switches in the fabric to 239 when FSPF is supported
• FSPF performs hop-by-hop routing
• FSPF supports hierarchical path selection
  - Provides scalable routing tables in large topologies

Fabric Configuration: FSPF

• Everyone says HELLO to their neighbor, on all initialized ISLs
• The neighbors say HELLO back, unless they are dead
• When a HELLO packet is received carrying both the originator and the recipient domain ID, two-way communication is established and:
  - The ISL is active
  - The ISL may be available as a two-way path for frames

Fabric Configuration: Hellos

Hello Protocol

• Point-to-point only
• Default Hello interval = 20 s
• Default HelloDead interval = 80 s
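The two-way check and the dead-interval test described above can be sketched as below. The `Hello` class and its field names are hypothetical and model only the two domain IDs mentioned in the slides, not the real HLO frame format.

```python
from dataclasses import dataclass
from typing import Optional

HELLO_INTERVAL_S = 20  # default Hello interval
DEAD_INTERVAL_S = 80   # default HelloDead interval


@dataclass
class Hello:
    originator_domain: int
    recipient_domain: Optional[int]  # None until the sender has heard from us


def is_two_way(hello: Hello, my_domain: int) -> bool:
    """Two-way communication exists once the neighbor echoes our domain ID."""
    return hello.recipient_domain == my_domain


def neighbor_is_dead(last_hello_rx_s: float, now_s: float,
                     dead_interval_s: float = DEAD_INTERVAL_S) -> bool:
    """A neighbor is declared dead when no HELLO arrives within the dead interval."""
    return now_s - last_hello_rx_s >= dead_interval_s


print(is_two_way(Hello(originator_domain=2, recipient_domain=None), my_domain=1))
print(is_two_way(Hello(originator_domain=2, recipient_domain=1), my_domain=1))
print(neighbor_is_dead(last_hello_rx_s=0, now_s=85))
```

With the defaults, a neighbor has to miss four consecutive Hello intervals before the ISL is taken out of the topology.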
Fabric Configuration: Link State Update and Ack

• After a two-way HELLO is established on a link, each switch sends its entire database to its neighbor using the LSU service
• When the recipient of the LSU has processed the database, it sends back the LSA service

[Figure: switches A and B exchange LSU(DB-A) and LSU(DB-B), then acknowledge with LSA(DB-B) and LSA(DB-A).]

Fabric Configuration: Link State Record

• When the databases are in sync, each switch sends its LSR with the new link included, using the LSU service
• The LSU is flooded to the entire fabric
• Each switch retransmits the LSU by a mechanism called “reliable flooding”

[Figure: switches A and B exchange LSU(LSR-A) and LSU(LSR-B), then acknowledge with LSA(LSR-B) and LSA(LSR-A).]

Fabric Configuration

• Link cost is calculated from the baud rate of the link, plus an administratively set factor
• Link cost = S * (1.0625E12 / baud rate)
  - S is an administrative factor that defaults to 1
  - Example: link cost of a 1G port = 1.0625E12 / 1.0625E9 = 1000
• Path cost is the sum of the traversed link costs
• A lower metric is more desirable

Fabric Configuration: Path Selection (FSPF) Operation Summary

Operation: Perform Initial HELLO Exchange
  Starting condition: The switch sending the HELLO has a valid Domain_ID
  Process: HLO SW_ILS frames are exchanged on the link until each switch has received a HELLO with a valid neighbor Domain field
  Ending condition: Two-way communication has been established

Operation: Perform Initial Database Exchange
  Starting condition: Two-way communication has been established
  Process: LSU SW_ILS frames are exchanged containing the initial database
  Ending condition: Link state databases have been exchanged

Operation: Running State
  Starting condition: Initial database exchange has been completed
  Process: Routes are calculated and set up within each switch. Links are maintained by sending HELLOs every Hello_Interval.
    Link databases are maintained by flooding link updates as appropriate.
  Ending condition: FSPF routes are fully functional

FSPF Characteristics

• Uses FSPF as the routing algorithm
• FSPF routes traffic based on the destination domain ID
• FSPF uses total cost as the metric to determine the most efficient path
• Static routes can be applied

FSPF Characteristics

Paths:
• Finds the shortest path to each domain, then programs the hardware routing tables

Routes:
• Dynamically: round robin
• Statically: the administrator can configure the route
  - Automatically re-routes when the ISL goes away; static routing takes effect again when the ISL returns
• Automatic failover
• Fault detection in 150 ms
• Self heals in 500 ms
• So the alternate route is live in 650 ms

Routing Software Configurable Parameters

• Link cost
• Static routes
• In Order Delivery (IOD)
• Timers (be careful)

What Is a Route and Path?

• A route is a map between the input and output E_Port used to reach the next switch
• A path is a map through the topology between a source and a destination

[Figure: two views of a small fabric, one highlighting a route across an ISL and one highlighting a path.]
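The link cost formula quoted earlier, Link cost = S * (1.0625E12 / baud rate), and the rule that path cost is the sum of link costs can be checked with a short calculation. The function names are hypothetical; the 2G baud rate of 2.125E9 is an assumption added for the second example.

```python
def link_cost(baud_rate: float, s: float = 1.0) -> float:
    """FSPF link cost: administrative factor S times 1.0625E12 / baud rate."""
    return s * (1.0625e12 / baud_rate)


def path_cost(baud_rates, s: float = 1.0) -> float:
    """Path cost is the sum of the costs of the traversed links."""
    return sum(link_cost(rate, s) for rate in baud_rates)


ONE_GIG = 1.0625e9  # baud rate of a 1G port
TWO_GIG = 2.125e9   # assumed baud rate of a 2G port

print(link_cost(ONE_GIG))              # 1000.0, as in the slide example
print(link_cost(TWO_GIG))              # 500.0
print(path_cost([TWO_GIG, TWO_GIG]))   # two 2G hops cost the same as one 1G hop
```

Raising S on a link makes it less attractive without changing its speed, which is how an administrator can steer traffic away from an ISL.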
Selecting a Path

• Each inter-switch link has a cost metric
• The cost of an ISL is related to its bandwidth
• The total cost of a path between two switches is the sum of the costs of all the traversed ISLs
• The path to a destination switch is the one with the minimum total cost
• More than one path can be selected (with the same cost)

[Figure: ISLs labelled Cost 500, Cost 250 and Cost 250.]

ISL Oversubscription

• Oversubscription occurs when multiple nodes contend for the use of one ISL
• The oversubscription ratio is the number of different ports that contend for the use of one ISL
• This is a 3:1 oversubscription

[Figure: three 1G nodes attached to one switch share a single 1G ISL to a second switch.]

FC ERROR MANAGEMENT

Timers

• Four different timers are used
  - Receiver-transmitter time-out (R_T_TOV)
  - Error detect time-out (E_D_TOV)
  - Resource allocation time-out (R_A_TOV)
  - Connection request time-out (C_R_TOV): used in Class 1 (you will never see Class 1)

Timers: R_T_TOV

Receiver-Transmitter Time-out

• Used to time events at the link level
  - Loss of synchronization
  - Times responses for the link reset protocol
• Generally controlled in hardware for all link configurations
  - Default value in the FC standard is 100 ms
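The path selection rule from earlier in this section (minimum total cost, with several equal-cost paths allowed) can be sketched with Dijkstra's algorithm. This is a generic shortest-path sketch rather than the FSPF implementation; the three-switch topology and the function names are assumptions built from the costs 500, 250 and 250 shown in the figure.

```python
import heapq


def shortest_costs(links, src):
    """Dijkstra: minimum total cost from src to every reachable switch."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, link in links.get(node, {}).items():
            new_cost = cost + link
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


def equal_cost_paths(links, src, dst):
    """Return (minimum cost, all paths that achieve it)."""
    dist = shortest_costs(links, src)
    if dst not in dist:
        return None, []

    def walk(node):
        if node == src:
            return [[src]]
        paths = []
        for prev, neighbors in links.items():
            if node in neighbors and prev in dist and dist[prev] + neighbors[node] == dist[node]:
                paths += [p + [node] for p in walk(prev)]
        return paths

    return dist[dst], walk(dst)


# Hypothetical fabric: domain 1 reaches domain 2 directly (500) or via domain 3 (250 + 250).
links = {1: {2: 500, 3: 250}, 2: {1: 500, 3: 250}, 3: {1: 250, 2: 250}}
print(equal_cost_paths(links, 1, 2))
```

Both paths cost 500, so both are eligible, which matches the last bullet of the Selecting a Path slide.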
Timers: E_D_TOV

Error Detect Time-out

• Times events and responses at the sequence level
  - Missing ACK, or missing R_RDY when buffer credit has reached zero
  - Class 1 or 2 expects a response to data frames
  - N_Port logout
• The timer value is set at fabric login to suit the network environment, scaling with the delivery time of frames
  - Default is 10 sec

Timers: R_A_TOV

Resource Allocation Time-out

• Time-out value for how long to hold resources associated with a failed operation
  - Needed to free shared resources for reuse
• Determines how long a port needs to keep responding to a link service request before an error is detected
  - R_A_TOV is 2 times E_D_TOV
  - Default setting is 20 sec in point-to-point and 120 sec in a fabric

Timers: C_R_TOV

Connection Request Time-out

• Determines how long the fabric can hold a Class 1 request in the queue during connection establishment
• Allows the time spent in a stacked queue to be separated from E_D_TOV; this separates queuing time from frame transit time
• Helps in controlling F_BSY issues

Recovery: Class 3

• Errors in a Class 3 sequence can only be detected by the sequence recipient, because there are no ACKs or rejects in Class 3
• The Class 3 sequence recipient (SR) discards single or multiple frames until the exchange is terminated
• Upper-level recovery may retransmit the entire sequence, or at least the sequence following the error detection
Recovery: Class 3

• Errors a Class 3 operation can detect:
  - Out-of-order delivery and a potential missing frame, based on SEQ_CNT
  - A missing frame that is not received within E_D_TOV
  - Indication of a new sequence when the last frame of the previous sequence has not been received (in-order delivery set)
  - Relative offset not in order (in-order delivery set)

Abort Sequence: ABTS

• ABTS protocol
  - Used to terminate a sequence or exchange
  - Transmitted by the sequence initiator
  - Can be requested by the sequence recipient by setting bits within the F_CTL field of the ACK frame
  - The ABTS frame uses the same class-of-service delimiter as the sequence being aborted
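The SEQ_CNT check that a Class 3 sequence recipient relies on can be sketched as a gap detector. This is a simplified illustration: the function name is hypothetical, counter wrap-around is ignored, and the E_D_TOV wait before declaring a frame lost is not modelled.

```python
def missing_seq_cnt(received, first=0):
    """Return the SEQ_CNT values skipped in a stream of received frames.

    received is the list of SEQ_CNT values in arrival order; first is the
    SEQ_CNT expected on the first frame. Assumes in-order delivery, so a
    jump in SEQ_CNT means frames were lost.
    """
    expected = first
    missing = []
    for cnt in received:
        if cnt > expected:
            missing.extend(range(expected, cnt))  # frames that never arrived
        if cnt >= expected:
            expected = cnt + 1
    return missing


print(missing_seq_cnt([0, 1, 3, 4]))  # frame 2 was lost
print(missing_seq_cnt([0, 1, 2]))     # nothing missing
```

Since Class 3 has no ACKs, the recipient can only discard the affected frames; retransmission is left to the upper-level protocol.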
Abort Sequence: ABTS

• ABTS can be sent under abnormal conditions
  - End-to-end credit not required
  - Sequence initiative not required
  - Open sequence not required
  - Allowed even when the maximum number of concurrent sequences is in use
  - Unidirectional Class 1 connection
• The reply to an ABTS is a Basic_Accept

iSCSI (RFC 3720)

Session Modules

• What is iSCSI and what is the big picture?
• iSCSI protocol introduction
• The iSCSI connection
• Security, data integrity and error recovery
• iSCSI protocol details in depth
• Simple iSCSI connection flows
• Service Location Protocol for IP storage

What Is iSCSI?

• A SCSI transport protocol that operates on top of TCP
  - Encapsulates SCSI-3 CDBs (Command Descriptor Blocks) and data into TCP/IP byte streams (defined by SAM-2, SCSI Architecture Model 2)
  - Allows IP hosts to access IP- or Fibre Channel-connected SCSI targets
  - Allows Fibre Channel hosts to access IP SCSI targets
• Standards status
  - RFC 3720 (assigned May 2004)
  - Major industry support (Cisco, IBM, EMC, HP, Microsoft)
Storage Technology

[Figure: SCSI domain. An initiator SCSI device (application client and port) and a target SCSI device (port, task manager, and logical unit 1 with its device server) are joined by the service delivery subsystem. Device service requests and responses and task requests and responses flow between them.]

• To be functional, a SCSI domain needs to contain a SCSI device that contains a target and a SCSI device that contains an initiator

SAN, NAS, iSCSI Comparison

[Figure: protocol stacks on the computer system for DAS, SAN, iSCSI appliance, iSCSI gateway and NAS. All run an application over a file system. DAS, SAN and the iSCSI cases add a volume manager and SCSI device driver and use block I/O, over a SCSI bus adapter (DAS), a Fibre Channel HBA (SAN), or an iSCSI driver, TCP/IP stack and NIC (iSCSI). NAS uses an I/O redirector with NFS/CIFS over a TCP/IP stack and NIC and uses file I/O. On the storage side, the iSCSI appliance and gateway terminate IP with a NIC, TCP/IP stack, iSCSI layer and bus adapter (the gateway connects onward through an FC switch); the NAS device runs a NIC, TCP/IP stack, file system and device driver. Adapted from the IBM Redbook “IP Storage Networking: IBM NAS & iSCSI Solutions”.]

IP Storage Networking

• IP storage networking provides a solution for carrying storage traffic within IP
• Uses TCP: a reliable transport for delivery
• Can be used for local data center and long-haul applications
• Two primary protocols:
  - iSCSI (IP-SCSI): used to transport SCSI CDBs and data within TCP/IP connections
    Encapsulation: IP | TCP | iSCSI | SCSI | Data
  - FCIP (Fibre Channel over IP): used to transport Fibre Channel frames, which in turn carry SCSI CDBs and data, within TCP/IP connections
    Encapsulation: IP | TCP | FCIP | FC | SCSI | Data
Initiator and Target Model for iSCSI
• Initiator: SCSI device which is capable of originating SCSI commands and task management requests
• Target: SCSI device which is capable of executing SCSI commands and task management requests
[Figure: an iSCSI gateway in iSCSI target mode (an iSCSI initiator reaches an FC target through the gateway) and in iSCSI initiator mode (an FC initiator reaches an iSCSI target through the gateway)]

iSCSI Components
• iSCSI is an end-to-end protocol
• iSCSI has human-readable SCSI device (node) naming
• iSCSI includes the following base components:
  IPSec connectivity security
  Authentication for access configuration
  Discovery of iSCSI nodes
  Process for remote boot
  iSCSI MIB standards

iSCSI: Internet SCSI PDU
• The iSCSI layer encapsulates the SCSI CDB into an iSCSI Protocol Data Unit (PDU) and forwards it to the Transmission Control Protocol (TCP) layer
• It also extracts the CDB from an iSCSI PDU received from the TCP layer, and forwards the CDB to the SCSI layer
• iSCSI mapping provides the SCSI-3 command layer with a reliable transport
• The communication between the initiator and target occurs over one or more TCP connections
• The TCP connections form a session and carry the iSCSI PDUs; each connection within the session is identified by a connection ID (CID), and the session itself by a session ID with two parts, the Initiator Session ID (ISID) and the Target Session ID (TSID); together these identify an “I_T nexus”
iSCSI Model
[Figure: iSCSI Model. The application client on the host (initiator) requests data from LUN 1; SCSI CDBs are carried in iSCSI PDUs to the iSCSI target port (ge2), mapped to the FC port (fc1), and then carried by Fibre Channel Exchanges and Sequences to the FC storage device (FC target with Logical Unit 1 and Logical Unit 2, each with a Device Server). Device Service Requests flow toward the target and Device Service Responses flow back. Mapping shown: LUN 1 = LUN 2]

iSCSI Stack
[Figure: SCSI protocol layering]
  SCSI applications (file systems, databases)
  SCSI device-type commands: SCSI block commands, SCSI stream commands, other SCSI commands
  SCSI generic commands: SCSI commands, data, and status
  SCSI transport protocols: parallel SCSI transport; FCP (SCSI over FC); iSCSI (SCSI over TCP/IP)
  Layer 3 network transport (below iSCSI): TCP, IP
  Layer 2 network: parallel SCSI interfaces; Fibre Channel; Ethernet

iSCSI Packet
[Figure: iSCSI encapsulation in an Ethernet frame]
  Ethernet frame (field sizes in octets): Preamble (8), Destination Address (6), Source Address (6), Type (2), IP/TCP/Data (46 to 1500 bytes), FCS (4)
  Well-known TCP ports: 21 FTP, 23 Telnet, 25 SMTP, 80 HTTP, 3260 iSCSI
  TCP header: Source Port, Destination Port, Sequence Number, Acknowledgment Number, Offset, Reserved, flags (U A P R S F), Window, Checksum, Urgent Pointer, Options and padding
  iSCSI encapsulated: Opcode, opcode-specific fields, length of data (after the 48-byte header), LUN or opcode-specific fields, Initiator Task Tag, opcode-specific fields, data field
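The iSCSI header fields named in the packet figure can be sketched as a byte layout. The following is a minimal illustration, assuming the 48-byte Basic Header Segment layout of RFC 3720 (opcode in byte 0, 24-bit data segment length in bytes 5 to 7, LUN in bytes 8 to 15, Initiator Task Tag in bytes 16 to 19). The function names `pack_bhs` and `parse_bhs` are hypothetical, all opcode-specific fields are simply left as zero, and the byte-1 bit 0x80 is treated as the Final bit, which holds for most but not all PDU types.

```python
import struct

BHS_LEN = 48  # Basic Header Segment length per RFC 3720


def pack_bhs(opcode, itt, data_len, lun=0, immediate=False, final=True):
    """Build a 48-byte header; opcode-specific fields are left as zero."""
    byte0 = (0x40 if immediate else 0) | (opcode & 0x3F)  # I bit + opcode
    byte1 = 0x80 if final else 0                          # F bit (most PDU types)
    # One byte of TotalAHSLength (zero here) followed by 24 bits of DataSegmentLength
    ahs_and_len = data_len & 0xFFFFFF
    head = struct.pack(">BBHIQI", byte0, byte1, 0, ahs_and_len, lun, itt)
    return head + bytes(BHS_LEN - len(head))


def parse_bhs(bhs):
    """Decode the generic fields of a Basic Header Segment."""
    byte0, byte1, _, ahs_and_len, lun, itt = struct.unpack(">BBHIQI", bhs[:20])
    return {
        "immediate": bool(byte0 & 0x40),
        "opcode": byte0 & 0x3F,
        "final": bool(byte1 & 0x80),
        "total_ahs_len": ahs_and_len >> 24,
        "data_len": ahs_and_len & 0xFFFFFF,
        "lun": lun,
        "itt": itt,
    }


# Header for a SCSI Command PDU (opcode 0x01) announcing 512 bytes of data
hdr = pack_bhs(opcode=0x01, itt=0x1234, data_len=512)
fields = parse_bhs(hdr)
```

A real implementation would also fill in the opcode-specific fields (for example CmdSN and the CDB for a SCSI Command PDU) and append any digests negotiated at login.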
iSCSI Naming and Discovery
RFC 3721
• Initiator and target require iSCSI names
  Name is location independent
  iSCSI node name = SCSI device name of iSCSI device
  Associated with iSCSI nodes, not adapters
  Up to 223-byte displayable/human-readable string (UTF-8 encoding)
  Use SLP, or iSNS, or query target for names (SendTargets)
• Two iSCSI name types:
  iqn: iSCSI qualified name
  eui: Extended Unique Identifier (IEEE EUI-64)

iSCSI Name Structure
• iqn: Type . Date . Naming Authority . Unique String
  Type = iqn
  Date = yyyy-mm when domain acquired
  Naming authority = reversed domain name of the organization
  Unique string = defined by the organization naming authority (subgroup naming authority, host name, or other string)
  Examples:
    iqn.1987-05.com.cisco.1234abcdef987601267da232.betty
    iqn.2001-04.com.acme.storage.tape.sys1.xyz
• eui: Type . EUI-64 Identifier
  Type = eui
  EUI-64 identifier = ASCII-encoded hexadecimal
  Example:
    eui.02004567a425678d

iSCSI Naming and Addressing Terms
• iSCSI host name
  Name of computer
• iSCSI initiator name (iSCSI node)
  Name created at iSCSI driver load time on host system
• Initiator-target session ID (SSID)
  One or more TCP connections between initiator and target; this session ID is derived from iSCSI host name, iSCSI target name, and TSID, ISID
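The two name types can be told apart mechanically. The sketch below is a hedged illustration rather than a complete validator: the function name `parse_iscsi_name` and the returned dictionary keys are hypothetical, and it assumes the RFC 3720 convention in which an optional colon separates the reversed domain name from the unique string. Names written entirely with dots, as in the slide examples, are accepted with the whole remainder treated as the naming authority.

```python
import re

# iqn.<yyyy-mm>.<reversed domain>[:<unique string>]
IQN_RE = re.compile(r"^iqn\.(\d{4})-(\d{2})\.([^:]+)(?::(.*))?$")
# eui.<16 hexadecimal digits>
EUI_RE = re.compile(r"^eui\.([0-9A-Fa-f]{16})$")


def parse_iscsi_name(name):
    """Split an iSCSI name into its parts, or raise ValueError."""
    m = EUI_RE.match(name)
    if m:
        return {"type": "eui", "eui64": m.group(1).lower()}
    m = IQN_RE.match(name)
    if m:
        year, month, authority, unique = m.groups()
        if not 1 <= int(month) <= 12:
            raise ValueError("month out of range in " + name)
        return {
            "type": "iqn",
            "date": f"{year}-{month}",
            "authority": authority,
            "unique": unique,  # None when no colon-separated part is present
        }
    raise ValueError("not an iqn or eui name: " + name)


eui = parse_iscsi_name("eui.02004567a425678d")
dotted = parse_iscsi_name("iqn.2001-04.com.acme.storage.tape.sys1.xyz")
colon = parse_iscsi_name("iqn.2001-04.com.example:storage.tape1")
```

Here `colon["authority"]` is `com.example` and `colon["unique"]` is `storage.tape1`, while the dotted form leaves `unique` as `None` because nothing in the string marks where the domain ends.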
iSCSI Naming and Addressing Terms
• iSCSI initiator address
  IP address on initiator interface; an initiator can have multiple addresses
• Initiator port: also known as network portal
  IP address on initiator, no port number assigned; again, an initiator can have several network portals
• Target port: also known as network portal
  IP address + TCP port number on target interface
  There can be more than one target interface

iSCSI Naming and Addressing Terms
• iSCSI target name
  Used to identify multiple SCSI targets behind a single IP address + port; this name is globally unique
• Initiator session ID
  This is an initiator-defined session identifier; it will be the same for all connections within a session; an iSCSI initiator port is uniquely identified by the value pair (iSCSI initiator name, ISID)
• Target session ID
  Target-assigned tag for a session with a specific named initiator that, together with the ISID, uniquely identifies a session with that initiator

iSCSI Naming and Addressing Terms
• iSCSI network entity (client) is a combination of the following:
  iSCSI initiator
  iSCSI host
  iSCSI initiator address
  Initiator port (network portal)
• iSCSI network entity (server) is a combination of the following:
  iSCSI target name
  Target port (network portal)
• Initiator-target session (SSID)

iSCSI Naming and Addressing Terms
• iSCSI node
  iSCSI initiator or iSCSI target; there can be one or more iSCSI nodes in a network entity
  The iSCSI node name equals the iSCSI initiator name or the iSCSI target name
iSCSI Naming and Addressing Terms
• Portal group
  Groups the network portals across which the TCP connections of a single session can be spread
  The portal groups are identified by a portal group tag (1-65535)
  One or more portal groups can provide a path to the same iSCSI node (target node or initiator node)
  SendTargets requires portal group tag

iSCSI Discovery Methods
• Small networks
  Static configuration, initiators and targets
  ‘SendTargets’ command makes configuration easier
• Medium-sized networks
  Service Location Protocol (SLP multicast discovery)
• Large-sized networks
  iSNS (Internet Storage Name Service)
  Includes soft zone domains
  Includes database for ongoing management

iSCSI Architecture
[Figure: a network entity (iSCSI client) containing an iSCSI node (initiator) with network portals 10.1.30.1 and 10.1.40.1, and a network entity (iSCSI server) containing two iSCSI nodes (targets) with network portals 10.1.30.2 and 10.1.40.2]

iSCSI Architecture
[Figure: an iSCSI target node (within a network entity) reached over the IP network through network portals 10.1.30.1, 10.1.40.1, and 10.1.50.1. Portal Group 1 carries an iSCSI session (target side) identified by iSCSI Name + TSID=2; Portal Group 2 carries an iSCSI session (target side) identified by iSCSI Name + TSID=1]
iSCSI Session Model
• An iSCSI session exists between a single iSCSI initiator (host) and a single iSCSI target (iSCSI router)
• An iSCSI session consists of one or more iSCSI (TCP) connections
• Login phase begins each connection
• Deliver SCSI commands in order
[Figure: an iSCSI session made of several iSCSI (TCP) connections, each to TCP/3260, terminating on an iSCSI routing instance in the iSCSI storage router]

iSCSI Session Images
• Across all connections within a session, an initiator sees one “target image”
• The target image would represent all identifying elements such as LUNs
• A target also sees one “initiator image” across all connections within a session

Put It All Together for iSCSI
[Figure: the host mike.cisco.com (iSCSI host name) runs the iSCSI driver with iSCSI initiator name disk.cisco.com.stor.123 and a storage NIC at 1.1.1.1 (iSCSI initiator address, initiator port). A TCP connection over IP, identified by ISID, TSID, and SSID, reaches target port 2.2.2.2 on the iSCSI network entity (server); its network portals listen for iSCSI connections on well-known port 3260. The iSCSI nodes make the connections between the storage and the iSCSI initiator: iSCSI target names Target-1 through Target-5 are configured on the iSCSI device and map to storage systems at 3.3.3.3, 4.4.4.4, and 5.5.5.5]

iSCSI Connections and SCSI Phases
• A SCSI command and its associated data and status phase exchanges must traverse the same TCP connection
• Linked SCSI commands can traverse separate TCP connections for scalability
[Figure: an iSCSI session between the host and the iSCSI routing instance of an iSCSI storage router, carrying linked SCSI commands. iSCSI (TCP) connection 1 carries SCSI Command (1) (Read), SCSI Data (1), and SCSI Status (1); iSCSI (TCP) connection 2 carries SCSI Command (1) (Write), SCSI Data (1), and SCSI Status (1)]
iSCSI Connection Session
Session can process SCSI commands and data after login is complete
• An iSCSI session has four phases
  Initial login phase
  Security authentication phase
  Operational negotiation phase
  Full featured phase

iSCSI Session Establishment
Login begins with the first connection
• Initial login phase
  Initiator sends login with text strings for InitiatorName, TargetName, and authentication options (which are then selected by the target)
• Security authentication phase
  Authentication text exchanges (ID, password, certificates, etc.)
• Operational negotiation phase
  Each side (initiator and target) negotiates the supported options using Keyword=value, or Keyword=value,value,value
  Amount of unsolicited buffer
  Types of data delivery: solicited, unsolicited, immediate, etc.
• Full featured phase
  Can carry SCSI CDBs/data, task management, and responses

iSCSI Session Key Points
Sessions:
• iSCSI session = a group of TCP connections linking an initiator with a target (i.e., can be one or more connections)
• NOTE: A TCP connection that is part of an iSCSI session will only be used to carry iSCSI traffic
• The iSCSI initiator and target use the session to communicate iSCSI commands, control messages, parameters, and data to each other
• TCP connections can be added to and removed from a session using the iSCSI Login/Logout commands
iSCSI Sessions
• During session establishment, the target identifies the SCSI initiator port (the “I” in the “I_T nexus”) through the value pair (InitiatorName, ISID)
• Any persistent state (e.g., persistent reservations) on the target associated with a SCSI initiator port is identified based on this value pair
• Any state associated with the SCSI target port (the “T” in the “I_T nexus”) is identified externally by the TargetName and portal group tag, and internally in an implementation-dependent way

iSCSI Connection Allegiance
• For SCSI commands that require data transfer, the data phase and status phase must be sent over the same TCP connection used by the command phase
• Consecutive commands that are part of a SCSI task may use different connections within the session (linked commands)
• Connection allegiance is strictly per command and not per task
• Multiple connections allow the iSCSI session to be scaled across multiple links/devices

iSCSI Connection Termination
• A session may end with a logout or with an I/O error that causes a dropped connection
• TCP connections are closed through normal methods, i.e., TCP FINs
• Graceful shutdowns can only occur when no tasks are outstanding on the connection and the connection is not in full-feature phase
• Abnormal termination of a connection may require recovery by a logout request for all connections; this prevents stale iSCSI PDUs from being received after the connection goes down
• Logout can also be requested by the target through an asynchronous message PDU
iSCSI Security
• Two types of security
  IPSec secures TCP/IP nodes; set up at TCP/IP startup, before iSCSI login
    Session authentication via IKE (Internet Key Exchange)
    Packet-by-packet authentication (also provides integrity)
    Privacy via encryption (also provides integrity)
    See SEC-IPS
  iSCSI techniques (done/set up during iSCSI login)
    Authentication (ensures nodes are authorized to use the iSCSI target node) may use SRP, CHAP, or Kerberos

Challenge Handshake Authentication Protocol
• In-band initiator-target authentication
• IPSec is not assumed
• No clear-text password accepted
• Compliant iSCSI initiators and targets MUST implement CHAP (RFC 1994)
• Implementations MUST support use of up to 128-bit random CHAP secrets

iSCSI Security
• Various levels of security can fit different topologies
  Examples:
  Secure main floor: no security
  Campus LAN: iSCSI authentication and CRC32c (digests)
  Remote private WAN: IPSec with session/packet authentication
  Remote Internet WAN: IPSec with privacy encryption

iSCSI Data Integrity
• A basic level of end-to-end data integrity can be reasonably handled by TCP using the standard checksum
• The iSCSI CRC32c digest checks for integrity beyond the 16-bit TCP/IP checksum
  a) Header digest
  b) Data payload digest
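The CRC32c digest mentioned above can be illustrated with a short bit-by-bit routine. This is a sketch for clarity, not a production implementation: it uses the reflected Castagnoli polynomial 0x82F63B78, real initiators and targets use table-driven or hardware CRC, and the helper name `data_digest` is hypothetical. The helper assumes the rule that a digest covers the data plus its padding to a 4-byte boundary and that an empty data segment carries no digest.

```python
def crc32c(data: bytes) -> int:
    """CRC-32C (Castagnoli), computed one bit at a time for readability."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; apply the reflected polynomial when the low bit is set
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def data_digest(segment: bytes):
    """Digest over a data segment plus zero padding to a 4-byte boundary."""
    if not segment:
        return None  # zero-length data segment: no data digest
    padded = segment + bytes(-len(segment) % 4)
    return crc32c(padded)


check = crc32c(b"123456789")  # standard CRC-32C check value: 0xE3069283
```

Because the header and data digests are computed separately, a device that rewrites only the header needs to recompute only the header digest.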
Digests (Checksums)
• Optional header and data digests protect the integrity of the header and data, respectively; the digests, if present, are located, respectively, after the header and PDU-specific data, and cover the proper data and the padding bytes
• The existence and type of digests are negotiated during the login phase
• The separation of the header and data digests is useful in iSCSI routing applications, in which only the header changes when a message is forwarded; in this case, only the header digest should be recalculated
• Digests are not included in data or header length fields
• A zero-length data segment also implies a zero-length data digest

Error Recovery
Two considerations for errors
• An iSCSI PDU may fail the digest check and be dropped, despite being received by the TCP layer; the iSCSI layer must optionally be allowed to recover such dropped PDUs
• A TCP connection may fail at any time during the data transfer; all the active tasks must optionally be allowed to continue on a different TCP connection within the same session

Error Recovery: iSCSI Initiator
A. NOP-OUT to probe sequence numbers of the target
B. Command retry
C. Recovery R2T support
D. Requesting retransmission of status/data/R2T using the SNACK facility
E. Acknowledging the receipt of the data
F. Reassigning the connection allegiance of a task to a different TCP connection
G. Terminating the entire iSCSI session to start fresh

Error Recovery: iSCSI Target
A. NOP-IN to probe sequence numbers of the initiator
B. Requesting retransmission of data using the recovery R2T feature
C. SNACK support
D. Requesting that parts of read data be acknowledged
E. Allegiance reassignment support
F. Terminating the entire iSCSI session to force the initiator to start over

Error Recovery Classes
• Within a command (i.e., without requiring command restart)
• Within a connection (i.e., without requiring the connection to be rebuilt, but perhaps requiring command restart)
• Connection recovery (i.e., perhaps requiring connections to be rebuilt and commands to be reissued)
• Session recovery

Error Levels
• Level determined during login text negotiation
  Error recovery level is proposed by an originator in a text negotiation

iSCSI PROTOCOL DETAILS IN-DEPTH

iSCSI Key Points
• Tasks: a linked set of SCSI commands
  One and only one SCSI command at a time can be processed within any given iSCSI task
• Initiator Task Tag (ITT) and Target Transfer Tag (TTT)
  Initiator tags for all pending commands must be unique initiator-wide
  SCSI data PDUs are matched to their corresponding SCSI commands using tags specified in the protocol
  ITT for unsolicited data
  TTT for solicited data
iSCSI Key Points
Solicited or unsolicited messages:
• Initiator to target
  User data or command parameters will be sent as either solicited data or unsolicited data
  Solicited data is sent in response to ready to transfer (R2T) PDUs
  Unsolicited data can be part of an iSCSI command PDU (“immediate data”) or an iSCSI data PDU
  The maximum size of an individual data PDU or the immediate part of the initial unsolicited burst may be negotiated during login
• Target to initiator
  Ready to transfer (R2T) message to initiator, requesting data for a write command
  Command responses
  Asynchronous messages (SCSI and iSCSI) describing an unusual or error event

iSCSI Numbering
• iSCSI uses command and status numbering
  Command numbering: session-wide and used for ordered command delivery over multiple connections within a session; it can also be used as a mechanism for command flow control over a session
  Status numbering: per connection and used to enable recovery in case of connection failure
• Fields in the iSCSI PDUs communicate the reference numbers between the initiator and target
  During periods when traffic on a connection is unidirectional, iSCSI NOP PDUs may be issued to synchronize the command and status ordering counters of the initiator and target
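The use of command numbering for flow control can be sketched as a sliding window. The example below assumes the RFC 3720 field names ExpCmdSN and MaxCmdSN (the next command number the target expects and the highest it will accept) and 32-bit serial number arithmetic, so that the counters may wrap; the function names are hypothetical.

```python
MOD = 1 << 32  # command sequence numbers are 32-bit counters that wrap


def sn_lte(a, b):
    """True if a <= b under 32-bit serial number arithmetic."""
    return ((b - a) % MOD) < (1 << 31)


def in_command_window(cmd_sn, exp_cmd_sn, max_cmd_sn):
    """A command is acceptable when ExpCmdSN <= CmdSN <= MaxCmdSN."""
    return sn_lte(exp_cmd_sn, cmd_sn) and sn_lte(cmd_sn, max_cmd_sn)


def window_size(exp_cmd_sn, max_cmd_sn):
    """Number of commands the initiator may still send.

    MaxCmdSN == ExpCmdSN - 1 yields 0, i.e. a closed window.
    """
    return (max_cmd_sn - exp_cmd_sn + 1) % MOD


# Window that wraps around the 32-bit boundary: 0xFFFFFFFE, 0xFFFFFFFF, 0, 1, 2
wrapped_ok = in_command_window(0, 0xFFFFFFFE, 2)
wrapped_size = window_size(0xFFFFFFFE, 2)
```

The target opens or closes the window by advancing MaxCmdSN faster or slower than ExpCmdSN in the PDUs it returns.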
SCSI Command Numbering and Acks within iSCSI
• Initiator and target device have three sequence number registers per session
  CmdSN: current command sequence number; sent by the initiator
  ExpCmdSN: next command sequence number expected by the target; sent to the initiator by the target to acknowledge CmdSN; can be used to acknowledge several sequence numbers at once
  MaxCmdSN: maximum command sequence number the target can receive in its queue; can be sent to the initiator from the target to adjust queue size

SCSI Command Numbering and Acks within iSCSI
• iSCSI supports ordered command delivery within the session
• Command Sequence Number (CmdSN) is assigned by the initiator and carried in the iSCSI PDU
• CmdSN starts at iSCSI login
• CmdSN is not assigned to data-out (DataSN is used)
• Immediate delivery does not advance CmdSN
• iSCSI must deliver commands to the target in order of CmdSN and will not increment until executed state by target

SCSI Status Numbering and Acks within iSCSI
• Status Sequence Number (StatSN) is used to number responses to the initiator from the target
• ExpStatSN is sent by the initiator to acknowledge status
• Status numbering starts after login; during login there can be only one outstanding command per connection

Initiator iSCSI Opcodes
0x00 NOP-Out (no operation, used as ping to target gateway)
0x01 SCSI command (indicates encapsulated iSCSI packet has a SCSI CDB for target device)
0x02 SCSI task management command
0x03 iSCSI login
0x04 Text command
0x05 SCSI data-out (write data to target device)
0x06 iSCSI logout
0x10 SNACK (request retransmission from target)
0x1c-0x1e Vendor-specific codes
Target iSCSI Opcodes
0x20 NOP-In (no operation in, used for ping response from target)
0x21 SCSI response (indicates encapsulated iSCSI packet carries status from target device)
0x22 SCSI task management response
0x23 Login response
0x24 Text response
0x25 SCSI data-in (read data from target)
0x26 Logout response
0x31 Ready to transfer (sent to initiator from target to indicate it is ready to receive data)
0x32 Async message (message from target to indicate special conditions)
0x3c-0x3e Vendor-specific codes
0x3f Reject

iSCSI PDUs
• Several different types of iSCSI PDUs are used; each iSCSI operation code (opcode) determines which iSCSI PDU to use; some of the more commonly used PDUs are:
  Login and logout PDU
  Command and response PDU
  Data-in and data-out PDU

iSCSI Login PDU
[Figure: Login PDU layout with field annotations]
  If set to 1 = recovery from failed connection
  Initiator ID for this connection
  If set to 1, indicates initiator is ready to transit to next stage
  Current stage/next stage: 0 = security negotiation, 1 = login operational negotiation, 3 = full feature phase
  Unique ID for this connection
  Initiator may provide initial text parameters in this area

iSCSI Login
• Login phase used to:
  Enable TCP connection (target listens on well-known port)
  Authenticate (CHAP)
  Negotiate session parameters
  Open security protocols
  Mark the TCP connection as an iSCSI session and assign IDs
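The stage fields of the Login PDU can be decoded from a single flags byte. The sketch below assumes the RFC 3720 layout of byte 1 of the Login Request: the Transit bit is 0x80, the Continue bit is 0x40 (not shown on the slide), the current stage (CSG) occupies the next-to-lowest two bits, and the next stage (NSG) the lowest two. The function name and stage labels are hypothetical.

```python
STAGES = {
    0: "SecurityNegotiation",
    1: "LoginOperationalNegotiation",
    3: "FullFeaturePhase",
}


def decode_login_flags(flags):
    """Decode byte 1 of a Login Request or Login Response header."""
    return {
        "transit": bool(flags & 0x80),   # T: ready to move to the next stage
        "continue": bool(flags & 0x40),  # C: text continues in a further PDU
        "csg": STAGES.get((flags >> 2) & 0x3, "reserved"),
        "nsg": STAGES.get(flags & 0x3, "reserved"),
    }


# T=1, CSG=0, NSG=1: finished with security negotiation, moving on
after_auth = decode_login_flags(0x81)
# T=1, CSG=1, NSG=3: finished with operational negotiation, entering full feature phase
to_ffp = decode_login_flags(0x87)
```

The next-stage value is only meaningful when the Transit bit is set; stage value 2 is reserved, which is why the table skips it.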
iSCSI Text Mode
During login, some session or connection parameters may be negotiated in a text format
list = values sent in order of preference
Examples of values (T = target, I = initiator):
  MaxConnections=<1-65535>          T or I
  SendTargets=all                   I only
  TargetName=<iSCSI-Name>           T or I
  SessionType=<Discovery|Normal>    I only
  Others: addressed later in slides (see RFC)

iSCSI Full Feature Phase
A connection is in full feature mode after a completed login
• iSCSI PDUs can be sent
• PDUs must flow over the same connection as the login
• Size of PDU is negotiated during login

Data Sequencing within iSCSI
• The iSCSI PDUs used for data input and output are opcode 0x05 (SCSI data-out) and opcode 0x25 (SCSI data-in), along with R2T (0x31, ready to transfer)
  DataSN is a number field and advances by 1 for each input (read) and output (write) data PDU
  Targets will operate in two modes, solicited (R2T) or unsolicited (non-R2T)
  A target operating in R2T mode can only receive solicited data from the initiator
  R2TSN advances by one for each received R2T during the data transfer
• The DataSN and R2TSN fields are for the initiator to detect missing data

Data-Out PDU
[Figure: Data-Out PDU layout with field annotations]
  Final bit indicates this is the last PDU of a sequence
  LUN number for data
  Data segment length based on capabilities exchange
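The text-format negotiation used during login can be sketched as follows. This assumes the RFC 3720 encoding, in which each key=value pair in the data segment is terminated by a null byte and list values are comma-separated in order of preference; the function names `encode_text`, `decode_text`, and `choose` are hypothetical, and the sketch ignores continuation across PDUs.

```python
def encode_text(pairs):
    """Encode (key, value) pairs as null-terminated key=value strings."""
    return b"".join(f"{k}={v}".encode("utf-8") + b"\x00" for k, v in pairs)


def decode_text(segment):
    """Decode a text data segment into {key: [values in preference order]}."""
    out = {}
    for item in segment.split(b"\x00"):
        if not item:
            continue  # skip the empty piece after the last terminator or padding
        key, _, value = item.decode("utf-8").partition("=")
        out[key] = value.split(",")
    return out


def choose(offered, supported):
    """Pick the first offered value the responder supports, else Reject."""
    for value in offered:
        if value in supported:
            return value
    return "Reject"


segment = encode_text([("SessionType", "Discovery"), ("AuthMethod", "CHAP,None")])
keys = decode_text(segment)
selected = choose(keys["AuthMethod"], {"None", "CHAP"})
```

With the offer `CHAP,None` and a responder that supports both, `selected` is `CHAP` because the offer order expresses the originator's preference.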
Data-In PDU
[Figure: Data-In PDU layout with field annotations]
  Final bit indicates this is the last read of a sequence
  Acknowledge bit is used when error recovery level is 1 or higher
  Flags, valid when the S bit is set, tell how to read the residual count
  Status bit indicates that there is meaningful data in the StatSN, Status, and Residual Count fields

iSCSI Read Command Example
1. Initiator sends iSCSI command PDU (CDB=Read)
2. Target sends iSCSI data-in PDU(s)
3. Target sends iSCSI response PDU
Notes:
• Solicited data via read command PDU (initiator requests data from the target)
• Target may satisfy the single read command with multiple iSCSI data-in PDUs (PDUs can be out of order)
• Command is not complete until all data and status are received by the initiator
• Good status can be sent within the last iSCSI data-in PDU
• All iSCSI data-in PDUs and the response PDU will be delivered on the same TCP connection that the command was sent on
• All data-in PDUs will carry the same value in the ITT field

SCSI Command PDU
[Figure: SCSI Command PDU layout with field annotations]
  Lets the target know if more data is to follow, along with the expected data transfer length
  Task attributes: see RFC for detailed meaning
  R=1 if the command is expected to input data
  W=1 if the command is expected to output data
  16 bytes of SCSI CDB
  Some SCSI commands have additional data, and this field is used for the accompanying data
  CRC if capabilities required this

SCSI Response PDU
[Figure: SCSI Response PDU layout with field annotations]
  SCSI status per SAM-2
  Response: 0x00 = command completed at target; 0x01 = target failure; 0x80-0xff = reserved for vendor response
  CRC checksums
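The read example, in which one command is satisfied by several data-in PDUs, can be sketched as a reassembly step on the initiator. The sketch assumes each data-in PDU carries a DataSN that starts at 0 for the command, a buffer offset, and a Final bit on the last PDU; the tuple layout and the function name `reassemble_read` are hypothetical simplifications.

```python
def reassemble_read(pdus):
    """Rebuild read data from (data_sn, buffer_offset, payload, final) tuples.

    Returns (data, missing_sns); data is None while PDUs are still missing.
    """
    final_sn = next((sn for sn, _, _, final in pdus if final), None)
    if final_sn is None:
        raise ValueError("final data-in PDU not received yet")
    by_sn = {sn: (offset, payload) for sn, offset, payload, _ in pdus}
    # Any gap in the DataSN series marks a PDU that should be re-requested
    missing = [sn for sn in range(final_sn + 1) if sn not in by_sn]
    if missing:
        return None, missing
    total = max(offset + len(payload) for offset, payload in by_sn.values())
    buf = bytearray(total)
    for offset, payload in by_sn.values():
        buf[offset:offset + len(payload)] = payload
    return bytes(buf), []


# Two PDUs that arrive in reverse order still land at the right offsets
complete = reassemble_read([(1, 4, b"EFGH", True), (0, 0, b"ABCD", False)])
# DataSN 1 never arrived, so it is reported as missing
gap = reassemble_read([(0, 0, b"AB", False), (2, 4, b"EF", True)])
```

An initiator running at an error recovery level above 0 could use the list of missing numbers to request retransmission rather than failing the command.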
SCSI Status and Response Fields for iSCSI Opcode 0x21
• The status field of the iSCSI PDU is used to report the status of the command back to the initiator
• The specific status codes are documented in the SCSI architectural model for the device
• The response field contains the iSCSI codes that are mapped to the SAM-2 response

Ready to Transfer PDU
• When the initiator has sent a SCSI write command to the target, the target can specify that the blocks be delivered in a convenient order; this information is passed to the initiator in the R2T PDU
• Allowing an initiator to write data to a target without an R2T is agreed upon during login
• The target may send several R2T PDUs and have several data transfers pending if allowed by the initiator

Task Management
• Functions that give the initiator a way to control management of the target device
  Abort the task
  Clear allegiance
  Logical reset
  Target reset
• Each of these and more are broken down in detail in the iSCSI RFC

SNACK, NOP-IN, NOP-OUT
• SNACK
  Optional
  Used to request retransmission of numbered responses, data, or R2T PDUs from the target
• NOP-IN
  Sent by a target as a response to a NOP-Out, or as a “ping” to an initiator
  Or a means to carry a changed ExpCmdSN and/or MaxCmdSN if there is no other PDU to carry them for a long time
• NOP-OUT
  Used by the initiator as a “ping command”, to verify that a connection/session is still active and all its components are operational
  Used to confirm a changed ExpStatSN if there is no other PDU to carry it for a long time
Message Synchronization and Steering
• Steering of out-of-order iSCSI TCP segments into pre-allocated buffers instead of temporary buffers
• To decrease reassembly time
• Not needing to rely on message length information
• Provides a synchronization method using fixed-interval markers telling where the start of the next iSCSI PDU is in the buffer
• Optional in the iSCSI RFC

List of Negotiated Parameters
Prior to going into full feature mode
  Header Digest
  Data Digest
  Max Connections
  Send Targets
  Target Name
  Initiator Name
  Target Alias
  Initiator Alias
  Target Address
  Target Portal Group Tag
  Initial Ready 2 Transfer
  Immediate Data
  Max Rec Data Segment Length
  Max Burst Length
  First Burst Length
  Default Time 2 Wait
  Default Time 2 Retain
  Max Outstanding R2T
  Data PDU In-order
  Data Sequence In-order
  Error Recovery Level
  Session Type

Standards: Where to Find Details
• http://www.ietf.org/html.charters/ips-charter.html
• T10 Technical committee: www.t10.org
  Technical committee of the National Committee on Information Technology Standards (NCITS); deals with the storage devices
• T11 Technical committee: www.t11.org
  Technical committee of the NCITS; deals with the physical interface and transport level

SIMPLE ISCSI CONNECTION FLOWS
EXAMPLE OF DISCOVERY SESSION WITH CHAP
iSCSI Flows: Discovery Session
Initiator (iSCSI driver, TCP port 1026, random) to target (TCP port 3260). The target device has already initialized onto the Fibre Channel.
• Establish initial TCP session
• Initiator -> Target: 0x03 Login command. Key values sent: InitiatorName, InitiatorAlias, SessionType=Discovery, AuthMethod=CHAP/none, HeaderDigest, DataDigest
• Target -> Initiator: 0x23 Login response. Status=Accept Login (0x0000). Key values sent: AuthMethod=CHAP, HeaderDigest=none, DataDigest=none
• Initiator -> Target: 0x03 Login command. Key values sent: InitiatorName, InitiatorAlias, SessionType=Discovery, CHAP_A=5 (CHAP with MD5)
• Target -> Initiator: 0x23 Login response. Status=Accept Login. Key values: CHAP_A, CHAP_I and CHAP_C

iSCSI Flows: Discovery Session (Cont.)
• Initiator -> Target: 0x03 Login command. Key values sent: InitiatorName, InitiatorAlias, SessionType=Discovery, CHAP_R, CHAP_N
• Target -> Initiator: 0x23 Login response. Final PDU in sequence, Status=Accept Login (0x0000)
• End of authentication phase; start of parameter negotiation phase for the discovery session
• Initiator -> Target: 0x03 Login command. Key values sent: InitiatorName, InitiatorAlias, SessionType=Discovery, negotiate session parameters
• Target -> Initiator: 0x23 Login response. Status=Accept Login, negotiate session parameters
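The CHAP exchange in the login flow above (CHAP_A=5, then CHAP_I and CHAP_C from the target, then CHAP_R and CHAP_N from the initiator) can be sketched as follows. This is an illustrative sketch only: it assumes the RFC 1994 definition of the MD5 response (hash of identifier, secret, challenge) and the null-terminated `key=value` text format of iSCSI login PDUs; the secret, challenge and names are made-up sample values, and both helper functions are hypothetical.

```python
import hashlib


def chap_response(chap_i, secret, challenge):
    """CHAP_R for CHAP_A=5 (MD5): MD5(identifier byte + secret + challenge)."""
    return hashlib.md5(bytes([chap_i]) + secret + challenge).digest()


def encode_login_keys(pairs):
    """Encode key=value pairs as a login data segment (NUL-terminated text)."""
    return b"".join(
        "{}={}".format(key, value).encode("ascii") + b"\x00"
        for key, value in pairs.items()
    )


if __name__ == "__main__":
    # Sample values only; a real target supplies CHAP_I and CHAP_C.
    chap_i = 1
    challenge = bytes.fromhex("00112233445566778899aabbccddeeff")
    secret = b"example-secret"

    response = chap_response(chap_i, secret, challenge)
    segment = encode_login_keys({
        "CHAP_N": "example-initiator",
        "CHAP_R": "0x" + response.hex(),
    })
    print(segment)
```

The secret itself never crosses the wire; only the 16-byte digest is returned in CHAP_R, which is why the target must already hold the same secret for the name given in CHAP_N.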
iSCSI Flows: SendTargets and Start of Target Session
Discovery session: initiator TCP port 1026 (random) to target TCP port 3260
• Initiator -> Target: 0x04 Text command. SendTargets=all
• Target -> Initiator: 0x24 Text response. Final PDU in sequence, KeyValue=TargetName (iqn number along with the target name configured on the iSCSI target)
Target session #1: initiator TCP port 1027 (random) to target TCP port 3260. Note the addition of another TCP session.
• Start of target session authentication and target session parameter negotiation
• Establish TCP connection for target
• Initiator -> Target: 0x03 Login command. Key values sent: InitiatorName, InitiatorAlias, SessionType=Normal, TargetName, AuthMethod=CHAP,none
• Target -> Initiator: 0x23 Login response. Status=Accept Login, AuthMethod=CHAP

iSCSI Flows: Target Session #1 (Cont.)
• Initiator -> Target: 0x03 Login command. Key values sent: InitiatorName, InitiatorAlias, SessionType=Normal, TargetName, CHAP_A=5
• Target -> Initiator: 0x23 Login response. Status=Accept Login. Key values: CHAP_A, CHAP_I and CHAP_C
• Initiator -> Target: 0x03 Login command. Key values sent: InitiatorName, InitiatorAlias, SessionType=Normal, CHAP_R, CHAP_N
• Target -> Initiator: 0x23 Login response. Status=Accept Login
• Initiator -> Target: 0x03 Login command. Key values sent: InitiatorName, InitiatorAlias, SessionType=Normal, TargetName, negotiate session parameters

iSCSI Flows: Target Session #1 (Cont.)
• Target -> Initiator: 0x23 Login response. Status=Accept Login, negotiate session parameters
• Initiator -> Target: 0x01 iSCSI command. SCSI Inquiry, CDB 0x12
• Target -> Initiator: 0x25 iSCSI Data-In (read)

FCIP CONCEPTS
Agenda
• What FCIP Is About
• The Standards
    Fibre Channel T11 Standards
    IETF IPS Working Group Drafts
• Understanding FCIP Protocol
• Relationships to Other SCSI Transport Technologies

FCIP: Fibre Channel over IP
• FCIP provides a standard way of encapsulating FC frames within TCP/IP, allowing islands of FC SANs to be interconnected over an IP-based network
• TCP/IP is used as the underlying transport to provide congestion control and in-order delivery of error-free data
• FC frames are treated the same as datagrams
• It is not iFCP, mFCP, IPFC, iSCSI transports or extended FC fabric

FCIP Design
[Figure: two Fibre Channel SANs, each with FC servers, FC tape libraries, FC JBODs and FC switches on an FSPF routing backbone, joined by FCIP tunnels (tunnel sessions) across an IP network. IP services are available at the aggregated FC SAN level.]

Four (4) Specifications Define Basic FCIP
• ANSI: http://www.t11.org/index.htm
    FC-SW-2 describes the operation and interaction of Fibre Channel switches, including E_Port, B_Port and fabric operation
    FC-BB-2 is a mapping that pertains to the extension of Fibre Channel switched networks across a TCP/IP network backbone and defines reference models that support E_Port and B_Port
• IETF IPS working group:
    Fibre Channel over TCP/IP covers the TCP/IP requirements for transporting Fibre Channel frames over an IP network
    FC frame encapsulation defines the common Fibre Channel encapsulation format
ANSI: FC-SW-2 Standard
• E_Ports are used at both ends of an Inter Switch Link (ISL)
• E_Ports forward user traffic (storage data) and control information (class F SW_ILS frames containing FSPF, zone exchanges, etc.)
• FC-SW-2 defines fabric merge procedures (Domain_ID assignment, zone transfers, etc.)
• FC-SW-2 also defines FSPF

ANSI: FC-SW-2 Essentials (Recap)
• E_Ports provide switch-to-switch connectivity
• E_Ports negotiate parameters using:
    ELP: Exchange Link Parameters
    ESC: Exchange Switch Capabilities
• FSPF is enabled over E_Ports only
• Separate fabrics can be merged over E_Ports
• Zoning information is exchanged over E_Ports

IETF FCIP: Fibre Channel Over IP
• Each interconnection is called an FCIP link and can contain one (1) or more TCP connection(s)
• Each end of an FCIP link is associated with a virtual ISL link (VE_Port or B_Access portal)
• VE_Ports communicate between themselves just like normally interconnected E_Ports by using SW_ILS: ELP, EFP, ESC, LKA, BF, RCF, FSPF, etc.
• B_Access portals communicate between themselves by using SW_ILS: EBP, LKA
• The result (when all goes well) is a fully merged Fibre Channel fabric between the FC switch SANs

IETF FCIP
• IETF draft standard that allows IP connectivity to link Fibre Channel storage area networks across WANs
    Two methods can be used:
    1) Similar to Cisco STUN: nailed-up tunnel
    2) Similar to DLSW: dynamic peering method
    We will visit the details of each in later slides
• draft-ietf-ips-fcovertcpip
    Draft 12 is current; expected to become an RFC in Jan/Feb 2003
FCIP Architecture Model
[Figure: two SANs joined by an FCIP link across a TCP/IP network. Each end stacks FCIP over FC-2, FC-1 and FC-0 on the SAN side and over TCP, IP, LINK and PHY on the network side.]
Key:
    FC-0: Fibre Channel physical media layer
    FC-1: Fibre Channel encode and decode layer
    FC-2: Fibre Channel framing and flow control layer
    TCP: Transmission Control Protocol
    IP: Internet Protocol
    LINK: IP link layer
    PHY: IP physical layer

FCIP
• End-station addressing, address resolution, message routing, and other fundamental elements of the network architecture remain unchanged from the Fibre Channel model, with IP introduced exclusively as a transport protocol for an inter-network bridging function
• IP is unaware of the Fibre Channel payload and the Fibre Channel fabric is unaware of IP
Frame layout: Ethernet header | IP | TCP | FCIP | FCP | SCSI data … | CRC checksum

FCIP
• FCIP only supports class 2, class 3, class 4, and class F frames
• No FC primitive signals or primitive sequences are supported
    These are the physical signal sets used by FC ports to indicate events, e.g. NOS, OLS, LR
• IP transport is transparent to Fibre Channel topology
Understanding FCIP Terms
• FC end node: A Fibre Channel device that uses the connection services provided by the FC fabric
• FC entity: The Fibre Channel specific functional component that combines with an FCIP entity to form an interface between an FC fabric and an IP network
• FC fabric: An entity that interconnects various Nx_Ports attached to it, and is capable of routing FC frames using only the destination ID information in an FC frame header
• FC fabric entity: A Fibre Channel specific element containing one or more Interconnect_Ports (see FC-SW-2) and one or more FC/FCIP entity pairs
• FC frame: The basic unit of Fibre Channel data transfer
• FC frame receiver portal: The access point through which an FC frame and time stamp enters an FCIP data engine from the FC entity
• FC frame transmitter portal: The access point through which a reconstituted FC frame and time stamp leaves an FCIP data engine to the FC entity
• FC/FCIP entity pair: The combination of one FC entity and one FCIP entity

Understanding FCIP Terms (Cont.)
• FCIP data engine (FCIP_DE): The component of an FCIP entity that handles FC frame encapsulation, de-encapsulation, and transmission of FCIP frames through a single TCP connection
• FCIP entity: The entity responsible for the FCIP protocol exchanges on the IP network, which encompasses the FCIP_LEP(s) and the FCIP control and services module
• FCIP frame: An FC frame plus the FC frame encapsulation header, encoded SOF and encoded EOF that contains the FC frame
• FCIP link: One or more TCP connections that connect one FCIP_LEP to another
• FCIP link endpoint (FCIP_LEP): The component of an FCIP entity that handles a single FCIP link and contains one or more FCIP_DEs
• Encapsulated frame receiver portal: The TCP access point through which an FCIP frame is received from the IP network by an FCIP data engine
• Encapsulated frame transmitter portal: The TCP access point through which an FCIP frame is transmitted to the IP network by an FCIP data engine
• FCIP special frame (FSF): A specially formatted FC frame containing information used by the FCIP protocol

FCIP Diagram
[Figure: an FC/FCIP entity pair. The FC entity presents a VE_Port at each end of a virtual ISL; the FCIP entity contains an FCIP link endpoint (FCIP_LEP) with FCIP data engines (DE), each with FC frame receiver/transmitter portals and FCIP frame TX/RX portals. A dynamic connection port is shown for FCIP connections; non-dynamic connections use TCP well-known port (WKP) 3225, with example addresses 172.16.0.5 and 192.168.1.10 on Ethernet Gigabit/WAN interfaces. The FCIP link runs over the physical link as one or more TCP connections carrying FC frames in TCP/IP.]
• More than one TCP connection is allowed
• Class 3 and class F can be on separate ports or connections

ANSI Meets IETF: E-Port
• FC-BB-2
• FCIP
ANSI Meets IETF: B-Port
• FC-BB-2
• FCIP

FCIP Standards Stack Details
• This will be the ISL connection: either a bridged connection or an E_Port, depending on the FCIP implementation selected by the vendor

Additional IETF Drafts
• SLP: Service Location Protocol
    draft-ietf-ips-fcip-slp
    Used for dynamic discovery of FCIP ports
• IPSec for storage
    draft-ietf-ips-security
    More details later on this requirement for FCIP
• MIBs
    draft-ietf-ips-scsi-mib
    draft-ietf-ips-fcmgmt-mib
    draft-ietf-ips-fcip-mib
• FC-BB
    Published ANSI project being superseded by BB-2

ANSI: FC-BB-2 Essentials (FCIP E-Port)
• Defines a slightly complex model
• FC-BB-2 covers the FC portion of this model (FC entity and some of above)
• Cisco's FCIP E_Port implementations will closely follow this model

IETF: FCIP Essentials (FCIP E-Port)
• FCIP follows the model proposed in FC-BB-2
• FCIP covers the lower portion of this model (FCIP entity and below)
• Cisco's FCIP E_Port implementation will follow this model

ANSI: FCIP Essentials (FCIP B-Port)
• Again, the FC side of this model follows the FC-BB-2 standard
• With B_Port there is no FC switching element, so the B_Port device will not be seen as a switch in the fabric but as a passive device
IETF: FCIP Essentials (FCIP B-Port)
• The FCIP part of the B-Port operation is the same as FCIP for the E_Port
• Note in this diagram that implementations of this standard can have any number of ports from 1 to n

About FCIP Links
• The FCIP interface represents both the VE_Port and the FCIP link
• An FCIP link is defined as one or more TCP connections
• The FCIP link endpoint (LEP) terminates FCIP links
• FCIP data engine: one per TCP connection
[Figure: Entity 1 with a VE_Port and an FCIP_LEP containing two data engines (DE), TCP ports on WKP = 3225, IP address 192.168.1.10, a TCP/IP network interface, and an FCIP link carrying class F and class 3 traffic.]

About the FC Entity
• FC entity interfaces (internally) with the FCIP entity
• FC entity components:
    Control and services module
    Provides FC frame and timestamp along with synchronization with the FCIP entity
    Correct order delivery of FC frames
    Works with the FCIP entity for flow control
    Computes end-to-end transit time
    Throws away expired frames
    Answers authentication requests for TCP connections

About the FCIP Entity
• FCIP entity interfaces (internally) with the FC entity
• FCIP entity components:
    Provides FC frame and timestamp to the FC entity
    Tells the FC entity about discarded bytes
    Tells the FC entity about new and lost TCP connections and reason codes
    Monitors special frame changes
    Makes requests to the FC entity for authentication

FCIP Link Endpoint: Details
• FCIP_LEP is the translation point between an FC entity and an IP network
• LEP coordinates between FC and TCP flow control mechanisms
Error Detection and Recovery
• The data engine uses various methods to detect errors but does not correct them
• Rather, it inserts EOFa (abort) frame delimiters when possible
• Requests are sent up to the FC entity to handle recovery

IETF: Fibre Channel Frame Encapsulation Header
• Defines the encapsulation header for Fibre Channel frames
• Not specific to FCIP
• Includes timestamp, CRC and provision for special frames

Initialization of Port
B_Port
• Link initialization
• Exchange link parameters
• Link reset
E_Port
• Link initialization
• Exchange link parameters
• Exchange switch capabilities
• Reset link
• Exchange fabric parameters
• Assign domain IDs
• Establish routes
• Merge zones if required

Link Initialization Flow
[Figure: ladder diagram between an E_Port on a switch and a B_Port or E_Port on an FCIP device. The sequence shown is NOS, OLS, LR, LRR, Idle, Idle, with the ports moving through the LF, OL, LR and AC states. These are all special ordered sets of 8B/10B coding.]
Primitive sequences:
    NOS = Not Operational Sequence
    OLS = Offline Sequence
    LR = Link Reset
    LRR = Link Reset Response
Port states:
    LF = Link Failure state
    OL = Offline state
    LR = Link Recovery state
    AC = Active state

Link Capture
[Figure: captured trace between an E_Port on a switch and a B_Port or E_Port on an FCIP device showing NOS, LR, LRR, IDLE and R_RDY primitives.]
• LR and LRR initialize flow control parameters per FC-PH
• At this point the B_Port device is up and the E_Port to E_Port exchange continues
ISL E_Port
• If it is an E_Port FCIP device, or if the B_Port is now up, the switch-to-switch exchange continues

ELP Data
• Bit 15 of the flag will be 1 for a B_Port
• R_A_TOV is a fabric-wide timer; E_D_TOV is per link
• PWWN and WWN, plus vendor ID
• Credit value is one to start, to allow only one outstanding frame during link start-up
• Class 2 and 3 supported

E_Port and B_Port Summary
[Figure: B_Port operation: FC switches attach to FCIP B-Port devices (VB_Port) across an IP network; each switch exchanges link parameters with its local B-Port, the B-Ports exchange FCIP link parameters with each other, and fabric parameters and ESC are exchanged switch to switch. E_Port operation: FCIP E-Port devices (VE_Port, e.g. 7200 w/ PA-FC-1G) exchange link parameters and fabric parameters with their FC switches and FCIP link parameters with each other, followed by ESC (Exchange Switch Capabilities) if required.]

FCIP: ISL Connection
• The E-Port or B-Port FCIP connection will provide:
    Simple name service across the IP tunnel
    FC discovery between SAN islands
    FSPF routing services between fabric switches
    Management server information
    Buffer credits

Comparisons
B-Port and E-Port Differences
FCIP Connection Establishment
• Non-dynamic TCP connection to a specific IP address
• Dynamic discovery of FCIP entities using SLPv2
• Use of FCIP special frame
• Use of options

Non-Dynamic TCP Connections
• The FCIP entity is informed that a TCP connection is needed (most likely through configuration parameters in the device)
• IP address and security features are established (configured)
• Destination WWN is determined (configured)
• TCP/IP parameters are set (configured)
• Quality of service is determined (configured)
• Connection request is made to port 3225 or the configured port

Dynamic TCP Connections: SLPv2
• IP security for SLP is determined
• Enter the FCIP discovery domain process
• Advertise availability to the SLP discovery domain as a service agent
• Locate FCIP entities in the discovery domain as an SLP user agent
• For each discovered entity, follow the same process as the non-dynamic method to establish the connection

FCIP Special Frame: TCP Connection Is Established, Sending Side
• The FSF is the first frame sent after the TCP connection is established
• The sending side waits for the FSF echo (90 seconds)
• The echo is a match or a non-match (a non-match terminates the TCP connection)
• Creation of FCIP_LEP and FCIP_DE
• Inform the FC entity of the connection and usage flags
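The sending-side steps above can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not the draft's state machine: the echo is compared byte for byte (the change-bit adjustments a peer may make to an echoed FSF are ignored), a timeout is assumed to close the connection, and the function name and return strings are hypothetical.

```python
FSF_ECHO_TIMEOUT_S = 90  # wait time for the FSF echo, as stated on the slide


def fsf_sender_outcome(sent_fsf, echoed_fsf, waited_s):
    """Decide what the FSF sender does next.

    sent_fsf   -- bytes of the special frame that was transmitted
    echoed_fsf -- bytes received back, or None if nothing has arrived yet
    waited_s   -- seconds elapsed since the FSF was sent
    """
    if echoed_fsf is None:
        # Assumption: no echo within the timeout closes the TCP connection.
        return "close_tcp" if waited_s >= FSF_ECHO_TIMEOUT_S else "keep_waiting"
    if echoed_fsf != sent_fsf:
        # Non-match terminates the TCP connection.
        return "close_tcp"
    # Match: create FCIP_LEP/FCIP_DE and inform the FC entity.
    return "create_lep_and_de"


if __name__ == "__main__":
    frame = bytes(72)  # placeholder contents, not a real FSF
    print(fsf_sender_outcome(frame, None, 10))
    print(fsf_sender_outcome(frame, frame, 2))
    print(fsf_sender_outcome(frame, b"\x01" + frame[1:], 2))
```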
FCIP Special Frame: TCP Connection Is Established, Receiving Side (Listening)
• Listens for connections on well-known port 3225 or the configured port
• Checks database to allow the connection
• Checks security features
• Waits for the FSF (90 seconds)
• Inspects the FSF contents and sends the echo frame
    Connection nonce
    Destination FC fabric entity world wide name
    Connection usage flags
    Connection usage code

FCIP Special Frame Details
• Used to exchange WWNs, entity pair identifiers and TCP connection identifiers, and to accept or reject the connection
• Identifies what kind of traffic (SOFi3, SOFn3, EOF) is intended; not enforced
• In conjunction with the connection usage flags, the connection usage code helps the FCIP entity apply proper QoS parameters for the connection
• Adjustments to the FSF can be made with the change bits when the frame is echoed back
• If two entities try to send FSF connection frames simultaneously, the first to receive an echo wins

FCIP: Tunnel Setup as Proposed in FCIP Draft
• The first frame transmitted in each direction is a special frame used to identify the peer FCIP entities and to synchronize
[Figure: two FCIP devices, each attached to Fibre Channel, connected across an IP WAN.]
    1) Special frame sent: "I am WWN1, this is my FC/FCIP identifier. Are you fabric WWN2?"
    2) Special frame echoed: "OK WWN1, I am WWN2. Let's set up the connection."
    3) FCIP tunnel setup complete
Special frame layout (32-bit words):
    Word 0: proto 0x01 | version 0x01 | ~proto 0xFE | ~version 0xFE
    Word 1: proto 0x01 | version 0x01 | ~proto 0xFE | ~version 0xFE
    Word 2: pFlags | 0x00 | ~pFlags | ~0x00
    Word 3: Flags 0x00 | Frame Len 0x12 | ~Flags 0x3F | ~Frame Len 0x3ED
    Words 4/5: Timestamp integer/fraction
    Word 6: CRC (reserved in FCIP) 0x00-00-00-00
    Word 7: Reserved 0x0000 | Reserved 0xFFFF
    Words 8/9: Source FC fabric entity WWN (identifies the fabric)
    Words 10/11: Source FC/FCIP entity identifier
    Words 12/13: Connection nonce (random number)
    Word 14: Connection usage flags | 0x00 | Connection usage code
    Words 15/16: Destination FC fabric entity WWN
    Word 17: Reserved 0x0000 | Reserved 0xFFFF

pFlag Breakdown
[Figure: bit layout of the pFlags field.]

FCIP Header Format
• Ones' complement fields are used for synchronization and error checking
• This FCIP header is used after the FSF exchange is completed

Connection Options
• TCP selective acknowledgement (SACK), per RFC 2883
• TCP window scale option
• Protection against wrapped sequence numbers (PAWS)
• TCP keepalives (KAD)
• Flow control mapping between TCP and Fibre Channel
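The ones' complement redundancy described on the FCIP Header Format slide can be illustrated with the first header word, which carries the protocol number and version followed by their complements (0x01, 0x01, 0xFE, 0xFE in the tunnel setup layout). The sketch below is a minimal illustration; the function names are hypothetical and only the first word is modelled.

```python
import struct


def build_word0(proto=0x01, version=0x01):
    """Pack proto, version and their ones' complements into one 32-bit word."""
    return struct.pack("!BBBB", proto, version, ~proto & 0xFF, ~version & 0xFF)


def word0_is_consistent(word):
    """Check that each value and its complement XOR to 0xFF."""
    proto, version, not_proto, not_version = struct.unpack("!BBBB", word)
    return (proto ^ not_proto) == 0xFF and (version ^ not_version) == 0xFF


if __name__ == "__main__":
    word = build_word0()
    print(word.hex(), word0_is_consistent(word))
    # A single corrupted byte breaks the complement relationship.
    print(word0_is_consistent(b"\x01\x01\xfe\xff"))
```

A receiver scanning a TCP byte stream can use the same check to confirm it is aligned on a frame boundary, which is the synchronization role the slide refers to.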
FCIP Security Requirements (Per Draft)
To support IP network security, FCIP entities MUST:
• Implement cryptographically protected authentication and cryptographic data integrity keyed to the authentication process, and implement data confidentiality security features
• FCIP utilizes the IPsec protocol suite to provide data confidentiality and authentication services, and IKE as the key management protocol
• FCIP security compliant implementations MUST implement ESP and the IPsec protocol suite based cryptographic authentication and data integrity [11], as well as confidentiality using algorithms and transforms as described in this section

FCIP Security Requirements (Per Draft) (Cont.)
• FCIP implementations MUST meet the secure key management requirements of the IPsec protocol suite
• FCIP entities MUST implement replay protection against ESP sequence number wrap
• FCIP entities MUST use the results of IKE phase 1 negotiation for initiating an IKE phase 2 "quick mode" exchange and establish new SAs
Note: An external device may be used in conjunction with the FCIP implementation to meet the "MUST implement ESP" requirement
Important FC and FCIP Timers
• Resource Allocation Timeout Value (R_A_TOV)
    Timeout value that determines how long an FC frame can be in transit on the Fibre Channel network
    This is a fabric-wide value with a default usually at 120 sec on switch networks
• Error Detect Timeout Value (E_D_TOV)
    A value that times events and responses at the link level; errors at the link level will cause delays of these events
    This value defaults to 10 sec and should be lower than R_A_TOV; again, this is a fabric-wide setting
• Keep Alive Timeout Value (K_A_TOV)
    A value that is applied to the TCP connection and is used when no data is present

Time Stamps and Synchronization
• Clock synchronization is required if timestamps are used
    Synchronized to FC services
    Synchronized to IP NTP
• Transit time through the IP network is measured via a timestamp integer
• If no timestamp value is available, zero will be used
• Fibre Channel time values still apply across the ISL link and are timed out via lack of R_RDY coming back
• End system devices such as HBA-attached hosts still require normal responses to timers end-to-end (no spoofing)

Timestamps
• Timestamps are the responsibility of the FC entity
• This allows transit through the FCIP entity to be included in the measurement
• This transit time should be well below R_A_TOV
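The timestamp handling above (compute transit time, discard expired frames, treat zero as "no timestamp") can be sketched as follows. This is an illustrative sketch with assumptions: the timestamp is taken to be an integer-seconds word plus a fraction word in units of 2^-32 seconds, both clocks are assumed synchronized, and the limit `max_transit_s` is a hypothetical local setting chosen well below R_A_TOV.

```python
def timestamp_to_seconds(ts_int, ts_frac):
    """Combine the integer and fraction words into seconds.

    Assumption: the fraction word counts units of 2**-32 seconds.
    """
    return ts_int + ts_frac / 2**32


def frame_expired(ts_int, ts_frac, now_s, max_transit_s):
    """Return True if the frame spent longer than max_transit_s in the IP network.

    A zero timestamp means the sender had no timestamp available, so
    transit time cannot be measured and the frame is not treated as expired.
    """
    if ts_int == 0 and ts_frac == 0:
        return False
    return (now_s - timestamp_to_seconds(ts_int, ts_frac)) > max_transit_s


if __name__ == "__main__":
    sent_int, sent_frac = 1000, 2**31          # 1000.5 s on the sender's clock
    print(frame_expired(sent_int, sent_frac, now_s=1000.75, max_transit_s=1.0))
    print(frame_expired(sent_int, sent_frac, now_s=1003.0, max_transit_s=1.0))
```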
Buffer Credits
• Fibre Channel buffer credit methods do not change
• R_RDYs will be used to control flow coming from the FC switch on a per-link basis
• Buffer credit establishment is determined at FLOGI
• Mechanisms to control the flow of R_RDYs to the FC switch based on TCP/IP congestion are specific to each FCIP solution
• FC switches do not require extended credit methods

Error Recovery
• Errors on the FC side of the local B_Port are not forwarded over the IP network; issues such as loss of sync or an FC encapsulation error will not be sent to the FC entity
• Errors on the IP side are handled by TCP, and the frame is dropped if the checksum is in error

Summary
• FCIP is the standards approach to connecting Fibre Channel ISLs over TCP/IP LAN/WAN connections
• The draft wording will most likely stay as it is today
• Security, network delay and error recovery will be the biggest concerns
• No shipping product today conforms to the proposed FCIP draft
• Cisco will have several platforms supporting FCIP solutions

INTERNET FIBRE CHANNEL PROTOCOL
iFCP Protocol Model
[Figure: two gateway regions joined by an IP network. Each iFCP gateway stacks FC-4 over iFCP, with FC-1 and FC-0 on the gateway region side and TCP, IP, LINK and PHY on the IP network side.]
• iFCP replaces the transport layer of Fibre Channel (FC-2) with an IP network but keeps FC-4, mapping the existing Fibre Channel transport services onto TCP/IP
• iFCP processes the following differently: FC-4 frame images (applications), FC-2 frame images (link service requests), FC broadcast and iFCP control frames
• Topology within the gateway regions is opaque to the IP network and to other gateway regions (they appear just like a collection of N_Ports)

iFCP Network Model: iSNS Role
[Figure: four gateway regions, each behind an iFCP gateway on an IP network. The gateways send iSNS queries to an iSNS server and carry N_Port-to-N_Port sessions between regions.]
• An iFCP gateway cannot operate without access to an iSNS server
• Client-server architecture
• iSNS functions:
    Device discovery and fabric management
    Emulation of the services provided by the FC name server and RSCN
    Definition and management of discovery domains
    Definition and management of "logical fabrics"

iFCP Protocol Description: N_Port Address Allocation
• Two different schemes:
    Address transparent mode (optional): the N_Port FC_IDs are unique across the whole logical fabric
    Address translation mode (mandatory): the N_Port FC_IDs are unique only inside the gateway region the N_Port belongs to
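Address translation mode behaves much like NAT: a gateway hands local N_Ports an alias FC_ID for each remote port and rewrites the destination ID when a frame leaves the region. The sketch below is a hypothetical illustration of such a lookup table; the class name, method names, FC_ID values and gateway address are all invented for the example and are not taken from the iFCP draft.

```python
class TranslationTable:
    """Maps locally assigned alias FC_IDs to (remote gateway IP, real FC_ID)."""

    def __init__(self):
        self._by_alias = {}

    def learn(self, alias_fc_id, remote_gw_ip, real_fc_id):
        """Record an entry, e.g. after the name-server lookup completes."""
        self._by_alias[alias_fc_id] = (remote_gw_ip, real_fc_id)

    def rewrite_d_id(self, d_id):
        """Return (remote gateway IP, real D_ID) for an outbound frame.

        Raises KeyError if the alias is unknown to this gateway.
        """
        return self._by_alias[d_id]


if __name__ == "__main__":
    table = TranslationTable()
    # Local alias 0x0A0B0C stands for remote port 0x010203 behind 192.0.2.10.
    table.learn(0x0A0B0C, "192.0.2.10", 0x010203)
    gw_ip, real_d_id = table.rewrite_d_id(0x0A0B0C)
    print(gw_ip, hex(real_d_id))
```

Because the aliases are only meaningful inside one gateway region, any event that invalidates the table forces the affected sessions to be re-established.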
Address Transparent Mode
• All the gateways belonging to the same "logical fabric" cooperate to assign addresses that are unique across the gateway regions that form the logical fabric
• No need for address translation
• Not scalable (max 239 gateways)

Address Translation Mode
• iFCP gateways use aliases to map the local representation of addresses of external gateway regions to the real addresses outside the gateway region (comparable to IP NAT)
    Requires a rewrite of the FC_IDs in the FC frame header and in the FC payload for some ELS (e.g. ADISC)

iFCP Protocol Description: Address Translation Mechanism
[Figure: a local N_Port (FC_ID = x.x.x) behind one iFCP gateway, a remote N_Port (FC_ID = y.y.y) behind another, with TCP/IP and an iSNS server between the gateways.]
    1) The N_Port issues an NS query (FC NS request)
    2) iSNS query/reply: "Give me the remote gateway IP address, N_Port ID, N_Port WWN"
    3) The requesting gateway fills in its address translation table: remote GW IP, dest N_Port WWN, dest N_Port ID (y.y.y), local N_Port alias (z.z.z)
    4) The gateway sends the NS reply back to the N_Port (for FC_ID z.z.z)

iFCP Protocol Description: Address Translation Mechanism (Cont.)
    1) The N_Port issues a PLOGI to D_ID z.z.z (S_ID x.x.x)
    2) The gateway makes a table lookup and gets the remote GW IP address (to set up the iFCP session) and the actual dest N_Port ID (to rewrite the D_ID); the PLOGI crosses TCP/IP with D_ID y.y.y, S_ID x.x.x
    3) The receiving gateway fills in its own translation table: remote GW IP, dest N_Port WWN, dest N_Port ID (x.x.x), local N_Port alias (w.w.w)
    4) The receiving gateway rewrites the S_ID of the incoming request: PLOGI D_ID y.y.y, S_ID w.w.w
• In case of fabric reconfiguration, all the address translation tables need to be recalculated, with a consequent loss of every active login session

ISNS AND SLP DISCOVERY PROTOCOLS FOR THE IP-SAN

Discovery Approach
Deploy and interoperate in three stages:
1. Naming and static configuration
    Configure both targets and initiators
    Use SendTargets to reduce initiator config
2. SLPv2 for multicast and simple discovery
    Configure targets
3. iSNS for centralized management
    Configure central iSNS server

Service Location Protocol (SLP)
• Based on Service Location Protocol v2 (RFC 2608)
• Allows hosts to search for instances of a network service they are interested in
    Example: printers

Basic SLP Discovery Requirements
• Find targets by the initiator's worldwide unique identifier
    "Tell me which targets you have that I should see"
• Find targets by the target's worldwide unique identifier
    "Where is target iscsi.com.acme.foo?"
• Propagate attributes needed before connecting
    Boot information, authentication information
• Scaling requirements
    Zero-configuration, no servers in small environments
    Reduce or eliminate multicast in medium environments
    Interoperate with LDAP/iSNS in large environments
Service Location Protocol (SLP)
Three components, two of which run in our storage router:
• SA (Service Agent): services register with the SA
• UA (User Agent): queries an SA or DA for registered services
• DA (Directory Agent): proxies for a set of SAs
[Figure: services register with SAs; SAs register with the DA; the UA exchanges query/response messages with an SA or with the DA]

Service Location Protocol for IP Storage
• Service Agent (SA)
  Advertises services
  Services have attributes
• User Agent (UA)
  Finds services
  Zero configuration
• Directory Agent (DA)
  Optional
  Propagates service adverts
• SLP protocol
  UDP or TCP
  Minimizes multicast
[Figure: host running management code, an SLP UA, and an iSCSI initiator over TCP/IP; device running management code, an SLP SA, and an iSCSI target over TCP/IP; an SLP DA on the IP network between them]

Implementing SLP for iSCSI
• Targets implement a service agent
  Answer multicast requests or register with a DA
• Initiators implement a user agent
  Use multicast or a DA to locate targets
• Devices containing targets register:
  The canonical target or individual targets
  Attributes of targets
• Register each target at each of its addresses

SLP Summary
• Serverless discovery of targets
  Optional, generic DA to scale services
• Zero-configuration of hosts
  SLP makes careful use of multicast
• Access list and attribute propagation
• Optional message authentication
• Open source implementations available
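The SA/UA/DA roles described in the SLP slides can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not an SLP implementation: there is no multicast or wire protocol, the `DirectoryAgent` class is hypothetical, and the service URL layout, addresses, and port number are made up for the example. Only the target name `iscsi.com.acme.foo` comes from the slides.

```python
class DirectoryAgent:
    """Toy stand-in for an SLP DA: caches adverts sent by service agents."""

    def __init__(self):
        self._adverts = {}  # service URL -> (service type, attributes)

    def register(self, url, service_type, attrs):
        """Called by a service agent to advertise one service instance."""
        self._adverts[url] = (service_type, dict(attrs))

    def find(self, service_type, **wanted):
        """Called by a user agent: URLs of this type whose attributes match."""
        hits = []
        for url, (stype, attrs) in self._adverts.items():
            if stype == service_type and all(attrs.get(k) == v for k, v in wanted.items()):
                hits.append(url)
        return sorted(hits)


da = DirectoryAgent()

# The target's service agent registers the target once per address it listens on.
for addr in ("10.0.0.5:3260", "10.0.1.5:3260"):
    da.register(
        f"service:iscsi:target://{addr}/iscsi.com.acme.foo",  # hypothetical URL layout
        "service:iscsi:target",
        {"name": "iscsi.com.acme.foo"},
    )

# The initiator's user agent asks "Where is target iscsi.com.acme.foo?"
urls = da.find("service:iscsi:target", name="iscsi.com.acme.foo")
```

In a network without a DA, the same lookup would be answered directly by the service agents in response to a multicast request; the DA exists to reduce that multicast traffic as the network grows.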
What Is iSNS
iSNS facilitates scalable configuration and management of iSCSI, iFCP, and Fibre Channel (FCP) storage devices in an IP network by providing a set of services comparable to those available in Fibre Channel networks
http://www.ietf.org/internet-drafts/draft-ietf-ips-isns-22.txt

iSNS Functions
There are four main functions of the iSNS:
1. A name server providing storage resource discovery
2. Discovery Domain (DD) and login control service
3. State change notification service
4. Open mapping of Fibre Channel and iSCSI devices

Basic: How iSNS Works
[Figure: iSCSI and iSNS clients on an IP network, an iSNS server, and a Fibre Channel SAN]
1. iSCSI clients register with the iSNS server; this is done by adding the iSNS server IP address to the iSCSI application driver
2. iSCSI targets register with the iSNS server
3. iSNS clients query the iSNS server for storage location and name
4. The iSCSI client then selects and logs in to an iSCSI target using information from the iSNS server

Internet Storage Name Service (iSNS)
• iSNS server functions:
  Allows an iSNS client to register with, deregister from, and query the iSNS server
  Provides centralized management for enforcing access control of targets from specific initiators
  Provides a state-change notification mechanism that informs registered iSNS clients of changes in the status of other iSNS clients
• Similar to the functionality provided by the FC name server, zone server, and the RSCN mechanism
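The four-step flow from "Basic: How iSNS Works" can be walked through with a toy registry. This is only a sketch of the register/deregister/query idea: `ToyIsnsServer`, the node names, and the addresses are hypothetical, and nothing here speaks the real iSNSP wire protocol.

```python
class ToyIsnsServer:
    """In-memory stand-in for an iSNS name server (illustration only)."""

    def __init__(self):
        self.nodes = {}  # iSCSI name -> {"type": ..., "portal": ...}

    def register(self, name, node_type, portal):
        self.nodes[name] = {"type": node_type, "portal": portal}

    def deregister(self, name):
        self.nodes.pop(name, None)

    def query(self, node_type):
        """Return {name: portal} for every registered node of the given type."""
        return {n: v["portal"] for n, v in self.nodes.items() if v["type"] == node_type}


server = ToyIsnsServer()

# Step 1: the iSCSI client (initiator) registers with the iSNS server.
server.register("iqn.2004-06.com.example:host1", "initiator", "10.0.0.21")
# Step 2: the iSCSI target registers with the iSNS server.
server.register("iqn.2004-06.com.example:array1", "target", "10.0.0.50:3260")
# Step 3: the client queries for storage location and name.
targets = server.query("target")
# Step 4: the client selects a target and would log in to the returned portal.
chosen_name, chosen_portal = next(iter(targets.items()))
```

Access control and state change notifications, the other two server functions listed above, are deliberately left out of this sketch.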
iSNS Components
• iSNS protocol (iSNSP)
  A flexible and lightweight protocol that specifies how iSNS clients and servers communicate
• Discovery Domain (DD)
  A grouping of storage devices, much like a zone in FCP; discovery domains help control and manage the logins and services available to the clients in the domain; based on the FC-GS standard for Fibre Channel; concepts such as a default domain are used
• Discovery Domain Set (DDS)
  A group of one or more discovery domains; a method to store sets of domains within the iSNS database; multiple DDSs can be active at one time, unlike zonesets in FCP, where only one can be active at a time

iSNS Components
• iSNS client
  The iSNS client is located within the storage system and talks to the iSNS server using iSNSP within its configured discovery domain; a client can belong to one or more DDs; the iSNS client registers its attributes with the iSNS server and receives notices of changes within the domain
• iSNS database
  The iSNS database is the information repository for the iSNS server; it maintains information about iSNS client attributes; a directory-enabled implementation of iSNS may store client attributes in an LDAP directory infrastructure
• iSNS server
  iSNS servers respond to iSNS protocol queries and requests, and initiate iSNS protocol state change notifications; properly authenticated information submitted by a registration request is stored in an iSNS database; the server listens on port 3205
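The relationship between discovery domains and discovery domain sets can be sketched as plain set operations. The domain names, set names, and members below are invented for illustration, and the visibility rule (two nodes see each other if they share a DD that belongs to an active DDS) is a simplified reading of the slide, not a statement of the full iSNS semantics.

```python
# Discovery domain set -> member discovery domains (hypothetical names).
dds = {
    "production": {"dd-oracle", "dd-mail"},
    "backup": {"dd-tape"},
}

# Discovery domain -> member nodes; a node may belong to more than one DD.
dd_members = {
    "dd-oracle": {"host1", "array1"},
    "dd-mail": {"host2", "array1", "array2"},
    "dd-tape": {"host1", "tape1"},
}


def visible_to(node, active_sets):
    """Nodes that share at least one DD with `node` in any active DDS."""
    active_domains = set().union(*(dds[s] for s in active_sets))
    peers = set()
    for dd in active_domains:
        if node in dd_members[dd]:
            peers |= dd_members[dd]
    return peers - {node}


# Unlike FC zonesets, several discovery domain sets may be active together.
both_active = visible_to("host1", {"production", "backup"})
only_production = visible_to("host1", {"production"})
```

Deactivating the "backup" set removes `tape1` from what `host1` can discover without touching the production domains.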
iSNS SCN (State Change Notifications)
• iSNS clients that wish to receive SCNs must explicitly register with the iSNS server for the events they want to be notified about
• Events that can be registered: an initiator, target, or object being added or removed, and members being added to or removed from a discovery domain
• iSNS servers generate an SCN either when the state of a target device changes or when the target device itself requests one using an SCN event message; the iSNS server listens to the FCNS for registrations/deregistrations

SCN Types
• Regular registrations
  This type of SCN is used within a DD; the discovery domain controls where the SCN message will go
• Management registrations
  Used by control nodes; can travel outside the DD from which they came
• Can use TCP or UDP messaging (most implementations use only TCP for now)

Services Provided by the Discovery Domain
• Login control
  Authorization and control policies for storage targets can be maintained by iSNS servers, allowing only authorized devices to access the targets
  Control of which target portals are accessible within the discovery domain
• Fibre Channel to iSCSI device mapping
  The iSNS database learns and stores naming and discovery information about FC storage devices discovered on the iSCSI gateway and about iSCSI devices in the IP network; this database can then be made available to FC and IP iSNS clients

High Availability of iSNS Servers
• Can use SLP to discover other iSNS servers
• Database transfers between servers using iSNSP or SNMP
• Heartbeat mechanism used between active and backup iSNS servers
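The SCN registration model described in the SCN slides (clients subscribe to specific events, and regular registrations are scoped to a discovery domain) can be sketched as a small dispatcher. Everything here is hypothetical: the event names are stand-ins for the real SCN bitmap, whose values are not reproduced, and `ScnDispatcher` models only regular (per-DD) registrations, not management registrations.

```python
from enum import Flag, auto


class ScnEvent(Flag):
    # Hypothetical event flags; not the real iSNS SCN bitmap values.
    OBJECT_ADDED = auto()
    OBJECT_REMOVED = auto()
    DD_MEMBER_ADDED = auto()
    DD_MEMBER_REMOVED = auto()


class ScnDispatcher:
    """Toy model of regular SCN registrations scoped to discovery domains."""

    def __init__(self):
        self.subscriptions = {}  # client -> (event mask, set of DDs)
        self.delivered = []      # (client, event, source) tuples

    def subscribe(self, client, mask, domains):
        """A client must explicitly register for the events it wants."""
        self.subscriptions[client] = (mask, set(domains))

    def notify(self, event, source, domain):
        """Deliver an event to subscribers in the same DD that asked for it."""
        for client, (mask, domains) in self.subscriptions.items():
            if client != source and event in mask and domain in domains:
                self.delivered.append((client, event, source))


scn = ScnDispatcher()
scn.subscribe("host1", ScnEvent.OBJECT_ADDED | ScnEvent.OBJECT_REMOVED, {"dd-oracle"})

scn.notify(ScnEvent.OBJECT_ADDED, "array1", "dd-oracle")     # delivered to host1
scn.notify(ScnEvent.DD_MEMBER_ADDED, "array2", "dd-oracle")  # host1 did not register for this
scn.notify(ScnEvent.OBJECT_ADDED, "tape1", "dd-tape")        # different DD, not delivered
```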
Internet Storage Name Service (iSNS) for iSCSI
• The iSNS protocol (iSNSP) provides:
  A mechanism for iSCSI clients to discover other iSCSI targets/initiators
  Enforcement of access control
  Notifications from an iSNS server on changes to the status of a logged-in iSCSI device
  The ability to discover iSCSI targets on a different IP network
• iSCSI target discovery can happen through:
  Static configuration of the initiator
  The iSCSI SendTargets command
  A name server/directory server (via iSNS)

iSNSP Header
• iSNSP Version: the current version is 0x0001; all other values are RESERVED
• iSNSP Function ID: defines the type of iSNS message and the operation to be executed
• iSNSP PDU Length: specifies the length of the PDU PAYLOAD field in bytes; the PDU payload contains attributes for the operation
• iSNSP Flags: indicates additional information about the message and the type of network entity that generated the message
• iSNSP Transaction ID: MUST be set to a unique value for each concurrently outstanding request message; replies MUST use the same TRANSACTION ID value as the associated iSNS request message
• iSNSP Sequence ID: the SEQUENCE ID has a unique value for each PDU within a single transaction
• iSNSP PDU Payload: the iSNSP PDU PAYLOAD is variable length and contains attributes used for registration and query operations
• Authentication Block: for iSNS multicast and broadcast messages, the iSNSP provides authentication capability; the iSNS authentication block is identical in format to the SLP authentication block

iSNSP Commands for iSCSI
The following are iSNSP command messages used in support of iSCSI:
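The iSNSP header fields listed above can be packed and parsed in a few lines. This is a sketch under assumptions: the slide gives the field order but not the widths, so the 16-bit big-endian layout used here is taken from the published iSNS specification rather than from this deck; the function ID in the example is a placeholder, and the authentication block is omitted.

```python
import struct

# Six 16-bit big-endian fields: version, function ID, PDU length,
# flags, transaction ID, sequence ID (widths are an assumption, see above).
ISNSP_HEADER = struct.Struct("!6H")
ISNSP_VERSION = 0x0001


def build_pdu(function_id, flags, transaction_id, sequence_id, payload=b""):
    """Prefix a payload with an iSNSP-style header; PDU length counts payload bytes only."""
    header = ISNSP_HEADER.pack(
        ISNSP_VERSION, function_id, len(payload), flags, transaction_id, sequence_id
    )
    return header + payload


def parse_pdu(data):
    """Split a PDU into header fields and payload, checking version and length."""
    version, function_id, length, flags, txid, seq = ISNSP_HEADER.unpack_from(data)
    if version != ISNSP_VERSION:
        raise ValueError("unsupported iSNSP version")
    payload = data[ISNSP_HEADER.size:ISNSP_HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated PDU payload")
    return {
        "function_id": function_id,
        "flags": flags,
        "transaction_id": txid,
        "sequence_id": seq,
        "payload": payload,
    }


# Placeholder function ID and an opaque 8-byte payload, for illustration only.
pdu = build_pdu(function_id=0x0002, flags=0, transaction_id=7, sequence_id=0,
                payload=b"\x00" * 8)
parsed = parse_pdu(pdu)
```

A reply would be built with the same transaction ID as the request, which is how a client with several outstanding requests matches responses to them.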
iSNSP Responses for iSCSI
The following are iSNSP response messages used in support of iSCSI:

iSNS Queries for iSCSI
• iSNS clients can perform two types of queries:
  Device attribute query: the iSNS server responds with the requested attributes of one or more iSNS clients
    The iSNS server converts the received query to an FC name server query in the SAN
    The FC name server ensures that the result set is filtered based on zones
    The iSNS server translates each entry returned by the FC name server to the corresponding iSNS client
    Filters based on iSCSI access control are applied by removing all statically configured virtual targets that the querying initiator is not allowed to access
  Device get next query: allows an iterative query of the iSNS server’s iSNS client database

Return Information from iSNS iSCSI Query
[Figure: query result fields: iSCSI name; name of the port on the IP gateway entity; IP address of the portal to log in to and ask for this target]

iSNS for iFCP
• Works in much the same manner as for iSCSI, but requires other related attributes to be registered and queried
• Is required for iFCP
• Functions much like a domain name server and domain ID manager
• Needs to be a highly available service for FC devices

iSNSP Commands for iFCP
The following are iSNSP command messages used in support of iFCP:
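The device attribute query steps listed earlier for iSCSI (FC name server query, zone filtering, translation, access control filtering) and the iterative "get next" query can be sketched as two small functions. The data model is an assumption made for the example: zones are sets of port WWNs, the gateway queries the FC name server under a proxy identity, and the ACL maps a statically configured target to the initiators allowed to see it. All names and values are hypothetical.

```python
def device_attribute_query(initiator, proxy_pwwn, fc_ports, zones, iscsi_names, static_acl):
    """Return the iSCSI target names the initiator is allowed to discover."""
    # Steps 1-2: query the FC name server; only ports zoned with the proxy are returned.
    zoned = [
        p for p in fc_ports
        if any(p in members and proxy_pwwn in members for members in zones.values())
    ]
    # Step 3: translate each FC entry to its iSNS (iSCSI) representation.
    translated = [iscsi_names[p] for p in zoned if p in iscsi_names]
    # Step 4: drop statically configured targets this initiator may not access.
    return [t for t in translated if t not in static_acl or initiator in static_acl[t]]


def get_next(database, last_key=None):
    """Iterative query: return the key following `last_key`, or None at the end."""
    for key in sorted(database):
        if last_key is None or key > last_key:
            return key
    return None


fc_ports = ["pwwn-a", "pwwn-b", "pwwn-c"]
zones = {"z1": {"proxy", "pwwn-a", "pwwn-b"}, "z2": {"pwwn-c"}}
iscsi_names = {"pwwn-a": "iqn.example:disk-a", "pwwn-b": "iqn.example:disk-b",
               "pwwn-c": "iqn.example:disk-c"}
static_acl = {"iqn.example:disk-b": {"host2"}}

result = device_attribute_query("host1", "proxy", fc_ports, zones, iscsi_names, static_acl)
first = get_next(iscsi_names)
second = get_next(iscsi_names, first)
```

Here `pwwn-c` is removed by zoning and `disk-b` by the iSCSI access list, so `host1` discovers only `disk-a`.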
iSNSP Responses for iFCP
The following are iSNSP response messages used in support of iFCP:

SLP and iSNS
• SLP is used for target discovery
  No configuration required for the simplest networks
  Small footprint; no servers required
  Just enough discovery for small-to-medium networks
  Device-centric access control model
• iSNS adds storage management capabilities
  Active monitoring of initiators and targets
  Event propagation
  Public key distribution
  Centralized access control model

Using Both SLP and iSNS
• Initiators can use both SLP and iSNS to discover targets
• Targets should use SLP only if not configured for iSNS
• Gateways or proxies may provide local SLP discovery of remote iSNS devices

TECHNICAL TOOLS AND SKILLS

Storage Networking Toolbox
• Test tools for Fibre Channel and IP
• Host-based tools
• Network component serviceability tools
• Software debug tools
• Knowledge
Fibre Channel Analyzers
• Most units are based on dedicated hardware and might be supplied with software tools for performance baselining
  Very expensive
  Oriented to protocol conformance testing
  Require two GBIC interfaces
• Monitoring units might have a retiming mode to clean up some of the timing problems on a link and to separate them from the real problem at layer 1
  Statistical software can run on these types of units, collecting statistics on the status of the line or other parameters (number of bits, exchanges…)
• Sharing is still a dream in most cases: it is complex to share an analyzer in the field, so the portable versions are usually the most suitable

Fibre Channel Analyzers
• Snooping GBICs or fiber taps allow monitoring without service interruption; very important for Fibre Channel work in the field
• Traffic probes are used to remotely monitor the state of a network without service interruption
• Trace viewers (free from the vendor websites)
  Each vendor has its own PC viewer, which must be used with that vendor's capture tool; the viewers can be found on the vendors' websites

FC Test Vendors
• Leaders in dedicated hardware tools:
  Finisar (www.finisar.com)
  Xyratex (www.xyratex.com)
  Agilent (www.agilent.com)
  I-Tech (www.I-tech.com)
  Ancot (http://www.ancot.com/)
  Spirent/Netcom Systems (www.netcomsystems.com)
SCSI Host-Based Testing
• Iometer: http://developer.intel.com/design/servers/devtools/iometer/
• IOzone: http://www.iozone.org/
• SCSI tools: http://scsitools.com/
• Xyratex disk basher: http://www.xyratex.com/
• Freeware or shareware software tools for SCSI and I/O analysis, and tools for disk manufacturing
• www.ethereal.com
• www.wildpackets.com

Windows Tools
• iSCSI driver debug helpers
  Windows debug utilities
  http://www.osr.com/resources_downloads.shtml
  http://www.sysinternals.com/
• Detailed use of the O/S disk administrator to verify and check the health of target devices

IP: GigE
• GigE testers ($$$)
  Agilent
  Sniffer
  Fluke
  Finisar/Shomiti
  iSCSI decodes are just becoming available on most tools
• All your IP tools
  IP ping, trace, etc.
  Fibre Channel ping available at http://www.teracloud.com/utilities.html

iSCSI Decoding
• Software-only analyzers like Ethereal (www.ethereal.com)
• Hardware analyzers
• Can use the monitor command on Cisco switches to span the iSCSI GigE port to a 10/100 port

Available Certifications
• SNIA (Storage Networking Industry Association)
  Level 1: Fibre Channel storage networking professional
  Level 2: Fibre Channel storage networking practitioner
• iSCSI training available from many education sources
  Infinity I/O, Medusa, Solution Technology, others
• Other certifications that are vendor specific

ARCHITECTURAL DESIGN OF STORAGE AREA NETWORKS
Section Agenda
• Introduction
• Hierarchy
• Modularity
• Architecture Examples

INTRODUCTION

Hierarchy, Modularity and Limited Failure Domains
Why do this? (Benefits summary):
• Scalable architecture
• Improved performance
• Manage change
• Improve service
• Improved security
• Simplified management and troubleshooting
• Reduced cost of ownership

What Problem Are We Solving?
Applications must be available and perform well
Networks that deliver on this requirement:
• Have consistently high performance
• Are reliable, scalable, and manageable
• Are secure and cost-efficient
• Are service and solution enabling
• Adapt to changing requirements

Network Design Goals
Architecture provides:
• Performance
• Reliability, availability, and scalability (RAS)
• Cost efficiencies
• Security
• A base to enable services and solutions
To meet mission-critical business objectives, applications need to be consistently up, available, and high-performing
Architecture: Hierarchy, Modularity, and Domains
• Hierarchy: functionally divides the problem
• Modularity: creates manageable building blocks
• Domains: limit the scope of potential failures
Fundamentally, we break the network design process into manageable blocks so that the network will function within the performance and scale limits of applications, protocols, and network services

What Does This Mean?
We build networks that have structure:
[Figure: structured network with Access, Distribution, and Core layers (building blocks) around a backbone, connecting Application Servers, Enterprise Storage, WAN, Internet, and PSTN; a callout marks the “Focus of This Discussion”]

Applying Design Principles to Storage
• Hierarchy
  Predictable performance
  Scalable design
  Fault isolation
• Modularity
  Cost-effective
  Repeatable
• Domain
  Reliability
  Security
[Figure: core fabric with unified storage management and shared storage]

HIERARCHY

Hierarchy: Physical and Logical
• Physical hierarchy
  Predictable performance
  Scalable design
  Fault isolation
  High availability
• Logical hierarchy
  Virtual SANs
  Zoning
  Enhances physical hierarchy
[Figure: physical architecture overlaid by a logical architecture, Virtual SAN A, with zones grouping hosts such as H1 and H2]
[Figure: logical architectures of Virtual SAN A and Virtual SAN B, with zones (Zone 1 to Zone 4) grouping hosts (H1, H2, H3, H7) and disks (D1, D2, D7)]

Hierarchy: Physical
Consolidated Storage Network
• Cost-effective solution
  Benefits of consolidation
• Limited scalability
  Small to medium business
  Expansion can be disruptive
• Single fault redundancy
  Double fault would likely result in isolation

Hierarchy: Physical
Collapsed Core Architecture
• Collapsed core
  High performance
  Multiple unequal paths
• Better scalability
  Medium to large enterprise
  ISLs can limit scalability
• Redundant
  Mesh topology
  Network survives some double faults

Hierarchy: Physical
Core-Edge Architecture
• Core-edge
  High performance
  Load balancing
  Consistent hop count
• Good scalability
  Large to very large enterprise
  Non-disruptive expansion
• Better fault tolerance
  Improved fault isolation
  Single fault within layer okay

Hierarchy: Physical
Oversubscription
• To be expected in storage networks
• Typically lower factors than we see in LANs
• Architecture should be flexible to accommodate differing requirements for various hosts and storage subsystems
• Bandwidth can be modified non-disruptively by using port channels between switches
• Take into account any “inherent” oversubscription in networking hardware
• Use actual anticipated throughput rather than link speed for calculating bandwidth requirements
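The last bullet above, using anticipated throughput rather than link speed, is easy to see with a short calculation. The port counts and throughput figures below are made-up inputs for illustration, not measurements or recommendations.

```python
def oversubscription(host_ports, gbps_per_host, isl_links, gbps_per_isl):
    """Ratio of offered host bandwidth to available inter-switch bandwidth."""
    return (host_ports * gbps_per_host) / (isl_links * gbps_per_isl)


# 32 hosts on 2 Gb/s ports sharing a port channel of four 2 Gb/s ISLs.
nominal = oversubscription(32, 2.0, 4, 2.0)    # computed from link speed

# The same edge switch if each host is expected to drive about 0.5 Gb/s.
expected = oversubscription(32, 0.5, 4, 2.0)   # computed from anticipated throughput

# Adding links to the port channel lowers the ratio without re-cabling hosts.
expanded = oversubscription(32, 0.5, 8, 2.0)

# Worst case after one of the four ISLs fails.
degraded = oversubscription(32, 0.5, 3, 2.0)
```

With these inputs the ratio based on link speed is 8:1, while the ratio based on anticipated throughput is 2:1, which is the figure that matters for sizing the port channel.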
Hierarchy: Physical
Inter-Switch Links
• Inter-Switch Link (ISL)
  Physical FC link between two fabric switches forming a trunk
  Utilized for FC services and data traffic
• Port Channel
  Multiple FC ISLs combined to form a single aggregated trunk
  All links in a Port Channel must be directly connected to the same two switches
  Individual link state changes do not cause ISL trunk state changes
[Figure: two switches joined by an ISL and by a Port Channel]

Hierarchy: Physical
Scalability
• Oversubscription (OS)
  Higher OS acceptable: 15:1 OS for some hosts
  Lower OS for high-performance hosts and storage devices
  Consider impact of multipath load balancing
  Determine acceptable worst case in various failure scenarios
  Can be changed non-disruptively by adding/removing links to port channels
[Figure: example fabric annotated with 8:1 OS, 1:1 OS, and 3:1 OS, and with 4x2Gb and 8x2Gb ISLs to the core]

Hierarchy: Logical
Virtual SANs
• VSANs provide a means to build a logical structure on top of a physical SAN
• Just as VLANs are used to scale Ethernet networks, VSANs help scale Fibre Channel networks
• Topology changes are isolated within the VSAN; therefore adds, moves, and changes are not disruptive to other VSANs
• VSANs can be utilized to establish administrative domains
• Zoning provides an additional access control mechanism within each VSAN
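The two layers of isolation described above, VSAN membership first and zoning within the VSAN second, can be expressed as a two-step check. This is a simplified sketch: the interface names, VSAN numbers, and zone names are hypothetical, and zone members are identified by port here purely to keep the example short.

```python
# Port -> VSAN membership (hypothetical interface names and VSAN numbers).
port_vsan = {"fc1/1": 10, "fc1/2": 10, "fc1/3": 20, "fc1/4": 20}

# Zones are defined separately inside each VSAN.
zones = {
    10: {"zone1": {"fc1/1", "fc1/2"}},
    20: {"zone1": {"fc1/3"}, "zone2": {"fc1/4"}},
}


def can_communicate(a, b):
    """Two ports may talk only if they share a VSAN and a zone in that VSAN."""
    vsan = port_vsan[a]
    if vsan != port_vsan[b]:
        return False  # different VSANs: no flows between them
    return any(a in members and b in members for members in zones[vsan].values())


same_vsan_same_zone = can_communicate("fc1/1", "fc1/2")
different_vsan = can_communicate("fc1/1", "fc1/3")
same_vsan_different_zone = can_communicate("fc1/3", "fc1/4")
```

Note that both VSANs can reuse the name "zone1" without conflict, because each VSAN carries its own zoning configuration.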
Hierarchy: Logical
Logical Architecture
• Virtual SANs
  Similar to Ethernet VLANs except no inter-VSAN flows
  Enhanced ISL (EISL) provides VSAN trunking
  Complementary to port channel
• Services scalability
  Independent Fibre Channel services for each VSAN
  Zoning is per VSAN
• Failure domain
  Faults contained within VSAN
[Figure: switches joined by an EISL Port Channel]

Hierarchy: Logical
Maximizing VSAN Architecture
• Isolate multiple paths into separate VSANs
• Independent FC services per VSAN
• Provides complete traffic isolation between redundant paths
• Each VSAN converges independently for faster recovery and improved fault isolation

Hierarchy: Combining Physical and Logical
• Fabric A provides one set of links and Fibre Channel services
• Fabric B provides an independent set of links and services
[Figure: Fabric A and Fabric B]

MODULARITY

Modularity: Key Elements
• The ability to scale the network while maintaining consistent performance
• Building block approach breaks network into smaller chunks that are easier to understand, replicate, and deploy
• Changes and additions can be made non-disruptively
• Provides consistent and limited failure domains
• Modularity can also define administrative boundaries
Modularity: Building Blocks
[Figure: application modules and storage modules attached to a Fibre Channel core]
Functional building blocks provide scalability with deterministic performance

Modularity: Utilizing VSANs
• Adds, moves, and changes contained within a VSAN are non-disruptive to other VSANs
• Using VSANs facilitates application modeling and testing
• Per VSAN statistics
• Per VSAN traffic engineering
• Per VSAN administration (if desired)
• Eliminates costs associated with separate physical fabrics

Modularity: Benefits of VSANs
• Overlay isolated virtual fabrics on the same physical infrastructure
  Each VSAN contains zones and separate (replicated) fabric services
  VSAN membership determined by port
• VSANs for availability
  Isolate virtual fabrics from fabric-wide faults/reconfigurations
• Security
  Complete hardware isolation
• Scalability
  Replicated fabric services
  Thousands of VSANs per storage network
• Management
  Role-Based Access Control (RBAC)
  Provides administrative boundaries
[Figure: VSAN-enabled fabric with VSAN trunks carrying Department/Customer “A”, Department/Customer “B”, and management VSANs to shared storage]

Modularity: Storage Intelligence and VSANs
[Figure: Dept 1, Dept 2, and Dept 3 VSANs connected through virtualization to data center VSANs]
• VSANs created to provide isolation of fabric-wide services
• Virtualization allows physical storage to be in its own VSANs, separate from the host VSANs
VSANs provide:
• Secure isolation of physical storage
• Easier configuration
• Dynamic configuration of fabrics
• Role-based access control

ARCHITECTURE EXAMPLES

Architecture: iSCSI
• Scalability
  Less expensive alternative for hosts not requiring 2 Gbps
  Consider actual throughput requirements for scalability
• Host services
  Recommend separate NIC
• TCP Offload Engine (TOE)
  Appears as a normal HBA
  Compatible with host-based storage utilities: multipath, load balancing, mapping, etc.
[Figure: host software stacks: applications over file system, block device, and SCSI generic layers; a network file system over the TCP/IP stack and NIC driver; an iSCSI driver over the TCP/IP stack and NIC driver; an adapter driver over a SCSI adapter or TOE]

Architecture: iSCSI High-Availability
• Redundant connections to hosts or servers
• High-availability iSCSI services
• Redundant paths to backend FC SAN
[Figure: host with multiple (iSCSI) NICs and multipathing software installed (application, multipathing, iSCSI driver); multiple Ethernet switches; redundant iSCSI-to-Fibre Channel connections and services; storage array with redundant controller ports]

Architecture: iSCSI Authentication
• SCSI routing service passes username and MD5-hashed password from initiators to the AAA server
• AAA authentication list used to determine which service(s) to use for authentication
[Figure: iSCSI hosts (initiators) with User1/pwd1 credentials reach the SCSI routing instance over the IP network; the AAA authentication services list (RADIUS, TACACS+, CHAP, local) selects among a RADIUS server, a TACACS+ server, and local user/password tables; iSCSI storage (targets) sits on the FC fabric behind the iSCSI services]
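The "MD5-hashed password" and the ordered authentication list in the iSCSI authentication slide can be sketched with the standard CHAP calculation, where the response is the MD5 digest of the identifier, the shared secret, and the challenge. The rest is an assumption for illustration: each service is modelled as a plain lookup function, and the list falls through to the next service when a service has no entry for the user, which is a simplification of how a real AAA method list decides when to move on.

```python
import hashlib
import hmac


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """CHAP with MD5: digest of identifier byte + secret + challenge."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def authenticate(user, identifier, challenge, response, services):
    """Try each configured service in order, like an authentication list."""
    for lookup in services:
        secret = lookup(user)
        if secret is None:
            continue  # this service has no entry for the user; try the next one
        expected = chap_response(identifier, secret, challenge)
        return hmac.compare_digest(expected, response)
    return False


# Hypothetical services: an empty RADIUS stand-in followed by a local user table.
radius_lookup = {}.get
local_lookup = {"User1": b"pwd1"}.get
auth_list = [radius_lookup, local_lookup]

challenge = b"\x01" * 16                            # fixed here; random in practice
response = chap_response(1, b"pwd1", challenge)     # computed by the initiator
accepted = authenticate("User1", 1, challenge, response, auth_list)
```

The password itself never crosses the network; only the digest does, and it is bound to a challenge that changes on every login.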
Architecture: iSCSI Topology
iSCSI best practices:
• Isolate the IP storage network behind application hosts with VLANs
• Minimize potential for bandwidth contention
• Map VLANs to VSANs for manageability
• Dedicate Ethernet interfaces on the host for attachment to the storage network
[Figure: clients on a front-side IP network; iSCSI-enabled hosts; Ethernet switches forming the IP storage network; iSCSI services; FC fabric with FC-attached hosts with HBAs and a storage pool]

Architecture: SAN Extension Technology
Technology choice requires matching storage application requirements with service availability, cost, throughput, and latency
[Figure: SAN extension transports: IP WAN, CWDM, DWDM, SONET/SDH; protocols carried: FCIP, FC]

Architecture: High Availability for SAN Extension: FC
• Utilize disparate paths and portchannel for high availability
• Utilize VSANs to limit the failure domain in the event of lost connectivity
• Both fabrics remain connected if one of the paths fails
• Use of portchannel prevents state change on link failure
[Figure: Fabric A and Fabric B at each site, connected by FC PortChannels over CWDM, DWDM, and SONET/SDH]

Architecture: High Availability for SAN Extension: FCIP
• Utilize disparate paths and portchannel for high availability
• Utilize VSANs to limit the failure domain in the event of lost connectivity
• Recommend not using EtherChannels
• Both fabrics remain connected if one of the paths fails
• Use of portchannel prevents state change on link failure
[Figure: Fabric A and Fabric B at each site, connected by FCIP PortChannels over two IP WANs]
Architecture: Legacy Storage Implementation
• Storage is 'captive' behind applications
• Inefficient allocation of storage resources
• Multiple administrative domains
[Figure labels: Campus Clients; Remote Clients; Internet Clients; LAN Core Backbone; Application Servers; SAN Islands; Captive Storage Blocks]

Architecture: Factors for Determining Architecture
• Current size and anticipated growth for both application servers and storage elements
• Baseline performance requirements for servers and storage
• Business continuance requirements: SAN extension
• Administrative domains
• Migration plans
• Interoperability considerations
• Costs

Architecture: Collapsed Core Architecture
• Servers and storage elements connected to collapsed core
• Some scalability, especially with iSCSI
• Redundant paths
• Achieves economical storage consolidation
• VSANs can add scalability and management benefits
[Figure labels: Application Servers; iSCSI; Unified Storage Mgmt; Shared Storage]

Architecture: Large Scale Architecture
• Application servers connect to edge switches
• Storage devices connect to edge switches
• Highly scalable
• Highly redundant
• Highly modular
• Multiple equal paths
• VSANs limit the size of any one SAN
[Figure labels: Application Servers; iSCSI; Unified Storage Mgmt; Shared Storage]
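When servers attach to edge switches and reach storage over a limited number of equal paths, one common way to size the design is an oversubscription ratio: total edge-port bandwidth divided by total inter-switch bandwidth. The sketch below is only an illustration; the function name, the port counts, and the 2 Gbps link speed are assumptions and do not come from the slides.

```python
def isl_oversubscription(host_ports: int, host_gbps: float,
                         isl_count: int, isl_gbps: float) -> float:
    """Ratio of attached host bandwidth to inter-switch link bandwidth."""
    if host_ports < 0 or host_gbps < 0:
        raise ValueError("host port count and speed must be non-negative")
    if isl_count <= 0 or isl_gbps <= 0:
        raise ValueError("at least one ISL with positive speed is required")
    return (host_ports * host_gbps) / (isl_count * isl_gbps)


# Hypothetical edge switch: 32 host ports at 2 Gbps sharing 4 ISLs at 2 Gbps.
print(isl_oversubscription(32, 2.0, 4, 2.0))  # 8.0, i.e. 8:1
```

Adding equal-cost paths lowers the ratio, which is one reason a modular edge-to-core layout scales more predictably than a single collapsed core.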
Architecture Summary: Network Design Goals
• Performance
  Planned hierarchy, managed oversubscription, and modular design
• Reliability, Availability, and Scalability (RAS)
  Limited failure domains, leveraged VSANs, and modular design
• Cost efficiencies
  Consolidated storage, central management, and leveraged resources
• Security
  Limited domains, RBAC management, and consistent architecture
• A base to enable services and solutions
  Business continuance and disaster recovery
  Management of heterogeneous storage elements
  Ubiquitous access to storage from anywhere
  Infrastructure for storage virtualization

Architecture: End-to-End SAN Architecture
[Figure labels: Highly Scalable Storage Networks; iSCSI-Enabled Storage Network; Ethernet Switches; Multiprotocol/Multiservice SONET Network; Asynchronous Replication (FCIP over SONET); Remote Storage Access (FCIP); Optical Network; Resilient Optical Transport Networks; Synchronous Replication, Optical (FCIP/FC); Intelligent Workgroup Storage Networks; FC and iSCSI attached devices throughout]

Q&A

Complete Your Online Session Evaluation!
WHAT: Complete an online session evaluation and your name will be entered into a daily drawing
WHY: Win fabulous prizes! Give us your feedback!
WHERE: Go to the Internet stations located throughout the Convention Center
HOW: Winners will be posted on the onsite Networkers Website; four winners per day
EXTRAS

FC LOOP OPERATIONS

The loop in all of the following figures has four ports. Frames and fill words travel from Port_01 to Port_2A to Port_B2 to Port_EF and back to Port_01. CFW is the current fill word of a port.

Single Port ARB
1. The loop is initially filled with IDLEs
2. Each port is in the monitoring state
3. Because there is no activity, CFW = IDLE
4. Received IDLEs are replaced with the CFW
[Figure: four-port loop, AL_PA 01, 2A, B2, EF; every link carries IDLE]

Single Port ARB
1. Port_01 begins to arbitrate for access to the loop
2. Port_01 changes its CFW from IDLE to ARB(01)
3. Port_01 transmits ARB(01) when a fill word is required
[Figure: Port_01 transmits ARB(01); the other links carry IDLE]

Single Port ARB
1. Whenever a fill word is required, ARB(01) is used; with no other activity on the loop, ARB(01) is sent
2. ARB(01) is received by the next port, which updates its CFW to ARB(01)
When a port discards received fill words and transmits its CFW, it can compensate for clock differences between the Rx data stream and the Tx data stream
[Figure: ARB(01) on the links after Port_01 and Port_2A; IDLE elsewhere]

Single Port ARB
1. When Port_01 receives its own ARB(01), it wins arbitration
2. Port_01 sends an OPN to open a loop circuit and changes its CFW to ARB(F0)
3. Port_01 discards any received ARB(x)
[Figure: Port_01 transmits ARB(F0) and OPN; ARB(01) still on the link into Port_01]

Single Port ARB
1. As each port receives the ARB(F0), it updates its CFW to ARB(F0)
2. Assuming that no other port is arbitrating, ARB(F0) travels the complete loop
3. When ARB(F0) is received by Port_01, the CFW in Port_01 is changed to IDLE
[Figure: Port_01 transmits IDLE; ARB(F0) on the remaining links]

Single Port ARB
1. As long as Port_01 owns the loop, it discards any received IDLE or ARB(x) and continues to send its CFW when necessary
2. Each port receives the IDLE and updates its CFW to IDLE
3. Assuming that no other port is arbitrating, the IDLEs travel the complete loop
Discarding the received ARB(x) prevents any other port from winning arbitration
[Figure: IDLE propagating around the loop behind the last ARB(F0)]

Multiple Port ARB
1. Port_01 begins arbitrating for access to the loop by replacing IDLE and ARB(x) with ARB(01)
2. Port_B2 also begins arbitrating for the loop; it replaces IDLE and ARB(x) with ARB(B2)
[Figure: Port_01 transmits ARB(01); Port_B2 transmits ARB(B2); IDLE elsewhere]

Multiple Port ARB
1. The ARB(01) reaches Port_2A, which updates its CFW to ARB(01) and transmits it when a fill word is needed
2. The ARB(B2) also travels to Port_EF, which updates its CFW to ARB(B2)
[Figure: ARB(01) after Port_01 and Port_2A; ARB(B2) after Port_B2 and Port_EF]

Multiple Port ARB
1. When Port_B2 receives ARB(01), it changes its CFW to ARB(01) because 01 has higher priority (the lower AL_PA wins)
2. When Port_01 receives ARB(B2), it replaces it with ARB(01)
Because Port_B2's ARB(B2) is replaced with ARB(01), Port_B2 will not win arbitration at this time
[Figure: ARB(01) after Port_01, Port_2A, and Port_B2; ARB(B2) on the link into Port_01]

Multiple Port ARB
1. ARB(01) is received by Port_01, which wins arbitration
2. Port_01 then opens the loop circuit and updates its CFW to ARB(F0) for use when a fill word is required
3. Port_B2 is still arbitrating but has lower priority
[Figure: Port_01 transmits ARB(F0) and OPN; ARB(01) on the remaining links]

Multiple Port ARB
1. Port_2A receives ARB(F0) and updates its CFW to ARB(F0)
2. Port_B2 replaces the lower-priority ARB(F0) and transmits ARB(B2)
[Figure: ARB(F0) after Port_01 and Port_2A; ARB(B2) after Port_B2; ARB(01) on the link into Port_01]

Multiple Port ARB
1. Port_EF updates its CFW to ARB(B2) and transmits it on to Port_01
2. Port_01 transmits ARB(F0)
3. Port_B2 continues to replace F0 with B2; Port_01 discards all received ARB(x) ordered sets
When Port_01 relinquishes control of the loop, it changes its CFW to ARB(B2), allowing Port_B2 to win
[Figure: ARB(F0) after Port_01 and Port_2A; ARB(B2) after Port_B2 and Port_EF]
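The per-port rules used throughout the arbitration slides can be condensed into one small decision function. This is a sketch under simplifying assumptions: the names `State` and `handle_fill_word` are invented for illustration, fill words are modeled as integers with `None` for IDLE, and OPN/CLS handling and the owner's later switch of its CFW from ARB(F0) to IDLE are left out.

```python
from enum import Enum
from typing import Optional, Tuple

IDLE: Optional[int] = None   # fill word carrying no arbitration request
ARB_F0 = 0xF0                # ARB(F0), the fill word sent by the loop owner


class State(Enum):
    MONITORING = "monitoring"
    ARBITRATING = "arbitrating"
    OPEN = "open"            # the port has won arbitration and owns the loop


def handle_fill_word(state: State, al_pa: int,
                     rx: Optional[int]) -> Tuple[State, Optional[int]]:
    """Return (next state, fill word to transmit) for one received fill word."""
    if state is State.OPEN:
        return State.OPEN, ARB_F0          # owner discards every IDLE and ARB(x)
    if state is State.MONITORING:
        return State.MONITORING, rx        # CFW follows whatever was received
    if rx == al_pa:
        return State.OPEN, ARB_F0          # own ARB came back: arbitration won
    if rx is not None and rx < al_pa:
        return State.ARBITRATING, rx       # higher-priority ARB is passed on
    return State.ARBITRATING, al_pa        # IDLE or lower-priority ARB is replaced


# Port_B2 is arbitrating and receives ARB(01): the lower AL_PA wins, so it passes.
print(handle_fill_word(State.ARBITRATING, 0xB2, 0x01))
# Port_01 is arbitrating and receives ARB(B2): it substitutes ARB(01).
print(handle_fill_word(State.ARBITRATING, 0x01, 0xB2))
```

Keeping the rule as a pure function of state, AL_PA, and received word makes each slide step checkable in isolation.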
Lower Priority Port ARB
1. Port_B2 begins to arbitrate for the loop by changing its CFW to ARB(B2)
2. Each received IDLE and lower-priority ARB(x) is discarded by Port_B2, and ARB(B2) is substituted in its place
[Figure: four-port loop, AL_PA 01, 2A, B2, EF; Port_B2 transmits ARB(B2); IDLE elsewhere]

Lower Priority Port ARB
1. ARB(B2) propagates around the loop to Port_EF
2. Port_EF changes its CFW to ARB(B2) and transmits ARB(B2) whenever a fill word is needed
[Figure: ARB(B2) after Port_B2 and Port_EF; IDLE elsewhere]

Lower Priority Port ARB
1. The ARB(B2) propagates around the loop to Port_01
2. Port_01 changes its CFW to ARB(B2) and transmits ARB(B2) whenever a fill word is needed
[Figure: ARB(B2) after Port_B2, Port_EF, and Port_01; IDLE on the link into Port_B2]

Lower Priority Port ARB
1. Port_01 begins arbitrating after a single ARB(B2) has passed
2. Port_01 has higher priority than Port_B2, so it discards ARB(B2) and replaces it with ARB(01)
3. The single ARB(B2) travels around the loop to Port_2A, which passes it on
4. When ARB(01) is received at Port_2A, its CFW is changed from ARB(B2) to ARB(01)
[Figure: Port_01 transmits ARB(01); the single ARB(B2) ahead of it; ARB(B2) after Port_B2 and Port_EF]

Lower Priority Port ARB
1. The single ARB(B2) is received by Port_B2, which wins arbitration and begins to discard any received ARB(x)
2. Port_B2 changes its CFW to ARB(F0)
[Figure: Port_B2 transmits ARB(F0) and OPN; ARB(01) after Port_01 and Port_2A; ARB(B2) on the link into Port_01]

Lower Priority Port ARB
1. Port_EF changes its CFW to ARB(F0) and sends it on to Port_01
2. Port_01 substitutes ARB(01) for every ARB(F0) it receives
3. Port_B2 discards the ARB(01) and sends ARB(F0) as its fill word
4. When Port_B2 relinquishes the loop, it will change its CFW to ARB(01) and allow Port_01 to win the loop
[Figure: ARB(01) after Port_01 and Port_2A; ARB(F0) after Port_B2 and Port_EF]
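The timing effect in this sequence, where the lower-priority Port_B2 still wins because one ARB(B2) slipped past Port_01 before Port_01 started arbitrating, can be reproduced with a small loop simulation. It is a sketch under stated assumptions: every port forwards exactly one fill word per hop in lockstep, IDLE is modeled as `None`, and the names `winner` and `start_hop` are invented for illustration. Frames, OPN, and loop release are not modeled.

```python
from typing import Dict, List, Optional

IDLE: Optional[int] = None   # fill word carrying no arbitration request


def winner(loop: List[int], start_hop: Dict[int, int], max_hops: int = 64) -> int:
    """Return the AL_PA that first receives its own ARB.

    loop      -- AL_PAs in transmit order; loop[i] sends to loop[i + 1]
    start_hop -- hop number at which each arbitrating AL_PA begins to arbitrate
    """
    tx: List[Optional[int]] = [IDLE] * len(loop)   # the loop starts filled with IDLE
    for hop in range(max_hops):
        new_tx: List[Optional[int]] = []
        for i, al_pa in enumerate(loop):
            rx = tx[i - 1]                         # word sent by the upstream port
            arbitrating = al_pa in start_hop and hop >= start_hop[al_pa]
            if not arbitrating:
                new_tx.append(rx)                  # monitoring: pass the word along
            elif rx == al_pa:
                return al_pa                       # own ARB returned: port wins
            elif rx is not None and rx < al_pa:
                new_tx.append(rx)                  # higher-priority ARB passes
            else:
                new_tx.append(al_pa)               # IDLE or lower priority replaced
        tx = new_tx
    raise RuntimeError("no port won arbitration within max_hops")


LOOP = [0x01, 0x2A, 0xB2, 0xEF]
print(hex(winner(LOOP, {0xB2: 0, 0x01: 0})))   # both start together: 0x1 wins
print(hex(winner(LOOP, {0xB2: 0, 0x01: 3})))   # Port_01 starts late: 0xb2 wins
```

With these assumptions, Port_01 loses only when it starts after the first ARB(B2) has already passed it, which matches the slide sequence.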