IP Storage Tutorial
Presented 17 October 2001 by
Marc Staimer, President & CDS – Dragon Slayer Consulting
Ahmad Zamer, Sr. Product Line Marketing – Intel
John Hufferd, Sr. Technical Staff – IBM SSD
Joe Gervais, Director Product Marketing – Alacritech
Tutorial Introduction
Marc Staimer, CDS – Dragon Slayer Consulting
The Purpose of this Tutorial

IP Storage as “block” vs. “file” storage






NAS will be discussed peripherally
To provide details about IP Storage
To provide factual information
To clarify issues
To facilitate understanding
Key point

This will be pragmatic education, not cheerleading
IP Networked Storage
iSCSI – New Possibilities
Ahmad Zamer
October 2001
Overview





Introduction
Benefits of IP Storage
IP Storage technologies
iSCSI
Conclusions
Introduction
“Ethernet wins. Again. In time… Ethernet will eventually triumph
over all other storage networking technologies, including Fibre
Channel”
Source: March 2001 Forrester Research
“If we were starting with a clean piece of paper … we would
probably use gigabit Ethernet and IP”
Source: Bill Miller CTO StorageNetworks, Industry Standard
“... 76% of senior IT executives believe IP will make it easier to
implement large-scale storage networks”
Source: Enterprise Storage Group 9/11/2000
“75% perceive iSCSI as the IP storage standard”
Source: Marc Staimer, Dragon Slayer Consulting – May 2001
Network Storage Models
Direct Attached Storage
• High cost of ownership
• Inflexible
Network Attached Storage
• Transmission optimized for file transactions
• Storage traffic travels across the LAN
Storage Area Network
• Transmission optimized for database transactions
• Separate LAN and SAN
• Increases data availability
• Flexible and scalable
Moving from Dedicated to Networked Storage
Benefits of IP Storage






Brings the SAN concept to Ethernet networks
Lower total cost of ownership
Creates a single integrated network
Makes remote data replication possible
Improves enterprise network management
Provides a higher degree of interoperability
Advantages of IP Storage



Storage access over distance
Transparent to Applications
Leverage Benefits of IP





iSCSI
IT Skills
Ethernet & SCSI Infrastructure
Network Management
R&D Investment
Universal Access to Storage
[Diagram: servers connect over the IP network (GE) to a storage router, which attaches to FC or SCSI storage. Storage appears local to servers.]
Key Business Trends Favor IP Storage
[Four charts comparing IP Storage switches with FC switches over 2000–2003: Network Performance (labeled values range from 0.85Gbps to 100Gbps), Overall System Cost, Trained Staff Available, and Total Cost of Ownership.]
IP Storage Standards
IETF IP Storage (IPS) Working Group
• iSCSI
• FCIP
• iFCP
• iSNS
Storage Networking Industry Association (SNIA)
• SNIA IP Storage Forum
IP Storage Technologies
What are the technologies? (iSCSI, iFCP, FCIP)
iSCSI
• iSCSI is a TCP/IP-based protocol for establishing and managing connections between IP-based storage devices, hosts and clients
FCIP
• FCIP is a TCP/IP-based tunneling protocol for connecting geographically distributed Fibre Channel SANs transparently to both FC and IP
iFCP
• iFCP is a TCP/IP-based protocol for interconnecting Fibre Channel storage devices or Fibre Channel SANs using an IP infrastructure in place of Fibre Channel switching and routing elements
IP Storage: iSCSI, FCIP, iFCP
Protocol | End Devices | Fabric Services*
iSCSI | iSCSI/IP | Internet Protocol
FCIP | Fibre Channel | Fibre Channel
iFCP | Fibre Channel | Internet Protocol
* Fabric Services include routing, device discovery, management, authentication, inter-switch communication
iSCSI, iFCP and FCIP Protocol Stacks
Applications
Operating System
Standard SCSI Command Set
iSCSI stack: New Serial SCSI | TCP | IP
iFCP stack: FCP (FC-4) | TCP | IP
FCIP stack: FCP (FC-4) | FC Lower Layers | TCP | IP
iFCP
iFCP
• iFCP is a gateway-to-gateway protocol for implementing a Fibre Channel fabric over a TCP/IP transport
• Traffic between Fibre Channel devices is routed and switched by the TCP/IP network
• The iFCP layer maps Fibre Channel frames to a predetermined TCP connection for transport
• FC messaging and routing services are terminated at the gateways so the fabrics are not merged to one another
• Dynamically creates IP tunnels for FC frames
Ethernet Header | IP | TCP | iFCP | FCP | SCSI Data … | CRC Checksum
iFCP Approach
[Diagram: FC servers, FC tape libraries and FC JBODs attach to iFCP gateways around an IP network, with iSNS servers; device-to-device sessions run between the gateways.]
• iFCP provides F port to F port connectivity only
• IP services at individual device level
• IETF standards for routing, naming, security, QoS, CoS, discovery (iSNS)
FCIP
FCIP
• FCIP encapsulates FC frames within TCP/IP, allowing islands of FC SANs to be interconnected over an IP-based network
• TCP/IP is used as the underlying transport to provide congestion control and in-order delivery of FC frames
• All classes of FC frames are treated the same, as datagrams
• End-station addressing, address resolution, message routing, and other elements of the FC network architecture remain unchanged
• IP is introduced exclusively as a transport protocol for an inter-network bridging function
• IP is unaware of the Fibre Channel payload and the FC fabric is unaware of IP
Ethernet Header | IP | TCP | FCIP | FCP | SCSI Data … | CRC Checksum
FCIP Approach: IP Tunneling
[Diagram: two Fibre Channel SANs (FC servers, FC tape libraries, FC JBODs, FC switches) joined by an FCIP tunnel session across an IP network.]
• FCIP provides E port to E port connectivity
• IP services available at aggregated FC SAN level
iSCSI
iSCSI
• iSCSI is a SCSI transport protocol for mapping of block-oriented storage data over TCP/IP networks
• The iSCSI protocol enables universal access to storage devices and Storage Area Networks (SANs) over standard TCP/IP networks
Ethernet Header | IP | TCP | iSCSI | SCSI Data … | CRC Checksum
iSCSI, iFCP, FCIP
FCIP frame: Ethernet Header | IP | TCP | FCIP | FCP | SCSI Data … | CRC Checksum
iFCP frame: Ethernet Header | IP | TCP | iFCP | FCP | SCSI Data … | CRC Checksum
iSCSI frame: Ethernet Header | IP | TCP | iSCSI | SCSI Data … | CRC Checksum
iSCSI – Cont.
iSCSI (Internet SCSI) specifies a way to “encapsulate” SCSI commands in a TCP/IP network connection:
IP Header | TCP Header | iSCSI Header | SCSI commands and data
• IP Header: contains “routing” information so that the message can find its way through the network
• TCP Header: provides information necessary to guarantee delivery
• iSCSI Header: explains how to extract SCSI commands and data
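The nesting described on this slide can be sketched in a few lines of Python. This is a toy model only: the bracketed headers are placeholders rather than real IP, TCP or iSCSI wire formats, and the function names `encapsulate` and `decapsulate` are hypothetical.

```python
# Toy model of layered encapsulation: each layer prepends its own header.
# Header contents are placeholders, not real IP/TCP/iSCSI wire formats.
LAYERS = ["iSCSI", "TCP", "IP"]  # innermost header first


def encapsulate(scsi_payload: bytes) -> bytes:
    """Wrap SCSI commands/data in iSCSI, then TCP, then IP placeholder headers."""
    packet = scsi_payload
    for name in LAYERS:
        packet = f"[{name}]".encode("ascii") + packet
    return packet


def decapsulate(packet: bytes) -> bytes:
    """Strip the headers outermost-first and return the SCSI payload."""
    for name in reversed(LAYERS):
        header = f"[{name}]".encode("ascii")
        if not packet.startswith(header):
            raise ValueError(f"missing {name} header")
        packet = packet[len(header):]
    return packet


wire = encapsulate(b"SCSI WRITE + data")
print(wire)               # b'[IP][TCP][iSCSI]SCSI WRITE + data'
print(decapsulate(wire))  # b'SCSI WRITE + data'
```

The receiver peels the headers in the opposite order to the sender, which is why the iSCSI header is the last thing read before the SCSI commands and data.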
iSCSI Deployment
iSCSI Implementations
[Diagram: an iSCSI client and an iSCSI server on an IP network, with storage reached either as a native iSCSI device or through an iSCSI gateway and FC switch to disk.]
Storage Consolidation
[Diagram: NT servers on a LAN with separate RAID arrays and tape drives, compared with a switched SAN holding RAID (email), mission-critical RAID (Oracle, ERP DB) and a tape library.]
• Server and LAN bottlenecks
• Single points of failure
• Poor scalability (management overhead, resource inefficiencies)
• Tape drives => tape library
• Departmental => application-centric disk arrays
iSCSI Architecture
• Overview
  • Architectural Model
  • Features Beyond Parallel SCSI
  • Issues Beyond Parallel SCSI
iSCSI - Layered Model
Layer | Initiator I/O System | Target I/O System | Protocol | Unit exchanged
SCSI Application Layer | SCSI Application | SCSI Device Server | SCSI Application Protocol | SCSI CDB
iSCSI Protocol Layer | iSCSI Protocol Services | iSCSI Protocol Services | iSCSI Protocol (iSCSI session) | iSCSI PDU
TCP/IP | TCP/IP | TCP/IP | TCP/IP Protocol | TCP segments in IP datagrams
Ethernet | Data link + Physical | Data link + Physical | Ethernet | Ethernet frame
Interfaces between layers: Protocol Service Interface, iSCSI Transport Interface
• Replaces shared bus with switched fabric
• Transparently encapsulates SCSI CDBs
• Unlimited target and initiator connectivity
iSCSI Sessions
[Diagram: an iSCSI host (iSCSI initiator) holds iSCSI sessions with iSCSI targets on an iSCSI device; each session is carried by TCP connections.]
• Session between initiator and target
  • One or more TCP connections per session
  • Login phase begins each connection
• Deliver SCSI commands in order
• Recover from lost connections
iSCSI Encapsulation
[Diagram: end users on an external network reach the data servers. A data server (SCSI initiator, iSCSI initiator) sends frames laid out as Ethernet Header | IP | TCP | iSCSI | SCSI | DATA | CRC across the IP network to an iSCSI target, which passes FC Header | SCSI | DATA | CRC to the SCSI target and LUNs on the Fibre Channel SAN.]
iSCSI Packet Order
[Diagram: packets numbered 1, 2 and 3 are shown leaving the data servers (SCSI initiator, iSCSI initiator), crossing the IP network, and arriving at the iSCSI target and the SCSI target with its LUNs on the Fibre Channel SAN.]
iSCSI Packet
Ethernet Header | IP | TCP | iSCSI | SCSI Data … | CRC Checksum
iSCSI Packet
Ethernet frame (sizes in octets): Preamble (8) | Destination Address (6) | Source Address (6) | Type (2) | Data: IP, TCP, iSCSI (46–1500 bytes) | FCS (4)
Well-known ports: 21 FTP, 23 Telnet, 25 SMTP, 80 http, 5003 iSCSI
TCP header: Source Port | Destination Port | Sequence Number | Acknowledgment Number | Offset | Reserved | U A P R S F | Window | Checksum | Urgent Pointer | Options and Padding
iSCSI encapsulated: Opcode | Opcode Specific Fields | Length of Data (after 40-byte header) | LUN or Opcode-specific fields | Initiator Task Tag | Opcode Specific Fields | Data Field …
iSCSI Commands
• SCSI Commands
  • Command phase
  • Optional data phase
  • Response phase
• iSCSI Commands
  • Binds command phase with associated data into iSCSI Protocol Data Unit (PDU)
iSCSI Architecture Features Beyond Parallel SCSI
• Sessions
  • Comprises one or more TCP connections used for failover and/or link aggregation
• Device sharing
  • Any host on the network can potentially use the same iSCSI device
• Device scalability
  • Hosts can connect to an effectively limitless number of iSCSI devices
iSCSI Architecture Issues Beyond Parallel SCSI
• Naming, addressing and discovery
• Security & data integrity
• Ordering and numbering
• Error handling/recovery
• Networking overhead
iSCSI Architecture Issues: Naming, Addressing & Discovery
• Parallel SCSI uses a simple naming, addressing and discovery (NAD) scheme:
  • Devices discovered by polling the bus
  • Devices given unique ID between 0 and 15
• iSCSI requires:
  • Internet addressing
  • Location-independent naming
    • operation beyond firewalls
    • multiple addresses to one target
    • multiple targets behind one address
    • 3rd party commands
  • Scalable discovery (poll the Internet??)
iSCSI Storage Device Discovery Process
1) Host driver requests available iSCSI targets from the SCSI router
2) SCSI router sends available iSCSI target names to host
3) Host logs into iSCSI targets that were received
4) SCSI router accepts the login and sends target identifiers to host (numbers)
5) Host queries targets for device information
6) Targets respond with device information
7) Host creates table of internal devices (/dev/…)
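The seven steps above can be walked through with in-memory stand-ins. This is a minimal sketch, not a real iSCSI driver: `ScsiRouter`, `discover`, the target names and the `/dev/sd*` naming are all hypothetical, chosen only to make the message flow concrete.

```python
# Sketch of the seven discovery steps with in-memory stand-ins.
# ScsiRouter, discover and all target/device names are hypothetical.
class ScsiRouter:
    def __init__(self, targets):
        self._targets = targets   # target name -> devices behind that target
        self._ids = {}            # target name -> numeric identifier

    def list_targets(self):                  # steps 1-2: request/send names
        return sorted(self._targets)

    def login(self, target_name):            # steps 3-4: login, get identifier
        if target_name not in self._targets:
            raise KeyError(target_name)
        return self._ids.setdefault(target_name, len(self._ids))

    def query_devices(self, target_name):    # steps 5-6: device information
        return list(self._targets[target_name])


def discover(router):
    """Step 7: build the host's table of device name -> (target id, device)."""
    table = {}
    for name in router.list_targets():
        target_id = router.login(name)
        for device in router.query_devices(name):
            table[f"/dev/sd{chr(ord('a') + len(table))}"] = (target_id, device)
    return table


router = ScsiRouter({"target-a": ["disk0", "disk1"], "target-b": ["disk2"]})
for dev, entry in discover(router).items():
    print(dev, entry)
```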
iSCSI Sequence
Initiator (iSCSI driver) and target over a single TCP session:
1) Establish normal TCP session (TCP port 5003)
2) Initiator: 0x03 Command, Login (Send Targets)
3) Target: 0x43 Login Response, Reject, Login Status 1. The text area carries the list of accessible target names. Keeps TCP session up. List of target names sent.
4) Initiator: 0x03 Command, Login
5) Target: 0x43 Login Response, with target drive mapping
Note on the target: this device has already initialized onto the Fibre Channel.
iSCSI Architecture Issues: Security Levels
• 0: None – ok in controlled environments
• 1: Initiator and target authentication
  • Prevents unauthorized access
• 2: Digests for header and data integrity
  • Protects against man-in-middle, insertion, modification and deletion
• 3: Encryption (IPSEC)
  • Protects against eavesdropping
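Level 2 can be illustrated with a small digest check. This is a sketch under assumptions: the slides do not name the digest algorithm, so CRC-32 from Python's `zlib` is used purely as a stand-in, and `add_digest` and `check_digest` are hypothetical helper names.

```python
import zlib


def add_digest(pdu: bytes) -> bytes:
    """Append a 4-byte CRC-32 digest (a stand-in for the iSCSI digest)."""
    return pdu + zlib.crc32(pdu).to_bytes(4, "big")


def check_digest(message: bytes) -> bool:
    """Recompute the digest over the PDU and compare with the trailer."""
    pdu, digest = message[:-4], message[-4:]
    return zlib.crc32(pdu).to_bytes(4, "big") == digest


sent = add_digest(b"WRITE block 42")
tampered = b"WRITE block 43" + sent[-4:]   # payload changed, old digest kept
print(check_digest(sent))      # True
print(check_digest(tampered))  # False
```

A digest of this kind flags data that changed in transit. Authentication (level 1) and encryption (level 3) are separate mechanisms layered alongside it.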
iSCSI Architecture Issues: Ordering & Numbering
• Unlike parallel SCSI, iSCSI PDUs may
  • Arrive out of order (by taking different routes)
  • Not arrive at all
• iSCSI requires
  • Command numbering
    • Ordered delivery over multiple connections
  • Status numbering
    • Detection of failed connections
  • Data sequencing
    • Detection of missing data PDUs
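Command numbering can be sketched as a small reorder buffer on the receiving side. This is an illustration of the idea only, not the iSCSI specification's algorithm: the class `CommandReorderer` and the choice to start numbering at 1 are assumptions.

```python
# Sketch: restore command order when PDUs arrive over several connections.
# CommandReorderer and the starting number are hypothetical.
class CommandReorderer:
    def __init__(self, first_number=1):
        self._next = first_number   # next command number to hand to SCSI
        self._pending = {}          # early arrivals, keyed by command number

    def receive(self, number, command):
        """Accept one command; return the commands now deliverable in order."""
        if number < self._next or number in self._pending:
            return []               # duplicate, e.g. a retransmission
        self._pending[number] = command
        ready = []
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next += 1
        return ready

    def missing(self):
        """Command numbers that leave a gap before the buffered commands."""
        if not self._pending:
            return []
        return [n for n in range(self._next, max(self._pending))
                if n not in self._pending]


r = CommandReorderer()
print(r.receive(2, "WRITE"))   # [] because command 1 has not arrived yet
print(r.missing())             # [1]
print(r.receive(1, "READ"))    # ['READ', 'WRITE']
```

The same gap check is what makes a missing data PDU or a failed connection detectable: a number that never arrives stays in the `missing()` list.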
iSCSI Architecture Issues: Error Handling & Recovery
• Parallel SCSI errors incur costly recovery:
  • Aborted commands; target, bus and host resets
  • OK, because bus errors are infrequent
• iSCSI errors will be more frequent
  • Link failures
  • TCP failures
  • Bad “middle box” (firewall, router)
  • Does the Internet have a “reset” option??
iSCSI Architecture Issues: Networking Overhead
• Software iSCSI can achieve near GbE wire speed – but at 100% CPU
• Traditional TCP stacks are expensive
  • multiple memory copies
  • too many interrupts
  • checksum calculations
• We need TCP offload engines (TOE)
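To see why checksum calculation shows up on this list, here is a sketch of the 16-bit one's-complement Internet checksum in the style of RFC 1071, which TCP and IP use. The function name `internet_checksum` is a label chosen here. The point is that software must touch every byte of every segment, which is work a TOE can take over.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                          # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]    # every byte is visited
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(sample)
print(hex(checksum))                                          # 0x220d
print(internet_checksum(sample + checksum.to_bytes(2, "big")))  # 0
```

The second call shows the receiver's check: summing the data together with its checksum yields zero when nothing was corrupted.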
iSCSI - TCP Offload
Ethernet Header | IP | TCP | iSCSI | SCSI Data | CRC
• Ethernet frame requires additional CPU processing
• Headers must be stripped
• Packets ordered
• Data copied into memory buffers
• CRC checked
iSCSI Architecture Issues: Networking
TOE
• The challenge rests on the TOE vendor
  • Interrupt host on command boundaries
  • Offer zero-copy from NIC to app
  • Eliminate TCP reassembly buffer
    • Provides true zero-copy
    • Requires RDMA or synchronization
• Proposed IETF solutions for framing
  • WARP - an RDMA mechanism
  • Markers - a synchronization mechanism
What’s Next for iSCSI




CRC
SLP (Service Location Protocol)
Authentication
Encryption
Conclusions
Conclusions






IP-based storage will proliferate
Benefits are strong
Significant players
Clear need
Standards will be established
Work with industry leaders
Backup
iSNS
• iSNS (Internet Storage Name Service)
• Provides registration and discovery of SCSI devices and Fibre Channel-based devices
• In IP-based storage like iSCSI, end devices register with iSNS
• In iFCP, Fibre Channel-based storage end devices are registered with iSNS by an iFCP gateway
iSNS Operation
[Diagram: an iSNS server on the IP network serves two iFCP portals. The local iFCP portal (IP address 10.1.2.3) fronts FC network 1 with Server_1, N_port ID #24. The remote iFCP portal (IP address 10.1.2.4) fronts FC network 2 with Server_2, N_port ID #24.]
Problem: Two identical N_port IDs
Solution: Create new ID (based on IP address + N_port ID) = 2422
Tracing an iSCSI Block I/O
Server stack: Database Application | Operating System (Database System, Raw Partition Manager, File System, Volume Manager) | SCSI Device Driver | iSCSI Device Driver Layer | TCP/IP stack | Network Interface Card
iSCSI Appliance stack: Network Interface Card | TCP/IP stack | iSCSI Device Driver Layer | SCSI Device Driver | RAID Host Bus Adapter | Storage I/O Bus | iSCSI Appliance Storage
Labels: (1) application file I/O requests; (2) device-specific requests to TCP/IP network; block I/O / data / storage location
Challenge 1 - TCP Overhead
Consider a SCSI WRITE command. How many times do you think the data is copied before eventually reaching the target HBA?
Linux Host System: Application | File System | Buffer Cache | SCSI Subsystem | iSCSI Host Driver | TCP/IP | Ethernet Driver | Ether
Linux Target System: Ether | Ethernet Driver | TCP/IP | iSCSI Target Driver | Bridging Software | Block Device Driver | HBA
Host: Application –copy-> Buffer Cache –copy-> TCP/IP –DMA-> Ether (2 copies, 1 DMA)
Target: Ether –DMA-> Ring Buffer –copy-> TCP/IP –copy-> Bridge –DMA-> HBA (2 copies, 2 DMA)
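The answer to the slide's question can be tallied directly from the two paths it lists. The sketch below only encodes those hops as data and counts them; the names `HOST_PATH`, `TARGET_PATH` and `tally` are introduced here for illustration.

```python
# Tally of the data movements listed on the slide for one SCSI WRITE.
HOST_PATH = [("Application", "Buffer Cache", "copy"),
             ("Buffer Cache", "TCP/IP", "copy"),
             ("TCP/IP", "Ether", "DMA")]
TARGET_PATH = [("Ether", "Ring Buffer", "DMA"),
               ("Ring Buffer", "TCP/IP", "copy"),
               ("TCP/IP", "Bridge", "copy"),
               ("Bridge", "HBA", "DMA")]


def tally(path):
    """Count CPU copies and DMA transfers along one path."""
    counts = {"copy": 0, "DMA": 0}
    for _src, _dst, kind in path:
        counts[kind] += 1
    return counts


host, target = tally(HOST_PATH), tally(TARGET_PATH)
total = {kind: host[kind] + target[kind] for kind in host}
print(host)    # {'copy': 2, 'DMA': 1}
print(target)  # {'copy': 2, 'DMA': 2}
print(total)   # {'copy': 4, 'DMA': 3}
```

End to end, the data is copied by a CPU four times and moved by DMA three times before it reaches the target HBA.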
TCP Overhead (2)
• TCP Processing
  • Every TCP connection that is part of an iSCSI session has processing overhead potential
    • Connection setup / teardown
    • TCP state machine: acknowledge, timeout, retransmission
    • Window management
    • Congestion control
    • TCP segmentation
    • IP fragmentation
    • Checksum calculations
  • Partial or complete TCP offload mechanisms are assumed to be required to make iSCSI performance comparable to FC
Challenge #2 – Framing
• Message Boundaries (The Framing - HW-Issue)
  • iSCSI messages have no alignment relationship with TCP segments
  • And TCP does not have a “built in mechanism” for signaling message boundaries
    • The IETF considered leveraging the urgent pointer for some time
  • So how can an iSCSI adapter determine where a message begins and ends??
    • By reading the length field in the iSCSI header
    • Determines where in byte stream current message ends and next begins
    • NIC must stay “in sync” with beginning of byte stream
    • Works well in a perfect world (Maybe a SAN or LAN ????)
    • In a MAN/WAN we have issues
      • IP fragmentation leading to out-of-order packet delivery and/or packet loss
      • Any “middle box” may fragment an IP packet, sending each fragment along potentially different routes
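Length-based framing can be sketched as follows. The 8-byte header used here (1-byte opcode, 3 pad bytes, 4-byte data length) is a hypothetical simplification, not the real iSCSI header layout, and `Deframer` and `build_message` are illustrative names. The sketch assumes the bytes arrive in order, which is exactly the "perfect world" case.

```python
import struct

# Hypothetical simplified header: 1-byte opcode, 3 pad bytes, 4-byte length.
# This is NOT the real iSCSI header; it only illustrates length-based framing.
HEADER = struct.Struct(">B3xI")


def build_message(opcode: int, data: bytes) -> bytes:
    return HEADER.pack(opcode, len(data)) + data


class Deframer:
    """Recover message boundaries from an in-order TCP-style byte stream."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes):
        """Add received bytes; return the complete (opcode, data) messages."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= HEADER.size:
            opcode, length = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break                      # wait for the rest of this message
            messages.append((opcode, bytes(self._buffer[HEADER.size:end])))
            del self._buffer[:end]         # the next header starts right here
        return messages


stream = build_message(1, b"abc") + build_message(2, b"defgh")
d = Deframer()
print(d.feed(stream[:10]))   # [] because the first message is incomplete
print(d.feed(stream[10:]))   # [(1, b'abc'), (2, b'defgh')]
```

Each boundary is computed from the previous header, so a receiver that loses its place in the byte stream cannot find the next header on its own. That is the "in sync" requirement on the slide.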
Framing (2)
• Message Boundaries Continued
  • THE SCENARIO:
    • An iSCSI header is not received when expected because the TCP segment that it was part of was delivered out of order
  • THE ISSUE:
    • The receiver does not know where to put the trailing data packets until the packet with the header arrives
  • The different options?
    • Drop all packets until the header arrives
      • They will be retransmitted
    • Buffer packets until the header arrives. Then “re-assemble.”
      • On a 1 Gbit WAN link, 16MB of buffer memory is required per TCP connection
      • On a 10 Gbit WAN link, 125MB of buffer memory is required per TCP connection
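Buffer sizes of this order follow from the bandwidth-delay product, the amount of data in flight during one round trip. The slide does not state a round-trip time, so the 100 ms used below is an assumption. It reproduces the 125MB figure for 10 Gbit and gives 12.5MB for 1 Gbit, so the slide's 16MB may reflect rounding up or a slightly longer round trip.

```python
def reassembly_buffer_bytes(link_bits_per_second: float,
                            round_trip_seconds: float) -> float:
    """Bandwidth-delay product: bytes in flight during one round trip."""
    return link_bits_per_second * round_trip_seconds / 8


RTT = 0.100  # assumed 100 ms WAN round trip; the slide does not state one

for gbit in (1, 10):
    megabytes = reassembly_buffer_bytes(gbit * 1e9, RTT) / 1e6
    print(f"{gbit} Gbit/s -> {megabytes:.1f} MB")
# 1 Gbit/s -> 12.5 MB
# 10 Gbit/s -> 125.0 MB
```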
Framing (3)
• Message Boundaries Continued
  • THE BAD NEWS:
    • Dropping packets greatly impacts performance and significantly increases network congestion
    • Local buffering is expensive and NIC logic is complex
Intro – SAN View
Storage Management & Apps
Hosts
Infrastructure
Targets
SAN Components
• Server Platforms:
  • Fibre Channel Host Bus Adapters
  • IP Storage NICs (SNICs)
  • SAN Software
• Storage Platforms:
  • RAID subsystems
  • JBOD
  • Tape subsystems
• SAN Interconnect:
  • Fibre Channel hubs and switches
  • IP Storage switches
  • SAN-to-SCSI bridges
  • MAN and WAN gateways
SAN, NAS, iSCSI Comparison
[Diagram: side-by-side I/O stacks (Application, File System, Volume Manager, device drivers, TCP/IP stack, adapters) for DAS, SAN, iSCSI Appliance, iSCSI Gateway and NAS. DAS, SAN and iSCSI carry block I/O over SCSI, FC and IP respectively; NAS carries file I/O over IP using an I/O redirector and NFS/CIFS. The iSCSI gateway connects onward to an FC switch.]
Potential Outcomes and Success Probability
I/O Adapters “Data Movers”
Intel and other vendors will have ONE Ethernet Wire for ALL Storage & LAN Traffic
[Diagram: I/O block data and LAN data share one GbE port.]
Storage Functions/Applications
• Current Functions/Applications
  • Storage Consolidation
  • Tape Backup
  • Clustering
  • Replication
  • Disaster Recovery
• New Capabilities with IP Storage
  • SAN Extension
  • QoS
  • Security
LAN-free Tape Backup
[Diagram: users, servers, RAID, SAN switch, SAN bridge and tape subsystem.]
SAN Advantages for LAN-free Tape Backup:
• Removes backup traffic from the LAN
• Tape becomes SAN shared resource
• High performance SAN infrastructure
• SCSI attached via SAN bridge
Remote Backup Application
[Diagram: NT servers with HBAs on a LAN, RAID (email), mission-critical RAID (Oracle, ERP DB) and a tape library, connected to iSCSI servers and a second tape library. Link legend: GE, 10GE (iSCSI, iFCP); Fibre Channel; SCSI.]
• Allows customers to move archiving off-site for higher disaster protection
Backup Server:
• Veritas Shared Storage Option
• Tivoli Storage Manager
Server Clustering
[Diagram: users, clustered servers linked by a heartbeat, a SAN switch and shared RAID.]
SAN Advantages for server clustering:
• Server access to common storage resources
• Data remains accessible after the failure of a single server
• Scalable to > 30 servers in a cluster
• Simplified storage resource management
SAN Extension: Replication over WAN
[Diagram: NT servers with HBAs on a LAN, RAID (email) and a tape library, linked across an IP WAN to iSCSI servers, RAID and a second tape library. Link legend: GE, 10GE (iSCSI, iFCP); Fibre Channel; SCSI; IP WAN link (OC-3, T1, etc).]
• Unified management of data center and WAN storage routers
• Not vulnerable to disruption at a local SAN
• Leverage current infrastructure
• Expandable to iSCSI devices
TCP/IP Layers
OSI Model | TCP/IP Protocols | TCP/IP layers
7, 6, 5 | FTP, Telnet, HTTP, SNMP, TFTP | Process layer
4 | TCP (connection oriented), UDP (connectionless) | Host to host layer
3 | IP | Internet layer
2, 1 | LAN/WAN: Ethernet, token ring, ATM, Frame Relay, FDDI | Network access layer