Storage Area Network
Baoquan Zhang
Outline
• What is a SAN?
• Why SAN?
• What is a SAN composed of?
• SAN, NAS or DAS?
SAN
• Any high-performance network whose primary purpose is to enable
storage devices to communicate with computer systems and with
each other. *
• A high-speed network, an extension to the storage bus, allows the
establishment of direct connections between storage devices and
processors (servers). **
• A network that provides access to consolidated, block level data
storage. ***
*www.snia.org
**Khattar, Ravi Kumar, et al. Introduction to Storage Area Network, SAN. IBM Corporation, International
Technical Support Organization, 1999.
***https://en.wikipedia.org/wiki/Storage_area_network
Why SAN?
• Industry recognition: three-tier architecture
  • Presentation: Desktop (PC, NC)
  • Processing: Application Servers
  • Data Storage: Storage Devices
Why SAN?
Client/Server Computing (DAS)
• Limited data transmission distance
  • SCSI: 1.5m~25m
• Poor scalability
  • A disk must be added for each server
• Hard to share data (information islands)
  • Extra resources are spent copying and transmitting data
  • Servers may work with out-of-date data
[Figure labels: Clients, Client Access LAN, Application Servers A, B, …, Storage Devices, DAS, Information Island]
Why SAN?
Storage Area Network
• Universal storage connectivity
• Good scalability
  • Scale performance and capacity
• Relatively long-distance data transmission
  • IP: Internet-based long distance
  • FC: 15m~10km
  • IB: 15m~10km
• No-copy data sharing
  • Shared storage pool
[Figure labels: Clients, Client Access LAN, Application Servers, Storage Area Network, Storage]
Storage Model 1: Direct Attached Storage
• All storage stranded behind the server
• Proprietary access (vendor specific)
• Storage sharing creates CPU overhead
• Network burdened with disk I/O traffic
• Limited scalability and low performance
Storage Model 2: Fibre Channel SAN
• Replaces parallel SCSI transport
• SAN is DAS from the servers’ perspective
• Optimized for movement of data from server to disk or tape
• Facilitates storage clustering & LAN-free backup
• Typically does not use LAN protocols; relies on serial SCSI (SCSI-3)
[Figure labels: Server, SAN, Intranet, Internet]
Storage Model 2: FC SAN Limitations
• Creates a 3rd network (LAN, WAN, SAN)
• Pre-Gigabit Ethernet bandwidth assumptions
• Management nightmare
• Limited interoperability
• Minimal storage security
• Creates “SAN Islands”
Storage Model 3: IP-SAN
• Best features of Fibre Channel & IP networks
• Ease of configuration and management
• Servers used optimally
• Supports IP Quality of Service, error detection & prioritization
• Multiple server operating systems supported
• Maintains IT infrastructure, security & interoperability
[Figure labels: Data, Voice, Video and Storage over an IP network; SUN, NT 4.0, WIN 2000 and LINUX servers; SN 5420; Fibre Channel]
Active Disk with OSD Capability as an Example of Intelligent Storage Devices
• IP network attached
• More processing power and memory
Storage Area Network
• Server Architecture Based on SAN & NAS
• Network Protocol (FC-AL and SSA)
• Spatial Reuse
• Multiple Links and Switch Based Multiple FC-AL
[Figure labels: Host, Internet Connection, SAN, FC-AL]
Previous Research on SAN
• Efficient Protocol Design for FC-AL and SSA
• Emphasis on performance for future disks
• Built detailed simulation models for both FC-AL and SSA
• Supported by Seagate and IBM Storage Systems Division
• Scalable Streaming Video Servers based on SAN
• Co-funded a streaming video server company, Streaming21
• Many publications on streaming video servers and
streaming video delivery over Internet
Serial Storage Interfaces
• FC-AL
• SSA
• FC-TORN
• FC-AL3
• InfiniBand
Serial Storage Interfaces
• Fibre Channel
• FC-AL
• FC Switch
• Serial Storage Architecture (SSA)
  • Buffer insertion ring
  • Link-by-link flow control
  • Fairness algorithm
  • Independent links: spatial reuse
  • Fault tolerance against link failure
FC-AL Features
• Bandwidth: 100 MB/s
• Connectivity: 126 devices
• Connection Distance: 30m device to device (with copper) and 10km
(with Fiber Optics)
• Fault-Tolerance: CRC protected frames, dual port, hot plug connector
• Distributed switch logic
FC-AL Fairness Algorithm
• Based on an access window with a history variable ACCESS
• Default value of ACCESS is true
• When an L_Port wins the arbitration, it sets ACCESS to false
• Before opening a circuit, the winner sends out ARB(F0) to detect whether other L_Ports are also arbitrating
  • If it receives ARBx, other L_Ports are arbitrating
• When relinquishing the loop, the winner sends out:
  • ARB(F0) if other L_Ports are arbitrating, or
  • IDLE to trigger all L_Ports to reset ACCESS to true
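The access-window rules can be sketched as a small simulation. This is only an illustration under stated assumptions, not the FC-AL specification: every L_Port is assumed to always have data to send, the lowest-numbered arbitrating port is assumed to win each arbitration, and the function name `access_window_order` is hypothetical.

```python
def access_window_order(ports, rounds):
    """Return the order in which L_Ports win the loop under the access window."""
    access = {p: True for p in ports}  # default value of ACCESS is true
    order = []
    for _ in range(rounds):
        # Only ports whose ACCESS is still true may arbitrate in this window.
        arbitrating = [p for p in ports if access[p]]
        winner = min(arbitrating)  # assumption: lowest address has priority
        access[winner] = False     # the winner sets ACCESS to false
        order.append(winner)
        # The winner checks whether any other port is still arbitrating (ARBx).
        others = [p for p in arbitrating if p != winner]
        if not others:
            # Nobody else is waiting: IDLE resets ACCESS to true on all ports.
            access = {p: True for p in ports}
    return order


print(access_window_order([1, 2, 3], 6))
```

Under these assumptions a high-priority port cannot win twice in a row while lower-priority ports are still waiting, which is the point of the ACCESS variable.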
SSA Features
• 2-in and 2-out links per node (with 20 MB/s per link)
• Fair access
• Fault tolerance: a multiple-host configuration offers fault tolerance against host, link and adapter failures.
• Number of attachments: 126 for SSA
• Compact connectors: serial vs. parallel for SCSI
• Transmission distance: 25 m between devices with copper cables (2.5 km with fiber optic)
Spatial Reuse
• What is spatial reuse?
  • Concurrent non-overlapping transfers can each use the full link bandwidth
• Why is it important?
  • Throughput can scale up with more links and more non-overlapping transfers
  • Achieved throughput could be as low as the bandwidth of a single link
  • Device/data sharing may reduce the spatial reuse potential
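A rough worked example of spatial reuse follows. It is a sketch under several assumptions that are not from the slides: a unidirectional ring where link i leaves node i, transfers that overlap on a link share it equally, and source and destination always differ. The 20 MB/s figure is the SSA per-link rate quoted earlier; the function names are hypothetical.

```python
from collections import Counter


def links_used(src, dst, n):
    """Links traversed going clockwise from src to dst on an n-node ring."""
    links = []
    node = src
    while node != dst:
        links.append(node)       # link i is the one leaving node i
        node = (node + 1) % n
    return links


def aggregate_throughput(transfers, n, link_bw):
    """Sum of per-transfer rates when overlapping transfers share a link equally."""
    load = Counter()
    for src, dst in transfers:
        for link in links_used(src, dst, n):
            load[link] += 1
    total = 0.0
    for src, dst in transfers:
        # A transfer is limited by the most heavily shared link on its path.
        bottleneck = max(load[link] for link in links_used(src, dst, n))
        total += link_bw / bottleneck
    return total


# Two non-overlapping transfers each get a full link.
print(aggregate_throughput([(0, 1), (2, 3)], n=4, link_bw=20))
# Two transfers to the same node overlap on link 1 and share it.
print(aggregate_throughput([(0, 2), (1, 2)], n=4, link_bw=20))
```

The second case illustrates how sharing one device pulls the aggregate back down to the bandwidth of a single link.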
SSA SAT Fairness Algorithm
• Based on token passing and quota
• Forwarding frames have higher priority than originating frames
• Holding a token allows a node to switch the priority between the originating and forwarding traffic.
• Hold quota (a_quota): number of frames that can be originated when holding the SAT token.
• Idle quota (b_quota): number of frames that can be originated since a node last passed the SAT token while the channel is idle. In general, b_quota = 4 * a_quota.
Fairness vs. Channel Utilization
• How to define fairness?
• How to improve channel utilization?
• Starvation possible?
• Fairness+throughput
FC-TORN
• B_RDY (credit) is used to control the number of frames that can potentially be sent to a destination (disk).
• A SAT token based on one quota for each source (host or disk) controls the maximum number of frames sent by a source.
• B_RDY and B_RDY’ are used to produce fairness from sources to a destination.
SAN Components
Interconnects (the heart of a SAN)
• Cables
  • Connect the components with each other
• Adapters
  • Connect to devices and control the protocol
• Switches (Fabric)
  • Interconnect devices, increase bandwidth, reduce congestion and provide aggregate throughput
  • Provide simple NameServer services
• or Hub (Arbitrated Loop)
  • Share bandwidth
[Figure labels: Server Host, Adapter, Interconnects, Storage Array, Tapes, Hard Disks]
SAN Components
• Fibre Channel SAN or FC SAN
• IP Network SAN or IP SAN
• InfiniBand SAN or IB SAN
Note: The SNIA definition doesn't say that a SAN uses Fibre Channel or Ethernet or any other specific interconnect technology. A growing number of network technologies have architectural and physical properties that make them suitable for use in SANs.
Source: http://www.snia.org/education/storage_networking_primer/san/what_san#sthash.9cPWdUBs.dpuf
SAN Components
Fibre Channel
• Fibre Channel development started in 1988, with ANSI standard approval in 1994, to simplify the HIPPI (High Performance Parallel Interface) system.
• FC is a high-speed network technology (commonly running at 2, 4, 8 and 16 gigabits per second) primarily used to connect computer data storage. (32-gigabit and 128-gigabit speeds in 2016)
• FC is designed to combine the best of the I/O channel and networking.
  • Networking focuses on handling changes in configuration and load, and on addressing data to the proper destination.
  • An I/O channel focuses on performance: moving data with the least latency by using a rigorous and simple protocol.
  • FC maintains the speed and low overhead of a channel while adding the flexibility (through connectivity) and the longer distances that are characteristic of a network.
SAN Components
The Competition in High-End Storage Technology
• Throughput: FC 531.25 Mb/s; SSA 640 Mb/s
• Number of devices: FC unlimited; SSA up to 192 hot-swappable hard disks per system and up to 32 separate RAID arrays per adaptor
• Distance: FC 10 km; SSA 10 km (with arrays 25 meters apart)
• Upper-layer protocol: FC ATM, IP, FICON, SCSI; SSA SCSI-3
• Eventually the market chose FC over SSA (Serial Storage Architecture).
SAN Components
Fibre Channel Topologies:
FC-P2P: Point to point
• The easiest configuration
• The easiest to administer
• High-speed interconnect between two nodes
Possible Usage
• Between Central Processing Units
• From a workstation to a specialized graphics processor or simulation accelerator
• From a file server to a disk array
……
SAN Components
Fibre Channel Topologies:
FC-AL: Arbitrated Loop
1. First, arbitrate to win control of the loop.
2. Establish a point-to-point (virtual) connection.
3. The two nodes consume all of the loop’s bandwidth until the data transfer operation is complete.
Advantages
• Lower-cost alternative
• Supports up to 126 devices on a single loop
• ……
However, by 2007, FC-AL had become rare in server-to-storage communication.
SAN Components
Fibre Channel Topologies:
FC-SW: Switched Fabric
• Increased bandwidth
• Increased number of devices
• Scalable performance
• Maximum of 16 million devices
• FC-SW is the topology we deploy in a SAN.
• High cost: the switch is the most costly hardware device.
SAN Components
Fibre Channel SAN
• Server host with FC Host Bus Adapter
  • A unique World Wide Name (WWN)
• Cable: fibre or copper
  • Copper: 15m, 100 MB/s
  • Fibre: 10km, 2000 MB/s
• Fibre Channel switches
  • Directors: no single point of failure (high availability)
  • Switches: smaller, fixed-configuration, less redundant devices
SAN Components
Fibre Channel Protocol Layers
FC-4
FC-3
FC-2
FC-1
FC-0
SAN Components
Fibre Channel Layers
FC-0 Physical layer: describes the physical interface
• An analog interface to the transmitter
• A digital interface to the FC-1 layer
• The requirements for infrastructure
  • Transport media
  • Receiver hardware
  • ……
[Figure: example of options of FC-0 plants]
SAN Components
Fibre Channel Layers
FC-1 Encode/Decode Layer: describes the means of encoding/decoding user data
• 8b/10b encode/decode scheme
• 8b/10b encoding was proposed by Albert X. Widmer and Peter A. Franaszek of IBM Corporation in 1983.
• Minimizes errors by equalizing the number of 1’s and 0’s transmitted and not allowing more than 5 consecutive bits of the same value in a row.
• Allows special characters (such as K28.5) to be distinguished and simplifies byte and word alignment.
• Evening out the 1’s and 0’s allows for the design of relatively inexpensive transmitter/receiver circuitry.
SAN Components
FC-1 Encode/Decode Layer: Encode Process (example: K28.5)
• FC-2 byte notation: 0xBC (hexadecimal)
• FC-2 bit notation (bits 7 6 5 4 3 2 1 0): 1 0 1 1 1 1 0 0
• FC-1 un-encoded (bits H G F E D C B A): 1 0 1 1 1 1 0 0
• FC-1 reordered for Zxx.y notation: Z = K, xx = E D C B A = 1 1 1 0 0 (28), y = H G F = 1 0 1 (5), giving K28.5
• 5B/6B (previous running disparity negative): A B C D E i = 0 0 1 1 1 1
• 3B/4B: F G H j = 1 0 1 0
• FC-1 encoded: 001111 1010
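The Zxx.y naming and the balance properties of the encoded character can be checked with a few lines of Python. This is a sketch, not an 8b/10b encoder: it only derives the character name from the byte and inspects the ten-bit code word shown for K28.5. The helper names are hypothetical.

```python
def code_name(byte, special=False):
    """Name a byte in Zxx.y notation: xx from bits EDCBA, y from bits HGF."""
    xx = byte & 0x1F          # low five bits (E D C B A)
    y = (byte >> 5) & 0x07    # high three bits (H G F)
    return f"{'K' if special else 'D'}{xx}.{y}"


def disparity(bits):
    """Number of 1s minus number of 0s in a code word."""
    return bits.count("1") - bits.count("0")


def longest_run(bits):
    """Length of the longest run of identical consecutive bits."""
    best = run = 1
    for prev, cur in zip(bits, bits[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


# Code word for K28.5 with negative running disparity (abcdei fghj).
K28_5_NEG_RD = "0011111010"

print(code_name(0xBC, special=True))
print(disparity(K28_5_NEG_RD), longest_run(K28_5_NEG_RD))
```

The code word carries two more 1s than 0s, which is why the running disparity flips to positive after it is sent.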
SAN Components
Fibre Channel Layers
FC-2: Framing Protocol/Flow Control
• Transports data using frames
• Flow control
• Classes of service
SAN Components
• Frames are the basic packages used to encapsulate and transport data.
• Two types of frames
  • Data Frame
  • Link Control Frame
• A group of related frames transmitted in one direction constitutes a Sequence.
• Exchanges are groups of related Sequences.
Frame fields (from the frame diagram):
• SoF (start of frame): the “comma” and 3 bytes indicating the type of connection service
• Frame header (FH)
• Optional headers: Expiration Security Header, Network Header, Association Header, Device Header
• User data (not used in Link Control Frames)
• CRC: verifies the data integrity of the FH and payload
• EoF (end of frame): designates the end of the frame content and the validity of the frame’s content
SAN Components
• FC-2 controls the flow of frames between ports so that receiver buffers are not overrun.
• A buffer is maintained by the Sequence Initiator (transmitter) and is used to throttle the transmission of frames.
• There are two basic types of flow control.
  • End-to-end control, in N_port to N_port communications: the receiver responds to all valid frames it receives with an ACK frame.
  • Buffer-to-buffer control, between an N_port and a Fabric or between two N_ports in a point-to-point topology: each side is responsible for maintaining its own BB_Credit_Count.
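Buffer-to-buffer control can be illustrated with a toy credit counter. This is a simplified sketch, assuming the sender counts outstanding frames against a negotiated credit limit and that each R_RDY from the other side frees one credit; the class and method names are hypothetical.

```python
class BBCreditSender:
    """Toy model of one side's buffer-to-buffer credit accounting."""

    def __init__(self, bb_credit):
        self.bb_credit = bb_credit   # receive buffers advertised by the peer
        self.bb_credit_count = 0     # frames sent but not yet acknowledged by R_RDY

    def can_send(self):
        return self.bb_credit_count < self.bb_credit

    def send_frame(self):
        """Send one frame if a credit is available; return whether it was sent."""
        if not self.can_send():
            return False             # would overrun the receiver's buffers
        self.bb_credit_count += 1
        return True

    def receive_r_rdy(self):
        """The peer freed a buffer, so one outstanding frame is accounted for."""
        if self.bb_credit_count > 0:
            self.bb_credit_count -= 1


sender = BBCreditSender(bb_credit=2)
print(sender.send_frame(), sender.send_frame(), sender.send_frame())
sender.receive_r_rdy()
print(sender.send_frame())
```

The third send is refused until an R_RDY arrives, which is how the transmitter is throttled without the receiver ever dropping a frame.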
SAN Components
• FC-2 provides up to 5 Classes of Service (CoS). The different CoS represent different levels of delivery guarantee, bandwidth and connectivity.
Class 1
• Dedicated connection
• Remains active until closed
• R_RDY on Connect Request only
• Sustained, high-throughput transactions
SAN Components
Class 2
• Control on a frame-by-frame basis
• Allows interleaving of Sequences over a single connection from multiple N_ports
• An ACK for every frame, plus R_RDY
SAN Components
Class 3
• Provides a connectionless service with no acknowledgment
• No ACK; only R_RDY for link maintenance
SAN Components
Fibre Channel Layers
FC-3: Common Services
• The FC-3 level is not currently fully defined. The term “common services” means a service that would utilize multiple N_ports working together on a single node.
SAN Components
Fibre Channel Layers
FC-4: Upper Level Protocol Support
• The FC-4 level supports the mapping of Upper Level Protocols (ULP) onto Fibre Channel data structures.
  • SCSI (Small Computer Systems Interface)
  • IPI-3 (Intelligent Peripheral Interface-3)
  • HIPPI (High Performance Parallel Interface)
  • IP (Internet Protocol) - IEEE 802.2 (TCP/IP) data
  • ATM/AAL5 (ATM adaptation layer for computer data)
  • SBCCS (Single Byte Command Code Set)
• FC serves as a transport for ULPs by mapping the ULP messages (known as Information Units) into FC Sequences and/or Exchanges.
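The mapping of an Information Unit onto a Sequence can be pictured as cutting the message into frame payloads that share a sequence identifier. The sketch below is illustrative only: the 2112-byte payload limit is an assumption about the maximum frame data field, and the `Frame` fields and `to_sequence` function are hypothetical names rather than a real FC API.

```python
from dataclasses import dataclass

MAX_PAYLOAD = 2112  # assumed maximum data field size of one frame, in bytes


@dataclass
class Frame:
    seq_id: int     # identifies the Sequence this frame belongs to
    seq_cnt: int    # position of the frame within the Sequence
    payload: bytes


def to_sequence(information_unit, seq_id, max_payload=MAX_PAYLOAD):
    """Split one Information Unit into the frames of a single Sequence."""
    return [
        Frame(seq_id, count, information_unit[offset:offset + max_payload])
        for count, offset in enumerate(range(0, len(information_unit), max_payload))
    ]


frames = to_sequence(bytes(5000), seq_id=1)
print([(f.seq_cnt, len(f.payload)) for f in frames])
```

The receiver can rebuild the Information Unit by concatenating payloads in `seq_cnt` order.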
SAN Components
• IP over FC
Two kinds of Information Units
  • IP datagram: moves between nodes on networks using the IP protocol stack
  • ARP datagram: used during network configuration to map IP addresses to Media Access Control addresses (used for routing)
• A dedicated ARP server must be set up at a “well known” address
SAN Components
• IP over FC
[Figure: an IP packet with its Network Header is split across frames. The first frame carries a Frame Header, Network Header, Optional Header and Payload; each additional frame carries a Frame Header and Payload.]
SAN Components
• SCSI over Fibre Channel (predominant in FC SANs)
  • Generally, FCP stands for Fibre Channel Protocol for SCSI.*
  • The transport is accomplished by wrapping SCSI command, response, status and data blocks.
*Norman, David. "Fibre Channel Technology for Storage Area Networks."
SAN Components
• SCSI over Fibre Channel
[Figure labels: Read Example; Receive; Handle; Initiator FCP_Port; Target FCP_Port]*
*Norman, David. "Fibre Channel Technology for Storage Area Networks."
SAN Components
IP SAN
An IP SAN is a Storage Area Network that uses the iSCSI protocol to transfer block-level data over a network, generally Ethernet.
• Server host with iSCSI Host Bus Adapters (iSCSI HBAs)
  • iSCSI Node Names
• Cable: Ethernet
• Network switches: Ethernet
IB SAN
InfiniBand Network Architecture
InfiniBand is a network communications protocol that offers a switch-based fabric of point-to-point bi-directional serial links between processor nodes, as well as between processor nodes and input/output nodes, such as disks or storage.
• Higher throughput: 56Gb/s per server and storage connection, and soon 100Gb/s, compared to up to 40Gb Ethernet and Fibre Channel
• Lower latency: RDMA zero-copy networking reduces OS overhead so data can move through the network quickly
• Enhanced scalability: InfiniBand can accommodate theoretically unlimited-sized flat networks based on the same switch components simply by adding additional switches
• Higher CPU efficiency: data movement is offloaded from the CPU
InfiniBand Architecture
• EndNodes: servers and devices
• Link: copper and optical fibre*
• Switches: IBA switches
  • Switches establish a private, protected channel directly between the nodes.
• Adapters: Host Channel Adapter
  • Adapters perform data and message movement without CPU involvement, using RDMA and Send/Receive offloads.
  • The adapters are connected on one end to the CPU over a PCI Express interface and on the other to the InfiniBand subnet through InfiniBand network ports.
• Subnet Manager: route definition and subnet discovery
*A 1X fibre link has two optical fibres, one for each direction.
InfiniBand Architecture
IB Storage Stack
InfiniBand Architecture
IB Communication Stack
• A Consumer is a process with a virtual address space.
• A Consumer can have more than one QP.
• A QP (Queue Pair) is a Virtual Interface.
• A QP includes a Send Queue and a Receive Queue.
• QPs are the endpoints of a Channel.
• A Channel Adapter has up to 2^24 QPs.
• QPs are independent of each other.
• IB Message Transfer Semantics
  • Send/Receive: simply send and receive.
  • RDMA Read/Write: directly read and write virtual memory.
InfiniBand Architecture
IB Message Transfer Semantics: Send/Receive
Steps:
1. The Initiator puts the message in the SND (send queue).
2. The message is sent to the Target.
3. The Target receives the message.
4. The Target puts the message in the RCV (receive queue).
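The four Send/Receive steps can be mimicked with two in-memory queues. This is a toy model that assumes a lossless, instantaneous link; it does not use the real verbs API, and the `QueuePair` class and `send_receive` function are hypothetical names.

```python
from collections import deque


class QueuePair:
    """Toy queue pair: one send queue (SND) and one receive queue (RCV)."""

    def __init__(self):
        self.snd = deque()
        self.rcv = deque()


def send_receive(initiator, target, message):
    initiator.snd.append(message)        # 1. initiator puts the message in SND
    in_flight = initiator.snd.popleft()  # 2. the message is sent to the target
    received = in_flight                 # 3. the target receives the message
    target.rcv.append(received)          # 4. target puts the message in RCV
    return received


initiator_qp, target_qp = QueuePair(), QueuePair()
send_receive(initiator_qp, target_qp, b"hello")
print(list(target_qp.rcv), list(initiator_qp.snd))
```

Because each consumer owns its own queue pairs, several such exchanges can run side by side without interfering with one another.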
InfiniBand Architecture
IB Message Transfer Semantics: RDMA
Steps:
1. The application on the initiator registers a buffer and puts the send request in the SND.
2. The target receives the request and reads the data directly from the initiator's buffer.
3. The target returns a status to the initiator.
InfiniBand Architecture
Message
• InfiniBand Architecture is said to be message-oriented.
• A message can be any size up to 2^31 bytes.
• The InfiniBand hardware automatically segments the outbound message into a number of packets.
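The segmentation arithmetic is easy to sketch. The example below assumes a packet payload limit of 4096 bytes, the upper end of the payload range given for the IBA packet format; real links may negotiate a smaller limit. The function names are hypothetical.

```python
def packet_sizes(message_len, max_payload=4096):
    """Payload sizes of the packets produced for one non-empty message."""
    return [
        min(max_payload, message_len - offset)
        for offset in range(0, message_len, max_payload)
    ]


def packet_count(message_len, max_payload=4096):
    """Number of packets needed, using ceiling division."""
    return -(-message_len // max_payload)


print(packet_sizes(10000))
print(packet_count(2**31))
```

With these assumptions, even a maximum-size message of 2^31 bytes needs 2^19 packets, all produced by the hardware rather than by the application.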
Complete IBA Packet Format
• Local Routing Header: 8 bytes (intra-subnet)
• Global Routing Header: 40 bytes (inter-subnet)
• Base Transport Header: 12 bytes (tells endnodes what to do with packets)
• Extended Transport Header: 28 bytes
• Immediate Data: 4 bytes
• Message Payload: 0-4096 bytes
• Invariant CRC: 4 bytes
• Variant CRC: 2 bytes
InfiniBand Architecture
IB Verbs
• The InfiniBand architecture does not define APIs; it only provides the basis for specifying the APIs.
• A verb is a method by which an application requests an action from InfiniBand’s message transport service.
• Other organizations, such as the OpenFabrics Alliance, provide a complete set of APIs and software that implements the verbs to work seamlessly with the InfiniBand hardware.
InfiniBand Architecture
IB Upper Layer Protocols
The upper level protocols
• IPoIB: IP over IB
• SRP: SCSI RDMA Protocol
• SDP: Sockets Direct Protocol
• iSER: iSCSI Extensions for RDMA
[Figure: Linux InfiniBand software architecture]
InfiniBand Architecture
SRP Protocol
• SCSI RDMA Protocol (SRP) was defined by the ANSI T10 committee to provide block storage capabilities for the InfiniBand architecture.
• SRP is a protocol that tunnels SCSI request packets over InfiniBand hardware.
[Figure: Linux InfiniBand SRP protocol architecture]
SAN Components (Summary)
• Bandwidth
  • FC SAN: 100Mb (copper), 20Gb (fibre), 32Gb and 128Gb (coming)
  • IP SAN: 100Mb or 1Gb (Ethernet), 10Gb (10Gb Ethernet)
  • IB SAN: 120Gb (12X)
• Latency
  • FC SAN: dedicated to block I/O
  • IB SAN: direct connection, dedicated to block I/O
• Distance
  • FC SAN: 15m (copper), 20km (fibre)
  • IP SAN: Internet-based long distance
  • IB SAN: 125m (12X), 10km (1X)
• Cost
  • FC SAN: high
  • IP SAN: cheap
  • IB SAN: medium
SAN, NAS or DAS?
• SAN: more efficient block-level data access
• NAS: convenient data sharing in a homogeneous file system
• DAS: easy to implement and low cost
Acknowledgement
• Professor David Du taught me much of the basic knowledge on storage systems and provided this interesting topic.
• During the preparation of the presentation, Dr. Fenggang Wu helped me review the slides and gave me significant references.
References
• www.snia.org
• Khattar, Ravi Kumar, et al. Introduction to Storage Area Network, SAN. IBM Corporation, International Technical Support Organization, 1999.
• https://en.wikipedia.org/wiki/Storage_area_network
• https://en.wikipedia.org/wiki/Fibre_Channel#Fibre_Channel_topologies
• http://www.networkworld.com/article/2174282/lan-wan/fibre-channel-will-come-with32-gigabit--128-gigabit-speeds-in-2016.html
• https://www.pctechguide.com/interfaces/hard-disks-what-is-serial-storage-architecture
• https://en.wikipedia.org/wiki/Fibre_Channel_point-to-point
• Shanley, Tom, and Joe Winkles. InfiniBand Network Architecture. Addison-Wesley Professional, 2003.
• IP SAN Fundamentals: An Introduction to IP SANs and iSCSI
• Norman, David. "Fibre Channel Technology for Storage Area Networks."
• Grun, Paul. "Introduction to InfiniBand for End Users." White paper, InfiniBand Trade Association (2010).
Thank you