High-Performance Object Access in OSD Storage Subsystem
Yingping Lu

Outline
• OSD Overview
• Problem and common approaches
• Related work
• Initial Proposal
• Issues

Design Objectives of OSD
• Scalability (local area, enterprise, global)
• High performance (high throughput, low latency)
• Cross-platform
• High availability (resilient to device and machine failure)
• Support for permanent, mobile, and even disconnected clients
• Security (authentication, access control, encryption of transmitted and stored data)
• Data sharing
• Manageability?

Communication Entities:
• Client
• Metadata Manager
• OSD device

[Figure: regions of Metadata Managers connected over an IP network to laptop and desktop clients]

Communication Paths:
• Client to metadata server
• Client to OSD device
• Metadata to OSD device
• Metadata to Metadata

Problem
• Network bandwidth keeps increasing (10 Gb/s is on the way).
• OSD applications require high performance.
• How can object data be delivered efficiently between the OSD device and the client?

Potential Measures
Potential performance improvement measures
– Locality-based migration (reduces transmission time)
   – Migrate the data to a location closer to the client.
– Replication (reduces transmission time)
   – Replicate a copy within the client’s proximity.
   – Can replicate the data object or the metadata.
– Cache (reduces disk access time and transmission time)
   – Where: client, metadata server, object device, etc.
   – What: data object, metadata, locking.
   – How long: TTL, lease, renewal.

Performance Improvement Measures (cont.)
Improvement measures
– Aggregation (device grouping)
   – Improves the aggregate I/O throughput and reliability
   – Works like a RAID system
– Data path-based
   – Decouple the control path from the data path
   – Reduce the length of the critical path at the data access level.
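
The cache measure listed among the potential measures (where to cache, what to cache, and for how long) can be illustrated with a small client-side sketch. This is only one possible reading of "TTL, lease, renewal": the class name `LeaseCache`, its methods, and the injectable clock are assumptions made for illustration and are not part of the slides or of any OSD specification.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class LeaseCache:
    """Hypothetical client-side object cache; entries are valid only while leased."""

    def __init__(self, lease_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock  # injectable so the lease logic can be tested
        self._entries: Dict[str, Tuple[bytes, float]] = {}

    def put(self, object_id: str, data: bytes) -> None:
        # Store the object together with the time at which its lease expires.
        self._entries[object_id] = (data, self.clock() + self.lease_seconds)

    def get(self, object_id: str) -> Optional[bytes]:
        # Return cached data only while the lease is still valid.
        entry = self._entries.get(object_id)
        if entry is None:
            return None
        data, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[object_id]  # lease expired: drop the stale copy
            return None
        return data

    def renew(self, object_id: str) -> bool:
        # Extend a still-valid lease; an expired entry cannot be renewed.
        entry = self._entries.get(object_id)
        if entry is None or self.clock() >= entry[1]:
            self._entries.pop(object_id, None)
            return False
        self._entries[object_id] = (entry[0], self.clock() + self.lease_seconds)
        return True
```

In a real deployment the renewal would presumably involve a round trip to the metadata server or the OSD device, which is where the consistency constraint comes in; the sketch leaves that out.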

Performance Constraints
• Consistency (in updating, reconciliation)
• Locking and serialization
• Security
• Small data size access
• Crash recovery

Leveraging Data Access Path
• Streamline the end system
   – Zero copy/RDMA
   – User-level programming/OS bypass
   – TCP offloading
• Improve the transport system
   – Large window size
   – Explicit congestion notification
   – Selective acknowledgement
   – Connection splitting (mobile)
   – Explicit congestion control protocol (XCP)

What’s Wrong With End System
Streamlining end systems
– Problem: the end system cannot deliver the potential bandwidth to applications.
   – Memory copy
   – Context switching
   – Interrupt service
   – Checksum generation
   – Protocol processing

End System Overhead
Streamlining the end system
– Overhead
   Per packet
   – Protocol processing (execute code, allocate/release buffer)
   – Access control
   – Interrupt service time for each received packet
   – Kernel context switching
   Per byte
   – Checksum generation
   – Memory copy
   – Data transmission

Streamlining End System
Solutions
– RDMA (zero copy)
– One system-wide buffer pool
– User-level networking (bypassing the kernel)
– TCP offloading
– Jumbo packets
– Interrupt coalescing
– Scatter/gather list

Related work
Previous work:
– I/O Lite
– VI (Myrinet, Servernet)
– SDP
– InfiniBand
– SRP
– DAT (Direct Access Transport collaborative)
– DAFS (SNIA)
– NFS/RDMA (SNIA)
– RDMA over TCP/IP

I/O Lite
• Purpose: reduce memory copies
• Approach:
   – Maintain a global buffer pool in the system
   – Allow the application, IPC, file system, and network subsystem to share one copy of the data
• Pros:
   – Reduces memory copies
   – Useful for read-only buffers
• Cons:
   – The system must be rewritten
   – Buffer updates are difficult

RDMA
• Extends DMA semantics across the machine boundary
• Two operations: RDMA Read, RDMA Write
• Memory registration: memory needs to be “pinned”
• A descriptor carries the source address, destination address, and length
• Special hardware (NIC) handles the RDMA operation.
Pros:
– Zero copy
– Offloads CPU processing
Cons:
– Needs special hardware
– Needs reprogramming

Remote DMA Scenario
[Figure: Host A (CPU, Buffer A, RDMA engine on the NIC) and Host B (CPU, Buffer B, RDMA engine on the NIC), with steps numbered 1, 2, 3]

Virtual Interface Architecture (VIA)
• Goal: low latency and high throughput through direct access to the NIC and zero copy
• Programming abstraction: VI (queue pair)
• Components: consumer, VI provider (UA, KA, NIC)
• Operations: RDMA, Send/Receive
• Presents a standard for RDMA operations and the VI abstraction

InfiniBand
• An emerging I/O interconnect technology
• Decouples I/O from the CPU
• Adopts a serial, switch-based fabric
• Provides a unified communication mechanism (4 layers)
• Provides VI support (Verb, QP, RDMA, etc.)
• Implements the VI concept in a standard network

SCSI RDMA Protocol (SRP)
• Goal: provide SCSI access across the IB fabric
• Exploits IB RDMA to transfer SCSI data
• Enables SANs based on IB
• Targeted specifically at IB; not suitable for IP
• Block-level (SCSI) access (can it be object-level?)

DAFS and NFS/RDMA
• DAFS is being developed by the DAFS consortium
   – A lightweight file-sharing protocol for local data sharing
   – Leverages NFS 4.0
   – Exploits the RDMA mechanism to transfer file data.
• NFS/RDMA is being developed by the SNIA NFS/RDMA group
   – Enables NFS to exploit the new networking technologies (VIA, IB)
   – Makes changes to RPC/XDR to use RDMA semantics
   – Targets the local area environment

Socket Direct Protocol (SDP)
• Microsoft’s solution for the data center (2000)
• Retains the same socket programming interface
• Bypasses TCP/IP processing in the kernel
• Supports RDMA semantics
• Not routable; works within a data center or cluster

[Figure 1: WSD and SAN Architectural Model. Traditional model: socket application, WinSock API, TCP/IP/Sockets provider, TCP/IP transport driver, NDIS, NDIS driver, NIC. Winsock Direct model: socket application, WinSock API, Winsock Direct switch, SPI, TCP/IP/Sockets provider, SAN provider, SAN management driver, kernel-bypass-capable NIC. Legend: SAN provider modules, OS modules, NIC driver and hardware.]

RDMA over TCP/IP
• Developed by the RDMA Consortium
• Supports RDMA over TCP/IP networks
• Consists of three components: RDMAP, DDP, MPA
   – RDMAP: provides RDMA operations
   – DDP: direct data placement
   – MPA: handles framing
• SCTP: Stream Control Transmission Protocol
[Figure: protocol stack with ULP, RDMAP, DDP, MPA, SCTP, TCP, IP]

Summary
Link-level
– No routing info carried
– Relies on the underlying link-level switch to forward
– Restricted to data center and cluster environments
– Examples: VIA, InfiniBand, SRP, SDP, DAFS, NFS/RDMA
Transport-level
– Carries a TCP/IP header
– Can traverse an IP network
– Processes framing and direct data placement.

OSD Requirements
Direct delivery from object device
– Direct transmission between initiator and target device
– This is the critical data path
Secure delivery
– No secure channel is assumed; encryption of the transmitted object is necessary
QoS requirement
– An object may have specific QoS requirements
Mobile client
– A client may connect, disconnect, and connect again.
– Errors can occur during transmission

Initial Proposal: OSD/Secure RDMA
This is a ULP-based RDMA
– Leverages RDMA over TCP/IP
– Extends the communication to the IP network
OSD device initiates the RDMA request
Security-enabled RDMA
– The RDMA is tightly integrated with the OSD protocol
– The underlying transport supports security
QoS support
– Virtual Lane-type mechanism to provide QoS support

OSD/Secure RDMA Architecture
[Figure: an OSD client (application, buffers, OSD VIPL, VI NIC driver, NIC) and an OSD device (OSD controller, object manager, buffers, OSD VIPL, VI NIC driver, NIC, disk driver) connected over an IP network]

Protocol Stacks
• OSD/RDMA maps OSD to RDMA
• DDP provides direct data placement
• The underlying transport can be either SCTP or MPA
• Intelligent NIC with TCP
• IPSec is used as the security protocol (object encryption)
[Figure: protocol stack with OSD Consumer, OSD Protocol, OSD/RDMA, DDP, MPA, SCTP, TCP, IP/IPSec]

Data Access Case – Get an Object
[Figure: message sequence between OSD client and OSD device. 1*: request an object with object ID, credential, and descriptor. 2*: RDMA write (data packets), followed by RDMA write completion.]
1*:
• First obtain access permission and establish a session.
• Register memory
• Post a send request
2*:
• Validate the request.
• Register a memory buffer
• Fetch the object from disk or cache into the buffer
• Post an RDMA write request

Issues to be solved
Elaborate the OSD object transfer protocol.
– Should we simply consider SCSI/OSD?
– What would the new requirements be, e.g. security?
The integration of iSCSI over RDMA.
– The establishment of a session
   – OSD session/iSCSI session/RDMA connection/TCP connection
   – Sequence?
   – Persistent vs. transient?
– Define the format of the OSD/RDMA packet
   – Memory descriptor
   – Commands (login, logout, CMD)
   – Flow control

Issues
Integration of RDMA with OSD (cont.)
– Define a set of standard APIs for OSD/RDMA
   – Create a session
   – Register memory
   – Post a work queue element
   – Query status, etc.
Integration with security
– IPSec vs. SSL?
Handle QoS requirement
– QoS attributes: how to specify them in an object
– QoS assurance: credit-based flow control?
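
The "Get an Object" data access case can be walked through with a toy, in-process simulation. This is a sketch under loud assumptions: `MemoryDescriptor`, `OsdDevice`, and `get_object` are hypothetical names invented here, the "RDMA write" is just a copy into a Python `bytearray`, and no real RDMA, VIPL, or OSD API is used or implied.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class MemoryDescriptor:
    """Hypothetical handle for a registered (pinned) client buffer."""
    buffer: bytearray
    length: int


class OsdDevice:
    """Toy OSD device: validates a request, then writes into the client buffer."""

    def __init__(self, objects: Dict[str, bytes], valid_credentials: Set[str]):
        self.objects = objects
        self.valid_credentials = valid_credentials

    def handle_get(self, object_id: str, credential: str,
                   descriptor: MemoryDescriptor) -> int:
        # Step 2*: validate the request.
        if credential not in self.valid_credentials:
            raise PermissionError("credential rejected")
        # Step 2*: fetch the object (stands in for disk or cache access).
        data = self.objects[object_id]
        if len(data) > descriptor.length:
            raise ValueError("registered buffer too small")
        # Step 2*: the simulated RDMA write places data directly in the
        # client's registered buffer, with no intermediate copy on the client.
        descriptor.buffer[:len(data)] = data
        return len(data)  # stands in for the RDMA write completion


def get_object(device: OsdDevice, object_id: str, credential: str,
               max_len: int) -> bytes:
    # Step 1*: register memory and build the descriptor sent with the request.
    descriptor = MemoryDescriptor(bytearray(max_len), max_len)
    # Step 1*: post the request (object ID, credential, descriptor).
    written = device.handle_get(object_id, credential, descriptor)
    return bytes(descriptor.buffer[:written])
```

For example, `get_object(OsdDevice({"obj-42": b"hello"}, {"token-A"}), "obj-42", "token-A", 64)` returns the object bytes, while an unknown credential raises `PermissionError`. Session establishment, flow control, and encryption, which the open issues list as unresolved, are deliberately left out.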