Federated DAFS: Scalable Cluster-based Direct Access File Servers
Murali Rangarajan, Suresh Gopalakrishnan, Ashok Arumugam, Rabita Sarker (Rutgers University)
Liviu Iftode (University of Maryland)
SAN-2, Disco Lab

Network File Servers
- Clients reach the file server (e.g., NFS) over TCP/IP.
- OS involvement increases latency and overhead: TCP/UDP protocol processing and memory-to-memory copying.

User-level Memory-Mapped Communication
- The application has direct access to the network interface; sends and receives bypass the OS.
- The OS is involved only in connection setup, to ensure protection.
- Performance benefits: zero-copy, low overhead.

Virtual Interface Architecture
- Setup and memory registration go through the kernel (kernel agent); data transfer happens from user space through the VI Provider Library, using per-VI send, receive, and completion queues on the VI NIC.
- Communication models: Send/Receive (a pair of descriptor queues) and Remote DMA (no receive operation required).

Direct Access File System Model
- The DAFS client is linked into the application (file-access API over user-level VIPL); the DAFS file server uses VIPL in user space and KVIPL in its kernel driver, with VI NICs moving data directly between application buffers and server buffers.

Goal: High-performance DAFS Server
- Cluster-based DAFS server: direct access to network-attached storage distributed across a server cluster.
- Clusters of commodity computers give good performance at low cost.
- User-level communication for server clustering: a low-overhead mechanism and a lightweight protocol for file access across the cluster.

Outline
- Portable DAFS client and server implementation
- Clustering DAFS servers: Federated DAFS
- Performance evaluation

User-space DAFS Implementation
- DAFS client and server both run in user space; requests and responses travel over the VI network to the server and its local file system.
- DAFS API primitives translate to RPCs on the server.
- Staged event-driven architecture.
- Portable across Linux, FreeBSD, and Solaris.
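The deck gives no code for this request/response path. As a rough illustration of how a user-space client library might translate a DAFS API primitive into an RPC on the server, here is a minimal Python sketch; the opcodes, header layout, and `ToyDafsServer` are invented for illustration (the real system posts such descriptors on VIA queues via VIPL, which is not modeled here):

```python
import struct

# Hypothetical opcodes for a DAFS-style request/response protocol.
OP_READ, OP_WRITE = 1, 2

def marshal_request(op, fd, offset, length):
    """Pack a fixed-size request header, as a client library might
    before posting it on a send queue."""
    return struct.pack("!IIQI", op, fd, offset, length)

def unmarshal_request(buf):
    op, fd, offset, length = struct.unpack("!IIQI", buf)
    return {"op": op, "fd": fd, "offset": offset, "length": length}

class ToyDafsServer:
    """Stands in for a server-side protocol thread: decode a request,
    perform the file operation, return a response."""
    def __init__(self, files):
        self.files = files  # fd -> file contents (bytes)

    def handle(self, buf):
        req = unmarshal_request(buf)
        if req["op"] == OP_READ:
            data = self.files[req["fd"]]
            return data[req["offset"]:req["offset"] + req["length"]]
        raise ValueError("unsupported op")

def dafs_read(server, fd, offset, length):
    """Client-side primitive: one marshalled request, one response."""
    return server.handle(marshal_request(OP_READ, fd, offset, length))
```

In the actual system the response for `dafs_read` arrives via an emulation of RDMA Read rather than an in-process call, but the marshalling step is the same idea.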
DAFS Server
- A connection manager accepts client connections; protocol threads service DAFS API requests and send responses.

Client-Server Communication
- A VI channel is established at client initialization.
- VIA Send/Receive is used for everything except dafs_read.
- Zero-copy data transfers: an emulation of RDMA Read is used for dafs_read, and scatter/gather I/O is used in dafs_write.

Asynchronous I/O Implementation
- Applications use I/O descriptors to submit asynchronous read/write requests.
- A read/write call returns immediately to the application.
- The result is stored in the I/O descriptor on completion.
- Applications use I/O descriptors to wait or poll for completion.

Benefits of Clustering
- From a standalone DAFS server, to a single DAFS server serving many clients, to DAFS servers clustered on multiple nodes: each cluster node runs a DAFS server over a clustering layer and its local file system, and clients connect over VI.

Clustering DAFS Servers Using FedFS
- Federated File System (FedFS): a federation of the local file systems on the cluster nodes, with file I/O carried by FedFS over the SAN.
- Extends the benefits of DAFS to cluster-based servers.
- Low-overhead protocol over the SAN.

FedFS Goals
- A global name space across the cluster, created dynamically for each distributed application.
- Load balancing.
- Dynamic reconfiguration.

Virtual Directory (VD)
- A VD (e.g., /usr) is the union of all local directories with the same pathname.
- Each VD is mapped to a manager node, determined using a hash function on the pathname.
- The manager constructs and maintains the VD.
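The deck says only "hash function on pathname" without specifying one. A minimal sketch of the two VD ideas, where SHA-1 mod N is an illustrative choice of hash (not necessarily what FedFS used) and the listing structures are invented:

```python
import hashlib

def manager_node(pathname, nodes):
    """Map a pathname to the node that manages its virtual directory.
    Any stable hash works; SHA-1 mod N is used here for illustration."""
    digest = hashlib.sha1(pathname.encode()).digest()
    return nodes[int.from_bytes(digest, "big") % len(nodes)]

def virtual_directory(pathname, per_node_listings):
    """A VD is the union of the local directories with the same
    pathname across all nodes (a 'dirmerge' in miniature).
    per_node_listings: node -> {pathname: [entries]}."""
    merged = set()
    for listing in per_node_listings.values():
        merged |= set(listing.get(pathname, []))
    return sorted(merged)
```

Because every node computes the same hash over the same pathname, all nodes agree on the manager without any coordination traffic.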
Constructing a VD
- A VD is constructed on first access to the directory: the manager performs a dirmerge to merge the real directory information on the cluster nodes into a VD.
- A summary of real directory information is generated and exchanged at initialization.
- The VD is cached in memory and updated on directory-modifying operations.

File Access in FedFS
- Each file is mapped to a manager node, determined using a hash on the pathname; the manager maintains information about the file.
- To access a file, a server asks the manager for the location (home) of the file, then accesses the file from its home over the VI network.

Optimizing File Access
- A Directory Table (DT) caches file information after the first lookup, giving a cache of the name space distributed across the cluster.
- A block-level in-memory data cache holds data blocks cached on first access, with LRU replacement.

Communication in FedFS
- Two VI channels between any pair of server nodes: Send/Receive for request/response, and RDMA exclusively for data transfer (responses that carry data).
- Descriptors and buffers are registered at initialization.

Performance Evaluation
- Setup: applications with DAFS clients connect over the VI network to DAFS servers running FedFS over their local file systems.

Experimental Platform
- Eight-node server cluster: 800 MHz Pentium III, 512 MB SDRAM, 9 GB 10K RPM SCSI disks.
- Clients: dual-processor 300 MHz Pentium II, 512 MB SDRAM.
- Linux 2.4; servers and clients equipped with Emulex cLAN adapters; 32-port Emulex switch in full-bandwidth configuration.

SAN Performance Characteristics
- VIA latency and bandwidth, using poll for the latency measurement and wait for the bandwidth measurement:
  Packet Size (Bytes)   Roundtrip Latency (µs)   Bandwidth (MB/s)
  256                   23.3                     56
  512                   27.3                     85
  1024                  36.9                     108
  2048                  56.0                     109
  4096                  91.2                     110

Workloads
- Postmark, a synthetic benchmark: short-lived small files and a mix of metadata-intensive operations.
- Benchmark outline: create a pool of files; perform transactions (a READ or WRITE paired with a CREATE or DELETE); delete the created files.

Workload Details
- Each client performs 30,000 transactions; each transaction is a READ paired with a CREATE or DELETE.
- READ = open, read, close; CREATE = open, write, close; DELETE = unlink.
- Multiple clients are used to reach maximum throughput.
- Clients distribute requests to servers using a hash function on pathnames.

Base Case (Single Server)
- Maximum throughput: 5075 transactions/second.
- Average time per transaction: ~200 µs at the client, ~100 µs on the server.

Postmark Throughput
- Figure: throughput (txns/sec) versus number of servers (1 to 8) for file sizes of 2 KB, 4 KB, 8 KB, and 16 KB.

  # Servers   Speedup
  2           1.75
  4           3
  8           5

FedFS Overheads
- Files are physically placed on the node that receives the client's requests, so only metadata operations may involve communication: the first open(file), and delete(file).
- Observed communication overhead: an average of one roundtrip message among servers per transaction.

Other Workloads
- A harder case: no client request is sent to the file's correct location, all files are created outside Federated DAFS, and transactions are READ-only (open, read, close), so communication overhead can potentially increase.
- An optimized coherence protocol minimizes communication, avoiding communication at open and close in the common case.
- Data caching helps reduce the frequency of communication for remote data access.

Postmark Read Throughput
- Each transaction = READ.
- Figure: throughput (txns/sec) on 2 and 4 servers for Federated DAFS with and without caching.
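The gap between the two curves comes from the block-level data cache with LRU replacement described under "Optimizing File Access". A minimal sketch of that cache; the class name, block-keyed interface, and `fetch_remote` callback are illustrative, not FedFS's actual API:

```python
from collections import OrderedDict

class BlockCache:
    """Block-level in-memory data cache with LRU replacement:
    blocks are cached on first access and the least recently
    used block is evicted when the cache is full."""
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # (path, block_no) -> bytes
        self.hits = self.misses = 0

    def read_block(self, path, block_no, fetch_remote):
        key = (path, block_no)
        if key in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(key)      # mark most recently used
            return self.blocks[key]
        self.misses += 1
        data = fetch_remote(path, block_no)   # e.g., RDMA from the home node
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict least recently used
        return data
```

Every miss stands in for one remote fetch over the SAN, which is exactly the traffic the cached curve avoids.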
Communication Overhead Without Caching
- Without caching, each read results in a remote fetch.
- Each remote fetch costs ~65 µs: a request message (< 256 B) plus a response message (4096 B).

  # Servers   # Clients for Max. Throughput   # Transactions   # Remote Reads on Each Server
  2           10                              300,000          150,000
  4           20                              600,000          150,000

Work in Progress
- Study other application workloads.
- Optimized coherence protocols to minimize communication in Federated DAFS.
- File migration, to alleviate performance degradation from communication overheads and to balance load.
- Dynamic reconfiguration of the cluster.
- Study DAFS over a wide-area network.

Conclusions
- Efficient user-level DAFS implementation.
- Low-overhead user-level communication is used to provide a lightweight clustering protocol (FedFS).
- Federated DAFS minimizes overheads by reducing communication among server nodes in the cluster.
- Speedups of 3 on 4-node and 5 on 8-node clusters demonstrated using Federated DAFS.

Thanks
Distributed Computing Laboratory
http://discolab.rutgers.edu

Backup: DAFS Performance
- Figure: Postmark throughput (txns/sec) versus number of servers for 4 KB files.
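As a closing arithmetic check on the reported scaling, the speedups quoted in the deck (1.75 on 2 nodes, 3 on 4, 5 on 8) imply the following parallel efficiencies; the efficiency figures are computed here, not stated in the deck:

```python
# Speedups reported in the deck, keyed by node count.
speedups = {2: 1.75, 4: 3.0, 8: 5.0}

# Parallel efficiency = speedup / number of nodes.
efficiency = {n: s / n for n, s in speedups.items()}
# -> {2: 0.875, 4: 0.75, 8: 0.625}
```

Efficiency falling from ~88% at 2 nodes to ~63% at 8 is consistent with the per-transaction roundtrip message among servers noted under "FedFS Overheads".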