Network Attached Storage
Jinfeng Yang
Oct/19/2015

Outline Part A
1. What is the Network Attached Storage (NAS)?
2. What are the applications of NAS?
3. The benefits of NAS.
4. NAS’s performance (Reliability & Fault Tolerance & Scalability, Data Security, and Data Migration).
5. The challenges of NAS. And how do we handle these problems?
6. Compare NAS and DAS (Directly Attached Storage).
7. Compare NAS and SAN (Storage Area Network).
   - File level storage vs Block level storage.
8. The future of NAS.

Outline Part B
9. Network File System definition
10. NFS Architecture
11. Network File System layer
12. NFS Protocol
13. NFS Design feature
14. Benefits of NFS
15. NFS version
16. New Technology RDMA (Remote Direct Memory Access Protocol)
17. NFS Security

1. What is the Network Attached Storage (NAS)?
NAS: a file-level storage system that sits on the local area network as a node. All clients and associated devices on the local area network can share file-based data over a standard Ethernet connection, without physically transferring files on an external storage device.

1. What is the Network Attached Storage (NAS)?
Local area network (LAN): a group of computers or associated devices that share a common communication channel.
Node: each computer or associated device on the LAN is an independent node, and each node can recognize other nodes and process transmissions from them. Each NAS device resides on the LAN as an independent node with its own IP address.

1. What is the Network Attached Storage (NAS)?
A NAS device usually has no keyboard or display; it is managed through a browser-based utility program (the utility is part of the device's operating system). In most cases, NAS devices contain one or more hard drives, run an embedded Linux system, and support USB and Ethernet connections.

2. What are the applications of NAS
1. In the home, NAS is often used for storing files and for automated backups.
2.
In the enterprise, NAS can be used as a backup target, for archiving, or for disaster recovery. Some higher-end NAS products hold enough disks to support RAID, which provides better performance and higher redundancy.
3. NAS is also widely used together with Facebook and YouTube. With NAS support, users can upload photos to their Facebook account or videos to YouTube directly through the web interface.

3. The benefits of NAS
1. Efficient and Reliable
NAS servers contain a streamlined operating system and hardware components, which makes the whole storage system more efficient. Moreover, the NAS hardware is physically separate from the network servers, which helps system reliability: even if a network server fails, the storage can still be accessed to store or retrieve data.

3. The benefits of NAS
2. Flexible
A NAS storage system lets multiple users access the same file simultaneously. The shared network resource can be allocated efficiently to users based on their specific needs. To increase storage capacity, a new expansion hard drive can be plugged directly into the NAS server.

3. The benefits of NAS
3. Easy to use
The process for using NAS is as follows:
a. Connect the NAS to your network
b. Assign an IP address
c. Specify the environment settings
d. Ready to use

3. The benefits of NAS
4. Data is protected
Most NAS devices support RAID (redundant array of independent disks), which stores the same data in different places, so even if one disk fails, users can still access their data.

4. NAS’s performance (Reliability, Fault Tolerance, Scalability, Data Security, and Data Migration)
NAS appliance: a type of hardware that has its own dedicated storage disks and RAID, and that can be extended when more storage capacity is required.
Vertical scalability: the ability to increase capacity by adding more memory or CPU.
Horizontal scalability: the ability to increase capacity by connecting multiple hardware and software entities.

4.
NAS’s performance (Reliability & Fault Tolerance & Scalability, Data Security, and Data Migration)
1. Reliability & Fault Tolerance & Scalability
When the NAS system needs more storage capacity, NAS appliances can be fitted with larger disks, which provides both vertical and horizontal scalability. The other way to increase storage capacity is to use a clustered NAS system, which is a distributed file system. With a clustered NAS system, users can access all files from any clustered node regardless of the physical location of the file. The number and location of the clustered nodes are transparent to users. Moreover, a clustered NAS system provides transparent replication and fault tolerance, so even if one or more nodes fail, the whole system keeps working without losing any data.

4. NAS’s performance (Reliability & Fault Tolerance & Scalability, Data Security, and Data Migration)
2. Data Security
NAS is a file-based storage technology. Users can protect their data directly with passwords or by encrypting their files. A password is the easiest way to restrict read and write access to a particular user. Encryption is the other way to restrict access: to decrypt an encrypted file, the user needs the key.

4. NAS’s performance (Reliability & Fault Tolerance & Scalability, Data Security, and Data Migration)
3. Data Migration
Data migration is the process of moving data between storage systems or computer systems. It is used when replacing or upgrading servers and when relocating or maintaining a data center. One software tool that supports Network Attached Storage data migration is “Data Dynamics StorageX”.

5. The challenges of NAS and Solution
1. Traditional NAS usually requires more hardware than is really needed, which results in server sprawl. Server sprawl is the situation in which servers take up more space and consume more resources than the workload really needs. Purchasing low-end servers can cause server sprawl.

5.
The challenges of NAS and Solution
2. Enterprises usually purchase storage arrays much larger than they need, to ensure there is enough space available for future expansion. However, if that expansion never occurs, or is smaller than expected, the purchased array space is wasted.

5. The challenges of NAS and Solution
3. In traditional network attached storage, there is a controller called a head, with a certain amount of hard disk capacity; sometimes there are two heads for redundancy. The heads are essentially fixed, while the storage associated with them can continue to grow. For better processing performance, the head must be upgraded; for more storage capacity, the disk drives must be upgraded; and both kinds of upgrade have limits.

5. The challenges of NAS and Solution
Solution: scale-out storage
1. Scale-out storage is a newer network attached storage architecture that can expand storage space as the system actually needs it, once a given array reaches its storage limit.
2. Scale-out NAS lets users add additional heads to improve processing performance, rather than being limited to a fixed head.
3. In scale-out storage, the clustered file system distributes data across the nodes, which spreads the data access load across more processors and more I/O connections.

6. NAS vs DAS
In Direct Attached Storage, storage devices are attached directly to the host system. To access file data stored in DAS, the user must have physical access to the device. The advantage of DAS is that it can give end users better performance than NAS. The disadvantage of DAS is that each storage device is managed separately, which makes the whole direct attached storage system complicated to manage.

7. NAS vs SAN
SAN and NAS seem similar, because both use network technology to let users easily access and manage their stored data. However, they are two different storage technologies.

7. NAS vs SAN
Difference 1.
In a SAN, high-performance storage is divided and allocated to individual servers. Users and applications can only access the storage through the server it is allocated to. Because the Fibre Channel protocol delivers SCSI commands between the server and the SAN's hard disk system, the SAN appears to servers as a hard disk. NAS, in contrast, is connected to all the desktops, workstations, and servers on the standard Ethernet. Because NAS provides data access to clients through a file system layer such as the Network File System (NFS), it appears to clients as a file server.

7. NAS vs SAN
Difference 2. A SAN uses its own network standard, such as Fibre Channel, with its own dedicated switches, cables, and protocol. NAS storage devices use the standard Ethernet network. Because NAS does not require a special Fibre Channel network, switches, and cables, it is much cheaper than a SAN storage system.

7. NAS vs SAN
Difference 3. The final difference, which is also the key distinction between SAN and NAS, is that a Storage Area Network manages I/O requests at the block level, whereas Network Attached Storage manages I/O requests at the file level. As a result, SAN generally has lower latency and higher performance than NAS.

7. NAS vs SAN (File-level vs Block-level)
Block-level storage: a type of storage in which each block is controlled as an individual hard drive and is managed by the server operating system. The protocols usually used for block-level storage are iSCSI and Fibre Channel.
File-level storage: a storage technology usually used in Network Attached Storage. In a file-level storage system, each file and folder is accessed and managed by the storage system itself. However, the smaller storage blocks that make up the files and folders cannot be directly controlled. File-level storage is simple to implement. The protocol usually used for file-level storage is the Network File System (NFS).
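The contrast between the two access models can be made concrete with a small sketch. The Python below is only a toy illustration, not how any real SAN or NAS is implemented: the class names ToyBlockDevice and ToyFileServer, the 16-byte block size, and the simple append-only allocation are all assumptions made for the example. It shows that a block-level caller must say which block to read or write, while a file-level caller only supplies a file name and the storage system decides where the blocks go.

```python
BLOCK_SIZE = 16  # bytes per block (assumed toy value; real systems use e.g. 512 B or 4 KiB)


class ToyBlockDevice:
    """Block-level view: the caller addresses fixed-size blocks by number."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        # Pad short writes with zero bytes so every block has the same size.
        self.blocks[index] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, index):
        return self.blocks[index]


class ToyFileServer:
    """File-level view: the caller names a file; block placement stays hidden."""

    def __init__(self, device):
        self.device = device
        self.table = {}     # file name -> (list of block numbers, length in bytes)
        self.next_free = 0  # naive allocator: blocks are handed out in order, never reused

    def write_file(self, name, data):
        chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        if self.next_free + len(chunks) > len(self.device.blocks):
            raise ValueError("device is full")
        indices = []
        for chunk in chunks:
            self.device.write_block(self.next_free, chunk)
            indices.append(self.next_free)
            self.next_free += 1
        self.table[name] = (indices, len(data))

    def read_file(self, name):
        indices, length = self.table[name]
        raw = b"".join(self.device.read_block(i) for i in indices)
        return raw[:length]  # drop the zero padding of the last block


device = ToyBlockDevice(num_blocks=8)
server = ToyFileServer(device)
server.write_file("notes.txt", b"hello block and file storage")
print(server.read_file("notes.txt"))  # file-level: only the name is needed
print(device.read_block(0))           # block-level: the caller picks the block
```

In this sketch the client of ToyFileServer never sees block numbers, which mirrors the point above that file-level storage hides the smaller blocks from the user.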
7. NAS vs SAN (File-level vs Block-level)
Difference 1. With block-level storage, each block can be controlled as an individual hard drive, and the blocks are controlled by the server-based operating system. With file-level storage, the files and folders are accessed and managed by the storage system itself (in network attached storage, these files and folders are controlled by the file server or servers). However, a file-level storage system usually cannot manage the smaller storage blocks that make up the files and folders.

7. NAS vs SAN (File-level vs Block-level)
Difference 2. Block-level storage is an interaction between the system (the allocated server) and storage devices such as disk drives or tape drives. It requires precise commands that state where and how the block data is to be stored. File-level storage is an interaction between the client and the file server. The client does not care how or where the data is stored; it only needs the file's name.

8. The future of NAS (hybrid NAS/SAN)
Hybrid NAS/SAN is also called unified storage: it supports both file-based network attached storage and block-based storage area networks. Unified storage allows facilities to consolidate their storage using either IP or Fibre Channel protocols.

8. The future of NAS (hybrid NAS/SAN)
Type 1. Some unified storage systems layer file-based network attached storage on top of SAN storage, which provides the block-level access.

8. The future of NAS (hybrid NAS/SAN)
Type 2. Other unified storage systems modify NAS to allow block-level access through iSCSI.

9. Network File System Definition
The Network File System (NFS) is a client/server application that lets a user access, view, store, and update files on a remote computer through the network, in much the same way as on a local file system.

10. NFS Architecture
The server provides the shared file system to the attached clients.
The clients implement the user interface to the shared file system.

11. Network File System Layer
VFS: virtual file system switch
NFS: network file system
RPC: remote procedure call
XDR: external data representation
TCP: transmission control protocol

11. Network File System Layer
Virtual file system switch (VFS): “is an abstraction layer on the top of a more concrete file system.” The VFS lets clients access different types of concrete file system in a uniform way. It is a kernel software layer that processes all system calls related to the file system. The VFS provides a common interface to the different file systems; in other words, it is a kind of contract between the kernel and the concrete file system. The VFS determines which storage a request is intended for, and which file system should be used to satisfy the request.

11. Network File System Layer
Network file system (NFS): when a request is found to be related to NFS, the VFS passes it to the NFS layer within the kernel. NFS then translates the I/O request into an NFS procedure (such as open, access, create, read, remove…). Once a particular NFS procedure has been selected for the I/O request, it is performed by the Remote Procedure Call (RPC) layer.

11. Network File System Layer
Remote Procedure Call (RPC): a technology used to perform procedure calls or exchange messages between systems. RPC manages NFS requests and sends them to the appropriate remote server, and then tracks and manages the responses.

11. Network File System Layer
Four values are defined for an RPC service:
1. The program number,
2. The version number of the program (different version numbers have different collections of call names),
3. The procedure number (usually assigned sequentially),
4. The transport protocol, UDP or TCP.
A program is a collection of procedures. Each program has its own number; for example, NFS is a program with program number 100003.

11.
Network File System Layer
External Data Representation (XDR) protocol: XDR is layered within RPC. XDR converts data types to a common representation before they are sent, and converts the data back once it is received.

11. Network File System Layer
Transmission Control Protocol (TCP): once XDR has converted the data to the common representation, the request is transmitted over the network using the Transmission Control Protocol (TCP), which is a transport-layer protocol. TCP keeps track of the order of the information and resends missing data.

11. Network File System Layer
IP: the network layer is concerned with getting the data from one host to another on the network, and the IP layer implements it. IP must get the correct destination address, but it does not care about data reliability or data order.

11. Network File System Layer
Network: the physical layer, which controls data transmission.

11. Network File System Layer
Once the request is received by TCP/IP on the server, it flows up the network stack through RPC/XDR until it reaches the NFS server. The request is then handed to a daemon, which identifies the target file system, and the VFS is again used to reach that file system in local storage.

12. NFS Protocol
The operation from the client to the NFS server is called ‘mount’. Mount means mounting a remote file system into the local file system. The process begins as a call to mount, which is routed through the VFS to NFS. “After establishing the port number for the mount, the client performs an RPC mount request.” This client request is checked by a special daemon against the server's list of exported file systems; if the requested file system exists and the client has access to it, an RPC mount reply establishes the file handle for the file system. “The client side stores the remote mount (file system) information with the local mount point and establishes the ability to perform I/O requests.”

13.
NFS Design Feature
Inode; File Naming; File Permission; File Locking; File Caching

13. NFS Design Feature
Inodes: each inode has a unique ID number and records information about a file. Each file or directory has its own individual inode. The inode contains the following information:
1. Inode number;
2. File name;
3. File size and type;
4. Date and time of creation, modification, and access;
5. Date and time of inode modification;
6. File security information;
7. Number of links;
8. Block map, with pointers to the data blocks that make up the file.

13. NFS Design Feature
File Naming: the file naming rules that apply to NFS client requests
1. File names must not exceed 255 characters;
2. The characters < > : " / \ are not permitted.

13. NFS Design Feature
File Permission: each file has a set of file permissions that determine who can access the file and what they can do with it.

13. NFS Design Feature
File Locking: allows a process to access a certain file, or part of a certain file, exclusively. If the client crashes, its lock is released, so after the client restarts it must reclaim the lock from the server. If the server crashes, it restores the lock state to its previous condition once it has restarted. Moreover, when a file is locked, the buffer cache is no longer used for that file, so every write request is sent to the server immediately.

13. NFS Design Feature
File Caching: used to store frequently used information (a file or file block) in quickly accessible memory. For example, the UNIX buffer cache is a part of system memory that stores file blocks that have been recently referenced. In NFS, file caching is used on the client to reduce the number of RPC requests sent over the network.

14. Benefit of NFS
The Network File System makes it possible for different computer architectures running different operating systems to share file systems across a network.
Because NFS is defined as an abstract file system model rather than an architecture specification, an NFS environment can be implemented on different operating systems. Each operating system applies the NFS model to its own file-system semantics.

14. Benefit of NFS
The benefits of NFS are as follows:
1. Enables multiple clients/users to access the same files simultaneously
2. Reduces storage cost by sharing files (data) through the network instead of using local disks
3. Provides data consistency and reliability
4. Makes mounting of files transparent to clients (users)
5. Makes accessing remote data transparent to clients (users)
6. Supports different operating systems
7. Reduces system administration overhead

15. NFS Version (Version 2,3,4)
Version 2
Version 2 was the first NFS protocol version in general use. It has continued to be available on a variety of platforms since its release. Originally it could only operate over UDP, not TCP.

15. NFS Version (Version 2,3,4)
Version 3
Version 3 is an optimized version based on version 2. The version 3 protocol must be running on both the NFS server and the clients, and it improves interoperability and performance.
1. Unlike version 2, version 3 can process files larger than 2 gigabytes.
2. Version 3 allows the server to cache client write requests in memory (it supports asynchronous writes on the server). The client does not need to wait for the server to write changes to disk, which improves response time.
3. Moreover, version 3 also allows the server to batch requests, which improves the server response time.

15. NFS Version (Version 2,3,4)
4. Version 3 stores the returned file attributes in the local cache, which decreases the number of RPC calls to the server and improves performance.
5. File access permission checking was improved in version 3. Version 2 usually generates a “write error” or “read error” message when a user tries to copy a remote file without permission.
However, version 3 performs the permission check before the file is opened, so the error is reported as an “open error”.
6. Version 3 removes the 8-kbyte transfer size limitation. The transfer size is decided by both the client and the server.

15. NFS Version (Version 2,3,4)
Version 4
1. The version 4 protocol represents the user ID and group ID as strings.
2. With version 4, when a user unshares a file system, all the state for any open files or file locks in that file system is destroyed.
3. Unlike version 2 and version 3 servers, which return persistent file handles, version 4 supports volatile file handles: once a file handle changes, the client must find the new file handle (the file handle is the collection of file and directory information).

16. RDMA
Unlike UDP and TCP, RDMA (Remote Direct Memory Access Protocol) is a new technology for memory-to-memory data transfer over high-speed networks. One improvement of RDMA is that it can transfer data directly to and from memory without CPU intervention. Moreover, RDMA supports direct data placement, which eliminates data copies.

16. RDMA
The relationship between RDMA and other protocols such as TCP and UDP: after XDR encodes the message information, RPC uses UDP, TCP, or RDMA to transport the message between clients and server. Even if one of these protocols is not available, RPC can still use another available protocol to transmit the message.

17. NFS Security
Several approaches are used to secure NFS access.
1. NFS uses RPC to let the client and server exchange messages. RPC is secured by DES Authentication (a kind of authentication method). With this method, every RPC message may optionally be authenticated.

17. NFS Security
2. The client and server can authenticate each other by exchanging a timestamp that is encrypted using the DES encryption scheme.
To complete authentication, both client and server must agree on a common time, must have the same encryption key, and must store that key securely for each user.

17. NFS Security
3. Even though authentication establishes the identity of the client and server, it does not by itself protect the file during transmission. The transmission itself should therefore also be protected by encrypting it with the common encryption key.

Review Part A
1. What is the Network Attached Storage (NAS)?
2. What are the applications of NAS?
3. The benefits of NAS.
4. NAS’s performance (Reliability & Fault Tolerance & Scalability, Data Security, and Data Migration).
5. The challenges of NAS. And how do we handle these problems?
6. Compare NAS and DAS (Directly Attached Storage).
7. Compare NAS and SAN (Storage Area Network).
   - File level storage vs Block level storage.
8. The future of NAS.

Review Part B
9. Network File System definition
10. NFS Architecture
11. Network File System layer
12. NFS Protocol
13. NFS Design feature
14. Benefits of NFS
15. NFS version
16. New Technology RDMA (Remote Direct Memory Access Protocol)
17. NFS Security

Reference
1. http://searchstorage.techtarget.com/definition/network-attached-storage
2. http://searchstorage.techtarget.com/answer/How-does-a-scale-out-NAS-environment-affect-storage-management
3. http://searchstorage.techtarget.com/definition/Clustered-network-attached-storage-clustered-NAS
4. http://searchstorage.techtarget.com/podcast/SAN-and-NAS-storage-trends-include-software-defined-storage-flash
5. http://searchstorage.techtarget.com/answer/Block-and-file-level-storage
6. http://www.webopedia.com/TERM/F/file-level-storage.html
7. http://www.webopedia.com/TERM/B/block-level-storage.html
8. http://www.tvtechnology.com/media-systems/0191/storage-picking-a-bit-bucket-san-and-nas/266897
9. David H. C. Du, “Recent advancements and future challenges of storage system,” Vol. 96, No. 11, Nov 2008, pp. 1875-1884.
10.
http://www.ibm.com/developerworks/library/l-network-filesystems/
11. https://technet.microsoft.com/en-us/library/cc976863.aspx
12. http://learnlinuxconcepts.blogspot.com/2014/10/the-virtual-filesystem.html
13. http://docs.oracle.com/cd/E19253-01/816-4555/6maoqui8o/index.html
14. http://docs.oracle.com/cd/E19253-01/816-4555/rfsrefer-137/index.html
15. http://docs.oracle.com/cd/E19253-01/816-4555/rfsrefer-154/index.html
16. http://searchstorage.techtarget.com/tip/How-to-secure-NFS-access-to-NAS-devices

Thank you
Jinfeng Yang