Chapter 11: File System Implementation
Operating System Concepts – 7th Edition, Jan 14, 2005, Silberschatz, Galvin and Gagne ©2005
NCHU System & Network Lab

Outline
• File-System Structure
• File-System Implementation
• Directory Implementation
• Allocation Methods
• Free-Space Management
• Efficiency and Performance
• Recovery
• Log-Structured File Systems
• NFS
• Example: WAFL File System

11.1 File-System Structure

File-System Structure
• File system: provides efficient and convenient access to the disk
• File structure
  – Logical storage unit
  – Collection of related information
• I/O transfers between memory and disk are performed in units of blocks
  – Each block is one or more sectors
• Two design problems
  – How the file system should look to the user
    • A file and its attributes, the operations on a file, ...
  – How to map the logical file system onto the disk

Layered File System
• The file system is organized into layers
(Figure: the layers from top to bottom, with what is passed down at each step)
  – Application programs: write(data_file, item), passing the file name
  – Logical file system: given a symbolic file name, uses the directory to provide the values needed by the file-organization module (FOM), i.e., the information of data_file
  – File-organization module: transforms a logical block address (e.g., the 2144th logical block) into a physical block address
  – Basic file system: issues the numeric disk address (drive 1, cylinder 73, surface 2, sector 10)
  – I/O control: the driver emits hardware instructions
  – Devices

Layered File System
• I/O control
  – Device drivers and interrupt handlers
  – Transfers information between main memory and the disk system
  – Translates a command such as "retrieve block 123" into hardware-specific instructions
• Basic file system
  – Issues generic commands to the device driver to read and write physical blocks on the disk
  – A physical block is identified by its numeric disk address, e.g., drive 1, cylinder 73, track 2, sector 10

Layered File System (Cont.)
• File-organization module
  – Knows about files, their logical blocks, and their physical blocks
  – Translates logical block addresses to physical block addresses (similar to virtual memory)
    • Logical blocks are numbered 0 to N
  – Free-space manager: tracks unallocated blocks
  – Block allocation: hands out free blocks when requested
• Logical file system
  – Manages metadata
  – Metadata: all of the file-system structure, excluding the actual file contents
  – Manages the directory structure via file control blocks (FCBs)
  – File control block: a storage structure holding information about a file
    • Ownership, permissions, and location of the file contents

(Figure: A Typical File Control Block)

Layered File System (Cont.)
• Why a layered file system?
  – All the advantages of the layered approach
  – Many file-system standards exist: UFS (Unix File System), FAT, FAT32, NTFS, ...
  – Duplication of code across file-system standards is minimized
  – The I/O control and basic file system code can usually be shared by multiple file-system formats

11.2 File System Implementation

On-Disk Structures
• Boot control block: information needed by the system to boot an OS from that partition
  – UFS: called the boot block; NTFS: called the partition boot sector
• Volume control block: volume details
  – Number of blocks, size of the blocks, free-block count and free-block pointers, free-FCB count and FCB pointers
  – UFS: called the superblock; NTFS: stored in the Master File Table
• A directory structure is used to organize the files
  – UFS: includes file names and inode numbers
  – NTFS: stored in the Master File Table
• A per-file file control block: many of the file's details
  – File permissions, ownership, size, location of the data blocks
  – UFS: called the inode; NTFS: stored within the Master File Table

A Possible File System Layout
(Figure: three regions of the partition)
  1. Boot OS from the partition
  2. Volume control block
  3. Directory structure

In-Memory Information
• Used for file-system management and for performance via caching
  – An in-memory mount table containing information about each mounted volume
  – An in-memory directory-structure cache
    • Holds the directory information of recently accessed directories
  – The system-wide open-file table (Chapter 10)
    • A copy of the FCB of each open file
  – The per-process open-file table (Chapter 10)

In-Memory File System Structures
• The figure illustrates the necessary file-system structures provided by the operating system
  – Figure (a) refers to opening a file
  – Figure (b) refers to reading a file
(Figure: In-Memory File System Structures: (a) file open, (b) file read)

Partitions and Mounting
• A disk can be sliced into multiple partitions
• A volume can span multiple partitions on multiple disks
• Each partition can be
  – Raw: contains no file system
    • E.g., swap space
  – Cooked: contains a file system
• Root partition: contains the OS and other system files
  – Mounted at boot time
• Other partitions are mounted at boot time or manually

Virtual File Systems
• Virtual File Systems (VFS) provide an object-oriented way of implementing file systems
• Two functions
  – VFS separates file-system-generic operations from their implementation
    • By defining a clean VFS interface
  – VFS provides a mechanism for uniquely representing a file throughout a network
    • VFS is based on a file-representation structure, called a vnode, that contains a numerical designator for a network-wide unique file

(Figure: Schematic View of Virtual File System; open, read, write, ... calls enter through the VFS interface)

Virtual File Systems (Cont.)
• VFS allows the same system call interface (the API) to be used for different types of file systems
• The API is to the VFS interface, rather than to any specific type of file system
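The separation that the VFS slides describe can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the class names (FileSystem, LocalFS, RemoteFS, VFS), the dictionary-backed "disks", and the mount points are all hypothetical and do not come from any real kernel. It shows one thing: the caller uses a single read() API, and the VFS layer picks the file-system-specific implementation.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Generic interface the VFS layer calls; each file-system type implements it."""

    @abstractmethod
    def read(self, path):
        ...


class LocalFS(FileSystem):
    """Hypothetical stand-in for an on-disk file system such as UFS."""

    def __init__(self, files):
        self.files = files  # path -> bytes

    def read(self, path):
        return self.files[path]


class RemoteFS(FileSystem):
    """Hypothetical stand-in for a network file system; a real one would issue RPCs."""

    def __init__(self, server_files):
        self.server_files = server_files  # path on the server -> bytes

    def read(self, path):
        return self.server_files[path]


class VFS:
    """Routes one generic read() call to whichever file system owns the path."""

    def __init__(self):
        self.mounts = {}  # mount point -> FileSystem

    def mount(self, mount_point, fs):
        self.mounts[mount_point] = fs

    def read(self, path):
        # Pick the longest mount point that is a prefix of the path.
        candidates = [m for m in self.mounts
                      if path == m or path.startswith(m.rstrip("/") + "/")]
        best = max(candidates, key=len)
        # Strip the mount point so the file system sees its own path.
        relative = path if best == "/" else (path[len(best):] or "/")
        return self.mounts[best].read(relative)


vfs = VFS()
vfs.mount("/", LocalFS({"/etc/motd": b"hello"}))
vfs.mount("/net", RemoteFS({"/data.txt": b"remote"}))
local_data = vfs.read("/etc/motd")       # served by LocalFS
remote_data = vfs.read("/net/data.txt")  # served by RemoteFS
```

The caller never names a file-system type; adding a new type only means adding another FileSystem subclass and mounting it.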
11.3 Directory Implementation

Directory Implementation
• Linear list of file names with pointers to the data blocks
  – Simple to program
  – Time-consuming to execute: a linear search is needed to find a particular entry
    • A cache and a sorted list may help
• Hash table: a linear list with a hash data structure
  – (file name) => hash function => a pointer into the linear list
  – Decreases directory search time
  – Collisions: situations where two file names hash to the same location
  – Fixed size, and the hash function depends on that size
    • If the hash table has only 64 entries, the hash function is modulo 64
    • If we want to enlarge the hash table to 128 entries, the hash function must be changed
    • Alternatively, use a chained-overflow hash table, in which each entry is a linked list

11.4 Allocation Methods

Allocation Methods
• How to allocate space to files?
  – So that disk space is utilized effectively and files can be accessed quickly
• An allocation method refers to how disk blocks are allocated for files:
  – Contiguous allocation
    • Extent-based systems
  – Linked allocation
  – Indexed allocation

Contiguous Allocation
• Each file occupies a set of contiguous blocks on the disk
• Simple: only the starting location (block #) and length (number of blocks) are required in the directory entry (FCB)
• Good
  – Fast: minimal seek time and head movement
  – Supports both sequential and direct access
    • For direct access to block i of a file that starts at block b, access block b + i
• Bad
  – Finding space for a new file is similar to the dynamic storage-allocation problem
    • Request n blocks from a list of free holes; can use first fit, best fit, ...
  – External fragmentation
    • Compaction is possible but expensive
  – When creating a file, the system may need to know how much space the file will need
  – Files are difficult to grow

Contiguous Allocation of Disk Space
(Figure: (a) contiguous allocation of disk space for 7 files; (b) state of the disk after files D and F have been removed)

Extent-Based Systems
• Many newer file systems (e.g., Veritas File System) use a modified contiguous allocation scheme
• Extent-based file systems allocate disk blocks in extents
• An extent is a contiguous set of disk blocks
  – Extents are the unit of file allocation
  – A file consists of one or more extents
• Integrates contiguous allocation and linked allocation

Linked Allocation
• Each file is a linked list of disk blocks:
  – Blocks may be scattered anywhere on the disk
  – The directory contains a pointer to the first and last blocks
  – Each block contains a pointer to the next block
• Advantages
  – No external fragmentation
  – Easy to grow
  – Any free block can be used
  – Simple free-space management: no space is wasted
(Figure: each block holds a pointer plus data)

Linked Allocation
(Figure: storing a file as a linked list of disk blocks)

Linked Allocation (Cont.)
• Disadvantages
  – Effective only for sequential-access files
    • To find the ith block, we must start at the beginning and follow the pointers
  – Space is required for the pointers
  – Reliability: what if a pointer is lost or damaged?

Linked Allocation (Cont.)
• Solution for the space taken by pointers
  – Collect blocks into clusters, and allocate clusters rather than blocks
  – Fewer disk head seeks, and less space needed for block allocation and free-list management
  – Cost: more internal fragmentation
• Solution for reliability
  – Use a doubly linked list, or store the file name and relative block number in each block
    • More overhead for each file

Linked Allocation (Cont.)
• FAT (File Allocation Table)
  – Used by OS/2 and MS-DOS
  – The table has one entry for each disk block and is indexed by block number
    • Used much like a linked list
    • Each entry contains the block number of the next block in the file
  – Can cause a significant number of disk head seeks
    • One for the FAT, one for the data
    • Improved by caching the FAT
  – Random-access time is improved
    • The location of any block can be found by reading the FAT

File-Allocation Table
(Figure note: the pointers are gathered in the FAT instead of being stored with the data blocks)

Indexed Allocation
• Brings all pointers together into the index block
  – An array of disk-block addresses
  – The ith entry points to the ith block of the file
  – The directory contains the address of the index block
  – Similar to the paging scheme for memory management
(Figure: logical view of the index table)

(Figure: Example of Indexed Allocation)

Indexed Allocation (Cont.)
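To contrast the two lookups described so far, the Python sketch below finds the ith block of a file first by walking a FAT chain and then by indexing an index block. It is a minimal illustration, not a real on-disk format: the block numbers, the EOF marker, and the function names fat_block and indexed_block are all made up for this example.

```python
EOF = -1  # hypothetical end-of-file marker stored in the last FAT entry


def fat_block(fat, start, i):
    """Linked/FAT lookup: follow i 'next block' entries from the first block."""
    block = start
    for _ in range(i):
        block = fat[block]
        if block == EOF:
            raise IndexError("file has fewer blocks than requested")
    return block


def indexed_block(index_block, i):
    """Indexed lookup: the ith entry of the index block is the ith file block."""
    return index_block[i]


# A three-block file stored in (made-up) disk blocks 217 -> 618 -> 339.
fat = {217: 618, 618: 339, 339: EOF}
index_block = [217, 618, 339]

third_via_fat = fat_block(fat, start=217, i=2)    # needs two table steps
third_via_index = indexed_block(index_block, 2)   # needs one array access
```

Both lookups return the same disk block; the difference is that the FAT walk costs i steps while the index block answers in one, which is why indexed allocation supports direct access well.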
• Advantages
  – Supports direct access
  – No external fragmentation
  – Easy to create a file (no size-declaration problem)
• Disadvantages
  – Wasted space: the index block itself takes space
    • Worse than linked allocation for small files
  – How large should the index block be?
    • Large index block: wastes space for small files
    • Small index block: cannot describe large files without an extra mechanism

Indexed Allocation (Cont.)
• Mechanisms for handling the index block
  – Linked scheme: link together several index blocks
  – Multilevel index: like multilevel paging
    • With 4096-byte blocks, we can store 1024 4-byte pointers in an index block
    • Two levels of indexes allow 1024 × 1024 = 1,048,576 data blocks, i.e., a file of up to 4 gigabytes
  – Combined scheme: used, for example, in BSD UNIX

Linked Scheme
(Figure: the directory entry for file "jeep" records first index block 19; each index block lists data blocks and ends with a "next" pointer to the following index block)

(Figure: Multilevel Index (Two-Level))

(Figure: Combined Scheme: UNIX (4K bytes per block))

11.5 Free-Space Management

Free-Space Management
• Free-space list: used to keep track of free disk space
  – Bit vector (bit map)
    • One bit per block (in these slides, 0: free, 1: occupied; some texts use the opposite convention)
  – Linked list: each free block points to the next
  – Grouping
  – Counting

Bit Vector
• One bit per block, for blocks 0, 1, 2, ..., n-1
  – bit[i] = 0: block[i] is free
  – bit[i] = 1: block[i] is occupied
• Simple and efficient for finding the first free block or a run of consecutive free blocks
• Requires extra space. Example:
  – block size = 2^12 bytes
  – disk size = 2^30 bytes (1 gigabyte)
  – n = 2^30 / 2^12 = 2^18 bits (or 32K bytes)
• Efficient only when the entire vector is kept in main memory
  – Written back to the disk occasionally for recovery needs

Linked List
• Link together all free blocks
• Keep a pointer to the first free block in a special location on the disk, and cache it in memory
• Good
  – No waste of space
• Bad
  – Cannot get contiguous space easily
  – Not easy to traverse the list (but traversal is an infrequent action)
    • Fortunately, the OS usually needs one free block at a time
    • It just takes the first free block; no traversal is needed
• FAT incorporates the linked-list mechanism

(Figure: Linked Free Space List on Disk)

Grouping and Counting
• Grouping
  – Store the addresses of n free blocks in the first free block
  – The first n-1 of these blocks are actually free
  – The final block contains the addresses of another n free blocks, and so on
  – Good: a large number of free blocks can be found quickly
• Counting
  – Keep the address of the first free block and the number n of free contiguous blocks that follow it
  – Useful because several contiguous blocks are often allocated or freed simultaneously

Example of Free-Space Management
• Bit vector (0: free, 1: occupied), blocks 0 to 31:
  11000011000000111001111110001111
• Grouping (n = 3)
  – Block 2 stores (3, 4, 5)
  – Block 5 stores (8, 9, 10)
  – Block 10 stores (11, 12, 13)
  – Block 13 stores (17, 18, 25)
  – Block 25 stores (26, 27, -1)
• Counting, as (first free block, count) pairs:
  – (2, 4), (8, 6), (17, 2), (25, 3)

11.6 Efficiency and Performance

Efficiency and Performance
• Efficiency depends on:
  – Disk allocation and directory algorithms
  – The types of data kept in a file's directory entry
    • If the "last access date" is recorded, the directory entry must be modified on every access
• Performance
  – On-board cache: local memory in the disk controller that stores entire tracks at a time
  – Buffer cache: a separate section of main memory for frequently used blocks
  – Synchronous writes vs. asynchronous writes
  – Free-behind and read-ahead: techniques to optimize sequential access

Page Cache
• A page cache caches pages rather than disk blocks, using virtual memory techniques
• Memory-mapped I/O uses a page cache
• Routine I/O through the file system uses the buffer cache
• Problem: double caching
  – Data may be cached in both the buffer cache and the page cache
    • Buffer cache: used by ordinary file operations
    • Page cache: used by memory-mapped files
• Solution: a unified buffer cache

(Figure: I/O Without a Unified Buffer Cache)

Unified Buffer Cache
• A unified buffer cache uses the same cache for both memory-mapped pages and ordinary file-system I/O

(Figure: I/O Using a Unified Buffer Cache)

Free-Behind and Read-Ahead
• Free-behind
  – Remove a page from the buffer as soon as the next page is requested
  – The previous pages are unlikely to be used again and would only waste buffer space
• Read-ahead
  – A requested page and several subsequent pages are read and cached
  – These pages are likely to be requested soon
• Both are applied to sequential access

11.7 Recovery

Recovery
• Some directory information is kept in main memory
  – If the computer crashes, the information in memory is lost
    • Cache contents, buffer contents, and the I/O operation in progress
  – The file system may be left in an inconsistent state
• Consistency checker: compares the data in the directory structure with the data blocks on disk, and tries to fix inconsistencies
• Use system programs to back up data from disk to another storage device
  – Floppy disk, magnetic tape, another magnetic disk, optical disk
  – Recover a lost file or disk by restoring data from the backup

11.8 Log Structured File Systems

Log Structured File Systems
• Log-structured (or journaling) file systems record each metadata update to the file system as a transaction
• All transactions are written to a log
  – A transaction is considered committed once it is written to the log
  – However, the file system itself may not yet be updated
• The transactions in the log are asynchronously written to the file system
  – Once the file system has been modified, the transaction is removed from the log
• If the system crashes
  – All transactions remaining in the log must still be performed

11.9 NFS

The Sun Network File System (NFS)
• Both an implementation and a specification of a software system for accessing remote files across LANs (or WANs)
• The implementation is part of the Solaris and SunOS operating systems running on Sun workstations
  – Uses the TCP or UDP/IP protocol

NFS (Cont.)
• Interconnected workstations are viewed as a set of independent machines with independent file systems
• Goal: allow sharing among these file systems in a transparent manner
  – Based on the client-server model
• Steps
  – A remote directory is mounted over a local file-system directory
    • The mounted directory looks like an integral subtree of the local file system
  – Specifying the remote directory for the mount operation is not transparent
    • The host name of the remote directory has to be provided
  – After that, files in the remote directory can be accessed in a transparent manner
  – Subject to access-rights accreditation
    • Potentially any file system (or directory within a file system) can be mounted remotely on top of any local directory

(Figure: Three Independent File Systems)

Three Independent File Systems
(Figure: mounts and cascading mounts)
  – Mount: mount S1:/user/shared over U:/user/local
  – Cascading mount: mount S2:/user/dir2 over U:/user/local/dir1

NFS (Cont.)
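The two mounts in the preceding figure can be traced with a short Python sketch. It assumes, purely for illustration, that client U keeps a mount table mapping each local mount point to a (server, remote directory) pair and resolves a path by the longest matching mount point; the function name resolve and the table layout are hypothetical, not part of NFS itself.

```python
def resolve(mount_table, path):
    """Map a client path to (server, remote path); ("local", path) if not mounted."""
    best = None
    for mount_point in mount_table:
        # A mount point covers the path if it equals it or is a parent directory.
        if path == mount_point or path.startswith(mount_point + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return ("local", path)
    server, remote_dir = mount_table[best]
    return (server, remote_dir + path[len(best):])


# Client U after the two mounts from the slide.
mount_table = {
    "/user/local": ("S1", "/user/shared"),
    "/user/local/dir1": ("S2", "/user/dir2"),  # cascading mount
}

via_s1 = resolve(mount_table, "/user/local/bin")
via_s2 = resolve(mount_table, "/user/local/dir1/notes")
not_mounted = resolve(mount_table, "/user/home")
```

The cascading mount shows up as the more specific entry winning: paths under /user/local/dir1 go to S2 even though /user/local as a whole is served by S1.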
• NFS is designed to operate in a heterogeneous environment of different machines, operating systems, and network architectures
  – The NFS specification is independent of these media
• This independence is achieved through the use of RPC primitives built on top of an External Data Representation (XDR) protocol used between two implementation-independent interfaces
• The NFS specification distinguishes between
  – The services provided by a mount mechanism: the mount protocol
  – The actual remote-file-access services: the NFS protocol

The Mount Protocol
• Establishes the initial logical connection between server and client
• A mount operation includes the name of the remote directory to be mounted and the name of the server machine storing it
  – The mount request is mapped to the corresponding RPC and forwarded to the mount server running on the server machine
  – Export list
    • Specifies the local file systems that the server exports for mounting
    • Along with the names of the machines that are permitted to mount them
• Following a mount request that conforms to its export list, the server returns a file handle: a key for further accesses
• File handle: a file-system identifier and an inode number that identify the mounted directory within the exported file system
• The mount operation changes only the user's view and does not affect the server side

NFS Protocol
• Provides a set of RPCs for remote file operations
• The procedures support the following operations:
  – Searching for a file within a directory
  – Reading a set of directory entries
  – Manipulating links and directories
  – Accessing file attributes
  – Reading and writing files
• NFS servers are stateless
  – There is no open() or close()
  – Each request has to provide a full set of arguments
    • Including a unique file identifier and an offset

NFS Protocol (Cont.)
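What "stateless" means for a read can be sketched in Python. The example is an assumption-laden toy, not the NFS wire format: FileHandle, ReadRequest, and StatelessServer are hypothetical names, and the server's storage is a plain dictionary. The point it illustrates is that every request carries the file handle, offset, and count, so the server keeps no open-file table and no per-client position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHandle:
    fs_id: int   # file-system identifier
    inode: int   # inode number within that file system


@dataclass(frozen=True)
class ReadRequest:
    handle: FileHandle
    offset: int  # the client, not the server, remembers the position
    count: int


class StatelessServer:
    def __init__(self, files):
        # (fs_id, inode) -> file contents; note there is no open-file table.
        self.files = files

    def read(self, request):
        data = self.files[(request.handle.fs_id, request.handle.inode)]
        return data[request.offset:request.offset + request.count]


handle = FileHandle(fs_id=1, inode=42)
server = StatelessServer({(1, 42): b"hello, nfs"})

first = server.read(ReadRequest(handle, offset=0, count=5))
retry = server.read(ReadRequest(handle, offset=0, count=5))  # same request, same answer
tail = server.read(ReadRequest(handle, offset=7, count=3))
```

Because a repeated request yields the same result, a client can simply resend a read after a timeout or a server restart without any recovery protocol.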
• Modified data must be committed to the server's disk before results are returned to the client
  – This loses the advantages of caching
• The NFS protocol does not provide concurrency-control mechanisms
  – Users must coordinate concurrent access themselves
• NFS is integrated into the OS via the VFS

(Figure: Schematic View of NFS Architecture)

Three Major Layers of NFS Architecture
• UNIX file-system interface
  – Based on the open, read, write, and close calls, and on file descriptors
• Virtual File System (VFS) layer: distinguishes local files from remote ones; local files are further distinguished according to their file-system types
  – The VFS activates file-system-specific operations to handle local requests according to their file-system types
  – It calls the NFS protocol procedures for remote requests
• NFS service layer: the bottom layer of the architecture
  – Implements the NFS protocol

NFS Path-Name Translation
• Path-name translation
  – Performed by breaking the path into component names
    • /user/local/dir1/file.text
    • (1) user, (2) local, (3) dir1
  – Then performing a separate NFS lookup call for every pair of component name and directory vnode
• To make lookup faster, a directory-name lookup cache on the client's side holds the vnodes for remote directory names

NFS Remote Operations
• Nearly one-to-one correspondence between the regular UNIX system calls and the NFS protocol RPCs
  – Except for opening and closing files
• NFS adheres to the remote-service paradigm
  – But employs buffering and caching techniques for the sake of performance
  – File blocks and file attributes are fetched by RPCs and are cached locally

NFS Remote Operations (Cont.)
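The component-by-component translation and the directory-name lookup cache can be sketched as follows. This Python model is hypothetical throughout: vnodes are plain integers, the server is a dictionary keyed by (directory vnode, name), and the Client class with its lookup_rpc method merely counts how many lookups would go over the network.

```python
class Client:
    def __init__(self, server_tree, root_vnode):
        self.server_tree = server_tree  # (dir vnode, name) -> vnode; stands in for the server
        self.root = root_vnode
        self.dnlc = {}                  # directory-name lookup cache
        self.rpc_count = 0

    def lookup_rpc(self, dir_vnode, name):
        """One simulated NFS lookup call for a (directory vnode, name) pair."""
        self.rpc_count += 1
        return self.server_tree[(dir_vnode, name)]

    def translate(self, path):
        names = path.strip("/").split("/")
        vnode = self.root
        for position, name in enumerate(names):
            is_directory = position < len(names) - 1
            key = (vnode, name)
            if is_directory and key in self.dnlc:
                vnode = self.dnlc[key]          # cache hit: no RPC needed
            else:
                vnode = self.lookup_rpc(*key)   # separate lookup per component
                if is_directory:
                    self.dnlc[key] = vnode      # only directory names are cached
        return vnode


tree = {(1, "user"): 2, (2, "local"): 3, (3, "dir1"): 4, (4, "file.text"): 5}
client = Client(tree, root_vnode=1)

first_result = client.translate("/user/local/dir1/file.text")
rpcs_after_first = client.rpc_count
second_result = client.translate("/user/local/dir1/file.text")
rpcs_after_second = client.rpc_count
```

In this model the first translation costs one lookup per component, while the second only has to look up the final file name because the three directory components are served from the cache.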
• File-attribute cache: holds the inode information
  – The attribute cache is updated whenever new attributes arrive from the server
• File-blocks cache
  – When a file is opened, the kernel checks with the remote server whether to fetch or revalidate the cached attributes
  – Cached file blocks are used only if the corresponding cached attributes are up to date
• Read-ahead and delayed-write are used
  – Clients do not free delayed-write blocks until the server confirms that the data have been written to disk

11.10 Example: WAFL File System

Example: WAFL File System
• Write-anywhere file layout
  – Optimized for random writes
    • Random-I/O optimized, write optimized
  – Random reads can be improved by caching
  – NVRAM is used for write caching
    • A stable-storage cache
• Can support multiple snapshots
  – Useful for backups, testing, and so on

The WAFL File Layout
(Figure: the file layout; all metadata lives in files)

(Figure: Snapshots in WAFL)
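How a write-anywhere layout makes snapshots cheap can be shown with a heavily simplified Python sketch. It rests on assumptions not spelled out in the slides: the class TinyWafl and its methods are invented for this example, the "root" is a flat dictionary from file name to block number rather than a tree rooted at a root inode, and a snapshot is modeled as a frozen copy of that root. Data blocks are never overwritten, so old versions stay reachable from a snapshot.

```python
class TinyWafl:
    def __init__(self):
        self.blocks = {}      # block number -> data; existing blocks are never overwritten
        self.next_block = 0
        self.root = {}        # active root: file name -> block number
        self.snapshots = {}   # label -> frozen copy of a root

    def write(self, name, data):
        # Write anywhere: new data always goes to a fresh block.
        number = self.next_block
        self.next_block += 1
        self.blocks[number] = data
        self.root[name] = number

    def snapshot(self, label):
        # Copying only the root is enough, because old blocks stay in place.
        self.snapshots[label] = dict(self.root)

    def read(self, name, snapshot=None):
        root = self.root if snapshot is None else self.snapshots[snapshot]
        return self.blocks[root[name]]


fs = TinyWafl()
fs.write("report", b"version 1")
fs.snapshot("monday")
fs.write("report", b"version 2")

current = fs.read("report")
from_snapshot = fs.read("report", snapshot="monday")
```

Taking the snapshot copies no file data at all; only later writes consume new blocks, which is what makes keeping multiple snapshots practical for backups and testing.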