File System Performance
CSE451
Andrew Whitaker

Ways to Improve Performance
* Access the disk less
    * Caching!
* Be smarter about accessing the disk
    * Turn small operations into large operations
    * Turn scattered operations into sequential operations

Technique #1: Caching
* Memory is MUCH faster than disk
* So, cache whatever we can in memory
    * File buffers
    * i-nodes
    * Directory entries (name => i-node)
* Caching reads is a no-brainer
* Caching writes is more interesting…

Caching Writes
* Two options
    * Synchronous: data is immediately written out to disk
        * AKA: write-through
    * Asynchronous: disk writes are delayed
        * AKA: write-back
* Programmer’s perspective: what does it mean when the “write” system call returns?
    * With asynchronous writes, the data has not necessarily hit the disk

Why Use Asynchronous Writes?
* Allows us to batch up multiple writes to the same block
* Allows for better overlap of CPU and I/O
    * CPU does not stall waiting for the disk
* Allows the disk scheduler to make better decisions
    * Application: write(a); write(b); write(c);
    * Disk: write(b); write(a); write(c);
* Most data updates in UNIX systems use asynchronous writes by default
    * Programmer can override: fsync(fd);

Problems with Asynchronous Writes
* File system state can be lost during a crash
    * Missing blocks, missing files, missing directories, storage leaks, etc.
* For this reason, meta-data updates tend to be done synchronously
    * File/directory creation or deletion

Consistency Problems
* Problems still arise, even with synchronous meta-data updates
* For example, file creation must modify an i-node and a directory entry
    * Initialize the i-node
    * Record the <fileName, i-node> mapping in the directory
* Disks do not support atomic operations

Dealing with Consistency Problems
* Always keep the disk in a “safe” state
* Run a recovery program (like fsck) on startup

i-check: File Consistency
* Is each block on exactly one list?
* Create a bit vector with as many entries as there are blocks
* Follow the free list and each i-node block list
* When a block is encountered, examine its bit
    * If the bit was 0, set it to 1
    * If the bit was already 1
        * if the block is both in a file and on the free list, remove it from the free list and cross your fingers
        * if the block is in two files, call support!
* If there are any 0’s left at the end, put those blocks on the free list

d-check: Directory Consistency
* Do the directories form a tree?
    * Cycles are bad!
* Does the link count of each file (i-node) equal the number of directory links to it?

Technique #2: Better Data Layout
* Recall basic file system structure:
    * Meta-data: i-nodes, free block list
    * Data: file data, directory data
* [Figure: disk layout, labeled Metadata | Data]
* Note: i-nodes are far from the data blocks they describe

Cylinder groups
* Basic idea: group commonly accessed data and meta-data together
    * This reduces seeks
* Details:
    * Disk is partitioned into groups of cylinders
    * Data blocks from a file are all placed in the same cylinder group
    * Files in same directory are placed in the same cylinder group
    * i-node for file placed in same cylinder group as file’s data

Cylinder Group Analysis
+ Reduces or eliminates seeks for some common access patterns
- Does not address rotational delay
- Performance is workload dependent
- Performance degrades if cylinders become full
    - Partial solution: pro-actively reserve space

Log Structured File System
* Let’s assume all reads are cached
    * An iffy assumption, but let’s suspend disbelief
* Q: How can we turn all writes into large, sequential writes?
* Insight: this is possible if the location of data on disk can change

A Conventional File System
* Files live at fixed locations
* So, file system writes must use seeks
* For example:
    * Write to Mathias.txt
    * Write to Andrew.txt
    * Write to Jill.txt
[Figure: disk with files at fixed locations: Bob.txt, Joel.txt, Jill.txt, Matt.txt, Andrew.txt, Nolan.txt, Trish.txt, Mathias.txt]

Log-structured File System
* Use the disk as an append-only log
    * All writes go at the end
* The location of a file changes over time
* Old data is not over-written
    * Until the file system becomes full
[Figure: log growth as Mathias.txt, Andrew.txt, and Jill.txt are appended]

LFS Details
* Everything gets written to the log
    * File data, i-nodes, directories
* LFS tries to buffer many small writes into large segments
    * Typically 512 KB or 1 MB

How Can This Possibly Work?
* Q: If nothing lives at a fixed location, how do we find “the data”?
* A: Add a layer of indirection: an i-node map
    * Maps from i-node number to current location
* The map resides at a fixed location on disk
    * NOT in the log!
* The map is cached in memory for performance

What Happens When the Disk Gets Full?
* Partial solution: disk is managed in segments, which are threaded on disk
    * Basically, a linked list
* But, this re-introduces seeks!

Segment Cleaner
* Goal: make scattered segments contiguous again
* Approach:
    * Read a segment
    * Write live data to the end of the log
    * Presto: the segment is now clean
* This is very expensive
    * Each live byte is read and written

LFS Analysis
* For reads, LFS and a traditional FS are largely equivalent
* LFS has better performance for small writes and meta-data operations
* The LFS cleaner has a large impact on performance
    * How important is this?

LFS in Practice
* LFS is implemented, but not widely used
* Reasons?
    * Assumptions about read behavior were not valid
        * Reads have not gone away
    * Performance improvements were not sufficient to offset increased complexity and higher variability
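
The LFS slides describe three mechanisms: append-only writes, an i-node map for finding the current copy of a file, and a cleaner that recovers space held by dead data. The following is a minimal in-memory sketch of how those pieces could fit together, not how any real LFS is implemented. The class name ToyLog and all of its methods are hypothetical, and the cleaner here is simplified: it compacts the whole log in one pass rather than reading one segment at a time and rewriting live data to the end of the log.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLog:
    """Toy model of a log-structured store (hypothetical, for illustration only)."""

    # Append-only list of (inode_no, data) records; stands in for the on-disk log.
    log: list = field(default_factory=list)
    # i-node map: i-node number -> position of the current (live) record in the log.
    inode_map: dict = field(default_factory=dict)

    def write(self, inode_no, data):
        # Never overwrite in place: append at the end, then repoint the map.
        self.log.append((inode_no, data))
        self.inode_map[inode_no] = len(self.log) - 1

    def read(self, inode_no):
        # The layer of indirection: look up the current location, then fetch.
        return self.log[self.inode_map[inode_no]][1]

    def clean(self):
        # Simplified cleaner: keep only records the i-node map still points at.
        # Every live record is read and rewritten, which is why cleaning is costly.
        new_log, new_map = [], {}
        for pos, (inode_no, data) in enumerate(self.log):
            if self.inode_map.get(inode_no) == pos:
                new_map[inode_no] = len(new_log)
                new_log.append((inode_no, data))
        self.log, self.inode_map = new_log, new_map
```

Writing the same i-node twice leaves the older record in the log as dead data; read still returns the newest copy because the map was repointed, and clean is the only step that shrinks the log.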
