File System Performance
CSE451
Andrew Whitaker
Ways to Improve Performance
 Access the disk less
Caching!
 Be smarter about accessing the disk
Turn small operations into large operations
Turn scattered operations into sequential operations
Technique #1: Caching
Memory is MUCH faster than disk
So, cache whatever we can in memory
File buffers
i-nodes
Directory entries (name => i-node)
Caching reads is a no-brainer
Caching writes is more interesting…
Caching Writes
 Two options
Synchronous: data is immediately written out to disk
 AKA: write-through
Asynchronous: disk writes are delayed
 AKA: write-back
 Programmer’s perspective: what does it mean when the “write” system call returns?
With asynchronous writes, the data has not necessarily hit the disk
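A minimal sketch of the two policies, assuming a toy in-memory "disk" that only counts block writes. The names FakeDisk, BufferCache, and demo are hypothetical and exist only for this illustration; a real buffer cache also handles reads, eviction, and timed flushes.

```python
class FakeDisk:
    """Toy disk (hypothetical): stores blocks and counts block writes."""

    def __init__(self):
        self.blocks = {}
        self.write_count = 0

    def write_block(self, block_no, data):
        self.blocks[block_no] = data
        self.write_count += 1


class BufferCache:
    """Toy buffer cache supporting write-through or write-back."""

    def __init__(self, disk, write_through):
        self.disk = disk
        self.write_through = write_through
        self.cache = {}     # block number -> latest data
        self.dirty = set()  # blocks changed in memory but not yet on disk

    def write(self, block_no, data):
        self.cache[block_no] = data
        if self.write_through:
            # Synchronous: the disk is updated before write() returns.
            self.disk.write_block(block_no, data)
        else:
            # Asynchronous: only remember that the block is dirty.
            self.dirty.add(block_no)

    def flush(self):
        # Write each dirty block once, in ascending block order.
        for block_no in sorted(self.dirty):
            self.disk.write_block(block_no, self.cache[block_no])
        self.dirty.clear()


def demo():
    """Return (disk writes before flush, disk writes after flush) per policy."""
    results = {}
    for name, policy in (("write-through", True), ("write-back", False)):
        disk = FakeDisk()
        cache = BufferCache(disk, write_through=policy)
        for i in range(10):
            cache.write(7, f"version {i}")  # ten writes to the same block
        before_flush = disk.write_count
        cache.flush()
        results[name] = (before_flush, disk.write_count)
    return results
```

In this toy model, ten writes to one block cost ten disk writes under write-through and a single disk write under write-back, but until flush() runs the write-back data exists only in memory.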
Why Use Asynchronous Writes?
 Allows us to batch up multiple writes to the same block
 Allows for better overlap of CPU and I/O
CPU does not stall waiting for the disk
 Allows the disk scheduler to make better decisions
 Application: write(a); write(b); write(c);
 Disk: write(b); write(a); write(c);
 Most data updates in UNIX systems use asynchronous writes by default
Programmer can override: fsync(fd);
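A short sketch of overriding the default from a program, using Python's os module, which wraps the same write and fsync system calls. The helper name durable_write and the file name are made up for this example; note that fsync only asks the operating system to push the data to the device, and what the device itself guarantees is outside the program's control.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to path and return only after fsync() has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = 0
        while offset < len(data):
            # os.write may write fewer bytes than requested, so loop.
            offset += os.write(fd, data[offset:])
        # Without this call the data may still sit in the OS buffer cache.
        os.fsync(fd)
    finally:
        os.close(fd)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt")
        durable_write(path, b"hello, disk\n")
        with open(path, "rb") as f:
            return f.read()
```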
Problems with Asynchronous Writes
File system state can be lost during a crash
Missing blocks, missing files, missing directories, storage leaks, etc.
For this reason, meta-data updates tend to be done synchronously
File/directory creation or deletion
Consistency Problems
 Problems still arise, even with synchronous meta-data updates
 For example, file creation must modify an i-node and a directory entry
Initialize the i-node
Record the <fileName, i-node> mapping in the directory
 Disks do not support atomic updates that span multiple blocks
Dealing with Consistency Problems
Always keep the disk in a “safe” state
Run a recovery program (like fsck) on startup
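One common reading of "safe" is to order the two writes of a file creation so that a crash between them leaves only a repairable leak. The sketch below simulates a crash after the first write under both orderings. TinyDisk, Crash, and create_file are hypothetical names, and the model is an assumption for illustration, not a description of any particular file system.

```python
class Crash(Exception):
    """Raised by the toy disk to simulate a power failure."""


class TinyDisk:
    def __init__(self, crash_after=None):
        self.inodes = set()          # initialized i-node numbers
        self.directory = {}          # file name -> i-node number
        self.writes_left = crash_after

    def _tick(self):
        # Allow `crash_after` writes to succeed, then fail the next one.
        if self.writes_left is not None:
            if self.writes_left == 0:
                raise Crash()
            self.writes_left -= 1

    def write_inode(self, ino):
        self._tick()
        self.inodes.add(ino)

    def write_dirent(self, name, ino):
        self._tick()
        self.directory[name] = ino


def create_file(disk, name, ino, inode_first=True):
    if inode_first:
        disk.write_inode(ino)         # step 1: initialize the i-node
        disk.write_dirent(name, ino)  # step 2: record the name mapping
    else:
        disk.write_dirent(name, ino)
        disk.write_inode(ino)


def state_after_crash(inode_first):
    """Crash after exactly one write; report (dangling names, orphaned i-nodes)."""
    disk = TinyDisk(crash_after=1)
    try:
        create_file(disk, "notes.txt", 5, inode_first)
    except Crash:
        pass
    dangling = [n for n, i in disk.directory.items() if i not in disk.inodes]
    orphaned = [i for i in disk.inodes if i not in disk.directory.values()]
    return dangling, orphaned
```

Writing the i-node first leaves an orphaned i-node that a recovery pass can reclaim; writing the directory entry first leaves a name that points at an uninitialized i-node.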
i-check: File Consistency
 Is each block on exactly one list?
Create a bit vector with as many entries as there are blocks
Follow the free list and each i-node block list
When a block is encountered, examine its bit
 If the bit was 0, set it to 1
 If the bit was already 1
• if the block is both in a file and on the free list, remove it from the free list and cross your fingers
• if the block is in two files, call support!
If there are any 0’s left at the end, put those blocks on the free list
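The check above can be sketched as follows, assuming the file system is given as plain Python structures (a list of free block numbers and a dict from file name to block list). The function name i_check and this representation are assumptions for the example. The sketch walks the files before the free list so that a block found in both places is simply dropped from the free list.

```python
def i_check(num_blocks, free_list, files):
    """Return (repaired free list, list of problems that need a human)."""
    seen = [0] * num_blocks   # one bit per block
    owner = {}                # block number -> first file that claimed it
    problems = []

    # Follow each i-node's block list.
    for name, blocks in files.items():
        for b in blocks:
            if seen[b]:
                # Block is in two files: cannot be repaired automatically.
                problems.append(f"block {b} is in both {owner[b]} and {name}")
            else:
                seen[b] = 1
                owner[b] = name

    # Follow the free list.
    repaired_free = []
    for b in free_list:
        if seen[b]:
            # In a file (or already on the free list): drop this entry.
            continue
        seen[b] = 1
        repaired_free.append(b)

    # Any block still marked 0 is on no list at all: reclaim it.
    for b in range(num_blocks):
        if not seen[b]:
            repaired_free.append(b)

    return repaired_free, problems


free, problems = i_check(
    num_blocks=8,
    free_list=[2, 5, 6],
    files={"a": [0, 1, 2], "b": [3, 1]},
)
```

Here block 2 is removed from the free list, blocks 4 and 7 are reclaimed, and block 1 is reported because two files claim it.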
d-check: Directory Consistency
Do the directories form a tree?
Cycles are bad!
Does the link count of each file (i-node) equal the number of directory links to it?
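A minimal sketch of both checks, assuming directories are given as a dict from directory i-node number to its entries, and ignoring the "." and ".." entries that real UNIX directories contain. The name d_check and the data layout are hypothetical.

```python
def d_check(root, directories, link_counts):
    """directories: dir i-node -> {name: child i-node}
    link_counts: file i-node -> link count stored in the i-node.
    Returns a list of human-readable problems (empty if consistent)."""
    problems = []
    refs = {}          # i-node -> number of directory entries naming it
    visited = {root}
    stack = [root]

    while stack:
        d = stack.pop()
        for name, child in sorted(directories[d].items()):
            refs[child] = refs.get(child, 0) + 1
            if child in directories:
                if child in visited:
                    # A directory reachable twice means a cycle or a
                    # second parent; either way it is not a tree.
                    problems.append(
                        f"directory {child} reached twice (via '{name}'): not a tree"
                    )
                else:
                    visited.add(child)
                    stack.append(child)

    # Compare stored link counts with the links actually found.
    for ino, stored in sorted(link_counts.items()):
        found = refs.get(ino, 0)
        if stored != found:
            problems.append(
                f"i-node {ino}: link count {stored}, but {found} directory links"
            )
    return problems


problems = d_check(
    root=1,
    directories={
        1: {"docs": 2, "a.txt": 10},
        2: {"b.txt": 11, "alias.txt": 10, "up": 1},  # "up" points back at the root
    },
    link_counts={10: 2, 11: 3},
)
```

In this example the entry "up" creates a cycle, and i-node 11 claims three links although only one directory entry names it.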
Technique #2: Better Data Layout
 Recall basic file system structure:
Meta-data: i-nodes, free block list
Data: file data, directory data
[Figure: disk layout showing separate Metadata and Data regions]
Note: i-nodes are far from the data blocks they describe
Cylinder groups
 Basic idea: group commonly accessed data and meta-data together
This reduces seeks
 Details:
Disk is partitioned into groups of cylinders
Data blocks from a file are all placed in the same cylinder group
Files in the same directory are placed in the same cylinder group
The i-node for a file is placed in the same cylinder group as the file’s data
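The placement rules above can be sketched as a toy allocator. The class name, the choice to put a new directory in the group with the most free blocks, and the fallback when a group is full are all assumptions made for this example; real allocators use more elaborate policies.

```python
class CylinderGroupAllocator:
    """Toy allocator that only tracks free block counts per group."""

    def __init__(self, num_groups, blocks_per_group):
        self.free = [blocks_per_group] * num_groups
        self.dir_group = {}   # directory path -> cylinder group

    def _emptiest(self):
        # Group with the most free blocks (lowest index wins ties).
        return max(range(len(self.free)), key=lambda g: self.free[g])

    def group_for_directory(self, directory):
        if directory not in self.dir_group:
            # Assumption: spread new directories across the disk.
            self.dir_group[directory] = self._emptiest()
        return self.dir_group[directory]

    def create_file(self, directory, num_blocks):
        """Return the group that holds both the i-node and the data."""
        group = self.group_for_directory(directory)
        if self.free[group] < num_blocks:
            # Preferred group is full: locality is lost for this file.
            group = self._emptiest()
            if self.free[group] < num_blocks:
                raise OSError("no cylinder group has enough free blocks")
        self.free[group] -= num_blocks
        return group


alloc = CylinderGroupAllocator(num_groups=3, blocks_per_group=100)
placements = [
    alloc.create_file("/home", 40),
    alloc.create_file("/home", 40),
    alloc.create_file("/tmp", 10),
    alloc.create_file("/home", 30),  # /home's group has only 20 blocks left
]
```

The last file lands in a different group from the rest of its directory, which is the kind of degradation that appears once a group fills up.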
Cylinder Group Analysis
+ Reduces or eliminates seeks for some common access patterns
- Does not address rotational delay
- Performance is workload dependent
- Performance degrades if cylinder groups become full
- Partial solution: pro-actively reserve space
Log Structured File System
 Let’s assume all reads are cached
An iffy assumption, but let’s suspend disbelief
 Q: How can we turn all writes into large, sequential writes?
 Insight: this is possible if the location of data on disk can change
A Conventional File System
Files live at fixed locations
So, file system writes must use seeks
For example:
Write to Mathias.txt
Write to Andrew.txt
Write to Jill.txt
[Figure: disk with files at fixed, scattered locations: Bob.txt, Joel.txt, Jill.txt, Matt.txt, Andrew.txt, Nolan.txt, Trish.txt, Mathias.txt]
Log-structured File System
Use the disk as an append-only log
All writes go at the end
The location of a file changes over time
Old data is not over-written
Until the file system becomes full
[Figure: log growing as Mathias.txt, Andrew.txt, and Jill.txt are appended]
LFS Details
Everything gets written to the log
File data, i-nodes, directories
LFS tries to buffer many small writes into large segments
Typically 512 KB or 1 MB
How Can This Possibly Work?
Q: If nothing lives at a fixed location, how do we find “the data”?
A: Add a layer of indirection: An i-node map
Maps from i-node number to current location
The map resides at a fixed location on disk
NOT in the log!
The map is cached in memory for performance
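A minimal sketch of the indirection, assuming each file fits in a single log record and the log is a Python list. TinyLFS is a hypothetical name, and the on-disk placement of the map is not modeled here; only the lookup path is shown.

```python
class TinyLFS:
    """Toy log-structured store: every write appends, nothing is overwritten."""

    def __init__(self):
        self.log = []          # append-only list of (i-node number, data)
        self.inode_map = {}    # i-node number -> log address of latest version

    def write(self, inode_no, data):
        self.log.append((inode_no, data))
        # The file "moves": the map now points at the new log address.
        self.inode_map[inode_no] = len(self.log) - 1

    def read(self, inode_no):
        address = self.inode_map[inode_no]
        _, data = self.log[address]
        return data


fs = TinyLFS()
fs.write(1, "v1")
fs.write(2, "other file")
fs.write(1, "v2")      # the old copy at address 0 stays in the log
latest = fs.read(1)
```

After the third write, the map sends readers of i-node 1 to log address 2, while the stale record at address 0 remains until something reclaims it.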
What Happens When the Disk Gets Full?
Partial solution: disk is managed in segments, which are threaded on disk
Basically, a linked list
But, this re-introduces seeks!
Segment Cleaner
Goal: make scattered segments contiguous again
Approach:
Read a segment
Write live data to the end of the log
Presto: The segment is now clean
This is very expensive
Each live byte is read and written
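The cleaning approach above can be sketched as follows, assuming the log is a list of (i-node number, data) records, a record is live when the i-node map still points at its address, and the segment being cleaned is full and is not the tail of the log. SEGMENT_SIZE and clean_segment are illustrative names, and the tiny segment size is chosen only to keep the example readable.

```python
SEGMENT_SIZE = 4  # records per segment (tiny, for illustration only)


def clean_segment(log, inode_map, seg_no):
    """Copy live records out of one segment; return how many were copied."""
    start = seg_no * SEGMENT_SIZE
    copied = 0
    for addr in range(start, start + SEGMENT_SIZE):
        record = log[addr]
        if record is None:
            continue                      # slot was already clean
        inode_no, _ = record
        if inode_map.get(inode_no) == addr:
            # Live: read it and write it again at the end of the log.
            log.append(record)
            inode_map[inode_no] = len(log) - 1
            copied += 1
        log[addr] = None                  # slot is now free for reuse
    return copied


log = [(1, "a1"), (2, "b1"), (1, "a2"), (3, "c1"),   # segment 0
       (2, "b2")]                                    # start of segment 1
inode_map = {1: 2, 2: 4, 3: 3}
copied = clean_segment(log, inode_map, seg_no=0)
```

Two of the four records in segment 0 are still live, so cleaning it costs two extra reads and two extra writes before the segment can be reused.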
LFS Analysis
For reads, LFS and a traditional FS are largely equivalent
LFS has better performance for small writes and meta-data operations
The LFS cleaner has a large impact on performance
How important is this?
LFS in Practice
LFS is implemented, but not widely used
Reasons?
Assumptions about read behavior were not valid
Reads have not gone away
Performance improvements were not sufficient to offset increased complexity and higher variability