Chapter 11 – File-System Implementation (Pgs 461-499)
CSCI 3431: OPERATING SYSTEMS
File System Structure
Files are predominantly stored on disks, which:
1. Can be rewritten in place
2. Make all blocks directly accessible (cf. a CD)
But the properties that really matter are:
A. Persistence
B. Accessibility
C. Writeability
D. Access time
Layered Systems (top layer to bottom):
Application(s)
File System (files, directories)
OS – File Manager
Device Drivers, Interrupt Handlers
Device + Hardware
File Representation
FCB: File Control Block – the OS's
representation of a file
Analogous to the PCB representation of a process
Inode – the Unix name for the FCB
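As a rough sketch in C (not the real Unix inode layout – every name here is hypothetical), an FCB might be declared like this:

    #include <stdint.h>
    #include <time.h>

    #define N_DIRECT 12   /* assumed number of direct block pointers */

    /* Hypothetical FCB/inode: the per-file metadata the OS keeps. */
    struct fcb {
        uint32_t owner_uid;          /* file owner */
        uint32_t permissions;        /* access-control bits */
        uint64_t size_bytes;         /* current file size */
        time_t   created, modified;  /* timestamps */
        uint16_t link_count;         /* directory entries naming this file */
        uint32_t direct[N_DIRECT];   /* direct data-block numbers */
        uint32_t single_indirect;    /* block holding further block numbers */
    };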
Disk Organisation
Boot control block: typically block 1, sector 1,
track 1, platter 1 – boot information
Volume control block: superblock – partition
details (block size, number of blocks, blocks
free, location of free block list)
Directory structure: the root directory (e.g.,
"/" on Unix) is stored at a known location
FCBs: Inodes/Data for each actual file
Data blocks: Contents of the files
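A minimal sketch of the partition details a volume control block might record; the layout and field names are assumptions, not any real on-disk format:

    #include <stdint.h>

    /* Hypothetical superblock: per-partition bookkeeping. */
    struct superblock {
        uint32_t magic;            /* identifies the file-system type */
        uint32_t block_size;       /* e.g., 4096 bytes */
        uint64_t total_blocks;     /* number of blocks in the partition */
        uint64_t free_blocks;      /* blocks currently unused */
        uint64_t free_list_start;  /* where the free-block list begins */
        uint64_t root_dir_block;   /* location of the root directory */
    };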
OS Data
Mount table – what partitions are mounted?
Directory cache
Open file table (system wide)
Per-Process open file table
IO Buffers
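One plausible arrangement of the two open-file tables (all types hypothetical): each per-process entry keeps its own file offset and points into a shared system-wide entry, which holds the FCB and a reference count.

    #include <stdint.h>

    struct fcb;  /* the file's FCB/inode, as sketched earlier */

    /* Hypothetical system-wide entry: one per open file, shared. */
    struct sys_open_file {
        struct fcb *fcb;        /* in-memory copy of the file's FCB */
        int         ref_count;  /* per-process entries pointing here */
    };

    /* Hypothetical per-process entry: one per open() in the process. */
    struct proc_open_file {
        struct sys_open_file *sys;    /* shared system-wide entry */
        uint64_t              offset; /* this process's file position */
        int                   flags;  /* access mode, etc. */
    };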
Aside:
Many OS/FS treat a directory as another kind of file
Disks
Are divided into sections called partitions or
volumes
Partitions may contain a file system ("cooked")
or store "raw" data directly
E.g., page swap partition has no file system
Boot sector typically stores the boot loader
The boot loader accesses the root partition of
the selected OS, which contains the OS and its
root (always-mounted) file system
Other partitions are mounted as required
Logical File System
Model of File System managed by OS and visible
to programs/programmers
Example: Linux Components
inode: an individual file
file: an open file
superblock: a file system
dentry: a directory entry
Directories: May be implemented as:
Lists: sorted, unsorted, or B-tree organised
Hash tables (see the sketch below)
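A toy chained hash table mapping a file name to an inode number, to illustrate the hash-table option (all names hypothetical):

    #include <stdint.h>
    #include <string.h>

    #define NBUCKETS 128

    struct dentry {
        char           name[256];  /* file name */
        uint32_t       inode;      /* inode number */
        struct dentry *next;       /* chain for collisions */
    };

    static struct dentry *buckets[NBUCKETS];

    /* Simple string hash (djb2). */
    static unsigned hash(const char *s) {
        unsigned h = 5381;
        while (*s) h = h * 33 + (unsigned char)*s++;
        return h % NBUCKETS;
    }

    /* Return the entry for name, or NULL: O(1) expected,
     * versus O(n) for an unsorted list. */
    struct dentry *dir_lookup(const char *name) {
        for (struct dentry *d = buckets[hash(name)]; d; d = d->next)
            if (strcmp(d->name, name) == 0)
                return d;
        return NULL;
    }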
Contiguous Allocation
Disk blocks are linearly ordered
Files occupy contiguous sets of blocks
Problems occur when files are deleted,
shortened, or moved, leaving holes on the disk
Exactly the same issue as fitting a process into
memory (best fit, first fit, etc.)
Compaction removes the holes, but creates extra
work
Generally a bad idea for general purpose file
systems, but may be useful for specialised OSs
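The appeal of contiguous allocation is that mapping a byte offset to a disk block is pure arithmetic; a sketch, with BLOCK_SIZE assumed:

    #include <stdint.h>

    #define BLOCK_SIZE 4096  /* assumed block size */

    /* Under contiguous allocation, the block holding byte `offset` of a
     * file that starts at disk block `start` is computed directly. */
    uint64_t contig_block_of(uint64_t start, uint64_t offset) {
        return start + offset / BLOCK_SIZE;
    }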
Linked Allocation
Files are assigned a (potentially scattered) set
of available disk blocks
A small portion of each block stores a
pointer (the address) to the next block
A second pointer (to the first block) is
needed to support "rewind" on a file
File access is slow because each block must be
read before its pointer can be used to schedule
the next read
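Reaching byte N under linked allocation means walking the chain block by block; a sketch, assuming a hypothetical read_block() that returns the next-block pointer stored in each block:

    #include <stdint.h>

    #define BLOCK_SIZE 4092  /* 4096 bytes minus a 4-byte next pointer */

    /* Hypothetical: reads `block` from disk into buf and returns the
     * next-block pointer stored in it (0 = end of file). */
    extern uint32_t read_block(uint32_t block, void *buf);

    /* Find the block holding byte `offset`: O(offset / BLOCK_SIZE) disk
     * reads, which is why random access is slow under linked allocation. */
    uint32_t linked_block_of(uint32_t first_block, uint64_t offset, void *buf) {
        uint32_t b = first_block;
        for (uint64_t i = 0; i < offset / BLOCK_SIZE && b != 0; i++)
            b = read_block(b, buf);
        return b;
    }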
Indexed Allocation
Use the first block to store a list (an "index") of all
the blocks used
Index may waste space, but data blocks do not
need pointers
Multi-level or linked approaches can be used for
large files that need more than one index block
Access is faster than linked allocation, but still
requires reads from many different disk locations
Indices can be cached in memory to improve
performance
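Under indexed allocation the same lookup becomes a single array access once the index block is in memory (a sketch with hypothetical names):

    #include <stdint.h>

    #define BLOCK_SIZE 4096  /* assumed block size */

    /* `index` is the file's index block, already read (or cached) in
     * RAM: an array of data-block numbers. One lookup replaces the
     * chain walk needed under linked allocation. */
    uint32_t indexed_block_of(const uint32_t *index, uint64_t offset) {
        return index[offset / BLOCK_SIZE];
    }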
Free-Space Management
Generally need to know if a disk block is
being used or is available
Could use a bitmap stored on the disk, with
one bit per block
A 1 TB disk with 4 KB blocks has 2^28 blocks,
so its bitmap needs 2^28 bits = 32 MB
Relatively fast and simple (sketched below)
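A sketch of the basic bitmap operations, one bit per block (helper names are hypothetical):

    #include <stdint.h>

    /* Test, set, and clear block b in the bitmap (1 = in use). */
    static int  bit_test (const uint8_t *map, uint64_t b) { return map[b / 8] &   (1u << (b % 8)); }
    static void bit_set  (uint8_t *map, uint64_t b)       {        map[b / 8] |=  (1u << (b % 8)); }
    static void bit_clear(uint8_t *map, uint64_t b)       {        map[b / 8] &= ~(1u << (b % 8)); }

    /* Linear scan for the first free block; returns nblocks if none. */
    uint64_t first_free(const uint8_t *map, uint64_t nblocks) {
        for (uint64_t b = 0; b < nblocks; b++)
            if (!bit_test(map, b))
                return b;
        return nblocks;
    }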
Other Approaches
Linked list of free blocks
The empty blocks themselves can store the
pointers: each free block holds the address
of the next
Very space efficient – only a pointer to the
first empty block needs to be stored
Very simple, but time consuming to allocate
large numbers of blocks
Can "group" the pointers into a single block
for efficiency, and have the last pointer on the
block point to the next group of empty blocks
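A sketch of the grouping idea (layout assumed): with 4 KB blocks, one group block can hold 1023 free-block numbers plus a link to the next group block.

    #include <stdint.h>

    #define PER_GROUP 1023  /* 1023 entries * 4 B + 4 B link = 4096 B */

    /* Hypothetical group block: many free-block numbers at once. */
    struct group_block {
        uint32_t free[PER_GROUP];  /* numbers of free blocks */
        uint32_t next_group;       /* next group block (0 = none) */
    };

Allocation consumes entries from free[]; when a group empties, the group block itself becomes an available free block.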
Compression
In run-length compression, we store a value,
followed by the number of occurrences of
that value – saves lots of space if long "runs"
exist
We can compress the free space map by
storing pairs of values: A free block, and the
number of consecutive free blocks that follow
it
This compressed version has only as many
entries as there are holes (free extents) on the disk
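The compressed free-space map is then just a table of (start, count) pairs, one per hole; a sketch with hypothetical names:

    #include <stddef.h>
    #include <stdint.h>

    /* One free extent: `count` consecutive free blocks from `start`.
     * E.g., blocks 7,8,9 and 40 free  ->  { {7, 3}, {40, 1} }. */
    struct free_run {
        uint64_t start;
        uint64_t count;
    };

    /* Take one block from the first non-empty run; returns 0 if the
     * disk is full (block 0 is assumed never to be free). */
    uint64_t alloc_from_runs(struct free_run *runs, size_t n) {
        for (size_t i = 0; i < n; i++)
            if (runs[i].count > 0) {
                runs[i].count--;                      /* shrink the run */
                return runs[i].start + runs[i].count; /* its last block */
            }
        return 0;
    }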
Efficiency and Performance
We generally desire a file system to be as small and fast as
possible
However, what works best is often a function of how
the system will be used and of factors such as:
Disk size
Other physical properties (heads, platters, etc.)
Average file size
Read:Write ratio
Number of IO buffers
Amount of RAM available for caching tables & indices, use of
cache for disk blocks as well as pages (Unified Virtual Memory)
Synchronous vs. Asynchronous access requirements
Viability of "Read-Ahead"
Redundancy requirements
Recovery
Data lost from RAM (except newly generated
data not yet saved to disk) is usually
recoverable in the event of errors, bugs,
power failures, etc. – just reload it from disk
Disk data must be better protected so that
errors and failures can be recovered from
Causes
Memory contents lost (power failure, crash)
before disk can be updated ... particularly
with cached index or free space tables
Disk block failure (hardware fault)
Write failure (power loss, system crash)
Bugs in the OS, corruption of FS by
applications
Consistency Checking
fsck (Unix) and chkdsk (Windows) check all
the tables and structures on a disk for
consistency
I.e., does the free space + the used space
indicated by the directories equal the total
space?
Can be run at mount, at boot, via cron, etc.
Can be supplemented with change flags
stored on disk, access/update timestamps,
etc.
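The core invariant a checker verifies can be stated in a line; a sketch, where the counts would come from scanning the free-space structures and walking every directory:

    #include <stdint.h>

    /* Invariant checked by fsck/chkdsk-style tools: every block is
     * accounted for exactly once, either free or in some file. */
    int consistent(uint64_t free_blocks, uint64_t used_blocks,
                   uint64_t total_blocks) {
        return free_blocks + used_blocks == total_blocks;
    }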
Journalled (logged) FS
All disk transactions are written first to a log
Log may be stored on a different disk for
redundancy
Log tends to store a considerable amount of
data for a non-trivial time period
If an inconsistency is found, each log entry
is checked to see whether it was performed
Of course, if the log is corrupted, then we are still
in trouble
Uses database transaction techniques
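A sketch of the write-ahead discipline (every name here is hypothetical): the log record reaches the disk, and is marked committed, before the real block is touched, so replay after a crash can redo any committed-but-unapplied entry.

    #include <stdint.h>

    /* Hypothetical log record for one metadata update. */
    struct log_record {
        uint64_t seq;        /* monotonically increasing sequence number */
        uint64_t block;      /* which disk block this update changes */
        uint8_t  data[4096]; /* the new contents of that block */
    };

    /* Hypothetical primitives, each flushed durably to the log disk. */
    extern void log_append(const struct log_record *r);
    extern void log_commit(uint64_t seq);
    extern void write_block(uint64_t block, const void *data);

    /* Write-ahead: log first, then apply in place. Replay after a crash
     * redoes every committed record whose in-place write may be missing. */
    void journalled_write(const struct log_record *r) {
        log_append(r);                   /* 1. record the intent durably  */
        log_commit(r->seq);              /* 2. mark the record committed  */
        write_block(r->block, r->data);  /* 3. update the real disk block */
        /* 4. the entry can later be trimmed (checkpointed) from the log */
    }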
Duplication Techniques
Some modems split their EPROM in half and
duplicate the contents so there are two copies
If one is corrupted, the other is used
Can use similar approaches with disks, but is very
wasteful of space
Can also do limited duplication and avoid
overwriting data until the disk is full
Complete duplication to another disk is the only
possible backup in the event of a hardware
failure that renders the disk inoperable
NFS
The location of a file system shouldn't really
matter to the user (except that non-local
data may take longer to access)
Various protocols are available
"Cloud" file storage is really just a trendy
term for a networked file system on a WAN
(usually the Internet)
Networked File Systems
Require:
Mount Protocol
Access Protocol (for specific FS items)
Naming Protocol – to allow local vs. non-local
paths to be mapped
Possible format changes to facilitate local
hardware and OS needs – but this is often
seen as an application-level concern
To Do:
Finish Assignment 2 (Due next week)
Complete Lab 6 (last required lab)
Read Chapter 11 (pgs 461-499; this lecture)