TDDB68 Concurrent Programming and
Operating Systems
Lecture 6:
File systems
Mikael Asplund, Senior Lecturer
Real-time Systems Laboratory
Department of Computer and Information Science
Copyright Notice:
Thanks to Christoph Kessler for much of the material behind these slides.
The lecture notes are partly based on Silberschatz’s, Galvin’s and Gagne’s book (“Operating System Concepts”, 7th ed., Wiley, 2005). No part of
the lecture notes may be reproduced in any form, due to the copyrights reserved by Wiley. These lecture notes should only be used for internal
teaching purposes at the Linköping University.
1
File system consists of
interface + implementation
2
Storing data
● Primary memory is volatile
  – need secondary storage for long-term storage
● A disk is essentially a linear sequence of numbered blocks
  – With 2 operations: write block b, read block b
  – Low level of abstraction
3
The file abstraction
● Provided by the OS
● Smallest allotment of secondary storage known to the user
  – Typically, contiguous logical address space
● Organized in a directory of files
● Has
  – Attributes (Name, id, size, …)
  – API (operations on files and directories)
4
File Attributes
● Name – the only information kept in human-readable form
● Identifier – unique tag (number) that identifies the file within the file system
● Type – needed for systems that support different types
● Location – pointer to file location on device
● Size – current file size
● Protection – controls who can read, write, execute
● Time, date, and user identification – data for protection, security, and usage monitoring
5
Meta data
Such information about files (i.e., meta-data)
is kept in a directory structure,
which is maintained on the disk.
Stored in a File Control Block (FCB) data
structure for each file
6
File Operations (API)
● File is an abstract data type, with operations
  – Create
  – Write
  – Read
  – Reposition within file
  – Delete
  – ...
  – Open(Fi) – search the directory structure on disk for entry Fi, and move the content of that entry to memory
  – Close(Fi) – move the content of entry Fi in memory to directory structure on disk
7
Open in unix:
open("filename", "mode")
returns a file descriptor / handle = index into a per-process table of open files (part of PCB)
(or an error code)
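As a rough illustration of the call above, the sketch below uses Python's os.open, which wraps the Unix open() system call. The helper name open_demo and the scratch file are assumptions made for this example. Note that the real system call takes integer flags such as O_RDONLY rather than a mode string, and that Python reports the error code as an exception.

```python
import os
import tempfile

def open_demo():
    """Open a file via the open() system call and return what we observe."""
    # Create a scratch file so the example is self-contained.
    tmp = tempfile.NamedTemporaryFile(delete=False)
    tmp.write(b"hello")
    tmp.close()

    fd = os.open(tmp.name, os.O_RDONLY)   # file descriptor: a small integer
    data = os.read(fd, 5)                 # read through the descriptor
    os.close(fd)                          # release the open-file table entry
    os.unlink(tmp.name)

    # Opening a file that no longer exists yields an error, not a descriptor.
    try:
        os.open(tmp.name, os.O_RDONLY)
        failed = False
    except FileNotFoundError:
        failed = True
    return fd, data, failed

fd, data, failed = open_demo()
print(fd, data, failed)
```

The descriptor value itself depends on which table slots are already in use in the calling process.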
8
File descriptors and open file tables
(figure: Process 1 and Process 2 each have a logical address space and a process-local open file table; the system-wide open file table and the FCB contents are in kernel memory space; the FCB and the file data are on disk)
● Process-local open file table: 0 stdin (pos, …), 1 stdout (pos, …), 2 stderr (pos, …), d newfile (pos, …)
  – stdin, stdout, stderr are opened upon process start
● System-wide open file table: Console input, Console output, newfile (loc., …) with FCB contents
● fp points to a FILE data structure {…, fd, …}, returned by the fopen() C library call
● open() syscall returns a file descriptor = index in local open file table
9
Data to manage open files
● Disk location of the file (and other metadata from FCB)
● File-open count: counts the number of times a file is open, so that its data can be removed from the open-file table when the last process closes it
  – shared by all processes that have opened the file
● File pointer (seekpos): pointer to next read/write location
  – one for every open system call (process)
10
Storing open file data
● Collected in a system-wide table of open files and process-local open file tables (part of PCB)
● Process-local open file table entries point to system-wide open file table entries
● Semantics of fork()?
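One way to explore the fork() question on a POSIX system is the small experiment below. It relies on the POSIX rule that a child inherits copies of the parent's file descriptors, which refer to the same open-file entries, so the seek position is shared. The function name shared_offset_after_fork is an assumption for this sketch, and it needs a Unix-like system where os.fork is available.

```python
import os
import tempfile

def shared_offset_after_fork():
    """Return the parent's file offset after the child has read 5 bytes."""
    with tempfile.TemporaryFile() as f:
        f.write(b"0123456789")
        f.flush()
        fd = f.fileno()
        os.lseek(fd, 0, os.SEEK_SET)      # start at the beginning

        pid = os.fork()
        if pid == 0:                      # child: inherits a copy of fd
            os.read(fd, 5)                # advances the shared seek position
            os._exit(0)                   # leave without running parent cleanup

        os.waitpid(pid, 0)                # parent: wait for the child
        return os.lseek(fd, 0, os.SEEK_CUR)

offset = shared_offset_after_fork()
print(offset)
```

The parent never reads from the file itself, yet its descriptor reports the position the child left behind.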
11
Access Methods
● Sequential Access
  read next block
  write next block
  reset (rewind)
● Direct Access
  read block n
  write block n
  position to n
    read next block
    write next block
  (n = relative block number from beginning of file)
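The two access methods can be tried out on an ordinary file. The sketch below is only an illustration: the 4-byte BLOCK_SIZE and the function name access_demo are assumptions, and lseek plays the role of "position to n".

```python
import os
import tempfile

BLOCK_SIZE = 4  # tiny block size, chosen only for the illustration

def access_demo():
    with tempfile.TemporaryFile() as f:
        f.write(b"AAAABBBBCCCCDDDD")      # four blocks, numbered 0..3
        f.flush()
        fd = f.fileno()

        # Sequential access: reset (rewind), then "read next" repeatedly.
        os.lseek(fd, 0, os.SEEK_SET)
        first = os.read(fd, BLOCK_SIZE)
        second = os.read(fd, BLOCK_SIZE)

        # Direct access: position to relative block n, then read it.
        n = 3
        os.lseek(fd, n * BLOCK_SIZE, os.SEEK_SET)
        nth = os.read(fd, BLOCK_SIZE)
        return first, second, nth

first, second, nth = access_demo()
print(first, second, nth)
```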
12
Directory Structure
● Files in a system organised in directories
  – A collection of nodes containing information about all files
● Both the directory structure and the files reside on disk.
(figure: a Directory whose entries refer to the Files F1, F2, F3, F4, …, Fn)
13
Directory API
● Search for a file
● Create a file
● Delete a file
● List a directory
● Rename a file
● Traverse the file system
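As an illustration, each operation in the list above has a counterpart in Python's os module. The sketch below runs them inside a temporary directory; the file names and the function name directory_demo are made up for the example.

```python
import os
import tempfile

def directory_demo():
    with tempfile.TemporaryDirectory() as root:
        sub = os.path.join(root, "sub")
        os.mkdir(sub)

        # Create a file
        path = os.path.join(sub, "a.txt")
        with open(path, "w") as f:
            f.write("x")

        # Rename a file
        renamed = os.path.join(sub, "b.txt")
        os.rename(path, renamed)

        # Search for a file
        found = os.path.exists(renamed)

        # List a directory
        listing = sorted(os.listdir(sub))

        # Traverse the file system (here: only the subtree below root)
        all_files = [name for _, _, files in os.walk(root) for name in files]

        # Delete a file
        os.remove(renamed)
        after_delete = os.listdir(sub)
        return found, listing, all_files, after_delete

found, listing, all_files, after_delete = directory_demo()
print(found, listing, all_files, after_delete)
```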
14
Examples of File-system
Organization
15
Organize the Directory (Logically) to Obtain …
● Efficiency – locating a file quickly
● Naming – convenient to users
  – Two users can use the same name for different files
  – The same file can have several different names
● Grouping – logical grouping of files by properties
  – e.g., all Java programs, all games, …
16
Single-Level Directory
● A single directory for all users
  – Very simple
  – Naming problem
  – Grouping problem
  – Still used on simple devices, embedded systems, Pintos
17
Two-Level Directory
● Separate directory for each user
  – Path name: username / filename
  – Can have the same file name for different users
  – Efficient searching
  – No grouping capability
18
Tree-Structured Directories
19
Acyclic-Graph Directories
20
Hard links vs. Soft links
● Example directory:

  Name          Location
  myfile        371
  file2         524
  …
  mylink_hard   371
  …
  mylink_soft   ./myfile
  …
21
Hard links
● Direct pointer (block address) to a directory or file
● Cannot span partition boundaries
● Need to be updated when the file moves on disk
● Unix: ln <filename> <linkname>
22
Soft links
● Soft links (symbolic links, ”shortcut”, ”alias”)
  – files containing the actual (full) file name
  – still valid if file moves on disk
  – no longer valid if file name (or path) changes
  – Not as efficient as hard links (one extra block read)
  – Unix: ln -s <filename> <linkname>
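The difference between the two kinds of link can be observed with a short experiment on a Unix-like system. In the sketch below, os.link and os.symlink correspond to ln and ln -s; the file names follow the example directory and the function name link_demo is an assumption. On Unix the "location" shared by a hard link is the inode number rather than a raw block address.

```python
import os
import tempfile

def link_demo():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "myfile")
        hard = os.path.join(d, "mylink_hard")
        soft = os.path.join(d, "mylink_soft")
        with open(target, "w") as f:
            f.write("data")

        os.link(target, hard)          # like: ln myfile mylink_hard
        os.symlink("myfile", soft)     # like: ln -s myfile mylink_soft

        # A hard link is a second name for the same inode.
        same_inode = os.stat(target).st_ino == os.stat(hard).st_ino
        # A soft link is a small file that stores a path name.
        soft_target = os.readlink(soft)

        # Rename the original: the hard link still reaches the data,
        # the soft link now names a path that no longer exists.
        os.rename(target, os.path.join(d, "renamed"))
        with open(hard) as f:
            hard_ok = f.read() == "data"
        soft_ok = os.path.exists(soft)   # exists() follows the link
        return same_inode, soft_target, hard_ok, soft_ok

same_inode, soft_target, hard_ok, soft_ok = link_demo()
print(same_inode, soft_target, hard_ok, soft_ok)
```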
23
File System Mounting
● A file system must be mounted before it can be accessed
● Mounting combines multiple file systems in one namespace
● An unmounted file system is mounted at a mount point
● In Windows, mount points are given names C:, D:, …
24
Example
(figure: an existing file system, and an unmounted volume residing on /device/dsk)
(figure: mount point; /device/dsk mounted over /users)
25
File Sharing
● Sharing of files on multi-user systems is desirable
● Sharing may be done through a protection scheme
● In order to have a protection scheme, the system should have
  – User IDs - identify users, allowing permissions and protections to be per-user
  – Group IDs - allow users to be in groups, permitting group access rights
26
Sharing across a network
● Distributed system
● Network File System (NFS) is a common distributed file-sharing method
● SMB (Windows shares) is another
● Protection is a challenge!
27
Protection
● File owner/creator should be able to control:
  – what can be done
  – by whom
● Types of access
  – Read
  – Write
  – Execute
  – Append
  – Delete
  – List
28
Access Lists and Groups
● 3 modes of access: read, write, execute
● 3 classes of users:

                          RWX
  a) owner access     7   111
  b) group access     6   110
  c) public access    1   001

● Ask manager to create a group (unique name), say G, and add some users to the group.
● For a particular file (say game) or subdirectory, define an appropriate access:
  chmod 761 game    (7 = owner, 6 = group, 1 = public)
● Attach a group to a file:
  chgrp G game
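To see how the three octal digits map to RWX bits, the following sketch decodes the mode 761 and applies it to a scratch file. It uses Python's stat and os modules; the function name mode_demo and the temporary file standing in for game are assumptions.

```python
import os
import stat
import tempfile

def mode_demo():
    # 7 = rwx (owner), 6 = rw- (group), 1 = --x (public)
    symbolic = stat.filemode(stat.S_IFREG | 0o761)

    tmp = tempfile.NamedTemporaryFile(delete=False)
    tmp.close()
    os.chmod(tmp.name, 0o761)                          # like: chmod 761 game
    stored = stat.S_IMODE(os.stat(tmp.name).st_mode)   # permission bits only
    os.unlink(tmp.name)
    return symbolic, stored

symbolic, stored = mode_demo()
print(symbolic, oct(stored))
```

The symbolic string is the same notation that ls -l prints in its first column.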
29
A Sample UNIX Directory Listing
> ls -l
(figure: sample output with the owner, group and name columns marked)
30
File system implementation
31
File-System Structure
● File system resides on secondary storage (disks)
● File system organized into layers
32
File-System Layers
(figure: layered structure; the interfaces between the layers, from top to bottom)
  – File API: filenames, directories, attributes, access...
  – Logical block addresses (1D array of blocks)
  – Physical block addresses on disk (cylinder, sector,...)
  – read/write block commands
33
File control block (FCB)
● Resides at the logical FS layer
● Storage structure consisting of information about a file
34
In-Memory File System Structures
(figure with two panels)
(a) Creating a new file
(b) Reading an open file
35
Virtual File System (VFS)
36
Allocation Methods
● An allocation method refers to how disk blocks are allocated for files
  – Contiguous allocation
  – Linked allocation
  – Indexed allocation
38
Contiguous Allocation
39
Contiguous Allocation
● Pros:
  – Simple
  – Allows random access
● Cons:
  – Wasteful
  – Files cannot grow easily
● Works well on CD-ROM
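Random access is cheap with contiguous allocation because a file is described by just a start block and a length. A minimal sketch of the address computation follows; the 512-byte BLOCK_SIZE and the function name contiguous_lookup are assumptions for the example.

```python
BLOCK_SIZE = 512  # bytes per block; an assumed value for the example

def contiguous_lookup(start_block, length_blocks, byte_offset):
    """Map a byte offset within a file to (disk block, offset inside block)."""
    index = byte_offset // BLOCK_SIZE           # relative block number
    if not 0 <= index < length_blocks:
        raise ValueError("offset outside the file's allocated extent")
    return start_block + index, byte_offset % BLOCK_SIZE

# File occupying blocks 14, 15, 16; byte 1030 lies in the third block.
location = contiguous_lookup(start_block=14, length_blocks=3, byte_offset=1030)
print(location)
```

The bounds check also shows why growth is hard: an offset beyond the extent has nowhere to go unless the following blocks happen to be free.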
40
Linked Allocation
(figure: each block holds a next-pointer in addition to its data)
41
Linked Allocation
● Pros:
  – Simple – need only starting address
  – Free-space management
  – No external fragmentation
● Cons:
  – No random access
  – Overhead (space and time)
  – Reliability
42
File-Allocation Table (FAT)
● File-allocation table (FAT) – disk-space allocation used by MS-DOS and OS/2.
● Variant of linked allocation:
  – FAT resides in reserved section at beginning of each disk volume
  – One entry for each disk block, indexed by block number, points to successor.
  – Entry for last block in a chain has table value -1
  – Unused blocks have table value 0
    -> Finding free blocks is easy
● Does not scale well to large disks or small block sizes
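The table described above can be modelled as a plain list. The sketch below follows a chain and collects free blocks, using the slide's conventions (-1 ends a chain, 0 marks an unused block). The function names and the sample table are made up, and block 0 is treated as reserved here (an assumption) because the value 0 could otherwise not be told apart from a pointer to block 0.

```python
END_OF_CHAIN = -1   # table value for the last block of a file
FREE = 0            # table value for an unused block

def file_blocks(fat, start):
    """Follow successor entries from the file's first block."""
    blocks = []
    block = start
    while block != END_OF_CHAIN:
        blocks.append(block)
        block = fat[block]          # entry points to the successor
    return blocks

def free_blocks(fat):
    """Free blocks are found by scanning the table, without touching data blocks."""
    return [i for i, entry in enumerate(fat) if entry == FREE]

# Index:  0   1  2   3  4  5  6  7   (block 0 reserved in this sketch)
fat = [  -1,  0, 5, -1, 0, 3, 0, 0]
chain = file_blocks(fat, 2)         # file starts at block 2
unused = free_blocks(fat)
print(chain, unused)
```

Reaching the k-th block of a file still means walking k table entries, but the walk stays inside the table instead of reading k data blocks.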
43
Indexed Allocation
● Brings all pointers together into an index block
44
Indexed Allocation (Cont.)
● Direct access once index block is loaded
  – without external fragmentation,
  – but overhead of index block.
● All block pointers of a file must fit into the index block
  – How large should an index block be?
    ● Small – Limits file size
    ● Large – Wastes space for small files
  – Solution: Multi-level indexed allocation
45
Multilevel-indexed allocation
(figure: Directory -> outer-index -> index table -> file)
46
Combined Scheme: UNIX inode
● Block size 4 KB
  -> With 12 direct block pointers kept in the inode, 48 KB can be addressed directly.
  – Small overhead for small files
  – Still allows large files
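The 48 KB figure, and the reach of the indirect levels, can be checked with a few lines of arithmetic. The sketch assumes 4-byte block pointers and one single-, one double- and one triple-indirect pointer, as in the classic textbook inode layout; neither assumption is stated on the slide, and the function name addressable_bytes is made up.

```python
BLOCK_SIZE = 4 * 1024      # 4 KB, as on the slide
DIRECT_POINTERS = 12       # as on the slide
POINTER_SIZE = 4           # bytes per block pointer (assumption)

def addressable_bytes(indirect_levels):
    """Bytes reachable via the direct pointers plus one pointer per indirect level."""
    per_block = BLOCK_SIZE // POINTER_SIZE      # pointers that fit in one block
    blocks = DIRECT_POINTERS
    for level in range(1, indirect_levels + 1):
        blocks += per_block ** level            # single, double, triple, ...
    return blocks * BLOCK_SIZE

direct_only = addressable_bytes(0)
with_triple = addressable_bytes(3)
print(direct_only // 1024, "KB directly;", with_triple / 2**40, "TiB in total")
```

Under these assumptions small files need no index block at all, while the triple-indirect level pushes the limit to a little over 4 TiB.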
47
Free-Space Management
● Where is there free space on the disk?
  – A free-space list
● Two basic approaches
  – Free-space map (bit vector)
  – Linked list
48
Bit vector
● Each block represented by one bit (blocks 0, 1, 2, …, n-1)
  bit[i] = 1 ⇒ block[i] free
  bit[i] = 0 ⇒ block[i] occupied
● First free block: (number of bits per word) * (number of 0-value words) + offset of first 1 bit
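The formula for the first free block can be sketched as follows. The 32-bit word size, the choice of the least significant bit as bit 0 of each word, and the function name first_free_block are assumptions of this example.

```python
BITS_PER_WORD = 32   # assumed word size

def first_free_block(words):
    """words[k] covers blocks k*32 .. k*32+31; a 1 bit means the block is free.
    Bit 0 of a word is taken to be its least significant bit (assumption)."""
    for zero_words, word in enumerate(words):
        if word != 0:                                   # skip 0-value words
            offset = (word & -word).bit_length() - 1    # index of lowest 1 bit
            return BITS_PER_WORD * zero_words + offset
    return None                                         # no free block at all

block = first_free_block([0, 0, 0b101000])
print(block)
```

Skipping whole zero words is what makes the scan fast: occupied regions are passed one word at a time instead of one bit at a time.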
49
Bit vector
● Easy to get contiguous files
● Bit map requires extra space
● Example:
  block size = 1 KB = 2^10 bytes
  disk size = 68 GB ~ 2^36 bytes
  n = 2^36/2^10 = 2^26 bits (about 67 Mbit, i.e. 2^23 bytes = 8 MB)
● Inefficient unless entire bit vector is kept in main memory
50
Linked list
51
Linked list
● Only need to store the pointer to the first free block
● Finding k free blocks means reading in k blocks from disk
● No waste of space
52
Grouping
(figure: the first ”free” block holds n-1 references to really free blocks …)
53
Counting
● Often, multiple consecutive blocks are allocated/freed together
● For sequences of free blocks located consecutively on disk, keep only a reference to the first one and the length of the sequence
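The counting idea amounts to run-length encoding the free list. A minimal sketch, assuming the free block numbers are available as a sorted list (the function name to_runs is made up):

```python
def to_runs(free_blocks):
    """Compress a sorted list of free block numbers into (first, length) runs."""
    runs = []
    for b in free_blocks:
        if runs and runs[-1][0] + runs[-1][1] == b:
            first, length = runs[-1]
            runs[-1] = (first, length + 1)    # b extends the current run
        else:
            runs.append((b, 1))               # b starts a new run
    return runs

runs = to_runs([2, 3, 4, 5, 8, 9, 17])
print(runs)
```

Seven free blocks are described by three entries here; the saving grows with the length of the runs.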
54
Fact #1
File systems contain multiple data structures
55
Fact #2
These data structures have interdependencies
56
Conclusion:
Modification of the file system should be atomic
57
What happens if the computer is suddenly turned
off?
58
File system repair
● For each block
  – Find which files use the block
  – Check if the block is marked as free
● The block is used by 1 file xor is free – OK
● Two files use the same block – BAD: duplicate the block and give one to each file
● The block is both used and is marked free – BAD: remove from free list
● The block is neither free nor used – Wasted block: mark as free
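The four cases above translate directly into a per-block check. The sketch below works on an in-memory model, where files maps a file name to its block numbers and free_list is a set; these structures and the name check_blocks are assumptions, and a real repair tool would read the information from disk. A block that is shared by several files and also marked free is reported only as a duplicate in this sketch.

```python
from collections import Counter

def check_blocks(n_blocks, files, free_list):
    """Return {block: repair action} for every inconsistent block."""
    use_count = Counter(b for blocks in files.values() for b in blocks)
    report = {}
    for b in range(n_blocks):
        used, free = use_count[b], b in free_list
        if used > 1:
            report[b] = "duplicate the block and give one to each file"
        elif used == 1 and free:
            report[b] = "remove from free list"
        elif used == 0 and not free:
            report[b] = "mark as free"
        # used by exactly one file xor free: OK, nothing to report
    return report

report = check_blocks(
    n_blocks=6,
    files={"a": [0, 1], "b": [1, 2]},   # block 1 is claimed by both files
    free_list={2, 3, 4},                # block 2 is also used, block 5 is lost
)
print(report)
```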
59
Modern alternatives
● Log-based, transaction-oriented
  – Each modification is made as a transaction
  – Keep a journal (log) of all pending transactions
  – Interrupted transaction can be rolled back
  – Examples: NTFS, ext4, ...
● Snapshot-based
  – Copy-on-write
  – Often combined with checksums
  – Examples: ZFS, Btrfs
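To make the journal idea concrete, here is a deliberately simplified model of crash recovery. It assumes one particular design: changes are first recorded in the journal and reach the disk only after a commit record, so an interrupted transaction is rolled back simply by ignoring it. The record format and the name recover are invented for this sketch and do not describe how NTFS or ext4 are actually implemented.

```python
def recover(disk, journal):
    """Replay the journal after a crash.

    journal records: ("begin", tid), ("write", tid, block, data), ("commit", tid)
    """
    committed = {rec[1] for rec in journal if rec[0] == "commit"}
    for rec in journal:
        if rec[0] == "write" and rec[1] in committed:
            _, _, block, data = rec
            disk[block] = data      # redo; harmless if already applied
    return disk

disk = {0: "old0", 1: "old1"}
journal = [
    ("begin", 1), ("write", 1, 0, "new0"), ("commit", 1),
    ("begin", 2), ("write", 2, 1, "new1"),      # crash before commit
]
state = recover(disk, journal)
print(state)
```

Transaction 1 is applied in full and transaction 2 leaves no trace, so each modification is atomic from the file system's point of view.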
60
Memory-Mapped Files
● Mapping a disk block to a page in memory
● A page-sized portion of the file is read from the file system into a physical page. Subsequent reads/writes to/from the file are treated as ordinary memory accesses.
● Simplifies file access by treating file I/O through memory rather than read() / write() system calls
● Also allows several processes to map the same file, allowing the pages in memory to be shared
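A small experiment with Python's mmap module shows file contents being read and changed through memory operations only. The function name mmap_demo and the scratch file are assumptions for this example; the mapping is created with the module's default settings.

```python
import mmap
import os
import tempfile

def mmap_demo():
    tmp = tempfile.NamedTemporaryFile(delete=False)
    tmp.write(b"hello world")
    tmp.close()

    with open(tmp.name, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:   # length 0 maps the whole file
            first = m[:5]                     # read = ordinary memory access
            m[0:5] = b"HELLO"                 # write = ordinary memory store
            m.flush()                         # push modified pages to the file

    with open(tmp.name, "rb") as f:           # verify through normal file I/O
        on_disk = f.read()
    os.unlink(tmp.name)
    return first, on_disk

first, on_disk = mmap_demo()
print(first, on_disk)
```

No read() or write() call touches the file while it is mapped, yet the change is visible when the file is read back afterwards.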
61
Memory-Mapped Files
62