Operating Systems
Dr. Jerry Shiao, Silicon Valley University
Fall 2015
File-System Implementation

Overview

 The Operating System uses a File System as the mechanism for online storage of and access to file contents.
 The disk is the most common secondary-storage medium.
 File-System Structure
   The File System resides permanently on secondary storage (disk).
   The File System abstracts the physical properties of the storage device into the logical storage unit, the file, and the organization of the files, the directory.
   Interfaces other parts of the Operating System to secondary storage.
   Covers the structure and implementation of local file systems and the directory structure.
 Allocation Methods of Disk Space
   Track the locations of data.
 Free-Space Management
   Recover freed disk space.
 Efficiency and Performance
   Block-allocation and free-block algorithms.
 Recovery
 NFS
File-System Implementation
 File-System Structure
   The disk is the most common secondary-storage medium:
     A disk can be rewritten in place (read a block, modify the block, write the block back into the same location).
     A disk can directly access any block of information in a file, sequentially or randomly.
     I/O transfers are done in blocks: each block is one or more sectors.
       Sector sizes range from 32 bytes to 4096 bytes; the default is 512 bytes.
   File Systems provide efficient and convenient access to the disk.
   File-System research remains active in Operating System design.
   File-System design issues:
     The user's view of the File System: defining file attributes, file operations, and the directory structure for file organization.
     Algorithms and data structures to map the logical file system onto the physical secondary-storage device.
   CD-ROMs use ISO 9660, the standard format agreed on by CD-ROM manufacturers.
   Disk File Systems:
     Windows NT, 2000, and XP support FAT, FAT32, and NTFS (Windows NT File System).
     UNIX supports the UNIX File System (UFS) and the Berkeley Fast File System (FFS).
     Linux supports the Extended File System (ext2 and ext3).
File-System Implementation

 File-System Structure
   The File System is composed of many different levels; the layer concept can be reused by multiple file systems.
   Each file system has its own Logical File System and File Organization Module.
     The Logical File System manages metadata (all file-system structure except the actual data).
       The FCB, or File Control Block (a Linux inode), holds information on file ownership, permissions, and location.
       Manages the directory structure used by the File Organization Module.
     The File Organization Module handles files and the translation of logical block addresses to physical block addresses.
       The Free-Space Manager tracks unallocated blocks.
     The Basic File System issues commands to the device driver to read/write physical blocks on the disk.
       Manages the memory buffers and caches for file-system directories and I/O blocks.
     The lowest level, I/O Control, consists of device drivers and interrupt handlers that transfer I/O blocks between main memory and disk.
       Hardware-specific instructions are used by the hardware controller to interface with the I/O device.
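A concrete picture of the FCB helps here. The following is a minimal C sketch of the metadata an FCB (inode) carries; the field names and layout are illustrative assumptions, not a real kernel definition.

    /* Illustrative sketch of a File Control Block (inode); the field
     * names and layout are assumptions, not a real kernel structure. */
    #include <stdint.h>
    #include <sys/types.h>
    #include <time.h>

    #define N_DIRECT 12

    struct fcb {
        uid_t    owner;                /* file ownership */
        gid_t    group;
        mode_t   permissions;          /* access-control bits */
        off_t    size;                 /* file size in bytes */
        time_t   atime, mtime, ctime;  /* access / modify / change times */
        uint32_t direct[N_DIRECT];     /* disk locations of the first data blocks */
        uint32_t single_indirect;      /* blocks holding further block pointers */
        uint32_t double_indirect;
        uint32_t triple_indirect;
    };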
File-System Implementation

 File-System Structure
   On-Disk Structures for Implementing the File System:
     Boot Control Block (per volume): Information needed to boot an Operating System from the volume (the first block of the volume; can be empty).
       UNIX: Boot Block.
       NTFS: Partition Boot Sector.
     Volume Control Block (per volume): The number of blocks in the volume (or partition), the block size, free-block count/pointers, and free-FCB (File Control Block) count/pointers.
       UNIX: Superblock.
       NTFS: Stored in the Master File Table.
     Directory Structure (per file system): Organizes the files.
       UNIX: File names and inode numbers.
       NTFS: Master File Table.
     File Control Block (per file): Unique identifier associated with a directory entry (a Linux inode).
       UNIX: FCB (inode).
       NTFS: Master File Table.
File-System Implementation
 File-System Implementation

 Structures In-Memory for Implementing the File System:
   Mount Table: Information on each mounted volume.
   Directory-Structure Cache: Directory information for recently accessed directories.
   Open-File Table (system-wide): A copy of the FCB of each open file.
   Open-File Table (per process): A pointer to the FCB entry in the system-wide Open-File Table.
   Buffers holding File-System blocks being read/written from/to disk.
 File Control Block (FCB) and the open() path:
   1) The open() system call passes the file name to the Logical File System, which looks up the file name in the Directory Structure.
   2) The system-wide Open-File Table is searched.
   3) If an entry already exists (the file is in use by another process), a Per-Process Open-File Table entry is created pointing to the system-wide Open-File entry.
   4) If not, the Directory Structure is searched and the FCB is copied into the system-wide Open-File Table, which keeps track of the processes using the file.
   5) A Per-Process Open-File Table entry is created pointing to the system-wide Open-File entry.
   6) UNIX: the File Descriptor is the index into the Per-Process Open-File Table for the file. Windows: the File Handle.
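A compact sketch of the two tables these steps wire together; the structure and field names below are hypothetical, chosen only to mirror the steps above.

    /* Hypothetical sketch of the system-wide and per-process open-file
     * tables that the open() steps above connect. */
    #include <sys/types.h>

    struct fcb;                        /* in-memory File Control Block */

    struct sys_open_file {             /* system-wide open-file table entry */
        struct fcb *fcb;               /* copy of the on-disk FCB */
        int open_count;                /* processes currently using the file */
    };

    struct proc_open_file {            /* per-process open-file table entry */
        struct sys_open_file *sys;     /* points into the system-wide table */
        off_t pos;                     /* this process's current file offset */
    };

    /* A UNIX file descriptor is simply an index into this per-process array. */
    #define MAX_OPEN 256
    static struct proc_open_file fd_table[MAX_OPEN];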
File-System Implementation
 File-System Implementation
open() when the file name is NOT in the system-wide open-file table: the directory structure is searched for the file name (parts of the directory structure are cached in memory), and the matching entry contains the pointer to the FCB. The FCB is placed in the system-wide open-file table during the open(). The application then uses the returned index (File Descriptor) to access the file data.
File-System Implementation
 File-System Implementation

 FCB Interface to Directory Structure
   1) A process closes a file: the per-process Open-File Table entry is removed.
   2) The system-wide Open-File Table entry's open count is decremented.
   3) If the open count reaches zero, the metadata is copied back to the disk-based Directory Structure and the system-wide Open-File entry is removed.
 FCB Interface to Other System Types (i.e., Networking (NFS) or Named Pipes (FIFO)):
   The system-wide Open-File Table also holds similar information for network connections and devices.
   Sockets:
     End-to-end communication between two systems over a network.
     Client-Server model.
   Named Pipes (FIFO):
     Inter-process communication using the File System.
File-System Implementation
 File-System Implementation
 Partitions and Mounting
   Partition with no File System (Raw): Formatted according to its use.
     Swap space.
     RAID configuration database and bit maps for mirrored blocks.
   Boot information:
     The Boot Loader is loaded as a series of blocks; it mounts the root partition, loads the Kernel, and starts executing.
     The Boot Loader can support dual boot: multiple Operating Systems and multiple File Systems.
       A disk can have multiple partitions, each containing a different File System with a different Operating System.
   Partition with a File System (Cooked):
     Windows: Each volume is mounted in a separate name space, using a letter and colon (i.e., "F:", where device "F" is a pointer to the file structure of the partition).
     UNIX: The inode of the in-memory directory has a flag indicating that the directory is a mount point.
       Directory entry → Mount Table → Superblock of the mounted File System.
File-System Implementation
 File-System Implementation
 Virtual File System
   Integrating multiple File Systems into the Directory Structure:
     An object-oriented technique to modularize the implementation.
     Data structures and procedures are used to isolate System Calls from implementation details.
     The VFS API allows the same System Call API to be used for different types of File Systems.
       A clean VFS interface separates the file-system-generic operations from their implementations.
     Vnode: a file-representation structure that uniquely represents a file, both locally and throughout a network.
       Provides Network File System support.
       A vnode structure exists for each active file or directory node.
       UNIX: an inode is unique only within a single File System.
   File-System types:
     File-System-specific operations are called through the inode structure.
     The VFS does not know whether an inode represents a disk file, a directory file, or a remote file (through NFS).
File-System Implementation
 File-System Implementation
 Virtual File System Major Layers
   1) The user interface, with File Descriptors: open(), read(), write(), close().
   2) The VFS API, with an inode for each active node (file or directory). Inodes are unique within a single File System.
   3) The local File-System type or the Remote File System.
File-System Implementation
 File-System Implementation
 Virtual File System
   Main Object Types defined by the Linux VFS:
     Inode Object: Represents an individual file.
     File Object: Represents an open file.
     Superblock Object: Represents an entire File System.
     Dentry Object: Represents an individual directory entry.
   A set of operations is defined for each Object Type.
   The VFS performs operations on the Object Types by indirectly calling the functions registered for each Object Type (sketched below).
     It does not need to know what kind of object it is operating on (i.e., an Inode Object could represent a disk file, a directory file, or a remote file).
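This indirection can be sketched in C. The miniature below follows the shape of the kernel's registered-operations tables, but the names and signatures are simplified assumptions, not the actual Linux definitions.

    /* Miniature of the VFS indirection: each file-system type registers
     * callbacks, and the VFS calls through the table without knowing which
     * concrete file system is behind the object. Names and signatures are
     * simplified assumptions, not the actual Linux definitions. */
    #include <stddef.h>
    #include <sys/types.h>

    struct vfs_file;                        /* opaque "file object" */

    struct vfs_file_ops {
        ssize_t (*read)(struct vfs_file *f, char *buf, size_t len);
        ssize_t (*write)(struct vfs_file *f, const char *buf, size_t len);
        int     (*open)(struct vfs_file *f);
        int     (*release)(struct vfs_file *f);
    };

    struct vfs_file {
        const struct vfs_file_ops *f_op;    /* registered when the file is opened */
        void *fs_private;                   /* file-system-specific state */
    };

    /* Generic read path: one API, dispatched to ext2, FAT, NFS, ... */
    ssize_t vfs_read(struct vfs_file *f, char *buf, size_t len)
    {
        return f->f_op->read(f, buf, len);  /* indirect (callback) call */
    }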
File-System Implementation
 Virtual File System (VFS)
   User space holds the applications and glibc (which provides the API for the file-system calls: open, read, write, close).
   The system-call interface acts as a switch, funneling system calls from user space to the appropriate endpoints in kernel space.
   The VFS exports APIs and abstracts them to the individual file systems.
   The inode cache and directory cache provide a pool of recently used file-system objects.
     The directory cache speeds up translation from file pathname to inode.
   Individual file systems export APIs that are used by the VFS.
   The buffer cache buffers requests between the file systems and the block devices they manipulate (for faster access).
File-System Implementation
 Virtual File System (VFS)
   (Figure: two processes accessing the same disk file. Each process has its own File Object; each File Object reaches the file through a Dentry Object in the dentry cache, both of which point to one shared Inode Object in the inode cache under a Superblock Object.)
   File Object: Stores information about the interaction between an open file and a process.
   Dentry Object: Stores information linking a directory entry (file name) to a file.
   Inode Object: Stores information about a specific file (the inode number identifies the file). The file control block.
   Superblock Object: Stores information about a mounted filesystem. The filesystem control block.
File-System Implementation
 Virtual File System (VFS)
   An inode represents an object in the file system with a unique identifier (the result of translating a filename).
   The struct file_operations abstractions (i.e., read/write/open) give all I/O operations a common interface. The indirect calls (i.e., callback functions) are APIs specific to each file system.
   To achieve this abstraction (a "black box" to the user): a common API is presented to the user through the glibc library, and a common callback-function signature is used for the I/O functions.
File-System Implementation
 Directory Implementation
   Directory allocation and management algorithms affect the efficiency, performance, and reliability of the File System.
   Linear List
     A linear list of file names with pointers to the data blocks.
     File search, creation, and deletion all need a linear search.
     Slower, but easier to implement.
     Improve on file access with:
       A software cache of the most recent directory information.
       A sorted linked list.
       A binary tree of directory entries.
   Hash Table
     Directory search time is greatly decreased.
     A hash index is computed from the file name.
     The hash function can make the hash table very large.
     Collisions are handled with a linked list of collided entries, as sketched below.
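A small C sketch of the chained-hash directory idea; the table size, entry fields, and the use of the djb2 string hash are illustrative choices.

    /* Sketch of a chained-hash directory: hash the file name to a bucket,
     * then walk the short collision chain. Sizes, fields, and the djb2
     * hash are illustrative choices. */
    #include <stdint.h>
    #include <string.h>

    #define BUCKETS 128

    struct dir_entry {
        char name[32];                  /* file name */
        uint32_t first_block;           /* pointer to the file's data */
        struct dir_entry *next;         /* collision chain */
    };

    static struct dir_entry *table[BUCKETS];

    static unsigned hash_name(const char *name)
    {
        unsigned h = 5381;              /* djb2 string hash */
        while (*name)
            h = h * 33 + (unsigned char)*name++;
        return h % BUCKETS;
    }

    struct dir_entry *dir_lookup(const char *name)
    {
        for (struct dir_entry *e = table[hash_name(name)]; e; e = e->next)
            if (strcmp(e->name, name) == 0)
                return e;               /* expected O(1), not a linear scan */
        return NULL;
    }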
File-System Implementation



 Allocation Methods
   Goal: utilize disk space effectively.
   Three methods of allocating disk space: Contiguous, Linked, and Indexed.
   Contiguous Allocation
     A file occupies a set of contiguous blocks: disk addresses define a linear ordering on the disk.
       Minimal head movement, so seeks are minimal.
       Sequential access: the disk address of the last block read, plus one.
       Direct access: the disk address of the first block, plus a block offset (see the sketch after this list).
     The IBM VM/CMS Operating System uses Contiguous Allocation for its performance.
     Limitation: finding contiguous space for a new file.
       First-Fit and Best-Fit dynamic storage allocation.
       File extension: it is difficult to estimate the size of an output file, so sizes are over-estimated.
         Over-estimation causes internal fragmentation.
       External fragmentation.
         Off-line compaction (copying to another disk or tape and then compacting).
       Hybrid contiguous-allocation scheme: extend the file with another contiguous chunk (internal fragmentation when the extension is too large).
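The direct-access arithmetic above is a single addition; a minimal sketch (types and names are illustrative):

    /* Contiguous allocation: logical block i of a file that starts at
     * physical block 'start' lives at start + i. No table lookup needed. */
    #include <stdint.h>

    static uint32_t physical_block(uint32_t start, uint32_t i)
    {
        return start + i;
    }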
File-System Implementation


 Allocation Methods
   Contiguous Allocation (figure).
File-System Implementation


 Allocation Methods
   Linked Allocation
     Each file is a linked list of disk blocks, scattered on the disk.
       Each block contains a 4-byte pointer to the next block.
     The size of the file does not have to be known in advance.
       A file can grow as long as free blocks are available.
       The directory contains pointers to the first and last blocks of the file.
     Effective only for sequential-access files; inefficient for direct-access files.
     Allocate clusters of blocks, instead of one block at a time, to reduce the pointer overhead.
     Limitations:
       A 512-byte block has only 508 bytes of usable space.
       Internal fragmentation problem (when clusters are used).
       Reliability: the pointers stored in blocks can be damaged.
     File Allocation Table (FAT): used by MS-DOS and OS/2.
       A section of disk at the beginning of each volume contains the FAT.
       The table has one entry for each block and is indexed by block number.
       Access is similar to a linked list.
       The FAT is cached to minimize disk overhead (disk-head seeks).
File-System Implementation


 Allocation Methods
   Linked Allocation (figure).
File-System Implementation


 Allocation Methods
   Linked Allocation: File Allocation Table (FAT)
     The directory entry contains the offset of the start of the file in the FAT.
     Each FAT entry represents a physical block on disk, and each entry points to the next entry (block) of the file.
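Traversal is a sketch-sized loop; the end-of-chain marker value below is an illustrative assumption.

    /* Sketch of following a FAT chain: the directory entry supplies the
     * first block; each FAT entry, indexed by block number, names the next
     * block. The end-of-chain marker value is an assumption. */
    #include <stdint.h>
    #include <stdio.h>

    #define FAT_EOC 0xFFFFFFFFu        /* assumed end-of-chain marker */

    void print_file_blocks(const uint32_t fat[], uint32_t first_block)
    {
        for (uint32_t b = first_block; b != FAT_EOC; b = fat[b])
            printf("block %u\n", (unsigned)b);  /* visit data blocks in order */
    }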
File-System Implementation


 Allocation Methods
   Linked Allocation (figure; source: Rose-Hulman Institute of Technology).
File-System Implementation


 Allocation Methods
   Indexed Allocation
     Each file has an index block: pointers to each data block.
       The index block brings all the pointers into one location.
       Similar to the page table in virtual memory.
       The index block is itself a disk block, allocated from the free-space manager.
     Supports direct access without external fragmentation.
     Linked index blocks:
       For large files, link several index blocks; each index block holds a pointer to the next index block.
       The pointer overhead of Indexed Allocation is greater than that of Linked Allocation.
     Multilevel index blocks:
       A first-level index block points to second-level index blocks, and so on.
       With 4096-byte blocks: 1024 four-byte pointers per index block; two levels of index blocks address 1024 × 1024 = 1,048,576 data blocks, × 4096 bytes per block = a 4 GB maximum file size.
     Combined scheme (see the sketch after this list):
       The first 15 pointers of the index block are kept in the inode.
       The first 12 pointers point to direct blocks.
       The next 3 pointers are the single, double, and triple indirect blocks.
       The number of data blocks addressable for a file then exceeds what 4-byte pointers alone can reach: 2^32 bytes, or 4 GB.
     Index blocks can be cached in memory for performance.
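Mapping a logical block number through the combined scheme reduces to range checks. A C sketch assuming 4 KB blocks (1024 four-byte pointers per index block), with illustrative output in place of real disk reads:

    /* Classify a logical block number (lbn) under the UNIX combined
     * scheme: 12 direct pointers, then single / double / triple indirect
     * blocks. Assumes 4 KB blocks, so 1024 pointers per index block. */
    #include <stdint.h>
    #include <stdio.h>

    #define NDIRECT 12
    #define PTRS_PER_BLOCK 1024u

    void classify_block(uint64_t lbn)
    {
        if (lbn < NDIRECT)
            printf("direct[%llu]\n", (unsigned long long)lbn);
        else if ((lbn -= NDIRECT) < PTRS_PER_BLOCK)
            printf("single indirect, slot %llu\n", (unsigned long long)lbn);
        else if ((lbn -= PTRS_PER_BLOCK) < (uint64_t)PTRS_PER_BLOCK * PTRS_PER_BLOCK)
            printf("double indirect, slots %llu/%llu\n",
                   (unsigned long long)(lbn / PTRS_PER_BLOCK),
                   (unsigned long long)(lbn % PTRS_PER_BLOCK));
        else
            printf("triple indirect\n");   /* third level of indirection */
    }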
File-System Implementation


 Allocation Methods
   Indexed Allocation
     1) The directory entry contains the address of the index block.
     2) The index block is an array of disk-block addresses.
     3) To reach the i-th data block, use the disk address in the i-th index-block entry.
   One index block per file is similar to the paging scheme with one page table per process.
File-System Implementation


 Allocation Methods
   Combined Scheme (Linked and Multi-Level) Allocation: the UNIX File System.
     (Figure labels: Index Block; Linked Allocation; Multi-level Linked Allocation.)
File-System Implementation


 Allocation Methods
   Performance
     Allocation methods are measured by storage efficiency and data-block access time.
     The criterion is how the system is used:
       Sequential access: Linked Allocation (the address of the next block is already in memory).
       Random access: Contiguous Allocation (start of the file plus a block offset gives direct access).
     A system can have how a file will be used declared at creation and build it with either Contiguous Allocation (direct access, with a maximum length) or Linked Allocation (sequential access).
       A file is converted from one type to the other by creating a new file of the desired type.
     An Operating System can use Contiguous Allocation for small files (3 or 4 blocks) and Indexed Allocation for larger files.
       Average performance is good, since most files are small.
     UNIX allocates space in clusters of 56 KB where possible (the DMA transfer size).
       This minimizes external fragmentation and seek and latency times.
File-System Implementation

 Free-Space Management
   Free-Space List: Keeps track of the free disk space not allocated to files or directories.
   Bit Map or Bit Vector: bit[i] = 1 means block[i] is free; bit[i] = 0 means block[i] is allocated.
     A hardware instruction returns the offset within a word of the first bit with value 1 (Intel 80386 and Motorola 68020 families).
       Free block number = (number of bits per word) × (number of 0-valued words) + offset of the first 1 bit.
     Inefficient unless the entire vector is kept in main memory:
       Disk sizes constantly increase, so bit vectors continually grow: a 1 TB disk with 4 KB blocks requires 32 MB to store the bit map.
       Caching the bit map in memory increases performance.
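The formula maps directly to code. A sketch using the POSIX ffs() bit scan, under the same convention (bit = 1 means free):

    /* Sketch: find the first free block in the bit vector, where bit = 1
     * means the block is free (the slide's convention). */
    #include <stddef.h>
    #include <stdint.h>
    #include <strings.h>               /* ffs(): 1-based index of first 1 bit */

    long first_free_block(const uint32_t *bitmap, size_t nwords)
    {
        for (size_t w = 0; w < nwords; w++)
            if (bitmap[w] != 0)        /* skip fully allocated (all-0) words */
                /* bits per word * zero-valued words + offset of first 1 bit */
                return (long)(w * 32u) + (ffs((int)bitmap[w]) - 1);
        return -1;                     /* no free block */
    }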
File-System Implementation

 Free-Space Management
   Free-Space Linked List
     Linked List
       Link together the free disk blocks, keeping a pointer to the first free block in a special location on disk.
       The Operating System usually just requests the first block in the linked list.
       FAT folds the free-block linked list into the allocation table itself.
     Grouping
       The first free block stores the addresses of n free blocks.
       The (n-1)th of those addresses is the address of another block holding n more free-block addresses.
     Counting
       Expect several contiguous blocks to be allocated or freed simultaneously.
       Store the address of the first free block and the number of free contiguous blocks that follow, as sketched below.
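A minimal sketch of one counting entry; the field names are assumptions.

    /* Counting representation: each entry records a run of contiguous
     * free blocks as (first block, run length). Names are illustrative. */
    #include <stdint.h>

    struct free_run {
        uint32_t first;                /* address of the first free block */
        uint32_t count;                /* contiguous free blocks that follow */
    };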
File-System Implementation



 Efficiency and Performance
   Analyze the block-allocation and directory-management options.
   Efficiency:
     Disk efficiency depends on the disk-allocation and directory algorithms used.
     UNIX inodes preallocated on a volume improve directory performance.
     Clustering data blocks improves file-seek and file-transfer performance.
     For linked data-block allocation and the various levels of Indexed Allocation for directories, choosing pointer sizes is difficult because of the evolving nature of disk technology.
   Performance:
     Cache data blocks in a Buffer Cache in main memory.
     A Page Cache caches file data as pages rather than as file-system-oriented data blocks (unified virtual-memory algorithms).
File-System Implementation


 Efficiency and Performance
   Performance: I/O without a Unified Buffer Cache
     Double caching: file-system data must be cached twice.
     Virtual memory does not interface with the Buffer Cache, so the contents of the Buffer Cache must be copied into the Page Cache.
     Disk blocks are read from the File System into the Buffer Cache.
File-System Implementation


 Efficiency and Performance
   Performance: Unified Virtual Memory
     Uses page caching for both process pages and file data; the Virtual Memory system manages file-system data.
     Limitations:
       Solaris allocates pages either to a process or to the Page Cache.
       An Operating System performing much I/O uses the available memory for caching pages.
       This needs Priority Paging, where the Page Scanner gives priority to process pages over the Page Cache.
File-System Implementation


 Efficiency and Performance
   Performance: File-System Writes
     Synchronous writes:
       Occur in the order the disk subsystem receives them; the writes are not buffered.
       The user process must wait for the data to complete the disk write.
       Databases use synchronous writes for atomic transactions.
       The Operating System includes a flag for requesting synchronous writes (see the sketch after this list).
     Asynchronous writes:
       Data is stored in the cache, and the user process receives control immediately.
       The majority of writes are asynchronous, to reduce latency.
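On POSIX systems that per-file request is the O_SYNC open flag; a minimal sketch (the file name is illustrative):

    /* Sketch: requesting synchronous writes with the POSIX O_SYNC flag;
     * write() then blocks until the data reaches stable storage. */
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("journal.dat", O_WRONLY | O_CREAT | O_SYNC, 0644);
        if (fd < 0)
            return 1;
        write(fd, "commit\n", 7);      /* does not return until on disk */
        close(fd);
        return 0;
    }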
   Replacement algorithms:
     Free-behind removes a page from the buffer as soon as the next page is requested.
       The previous page is not likely to be used again.
     Read-ahead reads the requested page and several subsequent pages.
       Reading the data in one transfer and caching it saves I/O processing time.
File-System Implementation


 Recovery
   A system crash can cause inconsistencies among the on-disk file-system data structures.
   A typical file operation causes:
     Directory structures to be modified.
     FCBs to be allocated.
     Data blocks to be allocated.
     Free counts for FCB pointers and free-block pointers to be decremented.
   The Operating System caches these data structures, and a crash can interrupt the changes part-way through.
   Consistency Checking
     The File System must detect and correct problems.
       A metadata scan confirms or denies the consistency of the File System when the system boots.
       The File System can record its state within the file-system metadata.
     Consistency checkers: fsck in UNIX and chkdsk in Windows.
       The allocation algorithm and the free-space-management algorithm dictate what type of recovery is possible.
File-System Implementation


 Recovery
   Log-Based Transaction-Oriented File Systems (Journaling)
     All metadata changes (transactions) are written sequentially to a circular buffer (the log).
     Once written to the log, a transaction is considered committed, and the system call returns to the user process.
     Log entries are replayed against the file-system structures and removed from the log as they complete.
     If the Operating System crashes:
       All committed transactions remaining in the log are executed to bring the File System up to date.
       Changes from incomplete transactions must be undone when the Operating System reboots.
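One way to picture a log entry; the record layout below is purely hypothetical, not any real journal format.

    /* Purely hypothetical shape of one journal (log) record: the metadata
     * change is appended to the circular log, marked committed, and later
     * replayed against the on-disk file-system structures. */
    #include <stdint.h>

    struct journal_record {
        uint64_t txn_id;               /* transaction this change belongs to */
        uint32_t block_no;             /* on-disk block being modified */
        uint16_t offset, len;          /* byte range within that block */
        uint8_t  committed;            /* set once the whole txn is logged */
        uint8_t  data[];               /* new bytes for the range */
    };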
File-System Implementation


 Recovery
   System programs back up data to another storage device.
     Incremental backups minimize copying.
     (Figure source: www.unitrends.com)
File-System Implementation

 Network File System (NFS)
   The goal is to allow a degree of sharing among independent Computer Systems with independent File Systems.
     Sharing in a transparent manner between pairs of machines.
     A system can have both client and server functionality.
   Client-Server Network File System: typically integrated with the overall directory structure and interface of the client system.
   Remote directory transparency:
     A client must use the mount operation.
     The mounted directory appears as an integral subtree of the local File System.
     The location (host name) of the remote system must be provided.
     Subject to access-rights accreditation, any File System, or any directory within a File System, can be mounted remotely on top of any local directory.
     Cascading mounts allow a File System to be mounted over another File System that is itself remotely mounted, not local.
       The mount mechanism does not exhibit a transitivity property.
   NFS operates in a heterogeneous environment of different machines, Operating Systems, and network architectures.
     This independence is achieved by using RPC primitives built on top of an External Data Representation (XDR) protocol between two implementation-independent interfaces.
File-System Implementation

 Network File System (NFS)
   Three independent File Systems, on machines U, S1, and S2.
   From U, mount the remote File Systems at dir1 (on S1) and dir2 (on S2).
File-System Implementation

 Network File System (NFS)
   Mounting S1:/usr/shared over U:/usr/local.
     After the mount, users on U can view the contents of dir1 from S1.
     The original directory /usr/local is no longer visible.
   Cascading mount of S2:/usr/dir2 over U:/usr/local/dir1.
     After the mount, users on U can view the contents of dir2 from S2.
File-System Implementation



 Network File System (NFS)
   Mount Protocol
     Establishes the initial logical connection between a server and a client.
     The mount operation names the remote directory and the server machine storing it.
       Mount request → RPC → forwarded to the remote system.
     The server maintains:
       An Export List: the local File Systems the server exports for mounting, and the names of the remote systems permitted to mount them.
         Solaris access rights: /etc/dfs/dfstab. Linux access rights: /etc/exports (a hypothetical entry is sketched below).
       A list of client machines and their currently mounted directories.
         Used to send a warning when the server is going down.
     A file handle is returned to the client as the key for further accesses to the mounted File System.
     Other operations: unmount, and returning the Export List.
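The Linux export list is a plain text file; a hypothetical /etc/exports entry (the path and addresses are invented for illustration):

    # Hypothetical /etc/exports entry: export /usr/shared read-write to
    # one client subnet and read-only to all other hosts.
    /usr/shared  192.168.1.0/24(rw,sync)  *(ro,sync)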
File-System Implementation


 Network File System (NFS)
   The NFS Protocol
     A set of RPCs for remote file operations:
       Searching for a file within a directory.
       Reading a set of directory entries.
       Manipulating links and directories.
       Accessing file attributes.
       Reading and writing files.
     NFS servers are stateless:
       No UNIX-style open-file table or file structures exist on the server side.
       Each request carries a full set of arguments: a unique file identifier and an absolute offset within the file for the operation.
       Sequence numbers identify duplicated or missing requests.
       Modified data must be committed to the server's disk before results are returned to the client.
     NFS does not provide a concurrency-control mechanism:
       UDP packets from multiple requests to the same remote file may be intermixed.
       A locking mechanism outside the NFS protocol is needed to coordinate and synchronize access.
     NFS is integrated into the Operating System through the Virtual File System.
File-System Implementation


 Network File System (NFS)
   The NFS Protocol: path of a remote write
     1) The client initiates a write operation to a remote file.
     2) The Operating System maps it to a VFS operation on the appropriate inode.
     3) The VFS layer identifies the file as remote and invokes the NFS procedure.
     4) An RPC call is made to the NFS service layer at the remote server.
     5) The call is reinjected into the VFS layer on the remote system.
     6) The remote VFS finds that the request is local and invokes the file operation.
     7) The path is retraced to return the result.
File-System Implementation

 Network File System (NFS)
   The NFS Protocol
     Path-Name Translation
       Parse the path name (i.e., /usr/local/dir1/file.txt) into component names and perform a separate NFS lookup call for every pair of component name and directory vnode.
         NFS lookup calls for: (1) /usr (2) /usr/local (3) /usr/local/dir1.
       The server needs the separate component names, since each client's file-directory layout is unique.
     Remote Operations
       File operations (except open/close) translate to NFS protocol RPCs.
       File-block cache: when a file is opened, the kernel checks with the remote server whether to fetch or revalidate the cached attributes.
       Cached file blocks are used only if the corresponding cached attributes are up to date.
       Clients do not free delayed-write blocks until the server confirms that the data has been written to disk.
File-System Implementation
 Summary
   The File-System Structure is used to abstract the physical properties of secondary storage (disk) into the logical storage unit, the file, and the organization of the files, the directory.
   The physical disk can be one partition holding one File System, or segmented into multiple partitions holding multiple File Systems.
   File Systems are implemented in a layered or modular structure:
     Logical File System (Directory Structure and File Control Block (FCB)).
     File-Organization Module (logical-to-physical block translation).
     Basic File System (commands to the device driver; manages the memory buffers and caches).
     I/O Control (device drivers and interrupt handlers).
   Allocating space on disk can be done in three ways: Contiguous, Linked, or Indexed Allocation.
   Free-space management methods:
     Bit vectors and linked lists.
     Optimized through Grouping, Counting, and the FAT (the free-block linked list kept in the MS-DOS allocation table).
   Directories are implemented with a Linear List or a Chained-Overflow Hash Table.
   Remote directories are implemented with the Network File System (NFS):
     Using the Mount and RPC protocols to execute File-System operations on the remote server.