Operating System Principles
AUA CIS331
Albert Minasyan
Handout 11
File System Interface
11.1. File Concept
11.1.1. File attributes
11.1.2. File Operations
11.1.3. File Types
11.2. Access Methods
Sequential, Direct, Indexed
Silberschatz, 6th or 9th ed. Chapter 11.
11.1. File Concept
The file system provides mechanisms for storing and accessing data and programs.
We have to learn:
• the file-system interface presented to the user (programs)
• the file-system implementation on devices.
For most users, the file system is the most visible aspect of an operating system. It provides the
mechanism for on-line storage of and access to both data and programs of the operating system and all
the users of the computer system. The file system consists of two distinct parts: a collection of files,
each storing related data, and a directory structure, which organizes and provides information about
all the files in the system. Some file systems have a third part, partitions, which are used to separate
physically or logically large collections of directories.
The File System Interface consists of:
• Collection of Files
• Directory Structure
• Partitions
Figure: a partition contains directories, and each directory contains files.
Computers can store information on
several different storage media, such as
magnetic disks, magnetic tapes, and
optical disks. So that the computer system
will be convenient to use, the operating
system provides a uniform logical view
of information storage.
The operating system abstracts from the
physical properties of its storage devices to
define a logical storage unit (the file).
Files are mapped, by the operating system, onto physical devices. These storage devices are usually
nonvolatile, so the contents are persistent through power failures and system reboots.
A file is a named collection of related information that is recorded on secondary storage.
From a user's perspective, a file is the smallest allotment of logical secondary storage; that
is, data cannot be written to secondary storage unless they are within a file.
Commonly, files represent programs (both source and object forms) and data. Data files may be
numeric, alphabetic, alphanumeric, or binary. Files may be free form, such as text files, or may be
formatted rigidly. In general, a file is a sequence of bits, bytes, lines, or records, the meaning of which
is defined by the file's creator and user. The concept of a file is thus extremely general.
The information in a file is defined by its creator. Many different types of information may be stored
in a file: source programs, object programs, executable programs, numeric data, text, payroll
records, graphic images, sound recordings, and so on. A file has a certain defined structure
according to its type. A text file is a sequence of characters organized into lines (and possibly pages).
A source file is a sequence of subroutines and functions, each of which is further organized as
declarations followed by executable statements. An object file is a sequence of bytes organized into
blocks understandable by the system's linker. An executable file is a series of code sections that the
loader can bring into memory and execute.
Partitions in Windows
The figure shows the Master Boot Record (MBR), which resides in the first sector of the HDD and
holds the partition table. In this example the disk carries partitions for more than one operating system.
HDD layout:
MBR (track 0) - partition table with four entries
Entry 0 - Partition 0: Linux “/”
Entry 1 - Partition 1: Drive C: (NTFS)
Entry 2 - Partition 2: another partition
Entry 3 - no other partition
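As a rough illustration of how a four-entry partition table like the one in the figure can be read, the sketch below parses the classic MBR layout (table at byte offset 446, four 16-byte entries, boot signature 0x55 0xAA). The function names and the demo partition sizes are made up for this example; they are not taken from the handout.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # the partition table starts at byte 446 of the MBR
ENTRY_SIZE = 16           # each of the four entries is 16 bytes long
SIGNATURE = b"\x55\xaa"   # boot signature stored in the last two bytes
# Entry layout: boot flag, 3 bytes CHS start (skipped), partition type,
# 3 bytes CHS end (skipped), starting LBA, number of sectors.
ENTRY_FORMAT = "<B3xB3xII"


def parse_mbr(sector):
    """Return the used partition entries of a 512-byte MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != SIGNATURE:
        raise ValueError("not a valid MBR sector")
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        flag, ptype, lba_start, sectors = struct.unpack(
            ENTRY_FORMAT, sector[start:start + ENTRY_SIZE])
        if ptype == 0:    # type 0 marks an unused entry ("no other partition")
            continue
        partitions.append({
            "entry": index,
            "bootable": flag == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return partitions


def build_demo_mbr():
    """Build an in-memory MBR resembling the figure (all sizes are invented)."""
    sector = bytearray(SECTOR_SIZE)
    demo_entries = [
        (0x00, 0x83, 2048, 4000000),       # entry 0: Linux
        (0x80, 0x07, 4002048, 8000000),    # entry 1: NTFS, marked bootable
        (0x00, 0x0B, 12002048, 1000000),   # entry 2: another partition (FAT32)
    ]                                      # entry 3 stays empty
    for index, fields in enumerate(demo_entries):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        sector[start:start + ENTRY_SIZE] = struct.pack(ENTRY_FORMAT, *fields)
    sector[510:512] = SIGNATURE
    return bytes(sector)
```

Calling parse_mbr(build_demo_mbr()) returns three dictionaries, one per used entry; the fourth entry is skipped because its type byte is zero.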
What is a Directory?
C:\ is a logical drive on the primary partition.
A directory contains a list of files and directories, each with its attributes:
Directory1 (attributes)
Directory2
...
File1 (attributes)
File2
...
Figure: directories and files in the C:\ logical drive, with file attributes.
Unix Partitions
Note: the /dev (devices) directory corresponds to the ”Device Manager” list in Windows.

FreeBSD# df -k
Filesystem   1K-blocks     Used    Avail Capacity  Mounted on
/dev/ad0s1a    2015918    57760  1796886     3%    /
/dev/ad0s1h    4774910    10664  4382254     0%    /opt
/dev/ad0s1g     100750        4    92686     0%    /tmp
/dev/ad0s1e    6048238  1078834  4485546    19%    /usr
/dev/ad0s1f    6048238   986380  4578000    18%    /var

Solaris# df -k
Filesystem           kbytes     used     avail capacity  Mounted on
/dev/dsk/c0t0d0s0  20160418  1562132  18396682     8%    /
/dev/dsk/c0t0d0s3  10718701   377979  10233535     4%    /var
/dev/dsk/c0t1d0s7  35009161  4815522  29843548    14%    /opt

Linux# df -k
Filesystem  1k-blocks     Used Available Use% Mounted on
/dev/hda1     2016016  1350612    562992  71% /
Unix Root “/” catalog (directory)
Linux# ls -alF /
drwxr-xr-x  19 root root  4096 Jun  6 08:39 ./
drwxr-xr-x  19 root root  4096 Jun  6 08:39 ../
-rw-r--r--   1 root root     0 Feb 23 12:58 .autofsck
drwxr-xr-x   2 root root  4096 Feb 23  2006 bin/
drwxr-xr-x   3 root root  4096 Feb 23  2006 boot/
drwxr-xr-x  17 root root 77824 Feb 23 12:58 dev/
drwxr-xr-x  42 root root  4096 May 26 12:51 etc/
drwxr-xr-x   8 root root  4096 Sep  7  2008 home/
drwxr-xr-x   2 root root  4096 Jun 21  2001 initrd/
drwxr-xr-x   7 root root  4096 Feb 23  2006 lib/
drwxr-xr-x   2 root root 16384 Feb 23  2006 lost+found/
drwxr-xr-x   2 root root  4096 Aug 29  2001 misc/
drwxr-xr-x   4 root root  4096 Feb 23  2006 mnt/
drwxr-xr-x   2 root root  4096 Aug 23  1999 opt/
dr-xr-xr-x  42 root root     0 Feb 23 16:58 proc/
drwxr-x---   4 root root  4096 Nov 15  2007 root/
drwxr-xr-x   2 root root  4096 Feb 23  2006 sbin/
drwxrwxrwt   2 root root  4096 Jun  9 04:02 tmp/
drwxr-xr-x  16 root root  4096 Feb 23  2006 usr/
drwxr-xr-x  20 root root  4096 Feb 23  2006 var/
Data is stored in file systems as files and as catalogs (directories).
The top of the Unix directory tree is the root catalog “/”.
All other catalogs are created under the root catalog “/”.
The “/” catalog contains several standard directories that have the same meaning on most
Unix systems.
Log in as root to see the following structures.
/bin - basic Unix user commands and utilities (such as cat, cp, rm, mv, gzip, gunzip)
/boot - boot files and the system kernel
    vmlinuz-2.4.7-10 - the kernel of RedHat 7.2 Linux. Be careful with this file.
/dev - devices
    /dev/hdX, /dev/sdX - hard drives
    /dev/fdX - floppy drives
    /dev/cdrom - CD-ROM drive
    /dev/null - null (empty) device
    /dev/rmt - tape device
    /dev/pts/X - remote terminals
    /dev/ttyX - local terminals
/etc - main configuration files
/home - users’ home directories; each user has a home directory that is not accessible to other users
/lib - Unix system modules and libraries
/mnt - temporarily mounted devices
    /mnt/cdrom
    /mnt/floppy
/proc - running Unix processes presented in the form of an ordinary file system
/root - home directory of the user “root”
/sbin - system commands
/tmp - temporary files that are not important for the system
/usr - additional commands, configurations and applications for Unix (this tree resembles the
main directory tree of Linux)
    /usr/bin, /usr/sbin - additional commands
    /usr/include - C and C++ include files
    /usr/lib - additional libraries
    /usr/src - kernel source code
    /usr/share/man - manual pages
    /usr/share/doc - documentation about everything in Linux
    /usr/local - a second subtree of directories for local programs
/var - temporary and dynamically changing files that are important for the system (with some
exceptions)
    /var/cache - cache files (e.g. cached man pages)
    /var/log - all system log files (records of system status)
        messages - the system log file in which all important system events are recorded
        maillog - the mail log file in which all mail traffic is recorded
    /var/named - Domain Name System configuration files (very important files;
    should be moved to /etc after DNS installation)
    /var/run - information about running processes
    /var/spool/cron - control files for periodic processes (important files)
    /var/spool/mqueue - holds mail before it is sent to the recipient (defined in the
    /etc/sendmail.cf configuration file)
The chain of directories leading to the desired file or directory, including the target itself, is called
a path or pathname.
A path can be absolute or relative.
An absolute path leads to the file or directory starting from the root directory.
A relative path leads to the file or directory starting from the current directory.
cd /etc/mail        (absolute path)
cd /                (absolute path)
cd /home            (absolute path)
cd <username>       (relative path)
cd ./<username>     (relative path)
cd work/man/mann    (relative path)
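The difference between the two kinds of path can be sketched with Python's standard posixpath module: an absolute path is used as given, while a relative path is first joined to the current directory. The helper name resolve and the directory /home/studnt are illustrative assumptions.

```python
import posixpath


def resolve(current_dir, path):
    """Return the absolute path that `path` names when used from `current_dir`."""
    if posixpath.isabs(path):
        # Absolute path: starts at the root "/", the current directory is ignored.
        return posixpath.normpath(path)
    # Relative path: interpreted starting from the current directory.
    return posixpath.normpath(posixpath.join(current_dir, path))


examples = [
    resolve("/home/studnt", "/etc/mail"),      # absolute
    resolve("/home", "studnt"),                # relative
    resolve("/home", "./studnt"),              # relative, "." is the current directory
    resolve("/home/studnt", "work/man/mann"),  # relative
]
```

The same relative path therefore names different files depending on the current directory, whereas an absolute path always names the same one.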
Figure: Windows main directories and files (system folders (directories), system files, boot files).
11.1.1. File Attributes
A file is named, for the convenience of its human users, and is referred to by its name. A name is
usually a string of characters, such as example.c. Some systems differentiate between upper- and
lowercase characters in names, whereas other systems consider the two cases to be equivalent. When a
file is named, it becomes independent of the process, the user, and even the system that created it. For
instance, one user might create the file example.c, whereas another user might edit that file by
specifying its name. The file's owner might write the file to a floppy disk, send it in an e-mail, or copy
it across a network, and it could still be called example.c on the destination system.
File Attributes in Unix
ls -alF
total 32    (this directory uses approximately 32 blocks on the HDD)
drwx------  3 studnt studnt 4096 Aug 24 16:59 ./
drwxr-xr-x  5 root   root   4096 Aug 24 16:36 ../
-rw-------  1 studnt studnt  278 Aug 24 16:36 .bash_history
-rw-r--r--  1 studnt studnt   24 Aug 24 11:00 .bash_logout
-rw-r--r--  1 studnt studnt  191 Aug 24 11:00 .bash_profile
-rw-r--r--  1 studnt studnt  124 Aug 24 11:00 .bashrc
drwxrwxr-x  3 studnt studnt 4096 Aug 24 16:59 .mc/
-rw-r--r--  1 studnt studnt 3511 Aug 24 11:00 .screenrc
===================================================================
Fields of each line: protection (1-4), hard links (5), ownership: user (6) and group (7),
file size in bytes (8), last modification date (9), file name (10).
1 - the type of file (“d” - directory, “-” - ordinary file)
2 - read, write, execute rights of the file owner (the owner has read and write but not execute
permission for the “.bash_logout” file)
3 - read, write, execute rights of the group (the group has only read permission for the
“.bash_logout” file)
4 - read, write, execute rights of other users (other users have only read permission for the
“.bash_logout” file)
5 - count of the file’s hard links
6 - the user name of the file owner
7 - the name of the group the file belongs to
8 - the file size in bytes
9 - the date of last modification
10 - the file name; a leading “.” is part of the name and marks a hidden file
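The same attributes can be read by a program. The sketch below uses Python's os.lstat and stat.filemode to rebuild the fields of one listing line; the function names describe and demo are illustrative, and the demo file imitates “.bash_logout” (24 bytes, mode rw-r--r--) on a Unix-like system.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Collect the attributes that one line of `ls -al` displays."""
    info = os.lstat(path)                       # lstat does not follow symbolic links
    name = os.path.basename(path)
    return {
        "protection": stat.filemode(info.st_mode),   # fields 1-4, e.g. "-rw-r--r--"
        "links": info.st_nlink,                      # field 5
        "uid": info.st_uid,                          # field 6 (numeric user id)
        "gid": info.st_gid,                          # field 7 (numeric group id)
        "size": info.st_size,                        # field 8, in bytes
        "modified": time.strftime("%b %d %H:%M", time.localtime(info.st_mtime)),
        "name": name,                                # field 10
        "hidden": name.startswith("."),
    }


def demo():
    """Create a 24-byte hidden file with mode rw-r--r-- and describe it."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, ".bash_logout")
        with open(path, "wb") as f:
            f.write(b"x" * 24)
        os.chmod(path, 0o644)
        return describe(path)
```

The user and group are returned as numeric identifiers; ls translates them to names such as studnt by consulting the system's user database.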
A file has certain other attributes, which vary from one operating system to another, but typically
consist of these:
• Name: The symbolic file name is the only information kept in human readable form.
• Identifier: This unique tag, usually a number, identifies the file within the file system; it is the
non-human-readable name for the file.
• Type: This information is needed for those systems that support different types.
• Size: The current size of the file (in bytes, words, or blocks), and possibly the maximum allowed
size are included in this attribute.
• Protection: Access-control information determines who can do reading, writing, executing, and
so on.
• Time, date, and user identification: This information may be kept for creation, last
modification, and last use. These data can be useful for protection, security, and usage monitoring.
The information about all files is kept in the directory structure that also resides on secondary storage.
Typically, the directory entry consists of the file's name and its unique identifier. The identifier in turn
locates the other file attributes. It may take more than a kilobyte to record this information for each
file. In a system with many files, the size of the directory itself may be megabytes. Because
directories, like files, must be nonvolatile, they must be stored on the device and brought into memory
as needed.
Figure: Directory entry.
11.1.2. File Operations
A file is an abstract data type. To define a file properly, we need to consider the operations that can
be performed on files. The operating system can provide system calls to create, write, read, reposition,
delete, and truncate files. Let us also consider what the operating system must do for each of the six
basic file operations. It should then be easy to see how similar operations, such as renaming a file,
would be implemented.

Creating a file: Two steps are necessary to create a file. First, space in the file system must be
found for the file. Second, an entry for the new file must be made in the directory. The directory
entry records the name of the file and the location in the file system, and possibly other
information.

Writing a file: To write a file, we make a system call specifying both the name of the file and the
information to be written to the file. Given the name of the file, the system searches the directory
to find the location of the file. The system must keep a write pointer to the location in the file
where the next write is to take place. The write pointer must be updated whenever a write occurs.

Reading a file: To read from a file, we use a system call that specifies the name of the file and
where (in memory) the next block of the file should be put. Again, the directory is searched for
the associated directory entry, and the system needs to keep a read pointer to the location in the
file where the next read is to take place. Once the read has taken place, the read pointer is updated.
A given process is usually only reading or writing a given file, and the current operation location
is kept as a per-process current-file-position pointer. Both the read and write operations use this
same pointer, saving space and reducing the system complexity.

Repositioning within a file: The directory is searched for the appropriate entry, and the
current-file-position is set to a given value. Repositioning within a file does not need to involve any actual
I/O. This file operation is also known as a file seek.

Deleting a file: To delete a file, we search the directory for the named file. Having found the
associated directory entry, we release all file space, so that it can be reused by other files, and
erase the directory entry.
Truncating a file: The user may want to erase the contents of a file but keep its attributes. Rather
than forcing the user to delete the file and then recreate it, this function allows all attributes to
remain unchanged (except for file length) but lets the file be reset to length zero and its file space
released.
These six basic operations certainly comprise the minimal set of required file operations. Other
common operations include appending new information to the end of an existing file and renaming
an existing file. These primitive operations may then be combined to perform other file operations.
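The six basic operations map closely onto the low-level calls exposed by Python's os module, which wrap the corresponding system calls. The following is a minimal sketch under that assumption; the file name example.dat and the function names are made up for the illustration.

```python
import os
import tempfile


def six_operations(directory):
    """Run create, write, reposition, read, truncate and delete on one file."""
    path = os.path.join(directory, "example.dat")

    # Create: make a new directory entry; O_EXCL fails if the name already exists.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o600)

    # Write: data is placed at the current-file-position pointer, which advances.
    os.write(fd, b"hello, file system")

    # Reposition (seek): move the pointer to byte 7; no data is transferred.
    os.lseek(fd, 7, os.SEEK_SET)

    # Read: take 4 bytes starting at the pointer and advance it.
    chunk = os.read(fd, 4)

    # Truncate: keep the file and its attributes but reset its length to zero.
    os.ftruncate(fd, 0)
    size_after_truncate = os.fstat(fd).st_size
    os.close(fd)

    # Delete: remove the directory entry so the space can be reused.
    os.remove(path)
    return chunk, size_after_truncate, os.path.exists(path)


def demo():
    with tempfile.TemporaryDirectory() as directory:
        return six_operations(directory)
```

Note that reading and writing share a single position pointer per open file, as described for the current-file-position pointer above.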
Most of the file operations mentioned involve searching the directory for the entry associated with the
named file. To avoid this constant searching, many systems require that an open system call be used
before that file is first used actively. The operating system keeps a small table containing information
about all open files (the open-file table). When a file operation is requested, the file is specified via an
index into this table, so no searching is required. When the file is no longer actively used, it is closed
by the process and the operating system removes its entry in the open-file table.
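The role of the open-file table can be shown with a toy in-memory model: the directory is searched once, at open time, and every later operation uses only the returned index. This is a simplified sketch, not how any particular operating system implements it; the class ToyFileSystem and all its members are hypothetical.

```python
class ToyFileSystem:
    """Toy model of a directory plus an open-file table."""

    def __init__(self):
        self.directory = {}           # file name -> bytearray holding the contents
        self.open_files = []          # open-file table: index -> entry, or None if free
        self.directory_searches = 0   # counts how often the directory was searched

    def create(self, name):
        self.directory[name] = bytearray()

    def open(self, name):
        self.directory_searches += 1          # the only directory search for this file
        entry = {"data": self.directory[name], "position": 0}
        for index, slot in enumerate(self.open_files):
            if slot is None:                  # reuse a slot freed by an earlier close
                self.open_files[index] = entry
                return index
        self.open_files.append(entry)
        return len(self.open_files) - 1

    def write(self, handle, payload):
        entry = self.open_files[handle]       # found by index, no searching
        pos = entry["position"]
        entry["data"][pos:pos + len(payload)] = payload
        entry["position"] = pos + len(payload)

    def read(self, handle, count):
        entry = self.open_files[handle]
        pos = entry["position"]
        chunk = bytes(entry["data"][pos:pos + count])
        entry["position"] = pos + len(chunk)
        return chunk

    def seek(self, handle, position):
        self.open_files[handle]["position"] = position

    def close(self, handle):
        self.open_files[handle] = None        # remove the entry from the table
```

For example, after fs.create("a.txt") and h = fs.open("a.txt"), any number of fs.write(h, ...) and fs.read(h, ...) calls leave fs.directory_searches at 1.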
The operating system provides a uniform logical view of information storage not only for users
but also for programs.
Basic File Operations
• Create - find space in the file system and create an entry in the directory structure.
• Delete - release the file space and delete the directory entry.
• Seek - reposition the location pointer in the file.
• Read - specifies the name of the file and the memory buffer that receives the data. The system
keeps a location pointer in the file.
• Write - specifies the name of the file and the information to be written. The system also keeps a
location pointer in the file.
• Append - writes information at the end of the file.
• Truncate - erases the file content (size = 0).
• Open, Close - to avoid searching the directory for each of the previous operations, and so to
speed up file operations overall, the operating system provides the open and close operations, which
associate a handle with the file.
Figure: the directory structure (Directory1, Directory2, ..., File1 (attributes), File2, ...), with File1
accessed for Read and Write through a handle and a buffer.
11.1.3. File Types
A common technique for implementing file types is to include the type as part of the file name. The
name is split into two parts: a name and an extension, usually separated by a period character. In this
way, the user and the operating system can tell from the name alone what the type of a file is.
For example, in MS-DOS, a name can consist of up to eight characters followed by a period and
terminated by an extension of up to three characters. The system uses the extension to indicate the
type of the file and the type of operations that can be done on that file.
For instance, only a file with a .com, .exe, or .bat extension can be executed. The .com and .exe files
are two forms of binary executable files, whereas a .bat file is a batch file containing, in ASCII
format, commands to the operating system.
In Windows, file names can be longer. The executable file extensions are the same as in MS-DOS. All
other extensions are handled by the appropriate applications or by the OS itself.
The UNIX system is unable to provide such a feature because it uses a crude magic number stored at
the beginning of some files to indicate roughly the type of the file: executable program, batch file (or
shell script), postscript file, and so on. Not all files have magic numbers, so system features cannot be
based solely on this type of information. UNIX does not record the name of the creating program,
either. UNIX does allow file-name-extension hints, but these extensions are not enforced or depended
on by the operating system; they are mostly to aid users in determining the type of contents of the file.
Extensions can be used or ignored by a given application, but that is up to the application's
programmer.
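A magic-number check can be sketched as a comparison of the first bytes of a file against a small table of known prefixes. The table below lists only a few well-known values (ELF, “#!” scripts, PostScript, gzip) and is far from complete; the function names are illustrative assumptions.

```python
MAGIC_NUMBERS = [
    (b"\x7fELF", "ELF executable or object file"),
    (b"#!", "script with an interpreter line"),
    (b"%!PS", "PostScript document"),
    (b"\x1f\x8b", "gzip-compressed data"),
]


def guess_type(header):
    """Guess a file type from the first bytes of its contents."""
    for magic, description in MAGIC_NUMBERS:
        if header.startswith(magic):
            return description
    # Not all files carry a magic number, so the guess can fail.
    return "unknown (no recognised magic number)"


def guess_file_type(path):
    """Read the first few bytes of a file and guess its type."""
    with open(path, "rb") as f:
        return guess_type(f.read(8))
```

A plain text file falls through to the "unknown" case, which illustrates why system features cannot be based solely on this information.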
Certain files must conform to a required structure that is understood by the operating system. For
example, the operating system may require that an executable file have a specific structure so that it
can determine where in memory to load the file and what the location of the first instruction is. Some
operating systems extend this idea into a set of system-supported file structures, with sets of special
operations for manipulating files with those structures.
File Types in MS-DOS and Windows
A three-letter file extension defines the file type.
name.com - executable binary file of the com type
name.exe - executable binary file of the exe type
name.bat - executable batch or script file, handled by the command.com or cmd.exe
interpreters located in the c:/windows/system32 folder
name.xxx - all other extensions are handled by the OS or by the appropriate applications
File Types in Unix
rc.firewall - no file extension in Unix
ipnat.conf, startservice.sh - “dots” are part of the file name, separating its parts for user convenience
.bash.rc - a leading “dot” means a “hidden” file (not always shown by the usual commands)
So extensions do not define file types in Unix.
Linux lumps everything into four basic types of files: ordinary files, directories, links and
special files.
-rw-r--r--  1 studnt studnt  124 Jan 10 18:07 .bashrc
drwxrwxr-x  3 studnt studnt 4096 Jan 10 18:44 .mc/
drwx------  2 studnt studnt 4096 Feb  3 11:29 mail/
lrwxrwxrwx  1 root   root     11 Jan 22 19:54 /etc/init.d -> rc.d/init.d/
brw-rw----  1 root   disk   3, 1 Aug 31  2001 /dev/hda1
crw--w----  1 studnt tty    4, 1 Oct 17 17:36 /dev/tty1
The type of file is given by the first symbol of the file attributes line:
-   ordinary file
d   directory
l   symbolic link
b   block-special file
c   character-special file
s   socket
p   named pipe
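The type letter can also be derived in a program from the mode bits returned by lstat, using the predicates of Python's stat module. The sketch below assumes a Unix-like system; type_letter and demo are hypothetical helper names.

```python
import os
import stat
import tempfile


def type_letter(path):
    """Return the first symbol that `ls -l` would print for `path`."""
    mode = os.lstat(path).st_mode     # lstat so that a link is reported as a link
    if stat.S_ISREG(mode):
        return "-"                    # ordinary file
    if stat.S_ISDIR(mode):
        return "d"                    # directory
    if stat.S_ISLNK(mode):
        return "l"                    # symbolic link
    if stat.S_ISBLK(mode):
        return "b"                    # block-special file
    if stat.S_ISCHR(mode):
        return "c"                    # character-special file
    if stat.S_ISSOCK(mode):
        return "s"                    # socket
    if stat.S_ISFIFO(mode):
        return "p"                    # named pipe
    return "?"


def demo():
    """Classify a directory, an ordinary file and a symbolic link."""
    with tempfile.TemporaryDirectory() as directory:
        file_path = os.path.join(directory, "plain.txt")
        open(file_path, "w").close()
        link_path = os.path.join(directory, "link")
        os.symlink(file_path, link_path)
        return (type_letter(directory), type_letter(file_path), type_letter(link_path))
```

On a typical Linux system, type_letter("/dev/null") would be expected to give "c", matching the listings of special files in this handout.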
Special Files
Character-special devices (byte oriented device files)
Every physical device associated with a Linux system, including disks, terminals, and printers, is
represented in the file system. Most, if not all, devices are located in the /dev directory. For
example, if you’re working on the system console, your associated device is named /dev/console.
If you’re working on a standard terminal, your device name might be /dev/tty01. Terminals, or
serial lines, are called tty devices (which stands for teletype, the original UNIX terminal). To
determine what the name of your tty device is, type the command tty. The system responds with
the name of the device to which you’re connected.
tty
/dev/tty1
ls -al /dev/tty1
crw--w----  1 studnt tty  4, 1 Oct 17 17:36 /dev/tty1
(the leading “c” shows that this is a character-special device)
Printers and terminals are called character-special devices (byte oriented device files). They
can accept and produce a stream of characters.
Block-special devices (block oriented device files)
Disks, on the other hand, store data in blocks addressed by cylinder, head and sector. You can’t
access just one character on a disk; you must read and write entire blocks. The same is usually
true of magnetic tapes. This kind of device is called a block-special device (block oriented
device files).
ls -al /dev/fd0
brw-rw----  1 studnt floppy  2, 0 Aug 31  2001 fd0
ls -al /dev/hda1
brw-rw----  1 root   disk    3, 1 Aug 31  2001 hda1
One device-special file, the bit bucket or /dev/null, is very useful. Anything you send to
/dev/null is ignored, which is useful when you don’t want to see the output of a command.
Let’s look at the /dev/null device:
ls -al /dev/null
crw-rw-rw-  1 root root  1, 3 Aug 31  2001 /dev/null
Named pipes (FIFO)
A FIFO (first-in-first-out buffer), also known as a named pipe, looks like an ordinary file. If you write
to it, it grows; if you read from it, it shrinks. FIFOs are used mainly by system
processes to allow many programs to send information to a single controlling process.
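A named pipe can be tried out with os.mkfifo, which is available on Unix-like systems only. In this sketch a writer thread stands in for a sending program and the main thread for the controlling process; the file name demo.fifo and the function name are made up for the example.

```python
import os
import tempfile
import threading


def fifo_round_trip(messages):
    """Send byte strings through a named pipe and return what the reader received."""
    with tempfile.TemporaryDirectory() as directory:
        fifo_path = os.path.join(directory, "demo.fifo")
        os.mkfifo(fifo_path)                  # creates a file of type "p"

        def writer():
            # Opening for writing blocks until a reader has opened the FIFO.
            with open(fifo_path, "wb") as pipe:
                for message in messages:
                    pipe.write(message)

        sender = threading.Thread(target=writer)
        sender.start()
        # Opening for reading blocks until a writer has opened the FIFO.
        with open(fifo_path, "rb") as pipe:
            received = pipe.read()            # returns once the writer closes its end
        sender.join()
        return received
```

Data leaves the pipe in the order in which it was written, and once read it is gone; nothing accumulates in the file on disk.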
Sockets
Sockets provide interconnection between 2 processes.
11.2. Access Methods
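Figure 11.4 Sequential-access file: reset moves to the beginning; read, read, read ... or write, write, write ... then proceed in order.
Direct access: positions run from 0 through N to Max; the operations are read N and write N, where N is a
number relative to the beginning of the file.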
The simplest access method is sequential access. Information in the file is processed in order, one
record after the other. This mode of access is by far the most common; for example, editors and
compilers usually access files in this fashion.
The bulk of the operations on a file are reads and writes. A read operation reads the next portion of the
file and automatically advances a file pointer, which tracks the I/O location. Similarly, a write appends
to the end of the file and advances to the end of the newly written material (the new end of file).
Such a file can be reset to the beginning and, on some systems, a program may be able to skip forward
or backward n records, for some integer n (perhaps only for n = 1). Sequential access is based on a tape
model of a file, and works as well on sequential-access devices as it does on random-access ones.
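The tape model can be written down as a small class with only three operations: reset, read next and write next. This is a minimal in-memory sketch; the class name SequentialFile and its attribute names are illustrative.

```python
class SequentialFile:
    """Tape-like file: records are read in order and written only at the end."""

    def __init__(self):
        self.records = []
        self.position = 0     # file pointer: index of the next record to read

    def reset(self):
        """Rewind to the beginning of the file."""
        self.position = 0

    def read_next(self):
        """Return the next record and advance the pointer; None at end of file."""
        if self.position >= len(self.records):
            return None
        record = self.records[self.position]
        self.position += 1
        return record

    def write_next(self, record):
        """Append a record and move the pointer to the new end of file."""
        self.records.append(record)
        self.position = len(self.records)
```

To reach the third record, a program has to reset and then call read_next three times; there is no way to jump straight to it.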
Another method is direct access (or relative access). A file is made up of fixed length logical
records that allow programs to read and write records rapidly in no particular order. The direct-access
method is based on a disk model of a file, since disks allow random access to any file block. For direct
access, the file is viewed as a numbered sequence of blocks or records. A direct-access file allows
arbitrary blocks to be read or written. Thus, we may read block 14, then read block 53, and then write
block 7. There are no restrictions on the order of reading or writing for a direct-access file.
Direct-access files are of great use for immediate access to large amounts of information. Databases
are often of this type.
For the direct-access method, the file operations must be modified to include the block number as a
parameter. Thus, we have read n, where n is the block number, rather than read next, and write n rather
than write next.
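With fixed-length records, read n and write n reduce to a seek to byte offset n times the record size. The sketch below uses an in-memory io.BytesIO object in place of a disk file; the record size of 16 bytes and the function names are assumptions made for the example.

```python
import io

RECORD_SIZE = 16    # assumed fixed logical record length, in bytes


def write_record(f, n, payload):
    """write n: store `payload` as record number n (relative to the file start)."""
    if len(payload) > RECORD_SIZE:
        raise ValueError("payload does not fit into one record")
    f.seek(n * RECORD_SIZE)
    f.write(payload.ljust(RECORD_SIZE, b"\0"))    # pad to the fixed length


def read_record(f, n):
    """read n: fetch record number n without touching the records before it."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).rstrip(b"\0")


data_file = io.BytesIO()
write_record(data_file, 53, b"fifty-three")
write_record(data_file, 7, b"seven")
write_record(data_file, 14, b"fourteen")
```

The records were written in the order 53, 7, 14 and can be read back in any order, for example read_record(data_file, 14) followed by read_record(data_file, 53).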
Figure 11.5 Simulation of sequential access on a direct-access file.
An alternative approach is to retain read next and write next, as with sequential access, and to add an
operation position file to n, where n is the block number. Then, to effect a read n, we would position
to n and then read next.
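The alternative just described can be sketched as follows: the file keeps a current position (called cp here), read next and write next advance it, and read n is built from position followed by read next. The class DirectFile is a hypothetical in-memory model, and writing beyond the end of the file is not handled.

```python
class DirectFile:
    """Direct-access file offering position(n) plus read next and write next."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.cp = 0               # current position, a relative block number

    def position(self, n):
        self.cp = n

    def read_next(self):
        block = self.blocks[self.cp]
        self.cp += 1
        return block

    def write_next(self, block):
        if self.cp == len(self.blocks):
            self.blocks.append(block)         # writing at end of file extends it
        else:
            self.blocks[self.cp] = block      # otherwise overwrite in place
        self.cp += 1

    def read(self, n):
        """read n, expressed as: position to n, then read next."""
        self.position(n)
        return self.read_next()

    def write(self, n, block):
        """write n, expressed as: position to n, then write next."""
        self.position(n)
        self.write_next(block)
```

Setting cp to 0 gives the reset operation of the sequential model, so one set of primitives can serve both access methods.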
The block number provided by the user to the operating system is normally a relative block number.
A relative block number is an index relative to the beginning of the file. Thus, the first relative block
of the file is 0, the next is 1, and so on, even though the actual absolute disk address of the block may
be 14703 for the first block and 3192 for the second.
Access using indexes
Other access methods can be built on top of a direct-access method. These methods generally involve the
construction of an index for the file. The index, like an index in the back of a book, contains pointers to
the various blocks. To find a record in the file, we first search the index, and then use the pointer to
access the file directly and to find the desired record.
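An index built on top of direct access can be sketched with a dictionary that maps a key to a relative block number. The record format ("key:value" padded to 16 bytes), the sample names and the function names are all assumptions made for the illustration.

```python
import io

RECORD_SIZE = 16    # assumed fixed record length, in bytes


def build_file(items):
    """Write (key, value) pairs as fixed-length records and build the index."""
    data_file = io.BytesIO()
    index = {}
    for block_number, (key, value) in enumerate(items):
        record = f"{key}:{value}".encode("ascii")
        if len(record) > RECORD_SIZE:
            raise ValueError("record does not fit into one block")
        data_file.write(record.ljust(RECORD_SIZE))   # pad with spaces
        index[key] = block_number                    # pointer to the block
    return data_file, index


def lookup(data_file, index, key):
    """Search the index first, then go directly to the block it points to."""
    block_number = index.get(key)
    if block_number is None:
        return None
    data_file.seek(block_number * RECORD_SIZE)
    record = data_file.read(RECORD_SIZE).decode("ascii").rstrip()
    return record.split(":", 1)[1]


payroll, payroll_index = build_file([("Adams", "payroll-1"), ("Smith", "payroll-2")])
```

Here lookup(payroll, payroll_index, "Smith") reads exactly one block, whichever position the record occupies in the file.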
Figure 11.6 Example of index and relative files.