Operating System Principles AUA CIS331 Albert Minasyan

Handout 11. File System Interface

11.1. File Concept
    11.1.1. File Attributes
    11.1.2. File Operations
    11.1.3. File Types
11.2. Access Methods
    Sequential, Direct, Indexed

Silberschatz, 6th or 9th ed., Chapter 11.

11.1. File Concept

The file system provides mechanisms for storing and accessing data and programs.

We have to learn:
    - the file system interface presented to the user (programs);
    - the file system implementation on devices.

For most users, the file system is the most visible aspect of an operating system. It provides the mechanism for on-line storage of and access to both data and programs of the operating system and all the users of the computer system.

The file system consists of two distinct parts: a collection of files, each storing related data, and a directory structure, which organizes and provides information about all the files in the system. Some file systems have a third part, partitions, which are used to separate physically or logically large collections of directories.

The File System Interface consists of:
    - Collection of Files
    - Directory Structure
    - Partitions

[Figure: a partition containing directories, each of which contains files.]

Computers can store information on several different storage media, such as magnetic disks, magnetic tapes, and optical disks. So that the computer system will be convenient to use, the operating system provides a uniform logical view of information storage. The operating system abstracts from the physical properties of its storage devices to define a logical storage unit (the file). Files are mapped, by the operating system, onto physical devices. These storage devices are usually nonvolatile, so the contents are persistent through power failures and system reboots.
A file is a named collection of related information that is recorded on secondary storage. From a user's perspective, a file is the smallest allotment of logical secondary storage; that is, data cannot be written to secondary storage unless they are within a file.

Commonly, files represent programs (both source and object forms) and data. Data files may be numeric, alphabetic, alphanumeric, or binary. Files may be free form, such as text files, or may be formatted rigidly. In general, a file is a sequence of bits, bytes, lines, or records, the meaning of which is defined by the file's creator and user. The concept of a file is thus extremely general.

The information in a file is defined by its creator. Many different types of information may be stored in a file: source programs, object programs, executable programs, numeric data, text, payroll records, graphic images, sound recordings, and so on.

A file has a certain defined structure according to its type:
    - A text file is a sequence of characters organized into lines (and possibly pages).
    - A source file is a sequence of subroutines and functions, each of which is further organized as declarations followed by executable statements.
    - An object file is a sequence of bytes organized into blocks understandable by the system's linker.
    - An executable file is a series of code sections that the loader can bring into memory and execute.

Partitions in Windows

The figure shows the Master Boot Record (MBR), which resides at the start of the hard disk (track 0) and holds the partition information. In this example the disk carries partitions for more than one operating system.

[Figure: HDD with the MBR on track 0. The MBR entries point to the partitions:]

    MBR entry    Partition
    Entry 0      Partition 0 - Linux "/"
    Entry 1      Partition 1 - Drive C: NTFS
    Entry 2      Partition 2 - Another partition
    Entry 3      No other partition

What is the Directory?
A directory contains the list of files and directories, together with their attributes.

[Figure: directories and files in the C:\ logical drive on a primary partition. The directory lists Directory1 (attributes), Directory2, ..., File1 (attributes), File2, ...]

Unix Partitions

/dev (the devices directory) is the counterpart of the "Device Manager" list in Windows.

    FreeBSD# df -k
    Filesystem    1K-blocks     Used    Avail Capacity  Mounted on
    /dev/ad0s1a     2015918    57760  1796886     3%    /
    /dev/ad0s1h     4774910    10664  4382254     0%    /opt
    /dev/ad0s1g      100750        4    92686     0%    /tmp
    /dev/ad0s1e     6048238  1078834  4485546    19%    /usr
    /dev/ad0s1f     6048238   986380  4578000    18%    /var

    Solaris# df -k
    Filesystem            kbytes     used     avail capacity  Mounted on
    /dev/dsk/c0t0d0s0   20160418  1562132  18396682     8%    /
    /dev/dsk/c0t0d0s3   10718701   377979  10233535     4%    /var
    /dev/dsk/c0t1d0s7   35009161  4815522  29843548    14%    /opt

    Linux# df -k
    Filesystem   1k-blocks     Used Available Use% Mounted on
    /dev/hda1      2016016  1350612    562992  71% /

Unix Root "/" catalog (directory)

    Linux# ls -alF /
    drwxr-xr-x  19 root root  4096 Jun  6 08:39 ./
    drwxr-xr-x  19 root root  4096 Jun  6 08:39 ../
    -rw-r--r--   1 root root     0 Feb 23 12:58 .autofsck
    drwxr-xr-x   2 root root  4096 Feb 23  2006 bin/
    drwxr-xr-x   3 root root  4096 Feb 23  2006 boot/
    drwxr-xr-x  17 root root 77824 Feb 23 12:58 dev/
    drwxr-xr-x  42 root root  4096 May 26 12:51 etc/
    drwxr-xr-x   8 root root  4096 Sep  7  2008 home/
    drwxr-xr-x   2 root root  4096 Jun 21  2001 initrd/
    drwxr-xr-x   7 root root  4096 Feb 23  2006 lib/
    drwxr-xr-x   2 root root 16384 Feb 23  2006 lost+found/
    drwxr-xr-x   2 root root  4096 Aug 29  2001 misc/
    drwxr-xr-x   4 root root  4096 Feb 23  2006 mnt/
    drwxr-xr-x   2 root root  4096 Aug 23  1999 opt/
    dr-xr-xr-x  42 root root     0 Feb 23 16:58 proc/
    drwxr-x---   4 root root  4096 Nov 15  2007 root/
    drwxr-xr-x   2 root root  4096 Feb 23  2006 sbin/
    drwxrwxrwt   2 root root  4096 Jun  9 04:02 tmp/
    drwxr-xr-x  16 root root  4096 Feb 23  2006 usr/
    drwxr-xr-x  20 root root  4096 Feb 23  2006 var/

The data is stored in filesystems as files and as catalogs (directories).
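The df -k listings above report, for each mounted filesystem, its size, used space and available space in 1 KB blocks. As a rough sketch only (this is not how df itself is implemented), Python's standard shutil.disk_usage returns similar quantities for the filesystem that holds a given path. The helper name usage_in_kblocks is hypothetical, and the figures may differ slightly from df because of how reserved blocks are counted.

```python
import shutil


def usage_in_kblocks(path):
    """Return (total, used, free) for the filesystem holding `path`, in 1 KiB blocks."""
    usage = shutil.disk_usage(path)
    return usage.total // 1024, usage.used // 1024, usage.free // 1024


if __name__ == "__main__":
    # "/" is the root catalog on Unix; on Windows it refers to the current drive.
    total, used, free = usage_in_kblocks("/")
    print(f"{'1K-blocks':>12} {'Used':>12} {'Avail':>12}  Mounted on")
    print(f"{total:>12} {used:>12} {free:>12}  /")
```

The function takes any path, not only a mount point, so calling it on a home directory reports the filesystem that the directory lives on.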
The top of the Unix directory tree hierarchy is the root "/" catalog. All other catalogs are created starting from the root "/" catalog.

There are several standard directories in the "/" catalog which have the same meaning on most Unix systems. Log in as root to see the following structures.

    /bin    - basic Unix user commands and utilities (like cat, cp, rm, mv, gzip, gunzip)
    /boot   - booting files and the system kernel
              vmlinuz-2.4.7-10 is the kernel of RedHat 7.2 Linux. Be careful with this file.
    /dev    - devices
              /dev/hdX, /dev/sdX  - hard drives
              /dev/fdX            - floppy drives
              /dev/cdrom          - CD-ROM drive
              /dev/null           - null (empty) device
              /dev/rmt            - tape device
              /dev/pts/X          - remote terminals
              /dev/ttyX           - local terminals
    /etc    - main configuration files
    /home   - users' home directories; each user has his own home directory, which is not accessible to the other users
    /lib    - Unix system modules and libraries
    /mnt    - temporarily mounted devices
              /mnt/cdrom
              /mnt/floppy
    /proc   - running Unix processes, presented in the form of a usual file system
    /root   - home directory of the user "root"
    /sbin   - system commands
    /tmp    - temporary files not important for the system
    /usr    - additional commands, configurations and applications for Unix (this tree resembles the main Linux directory tree)
              /usr/bin, /usr/sbin - additional commands
              /usr/include        - C, C++ include libraries
              /usr/lib            - additional libraries
              /usr/src            - kernel source code
              /usr/share/man      - manual pages
              /usr/share/doc      - documentation about everything in Linux
              /usr/local          - the second subtree of directories, for local programs
    /var    - temporary and dynamically changed files important for the system (there are some exceptions)
              /var/cache          - cache files (e.g. cached man pages)
              /var/log            - all system log files (system status registration)
                  messages        - the system log file, where all important system events are registered
                  maillog         - the mail log file, where all mail traffic is registered
              /var/named          - Domain Name System configuration files (very important files, should be moved to /etc after DNS installation)
              /var/run            - keeps information about running processes
              /var/spool/cron     - periodic process control files (important files)
              /var/spool/mqueue   - keeps mail before it is sent to the recipient (defined in the /etc/sendmail.cf configuration file)

The chain of directories leading to the desired file or directory, including that file or directory itself, is called the path or pathname. A path can be absolute or relative. An absolute path is the path to the file or directory beginning from the root directory. A relative path is the path to the file or directory beginning from the current directory.

    cd /etc/mail        - (absolute path)
    cd /                - (absolute path)
    cd /home            - (absolute path)
    cd <username>       - (relative path)
    cd ./<username>     - (relative path)
    cd work/man/mann    - (relative path)

[Figure: Windows main directories and files: system folders (directories), system files, boot files.]

11.1.1. File Attributes

A file is named, for the convenience of its human users, and is referred to by its name. A name is usually a string of characters, such as example.c. Some systems differentiate between upper- and lowercase characters in names, whereas other systems consider the two cases to be equivalent. When a file is named, it becomes independent of the process, the user, and even the system that created it. For instance, one user might create the file example.c, whereas another user might edit that file by specifying its name. The file's owner might write the file to a floppy disk, send it in an e-mail, or copy it across a network, and it could still be called example.c on the destination system.
File Attributes in Unix

    ls -alF
    total 32        (this directory uses approximately 32 blocks on the HDD)
    drwx------  3 studnt studnt 4096 Aug 24 16:59 ./
    drwxr-xr-x  5 root   root   4096 Aug 24 16:36 ../
    -rw-------  1 studnt studnt  278 Aug 24 16:36 .bash_history
    -rw-r--r--  1 studnt studnt   24 Aug 24 11:00 .bash_logout
    -rw-r--r--  1 studnt studnt  191 Aug 24 11:00 .bash_profile
    -rw-r--r--  1 studnt studnt  124 Aug 24 11:00 .bashrc
    drwxrwxr-x  3 studnt studnt 4096 Aug 24 16:59 .mc/
    -rw-r--r--  1 studnt studnt 3511 Aug 24 11:00 .screenrc

The fields of each line, numbered from left to right (fields 1-4 describe protection, fields 6-7 ownership):

    1  - the type of file (d - directory, "-" - usual file)
    2  - read, write, execute rights of the file owner (the owner has read and write but not execute permission for the ".bash_logout" file)
    3  - read, write, execute rights of the group (the group has only read permission for the ".bash_logout" file)
    4  - read, write, execute rights of other users (other users have only read permission for the ".bash_logout" file)
    5  - count of the file's hard links
    6  - the username of the file owner
    7  - the name of the group the file belongs to
    8  - file size in bytes
    9  - last modification date
    10 - file name (a leading "." is part of the name and marks a hidden file)

A file has certain other attributes, which vary from one operating system to another, but typically consist of these:

    Name: The symbolic file name is the only information kept in human-readable form.
    Identifier: This unique tag, usually a number, identifies the file within the file system; it is the non-human-readable name for the file.
    Type: This information is needed for those systems that support different types.
    Size: The current size of the file (in bytes, words, or blocks), and possibly the maximum allowed size, are included in this attribute.
    Protection: Access-control information determines who can do reading, writing, executing, and so on.
    Time, date, and user identification: This information may be kept for creation, last modification, and last use.
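To make the attribute list above concrete, here is a minimal sketch that reads the same attributes programmatically with Python's standard os.lstat and stat.filemode. The function name describe and the dictionary keys are illustrative choices, not part of the handout; the file name and size in the demo simply mirror the .bash_logout line of the listing.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Collect the attributes that `ls -l` prints (lstat does not follow symlinks)."""
    st = os.lstat(path)
    return {
        "name": os.path.basename(path),
        "identifier": st.st_ino,                  # unique tag within the file system (inode number on Unix)
        "protection": stat.filemode(st.st_mode),  # type character plus owner/group/other rights
        "links": st.st_nlink,                     # count of hard links
        "owner_uid": st.st_uid,                   # numeric user id of the owner
        "group_gid": st.st_gid,                   # numeric group id
        "size": st.st_size,                       # size in bytes
        "modified": time.strftime("%b %d %H:%M", time.localtime(st.st_mtime)),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, ".bash_logout")
        with open(path, "w") as f:
            f.write("x" * 24)                     # 24 bytes, like the listing above
        os.chmod(path, 0o644)                     # rw-r--r--
        for key, value in describe(path).items():
            print(f"{key:>10}: {value}")
```

The sketch reports numeric user and group ids; ls translates them to names such as studnt by consulting the system's user database.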
The time, date, and user identification data can be useful for protection, security, and usage monitoring.

The information about all files is kept in the directory structure, which also resides on secondary storage. Typically, the directory entry consists of the file's name and its unique identifier. The identifier in turn locates the other file attributes. It may take more than a kilobyte to record this information for each file. In a system with many files, the size of the directory itself may be megabytes. Because directories, like files, must be nonvolatile, they must be stored on the device and brought into memory as needed.

[Figure: Directory entry.]

11.1.2. File Operations

A file is an abstract data type. To define a file properly, we need to consider the operations that can be performed on files. The operating system can provide system calls to create, write, read, reposition, delete, and truncate files. Let us also consider what the operating system must do for each of the six basic file operations. It should then be easy to see how similar operations, such as renaming a file, would be implemented.

Creating a file: Two steps are necessary to create a file. First, space in the file system must be found for the file. Second, an entry for the new file must be made in the directory. The directory entry records the name of the file and the location in the file system, and possibly other information.

Writing a file: To write a file, we make a system call specifying both the name of the file and the information to be written to the file. Given the name of the file, the system searches the directory to find the location of the file. The system must keep a write pointer to the location in the file where the next write is to take place. The write pointer must be updated whenever a write occurs.

Reading a file: To read from a file, we use a system call that specifies the name of the file and where (in memory) the next block of the file should be put.
Again, the directory is searched for the associated directory entry, and the system needs to keep a read pointer to the location in the file where the next read is to take place. Once the read has taken place, the read pointer is updated. A given process is usually only reading or writing a given file, and the current operation location is kept as a per-process current-file-position pointer. Both the read and write operations use this same pointer, saving space and reducing the system complexity.

Repositioning within a file: The directory is searched for the appropriate entry, and the current-file-position is set to a given value. Repositioning within a file does not need to involve any actual I/O. This file operation is also known as a file seek.

Deleting a file: To delete a file, we search the directory for the named file. Having found the associated directory entry, we release all file space, so that it can be reused by other files, and erase the directory entry.

Truncating a file: The user may want to erase the contents of a file but keep its attributes. Rather than forcing the user to delete the file and then recreate it, this function allows all attributes to remain unchanged (except for file length) but lets the file be reset to length zero and its file space released.

These six basic operations certainly comprise the minimal set of required file operations. Other common operations include appending new information to the end of an existing file and renaming an existing file. These primitive operations may then be combined to perform other file operations.

Most of the file operations mentioned involve searching the directory for the entry associated with the named file. To avoid this constant searching, many systems require that an open system call be used before that file is first used actively. The operating system keeps a small table containing information about all open files (the open-file table).
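The six basic operations and the open call can be tried out with Python's thin wrappers around the operating system's calls (os.open, os.write, os.lseek, os.read, os.ftruncate, os.close, os.remove). This is only an illustrative sketch: the function name demo, the file name example.dat and the sample text are assumptions made for the example.

```python
import os
import tempfile


def demo(directory):
    path = os.path.join(directory, "example.dat")

    # Create and open: a directory entry is made and the call returns a file
    # descriptor, a small integer that indexes the process's open-file table.
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        # Write: advances the current-file-position pointer past the new bytes.
        os.write(fd, b"hello, file system")
        after_write = os.lseek(fd, 0, os.SEEK_CUR)  # query the pointer without moving it

        # Reposition (seek): only the pointer changes, no data is transferred.
        os.lseek(fd, 7, os.SEEK_SET)

        # Read: uses the same pointer as write and advances it.
        chunk = os.read(fd, 4)

        # Truncate: keep the file and its attributes, reset its length to zero.
        os.ftruncate(fd, 0)
        size_after_truncate = os.fstat(fd).st_size
    finally:
        # Close: the entry in the open-file table is released.
        os.close(fd)

    # Delete: erase the directory entry so the space can be reused.
    os.remove(path)
    return after_write, chunk, size_after_truncate, os.path.exists(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(demo(d))
```

After the single os.open, every later call names the file only by the descriptor fd, so the directory is not searched again until os.remove is given the path.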
When a file operation is requested, the file is specified via an index into the open-file table, so no searching is required. When the file is no longer actively used, it is closed by the process and the operating system removes its entry in the open-file table.

The operating system provides a uniform logical view of information storage not only for users but also for programs.

Basic File Operations

    Create      - Find space in the file system and create an entry in the directory structure.
    Delete      - Release the file space and delete the directory entry.
    Seek        - Reposition the location pointer in the file.
    Read        - Specifies the name of the file and the memory buffer to store the read data. The system keeps a location pointer in the file.
    Write       - Specifies the name of the file and the information to be written. The system also keeps a location pointer in the file.
    Append      - Writes information at the end of the file.
    Truncate    - Erases the file content (size = 0).
    Open, Close - To avoid searching the directory on each of the previous operations, and so speed up file operations overall, the operating system provides open and close operations that associate a handle with the file.

[Figure: a directory listing Directory1, Directory2, ..., File1 (attributes), File2, ...; read and write operations on File1 go through a handle and a buffer.]

11.1.3. File Types

A common technique for implementing file types is to include the type as part of the file name. The name is split into two parts, a name and an extension, usually separated by a period character. In this way, the user and the operating system can tell from the name alone what the type of a file is. For example, in MS-DOS, a name can consist of up to eight characters followed by a period and terminated by an extension of up to three characters. The system uses the extension to indicate the type of the file and the type of operations that can be done on that file.
For instance, only a file with a .com, .exe, or .bat extension can be executed. The .com and .exe files are two forms of binary executable files, whereas a .bat file is a batch file containing, in ASCII format, commands to the operating system.

In Windows the file name can be longer. The executable file extensions are the same as in MS-DOS. All the other extensions are handled by appropriate applications or by the OS itself.

The UNIX system does not tie file types to extensions in this way. Instead, it uses a crude magic number stored at the beginning of some files to indicate roughly the type of the file: executable program, batch file (or shell script), postscript file, and so on. Not all files have magic numbers, so system features cannot be based solely on this type of information. UNIX does not record the name of the creating program, either. UNIX does allow file-name-extension hints, but these extensions are not enforced or depended on by the operating system; they are mostly to aid users in determining the type of contents of the file. Extensions can be used or ignored by a given application, but that is up to the application's programmer.

Certain files must conform to a required structure that is understood by the operating system. For example, the operating system may require that an executable file have a specific structure so that it can determine where in memory to load the file and what the location of the first instruction is. Some operating systems extend this idea into a set of system-supported file structures, with sets of special operations for manipulating files with those structures.

File Types in MS-DOS, Windows

The three-letter file extension defines the file type.

    name.com - executable binary file of type com
    name.exe - executable binary file of type exe
    name.bat - executable batch or script file, handled by the command.com or cmd.exe interpreters placed in the c:/windows/system32 folder
    name.xxx - all other extensions are handled by the OS or by appropriate applications

File Types in Unix
    rc.firewall                  - there is no file extension in Unix
    ipnat.conf, startservice.sh  - "dots" are part of the file name, separating it for user convenience
    .bash.rc                     - a leading "dot" means a "hidden" file (not always shown by the usual commands)

So extensions do not define file types in Unix.

Linux lumps everything into four basic types of files: ordinary files, directories, links and special files.

    -rw-r--r--  1 studnt studnt  124 Jan 10 18:07 .bashrc
    drwxrwxr-x  3 studnt studnt 4096 Jan 10 18:44 .mc/
    drwx------  2 studnt studnt 4096 Feb  3 11:29 mail/
    lrwxrwxrwx  1 root   root     11 Jan 22 19:54 /etc/init.d -> rc.d/init.d/
    brw-rw----  1 root   disk   3, 1 Aug 31  2001 /dev/hda1
    crw--w----  1 studnt tty    4, 1 Oct 17 17:36 /dev/tty1

The type of file is the first symbol in the file attributes line:

    -   ordinary file
    d   directory
    l   symbolic link
    b   block-special file
    c   character-special file
    s   socket
    p   named pipe

Special Files

Character-special devices (byte oriented device files)

Every physical device associated with a Linux system, including disks, terminals, and printers, is represented in the file system. Most, if not all, devices are located in the /dev directory. For example, if you're working on the system console, your associated device is named /dev/console. If you're working on a standard terminal, your device name might be /dev/tty01. Terminals, or serial lines, are called tty devices (which stands for teletype, the original UNIX terminal). To determine what the name of your tty device is, type the command tty. The system responds with the name of the device to which you're connected.

    tty
    /dev/tty1

    ls -al /dev/tty1
    crw--w----  1 studnt tty    4, 1 Oct 17 17:36 /dev/tty1

We see that this device is "c", a character-special device.

Printers and terminals are called character-special devices (byte oriented device files). They can accept and produce a stream of characters.
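The type symbol that ls prints in the first column can also be derived from a file's mode bits. The sketch below uses the standard stat.S_IS* predicates on the result of os.lstat; the table TYPE_CHECKS and the function type_symbol are hypothetical names introduced for this example, and the FIFO and symbolic-link parts of the demo assume a Unix-like system.

```python
import os
import stat
import tempfile

# The same letters that ls prints in the first column of the attribute line.
TYPE_CHECKS = [
    ("d", stat.S_ISDIR),   # directory
    ("l", stat.S_ISLNK),   # symbolic link
    ("b", stat.S_ISBLK),   # block-special file
    ("c", stat.S_ISCHR),   # character-special file
    ("s", stat.S_ISSOCK),  # socket
    ("p", stat.S_ISFIFO),  # named pipe
    ("-", stat.S_ISREG),   # ordinary file
]


def type_symbol(path):
    """Return the one-character file type of `path`, as ls would show it."""
    mode = os.lstat(path).st_mode  # lstat reports a symlink itself, not its target
    for symbol, check in TYPE_CHECKS:
        if check(mode):
            return symbol
    return "?"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        plain = os.path.join(d, "plain")
        open(plain, "w").close()
        link = os.path.join(d, "link")
        os.symlink(plain, link)
        fifo = os.path.join(d, "fifo")
        os.mkfifo(fifo)                      # Unix only
        for p in (d, plain, link, fifo, "/dev/null"):
            if os.path.lexists(p):
                print(type_symbol(p), p)
```

Because the check reads the mode bits and never looks at the name, a file called startservice.sh and a file called startservice are classified identically, which is the point made above about extensions in Unix.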
Block-special devices (block oriented device files)

Disks, on the other hand, store data in blocks addressed by cylinder, head and sector. You can't access just one character on a disk; you must read and write entire blocks. The same is usually true of magnetic tapes. This kind of device is called a block-special device (block oriented device file).

    ls -al /dev/fd0
    brw-rw----  1 studnt floppy 2, 0 Aug 31  2001 fd0

    ls -al /dev/hda1
    brw-rw----  1 root   disk   3, 1 Aug 31  2001 hda1

One device-special file, the bit bucket or /dev/null, is very useful. Anything you send to /dev/null is ignored, which is useful when you don't want to see the output of a command. Let's look at the /dev/null device:

    ls -al /dev/null
    crw-rw-rw-  1 root   root   1, 3 Aug 31  2001 /dev/null

Named pipes (FIFO)

A FIFO (first-in-first-out buffer), also known as a named pipe, looks like an ordinary file. If you write to a FIFO, it grows. But if you read from a FIFO, it shrinks in size. FIFOs are used mainly in system processes to allow many programs to send information to a single controlling process.

Sockets

Sockets provide interconnection between two processes.

11.2. Access Methods

Figure 11.4 Sequential-access file.

    Sequential access: Reset; Read, Read, Read, ...; Write, Write, Write, ...
    Direct access:     Read N, Write N, where N is a number relative to the beginning of the file (from 0 to Max)

The simplest access method is sequential access. Information in the file is processed in order, one record after the other. This mode of access is by far the most common; for example, editors and compilers usually access files in this fashion.

Reads and writes make up the bulk of the operations on a file. A read operation reads the next portion of the file and automatically advances a file pointer, which tracks the I/O location.
Similarly, a write appends to the end of the file and advances to the end of the newly written material (the new end of file). Such a file can be reset to the beginning and, on some systems, a program may be able to skip forward or backward n records, for some integer n (perhaps only for n = 1). Sequential access is based on a tape model of a file, and works as well on sequential-access devices as it does on random-access ones.

Another method is direct access (or relative access). A file is made up of fixed-length logical records that allow programs to read and write records rapidly in no particular order. The direct-access method is based on a disk model of a file, since disks allow random access to any file block. For direct access, the file is viewed as a numbered sequence of blocks or records. A direct-access file allows arbitrary blocks to be read or written. Thus, we may read block 14, then read block 53, and then write block 7. There are no restrictions on the order of reading or writing for a direct-access file.

Direct-access files are of great use for immediate access to large amounts of information. Databases are often of this type.

For the direct-access method, the file operations must be modified to include the block number as a parameter. Thus, we have read n, where n is the block number, rather than read next, and write n rather than write next.

Figure 11.5 Simulation of sequential access on a direct-access file.

An alternative approach is to retain read next and write next, as with sequential access, and to add an operation position file to n, where n is the block number. Then, to effect a read n, we would position to n and then read next.

The block number provided by the user to the operating system is normally a relative block number. A relative block number is an index relative to the beginning of the file.
Thus, the first relative block of the file is 0, the next is 1, and so on, even though the actual absolute disk address of the block may be 14703 for the first block and 3192 for the second.

Access using indexes

Other access methods can be built on top of a direct-access method. These methods generally involve the construction of an index for the file. The index, like an index in the back of a book, contains pointers to the various blocks. To find a record in the file, we first search the index, and then use the pointer to access the file directly and to find the desired record.

Figure 11.6 Example of index and relative files.
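The three access methods of this section can be sketched in a few lines. The class below is an illustration under stated assumptions, not the handout's or the textbook's code: records have an assumed fixed length RECORD_SIZE, the names DirectFile, read_n, write_n, read_next, write_next, reset and build_index are hypothetical, and the first four bytes of each record are treated as its key. Sequential access is simulated on top of direct access by keeping a current position cp, and the index maps a key to a relative record number.

```python
import io

RECORD_SIZE = 16  # assumed fixed logical record length, in bytes


class DirectFile:
    """Fixed-length records stored in any seekable binary stream."""

    def __init__(self, stream):
        self.stream = stream
        self.cp = 0  # current position, used only by the sequential operations

    # Direct access: the relative record number n is a parameter.
    def read_n(self, n):
        self.stream.seek(n * RECORD_SIZE)
        return self.stream.read(RECORD_SIZE)

    def write_n(self, n, record):
        self.stream.seek(n * RECORD_SIZE)
        # Pad or cut the record so that every record has the same length.
        self.stream.write(record.ljust(RECORD_SIZE, b" ")[:RECORD_SIZE])

    # Sequential access simulated on top of direct access.
    def reset(self):
        self.cp = 0

    def read_next(self):
        record = self.read_n(self.cp)
        self.cp += 1
        return record

    def write_next(self, record):
        self.write_n(self.cp, record)
        self.cp += 1


def build_index(f, count, key_len=4):
    """Map the first key_len bytes of each record to its relative record number."""
    return {f.read_n(n)[:key_len]: n for n in range(count)}


if __name__ == "__main__":
    f = DirectFile(io.BytesIO())
    for rec in (b"0007 pen", b"0014 ink", b"0053 pad"):
        f.write_next(rec)                 # sequential writes
    index = build_index(f, 3)             # search the index first ...
    print(f.read_n(index[b"0014"]))       # ... then access the record directly
```

Record numbers here are relative to the start of the stream, as with relative block numbers; where the bytes actually sit on a disk is left to the layer underneath. An in-memory io.BytesIO stands in for a file so the sketch runs anywhere, and a file opened in "r+b" mode could be passed instead.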