File Structures & Data Processing
Unit-I
• Introduction: File structure design. File
processing operations: open, close, read,
write, seek. Unix directory structure.
Secondary storage devices: disks, tapes, CD-ROM. Buffer management. I/O in Unix.
A program to display the contents of a file on the screen:
1. Open the file for input (reading).
2. While there are characters to read from the input file:
   Read a character from the file.
   Write the character to the screen.
3. Close the file.
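// C program
#include <stdio.h>

int main(void) {
    char ch;
    FILE *infile = fopen("A.txt", "r");   /* open the file for reading */
    if (infile == NULL) {                 /* stop if the file cannot be opened */
        perror("A.txt");
        return 1;
    }
    /* fread returns the number of items read; 0 means end of file or error */
    while (fread(&ch, 1, 1, infile) != 0) {
        fwrite(&ch, 1, 1, stdout);        /* write the character to the screen */
    }
    fclose(infile);
    return 0;
}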
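// C++ program
#include <fstream>
#include <iostream>
using namespace std;

int main() {
    char ch;
    fstream infile;
    infile.open("A.txt", ios::in);   // open the file for reading
    infile.unsetf(ios::skipws);      // clear the skipws flag so >> does not skip white space
    infile >> ch;
    while (!infile.fail()) {         // fail() becomes true at end of file or if the open failed
        cout << ch;
        infile >> ch;
    }
    infile.close();
    return 0;
}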
Secondary Storage Devices
Since secondary storage behaves differently from main memory, we have to
understand how it works in order to design good file structures.
The two major types of storage devices are:
1. Direct access storage devices
 - Magnetic disks
   . Hard drives (high capacity, low cost per bit)
   . Floppy disks (low capacity, slow, cheap)
 - Optical disks
   . CD-ROM: Compact Disc, Read-Only Memory (read only/write
     once, holds a lot of data, cheap to produce)
2. Serial devices
 - Magnetic tapes (very fast sequential access)
The Organization of Disks
Cylinder: the set of tracks that are directly above
or below each other.
• Each platter (disc-shaped) is coated with magnetic material on both
surfaces.
• Each platter surface has an arm that extends from a fixed position.
• The tip of the arm carries the read/write head that reads or writes data.
• The arm moves the head between the spindle edge and the outer edge of the disk.
Internal Structure
• Track width is 1-2 microns (micrometers).
• A sector contains a fixed number of bytes,
e.g. 512 bytes or 4096 bytes.
• Each sector is divided into a header (stores the sector number), data, and ECC (Error Correction Code).
• The width of 1 bit in a sector is 0.1 to 0.2 microns.
• Cylinder: the set of tracks on all the surfaces at
a fixed arm position.
[Figure: internal structure of a disk]
• How data is read/written:
• Each block is identified by a block address, which
contains:
  • Cylinder number (i.e. track number)
  • Surface number
  • Sector number
• Based on the block address, the disk controller
(a digital circuit) moves the arm to the designated
track.
• The platter rotates until the desired sector is
under the head.
• Once the head is aligned with the sector's header section,
the read or write operation is performed.
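The three-part block address above can be flattened into a single block number and back. The following is only a sketch: the struct and function names are hypothetical, and it assumes that cylinders, surfaces and sectors are all counted from 0 and that every track holds the same number of sectors.

```c
#include <assert.h>

/* Hypothetical disk geometry; field names are illustrative only. */
struct geometry {
    unsigned surfaces;           /* recording surfaces (heads) per cylinder */
    unsigned sectors_per_track;  /* sectors on every track (assumed equal)  */
};

/* A block address as described above; every field counts from 0 (assumption). */
struct block_address {
    unsigned cylinder;
    unsigned surface;
    unsigned sector;
};

/* Map (cylinder, surface, sector) to one linear block number. */
unsigned long to_linear(struct geometry g, struct block_address a)
{
    assert(a.surface < g.surfaces && a.sector < g.sectors_per_track);
    return ((unsigned long)a.cylinder * g.surfaces + a.surface)
               * g.sectors_per_track + a.sector;
}

/* Inverse mapping: recover the three fields from a linear block number. */
struct block_address from_linear(struct geometry g, unsigned long n)
{
    struct block_address a;
    a.sector   = (unsigned)(n % g.sectors_per_track);
    a.surface  = (unsigned)((n / g.sectors_per_track) % g.surfaces);
    a.cylinder = (unsigned)(n / ((unsigned long)g.sectors_per_track * g.surfaces));
    return a;
}
```

With a hypothetical geometry of 4 surfaces and 63 sectors per track, the address (cylinder 2, surface 1, sector 5) maps to block (2 x 4 + 1) x 63 + 5 = 572, and from_linear turns 572 back into the same three fields.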
• Read/write mechanism:
• Disks record data by magnetizing a magnetic material
in a pattern that represents the data.
 - The magnetic material is a ferromagnetic substance such as
iron oxide.
• Write: the head contains an induction coil through which
a current passes.
 - This magnetizes the iron oxide.
 - Depending on the direction of the current, the magnetic
particles align in either the left or the right direction.
• Read: the head passes over the bit region to detect the
magnetization of the material.
 - A current is induced in the coil; its measured value
determines whether the bit is 0 or 1.
Since a cylinder consists of a group of
tracks, a track consists of a group of sectors,
and a sector consists of a group of bytes, it
is easy to compute track, cylinder, and
drive capacities.
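The containment chain above (bytes in sectors, sectors in tracks, tracks in cylinders) turns into three multiplications. The sketch below uses hypothetical function names, and the figures in the usage note are made-up illustration values, not the specification of any real drive.

```c
/* Byte counts can exceed 32 bits, so use a wide unsigned type. */
typedef unsigned long long bytes_t;

/* Track capacity = sectors per track x bytes per sector. */
bytes_t track_capacity(bytes_t sectors_per_track, bytes_t bytes_per_sector)
{
    return sectors_per_track * bytes_per_sector;
}

/* Cylinder capacity = tracks per cylinder x track capacity. */
bytes_t cylinder_capacity(bytes_t tracks_per_cylinder, bytes_t track_bytes)
{
    return tracks_per_cylinder * track_bytes;
}

/* Drive capacity = number of cylinders x cylinder capacity. */
bytes_t drive_capacity(bytes_t cylinders, bytes_t cylinder_bytes)
{
    return cylinders * cylinder_bytes;
}
```

For an assumed drive with 512 bytes per sector, 63 sectors per track, 16 tracks per cylinder and 4092 cylinders, these give 32,256 bytes per track, 516,096 bytes per cylinder and 2,111,864,832 bytes for the whole drive.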
Organizing Tracks by Sector
Cluster, Extent and Fragmentation
Fragmentation
Organizing Tracks by Block
Nondata Overhead
The Cost of a Disk Access:
• A disk access can be divided into three distinct physical
operations, each with its own cost :
1. Seek Time
2. Rotational Delay
3. Transfer Time
Seek Time:
• The time required to move the access arm to the correct
cylinder.
• The average seek time of a hard disk is less than 10 msec, and
high-performance disks have average seek times as low as
7.5 msec.
Rotational Delay:
• The time it takes for the disk to rotate so that the sector
we want is under the read/write head.
• On average, the rotational delay is half a revolution.
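Transfer Time:
• The transfer time is given by the formula:
  Transfer time = (number of bytes transferred / number of bytes on a track) x rotation time
• The transfer time for one sector depends on the number of sectors on a
track.
• If there are 63 sectors on a track, the time to transfer
one sector is 1/63 of a revolution.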
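The three costs can be added up in a few lines. This is a rough sketch rather than a model of any particular drive: the function names are hypothetical, the rotational delay is taken as half a revolution, and the transfer time is assumed to be the fraction of a track being transferred multiplied by the time for one revolution.

```c
/* Time for one full revolution, in milliseconds. */
double rotation_time_ms(double rpm)
{
    return 60000.0 / rpm;
}

/* Average rotational delay: half a revolution. */
double avg_rotational_delay_ms(double rpm)
{
    return rotation_time_ms(rpm) / 2.0;
}

/* Transfer time: (bytes transferred / bytes on a track) x rotation time. */
double transfer_time_ms(double bytes, double bytes_per_track, double rpm)
{
    return bytes / bytes_per_track * rotation_time_ms(rpm);
}

/* Estimated cost of one access: seek + average rotational delay + transfer. */
double access_time_ms(double seek_ms, double bytes,
                      double bytes_per_track, double rpm)
{
    return seek_ms + avg_rotational_delay_ms(rpm)
                   + transfer_time_ms(bytes, bytes_per_track, rpm);
}
```

Taking an assumed 6000 rpm disk with 63 sectors of 512 bytes per track and an 8 msec seek, one revolution takes 10 msec, the average rotational delay is 5 msec, and one sector transfers in 10/63 msec (about 0.16 msec). The mechanical delays clearly dominate the cost of a small read.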
Disk as a Bottleneck
4. RAM Disk: A RAM disk is a large block of primary memory (RAM) that is
made to behave like a mechanical disk. It can be used to achieve faster
access speeds. Unlike a disk, a RAM disk is semiconductor memory, so it
needs no rotation, and its seek time is negligible compared with that of a
disk. With almost no seek time and no rotational delay, the required data
can be located much faster. The only disadvantage of a RAM disk is that it
is volatile.
5. Disk Cache: A disk cache works like ordinary cache memory. It sits
alongside the secondary storage device (i.e. the magnetic disk) and holds
frequently used pages of data from the disk. Whenever there is a request
for particular data, the system first looks in the disk cache. If the data
is not found there, it accesses the disk. The size of the disk cache is
negligible compared with that of the disk. Both the RAM disk and the disk
cache are examples of buffers.
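The look-in-the-cache-first behaviour can be sketched as a tiny direct-mapped page cache. Everything here is an illustrative assumption: the sizes, the struct and function names, and the fake_disk_read stand-in, which simply fills a page with a letter derived from the page number in place of a real device read.

```c
#include <string.h>

#define PAGE_SIZE   16  /* hypothetical page size, tiny for illustration */
#define CACHE_SLOTS 4   /* hypothetical number of pages the cache holds  */

struct cache_slot {
    int           valid;            /* does this slot hold a page?  */
    unsigned long page_no;          /* which disk page it holds     */
    char          data[PAGE_SIZE];  /* copy of the page contents    */
};

struct disk_cache {
    struct cache_slot slots[CACHE_SLOTS];
    unsigned long hits, misses;
    /* Stand-in for the slow device access, supplied by the caller. */
    void (*read_from_disk)(unsigned long page_no, char *out);
};

/* Fake "disk": fills the page with a letter derived from the page number. */
void fake_disk_read(unsigned long page_no, char *out)
{
    memset(out, 'A' + (int)(page_no % 26), PAGE_SIZE);
}

void cache_init(struct disk_cache *c,
                void (*reader)(unsigned long, char *))
{
    memset(c, 0, sizeof *c);        /* mark every slot as empty */
    c->read_from_disk = reader;
}

/* Return the page, going to the disk only when the cache does not hold it. */
const char *cache_read(struct disk_cache *c, unsigned long page_no)
{
    struct cache_slot *s = &c->slots[page_no % CACHE_SLOTS];
    if (s->valid && s->page_no == page_no) {
        c->hits++;                            /* found in the cache */
    } else {
        c->misses++;                          /* not found: access the disk */
        c->read_from_disk(page_no, s->data);  /* overwrite the slot's old page */
        s->page_no = page_no;
        s->valid = 1;
    }
    return s->data;
}
```

Reading page 5 twice costs one disk access and one cache hit. Reading page 9 afterwards evicts page 5, because both map to slot 1 in this direct-mapped layout. A real cache would normally use a smarter replacement policy.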
Characteristics of Magnetic Tapes
Organization of Data on Nine-Track Tapes
• The surface of a tape is a set of parallel tracks, each of which is a
sequence of bits.
• Since tapes are accessed sequentially, there is no need for
addresses to identify the locations of data on a tape.
[Figure: nine-track tape layout, labelled Track, Frame, Gap, Data Block, Gap]
Disk Versus Tape
Physical organization of CD-ROM
The Compact Disc is a spin-off of Laserdisc technology.
Diagram of CD layers:
A. A polycarbonate disc layer has
the data encoded by using bumps.
B. A shiny layer reflects the laser.
C. A layer of lacquer protects the
shiny layer.
D. Artwork is screen printed on
the top of the disc.
E. A laser beam reads the CD and
is reflected back to a sensor, which
converts it into electronic data.
CLV vs. CAV:
• The space on a computer disc is arranged into
individually addressable areas called sectors.
• There are two basic methods for arranging these
sectors on a disc:
1) placing them in concentric rings (called tracks), with an equal
angle per sector;
2) placing them along an Archimedean spiral,
with the physical length of each sector along the disc kept
constant instead of the angle.
• All sectors have identical capacity
regardless of their physical size (area or length),
although the density of the sector (the size of the
individual bits) can vary.
• Discs arranged into discrete tracks (including
floppy discs and hard drives) are constant
angular velocity (CAV) discs: the disc spins at a
fixed rate.
• This means that sectors at the outside of the disc
pass under the head much faster than those at
the centre, and thus the data is more spread out.
• This wastes physical space on the disc.
• CD-ROMs have a single, spiral track and are
constant linear velocity (CLV) discs.
• CD-ROM drives change the speed at which the
disc spins so that the length of track
passing under the laser unit per unit time is constant;
sectors at the outside and inside of the disc's
surface are the same size (same length).
• This results in increased capacity at the
expense of a more complex format. (Vinyl
records also have a spiral track but are
nevertheless CAV.)
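The difference between the two schemes comes down to which quantity is held constant. The sketch below uses hypothetical function names. It computes the spindle speed a CLV drive needs at a given radius, and the linear speed a CAV disc produces at a given radius.

```c
#define PI 3.14159265358979323846

/* CLV: spindle speed (revolutions per minute) needed so that the track at
   radius_m metres passes the head at linear_speed_m_s metres per second. */
double clv_rpm(double linear_speed_m_s, double radius_m)
{
    return linear_speed_m_s / (2.0 * PI * radius_m) * 60.0;
}

/* CAV: the spindle speed is fixed, so the linear speed under the head
   grows in proportion to the radius. */
double cav_linear_speed(double rpm, double radius_m)
{
    return rpm / 60.0 * 2.0 * PI * radius_m;
}
```

Assuming a linear velocity of roughly 1.3 m/s and a data area running from about 25 mm to about 58 mm from the centre (approximate figures for an audio-speed CD, used here only for illustration), clv_rpm gives about 497 rpm at the inner radius and about 214 rpm at the outer one. This is why a CLV drive must keep changing its spindle speed as the head moves.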
CD-ROM Strengths and Weaknesses :
1) Seek Performance
2) Data Transfer Rate
3) Storage Capacity
4) Read-Only Access
5) Asymmetric Writing and Reading