OPEN SOURCE SOFTWARE
UNIT – I
INTRODUCTION
INTRODUCTION TO OPEN SOURCE:
1. Open source is a development method for software that harnesses the power of
distributed peer review and transparency of process.
2. The promise of open source is better quality, higher reliability, more flexibility, lower
cost, and an end to predatory vendor lock-in.
3. One of the Open Source Initiative's most important activities is acting as a standards body,
maintaining the Open Source Definition for the good of the community.
4. The Open Source Initiative Approved License trademark and program creates a nexus
of trust around which developers, users, corporations and governments can organize open-source
cooperation.
Definition:
OSS is software for which the source code is freely and publicly available, though the
specific licensing agreements vary as to what one is allowed to do with that code.
1. Open source software (OSS) is defined as computer software for which the source
code and certain other rights normally reserved for copyright holders are provided under a
software license that meets the Open Source Definition or that is in the public domain.
2. This permits users to use, change, and improve the software, and to redistribute it in
modified or unmodified forms. It is very often developed in a public, collaborative manner.
PROPERTIES OF OPEN SOURCE
1. Free Redistribution
2. Source Code
3. Derived Works
4. Integrity of the Author’s Source Code
5. No Discrimination against Persons or Groups
6. No Discrimination against Fields of Endeavor
7. Distribution of License
8. License Must Not Be Specific to a Product
9. The License Must Not Restrict Other Software
10. License Must Be Technology-Neutral
1. Free Redistribution
The license shall not restrict any party from selling or giving away the software as a
component of an aggregate software distribution containing programs from several different
sources. The license shall not require a royalty or other fee for such sale.
2. Source Code
The program must include source code, and must allow distribution in source code as
well as compiled form. Where some form of a product is not distributed with source code, there
must be a well-publicized means of obtaining the source code for no more than a reasonable
reproduction cost, preferably downloading via the Internet without charge. The source code must
be the preferred form in which a programmer would modify the program. Deliberately obfuscated
source code is not allowed. Intermediate forms such as the output of a preprocessor or translator
are not allowed.
3. Derived Works
The license must allow modifications and derived works, and must allow them to be
distributed under the same terms as the license of the original software.
4. Integrity of the Author’s Source Code
The license may restrict source-code from being distributed in modified form only if
the license allows the distribution of “patch files” with the source code for the purpose of
modifying the program at build time. The license must explicitly permit distribution of software
built from modified source code. The license may require derived works to carry a different name
or version number from the original software.
5. No Discrimination against Persons or Groups
The license must not discriminate against any person or group of persons.
6. No Discrimination against Fields of Endeavor
The license must not restrict anyone from making use of the program in a specific field
of endeavor. For example, it may not restrict the program from being used in a business, or from
being used for genetic research.
7. Distribution of License
The rights attached to the program must apply to all to whom the program is
redistributed without the need for execution of an additional license by those parties.
8. License Must Not Be Specific to a Product
The rights attached to the program must not depend on the program’s being part of a
particular software distribution. If the program is extracted from that distribution and used or
distributed within the terms of the program’s license, all parties to whom the program is
redistributed should have the same rights as those that are granted in conjunction with the original
software distribution.
9. The License Must Not Restrict Other Software
The license must not place restrictions on other software that is distributed along with
the licensed software. For example, the license must not insist that all other programs distributed
on the same medium must be open-source software.
10. License Must Be Technology-Neutral
No provision of the license may be predicated on any individual technology or style of
interface.
NEED OF OPEN SOURCE
1. Reduce dependency on closed source vendors.
Stop being dragged through constant product upgrades that you are forced to do to stay
on a supported version of the product. If you are using a licensed closed-source product, you
have to wait for the vendor to deliver the next updated version before you can get a complete,
supported version of the software.
2. Your annual budget does not keep up with increases in software maintenance costs and
increased costs of employee health care.
Your budget remains flat, you bought five new tools last year with new annual costs in
the range of 18-20% of the original purchase price for "gold support", and your employees'
health care costs shot up 25% again. What gives?
3. More access to tools.
You can get your hands on a variety of development and testing tools, project and portfolio
management tools, network monitoring, security, content management, etc. without having to
ask the boss for a few hundred thousand greenbacks.
4. Try before you buy.
Are you getting ready to invest in SOA, BPM, or ECM? Why not do a prototype without
spending huge sums of money? First of all, it allows you to get familiar with the tools so you
can be educated when you go through the vendor evaluation process. Second, you might
find that the tool can do the job and you don't need to lock yourself into another vendor.
5. Great support and a 24/7 online community that responds quickly.
Despite the myths that you can't get support for open source software, the leading
communities provide support far superior to most closed source vendors. Most communities have
a great knowledgebase or wiki for self service support. You can also post a question and one of
the hundreds of community members throughout the world will most likely respond in minutes.
Make sure you choose software with strong community backing.
6. Access to source code and the ability to customize if you desire.
You can see the code, change the code, and even submit your enhancements and/or
fixes back to the community to be peer reviewed and possibly added to the next build. No longer
do you need to wait for a vendor roadmap that doesn't have the feature you need until their
Excalibur release in the Fall of 2009.
7. Great negotiating power when dealing with closed source vendors.
Tired of vendors pushing you around because you don't have options? I wonder if
companies like Microsoft would be more willing to be flexible with their pricing if you have 20
desktops running Ubuntu as an alternative desktop pilot initiative.
8. Feature set is not bloated and is driven by collaboration amongst the community.
Tired of products that consume huge amounts of memory and CPU power for the 2000
eye candy features that you will never use? With open source software, most features are driven
by community demand. Closed vendors have to create one more feature than their competitors to
get the edge in the marketplace.
9. More secure than most closed source products.
This topic is highly debated, but studies such as one from Trend Micro suggest that open
source software is typically more secure.
10. Bug fixes are implemented faster than by closed source vendors.
Actually, many bugs are fixed by the community before they are even reported by the
users.
ADVANTAGES OF OPEN SOURCE SOFTWARE
Software experts and researchers on open source software have identified several
advantages and disadvantages. The main advantage for business is that open source is a good way
for business to achieve greater penetration of the market. Companies that offer open source
software are able to establish an industry standard and, thus, gain competitive advantage.
It has also helped build developer loyalty as developers feel empowered and have a sense of
ownership of the end product. Moreover, lower marketing and logistics costs are needed for
OSS. It also helps companies to keep abreast of all technology developments. It is a good tool to
promote a company's image, including its commercial products. The OSS development approach has
helped produce reliable, high quality software quickly and inexpensively. Besides, it offers the potential
for a more flexible technology and quicker innovation. It is said to be more reliable since it typically
has thousands of independent programmers testing and fixing bugs of the software. It is flexible
because modular systems allow programmers to build custom interfaces, or add new abilities to it and it
is innovative since open source programs are the product of collaboration among a large number of
different programmers. The mix of divergent perspectives, corporate objectives, and personal goals
speeds up innovation. Moreover free software can be developed in accord with purely technical
requirements. It does not require thinking about commercial pressure that often degrades the quality of
the software. Commercial pressures make traditional software developers pay more attention to
customers' requirements than to security requirements, since such features are somewhat invisible to
the customer.
•The availability of the source code and the right to modify it is very important.
It enables the unlimited tuning and improvement of a software product. It also makes it
possible to port the code to new hardware, to adapt it to changing conditions, and to reach a
detailed understanding of how the system works. This is why many experts are reaching the
conclusion that to really extend the lifetime of an application, it must be available in source form.
In fact, no binary-only application more than 10 years old now survives in unmodified form,
while several open source software systems from the 1980s are still in widespread use (although
in many cases conveniently adapted to new environments). Source code availability also makes it
much easier to isolate bugs, and (for a programmer) to fix them.
• The right to redistribute modifications and improvements to the code, and to reuse other open
source code, permits all the advantages due to the modifiability of the software to be shared by
large communities. This is usually the point that differentiates open source software licenses from
``nearly free'' ones. In substance, the fact that redistribution rights cannot be revoked, and that
they are universal, is what attracts a substantial crowd of developers to work around open source
software projects.
• The right to use the software in any way. This, combined with redistribution rights, ensures (if
the software is useful enough), a large population of users, which helps in turn to build up a
market for support and customization of the software, which can only attract more and more
developers to work in the project. This in turn helps to improve the quality of the product, and to
improve its functionality. This, once more, will cause more and more users to give the product a
try, and probably to use it regularly.
• Source code availability
• Free of cost
• Reduced dependency on software vendors
• Easier to customize - allows users to install the product according to their needs; they do not
need to install the complete software collection.
• Highly secure - regarded as a relatively virus-free platform, reducing the need for antivirus software.
APPLICATION OF OPEN SOURCE SYSTEM
The major areas where open source software is used heavily:
Internet Applications : Website creation using PHP, search engine design, and content management.
Utilities : Open source language scripts aid in creating utility programs (disk and file management, driver routines).
Tool creation : Aids in the development of customized tools, software packages, and games.
LINUX
LINUX OVERVIEW
Linux is “open source” software, meaning, simply, that anyone can get copies of its
source code files.
1. Richard M. Stallman started the GNU project in 1984.
2. By the end of 1991 he had completed all the core components of the operating system
except the kernel.
3. During that period Linus Torvalds, a computer science student from Finland,
implemented the first version of the Linux kernel.
4. As soon as the kernel was complete, many people combined the GNU project
components with the kernel created by Torvalds, adding many utilities to complete GNU/Linux, a
real operating system.
5. The Linux kernel and the GNU applications used on top of it are covered by GPL.
6. Linux is a full-featured UNIX® implementation.
7. The main design criterion of the Linux kernel is throughput, while real-time behaviour and
predictability are not primary concerns. The main handicap to considering Linux a real-time
system is that the kernel is not preemptable; that is, while the processor executes kernel code,
no other process or event can preempt kernel execution.
8. The Linux kernel is useless in isolation; it participates as one part in a larger system
that, as a whole, is useful.
9. As such, it makes sense to discuss the kernel in the context of the entire system.
1. User Applications - the set of applications in use on a particular Linux system will be
different depending on what the computer system is used for, but typical examples include a
word-processing application and a web browser.
2. O/S Services - these are services that are typically considered part of the
operating system (a windowing system, command shell, etc.); also, the programming interface
to the kernel (compiler tools and libraries) is included in this subsystem.
3. Linux Kernel - this is the main area of interest here; the kernel abstracts and
mediates access to the hardware resources, including the CPU.
4. Hardware Controllers - this subsystem comprises all the possible physical
devices in a Linux installation; for example, the CPU, memory hardware, hard disks, and
network hardware are all members of this subsystem.
FEATURES OF LINUX
• Robust and Stable.
• Safe and Secure.
• The basic underlying architecture of Linux was designed with security in mind.
• No viruses.
• An “Open” System.
• Free of cost.
Benefits of Linux OS Compare to other OS
•Free of cost and easily downloadable from the distributor's site on the internet.
•Availability of the source code allows users to make changes and integrate their own
code to create the features they require.
•Flexible, i.e. using the same copy of the Linux OS we can set up a workstation, a
network server, or a standalone PC.
•Supports customized installation options.
•Aids the user in testing component functionality during OS installation itself.
Linux Distributions:
The major distributions of the Linux operating system include Debian, Red Hat, SUSE,
Mandrake, Turbo, Corel, Ubuntu, etc.
An Overview of the Linux File system
The Linux operating system design is centered on its file system, which has several
interesting characteristics.
Files
A Linux file is an information container structured as a sequence of bytes; the kernel
does not interpret the contents of a file. Many programming libraries implement higher-level
abstractions, such as records structured into fields and record addressing based on keys. However,
the programs in these libraries must rely on system calls offered by the kernel.
From the user's point of view, files are organized in a tree-structured name space, rooted at the
/ directory (see the Directory System Details diagram).
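For instance, listing the root of the tree on a typical system (the exact directory names vary slightly between distributions) shows the top-level branches:
$ ls /
bin  boot  dev  etc  home  lib  mnt  proc  root  sbin  tmp  usr  var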
SHELL
The program which allows the user to enter and manage commands is referred to as
the shell. There are different types of shells available in Linux, such as BASH, TCSH, and KSH
(the Korn shell).
BASH Shell
BASH is an acronym for Bourne Again Shell, acknowledging the roots of bash coming
from the Bourne shell (sh command) created by Steve Bourne at AT&T Bell Labs. Bash includes
features of the shells originally developed for early UNIX systems, as well as some other
features. Expect bash to be the default shell in whatever Linux system you are using, with the
exception of some specialized Linux systems (such as those run on embedded devices or run
from a floppy disk) that may require a smaller shell that needs less memory and offers fewer
features. Almost all Linux distributions use bash as the default shell, with the exception of some
bootable Linux distributions, which may use a smaller shell instead.
Tcsh Shells
The tcsh shell is the open source version of the C shell (csh). The csh shell was created
by Bill Joy and used with most Berkeley UNIX systems (such as those produced by Sun
Microsystems) as the default shell. Many features of the original csh shell, such as command-line
editing and its history mechanism, are included in tcsh as well as in other shells. While you
can run both csh and tcsh on most Linux systems, both commands actually point to the same
executable file. In other words, starting csh actually runs the tcsh shell in csh compatibility mode.
Ksh (K Shell)
The ksh shell was created by David Korn at AT&T Bell Labs and is a successor of the
sh (Bourne) shell. It became the default and most commonly used shell on UNIX System V systems.
The open source version of ksh was originally available in many rpm-based systems (such as
Fedora and Red Hat Enterprise Linux).
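As a quick check (the exact list of installed shells depends on the distribution), you can see which shell you are using, list the shells available, and change your login shell with chsh:
$ echo $SHELL        # shows your current login shell, e.g. /bin/bash
$ cat /etc/shells    # lists the shells installed on the system
$ chsh -s /bin/tcsh  # changes your login shell to tcsh (prompts for your password)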
SHELL COMMANDS
cal
Display a calendar
Syntax: cal [-mjy] [[month] year]
Options:
-m Display Monday as the first day of the week.
-j Display Julian dates (days one-based, numbered from January 1).
-y Display a calendar for the current year.
cat
Display the contents of a file (concatenate)
Syntax: cat [Options] [File]...
cmp
Compare two files, and if they differ, tells the first byte and line number
where they differ. You can use the `cmp' command to show the offsets
and line numbers where two files differ. `cmp' can also show all the
characters that differ between the two files, side by side.
Syntax: cmp options... FromFile [ToFile]
comm
Common - compare two sorted files line by line and write to standard output:
the lines that are common, plus the lines that are unique.
Syntax: comm [options]... File1 File2
continue
Resume the next iteration of an enclosing for, while, until, or select loop.
Syntax: continue [n]
If n is supplied, the execution of the nth enclosing loop is resumed. n
must be greater than or equal to 1. The return status is zero unless n
is not greater than or equal to 1.
cp
Copy one or more files to another location. Copy SOURCE to DEST, or multiple
SOURCE(s) to DIRECTORY.
Syntax : cp [options]... Source Dest
cut
Divide a file into several parts (columns). Writes to standard output selected parts of
each line of each input file, or standard input if no files are given or for a file
name of `-'.
Syntax: cut [OPTION]... [FILE]...
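For example (assuming the standard colon-separated format of /etc/passwd), cut can extract selected fields from each line:
$ cut -d: -f1 /etc/passwd      # prints only the login names
$ cut -d: -f1,7 /etc/passwd    # prints login names and their default shells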
declare
Declare variables and give them attributes.
Syntax: declare [-afFrxi] [-p] [name[=value]]
diff
Display the differences between two files, or each corresponding file in two
directories. Each set of differences is called a "diff" or "patch". For files that are
identical, diff normally produces no output; for binary (non-text) files, diff
normally reports only that they are different.
Syntax: diff [options] from-file to-file
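A common use (the file names here are only illustrative) is to produce a unified patch that records the changes between an old and a new version of a file:
$ diff -u oldprog.c newprog.c > changes.patch   # -u gives unified format with context lines
$ diff -q dir1 dir2                             # only report which files differ between two directories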
diff3
Show differences among three files. When two people have made independent
changes to a common original, `diff3' can report the differences between the original
and the two changed versions, and can produce a merged file that contains both persons'
changes together with warnings about conflicts. The files to compare are MINE, OLDER,
and YOURS. At most one of these three file names may be `-', which tells `diff3' to read
the standard input for that file.
Syntax: diff3 [options] mine older yours
echo
Display a message on screen; writes each given STRING to standard output,
with a space between each and a newline after the last one.
Syntax: echo [options]... [string]...
exec
Execute a command
Syntax: exec [-cl] [-a name] [command [arguments]]
exit
Exit from a program, shell or log out of a Unix network.
Syntax: exit
expr
Evaluate expressions, evaluates an expression and writes the result on
standard output.
Syntax: expr expression...
fgrep
Search file(s) for lines that match a fixed string
Syntax: fgrep <options> "Search String" [filename]
find
Search a folder hierarchy for filename(s) that meet desired criteria: Name, Size,
and File type.
Syntax: find [-H] [-L] [-P] [path...] [expression]
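For example (the paths and patterns below are just illustrations), find can locate files by name, size, or modification time:
$ find /home -name "*.txt"        # all .txt files under /home
$ find . -size +1000k             # files in the current tree larger than about 1 MB
$ find /tmp -mtime +7 -type f     # regular files in /tmp older than 7 days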
grep
Search file(s) for specific text.
Syntax: grep <options> "Search String" [filename]
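For example (the file names are illustrative), grep prints the lines of a file that contain the search string:
$ grep "main" prog.c              # lines containing "main" in prog.c
$ grep -i "error" report.txt      # case-insensitive search
$ grep -n "TODO" *.c              # show matching line numbers across all .c files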
gzip
Compress or decompress named file(s)
Syntax: gzip options
Init
Init is the parent of all processes. Its primary role is to create processes from a
script stored in the file /etc/inittab.
kill
Stop a process from running, either via a signal or forced termination.
logout
Exit a login shell.
Syntax: logout [n]
Returns a status of n to the shell's parent.
ls
ls (list) will simply list the names of the directories and files in the current
directory. ls -l will give you a long listing which includes the permissions,
ownership, size, date/time, and name of the files and directories.
ls -a will list ALL the files in the current directory, including hidden files.
man
man formats and displays the on-line manual pages.
Syntax: man command name
mount
To mount a file system. All files accessible in a Unix system are arranged in one
big tree, the file hierarchy, rooted at /. These files can be spread out over several
devices. The mount command serves to attach the file system found on some device to
the big file tree.
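For example (device and mount-point names differ from system to system), a CD-ROM can be attached to and detached from the tree as follows:
$ mount /dev/cdrom /mnt/cdrom     # attach the CD-ROM file system at /mnt/cdrom
$ mount                           # with no arguments, list the currently mounted file systems
$ umount /mnt/cdrom               # detach it again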
mv
Move or rename files or directories.
Syntax: mv [options]... Source Dest
passwd
Modify a user password.
Syntax: passwd [options...]
pwd [-LP]
Print the absolute pathname of the current working directory.
Pipe
A pipe is a sequence of one or more commands separated by the | character. It passes
the output of the previous command to the input of the next one, or to the shell.
Example: echo ls -l | sh
Passes the output of "echo ls -l" to the shell, with the same result as a simple "ls -l".
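A more typical use of a pipe is to feed the output of one command directly into a filter; for example:
$ ls -l | wc -l        # count the entries in the current directory
$ ps -e | grep bash    # list only the processes whose name contains "bash"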
reboot
Reboot the system
return
Cause a shell function to exit with the return value n.
Syntax: return [n]
rm
Remove files (delete/unlink)
Syntax: rm [options]... file...
rmdir
Remove directory, this command will only work if the folders are empty.
Syntax: rmdir [options]... folder(s)...
shutdown
Shutdown or restart linux
Syntax: shutdown [options] when [message]
Split
Split a file into fixed-size pieces, creates output files containing consecutive
sections of INPUT (standard input if none is given or INPUT is `-')
test expr
Return a status of 0 or 1 depending on the evaluation of the conditional
expression expr.
umount
Unmount a device.
who
Print who is currently logged in.
Syntax: who [options] [file] [am i]
whoami
Print the effective user name of the current user.
Syntax: whoami [options]
while list; do list; done
The while command continuously executes the do list as long as the last
command in list returns an exit status of zero.
passwd
Passwd is used to update a user's authentication token or the user password.
KERNEL
The kernel is generally referred to as the heart or the base of the Linux OS. It
manages the resources of the Linux OS. Resources refer to the facilities available in Linux, e.g.
the facility to store data, print data on a printer, memory, file management, etc. The kernel decides who
will use a resource, for how long, and when. It runs your programs (or sets up the execution of binary
files). The kernel acts as an intermediary between the computer hardware and the various
programs/applications/shells. The kernel is the software that starts up when you boot your
computer and interfaces with the programs you use so they can communicate effectively and
simply with your computer hardware. The Linux kernel contains device drivers, memory
management, process management and communication management. Kernels provide an
execution environment in which applications may run. Therefore, the kernel must implement a
set of services and corresponding interfaces. Applications use those interfaces and do not usually
interact directly with hardware resources.
KERNEL MODE AND USER MODE.
CPU can run either in User Mode or in Kernel Mode. Actually, some CPUs can have more than
two execution states. For instance, the Intel 80x86 microprocessors have four different execution
states. But all standard Linux kernels make use of only Kernel Mode and User Mode. When a
program is executed in User Mode, it cannot directly access the kernel data structures or the
kernel programs. When an application executes in Kernel Mode, however, these restrictions no
longer apply. Each CPU model provides special instructions to switch from User Mode to Kernel
Mode and vice versa. A program executes most of the time in User Mode and switches to Kernel
Mode only when requesting a service provided by the kernel. When the kernel has satisfied the
program's
request, it puts the program back in User Mode. Processes are dynamic entities that usually have
a limited life span within the system. The task of creating, eliminating, and synchronizing the
existing processes is delegated to a group of routines in the kernel. The kernel itself is not a
process but a process manager. A process running in User Mode refers to private stack, data, and
code areas. When running in Kernel Mode, the process addresses the kernel data and code area
and makes use of another stack. Unix-like operating systems adopt a process/kernel model. Each
process has the illusion that it's the only process on the machine and it has exclusive access to the
operating system services. Whenever a process makes a system call (i.e., a request to the kernel),
the hardware changes the privilege mode from User Mode to Kernel Mode, and the process starts
the execution of a kernel procedure with a strictly limited purpose. In this way, the operating
system acts within the execution context of the process in order to satisfy its request. Whenever
the request is fully satisfied, the kernel procedure forces the hardware to return to User Mode and
the process continues its execution from the instruction following the system call.
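If the strace utility is installed (it is a separate package on many distributions), you can watch this User Mode/Kernel Mode interplay by tracing the system calls a process makes:
$ strace -c ls            # run ls and print a summary count of the system calls it issued
$ strace ls > /dev/null   # print every system call ls makes (the trace goes to standard error)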
The process/kernel model assumes that processes that require a kernel service make use
of specific programming constructs called system calls. Each system call sets up the group of
parameters that identifies the process request and then executes the hardware-dependent CPU
instruction to switch from User Mode to Kernel Mode. Besides user processes, Linux systems
include a few privileged processes called kernel threads with the following characteristics:
•They run in Kernel Mode in the kernel address space.
•They do not interact with users, and thus do not require terminal devices.
•They are usually created during system startup and remain alive until the system is
shut down.
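Kernel threads can be seen in the output of ps; by convention their names are shown in square brackets (the exact names vary with the kernel version):
$ ps -ef | grep "\["      # kernel threads such as [kswapd0] or [ksoftirqd/0] appear here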
In a single-processor Linux system, only one process is running at any time, and it may run either in User or
in Kernel Mode. If it runs in Kernel Mode, the processor is executing some kernel routine.
As an example, suppose Process 1 in User Mode issues a system call, after which the process switches to Kernel Mode
and the system call is serviced. Process 1 then resumes execution in User Mode until a timer
interrupt occurs and the scheduler is activated in Kernel Mode. A process switch takes
place, and Process 2 starts its execution in User Mode until a hardware device raises an interrupt.
As a consequence of the interrupt, Process 2 switches to Kernel Mode and services the interrupt.
Kernels do much more than handle system calls; in fact, kernel routines can be activated in
several ways:
• A process invokes a system call.
• The CPU executing the process signals an exception, which is some unusual
condition such as an invalid instruction. The kernel handles the exception on
behalf of the process that caused it.
• A peripheral device issues an interrupt signal to the CPU to notify it of an event
such as a request for attention, a status change, or the completion of an I/O
operation. Each interrupt signal is dealt with by a kernel program called an interrupt
handler. Since peripheral devices operate asynchronously with respect to the
CPU, interrupts occur at unpredictable times.
• A kernel thread is executed; since it runs in Kernel Mode, the corresponding program must
be considered part of the kernel, albeit encapsulated in a process.
PROCESS
What is a Process?
A process is usually defined as an instance of a program in execution; thus, if 16 users are
running vi at once, there are 16 separate processes (although they can share the same executable
code). Processes are often called "tasks" in Linux source code. A process is any kind of program or
task carried out by your PC. For e.g., $ ls -lR is a command, or a request to list the files in a directory
and all its subdirectories in your current directory; it is a process. A process is a program (a command
given by the user) run to perform some job.
Each process is represented by a unique ID number referred to as a process ID (PID). In Linux,
when you start a process, it is given a number (called the PID or process-id); PIDs range from 0 to 65535.
In order to manage processes, the kernel must have a clear picture of what each process is doing.
It must know, for instance, the process's priority, whether it is running on the CPU or blocked on
some event, what address space has been assigned to it, which files it is allowed to address, and
so on. This is the role of the process descriptor , that is, of a task_struct type structure whose
fields contain all the information related to a single process. As the repository of so much
information, the process descriptor is rather complex. Not only does it contain many fields itself,
but some contain pointers to other data structures that, in turn, contain pointers to other
structures.
Why are Processes Required?
Linux is a multi-user, multitasking OS. It means you can run two or more processes simultaneously
if you wish. For e.g., to find how many files you have on your system you may give a
command like
$ ls / -R | wc -l
This command will take a lot of time to search all the files on your system, so you can run such a
command in the background (simultaneously with other work) by giving a command like
$ ls / -R | wc -l &
The ampersand (&) at the end of the command tells the shell to start the command (ls / -R | wc -l)
in the background and accept the next command immediately. An instance of a running command is
called a process, and the number printed by the shell is called the process-id (PID); this PID can be
used to refer to the specific running process.
Linux Commands Related to Processes
To see currently running processes: ps, e.g. $ ps
To stop a process (i.e. to kill a process): kill {PID}, e.g. $ kill 1012
To get information about all running processes: ps -ag, e.g. $ ps -ag
To stop all processes except your shell: kill 0, e.g. $ kill 0
For background processing: append &, e.g. $ ls / -R | wc -l &
Process State
As its name implies, the state field of the process descriptor describes what is currently
happening to the process. It consists of an array of flags, each of which describes a possible
process state. In the current Linux version these states are mutually exclusive, and hence exactly
one flag of state is set; the remaining flags are cleared. The following are the possible process
states:
TASK_RUNNING - The process is either executing on the CPU or waiting to be
executed.
TASK_INTERRUPTIBLE - The process is suspended (sleeping) until some
condition becomes true. Raising a hardware interrupt, releasing a system
resource the process is waiting for, or delivering a signal are examples of
conditions that might wake up the process, that is, put its state back to
TASK_RUNNING.
TASK_UNINTERRUPTIBLE - Like the previous state, except that delivering a
signal to the sleeping process leaves its state unchanged. This process state is seldom
used. It is valuable, however, under certain specific conditions in which a process must
wait until a given event occurs without being interrupted. For instance, this state may be
used when a process opens a device file and the corresponding device driver starts probing for a
corresponding hardware device. The device driver must not be interrupted until the probing
is complete, or the hardware device could be left in an unpredictable state.
TASK_STOPPED - Process execution has been stopped: the process enters this
state after receiving a SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU signal.
When a process is being monitored by another (such as when a debugger executes a ptrace( )
system call to monitor a test program), any signal may put the process in the TASK_STOPPED state.
TASK_ZOMBIE - Process execution is terminated, but the parent process has not yet
issued a wait( )-like system call (wait( ), wait3( ), wait4( ), or waitpid( )) to return
information about the dead process. Before the wait( )-like call is issued, the kernel cannot
discard the data contained in the dead process descriptor because the parent could need it.
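The ps command reports these states in its STAT column, using one letter per state (R for running, S for interruptible sleep, D for uninterruptible sleep, T for stopped, Z for zombie):
$ ps -eo pid,stat,comm | head     # PID, state letter, and command name for the first few processes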
SCHEDULING
Scheduling is the distribution of the 'processor' resource among competing tasks. Like any time-sharing system,
Linux achieves the magical effect of an apparent simultaneous execution of multiple processes by
switching from one process to another in a very short time frame.
Scheduling Policy
The scheduling algorithm of traditional Linux operating systems must fulfill several conflicting
objectives: fast process response time, good throughput for background jobs, avoidance of
process starvation, reconciliation of the needs of low- and high-priority processes, and so on.
The set of rules used to determine when and how to select a new process to
run is called the scheduling policy. Linux scheduling is based on the time-sharing technique: several
processes are allowed to run "concurrently," which means that the CPU time is roughly divided
into "slices," one for each runnable process. Of course, a single processor can run only one
process at any given instant. If a currently running process is not terminated when its time slice or
quantum expires, a process switch may take place. Time-sharing relies on timer interrupts and is
thus transparent to processes. No additional code needs to be inserted in the programs in order to
ensure CPU time-sharing. Recall that stopped and suspended processes cannot be selected by the
scheduling algorithm to run on the CPU. The scheduling policy is also based on ranking
processes according to their priority. Complicated algorithms are sometimes used to derive the
current priority of a process, but the end result is the same: each process is associated with a
value that denotes how appropriate it is to be assigned to the CPU.
In Linux, process priority is dynamic. The scheduler keeps track of what processes are doing and
adjusts their priorities periodically; in this way, processes that have been denied the use of the
CPU for a long time interval are boosted by dynamically increasing their priority.
Correspondingly, processes running for a long time are penalized by decreasing their priority.
When speaking about scheduling, processes are traditionally classified as "I/O-bound" or
"CPU-bound." The former make heavy use of I/O devices and spend much time waiting for I/O
operations to complete; the latter are number-crunching applications that require a lot of CPU
time.
An alternative classification distinguishes three classes of processes:
Interactive processes
These interact constantly with their users, and therefore spend a lot of
time waiting for key presses and mouse operations. When input is received, the process
must be woken up quickly, or the user will find the system to be unresponsive. Typically, the
average delay must fall between 50 and 150 ms. The variance of such delay must also be
bounded, or the user will find the system to be erratic. Typical interactive programs are
command shells, text editors, and graphical applications.
Batch processes
These do not need user interaction, and hence they often run in the
background. Since such processes do not need to be very responsive, they are often
penalized by the scheduler. Typical batch programs are programming language compilers,
database search engines, and scientific computations.
Real-time processes
These have very strong scheduling requirements. Such processes should never
be blocked by lower-priority processes; they should have a short response time and, most
important, the response time should have a minimum variance. Typical real-time programs are
video and sound applications, robot controllers, and programs that collect data from physical
sensors.
The Scheduling Algorithm
The Linux scheduling algorithm works by dividing the CPU time into epochs . In a single epoch,
every process has a specified time quantum whose duration is computed when the epoch begins.
In general, different processes have different time quantum durations. The time quantum value is
the maximum CPU time portion assigned to the process in that epoch. When a process has
exhausted its time quantum, it is preempted and replaced by another runnable process. Of course,
a process can be selected several times by the scheduler in the same epoch, as long as its
quantum has not been exhausted—for instance, if it suspends itself to wait for I/O, it preserves
some of its time quantum and can be selected again during the same epoch. The epoch ends when
all runnable processes have exhausted their quantum; in this case, the scheduler algorithm
recomputes the time-quantum durations of all processes and a new epoch begins. Each process
has a base time quantum: it is the time-quantum value assigned by the scheduler to the process if
it has exhausted its quantum in the previous epoch. The users can change the base time quantum
of their processes by using the nice( ) and setpriority( ) system calls . A new process always
inherits the base time quantum of its parent. In order to select a process to run, the Linux
scheduler must consider the priority of each process. Actually, there are two kinds of priority:
Static priority
This kind is assigned by the users to real-time processes and ranges from 1 to 99. It is never
changed by the scheduler.
Dynamic priority
This kind applies only to conventional processes; it is essentially the sum of the base
time quantum (which is therefore also called the base priority of the process) and of the number
of ticks of CPU time left to the process before its quantum expires in the current epoch. Of
course, the static priority of a real-time process is always higher than the dynamic priority of a
conventional one: the scheduler will start running conventional processes only when there is no
real-time process in a TASK_RUNNING state.
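From the shell, the base priority of a conventional process can be influenced with the nice and renice commands (the PID below is only an example); the nice value ranges from -20 (highest priority) to 19 (lowest):
$ nice -n 10 ./long_job &      # start long_job with a lower priority
$ renice -n 5 -p 1234          # lower the priority of the already-running process 1234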
SIGNALS
Signals were introduced by the first Unix systems to simplify inter-process
communication. The kernel also uses them to notify processes of system events. In contrast to
interrupts and exceptions, most signals are visible to User Mode processes.
A signal is a very short message that may be sent to a process or to a group of
processes. The only information given to the process is usually the number identifying the signal;
there is no room in standard signals for arguments, a message, or other accompanying
information.
Signals are identified by a set of macros whose names start with the prefix SIG.
Signals serve two main purposes:
• To make a process aware that a specific event has occurred
• To force a process to execute a signal handler function included in its code
Of course, the two purposes are not mutually exclusive, since often a process must
react to some event by executing a specific routine. The kernel distinguishes two different phases
related to signal transmission:
• Signal sending The kernel updates the descriptor of the destination process to represent
that a new signal has been sent.
• Signal receiving The kernel forces the destination process to react to the signal by
changing its execution state or by starting the execution of a specified signal handler or both.
Each signal sent can be received no more than once. Signals are consumable resources:
once they have been received, all process descriptor information that refers to their previous
existence is canceled. Signals that have been sent but not yet received are called pending signals .
At any time, only one pending signal of a given type may exist for a process; additional pending
signals of the same type to the same process are not queued but simply discarded. In general, a
signal may remain pending for an unpredictable amount of time. Indeed, the following factors
must be taken into consideration:
• Signals are usually received only by the currently running process (that is, by the
current process). Signals of a given type may be selectively blocked by a process; in this
case, the process will not receive the signal until it removes the block.
•When a process executes a signal-handler function, it usually "masks" the
corresponding signal, that is, it automatically blocks the signal until the handler
terminates. A signal handler therefore cannot be interrupted by another occurrence of
the handled signal, and therefore the function doesn't need to be reentrant. A masked
signal is always blocked, but the converse does not hold.
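From the shell, signals can be sent with the kill command and caught in a script with trap (the PID below is only an example):
$ kill -l                            # list the available signal names
$ kill -SIGTERM 1234                 # send SIGTERM to process 1234
$ trap 'echo "caught SIGINT"' INT    # in a script: run this handler when Ctrl-C is pressed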
Although the notion of signals is intuitive, the kernel implementation is rather complex. The
kernel must:
• Remember which signals are blocked by each process.
• When switching from Kernel Mode to User Mode, check whether a signal for any process
has arrived. This happens at almost every timer interrupt, that is, roughly every 10 ms.
• Determine whether the signal can be ignored. This happens when all of the following
conditions are fulfilled:
• The destination process is not traced by another process (the PF_TRACED flag in the
process descriptor flags field is equal to 0).
• The signal is not blocked by the destination process.
• The signal is being ignored by the destination process (either because the process has
explicitly ignored it or because the process did not change the
default action of the signal and that action is "ignore").
• Handle the signal, which may require switching the process to a handler function at any
point during its execution and restoring the original execution context after the function
returns.
Sending a Signal
When a signal is sent to a process, either from the kernel or from another process, the
kernel delivers it by invoking the send_sig_info( ), send_sig( ), force_sig( ), or force_sig_info( )
functions. These accomplish the first phase of signal handling described earlier: updating the
process descriptor as needed. They do not directly perform the second phase of receiving the
signal but, depending on the type of signal and the state of the process, may wake up the process
and force it to receive the signal.
The send_sig_info( ) and send_sig( ) Functions
The send_sig_info( ) function acts on three parameters:
Sig
The signal number.
Info
Either the address of a siginfo_t table associated with real-time signals,
or one of two special values: 0 means that the signal has been sent by a User
Mode process, while 1 means that it has been sent by the kernel. The
siginfo_t data structure has information that must be passed to the process
receiving the real-time signal, such as the PID of the sender process and the
UID of its owner.
Receiving a Signal
We assume that the kernel has noticed the arrival of a signal and has invoked one of the
functions in the previous section to prepare the process descriptor of the process that is supposed
to receive the signal. If that process was not running on the CPU at that moment, the
kernel deferred the task of waking the process, if necessary, and making it receive the signal. We
now turn to the activities that the kernel performs to ensure that pending signals of a process are
handled. The kernel checks whether there are nonblocked pending signals before allowing a
process to resume its execution in User Mode. This check is performed in ret_from_intr( ) every
time an interrupt or an exception has been handled by the kernel routines. In order to handle the
nonblocked pending signals, the kernel invokes the do_signal( ) function, which receives two
parameters:
regs
The address of the stack area where the User Mode register contents of the current
process have been saved.
oldset
The address of a variable where the function is supposed to save the bit mask array of
blocked signals (actually, this parameter is NULL when invoked from ret_from_intr( )).
CLONING
This section walks through the process of cloning a Linux installation to a different computer. First, a thing
to remember is that the new computer where you will put a copy of the drive image needs to have a
motherboard with the same architecture as the original one. Otherwise, Linux will not boot.
Consider you have a computer with Fedora Core 2 Linux installed on an IDE drive with the
following partitions:
/dev/hda1 /boot
/dev/hda2 /
/dev/hda3 swap
/dev/hda4 /home
First step is to create images of these partitions and use them to make an exact duplicate on a
drive of the same size in another computer.
Part 1: Make an HDD Image of the Installation
Connect another HDD as a secondary master, where I will put hard drive images of the first disk,
and boot using System Rescue CD or Linux Bootable CD. During booting, it asks for a keyboard
to use, then offers the # prompt. First, we need to mount a partition on the secondary master
drive:
# mount /dev/hdc4 /mnt/temp1
Under temp1, we can make a directory to store our images.
# mkdir /mnt/temp1/fedora_core2_template
# cd /mnt/temp1/fedora_core2_template
Now, it's time to save the Master Boot Record and Partition Table information of the /dev/hda
drive.
# dd if=/dev/hda of=fedora_core2_template.hda.mbr count=1 bs=512
I use the .mbr extension just to show that this is a Master Boot Record.
# sfdisk -d /dev/hda > fedora_core2_template.hda.pt
.pt is for Partition Table.
Now, we're ready to run partimage to save the contents of the /dev/hda1, /dev/hda2,
and /dev/hda4 partitions. We do not need to image the swap partition, as it can be created after
applying the partition table information to a new drive.
To save a partition, use the following command:
# partimage -b -z1 -o -V700 save /dev/hda1 fedora_core2.hda1.partimg.gz
This will create a compressed file of the first partition and, if it is larger than 700 MB, split it into
multiple 700 MB files which end with 000, 001, ..., ###. 700 MB is just enough to put one file on
a CD, if you ever want to back up your installation. After executing the above command, I type a
description of the image and hit F5 to continue.
Repeat the above command for the /dev/hda2 and /dev/hda4 partitions and a copy of the first hard
drive is done.
# partimage -b -z1 -o -V700 save /dev/hda2 fedora_core2.hda2.partimg.gz
# partimage -b -z1 -o -V700 save /dev/hda4 fedora_core2.hda4.partimg.gz
Part 2: Restore the Image to a New Drive on a Different Computer
The new computer can have an HDD of the same size or larger. Images we made in the first part
of the tutorial cannot be applied to a smaller HDD than the one we made a copy from. Connect
an HDD with images as a secondary master and boot with System Rescue CD. When you get to
the # prompt, mount the partition on the second drive.
# mount /dev/hdc4 /mnt/temp1
# cd /mnt/temp1/fedora_core2_template
Now, we can restore the master boot record on the new drive.
# dd if=fedora_core2_template.hda.mbr of=/dev/hda
Before we can run partimage, we also need to apply partition table information to the
new drive.
# sfdisk /dev/hda < fedora_core2_template.hda.pt
Now, everything is ready for partimage. Use the following command to restore the
image to the new drive:
# partimage -e restore /dev/hda1 fedora_core2.hda1.partimg.gz.000
After you hit Enter, partimage will display information about the image. You can verify that this is
the right image for this partition and press F5 to continue. After it's done, repeat the above
command for the remaining partitions:
# partimage -e restore /dev/hda2 fedora_core2.hda2.partimg.gz.000
# partimage -e restore /dev/hda4 fedora_core2.hda4.partimg.gz.000
Now, all that's left is to make swap on the /dev/hda3 partition.
# mkswap /dev/hda3
This will create a default swap structure and will use the whole /dev/hda3 partition for it.
Restoration of the installation is complete. We can shut down and disconnect the second HDD.
Because the Master Boot Record was restored earlier, the boot loader is already in place, and Linux
should load nicely after that. This process can be used for Windows installations as well. If you would
like to look at other ways to do it for Windows systems, there is a very nice tutorial on cloning
Windows XP installations using Norton Ghost,
HDClone, and Ranish Partition Manager.
Development with Linux
Basic of Shell Scripting
The following steps are general guidelines for writing shell scripts:
• Using a text editor, create and save a file. You can include any combination of shell
and operating system commands in the shell script file. By convention, shell scripts that
are not set up for use by many users are stored in the $HOME/bin directory.
Note: The operating system does not support the setuid or setgid subroutines
within a shell script.
• Use the chmod command to allow only the owner to run (or execute) the file. For
example, if your file is named script1, type the following:
chmod u=rwx script1
• Type the script name on the command line to run the shell script. To run the script1
shell script, type the following:
script1
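As a minimal illustration of these steps (the name script1 comes from the example above; its contents here are just a sketch), the file could contain:
#!/bin/sh
# script1 - print a greeting and today's date
echo "Hello, this is my first shell script"
date
After chmod u=rwx script1, typing script1 (or ./script1 if $HOME/bin is not in your PATH) runs it.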
How to define User defined variables (UDV)
To define a UDV use the following syntax
Syntax: variablename=value
'value' is assigned to the given 'variablename', and the value must be on the right side of the = sign.
Rules for naming variables (both UDV and system variables)
• A variable name must begin with a letter or an underscore character (_),
followed by one or more alphanumeric characters. For e.g., valid shell variable names are as follows:
HOME, SYSTEM_VERSION, vech, no.
• Don't put spaces on either side of the equal sign when assigning a value to a variable. For e.g.,
in the following variable declaration there will be no error:
$ no=10
But there will be a problem with any of the following variable declarations:
$ no =10;
$ no= 10;
$ no = 10
• Variables are case-sensitive, just like filename in Linux. For e.g.
$ no=10;
$ No=11;
$ NO=20;
$ nO=2
All of the above are different variable names, so to print the value 20 we have to use
$ echo $NO
and not any of the following:
$ echo $no # will print 10 but not 20
$ echo $No # will print 11 but not 20
$ echo $nO # will print 2 but not 20
• You can define a NULL variable as follows (a NULL variable is a variable which has no value
at the time of definition). For e.g.
$ vech=
$ vech=""
Try to print its value by issuing the following command: $ echo $vech
Nothing will be shown because the variable has no value, i.e. it is a NULL variable.
• Do not use characters such as ? or * in your variable names.
How to print or access the value of a UDV (User defined variable)
To print or access a UDV use the following syntax
Syntax: $variablename
Define variables vech and n as follows:
$ vech=Bus ; n=10
To print the contents of variable 'vech' (it will print 'Bus') and the contents of variable 'n',
type the commands as follows:
$ echo $vech
$ echo $n
Shell Arithmetic
Used to perform arithmetic operations.
Syntax: expr op1 math-operator op2
Examples:
$ expr 1 + 3 ; $ expr 2 - 1 ; $ expr 10 / 2
$ expr 20 % 3 ; $ expr 10 \* 3 ; $ echo `expr 6 + 3`
For the last statement note the following points:
(1) First, before the expr keyword we used the ` (back quote) sign, not the single quote
(') sign. The back quote is generally found on the same key as the tilde (~) on a PC
keyboard, above the TAB key.
(2) Second, the expression also ends with `, i.e. a back quote.
(3) Here expr 6 + 3 is evaluated to 9, then the echo command prints 9 as the sum.
(4) Here if you use double quotes or single quotes, it will NOT work.
For e.g.
$ echo "expr 6 + 3" # It will print expr 6 + 3
$ echo 'expr 6 + 3' # It will print expr 6 + 3
echo Command
Use the echo command to display text or the value of a variable.
echo [options] [string, variables...]
Displays text or variable values on screen.
The read Statement
Used to get input (data from the user) from the keyboard and store it in a variable.
Syntax: read variable1, variable2, ... variableN
Example program 1
$ vi sayH
#Script to read your name from key-board
echo "Your first name please:"
read fname
echo "Hello $fname, Lets be friend!"
Run it as follows:
$ chmod 755 sayH
$ ./sayH
Output
Your first name please: vivek
Hello vivek, Lets be friend!
Multiple commands on one command line
Syntax: command1;command2
To run two commands on one command line, separate them with a semicolon.
Example: $ date;who
Will print today's date followed by the users who are currently
logged in.
Redirection of Standard output/input i.e. Input - Output redirection
Most commands give output on the screen or take input from the keyboard, but in Linux (and in
other OSs also) it's possible to send output to a file or to read input from a file. For e.g., the $ ls
command gives output to the screen; to send the output of the ls command to a file, give the command
$ ls > filename
It means put the output of the ls command into filename.
There are three main redirection symbols >,>>,<
(1) > Redirector Symbol
Syntax:
Linux-command > filename
To send a Linux command's result (output of a command or shell script) to a file. Note that
if the file already exists, it will be overwritten; otherwise a new file is created. For e.g., to send
the output of the ls command give
$ ls > myfiles
Now if the 'myfiles' file exists in your current directory it will be overwritten without any
warning.
(2) >> Redirector Symbol
Syntax:
Linux-command >> filename
To append a Linux command's result (output of a command or shell script) to the END of a file. Note that
if the file exists, it will be opened and the new information/data will be written to the END of the file,
without losing the previous information/data; if the file does not exist, a new file is created. For e.g., to
send the output of the date command to an already existing file give the command
$ date >> myfiles
(3) < Redirector Symbol
Syntax:
Linux-command < filename
To take input for a Linux command from a file instead of the keyboard. For e.g., to take input for the
cat command give
$ cat < myfiles
Shells (bash) structured Language Constructs
if condition
The if condition is used for decision making in shell scripts. If the given condition is true,
then command1 is executed.
Syntax:
if condition
then
command1 if condition is true or if exit status
of condition is 0 (zero)
...
fi
test command or [ expr ]
The test command or [ expr ] is used to see if an expression is true; if it is true it
returns zero (0), otherwise it returns nonzero for false.
Syntax:
test expression OR [ expression ]
The following script determines whether the given argument number is positive.
$ cat > ispostive
#!/bin/sh
# Script to see whether argument is positive
if test $1 -gt 0
then
echo "$1 number is positive"
fi
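To run the script saved above (press Ctrl-d to finish the cat > input first), make it executable and pass it a number as the first argument:
$ chmod 755 ispostive
$ ./ispostive 5
5 number is positive
$ ./ispostive -2
(no output, because the test fails)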
Loops in Shell Scripts
for Loop
Syntax:
for { variable name } in { list }
do
repeat all statements between do and done once
for each item in the list, until the list is
finished
done
Example:
$ cat > testfor
for i in 1 2 3 4 5
do
echo "Welcome $i times"
done
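Running the script produces one line of output per item in the list:
$ chmod 755 testfor
$ ./testfor
Welcome 1 times
Welcome 2 times
Welcome 3 times
Welcome 4 times
Welcome 5 times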
while loop
Syntax:
while [ condition ]
do
command1
command2
....
done
The loop is executed as long as the given condition is true.
Example (n and i must be initialized before the loop):
n=5
i=1
while [ $i -le 10 ]
do
echo "$n * $i = `expr $i \* $n`"
i=`expr $i + 1`
done
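With n=5 and i=1 initialized as above, the loop prints the multiplication table of 5, from 5 * 1 = 5 up to 5 * 10 = 50.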
The case Statement
The case statement is a good alternative to a multilevel if-then-else-fi statement. It enables you to
match several values against one variable. It is easier to read and write.
Syntax:
case $variable-name in
pattern1) command ;;
pattern2) command;;
patternN) command;;
*)
command;;
esac
The $variable-name is compared against the patterns until a match is found. The shell then
executes all the statements up to the two semicolons that are next to each other. The default is *),
and it is executed if no match is found. For e.g., write a script as follows:
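A minimal sketch of such a script (the vehicle names, rates, and the file name rental are made up for illustration) could look like this:
$ cat > rental
#!/bin/sh
# Script to test the case statement: print a rental rate for a given vehicle
echo "Enter a vehicle name (bus, car or van):"
read vehicle
case $vehicle in
bus) echo "Rent for a bus is Rs. 20 per km";;
car) echo "Rent for a car is Rs. 10 per km";;
van) echo "Rent for a van is Rs. 15 per km";;
*) echo "Sorry, $vehicle is not available";;
esac
Run it with chmod 755 rental followed by ./rental.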
THE END