Introduction to Unix
Computer Network Programming
Course Information
• Class Hours:
• section 1
– Mon 13:40-15:30 (EB268), Wed 8:40-9:30 (EB267)
• section 2
– Wed 16:40-17:30 (EB262), Fri 10:40-12:30 (EB267)
• Office Hours: Any time when I am in the office
• Textbooks:
– W. Richard Stevens, Unix Network Programming, Volume 1, Networking APIs:
Sockets and XTI, Second Edition, Prentice Hall PTR, 1998.
– W. Richard Stevens, Unix Network Programming, Volume 2, Interprocess
Communications, Second Edition, Prentice Hall PTR, 1998.
Course Information
• Some papers (whitepapers) and documents will be
distributed on topics which are not covered by the
textbook.
• Prerequisites: No course is set as a formal prerequisite. However, the
following are required or recommended for taking this course:
– Data Structures and Algorithms course (required)
– Fluency in C programming (required); if you know C++, C is very
easy to learn.
– Operating Systems course, which may be taken in parallel (recommended but not
necessary)
– Computer Networks course (recommended but not necessary)
Course Information
• Hardware and Software Requirements:
Students need access to Unix machines (Solaris). Students may either
work from the console of the Unix machines or connect to the Unix
machines from PCs (Windows) using telnet, Xceed or Xwin.
• This course teaches network programming in the
Unix Operating System.
• Projects and Homework Assignments will be done
on Unix machines (hosts).
• Course homepage: http://www.cs.bilkent.edu.tr/~korpe/cs424.html
– please visit the homepage regularly. Announcements will be posted
there also.
Grading Policy (tentative)
Midterm: There will be one midterm exam. 20%
Final: 20%
Project(s): 30%
Homeworks: 30%
• Homeworks will include programming exercises.
• Project(s) will be large and will include writing programs
of substantial size.
Topics that will be covered
(tentative)
• Overview of Unix Programming Environment
• Unix Programming Tools: compilers, debuggers, utilities,
revision control system.
• Introduction to Computer networking and TCP/IP protocol
suite
• Overview of TCP, IP, IPv6, Ethernet, PPP, ARP protocols
• Debugging and Networking Tools for Network
Programming, Troubleshooting
• Introduction to Sockets, TCP Sockets
• I/O Multiplexing, Socket Options
Topics covered
• UDP Sockets, Name and Address conversions, DNS
• Daemon Processes, Advanced IO Functions
• Unix Domain Protocols, Non-blocking IO
• Routing Sockets, Broadcasting, Multicasting
• Advanced UDP Sockets, Signal driven I/O
• Threads
• IP Options, Raw Sockets, Data-link access
• Interprocess communication, Pipes and FIFOs
• Message queues, shared memory, semaphores
Topics Covered
• Distributed File Systems: NFS and AFS
• RPC, Pseudo-terminals
If time permits
• Implementation of Networking sub-system in Unix OS.
• Network Management and SNMP
• Introduction to Mobile and Wireless Networking
• Mobile IP and Bluetooth
Overview of the Unix
Programming Environment
Logging into a Unix System
• Log in to the system:
• type your user name at the login prompt
• type your password at the Password prompt
• When you log in successfully, you are
placed in a directory in the file system
which is called your HOME directory
– ex: van:/home1/csstu/korpe$
Logging into the System
aspendos{korpe}:> telnet van.ug.bcc.bilkent.edu.tr
Trying 139.179.11.19...
Connected to van.ug.bcc.bilkent.edu.tr.
Escape character is '^]'.
UNIX(r) System V Release 4.0 (van)
login: korpe
Password:
Last login: Thu Feb 7 11:31:54 from aspendos.cs.bilk
Sun Microsystems Inc.   SunOS 5.5       Generic November 1995
van:/home1/csstu/korpe$
Basic commands
• Listing the files in a directory
• use ls command
• ls -l gives more information about each file
• Displaying the content of a file
• use cat or more commands
• Copying a file: cp command
• cp file1 file2 copies file1 to file2.
• Renaming (moving) a file: mv command
• mv file1 file2 renames file1 to file2.
Basic commands
• rm removes a file from the system
• wc displays the number of lines, words, and
characters in a file
• cd changes the directory
• cd .. changes to the parent directory
• cd directory_name changes the current
directory to directory_name
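The commands above can be tried safely in a scratch directory. The following is a minimal sketch; the file names (file1, file2, notes.txt, sub) are made up for illustration and are not from the slides.

```shell
# Work in a scratch directory so that none of your real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

printf 'first line\nsecond line\nthird line\n' > file1   # create a three-line file

cp file1 file2        # file2 is now a copy of file1
mv file2 notes.txt    # rename the copy; file2 no longer exists
ls -l                 # long listing: permissions, owner, size, date, name
cat notes.txt         # print the contents of the file
line_count=$(wc -l < notes.txt)   # count the lines only
echo "lines: $line_count"

mkdir sub
cd sub                # go into the subdirectory
cd ..                 # and back up to the parent directory
rm file1              # delete the original file
```

After the last command only notes.txt and the empty directory sub remain in the scratch directory.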
Unix File Hierarchy
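[Figure: directory tree rooted at / (root). Top-level entries include bin, dev, etc, tmp, var, usr, sbin, kernel, vmunix and home. The home directory contains the user directories you, mike and paul; the directory you contains tmp, data.dat and program.c.]
A path: /home/you/data.dat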
Pathnames
• You can use full pathnames for the files in
the commands or relative pathnames:
– Example: assume current directory is “you”
– full path: cp /home/you/data.dat /home/mike
– relative path: cp data.dat ../mike/
• Use the Unix manual pages to obtain more
information about the commands.
• Ex: man cp gives information about the cp command
• man -k subject lists the commands related to
subject
man -k
van:/home1/csstu/korpe$ man -k copy
/usr/openwin/man/windex: No such file or directory
/usr/local/SUNWspro/man/windex: No such file or directory
/usr/dt/man/windex: No such file or directory
/usr/man/windex: No such file or directory
cp        cp (1)        - copy files
cp        cp (1)        - copy files
cp        cp (l)        - copy files
cpio      cpio (l)      - copy files to and from archives
dd        dd (1)        - convert a file while copying it
dd        dd (1)        - convert a file while copying it
dd        dd (l)        - convert a file while copying it
fcat      fcatcmd (1)   - copy files in the FSP database to stdout
fcatcmd   fcatcmd (1)   - copy files in the FSP database to stdout
install   install (1)   - copy files and set their attributes
install   install (1)   - copy files and set their attributes
install   install (l)   - copy files and set their attributes
tiffcp    tiffcp (l)    - copy (and possibly convert) a .SM TIFF file
Important directories in the File
Hierarchy
`/bin'
Executable (binary) programs. On most systems this is a separate directory to /usr/bin.
In SunOS, this is a pointer (link) to /usr/bin.
`/etc'
Miscellaneous programs and configuration files. This directory has become very messy
over the history of UNIX and has become a dumping ground for almost anything.
Recent versions of Unix have begun to tidy up this directory by creating subdirectories
such as `/etc/mail', `/etc/services', etc.
`/usr'
This contains the main meat of UNIX. This is where application software lives,
together with all of the basic libraries used by the OS.
`/usr/bin'
More executables from the OS.
Important directories
`/usr/local'
This is where users' custom software is normally added.
`/sbin'
A special area for statically linked system binaries. They are placed here to distinguish
commands used solely by the system administrator from user commands and so that they
lie on the system root partition where they are guaranteed to be accessible during booting.
`/dev, /devices'
A place where all the `logical devices' are collected. These are called `device nodes' in
unix and are created by mknod. Logical devices are UNIX's official
entry points for writing to devices. For instance, /dev/console is a route to the system
console, while /dev/kmem is a route for reading kernel memory. Device nodes enable
devices to be treated as though they were files.
Important directories
`/home'
(Called /users on some systems.) Each user has a separate login directory where files can
be kept. These are normally stored under /home by some convention decided
by the system administrator.
`/var'
/var/spool and /var/adm etc are used for holding queues for spooling and system log files.
`/vmunix'
This is the program code for the unix kernel. (kernel is the core which implements
basic operating system services).
`/kernel'
On newer systems the kernel is built up from a number of modules which are placed
in this directory.
Input-Output redirection
• “ls -l > filename” writes the file listing into a file
called filename
• “cat f1 f2 f3 > tmp” concatenates the three files
into tmp
• “sort < temp” sorts the lines of the file
temp and displays the sorted output (sort reads its
input from temp)
Pipes
• You can redirect the output of a program to
another program as input.
– Example:
– ls | wc -l
– who | grep mary
– who | grep mary | wc -l
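Redirection and pipes can be combined freely. The sketch below uses made-up files (f1, f2, all.txt) in a scratch directory, so it does not depend on who is logged in.

```shell
workdir=$(mktemp -d)
cd "$workdir"

printf 'pear\napple\nmango\n' > f1      # '>' sends the output to a file
printf 'banana\napple\n' > f2
cat f1 f2 > all.txt                     # concatenate both files into all.txt

sorted=$(sort < all.txt)                # '<' feeds the file to sort as its input
echo "$sorted"

# A pipe connects the output of one command to the input of the next one.
apple_count=$(cat f1 f2 | grep apple | wc -l)
echo "lines containing apple: $apple_count"
```

Each stage of the pipeline runs as its own process, and no temporary file is needed between the stages.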
Process
• “ps” shows the processes started from your current
terminal session
• “ps -ef” shows every process on the system, with more
detailed information about each one
• “kill -9 process_number” forcibly terminates (kills)
a process
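A short sketch of finding and killing a process follows. The sleep command stands in for any long-running program, and the variable names are chosen only for this example.

```shell
sleep 30 &                # a long-running process to experiment with
pid=$!                    # $! is the process ID of the last background command

ps -p "$pid"              # show only this process

kill -9 "$pid"            # send signal 9 (SIGKILL), which cannot be caught or ignored
wait "$pid"               # collect the exit status of the killed process
status=$?                 # the shell reports 128 + signal number for a killed process
echo "exit status: $status"
```

Signal 9 gives the program no chance to clean up, so a plain kill (signal 15) is usually worth trying first.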
Tailoring the Environment and
Shell
• You can bring the system closer to your
personal taste.
• Shell is the interpreter that executes the
commands that you type.
– It provides you with a command prompt where
you type your commands. (like $)
• There are different kinds of shells that you
can choose from
Different Shells
bash    The Bourne Again shell, an improved sh.
csh     The standard C-shell.
ksh     The Korn shell, an improved sh.
sh      The original Bourne shell.
tcsh    An improved C-shell.
- csh and tcsh are easy to use
- in order to execute a shell just type its name.
Environment Variables
• Environment variables are variables that the
shell keeps. They are used to configure
your working environment, so that you can
tailor your environment to your needs and
taste.
• Any program that you run can read these
variables to find out the configuration of the
environment.
Some important environment
variables
PATH              The search path for shell commands (bash)
TERM              The terminal type (bash and csh)
DISPLAY           X11: the name of your display
LD_LIBRARY_PATH   Path to search for object and shared libraries
HOSTNAME          Name of this UNIX host
PRINTER           Default printer (lpr)
HOME              The path to your home directory (bash)
PS1               The default prompt for bash
path              The search path for shell commands (csh)
term              The terminal type (csh)
prompt            The default prompt for csh
home              The path to your home directory (csh)
Setting the Shell Variables
• In csh and tcsh you can display the current values of the
environment variables using the setenv command with no arguments
• You can set the value of an environment variable
using the setenv command (sh, ksh and bash use export instead)
• example:
– setenv PATH /opt/SUNWspro/bin
» set the PATH variable
– setenv PATH ${PATH}:/usr/local/teTeX/bin
» adds one more path to the PATH variable
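The setenv examples above are csh syntax. In Bourne-style shells (sh, ksh, bash) the same effect is obtained with an assignment followed by export. The sketch below assumes such a shell; GREETING is a hypothetical variable used only for illustration.

```shell
# Bourne-style shells use assignment plus export where csh uses setenv.
old_path=$PATH

PATH=${PATH}:/usr/local/teTeX/bin    # append one more directory to the search path
export PATH                          # make the new value visible to child processes

GREETING=hello                       # hypothetical variable, for illustration only
export GREETING

# A child process inherits the exported variables.
child_view=$(sh -c 'echo "$GREETING"')
echo "child sees: $child_view"

echo "$PATH"                         # display a single variable
env | grep '^GREETING='              # env lists the whole exported environment
```

Without the export, the assignment would stay local to the current shell and the child process would see an empty value.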
Shell Configuration Files
• You can add the configuration commands
to special files such as .profile and .cshrc
so that your environment is configured
using the commands in those files when you
log in to the system.
• .profile and .cshrc are hidden files (use ls -al to see
them)
• .cshrc is used by the C-shell, .profile by the Bourne
shell, .bashrc by bash, etc.
Wildcards
• Sometimes you want to be able to refer to several
files in one go while executing a command
• You use wildcards for this
• The wildcard symbols are
• `?'
– Match single character. e.g. ls /etc/rc.????
• `*'
– Match any number of characters. e.g. ls /etc/rc.*
• `[...]'
– Match any character in a list enclosed by these brackets. e.g. ls [abc].C
Wildcards
Here are some examples and explanations.
`/etc/rc.????'
Match all files in /etc whose names start with the three characters rc.
and are 7 characters long.
`*.c'
Match all files ending in `.c' i.e. all C programs.
`*.[Cc]'
Match all files ending in `.c' or `.C' i.e. all C and C++ programs.
`*.[a-z]'
Match any file ending in .a, .b, .c, ... up to .z.
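The patterns above can be checked against a handful of empty files. This is a sketch with invented file names; LC_ALL=C is set so that the range [a-z] means exactly the lowercase ASCII letters, which is an assumption about how you want ranges to behave rather than something the slides specify.

```shell
LC_ALL=C; export LC_ALL      # make [a-z] and the sort order of matches predictable
workdir=$(mktemp -d)
cd "$workdir"
touch main.c util.c Widget.C notes.txt rc.boot rc.local lib.a

c_files=$(echo *.c)            # every name ending in .c
c_or_cpp=$(echo *.[Cc])        # names ending in .c or .C
four_after_rc=$(echo rc.????)  # rc. followed by exactly four characters
one_letter_ext=$(echo *.[a-z]) # a single lowercase letter after the last dot

echo "$c_files"
echo "$c_or_cpp"
echo "$four_after_rc"
echo "$one_letter_ext"
```

The shell expands the pattern before the command runs, so echo (like ls or rm) only ever sees the resulting list of names.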
Regular Expressions
The wildcards belong to the shell. They are used for matching
filenames. UNIX has a more general and widely used mechanism
for matching strings: regular expressions.
# Print all lines which DON'T begin with #
egrep '(^[^#])' /etc/rc
# Print all lines beginning with e, f or g.
egrep '(^[efg])' /etc/rc
# Print all lines beginning with uppercase
egrep '(^[A-Z])' /etc/rc
# Print all lines NOT beginning with uppercase
egrep '(^[^A-Z])' /etc/rc
# Print all lines containing ! * &
egrep '([\!\*\&])' /etc/rc
(“egrep pattern filename” is a utility that searches a file and prints the lines matching the pattern)
How to construct regular
expressions
Regular expressions are made up of the following `atoms'.
`.' Match any single character except the end of line.
`^' Match the beginning of a line as the first character.
`$' Match end of line as last character.
`[..]' Match any character in the list between the square brackets.(see below).
`*' Match zero or more occurrences of the preceding expression.
`+' Match one or more occurrences of the preceding expression.
`?' Match zero or one occurrence of the preceding expression.
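The atoms above can be tried with grep -E, which accepts the same extended regular expressions as egrep. The sample file and its contents are invented for this sketch; the -c option makes grep print the number of matching lines instead of the lines themselves.

```shell
workdir=$(mktemp -d)
cd "$workdir"
cat > sample.conf <<'EOF'
# comment line
Echo on
fast=yes
colour=red
color=blue
EOF

not_comment=$(grep -E -c '^[^#]' sample.conf)     # first character is anything but #
starts_efg=$(grep -E -c '^[efg]' sample.conf)     # line begins with e, f or g
optional_u=$(grep -E -c 'colou?r' sample.conf)    # ? makes the preceding u optional
ends_yes=$(grep -E -c 'yes$' sample.conf)         # $ anchors the match at the end of the line
key_value=$(grep -E -c '^[a-z]+=' sample.conf)    # + means one or more lowercase letters, then =

echo "$not_comment $starts_efg $optional_u $ends_yes $key_value"
```

Quoting the pattern in single quotes keeps the shell from treating characters such as * and ? as filename wildcards.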
File Permissions
• chmod
– Change file access mode.
• chown, chgrp
– Change the owner and the group of the file.
• use ls -al to find the current permissions of a
file:
File Permissions
Example
aspendos{korpe}:> ls -al .profile
-rw-r--r--   1 korpe    staff        144 Apr  1  1997 .profile
The structure of permissions is:
-    rwx      rwx      rwx
     owner    group    others
The first rwx triple holds the permissions for the owner, the second
the permissions for the group, and the third the permissions for the others.
Think of rwx as a binary number where 1 corresponds to permission
granted and 0 corresponds to permission denied.
rwx corresponds to 111 = 7
r-x corresponds to 101 = 5
r-- corresponds to 100 = 4
File Permissions
For example:
• to obtain a file permission setting of rwxr-xr-x for a file,
we have to execute the command:
chmod 755 filename
• chmod a+w filename changes the mode of the file so that it is
writable by everyone (with a plain +w, the bits set in your umask are left unchanged)
• chmod a+x filename changes the mode of the file so that it is
executable by everyone
• chmod ug+w filename changes the mode of the file to writable for
the user (owner) and group.
• chmod uo+x filename changes the mode of the file to executable for
the user and others
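The effect of each chmod form can be observed by reading back the first ten characters of the ls -l output. The sketch below works on a scratch file whose name, script.sh, is arbitrary.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch script.sh

chmod 755 script.sh                       # 7 = rwx, 5 = r-x, 5 = r-x
mode_755=$(ls -l script.sh | cut -c1-10)  # file type plus the nine permission bits

chmod 644 script.sh                       # reset to rw-r--r--
chmod ug+w script.sh                      # add write permission for user and group
mode_ugw=$(ls -l script.sh | cut -c1-10)

chmod a+x script.sh                       # add execute permission for everyone
mode_ax=$(ls -l script.sh | cut -c1-10)

echo "$mode_755 $mode_ugw $mode_ax"
```

The numeric form sets all nine bits at once, while the symbolic forms only add or remove the bits that are named.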
Text editors that you can use
ed         An ancient line-editor.
vi         Visual interface to ed. This is the only "standard" UNIX text
           editor supplied by vendors.
emacs      The most powerful UNIX editor. A fully configurable, user programmable
           editor which works under X11 and on tty-terminals.
xemacs     A pretty version of emacs for X11 windows.
pico       A tty-terminal only editor, comes as part of the PINE mail package.
xedit      A test X11-only editor supplied with X-windows.
textedit   A simple X11-only editor supplied by Sun Microsystems.
Running a process in the
background
• Type & after the command name
– example: my_program &
• use the fg command to bring a background job to the
foreground, and bg to resume a stopped job in the background.
• use the jobs command to display the current
jobs
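A background job can also be followed from a script. The sketch below uses a sleep followed by an echo as a stand-in for my_program. The fg and bg commands are left out because they need the job control of an interactive shell.

```shell
outfile=$(mktemp)

# Start a command in the background; the shell continues immediately.
(sleep 1; echo finished > "$outfile") &
bg_pid=$!                 # $! holds the process ID of the last background command
echo "started background job with PID $bg_pid"

jobs                      # list the background jobs of this shell

wait "$bg_pid"            # block until the background job is done
bg_status=$?              # exit status of the finished job
result=$(cat "$outfile")
echo "job wrote: $result"
```

In an interactive shell you would type fg to get the job back in the foreground instead of calling wait.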
Obtaining System Information
uname gives the name of the Operating System
uname -r gives the release (version) of the Operating System
hostname gives the name of the host.
Unix Programming Tools
Tools
• editors: emacs, xemacs, vi, pico
• compilers: gcc, cc, CC, g++
• linker/loader: ld
• archive library builders: ar
• debuggers: gdb, xxgdb, dbx
• utilities: make, autoconf, purify, gprof, truss
• source code management: rcs, cvs, sccs
Compiling
• gcc
– gcc -o hello hello.c
– compiles and links the program hello.c and generates an
executable called hello (if you don’t give a name then the
executable is called a.out)
– gcc -c hello.c
– only compiles the program and produces an object file (hello.o).
– gcc -Wall
– enables most of the commonly useful warnings
– gcc -llibraryname
– link with the library libraryname
Linking
– ld
– the linker ld links together the object code files and library files and
produces an executable program.
– Linking refers to the process in which a symbol referenced in one
module of your program is connected with its definition in another
module (object file or library).
– ld -Ldirectory_name
» searches the directory called directory_name for the
specified libraries.
– ld -lx
» searches for a library libx.so or libx.a
» .so shared object library
» .a archive library
Linker
• Static Linking
» Under static linking, copies of the archive library object
files that satisfy still unresolved external references in your
program are incorporated in your executable at link time.
External references in your program are connected with
their definitions -- assigned addresses in memory -- when
the executable is created.
• Dynamic Linking
» Under dynamic linking, the contents of a shared object are
mapped into the virtual address space of your process at run
time. External references in your program are connected
with their definitions when the program is executed.
Creating a Library
• Static library for linking: libsomething.a
• create .o files: gcc -c helper.c
• ar rlv libsomething.a *.o
• ranlib libsomething.a
• use the library with gcc -lsomething
» the linker searches /usr/lib, etc.
• Dynamic library
• gcc -shared -fPIC helper.c -o libhelper.so
• use it the same way as above; at run time the library is
found through LD_LIBRARY_PATH
ldd and nm
• List dynamic dependencies of an executable
$ ldd a.out
libc.so.1 => /usr/lib/libc.so.1
libdl.so.1 => /usr/lib/libdl.so.1
• Content of an archive or executable
$ nm [-g] a.out
0004155b R charmap
00030f10 t cleanfree
0002e304 W close
……..
Debugging
• Use gdb or xxgdb
• must compile your programs with -g option
to get the symbol table
• gdb a.out or gdb a.out core
• gdb a.out 1234 attaches to process 1234
• source commands
– list main.c:12    show source
– p(print) x        show variable x
– where             where am I in the stack
gdb: execution
• run arg        run program
• call f(a,b)    call function in program
• step N         step N times into functions
• next N         step N times over functions
• up N           select stack frame that called current one
• down           select stack frame called by current one
gdb: break points
• break main.c:12    set break point
• break foo          set break point at function
• clear main.c:12    delete break point
• info break         show breakpoints
• delete 1           delete break point 1
• display x          display variable at each stop
make Utility
• make is used for compiling large projects that
consist of many source files
• maintains dependency graphs:
• source <--- object
• latex <--- postscript
• based on modification times of files
target .. : dependency
command
command
...
How to use make
• Create a file called Makefile in the directory where the
source code (program) resides
– edit the makefile so that dependencies are defined for the code
• type make
– it will automatically read the Makefile and will compile the source
code.
• A Makefile consists of dependencies of the form:
target .. : dependency
command (or rule)
command
Content of Makefile
• The target is the thing we want to build, the dependencies are like
subroutines to be executed first if they do not exist. Finally the command (or
rule) is executed once all of the dependencies exist; it takes the
dependencies and turns them into the target. There are two important things to
remember:
– The target names must start at the first character of a line.
– There must be a TAB character at the beginning of every rule or
action. If there are spaces instead of tabs, or no tab at all, `make'
will signal an error. This bizarre feature can cause a lot of
confusion.
Example
Our example program consists of two source files:
main.c and other.c
It uses a library called libdb which resides in directory
/usr/local/lib
Our aim is to build a program called database.
#
# Simple Makefile for `database'
#
# First define a macro
#
OBJ = main.o other.o
CC = gcc
CFLAGS = -I/usr/local/include
LDFLAGS = -L/usr/local/lib -ldb
INSTALLDIR = /usr/local/bin
#
# Rules start here. Note that the $@ variable becomes the name of the
# executable file. In this case it is the target name, database.
# Every command line below starts with a TAB character.
#
database: ${OBJ}
	${CC} -o $@ ${OBJ} ${LDFLAGS}
#
# If a header file changes, normally we need to recompile everything.
# There is no way that make can know this unless we write a rule which
# forces it to rebuild all .o files if the header file changes...
# (HEADERS must be set to the list of header files for this rule to have an effect.)
#
${OBJ}: ${HEADERS}
#
# As well as special rules for special files we can also define a
# "suffix rule". This is a rule which tells us how to build all files
# of a certain type. Here is a rule to get .o files from .c files.
# The $< variable is like $? but is only used in suffix rules.
#
.c.o:
	${CC} -c ${CFLAGS} $<
#######################################################################
# Clean up
#######################################################################
#
# Make can also perform ordinary shell command jobs
#
clean:
	rm -f ${OBJ}
	rm -f y.tab.c lex.yy.c y.tab.h
	rm -f y.tab lex.yy
	rm -f *% *~ *.o
	rm -f mconfig.tab.c mconfig.tab.h a.out
	rm -f man.dvi man.aux man.log man.toc
	rm -f cfengine.tar.gz cfengine.tar cfengine.tar.Z
	rm -f cfengine
# install depends on the program itself, so make builds it first if needed
install: database
	cp database ${INSTALLDIR}/database
How to invoke the Makefile
make
make database
make clean
make install
Make uses some special variables: $@ $? $<
Makefile Special Variables
$@ This evaluates to the current target i.e. the name of the
object you are currently trying to build. It is normal to use
this as the final name of the program when compiling
$? This is used only outside of suffix rules. It evaluates to the list
of dependencies that are newer than the current target, i.e. the files
that have changed since the target was last built.
target: file1.o file2.o
TAB cc -o $@ $?
On the first build $? names both object files. On a rebuild it names only
the changed ones, so in a link rule it is safer to list all the object files explicitly.
$< This is only used in suffix rules. It stands for the pre-requisite, i.e. the file
which must be compiled in order to make a given object.
Source Code Management and
Revision Control
• Large-scale programs are developed by
many engineers
• a single shared database of source code (program
files) is used by many people
• access to the source code files needs to be
synchronized (only one user at a time should be able to modify
a source file)
• We need to store many versions (revisions)
of program files at different stages of project
development
RCS
• rcs: source code management and revision
control system in Unix. Others exist:
SCCS, CVS
• manages multiple revisions of files.
• rcs automates the storing, retrieval, logging,
identification and merging of revisions
rcs
• Revisions are stored in a file called the RCS file.
• The RCS files are usually stored in a directory called RCS
• Example: Our program directory foo contains a file called
main.c. All the versions of main.c are stored in a file
called main.c,v in the directory RCS. main.c is the
currently used version of the file and it may or may not be
stored in the foo directory.
/foo
    main.c
    /RCS
        main.c,v
Functions of RCS
• Store and retrieve multiple revisions of text
– Revisions can be retrieved using the revision numbers,
symbolic names, authors, dates.
• Maintain complete history of changes
– RCS logs all the changes to the file together with the
modifications, the author who made them, the date, and an
explanation message.
• Resolve access conflicts
– when more than one user wants to access the file, RCS
alerts the users and prevents corrupting the file.
Functions of RCS
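• Maintain a tree of revisions.
– RCS can maintain separate lines of development for
each module. It stores a tree structure that represents the
ancestral relationships among revisions
[Figure: revision tree. The trunk (top of tree path) runs through revisions 1.1, 1.2, 1.3 and 1.4. A branch starts at 1.2 with revisions 1.2.1.1 and 1.2.1.2 and is later merged back into the trunk. Further branches carry revisions 1.2.1.1.1.1, 1.2.1.2.1.1 and 1.2.1.2.1.2.]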
RCS Functions
• Merge revisions and resolve conflicts
– Two different lines of development can be merged. If the
revisions being merged affect the same section of the code,
RCS alerts the user.
• Control Releases and Configurations
– Revisions can be assigned symbolic names (tagging) and marked
as stable, released, experimental, etc.
• Automatically identify each revision with name,
revision number, creation time and author.
• Minimize secondary storage.
– Only the difference (delta) between revisions is stored in the
RCS file.
Example
• Assume you have a file f.c (working file) that you want to
store in RCS
• create a directory RCS: mkdir RCS
• use the ci command (check-in) to store the file in
RCS
– ci f.c (stores f.c in RCS/f.c,v and assigns it revision
number 1.1)
– f.c is deleted from the working directory
• use the command co (check-out) to retrieve the
latest revision from the RCS file.
– co f.c
RCS commands
• Use co -l to check out the file with a lock
so that you can make changes to it.
• After modifying the file you
can store (check in) the file and RCS will
assign it a new revision number, 1.2
• You can check out a specific revision of the
file using the command
• co -rrevision_number file (e.g. co -r1.2 foo.c)
RCS commands
• You can check-in (store) the file with a revision number of
your choice using the command: ci -rrevision_number file
– example: ci -r1.2.1.1 f.c (creates branch)
• RCS can put an automatic identification string into your file.
To achieve this, put the string $Id$ at the beginning of the file.
When you check in and check out the file, RCS will replace
this string with an identification string of the form:
$Id: filename revision date time author state $
• With such a string at the beginning of your working file you will know
which revision of the file you are currently working with.
Other RCS commands
• ident shows information about the working file
– Example:
aspendos{korpe}:> ident main.c
main.c
$Id: main.c,v 1.2 2002/02/06 21:47:25 korpe Exp $
• rcsdiff shows the difference between two revisions of a
file.
– rcsdiff -r1.1 -r1.2 main.c (shows the difference between
revisions 1.1 and 1.2)
– rcsdiff -r1.2 main.c (shows the difference between the
working file main.c and its 1.2 revision)
Other RCS commands
• rcsmerge incorporates the changes between two revisions
of a file into the corresponding working file.
– rcsmerge -r1.2 -r1.2.1.1 f.c
(version of the working file is 1.3)
[Figure: revisions 1.1, 1.2 and 1.3 on the trunk and 1.2.1.1 on a branch from 1.2. The delta between 1.2 and 1.2.1.1 is added to the current working file f.c, which has revision 1.3, producing a new revision.]
RCS Commands
• rlog prints log messages and other information
about RCS files.
• rcs creates new RCS files or changes the attributes
of the existing ones
• rcs -l1.2 f.c locks the revision 1.2 of f.c so that we can
check it in as revision 1.3.
• rcs -urevision filename removes the lock.
• rcs -nname[:rev] assigns a symbolic name to the version rev of
the file. (this is called tagging and name is called a tag)
Other Tools
• Memory
– purify: memory leak detector
• Performance
– prof, gprof: profile performance
– truss: trace system calls
gprof
– Profiling = execution profile of a call graph
– periodic CPU sampling
gcc -pg myprog.c -o myprog
myprog (running the program writes the profile data into gmon.out)
gprof myprog gmon.out
truss
• show execution trace of system calls
• does not show stdio calls
• -p: attach to existing process id
• -f: follow children
• -u libc: also follow user libraries
$ truss -u libc -d test
• use top to show memory utilization
Purify
• Check for
– memory leaks
– access to free’d memory
– open file descriptors
– purify -cache-dir=/tmp/purify gcc -g test.c
– a.out