TDC368
UNIX and Network Programming
Week 8:
 Inter-Process Communication on Networks


Socket Application Programming Interface (API)
Examples:
http://condor.depaul.edu/~czlatea/TDC368/Code/tcp_socket/
http://condor.depaul.edu/~czlatea/TDC368/Code/daytime/
Camelia Zlatea, PhD
Email: [email protected]
References
• Douglas Comer, David Stevens, Internetworking with TCP/IP: Client-Server Programming, Volume III (BSD Unix and ANSI C), 2nd edition, 1996 (ISBN 0-13-260969-X)
– Chap. 3, 4, 5
• W. Richard Stevens, UNIX Network Programming, Volume 1: Networking APIs: Sockets and XTI, 2nd edition, 1998 (ISBN 0-13-490012-X)
– Chap. 1, 2, 3, 4
• John Shapley Gray, Interprocess Communications in UNIX: The Nooks and Crannies, Prentice Hall PTR, NJ, 1998
– Chap. 10
UNIX Network Programming – TDC368-901
Spring 2003
Page 2
Client Server Communication
 The transport protocols TCP and UDP were designed to
enable communication between network applications
• Internet host can have several servers running.
• usually has only one physical link to the “rest of the world”
• When packets arrive how does the host identify which packets should go to
which server?
• Ports
• ports are used as logical connections between network applications
• 16 bit number (65536 possible ports)
• demultiplexing key
• identify the application/process to receive the packet
• TCP connection
• source IP address and source port number
• destination IP address and destination port number
• the combination IP Address : Port Number pair is called a Socket
Client Server Communication
[Figure: two network hosts, 123.45.67.89 and 122.34.45.67, joined by an IP network. Each host has numbered ports (16-bit, so the highest port is 65535). The figure labels two sockets: 122.34.45.67:65534 and 123.45.67.89:80.]
Client Server Communication
[Figure: an IP host acting as server, with its ports shown, connected through an IP network to other IP hosts. The HTTP server has three active connections (sockets) and also listens for future connections.]
Ports
• Port - A 16-bit number that identifies the application process that receives an incoming message.
• Port numbers are divided into three categories:
– Well Known Ports: 0-1023 (https (443) falls in this range)
– Registered Ports: 1024-49151, registered with the IANA (Internet Assigned Numbers Authority); these are second tier common ports such as socks (1080), WINS (1512), kermit (1649)
– Dynamic/Private Ports: 49152-65535, ephemeral ports, available for temporary client usage
 Reserved ports or well-known ports (0 to 1023)
– Standard ports for well-known applications.
– See /etc/services file on any UNIX machine for listing of
services on reserved ports.
• 1: TCP Port Service Multiplexer
• 20: File Transfer Protocol (FTP) Data
• 21: FTP Control
• 23: Telnet
• 25: Simple Mail Transfer Protocol (SMTP)
• 43: Who Is
• 69: Trivial File Transfer Protocol (TFTP)
• 80: HTTP
Associations
 A socket address is the triplet:
{protocol, local-IP, local-port}
– example,
{tcp, 130.245.1.44, 23}
 An association is the 5-tuple that completely specifies
the two end-points that comprise a connection:
{protocol, local-IP, local-port, remote-IP, remote-port}
– example:
{tcp, 130.245.1.44, 23, 130.245.1.45, 1024}
Socket Domain Families
 There are several significant socket domain
families:
– Internet Domain Sockets (AF_INET)
» implemented via IP addresses and port numbers
– Unix Domain Sockets (AF_UNIX)
» implemented via filenames (similar to IPC “named pipe”)
Creating a Socket
#include <sys/types.h>
#include <sys/socket.h>
int socket(int domain, int type, int protocol);
 domain is one of the Protocol Families (AF_INET,
AF_UNIX, etc.)
 type defines the communication protocol
semantics, usually defines either:
– SOCK_STREAM: connection-oriented stream (TCP)
– SOCK_DGRAM: connectionless, unreliable (UDP)
 protocol specifies a particular protocol, just set
this to 0 to accept the default
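A minimal sketch of the call above, assuming a POSIX system. The helper name open_inet_socket is hypothetical (it is not part of the sockets API); it only wraps socket() with the error check that every caller needs.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: create an Internet-domain socket of the given type
 * (SOCK_STREAM for TCP, SOCK_DGRAM for UDP).
 * Returns the new descriptor, or -1 after printing the error. */
int open_inet_socket(int type)
{
    int fd = socket(AF_INET, type, 0);   /* protocol 0 selects the default */
    if (fd < 0)
        perror("socket");
    return fd;
}
```

For example, open_inet_socket(SOCK_STREAM) yields a descriptor suitable for bind() or connect(), and open_inet_socket(SOCK_DGRAM) yields a UDP socket.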
The Socket Structure
• INET Address
struct in_addr {
    in_addr_t s_addr; /* 32-bit IPv4 address */
};
• INET Socket
struct sockaddr_in {
    uint8_t sin_len; /* length of structure (16); BSD-derived systems only */
    sa_family_t sin_family; /* AF_INET */
    in_port_t sin_port; /* 16-bit TCP/UDP port number */
    struct in_addr sin_addr; /* 32-bit IPv4 address */
    char sin_zero[8]; /* unused */
};
Setup for an Internet Domain Socket
struct sockaddr_in {
sa_family_t sin_family;
unsigned short int sin_port;
struct in_addr sin_addr;
unsigned char pad[...];
};
• sin_family is set to the address family AF_INET
• sin_port is set to the port number you want to bind to (in network byte order)
• sin_addr is set to the IP address of the machine you are binding to (struct in_addr is a wrapper struct for a 32-bit unsigned integer, in network byte order)
• ignore padding
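The setup described above can be sketched as follows. The helper name make_inet_addr is hypothetical, and the sketch assumes a POSIX system that provides inet_pton(); it is one way to fill the structure, not the only one.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: fill *addr for the dotted-decimal address ip and the
 * host-order port. Returns 0 on success, -1 if ip is not valid IPv4 text. */
int make_inet_addr(struct sockaddr_in *addr, const char *ip, unsigned short port)
{
    memset(addr, 0, sizeof(*addr));        /* also clears the padding */
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);          /* port in network byte order */
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1)
        return -1;
    return 0;
}
```

Zeroing the whole structure first is a common habit because it leaves the padding in a known state.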
Stream Socket Transaction (TCP Connection)
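Server: socket() → bind() → listen() → accept() → read() → write() → read() → close()
Client: socket() → connect() → write() → read() → close()
• The client's connect() and the server's accept() meet in the 3-way handshake.
• The client's write() carries data to the server's read(); the server's write() carries data back to the client's read().
• The client's close() makes the server's final read() return EOF, after which the server calls close().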
Connection-oriented socket connections: Client-Server view
SERVER
1. Create socket
2. bind a port to the socket
3. listen for incoming connections
4. accept an incoming connection
5. loop: read from the connection, write to the connection
CLIENT
1. Create socket
2. connect to server's port
3. loop: write to the connection, read from the connection
Finally: close connection
Server Side Socket Details
SERVER
Create socket
int socket(int domain, int type, int protocol);
sockfd = socket(AF_INET, SOCK_STREAM, 0);
bind a port to the socket
int bind(int sockfd, const struct sockaddr *server_addr, socklen_t length);
bind(sockfd, (struct sockaddr *) &server, sizeof(server));
listen for incoming connections
int listen(int sockfd, int num_queued_requests);
listen(sockfd, 5);
accept an incoming connection
int accept(int sockfd, struct sockaddr *incoming_address, socklen_t *length);
length = sizeof(client);
newfd = accept(sockfd, (struct sockaddr *) &client, &length); /* BLOCKS */
read from the connection
ssize_t read(int sockfd, void *buffer, size_t buffer_size);
read(newfd, buffer, sizeof(buffer));
write to the connection
ssize_t write(int sockfd, const void *buffer, size_t buffer_size);
write(newfd, buffer, sizeof(buffer));
Client Side Socket Details
CLIENT
Create socket
int socket(int domain, int type, int protocol);
sockfd = socket(AF_INET, SOCK_STREAM, 0);
connect to Server socket
int connect(int sockfd, const struct sockaddr *server_address, socklen_t length);
connect(sockfd, (struct sockaddr *) &server, sizeof(server)); /* Blocking */
write to the connection
ssize_t write(int sockfd, const void *buffer, size_t buffer_size);
write(sockfd, buffer, sizeof(buffer));
read from the connection
ssize_t read(int sockfd, void *buffer, size_t buffer_size);
read(sockfd, buffer, sizeof(buffer));
Reading From and Writing To Stream Sockets
• Sockets are an Inter-Process Communication (IPC) mechanism, similar to files:
– low level IO:
» read() system call
» write() system call
– higher level IO:
» ssize_t recv(int socket, void *buf, size_t len, int flags);
• blocks until data is available
• returns 0 when the other end has closed the connection
» ssize_t send(int socket, const void *buf, size_t len, int flags);
• returns the number of bytes actually sent
» where flags is 0 or a bitwise OR of:
• MSG_DONTROUTE (don’t route out of localnet)
• MSG_OOB (out of band data (causes interruption))
• MSG_PEEK (examine, but don’t remove from stream)
Closing a Socket Session
• int close(int socket);
– closes read/write IO, closes socket file descriptor
• int shutdown(int socketfd, int mode);
– where mode is:
» 0 (SHUT_RD): no more receives allowed “r”
» 1 (SHUT_WR): no more sends are allowed “w”
» 2 (SHUT_RDWR): disables both receives and sends (but doesn’t close the socket, use close() for that) “rw”
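The difference between shutdown() and close() can be seen in a small sketch. It assumes a POSIX system; the function name half_close_demo is hypothetical, and a socketpair() of UNIX-domain stream sockets stands in for a real client/server connection so that no network is needed.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if a half-close behaves as described,
 * -1 otherwise. */
int half_close_demo(void)
{
    int sv[2];
    char buf[16];
    int ok = -1;

    /* sv[0] and sv[1] are the two ends of one connected stream */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (write(sv[0], "ping", 4) == 4 &&
        shutdown(sv[0], SHUT_WR) == 0 &&        /* mode 1: no more sends   */
        read(sv[1], buf, sizeof(buf)) == 4 &&   /* peer gets the data...   */
        read(sv[1], buf, sizeof(buf)) == 0 &&   /* ...then sees EOF        */
        write(sv[1], "pong", 4) == 4 &&         /* reverse path still open */
        read(sv[0], buf, sizeof(buf)) == 4 &&
        memcmp(buf, "pong", 4) == 0)
        ok = 0;

    close(sv[0]);   /* close() releases the descriptors themselves */
    close(sv[1]);
    return ok;
}
```

After shutdown(sv[0], SHUT_WR) the descriptor is still valid for reading, which is why the reply can still be received on it.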
Byte Ordering
 Different computer architectures use different byte
ordering to represent/store multi-byte values (such as
16-bit/32-bit integers)
 16 bit integer:
Address     Little-Endian (Intel)    Big-Endian (RISC-Sparc)
A           Low Byte                 High Byte
A+1         High Byte                Low Byte
Byte Order and Networking
 Suppose a Big Endian machine sends a 16 bit integer with
the value 2:
0000000000000010
 A Little Endian machine will understand the number as 512:
0000001000000000
 How do two machines with different byte-orders
communicate?
– Using network byte-order
– Network byte-order = big-endian order
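The misreading above is simply a swap of the two bytes. A minimal sketch (the function name swap16 is illustrative, not a library call):

```c
#include <stdint.h>

/* Exchange the two bytes of a 16-bit value: this is the value a
 * little-endian host sees if it reads a big-endian 16-bit integer
 * without converting it. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

With this definition swap16(2) is 512, matching the two bit patterns shown above.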
Network Byte Order
 Conversion of application-level data is left up to the
presentation layer.
 Lower level layers communicate using a fixed byte order
called network byte order for all control data.
 TCP/IP mandates that big-endian byte ordering be used
for transmitting protocol information
 All values stored in a sockaddr_in must be in network
byte order.
– sin_port: a TCP/IP port number.
– sin_addr: an IP address.
Network Byte Order Functions
 Several functions are provided to allow conversion
between host and network byte ordering,
 Conversion macros (<netinet/in.h>)
– to translate 32-bit numbers (e.g. IP addresses):
» uint32_t htonl(uint32_t hostlong);
» uint32_t ntohl(uint32_t netlong);
– to translate 16-bit numbers (e.g. port numbers):
» uint16_t htons(uint16_t hostshort);
» uint16_t ntohs(uint16_t netshort);
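A short sketch of what these conversions do to the bytes in memory. The helper name port_wire_bytes is hypothetical; it only exposes the result of htons() byte by byte.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Copy the in-memory bytes of htons(port) into out[0] and out[1].
 * On any host, out[0] is the high byte and out[1] the low byte,
 * because network byte order is big-endian. */
void port_wire_bytes(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);
    memcpy(out, &net, sizeof(net));
}
```

For port 80 the bytes are 0 then 80, whatever the host's own byte order; ntohs() reverses the conversion.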
TCP Sockets Programming Summary
 Creating a passive mode (server) socket.
 Establishing an application-level connection.
 send/receive data.
 Terminating a connection.
Creating a TCP socket
int socket(int family,int type,int proto);
int mysockfd;
mysockfd = socket( AF_INET,
SOCK_STREAM,
0);
if (mysockfd<0) { /* ERROR */ }
Binding to well known address
int mysockfd;
int err;
struct sockaddr_in myaddr;
mysockfd = socket(AF_INET, SOCK_STREAM, 0);
memset(&myaddr, 0, sizeof(myaddr));
myaddr.sin_family = AF_INET;
myaddr.sin_port = htons( 80 );
myaddr.sin_addr.s_addr = htonl( INADDR_ANY );
err = bind(mysockfd, (struct sockaddr *) &myaddr, sizeof(myaddr));
Bind – What Port Number?
 Clients typically don’t care what port they are
assigned.
 When you call bind you can tell it to assign you
any available port:
myaddr.sin_port = htons(0);
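A sketch of binding to port 0 and then asking which port was assigned. The helper name bind_any_port is hypothetical, and getsockname() is a standard POSIX call that these slides do not otherwise cover; treat this as one possible approach.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: bind a new TCP socket to INADDR_ANY with port 0 and
 * return the port the kernel chose (host byte order), or -1 on error.
 * On success *fd_out holds the bound socket. */
int bind_any_port(int *fd_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);                 /* let the OS pick a port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *fd_out = fd;
    return ntohs(addr.sin_port);
}
```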
Bind - What IP address ?
 How can you find out what your IP address is so
you can tell bind() ?
 There is no realistic way for you to know the right
IP address to give bind() - what if the computer
has multiple network interfaces?
 Specify the IP address as: INADDR_ANY, this
tells the OS to handle the IP address
specification.
Converting Between IP Address formats
• From ASCII to numeric
– “130.245.1.44” → 32-bit network byte ordered value
– inet_aton(…) with IPv4
– inet_pton(…) with IPv4 and IPv6
• From numeric to ASCII
– 32-bit value → “130.245.1.44”
– inet_ntoa(…) with IPv4
– inet_ntop(…) with IPv4 and IPv6
– Note: inet_addr(…) is obsolete
» cannot handle broadcast address “255.255.255.255” (0xFFFFFFFF), because that value is the same as its error return
IPv4 Address Conversion
int inet_aton( char *, struct in_addr *);
Convert ASCII dotted-decimal IP address to network byte order 32 bit
value. Returns 1 on success, 0 on failure.
char *inet_ntoa(struct in_addr);
Convert network byte ordered value to ASCII dotted-decimal (a
string).
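A small round-trip sketch using the protocol-independent pair inet_pton() and inet_ntop(). The function name ipv4_round_trip is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Convert dotted-decimal text to a 32-bit address and back to text.
 * Returns 0 on success, -1 if text is not a valid IPv4 address or
 * out is too small (INET_ADDRSTRLEN bytes is always enough). */
int ipv4_round_trip(const char *text, char *out, socklen_t outlen)
{
    struct in_addr addr;                        /* network byte order */
    if (inet_pton(AF_INET, text, &addr) != 1)
        return -1;
    if (inet_ntop(AF_INET, &addr, out, outlen) == NULL)
        return -1;
    return 0;
}
```

Unlike inet_addr(), this pair handles “255.255.255.255” without ambiguity, because success and failure are reported through the return value rather than through the address itself.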
Establishing a passive mode TCP socket
 Passive mode:
– Address already determined.
– Tell the kernel to accept incoming connection requests directed at
the socket address.
» 3-way handshake
– Tell the kernel to queue incoming connections for us.
listen()
int listen( int mysockfd, int backlog);
mysockfd is the TCP socket (already bound to an
address)
backlog is the number of incoming connections the
kernel should be able to keep track of (queue for
us).
listen() returns -1 on error (otherwise 0).
Accepting an incoming connection
 Once we call listen(), the O.S. will queue
incoming connections
– Handles the 3-way handshake
– Queues up multiple connections.
 When our application is ready to handle a new
connection, we need to ask the O.S. for the next
connection.
accept()
int accept( int mysockfd,
struct sockaddr* cliaddr,
socklen_t *addrlen);
mysockfd is the passive mode TCP socket.
cliaddr is a pointer to allocated space.
addrlen is a value-result argument
– must be set to the size of cliaddr
– on return, will be set to be the number of used bytes in cliaddr.
 accept() return value
– accept() returns a new socket descriptor (a non-negative integer) or -1 on error.
– After accept returns a new socket descriptor, I/O can be done using
the read() and write() system calls.
Terminating a TCP connection
 Either end of the connection can call the close() system
call.
 If the other end has closed the connection, and there is no
buffered data, reading from a TCP socket returns 0 to
indicate EOF.
Client Code
 TCP clients can call connect() which:
– takes care of establishing an endpoint address for the client
socket.
» don’t need to call bind first, the O.S. will take care of assigning the
local endpoint address (TCP port number, IP address).
– Attempts to establish a connection to the specified server.
» 3-way handshake
connect()
int connect( int sockfd,
const struct sockaddr *server,
socklen_t addrlen);
sockfd is an already created TCP socket.
server contains the address of the server (IP Address and TCP port
number)
connect() returns 0 if OK, -1 on error
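The calls covered so far (socket, bind, listen, connect, accept, write, read, close) can be exercised together in one process over the loopback interface. This is only a sketch under stated assumptions: the function name loopback_demo is hypothetical, INADDR_LOOPBACK and getsockname() are standard but not covered in these slides, and a real server and client would be separate programs.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical single-process demo. Stores what the "client" received in
 * reply (null terminated, cap must be at least 1). Returns 0, or -1 on error. */
int loopback_demo(char *reply, size_t cap)
{
    struct sockaddr_in addr, peer;
    socklen_t len = sizeof(addr), peerlen = sizeof(peer);
    int lfd = -1, cfd = -1, sfd = -1, rc = -1;
    size_t used = 0;
    ssize_t n;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                       /* any free port */

    /* Server side: socket, bind, listen, then learn the assigned port */
    if ((lfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) goto done;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;
    if (listen(lfd, 5) < 0) goto done;
    if (getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) goto done;

    /* Client side: socket, connect (no bind needed). The kernel completes
     * the handshake and queues the connection before accept() is called. */
    if ((cfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) goto done;
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;

    /* Server side: take the queued connection; peerlen is value-result */
    if ((sfd = accept(lfd, (struct sockaddr *)&peer, &peerlen)) < 0) goto done;
    if (write(sfd, "hello\r\n", 7) != 7) goto done;
    close(sfd);                                     /* client will see EOF */
    sfd = -1;

    /* Client side: read until EOF (read() returns 0) */
    while (used < cap - 1) {
        n = read(cfd, reply + used, cap - 1 - used);
        if (n < 0) goto done;
        if (n == 0) break;
        used += (size_t)n;
    }
    reply[used] = '\0';
    rc = 0;
done:
    if (sfd >= 0) close(sfd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return rc;
}
```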
Reading from a TCP socket
ssize_t read(int fd, void *buf, size_t max);
 By default read() will block until data is available.
 reading from a TCP socket may return less than max bytes
(whatever is available).
Writing to a TCP socket
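ssize_t write(int fd, const void *buf, size_t num);
• write() might not be able to write all num bytes (on a nonblocking socket); the return value is the number of bytes actually written.
• Other functions (API)
– readn(), writen() and readline(): helper functions from the Stevens text rather than system calls; see the text for their definitions.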
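Because read() and write() may transfer fewer bytes than requested, callers usually loop. The sketch below shows the idea; write_all and read_full are hypothetical names, written in the spirit of the writen() and readn() helpers but not copied from the Stevens text.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly n bytes unless an error occurs. Returns n, or -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t n)
{
    const char *p = (const char *)buf;
    size_t left = n;
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }
    return (ssize_t)n;
}

/* Read until n bytes have arrived or EOF is reached.
 * Returns the number of bytes read (less than n at EOF), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t n)
{
    char *p = (char *)buf;
    size_t left = n;
    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (r == 0)
            break;                      /* EOF: the other end has closed */
        p += r;
        left -= (size_t)r;
    }
    return (ssize_t)(n - left);
}
```

Both helpers work on any descriptor, so they can be tried on a pipe as easily as on a TCP socket.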
Example [ from R. Stevens text]
Client Server communication
[Figure: a client on Machine A communicates across the network with a server on Machine B.]
• Web browser and server
• FTP client and server
• Telnet client and server
Example – Daytime Server/Client
[Figure: protocol stacks of the daytime client and the daytime server. Each side has, from top to bottom: the application, the Socket API, TCP, IP, and the MAC driver (MAC = media access control). The application protocol and the TCP protocol are labeled as end-to-end logical connections; the IP protocol and the MAC-level protocol are labeled as physical connections. The actual data flow goes down the client's stack, across the network, and up the server's stack.]
Daytime client
• Connects to a daytime server
• Retrieves the current date and time
% gettime 130.245.1.44
Thu Sept 05 15:50:00 2002

#include "unp.h"

int main(int argc, char **argv)
{
    int sockfd, n;
    char recvline[MAXLINE + 1];
    struct sockaddr_in servaddr;

    if (argc != 2)
        err_quit("usage : gettime <IP address>");

    /* Create a TCP socket */
    if ( (sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        err_sys("socket error");

    /* Specify server's IP address and port */
    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(13); /* daytime server port */
    if (inet_pton(AF_INET, argv[1], &servaddr.sin_addr) <= 0)
        err_quit("inet_pton error for %s", argv[1]);

    /* Connect to the server */
    if (connect(sockfd, (SA *) &servaddr, sizeof(servaddr)) < 0)
        err_sys("connect error");

    /* Read the date/time from socket */
    while ( (n = read(sockfd, recvline, MAXLINE)) > 0) {
        recvline[n] = 0; /* null terminate */
        printf("%s", recvline);
    }
    if (n < 0)
        err_sys("read error");
    close(sockfd);
    exit(0);
}
Simplifying error-handling [ R. Stevens ]
int Socket(int family, int type, int protocol)
{
int n;
if ( (n = socket(family, type, protocol)) < 0)
err_sys("socket error");
return n;
}
Daytime Server
• Waits for requests from Client
• Accepts client connections
• Send the current time
• Terminates connection and goes back waiting for more connections.

#include "unp.h"
#include <time.h>

int main(int argc, char **argv)
{
    int listenfd, connfd;
    struct sockaddr_in servaddr;
    char buff[MAXLINE];
    time_t ticks;

    /* Create a TCP socket */
    listenfd = Socket(AF_INET, SOCK_STREAM, 0);

    /* Initialize server's address and well-known port */
    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(13); /* daytime server */

    /* Bind server's address and port to the socket */
    Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));
Daytime Server
/* Convert socket to a listening socket */
Listen(listenfd, LISTENQ);
for ( ; ; ) {
/* Wait for client connections and accept them */
connfd = Accept(listenfd, (SA *) NULL, NULL);
/* Retrieve system time */
ticks = time(NULL);
snprintf(buff, sizeof(buff), "%.24s\r\n", ctime(&ticks));
/* Write to socket */
Write(connfd, buff, strlen(buff));
/* Close the connection */
Close(connfd);
}
}
Server Design
• Iterative, Connectionless
• Iterative, Connection-Oriented
• Concurrent, Connectionless
• Concurrent, Connection-Oriented
Concurrent vs. Iterative
Concurrent
•Large or variable size requests
•Harder to program
•Typically uses more system resources
Iterative
•Small, fixed size requests
•Easy to program
Connectionless vs. Connection-Oriented
Connection-Oriented
•Easy to program
•Transport protocol handles the tough stuff.
•Requires separate socket for each connection.
Connectionless
•Less overhead
•No limitation on number of clients
Statelessness
• State: Information that a server maintains about the status of ongoing client interactions.
• Issues with Statefulness
– Clients can go down at any time.
– Client hosts can reboot many times.
– The network can lose messages.
– The network can duplicate messages.
• Example
– Connectionless servers that keep state information must be designed carefully
» Messages can be duplicated
Concurrent Server Design Alternatives
 One child per client
 Single Process Concurrency
 Pre-forking multiple processes
 Spawn one thread per client
 Pre-threaded Server
One child per client
 Traditional Unix server:
– TCP: after call to accept(), call fork().
– UDP: after recvfrom(), call fork().
– Each process needs only a few sockets.
– Small requests can be serviced in a small amount of time.
 Parent process needs to clean up after children (call wait() ).
Discussion Stevens Example: Concurrency using fork() 1/3
/* Code fragment that uses fork() and signal()
to implement concurrency */
/* include and define statements section */
void signal_handler(int sig) {
    int status;
    wait(&status); /* reaps the exited child, so that it moves from the
                      ZOMBIE state to NORMAL TERMINATION (END) and its
                      process table entry is released */
    signal(SIGCHLD, signal_handler); /* reinstalls the signal handler */
}
Discussion Stevens Example: Concurrency using fork() 2/3
int main(int argc, char *argv[])
{
    /* Variable declaration section */
    /* The calls socket(), bind(), and listen(): see previous slides */
    signal(SIGCHLD, signal_handler);
    while (1) { /* infinite accept() loop */
        newfd = accept(sockfd, (struct sockaddr *)&theiraddr, &sinsize);
        if (newfd < 0) {
            /* error in accept() */
            if (errno == EINTR) /* interrupted by SIGCHLD: try again */
                continue;
            else {
                perror("accept");
                exit(-1);
            }
        }
Discussion Stevens Example: Concurrency using fork() 3/3
        /* successfully accepted a new client connection newfd >=0 */
        switch (fork()) {
        case -1: /* fork() error */
            perror("fork");
            exit(-1);
        case 0: /* child handles request */
            close(sockfd);
            /* read msg and form a response */
            /* send response back to the client */
            close(newfd);
            exit(0); /* when the child exits, the kernel sends SIGCHLD to the parent */
        default: /* parent returns to wait for another request */
            close(newfd);
            break;
        } /* end switch */
    } /* end while(1) */
}
Appendix
TCP/IP Protocol Suite - Terms and Concepts
TCP/IP Summary
• IP: network layer protocol
– unreliable datagram delivery between hosts.
• UDP: transport layer protocol, provides fast / unreliable datagram service. Pros: less overhead; fast and efficient
– minimal datagram delivery service between processes.
– unreliable: there is no acknowledgement of receipt, so the sender has no way to know that a lost packet should be resent
– no built-in order of delivery; datagrams may arrive in any order
– connectionless; each packet is delivered independently, with no connection set up beforehand
– checksum to detect corruption of packet data
• TCP: transport layer protocol. Cons: lots of overhead
– connection-oriented, full-duplex, reliable, byte-stream delivery service between processes.
– guaranteed delivery of packets in order of transmission by offering acknowledgement and retransmission
– sequenced delivery to the application layer, by adding a sequence number to every packet.
– checksum to detect corruption of packet data
End-to-End (Transport) Protocols
• Underlying best-effort network
– drops messages
– re-orders messages
– delivers duplicate copies of a given message
– limits messages to some finite size
– delivers messages after an arbitrarily long delay
• Common end-to-end services
– guarantee message delivery
– deliver messages in the same order they are sent
– deliver at most one copy of each message
– support arbitrarily large messages
– support synchronization
– allow the receiver to apply flow control to the sender
– support multiple application processes on each host
UDP
[Figure: UDP demultiplexing. Packets arrive at UDP and are demultiplexed into per-port queues; each port's queue is read by an application process.]
UDP
• Simple Demultiplexor
• Unreliable and unordered datagram service
• Adds multiplexing
• No flow control
• Endpoints identified by ports
– servers have well-known ports
– see /etc/services on Unix
• Optional checksum
– pseudo header + udp header + data
• UDP Packet Format (each header row is 32 bits wide)
Header, bits 0-15 | Header, bits 16-31
Src Port Address | Dst Port Address
Length (UDP header + DATA) | Checksum
DATA
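The checksum arithmetic itself is a 16-bit ones' complement sum. The sketch below shows only that arithmetic over a caller-supplied buffer; the function name internet_checksum is hypothetical, and assembling the pseudo header + UDP header + data into the buffer is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones' complement checksum over len bytes, treating the buffer
 * as a sequence of big-endian 16-bit words (an odd trailing byte is
 * padded with a zero byte). */
uint16_t internet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)
        sum += (uint32_t)(data[i] << 8);

    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can verify a packet by running the same sum over the data with the checksum included: the result is 0 when nothing was corrupted.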
TCP
 Reliable Byte-Stream
 Connection-oriented
 Byte-stream
– sending process writes some number of bytes
– TCP breaks into segments and sends via IP
– receiving process reads some number of bytes
[Figure: the sending application process writes bytes into TCP's send buffer; TCP transmits them as segments; the receiving TCP places them in its receive buffer, from which the receiving application process reads bytes.]
 Full duplex
 Flow control: keep sender from overrunning receiver
 Congestion control: keep sender from overrunning network
TCP
 Connection-oriented protocol
• logical connection created between two communicating processes
• connection is managed at TCP protocol layer
• provides reliable and sequential delivery of data
• receiver acknowledges to the sender that data has arrived safely
• sender resends data that has not been acknowledged
• packets contain sequence numbers so they may be ordered
 Bi-directional byte stream
• both sender and receiver write and read bytes
• acknowledgements identify received bytes
• buffers hold data until it is sent
• multiple bytes are packaged into a segment when sent
TCP End-to-End Issues
Based on sliding window protocol used at data link
level, but the situation is very different.
 Potentially connects many different hosts
– need explicit connection establishment and termination
 Potentially different RTT (Round Trip Time)
– need adaptive timeout mechanism
 Potentially long delay in network
– need to be prepared for arrival of very old packets
 Potentially different capacity at destination
– need to accommodate different amounts of buffering
 Potentially different network capacity
– need to be prepared for network congestion
TCP Segment Format
 Every TCP segment includes a Sequence Number that
refers to the first byte of data included in the segment.
 Every TCP segment includes an Acknowledgement
Number that indicates the byte number of the next data
that is expected to be received.
– All bytes up through this number have already been received.
 Control flags:
– URG: urgent data included.
– ACK: this segment is (among other things) an
acknowledgement.
– RST: error - abort the session.
– SYN: synchronize Sequence Numbers (setup)
– FIN: polite connection termination.
 Window:
– Every ACK includes a Window field that tells the sender how
many bytes it can send before the receiver buffer will be in
overflow
TCP Segment Format
Each header row is 32 bits wide (bits 0-31):
Source Port Number (bits 0-15) | Destination Port Number (bits 16-31)
Sequence Number
Acknowledgement
Hdr Len | 0 | Flags | Window
Checksum | Urgent Pointer
Options/Padding
Data
TCP Connection Establishment and Termination
 When a client requests a connection it sends a
“SYN” segment (a special TCP segment) to the
server port.
 SYN stands for synchronize. The SYN message
includes the client’s SN.
 SN is Sequence Number.
TCP Connection Creation
Client: Active Participant. Server: Passive Participant.
1. Client → Server: SYN, SN=X
2. Server → Client: SYN, SN=Y, ACK=X+1
3. Client → Server: ACK=Y+1
TCP 3-Way Handshake
• Step 1: A client starts by sending a SYN segment with the following information:
– Client’s SN (generated pseudo-randomly) = X
– Maximum Receive Window for client.
– Only TCP headers
• Step 2: When a waiting server sees a new connection request, the server sends back a SYN segment with:
– Server’s SN (generated pseudo-randomly) = Y
– Acknowledgement Number is Client SN+1 = X+1
– Maximum Receive Window for server.
– Only TCP headers
• Step 3: When the Server’s SYN is received, the client sends back an ACK with:
– Acknowledgement Number is Server’s SN+1 = Y+1
• Why 3-way?
TCP Data and ACK
 Once the connection is established, data can be
sent.
 Each data segment includes a sequence number
identifying the first byte in the segment.
 Each segment (data or empty) includes an
acknowledgement number indicating what data
has been received.
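The relationship between sequence and acknowledgement numbers is plain modular arithmetic. A sketch, with the hypothetical function name next_ack:

```c
#include <stdint.h>

/* Acknowledgement number a receiver sends after getting a segment that
 * starts at sequence number seq and carries len data bytes. SYN and FIN
 * each consume one sequence number, which is why the handshake
 * acknowledges X+1 and Y+1. Unsigned arithmetic wraps modulo 2^32,
 * as TCP sequence numbers do. */
uint32_t next_ack(uint32_t seq, uint32_t len, int syn, int fin)
{
    return seq + len + (syn ? 1u : 0u) + (fin ? 1u : 0u);
}
```

For example, a SYN with SN=1000 and no data is acknowledged with 1001, and a following 500-byte segment starting at 1001 is acknowledged with 1501.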
TCP Buffering
 The TCP layer doesn’t know when the application will ask
for any received data.
 TCP buffers incoming data so it’s ready when we ask for it.
 Client and server allocate buffers to hold incoming and
outgoing data
– The TCP layer does this.
 Client and server announce with every ACK how much
buffer space remains (the Window field in a TCP segment).
 Most TCP implementations will accept out-of-order
segments (if there is room in the buffer).
 Once the missing segments arrive, a single ACK can be
sent for the whole thing.
TCP Buffering
 Send Buffers
– The application gives the TCP layer some data to send.
– The data is put in a send buffer, where it stays until the data is
ACK’d.
– The TCP layer won’t accept data from the application unless (or
until) there is buffer space.
 ACK
– A receiver doesn’t have to ACK every segment (it can ACK many
segments with a single ACK segment).
– Each ACK can also contain outgoing data (piggybacking).
– If a sender doesn’t get an ACK after some time limit it resends the
data.
Termination
• The TCP layer can send a RST segment that terminates a connection if something is wrong.
• Usually the application tells TCP to terminate the connection gracefully with a FIN segment.
• FIN
– Either end of the connection can initiate termination.
– A FIN is sent, which means the application is done sending data.
– The FIN is ACK’d.
– The other end must now send a FIN.
– That FIN must be ACK’d.
TCP Connection Termination
1. App1 → App2: FIN, SN=X
2. App2 → App1: ACK=X+1
...
3. App2 → App1: FIN, SN=Y
4. App1 → App2: ACK=Y+1
Stream Sockets
• Connection-based, i.e., socket addresses are established before messages are sent between client and server (C/S)
• Address Domain: AF_UNIX (UNIX pathname) or AF_INET (host+port)
• Virtual circuit, i.e., data is transmitted sequentially in a reliable and non-duplicated manner
• Default Protocol Interface is TCP
• Checks order, sequence, duplicates
• No boundaries are imposed on data (it's a stream of bytes)
• Slower than UDP
• Requires more program overhead
Datagram Sockets
• Connectionless sockets, i.e., C/S addresses are passed along with each message sent from one process to another
• Peer-to-Peer Communication
• Provides an interface to the UDP datagram services
• Handles network transmission as independent packets
• Provides no guarantees, although it does include a checksum
• Does not detect duplicates
• Does not determine sequence
– i.e., information can be lost, delivered in the wrong order, or duplicated
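A sketch of a datagram exchange over the loopback interface, to contrast with the stream calls: there is no listen(), connect() or accept(), and the destination address travels with each sendto(). The function name udp_loopback_demo is hypothetical, and sendto(), recvfrom() and getsockname() are standard POSIX calls that these slides mention only in passing or not at all.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical demo: send msg as one datagram to a socket in the same
 * process. Returns the number of bytes received into out, or -1 on error. */
ssize_t udp_loopback_demo(const char *msg, char *out, size_t cap)
{
    struct sockaddr_in addr, from;
    socklen_t len = sizeof(addr), fromlen = sizeof(from);
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);   /* receiver */
    int tx = socket(AF_INET, SOCK_DGRAM, 0);   /* sender   */
    if (rx < 0 || tx < 0)
        goto done;

    /* Receiver binds to loopback, any free port, then learns the port */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;
    if (getsockname(rx, (struct sockaddr *)&addr, &len) < 0)
        goto done;

    /* The destination address accompanies the datagram itself */
    if (sendto(tx, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;

    /* recvfrom() also reports who sent the datagram (in from) */
    n = recvfrom(rx, out, cap, 0, (struct sockaddr *)&from, &fromlen);
done:
    if (rx >= 0) close(rx);
    if (tx >= 0) close(tx);
    return n;
}
```

On a real network the datagram could be lost, in which case recvfrom() would block; loopback delivery is used here only to keep the sketch self-contained.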