TDC368 UNIX and Network Programming
Week 8: Inter-Process Communication on Networks
Socket Application Programming Interface (API)

Examples:
http://condor.depaul.edu/~czlatea/TDC368/Code/tcp_socket/
http://condor.depaul.edu/~czlatea/TDC368/Code/daytime/

Camelia Zlatea, PhD
Email: [email protected]

References
• Douglas Comer, David Stevens, Internetworking with TCP/IP: Client-Server Programming, Volume III (BSD Unix and ANSI C), 2nd edition, 1996 (ISBN 0-13-260969-X) – Chap. 3, 4, 5
• W. Richard Stevens, UNIX Network Programming: Networking APIs: Sockets and XTI, Volume 1, 2nd edition, 1998 (ISBN 0-13-490012-X) – Chap. 1, 2, 3, 4
• John Shapley Gray, Interprocess Communications in UNIX: The Nooks and Crannies, Prentice Hall PTR, NJ, 1998 – Chap. 10

UNIX Network Programming – TDC368-901 Spring 2003 Page 2

Client Server Communication

The transport protocols TCP and UDP were designed to enable communication between network applications.
• An Internet host can have several servers running.
• It usually has only one physical link to the “rest of the world”.
• When packets arrive, how does the host identify which packets should go to which server?
• Ports
  – ports are used as logical connections between network applications
  – 16-bit number (65536 possible ports)
  – demultiplexing key
  – identify the application/process to receive the packet
• TCP connection
  – source IP address and source port number
  – destination IP address and destination port number
  – the combination IP Address : Port Number pair is called a Socket

Client Server Communication

Figure: two network hosts, 123.45.67.89 and 122.34.45.67, each with its own column of ports, connected by an IP network. The sockets shown are 122.34.45.67:65534 and 123.45.67.89:80.

Client Server Communication

Figure: HTTP server with three active connections (sockets).
The HTTP server listens for future connections.

Ports

Port: a 16-bit number that identifies the application process that receives an incoming message.

Port numbers are divided into three categories:
• Well-Known Ports, 0-1023
• Registered Ports, 1024-49151: registered with the IANA (Internet Assigned Numbers Authority); they represent second-tier common ports such as socks (1080), WINS (1512), kermit (1649)
• Dynamic/Private Ports, 49152-65535: ephemeral ports, available for temporary client usage

Reserved ports or well-known ports (0 to 1023)
– Standard ports for well-known applications.
– See the /etc/services file on any UNIX machine for a listing of services on reserved ports.

    Port   Service
    1      TCP Port Service Multiplexer
    20     File Transfer Protocol (FTP) Data
    21     FTP Control
    23     Telnet
    25     Simple Mail Transfer Protocol (SMTP)
    43     Who Is
    69     Trivial File Transfer Protocol (TFTP)
    80     HTTP
    443    HTTPS

Associations

A socket address is the triplet: {protocol, local-IP, local-port}
– example: {tcp, 130.245.1.44, 23}
An association is the 5-tuple that completely specifies the two end-points that comprise a connection: {protocol, local-IP, local-port, remote-IP, remote-port}
– example: {tcp, 130.245.1.44, 23, 130.245.1.45, 1024}

Socket Domain Families

There are several significant socket domain families:
– Internet Domain Sockets (AF_INET)
  » implemented via IP addresses and port numbers
– Unix Domain Sockets (AF_UNIX)
  » implemented via filenames (similar to IPC “named pipe”)

Creating a Socket

    #include <sys/types.h>
    #include <sys/socket.h>

    int socket(int domain, int type, int protocol);

domain is one of the address families (AF_INET, AF_UNIX, etc.)
type defines the communication semantics, usually one of:
– SOCK_STREAM: connection-oriented, reliable byte stream (TCP)
– SOCK_DGRAM: connectionless, unreliable datagrams (UDP)
protocol specifies a particular protocol; set this to 0 to accept the default

The Socket Structure

INET Address

    struct in_addr {
        in_addr_t s_addr;           /* 32-bit IPv4 address */
    };

INET Socket

    struct sockaddr_in {
        uint8_t        sin_len;     /* length of structure (16) */
        sa_family_t    sin_family;  /* AF_INET */
        in_port_t      sin_port;    /* 16-bit TCP/UDP port number */
        struct in_addr sin_addr;    /* 32-bit IPv4 address */
        char           sin_zero[8]; /* unused */
    };

Setup for an Internet Domain Socket

    struct sockaddr_in {
        sa_family_t        sin_family;
        unsigned short int sin_port;
        struct in_addr     sin_addr;
        unsigned char      pad[...];
    };

sin_family is set to the address family AF_INET
sin_port is set to the port number you want to bind to
sin_addr is set to the IP address of the machine you are binding to (struct in_addr is a wrapper struct for a 32-bit unsigned integer)
ignore padding

Stream Socket Transaction (TCP Connection)

    Server                              Client
    socket()
    bind()                              socket()
    listen()
    accept()   <-- 3-way handshake ---  connect()
    read()     <------- data ---------  write()
    write()    -------- data -------->  read()
    read()     <------- EOF ----------  close()
    close()

• Connection-oriented socket connections
• Client-Server view

    SERVER                              CLIENT
    Create socket
    bind a port to the socket
    listen for incoming connections     Create socket
    accept an incoming connection       connect to server's port
    loop:                               loop:
      read from the connection            write to the connection
      write to the connection             read from the connection
                                        close connection

Server Side Socket Details

Create socket
    int socket(int domain, int type, int protocol)
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
bind a port to the socket
    int
    bind(int sockfd, const struct sockaddr *server_addr, socklen_t length)
    bind(sockfd, (struct sockaddr *) &server, sizeof(server));
listen for incoming connections
    int listen(int sockfd, int num_queued_requests)
    listen(sockfd, 5);
accept an incoming connection
    int accept(int sockfd, struct sockaddr *incoming_address, socklen_t *length)
    len = sizeof(client);
    newfd = accept(sockfd, (struct sockaddr *) &client, &len);   /* BLOCKS */
read from the connection
    ssize_t read(int sockfd, void *buffer, size_t buffer_size)
    read(newfd, buffer, sizeof(buffer));
write to the connection
    ssize_t write(int sockfd, const void *buffer, size_t buffer_size)
    write(newfd, buffer, sizeof(buffer));

Client Side Socket Details

Create socket
    int socket(int domain, int type, int protocol)
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
connect to Server socket
    int connect(int sockfd, const struct sockaddr *server_address, socklen_t length)
    connect(sockfd, (struct sockaddr *) &server, sizeof(server));   /* Blocking */
write to the connection
    ssize_t write(int sockfd, const void *buffer, size_t buffer_size)
    write(sockfd, buffer, sizeof(buffer));
read from the connection
    ssize_t read(int sockfd, void *buffer, size_t buffer_size)
    read(sockfd, buffer, sizeof(buffer));

Reading From and Writing To Stream Sockets

Sockets are an Inter-Process Communication (IPC) mechanism, similar to files:
– low level IO:
  » read() system call
  » write() system call
– higher level IO:
  » ssize_t recv(int socket, void *buf, size_t len, int flags);
    • blocks on read
    • returns 0 when the other end has closed the connection
  » ssize_t send(int socket, const void *buf, size_t len, int flags);
    • returns the number of bytes actually sent
  » where flags may be 0 or a bitwise OR of:
    • MSG_DONTROUTE (don’t route out of localnet)
    • MSG_OOB (out of band data (causes interruption))
    • MSG_PEEK (examine, but don’t remove from stream)

Closing a Socket Session

int close(int socket);
– closes read/write IO, closes socket
file descriptor
int shutdown(int socketfd, int mode);
– where mode is:
  » 0 (SHUT_RD): no more receives allowed “r”
  » 1 (SHUT_WR): no more sends are allowed “w”
  » 2 (SHUT_RDWR): disables both receives and sends (but doesn’t close the socket, use close() for that) “rw”

Byte Ordering

Different computer architectures use different byte ordering to represent/store multi-byte values (such as 16-bit/32-bit integers).

16-bit integer:

                   Little-Endian (Intel)    Big-Endian (RISC, Sparc)
    Address A      Low Byte                 High Byte
    Address A+1    High Byte                Low Byte

Byte Order and Networking

Suppose a Big Endian machine sends a 16-bit integer with the value 2:
    0000000000000010
A Little Endian machine that reads the two bytes without converting them will understand the number as 512:
    0000001000000000
How do two machines with different byte-orders communicate?
– Using network byte-order
– Network byte-order = big-endian order

Network Byte Order

Conversion of application-level data is left up to the presentation layer.
Lower level layers communicate using a fixed byte order called network byte order for all control data.
TCP/IP mandates that big-endian byte ordering be used for transmitting protocol information.
All values stored in a sockaddr_in must be in network byte order.
– sin_port: a TCP/IP port number.
– sin_addr: an IP address.

Network Byte Order Functions

Several functions are provided to allow conversion between host and network byte ordering.
Conversion macros (<netinet/in.h>)
– to translate 32-bit numbers (e.g. IP addresses):
    uint32_t htonl(uint32_t hostlong);
    uint32_t ntohl(uint32_t netlong);
– to translate 16-bit numbers (e.g.
port numbers):
    uint16_t htons(uint16_t hostshort);
    uint16_t ntohs(uint16_t netshort);

TCP Sockets Programming Summary

– Creating a passive mode (server) socket.
– Establishing an application-level connection.
– Sending/receiving data.
– Terminating a connection.

Creating a TCP socket

    int socket(int family, int type, int proto);

    int mysockfd;
    mysockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (mysockfd < 0) {
        /* ERROR */
    }

Binding to well known address

    int mysockfd;
    int err;
    struct sockaddr_in myaddr;

    mysockfd = socket(AF_INET, SOCK_STREAM, 0);
    myaddr.sin_family = AF_INET;
    myaddr.sin_port = htons(80);
    myaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    err = bind(mysockfd, (struct sockaddr *) &myaddr, sizeof(myaddr));

Bind – What Port Number?

Clients typically don’t care what port they are assigned.
When you call bind you can tell it to assign you any available port:

    myaddr.sin_port = htons(0);

Bind – What IP address?

How can you find out what your IP address is so you can tell bind()?
There is no realistic way for you to know the right IP address to give bind(): what if the computer has multiple network interfaces?
Specify the IP address as INADDR_ANY; this tells the OS to handle the IP address specification.
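
The two bind slides above (any port, any address) can be sketched together as follows. This is a minimal sketch, not code from the course: the function name bind_any_port is hypothetical, and the call to getsockname() is an addition that the slides do not mention, used here only to read back the port the kernel chose.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from the slides): binds a new TCP socket to
 * INADDR_ANY with port 0, so the kernel picks both the interface and a
 * free port. Stores the chosen port, in host byte order, in *port_out.
 * Returns the socket descriptor, or -1 on error. */
int bind_any_port(unsigned short *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);                  /* 0 = any available port */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local interface */

    /* getsockname() reads back the address the kernel actually assigned. */
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *) &addr, &len) < 0) {
        close(fd);
        return -1;
    }

    *port_out = ntohs(addr.sin_port);          /* network -> host order */
    return fd;
}
```

A caller would pass the address of an unsigned short, check the return value against -1, and close() the descriptor when done. Note that the port comes back in network byte order and has to go through ntohs() before it is printed or compared.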
Converting Between IP Address formats

From ASCII to numeric
– “130.245.1.44” → 32-bit network byte ordered value
– inet_aton(…) with IPv4
– inet_pton(…) with IPv4 and IPv6
From numeric to ASCII
– 32-bit value → “130.245.1.44”
– inet_ntoa(…) with IPv4
– inet_ntop(…) with IPv4 and IPv6
Note
– inet_addr(…) is obsolete
  » it cannot handle the broadcast address “255.255.255.255” (0xFFFFFFFF), because that value is the same as its error return, INADDR_NONE

IPv4 Address Conversion

int inet_aton(const char *, struct in_addr *);
Converts an ASCII dotted-decimal IP address to a network byte order 32-bit value.
Returns nonzero on success, 0 on failure.

char *inet_ntoa(struct in_addr);
Converts a network byte ordered value to ASCII dotted-decimal (a string).

Establishing a passive mode TCP socket

Passive mode:
– Address already determined.
– Tell the kernel to accept incoming connection requests directed at the socket address.
  » 3-way handshake
– Tell the kernel to queue incoming connections for us.

listen()

int listen(int mysockfd, int backlog);
mysockfd is the TCP socket (already bound to an address).
backlog is the number of incoming connections the kernel should be able to keep track of (queue for us).
listen() returns -1 on error (otherwise 0).

Accepting an incoming connection

Once we call listen(), the O.S. will queue incoming connections.
– Handles the 3-way handshake
– Queues up multiple connections.
When our application is ready to handle a new connection, we need to ask the O.S. for the next connection.

accept()

int accept(int mysockfd, struct sockaddr *cliaddr, socklen_t *addrlen);
mysockfd is the passive mode TCP socket.
cliaddr is a pointer to allocated space.
addrlen is a value-result argument
– must be set to the size of cliaddr
– on return, will be set to the number of used bytes in cliaddr.
accept() return value
– accept() returns a new socket descriptor (a non-negative integer) or -1 on error.
– After accept() returns a new socket descriptor, I/O can be done using the read() and write() system calls.

Terminating a TCP connection

Either end of the connection can call the close() system call.
If the other end has closed the connection, and there is no buffered data, reading from a TCP socket returns 0 to indicate EOF.

Client Code

TCP clients can call connect(), which:
– takes care of establishing an endpoint address for the client socket.
  » don’t need to call bind first, the O.S. will take care of assigning the local endpoint address (TCP port number, IP address).
– attempts to establish a connection to the specified server.
  » 3-way handshake

connect()

int connect(int sockfd, const struct sockaddr *server, socklen_t addrlen);
sockfd is an already created TCP socket.
server contains the address of the server (IP address and TCP port number).
connect() returns 0 if OK, -1 on error.

Reading from a TCP socket

ssize_t read(int fd, void *buf, size_t max);
By default read() will block until data is available.
Reading from a TCP socket may return fewer than max bytes (whatever is available).

Writing to a TCP socket

ssize_t write(int fd, const void *buf, size_t num);
write() might not be able to write all num bytes (on a nonblocking socket).
Other functions (API)
– readn(), writen() and readline(): helper functions defined in the Stevens text (they are not system calls, so look for their definitions there rather than in the man pages).

Example [from R.
Stevens text]

Client Server communication

Figure: a client on Machine A communicates across the network with a server on Machine B.
• Web browser and server
• FTP client and server
• Telnet client and server

Example – Daytime Server/Client

Figure: protocol layers in the daytime example. The daytime client and the daytime server each sit on a stack of Socket API, TCP, IP and MAC driver (MAC = media access control). Peer layers are linked by the application protocol (end-to-end logical connection), the TCP protocol (end-to-end logical connection), the IP protocol and the MAC-level protocol (both labelled “physical connection” on the slide). The actual data flow runs down one stack, across the network, and up the other.

Daytime client

– Connects to a daytime server
– Retrieves the current date and time

    % gettime 130.245.1.44
    Thu Sept 05 15:50:00 2002

    #include "unp.h"

    int main(int argc, char **argv)
    {
        int sockfd, n;
        char recvline[MAXLINE + 1];
        struct sockaddr_in servaddr;

        if (argc != 2)
            err_quit("usage: gettime <IP address>");

        /* Create a TCP socket */
        if ( (sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
            err_sys("socket error");

        /* Specify server's IP address and port */
        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family = AF_INET;
        servaddr.sin_port = htons(13);   /* daytime server port */
        if (inet_pton(AF_INET, argv[1], &servaddr.sin_addr) <= 0)
            err_quit("inet_pton error for %s", argv[1]);

        /* Connect to the server */
        if (connect(sockfd, (SA *) &servaddr, sizeof(servaddr)) < 0)
            err_sys("connect error");

        /* Read the date/time from socket */
        while ( (n = read(sockfd, recvline, MAXLINE)) > 0) {
            recvline[n] = 0;   /* null terminate */
            printf("%s", recvline);
        }
        if (n < 0)
            err_sys("read error");

        close(sockfd);
        return 0;
    }

Simplifying error-handling [R.
Stevens]

    int Socket(int family, int type, int protocol)
    {
        int n;

        if ( (n = socket(family, type, protocol)) < 0)
            err_sys("socket error");
        return n;
    }

Daytime Server

– Waits for requests from clients
– Accepts client connections
– Sends the current time
– Terminates the connection and goes back to waiting for more connections.

    #include "unp.h"
    #include <time.h>

    int main(int argc, char **argv)
    {
        int listenfd, connfd;
        struct sockaddr_in servaddr;
        char buff[MAXLINE];
        time_t ticks;

        /* Create a TCP socket */
        listenfd = Socket(AF_INET, SOCK_STREAM, 0);

        /* Initialize server's address and well-known port */
        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family = AF_INET;
        servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
        servaddr.sin_port = htons(13);   /* daytime server */

        /* Bind server's address and port to the socket */
        Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));

        /* Convert socket to a listening socket */
        Listen(listenfd, LISTENQ);

        for ( ; ; ) {
            /* Wait for client connections and accept them */
            connfd = Accept(listenfd, (SA *) NULL, NULL);

            /* Retrieve system time */
            ticks = time(NULL);
            snprintf(buff, sizeof(buff), "%.24s\r\n", ctime(&ticks));

            /* Write to socket */
            Write(connfd, buff, strlen(buff));

            /* Close the connection */
            Close(connfd);
        }
    }

Server Design

– Iterative Connectionless
– Iterative Connection-Oriented
– Concurrent Connectionless
– Concurrent Connection-Oriented

Concurrent vs. Iterative

Concurrent
• Large or variable size requests
• Harder to program
• Typically uses more system resources
Iterative
• Small, fixed size requests
• Easy to program

Connectionless vs. Connection-Oriented

Connection-Oriented
• Easy to program
• Transport protocol handles the tough stuff.
• Requires separate socket for each connection.
Connectionless
• Less overhead
• No limitation on number of clients

Statelessness

State: information that a server maintains about the status of ongoing client interactions.
Issues with statefulness
– Clients can go down at any time.
– Client hosts can reboot many times.
– The network can lose messages.
– The network can duplicate messages.
Example
– Connectionless servers that keep state information must be designed carefully
  » Messages can be duplicated

Concurrent Server Design Alternatives

– One child per client
– Single Process Concurrency
– Pre-forking multiple processes
– Spawn one thread per client
– Pre-threaded Server

One child per client

Traditional Unix server:
– TCP: after call to accept(), call fork().
– UDP: after recvfrom(), call fork().
– Each process needs only a few sockets.
– Small requests can be serviced in a small amount of time.
Parent process needs to clean up after children (call wait()).
Discussion Stevens Example: Concurrency using fork() 1/3

    /* Code fragment that uses fork() and signal() to implement concurrency */
    /* include and define statements section */

    void signal_handler(int sig)
    {
        int status;

        wait(&status);   /* reaps the exited child, so that it moves from
                            the ZOMBIE state to NORMAL TERMINATION (END) */
        signal(SIGCHLD, signal_handler);   /* re-installs the signal handler */
    }

Discussion Stevens Example: Concurrency using fork() 2/3

    int main(int argc, char *argv[])
    {
        /* Variable declaration section */
        /* The calls socket(), bind(), and listen(): see previous slides */

        signal(SIGCHLD, signal_handler);

        while (1) {   /* infinite accept() loop */
            newfd = accept(sockfd, (struct sockaddr *) &theiraddr, &sinsize);
            if (newfd < 0) {   /* error in accept() */
                if (errno == EINTR)   /* interrupted by SIGCHLD: retry */
                    continue;
                else {
                    perror("accept");
                    exit(-1);
                }
            }

Discussion Stevens Example: Concurrency using fork() 3/3

            /* successfully accepted a new client connection, newfd >= 0 */
            switch (fork()) {
            case -1:   /* fork() error */
                perror("fork");
                exit(-1);
            case 0:    /* child handles request */
                close(sockfd);
                /* read msg and form a response */
                /* send response back to the client */
                close(newfd);
                exit(0);   /* when the child exits, the kernel sends SIGCHLD to the parent */
            default:   /* parent returns to wait for another request */
                close(newfd);
            }   /* end switch */
        }   /* end while(1) */
    }

Appendix
TCP/IP Protocol Suite – Terms and Concepts

TCP/IP Summary

IP: network layer protocol
– unreliable datagram delivery between hosts.
UDP: transport layer protocol; provides a fast, unreliable datagram service. Pros: less overhead; fast and efficient
– minimal datagram delivery service between processes.
– unreliable: there is no acknowledgement of receipt, so the sender has no way to know that it should resend a lost packet
– no built-in order of delivery; datagrams may arrive in any order
– connectionless; a connection exists only long enough to deliver a single packet
– checksum to detect corruption of packet data
TCP: transport layer protocol. Cons: lots of overhead
– connection-oriented, full-duplex, reliable, byte-stream delivery service between processes.
– guaranteed delivery of packets in order of transmission, by offering acknowledgement and retransmission
– sequenced delivery to the application layer, by adding a sequence number to every packet.
– checksum to detect corruption of packet data

End-to-End (Transport) Protocols

Underlying best-effort network
– drops messages
– re-orders messages
– delivers duplicate copies of a given message
– limits messages to some finite size
– delivers messages after an arbitrarily long delay
Common end-to-end services
– guarantee message delivery
– deliver messages in the same order they are sent
– deliver at most one copy of each message
– support arbitrarily large messages
– support synchronization
– allow the receiver to apply flow control to the sender
– support multiple application processes on each host

UDP

Figure: packets arrive at UDP and are demultiplexed by port into per-port queues, each read by an application process.

UDP Header

Simple demultiplexor
– Unreliable and unordered datagram service
– Adds multiplexing
– No flow control
Endpoints identified by ports
– servers have well-known ports
– see /etc/services on Unix
Optional checksum
– pseudo header + UDP header + data

UDP Packet Format

    0                      16                     31
    Src Port Address       | Dst Port Address
    Length (header + data) | Checksum
    DATA

TCP Reliable Byte-Stream

Connection-oriented
Byte-stream
– sending process writes some number of bytes
– TCP breaks them into segments and sends them via IP
– receiving process reads some number of bytes

Figure: the sending application process writes bytes into the TCP send buffer, TCP transmits them as segments, and the receiving application process reads bytes from the TCP receive buffer.

Full duplex
Flow control: keep sender from overrunning receiver
Congestion control: keep sender from overrunning network

TCP

Connection-oriented protocol
• logical connection created between two communicating processes
• connection is managed at the TCP protocol layer
• provides reliable and sequential delivery of data
• receiver acknowledges to the sender that data has arrived safely
• sender resends data that has not been acknowledged
• packets contain sequence numbers so they may be ordered
Bi-directional byte stream
• both sender and receiver write and read bytes
• acknowledgements identify received bytes
• buffers hold data until it is sent
• multiple bytes are packaged into a segment when sent

TCP End-to-End Issues

Based on the sliding window protocol used at the data link level, but the situation is very different.
Potentially connects many different hosts
– need explicit connection establishment and termination
Potentially different RTT (Round Trip Time)
– need adaptive timeout mechanism
Potentially long delay in network
– need to be prepared for arrival of very old packets
Potentially different capacity at destination
– need to accommodate different amounts of buffering
Potentially different network capacity
– need to be prepared for network congestion

TCP Segment Format

Every TCP segment includes a Sequence Number that refers to the first byte of data included in the segment.
Every TCP segment includes an Acknowledgement Number that indicates the byte number of the next data that is expected to be received.
– All bytes before this number have already been received.
Control flags:
– URG: urgent data included.
– ACK: this segment is (among other things) an acknowledgement.
– RST: error, abort the session.
– SYN: synchronize Sequence Numbers (setup)
– FIN: polite connection termination.
Window:
– Every ACK includes a Window field that tells the sender how many bytes it can send before the receiver’s buffer overflows.

TCP Segment Format

    0                      16                     31
    Source Port Number     | Destination Port Number
    Sequence Number
    Acknowledgement Number
    Hdr Len | 0 | Flags    | Window
    Checksum               | Urgent Pointer
    Options/Padding
    Data

TCP Connection Establishment and Termination

When a client requests a connection it sends a “SYN” segment (a special TCP segment) to the server port.
SYN stands for synchronize. The SYN message includes the client’s SN.
SN is Sequence Number.

TCP Connection Creation

    Client (active participant)       Server (passive participant)
    1.  SYN, SN=X           ------->
    2.                      <-------  SYN, SN=Y, ACK=X+1
    3.  ACK=Y+1             ------->

TCP 3-Way Handshake

1. A client starts by sending a SYN segment with the following information:
– Client’s SN (generated pseudo-randomly) = X
– Maximum Receive Window for client.
– Only TCP headers
2. When a waiting server sees a new connection request, the server sends back a SYN segment with:
– Server’s SN (generated pseudo-randomly) = Y
– Acknowledgement Number is Client SN+1 = X+1
– Maximum Receive Window for server.
– Only TCP headers
3. When the Server’s SYN is received, the client sends back an ACK with:
– Acknowledgement Number is Server’s SN+1 = Y+1
Why 3-way?

TCP Data and ACK

Once the connection is established, data can be sent.
Each data segment includes a sequence number identifying the first byte in the segment.
Each segment (data or empty) includes an acknowledgement number indicating what data has been received.

TCP Buffering

The TCP layer doesn’t know when the application will ask for any received data. TCP buffers incoming data so it’s ready when we ask for it.
Client and server allocate buffers to hold incoming and outgoing data
– The TCP layer does this.
Client and server announce with every ACK how much buffer space remains (the Window field in a TCP segment).
Most TCP implementations will accept out-of-order segments (if there is room in the buffer). Once the missing segments arrive, a single ACK can be sent for the whole thing.

TCP Buffering

Send Buffers
– The application gives the TCP layer some data to send.
– The data is put in a send buffer, where it stays until the data is ACK’d.
– The TCP layer won’t accept data from the application unless (or until) there is buffer space.
ACK
– A receiver doesn’t have to ACK every segment (it can ACK many segments with a single ACK segment).
– Each ACK can also contain outgoing data (piggybacking).
– If a sender doesn’t get an ACK after some time limit it resends the data.

Termination

The TCP layer can send a RST segment that terminates a connection if something is wrong.
Usually the application tells TCP to terminate the connection gracefully with a FIN segment.
FIN
– Either end of the connection can initiate termination.
– A FIN is sent, which means the application is done sending data.
– The FIN is ACK’d.
– The other end must now send a FIN.
– That FIN must be ACK’d.

TCP Connection Termination

    App1                      App2
    1.  FIN, SN=X   ------->
    2.              <-------  ACK=X+1
        ...
    3.              <-------  FIN, SN=Y
    4.  ACK=Y+1     ------->

Stream Sockets

– Connection-based, i.e., socket addresses are established before sending messages between client and server (C/S)
– Address Domain: AF_UNIX (UNIX pathname) or AF_INET (host + port)
– Virtual circuit, i.e., data is transmitted sequentially in a reliable and non-duplicated manner
– Default protocol interface is TCP
– Checks order, sequence, duplicates
– No boundaries are imposed on data (it’s a stream of bytes)
– Slower than UDP
– Requires more program overhead

Datagram Sockets

– Connectionless sockets, i.e., C/S addresses are passed along with each message sent from one process to another
– Peer-to-Peer Communication
– Provides an interface to the UDP datagram services
– Handles network transmission as independent packets
– Provides no guarantees, although it does include a checksum
– Does not detect duplicates
– Does not determine sequence, i.e., information can be lost, arrive in the wrong order, or be duplicated
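
The datagram socket model described above can be sketched as a single exchange over the loopback interface. This is a minimal sketch under stated assumptions, not code from the course: the function name udp_loopback_once is hypothetical, both sockets live in one process for brevity, and the sketch relies on loopback delivery succeeding in practice, since UDP itself makes no such guarantee.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo function (not from the slides): sends one datagram
 * from a "client" socket to a "server" socket over the loopback
 * interface. Copies the received payload into out (at most outlen
 * bytes). Returns the number of bytes received, or -1 on error. */
ssize_t udp_loopback_once(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in srv, from;
    socklen_t len = sizeof(srv), fromlen = sizeof(from);
    ssize_t n = -1;
    int server = socket(AF_INET, SOCK_DGRAM, 0);
    int client = socket(AF_INET, SOCK_DGRAM, 0);

    if (server < 0 || client < 0)
        goto done;

    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_port = htons(0);                       /* kernel picks a port */
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 */
    if (bind(server, (struct sockaddr *) &srv, sizeof(srv)) < 0 ||
        getsockname(server, (struct sockaddr *) &srv, &len) < 0)
        goto done;

    /* No connect(): the destination address travels with each message. */
    if (sendto(client, msg, strlen(msg), 0,
               (struct sockaddr *) &srv, sizeof(srv)) < 0)
        goto done;

    /* recvfrom() also reports the sender's address (stored in from). */
    n = recvfrom(server, out, outlen, 0,
                 (struct sockaddr *) &from, &fromlen);

done:
    if (server >= 0)
        close(server);
    if (client >= 0)
        close(client);
    return n;
}
```

Compared with the stream socket sequence, there is no listen(), accept() or connect(): the server only binds, and each sendto() names its destination. A real server would use the address returned in from to send its reply with another sendto().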