Communication

References
• On-line tutorials
  • Beej’s Guide to Network Programming: http://beej.us/guide/bgnet/
  • The Java Tutorials, All About Sockets: http://download.oracle.com/javase/tutorial/networking/sockets/
• Chapter 2.1 of the Tanenbaum and van Steen textbook
• Chapter 15 of Stevens’ book on TCP/IP

Protocols
• All communication is based on exchanging messages.
• A protocol is a set of rules that computers use to communicate (using messages) with each other across a network.
• Protocols control the sending and receiving of messages, e.g., TCP, IP, HTTP, FTP.

Protocol Stacks
• A computer connected to the Internet has a unique address.
• Suppose an application on computer A wants to send the message “Hello there” to an application on computer B.
• The text message must be translated into electronic signals, transmitted over the Internet, then translated back into text.
• This is accomplished using a protocol stack.
• The protocol stack is usually built into the computer’s operating system.
• The protocol stack used on the Internet is referred to as the TCP/IP protocol stack, after its two major communication protocols (more later).

Internet Protocol Stack
• Many different protocols are needed at a variety of levels.
• Layers, from top to bottom: application, transport, network, link, physical.
• A message starts at the top of the protocol stack and works its way downwards.
• If the message is long, it may be broken up into smaller chunks of data. These chunks are known as packets.
• Packets are handed to the transport layer.

Transport Protocols
• At the sending end, each packet is associated with a destination address and a port number.
• A destination address alone is insufficient, since several processes on the destination machine could be receiving messages.
• The port number identifies the process that is to receive the packet.
• At the receiving end, the transport layer has to put the pieces back together into the message before passing it to the application.
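The splitting and reassembly just described can be sketched in C. This is only an illustration of the idea, not how TCP is implemented: the packet struct, PAYLOAD_MAX, packetize() and reassemble() are hypothetical names introduced here, and the payload size is kept tiny so that a short message splits into several packets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAYLOAD_MAX 4   /* assumed, deliberately tiny payload size */

/* Hypothetical packet: the port selects the receiving process, the
   sequence number restores order, data holds a slice of the message. */
struct packet {
    unsigned short port;
    unsigned int   seq;
    size_t         len;
    char           data[PAYLOAD_MAX];
};

/* Split msg into packets. Returns the number of packets produced,
   or 0 if out[] cannot hold them. */
size_t packetize(const char *msg, unsigned short port,
                 struct packet *out, size_t max_packets)
{
    size_t total, count, i;

    assert(msg != NULL && out != NULL);
    total = strlen(msg);
    count = (total + PAYLOAD_MAX - 1) / PAYLOAD_MAX;
    if (count > max_packets)
        return 0;
    for (i = 0; i < count; i++) {
        size_t off = i * PAYLOAD_MAX;
        size_t len = total - off < PAYLOAD_MAX ? total - off : PAYLOAD_MAX;

        out[i].port = port;
        out[i].seq  = (unsigned int) i;
        out[i].len  = len;
        memcpy(out[i].data, msg + off, len);
    }
    return count;
}

/* Rebuild the message from packets that may arrive in any order.
   buf must hold count * PAYLOAD_MAX + 1 bytes. Returns the length. */
size_t reassemble(const struct packet *in, size_t count, char *buf)
{
    size_t i, total = 0;

    assert(in != NULL && buf != NULL);
    for (i = 0; i < count; i++) {
        /* the sequence number, not the arrival position, decides placement */
        memcpy(buf + (size_t) in[i].seq * PAYLOAD_MAX, in[i].data, in[i].len);
        total += in[i].len;
    }
    buf[total] = '\0';
    return total;
}
```

Calling packetize("Hello there", 6000, p, 8) yields three packets; reassemble() restores the text even if the array is shuffled first, because each piece is placed by its sequence number.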
• One of the Internet transport protocols is the Transmission Control Protocol (TCP). TCP is a reliable transport protocol, i.e., it can deal with losses.
• The Internet transport protocol that does not deal with losses is the User Datagram Protocol (UDP).

Network Layer
• For a message to get from the sender to the receiver there are often multiple paths that can be taken.
• A path consists of multiple routers.
• Choosing a path is called routing.
• The most widely used network protocol is the connectionless IP (Internet Protocol).

Data Link Layer
• Between path points, a mechanism is needed to detect and correct errors such as damaged or lost messages.
• The data link layer is concerned with data transfer between adjacent path points.
• The data link layer groups bits into units called frames. Each frame has:
  • sequence numbers for detecting lost messages
  • checksums for detecting damage

Physical Layer
• The physical layer is concerned with the transmission of 0’s and 1’s.
  • How many volts should be used for 0 and for 1?
  • How many bits per second can be sent?
• Many physical layer standards have been developed for different media, e.g., the RS-232-C standard for serial communication lines.

Application Layer Protocols
• Application layer protocols include FTP and HTTP.
• These protocols are built on top of TCP or UDP.

Client-Server Paradigm
• Many network applications use a form of communication known as the client-server paradigm.
  • The server application waits (passively) for contact.
  • The client initiates the contact.
• Client and server applications interact directly with a transport-layer protocol to establish communication and to send and receive information.
• The transport layer uses lower-layer protocols to send and receive individual messages.

Application Programming Interface (API)
• The interface an application uses when it interacts with the transport protocol software is known as an Application Programming Interface (API).
• An API defines a set of operations that an application can perform when it interacts with a protocol.
• APIs are defined as follows:
  • Provide a set of procedures that can be called.
  • For each procedure, define the arguments expected.
  • Usually there is a procedure for each basic operation, e.g., send, receive.

Socket API
• Introduced in 1981
• Originally only Unix
• Implemented as a library
• Can be used to communicate over TCP or UDP

Sockets
• Sockets provide an interface to TCP/IP and UDP/IP.
• A socket represents a communication endpoint and is characterized by the following information: communication protocol, local address, local port, remote address, remote port.
• Let us now look at how sockets are represented in operating systems.

Open Files/Socket Representation
• Here is a code fragment:

    FILE *in_file;
    in_file = fopen("list.txt", "r");   /* open a file */

• Each process has a file descriptor table.
• An entry could point to information about a file.
• File information includes disk location, file status, etc. Read/write operations need this information.
• File descriptor table: 0 stdin, 1 stdout, 2 stderr, 3 in_file, 4 (unused)
• An entry can also point to a socket.
• Socket information is the information needed for communication:
  • transport protocol
  • address of the local machine
  • address of the remote machine
  • port numbers
• File descriptor table: 0 stdin, 1 stdout, 2 stderr, 3 in_file, 4 socket

Sockets and Descriptors
• Sockets are integrated with I/O.
• The OS provides a single set of descriptors for files, devices, and communication.
• The read() and write() functions can be used for both files and communication.
• Before an application can communicate, it must request that the operating system create a socket to be used for communication.
• The system returns a small integer (an index into the file descriptor table) that identifies the socket.
• The application uses that small integer as an argument in the communication API procedures.

Socket Creation
    descriptor = socket(domain, type, protocol)
• domain specifies the addressing format to be used by the socket.
  AF_INET specifies Internet addresses and AF_UNIX specifies file path names.
• type specifies the type of communication the socket will use.
  • SOCK_STREAM: connection-oriented stream transfer
  • SOCK_DGRAM: connectionless message-oriented transfer
• protocol specifies a particular transport protocol; usually 0, which selects the default for the given type. Keep in mind that there are variations of TCP.
• –1 is returned if an error occurs.

Socket Creation Example
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    int s;
    if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket");
    }

Endpoint Addresses
• A socket association is characterized as follows: communication protocol, local address, local port, remote address, remote port.
• The local address and port are defined using the bind() call.
• The remote address and port are defined using the connect() call.

Address Representation
• This structure holds socket address information for many types of sockets:

    struct sockaddr {
        unsigned short sa_family;    /* address family */
        char           sa_data[14];  /* 14 bytes of protocol address */
    };

• sa_family represents the address format, e.g., AF_INET.
• sa_data contains an address and port number for the socket.
• This structure is considered “unwieldy” since most programmers do not want to pack sa_data by hand.
• Programmers deal with the sockaddr_in structure instead.
• An IPv4 address is represented internally as a 32-bit integer:

    struct in_addr {
        u_long s_addr;
    };

    struct sockaddr_in {
        short          sin_family;
        u_short        sin_port;
        struct in_addr sin_addr;
        char           sin_zero[8];
    };

• sin_family represents the address format.
• sin_port specifies a port number, which is associated with a process.
• sin_addr specifies the address of a host machine.
• sin_zero is padding used to fill out the structure.
• sockaddr_in is used with AF_INET sockets, for both TCP (connection-oriented) and UDP (connectionless); sockaddr_un is used with AF_UNIX sockets, whose addresses are file path names.
• The API procedures expect an argument that is a pointer to sockaddr.
• The programmer should cast a pointer to the sockaddr_in (or sockaddr_un) variable to a pointer to sockaddr when calling the API procedures.

Data Representation
• Integers are not represented the same way on all machine architectures.
  • little endian: least significant byte first; the Intel series, VAX
  • big endian: most significant byte first (this is network byte order); IBM 370, Motorola 68000 and Sun Sparc
• When little endian computers are going to pass integers over the network (e.g., IP addresses), they need to convert them to network byte order. On big endian machines the conversion functions do nothing, so portable code always calls them.
    m = ntohl(m);   /* network to host byte order (32 bit) */
    m = htonl(m);   /* host to network byte order (32 bit) */
    m = ntohs(m);   /* network to host byte order (16 bit) */
    m = htons(m);   /* host to network byte order (16 bit) */

Address Conversion
• host name to IP address: gethostbyname()
• IP address to name: gethostbyaddr()

bind()
    int bind(int socket, struct sockaddr *localaddr, int addrlen)
• localaddr is a structure that specifies the local address to be assigned to the socket.
• addrlen specifies the length of the sockaddr structure pointed to by localaddr.
• Used by both TCP and UDP.

    struct sockaddr_in sin;
    int s;

    if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0) { /* error */ }
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(6000);
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(s, (struct sockaddr *) &sin, sizeof(sin)) < 0) {…}

connect()
• The client contacts the server by:
  • specifying the IP address and port number of the server process
  • calling the connect() system call

    int connect(int s, struct sockaddr *name, int namelen)
• The client issues connect() to establish the remote address and port.
• For a stream socket, this establishes the connection.
• The call fails if no process on the remote host is listening on that port.
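The byte-order conversion used in the bind() fragment can be checked directly. The sketch below is illustrative only: fill_any_addr() and wire_bytes16() are hypothetical helper names, and uint16_t is used in place of the slides' u_short. It fills a sockaddr_in the same way as the fragment above and exposes the bytes that htons() actually stores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Fill *sin with "any local address" and the given port.
   The port is passed in host byte order and stored in network order. */
void fill_any_addr(struct sockaddr_in *sin, uint16_t port)
{
    assert(sin != NULL);
    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_port = htons(port);
    sin->sin_addr.s_addr = htonl(INADDR_ANY);
}

/* Copy out the in-memory bytes of a 16-bit value that is already in
   network byte order, i.e. the bytes as they would go on the wire. */
void wire_bytes16(uint16_t net_value, unsigned char out[2])
{
    memcpy(out, &net_value, 2);
}
```

Port 6000 is 0x1770, so after htons() the stored bytes are 0x17 followed by 0x70 on any host, little endian or big endian, and ntohs() recovers 6000.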
send()
    int send(int socket, const void *msg, int len, int flags);
• msg is a pointer to the first byte of the data to be sent.
• len is an integer that specifies the amount of data to be sent, in bytes.
• flags specify special options, mostly intended for system debugging (don’t worry about this; just set it to 0).
• Used for stream communications.

recv()
    int recv(int socket, void *msg, int len, int flags);
• Similar to send(), except that it receives data and puts it into msg.

listen()
• The client must contact the server.
• For a client to contact a server, the server process must first be running.
• The server must have created a socket that is used to welcome the client’s contact.
• The server must be “listening” for the client’s contact by using the listen() system call.

    int listen(int s, int queuesize)
• queuesize specifies a length for the socket’s request queue.
• The OS builds a separate request queue for each socket.
• Client requests are put into the queue.
• The socket being listened on cannot itself be used to exchange data with a client.
• Note: before any data is sent, TCP performs a three-way handshake: the client sends a connection request, the server replies, and the client acknowledges that reply.

accept()
• When contacted by a client, the server’s TCP creates a new socket for the server process to communicate with that client.
• This allows the server to talk with multiple clients.

    int accept(int s, struct sockaddr *addr, int *addrlen);
• All servers begin by calling socket() to create a socket and bind() to specify a protocol port number.
• These two calls are sufficient for a server to start accepting messages over a connectionless transport.
• Extra calls (listen() and accept()) are needed for a connection-oriented protocol.
• accept() fills in the fields of addr with the address of the client that formed the connection and sets addrlen to the length of that address.
• A new socket descriptor is returned.
• The server uses the new socket to communicate with the client and then closes the socket when finished.
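The listen()/accept() hand-off can be tried in a single process over the loopback interface. The sketch below makes several assumptions: listen_loopback() and loopback_roundtrip() are hypothetical names, the address is 127.0.0.1, port 0 asks the OS to choose a free port, and getsockname() is used to find out which port was chosen. It relies on the request queue described above: the kernel completes the connection on behalf of the listening socket, so connect() can return before accept() is called.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Create a TCP socket listening on 127.0.0.1. Port 0 lets the OS pick;
   the chosen port is stored in *port. Returns the descriptor or -1. */
int listen_loopback(unsigned short *port)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(0);
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(s, (struct sockaddr *) &sin, sizeof(sin)) < 0 ||
        listen(s, 5) < 0 ||
        getsockname(s, (struct sockaddr *) &sin, &len) < 0) {
        close(s);
        return -1;
    }
    *port = ntohs(sin.sin_port);
    return s;
}

/* Client sends msg; the server side receives it on the socket returned
   by accept(). Returns the number of bytes received, or -1 on failure. */
int loopback_roundtrip(const char *msg, char *reply, size_t reply_size)
{
    unsigned short port;
    struct sockaddr_in srv;
    int lfd, cfd, afd;
    ssize_t n = -1;

    assert(msg != NULL && reply != NULL && reply_size > 0);
    lfd = listen_loopback(&port);
    if (lfd < 0)
        return -1;
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_port = htons(port);
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (cfd >= 0 && connect(cfd, (struct sockaddr *) &srv, sizeof(srv)) == 0) {
        afd = accept(lfd, NULL, NULL);   /* new socket for this client */
        if (afd >= 0) {
            if (send(cfd, msg, strlen(msg), 0) >= 0) {
                n = recv(afd, reply, reply_size - 1, 0);
                if (n >= 0)
                    reply[n] = '\0';
            }
            close(afd);
        }
    }
    if (cfd >= 0)
        close(cfd);
    close(lfd);                          /* the listening socket was never used for data */
    return (int) n;
}
```

A real server would keep the listening socket open and call accept() again for the next client rather than closing it after one exchange.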
• The original socket remains unchanged and is used to accept the next connection from a client.

Getting IP address/port from socket
    int getsockname(int sockfd, struct sockaddr *localaddr, socklen_t *addrlen)
• Gets the local IP address and port bound to the socket.

    int getpeername(int sockfd, struct sockaddr *remoteaddr, socklen_t *addrlen)
• Gets the IP address and port of the remote endpoint.

Sequence of Socket System Calls
    Client:  socket, connect, send, receive, close
    Server:  socket, bind, listen, accept, receive, send, close
• The client’s connect() carries the connection request that the server’s accept() picks up.
• When the client closes its socket, the server sees EOF and closes its own.

Example (Stream Socket; setting address)
    /* Fill *sp with the address of host and the given port. */
    void setaddr(struct sockaddr_in *sp, char *host, int port)
    {
        struct hostent *hp;

        hp = gethostbyname(host);
        if (hp == NULL) {
            fprintf(stderr, "%s: unknown host\n", host);
            exit(1);
        }
        memset(sp, 0, sizeof(*sp));
        sp->sin_family = AF_INET;
        bcopy(hp->h_addr, &sp->sin_addr, hp->h_length);
        sp->sin_port = htons(port);
    }

Example (Stream Socket; socket creation)
    /* Create a TCP socket bound to the given local port. */
    int streamsocket(int port)
    {
        int s;
        struct sockaddr_in sin;

        memset(&sin, 0, sizeof(sin));
        sin.sin_family = AF_INET;
        sin.sin_addr.s_addr = htonl(INADDR_ANY);
        sin.sin_port = htons(port);
        s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0)
            error("socket");    /* error() is a helper that these slides do not show */
        if (bind(s, (struct sockaddr *) &sin, sizeof(sin)) < 0)
            error("bind");
        return s;
    }

Example (Stream Socket; client)
    /* uses the same #include lines and myname declaration as the server example */
    int main(int argc, char *argv[])
    {
        struct sockaddr_in sin;
        int s, n, zero;
        char buf[BUFSIZ];

        myname = argv[0];
        if (argc < 2) {
            fprintf(stderr, "usage: %s port [host]\n", myname);
            exit(1);
        }
        s = streamsocket(0);    /* port 0 means "any port" */
        setaddr(&sin, argc > 2 ?
                argv[2] : "localhost", atoi(argv[1]));
        /* connect a socket using name specified by the command line */
        if (connect(s, (struct sockaddr *) &sin, sizeof(sin)) < 0) {
            error("connecting stream socket");
            exit(1);
        }
        printf("Type in text\n");
        /* read text from the standard input */
        while ((n = read(0, buf, sizeof(buf))) > 0) {
            /* send only the n bytes that were read through the socket */
            if (write(s, buf, n) < 0)
                error("writing on stream socket");
            bzero(buf, sizeof(buf));
        }
        close(s);
        exit(0);
    }

Example (Stream Socket; server)
    #include <stdio.h>
    #include <stdlib.h>
    #include <strings.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netdb.h>

    char *myname;
    #define MSGSIZE 1

    int main(int argc, char *argv[])
    {
        struct sockaddr_in from;
        int s, n, msgs, rval;
        socklen_t fromlen;
        struct hostent *hp;
        char buf[BUFSIZ];

        myname = argv[0];
        if (argc < 2) {
            fprintf(stderr, "usage: %s port\n", argv[0]);
            exit(1);
        }
        s = streamsocket(atoi(argv[1]));
        fromlen = sizeof(from);
        if (getsockname(s, (struct sockaddr *) &from, &fromlen)) {
            error("getting socket name");
            exit(1);
        }
        printf("Socket has port # %d\n", ntohs(from.sin_port));
        listen(s, 5);
        for (;;) {      /* start accepting connections */
            msgs = accept(s, 0, 0);
            if (msgs == -1)
                error("accept");
            else {
                do {
                    bzero(buf, sizeof(buf));
                    /* leave room for a terminating NUL so buf can be printed */
                    if ((rval = read(msgs, buf, sizeof(buf) - 1)) < 0)
                        error("reading stream message");
                    else if (rval == 0)
                        printf("Ending connection\n");
                    else
                        printf("%s\n", buf);
                } while (rval > 0);
                close(msgs);
            }
        }
    }

UDP: User Datagram Protocol [RFC 768]
• “No frills,” “bare bones” Internet transport protocol
• “Best effort” service; UDP segments may be:
  • lost
  • delivered out of order to the application
• Connectionless:
  • no handshaking between UDP sender and receiver
  • each UDP segment is handled independently of the others

Why use UDP instead of TCP
• No connection establishment (which can add delay). Remember that TCP does a three-way handshake before transmitting data. UDP does not.
• No connection state. TCP maintains connection state, which UDP does without:
  • receive and send buffers
  • sequence and acknowledgement numbers
  • congestion control parameters
• Smaller segment overhead. Each TCP segment has 20 bytes of header overhead while each UDP segment has 8 bytes. The overhead in TCP includes sequence and acknowledgement numbers and a flow control (receive window) field.
• There is no congestion control: UDP can blast away as fast as possible.

UDP
• Often used for streaming multimedia applications, which are:
  • loss tolerant
  • rate sensitive
• Other UDP uses (why?):
  • DNS
  • SNMP (note that the ping command uses ICMP, not UDP)

Applications using UDP and TCP
• Email: TCP
• telnet: TCP
• HTTP: TCP
• FTP: TCP
• Remote file server (NFS): typically UDP
• DNS: typically UDP
• Streaming multimedia: typically UDP
• Internet telephony: typically UDP
• Network management (SNMP): typically UDP

UDP and Reliability
• UDP lacks congestion control.
• It is still possible to have reliable data transfer over UDP: the application itself must implement acknowledgement and retransmission mechanisms.
• Some streaming applications do this.

API for UDP
• The socket() call uses SOCK_DGRAM instead of SOCK_STREAM.
• No connect() call is needed.
• Uses recvfrom() and sendto() instead of read() and write().
• There are no listen() or accept() calls.

sendto()
    int sendto(int socket, char *msg, int len, int flags,
               struct sockaddr *to, int tolen);
• to specifies the destination address and tolen specifies the length of the destination address.
• Used for datagram communications.

recvfrom()
    int recvfrom(int socket, char *msg, int len, int flags,
                 struct sockaddr *from, int *fromlen);
• Sets from to the source address of the data.
• fromlen is a pointer: on entry it holds the size of from, and on return it is set to the valid length of from.
• Returns the number of bytes received.

Other Functions
• There are other functions used with TCP/UDP that provide for timeouts, etc.
• We will discuss the use of the select() function.
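The sendto()/recvfrom() calls described above can be exercised with two datagram sockets in one process. This is a sketch under stated assumptions: udp_loopback() is a hypothetical name, both sockets live on 127.0.0.1, the receiver binds to port 0 and learns its port through getsockname(), and a receive timeout is set with the SO_RCVTIMEO socket option so that a lost datagram cannot block the call forever.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Send msg as one datagram from one socket to another on 127.0.0.1.
   Returns the number of bytes received, or -1 on failure or timeout. */
int udp_loopback(const char *msg, char *out, size_t out_size)
{
    struct sockaddr_in addr, from;
    socklen_t len = sizeof(addr), fromlen = sizeof(from);
    struct timeval tv;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);   /* receiver */
    int tx = socket(AF_INET, SOCK_DGRAM, 0);   /* sender   */
    ssize_t n = -1;

    assert(msg != NULL && out != NULL && out_size > 0);
    if (rx >= 0 && tx >= 0) {
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(0);                  /* OS picks the port */
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        tv.tv_sec = 2;                             /* give up after 2 seconds */
        tv.tv_usec = 0;
        setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

        /* no listen(), accept() or connect(): bind, then exchange datagrams */
        if (bind(rx, (struct sockaddr *) &addr, sizeof(addr)) == 0 &&
            getsockname(rx, (struct sockaddr *) &addr, &len) == 0 &&
            sendto(tx, msg, strlen(msg), 0,
                   (struct sockaddr *) &addr, sizeof(addr)) >= 0) {
            /* from and fromlen are filled in with the sender's address */
            n = recvfrom(rx, out, out_size - 1, 0,
                         (struct sockaddr *) &from, &fromlen);
            if (n >= 0)
                out[n] = '\0';
        }
    }
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return (int) n;
}
```

Unlike a stream socket, each sendto() produces one datagram and each recvfrom() returns at most one, so message boundaries are preserved.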
select()
    int select(int maxfds, fd_set *readfds, fd_set *writefds,
               fd_set *exceptfds, struct timeval *timeout);

    FD_CLR(int fd, fd_set *fds);    /* clear the bit for fd in fds */
    FD_ISSET(int fd, fd_set *fds);  /* is the bit for fd set in fds? */
    FD_SET(int fd, fd_set *fds);    /* turn on the bit for fd in fds */
    FD_ZERO(fd_set *fds);           /* clear all bits in fds */

• maxfds: the number of descriptors to be tested; descriptors 0, 1, ..., maxfds-1 will be tested.
• readfds: on input, the set of fds we want to check for available data; on return, the set of fds ready to read. If the argument is NULL, we are not interested in that condition.
• writefds: returns the set of fds ready to write.
• exceptfds: returns the set of fds with exception conditions.

    struct timeval {
        long tv_sec;    /* seconds */
        long tv_usec;   /* microseconds */
    };

• timeout
  • If NULL, wait forever and return only when one of the descriptors is ready for I/O.
  • Otherwise, wait up to the fixed amount of time specified by timeout.
  • If we don’t want to wait at all, create a timeout structure with a timer value equal to 0.

select()
• A select() call can be used to implement … timeouts:

    s = streamsocket(0);    /* port 0 means "any port" */
    setaddr(&sin, argc > 2 ?
            argv[2] : "localhost", atoi(argv[1]));
    /* connect a socket using name specified by the command line */
    if (connect(s, (struct sockaddr *) &sin, sizeof(sin)) < 0) {
        error("connecting stream socket");
        exit(1);
    }
    buf = (char *) malloc(BUFSIZ * sizeof(char));
    sprintf(buf, "%d", AREYOUUP);
    write(s, buf, strlen(buf));     /* sizeof(buf) would be the size of the pointer */

    tv.tv_sec = 20;
    tv.tv_usec = 5000;
    FD_ZERO(&readfds);              /* clear the set, then add s */
    FD_SET(s, &readfds);
    /* the first argument is the highest descriptor plus one;
       rv and n are ints declared alongside tv and readfds */
    rv = select(s + 1, &readfds, NULL, NULL, &tv);
    if (rv < 0)
        error("select");
    else if (rv > 0 && FD_ISSET(s, &readfds)) {
        printf("Data Arrived\n");
        n = recv(s, buf, BUFSIZ - 1, 0);
        if (n >= 0) {
            buf[n] = '\0';
            printf("buffer is %s\n", buf);
        }
    } else
        printf("timed out\n");
    close(s);
    exit(0);

Difficulties in Socket Programming
• Data representation
• Binding

Data Representation
• Complex data structures must be converted.
  • The sender must flatten complex data structures.
  • The receiver must reconstruct them.
• Sender and receiver must agree on a common format for messages.
• Example:

    typedef struct {
        char  Name[MAXNAMELENGTH];
        float Salary;
        char  JobCode[IDNUMLENGTH];
    } Employee;

• You may want to send this information to the server using
      send(s, (void *) &e, sizeof(e), 0)
  where e is of type Employee.
• This is most likely not going to work: the in-memory layout of the struct (padding, byte order, floating-point format) can differ between sender and receiver. The two sides must agree on a format for the message.
• Sender and receiver must also agree on the type of message. This can be quite difficult.
• The sender must convert the data to the agreed-upon format. This often requires a “flattening” of the data structure representing the data.
• The receiver should parse the incoming message.

Data Representation
• Code segment for marshalling the Employee structure:

    char *msg;
    char name[MAXNAMELENGTH];
    char jobcode[MAXNAMELENGTH];
    float salary;
    int msglength;
    Employee e;
    ….
    salary = GetSalaryFromEmployee(e);
    GetJobCodeFromEmployee(e, jobcode);
    GetNameFromEmployee(e, name);
    /* zero-filled buffer; snprintf never writes past its end */
    msg = (char *) calloc(1, sizeof(Employee));
    snprintf(msg, sizeof(Employee), "%s %f %s", name, salary, jobcode);
    …..
    msglength = sizeof(Employee);
    send(s, (void *) msg, msglength, 0);

Data Representation
• Code segment for unmarshalling the Employee data sent:

    char *msg;
    char name[MAXNAMELENGTH];
    char jobcode[MAXNAMELENGTH];
    float salary;
    int msglength;
    Employee e;
    …
    msg = (char *) malloc(sizeof(Employee));
    …
    msglength = sizeof(Employee);
    recv(connectfd, (char *) msg, msglength, 0);
    sscanf(msg, "%s %f %s", name, &salary, jobcode);
    …

Other Representational Issues
• Usually in a client-server model, the server provides a set of services. Each service can be invoked by a procedure. For example, an employee management system would have services that include:
  • insert an employee record for a new employee
  • get the salary of an employee
  • etc.
• The client must identify the service that it wants. This is part of the message.
• Thus each service must have an identifier. In the previous code segments we might have something like this:

    sprintf(msg, "%d %s %f %s", methodidentifier, name, salary, jobcode);
    sscanf(msg, "%d %s %f %s", &methodidentifier, name, &salary, jobcode);

• What if we have a list of Employees that we want to send in a message? We do not know ahead of time how many employee records will be sent in a message.
• One way to handle this:
  • Send a message with the service identifier and the number of employee records being sent.
  • Then send the employee records.
• What was just described works fine if the client and server machines are of similar types. However, it is common for there to be multiple machine types.
• IBM mainframes use the EBCDIC character code, but IBM PCs use ASCII. It would be rather difficult to pass a character parameter from an IBM PC client to an IBM mainframe server using what has just been described.
• Similar problems occur with integers (one’s complement vs. two’s complement).
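The marshalling and unmarshalling steps, including the service identifier, can be combined into a pair of small functions. This is a sketch, not the slides' code: marshal_employee() and unmarshal_employee() are hypothetical names, the values of MAXNAMELENGTH and IDNUMLENGTH are assumed because the slides do not give them, and the salary is written with two decimals rather than the default %f precision.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXNAMELENGTH 32   /* assumed value */
#define IDNUMLENGTH   8    /* assumed value */

typedef struct {
    char  Name[MAXNAMELENGTH];
    float Salary;
    char  JobCode[IDNUMLENGTH];
} Employee;

/* Flatten e into the text "<method> <name> <salary> <jobcode>".
   Returns the message length, or -1 if buf is too small. */
int marshal_employee(int method, const Employee *e, char *buf, size_t size)
{
    int n;

    assert(e != NULL && buf != NULL);
    n = snprintf(buf, size, "%d %s %.2f %s",
                 method, e->Name, e->Salary, e->JobCode);
    return (n < 0 || (size_t) n >= size) ? -1 : n;
}

/* Parse a message produced by marshal_employee().
   Returns 0 on success, -1 if the message is malformed. */
int unmarshal_employee(const char *buf, int *method, Employee *e)
{
    assert(buf != NULL && method != NULL && e != NULL);
    memset(e, 0, sizeof(*e));
    /* the widths 31 and 7 are the array sizes minus one, so a long
       field cannot overflow Name or JobCode */
    return sscanf(buf, "%d %31s %f %7s",
                  method, e->Name, &e->Salary, e->JobCode) == 4 ? 0 : -1;
}
```

Because %s stops at white space, this format cannot carry a name that contains a space; a real protocol would need length-prefixed or delimited fields.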
Other Representational Issues
• A canonical form is needed.
• For the UNIX OS there is XDR (eXternal Data Representation).
• For Java RMI, there is the Java Remote Method Protocol (JRMP).

Binding
• Binding refers to determining the location and identity (communication identifier) of the callee.
• In UNIX, a communication identifier is a socket address containing the host’s IP address and port number.

Binding
• Strategies for binding:
  • Static binding (which binds the host address of a server into the client program at compilation time) is undesirable.
    • The client and server programs are compiled separately and often at different times.
    • Services may be moved from one host to another.
  • The host name and port could be passed by reading a file or through the command line.
    • You do not need to recompile.
    • The client still needs to know the location of the server ahead of time.

Binding
• Strategies for binding (cont.):
  • Always run the binder on a “well-known” address (i.e., fixed host and port).
  • The operating system supplies the current address of the binder (e.g., via an environment variable in UNIX).
    • Users need to be informed whenever the binder is relocated.
    • Client and server programs need not be recompiled.
  • Use a broadcast message to locate the binder.
    • The binder can be easily relocated.
    • Client and server programs need not be recompiled.

Binding
• A dynamic binding service is desirable. Ideally, a binding service would have the following characteristics:
  • allows servers to register the services they export
  • allows servers to remove services
  • allows clients to look up a named service

Summary
• This section briefly summarized the basic message-passing primitives that can be used by processes in a distributed application.
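The three operations listed for a dynamic binding service (register, remove, look up) can be sketched as an in-memory table. Everything here is hypothetical: the names binder_register(), binder_remove() and binder_lookup(), the table size, and the string limits are assumptions, and a real binder would be a separate server reached over sockets rather than a set of local functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SERVICES     16   /* assumed table size */
#define SERVICE_NAME_LEN 32
#define HOST_NAME_LEN    64

struct binding {
    int            used;
    char           name[SERVICE_NAME_LEN];
    char           host[HOST_NAME_LEN];
    unsigned short port;
};

static struct binding table[MAX_SERVICES];

/* Return the entry registered under name, or NULL. */
static struct binding *find(const char *name)
{
    int i;

    for (i = 0; i < MAX_SERVICES; i++)
        if (table[i].used && strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/* Register a service, or update its location if it has moved.
   Returns 0 on success, -1 if the table is full or a string is too long. */
int binder_register(const char *name, const char *host, unsigned short port)
{
    struct binding *b;
    int i;

    assert(name != NULL && host != NULL);
    if (strlen(name) >= SERVICE_NAME_LEN || strlen(host) >= HOST_NAME_LEN)
        return -1;
    b = find(name);
    for (i = 0; b == NULL && i < MAX_SERVICES; i++)
        if (!table[i].used)
            b = &table[i];          /* first free slot */
    if (b == NULL)
        return -1;
    b->used = 1;
    strcpy(b->name, name);
    strcpy(b->host, host);
    b->port = port;
    return 0;
}

/* Remove a service. Returns 0 on success, -1 if it was not registered. */
int binder_remove(const char *name)
{
    struct binding *b;

    assert(name != NULL);
    b = find(name);
    if (b == NULL)
        return -1;
    b->used = 0;
    return 0;
}

/* Look up a named service. Returns its binding, or NULL if unknown. */
const struct binding *binder_lookup(const char *name)
{
    assert(name != NULL);
    return find(name);
}
```

Because clients ask the binder by name at run time, a service that moves to another host only has to register again; neither clients nor servers are recompiled.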