
10. The Internet Transport Protocols: TCP

UDP is a simple protocol and it has some niche uses, such as client-server interaction and multimedia. Most Internet applications, however, need reliable, sequenced delivery, which UDP cannot provide, so another protocol is required. It is called TCP and is the main workhorse of the Internet.

TCP (Transmission Control Protocol) was specifically designed to provide a reliable end-to-end byte stream over an unreliable internetwork. An internetwork differs from a single network because different parts may have widely different topologies, bandwidths, delays, packet sizes, and other parameters. TCP was designed to adapt dynamically to the properties of the internetwork and to be robust in the face of many kinds of failures.

TCP provides multiplexing, demultiplexing, and error detection in exactly the same manner as UDP. Nevertheless, TCP and UDP differ in many ways. The most fundamental difference is that UDP is connectionless, while TCP is connection-oriented. UDP is connectionless because it sends data without ever establishing a connection. TCP is connection-oriented because before one application process can begin to send data to another, the two processes must first "handshake" with each other; that is, they must send some preliminary segments to each other to establish the parameters of the ensuing data transfer.

Each machine supporting TCP has a TCP transport entity, either a library procedure, a user process, or part of the kernel. In all cases, it manages TCP streams and interfaces to the IP layer. A TCP entity accepts user data streams from local processes, breaks them up into pieces not exceeding 64 KB (in practice, often 1460 data bytes in order to fit in a single Ethernet frame with the IP and TCP headers), and sends each piece as a separate IP datagram. When datagrams containing TCP data arrive at a machine, they are given to the TCP entity, which reconstructs the original byte streams.
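
The splitting and reconstruction of a byte stream described above can be illustrated with a minimal Python sketch. It is not how a real TCP entity is implemented: the function names `segment` and `reassemble` are hypothetical, the byte offset stands in for a sequence number, and loss, retransmission, and headers are ignored. Only the 1460-byte piece size comes from the text.

```python
MSS = 1460  # data bytes per piece, the Ethernet-friendly figure quoted in the text


def segment(stream: bytes, mss: int = MSS) -> list[tuple[int, bytes]]:
    """Split a byte stream into (offset, payload) pieces of at most mss bytes."""
    return [(off, stream[off:off + mss]) for off in range(0, len(stream), mss)]


def reassemble(pieces: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the original stream from pieces that may arrive in any order."""
    # Sorting by offset restores the sending order before concatenation.
    return b"".join(payload for _, payload in sorted(pieces))


data = bytes(range(256)) * 20            # 5120 bytes of sample data
pieces = segment(data)
print([len(payload) for _, payload in pieces])   # [1460, 1460, 1460, 740]

arrived = pieces[::-1]                   # simulate out-of-order arrival
print(reassemble(arrived) == data)       # True
```

Tagging each piece with its position in the stream is what lets the receiver restore the order regardless of how the datagrams were delivered.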

For simplicity, we will sometimes use just "TCP" to mean the TCP transport entity (a piece of software) or the TCP protocol (a set of rules). From the context it will be clear which is meant.

The IP layer gives no guarantee that datagrams will be delivered properly, so it is up to TCP to time out and retransmit them as need be. Datagrams that do arrive may well arrive in the wrong order; it is also up to TCP to reassemble them into messages in the proper sequence. In short, TCP must furnish the reliability that most users want and that IP does not provide.

a. The TCP Service Model

TCP service is obtained by both the sender and the receiver creating end points, called sockets. Each socket has a socket number (address) consisting of the IP address of the host and a 16-bit number local to that host, called a port. A port is the TCP name for a transport-layer access point. For TCP service to be obtained, a connection must be explicitly established between a socket on the sending machine and a socket on the receiving machine. The socket calls (primitives) are listed in Table 5.

A socket may be used for multiple connections at the same time. In other words, two or more connections may terminate at the same socket. Connections are identified by the socket identifiers at both ends, that is, (socket1, socket2). No virtual circuit numbers or other identifiers are used.

Table 5. The socket primitives for TCP

Primitive   Meaning
SOCKET      Create a new communication end point
BIND        Attach a local address to a socket
LISTEN      Announce willingness to accept connections; give queue size
ACCEPT      Block the caller until a connection attempt arrives
CONNECT     Actively attempt to establish a connection
SEND        Send some data over the connection
RECEIVE     Receive some data from the connection
CLOSE       Release the connection

Port numbers below 1024 are called well-known ports and are reserved for standard services. For example, any process wishing to establish a connection to a host to transfer a file using FTP can connect to the destination host's port 21 to contact its FTP daemon.
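
The primitives in Table 5 correspond closely to calls in Python's standard `socket` module, so their order of use can be shown with a small loopback sketch. This is only an illustration under stated assumptions: the helper names `serve_once` and `demo` are hypothetical, the server upper-cases whatever it receives, and port 0 is used so that the operating system picks a free port rather than a well-known one.

```python
import socket
import threading


def serve_once(server: socket.socket) -> None:
    """Handle a single client: ACCEPT, RECEIVE, SEND, CLOSE."""
    conn, _peer = server.accept()          # ACCEPT: block until a client connects
    with conn:                             # CLOSE happens when the block exits
        data = conn.recv(1024)             # RECEIVE
        conn.sendall(data.upper())         # SEND


def demo(message: bytes = b"hello tcp") -> bytes:
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # SOCKET
    server.bind(("127.0.0.1", 0))          # BIND: port 0 = let the OS choose
    server.listen(1)                       # LISTEN with a queue size of 1
    host, port = server.getsockname()

    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()

    # create_connection performs SOCKET and CONNECT on the client side.
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(message)            # SEND
        reply = client.recv(1024)          # RECEIVE

    worker.join()
    server.close()                         # CLOSE the listening socket
    return reply


print(demo())   # b'HELLO TCP'
```

The connection here is identified by the pair of end points (client address and port, server address and port), which is the (socket1, socket2) identification mentioned above.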

Some of the well-known ports are listed in Table 6.

Table 6. Some assigned ports

Port   Protocol   Use
21     FTP        File transfer
23     Telnet     Remote login
25     SMTP       E-mail
69     TFTP       Trivial file transfer protocol
79     Finger     Lookup information about a user
80     HTTP       World Wide Web
110    POP-3      Remote e-mail access
119    NNTP       USENET news

All TCP connections are full duplex and point-to-point. Full duplex means that traffic can go in both directions at the same time. Point-to-point means that each connection has exactly two end points. TCP does not support multicasting or broadcasting.

2. Implementation of the NL Services

The network layer (NL) can be organized in two different ways, depending on the type of service offered.

1. If connectionless service is offered, packets are injected into the subnet individually and routed independently of each other. No advance setup is needed. In this context, the packets are frequently called datagrams and the subnet is called a datagram subnet. See Figure 2.

Let us consider a datagram subnet. Suppose that the process P1 in Figure 2 has a long message for P2. It hands the message to the transport layer with instructions to deliver it to process P2 on host H2. The transport layer code runs on host H1, typically within the operating system. It prepends a transport header to the front of the message and hands the result to the network layer, probably just another procedure within the operating system.

Figure 2. Routing within a datagram subnet.

Let us assume that the message is four times longer than the maximum packet size, so the network layer has to break it into four packets, 1, 2, 3, and 4, and send each of them in turn to router A using some point-to-point protocol. Every router has an internal table telling it where to send packets for each possible destination. Each table entry is a pair consisting of a destination and the outgoing line to use for that destination. Only directly connected lines can be used.
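
A datagram router's table and per-packet lookup can be sketched in a few lines of Python. The sketch is illustrative only: the names `NEIGHBOURS`, `split_message`, and `forward`, the packet size of 1000 bytes, and the table entries are assumptions, not values copied from Figure 2. It shows a message being split into four packets and each packet being looked up independently, so a table change between packets changes the outgoing line.

```python
# Lines that are directly connected to this (hypothetical) router.
NEIGHBOURS = {"B", "C"}


def split_message(message: bytes, max_packet: int) -> list[bytes]:
    """Break a message into packets of at most max_packet bytes."""
    return [message[i:i + max_packet] for i in range(0, len(message), max_packet)]


def forward(table: dict[str, str], destination: str) -> str:
    """Look up the outgoing line for a destination in the router's table."""
    line = table[destination]
    if line not in NEIGHBOURS:
        # Only directly connected lines may be used as outgoing lines.
        raise ValueError(f"{line} is not directly connected")
    return line


table = {"E": "C", "F": "C"}                  # destination -> outgoing line
packets = split_message(b"x" * 4000, 1000)    # message is 4x the packet size
print(len(packets))                           # 4

lines = [forward(table, "F") for _ in packets[:3]]
table["F"] = "B"                              # the table is updated between packets
lines.append(forward(table, "F"))
print(lines)                                  # ['C', 'C', 'C', 'B']
```

Because every packet carries its full destination and is looked up on its own, no per-connection state is needed in the router.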

For example, in Figure 2, A has only two outgoing lines (to B and C), so every incoming packet must be sent to one of these routers, even if the final destination is some other router. A's initial routing table is shown in Figure 2 under the label "initial".

As they arrived at A, packets 1, 2, and 3 were stored briefly (to verify their checksums). Then each was forwarded to C according to A's table. Packet 1 was then forwarded to E and then to F. When it got to F, it was encapsulated in a data link layer frame and sent to H2 over the LAN. Packets 2 and 3 followed the same route.

However, something different happened to packet 4. When it got to A it was sent to router B, even though it is also destined for F. For some reason, A decided to send packet 4 via a different route than that of the first three. Perhaps it learned of a traffic jam somewhere along the ACE path and updated its routing table, as shown under the label "later". The algorithm that manages the tables and makes the routing decisions is called the routing algorithm.

2. If connection-oriented service is used, a path from the source router to the destination router must be established before any data packets can be sent. This connection is called a virtual circuit (VC), and the subnet is called a virtual-circuit subnet. See Figure 3.

Figure 3. Routing within a virtual-circuit subnet.

For connection-oriented service, we need a virtual-circuit subnet. The idea behind virtual circuits is to avoid having to choose a new route for every packet sent, unlike in Figure 2. Instead, when a connection is established, a route from the source machine to the destination machine is chosen as part of the connection setup and stored in tables inside the routers. That route is used for all traffic flowing over the connection, exactly the same way that the telephone system works. When the connection is released, the virtual circuit is also terminated. In Figure 3, host H1 has established connection 1 with host H2.

The connection that H1 established with H2 is remembered as the first entry in each of the routing tables. The first line of A's table says that if a packet bearing connection identifier 1 comes in from H1, it is to be sent to router C and given connection identifier 1. Similarly, the first entry at C routes the packet to E, also with connection identifier 1.

Let us consider what happens if H3 also wants to establish a connection to H2. It chooses connection identifier 1 (because it is initiating the connection and this is its only connection) and tells the subnet to establish the virtual circuit. This leads to the second row in the tables. Note that we have a conflict here because although A can easily distinguish connection 1 packets from H1 from connection 1 packets from H3, C cannot do this. For this reason, A assigns a different connection identifier to the outgoing traffic for the second connection. Avoiding conflicts of this kind is why routers need the ability to replace connection identifiers in outgoing packets. In some contexts, this is called label switching.

3. Comparison of Virtual Circuit and Datagram Subnets

Inside the subnet, several trade-offs exist between virtual circuits and datagrams. One trade-off is between router memory space and bandwidth. Virtual circuits allow packets to contain circuit numbers instead of full destination addresses. If the packets tend to be quite short, a full destination address in every packet may represent a significant amount of overhead and hence wasted bandwidth. The price paid for using virtual circuits internally is the table space within the routers. Depending upon the relative cost of communication circuits versus router memory, one or the other may be cheaper.

Another trade-off is setup time versus address parsing time. Using virtual circuits requires a setup phase, which takes time and consumes resources.
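
The connection-identifier replacement described for Figure 3 can be sketched as a table keyed by (incoming line, incoming identifier). This is a minimal illustration, not a router implementation: the names `TABLES`, `switch`, and `trace` are hypothetical, only routers A and C are modeled, and the outgoing identifier 2 for the H3 connection is an assumed value, since the text says only that A assigns a different identifier.

```python
# (incoming line, incoming connection id) -> (outgoing line, outgoing connection id)
VcTable = dict[tuple[str, int], tuple[str, int]]

TABLES: dict[str, VcTable] = {
    "A": {
        ("H1", 1): ("C", 1),   # first connection: H1 to H2
        ("H3", 1): ("C", 2),   # second connection relabeled (2 is an assumed value)
    },
    "C": {
        ("A", 1): ("E", 1),
        ("A", 2): ("E", 2),
    },
}


def switch(router: str, came_from: str, conn_id: int) -> tuple[str, int]:
    """Return (outgoing line, new connection id) for a packet at one router."""
    return TABLES[router][(came_from, conn_id)]


def trace(source: str, conn_id: int, path: list[str]) -> list[tuple[str, int]]:
    """Follow a packet through the routers in path, recording each relabeling."""
    hops = []
    previous = source
    for router in path:
        next_hop, conn_id = switch(router, previous, conn_id)
        hops.append((next_hop, conn_id))
        previous = router
    return hops


print(trace("H1", 1, ["A", "C"]))   # [('C', 1), ('E', 1)]
print(trace("H3", 1, ["A", "C"]))   # [('C', 2), ('E', 2)]
```

Both hosts start with identifier 1, but after A relabels the second connection, C sees two distinct keys and can tell the circuits apart. The lookup itself is a single table index per packet, which is the cheap per-packet processing that virtual circuits buy at the cost of setup and table space.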

Once a virtual circuit has been set up, however, figuring out what to do with a data packet in a virtual-circuit subnet is easy: the router just uses the circuit number to index into a table to find out where the packet goes. In a datagram subnet, a more complicated lookup procedure is required to locate the entry for the destination. A comparison of datagram and virtual-circuit subnets is given in Table 1.

Table 1. Comparison of Virtual-Circuit and Datagram Subnets

4. Routing algorithm

The main function of the network layer is routing packets from the source machine to the destination machine. In most subnets, packets will require multiple hops to make the journey. The only notable exception is for broadcast networks. The algorithms that choose the routes and the data structures that they use are a major area of network layer design.

The routing algorithm is that part of the NL software responsible for deciding which output line an incoming packet should be transmitted on. If the subnet uses datagrams internally, this decision must be made for every arriving data packet, since the best route may have changed since last time. If the subnet uses virtual circuits internally, routing decisions are made only when a new virtual circuit is being set up. Thereafter, data packets just follow the previously established route. The latter case is sometimes called session routing because a route remains in force for an entire user session.

It is sometimes useful to make a distinction between routing, which is deciding which routes to use, and forwarding, which is what happens when a packet arrives. One can think of a router as having two processes inside it. One of them handles each packet as it arrives, looking up the outgoing line to use for it in the routing tables. This process is forwarding. The other process is responsible for filling in and updating the routing tables. That is where the routing algorithm comes into play.

The main requirements for a routing algorithm are correctness, simplicity, robustness, stability, fairness, and optimality.

In particular cases, trade-offs must be made between these requirements.

Routing algorithms can be grouped into two major classes: nonadaptive and adaptive. Nonadaptive algorithms do not base their routing decisions on measurements or estimates of the current traffic and topology. Instead, the choice of the route to use to get from I to J (for all I and J) is computed in advance, off-line, and downloaded to the routers when the network is booted. This procedure is sometimes called static routing.

Adaptive algorithms, in contrast, change their routing decisions to reflect changes in the topology, and usually the traffic as well. Adaptive algorithms differ in where they get their information (locally, from adjacent routers, or from all routers), when they change the routes (every T sec, when the load changes, or when the topology changes), and what metric is used for optimization (distance, number of hops, or estimated transit time
