Week 10: Transport Protocols, UDP, TCP

Orientation
• We move one layer up and look at the transport layer across the Internet.
[Figure: protocol stack. User processes (application layer) sit on top of TCP and UDP (transport layer), which run over IP (IP layer), which runs over network protocols and media, e.g., Ethernet.]

Orientation
• TCP and UDP are end-to-end protocols
• They are only implemented at the hosts
[Figure: two hosts, each running Application / TCP/UDP / IP over its own network protocols, connected through a router that implements only IP and the protocols of Network 1 and Network 2.]

Transport Protocols in the Internet
• The Internet supports 2 transport protocols
UDP - User Datagram Protocol
• datagram oriented
• unreliable, connectionless
• simple
• unicast and multicast
• useful for multimedia applications
• used for control protocols: network management (SNMP), routing (RIP), naming (DNS), etc.
TCP - Transmission Control Protocol
• stream oriented
• reliable, connection-oriented
• complex
• only unicast
• used for data applications: web (http), email (smtp), file transfer (ftp), SecureCRT, etc.

UDP - User Datagram Protocol
• UDP extends the host-to-host delivery service of IP to an application process-to-application process delivery service
• It does this by multiplexing and demultiplexing packets from multiple application-to-application communication sessions
[Figure: applications over UDP over IP on two end hosts, connected through a chain of IP routers.]

UDP packet format
IP header (20 bytes) | UDP header (8 bytes) | UDP data (payload)
UDP header layout (bits 0-15 | bits 16-31):
Source Port Number | Destination Port Number
UDP message length | Checksum
• Port numbers identify sending and receiving applications (processes). Maximum port number is 2^16 - 1 = 65,535
• Message Length is the length of UDP header and data in bytes. It is between 8 bytes (i.e., data field can be empty) and 65,535 bytes
• Checksum is for UDP header and UDP data

Port Numbers
• UDP (and TCP) use port numbers to identify applications
• Port numbers are 16 bits, so there are 65,535 usable UDP ports per host (1 to 65,535; port 0 is reserved).
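The UDP header described above is small enough to build and parse by hand. The following is a minimal sketch, not a full UDP implementation: the function names and the port numbers are illustrative, and the checksum is left at 0 as a placeholder rather than computed.

```python
import struct

# The 8-byte UDP header: four 16-bit fields in network (big-endian) byte order.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_header(src_port, dst_port, payload, checksum=0):
    """Pack a UDP header; the length field covers header plus data."""
    length = UDP_HEADER.size + len(payload)
    if length > 0xFFFF:
        raise ValueError("UDP message length must fit in 16 bits")
    # checksum=0 is a placeholder (assumption): no checksum is computed here.
    return UDP_HEADER.pack(src_port, dst_port, length, checksum)

def parse_udp_header(datagram):
    """Return (src_port, dst_port, length, checksum) from raw UDP bytes."""
    return UDP_HEADER.unpack_from(datagram)

payload = b"hello"
header = build_udp_header(5353, 53, payload)    # illustrative port numbers
src, dst, length, checksum = parse_udp_header(header + payload)
print(src, dst, length, checksum)               # prints: 5353 53 13 0
```

With an empty payload the length field comes out as 8, which is the minimum message length stated on the slide.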
[Figure: incoming packets are demultiplexed at IP based on the Protocol field in the IP header (to TCP or UDP), and then at TCP or UDP based on port number (to the user processes).]

TCP
• Service offered by TCP
• TCP Header
• TCP Connection Establishment and Termination
• Flow control
• Error control
• Congestion control

TCP = Transmission Control Protocol
• Provides a reliable unicast end-to-end byte stream over an unreliable internetwork.
[Figure: a byte stream enters and leaves TCP at each end; TCP runs over IP across the internetwork.]

TCP is reliable
• Byte stream is broken up into chunks which are called segments
• Detecting errors:
  – TCP has checksums for header and data. Segments with invalid checksums are discarded
  – Each segment that is transmitted has a sequence number.
  – Receiver sends acknowledgments (ACKs) for segments
  – Sender maintains a timer. An ACK is expected before the timer times out
• Correcting errors:
  – Lost or errored segments are retransmitted.
  – Selective repeat ARQ scheme
  – Cumulative ACKs

Byte Stream Service
• To the lower layers, TCP handles data in "segments"
• To the higher layers, TCP handles data as a sequence of bytes and does not identify boundaries between bytes
• So: Higher layers do not know about the beginning and end of segments!
[Figure: the sending application performs (1) write 100 bytes, (2) write 20 bytes into the TCP queue of bytes to be transmitted; TCP carries the bytes in segments; the receiving application performs (1) read 40 bytes, (2) read 40 bytes, (3) read 40 bytes from the TCP queue of bytes that have been received.]

TCP Format
• TCP segments have a header of 20 bytes plus options, followed by >= 0 data bytes
IP header (20 bytes) | TCP header (20 bytes) | TCP data
TCP header layout (each row is 32 bits; bits 0-15 | bits 16-31):
Source Port Number | Destination Port Number
Sequence number (32 bits)
Acknowledgment number (32 bits)
header length (4 bits), reserved (6 bits), Flags | window size
TCP checksum | urgent pointer
Options (if any)
DATA (optional)

TCP header fields - Port Numbers
Port Number:
• A port number identifies the endpoint of a connection.
• A pair <IP address, port number> identifies one endpoint of a connection.
• Two pairs <client IP address, client port number> and <server IP address, server port number> identify a TCP connection.
[Figure: applications on one host attached to TCP ports 23, 80 and 104, and on the other host to ports 7, 80 and 16; TCP runs over IP on both hosts.]

TCP header fields - Sequence Number
Sequence Number (SeqNo):
• Sequence number is 32 bits long.
• So the range of SeqNo is 0 <= SeqNo <= 2^32 - 1 (about 4.3 Gbyte)
• Each sequence number identifies the byte in the stream of data from the sending TCP to the receiving TCP that the first byte of data in this segment represents.
• Initial Sequence Number (ISN) of a connection is set during connection establishment
• Example: Segment 1 carries bytes 1 to 500 (Seq. No. 1), Segment 2 carries bytes 501 to 1000 (Seq. No. 501), Segment 3 carries bytes 1001 to 1500 (Seq. No. 1001)

TCP header fields - Ack. No.
Acknowledgment Number (AckNo):
• Acknowledgments are piggybacked, i.e., a segment from A -> B contains an acknowledgement for a segment sent in the B -> A direction
• The AckNo in the B -> A segment header contains the SeqNo for the next segment expected at B for the A -> B flow
• Example: The acknowledgment for a 1500-byte segment with the sequence number 0 is AckNo=1500
• A host uses the AckNo field to send acknowledgements.
• If a host sends an AckNo in a segment, it sets the "ACK flag"

TCP header fields - Ack. No. Contd.
Example: Sender sends two segments with bytes "1..1500" and "1501..3000", but receiver only gets the second segment.
• What is the sequence number of the first segment?
• What is the sequence number of the second segment?
• What is the ACK number sent in response by the receiver when it receives the second segment?

TCP header fields - Header Length
Header Length (4 bits):
• Length of header in 32-bit words
• Note that TCP header has variable length (minimum of 20 bytes)

TCP header fields - Flags
Flag bits:
• URG: Urgent pointer is valid
  – If the bit is set, the following bytes contain an urgent message in the range: SeqNo <= urgent message <= SeqNo + urgent pointer
• ACK: Acknowledgement Number is valid
• PSH: PUSH Flag
  – Notification from sender to the receiver that the receiver should pass all data that it has to the application as soon as possible.
  – Normally set by sender when the sender's buffer is empty (so TCP does not wait expecting more data)

TCP header fields - Flags Contd.
Flag bits:
• RST: Reset the connection
  – The flag causes the receiver to reset the connection
  – Receiver of a RST terminates the connection and informs the higher-layer application of the reset
• SYN: Synchronize sequence numbers
  – Sent in the first packet when opening a connection
• FIN: Sender is finished with sending
  – Used for closing a connection
  – Both sides of a connection must send a FIN

TCP header fields
Window Size:
• Each side of the connection advertises its receiving window size
• Window size is the maximum number of bytes that a receiver can accept.
• Maximum window size is 2^16 - 1 = 65,535 bytes
TCP Checksum:
• TCP checksum covers both TCP header and TCP data
Urgent Pointer:
• Only valid if URG flag is set

TCP header fields - Options
Options - a few examples:
• End of Options: kind=0 (1 byte)
• NOP (no operation): kind=1 (1 byte)
• Maximum Segment Size: kind=2 (1 byte), len=4 (1 byte), maximum segment size (2 bytes)

TCP header fields
Options:
• NOP is used to pad TCP header to a multiple of 4 bytes
• Maximum Segment Size:
  – Sets the maximum length of the segments
  – This option can only appear in a SYN segment

Connection Management in TCP
• Opening a TCP Connection
• Closing a TCP Connection
• Special Scenarios
• State Diagram

TCP Connection Establishment
• TCP uses a three-way handshake to open a connection:
(1) ACTIVE OPEN: Client sends a segment with
  – SYN bit set
  – port number of client, port number of server
  – initial sequence number (ISN) of client
(2) PASSIVE OPEN: Server responds with a segment with
  – SYN bit set
  – initial sequence number of server
  – ACK for ISN of client
(3) Client acknowledges by sending a segment with:
  – ACK for ISN of server

Three-Way Handshake
aida.poly.edu -> mng.poly.edu: SYN (SeqNo = x)
mng.poly.edu -> aida.poly.edu: SYN (SeqNo = y, AckNo = x + 1)
aida.poly.edu -> mng.poly.edu: ACK (AckNo = y + 1)

A Closer Look with tcpdump
aida issues a "telnet mng"
1 aida.poly.edu.1121 > mng.poly.edu.telnet: S 1031880193:1031880193(0) win 16384 <mss 1460,nop,wscale 0,nop,nop,timestamp>
2 mng.poly.edu.telnet > aida.poly.edu.1121: S 172488586:172488586(0) ack 1031880194 win 8760 <mss 1460>
3 aida.poly.edu.1121 > mng.poly.edu.telnet: . ack 172488587 win 17520
4 aida.poly.edu.1121 > mng.poly.edu.telnet: P 1031880194:1031880218(24) ack 172488587 win 17520
5 mng.poly.edu.telnet > aida.poly.edu.1121: P 172488587:172488590(3) ack 1031880218 win 8736
6 aida.poly.edu.1121 > mng.poly.edu.telnet: P 1031880218:1031880221(3) ack 172488590 win 17520

Three-Way Handshake
[Figure: time-sequence diagram of segments 1 to 3 of the trace above, exchanged between aida.poly.edu and mng.poly.edu.]

First data segment sequence number
• Note that the data segment following the three-way handshake will start with the sequence number following that of the SYN segment. In the trace above, the SYN carries 1031880193 and the first data segment starts at 1031880194.

Why start with a new ISN
• The problem with starting off each connection with a sequence number of 1 is that it introduces the possibility of segments from different connections getting mixed up.
• Traditionally, each device chose the ISN by making use of a timed counter, like a clock of sorts, that was incremented every 4 microseconds.
• This counter was initialized when TCP started up and then its value increased by 1 every 4 microseconds until it reached the largest 32-bit value possible (4,294,967,295), at which point it "wrapped around" to 0 and resumed incrementing.
• Period: between 4 and 5 hours (2^32 x 4 microseconds is about 4.8 hours)

TCP Connection Termination
• Each end of the data flow must be shut down independently ("half-close")
• If one end is done it sends a FIN segment. This means that no more data will be sent
• Four steps involved:
(1) X sends a FIN to Y (active close)
(2) Y ACKs the FIN (at this time Y can still send data to X)
(3) Y sends a FIN to X (passive close)
(4) X ACKs the FIN.

Connection termination with tcpdump
1 mng.poly.edu.telnet > aida.poly.edu.1121: F 172488734:172488734(0) ack 1031880221 win 8733
2 aida.poly.edu.1121 > mng.poly.edu.telnet: . ack 172488735 win 17484
3 aida.poly.edu.1121 > mng.poly.edu.telnet: F 1031880221:1031880221(0) ack 172488735 win 17520
4 mng.poly.edu.telnet > aida.poly.edu.1121: . ack 1031880222 win 8733

TCP Connection Termination
[Figure: time-sequence diagram of the four termination segments above, exchanged between aida.poly.edu and mng.poly.edu.]

TCP Half-close
[Figure: segment sequence for a half-close, in order: FIN; ACK of FIN; DATA; ACK of DATA; FIN; ACK of FIN.]

MSS
[Figure: hosts A, B and C on networks with MTU = 1500 and MTU = 296; the SYN segments advertise <mss 1460> and <mss 256>. Each advertised MSS is the MTU minus 40 bytes of IP and TCP headers.]
• Default is generally 536 bytes

Difference between TCP connections and connections in a connection-oriented network
• TCP "connections" are not the same as connections in a connection-oriented network
• In a connection-oriented network, a signaling procedure is used to reserve bandwidth for the connection on every link of the end-to-end path (e.g., circuit-switched networks)
• A TCP connection involves the maintenance of state information at the end hosts
  – Purpose is to provide error correction for TCP segments
  – Initial sequence number exchanged to avoid accidentally sending data to an old connection

TCP flow control
• Flow Control: How to prevent the sender from overrunning the receiver buffer?
• Flow Control in TCP
  – TCP implements sliding window flow control
  – Window size is usually sent within acknowledgements.
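The acknowledgment numbers in the connection setup and teardown traces earlier in these notes follow one rule: the AckNo is the sender's SeqNo plus the number of data bytes, and a SYN or a FIN each counts as one additional byte. The sketch below checks that rule against values copied from the traces; the function name is illustrative.

```python
SEQ_MODULUS = 2 ** 32  # sequence numbers are 32 bits and wrap around

def next_expected(seq_no, data_len=0, syn=False, fin=False):
    """AckNo that acknowledges a segment; SYN and FIN each consume one number."""
    consumed = data_len + (1 if syn else 0) + (1 if fin else 0)
    return (seq_no + consumed) % SEQ_MODULUS

# Values copied from the tcpdump traces in these notes.
print(next_expected(1031880193, syn=True))      # client SYN   -> 1031880194
print(next_expected(172488586, syn=True))       # server SYN   -> 172488587
print(next_expected(1031880194, data_len=24))   # 24 data bytes -> 1031880218
print(next_expected(172488734, fin=True))       # server FIN   -> 172488735
```

Each printed value matches the "ack" field of the segment that answers it in the corresponding trace.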
Window Management in TCP
• The receiver returns two parameters to the sender in an ACK:
  – AckNo (32 bits)
  – window size (win) (16 bits)
• The interpretation is:
  – I am ready to receive new data with SeqNo = AckNo, AckNo+1, ..., AckNo+Win-1
• Receiver can acknowledge data without opening the window
• Receiver can change the window size without acknowledging data

TCP Flow Control
• Receive side of TCP connection has a receive buffer
• App process may be slow at reading from buffer
• Flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
• Speed-matching service: matching the send rate to the receiving app's drain rate

TCP Flow control: how it works
(Suppose TCP receiver discards out-of-order segments)
• Spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  – guarantees receive buffer doesn't overflow

Sliding windows
[Figure: a byte sequence 1, 2, 3, ..., 11, ... divided into four regions: sent and acknowledged; sent, not acked; usable window (can send ASAP); and can't send until window moves. The offered window advertised by the receiver spans the sent-not-acked bytes and the usable window.]

Sliding Window: Example
Receiver buffer: 4K, initially empty.
(1) Sender sends 2K of data (SeqNo=0). Receiver buffer holds 2K.
(2) Receiver replies AckNo=2048, Win=2048.
(3) Sender sends 2K of data (SeqNo=2048). Receiver buffer is full (4K).
(4) Receiver replies AckNo=4096, Win=0. Sender blocked.
(5) Receiver buffer holds 3K. Receiver sends AckNo=4096, Win=1024.

Sliding Window: In-class example
NOTATION: 1:1025(1024)
• Sequence number: Is 1025 carried in TCP header? Is 1024 carried in TCP header? What is 1024?
Receiver has a 4K-byte buffer and advertises win 4096.
Sender sends 1:1025(1024).
• How many more segments can it send now? 3 segments
Sender sends 1025:2049(1024), 2049:3073(1024) and 3073:4097(1024), 4K bytes in total.
Receiver, with 1K in its buffer, returns ack 1025 win 3072.
• How many segments can it send now?
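Questions like the ones in the in-class example come down to one subtraction: the right edge of the offered window (AckNo + Win) minus the next sequence number the sender would use. The following is a small sketch under simplifying assumptions: sequence-number wraparound is ignored, and the function and parameter names are illustrative.

```python
def usable_window(ack_no, win, next_seq):
    """Bytes the sender may still transmit without a new ACK.

    ack_no   -- AckNo from the latest ACK (left edge of the offered window)
    win      -- window size advertised in that ACK
    next_seq -- sequence number of the next new byte the sender would send
    """
    right_edge = ack_no + win            # first byte the receiver will not accept
    return max(0, right_edge - next_seq)

SEGMENT = 1024  # segment size used in the in-class example

# Window of 4096 starting at byte 1, with segment 1:1025 already sent.
print(usable_window(1, 4096, 1025) // SEGMENT)      # prints 3
# After "ack 1025 win 3072", with bytes 1..4096 already sent.
print(usable_window(1025, 3072, 4097) // SEGMENT)   # prints 0
```

The second call shows why the ACK does not help the sender here: the window's right edge is still 4097, exactly where the sender already stopped.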
Answer: 0

Silly Window Syndrome
• Let's say that the server is only able to remove 1 byte of data from the buffer for every 3 it receives.
• Let's say it also removes 40 additional bytes from the buffer during the time it takes for the client's next segment to arrive.
• In the worst case, the client then sends a segment with exactly one byte, refilling the buffer until the application draws off the next byte.

TCP error control
• ARQ scheme with positive cumulative ACKs
• Delayed ACKs: TCP delays transmission of ACKs for up to 200 ms
  – The hope is to have data ready in that time frame. Then, the ACK can be piggybacked with the data segment.

Delayed ACK timer
• This timer ticks every 200 ms.
• First timeout occurs based on when the timer was initialized, which is when the system was rebooted, not on when a segment arrives.
• A segment that arrives between two ticks is ACKed at the next tick at the latest, which is why the delay for the ACK is UP TO 200 ms (and not equal to 200 ms); the figure illustrates this.
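The "up to 200 ms" behaviour of the delayed ACK timer can be illustrated with a few lines of arithmetic. This is a sketch under stated assumptions, not a model of any particular TCP stack: ticks are assumed to fall at 0, 200, 400, ... ms on a clock that started when the timer was initialized, and a segment arriving exactly on a tick is assumed to wait a full period.

```python
TICK_MS = 200  # delayed ACK timer period from the slide

def delayed_ack_wait(arrival_ms, tick_ms=TICK_MS):
    """Milliseconds from a segment's arrival until the next timer tick."""
    # Assumption: ticks are at multiples of tick_ms, independent of arrivals.
    return tick_ms - (arrival_ms % tick_ms)

for arrival in (10, 150, 399, 400):
    print(arrival, delayed_ack_wait(arrival))
# prints:
# 10 190
# 150 50
# 399 1
# 400 200
```

The wait depends only on where in the tick interval the segment happens to arrive, so it ranges from almost nothing to the full 200 ms.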
[Figure: timeline with ticks 1 to 12 at 200 ms per tick. TCP receives a segment somewhere between two ticks; the delayed ACK timer expires at the next tick, and the ACK has to be sent at this point whether or not the TCP buffer has received data to enable piggybacking.]

TCP Retransmission Timer
Retransmission Timer:
• The setting of the retransmission timer is crucial for efficiency
  – Timeout value too small -> results in unnecessary retransmissions
  – Timeout value too large -> long waiting time before a retransmission can be issued
• A problem is that the delays in the network are not fixed
• Therefore, the retransmission timers must be adaptive

Measuring TCP Retransmission Timers
ftp session from aida.poly.edu to rigoletto.poly.edu
• Transfer file from aida to rigoletto
• Unplug Ethernet cable in the middle of file transfer

tcpdump Trace
10:42:01.704681 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:01.705603 aida.40001 > rigoletto.ftp-data: . 162649:164109(1460) ack 1 win 17520
10:42:01.706753 aida.40001 > rigoletto.ftp-data: . 164109:165569(1460) ack 1 win 17520
10:42:02.741764 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:05.741788 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:11.741828 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:23.741951 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:47.742176 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:43:35.742587 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:44:39.743140 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:45:43.743702 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:46:47.744271 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:47:51.752138 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:48:55.745547 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:49:59.746123 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:51:03.745839 aida.40001 > rigoletto.ftp-data: R 165569:165569(0) ack 1 win 17520

Interpreting the Measurements
• The interval between retransmission attempts in seconds is: 1.03, 3, 6, 12, 24, 48, 64, 64, 64, 64, 64, 64, 64.
• Time between retransmissions is doubled each time (Exponential Backoff Algorithm)
• Timer is not increased beyond 64 seconds
• TCP gives up after 13th attempt and 9 minutes (total timeout; tcp_ip_abort_interval is 2 mins in Solaris and can be programmed by administrator; 9 mins is the commonly used old timeout value)
[Figure: plot of Seconds (0 to 600) against Transmission Attempts (0 to 12).]

TCP timers
• First timeout occurs based on when timer was initialized. This explains why the first timeout occurs at 1.03 sec and not 1.5.
• If the base timer clock is 500 ms, the first timeout occurs after 3 timer ticks. This happens to occur at 1.03 sec after first segment was sent.
• Subsequent retransmissions occur at 3 sec, 6 sec, 12 sec, etc.
[Figure: timeline with ticks 1 to 12 at 500 ms per tick. TCP sends the first segment somewhere between two ticks; the retransmission timer expires after three ticks (<1.5 sec; in this case it happens to be 1.03 sec), and the next retransmission timer expires after six ticks (3 sec).]

Adaptive mechanism
• The retransmission mechanism of TCP is adaptive
• The retransmission timers are set based on round-trip time (RTT) measurements that TCP performs
• The RTT is based on time difference between segment transmission and ACK
• But: TCP does not ACK each segment
• Each connection has only one timer
• Can't start a second RTT measurement if timing on one segment is in progress
[Figure: time-sequence diagram with Segments 1 to 5 and their ACKs (including one ACK for Segments 2 + 3), marking RTT #1, RTT #2 and RTT #3.]

Computation of RTO in adaptive scheme
• Retransmission timer is set to a Retransmission Timeout (RTO) value.
• RTO is calculated based on the RTT measurements.
• The RTT measurements are smoothed by the following estimators A (mean RTT value) and D (smoothed mean deviation of RTT), where M is the newly measured RTT:
  $\mathrm{Err} = M - A$
  $A \leftarrow A + g\,\mathrm{Err} = A(1-g) + gM$
  $D \leftarrow D + h\,(|\mathrm{Err}| - D) = D(1-h) + h\,|\mathrm{Err}|$
  $\mathrm{RTO} = A + 4D$
• The gains are set to h=1/4 and g=1/8
  – In the formula for computing the new smoothed mean RTT A, 0.125 times the newly measured value (M) is added to 0.875 times the old smoothed value of A

In-class example
Assume A=1, D=1 (initial values)
[Figure: time-sequence diagram. Segment 1 is sent and its ACK returns with RTT = 2; Segment 2 is sent and lost (X), then retransmitted and acknowledged. The RTO is to be filled in ("RTO = ?") at each step.]

Example of RTO computation (adaptive)
Assume A=1, D=1 (initial values)
• Err = 2 - 1 = 1 (since M, the measured RTT, is 2)
• A = 1 + 0.125 × 1 = 1.125; D = 1 + 0.25 (1 - 1) = 1
• RTO = A + 4D = 1.125 + 4 = 5.125
• This is why, in the figure, when segment 2 is lost it is retransmitted after 5.125 sec.
[Figure: Segment 1 is sent and acknowledged with RTT = 2; Segment 2 is lost (X) and retransmitted after RTO = 5.125, then acknowledged.]

In-class example answers
Assume A=1, D=1 (initial values)
• Initially: RTO = A + 4D = 5
• After the ACK for Segment 1 (RTT = 2): RTO = A + 4D = 5.125 (adaptive: new A = 1.125; D = 1)
• Segment 2 is lost (X) and is retransmitted after 5.125 sec, since that is the retransmission timer value. RTO = 10.25 (doubling)
• After the ACK for the retransmitted Segment 2: RTO = 10.25 (Karn's algorithm)

Karn's Algorithm
• If an ACK for a retransmitted segment is received, the sender cannot tell if the ACK belongs to the original or the retransmission.
• The RTT measurement started for the original transmission should be terminated.
• There will be no RTT measurement for the original or retransmitted segment
• Therefore A and D cannot be updated when the ACK is received, and hence no new RTO computation at this point.
• Don't confuse this with the RTO being doubled when the segment is retransmitted following the exponential doubling rule.
• On a timeout:
  – RTT measurement is suspended
  – RTO is doubled
[Figure: a segment times out and is retransmitted; when the ACK arrives, the RTT is ambiguous ("RTT ?") for both the original transmission and the retransmission.]

In-class example
• At t1: RTO = 6 sec; A = 2; D = 1
• At t2: RTO = ?
• At t3: RTO = ?
[Figure: time-sequence diagram over times t1 to t9 showing SYN, SYN + ACK and ACK, then Segments 1 to 6 with their ACKs, a timeout, a 3 sec interval, and RTT #1, RTT #2 and RTT #3.]

In-class example answers
• At t1: RTO = 6 sec; A = 2; D = 1
• At t2: Timeout (after 6 sec). RTO = 12 sec (doubling)
• At t3: RTO = 12 sec (Karn's algorithm)

Thus there are two schemes for determining RTO and two schemes for controlling RTT measurement
RTO:
• Exponential backoff if a segment is retransmitted
• Adaptive: RTO as a function of RTT (A + 4D)
RTT measurement:
• If an RTT measurement is in progress and a new segment is sent, then no RTT measurement is taken for the new segment (can't start a second RTT measurement if timing on one segment is in progress)
• Karn's algorithm: no RTT measurement on retransmitted segment
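The two RTO schemes and Karn's algorithm summarized above can be combined in a few lines. The following is a minimal sketch rather than the code of any real TCP stack: the class and method names are illustrative, only one segment is assumed to be timed at a time, and the 64-second cap is taken from the measured trace earlier in these notes.

```python
class RtoEstimator:
    """Adaptive RTO (A + 4D) with exponential backoff and Karn's rule."""

    G = 1 / 8        # gain g for the smoothed mean A
    H = 1 / 4        # gain h for the smoothed mean deviation D
    MAX_RTO = 64.0   # cap observed in the trace; treated as an assumption

    def __init__(self, a, d):
        self.a = a
        self.d = d
        self.rto = a + 4 * d
        self.retransmitted = False

    def on_timeout(self):
        """Segment is retransmitted: double the RTO, suspend RTT sampling."""
        self.rto = min(2 * self.rto, self.MAX_RTO)
        self.retransmitted = True

    def on_ack(self, measured_rtt):
        """ACK received; measured_rtt is M for the segment being timed."""
        if self.retransmitted:
            # Karn's algorithm: the sample is ambiguous, so A, D and the
            # backed-off RTO are left unchanged.
            self.retransmitted = False
            return
        err = measured_rtt - self.a
        self.a += self.G * err
        self.d += self.H * (abs(err) - self.d)
        self.rto = self.a + 4 * self.d

est = RtoEstimator(a=1.0, d=1.0)
print(est.rto)       # 5.0    initial RTO = A + 4D
est.on_ack(2.0)
print(est.rto)       # 5.125  adaptive update with M = 2
est.on_timeout()
print(est.rto)       # 10.25  doubling on retransmission
est.on_ack(7.0)
print(est.rto)       # 10.25  Karn's algorithm: sample ignored
```

Starting instead from `RtoEstimator(a=2.0, d=1.0)` gives an RTO of 6, and one `on_timeout()` call gives 12, matching the second in-class example.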