L09_TCP_IP - Interactive Computing Lab
Chapter 3
Transport Layer

A note on the use of these ppt slides:
We’re making these slides freely available to all (faculty, students, readers). They’re in PowerPoint form so you can add, modify, and delete slides (including this one) and slide content to suit your needs. They obviously represent a lot of work on our part. In return for use, we only ask the following:
• If you use these slides (e.g., in a class) in substantially unaltered form, that you mention their source (after all, we’d like people to use our book!)
• If you post any slides in substantially unaltered form on a www site, that you note that they are adapted from (or perhaps identical to) our slides, and note our copyright of this material.

Computer Networking: A Top Down Approach, 5th edition. Jim Kurose, Keith Ross. Addison-Wesley, April 2009.

Thanks and enjoy! JFK/KWR
All material copyright 1996-2009 J.F Kurose and K.W. Ross, All Rights Reserved

Transport Layer 3-1

Chapter 3: Transport Layer
Our goals:
• understand principles behind transport layer services:
  • multiplexing/demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• learn about transport layer protocols in the Internet:
  • UDP: connectionless transport
  • TCP: connection-oriented transport
  • TCP congestion control

Transport services and protocols
• provide logical communication between app processes running on different hosts
• transport protocols run in end systems
  • send side: breaks app messages into segments, passes to network layer
  • rcv side: reassembles segments into messages, passes to app layer
• more than one transport protocol available to apps
  • Internet: TCP and UDP
[Figure: protocol stacks (application, transport, network, data link, physical) on two end systems]

Transport vs.
network layer
• network layer: logical communication between hosts
• transport layer: logical communication between processes
  • relies on, enhances, network layer services
Household analogy: 12 kids sending letters to 12 kids
• processes = kids
• app messages = letters in envelopes
• hosts = houses
• transport protocol = Ann and Bill
• network-layer protocol = postal service

Internet transport-layer protocols
• reliable, in-order delivery (TCP)
  • congestion control
  • flow control
  • connection setup
• unreliable, unordered delivery: UDP
  • no-frills extension of “best-effort” IP
• services not available:
  • delay guarantees
  • bandwidth guarantees
[Figure: protocol stacks along a path; the end systems show application through physical layers, the intermediate nodes show network, data link, physical]

Chapter 3 outline
3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  • segment structure
  • reliable data transfer
  • flow control
  • connection management
3.6 Principles of congestion control
3.7 TCP congestion control

Multiplexing/demultiplexing
• Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)
• Demultiplexing at rcv host: delivering received segments to correct socket
[Figure: host 1, host 2, host 3, each with application, transport, network, link, physical layers; processes P1, P2, P3, P4 attach to the transport layer through sockets]

How demultiplexing works
• host receives IP datagrams
  • each datagram has source IP address, destination IP address
  • each datagram carries 1 transport-layer segment
  • each segment has source, destination port number
• host uses IP addresses & port numbers to direct segment to appropriate socket
[Figure: TCP/UDP segment format, 32 bits wide: source port #, dest port #, other header fields, application data (message)]

UDP: User Datagram Protocol [RFC 768]
• “no frills,” “bare bones” Internet transport protocol
• “best effort” service, UDP segments may be:
  • lost
  • delivered out of order to app
• connectionless:
  • no handshaking between UDP sender, receiver
  • each UDP segment handled independently of others
Why is there a UDP?
• no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• small segment header
• no congestion control: UDP can blast away as fast as desired

UDP: more
• often used for streaming multimedia apps
  • loss tolerant
  • rate sensitive
• other UDP uses
  • DNS
  • SNMP
• reliable transfer over UDP: add reliability at application layer
  • application-specific error recovery!
• length field: length, in bytes, of UDP segment, including header
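The header fields named on these slides (source port, destination port, length, checksum) can be laid out in a few lines of code. The following is a minimal sketch, not a real UDP implementation: the class and method names are hypothetical, the port numbers in `main` are example values, the four fields are assumed to be 16 bits each as in RFC 768, and the checksum is simply left at zero rather than computed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: names are hypothetical, checksum is not computed.
class UdpHeaderSketch {
    // Four 16-bit fields: source port, dest port, length, checksum.
    static final int HEADER_BYTES = 8;

    /** Builds header + payload. Assumes the whole segment fits in 65535 bytes. */
    static byte[] encode(int srcPort, int dstPort, byte[] payload) {
        int length = HEADER_BYTES + payload.length; // length field includes the header
        ByteBuffer buf = ByteBuffer.allocate(length); // ByteBuffer is big-endian by default
        buf.putShort((short) srcPort);
        buf.putShort((short) dstPort);
        buf.putShort((short) length);
        buf.putShort((short) 0); // checksum left as 0 in this sketch
        buf.put(payload);
        return buf.array();
    }

    /** Destination port: the field a receiving host would use for demultiplexing. */
    static int destPort(byte[] segment) {
        return ByteBuffer.wrap(segment).getShort(2) & 0xFFFF;
    }

    /** Length field: bytes in the segment, header included. */
    static int length(byte[] segment) {
        return ByteBuffer.wrap(segment).getShort(4) & 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] seg = encode(5000, 53, "hello".getBytes(StandardCharsets.US_ASCII));
        System.out.println("dest port = " + destPort(seg) + ", length = " + length(seg));
    }
}
```

The `& 0xFFFF` mask is needed because Java's `short` is signed; without it, ports above 32767 would read back as negative numbers.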
[Figure: UDP segment format, 32 bits wide: source port #, dest port #, length, checksum, application data (message)]

TCP: Overview
RFCs: 793, 1122, 1323, 2018, 2581
• point-to-point:
  • one sender, one receiver
• reliable, in-order byte stream:
  • no “message boundaries”
• pipelined:
  • TCP congestion and flow control set window size
• send & receive buffers
• full duplex data:
  • bi-directional data flow in same connection
  • MSS: maximum segment size
• connection-oriented:
  • handshaking (exchange of control msgs) init’s sender, receiver state before data exchange
• flow controlled:
  • sender will not overwhelm receiver
[Figure: application writes data through the socket door into the TCP send buffer; segments travel to the TCP receive buffer; application reads data through the socket door]

TCP segment structure
32 bits wide; fields in order:
• source port #, dest port #
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• head len, not used, flag bits U A P R S F, Receive window
• checksum, Urg data pnter
• Options (variable length)
• application data (variable length)
Field notes:
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• checksum: Internet checksum (as in UDP)
• Receive window: # bytes rcvr willing to accept

TCP Connection Management
Recall: TCP sender, receiver establish “connection” before exchanging data segments
• initialize TCP variables (discussed later):
  • seq. #s
  • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
  Socket clientSocket = new Socket("hostname", portNumber);
• server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

TCP Connection Management (cont.)
Opening a connection:
client opens socket: Socket cs = new Socket("hostname", 80);
Step 1: client host sends TCP SYN segment to server
  • specifies initial seq #
  • no data
Step 2: server host receives SYN, replies with SYN-ACK segment
  • server allocates buffers
  • specifies server initial seq. #
Step 3: client receives SYN-ACK, replies with ACK segment, which may contain data
[Figure: client/server timeline: open, data]

TCP Connection Management (cont.)
Closing a connection:
client closes socket: clientSocket.close();
Step 1: client end system sends FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
Step 3: client receives FIN, replies with ACK.
  • Enters “timed wait”: will respond with ACK to received FINs
Step 4: server receives ACK. Connection closed.
Note: why do we wait?
[Figure: client/server timeline: close, closing, timed wait, closed]

TCP seq. #’s and ACKs
Seq. #’s:
  • byte stream “number” of first byte in segment’s data
ACKs:
  • seq # of next byte expected from other side
  • cumulative ACK
[Figure: simple telnet scenario between Host A and Host B over time: user types ‘C’; host ACKs receipt of ‘C’, echoes back ‘C’; host ACKs receipt of echoed ‘C’]

Stop-and-wait (for reliability)
Sender/receiver timeline:
• first packet bit transmitted, t = 0
• last packet bit transmitted, t = L / R
• first packet bit arrives
• last packet bit arrives, send ACK
• ACK arrives, send next packet, t = RTT + L / R
$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$

Pipelined protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
• range of sequence numbers must be increased
• buffering at sender and/or receiver
Two generic forms of pipelined protocols: go-Back-N, selective repeat

Pipelining: increased utilization
Sender/receiver timeline:
• first packet bit transmitted, t = 0
• last bit transmitted, t = L / R
• first packet bit arrives
• last packet bit arrives, send ACK
• last bit of 2nd packet arrives, send ACK
• last bit of 3rd packet arrives, send ACK
• ACK arrives, send next packet, t = RTT + L / R
Increase utilization by a factor of 3!
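The factor of 3 follows directly from the stop-and-wait timeline: the cycle length RTT + L/R stays the same, while the sender transmits for three packet times instead of one. A minimal sketch of that arithmetic, assuming the slide's values L/R = 0.008 and RTT = 30 are expressed in the same time unit (the class and method names are illustrative, not from the slides):

```java
// Illustrative sketch of sender utilization for stop-and-wait vs. pipelining.
class UtilizationSketch {
    /**
     * Fraction of each cycle (RTT + one transmit time) the sender spends transmitting
     * when it sends packetsInFlight packets before waiting for the first ACK.
     */
    static double utilization(int packetsInFlight, double transmitTime, double rtt) {
        return (packetsInFlight * transmitTime) / (rtt + transmitTime);
    }

    public static void main(String[] args) {
        double lOverR = 0.008; // transmit time L/R, value taken from the slide
        double rtt = 30.0;     // round-trip time, same unit as lOverR (assumed)
        System.out.printf("stop-and-wait: %.5f%n", utilization(1, lOverR, rtt));
        System.out.printf("3 in flight:   %.5f%n", utilization(3, lOverR, rtt));
    }
}
```

Because only the numerator changes, the ratio between the two results is exactly the number of packets in flight, as long as the window is small enough that the sender still idles while waiting for the first ACK.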
$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$

TCP reliable data transfer
• TCP creates rdt service on top of IP’s unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses single retransmission timer
• Retransmissions are triggered by:
  • timeout events
  • duplicate acks
• Initially consider simplified TCP sender:
  • ignore duplicate acks
  • ignore flow control, congestion control

TCP sender events:
data rcvd from app:
• Create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval
timeout:
• retransmit segment that caused timeout
• restart timer
Ack rcvd:
• If acknowledges previously unacked segments
  • update what is known to be acked
  • start timer if there are outstanding segments

TCP: retransmission scenarios
[Figure: Host A/Host B timelines for the lost ACK scenario and the premature timeout scenario; Seq=92 timeout; SendBase = 100, SendBase = 120]

TCP retransmission scenarios (more)
[Figure: Host A/Host B timeline for the cumulative ACK scenario; SendBase = 120]

TCP ACK generation recommendation [RFC 1122, RFC 2581]
Event at receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK.

Event at receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at receiver: Arrival of out-of-order segment with higher-than-expected seq. #. Gap detected.
TCP receiver action: Immediately send duplicate ACK, indicating seq. # of next expected byte.

Event at receiver: Arrival of segment that partially or completely fills gap.
TCP receiver action: Immediately send ACK, provided that segment starts at lower end of gap.

Fast Retransmit
• Time-out period often relatively long:
  • long delay before resending lost packet
• Detect lost segments via duplicate ACKs.
  • Sender often sends many segments back-to-back
  • If segment is lost, there will likely be many duplicate ACKs.
• If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:
  • fast retransmit: resend segment before timer expires

Figure 3.37 Resending a segment after triple duplicate ACK
[Figure: Host A/Host B timeline; a segment is lost (X) and resent before the timeout]

TCP Flow Control
• receive side of TCP connection has a receive buffer
• app process may be slow at reading from buffer
• flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast
• speed-matching service: matching the send rate to the receiving app’s drain rate

TCP Flow control: how it works
(Suppose TCP receiver discards out-of-order segments)
• spare room in buffer
  = RcvWindow
  = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  • guarantees receive buffer doesn’t overflow
  • LastByteSent - LastByteAcked <= RcvWindow

TCP congestion control: additive increase, multiplicative decrease (AIMD)
• Approach: increase transmission rate (window size), probing for usable bandwidth, until loss occurs
  • additive increase: increase CongWin by 1 MSS every RTT until loss detected
  • multiplicative decrease: cut CongWin in half after loss
• Saw tooth behavior: probing for bandwidth
[Figure: congestion window size vs. time, saw tooth; axis marks at 8 Kbytes, 16 Kbytes, 24 Kbytes]

TCP Congestion Control: details
• sender limits transmission:
  LastByteSent - LastByteAcked <= min{CongWin, RecvWindow}
• Roughly,
  $\text{rate} = \frac{CongWin}{RTT}$ Bytes/sec
• CongWin is dynamic, function of perceived network congestion
How does sender perceive congestion?
• loss event = timeout or 3 duplicate acks
• TCP sender reduces rate (CongWin) after loss event
three mechanisms:
• AIMD
• slow start
• conservative after timeout events

TCP Slow Start
• When connection begins, CongWin = 1 MSS
  • Example: MSS = 500 bytes & RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event

TCP Slow Start (more)
• When connection begins, increase rate exponentially until first loss event:
  • double CongWin every RTT
  • done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast
[Figure: Host A/Host B timeline over successive RTTs]

Refinement: inferring loss
• After 3 dup ACKs:
  • CongWin is cut in half
  • window then grows linearly
• But after timeout event:
  • CongWin instead set to 1 MSS;
  • window then grows exponentially to a threshold, then grows linearly
Philosophy:
• 3 dup ACKs indicates network capable of delivering some segments
• timeout indicates a “more alarming” congestion scenario

Refinement
Q: When should the exponential increase switch to linear?
A: When CongWin gets to 1/2 of its value before timeout.
Implementation:
• Variable Threshold
• At loss event, Threshold is set to 1/2 of CongWin just before loss event

Summary: TCP Congestion Control
• When CongWin is below Threshold, sender in slow-start phase, window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly.
• When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold.
• When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS.
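The four summary rules can be read as a small state machine over CongWin and Threshold. The sketch below is one possible rendering under several assumptions that the slides do not state: the window is updated once per loss-free RTT rather than per ACK, doubling is capped at Threshold, the initial Threshold is an arbitrary 64 KB, Threshold never drops below 1 MSS, and the case CongWin == Threshold is treated as congestion avoidance. All class and method names are hypothetical.

```java
// Illustrative sketch of the summary rules; not a real TCP implementation.
class CongestionControlSketch {
    static final int MSS = 500;  // bytes; value borrowed from the slow-start example

    int congWin = MSS;           // congestion window in bytes; starts at 1 MSS
    int threshold = 64 * 1024;   // initial Threshold is an assumption of this sketch

    /** One RTT in which everything sent was ACKed (per-RTT granularity is a simplification). */
    void onRttWithoutLoss() {
        if (congWin < threshold) {
            // Slow start: window doubles each RTT; capping at Threshold is an assumption.
            congWin = Math.min(congWin * 2, threshold);
        } else {
            // Congestion avoidance: additive increase of 1 MSS per RTT.
            congWin += MSS;
        }
    }

    /** Triple duplicate ACK: Threshold = CongWin/2, CongWin = Threshold. */
    void onTripleDuplicateAck() {
        threshold = Math.max(congWin / 2, MSS); // floor of 1 MSS is this sketch's guard
        congWin = threshold;
    }

    /** Timeout: Threshold = CongWin/2, CongWin = 1 MSS. */
    void onTimeout() {
        threshold = Math.max(congWin / 2, MSS);
        congWin = MSS;
    }

    /** Upper bound on unACKed bytes: min{CongWin, RcvWindow}. */
    int sendLimit(int rcvWindow) {
        return Math.min(congWin, rcvWindow);
    }

    public static void main(String[] args) {
        CongestionControlSketch tcp = new CongestionControlSketch();
        tcp.threshold = 4000; // small Threshold so the phase change is visible
        for (int rtt = 1; rtt <= 5; rtt++) {
            tcp.onRttWithoutLoss();
            System.out.println("RTT " + rtt + ": CongWin = " + tcp.congWin);
        }
        tcp.onTimeout();
        System.out.println("after timeout: CongWin = " + tcp.congWin
                + ", Threshold = " + tcp.threshold);
    }
}
```

Keeping the two loss handlers separate mirrors the slides' distinction: a triple duplicate ACK only halves the window, while a timeout sends the sender back to slow start from 1 MSS.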
TCP Fairness
Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]

Why is TCP fair?
Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• multiplicative decrease decreases throughput proportionally
[Figure: throughput plot with Connection 1 throughput on one axis, both axes up to R, and an equal bandwidth share line; the trajectory alternates “congestion avoidance: additive increase” and “loss: decrease window by factor of 2”]

Fairness (more)
Fairness and UDP
• Multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• Instead use UDP:
  • pump audio/video at constant rate, tolerate packet loss
• Research area: TCP friendly
Fairness and parallel TCP connections
• nothing prevents app from opening parallel connections between 2 hosts.
• Web browsers do this
• Example: link of rate R supporting 9 connections;
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)!

Chapter 3: Summary
• principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• instantiation and implementation in the Internet
  • UDP
  • TCP
Next:
• leaving the network “edge” (application, transport layers)
• into the network “core”

Chapter 4
Network Layer

Network Layer 4-45

Chapter 4: Network Layer
Chapter goals:
• understand principles behind network layer services:
  • IP addresses (+ getting an IP address via DHCP)
  • Routing algorithms
  • Network of networks (BGP, dealing with scales)
  • ICMP
  • NAT (network address translation)

Datagram networks
• no call setup at network layer
• routers: no state about end-to-end connections
  • no network-level concept of “connection”
• packets forwarded using destination host address
  • packets between same source-dest pair may take different paths
[Figure: two end systems with full protocol stacks (application, transport, network, data link, physical); 1. Send data, 2. Receive data]

The Internet Network layer
Host, router network layer functions:
[Figure: the network layer sits between the transport layer (TCP, UDP) and the link layer and physical layer, and contains the following]
• Routing protocols
  • path selection
  • RIP, OSPF, BGP
• forwarding table
• IP protocol
  • addressing conventions
  • datagram format
  • packet handling conventions
• ICMP protocol
  • error reporting
  • router “signaling”

IP datagram format
32 bits wide; fields in order:
• ver: IP protocol version number
• head. len: header length (bytes)
• type of service: “type” of data
• length: total datagram length (bytes)
• 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
• time to live: max number remaining hops (decremented at each router)
• upper layer: upper layer protocol to deliver payload to
• header checksum
• 32 bit source IP address
• 32 bit destination IP address
• Options (if any): e.g. timestamp, record route taken, specify list of routers to visit
• data (variable length, typically a TCP or UDP segment)
how much overhead with TCP?
• 20 bytes of TCP
• 20 bytes of IP
• = 40 bytes + app layer overhead
[Figure: Application, TCP/UDP, IP, Ethernet]

Subnets
• IP address:
  • subnet part (high order bits)
  • host part (low order bits)
• What’s a subnet?
  • device interfaces with same subnet part of IP address
  • can physically reach each other without intervening router
[Figure: network consisting of 3 subnets, with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4; 223.1.2.1, 223.1.2.2, 223.1.2.9; 223.1.3.1, 223.1.3.2, 223.1.3.27]

IP addressing: CIDR
CIDR: Classless InterDomain Routing
• subnet portion of address of arbitrary length
• address format: a.b.c.d/x, where x is # bits in subnet portion of address
Example: 200.23.16.0/23 = 11001000 00010111 00010000 00000000 (first 23 bits: subnet part; last 9 bits: host part)

IP addresses: how to get one?
Q: How does a host get IP address?
• hard-coded by system admin in a file
  • Windows: control-panel->network->configuration->tcp/ip->properties
  • Linux (ubuntu): /etc/network/interfaces
• DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
  • “plug-and-play”

DHCP: Dynamic Host Configuration Protocol
Goal: allow host to dynamically obtain its IP address from network server when it joins network
• Can renew its lease on address in use
• Allows reuse of addresses (only hold address while connected and “on”)
• Support for mobile users who want to join network (more shortly)
DHCP overview:
1. host broadcasts “DHCP discover” msg
2. DHCP server responds with “DHCP offer” msg
3. host requests IP address: “DHCP request” msg
4. DHCP server sends address: “DHCP ack” msg

Graph abstraction of a network
Graph: G = (N,E)
N = set of routers = { u, v, w, x, y, z }
E = set of links = { (u,v), (u,x), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }
[Figure: graph of routers u, v, w, x, y, z with a cost on each link]
• c(x,x’) = cost of link (x,x’)
  - e.g., c(w,z) = 5
• cost could always be 1, or inversely related to bandwidth, or inversely related to congestion
Cost of path (x1, x2, x3,…, xp) = c(x1,x2) + c(x2,x3) + … + c(xp-1,xp)
Question: What’s the least-cost path between u and z ?
Routing algorithm: algorithm that finds least-cost path

Routing algorithms
Global or decentralized information?
Global:
• all routers have complete topology, link cost info
• “link state” algorithms (OSPF)
Decentralized:
• router knows physically-connected neighbors, link costs to neighbors
• iterative process of computation, exchange of info with neighbors
• “distance vector” algorithms (RIP)
Static or dynamic?
Static:
• routes change slowly over time
Dynamic:
• routes change more quickly
  • periodic update
  • in response to link cost changes

Interplay between routing, forwarding
[Figure: the routing algorithm fills the local forwarding table; the value in the arriving packet’s header (0111) selects one of output links 1, 2, 3]
local forwarding table:
header value   output link
0100           3
0101           2
0111           2
1001           1

Hierarchical routing for scalability
Our routing study thus far - idealization
• all routers identical
• network “flat”
… not true in practice
scale: with 200 million destinations:
• can’t store all dest’s in routing tables!
• routing table exchange would swamp links!
administrative autonomy
• internet = network of networks
• each network admin may want to control routing in its own network

Hierarchical addressing: route aggregation
Hierarchical addressing allows efficient advertisement of routing information:
• Organization 0: 200.23.16.0/23
• Organization 1: 200.23.18.0/23
• Organization 2: 200.23.20.0/23
• . . .
• Organization 7: 200.23.30.0/23
• Fly-By-Night-ISP to Internet: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us to Internet: “Send me anything with addresses beginning 199.31.0.0/16”

Hierarchical addressing: more specific routes
ISPs-R-Us has a more specific route to Organization 1
• Organization 0: 200.23.16.0/23
• Organization 2: 200.23.20.0/23
• . . .
• Organization 7: 200.23.30.0/23
• Fly-By-Night-ISP to Internet: “Send me anything with addresses beginning 200.23.16.0/20”
• Organization 1: 200.23.18.0/23, now attached to ISPs-R-Us
• ISPs-R-Us to Internet: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”

Hierarchical routing for scalability
• aggregate routers into regions, “autonomous systems” (AS)
• routers in same AS run same routing protocol
  • “intra-AS” routing protocol
  • routers in different AS can run different intra-AS routing protocol
Gateway router
• Direct link to router in another AS

Interconnected ASs
[Figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c); the intra-AS routing algorithm and the inter-AS routing algorithm both feed the forwarding table]
• forwarding table configured by both intra- and inter-AS routing algorithm
  • intra-AS sets entries for internal dests
  • inter-AS & intra-AS sets entries for external dests

Internet inter-AS routing: BGP
• BGP (Border Gateway Protocol): the de facto standard
• BGP provides each AS a means to:
  1. Obtain subnet reachability information from neighboring ASs.
  2. Propagate reachability information to all AS-internal routers.
  3. Determine “good” routes to subnets based on reachability information and policy.
• allows subnet to advertise its existence to rest of Internet: “I am here”

BGP basics
• pairs of routers (BGP peers) exchange routing info over TCP connections (called BGP sessions)
  • BGP sessions need not correspond to physical links.
• when AS2 advertises prefix “200.23.16.0/23” to AS1:
  • AS2 promises it will forward datagrams towards that prefix.
  • AS2 can aggregate prefixes in its advertisement
[Figure: AS1 (1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c), with eBGP sessions and iBGP sessions marked]

Distributing reachability info
• using eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1.
  • 1c can then use iBGP to distribute new prefix info to all routers in AS1
  • 1b can then re-advertise new reachability info to AS2 over 1b-to-2a eBGP session
• when router learns of new prefix, it creates entry for prefix in its forwarding table.
[Figure: same three ASs with eBGP and iBGP sessions; callouts: “Any dest w/ IP addr AS1 should be routed to 1c” and “Any dest w/ IP addr AS1 should be routed to 2a”]

Path attributes & BGP routes
• advertised prefix includes BGP attributes.
  • prefix + attributes = “route”
• two important attributes:
  • AS-PATH: contains ASs through which prefix advertisement has passed: e.g., AS 67, AS 17
  • NEXT-HOP: indicates specific internal-AS router to next-hop AS. (may be multiple links from current AS to next-hop-AS)
• when gateway router receives route advertisement, uses local import policy to accept/decline.

BGP route selection
• router may learn about more than 1 route to some prefix. Router must select route.
• elimination rules:
  1. local preference value attribute: policy decision
  2. shortest AS-PATH
  3. closest NEXT-HOP router: hot potato routing
  4. additional criteria
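The first three elimination rules amount to comparing candidate routes field by field, moving to the next rule only on a tie. The following is a minimal sketch under stated assumptions: the `Route` class and its field names are hypothetical, a higher local preference is taken to be better (as in standard BGP), “closest NEXT-HOP” is modeled as the lowest intra-AS cost to the next-hop router, and rule 4 (“additional criteria”) is left out. The AS numbers 42 and 99 and the cost values are made-up example data.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch of elimination rules 1-3; not a real BGP implementation.
class BgpSelectionSketch {
    /** Hypothetical route record: prefix + attributes = "route". */
    static final class Route {
        final String prefix;
        final int localPref;         // rule 1: policy value, higher is preferred
        final List<Integer> asPath;  // rule 2: ASs the advertisement has passed through
        final int costToNextHop;     // rule 3: assumed intra-AS cost to the NEXT-HOP router

        Route(String prefix, int localPref, List<Integer> asPath, int costToNextHop) {
            this.prefix = prefix;
            this.localPref = localPref;
            this.asPath = asPath;
            this.costToNextHop = costToNextHop;
        }
    }

    // Orders routes so that the most preferred route compares as the smallest.
    static final Comparator<Route> PREFERENCE =
            Comparator.comparingInt((Route r) -> r.localPref).reversed() // 1. highest local preference
                    .thenComparingInt(r -> r.asPath.size())              // 2. shortest AS-PATH
                    .thenComparingInt(r -> r.costToNextHop);             // 3. closest NEXT-HOP (hot potato)

    /** Picks the preferred route; throws NoSuchElementException if the list is empty. */
    static Route select(List<Route> candidates) {
        return Collections.min(candidates, PREFERENCE);
    }

    public static void main(String[] args) {
        List<Route> routes = Arrays.asList(
                new Route("200.23.16.0/23", 100, Arrays.asList(67, 17), 5),
                new Route("200.23.16.0/23", 100, Arrays.asList(42), 9),
                new Route("200.23.16.0/23", 100, Arrays.asList(99), 3));
        Route best = select(routes);
        System.out.println("AS-PATH " + best.asPath + ", cost to next hop " + best.costToNextHop);
    }
}
```

In the example all three routes tie on local preference, the two one-hop AS-PATHs beat the two-hop path, and the lower next-hop cost breaks the remaining tie.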