* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download ppt
Piggybacking (Internet access) wikipedia , lookup
IEEE 802.1aq wikipedia , lookup
Multiprotocol Label Switching wikipedia , lookup
Airborne Networking wikipedia , lookup
Computer network wikipedia , lookup
Deep packet inspection wikipedia , lookup
Cracking of wireless networks wikipedia , lookup
UniPro protocol stack wikipedia , lookup
Routing in delay-tolerant networking wikipedia , lookup
Internet protocol suite wikipedia , lookup
Recursive InterNetwork Architecture (RINA) wikipedia , lookup
CS 54001-1: Large-Scale Networked Systems Professor: Ian Foster TAs: Xuehai Zhang, Yong Zhao Winter Quarter www.classes.cs.uchicago.edu/classes/archive/2003/winter/54001-1 CS 54001-1 Course Goals Yes – Gain understanding of fundamental issues that effect design, construction, and operation of large-scale networked systems – Gain understanding of some significant future trends in network design and use No – Learn how to write network applications CS 54001-1 Winter Quarter 2 Remember I ask you to: – Read Peterson and Davies Ch 1 and 2 – Read “End to End Arguments in System Design” – Use traceroute to determine paths to following locations & build map of network > ANL, IIT, NWU, UIC, Loyola, UIUC, Purdue, Indiana CS 54001-1 Winter Quarter 3 Last Week: Internet Design Principles & Protocols An introduction to the mail system An introduction to the Internet Internet design principles and layering Brief history of the Internet Packet switching and circuit switching Protocols Addressing and routing Performance metrics A detailed FTP example CS 54001-1 Winter Quarter 4 This Week: Routing and Transport Routing techniques – Flooding – Distributed Bellman Ford Algorithm – Dijkstra’s Shortest Path First Algorithm Routing in the Internet – Hierarchy and Autonomous Systems – Interior Routing Protocols: RIP, OSPF – Exterior Routing Protocol: BGP Transport: achieving reliability Transport: achieving fair sharing of links CS 54001-1 Winter Quarter 5 Recap: An Introduction to the Internet gargoyle.cs.uchicago.edu Athena.MIT.edu Application Layer Ian Dave Transport Layer O.S. D Data Header Data Network Layer H D H D D CS 54001-1 O.S. Header Winter Quarter H H D D H H Link Layer 6 Characteristics of the Internet Each packet is individually routed No time guarantee for delivery No guarantee of delivery in sequence No guarantee of delivery at all! – Things get lost – Acknowledgements – Retransmission > How to determine when to retransmit? Timeout? > Need local copies of contents of each packet. > How long to keep each copy? > What if an acknowledgement is lost? CS 54001-1 Winter Quarter 7 Characteristics of the Internet (2) No guarantee of integrity of data. Packets can be fragmented. Packets may be duplicated. CS 54001-1 Winter Quarter 8 Size of the Routing Table at the core of the Internet Source: http://www.telstra.net/ops/bgptable.html CS 54001-1 Winter Quarter 9 This Week: Routing and Transport Routing techniques – Flooding – Distributed Bellman Ford Algorithm – Dijkstra’s Shortest Path First Algorithm Routing in the Internet – Hierarchy and Autonomous Systems – Interior Routing Protocols: RIP, OSPF – Exterior Routing Protocol: BGP Transport: achieving reliability Transport: achieving fair sharing of links CS 54001-1 Winter Quarter 10 The Problem “A” “B” R2 R1 How does R1 choose a route to host B? CS 54001-1 Winter Quarter R4 R3 11 Technique 1: Flooding Routers forward packets to all ports except the ingress port. R1 Advantages: Every destination in the network is reachable. Useful when network topology is unknown. Disadvantages: Some routers receive packet multiple times. Packets can go round in loops forever. CS 54001-1 Winter Quarter 12 Technique 2: Bellman-Ford Algorithm Objective: Determine the route from (R1, …, R7) to R8 that minimizes the cost. Examples of link cost: Distance, data rate, price, congestion/delay, … 1 R1 R2 R3 4 Winter Quarter 4 R4 2 2 CS 54001-1 1 R6 3 R5 2 R7 3 2 R8 13 Solution is simple by inspection... (in this case) 1 R1 1 R2 R4 2 2 R3 4 R5 2 4 R6 3 R7 2 3 R8 The solution is a spanning tree with R8 as the root of the tree. The Bellman-Ford Algorithm finds the spanning tree automatically. CS 54001-1 Winter Quarter 14 The Distributed Bellman-Ford Algorithm 1. Let X n (C1 , C2 , ..., C7 ) where: Ci cost from Ri to R8 . 2. Set X 0 (, , ,..., ). 3. Every T seconds, router i sends Ci to its neighbors. 4. If router i is told of a lower cost path to R8 , it updates Ci . Hence, X n 1 f ( X n ) where f (.) determines the next step improvement. 5. If X n 1 X n then goto step (3). 6. Stop. CS 54001-1 Winter Quarter 15 Bellman-Ford Algorithm: Example 1 R1 R2 2 2 R3 R1 Inf R2 Inf R3 4, R8 R4 Inf R5 2, R8 R6 2, R8 R7 3, R8 CS 54001-1 R1 1 1 2 Winter Quarter 4 R7 2 R4 R5 R6 2 3 R8 2 R3 3 1 R2 4 R4 R5 4 4 2 2 4 3 R7 3 R6 2 2 3 R8 16 R1 6, R3 R2 4, R5 R3 4, R8 R4 6, R7 R5 2, R8 R6 2, R8 R7 3, R8 R1 5, R2 R2 4, R5 R3 4, R8 R4 5, R2 R5 2, R8 R6 2, R8 R7 3, R8 CS 54001-1 Bellman-Ford Algorithm 6 4 1 R1 1 R2 2 2 4 2 3 2 4 R6 3 R7 2 R5 4 R3 R4 6 2 3 R8 Solution 5 R1 4 1 1 R2 2 4 R3 Winter Quarter 4 5 2 R4 3 2 R5 R6 2 R7 3 2 R8 17 Bellman-Ford Algorithm Questions: 1. 2. 3. How long can the algorithm take to run? How do we know that the algorithm always converges? What happens when link costs change, or when routers/links fail? CS 54001-1 Winter Quarter 18 A Problem with Bellman-Ford “Bad news travels slowly” R1 1 R2 1 1 R3 R4 Consider the calculation of distances to R4: Time 0 1 2 3 … CS 54001-1 R1 R2 R3 3,R2 2,R3 1, R4 3,R2 2,R3 3,R2 3,R2 4,R3 3,R2 5,R2 4,R3 5,R2 “Counting to … …infinity” … Winter Quarter R3 R4 fails 19 Counting to Infinity Problem Solutions 1. 2. 3. Set infinity = “some small integer” (e.g., 16) Stop when count = 16 Split Horizon: Because R2 received lowest cost path from R3, it does not advertise cost to R3 Split-horizon with poison reverse: R2 advertises infinity to R3 CS 54001-1 Winter Quarter 20 Technique 3: Dijkstra’s Shortest Path First Algorithm Routers send out update messages whenever the state of a link changes. Hence the name: “Link State” algorithm Each router calculates lowest cost path to all others, starting from itself At each step of the algorithm, router adds the next shortest (i.e., lowest-cost) path to the tree Finds spanning tree routed on source router CS 54001-1 Winter Quarter 21 Dijkstra’s Shortest Path First Algorithm: Example Step 1: Shortest path set, S {R8 }. Candidate set, C {R3 , R5 , R7 , R6 } Step 2: S {R8 , R5 }, C {R3 , R7 , R6 , R2 }. Step 3: S {R8 , R5 , R6 }, C {R3 , R7 , R2 , R4 }. Step 4: R8 R6 R5 R8 S {R8 , R5 , R6 , R7 }, C {R3 , R2 , R4 }. CS 54001-1 R5 Winter Quarter R6 R5 R7 R8 22 Dijkstra’s SPF Algorithm Step 8 : S {R8 , R5 , R6 , R7 , R2 , R1 , R4 }, C {}. R1 1 R2 1 R4 R6 2 R3 CS 54001-1 Winter Quarter 4 R5 2 R7 3 R8 23 This Week: Routing and Transport Routing techniques – Flooding – Distributed Bellman Ford Algorithm – Dijkstra’s Shortest Path First Algorithm Routing in the Internet – Hierarchy and Autonomous Systems – Interior Routing Protocols: RIP, OSPF – Exterior Routing Protocol: BGP Transport: achieving reliability Transport: achieving fair sharing of links CS 54001-1 Winter Quarter 24 Routing in the Internet The Internet uses hierarchical routing Internet is split into Autonomous Systems (ASs) – Examples of ASs: Stanford (32), HP (71), MCI Worldcom (17373) – Try: whois –h whois.arin.net ASN “MCI Worldcom” Within an AS, the administrator chooses an Interior Gateway Protocol (IGP) – Examples of IGPs: RIP (rfc 1058), OSPF (rfc 1247). Between ASs, the Internet uses an Exterior Gateway Protocol – ASs today use the Border Gateway Protocol, BGP-4 (rfc 1771) CS 54001-1 Winter Quarter 25 AS ‘A’ Routing in the Internet AS ‘B’ BGP BGP Interior Gateway Protocol Stub AS CS 54001-1 AS ‘C’ Interior Gateway Protocol Interior Gateway Protocol Transit AS Stub AS e.g. backbone service provider Winter Quarter 26 Routing within a Stub AS There is only one exit point, so routers within the AS can use default routing – Each router knows all Network IDs within AS – Packets destined to another AS are sent to the default router – Default router is the border gateway to the next AS Routing tables in Stub ASs tend to be small CS 54001-1 Winter Quarter 27 Interior Routing Protocols RIP (Routing Information Protocol) – Uses distributed Bellman-Ford algorithm – Updates sent every 30 seconds – No authentication – Originally in BSD UNIX OSPF (Open Shortest Path First) – Link-state updates sent (using flooding) as and when required – Every router runs Dijkstra’s algorithm – Authenticated updates – Autonomous system may be partitioned into “areas” CS 54001-1 Winter Quarter 28 Exterior Routing Protocols Problems: – Topology: The Internet is a complex mesh of different ASs with very little structure – Autonomy of ASs: Each AS defines link costs in different ways, so not possible to find lowest cost paths – Trust: Some ASs can’t trust others to advertise good routes (e.g., two competing backbone providers), or to protect the privacy of their traffic (e.g., two warring nations) – Policies: Different ASs have different objectives (e.g., route over fewest hops; use one provider rather than another) CS 54001-1 Winter Quarter 29 Border Gateway Protocol (BGP-4) BGP is not a link-state or distance-vector routing protocol BGP advertises complete paths (a list of ASs) Example of path advertisement: – “The network 171.64/16 can be reached via the path {AS1, AS5, AS13}”. Paths with loops are detected locally and ignored Local policies pick the preferred path among options When link/router fails, the path is “withdrawn” CS 54001-1 Winter Quarter 30 This Week: Routing and Transport Routing techniques – Flooding – Distributed Bellman Ford Algorithm – Dijkstra’s Shortest Path First Algorithm Routing in the Internet – Hierarchy and Autonomous Systems – Interior Routing Protocols: RIP, OSPF – Exterior Routing Protocol: BGP Transport: achieving reliability Transport: achieving fair sharing of links CS 54001-1 Winter Quarter 31 Outline The Transport Layer The TCP Protocol – TCP Characteristics – TCP Connection setup – TCP Segments – TCP Sequence Numbers – TCP Sliding Window – Timeouts and Retransmission – (Congestion Control and Avoidance) The UDP Protocol CS 54001-1 Winter Quarter 32 The Transport Layer What is the transport layer for? What characteristics might it have? – Reliable delivery – Flow control –… CS 54001-1 Winter Quarter 33 Review of the Transport Layer Athena.MIT.edu Gargoyle.cs.uchicago.edu Application Layer Ian Dave Transport Layer O.S. D Data Header Data Network Layer H D H D D CS 54001-1 O.S. Header Winter Quarter H H D D H H Link Layer 34 Layering: FTP Example Application Presentation FTP ASCII/Binary Session TCP Transport IP Network Ethernet Link Transport Network Link Physical The 7-layer OSI Model CS 54001-1 Application Winter Quarter The 4-layer Internet model 35 TCP Characteristics TCP is connection-oriented – 3-way handshake used for connection setup TCP provides a stream-of-bytes service TCP is reliable: – Acknowledgements indicate delivery of data – Checksums are used to detect corrupted data – Sequence numbers detect missing, or mis-sequenced data – Corrupted data is retransmitted after a timeout – Mis-sequenced data is re-sequenced – (Window-based) Flow control prevents over-run of receiver TCP uses congestion control to share network capacity among users CS 54001-1 Winter Quarter 36 TCP is connection-oriented (Active) Client Syn (Passive) Server Syn + Ack Ack (Active) Client Fin (Passive) Server (Data +) Ack Fin Ack Connection Setup 3-way handshake CS 54001-1 Winter Quarter Connection Close/Teardown 2 x 2-way handshake 37 TCP supports a “stream of bytes” service Host A Host B CS 54001-1 Winter Quarter 38 …which is emulated using TCP “segments” Host A Segment sent when: TCP Data Host B CS 54001-1 1. Segment full (MSS bytes), 2. Not full, but times out, or 3. “Pushed” by application. TCP Data Winter Quarter 39 The TCP Segment Format IP Data TCP Data 0 TCP Hdr 15 Src port 31 Dst port Sequence # Ack Sequence # HLEN 4 RSVD 6 Flags URG ACK PSH RST SYN FIN TCP Header and Data + IP Addresses Checksum IP Hdr Window Size Src/dst port numbers and IP addresses uniquely identify socket Urg Pointer (TCP Options) TCP Data CS 54001-1 Winter Quarter 40 Host A Sequence Numbers ISN (initial sequence number) Sequence number = 1st byte TCP Data TCP Data Host B CS 54001-1 TCP HDR Winter Quarter Ack sequence number = next expected byte TCP HDR 41 Initial Sequence Numbers (Active) Client (Passive) Server Syn +ISNA Syn + Ack +ISNB Ack Connection Setup 3-way handshake CS 54001-1 Winter Quarter 42 TCP Sliding Window How much data can a TCP sender have outstanding in the network? How much data should TCP retransmit when an error occurs? Just selectively repeat the missing data? How does the TCP sender avoid overrunning the receiver’s buffers? CS 54001-1 Winter Quarter 43 TCP Sliding Window Window Size Data ACK’d Outstanding Un-ack’d data Data OK to send Data not OK to send yet Retransmission policy is “Go Back N” Current window size is “advertised” by receiver (usually 4k – 8k Bytes when connection set-up) CS 54001-1 Winter Quarter 44 TCP Sliding Window Round-trip time Round-trip time Window Size Host A Host B ??? Window Size ACK (1) RTT > Window size CS 54001-1 Winter Quarter Window Size ACK ACK (2) RTT = Window size 45 TCP: Retransmission and Timeouts Round-trip time (RTT) Retransmission TimeOut (RTO) Guard Band Host A Estimated RTT Data1 Data2 ACK ACK Host B TCP uses an adaptive retransmission timeout value: Congestion RTT changes Changes in Routing frequently CS 54001-1 Winter Quarter 46 TCP: Retransmission and Timeouts Picking the RTO is important: Pick a values that’s too big and it will wait too long to retransmit a packet, Pick a value too small, and it will unnecessarily retransmit packets. The original algorithm for picking RTO: 1. EstimatedRTT = EstimatedRTT + (1 - ) SampleRTT 2. RTO = 2 * EstimatedRTT Characteristics of the original algorithm: CS 54001-1 Variance is assumed to be fixed. But in practice, variance increases as congestion increases. Winter Quarter 47 TCP: Retransmission and Timeouts Newer Algorithm includes estimate of variance in RTT: Difference = SampleRTT - EstimatedRTT EstimatedRTT = EstimatedRTT + (*Difference) Deviation = Deviation + *( |Difference| - Deviation ) RTO = * EstimatedRTT + * Deviation 1 4 CS 54001-1 Winter Quarter 48 TCP: Retransmission and Timeouts Karn’s Algorithm Host A Host B Host A Retransmission Wrong RTT Sample Host B Retransmission Wrong RTT Sample Problem: How can we estimate RTT when packets are retransmitted? Solution: On retransmission, don’t update estimated RTT (and double RTO) CS 54001-1 Winter Quarter 49 User Datagram Protocol (UDP) Characteristics UDP is a connectionless datagram service – There is no connection establishment: packets may show up at any time UDP packets are self-contained UDP is unreliable: – No acknowledgements to indicate delivery of data – Checksums cover the header, and only optionally cover the data – Contains no mechanism to detect missing or missequenced packets – No mechanism for automatic retransmission – No mechanism for flow control, and so can over-run the receiver CS 54001-1 Winter Quarter 50 User-Datagram Protocol (UDP) A1 A2 B1 B2 App App App App OS UDP Like TCP, UDP uses port number to demultiplex packets IP CS 54001-1 Winter Quarter 51 User-Datagram Protocol (UDP) Packet format By default, only covers the header. SRC port DST port checksum length DATA Why do we have UDP? It is used by applications that don’t need reliable delivery, or Applications that have their own special needs, such as streaming of real-time audio/video. CS 54001-1 Winter Quarter 52 This Week: Routing and Transport Routing techniques – Flooding – Distributed Bellman Ford Algorithm – Dijkstra’s Shortest Path First Algorithm Routing in the Internet – Hierarchy and Autonomous Systems – Interior Routing Protocols: RIP, OSPF – Exterior Routing Protocol: BGP Transport: achieving reliability Transport: achieving fair sharing of links CS 54001-1 Winter Quarter 53 Main points Congestion is inevitable TCP sources detect congestion and, cooperatively, reduce the rate at which they transmit The rate is controlled using the TCP window size TCP modifies the rate according to “Additive Increase, Multiplicative Decrease (AIMD)” To jump start flows, TCP uses a fast restart mechanism (called “slow start”!) TCP achieves high throughput by encouraging high delay CS 54001-1 Winter Quarter 54 Congestion H1 A1(t) 10Mb/s R1 H2 D(t) 1.5Mb/s H3 A2(t) 100Mb/s A1(t) A2(t) Cumulative bytes A2(t) X(t) D(t) A1(t) X(t) D(t) t CS 54001-1 Winter Quarter 55 Congestion is unavoidable Arguably it’s good! We use packet switching because it makes efficient use of the links. Therefore, buffers in the routers are frequently occupied If buffers are always empty, delay is low, but our usage of the network is low If buffers are always occupied, delay is high, but we are using the network more efficiently So how much congestion is too much? CS 54001-1 Winter Quarter 56 Load, Delay and Power Typical behavior of queueing systems with random arrivals: A simple metric of how well the network is performing: Load Power Delay Burstiness tends to move asymptote to the left Average Packet delay Power Load CS 54001-1 Winter Quarter “optimal load” Load 57 Options for Congestion Control 1. 2. 3. Implemented by host versus network Reservation-based, versus feedbackbased Window-based versus rate-based CS 54001-1 Winter Quarter 58 TCP Congestion Control TCP implements host-based, feedbackbased, window-based congestion control. TCP sources attempts to determine how much capacity is available TCP sends packets, then reacts to observable events (loss) CS 54001-1 Winter Quarter 59 TCP Congestion Control TCP sources change the sending rate by modifying the window size: Window = min{Advertized window, Congestion Window} Receiver Transmitter (“cwnd”) In other words, send at the rate of the slowest component: network or receiver “cwnd” follows additive increase/multiplicative decrease – On receipt of Ack: cwnd += 1 – On packet loss (timeout): cwnd *= 0.5 CS 54001-1 Winter Quarter 60 Additive Increase Src D A D D A A D D D A A A Dest Actually, TCP uses bytes, not segments to count: When ACK is received: MSS cwnd MSS cwnd CS 54001-1 Winter Quarter 61 Leads to the TCP “sawtooth” Rate Timeouts halved Could take a long time to get started! CS 54001-1 Winter Quarter t 62 “Slow Start” Designed to cold-start connection quickly at startup or if a connection has been halted (e.g. window dropped to zero, or window full, but ACK is lost). How it works: increase cwnd by 1 for each ACK received. Src 1 D 2 A D D 4 A A D D 8 D A D A A A Dest CS 54001-1 Winter Quarter 63 Slow Start Timeouts Rate halved Exponential “slow start” Slow start in operation until it reaches half of t cwnd. previous Why is it called slow-start? Because TCP originally had no congestion control mechanism. The source would just start by sending a whole window’s worth of data. CS 54001-1 Winter Quarter 64 Fast Retransmit & Fast Recovery TCP source can take advantage of an additional hint: if a duplicate ACK arrives out of sequence, there was probably some data lost, even if it hasn’t yet timed out. Upon 3 duplicate ACKs, TCP retransmits. Does not enter slow-start: there are probably ACKs in the pipe that will continue correct AIMD operation. CS 54001-1 Winter Quarter 65 Course Outline (Subject to Change) 1. (January 9th) Internet design principles and protocols 2. (January 16th) Internetworking, transport, routing 3. (January 23rd) Mapping the Internet and other networks 4. (January 30th) Security (with guest lecturer: Gene Spafford) 5. (February 6th) P2P technologies & applications (Matei Ripeanu) (plus midterm) 6. (February 13th) Optical networks (Charlie Catlett) 7. *(February 20th) Web and Grid Services (Steve Tuecke) 8. (February 27th) Network operations (Greg Jackson) 9. 10. *(March 6th) Advanced applications (with guest lecturers: Terry Disz, Mike Wilde) (March 13th) Final exam * Ian Foster is out of town. CS 54001-1 Winter Quarter 66