Fred Baker
© 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential

Best shown using an example…
• Ping RTT from a hotel to Cisco overnight
• RTT varying from 278 ms to 9286 ms
• Delay distribution with odd spikes about a TCP RTO apart; suggests that we actually had more than one copy of the same segment in queue
Because few applications actually worked

• Seen in serial lines, ISP DSH and optical networks at all speeds, LANs, WiFi networks, and input-queued backplanes such as Nexus; in fact, any queue
• The buffering delay affects all traffic in the same or lower priority queue, particularly impacting delay-sensitive applications like VoIP and rate-sensitive applications like video
• Common reality to all of those: offered load at an interface or on a path approximates or exceeds capacity, and as a result a queue builds, even if on a very short time scale
• Shared media are a special case: WiFi, single-cable Ethernet, input-queued backplanes, and other shared media are best modeled as having two queues:
  • One of packets in each interface
  • One of interfaces seeking access to the channel
  As a result, in a congested shared medium, even an uncongested interface can experience congestion

• Average delay at an interface is inversely proportional to the average available bandwidth (1 - utilization):
  average time in queue = mean service time / (1 - utilization)   (M/M/1)
• In other words, average delay shoots to infinity (loss) when a link is fully used.
  Independent of bandwidth (adding bandwidth changes or delays the effect, but does not solve the problem)
  Not driven by the number of sessions using the link (it might be a lot of little ones or a smaller number of big ones)
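The M/M/1 relation above can be tried out numerically. The following is a minimal sketch, not part of the original slides: the function names, the 1500-byte packet size, and the example link rates are illustrative assumptions. It shows that the delay multiplier 1 / (1 - utilization) is the same at any link speed; a faster link only shrinks the per-packet service time that gets multiplied.

```python
def service_time_s(packet_bytes, link_bps):
    """Time to serialize one packet onto the link, in seconds."""
    return packet_bytes * 8 / link_bps


def mm1_mean_delay_s(mean_service_time_s, utilization):
    """Mean time in an M/M/1 system: service time / (1 - utilization)."""
    if not 0 <= utilization < 1:
        # At utilization >= 1 the queue grows without bound.
        raise ValueError("utilization must be in [0, 1)")
    return mean_service_time_s / (1.0 - utilization)


if __name__ == "__main__":
    packet_bytes = 1500  # assumed packet size, for illustration only
    for link_bps in (10e6, 1e9):  # assumed example link rates
        s = service_time_s(packet_bytes, link_bps)
        for rho in (0.5, 0.9, 0.99):
            d = mm1_mean_delay_s(s, rho)
            print(f"{link_bps:>14,.0f} bit/s  utilization={rho:.2f}  "
                  f"mean delay={d * 1e3:.3f} ms  ({d / s:.0f}x service time)")
```

At 50% utilization the mean delay is twice the service time; at 99% it is roughly a hundred times the service time, on the 10 Mbit/s link and the 1 Gbit/s link alike.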
(Graphic courtesy Sprint, Apricot 2004)

• Predicted by Kleinrock in his 1960s dissertation and "Queueing Systems"
• RFCs 896 and 970, dated 1984-1985, address network congestion
  TCP's "Nagle" algorithm and the development of "fair" queuing
• Subject of RFC 2309: Recommendations on Queue Management and Congestion Avoidance in the Internet
  RFC 1633: Integrated Services in the Internet Architecture: an Overview
  RFC 2475: An Architecture for Differentiated Services
  Extensive research, published in journals etc.
• More recently: Jim Gettys et al., under the topic "bufferbloat" (ask Google)
• But new ramifications…

Over the coming years, expect video traffic, especially streaming media (video in TCP), to dominate Internet traffic
  Over-the-top providers including Netflix/Roku/Hulu, video sites such as YouTube, video conferencing, surveillance, etc.

• Academic research on non-responsive traffic flows
  "Router Mechanisms to Support End-to-End Congestion Control", ftp://ftp.ee.lbl.gov/papers/collapse.ps, Floyd & Fall
  "TCP-Friendly Unicast Rate-Based Flow Control", http://www.psc.edu/networking/papers/tcp_friendly.html, Floyd et al.
• Net Neutrality discussion
  "If you congest my network I'll shut down your traffic!"
• Comcast RFC 6057:
  Determine "top talker" subscribers from Netflow/IPFIX measurements
  Deprioritize or force round robin service
• Fundamental issue: in each case, in various forms, a subscriber can impact SLA delivery for other subscribers.
  Solution: somehow throttle back the offending traffic flow.

[Figure: measurable throughput plotted against increasing TCP window; labels: bottleneck capacity, "knee", queue depth, "cliff"]

mean throughput = effective window / mean round trip time

• Effective window: the amount of data TCP sends each RTT
• Knee: the lowest window that makes throughput approximate capacity
• Cliff: the largest window that makes throughput approximate capacity
• Note that throughput is the same at knee and cliff. Increasing the window merely increases RTT, by increasing queue depth.
Yes, there is a more complex equation that takes into account loss. It estimates throughput above the cliff.

"When the link utilization on a bottleneck link is below 90%, the 99th percentile of the hourly delay distributions remains below 1 ms. Once the bottleneck link reaches utilization levels above 90%, the variable delay shows a significant increase overall, and the 99th percentile reaches a few milliseconds. Even when the link utilization is relatively low (below 90%), sometimes a small number of packets may experience delay an order of magnitude larger than the 99th percentile"
"Analysis of Point-To-Point Packet Delay In an Operational Network", INFOCOM 2004, analyzing a 2.5 Gbps ISP network

• Many products provide deep queues and drop only from the tail when the queue is full
  That "1 ms" variation in delay can be in a queue producing a long delay, varying between 9 and 10 ms for example.
  The sessions affected most by tail drop are new sessions in slow-start, as they send relatively large bursts of traffic
  Occasional bursts result in unnecessary loss, and so unnecessarily poor service
• Nick McKeown argues for very small total buffer sizes
  Same net effect but a smaller average delay
  Defeats delay-based congestion control by reducing signal strength
• Note, BTW, that lower rates imply longer intervals in queue
  In gigabit networks, we talk about single-digit milliseconds
  In megabit networks, we talk about tens to hundreds of milliseconds

[Figure: FIFO traffic, total test. RTT in ms (0 to 400) versus elapsed time; series: mean RTT, min RTT, max RTT, standard deviation. Annotations: typical variation in delay only at top of the queue; mean latency correlates with maximum queue depth.]

[Figure: New RED, total test. RTT in ms (0 to 400) versus elapsed time; series: mean RTT, min RTT, max RTT, standard deviation. Annotations: dynamic range of configuration; additional capacity to absorb bursts; mean latency correlates with target queue depth, min-threshold.]

• Provide queues that can absorb bursts under normal loads, but which manage queues to a shallow average depth
• Net effect: maximize throughput, minimize delay/loss, minimize SLA issues

• Bandwidth, provisioning, and session control
  If you don't have enough bandwidth for your applications, no amount of QoS technology is going to help. QoS technology manages the differing requirements of applications; it's not magic.
  For inelastic applications (UDP and RTP-based sensors, voice, and video), this means some combination of provisioning, session counting, and signaling such as RSVP
• Cooperation between network and host mechanisms for elastic traffic
  Parekh and Gallager
  TCP congestion control responds to signals from the network or measurements of the network
• Choices in network signaling
  Loss: TCP responds to loss
  Explicit Congestion Notification: lossless signaling from the network

• Manage congestion without loss
• When AQM would otherwise drop traffic to signal that the queue is deeper than some threshold, mark it "Congestion Experienced"
• TCP receiver reports back to the sender, which reduces its window accordingly

[Figure: ECN negotiation diagram; labels: negotiation, TCP, IP]

• Explicit Congestion Control
• RFC 3168 ECN:
  On receipt of ECN Congestion Experienced, return signal in TCP to sender
  Sender reduces effective window by the same algorithm it uses on detection of loss
• Data Center TCP (DCTCP):
  Based on RFC 3168 (responds either to loss or ECN marks)
  Reduces effective window proportionally to mark rate

• Routing and switching products should:
  Implement an AQM algorithm (RED, AVQ, Blue, etc.) on all interfaces
  Implement both dropping and ECN marking
• Target queue depth (informal recommendation):

  Bit rate (bit/s, order of magnitude)   Min-thresh (ms)   Max-thresh (ms)   Target packets in queue
  10^4                                   2400              6000              2
  10^5                                   240               2400              2
  10^6                                   32                320               2.6
  10^7                                   16                160               13
  10^8                                   8                 80                67
  10^9                                   4                 40                333
  10^10                                  2                 20                1667
  10^11                                  1                 10                8333

Thank you.
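The "target packets in queue" column of the target queue depth table appears to follow from the bit rate and the min-threshold. The sketch below is an inference from the table's numbers, not something the slides state: it assumes 1500-byte packets, and the function and constant names are illustrative. It converts a min-threshold in milliseconds into the number of packets that drain in that time at the given rate.

```python
ASSUMED_PACKET_BYTES = 1500  # assumption: not stated in the slides


def target_packets(bit_rate_bps, min_thresh_ms, packet_bytes=ASSUMED_PACKET_BYTES):
    """Packets that drain in min_thresh_ms at bit_rate_bps."""
    bits_drained = bit_rate_bps * min_thresh_ms / 1000.0
    return bits_drained / (8 * packet_bytes)


# (bit rate in bit/s, min-thresh in ms), copied from the table above
TABLE_ROWS = [
    (1e4, 2400), (1e5, 240), (1e6, 32), (1e7, 16),
    (1e8, 8), (1e9, 4), (1e10, 2), (1e11, 1),
]

if __name__ == "__main__":
    for rate, min_ms in TABLE_ROWS:
        print(f"{rate:>16,.0f} bit/s  min-thresh={min_ms:>5} ms  "
              f"target={target_packets(rate, min_ms):8.1f} packets")
```

Under that assumption the computed values match the table after rounding, except the 10^6 row, where the formula gives about 2.67 and the table lists 2.6.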