Transport and Application Layer Approaches to Improve End-to-end Performance in the Internet
PhD thesis defense
Amit Mondal

Committee:
  Aleksandar Kuzmanovic, Asst. Professor, Northwestern Univ
  Peter Dinda, Assoc. Professor, Northwestern Univ
  Yan Chen, Assoc. Professor, Northwestern Univ
  Jin Li, Principal Researcher, Microsoft Research

Internet: A multiservice IP network
The Internet is a commercial infrastructure used by a diverse set of applications and services.
[Figure labels: VoIP, FTP, IPTV, Video Conferencing, Streaming, Gaming; SRI, UTAH, UCSB, UCLA]

Challenges involved…
Applications have end-to-end network performance requirements
  – Jitter, latency, packet loss, bandwidth, etc.
Original Internet
  – Best-effort service
  – No service assurance
  – TCP ensures only in-order packet delivery
  – Destination-based IP routing
Need to provide support for a new set of emerging applications in the Internet
[Figure labels: "high throughput", "low delay"]

Application classification based on QoS
  Latency sensitive, low bandwidth: VoIP, network games, SSH, chatting, web browsing, e-commerce
  Latency sensitive, high bandwidth: Multimedia streaming, IPTV, audio/video conferencing
  Latency insensitive, low bandwidth: Email
  Latency insensitive, high bandwidth: File transfer (FTP/BitTorrent)
My focus:
  – Low-latency interactive TCP applications (Chapter II and III): Telnet, SSH, network games, e-commerce, etc.
  – Interactive multimedia services (Chapter IV and V): Audio/video conferencing, VoIP, streamed multimedia services, etc.

The spectrum of QoS provisioning
  Application
    – Endpoint: Forward error correction, bitrate adaptation, Chapter V, etc.
    – Infrastructure: Overlay routing (QRON, QSON, etc.), Chapter IV
  Transport
    – Endpoint: TCP smart framing, limited transmit, early retransmit, Chapter II, Chapter III, etc.
    – Infrastructure: ECN, ECN+, packet marking & differential dropping, service differentiation, etc.
  Network
    – Endpoint: N/A
    – Infrastructure: IntServ, DiffServ, traffic engineering, constraint-based routing
  Data Link
    – Endpoint: N/A
    – Infrastructure: MPLS
  Physical
    – Endpoint: N/A
    – Infrastructure: Bandwidth overprovisioning

Research thesis
Despite much work to improve end-to-end performance in the Internet, there is still significant room for improvement. In my dissertation, I develop techniques to reduce the gap further. For example, I propose techniques that improve
  – Response times of short TCP flows by five times in certain scenarios
  – Median Mean Opinion Score (MOS) of VoIP calls over WiFi by a factor of two

Outline
  Chapter I: Introduction
  Chapter II: Improving performance of thin-stream TCP applications
  Chapter III: Removing exponential backoff from TCP
  Chapter IV: Multi-constraint QoS routing framework
  Chapter V: Audio/video performance issues: Diagnosis and solutions
  Conclusion

Chapter II: Improving thin-stream TCP flows
Upgrading mice to elephants
[Figure labels: data packets, "dummy" packets, strict priority, TCP-fair rate; packet switched vs. circuit switched]
A. Mondal and A. Kuzmanovic, “When TCP Friendliness Becomes Harmful”, IEEE INFOCOM 2007
A. Mondal and A. Kuzmanovic, “Upgrading Mice to Elephants: Effects and End-Point Solutions”, IEEE/ACM Transactions on Networking, Volume 18, Issue 2, April 2010

Chapter III: Removing Exponential Backoff from TCP
V. Jacobson, “Congestion Avoidance and Control,” in ACM CCR, 18(4): 314-329, Aug 1988.
  – Exponential retransmit timer backoff
Implicit packet conservation principle
Response times of short and interactive flows improve by five times in certain scenarios
A. Mondal and A. Kuzmanovic, “Removing Exponential Backoff from TCP”, In ACM SIGCOMM CCR, Volume 38, Number 5, October 2008.
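
To make the effect of the retransmit timer backoff concrete, the sketch below compares when successive retransmissions would fire with and without timer doubling. It is an illustration only: the function name, the fixed 1-second initial timeout, and the assumption that every attempt is lost are choices made here, not details taken from the thesis.

```python
def retransmission_times(rto, attempts, exponential_backoff):
    """Return the times (seconds after the original send) at which each
    retransmission fires, assuming every attempt is lost (illustrative)."""
    times = []
    elapsed = 0.0
    timeout = rto
    for _ in range(attempts):
        elapsed += timeout
        times.append(elapsed)
        if exponential_backoff:
            timeout *= 2  # classic behavior: double the timer after each expiry
    return times


# Assumed initial RTO of 1 second and four consecutive losses.
with_backoff = retransmission_times(1.0, 4, exponential_backoff=True)
without_backoff = retransmission_times(1.0, 4, exponential_backoff=False)
print(with_backoff)     # [1.0, 3.0, 7.0, 15.0]
print(without_backoff)  # [1.0, 2.0, 3.0, 4.0]
```

Under these assumptions the fourth retransmission leaves after 15 seconds with backoff and after 4 seconds without it, which is the kind of gap that matters for short and interactive flows.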

Chapter IV: Multi-constraint QoS routing framework
We design a framework that finds a path under multiple constraints without NP-hard computation
  – Finding a feasible path under multiple QoS constraints with a Dijkstra-style search involves NP-hard computation
Hybrid of a path vector protocol and on-demand route discovery
Using simulation based on real-world data, we demonstrated that our solution is both efficient and scalable
Built a functional prototype using the Click modular router
A. Mondal, P. Sharma, S. Banerjee, and A. Kuzmanovic, “Supporting Application Network Flows with Multiple QoS Constraints”, In IEEE IWQoS 2009

Chapter V: Audio/video performance issues: Diagnosis and solutions
Identify challenges towards high-quality audio/video conferencing over the Internet
Understand loss and jitter behavior at shorter time scales and quantify the impact of various network scenarios
Investigate solutions
A. Mondal, R. Cutler, C. Huang, J. Li, and A. Kuzmanovic, “SureCall: Towards Glitch-Free Real-time Audio/Video Conferencing”, In IEEE IWQoS 2010
A. Mondal, C. Huang, M. Jain, J. Li, and A. Kuzmanovic, “A Case of WiFi Relay: Improving VoIP Quality for WiFi Users”, In IEEE ICC 2010

Modern AV conferencing system
[Figure]

SureCall platform
A distributed measurement and experiment platform
  – Understand problems and experiment with solutions
  – Agents installed on volunteers’ machines
  – Measurements and experiments driven by masters
  – SureCall agents are upgradeable without user intervention
Available from http://research.microsoft.com/~chengh/SureCall/SureCall.htm

SureCall measurement
Emulated bidirectional audio/video sessions using UDP
  – 5 minutes per hour
  – Audio bitrate: 24 kbps
  – Video bitrate: 192 kbps
  – STUN NAT traversal protocol for home users
  – Detailed packet-level traces collected
Network connectivity close to the clients
  – ICMP packet pair with TTL=2
Traceroute to the other endpoint at the beginning and end of each session
Environmental details on client machines
  – CPU load, network interface type

SureCall deployment
Microsoft global enterprise network
Many residential networks
Current deployment status
  – 80 unique machines
    • Enterprise: 32
    • Home: 20
    • Both: 28
Enterprise trace and Home trace
  – Two separate masters (within the enterprise network and in the public Internet)

SureCall dataset
4,800 hours of packet traces
  – 4,100 from enterprise
  – 700 from home
1,968 unique IP addresses
  – Enterprise: 1,212
  – Home: 756
Trace classification and stratification
  – Intra-continental vs inter-continental
  – Wired vs wireless
  – Audio-only vs audio+video
Trace preprocessing
  – Clock skew removal
[Figure: Clock skew in the wild]

Jitter computation algorithm
Multiple algorithms to compute jitter
  – Variance of one-way-delay samples
  – Time difference between actual packet receiving time and ideal receiving time
    • Most relevant for multimedia streaming/conferencing with a playout buffer

Jitter in enterprise and residential networks
[Figure panels: US-US, wired traces; inter-continental, wired traces]
Residential networks have significantly higher jitter than enterprise networks and are greatly affected by inter-continental links.

Jitter variation across hosts
[Figure panels: Enterprise; Home]
Jitter variation is much higher in residential networks than in enterprise networks.
The 95th-percentile jitter values are significantly worse than the median jitter values in home networks.

Packet loss in residential and enterprise networks
Even well-provisioned enterprise networks can become quite congested at short time scales.
Both enterprise and home networks show a long tail in the loss burst size distribution.

Impact of WiFi connections
[Figure panels: Enterprise; Home]
In both enterprise and home networks, wireless traces show significantly worse jitter statistics than wired traces.
The degradation due to WiFi in enterprise scenarios is more severe than that in home scenarios.

Impact of VPN on performance
[Figure panels: Jitter; Loss]
A VPN connection causes more degradation than a wireless connection.

Can jitter predict future loss events?
Extent to which loss and jitter are correlated, i.e., whether an abrupt jitter increase can serve as a precursor of network congestion and predict future loss events
  – Audio/video conferencing applications can take anticipatory action.
> 10 ms average increase in end-to-end delay for the last three packets preceding a loss event
  – Enterprise networks: ~82%
  – Home networks: ~80%

Correlation between loss burst size and jitter
[Figure panels: Enterprise; Home]
1. End-to-end delay increases significantly before loss events in both enterprise and home networks.
2. Increase in end-to-end delay is not a great indicator of loss burst size in enterprise networks.
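
The delay-before-loss statistic above can be sketched in a few lines. The slides do not say how the "average increase" is measured, so this sketch makes its own assumptions: the trace is a list of one-way delays in milliseconds with `None` marking a lost packet, and the increase is the mean delay of the last three received packets minus the mean of up to 20 received packets before them. The function name and the baseline window are hypothetical.

```python
from statistics import mean


def loss_precursor_fraction(delays_ms, threshold_ms=10.0, window=3,
                            baseline_window=20):
    """Fraction of loss events preceded by a delay increase above threshold.

    delays_ms: per-packet one-way delay in send order; None marks a loss.
    The baseline (mean of up to `baseline_window` received packets before
    the last `window`) is an assumption of this sketch.
    """
    events = 0
    flagged = 0
    i = 0
    n = len(delays_ms)
    while i < n:
        if delays_ms[i] is None:
            received = [d for d in delays_ms[:i] if d is not None]
            if len(received) >= window + 1:
                recent = received[-window:]
                baseline = received[-(window + baseline_window):-window]
                events += 1
                if mean(recent) - mean(baseline) > threshold_ms:
                    flagged += 1
            # Treat consecutive losses as a single loss event (one burst).
            while i < n and delays_ms[i] is None:
                i += 1
        else:
            i += 1
    return flagged / events if events else None


# Hypothetical trace: delay ramps up before the first burst, not the second.
trace = [20, 20, 20, 20, 25, 35, 45, None, None, 20, 20, 20, 20, None]
print(loss_precursor_fraction(trace))  # 0.5
```

A conferencing application could apply the same test online and react (for example by adding redundancy) when the recent delay rises above the threshold, though how well that works depends on the baseline chosen.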

Network audio diagnostics
Concealed: percent of packets interpolated or extrapolated due to unrecovered packet loss
Stretched: percent of packets stretched via time compression
Classifier operates as follows
[Figure]
Supervised training with ground truth objectively determined by PESQ score

Audio classifier performance
The classifier achieves a true positive rate > 80% and a false positive rate < 1% for T1 = T2 = 0.07.

WiFi Relay: Improving VoIP Quality for WiFi Users
Large number of WiFi clients in both enterprise and residential networks
  – 43% of enterprises provide only WiFi connections to their employees
  – 36% use VoIP over WiFi
WiFi links can significantly degrade VoIP performance
Possible reasons
  – Dense deployment of APs, overloading of an AP, other wireless devices in the vicinity, etc.

Effectiveness of redundancy
Passive analysis with voice packet replication
  – Replication ratio r = 2, 3, 4, or 5
Packet losses can be effectively mitigated using application-layer packet replication

Overhead of replication
• Typical audio packet size = 60 bytes
• Encapsulated with RTP (12 bytes), UDP (8 bytes), IP (20 bytes), 802.11 MAC (28 bytes), PHY (20 us for 802.11g) headers.
• w/o ACK: air time = DIFS + PHY header + (60+76 bytes)/54 Mbps = 70 us

  Replication ratio   Air time w/o ACK (us)   Air time w/ ACK (us)
  1                   70                      102
  2                   79                      111
  3                   87                      120
  4                   96                      128

Replicating audio packets at the application layer causes only a marginal increase in air time

WiFi relay solution
Nearby wired endpoints as relays
Heavy replication between relays and wireless endpoints
No dedicated infrastructure

Evaluation
Evaluated on the SureCall platform
  – Upgraded SureCall clients to support relay
Simultaneous direct and relayed VoIP calls between each pair of SureCall agents
  – Apples-to-apples comparison
  – One-hop overlay (only one wireless endpoint)
  – Two-hop overlay (both endpoints are wireless)
Relay node selection based on the enterprise internal database

Impact of relay on jitter
No dedicated infrastructure, ordinary endpoints as relay nodes
Relay has negligible impact on end-to-end jitter
[Figure panels: CDF of jitter difference at 50th percentile; CDF of jitter difference at 95th percentile]

Improvement with WiFi relay
WiFi relay greatly reduces packet loss
WiFi relay significantly improves VoIP quality for WiFi users
Mean Opinion Score (MOS)
  – Calculated from packet loss rate and jitter (Cole et al. CCR’01)
  – Fixed de-jitter buffer of 100 ms

Summary of Chapter V
SureCall, a distributed experimental platform, to address the challenges of audio/video communications over the Internet.
Characterized enterprise and residential networks over a wide variety of network scenarios
Classifier that accurately predicts when network issues are most likely to cause audio quality degradation
WiFi relay that significantly improves VoIP quality for WiFi clients

Conclusion
Proposed easily deployable techniques to improve the performance of TCP-based interactive applications
Demonstrated that exponential backoff can be removed from TCP altogether without any stability issues
Designed an overlay framework to support multimedia services with multiple QoS constraints
Developed a distributed experimental framework, SureCall, to understand the challenges of IP-based audio/video communications and for rapid evaluation of new protocols

Thank you!

Publications
[1] A. Mondal and A. Kuzmanovic, “When TCP Friendliness Becomes Harmful”, In IEEE INFOCOM 2007
[2] A. Mondal and A. Kuzmanovic, “A Poisoning-Resilient TCP Stack”, In IEEE ICNP 2007
[3] A. Mondal and A. Kuzmanovic, “Removing Exponential Backoff from TCP”, In ACM SIGCOMM CCR, Volume 38, Number 5, October 2008.
[4] A. Mondal, P. Sharma, S. Banerjee, and A. Kuzmanovic, “Supporting Application Network Flows with Multiple QoS Constraints”, In IEEE IWQoS 2009
[5] A. Kuzmanovic, A. Mondal, S. Floyd, and K.K. Ramakrishnan, “Adding Explicit Congestion Notification (ECN) Capabilities to TCP’s SYN/ACK Packets”, RFC 5562, June 2009.
[6] A. Mondal and A. Kuzmanovic, “Upgrading Mice to Elephants: Effects and End-Point Solutions”, In IEEE/ACM Transactions on Networking, Volume 18, Issue 2, April 2010
[7] A. Mondal, R. Cutler, C. Huang, J. Li, and A. Kuzmanovic, “SureCall: Towards Glitch-Free Real-time Audio/Video Conferencing”, In IEEE IWQoS 2010
[8] A. Mondal, C. Huang, M. Jain, J. Li, and A. Kuzmanovic, “A Case of WiFi Relay: Improving VoIP Quality for WiFi Users”, In IEEE ICC 2010
[9] A. Mondal, I. Trestian, Z. Quin, and A. Kuzmanovic, “P2P as CDN (Akamizing BitTorrent)”, under submission
[10] J. Miller, A. Mondal, R. Potharaju, P. Dinda, and A. Kuzmanovic, “Network Monitoring is People: Understanding End-user Perception of Network Problems”, under submission

Backup slides

QoS and the Internet
QoS architectures
  – Integrated Services (IntServ)
  – Differentiated Services (DiffServ)
  – Multiprotocol Label Switching (MPLS)
  – Traffic engineering and constraint-based routing
Key challenges
  – Scalability issues in the core
  – Complex signaling protocols
  – Deployment overhead
The current Internet still offers only a best-effort service
This motivates investigating easily deployable solutions that improve end-to-end network performance

QoS using transport and application layer techniques without network support
Explicit congestion notification [Floyd ’94]
Packet marking and differential dropping [Guo and Matta ’01]
Limited transmit [Allman et al. ’01]
Service differentiation [Noureddine and Tobagi ’02]
Differential congestion notification [Le et al. ’04]
TCP smart framing [Mellia et al. ’05]
ECN+ [Kuzmanovic ’05]
Early retransmit [Allman et al. ’06]
TCP SAReno [Yang and Vecinia ’02]
PCP [Anderson et al. ’06]

Going beyond TCP-fair
Differentiated minRTO
  – Application-limited flows use a reduced minRTO value
Short-term padding with dummy packets
  – Application data followed by three tiny dummy packets
Diversity approach
  – Application-layer FEC-based approach
  – The simplest FEC scheme is replication

Why Exponential Backoff?
Jacobson adopted exponential backoff from the classical shared-medium Ethernet protocol
  – “IP gateway has essentially the same behavior as Ether in a shared-medium network.”
  – Not true!

Removing exponential backoff from TCP and its implications
Other reasons: no admission control, finite flow size, skewed traffic distribution, etc.
When to resend a packet?
  – Implicit packet conservation principle
    • As soon as the retransmission timeout expires
  – End-to-end performance can only improve if we remove the exponential backoff from TCP
Implications
  – Significant improvement of response times for short and interactive TCP flows

Multiple QoS Constraints
The Internet is evolving towards a global multiservice IP network
  – Diverse applications and different QoS requirements
Many applications have multiple QoS requirements
  – Video streaming, VoIP, video conferencing, etc.
Need support for end-to-end QoS guarantees under multiple constraints
Multiple QoS constraints often make the routing problem intractable

QoS provisioning using overlay networks
Build an overlay backbone
  – Deploy overlay nodes at strategic locations in the Internet
Provide support for per-flow forwarding
  – e.g. Anagran Flow Aware Routers
Flow route management architecture
  – Discover and set up end-to-end paths for individual flows with diverse flow QoS requirements
  – Monitor end-to-end flow performance to trigger path adaptation

Overlay flow QoS management architecture
Configure intermediate overlay nodes for per-flow forwarding
Sense local link characteristics
Adapt to a different path dynamically when the current path fails to meet the QoS parameters
[Figure: request "Find a path to X with b/w > b, delay < d and loss < l%" across AS1, AS2, AS3, AS4; legend: end user, overlay node, physical link, logical link]

Contribution
Design a scalable QoS routing protocol that finds a path under multiple constraints
Propose a distributed algorithm for dynamic path adaptation
Evaluate the accuracy, efficiency and scalability of the protocol using large-scale simulation and compare with other existing approaches
Build a functional prototype using the Click modular router

Design challenges
Multiple QoS metrics
  – Finding a feasible path under multiple constraints with a Dijkstra-style search is NP-complete
  – Randomized and approximation algorithms
  – Single composite metric derived from multiple metrics
    • Paths might not meet individual QoS constraints
Dynamic overlay-link properties
  – Increase control message overhead

Multi-constraint QoS routing protocol
Path vector protocol to disseminate path information
  – Tag with QoS parameters
How to aggregate path information when multiple QoS metrics are considered?
  – Distribute the best path for each metric
What about QoS requests that could be served by paths that are not in the best path set?
  – On-demand route discovery
A. Mondal, P. Sharma, S. Banerjee, and A. Kuzmanovic, “Supporting Application Network Flows with Multiple QoS Constraints”, In IEEE IWQoS 2009

MCQoS: Disseminating path information
Advertise the best path for each QoS metric
QoS Path Table (destination X; each hop tagged with delay, loss, b/w)
  Best b/w:   AS1 (10ms, 0.01%, 1Mbps), AS3 (5ms, 0.01%, 768Kbps), AS5
  Best loss:  AS1 (2ms, 0.0%, 128Kbps), AS3 (3ms, 0.005%, 378Kbps), AS5
  Best delay: AS1 (2ms, 0.01%, 128Kbps), AS3 (3ms, 0.02%, 378Kbps), AS5
[Figure labels: nodes A and B; local link info; tag QoS characteristics]

MCQoS: Aggregating path information
What about QoS requests in the undecidable region?
[Figure: bandwidth (b/w) vs. delay plane with best-b/w and best-delay points and feasible, infeasible and undecidable regions]
  – Feasible region: the source node already knows that a path can be supported if the QoS request falls in the feasible region
  – Infeasible region: there cannot exist a feasible path in the network if the QoS request falls in the infeasible region
  – Undecidable region: some requests may be feasible, but the source node might not know about those paths and thus cannot admit flows based on local information

MCQoS: On-demand route discovery
Admit or deny a flow based on the local QoS table if the request is in the feasible or infeasible region
Otherwise, on-demand route discovery for requests in the undecidable region
Exploit advertisements received from neighbors to reduce the search space during route discovery
[Figure: b/w vs. delay plane with feasible, infeasible and undecidable regions; nodes A, B, C, D, E]

MCQoS: Illustration through example
[Figure: example topology with nodes A to E, link labels, and per-node best-b/w and best-delay entries]
Requests (delay, b/w) and outcome:
  – 10ms, 3Mbps: OK, via ABD…E
  – 120ms, 15Mbps: OK, via ABC…E
  – 10ms, 100Mbps: X (no path)
  – 15ms, 15Mbps: ??? from the local table; OK via ABD…E

Route maintenance in MCQoS
[Figure: nodes A to H]
Route maintenance through path patching
Each intermediate node knows the QoS requirements from that node to the destination
Upstream node periodically pushes QoS requirements to downstream nodes
When a node detects a QoS violation, it triggers an alternate path search at the local node
  – Notify the upstream node if there is no alternative path

Overhead analysis of path dissemination
[Figure: example topology with numbered nodes]
In the MCQoS protocol, a node advertises only the best path to a destination. Thus many alternative paths are pruned, which increases scalability.

Overhead analysis of on-demand route discovery
Parameters
  – Average out-degree of the nodes
  – Overlay distance from source to destination
Worst case
  – Message overhead is proportional to the sum of all possible path lengths from source to destination
Amortized cost
  – Fraction of requests in the undecidable region
  – Limit the number of hops of route discovery
More than 99% of the undecidable region is discovered within 5 hops from the source node, so the amortized cost will be significantly less than in the worst-case scenario.

Experimental evaluation of MCQoS
Built an event-driven simulator
Generated random flat topologies of nodes using GT-ITM
  – Out-degree min(10, size/2)
Assigned link metrics from actual PlanetLab link measurement data

Convergence time of path dissemination
Convergence time: how long does it take to stabilize for a given network snapshot?
Re-stabilization time: how long does it take to stabilize once a link metric changes?
Being a path vector based protocol, MCQoS takes longer to converge, but it does not involve any NP-hard computation and thus scales with network size
QRON: link-state based multi-QoS routing protocol using the composite metric approach

Message overhead of path dissemination
Message overhead of MCQoS is comparable to that of the link-state based (QRON) protocol

Elaborating the undecidable region
K-hop path: paths in the undecidable region discovered within k hops of the on-demand route discovery process
Global feasible region: the feasible region at the source node if the source node knew all alternative paths, as in a link-state protocol
Depletion area: the part of the global feasible QoS region not known at the source node because many alternate paths are suppressed

Overhead of on-demand path discovery
How many hops does it take to discover the entire depletion area?
We measure the fraction of the depletion area discovered within k hops from the source node
More than 90% of the depletion area is discovered within 3 hops

Improvement in accuracy by MCQoS
A feasible path with a composite metric might not satisfy individual QoS metrics.
The line-segment based approach often suffers from loss/distortion.
Our hybrid approach has no false positives, and the false negative percentage can be reduced to less than 1% by 3-hop on-demand route discovery.

QoS violation ratio in dynamic environment with MCQoS
100-node topology
Generate QoS requests at a given arrival rate with b/w in [5Mbps, 55Mbps] and delay in [100ms, 400ms]
Each flow lasts between 5 and 10 minutes
We simulate the network behavior for 10 minutes
New flows arrive before the network stabilizes
  – Expect to observe QoS violations

  Arrival rate (conn/sec)   60     120    240    300    600
  Violation ratio (%)       0.32   0.33   0.78   0.4    1.12

The QoS violation ratio is negligible even with an arrival rate of 600 conn/sec.

MCQoS enabled overlay node prototype
[Figure: control plane (MCQoS, QoS path table) with inputs QoS path setup (Y:p -> X:q, D ms, L %, B Kbps), local link characteristics, peers (path ads), route discovery request and reply, flow setup request; flow setup towards the data plane (Click router, flow table mapping flow id Y:p -> X:q to next hop C), DataIn, DataOut; label S3]

Summary
Designed a scalable multi-constraint QoS flow route management protocol
  – Hybrid approach of path vector routing and on-demand route discovery
  – Keeps a balance between flow setup time and control message overhead
  – No complex NP-hard computation
Performed large-scale simulations to demonstrate the efficiency and scalability of the approach
Built a prototype using the Click modular router

‘Composite Metric’ approach to multi-QoS routing (1/2)
Composite metric = k1*delay + k2/bw, where k1 = 1, k2 = 10^7, delay in sec, b/w in bps
False positive: the flow is admitted but the path does not meet the QoS
False negative: there exists a feasible path but the flow is not admitted

‘Composite Metric’ approach to multi-QoS routing (2/2)
[Figure]

‘Line Segment’ approach to multi-QoS routing (1/2)
Lui et al. proposed a line-segment based approach for topology aggregation in the delay-bw plane.
Tam et al. designed a distance vector based QoS protocol using the line-segment approach
False positive: fraction of the undecidable region that is actually infeasible, but that the approach labels as feasible.
False negative: fraction of the undecidable region that is feasible, but that the approach labels as infeasible.

‘Line Segment’ approach to multi-QoS routing (2/2)
[Figure]
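
The composite-metric pitfall from the backup slides can be illustrated with a small sketch. The weights k1 = 1 and k2 = 10^7 come from the slide; the two candidate paths, the request, and the helper names are hypothetical values chosen here to expose the problem, and the way a composite-metric router actually makes its admission decision is an assumption.

```python
K1 = 1.0   # weight on delay in seconds (from the slide)
K2 = 1e7   # weight on inverse bandwidth in bps (from the slide)


def composite_metric(delay_s, bw_bps):
    """Composite metric = k1*delay + k2/bw; lower is treated as better."""
    return K1 * delay_s + K2 / bw_bps


def meets(path, max_delay_s, min_bw_bps):
    """Check a path against each QoS constraint individually."""
    return path["delay_s"] <= max_delay_s and path["bw_bps"] >= min_bw_bps


# Hypothetical candidate paths to the same destination.
paths = {
    "P1": {"delay_s": 0.010, "bw_bps": 1e6},    # low delay, low bandwidth
    "P2": {"delay_s": 0.100, "bw_bps": 50e6},   # high delay, high bandwidth
}
# Hypothetical request: delay at most 20 ms, bandwidth at least 0.5 Mbps.
request = {"max_delay_s": 0.020, "min_bw_bps": 0.5e6}

best = min(paths, key=lambda name: composite_metric(**paths[name]))
best_ok = meets(paths[best], **request)
any_ok = any(meets(p, **request) for p in paths.values())

print(best)     # P2: composite 0.3 versus about 10.01 for P1
print(best_ok)  # False: P2 violates the delay bound
print(any_ok)   # True: P1 satisfies both constraints
```

In this made-up case, admitting the flow on the composite-best path would be a false positive, while rejecting it after checking only that path would be a false negative, since P1 is feasible. Keeping a best path per metric is one way to avoid collapsing the constraints into a single number.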