* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Understanding Network Performance in Extreme Congestion Scenario
Distributed firewall wikipedia , lookup
Asynchronous Transfer Mode wikipedia , lookup
Network tap wikipedia , lookup
Zero-configuration networking wikipedia , lookup
Multiprotocol Label Switching wikipedia , lookup
Wake-on-LAN wikipedia , lookup
Piggybacking (Internet access) wikipedia , lookup
Computer network wikipedia , lookup
Airborne Networking wikipedia , lookup
TCP congestion control wikipedia , lookup
Internet protocol suite wikipedia , lookup
Deep packet inspection wikipedia , lookup
Cracking of wireless networks wikipedia , lookup
IEEE 802.1aq wikipedia , lookup
Routing in delay-tolerant networking wikipedia , lookup
Recursive InterNetwork Architecture (RINA) wikipedia , lookup
Transport and Application Layer Approaches to Improve End-to-end Performance in the Internet PhD thesis defense Amit Mondal Committee: Aleksandar Kuzmanovic, Asst. Professor, Northwestern Univ Peter Dinda, Assoc. Professor, Northwestern Univ Yan Chen, Assoc. Professor, Northwestern Univ Jin Li, Principal Researcher, Microsoft Research Internet — A multiservice IP network VoIP FTP IPTV The InternetSRis a UTAH commercial infrastructure used by diverse set of applications and services UCSB UCLA Video Conferencing Streaming Gaming 2 Challenges involved…  Applications have end-to-end network performance requirements – Jitter, latency, packet loss, bandwidth, etc  Original Internet – – – – Best effort service No service assurance TCP ensures only in-order packet delivery Destination-based IP routing “high throughput” “low delay” Need to provide support to new set of emerging applications in the Internet 3 Application classification based on QoS Low bandwidth High bandwidth Latency sensitive VoIP, network games, SSH, chatting, web browsing, ecommerce Multimedia streaming, IPTV, audio/video conferencing Latency insensitive Email File transfer (FTP/BitTorrent) My focus:  Low-latency interactive TCP applications (Chapter II and III) –  Telnet, SSH, network games, e-commerce, etc. Interactive multimedia services (Chapter IV and V) – Audio/video conferencing, VoIP, streamed multimedia services, etc. 4 Endpoint Forward error correction, Bitrate adaptation, Chapter-V, etc. TCP smart framing, Limited retransmit, Early retransmit, Chapter-II, Chapter-III, etc. N/A N/A N/A Data Link Physical Transport ECN, ECN+, packet marking & differential dropping, Service differentiation, etc. Application Infrastr Overlay routing (QRON, QSON, etc.) Chapter-IV IntServ, DiffServ, Traffic engineering, Constraint based routing Network The spectrum of QoS provisioning MPLS Bandwidth overprovisioning 5 Research thesis Despite much work to improve end-to-end performance in the Internet, there still exists a significant space for improvement. In my dissertation, I develop techniques to reduce the gap further.  For example, I propose techniques that improve – Response times of short TCP flows by five times in certain scenarios – Median Mean Opinion Score (MOS) of VoIP calls over WiFi by a factor of two 6 Outline Chapter I: Introduction  Chapter II: Improving performance of thin-stream TCP applications  Chapter III: Removing exponential backoff from TCP  Chapter IV: Multi-constraint QoS routing framework  Chapter V: Audio/video performance Issues: Diagnosis and solutions  Conclusion  7 Chapter II: Improving thin-stream TCP flows Upgrading mice to elephants data packets strict priority TCP-fair rate “dummy” packets Packet switched   Circuit switched A. Mondal and A. Kuzmanovic, “When TCP Friendliness Becomes Harmful”, IEEE INFOCOM 2007 A. Mondal and A. Kuzmanovic, “Upgrading Mice to Elephants: Effects and End-Point Solutions”, IEEE/ACM Transactions on Networking, Volume 18, Issue 2, April 2010 8 Chapter III: Removing Exponential Backoff from TCP  V. Jacobson, “Congestion Avoidance and Control,” in ACM CCR, 18(4): 314-329, Aug 1988. – Exponential retransmit timer backoff  Implicit packet conservation principle  Response times improvement of short and interactive flows by five times in certain scenarios  A. Mondal and A. Kuzmanovic, “Removing Exponential Backoff from TCP”, In ACM SIGCOMM CCR, Volume 38, Number 5, October 2008. 9 Chapter IV: Multi-constraint QoS routing framework  We design a framework that finds path under multiple constraints without NP-hard computation  Dijkstra’s algorithm involves NP-hard computation Hybrid protocol of path vector protocol and ondemand route discovery  Using simulation based on real-world data we demonstrated that our solution is both efficient and scalable  Built a functional prototype using Click Modular router   A. Mondal, P. Sharma, S. Banerjee, and A. Kuzmanovic, “Supporting Application Network Flows with Multiple QoS Constraints”, In IEEE IWQoS 2009 10 Chapter V: Audio/video performance issues: Diagnosis and solutions  Identify challenges towards high quality audio/video conferencing over the Internet  Understand loss and jitter behavior in shorter time scale and quantify impacts of various network scenarios  Investigate solutions   A. Mondal, R. Cutler, C. Huang, J. Li, and A. Kuzmanovic, “SureCall: Towards Glitch-Free Real-time Audio/Video Conferencing”, In IEEE IWQoS 2010 A. Mondal, C. Huang, M. Jain, J. Li, and A. Kuzmanovic, “A Case of WiFi Relay: Improving VoIP Quality for WiFi Users”, In IEEE ICC 2010 11 Modern AV conferencing System 12 SureCall platform  A distributed measurement and experiment platform – – – –  Understand problems and experiment solutions Agents installed on volunteers’ machines Measurements and experiments driven by masters SureCall agents are upgradeable without user intervention Available from http://research.microsoft.com/~chengh/SureCall/SureCall.htm 13 SureCall measurement  Emulated bidirectional audio/video sessions using UDP – – – – –  5 minute per hour Audio bitrate : 24 kbps Video bitrate: 192 kbps STUN NAT traversal protocol for home users Detailed packet-level traces collected Network connectivity close to the clients – ICMP packet pair with TTL=2   Traceroute to other endpoint at the beginning and end of each session Environmental details on client machines – CPU load, network interface type 14 SureCall deployment Microsoft global enterprise network  Many residential networks  Current deployment status  – 80 unique machines • Enterprise - 32 • Home – 20 • Both – 28  Enterprise trace and Home trace – Two separate masters (within enterprise network and in public Internet) 15 SureCall dataset  4,800 hours of packet traces – 4,100 from enterprise – 700 from home  1968 unique IP addresses – Enterprise - 1212 – Home -756  Trace classification and stratification – Intra-continental vs intercontinental – Wired vs wireless – Audio-only vs audio+video  Trace preprocessing – Clock skew removal Clock skew in wild 16 Jitter computation algorithm  Multiple algorithms to compute jitter – Variance of one-way-delay samples – Time difference between actual packet receiving time and ideal receiving time • Most relevant for multimedia streaming/conferencing with playout buffer 17 Jitter in enterprise and residential networks US-US, wired traces Inter-continental, wired traces Residential networks have significantly higher jitter compared to enterprise networks and affected greatly by inter-continental links. 18 Jitter variation across hosts Enterprise Home Jitter variation is much higher in residential networks than in enterprise networks. The 95-th percentile jitter values are significantly worse than median jitter values in home networks. 19 Packet loss in residential and enterprise networks Even well provisioned enterprise networks can become quite congested in short time scale. Both enterprise and home networks show long tail in loss burst size distribution. 20 Impact of WiFi connections In both enterprise and home networks, wireless traces show significantly worse jitter statistics than wired traces. Enterprise Home 21 Impact of WiFi connections In both enterprise and home networks, wireless traces show significantly worse jitter statistics than wired traces. The degradation due to WiFi in enterprise scenarios is more severe than that in home scenarios. Enterprise Home 22 Impact of VPN on performance Jitter Loss VPN connection causes more degradation compared to wireless. 23 Can jitter predict future loss events?  Extent to which loss and jitter are correlated, i.e. whether abrupt jitter increase can serve as a precursor of network congestion and predict future loss events – audio/video conferencing applications can take anticipatory action.  > 10 ms average increase in end-to-end delay for the last three packets preceding a loss event – enterprise networks ~ 82% , – home networks ~ 80% 24 Correlation between loss burst size and jitter Enterprise Home 1. End-to-end delay increases significantly before loss events in both enterprise and home networks. 2. Increase in end-to-end delay is not a great indicator of loss burst size in enterprise networks. 25 Network audio diagnostics Concealed: percent of packets interpolated or extrapolated due to unrecovered packet loss  Stretched: percent of packets stretched via time compression  Classifier operates as follows   Supervised training with ground-truth objectively determined by PESQ score 26 Audio classifier performance The classifier achieves a true positive rate >80% and false positive rate < 1% for T1=T2=0.07. 27 WiFi Relay: Improving VoIP Quality for WiFi Users  Large number of WiFi clients both in enterprise and residential networks – 43% enterprises provide only WiFi connections to their employees – 36% uses VoIP over WiFi WiFi links can significantly degrade VoIP performance  Possible reasons – dense deployment of APs, overloading of an AP point, other wireless devices in the vicinity, etc 28 Effectiveness of redundancy  Passive analysis with voice packet replication – Replication ratio r = 2,3,4, or 5 Packet losses can be effectively mitigated using application layer packet replication 29 Overhead of replication • Typical audio packet size = 60 bytes • Encapsulated with RTP(12bytes), UDP (8bytes), IP(20bytes), 802.11 MAC(28bytes), PHY (20us for 802.11g) headers. • w/o ACK: air time = DIFS + PHY header + (60+76 bytes)/54Mbps = 70 us Replication ratio 1 Air time (us) w/o ACK w/ ACK 70 102 96 128 Replicating audio packet at application layer 111 causes 79 only marginal increase in air time 3 87 120 2 4 30 WiFi relay solution    Nearby wired endpoints as relays Heavy replication between relays and wireless endpoints No dedicated infrastructure 31 Evaluation  Evaluated on SureCall platform – Upgrade SureCall clients to support relay  Simultaneous direct call and relayed VoIP calls between each pair of SureCall agents – Apple-to-apple comparison – One-hop overlay (only one wireless endpoint) – Two-hop overlay (both endpoints are wireless)  Relay node selection based on enterprise internal database 32 Impact of relay on jitter  No dedicated infrastructure, ordinary endpoints as relay nodes Relay has negligible impact on end-to-end jitter CDF of jitter diff at 50th percentile CDF of jitter diff at 95th percentile 33 Improvement with WiFi relay WiFi relay greatly reduces packet loss WiFi relay significantly improve VoIP quality for WiFi users  Mean Opinion Score (MOS) – Calculated from packet loss rate and jitter (Cole et al. CCR’01) – Fixed de-jitter buffer of 100 ms 34 Summary of Chapter V SureCall, a distributed experimental platform, to address the challenges of audio/video communications over Internet.  Characterized enterprise and residential networks over a wide variety of network scenarios  Classifier that accurately predicts when network issues most likely to cause audio quality degradation  WiFi relay that significantly improve VoIP qualify for WiFi clients  35 Conclusion Proposed easily deployable techniques to improve performance of TCP based interactive applications  Demonstrated that exponential backoff can be altogether removed from TCP without any stability issues  Designed an overlay framework to support multimedia services with multiple QoS constraints  Developed an distributed experimental framework, SureCall, to understand the challenges towards IP based audio/video communications and for rapid evaluation of new protocols  36 Thank you! 37 Publications [1] A. Mondal and A. Kuzmanovic, “When TCP Friendliness Becomes Harmful”, In IEEE INFOCOM 2007 [2] A. Mondal and A. Kuzmanovic , “A Poisoning-Resilient TCP Stack”, In IEEE ICNP 2007 [3] A. Mondal and A. Kuzmanovic , “Removing Exponential Backoff from TCP”, In ACM SIGCOMM CCR, Volume 38, Number 5, October 2008. [4] A. Mondal, P. Sharma, S. Banerjee, and A. Kuzmanovic, “Supporting Application Network Flows with Multiple QoS Constraints”, In IEEE IWQoS 2009 [5] A. Kuzmanovic, A Mondal, S. Floyd, and K.K. Ramakrishnan. “Adding Explicit Congestion Notification (ECN) Capabilities to TCP’s SYN/ACK Packets”. RFC 5562, June 2009. [6] A. Mondal and A. Kuzmanovic, “Upgrading Mice to Elephants: Effects and End-Point Solutions”, In IEEE/ACM Transactions on Networking, Volume 18, Issue 2, April 2010 [7] A. Mondal, R. Cutler, C. Huang, J. Li, and A. Kuzmanovic, “SureCall: Towards Glitch-Free Realtime Audio/Video Conferencing”, In IEEE IWQoS 2010 [8] A. Mondal, C. Huang, M. Jain, J. Li, and A. Kuzmanovic, “A Case of WiFi Relay: Improving VoIP Quality for WiFi Users”, In IEEE ICC 2010 [9] A. Mondal, I. Trestian, Z. Quin, and A. Kuzmanovic, “P2P as CDN (Akamizing BitTorrent)”, under submission [10] J. Miller, A. Mondal, R. Potharaju, P Dinda, and A. Kuzmanovic, “Network Monitoring is People: Understanding End-user Perception of Network Problems”, Under submission. 38 Backup slides 39 QoS and the Internet  QoS Architectures – – – –  Integrated Service (Intserv) Differentiated Service (Diffserv) Multi Protocol Label Switching (MPLS) Traffic Engineering and Constraint based routing Key Challenges – Scalability issues in core – Complex signaling protocols – Deployment overhead  Current Internet still offers only a best-effort service  Motivates to investigate easily deployable solutions that improve end-to-end network performance 40 QoS using transport and application layer techniques without network support Explicit congestion notification [ Floyd 94]  Packet marking and differential dropping [Guo and Matta’01]  Limited transmit [Allman et al. 01]  Service differentiation [Neoreddine and Tobagi’02]  Differential congestion notification [Le et al.’04]  TCP smart framing [Mellia et al. ‘05]  ECN+ [Kuzmanovic’05]  Early retransmit [Allman et al.’06]  TCP SAReno [Yang and Vecinia’02]  PCP [Anderson et al. ‘06]  41 Going beyond TCP-fair  Differentiated minRTO – Application-limited flows use reduced minRTO value  Short-term padding with dummy packets – Application data followed by three tiny dummy packets  Diversity approach – Application layer FEC-based approach – The simplest FEC scheme is replication 42 Why Exponential Backoff?  Jacobson adopted exponential backoff from the classical shared-medium Ethernet protocol – “IP gateway has essentially the same behavior as Ether in a shared-medium network.” 43 Why Exponential Backoff?  Jacobson adopted exponential backoff from the classical shared-medium Ethernet protocol – “IP gateway has essentially the same behavior as Ether in a shared-medium network.” – Not true! C C 44 Removing exponential backoff from TCP and its implications Other reasons: no admission control, finite flow size, skewed traffic distribution, etc.  When to resend a packet?  – Implicit packet conservation principle • As soon as the retransmission timeout expires – End-to-end performance can only improve if we remove the exponential backoff from TCP  Implications – Significant improvement of response times for short and interactive TCP flows 45 Multiple QoS Constraints  The Internet evolves towards the global multiservice IP network – Diverse applications and different QoS requirements  Many applications have multiple QoS requirements – Video streaming, VoIP, Video conferencing, etc. Need support for end-to-end QoS guarantee under multiple constraints  Multiple QoS constraints often make the routing problem intractable  46 QoS provisioning using overlay networks  Build Overlay Backbone – Deploy overlay nodes at strategic locations in the Internet  Provide support for per-flow forwarding – e.g. Anagran Flow Aware Routers  Flow route management architecture – Discover and setup end-to-end paths for individual flows with diverse flow QoS requirements – Monitor end-to-end flow performance to trigger path adaptation 47 Overlay flow QoS management architecture Configure intermediate overlay nodes for per-flow forwarding Sensing local link characteristics Adapt to different path dynamically as current path fails to meet QoS parameters Find a path to X with b/w > b, delay < d and loss < l% AS1 AS3 AS4 AS2 End user Overlay node Physical link Logical link 48 Contribution Design a scalable QoS routing protocol which finds path under multiple constraints  Propose a distributed algorithm for dynamic path adaptation  Evaluate accuracy, efficiency and scalability of the protocol using large-scale simulation and compare with other existing approaches  Build a functional prototype using Click modular router  49 Design challenges  Multiple QoS metrics – Finding a feasible path using Dijkstra’s algorithm is NPComplete – Randomized and approximation algorithms – Single composite metric derived from multiple metrics • Paths might not meet individual QoS constraints  Dynamic overlay-link properties – Increases control message overhead 50 Multi-constraint QoS routing protocol  Path vector protocol to disseminate path information – Tag with QoS parameters  How to aggregate path information when multiple QoS metrics are considered? – Distribute the best paths for each metrics  What about QoS requests which could be served by paths which are not in the best path set? – On-demand route discovery  A. Mondal, P. Sharma, S. Banerjee, and A. Kuzmanovic, “Supporting Application Network Flows with Multiple QoS Constraints”, In IEEE IWQoS 2009 51 MCQoS: Disseminating path information Advertise best path for each QoS metric Local link info B QoS Path Table X AS1 (10ms, 0.01%, 1Mbps) AS3 (5ms, 0.01%, 768Kbps) AS5 B/w X AS1 (2ms, 0.0%, 128Kbps) AS3 (3ms, 0.005%, 378Kbps) AS5 Loss X AS1 (2ms, 0.01%, 128Kbps) AS3 (3ms, 0.02%, 378Kbps) AS5 Delay A Tag QoS characteristics 52 MCQoS: Aggregating path information  What about QoS requests in the undecidable region? Bandwidth (b/w) undecideable best b/w best delay infeasible feasible Delay There The source will feasible cannot node exist already requests a feasible knows that path acan path in be theifsupported network the QoS if request but thethe QoS source falls request in the node falls feasible in might the infeasible region not knowregion about those paths, thus cannot admit flows based on local information 53 MCQoS: On-demand route discovery B/W infeasible undecideable Admit or deny flow based on local QoS table if in feasible or infeasible region  Otherwise, On-demand route discovery for requests in undecideable region  Exploit advertisement received from neighbors to reduce search space while route discovery feasible Delay C A  B E D 54 MCQoS: Illustration through example best b/w best delay 10ms 12Mbps 100ms 50Mbps 4ms 5Mbps 105ms 50Mbps A (1ms, 100Mbps) 5ms 5Mbps 106ms 50Mbps C B (5ms, 100Mbps) (2ms, 20Mbps) 2ms 8ms E 5Mbps D 20Mbps Requests: 10ms, 3Mbps 120ms, 15Mbps 10ms, 100Mbps 15ms, 15Mbps OK ABD…E OK ABC…E X ---OK ABD…E ??? 55 Route maintenance in MCQoS D F H A C E G B  Route maintenance through path patching  Each intermediate node knows the QoS requirements from the node to the destination  Upstream node periodically pushes QoS requirements to downstream nodes  As a node detects QoS violation, it triggers alternate path search at local node – Notify upstream node if no alternative path 56 Overhead analysis of path dissemination 10 2 1 5 9 3 4 4 10 6 7 8 5 6 10 In MCQoS protocol, a node advertises only the best path to a destination. Thus many alternative paths are pruned, which increases scalability. 57 Overhead analysis of on-demand route discovery  Parameters – Average out-degree of the nodes – Overlay distance between source to destination  Worst case – Message overhead is proportional to sum of all possible path lengths from source to destination  Amortized cost – Fraction of request in undecidable region – Limit no of hops of route discovery More than 99% of the undecidable region is discovered within 5 hops from the source node, thus amortized cost will be significantly less than worst case scenario. 58 Experimental evaluation of MCQoS  Built an event-driven simulator  Generated random flat topology of nodes using GTITM – Outdegree min(10, size/2)  Assigned link metrics from actual planetlab link measurement data 59 Convergence time of path dissemination  Convergence time: how long does it take to stabilize for a given network snapshot?  Re-stabilization time: how long does it take to stabilize Being path vector based protocol MCQoS takes longer once a link metric changes? time to converge, but does not involve any NP-hard computation, thus scale with network size  QRON: Link state based multi-QoS routing protocol using composite metric approach 60 Message overhead of path dissemination Message overhead of MCQoS is comparable to LinkState based (QRON) protocol 61 Elaborating the undecidable region  K-hop path: paths in the undecidabe region discovered within k-hops of on-demand route discovery process Depletion  Global feasible region: feasible region at the source node ifarea the source node knew all alternative paths like link-state protocol  Depletion area: part of global feasible QoS region not known at the source node because many alternate paths are suppressed 62 Overhead of on-demand path discovery  How many hops does it take to discover the entire depletion area?  We measure the fraction of depletion area discovered More thank90% the depletion is discovered within hopsoffrom the sourcearea node within 3 hops 63 Improvement in accuracy by MCQoS  A feasible path with a composite metric might not satisfy individual QoS metrics.  The line-segment based approach often suffers from loss/distortion.  Our hybrid approach has no false positive and false negative percentage can be reduced to less than one 1% by 3-hop on-demand route discovery. 64 QoS violation ratio in dynamic environment with MCQoS  100 node topology  Generate QoS requests with certain arrival rate with b/w [5Mbps, 55Mbps] and delay [100ms,400ms]  Each flow lasts between 5 to 10 minutes  We simulate the network behavior for 10 minutes  New flows arrive before network stabilizes – Expect to observe QoS violation Arrival rate (conn/sec)The 60 120 240 300 QoS violation ratio is negligible even with Violation arrival rate of 600 conn/sec. 0.32 0.33 0.78 0.4 ratio (%) 600 1.12 65 MCQoS enabled overlay node prototype QoS path setup (Y:p -> X:q, Dms, L%, BKbps) Local link characteristics Peers (path ads) Rt. discovery req, Rt. discovery reply Flow setup req Flow setup DataIn Control Plane S3 MCQoS QoS Path table Flow id Next hop Y:p ->X:q C Click Router DataOut Data Plane 66 Summary  Designed a scalable multiple constraints QoS flow route management protocol – – – hybrid approach of path vector routing and on-demand route discovery Keep balance between flow setup time and control message overhead No complex NP-hard computation Performed large-scale simulations to demonstrate the efficiency and scalability of the approach  Built a prototype using Click modular router  67 ‘Composite Metric’ approach to multi-QoS routing (1/2)  Composite Metric = K1*delay + k2/bw where k1=1, k2 = 10^7, delay in sec, b/w in bps  False positive: flow is admitted but the path does not meet the QoS  False negative: there exists a feasible path but the flow is not admitted 68 ‘Composite Metric’ approach to multi-QoS routing (2/2) 69 ‘Line Segment’ approach to multi-QoS routing (1/2)  Lui et al. proposed line segment based approach to for topology aggregation in delay-bw plane. Tam et al. designed a distance vector based QoS protocol using the line-segment approach  False positive: Fraction of undecidable region that is actually infeasible, but the approach labels as feasible.  False negative: Fraction of undecidable region that is feasible, but the approach labels as infeasible. 70 ‘Line Segment’ approach to multi-QoS routing (2/2) 71
 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                            