The Edge of Smartness

Carey Williamson
Department of Computer Science
University of Calgary
Email: [email protected]

Copyright © 2005 Department of Computer Science

Main Message
• Now, more than ever, we need “smart edge” devices to enhance the performance, functionality, and efficiency of the Internet
[Figure: two end-host protocol stacks (Application, Transport, Network, Data Link, Physical) connected through the core network]

The End-to-End Principle
• Central design tenet of the Internet (simple core)
• Represented in design of TCP/IP protocol stack
• Wikipedia: Whenever possible, communication protocol operations should be defined to occur at the end-points of a communications system
• Some good reading:
  – J. Saltzer, D. Reed, and D. Clark, “End-to-End Arguments in System Design”, ACM ToCS, 1984
  – M. Blumenthal and D. Clark, “Rethinking the Design of the Internet: The end to end arguments vs. the brave new world”, ACM ToIT, 2001

The End-to-End Principle: Revisited
• Claim: The ongoing evolution of the Internet is blurring our notion of what an end system is
• This is true for both client side and server side
  – Client: mobile phones, proxies, middleboxes, WLAN
  – Server: P2P, cloud, data centers, CDNs, Hadoop
• When something breaks in the Internet protocol stack, we have to find a suitable retrofit to make it work properly
• We have done this repeatedly for decades, and will likely keep doing it again and again!

(Selected) Existing Examples
• Mobility: Mobile IP, MoM, Home/Foreign Agents
• Small devices: mobile portals, content transcoding
• Web traffic volume: proxy caching, CDNs
• Wireless: I-TCP, Proxy TCP, Snoop TCP, cross-layer
• IP address space: Network Address Translation (NAT)
• Multi-homing: smart devices, cognitive networks, SIP
• Big data: P2P file sharing, BT, download managers
• P2P file sharing: traffic classification, traffic shapers
• Security concerns: firewalls, intrusion/anomaly detection
• Intermittent connectivity: delay-tolerant networks (DTN)
• Deep space: inter-planetary IP

The Smart Edge
• Similar “tweaks” will be needed at server side
• Putting new functionality in a “smart edge” device seems like a logical choice, for reasons of performance, functionality, efficiency, security
• What is meant by “smart”?
  – Interconnected: one or more networks; define basic information units; awareness of location/context
  – Instrumented: suitably represent user activities; location, time, identity, and activity; perf metrics
  – Intelligent: provisioning, management, adaptation; appropriate decision-making in real-time

Example 1: Redundant Traffic Elimination

Basic Principles of RTE
• If you can “remember” what you have sent before, then you don’t have to send another copy
• Redundant Traffic Elimination (RTE)
• Done using a dictionary of chunks and their associated fingerprints
• Examples:
  – Joke telling by certain CS professors
  – Data deduplication in storage systems (90% savings)
  – “WAN Optimization” in networks (20% savings)

Redundant Traffic Elimination (RTE)
• Purpose: Use bottleneck link more efficiently
• Basic idea: Use a cache of data chunks to avoid transmitting identical chunks more than once
• RTE process:
  – Divide IP packet into chunks
  – Select a subset of chunks
  – Store a cache of chunks at two ends of a network link or path
  – Transfer only chunks that are not cached
• Works within and across files
• Combines caching and chunking
[Figure: chunk cache holding chunks A, B, C and their fingerprints, where FP A = fingerprint(Chunk A); labels: Distance, Overlap]

RTE Process Pipeline
• Improve traditional RTE
• Exploit traffic non-uniformities:
  – Packet size (bypass technique)
  – Chunk popularity (new cache management scheme)
  – Content type (content-aware RTE)
• Up to 50% more detected redundancy
[Figure: flowcharts comparing the current and proposed RTE pipelines, from NIC to forwarding. Boxes and decisions: Packet; Large enough?; Chunking (no overlap); Next chunk; Fingerprinting; Overlap OK?; Content promising?; Chunk expansion; FIFO cache management; non-FIFO cache management; Forwarding]

Main Sources of Redundancy

Type     Value   Description                  Example
Nulls    57.1%   Consecutive null bytes       0x00000000
Text     16.7%   Plain text (English)         Gnutella
HTTP      7.3%   HTTP directives              Content-Type:
Mixed     6.2%   Plain text and other chars   14pt font
Binary    5.8%   Random characters            0x27c46128
HTML      3.7%   HTML code fragments          <HTML> <p>
Char+1    3.2%   Repeated text chars          AAAAAAAz

RTE Summary
• Improves traditional RTE savings by up to 50%
• Techniques can be used individually or together
• RTE very beneficial for wireless traffic
  – 30% of users have 10-50% redundant traffic
• Proposed a novel content-aware RTE
  – Improve RTE savings by up to 38%
• Challenges of content-aware RTE
  – Needs refinement to be able to work on real traces, or exploit an appropriate traffic classification scheme
  – Needs improvement in execution time

Example 2: The TCP Incast Problem

Motivation
• Emerging IT paradigms
  – Data centers, grid computing, HPC, multi-core
  – Cluster-based storage systems, SAN, NAS
  – Large-scale data management “in the cloud”
  – Data manipulation via “services-oriented computing”
• Cost and efficiency advantages from IT trends, economy of scale, specialization marketplace
• Performance advantages from parallelism
  – Partition/aggregation, Hadoop, multi-core, etc.
  – Think RAID at Internet scale! (1000x)

Problem Formulation
How to provide high goodput for data center applications?
• High-speed, low-latency network (RTT ≤ 0.1 ms)
• Highly-multiplexed link (e.g., 1000 flows)
• Highly-synchronized flows on bottleneck link
• Limited switch buffer size (e.g., 100 packets)
[Figure labels: TCP retransmission timeouts; TCP throughput degradation]

Summary: TCP Incast Problem
• Data centers have specific network characteristics
• TCP-incast throughput collapse problem emerges
• Solutions:
  – Tweak TCP parameters for this environment
  – Redesign TCP for this environment
  – Rewrite applications for this environment (Facebook)
  – Smart edge coordination for uploads/downloads

Concluding Remarks
• We need “smart edge” devices to enhance the performance, functionality, security, and efficiency of the Internet (now more than ever!)
[Figure: two end-host protocol stacks (Application, Transport, Network, Data Link, Physical) connected through the core network]

Future Outlook and Opportunities
• Traffic classification
• QoS management
• Load balancing
• Security and privacy
• Cloud computing
• Virtualization everywhere
• Multipath TCP congestion control
• …
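
Backup: A Minimal RTE Sketch
The RTE process from Example 1 (chunk the packet, fingerprint each chunk, keep a chunk cache at both ends of the link, send only what is not cached) can be illustrated with a short Python sketch. It is not the implementation behind the results in this talk. It assumes fixed-size chunks, a truncated SHA-1 fingerprint, and a plain FIFO cache; the names `ChunkCache`, `encode`, `decode`, and `wire_size` and both constants are hypothetical and chosen only for illustration. It also skips the "select a subset of chunks" step and fingerprints every chunk.

```python
import hashlib
from collections import OrderedDict

CHUNK_SIZE = 64        # assumed fixed chunk size in bytes (hypothetical value)
CACHE_CAPACITY = 1024  # assumed cache capacity in chunks (hypothetical value)


def fingerprint(chunk):
    """Short fingerprint of a chunk; truncated SHA-1 is an assumption."""
    return hashlib.sha1(chunk).digest()[:8]


class ChunkCache:
    """FIFO chunk cache. Both link ends apply the same updates in the same
    order, so the two copies stay identical."""

    def __init__(self, capacity=CACHE_CAPACITY):
        self.capacity = capacity
        self.store = OrderedDict()  # fingerprint -> chunk, oldest first

    def add(self, fp, chunk):
        if fp in self.store:
            return
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)  # evict the oldest entry (FIFO)
        self.store[fp] = chunk


def encode(packet, cache):
    """Sender side: replace already-cached chunks with their fingerprints."""
    tokens = []
    for i in range(0, len(packet), CHUNK_SIZE):
        chunk = packet[i:i + CHUNK_SIZE]
        fp = fingerprint(chunk)
        if fp in cache.store:
            tokens.append(("ref", fp))     # redundant: send fingerprint only
        else:
            tokens.append(("raw", chunk))  # new: send the bytes and cache them
            cache.add(fp, chunk)
    return tokens


def decode(tokens, cache):
    """Receiver side: rebuild the packet, mirroring the sender's cache updates."""
    parts = []
    for kind, value in tokens:
        if kind == "ref":
            parts.append(cache.store[value])
        else:
            cache.add(fingerprint(value), value)
            parts.append(value)
    return b"".join(parts)


def wire_size(tokens):
    """Payload bytes sent on the link (token framing overhead is ignored)."""
    return sum(len(value) for _, value in tokens)


if __name__ == "__main__":
    sender, receiver = ChunkCache(), ChunkCache()
    packet = b"A" * 64 + b"B" * 64 + b"A" * 64  # 192 bytes, one repeated chunk
    for attempt in (1, 2):
        tokens = encode(packet, sender)
        assert decode(tokens, receiver) == packet
        print(attempt, wire_size(tokens), "of", len(packet), "bytes sent")
```

In this toy run the first transmission already saves one chunk because the packet repeats itself (redundancy within a packet), and the second transmission sends only fingerprints (redundancy across packets). Fixed-size chunking is the simplest choice to read; it would miss redundancy when content shifts by a few bytes, which is one reason practical systems choose chunk boundaries from the content itself.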