BREAKING THE DATA TRANSFER BOTTLENECK
UDT: A High Performance Data Transport Protocol
Yunhong GU
[email protected]
Laboratory for Advanced Computing
National Center for Data Mining
University of Illinois at Chicago
October 10, 2005
udt.sourceforge.net

Outline
- INTRODUCTION
- PROTOCOL DESIGN & IMPLEMENTATION
- CONGESTION CONTROL
- PERFORMANCE EVALUATION
- COMPOSABLE UDT
- CONCLUSIONS

>> INTRODUCTION

Motivations
- The widespread use of high-speed networks (1 Gb/s, 10 Gb/s, etc.) has enabled many new distributed data-intensive applications
  - Inexpensive fibers and advanced optical networking technologies (e.g., DWDM, Dense Wavelength Division Multiplexing)
  - 10 Gb/s is common in high-speed network testbeds; 40 Gb/s is emerging
- Large volumetric datasets
  - Satellite weather data
  - Astronomy observation
  - Network monitoring
- The Internet transport protocol (TCP) does NOT scale well as the network bandwidth-delay product (BDP) increases
- A new transport protocol is needed!

Data Transport Protocol
- Functionalities
  - Streaming, messaging
  - Reliability
  - Timeliness
  - Unicast vs. multicast
- Congestion control
  - Efficiency
  - Fairness
  - Convergence
  - Distributedness
[Diagram: protocol stack with Applications, Transport Layer, Network Layer, Data Link Layer, Physical Layer]

TCP
- Reliable, data streaming, unicast
- Congestion control
  - Increase the congestion window size (cwnd) by one full-sized packet per RTT
  - Halve the cwnd per loss event
- Poor efficiency in high bandwidth-delay product networks
- Bias against flows with larger RTT
[Figure annotation: ½ Bandwidth * RTT]

TCP
[Figure: TCP throughput (Mb/s) vs. round trip time (ms) for LAN, US, US-EU, and US-ASIA paths; curves labeled 0.01%, 0.05%, 0.1%, and 0.5%]

Related Work
- TCP variants
  - HighSpeed, Scalable, BiC, FAST, H-TCP, L-TCP
- Parallel TCP
  - PSockets, GridFTP
- Rate-based reliable UDP
  - RBUDP, Tsunami, FOBS, FRTP (based on SABUL), Hurricane (based on UDT)
- XCP
- SABUL

Problems of Existing Work
- Hard to deploy
  - TCP variants and XCP
  - Need modifications in OS kernel and/or routers
- Cannot be used in shared networks
  - Most reliable UDP-based protocols
- Poor fairness
  - Intra-protocol fairness
  - RTT fairness
- Manual parameter tuning

A New Protocol
[Figure: throughput (Mb/s) vs. round trip time (ms) for LAN, US, US-EU, and US-ASIA paths; curves labeled 0.01%, 0.05%, 0.1%, and 0.5%]

UDT (UDP-based Data Transfer Protocol)
- Application level, UDP-based
- Similar functionalities to TCP
  - Connection-oriented reliable duplex unicast data streaming
- New protocol design and implementation
- New congestion control algorithm
- Configurable congestion control framework

Objective & Non-objective
- Objective
  - For distributed data-intensive applications in high-speed networks
    - A small number of flows share the abundant bandwidth
  - Efficient, fair, and friendly
  - Configurable
  - Easily deployable and usable
- Non-objective
  - Replace TCP on the Internet

UDT Project
- Open source (udt.sourceforge.net)
- Design and implement the UDT protocol
- Design the UDT congestion control algorithm
- Evaluate experimentally the performance of UDT
- Design and implement a configurable protocol framework based on UDT (Composable UDT)

>> PROTOCOL DESIGN & IMPLEMENTATION

UDT Overview
- Two orthogonal elements
  - The UDT protocol
  - The UDT congestion control algorithm
- Protocol design & implementation
  - Functionality
  - Efficiency
- Congestion control algorithm
  - Efficiency, fairness, friendliness, and stability

UDT Overview
[Diagram: layering of Applications, UDT, the UDT socket API, the socket API, TCP, and UDP]

Functionality
- Reliability
  - Packet-based sequencing
  - Acknowledgment and loss report from receiver
  - ACK sub-sequencing
  - Retransmission (based on loss report and timeout)
- Streaming
  - Buffer/memory management
- Connection maintenance
  - Handshake, keep-alive message, teardown message
- Duplex
  - Each UDT instance contains both a sender and a receiver

Protocol Architecture
[Diagram: senders and receivers at endpoints A and B connected by a UDP channel; labels: data packet (Seq. No, TS, Payload), ACK (Seq. No), NAK (Loss List)]

Software Architecture
[Diagram: components API, CC, Sender, Sender's Buffer, Sender's Loss List, UDP Channel, Receiver, Receiver's Buffer, Receiver's Loss List, Listener]

Efficiency Consideration
- Fewer packets
  - Timer-based acknowledging
- Less CPU time
  - Reduce per-packet processing time
  - Reduce memory copy
  - Reduce loss list processing time
  - Light ACK vs. regular ACK
- Parallel processing
  - Threading architecture
- Less burst in processing
  - Evenly distribute the processing time

Application Programming Interface (API)
- Socket API
- New functionalities
  - sendfile/recvfile
  - Overlapped IO support
- Transparent to existing applications
  - Recompilation needed
  - Certain limitations exist
- XIO support (in Globus Toolkit 4.0)
- Wrapper for other programming languages
  - Java, Python

>> CONGESTION CONTROL

Overview
- Congestion control vs. flow control
  - Congestion control: effectively utilize the network bandwidth
  - Flow control: prevent the receiver from being overwhelmed by incoming packets
- Window-based vs. rate-based
  - Window-based: tune the maximum number of in-flight packets (TCP)
  - Rate-based: tune the inter-packet sending time (UDT)
- AIMD: additive increase, multiplicative decrease
- Feedback
  - Packet loss (most TCP variants, UDT)
  - Delay (Vegas, FAST)

AIMD with Decreasing Increases
- AIMD
  - $x \leftarrow x + \alpha(x)$, for every constant interval (e.g., RTT)
  - $x \leftarrow (1 - \beta)\,x$, when there is a packet loss event
  - where $x$ is the packet sending rate
- TCP
  - $\alpha(x) \equiv 1$, and the increase interval is RTT
  - $\beta = 0.5$
- AIMD with Decreasing Increase
  - $\alpha(x)$ is non-increasing, and $\lim_{x \to +\infty} \alpha(x) = 0$
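
The AIMD and decreasing-increase update rules above can be sketched in a few lines of Python. This is a minimal illustration, not UDT's implementation: the function names, the loss model (a loss is declared whenever the rate exceeds a fixed capacity), and the particular function `daimd_alpha` are assumptions chosen only so that the stated conditions hold (non-increasing, tending to 0 as x grows).

```python
from typing import Callable, List


def aimd_step(x: float, alpha: Callable[[float], float], beta: float, loss: bool) -> float:
    """Apply one control interval: multiplicative decrease on loss, else additive increase."""
    if loss:
        return (1.0 - beta) * x
    return x + alpha(x)


def tcp_alpha(x: float) -> float:
    """TCP-style increase: alpha(x) is identically 1 (one packet per interval)."""
    return 1.0


def daimd_alpha(x: float) -> float:
    """Hypothetical decreasing increase: non-increasing in x and tends to 0."""
    return 10.0 / (1.0 + x)


def simulate(alpha: Callable[[float], float], beta: float,
             capacity: float, steps: int, x0: float = 1.0) -> List[float]:
    """Run the update rule against a toy loss model (loss when x exceeds capacity)."""
    x = x0
    trace = []
    for _ in range(steps):
        x = aimd_step(x, alpha, beta, loss=(x > capacity))
        trace.append(x)
    return trace


if __name__ == "__main__":
    tcp_trace = simulate(tcp_alpha, beta=0.5, capacity=100.0, steps=300)
    daimd_trace = simulate(daimd_alpha, beta=0.5, capacity=100.0, steps=300)
    print("TCP-like peak rate:", max(tcp_trace))
    print("Decreasing-increase peak rate:", max(daimd_trace))
```

Passing the increase function as a parameter keeps the update rule identical for both cases, so the only difference between plain AIMD and the decreasing-increase variant is the shape of `alpha`.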

AIMD with Decreasing Increases
[Figure: α(x) vs. x for UDT, Scalable TCP, HighSpeed TCP, and AIMD (TCP NewReno)]

UDT Control Algorithm
- Increase
  - $\alpha(x) = f(B - x) \cdot c$, where $B$ is the link capacity (bandwidth) and $c$ is a constant parameter
  - $\alpha(x) = 10^{\lceil \log (B - x) \rceil} \cdot c$
  - Constant rate control interval (SYN), independent of RTT
  - SYN = 0.01 seconds
- Decrease
  - Randomized decrease factor: $\beta = 1 - (8/9)^n$

The Increase Formula: an Example
Bandwidth (B) = 10 Gbps, packet size = 1500 bytes

  x (Mbps)          B - x (Mbps)      Increment (pkts/SYN)
  [0, 9000)         (1000, 10000]     10
  [9000, 9900)      (100, 1000]       1
  [9900, 9990)      (10, 100]         0.1
  [9990, 9999)      (1, 10]           0.01
  [9999, 9999.9)    (0.1, 1]          0.001
  9999.9+           < 0.1             0.00067

Dealing with Packet Loss
- Loss synchronization
  - Randomization method
- Non-congestion loss
  - Do not decrease the sending rate for the first packet loss
  - M=5, N=2
  - M=8, N=3
- Packet reordering

Bandwidth Estimation
- Packet pair
  - Bottleneck bandwidth = packet size / packet spacing
  [Diagram: packet pairs P1, P2 spreading out across the bottleneck link]
- Filters
  - Cross traffic
  - Interrupt coalescence
- Robust to estimation errors
  - Randomized interval to send packet pair

>> PERFORMANCE EVALUATION

Performance Characteristics
- Efficiency
  - Higher bandwidth utilization, less CPU usage
- Intra-protocol fairness
  - Max-min fairness
  - Jain's fairness index
- TCP friendliness
  - Bulk TCP flow vs. bulk UDT flow
  - Short-lived TCP flow (slow start phase) vs. bulk UDT flow
- Stability (oscillations)
  - Stability index (standard deviation)

Evaluation Strategies
- Simulations vs. experiments
  - NS2 network simulator, NCDM teraflow testbed
- Setup
  - Network topology, bandwidth, distance, queuing, link error rate, etc.
  - Concurrency (number of parallel flows)
  - Comparison (against TCP)
- Real world applications
  - SDSS data transfer, high performance mining of streaming data, etc.
- Independent evaluation
  - SLAC, JGN2, UvA, Unipmn (Italy), etc.

Efficiency, Fairness, & Stability
[Diagram: testbed with hosts 206.220.241.13 to 206.220.241.16 at StarLight, Chicago and 145.146.98.78 to 145.146.98.81 at SARA, Amsterdam; 1 Gb/s bandwidth, 106 ms RTT; Flows 1 to 4 shown against a time axis of 0 to 700 s]

Efficiency, Fairness, & Stability
[Figure: throughput (Mbits/s) of Flows 1 to 4 vs. time (s), 0 to 700 s; the per-flow values are not recoverable from the extraction]

Metrics (seven values per row, in the order given on the slide):

  Efficiency   902    912    923    830    918    904    885
  Fairness     1      0.999  0.999  0.998  0.999  1      1
  Stability    0.11   0.11   0.08   0.16   0.04   0.02   0.04

TCP Friendliness
[Figure: TCP throughput (Mb/s) vs. number of UDT flows (0 to 10)]
- 500 1 MB TCP flows vs. 0 to 10 bulk UDT flows
- 1 Gb/s between Chicago and Amsterdam

>> COMPOSABLE UDT

Composable UDT - Objectives
- Easy implementation and deployment of new control algorithms
- Easy evaluation of new control algorithms
- Application awareness support and dynamic configuration

Composable UDT - Methodologies
- Packet sending control
  - Window-based, rate-based, and hybrid
- Control event handling
  - onACK, onLoss, onTimeout, onPktSent, onPktRecved, etc.
- Protocol parameters access
  - RTT, loss rate, RTO, etc.
- Packet extension
  - User-defined control packets

Composable UDT - Evaluation
- Simplicity: Can it be easily used?
- Expressiveness: Can it be used to implement most control protocols?
- Similarity: Can Composable UDT based implementations reproduce the performance of their native implementations?
- Overhead: Will the overhead added by Composable UDT be too large?

Simplicity & Expressiveness
- Eight event handlers, four protocol control functions, and one performance monitoring function
- Supports a large variety of protocols
  - Reliable UDP blast
  - TCP and its variants (both loss and delay based)
  - Group transport protocols

Simplicity & Expressiveness
[Class diagram: CCC (Base Congestion Control Class) with derived classes CTCP (TCP NewReno), CGTP (Group Transport Protocol), CUDPBlast (Reliable UDP Blast), CVegas (TCP Vegas), CScalable (Scalable TCP), CHS (HighSpeed TCP), CBiC (BiC TCP), CWestwood (TCP Westwood), and CFAST (FAST TCP)]
[Numeric annotations on the diagram, class assignment not recoverable from the extraction: 28; 73 / +132-6; 11 / +192-29; 8 / +27-1; 11 / +192-29; 27 / +145-2; 37 / +351-2]

Similarity and Overhead
- Similarity
  - How well Composable UDT based implementations can simulate their native implementations
  - CTCP vs. Linux TCP

             Throughput       Fairness         Stability
  Flow #     TCP     CTCP     TCP     CTCP     TCP     CTCP
  1          112     122      1       1        0.517   0.415
  2          191     208      0.997   0.999    0.476   0.426
  4          322     323      0.949   0.999    0.484   0.492
  8          378     422      0.971   0.999    0.633   0.550
  16         672     642      0.958   0.985    0.502   0.482
  32         877     799      0.988   0.997    0.491   0.470
  64         921     716      0.994   0.996    0.569   0.529

- CPU usage
  - Sender: CTCP uses about 100% more CPU time than Linux TCP
  - Receiver: CTCP uses about 20% more CPU time than Linux TCP

>> CONCLUSIONS

Contributions
- A high performance data transport protocol and associated implementation
  - The UDT protocol
  - Open source UDT library (udt.sourceforge.net)
  - Users include ANL, ORNL, PNNL, etc.
- An efficient and fair congestion control algorithm
  - DAIMD & the UDT control algorithm
  - Packet loss handling techniques
  - Using bandwidth estimation technique in congestion control
- A configurable transport protocol framework
  - Composable UDT

Publications
Papers on the UDT Protocol
- Supporting Configurable Congestion Control in Data Transport Services, Yunhong Gu and Robert L. Grossman, SC 2005, Nov. 12-18, Seattle, WA.
- Optimizing UDP-based Protocol Implementation, Yunhong Gu and Robert L. Grossman, PFLDNet 2005, Lyon, France, Feb. 2005.
- Experiences in Design and Implementation of a High Performance Transport Protocol, Yunhong Gu, Xinwei Hong, and Robert L. Grossman, SC 2004, Nov. 6-12, Pittsburgh, PA.
- An Analysis of AIMD Algorithms with Decreasing Increases, Yunhong Gu, Xinwei Hong, and Robert L. Grossman, First Workshop on Networks for Grid Applications (Gridnets 2004), Oct. 29, San Jose, CA.
- SABUL: A Transport Protocol for Grid Computing, Yunhong Gu and Robert L. Grossman, Journal of Grid Computing, 2003, Volume 1, Issue 4, pp. 377-386.
Internet Draft
- UDT: A Transport Protocol for Data Intensive Applications, Yunhong Gu and Robert L. Grossman, draft-gg-udt-01.txt.

Publications
Papers on Data Transfer Service using UDT
- Experimental Studies of Data Transport and Data Access of Earth Science Data over Networks with High Bandwidth Delay Products, Robert Grossman, Yunhong Gu, Dave Hanley, Xinwei Hong, and Parthasarathy Krishnaswamy, Computer Networks, Volume 46, Issue 3, Oct. 2004, pp. 411-421.
- Teraflows over Gigabit WANs with UDT, Robert Grossman, Yunhong Gu, Xinwei Hong, Antony Antony, Johan Blom, Freek Dijkstra, and Cees de Laat, Future Generation Computer Systems, Volume 21, Issue 4, pp. 501-513, April 2005.
- The Photonic TeraStream: Enabling Next Generation Applications Through Intelligent Optical Networking at iGrid 2002, J. Mambretti, J. Weinberger, J. Chen, E. Bacon, F. Yeh, D. Lillethun, R. Grossman, Y. Gu, and M. Mazzucco, Future Generation Computer Systems, Volume 19, Number 6, pp. 897-908.
- Experimental Studies Using Photonic Data Services at iGrid 2002, R. Grossman, Y. Gu, D. Hanley, X. Hong, D. Lillethun, J. Levera, J. Mambretti, M. Mazzucco, and J. Weinberger, Future Generation Computer Systems, 2003, Volume 19, Number 6, pp. 945-955.

Publications
Papers on Applications using UDT
- Open DMIX: High Performance Web Services for Distributed Data Mining, R. Grossman, Y. Gu, C. Gupta, D. Hanley, X. Hong, and P. Krishnaswamy, 7th International Workshop on High Performance and Distributed Mining.
- Open DMIX - Data Integration and Exploration Services for Data Grids, R. Grossman, Y. Gu, D. Hanley, X. Hong, and G. Rao, First International Workshop on Knowledge Grid and Grid Intelligence (KGGI 2003).
- Global Access to Large Distributed Data Sets using Photonic Data Services, R. Grossman, Y. Gu, D. Hanley, X. Hong, D. Lillethun, J. Levera, J. Mambretti, M. Mazzucco, and J. Weinberger, 20th IEEE/11th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST 2003), Los Alamitos, CA.
- Data Webs for Earth Science Data, Asvin Ananthanarayan, Rajiv Balachandran, Yunhong Gu, Robert Grossman, Xinwei Hong, Jorge Levera, and Marco Mazzucco, Parallel Computing, Volume 29, 2003, pp. 1363-1379.

Achievements
- SC 2002 Bandwidth Challenge: "Best Use of Emerging Network Infrastructure" Award
- SC 2003 Bandwidth Challenge: "Application Foundation" Award
- SC 2004 Bandwidth Challenge: "Best Replacement for FedEx / UDP Fairness" Award
- SC 2005: ? (Nov. 12-18, Seattle, WA)
  - High Performance Mining of Streaming Data using UDT
- iGrid 2005
  - Exploring and mining remote data at 10 Gb/s

Vision
- Short-term
  - A practical solution for distributed data-intensive applications in high-BDP environments
- Long-term
  - Evolve with new technologies (open source & open standard)
  - More functionalities and support for more use scenarios
  - Network research platform (e.g., fast prototyping and evaluation of new control algorithms)

The End
Thank You!
Yunhong Gu, October 10, 2005
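
As supplementary material, the increase-formula example from the congestion control section (B = 10 Gbps, 1500-byte packets) can be reproduced with a short Python sketch. It is an illustration under stated assumptions rather than UDT's actual code: the spare bandwidth is taken in bits per second, the constant `C` is inferred from the example (10 packets per SYN when the spare bandwidth rounds up to 10^10 b/s), and the floor `MIN_INCREMENT` of one packet-size-th of a packet is inferred from the 0.00067 entry. The function and constant names are hypothetical.

```python
import math

PACKET_SIZE_BYTES = 1500                  # packet size used in the slide's example
C = 1e-9                                  # assumed constant, inferred from the example table
MIN_INCREMENT = 1.0 / PACKET_SIZE_BYTES   # assumed floor, approximately 0.00067 pkts/SYN


def increment_pkts_per_syn(capacity_bps: float, rate_bps: float) -> float:
    """Return the per-SYN rate increment (in packets) for the given spare bandwidth."""
    spare_bps = capacity_bps - rate_bps
    if spare_bps <= 0:
        # No spare bandwidth: fall back to the assumed minimum increment.
        return MIN_INCREMENT
    # Round the spare bandwidth up to the next power of ten, then scale by C.
    stepped = 10.0 ** math.ceil(math.log10(spare_bps)) * C
    return max(stepped, MIN_INCREMENT)


if __name__ == "__main__":
    capacity_bps = 10e9  # B = 10 Gbps
    for rate_mbps in (5000, 9500, 9950, 9995, 9999.5, 9999.95):
        inc = increment_pkts_per_syn(capacity_bps, rate_mbps * 1e6)
        print(f"x = {rate_mbps} Mbps -> increment = {inc:.5f} pkts/SYN")
```

Each sample rate falls strictly inside one row of the example table, so the printed increments can be compared with the table row by row.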