BREAKING THE DATA TRANSFER BOTTLENECK
UDT: A High Performance Data Transport Protocol
Yunhong GU
Laboratory for Advanced Computing
National Center for Data Mining
University of Illinois at Chicago
October 10, 2005
udt.sourceforge.net
Outline
INTRODUCTION
PROTOCOL DESIGN & IMPLEMENTATION
CONGESTION CONTROL
PERFORMANCE EVALUATION
COMPOSABLE UDT
CONCLUSIONS

INTRODUCTION
Motivations
• The widespread use of high-speed networks (1Gb/s, 10Gb/s, etc.) has enabled many new distributed data-intensive applications
    • Inexpensive fiber and advanced optical networking technologies (e.g., DWDM, Dense Wavelength Division Multiplexing)
    • 10Gb/s is common in high-speed network testbeds; 40Gb/s is emerging
• Large volumetric datasets
    • Satellite weather data
    • Astronomy observation
    • Network monitoring
• The Internet transport protocol (TCP) does NOT scale well as the network bandwidth-delay product (BDP) increases
• A new transport protocol is needed!

Data Transport Protocol
• Functionalities
    • Streaming, messaging
    • Reliability
    • Timeliness
    • Unicast vs. multicast
• Congestion control
    • Efficiency
    • Fairness
    • Convergence
    • Distributedness
[Diagram: protocol stack, top to bottom: Applications, Transport Layer, Network Layer, Data Link Layer, Physical Layer]

TCP
• Reliable, data streaming, unicast
• Congestion control
    • Increase the congestion window size (cwnd) by one full-sized packet per RTT
    • Halve the cwnd per loss event
    [Diagram label: ½ Bandwidth × RTT]
• Poor efficiency in high bandwidth-delay product networks
• Bias against flows with larger RTT

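The efficiency problem can be made concrete with a rough calculation. The sketch below is an idealized model, not a measurement: it assumes one loss event, a window that had exactly filled the pipe, and 1500-byte packets, and the function name is made up for illustration.

```python
def tcp_recovery_time(bandwidth_bps, rtt_s, packet_bytes=1500):
    """Seconds for an idealized TCP flow to regain full rate after one loss."""
    # Window (in packets) that fills the pipe: the bandwidth-delay product.
    full_window = bandwidth_bps * rtt_s / (8 * packet_bytes)
    # A loss halves cwnd; additive increase then adds one packet per RTT,
    # so regaining the lost half takes full_window / 2 round trips.
    rtts_to_recover = full_window / 2
    return rtts_to_recover * rtt_s


if __name__ == "__main__":
    for rtt_ms in (1, 10, 100):
        seconds = tcp_recovery_time(1e9, rtt_ms / 1000)
        print(f"RTT {rtt_ms:>3} ms at 1 Gb/s: about {seconds:.2f} s to recover")
```

Under these assumptions the recovery time grows with the square of the RTT, which is one way to see both the poor efficiency and the RTT bias listed above.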
TCP
[Figure: TCP throughput (Mb/s, 0 to 1000) versus round trip time (ms, 1 to 400) at packet loss rates from 0.01% to 0.5%, with RTT ranges labeled LAN, US, US-EU, and US-ASIA]

Related Work
• TCP variants
    • HighSpeed, Scalable, BiC, FAST, H-TCP, L-TCP
• Parallel TCP
    • PSockets, GridFTP
• Rate-based reliable UDP
    • RBUDP, Tsunami, FOBS, FRTP (based on SABUL), Hurricane (based on UDT)
• XCP
• SABUL

Problems of Existing Work
• Hard to deploy
    • TCP variants and XCP
        • Need modifications in OS kernel and/or routers
• Cannot be used in shared networks
    • Most reliable UDP-based protocols
• Poor fairness
    • Intra-protocol fairness
    • RTT fairness
• Manual parameter tuning

A New Protocol
[Figure: throughput (Mb/s, 0 to 1000) versus round trip time (ms, 1 to 400) at packet loss rates from 0.01% to 0.5%, with RTT ranges labeled LAN, US, US-EU, and US-ASIA]

UDT (UDP-based Data Transfer Protocol)
• Application level, UDP-based
• Similar functionalities to TCP
    • Connection-oriented, reliable, duplex, unicast data streaming
• New protocol design and implementation
• New congestion control algorithm
• Configurable congestion control framework

Objective & Non-objective
• Objective
    • For distributed data-intensive applications in high-speed networks
    • A small number of flows share the abundant bandwidth
    • Efficient, fair, and friendly
    • Configurable
    • Easily deployable and usable
• Non-objective
    • Replace TCP on the Internet

UDT Project
• Open source (udt.sourceforge.net)
• Design and implement the UDT protocol
• Design the UDT congestion control algorithm
• Evaluate experimentally the performance of UDT
• Design and implement a configurable protocol framework based on UDT (Composable UDT)

PROTOCOL DESIGN & IMPLEMENTATION

UDT Overview
• Two orthogonal elements
    • The UDT protocol
    • The UDT congestion control algorithm
• Protocol design & implementation
    • Functionality
    • Efficiency
• Congestion control algorithm
    • Efficiency, fairness, friendliness, and stability

UDT Overview
[Diagram: layering of Applications, UDT Socket, UDT, Socket API, TCP, and UDP]

Functionality
• Reliability
    • Packet-based sequencing
    • Acknowledgment and loss report from receiver
    • ACK sub-sequencing
    • Retransmission (based on loss report and timeout)
• Streaming
    • Buffer/memory management
• Connection maintenance
    • Handshake, keep-alive message, teardown message
• Duplex
    • Each UDT instance contains both a sender and a receiver

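Packet-based sequencing with receiver-side loss reports can be illustrated with a small sketch. This is a simplified model under stated assumptions, not UDT's actual data structures: the class and method names are hypothetical, sequence numbers are plain integers with no wraparound, and timers are omitted.

```python
class ReceiverLossTracker:
    """Toy receiver: detects gaps in sequence numbers and tracks missing packets."""

    def __init__(self, first_seq=0):
        self.next_expected = first_seq  # smallest sequence number not yet seen
        self.loss_list = set()          # sequence numbers known to be missing

    def on_data(self, seq):
        """Process one arriving packet; return the sequence numbers to report as lost."""
        if seq >= self.next_expected:
            # Everything between the expected packet and this one was skipped.
            newly_lost = list(range(self.next_expected, seq))
            self.loss_list.update(newly_lost)
            self.next_expected = seq + 1
            return newly_lost
        # A retransmitted or reordered packet fills a hole in the loss list.
        self.loss_list.discard(seq)
        return []

    def ack_number(self):
        """All packets before this sequence number have been received."""
        return min(self.loss_list) if self.loss_list else self.next_expected
```

For example, receiving packets 0 and then 3 would report 1 and 2 as lost, and the acknowledgment number would stay at 1 until packet 1 is retransmitted.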
Protocol Architecture
[Diagram: endpoints A and B, each with a Sender and a Receiver, connected by a UDP channel. Data packets carry Seq. No, TS, and Payload; control packets are ACK (Seq. No) and NAK (Loss List)]

Software Architecture
[Diagram: software components: API, CC, Sender, Receiver, Listener, Sender's Buffer, Receiver's Buffer, Sender's Loss List, Receiver's Loss List, UDP Channel]

Efficiency Considerations
• Fewer packets
    • Timer-based acknowledging
• Less CPU time
    • Reduce per-packet processing time
    • Reduce memory copy
    • Reduce loss list processing time
    • Light ACK vs. regular ACK
• Parallel processing
    • Threading architecture
• Less burstiness in processing
    • Evenly distribute the processing time

Application Programming Interface (API)
• Socket API
• New functionalities
    • sendfile/recvfile
    • Overlapped IO support
• Transparent to existing applications
    • Recompilation needed
    • Certain limitations exist
• XIO support (in Globus Toolkit 4.0)
• Wrapper for other programming languages
    • Java, Python

CONGESTION CONTROL

Overview
• Congestion control vs. flow control
    • Congestion control: effectively utilize the network bandwidth
    • Flow control: prevent the receiver from being overwhelmed by incoming packets
• Window-based vs. rate-based
    • Window-based: tune the maximum number of in-flight packets (TCP)
    • Rate-based: tune the inter-packet sending time (UDT)
• AIMD: additive increase, multiplicative decrease
• Feedback
    • Packet loss (most TCP variants, UDT)
    • Delay (Vegas, FAST)

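The two control knobs are related by simple arithmetic. The sketch below assumes 1500-byte packets and uses hypothetical function names. It shows the inter-packet time a rate-based sender would use for a target rate, and the window a window-based sender would need for the same rate.

```python
def inter_packet_interval(rate_bps, packet_bytes=1500):
    """Rate-based control: seconds to wait between consecutive packets."""
    return packet_bytes * 8 / rate_bps


def window_for_rate(rate_bps, rtt_s, packet_bytes=1500):
    """Window-based control: packets that must be in flight to sustain the rate."""
    return rate_bps * rtt_s / (packet_bytes * 8)


if __name__ == "__main__":
    print(f"{inter_packet_interval(1e9) * 1e6:.1f} microseconds between packets at 1 Gb/s")
    print(f"{window_for_rate(1e9, 0.1):.0f} packets in flight at 1 Gb/s and 100 ms RTT")
```

The window depends on the RTT while the inter-packet interval does not, which is why a rate-based sender can be tuned on a fixed interval.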
AIMD with Decreasing Increases
• AIMD
    • $x = x + \alpha(x)$, for every constant interval (e.g., RTT)
    • $x = (1 - \beta) x$, when there is a packet loss event
    where $x$ is the packet sending rate.
• TCP
    • $\alpha(x) \equiv 1$, and the increase interval is RTT.
    • $\beta = 0.5$
• AIMD with Decreasing Increases
    • $\alpha(x)$ is non-increasing, and $\lim_{x \to +\infty} \alpha(x) = 0$.

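The update rules can be sketched as a toy simulation. The loss model here is an assumption made for illustration (a loss event occurs whenever the rate exceeds a fixed capacity), and `decreasing_alpha` is just one hypothetical function that is non-increasing and tends to zero. It is not the UDT increase function.

```python
def aimd_step(x, alpha, beta, loss):
    """One control interval: multiplicative decrease on loss, else additive increase."""
    if loss:
        return (1 - beta) * x
    return x + alpha(x)


def tcp_alpha(x):
    """TCP-style increase: a constant one packet per interval."""
    return 1.0


def decreasing_alpha(x):
    """A hypothetical non-increasing increase function with limit 0 as x grows."""
    return 1.0 / (1.0 + x)


def simulate(x0, alpha, beta, capacity, steps):
    """Toy model: a loss event is assumed whenever the rate exceeds capacity."""
    x, trace = x0, []
    for _ in range(steps):
        x = aimd_step(x, alpha, beta, loss=(x > capacity))
        trace.append(x)
    return trace


if __name__ == "__main__":
    print(simulate(1.0, tcp_alpha, 0.5, capacity=4.0, steps=6))
```

With the TCP parameters the rate climbs by one per interval, overshoots the capacity, and is then halved, which gives the familiar sawtooth.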
AIMD with Decreasing Increases
[Figure: increase function $\alpha(x)$ versus $x$ for UDT, Scalable TCP, HighSpeed TCP, and AIMD (TCP NewReno)]

UDT Control Algorithm
• Increase
    • $\alpha(x) = f(B - x) \cdot c$, where $B$ is the link capacity (bandwidth) and $c$ is a constant parameter
    • $\alpha(x) = 10^{\lceil \log_{10}(B - x) \rceil} \times c$
    • Constant rate control interval (SYN), independent of RTT
        • SYN = 0.01 seconds
• Decrease
    • Randomized decrease factor
    • $\beta = 1 - (8/9)^n$
[Figure: $\alpha(x)$ versus $x$]

The Increase Formula: an Example
Bandwidth (B) = 10 Gbps, Packet size = 1500 bytes

x (Mbps)          B - x (Mbps)      Increment (pkts/SYN)
[0, 9000)         (1000, 10000]     10
[9000, 9900)      (100, 1000]       1
[9900, 9990)      (10, 100]         0.1
[9990, 9999)      (1, 10]           0.01
[9999, 9999.9)    (0.1, 1]          0.001
9999.9+           <0.1              0.00067

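The table can be reproduced with a few lines of code. The sketch below rests on assumptions chosen so that its output matches the table: B - x is expressed in bits per second, the constant is taken as 1.5e-6 divided by the packet size in bytes, and the increment never falls below one byte's worth of a packet per SYN (1/1500, about 0.00067). The function name and defaults are illustrative only.

```python
import math

SYN = 0.01  # rate control interval in seconds, as given on the slide


def udt_increase(capacity_bps, rate_bps, packet_bytes=1500, c=1.5e-6):
    """Increment in packets per SYN for a given spare bandwidth (assumed constants)."""
    min_inc = 1.0 / packet_bytes          # assumed floor on the increment
    spare = capacity_bps - rate_bps       # B - x in bits per second
    if spare <= 0:
        return min_inc
    # Round the spare bandwidth up to the next power of ten, then scale.
    inc = 10 ** math.ceil(math.log10(spare)) * c / packet_bytes
    return max(inc, min_inc)


if __name__ == "__main__":
    B = 10e9
    for x_mbps in (5000, 9500, 9950, 9995, 9999.5, 9999.95):
        inc = udt_increase(B, x_mbps * 1e6)
        print(f"x = {x_mbps:>8} Mbps -> {inc:.5f} pkts/SYN")
```

Each tenfold drop in spare bandwidth cuts the increment by ten, so the sender approaches the link capacity quickly when the link is idle and cautiously when it is nearly full.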
Dealing with Packet Loss
• Loss synchronization
    • Randomization method
• Non-congestion loss
    • Do not decrease sending rate for the first packet loss
    [Figure: panels labeled M=5, N=2 and M=8, N=3]
• Packet reordering

Bandwidth Estimation
• Packet pair
    [Diagram: packet pair P1, P2 traversing the path]
    • Packet Size / Space ≈ Bottleneck Bandwidth
• Filters
    • Cross traffic
    • Interrupt coalescence
• Robust to estimation errors
    • Randomized interval to send packet pair

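The packet-pair relation can be sketched as follows. The slide does not say which filter is used, so the median filter below is an assumption, standing in for whatever filtering removes the effects of cross traffic and interrupt coalescence. The function names are hypothetical.

```python
from statistics import median


def packet_pair_estimate(packet_bytes, arrival_gap_s):
    """Bandwidth estimate in bits per second from one packet pair."""
    return packet_bytes * 8 / arrival_gap_s


def filtered_bandwidth(packet_bytes, gaps_s):
    """Median of per-pair estimates; zero or negative gaps are discarded (assumed filter)."""
    samples = [packet_pair_estimate(packet_bytes, g) for g in gaps_s if g > 0]
    return median(samples)


if __name__ == "__main__":
    # Gaps in seconds: mostly 12 microseconds, with one stretched by cross
    # traffic and one compressed by interrupt coalescence.
    gaps = [12e-6, 12e-6, 60e-6, 3e-6, 12e-6]
    print(f"{filtered_bandwidth(1500, gaps) / 1e9:.2f} Gb/s")
```

A median ignores the occasional stretched or compressed gap, which is the kind of robustness to estimation errors the slide asks for.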
PERFORMANCE EVALUATION

Performance Characteristics
• Efficiency
    • Higher bandwidth utilization, less CPU usage
• Intra-protocol fairness
    • Max-min fairness
    • Jain's fairness index
• TCP friendliness
    • Bulk TCP flow vs. bulk UDT flow
    • Short-lived TCP flow (slow start phase) vs. bulk UDT flow
• Stability (oscillations)
    • Stability index (standard deviation)

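Jain's fairness index has a standard closed form: the square of the sum of the throughputs divided by the number of flows times the sum of their squares. A minimal sketch, with a hypothetical function name:

```python
def jain_fairness_index(throughputs):
    """Jain's index: 1.0 when all flows get equal throughput, 1/n in the worst case."""
    n = len(throughputs)
    total = sum(throughputs)
    return total * total / (n * sum(x * x for x in throughputs))


if __name__ == "__main__":
    # Four concurrent flows, in Mb/s.
    print(f"{jain_fairness_index([215, 216, 202, 197]):.3f}")
```

The index is scale-free, so it can be compared across runs with different total throughput.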
Evaluation Strategies
• Simulations vs. experiments
    • NS2 network simulator, NCDM teraflow testbed
• Setup
    • Network topology, bandwidth, distance, queuing, link error rate, etc.
    • Concurrency (number of parallel flows)
• Comparison (against TCP)
• Real-world applications
    • SDSS data transfer, high performance mining of streaming data, etc.
• Independent evaluation
    • SLAC, JGN2, UvA, Unipmn (Italy), etc.

Efficiency, Fairness, & Stability
[Diagram: four hosts at StarLight, Chicago (206.220.241.13 to 206.220.241.16) connected to four hosts at SARA, Amsterdam (145.146.98.78 to 145.146.98.81) over a path with 1Gb/s bandwidth and 106 ms RTT. Flows 1 to 4 are shown on a timeline from 0 to 700 seconds]

Efficiency, Fairness, & Stability
[Figure: throughput (Mbits/s) of Flows 1 to 4 versus time (0 to 700 s)]

Interval   Throughput of active flows (Mb/s)   Efficiency   Fairness   Stability
1          902                                 902          1          0.11
2          466, 446                            912          0.999      0.11
3          313, 308, 302                       923          0.999      0.08
4          215, 216, 202, 197                  830          0.998      0.16
5          301, 310, 307                       918          0.999      0.04
6          452, 452                            904          1          0.02
7          885                                 885          1          0.04

TCP Friendliness
[Figure: TCP throughput (Mb/s) versus number of UDT flows (0 to 10)]
• 500 1MB TCP flows vs. 0 – 10 bulk UDT flows
• 1Gb/s between Chicago and Amsterdam

COMPOSABLE UDT

Composable UDT - Objectives
• Easy implementation and deployment of new control algorithms
• Easy evaluation of new control algorithms
• Application awareness support and dynamic configuration

Composable UDT - Methodologies
• Packet sending control
    • Window-based, rate-based, and hybrid
• Control event handling
    • onACK, onLoss, onTimeout, onPktSent, onPktRecved, etc.
• Protocol parameters access
    • RTT, loss rate, RTO, etc.
• Packet extension
    • User-defined control packets

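The event-handler idea can be illustrated with a small sketch. This is not the UDT library's API. The class names, attributes, and constants below are hypothetical, and only the handler names onACK, onLoss, and onTimeout are taken from the slide. The subclass implements a simplified Reno-style window rule to show how a control algorithm could be expressed purely through handlers.

```python
class CongestionControl:
    """Hypothetical base class: the protocol core would call these handlers."""

    def __init__(self):
        self.cwnd = 2.0          # window-based knob: packets allowed in flight
        self.pkt_interval = 0.0  # rate-based knob: seconds between packets (0 = unpaced)

    def onACK(self, ack_seq):
        pass

    def onLoss(self, lost_seqs):
        pass

    def onTimeout(self):
        pass


class RenoLike(CongestionControl):
    """Simplified Reno-style control written only in terms of event handlers."""

    def __init__(self, ssthresh=64.0):
        super().__init__()
        self.ssthresh = ssthresh

    def onACK(self, ack_seq):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start: one packet per ACK
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance: about one per RTT

    def onLoss(self, lost_seqs):
        self.ssthresh = max(self.cwnd / 2, 2.0)
        self.cwnd = self.ssthresh         # halve the window on a loss report

    def onTimeout(self):
        self.ssthresh = max(self.cwnd / 2, 2.0)
        self.cwnd = 1.0                   # restart from one packet after a timeout
```

A rate-based or hybrid algorithm would follow the same pattern but adjust `pkt_interval` instead of, or in addition to, `cwnd`.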
Composable UDT - Evaluation
• Simplicity
    • Can it be easily used?
• Expressiveness
    • Can it be used to implement most control protocols?
• Similarity
    • Can Composable UDT based implementations reproduce the performance of their native implementations?
• Overhead
    • Will the overhead added by Composable UDT be too large?

Simplicity & Expressiveness
• Eight event handlers, four protocol control functions, and one performance monitoring function.
• Supports a large variety of protocols
    • Reliable UDP blast
    • TCP and its variants (both loss and delay based)
    • Group transport protocols

Simplicity & Expressiveness
[Diagram: class hierarchy rooted at CCC (Base Congestion Control Class). Classes shown: CTCP (TCP NewReno), CGTP (Group Transport Protocol), CUDPBlast (Reliable UDP Blast), CVegas (TCP Vegas), CScalable (Scalable TCP), CHS (HighSpeed TCP), CBiC (BiC TCP), CWestwood (TCP Westwood), and CFAST (FAST TCP). Numeric annotations, in the order they appear: 28; 73 / +132-6; 11 / +192-29; 8 / +27-1; 11 / +192-29; 27 / +145-2; 37 / +351-2]

Similarity and Overhead
• Similarity
    • How closely Composable UDT based implementations can simulate their native implementations
    • CTCP vs. Linux TCP

            Throughput       Fairness          Stability
Flow #      TCP     CTCP     TCP      CTCP     TCP      CTCP
1           112     122      1        1        0.517    0.415
2           191     208      0.997    0.999    0.476    0.426
4           322     323      0.949    0.999    0.484    0.492
8           378     422      0.971    0.999    0.633    0.550
16          672     642      0.958    0.985    0.502    0.482
32          877     799      0.988    0.997    0.491    0.470
64          921     716      0.994    0.996    0.569    0.529

• CPU usage
    • Sender: CTCP uses about 100% more CPU time than Linux TCP
    • Receiver: CTCP uses about 20% more CPU time than Linux TCP

CONCLUSIONS

Contributions
• A high performance data transport protocol and associated implementation
    • The UDT protocol
    • Open source UDT library (udt.sourceforge.net)
    • Users include ANL, ORNL, PNNL, etc.
• An efficient and fair congestion control algorithm
    • DAIMD & the UDT control algorithm
    • Packet loss handling techniques
    • Using bandwidth estimation technique in congestion control
• A configurable transport protocol framework
    • Composable UDT

Publications
Papers on the UDT Protocol
• Supporting Configurable Congestion Control in Data Transport Services, Yunhong Gu and Robert L. Grossman, SC 2005, Nov 12 - 18, Seattle, WA.
• Optimizing UDP-based Protocol Implementation, Yunhong Gu and Robert L. Grossman, PFLDNet 2005, Lyon, France, Feb. 2005.
• Experiences in Design and Implementation of a High Performance Transport Protocol, Yunhong Gu, Xinwei Hong, and Robert L. Grossman, SC 2004, Nov 6 - 12, Pittsburgh, PA.
• An Analysis of AIMD Algorithms with Decreasing Increases, Yunhong Gu, Xinwei Hong and Robert L. Grossman, First Workshop on Networks for Grid Applications (Gridnets 2004), Oct. 29, San Jose, CA.
• SABUL: A Transport Protocol for Grid Computing, Yunhong Gu and Robert L. Grossman, Journal of Grid Computing, 2003, Volume 1, Issue 4, pp. 377-386.
Internet Draft
• UDT: A Transport Protocol for Data Intensive Applications, Yunhong Gu and Robert L. Grossman, draft-gg-udt-01.txt.

Publications
Papers on Data Transfer Service using UDT
• Experimental Studies of Data Transport and Data Access of Earth Science Data over Networks with High Bandwidth Delay Products, Robert Grossman, Yunhong Gu, Dave Hanley, Xinwei Hong and Parthasarathy Krishnaswamy, Computer Networks, Volume 46, Issue 3, Oct. 2004, pp. 411-421.
• Teraflows over Gigabit WANs with UDT, Robert Grossman, Yunhong Gu, Xinwei Hong, Antony Antony, Johan Blom, Freek Dijkstra, and Cees de Laat, Journal of Future Computer Systems, Vol. 21, Issue 4, pp. 501-513, April 2005.
• The Photonic TeraStream: Enabling Next Generation Applications Through Intelligent Optical Networking at iGrid 2002, J. Mambretti, J. Weinberger, J. Chen, E. Bacon, F. Yeh, D. Lillethun, R. Grossman, Y. Gu, M. Mazzuco, Journal of Future Computer Systems, Volume 19, Number 6, pages 897-908.
• Experimental Studies Using Photonic Data Services at IGrid 2002, R. Grossman, Y. Gu, D. Hanley, X. Hong, D. Lillethun, J. Levera, J. Mambretti, M. Mazzucco, and J. Weinberger, Journal of Future Computer Systems, 2003, Volume 19, Number 6, pages 945-955.

Publications
Papers on Applications using UDT
• Open DMIX: High Performance Web Services for Distributed Data Mining, R. Grossman, Y. Gu, C. Gupta, D. Hanley, X. Hong, and P. Krishnaswamy, 7th International Workshop on High Performance and Distributed Mining.
• Open DMIX - Data Integration and Exploration Services for Data Grids, R. Grossman, Y. Gu, D. Hanley, X. Hong, and G. Rao, First International Workshop on Knowledge Grid and Grid Intelligence (KGGI 2003).
• Global Access to Large Distributed Data Sets using Photonic Data Services, R. Grossman, Y. Gu, D. Hanley, X. Hong, D. Lillethun, J. Levera, J. Mambretti, M. Mazzucco, and J. Weinberger, 20th IEEE/11th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST 2003), Los Alamitos, CA.
• Data Webs for Earth Science Data, Asvin Ananthanarayan, Rajiv Balachandran, Yunhong Gu, Robert Grossman, Xinwei Hong, Jorge Levera, Marco Mazzucco, Parallel Computing, Volume 29, 2003, pages 1363-1379.

Achievements
• SC 2002 Bandwidth Challenge “Best Use of Emerging Network Infrastructure” Award
• SC 2003 Bandwidth Challenge “Application Foundation” Award
• SC 2004 Bandwidth Challenge “Best Replacement for FedEx / UDP Fairness” Award
• SC 2005 ?
    • Nov. 12 – 18, Seattle WA
    • High Performance Mining of Streaming Data using UDT
• iGrid 2005
    • Exploring and mining remote data at 10Gb/s

Vision
• Short-term
    • A practical solution for distributed data-intensive applications in high BDP environments
• Long-term
    • Evolve with new technologies (open source & open standard)
    • More functionalities and support for more use scenarios
    • Network research platform (e.g., fast prototyping and evaluation of new control algorithms)

The End
Thank You!
Yunhong Gu, October 10, 2005