* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Overlay Network Monitoring and its Applications
Internet protocol suite wikipedia , lookup
Distributed firewall wikipedia , lookup
Backpressure routing wikipedia , lookup
Multiprotocol Label Switching wikipedia , lookup
TCP congestion control wikipedia , lookup
Computer network wikipedia , lookup
Network tap wikipedia , lookup
Recursive InterNetwork Architecture (RINA) wikipedia , lookup
List of wireless community networks by region wikipedia , lookup
Spanning Tree Protocol wikipedia , lookup
Routing in delay-tolerant networking wikipedia , lookup
Tomography-based Overlay Network Monitoring and its Applications Yan Chen Joint work with David Bindel, Brian Chavez, Hanhee Song, and Randy H. Katz UC Berkeley Motivation • Applications of end-to-end distance monitoring – – – – – Overlay routing/location Peer-to-peer systems VPN management/provisioning Service redirection/placement Cache-infrastructure configuration • Requirements for E2E monitoring system – – – – Scalable & efficient: small amount of probing traffic Accurate: capture congestion/failures Incrementally deployable Easy to use Existing Work • Static estimation: – Global Network Positioning (GNP) • Dynamic monitoring – Loss rates: RON (n2 measurement) – Latency: IDMaps, Dynamic Distance Maps, Isobar • Latency similarity under normal conditions doesn’t imply similar losses ! • Network tomography – Focusing on inferring the characteristics of physical links rather than E2E paths – Limited measurements -> under-constrained system, unidentifiable links Problem Formulation Given n end hosts on an overlay network and O(n2) paths, how to select a minimal subset of paths to monitor so that the loss rates/latency of all topology other paths can be inferred. Overlay Network Operation Center measurements End hosts • Key idea: select a basis set of k paths that completely describe all O(n2) paths (k «O(n2)) – – – Select and monitor k linearly independent paths to compute the loss rates of basis set Infer the loss rates of all other paths Applicable for any additive metrics, like latency Path Matrix and Path Space A B p lj • Path loss rate p, link loss rate l 1 p (1 l ) j j s.t .vj 1 • Totally s links, path vector v s s log( 1 p) vj log( 1 lj ) vjxj v x T j 1 • Path matrix G O(n2) – Put all r = paths together – Path loss rate vector b j 1 r s Gx b, G {0 | 1} , b r 1 Sample Path Matrix A b2 1 b1 3 D 2 B b3 C 1 1 0 G 0 0 1 1 1 1 x1 b1 G x 2 b 2 x3 b3 link 2 (1,-1,0) (1,1,0) row space (measured) null space (unmeasured) link 1 link 3 1 0 b1 / 2 • x1 - x2 unknown => cannot ( x1 x2 ) 0 b / 2 xG 1 x compute x1, x2 3 1 2 • Set of vectors [1 1 0]T 0 1 b2 form null space 1 • To separate identifiable vs. x ( x1 x2 ) 1 N unidentifiable components: 2 0 x = xG + xN • All E2E paths are in path b Gx Gx Gx Gx G N G space, i.e., GxN = 0 Intuition through Topology Virtualization • Virtual links: minimal path segments whose loss rates uniquely identified • Can fully describe all paths • xG : similar forms as virtual links 1’ 1 2 Rank(G)=1 Rank(G)=2 1 1’ 2’ 2 1’ 1 2 Rank(G)=3 3 2’ 3’ 4 3 4’ 1 2 1 1 2 3 Virtualization Real links (solid) and overlay paths (dotted) going through them Virtual links 5 Algorithms • Select k = rank(G) linearly independent paths to monitor – Use rank revealing decomposition, e.g., QR with column pivoting – Leverage sparse matrix: time O(rk2) and memory O(k2) • E.g., 10 minutes for n = 350 (r = 61075) and k = 2958 • Compute the loss rates of other paths Solve xG with GxG b, then b GxG – Time O(k2) and memory O(k2) How much measurement saved ? • k « O(n2) ? • For power-law Internet topology, M nodes, N end hosts – There are O(M) links and N > M/2 (with proof) – If n = O(N), overlay network has O(n) IP links, k = O(n) When a Small Portion of End Hosts on Overlay • Internet has moderate hierarchical structure [TGJ+02] – If a pure hierarchical structure (tree): k = O(n) – If no hierarchy at all (worst case, clique): k = O(n2) – Internet should fall in between … For reasonably large n, (e.g., 100), k = O(nlogn) BRITE 20K-node hierarchical topology Mercator 284K-node real router topology Practical Issues • Topology measurement errors tolerance – Care about path loss rates than any interior links – Poor router alias resolution present show multiple links for one => assign similar loss rates to all the links – Unidentifiable routers => ignore them as virtualization • Topology changes – Add/remove/change one path incurs O(k2) time – Topology relatively stable in order of a day => incremental detection • Simulation Evaluation – Topology • BRITE: Barabasi-Albert, Waxman, hierarchical: 1K – 20K nodes • Real router topology from Mercator: 284K nodes – Fraction of end hosts on the overlay: 1 - 10% – Loss rate distribution (90% links are good) • Good link: 0-1% loss rate; bad link: 5-10% loss rates • Good link: 0-1% loss rate; bad link: 1-100% loss rates – Loss model: • Bernouli: independent drop of packet • Gilbert: busty drop of packet – Path loss rate simulated via transmission of 10K pkts • Metric: path loss rate estimation accuracy – Absolute/relative errors – Lossy path inference Experiments on Planet Lab • 51 hosts, each from different organizations – 51 × 50 = 2,550 paths • Simultaneous loss rate measurement US (40) – 300 trials, 300 msec each – In each trial, send a 40-byte UDP pkt to every other host • Simultaneous topology measurement – Traceroute • Experiments: 6/24 – 6/27 – 100 experiments in peak hours # of hosts Areas and Domains Europe (6) International (11) .edu 33 .org 3 .net 2 .gov 1 .us 1 France 1 Sweden 1 Denmark 1 Germany 1 UK 2 Taiwan 1 Hong Kong 1 Asia (2) Canada 2 Australia 1 Tomography-based Overlay Monitoring Results • Loss rate distribution loss rate [0, 0.05) % 95.9% lossy path [0.05, 1.0] (4.1%) [0.05, 0.1) [0.1, 0.3) [0.3, 0.5) [0.5, 1.0) 1.0 15.2% 31.0% 23.9% 4.3% 25.6% • Accuracy – On average k = 872 out of 2550 – Absolute error |p – p’|: • Average 0.0027 for all paths, 0.0058 for lossy paths – Relative error p( ) p' ( ) F ( p, p' ) max , p' ( ) p( ) where p( ) max( , p), p' ( ) max( , p' ) Absolute and Relative Errors • For each experiment, get its 95 percentile absolute and relative errors for estimation of 2,550 paths Lossy Path Inference Accuracy • 90 out of 100 runs have coverage over 85% and false positive less than 10% • Many caused by the 5% threshold boundary effects Topology Measurement Error Tolerance • Out of 13 sets of pair-wise traceroute … • On average 248 out of 2550 paths have no or incomplete routing information • No router aliases resolved Conclusion: robust against topology measurement errors Performance Improvement with Overlay • With single-node relay • Loss rate improvement – Among 10,980 lossy paths: – 5,705 paths (52.0%) have loss rate reduced by 0.05 or more – 3,084 paths (28.1%) change from lossy to non-lossy • Throughput improvement – Estimated with 1.5 throughput rtt lossrate – 60,320 paths (24%) with non-zero loss rate, throughput computable – Among them, 32,939 (54.6%) paths have throughput improved, 13,734 (22.8%) paths have throughput doubled or more • Implications: use overlay path to bypass congestion or failures Adaptive Overlay Streaming Media Stanford 5. Alert + New Overlay Path OVERLAY NETWORK OPERATION CENTER 2. Register trigger 4. Detect congestion / failure UC Berkeley X CLIENT UC San Diego 1. Setup connection SERVER 3. Network congestion / failure 7. Skip-free streaming media recovery 6. Setup New Path OVERLAY RELAY NODE HP Labs • • • • Implemented with Winamp client and SHOUTcast server Congestion introduced with a Packet Shaper Skip-free playback: server buffering and rewinding Total adaptation time < 4 seconds Adaptive Streaming Media Architecture MEDIA SOURCE OVERLAY NETWORK OPERATION CENTER SERVER CLIENT Winamp client Triggering / alert + new path SHOUTcast Server RELAY Overlay Layer Buffering Layer Winamp Video/Audio Filter Byte Counter Path Management RELAY Path Management TCP/IP Layer Calculated concatenation point TCP/IP Layer Client 4 Client 3 Client 2 Client 1 From SHOUTcast server BUFFER Byte Counter Overlay Layer TCP/IP Layer Internet Client 1 OVERLAY RELAY NODE Client 2 INTERNET Client 3 Client 4 Conclusions • A tomography-based overlay network monitoring system – Given n end hosts, characterize O(n2) paths with a basis set of O(nlogn) paths – Selectively monitor O(nlogn) paths to compute the loss rates of the basis set, then infer the loss rates of all other paths • Both simulation and real Internet experiments promising • Built adaptive overlay streaming media system on top of monitoring services – Bypass congestion/failures for smooth playback within seconds Backup Slides