Download ChouTutorial04 - Microsoft Research
Transcript
Practical Network Coding for the Internet and Wireless Networks
Philip A. Chou, with thanks to Yunnan Wu, Kamal Jain, Pablo Rodriguez Rodriguez, Christos Gkantsidis, Baochun Li, et al.
Globecom Tutorial, December 3, 2004

Outline
- Practical network coding: packetization; buffering; results for the SprintLink ISP; experimental comparisons to multicast routing.
- Internet and wireless network applications: live broadcasting, file downloading, messaging, interactive communication.

Is Network Coding Applicable to Real Networks?
- Internet: at the IP layer, routers (e.g., within an ISP); at the application layer, infrastructure-based overlays (e.g., CDNs) and ad hoc overlays (e.g., P2P).
- Wireless: mobile ad hoc multihop wireless networks; sensor networks; stationary wireless (residential) mesh networks.

Theory vs. Practice
- Theory: symbols flow synchronously throughout the network, and edges have integral capacities. Practice: information travels asynchronously in packets; packets are subject to random delays and losses; edge capacities are often unknown and vary as competing communication processes begin and end.
- Theory: some centralized knowledge of the topology is assumed in order to compute the capacity or the coding functions. Practice: it may be difficult to obtain centralized knowledge, or to arrange its reliable broadcast to the nodes across the very communication network being established.
- Theory: the encoding can be designed to withstand failures, but the decoders must know the failure pattern. Practice: it is difficult to communicate the failure pattern reliably to the receivers.
- Theory: cyclic graphs present difficulties, e.g., capacity is achieved only in the limit of large blocklength. Practice: cycles abound; if A → B, then B → A.
Practical network coding must therefore cope with real networks: packets are subject to random loss and delay; edges have variable capacities due to congestion and other cross traffic; node and link failures, additions, and deletions are common (as in P2P and ad hoc networks); cycles are everywhere; the broadcast capacity may be unknown; and there is no centralized knowledge of the graph topology or of the encoder/decoder functions. What is needed is simple technology, applicable in practice.

Approach
- Packet format: removes the need for centralized knowledge of the graph topology and of the encoding/decoding functions.
- Buffer model: allows asynchronous packet arrivals and departures with arbitrarily varying rates, delay, and loss.
[Chou, Wu, and Jain, Allerton 2003] [Ho, Koetter, Médard, Karger, and Effros, ISIT 2003]

Standard Framework
- Graph (V,E) with unit-capacity edges; sender s in V; set of receivers T = {t, …} in V; broadcast capacity h = min_t Maxflow(s,t).
- [Diagram: the source s injects the symbols x_1, …, x_h on edges e'_1, …, e'_h, i.e., y(e'_i) = x_i; an interior node v puts on each outgoing edge e a combination of the symbols on its incoming edges e'; a receiver t observes y(e_1), …, y(e_h).]
- Each edge carries y(e) = Σ_{e'} b_e(e') y(e'), where b(e) = [b_e(e')]_{e'} is the local encoding vector of edge e.

Global Encoding Vectors
- By induction, y(e) = Σ_{i=1}^{h} g_i(e) x_i, where g(e) = [g_1(e), …, g_h(e)] is the global encoding vector of edge e.
- Receiver t can recover x_1, …, x_h from its h incoming edges: [y(e_1); …; y(e_h)] = G_t [x_1; …; x_h], where G_t is the h×h matrix whose i-th row is the global encoding vector g(e_i) = [g_1(e_i), …, g_h(e_i)].

Invertibility of G_t
- G_t will be invertible with high probability if the local encoding vectors are random and the field size is sufficiently large.
- If the field size is 2^16 and |E| = 2^8, then G_t is invertible with probability ≥ 1 − 2^−8 ≈ 0.996. [Ho et al., 2003] [Jaggi, Sanders, et al., 2003]

Packetization
- Group N symbols into each packet: x_i = [x_{i,1}, x_{i,2}, …, x_{i,N}] and y(e) = [y_1(e), y_2(e), …, y_N(e)]. On the Internet, the MTU is typically ≈ 1400+ bytes.
- The same relation y(e) = Σ_{e'} b_e(e') y(e') = Σ_{i=1}^{h} g_i(e) x_i then holds symbol by symbol across the packet.
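All of these linear combinations take place in a finite field. As a concrete sketch (the field GF(2^8) and its reduction polynomial are illustrative choices, not from the tutorial, which asks only that the field be sufficiently large), the following shows the symbol arithmetic and the per-edge combination y(e) = Σ_{e'} b_e(e') y(e'):

```python
# Multiplication in GF(2^8). The reduction polynomial x^8+x^4+x^3+x+1
# (0x11B, the AES choice) is illustrative; any irreducible degree-8
# polynomial would serve.
def gf_mul(a, b, poly=0x11B):
    p = 0
    while b:
        if b & 1:
            p ^= a          # add (XOR) a shifted copy of a
        a <<= 1
        if a & 0x100:
            a ^= poly       # reduce modulo the field polynomial
        b >>= 1
    return p

# y(e) = sum over incoming edges e' of b_e(e') * y(e'), symbol-wise.
def combine(in_packets, coeffs):
    out = [0] * len(in_packets[0])
    for b, pkt in zip(coeffs, in_packets):
        for i, sym in enumerate(pkt):
            out[i] ^= gf_mul(b, sym)
    return out
```

In practice a node would draw the local encoding vector `coeffs` uniformly at random, one coefficient per incoming edge (e.g., `random.randrange(256)`).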
In matrix form, the packets received at t satisfy

  [ y_1(e_1) … y_N(e_1) ]         [ x_{1,1} … x_{1,N} ]
  [        ⋮            ]  =  G_t [        ⋮          ]
  [ y_1(e_h) … y_N(e_h) ]         [ x_{h,1} … x_{h,N} ]

Key Idea
- Include within each packet on edge e the global encoding vector itself: g(e) = Σ_{e'} b_e(e') g(e'), alongside y(e) = Σ_{e'} b_e(e') y(e').
- This can be accomplished simply by prefixing the i-th unit vector to the i-th source packet x_i, i = 1, …, h. The prefixes then undergo exactly the same linear combinations as the payloads, so the received packets satisfy

  [ g_1(e_1) … g_h(e_1) | y_1(e_1) … y_N(e_1) ]         [ 1 … 0 | x_{1,1} … x_{1,N} ]
  [                  ⋮                        ]  =  G_t [     ⋮                     ]
  [ g_1(e_h) … g_h(e_h) | y_1(e_h) … y_N(e_h) ]         [ 0 … 1 | x_{h,1} … x_{h,N} ]

- Thus the global encoding vectors needed to invert the code at any receiver can be found in the received packets themselves!

Cost vs. Benefit
- Cost: the overhead of transmitting h extra symbols per packet; if h = 50 and the field size is 2^8, the overhead is ≈ 50/1400 ≈ 3%.
- Benefit: receivers can decode even if the network topology and encoding functions are unknown, nodes and edges are added and removed in an ad hoc way, packet loss and node and link failures occur at unknown locations, and the local encoding vectors are time-varying and random.

Erasure Protection
- Removals, failures, losses, and poor random encoding may reduce the received rank to some k < h, leaving the receiver with only k independent rows of G_t.
- Basic form of erasure protection: send redundant packets, e.g., let the last h−k source packets x_{k+1}, …, x_h be known zeros.

Priority Encoding Transmission (Albanese et al., IEEE Trans. Inform. Theory, 1996)
- A more sophisticated form: partition the data into layers of importance and vary the redundancy by layer; a received rank of k then lets the receiver recover the first k layers, even when the exact capacity is unknown.
- [Example from the slide, h = 6: the six source packets carry unit-vector global encoding prefixes; source packet i begins with known zeros through the columns of the lower layers (e.g., x_5 is zero in columns 1-7 and x_6 in columns 1-11), so in the columns of layer k only x_1, …, x_k are nonzero, and any k independent received packets suffice to recover layers 1 through k.]
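The unit-vector prefix can be demonstrated in a few lines. This sketch works over GF(2) (coefficients 0/1, addition = XOR) purely for brevity — the tutorial recommends a larger field such as GF(2^8) or GF(2^16) — and the generation size and payloads are made up for illustration:

```python
import random

H = 3  # generation size h (illustrative)

# Prefix the i-th unit vector to the i-th source packet, so each packet
# begins with its own global encoding vector.
def make_source_packets(rows):
    return [[1 if j == i else 0 for j in range(H)] + row
            for i, row in enumerate(rows)]

# A node forwards a random linear combination of the packets it holds;
# the header is combined with the same coefficients as the payload, so
# it always equals the packet's global encoding vector.
def recode(packets, rng):
    coeffs = [rng.randrange(2) for _ in packets]
    out = [0] * len(packets[0])
    for c, p in zip(coeffs, packets):
        if c:
            out = [a ^ b for a, b in zip(out, p)]
    return out

pkts = make_source_packets([[10, 20], [30, 40], [50, 60]])
coded = recode(pkts, random.Random(7))
# coded[:H] is the global encoding vector, read from the packet itself.
```

The receiver never needs to be told which combinations were used upstream: stacking the first H symbols of any h received packets yields G_t directly.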
Asynchronous Communication
- In real networks, "unit capacity" edges are grouped; packets on real edges are carried sequentially; separate edges incur separate propagation and queuing delays; and the number of packets per unit time on an edge varies with loss, congestion, competing traffic, and rounding. Some synchronization is needed.
- All packets related to the same source vectors x_1, …, x_h are in the same generation; h is the generation size. All packets in the same generation are tagged with the same generation number; one byte (mod 256) is sufficient.

Buffering
- Packets arrive asynchronously on incoming edges (with jitter, loss, and variable rate) and are stored in a buffer at the node.
- At each transmission opportunity on an outgoing edge, the node generates a packet as a random combination of the packets currently in its buffer.

Decoding
- Block decoding: collect h or more packets, then hope to invert G_t.
- Earliest decoding (recommended): perform Gaussian elimination after each packet arrives, and at every node detect and discard non-informative packets. G_t tends to be lower triangular, so one can typically decode x_1, …, x_k from only slightly more than k packets. Decoding delay is much lower than with block decoding — approximately constant, independent of the block length h.

Flushing Policy, Delay Spread, and Throughput Loss
- Policy: flush the current generation when the first packet of the next generation arrives on any edge. Simple and robust, but it causes some throughput loss.
- The delay spread at a node v is the difference between the arrival times along the fastest and slowest paths from s; packets of a generation arriving after the flush are lost. [Plot from the slide: throughput loss (%) vs. delay spread (s); the loss grows with delay spread × sending rate (pkt/s) / generation size h, i.e., with the delay spread relative to the generation duration.]

Interleaving
- Decomposes a session into several concurrent interleaved sessions with lower sending rates (the slide depicts generation numbers 0 1 0 1 … and 2 3 2 3 … interleaved across sessions).
- Interleaving does not decrease the overall sending rate; it increases the spacing between the packets of each session, and thus decreases the relative delay spread.

Simulations
- Implemented an event-driven simulator in C++.
- Six ISP graphs from the Rocketfuel project (U. Washington). Sender: Seattle; receivers: 20 arbitrary nodes (5 shown).
- SprintLink: 89 nodes, 972 bidirectional edges. Edge capacities scaled to 1 Gbps / "cost"; edge latencies set to the speed of light times the distance.
- Broadcast capacity: 450 Mbps; maximum receiver maxflow: 833 Mbps. Union of maxflows: 89 nodes, 207 edges.
- Each experiment sends 20,000 packets and measures received rank, throughput, throughput loss, and decoding delay vs. sending rate (default 450 Mbps), field size (default 2^16), generation size (default 100), and interleaving length (default 100).

[Figures from the slides, for the receivers Chicago (maxflow 450 Mbps), Pearl Harbor (525 Mbps), Anaheim (625 Mbps), Boston (733 Mbps), and San Jose (833 Mbps):
- Received rank (40-100) vs. generation number (0-200), and average received rank vs. field size (0-25 bits).
- Throughput (400-850 Mbps) vs. sending rate (400-850 Mbps).
- Throughput loss (Mbps) vs. generation size (20-100) and vs. interleaving length (0-100).
- Decoding delay: packet delay (ms) with block decoding vs. earliest decoding, as a function of generation size and of interleaving length.]
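Returning to the recommended decoding procedure: the earliest-decoding rule (Gaussian elimination after each arrival, discarding non-informative packets) can be sketched as an incremental decoder. The GF(2^8) arithmetic and the brute-force inverse are illustrative simplifications, not the tutorial's implementation:

```python
def gf_mul(a, b, poly=0x11B):
    # GF(2^8) multiplication (AES polynomial; an illustrative choice)
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return p

def gf_inv(a):
    # brute force is fine in a 256-element field
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

class EarliestDecoder:
    """Keeps received rows (header + payload) in row-echelon form."""
    def __init__(self, h):
        self.h = h
        self.rows = {}  # pivot column -> row normalized to a leading 1

    def receive(self, packet):
        """Returns True iff the packet was informative (rank increased)."""
        pkt = list(packet)
        for col in range(self.h):
            if pkt[col] == 0:
                continue
            if col in self.rows:
                # eliminate against the existing pivot and keep sweeping
                c = pkt[col]
                pkt = [s ^ gf_mul(c, r) for s, r in zip(pkt, self.rows[col])]
            else:
                inv = gf_inv(pkt[col])
                self.rows[col] = [gf_mul(inv, s) for s in pkt]
                return True
        return False  # header reduced to zero: non-informative, discard

    def rank(self):
        return len(self.rows)
```

When `rank()` reaches h, back-substitution (not shown) recovers x_1, …, x_h; the same `receive` test lets an intermediate node discard packets that would not increase the rank of its buffer.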
Distributed Flow Algorithms
- Work in progress…

Comparing Network Coding to Multicast Routing
- Criteria: throughput; computational complexity; resource cost (efficiency); robustness to link failure; robustness to packet loss.

Multicast Routing
- Standard multicast: the union of shortest reverse paths from the receivers to the sender.
- Widest multicast: the maximum-rate Steiner tree.
- Multiple multicast: Steiner tree packing.

Complexity and Throughput of Optimal Steiner Tree Packing
- Packing Steiner trees optimally is hard [e.g., JMS03], and network coding has only a small throughput advantage over optimal packing.
- [LLJL04]: Z. Li, B. Li, D. Jiang, and L. C. Lau, "On achieving optimal end-to-end throughput in data networks: theoretical and empirical studies," ECE Tech. Rpt., U. Toronto, Feb. 2004; reprinted with permission.
- Z. Li, B. Li, D. Jiang, and L. C.
Lau '04: optimally packed Steiner trees in 1000 randomly generated networks (|E| < 35); the network coding advantage was exactly 1.0 in every network. "The fundamental benefit of network coding is not higher optimal throughput, but to facilitate significantly more efficient computation and implementation of strategies to achieve such optimal throughput."

Complexity of Greedy Tree Packing
- A widest (maximum-rate) distribution tree can be found in polynomial time with a Prim-style algorithm: grow the tree edge by edge, at each step choosing the edge of maximum capacity from the reached nodes to the non-reached nodes, then postprocess the tree.
- Repeatedly pack widest distribution trees.

[Figure: comparison of achievable throughput (Mbps, 0-800) across the six ISP graphs for path packing; the max-rate Steiner tree (Prim's algorithm); greedy tree packing based on Prim's algorithm; greedy tree packing based on Lovász's proof of Edmonds' theorem; practical network coding; and the multicast capacity.] [Wu, Chou, and Jain, ISIT 2004]

Comparison of Efficiency
- Efficiency = (#receivers × throughput) / bandwidth cost; network coding vs. the widest tree. [Figure from Z. Li, B. Li, D. Jiang, and L. C. Lau, "On achieving optimal end-to-end throughput in data networks: theoretical and empirical studies," ECE Tech. Rpt., U. Toronto, Feb. 2004; reprinted with permission.]
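The greedy packing just described can be sketched as follows. The graph representation, the capacities, and the simplification that a distribution tree must span every node (rather than just the receivers, with the postprocessing step the slide mentions) are all illustrative assumptions:

```python
import heapq

# Widest (maximum-bottleneck) tree from the sender, Prim-style:
# repeatedly absorb the frontier edge of largest remaining capacity.
def widest_tree(cap, source, nodes):
    in_tree = {source}
    tree_edges = []
    heap = [(-c, u, v) for (u, v), c in cap.items() if u == source and c > 0]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(nodes):
        negc, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue
        in_tree.add(v)
        tree_edges.append((u, v))
        for (a, b), c in cap.items():
            if a == v and b not in in_tree and c > 0:
                heapq.heappush(heap, (-c, a, b))
    return tree_edges if len(in_tree) == len(nodes) else None

# Greedy packing: ship each tree's bottleneck rate down it, subtract
# that rate from its edges, and repeat until no tree remains.
def greedy_pack(cap, source, nodes):
    total = 0
    while True:
        tree = widest_tree(cap, source, nodes)
        if not tree:
            return total
        rate = min(cap[e] for e in tree)
        for e in tree:
            cap[e] -= rate
        total += rate
```

On a small diamond (s→a, s→b at capacity 2; a→t, b→t at capacity 1) the packing achieves rate 2 by routing one unit through each side.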
Robustness to Link Failure
- Ergodic link failures appear as packet loss; a non-ergodic link failure turns the graph G into G−e.
- Compare the throughput achievable after a failure by: network coding (the multicast capacity of G−e); adaptive tree packing (re-pack the trees on G−e); and fixed tree packing (keep the trees packed on G and delete the subtrees downstream of e). [Wu, Chou, and Jain, ISIT 2004]

Robustness to Packet Loss
- With ergodic loss (e.g., 5% per link), routing requires link-level error control to achieve throughput of 95% of the sending rate.
- Network coding obviates the need for link-level error control: intermediate nodes output random linear combinations of what they receive, and reconstruction is delayed until the terminal nodes.

Comparing Network Coding to Multicast Routing (revisited)
- Network coding comes out ahead on throughput, computational complexity, resource cost (efficiency), robustness to link failure, and robustness to packet loss; manageability remains to be compared.

Network Coding for Internet and Wireless Applications
[Diagram: example applications — Xbox Live, Windows Messenger, PeerNet, BitTorrent, Digital Fountain, ALM, Gnutella, Kazaa, Akamai, RBN, SplitStream, CoopNet — arranged by network (Internet vs. wireless) and architecture (infrastructure/CDN vs. ad hoc/P2P), spanning live broadcast, file download, and messaging.]

Live Broadcast
- State of the art: application-layer multicast (ALM) trees with disjoint edges (e.g., CoopNet), with FEC/MDC striped across the trees and upload/download bandwidths equalized. A failed node cuts off the subtree below it.
- With network coding [Jain, Lovász, and Chou, 2004], losses and failures do not propagate beyond the children of the failed node: for the affected nodes the maxflow drops from h_t to h_t − 1, while for all other nodes the maxflow is unchanged.
- ALM/CoopNet average throughput: (1−ε)^depth × sending rate. Network coding average throughput: (1−ε) × sending rate.

File Download
- State of the art: parallel download (e.g., BitTorrent), which selects parents at random and reconciles working sets; flash crowds are stressful.
- Network coding: no need to reconcile working sets; handles flash crowds similarly to live broadcast (with throughput translating into download time); transitions seamlessly from broadcast to download mode.

File Download (2)
Results (mean and maximum, by scheme — LR, FEC, and network coding (NC) — with free-riding (Free) and tit-for-tat (TFT) peers):

  Scheme       Mean    Max
  LR   Free    124.2   161
  LR   TFT     126.1   185
  FEC  Free    123.6   159
  FEC  TFT     127.1   182
  NC   Free    117.0   136
  NC   TFT     117.2   139

C. Gkantsidis and P. Rodriguez Rodriguez, "Network coding for large scale content distribution," submitted to INFOCOM 2005; reprinted with permission.

Instant Messaging
- State of the art: flooding (e.g., PeerNet). The Peer Name Resolution Protocol (a distributed hash table) maintains the group as a graph with 3-7 neighbors per node; the messaging service pushes a message down at the source and it pops up at the receivers. How? By flooding — adaptive and reliable, but with 3-7× over-use of the network.
- Network coding improves network usage by the same 3-7× (since all packets are informative), and scales naturally from short messages to long flows.

Interactive Communication in Mobile Ad Hoc Wireless Networks
- State of the art: route discovery and maintenance, with attendant timeliness and reliability problems.
- Network coding is as distributed, robust, and adaptive as flooding; each node becomes a collector and beacon of information; and it can also minimize energy.

Network Coding Minimizes Energy (per bit)
- [Example from the slide: two packets a and b multicast across a small wireless network; optimal multicast routing uses 5 transmissions per packet, while network coding, by broadcasting the combination a+b where the flows overlap, uses 4.5.] Wu et al. (2003); Wu, Chou, and Kung (2004); Lun, Médard, Ho, and Koetter (2004).

Network Coding in a Residential Mesh Network (simulation)
- Physical piggybacking: a relay between s and t can broadcast a+b in a single transmission, so information sent from t to s is piggybacked on information sent from s to t. Network coding thus helps even point-to-point interactive communication, improving throughput, energy per bit, and delay.
- In the multi-session case with cross-layer optimization, network coding delivers more than 4× the bits per Joule.

Summary
- Network coding is practical, via packetization and buffering.
- Network coding outperforms routing on throughput, computational complexity, resource cost (efficiency), robustness to link failure, and robustness to packet loss.
- Network coding can improve performance in IP or wireless networks, in infrastructure-based or P2P networks, for live broadcast, file download, messaging, and interactive communication applications.
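As a closing numeric check of the live-broadcast comparison above: with per-hop loss rate ε, tree-based ALM/CoopNet delivers about (1−ε)^depth of the sending rate, while network coding sustains about (1−ε). The 5% loss rate, 100 Mbps rate, and the depths below are illustrative values, not from the slides:

```python
def alm_throughput(rate, eps, depth):
    # tree-based ALM: losses compound across the 'depth' hops of the tree
    return rate * (1 - eps) ** depth

def nc_throughput(rate, eps):
    # network coding: only the single-hop loss factor is paid on average
    return rate * (1 - eps)

for depth in (1, 4, 8):
    print(depth, alm_throughput(100.0, 0.05, depth), nc_throughput(100.0, 0.05))
```

Even at modest depths the gap is substantial: at depth 4 the tree delivers roughly 81% of the sending rate versus 95% for network coding, and the gap widens as the tree deepens.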