REALM: A REliable Application Layer Multicast protocol
Justin Templemore-Finlayson, PhD student
Supervised by Prof. S. Budkowski
Département Logiciels-Réseaux, Institut National des Télécommunications
17 February 2002

Outline
• Background
  – Application Layer Multicast
  – Reliable multicast
• The REALM protocol
  – Service
  – Operation
• Results
• Conclusions

Naive multicast
• Not really multicast, but « repeated unicast ».
• Sender sends to multiple receivers by copying data to each receiver individually.
• Default mechanism if no multicast service is available.
• Duplication at two points:
  – In the access network: one transmission per receiver.
  – In the backbone: redundant duplicates of data on physical links.
[Figure: naive multicast between sender S and receivers R]

Native multicast = IP multicast
• Multicast as a network service, as an extension to IP.
• Deploys a multicast tree which is the union of the IP paths between the sender and each receiver.
• Network copies and forwards data as necessary.
• No duplication:
  – Access network: sender transmits once for group size n.
  – Backbone: data carried by a given network link at most once.
[Figure: native (IP) multicast between sender S and receivers R]

Application Layer Multicast
• Hybrid between naive and native (IP) multicast.
• End systems perform multicast routing over unicast links between group members.
• Reduces duplication:
  – Access network: one transmission per receiver, but load is distributed across the group members.
  – Backbone: as efficient as IP multicast in eliminating duplicates, assuming that each router has a group member attached.
• Capable of adaptive routing.
[Figure: Application Layer Multicast between sender S and receivers R]

ALM with adaptive routing
• ALM is not forced to use the static least-hop paths of IP multicast when calculating the tree.
• Can use end-to-end metrics which satisfy an application's specific requirements.
  – Example: throughput for reliable multicast transfers.
• Can shape the tree according to real network capacity and dynamic traffic conditions.
  – Example: use the real "fastest path" instead of the shortest path.
[Figure: tree from S to receivers R with every link at cost 1. The assumption that all links are equal is implicit in IP multicast least-hop routing.]
[Figure: the same tree with one link at cost 5. Links are not all equal; they are subject to faults and congestion. ALM uses adaptive routing to find the true « fastest path », while IP multicast (IPm) continues to use the least-hop path with the faulty or congested link.]

Reliable multicast
• Characteristics:
  – Large numbers of receivers
  – Tree spans heterogeneous subnetworks
  – Lost data must be recovered
  – Throughput constrained by the slowest (bottleneck) branch
• Challenges:
  – Scalable, distributed error control
    • minimise / eliminate sender feedback
    • minimise time spent recovering data
  – Network heterogeneity
    • how to reconcile different subnetwork capabilities?
  – TCP-fair congestion control

REALM problem statement
• Provide a one-to-many reliable multicast service using an ALM approach.
• Goals:
  1. Efficient multicast distribution
  2. Deployable / not dependent on IP multicast
  3. Scalable error, flow and congestion control (ec+fc+cc)
  4. High performance using adaptive, throughput-oriented tree deployment
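
Goal 4 builds on the adaptive-routing idea from the earlier slides: least-hop routing ignores measured link cost, whereas cost-aware routing can route around a slow link. The following is a minimal Python sketch of that contrast, not part of REALM itself. The four-node topology, the node names and the link costs are hypothetical and are not taken from the slide figures.

```python
import heapq
from collections import deque

# Hypothetical link costs (for example, measured delay); not from the slides.
LINKS = {
    ("S", "A"): 5,  # direct link, assumed congested
    ("S", "B"): 1,
    ("B", "C"): 1,
    ("C", "A"): 1,
}


def neighbours(links):
    """Build an undirected adjacency map: node -> {neighbour: cost}."""
    adj = {}
    for (u, v), cost in links.items():
        adj.setdefault(u, {})[v] = cost
        adj.setdefault(v, {})[u] = cost
    return adj


def walk_back(parent, dst):
    """Rebuild the path from the source to dst using the parent map."""
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def least_hop_path(adj, src, dst):
    """Breadth-first search: fewest links, measured cost ignored."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return walk_back(parent, dst)


def fastest_path(adj, src, dst):
    """Dijkstra: smallest total measured cost. Returns (path, cost)."""
    dist = {src: 0}
    parent = {src: None}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nxt, cost in adj[node].items():
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                parent[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    return walk_back(parent, dst), dist[dst]


adj = neighbours(LINKS)
print(least_hop_path(adj, "S", "A"))
print(fastest_path(adj, "S", "A"))
```

With these assumed costs, least-hop routing keeps the direct S-A link (cost 5), while cost-aware routing goes through B and C for a total cost of 3.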

The REALM service model
• Sender creates the group on a local UDP port.
  – Large address space
  – Distributed, unique address allocation
• Group discovery by ordinary Internet methods
  – email, HTML, well-known address
• Receivers rendezvous at the sender's group address.
• Initial naive tree deployed
  – Worst case: tree deployment always improves on it
  – Transmission starts immediately
[Figure: initial naive tree between sender S and receivers R]

The REALM tree
• Maximum Bottleneck Throughput Tree (MBTT):
  – Each branch in the tree is a TCP connection.
  – A spanning tree such that no other spanning tree has a bottleneck link with higher TCP connection throughput.
• Calculated as follows:
  – Prim's Minimum Cost Tree algorithm [2].
  – Measured link RTT and loss are used to predict the TCP connection rate [3].
[Figure: example MBTT between sender S and receivers R]

Data transmission (1)
• Optimistic pipeline
  – Each node forwards data as soon as it receives it.
• Recovery buffer
  – Each node buffers recent data to allow nearby orphaned descendants to recover missed data.
  – Nearby is defined as being at most a configured number of levels below the node in the tree.
    size = Min{sendRate, readRate} * 2 * T_BIND_MAX * RECOVERY_DEPTH
  – Independent of group size
• Single-rate flow control: each node reads from its parent only as fast as it sends to its slowest child.
  – Recursive back-pressure all the way to the sender
  – Slows the sender's transmission rate to the bottleneck rate

Data transmission (2)
[Figure: data stream flowing from sender S down the tree of receivers R]

Error control (1)
• Two types of loss events:
  – Network losses:
    • Dropped packets are recovered from the parent by TCP.
  – Parent node failure:
    • The recovery buffer allows an orphaned node to recover data from nearby ancestors in the tree.
    • This local recovery is scalable and performant:
      – no sender involvement
      – minimises recovery latency

Error control (2)
[Figure: three successive tree diagrams of sender S and receivers R; the last is labelled recovery depth = 3]

Congestion control and avoidance
• Multicast trees experience localised temporal congestion.
• Govern with TCP until throughput falls below a threshold, and then re-deploy the tree.
[Figure: three panels over nodes S, A, B and C with per-branch throughputs]
  (a) A deployed MBTT. Bottleneck branch is SC. Throughput = 4.
  (b) Congestion on SA. New bottleneck is SA. Throughput = 2.
  (c) REALM redeploys MBTT. Branch SA is dropped. Throughput = 4.

Centralised tree deployment (1)
[Figure: eight panels showing the sender (SN) and receivers (RN)]
  (a) Initial naive tree
  (b) Sender makes PROBE request
  (c) Receivers exchange PINGs
  (d) Receivers return PROBERESULTS
  (e) Sender builds graph and calculates MBTT
  (f) Sender distributes parent information
  (g) Receivers reparent if necessary
  (h) Deployed MBTT
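
Step (e), where the sender builds the graph and calculates the MBTT, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the REALM prototype. The tree is grown Prim-style from the sender, always adding the highest-rate edge; that gives a maximum spanning tree, which is also a maximum-bottleneck spanning tree. The rate predictor uses the simplified square-root formula of Mathis et al. as a stand-in for the fuller model of [3]. The function names, the 1460-byte segment size and the example rates are all hypothetical; the rates are only loosely modelled on the congestion example.

```python
import heapq
import math


def predicted_tcp_rate(rtt_s, loss, mss_bytes=1460):
    """Rough TCP rate in bytes/second from RTT (seconds) and loss (> 0).

    Simplified square-root estimate, used here instead of the model in [3].
    """
    return (mss_bytes / rtt_s) * math.sqrt(1.5 / loss)


def link(u, v):
    """Undirected edge key (hypothetical helper)."""
    return frozenset((u, v))


def mbtt(nodes, rate, root):
    """Grow a tree from root, always taking the highest-rate edge.

    rate maps link(u, v) -> predicted throughput.
    Returns (parent map, bottleneck rate of the resulting tree).
    """
    parent = {root: None}
    bottleneck = math.inf
    heap = []

    def push_edges(u):
        # Offer every edge from u to a node not yet in the tree.
        for v in nodes:
            if v not in parent and link(u, v) in rate:
                # Negate the rate: heapq is a min-heap.
                heapq.heappush(heap, (-rate[link(u, v)], u, v))

    push_edges(root)
    while heap and len(parent) < len(nodes):
        neg_rate, u, v = heapq.heappop(heap)
        if v in parent:
            continue  # v was reached through a better edge already
        parent[v] = u
        bottleneck = min(bottleneck, -neg_rate)
        push_edges(v)
    return parent, bottleneck


# Hypothetical predicted rates between the sender S and receivers A, B, C.
NODES = ["S", "A", "B", "C"]
RATES = {
    link("S", "A"): 8,
    link("S", "B"): 6,
    link("S", "C"): 4,
    link("A", "B"): 6,
    link("B", "C"): 4,
    link("A", "C"): 3,
}

tree, rate_before = mbtt(NODES, RATES, "S")

# Congestion on S-A: recompute the tree with the degraded measurement.
congested = dict(RATES)
congested[link("S", "A")] = 2
new_tree, rate_after = mbtt(NODES, congested, "S")
print(rate_before, rate_after, new_tree)
```

In this assumed graph the recomputed tree no longer uses the S-A branch, and the bottleneck rate stays at 4 instead of dropping to 2, which mirrors the redeployment behaviour described for congestion control.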

Centralised tree deployment (2)
Advantages
• No problem with small group sizes
• Converges immediately
• No inconsistencies => no heavyweight loop detection or partition management needed
• Useful for many existing applications
Disadvantages
• Not scalable to large group sizes

Experimentation
• We implemented a prototype of REALM in Java.
• Validation and verification
  – REALM services have been formally validated in Estelle.
  – Scenarios were used to verify the operation of the prototype.
• Performance comparison
  – Compare sender throughput to:
    • naive multicast
    • IP multicast

Comparison to naive (1)
[Figure: four trees over hosts in Paris, Evry, Warsaw, Krakow and Aachen]
  (a) Tree deployed for Segment 1 (initial naive multicast tree)
  (b) MBTT tree deployed for Segment 2
  (c) MBTT tree deployed for Segments 3 through 6
  (d) MBTT tree deployed for Segment 7

Comparison to naive (2)
[Figure: throughput (bytes/second) measured at the sender for evenly sized data stream segments (700 KB each), plotted against segment number 1 to 7, with the naive segment and three "deploy!" events marked]

Comparison to IP multicast (1)
• Given three connected end-systems and their measured RTTs:
[Figure: joy (Paris), galera (Warsaw) and lipari (Evry), with link RTTs 20, 10 and 15]
• Which of these structures provides higher throughput between joy in Paris and galera in Warsaw?
[Figure: two routes between joy (Paris) and galera (Warsaw) over the same three hosts]
  (a) The direct IP / IPm shortest path
  (b) An indirect route via a series of shorter paths (MBTT routing used in REALM)

Comparison to IP multicast (2)
[Figure: throughput (bytes/second) and RTT over experiment runs 1 to 10; series: unadaptive direct throughput, adaptive throughput, RTT (joy-galera direct), RTT (longest in indirect path)]

Conclusions
• REALM provides a deployable one-to-many reliable multicast service:
  – Useful to many existing applications
  – Not dependent on IP multicast
  – Avoids new network complexity
• REALM uses multicast tree distribution:
  – Reduces the duplication caused by naive multicast
  – Improves on the performance of naive multicast
• The MBTT:
  – Maximises sender throughput for a reliable multicast group
  – Can better the performance of IP multicast shortest-path routing
• REALM uses adaptive routing:
  – Deals with temporal heterogeneity
  – Avoids local network congestion to which IP multicast is subjected

Future extensions
• Distributed tree construction could improve scalability.
  – In progress
• A multi-rate flow control solution could better satisfy heterogeneous user requirements.
  – Partition receivers into homogeneous sub-groups and use a separate REALM tree for each layer of data (ALC, RMX); or
  – Implement entire file buffering at forwarding nodes (Overcast)
• Different tree algorithms to optimise for different application needs
  – Example: reliable music file vs reliable music streaming

For more information
• mail: [email protected]
• www: http://www-lor.int-evry.fr/~templemo/

Selected References
[1] Cohen, Kaempfer. A Unicast-based approach for streaming multicast. In Proceedings of INFOCOM 2001, 2001.
[2] Gondran, Minoux. Graphes et algorithmes, 2nd edition. Editions Eyrolles, 1985.
[3] Padhye, Firoiu, Towsley, Kurose. Modeling TCP Reno performance: A simple model and its empirical validation. In IEEE/ACM Transactions on Networking, April 2000.
[4] Mankin, Romanow, Bradner, Paxson. RFC 2357: IETF Criteria for Evaluating Reliable Multicast Transport and Application Protocols, June 1998.
[5] Rao, Radhakrishan, Choel. NetLets. In Proceedings of the International Conference on Networking, Colmar, France, 2001.