
An Analysis of Fault Isolation in Multi-Source Multicast Session

Network Research Workshop 2003. 8. 28
Heonkyu Park [email protected]
Korea Advanced Institute of Science and Technology
System Architecture Lab

Table of Contents
1. Motivations / Problem Definition
2. Background
3. Analysis
4. Issues
5. Candidate Model
6. Simulation Results
7. Conclusion
References

Before we start…
• Terminology
  – Unicast: delivery to a single receiver
  – Multicast: delivery to a specific subset of receivers
    • single-source: only one source in a session (one-to-many multicast)
    • multi-source: many sources in a session (many-to-many multicast)
  – Fault Detection: perceiving that a fault exists somewhere in the network ("Hmm… a fault is somewhere…")
  – Fault Isolation: locating the on-tree router or link that is the origin of the fault ("OK! I found the fault!")

Motivation / Problem Definition
1. Network monitoring is necessary to detect and discover network problems.
2. Some participants in a multicast session experience severe packet loss.
   [Figure: loss measurements obtained using the Rqm tool [Rqm]]
3. Fault detection / isolation approaches for multicast focus on single-source networks.
4. Little work has been done on fault isolation in multi-source multicast.
5. Straightforward reuse of a single-source solution is not sufficient for multi-source multicast with a large number of sources.
→ A new model for fault isolation in multi-source multicast is needed.

Background
1. IP Multicast
2. Multi-Source Multicast Applications
3. Challenges of Multicast Monitoring
4. Needs for Multicast Fault Isolation

IP Multicast
1. The routing path changes when a fault occurs.
[Figure: a source sends multicast packets to a session with several receivers; the delivery path changes when a fault occurs.]

Multi-Source Multicast Applications
1. Networked virtual environments
2. Synchronized resources such as database updates
3. Distributed or parallel concurrent processing
4. Large-scale distributed military simulation
5. Peer-to-peer multicast file transfer model
6. Large-scale multimedia conference
7. Large-scale replicated database
8. Cooperative web cache protocols
9. Shared editing and collaboration
10. Interactive distance learning
11. Network games or chatting
12. and more…
[Figure: Group Size [LN01]. Number of senders versus number of receivers (axis ticks from 1 to 1,000,000), with regions for Peer-to-Peer Applications, Distributed Information Systems, Games, Streaming, Collaboration Tools, and Content Distribution.]

Multicast Monitoring Tools [SA01]
Management, Debugging and Modeling via Active / Passive Monitoring
[Figure: timeline (~1992, ~1997, ~2000) of tools classified under Monitoring, Debugging, Management, and Modeling: mrmap, mrdebug, mrinfo, rtpmon, mtrace, Mah's Study, mwatch, mlisten, mstat, mview, mrtree, GDT, NetIQ's Chariot *, Dr. Watson, Yajnik's Study, MultiMon, Handley's Study, mhealth, RouteMonitor, MantaRay, mantra, NIMI *, sdr-mon, MINC *, Otter, mmon, MRM *, mwalk, HPMM, SNMP_NG. Legend: "* : can be used for active monitoring" and "recent research work".]

Needs for Multicast Fault Isolation
1. Monitoring of the multicast network has become crucial for maintaining multicast operations
   – the delivery service in multicast is more complex than in traditional unicast networks
   – supervising multicast traffic is a more difficult problem, as each multicast tree involves multiple hosts with correlated, simultaneous faults
2. Multicast faults have various causes.
   – session announcement problems, reception problems, multicast router problems, congestion and rate-limiting problems, multicast routing problems, etc. [TA00]
3. Fault isolation is not easy even in single-source multicast, to say nothing of multi-source multicast.

Analysis on Single-Source Approach
➢ Only for fault detection
1. MRM (Multicast Reachability Monitor) [SA01]
   • active probing from a test sender (TS) to a test receiver (TR), configured by the MRM manager
2. SMRM (SNMP-Based MRM) [AT02]
   • SNMP-based approach that defines several MIBs for multicast monitoring
➢ Both detection and isolation
3. HPMM (Hierarchical Passive Multicast Monitoring) [WL00]
   • passive monitoring scheme in which agents are organized in a hierarchy and communicate with each other using unicast
4. MTR (Fault Isolation in Multicast Tree) [RGE00]
   • receiver-driven method using IGMP multicast traceroute
→ Most approaches up to now have focused on single-source multicast.

MRM (Multicast Reachability Monitor) [SA01] - Description
Step 1: Manager configures TS(s) and TR(s)
Step 2: TS transmits
Step 3: TR(s) monitor group transmission
Step 4: Manager collects and displays TR reports
[Figure: MRM Manager, test sender TS, test receivers TR1 to TR3, and routers R1 to R6. Legend: router, end-host, manager ↔ agent communication. TS: test sender, TR: test receiver.]

SMRM (SNMP-Based MRM) [AT02] - Description
[Figure: smrmMIB group in the extended MIB II]

HPMM (Hierarchical Passive Multicast Monitor) [WL00] - Description
[Figure: source 1 and source 2 in foreign domains 1 and 2 send groups 1 and 2 into a local domain containing agents A, B, C, D, and E.]
• Each node knows exactly which upstream agent to notify when a fault occurs.
• Node D has only one parent, node B, for both multicast groups 1 and 2.
• Node E defines a parent agent in B for group 1 and a parent agent in C for group 2.
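The per-group parent relationship in the HPMM example can be sketched as a small lookup table. This is only an illustration under stated assumptions: the table name `PARENT_AGENT`, the functions `parent_for` and `notify_fault`, and the message layout are hypothetical and not part of HPMM [WL00]; only the D and E entries come from the slide.

```python
from typing import Optional

# Hypothetical table: (node, multicast group) -> upstream parent agent.
# Only the entries stated on the slide are filled in.
PARENT_AGENT = {
    ("D", 1): "B",  # D has the same parent for both groups
    ("D", 2): "B",
    ("E", 1): "B",  # E uses B for group 1 ...
    ("E", 2): "C",  # ... and C for group 2
}


def parent_for(node: str, group: int) -> Optional[str]:
    """Return the upstream agent a node notifies for a given group."""
    return PARENT_AGENT.get((node, group))


def notify_fault(node: str, group: int) -> dict:
    """Build the fault report a node would send to its parent agent.

    The dictionary layout is an assumption for illustration; the slide
    only states that agents communicate using unicast.
    """
    parent = parent_for(node, group)
    if parent is None:
        raise LookupError(f"no parent agent known for {node}, group {group}")
    return {"from": node, "to": parent, "group": group, "transport": "unicast"}
```

For example, `notify_fault("E", 2)` addresses the report to agent C, while `notify_fault("D", 2)` addresses it to agent B.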
MTR (Fault Isolation in Multicast Tree) [RGE00] - Description
[Figure: multicast tree rooted at the source, before and after isolation. Rb is the common ancestor router of Ra and Rc; the fault is isolated to a fault region.]

Comparison on Related Works
        Active/   Single-source      Multi-source
        Passive   Detect   Isolate   Detect   Isolate   Remarks
MRM     Active    ○        ✗         △        ✗         test session
SMRM    Passive   ○        ✗         △        ✗         SNMP-based
HPMM    Passive   ○        ○         ✗        ✗         child-parent relationship
MTR     Active    ○        ○         ✗        ✗         IGMP mtrace
※ None of the suggested approaches is sufficient for fault isolation in multi-source multicast networks.

Message Complexity of Current Approaches
1. Network overload increases exponentially as the number of members grows.
   • As the member size grows, mtrace request packets and mtrace reply packets become excessive.
2. Simulation result by ns-2
   • tree topology: 100 nodes, out-degree 3
   • number of members: 5 ~ 60 (increased by 5)
   • average over 5 runs
3. Thus, a different strategy is needed to handle multi-source multicast fault detection and isolation.
[Figure: overload (x 1,000, axis 0 to 300) versus number of members (5 to 60).]

Issues
1. Application Characteristics
2. Message Complexity
3. Fault Isolation Error
4. Scalability
5. Deployment

1. Application Characteristics
                               Conferencing Application                 Broadcasting Application
Performance requirements       require low latency and high bandwidth   interested in bandwidth; latency is not a concern
Loss                           tolerate loss                            require reliable data delivery
Session length                 long-lived, over 10 min                  short-lived
Group characteristics          dynamic and small groups                 relatively static
Source transmission patterns   multiple sources                         a single static source
Comparison on two applications [CRSZ01]

Issues for Multi-Source Multicast Fault Isolation
1. Message Complexity
   – Message complexity will be the main concern.
   – It should grow logarithmically rather than linearly with the number of members: not O(N), but O(log N) or O(1).
   [Figure: message complexity versus size of members; the linear curve is marked "not good" and the flatter curves "acceptable".]
2. Fault Isolation Error
   – should be the same as or lower than in previous approaches
   – no sudden computation overload to isolate faults
   – near-real-time fault detection and isolation
3. Scalability
   – not affected by the number of members
   – handles dynamic member actions such as join / leave
4. Deployment
   – should be easily deployable, without depending on particular protocols and techniques

Candidate Model
• Goal: isolate the fault promptly and accurately, using an efficient and scalable approach, when a fault occurs in the multicast network.
• Basic Idea: member grouping
  1. do not let all members send probes
  2. there exists a shared path from a local member to other members
  3. make maximum use of shared information
  4. only the group leader sends probes for fault isolation to other group leaders
• Benefits
  1. reduced message complexity
  2. scalable, since it does not depend on the number of members

Draft Model
1. Each group selects a group leader.
2. The group leader manages its members and sends probes for fault isolation.
3. The leader does not send to all other group leaders, but only to the common ancestor router it shares with other group leaders.
[Figure: Group A (A1, A2, A3), Group B (B1, B2), Group C (C1), and Group D (D1, D2).]

Member Grouping
1. How the members are grouped
   – simply, by the boundary of a border router
   – a way to make groups bigger is needed, since the number of groups can still be large
2. How the members in a group know their group leader
   – the group leader periodically sends an “i-am-leader” packet to the group members
   [Figure: one group with its group leader behind a border router.]
3. How a member knows whether a group leader exists
   – a newly joined member sends an “i-am-leader” packet within the group using multicast scoping
   – if there is no response, it becomes the leader
   – if somebody else sends an “i-am-leader” packet, it considers that a leader exists

Group Leader Action Lists
1. Managing members in the group
   – use “i-am-leader” to control group members
   – send a “you-are-leader” packet when leaving
2. Fault Isolation
   – the primary function of the group leader
   – exchange among other group leaders
3. Group leader announcement
   – announcing and finding the group leaders is not easy

Simulation Results
• Overview
  – Simulated a simplified protocol using the ns-2 simulator
  – Random graph generated by GT-ITM
  – Average value over five simulation runs
  – Compared with the best approach among related works
• Results
  – All-member-based (best-performance): $y = 176x$
  – Group-leader-based: $y = 58x$
  – reduced the message complexity by 68%
[Figure: message complexity (axis 0 to 18000) versus number of sources (10 to 100) for the all-member-based approach and the group-leader approach.]

Conclusion
1. It is important to locate the fault in a network.
2. Little work has been done on fault isolation, or even detection, in multi-source multicast.
3. In multi-source multicast fault isolation, message complexity is the main concern.
4. One candidate approach is a group-based architecture to locate the fault in a multi-source multicast session.
5. Simulation results show that the group-based approach reduced the message complexity by 68% compared with the best-performing of the other approaches.
6. However, the group-based approach is not yet sufficient, for reasons such as scalability.

Future Works
• A more efficient approach to message complexity is needed.
• A possible model is a suppressed one-way probing mechanism.
  – The source sends a special packet to the multicast group.
  – Every internal router records its routing information in the special packet.
  – Without packet suppression, an implosion problem will occur.
  – The receiver compares the recorded paths to check whether the routing path has changed.
• Simulation results show that suppressed one-way probing is well suited to multi-source multicast networks.
• Several things to elaborate…
• Any comment will be appreciated.
[Figure: message complexity (x 10^4, r = 10) versus number of sources (0 to 1200) for No suppression, Max Suppression, and Min Suppression.]

References
[AT02] E. Al-Shaer and Y. Tang, “SMRM: SNMP-based multicast reachability monitoring,” in IEEE/IFIP Network Operations and Management Symposium (NOMS) 2002, Florence, Italy, April 2002.
[CRSZ01] Y. Chu, S. G. Rao, S. Seshan and H. Zhang, “Enabling conferencing applications on the Internet using an overlay multicast architecture,” in ACM SIGCOMM 2001, San Diego, California, August 2001.
[LN01] J. Liebeherr and M. Nahas, “Application-layer multicast with Delaunay triangulations,” in Global Internet Symposium, IEEE GlobeCom 2001, San Antonio, Texas, November 2001.
[RGE00] A. Reddy, R. Govindan and D. Estrin, “Fault isolation in multicast trees,” in Proceedings of ACM SIGCOMM 2000, Stockholm, Sweden, August 2000.
[Rqm] C. Perkins, “RTP Quality Matrix,” [online], http://wwwmice.cs.ucl.ac.uk/multimedia/software/rqm/ (Accessed: 7 March 2003).
[SA01] K. Sarac and K. C. Almeroth, “Supporting multicast deployment efforts: A survey of tools for multicast monitoring,” Journal of High Speed Networking, Special Issue on Management of Multimedia Networking, vol. 9, no. 3/4, pp. 191-211, March 2001.
[TA00] D. Thaler and B. Aboba, “Multicast Debugging Handbook,” Internet draft, draft-ietf-mboned-mdh-*.txt, Internet Engineering Task Force (IETF), November 2000.
[WL00] J. Walz and B. N. Levine, “A hierarchical multicast monitoring scheme,” in 2nd International Workshop on Networked Group Communication, November 2000.

Thank you. Question? Comment?
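Backup sketch: the leader rules from the Member Grouping and Group Leader Action Lists slides can be read as a small state machine. This is a minimal sketch under assumptions, not the simulated protocol: the class `Group`, its method names, and the returned strings are hypothetical, and handing leadership to the longest-standing remaining member is an assumption (the slides only say a “you-are-leader” packet is sent on leave). Member names A1 to A3 follow the Draft Model figure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Group:
    """Members behind one border router (the simple grouping rule)."""
    border_router: str
    members: List[str] = field(default_factory=list)
    leader: Optional[str] = None

    def join(self, member: str) -> str:
        """A newly joined member sends "i-am-leader" inside the scoped group."""
        self.members.append(member)
        if self.leader is None:
            # No response: nobody else claims leadership, so the joiner leads.
            self.leader = member
            return "became-leader"
        # The existing leader's own "i-am-leader" packet is heard.
        return "leader-exists"

    def leave(self, member: str) -> Optional[str]:
        """Remove a member; a departing leader hands over with "you-are-leader"."""
        self.members.remove(member)
        if member != self.leader:
            return None
        # Assumption: the longest-standing remaining member is chosen.
        self.leader = self.members[0] if self.members else None
        return self.leader


group_a = Group("border-A")
first = group_a.join("A1")      # first member finds no leader
second = group_a.join("A2")     # later members hear the leader
group_a.join("A3")
successor = group_a.leave("A1")  # leader leaves and hands over
```

Only the current leader of each group would then send fault-isolation probes, which is where the slides expect the reduction in message complexity to come from.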
