An Analysis of Fault Isolation
in Multi-Source Multicast Session
Network Research Workshop
2003. 8. 28
Heonkyu Park
[email protected]
Korea Advanced Institute of Science and Technology
System Architecture Lab
Table of Contents
1. Motivations / Problem Definition
2. Background
3. Analysis
4. Issues
5. Candidate Model
6. Simulation Results
7. Conclusion
References
Before we start…
• Terminology
– Unicast : delivery to a single receiver
– Multicast : delivery to a specific subset of receivers
• single-source : only one source in a session (one-to-many multicast)
• multi-source : many sources in a session (many-to-many multicast)
– Fault Detection : perceiving that a fault exists somewhere in the network
– Fault Isolation : locating the on-tree router or link that is
the origin of the fault.
[Figure: Fault Detection ("Hmm… a fault is somewhere…") versus Fault Isolation ("OK! I found the fault!")]
Motivation / Problem Definition
1. Network monitoring is necessary to detect and discover
network problems.
2. Some participants in a multicast session experience
severe packet loss.
[Figure obtained using the Rqm tool [Rqm]]
3. Fault detection / isolation approaches in multicast
focus on single-source networks.
4. For multi-source multicast, little work has been done on
fault isolation.
5. Straightforward reuse of a single-source solution is not
sufficient for multicast sessions with a large number of sources.
→ A new model for fault isolation in multi-source
multicast is needed.
Background
1. IP Multicast
2. Multi-Source Multicast Applications
3. Challenges of Multicast Monitoring
4. Needs for Multicast Fault Isolation
1. IP Multicast
[Figure: a source sends multicast packets to a multicast session with four receivers; the routing path changes when a fault occurs.]
Multi-Source Multicast Applications
1. Networked virtual environments
2. Synchronized resource like database updates
3. Distributed or parallel concurrent processing
4. Large-scale distributed military simulation
5. Peer-to-peer multicast file transfer model
6. Large-scale multimedia conference
7. Large-scale replicated database
8. Cooperative web cache protocols
9. Shared editing and collaboration
10. Interactive distance learning
11. Network games or chatting
12. and more…
[Figure: Group Size [LN01]. Number of senders (ticks 1; 10; 1,000; 1,000,000) versus number of receivers (ticks 10; 1,000; 1,000,000). Regions shown: Peer-to-Peer Applications, Distributed Information Systems, Games, Streaming, Collaboration Tools, Content Distribution.]
Multicast Monitoring Tools [SA01]
Management, Debugging and Modeling via Active / Passive Monitoring
[Figure: timeline (~1992, ~1997, ~2000) of multicast tools arranged under Monitoring, Debugging, Management and Modeling. Tools shown: mrmap, mrdebug, mrinfo, rtpmon, mtrace, Mah’s Study, mwatch, mlisten, mstat, mview, mrtree, GDT, NetIQ’s Chariot *, Dr. Watson, Yajnik’s Study, MultiMon, Handley’s Study, mhealth, RouteMonitor, MantaRay, mantra, NIMI *, sdr-mon, MINC *, Otter, mmon, MRM *, mwalk, HPMM, SNMP_NG.]
Legend: * : can be used for active monitoring; recent research work is marked separately.
Needs for Multicast Fault Isolation
1. Monitoring of multicast networks has become crucial for
maintaining multicast operations
– since the delivery service in multicast is more complex than in
traditional unicast networks
– Supervising multicast traffic is a more difficult problem, as each
multicast tree involves multiple hosts with correlated,
simultaneous faults.
2. Multicast faults have various causes.
– session announcement problems, reception problems, multicast
router problems, congestion and rate-limiting problems, multicast
routing problems, etc. [TA00]
3. This is not easy even in single-source multicast, to say
nothing of multi-source multicast.
Analysis on Single-Source Approach

Only for fault detection
1. MRM (Multicast Reachability Monitor) [SA01]
• active probing from a test sender (TS) to a test receiver (TR), under the
control of the MRM manager
2. SMRM (SNMP-Based MRM) [AT02]
• SNMP-based approach that defines several MIBs for multicast
monitoring
Both detection and isolation
3. HPMM (Hierarchical Passive Multicast Monitor) [WL00]
• passive monitoring scheme in which agents are organized in a hierarchy
and communicate with each other using unicast
4. MTR (Fault Isolation in Multicast Tree) [RGE00]
• receiver-driven method using IGMP multicast traceroute
→ Most approaches up to now have focused on single-source
multicast.
MRM (Multicast Reachability Monitor) [SA01] - Description
[Figure: MRM topology with the MRM Manager, a test sender (TS), test receivers TR1 to TR3 and routers R1 to R6. The legend distinguishes routers, end-hosts and manager-agent communication.]
Step 1: Mgr Configures TS(s) and TR(s)
Step 2: TS Transmits
Step 3: TR(s) Monitor Group Transmission
Step 4: Mgr Collects and Displays TR Reports
TS: Test Sender
TR: Test Receiver
SMRM (SNMP-Based MRM) [AT02] - Description
[Figure: smrmMIB Group in Extended MIB II]
HPMM (Hierarchical Passive Multicast Monitor) [WL00] - Description
[Figure: foreign domain 1, foreign domain 2 and a local domain; source 1 and source 2 send to group 1 and group 2; agents A to E, with links labeled by the group number (1 or 2) they carry.]
• Each node knows exactly which upstream agent to notify when a fault occurs.
• Node D has only one parent agent, node B, for both multicast groups 1 and 2.
• Node E has a parent agent in B for group 1 and a parent agent in C for group 2.
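The per-group parent relationship described for nodes D and E can be written down as a small lookup table. This is only an illustrative sketch: the table layout and the function name `upstream_agent` are assumptions, not part of HPMM itself.

```python
# Hypothetical per-group parent table for the two nodes named on the slide.
# Keys are agent names; inner keys are multicast group numbers.
PARENT_AGENT = {
    "D": {1: "B", 2: "B"},  # D reports to B for both groups
    "E": {1: "B", 2: "C"},  # E reports to B for group 1 and to C for group 2
}

def upstream_agent(node, group):
    """Return the agent that `node` notifies about a fault in `group`."""
    return PARENT_AGENT[node][group]

print(upstream_agent("E", 2))
```

Because the parent is looked up per group, one node can sit in two different reporting hierarchies at once, which is the point the node E example makes.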
MTR (Fault Isolation in Multicast Tree) [RGE00] - Description
[Figure: "Before" and "After" views of a multicast tree rooted at the source, with routers Ra, Rb and Rc; Rb is the common ancestor router of Ra and Rc. The legend marks the fault and the isolated fault region.]
Comparison on Related Works
Approach | Active/Passive | Single-source    | Multi-source     | Remarks
         |                | Detect | Isolate | Detect | Isolate |
MRM      | Active         | ○      | ×       | △      | ×       | test session
SMRM     | Passive        | ○      | ×       | △      | ×       | SNMP-based
HPMM     | Passive        | ○      | ○       | ×      | ×       | child-parent relationship
MTR      | Active         | ○      | ○       | ×      | ×       | IGMP mtrace
※ None of the suggested approaches is sufficient for fault
isolation in multi-source multicast networks.
Message Complexity of Current Approaches
1. Network overhead increases exponentially as the number of
members grows
• As the member size grows, mtrace request packets and mtrace reply
packets become excessive.
2. Simulation result by ns-2
• tree topology: 100 nodes, out-degree: 3
• number of members: 5 ~ 60 (increased by 5)
• average of 5 runs
3. Thus, a different strategy is needed to handle
multi-source multicast fault detection and isolation.
[Figure: overload (× 1,000, 0 to 300) versus number of members (5 to 60).]
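To make the growth concrete, the following sketch counts trace traffic when every member traces every other member on a 100-node tree with out-degree 3. It is not the ns-2 setup used for the slide. The complete-tree layout, the member placement and the cost model (one request and one reply crossing each link on the path) are all assumptions. Under them the number of traces is N(N − 1) for N members.

```python
from itertools import permutations

def build_tree(num_nodes=100, out_degree=3):
    """Return a parent list for a complete tree; node 0 is the root."""
    return [None] + [(i - 1) // out_degree for i in range(1, num_nodes)]

def depth(parent, node):
    """Number of hops from node up to the root."""
    d = 0
    while parent[node] is not None:
        node = parent[node]
        d += 1
    return d

def hops(parent, a, b):
    """Number of tree links between nodes a and b."""
    da, db = depth(parent, a), depth(parent, b)
    dist = 0
    while da > db:                      # lift the deeper node first
        a, da, dist = parent[a], da - 1, dist + 1
    while db > da:
        b, db, dist = parent[b], db - 1, dist + 1
    while a != b:                       # climb together to the common ancestor
        a, b, dist = parent[a], parent[b], dist + 2
    return dist

def all_member_trace_cost(parent, members):
    """Assumed cost model: a request and a reply each cross every link once."""
    return sum(2 * hops(parent, r, s) for r, s in permutations(members, 2))

parent = build_tree()
for n in (5, 10, 20, 40):
    members = list(range(100 - n, 100))  # hypothetical placement: the last n nodes
    print(n, all_member_trace_cost(parent, members))
```

Doubling the number of members roughly quadruples the number of traces in this model, which is why an all-member probing scheme does not carry over well to sessions with many sources.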
Issues
1. Application Characteristics
2. Message Complexity
3. Fault Isolation Error
4. Scalability
5. Deployment
1. Application Characteristics
                             | Conferencing Application               | Broadcasting Application
Performance requirements     | require low latency and high bandwidth | interested in bandwidth; latency is not a concern
Loss                         | tolerate loss                          | require reliable data delivery
Session length               | long-lived, over 10 min                | short-lived
Group characteristics        | dynamic and small groups               | relatively static
Source transmission patterns | multiple sources                       | a single static source
Comparison of two applications [CRSZ01]
Issues for Multi-Source Multicast Fault Isolation
1. Message Complexity
– Message complexity will be the main concern.
– It should grow logarithmically, not linearly, with the number of members
• not O(N), but O(log N) or O(1)
[Figure: message complexity versus size of members, with curves labeled "not good" and "acceptable".]
2. Fault Isolation Error
– Should be the same as or lower than in previous approaches.
– No sudden computation overload to isolate faults
– near-real-time fault detection and isolation
3. Scalability
– not affected by the number of members
– handles dynamic member actions such as join / leave
4. Deployment
– should be easily deployable, independent of specific protocols and techniques.
Candidate Model
• Goal : Isolate the fault promptly and accurately, using an
efficient and scalable approach, when a fault occurs in the
multicast network.
• Basic Idea : member grouping
1. do not let every member send probes
2. shared paths exist from a local member to other members
3. make maximum use of shared information
4. only group leaders send probes for fault isolation, and only to other group
leaders
• Benefits
1. reduces message complexity
2. scalable, since it does not depend on the number of members
Draft Model
1. Each group selects a
group leader.
2. The group leader manages
its members and sends
probes for fault isolation.
3. Probes are not sent to all other
group leaders, but only to
the common ancestor
router shared with other group
leaders.
[Figure: four groups. Group A: A1, A2, A3; Group B: B1, B2; Group C: C1; Group D: D1, D2.]
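The effect of letting only leaders probe can be illustrated by counting probe pairs for the four groups named in the figure. This is a rough sketch under stated assumptions. Choosing the first listed member as leader is arbitrary. Counting one probe per ordered pair is a simplification. The common-ancestor optimization from step 3 is not modeled.

```python
from itertools import permutations

# Group membership as labeled in the figure.
GROUPS = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2"],
    "C": ["C1"],
    "D": ["D1", "D2"],
}

def all_member_pairs(groups):
    """Ordered (prober, target) pairs when every member probes every other."""
    members = [m for ms in groups.values() for m in ms]
    return list(permutations(members, 2))

def leader_pairs(groups):
    """Ordered pairs when only one leader per group probes the other leaders."""
    leaders = [ms[0] for ms in groups.values()]  # assumption: first member leads
    return list(permutations(leaders, 2))

print(len(all_member_pairs(GROUPS)), len(leader_pairs(GROUPS)))
```

In this count the probe load depends on the number of groups rather than the number of members, which is the scalability benefit the model aims for.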
Member Grouping
1. How the members are grouped
– simply, by the boundary of a border router
– a way to make groups bigger is still needed,
since the number of groups can still be large
2. How the members in a group learn who their
group leader is
– the group leader periodically sends an “i-am-leader” packet
to the group members
3. How a member learns whether a group leader exists
– a newly joined member sends an “i-am-leader” packet within its group using
multicast scoping
– if there is no response, it becomes the leader.
– if another member sends an “i-am-leader” packet, it considers that a leader already exists.
[Figure: one group behind a border router, with its group leader.]
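The join rule above can be sketched as a small in-memory model. The class `ScopedGroup` and its methods are hypothetical. A missing response is modeled by `None` instead of a real timeout. The hand-over on leave (the “you-are-leader” packet mentioned under the group leader actions) goes to the longest-standing remaining member, which is an assumption of this sketch.

```python
class ScopedGroup:
    """In-memory stand-in for one scoped group (hypothetical model)."""

    def __init__(self):
        self.members = []
        self.leader = None

    def join(self, member):
        # The newcomer announces "i-am-leader" inside the scope.
        # An existing leader answers with its own "i-am-leader";
        # None models "no response before the timeout".
        response = self.leader
        self.members.append(member)
        if response is None:
            self.leader = member      # nobody answered: newcomer leads
        return self.leader

    def leave(self, member):
        self.members.remove(member)
        if member == self.leader:
            # Departing leader sends "you-are-leader" to a successor.
            self.leader = self.members[0] if self.members else None
        return self.leader

group = ScopedGroup()
print(group.join("A1"), group.join("A2"), group.leave("A1"))
```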
Group Leader Action Lists
1. Managing members in the group
– uses “i-am-leader” to control group members
– sends a “you-are-leader” packet when it leaves
2. Fault Isolation
– the primary function of the
group leader
– probes are exchanged among
group leaders
3. Group leader
announcement
– Announcing and discovering
group leaders is not easy
Simulation Results
• Overview
– Simulated a simplified protocol using the ns-2 simulator
– Random graph generated by GT-ITM
– Average value over five simulation runs
– Compared with the best approach among related works
• Results
– All-member-based (best-performance): y = 176 · x
– Group-leader-based: y = 58 · x
– reduced the message complexity by 68%
[Figure: message complexity (0 to 18,000) versus number of sources (10 to 100) for the All-member-based Approach and the Group-leader Approach.]
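As a rough cross-check, assume the two fitted lines on the plot read y = 176x for the all-member-based approach and y = 58x for the group-leader approach. This is a reading of the garbled labels, with the larger slope assigned to the approach that sends more messages. The relative reduction implied by the slopes is then:

```latex
\[
\frac{y_{\text{all}} - y_{\text{leader}}}{y_{\text{all}}}
  = \frac{176x - 58x}{176x}
  = 1 - \frac{58}{176}
  \approx 0.67
\]
```

This is close to the reported 68%, which presumably comes from the measured data points and not from the rounded slopes.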
Conclusion
1. It is important to locate faults in a network.
2. Little work has been done on fault isolation, or even
detection, in multi-source multicast.
3. In multi-source multicast fault isolation, message
complexity is the main concern.
4. One candidate approach is a group-based architecture to
locate the fault in a multi-source multicast session.
5. Simulation results show that the group-based approach reduces
message complexity by 68% compared with the best-performing
approach among the others.
6. However, the group-based approach is still not sufficient,
for scalability and other reasons.
Future Works
• A more efficient approach to message complexity is needed.
• A possible model is a suppressed one-way probing mechanism.
– The source sends a special packet to the multicast group.
– Every internal router records its routing information in the special packet.
– Without packet suppression, an implosion problem will occur.
– Each receiver compares the recorded path to check whether the routing path has changed.
• Simulation results show that
this suppressed one-way
probing is well suited to multi-source multicast networks.
• Several things remain to be elaborated…
• Any comments will be
appreciated.
[Figure: Message complexity (r = 10). Message complexity (× 10^4, 0 to 12) versus number of sources (0 to 1200) for No suppression, Max Suppression and Min Suppression.]
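The record-and-compare part of the one-way probing idea can be sketched as follows. The names `record_route`, `first_changed_hop` and `Receiver` are hypothetical, routers are plain string identifiers, and the suppression step is not modeled because the slides do not specify it.

```python
def record_route(on_tree_routers):
    """Model one probe: each on-tree router appends its identifier in turn."""
    recorded = []
    for router in on_tree_routers:
        recorded.append(router)
    return recorded

def first_changed_hop(previous, current):
    """Index of the first hop where two recorded routes differ, else None."""
    for index, (old, new) in enumerate(zip(previous, current)):
        if old != new:
            return index
    if len(previous) != len(current):
        return min(len(previous), len(current))
    return None

class Receiver:
    """Keeps the last recorded route per source and reports route changes."""

    def __init__(self):
        self.last_route = {}

    def on_probe(self, source, route):
        previous = self.last_route.get(source)
        self.last_route[source] = route
        if previous is None:
            return None               # first probe: nothing to compare with
        return first_changed_hop(previous, route)

receiver = Receiver()
receiver.on_probe("S1", record_route(["R1", "R2", "R3"]))
print(receiver.on_probe("S1", record_route(["R1", "R4", "R3"])))
```

The hop index at which the route first diverges gives the receiver a starting point for locating the fault without sending any request toward the source.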
References
[AT02] E. Al-Shaer and Y. Tang, “SMRM: SNMP-based multicast reachability monitoring,”
in IEEE/IFIP Network Operations and Management Symposium (NOMS) 2002,
Florence, Italy, April 2002.
[CRSZ01] Y. Chu, S. G. Rao, S. Seshan and H. Zhang, “Enabling
conferencing applications on the Internet using an overlay multicast architecture,” in
ACM SIGCOMM 2001, San Diego, California, August 2001.
[LN01] J. Liebeherr and M. Nahas, “Application-layer multicast with Delaunay
triangulations,” in Global Internet Symposium, IEEE GlobeCom 2001, San Antonio,
Texas, November 2001.
[RGE00] A. Reddy, R. Govindan and D. Estrin, “Fault isolation in multicast trees,” in
Proceedings of ACM SIGCOMM 2000, Stockholm, Sweden, August 2000.
[Rqm] C. Perkins, “RTP Quality Matrix,” [online], http://www-mice.cs.ucl.ac.uk/multimedia/software/rqm/ (Accessed: 7 March 2003).
[SA01] K. Sarac and K. C. Almeroth, “Supporting multicast deployment efforts: A survey of
tools for multicast monitoring,” Journal of High Speed Networking, Special Issue on
Management of Multimedia Networking, vol. 9, no. 3/4, pp. 191-211, March 2001.
[TA00] D. Thaler and B. Aboba, “Multicast Debugging Handbook,” Internet draft, draft-ietf-mboned-mdh-*.txt, Internet Engineering Task Force (IETF), November 2000.
[WL00] J. Walz and B. N. Levine, “A hierarchical multicast monitoring scheme,” in 2nd
International Workshop on Networked Group Communication, November 2000.
Thank you.
Question?
Comment?