Scalable Management
for Networks and Services
Rolf Stadler
Laboratory for Communication Networks
KTH Royal Institute of Technology
Stockholm
HP Laboratories, Palo Alto, March 31, 2003
The Shift of a Management Paradigm
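[Figure: two management paradigms. In the first, a manager (M) on the management station runs the management program (P) against agents (A) on the nodes. In the second, the management station downloads the program (P) to the nodes, where it is executed, and receives the results.]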
Manager-Agent based management
• Centralized control
• Management protocols: SNMP, CMIP
• Program runs on management station

• Decentralized control
• Program runs on network nodes
Architecture for
Pattern-based Management
[Figure: architecture for pattern-based management. Labels: management program, management station, navigation, execution environment, router, code server.]
Weaver—A Testbed for
pattern-based Management
[Figure: the Weaver testbed. Labels: management station; Routers A, B, C and D, each with a WAN link (A to D); FastEthernet switch.]
Simple Navigation Patterns
[Figure sequence: the echo pattern on an example network. Expansion phase with droot = 1, 2, 3, 4, 5, followed by the contraction phase with droot = 4, 3, 2, 1.]
The Echo Pattern
• Two phases of traversal
– expansion phase: explorers flood network with requests
for local operations
– contraction phase: echoes return and aggregate results
• Properties
– Generates balanced traffic load
– Traffic load depends on network topology,
not on speed of traversal
– Time complexity increases linearly with network
diameter.
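The two phases can be sketched as a small round-based simulation. This is a minimal sketch, assuming synchronous rounds with unit link delay and an undirected network given as an adjacency dict; the names `grid` and `run_echo` are illustrative and are not taken from Weaver or SIMPSON. The aggregate computed here is simply the number of nodes.

```python
def grid(rows, cols):
    """Adjacency dict of a rows x cols grid (hypothetical test topology)."""
    adj = {(r, c): set() for r in range(rows) for c in range(cols)}
    for r, c in adj:
        for dr, dc in ((0, 1), (1, 0)):
            n = (r + dr, c + dc)
            if n in adj:
                adj[(r, c)].add(n)
                adj[n].add((r, c))
    return adj


def run_echo(adj, root):
    """Simulate the echo pattern; return (messages, rounds, node_count)."""
    pending = {v: set(adj[v]) for v in adj}  # neighbors not yet heard from
    parent = {}                              # filled in on first visit
    partial = {}                             # per-node aggregate (node count)
    inbox = [(None, root, 0)]                # (sender, receiver, payload)
    messages, rounds, result = 0, -1, None
    while inbox:
        rounds += 1
        outbox = []
        for sender, v, payload in inbox:
            pending[v].discard(sender)
            if v not in parent:              # expansion: first contact
                parent[v] = sender
                partial[v] = 1
                outbox += [(v, u, 0) for u in pending[v]]
            else:                            # echo, or an explorer that crossed ours
                partial[v] += payload
            if not pending[v]:               # contraction: all neighbors answered
                if parent[v] is None:
                    result = partial[v]
                else:
                    outbox.append((v, parent[v], partial[v]))
        messages += len(outbox)
        inbox = outbox
    return messages, rounds, result


messages, rounds, count = run_echo(grid(3, 3), (0, 0))
print(messages, rounds, count)
```

In this model every link carries exactly two messages, so the message count is 2|E| regardless of how quickly the traversal proceeds, while the number of rounds grows with the distance from the root to the farthest node.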
Examples of Echo-based Management
• Get information on topology
– compute the current number of leaf nodes, the
connectivity distribution
– discover current topology within 10 hops of node x
• Get information on network state
– identify 10 most congested links
– compute distribution of link utilization, queue lengths
– identify sub topologies with highly loaded links
– find a resource R closest to node x
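As an illustration of how one of these operations maps onto aggregation, the "10 most congested links" query can be answered by merging partial top-k lists on the way back to the root. This is a minimal sketch, assuming each partial result is a list of `(link_id, utilization)` pairs; the function name `merge_top_k` and the link identifiers are hypothetical.

```python
import heapq


def merge_top_k(a, b, k=10):
    """Merge two partial results, keeping the k most utilized links."""
    best = {}
    for link, util in a + b:
        # A link may be reported by both of its endpoints; keep the higher value.
        best[link] = max(util, best.get(link, 0.0))
    return heapq.nlargest(k, best.items(), key=lambda item: item[1])


left = [("a-b", 0.9), ("b-c", 0.2)]
right = [("b-c", 0.2), ("c-d", 0.7)]
print(merge_top_k(left, right, k=2))
```

Because each merge returns at most k entries, the size of a message does not grow with the number of nodes below the sender.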
Pattern-based Management—
An Engineering Approach to
Decentralized Management
• A management program consists of
– A navigation pattern (distributed graph traversal algorithm)
– An operation on nodes
– An aggregation function
• Relevance of this approach
– Provides a basis for analyzing management operations with respect to
performance, scalability, robustness
– Supports concept of re-usable patterns, hides
complexity
Composing Management Programs
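[Figure: a management program composed from navigation patterns, aggregators and local operations. Labels: Management Program; Navigation Patterns; Echo Patterns; Segall; Chang; Scope; Skip; Multi; Wait; Aggregators; Echo Aggregators; Leaf Count; Res. Disc.; Load. Hist.; Conn. Hist.; Local Operations; Node Access; SNMP; XML; HTTP; CLI.]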
Properties of Patterns
• A pattern can be used for many management operations.
• A pattern can be chosen according to performance objectives.
• A pattern hides the complexity of a distributed operation.
• Network failures can be handled within patterns.
• Code mobility can be controlled.
[Figure: composition of a management program. Labels: Management Program; Navigation Patterns; Echo Patterns; Simple Echo; Robust Echo; Segall; Chang; Scope; Skip; Multi; Wait; Others; Aggregators; Echo Aggregators; Leaf Count; Res. Disc.; Load. Hist.; Conn. Hist.; Node Access; SNMP; XML; HTTP; CLI.]
The Interface between
Pattern and Aggregator
visitedi : boolean          init false;
Gi       : set of integers  init neighbors();
parenti  : integer          init -1;

Echo(inmsg: bytes, from: integer) {
  Gi := Gi - from;
  if visitedi = false {
    parenti := from;
    visitedi := true;
    OnInitiate(inmsg, outmsg);
    if Gi != empty
      dispatch(Gi, outmsg, i);        // explorers go to the remaining neighbors
  } else
    OnAggregate(inmsg);
  if Gi = empty {
    OnComplete(outmsg);
    if parenti >= 0
      dispatch(parenti, outmsg, i);   // echo returns to the parent
    else
      OnTerminate(inmsg);
  }
}

Aggregator callbacks: OnInitiate, OnBegin, OnAggregate, OnComplete, OnTerminate

Aggregator fragments shown on the slide (average load):
  … av_load := load(); n := 1; …
  … av_loadi := av_load; …
  … av_load := (av_load*n + av_loadj)/(n+1); n := n+1; …
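The split between pattern and aggregator can be sketched in Python as follows. This is a hedged illustration, not Weaver's interface: the callback names mirror the slide, but the method signatures, the class names and the sequential depth-first `traverse` function (which stands in for the distributed echo pattern) are assumptions. The average-load aggregator carries `(sum, count)` pairs so that subtree results are weighted by size, which is a choice of this sketch rather than the running-average fragment shown on the slide.

```python
class AverageLoad:
    """Aggregator: network-wide average of a per-node load value."""

    def __init__(self, load):
        self.load = load                      # hypothetical per-node load table

    def on_initiate(self, node):              # local operation on first visit
        return (self.load[node], 1)

    def on_aggregate(self, state, msg):       # fold in a child's result
        return (state[0] + msg[0], state[1] + msg[1])

    def on_complete(self, state):             # message sent to the parent
        return state

    def on_terminate(self, state):            # final result at the root
        return state[0] / state[1]


class NodeCount:
    """Aggregator: number of reachable nodes."""

    def on_initiate(self, node):
        return 1

    def on_aggregate(self, state, msg):
        return state + msg

    def on_complete(self, state):
        return state

    def on_terminate(self, state):
        return state


def traverse(adj, root, agg):
    """Sequential stand-in for a navigation pattern: visit every node once."""
    visited = set()

    def visit(node):
        visited.add(node)
        state = agg.on_initiate(node)
        for nb in adj[node]:
            if nb not in visited:
                state = agg.on_aggregate(state, visit(nb))
        return agg.on_complete(state)

    return agg.on_terminate(visit(root))


adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
load = {"a": 10, "b": 20, "c": 30, "d": 60}
print(traverse(adj, "a", AverageLoad(load)), traverse(adj, "a", NodeCount()))
```

The same `traverse` function runs both aggregators unchanged, which is the sense in which one pattern can serve many management operations.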
SIMPSON: A SIMple Pattern Simulator
fOr Large Networks
[Plot: traffic vs. time for a 221-node grid network. x-axis: Time (secs), 0 to 6; y-axis: Traffic (bytes), 0 to 1.2e+06.]
Analyzing Management Operations
[Figure: a network graph G=(V,E) with nodes A to F, and two execution graphs G'=(V',E') derived from it: the star pattern (centralized management) and the echo pattern (distributed management).]
Traffic Complexity
of Management Operations

\[ C_{traffic} = \sum_{v' \in V'} \; \sum_{0 < k \le \mathrm{childcount}(v')} \mathrm{hopcount}(v', \mathrm{child}_k(v')) \, (I_q + I_r) \]
Amount of traffic placed on the network during execution.
\[ C^{echo}_{traffic} = (I_q + I_r) \left( |E| + \frac{\mathrm{degree}(G) - 2|V|}{2} + 1 \right) \]
\[ C^{star}_{traffic} = (I_q + I_r) \sum_{v' \in V'} \mathrm{hopcount}(v'_{root}, v') \]
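The two closed forms can be evaluated on a concrete graph. This is a minimal sketch under stated assumptions: degree(G) is read as the sum of all node degrees, I_q and I_r are taken to be the sizes of a request and a reply message in bytes (the slides do not define them), and the function names are illustrative.

```python
from collections import deque


def hop_counts(adj, root):
    """Breadth-first hop distance from root to every reachable node."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def star_traffic(adj, root, iq, ir):
    """(I_q + I_r) times the sum of hop counts from the root to every node."""
    return (iq + ir) * sum(hop_counts(adj, root).values())


def echo_traffic(adj, iq, ir):
    """(I_q + I_r) * (|E| + (degree(G) - 2|V|)/2 + 1) for a connected graph."""
    num_nodes = len(adj)
    degree_sum = sum(len(nbrs) for nbrs in adj.values())
    num_edges = degree_sum // 2
    return (iq + ir) * (num_edges + (degree_sum - 2 * num_nodes) // 2 + 1)


# Path 0-1-2-3 with 512-byte requests and replies (made-up sizes).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(star_traffic(path, 0, 512, 512), echo_traffic(path, 512, 512))
```

With degree(G) equal to 2|E|, the echo expression simplifies to (I_q + I_r)(2|E| - |V| + 1), which depends only on the topology and not on the position of the root.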
Time Complexity
of Management Operations
\[ C_{time} = C_{time}(v'_{root}) \]
\[ C_{time}(v') = \begin{cases} t_c + t_r & \text{if } \mathrm{childcount}(v') = 0 \\ t_c + t_r + M(v') & \text{otherwise} \end{cases} \]
\[ M(v') = \max_{1 \le k \le \mathrm{childcount}(v')} \left( k\,t_q + 2\,\mathrm{hopcount}(v', \mathrm{child}_k(v'))\,t_l + C_{time}(\mathrm{child}_k(v')) \right) \]
Time needed from invocation until completion of an operation.
\[ C^{echo}_{time} = O(d) \qquad C^{star}_{time} = O(|V|) \]
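The recursive definition of the completion time can be evaluated directly on an execution graph. This is a minimal sketch, assuming the execution graph is a tree given as a dict that maps each node to an ordered list of `(child, hopcount)` pairs; the symbols t_c, t_r, t_q and t_l are taken from the formulas, and the numeric values below are made up for illustration.

```python
def c_time(children, v, tc, tr, tq, tl):
    """Completion time of the sub-operation rooted at execution-graph node v."""
    kids = children.get(v, [])
    if not kids:
        return tc + tr
    # The k-th child is contacted after k request times; the slowest branch dominates.
    slowest = max(
        k * tq + 2 * hops * tl + c_time(children, child, tc, tr, tq, tl)
        for k, (child, hops) in enumerate(kids, start=1)
    )
    return tc + tr + slowest


# Two execution graphs over the path 0-1-2-3, rooted at node 0 (times in ms).
star = {0: [(1, 1), (2, 2), (3, 3)]}
chain = {0: [(1, 1)], 1: [(2, 1)], 2: [(3, 1)]}
print(c_time(star, 0, tc=1, tr=1, tq=1, tl=4))
print(c_time(chain, 0, tc=1, tr=1, tq=1, tl=4))
```

In the star graph the term k t_q grows with the number of children of the root, while in a chain the recursion depth grows with the distance from the root; this is where the O(|V|) and O(d) bounds come from.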
Performing Echo-based Operations
on the Entire Internet
• Purpose: to illustrate the scalability of echo-based
operations.
• What we needed:
– Complexity analysis of pattern
– Estimation of Internet topological properties
• diameter
• connectivity distribution
• number of nodes
Estimated Performance
of Echo-based Operation on the Internet
                         Echo Pattern          Star Pattern
Aggregated Traffic       2.25 x 10^11 bytes    1.31 x 10^12 bytes
Max Traffic on a Link    4,096 bytes           1.8 x 10^9 bytes
Completion Time          17.48 seconds         5.09 days

Assumptions:
Process-level transmission time: 5 ms
Network delay per hop: 4 ms
Message size: 1 KB
Local operation: 500 ms per execution
Diameter of Internet: 34 hops
[Figure: the Weaver active node. Component labels: Management Station; Active Node Manager; Source Repository; C++ Compiler; Binaries Repository; Active Node Engine; Execution Environment; Device Manager; Preprocessor; Transport Access Point; Router. Flow labels: source code; active node management; management commands; management operation; results; events; source, state; local program states; node state; SNMP gets/traps; SNMP sets.]
Suboperations in Weaver
[Figure: timeline of suboperations between Node A and Node B.]
start
Node A: Execution (T1), Serialization (T2), Dispatch (T3)
Communication delay TC1
Node B: Receiving (T4), Loading (T5) or Instantiation (T6), De-serialization (T7), Execution (T1), Serialization (T2), Dispatch (T3)
Communication delay TC2
Node A: Receiving (T4), Resolving (T8), De-serialization (T7), Execution (T1)
end
Measuring Execution Times on Weaver
Suboperation                 Duration in ms       Performed by Module
Execution (T1)               1.57 (σ = 0.48)      Execution Environment
Serialization (T2)           3.46 (σ = 0.71)      Execution Environment
Dispatch (T3)                1.67 (σ = 0.49)      Transport Access Point
Receiving (T4)               0.62 (σ = 0.30)      Transport Access Point
Loading (T5)                 23.42 (σ = 0.70)     Execution Environment
Instantiation (T6)           0.77 (σ = 0.015)     Execution Environment
De-serialization (T7)        2.04 (σ = 0.49)      Execution Environment
Resolving (T8)               0.15 (σ = 0.001)     Execution Environment
Communications Delay (TC)    4.04 (σ = 0.10)      ---
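The measured means can be combined into a rough estimate for one round trip between two nodes. This is a back-of-the-envelope sketch, assuming the suboperations occur strictly in sequence in the order T1, T2, T3, TC, T4, T5 or T6, T7, T1, T2, T3, TC, T4, T8, T7, T1, as read from the suboperations timeline; that ordering and the flag name `needs_loading` are assumptions, and variances are ignored.

```python
# Mean durations in ms, copied from the measurement table.
T = {"T1": 1.57, "T2": 3.46, "T3": 1.67, "T4": 0.62, "T5": 23.42,
     "T6": 0.77, "T7": 2.04, "T8": 0.15, "TC": 4.04}


def round_trip_ms(needs_loading):
    """Estimated time for a program to go from node A to node B and back."""
    load_or_instantiate = T["T5"] if needs_loading else T["T6"]
    node_a_out = T["T1"] + T["T2"] + T["T3"]
    node_b = (T["T4"] + load_or_instantiate + T["T7"]
              + T["T1"] + T["T2"] + T["T3"])
    node_a_back = T["T4"] + T["T8"] + T["T7"] + T["T1"]
    return node_a_out + T["TC"] + node_b + T["TC"] + node_a_back


print(round(round_trip_ms(True), 2), round(round_trip_ms(False), 2))
```

Under these assumptions, loading (T5) accounts for a large share of the round trip, so the estimate drops considerably when the program only needs to be instantiated (T6).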
Estimating Execution Times of
Echo-based Operations on Weaver
Designing Robust Patterns
Plain Echo

Echo(inmsg: bytes, from: integer) {
  Gi := Gi - from;
  if visitedi = false {
    parenti := from;
    visitedi := true;
    OnInitiate(inmsg, outmsg);
    if Gi != empty
      dispatch(Gi, outmsg, i);        // explorers go to the remaining neighbors
  } else
    OnAggregate(inmsg);
  if Gi = empty {
    OnComplete(outmsg);
    if parenti >= 0
      dispatch(parenti, outmsg, i);
    else
      OnTerminate(inmsg);
  }
}

Skip Echo

SkipEcho(inmsg: bytes, from: integer) {
  if visitedi = false {
    parenti := from;
    visitedi := true;
    OnInitiate(inmsg, outmsg, i);
    Gi := up_neighbors() - from;
    if Gi != empty
      dispatch(Gi, outmsg, i);
  } else {
    Gi := Gi - from;
    OnAggregate(inmsg);
  }
  if completei != true and Gi = empty {
    OnComplete(outmsg);
    completei := true;
    if parenti >= 0
      dispatch(parenti, outmsg, i);
    else
      OnTerminate(inmsg);
  }
}

alarm(type: {failure, recovery}, affected: integer) {
  if visitedi = true {
    if type = failure {
      Gi := Gi - affected;
      if completei != true and Gi = empty {
        completei := true;
        OnComplete(outmsg);
        if parenti >= 0
          dispatch(parenti, outmsg, i);
        else
          OnTerminate(inmsg);
      }
    }
  }
}

Wait Echo

WaitEcho(inmsg: bytes, from: integer) {
  if visitedi = false {
    parenti := from;
    visitedi := true;
    OnInitiate(inmsg, outmsg, i);
    Gi := up_neighbors() - from;
    if Gi != empty
      dispatch(Gi, outmsg, i);
  } else {
    Gi := Gi - from;
    OnAggregate(inmsg);
  }
  if completei != true and Gi = empty {
    OnComplete(outmsg);
    completei := true;
    if parenti >= 0
      dispatch(parenti, outmsg, i);
    else
      OnTerminate(inmsg);
  }
}

alarm(type: {failure, recovery}, affected: integer) {
  if visitedi = true {
    if type = failure {
      Gi := Gi - affected;
      Bi := Bi + affected;
      if completei != true and Gi = empty {
        completei := true;
        OnComplete(outmsg);
        if parenti >= 0
          dispatch(parenti, outmsg, i);
        else
          OnTerminate(inmsg);
      }
    } else {
      if affected is in Bi {
        Bi := Bi - affected;
        Gi := Gi + affected;
      }
    }
  }
}
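The difference between the two alarm handlers can be sketched in Python. This is a minimal illustration of the per-node bookkeeping only, not of message dispatch; the class and attribute names are hypothetical. Unlike the pseudocode, the sketch only records a failed neighbor if that neighbor was still pending, so that a later recovery never re-adds a neighbor that has already answered; that guard is an assumption of this sketch.

```python
class RobustEchoNode:
    """Per-node failure bookkeeping for Skip Echo and Wait Echo (sketch)."""

    def __init__(self, neighbors, wait=False):
        self.pending = set(neighbors)  # G_i: neighbors still expected to answer
        self.failed = set()            # B_i: failed neighbors (Wait Echo only)
        self.wait = wait               # False: Skip Echo, True: Wait Echo
        self.visited = False
        self.complete = False

    def on_alarm(self, kind, affected):
        """Handle an alarm; return True if it completes this node."""
        if not self.visited:
            return False
        if kind == "failure":
            if affected in self.pending:      # guard: assumption of this sketch
                self.pending.discard(affected)
                if self.wait:
                    self.failed.add(affected)
            if not self.complete and not self.pending:
                self.complete = True          # the caller would now send the echo
                return True
        elif self.wait and affected in self.failed:
            self.failed.discard(affected)     # recovery: wait for this neighbor again
            self.pending.add(affected)
        return False


node = RobustEchoNode(["a", "b"], wait=True)
node.visited = True
node.on_alarm("failure", "a")
node.on_alarm("recovery", "a")
print(sorted(node.pending), sorted(node.failed))
```

With `wait=False` a failed neighbor is dropped for the rest of the operation, which favors completion time; with `wait=True` a recovered neighbor is waited for again, which favors coverage.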
Network Coverage vs. Execution Time
for Skip Echo
[Plot: coverage vs. time for Skip Echo. x-axis: Time (mins), 52.5 to 55.5; y-axis: coverage, 100 to 220. Curves for MTTF = 3.683, 7.367, 11.05, 14.733, 29.467 and 73.67 hrs. Annotations: MTTR → inf, MTTR → 0, MTTR = 11 min, MTTR = 1 min.]
Current and Planned Work
• Self-organizing, adaptable Networks and Systems:
Patterns for routing and dynamic construction of network
control structures. (Constantin Adam)
• WQL: A table-based Network Query Language on Weaver.
(Koon-Seng Lim)
• Policy-based Management: Patterns for distribution and
dynamic re-computation of policies.
(Alberto Gonzalez)
Literature on this Work
• K.S. Lim, R. Stadler: “Weaver—Realizing a scalable management
paradigm on commodity routers,” Eighth IFIP/IEEE International
Symposium on Integrated Network Management (IM 2003), Colorado
Springs, Colorado, USA, March 24-28, 2003.
• K.S. Lim and R. Stadler: "Developing pattern-based management
programs," IFIP/IEEE International Conference on Management of
Multimedia Networks and Services (MMNS 2001), Chicago, IL,
October 29 - November 1, 2001.
• K.S. Lim and R. Stadler: "A navigation pattern for scalable Internet
management," IFIP/IEEE International Symposium on Integrated
Network Management (IM 2001), Seattle, Washington, May 14-18,
2001.
• R. Kawamura and R. Stadler: "A middleware architecture for active
distributed management of IP networks," IEEE/IFIP Network
Operations and Management Symposium (NOMS 2000), Honolulu,
Hawaii, April 10-14, 2000.