Scalable Management for Networks and Services
Rolf Stadler
Laboratory for Communication Networks, KTH Royal Institute of Technology, Stockholm
HP Laboratories, Palo Alto, March 31, 2003

The Shift of a Management Paradigm
[Figure: two approaches side by side. In the first, a management station runs the management program P and talks to agents (A) on the nodes. In the second, the management station downloads P to the network nodes, which execute it and return results.]
Manager-Agent based management:
• Centralized control
• Management protocols: SNMP, CMIP
• Program runs on the management station
Download & execute:
• Decentralized control
• Program runs on network nodes

Architecture for Pattern-based Management
[Figure labels: management station, management program, navigation, execution environment, router, code server]

Weaver: A Testbed for Pattern-based Management
[Figure labels: management station; routers A to D with WANs A to D; FastEthernet switch]

Simple Navigation Patterns

Echo Pattern
[Animation: the expansion phase spreads outward from the root through distances d_root = 1, 2, 3, 4, 5; the contraction phase then returns through d_root = 4, 3, 2, 1 back to the root.]

The Echo Pattern
• Two phases of traversal
  – expansion phase: explorers flood the network with requests for local operations
  – contraction phase: echoes return and aggregate results
• Properties
  – Generates balanced traffic load
  – Traffic load depends on network topology, not on speed of traversal
  – Time complexity increases linearly with network diameter
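The two phases and the stated properties can be checked on a small example. The following is a minimal sketch, not the Weaver or SIMPSON implementation: it assumes a unit delay per link, zero processing time, and a node-count aggregator, and the helper names `grid` and `run_echo` are made up for this illustration.

```python
from collections import deque


def grid(width, height):
    """Adjacency lists of a width x height grid; nodes are (x, y) tuples."""
    adj = {(x, y): [] for x in range(width) for y in range(height)}
    for (x, y) in adj:
        for nx, ny in ((x + 1, y), (x, y + 1)):
            if (nx, ny) in adj:
                adj[(x, y)].append((nx, ny))
                adj[(nx, ny)].append((x, y))
    return adj


def run_echo(adj, root):
    """Simulate one echo run; return (node_count, messages_sent, completion_time)."""
    pending = {v: set(ns) for v, ns in adj.items()}  # neighbours not yet heard from
    parent = {root: None}
    count = {root: 1}                 # nodes counted so far in each node's subtree
    messages = 0
    finish = 0 if not adj[root] else None
    queue = deque()                   # entries: (arrival_time, sender, receiver, payload)
    for n in adj[root]:               # the root starts the expansion phase
        queue.append((1, root, n, 0))
        messages += 1
    while queue:
        t, src, dst, payload = queue.popleft()
        pending[dst].discard(src)
        if dst not in parent:         # first visit: forward explorers
            parent[dst] = src
            count[dst] = 1
            for n in pending[dst]:
                queue.append((t + 1, dst, n, 0))
                messages += 1
        else:                         # later message: aggregate its payload
            count[dst] += payload
        if not pending[dst]:          # heard from every neighbour: contraction
            if parent[dst] is None:
                finish = t            # the root has the final result
            else:
                queue.append((t + 1, dst, parent[dst], count[dst]))
                messages += 1
    return count[root], messages, finish


if __name__ == "__main__":
    g = grid(4, 3)
    edges = sum(len(ns) for ns in g.values()) // 2
    print("edges:", edges, "result:", run_echo(g, (0, 0)))
```

In this model every link carries exactly one message in each direction, so the message count is fixed by the topology, and the completion time is set by the distance from the root to the farthest node.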
Examples of Echo-based Management
• Get information on topology
  – compute the current number of leaf nodes, the connectivity distribution
  – discover current topology within 10 hops of node x
• Get information on network state
  – identify the 10 most congested links
  – compute distribution of link utilization, queue lengths
  – identify sub-topologies with highly loaded links
  – find a resource R closest to node x

Pattern-based Management: An Engineering Approach to Decentralized Management
• A management program consists of
  – a navigation pattern (distributed graph traversal algorithm)
  – an operation on nodes
  – an aggregation function
• Relevance of this approach
  – Provides a basis to analyze management operations for performance, scalability, robustness
  – Supports the concept of re-usable patterns, hides complexity

Composing Management Programs
[Figure: a management program is composed from a navigation pattern, an aggregator, and local operations that reach the node through a node access method. Labels shown: navigation patterns (Echo patterns, Skip, Scope, Multi, Wait, Chang, Segall); aggregators (Leaf Count, Res. Disc., Load. Hist., Conn. Hist.); node access (SNMP, XML, HTTP, CLI).]

Properties of Patterns
[Figure: the composition diagram from the previous slide, repeated]
• A pattern can be used for many management operations.
• A pattern can be chosen according to performance objectives.
• A pattern hides the complexity of a distributed operation.
• Network failures can be handled within patterns.
• Code mobility can be controlled.
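The split into navigation pattern, local operation and aggregation function can be illustrated with a short sketch. It is a sequential stand-in rather than a distributed implementation: `echo_program` (a name invented here) folds results up a breadth-first spanning tree, which mimics the contraction phase of an echo, and the example graph and the `load` table are made-up data.

```python
from collections import Counter
import heapq
import operator


def echo_program(adj, root, local_op, merge):
    """Run local_op on every node and merge the results up a spanning tree."""
    parent = {root: None}
    order = [root]
    for v in order:                      # breadth-first; the list grows while iterating
        for n in adj[v]:
            if n not in parent:
                parent[n] = v
                order.append(n)
    result = {v: local_op(v) for v in order}
    for v in reversed(order[1:]):        # children are merged before their parents
        result[parent[v]] = merge(result[parent[v]], result[v])
    return result[root]


# Hypothetical four-node network and per-link utilization.
adj = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b"], "d": ["b"]}
load = {("a", "b"): 0.9, ("a", "c"): 0.2, ("b", "c"): 0.7, ("b", "d"): 0.4}


def leaf_count():
    """Number of nodes with exactly one neighbour."""
    return echo_program(adj, "a", lambda v: int(len(adj[v]) == 1), operator.add)


def connectivity_histogram():
    """Maps node degree to the number of nodes with that degree."""
    return echo_program(adj, "a", lambda v: Counter({len(adj[v]): 1}), operator.add)


def most_congested(k):
    """The k most loaded links; each link is reported by its lower endpoint."""
    def local_op(v):
        return [(load[(v, n)], (v, n)) for n in adj[v] if v < n]
    return echo_program(adj, "a", local_op, lambda x, y: heapq.nlargest(k, x + y))


if __name__ == "__main__":
    print(leaf_count(), dict(connectivity_histogram()), most_congested(2))
```

All three programs reuse the same traversal and differ only in the local operation and the merge function, which is the reuse property the slides attribute to patterns.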
The Interface between Pattern and Aggregator
Aggregator hooks called by the pattern: OnBegin, OnInitiate, OnAggregate, OnComplete, OnTerminate

    visited_i : boolean           init false;
    G_i       : set of integers   init neighbors();
    parent_i  : integer           init -1;

    Echo(inmsg: bytes, from: integer) {
        G_i := G_i - from;
        if visited_i = false {
            parent_i := from;
            visited_i := true;
            OnInitiate(inmsg, outmsg);
            if G_i != empty
                dispatch(G_i, outmsg, i);
        } else
            OnAggregate(inmsg);
        if G_i = empty {
            OnComplete(outmsg);
            if parent_i >= 0
                dispatch(parent_i, outmsg, i);
            else
                OnTerminate(inmsg);
        }
    }

Fragments of an aggregator that computes the average load:

    … av_load := load(); n := 1; …
    … av_load_i := av_load; …
    … av_load := (av_load*n + av_load_j)/(n+1); n := n+1; …

SIMPSON: A SIMple Pattern Simulator fOr Large Networks
[Figure: traffic (bytes, 0 to 1.2e+06) vs. time (0 to 6 secs) for a 221-node grid network]

Analyzing Management Operations
[Figure: a network graph G = (V, E) with nodes A to F and two execution graphs G' = (V', E') derived from it: the star pattern (centralized management) and the echo pattern (distributed management)]

Traffic Complexity of Management Operations
Amount of traffic placed on the network during execution.

\[ C_{traffic} = \sum_{v' \in V'} \; \sum_{0 < k \le \mathrm{childcount}(v')} \mathrm{hopcount}(v', \mathrm{child}_k(v')) \, (I_q + I_r) \]

\[ C_{traffic}^{echo} = (I_q + I_r) \left( |E| + \frac{(\mathrm{degree}(G) - 2)\,|V|}{2} + 1 \right) \]

\[ C_{traffic}^{star} = (I_q + I_r) \sum_{v' \in V'} \mathrm{hopcount}(v'_{root}, v') \]

Time Complexity of Management Operations
Time needed from invocation until completion of an operation.

\[ C_{time} = C_{time}(v'_{root}) \]

\[ C_{time}(v') = \begin{cases} t_c + t_r & \text{if } \mathrm{childcount}(v') = 0 \\ t_c + t_r + M(v') & \text{otherwise} \end{cases} \]

\[ M(v') = \max_{1 \le k \le \mathrm{childcount}(v')} \left[ k\,t_q + 2\,\mathrm{hopcount}(v', \mathrm{child}_k(v'))\,t_l + C_{time}(\mathrm{child}_k(v')) \right] \]

\[ C_{time}^{echo} = O(d), \qquad C_{time}^{star} = O(|V|) \]

Performing Echo-based Operations on the Entire Internet
• Purpose: to illustrate the scalability of echo-based operations.
• What we needed:
  – Complexity analysis of the pattern
  – Estimation of Internet topological properties
      • diameter
      • connectivity distribution
      • number of nodes

Estimated Performance of Echo-based Operation on the Internet

                           Echo Pattern          Star Pattern
    Aggregated Traffic     2.25 x 10^11 bytes    1.31 x 10^12 bytes
    Max Traffic on a Link  4,096 bytes           1.8 x 10^9 bytes
    Completion Time        17.48 seconds         5.09 days

Assumptions:
• Process-level transmission time: 5 ms
• Network delay per hop: 4 ms
• Message size: 1 KB
• Local operation: 500 ms per execution
• Diameter of Internet: 34 hops

Weaver Active Node
[Figure: components shown are the management station, active node manager, source repository, preprocessor, C++ compiler, binaries repository, execution environment, device manager, transport access point, active node engine and router. Labelled flows: source code, active node management, management commands, management operation results, events, source and state, local program states, node state, SNMP gets/traps, SNMP sets.]

Suboperations in Weaver
[Figure: timeline between Node A and Node B. Node A: Execution (T1), Serialization (T2), Dispatch (T3). Communication delay TC1. Node B: Receiving (T4), Loading (T5) or Instantiation (T6), De-serialization (T7), Execution (T1), Serialization (T2), Dispatch (T3). Communication delay TC2. Node A: Receiving (T4), Resolving (T8), De-serialization (T7), Execution (T1).]

Measuring Execution Times on Weaver

    Suboperation               Duration in ms       Performed by Module
    Execution (T1)             1.57  (σ = 0.48)     Execution Environment
    Serialization (T2)         3.46  (σ = 0.71)     Execution Environment
    Dispatch (T3)              1.67  (σ = 0.49)     Transport Access Point
    Receiving (T4)             0.62  (σ = 0.30)     Transport Access Point
    Loading (T5)               23.42 (σ = 0.70)     Execution Environment
    Instantiation (T6)         0.77  (σ = 0.015)    Execution Environment
    De-serialization (T7)      2.04  (σ = 0.49)     Execution Environment
    Resolving (T8)             0.15  (σ = 0.001)    Execution Environment
    Communications Delay (TC)  4.04  (σ = 0.10)     ---

Estimating Execution Times of Echo-based Operations on Weaver

Designing Robust Patterns

Plain Echo

    Echo(inmsg: bytes, from: integer) {
        G_i := G_i - from;
        if visited_i = false {
            parent_i := from;
            visited_i := true;
            OnInitiate(inmsg, outmsg);
            if G_i != empty
                dispatch(G_i, outmsg, i);
        } else
            OnAggregate(inmsg);
        if G_i = empty {
            OnComplete(outmsg);
            if parent_i >= 0
                dispatch(parent_i, outmsg, i);
            else
                OnTerminate(inmsg);
        }
    }

Skip Echo

    SkipEcho(inmsg: bytes, from: integer) {
        if visited_i = false {
            parent_i := from;
            visited_i := true;
            OnInitiate(inmsg, outmsg, i);
            G_i := up_neighbors() - from;
            if G_i != empty
                dispatch(G_i, outmsg, i);
        } else {
            G_i := G_i - from;
            OnAggregate(inmsg);
        }
        if complete_i != true and G_i = empty {
            OnComplete(outmsg);
            complete_i := true;
            if parent_i >= 0
                dispatch(parent_i, outmsg, i);
            else
                OnTerminate(inmsg);
        }
    }

    alarm(type: {failure, recovery}, affected: integer) {
        if visited_i = true {
            if type = failure {
                G_i := G_i - affected;
                if complete_i != true and G_i = empty {
                    complete_i := true;
                    OnComplete(outmsg);
                    if parent_i >= 0
                        dispatch(parent_i, outmsg, i);
                    else
                        OnTerminate(inmsg);
                }
            }
        }
    }

Wait Echo

    WaitEcho(inmsg: bytes, from: integer) {
        if visited_i = false {
            parent_i := from;
            visited_i := true;
            OnInitiate(inmsg, outmsg, i);
            G_i := up_neighbors() - from;
            if G_i != empty
                dispatch(G_i, outmsg, i);
        } else {
            G_i := G_i - from;
            OnAggregate(inmsg);
        }
        if complete_i != true and G_i = empty {
            OnComplete(outmsg);
            complete_i := true;
            if parent_i >= 0
                dispatch(parent_i, outmsg, i);
            else
                OnTerminate(inmsg);
        }
    }

    alarm(type: {failure, recovery}, affected: integer) {
        if visited_i = true {
            if type = failure {
                G_i := G_i - affected;
                B_i := B_i + affected;
                if complete_i != true and G_i = empty {
                    complete_i := true;
                    OnComplete(outmsg);
                    if parent_i >= 0
                        dispatch(parent_i, outmsg, i);
                    else
                        OnTerminate(inmsg);
                }
            } else {
                if affected is in B_i {
                    B_i := B_i - affected;
                    G_i := G_i + affected;
                }
            }
        }
    }

Network Coverage vs. Execution Time for Skip Echo
[Figure: coverage (100 to 220) vs. time (52.5 to 55.5 mins) for Skip Echo, with one curve each for MTTF = 3.683, 7.367, 11.05, 14.733, 29.467 and 73.67 hrs. Annotations: MTTR inf, MTTR 0, MTTR = 11 min, MTTR = 1 min.]

Current and Planned Work
• Self-organizing, adaptable networks and systems: patterns for routing and dynamic construction of network control structures. (Constantin Adam)
• WQL: a table-based network query language on Weaver. (Koon-Seng Lim)
• Policy-based management: patterns for distribution and dynamic re-computation of policies. (Alberto Gonzalez)

Literature on this Work
• K.S. Lim and R. Stadler: "Weaver: Realizing a scalable management paradigm on commodity routers," Eighth IFIP/IEEE International Symposium on Integrated Network Management (IM 2003), Colorado Springs, Colorado, USA, March 24-28, 2003.
• K.S. Lim and R. Stadler: "Developing pattern-based management programs," IFIP/IEEE International Conference on Management of Multimedia Networks and Services (MMNS 2001), Chicago, IL, USA, October 29 - November 1, 2001.
• K.S. Lim and R. Stadler: "A navigation pattern for scalable Internet management," IFIP/IEEE International Symposium on Integrated Network Management (IM 2001), Seattle, Washington, USA, May 14-18, 2001.
• R. Kawamura and R. Stadler: "A middleware architecture for active distributed management of IP networks," IEEE/IFIP Network Operations and Management Symposium (NOMS 2000), Honolulu, Hawaii, USA, April 10-14, 2000.