Download slides - Microsoft Research

New Directions in Enterprise Network Management
Aditya Akella
University of Wisconsin, Madison
MSR Networking Summit, June 2006

Enterprise Network Management
• Very broad topic…
  – Tuning performance and availability of network-attached services
  – Traffic sniffing for trouble-shooting
  – Monitoring utilization
  – Mapping network topology and resources, etc.
• Several tools (both commercial and free)
  – Tailored to enterprises of different sizes, requirements

Outline
• Enterprises desire specific management functionalities that current tools fundamentally cannot provide
  – Three examples
• Inability arises from how enterprises are designed and operated today (IP-based)
  – Decentralization and no control over routing
• Thoughts on enterprise network design principles
  – … Simplified management is a side-effect

So What’s Missing?
• Cumbersome or impossible to support
  – What-If analysis
  – Effective trouble-shooting
  – Fine-grained resource management
• Some tools may provide one of these
  – No tool provides all of them

1. What-If Analysis
• What will happen if I change X in my network?
  – Policy/control plane level
  – Reason about connectivity before installing changes
[Figure: candidate changes (new link/network upgrade, new policies for sales, alternate configuration) and the questions they raise: New config stable? Will bottleneck disappear? Will upgrade violate policy?]
→ Decentralized config specification
  – Complex config/policy split across several devices/mechanisms
    • Firewalls, Proxies, NATs, router ACLs, VLANs, port filtering
  – … And across different network layers
  – Hard to reason about cross-layer, cross-device interaction

2. Trouble-Shooting
• What is the current “status” of my network?
  – Who is talking to who and how? Resource consumption?
  – Avoid overload; control plane trouble shooting
• Information at arbitrary granularities
  – Users, machines, groups…
  – Ability to go back in time
  – Unexpected patterns of communication; Protocol usage
Example queries (figure callouts):
  – How many conns from sales?
  – Who is using access link?
  – How many connections from guests?
  – Finance grp protocol usage last week?

2. Trouble-Shooting
• Today…
  – SNMP for tracking resource consumption → Coarse-grained
  – Monitoring key resources → Application specific; not network-wide
  – Inference → Rely on heuristics, error prone
  – Not fine-grained enough
→ Distributed decision on whether to allow flows
  – Distributed and/or local to services and devices
  – By default all-to-all is allowed
    • Something is undesirable → local restrictions
    • Use appropriate mechanism (ACLs, port filters, firewalls etc.)
  – Poll to figure out what’s going on, or infer
  – Hard to archive control-plane events

3. Resource Management
• Route around overloaded/failed switches and links
  – Connection latency
  – Availability
• Control levels of resource consumptions
  – Prioritize applications or users
  – Restrict bandwidth consumption of “sales”
• Middle-boxes and proxies
  – Placed at network choke points
  – Ideally, deploy at diverse locations
  – Route different classes of flows via different middleboxes
[Figure: per-group treatment. Guests → restrict b/w; Sales → virus-1 + image-filter + compression; Products → virus-2 + compression]

3. Resource Management
• Limited or no support in enterprises today
  – SNMP-based/manual tuning, OSPF, load-balancing using DNS
→ Lack of tight control over routing
  – Forwarding tables, hop-by-hop dst IP based routing inflexible
    • Very little info used for routing
    • Additional info into forwarding tables → complexity; slow look-up
    • Aggregation → No control over flows or groups of flows
  – Need tighter, app flow-level control
    • Forwarding tables fundamentally insufficient

Desiderata
Should A → D be allowed?
[Figure: hosts A, B, C, D with central policy entries: A → B using HTTP; C → D using AIM via proxy; A → D using AIM via filter; …]
• Centralization:
  – Of config specification (who can access what and how)
  – Of enterprise-wide decision-making (should flow X be allowed)
  – What-if analysis or connectivity becomes trivial
    • (Offline) Analysis of a central database of policies
  – Troubleshooting and forensics is simple
    • Current set or complete log of accepted conn requests or active flows

Desiderata
[Figure: hosts A, B, C, D. Route A → D (AIM) through s1, p1, p2, s2; route A → D (HTTP) through s1, p1, s3, s2]
• Tight control over routing:
  – Centrally pre-ordain the path of each flow
  – No more designing around choke-points
    • Easy to integrate arbitrary number/type of middle-boxes
  – Fine-grained resource control
  – Also aids trouble-shooting and what-if analysis

An Architectural View
• Take all configuration and decision-making out of switches, routers
  – Put all eggs in one basket
• Central entity tells switches how to forward packets
  – Wire a circuit for each new flow…
  – … Or hand out a source route
→ Switches have no forwarding table
  – Dumb forwarding elements
  – Under the direct control of the central controller (via control channels)

Effect on Management
• Control-plane related management or monitoring easy to do
  – How many connections per users?
  – Upgrades violate policy?
  – Who accessed service X?
  – Route different flows differently
  – React to failures/overload
• “Data-plane management” harder to do
  – Band-width related
  – E.g. Restrictions on users; Monitor Utilization

Data Plane Management
• Switches need to be slightly less dumb
  – Minimal management support to enable data plane management?
    • Counters per-flow?
    • Per-flow queuing?
    • Up-to-date link utilization?
    • Push vs pull based?
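
Illustrative sketch (not from the talk): the centralization and routing-control desiderata can be made concrete with a toy central controller in Python. Everything here is an assumption made for illustration: the names Flow, Rule, Controller, admit, what_if and who_accessed are hypothetical, flows are matched on exact (src, dst, app) triples, and a flow with no matching rule is denied by default. The two example paths are the ones shown on the Desiderata slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    src: str
    dst: str
    app: str


@dataclass(frozen=True)
class Rule:
    src: str
    dst: str
    app: str
    path: tuple  # ordered switches/middle-boxes the flow must traverse


class Controller:
    """Hypothetical central entity holding all policy and the flow log."""

    def __init__(self, rules):
        self.rules = list(rules)
        self.log = []  # accepted connection requests, kept for forensics

    def lookup(self, flow, rules=None):
        # Exact-match lookup; a real policy language would be richer.
        for rule in (self.rules if rules is None else rules):
            if (rule.src, rule.dst, rule.app) == (flow.src, flow.dst, flow.app):
                return rule
        return None

    def admit(self, flow):
        # Decide centrally whether the flow is allowed and, if so, which path it takes.
        rule = self.lookup(flow)
        if rule is None:
            return None  # assumption: default deny
        self.log.append((flow, rule.path))
        return rule.path

    def what_if(self, new_rules, flows):
        # Offline analysis: which flows get a different outcome under new_rules?
        changed = []
        for flow in flows:
            old = self.lookup(flow)
            new = self.lookup(flow, new_rules)
            old_path = old.path if old else None
            new_path = new.path if new else None
            if old_path != new_path:
                changed.append((flow, old_path, new_path))
        return changed

    def who_accessed(self, dst):
        # Troubleshooting query answered from the central log.
        return sorted({flow.src for flow, _ in self.log if flow.dst == dst})


rules = [
    Rule("A", "D", "AIM", ("s1", "p1", "p2", "s2")),
    Rule("A", "D", "HTTP", ("s1", "p1", "s3", "s2")),
]
ctl = Controller(rules)
aim_path = ctl.admit(Flow("A", "D", "AIM"))   # allowed, path handed out
denied = ctl.admit(Flow("C", "D", "AIM"))     # no rule, so denied

# What-if: drop the HTTP rule and see which flows would be affected.
proposed = rules[:1]
impact = ctl.what_if(proposed, [Flow("A", "D", "AIM"), Flow("A", "D", "HTTP")])
print(aim_path, denied, impact, ctl.who_accessed("D"))
```

Because policy and the log of accepted requests live in one place, both the what-if check and the "who accessed service X?" query reduce to a scan over central state. The sketch covers only the control-plane side; bandwidth restrictions and utilization monitoring would still need the per-flow counters or queuing that the Data Plane Management slide asks about.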
