2005-maltz-job-talk
Rethinking Network Control & Management: The Case for a New 4D Architecture
David A. Maltz, Carnegie Mellon University
Joint work with Albert Greenberg, Gisli Hjalmtysson, Andy Myers, Jennifer Rexford, Geoffrey Xie, Hong Yan, Jibin Zhan, Hui Zhang

Is the Network Down Again?
You sit at your home computer, trying to access a computer at work… but no data is getting through. Minutes or hours later, data flows again… and you never find out why. Network operators aren't much better at predicting outages.

Outline
• What do networks look like today?
• A new approach to predicting network behavior
• A new architecture for controlling networks

Many Kinds of Networks
Each has a different:
• Size – generally 10-1,000 routers each
• Owner – company, university, organization
• Topology – mesh, tree, ring
Examples:
• Enterprise/campus networks
• Access networks: DSL, cable modems
• Metro networks: connect up businesses in cities
• Data center networks: disk arrays & servers
• Transit/backbone networks

A Conventional View of a Network
[Diagram: graph of nodes A–J connected by links]
The physical topology is a graph of nodes and links. Run Dijkstra to find the route to each node.

Network Equipment
[Picture from the Internet2 Abilene Network]
Boxes: routers, switches. Links: Ethernet, SONET, T1, …

The Data Plane of a Network
[Diagram: hosts/servers attached to router/switch interfaces]
For this talk, network traffic is in packets: a sequence of bytes processed as a unit. Each packet carries meta-data (source address, destination address, port numbers, …) plus user data.

The Data Plane of a Network: Forwarding Information Base (FIB)
  Destination   NextHop
  A             left
  B             right
  C             left
• Basically a look-up table; each entry is a route
• Tests fields of the packet and determines which interface to send the packet out

The Data Plane of a Network: Packet Filters
  Permit A->B
  Drop C->B
• Specific to a single interface
• Tests fields of the packet and determines whether to permit or drop it
• Finer granularity than the FIB – can test more fields, even target specific applications

The Data Plane of a Network: Other Mechanisms
• Queueing discipline
• Packet transformers (e.g., address translation)

The Control Plane of a Network
Where do FIB entries come from? From a distributed system called the control plane. Control-plane failures are responsible for many of the longest, hardest-to-debug outages!

The Control Plane of a Network: Routing Processes
Routers run routing processes. Adjacent processes exchange routing information:
• The information format is defined by the routing protocol
• There are many routing protocols: BGP, OSPF, RIP, EIGRP
• Adjacent processes must use the same protocol
Routing protocols define the logic for computing routes:
• Combine all available information
• Pick the best route for each destination

The Control Plane Creates Resiliency
[Diagram: when a link toward destination D fails, the routing processes recompute and each FIB switches D's next hop to a surviving interface]

A Study of Operational Production Networks
How complicated or simple are real control planes? What is the structure of the distributed system?
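The FIB lookup and per-interface packet filter described above can be sketched in a few lines. This is a toy model, not a router implementation: the prefixes, interface names, and the deny rule are all illustrative.

```python
# Minimal sketch of one data-plane forwarding step: a longest-prefix
# FIB lookup followed by a per-interface packet filter.
from ipaddress import ip_address, ip_network

FIB = [  # (destination prefix, outgoing interface)
    (ip_network("10.1.0.0/16"), "left"),
    (ip_network("10.1.2.0/24"), "right"),  # more-specific route wins
]

FILTERS = {  # per-interface rules, checked before forwarding
    "right": [("deny", ip_network("10.9.0.0/16"), ip_network("10.1.2.0/24"))],
}

def forward(src, dst):
    src, dst = ip_address(src), ip_address(dst)
    # FIB: longest-prefix match on the destination address
    matches = [(p, i) for p, i in FIB if dst in p]
    if not matches:
        return "drop: no route"
    _, iface = max(matches, key=lambda m: m[0].prefixlen)
    # Packet filter: finer-grained than the FIB, can test source too
    for action, s_net, d_net in FILTERS.get(iface, []):
        if action == "deny" and src in s_net and dst in d_net:
            return "drop: filtered"
    return f"send out {iface}"

print(forward("10.9.1.1", "10.1.2.5"))  # drop: filtered
print(forward("10.8.1.1", "10.1.5.5"))  # send out left
```

Note how the filter can drop a packet the FIB was perfectly willing to forward; that interaction is what the rest of the talk is about.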
We used a reverse-engineering methodology:
• There are few or no design documents
• The ones that exist are out-of-date
Anonymized configuration files for 31 active networks (>8,000 configuration files):
• 6 Tier-1 and Tier-2 Internet backbone networks
• 25 enterprise networks
• Sizes between 10 and 1,200 routers
• 4 enterprise networks significantly larger than the backbone networks

Excerpts from a Router Configuration File
  interface Ethernet0
   ip address 6.2.5.14 255.255.255.128
  interface Serial1/0.5 point-to-point
   ip address 6.2.2.85 255.255.255.252
   ip access-group 143 in
   frame-relay interface-dlci 28
  access-list 143 deny 1.1.0.0/16
  access-list 143 permit any
  route-map 8aTzlvBrbaW deny 10
   match ip address 4
  route-map 8aTzlvBrbaW permit 20
   match ip address 7
  ip route 10.2.2.1/16 10.2.1.7
  router ospf 64
   redistribute connected subnets
   redistribute bgp 64780 metric 1 subnets
   network 66.251.75.128 0.0.0.127 area 0
  router bgp 64780
   redistribute ospf 64 match route-map 8aTzlvBrbaW
   neighbor 66.253.160.68 remote-as 12762
   neighbor 66.253.160.68 distribute-list 4 in

Size of Configuration Files in One Network
[Chart: lines per configuration file, 0–2,000, for 881 routers sorted by file size]

Routing Processes Implement Policy
Extensive use of policy commands to filter routes:
• Prevent some hosts from communicating: security policy
• Limit access to short-cut links: resource policy

Packet Filters Implement Policy
Packet filters are used extensively throughout networks:
• Protect routers from attack
• Implement the reachability matrix – define which hosts can communicate
• Localize traffic, particularly multicast

Multiple Interacting Routing Processes
[Diagram: a client and server reached across several OSPF instances and a BGP process, with policies applied where routes pass between them]

The Routing Instance Graph of an 881-Router Network
[Figure: routing instance graph]

Take Away Points
Networks deal with both creating connectivity and preventing it. Networks are controlled by complex distributed systems:
• Must understand the system to understand behavior
Focusing on individual protocols is not enough:
• The composition of protocols is important and complex
We developed abstractions to model routing design:
• Routing Process Graph – accurately models the design
• Routing Instance – abstracts away details
• Reverse-engineer the routing design from configs

Outline
• What do networks look like today?
• A new approach to predicting network behavior: frame the problem of reachability analysis, then sketch an algebra for predicting reachability
• A new architecture for controlling networks

Reachability
Can A send a packet to B? It depends on routing protocols, advertised routes, policies, packet filters, … Predicting reachability is key to network survivability and security.
We focus on two types of policy:
• Survivability: certain packets should always be permitted, under all possible network states
• Security: certain packets should never be permitted, under all possible network states

Reachability Example
[Diagram: routers R1–R5 linking Chicago (chi) and New York (nyc), each city with a data center and a front office; subnets chi-DC, chi-FO, nyc-DC, nyc-FO]
• Two locations, each with a data center & front office
• All routers exchange routes over all links
Packet filters guard the data centers: one drops nyc-FO -> * (permitting everything else), the other drops chi-FO -> *.
A new short-cut link is added between the data centers, intended for backup traffic between the centers. Oops – the new link lets packets violate the security policy!
• Routing changed, but
• Packet filters don't update automatically
The typical response: add more packet filters to plug the holes in the security policy. But packet filters have surprising consequences. Consider a link failure: chi-FO and nyc-FO are still connected in the topology, yet the network has less survivability than the topology suggests – the packet filter means no data can flow! Probing the network won't predict this problem.

State of the Art in Reachability Analysis
Build the network, then try sending packets (ping, traceroute, monitoring tools). This only checks the paths currently selected by the routing protocols, so it cannot be used for "what if" analysis.
Our goal: static reachability analysis – predict reachability over multiple scenarios through analysis of router configuration files.

Predicting Reachability
How can we formalize the reachability provided by a network?
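Before formalizing it, the short-cut incident above can be replayed in a toy model. The graph and filters below are my own simplification: each directed link carries a predicate on the packet's source, and reachability is a breadth-first search over links whose predicate permits the packet.

```python
# Toy replay of the Chicago/New York example: adding an unfiltered
# short-cut link opens a path that bypasses an existing packet filter.
from collections import deque

def reachable(links, src_group, start, dst):
    """BFS over directed links whose filter permits src_group."""
    seen, q = {start}, deque([start])
    while q:
        node = q.popleft()
        if node == dst:
            return True
        for (a, b), permit in links.items():
            if a == node and b not in seen and permit(src_group):
                seen.add(b)
                q.append(b)
    return False

allow_all = lambda src: True
drop_chi_fo = lambda src: src != "chi-FO"   # filter: Drop chi-FO -> *

links = {
    ("chi-FO", "R1"): allow_all,
    ("R1", "R2"): drop_chi_fo,              # filtered inter-city link
    ("R2", "nyc-DC"): allow_all,
}
print(reachable(links, "chi-FO", "chi-FO", "nyc-DC"))  # False: filtered

# Add the new data-center short-cut path, with no filter on it:
links[("R1", "R3")] = allow_all
links[("R3", "R2")] = allow_all
print(reachable(links, "chi-FO", "chi-FO", "nyc-DC"))  # True: bypassed
```

The routing adapts to the new link automatically; the filter, pinned to one interface, does not.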
R_i,j(s):
• The set of packets the network will carry from router i to router j
• A function of the forwarding state s
• s represents the contents of each FIB
• R_i,j(s) is the instantaneous reachability

Computing Reachability
Let F_a,b(s) be the set of packets permitted along the link from node a to node b in network state s. Reachability combines the set of all paths from i to j with the packets allowed along each path p: a packet gets through if some path exists on which every link permits it, i.e.
  R_i,j(s) = union over paths p from i to j of ( intersection of F_a,b(s) over the links (a,b) on p )

Jointly Modeling the Effects of Packet Filters and Routing
Key problem: F_a,b(s) is affected by both routing and packet filters.
Key insight: treat routes as dynamic packet filters.
[Diagram: each router's route table rendered as a filter – e.g., R1 as "Permit *->B, Drop *->*", R2 as "Permit *->A, Permit *->C, Drop *->*" – alongside the FIB mapping destinations A, B, C to next hops]

Bounding the Instantaneous Reachability
Knowing the exact forwarding state s is impractical, and knowing R_i,j(s) doesn't help much anyway – we want to predict behavior over a range of states. Luckily, predicting behavior over the set of all possible states is easier than predicting reachability for a single state.

Reachability Bounds
• Lower bound on reachability: packets in this set are never prohibited by the network
• Upper bound on reachability: packets not in this set are always prohibited by the network

Example Upper and Lower Bound Analysis
[Diagrams: the Chicago/New York example, comparing the upper bound before and after the short-cut link is added, and the lower bound before and after the extra packet filters are added]

Take Away Points
We have defined an algebra for modeling reachability:
• Packet filters, routing protocols, NAT
• Griffin & Bush validated RFC 2547 VPNs
Status:
• The algebra works on test cases
• Currently experimenting with production networks
The algebra's strength and weakness is static analysis:
• Can validate that the network meets static objectives
• Can have false positives
• Cannot design the network to meet objectives
• Cannot control the network to obey dynamic objectives

Outline
• What do networks look like today?
• A new approach to predicting network behavior
• A new architecture for controlling networks: new principles for network control, a new architecture embodying those principles, and experimental validation

Does Network Control Actually Matter?
Yes!
• Microsoft: all services fell off the network for 23 hours due to misconfiguration of routers in their network (2001)
• Major ISP: 50% of outages occur during planned maintenance (2005)
• IP networks have 2-3x the outages of circuit-switched networks (2005)

Three Principles for Network Control & Management
Network-level objectives (e.g., a reachability matrix, traffic-engineering rules):
• Express goals explicitly
• Security policies, QoS, egress-point selection
• Do not bury goals in box-specific configuration
Network-wide views:
• Design the network to provide timely, accurate information
• Topology, traffic, resource limitations
• Give the management logic the inputs it needs
Direct control:
• Allow the logic to directly set forwarding state
• FIB entries, packet filters, queuing parameters
• The logic computes the desired network state; let it implement that state

Overview of the 4D Architecture
[Diagram: network-level objectives feed the Decision plane, which takes network-wide views from the Discovery plane and exercises direct control over the Data plane via the Dissemination plane]
Decision plane:
• All management logic implemented on centralized servers making all decisions
• Decision Elements use views to compute data-plane state that meets the objectives, then directly write this state to the routers
Dissemination plane:
• Provides a robust communication channel to each router
• May run over the same links as user data, but logically separate and independently controlled
Discovery plane:
• Each router discovers its own resources and its local environment
• E.g., the identity of its immediate neighbors
Data plane:
• Spatially distributed routers/switches
• No need to change today's technology

Control & Management Today
Management plane (shell scripts, traffic-engineering and planning tools, databases, config files, SNMP, netflow):
• Figure out what is happening in the network
• Decide how to change it
Control plane:
• Multiple routing processes (OSPF, BGP) on each router
• Each router with a different configuration program
• A huge number of control knobs: link metrics, ACLs, routing policies
Data plane:
• Distributed routers
• Forwarding, filtering, queueing
• Based on FIBs or labels

Good Abstractions Reduce Complexity
Refactoring the management, control, and data planes into the decision, dissemination, and data planes lifts all decision-making logic out of the control plane:
• Eliminates duplicate logic in the management plane
• The dissemination plane provides robust communication to/from the data-plane routers

Three Key Questions
• Could the 4D architecture ever be deployed?
• Is the 4D architecture feasible?
• Can the 4D architecture actually simplify network control and management?
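The Decision-plane idea above – compute every router's forwarding state centrally from the network-wide view, then push it down – can be sketched in miniature. This assumes an unweighted topology and uses BFS in place of Dijkstra; the router names are illustrative.

```python
# Sketch of a Decision Element: it holds the network-wide view (the
# topology graph) and computes each router's FIB directly, instead of
# each router running a distributed routing protocol.
from collections import deque

topology = {          # network-wide view: adjacency of routers
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}

def compute_fib(view, router):
    """Return {destination: next_hop} for one router, from the view."""
    fib = {}
    seen = {router}
    q = deque((nbr, nbr) for nbr in view[router])  # (node, first hop)
    while q:
        node, first_hop = q.popleft()
        if node in seen:
            continue
        seen.add(node)
        fib[node] = first_hop
        for nbr in view[node]:
            q.append((nbr, first_hop))
    return fib

# The Decision Element writes this state straight to every router:
fibs = {r: compute_fib(topology, r) for r in topology}
print(fibs["A"])   # e.g. next hops from A toward B, C, D
```

Because one piece of logic emits all FIBs at once, cross-router constraints (such as a reachability matrix) can be enforced at computation time rather than hoped for at convergence time.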
Deployment of the 4D Architecture
There is a pre-existing industry trend toward separating router hardware from software:
• IETF: FORCES, GSMP, GMPLS
• SoftRouter [Lakshman, HotNets'04]
An incremental deployment path exists:
• Individual networks can upgrade to 4D and gain the benefits
• Small enterprise networks have the most to gain

The Feasibility of the 4D Architecture
We designed and built a prototype of the 4D.
Decision plane:
• Contains logic to simultaneously compute routes and enforce the reachability matrix
• Multiple Decision Elements per network, using a simple election protocol to pick a master
Dissemination plane:
• Uses source routes to direct control messages
• Extremely simple, but can route around failed data links

Performance of the 4D Prototype
Evaluated using Emulab (www.emulab.net):
• Linux PCs used as routers (650–800 MHz)
• Tested on 9 enterprise network topologies (10-100 routers each)
Results:
• Recovers from a single link failure in < 300 ms (< 1 s response is considered "excellent")
• Survives failure of the master Decision Element: a new DE takes control within 1 s, with no disruption unless a second fault occurs
• Gracefully handles complete network partitions: less than 1.5 s of outage

4D Makes Network Management & Control Error-proof
[Screenshots: the Chicago/New York example again; the decision plane prohibits packets from chi-FO to nyc-DC while still allowing packets from chi-FO to nyc-FO, across topology changes and link failures]

Related Work
• Driving network operation from network-wide views – traffic engineering, traffic matrix computation
• Centralization of decision-making logic – Routing Control Point [Feamster], Path Computation Element [Farrel], Signaling System 7 [Ma Bell]

Take Aways
No need for a complicated distributed system in the control plane – do away with it!
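The "simple election protocol" used by the prototype to pick a master Decision Element is not specified in the slides. A plausible minimal sketch, assuming a lowest-ID-wins rule among DEs with recent heartbeats (the timeout value and rule are my assumptions, not the prototype's):

```python
# Hypothetical master election among Decision Elements: the live DE
# with the lowest ID wins; a DE is live if it heartbeated recently.
import time

HEARTBEAT_TIMEOUT = 1.0   # seconds; slides report failover within ~1 s

def elect_master(heartbeats, now=None):
    """heartbeats: {de_id: last_heartbeat_time}. Returns master id or None."""
    now = time.time() if now is None else now
    live = [de for de, t in heartbeats.items() if now - t < HEARTBEAT_TIMEOUT]
    return min(live) if live else None

beats = {1: 10.0, 2: 10.4, 3: 10.5}
print(elect_master(beats, now=10.6))   # DE 1 is master
beats[1] = 9.0                          # DE 1 stops heartbeating
print(elect_master(beats, now=10.6))   # DE 2 takes over
```

Any deterministic rule over the live set would do; the point is that failover needs no coordination beyond observing heartbeats.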
The 4D architecture is a promising approach. The power of the solution comes from:
• Colocating all decision-making in one plane
• Providing that plane with network-wide views
• Directly expressing the solution by writing forwarding state
Benefits:
• Coordinated state updates -> better reliability
• Separates network issues from distributed-systems issues

Summary
Networks must meet many different types of objectives:
• Security, traffic engineering, robustness
Today, objectives are met using control-plane mechanisms:
• Results in a complicated distributed system
• Ripe with opportunities to set time-bombs
• Predicting static properties is possible, but difficult
Refactoring into a 4D architecture is very promising:
• Separates network issues from reliability issues
• Eliminates duplicate logic and simplifies the network
• Enables new capabilities, like joint control

Questions?

Backup Slides

Computing Reachability Bounds
• The problem reduces to estimating all routes potentially in the routing table (FIB) of each router
• Much easier than predicting exactly which routes will be in the FIB

How to Organize the Decision Plane?
We have exposed the network control logic – now what?
We need a way to structure that logic:
• Mutual optimization of multiple objectives – potentially mutually exclusive
• Each objective has different time constants
• Multiple objectives may affect the same bit of data-plane state

Future Directions
4D in different network contexts:
• Ethernet networks
• Mixed networks: circuit- and packet-switched
Include services in the 4D:
• Domain Name Service
• HTTP proxies and load balancers

Reverse-Engineering Overview
From the configuration files: find links, then construct the Layer 3 topology; find adjacent routing processes, then construct the Routing Process Graph; condense adjacent routing processes, then construct the Routing Instance Graph.

Reconstruct the Layer 3 Topology
Two interfaces on the same subnet imply a link:
  Router 1 config:                       Router 2 config:
  interface Serial1/0.5                  interface Serial2/1.5
   ip address 1.1.1.1 255.255.255.252     ip address 1.1.1.2 255.255.255.252

Abstract to a Routing Instance Graph
[Diagram: routing processes (OSPF, BGP) and route tables condensed into instances, e.g., AS1 and AS2 joined to OSPF #1 and OSPF #2 through BGP, with policies on the edges]
• Pick an unassigned routing process
• Flood-fill along process adjacencies, labeling processes
• Repeat until all processes are assigned to an Instance

Textbook Routing Design for Enterprise Networks
• Border routers speak eBGP to external peers
• BGP selects a few key external routes to redistribute into OSPF
• 7 of 25 enterprise networks follow this pattern

Reality: A Diversity of Unusual Routing Designs
• Network broken up into compartments, each with only 1 to 4 routers
• Each compartment has its own AS number
• Hub-and-spoke logical topology
• Why?
Lots of control over how the spokes communicate.

Reality: A Diversity of Unusual Routing Designs (cont.)
• Network broken up into many compartments, each running EIGRP, some with 400+ routers
• BGP used to filter routes passed between compartments
• The compartments themselves pass information between BGP speakers
• Why? Little need for iBGP; few routers speak BGP; lots of control over how packets move between compartments

[Charts: reconvergence time under a single link failure, when the master DE crashes, and when the network partitions]

Slides in Progress or Looking for a Place to Go

Separation of Issues
The 4D architecture separates issues:
• Networking logic goes into the decision plane

Dissemination Plane
Make clear that dissemination paths can use the same physical links, but different routing. Discovery and dissemination packets can be independent of the data plane (e.g., IP). IP is very configuration-intensive (addresses, etc.), so we avoid it whenever possible.

Questions
What if I want to take a bunch of hosts and stick them together into a small network? Haven't you made this common case terrifically hard?
• Today, I'd use static routes – it's neither common nor easy
• In the 4D model: a DE co-located on the host, which doesn't talk to any other DEs or routers

Problems with State of the Art
Today, network behavior is determined by multiple interacting distributed programs, written in what amounts to assembly language:
• No way to visualize or describe the routing design
• Impossible to establish a linkage between configurations and network objectives
• Only a few "textbook" routing designs are widely known
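The layer-3 topology reconstruction sketched in the backup slides – declare a link wherever two routers' interface addresses fall in the same subnet – is easy to prototype. The two-line configs below are illustrative stand-ins for real router configuration files.

```python
# Sketch of the reverse-engineering step: recover links by matching
# interface addresses that share a subnet across router config files.
from ipaddress import ip_interface
import re

configs = {
    "Router1": "interface Serial1/0.5\n ip address 1.1.1.1 255.255.255.252",
    "Router2": "interface Serial2/1.5\n ip address 1.1.1.2 255.255.255.252",
}

ADDR = re.compile(r"ip address (\S+) (\S+)")

def find_links(configs):
    """Two routers share a link if their interfaces share a subnet."""
    nets = {}   # subnet -> routers with an interface on it
    for router, text in configs.items():
        for addr, mask in ADDR.findall(text):
            net = ip_interface(f"{addr}/{mask}").network
            nets.setdefault(net, []).append(router)
    return [tuple(routers) for routers in nets.values() if len(routers) > 1]

print(find_links(configs))   # [('Router1', 'Router2')]
```

The same pass over all >8,000 files yields the layer-3 graph on which the routing-process and routing-instance graphs are then built.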