
Routing Scalability
Dimitri Papadimitriou, Alcatel-Lucent Bell

Today
• 200+ million domain delegations [Verisign], over a billion domain names
• Name -> Addr. resolution
• 350k routing paths (350k BGP routing entries at each DFZ router) and 35k autonomous systems
• IP address-based routing
• Distributed adaptive routing: BGP (selective push)

Question ?
• Total number of files ?
• File name-based routing (???)
• Name -> Name resolution (???)
• 200+ million domain delegations [Verisign], over a billion domain names
• Domain name-based routing (???)

Routing Scaling Analysis
• Two dimensional problem
– Problem 1: Addressing (amplification of address prefix de-aggregation)
– Problem 2: Routing (inter-domain routing protocol (BGP) limitations)

Problem 1: Addressing (amplification of address prefix de-aggregation)
• Originally, host IP addresses were Provider Allocated (PA) and assigned based on network topological location
– Adoption in the mid 90's of Classless Inter-Domain Routing (CIDR) [RFC4632] to perform address aggregation was felt sufficient to handle address scaling
• Conditions to achieve efficient address aggregation and relatively small routing tables (tradeoff routing information aggregation vs granularity) are not met anymore [RFC4984]
• Deterioration root causes
– Host mobility, site multi-homing (~25% of sites), traffic engineering (prefix de-aggregation)
– RIR policy to allocate PI addresses (not topologically aggregatable), thus making CIDR ineffective
-> Growth of routing table (routing protocol must not only scale with increasing network size) even if the network itself would not be growing

Problem 2: Inter-domain routing protocol limitations
1. BGP implementation and configuration: may be circumvented
2. BGP routing algorithmic: shortest AS-path vector routing
– BGP (as any path-vector routing): slow convergence due to uninformed path exploration
– BGP suffers from churn/overhead, which increases load on routers due to topological failures and traffic engineering (prefix de-aggregation)
3. BGP protocol usage: policy-based routing (without policy distribution)
– Intra-AS oscillations: MED-induced oscillations
– Inter-AS oscillations: local preference over shortest AS-path
– Conflicting policy interactions
  • Unintended stable state (wedgies)
  • Unintended unstable state (dispute wheels)

Internet Growth Rate
[Figure: Growth of Active BGP Entries (from Jan'89 to Sep'10)]
[Figure: Number of AS advertised in BGP routing table; labels: 35.269, ratio prefix/AS ~ 10]
• Traffic
– Traffic volume (per month): [8,9] Exabytes
– Traffic growth rate: 50% (+/- 5%) per year
• Routing table size
– Number of active Routing Table (RT) entries: 345k (Sep. 2010)
– Growth rate: 15%-25% per year
– Jan. 1, 2006: FIB size 176,000 prefixes; update rate 0.7M prefix updates / day; withdrawal rate 0.4M prefix withdrawals / day
– Jan. 1, 2009: FIB size [275,000;300,000] prefixes; update rate 1.7M prefix updates / day; withdrawal rate 0.9M withdrawals / day
– Jan. 1, 2011 (low-end predictions): size [370,000;400,000] prefixes; update rate 2.8M prefix updates / day; withdrawal rate 1.6M withdrawals / day; 550 Mbytes memory; 120% of a 1.5 GHz processor
• Autonomous Systems (AS)
– Number of advertised AS: 35k (Sep. 2010)
– Growth rate: 10% per year
– Ratio ~ 10 IPv4 prefixes per AS
• Characteristic AS-path length
– Steady ~3.7
• AS transit interconnection degree: growing (2.56 – 2.60)
Source: BGP Routing Table Analysis Reports - http://bgp.potaroo.net

Current Internet Growth Rate
• Dynamics: BGP updates (routing convergence)
– Between Jan. 2006 and Jan. 2009: prefix update and withdrawal rates per day increased by a factor of about 2.25-2.5 [Huston07]
– Average: 2-3 per sec.
– Peak: O(1000) per sec.
– BGP suffers from churn, which increases load on routers due to topological failures and traffic engineering (prefix de-aggregation)
– BGP's path vector amplifies these problems (path exploration)

Relationship to AS topology
• Meshed AS topology (average AS degree ~ 2.5-3) with high clustering coefficient (~ 0.4)
• BGP uninformed path exploration
– BGP listens without understanding (local BGP route selection)
– BGP routing updates are not coordinated in space and time but rate limited (MRAI timer) -> state coupling between topologically correlated BGP updates
[Figure: Cycle -> Exploration, repeated three times]

Space segmentation: lasagne or spaghetti
[Figure labels: IP address prefix; indirection; ID; abstraction; relation 1:n; network name; locator; relation m:n, m>n; host-driven; network-driven; partition]

Network layer vs Overlay routing
1. Either focus on technological limits (scalability, resiliency, stability, convergence, etc.) and operational limits (policing) of existing network-level routing
2. Or build an infrastructure-based overlay on top of existing IP network layer
-> Additional layer of indirection
[Figure: 1. Revisit network "routing functions"; 2. Infrastructure-based overlay; edge In/Out diagram]

Network layer vs Overlay routing (continued)
-> Additional layer of indirection adds benefits such as customization, independence, and flexibility ... but also detrimental effects
– Conflicting cross-layer interactions that impact overall network performance (amplified by selfish routing, where an individual user/overlay controls routing of an infinitesimal amount of traffic to optimize its own performance without considering network-wide criteria)
– Scalability (rate x state)
– Resiliency (user-initiated states) and security (interaction with user-initiated states)
– Genericity and evolvability
Note: the looser the coupling the higher the flexibility, the stronger the coupling the higher the performance (pick one) !
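
The de-aggregation effect described under Problem 1 can be illustrated with a short Python sketch. It relies only on the standard-library ipaddress module; the prefixes and the helper name routing_entries are made up for illustration and do not come from the slides. Under these assumptions, four provider-allocated (PA) /24 blocks carved from one provider /22 collapse into a single routing entry, whereas four provider-independent (PI) /24 blocks with no topological relationship remain four separate entries.

```python
import ipaddress


def routing_entries(prefixes):
    """Return the prefixes left after CIDR aggregation (hypothetical helper)."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    # collapse_addresses merges adjacent or overlapping networks into supernets.
    return list(ipaddress.collapse_addresses(networks))


# PA case (illustrative values): four customer /24s carved from one provider /22.
pa_sites = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]

# PI case (illustrative values): four /24s that are not topologically adjacent.
pi_sites = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24", "10.9.8.0/24"]

pa_table = routing_entries(pa_sites)
pi_table = routing_entries(pi_sites)

print("PA entries:", [str(n) for n in pa_table])
print("PI entries:", [str(n) for n in pi_table])
```

The number of sites is the same in both cases; only the allocation policy differs. This is a toy view of why the routing table can grow even when the network itself does not.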
Effects of indirections
"Any problem in computer science can be solved with another layer of indirection." (David Wheeler)
... "But that usually will create another problem." (rest of the quote)

Indirection = (generic) infrastructure-based overlay routing
• Multiple control mechanisms -> conflicting cross-layer interactions (due to different performance objectives and contention)
[Figure labels: overlay traffic; overlay control info; overlay control; overlay fwd info; NOP; decapsulation; encapsulation; open i/f; RIB; routing engine; FIB; packet in; TC; MF classifier; lookup (longest matching prefix); packet out]

Routing Scaling dependency on Addressing
Address prefix assignment
• Topology-dependent: locator address structure designed specifically to enable "topological aggregation" to scale with the routing system
• Topology-independent: addressing space used as flat ID to avoid the impact of topological changes (TCP impact) and of provider renumbering
[Figure: network vs host/site; addressing follows topology; topology dependent: Address = Loc. ID; topology independent: Address = flat ID]

Locator/Identifier (Loc/ID) Separation (1)
• Motivation: restore aggregatability of routing states by "segmenting" the address space (hosts vs networks) and their respective allocation policy
– Loc/ID split using different numbering spaces for end-point identifiers (EID), with block allocation per organization, and locators (RLOC) that are topology congruent and aggregatable
• Principles
– Segmentation between topology-independent endpoint identifier (= user address space) and topology-dependent locator (= network address space)
– Resolution via distributed database (= mapping database) including the info necessary to translate hosts' topology-independent addresses (identifiers) to topology-dependent addresses (locators)
– Traffic-driven at ingress "edge": forwarding entries preceded by ID-to-RLOC mapping entries (encapsulation) populated per incoming traffic arrival
– Memory-less at egress "edge": does not keep track of the source of ID-to-RLOC mapping requests (in case of mapping change, the initial requestor is not directly updated)

Locator/Identifier (Loc/ID) Separation (2)
• Host A (EID A) -> Host B (EID B)
[Figure: Host A (APP, TCP, EID A) and Host B (APP, TCP, EID B) attached to Edge A and Edge B. The edge router at LOC A is ITR for A -> B and ETR for B -> A; the edge router at LOC B is ETR for A -> B and ITR for B -> A. EID-to-RLOC lookup at the edges: Map Request <?, EID B>, Map Reply <RLOC B, EID B>. RLOC lookup in the network between LOC A and LOC B.]

Locator/Identifier (Loc/ID) Separation (3)
Main challenges
• Responsiveness: to spatio-temporal properties of incoming traffic (and variations)
– Differential delay and/or drop of initial incoming packets -> effect on transport layer (e.g. congestion and flow control)
– Port scanning (EID-to-RLOC cache updates)
• Churn: effect of changes for "in-use" EID-to-RLOC mappings
– Changes in EID reachability (at egress) -> effect on established flows
– Asymmetric forwarding paths (dual edges)
• De-aggregation: EID block segmentation
– EID sub-blocks decomposition and allocation to multiple RLOCs (forwarding still longest-match prefix based)

...More fundamentally
• Locator/ID Separation Protocol (LISP) is a form of name-independent routing using topology-unaware flat addressing running on top of a name-dependent routing scheme
– In addition, to maintain and update EID-to-RLOC tables, the network maintains and updates a distributed database of EID-to-RLOC mappings (-> indirection layer)
– Average scaling characteristics of name-independent routing schemes cannot be better than name-dependent ones
  • Reason: name-independent schemes are essentially name-dependent schemes plus mapping tables and an ID-to-RLOC name-resolution mechanism, which increase both routing table size and stretch
• Bottom line: LISP thus cannot directly result in a global routing scalability improvement

Design Principles ?
• End-to-end principle emphasizes
– Functional placement: guides placement and spatial distribution of functionality
– Correctness and completeness: a (sub-)system should consider only functions that can be completely and correctly implemented within it
  • Don't implement a function at lower layers unless it can be completely and correctly implemented at this level (relieve the burden from hosts)
  • Don't rely on information or processing that's not available along the data path, as it makes the network layer more complicated (example: DNS)
– Overall system cost-performance tradeoff
  • If an application can implement a functionality correctly, implement it at a lower layer only as a performance enhancement, and only if it does not impose a burden on applications that do not require that functionality
  • Don't put application semantics in the network: leads to loss of flexibility
    – Cannot change existing applications easily and cannot introduce new applications easily
• Fate sharing
– The network does not maintain any state about the application data flows that traverse the network (app-stateless nature of the network)
• Minimum intervention principle

Which architectural alternative ?
• End-to-end principle
• Loose/weak coupling
[Figure: Information and Communication layers shown for each alternative]

Fusion: info -> communication
• RFC 1925, Art.5
• Erosion of the end-to-end principle: network-aware apps and application-aware network

Mediation
• Net result: +1 layer (only ?)
• Gain ? Does it improve the tradeoff between cost x complexity and perf. x functionality ?
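
Returning to the Loc/ID separation slides, the traffic-driven ingress edge and the memory-less egress side can be sketched in a few lines of Python. This is a toy model under stated assumptions, not LISP itself: the class names MappingSystem and IngressEdge, the method names, and all addresses are hypothetical, EID blocks are assumed not to overlap, and there is no cache expiry. It only shows a map-cache that is populated on the first packet and that goes stale when the mapping changes, since the initial requestor is not directly updated.

```python
import ipaddress


class MappingSystem:
    """Stand-in for the distributed EID-to-RLOC mapping database (hypothetical)."""

    def __init__(self, mappings):
        self._mappings = {ipaddress.ip_network(p): r for p, r in mappings.items()}
        self.requests = 0  # number of Map Requests received

    def update(self, eid_prefix, rloc):
        # Memory-less: earlier requestors are NOT notified of the change.
        self._mappings[ipaddress.ip_network(eid_prefix)] = rloc

    def map_request(self, eid):
        """Answer <?, EID> with (EID prefix, RLOC) using longest-prefix match."""
        self.requests += 1
        addr = ipaddress.ip_address(eid)
        matches = [n for n in self._mappings if addr in n]
        if not matches:
            return None
        best = max(matches, key=lambda n: n.prefixlen)
        return best, self._mappings[best]


class IngressEdge:
    """Toy ingress edge (ITR): map-cache populated per incoming traffic arrival."""

    def __init__(self, mapping_system):
        self.mapping_system = mapping_system
        self.cache = {}

    def forward(self, dst_eid):
        addr = ipaddress.ip_address(dst_eid)
        hits = [n for n in self.cache if addr in n]
        if hits:
            best = max(hits, key=lambda n: n.prefixlen)
            return ("hit", self.cache[best])
        reply = self.mapping_system.map_request(dst_eid)
        if reply is None:
            return ("drop", None)
        prefix, rloc = reply
        self.cache[prefix] = rloc  # first packet pays the resolution delay
        return ("miss", rloc)


mapping_system = MappingSystem({"10.1.0.0/16": "192.0.2.1"})
itr = IngressEdge(mapping_system)

first = itr.forward("10.1.2.3")    # cache miss: triggers a Map Request
second = itr.forward("10.1.9.9")   # same EID block: served from the cache
mapping_system.update("10.1.0.0/16", "198.51.100.7")
stale = itr.forward("10.1.2.3")    # cache still points at the old RLOC
print(first, second, stale)
```

The first packet towards an EID block is the one exposed to the differential delay listed under "Responsiveness", and the stale entry after the update corresponds to the churn effect on "in-use" mappings.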
