BGP Deployment & Scalability
Mike Pennington, Network Consulting Engineer, Cisco Systems, Denver
ISP Workshops © 2001, Cisco Systems, Inc. All rights reserved.

Basic BGP Review

Border Gateway Protocol
• Routing protocol used to exchange routing information between networks
    exterior gateway protocol
• RFC 1771
    work in progress to update: draft-ietf-idr-bgp4-17.txt
• Currently version 4
• Runs over TCP

BGP
• Path vector protocol
• Incremental updates
• Many options for policy enforcement
• Classless Inter-Domain Routing (CIDR)
• Widely used for Internet backbone
• Autonomous systems

Path Vector Protocol
• BGP is classified as a path vector routing protocol (see RFC 1322)
    A path vector protocol defines a route as a pairing between a destination and the attributes of the path to that destination.
    Example entry (AS path: 6461 7018 6337 11268):
    12.6.126.0/24   207.126.96.43   1021   0   6461 7018 6337 11268 i

AS-Path
• Sequence of ASes a route has traversed
• Loop detection
• Apply policy
[Figure: AS 100 (180.10.0.0/16), AS 200 (170.10.0.0/16), AS 300, AS 400 (150.10.0.0/16) and AS 500. AS paths seen at AS 500:]
    180.10.0.0/16   300 200 100
    170.10.0.0/16   300 200
    150.10.0.0/16   300 400

AS-Path loop detection
[Figure: AS 100 (180.10.0.0/16), AS 200 (170.10.0.0/16), AS 300 (140.10.0.0/16) and AS 500. AS paths seen at AS 500:]
    180.10.0.0/16   300 200 100
    170.10.0.0/16   300 200
    140.10.0.0/16   300
[Announced by AS 500 to AS 100:]
    140.10.0.0/16   500 300
    170.10.0.0/16   500 300 200
180.10.0.0/16 is not announced to AS 100 because AS 500 sees that it originated in AS 100 and that AS 100 is the neighbouring AS: loop detection in action.

Autonomous System (AS)
• Collection of networks with same routing policy
• Single routing protocol
• Usually under single ownership, trust and administrative control

BGP Basics
[Figure: routers A, B, C, D, E in AS 100, AS 101 and AS 102, with the links between them labelled "Peering"]
BGP speakers are called peers

BGP General Operation
• Learns multiple paths via internal and external BGP speakers
• Picks the best path and installs it in the forwarding table
• Policies are applied by influencing the best path selection

External BGP Peering (eBGP)
[Figure: routers A and B in AS 100, router C in AS 101]
• Between BGP speakers in different ASes
• Should be directly connected
• Do not run an IGP between eBGP peers

Internal BGP Peering (iBGP)
[Figure: routers A, B, D, E in AS 100]
• Topology independent
• Each iBGP speaker must peer with every other iBGP speaker in the AS

Internal BGP (iBGP)
• BGP peer within the same AS
• Not required to be directly connected
• iBGP speakers need to be fully meshed
    they originate connected networks
    they do not pass on prefixes learned from other iBGP speakers

BGP Attributes

What Is an Attribute?
[Figure: an update carrying a list of attributes: ... Next Hop, AS Path, MED ...]
• Describes the characteristics of a prefix
• Transitive or non-transitive
• Some are mandatory
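
The AS-path loop detection shown in the earlier figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `accept_update` and `advertise` are hypothetical, and the send-side check simply mirrors what the slide shows AS 500 doing, alongside the usual receive-side check in which a router rejects any path that already contains its own AS.

```python
from typing import List, Optional


def accept_update(local_as: int, as_path: List[int]) -> bool:
    """Receive side: reject any route whose AS path already contains our own AS."""
    return local_as not in as_path


def advertise(local_as: int, as_path: List[int], neighbour_as: int) -> Optional[List[int]]:
    """Send side, as in the slide: skip routes that already passed through the neighbour.

    Returns the AS path to announce (our AS prepended), or None when suppressed.
    """
    if neighbour_as in as_path:
        return None
    return [local_as] + as_path


# Table at AS 500, taken from the loop detection slide.
TABLE_AS500 = {
    "180.10.0.0/16": [300, 200, 100],
    "170.10.0.0/16": [300, 200],
    "140.10.0.0/16": [300],
}

# What AS 500 would announce to its neighbour AS 100.
to_as100 = {prefix: advertise(500, path, 100) for prefix, path in TABLE_AS500.items()}
for prefix, path in to_as100.items():
    print(prefix, "suppressed" if path is None else path)
```

With the slide's table, 180.10.0.0/16 is suppressed towards AS 100, while the other two prefixes are announced with 500 prepended to the path.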
AS-Path
• Sequence of ASes a route has traversed
• Loop detection
• Apply policy
[Figure: AS 100 (180.10.0.0/16), AS 200 (170.10.0.0/16), AS 300, AS 400 (150.10.0.0/16) and AS 500. AS paths seen at AS 500:]
    180.10.0.0/16   300 200 100
    170.10.0.0/16   300 200
    150.10.0.0/16   300 400

Next Hop
[Figure: router A (150.10.1.1) in AS 200 (150.10.0.0/16), router B (150.10.1.2) in AS 300, and AS 100 (160.10.0.0/16). Routes shown with their next hops:]
    150.10.0.0/16   150.10.1.1
    160.10.0.0/16   150.10.1.1
• Next hop to reach a network
• Usually a local network is the next hop in an eBGP session

Next Hop (continued)
• IGP should carry route to next hops
• Recursive route look-up
• Unlinks BGP from actual physical topology
• Allows IGP to make intelligent forwarding decision

Local Preference
[Figure: 160.10.0.0/16, originated by AS 100, is learned by AS 400 through AS 200 and AS 300 with local preference values 500 and 800; routers A, B, C, D, E. The entry with local preference 800 is marked best (>):]
      160.10.0.0/16   500
    > 160.10.0.0/16   800

Local Preference
• Local to an AS: non-transitive
    local preference set to 100 when heard from neighbouring AS
• Used to influence BGP path selection
    determines best path for outbound traffic
• Path with highest local preference wins

Multi-Exit Discriminator (MED)
[Figure: 192.68.1.0/24 is announced between AS 201 and AS 200 over two links, with MED 2000 on one and MED 1000 on the other; routers A, B, C]

Multi-Exit Discriminator
• Inter-AS: non-transitive
    metric reset to 0 on announcement to next AS
• Used to convey the relative preference of entry points
    determines best path for inbound traffic
• Comparable if paths are from same AS
• IGP metric can be conveyed as MED
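
How local preference and MED influence the choice between two paths can be sketched as below. This is a deliberately reduced comparison, not the full BGP decision process: it assumes only three steps (highest local preference, then shortest AS path, then lowest MED when both paths come from the same neighbouring AS) and leaves out every other tie-breaker. The `Path` class, the `better` function and the next-hop addresses are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Path:
    next_hop: str
    as_path: Tuple[int, ...]
    local_pref: int = 100  # default when heard from a neighbouring AS
    med: int = 0


def better(a: Path, b: Path) -> Path:
    """Pick the preferred of two paths using a reduced set of rules."""
    # 1. Highest local preference wins.
    if a.local_pref != b.local_pref:
        return a if a.local_pref > b.local_pref else b
    # 2. Shortest AS path wins.
    if len(a.as_path) != len(b.as_path):
        return a if len(a.as_path) < len(b.as_path) else b
    # 3. Lowest MED wins, but only when both paths come from the same neighbouring AS.
    same_neighbour = a.as_path[:1] == b.as_path[:1]
    if same_neighbour and a.med != b.med:
        return a if a.med < b.med else b
    # Remaining tie-breakers are omitted in this sketch: keep the first path.
    return a


# Local preference example: 160.10.0.0/16 learned with local preference 500 and 800.
via_as200 = Path("192.0.2.1", (200, 100), local_pref=500)
via_as300 = Path("192.0.2.2", (300, 100), local_pref=800)
best_outbound = better(via_as200, via_as300)

# MED example: 192.68.1.0/24 heard twice from AS 201 with MED 2000 and 1000.
link_1 = Path("192.0.2.5", (201,), med=2000)
link_2 = Path("192.0.2.6", (201,), med=1000)
best_entry = better(link_1, link_2)

print(best_outbound.local_pref, best_entry.med)
```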
MED & IGP Metric
• set metric-type internal
    enables BGP to advertise a MED which corresponds to the IGP metric values
    changes are monitored (and re-advertised if needed) every 600s
    bgp dynamic-med-interval <secs>

Community
• BGP attribute
• Used to group destinations
• Represented as two 16-bit integers
• Each destination could be a member of multiple communities
• Community attribute carried across ASes
• Useful in applying policies

Community
[Figure: ISP 1, ISP 2, AS 100, AS 200, AS 300 and AS 400, with routers A, B, C, D, E and X. Prefixes and the communities attached to them:]
    160.10.0.0/16   300:1
    170.10.0.0/16   300:1
    200.10.0.0/16   300:9

BGP Deployment Guidelines

Recommended BGP commands for everyone
• ip bgp-community new-format
• no auto-summary
• no synchronization
• bgp deterministic-med
    Whatever you do, use of deterministic-med MUST be consistent in your Autonomous System.

Other serious considerations
• For public peering: filter eBGP routes inbound and outbound
    Block your own address space inbound
    Block RFC 1918 space (inbound and outbound)
    Block DSUA space (inbound and outbound): http://www.ietf.org/internet-drafts/draft-manning-dsua-08.txt
• Use prefix-lists for route-filtering when possible (easier to read than ACLs)
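
The inbound filtering advice above can be illustrated with a small Python sketch built on the standard `ipaddress` module. It is not router configuration: the function name `permitted_inbound` is hypothetical, `OWN_SPACE` is a made-up stand-in for the operator's own address block, and the DSUA list is left out. A prefix is rejected when it falls inside a blocked range, which roughly corresponds to a prefix-list entry with `le 32`.

```python
import ipaddress

# Private address space defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hypothetical: replace with the address space your own AS originates.
OWN_SPACE = [ipaddress.ip_network("198.51.100.0/24")]


def permitted_inbound(prefix: str) -> bool:
    """Return False for prefixes that lie inside RFC 1918 space or our own space."""
    net = ipaddress.ip_network(prefix)
    blocked = RFC1918 + OWN_SPACE
    return not any(net.subnet_of(block) for block in blocked)


for candidate in ("10.1.0.0/16", "198.51.100.128/25", "170.10.0.0/16"):
    print(candidate, "permit" if permitted_inbound(candidate) else "deny")
```

Note that this check only catches prefixes equal to or more specific than a blocked range; a covering aggregate would need a separate rule.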
Other serious considerations
• If you carry a default in the IGP, your BGP next-hops ALWAYS resolve (generally not good)
• bgp bestpath compare-routerid
    Restores RFC-compliant path selection; OFF by default to reduce update churn, use with discretion
• If you have a large BGP network, consider techniques in the next section

BGP Scaling Techniques

BGP Scaling Techniques
• How to scale iBGP mesh beyond a few peers?
• How to implement new policy without causing flaps and route churning?
• How to reduce the overhead on the routers?

BGP Scaling Techniques
• Dynamic reconfiguration
• Peer groups
• Route flap damping
• Route reflectors

Soft Reconfiguration
Problem:
• Hard BGP peer clear required after every policy change because the router does not store prefixes that are denied by a filter
• Hard BGP peer clearing consumes CPU and affects connectivity for all networks
Solution:
• Soft-reconfiguration

Soft Reconfiguration
[Figure: routes received from a peer pass through the BGP in process. In normal operation, routes that are not accepted are discarded; with soft reconfiguration, all received routes are kept in a "BGP in" table. Accepted routes ("received and used") go into the BGP table, from which the BGP out process sends to the peer]

Soft Reconfiguration
• New policy is activated without tearing down and restarting the peering session
• Per-neighbour basis
• Uses more memory to keep prefixes whose attributes have been changed or have not been accepted

Configuring Soft reconfiguration
    router bgp 100
     neighbor 1.1.1.1 remote-as 101
     neighbor 1.1.1.1 route-map infilter in
     neighbor 1.1.1.1 soft-reconfiguration inbound
    ! Outbound does not need to be configured
    !
Then, when we change the policy, we issue an exec command:
    clear ip bgp 1.1.1.1 soft [in | out]

Managing Policy Changes
• clear ip bgp <addr> [soft] [in|out]
    <addr> may be any of the following
    x.x.x.x             IP address of a peer
    *                   all peers
    ASN                 all peers in an AS
    external            all external peers
    peer-group <name>   all peers in a peer-group

Route Refresh Capability
• Facilitates non-disruptive policy changes
• No configuration is needed
• No additional memory is used
• Requires peering routers to support "route refresh capability" (RFC 2918)
• clear ip bgp x.x.x.x in tells peer to resend full BGP announcement

Soft Reconfiguration vs Route Refresh
• Use Route Refresh capability if supported
    find out from "show ip bgp neighbor"
    uses much less memory
• Otherwise use Soft Reconfiguration

Peer Groups
Without peer groups
• iBGP neighbours receive same update
• Large iBGP mesh slow to build
• Router CPU wasted on repeat calculations
Solution: peer groups!
• Group peers with same outbound policy
• Updates are generated once per group

Peer Groups - Advantages
• Makes configuration easier
• Makes configuration less prone to error
• Makes configuration more readable
• Lower router CPU load
• iBGP mesh builds more quickly
• Members can have different inbound policy
• Can be used for eBGP neighbours too!
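
Why peer groups lower CPU load can be shown with a toy model: neighbours that share an outbound policy are grouped, and the outbound update is computed once per group rather than once per neighbour. Everything in this sketch is hypothetical (the function `build_updates`, the policy name `outfilter` used as a dictionary key, and the sample policy itself); it models the idea only, not how any router implements it.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Route = str
Policy = Callable[[List[Route]], List[Route]]


def build_updates(
    neighbours: Dict[str, str],
    policies: Dict[str, Policy],
    table: List[Route],
) -> Tuple[Dict[str, List[Route]], int]:
    """Return per-neighbour updates and the number of times a policy was evaluated."""
    # Group neighbours by the name of their outbound policy.
    groups: Dict[str, List[str]] = defaultdict(list)
    for address, policy_name in neighbours.items():
        groups[policy_name].append(address)

    updates: Dict[str, List[Route]] = {}
    policy_runs = 0
    for policy_name, members in groups.items():
        update = policies[policy_name](table)  # generated once per group
        policy_runs += 1
        for address in members:
            updates[address] = update  # every member receives the same update
    return updates, policy_runs


def outfilter(table: List[Route]) -> List[Route]:
    """Hypothetical outbound policy: do not announce 10.0.0.0/8."""
    return [route for route in table if route != "10.0.0.0/8"]


neighbours = {"1.1.1.1": "outfilter", "2.2.2.2": "outfilter", "3.3.3.3": "outfilter"}
updates, policy_runs = build_updates(
    neighbours, {"outfilter": outfilter}, ["170.10.0.0/16", "10.0.0.0/8"]
)
print(policy_runs, updates["2.2.2.2"])
```

Three neighbours with the same outbound policy cost one policy evaluation here; without grouping, the same work would be repeated for each neighbour.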
Configuring Peer Group
    router bgp 100
     neighbor ibgp-peer peer-group
     neighbor ibgp-peer remote-as 100
     neighbor ibgp-peer update-source loopback 0
     neighbor ibgp-peer send-community
     neighbor ibgp-peer route-map outfilter out
     neighbor 1.1.1.1 peer-group ibgp-peer
     neighbor 2.2.2.2 peer-group ibgp-peer
     neighbor 2.2.2.2 route-map infilter in
     neighbor 3.3.3.3 peer-group ibgp-peer
    ! note how 2.2.2.2 has different inbound filter from peer-group
    !

Route Flap Damping
• Route flap
    Going up and down of path or change in attribute
    BGP WITHDRAW followed by UPDATE = 1 flap
    eBGP neighbour going down/up is NOT a flap
    Ripples through the entire Internet
    Wastes CPU
• Damping aims to reduce scope of route flap propagation

Route Flap Damping (Continued)
• Requirements
    Fast convergence for normal route changes
    History predicts future behaviour
    Suppress oscillating routes
    Advertise stable routes
• Implementation described in RFC 2439

Operation
• Add penalty (1000) for each flap
    Change in attribute gets penalty of 500
• Exponentially decay penalty
    half life determines decay rate
• Penalty above suppress-limit
    do not advertise route to BGP peers
• Penalty decayed below reuse-limit
    re-advertise route to BGP peers
    penalty reset to zero when it is half of reuse-limit

Operation
[Figure: penalty (0 to 4000) plotted against time (0 to 25), with the suppress limit and reuse limit marked as horizontal thresholds. Label under the time axis: Network Announced]
(Figure labels, continued: Network Not Announced, Network Re-announced)

Operation
• Only applied to inbound announcements from eBGP peers
• Alternate paths still usable
• Controlled by:
    Half-life (default 15 minutes)
    reuse-limit (default 750)
    suppress-limit (default 2000)
    maximum suppress time (default 60 minutes)

Configuration
Fixed damping
    router bgp 100
     bgp dampening [<half-life> <reuse-value> <suppress-penalty> <maximum suppress time>]
Selective and variable damping
     bgp dampening [route-map <name>]
    route-map <name> permit 10
     match ip address prefix-list FLAP-LIST
     set dampening [<half-life> <reuse-value> <suppress-penalty> <maximum suppress time>]
    ip prefix-list FLAP-LIST permit 192.0.2.0/24 le 32

Operation
• Care required when setting parameters
• Penalty must be less than reuse-limit at the maximum suppress time
• Maximum suppress time and half life must allow penalty to be larger than suppress limit

Configuration
• Example: bgp dampening 30 750 3000 60
    reuse-limit of 750 means maximum possible penalty is 3000; no prefixes suppressed as penalty cannot exceed suppress-limit
• Example: bgp dampening 30 2000 3000 60
    reuse-limit of 2000 means maximum possible penalty is 8000; suppress limit is easily reached

Configuration
• Example: bgp dampening 15 500 2500 30
    reuse-limit of 500 means maximum possible penalty is 2000; no prefixes suppressed as penalty cannot exceed suppress-limit
• Example: bgp dampening 15 750 3000 45
    reuse-limit of 750 means maximum possible penalty is 6000; suppress limit is easily reached

Maths!
• Maximum value of penalty is
    $\text{max-penalty} = \text{reuse-limit} \times 2^{\left(\text{max-suppress-time} / \text{half-life}\right)}$
    (for example, bgp dampening 30 750 3000 60 gives 750 × 2^(60/30) = 3000)
• Always make sure that suppress-limit is LESS than max-penalty, otherwise there will be no route damping

Enhancements
• Selective damping based on AS-path, Community, Prefix
• Variable damping
    recommendations for ISPs: http://www.ripe.net/docs/ripe-229.html
• Flap statistics
    show ip bgp neighbor <x.x.x.x> [dampened-routes | flap-statistics]

Scaling iBGP mesh
Avoid n(n-1)/2 iBGP mesh
    n=1000: nearly half a million iBGP sessions!
[Figure: 13 routers, 78 iBGP sessions!]
Two solutions
    Route reflector: simpler to deploy and run
    Confederation: more complex, corner case benefits

Route Reflector: Principle
[Figure: AS 100 with routers A, B and C; one router is labelled "Route Reflector"]

Route Reflector
[Figure: AS 100 with routers A, B and C; labels "Reflectors" and "Clients"]
• Reflector receives path from clients and non-clients
• Selects best path
• If best path is from client, reflect to other clients and non-clients
• If best path is from non-client, reflect to clients only
• Non-meshed clients
• Described in RFC 2796

Route Reflector Topology
• Divide the backbone into multiple clusters
• At least one route reflector and few clients per cluster
• Route reflectors are fully meshed
• Clients in a cluster could be fully meshed
• Single IGP to carry next hop and local routes

Route Reflectors: Loop Avoidance
• Originator_ID attribute
    Carries the RID of the originator of the route in the local AS (created by the RR)
• Cluster_list attribute
    The local cluster-id is added when the update is sent by the RR
    Cluster-id is router-id (address of loopback)
    Do NOT use bgp cluster-id x.x.x.x
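
The reflection rules and the two loop-avoidance attributes described above can be sketched as follows. This is an illustrative model under stated assumptions, not a router implementation: the function names, the peer names and the router-ids are hypothetical, and the loop checks follow the commonly described behaviour in which a reflector drops a route whose Cluster_list already holds its own cluster-id and a router drops a route whose Originator_ID is its own router-id.

```python
from typing import Dict, List, Set


def reflect_targets(source: str, peers: Dict[str, str]) -> Set[str]:
    """Return the iBGP peers that receive the best path learned from `source`.

    `peers` maps a peer name to its role: "client" or "non-client".
    """
    if peers[source] == "client":
        # Best path from a client: reflect to other clients and to non-clients.
        return {name for name in peers if name != source}
    # Best path from a non-client: reflect to clients only.
    return {name for name, role in peers.items() if role == "client"}


def passes_cluster_check(local_cluster_id: str, cluster_list: List[str]) -> bool:
    """A reflector rejects a route that has already traversed its own cluster."""
    return local_cluster_id not in cluster_list


def passes_originator_check(local_router_id: str, originator_id: str) -> bool:
    """A router rejects a reflected route that it originated itself."""
    return local_router_id != originator_id


# Hypothetical reflector with two clients (B, C) and one non-client peer (D).
peers = {"B": "client", "C": "client", "D": "non-client"}
print(sorted(reflect_targets("B", peers)))  # path learned from a client
print(sorted(reflect_targets("D", peers)))  # path learned from a non-client
```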
Route Reflectors: Redundancy
• Multiple RRs can be configured in the same cluster: not advised!
    All RRs in the cluster must have the same cluster-id (otherwise it is a different cluster)
• A router may be a client of RRs in different clusters
    Common today in ISP networks to overlay two clusters; redundancy achieved that way
    Each client has two RRs = redundancy

Route Reflector: Benefits
• Solves iBGP mesh problem
• Packet forwarding is not affected
• Normal BGP speakers co-exist
• Multiple reflectors for redundancy
• Easy migration
• Multiple levels of route reflectors

Configuring a Route Reflector
    router bgp 100
     neighbor 1.1.1.1 remote-as 100
     neighbor 1.1.1.1 route-reflector-client
     neighbor 2.2.2.2 remote-as 100
     neighbor 2.2.2.2 route-reflector-client
     neighbor 3.3.3.3 remote-as 100
     neighbor 3.3.3.3 route-reflector-client
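
To put numbers on the mesh problem that route reflectors solve, the sketch below compares session counts. The full-mesh formula n(n-1)/2 comes from the deck; the route reflector count is an assumption made for illustration: reflectors are fully meshed with each other, every client peers with every reflector, and clients do not peer with each other. The function names are hypothetical.

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions in a full mesh of n routers: n(n-1)/2."""
    return n * (n - 1) // 2


def rr_sessions(n: int, reflectors: int) -> int:
    """Sessions when `reflectors` of the n routers act as route reflectors.

    Assumes reflectors are fully meshed, each client peers with every
    reflector, and clients are not meshed among themselves.
    """
    clients = n - reflectors
    return full_mesh_sessions(reflectors) + clients * reflectors


print(full_mesh_sessions(13))    # 13 routers, full mesh
print(full_mesh_sessions(1000))  # n = 1000, full mesh
print(rr_sessions(13, 2))        # 13 routers, two reflectors for redundancy
```

Under these assumptions, 13 routers with two reflectors need 23 sessions instead of 78, and each client still has two reflectors for redundancy.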