Deployment and Operation of BGP
TECRST-2310
Agenda
 Introduction to BGP
 BGP General Operation
 BGP Attributes and Policy Control
 BGP Path Selection Algorithm
 Applying Policy with BGP
 Multi-Protocol BGP
 BGP Load Balancing
 Full Mesh IBGP
 BGP Route-Reflectors
 Scaling BGP Updates
 BGP Fast Convergence
 A Little BGP “Show and Tell”
TECRST-2310_c1
© 2010 Cisco and/or its affiliates. All rights reserved.
Cisco Confidential
Introduction to BGP
Autonomous System
 A network sharing the same routing policy
Possibly multiple IGPs
Usually under single administrative control
 Contiguous internal connectivity
 AS Numbers are globally unique and range from 1 to 65,535
Private range: 64512–65534
Reserved: 0 and 65535
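The ranges above can be checked mechanically. A minimal sketch, assuming 2-byte AS numbers only; the function name `classify_asn` is illustrative and not part of any router software:

```python
def classify_asn(asn: int) -> str:
    """Classify a 2-byte AS number using the ranges listed on this slide."""
    if not 0 <= asn <= 65535:
        raise ValueError("not a 2-byte AS number")
    if asn in (0, 65535):
        return "reserved"
    if 64512 <= asn <= 65534:
        return "private"
    return "public"


for asn in (100, 64512, 65534, 65535):
    print(asn, classify_asn(asn))
```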
Border Gateway Protocol - BGP
 BGP is classified as a path vector routing protocol
(see RFC 1322)
A path vector protocol defines a route as a pairing between
a destination and the attributes of the path to that
destination.
 BGP used internally (iBGP) and externally (eBGP)
 iBGP used to carry
Some/all Internet prefixes across ISP backbone
ISP’s customer prefixes
 eBGP used to
Exchange prefixes with other Autonomous Systems (ASes)
Implement routing policy
BGP Basics
eBGP
Peering
A
C
AS 101
AS 100
iBGP
D
B
BGP speakers are
called peers or
neighbors
E
AS 102
External BGP - eBGP
 Between BGP speakers in different AS
 Usually directly connected
 Usually sets next-hop to self
Router A
router bgp 1
neighbor 2.0.1.1 remote-as 2
Router B
router bgp 2
neighbor 2.0.1.2 remote-as 1
neighbor 2.0.1.2 route-map X {in|out}
..
route-map X permit 10
{set | match} <attribute>
[Figure: Router A (.2) in AS 1 (1.0.0.0) peers with Router B (.1) in AS 2 (2.0.0.0) across the 2.0.1.0 link]
Internal BGP - iBGP
 Neighbor in same AS
 Next-hop unchanged…usually
 May be several hops away
 Don’t forward iBGP learned routes to
other iBGP peers
n*(n-1)/2 peering mesh – scaling problem!
Route-Reflectors relax this constraint
A
B
Router B:
router bgp 1
neighbor 1.0.1.1 remote-as 1
Router A:
router bgp 1
neighbor 1.0.2.1 remote-as 1
iBGP and Loopback Interfaces
RtrA
RtrB
interface loopback0
ip address 1.1.1.254 255.255.255.255
!
router bgp 100
neighbor 1.1.2.254 remote-as 100
neighbor 1.1.2.254 update-source loopback0
interface loopback0
ip address 1.1.2.254 255.255.255.255
!
router bgp 100
neighbor 1.1.1.254 remote-as 100
neighbor 1.1.1.254 update-source loopback0
AS 100
RtrB
RtrA
Why not peer to the address assigned to a physical interface?
Reasons for Using BGP
1. You need to scale your IGP
2. You’re a multihomed ISP customer and need to
implement routing policy
3. You’re an MPLS/VPN subscriber to an SP service
and want to run dynamic routing between CE and
PE routers
Using BGP to Scale Your IGP
 Scaling a large network—“Divide and Conquer”
Hierarchy
Periodic IGPs/flooding
Isolate network instability
 Complex policies
Control reachability to prefixes
Merge separate organizations
Connect multiple IGPs
Best Path Selection for Cisco Routers
Which Route Is Best?
 First, always take the next-hop advertising the longest prefix (most specific route to destination)
Choose next-hop advertising 10.1.1.0/24 over the next-hop advertising 10.1.0.0/16
 If two next-hop routers advertise the exact same route, refer to the default administrative distances as an index of believability
Lower is more believable (see table below)
Defaults can be modified if necessary (with caution)

Route Source                                                      Default Distance Value
Connected interface                                               0
Static route*                                                     1
Enhanced Interior Gateway Routing Protocol (EIGRP) summary route  5
External Border Gateway Protocol (BGP)                            20
Internal EIGRP                                                    90
IGRP                                                              100
OSPF                                                              110
Intermediate System-to-Intermediate System (IS-IS)                115
Routing Information Protocol (RIP)                                120
Exterior Gateway Protocol (EGP)                                   140
On Demand Routing (ODR)                                           160
External EIGRP                                                    170
Internal BGP                                                      200
Unknown**                                                         255
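The two rules on this slide (longest prefix first, then lowest administrative distance) can be sketched as a single comparison. This is a simplified model rather than how a router builds its RIB and FIB; `best_route` and the route tuples are hypothetical names for illustration, and the distance values are a subset of the table on this slide.

```python
import ipaddress

# Subset of the default administrative distances listed on this slide.
ADMIN_DISTANCE = {
    "connected": 0, "static": 1, "ebgp": 20, "eigrp": 90,
    "ospf": 110, "rip": 120, "ibgp": 200,
}


def best_route(dest, routes):
    """Pick a route for dest from (prefix, source, next_hop) tuples.

    Longest matching prefix wins; ties on prefix length are broken by the
    lowest administrative distance. Returns None when nothing matches.
    """
    addr = ipaddress.ip_address(dest)
    candidates = [r for r in routes if addr in ipaddress.ip_network(r[0])]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r[0]).prefixlen, ADMIN_DISTANCE[r[1]]),
    )


routes = [
    ("10.1.0.0/16", "ospf", "A"),
    ("10.1.1.0/24", "rip", "B"),
    ("10.1.1.0/24", "ebgp", "C"),
]
print(best_route("10.1.1.5", routes))
print(best_route("10.1.2.5", routes))
```

For 10.1.1.5 the /24 beats the /16 regardless of source, and the eBGP /24 (distance 20) beats the RIP /24 (distance 120).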
General Operation
BGP General Operation
 Learns multiple paths via internal
and external BGP speakers
 Picks the best path and installs in
the forwarding table
 Policies applied by influencing the
best path selection
Summary of Operation
 TCP connection established (port 179)
 Both peers attempt to connect—There is an
algorithm to resolve “connection collisions”
 Exchange messages to open and confirm the
connection parameters
 Initial exchange of entire table
 Incremental updates after initial exchange
 Keepalive messages exchanged when
there are no updates
What Are Incremental Updates?
 IGPs typically rebroadcast routes
 BGP runs over TCP => reliable data delivery
 Once BGP sends a route to a peer, it assumes the
peer will keep it unless:
A replacement route is sent—Implicit
withdraw of old route
The route is withdrawn—Explicit withdraw
The BGP session goes down (keepalive failure)
Inserting Prefixes into BGP
 Two ways to insert/originate prefixes into BGP
Redistribute (static or dynamic)
Network command
Always necessary for default route
 Default rules for re-advertising BGP learned prefixes
to other BGP neighbors
eBGP learned routes are sent to all eBGP and iBGP peers
ee, ei
iBGP learned routes are sent to all eBGP but NO iBGP peers
ie
Exception: iBGP Route-Reflectors
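The default re-advertisement rules (ee, ei, ie, but not ii) fit in a few lines. A minimal sketch that deliberately ignores the route-reflector exception; `should_readvertise` is a hypothetical helper name:

```python
def should_readvertise(learned_via: str, send_to: str) -> bool:
    """Default BGP re-advertisement rule, without route reflection.

    learned_via and send_to are each "ebgp" or "ibgp".
    """
    for kind in (learned_via, send_to):
        if kind not in ("ebgp", "ibgp"):
            raise ValueError(f"unknown session type: {kind}")
    # Only the iBGP-learned -> iBGP-peer combination is suppressed.
    return not (learned_via == "ibgp" and send_to == "ibgp")


for src in ("ebgp", "ibgp"):
    for dst in ("ebgp", "ibgp"):
        print(src, "->", dst, should_readvertise(src, dst))
```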
Inserting Prefixes into BGP Redistribute
 Configuration Examples:
router bgp 109
redistribute static
ip route 198.10.4.0 255.255.254.0 serial0
router bgp 109
redistribute eigrp 100
Inserting Prefixes into BGP - Network
Network
 Used to tell BGP which networks to advertise to
neighbors; unlike IGPs, the network command is
not used to determine which interfaces will be
active for the protocol; networks must be in the IP
routing table in order for them to be advertised
router bgp 100
neighbor x.x.x.x remote-as Y
network 172.16.0.0
network 172.17.1.0 mask 255.255.255.0
For network 172.16.0.0: if auto-summary is on, a specific route from 172.16.0.0 must be in the routing table; if auto-summary is off, the prefix 172.16.0.0/16 must be in the IP routing table
For network 172.17.1.0 mask 255.255.255.0: there must be an exact match in the IP routing table
Inserting Prefixes into BGP –
Network Command
 Configuration Example
router bgp 109
network 198.10.4.0 mask 255.255.254.0
network 0.0.0.0
A matching route must exist in the routing table before the
network is announced
Exact prefix length
“show ip route x.x.x.x” must return exact route
before BGP will advertise
 Static route can be real next hop or null0 interface
ip route 198.10.4.0 255.255.254.0 192.168.1.1
ip route 198.10.4.0 255.255.254.0 null0
ip route 0.0.0.0 0.0.0.0 null0 250
BGP Attributes and Policy Control
Route Metrics
 OSPF has a dimensionless metric based on
interface speed
 EIGRP has a 5-tuple
[(K1 * BW + K2 * BW/(256 – Load) + K3 * Delay) * K5/(K4 + Reliability)] * 256
 RIP has a hop count
 BGP has …
BGP Attributes
(More Than Just Route Cost…)
 AS path
 Next hop
 Weight
 Local preference
 Multi-Exit Discriminator (MED)
 Community
 Atomic aggregate
 Origin
 Originator ID
 Cluster list
What Is an Attribute?
[Figure: attribute fields: ..., Next Hop, AS Path, MED, ..., ...]
 Properties associated with a prefix/route
 Used to determine the best path to a destination
when multiple paths exist
 Attribute Categories
Well-known, mandatory
Well-known, discretionary
Optional, transitive
Optional, non-transitive
AS-Path
Well-known, Mandatory, Code = 2
 Sequence of ASes a route has traversed
 Loop detection
 Apply policy
[Figure: AS 100 (180.10.0.0/16), AS 200 (170.10.0.0/16), AS 300, AS 400 (150.10.0.0/16), AS 500]
Prefix         AS path
180.10.0.0/16  300 200 100
170.10.0.0/16  300 200
150.10.0.0/16  300 400
Next Hop
Well-known, Mandatory, Code = 3
150.10.1.1
AS 200
150.10.0.0/16
150.10.1.2
A
B
AS 300
150.10.0.0/16 150.10.1.1
160.10.0.0/16 150.10.1.1
AS 100
160.10.0.0/16
 Next hop to reach a network
 Usually a local network is the
next hop in eBGP session
Next Hop
150.10.1.1
150.10.1.2
iBGP
AS 200
150.10.0.0/16
A
B
eBGP
C
AS 300
150.10.0.0/16 150.10.1.1
160.10.0.0/16 150.10.1.1
AS 100
160.10.0.0/16
Next hop not changed
Local Preference
Well-known, Discretionary, Code = 5
AS 100
160.10.0.0/16
AS 200
AS 300
D
500
800
B
A
160.10.0.0/16
> 160.10.0.0/16
500
800
E
AS 400
C
Local Preference
 Local to an AS
Local preference set to 100 when heard from
neighbouring AS
 Used to influence BGP path selection
Determines best path for outbound traffic
 Path with highest local preference wins
Local Preference
 Configuration of Router B:
router bgp 400
neighbor 220.5.1.1 remote-as 300
neighbor 220.5.1.1 route-map local-pref in
!
route-map local-pref permit 10
match ip address prefix-list MATCH
set local-preference 800
!
ip prefix-list MATCH permit 160.10.0.0/16
ip prefix-list MATCH deny 0.0.0.0/0 le 32
MULTI_EXIT_DISC (MED or Metric)
Optional, Non-transitive, Code = 4
 4 octets
 Used by a BGP speaker’s Decision Process to
discriminate among multiple entry points into a
neighboring autonomous system.
 If MED is missing, it is assumed MED=0
If bgp bestpath med missing-as-worst is configured, a missing MED is assumed to be the
MAXIMUM value
MULTI_EXIT_DISC (MED or Metric)
192.0.1.0 /24
MED = 10
Route with
lowest MED
wins!!
MED 20
How to Scale Routing Policy
 Communities!
 NOT in decision algorithm
 BGP route can be a member of many communities
 Really just a number for grouping prefixes.
 Typical communities:
Destinations learned from customers
Destinations learned from ISPs or peers
Destinations in VPN—BGP community is fundamental to
the operation of BGP VPNs
BGP Attributes: COMMUNITY
 Activated per neighbor/peer-group:
neighbor {peer-address | peer-group-name} send-community
 Carried across AS boundaries
 BGP community values are configured as a
32-bit number (old format) or as a 2x2 byte number
(new format).
 Common convention is string
of four bytes: <AS>:[0-65535]
IP BGP-Community New-Format
 Specifies that communities be displayed in a 4-byte AA:NN format
AA identifies the autonomous system
NN is a number that identifies the community within the autonomous system.
r2#show ip bgp 10.10.1.0/24
BGP routing table entry for 65001:100:10.10.1.0/24, version 9
<snip>
Community: 6553700
r2 (config)#ip bgp-community new-format
r2#show ip bgp 10.10.1.0/24
BGP routing table entry for 65001:100:10.10.1.0/24, version 9
<snip>
Community: 100:100
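The two displays above are the same 32-bit value: the upper 16 bits are AA and the lower 16 bits are NN. A short sketch of the conversion, with illustrative function names that are not IOS commands:

```python
def to_new_format(value: int) -> str:
    """Render a 32-bit community value as AA:NN."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("community must fit in 32 bits")
    return f"{value >> 16}:{value & 0xFFFF}"


def to_old_format(text: str) -> int:
    """Convert an AA:NN string back to the single 32-bit number."""
    aa, nn = (int(part) for part in text.split(":"))
    if not (0 <= aa <= 0xFFFF and 0 <= nn <= 0xFFFF):
        raise ValueError("AA and NN must each fit in 16 bits")
    return (aa << 16) | nn


print(to_new_format(6553700))    # the old-format value shown on this slide
print(to_old_format("100:100"))
```

Running it on the slide's values shows that 6553700 and 100:100 are the same community (100 * 65536 + 100).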
BGP Attributes: COMMUNITY (Cont.)
 Each destination can be a member of multiple
communities
 Using a route-map: set community
<1-4294967295>  community number
aa:nn           community number in aa:nn format
additive        Add to the existing community
none            No community attribute
local-AS        Do not send to EBGP peers (well-known community)
no-advertise    Do not advertise to any peer (well-known community)
no-export       Do not export outside AS/confed (well-known community)
BGP Path Selection Algorithm
BGP Path Selection Algorithm
 Do not consider path if no route to next hop
Example: Router learns a route from an eBGP peer and
then advertises to an iBGP peer. If the iBGP peer does not
know how to reach the next hop the route is rejected. iBGP
usually does not change the next hop.
 Do not consider iBGP path if not synchronized
Synchronization
A BGP Router Will Not Accept a Route from an iBGP Neighbor
Unless the Route Is Already in the IP Routing Table
Rtr B
Rtr A
Rtr C
iBGP
eBGP
172.16.0.0
• Rtr B does not know about 172.16.0.0;
therefore, Rtr C should not advertise
172.16.0.0 to Rtr D
• Redistribute 172.16.0.0 into IGP, use a full
iBGP mesh or disable synchronization if
iBGP path = physical path.
eBGP
Rtr D
BGP Path Selection Algorithm
 Highest weight (local to router)
 Highest local preference (global within AS)
 Prefer locally originated route (aggregate address)
 Shortest AS path
BGP Path Selection Algorithm (Cont.)
 Lowest origin code
IGP < EGP < incomplete
IGP – network command
EGP – learned via the Exterior Gateway Protocol (EGP)
Incomplete - redistribution
 Lowest Multi-Exit Discriminator (MED)
If bgp deterministic-med, order the paths before comparing
(not the default but recommend using it)
If bgp always-compare-med, then compare for all paths
otherwise MED only considered if paths are from the same
AS (default)
BGP Path Selection Algorithm (Cont.)
 Prefer eBGP path over iBGP path
 Path with lowest IGP metric to next-hop
 For eBGP paths
If multipath enabled, install N parallel paths in routing table
If router-ID is the same, go to next step
If router-ID not the same, select “oldest”
BGP Path Selection Algorithm (Cont.)
 Lowest router-id (originator-id for reflected routes)
 Shortest Cluster-List
Client must be aware of Route Reflector attributes!
 Lowest neighbor IP address
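The ordered steps on the preceding slides can be approximated with a single sort key. This is a deliberately simplified sketch, not the IOS implementation: it assumes every path already has a reachable next hop, compares MED across all paths (as if bgp always-compare-med were set), and skips the synchronization, multipath, "oldest path" and cluster-list steps. The `Path` class and its field names are hypothetical.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    name: str
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0
    router_id: str = "0.0.0.0"
    neighbor_ip: str = "0.0.0.0"


def _ip_key(address: str) -> tuple:
    """Turn a dotted quad into a tuple so addresses compare numerically."""
    return tuple(int(octet) for octet in address.split("."))


def sort_key(p: Path) -> tuple:
    # Lower tuple wins, so "highest wins" attributes are negated.
    return (
        -p.weight,                 # highest weight
        -p.local_pref,             # highest local preference
        not p.locally_originated,  # prefer locally originated
        len(p.as_path),            # shortest AS path
        ORIGIN_RANK[p.origin],     # IGP < EGP < incomplete
        p.med,                     # lowest MED
        not p.is_ebgp,             # eBGP over iBGP
        p.igp_metric,              # lowest IGP metric to next hop
        _ip_key(p.router_id),      # lowest router-id
        _ip_key(p.neighbor_ip),    # lowest neighbor address
    )


def best_path(paths):
    return min(paths, key=sort_key)


a = Path("a", local_pref=800, as_path=(300, 200, 100))
b = Path("b", local_pref=500, as_path=(300,))
print(best_path([a, b]).name)
```

Path `a` wins despite its longer AS path, because local preference is evaluated before AS-path length.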
Applying Policy with BGP
Constructing the Forwarding Table
Input policies
BGP in
process
in
discarded
accepted
everything
bgp
BGP
table
peer
forwarding
table
best paths
out
BGP out
process
output policies
Applying Policy with BGP
 Policy based on various attributes:
AS path
Community
Destination prefix
Many, many others…
 Reject/accept selected routes
 Set attributes to influence path selection
 Tools (IOS):
Distribute-list or prefix-list
Filter-list (as-path access-list)
Community-list
Route-maps (the Swiss army knife)
Policy Control - Prefix List
 Per-peer prefix filter, inbound or outbound
 Allows coverage for ranges of prefix lengths (ge, le)
 Based upon network numbers in NLRI (using
familiar IPv4 address/mask format)
 Example configuration:
router bgp 200
neighbor 220.200.1.1 remote-as 210
neighbor 220.200.1.1 prefix-list PEER-IN in
neighbor 220.200.1.1 prefix-list PEER-OUT out
!
ip prefix-list PEER-IN deny 218.10.0.0/16
ip prefix-list PEER-IN permit 0.0.0.0/0 le 32
ip prefix-list PEER-OUT permit 215.7.0.0/16
ip prefix-list PEER-OUT deny 0.0.0.0/0 le 32
Policy Control - Prefix List
a.b.c.d/x [ge | eq | le] y
care vs. don’t care bits
base prefix length to match
operator
operand
ip prefix-list PEER-IN permit 10.0.0.0/8 le 32
10.0.0.0/8 le 32 = all 10.x.x.x subnets, regardless of mask length
(e.g. 10.1.2.4/24, 10.1.1.1/32, 10.1.0.0/16)
Policy Control - Prefix List
 More Examples:
0.0.0.0/0 eq 32 = all /32 prefixes (e.g. 1.2.3.4/32)
192.168.1.0/24 = 192.168.1.0/24 eq 24
(ONLY 192.168.1.0/24)
172.16.0.0/16 ge 28 = all subnets from 172.16.0.0/16 that
have a mask length of /28 or greater (e.g. 172.16.4.0/28)
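The ge/le semantics in these examples can be sketched as a small matcher. It is IPv4 only and models a single prefix-list entry; `entry_matches` is a hypothetical name, and "eq N" from the slide is expressed here as ge=N together with le=N.

```python
import ipaddress


def entry_matches(entry_prefix, route, ge=None, le=None):
    """Return True if route matches one prefix-list entry.

    With neither ge nor le the mask length must match exactly; with only
    ge the allowed range is ge..32; with only le it is the entry's own
    length up to le.
    """
    base = ipaddress.ip_network(entry_prefix)
    cand = ipaddress.ip_network(route)
    if ge is None and le is None:
        low = high = base.prefixlen
    else:
        low = ge if ge is not None else base.prefixlen
        high = le if le is not None else 32
    # The route must fall inside the entry's address block and its mask
    # length must be within the permitted range.
    return cand.subnet_of(base) and low <= cand.prefixlen <= high


print(entry_matches("10.0.0.0/8", "10.1.0.0/16", le=32))
print(entry_matches("192.168.1.0/24", "192.168.1.128/25"))
print(entry_matches("172.16.0.0/16", "172.16.4.0/28", ge=28))
```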
Policy Control - Filter List
 Filter routes based on AS path
 Inbound or Outbound
 Example Configuration:
router bgp 100
neighbor 220.200.1.1 filter-list 5 out
neighbor 220.200.1.1 filter-list 6 in
!
ip as-path access-list 5 permit ^200$
ip as-path access-list 6 permit ^150$
Policy Control - Regular Expressions
 Simple Examples
.*          Match anything
.+          Match at least one character (cannot be empty)
^$          Match routes local to this AS (as-path is empty)
_1800$      Originated by 1800 (as-path ends with 1800)
^1800_      Received from 1800 (as-path starts with 1800)
_1800_      Via 1800 (1800 appears anywhere in the as-path)
_790_1800_  Passing through 1800 then 790
 For more information on regular expressions:
http://www.cisco.com/en/US/docs/ios/12_2/termserv/configuration/guide/tcfaapre_ps1835_TSD_Products_Configuration_Guide_Chapter.html
http://www.ccietalk.com/2008/07/25/cisco-regular-expression-characters
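The AS-path expressions above can be tried offline by translating the Cisco "_" metacharacter into a standard regular expression. A rough sketch, assuming "_" stands for start of string, end of string, space, comma, braces or parentheses, and that the AS path is given as a space-separated string; the helper names are illustrative:

```python
import re

# Python equivalent of the Cisco "_" delimiter metacharacter.
_DELIM = r"(?:^|$|[ ,{}()])"


def cisco_to_python(pattern: str) -> str:
    """Rewrite a Cisco as-path regex so Python's re module can run it."""
    return pattern.replace("_", _DELIM)


def as_path_matches(pattern: str, as_path: str) -> bool:
    return re.search(cisco_to_python(pattern), as_path) is not None


print(as_path_matches("_1800$", "300 1800"))       # ends with 1800
print(as_path_matches("_1800$", "300 21800"))      # 21800 is a different AS
print(as_path_matches("_790_1800_", "100 790 1800"))
```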
Policy Control – Setting Communities
 Example Configuration
router bgp 100
network 215.7.0.0
neighbor 220.200.1.1 remote-as 200
neighbor 220.200.1.1 send-community
neighbor 220.200.1.1 route-map set-community out
!
ip bgp-community new-format
!
route-map set-community permit 10
match ip address prefix-list NO-ANNOUNCE
set community no-export
!
route-map set-community permit 20
match ip address prefix-list EVERYTHING
!
ip prefix-list NO-ANNOUNCE permit 172.168.0.0/16 ge 17
ip prefix-list EVERYTHING permit 0.0.0.0/0 le 32
Policy Control – Matching Communities
 Example Configuration
router bgp 100
neighbor 220.200.1.2 remote-as 200
neighbor 220.200.1.2 route-map filter-on-community in
!
route-map filter-on-community permit 10
match community 1
set local-preference 50
!
route-map filter-on-community permit 20
match community 2 exact-match
set local-preference 200
!
ip community-list 1 permit 150:3 200:5
ip community-list 2 permit 88:6
Multi-protocol BGP
MP-BGP (RFC4760)
 Extension to the BGP protocol
 Carry routing information about other protocols:
IPv4 and IPv6 Unicast
IPv4/IPv6 + Label (RFC 3107, 6PE)
IPv4 and IPv6 Multicast
Multi-Protocol Label Switching (MPLS) VPN (IPv4 and IPv6)
Layer 2 VPN
…many others proposed
 Multi-Protocol Capabilities must be negotiated at
session setup time (important!)
MP-BGP Attributes
 New non-transitive and optional Border Gateway
Protocol (BGP) attributes
MP_REACH_NLRI
“Carry the set of reachable destinations together with
the next-hop information to be used for forwarding to
these destinations” (RFC4760)
MP_UNREACH_NLRI
Carry the set of unreachable destinations
 Note: NEXT_HOP has different format for different
AFI/SAFI
MP-BGP Attributes (Cont.)
 Attribute contains one or more triples:
Address Family Identifier (AFI) with Subsequent AFI (SAFI)
Identifies type of protocol information carried in the
Network Layer Reachability Information (NLRI) field
Next-hop information
Reachability/non-reachability information
MP-BGP Capabilities Negotiation
MP-BGP Capabilities Negotiation (Cont.)
 BGP router sends an OPEN message with CAPABILITIES parameter containing its capabilities:
Value  Description                     Reference
0      Reserved                        RFC 5492
1      Multiprotocol Extensions        RFC 2858
2      Route Refresh                   RFC 2918
3      Outbound Route Filtering        RFC 5291
4      Multiple Routes to Destination  RFC 3107
5      Extended Next Hop Encoding      RFC 5549
64     Graceful Restart                RFC 4724
65     4-octet AS number               RFC 4893
69     ADD-PATH                        draft-ietf-idr-add-paths
MP-BGP Session Establishment
AS 123
AS 321
BGP: 3FFE:B00:C18:2:1::1 sending OPEN, version 4, my as: 100
BGP: 3FFE:B00:C18:2:1::1 rcv OPEN, version 4
BGP: 3FFE:B00:C18:2:1::1 rcv OPEN w/ OPTION parameter len: 16
BGP: 3FFE:B00:C18:2:1::1 rcvd OPEN w/ optional parameter type 2 (Capability) len 6
BGP: 3FFE:B00:C18:2:1::1 OPEN has CAPABILITY code: 1, length 4
BGP: 3FFE:B00:C18:2:1::1 OPEN has MP_EXT CAP for afi/safi: 2/1
BGP: 3FFE:B00:C18:2:1::1 rcvd OPEN w/ optional parameter type 2 (Capability) len 2
BGP: 3FFE:B00:C18:2:1::1 went from OpenSent to OpenConfirm
BGP: 3FFE:B00:C18:2:1::1 went from OpenConfirm to Established
%BGP-5-ADJCHANGE: neighbor 3FFE:B00:C18:2:1::1 Up
BGP Load Balancing
Load Balancing
 BGP isn’t inherently designed to load-balance traffic
 By default, BGP chooses, installs, and advertises one
“best” route
 Attempting to balance traffic comes in two parts
Inbound traffic
Outbound traffic
 Load balancing is relatively trivial in some topologies
A pair of eBGP peers connected via multiple links
Two connections from one router to the same AS
 …but not others
Multi-homed to more than one provider
Single Path – eBGP Multihop
 Router A configuration:
interface loopback 0
ip address 1.1.1.1 255.255.255.255
!
router bgp 100
neighbor 2.2.2.2 remote-as 200
neighbor 2.2.2.2 update-source loopback0
neighbor 2.2.2.2 ebgp-multihop
!
ip route 2.2.2.2 255.255.255.255 serial 0
ip route 2.2.2.2 255.255.255.255 serial 1
!
• A must do a recursive lookup for 2.2.2.2
• A has two equal cost paths to 2.2.2.2
• A will load balance traffic over these two links
• B must be configured similarly for bidirectional load balancing
[Figure: Router A (AS 100) connected over two links to Router B (AS 200, Loopback 0 2.2.2.2/32)]
eBGP Multipath Support
100
200
A
 A peers with multiple routers in the same neighbor AS
 Install multiple routes in IP routing table
Use ‘maximum-paths ebgp’ command
 Routes must be identical in terms of LOCAL_PREF, AS_PATH, MED,
etc… (probably true if coming from the same AS)
 Outbound traffic will be split over these two links
 A still advertises one best path to peers
 Next-hop is set to self (using loopback interface)
Multi-Homed AS
AS 100
AS 300
D
A
C
B
AS 200
 Very common topology for many customers
 Customer wants to split traffic between AS 100 and AS 300
 Misconception: “I’ll make half of my routes preferred via AS 100 and the other half
through AS 300. Then I’ll have load-balancing!!”…no, you’ll have prefix splitting!
Multi-Homed AS
 Huge difference between “load balancing” and “prefix splitting”
 Traffic may be balanced perfectly…until traffic patterns change
 Some customers use this method but they are forced to change their
policies to accommodate for changes in traffic patterns
 For outbound balancing use
Weight
LOCAL_PREF (recommended)
 For inbound balancing use
Conditional-advertisement
AS_PATH prepending (may not work)
MEDs (may not work)
Communities and LOCAL_PREF (recommended…but requires upstream
coordination!)
BGP Multipath
 Multiple eBGP paths can be flagged as multipath as long as
the paths are “similar”
 “Similar” means that all relevant BGP attributes are a tie (up to
next-hop metric)
If paths 1 and 2 both have a local-pref of 200, MED of 300, etc… but the
Router-IDs are different then paths 1 and 2 are eligible for multipath
 These paths are installed in the RIB/FIB to load-balance
outbound traffic
 Multipath is the correct approach to a difficult problem but not
terribly useful because it can only be used in one specific
topology
iBGP multipath and Link-BW will help correct this
iBGP: Without Multipath
R2
R4
R1
AS 100
AS 200
10.0.0.0/8
R3
R5
 R1 has two paths for 10.0.0.0/8
 Both paths are identical in terms of localpref, med, IGP cost to next-hop, etc
 Router-ID, peer-address, etc are different but these are arbitrary in terms of
selecting a best path
 R1 will select one path as best and send all traffic for 10.0.0.0/8 towards one
of the exit points
iBGP Multipath
 Flag multiple iBGP paths as ‘multipath’
Each path must have a unique NEXT_HOP
 All multipaths are inserted into the RIB/FIB
 Number of multipaths can be controlled
maximum-paths ibgp <1-6>
 Still advertise a single bestpath
 Each BGP next-hop is resolved and mapped to
available IGP paths (not next-hop-self unless
routing follows forwarding)
 Supported on all IOS versions in past ~10 yrs
iBGP: With Multipath
 R1 has two paths for 10.0.0.0/8
 Both paths are flagged as “multipath”
[Figure: R1, R2, R3 (AS 100); R4, R5 (AS 200, 10.0.0.0/8)]
R1#sh ip bgp 10.0.0.0
200
20.20.20.3 from 20.20.20.3 (3.3.3.3)
Origin IGP, metric 0, localpref 100, valid, internal, multipath
200
20.20.20.2 from 20.20.20.2 (2.2.2.2)
Origin IGP, metric 0, localpref 100, valid, internal, multipath, best
iBGP Multipath
 These two paths are installed in the RIB/FIB
 Traffic is load-balanced across the two paths/exit points based on per-packet hash
 Depending on platform/version, there may or may not be multiple levels of load balancing (IGP + BGP)
R1#sh ip route 10.0.0.0
Routing entry for 10.0.0.0/8
  * 20.20.20.3, from 20.20.20.3, 00:00:09 ago
      Route metric is 0, traffic share count is 1
      AS Hops 1
    20.20.20.2, from 20.20.20.2, 00:00:09 ago
      Route metric is 0, traffic share count is 1
      AS Hops 1
R1#show ip cef 10.0.0.0
10.0.0.0/8, version 237, per-destination sharing
0 packets, 0 bytes
  via 20.20.20.3, 0 dependencies, recursive
    traffic share 1
    next hop 20.20.20.3, FastEthernet0/0 via 20.20.20.3/32
    valid adjacency
  via 20.20.20.2, 0 dependencies, recursive
    traffic share 1
    next hop 20.20.20.2, FastEthernet0/0 via 20.20.20.2/32
eiBGP Multipath
*Applies Only to the MPLS/VPN Case*
 The traffic destined to a site may be load shared
between all entry points.
From the MPLS/VPNs provider’s point of view, these
entry points may not all correspond to internal or
external peers.
The intent is for the MPLS/VPN network to be
transparent to the customers.
 The ability to consider both iBGP and eBGP paths,
when using multipath, is needed.
 Paths must match up to MED attribute
eiBGP Multipath
Example
 PE-2 has two possible
paths into Site-1
eiBGP Multipath allows
both paths to be used.
PE-1
PE-2
CE-3
Site-2
CE-1
CE-2
Site-1
SOO=100:65
maximum-paths eibgp <num>
Full Mesh iBGP
Full Mesh iBGP
 “If a particular AS has multiple BGP speakers and is
providing transit service for other ASes, then care
must be taken to ensure a consistent view of
routing within the AS. A consistent view of the
interior routes of the AS is provided by the IGP used
within the AS. For the purpose of this document, it
is assumed that a consistent view of the routes
exterior to the AS is provided by having all BGP
speakers within the AS maintain IBGP sessions
with each other.”
RFC 4271
Full Mesh iBGP
Why?
 Because BGP relies on the AS Path to prevent loops
 iBGP peers are in the same AS, so they do not add anything to the AS Path
 There’s no way to tell if a route advertised through several iBGP speakers is a loop!
Thus…
 If a router learns a route from an iBGP peer, it will not re-advertise that route to another iBGP peer
[Figure: routers A, B, C and D connected by iBGP sessions. Labels: Learns 10.1.1.0/24 eBGP; Advertises 10.1.1.0/24 iBGP; Learns 10.1.1.0/24 iBGP; Advertises 10.1.1.0/24 eBGP; Do not advertise 10.1.1.0/24 iBGP]
Full Mesh iBGP
 How scalable is using a full mesh
of iBGP speakers?
2 speakers == 1 session
3 speakers == 3 sessions
4 speakers == 6 sessions
5 speakers == 10 sessions
 n(n-1)/2 = O(n^2) sessions
 (n-1) sessions per speaker
 How can we better handle scale?
Confederations (yuck)
Route Reflectors (hooray!)
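The session counts on this slide follow directly from n(n-1)/2. A small sketch to extend the table to larger meshes; `full_mesh_sessions` is an illustrative name:

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions needed to fully mesh n speakers."""
    if n < 0:
        raise ValueError("speaker count cannot be negative")
    return n * (n - 1) // 2


for n in (2, 3, 4, 5, 100):
    print(f"{n} speakers == {full_mesh_sessions(n)} sessions, "
          f"{max(n - 1, 0)} per speaker")
```

At 100 speakers the mesh already needs 4950 sessions, which is the scaling problem the next slides address.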
Confederations
A
Sub-AS
65002
B
C
Sub-AS
65004
Sub-AS
65003
G
D
E
F
H
Sub-AS
65001
Confederation
100
BGP Route Reflectors
BGP Route Reflectors
 Route Reflector Basics
 Hierarchical Route Reflectors
 Deploying Route Reflectors
 Route Reflector Redundancy
Route Reflector Basics
 A route reflector is an
iBGP speaker that
reflects routes learned
from iBGP peers to
other iBGP peers
Route reflectors
 Route reflectors are
designated by
configuring some of
their iBGP peers as
route reflector clients
A
B
neighbor <A> route-reflector-client
neighbor <B> route-reflector-client
Route Reflector Basics
 A route reflector client
is just an iBGP speaker
Route reflectors
 There is no special
configuration for a
route reflector client
A
B
Route reflector client
neighbor <A> route-reflector-client
neighbor <B> route-reflector-client
Route Reflector Basics
 A cluster is a route
reflector and its clients
Route reflectors
 Route reflector
clusters may overlap
Cluster
A
B
Route reflector client
neighbor <A> route-reflector-client
neighbor <B> route-reflector-client
Route Reflector Basics
 A non-client is any route reflector iBGP peer that is not a route reflector client
 Each route reflector is also a non-client of each other route reflector in this network
 Route reflectors must be fully iBGP meshed with non-clients
[Figure: route reflectors, a cluster containing route reflector clients A and B, and a non-client]
neighbor <A> route-reflector-client
neighbor <B> route-reflector-client
Hierarchical Route Reflectors - Motivation
 All of the route
reflectors will need to
be fully meshed
Full iBGP
mesh
between
reflectors
Reflectors still follow the
normal rules of iBGP
route propagation
between themselves
 This full iBGP mesh
between reflectors can
still contain so many
routers that it presents
a scaling problem
as well
Hierarchical Route Reflectors
 To resolve this, route
reflectors can be deployed
in a hierarchy
Client and reflector
Cluster
 A single router can be
a reflector client and a
reflector
Cluster
Hierarchical Route Reflectors
 An unlimited number of tiers can be used
But very rare to see more than 3 levels
 Edges of route reflector tiers are a natural place to
reduce the amount of routing information being
carried in the lower tiers
RRs would be ABRs in “textbook” network design
 The same topology rule applies: The reflector
topology should follow the physical topology to
prevent loops and black holes
 RRs can lead to suboptimal routing because they
can hide full path information from clients (RRs can
advertise a single best path).
Route Reflector Basics
If a Route Reflector Receives a Route from an eBGP Peer:
 Send the route to all clients
 Send the route to all non-clients
[Figure: the route from the eBGP peer is sent to the clients and to the non-client iBGP peers]
Route Reflector Basics
If a Route Reflector Receives a Route from a Client:
 Reflect the route to all clients
Unless “no client-to-client reflection”
 Reflect the route to all non-clients
 Send the route to all eBGP peers
[Figure: the route from a client is reflected to the other client and to the non-client iBGP peers, and sent to the eBGP peer]
Route Reflector Basics
If a Route Reflector Receives a Route from a Non-Client:
 Reflect the route to all clients
 Send the route to all eBGP peers
[Figure: the route from a non-client iBGP peer is reflected to the clients and sent to the eBGP peer, but not to the other non-client iBGP peer]
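The three cases above can be collapsed into one decision function. The following Python sketch is illustrative only: `PeerType`, `advertise_to`, and the `client_to_client` flag are hypothetical names, and the eBGP-to-eBGP case comes from ordinary BGP behavior rather than from the slides.

```python
from enum import Enum


class PeerType(Enum):
    EBGP = "ebgp"
    CLIENT = "client"
    NON_CLIENT = "non-client"


def advertise_to(source: PeerType, client_to_client: bool = True) -> set:
    """Return the peer types a route reflector advertises a route to,
    given the type of peer the route was learned from."""
    if source is PeerType.EBGP:
        # Slide: send to all clients and all non-clients.
        # Other eBGP peers also receive it under normal BGP rules.
        return {PeerType.CLIENT, PeerType.NON_CLIENT, PeerType.EBGP}
    if source is PeerType.CLIENT:
        targets = {PeerType.NON_CLIENT, PeerType.EBGP}
        if client_to_client:
            # Skipped when client-to-client reflection is disabled.
            targets.add(PeerType.CLIENT)
        return targets
    # Learned from a non-client: never passed to other non-clients,
    # which is why non-clients must be fully meshed.
    return {PeerType.CLIENT, PeerType.EBGP}


if __name__ == "__main__":
    for src in PeerType:
        print(src.value, "->", sorted(t.value for t in advertise_to(src)))
```

The only case where a peer type is excluded is non-client to non-client, which is the normal iBGP rule and the reason reflectors still need a full mesh among themselves.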
Route Reflector Basics
 What we need is a mechanism to prevent loops
within the AS!
 RFC 4456 (which obsoletes RFC 2796) defines two BGP attributes to provide
loop detection within an AS
 Originator ID
Set to the router ID of the router injecting the route into
the AS
 Cluster List
Each route reflector the route passes through adds its
cluster ID to this list. Cluster ID = Router ID by default
Route Reflector Basics
 When reflecting a route, a route reflector always:
Creates a cluster list if one doesn’t exist and adds its router
ID (or configured cluster ID)
Sets the Originator ID, if one is not already present, to the router ID
of the peer it received the route from
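A minimal Python sketch of how these two attributes stop loops, under simplifying assumptions: the `Route` class and the `reflect` and `accept` helpers are hypothetical, and a route is reduced to a prefix plus the two attributes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    originator_id: Optional[str] = None
    cluster_list: Tuple[str, ...] = ()


def reflect(route: Route, sender_router_id: str, cluster_id: str) -> Route:
    """What a route reflector does to a route before reflecting it."""
    # Keep an existing Originator ID; otherwise the sending peer injected the route.
    originator = route.originator_id or sender_router_id
    # Add this reflector's cluster ID to the front of the cluster list.
    return Route(route.prefix, originator, (cluster_id,) + route.cluster_list)


def accept(route: Route, my_router_id: str, my_cluster_id: Optional[str] = None) -> bool:
    """Loop check on receipt. my_cluster_id is set only on route reflectors."""
    if route.originator_id == my_router_id:
        return False  # our own route came back to us
    if my_cluster_id is not None and my_cluster_id in route.cluster_list:
        return False  # the route already passed through our cluster
    return True


if __name__ == "__main__":
    r = reflect(Route("10.1.1.0/24"), sender_router_id="1.1.1.1", cluster_id="9.9.9.9")
    print(r)
    print(accept(r, "1.1.1.1"))             # originator sees its own route
    print(accept(r, "2.2.2.2", "9.9.9.9"))  # reflector in the same cluster
    print(accept(r, "2.2.2.2", "8.8.8.8"))  # reflector in another cluster
```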
Deploying Route Reflectors
 Use the divide and conquer approach to convert
from a full iBGP mesh to route reflectors
 Divide network into multiple clusters, using the
physical topology as a guide to the logical divisions
 Pick out one router to act as the reflector in each
cluster, making certain reflection follows the
physical topology
 Remove redundant iBGP sessions as you configure
reflectors in each cluster
Deploying Route Reflectors
 This small network has
nine routers, and 36 iBGP
sessions
A
Reflectors
 First, choose clusters
using the physical
topology as a guide
B
C
D
 Next choose reflectors
based on the physical
topology
F
E
G
J
H
Physical links
iBGP sessions
Deploying Route Reflectors
 Configure each client in a single cluster
A
 Remove extra iBGP sessions
 Start with B
B
C
D
F
E
G
J
H
neighbor <f> route-reflector-client
neighbor <h> route-reflector-client
neighbor <d> route-reflector-client
Physical links
iBGP sessions
Deploying Route Reflectors
 Next, configure G, E, and J as
route reflector clients of C
A
 Remove extra iBGP sessions
B
C
D
F
E
G
J
H
neighbor <g> route-reflector-client
neighbor <e> route-reflector-client
neighbor <j> route-reflector-client
Physical links
iBGP sessions
Deploying Route Reflectors
 The resulting network has nine
iBGP sessions along physical links
A
B
C
D
F
E
G
J
H
Physical links
iBGP sessions
Route Reflector Design and Redundancy
 A client may peer with more than one reflector,
in different clusters
A client that peers to only one reflector has a
single point of failure
Clients should peer to at least two reflectors to
provide redundancy
 How many reflectors should a single client be
peered to?
 Where should the RRs be placed in the network?
 How many RRs are needed?
Route Reflector Design and Redundancy
 Redundancy is needed but….
 Too much burns memory on RRCs because the
client learns the same information from each RR
 Also burns memory on the RRs because they learn
multiple paths for each route introduced by a RRC
 Two route reflectors per client should
be plenty…
 …but this is not a hard and fast rule
 As with everything else… “it depends”
PEs, RRs, SLAs, network size, network topology, etc.
Other sessions dedicated to this topic…
Scaling BGP Updates
Scaling BGP Updates
 Aggregation
 Peer Groups
 Input Queue Tuning
 Path MTU Discovery
Aggregation
 Why aggregate?
 Reduce number of Internet prefixes
Advertise only your CIDR block
According to some studies, about 50% of the current
Internet routing table represents “leakage past aggregates”
 Increase stability
If you aggregate properly, the aggregate will remain stable
even if specific components of the aggregate come and go
Perhaps your upstream provider will not allow the more
specifics (filter long prefixes, dampening)
Aggregation
 One of the easiest ways to scale eBGP is to aggregate routing information
 To configure aggregation in BGP, use the aggregate-address command
 Aggregated route is created if we have at least one component:
Components are the longer length prefixes that fall within the aggregate’s range
 By default:
The aggregate-address command only creates an aggregate
For the newly created aggregate route, the AS-PATH is empty and other attributes are the defaults for local routes
[Figure: routes in AS65100 and as received in AS65101]
AS65100
10.1.0.0/24 “65001”
10.1.1.0/24 “65001 65002”
10.1.0.0/22 “”
AS65101
10.1.0.0/24 “65100 65001”
10.1.1.0/24 “65100 65001 65002”
10.1.0.0/22 “65100”
aggregate-address 10.1.0.0 255.255.252.0
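The "at least one component" rule can be checked with Python's standard `ipaddress` module. This is a sketch of the rule only, not of any router implementation; `components` and `aggregate_is_created` are hypothetical helper names.

```python
import ipaddress


def components(aggregate: str, prefixes):
    """Return the prefixes that are longer than the aggregate and fall inside it."""
    agg = ipaddress.ip_network(aggregate)
    found = []
    for p in prefixes:
        net = ipaddress.ip_network(p)
        if (net.version == agg.version
                and net.prefixlen > agg.prefixlen
                and net.subnet_of(agg)):
            found.append(net)
    return found


def aggregate_is_created(aggregate: str, prefixes) -> bool:
    """The aggregate exists only while at least one component is present."""
    return bool(components(aggregate, prefixes))


if __name__ == "__main__":
    # "10.1.0.0 255.255.252.0" from the slide is the same network as 10.1.0.0/22.
    agg = "10.1.0.0/255.255.252.0"
    table = ["10.1.0.0/24", "10.1.1.0/24", "10.2.0.0/24"]
    print(ipaddress.ip_network(agg))
    print(components(agg, table))
    print(aggregate_is_created(agg, ["192.168.0.0/24"]))
```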
Aggregation
10.1.0.0/24 “65001”
10.1.1.0/24 “65001 65002”
10.1.0.0/22 “”
 Adding the keyword
summary-only
causes BGP to suppress
the components of
the aggregate
AS65100
Suppressed route = use it, but
do not advertise it to any peer
AS65101
10.1.0.0/22 “65100”
aggregate-address 10.1.0.0 255.255.252.0 summary-only
Aggregation
Adding the Keyword as-set Causes BGP to:
10.1.0.0/24 “65001”
10.1.1.0/24 “65001 65002”
10.1.0.0/22 “”
 For the aggregate, AS-PATH = AS
Set made by merging of all ASes
of all the components
 Additionally (not shown):
AS65100
Merge all the communities and
extended-communities of all
components
AS65101
10.1.0.0/22 “65100 {65001 65002}”
aggregate-address 10.1.0.0 255.255.252.0 summary-only as-set
Aggregation
10.1.0.0/24 “65001”
10.1.1.0/24 “65001 65002”
10.1.0.0/22 “” LP=200
 Use a route map to set the
aggregate’s other attributes.
AS65100
AS65101
10.1.0.0/22 “65100 {65001 65002}”
aggregate-address 10.1.0.0 255.255.252.0 summary-only as-set route-map foo
route-map foo
set local-preference 200
Aggregation
 Other aggregate commands
advertise-map <route-map>: to select which components
are considered as part of the aggregate
suppress-map <route-map>: to select which components
we want to suppress
neighbor … unsuppressed-map <route-map>: to
unsuppress (advertise) a suppressed component towards a
particular peer
Aggregation
 Creating an aggregate with an aggregate command
Adds AGGREGATOR attribute (troubleshooting info with
the IP and AS of the router that did the aggregation)
If as-set keyword is NOT used: Atomic Aggregate
attribute is also added (troubleshooting info that indicates
loss of AS Path information)
Peer Groups
 What is it?
A way to group peers with similar configuration
Configuration of neighbor is now done in 2 steps:
Define a peer-group like a neighbor
It has associated neighbor commands, policies, etc.
Define individual neighbors as a member of that
peer-group
All the configuration of the peer-group applies to
the member
 Reasons for using peer-groups:
Ease of administration
Scaling
Peer Groups
 Ease of administration:
Offering customers a few options in the number of routes
they receive, rather than filtering per customer
Classifying peering arrangements with other providers so
you only manage two or three types of connections
 Example for customer types:
cust-default—send default route only
cust-cust—send customer routes only
cust-full—send full Internet routes
Peer Groups
[Figure: peer groups in your AS (CIDR block: 1.0.0.0/8). Labels: core, route reflector, aggregation router (RR client), core peer group, client peer group, full routes peer group, “default” peer group, customer routes peer group.]
Peer Groups
NO PEER-GROUPS
router bgp 65000
neighbor 10.1.1.1 remote-as 65001
neighbor 10.1.1.1 route-map cust-receive in
neighbor 10.1.1.1 route-map cust-default out
neighbor 10.1.1.1 send-community
neighbor 10.1.1.2 remote-as 65002
neighbor 10.1.1.2 route-map cust-receive in
neighbor 10.1.1.2 route-map cust-default out
neighbor 10.1.1.2 send-community
neighbor 10.1.1.3 remote-as 65003
neighbor 10.1.1.3 route-map cust-receive in
neighbor 10.1.1.3 route-map cust-default out
neighbor 10.1.1.3 send-community
PEER-GROUPS
router bgp 65000
! Defining peer-groups
neighbor cust-default peer-group
neighbor cust-default route-map cust-receive in
neighbor cust-default route-map cust-default out
neighbor cust-default send-community
! Applying peer-groups to neighbors
neighbor 10.1.1.1 remote-as 65001
neighbor 10.1.1.1 peer-group cust-default
neighbor 10.1.1.2 remote-as 65002
neighbor 10.1.1.2 peer-group cust-default
neighbor 10.1.1.3 remote-as 65003
neighbor 10.1.1.3 peer-group cust-default
Peer Groups
 Peer groups also improve scaling
 Advertising 100,000+ routes to hundreds of peers
is a big challenge from a scalability point of view
(1) Each packet to each peer must be individually formatted
(2) Each packet to each peer must be individually transmitted
 Peer groups make it possible to do (1) only once for all
the members of the peer group
 GOLDEN RULE of peer-groups
Outbound policy MUST be the same for all members
Individual peers cannot be configured with outbound policy
Peer Groups
 Update generation without peer groups
BGP table is walked for every peer
Updates are generated and sent to each peer
 Update generation with peer groups
A peer-group leader is elected for each peer group
The BGP table is walked for the leader only
Updates are generated and transmitted for the peer-group
leader, then replicated and transmitted to the rest of the peer-group members
Peer Groups
 For the same amount of convergence time
Beyond Peer Groups
 Today peer-groups are not used but live in spirit
Peer-groups still can be configured
But we have decoupled its two functions:
Scalability: update-groups
Administration: peer templates
Beyond Peer Groups
 Update-groups
Software automatically groups neighbors that can be
included in the same update-group
Basically, all the neighbors that share outbound policy
Only one update is formatted for each update-group
To check how many update-groups and members are
created:
show ip bgp [<af>] update-group
Beyond Peer Groups
 Peer-templates
Configuration is similar to peer-groups
Define a peer-template with configuration commands
Individual neighbor is configured to inherit commands from peer-template
And additionally:
multiple peer-templates can be applied to a neighbor
peer-templates can be applied to another peer-template
No GOLDEN rule: individual peers can be configured with outbound policy
Two types of peer templates
peer-session: defines session commands (update independent)
Remote-as, update-source …
peer-policy: defines policy commands (associated to updates)
Route-map inbound, route-map outbound, remove-private-as, …
Neighbors can still be grouped in update-groups
If “total” outbound policy is the same
Beyond Peer Groups
 Peer-template (peer-policy) example
router bgp 1
template peer-policy ppol1
route-map map1 out
filter-list 1 in
inherit peer-policy ppol2 10
inherit peer-policy ppol3 5
template peer-policy ppol2
filter-list 2 in
distribute-list 2 in
route-reflector-client
template peer-policy ppol3
distribute-list 3 in
next-hop-self
address-family ipv4
neighbor 1.1.1.10 route-map map0 out
neighbor 1.1.1.10 inherit peer-policy ppol1
Neighbor 1.1.1.10 for IPV4 uses:
Route-map out = map0
distribute-list in = 2
Filter-list in = 1
It’s a route-reflector client
Uses next-hop self
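The effective policy for neighbor 1.1.1.10 can be worked out mechanically. The Python sketch below models the example as dictionaries. The precedence order is an assumption chosen to reproduce the result stated on the slide: inherited templates are applied in ascending sequence number (so the higher number wins), a template's own commands override what it inherits, and commands configured directly on the neighbor override everything. `TEMPLATES`, `resolve`, and `effective_policy` are hypothetical names.

```python
# Each template: its own commands plus (sequence, parent) inheritance entries.
TEMPLATES = {
    "ppol1": {
        "settings": {"route-map out": "map1", "filter-list in": 1},
        "inherit": [(10, "ppol2"), (5, "ppol3")],
    },
    "ppol2": {
        "settings": {"filter-list in": 2, "distribute-list in": 2,
                     "route-reflector-client": True},
        "inherit": [],
    },
    "ppol3": {
        "settings": {"distribute-list in": 3, "next-hop-self": True},
        "inherit": [],
    },
}


def resolve(name, templates=TEMPLATES):
    """Flatten a peer-policy template, assuming later (higher-sequence) wins."""
    tmpl = templates[name]
    merged = {}
    for _seq, parent in sorted(tmpl["inherit"]):
        merged.update(resolve(parent, templates))
    merged.update(tmpl["settings"])  # own commands beat inherited ones
    return merged


def effective_policy(direct, inherited_template, templates=TEMPLATES):
    """Commands set directly on the neighbor beat anything inherited."""
    merged = resolve(inherited_template, templates)
    merged.update(direct)
    return merged


if __name__ == "__main__":
    policy = effective_policy({"route-map out": "map0"}, "ppol1")
    for key in sorted(policy):
        print(key, "=", policy[key])
```

With these assumptions the result matches the slide: route-map out map0, distribute-list in 2, filter-list in 1, route-reflector client, and next-hop-self.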
Input Queue Tuning
 Large bursts of input packets may overflow the input hold-queue and produce input queue drops
BGP packets may be dropped when many BGP peers are reached via the
same interface (usually an Ethernet interface)
The final effect is that the achieved throughput is lower than the available
bandwidth (TCP congestion window is reduced)
 Solutions:
Increase input hold-queue:
hold-queue <1 – 4096> in (default is only 75!)
Give extra buffer for BGP packets (marked with precedence 6):
spd headroom <0-65535> (the default in the latest version, 2000, is good)
show [ip] spd to verify
Larger Input Queues
 For the same amount of convergence time
Results from increasing the interface
input queue depth from 75 (default) to 1000
TCP Path MTU Discovery
 MSS (Max Segment Size)
Largest segment that can traverse a TCP session
Does not include IP or TCP headers
MSS is 536 bytes by default (in multihop sessions)
Anything larger must be fragmented & re-assembled
 536 bytes is inefficient for Ethernet (1500) & POS (4470)
Increases the number of IP packets
Makes TCP work harder
Slows BGP convergence and reduces scalability
 Solution: ip tcp path-mtu-discovery
 Another helpful command:
show ip bgp neighbors | include max data
TCP Path MTU Discovery
 MSS increased from 536 to 1460 Bytes (GE)
 Configuring path MTU discovery between BGP peers can provide dramatic results in the speed of convergence
[Figure: supported peers (0 to 350) versus routes (80K to 120K), with and without PMTUD]
MSS Formula = Lowest MTU - IP overhead (20 bytes) – TCP overhead (20 bytes)
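As a worked example of the formula, the Python sketch below computes the MSS for a path and compares how many TCP segments a given amount of BGP update data needs at the 536-byte default versus the discovered value. The function names and the 1,000,000-byte payload are illustrative assumptions.

```python
IP_OVERHEAD = 20    # bytes, IPv4 header without options
TCP_OVERHEAD = 20   # bytes, TCP header without options
DEFAULT_MSS = 536   # default without path MTU discovery (from the slide)


def mss_for_path(link_mtus):
    """MSS = lowest MTU on the path - IP overhead - TCP overhead."""
    return min(link_mtus) - IP_OVERHEAD - TCP_OVERHEAD


def segments_needed(payload_bytes: int, mss: int) -> int:
    """Number of TCP segments required to carry the payload (ceiling division)."""
    return -(-payload_bytes // mss)


if __name__ == "__main__":
    print(mss_for_path([1500]))        # Ethernet
    print(mss_for_path([4470]))        # POS
    print(mss_for_path([4470, 1500]))  # mixed path: the lowest MTU wins
    payload = 1_000_000
    print(segments_needed(payload, DEFAULT_MSS))
    print(segments_needed(payload, mss_for_path([1500])))
```

For this payload the segment count drops from 1866 to 685, which is where the reduction in packets and TCP work comes from.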
BGP Fast Convergence
Faster Convergence
 Increased focus on faster BGP convergence
Critical for traffic (i.e. voice, video)
VPN customers want IGP like convergence
 Several factors influence BGP convergence
Detection
Propagation
Scalability
Stability
Faster Convergence
 Typically two scenarios where we need faster convergence
 Single route convergence
A bestpath change occurs for one prefix
How quickly can BGP propagate the change throughout the
network?
How quickly can the entire BGP network converge?
Key for VPNs and voice networks
 Bootup or “clear ip bgp *” convergence
Most stressful scenario for BGP
CPU may be busy for several minutes
Limiting factor in terms of scalability
Key for any router with a full Internet table and many peers
Convergence Basics – BGP Scanner
 BGP Scanner plays a key role in convergence
 Full BGP table scan happens every 60 seconds
bgp scan-time X
Affects only some AF-dependent tasks; most tasks are still performed every 60 seconds
 Full scan performs multiple housekeeping tasks
Validate nexthop reachability
Validate bestpath selection
Route redistribution and network statements
Conditional advertisement
Route dampening
BGP Database cleanup
 Import scanner runs once every 15 seconds
Imports VPNv4 routes into vrfs
bgp scan-time import X
Convergence Basics – BGP Nexthops
 Every 60 seconds the BGP scanner recalculates
best path for all prefixes
 Changes to the IGP cost of a BGP nexthop will go
unnoticed until scanner’s next run
IGP may converge in less than a second
BGP may not react for as long as 60 seconds
 Need to change from a polling model to an event
driven model to improve convergence
Polling model – Check each BGP nexthop’s IGP cost every
60 seconds
Event driven model – BGP is informed by a 3rd party
process when the IGP cost to a BGP nexthop changes
ATF – Address Tracking Filter
 ATF is a middle man between the RIB and
RIB clients
RIB clients: BGP, OSPF, EIGRP, etc.
 ATF and client interaction
Client tells ATF to register a given IP address (ex: an IP
address that is used as a BGP next-hop)
RIB notifies ATF of any route modification/creation/deletion
ATF notifies client if the lookup route associated to any
registered IP address changes/switches/appears/disappears
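The register-and-notify interaction can be sketched in a few lines of Python. This is a deliberately simplified model, not the IOS implementation: the `AddressTrackingFilter` class and its methods are hypothetical, and a client is notified whenever a changed prefix covers a registered address, without modeling longest-match lookups.

```python
import ipaddress


class AddressTrackingFilter:
    """Sits between the RIB and its clients and filters RIB change events."""

    def __init__(self):
        self._registered = {}  # ip address -> list of client callbacks

    def register(self, address: str, callback) -> None:
        """A client (for example BGP) asks to track an address such as a next hop."""
        addr = ipaddress.ip_address(address)
        self._registered.setdefault(addr, []).append(callback)

    def rib_change(self, prefix: str, event: str):
        """Called by the RIB on any route change. Returns the addresses notified."""
        net = ipaddress.ip_network(prefix)
        notified = []
        for addr, callbacks in self._registered.items():
            if addr in net:
                for cb in callbacks:
                    cb(str(addr), prefix, event)
                notified.append(str(addr))
        return notified


if __name__ == "__main__":
    seen = []
    atf = AddressTrackingFilter()
    for nexthop in ("10.1.1.3", "10.1.1.5"):
        atf.register(nexthop, lambda a, p, e: seen.append((a, p, e)))
    atf.rib_change("10.1.1.3/32", "modified")  # passed along to the client
    atf.rib_change("10.1.1.1/32", "deleted")   # filtered out
    print(seen)
```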
ATF – Address Tracking Filter
 BGP tells ATF to let us
know about any changes
to 10.1.1.3 and 10.1.1.5
 Changes to 10.1.1.3/32 and 10.1.1.5/32 are passed along to BGP
ATF filters out any changes for 10.1.1.1/32, 10.1.1.2/32, and 10.1.1.4/32
[Figure: BGP registers its nexthops (10.1.1.3 and 10.1.1.5) with ATF, which sits between BGP and the RIB. The RIB contains 10.1.1.1/32, 10.1.1.2/32, 10.1.1.3/32, 10.1.1.4/32, and 10.1.1.5/32.]
NHT – Next Hop Tracking
 BGP Next Hop Tracking
Enabled by default
[no] bgp nexthop trigger enable
 BGP registers all nexthops with ATF
Hidden command will let you see a list of nexthops
show ip bgp attr nexthop
 ATF will let BGP know when a route change occurs
(if of interest for a BGP nexthop)
 ATF notification will trigger a “lightweight BGP
Scanner” run
Bestpaths will be calculated
The rest of the other “Full Scan” work will NOT happen
NHT – Next Hop Tracking
 Once an ATF notification is received BGP waits 5
seconds (default) before triggering NHT scan
bgp nexthop trigger delay <0-100>
Configured value should be the maximum time it takes for
the IGP to converge
 Event driven model allows BGP to react quickly to
IGP changes
No longer need to wait as long as 60 seconds for BGP to
scan the table and recalculate bestpaths
Tuning your IGP for fast convergence is recommended
NHT – Next Hop Tracking
 Dampening is used to reduce frequency of
triggered scans
It prevents “lightweight BGP scanner” runs from happening too frequently
 show ip bgp internal
Displays data on when the last NHT scan occurred
Time until the next NHT may occur
 New commands
bgp nexthop trigger enable
bgp nexthop trigger delay <0-100>
show ip bgp attr next-hop ribfilter
debug ip bgp events nexthop
debug ip bgp rib-filter
Fast External Fallover
 Objective: Tear down the session if the interface to reach the
peering address goes down
No need to wait for the hold timer to expire!
 When does it work?
Only when peering address is directly connected
Only for eBGP peers
ebgp-multihop OR disable-connected-check can NOT be configured
 Configuration
ON by default
Under router bgp:
[no] bgp fast-external-fallover
Under interface (priority over router configuration):
[no] ip bgp fast-external-fallover {permit|deny}
 Recommended if interface goes down during failure
Fast Session Deactivation (FSD)
 Objective: Tear down the session if the route to reach the
peering address disappears
No need to wait for the hold timer to expire!
 How does it work?
BGP registers the peering address with ATF (similar to NHT)
It’s triggered immediately (trigger-delay = 0 and cannot be configured)
 Configuration:
OFF by default
Under router bgp: neighbor <neighbor-ip> fall-over
 Recommended for multihop eBGP peers known via IGP
 Very dangerous for iBGP peers
If we lose the route for a split second, we bring the peer down!
iBGP sessions usually re-route!
Scalability Update – Overview
 Bootup convergence (or convergence after “clear ip bgp *”) is
the biggest challenge
Must receive updates from all peers
Must compute all best-paths
Must format updates for all peers
Must transmit updates for all peers
 To improve the process:
Make sure that you don’t start computing best-paths till you have received
updates from all peers
All the peers will send you a KEEPALIVE (KA) or an End-of-RIB (EOR) marker when they have
finished sending you the updates
Maximum timer: bgp update-delay <1-3600> (default 120)
Increase it if your network takes a long time to converge
Depends on number of routes, number of peers and on specific platform
NSR – Non Stop Routing
 NSR and NSF (Non Stop Forwarding) are not the same
Both provide for a restarting speaker to continue forwarding
Usually, FIB is distributed and not affected while the main RP is
restarting
 NSF in a nutshell
Needs support of (NSF aware) peers
Peers are aware that restarting speaker keeps forwarding while
restarting and don’t delete the routes towards it.
BGP extensions required: GR (Graceful Restart)
Not a challenge within an AS
PE to CE is a problem
Upgrading CEs is a huge deployment challenge
NSR – Non Stop Routing
 NSR in a nutshell
Provides forwarding and
preserves routing during
Active RP failover to
Standby RP
This is called SSO
(stateful switchover)
BGP peers’ TCP
sessions are maintained
BGP extensions: NOT
required
CEs do not need to be
upgraded!
NSR – Non Stop Routing
 Deployment challenges:
NSF is easy to implement inside an AS
All the routers can be upgraded to support GR
The problem is the CEs (upgrading to support GR can be a
huge deployment challenge)
NSR is easier to implement
No need to upgrade CEs
PE uses NSR with CEs that are not NSF-aware
PE uses NSF with NSF-aware CEs
PE uses NSF with RRs (NSF-aware)
NSR – Non Stop Routing
 Simplified deployment
for service providers
Only PEs need to be
upgraded to support
NSR (incremental
deployment)
CEs are not touched!
(i.e., no software
upgrade required)
NSR – Related Commands
 show ip bgp vpnv4 all sso summary
 Used to display the number of BGP peers that
support Cisco BGP NSR
Router# show ip bgp vpnv4 all sso summary
Stateful switchover support enabled for 40
neighbors
Route Flap Dampening
 Defined in RFC 2439
 Route flap: The bouncing of a path or a change in
its characteristics
A flap ripples through the entire Internet
Consumes CPU cycles, causes instability
 Solution: Reduce scope of route flap propagation
Suppress oscillating routes (history predicts future
behavior)
Only eBGP routes are dampened
Route Flap Dampening
 Flap: every time we receive a withdrawal or a change
of attributes for a given route
Withdrawal: we increase the penalty by 1000
Change of attributes: we increase the penalty by 500
 To suppress (dampen a route):
Penalty accumulated must be greater than the
suppress-limit
 To reuse a route (“undampen” a route):
Penalty decreases exponentially
When it reaches reuse-limit, we use it again
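The penalty mechanics can be sketched in Python. The per-flap penalties (1000 and 500) come from the slide. The suppress limit of 2000, reuse limit of 750, and 15-minute half-life are assumed values used here for illustration, and the `DampenedRoute` class and `decayed` helper are hypothetical names.

```python
WITHDRAWAL_PENALTY = 1000        # from the slide
ATTRIBUTE_CHANGE_PENALTY = 500   # from the slide


def decayed(penalty: float, elapsed_min: float, half_life_min: float = 15.0) -> float:
    """Exponential decay: the penalty halves every half_life_min minutes."""
    return penalty * 0.5 ** (elapsed_min / half_life_min)


class DampenedRoute:
    def __init__(self, suppress_limit=2000.0, reuse_limit=750.0, half_life_min=15.0):
        # Limits and half-life are assumptions for this sketch.
        self.suppress_limit = suppress_limit
        self.reuse_limit = reuse_limit
        self.half_life_min = half_life_min
        self.penalty = 0.0
        self.suppressed = False
        self._last_update = 0.0

    def _decay_to(self, now_min: float) -> None:
        self.penalty = decayed(self.penalty, now_min - self._last_update,
                               self.half_life_min)
        self._last_update = now_min
        # Reuse the route once the penalty has decayed to the reuse limit.
        if self.suppressed and self.penalty <= self.reuse_limit:
            self.suppressed = False

    def flap(self, now_min: float, penalty: float) -> None:
        self._decay_to(now_min)
        self.penalty += penalty
        # Suppress only when the accumulated penalty exceeds the suppress limit.
        if self.penalty > self.suppress_limit:
            self.suppressed = True

    def is_suppressed(self, now_min: float) -> bool:
        self._decay_to(now_min)
        return self.suppressed


if __name__ == "__main__":
    route = DampenedRoute()
    for _ in range(3):
        route.flap(0.0, WITHDRAWAL_PENALTY)
    for t in (0.0, 15.0, 30.0):
        print(t, route.is_suppressed(t), route.penalty)
```

With these assumed limits, three withdrawals in quick succession suppress the route, and it becomes usable again two half-lives later when the penalty has fallen from 3000 to 750.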
Route Flap Dampening
[Figure: penalty (0 to 4000) plotted against time (0 to 25), with the suppress limit and the reuse limit marked. The network is announced, then not announced, then re-announced.]
Route Flap Dampening
 Benefits
Reduces CPU load
Does not propagate local flaps to the whole internet
Troubleshooting PLUS: Makes all these local flaps (routes
that have been suppressed) visible
Route Flap Dampening
 Guidelines RIPE-229:
“Progressive” dampening: more aggressive for longer
prefixes
Needs to be coordinated
Some parameters recommended
“golden” networks (like gTLD name servers) should not be
dampened
Apply as close as possible to the prefix being advertised
Peering, upstream, customer boundaries
No need to dampen routes from customers that use
Provider Aggregated addresses
Route Flap Dampening
 Guidelines RIPE-378:
Internet today:
A single “normal” withdrawal/update can be amplified into
many withdrawals/updates a few hops away
Route dampening would keep these prefixes
unreachable unnecessarily
Routers today:
Power makes them more tolerant to route flapping
Recommendation:
Do NOT implement route dampening
A Little BGP “Show and Tell”
[Figure: lab topology with R1 (Loop0: 10.1.1.1) and R2 (Loop0: 10.1.1.2) in AS 1, R3 (Loop0: 10.3.3.3) in AS 3, and R4 (Loop0: 10.4.4.4) in AS 4. Labeled subnets: 10.3.1.0/30, 10.4.1.0/30, and 10.1.2.0/24. Interface labels: 1/0 and 2/0. Internet prefixes: 10.201-249 /16.]