Computer Networks (Graduate level)
Lecture 7: Inter-domain Routing
University of Tehran, Dept. of EE and Computer Engineering
By: Dr. Nasser Yazdani
Univ. of Tehran Computer Network

Inter-Domain Routing
Border Gateway Protocol (BGP)
• Assigned reading
  • [LAB00] Delayed Internet Routing Convergence
• Sources
  • RFC1771: main BGP RFC
  • RFC1772-3-4: application, experiences, and analysis of BGP
  • RFC1965: AS confederations for BGP
  • Christian Huitema: “Routing in the Internet”, chapters 8 and 9
  • John Stewart III: “BGP4 - Inter-domain routing in the Internet”

Outline
• External BGP (E-BGP)
• Internal BGP (I-BGP)
• Multi-Homing
• Stability Issues

Internet’s Area Hierarchy
• What is an Autonomous System (AS)?
  • A set of routers under a single technical administration, using an interior gateway protocol (IGP) and common metrics to route packets within the AS, and using an exterior gateway protocol (EGP) to route packets to other AS’s
• Sometimes AS’s use multiple IGPs and metrics, but appear as single AS’s to other AS’s
• Each AS is assigned a unique ID
• AS’s peer at network exchange points to exchange routing information

Example
[Figure: example topology of five AS’s (1 to 5); an IGP runs inside each AS and EGP runs on the links between AS’s]

History
• Mid-80s: EGP
  • Reachability protocol (no shortest path)
  • Did not accommodate cycles (tree topology)
  • Evolved when all networks connected to the NSF backbone
• Result: BGP introduced as routing protocol
  • Latest version = BGP 4
  • BGP-4 supports CIDR
  • Primary objective: connectivity, not performance

Choices
• Link state or distance vector?
• Problems with distance vector:
  • No universal metric: policy decisions
  • Bellman-Ford algorithm may not converge
• Problems with link state:
  • Metric used by routers not the same: loops
  • LS database too large: entire Internet
  • May expose policies to other AS’s

Solution: Distance Vector with Path
• Each routing update carries the entire path
• Loops are detected as follows:
  • When an AS gets a route, it checks whether the AS is already in the path
  • If yes, reject the route
  • If no, add self and (possibly) advertise the route further
• Advantage: metrics are local; the AS chooses the path, the protocol ensures no loops

Interconnecting BGP Peers
• BGP uses TCP to connect peers
• Advantages:
  • Simplifies BGP
  • No need for periodic refresh: routes are valid until withdrawn, or the connection is lost
  • Incremental updates
• Disadvantages:
  • Congestion control on a routing protocol?
  • Poor interaction during high load

Hop-by-hop Model
• BGP advertises to neighbors only those routes that it uses
• Consistent with the hop-by-hop Internet paradigm
  • e.g., AS1 cannot tell AS2 to route to other AS’s in a manner different than what AS2 has chosen (need source routing for that)

AS Categories
• Stub: an AS that has only a single connection to one other AS; carries only local traffic
• Multi-homed: an AS that has connections to more than one AS, but does not carry transit traffic
• Transit: an AS that has connections to more than one AS, and carries both transit and local traffic (under certain policy restrictions)

AS Categories
[Figure: three small topologies labeled Transit, Stub and Multi-homed, showing how AS2 connects to AS1 and AS3 in each case]

Policy with BGP
• BGP provides the capability for enforcing various policies
• Policies are not part of BGP: they are provided to BGP as configuration information
• BGP enforces policies by choosing paths from multiple alternatives and controlling advertisement to other AS’s
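
The loop-detection rule from the "Distance Vector with Path" slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `receive_route`, the tuple representation of a path, and the AS numbers are hypothetical and do not come from the lecture.

```python
from typing import Optional, Tuple

AsPath = Tuple[int, ...]


def receive_route(local_as: int, as_path: AsPath) -> Optional[AsPath]:
    """Apply the path-vector rule to one received route.

    Returns the path to advertise further, or None if the route is rejected.
    """
    if local_as in as_path:
        # Our own AS number is already in the path: accepting would form a loop.
        return None
    # Otherwise add ourselves to the front before (possibly) re-advertising.
    return (local_as,) + as_path


# Hypothetical AS numbers, chosen only for the example.
accepted = receive_route(65001, (65002, 65003))
rejected = receive_route(65001, (65002, 65001, 65003))
print(accepted, rejected)
```

Note that the check needs no metric at all, which is why each AS is free to rank paths by its own local policy.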

Examples of BGP Policies
• A multi-homed AS refuses to act as transit
  • Limit path advertisement
• A multi-homed AS can become transit for some AS’s
  • Only advertise paths to some AS’s
• An AS can favor or disfavor certain AS’s for traffic transit from itself

Routing Information Bases (RIB)
• Routes are stored in RIBs
• Adj-RIBs-In: routing info that has been learned from other routers (unprocessed routing info)
• Loc-RIB: local routing information selected from Adj-RIBs-In (routes selected locally)
• Adj-RIBs-Out: info to be advertised to peers (routes to be advertised)

BGP Common Header
• Marker (security and message delineation): 16 bytes
• Length: 2 bytes
• Type: 1 byte
• Types: OPEN, UPDATE, NOTIFICATION, KEEPALIVE

BGP OPEN message
• Fields after the common header (Type: OPEN): version, My autonomous system, Hold time, BGP identifier, Parameter length, Optional parameters <type, length, value>
• My AS: ID assigned to that AS
• Hold timer: max interval between KEEPALIVE or UPDATE messages; a zero interval implies no KEEPALIVE messages
• BGP ID: IP address of one interface (same for all messages)

BGP UPDATE message
• Fields after the common header (Type: UPDATE): Withdrawn routes len, Withdrawn routes (variable), Path attribute len, Path attributes (variable), Network layer reachability information (NLRI) (variable)
• Many prefixes may be included in an UPDATE, but they must share the same attributes
• An UPDATE message may report multiple withdrawn routes

BGP UPDATE Message
• List of withdrawn routes
• Network layer reachability information
  • List of reachable prefixes
• Path attributes
  • Origin
  • Path
  • Metrics
• All prefixes advertised in a message have the same path attributes
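
The 19-byte common header described on the "BGP Common Header" slide (16-byte marker, 2-byte length, 1-byte type) can be decoded as in the sketch below. The numeric type codes 1 to 4 are the ones defined in RFC 1771, which the lecture cites but does not spell out; the function name `parse_header` and the sample bytes are hypothetical.

```python
import struct

# Type codes as defined in RFC 1771 (not listed on the slide itself).
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}
HEADER_LEN = 19  # 16-byte marker + 2-byte length + 1-byte type


def parse_header(data: bytes):
    """Split the fixed BGP header into (marker, length, type name)."""
    if len(data) < HEADER_LEN:
        raise ValueError("a BGP header needs at least 19 bytes")
    # "!" selects network byte order with no padding between the fields.
    marker, length, type_code = struct.unpack("!16sHB", data[:HEADER_LEN])
    return marker, length, MESSAGE_TYPES.get(type_code, "UNKNOWN")


# A KEEPALIVE is just the header: all-ones marker, length 19, type 4.
sample = b"\xff" * 16 + struct.pack("!HB", 19, 4)
marker, length, name = parse_header(sample)
print(length, name)
```

Because BGP runs over a TCP byte stream, the length field is what lets a receiver find where one message ends and the next begins.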

NLRI
• Network Layer Reachability Information
  • A list of IP address prefixes, each encoded as: Length (1 byte), Prefix (variable)

Path attributes
• Type-Length-Value encoding
  • Attribute type (2 bytes)
  • Attribute length (1-2 bytes)
  • Attribute value (variable length)
• Attribute type field
  • Attribute flags (1 byte)
  • Attribute type code (1 byte)
• Flags: optional vs. well-known, transitive, partial, extended length

BGP NOTIFICATION message
• Fields after the common header (Type: NOTIFICATION): Error code, Error sub-code, Data
• Used for error notification
• The TCP connection is closed immediately after the notification

BGP KEEPALIVE message
• Consists of the common header only (Type: KEEPALIVE)
• Sent periodically to peers to ensure connectivity
• If hold_time is zero, messages are not sent
• Sent in place of an UPDATE message

Path Selection Criteria
• Information based on path attributes
• Attributes + external (policy) information
• Examples:
  • Hop count
  • Policy considerations
    • Preference for AS
    • Presence or absence of certain AS
  • Path origin
  • Link dynamics

Route Selection Summary
In order of precedence:
• Highest Local Preference (enforce relationships)
• Shortest ASPATH (traffic engineering)
• Lowest MED (traffic engineering)
• i-BGP < e-BGP (traffic engineering)
• Lowest IGP cost to BGP egress (traffic engineering)
• Lowest router ID (throw up hands and break ties)
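
The ordering in the Route Selection Summary can be expressed as a sort key, as in the minimal sketch below. The `Route` fields and function names are hypothetical, "i-BGP < e-BGP" is read here as "prefer routes learned over E-BGP", and the sketch compares MED across all routes, which is a simplification of real implementations.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Route:
    local_pref: int
    as_path: Tuple[int, ...]
    med: int
    learned_via_ebgp: bool
    igp_cost: int
    router_id: int


def preference_key(route: Route):
    """Smaller key means more preferred; fields follow the slide's order."""
    return (
        -route.local_pref,                     # highest local preference first
        len(route.as_path),                    # then shortest AS path
        route.med,                             # then lowest MED
        0 if route.learned_via_ebgp else 1,    # then E-BGP over I-BGP
        route.igp_cost,                        # then lowest IGP cost to egress
        route.router_id,                       # finally lowest router ID
    )


def best_route(routes: Iterable[Route]) -> Route:
    return min(routes, key=preference_key)


# Hypothetical candidates for a single prefix.
a = Route(100, (1, 2, 3), 0, True, 10, 1)
b = Route(90, (1,), 0, True, 5, 2)
c = Route(100, (7, 8), 0, False, 10, 3)
print(best_route([a, b]) == a, best_route([a, b, c]) == c)
```

Route `a` beats `b` despite its longer path because local preference is compared first; `c` then beats `a` on AS path length once local preference ties.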

Back to Frank …
[Figure: AS 1 originates 13.13.0.0/16; its neighbors AS 2, AS 3 and AS 4 are labeled customer, peer and provider, and routes from them are assigned local pref values of 100, 90 and 80]
• Local preference is only used in iBGP
• Higher local preference values are more preferred

Implementing Backup Links with Local Preference (Outbound Traffic)
[Figure: AS 65000 connects to AS 1 over a primary link and a backup link]
• Primary link: set Local Pref = 100 for all routes from AS 1
• Backup link: set Local Pref = 50 for all routes from AS 1
• Forces outbound traffic to take the primary link, unless the link is down
• We’ll talk about inbound traffic soon …

Multihomed Backups (Outbound Traffic)
[Figure: AS 2 connects to provider AS 1 over a primary link and to provider AS 3 over a backup link]
• Set Local Pref = 100 for all routes from AS 1
• Set Local Pref = 50 for all routes from AS 3
• Forces outbound traffic to take the primary link, unless the link is down

ASPATH Attribute
[Figure: prefix 135.207.0.0/16 is originated by AS 6341 (AT&T Research) and propagates through AS 7018 (AT&T), AS 1239 (Sprint), AS 1755 (Ebone), AS 1129 (Global Access), AS 3549 (Global Crossing) and AS 12654 (RIPE NCC RIS project). The AS Path grows at each hop: 6341; 7018 6341; 1239 7018 6341; 1755 1239 7018 6341; 1129 1755 1239 7018 6341; and, via Global Crossing, 3549 7018 6341]

COMMUNITY Attribute to the Rescue!
[Figure: customer AS 2 has providers AS 1 and AS 3; it announces 192.0.2.0/24 with ASPATH = 2 on its primary link, and with ASPATH = 2, COMMUNITY = 3:70 on its backup link to AS 3]
• AS 3: normal customer local pref is 100, peer local pref is 90
• Customer import policy at AS 3:
  • If 3:90 in COMMUNITY then set local preference to 90
  • If 3:80 in COMMUNITY then set local preference to 80
  • If 3:70 in COMMUNITY then set local preference to 70

Hot Potato Routing: Go for the Closest Egress Point
[Figure: a router with IGP distances 15 and 56 to two egress points, egress 1 and egress 2, for 192.44.78.0/24]
• This router has two BGP routes to 192.44.78.0/24
• Hot potato: get traffic off of your network as soon as possible. Go for egress 1!
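
The customer import policy at AS 3 from the COMMUNITY slide amounts to a small lookup, sketched below. The function name `import_local_pref` and the use of plain strings for community values are assumptions made for the example; the values 100, 90, 80 and 70 come from the slide.

```python
from typing import Iterable

# Default for customer routes at AS 3, as stated on the slide.
CUSTOMER_DEFAULT_LOCAL_PREF = 100

# Community value -> local preference, in the order the slide lists them.
COMMUNITY_TO_LOCAL_PREF = {"3:90": 90, "3:80": 80, "3:70": 70}


def import_local_pref(communities: Iterable[str]) -> int:
    """Return the local preference AS 3 assigns to a customer route."""
    attached = set(communities)
    for community, local_pref in COMMUNITY_TO_LOCAL_PREF.items():
        if community in attached:
            return local_pref
    return CUSTOMER_DEFAULT_LOCAL_PREF


# The backup announcement carries 3:70; the primary carries no community.
print(import_local_pref(["3:70"]), import_local_pref([]))
```

With 3:70 attached, the customer's backup route ranks below AS 3's peer routes (local pref 90), so AS 3 sends traffic over the backup link only when no better route exists.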

Getting Burned by the Hot Potato
[Figure: a high-bandwidth provider backbone and a low-bandwidth customer backbone, both spanning SFF, San Diego and NYC, with a heavy content web farm attached; link costs shown are 2865, 17, 15 and 56]
• Many customers want their provider to carry the bits!
• Tiny http request, huge http reply

Cold Potato Routing with MEDs (Multi-Exit Discriminator Attribute)
[Figure: the same topology; the heavy content web farm's prefix 192.44.78.0/24 is announced with MED = 15 at one exit and MED = 56 at the other]
• Prefer lower MED values
• This means that MEDs must be considered BEFORE IGP distance!
• Note 1: some providers will not listen to MEDs
• Note 2: MEDs need not be tied to IGP distance

Route Selection Summary
In order of precedence:
• Highest Local Preference (enforce relationships)
• Shortest ASPATH (traffic engineering)
• Lowest MED (traffic engineering)
• i-BGP < e-BGP (traffic engineering)
• Lowest IGP cost to BGP egress (traffic engineering)
• Lowest router ID (throw up hands and break ties)
This is somewhat simplified. Hey, what happened to ORIGIN??

Policies Can Interact Strangely (“Route Pinning” Example)
[Figure: four snapshots of a customer with a primary and a backup link]
1. Install backup link using community
2. Disaster strikes primary link and the backup takes over
3. Primary link is restored, but some traffic remains pinned to backup

Path Attributes
• Categories (recall flags):
  • well-known mandatory (passed on)
  • well-known discretionary (passed on)
  • optional transitive (passed on)
  • optional non-transitive (if unrecognized, not passed on)
• Optional attributes allow for BGP extensions

Path attribute message format (repeated)
• Attribute flags (O T P E, remaining bits 0), followed by the attribute type code
  • O: optional or well-known
  • T: transitive or local
  • P: partially evaluated
  • E: length in 1 or 2 bytes
• Attribute type codes: Origin, AS_path, Next hop, etc.

ORIGIN path attribute
• Well-known, mandatory attribute
• Describes how a prefix was generated at the origin AS. Possible values:
  • IGP: prefix learned from IGP
  • EGP: prefix learned through EGP
  • INCOMPLETE: none of the above (often seen for static routes)
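
The O, T, P and E flags from the "Path attribute message format" slide occupy the four high-order bits of the flags byte. The sketch below decodes them using the bit positions from the BGP specification (O = 0x80, T = 0x40, P = 0x20, E = 0x10); the function names are hypothetical.

```python
def decode_flags(flags: int) -> dict:
    """Decode the one-byte attribute flags field into its four named bits."""
    return {
        "optional": bool(flags & 0x80),         # O: optional (1) or well-known (0)
        "transitive": bool(flags & 0x40),       # T: transitive (1) or not (0)
        "partial": bool(flags & 0x20),          # P: partially evaluated
        "extended_length": bool(flags & 0x10),  # E: length is 2 bytes instead of 1
    }


def category(flags: int) -> str:
    """Map the flags to the categories on the Path Attributes slide.

    Mandatory vs. discretionary is not encoded in the flags, so well-known
    attributes are reported without that distinction.
    """
    decoded = decode_flags(flags)
    if not decoded["optional"]:
        return "well-known"
    if decoded["transitive"]:
        return "optional transitive"
    return "optional non-transitive"


print(category(0x40), "|", category(0xC0), "|", category(0x80))
```

A router that does not recognize an attribute can still decide what to do with it from these bits alone, which is what makes optional attributes a workable extension mechanism.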

AS_PATH attribute
• Well-known, mandatory attribute
• Important components:
  • List of traversed AS’s
• If forwarding to an internal peer:
  • Do not modify the AS_PATH attribute
• If forwarding to an external peer:
  • Prepend self into the path

Next hop path attribute
• Well-known, mandatory attribute
• NEXT_HOP: IP address of the border router to be used as next hop
• Usually, the next hop is the router sending the UPDATE message
• Useful when some routers do not speak BGP

Example of NEXT_HOP
[Figure: routers A and B speak BGP and exchange an UPDATE message; router C does not speak BGP and attaches 138.39.0.0/16; traffic to 138.39.0.0/16 is directed to C]

LOCAL PREF
• Local (within an AS) mechanism to provide relative priority among BGP routers
[Figure: AS 256 contains R3 and R4, connected by I-BGP, with Local Pref = 500 on one and Local Pref = 800 on the other; R1, R2 and R5 sit in neighboring AS 100, AS 200 and AS 300]

AS_PATH
• List of traversed AS’s
[Figure: AS 100 originates 180.10.0.0/16 and AS 200 originates 170.10.0.0/16; AS 500 learns them through AS 300 as 180.10.0.0/16 with path 300 200 100 and 170.10.0.0/16 with path 300 200]

CIDR and BGP
[Figure: AS T (provider) holds 197.8.0.0/23 and connects customers AS X (197.8.2.0/24) and AS Y (197.8.3.0/24) to AS Z]
• What should T announce to Z?

Options
• Advertise all paths:
  • Path 1: through T can reach 197.8.0.0/23
  • Path 2: through T can reach 197.8.2.0/24
  • Path 3: through T can reach 197.8.3.0/24
• But this does not reduce routing tables! We would like to advertise:
  • Path 1: through T can reach 197.8.0.0/22

Sets and Sequences
• Problem: what do we list in the route?
  • List T: omitting information is not acceptable, may lead to loops
  • List T, X, Y: misleading, appears as a 3-hop path
• Solution: restructure the AS Path attribute as:
  • Path: (Sequence (T), Set (X, Y))
  • If Z wants to advertise the path:
    • Path: (Sequence (Z, T), Set (X, Y))
  • In practice used only if paths in the set have the same attributes
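
The aggregation from the "Options" slide and the sequence-plus-set path from the "Sets and Sequences" slide can be checked with Python's standard `ipaddress` module. This is a sketch only: representing the path as a (tuple, frozenset) pair and the helper name `prepend` are assumptions for illustration.

```python
import ipaddress

# Prefixes from the CIDR and BGP slide, keyed by the AS that holds them.
prefixes = {
    "T": ipaddress.ip_network("197.8.0.0/23"),
    "X": ipaddress.ip_network("197.8.2.0/24"),
    "Y": ipaddress.ip_network("197.8.3.0/24"),
}

# collapse_addresses merges adjacent networks into the smallest covering set.
aggregate = list(ipaddress.collapse_addresses(prefixes.values()))

# AS path for the aggregate: an ordered sequence plus an unordered set.
path_from_t = (("T",), frozenset({"X", "Y"}))


def prepend(local_as, path):
    """Prepend an AS to the sequence part, leaving the set part untouched."""
    sequence, as_set = path
    return ((local_as,) + sequence, as_set)


path_from_z = prepend("Z", path_from_t)
print(aggregate, path_from_z[0], sorted(path_from_z[1]))
```

Keeping X and Y in the set preserves loop detection for them, while only the sequence part contributes hops that look like path length.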

Multi-Exit Discriminator (MED)
• Hint to external neighbors about the preferred path into an AS
  • Non-transitive attribute (we will see later why)
  • Different AS’s choose different scales
• Used when two AS’s connect to each other in more than one place

MED
• Hint to R1 to use the R3 link over the R4 link
• Cannot compare AS 40’s values to AS 30’s
[Figure: R1 in AS 10 receives 180.10.0.0 from R2 in AS 40 with MED = 50, and from R3 and R4 in AS 30 with MED = 120 and MED = 200]

MED
• MED is typically used in provider/subscriber scenarios
• It can lead to unfairness if used between ISPs because it may force one ISP to carry more traffic:
[Figure: ISP1 and ISP2 interconnect in SF and NY]
  • ISP1 ignores MED from ISP2
  • ISP2 obeys MED from ISP1
  • ISP2 ends up carrying traffic most of the way

Other Attributes
• ORIGIN
  • Source of route (IGP, EGP, other)
• NEXT_HOP
  • Address of next hop router to use
  • Used to direct traffic to a non-BGP router
• Check out http://www.cisco.com for a full explanation

Decision Process
• Processing order of attributes:
  • Select route with highest LOCAL-PREF
  • Select route with shortest AS-PATH
  • Apply MED (if routes learned from same neighbor)

Outline
• External BGP (E-BGP)
• Internal BGP (I-BGP)
• Multi-Homing
• Stability Issues

Internal vs. External BGP
• BGP can be used by R3 and R4 to learn routes
• How do R1 and R2 learn routes?
  • Option 1: Inject routes in IGP
    • Only works for small routing tables
  • Option 2: Use I-BGP
[Figure: AS1 contains R1, R2 and R3; R3 has an E-BGP session with R4 in AS2]

I-BGP

Internal BGP (I-BGP)
• Same messages as E-BGP
• Different rules about re-advertising prefixes:
  • A prefix learned from E-BGP can be advertised to an I-BGP neighbor and vice-versa, but
  • A prefix learned from one I-BGP neighbor cannot be advertised to another I-BGP neighbor
  • Reason: no AS PATH within the same AS and thus danger of looping

Internal BGP (I-BGP)
• R3 can tell R1 and R2 prefixes from R4
• R3 can tell R4 prefixes from R1 and R2
• R3 cannot tell R2 prefixes from R1
• R2 can only find these prefixes through a direct connection to R1
• Result: I-BGP routers must be fully connected (via TCP)!
  • Contrast with E-BGP sessions, which map to physical links
[Figure: AS1 contains R1, R2 and R3 connected by I-BGP; R3 has an E-BGP session with R4 in AS2]

Link Failures
• Two types of link failures:
  • Failure on an E-BGP link
  • Failure on an I-BGP link
• These failures are treated completely differently in BGP
• Why?

Failure on an E-BGP Link
• If the link R1-R2 goes down:
  • The TCP connection breaks
  • BGP routes are removed
• This is the desired behavior
[Figure: R1 in AS1 (138.39.1.1/30) and R2 in AS2 (138.39.1.2/30) share a physical link that carries the E-BGP session]

Failure on an I-BGP Link
• If link R1-R2 goes down, R1 and R2 should still be able to exchange traffic
• The indirect path through R3 must be used
• Thus, E-BGP and I-BGP must use different conventions with respect to TCP endpoints
[Figure: R1 (138.39.1.1/30) and R2 (138.39.1.2/30) share a physical link and an I-BGP connection; R3 provides an alternate path between them]

Outline
• External BGP (E-BGP)
• Internal BGP (I-BGP)
• Multi-Homing
• Stability Issues

Multi-homing
• With multi-homing, a single network has more than one connection to the Internet
• Improves reliability and performance:
  • Can accommodate link failure
  • Bandwidth is the sum of links to the Internet
• Challenges:
  • Getting policy right (MED, etc.)
  • Addressing
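
The I-BGP re-advertising rule and its full-mesh consequence can be made concrete with the sketch below. The function names and the "ebgp"/"ibgp" string labels are hypothetical; the E-BGP to E-BGP case is not listed on the slide and is assumed here to be allowed, as it is for a transit AS.

```python
def may_advertise(learned_from: str, send_to: str) -> bool:
    """Decide whether a prefix may be passed on, per the I-BGP slide.

    Both arguments are either "ebgp" or "ibgp". Only the combination
    I-BGP -> I-BGP is forbidden.
    """
    return not (learned_from == "ibgp" and send_to == "ibgp")


def full_mesh_sessions(routers: int) -> int:
    """Number of I-BGP (TCP) sessions needed to fully connect n routers."""
    return routers * (routers - 1) // 2


print(may_advertise("ebgp", "ibgp"), may_advertise("ibgp", "ibgp"))
print(full_mesh_sessions(3), full_mesh_sessions(10))
```

For the three routers R1, R2 and R3 in the example this is only 3 sessions, but the count grows quadratically with the number of I-BGP speakers.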

Multi-homing to a Single Provider: Case 1
• Easy solution:
  • Use IMUX or Multi-link PPP
• Hard solution:
  • Use BGP
  • Makes assumptions about traffic (the same number of prefixes can be reached from both links)
[Figure: ISP router R1 connected by two links to customer router R2]

Multi-homing to a Single Provider: Case 2
• If multiple prefixes, may use MED
  • Good if traffic load from prefixes is equal
• If single prefix, load may be unequal
  • Break down the prefix and advertise different prefixes over different links
[Figure: ISP router R1 connected to customer routers R2 and R3; customer prefixes 138.39/16 and 204.70/16]

Multi-homing to a Single Provider: Case 3
• For ISP -> customer traffic, same as before:
  • Use MED
  • Good if traffic load to prefixes is equal
• For customer -> ISP traffic:
  • R3 alternates links
  • Multiple default routes
[Figure: ISP routers R1 and R2 connected to customer router R3; customer prefixes 138.39/16 and 204.70/16]

Multi-homing to a Single Provider: Case 4
• Most reliable approach
  • No equipment sharing
• Customer -> ISP: same as case 2
• ISP -> customer: same as case 3
[Figure: ISP routers R1 and R2 connected to customer routers R3 and R4; customer prefixes 138.39/16 and 204.70/16]

Multi-homing to Multiple Providers
• Major issues:
  • Addressing
  • Aggregation
• Customer address space:
  • Delegated by ISP1
  • Delegated by ISP2
  • Delegated by ISP1 and ISP2
  • Obtained independently
• Advantage and disadvantage?
[Figure: customer connected to ISP1 and ISP2, which both connect to ISP3]

Address Space from one ISP
• Customer uses address space from one ISP, e.g. ISP1
• ISP1 advertises /16 aggregate
• Customer advertises /24 route to ISP2
• ISP2 relays route to ISP1 and ISP3
• ISP2-3 use /24 route
• ISP1 routes directly
• Problems with traffic load?
[Figure: ISP1 holds 138.39/16; customer 138.39.1/24 connects to ISP1 and ISP2; both ISPs connect to ISP3]

Pitfalls
• ISP1 aggregates to a /19 at border router to reduce internal tables
• ISP1 still announces /16
• ISP1 hears /24 from ISP2
• ISP1 routes packets for customer to ISP2!
• Workaround: ISP1 must inject /24 into I-BGP
[Figure: same topology; ISP1 holds 138.39/16 and aggregates 138.39.0/19 internally; customer 138.39.1/24]
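
Both the "Address Space from one ISP" behavior and the pitfall come down to longest-prefix matching, which the sketch below reproduces with the standard `ipaddress` module. The table contents are a hypothetical reading of the slides' figures, and `longest_match` is an illustrative helper, not a router implementation.

```python
import ipaddress
from typing import Dict, Optional


def longest_match(table: Dict[str, str], destination: str) -> Optional[str]:
    """Return the next hop of the most specific prefix covering destination."""
    address = ipaddress.ip_address(destination)
    best_network = None
    best_hop = None
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        if address in network and (
            best_network is None or network.prefixlen > best_network.prefixlen
        ):
            best_network, best_hop = network, next_hop
    return best_hop


# ISP3 hears the /16 aggregate from ISP1 and the customer's /24 via ISP2.
isp3_table = {"138.39.0.0/16": "ISP1", "138.39.1.0/24": "ISP2"}

# Pitfall: inside ISP1 only a /19 points at the customer, so the /24
# heard from ISP2 is more specific and wins.
isp1_table = {"138.39.0.0/19": "customer link", "138.39.1.0/24": "ISP2"}

print(longest_match(isp3_table, "138.39.1.7"))
print(longest_match(isp3_table, "138.39.200.1"))
print(longest_match(isp1_table, "138.39.1.7"))
```

Injecting the /24 into I-BGP, as the workaround suggests, gives ISP1 an equally specific route of its own, so the choice is no longer forced by prefix length.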

Address Space from Both ISPs
• ISP1 and ISP2 continue to announce aggregates
• Load sharing depends on traffic to the two prefixes
• Lack of reliability: if the ISP1 link goes down, part of the customer becomes inaccessible
• Customer may announce prefixes to both ISPs, but still problems with longest match as in case 1
[Figure: customer with 138.39.1/24 from ISP1 and 204.70.1/24 from ISP2; both ISPs connect to ISP3]

Address Space Obtained Independently
• Offers the most control, but at the cost of aggregation
• Still need to control paths
  • Suppose ISP1 is large, ISP2-3 are small
  • Customer advertises long path to ISP1, but the local-pref attribute is used to override
  • ISP3 learns shorter path from ISP2
[Figure: customer connected to ISP1 and ISP2; both ISPs connect to ISP3]

Outline
• External BGP (e-BGP)
• Internal BGP (i-BGP)
• Multi-Homing
• Stability Issues

Signs of Routing Instability
• Record of BGP messages at major exchanges
• Discovered orders of magnitude more updates than expected
  • Bulk were duplicate withdrawals
    • Stateless implementation of BGP: did not keep track of information passed to peers
    • Impact of few implementations
  • Strong frequency (30/60 sec) components
    • Interaction with other local routing/links etc.

Route Flap Storm
• Overloaded routers fail to send Keep_Alive messages and are marked as down
• I-BGP peers find alternate paths
• Overloaded router re-establishes peering session
• Must send large updates
• Increased load causes more routers to fail!

Route Flap Dampening
• Routers now give higher priority to BGP/Keep_Alive to avoid the problem
• Associate a penalty with each route
  • Increase when the route flaps
  • Exponentially decay the penalty with time
• When the penalty reaches a threshold, suppress the route

BGP Limitations: Oscillations
[Figure: AS 0, AS 1 and AS 2 are fully meshed and each also reaches destination R directly. Each AS lists its candidate routes with * marking the selected one: AS 0 (*R,1R,2R), AS 1 (0R,*R,2R), AS 2 (0R,1R,*R)]
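
The penalty mechanism on the Route Flap Dampening slide can be sketched as a small class. All numeric parameters here (1000 per flap, a 900-second half-life, a suppress threshold of 2000) are illustrative assumptions, not values given in the lecture, and the class and method names are hypothetical. Real implementations also use a separate, lower threshold for releasing a suppressed route, which is omitted here.

```python
class DampenedRoute:
    """Tracks a flap penalty that decays exponentially over time."""

    def __init__(self, penalty_per_flap=1000.0, half_life=900.0,
                 suppress_threshold=2000.0):
        # All three defaults are assumed values chosen for illustration.
        self.penalty_per_flap = penalty_per_flap
        self.half_life = half_life
        self.suppress_threshold = suppress_threshold
        self.penalty = 0.0
        self.last_update = 0.0

    def _decay(self, now: float) -> None:
        # The penalty halves every half_life seconds since the last update.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now: float) -> None:
        """Record one flap (withdrawal or re-announcement) at time now."""
        self._decay(now)
        self.penalty += self.penalty_per_flap

    def suppressed(self, now: float) -> bool:
        """True while the decayed penalty is at or above the threshold."""
        self._decay(now)
        return self.penalty >= self.suppress_threshold


route = DampenedRoute()
route.flap(0)
route.flap(10)
after_two = route.suppressed(10)
route.flap(20)
after_three = route.suppressed(20)
an_hour_later = route.suppressed(3620)
print(after_two, after_three, an_hour_later)
```

With these assumed parameters two quick flaps stay just under the threshold, a third pushes the route over it, and an hour of stability lets the penalty decay well below the threshold again.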

BGP Limitations: Oscillations
[Figures: six successive snapshots of the candidate-route lists at AS 0, AS 1 and AS 2 for destination R. After the direct routes to R are withdrawn (W), the AS’s announce and select progressively longer paths (01R, 10R, 20R, 12R, 21R), and entries are replaced by "-" as each alternative is withdrawn in turn]

BGP Oscillations
• Can possibly explore every possible path through the network: (n-1)! combinations
• Limit between update messages (MinRouteAdver) reduces exploration
  • Forces router to process all outstanding messages
• Typical Internet failover times:
  • New/shorter link: 60 seconds
    • Results in simple replacement at nodes
  • Down link: 180 seconds
    • Results in search of possible options
  • Longer link: 120 seconds
    • Results in replacement or search based on length

Problems
• Routing table size
  • Need an entry for all paths to all networks
• Required memory = O((N + M*A) * K)
  • N: number of networks
  • M: mean AS distance (in terms of hops)
  • A: number of AS’s
  • K: number of BGP peers

Routing Table Size
Networks   Mean AS Distance   Number of AS’s   BGP Peers/Net   Memory
2,100      5                  59               3               27,000
4,000      10                 100              6               108,000
10,000     15                 300              10              490,000
100,000    20                 3,000            20              1,040,000
• Problem reduced with CIDR

Next Lecture: TCP
• Transport layer issues
• Assigned reading
  • [S+99] The End-to-End Effects of Internet Path Selection
  • [Tsu88] The Landmark Hierarchy: A New Hierarchy for Routing in Very Large Networks