* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download GPF_2_2007_peer_cariden
Survey
Document related concepts
Transcript
Peering Planning Cooperation without Revealing Confidential Information Aaron Hughes [email protected] GPF 2.0 March 25-30, 2007 Miami, Cozumel & Belize www.cariden.com GPF 2.0 2007 (c) cariden technologies Failover Matrices–Cariden 1 The Issue • Multi-Homed Neighbor, 2 or more links > 50% • Example – 1000Mbps connections to Peer X in 3 locations – SJC-to-Peer = 600Mbps, NYC = 100, WDC = 600 – SJC-to-Peer link fails • Are we in trouble? ... or ... ? SEA BOS CHI SEA ? BOS CHI NYC SJC WDC KCY HST SJC ATL WDC KCY HST LAX ATL LAX Peer X GPF 2.0 2007 NYC MIA Failover Matrices–Cariden Peer X MIA 2 Capacity Planning Utopia • Uniform capacity links • Diverse connections (unlikely double failures at Layer 3) • Upgrade at 50% (planning objective is to be resilient to single failures) GPF 2.0 2007 Failover Matrices–Cariden 3 Capacity Planning Reality • Range of capacities • Multiple Layer 3 failures • Upgrade impediments (money, cable plant, ...) GPF 2.0 2007 Failover Matrices–Cariden 4 IGP Different from BGP • Data is more accessible • Failure behavior is predictable • Established process for within AS planning – Gather Data • Topology (OSPF, IS-IS, ...) • Traffic matrix [1] • Estimate growth – Simulate for failures – Perform traffic engineering (optional)[2] – Upgrade as necessary • Commercial and free tools GPF 2.0 2007 Failover Matrices–Cariden 5 The Trouble with BGP • Data is larger and harder to access • BGP decision process complicated • Planning practices not well established • Failure behavior often depends on someone else’s network! subject of – e.g., incoming traffic from a peer GPF 2.0 2007 Failover Matrices–Cariden this talk 6 BGP Path Decision Algorithm[1] 1. Reachable next hop 2. Highest Weight 3. Highest Local Preference 4. Locally originated routes 5. Shortest AS-path length 6. IGP > EGP > Incomplete 7. Lowest MED Respect MEDs 8. EBGP > IBGP Shortest Exit Routing 9. Lowest IGP cost to next hop 10. Shortest route reflection cluster list 11. Lowest BGP router ID 12. Lowest peer remote address [1] Junos algorithm shown here. Cisco IOS uses a slightly different algorithm. GPF 2.0 2007 Failover Matrices–Cariden 7 Common Routing Policies • Shortest Exit – Often used for sending to peers – Get packet out of network as soon as possible – Local Prefs used to determine which neighbor, IGP costs used to determine which exit • Respect MEDs – Often used for customers who buy transit – Deliver packets closest to destination – Neighbor forwards IGP costs as MEDs (multi-exit discriminators) GPF 2.0 2007 Failover Matrices–Cariden 8 Blind Spots • Cannot predict behavior when routing depends on other network (see 3 cases below). Relationship to Remote AS Routing To Remote AS Shortest Exit in Peer known network Respect MEDs Customer from unknown Transit Shortest Exit in Provider known network GPF 2.0 2007 Failover Matrices–Cariden Routing From Remote AS Shortest Exit in unknown network Shortest Exit in unknown network Respect our MEDs 9 Failover Matrices • Solution to peering planning blind spots • Procedure – Gather data • Topology, Traffic, Routing Configurations – Simulate knowable effects • Generate Failover Matrices – Share Failover Matrices for unknowables • e.g., peer gives failover matrix for traffic it delivers, we provide peer failover matrix for traffic we deliver • Both sides benefit from cooperating • AS-Internal information is kept confidential GPF 2.0 2007 Failover Matrices–Cariden 10 Failover Matrix Example Traffic: %Traffic: no failure fail_SJC ar1.sjc:Gig3/2 600 Node:Interface ar1.nyc:ge-2/1 100 48% (388) ar2.wdc:ge-2/2 600 52% (912) %Traffic: %Traffic: fail_nyc fail_wdc 10% (610) 1% (606) 70% (670) 95% (670) - Note: 388Mbps=100Mbps+(0.48*600Mbps), 912=600+(0.52*600), ... GPF 2.0 2007 Failover Matrices–Cariden 11 No Scale Failover Example (from real network) No Scale Peer Circuit 1: Traffic levels at five minute intervals No Scale Peer Circuit 2: Traffic levels at five minute intervals No Scale Peer Circuit 3: Traffic levels at five minute intervals Peer Circuit 4: Traffic levels at five minute intervals • Circuit 2 fails. Traffic shifts to circuit 4. GPF 2.0 2007 Failover Matrices–Cariden 12 No Scale Failover Example (from real network) No Scale Peer Circuit 1: Traffic levels at five minute intervals No Scale Peer Circuit 2: Traffic levels at five minute intervals No Scale Peer Circuit 3: Traffic levels at five minute intervals Peer Circuit 4: Traffic levels at five minute intervals • Circuit 1 fails. Some traffic shifts to 2 & 4 • Some “leaks” to other AS’s GPF 2.0 2007 Failover Matrices–Cariden 13 Questions • How do I calculate a failover matrix? • How do I use a failover matrix from a peer? • What if my peer does not cooperate? • What if a substantial amount of traffic “leaks” to another AS? GPF 2.0 2007 Failover Matrices–Cariden 14 Calculating Failover Matrices • Accurate and Detailed[1,2] – Per prefix routing and traffic statistics – Full BGP simulation • Simple and Good Enough[3] – Traffic matrix based on ingress-egress pairs • e.g., Peer1.LAX-AR1.CHI (measure and/or estimate) instead of 192.12.3.0/24-208.43.0.0/16 – Limited simulation model • Shortest Path, Respect MEDs • “Our” AS plus immediate neighbors [1] “Modeling the routing of an Autonomous System with C-BGP,” B. Quoitin and S. Uhlig, IEEE Network, Vol 19(6), November 2005. [2] “Network-wide BGP route prediction for traffic engineering,” N. Feamster and J. Rexford, in Proc. Workshop on Scalability and Traffic Control in IP Networks, SPIE ITCOM Conference, August 2002. [3] Cariden MATE, available at http://www.cariden.com. GPF 2.0 2007 Failover Matrices–Cariden 15 Using Failover Matrix from Peers • Peer calculates failover matrix • Peer exports failover matrix using IP addresses of peering links • We import failover matrix • We include in a representative model of peer network • Use Failover Matrix in simulation GPF 2.0 2007 Failover Matrices–Cariden 16 Estimate if Peer not Cooperate bbs1 sel2 sea1 bhx1 lba1 min1 sfo1 nrt4 man1 det1 cph1 nqt1 roc1 bos1 ewr1 nrt1 chi1 osa1 sjc2 jfk1 ham1 lon3 sac1 pao2 kcy1 cle1 nyc2 wsh1 ams2 fra2 snv2 tpe3 kul1 lax1 cdg2 den2 dal1 phi1 Group own sources based on exit location (4 groups here) ams1 wdc2 sin1 lin1 bru1 hkg1 atl1 phx1 sna1 hou1 tpa1 mia1 bbs1 sel2 pty1 mex1 bhx1 gru1 syd1 sea1 lba1 eze1 min1 sfo1 nrt4 Quantify shift (to 3 groups) after failure Assume similar for other side man1 det1 cph1 nqt1 roc1 bos1 ewr1 nrt1 chi1 osa1 sjc2 jfk1 ham1 lon3 sac1 pao2 kcy1 cle1 nyc2 wsh1 ams2 fra2 snv2 tpe3 kul1 lax1 cdg2 den2 dal1 phi1 ams1 wdc2 sin1 bru1 hkg1 lin1 atl1 phx1 sna1 hou1 tpa1 mia1 pty1 mex1 gru1 syd1 eze1 • Valid if topology and traffic distributions are similar GPF 2.0 2007 Failover Matrices–Cariden 17 Leaks to Other AS’s x A INTERNET B • Simple option – Leaks between peers relatively small • Ignore – Shifts between transit providers can be large • Equal AS-path length to most destinations: • Assume complete shift (easy to model) • Accurate option – Extend model to more than one AS away – Add columns in traffic matrix to designate extra traffic in case of other network failures GPF 2.0 2007 Failover Matrices–Cariden 18 Work in Progress • Evaluating goodness of models – Compare actual failures to models • Evaluating goodness of failover estimates – Work with both sides of a peering arrangement, compare failover estimates to simulations – Compare estimated failover matrices to actual failures • Streamlining sharing of information • Looking for more participants Contact me to participate in the above GPF 2.0 2007 Failover Matrices–Cariden 19 Summary • Peering/transit links are some of the most expensive and difficult to provision links • We can improve capacity planning on such links by modeling the network • BGP modeling can be much more complex than IGP modeling – Some required information is not even available • Failover Matrices provide a simple way to share information without giving away details • Failover Matrices can be estimated using one’s own network details GPF 2.0 2007 Failover Matrices–Cariden 20 Acknowledgments • Jon Aufderheide (Global Crossing) • Clarence Filsfils (Cisco) GPF 2.0 2007 Failover Matrices–Cariden 21