Download etri03-part1 - Princeton University

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Network science wikipedia , lookup

Path integral formulation wikipedia , lookup

Corecursion wikipedia , lookup

Graph coloring wikipedia , lookup

Centrality wikipedia , lookup

Travelling salesman problem wikipedia , lookup

Signal-flow graph wikipedia , lookup

Transcript
Characterizing the Internet Hierarchy
from Multiple Vantage Points
Jennifer Rexford
Internet and Networking Systems
AT&T Labs - Research; Florham Park, NJ
http://www.research.att.com/~jrex
Work with L. Subramanian, S. Agarwal, and R. Katz
http://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/
Outline
 Internet
architecture
– Autonomous Systems (ASes) and IP addressing
– Interdomain routing and AS relationships
 Type-of-relationship
problem
– Motivation, formulation, and practical challenges
 Analyzing
partial views of the AS graph
– Assigning a rank to each AS from a single vantage point
– Comparing ranks of ASes across multiple vantage points
 Analysis
results
– Interdomain routing data and inferred AS relationships
– Five-level classification of the Internet hierarchy
 Conclusions
Internet Architecture
 Divided
into Autonomous Systems
– Distinct regions of administrative control (~14,000)
– Set of routers and links managed by a single institution
– Service provider, company, university, …
 Hierarchy
of Autonomous Systems
– Large, tier-1 provider with a nationwide backbone
– Medium-sized regional provider with smaller backbone
– Small stub network run by a company or university
 Interaction
between Autonomous Systems
– Internal topology is not shared between ASes
– … but, neighboring ASes interact to coordinate routing
Autonomous Systems (ASes)
Path: 6, 5, 4, 3, 2, 1
4
3
5
2
7
1
6
Web server
Client (128.235.184.76)
IP Addressing (for the host www.etri.re.kr)
 32
bits in dotted-quad notation (129.254.19.28)
 Divided
into network and host portions
 129.254.0.0/16:
129
16-bit prefix with 232-16 addresses
254
19
28
10000001 11111110 00010011 00011100
Network (16 bits)
Host (16 bits)
Border Gateway Protocol (BGP)
 ASes
exchange info about who they can reach
– IP prefix: block of destination IP addresses
– AS path: sequence of ASes along the path
 Policies
configured by the AS’s network operator
– Path selection: which of the paths to use?
– Path export: which neighbors to tell?
“I can reach 129.254.0.0/16
via AS 4766”
“I can reach 129.254.0.0/16”
ETRI
3748
129.254.19.28
7018
4766
KIX
AT&T
Customer-Provider Relationship
 Customer
pays provider for access to the Internet
 AS exports customer’s routes to all neighbors
 AS exports provider’s routes only to its customers
Traffic to the customer
Traffic from the customer
d
provider
advertisements
provider
traffic
customer
d
customer
Peer-Peer Relationship
 Peers
exchange traffic between their customers
 Free of charge (assumption of even traffic load)
 AS exports a peer’s routes only to its customers
Traffic to/from the peer and its customers
advertisements
peer
d
traffic
peer
Moving Beyond the AS Graph
 AS
graph doesn’t capture effects of routing policies
– Some paths not allowed (e.g., transit through a peer)
– Local preference of paths (e.g., prefer customer path)
– Can’t simply compute shortest path on the graph
 AS
graph doesn’t capture the Internet hierarchy
– Large AS may have few neighbors (mostly peers)
– Small AS may have many neighbors (mostly providers)
– Can’t simply rely on node degree to rank the nodes
 Need
to know the relationship between AS pairs
AS Relationships Matter
 Understanding
the Internet’s basic structure
– More than just a graph of interconnected ASes
 Placement
of servers for content distribution
– Place replicas so they can reach everyone well
 Selection
of new peers or providers for an AS
– Select peers/providers with good “reach” to others
 Installing
route filters to protect the network
– Filter customer routes that include other large providers
 Analyzing
the convergence properties of BGP
– Hierarchical relationships avoid protocol oscillation
Inferring Relationships from Routing Data
 Practical
realities of the Internet
– AS graph is not known
– AS relationships are proprietary
– … at least some routing data is publicly available!
 Exploiting
routing data
– Available via traceroute experiments or BGP tables
– Provides a set of AS paths, such as “7018 4766 3748”
– Implies existence of edges (7018, 4766) & (4766, 3748)
– Implies that 4766 (KIX) allows AS 7018 (AT&T) to
transit to AS 3748 (ETRI)
Valid and Invalid Paths
 AS
relationships limit the kinds of valid paths
– Uphill portion: customer-provider relationships
– Plateau: zero or one peer-peer edge
– Downhill portion: provider-customer relationships
AT&T
Qwest
AT&T
UUNET
Invalid
Invalid
Valid
Sprint
Rutgers
NJIT
AT&T Research
Lixin Gao, “On inferring Autonomous System relationships in the
Internet,” IEEE/ACM Transactions on Networking, December 2001.
AT&T
UUNET
Type-of-Relationship Problem
 Given
the inputs
– AS graph G(V,E) with vertices V and edges E
– Set of paths P on the graph G
 Find
a solution that
– Labels each edge with an AS relationship
– Minimizes the number of invalid paths in P
 Properties
of the problem
– NP complete (?)
– May have multiple solutions
– We propose a heuristic algorithm
Practical Challenges
 Peer-peer
relationships are hard to infer
– Mislabeling a peer-peer edge as provider-customer does not
change a valid path into an invalid path
– We use heuristics to detect the peer-peer edges
 Some AS
pairs have unusual relationships
– Sibling ASes that provide transit service for each other
– Backup relationship for connectivity under failure
– Misconfiguration of a conventional AS relationship
– We detect these cases by analyzing the “invalid” paths
 Getting
access to a large path set P is hard
– We exploit BGP routing tables from multiple vantage points
Validation Approaches
 Quantify
the number of invalid paths
– Small number suggests better results
– …still, this doesn’t mean that inferences are correct
 Compare
results with other inference algorithms
– Higher confidence if inferences are the same
– … still, both algorithms could give wrong answers
 Compare
results with Routing Arbiter Database
– Higher confidence if consistent with RADB routing policies
– … still, RADB information is incomplete and out-of-date
 Compare
results with proprietary ISP data
– Higher confidence if answers are correct for this AS
– … still, answers may be wrong for other ASes
Partial View of the AS Graph
 Routing
data from a single source AS
– Collection of paths starting from the source
– Directed graph from union of all edges in these paths
E
Actual graph
C
B
A
D
E
F
D
B
F
C
C
D
B
E
A
F
A
Assigning Rank to AS in a Partial View
 Reverse
pruning algorithm to assign rank
– Rank 1 to the leaves, then remove leaves
– Rank 2 to the leaves, then remove leaves…
– Single (largest) rank to nodes in connected component, if any
5 E
5 B
4
D
3
F 1
4
C
C
3
D
2
B
2
E
1
A
1
F
A 1
Combining Information From Multiple Views
 Vector
of ranks for each AS
– A single element for each of the n views
 Dominance:
provider-customer relationship
– Provider has higher ranks than customer in most views
– For example, B has (2,5) and A has (1,1)
 Equivalence:
peer-peer relationship
– Peers have equal ranks in or inconsistent ranks
– For example, C has (3,4) and D has (4,3)
 Probabilistic
inference
– Thresholds to tolerate some variations across the views
– E.g., an AS dominates in n-1 views and dominated in 1
Applying Our Algorithm
 Applying
the algorithm to ten public BGP tables
– RouteViews table and nine Looking Glass servers
– Extracted set of unique paths P for each view
– Applied reverse pruning algorithm to each view
– Applied inference rules to the vectors of ranks
 Results
of the analysis on data from April 2001
– AS graph with 10,698 ASes and 23,935 edges
– Inferences were made for 99.2% of the edges
– 94.5% provider-customer and 4.7% peer-peer edges
– Most inferences do not require the probabilistic rules
Advantage of Multiple Vantage Points
 A single
vantage point is not enough
– 15% of the edges appear in exactly one BGP table
– Only 25% of the edges appear in all ten BGP tables
Analyzing Invalid Paths
 Checking
the validity of inferences
– Assume the relationship inferences are correct
– Identify paths that are invalid under these inferences
– Compute the number of invalid paths
– Investigate common anomaly triples (A, B, C)
 Results
of our analysis
– Applied to paths in 2 of the original 10 BGP tables
– Applied to paths in 4 other BGP tables
– 0.5-3% of paths are invalid for five of the six tables
– 8.7% of paths are invalid for the KDDI table
Common Anomaly Patterns
 Misconfiguration
– (1, 65112, 6461): 65112 is a private AS that should not
appear between Genuity and AboveNet
 Sibling
relationships
– (7018, 6841, 3300): Infonet Europe merged with AUCS
– (1239, 1740, 7018): Cerfnet was acquired by AT&T
– (1239, 8043, 6395): IXC Communications acquired
SmartNAP and renamed Broadwing
 Heuristic
for identifying sibling relationships
– AS pair that appears in a large number of “invalid” paths
– Our analysis identified 22 possible sibling relationships
AS Classification
 Directed AS
graph
– Directed edge from provider to customer
– Bidirectional edge between two peers
 Lowest
level: Stubs
– Leaf nodes: no peers or downstream customers
– 8898 of the 10915 ASes (82.5% of ASes)
– Ex: UC Berkeley (25), AT&T Labs (6431), and ETRI (3748)
 Next
lowest level: Regional ISPs
– Leaf nodes after successive pruning of leaf nodes
– 971 ASes of the 10915 ASes (8.9% of ASes)
– Ex: PacBell (5676), US West (6223), and UUNET Canada (815)
 Remaining
1046 ASes: Core
Dense Core
 Ways
to classify so-called “tier-1” ASes
– Any AS with no upstream provider (98 such nodes)
– AS set that forms the largest clique of peer edges (13 nodes)
 Relaxing
the definition
– Tolerate some missing or misclassified edges
– Tolerate some ASes with sibling relationships
 “Almost
a clique”
– Subgraph of m nodes with in and out degree at least m/2
– Greedy algorithm for locating the largest near-clique
 20 ASes
in the near-clique
– 15 of the ASes form a subgraph just 3 edges short of a clique
– Genuity, Sprint, UUNET, AT&T, Verio, Level3, C&W,…
Transit and Outer Core
 Transit
core
– ASes that peer with the dense core and each other
– Notion of a “weak in-way cut” to isolate these ASes
– Algorithm for identifying the ASes in transit core
– 129 ASes, including top providers in Europe and Asia
– Ex: UUNET Europe, KDDI, and Singapore Telecom
 Outer
core
– All of the remaining ASes in the core
– 897 ASes, including large regional and national ISPs
– Ex: Turkish Telecom and Minnesota Regional Network
Node Degree is Not Enough
 Node
degree ignores relationships
– A stub AS may have many upstream providers (one bankrupt!)
– A core AS may have a small # of peers (TeliaNet USA degree 40)
– Some ASes have customers that don’t have AS numbers
Related Work
 AS
graph characterization
– Constructing graph from BGP tables or traceroute experiments
– Characterizing the topological properties of the graph
 Inferring AS
relationships (Lixin Gao)
– Identifies the key properties of paths (uphill, downhill, etc.)
– Heuristic using node degree to infer boundary point between the
uphill and downhill portions of the path
– Application of the algorithm using RouteViews routing table
 Characterization
of the hierarchy of ASes
– Early work by Govindan/Reddy based on node degree
– Recent work by Ge et al based on AS relationships
Conclusions
 Inferring AS
relationships
– Reverse pruning to assign rank to each AS
– Comparison of ranks from different vantage points
 Performance
evaluation
– Application of algorithm to collection of ten BGP tables
– Exploration of the anomalies that cause invalid paths
 Characterization
of Internet hierarchy
– Stub, regional ISP, outer core, transit core, & dense core
– Algorithms for identifying the three parts of the core
– Application to AS graph inferred from the BGP tables
Ongoing Work
 Classification
of siblings
– Use anomalous triples (A, B, C) to identify siblings
– Group siblings into a single node (with union of edges)
– Repeat classification of the AS hierarchy on new graph
 Longitudinal
study
– Repeat the study over a period of time with new data
– Study how AS relationships and hierarchy changes
 Validation
of our inference results
– Compare to RADB, Lixin’s results, AT&T data, etc.

http://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/
Digression #1: Really Weird “Invalid” Paths…
1 701 703 9304 7018
Genuity
 Properties
UUNet
Hutchinson AT&T
of the path
– Two tier-1 U.S. providers (Genuity and UUNet)
– One service provider in Hong Kong (Hutchinson)
– Another tier-1 U.S. provider (AT&T) at the end of the path
 Looking
at internal AT&T configuration data…
– AT&T does not have a BGP session with AS 9304
– AT&T does not originate the prefixes (e.g., 152.141.116.0/24)
 Explanation
– Another AS was using the AT&T AS number (for over a year!)
– We sent them an e-mail and asked them to stop, and they did
Digression #1: How Could This Happen, and Persist?
 BGP configuration
is done locally by neighbors
– Customer configures its router with AS number 7018
– Provider configures its router with neighbor of 7018
 The
misconfiguration didn’t necessarily cause a problem
– Hop-by-hop routing took the traffic to the right place
– Most BGP policies don’t look at the identity of the ASes
 Could
have caused a problem: route filtering
– Large providers might applying filtering to customer routers
– Discard routes with other large providers in the path
 Could
have caused a problem: loop detection
– The bogus routes did not appear in AT&T’s routing tables
– AT&T router saw 7018 in the path and discarded the route
– AT&T router did have a route for the supernet (152.141.0.0/16)
Digression #2: Making Up The Examples
 Getting
an IP address
– E-mail from [email protected]
– “nslookup adm.njit.edu” gave 128.235.184.176
 Finding
the AS number and a sample path
– telnet://route-views.oregon-ix.net/
– “show ip bgp 128.235.184.176”
– One of the paths is “7018 209 4246”
 Double
checking this belongs to NJIT
– “whois –h whois.arin.net 128.235.184.176”
– “whois –h whois.arin.net 4246”
Digression #2: whois –h whois.arin.net 128.235.184.76
OrgName: New Jersey Institute of Technology
OrgID:
NJIT
Address: 323 Martin Luther King Boulevard
City:
Newark
StateProv: NJ
PostalCode: 07102
Country: US
NetRange: 128.235.0.0 - 128.235.255.255
CIDR:
128.235.0.0/16
NetName: NJIT
NetHandle: NET-128-235-0-0-1
Parent: NET-128-0-0-0-0
NetType: Direct Assignment
NameServer: DNS1.NJIT.EDU
NameServer: DNS2.NJIT.EDU
NameServer: THORIN.UTHSCSA.EDU
Comment:
RegDate:
Updated: 2001-04-05
TechHandle: PT35-ARIN
TechName: Teklinski, Peter
TechPhone: +1-973-596-2908
TechEmail: [email protected]