15-829A/18-849B/95-811A/19-729A Internet-Scale Sensor Systems: Design and Policy
Lecture 13 – Network Mapping
02-25-03

Why Predict Network Performance?
- In any large distributed system, must assign responsibility to nodes
  - Want to maximize performance, availability, etc.
  - How to choose?
- Communication patterns are often localized
  - Need to find nodes that are near each other
  - Need geographic & network mapping

Network Mapping
- Difficult to find nearby nodes quickly and efficiently
  - Huge number of paths to measure
  - TCP bandwidth and RTT probes are time-consuming
  - No clean mapping from IP address to location (geographic or network topology)

Outline
- IP2Geo
- IDMaps
- SPAND
- GNP

IP2Geo Techniques
- GeoTrack: DNS-based
- GeoPing: latency-based
- GeoCluster: BGP-based

GeoTrack
- Traceroute to host
- DNS names for routers near host
  - Typically city, airport, country names embedded in names
  - Need ISP-specific heuristics to reduce false positives
    - Codes used and location in name
- Impact of proxies
  - Always returns location of proxy
  - Client may be anywhere

GeoTrack Performance
- FooTV
  - CDF looks identical to proxy location PDF
  - Most FooTV customers use a proxy
  - Median error: 590 km
- University hosts work well
  - No proxy, always on, traceroute always completes
  - Median error: 102 km

GeoPing
- Can latency be converted to distance?
  - Distance to different locations grouped by latency
  - While good at short latencies, error grows for longer latencies
  - Can't break the speed of light, but can get a slow route to nearby places

GeoPing Performance
- Match host to other nearby nodes
- Delay map: vector of delays to n known landmarks
- Matching: most similar delay map
  - Euclidean distance in n-dimensional space
  - Use matched host's location

GeoCluster
- Classify all hosts in an address cluster as same location
- BGP address prefixes as starting point
  - Address allocation in CIDR blocks
  - Basic granularity of announcements
  - BGP announcements may be subdivided to reflect different attributes (e.g. policy)
  - 100,000 address prefixes announced from 12,000 ASes
- Address prefixes may still be large entities
  - May span large geographic areas

GeoCluster Sub-clustering
- When prefix is large, look at consensus on location
- If consensus is weak, split prefix in half
- Repeat if needed until too few samples
- Proxies result in lack of consensus and therefore no location report

GeoCluster Performance
- Without sub-clustering
- University data set
  - Median error: 28 km
  - No report for 12% of hosts
- bCentral data set
  - Median error: 685 km
  - Much larger ISPs cause problems
  - 23% with no answer

GeoCluster Performance
- bCentral: poor performance due to large size of prefixes
- Can sub-clustering help? Yes
- Note that even /24 clusters don't work well
  - Surprising given the address allocation of the Internet

Outline
- IP2Geo
- IDMaps
- SPAND
- GNP

Network Distance
- Round-trip propagation and transmission delay
- Reflects Internet topology and routing
- A good first-order performance optimization metric
  - Helps achieve low communication delay
  - A reasonable indicator of TCP throughput
  - Can weed out most bad choices
- But the O(N^2) network distances are also hard to determine efficiently in Internet-scale systems

Active Measurements
- Network distance can be measured with ping-pong messages
- But active measurement does not scale

Scaling Alternatives
[Figure]

Sharing Measurements
- IDMaps [Francis et al ’99]
[Figure labels: A, B, Probe (x3), Server, "A/B 50ms"]

Assumptions
- Probe nodes approximate direct path
  - May require large number
  - Careful placement may help
- Requires that distance between end-points is approximated by sum
  - Triangle inequality must hold, i.e., (a,c) ≤ (a,b) + (b,c)

Triangle Inequality in the Internet
[Figure]

Outline
- IP2Geo
- IDMaps
- SPAND
- GNP

SPAND Design Choices
- Measurements are shared
  - Hosts share performance information by placing it in a per-domain repository
- Measurements are passive
  - Application-to-application traffic is used to measure network performance
- Measurements are application-specific
  - When possible, measure application response time, not bandwidth, latency, hop count, etc.

SPAND Architecture
[Figure labels: Internet, Client (x2), Packet Capture Host, Data, Performance Server, Perf. Reports, Perf. Query/Response]

SPAND Assumptions
- Geographic Stability: performance observed by nearby clients is similar
  - Works within a domain
- Amount of Sharing: multiple clients within domain access same destinations within reasonable time period
  - Strong locality exists
- Temporal Stability: recent measurements are indicative of future performance
  - True for 10's of minutes

Prediction Accuracy
[Figure: cumulative probability (0 to 1) vs. ratio of predicted to actual throughput (1/64 to 64)]
- Packet capture trace of IBM Watson traffic
- Compare predictions to actual throughputs

Outline
- IP2Geo
- IDMaps
- SPAND
- GNP

First Key Insight
- With millions of hosts, "What are the O(N^2) network distances?" may be the wrong question
- Instead, could we ask: "Where are the hosts in the Internet?"
  - What does it mean to ask "Where are the hosts in the Internet?"
  - Do we need a complete topology map?
  - Can we build an extremely simple geometric model of the Internet?

New Fundamental Concept: "Internet Position"
- Using GNP, every host can have an "Internet position"
  - O(N) positions, as opposed to O(N^2) distances
- Accurate network distance estimates can be rapidly computed from "Internet positions"
- "Internet position" is a local property that can be determined before applications need it
- Can be an interface for independent systems to interact
[Figure: hosts at (x1,y1,z1), (x2,y2,z2), (x3,y3,z3), (x4,y4,z4) in a space with axes x, y, z]

Vision: Internet Positioning Service
[Figure: example hosts 65.4.3.87, 33.99.31.1, 128.2.254.36, 12.5.222.1, 123.4.22.54, 126.93.2.34, labeled with 2-D coordinates (2,4), (5,4), (7,3), (1,3), (6,0), (2,0)]
- Enable every host to independently determine its Internet position
- Internet position should be as fundamental as IP address
  - "Where" as well as "Who"

Global Network Positioning (GNP) Coordinates
- Model the Internet as a geometric space (e.g. 3D Euclidean)
- Characterize the position of any end host with geometric coordinates
- Use geometric distances to predict network distances
[Figure: hosts at (x1,y1,z1), (x2,y2,z2), (x3,y3,z3), (x4,y4,z4) in a space with axes x, y, z]

Landmark Operations (Basic Design)
- Measure inter-Landmark distances
  - Use minimum of several round-trip time (RTT) samples
- Compute coordinates by minimizing the discrepancy between measured distances and geometric distances
  - Cast as a generic multi-dimensional minimization problem, solved by a central node
[Figure: Landmarks L1, L2, L3 in the Internet, placed at (x1,y1), (x2,y2), (x3,y3) on x/y axes]

Ordinary Host Operations (Basic Design)
- Each host measures its distances to all the Landmarks
- Compute coordinates by minimizing the discrepancy between measured distances and geometric distances
  - Cast as a generic multi-dimensional minimization problem, solved by each host
[Figure: Landmarks L1, L2, L3 at (x1,y1), (x2,y2), (x3,y3) and an ordinary host at (x4,y4)]

Overall Accuracy
[Figure; surviving labels: 0.1, 0.28]

Why the Difference?
[Figure: IDMaps vs. GNP (1-dimensional model)]
- IDMaps overpredicts

Next Lecture
- Lecturer: Brad Karp
- Topic: distributed systems
  - Fault tolerance
  - Replication
  - Logging/monitoring
- Readings
  - Ivy
  - Byzantine Fault Tolerance
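
Sketch: GeoPing delay-map matching. The GeoPing slides describe matching a host to the known host whose delay map (a vector of delays to n landmarks) is closest in Euclidean distance, then reusing that host's location. A minimal Python sketch of that matching step follows. The landmark count, the delay values, and the names `KNOWN_HOSTS`, `delay_distance`, and `geoping_locate` are illustrative assumptions, not taken from the lecture.

```python
import math

# Hypothetical delay maps (ms) to three landmarks for hosts whose
# locations are already known. The values are made up for illustration.
KNOWN_HOSTS = {
    "Seattle": [5.0, 40.0, 70.0],
    "Chicago": [45.0, 8.0, 30.0],
    "New York": [72.0, 28.0, 6.0],
}


def delay_distance(a, b):
    """Euclidean distance between two delay maps of equal length."""
    if len(a) != len(b):
        raise ValueError("delay maps must cover the same landmarks")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def geoping_locate(target_map, known=KNOWN_HOSTS):
    """Return the location of the known host with the most similar delay map."""
    return min(known, key=lambda loc: delay_distance(target_map, known[loc]))
```

For example, `geoping_locate([50.0, 10.0, 27.0])` picks the entry whose delays differ least across all three landmarks. The answer can only ever be one of the known locations, so accuracy depends on how densely those hosts cover the area of interest.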
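
Sketch: GeoCluster sub-clustering. The sub-clustering slide gives the procedure only in outline: check consensus on location within a prefix, split the prefix in half when consensus is weak, and stop when too few samples remain. One possible reading in Python is below. The consensus threshold (0.7), the minimum sample count (3), and the function name `subcluster` are assumptions chosen for illustration. The slides do not give these values.

```python
import ipaddress
from collections import Counter


def subcluster(prefix, samples, min_samples=3, consensus=0.7):
    """Map sub-prefixes to a location where the samples agree strongly.

    samples is a list of (ip_string, location) pairs with known locations.
    Prefixes with too few samples, or with no consensus even at the
    smallest size, produce no entry (no location report).
    """
    net = ipaddress.ip_network(prefix)
    inside = [(ip, loc) for ip, loc in samples
              if ipaddress.ip_address(ip) in net]
    if len(inside) < min_samples:
        return {}  # too few samples: stop and report nothing
    location, count = Counter(loc for _, loc in inside).most_common(1)[0]
    if count / len(inside) >= consensus:
        return {str(net): location}  # strong consensus: one location
    if net.prefixlen >= net.max_prefixlen:
        return {}  # cannot split a single address any further
    result = {}
    for half in net.subnets(prefixlen_diff=1):  # split prefix in half
        result.update(subcluster(str(half), inside, min_samples, consensus))
    return result
```

With three samples in 10.0.0.0/25 located in one city and three in 10.0.0.128/25 located in another, the /24 has weak consensus and is split, and each /25 then reports its own location.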
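
Sketch: estimating distance through probe nodes. The IDMaps "Assumptions" slide says the end-to-end distance is approximated by a sum of measured segments and that this relies on the triangle inequality. The Python sketch below is a deliberately simplified, single-probe version of that idea, not the actual IDMaps algorithm: it estimates the A-to-B distance as the smallest (A to probe) + (probe to B) sum. The probe names, delay values, and function names are hypothetical.

```python
def estimate_via_probes(dist_a, dist_b):
    """Estimate the A-B distance through shared probe nodes.

    dist_a[p] and dist_b[p] are measured delays (ms) between probe p and
    hosts A and B. The estimate is the smallest two-segment sum.
    """
    common = dist_a.keys() & dist_b.keys()
    if not common:
        raise ValueError("no probe has measured both hosts")
    return min(dist_a[p] + dist_b[p] for p in common)


def violates_triangle(d_ac, d_ab, d_bc):
    """True if the direct distance a-c exceeds the detour through b."""
    return d_ac > d_ab + d_bc
```

When the triangle inequality holds, each two-segment sum is at least the direct distance, so an estimate of this kind can only err on the high side. When it is violated, the direct path is longer than the detour and the estimate can also come out too low.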
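
Sketch: computing a host's GNP coordinates. The "Ordinary Host Operations" slide says each host measures its distances to the Landmarks and finds coordinates that minimize the discrepancy between measured and geometric distances, cast as a generic multi-dimensional minimization problem. The slides do not name a solver. The Python sketch below uses plain gradient descent on the sum of squared errors in a 2-D Euclidean space, purely as an illustration. The landmark coordinates, step size, iteration count, and function names are assumptions.

```python
import math

# Hypothetical landmark coordinates, assumed to be already computed.
LANDMARKS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]


def predicted_distance(p, q):
    """Geometric distance between two 2-D positions."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def fit_host(landmarks, measured, steps=2000, rate=0.05):
    """Find (x, y) minimizing sum((geometric - measured)^2) over landmarks."""
    # Start from the centroid of the landmarks.
    x = sum(lx for lx, _ in landmarks) / len(landmarks)
    y = sum(ly for _, ly in landmarks) / len(landmarks)
    for _ in range(steps):
        gx = gy = 0.0
        for (lx, ly), d in zip(landmarks, measured):
            g = math.hypot(x - lx, y - ly) or 1e-9  # avoid division by zero
            err = g - d
            gx += 2.0 * err * (x - lx) / g  # d(err^2)/dx
            gy += 2.0 * err * (y - ly) / g  # d(err^2)/dy
        x -= rate * gx
        y -= rate * gy
    return (x, y)
```

Usage: given measured distances from a host to each landmark, `fit_host(LANDMARKS, measured)` returns a position, and `predicted_distance` between two such positions serves as the network distance estimate. Real RTTs will not embed perfectly, so the minimum error is generally nonzero, and a fixed-step descent may need tuning for other inputs.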