Evaluation of the Proximity between Web Clients and their Local DNS Servers

Z. Morley Mao, UC Berkeley ([email protected])
C. Cranor, M. Rabinovich, O. Spatscheck, and J. Wang, AT&T Labs-Research
F. Douglis, IBM Research

Motivation
• Content Distribution Networks (CDNs)
• Attempt to deliver content from servers close to users
[Figure: origin servers, the Internet, cache servers, and clients]

DNS based server selection
• Originator problem
• Assumes that clients are close to their local DNS servers
• Verify the assumption that clients are close to their local DNS servers
[Figure: Client.myisp.net, local DNS server ns.myisp.net, A.GTLD-SERVERS.NET, and authoritative DNS server ns.service.com; queries labeled "www.service.com?" and reply "ns.service.com"]

Measurement setup
• Three components
• 1x1 pixel embedded transparent GIF image
    <img src=http://xxx.rd.example.com/tr.gif height=1 width=1>
• A specialized authoritative DNS server
    Allows hostnames to be wild-carded
• An HTTP redirector
    Always responds with “302 Moved Temporarily”
    Redirect to a URL with client IP address embedded
[Figure: www.att.com page carrying the 1x1 transparent GIF]

Embedded image request sequence
1. HTTP GET request for the image
2. HTTP redirect to IP10-0-0-1.cs.example.com
4. Request to resolve IP10-0-0-1.cs.example.com
5. Reply: IP address of content server
[Figure: Client [10.0.0.1], redirector for xxx.rd.example.com, content server for the image, local DNS server, name server for *.cs.example.com]

Measurement Data
Site   Participant                          Image hit count
1      att.com                              20,816,927
2,3    Personal pages (commercial domain)   1,743
4      AT&T research                        212,814
5-7    University sites                     4,367,076
8-19   Personal pages (university domain)   26,563
Duration: 3 months for four of the five rows, 2 months for the remaining one

Measurement statistics
Data type                                                      Count
Unique client-LDNS associations                                4,253,157
HTTP requests                                                  25,425,123
Unique client IPs                                              3,234,449
Unique LDNS IPs                                                157,633
Client-LDNS associations where client and LDNS
  have the same IP address                                     56,086

Proximity metrics
• AS clustering
• Network clustering
• Traceroute divergence
• Roundtrip time correlation

AS clustering
• Autonomous System (AS): a single administrative entity with unified routing policy
• Observes if client and LDNS belong to the same AS

Network clustering [Krishnamurthy, Wang sigcomm00]
• Based on BGP routing information using the longest prefix match
• Each prefix identifies a network cluster
• Observes if client and LDNS belong to the same network cluster

Traceroute divergence
• [Shaikh et al. infocom00]
• Use the last point of divergence
• Traceroute divergence: Max(3,4)=4
[Figure: paths from a probe machine to the client and to the local DNS server; after the last point of divergence one branch has 3 hops and the other has 4]

Roundtrip time correlation
• Correlation between message roundtrip times from a probe site to the client and its LDNS server
• The probe site represents a potential cache server location
• A crude metric, highly dependent on the probe site

Aggregate statistics of AS/network clustering
Metrics              # client clusters   # LDNS clusters   Total # clusters
AS clustering        9,215               8,590             9,570
Network clustering   98,001              53,321            104,950
• AS clustering: more than 13,000 ASes; close to 75% of total ASes
• Network clustering: 440,000 unique prefixes; close to 25% of all possible network clusters
• We have a representative data set

Proximity analysis: AS, network clustering
Metrics           Client IPs   HTTP requests
AS cluster        64%          69%
Network cluster   16%          24%
• AS clustering: coarse-grained
• Network clustering: fine-grained
• Most clients not in the same routing entity as their LDNS
• Clients with LDNS in the same cluster slightly more active

Proximity analysis: Traceroute divergence
• Probe sites: NJ (UUNET), NJ (AT&T), Berkeley (Calren), Columbus (Calren)
• Sampled from top half of busy network clusters
• Median divergence: 4
• Mean divergence: 5.8-6.2
• Ratio of common to disjoint path length: 72%-80% of pairs traced have a common path at least as long as the disjoint path

Improved local DNS configuration
• For client-LDNS associations not in the same cluster, do we know an LDNS in the client’s cluster?
                  Client IPs            HTTP requests
Metrics           Original   Improved   Original   Improved
AS cluster        64%        88%        69%        92%
Network cluster   16%        66%        24%        70%

Impact on commercial CDNs
• Data set
    Client-LDNS associations
    LDNS-CDN associations
    Available CDN servers
• Client w/ CDN server in cluster
• Verifiable clients: w/ responsive LDNS
• Misdirected clients: directed to a cache not in client’s cluster
• Clients with LDNS not in same cluster

Impact on commercial CDNs: AS clustering
                                              CDN X             CDN Y           CDN Z
Clients with CDN server in cluster            1,679,515         1,215,372       618,897
Verifiable clients                            1,324,022         961,382         516,969
Misdirected clients
  (% of verifiable clients)                   809,683 (60%)     752,822 (77%)   434,905 (82%)
Clients with LDNS not in client’s cluster
  (% of misdirected clients)                  443,394 (55%)     354,928 (47%)   262,713 (60%)

Impact on commercial CDNs: Network clustering
• Less than 10% of all clients
                                              CDN X             CDN Y           CDN Z
Clients with cache server in cluster          264,743           156,507         103,448
Verifiable clients                            221,440           132,567         90,264
Misdirected clients
  (% of verifiable clients)                   154,198 (68%)     125,449 (94%)   87,486 (96%)
Clients with LDNS not in client’s cluster
  (% of misdirected clients)                  145,276 (94%)     116,073 (93%)   84,737 (97%)

Conclusion
• Novel technique for finding client and local DNS associations
    Fast, non-intrusive, and accurate
• DNS based server selection works well for coarse-grained load-balancing
    64% associations in the same AS
    16% associations in the same network cluster
• Server selection can be inaccurate if server density is high

Related work
• Measurement methodology
    1. IBM (Shaikh et al.)
    2. Univ of Boston (Bestavros et al.)
       Assigning multiple IP addresses to a Web server
    3. Time correlation of DNS and HTTP requests from DNS and Web server logs
    Differences from our work: our methodology is efficient, accurate, nonintrusive
    Web bugs
• Proximity metrics
    Cisco’s Boomerang protocol: uses latency from cache servers to the LDNS
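
Sketch: recovering a client-LDNS association and checking network clustering

The measurement setup embeds the client IP address in a hostname (IP10-0-0-1.cs.example.com), so the name server for *.cs.example.com sees both the client address (in the query name) and the local DNS server address (the source of the query). The following is a minimal Python sketch of that bookkeeping, followed by a longest-prefix-match check of whether the pair falls in the same network cluster. It is an illustration under stated assumptions, not the authors' implementation: the function names, the PREFIXES table, and its sample prefixes are all hypothetical, and a real study would load prefixes from BGP routing tables.

```python
import ipaddress
import re

# Hypothetical prefix table; the slides derive prefixes from BGP routing data.
PREFIXES = [
    ipaddress.ip_network(p)
    for p in ("10.0.0.0/8", "10.0.0.0/24", "192.168.0.0/16")
]

# Hostname pattern taken from the slides: IP10-0-0-1.cs.example.com
NAME_RE = re.compile(
    r"^IP(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\.cs\.example\.com\.?$",
    re.IGNORECASE,
)


def encode_client_name(client_ip):
    """Build the redirect hostname that embeds the client's IPv4 address."""
    ip = ipaddress.IPv4Address(client_ip)  # validates the address
    return "IP" + str(ip).replace(".", "-") + ".cs.example.com"


def decode_client_ip(query_name):
    """Recover the client address from a DNS query name, or None if it does not match."""
    m = NAME_RE.match(query_name)
    if m is None:
        return None
    try:
        return ipaddress.IPv4Address(".".join(m.groups()))
    except ipaddress.AddressValueError:
        return None  # e.g. an octet larger than 255


def network_cluster(ip):
    """Return the longest matching prefix for ip, or None if no prefix covers it."""
    matches = [net for net in PREFIXES if ip in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)


def same_cluster(client_ip, ldns_ip):
    """True if client and LDNS share the same longest-match prefix."""
    c = network_cluster(ipaddress.IPv4Address(client_ip))
    l = network_cluster(ipaddress.IPv4Address(ldns_ip))
    return c is not None and c == l
```

In this sketch, a query for the name returned by encode_client_name("10.0.0.1") arriving from an LDNS at 10.0.0.53 would be recorded as the association (10.0.0.1, 10.0.0.53), and same_cluster would place both in the hypothetical 10.0.0.0/24 cluster. An LDNS at 10.1.0.53 would match only the coarser 10.0.0.0/8 prefix and so count as a different cluster.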