Evaluation of the Proximity between Web Clients and their Local DNS Servers

Z. Morley Mao, UC Berkeley
C. Cranor, M. Rabinovich, O. Spatscheck, and J. Wang, AT&T Labs-Research
F. Douglis, IBM Research

Motivation

• Content Distribution Networks (CDNs)
  • Attempt to deliver content from servers close to users

[Figure: origin servers, the Internet, and cache servers placed near groups of clients]

DNS based server selection

• Originator problem
  • Assumes that clients are close to their local DNS servers

[Figure: Client.myisp.net, local DNS server ns.myisp.net, A.GTLD-SERVERS.NET, and authoritative DNS server ns.service.com, with "www.service.com?" queries between them]

• Verify the assumption that clients are close to their local DNS servers

Measurement setup

• Three components
  • 1x1 pixel embedded transparent GIF image
    • <img src=http://xxx.rd.example.com/tr.gif height=1 width=1>
  • A specialized authoritative DNS server
    • Allows hostnames to be wild-carded
  • An HTTP redirector
    • Always responds with “302 Moved Temporarily”
    • Redirects to a URL with the client IP address embedded

[Figure: a page on www.att.com embedding the 1x1 transparent GIF]

Embedded image request sequence

Participants: client [10.0.0.1], redirector for xxx.rd.example.com, content server for the image, local DNS server, name server for *.cs.example.com

1. HTTP GET request for the image
2. HTTP redirect to IP10-0-0-1.cs.example.com
4. Request to resolve IP10-0-0-1.cs.example.com
5. Reply: IP address of content server
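
The name-embedding trick in the sequence above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the zone cs.example.com is the placeholder from the slide, and only IPv4 is handled. The redirector encodes the client address into the hostname it redirects to, and the wild-carded name server decodes it again and pairs it with the source address of the DNS query, which is the client's LDNS.

```python
import ipaddress
from typing import Tuple

# Placeholder zone from the slide; a real deployment would use its own zone.
CONTENT_ZONE = "cs.example.com"


def redirect_hostname(client_ip: str, zone: str = CONTENT_ZONE) -> str:
    """Build the hostname the redirector sends the client to (step 2)."""
    addr = ipaddress.IPv4Address(client_ip)  # raises ValueError on bad input
    return "IP" + str(addr).replace(".", "-") + "." + zone


def client_ip_from_query(qname: str, zone: str = CONTENT_ZONE) -> str:
    """Recover the client IP from a name seen by the name server (step 4)."""
    name = qname.rstrip(".")
    label, _, rest = name.partition(".")
    if rest.lower() != zone or not label.upper().startswith("IP"):
        raise ValueError(f"not a measurement hostname: {qname!r}")
    return str(ipaddress.IPv4Address(label[2:].replace("-", ".")))


def record_association(qname: str, ldns_ip: str) -> Tuple[str, str]:
    """Pair the embedded client IP with the source IP of the DNS query."""
    return client_ip_from_query(qname), ldns_ip
```

Because every client is redirected to a distinct hostname, the lookup is unlikely to be answered from an LDNS cache, so the query reaches the name server and the association can be logged there.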

Measurement Data

Site    Participant                            Image hit count
1       att.com                                     20,816,927
2,3     Personal pages (commercial domain)               1,743
4       AT&T research                                  212,814
5-7     University sites                             4,367,076
8-19    Personal pages (university domain)              26,563

Duration: 3 months for four of the five groups, 2 months for the remaining one

Measurement statistics

Data type                                                  Count
Unique client-LDNS associations                        4,253,157
HTTP requests                                         25,425,123
Unique client IPs                                      3,234,449
Unique LDNS IPs                                          157,633
Client-LDNS associations where client and LDNS
  have the same IP address                                56,086

Proximity metrics:

• AS clustering
• Network clustering
• Traceroute divergence
• Roundtrip time correlation

AS clustering

• Autonomous System (AS)
  • A single administrative entity with unified routing policy
• Observes if client and LDNS belong to the same AS

Network clustering

• [Krishnamurthy,Wang sigcomm00]
• Based on BGP routing information using the longest prefix match
• Each prefix identifies a network cluster
• Observes if client and LDNS belong to the same network cluster
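
The two clustering checks can be sketched with a longest prefix match over a routing table. This is an illustration only: the function names are hypothetical, and the prefixes and AS numbers in the usage note are made-up documentation values, not data from the study. The table is assumed to map each BGP prefix to its origin AS.

```python
import ipaddress
from typing import Dict, Optional


def longest_prefix_match(ip: str, table: Dict[str, int]) -> Optional[str]:
    """Return the most specific prefix in `table` containing `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix in table:
        net = ipaddress.ip_network(prefix)
        if net.version == addr.version and addr in net:
            if best is None or net.prefixlen > best.prefixlen:
                best = net
    return str(best) if best is not None else None


def same_network_cluster(client_ip: str, ldns_ip: str, table: Dict[str, int]) -> bool:
    """Fine-grained check: both addresses match the same longest prefix."""
    c = longest_prefix_match(client_ip, table)
    return c is not None and c == longest_prefix_match(ldns_ip, table)


def same_as(client_ip: str, ldns_ip: str, table: Dict[str, int]) -> bool:
    """Coarse-grained check: both matched prefixes have the same origin AS."""
    c = longest_prefix_match(client_ip, table)
    l = longest_prefix_match(ldns_ip, table)
    return c is not None and l is not None and table[c] == table[l]
```

With a hypothetical table {"192.0.2.0/24": 64500, "192.0.2.128/25": 64500, "198.51.100.0/24": 64501}, the addresses 192.0.2.10 and 192.0.2.200 fall in the same AS but in different network clusters, which is the sense in which AS clustering is the coarser metric.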
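
Traceroute divergence

• [Shaikh et al. infocom00]
• Use the last point of divergence
• Traceroute divergence: Max(3,4)=4

[Figure: paths from a probe machine through routers a and b; after the last point of divergence, the branches to the client and to the local DNS server are 3 and 4 hops long]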
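
As a rough sketch of the metric, the function below takes two traceroute paths from the same probe machine (lists of hop identifiers ending at the destination), finds the last hop they share, and returns the larger of the two remaining hop counts. The function name and the hop labels in the usage note are hypothetical, and treating the probe itself as the split point when no hop is shared is an assumption of this sketch.

```python
from typing import Sequence


def traceroute_divergence(client_path: Sequence[str], ldns_path: Sequence[str]) -> int:
    """Max hop count from the last shared hop to the client and to the LDNS."""
    # Index of the last occurrence of each hop on the LDNS path.
    ldns_index = {hop: i for i, hop in enumerate(ldns_path)}
    # Walk the client path backwards to find the last hop that is also
    # on the LDNS path: the last point of divergence.
    for i in range(len(client_path) - 1, -1, -1):
        j = ldns_index.get(client_path[i])
        if j is not None:
            return max(len(client_path) - 1 - i, len(ldns_path) - 1 - j)
    # No shared hop: the paths split at the probe machine itself.
    return max(len(client_path), len(ldns_path))
```

For example, with paths ["a", "b", "c1", "c2", "client"] and ["a", "b", "d1", "d2", "d3", "ldns"], the last shared hop is b, the client is 3 hops beyond it and the LDNS 4 hops, so the divergence is max(3, 4) = 4, matching the slide's example. Which branch is the longer one in the original figure is not recoverable, so that assignment is arbitrary here.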

Roundtrip time correlation

• Correlation between message roundtrip times from a probe site to the client and its LDNS server
• The probe site represents a potential cache server location
• A crude metric, highly dependent on the probe site
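
One way to compute such a correlation is sketched below. The slide does not say which coefficient was used, so the Pearson coefficient is an assumption here, and the function name is hypothetical. Each position in the two input lists is one client-LDNS pair: the roundtrip time from the probe site to the client, and the roundtrip time from the same probe site to that client's LDNS.

```python
import math
from typing import Sequence


def rtt_correlation(client_rtts: Sequence[float], ldns_rtts: Sequence[float]) -> float:
    """Pearson correlation between probe-to-client and probe-to-LDNS RTTs."""
    n = len(client_rtts)
    if n != len(ldns_rtts) or n < 2:
        raise ValueError("need two equally long series with at least 2 samples")
    mean_c = sum(client_rtts) / n
    mean_l = sum(ldns_rtts) / n
    cov = sum((c - mean_c) * (l - mean_l) for c, l in zip(client_rtts, ldns_rtts))
    var_c = sum((c - mean_c) ** 2 for c in client_rtts)
    var_l = sum((l - mean_l) ** 2 for l in ldns_rtts)
    if var_c == 0 or var_l == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_c * var_l)
```

A value near 1 would suggest that, seen from this probe site, clients and their LDNS servers tend to be similarly far away. Since the result depends on where the probe sits, it should be read per probe site.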

Aggregate statistics of AS/network clustering

Metrics               # client clusters   # LDNS clusters   Total # clusters
AS clustering                     9,215             8,590              9,570
Network clustering               98,001            53,321            104,950

• More than 13,000 ASes
  • Close to 75% of total ASes
• 440,000 unique prefixes
  • Close to 25% of all possible network clusters
• We have a representative data set

Proximity analysis: AS, network clustering

Metrics            Client IPs   HTTP requests
AS cluster                64%             69%
Network cluster           16%             24%

• AS clustering: coarse-grained
• Network clustering: fine-grained
• Most clients not in the same routing entity as their LDNS
• Clients with LDNS in the same cluster slightly more active

Proximity analysis: Traceroute divergence

• Probe sites:
  • NJ(UUNET), NJ(AT&T), Berkeley(Calren), Columbus(Calren)
• Sampled from top half of busy network clusters
• Median divergence: 4
• Mean divergence: 5.8-6.2
• Ratio of common to disjoint path length
  • 72%-80% of pairs traced have a common path at least as long as the disjoint path

Improved local DNS configuration

• For client-LDNS associations not in the same cluster, do we know an LDNS in the client’s cluster?

Metrics            Client IPs               HTTP requests
                   Original   Improved      Original   Improved
AS cluster              64%        88%           69%        92%
Network cluster         16%        66%           24%        70%
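
The "improved" figures ask a what-if question: if each client could be pointed at any LDNS already seen in the data, how often would one exist in the client's own cluster? A minimal sketch of that computation follows. It counts per association rather than per client IP or per HTTP request, which is a simplification, and `cluster_of` is a hypothetical lookup returning the AS or network cluster of an address (or None if unknown).

```python
from typing import Callable, Hashable, Iterable, Optional, Tuple


def cluster_match_rates(
    associations: Iterable[Tuple[str, str]],
    cluster_of: Callable[[str], Optional[Hashable]],
) -> Tuple[float, float]:
    """Return (original, improved) fractions of (client, LDNS) associations."""
    pairs = list(associations)
    if not pairs:
        raise ValueError("no associations given")
    # Every cluster in which at least one observed LDNS lives.
    ldns_clusters = {cluster_of(ldns) for _, ldns in pairs} - {None}
    original = improved = 0
    for client, ldns in pairs:
        c = cluster_of(client)
        if c is None:
            continue
        if c == cluster_of(ldns):
            original += 1   # already co-located
            improved += 1
        elif c in ldns_clusters:
            improved += 1   # a known LDNS exists in the client's cluster
    return original / len(pairs), improved / len(pairs)
```

The gap between the two returned fractions corresponds to the gap between the Original and Improved columns: associations that are not co-located today but could be with a different choice of LDNS.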

Impact on commercial CDNs

• Data set
  • Client-LDNS associations
  • LDNS-CDN associations
  • Available CDN servers

[Figure: client sets used in the analysis]
• Clients w/ CDN server in cluster
• Verifiable clients: w/ responsive LDNS
• Misdirected clients: directed to a cache not in client’s cluster
• Clients with LDNS not in same cluster
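
The four client sets can be read as successive filters. The sketch below counts them for one CDN under that reading; the record type, field names, and function name are hypothetical, and the nesting (each set drawn from the previous one) is an assumption based on how the percentages in the result tables are labeled.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass
class ClientRecord:
    client_cluster: str
    ldns_cluster: str
    ldns_responsive: bool           # could the LDNS be queried at all?
    cache_cluster: Optional[str]    # cluster of the cache the CDN returned


def summarize(records: Iterable[ClientRecord], cdn_clusters: Set[str]) -> Dict[str, int]:
    """Count the nested client sets for one CDN."""
    counts = {
        "cdn_server_in_cluster": 0,
        "verifiable": 0,
        "misdirected": 0,
        "misdirected_ldns_elsewhere": 0,
    }
    for r in records:
        if r.client_cluster not in cdn_clusters:
            continue  # the CDN has no server near this client anyway
        counts["cdn_server_in_cluster"] += 1
        if not r.ldns_responsive:
            continue  # cannot check where the CDN would send this client
        counts["verifiable"] += 1
        if r.cache_cluster == r.client_cluster:
            continue  # directed to a cache in its own cluster
        counts["misdirected"] += 1
        if r.ldns_cluster != r.client_cluster:
            counts["misdirected_ldns_elsewhere"] += 1
    return counts
```

The last counter isolates the misdirected clients whose LDNS sits in a different cluster, that is, the cases where DNS based selection had no way to see where the client really was.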

Impact on commercial CDNs: AS clustering

CDN                                                  CDN X           CDN Y           CDN Z
Clients with CDN server in cluster               1,679,515       1,215,372         618,897
Verifiable clients                               1,324,022         961,382         516,969
Misdirected clients                          809,683 (60%)   752,822 (77%)   434,905 (82%)
  (% of verifiable clients)
Clients with LDNS not in client’s cluster    443,394 (55%)   354,928 (47%)   262,713 (60%)
  (% of misdirected clients)

Impact on commercial CDNs: Network clustering

• Clients with a cache server in their cluster: less than 10% of all clients

CDN                                                  CDN X           CDN Y           CDN Z
Clients with cache server in cluster               264,743         156,507         103,448
Verifiable clients                                 221,440         132,567          90,264
Misdirected clients                          154,198 (68%)   125,449 (94%)    87,486 (96%)
  (% of verifiable clients)
Clients with LDNS not in client’s cluster    145,276 (94%)   116,073 (93%)    84,737 (97%)
  (% of misdirected clients)

Conclusion

• Novel technique for finding client and local DNS associations
  • Fast, non-intrusive, and accurate
• DNS based server selection works well for coarse-grained load-balancing
  • 64% associations in the same AS
  • 16% associations in the same network cluster
• Server selection can be inaccurate if server density is high

Related work

• Measurement methodology
  1. IBM (Shaikh et al.)
     • Time correlation of DNS and HTTP requests from DNS and Web server logs
  2. Boston University (Bestavros et al.)
     • Assigning multiple IP addresses to a Web server
  3. Web bugs
  • Differences from our work:
    • Our methodology: efficient, accurate, nonintrusive
• Proximity metrics
  • Cisco’s Boomerang protocol: uses latency from cache servers to the LDNS