On the Placement of Web Server Replicas
Yu Cai
Paper
• On the Placement of Web Server Replicas
• Lili Qiu, Venkata N. Padmanabhan, Geoffrey M. Voelker
• Infocom 2001
What is the paper talking about?
• The Web server replica placement problem.
– A popular Web site aims to improve its performance
(e.g., reducing its clients’ perceived latency) by pushing
its content to some hosting services.
– The problem is to choose M replicas (or hosting
services) among N potential sites (N > M) such that
some objective function is optimized under a given
traffic pattern.
– The objective function can be minimizing either its
clients’ latency, or its total bandwidth consumption, or
an overall cost function if each link is associated with a
cost.
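A minimal Python sketch of one such objective, assuming the cost is each client's request load multiplied by its distance to the nearest replica. The names placement_cost, demand, and dist, and the 4-node line topology, are hypothetical and used only for illustration.

```python
from typing import Dict, Hashable, Iterable

Node = Hashable


def placement_cost(
    replicas: Iterable[Node],
    demand: Dict[Node, float],
    dist: Dict[Node, Dict[Node, float]],
) -> float:
    """Sum over clients of (request load) x (distance to the nearest replica)."""
    replicas = list(replicas)
    if not replicas:
        raise ValueError("need at least one replica")
    return sum(
        load * min(dist[client][r] for r in replicas)
        for client, load in demand.items()
    )


# Hypothetical line topology a - b - c - d with unit-cost links.
nodes = ["a", "b", "c", "d"]
dist = {u: {v: abs(i - j) for j, v in enumerate(nodes)} for i, u in enumerate(nodes)}
demand = {"a": 10, "b": 1, "c": 1, "d": 10}

print(placement_cost(["b"], demand, dist))       # 31
print(placement_cost(["a", "d"], demand, dist))  # 2
```

Here M = 2 replicas at the two heavily loaded ends cost far less than a single replica in the middle.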
Contribution of the paper
• Present several placement algorithms
• Evaluate their performance by simulation on
synthetic and real network topologies.
Algorithms
• Tree-based algorithm:
– Assumes the underlying topology is a tree and
models placement as a dynamic programming problem.
– The algorithm was originally designed for Web
proxy cache placement, but it is also applicable
to Web replica placement.
– The tree assumption is unrealistic, and performance
is not as good on general network topologies.
Algorithms
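• Greedy algorithm
– Step 1: Evaluate each of the N potential sites to
determine its suitability, assuming all client traffic
converges at that site.
– Step 2: Pick the best one.
– Repeat steps 1 and 2 over the remaining N-1 sites,
evaluating each candidate together with the sites
already picked,
– until M sites have been picked.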
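The greedy steps above can be sketched in Python as follows. This is an illustration, not the paper's code: it assumes the cost is load multiplied by distance to the nearest replica, and the names greedy_placement, placement_cost, demand, and dist, as well as the toy topology, are hypothetical.

```python
def placement_cost(replicas, demand, dist):
    """Assumed objective: load x distance to the nearest replica, summed over clients."""
    return sum(
        load * min(dist[client][r] for r in replicas)
        for client, load in demand.items()
    )


def greedy_placement(candidates, demand, dist, m):
    """Pick m sites one at a time, each time adding the site that lowers cost most."""
    remaining = list(candidates)
    if m > len(remaining):
        raise ValueError("m must not exceed the number of candidate sites")
    chosen = []
    for _ in range(m):
        # Evaluate every remaining site together with the sites already chosen.
        best = min(remaining, key=lambda s: placement_cost(chosen + [s], demand, dist))
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical line topology a - b - c - d with unit-cost links.
nodes = ["a", "b", "c", "d"]
dist = {u: {v: abs(i - j) for j, v in enumerate(nodes)} for i, u in enumerate(nodes)}
demand = {"a": 5, "b": 1, "c": 1, "d": 10}

sites = greedy_placement(nodes, demand, dist, m=2)
print(sites, placement_cost(sites, demand, dist))  # ['d', 'a'] 2
```

Each round costs one objective evaluation per remaining site, so picking M of N sites takes on the order of N x M evaluations.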
Algorithms
• Random algorithm
– Randomly pick M sites from N sites.
– Can be improved by introducing genetic evolution.
• Hotspot algorithm
– Place replicas near the clients generating the greatest
load.
– Can be improved by using client clustering.
• Super-optimal algorithm to obtain a lower bound
– May not yield a feasible solution; used only for comparison.
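A rough Python sketch of the two baseline heuristics above. The hotspot version assumes that "near" means within a fixed radius of a candidate site; that radius parameter, the function names, and the toy data are assumptions made for illustration only.

```python
import random


def random_placement(candidates, m, rng):
    """Pick m distinct sites uniformly at random."""
    return rng.sample(list(candidates), m)


def hotspot_placement(candidates, demand, dist, m, radius=0):
    """Rank sites by the client load within `radius` of them; keep the top m."""
    def nearby_load(site):
        return sum(load for c, load in demand.items() if dist[c][site] <= radius)

    return sorted(candidates, key=nearby_load, reverse=True)[:m]


# Hypothetical line topology a - b - c - d with unit-cost links.
nodes = ["a", "b", "c", "d"]
dist = {u: {v: abs(i - j) for j, v in enumerate(nodes)} for i, u in enumerate(nodes)}
demand = {"a": 5, "b": 1, "c": 1, "d": 10}

print(random_placement(nodes, 2, random.Random(0)))
print(hotspot_placement(nodes, demand, dist, 2))            # ['d', 'a']
print(hotspot_placement(nodes, demand, dist, 2, radius=1))  # ['c', 'd']
```

Unlike the greedy algorithm, neither heuristic accounts for replicas already placed, so hotspot can put two replicas next to each other when the load is concentrated in one area.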
Simulation
• Use GT-ITM to generate random network
topologies
– Transit-stub, hierarchical graph
• Use BGP routing tables to generate real
Internet topology
– AS hop count ???
Simulation
• Web workload and client generation
– Use the access logs collected from real web
sites, like MSNBC
– Cluster the Web clients that are topologically
close to each other
– The top 10, 100, 1000 and 3000 clusters account
for 24%, 45%, 78% and 94% of requests, respectively.
– Map the clusters randomly to the nodes in the
simulation network
Simulation
• Evaluate the effects of imperfect data
– Salt the input data with uniformly distributed
random noise.
• Evaluate the effects of a dynamic network
– Input data change over time.
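One way to salt the input data can be sketched in Python. This assumes multiplicative noise: each client's load is scaled by a factor drawn uniformly from [1 - e, 1 + e]. The noise model, the name salt_demand, and the parameter max_rel_error are assumptions, not taken from the paper.

```python
import random


def salt_demand(demand, max_rel_error, rng):
    """Return a copy of `demand` with each load scaled by a uniform random factor."""
    low, high = 1.0 - max_rel_error, 1.0 + max_rel_error
    return {client: load * rng.uniform(low, high) for client, load in demand.items()}


# Hypothetical per-client request loads.
demand = {"a": 5.0, "b": 1.0, "c": 1.0, "d": 10.0}
noisy = salt_demand(demand, max_rel_error=0.5, rng=random.Random(42))

for client in demand:
    print(client, demand[client], round(noisy[client], 2))
```

A placement algorithm would then choose sites from the noisy loads and be scored against the true loads, which isolates the effect of imperfect input.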
Conclusion
• The greedy algorithm performs the best
– As input error increases or the network changes,
its performance degrades only slightly.