On the Placement of Web Server Replicas
Yu Cai

Paper
• On the Placement of Web Server Replicas
• Lili Qiu, Venkata N. Padmanabhan, Geoffrey M. Voelker
• Infocom 2001

What is the paper talking about?
• Web server replica placement problem.
– A popular Web site aims to improve its performance (e.g., reduce its clients' perceived latency) by pushing its content to some hosting services.
– The problem is to choose M replicas (or hosting services) among N potential sites (N > M) such that some objective function is optimized under a given traffic pattern.
– The objective can be to minimize the clients' latency, the total bandwidth consumption, or an overall cost function if each link is associated with a cost.

Contribution of the paper
• Presents several placement algorithms.
• Evaluates their performance by simulation on synthetic and real network topologies.

Algorithms
• Tree-based algorithm
– Assumes the underlying topology is a tree and models placement as a dynamic programming problem.
– The algorithm was originally designed for Web proxy cache placement; it is also applicable to Web replica placement.
– The tree assumption is unrealistic, and performance is not so good on general network topologies.

Algorithms
• Greedy algorithm
– Step 1: Evaluate each of the N potential sites to determine its suitability, assuming all client traffic converges at that site.
– Step 2: Pick the best one.
– Repeat steps 1 and 2 for the remaining N-1 sites until M sites have been picked.

Algorithms
• Random algorithm
– Randomly pick M sites from the N sites.
– Can be improved by introducing genetic evolution.
• Hotspot algorithm
– Place replicas near the clients generating the greatest load.
– Can be improved by using client clustering.
• Super-optimal algorithm to obtain a lower bound
– May not yield a feasible solution; used only for comparison.

Simulation
• Use GT-ITM to generate random network topologies
– Transit-stub, hierarchical graph
• Use BGP routing tables to generate a real Internet topology
– AS hop count ???
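The greedy steps on the Algorithms slide can be sketched as follows. This is a minimal illustration, not the paper's code: the client-by-site distance matrix `dist`, the per-client `demand` list, the function names, and the cost model (each client is served by its nearest chosen replica, and cost is demand times distance) are all assumptions made for the example. After the first round, each candidate is scored together with the sites already picked.

```python
from typing import List, Sequence


def placement_cost(replicas: Sequence[int],
                   dist: Sequence[Sequence[float]],
                   demand: Sequence[float]) -> float:
    """Total cost when every client is served by its nearest chosen replica.

    dist[c][s] is the (assumed) distance from client c to candidate site s,
    demand[c] is the (assumed) request volume of client c.
    """
    return sum(
        load * min(dist[client][site] for site in replicas)
        for client, load in enumerate(demand)
    )


def greedy_placement(dist: Sequence[Sequence[float]],
                     demand: Sequence[float],
                     m: int) -> List[int]:
    """Pick m sites one at a time, each time adding the site that lowers cost most."""
    n_sites = len(dist[0])
    chosen: List[int] = []
    remaining = set(range(n_sites))
    for _ in range(m):
        # Score every remaining site in combination with the sites already chosen.
        # sorted() makes tie-breaking deterministic (lowest site index wins).
        best = min(
            sorted(remaining),
            key=lambda s: placement_cost(chosen + [s], dist, demand),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy input: 4 clients (rows) and N = 3 candidate sites (columns).
dist = [
    [1, 4, 6],
    [2, 3, 5],
    [6, 2, 1],
    [7, 3, 1],
]
demand = [10, 5, 8, 2]

picked = greedy_placement(dist, demand, m=2)
print(picked, placement_cost(picked, dist, demand))
```

On this toy input the sketch picks site 1 first and then site 0, for a total cost of 42. Placing replicas at sites 0 and 2 would cost 30, so the example also shows that greedy is a heuristic and does not guarantee the optimum.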
Simulation
• Web workload and client generation
– Use access logs collected from real Web sites, such as MSNBC.
– Cluster Web clients that are topologically close to each other.
– The top 10, 100, 1000 and 3000 clusters account for 24%, 45%, 78% and 94% of requests, respectively.
– Map the clusters randomly to nodes in the simulated network.

Simulation
• Evaluate the effects of imperfect data
– Salt the input data with uniformly distributed random noise.
• Evaluate the effects of a dynamic network
– Input data changes over time.

Conclusion
• Greedy performs the best.
– As input error increases or the network changes, its performance degrades only slightly.
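The "imperfect data" experiment from the Simulation slide can be illustrated with a small sketch. The slides say only that the input is salted with uniform random noise; the multiplicative form used here (each value scaled by a factor drawn uniformly from [1 - max_error, 1 + max_error]), the function name `salt`, and the sample load values are assumptions for illustration, not the paper's exact procedure.

```python
import random
from typing import List, Sequence


def salt(values: Sequence[float], max_error: float, rng: random.Random) -> List[float]:
    """Return a noisy copy of values.

    Each value is multiplied by a factor drawn uniformly from
    [1 - max_error, 1 + max_error] (an assumed noise model).
    """
    return [v * rng.uniform(1.0 - max_error, 1.0 + max_error) for v in values]


# Hypothetical per-client loads, perturbed by up to +/-50%.
true_demand = [10.0, 5.0, 8.0, 2.0]
rng = random.Random(0)  # fixed seed so a run is repeatable
noisy_demand = salt(true_demand, max_error=0.5, rng=rng)
print(noisy_demand)
```

A placement algorithm would then be run on the noisy values and its result scored against the true values, which is one way to measure how much performance degrades as the error grows.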