An Integrated Approach to
Improving Web Performance
Lili Qiu
Cornell University
1
Outline
Motivation & Open Issues
Solutions
  Study the workload of a busy Web server
  Optimize TCP performance for Web transfers
  Provision the content distribution networks
Summary & Other Work
2
Motivation
Web traffic is the dominant traffic in the Internet today
  Accounts for over 70% of wide-area traffic
Web performance is often unsatisfactory
  WWW – World Wide Wait
  Consequence: losing potential customers!
[Figure: sources of delay: network congestion, overloaded Web server]
3
Challenges in Providing Highly
Efficient Web Services
Workload characterization
  The workload of busy Web sites is not well understood
Protocol inefficiency
  Mismatch between Web transfers and TCP protocol
Infrastructure provisioning
  Current trend: Content Distribution Networks
  Problem: Where to place replicas?
4
Our Solutions
Web workload characterization
  Study the workload of a busy Web server
Improve protocol efficiency
  Optimize TCP startup performance for Web transfers
Provision Web replication infrastructure
  Develop placement algorithms for content distribution networks (CDNs)
5
Part I Web Workload Characterization

The Content and Access Dynamics of a Busy Web
Site: Findings and Implications. Proceedings of
ACM SIGCOMM 2000, Stockholm, Sweden,
August 2000. (Joint work with V. N. Padmanabhan)
6
Motivation
Solid understanding of Web workload is critical for designing robust and scalable systems
Missing piece in previous work: workload of busy Web servers
[Figure: clients reach servers across the Internet through proxies and replicas]
7
Overview
MSNBC server site
  a large news site
  consistently ranked among the busiest sites in the Web
  server cluster with 40 nodes
  25 million accesses a day (HTML content alone)
  Period studied: Aug. – Oct. 99 & Dec. 17, 98 flash crowd
Server logs
  HTTP access logs
  Content Replication System (CRS) logs
  HTML content logs
Data analysis
  Content dynamics
  Access dynamics
8
Temporal Stability of
File Popularity
Methodology
  Consider the traces from a pair of days
  Pick the top n popular documents from each day
  Compute the overlap
Results
  One day apart: significant overlap (80%)
  Two months apart: smaller overlap (20-80%)
  Ten months apart: very small overlap (mostly below 20%)
[Figure: extent of overlap (0 to 1) vs. # popular documents picked (1 to 100000); curves for 17DEC98 - 18OCT99, 01AUG99 - 18OCT99, and 17OCT99 - 18OCT99]
The set of popular documents remains stable for days
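The overlap methodology on this slide can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names `top_n` and `overlap` and the toy traces are assumptions, and the overlap is normalized by n, which presumes each day has at least n distinct documents.

```python
from collections import Counter

def top_n(requests, n):
    """Return the set of the n most requested documents in one day's trace."""
    return {doc for doc, _ in Counter(requests).most_common(n)}

def overlap(day_a, day_b, n):
    """Fraction of the top-n documents that the two days have in common."""
    a, b = top_n(day_a, n), top_n(day_b, n)
    return len(a & b) / n

# Hypothetical toy traces: each entry is one request for a document.
day1 = ["/news", "/news", "/sports", "/sports", "/weather", "/tech"]
day2 = ["/news", "/news", "/sports", "/sports", "/finance", "/tech"]
print(overlap(day1, day2, 2))  # both days share the same two most popular documents
```

Sweeping n from 1 up to the number of distinct documents would give one curve of the kind plotted on the slide.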
9
Spatial Locality in
Client Accesses
[Figures: fraction of requests shared vs. domain ID, trace compared with random; one panel for Dec. 17, 1998 and one for a normal day]
Domain membership is significant
except when there is a “hot” event of global interest
10
Spatial Distribution of
Client Accesses

Cluster clients using network-aware clustering [KW00]
  IP addresses with the same address prefix belong to a cluster
Top 10, 100, 1000, and 3000 clusters account for about 24%, 45%, 78%, and 94% of the requests, respectively
A small number of client clusters contribute most of the requests.
11
The Applicability of Zipf-law
to Web requests
[Figure: values of α (axis 0 to 2) for MSNBC, proxies, and less popular servers]
The Web requests follow Zipf-like distribution
  Request frequency ∝ 1/i^α, where i is a document's ranking
The value of α is much larger in MSNBC traces
  1.4 – 1.8 in MSNBC traces
  smaller or close to 1 in the proxy traces
  close to 1 in the small departmental server logs [ABC+96]
  Highest when there is a hot event
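One common way to estimate the Zipf exponent from per-document request counts is a least-squares fit of log(frequency) against log(rank). The slides do not say how the exponent was fitted, so the sketch below is an assumption; `estimate_alpha` is a hypothetical name and the counts are synthetic, not measured data.

```python
import math

def estimate_alpha(counts):
    """Negated least-squares slope of log(frequency) against log(rank).

    Needs at least two documents with a positive request count.
    """
    freqs = sorted((c for c in counts if c > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx

# Synthetic counts that follow 1/i^1.5 exactly (illustration only).
synthetic = [1_000_000 / i ** 1.5 for i in range(1, 201)]
alpha = estimate_alpha(synthetic)
print(round(alpha, 3))
```

On real traces the low-rank head and the long tail often deviate from a straight line, so the fitted value depends on which ranks are included.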
12
Impact of larger α
[Figure: percentage of requests vs. percentage of documents (sorted by popularity) for the 12/17/98 server traces, 10/06/99 proxy traces, and 08/01/99 server traces]
Accesses in MSNBC traces are much more concentrated
90% of the accesses are accounted for by
  Top 2-4% files in MSNBC traces
  Top 36% files in proxy traces (Microsoft proxies and the proxies studied in [BCF+99])
  Top 10% files in small departmental server logs reported in [AW96]
Popular news sites like MSNBC see much more concentrated accesses → Reverse caching and replication can be very effective!
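The concentration figures on this slide (the share of files that accounts for 90% of accesses) can be computed from per-document request counts roughly as below. This is a sketch under assumptions: `docs_needed` is a hypothetical helper, and the Zipf-like counts are synthetic stand-ins rather than the MSNBC or proxy traces.

```python
def docs_needed(counts, target=0.9):
    """Fraction of documents, most popular first, needed to cover `target` of all requests."""
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    covered = 0
    for k, c in enumerate(ordered, start=1):
        covered += c
        if covered >= target * total:
            return k / len(ordered)
    return 1.0

def zipf_counts(alpha, n_docs=1000):
    """Synthetic request counts proportional to 1/i^alpha (illustration only)."""
    return [1e6 / i ** alpha for i in range(1, n_docs + 1)]

# A larger exponent concentrates requests on fewer documents.
steep = docs_needed(zipf_counts(1.6))
shallow = docs_needed(zipf_counts(0.8))
print(steep, shallow)
```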
13
Part II Transport Layer
Optimization for the Web

Speeding Up Short Data Transfers: Theory,
Architectural Support, and Simulation Results.
Proceedings of NOSSDAV 2000 (Joint work with
Yin Zhang and Srinivasan Keshav)
14
Motivation

Characteristics of Web data transfers
  Short & bursty [Mah97]
  Use TCP
Problem: Short data transfers interact poorly with TCP!
15
TCP/Reno Basics
Slow Start
  Exponential growth in congestion window
  Slow: log(n) round trips for n segments
Congestion Avoidance
  Linear probing of BW
Fast Retransmission
  Triggered by 3 duplicate ACKs
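The "log(n) round trips for n segments" cost of slow start can be checked with a small idealized model. The sketch assumes the congestion window doubles every round trip and ignores losses, delayed ACKs, and receiver window limits; `slow_start_rtts` is a hypothetical name.

```python
def slow_start_rtts(n_segments, initial_cwnd=1):
    """Round trips needed to send n segments if cwnd doubles every RTT and nothing is lost."""
    sent, cwnd, rtts = 0, initial_cwnd, 0
    while sent < n_segments:
        sent += cwnd   # one window's worth of segments per round trip
        cwnd *= 2      # exponential growth during slow start
        rtts += 1
    return rtts

for n in (1, 7, 8, 1000):
    print(n, slow_start_rtts(n))
```

After k round trips the sender has transmitted 2^k - 1 segments in this model, so even a transfer of a few segments costs several round trips, which is the penalty short Web transfers pay.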
16
Related Work
P-HTTP [PM94]
  Reuses a single TCP connection for multiple Web transfers, but still pays slow start penalty
T/TCP [Bra94]
  Cache connection count, RTT
TCP Control Block Interdependence [Tou97]
  Cache cwnd, but large bursts cause losses
Rate Based Pacing [VH97]
4K Initial Window [AFP98]
Fast Start [PK98, Pad98]
  Need router support to ensure TCP friendliness
17
Our Approach


Directly enter Congestion Avoidance
Choose optimal initial congestion window

A Geometry Problem: Fitting a block to the
service rate curve to minimize completion time
18
Optimal Initial cwnd

Minimize completion time by having the
transfer end at an epoch boundary.
19
Shift Optimization

Minimize initial cwnd while keeping the
same integer number of RTTs
Before optimization:
cwnd = 9
After optimization:
cwnd = 5
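The idea of shift optimization can be illustrated with a simplified model. The sketch assumes the sender stays in congestion avoidance, so the window grows by one segment per RTT, and that there are no losses; the transcript does not give the file size or service rate curve behind the 9 and 5 on the slide, so the transfer size of 11 segments below is an illustrative choice, and both function names are hypothetical.

```python
def rtts_needed(segments, initial_cwnd):
    """RTTs to send `segments` when cwnd starts at initial_cwnd and grows by 1 per RTT."""
    sent, cwnd, rtts = 0, initial_cwnd, 0
    while sent < segments:
        sent += cwnd
        cwnd += 1      # linear growth in congestion avoidance
        rtts += 1
    return rtts

def shift_optimize(segments, initial_cwnd):
    """Smallest initial cwnd that still finishes in the same number of RTTs."""
    target = rtts_needed(segments, initial_cwnd)
    cwnd = initial_cwnd
    while cwnd > 1 and rtts_needed(segments, cwnd - 1) == target:
        cwnd -= 1
    return cwnd

# Toy transfer of 11 segments: starting at 9 or at 5 both take 2 RTTs in this model.
print(rtts_needed(11, 9), shift_optimize(11, 9))
```

A smaller initial window with the same completion time sends a smaller first burst, which is the motivation for the shift.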
20
Effect of Shift
Optimization
21
TCP/SPAND
Estimate network state by sharing performance information
  SPAND: Shared PAssive Network Discovery [SSK97]
Directly enter Congestion Avoidance, starting with the optimal initial cwnd
Avoid large bursts by pacing
[Figure: Web servers and a performance gateway connected to the Internet]
22
Implementation Issues
Scope for sharing and aggregation
  24-bit heuristic
  network-aware clustering [KW00]
Collecting performance information
  Performance reports, New TCP option, Windmill's approach, …
Information aggregation
  Sliding window average
Retrieving estimation of network state
  Explicit query, active push, …
Pacing
  Leaky-bucket based pacing
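One simple reading of leaky-bucket pacing is that a window of segments drains at a constant rate of cwnd/RTT instead of leaving as a single burst. The sketch below shows only that spacing calculation; it is an assumption about the general technique, not the TCP/SPAND implementation, and `paced_send_times` is a hypothetical name.

```python
def paced_send_times(cwnd, rtt, start=0.0):
    """Departure times for one window of segments, spaced evenly over one RTT."""
    interval = rtt / cwnd          # time between consecutive segments
    return [start + i * interval for i in range(cwnd)]

# Hypothetical numbers: a window of 4 segments on a path with a 200 ms RTT.
times = paced_send_times(cwnd=4, rtt=0.2)
print(times)
```

Without pacing, all four segments would leave back to back at time 0, which is the kind of large burst that the slide says pacing is meant to avoid.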
23
Opportunity for Sharing

MSNBC: 90% of requests arrive within 5 minutes
of the most recent request from the same
client network (using the 24-bit heuristic)
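The 24-bit heuristic treats clients whose IPv4 addresses share the first 24 bits as one client network. A minimal sketch of the sharing measurement follows, using Python's standard `ipaddress` module; the function names, the toy request list, and the choice to count a network's first request as a miss are assumptions, since the transcript does not spell out those details.

```python
import ipaddress

def client_network(ip):
    """Map an IPv4 address to its /24 network (the first 24 bits of the address)."""
    return ipaddress.ip_network(f"{ip}/24", strict=False)

def share_within(requests, window=300):
    """Fraction of requests arriving within `window` seconds of the previous
    request from the same /24 network. `requests` is (timestamp, ip), sorted by time."""
    last_seen = {}
    hits = 0
    for ts, ip in requests:
        net = client_network(ip)
        if net in last_seen and ts - last_seen[net] <= window:
            hits += 1
        last_seen[net] = ts
    return hits / len(requests)

# Hypothetical requests: (seconds since start, client address).
requests = [(0, "10.1.2.3"), (100, "10.1.2.77"), (200, "192.0.2.5"), (1000, "10.1.2.9")]
print(share_within(requests))
```

A request counted as a hit here is one that could reuse performance information recently gathered for the same client network.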
24
Cost for Sharing

MSNBC: 15,000-25,000 different client networks
in a 5-minute interval during peak hours (using the 24-bit heuristic)
25
Simulation Results
Methodology
  Download files in rounds
Performance Metric
  Average completion time
TCP flavors considered
  reno-ssr: Reno with slow start restart
  reno-nssr: Reno w/o slow start restart
  newreno-ssr: NewReno with slow start restart
  newreno-nssr: NewReno w/o slow start restart
26
Simulation Topologies
27
T1 Terrestrial WAN Link with
Single Bottleneck
28
T1 Terrestrial WAN Link with
Multiple Bottlenecks
29
TCP Friendliness
30
Summary
TCP/SPAND significantly reduces latency for short data transfers
  35-65% compared to reno-ssr / newreno-ssr
  20-50% compared to reno-nssr / newreno-nssr
  Even higher for fatter pipes
TCP/SPAND is TCP-friendly
TCP/SPAND is incrementally deployable
  Server-side modification only
  No modification at client-side
31
Part III Provision Content
Distribution Networks (CDNs)

On the Placement of Web Server Replicas. To
appear in INFOCOM'2001. (Joint work with V. N.
Padmanabhan and G. M. Voelker)
32
Introduction to CDNs
Content providers want to offer better service to their clients at lower cost
Increasing deployment of content distribution networks (CDNs)
  Akamai, Digital Island, Exodus …
Idea: a network of servers
Features:
  Outsourcing infrastructure
  Improve performance by moving content closer to end users
  Flash crowd protection
[Figure: clients, CDN servers, and content providers' servers]
33
Placement of CDN servers
Goal
  minimize users' latency or bandwidth usage
Minimum K-median problem
  Select K centers to minimize the sum of assignment costs
  Cost can be latency or bandwidth or other metric we want to optimize
  NP-hard problem
[Figure: clients, CDN servers, and content providers' servers]
34
Placement Algorithms
Tree based algorithm [LGG+99]
  Assume the underlying topologies are trees, and model it as a dynamic programming problem
  O(N^3 M^2) for choosing M replicas among N potential places
Random
  Pick the best among several random assignments
Hot spot
  Place replicas near the clients that generate the largest load
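The hot spot heuristic can be sketched in a few lines. The sketch assumes each client has already been mapped to its nearest candidate site, which is a simplification the slide does not specify; `hot_spot` and the toy input are hypothetical.

```python
from collections import Counter

def hot_spot(client_requests, m):
    """Pick the m candidate sites whose nearby clients generate the most requests.

    `client_requests` is a list of (nearest_site, request_count) pairs, one per client.
    """
    load = Counter()
    for site, count in client_requests:
        load[site] += count
    return [site for site, _ in load.most_common(m)]

# Hypothetical clients, each already mapped to its nearest candidate site.
clients = [("A", 30), ("B", 70), ("A", 20), ("C", 80), ("B", 50)]
print(hot_spot(clients, 2))
```

Unlike the tree based algorithm, this needs no topology beyond the client-to-site mapping, but it ignores how much a new replica overlaps with the ones already chosen.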
35
Placement Algorithms (Cont.)

Greedy algorithm
Greedy(N, M) {
    for i = 1 .. M {
        for each remaining candidate site R {
            cost[R] = cost after placing an additional replica at R
        }
        place a replica at the site R with the lowest cost[R]
    }
}
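The greedy pseudocode above can be sketched in Python as follows. The cost model is an assumption: each client is served by its closest replica and contributes its load multiplied by the distance to that replica, one instance of the assignment cost in the K-median formulation. The names `total_cost` and `greedy_placement` and the five-node line topology are illustrative only.

```python
def total_cost(dist, load, replicas):
    """Sum over clients of load times distance to the closest chosen replica."""
    return sum(load[c] * min(dist[c][r] for r in replicas) for c in load)

def greedy_placement(dist, load, candidates, m):
    """Choose m sites one at a time, each time adding the site that lowers total cost most."""
    chosen = []
    remaining = list(candidates)
    for _ in range(m):
        best = min(remaining, key=lambda r: total_cost(dist, load, chosen + [r]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical line topology: nodes 0..4, distance is the hop count between them.
nodes = range(5)
dist = [[abs(c - r) for r in nodes] for c in nodes]
load = {0: 10, 1: 1, 2: 1, 3: 1, 4: 1}   # requests generated at each node
placement = greedy_placement(dist, load, nodes, 2)
print(placement, total_cost(dist, load, placement))
```

Each of the M rounds evaluates every remaining site, so this sketch performs on the order of N·M cost evaluations; earlier choices are never revisited.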

Super Optimal algorithm

Lagrangian relaxation + subgradient method
36
Simulation Methodology
Network topology
  Randomly generated topologies
    Using GT-ITM Internet topology generator
  Real Internet network topology
    AS level topology obtained using BGP routing data from a set of seven geographically dispersed BGP peers
Web Workload
  Real server traces
    MSNBC, ClarkNet, NASA Kennedy Space Center
Performance Metric
  Relative performance: cost_practical / cost_super-optimal
37
Simulation Results in
Random Graph Topologies
38
Simulation Results in
Real Internet Topologies
39
Effects of Imperfect Knowledge
about Input Data

Predict load using moving window average
[Figures: (a) perfect knowledge about topology; (b) knowledge about topology accurate within a factor of 2]
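Load prediction with a moving window average can be sketched as below. The window length and the per-period load values are assumptions; the transcript does not state the window the authors used, and `predict_load` is a hypothetical name.

```python
def predict_load(history, window=3):
    """Predict the next period's load as the mean of the last `window` observations."""
    recent = list(history)[-window:]
    return sum(recent) / len(recent)

# Hypothetical per-day request counts from one client cluster.
history = [100, 120, 90, 110, 130]
print(predict_load(history))
```

A longer window smooths out day-to-day noise at the cost of reacting more slowly to shifts in where the load comes from.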
40
Summary
First experimental study on placement of CDNs
Knowledge about client workload and topology is crucial for provisioning CDNs
The greedy algorithm performs the best
  Within a factor of 1.1 – 1.5 of super-optimal
The greedy algorithm is insensitive to noise
  Stay within a factor of 2 of the super-optimal when the salted error is a factor of 4
The hot spot algorithm performs nearly as well
  Within a factor of 1.6 – 2 of super-optimal
How to obtain inputs
  Moving window average for load prediction
  Using BGP router data to obtain topology information
41
Contributions
Workload characterization
  Study the workload of MSNBC web site
Protocol efficiency
  Optimize TCP startup performance for Web transfers
Infrastructure provisioning
  Develop placement algorithms for Content Distribution Networks
42
Other Work




Available at
http://www.cs.cornell.edu/lqiu/papers/papers.html
Fast Firewall Implementations for Software and
Hardware-based Routers. Submitted to ACM
SIGMETRICS’2001.
Integrating Packet FEC into Adaptive Voice
Playout Buffer Algorithms on the Internet.
Proceedings of IEEE INFOCOM'2000, Tel-Aviv,
Israel, March 2000.
On Individual and Aggregate TCP Performance.
7th International Conference on Network
Protocols (ICNP'99), Toronto, Canada, October
1999.
43