An Integrated Approach to
Improving Web Performance
Lili Qiu
Cornell University
1
Outline
Motivation & Open Issues
Solutions
Study the workload of a busy Web server
Optimize TCP performance for Web transfers
Provision the content distribution networks
Summary & Other Work
2
Motivation
The Web is the dominant source of traffic on the Internet today
Accounts for over 70% of wide-area traffic
Web performance is often unsatisfactory
WWW – World Wide Wait
Consequence: losing potential customers!
Network congestion
Overloaded Web server
3
Challenges in Providing Highly Efficient Web Services
Workload characterization
The workload of busy Web sites is not well understood
Protocol inefficiency
Mismatch between Web transfers and the TCP protocol
Infrastructure provisioning
Current trend: Content Distribution Networks
Problem: Where to place replicas?
4
Our Solutions
Web workload characterization
Study the workload of a busy Web server
Improve protocol efficiency
Optimize TCP startup performance for Web transfers
Provision Web replication infrastructure
Develop placement algorithms for content distribution networks (CDNs)
5
Part I Web Workload Characterization
The Content and Access Dynamics of a Busy Web
Site: Findings and Implications. Proceedings of
ACM SIGCOMM 2000, Stockholm, Sweden,
August 2000. (Joint work with V. N. Padmanabhan)
6
Motivation
Solid understanding of Web workload is critical
for designing robust and scalable systems
Missing piece in previous work: workload of busy
Web servers
[Figure: clients, proxies, the Internet, replicas, and servers]
7
Overview
MSNBC server site
a large news site
consistently ranked among the busiest sites on the Web
server cluster with 40 nodes
25 million accesses a day (HTML content alone)
Period studied: Aug. – Oct. 99 & Dec. 17, 98 flash crowd
Server logs
HTTP access logs
Content Replication System (CRS) logs
HTML content logs
Data analysis
Content dynamics
Access dynamics
8
Temporal Stability of File Popularity
Methodology
Consider the traces from a pair of days
Pick the top n popular documents from each day
Compute the overlap
[Figure: extent of overlap (0 to 1) vs. # popular documents picked (1 to 100000), for the day pairs 17DEC98 - 18OCT99, 01AUG99 - 18OCT99, and 17OCT99 - 18OCT99]
Results
One day apart: significant overlap (80%)
Two months apart: smaller overlap (20-80%)
Ten months apart: very small overlap (mostly below 20%)
The set of popular documents remains stable for days
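The overlap methodology above can be sketched in a few lines of Python. This is only an illustration: the function names, the toy request lists, and the choice to normalize by n are assumptions, not details taken from the study.

```python
from collections import Counter

def top_n(requests, n):
    """Return the set of the n most requested documents in one day's trace."""
    return {doc for doc, _ in Counter(requests).most_common(n)}

def overlap(day_a, day_b, n):
    """Fraction of the top-n documents that the two days have in common."""
    return len(top_n(day_a, n) & top_n(day_b, n)) / n

# Hypothetical one-day traces (one entry per request).
day1 = ["/news", "/news", "/news", "/sports", "/sports", "/weather"]
day2 = ["/news", "/news", "/sports", "/tech", "/tech", "/tech"]
print(overlap(day1, day2, 2))
```

Sweeping n from 1 up to the number of distinct documents would give one curve of the kind plotted for each pair of days.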
9
Spatial Locality in Client Accesses
[Figure: two panels, "Normal Day" and "Dec. 17, 1998", each plotting fraction of requests shared vs. domain ID, with curves labeled "Trace" and "Random"]
Domain membership is significant except when there is a “hot” event of global interest
10
Spatial Distribution of
Client Accesses
Cluster clients using network-aware clustering [KW00]
IP addresses with the same address prefix belong to a cluster
Top 10, 100, 1000, 3000 clusters account for about 24%, 45%, 78%, and 94% of the requests respectively
A small number of client clusters
contribute most of the requests.
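A rough sketch of how such a per-cluster breakdown could be computed is shown below. It assumes a list of address prefixes is already available and assigns each client to its longest matching prefix; the prefixes, client addresses, and function names are made up for illustration and are not taken from [KW00] or the MSNBC logs.

```python
import ipaddress
from collections import Counter

def cluster_of(ip, prefixes):
    """Longest-prefix match of ip against a list of ip_network objects."""
    addr = ipaddress.ip_address(ip)
    matches = [p for p in prefixes if addr in p]
    return max(matches, key=lambda p: p.prefixlen) if matches else None

def top_k_share(client_ips, prefixes, k):
    """Fraction of all requests issued by the k busiest clusters."""
    counts = Counter(cluster_of(ip, prefixes) for ip in client_ips)
    counts.pop(None, None)  # requests that match no prefix stay unclustered
    return sum(c for _, c in counts.most_common(k)) / len(client_ips)

# Hypothetical prefixes and one client IP per request.
prefixes = [ipaddress.ip_network(p)
            for p in ("10.0.0.0/8", "10.1.0.0/16", "192.168.1.0/24")]
ips = ["10.1.2.3", "10.1.9.9", "10.2.0.1", "192.168.1.5"]
print(top_k_share(ips, prefixes, 1))
```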
11
The Applicability of Zipf's Law to Web Requests
[Figure: range of $\alpha$ (axis from 0 to 2) for MSNBC, proxies, and less popular servers]
Web requests follow a Zipf-like distribution
Request frequency $\propto 1/i^{\alpha}$, where i is a document’s popularity rank
The value of $\alpha$ is much larger in MSNBC traces
1.4 – 1.8 in MSNBC traces
smaller than or close to 1 in the proxy traces
close to 1 in the small departmental server logs [ABC+96]
Highest when there is a hot event
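One common way to estimate the exponent of a Zipf-like distribution (written α here) is a least-squares line fit of log(frequency) against log(rank). The sketch below does that in plain Python; it is an assumed estimation procedure for illustration, not necessarily the one used in the study, and the synthetic counts are generated rather than measured.

```python
import math

def fit_zipf_alpha(counts):
    """Estimate alpha in frequency ~ 1/rank**alpha by a log-log line fit."""
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx  # the fitted slope is -alpha

# Synthetic request counts that follow an exact power law with alpha = 1.5.
synthetic = [1e6 / rank ** 1.5 for rank in range(1, 1001)]
print(round(fit_zipf_alpha(synthetic), 3))
```

Real traces are noisy in the tail, so the fitted value depends on which ranks are included in the fit.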
12
Impact of Larger $\alpha$
[Figure: percentage of requests vs. percentage of documents (sorted by popularity), for 12/17/98 server traces, 08/01/99 server traces, and 10/06/99 proxy traces]
Accesses in MSNBC traces are much more concentrated
90% of the accesses are accounted for by
Top 2-4% of files in MSNBC traces
Top 36% of files in proxy traces (Microsoft proxies and the proxies studied in [BCF+99])
Top 10% of files in small departmental server logs reported in [AW96]
Popular news sites like MSNBC see much more concentrated accesses
Reverse caching and replication can be very effective!
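The concentration figures above answer the question "what fraction of the documents, taken in popularity order, covers 90% of the accesses?". A minimal sketch of that computation follows; the function name and the toy counts are illustrative assumptions.

```python
def fraction_of_docs_for_share(counts, share=0.9):
    """Smallest fraction of documents (most popular first) whose requests
    add up to at least `share` of all requests."""
    ranked = sorted(counts, reverse=True)
    target = share * sum(ranked)
    running = 0
    for k, count in enumerate(ranked, start=1):
        running += count
        if running >= target:
            return k / len(ranked)
    return 1.0

# Hypothetical per-document request counts.
counts = [80, 15, 2, 1, 1, 1]
print(fraction_of_docs_for_share(counts))
```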
13
Part II Transport Layer
Optimization for the Web
Speeding Up Short Data Transfers: Theory,
Architectural Support, and Simulation Results.
Proceedings of NOSSDAV 2000 (Joint work with
Yin Zhang and Srinivasan Keshav)
14
Motivation
Characteristics of Web data transfers
Short & bursty [Mah97]
Use TCP
Problem: Short data transfers interact poorly with TCP!
15
TCP/Reno Basics
Slow Start
Exponential growth in congestion window
Slow: log(n) round trips for n segments
Congestion Avoidance
Linear probing of bandwidth
Fast Retransmission
Triggered by 3 duplicate ACKs
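To make the log(n) cost of slow start concrete, the sketch below counts round trips under an idealized model: the window starts at one segment and doubles every RTT, with no losses, no delayed ACKs, no receiver-window limit, and the connection handshake ignored. These simplifications are assumptions made for the illustration.

```python
def slow_start_rtts(n_segments, initial_cwnd=1):
    """Round trips needed to send n_segments when cwnd doubles every RTT."""
    cwnd, sent, rtts = initial_cwnd, 0, 0
    while sent < n_segments:
        sent += cwnd   # one full window goes out per round trip
        cwnd *= 2      # exponential growth during slow start
        rtts += 1
    return rtts

for n in (1, 3, 10, 100):
    print(n, slow_start_rtts(n))
```

Under this model a 10-segment transfer needs 4 round trips even on an idle path, which is why short Web transfers spend most of their time in slow start.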
16
Related Work
P-HTTP [PM94]
Reuses a single TCP connection for multiple Web transfers, but still pays slow start penalty
T/TCP [Bra94]
Caches connection count, RTT
TCP Control Block Interdependence [Tou97]
Caches cwnd, but large bursts cause losses
Rate Based Pacing [VH97]
4K Initial Window [AFP98]
Fast Start [PK98, Pad98]
Needs router support to ensure TCP friendliness
17
Our Approach
Directly enter Congestion Avoidance
Choose optimal initial congestion window
A Geometry Problem: Fitting a block to the
service rate curve to minimize completion time
18
Optimal Initial cwnd
Minimize completion time by having the
transfer end at an epoch boundary.
19
Shift Optimization
Minimize initial cwnd while keeping the
same integer number of RTTs
Before optimization: cwnd = 9
After optimization: cwnd = 5
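A minimal sketch of the shift idea, under simplifying assumptions: the sender is in congestion avoidance, cwnd grows by one segment per RTT, and there are no losses. The function names are hypothetical, and the 11-segment transfer is chosen only because it happens to reproduce a 9 to 5 reduction; the slide does not state the transfer size behind its example.

```python
def rtts_needed(n_segments, initial_cwnd):
    """RTTs to send n_segments when cwnd grows by one segment per RTT."""
    cwnd, sent, rtts = initial_cwnd, 0, 0
    while sent < n_segments:
        sent += cwnd
        cwnd += 1      # linear growth in congestion avoidance
        rtts += 1
    return rtts

def shift_optimize(n_segments, initial_cwnd):
    """Smallest initial cwnd that still finishes in the same number of RTTs."""
    target = rtts_needed(n_segments, initial_cwnd)
    cwnd = initial_cwnd
    while cwnd > 1 and rtts_needed(n_segments, cwnd - 1) == target:
        cwnd -= 1
    return cwnd

print(shift_optimize(11, 9))
```

A smaller initial window with the same completion time sends a smaller first burst, which is the motivation for shifting.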
20
Effect of Shift
Optimization
21
TCP/SPAND
Estimate network state by sharing performance
information
SPAND: Shared PAssive Network Discovery
[SSK97]
[Figure: Web servers, a performance gateway, and the Internet]
Directly enter Congestion Avoidance, starting with
the optimal initial cwnd
Avoid large bursts by pacing
22
Implementation Issues
Scope for sharing and aggregation
24-bit heuristic
network-aware clustering [KW00]
Collecting performance information
Performance reports, New TCP option, Windmill’s approach, …
Information aggregation
Sliding window average
Retrieving estimation of network state
Explicit query, active push, …
Pacing
Leaky-bucket based pacing
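As an illustration of leaky-bucket pacing, the sketch below computes departure times when queued segments drain at a fixed rate instead of leaving as one burst. It is a generic leaky-bucket model with assumed names and parameters, not the TCP/SPAND implementation.

```python
def leaky_bucket_departures(arrivals, rate):
    """Departure times when segments drain at `rate` segments per second.

    arrivals: times (seconds) at which segments are handed to the pacer.
    """
    gap = 1.0 / rate
    departures = []
    next_free = 0.0
    for t in sorted(arrivals):
        depart = max(t, next_free)   # wait until the bucket may leak again
        departures.append(depart)
        next_free = depart + gap
    return departures

burst = [0.0] * 4  # four segments released by the sender at the same instant
# rate = 40 segments/s, e.g. a window of 4 segments spread over a 100 ms RTT
print(leaky_bucket_departures(burst, rate=40.0))
```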
23
Opportunity for Sharing
MSNBC: 90% of requests arrive within 5 minutes of the most recent request from the same client network (using 24-bit heuristic)
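A statistic of this kind could be measured from an access log roughly as follows. The sketch groups IPv4 clients by their first 24 address bits and, as an assumption about the exact definition, excludes the first request seen from each network from the denominator. The log entries are invented.

```python
def network_of(ip):
    """24-bit heuristic: the first three octets identify the client network."""
    return ".".join(ip.split(".")[:3])

def fraction_within(requests, window=300.0):
    """Fraction of repeat requests arriving within `window` seconds of the
    previous request from the same client network.

    requests: iterable of (timestamp_seconds, client_ip) in any order.
    """
    last_seen = {}
    hits = repeats = 0
    for ts, ip in sorted(requests):
        net = network_of(ip)
        if net in last_seen:
            repeats += 1
            if ts - last_seen[net] <= window:
                hits += 1
        last_seen[net] = ts
    return hits / repeats if repeats else 0.0

# Hypothetical log entries.
log = [(0, "1.2.3.4"), (100, "1.2.3.9"), (1000, "1.2.3.4"), (50, "5.6.7.8")]
print(fraction_within(log))
```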
24
Cost for Sharing
MSNBC: 15,000-25,000 different client networks in a 5-minute interval during peak hours (using 24-bit heuristic)
25
Simulation Results
Methodology
Download files in rounds
Performance Metric
Average completion time
TCP flavors considered
reno-ssr: Reno with slow start restart
reno-nssr: Reno w/o slow start restart
newreno-ssr: NewReno with slow start restart
newreno-nssr: NewReno w/o slow start restart
26
Simulation Topologies
27
T1 Terrestrial WAN Link with
Single Bottleneck
28
T1 Terrestrial WAN Link with
Multiple Bottlenecks
29
TCP Friendliness
30
Summary
TCP/SPAND significantly reduces latency
for short data transfers
35-65% compared to reno-ssr / newreno-ssr
20-50% compared to reno-nssr / newreno-nssr
Even higher for fatter pipes
TCP/SPAND is TCP-friendly
TCP/SPAND is incrementally deployable
Server-side modification only
No modification at client-side
31
Part III Provision Content
Distribution Networks (CDNs)
On the Placement of Web Server Replicas. To
appear in INFOCOM'2001. (Joint work with V. N.
Padmanabhan and G. M. Voelker)
32
Introduction to CDNs
Content providers want to offer better service to their clients at lower cost
Increasing deployment of content distribution networks (CDNs)
Akamai, Digital Island, Exodus …
Idea: a network of servers
Features:
Outsourcing infrastructure
Improve performance by moving content closer to end users
Flash crowd protection
[Figure: clients, CDN servers, and content providers' servers]
33
Placement of CDN servers
[Figure: clients, CDN servers, and content providers' servers]
Goal
minimize users’ latency or bandwidth usage
Minimum K-median problem
Select K centers to minimize the sum of assignment costs
Cost can be latency or bandwidth or other metric we want to optimize
NP-hard problem
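Written out, the K-median objective takes roughly the following form. The symbols are assumptions introduced here for illustration: $V$ is the set of clients, $C$ the set of candidate sites, $w_i$ the load generated by client $i$, and $d(i,j)$ the cost (latency, bandwidth, or another metric) of serving client $i$ from site $j$.

```latex
\[
  \min_{S \subseteq C,\; |S| = K} \;\; \sum_{i \in V} w_i \, \min_{j \in S} d(i, j)
\]
```

Each client is assigned to its cheapest selected site, and the selection $S$ is chosen to minimize the load-weighted sum of those assignment costs.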
34
Placement Algorithms
Tree based algorithm [LGG+99]
Assume the underlying topologies are trees, and model it as a dynamic programming problem
$O(N^3 M^2)$ for choosing M replicas among N potential places
Random
Pick the best among several random assignments
Hot spot
Place replicas near the clients that generate the largest load
35
Placement Algorithms (Cont.)
Greedy algorithm
Greedy(N, M) {
    for i = 1 .. M {
        for each remaining candidate site R {
            cost[R] = total cost after placing an additional replica at R
        }
        place a replica at the candidate site with the lowest cost[R]
    }
}
Super Optimal algorithm
Lagrangian relaxation + subgradient method
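The greedy pseudocode above could be turned into runnable form along these lines. This is a sketch under assumptions: cost is taken to be the load-weighted distance from each client to its nearest replica, distances come from a precomputed table, and the four-node line topology and loads are invented for the example.

```python
def total_cost(replicas, clients, dist):
    """Sum over clients of load times distance to the nearest replica."""
    return sum(load * min(dist[c][r] for r in replicas)
               for c, load in clients.items())

def greedy_placement(candidates, clients, dist, m):
    """Pick m sites one at a time, each time adding the site that gives the
    lowest total cost together with the sites already chosen."""
    chosen = []
    remaining = set(candidates)
    for _ in range(min(m, len(remaining))):
        # sorted() only makes tie-breaking deterministic
        best = min(sorted(remaining),
                   key=lambda r: total_cost(chosen + [r], clients, dist))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical topology: four nodes on a line, distance = hop count.
position = {"A": 0, "B": 1, "C": 2, "D": 3}
dist = {a: {b: abs(position[a] - position[b]) for b in position}
        for a in position}
clients = {"A": 1, "B": 2, "C": 1, "D": 5}   # node -> request load

placement = greedy_placement(list(position), clients, dist, 2)
print(placement, total_cost(placement, clients, dist))
```

Each of the m rounds evaluates every remaining site once, so the sketch makes on the order of N times M cost evaluations.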
36
Simulation Methodology
Network topology
Randomly generated topologies
Using GT-ITM Internet topology generator
Real Internet network topology
AS level topology obtained using BGP routing data from a set of seven geographically dispersed BGP peers
Web Workload
Real server traces
MSNBC, ClarkNet, NASA Kennedy Space Center
Performance Metric
Relative performance: $\mathrm{cost}_{\text{practical}} / \mathrm{cost}_{\text{super-optimal}}$
37
Simulation Results in
Random Graph Topologies
38
Simulation Results in
Real Internet Topologies
39
Effects of Imperfect Knowledge
about Input Data
Predict load using moving window average
(a) Perfect knowledge about topology
(b) Knowledge about topology accurate within a factor of 2
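A moving window average predictor can be as simple as the sketch below, which predicts the next period's load as the mean of the most recent observations. The window length and the sample loads are assumptions; the slides do not give the window used in the study.

```python
def predict_load(history, window=3):
    """Predict the next load as the mean of the last `window` observations."""
    recent = list(history)[-window:]
    return sum(recent) / len(recent)

# Hypothetical per-day request counts from one client cluster.
loads = [100, 120, 140, 160]
print(predict_load(loads))
```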
40
Summary
First experimental study on placement of CDNs
Knowledge about client workload and topology is crucial for provisioning CDNs
The greedy algorithm performs the best
Within a factor of 1.1 – 1.5 of super-optimal
The greedy algorithm is insensitive to noise
Stays within a factor of 2 of the super-optimal when the salted error is a factor of 4
The hot spot algorithm performs nearly as well
Within a factor of 1.6 – 2 of super-optimal
How to obtain inputs
Moving window average for load prediction
Using BGP router data to obtain topology information
41
Contributions
Workload characterization
Study the workload of MSNBC web site
Protocol efficiency
Optimize TCP startup performance for Web transfers
Infrastructure provisioning
Develop placement algorithms for Content Distribution Networks
42
Other Work
Available at http://www.cs.cornell.edu/lqiu/papers/papers.html
Fast Firewall Implementations for Software and Hardware-based Routers. Submitted to ACM SIGMETRICS’2001.
Integrating Packet FEC into Adaptive Voice Playout Buffer Algorithms on the Internet. Proceedings of IEEE INFOCOM'2000, Tel-Aviv, Israel, March 2000.
On Individual and Aggregate TCP Performance. 7th International Conference on Network Protocols (ICNP'99), Toronto, Canada, October 1999.
43