Integrated Approach to
Improving Web Performance
Lili Qiu
Cornell University
1
Outline
Motivation & Open Issues
Solutions
Study Web workload and properly provision content distribution networks
Optimizing TCP performance for Web transfers
Fast packet classification
Summary
Other Work
2
Motivation
Web is the dominant traffic in the
Internet today
Web performance is often unsatisfactory
WWW – World Wide Wait
Consequence: losing potential customers!
[Diagram: network congestion in the Internet and an overloaded Web server]
3
Why is the Web so slow?
Application layer
Web servers are overloaded …
Transport layer
Web transfers are short and bursty, and interact poorly with TCP
Network layer
Routers are not fast enough
Network congestion
Route flaps and routing instabilities
…
Inefficiency in any layer of the
protocol stack can slow down the Web!
4
Our Solutions
Application layer
Study Web workload
Properly provision content distribution networks (CDNs)
Transport layer
Optimize TCP startup performance for Web transfers
Network layer
Speed up packet classification (useful for firewalls & diff-serv)
5
Part I Application Layer Approach
Study the workload of busy Web servers
The Content and Access Dynamics of a Busy Web Site:
Findings and Implications. Proceedings of ACM
SIGCOMM 2000, Stockholm, Sweden, August 2000.
(Joint work with V. N. Padmanabhan)
Properly provision content distribution networks
On the Placement of Web Server Replicas. Submitted to
INFOCOM'2001. (Joint work with V. N. Padmanabhan and
G. M. Voelker)
6
Introduction
Solid understanding of Web workload is critical
for designing robust and scalable systems
The workload of popular Web servers is not well
understood
Study the content and access dynamics of the MSNBC Web site
a large news server
one of the busiest sites on the Web
25 million accesses a day (HTML content alone)
Period studied: Aug.–Oct. '99, plus the Dec. 17, '98 flash crowd
Properly provision content distribution networks
Where to place the edge servers in the CDNs
7
Temporal Stability of
File Popularity
Methodology
Consider the traces from a pair of days
Pick the top n popular documents from each day
Compute the overlap
[Figure: extent of overlap (0 to 1) vs. number of popular documents picked (1 to 100,000), for the day pairs 17DEC98–18OCT99, 01AUG99–18OCT99, and 17OCT99–18OCT99]
Results
One day apart: significant overlap (80%)
Two months apart: smaller overlap (20–80%)
Ten months apart: very small overlap (mostly below 20%)
The set of popular documents remains stable for days
8
Spatial Locality in
Client Accesses
Dec. 17, 1998 (flash crowd) vs. a normal day
[Figure: fraction of requests shared vs. domain ID for both days, comparing the actual trace against a random assignment of clients to domains]
Domain membership is significant
except when there is a “hot” event of global interest
9
Spatial Distribution of
Client Accesses
Cluster clients using
network aware clustering
[KW00]
IP addresses with the same address prefix belong to a cluster
Top 10, 100, 1000, 3000
clusters account for
about 24%, 45%, 78%,
and 94% of the requests
respectively
A small number of client clusters
contribute most of the requests.
10
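To make the clustering step concrete, here is a toy Python sketch (mine, for illustration) of the simpler 24-bit heuristic used later in this talk; it is not the [KW00] algorithm, which groups clients by their longest matching BGP routing prefix rather than a fixed /24.

from ipaddress import ip_network

# Toy clustering: with the 24-bit heuristic, all IP addresses sharing
# their first 24 bits fall into one client cluster. Network-aware
# clustering [KW00] would use BGP prefixes instead.
def cluster_24bit(ip: str) -> str:
    return str(ip_network(ip + "/24", strict=False))

print(cluster_24bit("207.46.230.218"))   # -> 207.46.230.0/24
print(cluster_24bit("207.46.230.219"))   # same /24, same cluster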
The Applicability of Zipf's Law
to Web Requests
[Figure: measured Zipf exponent α (0 to 2) for the MSNBC traces, proxy traces, and less popular servers]
Web requests follow a Zipf-like distribution
Request frequency ∝ 1/i^α, where i is a document's ranking
The value of α is much larger in the MSNBC traces
α = 1.4–1.8 in the MSNBC traces
α smaller than or close to 1 in the proxy traces
α close to 1 in the small departmental server logs [ABC+96]
α is highest when there is a hot event
11
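A quick back-of-the-envelope Python check (my sketch; the catalog size N is illustrative) of how the exponent α drives the concentration shown on the next slide:

# For a Zipf-like workload, P(i) ~ 1/i**alpha over N documents. A larger
# alpha concentrates requests on the most popular documents.
N = 10_000   # assumed catalog size, for illustration

def top_share(alpha: float, top_frac: float) -> float:
    weights = [1.0 / i**alpha for i in range(1, N + 1)]
    k = int(top_frac * N)
    return sum(weights[:k]) / sum(weights)

print(f"alpha=1.4, top 2% of docs:  {top_share(1.4, 0.02):.0%}")   # ~90%
print(f"alpha=1.0, top 36% of docs: {top_share(1.0, 0.36):.0%}")   # ~90%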
Impact of larger α
[Figure: cumulative percentage of requests vs. percentage of documents (sorted by popularity), for the 12/17/98 server trace, 08/01/99 server trace, and 10/06/99 proxy trace]
Accesses in MSNBC traces are much more concentrated
90% of the accesses are accounted for by
the top 2–4% of files in the MSNBC traces
the top 36% of files in the proxy traces (Microsoft proxies and the proxies studied in [BCF+99])
the top 10% of files in the small departmental server logs reported in [AW96]
Popular news sites like MSNBC see much more concentrated accesses
Reverse caching and replication can be very effective!
12
Introduction to Content
Distribution Networks (CDNs)
[Diagram: a network of CDN servers placed between content providers and clients]
Content providers want to offer better service to their clients at lower cost
Increasing deployment of content distribution networks (CDNs)
Akamai, Digital Island, Exodus …
Idea: a network of servers
Features:
Outsourcing infrastructure
Improve performance by moving content closer to end users
Flash crowd protection
13
Placement of CDN servers
[Diagram: candidate CDN server locations in the network between content providers and clients]
Goal
Minimize users' latency or bandwidth usage
Minimum K-median problem
Select K centers to minimize the sum of assignment costs
Cost can be latency, bandwidth, or any other metric we want to optimize
NP-hard problem
14
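In standard notation, the objective can be written as follows (my phrasing of "the sum of assignment costs": V is the set of candidate sites, w_i the request load of client i, and d(i, s) the latency or bandwidth cost from client i to replica s):

$$\min_{S \subseteq V,\; |S| = K} \;\; \sum_{i \in \text{clients}} w_i \cdot \min_{s \in S} d(i, s)$$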
Placement Algorithms
Tree-based algorithm [LGG+99]
Assume the underlying topologies are trees, and model it as a dynamic programming problem
O(N^3 M^2) for choosing M replicas among N potential places
Random
Pick the best among several random assignments
Hot spot
Place replicas near the clients that generate the largest load
15
Placement Algorithms (Cont.)
Greedy algorithm
Greedy(N, M) {
  for i = 1 .. M {
    for each remaining candidate site R {
      cost[R] = total cost after placing an additional replica at R
    }
    place a replica at the candidate site with the lowest cost[R]
  }
}
Super Optimal algorithm
Lagrangian relaxation + subgradient method
16
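As a runnable rendering of the Greedy(N, M) pseudocode above, here is a short Python sketch (mine, not the authors' implementation); the distance matrix, client loads, and toy instance are made up for illustration.

import random

# cost(placed) is the K-median objective: each client is assigned to its
# nearest placed replica, weighted by the client's request load.
def total_cost(dist, load, placed):
    return sum(load[c] * min(dist[c][s] for s in placed)
               for c in range(len(load)))

def greedy_placement(dist, load, m):
    placed, remaining = [], set(range(len(dist[0])))
    for _ in range(m):
        # Try each remaining candidate site and keep the cheapest one.
        best = min(remaining,
                   key=lambda r: total_cost(dist, load, placed + [r]))
        placed.append(best)
        remaining.remove(best)
    return placed

# Toy instance: 4 clients x 3 candidate sites, random distances and loads.
random.seed(0)
dist = [[random.randint(1, 10) for _ in range(3)] for _ in range(4)]
load = [random.randint(1, 5) for _ in range(4)]
print(greedy_placement(dist, load, m=2))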
Simulation Methodology
Network topology
Randomly generated topologies
Using the GT-ITM Internet topology generator
Real Internet network topology
AS-level topology obtained using BGP routing data from a set of seven geographically dispersed BGP peers
Web Workload
Real server traces
MSNBC, ClarkNet, NASA Kennedy Space Center
Performance Metric
Relative performance: cost_practical / cost_super-optimal
17
Simulation Results in
Random Tree Topologies
18
Simulation Results in
Random Graph Topologies
19
Simulation Results in
Real Internet Topologies
20
Effects of Imperfect Knowledge
about Input Data
Predict load using a moving window average (see the sketch after this slide)
(a) Perfect knowledge about topology
(b) Topology knowledge accurate only to within a factor of 2
21
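A minimal sketch of the moving-window average load predictor referenced above (my reconstruction; the window length and loads are illustrative):

from collections import deque

# Predict the next period's per-cluster load as the mean of the last
# `window` observations; a flash crowd pulls the forecast up gradually.
def make_predictor(window: int = 3):
    history = deque(maxlen=window)
    def observe_and_predict(load: float) -> float:
        history.append(load)
        return sum(history) / len(history)
    return observe_and_predict

predict = make_predictor()
for daily_load in [120, 80, 100, 400]:   # flash crowd on the last day
    print(predict(daily_load))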
Conclusion
Characterize Web workload using MSNBC traces
Placement of CDN servers
Knowledge about client workload and topology is crucial
for provisioning CDNs
The greedy algorithm performs the best
Within a factor of 1.1–1.5 of super-optimal
The hot spot algorithm performs nearly as well
Within a factor of 1.6–2 of super-optimal
The greedy algorithm is insensitive to noise
Stays within a factor of 2 of the super-optimal even when the salted error is a factor of 4
How to obtain inputs
Moving window average for load prediction
Using BGP router data to obtain topology information
22
Part II Transport Layer
Approach
Speeding Up Short Data Transfers: Theory,
Architectural Support, and Simulation Results.
Proceedings of NOSSDAV 2000 (Joint work with
Yin Zhang and Srinivasan Keshav)
23
Motivation
Characteristics of Web data transfers
Short & bursty [Mah97]
Use TCP
Problem: Short data transfers interact poorly with TCP!
24
TCP/Reno Basics
Slow Start
Exponential growth in congestion window
Slow: log(n) round trips for n segments
Congestion Avoidance
Linear probing of bandwidth
Fast Retransmission
Triggered by 3 duplicate ACKs
25
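To make the log(n) claim concrete, here is a quick sketch (mine; the 10-segment transfer is illustrative): cwnd doubles every RTT, so RTT j delivers 2^(j-1) segments.

def slow_start_rtts(n_segments: int) -> int:
    """Round trips slow start needs to deliver n segments (losses ignored)."""
    sent, cwnd, rtts = 0, 1, 0
    while sent < n_segments:
        sent += cwnd    # one window per round trip
        cwnd *= 2       # exponential growth
        rtts += 1
    return rtts

print(slow_start_rtts(10))   # a typical short Web transfer: 4 RTTs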
Related Work
P-HTTP [PM94]
Reuses a single TCP connection for multiple Web transfers, but still pays the slow start penalty
T/TCP [Bra94]
Caches connection count and RTT
TCP Control Block Interdependence [Tou97]
Caches cwnd, but large bursts cause losses
Rate Based Pacing [VH97]
4K Initial Window [AFP98]
Fast Start [PK98, Pad98]
Needs router support to ensure TCP friendliness
26
Our Approach
Directly enter Congestion Avoidance
Choose optimal initial congestion window
A Geometry Problem: Fitting a block to the
service rate curve to minimize completion time
27
Optimal Initial cwnd
Minimize completion time by having the
transfer end at an epoch boundary.
28
Shift Optimization
Minimize the initial cwnd while keeping the same integer number of RTTs
Before optimization:
cwnd = 9
After optimization:
cwnd = 5
29
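The two optimizations above reduce to a few lines of arithmetic. This is my reconstruction from the slides, not the authors' code: in congestion avoidance, RTT j delivers w + j - 1 segments, so k round trips deliver k*w + k(k-1)/2; the 11-segment transfer and the cap of 9 are assumed values chosen to reproduce the slide's example numbers.

import math

def optimal_initial_cwnd(n_segments: int, max_cwnd: int):
    # Fewest RTTs achievable with the largest safe initial window.
    k = 1
    while k * max_cwnd + k * (k - 1) // 2 < n_segments:
        k += 1
    # Shift optimization: smallest initial window that still finishes
    # in the same integer number k of RTTs.
    w = math.ceil((n_segments - k * (k - 1) // 2) / k)
    return k, w

print(optimal_initial_cwnd(11, 9))   # -> (2, 5): cwnd 9 shifts down to 5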
Effect of Shift
Optimization
30
TCP/SPAND
Estimate network state by sharing performance
information
SPAND: Shared PAssive Network Discovery
[SSK97]
[Diagram: Web servers reporting measurements to a shared performance server across the Internet]
Directly enter Congestion Avoidance, starting with
the optimal initial cwnd
Avoid large bursts by pacing
31
Implementation Issues
Scope for sharing and aggregation
Collecting performance information
Performance reports, new TCP option, Windmill's approach, …
Sliding window average
Retrieving estimates of network state
Explicit query, active push, …
Information aggregation
24-bit heuristic
Network-aware clustering [KW00]
Pacing
Leaky bucket based pacing (see the sketch after this slide)
32
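A minimal leaky-bucket pacer sketch (my illustration; the slides only name the technique): instead of releasing a whole window back-to-back, packets go out at a fixed rate, which avoids the large bursts mentioned earlier.

import time

class LeakyBucketPacer:
    """Release packets no faster than rate_pps, smoothing out bursts."""
    def __init__(self, rate_pps: float):
        self.interval = 1.0 / rate_pps
        self.next_slot = time.monotonic()

    def send(self, packet, tx):
        now = time.monotonic()
        if now < self.next_slot:
            time.sleep(self.next_slot - now)   # wait for our slot
        tx(packet)
        self.next_slot = max(now, self.next_slot) + self.interval

pacer = LeakyBucketPacer(rate_pps=100)          # pace at 100 packets/s
for seq in range(5):
    pacer.send(seq, tx=lambda p: print("sent", p))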
Opportunity for Sharing
MSNBC: 90% of requests arrive within 5 minutes of the most recent request from the same client network (using the 24-bit heuristic)
33
Cost for Sharing
MSNBC: 15,000–25,000 different client networks in a 5-minute interval during peak hours (using the 24-bit heuristic)
34
Simulation Results
Methodology
Download files in rounds
Performance Metric
Average completion time
TCP flavors considered
reno-ssr: Reno with slow start restart
reno-nssr: Reno w/o slow start restart
newreno-ssr: NewReno with slow start restart
newreno-nssr: NewReno w/o slow start restart
35
Simulation Topologies
36
T1 Terrestrial WAN Link with
Single Bottleneck
37
T1 Terrestrial WAN Link with
Multiple Bottlenecks
38
T1 Terrestrial WAN Link with Multiple
Bottlenecks and Heavy Congestion
39
TCP Friendliness (I)
Against reno-ssr with 50-ms Timer
40
TCP Friendliness (II)
Against reno-ssr with 200-ms Timer
41
Conclusions
TCP/SPAND significantly reduces latency
for short data transfers
35-65% compared to reno-ssr / newreno-ssr
20-50% compared to reno-nssr / newreno-nssr
Even higher for fatter pipes
TCP/SPAND is TCP-friendly
TCP/SPAND is incrementally deployable
Server-side modification only
No modification at client-side
42
Part III Network Layer
Approach
Fast Packet Classification on Multiple Dimensions.
Cornell CS Technical Report 2000-1805, July
2000. (Joint work with G. Varghese and S. Suri, in
progress)
43
Motivation
Traditionally, routers forward packets based on
the destination field only
Diff-serv and firewalls require layer-4 switching
forward packets based on multiple fields in the packet
header, e.g. source IP address, destination IP address,
source port, destination port, protocol, type of service
(tos) …
The general packet classification problem has poor worst-case cost:
Given N arbitrary filters with k packet fields,
either the worst-case search time is Ω((log N)^(k-1))
or the worst-case storage is O(N^k)
44
Problem Specification
Given a set of filters (or rules), where each filter
specifies
a class of packet headers based on K fields
an associated directive, which specifies how to
forward the packet matching this filter
Goal: Find the best matching filter for each
incoming packet
A packet P matches a filter F if every field of P
matches the corresponding field of F
Exact match, prefix match, or range match
Assume prefix matching
45
Problem Specification (Cont.)
Example of a Cisco Access Control List (ACL)
1. access-list 100 deny udp 26.145.168.192 255.255.255.255 74.199.168.192 255.255.255.0 eq 2049
2. access-list 100 permit ip 74.199.191.192 255.255.0.0 74.199.168.192 255.255.0.0
3. access-list 100 permit tcp 250.197.149.202 255.0.0.0 74.199.20.76 255.0.0.0
Packet: tcp 250.19.34.34 74.23.5.12 matches filter 3
46
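To double-check the example, here is a small Python sketch (mine). Note that the slide appears to treat the second address of each pair as a netmask whose 1-bits must match; stock Cisco ACLs use inverted wildcard masks, so this follows the slide's convention rather than IOS semantics.

from ipaddress import IPv4Address

def masked_match(addr: str, base: str, mask: str) -> bool:
    a, b, m = (int(IPv4Address(x)) for x in (addr, base, mask))
    return a & m == b & m   # compare only the bits the mask keeps

# Filter 3: permit tcp 250.197.149.202 255.0.0.0 74.199.20.76 255.0.0.0
src_ok = masked_match("250.19.34.34", "250.197.149.202", "255.0.0.0")
dst_ok = masked_match("74.23.5.12", "74.199.20.76", "255.0.0.0")
print(src_ok and dst_ok)   # True: the packet matches filter 3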
Backtracking Search
[Figure: a one-dimensional trie storing prefix 00* (filter F1) and prefix 10* (filter F2)]
A trie is a binary
branching tree, with each
branch labeled 0 or 1
The prefix associated
with a node is the
concatenation of all the
bits from the root to the
node
47
Backtracking Search (Cont.)
Extend to multiple
dimensions
Backtracking is a
depth-first traversal
of the tree which
visits all the nodes
satisfying the given
constraints
Example: search for
[00*,0*,0*]
48
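Here is a compact Python sketch of hierarchical tries with backtracking search (my reconstruction of the idea, not the paper's implementation). Filters are tuples of bit-string prefixes, so [00*, 0*, 0*] becomes ("00", "0", "0"); the search walks each field's bits and recurses into the next dimension at every node where some filter's prefix ends.

class Node:
    def __init__(self):
        self.children = {}     # '0'/'1' -> Node
        self.next_dim = None   # trie over the next field, if a prefix ends here
        self.filters = []      # filters whose last-dimension prefix ends here

def insert(root, filt, dim=0):
    node = root
    for bit in filt[dim]:
        node = node.children.setdefault(bit, Node())
    if dim == len(filt) - 1:
        node.filters.append(filt)
    else:
        node.next_dim = node.next_dim or Node()
        insert(node.next_dim, filt, dim + 1)

def search(root, fields, dim=0):
    """Depth-first backtracking: every node on the path spelled by
    fields[dim] may end a matching prefix, so each one is explored."""
    matches, node = [], root
    for bit in fields[dim] + "$":      # sentinel: visit the last node too
        matches += node.filters
        if node.next_dim is not None:
            matches += search(node.next_dim, fields, dim + 1)
        if bit not in node.children:
            break
        node = node.children[bit]
    return matches

root = Node()
for f in [("00", "0", "0"), ("10", "1", "0")]:
    insert(root, f)
print(search(root, ("000", "01", "00")))   # -> [('00', '0', '0')]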
Trie Compression Algorithm
If a path A→B satisfies the Compressible Property:
All nodes on its left point to the same place L
All nodes on its right point to the same place R
then we compress the entire path into 3 edges:
Center edge with value = (the bit string along A→B), pointing to B
Left edge with value < (the bit string along A→B), pointing to L
Right edge with value > (the bit string along A→B), pointing to R
49
Trading Storage for Time
Smoothly trade off storage for time
[Diagram: a spectrum from exponential time at one extreme to exponential space at the other]
Selective push
Push down the filters with large backtracking time
Iterate until the worst-case backtracking time satisfies our requirement
50
Example of Selective Push
Goal: worst-case memory accesses < 12
The filter [0*, 0*, 0000*] has 12 memory accesses, so push the filter down to reduce lookup time
Now the search cost of the filter [0*, 0*, 001*] becomes 12 memory accesses, so we push it down as well. Done!
51
Using Available Hardware
So far, we have focused on software techniques for packet classification
Performance can be further improved by taking advantage of limited hardware, when available,
by moving some filters (or rules) from software to hardware
Key issue: Which filters to move from software to
hardware?
Answer: to reduce lookup time, move the filters that incur the largest number of memory accesses under the software approach
52
Summary
Approach / Description / Performance Gain

Trie compression algorithm
Effectively exploits redundancy in trie nodes
Reduces lookup time by a factor of 2–5; saves storage by a factor of 2.8–8.7

Selective push
"Pushes down" the filters with large backtracking time
Reduces lookup time by 10–25% with only a marginal increase in storage

Moving filters from software to hardware
Heuristics to move a small number of filters from software to hardware
Moving 10–20 rules to hardware cuts storage by 33%–50%, or lookup time by 10%–20%
53
Contributions
Application layer
Study the Web workload of busy Web servers
Properly provision content distribution networks
Transport layer
Optimize TCP startup performance for short Web transfers
Network layer
Speed up packet classification
54
Other Work
Available at
http://www.cs.cornell.edu/lqiu/papers/papers.html
Integrating Packet FEC into Adaptive Voice Playout
Buffer Algorithms on the Internet. Proceedings of
IEEE INFOCOM'2000, Tel-Aviv, Israel, March 2000.
On Individual and Aggregate TCP Performance. 7th
International Conference on Network Protocols
(ICNP'99), Toronto, Canada, October 1999.
Understanding the End-to-End Performance Impact of
RED in a Heterogeneous Environment. July 2000.
Submitted to INFOCOM'2001.
55
Integrating Packet FEC into Adaptive
Voice Playout Buffer Algorithms
Internet telephony is subject to
Variable loss rate
Variable delay
Previous work has addressed the two
problems separately
Use FEC for loss recovery
Use playout buffer adaptation for delay
jitter compensation
56
Integrating Packet FEC into Adaptive
Voice Playout Buffer Algorithms
(Cont.)
Our work
Demonstrate the interaction between playout
algorithm and FEC
The playout algorithm should depend on the FEC scheme, network loss conditions, and network jitter
Propose several playout algorithms that provide
this coupling
Demonstrate the effectiveness of the
algorithms through simulations
57
On Individual and Aggregate
TCP Performance
Motivation
TCP behavior under many competing TCP
connections has not been sufficiently
explored
Our work
Use extensive simulations to investigate
the individual and aggregate TCP
performance for many concurrent
connections
58
On Individual and Aggregate
TCP Performance (Cont.)
Major findings
When all connections have the same RTT (Wc: bottleneck capacity in packets; Conn: number of connections):
Wc > 3*Conn → global synchronization
Conn < Wc < 3*Conn → local synchronization
Wc < Conn → some connections get shut off
Adding random processing time makes synchronization and consistent discrimination less pronounced
Derive a general characterization of overall throughput, goodput, and loss probability
Quantify the round-trip bias for connections with different RTTs
59
Understanding the End-to-End
Performance Impact of RED in a
Heterogeneous Environment
Motivation
IETF recommends widespread deployment of RED in routers
Most previous work studies RED in relatively homogeneous environments
Our work
Investigate the interaction of RED with
five types of heterogeneity
60
Understanding the End-to-End
Performance Impact of RED in a
Heterogeneous Environment (Cont.)
Major findings
Mix of short and long TCP connections
Short TCP connections get higher goodput with RED than with Drop Tail
Mix of TCP and UDP
Bursty UDP tends to get a lower loss rate with RED than with Drop Tail
Mix of ECN and non-ECN capable traffic
ECN-capable TCP connections get higher goodput than non-ECN-capable TCP connections
Effect of different RTT
RED reduces the bias against long-RTT bulk transfers
Effect of two-way traffic
When the ACK path is congested, TCP gets higher goodput with RED than with Drop Tail
61
Effects of Imperfect
Knowledge about Input Data
62
Effects of Imperfect Knowledge
about Input Data (Cont.)
The effect of imperfect topology information
Randomly remove from 0 up to 50% of the edges in the AS topology derived from the BGP routing tables
The greedy algorithm is insensitive to edge removal
Performs within a factor of 2.6 of optimal when 50% of the edges are removed
63