Integrated Approach to Improving Web Performance
Lili Qiu, Cornell University

Outline
- Motivation and open issues
- Solutions
  - study the Web workload, and properly provision content distribution networks
  - optimize TCP performance for Web transfers
  - fast packet classification
- Summary
- Other work

Motivation
- The Web is the dominant traffic on the Internet today
- Web performance is often unsatisfactory: WWW as "World Wide Wait"
  - network congestion
  - overloaded Web servers
- Consequence: losing potential customers!

Why is the Web so slow?
- Application layer: Web servers are overloaded
- Transport layer: Web transfers are short and bursty, and interact poorly with TCP
- Network layer: network congestion; routers are not fast enough; route flaps and routing instabilities
- Inefficiency at any layer of the protocol stack can slow down the Web!

Our Solutions
- Application layer: study the Web workload, and properly provision content distribution networks (CDNs)
- Transport layer: optimize TCP startup performance for Web transfers
- Network layer: speed up packet classification (useful for firewalls and diff-serv)

Part I: Application Layer Approach
- Study the workload of busy Web servers
  - The Content and Access Dynamics of a Busy Web Site: Findings and Implications. Proceedings of ACM SIGCOMM 2000, Stockholm, Sweden, August 2000. (Joint work with V. N. Padmanabhan)
- Properly provision content distribution networks
  - On the Placement of Web Server Replicas. Submitted to INFOCOM 2001. (Joint work with V. N. Padmanabhan and G. M. Voelker)

Introduction
- A solid understanding of the Web workload is critical for designing robust and scalable systems
- The workload of popular Web servers is not well understood
- We study the content and access dynamics of the MSNBC Web site:
  - a large news server, and one of the busiest sites on the Web
  - 25 million accesses a day (HTML content alone)
  - period studied: Aug.-Oct. 1999, plus Dec.
17, 1998 (a flash-crowd day)
- Properly provision content distribution networks: where to place the edge servers of a CDN

Temporal Stability of File Popularity
- Methodology: consider the traces from a pair of days, pick the top n popular documents from each day, and compute the overlap
- [Figure: extent of overlap vs. number of popular documents picked, for the day pairs 17DEC98-18OCT99, 01AUG99-18OCT99, and 17OCT99-18OCT99]
- Results:
  - one day apart: significant overlap (80%)
  - two months apart: smaller overlap (20-80%)
  - ten months apart: very small overlap (mostly below 20%)
- The set of popular documents remains stable for days

Spatial Locality in Client Accesses
- [Figure: fraction of requests shared vs. domain ID, for a normal day and for Dec. 17, 1998, comparing the trace against a random assignment]
- Domain membership is significant except when there is a "hot" event of global interest

Spatial Distribution of Client Accesses
- Cluster clients using network-aware clustering [KW00]: IP addresses with the same address prefix belong to a cluster
- The top 10, 100, 1000, and 3000 clusters account for about 24%, 45%, 78%, and 94% of the requests, respectively
- A small number of client clusters contribute most of the requests.
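The clustering step above can be sketched in a few lines. This is a toy illustration, not the paper's implementation: it uses a fixed 24-bit prefix (the "24-bit heuristic" mentioned later in the talk) rather than BGP-derived prefixes, and the request log is made up.

```python
from collections import Counter

def cluster_24bit(ip_addresses):
    """Group client IPs by their 24-bit prefix: a crude stand-in for
    prefix-based network-aware clustering."""
    clusters = Counter()
    for ip in ip_addresses:
        prefix = ".".join(ip.split(".")[:3])  # first three octets ~ /24
        clusters[prefix] += 1
    return clusters

def top_cluster_share(clusters, k):
    """Fraction of all requests contributed by the k largest clusters."""
    total = sum(clusters.values())
    return sum(count for _, count in clusters.most_common(k)) / total

# Hypothetical request log: three requests from one /24, two others.
clusters = cluster_24bit(["10.1.2.3", "10.1.2.9", "10.1.2.200",
                          "192.168.5.1", "172.16.0.4"])
share = top_cluster_share(clusters, 1)  # 3 of the 5 requests -> 0.6
```

With a real trace, plotting `top_cluster_share` for k = 10, 100, 1000, ... yields concentration numbers like the 24%/45%/78%/94% figures above.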
The Applicability of Zipf's Law to Web Requests
- [Figure: Zipf exponent α for MSNBC, proxies, and less popular servers]
- Web requests follow a Zipf-like distribution: request frequency proportional to 1/i^α, where i is a document's popularity rank
- The value of α is much larger in the MSNBC traces:
  - 1.4-1.8 in the MSNBC traces
  - smaller than or close to 1 in the proxy traces
  - close to 1 in the small departmental server logs [ABC+96]
  - highest when there is a hot event

Impact of a Larger α
- [Figure: percentage of requests vs. percentage of documents (sorted by popularity), for the 12/17/98 and 08/01/99 server traces and the 10/06/99 proxy traces]
- Accesses in the MSNBC traces are much more concentrated; 90% of the accesses are accounted for by:
  - the top 2-4% of files in the MSNBC traces
  - the top 36% of files in the proxy traces (Microsoft proxies and the proxies studied in [BCF+99])
  - the top 10% of files in the small departmental server logs reported in [AW96]
- Popular news sites like MSNBC see much more concentrated accesses
- Reverse caching and replication can be very effective!

Introduction to Content Distribution Networks (CDNs)
- Content providers want to offer better service to their clients at lower cost
- Increasing deployment of content distribution networks (CDNs): Akamai, Digital Island, Exodus, ...
- Idea: a network of servers sitting between clients and content providers
- Features:
  - outsourced infrastructure
  - better performance, by moving content closer to end users
  - flash-crowd protection

Placement of CDN Servers
- Goal: minimize users' latency or bandwidth usage
- Minimum K-median problem: select K centers to minimize the sum of assignment costs
  - the cost can be latency, bandwidth, or any other metric we want to optimize
  - NP-hard problem

Placement Algorithms
- Tree-based algorithm [LGG+99]: assume the underlying topologies are trees, and model placement as a dynamic programming problem; O(N^3 M^2) for choosing M replicas among N potential places
- Random: pick the best among several random assignments
- Hot
spot: place replicas near the clients that generate the largest load

Placement Algorithms (Cont.)
- Greedy algorithm:

    Greedy(N, M) {
        for i = 1 .. M {
            for each remaining candidate site R {
                cost[R] = cost after placing an additional replica at R
            }
            select the site with the lowest cost
        }
    }

- Super-optimal algorithm: Lagrangian relaxation plus the subgradient method

Simulation Methodology
- Network topology:
  - randomly generated topologies, using the GT-ITM Internet topology generator
  - real Internet topology: AS-level topology obtained from BGP routing data collected at a set of seven geographically dispersed BGP peers
- Web workload: real server traces (MSNBC, ClarkNet, NASA Kennedy Space Center)
- Performance metric: relative performance, cost_practical / cost_super-optimal

[Figures: simulation results in random tree topologies, random graph topologies, and real Internet topologies]

Effects of Imperfect Knowledge about Input Data
- Predict load using a moving-window average
- [Figure: (a) perfect knowledge of the topology; (b) topology knowledge accurate only to within a factor of 2]

Conclusion
- Characterized the Web workload using MSNBC traces
- Placement of CDN servers:
  - knowledge of client workload and topology is crucial for provisioning CDNs
  - the greedy algorithm performs best: within a factor of 1.1-1.5 of super-optimal
  - the greedy algorithm is insensitive to noise: it stays within a factor of 2 of super-optimal even when the salted error is a factor of 4
  - the hot-spot algorithm performs nearly as well: within a factor of 1.6-2 of super-optimal
- How to obtain the inputs:
  - a moving-window average for load prediction
  - BGP router data for topology information

Part II: Transport Layer Approach
- Speeding Up Short Data Transfers: Theory, Architectural Support, and Simulation Results.
Proceedings of NOSSDAV 2000. (Joint work with Yin Zhang and Srinivasan Keshav)

Motivation
- Characteristics of Web data transfers: short and bursty [Mah97], and carried over TCP
- Problem: short data transfers interact poorly with TCP!

TCP/Reno Basics
- Slow start: exponential growth of the congestion window; still slow, taking log(n) round trips for n segments
- Congestion avoidance: linear probing of the available bandwidth
- Fast retransmission: triggered by three duplicate ACKs

Related Work
- P-HTTP [PM94]: reuses a single TCP connection for multiple Web transfers, but still pays the slow-start penalty
- T/TCP [Bra94]: caches the connection count and RTT
- TCP Control Block Interdependence [Tou97]: caches cwnd, but large bursts cause losses
- Rate-Based Pacing [VH97]
- 4K Initial Window [AFP98]
- Fast Start [PK98, Pad98]: needs router support to ensure TCP friendliness

Our Approach
- Directly enter congestion avoidance
- Choose the optimal initial congestion window
- A geometry problem: fit a block to the service-rate curve so as to minimize completion time

Optimal Initial cwnd
- Minimize completion time by having the transfer end at an epoch boundary.
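The epoch accounting behind the initial-cwnd choice and the shift optimization (next slide) can be sketched as follows. This is a toy segment-counting model under assumptions of my own: the window grows by exactly one segment per RTT and losses and delayed ACKs are ignored; the numbers in the example are illustrative, not taken from the talk's experiments.

```python
def rounds_needed(n_segments, init_cwnd):
    """RTT rounds to deliver n_segments when the congestion window grows
    by one segment per round (congestion avoidance), starting at init_cwnd."""
    sent, cwnd, rounds = 0, init_cwnd, 0
    while sent < n_segments:
        sent += cwnd
        cwnd += 1
        rounds += 1
    return rounds

def shift_optimized_cwnd(n_segments, candidate_cwnd):
    """Shift optimization: the smallest initial cwnd that still finishes
    in the same integer number of RTTs as candidate_cwnd."""
    target = rounds_needed(n_segments, candidate_cwnd)
    w = candidate_cwnd
    while w > 1 and rounds_needed(n_segments, w - 1) == target:
        w -= 1
    return w

# Toy example: a 10-segment transfer that starts with cwnd = 9 finishes
# in 2 RTTs; so does cwnd = 5, which is gentler on the network.
```

The point of the shift optimization is exactly this: since completion time only changes at epoch (RTT) boundaries, the initial window can be shrunk for free until the round count would increase.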
Shift Optimization
- Minimize the initial cwnd while keeping the same integer number of RTTs
- Before optimization: cwnd = 9; after optimization: cwnd = 5

Effect of Shift Optimization
- [Figure: effect of shift optimization]

TCP/SPAND
- Estimate network state by sharing performance information
  - SPAND: Shared PAssive Network Discovery [SSK97]: Web servers share estimates through a performance server
- Directly enter congestion avoidance, starting with the optimal initial cwnd
- Avoid large bursts by pacing

Implementation Issues
- Scope for sharing and aggregation: 24-bit heuristic, network-aware clustering [KW00]
- Collecting performance information: performance reports, a new TCP option, Windmill's approach, ...
- Information aggregation: sliding-window average
- Retrieving estimates of network state: explicit query, active push, ...
- Pacing: leaky-bucket-based pacing

Opportunity for Sharing
- MSNBC: 90% of requests arrive within 5 minutes of the most recent request from the same client network (using the 24-bit heuristic)

Cost of Sharing
- MSNBC: 15,000-25,000 distinct client networks in a 5-minute interval during peak hours (using the 24-bit heuristic)

Simulation Results
- Methodology: download files in rounds
- Performance metric: average completion time
- TCP flavors considered:
  - reno-ssr: Reno with slow-start restart
  - reno-nssr: Reno without slow-start restart
  - newreno-ssr: NewReno with slow-start restart
  - newreno-nssr: NewReno without slow-start restart

[Figures: simulation topologies; T1 terrestrial WAN link with a single bottleneck, with multiple bottlenecks, and with multiple bottlenecks and heavy congestion; TCP friendliness against reno-ssr with 50-ms and 200-ms timers]

Conclusions
- TCP/SPAND significantly reduces latency for short data transfers:
  - 35-65% compared to reno-ssr / newreno-ssr
  - 20-50% compared to reno-nssr / newreno-nssr
  - even higher for fatter pipes
- TCP/SPAND is TCP-friendly
- TCP/SPAND is incrementally deployable: server-side modification only, no
modification at the client side

Part III: Network Layer Approach
- Fast Packet Classification on Multiple Dimensions. Cornell CS Technical Report 2000-1805, July 2000. (Joint work with G. Varghese and S. Suri, in progress)

Motivation
- Traditionally, routers forward packets based on the destination field only
- Diff-serv and firewalls require layer-4 switching: forwarding packets based on multiple fields in the packet header, e.g. source IP address, destination IP address, source port, destination port, protocol, type of service (tos), ...
- The general packet classification problem has poor worst-case cost: given N arbitrary filters over k packet fields, either the worst-case search time is Ω((log N)^(k-1)) or the worst-case storage is O(N^k)

Problem Specification
- Given a set of filters (or rules), each of which specifies:
  - a class of packet headers, based on K fields
  - an associated directive, which specifies how to forward a packet matching this filter
- Goal: find the best matching filter for each incoming packet
- A packet P matches a filter F if every field of P matches the corresponding field of F: exact match, prefix match, or range match; we assume prefix matching

Problem Specification (Cont.)
- Example of a Cisco Access Control List (ACL):

    1. access-list 100 deny udp 26.145.168.192 255.255.255.255 74.199.168.192 255.255.255.0 eq 2049
    2. access-list 100 permit ip 74.199.191.192 255.255.0.0 74.199.168.192 255.255.0.0
    3. access-list 100 permit tcp 250.197.149.202 255.0.0.0 74.199.20.76 255.0.0.0

- The packet "tcp 250.19.34.34 74.23.5.12" matches filter 3

Backtracking Search
- A trie is a binary branching tree, with each branch labeled 0 or 1
- The prefix associated with a node is the concatenation of all the bits on the path from the root to the node
- [Figure: a trie containing the filters F1 = 00* and F2 = 10*]

Backtracking Search (Cont.)
- Extend to multiple dimensions: backtracking is a depth-first traversal of the tree that visits all the nodes satisfying the given constraints
- Example: search for [00*, 0*, 0*]

Trie Compression Algorithm
- If a path AB satisfies the compressible property:
  - all nodes on its left point to the same place L
  - all nodes on its right point to the same place R
- then we compress the entire branch into 3 edges:
  - a center edge with value = (AB) pointing to B
  - a left edge with value < (AB) pointing to L
  - a right edge with value > (AB) pointing to R
- Advantages of compression: saves both time and storage

Trading Storage for Time
- Smoothly trade off storage for time, between exponential time and exponential space
- Selective push:
  - push down the filters with large backtracking time
  - iterate until the worst-case backtracking time satisfies our requirement

Example of Selective Push
- Goal: worst-case memory accesses < 12
- The filter [0*, 0*, 0000*] takes 12 memory accesses, so push it down to reduce lookup time
- Now the search cost of the filter [0*, 0*, 001*] becomes 12 memory accesses, so push it down as well
- Done!

Using Available Hardware
- So far, we have focused on software techniques for packet classification
- We can further improve performance by taking advantage of limited hardware when it is available, by moving some filters (or rules) from software to hardware
- Key issue: which filters should move from software to hardware?
Answer: to reduce lookup time, move the filters that require the largest number of memory accesses under the software approach

Summary
- Trie compression algorithm: effectively exploits redundancy in trie nodes; reduces lookup time by a factor of 2-5 and saves storage by a factor of 2.8-8.7
- Selective push: "pushes down" the filters with large backtracking time; reduces lookup time by 10-25% with only a marginal increase in storage
- Moving filters from software to hardware: heuristics to move a small number of filters from software to hardware; moving 10-20 rules to hardware cuts storage by 33-50%, or lookup time by 10-20%

Contributions
- Application layer: studied the Web workload of busy Web servers; properly provision content distribution networks
- Transport layer: optimized TCP startup performance for short Web transfers
- Network layer: sped up packet classification

Other Work
Available at http://www.cs.cornell.edu/lqiu/papers/papers.html
- Integrating Packet FEC into Adaptive Voice Playout Buffer Algorithms on the Internet. Proceedings of IEEE INFOCOM 2000, Tel-Aviv, Israel, March 2000.
- On Individual and Aggregate TCP Performance. 7th International Conference on Network Protocols (ICNP'99), Toronto, Canada, October 1999.
- Understanding the End-to-End Performance Impact of RED in a Heterogeneous Environment. July 2000. Submitted to INFOCOM 2001.

Integrating Packet FEC into Adaptive Voice Playout Buffer Algorithms
- Internet telephony is subject to variable loss rates and variable delay
- Previous work has addressed the two problems separately:
  - FEC for loss recovery
  - playout buffer adaptation for delay-jitter compensation

Integrating Packet FEC into Adaptive Voice Playout Buffer Algorithms (Cont.)
- Our work:
  - demonstrates the interaction between the playout algorithm and FEC: the playout algorithm should depend on the FEC in use, network loss conditions, and network jitter
  - proposes several playout algorithms that provide this coupling
  - demonstrates the effectiveness of the algorithms through simulations

On Individual and Aggregate TCP Performance
- Motivation: TCP behavior under many competing TCP connections has not been sufficiently explored
- Our work: uses extensive simulations to investigate individual and aggregate TCP performance for many concurrent connections

On Individual and Aggregate TCP Performance (Cont.)
- Major findings:
  - When all connections have the same RTT (with Wc the pipe capacity in packets and Conn the number of connections):
    - Wc > 3*Conn: global synchronization
    - Conn < Wc < 3*Conn: local synchronization
    - Wc < Conn: some connections are shut off
  - Adding random processing time makes synchronization and consistent discrimination less pronounced
  - Derived a general characterization of overall throughput, goodput, and loss probability
  - Quantified the round-trip bias for connections with different RTTs

Understanding the End-to-End Performance Impact of RED in a Heterogeneous Environment
- Motivation:
  - the IETF recommends widespread deployment of RED in routers
  - most previous work studies RED in relatively homogeneous environments
- Our work: investigates the interaction of RED with five types of heterogeneity

Understanding the End-to-End Performance Impact of RED in a Heterogeneous Environment (Cont.)
- Major findings:
  - Mix of short and long TCP connections: short TCP connections get higher goodput with RED than with Drop Tail
  - Mix of TCP and UDP: bursty UDP traffic tends to see a lower loss rate with RED than with Drop Tail
  - Mix of ECN-capable and non-ECN-capable traffic: ECN-capable TCP connections get higher goodput than non-ECN-capable ones
  - Effect of different RTTs: RED reduces the bias against long-RTT bulk transfers
  - Effect of two-way traffic: when the ACK path is congested, TCP gets higher goodput with RED than with Drop Tail

Effects of Imperfect Knowledge about Input Data
- [Figure: effects of imperfect knowledge about input data]

Effects of Imperfect Knowledge about Input Data (Cont.)
- The effect of imperfect topology information:
  - randomly remove from 0 up to 50% of the edges in the AS topology derived from the BGP routing tables
  - the greedy algorithm is insensitive to edge removal: it performs within a factor of 2.6 of optimal even when 50% of the edges are removed
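The greedy placement heuristic whose robustness these backup slides evaluate can be sketched as below. This is a minimal illustration, not the talk's simulator: it assumes a precomputed client-to-site cost matrix (in the talk, costs come from latency or bandwidth over the measured topology), and the tiny matrix in the usage note is made up.

```python
def greedy_placement(cost, m):
    """Greedy replica placement (m >= 1): repeatedly add the candidate
    site that most reduces the total assignment cost, where each client
    is served by its cheapest chosen site.
    cost[i][j] = cost of serving client i from candidate site j."""
    n_clients, n_sites = len(cost), len(cost[0])
    chosen = []
    for _ in range(m):
        best_site, best_cost = None, float("inf")
        for site in range(n_sites):
            if site in chosen:
                continue  # site already hosts a replica
            # Total cost if we add this site to the current placement.
            total = sum(min(cost[c][s] for s in chosen + [site])
                        for c in range(n_clients))
            if total < best_cost:
                best_site, best_cost = site, total
        chosen.append(best_site)
    return chosen, best_cost
```

For example, with `cost = [[1, 10], [10, 1], [1, 10]]` (three clients, two candidate sites), one replica goes to site 0 (total cost 12), and a second replica at site 1 brings the total down to 3. Each iteration scans all remaining sites, which is what makes greedy cheap enough to run on the AS-level topologies used in the simulations.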