World Wide Web Caching: Trends and Technologies
Greg Barish & Katia Obraczka
USC Information Sciences Institute, USA, 2000
This report presented by Loubna ALI

Introduction:
Web caching is the introduction of proxy servers at certain points in the network to cache Web documents for faster client access. Today web caching is very important because:
–The rapid growth of HTTP traffic, which forms the largest part of Internet traffic, causes more network congestion and server unavailability.
–The number of static Web pages almost doubles every year.
To be an attractive solution, web caching must offer the following features:
–Bandwidth saving.
–Improved content availability.
–Improved web server availability.
–Reduced network latency.
–Server load balancing.
–Improved user perception of network performance.
This report describes several web caching architectures, cache deployment options, and design techniques, and closes with a summary and future work.

Caching Architectures:
1-Proxy Caching: this kind of cache is deployed at the edges of the network. It has the following disadvantages:
–An unavailable cache makes the network unavailable.
–It is a single point of failure.
–Browsers must be manually reconfigured when the cache fails (browser auto-reconfiguration is a recent trend).
2-Reverse Proxy Caching: here the proxies are situated near the content provider.
3-Transparent Caching: the advantage here is that the need to manually configure web browsers is eliminated. There are two kinds of transparent caching:
–Router-based transparent proxy caching
–Switch-based transparent proxy caching
Switch-based caching is better than router-based caching because it is less expensive and it reduces latency when performing load balancing.
4-Adaptive Web Caching: it uses distributed cache meshes to solve the hot spot problem and has the following properties:
–Caches dynamically join and leave cache groups based on content demand.
–Adaptivity and self-organization.
It uses the Cache Group Management Protocol (CGMP) and the Content Routing Protocol (CRP).
5-Push Caching: it keeps the data close to the clients requesting it. This assumes that caches can be launched across administrative boundaries. The disadvantage is the cost incurred (storage and transmission).
6-Active Caching: it applies caching to dynamic documents, motivated by the observation that 30% of client HTTP requests contain cookies. It uses cache applets: on a request, the server provides the cache with the objects and any associated cache applets.

Cache Deployment Options:
Near the content consumer (consumer-oriented): this gives the best response time, with the further advantage that requests are served locally.
Near the content provider (provider-oriented): the advantages are:
–Improved access to logical sets of data.
–Improved scalability and availability of content.
However, this placement is problematic for delay-sensitive content (audio, video).
At strategic points in the network: placement is based on user access patterns and on network topology and conditions. However, there is a problem of administrative control.

Design Techniques:
Hierarchical Caching: the caches are arranged in a tree-like structure:
–A child cache can query its parent caches and its siblings, but a parent cache never queries its children.
–As a result, information gradually filters down to the leaves.
The problem here is that parents can be swamped; to avoid this, clustering may be applied to hierarchies.
Advantages:
–Bandwidth efficient, especially when cache servers are slow.
–Popular web pages are diffused efficiently towards the demand.
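
The lookup order in a cache hierarchy (local store, then siblings, then the parent, and finally the origin server) can be illustrated with a minimal Python sketch. It is only a toy model under stated assumptions: the class `CacheNode`, its fields, the `lookup` method, and the dictionary standing in for the origin server are hypothetical names introduced here for illustration and are not taken from the survey or from any real cache implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CacheNode:
    """One cache in a tree-like hierarchy (hypothetical toy model)."""

    name: str
    parent: CacheNode | None = None
    siblings: list[CacheNode] = field(default_factory=list)
    store: dict[str, bytes] = field(default_factory=dict)

    def lookup(self, url: str, origin: dict[str, bytes]) -> tuple[bytes, str]:
        """Return (content, source), where source names where the object was found."""
        # 1. Local hit.
        if url in self.store:
            return self.store[url], self.name
        # 2. Ask siblings; a sibling answers only from its own store.
        for sibling in self.siblings:
            if url in sibling.store:
                self.store[url] = sibling.store[url]
                return self.store[url], sibling.name
        # 3. Ask the parent, which repeats the same steps one level up.
        #    A parent never queries its children.
        if self.parent is not None:
            content, source = self.parent.lookup(url, origin)
        else:
            # 4. Miss at the root: fetch from the origin server (assumed dict).
            content, source = origin[url], "origin"
        # Keep a copy on the way back, so content filters down to the leaves.
        self.store[url] = content
        return content, source


if __name__ == "__main__":
    origin = {"http://example.com/a": b"page A"}
    root = CacheNode("root")
    left = CacheNode("left", parent=root)
    right = CacheNode("right", parent=root)
    left.siblings.append(right)
    right.siblings.append(left)

    print(left.lookup("http://example.com/a", origin)[1])   # origin
    print(right.lookup("http://example.com/a", origin)[1])  # left
```

In this sketch the first request misses at every level and is served by the origin, while the second request from the neighbouring leaf is answered by its sibling without touching the parent.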
Disadvantages of hierarchical caching:
–Cache servers need to be placed at key access points of the network, which requires coordination among caches.
–Each level adds a delay.
–High levels are bottlenecks.
–Multiple copies are kept at different cache levels.

Intercache Communication: this design is composed of multiple distributed caches. It uses the following protocols:
–ICP (Internet Cache Protocol) [Squid]: caches issue queries to other caches to determine the best location from which to retrieve an object. The main problem is the message overhead.
–CRP (Content Routing Protocol): ICP with a multicast feature to query cache meshes.
–Cache digests [Squid]: summarize the objects held in a cache.
–WCCP (Web Cache Communication Protocol) [Cisco]: enables transparent redirection of HTTP traffic to the Cisco Cache Engine.
–CARP (Cache Array Routing Protocol) [Microsoft]: uses hashing schemes to determine which proxy holds the requested information.

Hashing function: the principal idea of this design is to point the local cache towards other caches that have the object or can get it. Hash-based request routing:
–Uses a hash function to map a key (such as the URL) to a cache within a cluster.
–Reduces (or eliminates) the need for caches to query each other.
–Examples: NetCache (MD5-indexed URL hash function), CARP.

Optimized I/O: the object cache is treated as a high-performance database. An in-memory data structure determines whether an object has been cached, and disk operations are used to locate where on disk the content is placed. The advantage is that costly I/O operations can be avoided.

Microkernel Operating System: this technique concerns how resources are managed. The advantages are:
–Improved resource allocation.
–Optimized cache performance.

Content prefetching: the principal idea of this design is that the cache uses data accumulated by the server, such as historical information.
There are three ways to implement prefetching:
–Between clients and servers
–Between clients and proxies
–Between proxies and servers
Improvements:
–Lower latency (improvements ranging from 26% to 57%).
–Improved access time.

Cache coherency (consistency): coherency ensures that a cached object does not reflect stale or defunct data. The consistency techniques are:
–Client polling: the cached object is compared with the original object.
–Invalidation callbacks: the server contacts the proxies when objects change.
–TTL and adaptive TTL.
–If-Modified-Since: caches invalidate objects only when they are requested and their expiration date has been reached.

Summary
As we have seen, there are different cache designs, but some issues are common among them.
Advantages:
1. Improve content availability.
2. Reduce network latencies.
3. Address increasing bandwidth demands.
4. Can hide network problems.
5. Reduce server burden.
Disadvantages:
1. Stale pages.
2. Information retained in caches.
The choice of the most suitable cache depends on the application itself: we may need the cache with the lowest latency, or the one with the strongest security properties, and so on.

Open future work (trends):
To advance this topic, the problems of security and real-time data must be addressed.
Content security: two examples of mechanisms developed in 2002:
1. NetCache:
–The appliance is deployed in parallel to the firewall.
–The appliance can be used to control who accesses a web site.
–Virus scanning for all incoming content.
2. CacheFlow:
–Added content filtering to its caches.
Handling more complex objects and real-time data:
RTEE (real time event engine): captures, caches, and queries data at speeds greater than 12000 events/s.
Web caching based on ontology?
–User access pattern prediction
–Prefetching
–Cache placement/replacement

This article is well organized.
It explains all kinds of web caching and all the techniques in a simple manner. It presents the need for caches, their desirable properties, and the expected gains, which helps in choosing the cache most suitable for our applications.
But: it does not present and explain the trends that are very important in this area, such as security and real-time data. It also lacks illustrations such as pictures and diagrams.
Loubna ALI