J. Comput. Sci. & Technol., Mar. 2004, Vol.19, No.2, pp.113–127
Web Caching: A Way to Improve Web QoS
Ming-Kuan Liu 1,2, Fei-Yue Wang 1,2, and Daniel Dajun Zeng 1,3

1 The Lab for Complex Systems and Intelligent Sciences, Institute of Automation, The Chinese Academy of Sciences, Beijing 100080, P.R. China
2 Systems and Industrial Engineering Department, University of Arizona, Tucson, AZ 85721, USA
3 Management Information Systems Department, University of Arizona, Tucson, AZ 85721, USA

E-mail: [email protected]; [email protected]; [email protected]

Received July 30, 2002; revised June 2, 2003.

Abstract    As the Internet and World Wide Web grow at a fast pace, it is essential that the Web's performance keep up with increased demand and expectations. Web caching technology has been widely accepted as one of the effective approaches to alleviating Web traffic and increasing the Web Quality of Service (QoS). This paper provides an up-to-date survey of the rapidly expanding Web caching literature. It discusses state-of-the-art Web caching schemes and techniques, with emphasis on recent developments in Web caching technology such as differentiated Web services, heterogeneous caching network structures, and dynamic content caching.

Keywords    Web traffic, Web caching, Web QoS, differentiated service, dynamic content caching
1 Introduction

The Internet and World Wide Web have experienced tremendous growth in the past decade. As a direct result of the Web's popularity and increasing acceptance, Web network bandwidth demands have been increasing much faster than bandwidth capacity has expanded, resulting in network traffic congestion and Web server overload. Many researchers have been working on how to improve Web performance since the early 1990s, and many approaches have been proposed and studied[1]. Among them, Web caching technology has emerged as one of the effective solutions to reduce Web traffic congestion, alleviate Web server overload, and, in general, improve the scalability and Quality of Service (QoS) of Web-based systems[1–7].

The idea of using caches to improve system performance was developed and applied successfully long before the advent of the Web. The best-known applications of caching are in CPUs, RAM, and file systems, where caches are defined as fast, temporary stores for commonly used items[8–10]. Similar caching ideas can be extended to Web-based systems. Caching user-requested documents at the client Web browser or at local Web caching proxy servers lowers the user-perceived access latency, defined as the amount of time elapsed between the time when a user request is issued and the time when the requested object is returned to the user's browser. In addition, Web caching has
the potential of reducing network bandwidth consumption, thereby alleviating Web traÆc congestion. Moreover, Web caching can alleviate the workload of original Web servers by reducing the number of the client requests. Lastly, Web caching may improve the failure tolerance and robustness of the whole Web system by maintaining a cached copy of Web documents and serving user requests even when original servers or networks become temporarily unreachable. Web caching has experienced rapid growth in recent years[6] . The academic literature on Web caching is quickly expanding. Numerous conferences and workshops with a speci c emphasis on Web caching have been held by the engineering and business research communities. In addition, many caching-related commercial oerings are launched and accepted as part of the general Web infrastructure and as a speci c type of value-added Web-based commercial service[11 13] . Although surveys on Web caching technology exist in the literature (e.g., [7]), we conclude that recent developments in Web caching warrant a significantly updated survey. This paper represents our eorts in understanding and systemizing signi cant technical issues that are being addressed by the recent Web caching literature, with emphasis on new trends in Web caching research including dierentiated Web QoS, heterogeneous caching architecture, and dynamic content caching. In the remainder of this section, we present a
Regular Paper. This work is supported in part by the Outstanding Oversea Scholar Award and the Outstanding Young Scientist Research Fund from the Chinese Academy of Sciences. The authors of this paper are listed alphabetically; all contributed equally to its completion.
high-level taxonomy of Web caching systems and lay out the scope of our survey.

Web contents can be cached at different locations along the path between clients and original Web servers. According to the location of caches, Web caching systems can be classified into three types: browser caches, proxy caches, and surrogate caches.

Browser Caches. Most modern Web browsers have built-in caches. Browser caches utilize the client's local hard disk or RAM to cache Web documents. The user can customize the size of the browser caches. Browser-based caching typically works for only one user, and sharing of cached contents among different users is not allowed. This type of caching can significantly reduce Web access latency when the user needs repeated accesses to certain Web pages.

Proxy Caches. Unlike browser caches, Web proxy caches are located between client computers and Web servers and function as a data relay. A typical Web proxy cache serves many users at the same time. When the Web caching proxy server receives a user request, it first looks up the requested object in its cache. If a fresh cached copy is found, the proxy returns it to the client. Otherwise it relays the request to other cooperating cache proxies or the original Web server, and returns the found fresh documents to the client while leaving a copy in its own cache. Proxy caches are usually placed close to the client.

Surrogate Caches. Surrogate caches work similarly to proxy caches; the key difference is that surrogate caches are typically located close to Web servers. While the main goal of proxy caches is to reduce Web access latency, that of surrogate caches is to alleviate Web servers' workload. The definition of surrogate caches is given in RFC 3040 as "a gateway co-located with an origin server, or at a different point in the network, delegated the authority to operate on behalf of, and typically working in close cooperation with, one or more origin servers"[14].
Surrogate caches can be used to replicate the contents of the corresponding Web servers at many different locations on the Web. Users' requests for objects from these Web servers can then be directed to the nearest surrogate cache that contains the requested contents. As a result, surrogate caches alleviate the workload of the servers and potentially reduce the client access latency. Another common usage of surrogate caches is to accelerate Web servers' performance. Some Web servers are slow because they have to generate Web pages dynamically. Surrogate caches can be used to cache
the response of these servers to improve server performance.

Proxy caching has been the main focus of current Web caching research for several reasons. First, a proxy cache-based approach makes minimal assumptions about the Web servers and networking protocols. Thus it can be readily applied in a wide spectrum of application settings without modifying Web server behavior or the underlying networking infrastructure. Second, unlike browser caches, proxy caches typically serve a number of users who are on the same subnets. These users, often working for the same organization, have much in common in their Web usage. This creates many opportunities to realize the potential performance gain and latency reduction of Web caching. Third, from the system architecture point of view, proxy caches are located on proxy servers that have traditionally been used for other purposes (e.g., network security). This makes installation and configuration of caching services relatively easy and user-transparent. Recent years have also witnessed a growing trend of providing value-added Web services off proxy servers. Co-locating caching services with these Web services can ease system integration and maintenance efforts and lead to significant cost savings. This paper mainly focuses on proxy caching.

The rest of the paper is organized as follows. Section 2 presents various architectural designs of Web caching systems. Section 3 analyzes the characteristics of Web traffic. Section 4 describes prefetching, cache replacement, and document coherence maintenance policies. These policies govern various aspects of the operation of a proxy cache. In Section 5, we address inter-cache routing and cooperation problems, which arise when multiple caches work collaboratively. We briefly discuss the performance metrics and evaluation of a caching system in Section 6 and discuss recent trends in Web caching in Section 7.
Section 8 concludes the paper with issues for future research.
2 Architectural Designs of Web Caching Systems

2.1 Overall Design of Web Caching Systems

Fig.1 illustrates the high-level architecture followed by most Web caching systems. Web objects can be cached at the clients' local machines, Web cache proxy servers, surrogate servers, or Web servers, or at any combination of these locations. The following description is based on the assumption that caching is enabled at all the above-
mentioned locations. Note that it is easy to adapt the description below to fit any caching system in which the caching mechanism is absent at certain locations (e.g., when the Web browser's caching function is turned off).
Fig.1. Overall Web caching system architecture.
When a user requests a Web object, the browser first tries to locate a valid copy in its own cache. If a valid copy is found, the cached copy is presented to the user immediately without incurring any Web traffic. If none is found, a cache proxy server, which typically resides at a network location close to the client machine, is contacted. Upon receiving such a request from a client, the cache proxy first checks its cache to locate a valid cached copy. If one is found, the cache proxy returns it to the client, which in turn presents the object to the user. If none is found, i.e., a cache miss occurs, the cache proxy directs the request to participating cooperative cache proxies or the original Web server. These cooperative cache proxies process the request in a similar manner and may need to relay it to a surrogate cache. The worst-case scenario is that none of the caches along the network path has a valid copy of the requested object. When this happens, the request is relayed all the way up to the original Web server that publishes the requested Web object.
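The lookup chain just described can be sketched in a few lines of code. This is a minimal illustration of the control flow, not a real proxy implementation; the cache layers and the origin-fetch function are hypothetical stand-ins:

```python
# Sketch of the lookup chain described above: browser cache first,
# then the proxy cache, then the origin Web server. On the way back,
# a copy is left in every cache that missed.

def fetch_from_origin(url):
    # Stand-in for an HTTP request to the origin Web server.
    return f"<content of {url}>"

class CacheLayer:
    def __init__(self, name):
        self.name = name
        self.store = {}              # url -> cached object

    def lookup(self, url):
        return self.store.get(url)   # None signals a cache miss

def resolve(url, layers):
    """Walk the cache chain; on a miss everywhere, go to the origin."""
    missed = []
    for layer in layers:
        obj = layer.lookup(url)
        if obj is not None:
            break
        missed.append(layer)
    else:
        obj = fetch_from_origin(url)
    for layer in missed:             # populate the caches that missed
        layer.store[url] = obj
    return obj

browser = CacheLayer("browser")
proxy = CacheLayer("proxy")
resolve("http://example.org/a.html", [browser, proxy])  # miss everywhere
resolve("http://example.org/a.html", [browser, proxy])  # browser hit
```

The second call is answered from the browser cache without touching the proxy or the origin, which is exactly the latency-saving behavior the architecture is designed for.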
2.2 Architectural Designs of Cooperative Web Caches

It has been observed recently that a Web caching system relying on a single proxy cache has limited value[15;16]. However, cooperative Web caching, which utilizes a network of cache proxy servers to serve a set of clients, has shown great potential in (a) improving Web QoS, (b) reducing the chance of certain cache proxy servers becoming the performance and Web traffic bottleneck, and (c) improving the overall system fault-tolerance and robustness. In effect, cooperative Web caching has already been widely used in recent caching systems[17–19].

A critical research issue in developing cooperative Web caches is how to organize and coordinate the behavior of the caches that work cooperatively. In this section, we survey several well-studied architectures for collaborative Web caching. Issues concerning the development of a single proxy cache will be discussed in Section 4. Cooperative proxy Web caching architectures can be divided into three major categories: hierarchical[17;20], distributed[18;21], and hybrid[19;22].

Hierarchical Caching Architectures. The hierarchical caching architecture, as shown in Fig.2, was first proposed in the Harvest project[17]. In this type of architecture, the caching system consisting of multiple proxy caches is structured as a tree. Each tree node corresponds to a proxy cache and has exactly one parent (with the exception of the root node). When a cache miss occurs at a certain node (i.e., a valid copy of the requested object cannot be found), this node forwards the request to its parent. This forwarding process can be repeated till the requested object is located or the root cache is reached. The root cache may need to contact the origin Web server if it is unable to satisfy the request. Whenever the requested object is found, either at one of the caches or at the original server, it travels back to the client in the reverse order of the cache request chain. A copy of this object is also cached at each of the intermediate caches along this path.

Fig.2. Hierarchical caching architecture.

The hierarchical caching architecture roughly maps to the topology of the Internet organized as a hierarchical network of ISPs and serves to diffuse popular Web objects towards the demand. However, this architecture suffers from several major drawbacks. First, each hierarchy level introduces additional time delays in processing requests. Second, there is significant
redundancy in the storage of Web objects since many objects are cached at all levels of the cache tree. Lastly, caches close to the root need to store a large number of Web objects and can become major performance and traffic bottlenecks.

Distributed Caching Architectures. Fig.3 illustrates the architecture of distributed proxy caching systems. In this architecture, there are only "institutional" caches at the edge of the network that cooperate to serve each other's misses. No intermediate caches between the clients and these institutional caches are used[18;21].
Fig.3. Distributed caching architecture.
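One coordination mechanism surveyed in this subsection maps each request to exactly one cooperating cache via a hash of the URL, so no object is duplicated across caches. A minimal sketch follows; the cache names and the simple modulo-hash scheme are illustrative assumptions, and real deployments use more robust (e.g., consistent) hashing:

```python
import hashlib

# Illustrative hash-based partitioning: each URL is "owned" by exactly
# one cooperating cache, so no inter-cache queries or digests are needed.
caches = ["cache-a", "cache-b", "cache-c"]   # hypothetical cache servers

def owner(url):
    # Stable hash of the URL, reduced modulo the number of caches.
    h = int(hashlib.md5(url.encode()).hexdigest(), 16)
    return caches[h % len(caches)]

# Every client computes the same owner for the same URL.
assert owner("http://example.org/x") == owner("http://example.org/x")
```

Note that under plain modulo hashing, adding or removing a cache remaps most URLs; consistent-hashing variants limit that disruption, which is one reason the text restricts this approach to well-interconnected local environments.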
Because of the lack of the simple chain of referencing and control found in hierarchical caching architectures, the facilitation of cooperation between institutional caches is relatively complex and can significantly impact cache performance. Several key coordination mechanisms that have been developed in the literature are surveyed below.

Institutional caches can query other cooperating caches for Web objects or documents that have resulted in local misses. The Internet Cache Protocol (ICP) can be used to transmit these queries and replies between cache servers[19]. The main drawbacks of this query-based approach are: (a) a potentially significant increase in network bandwidth consumption, and (b) potentially long access latency as a result of needing to poll all cooperating caches and wait for responses.

Institutional caches can keep a digest or summary of the contents of the other cooperating caches[23;24], avoiding the need for expensive queries and polls. Content digests or summaries are periodically exchanged among the institutional caches. To make the distribution of the digests/summaries more efficient and scalable, a hierarchical infrastructure of intermediate nodes can be used. However, this hierarchical infrastructure only distributes meta-information, including the locations of the Web objects but not the cached objects themselves.

Institutional caches can also cooperate with each other using a hash function that maps a client request into a certain cache[25–27]. With this approach there are no duplicated copies of the same object in different caches, and there is no need for caches to know about each other's content. However, having only one single copy of an object among all cooperating caches limits this approach to local environments with well-interconnected caches.

Hybrid Caching Architectures. In a hybrid architecture, caches cooperate with other caches at the same level or at a higher level using distributed caching. ICP is typically used as the underlying communication and coordination protocol in this architecture. For instance, a Web object can be fetched from either a parent or a neighbor cache depending on which cooperating cache offers the lowest round-trip time (RTT). The cooperation between participating caches needs to be carefully planned in order to avoid repeated fetching of objects from distant or slow caches when it might be more efficient to fetch these objects directly from the original Web servers[28].

Recently, Rodriguez and Spanner[22] proposed a mathematical model to analyze the performance of the above three architectures. They show that hierarchical caching systems have lower connection time (the time needed to establish the connection between the client machine and a cache or between caches), while distributed caching systems have lower transmission time (the time needed to transmit the requested Web object over the network). In addition, hierarchical caching has lower network bandwidth usage, while distributed caching is able to distribute the Web traffic more evenly and reduce "hot spots" (congested network locations). This is because distributed caching makes more use of local network bandwidth.

As for disk storage requirements, distributed caching typically requires relatively small storage space, with an institutional cache needing several gigabytes, while hierarchical caching needs more storage space, especially for caches near the root. It is common for high-level caches to have hundreds of gigabytes of storage in a hierarchical caching system. Their analysis also shows that in a hybrid architecture the latency varies greatly depending on the number of caches that cooperate at each network level.

2.3 ISAAC: An Adaptive Web Caching Architecture

Almost all the caching architectures surveyed
above assume that various aspects of the Web environment, such as Web traffic patterns, server and client locations, network connectivity, and cooperative caching network topology, remain relatively stable. Obviously, because of the dynamic nature of the Web, a caching system that dynamically adapts its behavior according to the changes in the Web environment has many advantages over the non-adaptive approaches. In effect, research on adaptive Web caching has started to emerge in recent years[29]. In this section, we use the ISAAC system, developed at the University of Arizona, as an example to illustrate the basic ideas behind adaptive Web caching.
Fig.4. The ISAAC cache system structure.

As illustrated in Fig.4, the ISAAC (short for Intelligent Strategies and Architectures for Adaptive Caching) system consists of five major modules[30]: the client request processor (CRP), the network traffic monitor (NTM), the inter-proxies cooperation manager (IPCM), the cache proxy kernel module (CPKM), and the storage manager (SM). The CRP module is responsible for preprocessing and parsing HTTP requests of clients. The NTM module monitors the current network traffic and provides input to the CPKM module. For example, if the NTM detects that the network is idle, it will inform the CPKM module to initiate the prefetching or coherency maintenance processes (see Section 4 for details about these processes). The IPCM module is responsible for the inter-cache routing and cooperation strategies relevant to cooperative caching systems. The CPKM module implements key Web caching policies such as prefetching, coherency, and replacement. These policies are not hardwired: different types of policies and control parameter settings are invoked to best match the specific caching scenario characterized by the input from the NTM and IPCM. The SM module is responsible for the effective management of hard disk and RAM to enable efficient and fast cache access.

3 Web Traffic Characteristics

3.1 Basic Web-Based Interaction

The interaction between HTTP clients and servers is illustrated in Fig.5. The amount of time needed to retrieve a Web object when a new connection is required can be approximated by two RTTs plus the time to transmit the response.

Fig.5. Interaction between a Web client and a server under HTTP.
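The two-RTTs-plus-transmission approximation for fetching an object over a new connection can be written out directly. The numbers below are invented illustrative values, not measurements from the paper:

```python
def retrieval_time(rtt_s, size_bytes, bandwidth_bps):
    """Approximate time to fetch a Web object over a new connection:
    one RTT for TCP connection setup, one RTT for the request and the
    first byte of the response, plus the response transmission time."""
    return 2 * rtt_s + (size_bytes * 8) / bandwidth_bps

# Example: 100 ms RTT, 20 KB object, 1 Mbit/s link:
# 0.2 s of round trips + 0.16 s of transmission.
t = retrieval_time(0.1, 20_000, 1_000_000)
```

The round-trip term dominates for small objects, which is why eliminating repeated connection setup (as persistent connections do) and shortening the path to the content (as caching does) both pay off.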
In the older but still widely-used HTTP version 1.0, a separate TCP connection needs to be opened and closed for each object embedded in a Web page. This increases the user-perceived access latency as well as the communication overhead. To address this problem, the current HTTP protocol, HTTP version 1.1, utilizes so-called persistent connections to eliminate the need for establishing multiple TCP connections. Under persistent connections, a TCP connection is kept open for a Web server to serve multiple objects to the requesting client until a time-out expires. According to a recent analysis, however, the utilization of HTTP 1.1 does not result in notable differences in HTTP traffic[31]. There is strong evidence to support the observation that the behavior of Web users strongly affects the nature of TCP connections. In particular, it is shown that the time between two page visits is critical in determining whether existing TCP connections can be re-utilized or new connections have to be
opened. It is unclear how HTTP 1.1, which has replaced HTTP 1.0 as the dominant Web protocol, will affect Web caching research.

3.2 Web Objects and Traffic Patterns

An in-depth understanding of Web traffic patterns can substantially contribute to the development of Web caching technology[31–36]. For instance, temporal locality and spatial locality of reference within user communities are of particular relevance to Web caching. Temporal locality implies that recently accessed objects are more likely to be referenced in the near future. Spatial locality refers to the property that objects neighboring an object accessed in the past are likely to be accessed in the future. Strong empirical evidence exists in support of these localities in the context of Web traffic[32;37;38]. These localities in part help explain the success of Web caching in improving Web QoS and point out new fruitful research directions[38].

Researchers have also identified characteristics of Web documents that can guide the design of Web caching systems and help determine specific caching policies such as prefetching and replacement. Two main characteristics of Web documents relevant to caching are lifetime and modification patterns[39;40]. For instance, more accurate Time-To-Live (TTL) estimation of Web objects can directly lead to more efficient Web object coherency operations. The modification patterns also have major influence on the design of caching prefetching techniques. Historical information concerning the Web objects and their hosting servers can be used to derive useful estimates, possibly in an adaptive caching framework[30].

4 Prefetching, Replacement, and Coherency

In this section, we discuss the main policies that govern various aspects of the inner working of a single proxy cache system. Note that most of these policies are applicable to caches located in network locations other than proxy servers (such as client browsers and surrogate servers) as well.

4.1 Prefetching Policies

Cache hit rate, or simply hit rate, is calculated as the number of user-requested Web objects that are answered by the cache divided by the total number of requested objects. Achieving the highest possible cache hit rate under the given resource constraints such as storage capacity is one of the primary goals of all Web caching systems. An ideal caching system should achieve a hit rate close to 100%. However, recent results suggest that the highest cache hit rate that can be achieved by the best caching system is usually no more than 40% to 50%[20]. One of the main reasons for this less-than-ideal hit rate is that Web users are constantly seeking new information. Caching only old Web objects that have been visited in the past limits the caching system's capability to serve user requests.

One way to improve the hit ratio is to anticipate future document requests and prefetch these documents into a local cache before they are actually requested by a user[41–45]. Such prefetching activities can be performed by the cache proxy server when the network load is low and the Web servers are relatively idle. Prefetching delivers an additional benefit in off-line browsing, where new documents that the user will most probably be interested in are automatically prefetched onto the local machine.

Current Web prefetching mechanisms can be divided into three types: proxy-initiated policies[41;46], server-initiated policies[44], and hybrid prefetching policies[42;43;45;47;48].

Proxy-Initiated Policies. Two main proxy-initiated prefetching policies are the Client-Side-Prefetching policy[41] and the Prediction-by-Partial-Matching (PPM) policy[46]. In the Client-Side-Prefetching policy, what and when to prefetch are decided by the client. This policy is simple and easy to implement but does not provide any prediction for future requests. The basic idea of the PPM policy is to use past Web access patterns to predict future accesses for individual users. The patterns captured are in the form of "User A is likely to access URL U1 right after URL U2". The key advantage of this policy is the potentially high accuracy in predicting user future accesses. A main open issue with this policy is when to prefetch the objects that are likely to be visited. These timing decisions are particularly important in a real-time online environment with a large number of users.

Server-Initiated Policies. In this type of prefetching mechanism, the server anticipates future document requests and sends the documents to participating proxies. The "Top-10" algorithm is a classical example of server-initiated policies, where the server periodically compiles a list of its most popular objects and serves them to clients or proxies[44]. To make sure that these popular objects are sent only to the interested clients, the server periodically singles out a subset of active clients (that
have visited the server frequently) and sends objects only to them. Experiments show that server-initiated policies can be fairly successful in serving future requests (with 60% accuracy) with a small (less than 20%) increase in network traffic[43]. The success of server-initiated policies can be partially explained by the information-pooling effect: the server, as the central access point, has the most complete information concerning user interest in the set of Web objects residing on the server. Server-initiated policies also face several challenges[6;43]. First, because of the very fact that caching is gaining acceptance, server-based usage data may not represent true user interest (e.g., caches can hide many user requests from the server). Second, the server needs to keep track of participating clients and proxies, incurring additional computational and communication overheads.

Hybrid Prefetching Policies. Combining user access patterns from a client machine and general statistics from a server can improve caching prefetching. Yu and Kobayashi presented a general mathematical formulation and related operating policies in [49]. Their model also considers factors such as server response time and document update frequency.

We now summarize the state of the art in prefetching research and point out related ongoing and future research topics. Most existing prefetching models are developed for individual caches. They assume that the local cache storage capacity is unlimited. When predicting user future requests, these models consider factors such as user-specific historical access patterns, and aggregate document popularities and access frequencies. When making prefetching decisions, additional factors such as the time needed to prefetch the documents and the network congestion are also considered. Prefetching remains an active area of study[6].
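The pattern-based prediction idea behind PPM ("U2 tends to be followed by U1") can be approximated with even a first-order model. The sketch below is a deliberately simplified illustration of the principle, not the actual PPM algorithm of [46]; the access log is invented:

```python
from collections import Counter, defaultdict

# First-order approximation of pattern-based prefetch prediction:
# count which URL tends to follow which, then prefetch the most
# likely successor of the URL just requested.
follows = defaultdict(Counter)

def observe(access_log):
    for prev, nxt in zip(access_log, access_log[1:]):
        follows[prev][nxt] += 1

def predict_next(url):
    """Return the most likely next URL after `url`, or None."""
    if not follows[url]:
        return None
    return follows[url].most_common(1)[0][0]

observe(["/home", "/news", "/home", "/news", "/home", "/about"])
print(predict_next("/home"))   # "/news" (seen twice vs "/about" once)
```

A full PPM predictor conditions on longer contexts (the last k requests) and backs off to shorter ones, trading memory for accuracy; the open timing question raised above (when to prefetch the predicted object) is untouched by either variant.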
We believe that the following four topics in prefetching are most likely to produce fruitful research results and make significant impact on caching practice. In effect, some of these topics have already been actively pursued by caching researchers.

First, the data mining literature provides a wide selection of models and efficient algorithms that can be used to learn potentially complex user Web access patterns[50–52]. For instance, mining techniques for association rules can be readily applied to learn user Web behavior. We believe that research in the intersection of data mining and Web caching may lead to meaningful results and benefit caching system design and implementation in general.

Second, most current prefetching research does
not model system constraints such as cache storage capacity. As a result, important Web object characteristics such as size are ignored in making prefetching decisions. For instance, in the Top-10 policy, all top-ranked popular objects are fetched regardless of their size. In a caching system where local storage capacity is a major limiting factor, a better prefetching policy would consider an object's popularity and size together. Future prefetching research needs to explicitly address these system constraint issues.

Third, recent research has started to look into the possibility of integrating prefetching with other aspects of Web caching, such as replacement and coherency policies, to achieve better overall caching performance[53]. For instance, an "overzealous" prefetching policy can better serve some users' future needs but may push many frequently-visited old cached contents out of the cache and in turn hurt the cache's performance. This type of problem, called "thrashing", can be dealt with when prefetching decisions are not made in an isolated manner.

Fourth, new research is called for to develop effective prefetching policies that work in the collaborative caching context. Applying the policies developed for individual caches to collaborative caching can lead to significant waste of resources and sub-optimal performance.
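The second topic above suggests weighing popularity against size when storage is the binding constraint. One plausible greedy heuristic, ranking candidates by popularity per byte, can be sketched as follows; the object names, popularities, and sizes are invented for illustration:

```python
# Greedy, size-aware prefetching: rank candidates by popularity per
# byte and take them until the prefetch budget is exhausted.

def select_prefetch(candidates, budget_bytes):
    """candidates: list of (url, popularity, size_bytes) tuples."""
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    chosen, used = [], 0
    for url, pop, size in ranked:
        if used + size <= budget_bytes:
            chosen.append(url)
            used += size
    return chosen

candidates = [("/big", 100, 900_000), ("/a", 40, 10_000), ("/b", 30, 20_000)]
print(select_prefetch(candidates, 100_000))   # small popular objects win
```

Under a 100 KB budget the two small objects are chosen over the single most popular but oversized one, which is exactly the trade-off a pure Top-10 ranking misses.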
4.2 Replacement Policies

One key resource constraint that any caching system has to consider is the local cache storage space. When the cache is nearly full and new Web objects (including those being prefetched) need to be stored, some existing objects have to be evicted from the cache. Cache replacement policies govern such eviction activities. Three types of replacement policies have been developed in the caching literature: traditional replacement policies, key-based replacement policies, and cost-based replacement policies. Traditional replacement policies include the Least Recently Used (LRU) algorithm[54;55], the Least Frequently Used (LFU) algorithm[56;57], and the Pitkow/Recker policy[58]. Key-based replacement policies make eviction decisions based on a primary key determined by certain characteristics of the Web objects under examination[58;59]. Cost-based replacement policies employ a cost function to rank the Web objects in the current cache by their appropriateness to be evicted[60–62].

Traditional Replacement Policies. LRU and LFU are now widely used in computer memory and disk caching systems. LRU replaces the object that
was used least recently and LFU replaces the object that was used least frequently. Because of their simplicity, they have been adopted by Web caching systems as well[54;56]. The Pitkow/Recker policy is a variation of LRU. Objects are evicted in the order decided by LRU, with one exception: if all the candidate objects were accessed within the same day, then the largest one is removed (rather than the one that was visited closest to the beginning of that day)[58]. The performance of these traditional replacement policies is relatively low, since neither the relevant characteristics of the Web objects (such as size) nor the general environment indicators (such as network transmission speed) are considered in making eviction decisions.

Key-Based Replacement Policies. This type of policy uses a primary key to decide which object to evict; ties are broken using additional keys. A widely-used primary key is the object size. One popular approach evicts the largest object[58]. The LRU-MIN policy uses a more sophisticated method. If there are objects in the cache whose size exceeds a threshold S, LRU-MIN evicts the least recently used of these objects. If the sizes of all the remaining objects are smaller than S, LRU-MIN re-evaluates the objects using a new threshold set to S/2. This process continues until enough space is reclaimed for new objects.

Cost-Based Replacement Policies. A cost-based replacement policy relies on a cost function to decide which object should be evicted. This cost function can take into consideration the characteristics of an object such as last access time, usage frequency, and HTTP headers. It can also depend on general environmental parameters such as the network traffic situation, the current time at which eviction has to be performed, and so on. Several cost-based policies are derived from the Greedy-Dual-Size algorithm[63], which considers in an integrated manner an object's size, its retrieval cost, and how recently it has been accessed.
These policies work as follows. Based on its size and retrieval cost, an object is given an initial value when it first gets into the cache or is accessed by a user. This value then decreases incrementally as time goes by if the object is not accessed by any user. The object with the least value is the candidate to be replaced. One possible improvement to these Greedy-Dual-Size-based policies is to consider the popularity of an object in addition to its size and recency of access. An important factor to consider while designing cost-based replacement policies is the overhead of computing the cost function. In some cases, high computational overhead may prevent the use of certain otherwise desirable algorithms in practice. Recall that in a cooperative caching system, access characteristics differ significantly across the levels of the caching hierarchy[64-66]. This suggests that different caching replacement policies should be used at different levels of a caching hierarchy to achieve better performance[67].
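The Greedy-Dual-Size mechanism described above can be sketched as follows. The use of an "inflation" value to age objects lazily and a min-heap holding possibly stale records are standard implementation tricks; the specific cost inputs are illustrative assumptions, not details prescribed by the cited work.

```python
import heapq

class GreedyDualSizeCache:
    """Minimal sketch of Greedy-Dual-Size replacement (illustrative only)."""
    def __init__(self, capacity):
        self.capacity = capacity   # total bytes available
        self.used = 0
        self.inflation = 0.0       # rises to the value of each evicted object
        self.entries = {}          # url -> (value, size)
        self.heap = []             # (value, url) min-heap; may hold stale records

    def _evict(self):
        # Pop until a record matching a live entry is found, then evict it.
        while self.heap:
            value, url = heapq.heappop(self.heap)
            if url in self.entries and self.entries[url][0] == value:
                self.inflation = value          # implicitly ages all survivors
                self.used -= self.entries[url][1]
                del self.entries[url]
                return

    def access(self, url, size, cost):
        if size > self.capacity:
            return                              # object cannot fit at all
        if url in self.entries:                 # re-access: drop the old entry
            self.used -= self.entries[url][1]
            del self.entries[url]
        while self.used + size > self.capacity and self.entries:
            self._evict()
        # On a hit or fresh insertion the value is reset to inflation + cost/size.
        value = self.inflation + cost / size
        self.entries[url] = (value, size)
        self.used += size
        heapq.heappush(self.heap, (value, url))
```

With equal retrieval costs, larger and less recently refreshed objects end up with the smallest values and are evicted first, matching the integrated size/cost/recency behavior described above.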
4.3 Cache Coherence Policies

As argued before, serving user requests off the cache proxy server can potentially improve Web QoS when cache hits occur for a reasonable portion of user requests. Achieving a high cache hit rate, however, is only one of the challenges facing Web caching. Another major challenge is concerned with how to avoid serving "stale" contents to the user when a hit occurs. This challenge is relevant not only to Web caching but also to other caching applications such as distributed file systems[68]. In the context of Web caching, cached objects can quickly become stale or out-of-date when their counterparts on the original Web servers are frequently updated. The main technical goal of Web cache coherence mechanisms is to avoid or minimize the probability that stale Web objects are used to serve user requests. In other words, cache coherence mechanisms aim to provide users with cached contents that are as fresh as possible compared with their original copies. Existing Web cache coherence mechanisms can be categorized into three classes: proxy-initiated policies, server-initiated policies, and hybrid policies. Server-initiated policies mainly include Callback[69] and Piggyback Server Invalidation (PSI)[70]. Proxy-initiated policies include Poll Each Read, Poll Periodically[71], Adaptive TTL[72], and Piggyback Cache Validation[73]. Major hybrid policies are Lease[74], Volume Lease[71], and Adaptive Lease[75].
Server-Initiated Policies. We discuss two types of cache coherence policies that are initiated from the server side. The first type is the Callback algorithm. In this algorithm[69;72], Web servers keep track of which proxies are caching which objects. When an object is to be modified, the server notifies the proxies that cache this object and then waits for their replies. After receiving replies from all related cache proxies, the server finishes modifying the object.
The main advantage of the Callback algorithm is that it is able to maintain the strongest consistency between original Web objects and their cached copies. The main disadvantage is the computational overhead imposed on the server. Such an
overhead is significant when a large number of proxies are present and when the set of cached objects maintained by each proxy server changes constantly. The second server-initiated mechanism is the Piggyback Server Invalidation (PSI) algorithm. The basic idea is for the server to piggyback on a reply to a proxy a list of objects that have changed since the last access by that proxy. Upon receiving such a list, the proxy invalidates the cached objects on the list and extends the life cycle of the other cached objects that are not on the list. The key advantage of PSI is the reduced coherence-related messaging and the resulting network bandwidth savings[70].
Proxy-Initiated Policies. There are mainly four coherence policies that are initiated from the proxy side. The first one is the Poll Each Read algorithm. Following this policy, before sending a cached object to the user, a proxy contacts the server hosting this object to find out whether the object is valid. If it is not valid, the server sends the latest version of the object to the proxy, which in turn updates its cache and forwards the object to the user. This policy can be easily implemented and guarantees strong coherence. The main drawback is the added access latency caused by the validation messages exchanged between the proxy and the server for each object requested by the user. The second policy is the Poll Periodically algorithm. This algorithm is based on Poll Each Read but assumes that a cached object remains valid for at least a certain amount of time after it is validated. This approach can obviously improve access latency compared with Poll Each Read. However, it is difficult to choose an appropriate timeout period[71]. The third policy is the Piggyback Cache Validation (PCV) algorithm[73]. In its simplest form, whenever a proxy cache needs to send a message to a server, it piggybacks a list of cached objects from that server for which the expiration time is unknown and the heuristically determined TTL has expired.
The server handles the request and indicates which cached objects on the list are now stale and thus need to be updated. Requests for cached objects that have not recently been validated cause an If-Modified-Since (IMS) GET request to be sent to the server. The main advantage of PCV is that it can reduce access latency by minimizing the number of network connections that need to be established between the server and the proxy. The disadvantage of this algorithm mainly lies in the increased size of the regular request messages due to piggybacking. The computational overhead for the proxy cache is slightly increased, as it must maintain a list of cached objects on a per-server basis. The additional overhead for the server is that it must
validate the piggybacked object list in addition to processing regular requests. The fourth and last proxy-initiated coherence policy is the Adaptive TTL algorithm. The basic TTL policy maintains cache consistency through the use of the time-to-live (TTL) attribute of a cached object. A cached object is considered valid until its TTL expires. The problem with the basic TTL policy is that it is difficult to estimate the TTL parameters. The Adaptive TTL policy handles this problem by adjusting the TTL attribute of an object based on observations of its life cycles[72]. This approach takes advantage of the fact that object life cycle distributions tend to be bi-modal: if a file has not been modified for a long time, it tends to stay unchanged. Positive empirical results have demonstrated the usefulness of Adaptive TTL[72].
Hybrid Policies. Hybrid policies refer to coherence policies that actively involve both Web servers and proxy caches. We discuss three major hybrid policies. The first one is the Lease-based coherence algorithm, motivated by the desire to improve system scalability and fault-tolerance[71;74]. Under this policy, if a proxy or network failure prevents a server from invalidating cached copies of a Web object that needs to be changed, the server needs only to wait until the "lease" expires before modifying the object. A main challenge that Lease-based policies need to address is how to determine the appropriate lease length for a given object. The second policy is the Volume Lease algorithm[71]. The main goal of this algorithm is to exploit spatial locality to amortize overheads across multiple objects in a volume. This algorithm uses a combination of object leases and volume leases. Object leases are associated with individual data objects, while volume leases are associated with a collection of related objects on the same server. This policy has been shown to perform well in a WAN environment[71].
The third and last hybrid policy is the Adaptive Lease algorithm. This algorithm determines the optimal lease durations based on a number of factors including the need to maintain strong consistency, network connection costs, object update frequencies, etc. Adaptive Lease is able to achieve significant improvement over other approaches while maintaining a modest and manageable computational overhead[75].
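The Adaptive TTL heuristic discussed above can be sketched as follows; the aging factor and the cap on the TTL are illustrative assumptions, not parameters prescribed by the cited studies.

```python
import time

def adaptive_ttl(last_modified, now=None, factor=0.1, cap=86400.0):
    """Adaptive TTL heuristic: an object that has stayed unchanged for a
    long time is assumed likely to stay unchanged, so its TTL is set to a
    fraction (`factor`) of its current age, bounded by `cap` seconds.
    Both `factor` and `cap` are illustrative values."""
    now = time.time() if now is None else now
    age = max(0.0, now - last_modified)
    return min(factor * age, cap)

def is_fresh(fetched_at, last_modified, now=None):
    # A cached copy is served without revalidation while the TTL assigned
    # at fetch time still holds.
    now = time.time() if now is None else now
    return (now - fetched_at) < adaptive_ttl(last_modified, now=fetched_at)
```

For example, an object last modified 1000 seconds before it was fetched receives a TTL of 100 seconds under a factor of 0.1, so a hit 50 seconds after the fetch is served from the cache while a hit 1000 seconds later triggers revalidation.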
5 Inter-Cache Cooperation and Routing

A Web caching system based on a single cache poses many limitations[76]. It does not scale well, and
the single proxy cache can easily become the performance and communication bottleneck. In contrast, cooperative proxy caching systems, which consist of a set of cache proxy servers serving a group of users, can overcome these limitations. As argued in Section 2, the effectiveness of cooperative caching is largely determined by the cooperation and routing strategy employed to coordinate inter-cache cooperation. This section discusses four main types of cooperation and routing strategies developed in the literature: broadcast queries, hierarchical caching, URL hashing, and directory-based routing tables[6].
Broadcast Queries Policy. The Broadcast Queries Policy works as follows. When a cache proxy receives a client request, it first checks whether the requested object can be served from its local cache. If a valid cached copy is not found in the local cache, this cache proxy will broadcast the request to all participating proxies asking for their help[17]. The main advantage of this policy is its
flexibility: the effects of a proxy joining or departing the cooperative caching network are localized to its immediate neighbors. Two main drawbacks of this policy are: (a) a cache proxy has to wait for the last response from its neighbors before concluding that none of them has the requested documents and sending the request to the next level of the caching hierarchy; and (b) broadcast queries result in extra network traffic and impose computational overhead on participating cache proxies.
Hierarchical Caching Policy. In this policy, when a local miss occurs, the cache proxy simply forwards the missed request to its parent in the caching hierarchy without attempting to query sibling cache proxies. Besides the obvious savings in communication overhead compared with the Broadcast Queries Policy, this policy has the additional benefit of allowing different organizations to utilize a common high-level parent proxy without sharing cached contents at the lower levels. The main disadvantages of the hierarchical caching policy are: (a) cache proxies close to the root need to store a large number of objects and can become performance bottlenecks; and (b) the entire hierarchy has to be traversed before it can be concluded that a requested object has to be fetched directly from the original Web server.
Directory-Based Cache Cooperation. In this approach, the locations of cached objects are explicitly maintained by a directory server[23]. When a miss occurs at a certain cache proxy, it will query the directory server to find out which proxy can provide the requested object. Upon receiving a response from the directory server, this proxy will either contact another cache or visit the original Web
server directly. To ensure that the directory server always provides fresh location (meta) information, any cache proxy that caches new objects or drops old contents needs to send updates to the directory server. Since the communications between the directory server and the caches do not contain the actual content, the messaging overhead associated with this approach is not significant. Another advantage of this approach is that it promotes loose and flexible connections between cache proxies. Individual proxies can be added to and removed from the system without the knowledge of other cache proxies. The main disadvantage of this policy is that the directory server may become a single point of failure.
Hashing Function-Based Cache Routing. In a hashing function-based approach, all Web clients store a common hash function that can be used to map any given URL to a hash space. The hash space is partitioned, and each set in the partition is associated with one of the sibling caches. When a client needs to access a Web object, it first hashes the URL of the object and then requests it from the sibling cache whose set contains the hash value. If this cache cannot satisfy the request, it retrieves the object from the original server, places a copy in its cache, and forwards the object to the client. Several approaches that belong to this general hashing-based framework have been reported in the literature, including the Cache Array Routing Protocol[77] and the Consistent Hashing algorithm[25]. The advantages of hashing-based approaches are three-fold. First, they are scalable with respect to the number of user requests. More cache proxies can be easily added, and the hash space can be repartitioned. Second, their computational and communication overheads are relatively low. Third, they lead to efficient use of cache storage space because no duplicated copies of the same document will be made on different cache proxies.
The main disadvantage of these hashing-based approaches is that the same cache proxy must process all requests for a given URL, no matter where the requests come from. This can lead to potential performance degradation.
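The hashing-based routing idea can be illustrated with a minimal consistent-hashing sketch. The choice of MD5, the virtual-node count, and the proxy names are assumptions for illustration, not details of the cited protocols.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hashing sketch for cache routing (illustrative)."""
    def __init__(self, caches, vnodes=64):
        # Each proxy is placed at several "virtual node" points on the ring
        # so that load spreads evenly and repartitioning stays local.
        self.ring = []                        # sorted list of (point, cache)
        for cache in caches:
            for i in range(vnodes):
                self.ring.append((self._h(f"{cache}#{i}"), cache))
        self.ring.sort()

    @staticmethod
    def _h(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def route(self, url):
        # Hash the URL onto the ring; the first cache clockwise serves it.
        point = self._h(url)
        i = bisect.bisect(self.ring, (point,)) % len(self.ring)
        return self.ring[i][1]
```

Because a URL always hashes to the same point, every client routes a given URL to the same sibling cache, which yields the no-duplication property (and the single-proxy-per-URL drawback) discussed above; adding or removing a proxy only remaps the keys adjacent to its ring points.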
6 Performance Metrics and Evaluation of Web Caching Systems

6.1 Performance Metrics for Web Caching Systems

Recent years have seen the rapid development of Web caching technology. Many Web caching architectures and policies that govern various operational aspects of caching systems have been developed. Often, these architectures and policies offer competitive advantages in some aspects of caching but deliver sub-satisfactory performance in other aspects. To assess the performance of a Web caching system and understand the tradeoffs offered by various caching approaches, it is essential for caching researchers and developers to establish a comprehensive set of performance metrics. This section provides a summary of the performance measures commonly used to evaluate the performance and effectiveness of Web caching systems[67;78-81]. These measures are divided into seven groups.
Web Traffic Patterns. Web traffic-related measurements characterize the environment in which a caching system is operating. They provide useful input to the overall design of a caching system (e.g., storage capacity planning and caching network topology determination). Examples of such measurements are: the transfer size distribution, which describes the distribution of the total number of bytes transferred in the network; the file size distribution, which describes the distribution of the sizes of cached objects; and the proxy traffic intensity, which indicates the number of client requests received by a proxy during a given time interval.
Aggregate Performance. This group of measures shows the aggregate performance of a caching system. By "aggregate", we mean that these measures are obtained by treating the given caching system as a whole rather than analyzing its internal components. Two main performance indices in this group are (a) the processing capacity and speed of a cache proxy and (b) the request response time.
Hit Analysis. There are many performance indices in this group: hit rate, byte hit rate (discussed in Section 4), disk hits (the number of requests resolved by reading the content from the disk), memory hits (the number of requests resolved by reading the content from the cache's internal memory), negative hits (the number of uncached objects being requested), and so on.
Inbound Traffic. A cache proxy and its clients rely on an internal network for their communications. The QoS of this inbound network has a major impact on response time. Two related measures are: the Client Connection Time, which is the delay on the proxy from accepting a connection until receiving a parseable HTTP request; and the Proxy Reply Time, which is the time it takes to send a reply to a client after receiving the document from the cache or another server.
Disk Storage Subsystem. Since most hits need to be served from the local disk attached to the cache proxy, the request response time depends directly
on the performance of the local disk storage subsystem. Three indices are commonly used in this group of measures: the Disk Traffic Intensity, measuring the number of swap requests per second; the Concurrent Disk Requests, describing the number of concurrent swap requests; and the Disk Response Time, indicating the total time taken to swap a document in/out of the disk cache.
Network Utilization. This group of performance indices measures the average network bandwidth consumption, the latency of the network transmission, and the number of hops along the network path, among others.
Outbound Traffic. When a cache miss occurs, communications between the cache proxy and the original Web servers need to take place, which go beyond the internal network linking the proxy and its clients. The related indices include: the Proxy Connect Time, which is the time taken to send an HTTP request to an original server or other proxies; and the Server Reply Time, which is the time taken to receive a reply from an original server.
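As a small worked example of the hit-analysis measures above, hit rate and byte hit rate can be computed directly from an access log. The `(was_hit, size_in_bytes)` log format is an assumed simplification of real proxy logs.

```python
def hit_metrics(log):
    """Compute (hit rate, byte hit rate) from an access log.
    `log` is a list of (was_hit, size_in_bytes) pairs -- an assumed format."""
    hits = sum(1 for was_hit, _ in log if was_hit)
    hit_bytes = sum(size for was_hit, size in log if was_hit)
    total_bytes = sum(size for _, size in log)
    return hits / len(log), hit_bytes / total_bytes

# Three requests, two served from the cache but the miss is a large object:
rate, byte_rate = hit_metrics([(True, 1000), (True, 200), (False, 8000)])
```

The example shows why the two measures can diverge: the request hit rate is 2/3 while the byte hit rate is only 1200/9200, because the single miss dominates the bytes transferred.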
6.2 Evaluation of Web Caching Systems

The Web caching community has developed several standard benchmarking tools to evaluate the performance of cache systems. Some of the tools are self-contained and can generate all HTTP requests and responses internally. Others rely on trace log files for requests and live Web servers for responses. A benchmark tool that uses real URLs obtained from actual Web servers is easy to implement but likely to give inconsistent and irreproducible results. A self-contained benchmark tool is much more complicated to develop but has the advantage of being configurable and reproducible. The most commonly utilized Web caching benchmark tools are described below.
Web Polygraph. Web Polygraph is a free benchmarking tool for caching proxies, original server accelerators, L4/7 switches, content filters, and other Web intermediaries (http://www.webpolygraph.org/). It was developed by NLANR and can simulate Web clients and servers as well as generate workloads to mimic typical Web accesses.
Blast. The Blast software package was developed by Jens-S (http://www.cache.dfn.de/DFNCache/Development/blast.html). It replays trace log files for the Web requests. Blast launches a number of child processes in parallel, each handling one request at a time. It also includes a program that simulates a Web server. This simulated server supports the GET method, HTTP/1.1 persistent connections, and If-Modified-Since validation requests.
Wisconsin Proxy Benchmark. The Wisconsin Proxy Benchmark (WPB) is one of the earliest publicly available cache benchmarking tools (http://www.cs.wisc.edu/~cao/wpb1.0.html). It was developed at the University of Wisconsin-Madison. The WPB can generate requests and responses on demand (versus from trace files), like Web Polygraph. It can also be configured to use a one-request-per-process approach similar to Blast.
Besides the above evaluation tools, there are some well-known Web traffic traces available for simulating Web visiting activities. For instance, the Internet Traffic Archive (http://ita.ee.lbl.gov/index.html) is a moderated repository supporting widespread access to traces of Internet network traffic, sponsored by ACM SIGCOMM. These traces can be used to study network dynamics, usage characteristics, and growth patterns, as well as provide the grist for trace-driven simulations.
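The trace-driven evaluation style behind tools such as Blast can be illustrated with a minimal replay loop. The CSV trace format (`url,size` columns) and the cache interface (`get`/`put`) are assumptions for illustration; real traces and tools use richer, tool-specific formats.

```python
import csv

def replay(trace_path, cache):
    """Tiny trace-driven evaluation loop: feed each logged request to a
    cache object and report the resulting hit rate.  Assumes a CSV trace
    with `url,size` columns and a cache exposing get(url)/put(url, size)."""
    hits = total = 0
    with open(trace_path) as f:
        for row in csv.DictReader(f):
            total += 1
            if cache.get(row["url"]) is not None:
                hits += 1                       # request served from cache
            else:
                cache.put(row["url"], int(row["size"]))  # miss: fetch and store
    return hits / total if total else 0.0
```

Replaying the same trace against different replacement or coherence policies gives the reproducible, side-by-side comparisons that the self-contained benchmarks above aim for.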
7 Recent Trends in Web Caching

In this section, we briefly survey two emerging areas of Web caching research that have received considerable attention in the recent literature.
7.1 Differentiated Web QoS

An emerging Web phenomenon is the increasing diversity of Web clients, content providers, and Internet information appliances, ranging from high-end servers and workstations, to regular PCs networked through dial-up links, to small wireless devices with limited computational power and messaging capacity. This calls for a caching architecture that takes into consideration performance differentiation in information access. Some researchers propose to address performance differentiation issues in an integrated manner across the underlying network infrastructure, Web caching proxies, and Web servers[82]. Adaptive and intelligent Web caching architectures and strategies have also been proposed to deal with these performance differentiation issues[30].
7.2 Dynamic Content Caching

The benefit of current Web caching systems is limited by the fact that only a fraction of Web data (e.g., static HTML pages) is easily cacheable. Web-based information systems are increasingly becoming more interactive and dynamic in nature, however, partially propelled by dynamic content generation technologies such as Web services, JSP, ASP, and JDBC. Since dynamic Web objects have to be generated by the Web server each time they are requested, serious performance problems can occur, especially when a large number of requests have to be processed by the server. Also, because of the required server involvement, it is hard to cache these objects on proxy servers that do not have direct access to the server's computational resources or are not allowed to access the internal data sources used by the server to generate these dynamic objects in the first place. Despite its difficulty, caching "uncacheable" objects has tremendous value and is being actively pursued by caching researchers. The basic idea behind this line of research is to try to "reuse" dynamically generated objects between accesses, based on the user-provided input parameters used to generate these objects. Currently, two approaches that address the dynamic content caching problem have attracted a lot of attention from the caching community: active caches and surrogates.
Active Caches. The Web servers supply specialized cache applets along with the requested documents. The cache proxies that receive these cache applets are required to invoke them upon cache hits. These applets generate appropriate dynamic contents at the cache level without communicating with the original server[83]. It has been shown that this Active Cache approach can significantly reduce network bandwidth consumption.
Surrogate Caches. Surrogates, or so-called Web server accelerators, are placed in front of one or more Web servers to speed up user accesses. These surrogate caches can cache the server's responses to client requests. They also provide an API which allows application programs to explicitly add, delete, and update cached data.
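The "reuse by input parameters" idea behind dynamic content caching can be sketched as follows. The parameter-derived key and the simple TTL-based expiry are illustrative simplifications: real systems such as Active Cache or surrogates rely on server-supplied applets or explicit invalidation APIs rather than a fixed TTL.

```python
import hashlib
import json
import time

class DynamicContentCache:
    """Sketch of reusing dynamically generated objects between accesses.
    The key is derived from the script URL plus its sorted input parameters;
    TTL-based expiry stands in for real invalidation (an assumption)."""
    def __init__(self, ttl=30.0):
        self.ttl = ttl
        self.store = {}           # key -> (expires_at, generated_body)

    def _key(self, url, params):
        # Two requests with the same URL and parameters map to the same key,
        # regardless of parameter order.
        canonical = json.dumps([url, sorted(params.items())])
        return hashlib.sha1(canonical.encode()).hexdigest()

    def fetch(self, url, params, generate, now=None):
        now = time.time() if now is None else now
        key = self._key(url, params)
        entry = self.store.get(key)
        if entry and entry[0] > now:
            return entry[1]                   # reuse an earlier generation
        body = generate(url, params)          # fall back to the origin server
        self.store[key] = (now + self.ttl, body)
        return body
```

A repeated request with identical parameters is answered from the cache without invoking the generator again, which is exactly the server-load reduction the approaches above seek; the hard part, as the text notes, is deciding when a regeneration is actually required.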
8 Conclusions

As the Web is experiencing tremendous growth, there is an increasingly pressing need to improve Web performance and QoS. Web caching has proven to be an effective approach to dealing with Web performance and QoS challenges by reducing access latency and alleviating Web server bottlenecks. This paper is a survey of the state of the art in Web caching technology. We discussed various Web caching system architectural designs, including single-cache and cooperative caching, and emphasized a number of policies that govern the internal operations of a caching system. Furthermore, we
discussed the performance metrics for and evaluation of caching systems. Web caching remains an active field of study, and many open research topics need to be addressed. How can an intelligent and adaptive Web caching scheme be developed to deal with the heterogeneous Internet environment where differentiated performance is desired? Where should proxy cache servers be placed on a network? How can dynamic contents be cached efficiently? As Application Service Providers (ASPs) and networked, need-based services become more commonplace, service caching, based on the Open Services Gateway Initiative (OSGi) Specification[53], might become an important new area for caching research and system development.
References

[1] Zari M, Saiedian H, Naeem M. Understanding and reducing Web delays. IEEE Computer Magazine, Dec. 2001, 34(12): 30-37.
[2] Abrams M, Standridge C R, Abdulla G, Williams S, Fox E A. Limitations and potentials. In Proc. the 4th International WWW Conference, 1995, pp.119-133.
[3] Aggarwal C, Wolf J L, Yu P S. Caching on the World Wide Web. IEEE Trans. Knowledge and Data Engineering, 1999, 11(1): 94-107.
[4] Davison B D. A Web caching primer. IEEE Internet Computing, July/Aug. 2001, 5(4): 38-45.
[5] Feldmann A, Caceres R, Douglis F, Glass G, Rabinovich M. Performance of Web proxy caching in heterogeneous bandwidth environments. In Proc. the IEEE Infocom'99 Conference, March 1999, pp.107-116.
[6] Rabinovich M, Spatscheck O. Web Caching and Replication, First Edition. Addison Wesley Publishing Co., New York, Dec. 2001.
[7] Wang J. A survey of Web caching schemes for the Internet. ACM Computer Communication Review, Oct. 1999, 29(5): 36-46.
[8] Moon S M. Increasing cache bandwidth using multiport caches for exploiting ILP in non-numerical code. In IEE Proc. Computers and Digital Techniques, Sept. 1997, 144(5): 295-303.
[9] Thiebaut D, Stone H S, Wolf J L. Improving disk cache hit-ratios through cache partitioning. IEEE Trans. Computers, June 1992, 41(6): 665-676.
[10] Stiliadis D, Varma A. Selective victim caching: A method to improve the performance of direct-mapped caches. IEEE Trans. Computers, 1997, 46(5): 603-610.
[11] Kangasharju J, Kwon Y G, Ortega A. Design and implementation of a soft caching proxy. Journal of Computer Networks and ISDN Systems, 1998, (30): 2113-2121.
[12] Rabinovich M, Aggarwal A. RaDaR: A scalable architecture for a global Web hosting service. In Proc. the 8th World Wide Web Conference, 1999, pp.467-483.
[13] Rabinovich M, Rabinovich I, Rajaraman R, Aggarwal A. A dynamic object replication and migration protocol for an Internet hosting service. In Proc. the 19th IEEE Int. Conf. Distributed Computing Systems, 1999, pp.101-113.
[14] Cooper I, Melve I, Tomlinson G. RFC 3040 Internet Web Replication and Caching Taxonomy, Jan. 2001.
[15] Chankhunthod A, Danzig P B, Neerdaels C, Schwartz M F, Worrell K J. A hierarchical Internet object cache. In Proc. the USENIX 1996 Annual Technical Conference, Jan. 1996, pp.153-163.
[16] Malpani R, Lorch J, Berger D. Making World Wide Web caching servers cooperate. In 4th Int. WWW Conf., Dec. 1995, pp.107-117.
[17] Chankhunthod A, Danzig P B, Neerdaels C, Schwartz M F, Worrell K J. A hierarchical Internet object cache. In Proc. USENIX'96, January 1996, pp.22-26.
[18] Tewari R, Vin H, Dahlin M, Kay J S. Beyond hierarchies: Design considerations for distributed caching on the Internet. Technical Report TR98-04, Department of Computer Science, University of Texas at Austin, Feb. 1998.
[19] Wessels D, Claffy K. Application of Internet Cache Protocol (ICP). Internet Draft: draft-wessels-icp-v2-appl-00, Internet Engineering Task Force, RFC 2187, 1997.
[20] Michel S, Nguyen K, Rosenstein A, Zhang L. Adaptive Web caching: Towards a new caching architecture. In 3rd Int. Caching Workshop, June 1998, pp.2041-2046.
[21] Povey D, Harrison J. A distributed Internet cache. In Proc. the 20th Australian Computer Science Conference, Sydney, Australia, Feb. 1997, pp.175-184.
[22] Rodriguez P, Spanner C, Biersack E W. Web caching architectures: Hierarchical and distributed caching. In 4th International Caching Workshop, 1999, pp.37-48.
[23] Fan L, Cao P, Almeida J, Broder A Z. Summary cache: A scalable wide-area Web cache sharing protocol. ACM/IEEE Trans. Networking, 2000, 8(3): 281-293.
[24] Wang Z. Cachemesh: A distributed cache system for the World Wide Web. 1997 Web Cache Workshop, 1997.
[25] Karger D, Lehman E, Leighton T, Levine M, Lewin D, Panigrahy R. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web. In Proc. the 29th Annual ACM Symposium on Theory of Computing, 1997, pp.654-663.
[26] Ross K W. Hash-routing for collections of shared Web caches. IEEE Network, Nov./Dec. 1997, 11(6): 37-44.
[27] Wu K L, Yu P S.
Latency-sensitive hashing for collaborative Web caching. Computer Networks, 2000, 33: 633-644.
[28] Rabinovich M, Chasse J, Gadde S. Not all hits are created equal: Cooperative proxy caching over a wide-area network. Computer Networks and ISDN Systems, Nov. 1998, 30(22-23): 2253-2259.
[29] Tsui K C, Liu J, Liu H L. Autonomy oriented load balancing in proxy cache servers. In First Asia-Pacific Conference on Web Intelligence: Research and Development, 2001, pp.115-124.
[30] Wang F Y, Liu M K, Zeng D J. ISAAC: Intelligent strategies and architectures for adaptive caching. Technical Report 01-0402, PARCS Lab, the University of Arizona, 2002.
[31] Casilari E, Lecuona A R, Gonzalez F J, Estrella A D et al. Characterization of Web traffic. In Proc. the 2001 IEEE Global Telecommunications Conference, 2001, 3: 1862-1866.
[32] Abdulla G. Analysis and modeling of World Wide Web traffic [Dissertation]. Virginia Polytechnic Institute and State University, May 1998.
[33] Barford P, Bestavros A, Bradley A, Crovella M. Changes in Web client access patterns: Characteristics and caching implications. World Wide Web Journal, Special Issue on Characterization and Performance Evaluation, 1999, 2(1): 15-28.
[34] Breslau L, Cao P, Fan L, Phillips G, Shenker S. Web caching and Zipf-like distributions: Evidence and implications. IEEE INFOCOM 1999, 1999, 1: 126-134.
[35] Chen X, Mohapatra P. Lifetime behavior and its impact on Web caching. IEEE Workshop on Internet Applications, 1999, pp.54-61.
[36] Swaminathan N, Raghavan S V. Intelligent prefetch in WWW using client behavior characterization. In Proc. 8th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 2000.
[37] Cunha C R. Trace analysis and its applications to performance enhancements of distributed information systems [Dissertation]. Boston University, 1997.
[38] Leland W E, Taqqu M S, Willinger W, Wilson D V. On the self-similar nature of Ethernet traffic (extended version). IEEE/ACM Transactions on Networking, 1994, 2: 1-15.
[39] Dilley J. The effect of consistency on cache response time. Technical Report HPL-1999-107, Hewlett-Packard Labs, 1999.
[40] Douglis F, Feldmann A, Krishnamurthy B. Rate of change and other metrics: A live study of the World Wide Web. In Proc. the USENIX Symposium on Internet Technologies and Systems, 1997, pp.147-158.
[41] Eden A N, Joh B W, Mudge T. Web latency reduction via client-side prefetching. In Proc. 2000 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS-2000), April 2000, pp.193-200.
[42] Jiang Z, Kleinrock L. An adaptive network prefetch scheme. IEEE Journal on Selected Areas in Communications, April 1998, 6(3): 358-368.
[43] Kroeger T M, Long D D E, Mogul J C. Exploring the bounds of Web latency reduction from caching and prefetching. In Proc. USENIX Symposium on Internet Technologies and Systems, Dec. 1997, pp.13-22.
[44] Markatos E P, Chronaki C E. A top-10 approach to prefetching on the Web. In Proc. the INET 98 Conference, 1998.
[45] Wang L. A study of measurement-based Web prefetch control. Presentation at CCECE 2000, Halifax, May 7-10, 2000.
[46] Fan L, Cao P, Lin W, Jacobson Q. Web prefetching between low-bandwidth clients and proxies: Potential and performance. In Proc. the Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS'99), May 1999, pp.178-187.
[47] Jiang Z, Kleinrock L. Prefetching links on the WWW. In Proc. the IEEE Int. Conf. Communications (ICC'97), Montreal, Canada, June 1997, pp.483-489.
[48] Li J Q, Wang F Y. Consistency and prefetching for effective Web caching: A qualitative analysis. Technical Report 04-1201, PARCS Lab, the University of Arizona, 2001.
[49] Yu S, Kobayashi H. A new prefetch cache scheme. In Proc. 2000 IEEE Global Telecommunications Conference (GLOBECOM'00), 2000, 1: 350-355.
[50] Nanopoulos A, Katsaros D, Manolopoulos Y. Effective prediction of Web-user accesses: A data mining approach. In Proc. the Workshop WEBKDD 2001, 2001.
[51] Mobasher B, Jain N, Han E, Srivastava J. Web mining: Pattern discovery from World Wide Web transactions. Technical Report TR-96050, Department of Computer Science, University of Minnesota, 1996.
[52] Pal S K, Talwar V, Mitra P. Web mining in soft computing framework: Relevance, state of the art and future directions. IEEE Trans. Neural Networks, 2002, 13(5): 1163-1177.
[53] Yang Q, Zhang Z. Model-based predictive prefetching. In Proc. 12th International Workshop on Database and Expert Systems Applications, 2001.
[54] Cheng K, Kambayashi Y. LRU-SP: A size-adjusted and popularity-aware LRU replacement algorithm for Web caching. In Proc. 24th Annual International Computer Software and Applications Conference, 2000, pp.48-53.
[55] O'Neil E J, O'Neil P E, Weikum G. The LRU-K page replacement algorithm for database disk buffering. In Proc. ACM SIGMOD International Conference on Management of Data, New York, 1993, pp.297-306.
[56] Kim K, Park D. Least popularity-per-byte replacement algorithm for a proxy cache. In Proc. 8th International Conference on Parallel and Distributed Systems, 2001, pp.780-787.
[57] Robinson J T, Devarakonda M V. Data cache management using frequency-based replacement. Performance Evaluation Review, May 1990, 18(1): 134-142.
[58] Williams S, Abrams M, Standridge C R et al. Removal policies in network caches for World Wide Web documents. In Proc. SIGCOMM'96, 1996.
[59] Michel B S, Nikoloudakis K, Reiher P, Zhang L. URL forwarding and compression in adaptive Web caching. In Proc. IEEE INFOCOM 2000, Mar. 2000, 2: 670-678.
[60] Hosseini-Khayat S. On optimal replacement of nonuniform cache objects. IEEE Trans. Computers, August 2000, 49(8): 769-778.
[61] Jin S, Bestavros A. Popularity-aware greedy dual-size Web caching algorithms. Technical Report TR-99/09, Computer Science Department, Boston University, 1999.
[62] Zeng D J, Wang F Y, Fang B. Multiple-queues: An adaptive document replacement policy in Web caching. Technical Report 03-0402, PARCS Lab, the University of Arizona, 2002.
[63] Cao P, Irani S. Cost-aware WWW proxy caching algorithms. In Proc. the 1997 USENIX Symposium on Internet Technologies and Systems, Dec. 1997, pp.193-206.
[64] Mahanti A, Williamson C, Eager D. Traffic analysis of a Web proxy caching hierarchy. IEEE Network, May/June 2000, 14(3): 16-23.
[65] Che H, Wang Z, Tung Y. Analysis and design of hierarchical Web caching systems. In Proc. IEEE INFOCOM 2001, Anchorage, Alaska, April 2001, pp.1416-1424.
[66] Weikle D, McKee S, Wulf W. Caches as filters: A new approach to cache analysis. In Proc. MASCOTS'98, Montreal, PQ, July 1998, pp.2-12.
[67] Busari M, Williamson C. Simulation evaluation of a heterogeneous Web proxy caching hierarchy. In Proc. 9th Int. Symp. Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 2001, pp.379-388.
[68] Gwertzman J, Seltzer M. World Wide Web cache consistency. In Proc. the USENIX 1996 Annual Technical Conference, Jan. 1996, pp.141-152.
[69] Howard J, Kazar M, Menees S, Nichols D, Satyanarayanan M, Sidebotham R, West M. Scale and performance in a distributed file system. ACM Trans. Computer Systems, Feb. 1988, 6(1): 51-81.
[70] Krishnamurthy B, Wills C E. Piggyback server invalidation for proxy cache coherency. In Proc. the WWW-7 Conference, 1998, pp.185-194.
[71] Yin J, Alvisi L, Dahlin M, Lin C. Volume leases for consistency in large-scale systems. IEEE Trans. Knowledge and Data Engineering, July 1999, 11(4): 563-576.
[72] Cao P, Liu C. Maintaining strong cache consistency in the World Wide Web. IEEE Trans. Computers, April 1998, 47(4): 445-457.
[73] Krishnamurthy B, Wills C E. Study of piggyback cache validation for proxy caches in the World Wide Web. In Proc. the 1997 USENIX Symposium on Internet Technologies and Systems, Dec. 1997, pp.1-12.
[74] Gray C, Cheriton D. Leases: An efficient fault-tolerant mechanism for distributed file cache consistency. In Proc. 12th ACM Symposium on Operating System Principles, 1989, pp.202-210.
[75] Duvvuri V, Shenoy P, Tewari R. Adaptive leases: A strong consistency mechanism for the World Wide Web. In Proc. 19th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2000), 2000, 2: 834-843.
[76] Krishnan P, Raz D, Shavitt Y. Transparent en-route cache location for regular networks. In DIMACS Workshop on Robust Communication Networks: Interconnection and Survivability, DIMACS Book Series, New Brunswick, NJ, USA, Nov. 1998.
[77] Valloppillil V, Ross K W. Cache Array Routing Protocol v1.0. Internet Draft: draft-vinod-carp-v1-03.txt, June 1997.
[78] Almeida J, Cao P. Measuring proxy performance with the Wisconsin proxy benchmark. Journal of Computer Networks and ISDN Systems, 1998, 30: 2179-2192.
[79] Chen Y. Experimental study of Internet traffic modeling and bandwidth allocation. In Proc. 2001 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, 2001, 2: 587-590.
[80] Lai G, Liu M K, Wang F Y, Zeng D J. Web caching: Architectures and performance evaluation survey. In Proc. 2001 IEEE International Conference on Systems, Man, and Cybernetics, 2001, 5: 3039-3044.
[81] Maltzahn C, Richardson K J. Performance issues of enterprise-level Web proxies. In Proc. the ACM SIGMETRICS Int. Conference, June 1997, pp.13-23.
[82] Lu Y, Saxena A, Abdelzaher T F. Differentiated caching services: A control-theoretical approach. In Proc. IEEE 21st International Conference on Distributed Computing Systems, 2001, pp.615-622.
[83] Cao P, Zhang J, Beach K. Active cache: Caching dynamic contents on the Web. In Proc. IFIP Int. Conf. Distributed Systems Platforms and Open Distributed Processing (Middleware'98), 1998, pp.373-388.

Ming-Kuan Liu is a Ph.D. candidate in the Electrical and Computer Engineering Department at the University of Arizona. His research interests include intelligent control, speech recognition, Web caching, voice over IP, and network traffic modeling and simulation. He received an M.S. degree in information science from the Institute of Automation (IA), CAS and an M.S. degree in industrial engineering from the University of Arizona in 2000 and 2002, respectively.

Fei-Yue Wang
received his Ph.D. degree in electrical, computer and systems engineering from Rensselaer Polytechnic Institute, Troy, New York, in 1990 and is currently a professor in the Systems and Industrial Engineering Department of the University of Arizona and a research scientist at the Institute of Automation, the Chinese Academy of Sciences. His research interests include modeling, analysis, and control mechanisms of complex systems, linguistic dynamic systems, agent-based control systems, intelligent control, real-time embedded systems, application-specific operating systems (ASOS), intelligent transportation systems, intelligent vehicles and telematics, Web caching and service caching, smart appliances and home systems, and network-based automation systems. He has published over 200 books, book chapters, and papers in those areas since 1984. In 1996 he received the Caterpillar Research Invention Award, and in 2001 the National Outstanding Young Scientist Research Award from the National Natural Science Foundation of China. He was the Editor-in-Chief of the International Journal of Intelligent Control and Systems from 1995 to 2000, and is the Editor-in-Charge of the Series in Intelligent Control and Intelligent Automation, the Editor of the IEEE Intelligent Systems ITS Department, and an Associate Editor of the IEEE Transactions on Systems, Man, and Cybernetics (SMC), ITS, and Robotics and Automation (R&A). He was the Program Chair of the 1998 IEEE Int. Symposium on Intelligent Control and the 2001 IEEE Int. Conference on Systems, Man, and Cybernetics, the General Chair of the 2003 IEEE Int. Conference on Intelligent Transportation Systems, and will be the Co-Program Chair and General Chair of the 2004 and 2005 IEEE Int. Conf. on Intelligent Vehicles. He is the Vice President of the American Zhu Kezhen Education Foundation and Vice President of the IEEE ITS Society. He is an IEEE Fellow.

Daniel Dajun Zeng received his Ph.D.
degree from the Graduate School of Industrial Administration and the Robotics Institute at Carnegie Mellon University and is currently an assistant professor in the Management Information Systems Department of the University of Arizona and the associate director of the Hoffman E-Commerce Lab at the Eller College of Business and Public Administration. He has co-edited one book and published about 50 peer-reviewed articles in Management Information Systems and Computer Science journals, edited books, and conference proceedings. He is currently directing four US National Science Foundation-funded research projects as PI or co-PI. His research areas include multi-agent systems, distributed optimization, computational support for auctions and negotiations, intelligent information integration and caching, and recommender systems.