World Wide Web, 4: 255–275 (2001)
© 2002 Kluwer Academic Publishers
Scalable Federation of Web Cache Servers

A. BELLOUM and L.O. HERTZBERGER
University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands

H. MULLER
University of Bristol, MV Building, Woodland Road, Bristol, BS8 1UB, UK
{adam,bob}@science.uva.nl
[email protected]
Abstract. Web caches are traditionally organised in a simple tree-like hierarchy. In this paper, a new architecture is proposed, in which federations of caches are distributed globally, each caching data partially. The advantages of the proposed system are that contention on global caches is reduced while the scalability of the system is improved, since extra cache resources can be added on the fly. Among the topics discussed in this paper are the scalability of the proposed system, the algorithms used to control the federation of Web caches, and the approach used to identify potential Web cache partners. To obtain a successful collaborative Web caching system, the formation of federations must be controlled by an algorithm that takes the dynamics of Internet traffic into consideration. We use the history of Web cache accesses to determine how federations should be formed. Initial performance results of a simulation of a number of nodes are promising.

Keywords:
distributed Web caching, simulation
1. Introduction

Web-caches are sites with recently accessed Web pages. The idea is that, as with any cache, a page may be required in the near future by the same person or, in the case of a proxy server, by another person in the same area. Web-caches exist for two important reasons: they reduce the average latency observed by the user (the user may get a local copy quickly rather than a remote copy), and they reduce long-distance bandwidth requirements. A disadvantage is that Web-caches introduce extra latency, which can be considerable if the cache is shared by more than one user. Another problem with Web-caches is that their hit ratio can be poor [4]. In previous studies, we have focused on improving cache performance through cache management policies such as cache replacement and coherency strategies. We have shown in these studies that the gain in performance obtained remains within a small range [2,3]. It was clear from these studies that further improvements could be obtained via cooperative Web-caching. If several Web-caches can be organized in an efficient way, it is likely that they can benefit from the data stored in each cache. In this paper we describe the implementation of a scalable Web-cache. The idea is that it serves as a proxy server and cache, and that we combine the thousands of existing Web-caches into one mega-cache. This mega-cache will have a much higher hit ratio, and, because of its distributed nature, there is less contention. The higher hit ratio results in a
VTEX(CG) PIPS No:404531 artty:ra (Kluwer BO v.2001/10/30) WJ404531.tex; 22/02/2002; 10:33; p. 1
further reduction of latency and bandwidth requirements, while the lower contention limits the additional latency caused by caching. The Web-cache is loosely based on technology previously employed in the construction of Virtual Shared Memory (VSM) machines. A VSM architecture builds a shared memory machine out of distributed memory nodes. In order to speed up accesses between nodes, caches that store recently used data are installed on the nodes. In order to build a scalable VSM, one needs to implement a hierarchy of caches [21]. We have used a similar approach in designing the federation of caches. The idea of the federation of caches is that caches are grouped together in a federation, and that federations can be grouped together in (meta-)federations. Whenever a page is required by one of the nodes of a federation, the node will first consult its local cache. If the data is not present locally, the node will subsequently consult the combined cache of all the federation partners. Via a distributed index (described in detail in Section 2.1), the cache will determine whether a copy is available in the federation. If not, the combined cache of the meta-federation is inspected, and so on. There are two ways to implement federations. One option is to align federations with the network topology (so that local area networks form federations, and a cluster of LANs forms a meta-federation). The second option, explored in this paper in Sections 2.3 and 2.4, is to form federations between nodes with similar streams of accesses. The paper concentrates on the algorithms that we use to construct and maintain a network of caches, and on preliminary performance results.
2. Scalable federations

Instead of viewing Web-caches as separate entities, we propose to tie them closer together. We want Web-caches to share resources (i.e., cached pages), and we want to take advantage of proximity in networks; such cooperative caching systems have been discussed in several publications [22–24]. For example, if all Web-caches on a Local Area Network were to pool their information, a much larger (virtual) cache would be available to the users of this LAN. Similarly, if all Web-caches on a campus network were to pool their contents, an even larger cache would become available. A virtual cache spread over a LAN would be reasonably fast; a cache over a campus network would be slower but much larger. This is precisely the scheme that one needs for a multi-level cache. Most importantly, the caches higher up in the hierarchy are not centralised: the bandwidth scales with the size of the cache. This property limits congestion. We call a group of caches on a network a federation. Each of these caches will maintain a set of pages (the pages in which the user of that node is interested) and an index of some of the other pages available on the LAN. If we assume that we have a LAN with n0 nodes, then the size of the index at each node would be approximately a fraction 1/n0 of the global index. The index is distributed in a way we discuss later, so that each node can identify, with one remote LAN operation, any local copies of the data the node is interested in.
At the next level, the campus network level, we have n1 nodes (where n1 > n0), and each node will maintain a fraction 1/n1 of the index of all n1 nodes. Indeed, this scheme extends to as many levels as one needs, spreading the index of a bigger and bigger virtual cache over more and more nodes. Observe that the size of the virtual cache at level i will be O(ni) (where ni is the number of nodes on network level i and below), so that the size of the index at each node is O(ni)/ni = O(1). Observe also that the amount of traffic at each level is O(ni); because there are ni nodes, the traffic to each of the indexes is again constant. The access latency grows with the logarithm of the number of nodes. These three properties, O(1) index size, O(1) bandwidth requirements and O(log n) latency, make the system scalable, limited by the network capabilities only (not by the performance of the cache server). This builds a hierarchy of caches, like [9,15,23]. We have placed special emphasis on scalability (discussed above) and on the dynamic formation of the hierarchy based on access patterns (Section 2.4). In the remainder of this section we explain the operation of a federation of caches. We first discuss the implementation fundamentals. Subsequently we show how to locate a page, how to join a federation and how to leave a federation.
2.1. Implementation fundamentals

At the first level cache, we calculate a 64-bit hash value for each URL that we access. This 64-bit value is the key that will be used to distribute indexes, and to locate pages based on a URL. Each cache at level i caches part of the 64-bit hash space. The first few bits of the hash value determine how to distribute it. In Figure 1 we show an example of such a distribution. In this example, we have assumed four federations, two with three nodes and two with two nodes. (Real cases will have more nodes per network!) Each node caches the pages in which that node is interested. At the first level, each node holds part of the index of the local caches. Node 0 indexes all hash values which start with the bit “0”, node 1 indexes all hash values which start with the bits “10”, and node 2 indexes all hash values which start with the bits “11”. Between the three of them, they have a complete index of all the caches in the local federation. Indeed, each of the nodes will know where to query for which hash value in their local federation. At the second level, the nodes index all pages within the federation of federations. The nodes index the pages with hash values that start with “00”, “010”, “011”, “100”, “101”, “11”. Nodes 0–5 index all pages in the federation of federations. Note that only the first level indexes need to know where to go for their next step. Node 0 knows that it has to go to either of the first three indices, node 1 knows it has to go to node 3 or 4, and node 3 knows it has to query node 5. The cost of building the index is very small indeed. The index is maintained dynamically, so the index is updated on each page access. Note that only local updates are necessary in this protocol. The overhead of calculating the hash function is minimal, since this hash is usually calculated anyway for storing and retrieving the page. Also, because each node stores only a fraction of each index (O(1)), the index can be kept in memory.
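As an illustration, the prefix-based distribution can be sketched as follows. The hash function (a truncated MD5) and the prefix table are our own illustrative choices, not the paper's implementation; the table mirrors the three-node first-level federation of Figure 1.

```python
import hashlib

def url_hash64(url: str) -> int:
    """64-bit hash of a URL. The paper does not fix the hash function;
    a truncated MD5 is used here as a stand-in."""
    return int.from_bytes(hashlib.md5(url.encode()).digest()[:8], "big")

# Hypothetical prefix table for a three-node federation, as in Figure 1:
# node 0 indexes hash values starting with bit "0", node 1 with "10",
# node 2 with "11".
PREFIX_TABLE = {"0": 0, "10": 1, "11": 2}

def node_for_hash(h: int, table=PREFIX_TABLE) -> int:
    """Return the federation member responsible for indexing hash h."""
    bits = format(h, "064b")
    for prefix, node in table.items():
        if bits.startswith(prefix):
            return node
    raise KeyError("prefix table does not cover this hash")
```

Together the prefixes cover the whole hash space, so the three nodes hold a complete index of the local federation between them.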
Figure 1. Distributed-index fundamentals.
2.2. Obtaining a page from the federation

There are two ways to use the federation of caches. First, one can design a federal proxy cache. Ordinary clients (such as Netscape) will first check their private cache, whereupon they visit the proxy cache which is part of a federation. The protocol for obtaining a page from the proxy cache is detailed in the following paragraphs. The second way to use a federal cache is to make the clients part of the federation. In that case, we have to change the client so that it implements the federation protocol and allows other clients to access pages in its private cache. The protocol for obtaining pages is the same as for the first method, except that the client is part of the federation. The protocol to obtain the data for a URL works as follows.
1. Check if the page is in the cache. If so, the page is found; return it.
2. If the page is not in the cache, consult a lookup table to find out which cache in the federation will know more about this page. This lookup table is small: it maps the first few bits of the hash value onto federation members.
3. Query that federation member for the page.
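The three steps above can be sketched as follows. The names `local_cache`, `prefix_table`, and `query_member` are hypothetical, and the truncated-MD5 hash is an assumed stand-in for the paper's 64-bit hash.

```python
import hashlib

def lookup(url, local_cache, prefix_table, query_member):
    """Three-step federation lookup (a sketch; helper names and the
    hash function are assumptions, not the paper's API)."""
    # 1. Check the local cache first.
    if url in local_cache:
        return local_cache[url]
    # 2. Hash the URL; the first few bits select the responsible member.
    h = int.from_bytes(hashlib.md5(url.encode()).digest()[:8], "big")
    bits = format(h, "064b")
    member = next(node for prefix, node in prefix_table.items()
                  if bits.startswith(prefix))
    # 3. Query that federation member for the page.
    return query_member(member, url)
```

In the proxy-cache variant, `query_member` would issue a request to another cache server; in the client variant, the client itself runs this protocol.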
This protocol neatly extends to multiple layers in the federation cache. Below, we describe the operation of the cache index at level i. When a request is received for a page:
1. Check if the page is in the index of level i in this node. If so, forward a message to the node maintaining the page to send the data to the originator of the request.
2. If the page is not in the index at level i, do a lookup on the next few bits of the hashed URL to find out which node in the meta-federation at level i + 1 is responsible for maintaining the meta-index for that page.
3. Forward the request to that node.
In the distributed indexing mechanism described in Section 2.1, while the missing request is searched for within the federation of caches, the different parts of the index are updated until the document is found or a federation miss is reached. The search of the index has two phases: a bottom-up search along the hierarchy of caches until an entry for the requested URL is found, and a top-down index search to locate the node where the searched document is stored. If the searched document is missing at all levels of the federation, the second (top-down) phase is never executed and the cache which initiated the request gets the document from the origin server. In terms of hops among the nodes, maintaining the index and searching for documents within the federation costs at most 2imax + 2 hops if the document is stored somewhere in the hierarchy, and imax hops if not, where imax is the maximum level of the hierarchy of caches.

2.3. Building the hierarchy

The federations of caches presented in this paper are fully dynamic. New caches can join any federation, and caches which are members of one or more federations can leave these federations at any time. Figure 2 shows an example of the federation's activities. The caches start as stand-alone nodes. The nodes progressively gather into federations and hierarchies of federations depending on their interests.
To start a federation of caches, we use the approach proposed in Adaptive Web Caching [27]. In this section, we focus on the definition of the protocol which controls the different activities of the federation of Web-caches, namely the join/leave federation requests. A recent study has shown that the document retrieval time is mainly dominated by the document download time, which is determined by the bandwidth and the delay at the origin server [17]. We have thus defined a cost function to assess a join federation request, using parameters such as bandwidth, the similarity of the workloads (discussed later), and the computational power of the cache requesting the join. Based on this cost function, the cache can be accepted as a member at the first level, or at any higher level of the hierarchy of the federation. When a cache is accepted as a member at a higher level of the federation hierarchy, it means that this cache failed to satisfy one or more selection criteria. It is thus loosely coupled to the federation such that it does not disturb the performance of the federation. We have adopted this policy to give potential caches a chance to join the federation progressively. For now we are using a very simple cost function based on the domain name, the round-trip time, and the capacity of the cache server to handle HTTP requests.
Figure 2. Evolution of the federation of caches.
This protocol is quite similar to the Internet Cache Protocol (ICP); however, our protocol does not consider relationships such as parent or sibling. When a request is sent to a cache, the cache either gets the document from the origin server or forwards the request to another cache. As an example, Figure 2 shows how cache C5, which requested to join federation F11, has been accepted only at the second level of the hierarchy. This means that cache C5 can forward its requests to any member of federation F11, while the members of federation F11 forward their requests to C5 only on a second-level miss, i.e., when the request could not be satisfied within federation F11. The advantage of this protocol is that a reduced number of requests is forwarded to C5, which avoids overloading it. Recall that cache C5 has been accepted at the second level of the hierarchy because C5 failed to satisfy the joining criteria at level 1, while its traffic interest matches the traffic of federation F11. The indexing method presented in Section 2.1 is used to define the document search space, also called the URL space, for each document in the federation. The number of bits used for the indexing depends on the number of members of the federation; these bits restrict the document search space assigned to each cache. When a new cache joins the federation, the document search space is split in a round-robin-like fashion: the new member gets half of the document search space of the member having the largest search space, i.e., the one that uses the fewest bits to index pages. If more than one cache satisfies this criterion, the one that registered first is chosen. Since caches are assigned different document search spaces, only one copy of each document is kept at each level of the federation. However, multiple copies can be stored at different levels of the hierarchy.
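The split on a join can be sketched as follows, representing each cache's search space as a bit-prefix of the hash space (cache names and the dictionary representation are illustrative assumptions).

```python
def split_on_join(assignments, new_cache):
    """Round-robin-like split when a new cache joins (a sketch).
    `assignments` maps cache name -> bit-prefix of the hash space it
    indexes; dict insertion order is taken as registration order."""
    # Pick the member with the largest search space, i.e. the shortest
    # prefix; min() breaks ties by insertion (registration) order.
    victim = min(assignments, key=lambda c: len(assignments[c]))
    prefix = assignments[victim]
    # Halve the victim's space: it keeps prefix + "0",
    # the newcomer receives prefix + "1".
    assignments[victim] = prefix + "0"
    assignments[new_cache] = prefix + "1"
    return assignments
```

Because prefixes never overlap, each document hash falls into exactly one cache's search space, which is why only one copy per document is kept at each level.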
When a cache leaves the federation, its search space is reassigned to the cache from which it was extracted at the join transaction. In the current model, we do not reallocate objects when a cache leaves the federation. The index is maintained by one member of the federation, the federation moderator, which should be one of the caches that initiated the creation of the federation. In our model, maintaining the index does not involve heavy tasks. The federation moderator has to keep track of how the document search space is split among the federation members (a lookup table with one entry per cache), and of which document search space has to be split when a member joins or leaves the federation. The proposed protocol can be regarded as an extension of the Cache Array Routing Protocol (CARP) [20]. However, there is still a difference in the way the split is performed and in how the members of the federation of caches identify which cache has been assigned which search space. The protocol proposed in this paper presents a way to automate some of the functions proposed in the CARP protocol, such as adding/removing a cache server to/from the pool of caches. Besides, the assignment of documents to caches in the CARP protocol focuses more on the capacity of the cache to handle HTTP requests than on grouping similar documents on geographically close caches, a major difference between the two approaches.
2.4. Workload similarity

Besides network bandwidth and the geographic location of the caches, the similarity amongst the workloads of caches can be a good reason to start building up a collaborative caching system. Adding similarity of the cache load to the parameters currently used to build collaborative caching systems will increase the efficiency of these systems. A similarity of access workloads implies strong locality of reference, which increases the hit-ratio of the federation. To take the dynamics of Internet traffic into consideration, the similarity among workloads is computed using the access history of each cache. The similarity criterion is used for both join and leave transactions. The analysis made by Wolman et al. has indeed shown that cooperative caching is highly sensitive to the homogeneity of the load of the caches composing the cooperative system [15]. The similarity criterion used in our approach is a potential metric to measure the homogeneity of the cache access patterns. When computing the similarity between two access log files, we try to identify partial matches. Two access log files are considered similar if a certain percentage of the requests they contain matches. To manipulate the users' requests, a unique identifier has to be assigned to each one. Besides identifying requests uniquely, the identifiers provide useful information on the similarity of the requests: for instance, if two requests refer to documents within the same domain or with the same keywords, then these two requests must have identifiers in the same range. Such information is needed because the similarity of access log files considered in this context does not seek an exact match among the sets of requests. It is thus important to be able to identify that a set of requests has a particular pattern; in other words, we must be able to detect when two sequences
of requests are searching for documents within the same domain or with the same keywords. To achieve this goal, the function that generates the request identifiers has to preserve the similarity of the requests. Roughly speaking, the function that generates the request IDs mainly uses the URL of the request, which contains information about the domain, the location, and the name of the requested document. The information related to the similarity of the requests is contained in their identifiers; one way to measure this similarity is to use the “distance” between two points (identifiers), i.e., two requests are similar if the points representing their identifiers are close to each other. What is left is to find a mapping function that preserves the distance between two points. This problem was tackled in the “Time Series and Sequence Databases” analysis proposed in [10], which demonstrates that the Discrete Fourier Transform (DFT) can indeed keep track of the distance between the points. In the context of Internet traffic characterization, a general methodology for characterizing the access patterns of Web requests based on time series has been developed at IBM Watson Research and applied to the Web site of the Olympic Games [15]. The sequence of users' requests is represented in a multi-dimensional space. The dimensions of this space are fixed by the mapping function, in this case the first two coefficients of the DFT. Each DFT coefficient represents one feature of the sequence of users' requests; thus the points do not represent single requests but characterize the main features of a subset of requests. These points are obtained by a process described in [10]. The original sequence is thus mapped into a space representing its main features, where points can be grouped into trails which can be viewed as rectangles. For each trail only a few points are kept; these points define the Minimum Bounding (hyper-)Rectangles (MBRs).
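The DFT-based feature extraction can be sketched as follows. The normalisation and the plain summation (rather than the exact process of [10]) are our assumptions; the point is only that nearby sequences map to nearby feature points.

```python
import cmath

def dft_features(request_ids, n_coeffs=2):
    """Map a sequence of numeric request identifiers to its first DFT
    coefficients (a sketch of the feature mapping; the normalisation
    used in [10] may differ)."""
    n = len(request_ids)
    feats = []
    for k in range(n_coeffs):
        # k-th DFT coefficient, normalised by the sequence length.
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(request_ids)) / n
        feats.append(coeff)
    return feats
```

Because the DFT is distance-preserving (by Parseval's theorem, truncating to a few coefficients can only shrink distances), two similar request sequences yield feature points close to each other, which is exactly what the trail-matching step below relies on.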
Using the trails, it is now easy to perform the sequence matching by just tracking the intersection of the trails composing the two sequences (Figure 3). The search for potential sequence matches might be time consuming; it is thus important to use fast search algorithms (we have used a spatial access method, the R-tree, to store the MBRs).
Figure 3. Sequence matching process. ×: MBR representing trail A; +: MBR representing trail B.
Since we are not targeting exact matches in the sequence-matching process, we have introduced a tolerated error into the matching process. We first enlarge the rectangles representing the trails and then perform the sequence matching.
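The enlarge-then-intersect step can be sketched as follows; the fraction-based match criterion in `sequences_match` is our assumption (the paper only requires a partial match), and this sketch omits the R-tree used to speed up the search.

```python
def enlarge(mbr, eps):
    """Grow an MBR ((xmin, ymin), (xmax, ymax)) by eps on every side,
    implementing the tolerated-error enlargement."""
    (x0, y0), (x1, y1) = mbr
    return ((x0 - eps, y0 - eps), (x1 + eps, y1 + eps))

def mbrs_intersect(a, b):
    """Axis-aligned rectangle intersection test."""
    (ax0, ay0), (ax1, ay1) = a
    (bx0, by0), (bx1, by1) = b
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1

def sequences_match(trail_a, trail_b, eps, threshold):
    """Declare two trails similar if at least `threshold` of trail A's
    MBRs intersect some enlarged MBR of trail B (assumed criterion)."""
    hits = sum(1 for a in trail_a
               if any(mbrs_intersect(enlarge(a, eps), enlarge(b, eps))
                      for b in trail_b))
    return hits / len(trail_a) >= threshold
```

A larger `eps` relaxes the matching, which, as discussed in Section 3.3, directly affects how often caches join and leave federations.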
2.5. Mapping a URL to a number

When looking at the access log files of Web-cache servers, the URL defining each request contains all the information needed to figure out the similarity between the requests, similar to [23]. In each URL, we can find the name of the server that contains the document as well as the name of and the path to the requested document. We map each of them onto a real number, using the functions shown below. These functions are designed to preserve document similarity. This means that documents with similar names or paths, or stored on the same server or on servers with similar names, are represented by positive real numbers that lie within a short distance of each other. The mapping function we have used is a very simple one: it parses the URL, seeking the name of the server, the path, and the name of the document. Besides the strings collected from the URL, each cache server can specify a special sequence of characters that should be included in one of serv_str, path_str, or name_str. Each string of characters is converted into a real number using the following expressions:

servID = \sum_{i=0}^{length_serv_str - 1} serv_str[i] / (i + 1),   (1)

pathID = \sum_{i=0}^{length_path_str - 1} path_str[i] / (i + 1),   (2)

nameID = \sum_{i=0}^{length_name_str - 1} name_str[i] / (i + 1),   (3)

where servID, pathID, and nameID are the real numbers representing the server name, the document path, and the document name included in the URL. The function that maps the documents onto the real number space combines the previously described factors using the following weighted function:

DocumentID = w1 · servID + w2 · pathID + w3 · nameID.   (4)
The weights w1, w2 and w3 allow tuning of the mapping function; in our experiments w1 = w2 and w1, w2 ≫ w3. Obtaining the optimal values of the weights is not the target in the current phase of the implementation; the values used in the experiments presented in this paper give more importance to the server name and the path of the document. Using this mapping function, it is possible to assign a unique real number to each requested document. Because we preserve document similarity (above), we can use the distance between the points to classify the requested documents. Clustering techniques rely on efficient algorithms to compute the closest points or pairs of closest points [16].
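A minimal Python rendering of equations (1)–(4) follows. Interpreting each string element as its character code, and the concrete weight values, are our assumptions; the paper only states that w1 = w2 and that w1, w2 dominate w3.

```python
def string_id(s: str) -> float:
    """Equations (1)-(3): sum of the string's elements weighted by
    1/(i+1). Treating s[i] as the character code is an assumption;
    the paper leaves the character-to-number conversion implicit."""
    return sum(ord(c) / (i + 1) for i, c in enumerate(s))

def document_id(serv_str: str, path_str: str, name_str: str,
                w1: float = 0.45, w2: float = 0.45,
                w3: float = 0.1) -> float:
    """Equation (4); the default weights are illustrative values
    satisfying w1 = w2 and w1, w2 >> w3."""
    return (w1 * string_id(serv_str) + w2 * string_id(path_str)
            + w3 * string_id(name_str))
```

With these weights, two URLs differing only in the document name map to nearby identifiers, while URLs on different servers are kept far apart, as the clustering step requires.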
3. Results

To perform our experiments, we have built a simulation environment that mimics the behavior of a cache. One part of the model implements the main functions of a cache server, mainly the document replacement strategy. The other part generates the requests using real access log files. In all the following experiments, we have fixed the cache size to 64 MB and the document replacement strategy to the Least Recently Used (LRU) method. We do not consider the problem of cache coherence; the documents are considered static. In a previous study, we have shown that if the LRU replacement strategy is used and the cache size is fixed to 64 MB, a large fraction of the hits are performed on recently cached documents [5]. We hope this will reduce the number of hits on out-of-date documents. The warm-up phase is considered to end either when the cache becomes full for the first time or when the hit ratios remain within a predefined interval, as suggested in [1].
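The cache-server part of the simulator can be sketched as a size-bounded LRU cache; this is a minimal sketch of the stated policy (LRU, 64 MB), not the simulator's actual code.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU document cache with a byte-size capacity, as used in
    the simulation setup (a sketch; 64 MB by default)."""
    def __init__(self, capacity: int = 64 * 1024 * 1024):
        self.capacity = capacity
        self.used = 0
        self.docs = OrderedDict()  # url -> size, least recently used first

    def access(self, url: str, size: int) -> bool:
        """Return True on a hit; on a miss, insert the document and
        evict least-recently-used documents until it fits."""
        if url in self.docs:
            self.docs.move_to_end(url)  # mark as most recently used
            return True
        while self.used + size > self.capacity and self.docs:
            _, old_size = self.docs.popitem(last=False)  # evict LRU
            self.used -= old_size
        if size <= self.capacity:
            self.docs[url] = size
            self.used += size
        return False
```

Feeding such a cache the request stream from an access log file and counting the hits reproduces the stand-alone configuration measured below.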
3.1. Building the experimental hierarchy
The experiments discussed in this section involve nine caches which start as nine stand-alone caches. Periodically, each cache randomly selects another cache and initiates a join operation. The cost function used to assess the join/leave operation is reduced to a unique parameter which measures the similarity of the access log files (equations (1)–(4)). If the matching process described in Section 2.4 is successful, a new federation is created composed of the two caches, and the one which received the join operation becomes the moderator of the federation. Each cache within the federation is assigned (i) the federation hash function, (ii) a mask used to select the URL space, and (iii) a specific part of the distributed index. On a cache miss, the missed URL is processed with the hash function to generate a hashed value of the URL (a 64-bit word). The latter is passed through the mask to keep only the bits that must be matched against the distributed index. The mask is a 64-bit word in which the number of bits set to one is equal to log2 n, where n is the number of caches composing the hierarchy. When a new cache wants to join an existing federation, it contacts the moderator of the federation, which evaluates the cost function to decide where the new cache should be located within the hierarchy. Since the aim of this experiment is to show the impact of workload similarity on the federation activity, the hierarchy is built level by level, i.e., a higher level of the hierarchy is created only when the maximum number of caches at the preceding level has been reached (the maximum number of caches per level has been fixed to 4). The moderator assigns to the new cache a specific URL space, and updates the distributed index of all the existing federation members, which will forward requests to the new cache. The leave transaction is quite similar to the join, except that the moderator of the federation has to merge two URL spaces. This operation consists of resetting the most significant bit set to one in the mask of the cache which takes charge of the document search space previously assigned to the cache leaving the federation.
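The mask handling can be sketched as follows. Placing the set bits at the low end of the 64-bit word is our illustrative choice (the paper does not fix their position); with that choice, resetting the most significant set bit on a leave merges two search spaces exactly as described.

```python
import math

def make_mask(n_caches: int) -> int:
    """64-bit mask with log2(n) bits set to one, selecting the hash bits
    matched against the distributed index. The set bits are placed at
    the low end of the word (an illustrative assumption)."""
    bits = max(1, math.ceil(math.log2(n_caches)))
    return (1 << bits) - 1

def merge_on_leave(mask: int) -> int:
    """Leave transaction: reset the most significant bit set to one in
    the mask of the cache taking over the departing cache's search
    space, doubling its document search space."""
    msb = 1 << (mask.bit_length() - 1)
    return mask & ~msb
```

For a four-cache hierarchy the mask has two set bits; after one member leaves, the surviving cache's mask drops to a single bit and its search space doubles.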
Table 1. Workload characteristics

Workload      Transferred data (GByte)   Number of requests
NLANR_sd      38.58                      3506330
NLANR_sv      29.30                      2443151
NLANR_pb      29.08                      1880325
NLANR_rtp     21.52                      1829650
NLANR_uc      27.65                      1739095
NLANR_pa      17.00                      1327694
NLANR_bo1     16.43                      1150827
NLANR_lj      10.02                       752967
NLANR_sj       2.50                       165226
3.2. Metrics and workloads

To test the federations of Web-caches, we have chosen metrics that record information on both the local users of each cache and the federation activities. We have defined:
• The User Hit-Ratio (UHR): the percentage of hits on requests issued by the end-users (local users) of each cache. Hits recorded on requests coming from other cache servers are not counted.
• The Federation Hit-Ratio (FHR): the percentage of hits on requests issued by other caches in the federation.
By selecting these two metrics, we want to point out the impact of the caches' activities within the federation on the performance as seen by the local users of each cache. The federation hit-ratios are the extra hits scored within the federation, improving the response time to the user. The experiments discussed in this paper are performed using access log files extracted from the nine servers composing the NLANR1 cache servers. Due to the huge number of requests contained in these access log files (Table 1), only three days of logs from each cache server are used. The NLANR access log files show a weak locality of reference, especially when short durations of these workloads are considered. They thus form a very hard test environment for Web-caches, because the number of hits is rather low compared to the total number of requests. Besides, the NLANR servers already form a static collaborative caching system, so using their access log files can show how a real system behaves when a dynamic topology of caches is considered. We have performed our initial experiments on small federations only. The advantage of federations will increase with increasing federation sizes. We are currently measuring performance figures on larger federations.

3.3. Federation activity

The activity of the federation in terms of join/leave transactions is a good measure of the stability of federations of caches. We expect the federations to extend and shrink
following the evolution of the similarity of the requests of the users of each cache. However, the frequency of join and leave operations should not be too high. A high number of leave-federation transactions is the result of a poor matching algorithm; we observed such a phenomenon when we relaxed the condition on the error of the sequence-matching algorithm. The first experiments show that the activity of the federation of caches is sensitive to three parameters: the error tolerated in the sequence-matching process, the length of the matching sequences, and the order in which the federations are checked for a join-federation. Figures 4 and 5 show the federation join/leave activity for the nine cache servers used in the experiment, plotting the federation size against time. In Figure 4, the process of selecting a potential federation for a possible join follows a round-robin-like algorithm. This creates unbalanced federations: federations with low identifiers extend first, and extensions are pretty much serialised in time. Random selection does not create this bias; Figure 5 shows that federations expand together (rather than one at a time), increasing the periods between successive join transactions for each federation. To discuss the impact of the other parameters, we have decided to continue with the random selection process, in order to have an unbiased experiment. Our next experiment shows how to stabilize the system by tuning the error margins. Figure 6 presents the federation activities (join/leave transactions) when the matching sequences are six times longer than the ones used in the experiment shown in Figure 4. At first sight the federations seem less stable: there is quite dense activity when the federations contain between three and four members. Long matching sequences are too restrictive; combined with a small tolerated error, they increase the number of leave-federation transactions.
When the miss-ratios start increasing, the long sequence matching suggests that this is the result of membership in a certain federation, and thus
Figure 4. Number of join/remove transactions per federation: round-robin selection.
Figure 5. Number of join/remove transactions per federation: random selection.
Figure 6. Number of join/remove transactions per federation: large matching sequence.
recommends leaving the federation where the sequence-matching process has failed. In some experiments, the hierarchy of federations could not grow beyond the first level.
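The join/leave decision driven by the three parameters above can be sketched as follows. This is a minimal illustration, not the matcher actually used: the per-position mismatch fraction is an assumed stand-in for the subsequence-matching algorithm [10], and the names `seq_len` and `match_error` correspond to the "length of the matching sequences" and the "tolerated error".

```python
# Hypothetical sketch of the history-driven join/leave decision.
# The mismatch-fraction distance is an illustrative stand-in for the
# subsequence-matching algorithm; seq_len and match_error mirror the
# sequence-length and tolerated-error parameters discussed in the text.

def sequence_distance(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def should_join(local_history, fed_history, seq_len, match_error):
    """Join a federation when the recent access histories are similar enough."""
    if len(local_history) < seq_len or len(fed_history) < seq_len:
        return False
    tail_a = local_history[-seq_len:]
    tail_b = fed_history[-seq_len:]
    return sequence_distance(tail_a, tail_b) <= match_error

def should_leave(miss_ratio, prev_miss_ratio,
                 local_history, fed_history, seq_len, match_error):
    """Leave when misses are rising and the histories no longer match."""
    return (miss_ratio > prev_miss_ratio and
            not should_join(local_history, fed_history, seq_len, match_error))
```

With a long `seq_len` and a small `match_error`, `should_join` rarely succeeds and `should_leave` fires more often, mirroring the instability observed with long matching sequences.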
3.4. Hit-ratios

The hit-ratios of the federations of caches are reported in Figures 7 and 8. We have plotted three hit-ratios in each figure: the User Hit-Ratio (UHR), the Federation Hit-Ratio (FHR), and
Figure 7. Hit-ratio when a small matching sequence is used.
Figure 8. Hit-ratio when the matching sequence is six times longer.
the hit-ratio recorded for the caches in a stand-alone configuration. The stand-alone hit-ratio and the UHR measure the same traffic in two different configurations. Since the workload used shows weak locality of reference, it is expected that the UHR will suffer from the load generated by documents which are requested only once. Under such harsh conditions, our aim is to keep the UHR as close as possible to the stand-alone ratio. The effect of cooperative Web caching can be seen by summing the UHR and the FHR. The difference between Figures 7 and 8 is the length of the matching sequence.
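The accounting behind the plotted ratios can be made concrete with a small sketch. Modelling caches as sets of URLs is an assumption for illustration; in the simulator, hits are of course detected by the caches themselves.

```python
# Hypothetical accounting of the hit-ratios plotted in Figures 7 and 8.
# Caches are modelled as sets of URLs. A request is a user hit if the
# local cache holds the document, and a federation hit if one of the
# partner caches in the federation holds it.

def hit_ratios(requests, local_cache, partner_caches):
    user_hits = fed_hits = 0
    for url in requests:
        if url in local_cache:
            user_hits += 1          # counted in the UHR
        elif any(url in c for c in partner_caches):
            fed_hits += 1           # counted in the FHR
    n = len(requests)
    return {"UHR": user_hits / n, "FHR": fed_hits / n}
```

Summing the two returned ratios gives the combined benefit of cooperative caching referred to in the text.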
Figure 9. Hits recorded when the join/leave transactions are based on randomly selected federations.
In all cases the average hit-ratio offered by the federations is about the same as the hit-ratio at the user level. In other words, a large number of hits are performed within the federation, where good conditions are set to reduce document retrieval time. Figure 7 shows some possible side effects: for instance, cache 6 does not perform many hits for the federation. In the current state of the protocol each cache decides when to leave the federation; this strategy allows caches to join a federation and remain members of it even if they do not contribute actively to improving the performance of the federation. Note that the cache users' hit-ratio does not suffer much from the federation activities: the stand-alone hit-ratio and the user hit-ratio are almost identical. The length of the matching sequences has an impact on the federation hit-ratio. A longer matching sequence does not lead to a better federation hit-ratio, except for one or two caches. To understand why some caches react positively to long matching sequences, one has to consider that since long matching sequences reduce the number of join-federation transactions, they also maintain a large document search space for each cache. Thus in some caches they reduce the number of federation hits. This is the case for caches 6 and 7 (Figure 8). For cache 8, where a significant increase of the FHR has been recorded, a slight decrease of the local hit-ratio (UHR) is observed; however, it remains relatively small compared to the gain obtained for the FHR. We have also compared a federation of caches based on random groups of caches against one based on subsequence matching. We found that the federation hit-ratios in our case are a factor of two higher, and we expect this gain to increase for larger federations (Figure 9).

3.5. Scalability issues

Scalability is an essential feature of the federation of Web-caches.
The federation of Web-caches generates extra traffic (for administration), and we must ensure that we do not
Figure 10. Fraction of requests forwarded over time.
overload the Web-caches with this traffic. We also have to make sure that this traffic is evenly distributed among the federation members. Figure 10 shows the evolution of the ratio of forwarded requests for each cache. After the transition phase, the fraction of requests forwarded to each cache converges towards a stable value; however, for some caches it still represents quite a large fraction of the requests. For instance, for cache 8 the forwarded requests represent around 70% of the received requests. A tradeoff has to be made between the gain obtained for the UHR and FHR on one side and the extra traffic the cache has to deal with on the other. Figure 11 shows the evolution of the ratio FHR/UHR. This ratio is closely related to the join/leave transactions: a rapid increase is recorded after successful join transactions, indicating growing federation hit-ratios. In Figure 11 the sharp transitions of the curve of cache 8 are the result of the difference between its local user load and the load of the other caches involved in the same federation. The extra traffic should be evenly distributed among the Web-caches composing the federations. This problem has been taken into consideration when assigning the document search space to each Web-cache, as discussed in Section 2.1. In Figure 12, we have represented the ratio FHR/UHR for federation 2. Because of the difference in the local user load of the caches composing federation 2, the proportion of federation hits is much higher for cache 8 than for caches 3 and 1. The distribution of the forwarded requests over the caches composing federation 2 is more or less balanced; the proportions are as follows: 39% for cache 8, 29% for cache 1, 23% for cache 3, and 7% for cache 2. Clearly cache 8 is working more for federation 2 than for its own local users; the consequence is that cache 8 left the federation near the end of the experiment.
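The per-member shares quoted above (39%, 29%, 23%, 7%) can be computed from a log of which member served each forwarded request. The log format below is an assumption, one cache identifier per forwarded request:

```python
from collections import Counter

# Hypothetical computation of each member's share of a federation's
# forwarded traffic. forward_log is an assumed format: a sequence of
# cache identifiers, one per forwarded request the federation handled.

def forwarded_shares(forward_log):
    counts = Counter(forward_log)
    total = sum(counts.values())
    return {cache: n / total for cache, n in counts.items()}
```

Shares far from 1/n for an n-member federation indicate that the extra traffic is unevenly distributed, the situation that eventually made cache 8 leave federation 2.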
Further development is needed to improve the reaction of the caches to such a bad situation. When considering a large number of Web-cache servers, the only factor used in building the hierarchy that is directly related to the number of caches is the document indexing
Figure 11. Number of requests forwarded.
Figure 12. Requests and hit-ratios over time for federation 2.
mechanism, and we have already shown in Section 2 that this indexing mechanism scales quite well. However, like most hierarchical architectures, our approach is inherently limited by two main factors: the depth of the hierarchy and the number of Web-caches per level of the hierarchy. For now, we simply fix a maximum for both factors, but in the future we plan to use the document retrieval latency and the cache utilization to decide whether to build the next level of the hierarchy.
4. Related work

During the last four years extensive research has been performed on the topic of cooperative Web caching. Most of the approaches were based on a static hierarchy of caches, where management tasks needed to be performed by the system administrator. Only a few projects have allowed dynamic management of cooperative Web caches, and even in these, the selection of the group of caches to join is still based on a very simple algorithm which most of the time does not take the dynamics of the Internet traffic into consideration. The following paragraphs give a short description of some of the research work developed around the topic of distributed cooperative Web caching. A protocol was proposed by Malpani et al. to handle the cooperative work of Web caches. For each request, a client randomly picks a caching proxy server from a list of cooperating servers and sends its request to it. The chosen cache server, referred to as the master of the request, can either serve it from its local storage or multicast the query to the other cooperating servers. If no reply is received within a certain time, it acts as an ordinary caching proxy [19]. CRISP (Caching and Replication for the Internet Service Performance) is another caching service, designed to serve the needs of ISPs (Internet Service Providers). CRISP servers cooperate to share their caches, using a central mapping service with a complete directory of the cache contents of all participating proxies. The CRISP cache has been implemented and is in active use at AT&T Labs. To probe the cooperative cache, a proxy forwards the requested URL to a mapping server. The latter maintains a complete cache directory; proxies notify the mapping service any time they add or remove an object from the cache [13]. A decentralized version of this protocol is used in Cache Digests, where each cache keeps a compact summary of the cache directory [11]. Flexible Open Caching uses an object-oriented paradigm.
It defines specific servers and objects called W3Oserver and W3Object, respectively. Transactions in this context are per-object: clients communicate with objects, i.e., invoke operations upon objects. Caches hold some of the objects that could be requested by the clients. Objects encapsulate data, and each supports a number of operations, defined within its class, through which the data must be accessed. Each server offers a remote procedure call endpoint, with a unique network address, via which the objects may be accessed [7]. The Relais protocol proposed by INRIA is based on a distributed index; each member of the federation has an up-to-date copy of this index, which is used to locate a copy of a document within the federation. To keep the index up to date, modifications are sent to all the members; these notification messages are sent per group instead of one by one, which saves bandwidth. To maintain document coherency, the Relais protocol relies on the state of the cache that stores the document: if this cache was recently in a faulty or disconnected state, then its contents cannot be used by the other caches until the faulty cache goes through a reconciliation phase [18]. Self-Organizing Cooperative WWW Caching is a symmetric distributed cache system. It consists of cache servers with equal functionality. A protocol for handling the activity management is defined, which allows a list of active cache servers to be maintained in a distributed manner. Caches join a cooperative cache system by first measuring the turnaround time to
the peers and then choosing the one which replies first. The join request is handled by the group leader, which either accepts it or creates a new group by moving half of the members to the new group [14]. Adaptive Web Caching allows groups of caches to overlap each other, which enables not only cooperative work among caches within a group but also inter-group collaboration. The node holding a page multicasts the response to the group, which provides the neighbouring caches in the same group with the requested document. Adaptive Web Caching allows the creation of new groups, the extension of old ones, and the merging of groups [27]. Our protocol differs from this previous research work in many ways. One of the major differences with the works listed above is the fact that we introduce the load and the access patterns of the caches as a main metric for composing the federations of caches. First, we hash the URL to be used as an index to distribute the index over a federation in a scalable manner. This, on purpose, maps similar URLs to different nodes in order to balance the load of the system. Second, we use the URL to calculate trails to determine efficient joining and distribution of federations. Another important difference with the other research works is the fact that we base our forwarding mechanism on a split of the document search space among the caches composing the federation. This approach reduces the size of the forwarding lookup table to one entry per cache, instead of storing a summary of the contents of the cache directory (Summary Cache [12]) or the hash chain sequences of the disseminated hot spots and geyser events (Adaptive Web Caching [23]). For now, we use a simple binary splitting mechanism which does not aim at providing search subspaces covering a specific category of documents. It allows caches to keep a relatively heterogeneous local storage.
This strategy prevents the federation from imposing a specific access pattern on each local cache, which might reduce the hit rates of proxy caches that have a heterogeneous access pattern. A geographical hashing mechanism would lead to a uniform pattern for each cache composing the federation; such an organization might be suitable for proxy cache servers with uniform access patterns. Besides that, a number of the mechanisms used in the composition of the federations of caches are based on cost functions, which allows a flexible way to adapt these mechanisms to the changes of the Internet traffic and infrastructure.
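A minimal version of the hash-based split of the document search space might look as follows. The MD5 hash, the 16-bit key width, and the contiguous-slot assignment are all assumptions for illustration; the text specifies only that URLs are hashed and that the space is divided by a simple binary splitting, yielding a one-entry-per-cache forwarding table.

```python
import hashlib

# Hypothetical sketch of forwarding by document search space. The URL
# is hashed into a 16-bit key (MD5 is an assumed choice), and the key
# space is cut into contiguous slots, one per federation member, so
# the forwarding table holds a single entry per cache.

def url_key(url):
    """Map a URL to a 16-bit key; similar URLs land on unrelated keys."""
    digest = hashlib.md5(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")   # 0 .. 65535

def owner(url, members):
    """Return the federation member responsible for this URL's subspace."""
    slot = url_key(url) * len(members) // 65536
    return members[slot]
```

Because the hash scatters similar URLs, neighbouring documents of a hot site are spread across members, which is exactly the load-balancing effect the text describes.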
5. Conclusions and future work

Nowadays distributed Web caching is becoming more mature; a number of industrial realizations of the concepts of distributed Web caching have been built during the last two years, allowing these concepts to be tested and applied in real situations. The Cache Engine 500 series from CISCO, the Alteon L4 switch, and the ArrowPoint L7 switch are just a few examples. In this paper we propose a possible extension of the concept of distributed Web caching, and we have shown the possibility of building an adaptive cooperative Web caching system. Among the advantages of the proposed system is a good policy for load balancing the document search space among the members composing the hierarchy of federations of
caches. Such a policy is important for the extension of the collaborative work because it gives any group already functioning a good reason to admit new members. Another advantage of the proposed system is the process of deciding when to join or leave a federation of caches, which uses the history of accesses to these caches. The search for similarities among the recent histories of accesses allows the cooperative Web caching system to keep up with the variation of Internet traffic, so that only the documents which are currently the most requested are cached. The experiments we have performed have shown good behaviour of the proposed system: the number of hits performed for the local users of each cache is not disturbed by the activity of the caches within the federations. We have recorded as many hits for the requests issued within the federations as from the local hits of each cache. We have also pointed out the impact of parameters such as the length of the matching sequence and the error tolerated during the matching process on the federation activities. Using these parameters, it is possible to control the number of join/leave transactions and make them more accurate. What we have shown in this paper is just a proof of concept of the adaptive cooperative Web caching system; further research and experiments have to be done to improve some parts of this system, such as the mapping function and the scaling of the error tolerated during sequence matching. We are also planning to study the impact of the replacement strategy, the size of the caches, and cache coherence on the proposed system. We also want to extend our results to much larger-scale systems, with one or two orders of magnitude more nodes. Although our algorithms are inherently scalable, the underlying network need not be. Still, we expect to gain by using federations of tens of caches, and meta-federations incorporating up to one thousand caches.
Note

1. NLANR: the National Laboratory for Applied Network Research has as its primary goal providing technical, engineering, and traffic analysis support to NSF High Performance Connections sites.
References

[1] M. F. Arlitt and C. L. Williamson, “Trace driven simulation of document caching strategies for internet Web servers,” Simulation, 1997, 23–33.
[2] A. Belloum and L. Hertzberger, “Dealing with one-timer-documents in Web caching,” in Proc. of the Conf. EUROMICRO 98 (Multimedia and Networks), Sweden, August 1998, pp. 544–550.
[3] A. Belloum and L. Hertzberger, “Replacement strategies dedicated to Web caching,” in Proc. of the IEEE Conf. ISIC/CIRA/ISAS 98, Gaithersburg, MD, September 1998, pp. 576–581.
[4] A. Belloum and L. Hertzberger, “Simulation of a two level cache server,” Technical Report CS-98-01, Computer Science Department of the University of Amsterdam, 1998.
[5] A. Belloum and L. Hertzberger, “The impact of the cache size on the document replacement strategy,” Internal Report CS-2000-02, Computer Science Department of the University of Amsterdam, 2000; submitted to Simulation Journal.
[6] A. Belloum, H. Muller, and L. Hertzberger, “Scalable federations of Web caches,” Internal Report CS-0003, Computer Science Department of the University of Amsterdam, 2000.
[7] S. J. Caughey, D. B. Ingham, and M. C. Little, “Flexible open caching for the Web,” in Proc. of the WWW Conf., April 1997.
[8] J. Challenger, P. Dantzig, and A. Iyengar, “A scalable system for consistently caching dynamic Web data,” in Proc. of the 18th Annual Joint Conf. of the IEEE Computer and Communications Societies, New York, 1999.
[9] C. Chiang, Y. Li, M. Liu, and M. Muller, “On request forwarding for dynamic Web caching hierarchies,” in Proc. of the 20th Internat. Conf. on Distributed Computing Systems (ICDCS’00), Taipei, Taiwan, April 2000.
[10] C. Faloutsos, M. Ranganathan, and Y. Manolopoulos, “Fast subsequence matching in time-series databases,” in Proc. of the ACM SIGMOD, June 1994, pp. 419–429.
[11] L. Fan, P. Cao, J. Almeida, and A. Z. Broder, “Wide-area Web cache sharing protocol,” in Proc. of SIGCOMM 98, 1998, pp. 70–78.
[12] L. Fan, P. Cao, J. Almeida, and A. Z. Broder, “Summary cache: A scalable wide-area Web cache sharing protocol,” IEEE/ACM Trans. Networking J. 8(3), 2000, 281–293.
[13] S. Gadde, M. Rabinovich, and J. Chase, “Reduce, reuse, recycle: An approach to building large Internet caches,” in Proc. of the 6th Workshop on Hot Topics in Operating Systems, 1997, pp. 93–98.
[14] S. Inohara, Y. Masuoka, J. Min, and F. Noda, “Self-organizing cooperative WWW caching,” in Proc. of the 18th Conf. on Distributed Computing Systems, 1998, pp. 74–83.
[15] A. Iyengar, M. Squillante, and L. Zhang, “Analysis and characterization of large-scale Web server access patterns and performance,” World Wide Web J., June 1999.
[16] R. T. C. Lee, “Cluster analysis and its applications,” in Advances in Information Systems Science, Plenum: New York, 1981, pp. 169–292.
[17] C. Lindemann and O. Waldhorst, “Evaluating cooperative Web caching protocols for emerging network technologies,” in Proc. of the Workshop on Caching, Coherence and Consistency (WC3 ’01), Sorrento, Italy, June 2001.
[18] M. Makpangou and E. Berenguier, “Relais: Un protocole de maintien de coherence de caches Web cooperants,” in Proc. of the NoTeRe’97 Colloquium, November 1997.
[19] R. Malpani, J. Lorch, and D. Berger, “Making World Wide Web caching servers cooperate,” in Proc. of the Fourth Internat. WWW Conf., Boston, December 1995.
[20] Microsoft Corporation, “Cache array routing protocol and Microsoft Proxy Server 2.0,” White paper, Microsoft Corporation, 1997.
[21] H. L. Muller, P. W. A. Stallard, and D. H. D. Warren, “Implementing the data diffusion machine using crossbar routers,” in Proc. of the 10th Internat. Parallel Processing Symposium, Honolulu, Hawaii, April 1996, IEEE Computer Soc. Press: Silver Spring, MD, pp. 152–158.
[22] S. Paul and Z. Fei, “Distributed caching with centralized control,” in Proc. of the 5th Internat. Web Caching and Content Delivery Workshop, Lisbon, Portugal, May 2000.
[23] B. Scott Michel, K. Nikoloudakis, P. Reiher, and L. Zhang, “URL forwarding and compression in adaptive Web caching,” in Proc. of INFOCOM 2000, 2000, pp. 670–678.
[24] J. Wang, “A survey of Web caching schemes for the Internet,” ACM Comput. Commun. Rev. 29(5), 1999, 36–46.
[25] A. Wolman, G. M. Voelker, N. Sharma, N. Cardwell, A. Karlin, and H. M. Levy, “On the scale and performance of cooperative Web proxy caching,” in Proc. of the 17th ACM Symposium on Operating Systems Principles (SOSP ’99), Kiawah Island Resort, USA, December 1999, pp. 16–31.
[26] K.-L. Wu and P. S. Yu, “Latency-sensitive hashing for collaborative Web caching,” in Proc. of the Conf. on Computer Networks and ISDN, Amsterdam, The Netherlands, May 2000.
[27] L. Zhang, S. Floyd, and V. Jacobson, “Adaptive Web caching,” in Proc. of the Boulder Cache Workshop 97, June 1997.