Caching in Bandwidth and Space Constrained Hierarchical Hyper-media Servers

Renu Tewari, Harrick M. Vin, Asit Dan†, and Dinkar Sitaram†

† IBM Research Division, T.J. Watson Research Center, Hawthorne, NY 10532
E-mail: {asit, [email protected]}, Phone: (914) 784-7953

Department of Computer Sciences, The University of Texas at Austin, Austin, TX 78712-1188
E-mail: {tewari, [email protected]}, Phone: (512) 471-9732, Fax: (512) 471-8885
Abstract

Information services of the future will employ a hierarchical server architecture, in which continuous as well as non-continuous media information stored at a remote server may be cached at a local server on-demand. Unlike memory caches, disk caches are constrained both by available bandwidth and space. Consequently, in addition to multiplexing the cache space among multiple data types, cache management algorithms will be required to efficiently utilize both disk space and bandwidth, and thereby minimize the load imposed on the network and the remote server. In this paper, we present an algorithm that achieves these objectives. To do so, the algorithm: (1) selects an entity to be cached based on the current cache bandwidth and space utilization; and (2) determines which, if any, of the entities presently in the cache should be replaced. The procedure for determining the entities to be replaced is governed by the bandwidth and space requirements of the entity to be cached relative to the available cache resources. By judiciously selecting the entities to be replaced, the algorithm maximizes the hit ratio, and thereby minimizes the load imposed on the network and the remote server. We have performed extensive simulations to evaluate our caching algorithm. We have also instantiated our caching algorithm in a prototype information delivery system. We describe the architecture of our prototype system, and present and analyze our simulation results.
1 Introduction
1.1 Motivation

Recent advances in computing and communication technologies have enabled the development of information services in a wide range of application domains. For instance, in the education domain, the effectiveness of distance learning and self-paced education can be substantially enhanced by: (1) creating a repository of hyper-media documents for courses, each containing video lectures, notes, presentation slides, simulation packages, animation, images, etc., and (2) providing methods for organizing and browsing through the information space. Similarly, in the commercial and entertainment domains, the process of disseminating advertisements and videos can be significantly improved by video on-demand services.

A common feature of all of these information services is that they involve retrieving and delivering hyper-media objects (containing audio, video, imagery, textual and numeric data) from databases (or servers) to a large number of geographically distributed clients. Given the large size and bandwidth requirements of continuous media data (e.g., MPEG compressed video streams require a data transfer rate of 4 Mbits/s, each image from the Earth Observing Satellite (EOS) is about 60 MBytes in size, etc.), delivering these objects to clients imposes a substantial load on the network. Additionally, guaranteeing continuous delivery of audio and video streams requires sufficient resources to be reserved at the server and in the network. When clients are multiple network hops away from the server, the overhead and cost of such reservation can be high. Moreover, the resources required for meeting the real-time requirements of clients may not be available when desired.
[Figure 1: a remote DB server connected through a network to a local server with a disk cache, which serves clients over a LAN.]
Figure 1: Architecture of a hierarchical hyper-media information service
Due to these limitations, most information services of the future will be based on an architecture in which information stored at a remote server is cached at a local server on-demand (e.g., the WWW caching proxies [15]), as shown in Figure 1. Such architectures make it possible to service a large fraction of user requests from the local server, thereby avoiding the cost and overhead of retrieving objects from the remote server. Since cache memory sizes are relatively small compared to the sizes of multimedia objects, achieving this objective requires hyper-media objects to be cached on disks at the local server. Although techniques for managing memory caches have been extensively studied, disk cache management algorithms have not received much attention. The development of a caching algorithm that can efficiently multiplex a disk cache among the variety of data types constituting hyper-media objects is the subject matter of this paper.

1.2 Related Work on Caching Algorithms

Traditionally, memory caches have been used to improve the performance of programs [1, 3, 7, 11, 16, 12]. Specifically, by accessing information requested by a program from a memory cache (rather than from a slower disk), caching techniques reduce the response times for requests, and thereby improve program performance. To ensure that a large fraction of requests are served from the cache, most caching techniques exploit the locality of reference inherent in programs and maintain a subset of pages (generally referred to as the working set) in the cache. To handle the scenario in which the cumulative size of the working sets of all active programs exceeds the cache capacity, caches employ page replacement algorithms. The objective of these algorithms is to keep in the cache those pages that yield the highest hit ratio (i.e., maximize the number of requests served from the cache). Some of the well-known page replacement algorithms include: LRU, which replaces the least recently used page in the cache [7]; LFU, which maintains the frequency with which each page is accessed and replaces the least frequently used page [2]; and CLOCK, which computes the number of references to each page in the cache over a time duration, and then replaces the ones with the smallest number of references [17]. In addition to these algorithms (which utilize past measurements), several other algorithms that assume some knowledge of the access pattern of a user have been developed [22]. These include the hot-set and locality-set models used for optimizing query processing in databases, as well as algorithms that exploit sequential access patterns through prefetching and immediate page discarding [2].

In addition to improving response times for program requests, memory caches have also been used to provide efficient access to continuous media (CM) data (e.g., audio, video, animation, etc.) [4, 5, 6, 18]. CM access is sequential and imposes real-time constraints [21, 23]. Thus, for CM requests, rather than minimizing response times, meeting the real-time requirements is of greater concern. To meet these requirements, network and server resources must be reserved. The need for resource reservation constrains the maximum number of users that can be simultaneously serviced by a server, and thereby limits its scalability.
Consequently, the objective of the caching algorithm for CM data is to increase the scalability of the server by: (i) maintaining objects (or parts of objects) that are accessed by multiple users in the cache, and (ii) servicing as many requests from the cache as possible. Examples of caching algorithms that are optimized for CM accesses include interval caching [5, 6] and distance caching [18]. These algorithms exploit the sequentiality of CM accesses and cache intervals formed by pairs of
consecutive accesses to the same CM object. For instance, in Figure 2(a), arrows S11 through S22 represent various requests for CM objects 1 and 2; and [S11, S12], [S12, S13], etc., denote the intervals formed as a result of the access pattern. For each such interval, the algorithm caches blocks read by the preceding request until they are requested by the following request. Thus, whereas the preceding request in an interval accesses data from the server's disk, the following request reads it from the memory cache, thereby achieving a 50% reduction in the disk bandwidth requirement. The cache requirement of an interval is equal to the number of blocks needed to store an interval-length portion of the object (which is a function of the time interval between the two requests and the data rate of the object). To maximize the number of streams served from the cache, the interval caching algorithm orders the intervals in terms of their storage space requirements, and then caches the ones with the smallest requirements.

[Figure 2: request arrival timelines. Panel (a), Interval Caching: requests S11 through S14 on Object 1 and S21, S22 on Object 2 form intervals between consecutive requests. Panel (b), Generalized Interval Caching: requests on Objects 1, 2, and a short Object 3 additionally form a predicted interval on Object 3 between S31 and S32.]
Figure 2: Caching techniques optimized for continuous media access
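To make the interval space requirement concrete, here is a small sketch (in Python, with illustrative numbers of our own choosing, consistent with the 4 Mb/s streams used later in the paper):

```python
def interval_space_mb(gap_seconds: float, rate_mbit_s: float) -> float:
    """Cache footprint of an interval: the window of data between the
    leading and trailing request (interval length x data rate)."""
    return gap_seconds * rate_mbit_s / 8.0   # Mbit/s -> MB/s

# Two requests 60 s apart on a 4 Mbit/s stream pin only 30 MB of cache,
# versus 450 MB to hold a full 15-minute object:
print(interval_space_mb(60, 4.0), interval_space_mb(15 * 60, 4.0))
```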
By virtue of requiring an interval to be formed, the interval caching algorithm caches only portions of objects that are accessed concurrently. This may result in only portions of large objects (i.e., ones with long playback durations) being cached, while all the small objects are ignored. The Generalized Interval Caching (GIC) algorithm addresses this limitation by extending the definition of an access interval to small objects [6]. To illustrate the basic concept of GIC, consider Figure 2(b), in which arrows S11 through S32 represent various requests for large CM objects 1 and 2, and a short CM object 3. Whereas all the requests for objects 1 and 2 are being serviced concurrently, the request S32 has arrived after the previous request S31 has terminated. In such a scenario, [S31, S32] is said to form a predicted interval. The cache space requirement for this interval is equal to the size of the object. Therefore, for such small objects, if the interval is selected for caching, the entire CM object is cached. Thus, the GIC policy may cache either a set of consecutive blocks of a CM object or entire objects.

Finally, for environments with highly skewed access patterns (i.e., where a large fraction of the accesses go to a disproportionately small set of hot objects [7]), instead of caching only portions of a CM object, a server may employ a frequency caching algorithm [20, 24], which keeps entire CM objects in the cache and serves subsequent requests for those objects from the cache. Such an algorithm caches a CM object purely based on its access frequency (and hence, without any consideration of its size).

Observe that all of the above caching techniques assume that the system is disk-performance limited. However, for services that involve retrieving objects from remote servers, the network becomes a bottleneck (e.g., when accessing information from a remote web site over the Internet). In such environments, the performance of applications can be improved by: (1) caching entire objects (instead of pages) on disks at a local server, and (2) using the combined access pattern of users (instead of an individual user's access pattern) for managing the contents of the cache. Such an approach not only minimizes the network traffic between the local and the remote server, but also amortizes the cost of network transmission and caching across accesses to the object from multiple users. Many of the traditional caching techniques (e.g., LRU, LFU, etc.) have already been extended so as to: (1) cache entire files or hyper-text documents on disks at a local server (e.g., proxy servers in the World-Wide Web [8, 10]), and (2) use the combined access pattern of users to manage the cache contents [13]. Additionally, several new algorithms that use different attributes to order the contents of the cache for replacement have been developed. For instance, an algorithm that orders the contents of the cache by the time of day of their accesses was proposed in [19]; another algorithm that uses the size of the object (or its logarithm) to order objects during replacement was proposed in [13]; the HyperG
caching algorithm uses the number of references to an object over a time interval as the primary key, and the time of last reference as the secondary key [9]; and finally, the LRU-MIN algorithm uses the logarithm of object sizes as the primary key and the time of last reference as the secondary key [10].

These relatively straightforward adaptations of the traditional caching algorithms to disk caches have proved adequate in today's web environment. This is because the hyper-media documents available on the web today predominantly contain textual/numeric data as well as static imagery, which do not impose any strict performance requirements on the cache. Hence, for these applications, a disk cache is equivalent to a large but slow memory cache. However, it is conjectured that, by the year 2005, more than 50% of the information available over the network will contain continuous media data [14]. Such CM data will not only occupy disk cache space but will also require some disk cache bandwidth to be reserved. Given that disk caches are limited both in space and bandwidth (unlike memory caches, which are only limited in their size), to maximize the hit ratio, cache management algorithms will be required to efficiently utilize both of these resources. Since the conventional caching algorithms developed for textual and numeric data only consider the space constraints of a cache, they are not applicable for caching CM data in such bandwidth and space constrained environments.

Even the algorithms for caching CM data in memory caches are inadequate for disk caches. For instance, although the interval caching policy is opportunistic and caches only the shortest deterministic intervals, it incurs the overhead of writing a block to the cache for each block read from the cache. Consequently, the effective cache bandwidth available for reads can drop to 50% of the total disk bandwidth, which, in turn, limits the number of user requests that can be served from the cache¹. At the other extreme, due to the large storage space requirements of CM objects, frequency caching techniques, which cache entire CM objects to amortize the cost of writing an object to the cache over a large number of reads, can only store a few objects in the cache, thereby limiting the number of requests that can be served from the cache.

In summary, future information delivery services will employ the remote server - local server architecture (see Figure 1), and cache hyper-media objects on the disks available at the local server. In such environments, in addition to multiplexing the cache space among multiple data types, cache management algorithms will be required to efficiently utilize both disk space and bandwidth. None of the existing caching algorithms achieves all of these objectives.

1.3 Research Contributions of This Paper

In this paper, we present an algorithm for caching CM and non-CM objects in bandwidth and space constrained environments (e.g., a disk cache). The main objective of the algorithm is to effectively utilize both the cache bandwidth and space, and thereby minimize the load imposed on the network and the remote server. To achieve this objective, the algorithm: (1) selects an entity (either an interval or the complete object) to be cached based on the current cache bandwidth and space utilization; and (2) determines which, if any, of the entities presently in the cache should be replaced.
The procedure for determining the entities to be replaced is governed by the bandwidth and space requirements of the entity to be cached relative to the available cache resources. By judiciously selecting the entities to be replaced, the algorithm maximizes the hit ratio, and thereby minimizes the load imposed on the network and the remote server. We have performed extensive simulations to evaluate our caching algorithm. We have also instantiated our caching algorithm in a prototype information delivery system. We describe the architecture of our prototype system, as well as present and analyze our simulation results.

The rest of this paper is organized as follows. In Section 2, we describe our caching algorithm. Results of our simulations are presented in Section 3. The software architecture of our prototype system is described in Section 4, and finally, Section 5 summarizes our results.

¹ Under GIC, some fraction of the cache bandwidth is used for writing blocks to be read by a following stream. This fraction in general is less than half, since successive intervals may be cached and only the first stream has to write blocks into the cache, amortizing the overhead over a small number of readers.
2 Integrated Bandwidth and Space Constrained (IBSC) Caching
2.1 Problem Formulation

Consider the information delivery architecture shown in Figure 1. Let N objects (denoted by O_1, O_2, ..., O_N), each containing either continuous media data (i.e., audio and video) or non-continuous media data (e.g., textual and numeric data, imagery, etc.), be stored at the remote database server. Let s_i and b_i, respectively, denote the size and the bandwidth requirement of object O_i. Let the local server cache both CM and non-CM objects on a disk cache of size S and bandwidth B. Whereas a non-CM object is always cached in its entirety, a CM object may be cached either in its entirety or in parts, in the form of one or more intervals. Let E = {E_1, E_2, ..., E_n} denote the set of entities (i.e., entire objects, intervals, or groups of intervals) being maintained in the cache, and let ŝ_i and b̂_i, respectively, denote the cache space and bandwidth occupied by entity E_i. Thus, if E_i is an entire object O_j, then ŝ_i = s_j. On the other hand, if E_i is an interval of length t_k (i.e., the time duration between the last reference and the current reference) of object O_k, then

    ŝ_i = min{ s_k, t_k · b_k }
Similarly, the bandwidth occupied by a cached entity depends on the data type. Specifically, since non-CM data objects do not impose any strict real-time rate requirements, without loss of generality we will assume that for all non-CM objects, b̂_i = 0. For a CM entity, the total cache bandwidth occupied is a function of the number of concurrent readers and writers for the entity. Observe that if the cached entity is a single interval, then it has exactly one reader and one writer (however, if the preceding request was being served from the cache as part of another interval, then the number of writers could be 0). On the other hand, if the cached entity is the entire CM object, then the number of readers is equal to the number of concurrent accesses to the object, and the number of writers can be at most one. Thus, if entity E_i caches object O_k, and if r_i and w_i (w_i = 0 or 1) denote the number of concurrent readers and writers for E_i, then

    b̂_i = (r_i + w_i) · b_k
Note that for CM entities, the estimation of concurrent readers differs between intervals and entire objects. For an interval, the number of readers is known to be one. For an entire CM object, on the other hand, the number of concurrent readers is estimated based on the long-term probability of access. This probability can be computed as a weighted average of the observed frequency of access and some external knowledge, if available. Thus, if a CM object O_k is accessed with probability p_k, and if λ is the request arrival rate in the system, then the number of concurrent readers for O_k is given by

    r_k = λ · p_k · s_k / b_k

(by Little's law, since s_k/b_k is the playback duration of the object).
Finally, the performance gain obtained by caching is also data-type dependent. Whereas caching non-CM objects yields a lower response time and reduces the load on the remote server, caching CM entities saves network and remote server bandwidth (and hence increases the scalability of the system). The caching gain of an entity E_i belonging to a continuous media object O_k is the remote bandwidth saved, and is given by

    g_i = r_i · b_k
Thus, having defined the resource usage of the cache entities, the state of a disk cache can be defined as a pair (U_s, U_b), where U_s and U_b denote the utilization of the cache space and bandwidth, respectively. Specifically,

    U_s = (1/S) · Σ_{E_i ∈ E} ŝ_i        U_b = (1/B) · Σ_{E_i ∈ E} b̂_i

Clearly, 0 ≤ U_s, U_b ≤ 1. In such an environment, a disk cache management algorithm should effectively utilize both the cache bandwidth and space (i.e., cache the entities that minimize the load imposed on the network and the remote server, by maximizing the caching gain for a fixed amount of cache resource). In what follows, we describe the integrated bandwidth and space constrained (IBSC) caching algorithm that achieves these objectives. In the description of the algorithm, we will assume that: (i) the access frequency² of the objects in the remote server can be determined, and (ii) each object is always accessed in its entirety.
² The frequency of accessing an object can either be estimated using past statistics, or can be pre-specified.
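The following sketch is a minimal Python rendering of the definitions above; the class and helper names are our own, not the paper's.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(eq=False)   # identity-based equality: each cache resident is distinct
class Entity:
    size: float                 # s_k: size of the underlying object (MB)
    rate: float                 # b_k: data rate (MB/s); 0 for non-CM data
    interval_len: Optional[float] = None  # t_k (seconds) if the entity is an interval
    readers: float = 0.0        # r_i: concurrent (or predicted) readers
    writers: int = 0            # w_i: 0 or 1

    def space(self) -> float:   # s_hat_i
        if self.interval_len is None:          # entire object
            return self.size
        return min(self.size, self.interval_len * self.rate)

    def bandwidth(self) -> float:              # b_hat_i = (r_i + w_i) * b_k
        return (self.readers + self.writers) * self.rate

    def gain(self) -> float:                   # g_i = r_i * b_k (remote BW saved)
        return self.readers * self.rate

def predicted_readers(lam: float, p_k: float, size: float, rate: float) -> float:
    """Little's law: r_k = lambda * p_k * (s_k / b_k); assumes a CM object (rate > 0)."""
    return lam * p_k * size / rate

def cache_state(entities, S: float, B: float):
    """Return (U_s, U_b): the fraction of cache space and bandwidth in use."""
    return (sum(e.space() for e in entities) / S,
            sum(e.bandwidth() for e in entities) / B)
```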
[Figure 3: the cache state as a point in the unit square, with space utilization U_s on the x-axis and bandwidth utilization U_b on the y-axis; the diagonal U_b = U_s separates the regions U_b > U_s and U_s > U_b.]
Figure 3: Cache State
2.2 IBSC Caching Algorithm

When a new request for an object is received, if the requested information is already being cached, then it is served from the cache, if possible. (Note that even when the requested information is found in the cache, a limitation on the cache bandwidth or server admission capacity may prevent it from being served from the cache.) If the requested information is not in the cache, the cache management algorithm must determine: (1) the entity, if any, that should be cached, and (2) which, if any, of the entities present in the cache should be replaced to accommodate the new one.

2.2.1 Selection of the Cache Entity

If a non-CM object is requested, then the entity to be cached is the entire object. On the other hand, if a CM object is accessed, and if there exists a preceding request to this object (i.e., if an interval can be formed), then the cache management algorithm must determine whether the entire object or the interval should be cached. This selection is governed by the state of the cache.
If U_s < U_b, then the algorithm selects the entire object for caching. This is based on the fact that if an entire object is cached and remains in the cache for a while, the write bandwidth overhead can be amortized over multiple future readers. If an interval had been selected, the continuous discarding of data after reading would incur a larger write overhead.

If U_b < U_s, then the algorithm selects the interval for caching. This is because the space usage of an interval is much smaller than that of the entire object.
The above selection is based on the desire that the entity added to the cache should minimally use the resource that is currently more utilized. Observe that, if the state of the cache is represented as a point within a unit square, where the x-axis and the y-axis denote the space and the bandwidth utilization, respectively (see Figure 3), then the above policy attempts to move the state of the cache towards the diagonal. Since the points close to the diagonal capture the scenario where U_s ≈ U_b, this policy attempts to achieve equitable utilization of cache space and bandwidth.
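In code, the selection step is a single comparison (a sketch; the paper does not specify how a tie U_s = U_b is broken, so this sketch defaults to an interval):

```python
def select_entity(u_s: float, u_b: float) -> str:
    """Pick the entity type for a CM object that has a preceding request."""
    # Consume the slack resource: whole objects are space-hungry, so cache one
    # when space is the less utilized resource; intervals waste write bandwidth,
    # so cache one when bandwidth is the less utilized resource.
    return "entire object" if u_s < u_b else "interval"
```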
2.2.2 Cached Entity Replacement Algorithm

Given the current state of the cache (U_s, U_b) as well as the storage space and bandwidth requirements (namely, ŝ and b̂, respectively) of the entity E selected for caching, the entity replacement algorithm characterizes the cache as: (i) unconstrained, (ii) space constrained, (iii) bandwidth constrained, or (iv) bandwidth and space constrained, and then determines the entities to be replaced so as to accommodate E.

1. Unconstrained: The cache is said to be unconstrained if the cache space and bandwidth available exceed the requirements of the entity to be cached. In this scenario, the entity is placed in the cache, and the state of the cache is appropriately updated.
2. Space Constrained: The cache is said to be space constrained if the available bandwidth is sufficient to accommodate entity E (i.e., b̂ ≤ (1 − U_b) · B) but the available space is not (i.e., ŝ > (1 − U_s) · S). To generate the relative ordering among the entities in the cache in this scenario, each CM entity E_j is given a goodness value G_j, which is the ratio of the caching gain g_j of the entity to the space ŝ_j occupied by it, that is,

    G_j = g_j / ŝ_j
The goodness value of an entity represents the gain per unit space used, and hence is a gain density. Note that this goodness value for CM entities maps to the inverse of the time-to-reaccess (TTR) of the object that it represents. In traditional caching algorithms, for a cache limited by space, the replacement policy replaces the data block that has the largest time-to-reaccess. This guarantees that whatever is removed from the cache will only be accessed later than any block that remains in the cache. Different cache replacement algorithms (LRU, FIFO, LFU, MRU, CLOCK) provide heuristics that help estimate the time-to-reaccess, based on the characteristics of the application. These algorithms do not quantify the value of the TTR, but only define the relative ordering among different objects. Since the definition of goodness for CM entities corresponds to an inverse TTR, it can also be used to compare them with non-CM entities. For a non-CM entity E_i representing object O_k, the goodness value is simply the inverse of the TTR. The TTR value can be computed if the access probability p_k of the object can be determined. The TTR can also be assumed to be directly proportional to the time duration between the last reference to the object, T_k(prev), and the current time of reference, T_k(cur). The goodness value of a non-CM entity is given by

    G_i = 1/TTR = λ · p_k    or    1 / (T_k(cur) − T_k(prev))
All the entities in the cache are ordered in ascending order of their goodness values. The entities that have a lower goodness value than the one being inserted are considered candidates for removal. The selection of entities for removal proceeds from the one with the worst goodness value until the space released is sufficient to accommodate the new entity. To resolve a tie among entities with equal goodness values, a best-fit selection based on the cache space used by each entity is employed. The best-fit choice ensures that, for a given amount of space required, we minimize the loss in bandwidth savings. Thus, if two entities E_i and E_j have equal goodness values G_i = G_j, then g_i/ŝ_i = g_j/ŝ_j. If both ŝ_i and ŝ_j are greater than the space required, then by deleting the entity with size min(ŝ_i, ŝ_j), the other entity, which had the larger gain max(g_i, g_j), remains. The best-fit selection can also be applied over groups of entities, if the individual ŝ_i values are smaller than the space required. The combined entity will have the same goodness value as the individual ones, i.e., (g_i + g_j)/(ŝ_i + ŝ_j) = G_i = G_j, and hence can be treated as a single entity. Note also that when multiple entities ordered by G_i values are removed from the cache, the goodness value of the combined entity that they form remains less than that of the entities remaining in the cache. That is, for entities E_i, E_j, E_k, if g_i/ŝ_i < g_j/ŝ_j < g_k/ŝ_k, then if entities E_i and E_j are removed from the cache, their combined goodness (g_i + g_j)/(ŝ_i + ŝ_j) remains less than g_k/ŝ_k.

3. Bandwidth Constrained: The cache is said to be bandwidth constrained if the available space is sufficient to accommodate entity E (i.e., ŝ ≤ (1 − U_s) · S) but the available bandwidth is not (i.e., b̂ > (1 − U_b) · B). When the cache is bandwidth constrained, the previously described ordering based on space usage is not of primary importance. Instead, the ordering among the entities in the cache should reflect the bandwidth they occupy. The goodness value of a continuous media entity E_j in the cache is given by the gain per unit cache bandwidth used, that is,
    G_j = g_j / b̂_j

Thus the ordering is based on the ratio of the remote bandwidth saved to the cache bandwidth used. A ratio of 1, which is the best value, indicates that an entity uses all of its cache bandwidth for reading from the cache and no bandwidth is wasted in writing to the cache. The goodness value thus measures to what degree the write overhead of an entity is amortized across its readers.
[Figure 4: pseudocode for the replacement policies. (a) The space constrained and bandwidth constrained policies scan all entities k in the cache with G_k ≤ G_i in ascending order of goodness, marking them for removal until enough resource is freed for the new entity E_i. (b) The bandwidth and space constrained policy, described below.]
4. Bandwidth and Space Constrained: The cache is said to be both bandwidth and space constrained if neither resource is sufficient to accommodate entity E (i.e., ŝ > (1 − U_s) · S and b̂ > (1 − U_b) · B). In this case the cached entities are ordered into two lists. The space-list is ordered by the goodness value of the space constrained case (i.e., G_i = g_i/ŝ_i), while the bandwidth-list is ordered by the goodness value of the bandwidth constrained case (i.e., G_i = g_i/b̂_i). The algorithm removes entities from either of the two lists until enough resources are released for the new entity. To decide which list is used, at each step the percentage of extra space and of extra bandwidth required by the new entity is determined. If the fractional space available is smaller, the entity at the top of the space-list is selected; otherwise the one at the top of the bandwidth-list is selected. After that entity is marked for possible removal, the resource availability is re-evaluated, and the list selection process is repeated. In this procedure, the entities selected for removal are those that have the worst goodness value in either the bandwidth constrained case or the space constrained case. The order in which the
lists are selected depends on which resource is more utilized as the replacement procedure proceeds. The cache replacement procedure in the dual constrained case is described in Figure 4(b)³.
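The sketch below renders the dual-constrained replacement loop in Python. The names and structure are ours; the paper's Figure 4(b) is the authoritative description, and the full policy additionally restricts candidates to entities whose goodness is lower than that of the incoming entity.

```python
def bw_space_constrained_policy(cached, new_entity, S, B):
    """Dual-constrained replacement (Section 2.2.2, case 4).

    `cached` is a list of entities exposing space(), bandwidth(), and gain()
    (e.g., the Entity sketch above). Returns the list of entities to evict,
    or None if enough resources cannot be freed.
    """
    eps = 1e-9   # guard: non-CM entities occupy no cache bandwidth
    space_list = sorted(cached, key=lambda e: e.gain() / e.space())           # worst first
    bw_list = sorted(cached, key=lambda e: e.gain() / (e.bandwidth() + eps))  # worst first
    free_s = S - sum(e.space() for e in cached)
    free_b = B - sum(e.bandwidth() for e in cached)
    victims = []
    while free_s < new_entity.space() or free_b < new_entity.bandwidth():
        # Evict from the list of the resource that is proportionally scarcer.
        need_s = (new_entity.space() - free_s) / S
        need_b = (new_entity.bandwidth() - free_b) / B
        source = space_list if need_s >= need_b else bw_list
        candidates = [e for e in source if all(e is not v for v in victims)]
        if not candidates:
            return None          # cannot accommodate the new entity
        victim = candidates[0]   # worst goodness under the chosen constraint
        victims.append(victim)
        free_s += victim.space()
        free_b += victim.bandwidth()
    return victims
```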
open (new object O_k):
    update object access statistics
    if (object O_k in cache)
        if (bw_free ≥ b_k)
            serve from cache
        else
            call bw_constrained_policy(); serve from cache
    else
        if (U_s < U_b)
            new entity E_i = O_k                    // the entire object k
        else
            new entity E_i = generalized interval   // an interval of object k
        if (unconstrained)
            allocate cache resources for object; cache object; return
        else if (bw constrained)
            call bw_constrained_policy(E_i)
        else if (space constrained)
            call space_constrained_policy(E_i)
        else if (space and bw constrained)
            call bw_space_constrained_policy(E_i)
        if (resources available)
            cache object; return

Figure 5: Details of Open
Figure 5 shows the actions taken when an open request for a new object is received. The procedure first serves the object from the cache, if the object is found in the cache and sufficient bandwidth is available. If the referenced object is not found in the cache, it determines which entity (the entire object or an interval) to cache. If sufficient free space and bandwidth are available in the cache for the entity, it is cached. When the entity selected by open cannot be cached due to a shortage of cache resources, the cache replacement algorithm corresponding to the constraint of the cache is invoked. If sufficient entities with lower goodness values can be removed from the cache to accommodate the new entity, they are deleted and the currently selected entity is cached.

2.3 Further Optimization

In IBSC caching, when deciding what entity to cache, only two extreme options were considered for a CM object: (i) the entire object, or (ii) a small part of the object, i.e., a single reader-writer interval. However, when there are multiple users accessing the same object concurrently, a set of adjacent intervals, called a run, can be grouped together to form the entity to be cached. Extending intervals to runs introduces a range of possible entities that can be cached: from a single-reader interval, to multiple-reader runs, to the entire object. Note that in interval caching, grouping of readers was done only if the previous request was already being read from the cache. Thus, to generate a new entity, interval caching did not look beyond the previous request. Run caching, on the other hand, considers all previous requests when forming an entity to cache. Figure 6 shows the possible runs r12 to r14 that can be formed from a set of users accessing the same data object. Run caching is useful when both the space and the bandwidth of the cache are a constraint, since it amortizes the write overhead over multiple readers without requiring the space usage of the entire object.

³ To avoid the computational overhead of handling two lists, a single goodness value that can be used is the ratio of the calibrated gain to the space used by the entity. The calibrated gain is determined by reducing the original gain by the fractional bandwidth wasted on writes, i.e., G_i = g_i · (1 − w_i · b_i / b̂_i) / ŝ_i.
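A one-function sketch of the single-list variant from footnote 3, using the formula as we reconstruct it from the garbled source:

```python
def calibrated_goodness(g: float, w: int, b_k: float, b_hat: float, s_hat: float) -> float:
    """Reconstructed footnote-3 goodness: the gain, discounted by the fraction
    of the entity's cache bandwidth spent on writes, per unit of cache space.
    Assumes a CM entity (b_hat > 0)."""
    return g * (1.0 - w * b_k / b_hat) / s_hat
```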
[Figure 6: requests S11 through S15 arriving over time on Object 1 form nested runs: r11 (1 reader, 1 writer), r12 (2 readers, 1 writer), r13 (3 readers, 1 writer), and r14 (4 readers, 1 writer).]
Figure 6: Caching of Interval Runs
The optimized IBSC algorithm considers both runs and entire objects for caching.

2.4 Discussion

The problem of determining a caching algorithm that minimizes the load on the remote server and network for a cache of fixed size and bandwidth can be mapped to a two-constraint knapsack problem. The IBSC algorithm reduces this problem to two single-constraint knapsack problems. If the cache had only a single constraint (limited space), the caching problem would reduce to the 0/1 knapsack problem⁴. The 0/1 knapsack problem is known to be NP-complete. Note, however, that if the gain values for all objects are identical (i.e., the g_i values are equal), the 0/1 knapsack problem has an optimal solution which consists of ordering the objects in ascending order of size. For the fractional knapsack problem (where objects need not be taken whole but can be taken fractionally), the optimal solution is the greedy ordering by gain density, g_i/ŝ_i. For the 0/1 knapsack problem with varying values of g_i, the greedy ordering by gain density is a simple approximation. The IBSC scheme uses the gain-density approximation in ordering the entities under the different cache constraints.

⁴ The 0/1 knapsack problem is to fill a knapsack of fixed size S with objects, each with a gain g_i and a size s_i, such that the total gain is maximized.
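To illustrate why greedy gain-density ordering is only an approximation to the 0/1 knapsack optimum, consider a toy instance (the numbers are our own):

```python
def greedy_by_density(entities, capacity):
    """entities: list of (gain, size) pairs. Returns (total_gain, chosen)."""
    chosen, total_gain, used = [], 0.0, 0.0
    for gain, size in sorted(entities, key=lambda e: e[0] / e[1], reverse=True):
        if used + size <= capacity:       # 0/1: take the whole entity or skip it
            chosen.append((gain, size))
            used += size
            total_gain += gain
    return total_gain, chosen

items = [(3, 1), (5, 4)]                  # densities 3.0 and 1.25
print(greedy_by_density(items, 4))        # (3.0, [(3, 1)]); the 0/1 optimum is (5, 4)
```

For the fractional problem, by contrast, the same density ordering is exactly optimal, which is why it is a natural heuristic here.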
3 Experimental Evaluation
To evaluate the effectiveness of the IBSC caching algorithm, we have carried out extensive simulations. For our simulations, we assumed an environment consisting of a remote server maintaining a library of hyper-media objects consisting of 1000 text and image objects, 200 short CM objects, and 100 large CM objects. The workload was characterized by defining: (i) the object size distributions, (ii) the request arrival pattern, and (iii) the distribution of requests among objects. For object sizes, we considered the scenario in which the short CM objects are an order of magnitude smaller than the large objects, but an order of magnitude larger than the text and image objects. Each class of objects has a range of uniformly distributed sizes. For each type of object, the request arrivals were assumed to be exponential, with a mean rate λ. Observe that, for CM objects, the maximum mean arrival rate is determined by the system capacity. For instance, for a server that can support at most 50 concurrent CM requests, each with a duration of 15 minutes, the mean arrival rate is 0.05⁵. Finally, the access skew among the objects was assumed to follow a Zipf distribution with parameter θ. It has been shown that the Zipf distribution accurately models access skew in environments where a small fraction of the objects account for a large fraction of the references (e.g., large digital multimedia libraries, web servers, etc.) [13]. A Zipf distribution yields high access skew at low values of θ, and tends to become more uniform at larger values of θ.

⁵ This can be derived from Little's Law, N = λ · T.
Figure 7 shows the cumulative access probability of objects as a function of the fraction of objects accessed, for different values of θ. It demonstrates that, for a cumulative access probability of 75%, the fraction of objects accessed ranges from 26% to 60% as the value of θ increases from 0 to 0.4.
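The paper does not spell out the Zipf formula; a common parameterization in this literature, consistent with the behavior described (θ = 0 highly skewed, larger θ more uniform), is p_i ∝ (1/i)^(1−θ). A sketch under that assumption:

```python
def zipf_probs(n: int, theta: float):
    """Access probability of the i-th most popular of n objects, p_i ∝ i^(theta-1)."""
    weights = [i ** (theta - 1.0) for i in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def fraction_for_coverage(probs, coverage=0.75):
    """Fraction of popularity-ranked objects needed to absorb `coverage` of accesses."""
    cum = 0.0
    for k, p in enumerate(probs, start=1):
        cum += p
        if cum >= coverage:
            return k / len(probs)
    return 1.0

# Figure 7 plots exactly this cumulative curve for theta in {0, 0.05, ..., 0.4}.
print(fraction_for_coverage(zipf_probs(1000, 0.0)))
```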
[Figure 7: cumulative access probability versus fraction of total objects under the Zipf distribution, for θ = 0.0, 0.05, 0.1, 0.2, 0.3, and 0.4.]
Figure 7: Zipf Distribution
Observe that, given the size and bandwidth requirements of objects, the mean request arrival rate λ, the Zipf skew parameter θ, and the desired hit ratio can be used to derive bounds on the cache size and bandwidth. For example, for θ = 0.05, a hit ratio of 0.75 can be achieved by maintaining 30% of the objects in the cache. Thus, for a database containing 100 objects, each with a mean size of 500 MB, the approximate cache size required to achieve a hit ratio of 0.75 is 15 GB. Similarly, if we assume that each request has a bandwidth requirement of 0.5 MB/sec and a playback duration of 15 minutes, then an arrival rate of λ = 0.05 implies that there can be 50 concurrent requests. In such a scenario, to achieve a hit ratio of 0.75, the cache bandwidth must be at least 0.75 × 50 × 0.5 = 18.75 MB/sec.

In what follows, we compare the IBSC caching algorithm with several known algorithms. We will first present results for caching CM objects, and then present results for a mixed (i.e., CM and non-CM) workload. Our metric for comparison is the hit ratio, which we define as the ratio of the number of bytes read from the cache to the total number of bytes read in a given time duration. As may be evident, the hit ratio is an indicator of the reduction in the load on the remote server and the network.

3.1 Effect of Varying the Cache Resource

In the first set of experiments, we explored the effect of varying both the space and the bandwidth of the cache on the hit ratio yielded by several caching algorithms. The size of the cache was varied from 1 to 1024 times the size of the smallest object in the system, and the bandwidth of the cache was varied from 1 to 128 times the mean data rate requirement of the objects⁶. The data rate requirement of all the CM objects was assumed to be the same (namely, 4 Mb/sec). The sizes of the short and long objects were chosen to be 50 MB and 500 MB, respectively (corresponding to MPEG-2 compressed video streams of duration 90 seconds and 15 minutes, respectively). The access skew among objects was assumed to be Zipf with θ = 0.05, and the ratio of accesses between small and large CM objects was set to 1:4.

Figure 8 shows the hit ratios obtained by: (i) frequency caching, (ii) generalized interval caching (GIC), (iii) LRU+LFU, and (iv) the IBSC caching algorithm. Recall that, whereas the frequency caching policy maintains entire CM objects in the cache, the GIC policy is opportunistic and caches only the shortest deterministic and predicted intervals. The LRU+LFU policy was also assumed to cache entire CM objects. The frequency caching scheme was assumed to use the predicted frequency of access of an object along with the observed history of accesses in the ratio 3:2. For the combined LRU+LFU policy, on the other hand, the weighting factor was assumed to be 40% LRU and 60% LFU, since this gave the best observed performance. The graphs show the hit ratio values for the different isolines with fixed cache bandwidth and varying cache size, and vice versa.

⁶ Note that the space and bandwidth values assumed for the cache are not representative of a single disk but of a `logical disk cache' that could be mapped to a disk array.
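The back-of-the-envelope sizing at the start of this section can be checked mechanically (a sketch using the numbers from the text):

```python
# Cache space: 30% of 100 objects of mean size 500 MB yields the 0.75 hit ratio.
n_objects, mean_size_mb, cached_fraction = 100, 500.0, 0.30
print(cached_fraction * n_objects * mean_size_mb)   # 15000.0 MB, i.e., 15 GB

# Cache bandwidth: serve 75% of 50 concurrent 0.5 MB/sec streams from the cache.
hit_ratio, concurrent, stream_bw = 0.75, 50, 0.5
print(hit_ratio * concurrent * stream_bw)           # 18.75 MB/sec
```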
[Figure 8: hit ratio surfaces as a function of cache space (MB) and cache bandwidth (MB/sec) for (a) Frequency Caching, (b) Generalized Interval Caching, (c) LRU+LFU, and (d) Integrated Caching.]
Figure 8: Comparison of hit ratios yielded by different caching schemes
As illustrated in Figure 8, the GIC policy yields a higher hit ratio at smaller cache sizes, while the frequency caching policy performs better when the space is large. The reverse is true when varying the cache bandwidth. Under all conditions, the IBSC caching algorithm yields a hit ratio that is at least as large as the maximum of the hit ratios yielded by the GIC and frequency caching policies (i.e., IBSC is an envelope). Moreover, the hit ratio variation for IBSC has a steeper slope (i.e., IBSC reaches the maximum possible hit ratio at smaller values of cache space and bandwidth than the other schemes).

To clarify the issue further, Figures 9(a) and 9(b) show the variation in the hit ratios of the different caching algorithms with varying cache size (with the disk bandwidth fixed at 4 MB/sec) and varying cache bandwidth (with the space fixed at 1.6 GB), respectively. They demonstrate that, once the cache bandwidth (space) saturates, the hit ratio reaches a plateau, and is unaffected by further increases in cache size (bandwidth). Figure 9(a) demonstrates that when the cache size is small (< 2 GB, i.e., a ratio of cache size to large object size of 1:4), interval caching yields a higher hit ratio than frequency caching. This is because intervals use much less space than entire objects. However, as the cache size increases, the cache bandwidth becomes the constraint, and hence frequency caching yields a higher hit ratio. This is because, while interval caching provides a better gain per unit space used, it wastes half of the cache bandwidth on writes, making it perform poorly in a bandwidth constrained environment. The IBSC algorithm forms an envelope over both the interval and frequency caching schemes, and reaches its plateau at a 20% higher hit ratio. In the extreme cases, the integrated caching scheme has a hit ratio that is double what is achievable by the other policies. Figure 9(b), on the other hand, shows that when the cache bandwidth is small (< 6 MB/sec, a ratio of 1:12), frequency caching has a higher hit ratio than interval caching, as it has lower write overheads. For the same cache size, when the bandwidth increases, the cache size becomes the constraint, thereby favoring interval caching. Even under this scenario, the IBSC algorithm remains an envelope of both policies. Note that, since we have assumed a stable access pattern, LRU+LFU, which uses only past history and last reference times, performs worse than frequency caching, which uses predictions along with past history. In a scenario where the access patterns fluctuate rapidly or predicted access probabilities are not available, LRU+LFU would perform similarly to frequency caching.
[Figure 9: hit ratios of FREQ, GIC, Integrated, and LRU+LFU (θ = 0.05, small:large file ratio 1:4, λ = 0.04; large file size 500 MB, small 50 MB, play rate 4 Mb/sec) for (a) varying cache size with total disk BW 4 MB/sec, and (b) varying cache bandwidth with total disk space 1.6 GB.]
Figure 9: Comparison of caching schemes - a snapshot

[Figure 10: hit ratios of FREQ and GIC for (a) varying cache size at total disk BW 4 MB/sec and 8 MB/sec, and (b) varying cache bandwidth at total disk space 0.8 GB and 1.6 GB.]
Figure 10: Variation in transition points between GIC and Frequency Caching
Figure 10 examines the variation in the transition points between GIC and frequency caching (i.e., when interval caching begins to dominate frequency caching, and vice versa) with increases in the cache bandwidth and size. It demonstrates that: (1) the cache size required for frequency caching to yield a higher hit ratio than GIC increases with the cache bandwidth (see Figure 10(a)); and (2) the cache bandwidth required for GIC to yield a higher hit ratio than frequency caching increases with the cache size (see Figure 10(b)). Note that the transition points vary with the parameters of the cache and the workload, and cannot be determined easily. Since the integrated caching scheme always forms an envelope, it dynamically selects what to cache based on the nature of the workload and the cache resources.

3.2 Analysis of the IBSC Algorithm

We observed in Figure 9 that the IBSC algorithm forms an envelope over the other caching policies. In Figures 11(a,b) we examine how the IBSC algorithm allocates the cache between intervals and entire objects as the cache resources vary. Figure 11(a) shows that, when the cache size is small (less than 1.6 GB) for a fixed bandwidth of 4 MB/sec, the IBSC algorithm selects more intervals than entire objects. As the cache size increases and the cache bandwidth becomes the bottleneck, more of the cache is filled with entire objects. In fact, when the cache size is very large (> 10 GB), only entire objects are selected for caching. It is interesting to note that, even when only entire objects are cached, the IBSC algorithm yields a higher hit ratio than pure frequency caching (see Figure 9(a)). This is because frequency caching does not distinguish between objects that are being written to the cache and ones that are already in the cache. The IBSC algorithm, on the other hand, differentiates such entities. Consequently, in a bandwidth constrained environment, the IBSC algorithm will replace an entity currently occupying more write bandwidth,
and thereby achieve a higher hit ratio. Figure 11(b), on the other hand, shows that, for a cache size of 1.6 GB, when the disk bandwidth is low (less than 5 MB/sec), a large proportion of the cache is filled with entire objects, so as to amortize the bandwidth wasted in writing an object to the cache over a large number of reads. As the cache bandwidth increases, however, the cache size becomes the constraint and the number of intervals selected increases. Figures 11(a) and 11(b) also demonstrate that the IBSC algorithm achieves efficient utilization of cache resources by dynamically partitioning the cache between intervals and entire objects.
[Figure 11: hit ratio of the Integrated policy, broken down into Integrated-Intervals and Integrated-Segments (θ = 0.05, small:large file ratio 1:4, λ = 0.04; large file size 500 MB, small 50 MB, play rate 4 Mb/sec) for (a) varying cache size with total disk BW 4 MB/sec, and (b) varying cache bandwidth with total disk space 1.6 GB.]
Figure 11: Analysis of the IBSC algorithm
3.3 Effect of Varying Workload Characteristics
[Figure 12: hit ratios of FREQ, GIC, and Integrated with data rates and object sizes drawn from uniform distributions (large files 125 MB-1 GB, small files 12.5 MB-100 MB, play rates 2-6 Mb/sec; θ = 0.05, small:large ratio 1:4, λ = 0.04) for (a) varying cache size with disk BW 8 MB/sec, and (b) varying cache bandwidth with disk space 6.4 GB.]
Figure 12: Variation in hit ratios with varying data rates and object sizes
We studied the behavior of the IBSC algorithm with varying workload characteristics (namely, object size distributions, request arrival rates, and mixed data types). Figure 12 shows the hit ratios yielded by the various caching algorithms in a scenario where the data rates and object sizes were chosen from uniform distributions. The data rates were varied from 2-6 Mb/sec, and the object sizes from 12.5 MB to 1 GB. As Figure 12 demonstrates, the IBSC algorithm remains an envelope of both the interval and frequency caching policies, and the nature of the curves remains similar to those in Figure 9.
Figure 13(a) shows the effect of varying the request inter-arrival times. It demonstrates that very small inter-arrival times cause the cache to be in a constant state of upheaval, resulting in thrashing (and hence, low hit ratios). Increasing the inter-arrival time decreases the number of concurrent accesses to an object, and hence increases the average interval length. This results in a smaller number of intervals being cached, and hence reduces the hit ratio. The IBSC algorithm continues to be the envelope of the two policies.
[Figure 13: (a) hit ratios of FREQ, GIC, and Integrated versus request inter-arrival time (large files 500 MB, small 50 MB, play rate 4 Mb/sec, disk BW 16 MB/sec, space 12.8 GB; θ = 0.05, small:large ratio 1:4); (b) hit ratios of Integrated and LRU+LFU versus cache size for mixed CM and non-CM objects (CM files 125 MB-1 GB, non-CM files 100 KB-1 MB, play rates 0-6 Mb/sec, disk BW 8 MB/sec).]
Figure 13: Effect of request inter-arrival times and mixed data types on hit ratios
In the previous experiments, the workload was restricted to CM objects, which have both space and bandwidth requirements. Non-CM objects, on the other hand, do not require any retrieval rate guarantees, and hence do not reserve any bandwidth. In Figure 13(b) we compare the hit ratios yielded by the LRU+LFU and IBSC algorithms when the remote server contains both CM and non-CM objects. For this experiment, the sizes of non-CM objects were assumed to range from 100 KB to 1 MB, while the sizes of CM objects ranged from 12.5 MB to 1 GB. The data rate requirement of CM objects was assumed to vary in the range 2-6 Mb/s. The ratio of accesses was 80% for non-CM objects and 20% for CM objects. As Figure 13(b) demonstrates, the IBSC caching algorithm yields higher hit ratios than the LRU+LFU algorithm. Note that while the number of references to CM objects is much smaller, they contribute much more to the hit ratio due to their size. Since the integrated scheme considers the cache resources available, it performs better than or similar to LRU+LFU.

3.4 Effect of Run Caching

As outlined in Section 2.3, when multiple users are accessing the same object, a set of adjacent intervals, called a run, can be grouped together and considered for caching. This is not an issue when the cache is space constrained. However, when the bandwidth of the cache is limited, it is profitable to group intervals together to amortize the write overhead over multiple readers. Figures 14(a,b) compare the hit ratios yielded by two versions of the IBSC caching algorithm: one that considers only single reader-writer intervals, and one that also considers runs of intervals. When the bandwidth is limited (8 MB/sec), the IBSC algorithm that caches runs of intervals yields around 8% higher hit ratios (see Figure 14(a)). On the other hand, when the cache size is fixed, IBSC with run caching can improve the bandwidth usage at a much smaller space penalty than caching entire objects, and hence improves the hit ratio by around 40%.
4 Prototype Design and Implementation
We have implemented the IBSC caching algorithm in an environment consisting of a cluster of AIX workstations. The remote server in our configuration is an IBM POWERparallel SP2 system consisting of 4 RS/6000 workstations connected by the SP2 switch. This server is connected to the local server over an Ethernet. The local server caches objects on a set of 2.2 GB SCSI disks with a raw sequential access bandwidth of 7.5 MB/s. The software executing on the local server has the following main components: (i) a client interface, which consists of a client handler that accepts control requests from clients and a data pump that is responsible for the delivery of data to the clients, (ii) a remote database interface that retrieves data from remote servers, and (iii) a system controller, which instantiates all of the disk cache management techniques. Specifically, the system controller contains:

1. An Object Manager, which maintains a table containing statistics about the objects cached in their entirety. The statistics maintained include the number of references to the object within an interval, the time of last reference, the long-term access probability, etc.
[Figure 14: hit ratios of IBSC and IBSC-RUNS (θ = 0.2, small:large file ratio 1:4, λ = 0.25; large files 125 MB-1 GB, small files 12.5 MB-100 MB, play rates 2-6 Mb/sec) for (a) varying cache size with disk BW 4 MB/s, and (b) varying cache bandwidth with disk space 4 GB.]
Figure 14: Effect of run caching on the performance of the IBSC algorithm

[Figure 15: control and data flow in the prototype. The client handler passes open requests to the system controller, which decides whether to serve from the cache or the remote DB and whether to cache the object; the data pump delivers data to the client from the cache interface (backed by the cache co-ordinator, generalized interval manager, and segment manager, over the local database/cache) or from the remote DB interface.]
Figure 15: Control and Data Flow in the Prototype
2. A Generalized Interval Manager, which maintains information about intervals (including predicted intervals for small objects) in an Interval Table. This includes information such as the temporal spacing between the clients, the data rate of the object, etc.

3. A Cache Coordinator, which implements the IBSC caching algorithm. Specifically, on receiving a request to access an object, the cache coordinator checks whether the requested information is already in the cache. If so, it coordinates the retrieval and transmission of the data with the data pump. Otherwise, the coordinator interacts with the generalized interval manager and the object manager to obtain the relevant information about the entities in the cache, and then determines which, if any, of the entities must be replaced in order to accommodate the new object.

4. A Cache Interface, which provides a simple interface to read, write, and delete data in the cache. Additionally, it implements an admission control algorithm, which determines if sufficient bandwidth is available to access the requested object from the disk cache. In the event that sufficient disk bandwidth is not available, the object must be accessed from the remote server even if it is present in the cache.

Figure 15 shows the control and data flow in our prototype system.
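A minimal sketch of the kind of admission check the cache interface performs (our own rendering; the prototype's actual interfaces are not given in the paper):

```python
class CacheInterface:
    def __init__(self, total_bw_mb_s: float):
        self.total_bw = total_bw_mb_s   # e.g., 7.5 MB/s raw per SCSI disk
        self.reserved = 0.0             # bandwidth reserved for admitted streams

    def admit(self, rate_mb_s: float) -> bool:
        """Reserve cache bandwidth for a stream. If the reservation fails,
        the object must be fetched from the remote server, even if cached."""
        if self.reserved + rate_mb_s > self.total_bw:
            return False
        self.reserved += rate_mb_s
        return True

    def release(self, rate_mb_s: float) -> None:
        """Return a stream's bandwidth reservation when it closes."""
        self.reserved = max(0.0, self.reserved - rate_mb_s)
```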
5 Concluding Remarks
Information services of the future will employ a hierarchical server architecture, in which continuous as well as non-continuous media information stored at a remote server may be cached at a local server on-demand. Unlike memory caches, disk caches are constrained both by available bandwidth and space. Consequently, in addition to multiplexing the cache space among multiple data types, cache management algorithms will be required to efficiently utilize both disk space and bandwidth, and thereby minimize the load imposed on the network and the remote server. In this paper, we presented the IBSC caching algorithm, which achieves this objective. To do so, the algorithm: (1) selects an entity to be cached based on the current cache bandwidth and space utilization; and (2) determines which, if any, of the entities presently in the cache should be replaced. The procedure for determining the entities to be replaced depends on whether the cache is unconstrained, bandwidth constrained, space constrained, or bandwidth and space constrained. By judiciously selecting the entities to be replaced, the algorithm maximizes the hit ratio, and thereby minimizes the load on the network and the remote server. Through extensive simulations, we demonstrated that the IBSC caching algorithm achieves higher hit ratios than several existing algorithms under widely varying workloads and cache configurations. We also described the architecture of our prototype system. The evaluation of the prototype system for the efficient delivery of hyper-media objects is the focus of some of our on-going research.

REFERENCES

[1] M. Blaze and R. Alfonso. Dynamic Hierarchical Caching in Large-Scale Distributed File Systems. In Proceedings of the International Conference on Distributed Computing Systems, June 1992.

[2] H. Chou and D. J. DeWitt. An evaluation of buffer management strategies for relational database systems. In Proceedings of the 11th VLDB Conference, 1985.

[3] M. D. Dahlin, R. Wang, T. E. Anderson, and D. Patterson. Cooperative caching: Using remote client memory to improve file system performance. In Proceedings of the Operating Systems Design and Implementation Symposium, 1994.

[4] A. Dan, D. Dias, R. Mukherjee, D. Sitaram, and R. Tewari. Buffering and caching in large scale multimedia servers. In Proceedings of IEEE COMPCON, pages 217–224, March 1995.

[5] A. Dan and D. Sitaram. Buffer Management Policy for an On-Demand Video Server. IBM Research Report RC 19347, T.J. Watson Research Center, Yorktown Heights, New York, January 1993.

[6] A. Dan and D. Sitaram. A Generalized Interval Caching Policy for Mixed Interactive and Long Video Environments. In IS&T SPIE Multimedia Computing and Networking Conference, San Jose, CA, January 1996.

[7] A. Dan and D. Towsley. An approximate analysis of the LRU and FIFO buffer replacement schemes. In ACM SIGMETRICS, pages 143–152, May 1990.

[8] A. Bestavros et al. Application-Level Document Caching in the Internet. In Proceedings of the Workshop on Services and Distributed Environments, June 1995.

[9] K. Andrews et al. On second generation hypermedia systems. In Proceedings of ED-MEDIA, World Conference on Educational Multimedia and Hypermedia, June 1995.

[10] M. Abrams et al. Caching proxies: Limitations and potentials. In 4th International World-Wide Web Conference, pages 119–133, December 1995.

[11] M. Feeley et al. Implementing global memory management in a workstation cluster. In Proceedings of the 15th ACM Symposium on Operating Systems Principles, December 1995.

[12] R. H. Patterson et al. Informed prefetching and caching. In Proceedings of the 15th ACM Symposium on Operating Systems Principles, December 1995.
[13] S. Williams et al. Removal policies in network caches for world-wide web documents. In ACM SIGCOMM, pages 293–305, August 1996.

[14] G. A. Gibson, J. S. Vitter, and J. Wilkes. Storage and I/O Issues in Large-Scale Computing. ACM Workshop on Strategic Directions in Computing Research, ACM Computing Surveys, 1996. http://www.medg.lcs.mit.edu/doyle/sdcr.

[15] A. Luotonen and K. Altis. World-wide web proxies. Computer Networks and ISDN Systems, 27(2), 1994.

[16] D. Muntz and P. Honeyman. Multi-level caching in distributed file systems or your cache ain't nothing but trash. In Proceedings of the Winter USENIX, January 1992.

[17] V. F. Nicola, A. Dan, and D. M. Dias. Analysis of the generalized clock buffer replacement scheme for database transaction processing. In Proceedings of ACM SIGMETRICS, pages 35–46, 1992.

[18] B. Ozden, R. Rastogi, and A. Silberschatz. Buffer replacement algorithms for multimedia storage systems. In Proceedings of the International Conference on Multimedia Computing and Systems, pages 172–180, June 1996.

[19] J. E. Pitkow and M. M. Recker. A simple yet robust caching algorithm based on dynamic access patterns. In Proceedings of the 2nd International WWW Conference, pages 1039–1046, October 1994.

[20] J. T. Robinson and M. V. Devarakonda. Data cache management using frequency-based replacement. In Proceedings of ACM SIGMETRICS, May 1990.

[21] L. A. Rowe and B. C. Smith. A continuous media player. In Proceedings of the 3rd International Workshop on Network and Operating System Support for Digital Audio and Video, November 1992.

[22] P. Sarkar and J. Hartman. Efficient Cooperative Caching using Hints. In Proceedings of the Operating Systems Design and Implementation Conference, October 1996.

[23] H. M. Vin and P. V. Rangan. Designing a Multiuser HDTV Storage Server. IEEE Journal on Selected Areas in Communications, January 1993.

[24] D. L. Willick, D. L. Eager, and R. B. Bunt. Disk cache replacement policies for network fileservers. In Proceedings of the International Conference on Distributed Computing Systems, May 1993.