On Design and Performance Evaluation of Multimedia Proxy Caching Mechanisms for Heterogeneous Networks

Reza Rejaie, Jussi Kangasharju
AT&T Labs - Research
[email protected]

Abstract—Multimedia proxy caching (MPC) can improve scalability by reducing network load, and can maximize delivered quality to heterogeneous clients. The notion of delivered quality is a new dimension in the design and evaluation of MPC mechanisms that does not exist in traditional web caching. Furthermore, there is a fundamental tradeoff between delivered quality, network load, and delay; previous studies on MPC did not consider delivered quality in the design and evaluation of MPC mechanisms. This paper makes three contributions. First, we explore the design space of MPC mechanisms with respect to this tradeoff. We identify three promising caching strategies that leverage the tradeoff differently and are conservative (LQC), aggressive (HQC), or adaptive (AQC) with respect to delivered quality. Second, we present the first methodology for comprehensive evaluation of MPC mechanisms with respect to this tradeoff. Third, we use the proposed methodology to conduct a simulation-based comparison among the three strategies. Our simulation results show that HQC can always maximize delivered quality at the cost of high network load and long delay. In contrast, AQC can deliver moderate to high quality at a substantially lower load and with no delay. Delivered quality, network load, and delay are always low in LQC.
I. INTRODUCTION

Today's multimedia streaming applications often use a client-server architecture where a requested stream is pipelined from the server to the client, i.e., the buffered portion of the stream is played while the rest of the stream is being delivered. The client-server architecture does not scale well to a large number of clients because streaming applications usually require high bandwidth for the entire session. Furthermore, even if the client has high bandwidth connectivity to the network, the quality of delivered streams is limited by the bottleneck bandwidth and could change with unpredictable variations in background traffic along the client-server path. Proxy caching of multimedia streams is a promising solution that could simultaneously improve scalability to a large number of clients and maximize delivered quality. Similar to a web proxy cache, a multimedia proxy cache can pull and cache popular streams from the server on demand; thus it can significantly reduce network (and server) load and improve scalability. More importantly,
Fig. 1. Multimedia Proxy Caching. (The server connects to the proxy over a 500 Kbps link; clients C1, C2, and C3 connect to the proxy at 56 Kbps, 500 Kbps, and 2 Mbps, respectively.)
since the proxy is usually located at the point of bandwidth discontinuity (e.g., a campus gateway), the proxy can maximize delivered quality to heterogeneous clients despite a potential bottleneck between the proxy and the server. A Content Distribution Network (CDN) is an alternative solution for achieving scalability and delivering high quality. CDNs have the ability to push popular streams to content servers (proxies) that are located close to interested clients. Such a solution is viable when the CDN can predict demand for a particular stream a priori (e.g., a scheduled video playback). Typically, however, demand is not known in advance, so it is more efficient for content servers to pull objects in a demand-driven fashion, similar to proxy caches. Therefore, by examining the pull strategy, we not only study multimedia proxy caching mechanisms but also cover the typical pull strategy in a CDN. The pipelining nature of delivery for multimedia streams is significantly different from the atomic delivery of web objects. We use an example to illustrate the implications of pipelining for the design and evaluation of multimedia proxy caching mechanisms. Figure 1 shows a proxy that serves three clients with heterogeneous bandwidth connectivity. Each client expects to receive the maximum quality that can be pipelined through the client-proxy connection. When C1 requests stream s that is missing from the cache, a version of s that has 56 Kbps bandwidth (s56K) is delivered from the server to C1 and is cached at the proxy. If C2 requests stream s, the proxy has two choices: it can simply deliver the low quality version of s (s56K) from the cache. Alternatively, since the server-proxy bandwidth is at least as high as the stream bandwidth (500 Kbps), the proxy can treat this as a miss, i.e., the proxy can request the server to pipeline a 500 Kbps version of s (s500K) to C2, and cache s500K. Clearly, the latter strategy results in a higher delivered quality at the cost of higher network load. Furthermore, the cache must store and manage two versions of stream s in the latter case. If C3 later requests stream s, one strategy is to deliver the highest quality version of s that is available in the cache (i.e., s500K). In this scenario, the server-proxy bandwidth is the bottleneck and determines the maximum quality that can be pipelined from the server. Therefore, any stream with a quality higher than 500 Kbps must first be downloaded from the server to the proxy (as a file) before it can be pipelined from the proxy to C3. Obviously, downloading could result in a long delay which may not be tolerated by the client. On the other hand, if the 2 Mbps stream is not provided to C3, this client does not benefit from its higher bandwidth connectivity even though it pays a higher premium than C2. This simple example illustrates key issues in the design and evaluation of multimedia proxy caching mechanisms that are rather different from web caching schemes. Quality Adaptive Caching: Multimedia proxy caching mechanisms should be quality adaptive, i.e., they should be able to provide multiple quality versions of a popular stream for clients with different bandwidth connectivity to the proxy. Thus, a multimedia proxy caching mechanism should address two issues: – Which streams should be cached? – What is the appropriate quality for each cached stream? The first issue is in essence a cache replacement problem that is similar to web caching mechanisms and has been extensively studied in the context of web caching. The second issue, however, addresses the notion of "delivered quality" for cached streams, which does not exist in the context of web caching. Quality-Network Load-Delay Tradeoff: This example illustrates that there is a tradeoff between delivered quality, network load, and delay. Notice that the size of each stream monotonically increases with its quality.
Thus, for a given cache size, a proxy that only caches and delivers low-quality streams can store more streams and reduce network load more effectively. In contrast, a proxy that caches multiple versions of each popular stream with different qualities can provide higher quality for cached streams. However, this approach results in a higher network load for the same cache size. Furthermore, supporting a high quality stream may result in a long delay. The quality-network load-delay tradeoff reveals that the overall performance evaluation of a multimedia proxy caching mechanism should collectively examine the cache's ability to reduce network load as well as delivered quality and delay. Consequently, existing web performance evaluation metrics (e.g., ByteHitRatio) and methodologies are not suitable for evaluating multimedia proxy caching mechanisms. During recent years, proxy caching for streaming media has received increasing attention from both the research and industrial communities. Several commercial proxy caches for streaming media have been introduced by various companies (e.g., [1][2]); however, there is no technical information available about these products. Most of the previous work in this area either treated streaming objects similarly to web objects or ignored the notion of quality for streaming objects. We have been investigating various design and evaluation issues for multimedia proxy caching mechanisms through simulation and prototype implementation. To the best of our knowledge, there is no evaluation framework that examines the performance of multimedia proxy caches with respect to the quality-network load-delay tradeoff. This paper makes three contributions. First, we explore the design space of multimedia proxy caching mechanisms with respect to the quality-network load-delay tradeoff. We identify High Quality Caching (HQC), Low Quality Caching (LQC) and Adaptive Quality Caching (AQC) as the three most promising caching strategies that leverage the tradeoff differently. We believe that all existing solutions should use one of these candidate strategies or a reasonable combination of them. Second, we present the first methodology for the evaluation of multimedia proxy caching mechanisms that examines performance with respect to delivered quality, network load and delay. Our methodology justifies the performance metrics that should be measured, describes different granularities for performance evaluation, and presents the various dimensions of the evaluation space. Third, we use our proposed evaluation methodology to compare these strategies.
This not only illustrates various aspects of our methodology, but also quantifies the differences between these three basic strategies and provides some insight into the behavior of each strategy across the evaluation space. Our results show that HQC always delivers the maximum quality at the cost of a high load and a long delay. AQC, however, is able to achieve moderate to high quality with a significantly lower load and no delay. In this paper we mainly focus on different strategies for the design of quality adaptive proxy caching mechanisms for multimedia streams. However, we do not discuss many other orthogonal design issues for multimedia proxy caches such as customized file systems for streaming media, managing different media types, and copyright issues. The rest of this paper is organized as follows: In Section II, we present the design space of multimedia proxy caching and identify three candidate strategies. We also present a qualitative comparison among these strategies. Section
III describes our proposed methodology for performance evaluation of multimedia proxy caches. We apply our methodology to conduct a simulation-based comparison among the three candidate mechanisms in Section IV. Section V reviews previous work on multimedia proxy caching. Finally, Section VI concludes the paper and sketches our future directions.

II. EXPLORING THE DESIGN SPACE

As we mentioned earlier, there is a general tradeoff between delivered quality, network load, and delay. Depending on how this tradeoff is leveraged, one can design a spectrum of proxy caching strategies for multimedia streams. In this section, we present three representative strategies for caching multimedia streams, each of which uses the quality-network load-delay tradeoff differently. We believe that these strategies are the most reasonable points of the design spectrum; therefore, any proxy caching mechanism for multimedia streams should either use one of these candidate strategies or a combination of them. The three caching strategies are: 1) Low Quality Caching (LQC), 2) High Quality Caching (HQC), and 3) Adaptive Quality Caching (AQC). In the LQC and HQC strategies, all streams are single-layer encoded with different qualities (and thus different bandwidths) and stored at the server. The AQC strategy assumes that all streams are layered encoded and stored at the server. In layered encoding, each stream is encoded into multiple layers; the more layers are delivered to the client, the higher the delivered quality. However, a client can only decode layer Li if all the lower layers (i.e., Lj, for any j < i) are delivered as well. All three schemes deploy the LRU replacement algorithm^1. However, LQC and HQC perform atomic replacement whereas AQC performs replacement at a per-layer granularity. Note that in all three strategies, each cached stream can be divided into a number of segments. This allows the cache to perform fine-grained replacement by flushing only a minimum number of segments from a victim stream (instead of the entire stream or the entire layer) in a demand-driven fashion. However, since all three strategies can equally benefit from such a segmentation scheme, we do not discuss this issue in this paper. Instead, our main goal is to investigate the quality-network load-delay tradeoff.

TABLE I
NOTATION USED THROUGHOUT THE PAPER

Symbol      Definition
bwsp        Server-proxy BW
bwi         Proxy-client_i BW
bwh         BW between proxy & high-bw client
bwl         BW between proxy & low-bw client
Qnl(bw)     Quality of a non-layered stream of bandwidth bw
Ql(bw)      Quality of a layered-encoded stream of bandwidth bw
BW(s)       BW of stream s

^1 We are currently evaluating various replacement algorithms (e.g., different variants of LFU and LRU). Our preliminary results indicate that LRU outperforms the other algorithms.

Table I summarizes the notation that we use in this section. We assume that any server-client connection comprises two parts: 1) the server-proxy connection, and
2) the proxy-client connection. A stream can be pipelined through a connection when its bandwidth is less than or equal to the connection's bandwidth. By pipelining we refer to the delivery scheme where the client plays the buffered portion of the stream while the rest of the stream is being delivered. We assume that clients always desire to receive the maximum deliverable quality (i.e., Q(bwi)) that can be pipelined through the proxy-client connection^2. A requested stream can be pipelined from the server with a quality that is limited by the server-proxy bandwidth (Q(bwsp)). To deliver a higher quality stream (i.e., BW(s) > bwsp) from the server, the stream must first be downloaded (as a file) from the server to the proxy, and then pipelined from the proxy to the client. If a stream is pipelined from the server or the proxy, the delay consists of a round-trip time and initial buffering. Since this component of the delay is the same across all strategies and is often short, it is ignored in this paper. However, downloading a stream could result in a long delay^3. Since the server-proxy and proxy-client bandwidths directly affect the quality of delivered streams, we examine three possible scenarios with respect to the available bandwidth:
Scenario I: bwsp ≥ bwi, for all i
Scenario II: bwsp < bwi, for some i
Scenario III: bwsp < bwi, for all i

A. Low Quality Caching (LQC)

The goal of the LQC scheme is to deliver the maximum quality that can be directly pipelined to the client. Therefore, the delay in the LQC strategy is always low, and delivered quality is determined by MIN(Q(bwsp), Q(bwi)). Clearly, when the server-proxy bandwidth is the bottleneck, delivered quality in the LQC strategy is low. The following pseudo-code presents the LQC strategy more precisely.
^2 This is a reasonable assumption because clients often pay a premium based on their bandwidth connectivity.
^3 We do not consider off-line downloading of a stream (or a layer) because it is similar to the push strategy in a CDN, which requires prediction of future accesses.
LQC(s, bwi, bwsp)
1   IF (IsQInCache(s, bwi))
2       PipelineStrFrmCache(s, bwi)
3   ELSE IF (bwi ≤ bwsp)
4       PipelineStrFrmServer(s, bwi)
5   ELSE
6       PipelineStrFrmServer(s, bwsp)
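As a concrete illustration, the LQC decision logic above can be sketched in a few lines of Python. The cache representation (a set of `(stream, bandwidth)` versions) and the Kbps values are our own illustrative assumptions, not the paper's simulator:

```python
# Illustrative sketch of the LQC decision logic; the cache representation
# (a set of (stream, bandwidth) versions) and the Kbps values are assumptions.

def lqc(cache, stream, bw_client, bw_sp):
    """Serve one request under Low Quality Caching; returns (source, quality)."""
    wanted = min(bw_client, bw_sp)       # best quality pipelinable end-to-end
    if (stream, wanted) in cache:        # IsQInCache: this version is cached
        return ("cache", wanted)
    cache.add((stream, wanted))          # pipeline from server and cache it
    return ("server", wanted)

cache = set()
print(lqc(cache, "s", 56, 500))    # C1: miss, 56 Kbps version cached
print(lqc(cache, "s", 500, 500))   # C2: treated as a miss, 500 Kbps cached
print(lqc(cache, "s", 2000, 500))  # C3: server-proxy link caps quality at 500
```

Note how the delivered quality never exceeds MIN(Q(bwsp), Q(bwi)), matching the scenario of Figure 1.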
When client i requests stream s, if a version of s with the required quality exists in the cache, it is directly pipelined from the cache (lines 1, 2). However, if the desired quality does not exist in the cache but can be pipelined from the server, it is directly pipelined from the server and cached at the proxy (lines 3, 4). Otherwise, the highest quality that can be pipelined from the server is delivered to the client and cached at the proxy (lines 5, 6). Notice that LQC may cache multiple versions of a stream with different qualities whenever the server-proxy bandwidth is not the bottleneck and clients have different bandwidths (i.e., in scenarios II and III). When cache space is exhausted, the least recently used stream (i.e., one version of a cached stream) is selected and entirely evicted from the cache.

B. High Quality Caching (HQC)

The goal of HQC is to always provide the desired quality (i.e., Q(bwi)) to each client, i.e., delivered quality is solely determined by the proxy-client bandwidth. Clearly, when the desired quality does not exist in the cache and cannot be pipelined from the server, HQC results in a long downloading delay. The following pseudo-code presents the HQC strategy more precisely.

HQC(s, bwi, bwsp)
1   IF (IsQInCache(s, bwi))
2       PipelineStrFrmCache(s, bwi)
3   ELSE IF (bwi ≤ bwsp)
4       PipelineStrFrmServer(s, bwi)
5   ELSE
6       DownloadStrFrmServer(s, bwi)
7       downloaddel = (bwi × leni) / bwsp
8       PipelineStrFrmCache(s, bwi)
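The download delay of line 7 is simply the stream size (bandwidth × length) divided by the server-proxy bandwidth. A small worked example with illustrative numbers (the function name and units are our own):

```python
def download_delay(bw_stream_kbps, length_s, bw_sp_kbps):
    """Download delay of line 7: stream size divided by server-proxy bandwidth."""
    return bw_stream_kbps * length_s / bw_sp_kbps

# A 1300 Kbps stream of 30 minutes pulled over a 700 Kbps server-proxy link
# must be transferred as a file before playback can start:
delay_s = download_delay(1300, 30 * 60, 700)
print(round(delay_s / 60, 1))  # about 55.7 minutes of startup delay
```

This back-of-the-envelope figure is consistent with the long per-request delays reported for HQC in Section IV.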
If the requested stream with the maximum deliverable quality is available in the cache, it is pipelined from the cache (lines 1, 2). Otherwise, if the server-proxy bandwidth is sufficient, the desired quality is directly pipelined from the server (lines 3, 4). Finally, if the server-proxy bandwidth is not sufficient, the stream is first downloaded (i.e., transferred as a file) to the proxy and cached, and then pipelined from the cache to the client (lines 5-8). HQC does not compromise delivered quality, at the cost of a potential downloading delay on a cache miss. If the stream is long or the server-proxy bandwidth is low, the delay
could be long and may not be tolerated by the client. Alternatively, the proxy can start pipelining the stream to the client while it is being downloaded from the server. In this case, the client is likely to experience a stall in playback whenever the proxy runs out of data. This approach does not reduce the delay; instead, the delay is spread out over the entire session. Once the high quality stream is downloaded, subsequent requests for it can be served directly from the cache until it is replaced. HQC may cache multiple versions of one stream in all three scenarios if clients have heterogeneous bandwidths. When cache space is exhausted, the least recently used stream (i.e., one version of a cached stream) is selected and evicted from the cache in an atomic fashion.

C. Adaptive Quality Caching (AQC)

The goal of the AQC strategy is to provide high quality streams with moderate load and minimum delay. AQC leverages the layered structure of encoded streams to provide different qualities (i.e., by delivering different numbers of cached layers) from a single cached stream. Thus, AQC does not need to cache multiple copies of a popular stream. The AQC strategy is described in the following pseudo-code:

AQC(s, bwi, bwsp)
1   ReqLayer = BW2Layer(0, bwi)
2   CachedLayer = LayerInCache(s)
3   IF (CachedLayer ≥ ReqLayer)
4       PipelineLayerFrmCache(s, 0, ReqLayer)
5   ELSE
6       PipelineLayerFrmCache(s, 0, CachedLayer)
7       PipelineLayerFrmServer(s, CachedLayer, ReqLayer)
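A minimal Python sketch of this logic follows. The layer numbering, the cumulative layer bandwidths (taken from Section IV), and the omission of the server-proxy bandwidth check are simplifying assumptions of ours:

```python
# Illustrative sketch of the AQC logic; layer numbering, cumulative layer
# bandwidths, and the omitted server-proxy bandwidth check are assumptions.

CUM_BW = [100, 300, 700, 1300]   # cumulative Kbps after layers L1..L4

def bw2layer(bw_kbps):
    """Highest layer whose cumulative bandwidth fits in bw_kbps (BW2Layer)."""
    return sum(1 for c in CUM_BW if c <= bw_kbps)

def aqc(cached_layers, stream, bw_client):
    """Serve one request under Adaptive Quality Caching; returns (source, layers)."""
    req = bw2layer(bw_client)
    have = cached_layers.get(stream, 0)
    if have >= req:                    # lines 3-4: requested layers all cached
        return ("cache", req)
    cached_layers[stream] = req        # lines 5-7: fetch missing layers, cache
    return ("cache+server", req)

layers = {}
print(aqc(layers, "s", 100))    # low-bw client: layer 1 fetched and cached
print(aqc(layers, "s", 1300))   # high-bw client: layers 2-4 added to cache
print(aqc(layers, "s", 700))    # layers 1-3 now served entirely from cache
```

The sketch shows how a single cached copy serves heterogeneous clients, with quality improving gradually as higher-bandwidth clients request the stream.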
If the number of cached layers is greater than or equal to the number of requested layers, the requested layers are pipelined directly from the cache (lines 3, 4). Otherwise, all the cached layers are pipelined to the client, and any higher layers that are requested and can be pipelined from the server are also pipelined to the client and cached (lines 5-7). Therefore, the delivered quality is determined by the total number of layers that are pipelined from both the cache and the server. The AQC strategy takes advantage of the layered structure of the streams to match the quality of a cached stream s with the desired quality for s, which is determined by the available bandwidth between the proxy and the clients who are interested in s. When only low-bandwidth clients are interested in stream s, only a small number of lower layers are
cached. If stream s becomes popular among higher bandwidth clients, higher layers of s are gradually brought into the cache to improve delivered quality. But if high bandwidth clients occasionally request stream s, the lower layers (that are popular and cached) are pipelined from the cache and higher layers that are not popular are pipelined from the server.
If the server-proxy bandwidth is higher than the bandwidth of all layers, higher layers can be gradually pipelined from the server and cached during requests from high bandwidth clients. The rate of improvement in the quality of a cached stream in the AQC strategy depends on the server-proxy bandwidth. Obviously, if the server-proxy bandwidth is less than the bandwidth required for pipelining a layer, that layer can only be downloaded, which could result in a long delay. When the cache space is exhausted, the least recently used layer of one cached stream is selected and flushed. Notice that when layer Li of stream s is delivered from the cache, all the lower layers of stream s (Lj for any j < i) are also delivered from the cache. Therefore, the most recent access time increases monotonically from the top layer to the bottom layer of a cached stream. This implies that the victim layer is always the top cached layer of one stream.
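This replacement rule can be sketched as follows. Because serving layer Li also touches every layer below it, the least recently used cached layer of any stream is always its top cached layer, so the global victim is the top layer with the oldest access time. The `last_access` bookkeeping structure is an assumed representation:

```python
# Sketch of the per-layer LRU victim selection described above; the
# last_access bookkeeping structure is an assumed representation.

def pick_victim(last_access):
    """last_access[s][l]: last access time of cached layer l of stream s.
    Returns the (stream, layer) to flush: the top cached layer of the
    stream whose top layer is globally least recently used."""
    best = None
    for s, layers in last_access.items():
        top = max(layers)              # top cached layer of stream s
        if best is None or layers[top] < best[2]:
            best = (s, top, layers[top])
    return best[0], best[1]

last_access = {
    "a": {1: 40, 2: 35, 3: 10},   # access times grow from top to bottom layer
    "b": {1: 50, 2: 20},
}
print(pick_victim(last_access))   # ('a', 3): oldest top layer in the cache
```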
C.1 Layering Penalty

The flexibility of layered encoding comes at the cost of encoding efficiency. More specifically, a layered-encoded stream with bandwidth b usually has lower quality than a single-layer encoded stream with the same bandwidth. The inefficiency of layered encoding often increases monotonically with the number of layers. However, recent studies on layered encoding [3] have shown a significant improvement in the efficiency of layered encodings. Figure 2 shows the delivered quality (in terms of PSNR) of two layered encoding schemes, Fine-Grained Scalable (FGS) encoding [4] and the Scalable Codec with Drift Control (SCDC) [5], compared with a single-layer encoding of the sequence "Mobile & Calendar" at different rates. Note that the shape of these PSNR curves is content-dependent, and these curves are only representative. Therefore, the overall performance of the AQC strategy depends on the content of the stream and the layered encoding scheme. Since the slope of improvement in quality decreases exponentially with bandwidth, each layer can yield the same level of improvement in delivered quality only if the layer bandwidth distribution is exponential, i.e., the bandwidth of layer Li is twice the bandwidth of L(i-1).
Fig. 2. Quality-rate function for two layered encodings and a single-layer encoding.

TABLE II
QUALITATIVE COMPARISON

        Delivered Quality   Net. Load      Delay
LQC     ≤ Q(bwsp)           Low            Low
HQC     Maximum             High           High
AQC     Moderate/High       Moderate/Low   Low
D. Qualitative Comparison

Before examining these strategies in detail, we present a qualitative comparison to provide some high-level insight. Delivered quality in LQC is limited by the server-proxy bandwidth, and delay is always low. In contrast, HQC always delivers the desired quality but may result in a long delay. HQC is likely to cache all versions of a popular stream with different qualities. LQC may also cache multiple versions of one stream, but the cached copies are likely to have lower quality than in HQC. Therefore, for a given cache size, HQC experiences a higher degree of replacement, which in turn results in a higher network load than LQC. AQC should cause a lower network load than HQC because it caches only a single copy of a popular stream. Delivered quality in AQC might be lower than in HQC because 1) some layers may not exist in the cache and cannot be pipelined from the server, and 2) even if all the requested layers are delivered, the quality of the corresponding single-layer stream is higher. However, the delay in AQC is as low as in LQC. Table II summarizes the above qualitative comparison. Although we can draw this comparison intuitively, the following questions still beg for a quantitative comparison between the three strategies:
1. What is the relative delivered quality of HQC, AQC and LQC?
2. What is the extra cost in terms of network load and delay for HQC compared to the other strategies?
3. How much does the inefficiency of layered encoding contribute to the gap between the delivered quality of AQC and HQC?
By answering these questions, we can quantify how each strategy utilizes the quality-network load-delay tradeoff.
III. EVALUATION METHODOLOGY

In this section, we first present our evaluation metrics. Then, we motivate the need for evaluation at different granularities. Finally, we justify the various dimensions of the evaluation space.

A. Evaluation Metrics

We use a black-box approach to evaluate a multimedia proxy cache. For a complete evaluation of a multimedia proxy cache, its performance should be examined with respect to the following three metrics:
1. Network Load: In the context of multimedia proxy caching, the delivered quality of a requested stream might be lower than the desired quality, so the total number of requested bytes^4 may not equal the total number of delivered bytes. This makes it hard to define a meaningful normalized metric (such as ByteHitRatio in web caching) to measure the cache's ability to reduce load. As a result, we use the total number of bytes delivered from the server (i.e., the absolute value of the network load, NetLoad) to compare how different caching strategies reduce network load for a given request sequence.
2. Delivered Quality: The average delivered quality across all requests presents the cache's ability to improve delivered quality at the aggregate level:

AvgDelQ(R) = [ Σ_{∀r∈R} Len(str(r)) × DelQ(str(r)) ] / [ Σ_{∀r∈R} Len(str(r)) ]    (1)

where R, str(r), Len(s) and DelQ(s) denote a given request sequence, the requested stream for request r, the length of stream s, and the delivered quality of stream s, respectively. Notice that the delivered quality of each requested stream must be weighted by the stream length to correctly calculate the average delivered quality.
3. Delay: We measure the average delay that a client experiences until a requested stream is played. We assume that the delay is negligible when a requested stream is pipelined from the proxy or even the server. Thus, the only contributing factor to the delay is the downloading time of high bandwidth streams that are missing from the cache and cannot be directly pipelined from the server. We average the downloading delay across all requests (even those that did not experience any download delay) to obtain a per-request average delay.

^4 Notice that the number of requested bytes for each request depends on the proxy-client bandwidth. Thus, the total number of requested bytes directly depends on the mapping of requests among clients with different bandwidths.
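Equation (1) reduces to a length-weighted average, and its per-client variant simply restricts the request set. A minimal sketch, where the request records (stream length in seconds, delivered PSNR, client group) are illustrative assumptions:

```python
# Length-weighted average delivered quality of Eq. (1); the request records
# (stream length in seconds, delivered PSNR, client group) are assumptions.

def avg_del_q(requests):
    """requests: iterable of (length, delivered_quality) pairs, one per request."""
    total_len = sum(length for length, _ in requests)
    return sum(length * q for length, q in requests) / total_len

R = [(1800, 34.60, "high"), (120, 21.17, "low"), (1800, 34.60, "high")]

# Aggregate AvgDelQ over all requests; the short low-quality request pulls
# the weighted average slightly below 34.60:
print(round(avg_del_q([(l, q) for l, q, _ in R]), 2))
# Per-client variant, restricted to the high-bandwidth group:
print(round(avg_del_q([(l, q) for l, q, g in R if g == "high"]), 2))
```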
B. Granularity of Evaluation

The performance of multimedia proxy caching strategies should be examined at different granularities as follows:
Aggregate Level Evaluation: NetLoad, AvgDelQ, and Delay present the performance of a multimedia caching strategy at the aggregate level. Aggregate metrics are very useful for the overall comparison of different schemes. However, to examine the dynamics of each caching strategy, we need to measure network load, delivered quality, and delay at finer granularities.
Per-Client Level Evaluation: We can measure network load (NetLoad_i) and average delivered quality (AvgDelQ_i) only across requests from a group of homogeneous clients (i.e., clients with the same bandwidth connectivity to the proxy). NetLoad_i presents the contribution of this group of clients to the total network load. AvgDelQ_i is defined like AvgDelQ but is averaged only over requests from clients with the same proxy-client bandwidth. AvgDelQ_i shows how much delivered quality is improved for this group of clients at the cost of the generated network load (i.e., NetLoad_i). Since the desired (or target) quality can be defined for a group of homogeneous clients (i.e., Q(bwi)), a comparison between AvgDelQ_i and Q(bwi) reveals how close the delivered quality is to the target quality.
Per-Stream Level Evaluation: To study the dynamics of the caching strategies in further detail, we can measure the average delivered quality and the share of network load and delay on a per-client-per-stream basis, i.e., averaging across all requests for stream s from a group of clients with the same proxy-client bandwidth. These per-stream measurements show the distribution of the per-client measurements across different streams. If this information is ordered by stream popularity, it clearly demonstrates the impact of a stream's popularity on 1) its share of network load, 2) its delivered quality, and 3) the delay.
Per-Stream over Time: Finally, to investigate the details of the cache replacement mechanisms, we can measure both the delivered quality and the network load for each request from a group of homogeneous clients as a function of time. In essence, this measurement shows the evolution of the per-stream values over time.

C. Evaluation Space

The performance of a multimedia caching mechanism clearly depends on traditional evaluation parameters such
as cache size and workload characteristics (e.g., Zipf parameters), as well as multimedia-specific parameters such as the server-proxy or proxy-client bandwidth, or even the mapping of requests among clients. Here, we focus on those evaluation parameters that are specific to multimedia proxy caching, since the importance of the traditional evaluation parameters is well understood.
Ratio of High Bandwidth Requests (rhbwr): The appropriate quality of a popular stream s in the cache is a function of the available bandwidth between the proxy and the clients who are interested in s. For example, if s is popular among low bandwidth clients, a low quality version of s should be cached. This implies that a workload for a multimedia proxy cache should specify a request sequence and its mapping among heterogeneous clients. For a cache with a high and a low bandwidth client, this mapping can be presented by the ratio of high bandwidth requests (rhbwr). For example, rhbwr = 10% means that only 10% of the total requests are issued by the high bandwidth client.
Degree of Client Bandwidth Heterogeneity (bwh/bwl): The range of desired quality expands with the range of proxy-client bandwidth. Therefore, the degree of client bandwidth heterogeneity is a key parameter that shows how sensitive cache performance is to the range of client bandwidth connectivity.
Degree of Bandwidth Discontinuity (bwh/bwsp): bwh/bwsp presents the maximum bandwidth discontinuity between the server-proxy and proxy-client connections. This ratio determines the gap between the quality that can be directly pipelined from the server and the quality that is desired by clients. Thus, the sensitivity of cache performance to this parameter is very important.

IV. SIMULATION-BASED COMPARISON

In this section, we use our proposed evaluation methodology to conduct a simulation-based comparison among the three candidate strategies at different granularities, using our own session-level simulator.
We have conducted more than a thousand simulations and compared the performance of these strategies along various dimensions of the evaluation space, over a wide range of evaluation parameters, and for different streams with various layered encodings. Due to lack of space, we only present some key results in this section to show the primary characteristics of the candidate caching strategies. Other results can be found in a technical report [6]. We use the stream "Mobile & Calendar" with the SCDC layered encoding and the single-layer encoding (SLC) shown in Fig. 2. Table III summarizes the quality-rate information for these two encodings from Figure 2. Each layered encoded stream has 4 layers, where
TABLE III
QUALITY OF SLC AND SCDC ENCODINGS

Rate (Kbps)    100     300     700     1300
SLC (PSNR)     23.46   27.48   32.23   37.68
SCDC (PSNR)    21.17   24.62   29.54   34.60
layer bandwidths are 100 Kbps, 200 Kbps, 400 Kbps and 600 Kbps, respectively. A recent study by Chesire et al. [7] found that their multimedia proxy workload exhibited a Zipf-like distribution with Zipf parameter 0.47. Thus, we generate a request sequence that consists of 200,000 requests where stream popularity conforms to the Zipf distribution. We also examine the impact of the Zipf parameter on cache performance. Our dataset consists of 1000 streams, where the length of each stream is chosen randomly between [2 min, 60 min]. Cache size is an evaluation parameter and is specified as a percentage of the dataset size (i.e., CF). Notice that the bandwidth of the stream with maximum quality is the same for both the layered and single-layer encodings in Table III; this implies that the dataset size is the same for the different encodings. We use the ratio of high bandwidth requests (rhbwr) and the cache size factor (CF) as our primary evaluation variables. rhbwr changes from 10% to 90% in steps of 20%, while CF varies within [0.002, 0.2]. We use the topology in Figure 1 and assume a proxy with a high and a low bandwidth client, with bandwidths bwh and bwl. bwsp is set to the maximum layer bandwidth (i.e., bwsp = 700 Kbps) so that AQC can pipeline at least one layer from the server.

A. Aggregate Level Evaluation

First, we examine the aggregate performance of all three strategies as a function of rhbwr and CF, where the Zipf parameter is 0.5, and bwsp, bwh and bwl are 700 Kbps, 1300 Kbps and 100 Kbps, respectively. Each line in Figure 3a presents aggregate delivered quality as a function of total network load for HQC and LQC as rhbwr changes. Figure 3b shows the same information for AQC. The results for LQC are the few points at the bottom left of Figure 3a. For the same simulation, Figure 3c depicts the average delay in the HQC strategy as a function of cache size for five values of rhbwr. Figures 3a and 3b clearly show how each strategy leverages the quality-load tradeoff.
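The workload setup described earlier in this section (Zipf-like popularity, random stream lengths, and the rhbwr mapping of requests to high- and low-bandwidth clients) can be sketched as follows. The generator details and helper names are our own assumptions, not the authors' session-level simulator:

```python
# Sketch of the simulation workload described above; random-generator details
# and helper names are assumptions, not the paper's session-level simulator.
import random

def make_workload(n_req=200_000, n_streams=1000, alpha=0.47, rhbwr=0.10, seed=1):
    rng = random.Random(seed)
    # Zipf-like popularity: weight of the rank-k stream proportional to 1/k^alpha.
    weights = [1.0 / (k ** alpha) for k in range(1, n_streams + 1)]
    # Stream lengths drawn uniformly from [2 min, 60 min] (in seconds).
    lengths = [rng.uniform(2 * 60, 60 * 60) for _ in range(n_streams)]
    streams = rng.choices(range(n_streams), weights=weights, k=n_req)
    # Map each request to the high-bandwidth client with probability rhbwr.
    clients = ["high" if rng.random() < rhbwr else "low" for _ in range(n_req)]
    return list(zip(streams, clients)), lengths

requests, lengths = make_workload(n_req=2000)
share_high = sum(c == "high" for _, c in requests) / len(requests)
print(f"high-bandwidth share of requests: {share_high:.2f}")
```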
Delivered quality in LQC is low for any cache size. The aggregate delivered quality in HQC is between 10% and 25% higher than in AQC, but HQC achieves such high quality with 1) 100% to 450% higher network load than AQC, and 2) roughly 10 minutes to one hour of average delay for each request. Furthermore, the rate of improvement in quality as a function of load (i.e., the slope of the lines) is higher for AQC. Clearly, in the HQC strategy, the proxy can start pipelining the stream to the client while the stream is being downloaded from the server. However, this could easily result in frequent hiccups in playback at the client whenever the proxy runs out of data. In this case, the total delay is the same but it is spread over the entire session.

A.1 Impact of Zipf Parameter

The workload with Zipf parameter 0.5 significantly stresses the cache because the popularity distribution is not very skewed. To examine the effect of the Zipf parameter on the candidate strategies, we repeated the previous simulations with a Zipf parameter of 1; the results are shown in Figure 4. As one would expect, all components of the overall cache performance (i.e., delivered quality, network load and delay) improve in the HQC and AQC strategies as the locality of reference in the request sequence increases (i.e., higher Zipf parameter). Moreover, the impact of the cache size on overall cache performance is more visible for higher values of the Zipf parameter. Notice that the observations about the relative comparison of HQC and AQC still hold for higher values of the Zipf parameter. For the rest of this section, we only present results for a request sequence with Zipf parameter 1.

A.2 Degree of Client Bandwidth Heterogeneity

In the previous simulations, we examined the scenario where the degree of bandwidth heterogeneity is maximum (i.e., 1300Kbps/100Kbps). To examine the effect of the degree of client bandwidth heterogeneity on aggregate performance, we repeated our simulations for two higher values of bwl (i.e., 300Kbps & 700Kbps) without changing any other parameters (i.e., Zipf param = 1, bwh = 1300Kbps and bwsp = 700Kbps). Figures 5a,b and 6a,b depict delivered quality as a function of network load for HQC, LQC and AQC for the lower degrees of client bandwidth heterogeneity, 1300Kbps/300Kbps and 1300Kbps/700Kbps respectively.
These results show a few interesting points. The minimum delivered quality increases with bwl in all three strategies; however, delivered quality for LQC is limited by bwsp. As bwl increases, the network load for high values of rhbwr (i.e., the top point of each line) remains relatively constant because a majority of requests are issued by the high bandwidth client. In contrast, the network load for low values of rhbwr (i.e., the bottom point of each line) increases with bwl in all three strategies because most requests are from low bandwidth clients, who now require higher quality. This increase in network load is significantly higher for HQC (especially with small caches) because it maintains multiple copies of each cached stream. Figure 6a shows that the network load does not increase in the HQC strategy when rhbwr goes from 70% to 90% and the cache is large (CF = 0.2). Since bwl is closer to bwh than in the previous cases, the size of low-quality streams is comparable to the size of high-quality streams. When rhbwr is 70%, some low-quality streams are sufficiently popular to stay in the cache and may even flush some high-quality streams. In contrast, when rhbwr becomes 90%, only high-quality streams are cached, thus it is more likely that a requested stream resides in the cache. As a result, the network load does not increase. Figure 6b shows that for any value of rhbwr, delivered quality significantly increases with cache size for the AQC strategy. This shows that AQC can use cache space effectively because a single layered-encoded stream in the cache can provide different qualities. In this case, the high and low bandwidth clients need 4 and 3 layers, respectively. When 4 layers of a popular stream are cached, all requests for that stream can be served from the cache.

B. Per-Client Level Evaluation

Aggregate evaluation does not show how well the cache serves each client. We need to examine performance at a per-client granularity in order to measure: 1) the delivered quality to each client, and 2) the contribution of each client to the network load. Figure 7 presents the results of Figure 6 at the per-client level. Note that each subfigure shows the results of both the AQC and HQC strategies for a single client. These results show the adaptive behavior of the AQC strategy for individual clients, that is, 1) delivered quality substantially increases with cache size, and 2) network load rapidly decreases as rhbwr increases. Figure 7 clearly shows that the AQC strategy can maximize delivered quality to both clients when the cache size is sufficiently large.
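The layer accounting behind this observation can be illustrated with a small sketch. The layer rates match the evaluation setup (100, 200, 400 and 600 Kbps); the helper functions themselves are hypothetical, not the paper's implementation.

```python
# Per-layer bandwidths (Kbps) from the evaluation setup; cumulative
# bandwidths are therefore 100, 300, 700 and 1300 Kbps.
LAYER_BW_KBPS = [100, 200, 400, 600]

def playable_layers(client_bw_kbps):
    """Number of layers whose cumulative bandwidth fits the client link."""
    total = 0
    for i, bw in enumerate(LAYER_BW_KBPS):
        total += bw
        if total > client_bw_kbps:
            return i
    return len(LAYER_BW_KBPS)

def serve_request(client_bw_kbps, cached_layers):
    """Split an AQC-style delivery between cached layers and layers
    pipelined from the server over the server-proxy link."""
    wanted = playable_layers(client_bw_kbps)
    from_cache = min(wanted, cached_layers)
    from_server = wanted - from_cache
    return from_cache, from_server
```

With these rates, a 1300Kbps client can play 4 layers and a 700Kbps client 3 layers, so a single 4-layer cached copy serves both clients entirely from the cache, which is why AQC uses cache space so effectively.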
Therefore, the only gap between the delivered quality of the AQC and HQC strategies is the layering inefficiency. However, the network load of the AQC strategy is at most half that of the HQC strategy.

C. Per-Stream Level Evaluation

We can study our results in further detail by examining the delivered quality and the associated network load on a per-stream basis for each individual client. Figure 8 shows a detailed view of one simulation from Figures 7a,b (points A and B, where CF = 0.2 and rhbwr = 90%) for the AQC strategy. In essence, here we show each axis of Figure 7 separately. Figures 8a,b depict delivered quality for the high and low bandwidth clients, where the delivered quality is divided into two parts: 1) the cache contribution to delivered quality, and 2) the network contribution to delivered quality.
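A minimal sketch of this per-stream bookkeeping, assuming a hypothetical per-request log format (all record field names are illustrative):

```python
from collections import defaultdict

def per_stream_view(log):
    """Aggregate the cache/network split of quality and load per stream.

    `log` is an iterable of hypothetical per-request records:
    (stream_id, layers_from_cache, layers_from_server,
     miss_bytes, prefetch_bytes).
    """
    quality = defaultdict(lambda: [0, 0])  # [cache layers, network layers]
    load = defaultdict(lambda: [0, 0])     # [miss bytes, prefetch bytes]
    for sid, cache_l, net_l, miss_b, pre_b in log:
        quality[sid][0] += cache_l
        quality[sid][1] += net_l
        load[sid][0] += miss_b   # bytes fetched on a cache miss
        load[sid][1] += pre_b    # bytes pipelined/prefetched from the server
    return quality, load
```

Sorting the resulting keys by popularity rank reproduces the kind of per-stream view shown in Figure 8.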
Fig. 3. Zipf 0.5. (a) HQC & LQC and (b) AQC: avg. quality (PSNR) vs. load (bytes), all clients; (c) cache size vs. delay (seconds) for HQC.

Fig. 4. Zipf 1.0. Same panels as Fig. 3.

Fig. 5. BWh/BWl = 1300Kbps/300Kbps. (a) HQC; (b) AQC; (c) delay in HQC.

Fig. 6. BWh/BWl = 1300Kbps/700Kbps. (a) HQC; (b) AQC; (c) delay in HQC.
Figure 8c,d shows contribution of each stream on the network load for high and low bandwidth clients, respectively. In the AQC strategy, we can also divided network load into two components: 1) bytes that are delivered on a cache miss, and 2) bytes that are delivered during pipelining of higher layers from the server. The stream IDs (i.e., x axis)
in all figures are sorted based on the stream popularity. Figure 8a,b clearly show that the 100 most popular streams are delivered from the cache to both clients with the desired quality. The remaining streams are mostly delivered from the network with a lower quality. The network load is mainly caused by request for moderately popular
Fig. 7. Per client view. Avg. quality (PSNR) vs. load (bytes) for HQC and AQC: (a) low bandwidth clients; (b) high bandwidth clients. Points A and B mark CF = 0.2, rhbwr = 90%.

Fig. 8. Per client and per stream view (AQC). Quality (from cache / from network) and load (cache miss / prefetch) vs. stream popularity rank: (a) low bandwidth - quality; (b) high bandwidth - quality; (c) low bandwidth - load; (d) high bandwidth - load.
streams from the high bandwidth clients.

V. RELATED WORK

The MiddleMan architecture [8] is a collection of cooperative proxy servers that collectively act as a video cache for a well-provisioned local network (e.g., a LAN). Video streams are stored across multiple proxies, where they can be replaced at the granularity of a block. The authors examine the performance of the MiddleMan architecture with different replacement policies. One class of caching mechanisms for multimedia streams proposes to cache only selected portions of multimedia streams to improve delivered quality. Clearly, these solutions do not decrease the load on the server (or the network). To smooth out the playback of variable bit rate video streams, work in [9] proposes a technique called
Video staging. The idea is to prefetch and store selected portions of video streams in a proxy to reduce the burstiness of the stream during playback. Sen et al. [10] also present a prefix caching mechanism to reduce startup latency. Work in [11] suggests caching only selected frames of a media stream, based on the encoding properties of the video and the client buffer size, in order to improve robustness against network congestion. Work in [12] presents a caching architecture for multimedia streams, called SOCCER, which consists of a self-organizing and cooperative group of proxies. Work in [13] describes design and implementation issues of a single proxy in the SOCCER architecture that implements the LRU replacement algorithm; it focuses mostly on issues such as segmentation of multimedia streams and request aggregation. Work in [14] studies the layered video caching problem using an analytical revenue model based on a stochastic knapsack. The authors also develop several heuristics to decide which layers of which streams should be stored in the cache to maximize the accrued revenue. In the context of media servers, various strategies for caching multimedia streams in main memory have been studied in prior work [15], [16]. The idea is to reduce disk access by grouping requests and retrieving a single stream to serve the entire group. Tewari et al. [17] present a disk-based cache replacement algorithm for heterogeneous data types, called Resource Based Caching (RBC). RBC considers the impact of the resource requirements of each stream (i.e., bandwidth and space) on cache replacement algorithms. Work in [18] further examined the RBC algorithm and presented a hybrid LFU/interval caching strategy. Most of the previous work on multimedia proxy caching treats multimedia streams similarly to Web objects (i.e., performs atomic replacement). In our earlier work, we presented the overall design of a caching mechanism for layered-encoded streams [19]. We then examined the overall performance of the proposed scheme through simulation in [20] and a prototype implementation in [21].
VI. CONCLUSION AND FUTURE WORK

In this paper, we identify the three-way tradeoff between delivered quality, network load and delay as a key issue in the design and evaluation of multimedia proxy caching mechanisms. Depending on how this tradeoff is leveraged, one could design a spectrum of proxy caching mechanisms for multimedia streams. We presented three candidate strategies that are either conservative (LQC), aggressive (HQC), or adaptive (AQC) with respect to delivered quality, as reasonable points on this design spectrum. We sketched an evaluation methodology to examine the overall performance of each candidate strategy with respect to this basic tradeoff. We then used our proposed methodology to conduct a simulation-based comparison among the candidate strategies. Our simulation results show that HQC can always maximize delivered quality at the cost of high network load and high delay. However, AQC can only deliver the maximum quality when the cache size is sufficiently large; in other cases, AQC results in a lower quality, significantly lower network load, and no delay. LQC always has the minimum load, and its delivered quality is limited by the server-proxy bandwidth.

We plan to continue this work in a couple of directions. First, we are currently evaluating various replacement algorithms to investigate their performance for multimedia proxy caching. Second, we plan to incorporate other factors, such as the size and utility (i.e., impact on quality) of various streams (or layers), into the caching strategy.

REFERENCES

[1] Inktomi Inc., 1999, http://www.inktomi.com.
[2] "InfoLibria MediaMall," 1999, http://www.infolibria.com.
[3] A. R. Reibman and L. Bottou, "Managing drift in a DCT-based scalable video coder," in Proceedings of Data Compression Conference, Snowbird, UT, Mar. 2001, pp. 351-360.
[4] H. Radha and Y. Chen, "Fine-granular-scalable video for packet networks," in Proceedings of Packet Video Workshop, New York, NY, Apr. 1999.
[5] A. R. Reibman, L. Bottou, and A. Basso, "DCT-based scalable video coding with drift," in Proceedings of International Conference on Image Processing, Greece, Oct. 2001 (to appear).
[6] R. Rejaie and J. Kangasharju, "On design and performance evaluation of multimedia proxy caching mechanisms for heterogeneous networks," Tech. Rep., AT&T Labs - Research, July 2001, http://www.research.att.com/ reza/Papers/mc-eval01.ps.
[7] M. Chesire et al., "Measurement and analysis of a streaming-media workload," in USITS, 2001.
[8] S. Acharya and B. C. Smith, "MiddleMan: A video caching proxy server," in NOSSDAV, June 2000.
[9] Y. Wang, Z.-L. Zhang, D. Du, and D. Su, "A network conscious approach to end-to-end video delivery over wide area networks using proxy servers," in Proceedings of INFOCOM, Apr. 1998.
[10] S. Sen, J. Rexford, and D. Towsley, "Proxy prefix caching for multimedia streams," in Proceedings of INFOCOM, 1999.
[11] Z. Miao and A. Ortega, "Proxy caching for efficient video services over the Internet," in 9th International Packet Video Workshop (PVW '99), New York, Apr. 1999.
[12] M. Hofmann, E. Ng, K. Guo, S. Paul, and H. Zhang, "Caching techniques for streaming multimedia over the Internet," Tech. Rep., Bell Laboratories, Apr. 1999.
[13] E. Bommaiah, K. Guo, M. Hofmann, and S. Paul, "Design and implementation of a caching system for streaming media over the Internet," in IEEE Real Time Technology and Applications Symposium, June 2000.
[14] J. Kangasharju, F. Hartanto, M. Reisslein, and K. W. Ross, "Distributing layered encoded video through caches," in Proceedings of IEEE Infocom, Anchorage, AK, Apr. 2001.
[15] A. Dan and D. Sitaram, "Multimedia caching strategies for heterogeneous application and server environments," Multimedia Tools and Applications, vol. 4, pp. 279-312, 1997.
[16] M. Kamath, K. Ramamritham, and D. Towsley, "Continuous media sharing in multimedia database systems," in Proceedings of the 4th International Conference on Database Systems for Advanced Applications, Apr. 1995.
[17] R. Tewari, H. Vin, A. Dan, and D. Sitaram, "Resource based caching for web servers," in Proc. MMCN, San Jose, CA, 1998.
[18] J. M. Almeida, D. L. Eager, and M. K. Vernon, "A hybrid caching strategy for streaming media files," in Proceedings of Multimedia Computing and Networking, San Jose, CA, Jan. 2001.
[19] R. Rejaie, M. Handley, H. Yu, and D. Estrin, "Proxy caching mechanism for multimedia playback streams in the Internet," in Proceedings of the 4th International Web Caching Workshop, San Diego, CA, Mar. 1999.
[20] R. Rejaie, H. Yu, M. Handley, and D. Estrin, "Multimedia proxy caching mechanism for quality adaptive streaming applications in the Internet," in Proc. of INFOCOM, Tel-Aviv, Israel, Mar. 2000.
[21] R. Rejaie and J. Kangasharju, "Mocha: A quality adaptive multimedia proxy cache for Internet streaming," in Proc. of NOSSDAV, New York, June 2001.