Computer Communications 30 (2007) 2213–2224
A user-focused evaluation of web prefetching algorithms

Josep Domènech, Ana Pont, Julio Sahuquillo, José A. Gil

Department of Computer Engineering, Universitat Politècnica de València, Camí de Vera, s/n, 46022 València, Spain

Received 14 August 2006; received in revised form 3 May 2007; accepted 11 May 2007. Available online 18 May 2007.
Abstract

Web prefetching mechanisms have been proposed to benefit web users by hiding download latencies. Nevertheless, to the knowledge of the authors, there has been no attempt to compare different prefetching techniques considering the latency perceived by the user as the key metric. The lack of performance comparison studies from the user's perspective has been mainly due to the difficulty of accurately reproducing the large number of factors that take part in the prefetching process, ranging from the environment conditions to the workload. This paper aims to reduce this gap by using a cost-benefit analysis methodology to fairly compare prefetching algorithms from the user's point of view. This methodology has been used to configure and compare five of the most widely used algorithms in the literature under current and old workloads. In this paper, we analyze the perceived latency versus the traffic increase (both in bytes and in objects) to evaluate the benefits from the user's perspective. In addition, we also analyze the performance results from the prediction point of view to provide insights into the observed performance. Results show that higher algorithm complexity does not improve performance, that object-based algorithms outperform those based on pages, and that object-based algorithms present only minor differences in performance when the cost is measured as object traffic increase.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Web prefetching; Performance evaluation; User-perceived latency
1. Introduction

Knowledge and comprehension of the behavior of a web user are important in a wide range of fields related to web architecture, design, and engineering. The information that can be extracted from web users' behavior makes it possible to infer and predict future accesses. This information can be used, for instance, for improving web usability [1], developing on-line marketing techniques [2], or reducing user-perceived latency [3], which is the main goal of prefetching techniques. These techniques use access predictors to process a user request before the user actually makes it. Several ways of prefetching users' requests have been proposed in the literature: the preprocessing of a request by the server [4], the transfer of the requested object
in advance [5,6], and the pre-establishment of connections that are predicted to be made [7]. Despite the large number of research works focusing on this topic, comparative and evaluation studies from the user's point of view are rare. As a consequence, it is not possible to quantify, in a real working environment, which proposal is better for the user. On the one hand, the underlying baseline system on which prefetching is applied differs widely among the studies. On the other hand, different key performance metrics were used to evaluate their benefits [8]. In addition, the workloads used are in most cases rather old, which significantly affects the prediction performance [9] and makes the conclusions invalid for current workloads. Research works usually compare the proposed prefetching system with a non-prefetching one [10,3,5]. In these studies, different workload and user characteristics have been used, making it impossible to compare the goodness and benefits among proposals. Some papers have been published comparing the performance of prefetching algorithms [11–16]. These studies
mainly concentrate on the performance of the prediction engine, but unfortunately they do not evaluate the user-perceived latency [11–14,16]. In spite of being an essential part of prefetching techniques, performance measured at the prediction level does not completely explain the performance perceived by the user [17,8], since the perceived latency is affected by a large number of factors and not only by the hit ratio. In addition, performance comparisons are rarely made by means of a useful cost-benefit analysis, e.g., latency reduction as a function of the traffic increase.

In this paper, we describe a cost-benefit methodology to perform fair comparisons among web prefetching algorithms. Using this methodology, we compare different prediction algorithms and evaluate their performance using both old and current traces. A deep analysis of the predicted objects has also been carried out in order to identify the main reasons why performance differs among prediction algorithms. This paper has three main contributions: (i) prefetching algorithms are compared from the user's point of view; (ii) analysis results show that performance differences among prediction algorithms are mainly due to the size of the predicted objects; and (iii) results highlight the importance of using current workloads when evaluating the benefits of prefetching in the current web.

This paper extends the work presented in [18] in several ways. A deeper analysis of the algorithms has been carried out, considering the main cost of prefetching, i.e., the increase of requests, both as a higher bandwidth consumption and as a higher load on the server. The performance of prefetching techniques under current versus old workloads is compared. Results show that there is no reason to analyze web prefetching performance using old workloads.

The remainder of this paper is organized as follows. Section 2 gives background information. Section 3 describes the experimental environment used for running the experiments. Section 4 presents the methodology used for evaluating prefetching algorithms. Section 5 analyzes the experimental results. Finally, Section 6 presents some concluding remarks.

2. Background

In this section we discuss recent works that deal with comparisons of web prefetching algorithms, focusing on the implemented algorithms and the metrics used to evaluate their performance.

Dongshan and Junyi [11] compare three versions of a predictor based on Markov chains. They compare the accuracy, the model-building time, and the prediction time while varying the order of the Markov model. In a similar way, Chen and Zhang [12] implement three variants of the PPM predictor. They show how the hit ratio and traffic increase vary as a function of the cutoff threshold of the algorithm. Nanopoulos et al. [13] show a cost-benefit analysis of the performance of four prediction algorithms by comparing
the precision and the recall to the traffic increase by means of a simple table. However, this analysis is only applied to a subset of the workload considered in the paper and ignores how the prediction performance affects the final user, i.e., the user-perceived latency. Bouras et al. [14] show the performance achieved by two configurations of the PPM algorithm and three of the n-most popular algorithm. For each of the five configurations, they quantify the usefulness (recall), the hit ratio (precision), and the traffic increase. Both the number of experiments carried out and the metrics calculated make it difficult to identify which algorithm performs better. They only conclude that prefetching can be potentially beneficial to the considered architecture. In a more recent work [15] they also show the latency reduction for the same experiments. Nevertheless, it is only an estimated latency, calculated as the difference between the prediction time and the user request time. This value can only be considered an approximation of the latency reduction achieved by the algorithm, since it does not include the interactions between user and prefetching requests.

3. Experimental environment

In this section, we first describe the experimental framework used for the performance evaluation study. Second, we present the workload used. Finally, we briefly present the prefetching algorithms taken into account.

3.1. Framework

In [19] we proposed an experimental framework for testing web prefetching techniques. In this subsection we summarize the main features of that environment and the configuration used for carrying out the experiments presented in this paper. The architecture is composed of three main parts (as shown in Fig. 1): the back end (server and surrogate), the front end (client), and optionally a proxy server, which is not used in the experiments presented in this paper. The framework implementation combines both real and simulated parts in order to provide flexibility and accuracy. To perform prefetching tasks, a prediction engine implementing different algorithms has been included on the server side. Clients use the generated predictions to download the corresponding objects in advance.
Fig. 1. Architecture of the simulation environment.
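The hints travel piggybacked on ordinary responses. As a rough illustration, a surrogate could append them as Link headers in the style that Mozilla's prefetching understands [20]; the function below is a hypothetical Python sketch under our own naming assumptions, not the framework's actual code:

def add_prefetch_hints(response_headers, predictions, max_hints=5):
    """Append the predictor's hints to a server response as Link headers.

    `predictions` is assumed to be a list of (uri, confidence) pairs,
    already sorted by confidence; names and limits are illustrative.
    """
    for uri, _confidence in predictions[:max_hints]:
        response_headers.append(("Link", f"<{uri}>; rel=prefetch"))
    return response_headers

# Example: hint one image predicted to be requested next
headers = [("Content-Type", "text/html")]
add_prefetch_hints(headers, [("/news/img/logo.png", 0.9)])
# headers now also contains ('Link', '</news/img/logo.png>; rel=prefetch')

A prefetching-enabled client then reads these headers and schedules the hinted objects for download during its idle time.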
The back end includes the web server and the surrogate server. The framework emulates a real surrogate, which is used to access a real web server. Although the main goal of a surrogate is to act as a cache for the most popular server responses, in this work it is used as a prediction engine. To this end, the surrogate adds new HTTP headers carrying the result of the prediction algorithms to the server response. The server in the framework is an Apache web server set up to act as the original one. For this purpose, a CGI program returns objects with the same size and MIME type as those recorded in the traces.

The front end, or client part, represents the behavior of users exploring the Web with a prefetching-enabled browser. To model the set of users that concurrently access a given server, the simulator can be fed using either real or synthetically generated traces. Simulated clients obtain the results of the prediction engine from the response and prefetch the hinted objects in their idle times, as implemented in Mozilla [20]. The simulator collects basic information for each user request and writes it to a log file. By analyzing this log after the simulation, all performance metrics can be calculated.

3.2. Workload description

The behavior pattern of users was taken from three different logs. Traces A and B were collected during May 2003. They were obtained by filtering the accesses in the log of a Squid proxy of the Polytechnic University of Valencia. Trace A contains accesses to a news web server, whereas trace B contains the accesses to a student information web server. Trace C is the EPA-HTTP data set used in [21–23], publicly available on the Web. Although it is a rather old trace (it dates from 1995), it has been included to illustrate the different performance that prefetching achieves using old and current traces. The main characteristics of the traces are shown in Table 1.

Table 1
Traces characteristics

Characteristics                            Trace A   Trace B   Trace C
Year                                       2003      2003      1995
Users                                      300       132       2330
Page accesses                              2263      1646      20,683
Object accesses                            65,569    36,837    47,748
Training accesses                          35,000    5000      22,000
Avg. objects per page                      28.97     22.38     2.30
Bytes transferred (MB)                     218.09    142.49    285.29
Avg. object size (KB)                      3.41      3.96      6.12
Avg. page size (KB)                        98.68     88.64     14.12
Avg. HTML size (KB)                        32.77     28.21     8.00
Avg. image size (KB)                       2.36      2.83      4.71
Avg. page download time at 1 Mbps (s)      3.01      12.00     1.06
Avg. page download time at 8 Mbps (s)      2.95      11.24     –
As one can observe, the differences between new and old traces are most evident in the number of objects per page and, consequently, in the page size. The training length of each trace has been adjusted to optimize the perceived latency reduction of the prefetching.

3.3. Prefetching algorithms

The experiments were run using five of the most widely used prediction algorithms in the literature: four main variants of the Prediction by Partial Match (PPM) algorithm [21,23,24,11–13] and the Dependency Graph (DG) based algorithm [5,13].

A Markov model is a finite-state machine (FSM) where the next state depends only on the current state. The FSM is represented as a directed cyclic graph where the weight of an arc indicates the probability of making that transition. When applied to the prediction of user accesses, each state represents the context of the user, i.e., their recent past accesses, and the transitions represent user demands. In this sense, the probability attached to each transition is the probability of accessing a given object. The main parameter of a Markov model is the order, which refers to the number of accesses that defines each context.

The PPM prediction algorithm uses Markov models of m orders to store previous contexts. Predictions are obtained by comparing the current context to each Markov model. The PPM algorithm has been proposed to be applied either to each object access [21,23] or to each page (i.e., each container object) accessed by the user [24,11,12]. In addition, two ways of selecting which object or page shall be predicted have been used: predicting the top-n most likely objects [21,11,23] and using a confidence threshold [24,13,12]. The four variants resulting from combining the unit of prediction (object or page) with the way of selecting the candidates (top-n or threshold) are implemented and analyzed. In the remainder of the paper, the variants are labeled PPM-x-y, where x can be OB or PG depending on the chosen unit of prediction (object or page, respectively), and y can be TOP or TH depending on the way of selecting the candidates (the most likely top-n or a threshold, respectively).

The DG prediction algorithm constructs a dependency graph that depicts the pattern of accesses to the objects. The graph has a node for every object that has ever been accessed. As defined in file system prediction [25], and implemented in web prefetching [5,26,13], there is an arc from node A to node B if and only if at some point in time a client accessed B within w accesses after A, where w is the lookahead window size. The weight of the arc is the ratio of the number of accesses to B within a window after A to the number of accesses to A itself. The prefetching aggressiveness is controlled by a cutoff threshold applied to the arc weights. Further details on how the PPM and DG algorithms work can be found in Section 5.4.
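To make the two candidate-selection modes concrete, the following sketch shows a minimal first-order PPM predictor in Python. It is an illustration under our own naming assumptions (the class and method names are hypothetical), not the code used in the framework:

from collections import defaultdict

class PPMPredictor:
    """Minimal sketch of a first-order PPM predictor.

    The framework's real implementation keeps Markov models
    of every order up to m; here a context is a single access.
    """
    def __init__(self):
        self.context_count = defaultdict(int)            # accesses per context
        self.transitions = defaultdict(lambda: defaultdict(int))

    def train(self, prev, current):
        """Record one observed transition prev -> current."""
        self.context_count[prev] += 1
        self.transitions[prev][current] += 1

    def _candidates(self, context):
        """Confidence of each possible next access given the context."""
        total = self.context_count[context]
        return {obj: n / total for obj, n in self.transitions[context].items()}

    def predict_th(self, context, threshold):
        """PPM-x-TH: hint every candidate whose confidence passes the cutoff."""
        return [o for o, c in self._candidates(context).items() if c >= threshold]

    def predict_top(self, context, n):
        """PPM-x-TOP: hint the n most likely candidates."""
        ranked = sorted(self._candidates(context).items(), key=lambda kv: -kv[1])
        return [o for o, _ in ranked[:n]]

The page-based variants train the same structure on container (HTML) accesses only; the TH and TOP variants differ solely in the final selection step.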
4. Evaluation methodology

This section introduces the performance metrics considered and the cost-benefit methodology used in the evaluation of the algorithms.

4.1. Performance indexes

One of the most important steps in a performance evaluation study is the correct choice of the performance indexes. In this work, the performance of the algorithms has been evaluated by using the main metrics related to user-perceived performance, prefetching costs, and prediction performance [17]:

• Latency per page ratio: the ratio of the latency per page that prefetching achieves to the latency with no prefetching. The latency per page is calculated as the time between the browser initiating the GET of an HTML page and the browser receiving the last byte of the last embedded image or object of that page. This metric represents the benefit perceived by the user; the lower its value, the better.
• Traffic increase (ΔTr): the bytes transferred through the network when prefetching is employed divided by the bytes transferred when prefetching is not employed. Notice that this metric includes both the extra bytes wasted by prefetched objects that the user will never use and the network overhead produced by the transfer of the prefetch hints. The variant object traffic increase (ΔTr_ob) measures this cost in number of objects. Both indexes evaluate the costs that prefetching incurs to achieve the benefits; the lower their value, the better.
• Recall (Rc): the ratio of objects requested by the user that were previously prefetched. This metric is the prediction index that best explains the latency per page ratio [17], i.e., it is the benefit of prefetching from the prediction point of view. It ranges from 0 to 1, with 1 being the best value. The variant byte recall (Rc_B) measures the percentage of the requested bytes that were previously prefetched.
• Precision (Pc): the ratio of prefetch hits to the total number of objects prefetched. It ranges from 0 to 1, with 1 being the best value.

As demonstrated in [8], the metrics that concern the prediction are interrelated, as Eq. (1) shows:

ΔTr_ob = 1 − Rc + Rc / Pc    (1)
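As a quick illustration of these definitions and of Eq. (1), the following Python fragment computes the three prediction indexes from raw counters; the function name and the example figures are ours, chosen only for illustration:

def prefetch_metrics(user_requests, prefetched, prefetch_hits):
    """Compute the prediction indexes defined above from raw counts
    (hypothetical counters gathered from a simulator log)."""
    rc = prefetch_hits / user_requests    # Recall
    pc = prefetch_hits / prefetched       # Precision
    tr_ob = 1 - rc + rc / pc              # Eq. (1): object traffic increase
    return rc, pc, tr_ob

# Example: 1000 user requests, 400 objects prefetched, 250 of them used
rc, pc, tr_ob = prefetch_metrics(1000, 400, 250)
print(rc, pc, tr_ob)   # 0.25, 0.625, 1.15

The same ΔTr_ob value follows from direct counting, (1000 − 250 + 400)/1000 = 1.15, which is how the simulator log would yield it.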
4.2. Cost-benefit methodology

Despite the fact that prefetching has also been used for reducing peaks of bandwidth demand [27], its primary goal, i.e., the benefit, is usually the reduction of the user-perceived latency. Therefore, the comparison of prefetching algorithms should be made from the user's point of view and using a cost-benefit analysis. When predictions fail, prefetched objects waste user and/or server resources, which can lead to a performance degradation either for the user himself or for the rest of the users. Since in most proposals the client downloads the predicted objects in advance, the main cost of the latency reduction in prefetching systems is the increase in network load. This increase has two sides: the increase in the amount of bytes transferred and the increase in server requests. We use the traffic increase and the object traffic increase metrics to quantify the former and the latter, respectively. As a consequence, the performance analysis should weigh the benefit of reducing the user-perceived latency against the cost of increasing the network traffic and the number of requests to the server.

For comparison purposes, we have simulated systems implementing the algorithms described above. Each simulation experiment on a prefetching system takes as input the user behavior, the available bandwidth, and the prefetching parameters. The main results obtained are the traffic increase, object traffic increase, and latency per page ratio values. Two different algorithms can only be fairly compared if either the benefit or the cost has the same or a close value. For instance, when two algorithms present the same or very close values of traffic increase, the best proposal is the one that presents the lower user-perceived latency, and vice versa. For this reason, in this paper performance comparisons are made through curves that include different pairs of traffic increase and latency per page ratio for each algorithm. To obtain each point in a curve, we varied the aggressiveness of the algorithm, i.e., how much the algorithm predicts. This aggressiveness is controlled by a threshold parameter in those algorithms that support it (i.e., DG, PPM-OB-TH, and PPM-PG-TH) and by the number of returned predictions in those based on the top-n (i.e., PPM-OB-TOP and PPM-PG-TOP). A plot can gather the curves obtained for each algorithm so that they can be compared. By drawing a line over the desired latency reduction in such a plot, one can obtain the traffic increase of each algorithm; the best algorithm for achieving that latency per page is the one having the lowest traffic increase. We can proceed in a similar way with the object traffic increase metric.
5. Experimental results

5.1. Selecting algorithm parameters

Each evaluated algorithm has a parameter that refers to the scope of the prediction: the lookahead window size in the DG algorithm and the order in the PPM algorithm. To identify which parameter value achieves the best performance, a subset of possible values has been explored by means of several experiments.
To this end, we simulated trace A with the DG and PPM-OB-TH algorithms. The lookahead window of DG has been varied from 2 to 5 (Fig. 2), whereas the maximum Markov chain order of PPM has been varied from 1 to 4 (Fig. 3). Each simulated user has a connection with 1 Mbps of available bandwidth. To build each curve, the cutoff threshold of both algorithms has been varied from 0.2 to 0.7 in steps of 0.1.

Results show that the lookahead window size of the DG algorithm hardly impacts performance, since all curves fall very close together for both cost factors, as shown in Fig. 2. A deeper analysis of the results shows that increasing the lookahead window size has an effect similar to reducing the confidence threshold: the 0.2 threshold point of each curve moves toward the right side as the lookahead window size increases. Therefore, in the following experiments we selected a window size of 2, since it is computationally more efficient.

In a similar way, increasing the maximum order of the Markov model of the PPM predictor does not improve performance. Fig. 3a shows that the curves of higher orders are further from the origin, which means that achieving the same latency reduction requires an extra traffic increase. When considering the object traffic increase (Fig. 3b), the curves are closer together, showing that similar performance is achieved for all the simulated orders. As a consequence, a first-order Markov model is used in the remaining experiments with the PPM algorithm. Similar results were obtained when trace B was used.

Fig. 2. Lookahead window size parameter selection in the DG algorithm. Users with 1 Mbps of available bandwidth and trace A are simulated. Each point in the curves represents a given threshold, from 0.2 (right) to 0.7 (left). (a) Bytes; (b) objects.

Fig. 3. Selection of the maximum order of the Markov model in the PPM-OB-TH algorithm. Users with 1 Mbps of available bandwidth and trace A are simulated. Each point in the curves represents a given threshold, from 0.2 (right) to 0.7 (left). (a) Bytes; (b) objects.

5.2. Old and current workload differences

The following experiments are aimed at showing how web prefetching techniques have become less effective with the evolution of the World Wide Web. In [9], we studied the performance evolution of current and old workloads from the prediction point of view. In this paper, we evaluate the performance from the user's point of view by using the DG and the PPM-OB-TH algorithms. Figs. 4 and 5 illustrate the results for two different user-available bandwidths, representing a DSL and a dial-up user, respectively.

Fig. 4 shows that for 1 Mbps users, web prefetching performance is significantly worse under the current workload (trace A). For instance, permitting a traffic increase of 35%, the DG algorithm reduces the user-perceived latency by about 15% under trace C (see Fig. 4a). However, this latency reduction is only about 8% when using trace A. The same result is obtained if we consider the object traffic increase as the cost metric, as shown in Fig. 4b.
Fig. 4. Performance comparison between old and current workloads. Users of 1 Mbps of available bandwidth are simulated. Each point in the curves represents a given threshold, from 0.2 to 0.7. (a) Bytes; (b) objects.
Fig. 5. Performance comparison between old and current workloads. Users of 48 kbps of available bandwidth are simulated. Each point in the curves represents a given threshold, from 0.2 to 0.7. (a) Bytes; (b) objects.
Looking at Fig. 5, one can observe that the simulation of 48 kbps of available bandwidth per user exhibits a similar trend. In this scenario, the worst case is even worse than in the 1 Mbps one, since the latency per page ratio is greater than 1; i.e., the use of prefetching adversely impacts the user-perceived latency. This performance analysis from the user's point of view confirms our previous results from the prediction engine side [9]: the prefetching techniques proposed in the literature achieve better results when applied to old traces (which are more or less contemporary with the prefetching proposals) than when they are applied to current workloads. One of the reasons for these discouraging results can be found in the dynamic nature of current web applications, which makes user accesses less predictable. Nevertheless, web prefetching techniques become more interesting as the network bandwidth increases, which is the current trend. Therefore, our main objective is to explore the benefits and drawbacks of web prefetching at present, and hence we use current workloads in the analysis; there is no reason to think that prefetching studies should be conducted using old traces. In this paper, we claim that prefetching algorithms should be rethought to take into account the characteristics of the current web.
5.3. Algorithms comparison

Once the algorithm parameters and the workloads have been selected, the algorithms are compared using the methodology described in Section 4.2. The comparisons have been broken down into two groups, since we found that the results of the page-based algorithms (i.e., PPM-PG-TH and PPM-PG-TOP) show some odd behavior; hence their separate analysis. Figs. 6 and 9 show the results of the performance comparison for the object-based algorithms described in Section 3.3, while the results for the page-based algorithms are shown in Fig. 11. Each algorithm is evaluated in four situations, resulting from combining two workloads (i.e., A and B) with two configurations of the user-available bandwidth (i.e., 1 and 8 Mbps). The curves of each plot for the DG and PPM-x-TH algorithms are obtained by varying the confidence threshold of the algorithms from 0.2 to 0.7 in steps of 0.1. To draw the curves of the PPM-x-TOP algorithms, the number of returned predictions is varied from 1 to 9, excluding 6 and 8. Results with a traffic increase or object traffic increase greater than 2 are not represented in order to keep the plots focused on the area where the algorithms can be compared. The sketch below illustrates how each curve is built.
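The following fragment sketches this curve-building procedure in Python; `simulate` stands for a complete simulation run of the framework and is an assumed interface, not part of the published toolset:

def cost_benefit_curve(simulate, thresholds=(0.2, 0.3, 0.4, 0.5, 0.6, 0.7)):
    """Sweep the aggressiveness parameter and collect one
    (traffic increase, latency per page ratio) point per setting."""
    curve = []
    for th in thresholds:
        metrics = simulate(threshold=th)  # assumed to return a dict of indexes
        curve.append((metrics["traffic_increase"], metrics["latency_ratio"]))
    return sorted(curve)  # sorted by cost, ready to plot

For the top-n variants, the same loop runs over the number of returned predictions instead of the threshold.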
Fig. 6. Performance comparison from the user's point of view between object-based algorithms running trace A. Each point in the curves represents a given threshold in PPM-OB-TH and DG, and a given number of returned hints in PPM-OB-TOP. (a) 1 Mbps users, bytes; (b) 1 Mbps users, objects; (c) 8 Mbps users, bytes; (d) 8 Mbps users, objects.
Fig. 6a and b illustrate the performance evaluation of the algorithms simulating users who have 1 Mbps of available bandwidth and behave in accordance with workload A. The first plot shows that, when considering the traffic increase as the key cost, the DG algorithm achieves better performance than the others in the range in which it is evaluated, since its curve always falls below those of the PPM algorithms. However, when the extra traffic is measured by means of the object traffic increase metric (Fig. 6b), no significant performance differences can be found between the algorithms.

The curves of the PPM-OB-TH and PPM-OB-TOP algorithms exhibit a peculiar behavior in Fig. 6a: the PPM-OB-TOP curve seems to continue the curve of PPM-OB-TH. This fact, which is also reflected in the other plots, means that selecting candidates by a cutoff threshold makes fewer predictions than selecting them by the top-n, although the overall performance is not considerably affected. When considering 8 Mbps users, the DG algorithm also outperforms the others when the traffic increase is analyzed, as Fig. 6c shows. However, Fig. 6d reveals minor performance differences between the algorithms in terms of object traffic increase.

From the prediction perspective, the reason why the DG algorithm achieves better performance when considering the traffic increase lies in the fact that it predicts more user requests per extra byte wasted by the prefetching; i.e., for a fixed recall value, DG generates less traffic increase than the other algorithms. Figs. 7a and c show this fact for both user bandwidths.
On the other hand, the reasons for the similar performance exhibited in the comparisons that consider the object traffic increase can be extracted from Fig. 7b and d. They show that, for a fixed object traffic increase, the recall is almost the same for the three algorithms. This implies that the algorithms predict a similar number of user requests per extra request to the server wasted by the prefetching system. Taking into account Eq. (1), one can observe that all the algorithms have the same precision. This, together with the fact that DG requires less traffic increase than the other algorithms to reach the same recall value, lets us conclude that the objects prefetched by DG are, on average, smaller than those prefetched by the other algorithms. This result is also noticeable in Fig. 8, where DG is the algorithm with the lowest byte recall; i.e., for a given object traffic increase, DG predicts the same number of objects as the other algorithms (see Fig. 7b) but fewer bytes (see Fig. 8). Plots for other combinations of workload and available bandwidth show similar results, but are not shown due to space restrictions. The reasons why the size of the predicted objects differs among the algorithms are explained in Section 5.4.

Fig. 9 shows that the algorithms present smaller performance differences when using trace B, for both user-available bandwidths. The DG algorithm slightly outperforms the others when considering the wasted bytes as the cost over its whole range, with the only exception of the most aggressive threshold (i.e., th = 0.2), at which the PPM-OB-TOP algorithm achieves a slightly lower latency with the same traffic
Fig. 7. Performance comparison from the prediction point of view between object-based algorithms running trace A. (a) Prefetching recall versus traffic increase simulating 1 Mbps users; (b) prefetching recall versus object traffic increase simulating 1 Mbps users; (c) prefetching recall versus traffic increase simulating 8 Mbps users; (d) prefetching recall versus object traffic increase simulating 8 Mbps users.
Fig. 8. Byte recall as a function of object traffic increase. Users with 1 Mbps of available bandwidth under workload A are simulated.
increase. When the cost analysis is focused on the number of objects, the plots lead to the same conclusion as when evaluating workload A: performance differences are negligible. To complete this study, we also analyze the results from the prediction perspective. Fig. 10a and c reveal that the recall of the DG algorithm is the highest for a given traffic increase, whereas there are no significant differences for a given object traffic increase (see Fig. 10b and d). This shows that the DG algorithm also predicts smaller objects than the PPM algorithms under workload B, just as occurs with workload A.
Fig. 11 illustrates the performance of the page-based algorithms using both current traces and simulating users with 8 Mbps of available bandwidth. Both plots in the figure refer to the latency versus traffic increase comparison and show almost horizontal curves. Fig. 11a shows that a more aggressive policy does not reduce the perceived latency. Fig. 11b shows that increasing the traffic not only brings no latency reduction but even adversely impacts the perceived latency. These results indicate that the page-based algorithms are not scalable, since they are only cost-effective when working in a non-aggressive manner, i.e., with a high confidence threshold (0.7 or 0.6) and few returned predictions (top-1 at most). From the prediction point of view, these results are explained by the fact that the recall hardly rises when the aggressiveness increases, which drops the precision.

As page-based algorithms predict only HTML files, the object traffic increase is negligible compared to the traffic increase in bytes of the same experiment. That is because HTML objects are, on average, much larger than the other objects in the considered workloads (see Table 1). In this sense, the plots comparing page-based algorithms using the object traffic increase would concentrate almost all points near the left edge of the plot, so they have not been included. For instance, the run of the PPM-PG-TOP algorithm using a top-1 under workload A results in a traffic increase of 25%, while the object traffic increase is just 1.3%. Despite the small value of the object traffic increase, the latency is not reduced with the extra aggressiveness, so the algorithms are not scalable either when the cost analysis focuses on the object traffic increase.
Fig. 9. Performance comparison from the user's point of view between object-based algorithms running trace B. (a) 1 Mbps users, bytes; (b) 1 Mbps users, objects; (c) 8 Mbps users, bytes; (d) 8 Mbps users, objects.
Fig. 10. Performance comparison from the prediction point of view between object-based algorithms running trace B. (a) Prefetching recall versus traffic increase simulating 1 Mbps users; (b) prefetching recall versus object traffic increase simulating 1 Mbps users; (c) prefetching recall versus traffic increase simulating 8 Mbps users; (d) prefetching recall versus object traffic increase simulating 8 Mbps users.
Fig. 11. Performance comparison between page-based algorithms with 8 Mbps users. Each point in the curves represents a given threshold in PPM-PG-TH and a given number of returned hints in PPM-PG-TOP. (a) Workload A, bytes; (b) workload B, bytes.
5.4. Algorithms analysis

To provide insight into why DG predicts smaller objects than PPM, we carefully analyzed the requests contained in the traces. We found that the main reason why the algorithms predict objects of different sizes lies in the fact that DG is more likely than PPM to predict image objects rather than HTML files, which are larger (see Table 1). For illustrative purposes, in this section we use a hypothetical example. Let us suppose that the algorithms are trained by two user sessions. The first one contains the following accesses: HTML1, IMG1, HTML2, and IMG2. The second session contains the accesses: HTML1, IMG1, HTML3, and IMG2. Note that IMG2 is embedded both in HTML2 and in HTML3. We found that this situation is common in the analyzed workloads, especially in workload A, where different pieces of news (i.e., HTML files) contain the same embedded images, since these are part of the site structure. Fig. 12 shows the state of the graph of the current configuration of the DG algorithm after the aforementioned training. Each node in the graph
Fig. 12. State of the graph of the DG algorithm with a lookahead window size of 2 after the accesses HTML1, IMG1, HTML2, and IMG2 by one user; and HTML1, IMG1, HTML3, and IMG2 by another user.
Fig. 13. State of the graph of a first-order PPM algorithm after the accesses HTML1, IMG1, HTML2, and IMG2 by one user; and HTML1, IMG1, HTML3, and IMG2 by another user.
represents an object, whereas the weight of each arc is the confidence level of the transition. The state of the PPM algorithm after the same training is illustrated in Fig. 13. Each node represents a context: the root node is in the first row, the order-0 contexts are in the second, and the order-1 contexts are in the third. The label of each node includes the number of times the context has appeared, so one can obtain the confidence of a transition by dividing the counter of a node by the counter of its parent. The arcs indicate the possible transitions. For instance, the label of IMG2 in the order-0 context is 2 because IMG2 appeared twice in the training. As it appeared once after HTML2 and once after HTML3, IMG2 has two nodes in the order-1 context, one per HTML file on which it depends.

As one can observe in Fig. 12, the DG algorithm can predict the access to IMG2 after the first page, i.e., HTML1 and IMG1, has been accessed. However, Fig. 13 shows that PPM can only predict IMG2 once the user has requested HTML2 or HTML3, but at that point the prediction is useless since there is no time to prefetch the object. For this reason, the DG algorithm is more likely than PPM to predict the embedded objects of the next page, whereas PPM would predict only HTML files after the IMG1 object.
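The example can be reproduced with a few lines of Python; this is a toy reconstruction of the two training structures under the paper's example sessions, not the framework's implementation:

from collections import defaultdict

# The two training sessions of the example above
sessions = [["HTML1", "IMG1", "HTML2", "IMG2"],
            ["HTML1", "IMG1", "HTML3", "IMG2"]]

# DG, lookahead window w = 2: an arc A -> B counts how often B is
# accessed within the two accesses that follow A.
arcs = defaultdict(lambda: defaultdict(int))
accesses = defaultdict(int)
for s in sessions:
    for i, a in enumerate(s):
        accesses[a] += 1
        for b in s[i + 1:i + 3]:
            arcs[a][b] += 1

# First-order PPM: transitions are counted only from the
# immediately preceding access.
ppm = defaultdict(lambda: defaultdict(int))
for s in sessions:
    for a, b in zip(s, s[1:]):
        ppm[a][b] += 1

print({b: n / accesses["IMG1"] for b, n in arcs["IMG1"].items()})
# -> {'HTML2': 0.5, 'IMG2': 1.0, 'HTML3': 0.5}: DG can hint IMG2 in time
print(dict(ppm["IMG1"]))
# -> {'HTML2': 1, 'HTML3': 1}: PPM only predicts IMG2 after HTML2 or
#    HTML3 has been requested, too late to prefetch it

With a 0.5 cutoff threshold, DG would hint HTML2, HTML3, and IMG2 right after IMG1, whereas the first-order PPM can hint IMG2 only from the HTML2 or HTML3 contexts.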
6. Conclusions

In this paper we have described a cost-benefit methodology to evaluate and compare prefetching algorithms from the user's perspective. The methodology has been used to select the most effective algorithm parameters, concluding that a higher algorithm complexity involves more aggressiveness but not better performance. The evaluation methodology also highlights how the performance achieved by prefetching techniques differs depending on the workload age. Results show that there is no reason to think that prefetching studies should be conducted using old traces.

Using the proposed methodology, five prediction algorithms have been implemented and compared. Experimental results show that the DG algorithm slightly outperforms the PPM-OB-TH and PPM-OB-TOP algorithms in most of the studied cases when the traffic increase is considered as the main cost. However, the aggressiveness (and, consequently, the latency reduction) of DG is more limited than that of PPM-OB-TOP. For this reason, when prefetching should not be very aggressive due to bandwidth or computational restrictions, DG achieves the best cost-effectiveness; for more aggressive policies, a larger DG lookahead window or the PPM-OB-TOP algorithm should be used. We have analyzed why the DG algorithm achieves better performance, finding that, on average, it predicts smaller objects than the PPM-based algorithms. This result has been explained intuitively by examining how the algorithms work.

The comparison of the algorithms when the cost analysis focuses on the object traffic increase shows that the differences between the object-based algorithms are negligible. Results also show that the PPM-OB-TOP algorithm is a more aggressive version of PPM-OB-TH offering a similar cost-benefit ratio. Nevertheless, it is more difficult to adapt PPM-OB-TOP to traffic increase restrictions, since the number of returned hints is discrete whereas the threshold is continuous. Page-based algorithms have exhibited a certain odd behavior, since extra aggressiveness involves extra traffic but the same or less latency reduction. This means that the most likely predicted pages reduce the perceived latency, while the remaining ones not only do not reduce the latency but can even degrade performance. In this sense, these algorithms cannot increase their aggressiveness, since they must work with a high confidence threshold to be cost-effective.

Acknowledgments

This work has been partially supported by the Spanish Ministry of Education and Science and the European Investment Fund for Regional Development (FEDER) under Grant TSI 2005-07876-C03-01.

References

[1] E.H. Chi, A. Rosien, G. Suppattanasiri, A. Williams, C. Royer, C. Chow, E. Robles, B. Dalal, J. Chen, S. Cousins, The bloodhound project: automating discovery of web usability issues using the infoscent simulator, in: Proceedings of the ACM CHI 2003 Conference on Human Factors in Computing Systems, Fort Lauderdale, USA, 2003.
[2] J. Srivastava, R. Cooley, M. Deshpande, P.-N. Tan, Web usage mining: discovery and applications of usage patterns from web data, SIGKDD Explorations 1 (2) (2000) 12–23.
[3] D. Duchamp, Prefetching hyperlinks, in: Proceedings of the 2nd USENIX Symposium on Internet Technologies and Systems, Boulder, USA, 1999.
[4] S. Schechter, M. Krishnan, M.D. Smith, Using path profiles to predict HTTP requests, in: Proceedings of the 7th International World Wide Web Conference, Brisbane, Australia, 1998.
[5] V.N. Padmanabhan, J.C. Mogul, Using predictive prefetching to improve World Wide Web latency, Computer Communication Review 26 (3) (1996) 22–36.
[6] R. Kokku, P. Yalagandula, A. Venkataramani, M. Dahlin, NPS: a non-interfering deployable web prefetching system, in: Proceedings of the USENIX Symposium on Internet Technologies and Systems, Palo Alto, USA, 2003.
[7] E. Cohen, H. Kaplan, Prefetching the means for document transfer: a new approach for reducing web latency, Computer Networks 39 (4) (2002) 437–455.
[8] J. Domènech, J.A. Gil, J. Sahuquillo, A. Pont, Web prefetching performance metrics: a survey, Performance Evaluation 63 (9–10) (2006) 988–1004.
[9] J. Domènech, J. Sahuquillo, A. Pont, J.A. Gil, How current web generation affects prediction algorithms performance, in: Proceedings of the 13th International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, Croatia, 2005.
[10] L. Fan, P. Cao, W. Lin, Q. Jacobson, Web prefetching between low-bandwidth clients and proxies: potential and performance, in: Proceedings of the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, Atlanta, USA, 1999, pp. 178–187.
[11] X. Dongshan, S. Junyi, A new Markov model for web access prediction, Computing in Science and Engineering 4 (6) (2002) 34–39.
[12] X. Chen, X. Zhang, A popularity-based prediction model for web prefetching, IEEE Computer 36 (3) (2003) 63–70.
[13] A. Nanopoulos, D. Katsaros, Y. Manolopoulos, A data mining algorithm for generalized web prefetching, IEEE Transactions on Knowledge and Data Engineering 15 (5) (2003) 1155–1169.
[14] C. Bouras, A. Konidaris, D. Kostoulas, Efficient reduction of web latency through predictive prefetching on a WAN, in: Proceedings of the 4th International Conference on Advances in Web-Age Information Management, Chengdu, China, 2003, pp. 25–36.
[15] C. Bouras, A. Konidaris, D. Kostoulas, Predictive prefetching on the web and its potential impact in the wide area, World Wide Web 7 (2) (2004) 143–179.
[16] B. Wu, A.D. Kshemkalyani, Objective-optimal algorithms for long-term web prefetching, IEEE Transactions on Computers 55 (1) (2006) 2–17.
[17] J. Domènech, J. Sahuquillo, J.A. Gil, A. Pont, About the heterogeneity of web prefetching performance key metrics, in: Proceedings of the IFIP International Conference on Intelligence in Communication Systems (INTELLCOMM), Bangkok, Thailand, 2004.
[18] J. Domènech, A. Pont, J. Sahuquillo, J.A. Gil, A comparative study of web prefetching techniques focusing on user's perspective, in: Proceedings of the IFIP International Conference on Network and Parallel Computing (NPC 2006), Tokyo, Japan, 2006.
[19] J. Domènech, A. Pont, J. Sahuquillo, J.A. Gil, An experimental framework for testing web prefetching techniques, in: Proceedings of the 30th EUROMICRO Conference 2004, Rennes, France, 2004, pp. 214–221.
[20] D. Fisher, G. Saksena, Link prefetching in Mozilla: a server driven approach, in: Proceedings of the 8th International Workshop on Web Content Caching and Distribution (WCW 2003), New York, USA, 2003.
[21] R. Sarukkai, Link prediction and path analysis using Markov chains, Computer Networks 33 (1–6) (2000) 377–386.
[22] Q. Yang, J.Z. Huang, M. Ng, A data cube model for prediction-based web prefetching, Journal of Intelligent Information Systems 20 (1) (2003) 11–30.
[23] B.D. Davison, Learning web request patterns, in: Web Dynamics – Adapting to Change in Content, Size, Topology and Use, Springer, 2004, pp. 435–460.
[24] T. Palpanas, A. Mendelzon, Web prefetching using partial match prediction, in: Proceedings of the 4th International Web Caching Workshop, San Diego, USA, 1999.
[25] J. Griffioen, R. Appleton, Reducing file system latency using a predictive approach, Technical report, University of Kentucky, 1994.
[26] Z. Jiang, L. Kleinrock, An adaptive network prefetch scheme, IEEE Journal on Selected Areas in Communications 16 (3) (1998) 358–368.
[27] C. Maltzahn, K.J. Richardson, D. Grunwald, J.H. Martin, On bandwidth smoothing, in: Proceedings of the 4th International Web Caching Workshop, San Diego, USA, 1999.
Josep Domènech received the B.S. degree in Computer Science in 2000, the M.S. degree in Computer Science in 2002, and the M.S. degree in Multimedia Applications in 2005 from the Polytechnic University of València (UPV), and the B.S. degree in Business from the University of València in 2004. He obtained the Best University Student award given by the Valencian Local Government (Spain) in 2001. He has been working toward the Ph.D. degree in the Architecture Research Group of the Department of Computer Engineering of the UPV since 2003. His research interests include Internet architecture, web prefetching, and user characterization.
Ana Pont received her M.S. degree in Computer Science in 1987 and her Ph.D. in Computer Engineering in 1995, both from the Polytechnic University of Valencia (UPV). She joined the Computer Engineering Department of the UPV in 1987, where she is currently a full professor of Computer Architecture. From 1998 until 2004 she was the head of the Computer Science High School of the UPV. Her research interests include multiprocessor architecture, memory hierarchy design and performance evaluation, web and Internet architecture, proxy caching techniques, CDNs, and communication networks. Professor Ana Pont-Sanjuán has published a substantial number of papers in international journals and conferences on computer architecture, networks, and performance evaluation. She has been a reviewer for several journals and regularly participates in the technical program committees of international scientific conferences. She has also participated in a large number of research projects financed by the Spanish Government and the Valencian Local Government. Currently, she leads the Web Architecture Research group at the UPV, where she has advised several Ph.D. theses and is currently supervising six more Ph.D. students. Since January 2005 she has been the Chairperson of the IFIP TC6 Working Group 6.9: Communication Systems for Developing Countries.
Julio Sahuquillo received his B.S., M.S., and Ph.D. degrees in Computer Science from the Polytechnic University of Valencia (UPV), in València, Spain. Since 2002 he has been an associate professor at the Computer Engineering Department of the UPV. His research topics include multiprocessor systems, cache design, instruction-level parallelism, and power dissipation. An important part of his research has also concentrated on the web performance field, including proxy caching, web prefetching, and web workload characterization.
José A. Gil is an associate professor in the field of Computer Architecture and Technology at the Polytechnic University of Valencia, Spain. He teaches at the Computer Engineering School, where he is also a member of the staff. Professor Gil obtained his B.S., M.S., and Ph.D. degrees from the Polytechnic University of Valencia. His current interests include topics related to web systems, proxy caches, and distributed shared memory systems. He is joint author of several books on the subject of computer architecture and has published numerous articles about industrial local area networks, computer evaluation and modelling, proxy cache systems, and web systems. He has participated in numerous research projects financed by the Spanish Government and the Government of the Comunidad Valenciana, and in development projects for different companies and city councils.