Caching Dynamic Content with Automatic Fragmentation
Ikram Chabbouh and Mesaac Makpangou
INRIA Rocquencourt, Team Regal, France
[email protected], [email protected]
Abstract. In this paper we propose a fragment-based caching system that aims at improving the performance of Web-based applications. The system fragments dynamic pages automatically. Our approach consists in statically analyzing the programs that generate the dynamic pages rather than their output, which has the considerable advantage of minimizing the overhead due to fragmentation. Furthermore, we propose a mechanism that increases the reuse rate of the stored fragments so that, among other benefits, the site response time can be improved. We validate our approach by using TPC-W as a benchmark.
1. Introduction
As Web-based applications become increasingly popular, maintaining an acceptable level of performance for these applications becomes critical to businesses. In particular, site response time and availability are key aspects of online satisfaction. Typically, the Web pages that support online applications are computed dynamically, which means that the delays experienced by users are directly affected by server performance and not simply by download times. Moreover, as more requests are made to the servers, the magnitude of user demand often outstrips server capacity: users may be denied access to the server, or the service may become unacceptably slow.

Caching is currently the primary mechanism used to reduce server load as well as the latency and bandwidth consumption observed by users, but caching dynamic pages requires specific techniques since, by their very definition, dynamic pages are not supposed to be cacheable. One approach to caching dynamic pages is fragment-based caching, which consists in considering a page as a container holding distinct objects (called fragments) with heterogeneous characteristics. Recent work on dynamic content caching has proven the advantages of fragment-based schemes ([2], [5]). It goes without saying that the efficiency of a fragment-based caching system is conditioned by the relevance of the fragments it computes.

In this paper, we propose a fragment-based caching system that aims at maximizing the reuse of the stored fragments in order to lighten the application server's burden and minimize the generation delay of the pages. Our system has two distinct functionalities: the first is to fragment the pages of a site automatically, and the second is a proxy cache functionality that takes advantage of the defined fragments to answer clients' requests. The emerging studies on the automatic detection of fragments tend to run the programs generating the dynamic pages several times with the same parameters in order to infer properties that are used to fragment the pages, whereas our approach consists in statically analyzing these programs rather than their output. This approach is more rigorous, insofar as no approximation or assumption is made to determine the fragments. Moreover, our approach has the considerable advantage of operating once and off-line on the "parent" of the generated pages rather than on each instance of the execution, so that no redundant processing is done and no extra traffic is generated on the server. Finally, unlike the other approaches, where the whole page has to be regenerated whenever a part of it changes, the criteria retained to select the fragments make it possible to ask for them separately.

In order to evaluate and prove the effectiveness of the proposed caching system, we conducted a set of experiments
using the TPC-W Benchmark, which is the most popular benchmark for e-commerce applications. The remainder of the paper is organized as follows: Section 2 delves deeper into the fragmentation issue and covers related work. Section 3 explains the underlying principles of our system. Section 4 presents experimental results obtained with the TPC-W Benchmark. Finally, in Section 5 we give our concluding remarks and raise some open questions related to our future work.
2. Background
A dynamic Web page written in a scripting language (see figure 1) typically consists of a number of code blocks, each of which performs some work (such as retrieving or formatting content) and produces an HTML fragment as output [4]. A write-to-out statement, which follows each code block, places the resulting HTML fragment in a buffer.
Figure 1. Dynamic scripting process: each code block of the scripted program (e.g. the ad component, the navigation component) is followed by a write-to-out statement, and their outputs compose the dynamically generated page.
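For concreteness, a page of this kind might be written as the following minimal PHP sketch; the component contents are hypothetical stand-ins for real retrieval and formatting code:

    <?php
    // Code block 1: the "ad" component computes its content and writes the
    // resulting HTML fragment to the output buffer.
    $ad_text = "Special offer";                    // stands in for real ad retrieval
    echo "<div class='ad'>" . $ad_text . "</div>"; // write-to-out statement

    // Code block 2: the "navigation" component writes another fragment.
    $links = array("Home" => "/Home.php", "Search" => "/SearchRequest.php");
    foreach ($links as $label => $url) {           // format each navigation link
        echo "<a href='" . $url . "'>" . $label . "</a>";  // write-to-out statement
    }
    ?>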
Because of personalization and data freshness aspects, dynamic pages are likely not to be fully reused by the cache, so fragmentation turns out to be the best way to ensure the reuse of cached entries. The idea behind fragmentation is to isolate parts of the dynamic page that exhibit potential benefits and thus are cost-effective as cache units [8]. A fragment is then a part of the generated HTML which does not necessarily correspond to a logical entity. Caching at fragment level aims to achieve several benefits, such as the reduction of server load, bandwidth consumption and cache storage space.

In this paper we focus on server CPU time and bandwidth consumption, as they both directly affect the latency observed by clients. Server CPU time and bandwidth consumption are reduced when the response can be retrieved from the cache instead of asking the original server for it again. In the particular context of dynamic pages, CPU time is saved if, instead of generating the whole page, the server only has to generate the missing and stale fragments of the documents already cached. The same is true for bandwidth consumption, as this parameter is reduced if only missing and stale fragments of a cached document are sent rather than the entire document. Therefore, to increase efficiency, the server should be aware of the fragment entity and should be able to serve fragments separately, which is seldom the case for existing sites.
2.1. Related work
Existing fragment-based caching solutions rely on different hypotheses as regards the initial structure of the site. Several studies assume that the pages of the site are already fragmented, which generally implies either that the site is constructed with specific tools that make it possible to create and handle fragments ([10], [9]), or that the administrator does the fragmentation manually ([11], [3]). The first assumption is too restrictive, as existing sites seldom handle fragments originally, while the second is costly, error-prone and simply not scalable. Though there have been considerable efforts to exploit the potential of fragment-based schemes, there has been little research on detecting fragments automatically on existing Web sites. To the best of our knowledge, only two research studies have gone more deeply into the automation of the fragmentation ([7] and [6]). Both studies rely on the generation of a modified HTML tree (an HTML tree roughly corresponds to a structure in which the tags present in the page are internal nodes and the visible text strings are leaves) but differ in the selection criteria of the fragments.

The first study [7] identifies two criteria: i) a fragment is deemed relevant if it is shared among M already existing fragments (where M > 1), and ii) if it has different lifetime characteristics from those of its encompassing candidate fragment. To detect and flag "candidate" fragments in pages, the study proceeds in three steps [8]. First, a data structure (called an Augmented Fragment Tree) representing the dynamic Web pages is constructed. Second, the system applies the fragment detection algorithm to the augmented fragment trees to detect candidate fragments. The algorithm for detecting shared fragments works on a collection of different dynamic pages generated from the same Web site, whereas the algorithm that detects lifetime characteristics works on different versions of each Web page, obtained by repeatedly submitting a single query to the given Web site. In the third step, the system collects statistics about the fragments (such as size, access rates, etc.) that are meant to help the administrator decide whether to activate the fragmentation or not.

The second study [6] considers the relative size of HTML portions (with respect to the size of the whole page) as the prominent factor in the characterization of fragments. The fragmentation algorithm comprises two phases: training and updating. In the training phase, each Web page is analyzed for a period of time. The training algorithm fetches the latest instance of the page from its corresponding URL, parses it and constructs an HTML tree. The tree is then analyzed in order to produce an index tree which contains particular information on every node of the HTML tree. Finally, the training algorithm analyzes the index tree and calculates which areas of the page will be extracted as Web fragments. The update phase begins after the training phase has been completed. The update algorithm proceeds in the same way as the training algorithm except that, afterwards, it checks whether there are differences between the index tree structures (of the same page) computed during the two phases. If there is any difference, the algorithm updates the latest instance of the Web page by calculating the new fragments and storing them.

As the two studies proceed in much the same way, they share much the same disadvantages. First and foremost, both methods request a certain number of versions of the page to be fragmented from the original Web server. This results in at least two drawbacks: the first is the delay required by both systems to fragment a Web page, and the second is the large amount of traffic generated on the application server side which, instead of relieving the servers, puts extra strain on them.
Finally, for both approaches, it is difficult to know when a fragment becomes stale, and even when this is known, it is still difficult to obtain the new version of the fragment as the application is not aware of the fragments calculated.
3. Description of the solution
The objective of our work is to propose a fragmentation that increases the reuse rate of the cached content in order to lower the number of requests that hit the server and to decrease the generation time of the dynamic pages. We also aim to develop the overall fragment-based caching system relying on the defined fragments to ascertain the benefits of the proposed fragmentation. In Section 3.1 we summarize the fragment selection criteria, then in Section 3.2 we give an overview of the solution and detail the architecture of the system.
3.1. Candidate fragments
The fragmentation we propose separates the dynamic content from the static content in dynamic Web pages so that, at least, the static content can be fully reused. This choice is motivated by the observation, made on a large number of dynamic sites, that the redundancy rate of fragments between pages that do not execute the same program is very small. In other words, the pages generated by the respective URLs http://server/path1/pgm1?var=val1 and http://server/path1/pgm1?var=val2 are likely to share many more fragments with each other than with the page generated by the URL http://server/path2/pgm2?var=val. Figure 2 compares the fragments' reuse rate between ten different pages randomly accessed on the popular site of the BBC (www.bbc.co.uk) and ten other distinct pages generated by the same program on the same site. (The fragments were calculated by comparing different versions of the same page in order to localize the HTML portions that are likely to correspond to the output of different scripts.)
Figure 2. Distribution of fragments
We can notice, for instance, that for random URLs more than 80% of the fragments belong to less than two pages, while for the different instances of the page about 70% of the fragments are shared between the ten pages. (We say that two dynamic pages are instances of the same page if they execute the same program, even with different parameters; thus http://server/path/pgm?var=val1 and http://server/path/pgm?var=val2 are instances of the same page.) Thus the fragments' reuse rate for the randomly accessed pages is very low, while it is far higher for different instances of the same page. This is easily understandable, since pages that are generated by the same program share at least all their static parts; then, depending on the scripts' inputs and the variability of the handled data, the dynamic parts can be partially or totally shared. It is therefore interesting to focus on the reuse of fragments between the different instances of a page whenever this is possible.

The fragmentation we propose also selects fragments that can be fetched separately from the server because, as explained in Section 2, great benefits can be achieved when the server is aware of the fragment entity and when fragments can be served separately. In order to be separately fetchable, fragments must correspond to the output of independent programs; the identification of fragments therefore boils down to the detection of independent scripts in the code.
3.2. Architecture of the system
As mentioned previously, our caching system has two functionalities: the first is to fragment the dynamic Web pages on the server side, and the second is to deploy the logic of caching and handling fragments outside the application server. The first functionality is performed by a module located on the application server side, while the second is performed by a reverse proxy cache module positioned in front of the Web server. Figure 3 depicts the different entities interacting with the caching system.

The fragmentation module takes the dynamic pages of the Web site as input, analyzes them, extracts useful information on the scripts, determines the fragments, then augments the pages with fragmenting instructions. The execution of the augmented dynamic pages by the Web server then results in the generation of fragmented pages containing metadata on the embedded fragments. The most interesting element in the metadata calculated is what we call the "fragment filter". A fragment filter is a piece of information, associated with a script, which is used to produce a unique identifier per different output generated by the script.

Particular attention should be paid to the importance of the filters in our system. In order to increase the stored fragments' reuse rate and to lighten the application server's load, the cache should, whenever possible, only ask for the missing fragments of a served page. In the best-case scenario, the cache would know in advance which fragments it is supposed to ask for when it receives a request. But normally, the attribution of an identifier to a fragment is performed after the generation of the page containing the fragment itself, and thus it is difficult to know in advance which fragments a page contains unless the page has already been generated. As will be explained later, the filters provide a generic characterization of the scripts (i.e. not specific to a particular execution), so that it becomes possible for a cache to calculate the identifiers of fragments that should be present in a page that has not yet been requested, provided only that the program generating the page has been executed with different parameters. The following subsections specify the functioning of the system's components.
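To fix ideas, the metadata a fragment carries can be pictured as the following PHP structure; the field names are our own illustrative assumptions, not a normative format:

    <?php
    // Illustrative sketch of the metadata attached to a fragment.
    $fragment_metadata = array(
        "id"       => "Home-1.php",   // identifier of the fragment's program
        "filter"   => array("C_ID"),  // request parameters affecting the output
        "subfrags" => array(),        // nested fragments, if any
    );
    ?>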
3.2.1. The fragmentation module
The fragmentation module statically parses the code of the programs generating the dynamic pages, as this code contains the exact set of variables that are actually used, the operations made on them, and the set of database accesses, and because it still exhibits a clear separation between the static and the dynamic content (unlike the generated pages, in which the code has already been executed and the HTML produced). While parsing the code, the fragmentation module constructs a semantic tree describing the attributes and the dependencies of the different scripts of the page. It goes without saying that, to do so, the module must handle the scripting language in which the program has been written. This is done by parsers that are intended to abstract the content away from the grammar of the language; the current version of our system only includes a PHP parser. The semantic tree produced is then given as input to a program that analyzes it and extracts different sets of information meaningful to the subsequent caching modules. This information is stored in separate files called configuration files.
In this paper, we only concentrate on the information that is relevant to the fragmentation and the construction of filters. It should be noted that our automatic fragmentation inserts tagging instructions into the scripts in order to generate the markup that delimits the fragments and specifies the metadata characterizing them. Thus, we first need to know where to insert the tagging instructions; in practice, this requires determining the begin and end offsets of each fragment. Other important information needed to fragment the pages and calculate their filters is the set of variables used by each script. In this context we distinguish between two types of variables: the script's local variables and the environment variables. The first kind is used to determine the page's independent scripts, as two independent scripts must not share the same local variables. The second kind is used to compute the fragment filter. As the filter is meant to characterize the output of a script in a unique way, we propose finding out the parameters of the request that actually affect the generated result. It is important to recall that the output of a script also depends on updates of the database (if any displayed information is retrieved from a database), but in this paper we only focus on the first kind of dependency, as the second concerns another requirement of caching dynamic pages (i.e. invalidation of the stored content, which is beyond the scope of our study). The relevant parameters are therefore: parameters sent in the GET request, parameters sent in the POST request, cookies, HTTP elements, and CGI elements. A sketch of the resulting instrumentation is given below.
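The exact markup syntax emitted by the tagging instructions is an implementation detail of our tool; as a hypothetical illustration, an instrumented fragment whose output depends only on the GET parameter C_ID could look like this:

    <?php
    // Hypothetical tagging instructions delimiting a fragment and
    // declaring its metadata (identifier and filter).
    echo "<!-- frag-begin id='Home-1.php' filter='C_ID' -->";

    // Fragment body: the generated HTML varies only with $_GET["C_ID"],
    // so C_ID is the single environment variable recorded in the filter.
    $c_id = isset($_GET["C_ID"]) ? $_GET["C_ID"] : "anonymous";
    echo "<p>Welcome back, customer " . htmlspecialchars($c_id) . "</p>";

    echo "<!-- frag-end id='Home-1.php' -->";
    ?>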
Hence, based on the configuration file created, the fragmentation module selects the fragments and determines their attributes (identifier, filter and subfragments). The final step of the fragmentation then simply consists in augmenting the analyzed programs in order to generate fragmented pages and to enable the application to serve fragments separately. The modification consists in the following actions: 1. strip the body of the fragment from the page, surround it with the appropriate tagging instructions and store it as an independent program in the same directory as the page; 2. replace the occurrence of the fragment in the page by an include instruction referring to it. Henceforth, a dynamic page is assimilated to a template containing the references of its underlying fragments.

Let us take the example of figure 4 to illustrate the fragmentation process. Figure 4 depicts a dynamic page generated by calling a PHP program with two arguments. Our system considers the whole static part of the page as a single fragment, and considers the output of each independent script as a separate fragment, thereby fragmenting the page as shown in figure 5. Now let us take the example of figure 6 to illustrate the principle of filters. As we can see in this figure, the output of fragment Home-1.php depends on the variable C_ID, while the output of fragment Home-2.php depends on the variable I_ID (both variables are contained in the query string). The filter associated with the fragment Home-1.php consists in a structure containing the set of labels of the environment variables affecting the script. The fragment key is then constructed by mapping the filter labels to their corresponding values and combining them with the fragment ID (see Section 3.3 for details).
Figure 3. Global view of the interactions: the fragmentation module takes the dynamic pages of the site as input and produces augmented pages deployed on the Web server; the fragment-aware reverse proxy is positioned between the Web server and the clients.

Figure 4. Example of a PHP dynamic page: SearchResult.php, generated by the URL http://server/path/SearchResult.php?SEARCH_TYPE=author&SEARCH_STRING=ted, is composed of static parts (static 1, static 2) and the output of several scripts (elements 1 to 5).

Figure 5. Fragmentation of a dynamic page: the augmented SearchResult.php keeps the static parts and replaces each fragment body (fragments 1 to 5) by an include('SearchResult-N.php') instruction; each fragment body is stored as an independent program such as SearchResult-1.php.
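In the spirit of figure 5, the augmentation can be sketched as follows; this is a simplified reconstruction rather than the exact output of our module:

    <?php
    // SearchResult.php after augmentation: the template keeps the static
    // parts and replaces each fragment body by an include instruction.
    echo "<html><body><h1>Search results</h1>";  // static part, kept inline
    include "SearchResult-1.php";                // fragment stored as its own program
    include "SearchResult-2.php";
    echo "</body></html>";
    ?>

Each included file, such as SearchResult-1.php, contains the stripped fragment body surrounded by its tagging instructions, so the server can execute and serve it independently of the template.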
3.3. Fragment-aware reverse proxy cache
As its name suggests, the fragment-aware reverse proxy cache manipulates fragments as its base entity. To explain the functioning of the reverse proxy cache, we first illustrate it with a concrete scenario, and then give the general algorithm describing the logic. The fragment-aware proxy cache maintains a map indexed by the accessed URLs and containing the fragments' attributes. Let us assume that the reverse proxy receives the following request as the first request to the home page:

GET /Home.php?C_ID=210&I_ID=1 HTTP/1.1

It initially checks whether it has an entry in the map corresponding to the URL "http://server/Home.php"; as the URL is not yet stored, the proxy requests it from the original server. The server sends an augmented response which contains markup delimiting the fragments as well as their metadata (see figure 6). Upon receiving the response, the proxy cache parses it, extracts the embedded name, filter and subfragments, and stores this information in the map. Each entry of the map also has a pointer to a chained list whose nodes describe the different stored instances of the same URL. Each node of the structure contains the actual body (HTML) and the key of the instance (see figure 7). The key of a fragment is a structure containing the current values of the variables stored in the filter. Now, when the proxy receives the following request:

GET /Home.php?C_ID=300&I_ID=1 HTTP/1.1

it locates an entry in the map corresponding to the URL "http://server/Home.php", and as the template (also called the root fragment) is static, it is reused as it is. Next, the proxy checks the subfragments of the root; for Home.php there are two subfragments, "Home-1.php" and "Home-2.php". The two subfragments also have their own entries in the map, and thus, in order to know whether the required instances are already stored, the proxy checks the filters and finds out that the fragments depend respectively on C_ID and I_ID. Based on the current request parameters and the stored filters, the proxy calculates the keys of the fragments that are to be sent in the response. In this case it finds that there is no stored instance of Home-1.php with the value "300" of C_ID, whereas there is already an instance of Home-2.php with the value "1" of I_ID, so it only asks the original server for the first fragment. It then reconstructs the page and sends the response.
Figure 6. Example of dependency: for the requests http://server/path/Home.php?C_ID=210&I_ID=1 and http://server/path/Home.php?C_ID=300&I_ID=1, the output of Home-1.php depends on C_ID and the output of Home-2.php depends on I_ID, while the static parts (static 1 to static 3) are shared.

Figure 7. Structure of the proxy's map: each entry (e.g. Home.php with subfrags Home-1.php and Home-2.php and filter "none"; Home-1.php with no subfrags and filter C_ID) points to a chained list of stored instances, each node holding a key value (e.g. 210, 300), a body buffer and a next pointer.
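A minimal sketch of the key computation performed by the proxy, under our own naming assumptions (the stored filter of Home-1.php being the single label C_ID):

    <?php
    // Combine a fragment's identifier with the request parameters selected
    // by its filter to obtain the key of the required instance.
    function calculate_fragment_key($fragment_id, array $filter, array $query_params) {
        $values = array();
        foreach ($filter as $var) {  // keep only the parameters listed in the filter
            $values[] = isset($query_params[$var]) ? $var . "=" . $query_params[$var] : "";
        }
        return $fragment_id . "?" . implode("&", $values);
    }

    // For GET /Home.php?C_ID=300&I_ID=1, the fragment Home-1.php (filter: C_ID)
    // yields the key "Home-1.php?C_ID=300"; since only an instance with key
    // "Home-1.php?C_ID=210" is cached, this fragment alone is requested from
    // the original server.
    $key = calculate_fragment_key("Home-1.php", array("C_ID"),
                                  array("C_ID" => "300", "I_ID" => "1"));
    ?>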
The following simplified algorithm describes the rationale of the process that handles clients' requests in the reverse proxy cache:
handle_request(request) {
    if (!stored_template) {
        request_server(request);
        analyze_response(response);
        store_response(analyze_output);
        send_response(formatted_response);
    } else {  // the template is stored
        fetch_template(URL);
        lookup_subfragments(template);
        while (subfragments) {
            if (!stored_URL_fragment) {
                request_server(fragment_name, query_string);
                analyze_response(response);
                store_fragment(analyze_output);
            } else {
                extract_filter(fragment);
                calculate_fragment_key(filter, query_string);
                if (!stored_instance) {
                    request_server(fragment_name, query_string);
                    analyze_response(response);
                    store_fragment(analyze_output);
                }
            }
        }
        reconstruct_page(fragments_body);
        send_response(constructed_response);
    }
}
4. Performance evaluation
In order to validate our approach and prove the benefits of the system, we decided to focus on e-commerce as a particular Web-based application. It therefore seemed natural to turn to TPC-W [1] as a benchmark. TPC-W specifies an e-commerce workload that simulates the activities of a retail store Web site. Emulated users can browse and order products from the site. Users are emulated via several Remote Browser Emulators (RBEs),
and all RBEs can be configured to generate different interaction mixes:
• Browsing mix (95% of browsing and 5% of ordering),
• Shopping mix (80% of browsing and 20% of ordering),
• Ordering mix (50% of browsing and 50% of ordering).

We ran our fragmentation module on the pages of the site; it should be noticed that not all the fragments were cacheable. In particular, the fragments modifying the backend database were not cached. The fragmented pages contained three fragments on average, given that the pages were relatively small. As the system aims to lighten the server's load and the generation delay of the pages, we first measured the fragments' reuse rate for the different interaction mixes. As one would expect, the greater the percentage of browsing interactions, the greater the reuse rate. For the browsing mix, the average calculated over ten simulations, run with 500 clients and 1000 items in the database, was about 60%, which means that almost two thirds of the fragments needed per simulation were served from the cache instead of hitting the original server. For the two other mixes, approximately one fragment out of three was retrieved from the cache (the mean percentages were respectively 48% and 39% under the same test conditions). Reusing fragments from the cache also significantly reduces the amount of traffic that flows between the server and the cache; in particular, in browsing mix sessions the system was able to save up to 60% of the bandwidth consumption.

The benefits briefly discussed above have a direct impact on the generation delay of the pages. It is important to stress that the measurements presented in this paper were made on the server and on the cache; hence these are the minimum gains that can be achieved, as the propagation delay over the network is not taken into account (since the proxy is usually assumed to be nearer the clients, the propagation delay should be lower between the cache and the clients). Figure 8 represents the time required by the proxy to answer 100 randomly generated requests to the home page. We can notice that after the first request, the reverse proxy cache response time decreases greatly and remains low. It is worth noting that the cache response time for the first request is not higher than the server response time for the same request. Moreover, the server response time does not decrease over time unless internal caching is used.
Figure 8. Home page requests - server vs. cache: response time per request (seconds) against request number (0 to 100), for the reverse proxy cache and the original server.

Figure 9. SearchResults - server vs. cache: response time (seconds) against request number (0 to 100), for the reverse proxy cache and the original server.
Figure 9 represents the respective response times of the cache and the server when answering the same search requests made by the clients. While the server response time increases linearly with the number of requests, the cache response time increases much more slowly; the difference is even more noticeable for heavy scripts. The more time the scripts take to execute, the higher the savings. Other characteristics, such as the percentage of dynamic content and the number of scripts contained in a page, may also influence the performance. To test the performance of our system under different configurations, we developed a configurable generator of PHP pages. This generator takes values of the above characteristics as input and generates pages accordingly. To give an idea of the performance and limitations of the system, we present the best-case and worst-case scenarios obtained for the test bench considered. In our tests, we kept the average number of scripts in a page constant (i.e. 10 scripts per page) and varied other parameters, such as the percentage of "heavy"/"light" scripts and the redundancy rate of fragments. We call a script heavy if it makes costly requests to the database, and light if it only executes a few instructions that do not access the database (the sketch at the end of this section illustrates the distinction). Figure 10 represents the response time observed when all the fragments of the pages are heavy. We notice that in this case, independently of the fragments' reuse rate, it is always worth fragmenting the pages and asking for the fragments separately.
Figure 10. Response time for heavy scripts
Figure 11. Response time for light scripts
Figure 11 represents the response time observed when all the fragments of the page execute a single printing instruction. Here, we can notice that for the fragments considered, the fragmentation only becomes worthwhile beyond a certain threshold of fragment reuse rate. This stems from the fact that the cost of the function calling the script is no longer made up for by the execution time of the script in question. In fact, the execution time of the include instruction becomes greater than the execution time of the script itself; thus, when the reuse rate is low and most of the fragments have to be generated, the cost of generating the page may even double. This shows that fragments should have a minimum size in order to be cost-effective as cache units.
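As an illustration of the heavy/light distinction used in the test bench above, the generated fragments were of roughly the following two kinds (a sketch; the database schema and connection parameters are hypothetical):

    <?php
    // A "light" fragment: a few instructions and no database access.
    echo "<p>Generated on " . date("Y-m-d H:i") . "</p>";

    // A "heavy" fragment: issues a costly request to the backend database.
    $db   = new PDO("mysql:host=localhost;dbname=shop", "user", "password");
    $rows = $db->query("SELECT title FROM items ORDER BY title")->fetchAll();
    foreach ($rows as $row) {
        echo "<li>" . htmlspecialchars($row["title"]) . "</li>";
    }
    ?>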
5. Discussion and perspectives
In this paper we have proposed a fragment-based caching system for dynamically generated pages. Our approach consists in statically analyzing the code of the programs generating the dynamic pages. Such a static analysis avoids redundant processing and lowers the overhead of fragmentation inasmuch as the entire analysis is made once and off-line on the programs themselves. Special care was taken to increase the fragments' reuse rate: thanks to the calculated filters, our system enables the cache to know in advance the identifiers of the fragments required to construct a page (provided the template of the page is already stored). This results in optimizing the requests, lowering the generation delays and reducing the load on the original server, insofar as only the missing fragments are requested.
One might consider the modification of the site repository a drawback; nonetheless, this is a minor intrusion, as the application logic and processing remain unchanged and only the organization of the pages changes. In future versions of the system, we aim at deploying the fragment-handling logic to a hierarchy of proxies, and we are now working on the specification of the collaboration protocol between the proxies. Furthermore, the current version of the system fully automates the fragmentation. We are considering the possibility of allowing the administrator to modify the automatic selection of fragments if necessary, as human intervention is likely to improve performance, since it leads to a better understanding of the application's particular needs. Finally, we intend to study more closely the effect of fragment characteristics (such as size and execution time) on the performance of the system.
References
[1] http://www.tpc.org/tpcw/default.asp.
[2] Challenger, J., Iyengar, A., Witting, K., Ferstat, C., and Reed, P. A publishing system for efficiently creating dynamic Web content. Proceedings of IEEE INFOCOM 2000 (May 2000).
[3] Challenger, J., Iyengar, A., and Dantzig, P. A scalable system for consistently caching dynamic Web data. Proceedings of IEEE INFOCOM'99, New York (1999).
[4] Datta, A., Dutta, K., Thomas, K. R. H., and VanderMeer, D. Dynamic content acceleration: A caching solution to enable scalable dynamic Web page generation. Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles (SIGOPS) (May 2001).
[5] Datta, A., Dutta, K., Thomas, H., VanderMeer, D., Suresha, and Ramamritham, K. Proxy-based acceleration of dynamically generated content on the World Wide Web: An approach and implementation. ACM SIGMOD 2002 (June 2002).
[6] Misedakis, I., Kapoulas, V., and Bouras, C. Web fragmentation and content manipulation for constructing personalized portals. APWeb 2004, LNCS 3007 (2004).
[7] Ramaswamy, L., Iyengar, A., Liu, L., and Douglis, F. Techniques for efficient fragment detection in Web pages. Proceedings of the 12th International Conference on Information and Knowledge Management (CIKM 2003) (November 2003).
[8] Ramaswamy, L., Iyengar, A., Liu, L., and Douglis, F. Automatic detection of fragments in dynamically generated Web pages. WWW 2004, New York, USA (May 2004).
[9] Yuan, C., Chen, Y., and Zhang, Z. Evaluation of edge caching/offloading for dynamic content delivery. Proceedings of the 12th International Conference on World Wide Web (WWW 2003) (2003).
[10] Yuan, C., Hua, Z., and Zhang, Z. Proxy+: Simple proxy augmentation for dynamic content processing. Tech. rep., Microsoft Research Asia, 2003.
[11] Zhu, H., and Yang, T. Class-based cache management for dynamic Web content. Tech. rep., University of California, Santa Barbara, 2001.