Preprint. Paper published in the 5th International Conference on Web Information Systems Engineering (WISE'2004).
Scaling Dynamic Web Content Provision Using Elapsed-time-based Content Degradation*

Lindsay Bradford, Stephen Milliner, and Marlon Dumas

Centre for Information Technology Innovation
Queensland University of Technology, Australia
{l.bradford, s.milliner, m.dumas}@qut.edu.au
Abstract. Dynamic Web content is increasing in popularity and, by its nature, is harder to scale than static content. As a result, dynamic Web content delivery degrades more rapidly than static content delivery under similar client request rates. Many techniques have been explored for effectively handling heavy Web request traffic. In this paper, we concentrate on dynamic content degradation, believing that it offers a good balance between minimising total cost of ownership and maximising scalability. We describe an algorithm for dynamic content degradation that is easily implemented on top of existing mainstream Web application architectures. The algorithm is based on measuring the elapsed time of content generation. We demonstrate the algorithm’s adaptability against two traffic request patterns, and explore behavioural changes when varying the algorithm’s key parameters. We find that our elapsed-time-based algorithm is better at recognising when the server is unloaded, that the supporting architecture limits the effectiveness of the algorithm, and that the algorithm must be configured pessimistically for best results under load.
1 Introduction
An Internet hosting facility routinely maintains a large and heterogeneous collection of publicly available services. The shared nature of the underlying computing node, in conjunction with the best-effort nature of the public Internet infrastructure, often yields significant delays in processing user requests, especially in the presence of bursty request cycles or during periods of sustained elevated traffic. The shared hosting of Web applications and/or services on a single, possibly clustered, platform implies a strong relationship between the application/service provider and the administrator of the facility. Part of this relationship is the guarantee that service should not be disrupted during events such as shortages in capacity and high fluctuations in demand. Even in the presence of such events, users should ideally experience an almost regular response from the system.

One approach to overcome these problems is to over-provision computing resources in the facility, typically by supplying several duplicated machines that share requests through some load balancer. This solution is not only costly, but also requires increased configuration and administration effort. Multiple other ways exist to address the above problems. One of them, widely used in telecommunications, is to apply denial of service when demand goes beyond designated capacity limits. Another possibility is to create digital islands of caching, fashioned after Content Delivery Networks (CDNs), in order to disseminate Web content.

Web content degradation is yet another approach to scaling Web content delivery. To date, research has mostly focused on static and multimedia Web content, in the form of degrading image and video quality under load. Very little has emerged for dynamic Web content, despite its growing usage. The research we report here is a first attempt at implementing graceful dynamic Web content degradation at a single server using a simple control feedback loop.

The main objectives of our work are: (i) to determine whether the benefits associated with the proposed method for dynamic Web content degradation outweigh the overhead it introduces; (ii) to investigate the behavioural changes when altering the key parameters of our proposal; and (iii) to examine the impact of differing patterns of high user request traffic on a given service. To achieve these objectives, we start by defining a simple and pragmatic framework for measuring service adequacy. We then propose a service degradation method based on a selector of service provisioning approaches parameterised by two tunable thresholds.
* This research is funded in part by the SAP Research Centre, Brisbane.
In order to evaluate this proposal, we undertook a prototyping effort on a cluster of machines and conducted experiments corresponding to different operating settings of an Internet hosting facility, under two patterns of service requests: steady and periodic. The experimental results show that our proposal increases service adequacy and shed light on the impact of varying the key parameters of our proposed method.

This paper extends the approach described in [1]. Here, we introduce a memory-intensive type of content delivery to complement the CPU-intensive type described in our first publication. We also introduce in this paper the modification of the key parameters of the approach selector algorithm.

Section 2 discusses prior work on scalability of service provision. Sections 3 and 4 successively present the proposed service adequacy metric and the technique for controlled service degradation. Section 5 describes the experimental environment, while Section 6 presents experimental results. Finally, Section 7 concludes.
2 Background
When delivering static content such as basic HTML pages, network bandwidth is typically the bottleneck [2]. For dynamic Web content delivery, the resource bottleneck typically shifts to the CPU (see [3], [4], [2] and [5]). Given that static content changes infrequently, basic Internet infrastructure allows caching of the content ‘closer to the user’ (via reverse-proxy caching, Web browser caching, CDNs, edge servers, etc.). This allows a degree of scalability by spreading bandwidth consumption amongst a set of remotely located caches. Resource management for bandwidth under heavy load has also been investigated; approaches typically attempt to limit certain types of bandwidth consumption (see [6] for a comparison of popular approaches) and to implement controlled degradation of service [7]. In times of extreme load, traditional static content caching techniques can be augmented with more active approaches to better meet demand [8]. Peer-to-peer caching schemes [9], adaptive CDNs [10], and cooperative networking models (where clients act as proxy cache servers for newer clients) [2] are examples of approaches that mitigate the effects of heavy request loads for static Web content.

As the Internet continues to grow, there is strong evidence that dynamic content is becoming more prevalent [11]. When attempting to scale delivery of dynamic Web content, caching is less applicable. Basic Internet infrastructure offers limited support for caching the results of dynamic queries, simply because it can make no guarantees about what a future response might be. Data update propagation [3], active query caching [12] and macro pre-processing of HTML [13] are examples of dynamic content caching. However, the applicability of dynamic content caching is limited (see [14] and [15]). Hence, minimising the resource requirements of dynamic Web content during times of heavy load should also be considered when caching is inappropriate.

Graceful service degradation is not a new concept, at least in terms of static Web pages (see [16] and [17]), the delivery of multimedia (e.g. [18]) and time-constrained neural networks (e.g. [19]). Dynamic Web content degradation, however, has only recently been catered for by a small number of adaptive Web architectures. These architectures range from using content degradation as the primary resource management technique across several nodes [20] through to using content degradation alongside a suite of other adaptive service provision techniques [21].

This paper explores a “single server” method for dynamic content degradation, using a popular, existing Web application server. We do this to minimise the interdependence between varying pieces of Web infrastructure and to discover what limitations may exist when using popular existing architectures. We use user-perceived quality of service as our guide, and choose elapsed time of content generation to trigger decisions on when a service must alter its behaviour in delivering content (see [22] and [23]). We draw on these works to determine an appropriate target for the elapsed time of responses, and as support for using content degradation to scale dynamic Web content. We argue that the server is thus more closely aligned with its end-users in how quality of service is judged. Certain parallels exist between our proposal and the automatic Quality of Service (QoS) control mechanisms proposed by Menascé and Mason [24]. Their approach, however, involves modifying the configuration of the application/Web server, which usually cannot be done at runtime as required in case of sudden capacity overflow. In contrast, we focus on what the application developer may do without modifying the underlying server’s configuration or API.
3 Measuring Service Adequacy
We supply a more rounded discussion of what we mean by adequacy in [1], but condense it here to just those points needed to make the paper self-contained. A response to a given Web request is adequate if the response is returned to the client within one second of receiving the request; we refer to this one-second threshold as the client-time-limit. For a response to be adequate, the total end-to-end response time [25] must be under one second. The degree of adequacy of a particular set of measured results is the percentage of adequate responses delivered out of the total measured requests sent, and is concerned only with message latency.

We chose this one-second limit based on human–computer interface research showing that any human–computer interaction taking over a second is a noticeable delay [26]. We could vary this limit for specific circumstances. For example, it may need to be smaller for each machine that delivers part of a composite response to an end user. It could be much larger if the client waiting on the response is a machine, and not prone to boredom. The nature of the content being delivered might also matter. There is evidence (see [26] and [27]) suggesting that the more cognitive effort a user must put into completing a task, the longer they expect to wait for a response. Users are also willing to wait longer for a response if there is obvious visual feedback on the progress being made [23].

Our experiments attempt to simulate the delivery of an average Web page, requiring little cognitive effort from the user, from a server whose preference is to deliver all content for a given URL within a time-frame such that the user will not feel they have waited. Our measure of adequacy does not address the softer concerns of possible user “dissatisfaction” that may arise with a less “complete” service provision.
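For concreteness, the adequacy of a set R of measured requests can be written as follows (the notation is ours, not the paper's):

\[
  \mathit{adequacy}(R) \;=\; 100 \times \frac{\left|\{\, r \in R \;:\; t_{\mathrm{resp}}(r) \le 1\,\mathrm{s} \,\}\right|}{|R|}\ \%
\]

where \( t_{\mathrm{resp}}(r) \) denotes the end-to-end response time of request \( r \), measured from the moment the client sends the request until the full response is received.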
4 Design Considerations
We assume that for processing a given type of request, there are multiple approaches available, each with its own expected system execution time. When request traffic is low, more CPU effort can be devoted to each request, and a heavier approach is used. Conversely, when traffic at the server is high, a lighter approach is executed in an attempt to reduce load on the server and maintain an acceptable response time. Approaches are statically ranked during servlet initialisation in order of expected system execution time, from heaviest to lightest.¹

¹ We have not yet attempted to automatically rank approaches at initialisation.
4.1 The Approach Selector
The approach to be used is determined by an approach selector, as illustrated in Fig. 1, and is implemented as a Java servlet filter. The approach selector assumes that a service provider would prefer to use the most costly approach possible so long as the response is timely, and expects the approaches to be ranked as described above.

The approach selector, at a given point in time, points its cursor at the current approach. The current approach is used to process incoming requests. The approach selector delegates content production to the current approach and measures the elapsed time the current approach takes to generate and deliver a response. This elapsed time is then fed into the current approach's time reporter. The approach selector picks a new approach by querying the reported time of the current approach and comparing this reported time against the two key parameters defined below:

upper-time-limit: The maximum elapsed time we desire for a response to be generated for any received request. Once the reported time of an approach breaches this limit, the approach selector points its cursor at the next cheaper approach, deactivating the current approach. The cheapest approach is never deactivated, regardless of the degree to which the limit is breached.

lower-time-limit: Once the reported time of an approach drops below this limit, the approach selector points its cursor at the next, more expensive approach.

Fig. 1. The Approach Selector

The threshold upper-time-limit differs from the client-time-limit defined earlier in that it does not take into account time outside the control of the server (such as communications overhead). Assuming that the communication delay is modelled as a constant value c, this means that upper-time-limit should be set to client-time-limit − c, so that degradation occurs when inadequate responses (in terms of time) are being generated. This is not an unreasonable assumption to make given our earlier assumption that CPU consumption is the bottleneck, even when using the cheapest approach. We will consider a mixed model where the bottleneck shifts between CPU and bandwidth in future work.
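To make the control loop concrete, the following Java fragment is our own minimal sketch of the selection logic described above; it is not the authors' code. In particular, the Approach interface, the use of the most recent elapsed time as the reported time, and all identifiers are assumptions, and the servlet-filter plumbing and the time reporter's aggregation policy are omitted.

```java
import java.util.List;

/** Minimal sketch (ours, not the paper's code) of elapsed-time-based approach selection. */
public class ApproachSelector {

    /** One way of producing the content for a URL; implementations are ranked
        from most to least expensive. */
    public interface Approach {
        String render(String requestUri) throws Exception;
    }

    private final List<Approach> approaches; // index 0 = most costly, last = cheapest (always active)
    private final long upperTimeLimitMs;     // e.g. 800 ms: breach => move to a cheaper approach
    private final long lowerTimeLimitMs;     // e.g. 400 ms: undershoot => move to a costlier approach
    private int cursor = 0;                  // the current approach
    private long reportedTimeMs = 0;         // here: last observed elapsed time (a simplification)

    public ApproachSelector(List<Approach> rankedApproaches, long upperMs, long lowerMs) {
        this.approaches = rankedApproaches;
        this.upperTimeLimitMs = upperMs;
        this.lowerTimeLimitMs = lowerMs;
    }

    /** Delegate one request to the current approach, measure elapsed time, then
        possibly move the cursor for subsequent requests. */
    public synchronized String handle(String requestUri) throws Exception {
        long start = System.currentTimeMillis();
        String response = approaches.get(cursor).render(requestUri);
        reportedTimeMs = System.currentTimeMillis() - start;

        if (reportedTimeMs > upperTimeLimitMs && cursor < approaches.size() - 1) {
            cursor++;   // too slow: try the next cheaper approach (the cheapest is never deactivated)
        } else if (reportedTimeMs < lowerTimeLimitMs && cursor > 0) {
            cursor--;   // comfortably fast: try the next more expensive approach
        }
        return response;
    }
}
```

In the experiments the selector sits in front of the content-generating servlet as a filter; the sketch keeps only the cursor movement driven by the two thresholds.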
4.2 Application Development Implications
For service degradation to be applicable, several versions of the application providing the service should be available: a “full-service” version and one or more “degraded” versions. The need for these multiple versions thus becomes a requirement in the application development process. In order to reduce the costs of handling this requirement in terms of analysis, design, coding, and testing, (semi-)automated support can be provided. In particular, design-level support could be introduced that would allow designers to provide meta-data about the cost and importance of various processing steps. At the coding level, special directives could be inserted in the dynamic content generation scripts to indicate, for example, when not to generate certain sections of a document. The possibilities are numerous and outside the current scope of this work.
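As a purely hypothetical illustration (the page sections, names and code below are ours, not the paper's), a full-service and a degraded version of the same page could be expressed as two implementations of the Approach interface sketched in Sect. 4.1, with the degraded version simply omitting its most expensive sections:

```java
/** Hypothetical example only: a full and a degraded version of the same page,
    both usable by the ApproachSelector sketched in Sect. 4.1 (same package assumed). */
public class ProductPageApproaches {

    /** Full-service version: generates every section of the page. */
    public static final ApproachSelector.Approach FULL = uri ->
            header(uri) + productDetails(uri) + personalisedOffers(uri) + relatedItems(uri);

    /** Degraded version: skips the two most expensive sections under load. */
    public static final ApproachSelector.Approach DEGRADED = uri ->
            header(uri) + productDetails(uri);

    private static String header(String uri)             { return "<h1>" + uri + "</h1>"; }
    private static String productDetails(String uri)     { return "<div>details...</div>"; }
    private static String personalisedOffers(String uri) { return "<div>offers...</div>"; }   // costly
    private static String relatedItems(String uri)       { return "<div>related...</div>"; }  // costly
}
```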
5 Experimental Environment
We designed an experimental environment to study how to deploy an application to optimise service delivery using approach selection. The experimental environment aims to study:

– the overhead associated with approach selection for graceful service degradation, and whether its costs outweigh its benefits;
– the importance of the algorithm parameters in optimising the approach selection decision;
– the effect of different patterns of traffic being experienced at the server;
– what to expect from memory-intensive vs CPU-intensive service provision in our approach to content degradation.

We choose the URL as the level of granularity for dynamic content degradation. In other words, a number of approaches could be invoked to provide a response for a given URL, depending on the approach selector algorithm.

CPU-intensive content provision is simulated by running a number of loops and performing a floating point division in each loop. We consider four approaches (or versions of a service), corresponding to 3000, 1000, 500 or 100 loops respectively. We use the 3000-loop approach as a baseline. Memory-intensive content provision is simulated by running a number of loops and accessing a random element of a 2K block of memory in each loop. Again, we consider four approaches of 3000, 1000, 500, and 100 loops, with the 3000-loop approach being the memory-intensive baseline. The idea is to simulate a service that spends most of its time parsing requests and inserting response data into templates.
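A sketch of how we read the workload description above is given below; the paper specifies only the loop counts, the floating-point division, and the random access into a 2K block, so the exact per-iteration arithmetic and all names are our assumptions.

```java
import java.util.Random;

/** Sketch of the simulated content-generation workloads described above
    (our reading; the per-iteration work is an assumption). */
public class SimulatedWorkloads {

    static final int[] LOOP_COUNTS = {3000, 1000, 500, 100}; // heaviest (baseline) to lightest
    static final byte[] BLOCK = new byte[2048];              // 2K block for the memory-intensive type
    static final Random RANDOM = new Random();

    /** CPU-intensive approach: one floating-point division per loop iteration. */
    public static double cpuIntensive(int loops) {
        double acc = 0.0;
        for (int i = 1; i <= loops; i++) {
            acc += 1.0 / i;                              // a floating point division each iteration
        }
        return acc;
    }

    /** Memory-intensive approach: one random access into the 2K block per iteration. */
    public static int memoryIntensive(int loops) {
        int acc = 0;
        for (int i = 0; i < loops; i++) {
            acc += BLOCK[RANDOM.nextInt(BLOCK.length)];  // random element of the 2K block
        }
        return acc;
    }
}
```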
Each experiment is conducted using two request traffic patterns. The first, the “steady” request pattern, corresponds to a server under heavy, constant request pressure. The second, the “periodic” pattern, alternates between one-minute periods of request activity (exceeding the baseline’s ability to respond adequately) and one-minute periods of zero traffic. We use this second request pattern to verify that the approach selector exercises its full range of approaches for a given URL.

The experiments are carried out on a set of eight dedicated Sun boxes running Debian Linux on an isolated 100 megabit per second LAN. The testing architecture relies on a single testing server and a single target server. The testing server broadcasts UDP messages to synchronise the activity of clients in order to generate differing request patterns (specifically, to synchronise client machines for the periodic traffic pattern). The target server, a Sparc Ultra-5 with 192 MB of memory, runs the Tomcat 4.1.18 application server, configured with 25 threads, with our approach selection algorithm.

Two of the test client machines are used to generate sufficient traffic to ensure the server is close to, but not actually at, the point where it will refuse incoming connections for the baseline CPU approach. These machines send a request and, once the request is received, immediately disconnect, generating the equivalent of a denial of service attack. Early tests verified that Tomcat 4.1.18 will fully process a request (except actually returning the response) of a client that deliberately breaks a connection early. The remaining four machines send requests and wait for responses before sending new requests. The figures reported later have been derived from the data collected from these four machines. The expectation, when introducing approach selection, is that it should reduce end-to-end response times and allow significantly more timely service responses to be reported from the four sampling machines than when running against the baseline.

Experiments are run for one hour each to ensure the receipt of sufficient sampled data. Statistics are not collected for the first 10 minutes of each experiment to ensure the results are not skewed by warm-up time. We track the end-to-end response time of each client request and report the percentage that fail to fall within our client-time-limit of one second.

For both request patterns there is a target number of processed requests that the approach selector should tend towards. If fewer requests than the target are processed, the approach selector may have been overly optimistic about the workload. Similarly, if more are processed, the approach selector may have been overly pessimistic about the workload. For the steady request pattern, the target is the number of sampling clients times the number of seconds in the 50-minute measurement period, i.e. 4 × 3000, or 12,000 requests. The periodic pattern spends only 25 of the 50 minutes sending requests, so its target is 6,000 requests.
6 Experimental Results
We conducted a total of four experiments. The first experiment runs approach selection with a naïve setting of the key parameters under the periodic request pattern, for both the CPU-intensive and memory-intensive approach types. The second experiment repeats this setup using the steady request pattern. In Experiments 1 and 2, the thresholds upper-time-limit and lower-time-limit, discussed in Sect. 4, are set at 800 and 400 milliseconds (ms) respectively. The upper-time-limit threshold of 800 ms was chosen as a value within our one-second target, with an amount of time (200 ms) set aside to receive and transmit request and response messages. The 400 ms lower-time-limit was chosen simply as half of the upper-time-limit. Experiments 3 and 4 are similar to 1 and 2, but vary the upper and lower time limits in order to determine their effect on service adequacy under the periodic and steady request patterns respectively.

We display our experimental results as either histograms or bar charts. The histograms show response times (in seconds) on the horizontal axis, and plot on the vertical axis the cumulative percentage of responses sent within a given response time limit, thereby measuring service adequacy. The percentage of requests returned in under one second represents our measure of adequacy. The bar charts, on the other hand, show for each approach (i.e. version of the service) the percentage of requests that were processed using that approach in a given experiment.
6.1 Experiment 1: Approach Selection With Periodic Pattern
Fig. 2 shows the results of running approach selection for both memory- and CPU-intensive approach types with the periodic request pattern. Figure 2(a) shows the cumulative percentage of responses returned within a number of seconds and places these adequacy results against their respective baselines. Figure 2(b) shows the actual number of requests processed with a given approach, for both approach selection and the baseline. An extra column shows the target request number the server should tend towards.

Fig. 2. Naïve approach selection using periodic request pattern: (a) adequacy for the periodic pattern; (b) throughput for the periodic pattern

For the default time-limit settings of the approach selector, we see that we achieved an adequacy measure of just under 50% for both approach types. With respect to throughput, both approach types came much closer to our target throughput figure than their corresponding baselines, but still fell somewhat short. Interestingly, a large percentage of the requests were served with the most costly approach, and many of the extra requests (on top of the baseline number) were served with the next most expensive approach. The time-limit values chosen saw the approach selector behave somewhat overly optimistically. A better result would have been to generate fewer 3000-loop responses in favour of more 1000-loop responses.
6.2 Experiment 2: Approach Selection With Steady Pattern
Fig. 3. Naïve approach selection using steady request pattern: (a) adequacy for the steady pattern; (b) throughput for the steady pattern
Fig. 3 shows the results of running approach selection for both memory- and CPU-intensive approach types with the steady request pattern. Figures 3(a) and 3(b) show the same detail for the steady pattern as Figures 2(a) and 2(b) did for the periodic pattern. For the default time-limit settings of the approach selector, we again see an adequacy measure of just under 50% for both approach types. With respect to throughput, both approach types fell far short of our target throughput figure, but were significantly higher than their baselines.

A striking difference is in the response times recorded. The minimum response times recorded for the memory and CPU baselines were 14 seconds and 55 seconds respectively. The difference between the baseline and naïve approach selection is far more pronounced with this request pattern. In these approach selection results, there was a far more even distribution of requests across the various approaches. The time-limit values chosen again saw the approach selector behave overly optimistically. In this case, a better match of approaches to the request traffic would have seen far more of the smaller approaches used than was witnessed.
6.3 Experiment 3: Periodic Request Pattern Varying Time Limits
In this section we present results for the CPU approach type only; memory results were consistent with those obtained for CPU. Figures 4(a) and 4(b) display adequacy and throughput results for the periodic request pattern with varying upper and lower time limits. The limits are reported in the graphs with the upper limit followed by the lower limit.
Fig. 4. Approach selection with periodic pattern, varying time limits (CPU): (a) periodic adequacy with respect to the limits; (b) periodic throughput with respect to the limits
We see from Fig. 4(a) that changes to the lower-time-limit had a significant impact on adequacy. Changes to the upper-time-limit, however, had no noticeable impact. Turning our attention to throughput in Fig. 4(b), the lower-time-limit again makes a significant difference. We note that with a lower-time-limit of 100 milliseconds we processed far more requests than our target of 6000, but that our adequacy figure did not climb above about 75%.

A smaller lower-time-limit forces the approach selector to be more pessimistic about trying a more costly approach than the one it is currently using. An initial study of the throughput might suggest that the approach selector was too pessimistically configured for the request traffic. However, adequacy figures of only 75% suggest that the approach selector was actually overly optimistic. The explanation is that the measuring of elapsed time only begins when a request is handed to the approach selector, and a significant amount of time can pass between receipt of a request by the application server and its subsequent processing by the approach selector.
6.4 Experiment 4: Steady Request Pattern Varying Time Limits
Figures 5(a) and 5(b) display adequacy and throughput results for the steady request pattern, using the same time limits as Figures 4(a) and 4(b). Again, we see similar trends: the lower-time-limit has a significant impact on adequacy and throughput, the upper-time-limit far less so. An initial look at the throughput figures would suggest that a 100 millisecond lower-time-limit is about right for ensuring that the measuring clients complete one request per second. When we look at adequacy measures, however, we again see that the adequacy achievable by changing the limits is capped, and does not reach the levels achieved with the periodic request pattern.
Fig. 5. Approach selection with steady pattern, varying time limits (CPU): (a) steady adequacy with respect to the limits; (b) steady throughput with respect to the limits
7 Concluding Remarks and Future Work
This paper has demonstrated that the principle of controlled dynamic content degradation in times of load can deliver significant performance improvement. In addition, the results provide insights for tuning a hosting facility with “capped” processing ability using content degradation.

We tested a simple, easy-to-implement method for dynamic content degradation using a popular, unmodified application server. As elapsed time is how an end-user typically judges the quality of a Web service offering, our algorithm also uses elapsed time for decisions about changing server behaviour. Our algorithm is a simple control loop with two key parameters we name the lower-time-limit and upper-time-limit. These limits tell the algorithm when to try slower (more expensive) or faster (cheaper) approaches to content generation: once elapsed time drops below the lower-time-limit or rises above the upper-time-limit respectively. The smaller we make these values, the more pessimistic we make the approach selector.

From the results of our experiments, we conclude that:

– varying the lower-time-limit has significant impact: the smaller the limit, the better the service adequacy and throughput obtained;
– varying the upper-time-limit has far less impact on both adequacy and throughput;
– if significant time passes in the application server outside of the period of time measured by the approach selector, very little can be done by tweaking the parameters of the algorithm to lift the adequacy results further.

To maximise the benefits of our approach selector, we need a supporting environment that minimises the effort to receive requests and records as accurately as possible the time of receipt of requests. Perhaps a high-volume front-end “request forwarder” that marks up a request with its receipt time before passing it on to the main application server would be appropriate. Without a supporting environment that offers this, we should err on the side of pessimistically configuring the approach selector, to somewhat compensate for the time the approach selector cannot measure.

We see a number of possible research directions. Firstly, the inclusion of further factors into the experimental setup (e.g. I/O and database access) should be considered. Secondly, we would like to study the benefits of automatically varying the values of the thresholds and dynamically changing the approach selection heuristic itself. A secondary control loop on the lower-time-limit may be a good first step. We envision this requiring a modified application server as outlined above. Thirdly, we would like to investigate ways in which degrading approaches for generating content can be obtained with minimal overhead on the application development process, for example, by extending scripting languages for dynamic content (e.g. JSP, ASP, PHP) with directives to indicate portions of code to be skipped in certain times of overload.
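As a purely speculative sketch of the “request forwarder” idea above (the header name, attribute name and all code are our assumptions, not part of the paper), a servlet filter on the application server could pick up a receipt timestamp stamped by such a front end and expose it to the approach selector, so that queueing time before the selector runs is no longer invisible:

```java
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

/** Illustration only: read a receipt timestamp stamped by a hypothetical
    front-end forwarder (the header name is an assumption) and expose it to the
    approach selector so that time spent queueing inside the server is not missed. */
public class ReceiptTimeFilter implements Filter {

    public static final String RECEIPT_HEADER = "X-Receipt-Time-Millis"; // hypothetical header

    @Override
    public void init(FilterConfig config) { }

    @Override
    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        long receiptTime = System.currentTimeMillis(); // fallback: time seen by this filter
        String stamped = ((HttpServletRequest) req).getHeader(RECEIPT_HEADER);
        if (stamped != null) {
            try {
                receiptTime = Long.parseLong(stamped); // trust the forwarder's earlier timestamp
            } catch (NumberFormatException ignored) {
                // malformed header: fall back to the local receipt time
            }
        }
        req.setAttribute("receiptTimeMillis", receiptTime); // for the approach selector to read
        chain.doFilter(req, resp);
    }

    @Override
    public void destroy() { }
}
```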
References

1. Bradford, L., Milliner, S., Dumas, M.: Varying Resource Consumption to Achieve Scalable Web Services. In: Proceedings of the VLDB Workshop on Technologies for E-Services (TES), Berlin, Germany, Springer Verlag (2003) 179–190
2. Padmanabhan, V., Sripanidkulchai, K.: The Case for Cooperative Networking. In: 1st International Peer To Peer Systems Workshop (IPTPS 2002) (2002)
3. Challenger, J., Iyengar, A., Witting, K., Ferstat, C., Reed, P.: A Publishing System for Efficiently Creating Dynamic Web Content. In: INFOCOM (2) (2000) 844–853
4. Kraiss, A., Schoen, F., Weikum, G., Deppisch, U.: Towards Response Time Guarantees for e-Service Middleware. Bulletin of the Technical Committee on Data Engineering 24 (2001) 58–63
5. Amza, C., Chanda, A., Cecchet, E., Cox, A., Elnikety, S., Gil, R., Marguerite, J., Rajamani, K., Zwaenepoel, W.: Specification and Implementation of Dynamic Web Site Benchmarks. In: Fifth Annual IEEE International Workshop on Workload Characterization (WWC-5) (2002)
6. Lau, F., Rubin, S.H., Smith, M.H., Trajovic, L.: Distributed Denial of Service Attacks. In: IEEE International Conference on Systems, Man, and Cybernetics. Volume 3, Nashville, TN, USA (2000) 2275–2280
7. Singh, S.: Quality of service guarantees in mobile computing. Computer Communications 19 (1996)
8. Iyengar, A., Rosu, D.: Architecting Web sites for high performance. Scientific Programming 10, IOS Press (2002) 75–89
9. Stading, T., Maniatis, P., Baker, M.: Peer-to-Peer Caching Schemes to Address Flash Crowds. In: 1st International Peer To Peer Systems Workshop (IPTPS 2002), Cambridge, MA, USA (2002)
10. Jung, J., Krishnamurthy, B., Rabinovich, M.: Flash Crowds and Denial of Service Attacks: Characterization and Implications for CDNs and Web Sites. In: Proceedings of the International World Wide Web Conference, ACM Press, New York, NY, USA (2002) 252–262
11. Barford, P., Bestavros, A., Bradley, A., Crovella, M.: Changes in Web Client Access Patterns: Characteristics and Caching Implications. Technical Report 1998-023 (1998)
12. Luo, Q., Naughton, J.F., Krishnamurthy, R., Cao, P., Li, Y.: Active Query Caching for Database Web Servers. In Suciu, D., Vossen, G., eds.: WebDB (Selected Papers). Volume 1997 of Lecture Notes in Computer Science, Springer (2001) 92–104
13. Douglis, F., Haro, A., Rabinovich, M.: HPP: HTML Macro-Preprocessing to Support Dynamic Document Caching. In: USENIX Symposium on Internet Technologies and Systems (1997)
14. Yagoub, K., Florescu, D., Issarny, V., Valduriez, P.: Caching Strategies for Data-Intensive Web Sites. In Abbadi, A.E., Brodie, M.L., Chakravarthy, S., Dayal, U., Kamel, N., Schlageter, G., Whang, K.Y., eds.: VLDB 2000, Proceedings of the 26th International Conference on Very Large Data Bases, September 10-14, 2000, Cairo, Egypt, Morgan Kaufmann (2000) 188–199
15. Shi, W., Collins, E., Karamcheti, V.: Modeling Object Characteristics of Dynamic Web Content. Technical Report TR2001-822, New York University (2001)
16. Abdelzaher, T.F., Bhatti, N.: Web Server QoS Management by Adaptive Content Delivery. In: Proceedings of the 8th International World Wide Web Conference, Toronto, Canada (1999)
17. Chandra, S., Ellis, C.S., Vahdat, A.: Differentiated Multimedia Web Services using Quality Aware Transcoding. In: INFOCOM 2000, Proceedings of the Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies. Volume 2, IEEE (2000) 961–969
18. Hutchison, D., Coulson, G., Campbell, A.: Quality of Service Management in Distributed Systems. Chapter 11, Addison Wesley (1994)
19. Stochastic and Distributed Anytime Task Scheduling. In: Tools with Artificial Intelligence. Volume 10, IEEE (1998) 280–287
20. Chen, H., Iyengar, A.: A Tiered System for Serving Differentiated Content. World Wide Web: Internet and Web Information Systems 6, Kluwer Academic Publishers, Netherlands (2003) 331–352
21. Welsh, M., Culler, D.: Adaptive Overload Control for Busy Internet Servers. In: USENIX Symposium on Internet Technologies and Systems (2003)
22. Ramsay, J., Barbesi, A., Peerce, J.: A psychological investigation of long retrieval times on the World Wide Web. Interacting with Computers 10, Elsevier (1998) 77–86
23. Bhatti, N., Bouch, A., Kuchinsky, A.: Integrating user-perceived quality into Web server design. Computer Networks (Amsterdam, Netherlands: 1999) 33 (2000) 1–16
24. Menascé, D.A., Mason, G.: Automatic QoS Control. IEEE Internet Computing 7 (2003) 92–95
25. Menascé, D.A., Almeida, V.A.F.: Capacity Planning for Web Services. Prentice Hall (2001)
26. Miller, R.: Response Time in Man–Computer Conversational Transactions. In: Proc. AFIPS Fall Joint Computer Conference. Volume 33 (1968) 267–277
27. Selvidge, P.R., Chaparro, B.S., Bender, G.T.: The world wide wait: effects of delays on user performance. International Journal of Industrial Ergonomics 29 (2002) 15–20
Copyright Springer Verlag, Proceedings of the WISE'2004 conference.