The Effect of Server Reallocation Time in Dynamic Resource Allocation

M. Al-Ghamdi, A.P. Chester, J.W.J. Xue, S.A. Jarvis∗
Abstract

It is common for Internet service hosting centres to dedicate server resources to different applications such that revenue is maximised through efficient use of the available resources. Dynamic resource allocation has been shown to provide a significant increase in total revenue by reallocating available resources in response to changes in the workload on each application's resources. In this paper multi-tiered enterprise applications are modelled as multi-class closed queuing networks, with each network station corresponding to an application tier. The effects of server reallocation time are evaluated through simulation. The experimental results demonstrate that total revenue decreases as the server reallocation time increases.
1 Introduction
Internet hosting centres are often used for the cost-effective hosting of enterprise applications. Typically these enterprise applications employ a multi-tier architecture, which provides a clear separation of roles between the tiers. Commonly a multi-tier architecture consists of three tiers: a client-facing web tier, an application tier for the application logic, and a data persistence tier that usually comprises a relational database management system (RDBMS). At each tier servers may be clustered to provide high availability and improve performance. An Internet hosting centre may host many multi-tier applications for its clients, each of which will have a separate service level agreement (SLA). The SLA defines the level of service agreed between the client and the hosting centre and may include performance and availability targets, with penalties to be paid if such targets are not met. It is in the interests of the service hosting centre to ensure that its SLAs are met so that it can maximise its revenue, whilst ensuring that its resources are well utilised.

In this work we model a typical enterprise system using a multi-class closed queuing network to compute the various performance metrics. The advantage of using an analytical model is that we can easily capture the different performance metrics and identify potential bottlenecks without running the actual system. The model can also react to parameter changes while the application is running (e.g. from monitoring tools or system logs) and make dynamic server switching decisions to optimise pre-defined performance metrics [20]. Workloads for Internet services have been shown to be bursty, with large variations in demand [1], [3], [22]. Static resource allocation policies may be unable to handle large surges in traffic, leading to SLA violations and reduced revenues.
Dynamic resource allocation systems have been shown to provide a significant increase in revenue in such environments by reallocating servers into a more profitable configuration [20].

∗ Department of Computer Science, University of Warwick, {mhd}@dcs.warwick.ac.uk
There are several considerations in a dynamic resource allocation system. These include the decision interval, which is the time between evaluations of the policy, and the server reallocation time, which is the time taken to reallocate servers. In this paper we focus on how the time taken to reallocate servers between applications affects the total revenue obtained by the system.

Bottlenecks are resources that limit the overall performance of the system [5]; because of their significant impact on overall performance, it is desirable to avoid them. The identification of bottlenecks is also important in tuning studies to evaluate the performance gains of different tuning alternatives [5]. Substantial research has therefore been conducted into bottleneck identification for multi-class closed queuing networks. However, it is non-trivial to predict or identify the system bottleneck, as it can shift between tiers according to changes in the workload mix and depends upon the number of jobs in the network [2]. In this paper we use the approach developed in [5], in which convex polytopes are used for bottleneck identification in multi-class queuing networks.

The specific contributions of this paper are:
• to evaluate the effects of switching duration on a dynamic resource allocation system;
• to review the behaviour of two known switching policies in this context;
• to examine the impact of these effects on revenue maximisation.

The remainder of the paper is structured as follows: section 2 reviews related work; section 3 describes the model of the system and the revenue function; section 4 outlines the idea of a system bottleneck and its identification methodology; section 5 describes the admission control system and the two switching policies used; section 6 presents a description of the experimental setup and results; finally, section 7 concludes the paper.
2 Related Work
Revenue maximisation is a key goal of many dynamic resource allocation systems. In [18] the authors use priority queues to offer differentiated services to different classes of request to optimise company revenue. Different priorities are assigned to different requests based upon their contributions to the revenue. The work in [14] focussed on maximising the profits of best-effort requests when combined with requests requiring a specific quality of service (QoS) in a web farm. In [14] it is assumed that the arrival rates of requests are static, whilst the arrival rates in our work are dynamic. In [9] the authors attempt to maximise revenue by partitioning servers into logical pools and switching servers at runtime. This paper differs from [9] in that we consider switching in a multi-tier environment. Our work in [20] presents the two server switching policies used here: the proportional switching policy and the bottleneck-aware switching policy. The policies are discussed in detail in section 5. In our previous work a fixed switching time was selected, based upon the real-world switching system we developed in [7]. This work extends our previous work by examining the effects of server reallocation time on the revenue achieved by the system. The work in [6] examines the effectiveness of admission control policies in commercial web sites. A simple admission control policy was developed in our previous work [20] and is used again here.
Table 1: Notation used in this paper.

Symbol   Description
S_ir     Service time of a class-r job at station i
v_ir     Visiting ratio of class-r jobs at station i
N        Number of service stations in the queuing network
K        Number of jobs in the queuing network
R        Number of job classes in the queuing network
K_ir     Number of class-r jobs at station i
m_i      Number of servers at station i
φ_r      Revenue of each class-r job
π_i      Marginal probability at centre i
T        System response time
D_r      Deadline for class-r jobs
E_r      Exit time for class-r jobs
P_r      Probability that a class-r job stays
X_r      Class-r throughput before switching
X′_r     Class-r throughput after switching
U_i      Utilisation at station i
t_s      Server switching time
t_d      Switching decision interval time

3 Modelling Multi-tiered Internet Services and Revenue Functions

3.1 The System Model
A multi-tiered Internet service can be modelled using a multi-class closed queuing network [22][19]. The closed queuing network model used in this paper is illustrated in figure 1. In a multi-class closed queuing network, S_ir represents the service time, defined as the average time spent by a class-r job during a single visit to station i, and v_ir denotes the visiting ratio of class-r jobs to station i (the notation used in this paper is summarised in table 1). The service demand D_ir is defined in [13] as the sum of the service times at a resource over all visits to that resource during the execution of a transaction or request (D_ir = S_ir · v_ir). The total population of the network K is the sum of the populations K_r of each class r:

K = Σ_{r=1}^{R} K_r    (1)
In modern enterprise systems servers are often clustered, so both -/M/1-FCFS and -/M/m-FCFS stations must be handled, as each tier in our model is a cluster of servers. The mean response time of a class-r job at station i can be computed as follows [4]:

T_ir(k) = D_ir [ 1 + Σ_{r=1}^{R} K̄_ir(k − 1_r) ],    m_i = 1

T_ir(k) = (D_ir / m_i) [ 1 + Σ_{r=1}^{R} K̄_ir(k − 1_r) + Σ_{j=0}^{m_i − 2} (m_i − j − 1) π_i(j | k − 1_r) ],    m_i > 1    (2)
Figure 1: A Model of a Typical Configuration of a Cluster-based Multi-tiered Internet Service. [Clients (C), clustered web servers (WS), application servers (AS) and database servers (DS).]

Where:
• there are k jobs in the queuing network, for i = 1, . . . , N and r = 1, . . . , R;
• (k − 1_r) = (k_1, . . . , k_r − 1, . . . , k_R) is the population vector with one class-r job fewer in the system.

The mean response time T_i(k) at tier i is the sum of the mean response times of each class at that tier:

T_i(k) = Σ_{r=1}^{R} T_ir(k)    (3)
For the case of multi-server nodes (m_i > 1) it is necessary to compute the marginal probabilities. The marginal probability that there are j jobs (j = 1, . . . , m_i − 1) at station i, given that the network is in state k, is given by [4]:

π_i(j | k) = (1/j) Σ_{r=1}^{R} S_ir v_ir X_r(k) π_i(j − 1 | k − 1_r)    (4)
The throughput of class-r jobs can be calculated using Little's law [13] by dividing the class-r population k_r by the sum over tiers of the visiting ratio v_ir multiplied by the mean response time T_ir(k):

X_r(k) = k_r / Σ_{i=1}^{N} v_ir T_ir(k)    (5)
By applying Little's law again with the Forced Flow law [13], the mean queue length K̄_ir is obtained by multiplying the throughput X_r(k), the mean response time T_ir(k) and the visiting ratio v_ir:

K̄_ir(k) = X_r(k) · T_ir(k) · v_ir    (6)
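Equations (2)–(6) together form a mean-value-analysis (MVA) recursion that can be iterated from the empty network up to the full population K. The following sketch is an illustration only: it implements the single-server case (m_i = 1), with the visiting ratios folded into the service demands (D_ir = S_ir · v_ir); the two stations and demand values are loosely based on pool 1 of table 2 and are assumptions, not the paper's implementation.

```python
from functools import lru_cache

# Hypothetical demands D[i][r] = S_ir * v_ir for a two-tier, two-class network
D = [[0.07 * 1.0, 0.1 * 0.6],        # web tier: silver, gold
     [0.03125 * 1.6, 0.1125 * 0.8]]  # application tier: silver, gold
N = len(D)  # number of stations
R = 2       # number of job classes

@lru_cache(maxsize=None)
def mva(k):
    """Exact multi-class MVA for single-server FCFS stations (m_i = 1).

    Returns (per-class throughputs X_r(k), mean queue lengths Kbar[i][r])."""
    if sum(k) == 0:
        return (0.0,) * R, tuple((0.0,) * R for _ in range(N))
    X = [0.0] * R
    T = [[0.0] * R for _ in range(N)]
    for r in range(R):
        if k[r] == 0:
            continue
        # Population vector with one class-r job removed: k - 1_r
        k_minus = tuple(kr - (1 if s == r else 0) for s, kr in enumerate(k))
        _, Kbar_minus = mva(k_minus)
        for i in range(N):
            # Equation (2), m_i = 1: residence time D_ir * (1 + queue seen on arrival)
            T[i][r] = D[i][r] * (1.0 + sum(Kbar_minus[i]))
        # Equation (5), with visits folded into demands: X_r = k_r / sum_i T_ir
        X[r] = k[r] / sum(T[i][r] for i in range(N))
    # Equation (6) / Little's law: Kbar_ir = X_r * T_ir
    Kbar = tuple(tuple(X[r] * T[i][r] for r in range(R)) for i in range(N))
    return tuple(X), Kbar

X, Kbar = mva((3, 2))  # e.g. 3 silver and 2 gold jobs in the network
```

The recursion visits every population vector below (3, 2) exactly once thanks to the memoisation; queue lengths sum back to the total population, as required in a closed network.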
With the initial conditions K̄_ir(0, 0, . . . , 0) = 0, π_i(0 | 0) = 1 and π_i(j | 0) = 0 for j > 0, the system response time, throughput and mean queue length in each tier can be calculated after K iterations. In multi-class product-form queuing networks, the per-class station utilisation U_ir(k) can be computed using the following equation [15]:

U_ir(k) = k_r D_ir / Σ_{i=1}^{N} D_ir [1 + K̄_i(k − 1_r)]    (7)

3.2 Modelling the Revenue Function
In [17] a session is defined as a sequence of requests of different types made by a single customer during a single visit to a site. Where a client request is met within its deadline the maximum revenue is obtained, while the revenue obtained from requests not served within the deadline decreases linearly to zero, at which point the request exits the system. Equation 8 defines the probability function P(T_r) used in our model, where D_r, T_r and E_r represent the class-r deadline, response time and exit (drop) time respectively:

P(T_r) = 1,                           T_r < D_r
P(T_r) = (E_r − T_r) / (E_r − D_r),   D_r ≤ T_r ≤ E_r    (8)
P(T_r) = 0,                           T_r > E_r
The first part of equation 8 states that the full revenue is contributed by a request that is processed before its deadline D_r. The second part states that between the deadline and the exit time, the revenue fraction earned by the request is the remaining time to the exit point, E_r − T_r, divided by the difference between the exit time E_r and the deadline D_r, so that it decays linearly to zero. A request gains no revenue when its response time T_r is greater than the time E_r at which it exits the system.

With respect to this probability, the gained and lost revenue are calculated. The revenue loss V^i_loss, under the assumption that servers are switched from pool i to pool j, is calculated in equation 9; equation 10 is used to calculate the revenue gain V^j_gain. Note that because the servers are being switched, they cannot be used by either pool during the switching process, and the time the migration takes cannot be neglected. The revenue gain from the switching process is therefore calculated over the switching decision interval t_d less the switching time t_s, where the decision interval is greater than the switching time:

V^i_loss = Σ_{r=1}^{R} X^i_r(k_i) φ^i_r P(T_r) t_d − Σ_{r=1}^{R} X′^i_r(k_i) φ^i_r P(T_r) t_d    (9)

V^j_gain = Σ_{r=1}^{R} X′^j_r(k_j) φ^j_r P(T_r)(t_d − t_s) − Σ_{r=1}^{R} X^j_r(k_j) φ^j_r P(T_r)(t_d − t_s)    (10)

Here X_r and X′_r denote the class-r throughput before and after switching respectively.
After calculating the gained and lost revenue using equations 9 and 10, servers may be switched between the pools. In this paper servers are only switched between the same tiers, and only when the revenue gained is greater than the revenue lost.
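The piecewise revenue fraction of equation 8 and the switching test can be sketched directly; this is a minimal illustration consistent with the description above (full revenue before the deadline, linear decay to zero at the exit time), with hypothetical function names.

```python
def revenue_fraction(T_r, D_r, E_r):
    """Fraction of the class-r revenue phi_r earned at response time T_r
    (equation 8): full revenue before the deadline D_r, linear decay to
    zero at the exit time E_r, nothing after E_r."""
    if T_r < D_r:
        return 1.0
    if T_r <= E_r:
        return (E_r - T_r) / (E_r - D_r)
    return 0.0

def should_switch(v_gain, v_loss):
    """Servers are migrated only when the projected gain in the receiving
    pool (equation 10) exceeds the projected loss in the donating pool
    (equation 9)."""
    return v_gain > v_loss
```

With the pool-1 silver parameters of table 2 (deadline 20 s, exit point 30 s), a response time of 25 s earns half the revenue.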
4 Bottleneck
The work in [12] characterises a bottleneck resource as one for which: (i) short-term demand exceeds capacity; (ii) the work-in-process (WIP) inventory is at its maximum, i.e. the number of jobs waiting in queue L = L(λ, µ) is at its highest, where λ denotes the arrival rate of jobs to a machine and µ its capacity; or (iii) production capacity is at its minimum relative to demand (i.e. the capacity utilisation ρ = λ/µ is at its maximum).

A bottleneck in the system may shift between tiers according to changes in the workload mix and the number of jobs in the system [2]. Bottleneck identification should therefore be one of the first steps in any performance study; any system upgrade which does not remove the bottleneck(s) will have no impact on system performance at high loads [16]. A significant amount of research has addressed the bottleneck identification problem [2] [5] [8] [10] [11]. The work in [2] [8] [11] studies bottleneck identification for multi-class closed product-form queuing networks with an infinite population, while [5] [10] study large populations. Our work in [21] uses the convex polytopes approach to identify the bottleneck in two different pools for a chosen configuration with two classes of jobs (gold and silver). From the results we conclude that the bottleneck may occur at any tier and may shift between tiers; there is also the possibility that the system enters the crossover-points region, where more than one tier becomes a bottleneck. This method can compute the set of potential bottlenecks in a network with one thousand servers and fifty customer classes in just a few seconds.
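For a single fixed workload mix, the station that saturates first as the population grows is the one with the largest aggregate service demand; the convex-polytope method of [5] generalises this across all mixes. The following sketch illustrates only the fixed-mix special case, with an assumed equal weighting of classes.

```python
def potential_bottleneck(demands):
    """Index of the station with the largest total service demand.

    demands[i][r] = D_ir = S_ir * v_ir. Under a fixed per-class mix the
    highest-demand station saturates first as the job population grows;
    mix-dependent bottleneck sets require the polytope analysis of [5]."""
    totals = [sum(station) for station in demands]
    return max(range(len(totals)), key=totals.__getitem__)
```

For three hypothetical stations with demands [[0.07, 0.06], [0.05, 0.09], [0.06, 0.02]], the application tier (index 1, total demand 0.14) is the potential bottleneck.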
5 Admission Control and Server Switching Policies
Overloading can cause a significant increase in the response time of requests, which leads to a marked degradation in revenue. Admission control is a possible solution to the overloading problem. A simple admission control policy was developed in our previous work [20] and is used again in this research: less valuable requests are dropped when the response time exceeds a threshold value.

Due to the variation in demand for an online service, it is difficult to predict future workload. In a statically allocated system comprised of many static server pools, a high workload may exceed the capacity of a pool, causing a loss in revenue, while lightly loaded pools represent wasted resources. The policies which we examine here are the proportional switching policy (PSP) and the bottleneck-aware switching policy (BSP), both developed in [21].
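The admission control policy described above can be sketched as a single predicate; the threshold and minimum-value parameters are illustrative assumptions, not values from [20].

```python
def admit(request_value, response_time, threshold, min_value):
    """Simple admission control in the spirit of [20]: when the measured
    response time exceeds the threshold, drop requests whose revenue
    contribution is below min_value; otherwise admit everything."""
    if response_time <= threshold:
        return True
    return request_value >= min_value
```

For example, with a 5-second threshold a silver request (revenue 2) is dropped once response times reach 6 seconds, while a gold request (revenue 20) is still admitted.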
5.1 The Proportional Switching Policy
The proportional switching policy used here is shown in algorithm 1 and was first presented in [21]. This policy works by allocating servers at each tier proportionally according to workload, subject to an improvement in revenue.
5.2 The Bottleneck-aware Switching Policy
Many factors contribute to the performance of a system. The bottleneck-aware switching policy attempts to counter some of the factors which affect the system negatively, in order to improve upon both the static allocation and the proportional switching policy. The bottleneck-aware switching policy is a best-effort algorithm; it may not find the optimal server allocation. The bottleneck identification phase applies when a bottleneck is detected in either of the pools. If a bottleneck is detected at the same tier in each pool, migrating servers at that tier will not remove the bottleneck. If a bottleneck exists within a single pool, servers are migrated to remove it, subject to a revenue improvement. The local search algorithm (algorithm 3) applies when there is no bottleneck saturation in either pool. The algorithm uses a nested loop to evaluate server migrations, starting from the web tier, moving to the application tier, and finally evaluating the database tier. The revenue gain is computed at each stage, with the highest-revenue state being chosen. Due to the small number of classes in modern enterprise systems, solving the multi-class closed queueing network is very quick [21]. The time complexity of the algorithm is O(m_0 m_1 m_2), where m_0, m_1 and m_2 are the total numbers of web, application and database servers in all pools.

Algorithm 1: The proportional switching policy
Input: N, m_i, R, K_ir, S_ir, v_ir, φ_r, t_s, t_d
Output: server configuration
for each i in N do
    m_i^1 / m_i^2 = K^1 / K^2
end for
calculate V_loss and V_gain using eq. 9 and eq. 10
if V_gain > V_loss then
    do switching according to the calculations; S_ir ← S′_ir
else
    server configuration remains the same
end if
return current configuration

Algorithm 2: The bottleneck-aware switching policy
Input: N, m_i, R, K_ir, S_ir, v_ir, φ_r, t_s, t_d
Output: new configuration
while bottleneck saturation found in one pool do
    if found at same tier in the other pool then
        return
    else
        switch servers to the bottleneck tier; m_i ← m′_i and S_ir ← S′_ir
    end if
end while
search configurations using Algorithm 3
return current configuration
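The proportional allocation step of the proportional switching policy (m_i^1/m_i^2 = K^1/K^2 at each tier) can be sketched as follows; the function name and the constraint of keeping at least one server per pool are assumptions for the illustration.

```python
def proportional_split(m_total, k1, k2):
    """Split the m_total servers at one tier between two pools in
    proportion to their populations K1 and K2 (the per-tier step of the
    proportional switching policy), keeping at least one server each."""
    m1 = round(m_total * k1 / (k1 + k2))
    m1 = max(1, min(m1, m_total - 1))
    return m1, m_total - m1
```

For example, with 10 application servers and populations of 150 and 50 jobs, the split is 8 and 2 servers; the policy would only adopt such a configuration if the revenue gain of equation 10 exceeds the loss of equation 9.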
6 Experimental Setup and Results
In this paper we simulate two applications running as two logical pools. Each application is multi-tiered, with each tier comprising a cluster of servers. Each application also has two classes of request, gold and silver, which represent the value of each request. The service time S_ir and the visiting ratio v_ir are chosen based on realistic values or taken from the supporting literature. Table 2 summarises the main experimental parameters. The focus of the experimentation is to investigate how the time taken to reallocate servers affects the revenue derived from the system. We have fixed the decision interval for the policies at 60 seconds, and experimented with reallocation times of 5 to 55 seconds. We have conducted these experiments under two inversely proportional workloads, as shown in figures 2 and 3.
Algorithm 3: The configuration search algorithm
Input: N, m_i, R, K_ir, S_ir, v_ir, φ_r, t_s, t_d
Output: best configuration
Initialisation: compute U_i^1, U_i^2
while U_0^1 > U_0^2 do
    if m_0^2 > 1 then
        m_0^2 ↓, m_0^1 ↑; S_0r^2 ← S_0r^2′
        while U_1^1 > U_1^2 do
            if m_1^2 > 1 then
                m_1^2 ↓, m_1^1 ↑; S_1r^2 ← S_1r^2′
                while U_2^1 > U_2^2 do
                    if m_2^2 > 1 then
                        m_2^2 ↓, m_2^1 ↑; S_2r^2 ← S_2r^2′; compute V_loss using eq. 9
                        S_2r^1 ← S_2r^1′; compute V_gain using eq. 10
                        if V_gain > V_loss then
                            store current configuration
                        end if
                        compute new U_i^1, U_i^2
                    end if
                end while
                (similar steps for U_2^1 < U_2^2)
                S_1r^1 ← S_1r^1′; compute new U_i^1, U_i^2
            end if
        end while
        (similar steps for U_1^1 < U_1^2)
        S_0r^1 ← S_0r^1′; compute new U_i^1, U_i^2
    end if
end while
(similar steps for U_0^1 < U_0^2)
return best configuration
Under workload one both policies show that revenue decreases as the server reallocation time increases. An increase in reallocation time decreases the amount of time that servers are available to service requests, thus reducing the revenue obtained.

Figure 4 shows the results of the proportional switching policy under workload one. The policy demonstrates a linear decrease in revenue as the reallocation time increases from 5 to 30 seconds. At 35 seconds there is a significant reduction in revenue, after which revenue increases slightly as the reallocation time increases further. The behaviour of the policy changes throughout the experiment, with the policy migrating many servers when the reallocation time is small and fewer as the time increases. This is because the reallocation time enters the revenue-gain calculation of equation 10. The proportional switching policy demonstrates an improvement in revenue over the static allocation at all reallocation durations. The use of admission control has no effect on the revenue generated by the policy.

The bottleneck-aware switching policy results are shown in figure 5. This policy most clearly shows a linear relationship between the reallocation time and the revenue generated. The linear relationship is preserved with or without the use of the admission control policy. The policy demonstrates significant improvements in revenue over a statically allocated system. Using the
Table 2: The main experimental parameters.

                              Pool 1            Pool 2
                           Silver    Gold    Gold    Silver
Service time      WS       0.07      0.1     0.05    0.025
(sec)             AS       0.03125   0.1125  0.01    0.06
                  DS       0.05      0.025   0.0375  0.025
Visiting          WS       1.0       0.6     1.0     0.8
ratio             AS       1.6       0.8     2.0     1.0
                  DS       1.2       0.8     1.6     1.6
Deadline (sec)             20        15      6       8
Exit point (sec)           30        20      10      12
Revenue unit               2         10      20      4
Number of         WS            4                5
servers           AS            10               15
                  DS            2                3
admission control policy, the system generates less revenue; our work in [21] outlines how aggressive admission control negatively affects a system under light load.

Under the second workload, the proportional switching policy (figure 6) performs worse than a static allocation at all intervals with the exception of 55 seconds. It should be noted, however, that the maximum reduction in revenue is 4.77%. Initially the policy behaves as expected, with a linear decrease in revenue; however, the revenue increases from 25 seconds onwards as fewer servers are migrated, due to the reduced revenue gained from making further migrations. In this scenario the use of admission control has a slight negative impact on the revenue obtained through the use of the policy.

The bottleneck-aware switching policy is the best-performing policy under workload two (figure 7). It provides a significant improvement in the revenue generated by the system. The improvement decreases linearly to 30 seconds, drops at 35 seconds, and then decreases linearly again. The large drop in revenue at 35 seconds is caused by an increase in the number of server migrations, from 12 at a 30-second duration to 15 at a 35-second duration. Under the second workload the use of the admission control policy enhances the revenue generated at all reallocation times, although the enhancement is reduced once the reallocation time increases beyond 30 seconds.

The findings from these experiments suggest:
1. that the reduction in revenue due to increased reallocation times generally holds over all server switching policies;
2. that minimising the reallocation time is therefore crucial;
3. that optimising the reallocation time will be application specific; however, it may be possible to identify common traits (e.g. queue minimisation) that hold across a wide range of applications, which will provide a focus for future optimisation research.
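The intuition behind the broadly linear decline can be sketched in terms of server availability: each migrated server contributes nothing for t_s seconds of the t_d-second decision interval. The numbers below are illustrative assumptions (3 migrations per interval; 39 servers, as in table 2), not measurements from the experiments.

```python
def available_fraction(t_d, t_s, migrated, total):
    """Fraction of aggregate server time available for service in one
    decision interval when `migrated` of `total` servers each lose t_s
    seconds to reallocation; illustrates why revenue falls roughly
    linearly as the reallocation time grows."""
    lost = migrated * t_s
    return 1.0 - lost / (total * t_d)

# Sweep over the paper's reallocation times with t_d fixed at 60 seconds
fractions = [available_fraction(60, ts, 3, 39) for ts in range(5, 56, 5)]
```

The sweep declines monotonically, matching the broadly linear revenue reduction observed for both policies; it does not capture the policy-specific effects (such as migrating fewer servers at long reallocation times) that produce the deviations at 35 seconds.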
Figure 2: Workload One. [Plot of workload against time (mins) for Applications 1 and 2.]
7 Conclusion
In this paper we have modelled an Internet service provider as a collection of multi-class closed queueing networks, each of which represents a three-tier web application architecture with a cluster of servers at each tier. Our model supports the dynamic reallocation of servers between pools at the same tier. We have evaluated the behaviour of two switching policies, compared them against a static allocation under a range of reallocation times, and observed that larger reallocation times have a negative impact on revenue. The results under the first workload demonstrate a clear inversely proportional relationship between reallocation time and revenue; under the second workload the relationship is less well defined.

In the short term we will investigate the impact of other aspects of dynamic resource allocation. These include evaluating the decision interval, which was fixed at 60 seconds in this work, and examining the combined effects of reallocation and decision intervals. We will also evaluate possible optimisations to the switching process to ensure that the reallocation time is minimised. Current policies evaluate the system retrospectively and make no predictions about the workload after making a migration. In future we hope to make predictions based on historical workload analysis to better guide the policies' decisions.
References

[1] M. Arlitt and T. Jin. A Workload Characterization Study of the 1998 World Cup Web Site. IEEE Network, 14(3):30–37, 2000.
[2] G. Balbo and G. Serazzi. Asymptotic Analysis of Multiclass Closed Queueing Networks: Multiple Bottlenecks. Performance Evaluation, 30(3):115–152, 1997.
[3] P. Barford and M. Crovella. Generating Representative Web Workloads for Network and Server Performance Evaluation. SIGMETRICS Performance Evaluation Review, 26(1):151–160, 1998.
Figure 3: Workload Two. [Plot of workload against time (mins) for Applications 1 and 2.]
Figure 4: Revenue Generated by the Proportional Switching Policy Under Workload One at Different Reallocation Times.

[4] G. Bolch, S. Greiner, H. de Meer, and K. Trivedi. Queueing Networks and Markov Chains: Modelling and Performance Evaluation with Computer Science Applications. Wiley, 2nd edition, 2006.
[5] G. Casale and G. Serazzi. Bottlenecks identification in multiclass queueing networks using convex polytopes. In 12th Annual Meeting of the IEEE Int'l Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2004.
[6] L. Cherkasova and P. Phaal. Session based admission control: a mechanism for peak load management of commercial web sites. IEEE Transactions on Computers, 51(6), 2002.
[7] A.P. Chester, W.J. Xue, L. He, and S.A. Jarvis. A system for dynamic server allocation in application server clusters. In International Symposium on Parallel and Distributed Processing with Applications, Sydney, Australia, December 2008.
Figure 5: Revenue Generated by the Bottleneck-aware Switching Policy Under Workload One at Different Reallocation Times.
Figure 6: Revenue Generated by the Proportional Switching Policy Under Workload Two at Different Reallocation Times.

[8] D.L. Eager and K.C. Sevcik. Bound hierarchies for multiple-class queueing networks. Journal of the ACM, 33(1):179–206, 1986.
[9] L. He, W.J. Xue, and S.A. Jarvis. Partition-based Profit Optimisation for Multi-class Requests in Clusters of Servers. In IEEE International Conference on e-Business Engineering, 2007.
[10] B.A. Huberman and S.H. Clearwater. Swing options: A mechanism for pricing peak demand. Computing in Economics and Finance, HP Labs, 2005.
[11] The Internet Traffic Archive, hosted at Lawrence Berkeley National Laboratory. http://ita.ee.lbl.gov/html/traces.html, 2008.
[12] S.R. Lawrence and A.H. Buss. Economic analysis of production bottlenecks. Mathematical Problems in Engineering, 1(4):341–363, 1995.
[13] M. Litoiu. A performance analysis method for autonomic computing systems. ACM Transactions on Autonomous and Adaptive Systems (TAAS), 2(1):3, 2007.
Figure 7: Revenue Generated by the Bottleneck-aware Switching Policy Under Workload Two at Different Reallocation Times.

[14] Z. Liu, M.S. Squillante, and J.L. Wolf. On maximizing service-level-agreement profits. ACM SIGMETRICS Performance Evaluation Review, 29(1):43–44, 2001.
[15] J.K. MacKie-Mason and H.R. Varian. Pricing congestible network resources. IEEE Journal on Selected Areas in Communications, 13(7):1141–1149, 1995.
[16] M. Marzolla and R. Mirandola. Performance prediction of web service workflows. In The Third International Conference on the Quality of Software Architectures (QoSA), volume 4880, pages 127–144, Medford, MA, USA, 2007.
[17] D.A. Menasce. Using performance models to dynamically control e-business performance. In International Multiconference on Measurement, Modelling, and Evaluation of Computer-Communication Systems, pages 1–5, Aachen, Germany, 2001.
[18] D.A. Menasce, V.A. Almeida, R. Fonseca, and M.A. Mendes. Business-oriented resource management policies for e-commerce servers. Performance Evaluation, 42(2-3):223–239, 2000.
[19] J. Rolia, X. Zhu, M. Arlitt, and A. Andrzejak. Statistical service assurances for applications in utility grid environments. In Modeling, Analysis and Simulation of Computer and Telecommunications Systems (MASCOTS), pages 247–256, 2002.
[20] W.J. Xue, A.P. Chester, L. He, and S.A. Jarvis. Dynamic resource allocation in enterprise systems. In International Conference on Parallel and Distributed Systems, Melbourne, Australia, December 2008.
[21] W.J. Xue, A.P. Chester, L. He, and S.A. Jarvis. Model-driven server allocation in distributed enterprise systems. In The 3rd International Conference on Adaptive Business Information Systems (ABIS'09), Leipzig, Germany, March 2009.
[22] J.Y. Zhou and T. Yang. Selective early request termination for busy internet services. In 15th International Conference on World Wide Web, 2006.