Practical Energy-Efficient Policies for Server Clusters

Ruibin Xu, Cosmin Rusu, Dakai Zhu, Daniel Mossé, Rami Melhem
Computer Science Department, University of Pittsburgh
{xruibin,rusu,zdk,mosse,melhem}@cs.pitt.edu

Abstract

Power conservation has become a key design issue for many systems, for example, server clusters deployed in large data centers and web hosting facilities. Another example is satellite-based multiprocessor systems, where power management ultimately determines the lifetime of the system. In such systems, the goal is to minimize the aggregate energy consumption of the whole server cluster while ensuring timely responses to client requests. Energy-efficient policies that employ a combination of dynamic voltage scaling (DVS) and turning nodes on/off have been studied under the assumption of continuous frequencies and the cubic-root rule for the power-frequency relation. Many of the proposed policies strongly depend on the existence of a perfect load-balancing mechanism. In this work, we focus on the more realistic case of discrete frequencies and propose a new policy that adjusts the number of online servers based on the system load. The proposed policy is practical (e.g., it eliminates the strong dependency on perfect load-balancing) and is shown to be more effective than previous policies in reducing the overall power consumption.

* This work has been supported by the Defense Advanced Research Projects Agency through the PARTS (Power-Aware Real-Time Systems) project under Contract F33615-00-C-1736.
1 Introduction
In server farms, power consumption and cooling account for a significant fraction of the total operating cost. Furthermore, system overheating caused by excessive power consumption can lead to intermittent system failures. Because a system designed for peak load is rarely fully utilized, power management schemes can achieve significant power savings while maintaining adequate system performance (for example, response time).

Another type of system that requires power management comes from our industrial research partners and deals with satellite-based signal processing. Signal data collected through an external sensor (equivalent to the front end of a server cluster) is disseminated to several processing units (equivalent to servers) for further analysis by a signal processing application. Currently we are investigating two such applications, referred to as Event Extraction and CAF (Complex Ambiguity Function).

Power management mechanisms can be divided into two categories: vary-on/vary-off (VOVO) and DVS. Node VOVO, originally proposed by Pinheiro et al. [7], takes cluster nodes offline when the incoming workload can be adequately served by a subset of the nodes in the cluster, and brings nodes back online when the workload increases beyond the capacity of the online nodes. Offline nodes are placed in a low-power state to achieve significant power savings.
On the other hand, individual node power management is possible because current power-efficient systems have management functions that can be invoked to choose among different power states for each component. Dynamic voltage scaling (DVS) is an important technique that allows performance-setting algorithms to dynamically adjust the performance level of the processor. An increasing number of processors implement DVS, which can yield quadratic energy savings for systems in which the dominant power consumer is the processing unit [6, 11].

It has been shown that combining the VOVO and DVS power management mechanisms can achieve significant power savings for server clusters. However, the existing policies are based on the assumptions of continuous frequencies and the cubic-root rule of the power-frequency relation, neither of which holds in practice. Currently available commercial DVS processors provide only about 4-10 discrete operating frequencies, and most of them do not comply with the cubic-root rule. Furthermore, some processors have inefficient frequencies that must be eliminated [9].

In this paper, we propose a new power management policy for server clusters. Our policy applies directly to the case of discrete frequencies and does not make any assumption about the power-frequency relation. Our policy achieves maximum power savings when the incoming workload is balanced across the whole cluster, but does not rely on the assumption of perfect load-balancing. The proposed policy, to be implemented at the front end of a server cluster, assumes that each server in the cluster performs DVS independently, to meet a desired QoS requirement. For some systems (such as web servers), the QoS requirement is simply to keep up with the rate of request arrivals; other systems impose more stringent real-time constraints on requests. We investigated a variety of per-server DVS policies in [8]. We emphasize that the power management policy investigated in this work is orthogonal to the DVS policies of the individual servers.

The remainder of the paper is organized as follows. We first present related work in Section 2. The system model under consideration is described in Section 3. Our new policy is presented in Section 4. The advantages of the proposed policy over previously proposed policies are reported in Section 5. We conclude the paper in Section 6.
2 Related Work
Dynamic voltage scaling (DVS), which involves dynamically adjusting the voltage and frequency of the CPU, has become a major research area. Quadratic energy savings [6, 11] can be achieved at the expense of only linear performance loss. For real-time systems, DVS schemes focus on minimizing energy consumption while still meeting the deadlines. The seminal work by Yao et al. [11] provided a static off-line scheduling algorithm and a number of on-line algorithms with good competitive performance, assuming aperiodic tasks and worst-case execution times (WCET). Heuristics for on-line scheduling of aperiodic tasks without hurting the feasibility of off-line periodic requests are proposed in [5]. Automatic DVS for Linux, with a distinction between background and interactive jobs, was presented in [4].

Power management has traditionally focused on portable and handheld devices. IBM Research broke with tradition and presented a case for managing power consumption in web servers [2]. Elnozahy et al. [3] evaluated five policies that employ various combinations of DVS and node vary-on/vary-off for cluster-wide power management in server farms. Sharma et al. [10] investigate adaptive algorithms for dynamic voltage scaling in QoS-enabled web servers to minimize the energy consumption subject
to service delay constraints. Aydin et al. [1] attempted to incorporate variable voltage scheduling of periodic task sets into partitioned multiprocessor real-time systems.

Among the related work, the policies in [3] for power management in server clusters are the most relevant to our work. These policies determine the number of online servers and apply a certain DVS scheme either independently or in a coordinated fashion. A vary-on/vary-off coordinated voltage scaling policy (VOVO-CVS) resulted in the most power savings. For every possible number of online servers, the policy precomputes an on-frequency and an off-frequency: an offline server is brought online when the running frequency of the processors exceeds the on-frequency, and one of the online servers is taken offline when the running frequency falls below the off-frequency. The policy precomputes the on-frequencies and off-frequencies based on the assumptions of perfect load-balancing, continuous frequencies, and cubic power functions. When these assumptions do not hold, the policy is suboptimal.
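For concreteness, the threshold test just described can be sketched as follows. This is our own illustrative rendering, not code from [3]; the precomputation of the on/off thresholds is elided, and all identifiers are ours.

```python
# Hypothetical sketch of the VOVO-CVS on/off decision described in [3].
# on_freq[n] and off_freq[n] are assumed to be precomputed for each
# possible number n of online servers; freq is the current running
# frequency of the (assumed load-balanced) online servers.

def vovo_cvs_step(n_online, freq, on_freq, off_freq, n_total):
    """Return the new number of online servers after one policy decision."""
    if freq > on_freq[n_online] and n_online < n_total:
        return n_online + 1   # frequency too high: bring a server online
    if freq < off_freq[n_online] and n_online > 1:
        return n_online - 1   # frequency low enough: take a server offline
    return n_online           # neither threshold crossed
```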
3 System Model
A server cluster consists of a front end and $N$ identical servers, each equipped with a DVS processor. The front end is responsible for collecting requests from clients and for distributing the requests to the online servers. The front end is also capable of taking servers offline and bringing them back online, according to some policy. Each server is in one of three states at any given time: busy, idle, or sleep (offline). The server power consumption is $P_{sleep}$ when sleeping (offline) and $P_{idle}$ when idle; when the server is busy, its processor runs at one of the discrete operating frequencies $f_1 < f_2 < \cdots < f_m$, and the corresponding server power consumptions are $P_1, P_2, \ldots, P_m$.
4 Load-aware Vary-on/Vary-off Independent Voltage Scaling (LAOVS)
In our approach, the front end distributes the incoming requests across the non-sleeping servers using a weighted round-robin distribution policy, with the servers weighted according to the average response time of recent requests. Each non-sleeping server carries out DVS independently, running at the lowest frequency that keeps up with the request arrival rate; individual server policies were investigated in [8]. Suppose that the incoming workload for one particular server is $w$ cycles/sec. The server then needs to run at $\lceil w \rceil$ cycles/sec, where $\lceil w \rceil$ denotes the lowest available discrete frequency that is at least $w$. Thus, the utilization of the server is $u = w / \lceil w \rceil$ and the power consumption of the server is $P = u P_{\lceil w \rceil} + (1 - u) P_{idle}$.

Our goal is to minimize the aggregate power of the whole server cluster (the aggregate energy consumption is proportional to the aggregate power, since we are not changing the system operation time). The problem thus reduces to deciding the number of non-sleeping servers needed to service the system load. Our policy decides the number of non-sleeping servers directly from the system load, which the front end can obtain by recording the number of requests it has received over the recent past or by getting feedback from each server. The main difference between our policy and the policies in [3] is that we decide the number of servers needed based on the system load, whereas the policies in [3] turn nodes on/off based on the frequency of the non-sleeping nodes. The assumption in [3] is that the non-sleeping nodes are never idle (continuous frequencies) and that they are all running at the same frequency (perfect load-balancing). If the individual frequencies are not equal, the front end can compute the average frequency in the cluster, thus eliminating the need for perfect load balancing. However, we argue that the frequency of a server at a given time correlates poorly with its actual load (unless the load is well balanced and the frequency is constant), and thus yields a poor estimate of the average frequency needed (i.e., of the system load). Deciding the number of servers needed directly from the load over the recent past (rather than from the server frequencies) results in a more accurate estimate and eliminates the strong dependency on perfect load balancing. We also make this decision without the assumption of continuous frequencies (which implies no idle times).
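To make the per-server model concrete, here is a minimal sketch (ours, not the authors') of frequency selection and the resulting power; the numbers are the Intel XScale settings from Table 3 in Section 5.

```python
import bisect

# Discrete frequencies (cycles/sec), increasing, with the corresponding
# busy power; values taken from the Intel XScale table in Section 5.
FREQS  = [150e6, 400e6, 600e6, 800e6, 1000e6]
POWERS = [355.0, 445.0, 675.0, 1175.0, 1875.0]   # busy power (mW)
P_IDLE = 355.0                                   # idle power (mW)

def server_power(w):
    """Power (mW) of one server whose incoming workload is w cycles/sec."""
    if w <= 0:
        return P_IDLE
    i = bisect.bisect_left(FREQS, w)         # lowest frequency >= w
    if i == len(FREQS):
        raise ValueError("workload exceeds the maximum frequency")
    u = w / FREQS[i]                         # utilization u = w / ceil(w)
    return u * POWERS[i] + (1 - u) * P_IDLE  # busy at P_f, idle for the rest
```

For example, `server_power(500e6)` selects the 600MHz setting and evaluates to about 622mW, below the 675mW of running busy at 600MHz continuously.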
Let $x$ denote the normalized system workload ($0 \le x \le 1$), normalized with respect to the maximum workload that the system can handle, i.e., $N f_m$ cycles/sec. The minimum number of online servers that can handle workload $x$ is $n_{min} = \lceil x N \rceil$. If $n$ is the number of non-sleeping servers, the frequency of the processor of each server is

$$f = \left\lceil \frac{x N f_m}{n} \right\rceil,$$

where $\lceil \cdot \rceil$ denotes, as before, rounding up to the lowest available discrete frequency. The utilization of the processor is

$$u = \frac{x N f_m}{n f},$$

and the total power of the whole system for load $x$, denoted by $P(n, x)$, is

$$P(n, x) = \big( u P_f + (1 - u) P_{idle} \big)\, n + (N - n)\, P_{sleep},$$

where $P_f$ is the busy power at frequency $f$. Thus, the problem is to find, for each load $x$, the number of servers $n$ ($n \ge n_{min}$) that minimizes $P(n, x)$. The problem can easily be solved offline (due to lack of space, we omit the description of the algorithm), resulting in a table, stored in the front end, that gives the optimal number of online servers for each load. At runtime, it is the responsibility of the front end to detect the system load, to perform a table lookup to determine the number of non-sleeping servers, and to distribute the requests to the selected servers.
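Since the paper omits the offline algorithm, the following brute-force sketch (our construction, under the model above, with the XScale numbers from Table 3 and an assumed cluster size) builds the load-to-servers table by direct minimization of $P(n, x)$.

```python
import math

FREQS   = [150e6, 400e6, 600e6, 800e6, 1000e6]   # Intel XScale settings
POWERS  = [355.0, 445.0, 675.0, 1175.0, 1875.0]  # busy power (mW)
P_IDLE  = 355.0
P_SLEEP = 0.0
N       = 20                                     # cluster size (assumed)
F_MAX   = FREQS[-1]

def cluster_power(n, x):
    """P(n, x): total power (mW) with n non-sleeping servers at normalized
    load x, assuming the load is balanced across the n servers."""
    w = x * N * F_MAX / n                         # per-server cycles/sec
    f, p_f = next((fi, pi) for fi, pi in zip(FREQS, POWERS) if fi >= w)
    u = w / f                                     # per-server utilization
    return (u * p_f + (1 - u) * P_IDLE) * n + (N - n) * P_SLEEP

def build_table(steps=100):
    """Offline step: for each load x, the n (n >= n_min) minimizing P(n, x)."""
    table = {}
    for k in range(1, steps + 1):
        x = k / steps
        n_min = math.ceil(x * N)
        table[x] = min(range(n_min, N + 1), key=lambda n: cluster_power(n, x))
    return table
```

Sweeping $x$ over such a table also exhibits the effect reported in Section 5 (Figures 2 and 3): with discrete frequencies, the minimizing $n$ is not always non-decreasing in $x$.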
5 Experimental Results
We compare our proposed policy (LAOVS) against the VOVO-CVS policy in [3] using three different power models: IBM PowerPC750, Transmeta Crusoe, and Intel XScale. Tables 1 to 3 show the speed settings and power consumption for these systems.

Table 1: IBM PowerPC750 system - speed settings and power consumption

State        sleep   idle     6.25MHz   12.5MHz   25MHz    50MHz    75MHz    100MHz   112MHz
Power (mW)   0.0     2200.0   2287.0    2430.0    2718.0   3230.0   3845.0   4460.0   6060.0

Table 2: Transmeta Crusoe system - speed settings and power consumption

State        sleep   idle    300MHz   400MHz   500MHz   600MHz   633MHz
Power (W)    0.0     3.349   3.349    3.809    4.714    6.348    6.915

Table 3: Intel XScale system - speed settings and power consumption

State        sleep   idle    150MHz   400MHz   600MHz   800MHz   1000MHz
Power (mW)   0.0     355.0   355.0    445.0    675.0    1175.0   1875.0
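Before the comparison, each frequency table is pruned of inefficient settings, as described next. The sketch below uses the common energy-per-cycle dominance test; this is our reading of the method in [9], and the authors' exact criterion may differ.

```python
def eliminate_inefficient(freqs, powers):
    """Drop settings dominated by a faster one (our reading of [9]).

    A setting (f_i, P_i) is treated as inefficient when some higher
    frequency f_j has energy per cycle P_j / f_j <= P_i / f_i, i.e., the
    same work can be done faster without spending more energy.
    """
    kept = []
    for i, (f, p) in enumerate(zip(freqs, powers)):
        dominated = any(pj / fj <= p / f
                        for fj, pj in zip(freqs[i + 1:], powers[i + 1:]))
        if not dominated:
            kept.append((f, p))
    return kept

# With the XScale settings of Table 3 (MHz, mW), 150MHz is dominated
# (355/150 > 445/400) and is removed:
print(eliminate_inefficient([150, 400, 600, 800, 1000],
                            [355.0, 445.0, 675.0, 1175.0, 1875.0]))
```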
According to the measurements we obtained, we include a 2W constant power for the other system components in the PowerPC750 model. Similarly, a constant power of 2W is included for the Transmeta model, and a 275mW constant power is included for the XScale system, simulating an Infineon Mobile-RAM (very similar results are obtained for the XScale model with different values of the constant power, such as 2W). Processors may have inefficient frequencies [9], meaning that some lower frequencies may result in higher power consumption (combined with less performance) than other, higher frequencies. Thus, for each system, we applied the method in [9] to eliminate the inefficient frequencies. For all experiments, the number of servers in the cluster is $N = 20$. Similar results are obtained for different values of $N$, but are not shown for lack of space.

Figures 1 to 3 show the experimental results obtained through simulation. The number of processors used by the two policies and their power consumption are plotted as functions of the system load, assuming perfect load balancing. Considering the system load rather than the frequencies of the individual systems, and eliminating the cubic-root and continuous-frequency assumptions, results in LAOVS outperforming the VOVO-CVS policy. To better illustrate the limitations of the continuous-frequency assumption, we performed the following experiment: we artificially eliminated the 400MHz and 800MHz frequencies of the XScale model (the continuous-frequency assumption is more inaccurate for fewer discrete frequencies). The disadvantage of VOVO-CVS becomes more evident, as shown in Figure 4.

An interesting result from our experiments is that the number of online servers does not necessarily increase with the system load, as seen in Figures 2 and 3. This is somewhat counterintuitive and is due to the fact that the frequencies are discrete. VOVO-CVS cannot capture this behavior, as it assumes continuous frequencies. In fact, it can be proven that for continuous frequencies and convex power functions, the number of online servers that minimizes the system power always increases with the load.

Note that all experiments assume that the load is perfectly balanced across the online servers. While this assumption may be safe for some applications, there are application areas with unpredictable behavior, as shown in Tables 4 and 5 for the signal processing traces that we have (the execution time statistics in Table 4 are shown for all request types). Perfect load-balancing may not be a realistic assumption for such workloads, which may result in higher power consumption than indicated in Figures 1 to 3. As our policy is less sensitive to load balancing, we expect that the power savings over previously proposed policies will be even higher for realistic, not perfectly balanced workloads.
Table 4: Request execution times (in millions of cycles)

              Event Extraction       CAF
              Type 1    Type 2       Type 1   Type 2   Type 3
Min           2.9       2.0          8.2      4.1      1.3
Max           82.6      753.6        5045     210.2    32.9
Avg           9.7       123.2        820.2    45.0     5.8
Stdev         7.2       153.8        1251     78.5     6.2
%             79%       21%          5.4%     2.9%     91.7%

Table 5: Request inter-arrival times (in seconds)

              Event Extraction   CAF        CAF
Trace length  81 min             1030 sec   1800 sec
Min           0.13               0.1        0
Max           6.7                11         5
Avg           0.37               0.44       0.7
Stdev         0.62               0.77       1.74
Events        13045              2307       2564
[Figure 1: IBM PowerPC750 system. (a) Number of online servers vs. system load (LAOVS, VOVO-CVS, and the minimum number of servers). (b) System power (mW) vs. system load (LAOVS, VOVO-CVS, and p(nmin, x)).]

[Figure 2: Transmeta Crusoe system. (a) Number of online servers vs. system load. (b) System power (W) vs. system load. Same curves as Figure 1.]

[Figure 3: Intel XScale system. (a) Number of online servers vs. system load. (b) System power (mW) vs. system load. Same curves as Figure 1.]

[Figure 4: Intel XScale system without the 400MHz and 800MHz frequencies. (a) Number of online servers vs. system load. (b) System power (mW) vs. system load. Same curves as Figure 1.]
6 Conclusion and Future Work
In this work, we study energy-efficient policies for server clusters. We propose a new power management policy that applies directly to the case of discrete frequencies and does not make any assumption about the power-frequency relation. The proposed policy uses the system load and takes the idle power into account to determine the number of online servers. In contrast, previously proposed policies ignore the idle power and determine the number of online servers based on the servers' frequencies, which do not accurately reflect the actual load and power consumption. We show that the proposed policy is practical (i.e., it does not require perfect load balancing and considers discrete frequencies) and that it outperforms previously proposed policies. This work will continue with the study of feedback mechanisms to estimate the system load, load balancing mechanisms, and the implementation of the proposed policy.
References

[1] Hakan Aydin and Qi Yang. Energy-Aware Partitioning for Multiprocessor Real-Time Systems. In International Parallel and Distributed Processing Symposium (IPDPS'03), pages 113–121, March 2003.

[2] P. Bohrer, E. Elnozahy, M. Kistler, C. Lefurgy, C. McDowell, and R. Rajamony. The case for power management in web servers. In Power Aware Computing, Kluwer Academic Publishers, 2002.

[3] E. N. Elnozahy, Michael Kistler, and Ramakrishnan Rajamony. Energy-Efficient Server Clusters. In Workshop on Power-Aware Computer Systems (PACS'02), 2002.

[4] K. Flautner and T. Mudge. Vertigo: automatic performance-setting for Linux. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI'02), December 2002.
[5] I. Hong, M. Potkonjak, and M. B. Srivastava. On-line Scheduling of Hard Real-Time Tasks on Variable Voltage Processor. In Computer-Aided Design (ICCAD'98), pages 653–656, 1998.

[6] I. Hong, G. Qu, M. Potkonjak, and M. Srivastava. Synthesis Techniques for Low-Power Hard Real-Time Systems on Variable Voltage Processors. In Proceedings of the 19th IEEE Real-Time Systems Symposium (RTSS'98), Madrid, Spain, December 1998.

[7] E. Pinheiro, R. Bianchini, E. V. Carrera, and T. Heath. Load Balancing and Unbalancing for Power and Performance in Cluster-Based Systems. In Workshop on Compilers and Operating Systems for Low Power, September 2001.

[8] Cosmin Rusu, Ruibin Xu, Rami Melhem, and Daniel Mossé. Energy-Efficient Policies for Request-Driven Soft Real-Time Systems. In Proceedings of the 16th Euromicro Conference on Real-Time Systems, Catania, Italy, July 2004.

[9] Saowanee Saewong and Ragunathan Rajkumar. Practical Voltage-Scaling for Fixed-Priority RT-Systems. In Proceedings of the 9th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'03), May 2003.

[10] Vivek Sharma, Arun Thomas, Tarek Abdelzaher, Kevin Skadron, and Zhijian Liu. Power-aware QoS Management in Web Servers. In Proceedings of the 24th IEEE Real-Time Systems Symposium (RTSS'03), Cancun, Mexico, December 2003.

[11] F. Yao, A. Demers, and S. Shenker. A Scheduling Model for Reduced CPU Energy. In IEEE Annual Foundations of Computer Science, pages 374–382, 1995.