WEB SERVER POWER ESTIMATION, MODELING AND MANAGEMENT
Chia-Hung Lien†, Ying-Wen Bai††, Ming-Bo Lin†, Chia-Yi Chang†† and Ming-Yuan Tsai††
† Department of Electronic Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, 106, R.O.C.
†† Department of Electronic Engineering, Fu Jen Catholic University, Taipei, Taiwan, 242, R.O.C.
E-mail: lianhome@seed.net.tw
exponential moving averages of point samples or average resource utilization over some sample period [3]. In [4], resource demands are derived from request counts and bytes transferred, using a model of a static Web server. These studies have led to many benchmarks, such as SPECweb and WebStress [5, 6]. Studies on improving a Web server's performance focus on improving either the interaction between the Web server and the underlying operating system, or disk I/O and network I/O.

This paper investigates policies for allocating resources in a hosting center, with a principal focus on power management. We present the design and implementation of a flexible resource management architecture, Scalable Multiple Server (SMS), that controls server allocation and the routing of requests to selected servers through a reconfigurable switching infrastructure. Figure 1 depicts the elements of the SMS architecture: pools of shared servers act together to support the request load of each co-hosted service. Server resources are both generic and interchangeable. DNS servers dynamically redirect incoming request traffic to eligible servers. Each hosted service appears to external clients as a single virtual server, whose power consumption grows and shrinks with the request load and the available resources. Server operating systems and DNS servers continuously monitor the load; the system combines periodic load summaries to detect load shifts and to estimate the aggregate service performance. The Manager, the "brain" of the hosting center OS, makes policy decisions that dynamically reallocate server resources and reconfigure the network in response to variations in the observed load, resource availability, and service intensity. SMS employs a Web server power model to manage server allocation and provisioning in a way that maximizes resource efficiency and minimizes power consumption. Section 2 describes the power model in detail.
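The load-monitoring step described above (periodic load summaries combined to detect load shifts) can be illustrated with a minimal sketch based on an exponential moving average. The function names, the smoothing factor `alpha`, and the relative `threshold` are hypothetical choices made for illustration; the paper does not specify how the Manager detects a shift.

```python
from typing import List, Sequence


def ema(samples: Sequence[float], alpha: float = 0.3) -> List[float]:
    """Exponential moving average of periodic load samples.

    alpha is a hypothetical smoothing factor in (0, 1].
    """
    smoothed: List[float] = []
    current = None
    for x in samples:
        # The first sample initializes the estimate; later samples are blended in.
        current = x if current is None else alpha * x + (1.0 - alpha) * current
        smoothed.append(current)
    return smoothed


def load_shifts(samples: Sequence[float], alpha: float = 0.3,
                threshold: float = 0.25) -> List[int]:
    """Indices of samples that deviate from the running estimate by more
    than `threshold` (relative deviation; an assumed detection rule)."""
    shifts: List[int] = []
    estimate = None
    for i, x in enumerate(samples):
        if estimate is not None and estimate > 0:
            if abs(x - estimate) / estimate > threshold:
                shifts.append(i)
        estimate = x if estimate is None else alpha * x + (1.0 - alpha) * estimate
    return shifts


if __name__ == "__main__":
    requests_per_second = [100, 105, 98, 200, 210]  # made-up load summaries
    print(ema(requests_per_second))
    print(load_shifts(requests_per_second))
```

A smaller `alpha` smooths more aggressively and reacts more slowly to a genuine load shift; the right trade-off depends on the epoch length the Manager uses.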
Abstract - An Internet server center provides multiple service sites by means of a multiple-server pool. This paper presents a Scalable Multiple Server (SMS) architecture design for resource management in a server center, with an emphasis on the proper balance of energy and performance. The goals are to provide different servers for different service sites in a way that automatically adapts to the offered load, and to improve the energy efficiency of the server pool by dynamically reconfiguring it in accordance with an analytic model. We analyze the power consumption of a Web server as a function of the server's utilization, based on experimental arrival data. Two power models, linear and exponential, are shown to fit the experimental data. The linear model is used with a flow-equivalent approximate queuing analysis to minimize the power consumption of the Web server. We validated our model by comparing it against measurements under various workloads generated by a benchmark. The experimental results demonstrate that the values calculated by the model track the results generated by real machines. We also tested our SMS architecture by applying it to a real-life Web server center, achieving a 16.9% reduction in energy consumption.
1. INTRODUCTION
Recently, we have seen an explosive growth of the Internet. The performance of Web servers is one of the critical factors in the success of the Internet. To improve the performance of Web servers, researchers have studied the Web server's macro-performance, namely the response time and throughput, which can be directly perceived by end users. Energy-conscious switching was recently proposed independently in [1, 2] and appears to be new to the literature. This switching adjusts the powered-on capacity at the coarse granularity of entire servers, as a complementary but simpler and more effective alternative to fine-grained techniques, which reduce CPU power demand under a light load but do not address energy losses in the power supply. One of the most successful techniques employed by designers at the system level is dynamic power management [3, 4]. This technique reduces power dissipation either by selectively turning off system components or by reducing their performance when they are idle or only partially used. Fundamental to server resource management is the load estimation problem: how much resource is needed to meet the necessary service quality goals for the corresponding load level? This problem is related to that of characterizing network flows. Most approaches used
0-7803-9746-0/06/$20.00 ©2006 IEEE
power. Moreover, there is no performance degradation.
Fig. 2. The power state transition model of a Web server.
$$P_{wi} = I \tag{2}$$
Here I is the average power consumption of the idle state, which correlates closely with the system architecture. In the past, the busy-state power was modeled as a constant value [19]:
$$P_{wb} = I + B \tag{3}$$
Here B is the extra average power in the busy state in addition to I.
Fig. 1. SMS architecture.
2.1 States Power Model Inference from Experimental Results
This paper is organized as follows. Section 2 describes the power model of a Web server. The models are based on the experimental data collected. Section 3 presents an allocation model of Web servers that minimizes their power consumption. Section 4 presents experimental results. Section 5 concludes the paper.
In this subsection, we present some power estimation models of a Web server, using simple and easily computable techniques. We measured the power consumption of an ACPI-compliant Web server system with a single 2.4 GHz Pentium 4 processor, a 266 MHz bus, 1.0 GB RAM, an IBM 80 GB IDE hard disk, and a single Gigabit Ethernet card. The Web server runs Windows 2000 Advanced Server with the IIS 5.0 Web server.
2. POWER MODELING FOR A WEB SERVER SYSTEM
SMS allocates to each service a suitable server that it needs to serve its load. However, a Web server system can be a large, complex, heterogeneous system, often including multiple resources such as electronic, optical, and magnetic components. In this paper, we focus on developing an abstract model of the system that allows a fast estimation of the service-level power consumption of a Web server, and of how dynamic reconfiguration of the resources can affect the overall power consumption. A service-level power model views a Web server as a black box. We are not concerned with how resources are designed; instead we focus on how they interact with the environment. Hence, the internal details of the box are not modeled explicitly; only the overall power consumption of the box is considered.
These measurements lead to the following observations. Although a Web server in the idle state delivers no performance, it still consumes power in some resources to maintain the basic routines of the system. Moreover, the power consumption shows only slight variations in the idle state, and the average power consumption is approximately constant. According to our observations, the power consumption in the busy state covers a wider range of values, which depends on the utilization of the server. A power consumption histogram, or probability distribution function, serves as one useful measure of the variation of power consumption over time. Examples of probability distribution functions for three measured utilizations (0.15, 0.5 and 0.85) appear in Figure 3.
The Web server can operate in two different power states; the transition graph of the model is shown in Figure 2. A Web server is in the busy state (b) while a service is being performed and in the idle state (i) otherwise. The transition from busy to idle (or the opposite transition) is almost instantaneous; hence, this transition does not consume a sizable amount of
$$P_{wb} = I + (B - I)\rho \tag{5}$$
Equation (5) implies that the power consumption of the server in the busy state depends on the utilization ρ: the busy-state power increases as ρ increases.
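The idle-state model of Eq. (2), the constant busy-state model of Eq. (3) and the linear busy-state model of Eq. (5) can be compared with a short sketch. The function names and the numeric values of I and B are hypothetical and are not measurements from the paper. Note that, as Eq. (5) is written, the busy-state power equals I at ρ = 0 and reaches B at ρ = 1.

```python
def idle_power(idle_watts: float) -> float:
    """Eq. (2): the idle-state power is the constant I."""
    return idle_watts


def busy_power_constant(idle_watts: float, extra_busy_watts: float) -> float:
    """Eq. (3): constant busy-state power I + B, with B the extra busy power."""
    return idle_watts + extra_busy_watts


def busy_power_linear(idle_watts: float, b_watts: float, rho: float) -> float:
    """Eq. (5): busy-state power I + (B - I) * rho for utilization rho in [0, 1]."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("utilization rho must lie in [0, 1]")
    return idle_watts + (b_watts - idle_watts) * rho


if __name__ == "__main__":
    I, B = 75.0, 115.0  # hypothetical values in watts, not taken from the paper
    for rho in (0.15, 0.5, 0.85):
        print(f"rho={rho:.2f}  linear busy power={busy_power_linear(I, B, rho):.1f} W")
```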
2.2 Steady-state Probability of the M/M/1 Model
λ/μ, then the state probabilities can be represented as Eq. (6). We also suppose that the arrival rate is not influenced by the state of the system. The transition rates are summarized in the diagram of Figure 4.
Fig. 3. Histogram of power consumption for a Web server (x-axis: power consumption (W); curves for utilization = 0.15, 0.5 and 0.85).
Evidently, different utilizations result in different distributions of the power dissipation. The larger the utilization, the more likely the mean is to take on larger values and the variance is to take on smaller values. In a subsequent subsection we discuss two models that incorporate these characteristics of busy-state power consumption. We focus on the average power consumption of a server in its busy state at different utilizations. The first model is more accurate and suitable for simulations, while the second leads to a simple, manageable architecture. Based on this observation, we model the power consumption of the idle state as:
Fig. 4. The state transition of a queue for Web server process requests (states 0, 1, 2, 3, ...; arrival rate λ and service rate μ).
To be in steady state, the transition rate out of each state must be equal to the transition rate into each state. We can easily obtain the state probabilities:
(a) An exponential model
A key parameter in evaluating the power consumption of the busy state is the server's utilization ρ. We first model the busy-state power as an exponential function. A more accurate model of busy-state power has
$$P_{wb} = I + (B - I)\,(\cdots)$$
$$p_n = (1 - \rho)\rho^n \tag{6}$$
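The steady-state probabilities of the queue can be checked numerically. The sketch below assumes the standard M/M/1 result p_n = (1 - ρ)ρ^n with ρ = λ/μ < 1; the function names are hypothetical. It also shows the two quantities a two-state power model needs: the probability that the server is idle (state 0) and the probability that it is busy.

```python
def state_probability(rho: float, n: int) -> float:
    """Steady-state probability of n requests in an M/M/1 system."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("steady state requires 0 <= rho < 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return (1.0 - rho) * rho ** n


def idle_probability(rho: float) -> float:
    """Probability that the system is empty, i.e. the server is idle."""
    return state_probability(rho, 0)


def busy_probability(rho: float) -> float:
    """Probability that at least one request is present (server busy)."""
    return 1.0 - idle_probability(rho)


if __name__ == "__main__":
    rho = 0.5  # example utilization, lambda / mu
    total = sum(state_probability(rho, n) for n in range(200))
    print(f"idle={idle_probability(rho):.3f} busy={busy_probability(rho):.3f} "
          f"sum over first 200 states={total:.6f}")
```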
ρ > 0.65 is less meaningful. This is partly due to the server being near full load, and it may partly be due to an overflowing request buffer. Figure 5 illustrates the accuracy of these estimation models. The exponential model is found to yield estimation errors within 15%.
$$P_{wi} = I_0 + I_1\,(\cdots) \tag{13}$$
$$P_{wb} = I_0 + I_1\,(\cdots) + (B_1 - I_1)\,(\cdots)\,\rho \tag{14}$$
Applying (13) and (14) to the related equations of the M/M/1 model, we obtain
$$P_W = \bigl(I_0 + I_1(\cdots)\bigr)(1 - \rho) + \bigl(I_0 + I_1(\cdots) + (B_1 - I_1)(\cdots)\rho\bigr)\rho = I_0 + I_1(\cdots) + (B_1 - I_1)(\cdots)\rho^2$$
With the utilization ρ = λ/μ, this yields
$$P_W = I_0 + I_1(\cdots) + (B_1 - I_1)(\cdots) \tag{15}$$
To minimize Eq. (15) with respect to the service rate, we simply take the derivative of Eq. (15) with respect to μ. The derivative of the power function PW is given as
Fig. 5. Average power consumption versus utilization ρ (curves: measured, exponential, linear, and constant models).
3. CHOICES OF WEB SERVERS FOR POWER SAVING
d
It may seem advantageous from a performance standpoint to have an infinite machine service rate, but this approach may not be the most economical. Because request arrivals are stochastic and idle resources have a power cost, the determination of a machine size or service
W=I1
Pw-
A--
2d
3(Bl I,)A2
--
2
The value of μ that minimizes power consumption can now be obtained by equating the derivative to zero. Thus, the optimum service rate μ* is the value that solves the following equation:
* 3(B1-1I)
DNS server. The Manager runs as a user-level server on a dedicated machine. The monitor consists of 800 lines of C code compiled into the Manager. In addition to the load monitor code, it includes 300 lines of C code implementing the policies described in Section 4 for a single pool of homogeneous servers. In our experiments the Manager spreads request loads from all services evenly across the entire server pool and uses 24-hour epochs. The servers are 2.8 GHz Pentium-4 systems (Asus P4PE) with dual Gigabit Ethernet NICs connected to the server pool and the Internet. This server runs Windows 2000 Advanced Server as a DNS server. In Windows 2000, Active Directory is tightly integrated with the Domain Name System, which supports the dynamic updating of records for client computers and servers in the server pool. The DNS server can carry sufficient traffic to saturate the server pool under a standard HTTP 1.0 Web workload. The servers comprise one 200 MHz Pentium system (GA-586SG), three 400 MHz Pentium-II systems (Asus P2B/440BX), three 800 MHz Pentium-III systems (Asus P3B), two 1 GHz Pentium-4 systems (Asus P4B), two 1.7 GHz Pentium-4 systems (Asus P4B) and two 2.4 GHz Pentium-4 systems (Asus P4PE). These servers are connected to the Ethernet through 100 Mb/s links. All servers contain a complete replica of the Web service file set, and our workloads serve a large majority of requests from memory, so the CPU is the critical resource.
(16)
I1
Previous research shows that the volume of Web server traffic can be quite predictable for some sites when we measure the system power consumption over a long enough time scale [10]. For a general site, there is a steady, predictable weekly access pattern. Every weekend has lower traffic, as expected, but more interestingly we can also see a consistent, predictable increase in the number of requests over the weekdays. If a Web site i has a request arrival rate λ(i), we can choose μ*(i) so as to minimize its power consumption.
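The selection of a service rate for a given arrival rate can be sketched numerically. The sketch assumes that the average power is the utilization-weighted mix of idle and busy power used in the M/M/1 analysis, P_idle(1 - ρ) + P_busy ρ. Purely for illustration, it also assumes that the idle and extra busy power of a machine scale linearly with its service rate relative to the basic machine. That linear scaling, the function names and the candidate list are assumptions and are not taken from the paper; the constants are the basic-machine values quoted in Section 4.1.

```python
from typing import Callable, Iterable

# Basic-machine values quoted in Section 4.1.
I0, I1, B1 = 20.12, 21.56, 61.03   # watts
MU_BASE = 221.25                   # requests per second


def idle_power(mu: float) -> float:
    """Hypothetical idle power of a machine with service rate mu (assumed linear scaling)."""
    return I0 + I1 * mu / MU_BASE


def busy_power(mu: float, rho: float) -> float:
    """Hypothetical busy power at utilization rho (assumed linear scaling)."""
    return idle_power(mu) + (B1 - I1) * (mu / MU_BASE) * rho


def mean_power(lam: float, mu: float,
               idle: Callable[[float], float],
               busy: Callable[[float, float], float]) -> float:
    """Average power: idle power weighted by (1 - rho) plus busy power weighted by rho."""
    rho = lam / mu
    if not 0.0 <= rho < 1.0:
        raise ValueError("the queue is stable only for 0 <= lam < mu")
    return idle(mu) * (1.0 - rho) + busy(mu, rho) * rho


def best_service_rate(lam: float, candidates: Iterable[float],
                      idle: Callable[[float], float],
                      busy: Callable[[float, float], float]) -> float:
    """Pick the candidate service rate with the lowest average power."""
    stable = [mu for mu in candidates if mu > lam]
    if not stable:
        raise ValueError("no candidate can sustain the arrival rate")
    return min(stable, key=lambda mu: mean_power(lam, mu, idle, busy))


if __name__ == "__main__":
    lam = 100.0  # example arrival rate in requests per second
    mu_star = best_service_rate(lam, range(101, 1000), idle_power, busy_power)
    print(mu_star, mean_power(lam, mu_star, idle_power, busy_power))
```

Because the power functions are passed in as arguments, a different scaling law can be substituted without changing the search itself.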
4. EXPERIMENT
4.1 Experiment 1
To understand how this model yields a decision on a new machine, consider the graphs illustrated in Figure 6. The values of PW (power consumption) have been calculated for machines with different service rates, relative to a basic machine with μ = 221.25 requests/s, I0 = 20.12 W, I1 = 21.56 W and B1 = 61.03 W. We assume that PW is a continuous function with its minimum at μ_k; that is, for μ < μ_k the function PW is decreasing, and for μ > μ_k it is increasing. Superimposed on Figure 6, as dashed lines, are the measurement results obtained from real machines. The values computed by the model track the results generated by real machines.
To show the impact of multiple services under varying loads, we use an enterprise edition of the Web Stress [24] synthetic Web service load generator. Web Stress generates the bursty traffic characteristic of Web workloads, with heavy-tailed object size distributions, so that the per-request service demand is highly variable. One advantage of synthetic traffic is that the load generator is closed, meaning that the request arrival rate is sensitive to the response latency. Also, we can generate synthetic load "waves" with any amplitude and period by modulating the number of generators. The experiments presented are designed to illustrate various aspects of the system's behavior in simple scenarios. Our experiments use a combination of Web Stress synthetic traffic and real request traces. We use log traces of five servers from the Fu Jen Catholic University Web servers (net.cc.fju.edu.tw), consisting of the 520M requests received during October 2-11, 2003. Figure 7 shows the server pool's power draw as measured by a DW-6090 POWER ANALYZER at one-second measurement intervals.
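The synthetic load "waves" mentioned above can be pictured as a schedule for the number of active load generators. The sketch below is a hypothetical illustration that assumes a sinusoidal wave; the function name and parameters are not part of Web Stress or of the paper's setup.

```python
import math


def active_generators(t_seconds: float, base: int, amplitude: int,
                      period_seconds: float) -> int:
    """Number of load generators to run at time t for a sinusoidal load wave.

    base is the mean number of generators, amplitude the peak deviation,
    and period_seconds the length of one wave (all hypothetical parameters).
    """
    if period_seconds <= 0:
        raise ValueError("period must be positive")
    level = base + amplitude * math.sin(2.0 * math.pi * t_seconds / period_seconds)
    # Never schedule a negative number of generators.
    return max(0, round(level))


if __name__ == "__main__":
    for t in range(0, 61, 15):
        print(t, active_generators(t, base=10, amplitude=5, period_seconds=60))
```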
4.2 Experiment 2
This section presents experimental results from the SMS prototype that show the behavior of the dynamic server resource management. These experiments consider the simple case of a symmetric cluster with a single server pool.
(Fig. 6 data-point labels: P 200MHz 128MB, P-II 300MHz 128MB, P-II 400MHz 256MB, P-III 450MHz, P-4 1GHz 1GB, P-4 1.7GHz 512MB.)
In Figure 7, we compare the accumulated energy curves of our SMS with those of traditional fixed server allocation. The SMS total energy consumption for this run was 64.48 kWh, a savings of 16.9%. The graph reflects the Manager's more conservative policies for retiring machines during periods of declining load, which yield somewhat higher energy consumption than necessary. Furthermore, we measured the average response time. We can see that our SMS has a shorter response time and satisfies the performance demand with a response time under
Fig. 6. The average power consumption versus service rate μ (x-axis: service rate, 0 to 3000).
Figure 1 depicts the data center testbed, which consists of a server pool driven by traffic-generating clients through a
500ms.
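The reported energy figures imply a baseline that can be recovered with simple arithmetic. The sketch assumes that the 16.9% savings is measured relative to the energy of the fixed server allocation over the same run; the function name is hypothetical.

```python
def baseline_energy(measured_kwh: float, savings_fraction: float) -> float:
    """Energy of the baseline run, given the measured energy and the
    fractional savings relative to that baseline."""
    if not 0.0 <= savings_fraction < 1.0:
        raise ValueError("savings fraction must lie in [0, 1)")
    return measured_kwh / (1.0 - savings_fraction)


if __name__ == "__main__":
    sms_kwh = 64.48   # SMS total energy reported for the run
    savings = 0.169   # reported savings of 16.9%
    fixed_kwh = baseline_energy(sms_kwh, savings)
    print(f"implied fixed-allocation energy: {fixed_kwh:.2f} kWh")
    print(f"implied energy saved: {fixed_kwh - sms_kwh:.2f} kWh")
```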
Operating Systems (ASPLOS VIII), San Jose, California, United States, pages 205-216, Oct. 1998.
[2] Yasushi Saito, Brian N. Bershad, and Henry M. Levy, "Manageability, Availability and Performance in Porcupine: A Highly Scalable Cluster-Based Mail Service", Proceedings of the 17th ACM Symposium on Operating Systems Principles (SOSP), Charleston, South Carolina, United States, pages 1-15, Dec. 1999.
[3] Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, and Paul Gauthier, "Cluster-Based Scalable Network Services", Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP), Saint Malo, France, pages 78-91, Oct. 1997.
[4] Darrell C. Anderson, Jeffrey S. Chase, and Amin M. Vahdat, "Interposed Request Routing for Scalable Network Storage", ACM Transactions on Computer Systems, 20(1):25-48, Feb. 2002.
[5] http://www.paessler.com/WebStress/webstress.htm
[6] http://www.specbench.org/osg/web99/
Fig. 7. Accumulated energy for real Web request traces (x-axis: days; curves include fixed server allocation).
5. CONCLUSION
This paper describes the design and implementation of SMS, a resource management architecture for hosting centers. SMS defines policies for adaptive server allocation in hosting centers using a power-conscious approach. A principal objective is to incorporate power management into a comprehensive resource management framework for data centers. This enables a center to improve its energy efficiency under fluctuating loads and to dynamically match power consumption to load. We have shown both how to use reconfigurable network-level request redirection to route incoming request traffic toward dynamically provisioned server sets, and how to enable energy-conscious provisioning, which concentrates the request load on a subset of servers. Our SMS leverages the power management of a Web server to control the center's overall power level by automatically transitioning the power states of any idle servers. We have proposed a new system model and method for dynamic server management at the service level. The problem of service-level power consumption was formulated as a stochastic model. This model is used in an SMS prototype for experimentation in data center test beds. The experimental results from the prototype demonstrate our model's potential to adapt service provisioning in response to dynamically varying resource availability and power consumption in a server pool. The prototype can reduce server energy consumption by 16.9% for the long-term operation of representative Web workloads.
6. REFERENCES
[1] Vivek S. Pai, Mohit Aron, Gaurav Banga, Michael Svendsen, Peter Druschel, Willy Zwaenepoel, and Erich Nahum, "Locality-Aware Request Distribution in Cluster-based Network Servers", Proceedings of the 8th International Conference on Architectural Support for Programming Languages and