Performance Evaluation of Web Servers using Central Load Balancing Policy over Virtual Machines on Cloud

Abhay Bhadani

Sanjay Chaudhary

Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar, INDIA

Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar, INDIA

[email protected]

[email protected]

ABSTRACT
Cloud computing adds more power to the existing Internet technologies, and virtualization harnesses the power of the existing infrastructure and resources. With virtualization we can simultaneously run multiple instances of different commodity operating systems. Since processors are limited and jobs run concurrently, overload situations can occur, and they become even more challenging in a distributed environment. We propose a Central Load Balancing Policy for Virtual Machines (CLBVM) to balance the load evenly in a distributed virtual machine / cloud computing environment. This work compares the performance of web servers managed under the CLBVM policy with that of independent virtual machines (VMs) running on a single physical server using Xen virtualization. The paper discusses the efficacy and feasibility of using this kind of policy for overall performance improvement.

Keywords
Virtualization, Cloud Computing, Resource Allocation, Distributed Computing, Load Balancing, Live Migration

1. INTRODUCTION

Today, we have high-capacity servers with ample network bandwidth and high-speed, multi-core processors. Most of the time, however, these servers remain underutilized; their high capacity exists only to meet a few peak-load periods. Virtualization is playing a key role in server consolidation and performance isolation. Virtualization is a technique that multiplexes hardware resources and enables users to run multiple operating systems on the same physical hardware. It provides several features such as isolation of different operating systems, security, scalability, and server consolidation. In general, a virtualized system has at least one running OS known as the “host OS”, while the others are popularly known as guest OSes or simply guests. One important use of virtualization is in data centers, where it helps to use resources effectively. The issue with server consolidation in data centers is that it can lead to poor performance due to overload.


Overload of resources such as CPU, network, and I/O, if not managed properly, poses threats to the reliability and performance of applications hosted in a cloud environment. Every service in the cloud needs resources in terms of network, disk space, CPU time, main memory, etc. It is true that virtualization has some overhead which reduces the performance of a VM. Unfortunately, it is difficult to provide a single formula for predicting how well an application will perform in a virtual environment; this can only be assessed by stress-testing the virtualized environment. When measuring performance, it is important to keep in mind that the best predictions are made using real-world activity. We can approximate performance by running the applications inside the VMs and measuring the throughput of common activity under varying load. Our work tries to simulate real traffic and analyse its behaviour in a cloud environment using the proposed CLBVM policy, comparing the results with isolated VMs. The rest of this paper is organized as follows: Section 2 introduces cloud computing and the need for load balancing, Section 3 discusses motivation and related work, Section 4 presents the experimental setup and results, and Section 5 concludes with the scope for future work.

2. CLOUD COMPUTING

Cloud computing is an emerging paradigm that harnesses the power of the Internet and wide area networks (WANs). It can use remotely available resources, providing cost-effective solutions to many real-life requirements. “A Cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resource(s) based on service-level agreements established through negotiation between the service provider and consumers” [2]. One of the most important mechanisms a distributed operating system can provide is the ability to move a process from one processor to another; this mechanism is called process migration [7]. Migrating OS instances across distinct physical hosts allows a clean separation between hardware and software, and facilitates fault management, load balancing, and low-level system maintenance. By carrying out the majority of the migration while the OS continues to run, live migration achieves impressive performance with minimal service downtime; a detailed discussion can be found in Clark et al. [3]. Performance enhancement is one of the most important issues in virtualization-based distributed systems. Obvious but expensive ways of achieving this goal are to increase the capacity of the participating servers and to add more nodes to the system. Adding more nodes or increasing the capacity of some of the nodes may be needed when all of the nodes in the system are overloaded.

However, in many situations poor performance is due to uneven load distribution. The performance can often be improved to an acceptable level by simply redistributing the load among the physical servers (pServers). Hence, load redistribution is a cost-effective way to improve the reliability and performance of the overall system. This kind of problem is categorized as load balancing or load sharing [10].

2.1 Need for Load Balancing in Cloud

Load balancing is a process by which inbound IP traffic is distributed across multiple servers. It enhances performance, leads to optimal utilization, and ensures that no single server is overwhelmed. A server farm or data center can contain multiple servers, each hosting multiple guests. Each guest may carry a different load, leading to situations where some servers become saturated in terms of computational resources, memory, and I/O devices. For simplicity, and without loss of generality, we consider load in terms of CPU time. When a server A is unable to allocate a sufficient CPU time slice to a guest because of heavy demand from other VMs running in parallel, while another server B is idle, we can redistribute the load from A to B by migrating a few guests to B. This is the typical situation calling for a load balancing policy in a cloud computing scenario, improving server idle times, marginal job response times, etc. To achieve short response times and high system throughput, the following characteristics need to be considered: a) the load balancing process generates little traffic overhead and adds low overhead on the computational and network resources; b) it keeps up-to-date load information about the participating systems; c) it balances the system uniformly and takes action instantaneously or on a periodic basis; d) it can run on a dedicated system, or it can be a decentralized effort; e) the available server should have sufficient resources to host and run the migrated guest; f) the live migration of a complete operating system should take an acceptably short time with minimal downtime; g) network communication should be reliable and fast.

3. MOTIVATION AND RELATED WORK

Various load balancing algorithms have been proposed, addressing most of the issues in distributed environments. Classic works by Kunz [9], Ni et al. [12], Zhou [16], and Eager et al. [4] include policies in which each node periodically broadcasts its status to all other nodes. They [16, 4] also demonstrate that algorithms using a threshold policy achieve better performance. Lin and Raghavendra [10] proposed a load balancing mechanism called the “Dynamic Load Balancing Policy with a Central Job Dispatcher (LBC)”. A study by Hac and Johnson [6] on dynamic load balancing shows that periodic policies using a distinguished agent for distributing the load information reduce the traffic overhead and scale better. Singh et al. [14] and Korupolu et al. [8] proposed the coupled placement advisor (CPA) algorithm, which optimizes the coupled placement of an application's computational and storage resources on nodes in a modern virtualized data center. It uses a greedy matchmaking approach inspired by the Knapsack and Stable Marriage problems. VMware DRS [1] dynamically balances computing capacity across a collection of hardware resources aggregated into logical resource pools, allocating the available resources among the VMs based on pre-defined policies that reflect business needs; when a VM experiences increased load, it automatically allocates additional resources from the resource pool. Wood et al. [15] presented a greedy heuristic hotspot mitigation algorithm.

Their heuristic determines which overloaded VMs need to be migrated and where, such that the migration overhead is minimized. To identify the most and least overloaded servers, they introduced new metrics such as volume and volume-to-size ratio (VSR), which are based on resource utilization and the size of the memory footprint. Using these metrics, the algorithm generates a list of overloaded VMs and a new destination physical machine for each VM. In cases where there are insufficient idle resources on lightly loaded machines, the migration algorithm further considers VM swaps as an alternative. Park et al. [13] considered an automated strategy for VM migration in self-managing virtualized server environments and proposed an optimization model based on linear programming (LP) that minimizes operational and migration costs; they concluded that VM migration should not be handled manually and that a fine-grained algorithm is needed. Our work is primarily motivated by the earlier work of Lin and Raghavendra [10], Park et al. [13], and a few other related works. We have found very few publications in this area and believe this work will add significant value to the scientific community at large.

3.1 Problem Definition

We can have multiple servers in a server farm, each physical server hosting multiple guests. Each guest may have a different load, leading to an unbalanced situation across CPU, memory, and I/O resources. We believe the concept and policy proposed in [10] can be extended to handle virtualization-based load balancing for VMs.

3.2 Proposed Approach

In our system, we denote each guest as a job and each pServer as a node. We propose a load-balancing policy with a central dispatcher called the “Central Load Balancing Policy for Virtual Machines (CLBVM)”, which makes load-balancing decisions based on global state information. The policy's information and location rules are centralized, while the transfer rule is partially distributed and partially centralized.
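To make the information rule concrete, the following is a minimal sketch (in Python) of the kind of report a pServer daemon could send and of the state table the central dispatcher could keep. The class and field names, and the three-level labels, are our own illustration and are not taken from the actual implementation.

from dataclasses import dataclass
from typing import Dict, List

# Hypothetical report sent by the monitoring daemon on a pServer
# whenever its load label changes (information rule).
@dataclass
class PServerReport:
    host: str                 # pServer identifier
    label: str                # "H", "M" or "L"
    aggregate_cpu: float      # aggregate CPU utilization in percent
    vm_cpu: Dict[str, float]  # per-VM CPU share, e.g. {"vm1": 42.0}

class Dispatcher:
    """Central dispatcher: keeps the latest report per pServer."""

    def __init__(self) -> None:
        self.state: Dict[str, PServerReport] = {}

    def update(self, report: PServerReport) -> None:
        self.state[report.host] = report

    def hosts_with_label(self, label: str) -> List[str]:
        return [h for h, r in self.state.items() if r.label == label]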

Figure 1: Experimental Setup

3.3 CLBVM Policy

The CLBVM policy takes into account the following considerations:

1. Network load is constant and does not change frequently.

2. Each VM has a distinct identification.

3. A load-information collector daemon runs continuously on each pServer, collecting the aggregate CPU load and the system utilization of the guests. Based on the collected data, each pServer labels itself as Heavy (H), Moderate (M), or Light (L).

4. These state messages are exchanged with the master server, which periodically takes load balancing decisions. Heavily loaded systems are balanced first against lightly loaded ones; if all pServers are evenly loaded, no migration is performed.

Figure 3: Comparison of different VMs at a data rate of 1024 conn/sec

5. Frequent changes in state are taken into account by the load balancing algorithm, so that unnecessary migrations are avoided (a sketch of this classification and notification step is given after the list).
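The daemon-side classification and notification step (items 3-5) can be sketched as follows. The thresholds, sampling interval, and stability window are hypothetical values chosen for illustration, and the sample_cpu and notify_master callables stand in for the xenmon-based measurement and the message to the master server; the actual parameters of xen_client.py are not reproduced here.

import time

# Hypothetical cut-offs; the actual values used in our implementation
# are not stated in the paper.
HEAVY_THRESHOLD = 75.0   # percent aggregate CPU -> "H"
LIGHT_THRESHOLD = 40.0   # percent aggregate CPU -> "L"
SAMPLE_INTERVAL = 5      # seconds between samples
STABLE_SAMPLES = 6       # samples that must agree before a state change

def classify(cpu_percent: float) -> str:
    if cpu_percent >= HEAVY_THRESHOLD:
        return "H"
    if cpu_percent <= LIGHT_THRESHOLD:
        return "L"
    return "M"

def monitor(sample_cpu, notify_master):
    """Label the pServer H/M/L and notify the master only when the new
    label has persisted for several samples (consideration 5)."""
    current, candidate, streak = "M", "M", 0
    while True:
        label = classify(sample_cpu())     # e.g. aggregate CPU from xenmon
        if label == candidate:
            streak += 1
        else:
            candidate, streak = label, 1
        if candidate != current and streak >= STABLE_SAMPLES:
            current = candidate
            notify_master(current)         # report the new state to the master
        time.sleep(SAMPLE_INTERVAL)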

4. EXPERIMENTAL SETUP AND RESULTS

We carried out our experiments on Intel Pentium 4 machines (1.8 GHz CPU, 2 GB RAM) running CentOS 5.2 with the Xen kernel (2.6.18-92.el5xen) for both host and guests. We ran multiple experiments for different scenarios to measure the performance of the web server by varying the load, connection rate, etc. Figure 1 shows the setup. We modified the xenmon [5] script into xen_client.py, which runs on each pServer and measures the CPU usage of each VM as well as the aggregate CPU usage. We also developed clbvm_balancer, written in C, which maintains up-to-date data on the CPU usage of every pServer. Each pServer marks itself as H, M, or L based on its CPU usage and notifies clbvm_balancer whenever its state changes. The clbvm_balancer periodically executes the load balancing algorithm to keep the system evenly balanced, instructing a heavily loaded pServer to transfer a lightly loaded VM to a lightly loaded pServer. For our experiments we used the Apache web server, httperf [11], and a long-running, CPU-intensive OpenMPI-based application. First, we established a benchmark for the web server using httperf on a native Linux system. We then repeated the experiment on the Xen kernel (Dom0) and on VM1 hosted on Xen. We created three VMs with similar configurations (512 MB RAM) on each of the three pServers, each running a web server, and applied load with httperf by varying the number of connections per second. We also executed the OpenMPI-based CPU-intensive application to unbalance the system, test the usefulness of the CLBVM policy, and evaluate the performance of the web server hosted on the VMs. Compared with isolated systems hosting several VMs, the CLBVM policy yielded a performance improvement of up to 20% in most cases.
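The httperf runs can be reproduced with a small driver along the following lines. The server address, test URI, and timeout are illustrative placeholders, while --server, --rate, --num-conns, and the other flags are standard httperf options; this is a sketch of the procedure rather than the exact scripts used in our experiments.

import subprocess

SERVER = "192.168.1.10"          # hypothetical address of the VM under test
RATES = [512, 1024, 2048]        # connections per second, as in Figures 2-4
TOTAL_CONNS = [512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072]

def run_httperf(rate: int, num_conns: int) -> str:
    """Invoke httperf once and return its textual report."""
    cmd = [
        "httperf",
        "--server", SERVER,
        "--port", "80",
        "--uri", "/index.html",   # illustrative test page
        "--rate", str(rate),
        "--num-conns", str(num_conns),
        "--timeout", "5",
    ]
    return subprocess.run(cmd, capture_output=True, text=True).stdout

if __name__ == "__main__":
    for rate in RATES:
        for conns in TOTAL_CONNS:
            print(run_httperf(rate, conns))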

Figure 2: Comparison of different VMs at a data rate of 512 conn/sec

Figure 4: Comparison of different VMs at a data rate of 2048 conn/sec

We found that when the data rates were low, the web server hosted on the native Linux kernel was able to handle almost all connections with very few time-outs and little file descriptor (FD) unavailability. As we increased the data rates, a significant drop in throughput was seen. The dips in the graphs are due to the unavailability of FDs to handle new requests: only a limited number of FDs (1024 by default, which can be changed) are available to a single process. When a socket closes, it enters the TIME_WAIT state for 60 seconds. We therefore run each benchmark for successive numbers of connections (i.e. 512, 1024, 2048, ..., 131072) and wait for all sockets to leave the TIME_WAIT state before continuing with the next benchmark, so as to avoid reaching the port number limitation. We treat these results as the base case for the further experiments analysing the behaviour of the Xen hypervisor, the VMs hosted on Xen, and the overall performance of the CLBVM policy. We found similar behaviour for the Xen Dom0 VM, which also shows almost 100% throughput when the data rates are low; refer to Figures 2, 3 and 4. As the data rates increase, many timed-out connections and FD unavailability are observed, resulting in decreased throughput, which is due to the additional layer of the Xen hypervisor. Xen uses synchronous calls from a domain to the hypervisor (hypercalls), while notifications are delivered from the hypervisor to domains using an asynchronous event mechanism. It remains to be seen whether this request-processing latency is due to accepting incoming connections, writing the response (the non-blocking write() system call), managing the cache, or some other unforeseen problem. Similarly, Xen-VM1 achieves 100% throughput at lower data rates but shows decreased throughput and many timed-out connections at higher rates; at higher data rates its throughput is similar to that of Dom0 running on Xen. This is primarily because of the bridge connections used to multiplex the Ethernet cards and maintain network connections with the guests; Dom0 also takes some processing time, which contributes to this behaviour.
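The wait for sockets to leave TIME_WAIT between successive benchmark runs, described above, can be implemented by polling the kernel connection table. The sketch below reads /proc/net/tcp, in which state code 06 denotes TIME_WAIT (IPv4 sockets only); it is our illustration of the procedure, not the exact script used in the experiments.

import time

TIME_WAIT = "06"   # TCP state code for TIME_WAIT in /proc/net/tcp

def time_wait_count() -> int:
    """Count sockets currently in the TIME_WAIT state."""
    count = 0
    with open("/proc/net/tcp") as f:
        next(f)                          # skip the header line
        for line in f:
            fields = line.split()
            if len(fields) > 3 and fields[3] == TIME_WAIT:
                count += 1
    return count

def wait_for_time_wait_drain(poll_seconds: int = 5) -> None:
    """Block until no sockets remain in TIME_WAIT (about 60 s per socket)."""
    while time_wait_count() > 0:
        time.sleep(poll_seconds)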

Communication from Xen to a domain is provided by an asynchronous event mechanism, which replaces the usual delivery mechanisms for device interrupts. Xen uses a ring, implemented as a circular queue of descriptors, allocated by a domain and accessible from within Xen. The descriptors do not directly contain I/O data; instead, I/O data buffers are allocated by the guest OS and indirectly referenced by the I/O descriptors. Access to each ring is based on two pairs of producer-consumer pointers: domains place requests on a ring, advancing a request producer pointer, and Xen removes these requests for handling, advancing an associated request consumer pointer. Responses are placed back in the same fashion. The VM associates a unique identifier with each request, which is reproduced in the associated response; this allows Xen to unambiguously reorder I/O operations due to scheduling or priority considerations. We believe this is the main reason behind the degradation of throughput on the VMs. VM1 shows degraded behaviour, as expected, at all connection rates when another VM is running a compute-intensive task and consuming more CPU time. At slower rates, throughput varied between 80% and 95%, whereas degraded performance is observed at higher data rates, which prohibits the use of such VMs for real traffic. After investigating the scheduling mechanism, we found that this is due to the working of the credit scheduler. The other VM, running the compute-intensive job, consumed more credits because the scheduler supports work-conserving (WC) mode: the shares are merely guarantees, and the CPU is idle if and only if there is no runnable client. In other words, with two clients of equal weight, when one client is blocked the other can consume the entire CPU; but when VM1 then needs more CPU time, it receives only its share percentage and suffers, leading to relatively poor performance. This behaviour of VM1 led us to address the issue with a load balancing mechanism across multiple servers with virtualization support. After adopting our CLBVM policy the results were encouraging, showing slightly better throughput. Our CLBVM algorithm has two parts: client code running as a daemon service on all pServers, which updates the central server whenever the pServer changes its state among H, M, and L; and the central server, which runs the load balancing algorithm every N minutes (we used N = 10) and instructs a heavily loaded server to transfer a lightly loaded VM to a lightly loaded server. We believe that, despite the extra CPU burden of executing the client code on each server, we achieved a considerable throughput improvement; the client code consumed barely 1-2% of CPU time.
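A sketch of the central balancing pass follows: every N minutes it pairs heavily loaded pServers with lightly loaded ones and live-migrates the least active VM. We assume here that migration is triggered through Xen's standard xm migrate --live command; the state table shown is a hypothetical snapshot, and the host and VM names are placeholders.

import subprocess
import time

BALANCE_PERIOD = 10 * 60   # N = 10 minutes, as in our experiments

# Hypothetical snapshot of the state table built from the daemons' reports:
# per-pServer label plus the CPU share consumed by each of its VMs.
state = {
    "pserver1": {"label": "H", "vms": {"vm1": 70.0, "vm2": 15.0}},
    "pserver2": {"label": "L", "vms": {"vm3": 5.0}},
}

def live_migrate(vm: str, target: str) -> None:
    # Standard Xen live migration; in practice this command has to be
    # issued on (or relayed to) the source pServer hosting the VM.
    subprocess.run(["xm", "migrate", "--live", vm, target], check=True)

def balance_once() -> None:
    heavy = [h for h, s in state.items() if s["label"] == "H"]
    light = [h for h, s in state.items() if s["label"] == "L"]
    for src, dst in zip(heavy, light):
        # Pick the least active VM on the heavy host to keep migration cost low.
        vm = min(state[src]["vms"], key=state[src]["vms"].get)
        live_migrate(vm, dst)

if __name__ == "__main__":
    while True:
        balance_once()
        time.sleep(BALANCE_PERIOD)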

5. CONCLUSION AND FUTURE WORK

Our work extends the application of an existing model for load balancing of jobs in a distributed computing environment to VMs. The proposed CLBVM policy is at an inception stage and requires further work on I/O as well as memory availability on the target server. The CLBVM policy has the potential to improve the performance of the pServers overall, though it does not consider fault tolerance. We tried to make the system completely distributed, such that if the performance of a VM is affected by another VM, it can be moved to a lightly loaded server on the fly. The migration of an OS adds an extra cost to the overall system, which is minimized by selecting the least active VM. We also need to consider a fault-tolerant setup with a backup in case the master server fails. Based on these results, we conclude that this work can be scaled to handle more nodes in real situations, realizing true cloud computing. Our implementation is strictly meant for Xen-based systems. The CLBVM policy was tested on limited-capacity desktop PCs.

Hence, for more realistic results, further experiments on server machines with real traffic need to be carried out. Decentralized load balancing algorithms would be an interesting area to explore.

6. REFERENCES

[1] VMware Infrastructure: Resource Management with VMware DRS. Tech. rep., VMware, 2006.
[2] Buyya, R., Yeo, C. S., Venugopal, S., Broberg, J., and Brandic, I. Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future Gener. Comput. Syst. 25, 6 (2009), 599–616.
[3] Clark, C., Fraser, K., Hand, S., Hansen, J. G., Jul, E., Limpach, C., Pratt, I., and Warfield, A. Live migration of virtual machines. In Proceedings of the 2nd Symposium on Networked Systems Design & Implementation (2005), pp. 273–286.
[4] Eager, D. L., Lazowska, E. D., and Zahorjan, J. Adaptive load sharing in homogeneous distributed systems. IEEE Trans. Softw. Eng. 12, 5 (1986), 662–675.
[5] Gupta, D., Gardner, R., and Cherkasova, L. XenMon: QoS monitoring and performance profiling tool. Tech. Rep. HPL-2005-187, HP Labs, 2005.
[6] Hac, A., and Johnson, T. A study of dynamic load balancing in a distributed system. SIGCOMM Comput. Commun. Rev. 16, 3 (1986), 348–356.
[7] Hansen, J. G., and Jul, E. Self-migration of operating systems. In Proceedings of the 11th ACM SIGOPS European Workshop (2004), p. 23.
[8] Korupolu, M., Singh, A., and Bamba, B. Coupled placement in modern data centers. In IEEE International Symposium on Parallel & Distributed Processing (2009), pp. 1–12.
[9] Kunz, T. The influence of different workload descriptions on a heuristic load balancing scheme. IEEE Trans. Softw. Eng. 17, 7 (1991), 725–730.
[10] Lin, H.-C., and Raghavendra, C. S. A dynamic load-balancing policy with a central job dispatcher (LBC). IEEE Trans. Softw. Eng. 18, 2 (1992), 148–158.
[11] Mosberger, D., and Jin, T. httperf: A tool for measuring web server performance. 1998.
[12] Ni, L., Xu, C.-W., and Gendreau, T. A distributed drafting algorithm for load balancing. IEEE Trans. Softw. Eng. SE-11, 10 (1985), 1153–1161.
[13] Park, J.-G., Kim, J.-M., Choi, H., and Woo, Y.-C. Virtual machine migration in self-managing virtualized server environments. In 11th International Conference on Advanced Communication Technology (2009), vol. 03, pp. 2077–2083.
[14] Singh, A., Korupolu, M., and Bamba, B. Integrated resource allocation in heterogeneous SAN data centers. In Proceedings of the 26th Annual ACM Symposium on Principles of Distributed Computing (2007), pp. 328–329.
[15] Wood, T., Shenoy, P., Venkataramani, A., and Yousif, M. Black-box and gray-box strategies for virtual machine migration. In 4th USENIX Symposium on Networked Systems Design & Implementation (2007), pp. 229–242.
[16] Zhou, S. A trace-driven simulation study of dynamic load balancing. IEEE Trans. Softw. Eng. 14, 9 (1988), 1327–1341.