A Virtualized HPC Cluster Computing Environment on Xen with Web-Based User Interface*
Chao-Tung Yang1,∗∗, Chien-Hsiang Tseng1,2, Keng-Yi Chou1, and Shyh-Chang Tsaur3
1 High-Performance Computing Laboratory, Department of Computer Science, Tunghai University, Taichung, 40704, Taiwan, ROC
{ctyang,g97357006,g97350007}@thu.edu.tw
2 ARAVision Incorporated, Sindian City, Taipei County, 23141, Taiwan, ROC
[email protected]
3 Department of Electronics, National Chin-yi University of Technology, Taichung County 411, Taiwan, ROC
[email protected]
Abstract. Large-scale computing problems are now common, and a PC cluster computing system is a solution to the problems faced by end users. System virtualization has likewise become a popular topic in recent years, and enterprises and end users will need this technology in future applications. This paper describes how to implement a cluster computing system integrated with virtualization on Xen, explains how it works, and contrasts the non-virtualized and virtualized configurations. The virtual cluster computing system is more efficient in operation and more economical in power than the traditional cluster computing system, and virtualization is a trend in cluster systems. The virtual cluster computing system can be used to deal with large-scale problems in place of the traditional cluster computing system.
Keywords: Cluster Virtualization, Cluster Computing System, Virtualization computing.
1 Introduction
Over the past 50 years, human beings have changed ecosystems more rapidly and extensively than in any comparable period of human history, largely to meet rapidly growing demands for food, fresh water, timber, fiber, and fuel. The rapid loss of the earth's resources has become a significant issue in recent years. Taiwan, as a member of the global village, must give immediate attention to global warming and climate change. A traditional cluster system, consisting of many physical computers, consumes a great deal of power, and the heat generated by such a system contributes to global warming. Therefore, the virtual cluster computing system is a trend for saving energy and protecting the environment in future development.
This paper describes how to implement a cluster computing system integrated with virtualization on Xen, explains how it works, and contrasts the non-virtualized and virtualized configurations [1-7]. The virtual cluster computing system is more efficient in operation and more economical in power than the traditional cluster computing system, and virtualization is a trend in cluster systems. The virtual cluster computing system can be used to deal with large-scale problems in place of the traditional cluster computing system.
The rest of this paper is organized as follows. Section 2 discusses the virtual cluster computing system in detail. Section 3 presents our system architecture, web interface, resource broker, and experiments using virtualization. Finally, conclusions and future work are given in Section 4.
* This work is supported in part by National Science Council, Taiwan R.O.C., under grants no. NSC 96-2221-E-029-019-MY3 and NSC 98-2220-E-029-004.
∗∗ Corresponding author.
W. Zhang et al. (Eds.): HPCA 2009, LNCS 5938, pp. 503–508, 2010. © Springer-Verlag Berlin Heidelberg 2010
2 Background Review
A virtual cluster computing system combines a cluster of connected computers with virtualization of computing resources.
2.1 Cluster Systems
A cluster system is a group of linked computers that work together so closely to complete jobs that in many respects they form a single computer. The components of a cluster, each of which is called a “node”, are commonly, but not always, connected to each other through a fast LAN. By composition, clusters are divided into homogeneous and heterogeneous types. By function, they can be categorized into three types: High Availability Clusters, Load Balancing Clusters, and High Performance Computing Clusters.
2.2 Virtualization
Virtualization is simply the logical separation of the request for some service from the physical resources that actually provide that service. There are five or more virtualization technologies, but we discuss only two of them, which are known to most people. To evaluate the difference between virtualization and non-virtualization, the virtualization software used in this paper is Xen. Xen was chosen as our system’s virtual machine monitor because it provides better efficiency, supports different operating systems simultaneously, and gives each operating system an independent system environment.
3 System Implementation
In our framework, a cluster computing system built on Xen is the main architecture used to save power.
3.1 System Architecture
A Beowulf cluster uses a multi-computer architecture. It is a parallel computing system that consists of one or more head nodes and a number of tail nodes (also called compute nodes or cluster nodes), which are interconnected via widely available network interconnects. All of the nodes in a typical Beowulf cluster are commodity systems (PCs, workstations, or servers) running commodity software. Starting from this traditional cluster framework, we introduce virtualization into the cluster system to save power, which changes the framework in one respect: in a traditional cluster, all physical compute hosts communicate through the Message Passing Interface (MPI), whereas in the virtualized computing cluster, fully paravirtualized machines are used instead of physical machines.
Our cluster system was built with four homogeneous computers. Each is equipped with Intel Xeon E5410 2.33 GHz processors, 8 GB of memory, and a 500 GB disk, runs the Fedora 8 operating system, and is connected to a gigabit switch. As in the experiments described in Section 3.3, the master node not only serves as the NFS server, NIS server, etc., but can also act as a compute node during job computing. We compare virtualization with non-virtualization in terms of computing efficiency and power saving, using two software stacks, with Xen and without Xen. The software stack with Xen is shown in Fig. 1.
Fig. 1. Software stack of node with Xen
3.2 Resource Broker
A resource broker lets end users submit their jobs easily. Our portal supports both VM and non-VM environments. In the VM environment, it can provide heterogeneous platforms, and it also lets users upload their job files and compiles them automatically. The resource broker is the core of this system: it controls the total computing resources, assigns available resources to end users, and manages licenses for the system. The flow between end users and computing resources is shown in Fig. 2. The workflow is as follows:
1. Upload source code through the UI
2. Request available resources
3. Request management and booting of resources
4. Start computing
5. Get results
6. Use the UI to download results
Fig. 2. Flow of resource broker
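The six workflow steps above can be read as a strictly ordered pipeline. The following is a minimal sketch of that ordering only, not the authors' implementation: the names `Step`, `BrokerJob`, and `run_workflow` are hypothetical, and no real upload, booting, or computation is performed.

```python
from enum import Enum


class Step(Enum):
    """The six workflow steps, numbered as in the list above."""
    UPLOAD_SOURCE = 1
    REQUEST_RESOURCES = 2
    BOOT_RESOURCES = 3
    START_COMPUTING = 4
    GET_RESULTS = 5
    DOWNLOAD_RESULTS = 6


class BrokerJob:
    """Hypothetical job record that must pass through the steps in order."""

    def __init__(self, name):
        self.name = name
        self.completed = []

    @property
    def done(self):
        return len(self.completed) == len(Step)

    def advance(self, step):
        # Reject any step that is not the next one in the workflow.
        if self.done:
            raise ValueError("job already finished")
        expected = Step(len(self.completed) + 1)
        if step is not expected:
            raise ValueError(f"expected {expected.name}, got {step.name}")
        self.completed.append(step)


def run_workflow(name):
    """Walk a job through all six steps in their defined order."""
    job = BrokerJob(name)
    for step in Step:
        job.advance(step)
    return job
```

In this sketch, a job that tries to start computing before resources have been requested and booted is rejected, which mirrors the role of the broker as the single point that grants resources.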
We define the following equation for the resource broker.
• NodeTotalvnum: Total available virtual machines.
• NodeRealvnum: Real virtual machines.
• Cn: Total available virtual machines per physical node.
• Memp: Physical memory on one physical node.
• Memv: Virtual memory on one virtual node.
• L: License count.
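The symbols above suggest a capacity calculation. The following is a minimal sketch of one plausible reading, not the authors' equation: it assumes that Cn is the number of virtual machines whose memory fits in one physical node, that NodeTotalvnum is Cn multiplied by the number of physical nodes, and that NodeRealvnum is capped by the license count L. The function and parameter names are hypothetical, and memory reserved for the hypervisor and the privileged domain is ignored.

```python
def vms_per_node(mem_p_mb, mem_v_mb):
    """Cn (assumed): VMs that fit in one physical node without exceeding its memory."""
    if mem_p_mb <= 0 or mem_v_mb <= 0:
        raise ValueError("memory sizes must be positive")
    return mem_p_mb // mem_v_mb


def broker_capacity(physical_nodes, mem_p_mb, mem_v_mb, licenses):
    """Return (Cn, NodeTotalvnum, NodeRealvnum) under the assumptions stated above."""
    c_n = vms_per_node(mem_p_mb, mem_v_mb)
    node_total_vnum = c_n * physical_nodes            # memory-limited total
    node_real_vnum = min(node_total_vnum, licenses)   # license-limited total
    return c_n, node_total_vnum, node_real_vnum
```

For example, with one physical node of 8192 MB, a hypothetical 2048 MB per virtual machine, and three licenses, `broker_capacity(1, 8192, 2048, 3)` yields four virtual machines by memory, of which three may actually run.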
Using this equation, we can:
1. Count the number of virtual machines.
2. Avoid using physical swap. Performance is preserved because the total memory of the virtual machines can be no larger than the memory of the physical machine.
3. Control license counts.
3.3 Experimental Results
We focus on power saving and compare the efficiency of a traditional cluster with that of a virtual cluster. Matrix multiplication, LINPACK, and LU test sets are used to verify that a virtual cluster saves more power and operates more efficiently than a traditional cluster.
Matrix multiplication in mathematics is the operation of multiplying a matrix by either a scalar or another matrix. In our tests, the matrix size is varied. Because we focus on saving power, matrix multiplication is the test set used to verify that virtualization saves more power than non-virtualization. Here we compare a physical cluster consisting of four real computers, each with eight cores, against four virtual machines, each with eight cores, running on one real computer. We can therefore calculate the power consumption of virtualization and non-virtualization using the equations below:
Equation 1
Equation 2
Equation 3
The CPU, an Intel Xeon E5410, needs 80 watts to operate, so by Equation 1 each computer needs 160 watts. By Equation 2, the cluster power is 640 watts for non-virtualization and 160 watts for virtualization. By Equation 3, we obtain a set of thermal power values for non-virtualization and virtualization. All data sets are shown in Table 1. Because our real cluster has eight cores per computer, there are 32 cores in total; therefore, the matrix size has to start from 32 or more. Using the experimental data, we plot the thermal power of non-virtualization and virtualization as a line chart, shown in Fig. 3.
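The arithmetic in this paragraph can be checked with a short sketch. The formulas below are an assumption inferred from the stated numbers (80 W per CPU, 160 W per computer, 640 W and 160 W per cluster) and are not a reproduction of Equations 1 to 3. In particular, the count of two CPUs per computer and the energy formula (cluster power multiplied by run time, in watt hours, the unit named in the caption of Fig. 3) are inferred, and the run time is a hypothetical input.

```python
CPU_WATTS = 80          # Intel Xeon E5410, as stated in the text
CPUS_PER_COMPUTER = 2   # assumption: two quad-core CPUs give eight cores per computer


def computer_watts(cpu_watts=CPU_WATTS, cpus=CPUS_PER_COMPUTER):
    """Power drawn by the CPUs of one physical computer, in watts."""
    return cpu_watts * cpus


def cluster_watts(physical_computers):
    """Power of the whole cluster; only physical computers draw power."""
    return computer_watts() * physical_computers


def energy_watt_hours(physical_computers, run_seconds):
    """Assumed energy formula: cluster power times run time, in watt hours."""
    return cluster_watts(physical_computers) * run_seconds / 3600.0


non_virtualized = cluster_watts(4)  # four real computers
virtualized = cluster_watts(1)      # four virtual machines on one real computer
```

Under these assumptions the virtualized cluster draws a quarter of the power, so it uses less energy for a given job as long as its run time is less than four times that of the physical cluster.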
4 Conclusions and Future Work
Integrating virtualization with a cluster computing system is a trend. Virtualization is simply the logical separation of the request for some service from the physical resources that actually provide that service. The experiment shows that a virtualized cluster is both more efficient and more economical in power than a non-virtualized cluster. Virtualization is not good for every application, however; it is simply better than non-virtualization for some applications, such as email, FTP, and HTTP servers, which can be built and managed using virtualization. Since a cluster's compute nodes can also be virtualized, virtualization not only saves power but also frees other physical computers to do other work. In the next stage of our system, we will build more functions into the user interface for end users who are not specialists in computer science but need cluster computing to solve large-scale problems. The proposed resource broker handles all physical and virtualized resources and dispatches the resources that end users are authorized to use. Adding more logic functions to the resource broker will improve its management.
Fig. 3. Thermal power of matrix multiplication, in watt hours
References
1. Dong, Y., Li, S., Mallick, A., Nakajima, J., Tian, K., Xu, X., Yang, F., Yu, W.: Extending Xen with Intel Virtualization Technology. Intel® Technology Journal 10(03), 1–14 (2006)
2. Sharifi, M., Hassani, M., Mousavi, S.L.M., Mirtaheri, S.L.: VCE: A New Personated Virtual Cluster Engine for Cluster Computing. In: 3rd International Conference on Information and Communication Technologies: From Theory to Applications, 2008. ICTTA 2008, April 7-11, pp. 1–6 (2008)
3. Liu, J., Huang, W., Abali, B., Panda, D.K.: High Performance VMM-Bypass I/O in Virtual Machines. In: Proceedings of USENIX 2006 (2006)
4. Huang, W., Liu, J., Abali, B., Panda, D.K.: A Case for High Performance Computing with Virtual Machines. In: ICS 2006: Proceedings of the 20th annual international conference on Supercomputing, pp. 125–134. ACM Press, New York (2006)
5. Menon, A., Santos, J.R., Turner, Y., Janakiraman, G., Zwaenepoel, W.: Diagnosing Performance Overheads in the Xen Virtual Machine Environment. In: VEE 2005: Proceedings of the 1st ACM/USENIX international conference on Virtual execution environments, pp. 13–23. ACM Press, New York (2005)
6. Cherkasova, L., Gardner, R.: Measuring CPU Overhead for I/O Processing in the Xen Virtual Machine Monitor. In: USENIX 2005 Annual Technical Conference, pp. 387–390 (2005), http://www.usenix.org/events/usenix05/tech/general/cherkasova.html
7. Smith, J.E., Nair, R.: The Architecture of Virtual Machines. Computer 38(5), 32–38 (2005)