A Resource Scheduling Algorithm of Cloud Computing Based on Energy Efficient Optimization Methods

Liang Luo, Wenjun Wu
National Key Lab of Software Environment Development, Beihang University, Beijing, China
Email: {luoliang,wwj}@nlsde.buaa.edu.cn

Dichen Di, Fei Zhang, Yizhou Yan, Yaokuan Mao
National Key Lab of Software Environment Development, Beihang University, Beijing, China
Email: {ddc,zhangf,yanyzh,myk}@nlsde.buaa.edu.cn

Abstract—Cloud computing has emerged as a flexible and powerful computational architecture that offers ubiquitous services to users. Unlike traditional computational environments, it accommodates interconnected hardware and software resources in a unified way: the resources are integrated into a shared pool, and software no longer resides in a single hardware environment but is executed according to the schedule of the resource pool for optimized resource utilization. Optimizing energy consumption in a cloud computing environment therefore becomes the question of how to use various energy conservation strategies to allocate resources efficiently. In this paper, we study the relationship between infrastructure components and the power consumption of a cloud computing environment, discuss how task types can be matched to component power adjustment methods, and then present a resource scheduling algorithm for cloud computing based on energy efficient optimization methods. The experimental results demonstrate that, for jobs that do not fully utilize the hardware, our algorithm can significantly reduce energy consumption.

Keywords—energy efficient; cloud computing; cluster

I. INTRODUCTION

The development of cloud computing is closely related to the progress of the Internet. In early 2003, Google released GFS, BigTable, MapReduce [1][2][3] and other technologies, and a number of companies and research institutions expressed great interest in this framework. The Apache Foundation then launched an open source distributed computing project, Hadoop [4], a framework that enables the distributed processing of large data sets across computer clusters and provides functionality similar to GFS, MapReduce, BigTable and Chubby. In March 2006, Amazon published the Simple Storage Service (S3) [5], a SOAP-based service that lets users store and access their own data objects; with it, the service provider offers a reliable and extensible online storage solution. In July 2007, Amazon launched the Simple Queue Service [6], which allows virtual hosts to send messages to each other and supports data transfer between distributed applications without the need to worry about message loss. Recently, Amazon began to provide the EBS (Elastic Block Store) service, which offers users block-level storage interfaces.


As a variety of cloud computing models and frameworks have been proposed, data centers composed of hundreds or even thousands of computing nodes have emerged, and their energy consumption is striking. Take Google's cloud computing center in Oregon as an example: when it operates at full load, its energy consumption equals the total amount consumed by all the households of a medium-sized city [7]. Microsoft's Dublin data center consumes 5.4 megawatts of electricity and may be expanded to 22.2 MW in the near future [8]. The Tianhe-1, a cluster computer in Tianjin, China, consumes 128 kW of electricity per hour, which is equivalent to the electricity consumption of 2 million ordinary families [9]. Such large-scale use of energy generates a lot of carbon emissions and has a huge impact on the environment. The leading technology consultancy Gartner, in its technology trend reports of 2008 and 2010, predicted that Green IT technology based on low-energy computing systems would be among the top ten most important IT technologies, with broad application prospects [10].

At present, resource allocation in cloud computing environments mostly uses pre-allocation: to guarantee the quality of service of an application, the maximum resource demand is allocated in accordance with the application's requirements. This approach is very simple, but it also brings problems. On the one hand, resource requirements cannot be predicted in advance; on the other hand, resource utilization is low, because once resources have been allocated, users can only wait until the application releases them [11].

Academia and industry have been working on the energy consumption and management of computers for years and have accumulated many research achievements and technologies. Recently, with advances in hardware and the application of new computing models (such as virtualization and cloud computing), the traditional methods of managing and saving energy have encountered challenges. Eugen Feller [12] summarized the energy consumption and adjustment methods of stand-alone computers and cluster servers. In Chung-Hsing Hsu's work [13], the authors continuously observed the energy consumption and utilization of servers from 2007 to 2010, pointing out that the server's default energy model has changed.

Meanwhile, many hardware vendors have started to provide embedded energy consumption adjustment mechanisms. However, the works above are mostly considered from the hardware point of view and may not be suitable for cluster systems with complex software environments. Based on these studies, many scholars have carried out more in-depth research. Rong Ge focused on parallel systems and I/O-intensive jobs [14][15]. In Charles Lively and Xingfu Wu's work [16], precise experimental probes are used to collect energy consumption and performance characteristics of different parallel implementations. There are also many studies on specific aspects of the energy efficiency of cloud computing systems; GreenHDFS, for example, is a Hadoop Distributed File System with energy management and regulation. On the basis of the above studies, we further investigate energy regulation strategies for cloud computing and apply them to a resource scheduling algorithm for cloud computing environments.

In this paper, we discuss the energy efficiency of a cloud computing environment. First, we study the energy consumption model and sort the computing resources into four categories: CPU, memory, storage and network. We design different regulation strategies for the different components. Then we propose a dynamic resource scheduling algorithm based on the energy optimization of CPU, main memory and storage, followed by an evaluation methodology. Finally, the simulation results and their analysis demonstrate the effectiveness of the algorithm.

The remainder of this paper is organized as follows. Section 2 describes the energy consumption of the four resource categories and the energy model of cloud computing. Section 3 discusses job classification and presents the algorithm with its energy efficient optimization methods. Section 4 briefly describes the evaluation methodology. Section 5 reports the experiments and the analysis of results. Section 6 concludes the paper and lists our future work.

II. ENERGY CONSUMPTION ANALYSIS OF CLOUD CLUSTERS

A. Structure of a Cloud Cluster Node

To control the energy consumption of cloud computing clusters effectively, we first need to understand where cloud computing consumes so much energy, so establishing an energy consumption model of cloud computing is the first priority. In a cloud computing environment, the structure of a node is similar to that of a traditional server; however, because the underlying server resources can be provided as a service directly to users, a node's parallel processing, network connectivity and storage performance are generally better.

Figure 1 General Structure of a Cloud Cluster Node.

A node is mainly composed of one or more multi-core CPUs, a number of multi-channel memory modules, a motherboard, various types of network interface cards, multiple types of storage, power supplies and fans. Multi-core CPUs support parallel task processing; a large number of memory modules keeps the CPUs running fast and avoids resource contention; several types of network interface cards handle data exchange between nodes and provide good support for cluster expansion; and multiple kinds of storage are supplied for massive data processing.

B. Energy Model of a Cloud Computing Environment

The overall energy consumption of a cloud computing system can be expressed as follows:

E_{Cloud} = E_{Node} + E_{Switch} + E_{Storage} + E_{Others}    (1)

E_{Node} represents the energy consumption of the nodes, E_{Switch} the energy consumption of all the switching equipment, E_{Storage} the energy consumption of the storage devices, and E_{Others} the energy consumption of the remaining parts, including the fans, current conversion losses and others. The formula can be further decomposed: for a cloud computing environment with n nodes, m switching devices and a centralized storage device, the energy consumption can be expressed as:

E_{Cloud} = n(E_{CPU} + E_{Memory} + E_{Disk} + E_{Mainboard} + E_{NIC})
          + m(E_{Chassis} + E_{Linecards} + E_{Ports})
          + (E_{NASServer} + E_{StorageController} + E_{DiskArray}) + E_{Others}    (2)

We classify and condense the terms of the above formula and express them through the following six-tuple, in which each parameter is a vector and represents a way to adjust the energy consumption of a certain class of devices:

(num, f_{CPU}, f_{Memory}, mode_{Disk}, speed_{Disk}, SpeedFactor_{Switch})    (3)

Finally, the overall energy consumption of the cloud computing environment can be expressed as a function of these parameters over the interval [t_1, t_2]:

E_{Cloud} = \int_{t_1}^{t_2} P(num, f_{CPU}, f_{Memory}, mode_{Disk}, speed_{Disk}, SpeedFactor_{Switch}) \, dt    (4)

Here num represents the number of nodes and of all adjustable components in a node; f_{CPU} refers to the adjustable frequency range of the CPU; mode_{Disk} and speed_{Disk} refer to the adjustable disk modes (idle, standby, sleep) and rotation speed; and SpeedFactor_{Switch} allows the NICs of the nodes and the ports of the switches to change their network throughput (10/100/1000 Mbps).
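To make Equation (4) concrete, the following minimal Python sketch evaluates the integral numerically from a time series of parameter states. The per-component power functions are illustrative placeholders of our own choosing, not values given in the paper.

```python
# Illustrative sketch of Equation (4): total energy is the integral of cluster power
# over time. The component power functions below are assumed placeholders, not
# measured values from the paper.

def node_power(f_cpu_ghz, mem_active_gb, disk_mode):
    """Assumed additive per-node power model (watts)."""
    cpu_w = 30.0 + 25.0 * f_cpu_ghz          # rough DVFS-style dependence on frequency
    mem_w = 2.0 * mem_active_gb              # power scales with active memory
    disk_w = {"active": 12.0, "idle": 8.0, "standby": 3.0, "sleep": 1.0}[disk_mode]
    return cpu_w + mem_w + disk_w

def cluster_power(num_nodes, f_cpu_ghz, mem_active_gb, disk_mode, speed_factor_switch):
    """P(num, f_CPU, f_Memory, mode_Disk, speed_Disk, SpeedFactor_Switch) of Eq. (4)."""
    switch_w = 100.0 * speed_factor_switch   # assumed: switch power scales with port speed
    return num_nodes * node_power(f_cpu_ghz, mem_active_gb, disk_mode) + switch_w

def cloud_energy(samples, dt_s):
    """Integrate a sequence of parameter states over [t1, t2] (rectangle rule)."""
    return sum(cluster_power(**state) for state in samples) * dt_s

# Example: 12 nodes, sampled once per minute for one hour.
state = dict(num_nodes=12, f_cpu_ghz=2.4, mem_active_gb=8, disk_mode="active",
             speed_factor_switch=1.0)
print(cloud_energy([state] * 60, dt_s=60.0), "joules")
```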

III. JOB CLASSIFICATION AND RESOURCE SCHEDULING ALGORITHM

From the discussion of the previous section we can see that the CPU, main memory, hard disks and other components can be set to different frequencies or modes to save power. Based on these hardware parameters and working modes, we can build an energy model: by comparing the hardware architectures of a large number of cloud computing systems, we investigate their structure, analyze the energy parameters of the components, and select four parameter classes (CPU, memory, storage and network) to characterize the energy consumption of a cloud computing environment. After determining the adjustment methods for the equipment states and the types of jobs, we create a corresponding template for each job type. Table 1 shows the job types, the policies and the corresponding templates.

Type of Job               | Dependence | Type of Policy                  | Policy Template
Compute intensive         | ★★★        | Network Policy + Storage Policy | Slow down network, turn off the extra module; reduce the disk speed, change memory status
Storage intensive         | ★★★        | CPU Policy                      | DVFS, VOVO
I/O intensive             | ★★★        | CPU Policy                      | DVFS, VOVO
Compute-Storage intensive | ★★ ★       | Network Policy                  | Slow down network, turn off the extra module
Compute-I/O intensive     | ★★ ★       | Storage Policy                  | Reduce the disk speed, change memory status
Storage-I/O intensive     | ★★ ★       | CPU Policy                      | DVFS, VOVO

Table 1 Classification of Jobs and Energy Policy Templates
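One possible way to encode Table 1 as a lookup structure is sketched below; the key and field names are our own and only illustrate how a scheduler might retrieve the template for a classified job.

```python
# Assumed encoding of Table 1: job type -> energy policy template.
POLICY_TEMPLATES = {
    "compute_intensive":         {"policies": ["network", "storage"],
                                  "actions": ["slow down network", "turn off extra modules",
                                              "reduce disk speed", "change memory status"]},
    "storage_intensive":         {"policies": ["cpu"], "actions": ["DVFS", "VOVO"]},
    "io_intensive":              {"policies": ["cpu"], "actions": ["DVFS", "VOVO"]},
    "compute_storage_intensive": {"policies": ["network"],
                                  "actions": ["slow down network", "turn off extra modules"]},
    "compute_io_intensive":      {"policies": ["storage"],
                                  "actions": ["reduce disk speed", "change memory status"]},
    "storage_io_intensive":      {"policies": ["cpu"], "actions": ["DVFS", "VOVO"]},
}

def template_for(job_type):
    """Return the policy template that the scheduler would apply to this job type."""
    return POLICY_TEMPLATES[job_type]
```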

In Table 1, the templates are established for different hardware architectures and different types of jobs: based on statistics about the hardware infrastructure and the regulation methods of all adjustable components, we create templates that change the mode of cluster components in order to save energy. Classifying jobs and matching them to the corresponding energy adjustment template is the key to the success of our approach.

The classification of a job is divided into two steps. In the first step, before running the job, the user chooses the type of his or her job, such as compute-intensive, memory-intensive, network-intensive or I/O-intensive. In the second step, when the job starts to execute, we test the job type, for example by counting the instruction execution speed on the CPU.

Infrastructure Preparation:
    resource_stat(Component, Num);
    energy_stat(Component, Value);
    scheme_settlement(CPU, MainMem, Storage, Network);

Job Pre-Processing:
    resource_scale_usage(CPU, MainMem, Storage, Network, JobType);
    Template_Extraction();
    Template_Apply(Template);

Job Execution:
    Energy_Scheme_Stat(node);
    start = n - 1;
    i = start;
    do {
        i = (i + 1) mod n;
        NumCPU    = available cores in Node_i  - cores needed by job;
        NumMemory = available memory in Node_i - memory needed by job;
        NumDisk   = available disks in Node_i  - disks needed by job;
        IF (NumCPU > 0 && NumMemory > 0 && NumDisk > 0)
            W(S_i) = 1;
        ELSE
            W(S_i) = 0;
        IF (W(S_i)) {
            schedule vm on Node_i;
            update the energy scheme;
            start = i;
            return Node_i;
        }
    } while (i != start);
    return NULL;

Figure 2 Resource Scheduling Algorithm for Isomorphic Nodes
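As a brief illustration of the second classification step described above: the paper counts instruction execution speed on the CPU, but as a rough substitute for illustration only, the sketch below infers a coarse job type from resource-utilization ratios sampled with psutil. The thresholds are arbitrary assumptions, not values from the paper.

```python
# Hypothetical second-step classifier: sample resource usage for a short window and
# guess the dominant resource. Thresholds are assumptions, not the authors' method.
import psutil

def classify_running_job(window_s=5):
    disk0, net0 = psutil.disk_io_counters(), psutil.net_io_counters()
    cpu = psutil.cpu_percent(interval=window_s)          # average CPU % over the window
    disk1, net1 = psutil.disk_io_counters(), psutil.net_io_counters()

    disk_mb = (disk1.read_bytes + disk1.write_bytes
               - disk0.read_bytes - disk0.write_bytes) / 2**20
    net_mb = (net1.bytes_sent + net1.bytes_recv
              - net0.bytes_sent - net0.bytes_recv) / 2**20

    if cpu > 80:
        return "compute_intensive"
    if disk_mb / window_s > 50:                          # > 50 MB/s of local disk traffic
        return "io_intensive"
    if net_mb / window_s > 50:                           # > 50 MB/s of network traffic
        return "network_intensive"
    return "memory_intensive"                            # fallback guess

print(classify_running_job())
```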

This algorithm retains the simplicity and efficiency of round-robin scheduling while adding the ability to regulate the energy consumption of resources. It requires that all nodes have the same hardware configuration. The algorithm is divided into three stages: Infrastructure Preparation, Job Pre-Processing and Job Execution. In the Infrastructure Preparation stage, resource_stat counts all the hardware of the cloud computing environment according to the component lists provided by the hardware vendors, energy_stat records the energy consumption and adjustment methods of the components, and scheme_settlement builds the four kinds of energy policy templates (CPU, MainMem, Storage, Network) to be used by different applications. In the Job Pre-Processing stage, the service provider collects the scale of resource usage from the user and then offers standard virtual machine templates, each of which includes a certain amount of resources such as CPUs, memory, storage and network.

Subsequently, an instance based on the chosen template is allocated to the user.

In the Job Execution stage, we first account for the previous energy-saving measures: unused nodes are hibernated, and for the nodes in use we record their current energy efficiency templates. Then we begin the resource scheduling and allocation process. Suppose there is a set of cluster servers S = {S0, S1, ..., Sn-1}. An indicator variable start points to the last selected node; it is initialized to n-1, i.e. it points to node n-1, where n > 0. In the algorithm we introduce an additional weight value W(Si) that denotes the availability of a node. When W(Si) = 0, the server is unavailable and cannot be scheduled; the purpose of this is to cleanly cut a failed node out of service (for example to shield a hardware failure or allow system maintenance) while ensuring that the other nodes are unaffected and keep working normally. The variable i represents the current node; it initially points to start, and after entering the resource allocation cycle it points to the successor of the node that was allocated resources most recently. We then count the available resources of that node. If they can meet the requirements of the job, the node weight is set to 1; otherwise it is set to 0 (W(Si) = 0), the node is cut out of service, and we re-enter the resource allocation cycle. If W(Si) > 0, we allocate resources on node i, update the node's energy efficiency strategy, and finally assign the current position i to start; this cycle of resource allocation is then finished.
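A runnable Python rendering of the Figure 2 procedure is sketched below; the Node class and the shape of a job's demand are our own assumptions, and the energy-template update is reduced to in-place bookkeeping.

```python
# Sketch of the Figure 2 round-robin scheduler for homogeneous nodes.
# Node fields and the job-demand dictionary are assumed data structures.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    free_cores: int
    free_mem_gb: int
    free_disks: int
    weight: int = 1          # W(Si): 0 means the node is cut out of service

def schedule(nodes, job, start):
    """Return (selected_node, new_start), or (None, start) if no node fits."""
    n = len(nodes)
    i = start
    while True:
        i = (i + 1) % n                                  # successor of the last allocated node
        node = nodes[i]
        fits = (node.free_cores > job["cores"] and
                node.free_mem_gb > job["mem_gb"] and
                node.free_disks > job["disks"])
        node.weight = 1 if fits else 0                   # W(Si)
        if node.weight:
            node.free_cores -= job["cores"]              # place the VM on this node
            node.free_mem_gb -= job["mem_gb"]
            node.free_disks -= job["disks"]
            return node, i                               # new start = current position
        if i == start:                                   # full cycle, no node available
            return None, start

cluster = [Node(f"node{k}", free_cores=8, free_mem_gb=8, free_disks=1) for k in range(12)]
chosen, start = schedule(cluster, {"cores": 2, "mem_gb": 2, "disks": 0}, start=len(cluster) - 1)
print(chosen)
```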

IV. EVALUATION METHODOLOGY

To evaluate the effect of the proposed approach, we use a real cloud computing environment. Since the energy consumption of the environment must be both measured and regulated, we modify and extend the original cloud computing system with additional monitors and controllers, such as resource usage monitoring, energy consumption monitoring, a resource controller and a virtual machine controller. With these components we can effectively monitor and adjust the state of the hardware and successfully apply the energy efficiency templates.

Currently there are two main ways to monitor the energy consumption of a server. One is direct monitoring: by adding energy measurement instruments to each module of the server, we can measure energy consumption in real time. This approach has the advantage of high accuracy and can measure the real-time energy consumption of individual components (such as the CPU, main memory, hard drives, etc.). The disadvantage is that it requires a large amount of additional equipment, so it is difficult to deploy in large-scale clusters. The other way is indirect monitoring: using sensors embedded in the server, we can measure the energy consumption of the server in real time, and for servers that support the IPMI and ACPI interface specifications, performance and energy data can be collected and processed centrally, so that these data can be monitored in a harmonized way and displayed on a central console.

In this paper we chose the indirect measurement method.

Most cluster systems are dense blade servers, and these servers themselves may not have an energy consumption measurement module, so additional measurement capability is necessary; we therefore include an IPMI-enabled smart meter. With the smart meters we can monitor the storage controllers, disk arrays and power supply controllers in real time, and through the network link feature of the smart meters the energy consumption data are sent to a unified energy consumption monitoring platform. The same measurement strategy is also used on the server and network computing nodes to measure the real-time energy consumption of the various components of the cloud computing cluster environment.

We use a hierarchical design to divide the framework into four layers. Each layer has a unique function and provides a different level of service, which makes the framework easy to operate, manage and maintain. The lowest layer combines the IPMI interfaces with probes. The probes are of two kinds: acquisition probes for data collection and action probes for executing instructions. An agent sits on top of the probes and runs as a proxy process on each cluster server; the agent also provides interfaces for the management layer to monitor and control the status of the cluster nodes. The management layer's main function is to regulate the status of the components of the cluster nodes: when a node is on, we can regulate the state of its components directly through the probes; when a server is turned off and its agent cannot be reached, the node can be booted through IPMI. The top layer is user management, which provides users with an operational interface through which they can intuitively manage the status and energy consumption of the nodes.
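As a sketch of the indirect-monitoring path, the snippet below polls a node's BMC through ipmitool's DCMI power reading. It assumes the blades expose DCMI and that ipmitool is installed; sensor names and output formats vary by vendor, so the parsing is only indicative, and the host list and credentials are placeholders.

```python
# Indicative polling of node power via IPMI/DCMI. Assumes `ipmitool` is installed and
# the BMC supports `dcmi power reading`; output parsing is vendor-dependent.
import subprocess

def read_power_watts(host, user, password):
    out = subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password,
         "dcmi", "power", "reading"],
        capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        if "Instantaneous power reading" in line:
            return float(line.split(":")[1].split()[0])   # e.g. "... :   215 Watts"
    raise RuntimeError("no power reading found")

# Example: collect one sample per node (addresses and credentials are placeholders).
for host in ["10.0.0.%d" % k for k in range(1, 13)]:
    try:
        print(host, read_power_watts(host, "admin", "secret"))
    except Exception as exc:
        print(host, "unreachable:", exc)
```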

V. EXPERIMENT AND RESULTS ANALYSIS

Our experimental environment consists of 12 physical computing nodes. Each node is a SuperMicro blade server running Linux (CentOS 5.5) with two quad-core Intel CPUs at 2.4 GHz, 8 GB RAM, a 160 GB SAS drive, and 1 Gbps Ethernet connected through the switch module of the blade enclosure. The environment also includes one core switch and one 60 TB network storage system. The specific configuration and energy consumption data are as follows:

Component Name   | Power (W) | Quantity | Total Power (W) | Percentage (%)
CPU              | 80        | 2        | 160             | 45
Main Memory      | 12        | 12       | 144             | 41
Disk             | 12        | 1        | 12              | 4
Mainboard        | 37        | 1        | 37              | 10
Core Switch      | 600       | 1        | 600             | --
Network Storage  | 2430      | 1        | 2430            | --

Table 2 Component Power Values
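A quick arithmetic check of Table 2 (per-node components only) roughly reproduces the percentage column and gives the nominal full-load draw of the test bed; this simply re-adds the table's numbers and is not a new measurement.

```python
# Per-node components from Table 2: name -> (power_per_unit_W, quantity).
node_parts = {"CPU": (80, 2), "Main Memory": (12, 12), "Disk": (12, 1), "Mainboard": (37, 1)}
node_w = sum(p * q for p, q in node_parts.values())      # 353 W per node
for name, (p, q) in node_parts.items():
    print(f"{name}: {100 * p * q / node_w:.0f}%")         # roughly the 45/41/4/10 split of Table 2

cluster_w = 12 * node_w + 600 + 2430                      # nodes + core switch + network storage
print("nominal full-load draw:", cluster_w, "W")          # 7266 W
```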

We deployed two experimental environments: a physical computing environment that treats physical cluster nodes as the computing unit, and a virtual computing environment that treats virtual machines as the computing unit. The second environment applies our resource scheduling algorithm and energy-saving methods.

We applied two kinds of experimental sets. One consists of different types of benchmarks, including iozone (I/O-intensive tasks), netperf (network-intensive tasks) and SPEC 2006 (CPU- and memory-intensive tasks). The other consists of actual computing tasks: we run a few Hadoop jobs and collect data on CPU usage, memory usage, disk speed, network traffic, energy consumption, and so on. We compare the energy consumption of a node in different situations in order to find the most reasonable balance between energy consumption and performance. The results are shown in Figures 3 to 8.

Figure 3 shows the resource usage of one node while executing an iozone task. In the first experiment we run the iozone test set: on each node's local disk we create 2 GB of data and perform a series of operations, including write, re-write, random write, read, random mix, backwards read, etc. When performing local I/O-intensive tasks, the CPU usage rate is not high, the memory usage rate is high, and the network throughput is small. Based on these observations, we can use the CPU policy template to save energy. Figure 4 shows the task execution time of each node. Furthermore, with the current and voltage data recorded by the smart meters, we can calculate the overall energy consumption of the task.

Figure 5 shows the resource usage of one node while executing another iozone task. This experiment is similar to the former one, but all the data operations are executed on remote network storage, so it can be seen as an I/O-intensive and network-intensive task. Compared with Figure 3, we see an apparent increase in network throughput and an 8% increase in CPU usage, consumed mainly by the iSCSI storage service. For such tasks we should still use the CPU policy template to save energy. Figure 6 compares the execution times of the two tasks; there is a significant increase in the execution time of task 2.

Following the above benchmark experiments, we deployed an actual cloud computing environment to test the performance of our algorithm. The cloud computing environment is Eucalyptus, the data processing framework is Hadoop, and the test benchmarks are Hadoop random read and write (I/O-intensive). We run the same experiment under the original round-robin resource scheduling algorithm and under our energy efficient optimized resource scheduling algorithm. First, we use the virtual machine scheduler to generate 6 virtual machines with Hadoop. Then we execute a random write job and a random read job on each node with 5 GB of data. The experiments under each scheduling algorithm are run at 8 different CPU frequency values. Figures 7 and 8 show the results. We can see that by using a resource scheduling algorithm with an energy regulation strategy, we can effectively save energy. Through repeated experiments, we found that these strategies are particularly effective for tasks that do not make full use of the hardware resources. But for CPU-intensive tasks, especially when resources are fully utilized, adjusting the energy consumption may reduce system performance, thereby

greatly increasing the task execution time, and this would produce more energy consumption. This is also reflected in our experiments.
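The CPU policy template relies on DVFS. On the Linux nodes used here this would typically go through the cpufreq sysfs interface, roughly as sketched below; the paths and available governors depend on the kernel and driver, root privileges are required, and the paper does not spell out the exact mechanism it used.

```python
# Rough illustration of applying a DVFS setting through Linux cpufreq sysfs.
# Requires root; the exposed files depend on the CPU frequency driver in use.
import glob

def set_max_frequency(khz):
    """Cap the maximum frequency of every online core at `khz`."""
    for path in glob.glob("/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_max_freq"):
        with open(path, "w") as f:
            f.write(str(khz))

def available_frequencies():
    """Return the advertised frequency steps, if the driver exposes a discrete list."""
    try:
        with open("/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies") as f:
            return [int(x) for x in f.read().split()]
    except FileNotFoundError:
        return []

# Example: cap all cores at the lowest advertised frequency.
freqs = available_frequencies()
if freqs:
    set_max_frequency(min(freqs))
```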

Figure 3 One Node Resource Usage in iozone

Figure 4 Task Execution Time of Each Node

Figure 5 One Node Resource Usage in iozone with Remote Storage

Figure 6 Comparison of Execution Time of the Two Tasks

Figure 7 Energy Consumption of the Write Job at Different CPU Frequencies

Figure 8 Energy Consumption of the Sort Job at Different CPU Frequencies

VI. CONCLUSION AND FUTURE WORK

In this paper we present an algorithm for a cloud computing environment that can automatically allocate resources based on energy optimization methods, and we demonstrate its effectiveness. In the experiments and results analysis we find that, in a practical cloud computing environment, using one whole cloud node to run a single task or job wastes a lot of energy, even though the structure of the cloud framework naturally supports parallel processing. Furthermore, we find that a higher CPU frequency does not always mean faster program execution: there is a turning point at which a balance between frequency and energy consumption can be achieved. Unfortunately, in our experiments this turning point floats depending on the program or job, the hardware framework and the network. Therefore, we need an automatic process to find the appropriate CPU frequency, main memory mode, and disk mode or speed. We have also deployed scalable distributed monitoring software for the cloud clusters: we collect system and program information from the cluster nodes through ACPI and sensors, organize it, and either show it in real-time charts or pass it as parameters to functional programs.

In the future, we plan to refine our algorithm by collecting and using more information, such as temperature or fan speed. Meanwhile, a more powerful monitoring system is vital for cloud clusters. Some open source projects exist, but none of them can be applied directly to our environment, e.g. the templates would have to be re-created, so there is still a lot of work to do.
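The automatic search for the turning point mentioned above could be as simple as sweeping the candidate frequencies and keeping the one with the lowest measured energy. The sketch below assumes a run_and_measure callback supplied by the monitoring layer and is only meant to illustrate the idea, not the method actually deployed.

```python
# Hypothetical turning-point search: run the same job at each candidate frequency and
# keep the one that used the least energy. `run_and_measure` is an assumed callback
# that applies the frequency, runs the job, and returns (runtime_s, energy_joules).
def find_turning_point(frequencies_khz, run_and_measure):
    results = {f: run_and_measure(f) for f in frequencies_khz}
    best = min(results, key=lambda f: results[f][1])
    return best, results
```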

REFERENCES

[1] The Google File System, http://labs.google.com/papers/gfs-sosp2003.pdf
[2] MapReduce: Simplified Data Processing on Large Clusters, http://labs.google.com/papers/mapreduce-osdi04.pdf
[3] Bigtable: A Distributed Storage System for Structured Data, http://labs.google.com/papers/bigtable-osdi06.pdf
[4] Hadoop, http://lucene.apache.org/hadoop/
[5] Amazon Simple Storage Service, http://aws.amazon.com/s3/
[6] Amazon Simple Queue Service, http://aws.amazon.com/sqs/
[7] http://www.netxt.com/power-103-megawatt-secret-google-containerdata-center/
[8] Microsoft Dublin Data Center, http://www.datacenterknowledge.com/inside-microsofts-dublin-megadata-center/dublin-data-center-generators/
[9] Yong Dong, "Power Measurements and Analyses of Massive Object Storage System," Computer and Information Technology (CIT), 2010 IEEE 10th International Conference, pp. 1317-1322, 2010.
[10] Eugen Feller, Daniel Leprince, Christine Morin, "State of the art of power saving in clusters + results from the EDF case study," INRIA Rennes - Bretagne Atlantique, France, 31 May 2010.
[11] E.N. (Mootaz) Elnozahy, Michael Kistler, and Ramakrishnan Rajamony, "Energy-Efficient Server Clusters," B. Falsafi and T.N. Vijaykumar (Eds.): PACS 2002, LNCS 2325, pp. 179-197, 2003.
[12] Feller E., D. Leprince, C. Morin, "State of the art of power saving in clusters + results from the EDF case study," 2010.
[13] Hsu C.H. and S. W. Poole, "Power Signature Analysis of the SPECpower_ssj2008 Benchmark," Performance Analysis of Systems and Software (ISPASS), 2011 IEEE International Symposium, 2011, pp. 227-236.
[14] Thomas Wirtz and Rong Ge, "Improving MapReduce Energy Efficiency for Computation Intensive Workloads," the 2nd International Green Computing Conference (IGCC 2011), Orlando, USA.
[15] Rong Ge, Xizhou Feng, Shuaiwen Song, Hung-Ching Chang, Dong Li, Kirk W. Cameron, "PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications," IEEE Transactions on Parallel and Distributed Systems, Vol. 21, No. 5, pp. 658-671, 2010.
[16] Lively C., X. Wu, V. Taylor, S. Moore, H. Chang, and K. Cameron, "Energy and performance characteristics of different parallel implementations of scientific applications on multicore systems," International Journal of High Performance Computing Applications, 2011, 25(3): 342-350.