Research on Resource Self-Organizing Model for Cloud Computing

Weiwei Lin

Deyu Qi

School of Computer Science and Engineering South China University of Technology, Guangzhou, China [email protected]

School of Computer Science and Engineering South China University of Technology, Guangzhou, China [email protected]

Abstract-The primary problem to be solved for Cloud Computing is the formation of the resource cloud. Cloud formation is a small-to-large, dynamic and complex process that can be achieved well using a self-organizing method. A new resource management method for Cloud Computing, the resource self-organizing model, is proposed. It can form self-governing resource groups in the absence of centralized management control and dynamically optimize the organizational structure of resources in accordance with resource changes. The management protocol and the management node election algorithm for the resource self-organizing model are presented. A simulation experiment is designed to test the performance of the resource self-organizing system, and the results show that the proposed model can optimize the organizational structure of resources in a dynamic system and improve system performance.

Keywords-Cloud Computing; resource self-organizing; model; resource management

I. INTRODUCTION

Cloud Computing is an emerging computing paradigm that focuses on sharing data and computations over a scalable network of nodes. The paradigm is increasingly popular in industry, where leaders such as Microsoft [1], Google [2], and IBM [3] have strongly promoted it in recent years. One definition of Cloud Computing [4] is: a large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically-scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet. Cloud Computing is expected to become a mainstream application as its research and deployment deepen. However, Cloud Computing research is still in its infancy. Although some studies have been carried out worldwide [4-13], Cloud Computing theory and technology are not mature, and many implementation issues remain. Much research and practice is still needed in Cloud Computing security, management technology, application platforms, and so on.

The essence of Cloud Computing is to form various resource clouds on the Internet (including computing clouds, storage clouds, and so on) and to provide better resource services to users. The cloud in Cloud Computing is formed by bringing together many resources on the Internet. The resources of a cloud can be inserted or withdrawn dynamically; as LI Deyi, academician of CAE, said, Cloud Computing has the nature of uncertainty. The formation of a cloud is a small-to-large, dynamic and complex process that can be achieved well using a self-organizing method. Therefore, a resource self-organizing method for Cloud Computing is presented. It adopts a self-organizing method in the management and utilization of resources and does not require a dedicated resource management server. This not only significantly reduces the complexity of system administration, but also improves the scalability and flexibility of Cloud Computing and helps to use the large pool of computing resources on the Internet efficiently. Unlike existing self-organizing methods, the proposed resource self-organization model for Cloud Computing focuses on the implementation of resource self-organization actions and the dynamic optimization of the resource organization structure. The method can effectively bring together heterogeneous, geographically distributed, idle computer resources, which provides the Cloud Computing environment with a large number of available computing resources and achieves optimal scheduling and efficient utilization of these resources.

II. RELATED WORK

Cloud Computing is derived from grid computing and P2P computing. Influential grid computing software systems and tools include BOINC, Condor, SETI@home, SZTAKI Desktop Grid, and so on. These systems generally require a specialized server to manage the system resources, so they adapt poorly to dynamic changes of resources and scale poorly. They also often require a lot of manual intervention and complex management, which greatly limits the full utilization of resources. Some typical cloud systems, such as GFS of Google [2], Blue Cloud of IBM [3], and Elastic Cloud of Amazon [16], use a central entity to index or manage the distributed data storage entities and resources. A centrally managed architecture simplifies the design and maintenance of the system, but the central entity may become a bottleneck if it is accessed very frequently. Because of these problems in current resource management methods for distributed systems, it is difficult to manage large-scale, distributed, and heterogeneous computing resources and to construct large-scale Cloud Computing systems with current methods. Therefore, new resource management methods are urgently needed.

978-1-4244-5143-2/10/$26.00 ©2010 IEEE

Currently, a number of research institutions are involved in Cloud Computing research. The UC Berkeley Reliable Adaptive Distributed Systems Lab recently published a report on Cloud Computing, "Above the Clouds: A Berkeley View of Cloud Computing" [6]. The report notes that Cloud Computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the datacenters that provide those services. The report lists ten obstacles to the growth of Cloud Computing, including service availability, data security, scalable storage, performance unpredictability, and so on, and then focuses on the economic models of Cloud Computing. Among these obstacles, an important one is service availability, which is in essence a resource management problem. [7] describes a high-performance Cloud Computing infrastructure, consisting of the Sector storage cloud and the Sphere computing cloud, which supports data mining applications. The Sector storage cloud combines a P2P-based routing layer with a distributed storage service layer. Sphere is implemented over Sector and provides the following services: locating data, moving data, locating and managing computing resources, load balancing, and fault tolerance. Two applications have been implemented on this infrastructure, and the experimental results show that, with wide-area high-performance networks and a cloud-based architecture, computing with distributed data can be done with approximately the same efficiency as computing with local data. [11] provides an implementation of a Cloud based on the Virtual Computing Laboratory technology and discusses the concept of Cloud Computing, the issues it tries to address, and related research topics. In addition, it points out that the construction of the complex resource environment is an important research topic in Cloud Computing.
In [5] the authors compare and analyze some of the current mainstream computing paradigms, give a definition of Cloud Computing, and present an architecture for creating market-oriented Clouds. They also present thoughts on market-based resource management strategies that encompass both customer-driven service management and computational risk management to sustain SLA-oriented resource allocation. In addition, [12] takes grid computing as the Cloud Computing infrastructure and gives a five-layer abstraction structure of Cloud Computing. To meet the growing demand of enterprises for collecting and analyzing increasingly large amounts of data, the GridBatch system [10] is proposed, which aims at solving large-scale data-intensive batch problems under the Cloud infrastructure constraints. GridBatch is a programming model and associated library that hides the complexity of parallel programming. Although there has been some research in Cloud Computing in the past two years, one of its core issues, the construction of the cloud or resource pool, still has no uniform and good solution. Therefore, the resource self-organizing method is proposed to deal with the construction of the Cloud.

III. RESOURCE SELF-ORGANIZING MODEL FOR CLOUD COMPUTING

A. Model Description

As shown in Figure 1, the self-organizing model for Cloud Computing consists of several autonomous resource groups (the dotted circles in the figure, such as A1 and A2). Each resource group is made up of one management node (such as G1, G2, G3 and G4 in Figure 1) and a number of general resource nodes (indicated by P). Each management node takes charge of self-management work such as maintaining and querying resource information. The management node is chosen by a fully distributed election algorithm that elects a node with good communication performance and good stability. The size of a resource group must be properly controlled: to minimize the number of connections of all the management nodes, the size of the resource group is set to $N^{1/2}$ (N is the number of all nodes in the system). The core idea of the resource self-organizing model is the set of resource self-organizing activities: the automatic split and merging of resource groups, the dynamic update of the management nodes, and the dynamic selection of a resource group by resource nodes.
(1) The automatic split of the resource group: when the number of resources in the group increases to a certain number, the resource group must be split.
(2) The automatic merging of the resource group: when the number of resources in the group decreases to a certain number, the resource group must be merged with another.
(3) The dynamic update of the management nodes: when the current management node finds a general resource node with better performance in the resource group, this general resource node is selected as the new management node.
(4) The dynamic selection of resource group by resource nodes: when a resource node finds a more suitable resource group, one in which it is closer to the management node, the resource node joins this new resource group.
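The choice of N^(1/2) as the group size can be checked numerically. The sketch below is not taken from the paper. It assumes a simple cost model in which each management node keeps one link to every member of its own group and one link to every other management node, and the function names are hypothetical. Under that assumption a management node holds (s - 1) + (N/s - 1) links for group size s, which is smallest near s = N^(1/2).

```python
def links_per_manager(n: int, group_size: int) -> float:
    """Links held by one management node (assumed cost model, not from the paper)."""
    managers = n / group_size          # number of groups, hence of management nodes
    members = group_size - 1           # general resource nodes in its own group
    return members + (managers - 1)    # own members plus all other management nodes


def best_group_size(n: int) -> int:
    """Group size in 1..n that minimizes links_per_manager, by exhaustive search."""
    return min(range(1, n + 1), key=lambda s: links_per_manager(n, s))


if __name__ == "__main__":
    for n in (100, 400, 10_000):
        print(n, best_group_size(n))
```

For N = 100, 400 and 10 000 the search returns 10, 20 and 100, which matches N^(1/2) under this cost model. A different cost model (for example, one that weights inter-group links more heavily) would shift the optimum.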

Fig. 1 Resource self-organizing model

B. The Maintenance Protocol of Resource Self-organizing Model

When a node A wants to join a resource self-organizing Cloud Computing system, it first contacts an arbitrary node B that is known to participate in the system already and sends a SEARCHSUPERNODE message to node B. When node B receives the query message, it forwards the SEARCHSUPERNODE message to its management node C. The management node C obtains the information of all the management nodes through a message forwarding algorithm and sends this information to node A. Node A then sends a PING message to all the management nodes to find the nearest management node D, and sends a JOIN message to node D. If the management node D agrees to let node A join its resource group, it replies with an ACCEPT message. The connection between node A and the management node D is then established, and node A has successfully joined the system. Otherwise, the management node D replies with a REJECT message, node A selects another management node, and the procedure is repeated.

The resource self-organizing activities of the resource self-organizing model include the automatic split and merging of resource groups, the dynamic update of the management nodes, and the dynamic selection of a resource group by resource nodes.
(1) The automatic split of the resource groups: when the number of resources (nodes) in a resource group exceeds the maximum threshold and the group must be split, the management node X of the group chooses the best-performing resource node in the group, node Y, and promotes it to management node. Node X sends an ASSUPERNODE message to node Y. If node Y agrees, it replies with an ACCEPTSUPERNODE message, which means that node Y has been successfully elected as a management node. An ADDSUPERNODE message is then sent to the other management nodes.
(2) The automatic merging of the resource groups: in contrast, when two resource groups are merged, one of the two management nodes, node X, must be removed. Node X sends a REMOVESUPERNODE message to all the resource nodes in its group and to the other management nodes, and all the resource nodes in the group join the other management node.
(3) The dynamic update of the management nodes: when a management node X finds a general resource node Y with better performance within the group, node Y is upgraded to be the new management node. Node Y sends a NEWSUPERNODE message to all the resource nodes within the group and establishes connections with them. Meanwhile, node X sends a REMOVESUPERNODE message to all the resource nodes within the group and to the other management nodes.
(4) The dynamic selection of resource group by resource nodes: when a resource node finds itself closer to another management node, it changes its management node and joins that resource group. It sends a JOIN message to the new management node. If the new management node agrees, the resource node sends a LEAVE message to the old management node and has then successfully joined the new resource group.

C. Election Algorithm for Management Nodes

An important characteristic of the resource self-organization model is its self-adaptation to resource changes, which helps to optimize the resource organization structure dynamically. That is, in the proposed resource self-organization model, more appropriate management nodes for resource groups can be selected dynamically according to resource insertion and changes in network communication performance (communication delays), so that the resource organization structure is optimized. This characteristic is mainly reflected in the election algorithm for management nodes, and it is implemented in the following way: in response to dynamic changes of the system, general resource nodes can reselect a closer management node and join the new resource group, and management nodes can select a general resource node with better performance as the new management node.

The proposed election algorithm for management nodes is the core of the maintenance protocol of the resource self-organization model. It has two parts: the management node part and the general resource node part. The general resource node part mainly covers a resource node reselecting a closer management node in response to dynamic changes in the network and joining the resource group in which the new management node is located. It can be described by the following pseudo-code:

    Procedure SSA_G();
        Sleep(c);
        FindNearestSupernode(i);   // find the nearest management node i
        If (mysupernode != i)
            mysupernode := i;
            Connect(i);
        End If
    End

The main cost of the SSA_G() algorithm is the FindNearestSupernode(i) operation. Because the management nodes in the resource self-organization model form a fully interconnected network, the node must send a message to all management nodes in order to find the nearest one. Therefore, the number of messages produced by SSA_G() is determined by the number of management nodes; that is, it is $O(N^{1/2})$.
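The join handshake of the maintenance protocol and one round of SSA_G() can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class and method names are hypothetical, network delay measured by PING is approximated by Euclidean distance between assumed coordinates, and a hypothetical `capacity` limit decides between ACCEPT and REJECT, since the paper does not state the admission rule. The `sent` counter shows that each round pings every management node, which is where the $O(N^{1/2})$ message count comes from.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class ManagementNode:
    name: str
    pos: Tuple[float, float]   # assumed coordinates; distance stands in for PING delay
    capacity: int              # hypothetical admission rule for ACCEPT / REJECT
    members: Set[str] = field(default_factory=set)

    def on_join(self, node: str) -> str:
        """Answer a JOIN message with ACCEPT while the group has room."""
        if len(self.members) < self.capacity:
            self.members.add(node)
            return "ACCEPT"
        return "REJECT"

    def on_leave(self, node: str) -> None:
        """Handle a LEAVE message from a departing member."""
        self.members.discard(node)


@dataclass
class ResourceNode:
    name: str
    pos: Tuple[float, float]
    supernode: Optional[ManagementNode] = None
    sent: int = 0              # messages sent by this node

    def ping_all(self, managers: List[ManagementNode]) -> List[ManagementNode]:
        """PING every management node and rank them by (simulated) delay."""
        self.sent += len(managers)
        return sorted(managers, key=lambda m: math.dist(self.pos, m.pos))

    def join(self, managers: List[ManagementNode]) -> Optional[ManagementNode]:
        """Initial join: try the nearest manager, move on after a REJECT."""
        for m in self.ping_all(managers):
            self.sent += 1                       # JOIN
            if m.on_join(self.name) == "ACCEPT":
                self.supernode = m
                return m
        return None

    def ssa_g_round(self, managers: List[ManagementNode]) -> None:
        """One SSA_G round without Sleep(c): switch if a closer manager accepts."""
        nearest = self.ping_all(managers)[0]
        if nearest is self.supernode:
            return
        self.sent += 1                           # JOIN to the closer manager
        if nearest.on_join(self.name) == "ACCEPT":
            if self.supernode is not None:
                self.supernode.on_leave(self.name)
                self.sent += 1                   # LEAVE to the old manager
            self.supernode = nearest


if __name__ == "__main__":
    g1 = ManagementNode("G1", (0.0, 0.0), capacity=1)
    g2 = ManagementNode("G2", (10.0, 0.0), capacity=5)
    a, b = ResourceNode("A", (1.0, 0.0)), ResourceNode("B", (2.0, 0.0))
    a.join([g1, g2])
    b.join([g1, g2])          # G1 is full, so B falls back to G2
    g1.capacity = 2           # room appears in the closer group
    b.ssa_g_round([g1, g2])   # B reselects G1
    print(a.supernode.name, b.supernode.name, b.sent)
```

In a real deployment the round would run periodically after `Sleep(c)` and the delay would come from actual PING measurements. The SEARCHSUPERNODE step that discovers the list of management nodes is omitted here, and the list is passed in directly.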
Before describing the management node part of the election algorithm, the election principles and related symbols should be given. Suppose the total number of nodes in the system is N, the number of management nodes is M, and M