How CLEVER-Based Clouds Conceive Horizontal and Vertical Federations Francesco Tusa, Antonio Celesti, Maurizio Paone, Massimo Villari and Antonio Puliafito Dept. of Mathematics, Faculty of Engineering, University of Messina, Contrada di Dio, S. Agata, 98166 Messina, Italy. e-mail: {ftusa, acelesti, mpaone, mvillari, apuliafito}@unime.it
Abstract—Nowadays, the cloud computing ecosystem includes hundreds of independent and heterogeneous clouds. Most of these cloud platforms can be considered "islands in the ocean of cloud computing" and do not present any form of federation. At the same time, several clouds are beginning to use the cloud-based services of other clouds, but there is still a long way to go toward the establishment of a worldwide ecosystem including thousands of cooperating federated clouds. This paper investigates the existing cloud middleware solutions able to address the potential issues involved in these new cloud scenarios. In particular, the CLEVER cloud middleware is analyzed, highlighting its design and features, and explaining the motivations that allow us to consider it suitable for the different phases of the evolution of federated cloud computing. Keywords-Cloud Computing; CLEVER; Federation.
I. INTRODUCTION
Until now, the trend of the cloud computing ecosystem has been characterized by the steady rise of hundreds of independent, heterogeneous cloud providers, managed by private parties, offering various types of cloud-based services to their clients (e.g., IT companies, organizations, universities, desktop and mobile end-users, etcetera). Currently, most of these clouds can be considered "islands in the ocean of cloud computing" and do not present any form of federation. At the same time, a few clouds are beginning to use the cloud-based services of other clouds, but there is still a long way to go toward the establishment of a worldwide cloud ecosystem including thousands of cooperating clouds. In such a perspective, the latest trend in cloud computing is dominated by the idea of federating heterogeneous clouds. This means no longer thinking about independent private clouds, but considering a new cloud federation scenario where different clouds, belonging to different administrative domains, interact with each other, sharing and gaining access to physical resources, and becoming at the same time both "users" and "resource providers". In this paper we focus our attention on the existing cloud computing middleware solutions, specifically pointing out those by means of which it is possible to build federated cloud infrastructures. More specifically, after such an analysis, the CLEVER cloud middleware is described in depth, highlighting its main features and paying attention to its ability to build federated cloud environments.

The paper is organized as follows. Section II introduces the concepts related to cloud federation, considering its evolution stages and its future perspectives. Section III describes the currently existing open-source projects for building clouds acting as Virtual Infrastructure Manager (VIM), Cloud Manager or both, specifically pointing out the features that may be useful in federated cloud scenarios. Section IV introduces the CLEVER cloud middleware, describing its general organization and some of its specific design concepts. Section V, considering the CLEVER characteristics, provides an overview of how CLEVER can be classified within the current cloud federation evolution trend. Conclusions and future directions are reported in Section VI.

II. CLOUD FEDERATION
Until now, the cloud ecosystem has been characterized by the steady rise of hundreds of independent and heterogeneous cloud providers, managed by private parties, which offer various services to their clients. Using this computing infrastructure it is possible to pursue new levels of efficiency in delivering services (SaaS, PaaS, and IaaS: in general *aaS) to clients such as IT companies, organizations, universities and single end-users, ranging from desktop to mobile users. Although such an ecosystem includes hundreds of independent, heterogeneous clouds, many business operators have predicted that the process toward interoperable federated cloud scenarios will begin in the near future. We imagine a scenario where different clouds, belonging to different administrative domains, interact with each other, becoming at the same time both "users" and "resource providers". Obviously, the interaction and cooperation among the entities of this scenario can be complex and needs to be investigated in depth. As claimed in [1], the cloud computing market is expected to evolve through the following three subsequent stages:
• stage-1 "Independent Clouds" (now): cloud services are based on proprietary architectures, islands of cloud services delivered by mega-providers (this is what Amazon, Google, Salesforce and Microsoft look like today);
• stage-2 "Vertical Federation" (partially started): over time, some cloud providers will leverage cloud services from other providers; the clouds will still be proprietary islands, but the ecosystem will start to form;
• stage-3 "Horizontal Federation" (to be planned): small, medium and large providers will federate themselves to gain economies of scale, a more efficient use of their assets, and an enlargement of their capabilities.
Even though the idea of creating federated infrastructures seems very profitable, bridging the existing gap may not be straightforward: on the one hand, highly scalable infrastructures are required to cope with varying load and with software and hardware failures in cloud federation scenarios; on the other hand, autonomically managed infrastructures are required to adapt, manage and utilize cloud ecosystems in an efficient way. Furthermore, to build up an interoperable, heterogeneous federated environment, clouds have to cooperate within trust contexts, providing new business opportunities such as cost-effective asset optimization, power saving, on-demand resource provisioning, and the delivery of new types of *aaS.
III. RELATED WORKS AND BACKGROUND

This section describes the current state of the art in cloud computing, analyzing the main existing middleware implementations and evaluating their main features. Both proprietary (e.g., Amazon EC2 [2], Microsoft Azure [3], Salesforce, Google, and Yahoo) and open-source cloud computing middleware solutions, such as CLEVER [4], OpenQRM [5], OpenNebula [6], Nimbus [7], and Eucalyptus [8], are currently available on the market.

Nimbus is an open-source toolkit that allows a set of computing resources to be turned into an IaaS cloud. Nimbus comes with a component called workspace-control, installed on each node, which is used to start, stop and pause VMs; it implements VM image reconstruction and management, securely connects the VMs to the network, and delivers contextualization. Nimbus's workspace-control tools work with Xen and KVM, but only the Xen version is distributed. Nimbus provides interfaces to VM management functions based on the WSRF set of protocols. There is also an alternative implementation exploiting the Amazon EC2 WSDL.

Eucalyptus [8] is an open-source cloud computing framework that uses the computational and storage infrastructures commonly available at academic research groups to provide a platform that is modular and open to experimental instrumentation and study. Eucalyptus addresses several crucial cloud computing questions, including VM instance scheduling, cloud computing administrative interfaces, construction of virtual networks, definition and execution of service level agreements (cloud/user and cloud/cloud), and cloud computing user interfaces.

OpenQRM is an open-source platform for enabling flexible management of computing infrastructures. Thanks to
its pluggable architecture, OpenQRM is able to implement a cloud with several features that allow the automatic deployment of services. It supports different virtualization technologies, managing Xen, KVM and Linux-VServer VMs. It also supports P2V (physical to virtual), V2P (virtual to physical) and V2V (virtual to virtual) migration. This means that VMs can not only easily move from physical to virtual (and back), but can also be migrated between different virtualization technologies, even transforming the images.

OpenNebula is an open and flexible tool that fits into existing data centers to build a cloud computing environment. OpenNebula can be primarily used as a virtualization tool to manage virtual infrastructures in the data center or cluster, which is usually referred to as a Private Cloud. More recent versions of OpenNebula also aim to support Hybrid Clouds, combining local infrastructure with public cloud-based infrastructure to enable highly scalable hosting environments. OpenNebula also supports Public Clouds by providing cloud interfaces that expose its functionalities for VM, storage and network management.

All the above-mentioned open-source tools are able to satisfy the requirements of Vertical Federation scenarios. The CLEVER middleware that we are going to describe in the following, instead, suits both Vertical Federation and Horizontal Federation scenarios thanks to its features.

IV. THE CLEVER ARCHITECTURE

A. Overview

In order to clarify the architecture on which CLEVER is based, let us consider a scenario formed by a set of physical hardware resources (i.e., a cluster) where VMs are dynamically created and executed on the hosts according to their workload, data location and several other parameters. The basic operations our middleware should perform refer to: 1) monitoring the VMs' behavior and performance, in terms of CPU, memory and storage usage; 2) managing the VMs, providing functions to destroy, shut down and migrate them and to set network parameters; 3) managing the VM images, i.e., image discovery, file transfer and uploading.

Considering the concepts stated in [9], such features can be analyzed on two different layers: Host Management (lower) and Cluster Management (higher). The middleware is based on the architecture schema depicted in Figure 1, which shows a cluster of n nodes (an interconnection of clusters could also be considered), each containing a host-level management module (Host Manager). A single node may also include a cluster-level management module (Cluster Manager). All the entities interact, exchanging information by means of the Communication System based on XMPP. The set of data necessary to enable the middleware's functioning is stored within a specific database deployed in a distributed fashion.
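The three host-level duties listed above can be pictured as a minimal management interface. The following sketch is illustrative only: the class and method names (HostManagerInterface, monitor_vm, and so on) are our own assumptions and do not come from the CLEVER codebase.

```python
# Illustrative sketch only: names and structure are hypothetical,
# not taken from the CLEVER codebase.
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class VmStats:
    cpu_percent: float
    memory_mb: int
    storage_mb: int

class HostManagerInterface(ABC):
    """Host-level duties described in Section IV-A."""

    @abstractmethod
    def monitor_vm(self, vm_id: str) -> VmStats:
        """1) Report CPU, memory and storage usage of a VM."""

    @abstractmethod
    def destroy_vm(self, vm_id: str) -> None:
        """2) Life-cycle management: destroy, shut down, ..."""

    @abstractmethod
    def migrate_vm(self, vm_id: str, target_host: str) -> None:
        """2) Move a running VM to another physical host."""

    @abstractmethod
    def fetch_image(self, image_name: str) -> str:
        """3) Image handling: discovery, transfer, upload."""
```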
Figure 1. CLEVER architecture.
Figure 1 shows the main components of the CLEVER architecture, which can be split into two logical categories: the software agents (typical of the architecture itself) and the tools they exploit. To the former set belong both the Host Manager and the Cluster Manager:
• The Host Manager (HM) performs the operations needed to monitor the physical resources and the instantiated VMs; moreover, it runs the VMs on the physical hosts (downloading the VM image) and performs the migration of VMs (more precisely, it performs the low-level aspects of this operation). To carry out these functions it must communicate with the hypervisor, the hosts' OS and the distributed file system on which the VM images are stored. This interaction is performed using a plug-in paradigm.
• The Cluster Manager (CM) acts as an interface between the clients (software entities which can exploit the cloud) and the HM agents. The CM receives commands from the clients, performs operations on the HM agents (or on the database) and finally sends information back to the clients. It also performs the management of VM images (uploading, discovery, etc.) and the monitoring of the overall state of the cluster (resource usage, VM states, etc.). At least one CM has to be deployed on each cluster but, in order to ensure higher fault tolerance, several of them should exist: one master CM will be in the active state, while the others remain in a monitoring state.
Regarding the tools such middleware components exploit, we can identify the Distributed Database and the XMPP Server.

B. Internal/External Communication

The main CLEVER entities, as already stated, are the Cluster Manager and the Host Manager modules, which include several sub-components, each designed to perform a specific task. In order to ensure as much as possible the middleware's modularity, these sub-components are mapped onto different processes within the operating system of the same host, and communicate with each other by exchanging
messages. CLEVER has been designed to support two different types of communication: intra-module (internal) communication and inter-module (external) communication.

1) Intra-module (Internal) Communication: The intra-module communication involves sub-components of the same module. Since these are essentially separate processes, a specific Inter-Process Communication (IPC) mechanism has to be employed to allow their interaction. In order to guarantee maximum flexibility, the communication has been designed employing two different modules: a low-level one implementing the IPC, and a high-level one acting as an interface to the CLEVER components, which allows access to the services they expose. In this communication mechanism, each module virtually exchanges messages (horizontally) with the corresponding peer exploiting a specific protocol (as the horizontal arrows in the figure indicate). However, the real message flow is the one indicated by the vertical arrows: when the Component Communication Module (CCM) of Component A aims to send a message to its peer on a different Component B, it exploits the services offered by the underlying IPC module. Obviously, in order to communicate correctly, the CCM must be aware of the interface by means of which these services are accessible. If all the IPC modules are designed according to the same interface, the CCM is able to interact with them regardless of their technology and implementation.

Looking into the above-mentioned mechanism, when Component A needs to access a service made available by Component B, it performs a request through its CCM. The latter creates a message which describes the request, then formats the message according to the selected communication protocol and sends it to its peer on Component B by means of the underlying IPC module. The IPC module, once it has received the message, forwards it to its peer using a specific container and a specific protocol. The IPC module on Component B, once such a container is received, extracts the encapsulated message and forwards it to the overlying CCM, which interprets the request and starts the execution of the associated operation on behalf of Component A.

2) Inter-module (External) Communication: When two different hosts have to interact with each other, the inter-module communication has to be exploited. The typical use cases refer to:
• communication between the CM and the HMs, for exchanging information on the cluster state and sending specific commands;
• communication between the administrators and the CM, using the ad-hoc client interface.
As previously discussed, in order to implement the inter-module communication mechanism, an XMPP server must exist within the CLEVER domain and all its entities must be connected to the same XMPP room.
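As a concrete, though simplified, picture of the intra-module layering just described, the following sketch models a CCM that formats requests and delegates delivery to a pluggable IPC transport. All names here (Ccm, IpcTransport, the JSON message layout) are illustrative assumptions, not CLEVER's actual classes or wire format.

```python
# Minimal model of the layered intra-module communication:
# a Component Communication Module (CCM) formats requests and
# hands them to a pluggable IPC transport. Names are illustrative.
import json
from abc import ABC, abstractmethod

class IpcTransport(ABC):
    """Low-level module: any IPC technology behind one interface."""
    @abstractmethod
    def send(self, container: bytes) -> bytes:
        ...

class LoopbackIpc(IpcTransport):
    """Stand-in transport that 'delivers' to a local handler."""
    def __init__(self, handler):
        self.handler = handler
    def send(self, container: bytes) -> bytes:
        return self.handler(container)

class Ccm:
    """High-level module: builds the message, delegates delivery."""
    def __init__(self, ipc: IpcTransport):
        self.ipc = ipc
    def invoke(self, service: str, **params):
        request = json.dumps({"service": service, "params": params})
        reply = self.ipc.send(request.encode())
        return json.loads(reply)

# Component B side: the peer extracts the encapsulated message and
# runs the requested operation on behalf of Component A.
def component_b_handler(container: bytes) -> bytes:
    msg = json.loads(container)
    result = {"echoed": msg["params"]}  # placeholder operation
    return json.dumps(result).encode()

ccm_a = Ccm(LoopbackIpc(component_b_handler))
print(ccm_a.invoke("listVms", host="node-1"))
```

Because every transport exposes the same send interface, the CCM stays agnostic of the IPC technology, which is exactly the design point made above.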
When a message has to be transmitted from the CM to an HM, as represented in Figure 2, it is formatted and then sent using XMPP. Once received, the message is checked by the HM to verify whether the requested operation can be performed.
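A minimal sketch of this HM-side handling, anticipating the outcomes shown in Figures 2 and 3, follows; the message fields, operation names and error codes are hypothetical.

```python
# Sketch of the HM-side handling shown in Figures 2-3: check
# whether the requested operation is supported, execute it, and
# answer with either a return value or an error code.
from typing import Optional

SUPPORTED = {
    "startVm": lambda vm_id: f"{vm_id} started",
    "stopVm": lambda vm_id: None,  # no return value expected
}

def handle_cm_request(message: dict) -> Optional[dict]:
    op = message.get("operation")
    if op not in SUPPORTED:
        return {"error": 1, "reason": f"unknown operation {op!r}"}
    try:
        result = SUPPORTED[op](message.get("vm_id"))
    except Exception as exc:
        return {"error": 2, "reason": str(exc)}
    # Only send an answer back to the CM if a return value exists.
    return {"result": result} if result is not None else None

print(handle_cm_request({"operation": "startVm", "vm_id": "vm-7"}))
print(handle_cm_request({"operation": "resizeVm", "vm_id": "vm-7"}))
```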
Figure 2. Activity diagram of the external communication.
As the figure shows, two different situations can arise: if the request can be handled, it is performed, possibly sending an answer back to the CM (if a return value is expected); otherwise an error message specifying an error code is sent. The "Execution Operation" step is a sub-activity whose description is pointed out in Figure 3. When the sub-activity is performed, if no return value is expected the procedure terminates; otherwise this value has to be forwarded to the CM in the same way as was previously done with the request.

The sequence of steps involved in the sub-activity is represented in Figure 3. If the operation to be executed involves a component different from the Host Coordinator, the already described intra-module communication has to be employed. Once the selected component receives the message through this mechanism, if no problem occurs the associated activity is performed, else an error is generated. If the operation is executed correctly and a return value has to be generated, the component is responsible for generating the response message, which is forwarded to the HM and thus to the CM.

V. CLEVER IN FEDERATED CLOUD COMPUTING SCENARIOS

In this section we motivate how CLEVER is able to support both vertical and horizontal federation scenarios.

A. CLEVER in Vertical Federation

CLEVER supports vertical federation scenarios by offering services to other clouds.
Figure 3. Activity diagram of the sub-activity Executing Operation.
In fact, external clouds, using CLEVER's web service interface, are able to arrange their own PaaS and SaaS by deploying VMs on CLEVER's IaaS. To clarify these ideas, we specifically consider a use case where a customer needs to compute a huge amount of data. Since this task requires a great deal of computing power, a large physical infrastructure has to be employed for its accomplishment. The obvious solution would be to buy all the required hardware, install a parallel computing middleware and then start the required tasks. Such an approach is not efficient if the computation has to be performed just once: in this case, it may be more convenient to lease an ad hoc infrastructure, using a pay-per-use billing approach, for the time required to accomplish the task.

B. CLEVER and SGE

In this section we propose a use case in which CLEVER is used to deploy an SGE [10] grid infrastructure. For this example we consider an SGE Master node and several Worker nodes. The SGE Cloud Service is formalized as follows: 1) the Master VM contains the Master, Administration and Submit Hosts; 2) the Worker VM contains the Execution Hosts. Figure 4 depicts the SGE use case of vertical federation involving CLEVER. In this scenario a customer who suddenly needs to compute a huge amount of data contacts a "Scientific Computing Cloud" which, at a low level, exploits CLEVER's IaaS.
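This two-role formalization can be written down as a tiny service template, sketched below; the type and field names (VmTemplate, roles, the image file names) are hypothetical and only mirror the Master/Worker split described above.

```python
# Hypothetical service template for the "SGE Cloud Service":
# one Master VM (Master, Administration and Submit hosts) plus
# N Worker VMs (Execution hosts). Field names are our own.
from dataclasses import dataclass, field

@dataclass
class VmTemplate:
    disk_image: str
    roles: list[str] = field(default_factory=list)

def sge_service(workers: int) -> dict[str, VmTemplate]:
    vms = {"master": VmTemplate("sge-master.img",
                                ["master", "administration", "submit"])}
    for i in range(workers):
        vms[f"worker-{i}"] = VmTemplate("sge-worker.img", ["execution"])
    return vms

# The use case in Figure 4 deploys one master and five workers.
print(list(sge_service(5)))
```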
In step 1 the customer sends a "Grid SaaS" instantiation request to the "Scientific Computing Cloud". In order to arrange the required SaaS, in step 2 the "Scientific Computing Cloud" sends an IaaS instantiation request to CLEVER via CLEVER's web services and according to given SLAs. The CLEVER web services interface is connected to the XMPP external communication room along with the active CM and the HMs. When a request arrives via web services, it is converted into a corresponding CLEVER message (formatted according to the external communication protocol) and then sent. Once sent, the request included within the message is caught by the active CM, which executes the desired operation. The IaaS consists of several VMs whose operating system contains pre-installed software for setting up a parallel computing environment based on SGE. We assume that the disk images (including both the guest OS and the SGE software) for the instantiation of the VMs are uploaded by the "Scientific Computing Cloud" to CLEVER. More specifically, the "Scientific Computing Cloud" uploads two types of disk image (one for the SGE Master and the other for the worker nodes) to the CLEVER Storage Manager running on the CM (using the CLEVER web service interface). The Storage Manager registers the images within the cluster catalogue and saves them within the distributed storage system. In step 3, CLEVER arranges the IaaS, including six VMs (one acting as SGE Master and five as worker nodes) interconnected by a virtual network. After that, the "Scientific Computing Cloud", via the web service interface, sends the VM submission requests to the CM, which in turn queries the distributed database to identify the best HMs for allocating the VMs. Furthermore, as depicted in the top part of Figure 4, the VMs are configured (using the Network Manager) to allow their network communication. When all the needed VMs have been started, the "Scientific Computing Cloud" receives the IP addresses of the VMs in order to access them via SSH and perform some configuration (if needed). Once the whole IaaS is arranged, in step 4 the "Scientific Computing Cloud" instantiates the graphical web interface of the "Grid SaaS" and links the latter with the IaaS (running the parallel computing environment) provided by CLEVER. Finally, the "Grid SaaS" is delivered to the customer.
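The flow of steps 1-4 can be condensed into a toy end-to-end sketch, shown below. The gateway function, message layout and room abstraction are illustrative assumptions standing in for the real web service interface and XMPP room.

```python
# End-to-end sketch of steps 1-4 above: the web service gateway
# converts an incoming request into a message in the external
# communication format and posts it to the XMPP room, where the
# active CM picks it up. Everything here is an assumption.
import json

ROOM = []  # stand-in for the external communication XMPP room

def web_service_request(action: str, **params):
    """Step 2: convert a web-service call into a CLEVER message."""
    ROOM.append(json.dumps({"action": action, "params": params}))

def active_cm_poll():
    """The active CM catches pending requests and executes them."""
    while ROOM:
        msg = json.loads(ROOM.pop(0))
        print("CM executing:", msg["action"], msg["params"])

# Upload the two disk images, then submit the six VMs (step 3).
web_service_request("uploadImage", name="sge-master.img")
web_service_request("uploadImage", name="sge-worker.img")
for i in range(6):
    web_service_request("submitVm",
                        image="sge-master.img" if i == 0 else "sge-worker.img")
active_cm_poll()
```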
Figure 4. Example of CLEVER in vertical federation.
C. CLEVER in Horizontal Federation
CLEVER has been designed with an eye toward horizontal federation. In fact, the choice of using XMPP for the communication among CLEVER modules (i.e., the external communication XMPP room) was made with a view to also supporting, in the future, interdomain communication between different CLEVER administrative domains. Interdomain communication is the basis of horizontal federation. Federation allows clouds to "lend" and "borrow" computing and storage resources to and from other clouds. In the case of CLEVER, this means that a CM of one administrative domain is able to control one or more HMs belonging to other administrative domains. For example, if a CLEVER domain A runs out of resources on its own HMs, it can establish a horizontal federation with a CLEVER domain B, in order to allow the CM of domain A to use one or more HMs of domain B. This enables the CM of domain A to allocate VMs both on its own HMs and on the rented HMs of domain B. In this way, on one hand the CLEVER cloud of domain A can continue to allocate services for its clients (e.g., IT companies, organizations, desktop end-users, etcetera), whereas on the other hand the CLEVER cloud of domain B earns money from the CLEVER cloud of domain A for renting out its HMs.

As anyone may run their own XMPP server on their own domain, it is the interconnection among these servers which makes up the interdomain communication. Commonly, every user on the XMPP network has a unique Jabber ID (JID). To avoid requiring a central server to maintain a list of IDs, the JID is structured like an e-mail address, with a user name and a domain name for the server where that user resides, separated by an at sign (@). For example, in the CLEVER scenario a CM could be identified by the JID bach@domainB.net, whereas an HM could be identified by the JID liszt@domainA.net: bach and liszt respectively represent the host names of the CM and the HM, while domainB.net and domainA.net respectively represent the domains of the cloud which "borrows" HMs and of the cloud which "lends" them. If bach@domainB.net wants to communicate with liszt@domainA.net, bach and liszt must have accounts on the domainB.net and domainA.net XMPP servers respectively.
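Since the JID itself encodes both the host name and the administrative domain, interdomain routing needs no central registry. A minimal sketch of this addressing scheme follows, reusing the bach/liszt example; the Jid type and its parse helper are our own illustration.

```python
# A JID carries both the node (host) name and the administrative
# domain, so no central registry is needed to route interdomain
# traffic. The concrete JIDs follow the example in the text.
from typing import NamedTuple

class Jid(NamedTuple):
    node: str    # host name of the CM or HM
    domain: str  # administrative domain running the XMPP server

    @classmethod
    def parse(cls, raw: str) -> "Jid":
        node, _, domain = raw.partition("@")
        return cls(node, domain)

cm = Jid.parse("bach@domainB.net")   # CM of the borrowing domain
hm = Jid.parse("liszt@domainA.net")  # HM of the lending domain
# Server-to-server routing is implied whenever the domains differ.
print(cm.domain != hm.domain)  # True: interdomain communication
```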
A CLEVER cluster includes a set of HMs, orchestrated by a CM, all acting on a specific domain and connected to the same XMPP intradomain communication room. Each HM is deployed in a physical host and is responsible for managing its computing and storage resources according to the commands given by the CM. The idea of horizontal federation in CLEVER environments is founded on the concept that if a CLEVER cluster on one domain needs external resources from other CLEVER clusters, acting on different domains, a sharing of resources can be arranged, so that resources belonging to one domain are logically included in another domain. Within CLEVER this is straightforward, by means of the built-in XMPP features. Figure 5 depicts an example of interdomain communication between two CLEVER administrative domains for the renting of two HMs from one domain to another.
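The renting step can be modeled, very roughly, as inviting foreign HM JIDs into the borrowing domain's room, after which the CM addresses them like local HMs. The sketch below is a toy model under that assumption (the extra host names haydn and chopin are invented for the example); it deliberately omits the real XMPP multi-user chat and server-to-server trust machinery.

```python
# Toy model of the renting step in Figure 5: the CM of domainB.net
# invites HMs of domainA.net into its own room, after which it can
# schedule VMs on them exactly as on local HMs.
class Room:
    def __init__(self, domain: str):
        self.domain = domain
        self.members: set[str] = set()

    def invite(self, jid: str):
        self.members.add(jid)

room_b = Room("domainB.net")
room_b.invite("bach@domainB.net")    # the local CM
room_b.invite("haydn@domainB.net")   # a local HM (invented name)
# Horizontal federation: rent two HMs from domainA.net.
for foreign_hm in ("liszt@domainA.net", "chopin@domainA.net"):
    room_b.invite(foreign_hm)

rented = [m for m in room_b.members if m.endswith("@domainA.net")]
print(sorted(rented))  # HMs now controllable by domainB's CM
```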
Figure 5. Example of CLEVER in horizontal federation.
Considering the aforementioned domains domainA.net and domainB.net, in scenarios without federation they include separate XMPP rooms for intradomain communication (one hosted on domainA.net and one on domainB.net), in which a single CM, responsible for the administration of the domain, communicates with several HMs, typically placed within the physical cluster of the CLEVER domain. Instead, considering a horizontal federation scenario between the two domains, if the CM of the domainB.net domain needs external resources, it can invite, into its own room on domainB.net,
one or more HMs of the domainA.net domain. As previously stated, in order to accomplish such a task a trust relationship between the domainA.net and domainB.net XMPP servers has to be established, enabling the server-to-server communication that allows HMs of domain A to join the external communication XMPP room of domain B.

VI. CONCLUSIONS AND REMARKS

In this paper we focused our attention on the existing cloud computing middleware solutions, specifically pointing out those by means of which it is possible to build federated cloud infrastructures. After such a discussion, we analyzed the CLEVER cloud middleware in depth, highlighting its main features and paying attention to its ability to build federated cloud environments. We presented two different federation scenarios, a Vertical Federation and a Horizontal Federation, and showed how CLEVER, thanks to its characteristics, is able to be involved in both. We are currently working on the development of SSO authentication security mechanisms for enabling horizontal federation, establishing trust contexts between different CLEVER domains.

ACKNOWLEDGEMENTS

The research leading to the results presented in this paper has received funding from the European Union's Seventh Framework Programme (FP7 2007-2013) Project VISION Cloud under grant agreement number 217019.

REFERENCES

[1] T. Bittman, "The evolution of the cloud computing market," Gartner Blog Network, http://blogs.gartner.com/thomas_bittman/2008/11/03/the-evolution-of-the-cloud-computing-market/, November 2008.

[2] Amazon Elastic Compute Cloud (Amazon EC2), http://aws.amazon.com/ec2/.

[3] Windows Azure Platform, http://www.microsoft.com/windowsazure/.
[4] F. Tusa, M. Paone, M. Villari, and A. Puliafito, "CLEVER: A CLoud-Enabled Virtual EnviRonment," in 15th IEEE Symposium on Computers and Communications (ISCC '10), Riccione, Italy, June 2010.

[5] OpenQRM, "The next generation, open-source data-center management platform," http://www.openqrm.com/.

[6] B. Sotomayor, R. Montero, I. Llorente, and I. Foster, "Resource Leasing and the Art of Suspending Virtual Machines," in High Performance Computing and Communications, 2009 (HPCC '09), 11th IEEE International Conference on, pp. 59-68, June 2009.

[7] C. Hoffa, G. Mehta, T. Freeman, E. Deelman, K. Keahey, B. Berriman, and J. Good, "On the Use of Cloud Computing for Scientific Workflows," in SWBES 2008, Indianapolis, December 2008.

[8] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and D. Zagorodnov, "The Eucalyptus Open-Source Cloud-Computing System," in Cluster Computing and the Grid, 2009 (CCGRID '09), 9th IEEE/ACM International Symposium on, pp. 124-131, May 2009.

[9] B. Sotomayor, R. Montero, I. Llorente, and I. Foster, "Virtual infrastructure management in private and hybrid clouds," IEEE Internet Computing, vol. 13, pp. 14-22, September 2009.

[10] Oracle Grid Engine, http://www.oracle.com/us/products/tools/oracle-grid-engine-075549.html, March 2011.