Multi-Site Virtual Cluster: A User-Oriented, Distributed Deployment and Management Mechanism for Grid Computing Environments

Takahiro Hirofuchi1, Takeshi Yokoi1, Tadashi Ebara1,2, Yusuke Tanimura1, Hirotaka Ogawa1, and Hidemoto Nakada1

1 National Institute of Advanced Industrial Science and Technology (AIST)
2 Mathematical Science Advanced Technology Laboratory Co., Ltd.
Abstract. Grid computing is a promising technology for utilizing distributed computing resources seamlessly. However, it is still difficult for end-users to deploy and manage their own computing environments across a large number of distributed locations easily and rapidly. Recent grid technologies do not address the problems involved in distributed deployment and management of grid-enabled system programs; they do not allow end-users to obtain computing resources dynamically from different locations and organizations, and to build their computing systems with them easily and scalably. Therefore, we propose a user-oriented, distributed deployment and management mechanism for grid computing systems, based on multi-site virtual clusters. It enables end-users to easily configure grid computing environments on distributed computing resources through their own large-scale clusters of virtual machines. It is composed of inter-domain resource control protocols, resource virtualization technologies including virtual machines and virtual private networks, and scalable virtual node management mechanisms. It allows greater autonomy for end-users in obtaining and utilizing distributed computing resources from different organizations. We developed a prototype implementation of the proposed mechanism in order to confirm its validity in WAN environments. Our experiments showed that users could rapidly create large-scale multi-site virtual clusters, and customize their virtual nodes easily and quickly under WAN latencies.
1 Introduction
In grid computing environments, users want to access, seamlessly, distributed computing resources that are located in geographically separated locations and administered by different organizations. Recent grid technologies enable users to utilize remote resources through service interfaces provided by grid-enabled middleware and application programs. These middleware suites and application programs for computing resource collaboration need to be installed and configured appropriately before they are accessed by users. Only privileged users (e.g., administrators) can perform system-wide deployment and configuration of such software programs.
For massively distributed resources of different organizations, however, this "manpower intensive" deployment task becomes a bottleneck hindering rapid and dynamic collaboration over diverse computing resources by end-users. It is difficult to deploy software programs with their appropriate settings in ever-changing, heterogeneous resource environments in a manner efficient for all kinds of computing resource usage. This problem is caused by administrator-oriented approaches to resource allocation and software deployment, which mean that end-users cannot decide how to combine distributed resources on demand and how to fully configure them for their purposes. We believe that a user-oriented resource management framework is required, one that enables users to choose distributed resources and build their own computing environment completely and autonomously.

In this paper, we propose a user-oriented, distributed deployment and management mechanism for grid computing systems based on multi-site virtual clusters; it allows users to incorporate distributed computing resources into their own single virtual clusters over a WAN. Our mechanism is composed of inter-domain resource control protocols, resource virtualization technologies including virtual machines (VMs) and virtual private networks (VPNs), and scalable virtual node management mechanisms. It allows greater autonomy for end-users in obtaining and utilizing distributed computing resources from different organizations. A multi-site virtual cluster is composed of distributed VMs in different locations, which are created by end-users via the inter-domain resource control protocols. The virtualization mechanisms provided by resource providers create VMs at the request of end-users, and the allocated VMs are interconnected via VPNs to present a single cluster view. The virtual node management mechanisms allow users to deploy operating systems and grid applications on their distributed VMs rapidly, and to customize each virtual node flexibly, with minimal management cost. This enables users to access distributed resources seamlessly via their own virtual clusters, and to create their computing environments on them in the same manner as on physical clusters.

The contribution of this paper is to address system deployment and management problems in large-scale distributed computing environments, which have not yet been tackled by either grid or virtualization technologies. It presents novel system deployment and management mechanisms that enable end-users to build their own distributed computing environments easily and rapidly. It also shows the validity of the proposed mechanisms in WAN environments through our experiments.

Section 2 summarizes related work briefly, and Section 3 explains the concept of our proposed architecture and its system design. Section 4 shows the results of our experiments. Finally, Section 5 concludes this paper.
2 Related Work
Although existing grid technologies provide remote access mechanisms for distributed resources, they do not address the distributed deployment and management of grid-enabled programs by end-users. Virtualization is the key to creating isolated computing environments for users. Recent virtualization research and technologies, however, have not addressed distributed deployment and management problems.

Cloud computing with virtualization, such as Amazon EC2 [1], is a web service through which users create VMs and customize them for their applications. However, because its target customers are web service providers and not grid application users, it does not have the functionality to manage large-scale distributed VMs from different organizations. Virtual Workspace [2] is a virtualized execution environment based on the Globus Toolkit framework. It allows clients to create VMs on remote resource providers through the WSRF protocols. However, it lacks an integration mechanism for VMs allocated in different locations. Our research aims to allow users to manage large-scale distributed VMs in an easy and flexible manner, as if they were in a single physical cluster, providing a distributed computing infrastructure through which end-users and different organizations can collaborate easily via multi-site virtual clusters.

PlanetLab [3] is a distributed computing testbed for large-scale network applications. Users can run their network programs on VMs distributed at many sites over the Internet. Since it focuses only on academic network research communities, it does not address deployment and management of grid computing systems for end-users.
3 Multi-site Virtual Clusters
As mentioned above, our system aims to provide a distributed computing infrastructure for distributed deployment and management of grid computing systems, through which end-users can easily build large-scale computing environments from distributed computing resources in different organizations.
3.1 The Concept of Multi-site Virtual Clusters
We propose the concept of a multi-site virtual cluster, which is a single cluster of distributed VMs allocated in different locations. It is an easy-to-use computing environment through which users can rapidly deploy their customized software programs to a large number of virtual nodes, and easily manage them as a single cluster. A user's view of our concept is illustrated in Figure 1. Virtual organization X creates a multi-site virtual cluster from computing resources at Sites A, B, and C; it is composed of 30 VMs at Site A, 10 at Site B, and 20 at Site C. X can add more VMs to its virtual cluster from other sites. All of these VMs, connected to a common private L2 network, can be seen as a single physical cluster.
Fig. 1. User’s view of multi-site virtual clusters
X can fully customize its distributed computing environment, including the operating system and middleware programs, in the way most appropriate for its target application.

The advantages of the proposed architecture are summarized as follows. First, it enables end-users to create their own distributed computing environments in a fully customizable manner via hardware virtualization. Applications and grid middleware programs can be rapidly installed in distributed VMs, appropriately configured, and easily managed by end-users, not by administrators at each site. In addition, distributed hardware resources can be seen as a single physical cluster. This breaks down the barriers that prevent end-users from easily utilizing distributed resources. Since the virtualization mechanisms absorb physical hardware differences, the internal systems of multi-site virtual clusters can be installed as if on a uniform hardware platform. All VMs of a multi-site virtual cluster have a network interface connected to a single network segment over WANs, so that existing management tools and application programs for cluster computing environments can also be exploited. Furthermore, administrators at each site can manage hardware resources in a flexible manner through the virtualization mechanisms. The group of allocated VMs is fully isolated from other hardware resources, so that it can be exported to external users without increasing intrusion risks.

There is, however, a possible issue that must be addressed carefully. Although the bandwidth of recent WAN environments is increasing to levels over 1 Gbps, large network latency is inevitable, and the total end-to-end bandwidth is much less than in a LAN. Our proposed concept needs to achieve both rapid system deployment and strong customizability for a large-scale multi-site virtual cluster composed of distributed nodes over a WAN. As described in later sections, we tackle this issue by optimizing the data transfers of internal system deployments, and the resulting performance is evaluated in our experiments.
3.2 Design Criteria
A high-level view of the system design needed to achieve multi-site virtual clusters as a resource collaboration infrastructure in massively distributed environments is discussed here. The system needs to meet the following requirements:

– Independence of resource providers. The policy decision of whether hardware resources are to be allocated or not should be made based on the management policy of each site. In multi-site environments composed of different organizations, it is difficult to achieve a scalable architecture with centralized decision points.

– Autonomy of resource users. A resource user should be able to make his or her own decisions about the acquisition and combination of VMs; the user decides from which sites to solicit the allocation of VMs, and how these allocated VMs are combined into multi-site virtual clusters. The autonomy of resource users allows flexible deployment of customized systems.

– Scalability of virtual clusters. Multi-site virtual clusters should have maximum scalability, which means that deployment and management costs do not increase in proportion to the number of distributed VMs. Node-by-node customization should be possible quickly and flexibly, with minimal user interaction.
3.3 System Design
Therefore, we designed a system architecture composed of three separate components. The inter-domain resource control mechanism for VM management allows the whole system architecture to be decentralized, providing for the independence of resource providers and the autonomy of resource users. The virtualization mechanism for local physical clusters also contributes to end-user autonomy by isolating allocated resources completely. The cluster configuration framework for distributed virtual nodes provides scalable deployment and management of internal operating systems and applications.

Inter-domain Resource Control Mechanism
A multi-site virtual cluster is composed of local virtual clusters allocated at each site. These single-site virtual clusters are interconnected by VPN services and incorporated into the multi-site virtual cluster. An overview of the system design is illustrated in Figure 2. A virtual cluster system is configured at each site, which allocates and controls VMs on physical clusters via the resource management API in Table 1. Through this API, users can create and destroy virtual clusters, add and remove their VMs, and start and stop VMs. The API also includes calls for starting and stopping VPNs among remote sites to establish the internal private network of a multi-site virtual cluster. Ethernet VPNs are utilized to bridge the local networks of allocated single-site virtual clusters, so that users can manage the internal system of a multi-site virtual cluster by exploiting existing cluster management tools.
Fig. 2. Design overview of multi-site virtual clusters
This reduces the administrative cost of the internal system, and enables users to create a large-scale cluster composed of multi-site virtual nodes. Application programs on the multi-site cluster can communicate with allocated VMs via both private and public networks. Application data sent from private network interfaces is transferred to other VMs, and the data sent to VMs at other sites is automatically redirected by the VPN services. This means that existing cluster programs basically run in a multi-site virtual cluster without any modification. On the other hand, application data can also be transferred via the public network interface by exploiting communication mechanisms over the Internet, thereby avoiding the overhead of the VPN services. This mechanism is suitable for grid-enabled programs that can exchange data securely and efficiently over public networks.

The resource management API involves authentication, authorization, and accounting for carrying out local resource management policies. An authorization framework based on virtual organizations is utilized for maximum flexibility with distributed decision points. Since multi-site virtual clusters can be collaboration environments among different organizations, the concept of a virtual organization is suitable for creating virtual groups, for particular purposes, that utilize customized computing environments built on multi-site virtual clusters. It separates the membership management of a virtual organization from resource providers, and reduces the authorization control costs of the virtual cluster systems at each site.

In Figure 2, users communicate directly with target sites to allocate the required resources via the Web service API. It is also possible to develop a portal service that enables users to delegate distributed resource management. The portal service, exploiting the same API to allocate and control resources, can provide advanced interfaces, such as automatic resource provider selection. GSI-based authentication [4] is suitable for the portal service, exploiting proxy certificates issued by users.
Table 1. Resource management REST API

URI (path)            Method  Parameters  Description
/api/vc               GET                 lists virtual clusters
/api/vc               POST                creates a virtual cluster (VC)
/api/vc/$id           GET                 gets the status of a VC
/api/vc/$id/vm        GET                 gets the status of a VM
/api/vc/$id/vm        POST    mem, disk   adds a VM to a VC
/api/vc/$id/vm/$id    POST    action      starts or stops a VM
/api/vc/$id/vpn       GET                 gets the status of a VPN
/api/vc/$id/vpn       POST    peer id     adds a VPN to a VC
/api/vc/$id/vpn/$id   POST    action      starts or stops a VPN
We defined our API in a RESTful style because it can then be called easily from Web browsers via JavaScript; other RPC mechanisms, such as SOAP and XML-RPC, could also be used to implement the same functionality. There are additional functions for deleting components. A requesting entity accesses the above API provided by the cluster manager of a target site, for example at https://example.com/api/vc. SSL client/server authentication is used to verify connecting peers.
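To make the calling convention concrete, the following is a minimal client-side sketch of how a requesting entity might drive this API. Beyond the paths, methods, and parameter names of Table 1, the details are our own assumptions: the example host, the certificate file paths, form-encoded request parameters, JSON responses carrying an "id" field, and the "peer_id" spelling of the "peer id" parameter.

import requests

BASE = "https://example.com/api"                 # cluster manager of one target site
CLIENT_CERT = ("user-cert.pem", "user-key.pem")  # SSL client authentication (paths assumed)
CA_BUNDLE = "site-ca.pem"                        # verifies the site's server certificate

def call(method, path, **params):
    # One REST call against the API of Table 1; parameters are sent form-encoded (assumed).
    resp = requests.request(method, BASE + path, data=params or None,
                            cert=CLIENT_CERT, verify=CA_BUNDLE)
    resp.raise_for_status()
    return resp.json()                           # JSON body with an "id" field assumed

# Create a virtual cluster at this site, add one VM to it, and boot the VM.
vc = call("POST", "/vc")                                        # creates a VC
vm = call("POST", f"/vc/{vc['id']}/vm", mem=1024, disk=20)      # "mem, disk" as in Table 1
call("POST", f"/vc/{vc['id']}/vm/{vm['id']}", action="start")   # "action" parameter

# Bridge this site's private segment to a remote site's virtual cluster.
vpn = call("POST", f"/vc/{vc['id']}/vpn", peer_id="siteB-vc01") # the "peer id" parameter
call("POST", f"/vc/{vc['id']}/vpn/{vpn['id']}", action="start")

A portal service would issue exactly the same calls on behalf of its users, which is why the paper notes that the portal can be layered on top of the site-local API without any changes to the cluster managers.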
Fig. 3. Virtual cluster management system
Resource Virtualization Mechanism for Physical Clusters
A virtual cluster system is configured at each site, through which site administrators incorporate physical hardware nodes into the local resource pool of virtual clusters (Figure 3). The virtual cluster system at each site is designed only for local resource management, including VM allocation and VPN service control. It does not control resources at other sites, retaining site independence in the inter-domain system architecture.

All requests to a cluster manager are first authorized against a resource allocation policy at each site. This policy defines the amount of resources that can be allocated to requesting entities, such as the maximum number of VM instances, and the maximum size of disk space and memory per VM. A Distinguished Name in a client SSL certificate is used to identify a requesting entity, and a Fully Qualified Attribute Name (FQAN) entry embedded in the certificate is also used to perform VO-based authorization, if available.
Fig. 4. Internal network of a local virtual cluster
Accounting information (e.g., the running time of VMs) is recorded in the cluster manager.

To create an isolated virtual cluster, the virtual cluster system first sets up a new tagged VLAN in its local physical network; a VLAN interface is added to all physical nodes in the resource pool. Next, when the virtual cluster system gets a user request to add a VM to the virtual cluster, it creates a new VM image on one of the physical nodes, and bridges the private network interface of the VM to the VLAN-tagged network interface of the host physical machine. The public network interface of the VM can also be added and bridged to the public network segment of the resource pool on request. An Ethernet VPN daemon (OpenVPN [5]) is launched to prepare new Ethernet VPN sessions bridging the private network inside the VLAN with the private networks of remote virtual clusters.
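The host-side plumbing described above could look roughly like the sketch below. This is an illustration of the technique, not the project's actual implementation: the interface names, the VLAN id (121, echoing the eth0.121 interface labels in the traffic graphs), the tap device names, the peer host name, and the OpenVPN key file are all assumptions.

import subprocess

def sh(cmd):
    # Run one host-side command; fail loudly if it does not succeed.
    subprocess.run(cmd, shell=True, check=True)

VLAN_ID = 121            # one tag per virtual cluster (121 is illustrative)
PHYS_IF = "eth0"         # physical interface of the hosting node
BRIDGE = f"br{VLAN_ID}"  # bridge carrying the cluster's private segment
VM_IF = "vnet0"          # host-side endpoint of the VM's private interface
VPN_IF = "tap0"          # OpenVPN layer-2 endpoint towards a remote site

# 1. Tagged VLAN interface on the local physical network.
sh(f"ip link add link {PHYS_IF} name {PHYS_IF}.{VLAN_ID} type vlan id {VLAN_ID}")
sh(f"ip link set {PHYS_IF}.{VLAN_ID} up")

# 2. Bridge the VM's private interface to the VLAN-tagged interface.
sh(f"brctl addbr {BRIDGE}")
sh(f"brctl addif {BRIDGE} {PHYS_IF}.{VLAN_ID}")
sh(f"brctl addif {BRIDGE} {VM_IF}")
sh(f"ip link set {BRIDGE} up")

# 3. Ethernet VPN session bridging this segment with a remote virtual cluster.
sh(f"openvpn --mktun --dev {VPN_IF}")        # pre-create a persistent tap device
sh(f"brctl addif {BRIDGE} {VPN_IF}")
sh(f"ip link set {VPN_IF} up")
sh(f"openvpn --daemon --dev {VPN_IF} --remote peer-site.example.org 1194 "
   f"--secret /etc/openvpn/static.key")

Because the tap device is a member of the same bridge as the VLAN interface, Ethernet frames from local VMs reach remote VMs transparently, which is what lets the multi-site cluster appear as a single L2 segment.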
Virtual Cluster Configuration Framework for End-users
End-users need to configure large-scale multi-site virtual clusters, in which VMs may be added or removed dynamically. Users also have to customize system software and applications for their VMs frequently in order to optimize their grid computing systems. We designed a cluster configuration framework optimized for a large number of computing nodes located under nonuniform Ethernet topologies, including, for instance, wide area Ethernet services. Because a multi-site virtual cluster with a private Ethernet network over a WAN can be seen as a single physical cluster, the cluster configuration framework is based on a physical cluster configuration toolkit; adding and removing nodes, installing operating systems, and node-by-node customization are possible in the same manner as for physical clusters. Users can perform such cluster-wide configurations through the command line console of a frontend node.

The VPNs interconnecting distributed local virtual clusters, however, become a bottleneck for internal node management due to the performance limitations of WANs. Our cluster configuration framework therefore exploits a package-based cluster installer, Rocks [6], and a transparent package caching mechanism, Squid [7]. Rocks allows flexible node-by-node configuration by installing a customized list of packages with minimal user interaction. In our framework, all package transfers are transparently intercepted by the package caching mechanism at each site (Figure 5). The outbound download requests to the frontend node are reduced to those for a unique set of packages, and the inter-domain network traffic over the VPNs is dramatically reduced.

Fig. 5. Transparent package caching mechanism
3.4 Prototype Implementation
We have developed a prototype implementation of the proposed multi-site virtual cluster system, which supports resource allocations and controls via the REST API. It has a portal Web site through which users can make reservations of multi-site virtual clusters and monitor their status. Through the portal Web site, users can automatically install an internal operating system optimized for system customization and deployment inside multi-site virtual clusters (Figure 6). These implementations are designed to be add-on packages of the NPACI Rocks cluster toolkit. They will be available under an open source license on our project page. Some well-tested components such as VLAN and the VMware utilities are already downloadable from http://code.google.com/p/grivon/.
4 Experiments
We performed experiments to show the validity of the proposed system. As noted in Section 3.1, a primary issue is the deployment and management cost of widely distributed multi-site virtual clusters. The proposed system has to enable users to quickly configure their customized systems on distributed VMs through the view of a single cluster.

The experiments emulate a multi-site virtual cluster composed of VMs at two different locations. Its internal network is built with an OpenVPN session linking them. As shown in Figure 7, two physical clusters, Cluster A (16 nodes; AMD Opteron 244, 3 GB memory, two Gigabit Ethernet ports per node) and Cluster B (134 nodes; AMD Opteron 246, 6 GB memory, two Gigabit Ethernet ports per node), are connected with Gigabit Ethernet via a network emulator [8], which adds network latencies to all traffic between the clusters to emulate a WAN environment.
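The experiments use a dedicated hardware network emulator (GNET-1 [8]) to insert the WAN-like latencies. Readers without such hardware could approximate the same conditions in software with Linux netem; the sketch below, including the gateway interface name, is our own substitution and not part of the original setup.

import subprocess

def add_delay(dev: str, one_way_ms: int) -> None:
    # Attach a netem qdisc that delays all egress traffic on `dev`.
    subprocess.run(["tc", "qdisc", "add", "dev", dev, "root",
                    "netem", "delay", f"{one_way_ms}ms"], check=True)

# Run on the gateway of each cluster: 10 ms of egress delay on each side
# yields roughly a 20 ms round-trip time between the two clusters.
add_delay("eth1", 10)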
Fig. 6. A screenshot image of our resource manager program for users. It was captured during a demonstration creating multi-site virtual clusters from distributed computing resources at 5 remote sites. A user-created multi-site virtual cluster composed of 26 VMs is connected to the same network segment via VPNs. Six VMs were hosted on UCSD’s cluster and others were at different locations in Japan. Four VPN connections were established for a private network over the Internet.
Fig. 7. Experiment Setting
The 134 VMs on Cluster B are reconfigured with new settings for grid system programs from a frontend node on Cluster A. The VMs are reinstalled with a set of software packages (900 MB) together with their new settings. All software packages are retrieved from the frontend node over the VPN session. A transparent package caching server, however, intercepts the package requests and merges them to reduce the network traffic over the VPN session. Figure 8 shows the completion time for the reconfiguration of all 134 VMs, and Figures 9 to 11 show the VPN traffic and the download traffic of the cache server in the case of a network latency of 20 ms (a typical WAN latency between two major cities in Japan, Tokyo and Osaka).
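The benefit of the site-local cache can be seen from a back-of-envelope model (our own simplification; it ignores HTTP overhead, cache misses, and installation metadata): with the cache, roughly one copy of the 900 MB package set crosses the VPN instead of one copy per node.

def vpn_traffic_gb(nodes: int, package_set_mb: float, cached: bool) -> float:
    # Approximate data crossing the inter-site VPN during reinstallation.
    unique_set = package_set_mb                       # one copy of each package
    total = unique_set if cached else nodes * package_set_mb
    return total / 1024.0

print(vpn_traffic_gb(134, 900, cached=False))  # ~117.8 GB without the site-local cache
print(vpn_traffic_gb(134, 900, cached=True))   # ~0.9 GB: only the unique package set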
Fig. 8. Deployment time for new configuration over different network latencies (installation time in seconds for all 134 nodes vs. RTT in ms, with the cache disabled, the cache enabled, and the cache enabled with pre-cached packages)
Fig. 9. VPN traffic and cache server traffic (cache disabled); TX/RX throughput in Kbytes/s over time in seconds for the VPN and the eth0.121 interface
The vertical line in the figures shows the completion time for the reconfiguration of all 134 VMs. The reconfiguration was completed 800-1000 seconds faster when the cache server was enabled. As shown in Figure 10, the VPN traffic is much smaller than the download traffic from the cache server; the total data transferred over the VPN corresponds to a unique set of the downloaded packages. Compared with Figure 9, the VPN traffic is far less than in the case without caching. In the case where all packages were pre-cached before reconfiguration, the reconfiguration consistently took about 1300 seconds under various network latencies. As shown in Figure 11, the VPN traffic was dramatically reduced because no package transfer was needed. It is therefore reasonable to share common packages in a local cache repository at each location.

These results clarify that our cluster configuration framework designed for multi-site virtual clusters resolves the major issues of distributed deployment and management of large-scale VMs. Distributed VMs in remote locations are rapidly configured by our efficient package caching mechanism, and also easily managed via the user-friendly interface based on a cluster configuration toolkit for single physical clusters.
Fig. 10. VPN traffic (scaled) and cache server traffic (cache enabled); TX/RX throughput in Kbytes/s over time in seconds for the VPN and the eth0.121 interface
Fig. 11. VPN traffic (scaled) and cache server traffic (cache enabled, pre-cached); TX/RX throughput in Kbytes/s over time in seconds for the VPN and the eth0.121 interface
5 Conclusions
We proposed a user-oriented, distributed deployment and management mechanism for grid computing systems, based on multi-site virtual clusters. It enables end-users to create large-scale clusters composed of VMs at different locations, through which they can easily and rapidly configure their grid system programs distributed over those locations. Our mechanism is composed of inter-domain resource control protocols, resource virtualization technologies including VMs and VPNs, and scalable virtual node management mechanisms. Our prototype implementation enables users to rapidly build multi-site virtual clusters over the Internet, and to easily configure their internal systems through the view of a single cluster. The experiments showed that our cluster management framework allows rapid configuration of a large number of VMs under WAN latencies.
Acknowledgment
This work was supported by JST/CREST.
References
1. Amazon Elastic Compute Cloud. http://aws.amazon.com/ec2
2. Zhang, X., Freeman, T., Keahey, K., Foster, I., Scheftner, D.: Virtual clusters for grid communities. In: Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID2006), IEEE Computer Society (2006)
3. Peterson, L., Anderson, T., Culler, D., Roscoe, T.: A blueprint for introducing disruptive technology into the Internet. ACM Computer Communication Review 33(1) (2003) 59–64
4. Foster, I., Kesselman, C., Tsudik, G., Tuecke, S.: A security architecture for computational grids. In: Proceedings of the Fifth ACM Conference on Computer and Communications Security (CCS'98), ACM Press (1998) 83–92
5. OpenVPN. http://openvpn.net/
6. Papadopoulos, P.M., Katz, M.J., Bruno, G.: NPACI Rocks: Tools and techniques for easily deploying manageable Linux clusters. In: Proceedings of Cluster 2001: IEEE International Conference on Cluster Computing, IEEE Computer Society (2001) 258–267
7. Squid: Optimizing Web Delivery. http://www.squid-cache.org/
8. Kodama, Y., Kudoh, T., Takano, R., Sato, H., Tatebe, O., Sekiguchi, S.: GNET-1: Gigabit Ethernet network testbed. In: Proceedings of Cluster 2004: IEEE International Conference on Cluster Computing, IEEE Computer Society (2004) 185–192
9. Nakada, H., Yokoi, T., Ebara, T., Tanimura, Y., Ogawa, H., Sekiguchi, S.: The design and implementation of a virtual cluster management system. In: Proceedings of the First IEEE/IFIP International Workshop on End-to-end Virtualization and Grid Management (EVGM2007), Multicon Verlag (2007) 185–192