2011 Sixth Annual ChinaGrid Conference

Dynamic Deployment and Management of Elastic Virtual Clusters

Xiaohui Wei, Haibin Wang, Hongliang Li, Lei Zou
College of Computer Science and Technology, Jilin University, Changchun, P.R. China
[email protected], {hbwang08, hongliang09, zoulei09}@mails.jlu.edu.cn

Abstract—Virtual clusters are the fundamental support of new-generation virtual High Performance Computing (HPC) systems. As a high-level virtual system component, a virtual cluster is built on virtual machines and virtual networks to provide a virtual execution environment for large-scale parallel applications. This paper proposes a novel solution for the dynamic deployment and customization of Elastic Virtual Clusters (EVC). In this work, we support customization of virtual clusters, including OS images, network topologies, and cluster software. Virtual clusters are automatically configured with isolated virtual networks (network address, DNS domain, NIS domain, etc.) and software environments. Two data transmission protocols are presented to accelerate image deployment. In addition, we propose a novel solution in which virtual machines (VMs) on the same host share an image without the aid of the VMM. Experimental results show that virtual clusters can be deployed and managed on distributed physical nodes efficiently and can therefore be used by upper-level resource consumers or applications.

Keywords: elastic virtual cluster; VJM; disk-image deployment; cloud computing

I. INTRODUCTION

With the development of grid virtualization and cloud computing, virtualization technology has attracted more and more attention. In the traditional distributed computing area, a computing task is represented as a job involving all the physical resources and software execution environments. Such a job is bound too tightly to the physical node and operating system, which poses many difficulties for resource managers and system administrators. For example, system heterogeneity must be accommodated, security attacks from malicious hackers must be filtered, and node failures, which may be caused by a memory leak in some process, must be avoided. Virtual machines provide a higher-level abstraction of resources [1]. At this abstraction level, resource users and managers see a pool of uniform virtual machines, each of which can be configured independently. Virtualization technology thus provides a promising way to build new-generation high performance computing systems. Nowadays, moving the computing platform from traditional physical resources to virtualized resources has become a trend.

Much research on integrating virtualization technology into current distributed physical resource models has been conducted. Existing work either focuses on the creation and management of the virtual infrastructure or on the service models built on top of it. Our work falls into the former category and mainly provides virtual clusters as a kind of virtual computing infrastructure for resource consumers and managers.

In this paper we propose and implement an elastic virtual cluster framework that supports:

• Dynamic creation of virtual machines on physical resources that are controlled by different VMMs.
• Physical resource selection that maps virtual machines to computing nodes.
• Aggregation of virtual machines that run in more than one administration domain to form virtual clusters.
• Efficient distribution, updating, caching, and sharing of virtual machine images, independent of the VMM.
• Dynamic preparation of the software execution environment to facilitate application execution.
• Resizing or merging compatible virtual clusters at runtime, which provides the functionality to cooperate with a dynamic workload balancer.

The rest of this paper is organized as follows. Section 2 describes related work. Section 3 elaborates the key issues in the EVC design. Section 4 describes the EVC architecture in our implementation. Section 5 shows the experimental results of deploying virtual clusters dynamically using EVC. We summarize our work in Section 6 and also outline promising future work.

II. RELATED WORK

With the advantages of isolation, security, customization, and legacy support, virtual machines provide a good platform for Grid computing [1]. Much work has been done to integrate virtualization technology into the Grid, such as the Globus Virtual Workspace [12][14]. OpenNebula [17] is an open-source solution for virtual infrastructure management, but it has no high-level virtual clusters composed of virtual machines spanning several remote physical clusters. The Cluster on Demand (COD) project [15] multiplexes a physical cluster and enables a grid client to obtain a physical cluster partition based on credentials. Image propagation is a key issue, and [6] investigates this problem and proposes several optional solutions. Virtuoso [16], Violin [7], ViNe [8], and LimeVI [9] explore virtual networking issues, and LimeVI proposes a virtual network architecture to facilitate virtual cluster live migration.

EVC mainly aims to provide a lightweight and efficient virtualized resource manager and makes the virtual cluster a resource allocation unit. The CARE [11] framework did similar work, but it aimed to create virtual machines as a supplement to physical resources, so its computing resources mix physical nodes with virtual nodes.

III. DESIGN

EVC provides its users a metadata-defined abstraction of both execution environments and an optional job description that can be instantiated dynamically on physical resources spanning one or more administrative domains. Users' jobs can be dispatched to the virtual cluster using well-defined protocols. Virtual clusters are destroyed or merged with other compatible virtual clusters after the jobs finish.

We split the process of deploying a virtual cluster into three major phases: 1) the vjob scheduling phase, in which a set of physical nodes is selected to host the virtual machines; 2) the image distribution phase, in which the image template is distributed to the selected computing nodes in an efficient and economic way; and 3) the virtual machine aggregation and virtual cluster auto-configuration phase, in which the virtual machines are configured to form a virtual cluster and the software environment is prepared.
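To make the three-phase flow concrete, the sketch below walks one cluster description through scheduling, image staging, and aggregation. It is a simplified illustration, not EVC's code: the `Node` and `VJob` structures and the round-robin placement are stand-ins for the real metadata and the schedulers described later.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified data model: EVC's real metadata is XML-based
# and far richer; this only illustrates the three deployment phases.

@dataclass
class Node:
    name: str
    cached_images: set = field(default_factory=set)

@dataclass
class VJob:
    vm_name: str
    host: Node = None

def schedule_vjobs(vjobs, nodes):
    """Phase 1: assign each vjob (one per VM) to a physical node."""
    for i, vjob in enumerate(vjobs):
        vjob.host = nodes[i % len(nodes)]  # trivial round-robin stand-in

def distribute_image(template, vjobs):
    """Phase 2: stage the image template on every selected node,
    skipping nodes that already hold a cached copy."""
    for node in {vj.host.name: vj.host for vj in vjobs}.values():
        if template not in node.cached_images:
            print(f"staging {template} -> {node.name}")
            node.cached_images.add(template)

def aggregate_cluster(vjobs):
    """Phase 3: boot and configure the VMs into one virtual cluster."""
    for vjob in vjobs:
        print(f"booting {vjob.vm_name} on {vjob.host.name}")

nodes = [Node("n1", {"centos-base"}), Node("n2")]
vjobs = [VJob(f"vm{i}") for i in range(4)]
schedule_vjobs(vjobs, nodes)
distribute_image("centos-base", vjobs)
aggregate_cluster(vjobs)
```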

A. Virtual Cluster Representation

Xen [2], VMware [3], VirtualBox [4], etc. are popular virtual machine monitors (hypervisors). Each defines its own way of virtualizing physical computing nodes, and many attributes, such as the virtual network structure, disk image format, and virtual machine administrator privileges, are incompatible with one another. A universal virtualization mechanism must accommodate all these differences and provide a unified interface for upper-level users.

To represent a virtual cluster as a basic resource unit, each virtual cluster is assigned a unique public network address by which any other services or end users can communicate with it. The virtual cluster, composed of virtual machines, is described in a universal XML format. Figure 1 is a sample XML file describing a virtual cluster with 32 virtual machines. For that description, 32 vjobs will be created, each corresponding to one virtual machine. All the virtual machines have the same configuration, and they are transparent to the EVC users.

Figure 1. Virtual Cluster Description File

Each virtual machine in the same virtual cluster is assigned a private network address for communicating with the other virtual machines. The virtual machine that hosts the public network interface is called the headnode. The headnode is selected randomly and can be changed at runtime. All the virtual machine configurations are handled by the vjob agent, and that information is then recorded to a disk image. When a virtual machine is booted, an OS boot script is executed to read the configuration file and configure the virtual machine. Each virtual machine is described by internal metadata expressed in XML format, as Figure 2 illustrates.

Figure 2. Internal Virtual Machine Metadata
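Since Figure 1 is not reproduced here, the following sketch shows how such a description might be parsed into one vjob per virtual machine. The element and attribute names (`virtualcluster`, `image`, `network`) are invented for illustration; only the one-vjob-per-VM rule comes from the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout: element and attribute names are invented,
# since the paper's Figure 1 is not reproduced in this text.
SPEC = """
<virtualcluster name="vc-demo" size="32">
  <image template="centos-base"/>
  <network dns="vc-demo.example.org" nis="vcdemo"/>
</virtualcluster>
"""

root = ET.fromstring(SPEC)
size = int(root.get("size"))
template = root.find("image").get("template")

# One vjob per virtual machine, all sharing the same configuration,
# mirroring the paper's "32 vjobs for a 32-VM description".
vjobs = [{"vm": f"{root.get('name')}-{i}", "image": template}
         for i in range(size)]
print(len(vjobs), "vjobs created; first:", vjobs[0])
```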

B. Virtual Job Scheduling

The co-allocation problem must be considered when instantiating a virtual cluster using combined resources from multiple administrative domains. We borrow ideas and approaches from the Virtual Job Model (VJM) [5], which dispatches virtual jobs (vjobs) to co-reserve physical resources for the virtual cluster. Its resource selection strategies and deadlock detection techniques are reused in EVC to cope with resource competition. A vjob in EVC refers to the process of creating and configuring a virtual machine on a remote computing node. Vjob scheduling has an important impact on the overall performance of virtual cluster deployment and execution; a poorly placed cluster degrades service for end users. In our preliminary work we consider two kinds of user jobs, namely computation-intensive and communication-intensive, and define two corresponding resource selection strategies. For communication-intensive jobs, we leverage the VJM resource selection strategy to select a subset of physical nodes that spans a minimal number of administrative domains. For computation-intensive jobs, we use a best-effort strategy to select computing nodes.

The best-effort strategy sorts the physical resources and then matches vjobs to physical resources using a greedy algorithm. The resource properties we consider are the dynamic workload and the image template cache. The sorting algorithm therefore prefers the following order: 1) physical resources that have an image cache and a low workload, 2) physical resources that have no cache but a low workload, 3) physical resources that have a cache but a high workload, and 4) physical resources that have no image cache and a high workload. The workload of a physical resource is measured as the ratio of the number of running virtual machines to the maximum number of virtual machines that can run simultaneously on that resource.

Physical resource statuses are required for scheduling. They are collected and filtered by an information service, for example Globus MDS; in our work we extended the MDS IP (Information Provider) component to collect the extra resource properties. The two simple strategies are implemented in the ResourceSelector module for demonstration, but more sophisticated strategies can easily be plugged in.
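A minimal sketch of the best-effort strategy follows, assuming dictionary-shaped node records. The preference order and the workload metric are taken from the text above; the `HIGH_LOAD` threshold is an assumption, since the paper does not state where "low" ends and "high" begins.

```python
def workload(node):
    """Dynamic workload: running VMs / max simultaneous VMs."""
    return node["running_vms"] / node["max_vms"]

def preference(node, template):
    """Lower tuples sort first. Paper's order: (1) cache & low load,
    (2) no cache & low load, (3) cache & high load, (4) no cache &
    high load - i.e. load class first, cache availability second."""
    HIGH_LOAD = 0.75  # assumed cutoff; not specified in the paper
    loaded = workload(node) >= HIGH_LOAD
    no_cache = template not in node["cached_images"]
    return (loaded, no_cache, workload(node))

def best_effort_select(vjobs, nodes, template):
    """Greedy matching: place each vjob on the most preferred node
    that still has a free slot, re-sorting as workloads change."""
    placement = {}
    for vjob in vjobs:
        nodes.sort(key=lambda n: preference(n, template))
        node = next(n for n in nodes if n["running_vms"] < n["max_vms"])
        node["running_vms"] += 1
        placement[vjob] = node["name"]
    return placement

nodes = [
    {"name": "n1", "running_vms": 1, "max_vms": 4, "cached_images": {"base"}},
    {"name": "n2", "running_vms": 0, "max_vms": 2, "cached_images": set()},
]
print(best_effort_select(["vm0", "vm1", "vm2"], nodes, "base"))
```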

C. Image Preparation

Preparing disk images for each virtual machine is the most time-consuming work during deployment. It must be done efficiently and economically, that is, it should take as little time as possible and impose the least overhead on the system. In our work, a virtual machine is composed of four disk images: a base image, a swap image, a data image, and a status image. This structure speeds up image distribution because the images can be prepared concurrently. Another benefit is that it occupies less disk space on the ImageServer, because some images, for example the swap image, need not be allocated on the ImageServer. The base image contains permanent code and data such as the operating system, development libraries, and the software stack; it always resides on the ImageServer. It is best to keep the base image pure (all space occupied by valid data) and to make it as small as possible. The swap image acts as the swap partition of the virtual machine and is created by the vjob agent on the computing node. The data image stores user data and resides on the DataServer. The status image stores temporary status files during virtual machine runtime.

The base image in our experiment is about 2 GB. The BitTorrent protocol [13] is used to distribute the base image from the ImageServer to the computing nodes. This solution not only reduces the distribution time but also reduces the workload on the ImageServer.

In addition to using BitTorrent as the distribution tool, we propose an image caching and sharing mechanism to further cut down the image preparation time. Figure 3 illustrates the mechanism. The base image is kept on the physical nodes after the virtual cluster is destroyed. The next time the same image template is selected to instantiate a virtual cluster, the cached base image is used, avoiding a second transmission.

However, image caching alone is not enough. A physical node may host multiple virtual machines of one virtual cluster, which would require copying the cached image for every virtual machine; this is time consuming and wastes disk space. To solve this problem, we combine UnionFS [10] and bind mounts to form a copy-on-write base image. On each physical node, the base image is cached as a single copy and shared by all the local virtual machines.

Figure 3. Image caching based on administrative domains and sharing between virtual clusters. Each virtual cluster has its own data image.
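The following sketch illustrates this copy-on-write layout under stated assumptions: all paths are hypothetical, and the `dirs=...=rw:...=ro` option follows the old kernel-patch UnionFS mount syntax of the paper's era, which differs across UnionFS implementations (unionfs-fuse, for instance, uses its own helper).

```python
import subprocess

# Sketch of the copy-on-write layout described above: the base image is
# cached once per physical node and shared read-only by every local VM;
# each VM writes only to its own UnionFS branch. Paths and mount options
# are illustrative assumptions, not EVC's actual configuration.

BASE_MNT = "/var/evc/base"  # the single cached base image, mounted once

def prepare_vm_root(vm_id: str) -> str:
    rw = f"/var/evc/vm/{vm_id}/rw"      # per-VM writable branch
    base = f"/var/evc/vm/{vm_id}/base"  # per-VM read-only view of the base
    root = f"/var/evc/vm/{vm_id}/root"  # merged root handed to the VM
    subprocess.run(["mkdir", "-p", rw, base, root], check=True)
    # Bind-mount the shared base so no per-VM copy is ever made ...
    subprocess.run(["mount", "--bind", BASE_MNT, base], check=True)
    subprocess.run(["mount", "-o", "remount,ro,bind", base], check=True)
    # ... and union a writable branch on top: reads fall through to the
    # base image, writes land in the VM's private rw directory.
    subprocess.run(["mount", "-t", "unionfs",
                    "-o", f"dirs={rw}=rw:{base}=ro",
                    "unionfs", root], check=True)
    return root
```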

D. Virtual Machine Aggregation and Customization

The virtual machine aggregation process consists of a succession of service configurations, including the virtual network service, NFS, SSH, and user/group accounts. All the information required for aggregation is carried by the vjob, so there is no need to interact with the EVC client. The configuration parameters are written to a disk file that resides in the status image. When a virtual machine boots, a modified OS startup script is executed, the status image is mounted, and the configuration parameters are parsed. A special program that we developed and pre-installed in the base image receives the parameters and coordinates the configuration with the other virtual machines.

Currently EVC supports software customization by allowing users to specify software to be installed after the virtual machine boots; the selectable software is confined to what the operating system release provides. The customization is performed with the system's software manager, such as the 'yum' command on Red Hat systems, while the PacketServer serves as the online software repository.
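A sketch of the boot-time step might look as follows. The file location, JSON format, and parameter keys are assumptions: the paper only says the parameters live in a file inside the status image and are parsed by a pre-installed helper.

```python
import json
import subprocess

# Hypothetical boot-time auto-configuration helper: the modified OS
# startup script has already mounted the status image; this program
# reads the parameters and applies them. File name, format, and keys
# are assumptions, since the paper does not specify the on-disk layout.

PARAMS = "/mnt/status/cluster.conf"  # assumed location in the status image

def configure_vm():
    with open(PARAMS) as f:
        conf = json.load(f)  # e.g. {"hostname": ..., "ip": ..., "headnode": ...}
    subprocess.run(["hostname", conf["hostname"]], check=True)
    subprocess.run(["ip", "addr", "add", conf["ip"], "dev", "eth0"],
                   check=True)
    # The remaining services (NFS, SSH, user/group accounts) would be
    # configured here, coordinated with the cluster's other VMs.
```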

IV. IMPLEMENTATION

EVC builds its virtual clusters dynamically on distributed physical nodes and provides a unified API and command-line interface for upper-level services or applications. Figure 4 illustrates the EVC architecture.

VirtualClusterAllocator is the main interface exposed by EVC. Its main function is to allocate or free virtual clusters for resource users, much like the malloc/free library routines in the C programming language. When allocating a virtual cluster, it can either reuse an idle virtual cluster that is compatible with the user's resource specification or create a new one. The allocation request is transformed into virtual cluster metadata and passed to VirtualClusterFactory. The JobExecutor is a thin layer of encapsulation on top of VirtualClusterAllocator: it receives a grid job request specified in the Globus RSL language and runs the job in the virtual cluster.

VirtualClusterFactory is the core component of EVC. It delegates all the tasks required to create a virtual cluster. When it receives metadata from VirtualClusterAllocator, it parses the metadata and verifies the availability of physical resources, network addresses, and image templates. If all the required resources can be acquired, it invokes the ResourceSelector module to select the most appropriate subset of physical resources for the vjobs. Once all the vjobs are scheduled, it dispatches them to the local resource managers through a protocol such as GRAM. The newly created virtual cluster is added to a common pool, which is maintained by the VirtualClusterPool module.

Figure 4. EVC Architecture
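The malloc/free analogy suggests an interface along the following lines; this is a minimal sketch assuming hypothetical `factory` and cluster objects, not EVC's actual API.

```python
# Sketch of the allocator's malloc/free-style interface: allocate()
# first tries to reuse a compatible idle cluster from the pool and
# only triggers a full deployment on a miss; free() returns a cluster
# to the pool instead of destroying it. Names are illustrative.

class VirtualClusterAllocator:
    def __init__(self, factory):
        self.factory = factory
        self.pool = []                    # idle, not-yet-destroyed clusters

    def allocate(self, spec):
        for vc in self.pool:
            if vc.compatible_with(spec):  # e.g. same image, size, topology
                self.pool.remove(vc)
                return vc                 # reuse, skipping deployment
        return self.factory.create(spec)  # miss: full three-phase deploy

    def free(self, vc):
        self.pool.append(vc)              # keep alive for later reuse
```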

VirtualClusterPool consists of four sub-modules: 1) a container holding all the virtual clusters that exist and have not yet been destroyed in the current system; 2) the VPoolCAManager module, which allocates a certificate to each newly created virtual cluster and refreshes the certificates at runtime; 3) the VirtualClusterMessageCenter module, which is responsible for the message exchange between the vjob agents and the local EVC client, such as virtual cluster status queries and virtual cluster exit notifications; and 4) the VirtualClusterMonitor module, which monitors the virtual cluster and virtual machine status and checks the integrity of virtual clusters.

VirtualNetworkManager maintains and updates all the resources for the virtual network. It assigns public and private network addresses, allocates unique domain names, and updates DNS map entries for newly created virtual clusters. ImageInfoManager stores and updates the metadata of the virtual machine image templates. ImageCacheManager tracks the location information of the cached virtual machine images. The ImageTransferService module distributes the image templates from the ImageServer to the computing nodes.

V. EXPERIMENTAL RESULTS

We conducted experiments creating virtual clusters of different sizes in a grid environment to evaluate the performance of EVC. The physical resources in our experiment consist of 7 physical nodes grouped into three clusters, as Table I shows.

TABLE I. EXPERIMENT ENVIRONMENT

Cluster     Nodes   Slots   VMM
C1 (SGE)    3       12      Xen 3.3.0
C2 (SGE)    3       8       Xen 3.3.0
C3 (Fork)   1       -       VMware Server 2.0

The two SGE clusters consist of computing nodes of the Dell OptiPlex 380 platform; every node has a 2.93 GHz dual-core CPU and 2 GB of memory. NFS is used to access the cluster storage. The Fork cluster node has a 2.66 GHz 8-core CPU and 4 GB of memory. All three clusters are managed by Globus middleware. The network bandwidth in our experiment is 100 Mb/s.

In our experiment the image templates are pre-staged into the cluster storage. We divide the deployment into three phases: 1) the vjob creation phase, in which the virtual cluster metadata is parsed, vjobs are created and scheduled, and network addresses are allocated; 2) the resource co-allocation and image preparation phase, which ends when all the vjobs have been scheduled by SGE; and 3) the virtual cluster creation phase, in which the virtual machines are booted, configured, and aggregated.

Figures 5 and 6 compare the fraction of the total deployment time spent in each of the three phases under the two different vjob scheduling strategies.

Figure 5. Virtual cluster deployment using the best-effort resource selection algorithm (phase time ratios vs. virtual cluster size)

Figure 6. Virtual cluster deployment using the VJM resource selection algorithm (phase time ratios vs. virtual cluster size)

Figure 7 compares the total time taken to deploy virtual clusters under the two vjob scheduling algorithms. The graph shows that the VJM vjob scheduling strategy costs less time than the best-effort strategy, because the VJM strategy saves much more inter-cluster communication time.

Figure 7. Virtual cluster deployment timing results (time in seconds vs. virtual cluster size, best-effort vs. VJM)

VI. CONCLUSION AND FUTURE WORK

In this paper, we propose an architecture that deploys virtual clusters on distributed physical resources. It can be used by upper-level resource consumers or other services as a virtualized resource manager. We use the vjob concept to accommodate many kinds of popular virtual machine monitors/hypervisors. Two simple vjob scheduling strategies were developed for demonstration, but new strategies can easily be introduced. We also investigated the image propagation problem and used the BitTorrent protocol to distribute images. An image caching and sharing mechanism was also proposed to cut down the virtual cluster deployment time.

The experiments show that vjob scheduling has an important impact on performance, so more sophisticated scheduling strategies need to be developed. Virtual machine live migration is another very useful technique; applying it to virtual clusters would allow workloads to be adjusted dynamically.

ACKNOWLEDGMENT

The authors would like to acknowledge support from the China NSF (No. 60703024) and the Program for New Century Excellent Talents in University (NCET-09-0428) of the Ministry of Education of China.

REFERENCES

[1] R. J. Figueiredo, P. A. Dinda, and J. A. B. Fortes, "A case for grid computing on virtual machines," in Proceedings of the 23rd International Conference on Distributed Computing Systems, 2003.
[2] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the art of virtualization," in ACM Symposium on Operating Systems Principles (SOSP), 2003.
[3] VMware, http://www.vmware.com/
[4] VirtualBox, http://www.virtualbox.org/
[5] Xiaohui Wei, Zhaohui Ding, Shaocheng Xing, Yaoguang Yuan, and Wilfred Li, "VJM: a novel grid resource co-allocation model for parallel jobs," in 2nd International Conference on Future Generation Communication and Networking Symposia, 2008.
[6] Matthias Schmidt, Niels Fallenbeck, Matthew Smith, and Bernd Freisleben, "Efficient distribution of virtual machines for cloud computing," in Euromicro Conference on Parallel, Distributed, and Network-Based Processing, 2010, pp. 567-574.
[7] Xuxian Jiang and Dongyan Xu, "VIOLIN: virtual internetworking on overlay infrastructure," Department of Computer Sciences Technical Report CSD TR 03-027, Purdue University, July 2003.
[8] M. Tsugawa and J. A. B. Fortes, "A virtual network (ViNe) architecture for grid computing," in Parallel and Distributed Processing Symposium, 2006.
[9] Xiaohui Wei, Hongliang Li, Liang Hu, Qingnan Guo, and Na Jiang, "LimeVI: extend live-migration-enabled virtual infrastructure across multi-LAN network," in 2010 Fifth International Conference on Frontier of Computer Science and Technology (FCST), Changchun, Jilin Province, Aug. 2010, pp. 22-29.
[10] D. Quigley, J. Sipek, C. P. Wright, and E. Zadok, "UnionFS: user- and community-oriented development of a unification filesystem," in Proceedings of the 2006 Linux Symposium, Ottawa, Canada, July 2006, vol. 2, pp. 349-362.
[11] Thamarai Selvi Somasundaram, Balachandar R. Amarnath, R. Kumar, P. Balakrishnan, K. Rajendar, R. Rajiv, G. Kannan, G. Rajesh Britto, E. Mahendran, and B. Madusudhanan, "CARE Resource Broker: a framework for scheduling and supporting virtual resource management," Future Generation Computer Systems, vol. 26, no. 3, pp. 337-347, March 2010.
[12] I. Foster, T. Freeman, K. Keahey, D. Scheftner, B. Sotomayor, and X. Zhang, "Virtual clusters for grid communities," in CCGRID'06: Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid, pp. 513-520, IEEE Computer Society, 2006.
[13] BitTorrent, http://www.bittorrent.com/, June 2008.
[14] K. Keahey, I. Foster, T. Freeman, X. Zhang, and D. Galron, "Virtual workspaces in the Grid," in 11th International Euro-Par Conference, Lisbon, Portugal, September 2005.
[15] J. S. Chase, D. E. Irwin, L. E. Grit, J. D. Moore, and S. E. Sprenkle, "Dynamic virtual clusters in a grid site manager," in HPDC'03: Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing, June 2003.
[16] A. Sundararaj and P. Dinda, "Towards virtual networks for virtual machine grid computing," in Proceedings of the Third USENIX Virtual Machine Research and Technology Symposium (VM '04), May 2004.
[17] OpenNebula, http://www.opennebula.org
[18] Hongliang Li, Xiaohui Wei, and Huixin Yao, "CLIMP: concurrent live migration protocol for elastic virtual clusters," ICIC Express Letters, vol. 5, no. 9(B), pp. 3429-3436, Sep. 2011.
