ANGEL: Agent-Based Scheduling for Real-Time Tasks in Virtualized Clouds

Xiaomin Zhu, Member, IEEE, Chao Chen, Laurence T. Yang, Senior Member, IEEE, and Yang Xiang, Senior Member, IEEE

Abstract—The success of cloud computing makes an increasing number of real-time applications such as signal processing and weather forecasting run in the cloud. Meanwhile, scheduling for real-time tasks plays an essential role for a cloud provider in maintaining its quality of service and enhancing the system's performance. In this paper, we devise a novel agent-based scheduling mechanism in the cloud computing environment to allocate real-time tasks and dynamically provision resources. In contrast to traditional contract net protocols, we employ a bidirectional announcement-bidding mechanism, and the collaborative process consists of three phases, i.e., the basic matching phase, the forward announcement-bidding phase, and the backward announcement-bidding phase. Moreover, elasticity is sufficiently considered during scheduling by dynamically adding virtual machines to improve schedulability. Furthermore, we design calculation rules for the bidding values in both the forward and backward announcement-bidding phases and two heuristics for selecting contractors. On the basis of the bidirectional announcement-bidding mechanism, we propose an agent-based dynamic scheduling algorithm named ANGEL for real-time, independent and aperiodic tasks in clouds. Extensive experiments are conducted on the CloudSim platform by injecting random synthetic workloads and workloads from the latest version of the Google cloud tracelogs to evaluate the performance of our ANGEL. The experimental results indicate that ANGEL can efficiently solve the real-time task scheduling problem in virtualized clouds.

Index Terms—Agent-based scheduling, real-time, bidirectional announcement-bidding mechanism, virtualized cloud

1 INTRODUCTION

Nowadays, cloud computing has become an efficient paradigm to offer computational capabilities as services on a "pay-per-use" basis [1]. Meanwhile, virtualization technology is commonly employed in clouds, e.g., in Amazon's Elastic Compute Cloud (EC2), to provide flexible and scalable services, which gives users the illusion of infinite resources [2]. Running applications on virtual machines (VMs) has become an effective solution [3]. Leveraging server virtualization technology, a single host can simultaneously run multiple virtual machines. The VMs can be dynamically relocated by live VM operations (e.g., creation, migration and deletion) to achieve fine-grained optimization of computing resources. This technology has brought ample opportunities for scalability, cost-efficiency, reliability, and high resource utilization [4], [5], [6]. It is worthwhile noting that many applications deployed on clouds have a real-time nature, in which the correctness depends not only on the computational results, but also on the time instants at which these results become available [7].



X. Zhu and C. Chen are with the Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, Hunan 410073, P.R. China. E-mail: {xmzhu, chenchao}@nudt.edu.cn.
L.T. Yang is with the Department of Computer Science, St. Francis Xavier University, Antigonish, NS B2G 2W5, Canada. E-mail: [email protected].
Y. Xiang is with the School of Information Technology, Deakin University, 221 Burwood Highway, Burwood, VIC 3125, Australia. E-mail: [email protected].

Manuscript received 4 Nov. 2014; revised 15 Feb. 2015; accepted 21 Feb. 2015. Date of publication 3 Mar. 2015; date of current version 11 Nov. 2015. Recommended for acceptance by J. Cao. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference the Digital Object Identifier below. Digital Object Identifier no. 10.1109/TC.2015.2409864

For some applications, it is even mandatory to provide real-time guarantees. For example, weather forecasting and medical simulations have strict deadlines which, once missed, make the results useless [8]. Therefore, it is critical for these kinds of deadline-constrained applications to obtain guaranteed computing services within their timing constraints.

In order to obtain high performance when running real-time applications in clouds, scheduling plays an essential role: real-time tasks in these applications are mapped to machines such that deadlines and response time (RT) requirements are satisfied. Efficient scheduling algorithms enable the resources to contribute to the whole system and thus can significantly boost the system's service capability. To date, a handful of scheduling strategies for clouds have been investigated (e.g., [7], [9], [10], [11], [12], [13], [14]). Unfortunately, an important scheduling technology, i.e., agent-based scheduling, which shows great advantages in dealing with the task allocation issue in distributed systems, has not been sufficiently considered on the emerging clouds. Agent-based technology is derived from the distributed artificial intelligence (DAI) domain. It has great strength in open, complex, dynamic, and distributed environments due to the inherent nature that agents make decisions based on local interactions, and good agent-based scheduling algorithms allow them to adapt and enable them to coordinate through self-organization [15]. According to Wooldridge and Jennings [16], an agent is "a self-contained program capable of controlling its own decision making and acting, based on its perception of its environment, in pursuit of one or more objectives", which provides a new way to allocate tasks in clouds.



In the scheduling process based on the agent technique, multiple individual agents compose a multi-agent system, in which the agents interact with each other to accomplish the goal of the system. Meanwhile, the interaction technology among agents is of great importance, and the corresponding interaction rules can be designed by users, which makes agent-based technology flexible enough to meet very different scheduling requirements. To facilitate interactions, the ability to cooperate, coordinate, and negotiate with each other is required. Cooperation is the process in which several agents work together and draw on the broad collection of their knowledge and capabilities to achieve a common goal. Coordination is the process of achieving a state in which the actions of agents fit in well with each other. Negotiation is a process by which a group of agents communicate with one another to try to come to a mutually acceptable agreement on some matter [17]. Agent-based scheduling employs these operations to accomplish task allocation on cloud resources.

Our work deviates from traditional scheduling algorithms in the literature by designing and implementing a novel scheduling mechanism based on an intelligent agent approach and then developing a corresponding dynamic scheduling algorithm for real-time tasks executed in clouds.

Contributions. The major contributions of this work are summarized as follows:

- We designed a bidirectional announcement-bidding mechanism based on an improved contract net protocol (CNP).
- We developed an agent-based scheduling algorithm for virtualized clouds, named ANGEL, for independent, real-time tasks.
- We designed two selection strategies, the MAX strategy and the P strategy, to determine the contractors.
- We investigated a dynamic scaling-up method used in ANGEL to further enhance the schedulability.

The rest of this paper is organized as follows. The related work in the literature is summarized in Section 2. Section 3 formally models the dynamic real-time scheduling problem in virtualized clouds. In Section 4, the agent-based scheduling mechanism is presented. This is followed by our proposed ANGEL algorithm and the main principles behind it in Section 5. The experiments and performance analysis are given in Section 6. Section 7 concludes the paper with a summary and future directions.

2 RELATED WORK

Up to now, a great many scheduling strategies have been developed in a wide range of application domains. Scheduling algorithms can be either static (i.e., off-line) or dynamic (i.e., on-line) [18]. In static scheduling algorithms, the assignments of tasks and the times at which the tasks start to execute are determined a priori; such algorithms are usually developed for periodic tasks [7]. In contrast, when the arrival times of aperiodic tasks are not known a priori and the tasks have timing requirements (i.e., they are real-time tasks), they must be scheduled by dynamic scheduling strategies.


Specifically, many scheduling algorithms have been designed for the cloud computing environment. For example, Zhang et al. developed a heterogeneity-aware framework that dynamically adjusts the number of machines to strike a balance between energy savings and service delay [19]. He et al. investigated the reduction of resource consumption through VM consolidation and employed the Genetic Algorithm (GA) to solve the issue [5]. Kong et al. concentrated on the uncertainties of both the availability of virtualized servers and the workloads, and utilized type-I and type-II fuzzy logic systems to predict resource availability and workloads so as to enhance the system's availability and responsiveness [20]. Calheiros and Buyya suggested a resource provisioning and scheduling strategy for real-time workflows on IaaS clouds, in which the particle swarm optimization technique was employed to minimize the overall workflow execution time within timing constraints [21]. Malawski et al. presented several static and dynamic scheduling algorithms to enhance the guarantee ratio of real-time tasks while meeting QoS constraints such as budget and deadline; besides, they took the variation of tasks' execution time (ET) into account to enhance the robustness of their methods [22]. Goiri et al. proposed an energy-efficient and multifaceted scheduling policy for modeling and managing a virtualized cloud, in which the allocation of VMs is based on multiple facets to optimize the provider's profit [23]. Graubner et al. suggested an energy-efficient scheduling algorithm based on performing live migrations of virtual machines to save energy, in which the energy costs of live migrations, including the pre-processing and post-processing phases, were considered [24]. However, the aforementioned algorithms cannot efficiently address the large-scale dynamic scheduling issue. It should be noted that in clouds, both tasks and resources vary dynamically. To be specific, most tasks arrive in an aperiodic mode, and resources change with the variation of the system workload. Thus, the scheduling algorithms that are used to allocate tasks and adjust resources are essential to enhance the system's schedulability and utilization in a dynamic cloud environment.

In agent-based scheduling, each agent can directly represent a physical object such as a machine, a task, or an operator [25]. Thus, agent-based scheduling algorithms have the ability to allocate tasks through negotiation, which brings great advantages for dealing with dynamically arriving tasks in distributed systems (e.g., cloud computing systems). Agent-based scheduling algorithms can be classified into two categories, i.e., threshold-based algorithms and market-based algorithms. In the first category, scheduling algorithms are developed from the threshold model in insect colonies. For example, Price evaluated adaptive nature-inspired task allocation against decentralized multi-agent strategies [26]. Campos et al. investigated dynamic scheduling and the division of labor in social insects [27]. Generally, the complexity of this kind of algorithm is high. The other category of agent-based scheduling algorithms derives from market-based mechanisms, among which the contract net protocol is the most widely used: groups of individuals employ market-like approaches, i.e., auctions, to decide who realizes the goals, with bids based on each individual's desire and ability to finish its goals.


TABLE 1
Definitions of Main Notation

Notation              | Definition
t_i                   | The ith task in the task set T = {t_1, t_2, ...}
a_i, l_i, d_i, p_i    | t_i's arrival time, length/size, deadline, and priority
h_k                   | The kth host in the host set H = {h_1, h_2, ...}
H_a                   | The active host set, H_a ⊆ H
v_jk                  | The jth VM on host h_k
r_jk, c_jk            | v_jk's ready time and creation time
s_ijk, e_ijk, f_ijk   | The start time, execution time, and finish time of t_i on v_jk
x_ijk                 | x_ijk is "1" if t_i is assigned to v_jk; otherwise, x_ijk is "0"
t_i^A                 | The ith task agent in the task agent set T^A = {t_i^A; i = 1, 2, ..., |T|}
v_jk^A                | The jth VM agent in the VM agent set V^A = {v_jk^A; j = 1, 2, ..., |V_k|; k = 1, 2, ..., |H_a|}
m^A                   | The manager agent
fb_ijk, bb_ijk        | The bidding values in forward bidding and backward bidding, respectively
f_pjk                 | The finish time of t_i's preceding task t_p on the same VM v_jk
an_i                  | The ith announcer in the announcer set AN = {an_i; i = 1, 2, ..., n}
bi_ij                 | The jth bidder for an_i in the bidder set BI_i = {bi_ij; j = 1, 2, ..., m}
bv_ij                 | The bidding value of bi_ij for an_i

For example, Owliya et al. investigated a ring-like model as a competitor to the web-like CNP-based job allocation within the concept of holonic manufacturing systems, and a new algorithm was developed for scheduling and the assignment of tasks to resources based on the ring structure [28]. Later, Owliya et al. proposed four agent-based models for task allocation on the manufacturing shop floor, two of which employed the CNP; besides, the prominent position of agent-based scheduling within the broad area of scheduling was discussed [29]. Lange and Lin studied a new approach to modeling well scheduling processes in the oil and gas industry using the notion of a virtual enterprise with intelligent agents and the contract net protocol from multi-agent systems technologies, which efficiently assists in the scheduling of resources across the well life cycle [30].

In this study, we investigate a novel agent-based scheduling method based on an improved CNP model to address the real-time task scheduling issue in virtualized clouds. The proposed strategy adopts a bidirectional announcement-bidding mechanism in which tasks and resources act as both announcers and bidders to improve the system's schedulability.

3 MODELS AND PROBLEM FORMULATION

In this section, we introduce the system model, notation, and terminology used throughout this paper. For future reference, we summarize the main notation used in this study in Table 1.

3.1 System Model
In this paper, the target system is a virtualized cloud that is characterized by an infinite set H = {h_1, h_2, ...} of physical computing hosts providing the hardware infrastructure for creating virtualized resources to satisfy users' requirements. The active host set is modeled by H_a with n elements, H_a ⊆ H. Each host h_k ∈ H_a contains a set V_k = {v_1k, v_2k, ..., v_|V_k|k} of virtual machines, and all the VMs constitute a VM set V = {V_1, V_2, ..., V_|H_a|}. In this study, a VM is the basic computing unit, and the number of VMs varies dynamically with the system workload. Namely, VMs can be dynamically created if the system is under heavy workload so as to meet users' QoS; otherwise, VMs can be deleted if the system workload decreases, so as to improve system utilization. This sufficiently embodies one of the most important characteristics of clouds: elasticity. CPU performance is measured in Million Instructions Per Second (MIPS). For a given VM v_jk, we use r_jk to denote the ready time of v_jk and c_jk to represent the time at which the creation of v_jk finishes.

We consider a set T = {t_1, t_2, ...} of tasks that are independent, non-preemptive, aperiodic and, importantly, have deadlines. A task t_i submitted by a user is modeled by a collection of parameters, i.e., t_i = {a_i, l_i, d_i, p_i}, where a_i, l_i, d_i and p_i are the arrival time, task length/size, deadline, and priority of task t_i, respectively. Let s_ijk be the start time of task t_i on VM v_jk. Similarly, f_ijk represents the finish time of t_i on v_jk. Due to the heterogeneity of the VMs in terms of CPU processing capability, we let e_ijk be the execution time of t_i on v_jk. In addition, x_ijk reflects a mapping of tasks to VMs at different hosts in a virtualized cloud, where x_ijk is "1" if task t_i is allocated to VM v_jk at host h_k and "0" otherwise.

In this paper, we assume that a task cannot be preempted while executing; thus a task can only be allocated to one VM, and that VM cannot be shared by another task whose execution overlaps in time. Consequently, we have the following constraint C_1:

$$
C_1:\quad
\begin{cases}
\sum_{k=1}^{|H_a|}\sum_{j=1}^{|V_k|} x_{ijk} = 1 \ \text{or}\ 0, & i \in [1, |T|],\\
x_{pjk} = 0, & \text{if } (x_{ijk} = 1) \wedge \big([s_{pjk}, f_{pjk}] \cap [s_{ijk}, f_{ijk}] \neq \emptyset\big).
\end{cases}
\tag{1}
$$

The start time s_ijk of task t_i on VM v_jk is defined as its earliest start time, which is related to its arrival time a_i and the ready time r_jk of v_jk:

$$
s_{ijk} = \max\{a_i, r_{jk}\},
\tag{2}
$$

where the ready time r_jk is the time instant at which the VM can be used. Consequently, we can calculate r_jk according to the following formula:

$$
r_{jk} =
\begin{cases}
c_{jk}, & \text{if } \forall x_{pjk} = 0 \mid a_p \le a_i,\\
\max\{f_{pjk}\}, & \text{if } \exists x_{pjk} = 1 \mid a_p \le a_i.
\end{cases}
\tag{3}
$$

The finish time f_ijk of task t_i executed by v_jk can then be easily determined as follows:

$$
f_{ijk} = s_{ijk} + e_{ijk}.
\tag{4}
$$

The finish time is, in turn, used to determine whether the task's deadline can be satisfied. Therefore, we have the following deadline constraint C_2 on a resource allocation:


$$
C_2:\quad
\begin{cases}
x_{ijk} = 0, & \text{if } \forall j \in [1, |V_k|],\ \forall k \in [1, |H_a|]:\ s_{ijk} + e_{ijk} > d_i,\\
x_{ijk} = 1 \ \text{or}\ 0, & \text{if } \exists j \in [1, |V_k|],\ \exists k \in [1, |H_a|]:\ s_{ijk} + e_{ijk} \le d_i.
\end{cases}
\tag{5}
$$

If task t_i is executed by v_jk, then v_jk is unavailable to other tasks in that time slot. This leads to the following constraint:

$$
C_3:\quad (s_{ijk} \ge f_{pjk}) \vee (f_{ijk} \le s_{pjk}), \quad \text{if } \forall i \neq p,\ j \in [1, |V_k|],\ k \in [1, |H_a|],\ x_{ijk} = x_{pjk} = 1.
\tag{6}
$$

From C_3, we can easily see that if task t_i and task t_p are both allocated to VM v_jk, they cannot be executed at the same time; that is, either s_ijk ≥ f_pjk or f_ijk ≤ s_pjk.

3.2 Scheduling Objectives
In this work, we take the task guarantee ratio (TGR) and the priority guarantee ratio (PGR) as the two main scheduling objectives. Regarding the real-time tasks, our scheduling algorithm strives to finish as many tasks as possible before their deadlines. Moreover, if the system cannot finish all tasks due to a heavy workload, our scheduling algorithm tries to finish the tasks with higher priorities. Consequently, the objectives are modeled as follows:

(1) Task guarantee ratio:

$$
\max\left\{ \frac{\sum_{k=1}^{|H_a|}\sum_{j=1}^{|V_k|}\sum_{i=1}^{|T|} x_{ijk}}{|T|} \right\}.
\tag{7}
$$

(2) Priority guarantee ratio:

$$
\max\left\{ \frac{\sum_{k=1}^{|H_a|}\sum_{j=1}^{|V_k|}\sum_{i=1}^{|T|} x_{ijk}\, p_i}{\sum_{i=1}^{|T|} p_i} \right\}.
\tag{8}
$$
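To make the timing model concrete, the following Python sketch (not part of the original paper; the class and function names are illustrative, and a MIPS-proportional execution-time model is assumed) computes start and finish times per Eqs. (2)-(4), checks the deadline condition behind constraint C_2, and evaluates TGR and PGR per Eqs. (7) and (8) for a given allocation.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Task:
    arrival: float      # a_i
    length: float       # l_i, in MI
    deadline: float     # d_i (absolute)
    priority: int       # p_i

@dataclass
class VM:
    mips: float                  # processing capability
    created_at: float = 0.0      # c_jk, time at which VM creation finishes
    finish_times: List[float] = field(default_factory=list)  # f_pjk of tasks already placed

    def ready_time(self) -> float:
        # Eq. (3): creation time if the VM is empty, else the latest finish time
        return self.created_at if not self.finish_times else max(self.finish_times)

def finish_time(task: Task, vm: VM) -> float:
    e = task.length / vm.mips                  # e_ijk under a MIPS-proportional model
    s = max(task.arrival, vm.ready_time())     # Eq. (2)
    return s + e                               # Eq. (4)

def feasible(task: Task, vm: VM) -> bool:
    # Deadline condition used in constraint C_2 / Eq. (5)
    return finish_time(task, vm) <= task.deadline

def guarantee_ratios(tasks: List[Task], placements: List[Optional[VM]]):
    # placements[i] is the VM hosting tasks[i], or None if the task was rejected
    accepted = [t for t, vm in zip(tasks, placements) if vm is not None]
    tgr = len(accepted) / len(tasks)                                          # Eq. (7)
    pgr = sum(t.priority for t in accepted) / sum(t.priority for t in tasks)  # Eq. (8)
    return tgr, pgr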

4 AGENT-BASED SCHEDULING MECHANISM DESIGN

In designing our agent-based scheduling mechanism, we employ a market-like mechanism, the contract net protocol, to perform task scheduling in virtualized clouds. The CNP model allows agents to coordinate and produce desirable system-wide behavior. More importantly, we devise a novel bidirectional announcement-bidding mechanism within the CNP model to improve the scheduling quality.

4.1 Agent Design
In this study, we design three kinds of agents, i.e., task agents, VM agents, and a manager agent. Each of them works based on its own rules, and they cooperate with each other to complete the auction process of the CNP. These agents are modeled as follows:

- T^A = {t_i^A; i = 1, 2, ..., |T|} is the task agent set, where t_i^A represents the ith task agent in T^A.
- V^A = {v_jk^A; j = 1, 2, ..., |V_k|; k = 1, 2, ..., |H_a|} is the VM agent set, where v_jk^A denotes the agent of the jth VM on h_k.
- m^A represents the manager agent.


A task agent is created upon the arrival of its task and disappears when the task finishes. A VM agent is created when the VM is established and dies out when the VM is destroyed. In contrast, the manager agent always exists. The VM agents constantly update their own information and release it to the manager agent.

4.2 Design of the Bidirectional Announcement-Bidding Mechanism
To facilitate understanding of the bidirectional announcement-bidding mechanism, we first give two important definitions used in it, i.e., the forward announcement-bidding and the backward announcement-bidding.

Definition 1 (Forward announcement-bidding). An announcement from the task's perspective, namely, the task information is treated as the announcement information and announced towards VMs, and the VMs bid for tasks.

Definition 2 (Backward announcement-bidding). An announcement from the VM's point of view, i.e., the VM information is treated as the announcement information sent to tasks, and the tasks bid for VMs.

The key point of task scheduling is to select a proper VM for a task to run on. On the arrival of a new task, it chooses the VM best suited for its execution. Note that in the forward announcement-bidding, it is possible that multiple tasks select the same VM as their preference. Thus, the VM in turn selects one of the tasks to complete the task allocation, which takes place in the backward announcement-bidding phase. After the bidirectional announcement-bidding, the newly arrived task is either assigned to a VM or rejected, according to the VMs' service capability. In the remainder of this paper, we discuss both the forward announcement-bidding and the backward announcement-bidding mechanisms in detail.

Fig. 1 illustrates the basic interactions of the three kinds of agents. In the process of interaction, the VM agent set constantly generates or updates VM agents based on the real VM configuration. That is to say, if a new VM is created, a new VM agent is built; if a VM is cancelled, the corresponding VM agent is removed; if a VM's performance, such as CPU or RAM, changes, the relevant VM agent information is updated. Once the VMs' information changes, the VM agent set forwards it to the manager agent, which records it on its information board. Now, we are in a position to discuss how the bidirectional announcement-bidding mechanism works. It basically includes three phases, i.e., the basic matching phase, the forward announcement-bidding phase, and the backward announcement-bidding phase.

4.2.1 Basic Matching Phase
In this phase, all the VMs satisfying the basic requirements of the tasks are selected, which shrinks the VM scope and thus reduces the interactions between task agents and VM agents. The detailed process of basic matching is described as follows:

Fig. 1. Basic interactions of agents.

- A new task agent is generated when the task arrives.
- The task agent sends basic task requirement information, including the task ID, task type, etc., to the manager agent.
- The manager agent receives the task requirement information and then matches each task agent with VM agents from the VM information board to choose those VMs that satisfy the basic requirements posted by the task agents.
- The manager agent sends the selected VMs' information to the corresponding task agents.
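As an illustration of this filtering step, the sketch below is ours and only indicative: the paper mentions only task ID and task type as the basic requirement information, so the attribute names used here (such as task_type and deployed_apps) are assumptions.

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class TaskRequirement:
    task_id: int
    task_type: str            # e.g., the application the task needs

@dataclass
class VMInfo:
    vm_id: int
    deployed_apps: List[str]  # applications installed on the VM

class ManagerAgent:
    def __init__(self):
        self.board: Dict[int, VMInfo] = {}   # VM information board

    def update(self, info: VMInfo):
        # VM agents push their latest information to the board
        self.board[info.vm_id] = info

    def basic_match(self, req: TaskRequirement) -> List[VMInfo]:
        # Keep only VMs that can run this task's type; the forward
        # announcement is later sent only to these candidate VMs.
        return [vm for vm in self.board.values() if req.task_type in vm.deployed_apps]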

4.2.2 Forward Announcement-Bidding Phase
In the forward announcement-bidding phase, the task agents negotiate with VM agents so as to select one or multiple VMs for a task, on the condition that the VMs can guarantee the timing constraint. The forward announcement-bidding phase proceeds as follows:

The task agent set receives the VMs’ information from the manager agent.  Each task agent generates forward announcement information including its arrival time, length, deadline, priority, etc., and then sends it to relevant VM agents.  The VM agents receive the tasks’ announcement information and calculate the corresponding bidding values based on some rules.  The task agents receive the VM agents’ bidding values and make forward awarding contracts for VM Agents. Fig. 2 depicts an example of the first two phases, i.e., the basic matching phase and the forward announcement-

bidding phase. In this example, we assume there are three A A A A A task agents—tA 1 , t2 , and t3 , and four VM agents v11 , v21 , v12 , A and v13 that are available at the beginning. A A Fig. 2a illustrates that task agents tA 1 , t2 , and t3 firstly send their basic requirements to the manager agent mA . Then the manager agent mA matches task agents and VM agents and send feedback information to these task agents. From Fig. 2a, it can be observed that vA 11 can meet the basic A A and t ; v can only meet the basic requirements of tA 1 2 12 A ; v can only meet the basic requirement requirement of tA 1 13 A A of t3 ; v21 cannot meet the basic requirements of any task agents. The reason why only part of VMs can meet the basic requirements of tasks maybe due to lacking the applications deployed on these VMs to run the tasks. Consequently, tA 1 A A can only make forward announcement to vA 11 and v12 , and t2 can only make forward announcement to vA 11 , which greatly reduces the interaction volume. After that, the VM agents bid to task agents. Later, each task agent selects a suitable VM agent and gives a forward contract. Fig. 2a shows that A A A tA 1 and t2 both select v11 . It should be noted that t3 does not make forward announcement to any VM agent because there is no VM agent that can meet the basic requirement of tA 3 . From Figs. 2a and 2b, we can find that the difference between them lies in that in Fig. 2b tA 3 makes forward A announcement to vA but v does not bid to tA 13 13 3 , which can A be explained that vA cannot finish t before its deadline, 13 3 A thus there is no bidding from v13 . It is easy to get from Fig. 2


Fig. 2. An example of the first two phases.

It is easy to see from Fig. 2 that t_3^A cannot be finished in either case. To address this issue, we sufficiently exploit the elasticity of clouds and attempt to maximize the guarantee ratio of real-time tasks by dynamically adding VMs. An example of adding VMs is illustrated in Fig. 3. Once a task cannot be finished before its deadline using the current VMs, the manager agent applies for VMs from the cloud. Following this operation, a new VM is dynamically created to run the task. After that, the VM agent set adds the new VM agent and then sends the updated information to the manager agent. From Fig. 3a, we can find that the manager agent m^A sends the newly created VM agent v_14^A to task agent t_3^A as a matched VM agent. Then, t_3^A makes a forward announcement to v_14^A. After calculating the bidding value, v_14^A bids to t_3^A using this value. Finally, v_14^A is capable of finishing t_3^A, so t_3^A makes a forward contract with v_14^A. It can be observed from Fig. 3b that v_14^A does not bid to t_3^A, although it is a newly created VM agent, because the VM agent still cannot finish the task due to its tight deadline. Thereby, t_3^A has to be rejected.

Fig. 3. An example of adding VMs.

4.2.3 Backward Announcement-Bidding Phase
From the forward announcement-bidding phase, it can be seen that multiple task agents may select the same VM agent. Consequently, the backward announcement-bidding is needed to realize a one-to-one match. In the backward announcement-bidding phase, VM agents only make backward announcements to those task agents that have forward awarding contracts with them. If a VM agent is selected by only one task agent, the task agent and the VM agent confirm a contract directly. The detailed process of the backward announcement-bidding phase is as follows:

- The VM agents send backward announcements to the task agents that made forward contracts with them.
- The task agents receive the VM agents' announcement information and calculate the corresponding bidding values based on certain rules.
- The VM agents receive the task agents' bidding values and make backward awarding contracts.
- If a bidirectional contract is built between a task agent and a VM agent, the task is allocated to the VM.

Fig. 4 shows an example of the backward announcement-bidding, based on the example in Fig. 3a.


Fig. 4. An example of backward announcement.

In Fig. 4, v_11^A makes a backward announcement to t_1^A and t_2^A. Then t_1^A and t_2^A bid to v_11^A. The VM agent v_11^A compares the backward bidding values of t_1^A and t_2^A, and then selects the task agent t_2^A to award a contract. It should be noted that v_14^A directly awards a contract to t_3^A because only one task agent, t_3^A, gave a forward contract to v_14^A; hence, there is no need for announcement and bidding between t_3^A and v_14^A. As a result, based on the bidirectional announcement-bidding mechanism, task t_2 is allocated to virtual machine v_11, and task t_3 is allocated to the newly constructed virtual machine v_14. Task t_1 will be scheduled in the next round of announcement and bidding.

Notably, although the aforementioned bidirectional announcement-bidding mechanism adds a step in which the manager agent processes the VM information, it can markedly reduce the scheduling burden of the system. The manager agent filters all the VMs meeting the basic task requirements, which shrinks the scope of the forward announcement. Thus, the forward bidding values are calculated only for the matched VM agents. The non-matched VM agents do not receive the forward announcement information, and thus the calculation time is significantly decreased.

4.3 Bidding Values
In the aforementioned bidirectional announcement-bidding mechanism, the selections are based on the bidding values. Thus, how to calculate the bidding values becomes an important issue. In the forward announcement-bidding phase, task agents prefer to select the VM agents that have more capability to run them, whereas in the backward announcement-bidding phase, VM agents are inclined to select the task agents with tighter deadlines. In this study, the bidding values, including the forward bidding value and the backward bidding value, are designed as described in the following sections.

4.3.1 Forward Bidding Value
In the forward announcement-bidding, task agents make announcements and those VM agents meeting the basic requirements of the task agents start to bid. The forward bidding values reflect the capabilities of the VM agents, and their calculation is performed by the VM agents. We use fb_ijk to represent the bidding value in forward bidding, and it is calculated as follows:


$$
fb_{ijk} = d_i - e_{ijk} - f_{pjk},
\tag{9}
$$

where f_pjk represents the finish time of task t_i's preceding task t_p on the same VM v_jk. This formula mainly expresses the laxity of task t_i on v_jk: a larger value of fb_ijk means that t_i has more flexible time to run on v_jk before its deadline. fb_ijk > 0 means that VM v_jk has the ability to finish task t_i before its deadline without affecting the execution of the other tasks allocated to v_jk. In contrast, fb_ijk < 0 indicates that VM v_jk cannot finish t_i before its deadline. Fig. 5 depicts an example of calculating fb_ijk.

Fig. 5. An example of calculating fb_ijk.
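As a concrete reading of Eq. (9), the small sketch below (ours, not from the paper; the numeric values are invented in the spirit of the Fig. 5 example) lets a VM agent compute its forward bid for an announced task, given the finish time of the task's would-be predecessor on that VM.

def forward_bid(deadline: float, exec_time: float, pred_finish: float) -> float:
    """Eq. (9): fb_ijk = d_i - e_ijk - f_pjk, i.e., the task's laxity on this VM.
    A non-negative value means the VM can finish the task before its deadline
    without disturbing the tasks already queued on it."""
    return deadline - exec_time - pred_finish

# A task with deadline 100 and execution time 30 on a VM whose last queued task
# finishes at 50 has laxity 100 - 30 - 50 = 20 >= 0, so the VM bids;
# a negative result would mean the VM stays silent.
fb = forward_bid(deadline=100.0, exec_time=30.0, pred_finish=50.0)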

4.3.2 Backward Bidding Value
In the backward announcement-bidding, VM agents make announcements and task agents bid to VM agents. bb_ijk is used to represent the bidding value of task t_i for v_jk, and it is calculated according to the formula below:

$$
bb_{ijk} = \frac{p_i^{\theta}\, e_{ijk}}{(d_i - f_{ijk}) \cdot \sum_{k=k_1}^{K}\sum_{j=j_1}^{J} fb_{ijk}},
\tag{10}
$$

where the parameter θ represents the weight of the priority, J denotes the count of VM agents that have bid to task agent t_i^A, and K denotes the count of hosts on which there exist VM agents that have bid to task agent t_i^A in the forward announcement-bidding phase. This formula is based on the following considerations. Firstly, the higher the priority, the stronger the need to allocate the task. Secondly, the closer a task's finish time approaches its deadline, the higher the likelihood that the task cannot be allocated successfully, and thus it should be allocated preferentially. Thirdly, the smaller the number of VMs able to execute a task, the lower the feasibility of finishing the task; hence, the task should be allocated with higher preference.
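Assuming the reconstruction of Eq. (10) above, a task agent's backward bid could be computed as in the sketch below (illustrative code with our own variable names). The bid grows with the task's priority and execution time and shrinks with its remaining laxity and with the aggregate forward bids it received, matching the three considerations just listed.

def backward_bid(priority: float, theta: float, exec_time: float,
                 deadline: float, finish_time: float,
                 forward_bids: list) -> float:
    """Backward bidding value per Eq. (10) as reconstructed above:
    bb = (p_i**theta * e_ijk) / ((d_i - f_ijk) * sum of forward bids received).
    Higher priority, tighter laxity, and fewer/weaker candidate VMs all raise the bid."""
    laxity = deadline - finish_time
    total_fb = sum(forward_bids)
    return (priority ** theta * exec_time) / (laxity * total_fb)

# A high-priority task (p = 8) with little laxity and a single candidate VM
# outbids a low-priority task (p = 2) with ample laxity and many candidates.
urgent = backward_bid(8, 1.0, 30.0, deadline=100.0, finish_time=95.0, forward_bids=[5.0])
relaxed = backward_bid(2, 1.0, 30.0, deadline=200.0, finish_time=120.0, forward_bids=[80.0, 60.0, 40.0])
assert urgent > relaxed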

4.4 Selection Strategies
In both the forward and backward announcement-bidding phases, an announcer is responsible for selecting a bidder to award a contract if the bidder has the ability to execute the task. It is noted that different strategies may lead to distinct performance. In this study, we design two kinds of selection strategies, i.e., the MAX strategy and the P strategy. We use AN = {an_i; i = 1, 2, ..., n} to denote an announcer set and BI_i = {bi_ij; j = 1, 2, ..., m_i} to represent the bidder set for announcer an_i. BV = {bv_ij; i = 1, 2, ..., n; j = 1, 2, ..., m_i} is the bidding value set, in which bv_ij denotes the bidding value of bidder bi_ij for announcer an_i.


4.4.1 MAX Strategy
When more than one bidder simultaneously bids to one announcer, the announcer chooses the bidder with the maximal bidding value. For every an_i ∈ AN, if bi_ik is selected, it must satisfy the constraint: ∀ bv_ij ∈ BV ⇒ bv_ik ≥ bv_ij, j ≠ k.

4.4.2 P Strategy
The P strategy is a probabilistic selection strategy. When more than one bidder bids to the same announcer in the same round, the announcer selects a bidder according to a probability policy. For every an_i ∈ AN, the winning probability pr_j of bidder bi_ij is calculated as

$$
pr_j = bv_{ij} \Big/ \sum_{k=1}^{m_i} bv_{ik}.
\tag{11}
$$

Without loss of generality, we let pr_0 = 0. In addition, we let pr be a random number with pr ∈ (0, 1). If the randomly generated number satisfies the following formula (12), then the bidder bi_ij is selected as the contractor:

$$
\sum_{m=1}^{j-1} pr_m < pr \le \sum_{n=1}^{j} pr_n.
\tag{12}
$$

As there are two selection points in the bidirectional announcement-bidding mechanism, four kinds of scheduling algorithms based on the above selection strategies can be constructed, i.e., MAX-MAX, MAX-P, P-MAX and P-P.
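The two strategies can be sketched as follows (illustrative code only; random.random() plays the role of the random number pr in Eq. (12)).

import random
from typing import List

def select_max(bids: List[float]) -> int:
    """MAX strategy: return the index of the bidder with the maximal bidding value."""
    return max(range(len(bids)), key=lambda j: bids[j])

def select_p(bids: List[float]) -> int:
    """P strategy: pick bidder j with probability pr_j = bv_j / sum(bv) (Eq. (11)),
    i.e., the first j whose cumulative probability reaches a random draw (Eq. (12))."""
    total = sum(bids)
    pr = random.random()
    cumulative = 0.0
    for j, bv in enumerate(bids):
        cumulative += bv / total
        if pr <= cumulative:
            return j
    return len(bids) - 1   # guard against floating-point round-off

# Example: three bidders; MAX always picks index 2, while P picks it with probability 1/2.
bids = [1.0, 2.0, 3.0]
winner_max = select_max(bids)   # 2
winner_p = select_p(bids)       # 0, 1, or 2 with probabilities 1/6, 1/3, 1/2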

5 AGENT-BASED SCHEDULING ALGORITHM ANGEL

In this section, we present a novel agent-based scheduling algorithm in virtualized clouds, ANGEL, on the basis of our agent-based scheduling model for independent, aperiodic, real-time tasks. Specifically, ANGEL integrates the aforementioned bidirectional announcement-bidding mechanism and the MAX and P strategies. Moreover, ANGEL efficiently considers schedulability, priority, and load balancing. ANGEL is described in detail as follows.


Algorithm 1. The Algorithm for Manager Agent
1   T_waiting ← the tasks that arrive at the same time instant;
2   foreach t_i^A in T_waiting do
3       V_i^A ← the VM agents that satisfy the basic requirements of t_i^A;
4       Send V_i^A to task agent t_i^A;
5   while T_waiting ≠ ∅ do
6       The task agents start the forward announcement-bidding phase using Algorithm 2;
7       The VM agents start the backward announcement-bidding phase using Algorithm 3;

The pseudocode of the algorithm for the manager agent is presented in Algorithm 1. The manager agent schedules the tasks that arrive at the same time instant as one batch. In Lines 2-4 of Algorithm 1, the manager agent chooses those VMs that satisfy the basic requirements of the task agents and sends the selected VMs' information to the corresponding task agents. The announcement-bidding phase in Lines 5-7 repeats until all the tasks are allocated or rejected.

Algorithm 2. The Algorithm for Task Agent
1   valueList ← ∅;
2   foreach v_jk^A in V_i do
3       Task agent t_i^A sends the announcement information to v_jk^A;
4       fb_ijk ← VM agent v_jk^A calculates the forward bidding value;
5       if fb_ijk ≥ 0 then
6           valueList.add(fb_ijk);
7   if valueList ≠ ∅ then
8       Task agent t_i selects a bidder v_select based on the values in valueList using the MAX/P strategy;
9   else
10      v_new ← scaleUpResources();
11      if v_new ≠ NULL then
12          The new VM v_new generates a new VM agent v_new^A and sends the information to the manager agent;
13          Allocate t_i to v_new;
14          T_waiting.remove(t_i^A);
15      else
16          Reject t_i;
17          T_waiting.remove(t_i^A);

Algorithm 2 depicts the pseudocode of the algorithm for task agents. In Line 3, each task agent sends forward announcement information to the relevant VM agents. The VM agents return the corresponding bidding values in Line 4. According to Eq. (9), fb_ijk ≥ 0 means that the VM v_jk has the ability to finish the task before its deadline, while fb_ijk < 0 indicates that the VM cannot finish the task. Consequently, the algorithm only adds values of fb_ijk that are larger than or equal to zero into valueList. After the VMs' bidding, the task agent t_i selects a bidder v_select based on the values in valueList using the MAX strategy or the P strategy. If valueList = ∅, none of the existing VMs is able to finish the task, so the function scaleUpResources() is invoked to create a new VM to accommodate the task. If even the new VM fails to finish the task before its deadline, the task is rejected.

Algorithm 3. The Algorithm for VM Agent
1   T_candidate ← the task agents that sent a forward contract to the VM agent v_jk^A;
2   if T_candidate ≠ ∅ then
3       valueList ← ∅;
4       foreach t_i^A in T_candidate do
5           bb_ijk ← t_i^A's backward bidding value;
6           valueList.add(bb_ijk);
7       VM agent v_jk^A selects a bidder t_select^A based on the values in valueList using the MAX/P strategy;
8       Build a bidirectional contract between v_jk^A and t_select^A;
9       T_waiting.remove(t_select^A);

The pseudocode of the algorithm for VM agents is shown in Algorithm 3. In Lines 4-6, the VM agents send backward announcements to the task agents that sent them forward contracts, and the task agents return the corresponding bidding values. Then, each VM agent chooses a bidder using the MAX strategy or the P strategy and makes a backward awarding contract. By now, a bidirectional contract is built in Line 8, and the task is allocated to the VM.

If a task cannot fit in any existing VM, a new VM will be created. Function scaleUpResources() in Algorithm 4 depicts the procedure of adding a new VM. The algorithm first tries to allocate the new VM to an active host. If there is no suitable host, VM migration is used to make room for the new VM. If this still fails, a turned-off host is turned on and added into H_a.

Algorithm 4. Function scaleUpResources()
1   Create newVM with processing power P_new;
2   find ← FALSE; v_new ← NULL;
3   foreach h_k in H_a do
4       if h_k can accommodate newVM then
5           Allocate newVM to h_k;
6           v_new ← newVM;
7           find ← TRUE;
8           break;
9   if find == FALSE then
10      sourceHost ← Migrate VMs among the hosts to make room for newVM;
11      if sourceHost ≠ NULL then
12          Allocate newVM to sourceHost;
13          v_new ← newVM; find ← TRUE;
14  if find == FALSE then
15      Turn on a host h_new in H − H_a;
16      if the capability of h_new satisfies P_new then
17          Allocate newVM to h_new; v_new ← newVM;

We now evaluate the time complexity of ANGEL. Let N_vm denote the maximum number of items in V_i^A among all the task agents. For Lines 2-6 of Algorithm 2, the time complexity is O(N_vm). The time complexity of Line 8 is O(N_bidder), where N_bidder is the number of items in valueList. The time complexity of scaleUpResources() is O(N_host). As a result, the time complexity of the algorithm for task agents is O(N_vm + max{N_bidder, N_host}). Because N_vm is larger than N_bidder and N_host, the time complexity of Algorithm 2 is O(N_vm). For Algorithm 3, let N_task denote the number of items in T_candidate. The time complexity of Lines 2-6 is O(N_task), and the time complexity of Line 7 is O(N_bidder), where N_bidder is the number of items in valueList. Therefore, the time complexity of the algorithm for VM agents is O(N_task + N_bidder); as N_task is larger than N_bidder, the time complexity of Algorithm 3 is O(N_task). Finally, suppose the announcement-bidding phase repeats for M rounds until all the tasks are scheduled. The time complexity of the announcement-bidding phase (Lines 5-7) in Algorithm 1 is then O(M(N_vm + N_task)).
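To show how Algorithms 1-3 fit together, the simplified Python sketch below (ours, not the paper's implementation) runs announcement-bidding rounds for one batch of waiting tasks. It keeps a single laxity-based forward bid per VM, uses the MAX strategy in both directions, applies a simplified backward bid (priority over laxity, omitting the terms of Eq. (10)), and replaces scaleUpResources() with a minimal "create a VM" stub; the 1,000 MIPS capacity and 30 s creation delay are assumptions borrowed from the simulation setup.

import itertools
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Task:
    tid: int
    arrival: float
    length: float
    deadline: float
    priority: int

@dataclass
class VM:
    vid: int
    mips: float
    ready: float = 0.0   # ready time r_jk

    def forward_bid(self, t: Task) -> float:
        # fb = d_i - e_ijk - f_pjk (laxity of the task on this VM)
        return t.deadline - t.length / self.mips - self.ready

def schedule_batch(tasks: List[Task], vms: List[VM], theta: float = 1.0) -> Dict[int, Optional[int]]:
    """One simplified ANGEL batch: forward bids, backward bids, contracts.
    Returns task id -> VM id, or None for rejected tasks.
    Assumes the initial VMs carry ids 0..len(vms)-1."""
    placement: Dict[int, Optional[int]] = {}
    waiting = list(tasks)
    vm_ids = itertools.count(len(vms))
    while waiting:
        # Forward phase: each task collects bids and picks its preferred VM (MAX).
        preference: Dict[int, List[Task]] = {}
        still_waiting: List[Task] = []
        for t in waiting:
            bids = {vm.vid: vm.forward_bid(t) for vm in vms if vm.forward_bid(t) >= 0}
            if not bids:
                # Elasticity: create a new VM (stub for scaleUpResources()).
                new_vm = VM(vid=next(vm_ids), mips=1000.0, ready=30.0)  # 30 s creation delay
                if new_vm.forward_bid(t) >= 0:
                    vms.append(new_vm)
                    bids = {new_vm.vid: new_vm.forward_bid(t)}
                else:
                    placement[t.tid] = None   # even a fresh VM misses the deadline
                    continue
            best_vm = max(bids, key=bids.get)
            preference.setdefault(best_vm, []).append(t)
            still_waiting.append(t)
        # Backward phase: each contested VM picks one task (MAX on backward bids).
        for vid, candidates in preference.items():
            vm = next(v for v in vms if v.vid == vid)
            def backward_bid(t: Task) -> float:
                finish = max(t.arrival, vm.ready) + t.length / vm.mips
                return t.priority ** theta / max(t.deadline - finish, 1e-9)
            winner = max(candidates, key=backward_bid)
            vm.ready = max(winner.arrival, vm.ready) + winner.length / vm.mips
            placement[winner.tid] = vid
            still_waiting.remove(winner)
        waiting = still_waiting   # losers re-enter the next round
    return placement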

6 PERFORMANCE EVALUATION

As mentioned in Section 4, both the MAX strategy and the P strategy can be employed in either the forward announcement-bidding phase or the backward announcement-bidding phase. Thus, four kinds of scheduling algorithms can be generated by employing different selection strategies. We use ANGEL-M-M, ANGEL-M-P, ANGEL-P-M, and ANGEL-P-P to denote the four scheduling algorithms based on the MAX-MAX, MAX-P, P-MAX, and P-P strategies, respectively. Besides, we compare them with an improved classical scheduling algorithm, Improved Greedy (I-Greedy for short), shown in Algorithm 5.

Algorithm 5. The I-Greedy Algorithm
1   T ← newly arrived tasks and tasks in the waiting queue;
2   find ← FALSE;
3   foreach t_i in T do
4       foreach v_jk in V do
5           Calculate the bidding value fb_ijk;
6           if fb_ijk > 0 then
7               Allocate t_i to v_jk;
8               find ← TRUE;
9       if find == FALSE then
10          Create a new VM v_pq and calculate the bidding value fb_ipq;
11          if fb_ipq > 0 then
12              Allocate t_i to v_pq;
13          else
14              Reject t_i;

The performance metrics by which we evaluate the system performance include:
1) Task guarantee ratio, defined as TGR = (total count of tasks guaranteed to meet their deadlines) / (total count of tasks);
2) Priority guarantee ratio, defined as PGR = (sum of priorities of tasks that are finished before their deadlines) / (sum of priorities of all tasks).

6.1 Simulation Setup
In order to ensure the repeatability of the experiments, we choose simulation to evaluate the performance of the aforementioned algorithms. In our simulations, the CloudSim toolkit [31] is chosen as the simulation platform. Based on the characteristics of cloud computing, we add some new settings to conduct our experiments. The detailed settings and parameters are given as follows:
1) Each host is modeled to have one CPU core, and the CPU performance can be 1,000, 1,500, or 2,000 MIPS;
2) Each VM requires one CPU core with 250, 500, 750, or 1,000 MIPS, 128 MB of RAM, and 1 GB of storage;
3) The start-up time of a host is 30 s and the creation time of a VM is 30 s;
4) Tasks arrive dynamically in batch mode. In our experiments, the task count batchTaskCount in each batch is a uniformly distributed random variable between batchSizeLower and batchSizeLower + 200. The initial value of batchSizeLower is 300;
5) The simulation interval is set by the total task count, varying from 1,000 to 10,000, and each group of tasks is divided into batches according to the value of batchTaskCount mentioned above;
6) The interval time between two batches follows a uniform distribution over (intervalTimeLower, intervalTimeLower + 90). The initial value of intervalTimeLower is 50;
7) The priority of each task is a uniformly distributed random variable between 1 and 10;


8) The computing length of each task is a uniformly distributed random value between 5 × 10^4 MI and 15 × 10^4 MI;
9) We use the parameter baseDeadline to control a task's deadline, which is calculated as

$$
d_i = a_i + \text{baseDeadline},
\tag{13}
$$

where a_i denotes the arrival time of the task, and baseDeadline follows a uniform distribution U(baseTime, α·baseTime) with α = 8;
10) The value of θ is set to 1 when calculating the bidding value in the backward announcement;
11) As VMs in a cloud computing environment can be constructed dynamically, we establish 50 VMs at initialization. When the existing VMs cannot satisfy the requirements of tasks, new VMs are created. On the contrary, when a VM has been idle for a given time, it is turned off by the system.

The values of the parameters are listed in Table 2.

TABLE 2
Parameters for Simulation Studies

Parameter               | Value (Fixed) - (Varied)
Task count              | (5,000) - (1,000-9,000)
batchSizeLower          | (500) - (100, 300, 500, 700, 900, 1,100)
intervalTimeLower (s)   | (50) - (10, 30, 50, 70, 90, 110, 130)
θ                       | (1) - (1, 2, 3, 4, 5, 6)
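A synthetic workload matching these settings could be generated roughly as follows (an illustrative sketch, not the generator actually used with CloudSim; baseTime is an assumed value, since the paper does not list it).

import random
from dataclasses import dataclass
from typing import List

@dataclass
class SyntheticTask:
    arrival: float
    length: float     # in MI
    deadline: float
    priority: int

def generate_workload(total_tasks: int = 5000, batch_size_lower: int = 300,
                      interval_time_lower: float = 50.0, base_time: float = 100.0,
                      alpha: float = 8.0) -> List[SyntheticTask]:
    """Batches of tasks with uniform lengths, priorities, inter-batch gaps, and
    deadlines per Eq. (13); base_time = 100.0 is our assumption."""
    tasks, now, produced = [], 0.0, 0
    while produced < total_tasks:
        batch = min(random.randint(batch_size_lower, batch_size_lower + 200),
                    total_tasks - produced)                               # setting 4)
        for _ in range(batch):
            base_deadline = random.uniform(base_time, alpha * base_time)  # setting 9)
            tasks.append(SyntheticTask(
                arrival=now,
                length=random.uniform(5e4, 15e4),                         # setting 8)
                deadline=now + base_deadline,                             # Eq. (13)
                priority=random.randint(1, 10)))                          # setting 7)
        produced += batch
        now += random.uniform(interval_time_lower, interval_time_lower + 90)  # setting 6)
    return tasks

workload = generate_workload()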

6.2 Performance Impact of Random Sequence
The scheduler schedules the tasks that arrive at the same time instant as one batch. It is essential to determine which task in the batch should be scheduled first, and this choice is expected to have a considerable impact on the performance of the algorithms. In this group of experiments, the scheduling sequence of tasks within one batch is determined randomly in ANGEL-M-M, ANGEL-M-P, ANGEL-P-M, ANGEL-P-P and I-Greedy. Fig. 6 shows the performance of the five algorithms in terms of TGR and PGR.

Fig. 6. Performance impact of random sequence.

It can be observed from Fig. 6a that the TGRs of ANGEL-M-M, ANGEL-M-P, ANGEL-P-M and ANGEL-P-P are much higher than that of I-Greedy because the ANGELs employ the bidirectional announcement-bidding mechanism, which considers the conditions of both tasks and VMs. As a result, they are able to reach better global solutions. In contrast, I-Greedy does not employ the bidirectional announcement-bidding mechanism, so tasks are allocated to a proper VM directly without considering the condition of the VMs. In this way, the performance of the whole system cannot be guaranteed, and the scheduling solutions may not be optimized. In addition, an interesting phenomenon can be observed from Fig. 6a: the TGRs remain almost stable when the task count in one batch varies. This can be attributed to the fact that there are infinite resources in the cloud. When the task count increases, the system can satisfy the new requests by creating new VMs to finish tasks before their deadlines. Despite the infinite resources in the cloud, some tasks still cannot be accepted because it takes additional time to create a new VM and turn on a new host, which prevents tasks from starting on time, so they miss their deadlines. The scalable resource pool is a main advantage of clouds, which makes them a suitable platform for processing real-time tasks. Besides, we can find that ANGEL-P-M performs better than the others, which demonstrates that using the P strategy in the forward bidding phase and the MAX strategy in the backward bidding phase provides the best system performance.

From Fig. 6b, we can observe that ANGEL-M-M, ANGEL-M-P, ANGEL-P-M and ANGEL-P-P exhibit better performance than I-Greedy in terms of PGR because I-Greedy does not take task priority into consideration during scheduling. However, when ANGEL-M-M, ANGEL-M-P, ANGEL-P-M and ANGEL-P-P calculate the bidding value in the backward announcement-bidding phase, the priority is taken into account. Therefore, a task with higher priority is more likely to be allocated successfully even under the random sequence, which makes the ANGELs achieve higher PGRs.

It can be concluded from Fig. 6 that ANGEL-M-M, ANGEL-M-P, ANGEL-P-M and ANGEL-P-P are superior to I-Greedy with regard to both PGR and TGR. Our proposed algorithms can efficiently solve the scheduling problem for real-time tasks and fully utilize the main advantage of clouds, namely, that the scale of VMs can vary dynamically according to the current state of the system.

6.3 Performance Impact of Priority-Based Sequence
In this group of experiments, the newly arrived tasks in one batch are sorted by their priorities in descending order, and then they are scheduled by the ANGELs and I-Greedy. Fig. 7 shows the performance of ANGEL-M-M, ANGEL-M-P, ANGEL-P-M, ANGEL-P-P and I-Greedy. We can observe that, under the priority-based sequence, ANGEL-M-M, ANGEL-M-P, ANGEL-P-M and ANGEL-P-P still perform better than I-Greedy in terms of both PGR and TGR.

Fig. 7. Performance impact of priority-based sequence.

By comparing Figs. 6a and 7a, we can see that there is little difference in TGR under the two scheduling sequences. This result indicates that the scheduling sequence has almost no impact on TGR because the VMs in a cloud can be dynamically created, and thus task parameters, e.g., priority and deadline, determine whether a task can be executed successfully. Nonetheless, the experimental results in Figs. 6b and 7b suggest that the PGR of I-Greedy under the priority-based sequence is higher than that under the random sequence, while the PGRs of the ANGELs show little difference under the two sequences. The reason is that ANGEL-M-M, ANGEL-M-P, ANGEL-P-M and ANGEL-P-P adopt the bidirectional announcement-bidding mechanism, and the priority is considered in the calculation of the bidding values. In this way, the tasks with higher priorities have a higher possibility of being scheduled successfully. However, I-Greedy pays no attention to the priority of tasks, and a VM accepts any task that it can finish. When the tasks are sorted by their priorities, the tasks with higher priorities are scheduled earlier and are more likely to be executed successfully, which makes I-Greedy achieve a higher PGR under the priority-based sequence.

We can conclude from Figs. 6 and 7 that when scheduling real-time tasks, ANGEL has the ability to achieve both higher TGR and higher PGR than I-Greedy. In addition, the scheduling sequence has an obvious impact on I-Greedy while showing little influence on ANGEL. The experimental results reveal that ANGEL achieves higher adaptability and stability by using the bidirectional announcement-bidding mechanism.

6.4 Performance Impact of Arrival Rate
A group of experiments is conducted in this section to observe the impact of the arrival rate on the performance. The parameter intervalTimeLower, which varies from 10 to 130 with an increment of 20, determines the interval time between two consecutive batches. The interval time in turn decides the task arrival rate. The experimental results are depicted in Fig. 8.

Fig. 8. Performance impact of intervalTimeLower.

Fig. 8a demonstrates that when we increase the parameter intervalTimeLower from 10 to 130, the TGRs of all the algorithms increase; the explanation is that the system workload gradually becomes light, so the possibility of accepting more tasks increases. An interesting observation is that while intervalTimeLower increases from 10 to 30, the TGR of ANGEL increases greatly because more small-size tasks can be accepted. However, when we increase intervalTimeLower from 30 to 70, the increase in TGR is relatively slight. We attribute this to the fact that all small-size tasks have been accepted, but some big-size tasks still cannot be accepted because the system workload is still heavy. Further, when intervalTimeLower increases from 70 to 130, the TGR of ANGEL again increases noticeably. We attribute this result to the fact that the system workload becomes light, so more big-size tasks can also be accommodated. It is worth noting that the TGR of ANGEL remains higher than that of I-Greedy as intervalTimeLower varies. The reason is that ANGEL employs the bidirectional announcement-bidding mechanism and is able to adaptively add VMs, striving to accept more tasks.

We can observe from Fig. 8b that ANGEL-M-M, ANGEL-M-P, ANGEL-P-M and ANGEL-P-P exhibit better performance than I-Greedy with regard to PGR. The reason is similar to that for Fig. 8a. Another observation is that some PGRs decrease when the parameter intervalTimeLower varies from 30 to 70 because, although some big-size tasks can be accommodated, their priorities may not be very high, whereas some tasks with higher priorities cannot be accepted because they cannot be finished before their deadlines. Consequently, some PGRs degrade. Besides, we can also see that the PGR of ANGEL (see Fig. 8b) is a little higher than the TGR of ANGEL (see Fig. 8a); the explanation is that ANGEL sufficiently considers tasks' priorities while scheduling.

6.5 Performance Impact of θ
In our experiments, the value of θ plays an important role in determining the weight of priority. The goal of this group of experiments is to study the impact of θ on the performance of ANGEL. Fig. 9 illustrates the performance of ANGEL when varying the parameter from 1 to 6 with an increment of 1.

Fig. 9. Performance impact of θ.

It can be observed from Figs. 9a and 9b that our ANGEL achieves a higher PGR than TGR, which can be explained by the fact that ANGEL sufficiently considers task priorities in its selection policy. This inclination to take task priority into account is derived from the bidding value in the backward announcement-bidding phase. Fig. 9a shows that with the increase of θ, the task guarantee ratio only varies within a very small range and remains almost stable. However, the PGRs rise when the value of θ increases from 1 to 6 (see Fig. 9b). The reason is that ANGEL sufficiently considers the task priority while scheduling, i.e., it prefers to allocate tasks with higher priorities. Consequently, the increase in PGR is more distinct when the value of θ increases. The experimental results indicate that ANGEL has excellent performance in allocating tasks with priorities.

6.6 Performance Impact of Task Count in One Batch
Now we investigate the impact of the task count in one batch on the performance of ANGEL. The parameter batchSizeLower, varying from 100 to 1,100 with an increment of 200, is used to determine the number of tasks in a batch. The experimental results are shown in Fig. 10.

Fig. 10. Performance impact of task count in one batch.

Fig. 10a demonstrates that when we increase batchSizeLower from 100 to 500, the TGR of ANGEL noticeably increases. This is due to the fact that ANGEL is capable of adding VMs to accommodate more tasks when the system workload becomes heavy with the increase of tasks in a batch; thus, the TGR improves correspondingly. Nevertheless, an interesting observation from Fig. 10a is that when we continue increasing batchSizeLower from 500 to 1,100, the TGR significantly degrades. When the batch size is larger, more tasks surge into the system at the same time, making the tested system heavily loaded; the currently active hosts are not enough to accept these tasks, so more hosts must be started and VMs created on them to accommodate the tasks. However, due to the deadline constraints of the tasks and the time required to create VMs, more tasks cannot be finished before their deadlines as the batch size keeps increasing. Consequently, the TGR inevitably degrades. This result indicates that ANGEL can offer better system performance when a suitable batch size is selected. Fig. 10b shows that the PGR follows a trend similar to the TGR in Fig. 10a. However, it is obvious that the PGR is higher than the TGR, for reasons similar to those in the other groups of experiments.


Fig. 11. The count of tasks submitted to the system.

6.7 Performance on a Real-World Trace The aforementioned groups of experiments demonstrate the performance of the different algorithms in various random synthetic workloads. To further evaluate the practicality and efficiency of our proposed algorithms in practical use, in this section, we carry out experiments on a real-world trace deriving from the latest version of the Google cloud tracelogs [32]. The Google cloud tracelogs are made up of task events as well as resource demand and usage records for over 25 million tasks grouped in about 672 thousand jobs over a time span of 29 days. It is relatively difficult to conduct an experiment based on all the tasks due to the enormous count of tasks in the tracelogs. As a result, the day 18, a representative day among the 29 days according to the analysis in [33], was selected as a testing sample. There were 955,626 tasks submitted to the cloud system in day 18. To observe the change of the task count over the time, we depict the count of tasks submitted in every 100 seconds in Fig. 11, where we can easily observe that the task count fluctuates greatly over the time. In some time stamps, a great deal of tasks surge into the system, the resource demand is at a peak, whereas the resource demand decreases sharply at the times tamps where a few of tasks were submitted into the system, which brings a grant challenge for the system to schedule these tasks sufficiently. Based on the task information in the day 18, the mean value of the execution time of the tasks is 1,076 seconds, and the majority of the tasks execute in less than 1,000 seconds. Additionally, we can observe from Fig. 12 that the execution time follows a Lognormal distribution where most of the tasks are with short execution time. The distribution of

Fig. 12. The distribution of execution time (ET in short) and response time (RT in short).
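As a rough illustration of how the trace statistics behind Figs. 11 and 12 can be derived, the sketch below buckets day-18 task submissions into 100-second windows and computes the mean execution time. The simplified record format is an assumption; the actual Google tracelog schema contains many more fields.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified day-18 task record: submission time and execution time in seconds.
record TaskEvent(double submitTime, double execTime) {}

class TraceStats {
    // Count submissions per 100-second window (the data behind a plot like Fig. 11).
    static Map<Long, Integer> submissionsPerWindow(List<TaskEvent> events, double window) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (TaskEvent e : events) {
            long bucket = (long) (e.submitTime() / window);
            counts.merge(bucket, 1, Integer::sum);
        }
        return counts;
    }

    // Mean execution time (reported as roughly 1,076 s for day 18).
    static double meanExecTime(List<TaskEvent> events) {
        return events.stream().mapToDouble(TaskEvent::execTime).average().orElse(0.0);
    }
}
```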


TABLE 3
Performance on Google Cloud Workloads

Algorithm     TGR       PGR
ANGEL-M-M     92.01%    92.25%
ANGEL-M-P     84.04%    84.41%
ANGEL-P-M     55.02%    55.56%
ANGEL-P-P     51.46%    51.87%
I-GREEDY      19.57%    19.62%

The distribution of response time, i.e., the time span from a task's submission to its completion, is also shown in Fig. 12 and likewise basically follows a lognormal distribution. On average, the ratio between a task's response time and its execution time is 2.75. The experimental results for the Google cloud tracelogs are listed in Table 3. The results show that ANGEL performs well in practical use. In particular, the TGR and PGR of ANGEL-M-M reach 92.01 and 92.25 percent, respectively, which indicates that in a dynamic cloud environment with large fluctuations in task count, ANGEL-M-M yields the best performance. An interesting observation is that ANGEL-P-M almost always performs best under the random synthetic workloads, whereas ANGEL-M-M performs best under the cloud tracelogs. We attribute this to the fact that with the P strategy in the forward announcement-bidding phase, tasks are allocated to VMs relatively evenly, so more tasks can be accepted. In the real-world trace, however, the situation changes considerably: when extensive numbers of tasks surge into the system, the P strategy still spreads tasks evenly across VMs, but the capacity of VMs with large laxity is not fully utilized. The MAX strategy in the forward announcement-bidding phase can exploit this laxity efficiently via Eq. (9), especially when a great number of tasks arrive at once. We can also observe that although clouds offer nearly unlimited resources, some tasks are still rejected, which is attributable to the deadline restrictions in our experiment. The large performance differences among the five algorithms are straightforward to explain: at some time stamps, extensive numbers of tasks surge into the system (the maximum number of tasks submitted within 100 seconds exceeds 13,000), which immediately overloads the system and requires many more active hosts to accommodate these tasks; far more hosts are active than in the previous synthetic workloads. Since starting hosts and creating VMs take additional time, well-designed algorithms exhibit better performance, and this advantage is even more pronounced in practice. On the Google cloud tracelogs, in terms of TGR, ANGEL-M-M, ANGEL-M-P, ANGEL-P-M, and ANGEL-P-P outperform I-GREEDY by 72.44, 64.47, 35.45, and 31.89 percentage points, respectively. Regarding PGR, ANGEL-M-M, ANGEL-M-P, ANGEL-P-M, and ANGEL-P-P outperform I-GREEDY by 72.63, 64.79, 35.94, and 32.25 percentage points, respectively. This advantage can be explained by the fact that ANGEL comprehensively employs the bidirectional announcement-bidding mechanism, in which the selection strategies (the MAX strategy and the P strategy) and the dynamic scaling-up method are integrated.
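The contrast between the two selection strategies discussed above can be sketched as follows. Here the MAX strategy simply picks the contractor with the largest bidding value (e.g., the largest laxity from Eq. (9)), while the P strategy is rendered as a probabilistic choice proportional to the bids, which tends to spread tasks across VMs; this probabilistic reading of the P strategy is an assumption for illustration rather than the paper's exact definition.

```java
import java.util.List;
import java.util.Random;

class ContractorSelection {
    // MAX strategy: always take the VM with the largest bidding value,
    // exploiting VMs with large laxity when bursts of tasks arrive.
    static int selectMax(List<Double> bids) {
        int best = 0;
        for (int i = 1; i < bids.size(); i++) {
            if (bids.get(i) > bids.get(best)) best = i;
        }
        return best;
    }

    // P strategy (assumed probabilistic form): choose a VM with probability
    // proportional to its bid, which evens out the allocation across VMs.
    static int selectP(List<Double> bids, Random rnd) {
        double total = bids.stream().mapToDouble(Double::doubleValue).sum();
        double r = rnd.nextDouble() * total;
        double acc = 0.0;
        for (int i = 0; i < bids.size(); i++) {
            acc += bids.get(i);
            if (r <= acc) return i;
        }
        return bids.size() - 1; // numerical fallback
    }
}
```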

7 CONCLUSIONS AND FUTURE WORK

In this study, we investigated the problem of agent-based scheduling for aperiodic, independent real-time tasks in virtualized clouds and proposed a novel dynamic scheduling algorithm, ANGEL. ANGEL employs a new bidirectional announcement-bidding mechanism; our contributions include the design of the basic matching policy, the forward and backward announcement-bidding phases, and their process flows. In addition, we devised the calculation of the bidding values in both the forward and backward bidding phases. Moreover, two selection strategies, the MAX strategy and the P strategy, were put forward to determine the contractors. We also fully considered the elasticity of clouds and proposed a scaling-up policy that dynamically adds VMs to enhance the system's schedulability. Extensive simulation studies using random synthetic workloads as well as workloads from the latest version of the Google cloud tracelogs indicate that ANGEL is a feasible scheduling algorithm for real-time tasks in virtualized clouds. ANGEL is the first of its kind reported in the literature, as it employs an agent-based approach and comprehensively addresses schedulability, priority, scalability, and real-time constraints in virtualized cloud environments. In our future work, we plan to address the following three issues: first, we will implement a new scheduling mechanism in which communication and dispatching times are taken into account; second, we will integrate a scaling-down policy into ANGEL to improve resource utilization; finally, we plan to run ANGEL in a real cloud environment.

ACKNOWLEDGMENTS

This research was supported by the Program for Changjiang Scholars and Innovative Research Team in University of China under grant No. IRT13014, the National Natural Science Foundation of China under grants No. 91024030 and No. 61403402, the Project of Institute of Southwest Electronics and Telecommunication Technology under grant No. 2013001, the Hunan Provincial Natural Science Foundation of China under grant No. 2015JJ3023, as well as Australian Research Council (ARC) grants DP150103732, DP140103649, and LP140100816. Yang Xiang is the corresponding author.

REFERENCES

[1] R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic, “Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility,” Future Generation Comput. Syst., vol. 25, no. 6, pp. 599–616, 2009.


[2] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, and I. Stoica, “A view of cloud computing,” Commun. ACM, vol. 53, no. 4, pp. 50–58, 2010.
[3] J. Rao, Y. Wei, J. Gong, and C. Xu, “QoS guarantees and service differentiation for dynamic cloud applications,” IEEE Trans. Netw. Service Manage., vol. 10, no. 1, pp. 43–55, Mar. 2013.
[4] A. Beloglazov, J. Abawajy, and R. Buyya, “Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing,” Future Generation Comput. Syst., vol. 28, no. 5, pp. 755–768, 2012.
[5] L. He, D. Zou, Z. Zhang, C. Chen, H. Jin, and S. A. Jarvis, “Developing resource consolidation frameworks for moldable virtual machines in clouds,” Future Generation Comput. Syst., vol. 32, pp. 69–81, 2014.
[6] M. Dong, H. Li, K. Ota, and H. Zhu, “HVSTO: Efficient privacy preserving hybrid storage in cloud data center,” in Proc. 3rd Workshop Commun. Control Smart Energy Syst., 2014, pp. 529–534.
[7] X. Qin and H. Jiang, “A novel fault-tolerant scheduling algorithm for precedence constrained tasks in real-time heterogeneous systems,” Parallel Comput., vol. 32, no. 5, pp. 331–356, 2006.
[8] K. Plankensteiner, R. Prodan, T. Fahringer, A. Kertesz, and P. Kacsuk, “Fault-tolerant behavior in state-of-the-art grid workflow management systems,” Tech. Rep. TR-0091, Inst. Grid Inform., Resource Workflow Monitoring Serv., CoreGRID Netw. Excellence, 2007.
[9] H. M. Fard, R. Prodan, and T. Fahringer, “A truthful dynamic workflow scheduling mechanism for commercial multicloud environments,” IEEE Trans. Parallel Distrib. Syst., vol. 24, no. 6, pp. 1203–1212, Jun. 2013.
[10] L. F. Bittencourt, E. R. M. Madeira, and N. L. S. da Fonseca, “Scheduling in hybrid clouds,” IEEE Commun. Mag., vol. 50, no. 9, pp. 42–47, Sep. 2012.
[11] B. Sotomayor, R. S. Montero, I. M. Llorente, and I. Foster, “Virtual infrastructure management in private and hybrid clouds,” IEEE Internet Comput., vol. 13, no. 5, pp. 14–22, Sep./Oct. 2009.
[12] M. Mishra, A. Das, P. Kulkarni, and A. Sahoo, “Dynamic resource management using virtual machine migrations,” IEEE Commun. Mag., vol. 50, no. 9, pp. 34–40, Sep. 2012.
[13] X. Liu, C. Wang, B. Zhou, J. Chen, T. Yang, and A. Y. Zomaya, “Priority-based consolidation of parallel workloads in the cloud,” IEEE Trans. Parallel Distrib. Syst., vol. 24, no. 9, pp. 1874–1883, Sep. 2013.
[14] Y. Mei, L. Liu, X. Pu, S. Sivathanu, and X. Dong, “Performance analysis of network I/O workload in virtualized data centers,” IEEE Trans. Serv. Comput., vol. 6, no. 1, pp. 48–63, 2013.
[15] H. Goldingay and J. Mourik, “The effect of load on agent-based algorithms for distributed task allocation,” Inf. Sci., vol. 222, pp. 66–80, Feb. 2013.
[16] M. Wooldridge and N. R. Jennings, “Intelligent agents: Theory and practice,” Knowl. Eng. Rev., vol. 10, pp. 115–152, 1995.
[17] K. M. Sim, “Agent-based cloud computing,” IEEE Trans. Services Comput., vol. 5, no. 4, pp. 564–577, Oct.–Dec. 2012.
[18] X. Zhu, X. Qin, and M. Qiu, “QoS-aware fault-tolerant scheduling for real-time tasks on heterogeneous clusters,” IEEE Trans. Comput., vol. 60, no. 6, pp. 800–812, Jun. 2011.
[19] Q. Zhang, M. F. Zhani, R. Boutaba, and J. L. Hellerstein, “Harmony: Dynamic heterogeneity-aware resource provisioning in the cloud,” in Proc. 33rd Int. Conf. Distrib. Comput. Syst., 2013, pp. 510–519.
[20] X. Kong, C. Lin, Y. Jiang, W. Yan, and X. Chu, “Efficient dynamic task scheduling in virtualized data centers with fuzzy prediction,” J. Netw. Comput. Appl., vol. 34, no. 4, pp. 1068–1077, 2011.
[21] R. N. Calheiros and R. Buyya, “Meeting deadlines of scientific workflows in public clouds with tasks replication,” IEEE Trans. Parallel Distrib. Syst., vol. 25, no. 7, pp. 1787–1796, Jul. 2014.
[22] M. Malawski, G. Juve, E. Deelman, and J. Nabrzyski, “Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds,” in Proc. Int. Conf. High Perform. Comput., Netw., Storage Anal., 2012, pp. 1–11.
[23] I. Goiri, J. L. Berral, J. O. Fitó, F. Julià, R. Nou, J. Guitart, R. Gavaldà, and J. Torres, “Energy-efficient and multifaceted resource management for profit-driven virtualized data centers,” Future Generation Comput. Syst., vol. 28, pp. 718–731, 2012.
[24] P. Graubner, M. Schmidt, and B. Freisleben, “Energy-efficient management of virtual machines in Eucalyptus,” in Proc. IEEE 4th Int. Conf. Cloud Comput., 2011, pp. 243–250.


[25] N. Liu, M. A. Abdelrahman, and S. Ramaswamy, “A complete multiagent framework for robust and adaptable dynamic job shop scheduling,” IEEE Trans. Syst., Man, Cybern., vol. 37, no. 5, pp. 904–916, Sep. 2007.
[26] R. Price, “Evaluation of adaptive nature inspired task allocation against alternate decentralised multiagent strategies,” BSc dissertation, University of Birmingham, Birmingham, U.K., 2004.
[27] M. Campos, E. Bonabeau, G. Theraulaz, and J. L. Deneubourg, “Dynamic scheduling and division of labor in social insects,” Adaptive Behavior, vol. 8, no. 2, pp. 83–94, 2001.
[28] M. Owliya, M. Saadat, R. Anane, and M. Goharian, “A new agents-based model for dynamic job allocation in manufacturing shopfloors,” IEEE Syst. J., vol. 6, no. 2, pp. 353–361, Jun. 2012.
[29] M. Owliya, M. Saadat, G. G. Jules, M. Goharian, and R. Anane, “Agent-based interaction protocols and topologies for manufacturing task allocation,” IEEE Trans. Syst., Man, Cybern.: Syst., vol. 43, no. 1, pp. 38–52, Jan. 2013.
[30] G. Lange and F. Lin, “Modeling well scheduling as a virtual enterprise with intelligent agents,” in Proc. IEEE 17th Int. Conf. Comput. Sci. Eng., 2014, pp. 89–96.
[31] R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. F. De Rose, and R. Buyya, “CloudSim: A toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms,” Softw.: Practice Experience, vol. 41, no. 1, pp. 23–50, 2011.
[32] Google cluster data v2. [Online]. Available: https://code.google.com/p/googleclusterdata/wiki, 2014.
[33] I. S. Moreno, P. Garraghan, P. Townend, and X. Jie, “An approach for characterizing workloads in Google cloud to derive realistic resource utilization models,” in Proc. IEEE Int. Symp. Service Oriented Syst. Eng., 2013, pp. 49–60.

Xiaomin Zhu received the BS and MS degrees in computer science from Liaoning Technical University, Liaoning, China, in 2001 and 2004, respectively, and the PhD degree in computer science from Fudan University, Shanghai, China, in 2009. In the same year, he received the Shanghai Excellent Graduate award. He is currently an associate professor in the College of Information Systems and Management, National University of Defense Technology, Changsha, China. His research interests include scheduling and resource management in green computing, cluster computing, cloud computing, and multiple satellites. He has published more than 60 research articles in refereed journals and conference proceedings such as IEEE TC, IEEE TPDS, IEEE TCC, JPDC, JSS, ICPP, and CCGrid. He is an editorial board member of the Journal of Big Data and Information Analytics. He is a member of the IEEE.

Chao Chen received the BS degree in information systems from National University of Defense Technology, Changsha, China, in 2013. He is currently working toward the MS degree in the College of Information Systems and Management, National University of Defense Technology. His research interests include agent-based scheduling, cloud computing, and mobile cloud computing.


Laurence T. Yang received the BE degree in computer science and technology from Tsinghua University, China, and the PhD degree in computer science from University of Victoria, Canada. He is a professor with the School of Computer Science and Technology, Huazhong University of Science and Technology, China, as well as with the Department of Computer Science, St. Francis Xavier University, Canada. His research interests include parallel and distributed computing, embedded and ubiquitous/pervasive computing, and big data. He has published more than 200 papers in various refereed journals (about 40 percent in IEEE/ACM Transactions and Journals and the others mostly in Elsevier, Springer, and Wiley Journals). His research has been supported by the National Sciences and Engineering Research Council of Canada (NSERC) and the Canada Foundation for Innovation. He is a senior member of the IEEE.


Yang Xiang received the PhD degree in computer science from Deakin University, Australia. He is currently a full professor in the School of Information Technology, Deakin University. He is the director of the Network Security and Computing Lab (NSCLab). His research interests include network and system security, distributed systems, and networking. In particular, he is currently leading his team in developing active defense systems against large-scale distributed network attacks. He has published more than 150 research papers in many international journals and conferences, such as IEEE TC, IEEE TPDS, and IEEE TIFS. Two of his papers were selected as the featured articles in the April 2009 and the July 2013 issues of IEEE TPDS. He has served as the Program/General chair for many international conferences such as ICA3PP 12/11, IEEE/IFIP EUC 11, IEEE TrustCom 13/11, IEEE HPCC 10/09, IEEE ICPADS 08, and NSS 11/10/09/08/07. He has been a PC member for more than 60 international conferences in distributed systems, networking, and security. He serves as an associate editor of IEEE TC, IEEE TPDS, and Security and Communication Networks (Wiley), and as an editor of the Journal of Network and Computer Applications. He is the coordinator, Asia, for the IEEE Computer Society Technical Committee on Distributed Processing (TCDP). He is a senior member of the IEEE.
