Multi-objective Task Assignment in Cloud Computing by Particle Swarm Optimization

Lizheng Guo1,2, Guojin Shao1
1 Department of Computer Science and Engineering, Henan University of Urban Construction, Pingdingshan, China

Shuguang Zhao2
2 College of Information Sciences and Technology, Donghua University, Shanghai, China

[email protected]
[email protected]
Abstract—The cost of processing and data transfer and the time of processing and data transfer are critical to the success of many enterprises that use cloud computing. Most existing optimization algorithms focus on only one of these aspects. This paper formulates a model for multi-objective task assignment in a cloud computing environment and describes a particle swarm optimization algorithm for it. The algorithm optimizes not only the time but also the cost. The experimental results show that the proposed method is more effective and efficient in both time and cost.

Keywords-cloud computing; particle swarm optimization; multi-objective task allocation
I. INTRODUCTION
Cloud computing [1-3] is becoming more and more attractive as an enterprise model in which storage and computing resources are purchased on demand by the user. The main feature of cloud computing is the combination of vast software and commodity hardware to provide a powerful computing paradigm. It is estimated [4] that, by statistically multiplexing resources at very large scale, cloud computing can achieve factors of 5 to 7 decrease in the cost of electricity, network bandwidth, operations, software, and hardware. Cloud computing systems offer many advantages, but there are also many obstacles to overcome. The authors of [4] list the top ten obstacles, of which the number one is availability of service. Cloud computing involves a set of cooperating processors and communication links distributed all over the world, and many aspects must be considered in order to obtain the best availability. To increase system availability, many methods have been adopted, ranging from reducing the movement of data [5], to minimizing the execution and communication time [7, 8], to minimizing the cost of computation and data transfer [6]. In addition, the processors and communication links of the system may have limited resources. The most important reason is that, in cloud computing, resource usage is charged by usage time or by the amount of data used. Therefore, this paper deals with task assignment under multiple objectives: minimization of the execution and communication time, and minimization of the cost of execution and data transfer.
In fact, task assignment has been shown to be NP-complete [9]. Since task assignment is an NP-complete problem, Genetic Algorithms (GA) have been used for it [10-11]. However, a genetic algorithm may not be the best method. L. Zhang has illustrated that the particle swarm optimization algorithm can obtain better schedules than a genetic algorithm in grid computing [7]. A. Salman has shown that the performance of the Particle Swarm Optimization (PSO) algorithm is better than that of the GA algorithm in distributed systems [12]: not only is the PSO solution quality better than GA in most of the test cases, but PSO also runs faster than GA. Therefore, we use Particle Swarm Optimization to optimize the task assignment problem. In this paper, we focus not only on minimizing the total execution and transfer time, but also on minimizing the total cost of execution and transfer. The rest of this paper is organized as follows. Section II presents the problem description and model. Section III introduces the PSO algorithm. Section IV gives the details of the experiment settings and analyzes the results. Section V concludes the paper.
II. PROBLEM DESCRIPTION AND MODELING
A. Problem Description
In cloud computing, on the one hand, there are many processors with different computing and storage capacities; on the other hand, the usage fee depends on the capacity of the processor and the amount of data transferred. The task assignment problem can therefore be described as assigning all the tasks to the processors in the cloud computing environment so that the total cost and time of processing and communication are minimized. We regard the task assignment as a mapping of all the tasks onto a Directed Acyclic Graph (DAG) G(V, E), where V = {1, 2, ..., n} represents the n tasks and E represents the interactions between these tasks. Each node has a weight which denotes the amount of data to be processed by a given task on a specific processor. Each edge has a weight which describes the amount of data generated by one task that must be handled by another task.
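For illustration only, the following sketch shows one possible in-memory representation of such a problem instance: per-task data amounts (node weights), inter-task transfer amounts (edge weights), processor capacities and inter-processor bandwidths. The value ranges follow the experiment settings in Section IV; all variable names are hypothetical and not part of the original formulation.

import numpy as np

# Hypothetical representation of a task-assignment instance (illustrative only).
n_tasks, n_procs = 6, 3
rng = np.random.default_rng(0)

data_amount = rng.uniform(100, 1000, size=n_tasks)              # node weights: data each task must process
transfer_amount = rng.uniform(1, 100, size=(n_tasks, n_tasks))  # edge weights: data exchanged between tasks
np.fill_diagonal(transfer_amount, 0.0)                          # a task exchanges no data with itself
cpu_capacity = rng.uniform(2.0, 6.0, size=n_procs)              # processing capacity of each processor
bandwidth = rng.uniform(1, 10, size=(n_procs, n_procs))         # bandwidth between processors
np.fill_diagonal(bandwidth, np.inf)                             # co-located tasks incur no transfer time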
B. Modeling
In this paper, we consider the task assignment problem in the following setting. The processors in the cloud are heterogeneous and have different processing abilities. A task's processing cost may vary, depending on the capacity of the processor and on the charging standard of the provider or location, and its processing time varies according to the processor to which it is assigned. On the other hand, the communication time between two tasks changes because the bandwidth between two different nodes differs and changes over time, so the cost of communication also varies. Our target is to minimize the time and cost of communication and processing by mapping all the tasks onto the processors. To formulate the task assignment, we define Ti, i = {1, 2, 3, ..., n}, as a permutation of n independent tasks; CPUk, k = {1, 2, 3, ..., m}, as the capacities of the m processors; and Bij, i, j = {1, 2, 3, ..., k}, as the bandwidth between two nodes, where k is the number of nodes. Let xik = 1 if task i is assigned to processor k, and xik = 0 otherwise; yijkl = 1 if and only if task i is assigned to processor k and task j is assigned to processor l with l ≠ k, and yijkl = 0 otherwise. Here n is the number of tasks and m is the number of processors; DEi is the amount of data of task i assigned to processor k, and CPUk is the capacity of processor k; DTij is the amount of data exchanged between task i, which generates the transferred data, and task j, which consumes it. Equations (1), (2) and (3) respectively represent the execution time, the transfer time and the total time.
$T_{exe} = \sum_{i=1}^{n} \sum_{k=1}^{m} x_{ik} \times \frac{D_{Ei}}{CPU_k}$.  (1)

$T_t = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \sum_{k=1}^{m} \sum_{l \neq k} y_{ijkl} \times \frac{D_{Tij}}{B_{ij}}$.  (2)

$Total(T) = T_{exe} + T_t$.  (3)

Amazon EC2 [13] provides three types of charging: the first is On-Demand, the second is Reserved and the third is Spot. In this paper, we choose the On-Demand method as our pricing standard. The processing pricing is listed in Table I and the data transfer pricing is listed in Table II. In light of the Amazon EC2 charging standard, we define POUT as the price of data transferred out and PIN as the price of data transferred in. Let Ct (4) be the total cost of data transfer. Pk is the processing price of the standard On-Demand small instance, and Cexe (5) is the total cost of processing. Let Total(C) (6) be the total cost, which is the sum of the data transfer cost and the processing cost; (7) is the fitness function.

TABLE I. STANDARD ON-DEMAND SMALL PRICING

Region                 Linux/UNIX Usage    Windows Usage
US East (Virginia)     $0.085 per hour     $0.12 per hour
EU (Ireland)           $0.095 per hour     $0.12 per hour
Asia Pacific (Tokyo)   $0.10 per hour      $0.12 per hour

TABLE II. DATA TRANSFER PRICING

Region                 Data Transfer IN    Data Transfer OUT
US East (Virginia)     $0.100 per GB       $0.150 per GB
EU (Ireland)           $0.100 per GB       $0.150 per GB
Asia Pacific (Tokyo)   $0.100 per GB       $0.201 per GB

$C_t = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \sum_{k=1}^{m} \sum_{l \neq k} y_{ijkl} \times (D_{Tij} \times P_{OUT} + D_{Tij} \times P_{IN})$.  (4)

$C_{exe} = \sum_{i=1}^{n} \sum_{k=1}^{m} x_{ik} \times \frac{D_{Ei}}{CPU_k} \times P_k$.  (5)

$Total(C) = C_t + C_{exe}$.  (6)

$FitnessF = Total(T) + Total(C)$.  (7)

Suppose that the processing time and cost are known for task i executing on processor k, and that the transfer time and cost are known for transferring data from node i to node j. Our purpose is to map all the tasks onto the processors so that the total time and cost are minimized, that is, so that the value of (7) is minimized. The constraints are given by (8), (9) and (10).

Subject to

$\sum_{k=1}^{m} x_{ik} = 1, \quad \forall i = 1, 2, \ldots, n$.  (8)

$\sum_{k=1}^{m} \sum_{l=1}^{m} y_{ijkl} = 1, \quad \forall i, j = 1, 2, \ldots, n$.  (9)

$x_{ik}, y_{ijkl} \in \{0, 1\}, \quad \forall i, j, k, l$.  (10)
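As a minimal sketch (not from the paper) of how (1)-(7) can be evaluated for one candidate assignment, the function below computes Total(T), Total(C) and the fitness for a given task-to-processor mapping. For simplicity it assumes a single Pk, POUT and PIN for all processors, defaulting to the US East values in Tables I and II; all names are hypothetical.

import numpy as np

def fitness(assign, data_amount, transfer_amount, cpu_capacity, bandwidth,
            p_k=0.085, p_out=0.150, p_in=0.100):
    """Evaluate FitnessF = Total(T) + Total(C) for one assignment, following (1)-(7).

    assign[i] is the index of the processor that task i is mapped to, so
    constraint (8) holds by construction.
    """
    # (1) execution time and (5) execution cost
    t_exe = np.sum(data_amount / cpu_capacity[assign])
    c_exe = np.sum(data_amount / cpu_capacity[assign] * p_k)

    # (2) transfer time and (4) transfer cost, counted once per task pair
    t_t, c_t = 0.0, 0.0
    n = len(assign)
    for i in range(n - 1):
        for j in range(i + 1, n):
            k, l = assign[i], assign[j]
            if k != l:  # y_ijkl = 1 only when the two tasks sit on different processors
                t_t += transfer_amount[i, j] / bandwidth[k, l]
                c_t += transfer_amount[i, j] * (p_out + p_in)

    total_t = t_exe + t_t            # (3)
    total_c = c_exe + c_t            # (6)
    return total_t + total_c         # (7)

With arrays shaped like the instance sketch in Section II.A, a call such as fitness(np.array([0, 1, 2, 0, 1, 2]), data_amount, transfer_amount, cpu_capacity, bandwidth) scores one mapping of six tasks onto three processors.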
III. PARTICLE SWARM OPTIMIZATION ALGORITHM
In this section, we present a particle swarm optimization algorithm for multi-objective task assignment. The heuristic optimizes the execution and transfer time and cost of a given task assignment based on the particle swarm optimization algorithm. The PSO algorithm, like other evolutionary computation methods, was proposed by Kennedy and Eberhart in 1995 [14]. PSO is inspired by the behavior of bird flocks or fish schools searching for food through the search space. While looking for food or escaping from enemies, the swarm flocks synchronously, changes direction suddenly, scatters and regroups interactively, and finally converges on a target. A particle has a velocity vector and a position vector. Each particle moves and adjusts its direction according to its velocity and position in the search space at any time. Each particle has a fitness value, which is evaluated in every iteration by the fitness function to be optimized. pbest is the best position the particle itself has reached, and gbest is the best position reached by the entire swarm. During the search, each particle adjusts its direction and speed in terms of its current position, its velocity, pbest and gbest. The performance of each particle is measured by a fitness function formulated for the specific problem; in this paper the fitness function is (7). In each generation, the velocity and the position of a particle are updated according to (11) and (12) respectively.

$v_i^{k+1} = \omega v_i^k + c_1 rand_1 \times (pbest_i - x_i^k) + c_2 rand_2 \times (gbest - x_i^k)$.  (11)

$x_i^{k+1} = x_i^k + v_i^k$.  (12)
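A minimal sketch (ours, not from the paper) of the update rules (11) and (12) for a single particle, assuming the parameter values reported in Section IV (w = 0.729, c1 = c2 = 1.49445); the function name is hypothetical.

import numpy as np

W, C1, C2 = 0.729, 1.49445, 1.49445   # inertia weight and acceleration coefficients from Section IV

def update_particle(position, velocity, pbest, gbest, rng):
    """Apply (11) and (12) once to one particle (position and velocity are 1-D arrays)."""
    r1 = rng.random(position.shape)
    r2 = rng.random(position.shape)
    velocity = (W * velocity
                + C1 * r1 * (pbest - position)    # cognitive pull toward the particle's own best
                + C2 * r2 * (gbest - position))   # social pull toward the swarm's best
    position = position + velocity                # (12)
    return position, velocity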
A. Initialization of the Particle Swarm Population
The particle swarm population is generated at random. The position vector and velocity vector of each particle are generated according to (13) and (14) [15]. In (13) and (14), the values of xmin and vmin are -4.0 and the values of xmax and vmax are 4.0.

$x_i^1 = x_{min} + (x_{max} - x_{min}) \times rand$.  (13)

$v_i^1 = v_{min} + (v_{max} - v_{min}) \times rand$.  (14)

Because the position vector is continuous while the task assignment is a discrete permutation, we have to transform the continuous values into discrete values. We use the smallest position value (SPV) rule [15], which finds the permutation corresponding to a continuous position. The process is illustrated in Table III.

TABLE III. CONTINUOUS VALUE TO DISCRETE VALUE

Dimension     1        2        3        4        5        6
Continuous    3.1848   3.5855   0.1587   3.6189   2.3824   0.0292
Discrete      6        3        5        1        2        4

B. Implementation of the PSO Algorithm
In PSO, each particle is a candidate solution of the underlying problem and has n dimensions, which are decided by the specific problem; in this paper n is the number of tasks to be assigned. The solution carried by each particle improves until an end condition is met, which is either reaching the maximum number of iterations or obtaining an ideal result. The details of the PSO algorithm are described in Fig. 1.

PSO algorithm
1. Initialize the particle position vectors and velocity vectors at random according to (13) and (14). The dimension of each vector equals the number of tasks.
2. Compute the fitness value of each particle according to (7); if a particle's fitness value is better than its current pbest, replace the previous pbest with the current value as the new pbest.
3. Select the best particle of the entire swarm; if its value is better than the current gbest, replace the previous gbest with it as the new gbest.
4. For all particles, update their velocity and position by (11) and (12).
5. If the maximum number of iterations is reached or an ideal result is obtained, the program stops; otherwise, repeat from Step 2.
Figure 1. Algorithm of PSO
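The sketch below (ours, for illustration) puts the pieces together: SPV decoding plus the loop of Fig. 1, with the parameter values of Section IV as defaults; the iteration limit is an arbitrary placeholder. The fitness argument is any callable that maps the decoded permutation to a task-to-processor assignment and returns the value of (7); since the paper does not spell out that mapping, it is left to the caller. All function names are hypothetical.

import numpy as np

def spv_decode(position):
    """Smallest position value rule: task indices ordered by increasing position value (cf. Table III)."""
    return np.argsort(position)

def pso(fitness, n_tasks, n_particles=30, max_iter=200,
        w=0.729, c1=1.49445, c2=1.49445, x_range=(-4.0, 4.0), seed=0):
    """Minimal PSO loop following the steps of Fig. 1."""
    rng = np.random.default_rng(seed)
    lo, hi = x_range
    pos = lo + (hi - lo) * rng.random((n_particles, n_tasks))    # step 1, eq. (13)
    vel = lo + (hi - lo) * rng.random((n_particles, n_tasks))    # step 1, eq. (14)

    pbest_pos = pos.copy()
    pbest_val = np.array([fitness(spv_decode(p)) for p in pos])  # step 2: evaluate via (7)
    g = int(np.argmin(pbest_val))                                # step 3: best particle of the swarm
    gbest_pos, gbest_val = pbest_pos[g].copy(), pbest_val[g]

    for _ in range(max_iter):                                    # step 5: stop after max_iter iterations
        r1 = rng.random((n_particles, n_tasks))
        r2 = rng.random((n_particles, n_tasks))
        vel = w * vel + c1 * r1 * (pbest_pos - pos) + c2 * r2 * (gbest_pos - pos)  # step 4, eq. (11)
        pos = pos + vel                                                            # step 4, eq. (12)

        vals = np.array([fitness(spv_decode(p)) for p in pos])
        improved = vals < pbest_val                              # step 2: refresh personal bests
        pbest_pos[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))                            # step 3: refresh the global best
        if pbest_val[g] < gbest_val:
            gbest_pos, gbest_val = pbest_pos[g].copy(), pbest_val[g]

    return spv_decode(gbest_pos), gbest_val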
IV. EXPERIMENT SETTINGS AND RESULT ANALYSIS
In this section, we discuss the experiment settings and analyze the results.

A. Experiment Settings
In order to evaluate our algorithm, we generate the test data at random. The data amount of each task is restricted to between 100 and 1000; the processor capacity is between 2.0 and 6.0; the amount of transferred data varies from 1 to 100; the bandwidth varies from 1 to 10. All of the experiments were run on an Intel(R) Pentium(R) Dual CPU E2160 1.80 GHz with 1 GB RAM under Microsoft Windows XP, and all of them were implemented in Matlab R2009b. The parameters of the PSO are as follows: the size of the swarm is 30, the self-recognition coefficient c1 is 1.49445, the social coefficient c2 is 1.49445 and the inertia weight w is 0.729 [16].

B. Experimental Result Analysis
1) Performance analysis
Since PSO is a stochastic algorithm, the same problem may produce different results on different runs. In order to obtain more reliable results, we take the average over ten runs for every test instance. We test a series of task sets on the processing centers US East (Virginia), EU (Ireland) and Asia Pacific (Tokyo). The processing and transfer pricing of these centers is listed in Table I and Table II respectively; for the processing pricing, we choose the Linux/UNIX usage. The performance criterion considered is the cost of the task assignment obtained for the benchmarks. The percentage improvement in cost is computed as (15):

$\left(1 - \frac{Cost_O}{Cost_{NO}}\right) \times 100\%$.  (15)
The experimental results are listed in Table IV. From these results, we can conclude that the larger the task set is, the larger the cost difference between optimization and no optimization becomes. Our method therefore scales well with problem complexity and is well suited to cloud computing. Overall, optimization is about 15% better than no optimization.

TABLE IV. COST WITH AND WITHOUT OPTIMIZATION

Task             Processor   Cost($) No Optimization (NO)   Cost($) Optimization (O)   Improvement
10               3           157.32                         144.42                     12.9
15               3           380.62                         352.90                     27.72
20               3           532.21                         462.36                     69.85
25               3           752.14                         615.65                     136.49
30               3           1243.9                         1079.3                     164.6
35               3           1479.2                         1180.0                     299.2
Total                        4545.39                        3834.63                    710.76
Improvement (%)                                                                        15.64
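As a quick arithmetic check of (15) using the totals in Table IV:

$\left(1 - \frac{3834.63}{4545.39}\right) \times 100\% \approx (1 - 0.8436) \times 100\% \approx 15.64\%$,

which agrees with the overall improvement reported in the last row of the table.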
2) Convergence analysis
Fig. 2 shows a run of PSO solving a problem instance of 35 tasks assigned to 3 processors. From Fig. 2, we can observe that not only the cost but also the processing time decreases with the iterations, so our PSO algorithm optimizes both cost and time.

Figure 2. 35 tasks and 3 processors.

V. CONCLUSION
In summary, we model the multi-objective task assignment optimization problem and present a PSO algorithm for it. Our optimization objectives are not a single target but include the processing time, the transfer time, the processing cost and the transfer cost. The experimental results demonstrate that our method decreases not only the cost of processing and transferring but also the time of processing and transferring. This means that our algorithm both increases efficiency and decreases cost in cloud computing.
REFERENCES
[1] D. Chappell, "A Short Introduction to Cloud Platforms," David Chappell & Associates, August 2008.
[2] G. Gruman, "What cloud computing really means," InfoWorld, Jan. 2009.
[3] R. Buyya, C. S. Yeo, and S. Venugopal, "Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities," Department of Computer Science and Software Engineering, University of Melbourne, Australia, July 2008, p. 9.
[4] M. Armbrust et al., "Above the Clouds: A Berkeley View of Cloud Computing," Technical Report, http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf.
[5] D. Yuan, Y. Yang, and X. Liu, "A data placement strategy in scientific cloud workflows," Future Generation Computer Systems, 2010, pp. 1200-1214.
[6] S. Pandey, L. Wu, S. M. Guru, and R. Buyya, "A Particle Swarm Optimization-Based Heuristic for Scheduling Workflow Applications in Cloud Computing Environments," in Proc. 24th IEEE International Conference on Advanced Information Networking and Applications (AINA), 2010, pp. 400-407.
[7] L. Zhang, Y. H. Chen, R. Y. Sun, S. Jing, and B. Yang, "A Task Scheduling Algorithm Based on PSO for Grid Computing," International Journal of Computational Intelligence Research, 2008, pp. 37-43.
[8] P. Y. Yin, S. S. Yu, P. P. Wang, and Y. T. Wang, "A hybrid particle swarm optimization algorithm for optimal task assignment in distributed systems," Computer Standards & Interfaces, vol. 28, 2006, pp. 441-450.
[9] V. M. Lo, "Task assignment in distributed systems," PhD dissertation, Dept. of Computer Science, University of Illinois, Oct. 1983.
[10] C. K. Chang, H. Y. Jiang, Y. Di, D. Zhu, and Y. J. Ge, "Time-line based model for software project scheduling with genetic algorithms," Information and Software Technology, 2008, pp. 1142-1154.
[11] G. Gharooni-fard, F. Moein-darbari, H. Deldari, and A. Morvaridi, Procedia Computer Science, vol. 1, no. 1 (ICCS 2010), May 2010, pp. 1445-1454.
[12] A. Salman, "Particle swarm optimization for task assignment problem," Microprocessors and Microsystems, vol. 26, no. 8, Nov. 2002, pp. 363-371.
[13] Amazon EC2 Pricing, http://aws.amazon.com/ec2/pricing/, accessed May 19, 2011.
[14] J. Kennedy and R. C. Eberhart, "Particle swarm optimization," in Proc. IEEE International Conference on Neural Networks, vol. IV, 1995, pp. 1942-1948.
[15] M. F. Tasgetiren, Y. C. Liang, M. Sevkli, and G. Gencyilmaz, "Particle Swarm Optimization and Differential Evolution for Single Machine Total Weighted Tardiness Problem," International Journal of Production Research, 2006, pp. 4737-4754.
[16] Y. Shi and R. C. Eberhart, "Empirical study of particle swarm optimization," in Proc. IEEE Congress on Evolutionary Computation, 1999, pp. 1945-1950.