International Journal of Computer Applications (0975 – 8887) Volume 107– No.1, December 2014
EDLT: An Extended DLT to Enhance Load Balancing in Cloud Computing

Afife Fereydooni
Department of Computer Engineering, Mahallat Branch, Islamic Azad University, Mahallat, Iran
[email protected]

Mostafa Ghobaei Arani
Department of Computer Engineering, Parand Branch, Islamic Azad University, Tehran, Iran
[email protected]

Mahboubeh Shamsi
Department of Computer Engineering, Qom University of Technology, Qom, Iran
[email protected]
ABSTRACT
In recent years, cloud computing has gained prominence, and most companies apply some kind of cloud computing technology to all or part of their business. The growing number of clients willing to use cloud services has made load balancing an eminent challenge in this field. A general approach to load balancing is the application of Divisible Load Theory (DLT). In DLT, the workload is divided among master systems; in turn, each master system divides its load among its slave systems. This article presents an enhanced load balancing technique based on DLT. Simulation results indicate that the extended DLT reduces the measurement/report time and shows improved performance at lower failure rates.
Keywords Load-balancing, cloud computing, Divisible Load Theory (DLT), efficiency, failure rate.
1. INTRODUCTION
In recent years, with the rapid development of network and Internet technology, more people are going online to acquire information, to shop and to be entertained. Consequently, the volume of data and client requests has grown, requiring more computation and processing on servers. Servers should meet their clients' demands as quickly as possible to keep them satisfied. A computation model known as cloud computing has been proposed in this respect. Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (for example networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction [1]. In another definition, cloud computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the data centers that provide those services [2]. One of the ongoing challenges in cloud computing is load balancing. Workload varies from CPU load and memory capacity to network delays and network traffic. Load balancing is a mechanism to evenly distribute the load over
the entire nodes of the cloud, without overflow or underflow at any node. It leads to greater customer satisfaction and ultimately enhances the total efficiency of the system [3]. One of the flexible approaches to load balancing is Divisible Load Theory (DLT) [4]. In DLT, the workload is divided among a number of master systems, slave systems and the links between them, and this load is then distributed over the network. Before the load is distributed, statistics and measurements are collected from the master systems, and the workload is divided among the master systems based on these statistics. Since each master system is connected to several slave systems, a master system evenly distributes its share of the load among its slave systems. The general objective of this theory is the optimized distribution of the load among master systems and then slave systems, so that the load is processed in the shortest possible time and the throughput increases [5, 6]. Because users' demands in a cloud environment differ, the servers at the nodes do not carry equal loads: some nodes become overloaded with insufficient resources, while other nodes sit idle because they are under-loaded. How to evenly distribute the workload among servers with an appropriate load balancing algorithm, so as to improve resource usage and system performance, has become a key challenge in the field of cloud computing [7]. This article proposes an enhanced approach based on DLT that takes the impact of the master/slave failure rate into consideration.
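The two-level division just described can be sketched as follows. This is a minimal illustration, not the formulation from [4]: the function name, the use of relative speeds as the division key, and the even split among slaves are assumptions made for the sketch.

```python
# Sketch of two-level divisible-load distribution (hypothetical helper,
# not the exact formulation of DLT in the cited papers).

def divide_load(total_load, master_speeds, slave_counts):
    """Split total_load among masters proportionally to their measured
    speed, then split each master's share evenly among its slaves."""
    total_speed = sum(master_speeds)
    plan = []
    for speed, n_slaves in zip(master_speeds, slave_counts):
        master_share = total_load * speed / total_speed
        # each master distributes its share evenly over its slaves
        plan.append([master_share / n_slaves] * n_slaves)
    return plan

# three masters with relative speeds 2:1:1, each with two slaves
plan = divide_load(120.0, [2.0, 1.0, 1.0], [2, 2, 2])
print(plan)  # [[30.0, 30.0], [15.0, 15.0], [15.0, 15.0]]
```

The point of the sketch is only that the division happens in two stages: first across masters (here, proportionally to speed statistics), then evenly across each master's slaves.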
The enhanced algorithm based on DLT for load balancing has the following objectives [8]:
- Decrease the time needed to assign user demands to cloud resources
- Avoid idle nodes to maximize system performance
- Study the effect of master/slave failure probability on the measurement/report time
- Decrease the measurement/report time
- Scalability
The rest of the article is organized as follows: the second section reviews and compares different load balancing algorithms; the third section explains the enhanced approach based on DLT; the fourth section analyzes the experimental results; and
finally, the fifth section provides conclusions and suggestions.
2. RELATED WORKS
In this section, different load balancing algorithms in cloud computing are discussed and evaluated. From one perspective, load balancing algorithms are divided into static and dynamic algorithms. Static algorithms do not depend on the current state of the system and need no runtime information about it; they distribute the workload among servers according to a fixed scheme. Because the server load is divided inflexibly in advance, static approaches are relatively incomplete. Dynamic algorithms make their load balancing decisions based on the current state of the system and do not require prior information about it. They need constant monitoring of the nodes and of task progress and are more difficult to implement; however, they are more precise and can perform better at load balancing. Later in this article a brief explanation is provided for static algorithms such as ant colony [9], CLBDM [10], enhanced Map-Reduce [11] and VM Mapping [12], as well as dynamic algorithms such as INS [13], ESWLC [14] and DDFTP [15]. Finally, these algorithms are evaluated and compared. Radojevic et al. proposed an algorithm called CLBDM (Central Load Balancing Decision Model) [16]. CLBDM is an enhanced version of Round-Robin, which operates by switching sessions at the application layer. The enhancement in CLBDM is that the communication time between user and node is measured, and if it exceeds a specific threshold, a problem is assumed to have occurred; the connection is then terminated and the task is forwarded to another node according to the usual Round-Robin rules. CLBDM thus operates as an automatic network administrator. Kumar et al. [17] proposed a load balancing algorithm based on an evolutionary ant algorithm.
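The CLBDM idea, Round-Robin plus a connection-time threshold, can be sketched as follows. The threshold value, node names and the `assign` helper are illustrative assumptions, not part of the CLBDM specification in [16]:

```python
from itertools import cycle

# Sketch of the CLBDM idea: round-robin assignment, but a session whose
# observed connection time exceeds a threshold is dropped and handed to
# the next node in round-robin order. Threshold and names are assumed.
THRESHOLD = 10.0  # seconds; illustrative value

def assign(sessions, nodes, threshold=THRESHOLD):
    """sessions: list of (session_id, observed_connection_time)."""
    rr = cycle(nodes)
    placement = {}
    for sid, conn_time in sessions:
        node = next(rr)
        if conn_time > threshold:
            # problem detected: terminate the connection and forward the
            # task to the next node chosen by the round-robin rule
            node = next(rr)
        placement[sid] = node
    return placement

print(assign([("s1", 2.0), ("s2", 12.5), ("s3", 1.0)], ["n1", "n2", "n3"]))
```

Session "s2" exceeds the threshold, so it skips its round-robin slot and lands on the following node, which is the essence of the enhancement over plain Round-Robin.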
In their proposed algorithm, when a request is sent, some ants release pheromones while other ants set out on a direct path from the main node. Forward movement means that an ant starts its journey from an overloaded node and searches the next node to check whether it is overloaded; the ant then starts its journey back. In [11], an enhanced Map-Reduce model has been proposed for load balancing. Map-Reduce is a model with two main duties: it maps the tasks and reduces the results. The algorithm has three methods: Part, Comp and Group. Map-Reduce initially runs Part to examine the tasks; in this stage, requests are divided into parts by the Map task and the key of each part is stored in a hash-key table. Comp then compares the parts, and Group finally groups similar parts using the Reduce task. Ni et al. [12] designed a load balancing algorithm for private clouds that uses VM Mapping. The architecture includes a central scheduling controller and a resource supervisor. The scheduling controller computes how much work each resource can take on and assigns tasks accordingly, while the resource supervisor collects detailed information about resource availability. In [13], the INS (Index Name Server) algorithm has been proposed to minimize data duplication and redundancy. This optimizing algorithm integrates the access point and duplication. Many parameters enter the calculation of the optimum point, including the location of the server hosting the data blocks, the quality of data transfer based on node performance, the maximum download bandwidth from the target server, and the route. Ren et al. [14] proposed a dynamic load balancing algorithm for the cloud computing environment based on an enhancement of WLC (Weighted Least Connection) called ESWLC (Exponentially Smoothed Weighted Least Connection). The WLC algorithm assigns tasks to nodes based on the number of their connections, but it ignores the capabilities of each node, such as processing speed, storage capacity and bandwidth. ESWLC improves WLC by considering time series and trials: once the capabilities of a node are identified, ESWLC assigns the task to that node. ESWLC decides based on past experience of the node's CPU power, memory, number of connections and available disk space, and then predicts which node will be selected using exponential smoothing. In [15], a dual-direction download algorithm for FTP (DDFTP) has been proposed, which can also be utilized in a cloud computing environment. DDFTP splits a file of m blocks in half, and each server node then processes its assigned task following a specific pattern. For example, one server starts from block 0 and downloads in ascending order, while another server starts from block m and downloads in descending order. The two servers work independently, yet the whole file is downloaded within the best possible time allowed by the performance and characteristics of both servers. When the two servers download two consecutive blocks, the task is considered complete and new tasks can be assigned to the servers. This algorithm decreases the number of network connections required between clients and nodes and therefore diminishes network overhead. In addition, attributes such as network load, node load and network speed are automatically taken into account, while no supervision is required at execution time. Table 1 compares the algorithms against a number of criteria.
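The dual-direction idea behind DDFTP can be illustrated with a small simulation. Everything here is a sketch under assumptions: constant per-server block rates, the `ddftp_split` helper and its tie-breaking rule are not taken from [15].

```python
# Sketch of the DDFTP dual-direction idea: server A serves blocks
# 0, 1, 2, ... while server B serves blocks m-1, m-2, ... and the
# download is complete once their positions meet. Constant block
# rates (blocks/second) are an assumption of this sketch.

def ddftp_split(m, speed_a, speed_b):
    """Return how many of the m blocks each server ends up serving."""
    a_done = b_done = 0
    t_a = t_b = 0.0
    while a_done + b_done < m:
        # the next block completes on whichever server finishes it first
        if t_a + 1.0 / speed_a <= t_b + 1.0 / speed_b:
            t_a += 1.0 / speed_a
            a_done += 1
        else:
            t_b += 1.0 / speed_b
            b_done += 1
    return a_done, b_done

# a server three times faster than its partner serves ~3/4 of the file
print(ddftp_split(10, speed_a=3.0, speed_b=1.0))
```

The faster server automatically absorbs more blocks without any explicit coordination, which is why DDFTP adapts to node and network speed with no runtime supervision.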
For example, INS is the only algorithm that avoids redundancy and duplication of stored items. However, INS is a centralized algorithm, so it has a single point of failure. DDFTP, on the other hand, relies on duplicated resources and therefore needs more storage, but it takes a decentralized, dynamic approach to load distribution and is a simpler algorithm for downloading stored data; it could be enhanced with segmented duplication so that less storage is needed. Each algorithm usually addresses a limited number of these challenges, making it appropriate for the cases where those challenges arise. For instance, INS, CLBDM and VM Mapping have a single point of failure; in a highly stable environment where many resources are available, they therefore show very satisfactory performance. In addition, all algorithms except ant colony and VM Mapping can handle highly distributed environments, so they are more appropriate for general clouds than those two. Moreover, all algorithms except DDFTP place a large overhead on the network; consequently, if the network condition worsens, these algorithms may run into serious trouble, since growing delays will slow the whole load balancing process. DDFTP is more capable of handling such delays because it does not rely on timing supervision and control.
TABLE 1. Comparison of load balancing algorithms [18]

Algorithm   Replication  Speed     Heterogeneity  SPOF  Network   Spatially    Implementation  Fault
                                                        Overhead  Distributed  Complexity      Tolerance
INS         Partial      Moderate  Yes            Yes   Yes       Yes          High            No
ESWLC       Full         Fast      Yes            No    Yes       Yes          High            Yes
CLBDM       Full         Slow      Yes            Yes   Yes       Yes          Low             No
Ant Colony  Full         Fast      No             No    Yes       No           No              Yes
Map-Reduce  Full         Slow      Yes            No    Yes       Yes          High            Yes
VM Mapping  Full         Fast      Yes            Yes   Yes       No           High            Yes
DDFTP       Full         Fast      Yes            No    No        Yes          Low             Yes

3. EXTENSION OF DLT BASED ON FAILURE RATE
As discussed in section two, Divisible Load Theory (DLT) [8] is among the dynamic algorithms. In this section, DLT is explained and the impact of master/slave failures on the measurement/report time is studied.

3.1 DLT
Divisible Load Theory (DLT) for clouds concerns the optimized division and distribution of workload among a number of master systems, slave systems and the links between them. The Internet as a whole can be considered a huge cloud that contains connection-based and non-connection-based services. The load balancing for wireless sensor networks (WSN) proposed in [8, 19] can be utilized in clouds: a WSN is similar to a cloud with a number of servers and clients. It is supposed that clients have a specific capacity and that computation is carried out on the servers, so all measurement data are first collected from the related clients. Only the measurement time and connection time of clients are considered, while their computation time is ignored. The kind of cloud considered here for DLT is a star topology: a single-level tree with K servers, each with N connections to clients, as shown in figure 1. It is assumed that the whole load can be arbitrarily divided, so that each server and client in the cloud is assigned one share of the load. It is also assumed that computing time is negligible compared to measurement and connection time, and that a client may be connected to one or more servers at a given time.
Figure 1: K master computers, each joined to N slave computers, in a single-level tree network (star topology) [8].
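The single-level tree of figure 1 can be represented directly in code. This is a minimal sketch; the node-naming scheme and the `build_star` helper are illustrative assumptions:

```python
# Minimal representation of the star topology: K master nodes,
# each joined to N slave nodes. Node names are illustrative.

def build_star(k, n):
    """Map each of K masters to its N slaves."""
    return {f"master{m}": [f"slave{m}.{s}" for s in range(n)]
            for m in range(k)}

topo = build_star(k=3, n=2)
print(topo["master0"])  # ['slave0.0', 'slave0.1']
print(len(topo))        # 3

# under the arbitrarily-divisible assumption, an equal division gives
# every slave the share 1 / (K * N) of the total load
share_per_slave = 1.0 / (3 * 2)
```

With the arbitrary-divisibility assumption stated in the text, any positive weighting over these K·N slaves is a valid load assignment; the equal split shown is just the simplest one.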
3.2 Extended Approach to DLT
Through the extended approach to DLT, this article studies an enhancement based on the failure rates of servers and clients and their impact on the measurement/report time. Before the enhanced DLT is explained, all parameters are introduced in table 2.

Table 2: Description of parameters, notation and definitions

Notation      Meaning
α_i^k         Portion of the load assigned to slave i by master k.
a_i           A constant of the cloud, inversely proportional to the measuring speed of slave i.
b_i           A constant of the cloud, inversely proportional to the communication speed of link i.
T_ms          Constant for measurement intensity: the time it takes the i-th slave to measure the entire load when a_i = 1. The entire assigned measurement load can be measured on the i-th slave in time a_i T_ms.
T_cm          Constant for communication intensity: the time it takes to transmit the entire measurement load over a link when b_i = 1. The entire load can be transmitted over the i-th link in time b_i T_cm.
T_i^k         Total time elapsed between the beginning of the scheduling process at t = 0 and the moment slave i completes its reporting to master k, i = 0, 1, …, N. Besides measurement time, this includes reporting time and idle time.
T^k           Time when the last slave of master k finishes reporting (finish time or make-span): T^k = max(T_0^k, T_1^k, …, T_N^k).
T_f           Time when the last master receives the measurement from its last slave: T_f = max(T^1, T^2, …, T^K).
Failure rate  Failure rate of masters and slaves.
The enhanced approach to DLT, based on server/client failure rates and on sequential versus simultaneous reporting, is classified into four scenarios:

3.2.1 Server failure in the form of sequential reporting
At time t = 0, all clients are idle and the servers in the cloud start to communicate with their first related client. It is assumed that after measurement each client reports to its root server only once (equivalently, there is a single link between them). Clients receive a share of the load from their related servers, and computation starts after each complete load share. The minimum measurement/report time of the network, when a number of servers are randomly out of service, is given by equation (1):

T_f = t_0 + (a T_ms + b T_cm)(1 − q) / (K (1 − Failure Rate)(1 − q^n))   (1)

where:

q = a T_ms / (a T_ms + b T_cm)   (2)

If n approaches infinity, the measurement/report time approaches t_0 + b T_cm / (K (1 − Failure Rate)), so that as the number of clients connected to a server grows, the measurement component vanishes and only the reporting term remains. The corresponding expressions for the other computers follow in the same way.

3.2.2 Server failure in the form of simultaneous reporting
In this case, all N clients connected to a single server finish reporting at the same time, so the whole cloud has a single reporting finish time. Each client has an independent, separate channel to its server. The minimum measurement/report time, when a number of servers are randomly out of service, is given by equation (3):

T_f = t_0 + (a T_ms + b T_cm) / (K (1 − Failure Rate) N)   (3)

3.2.3 Client failure in the form of sequential reporting
In a similar manner, the minimum measurement/report time, when a number of clients are randomly out of service, is given by equation (4):

T_f = t_0 + (a T_ms + b T_cm)(1 − q) / (K (1 − Failure Rate)(1 − q^n))   (4)

If K approaches infinity, the measurement/report time approaches zero.

3.2.4 Client failure in the form of simultaneous reporting
In a similar manner, the minimum measurement/report time of the network, when a number of clients have randomly failed, is given by equation (5):

T_f = t_0 + (a T_ms + b T_cm) / (K N (1 − Failure Rate))   (5)

4. PERFORMANCE EVALUATION
To evaluate the proposed approach, two scenarios have been examined. The first considers the measurement/report time under random failure of some servers; the second considers it under random failure of some clients.

4.1 Server failures in sequential reporting
In figure 2, the total time for homogeneous clients connected to 10 servers is drawn, with the inverse link speed b and the inverse measuring speed a held constant at 1. In all cases, T_ms = 1 and T_cm = 1. The failure rate of the servers ranges from 0.2 to 0.8. Figure 3 shows the case where the inverse measuring speed and the inverse link speed range from 0 to 1 (in steps of 0.3).

Figure 2: Total time over server failure rate in sequential reporting.

Figure 3: Total time over failure rate of masters and inverse link speed (b) and inverse measurement speed (a) in sequential reporting.

4.2 Server failures in simultaneous reporting
In figure 4, the total time over the number of clients connected to a server is drawn for simultaneous measurement start and simultaneous reporting. Figure 5 displays the case where the inverse measuring speed and the inverse link speed range from 0 to 1 (in steps of 0.3).

Figure 4: Total time over failure rate of masters in simultaneous reporting.

Figure 5: Total time over failure rate of masters and inverse link speed (b) and inverse measuring speed (a) in simultaneous reporting.

4.3 Client failures in sequential reporting
In figure 6, the inverse link speed b and the inverse measuring speed a are held constant at 1. The failure rate of the clients ranges from 0.2 to 0.8. Figure 7 shows the case where the inverse measuring speed and the inverse link speed range from 0 to 1 (in steps of 0.3).

Figure 6: Total time over failure rate of slaves in sequential reporting.

Figure 7: Total time over failure rate of slaves and inverse link speed (b) and inverse measurement speed (a) in sequential reporting.

4.4 Client failures in simultaneous reporting
In figure 8, the inverse link speed b and the inverse measuring speed a are held constant at 1. Figure 9 displays the case where the inverse measuring speed and the inverse link speed range from 0 to 1 (in steps of 0.3). It can be concluded from figures 10 and 11 that, under sequential reporting, failures of servers and clients have a greater influence on the increase of the reporting time than under simultaneous reporting.

Figure 8: Total time over failure rate of slaves in sequential reporting.

Figure 9: Total time over failure rate of slaves and inverse link speed (b) and inverse measurement speed (a) in simultaneous reporting.

Figure 10: Total time over failure rate of masters in simultaneous and sequential reporting.

Figure 11: Total time over failure rate of slaves in simultaneous and sequential reporting.
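The simultaneous-reporting expressions can be checked numerically. The sketch below assumes the closed form T_f = t_0 + (a·T_ms + b·T_cm) / (K·N·(1 − failure rate)) that equations (3) and (5) take, with the parameter values used in the experiments (a = b = T_ms = T_cm = 1) and t_0 assumed to be 0:

```python
# Sketch: evaluating the simultaneous-reporting finish time
# T_f = t0 + (a*T_ms + b*T_cm) / (K * N * (1 - failure_rate)),
# the form taken by equations (3) and (5). Defaults mirror the
# experiments (a = b = T_ms = T_cm = 1); t0 = 0 is assumed.

def finish_time(k, n, failure_rate, a=1.0, b=1.0, t_ms=1.0, t_cm=1.0, t0=0.0):
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure rate must lie in [0, 1)")
    return t0 + (a * t_ms + b * t_cm) / (k * n * (1.0 - failure_rate))

# total time grows with the failure rate ...
for fr in (0.2, 0.5, 0.8):
    print(fr, finish_time(k=10, n=5, failure_rate=fr))

# ... and shrinks as more clients are attached to each server
print(finish_time(k=10, n=50, failure_rate=0.2))
```

This reproduces the two qualitative trends reported above: the total time increases with the failure rate and decreases as the number of clients per server grows.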
5. CONCLUSION AND FUTURE WORKS
In recent years, with the rapid development of network and Internet technology, more people are going online to acquire information, to shop and to be entertained. Consequently, the volume of data and client requests has grown, requiring more computation and processing on servers. Given the growing number of requests for cloud services, one of the main challenges in this respect is load balancing. In this article, an enhanced approach to DLT has been proposed for load balancing, and the measurement/report time has been used to evaluate it. The measurement/report time under simultaneous reporting is less than under sequential reporting, since in sequential reporting some clients receive almost zero load from the server; the number of effective clients is therefore smaller than in the simultaneous case. As the number of clients connected to a server increases, a residual completion time remains that is almost the same in the simultaneous and sequential cases. Thus, as the number of clients of a server increases, the completion time decreases, and with a growing number of clients connected to a server in a cloud, the completion time improves up to saturation (in sequential reporting); when reporting starts and stops, the completion time can be decreased considerably by adding more clients to a server. Moreover, the experimental findings indicate that as the ratio of the server failure rate to the client failure rate increases, the measurement/report time increases, and the measurement/report time in the simultaneous case is less than in the sequential case. Future work can proceed along two lines: first, in DLT the workload is distributed over two layers, but adding a layer between server and client could make processing faster; second, instead of connecting each client to a single server, each client could be connected to multiple servers.
6. REFERENCES
[1] Haozheng Ren, Yihua Lan, "The Load Balancing Algorithm in Cloud Computing Environment", 2012 IEEE International Conference on Computer Science and Network Technology.
[2] Anthony T. Velte, Toby J. Velte, Robert Elsenpeter, Cloud Computing: A Practical Approach, TATA McGraw-Hill Edition, 2010.
[3] Ali M. Alakeel, "A Guide to Dynamic Load Balancing in Distributed Computer Systems", IJCSNS International Journal of Computer Science and Network Security, Vol. 10, No. 6, June 2010.
[4] Mequanint Moges, Thomas G. Robertazzi, "Wireless Sensor Networks: Scheduling for Measurement and Data Reporting", August 31, 2005.
[5] Ratan Mishra and Anant Jaiswal, "Ant Colony Optimization: A Solution of Load Balancing in Cloud", International Journal of Web & Semantic Technology (IJWesT), Vol. 3, No. 2, April 2012.
[6] "Availability and Load Balancing in Cloud Computing", 2011 International Conference on Computer and Software Modeling, IPCSIT Vol. 14, IACSIT Press, Singapore, 2011.
[7] Ratan Mishra and Anant Jaiswal, "Ant Colony Optimization: A Solution of Load Balancing in Cloud", International Journal of Web & Semantic Technology (IJWesT), Vol. 3, No. 2, April 2012.
[8] Ram Prasad Padhy, P. Goutam Prasad Rao, "Load Balancing in Cloud Computing Systems", Rourkela-769 008, Orissa, India, May 2011.
[9] Z. Zhang and X. Zhang, "A Load Balancing Mechanism Based on Ant Colony and Complex Network Theory in Open Cloud Computing Federation", Proceedings of the 2nd International Conference on Industrial Mechatronics and Automation.
[10] A. Bhadani and S. Chaudhary, "Performance evaluation of web servers using central load balancing policy over virtual machines on cloud", Proceedings of the Third Annual ACM Bangalore Conference (COMPUTE), January 2010.
[11] Gunarathne, T., T-L. Wu, J. Qiu and G. Fox, "MapReduce in the Clouds for Science", in proc. 2nd International Conference on Cloud Computing Technology and Science (CloudCom), IEEE, pp. 565-572, November/December 2010.
[12] Ni, J., Y. Huang, Z. Luan, J. Zhang and D. Qian, "Virtual machine mapping policy based on load balancing in private cloud environment", in proc. International Conference on Cloud and Service Computing (CSC), IEEE, pp. 292-295, December 2011.
[13] T-Y., W-T. Lee, Y-S. Lin, Y-S. Lin, H-L. Chan and J-S. Huang, "Dynamic load balancing mechanism based on cloud storage", in proc. Computing, Communications and Applications Conference (ComComAp), IEEE, pp. 102-106, January 2012.
[14] Sotomayor, B., R. S. Montero, I. M. Llorente and I. Foster, "Virtual infrastructure management in private and hybrid clouds", IEEE Internet Computing, Vol. 13, No. 5, pp. 14-22, 2009.
[15] Al-Jaroodi, J. and N. Mohamed, "DDFTP: Dual-Direction FTP", in proc. 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), IEEE, pp. 504-503, May 2011.
[16] Radojevic, B. and M. Zagar, "Analysis of issues with load balancing algorithms in hosted (cloud) environments", in proc. 34th International Convention on MIPRO, IEEE, 2011.
[17] Nishant, K., P. Sharma, V. Krishna, C. Gupta, K. P. Singh, N. Nitin and R. Rastogi, "Load Balancing of Nodes in Cloud Using Ant Colony Optimization", in proc. 14th International Conference on Computer Modelling and Simulation (UKSim), IEEE, pp. 3-8, March 2012.
[18] Klaithem Al Nuaimi, Nader Mohamed, Mariam Al Nuaimi and Jameela Al-Jaroodi, "A Survey of Load Balancing in Cloud Computing: Challenges and Algorithms", 2012 IEEE Second Symposium on Network Cloud Computing and Applications.
[19] Mequanint Moges, Thomas G. Robertazzi, "Wireless Sensor Networks: Scheduling for Measurement and Data Reporting", August 31, 2005.