Reliability Aware Load Balancing Algorithm for Content Delivery ...

Cite this paper as: Gupta P., Goyal M.K., Gupta N. (2015) Reliability Aware Load Balancing Algorithm for Content Delivery Network. In: Satapathy S., Govardhan ...
Reliability Aware Load Balancing Algorithm for Content Delivery Network Punit Gupta, Mayank Kumar Goyal, and Nikhil Gupta Department of Computer Science Engineering, Jaypee University of Information Technology, Himachal Pradesh, India {punitg07,mayankrkgit,candid.nikhil}@gmail.com

Abstract. With the growing use of the Internet and of data sharing over it, network traffic has increased sharply, and so has the number of requests made to a server for a resource. A CDN (Content Delivery Network) is used to maintain quality of service (QoS) even when the number of requests becomes large. The main goal of a CDN is to balance the load over its servers. However, as the load on a server increases, even after balancing, the reliability of the server decreases: its fault rate and processing time rise. Existing CDN proposals do not take into account the real-time faults occurring in a server over time. To overcome this, a fault and reliability aware load balancing algorithm for CDN is proposed in this paper to increase the scalability and reliability of the CDN.

Keywords: CDN, QoS, Reliability, Load balancing, Fault rate, Network load, System load.

1 Introduction

With the rapid development of the Internet, network traffic is exploding, and demand for content services is surging. To provide uninterrupted service to users, Internet Service Providers must scale their infrastructure: a network of a few servers is not sufficient to handle the large number of requests. As the number of requests to servers grows rapidly, managing them also becomes harder, and this fast-growing traffic triggers many problems. Content Delivery Networks are among the best methods for coping with the increasing demand. A Content Delivery Network (CDN) is emerging as an effective solution to support the load of fast-growing web applications by adopting a distributed network of servers [8]. CDNs have been widely accepted as a method of distributing large amounts of content to users. By making several redundant copies of content, a CDN can relieve, to an extent, even the heavy congestion caused by an unexpectedly high request rate from clients. A CDN consists of a backend server, which holds the new data to be distributed, together with more than one stand-in server. Only static data is stored on the stand-in servers, while dynamic information is stored on a few backend servers. The backend server also updates the stand-in servers regularly. A CDN therefore maintains accessibility while preserving correctness by keeping redundant copies of data on multiple servers. It involves techniques such as load balancing, request routing, and maintaining multiple copies of resources.

Many issues and parameters restrict the performance of a CDN, such as load balancing, cost, request traffic, and response time. Many proposals balance load based on cost [1-3], and a few others on response time and server load [4-6]. CDNs have also been studied in terms of energy consumption and data transfer rate [3, 7, 9]; these proposals take into consideration the energy consumed by the server and the data transfer rate in the server. The primary issue that persists in CDN is load balancing, which can be characterized by the number of requests fulfilled by a data center and the queue length of a data center. In this paper we try to resolve the problem of load balancing in CDN, to overcome the drawbacks of the existing proposals, and to design a generalized load balancing algorithm for CDNs.

This paper is divided into 5 sections: Introduction, Related Work, Proposed Work, Experimental Results, and Conclusion. Section 2, Related Work, surveys the models proposed for load balancing in CDN and their drawbacks. Section 3 presents the proposed model for request scheduling and discusses how it overcomes these drawbacks.

© Springer International Publishing Switzerland 2015 S.C. Satapathy et al. (eds.), Emerging ICT for Bridging the Future – Volume 1, Advances in Intelligent Systems and Computing 337, DOI: 10.1007/978-3-319-13728-5_48

2 Related Work

This section considers load balancing algorithms proposed for CDN and for other distributed systems such as grid computing and cloud computing. Many load balancing algorithms have been proposed for content delivery networks based on load, cost, response time, QoS, and energy. A cost-based load balancing approach was proposed by Chrysa Papagianni [1]: a hierarchical framework is proposed and evaluated for efficient and scalable content distribution over a multi-provider networked cloud environment, where inter- and intra-cloud communication resources are considered simultaneously with traditional cloud computing resources. The performance of the framework is assessed via simulation and modeling, and appropriate metrics are defined that reflect the interests of the different key players.

Fig. 1. Content Delivery Networks

Fig. 2. Cost based Distribution


A similar cost-based algorithm was proposed by Naoya Maki [3], which proposes a periodic combined-content distribution mechanism to increase the gain in traffic localization. As shown in Figure 2, this mechanism automatically optimizes the distribution period using the length of time for which the previously downloaded combined content can be expected to localize traffic. Another algorithm, based on a linear program formulation, was proposed by Derek Leong [2]; it takes into account the various costs and constraints associated with disseminating content from the origin (the server) to the storage nodes, and with the eventual fetching of content from the storage nodes by end users. Vimal Mathew [7], adding a new dimension to CDN, proposed energy-aware load balancing with optimal offline and online algorithms that can be used to extract energy savings both at the level of local load balancing within a data center and at the level of global load balancing across data centers. S. Manfredi [10] also proposed an algorithm to improve overall throughput and response time, and hence performance. It is a highly dynamic distributed strategy based on the periodic exchange of information about the status of the nodes in terms of load. The authors also study the scalability of the algorithm and perform a comparative evaluation of its performance with respect to the best existing solutions.

3 Proposed Work

In all the papers discussed previously, load balancing in content delivery networks has been simulated and modeled taking into consideration parameters such as cost, load, request time, QoS, and throughput, but none of them took faults as a parameter. Our algorithm, fault aware load balancing in content delivery network (FLB), takes fault as a parameter in addition to the factors discussed previously. The factors on which our algorithm is based are therefore Network Load, Fault Rate, Queue Length, and Response Time. These parameters are defined as:
a) System load: Total number of MIPS (million instructions per second) in use.
b) Fault rate: Number of faults over a period of time.
c) Queue length: Maximum length of the request queue that a server can maintain and fulfill.
d) Response time: Time taken to start fulfilling a request.
e) Network load: Portion of the server's total bandwidth in use.

The proposed algorithm is divided into three phases:
a) Initialization
b) Load balancing
c) Updating


A. Initialization

In this phase the fitness value of a datacenter is initialized from default values of all the parameters explained above. All the parameters are checked and updated periodically. Initially the fault rate and network load are zero, whereas the queue length and response time depend on the properties of the datacenter. For example, a datacenter may maintain a queue length of 100 requests with an average response time of 10 milliseconds per request. The fitness value is calculated from these values. When a new datacenter is introduced into the CDN, it is initialized with default values, and its fitness value is calculated and then updated at equal intervals of time. The initial parameters are defined as:
Fault_Ini: Initial fault rate.
Quel_Ini: Initial queue length, based on the datacenter.
Resp_Ini: Initial response time, based on the datacenter.
Netload_Ini: Initial network load.

B. Load Balancing

When a server's queue is full and no more requests can be queued, further requests are dropped or rejected because they cannot be fulfilled. To overcome this, a replica of the requested data is made on another server, i.e. we need to find a datacenter that can fulfill the request. Servers can be classified into two categories, hot spots and cold spots. Hot spots are servers that are overloaded with requests and have most of their MIPS and network bandwidth in use. Cold spots are datacenters that have a low request rate and can accommodate more requests, in other words datacenters with little of their MIPS and network bandwidth in use, i.e. low network load and processing load. Load balancing is required to stop a server from becoming a hot spot. Whenever load balancing is invoked, we need to find the best-fit server based on the following parameters.
1) Fault rate: It is directly proportional to the load on the server, which can be network load, leading to network failure, or system load, leading to system failure or, on the other hand, to degradation in the QoS (quality of service) provided by the server.

$$\lambda(t) = F(\text{N\_load\_new}, \text{S\_load\_new}) \tag{1}$$

λ(t): fault over a time t.
N_load: network load.
S_load: system load.

Equation 1 states that the fault rate at a particular instant of time is a function of, and directly proportional to, the system load and the network load.

λ: fault rate, the total number of faults per hour.

$$\lambda = \frac{\sum \text{number of faults}}{\text{hours}} \tag{2}$$
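As a minimal sketch of Equation 2, the Python below counts the faults logged inside an observation window and divides by the window length in hours. The function name `fault_rate_per_hour`, the use of timestamps in seconds, and the sample fault log are illustrative assumptions, not part of the proposed algorithm.

```python
def fault_rate_per_hour(fault_timestamps, window_start, window_end):
    """Return faults per hour over the window [window_start, window_end).

    Timestamps are assumed to be in seconds (hypothetical convention).
    """
    if window_end <= window_start:
        raise ValueError("window must have positive length")
    # Count only the faults that fall inside the observation window.
    faults = sum(1 for t in fault_timestamps if window_start <= t < window_end)
    hours = (window_end - window_start) / 3600.0
    return faults / hours


# Hypothetical fault log: three faults fall inside a two-hour window,
# the fourth (at 9000 s) lies outside it.
rate = fault_rate_per_hour([600, 4000, 7000, 9000], 0, 7200)
print(rate)
```

A sliding window of this kind lets the fault rate follow recent behaviour of a server instead of its whole history; the window length is a tuning choice the paper does not specify.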


2) Response time: The time taken to start processing a request, i.e. the difference between the time the request was submitted and the time the server started processing it. It is directly proportional to system load: as the CPU utilization of the server increases, the response time increases. The server selected should therefore have the least average response time.
Resp = Response time
3) Queue length: Every server has a fixed request queue length that can be fulfilled without request failure. For load balancing we need to select a server that has a sufficiently large free queue to accommodate new requests without failure.

To balance the load, we take all the above parameters into consideration and calculate a fitness value for each server over which load can be balanced, in order to provide better QoS and to increase the reliability of the overall system by balancing the load and reducing failures. The fitness value of a server is determined as:
Fval(s): Fitness value of datacenter s

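$$\text{Fval}(s) = \alpha_1 \cdot \frac{1}{\lambda} + \alpha_2 \cdot \frac{1}{\text{Resp}} \tag{3}$$

$$\text{server\_id} = \min(\text{fval}_1, \text{fval}_2, \ldots, \text{fval}_n) \tag{4}$$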
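As a minimal sketch of the fitness calculation and server selection, the Python below computes a weighted sum of the reciprocals of fault rate and response time, assuming the reciprocal form that appears in Equation 6. It is not the authors' implementation: the `Server` class, the weights α1 = α2 = 0.5, the 10 ms response times, and the `EPS` guard against a zero fault rate are all illustrative assumptions; only the fault rates are taken from Table 1. The text selects the server with the highest fitness value while Equation 4 is written with min; the sketch follows the text and takes the maximum.

```python
from dataclasses import dataclass

EPS = 1e-9  # assumption: avoids division by zero for a server with no faults yet


@dataclass
class Server:
    name: str
    fault_rate: float     # lambda, faults per hour
    response_time: float  # Resp, in milliseconds (hypothetical unit)


def fitness(server, alpha1=0.5, alpha2=0.5):
    """Weighted sum of 1/lambda and 1/Resp; larger means more reliable and faster."""
    return (alpha1 / max(server.fault_rate, EPS)
            + alpha2 / max(server.response_time, EPS))


def select_server(servers, alpha1=0.5, alpha2=0.5):
    """Pick the server with the highest fitness value (assumption, see lead-in)."""
    return max(servers, key=lambda s: fitness(s, alpha1, alpha2))


# Fault rates from Table 1; response times are made up for illustration.
servers = [
    Server("Server1", fault_rate=0.143, response_time=10.0),
    Server("Server2", fault_rate=0.125, response_time=10.0),
    Server("Server3", fault_rate=0.5, response_time=10.0),
]
best = select_server(servers)
print(best.name, round(fitness(best), 3))
```

With equal response times the choice is driven entirely by the fault rate, so the server with the lowest fault rate wins; a real deployment would also have to check that the chosen server has free queue capacity, which this sketch leaves out.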

The server for load balancing is the one with the highest fitness value, i.e. the least fault rate, least network load, least system load, and largest free queue length. This approach helps in maintaining skewness, increases reliability, and decreases the fault rate.

C. Fitness Updating

This phase updates the current network load, system load, fault rate, and queue length of each datacenter. It is repeated at equal intervals of time to obtain the current status of the servers. Initially all the parameters are set to default values: the fault rate λ(t) is zero, and the network load and system load are also taken as zero. The queue length of a server is always zero initially because no requests have yet been made to that server.

λ(t)_Initial = 0 \\ Initial fault rate
N_load_initial = 0 \\ Initial network load
S_load_initial = 0 \\ Initial system load
Q_len_initial = 0 \\ Initial queue length
Res_Initial = 0 \\ Initial server response time


To calculate the new fitness value we need to find the changes in the parameters. Let Si be a server, and let λ(t)_new, N_load_new, S_load_new, Q_len_new, and Res_new be its new fault rate over a time t, network load, system load, queue length, and response time, respectively. Let the new fitness value of server i be Fval_new(Si).

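$$\lambda(t) = F(\text{N\_load\_new}, \text{S\_load\_new}) \tag{5}$$

$$\text{Fval\_new}(s) = \alpha_1 \cdot \frac{1}{\lambda_{\text{new}}} + \alpha_2 \cdot \frac{1}{\text{Resp\_new}} \tag{6}$$

$$\text{server\_id} = \min(\text{fval1\_new}, \text{fval2\_new}, \ldots, \text{fvaln\_new}) \tag{7}$$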

Whenever the fitness values are updated, the next request is always diverted to the server with the highest fitness value, based on the updated fitness values, as given by Equation 4.
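The update phase can be sketched as a periodic recomputation of each server's fitness from its latest counters, followed by re-selection of the target server. The Python below is a hedged illustration only: `ServerState`, `update_fitness`, the weights, the `eps` guard, and all numbers are hypothetical, the fault rate is taken as faults divided by elapsed hours, and the highest fitness value is chosen, as the text describes.

```python
class ServerState:
    """Mutable per-server counters tracked between updates (hypothetical)."""

    def __init__(self, name, resp_ms):
        self.name = name
        self.resp_ms = resp_ms  # current average response time in ms
        self.faults = 0         # faults observed so far
        self.fval = None        # last computed fitness value


def update_fitness(states, elapsed_hours, alpha1=0.5, alpha2=0.5, eps=1e-9):
    """Recompute every fitness value and return the server to route to next."""
    for s in states:
        lam_new = s.faults / elapsed_hours  # new fault rate, faults per hour
        s.fval = alpha1 / max(lam_new, eps) + alpha2 / s.resp_ms
    return max(states, key=lambda s: s.fval)


a = ServerState("A", resp_ms=10.0)
b = ServerState("B", resp_ms=20.0)

# First update after 2 hours: A has faulted 4 times, B once.
a.faults, b.faults = 4, 1
first_target = update_fitness([a, b], elapsed_hours=2.0)

# Second update after 4 hours: B has now accumulated 10 faults.
b.faults = 10
second_target = update_fitness([a, b], elapsed_hours=4.0)

print(first_target.name, second_target.name)
```

The example shows the intended effect of the update phase: once a server starts to fault more often, its fitness drops at the next interval and new requests are diverted elsewhere.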

4 Experimental Results

The GridSim API [10] is used for simulation. GridSim supports scheduling and load balancing in parallel and distributed environments. Its load balancing, server fault, and server request queue features are used to simulate the CDN. Initially GridSim does not support failure in servers; in this implementation we have introduced fault aware scheduling into GridSim to study the performance of CDN in a fault aware environment.

To study the performance and measure the improvement, we compare against the queue length based load balancing proposal [11] by S. Manfredi, in which the author takes queue length as the basis for load balancing. We consider 3 servers S1, S2, and S3, whose queue length, fault rate, and processing rate (the number of requests that can be processed in parallel) are shown in Table 1. There are 60 user requests, split as 50 requests for Server1 and 10 requests for Server2, while Server3 is free, as shown in Table 2. The total number of failures that occur with queue length based load balancing (QLBLB) is shown in Table 3. The proposed algorithm, on the other hand, selects the server with sufficient free queue length and the least fault rate, which decreases overall failures in the system; its output is shown in Table 4.

Table 1. Servers Parameters

Server Name    Queue length    Fault rate    Processing rate
Server1        10              0.143         10
Server2        50              0.125         50
Server3        50              0.5           50

Table 2. Request rate

Server Name    Request
Server1        50
Server2        10
Server3        0

Table 3. Failure Count with QLBLB

Server Name    Failure count
Server1        1
Server2        1
Server3        11

Table 4. Failure Count with FLB

Server Name    Failure count
Server1        1
Server2        5
Server3        0
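The totals implied by Tables 3 and 4 can be checked with a few lines of Python. This is only an arithmetic sketch: it reads the first set of failure counts as Table 3 (QLBLB) and the second as Table 4 (FLB), and the variable names are illustrative.

```python
# Failure counts per server, read from Tables 3 and 4.
qlblb_failures = {"Server1": 1, "Server2": 1, "Server3": 11}
flb_failures = {"Server1": 1, "Server2": 5, "Server3": 0}

total_qlblb = sum(qlblb_failures.values())
total_flb = sum(flb_failures.values())

# Relative reduction in failures when FLB replaces QLBLB for the 60 requests.
reduction = (total_qlblb - total_flb) / total_qlblb

print(total_qlblb, total_flb, round(reduction, 3))
```

Under this reading QLBLB sends the overflow to the free but fault-prone Server3, while FLB sends it to Server2, which has the lowest fault rate in Table 1.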

Table 5. Configuration Details

Server Name    Queue length    Fault rate    Processing rate
Server1        100             0.143         30
Server2        250             0.125         90
Server3        250             0.5           90

Fig. 3. Comparison of FLB and QLBLB with various request counts

Table 6. Failure rates

Request count                      60    100    200    300
Failure count of Proposed (FLB)    8     15     28     48
Failure count of QLBLB             13    23     43     71

Figure 3 shows the improvement in fault rate as the number of requests increases, comparing QLBLB with the proposed fault aware load balancing algorithm. The proposed algorithm is also tested for 200, 300, 400, and 500 user requests and 3 datacenters with the configuration shown in Table 5. Table 6 shows the overall reduction in failure count and the increase in reliability as the request count increases. The results thus show an increase in overall system reliability with the use of the fault based load balancing algorithm.
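To make the comparison in Table 6 concrete, the short Python sketch below computes, for each request count, how many failures FLB avoids relative to QLBLB and the corresponding relative reduction. The numbers are those of Table 6; the variable names and the choice of metrics are illustrative assumptions.

```python
request_counts = [60, 100, 200, 300]
flb_failures = [8, 15, 28, 48]     # proposed algorithm (FLB)
qlblb_failures = [13, 23, 43, 71]  # queue length based load balancing

# Failures avoided by FLB at each request count.
avoided = [q - f for f, q in zip(flb_failures, qlblb_failures)]

# Relative reduction in failure count with respect to QLBLB.
relative = [(q - f) / q for f, q in zip(flb_failures, qlblb_failures)]

for n, a, r in zip(request_counts, avoided, relative):
    print(f"{n} requests: {a} fewer failures ({r:.1%} reduction)")
```

The number of failures avoided grows with the request count, while the relative reduction stays roughly between 32% and 39% across the four rows.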

5 Conclusion

In this paper different types of load balancing algorithms for CDN have been discussed along with their drawbacks. To overcome these drawbacks, an efficient fault aware load balancing algorithm is proposed, which in our simulations produced fewer failures than queue length based load balancing (QLBLB) in a fault aware environment. In future work this algorithm may be compared with other proposals to study the improvement in QoS.

References

[1] Chrysa, P., Leivadeas, A., Papavassiliou, S.: A cloud-oriented content delivery network paradigm: modeling and assessment. IEEE Transactions on Dependable and Secure Computing 10(5), 287–300 (2013)
[2] Derek, L., Ho, T., Cathey, R.: Optimal content delivery with network coding. In: 43rd Annual Conference on Information Sciences and Systems, CISS 2009, pp. 414–419. IEEE (2009)
[3] Naoya, M., Shinkuma, R., Mori, T., Kamiyama, N., Kawahara, R.: A periodic combined-content distribution mechanism in peer-assisted content delivery networks. In: Proceedings of the ITU Kaleidoscope: Building Sustainable Communities (K-2013), pp. 1–8. IEEE (2013)
[4] Xueying, J., Li, S., Yang, Y.: Research of load balance algorithm based on resource status for streaming media transmission network. In: 2013 3rd International Conference on Consumer Electronics, Communications and Networks (CECNet), pp. 503–507. IEEE (2013)
[5] Li, L., Xiaozhen, M., Yulan, H.: CDN cloud: A novel scheme for combining CDN and cloud computing. In: 2013 International Conference on Measurement, Information and Control (ICMIC), vol. 1, pp. 687–690. IEEE (2013)
[6] TaeYeon, K., Song, H.: Hierarchical Load Balancing for Distributed Content Delivery Network. In: 2012 14th International Conference on Advanced Communication Technology (ICACT), pp. 810–813. IEEE (2012)
[7] Vimal, M., Sitaraman, R.K., Shenoy, P.: Energy-aware load balancing in content delivery networks. In: Proceedings IEEE INFOCOM, pp. 954–962. IEEE (2012)
[8] Naoya, M., Nishio, T., Shinkuma, R., Takahashi, T., Mori, T., Kamiyama, N., Kawahara, R.: Expected traffic reduction by content-oriented incentive in peer-assisted content delivery networks. In: 2013 International Conference on Information Networking (ICOIN), pp. 450–455. IEEE (2013)
[9] Sabato, M., Oliviero, F., Romano, S.P.: Optimised balancing algorithm for content delivery networks. IET Communications 6(7), 733–739 (2012)
[10] Rajkumar, B., Murshed, M.: Gridsim: A toolkit for the modeling and simulation of distributed resource management and scheduling for grid computing. Concurrency and Computation: Practice and Experience 14(13-15), 1175–1220 (2002)
[11] Sabato, M., Oliviero, F., Romano, S.P.: A distributed control law for load balancing in content delivery networks. IEEE/ACM Transactions on Networking (TON) 21(1), 55–68 (2013)
