Request Redirection and Data Layout for Network Traffic Balancing in Cluster-based Video-on-Demand Servers

Xiaobo Zhou
Department of Computer Science, Wayne State University, Detroit, MI 48202, USA

Cheng-Zhong Xu
Department of Elec. & Computer Engineering, Wayne State University, Detroit, MI 48202, USA

This research was supported in part by NSF grants ACI0203592 and CCR-9988266.

0-7695-1573-8/02/$17.00 (C) 2002 IEEE

Abstract

Cluster architecture is a cost-effective approach to building scalable servers. For I/O-intensive video-on-demand (VoD) applications, network bandwidth is usually the primary bottleneck. In this paper, we propose a request redirection strategy to balance the network traffic in cluster-based VoD servers. The redirection strategy uses the servers' internal backbone bandwidth to equalize their external network traffic. The performance of redirection depends on the replication-based data layout. We implemented two replication methods and two placement methods for videos with different popularities, and evaluated the redirection strategy with the resulting data layouts. The simulation results verify the effectiveness of the redirection strategy and the proposed data layout methods.

1 Introduction

Recent advances in storage, compression and communication have renewed interest in Video-on-Demand (VoD) applications in areas such as home entertainment, distance learning and e-commerce. As the last-mile bandwidth problem is being solved by the proliferation of broadband access, we face the challenge of designing and implementing large scale VoD servers that can process and deliver rich continuous video simultaneously to millions of clients connected to the Internet [8].

VoD servers have to be scalable in storage and streaming capacity. Early large scale VoD servers mostly ran on massively parallel computers. For cost-effectiveness and scalability, cluster-based servers are coming into practice in recent commercial applications such as HP AutoRAID. This type of VoD server is usually built on a shared RAID storage subsystem. A front-end server serves as the dispatcher for incoming requests. Video data is striped into blocks and distributed over the disk arrays. Although such systems are easy to build and administer, they have limited scalability due to disk access contention. Furthermore, as the number of disks in the storage subsystem increases, so do the control overhead and the probability of a failure [5].

[Figure 1: A cluster architecture. Labeled components: clients, IP network, dispatcher, external NIC, internal NIC, high-speed backbone, disks.]

With these problems in mind, we follow an alternative cost-effective approach [5, 7]. As shown in Figure 1, a distributed storage cluster comprises a number of VoD servers linked by a backbone network. Each server has its own storage subsystem. The cluster is controlled by a dispatcher, which makes admission decisions and determines a responsible server for each incoming request. To avoid network traffic jams around the dispatcher, the system relies on a TCP handoff protocol that enables servers to respond to client requests directly. Owing to server autonomy, the distributed storage cluster architecture has better scalability in terms of storage and streaming capacity. This type of cluster can also be employed in geographically distributed environments [11].

[Figure 2: Data layout in shared and distributed storage subsystems. (a) Data striping in a shared storage subsystem: videos A, B and C stored as blocks (A1, 3), (A2, 3), (A3, 3); (B1, 3), (B2, 3), (B3, 3); (C1, 3), (C2, 3), (C3, 3). (b) Data striping and replication in a distributed storage subsystem: (A1, 2), (A2, 2), (A, 1.5); (B1, 2), (B2, 2), (B, 1.5); (C1, 2), (C2, 2), (C, 1.5). A backbone label also appears in the figure.]

VoD applications are I/O-intensive, and network bandwidth is usually the primary bottleneck [1, 8]. Balancing network traffic is critical in heavy-load periods. Data layout methods have a significant effect on the scalability of servers and on the balance of their network traffic. Previous work on data layout mostly focused on striping. As shown in Figure 2(a), in shared storage cluster-based servers, video data is striped into blocks and distributed over a disk array. The first integer label indicates the block index of a video and the second integer specifies the encoding bit rate. In distributed storage servers, video data is striped into blocks and distributed over the servers of the cluster. The main advantages of striping are high disk utilization and the capability of dealing with skews in access patterns for load balancing. However, wide data striping has the following shortcomings:

1. Striping leads to high scheduling overhead in large scale systems. Because of the continuity constraints of a single video, its delivery from multiple servers requires some form of synchronization. Thus, striping is limited to servers that are closely coupled geographically.

2. As the VoD server scales with more videos and more storage space, expanding the system requires re-striping the entire data set.
3. The number of disks for striping is limited, and the disks have to be homogeneous [3].
4. Striping could lead to low availability in the event of disk failures [5].

A complementary approach to striping is replication [3, 4, 9, 12]. Instead of scattering a video's data across servers, replication stores each video on a single server and keeps its copies on different servers. Replication simplifies system organization and administration. Its main disadvantage is that the number of replicas and their placement may have to be adjusted as the video access patterns change dynamically, which incurs costly data movement at run time. Essentially, striping is a good approach to load balancing while replication is a good approach to scalability. In practice, as shown in Figure 2(b), a hybrid approach based on striping and replication can be used.

In this paper, we focus on video replication strategies between servers. To deal with the load imbalance due to the skews in access patterns, we propose a request redirection policy. The contributions of the paper are:

1. We propose a request redirection strategy that uses the backbone bandwidth to equalize the external network traffic between servers of a VoD cluster during peak workload.
2. Assuming a priori knowledge about video popularities and the same peak arrival period for all videos, we implement a family of efficient data layout methods based on replication.


3. We conduct a comprehensive performance evaluation focusing on the impact of request redirection, data layout methods, replication degree, and the bandwidth of the internal backbone.

2 Related Work

VoD applications have long been a research topic. Early studies mostly focused on data striping, data retrieval and disk scheduling on the server side [2, 4, 9], as well as caching and multicasting policies at server-network interfaces for reducing I/O requirements [1]. More recent work focuses on the scalability and reliability of distributed servers [3, 5, 7]. Lee et al. employ data striping at the server level to achieve fine-grain load balancing across multiple servers [7]. Due to the pull-based model, clients must know the placement and migration of all video blocks. The authors of [5] focus on the impact of data striping methods on reliability. Chou et al. study the tradeoff between the striping degree and the replication degree [3]. Dynamic replication techniques are employed to achieve scalability and reliability at the cost of CPU resources and I/O bandwidth for data movement. In [2], the authors suggested that replication based on a Zipf-like video popularity distribution could improve throughput. However, they did not give methods for exploiting the access patterns in replication and placement. Wolf et al. proposed the DASD scheme for load balancing in a multi-disk server. It employed a replication technique borrowed from the theory of resource allocation problems [9].

In this paper, we propose a request redirection strategy for network traffic balancing. In this context, we propose a family of data layout algorithms and study their impact on request redirection and load balancing. Our work is complementary to previous work.

3 Data Layout Algorithms

Network bandwidth is the major performance bottleneck in many VoD systems [1, 8]. Load balancing improves the system throughput and reduces the rejection rate during heavy-load periods. Increasing the replication degree enhances the ability to balance load. Additionally, multiple replicas increase availability, since requests for a failed replica can be redirected to another replica. However, the replication degree is constrained by the encoding bit rates of videos and the storage capacity of the cluster.

Let $L$ denote the degree of network communication load imbalance of the cluster. There are many ways to define the load imbalance degree. We adopt

[Equation (1), the definition of $L$, is not recoverable from the extracted text.] (1)

where $\bar{l}$ is the mean communication load in a cluster of $N$ servers, i.e. $\bar{l} = \frac{1}{N} \sum_{j=1}^{N} l_j$.

Consider the data layout of $M$ videos with the same duration (say 90 minutes for typical movies) on a cluster of $N$ homogeneous servers. We assume that all videos have the same fixed encoding bit rate, because recent transcoding algorithms can provide various streaming qualities on the fly [10]. Each server in the cluster has a storage capacity $C$ in terms of the number of replicas and an external communication bandwidth $B$. Like much other work [1], we assume that the video popularity distribution conforms to Zipf-like distributions with a skew parameter $\theta$, i.e. $p_i = \frac{1/i^{\theta}}{\sum_{j=1}^{M} 1/j^{\theta}}$. Let $w_i$ denote the communication load imposed on a replica of video $v_i$ by round-robin scheduling. The objective of replication is to obtain a fine granularity of $w_i$ for later placement. Assuming a priori knowledge about video popularities and the same peak arrival period for all video accesses, we formulate video replication as a minimax optimization problem:

Minimize $\max_{1 \le i \le M} w_i$ (2)

subject to: 1) $1 \le r_i \le N$ with $r_i$ an integer, and 2) $\sum_{i=1}^{M} r_i \le N \cdot C$, where $r_i$ is the number of replicas of video $v_i$.

We note that it would be impossible to find a linear relation between the number of replicas and the popularity of a video, because: 1) a Zipf-like distribution is not linear; 2) $r_i$ is limited to $1 \le r_i \le N$; and 3) $r_i$ must be an integer.

This minimax problem is close to a classical apportionment problem [6]. One difference is that the number of replicas of a video is bounded by the number of servers. A well-known solution is based on Adams' monotone divisor [6, 9]. Its main idea is to give a replica to the video with the currently greatest communication load, provided that its number of replicas is lower than the number of servers in the cluster and there is space for replication. Its complexity is analyzed in [6].

3.1 Classification based Replication

To better understand the impact of the replication method on performance, we first give a straightforward algorithm called classification based replication. It classifies the $M$ videos roughly into $N$ classes according to their popularities, where $N$ is the number of servers in the cluster. All classes contain approximately the same number of videos ($\lfloor M/N \rfloor$ or $\lceil M/N \rceil$). In the first step, the replication procedure gives each video one replica. Then, if the cluster storage capacity allows, each iteration of the replication procedure gives each video one more replica, starting from the video with the highest popularity. Figure 3 illustrates an example with 20 videos and 4 servers. Each dotted line shows a replication iteration. Finally, if more replicas can still be placed onto the cluster, the replication algorithm simply performs round-robin replication. The replication procedure guarantees that the numbers of replicas of videos in two adjacent classes differ by 1 or 2. Obviously, classification based replication meets the two constraints of Eq. (2).

[Figure 3: Classification based replication. Videos ordered from high to low popularity and grouped into class 1 [v1 ~ v5], class 2 [v6 ~ v10], class 3 [v11 ~ v15] and class 4 [v16 ~ v20].]

3.2 Zipf-like Distribution based Replication

Motivated by the classification based replication, we give an efficient algorithm that uses the information about Zipf-like video popularity distributions to approximate the optimal replication scheme obtained by the Adams' monotone divisor algorithm. The Zipf-like distribution based algorithm also assigns the number of replicas to videos according to a classification of video popularities. Its basic idea is to partition the Y-axis popularity range into $N$ (the number of servers) classification intervals according to a Zipf-like distribution with its own skew parameter. It then assigns the number of replicas $r_i$ to video $v_i$ according to the index of the Y-axis interval in which its popularity falls. Figure 4 illustrates the replication of 20 videos in a cluster with 6 servers. The cluster storage capacity is 42 replicas. The skew parameter of the partition determines the total number of replicas generated. The parameter has a bounded search space and a termination condition, and we presented a binary search approach to find its optimal value in [12].

[Figure 4: Zipf-like distribution based replication. Axis: popularity (Zipf-like popularity distributions); videos V1 to V20 with popularities p1 to p20; interval boundaries z0 to z6 with intervals u1 to u6; replica counts from r_i = 6 down to r_i = 1.]
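The greedy idea behind Adams' monotone divisor replication described in Section 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `zipf_popularity` and `greedy_replication` are hypothetical, popularities are assumed proportional to 1/i^theta, and the per-replica load of a video is assumed to be its popularity divided by its number of replicas.

```python
import heapq


def zipf_popularity(num_videos, theta):
    """Normalized Zipf-like popularities, p_i proportional to 1 / i**theta."""
    weights = [1.0 / (i ** theta) for i in range(1, num_videos + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def greedy_replication(popularity, num_servers, capacity_per_server):
    """Repeatedly give one more replica to the video with the heaviest replicas.

    Assumption: the per-replica load of video i is popularity[i] / replicas[i].
    Each video gets at least 1 and at most num_servers replicas, and the total
    never exceeds num_servers * capacity_per_server.
    """
    num_videos = len(popularity)
    budget = num_servers * capacity_per_server
    if budget < num_videos:
        raise ValueError("storage cannot hold one replica of every video")
    replicas = [1] * num_videos
    # Max-heap on per-replica load (negated, because heapq is a min-heap).
    # Videos already at the replica cap are never put on the heap.
    heap = [(-p, i) for i, p in enumerate(popularity)] if num_servers > 1 else []
    heapq.heapify(heap)
    spare = budget - num_videos
    while spare > 0 and heap:
        _, i = heapq.heappop(heap)
        replicas[i] += 1
        spare -= 1
        if replicas[i] < num_servers:
            heapq.heappush(heap, (-popularity[i] / replicas[i], i))
    return replicas
```

For instance, `greedy_replication(zipf_popularity(20, 1.0), 6, 7)` spreads a 42-replica budget over 20 videos on 6 servers, the setting of Figure 4. The counts it returns need not match the figure, which depicts the Zipf-like distribution based algorithm rather than this greedy procedure.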

3.3 Video Placement

The objective of video placement is to map all replicas of videos to servers of the cluster so as to minimize the degree of load imbalance $L$. If replication leads to a uniform communication weight for all replicas, a simple grouped round-robin placement achieves an optimal solution. It assumes that the replicas are listed in groups, with all $r_1$ replicas of video $v_1$ first, followed by the $r_2$ replicas of $v_2$, and so on up to the $r_M$ replicas of $v_M$.

However, in most cases the communication weights of replicas of different videos differ after replication. This is due to the three reasons discussed above, i.e., 1) Zipf-like distributions are not linear, 2) $1 \le r_i \le N$, and 3) $r_i$ must be an integer. For example, the ratio of $p_1$ (the highest popularity) to $p_M$ (the lowest popularity) is $M^{\theta}$. If $M^{\theta} > N$, no replication with $1 \le r_i \le N$ can equalize the per-replica loads of these two videos. This means that the communication weights of replicas of different videos may differ.

We proposed a grouped lightest loaded first placement algorithm in [12]. It arranges all replicas of a video in a group and sorts all groups in non-increasing order of the communication weight of their replicas. At each iteration, it places the replicas of the group with the greatest communication weight onto a corresponding number of servers that have the lightest load and available storage capacity.
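The grouped lightest-load-first idea described above can be sketched in Python as follows. This is a hedged illustration rather than the algorithm of [12]: the function names are hypothetical, storage capacity is counted in replicas per server, and `max_relative_imbalance` is an illustrative metric chosen for this sketch that may differ from the paper's Eq. (1).

```python
def lightest_load_first_placement(replicas, weights, num_servers, capacity_per_server):
    """Place each video's replicas on distinct servers, heaviest group first.

    replicas[i] is the replica count of video i and weights[i] the load of
    each of its replicas. Returns (placement, load), where placement[s] lists
    the video indices stored on server s and load[s] is its total load.
    """
    load = [0.0] * num_servers
    placement = [[] for _ in range(num_servers)]
    # Sort video groups by per-replica weight, greatest first.
    order = sorted(range(len(replicas)), key=lambda i: weights[i], reverse=True)
    for i in order:
        # Servers that still have free storage, lightest load first.
        candidates = sorted(
            (s for s in range(num_servers) if len(placement[s]) < capacity_per_server),
            key=lambda s: load[s],
        )
        if len(candidates) < replicas[i]:
            raise ValueError(f"not enough servers with free storage for video {i}")
        for s in candidates[: replicas[i]]:
            placement[s].append(i)
            load[s] += weights[i]
    return placement, load


def max_relative_imbalance(load):
    """Illustrative metric: largest deviation from the mean load, relative to the mean."""
    mean = sum(load) / len(load)
    return max(abs(x - mean) for x in load) / mean
```

Taking the lightest servers as distinct candidates keeps the replicas of one video on different servers, which is what allows round-robin scheduling among replicas to spread a video's load.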

4 Request Redirection

We note that the preceding analyses of replication are based on the assumption of a priori knowledge of video popularities. We have also assumed the same peak period of request rates for all videos. This might be true for videos in the same category. Nevertheless, in general, different types of videos would not have the same peak period of arrival rates. In fact, many experiments involving nonstationary traffic patterns assumed that the relative popularity of videos varied on a weekly, daily, and even hourly basis [1, 9]. Currently, we are studying methods of predicting the video popularity distributions and request arrival rates according to video classifications, subscriber profiles and history data. However, it is impossible to have full knowledge of the video popularities in advance.

Balancing network traffic among the servers of a VoD cluster is critical during heavy-load periods. We propose a request redirection strategy that achieves dynamic network traffic balancing by taking advantage of the internal backbone bandwidth. Figure 5 illustrates it. Consider the situation in which video $v_a$ is more popular than expected and video $v_b$ is less popular than expected. Server $s_a$, to which video $v_a$ is mapped, may be overloaded, and server $s_b$, to which video $v_b$ is mapped, may be underloaded. Note that a request for video $v_a$ can be serviced not only by its hosting server $s_a$ when its outgoing network bandwidth is available, but possibly also by a server with available outgoing network bandwidth, say $s_b$, which has no replica of $v_a$, provided that bandwidth is available on the internal link connecting the two servers. The dotted line in Figure 5 shows a redirected request. Redirection avoids the extra run-time overhead and possibly excessive delays incurred by dynamic replication. However, its effect is constrained by the data layout method and the internal bandwidth of the backbone network. Considering server autonomy, redirection is not employed as long as no request would be rejected. Thus, redirection does not always guarantee a load balanced system: a load balanced system is desirable, but it is not the final goal.

[Figure 5: An illustration of the redirection. Labels: servers S1 to S3 (S: server), videos V1 to V5 (V: video), NIC, "request V2", "stream V2".]
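The admission decision with redirection described in this section can be sketched in Python as follows. It is a minimal illustration under assumptions that are not taken from the paper: bandwidth is counted in whole stream slots, the `Server` class, `link_key` and `admit` are hypothetical names, and ties are broken in favor of the server with the most free external bandwidth.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    videos: frozenset      # videos with a replica on this server
    free_external: int     # free outgoing stream slots on the external NIC


def link_key(a, b):
    """Order-independent key for the internal link between servers a and b."""
    return frozenset((a, b))


def admit(video, servers, free_internal):
    """Return (sender, source) server names for an admitted request, or None.

    free_internal maps link_key(a, b) to the free stream slots on that link.
    """
    hosts = [s for s in servers if video in s.videos]
    # 1) Prefer a hosting server that still has external bandwidth.
    direct = [s for s in hosts if s.free_external > 0]
    if direct:
        best = max(direct, key=lambda s: s.free_external)
        best.free_external -= 1
        return best.name, best.name
    # 2) Otherwise redirect: a host feeds a non-host over an internal link,
    #    and the non-host streams to the client through its external NIC.
    options = [
        (other, host)
        for host in hosts
        for other in servers
        if video not in other.videos
        and other.free_external > 0
        and free_internal.get(link_key(host.name, other.name), 0) > 0
    ]
    if not options:
        return None  # 3) No bandwidth anywhere: reject the request.
    other, host = max(options, key=lambda pair: pair[0].free_external)
    other.free_external -= 1
    free_internal[link_key(host.name, other.name)] -= 1
    return other.name, host.name
```

Redirection is tried only after every hosting server has run out of external bandwidth, which mirrors the statement that redirection is not employed while no request would be rejected.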

5 Performance Evaluation

In this section, we present the experimental results of request redirection and different data layout methods under different replication degrees. We found that the replication algorithms based on the Zipf-like distribution and on Adams' monotone divisor achieved nearly the same results in most test cases, differing only in their time complexities. For brevity, we omit the results of Adams' monotone divisor replication. Hence, for replication we have the classification based and the Zipf-like distribution based methods. For placement, we compare grouped round-robin and grouped lightest load first. Together these give four different data layout methods.

In the experiments, the VoD cluster contained 200 videos, each with a duration of 90 minutes. The cluster consisted of homogeneous servers, each with 1.8 Gbs of outgoing network bandwidth. The encoding bit rate of the videos was fixed at the typical rate for MPEG-2 movies, i.e. 4 Mbs. Thus, 40 requests per minute was the peak rate at full utilization of the streaming capacity of the cluster. The storage capacity of each server ranged from 67.5 GB to 202.5 GB. Thus, the storage capacity of the cluster ranged from 200 to 600 replicas and the average replication degree ranged from 1.0 to 3.0.

We assumed that video requests were generated by a Poisson process, with exponentially distributed interarrival times of mean $1/\lambda$. Thus, the overall video arrival rate was $\lambda$. In the experiments, $\lambda$ varied from 20 to 40 requests per minute. We assumed Zipf-like distributions for the relative popularities of the 200 videos, governed by the Zipf skew parameter $\theta$. The simulation model employed a simple admission control in which a request was rejected if the required communication bandwidth was not available. Hence, we adopted the rejection rate as the performance metric. We give some representative results as follows. Each simulation result is an average of 200 runs.

[Figure 6: Impact of data layout methods on rejection rate ($\theta$ = 1). Panels: (a) average replication degree is 1.2; (b) average replication degree is 3.0. X-axis: arrival rate (per minute), 20 to 40; Y-axis: rejection rate (%). Curves: classification replication + round-robin placement; classification replication + lightest load first placement; Zipf replication + round-robin placement; Zipf replication + lightest load first placement.]

[Figure 7: Impact of layout methods on load balancing. X-axis: arrival rate (per minute), 20 to 40; Y-axis: load imbalance L (%). Same four curves as Figure 6.]

[Figure 8: Impact of layout methods on load balancing. X-axis: arrival rate (per minute), 20 to 40; Y-axis: load imbalance L (%). Same four curves as Figure 6.]
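The simulation parameters quoted in this section are mutually consistent, which the following Python sketch checks. It is a back-of-the-envelope calculation, not part of the authors' simulator: the helper names are hypothetical, decimal units (1 GB = 1000 MB) are assumed, and the server count of 8 is an inference from the quoted storage figures rather than a number stated in the text.

```python
def replica_size_gb(bit_rate_mbs, duration_min):
    """Size of one video replica in GB (megabits -> megabytes -> gigabytes)."""
    return bit_rate_mbs * duration_min * 60 / 8 / 1000


def peak_concurrent_streams(arrivals_per_min, duration_min):
    """Streams active in steady state: arrival rate times stream duration."""
    return arrivals_per_min * duration_min


size_gb = replica_size_gb(4, 90)                    # about 2.7 GB per video
replicas_per_server = round(67.5 / size_gb)         # smallest storage setting
num_servers = round(200 / replicas_per_server)      # 200 replicas in the cluster
streams = peak_concurrent_streams(40, 90)           # at 40 requests per minute
cluster_gbs = streams * 4 / 1000                    # total outgoing bandwidth
per_server_gbs = cluster_gbs / num_servers
streams_per_internal_link = 100 // 4                # a 100 Mbs internal link

print(size_gb, replicas_per_server, num_servers)
print(streams, cluster_gbs, per_server_gbs, streams_per_internal_link)
```

Under these assumptions the 67.5 GB setting holds 25 replicas per server, the cluster has 8 servers, and a 100 Mbs internal link carries 25 streams, matching the "25-stream bandwidth" quoted for the internal links.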

5.1 Impact of Data Layout Methods

Figure 6 shows the impact of the four data layout methods on rejection rate under various replication degrees. We conducted the experiments with a wide range of replication degrees and Zipf skew parameters $\theta$. Due to space limitations, we only present the results for average replication degrees of 1.2 and 3.0, shown in Figures 6(a) and 6(b), respectively. In all following results, $\theta$ is 1. It can be seen that the data layout methods with either Zipf-like distribution based replication or grouped lightest load first placement improve significantly over the combination of classification based replication and round-robin placement. The figure also shows that, with Zipf-like distribution based replication, round-robin placement and lightest load first placement differ only nominally. This demonstrates the effectiveness of the Zipf-like distribution based replication from another perspective: the replication achieves the desired granularity of communication load among replicas. The results also reveal that the data layout method is more critical when the storage capacity of the cluster is rather limited.

Figures 7 and 8 show the impact of data layout methods on balancing network traffic when the average replication degree is 1.2 and 3.0, respectively. They help explain the performance curves in Figure 6. The data layout methods with either Zipf-like distribution based replication or lightest load first placement are more stable as the arrival rate changes. Good data layout methods lead to the desired load balance even with simple round-robin scheduling. Note that as the arrival rate approaches the peak capacity, i.e. 40 requests per minute, all load imbalance curves degrade considerably. This is because more and more servers have reached their network traffic capacity.

5.2 Impact of Request Redirection

The results above are based on the assumptions of a priori knowledge of video popularities and of the same peak period for all videos. Dynamic network traffic balancing can be achieved by request redirection. Figure 9 shows the performance of the request redirection strategy combined with two different data layout methods. Each internal link between two servers was assigned 100 Mbs, i.e. a bandwidth of 25 streams. The average replication degree is 1.2 and 3.0 in subplots (a) and (b), respectively.

[Figure 9: Impact of request redirection on rejection rate. Panels: (a) average replication degree is 1.2; (b) average replication degree is 3.0. X-axis: arrival rate (per minute), 20 to 40; Y-axis: rejection rate (%). Curves: classification replication + round-robin placement and Zipf replication + lightest load first placement, each with and without redirection.]

[Figure 10: Impact of bandwidth of internal backbone. X-axis: bandwidth of each internal link (Mbs), 0 to 200; Y-axis: rejection rate (%). Curves: classification replication + round-robin placement + redirection; Zipf replication + lightest load first placement + redirection.]

Request redirection clearly reduces the rejection rate considerably by balancing the outgoing network traffic of the servers in the cluster. It also postpones the onset of rejection. Interestingly, with the redirection strategy, the poor data layout method and the good data layout method have the same performance in both subplots. This is probably because the bandwidth of each internal link is so large that the outgoing network traffic is fully balanced by redirection. Note that when the arrival rate reaches the peak capacity, i.e. 40 requests per minute, some requests (about 1%) are still rejected even with redirection. This is because the instantaneous arrival rate may exceed the average arrival rate in the simulation.
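The remark that arrivals can momentarily exceed their average follows from the Poisson assumption and can be illustrated with a short Python calculation. This is a simplified sketch, not the paper's simulation: it only computes the probability that a single minute sees more arrivals than the mean, and `poisson_tail` is a hypothetical helper name.

```python
import math


def poisson_tail(mean, threshold):
    """P(X > threshold) for X ~ Poisson(mean), summed term by term."""
    term = math.exp(-mean)      # P(X = 0)
    cdf = term
    for k in range(1, threshold + 1):
        term *= mean / k        # P(X = k) obtained from P(X = k - 1)
        cdf += term
    return 1.0 - cdf


# Chance that one minute brings more than 40 requests when the mean is 40.
print(poisson_tail(40.0, 40))
```

A substantial fraction of minutes therefore bring more requests than the average rate, so some rejections remain at the peak rate even when redirection has balanced the outgoing traffic.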

To study the impact of internal bandwidth on the load balancing performance of redirection, and hence on rejection rate, we conducted another group of experiments. Figures 10 and 11 show the rejection rate under the peak arrival rate when redirection is employed with varying bandwidth of the internal backbone links. The average replication degree is 1.2 and 3.0, respectively. Clearly, once the bandwidth of the internal links exceeds a certain knee, the good and the poor data layout methods have the same performance, due to the load balancing achieved by request redirection. When the average replication degree is 1.2, as shown in Figure 10, this knee appears when each internal link has about 100 Mbs of bandwidth. When the average replication degree is 3.0, as shown in Figure 11, the knee appears at about 50 Mbs.

[Figure 11: Impact of bandwidth of internal backbone. X-axis: bandwidth of each internal link (Mbs), 0 to 200; Y-axis: rejection rate (%). Curves: classification replication + round-robin placement + redirection; Zipf replication + lightest load first placement + redirection.]

6 Conclusion

A distributed storage cluster is a cost-effective approach to building scalable VoD servers. In this paper, we have proposed a request redirection strategy to balance network traffic by using the internal bandwidth of the backbone. A family of video replication and placement algorithms has also been presented. The experimental results have shown the effectiveness of the redirection strategy with different data layout methods. The results have also shown the impact of replication degree and internal backbone bandwidth on balancing the network traffic of a cluster.

References

[1] C.C. Aggarwal, J.L. Wolf, and P.S. Yu. The maximum factor queue length batching scheme for video-on-demand systems. IEEE Trans. on Computers, 50(2):97–110, 2001.
[2] A.L. Chervenak, D.A. Patterson, and R.H. Katz. Choosing the best storage system for video service. In Proc. ACM Multimedia'95, pages 109–119, 1995.
[3] C.F. Chou, L. Golubchik, and J.C.S. Lui. Striping doesn't scale: how to achieve scalability for continuous media servers with replication. In Proc. IEEE ICDCS'00, pages 64–71, 2000.
[4] A. Dan and D. Sitaram. An online video placement policy based on bandwidth to space ratio (BSR). In Proc. ACM SIGMOD'95, pages 376–385, 1995.
[5] J. Gafsi and E.W. Biersack. Modeling and performance comparison of reliability strategies for distributed video servers. IEEE Trans. on Parallel and Distributed Systems, 11(4):412–430, 2000.
[6] T. Ibaraki and N. Katoh. Resource allocation problems: Algorithmic approaches. The MIT Press, 1988.
[7] Y.B. Lee and P.C. Wong. Performance analysis of a pull-based parallel video server. IEEE Trans. on Parallel and Distributed Systems, 11(12):1217–1231, 2000.
[8] J. Liu and D.H.C. Du. Continuous media on demand. IEEE Computer, 34(9):37–39, 2001.
[9] J.L. Wolf, P.S. Yu, and H. Shachnai. Disk load balancing for video-on-demand systems. ACM/Springer Multimedia Systems Journal, 5(6):358–370, 1997.
[10] J. Youn, M.T. Sun, and C.W. Lin. Motion vector refinement for high-performance transcoding. IEEE Trans. on Multimedia, 1(1):30–40, 1999.
[11] X. Zhou, R. Lüling, and L. Xie. Solving a media mapping problem in a hierarchical server network with parallel simulated annealing. In Proc. IEEE ICPP'2000, pages 115–124, 2000.
[12] X. Zhou and C. Xu. Placement and scheduling for service differentiation in video-on-demand clusters. CIC01-8, Technical Report, Dept. of ECE, 2001.