Random duplicate storage strategies for load balancing in multimedia servers

Joep Aerts (1,2), Jan Korst (1), Sebastian Egner (1)

(1) Philips Research Laboratories, Prof. Holstlaan 4, WY-21, 5656 AA Eindhoven, The Netherlands
(2) Technische Universiteit Eindhoven, Dept. of Mathematics and Computing Science, Eindhoven, The Netherlands
[email protected]
An important issue in multimedia servers is disk load balancing. In this paper we use randomization and data redundancy to enable good load balancing. We focus on duplicate storage strategies, i.e., each data block is stored twice. This means that a request for a block can be serviced by two disks. A consequence of such a storage strategy is that we have to decide for each block which disk to use for its retrieval. This results in a so-called retrieval selection problem. We describe a graph model for duplicate storage strategies and derive polynomial time optimization algorithms for the retrieval selection problems of several storage strategies. Our model unifies and generalizes chained declustering and random duplicate assignment strategies. Simulation results and a probabilistic analysis complete this paper.

Keywords: Analysis of algorithms, combinatorial problems, information retrieval, real-time systems
1 Introduction

A multimedia server [4] offers continuous streams of multimedia data to multiple users. In a multimedia server one can generally distinguish three parts, as shown in Figure 1: an array of hard disks to store the data, an internal network, and a fast memory from which the users consume, which is usually implemented in RAM.
Figure 1. Model of a multimedia server (disk array, internal server network, and buffers in RAM).
The multimedia data is stored on the hard disks in
blocks, such that a continuous data stream is realized by periodically reading a block from disk and storing it in the RAM. The total RAM is split up into a number of buffers, one for each user. A user consumes, possibly at a variable bit-rate, from his/her buffer and the buffer is refilled with blocks from the hard disks. A buffer generates a request for a new block as soon as the amount of data in the buffer becomes smaller than a certain threshold. We assume that the requests are handled periodically in batches in such a way that the requests that arrive in one period are serviced in the next one. In the server we need a cost-efficient storage and retrieval strategy that guarantees, either deterministically or statistically, that the buffers do not underflow or overflow. The costs of a multimedia server mainly consist of the costs of hard disks and RAM. The number of hard disks needed for a given number of users depends on the disk storage and disk transfer rate requirements of the entire system. Our analysis focuses on systems for which the transfer rate requirements form the bottleneck, like video-on-demand applications. This means that it is important to use the available bandwidth of
the disk array as efficiently as possible. For that we have to (1) use each single disk efficiently, i.e., optimize the effective reading time, and (2) balance the workloads of the disks. To optimize the effective reading time we should make the data blocks as large as possible, but this conflicts with the buffer costs, as the amount of RAM per user grows linearly in the block size. In this paper we do not discuss this trade-off any further; we focus on the load balancing performance of different storage strategies. The load balancing performance of a storage strategy is an important factor in the disk costs per user, as an efficient usage of the available bandwidth of the disk array increases the maximum number of users. In this paper we analyze the load balancing performance of random duplicate storage strategies; data blocks are stored on two randomly chosen disks. Data redundancy gives the freedom to obtain a balanced load. To exploit this freedom we have to solve in each period a retrieval selection problem (RSP), i.e., decide for each data block which of the two disks to use for its retrieval. A drawback of redundant storage strategies is the disk storage overhead, but in case the disk transfer rate forms the bottleneck instead of the disk storage capacity this is not a serious drawback. In this paper we analyze the complexity of the RSPs resulting from different duplicate storage strategies and describe solution algorithms and their performance. The rest of the paper is organized as follows. In the next section we describe prior work that has been done in the area of random and redundant data storage strategies. In Section 3 we introduce a graph model which is used in the rest of the paper to represent different classes of RSP. In Sections 4 and 5 we analyze the RSP of specific duplicate storage strategies and in Section 6 we give a probabilistic analysis of the load balancing possibilities of these storage strategies.
Section 7 concludes the paper and gives ideas for further research.
2 Related work

Until recently, redundant data storage strategies were mainly used to make systems less sensitive to disk failures; examples are the RAID technology [10] and chained declustering [5]. Data redundancy for load balancing reasons is less often described. Merchant
and Yu [7] use striping techniques for database applications in which each data object is stored twice, in such a way that each subblock is stored on two disks. The redundancy is not only used for disk failures but also for performance improvements. A request for a subblock is assigned to the disk with smallest queue. Papadopouli and Golubchik [9] use redundant data to solve disk bandwidth congestion, incurred by dynamic changes of user requirements. They use chained declustering as the replication scheme and give a max-flow algorithm for load balancing. Berson, Muntz, and Wong [3] introduce a random striping data placement strategy and use the available parity blocks for load balancing; they solve the resulting retrieval problem with a simple heuristic. Muntz, Santos, and Berson [8] and Tetzlaff and Flynn [13] describe a system in which randomness as well as data redundancy is used for load balancing. Both use very simple online retrieval algorithms in which requests are assigned to the disk with smallest queue. Tetzlaff and Flynn compare their results with coarse-grained striping and random single storage strategies. Korst [6] introduces a replication scheme in which each data block is stored on two randomly chosen disks, defined as random duplicate assignment (RDA). Korst analyzes the load balancing results of a number of retrieval algorithms, including heuristic algorithms as well as a max-flow based optimization algorithm, and compares their performance with full striping. Our paper analyzes the load balancing performance of duplicate storage strategies in the range between chained declustering and random duplicate assignment. In our analysis we assume that the data blocks are fetched periodically in batches. This means that all disks start with a new batch at the same time. Muntz, Santos and Berson [8] and Sanders [12] analyze asynchronous retrieval strategies, in which a disk can start with a new request as soon as it is idle. Muntz et al. 
use shortest queue scheduling in their real-time multimedia server. On average, asynchronous retrieval using shortest queue scheduling outperforms periodic retrieval in load balancing, but from a worst-case perspective it can be at least twice as bad as optimal periodic retrieval [6]. Sanders considers alternative asynchronous retrieval algorithms that outperform shortest queue scheduling. However, his analysis focuses on retrieving one request at a time, i.e., seek optimization is not considered. Furthermore, asynchronous retrieval algorithms are more difficult to analyze probabilistically. The results presented by Muntz et al. and Sanders are mainly experimental, whereas we also give probabilistic bounds on the load balancing performance.
3 Graph model for duplicate storage

A duplicate storage strategy can be modeled by a graph G = (V, E), in which the set V of vertices represents the set of disks. An edge {i, j} ∈ E between vertices i and j indicates that there exist blocks for which the two copies are stored on disk i and disk j. In this graph we can represent an instance of RSP by putting on each edge {i, j} ∈ E a weight w_ij, giving the number of blocks that has to be retrieved either from disk i or from disk j. Note that ∑_{e∈E} w_e = n. We call such a graph G with weights an instance graph. An assignment of block requests to disks corresponds to a division of the weight of each edge over its endpoints. We define a_ij as the number of blocks of edge {i, j} assigned to disk j and a_ji as the number of blocks assigned to disk i. Note that w_ij = w_ji = a_ij + a_ji. The load l(j) of a disk j is given by the sum of the assigned weights over all incident edges, i.e.,

  l(i) := ∑_{j∈V, {i,j}∈E} a_ji.

The maximum load over all disks is denoted by ℓ_max. An example of two disks in an instance graph is given in Figure 2.

Figure 2. Example of two nodes of an instance graph of RSP (disk i and disk j, joined by an edge of weight w_ij that is split into a_ji and a_ij).

In this paper we analyze the retrieval selection problem for duplicate storage strategies. This problem can be defined as follows.

Problem 1 [Retrieval selection problem (RSP)]. Given is a set of n blocks that has to be retrieved from a disk array with m disks. Furthermore, for each request the two disks are given from which it can be retrieved. The problem is to select for each block the disk from which to retrieve it, in such a way that the maximum number of block requests assigned to one disk is minimized. □

With the above notation we can give a formal problem definition for the retrieval selection problem.

Problem 2 [Edge weight partition problem]. Given are a graph G = (V, E) with a nonnegative integer weight w_ij on each edge {i, j} ∈ E. Assign nonnegative integer values to each a_ij such that the maximum load ℓ_max is minimized subject to the following constraints:

  ∀ i ∈ V:  ∑_{j∈V, {i,j}∈E} a_ji ≤ ℓ_max,
  ∀ {i, j} ∈ E:  a_ij + a_ji = w_ij. □

In the feasibility variant of the problem we introduce an input value K and the question is if there exists an assignment such that ℓ_max ≤ K. A known solution approach to this feasibility variant is the use of max-flow algorithms [14], as described by Korst [6]. Max-flow problems can be solved in polynomial time. The complexity depends on the number of nodes and the structure of the graph. We can define a max-flow graph for the edge weight partition problem as follows. The set of nodes consists of a source, a sink, a node for each disk and a node for each edge of the instance graph. The set of arcs consists of (1) arcs from the source to each edge node, with a capacity equal to the weight of that edge, (2) arcs with infinite capacity from each edge node to the disk nodes with which it was incident in the instance graph, and (3) arcs with capacity K from each disk node to the sink. An example of such a max-flow graph is shown in Figure 3.

Figure 3. Example of an instance graph and corresponding max-flow graph (edge nodes sit between the source and the disk nodes; the source arcs carry the edge weights, the sink arcs have capacity K).
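The max-flow construction above lends itself to a compact implementation. The following Python sketch is illustrative rather than the authors' code: the function name `feasible`, the dictionary encoding of the instance graph, and the choice of a plain Edmonds-Karp max-flow are assumptions made here for brevity.

```python
from collections import defaultdict, deque

def feasible(m, weights, K):
    """Feasibility variant of the edge weight partition problem via the
    max-flow construction of Section 3 (illustrative sketch).

    m       -- number of disks
    weights -- dict mapping a disk pair (i, j) to the number of blocks
               whose two copies are stored on disks i and j
    K       -- load bound to test
    """
    INF = float('inf')
    source, sink = 'S', 'T'
    cap = defaultdict(lambda: defaultdict(int))
    for e, ((i, j), w) in enumerate(weights.items()):
        cap[source][('edge', e)] = w          # source -> edge node, capacity w_e
        cap[('edge', e)][('disk', i)] = INF   # edge node -> its two disk nodes
        cap[('edge', e)][('disk', j)] = INF
    for d in range(m):
        cap[('disk', d)][sink] = K            # disk node -> sink, capacity K

    def bfs_path():
        # breadth-first search for an augmenting path (Edmonds-Karp)
        parent = {source: None}
        q = deque([source])
        while q:
            u = q.popleft()
            if u == sink:
                break
            for v in list(cap[u]):
                if v not in parent and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if sink not in parent:
            return None
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        return path

    flow = 0
    while True:
        path = bfs_path()
        if path is None:
            break
        aug = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= aug
            cap[v][u] += aug
        flow += aug
    # feasible iff all arcs leaving the source are saturated
    return flow == sum(weights.values())
```

Bisection on K (between ⌈n/m⌉ and the largest edge weight) then yields the optimal maximum load for an arbitrary instance graph.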
With this max-flow graph we can solve the feasibility variant of the edge weight partition problem. If a maximum flow from source to sink saturates all the arcs leaving the source, this flow corresponds to a feasible assignment. This means that this solution approach does not only solve the feasibility question but also gives an assignment in case of a positive answer, which can be extracted from the flow over the arcs with infinite capacity. We conclude this section with a theorem about the optimal value of ℓ_max, denoted by ℓ*_max.

Theorem 1. For each duplicate storage strategy,

  ℓ*_max = max_I ⌈ (1/|I|) ∑_{{i,j}∈E, i,j∈I} w_ij ⌉,   (1)

where I runs through all connected non-empty subsets of V. □

Proof. It is easy to see that the right-hand side of (1) gives a lower bound on ℓ*_max, since the total weight within a set I has to be distributed over the elements of I. So we can prove equality by showing that we can construct a connected non-empty set I ⊆ V such that

  ℓ*_max ≤ ⌈ (1/|I|) ∑_{{i,j}∈E, i,j∈I} w_ij ⌉.   (2)

Assume that we have an assignment for which the maximum load equals ℓ*_max. Furthermore, assume that the number of nodes with maximal load is minimized. We take the instance graph and determine a node i with load ℓ*_max. We define i to be in I and determine the neighbors j ∈ V of i for which a_ji > 0. As a_ji > 0, we know for each such j that l(j) ≥ ℓ*_max − 1, otherwise the load of i could have been decreased, contradicting the assumption that the number of nodes with maximal load is minimized. We add these neighbors to I and continue recursively by adding for each i ∈ I the neighbors j with a_ji > 0 to this set. All elements of I have a load of at least ℓ*_max − 1 and node i has a load of ℓ*_max, and following from the construction of I none of the loads of the elements of I can be assigned to elements outside of I. So the total weight on the edges within I is at least

  (|I| − 1)(ℓ*_max − 1) + ℓ*_max = |I| (ℓ*_max − 1) + 1.

Straightforward mathematics show that this connected set I gives the required result for (2). □
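For small instance graphs, the right-hand side of (1) can be evaluated directly by enumerating all connected non-empty subsets. The sketch below is only an illustration (the enumeration is exponential in |V|, so it serves as a cross-check on toy instances, not as an algorithm proposed in the paper); the function names are choices made here.

```python
from itertools import combinations
from math import ceil

def optimal_max_load(vertices, weights):
    """Evaluate the formula of Theorem 1: the maximum, over connected
    non-empty subsets I, of ceil(total edge weight inside I / |I|).

    vertices -- iterable of vertex labels
    weights  -- dict mapping frozenset({i, j}) edges to integer weights
    """
    def connected(subset):
        # depth-first search restricted to the vertices of `subset`
        subset = set(subset)
        start = next(iter(subset))
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for e in weights:
                if u in e:
                    (v,) = e - {u}
                    if v in subset and v not in seen:
                        seen.add(v)
                        stack.append(v)
        return seen == subset

    best = 0
    vs = list(vertices)
    for r in range(1, len(vs) + 1):
        for subset in combinations(vs, r):
            if not connected(subset):
                continue
            inside = sum(w for e, w in weights.items() if e <= set(subset))
            best = max(best, ceil(inside / len(subset)))
    return best
```

On a 4-cycle with one heavy edge of weight 5 and the other edges of weight 1, the maximizing set is the heavy edge itself, giving ⌈5/2⌉ = 3.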
4 Random chained declustering

The first duplicate storage strategy that we analyze is based on chained declustering. Hsiao and DeWitt [5] propose to store the successive data blocks in a round-robin fashion, where the second copy of each block is stored on the subsequent disk. Compared to this strategy, we drop the round-robin assignment, as random storage outperforms round-robin storage in case of unpredictable block requests, e.g., due to variable bit rate streams or user interaction. We still store the two copies of each data block on two subsequent disks, i.e., disk i and disk (i + 1) mod m, but disk i is chosen uniformly at random. Figure 4 gives an instance graph corresponding to random chained declustering. The instance graph is a cycle, as each block is stored on two neighboring disks.

Figure 4. Example of an instance graph of RSP for random chained declustering (a cycle on disks 0, 1, 2, 3, ... with edge weights w01, w12, w23, ...).
Due to the simple structure of the instance graph we can design the following linear algorithm to solve the feasibility variant of RSP. Note that in case of random chained declustering the optimal value of K is bounded between ⌈n/m⌉ and max w_ij. We use a clockwise point of view and define for each disk i disk (i + 1) mod m as its successor. For ease of notation we assume in the rest of this section that the operations on the disk numbers are modulo m. We first give a sketch of the algorithm; a more formal description of this so-called double loop algorithm is given in Figure 5. Theorem 3 proves that the algorithm actually solves the feasibility variant of RSP for random chained declustering.
We start with an edge with highest weight and without loss of generality we assume that the edge connects disk 0 and disk 1. We assign K blocks to disk 0 and w_{0,1} − K ≥ 0 blocks to disk 1. We continue in clockwise direction with the following recurrence relations:

  a_{j+1,j} := min{ w_{j,j+1}, K − a_{j−1,j} },   (3)
  a_{j,j+1} := w_{j,j+1} − a_{j+1,j}.   (4)

If a_{j,j+1} > K for any disk j, the proof of Theorem 3 shows that no feasible solution for this value of K exists. Otherwise, the algorithm finishes the first loop with the computation of a_{m−1,0}. At that point there are two possibilities: (1) a feasible assignment is constructed, i.e., a_{j−1,j} + a_{j+1,j} ≤ K for all j ∈ V, or (2) an overload occurs on disk 0, i.e., a_{m−1,0} > 0. In case of (2) the algorithm starts a second loop with a new assignment on the first edge. Instead of assigning K blocks to disk 0, we assign K − a_{m−1,0} blocks to disk 0. We recompute the values for each edge with Equations (3) and (4). Again we conclude infeasibility if a_{j,j+1} > K for any disk j. If the second loop has been completed, we have found a feasible assignment, as is shown in the proof of Theorem 3.

Theorem 2 [Complexity]. The double loop algorithm has a time complexity of O(m). □

Proof. We model the problem as a cycle of m disks. The algorithm stops after at most 2 loops of m steps and in each step a constant number of operations has to be executed, which gives the stated result. □

Theorem 3 [Correctness]. The double loop algorithm solves the feasibility variant of RSP in case of random chained declustering as storage strategy. □

Proof. If the algorithm returns a "yes" answer after the first loop, an assignment is given as well, which is correct by construction. In case the algorithm returns a "no" answer, we show that no assignment can be constructed with ℓ_max ≤ K, by showing that in that case a set I ⊆ V can be constructed for which (1/|I|) ∑_{i,j∈I} w_ij > K, which is sufficient according to Theorem 1.
determine edge with maximum weight, assume {0, 1};
a_{1,0} = K; a_{0,1} = w_{0,1} − K; j = 1; infeasibility = false;
repeat
  if (a_{j−1,j} > K) infeasibility = true;
  else
    a_{j+1,j} = min{ w_{j,j+1}, K − a_{j−1,j} };
    a_{j,j+1} = w_{j,j+1} − a_{j+1,j};
    j = j + 1;
until (infeasibility = true ∨ j = 0)
if (a_{m−1,0} > K) infeasibility = true;
if (infeasibility = true) return "no";  {no feasible assignment found}
else  {second loop}
  if (a_{m−1,0} = 0) return "yes";  {feasible assignment found}
  else
    a_{1,0} = K − a_{m−1,0}; a_{0,1} = w_{0,1} − a_{1,0}; j = 1;
    repeat
      if (a_{j−1,j} > K) infeasibility = true;
      else
        a_{j+1,j} = min{ w_{j,j+1}, K − a_{j−1,j} };
        a_{j,j+1} = w_{j,j+1} − a_{j+1,j};
        j = j + 1;
    until (infeasibility = true ∨ j = 0)
    if (j = 0) return "yes";  {feasible assignment found}
    else return "no";  {no feasible assignment found}

Figure 5. Double loop algorithm for RSP of random chained declustering, for ⌈n/m⌉ ≤ K ≤ max w_ij.
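A direct transcription of Figure 5 into Python might look as follows. It is a sketch under the paper's conventions: the disks are assumed to be renumbered so that w[0] is a maximum-weight edge, and only the yes/no feasibility answer is returned, not the assignment itself.

```python
def double_loop_feasible(w, K):
    """Feasibility variant of RSP for random chained declustering.

    w -- list of edge weights on the cycle; w[j] is the weight of the edge
         between disk j and disk (j+1) mod m, with w[0] a maximum weight
    K -- load bound to test
    """
    m = len(w)
    a_to_pred = [0] * m  # a_to_pred[j] = a_{j+1,j}: blocks of edge j on disk j
    a_to_succ = [0] * m  # a_to_succ[j] = a_{j,j+1}: blocks of edge j on disk j+1

    def run_loop(first_to_pred):
        # one clockwise pass, starting from a fixed split of edge {0, 1}
        a_to_pred[0] = first_to_pred
        a_to_succ[0] = w[0] - first_to_pred
        for j in range(1, m):
            if a_to_succ[j - 1] > K:            # disk j already overloaded
                return False
            a_to_pred[j] = min(w[j], K - a_to_succ[j - 1])   # recurrence (3)
            a_to_succ[j] = w[j] - a_to_pred[j]               # recurrence (4)
        return a_to_succ[m - 1] <= K

    if not run_loop(K):                          # first loop: disk 0 gets K blocks
        return False
    overload = a_to_succ[m - 1]                  # a_{m-1,0}, extra load on disk 0
    if overload == 0:
        return True
    return run_loop(K - overload)                # second loop with reduced split
```

For the cycle w = [6, 2, 2, 2] on four disks (n = 12), the algorithm confirms K = 3 = ⌈n/m⌉ and rejects K = 2.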
The algorithm stops if an a_{j,j+1}-value is computed that is larger than K. To construct the set I we start with I = {j + 1}, move backwards, and add each previous disk to the set I if its assigned load equals K, until we reach the first disk with load less than K, say disk i. Now we know that a_{i,i+1} = 0 and that no load can be transferred outside of I. For the load within I it holds that

  ∑_{i1,i2∈I} w_{i1 i2} = a_{j,j+1} + ∑_{i∈I∖{j+1}} l(i) = a_{j,j+1} + (|I| − 1) K > |I| K.   (5)

This implies, according to Theorem 1, that no feasible solution exists with ℓ_max ≤ K.
To complete the proof we show that in case of completion of the second loop always a feasible assignment is found. If the second loop is started, we know that the first loop ended with an overload p on disk 0. For the second loop we start with a_{1,0} = K − p and, consequently, a_{0,1} is increased by p blocks. As K ≥ ⌈n/m⌉ and disk 0 had a load larger than K after the first loop, there is at least one disk with a load less than K. We define the set V_min as the set of disks with load less than K and we want to shift the overload on disk 0 to the disks of V_min. As K ≥ ⌈n/m⌉, we know that ∑_{i∈V_min} (K − l(i)) ≥ p. During the second loop there are two possible outcomes: (1) an a_{j,j+1}-value becomes larger than K, which means infeasibility, or (2) all disks are filled up to K until all p blocks are shifted away to the disks of V_min. In the second situation the increase of a_{0,1} will not influence the value of a_{m−1,0}, so the latter will be p again. The new assignment is feasible as a_{1,0} = K − p. □

The algorithm for the feasibility variant can be used to construct a fast algorithm for the optimization variant, by doing a bisection search on the value of K. We know that a feasible K exists in the set {⌈n/m⌉, ..., max w_ij}. As the cardinality of this set can be bounded by n, the overall time complexity of the optimization algorithm is O(m log n). Furthermore, we know that in case of a "no" answer a set I ⊆ V can be constructed for which (1/|I|) ∑_{i,j∈I} w_ij > K. This means that the bisection procedure can be improved by using ⌈(1/|I|) ∑_{i,j∈I} w_ij⌉ as a new lower bound in case of a "no" answer. This new lower bound makes sure that each time the feasibility algorithm is run, it will stop at least one node further than in the previous run. This means that we can also bound the number of feasibility problems to be solved by m, such that the complexity of the optimization algorithm is O(min{m², m log n}).
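The bisection procedure just described can be sketched as a small wrapper around any feasibility test, such as the double loop algorithm of Figure 5. The improved lower-bound update based on a returned set I is omitted in this illustration, and the function name is a choice made here.

```python
def minimize_max_load(w, feasible):
    """Bisection search for the smallest feasible load bound K on a cycle.

    w        -- cycle edge weights; w[j] between disk j and disk (j+1) mod m
    feasible -- predicate feasible(w, K) answering the feasibility variant
    """
    m, n = len(w), sum(w)
    lo = -(-n // m)      # ceil(n/m), always a lower bound on K
    hi = max(w)          # feasible: assign edge j entirely to disk j
    if feasible(w, lo):
        return lo
    # invariant: lo is infeasible, hi is feasible
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if feasible(w, mid):
            hi = mid
        else:
            lo = mid
    return hi
```

The upper bound max w_ij is feasible on a cycle because assigning each edge entirely to one fixed endpoint gives every disk exactly one edge, hence a load of at most max w_ij.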
The simplicity of the instance graph in case of random chained declustering enables a very fast algorithm but the freedom for load balancing turns out to be quite small, as shown by the simulation results presented in the second row of Table 1. The table gives for n = 100 and m = 10 the fraction of the instances for which the maximum load exceeds a certain value. The simulation consisted of 1,000,000 randomly generated instances. To generate an instance we assume that the
requests are uniformly distributed over the edges of the instance graph. To show that the results are much better than in case of random storage without redundancy, we also give simulation results for random single assignment, in which the successive blocks are stored on randomly chosen disks without duplication.
5 Circulant graphs

In this section we increase the number of edges of the instance graph to improve the load balancing results. Due to the underlying application we are only interested in vertex-transitive instance graphs, i.e., graphs for which the edge structure of each node is equivalent. An interesting class of vertex-transitive graphs consists of so-called circulant graphs.

Definition 1 [Circulant graphs]. Given are m ∈ ℕ⁺ and a set A ⊆ {1, ..., ⌊m/2⌋}. A circulant graph C_m(A) is defined as the graph (V, E) with V = {0, ..., m − 1} and E = {{i, (i + a) mod m} | i ∈ V ∧ a ∈ A}. □

The complexity of solving the edge weight partition problem is strongly connected to the number of edges, as each edge becomes a node in the corresponding max-flow graph. We first analyze circulant graphs in which the number of edges is twice the number of nodes. We use a circulant graph on m vertices with A = {1, k} for some integer k. An instance is constructed by assigning n blocks to the edges uniformly at random, where an assignment to an edge means that the block can be retrieved from the disks corresponding to the endpoints. In Figure 6 the graph is shown for m = 8 and A = {1, 2}.

Figure 6. Circulant graph C8({1, 2}) as instance graph.
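Definition 1, and the way random instances are drawn on a circulant graph in the experiments below, can be sketched as follows (an illustration; the function names are choices made here, not from the paper).

```python
import random

def circulant_edges(m, A):
    """Edge set of the circulant graph C_m(A) as in Definition 1."""
    return {frozenset({i, (i + a) % m}) for i in range(m) for a in A}

def random_instance(m, A, n, rng=random):
    """Draw an RSP instance: n block requests, each placed uniformly at
    random on an edge of C_m(A); returns the edge weights w."""
    edges = sorted(circulant_edges(m, A), key=sorted)
    w = {e: 0 for e in edges}
    for _ in range(n):
        w[rng.choice(edges)] += 1
    return w
```

Note that the set comprehension automatically merges the coinciding edges that arise when a = m/2, which is why C10({1, 5}) has only 15 edges instead of 20.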
Table 1. Results of simulation and probabilistic analysis for 100 block requests on 10 disks. The upper value in each entry gives the fraction of the simulation instances that resulted in a load of at least 11, 12, 13, or 14. The lower value gives an upper bound on the corresponding probability. The rows are: single assignment; random chained declustering (10 edges); C10({1,2}), C10({1,3}), and C10({1,4}) (20 edges each); C10({1,5}) (15 edges); C10({1,2,3}) (30 edges); C10({1,2,3,4}) (40 edges); and the complete graph C10({1,2,3,4,5}) (45 edges). [The numerical entries are not reliably recoverable from this copy and are omitted.]
In Table 1 the results are given for the circulant graphs with |V| = 10 and |A| = 2. For each graph we generated 1,000,000 random test instances in which 100 blocks had to be retrieved from 10 disks. The results show that increasing the number of edges gives significant improvements in load balancing when compared to random chained declustering. This observation is underlined by the fact that the results for C10({1,5}) are worse than the results for the other circulant graphs with |A| = 2, as that graph contains only 15 edges. A second conclusion is that not only the number of edges influences the load balancing performance, but also the structure of the graph, e.g., the results for C10({1,3}) are better than the results for C10({1,2}). This can be explained by the observation that the smallest cycle in the latter graph has length 3, whereas in C10({1,3}) it has length 4. A small cycle is likely to give a higher chance on a set I with a large value of (1/|I|) ∑_{i,j∈I} w_ij.

We did a second experiment in which we increased the number of edges step by step by increasing the set A. For each value of r between 1 and ⌊|V|/2⌋ we consider the circulant graph C10({1, ..., r}). An instance of the edge weight partition problem is obtained by choosing uniformly at random an edge for each block to retrieve. Note that for r = ⌊|V|/2⌋ the constructed graph is a complete graph. This represents the situation of random duplicate assignment [6], i.e., each block is stored on two different, randomly chosen disks.

The last rows of Table 1 show that the results converge very fast to the results of random duplicate assignment. These results are quite close to a perfect load balance, i.e., over 97% of the instances result in a load of 10 for each disk. In the analysis of the duplicate storage strategies we did not allow the retrieval of parts of blocks, even though this is possible for duplicate storage strategies, e.g., half the block is retrieved from one disk and the other half from the second. This extra freedom could improve the results on the maximum load at the expense of extra switch time, but as Theorem 1 without ceiling brackets is valid in this case, the decrease of the optimal value is bounded by 1.

6 Probabilistic Analysis

The cases of RSP that we analyzed in this paper can all be solved to optimality with polynomial time algorithms. In the previous sections we gave simulation results to show the quality of the storage strategies. The graph model of RSP, as described in Section 3, can also be analyzed probabilistically, resulting in bounds on the probability that a certain load will be exceeded.
Theorem 4. Given is an instance graph G = (V, E) with weights w_ij. Then for each integer α > 0,

  P[ℓ*_max > α] ≤ ∑_I F( n, |(I × I) ∩ E| / |E|, α|I| ),   (6)

where I runs through all connected non-empty subsets of V and F denotes the cumulative distribution function of the binomial distribution, i.e., F(n, p, x) gives the probability that a B(n, p) binomially distributed random variable is at least x. □

Proof. By Theorem 1 and the principle of inclusion-exclusion we can derive

  P[ℓ*_max > α] = P[ ∃I: ∑_{i,j∈I, {i,j}∈E} w_ij > α|I| ] ≤ ∑_I P[ ∑_{i,j∈I, {i,j}∈E} w_ij > α|I| ].   (7)

Since the blocks are independently uniformly distributed over the edges, ∑_{{i,j}∈E, i,j∈I} w_ij is a B(n, |(I × I) ∩ E| / |E|) distributed random variable. Inserting the cumulative distribution function F yields the expression stated in the theorem. □

Note that the summand in Theorem 4 only depends on |I| and |(I × I) ∩ E| and not on I itself. Since we study vertex-transitive graphs, which are highly symmetric, the number of different summands is rather small. For a graph G = (V, E) and values u_i and d_i we can determine the number n_i of connected subsets I ⊆ V for which |I| = u_i > 0 and |(I × I) ∩ E| = d_i. The set of graph structure constants of a graph G, {(u_1, d_1, n_1), ..., (u_r, d_r, n_r)}, consists of all such triples. Then for all α > 0,

  P[ℓ*_max > α] ≤ ∑_{i=1}^{r} n_i F( n, d_i / |E|, α u_i ).   (8)
The values of F(n, p, x) can be evaluated numerically in terms of the incomplete Beta function (see [11, Section 6.4]). Table 1 gives the probabilistic results for the graphs. The values can be compared with the simulation results of the previous sections. In most cases the probabilistic upper bounds are quite close to the simulation results.
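As an alternative to the incomplete Beta function, for moderate n the binomial tail F(n, p, x) and the bound (8) can be evaluated directly from the definition. A sketch, with function names chosen here for illustration:

```python
from math import comb

def binom_tail(n, p, x):
    """F(n, p, x) = P[B(n, p) >= x] for a binomial random variable."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def load_bound(n, num_edges, constants, alpha):
    """Upper bound (8) on P[max load > alpha] from the graph structure
    constants (u_i, d_i, n_i): u_i = |I|, d_i = number of edges inside I,
    n_i = number of such connected subsets I."""
    return sum(n_i * binom_tail(n, d_i / num_edges, alpha * u_i)
               for (u_i, d_i, n_i) in constants)
```

For large n this direct summation becomes numerically delicate, which is presumably why the paper points to the incomplete Beta function of [11] instead.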
7 Conclusion

Several papers describe the possibilities of using redundancy and randomization for load balancing in a video-on-demand server. This paper unifies and generalizes the ideas of random duplicate assignment and chained declustering by giving a general model for the retrieval problem of duplicate storage strategies. The retrieval selection problem is described as a combinatorial optimization problem and polynomial time optimization algorithms are given for several cases of RSP. The load balancing performance of the storage strategies is analyzed by simulation and probabilistic analysis. We can conclude that the strategies as described in this paper perform well in the sense that a good load balance is obtained with high probability. Comparing the different storage strategies, we observed a trade-off between the performance and the computation time of the retrieval selection algorithms. In this paper the load balancing is done on the level of blocks; the objective is to minimize the maximum number of blocks assigned to one disk. The corresponding retrieval selection problems are all solvable in polynomial time. These models can be extended such that zoned disks, partial duplication, and switch times can be taken into account [1, 2]. Further research will focus on these extensions.
References

[1] J. Aerts, J. Korst, and W. Verhaegh. Load balancing for redundant storage strategies: Multiprocessor scheduling with machine eligibility. Submitted to Journal of Scheduling.
[2] J. Aerts, J. Korst, and W. Verhaegh. Load balancing in multimedia servers. In Proceedings Seventh International Workshop on Project Management and Scheduling, pages 25–28, April 2000.
[3] S. Berson, R.R. Muntz, and W.R. Wong. Randomized data allocation for real-time disk I/O. In Proceedings of Compcon Conference, February 1996.
[4] J. Gemmell, H.M. Vin, D.D. Kandlur, P.V. Rangan, and L.A. Rowe. Multimedia storage servers: A tutorial. IEEE Computer, pages 40–49, May 1995.
[5] H. Hsiao and D.J. DeWitt. Chained declustering: A new availability strategy for multiprocessor database machines. In Proceedings of Data Engineering, pages 456–465, 1990.
[6] J. Korst. Random duplicated assignment: An alternative to striping in video servers. In Proceedings ACM Multimedia, pages 219–226, November 1997.
[7] A. Merchant and P.S. Yu. Analytic modeling and comparisons of striping strategies for replicated disk arrays. IEEE Transactions on Computers, 44(3):419–433, 1995.
[8] R. Muntz, J.R. Santos, and S. Berson. A parallel disk storage system for real-time multimedia applications. International Journal of Intelligent Systems, 13:1137–1174, 1998.
[9] M. Papadopouli and L. Golubchik. A scalable video-on-demand server for a dynamic heterogeneous environment. In Proceedings MIS '98 (LNCS 1508), pages 4–17, 1998.
[10] D.A. Patterson, G.A. Gibson, and R.H. Katz. A case for redundant arrays of inexpensive disks (RAID). In Proceedings of the ACM Conference on Management of Data (SIGMOD '88), pages 109–116, 1988.
[11] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery. Numerical Recipes in C (2nd Ed.). Cambridge University Press, 1992.
[12] P. Sanders. Asynchronous scheduling for redundant disk arrays. In Proceedings 12th ACM Symposium on Parallel Algorithms and Architectures, 2000. To appear.
[13] W. Tetzlaff and R. Flynn. Block allocation in video servers for availability and throughput. In Proceedings Multimedia Computing and Networking, 1996.
[14] J. Van Leeuwen. Graph algorithms. In J. Van Leeuwen, editor, Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity, pages 525–631. Elsevier/MIT Press, 1990.