Video File Allocation over Disk Arrays for Video-On-Demand
Yuewei Wang, Jonathan C.L. Liu, David H.C. Du and Jen-Wei Hsieh
Distributed Multimedia Research Center (see footnote 1) & Department of Computer Science, University of Minnesota
Abstract
A Video-on-Demand (VOD) server needs to store hundreds of movie titles and to support thousands of concurrent accesses. This imposes a great technical and economic challenge on the design of the disk storage subsystem of a VOD server. Because demand varies across movie titles, the numbers of concurrent accesses to different movie titles can differ greatly. We define the access profile as the number of concurrent accesses to each movie title that should be supported by a VOD server. The access profile is derived from the popularity of each movie title and thus serves as a major design goal for the disk storage subsystem. Since some popular (hot) movie titles may be concurrently accessed by hundreds of users and a current high-end magnetic disk array (disk) can only support tens of concurrent accesses, it is necessary to replicate and/or stripe the hot movie files over multiple disk arrays. The consequence of replication and striping for hot movie titles is a potential increase in the required number of disk arrays. Therefore, how to replicate, stripe, and place the movie files over a minimum number of magnetic disk arrays such that a given access profile can be supported is an important problem. In this paper, we formulate the problem of video file allocation over disk arrays, demonstrate that it is an NP-hard problem, and present some heuristic algorithms to find near-optimal solutions. The result of this study can be applied to the design of the storage subsystem of a VOD server to minimize the cost or to maximize the utilization of disk arrays.
Keywords: Multimedia, Video-On-Demand, Concurrent Access, MPEG-II, RAID 3, Replication, Striping, 2-D Vector Packing
1 Distributed Multimedia Research Center (DMRC) is sponsored by US WEST, Honeywell, IVI Publishing, Computing Devices International and Network Systems Corporation.
1 Introduction

A central design issue in providing VOD services is how to organize and store hundreds of movie files over multiple disks (disk arrays) such that thousands of viewers (i.e., users) can concurrently access these movie files. In addition to having powerful CPUs and high-speed network connections, a VOD server needs a mass storage subsystem to store these movie titles and must support thousands of video streams during playback in real time. Since different movie titles have different popularities, the access patterns to so-called "hot" movies must also be taken into consideration when designing a VOD system. Based on the different demands for the movie titles, we define the access profile as the number of concurrent accesses to each stored movie title that should be supported. In the access profile, some hot movie titles may be concurrently accessed by hundreds of viewers while other movie titles may be accessed by only one viewer. As a simple example, suppose we want to provide VOD services of 110 concurrent accesses to 10 movie titles. Table 1 shows an example of a given access profile.

Table 1: An example of a concurrent access profile

Symbol  Movie Title               # of Concurrent Viewers (Accesses)
A       Forest Gump               44
B       True Lies                 30
C       Time Cop                  20
D       Clear And Present Danger   7
E       Star Gate                  4
F       Top Gun                    1
G       Fly II                     1
H       RoboCop                    1
I       Total Recall               1
J       Star Trek VI               1
Assume that we want to store these movie titles on an array of RAID 3s [1], where each RAID 3 consists of 8 data + 1 parity Seagate ST12400N 2.1 gigabyte drives with a Ciprico 6710 controller. A 90-minute movie file in MPEG-II format occupies about 5 gigabytes of storage and requires I/O bandwidth of about 8 megabits per second (Mbps; see footnote 2). Based on our experimental results, each RAID 3 can store at most 3 movie files and support at most 11 concurrent accesses if a reasonably sized buffer is dedicated to each stream [5, 8]. In the above access profile, we want to support 44 concurrent viewers (i.e., accesses) to the movie title "Forest Gump". Clearly, this movie cannot be stored on only one RAID 3 since each RAID 3 can only support 11 MPEG-II streams. This problem can be solved by replicating this movie file over 4 RAID 3s, striping it over 5 RAID 3s, or using a combination of replication and striping over some number of RAID 3s. We should point out that striping this movie title over 4 RAID 3s is not sufficient to support 44 concurrent accesses since there is a performance penalty in disk striping [8]. The number of MPEG-II streams that can be supported by striping over 4 RAID 3s (4-way striping) is 36. Table 2 shows the number of concurrent accesses that can be supported by different ways of striping over RAID 3s [8].

2 We are aware that the current MPEG-II specification is 4-20 Mbps. We choose a typical value of 8 Mbps in our discussion.

Table 2: The number of concurrent accesses vs. the number of ways of striping

# of ways of striping                             1    2    4    8   16
# of concurrent accesses that can be supported   11   20   36   64  112
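The arithmetic behind the "Forest Gump" example can be sketched in a few lines of Python. The dictionary below simply copies Table 2; the names `STRIPED_CAPACITY`, `replicas_needed`, and `striping_suffices` are illustrative choices and do not come from the paper.

```python
import math

# Table 2: total concurrent accesses supported by i-way striping over RAID 3s.
STRIPED_CAPACITY = {1: 11, 2: 20, 4: 36, 8: 64, 16: 112}


def replicas_needed(accesses: int) -> int:
    """Unstriped copies (one per RAID 3) needed to serve `accesses` streams."""
    return math.ceil(accesses / STRIPED_CAPACITY[1])


def striping_suffices(accesses: int, ways: int) -> bool:
    """True if the measured capacity of `ways`-way striping covers `accesses`."""
    return STRIPED_CAPACITY[ways] >= accesses


print(replicas_needed(44))       # 4 copies, each serving 11 streams
print(striping_suffices(44, 4))  # False: 4-way striping supports only 36
```

Only the striping degrees measured in Table 2 are usable as `ways`; other degrees would need their own measurements.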
In general, the storage of movie titles in the storage subsystem of a VOD server imposes two requirements: a storage requirement and a concurrent access requirement. The storage requirement is defined as the disk space that needs to be occupied by all the movie files, and the concurrent access requirement is defined as the number of concurrent accesses specified in a given access profile that should be supported by a VOD server. That is, a VOD server needs to store these movie titles in such a way that the number of concurrent accesses specified in the access profile can be satisfied. In a traditional file server, the retrieval of data is usually best-effort oriented (i.e., there is no real-time constraint on the retrieval and no limit on how many people can access it). Thus, the storage space requirement is usually the primary concern in a traditional file server. In a VOD server, in which real-time video playback is supported, not only the storage requirement but also the concurrent access requirement imposes a great challenge on the disk storage subsystem. As a matter of fact, the concurrent access requirement often plays the dominant role due to the limited sustained transfer rate (i.e., effective I/O bandwidth) that magnetic disks or disk arrays can provide. In the previous example, we want to support a total of 110 concurrent accesses to 10 movie titles. Without considering the concurrent access requirement, we need only $\lceil 10/3 \rceil = 4$ RAID 3s, while we need at least $\lceil 110/11 \rceil = 10$ RAID 3s when the concurrent access requirement must also be satisfied. From this observation, one might jump to the conclusion that the video files should be allocated based on the concurrent access requirement alone. In the following naive solution (Table 3) to the previous example, without considering replication, we stripe the movie files over 13 RAID 3s to satisfy the concurrent access requirement.
In this solution, the storage capacity of disks 0 to 9 is severely under-utilized while the concurrent access capacity of disks 11 and 12 is under-utilized. Certainly, this solution over-emphasizes the importance of the concurrent access requirement in the allocation of movie files. A better solution, which requires 12 RAID 3s, is to balance the storage load and the concurrent access load of the disk arrays, as shown in Table 4. We can see that there is a trade-off between satisfying the concurrent access requirement and satisfying the storage requirement. A good solution needs to consider both requirements when balancing the storage load and the concurrent access load of the disk arrays during the allocation of the video files. In the above examples, we did not consider the possibility of replication. For some movie titles, replication can be effective because it reduces the number of disk arrays required for striping. For example, we can further reduce the number of disk arrays required to 11 using replication, as shown in Table 5. Although striping increases the total number of concurrent accesses to a single movie, the average number of concurrent accesses that can be supported by each disk array is reduced. Replication, on the other hand, increases the storage requirement. There is certainly a trade-off that needs to be analyzed between replication and striping. How to replicate, stripe, and place the movie titles on a minimum number of disk arrays (disks) such that both the access and storage requirements are satisfied is an important design problem in minimizing the cost of a VOD server. This paper attempts to formulate and solve this problem.

Table 3: A naive solution using 13 RAID 3s

Access Profile
Movie Title          A   B   C   D   E   F   G   H   I   J
Access Requirement  44  30  20   7   4   1   1   1   1   1

Video Allocation
Disk 0    (A)
Disk 1    (A)
Disk 2    (A)
Disk 3    (A)
Disk 4    (A)
Disk 5    (B)
Disk 6    (B)
Disk 7    (B)
Disk 8    (C)
Disk 9    (C)
Disk 10   (D, E)
Disk 11   (F, G, H)
Disk 12   (I, J)

Table 4: A better solution using 12 RAID 3s

Access Profile
Movie Title          A   B   C   D   E   F   G   H   I   J
Access Requirement  44  30  20   7   4   1   1   1   1   1

Video Allocation
Disk 0    (A, F)
Disk 1    (A, G, H)
Disk 2    (A, I, J)
Disk 3    (A)
Disk 4    (A)
Disk 5    (A)
Disk 6    (B)
Disk 7    (B)
Disk 8    (B)
Disk 9    (C)
Disk 10   (C)
Disk 11   (D, E)

Table 5: A solution with replication

Access Profile
Movie  Access  # of copies     Access split-ups
A      44      8 (A1...A8)     (6, 6, 6, 6, 5, 5, 5, 5)
B      30      8 (B1...B8)     (4, 4, 4, 4, 4, 4, 3, 3)
C      20      4 (C1...C4)     (5, 5, 5, 5)
D       7      2 (D1, D2)      (4, 3)
E       4      1               (4)
F       1      1               (1)
G       1      1               (1)
H       1      1               (1)
I       1      1               (1)
J       1      1               (1)

Video Allocation
Disk 0    (A1, A5)
Disk 1    (A2, D1, F)
Disk 2    (A3, A6)
Disk 3    (A4, A7)
Disk 4    (A8, B1, G)
Disk 5    (B2, B3, B7)
Disk 6    (B4, B5, B8)
Disk 7    (B6, C1, H)
Disk 8    (C2, C3, I)
Disk 9    (C4, E, J)
Disk 10   (D2)

There are some related studies on video file allocation over the storage subsystems of a VOD server. Load balancing is very important in maximizing the number of concurrent accesses, as pointed out by Little and Venkatesh in [7]. In [3], Doganata and Tantawi gave a cost/performance analysis of a hierarchical storage system for a centralized VOD server. In [9], Ramarao and Ramamoorthy proposed a probabilistic model to assign video to different levels of a three-tiered storage hierarchy in a distributed VOD server environment. They did not address how video files are allocated at each site. Ghandeharizadeh and Ramos proposed a scheme to decluster and/or replicate a media object over a number of processors/disks in a shared-nothing architecture in [4]. Their approach mainly deals with the situation in which the bandwidth requirement of a continuous media stream exceeds that of a disk. All the approaches reported in the literature either did not consider striping or assumed no striping penalty. In this paper, we consider the effect of this striping penalty on video file allocation in a VOD server. We also consider the balancing effect of replication and striping in the video allocation. We try to solve the problem of minimizing the number of disk arrays needed to support a given concurrent access profile.

In the remainder of this paper, Section 2 describes our experimental facility and the assumptions made in this paper. Section 3 formulates the problem. Section 4 presents some heuristic solutions with and without replication. Section 5 analyzes the relationship between striping and replication and gives a heuristic solution for the case in which striping is not needed. Section 6 concludes the paper.
2 System Architecture and Assumptions

2.1 VOD System Architecture and VOD Service Quality

Figure 1 shows the architecture of a large-scale VOD server that is under development in the Distributed Multimedia Research Center at the University of Minnesota [5, 8]. An SGI Onyx with 20 MIPS R4400 150 MHz processors and 512 MBytes of random access memory (RAM) is used as the computing engine. The system bus has a bandwidth of 1.2 GB/second and can be configured with up to 4 POWERchannel-2 I/O adapter boards. Each POWERchannel-2 I/O adapter board has a bandwidth of 160 MB/second and can provide enough bandwidth for 8 fast and wide SCSI-2 channels (20 MB/second each). An array of RAID 3s is used to store the video files. One SCSI-2 channel is dedicated to each RAID 3. Each RAID 3 achieves a sustained transfer rate of 17.8 MBytes per second under this configuration [5]. Within each RAID 3, data is interleaved byte-wise over multiple data disks, and a single parity disk is used to tolerate a disk failure. This is called RAID 3 Byte Striping in [5]. The striping we consider in this paper is user-level striping, in which the application controls the allocation and retrieval of data over multiple RAID 3s. This is called Application Striping [5] and is very similar to providing a RAID 0 level of striping over multiple RAID 3s. That is, in Application Striping, a user can stripe a video file over multiple RAID 3s using the RAID 0 level of striping. Each striping block in a RAID 3 is further striped over multiple disks via RAID 3 Byte Striping. In the rest of the paper, when we say striping, we mean Application Striping. Conceptually, each RAID 3 can be considered as a single disk with larger storage capacity and a higher sustained transfer rate. In the rest of the paper, when we say a disk array or a disk, we mean a RAID 3.

[Figure 1: VOD Server Architecture. Components shown: System Controller, CPU Boards (1-5), Memory Boards (1-4), System Bus (1.2 GB/sec), POWERchannel-2 Boards on the HIO Bus, and RAID 3 Disk Arrays; link labels: 320 MB/sec and 20 MB/sec.]

The quality of VOD service can be defined by the interactivity between the users and the VOD server. In this paper, we try to provide the highest quality of VOD service, in which a certain amount of resources (i.e., memory space, I/O bandwidth, etc.) is dedicated to each user such that the user can potentially have full control over the playback process without interference from other users. This is the most desired feature because it can truly emulate VCR functions, and it is referred to as T-VOD service in [7].
2.2 Striping Penalty

As we mentioned before, a RAID 3 can provide a sustained transfer rate of 17.6 Mbytes/second and an MPEG-II video stream needs a bandwidth of about 8 Mbits/second. Although theoretically we can support $\lfloor 17.6 \times 8 / 8 \rfloor = 17$ MPEG-II streams, due to system overhead we were able to support only 11 MPEG-II streams, as our experiments demonstrated in [8]. Moreover, as the number of ways of striping increases, the average number of concurrent streams per disk array that can be supported will decrease. This performance behavior is verified by the experiments conducted in [8]. We define a striping function $f(i)$ as the number of concurrent accesses (i.e., MPEG-II streams) that a RAID 3 can support on average if movie files are i-way striped over i disk arrays. In other words, i disk arrays can support a total of $i \cdot f(i)$ concurrent accesses if movie files are i-way striped over these i disk arrays. Table 6 is an example of a striping function from the experiments reported in [8] when each stream is allocated a 1 megabyte buffer.

Table 6: Striping function

i       1    2    4    8   16   32
f(i)   11   10    9    8    7    6

From the table we can observe that the number of streams that a disk array can support decreases as the number of ways of striping increases. If we increase the buffer size of each stream, the numbers in the above table will increase, but the monotonically decreasing relation of $f(i)$ with respect to i still holds [8]. However, in the above discussion, we only consider concurrent accesses to movie files with the same number of ways of striping. In reality, there may be hundreds of movie titles and each of them may be striped over a different number of disk arrays. An interesting question is how to calculate the concurrent access load of a disk array when it stores parts of several movie files and each movie file may have a different way of striping. This is addressed in the next subsection.
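As a small illustration of the striping function, the sketch below tabulates $f(i)$, the total capacity $i \cdot f(i)$, and the per-array capacity lost relative to an unstriped array. The values are those of Table 6; the names `STRIPING_FUNCTION`, `total_streams`, and `striping_penalty` are hypothetical helpers introduced only for this example.

```python
# Table 6: streams per disk array under i-way striping (1 MB buffer per stream).
STRIPING_FUNCTION = {1: 11, 2: 10, 4: 9, 8: 8, 16: 7, 32: 6}


def total_streams(ways: int) -> int:
    """Total streams supported by `ways` disk arrays under `ways`-way striping."""
    return ways * STRIPING_FUNCTION[ways]


def striping_penalty(ways: int) -> float:
    """Fraction of the unstriped per-array capacity lost at this striping degree."""
    return 1 - STRIPING_FUNCTION[ways] / STRIPING_FUNCTION[1]


for ways in sorted(STRIPING_FUNCTION):
    print(ways, STRIPING_FUNCTION[ways], total_streams(ways),
          round(striping_penalty(ways), 3))
```

The totals for 1, 2, 4, 8, and 16 ways agree with the measured capacities in Table 2.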
2.3 Combined Concurrent Access Load Calculation

In order to make the calculation easier, we normalize the concurrent access capacity of a disk to 1. Each of 11 streams then consumes $1/11$ of the concurrent access capacity of a disk array. In the general case, when a movie is striped i ways over i disk arrays, each access to this movie individually imposes a concurrent access load of $1/f(i)$ on each of these i disk arrays. How do we calculate the total normalized access load of a disk array when different streams access different movies that are partially stored on this disk array with different ways of striping? We conjecture that the total normalized access load of a disk array is the sum of the individual normalized loads from all the concurrent accesses. Assume a disk array (partially) stores k movie titles, where movie i is $s_i$-way striped and the average number of concurrent accesses to movie i on the disk array is $N_i$. The normalized access load imposed by movie i is then $N_i / f(s_i)$ and the total access load imposed on this disk array is $\sum_{i=1}^{k} N_i / f(s_i)$. This conjecture is verified by the experiment described below.

In the following experiment, we use 4 RAID 3s storing 7 movie titles (A to G), 4 of which (D to G) are 1-way striped (no striping), 2 of which (B and C) are 2-way striped, and 1 of which (A) is 4-way striped (see Figure 2). We design this experiment to balance the load on these 4 RAID 3s so that the accumulated effect of concurrent accesses can be observed. We then use different combinations of the numbers of streams accessing movie titles with different ways of striping and measure the delay jitters (i.e., the percentage of accesses to the disk that miss the playback deadline out of all accesses). For example, Figure 2 shows the case where there are 12 concurrent accesses to the 1-way striped movie titles (3 to each of the 4 titles), 8 concurrent accesses to the 2-way striped movie titles (4 to each of the 2 titles), and 20 concurrent accesses to the 4-way striped movie title. Therefore, there are on average 10 concurrent accesses to the movie titles in each disk array: 3 concurrent accesses to the 1-way striped movies, 2 concurrent accesses to the 2-way striped movies, and 5 concurrent accesses to the 4-way striped movie.

[Figure 2: Combination of 12 1-way, 8 2-way, and 20 4-way concurrent accesses. Four RAID 3s are attached through SCSI-II controllers to a POWERchannel-2 board on the HIO bus. Movie A is 4-wide (20 accesses), movies B and C are 2-wide (4 accesses each), and movies D, E, F, and G are 1-wide (3 accesses each).]

Table 7 shows the total concurrent access load calculated using the formula $\sum_i N_i / f(s_i)$ and the experimental delay jitters under each load situation, where $N_i$ is the number of concurrent accesses to movie i, which is striped $s_i$-way, and $N_{total}$ is the total average number of concurrent accesses to each disk array. Thus, $\sum_i N_i / f(s_i)$ is the total load on each of the disk arrays. For example, in the first row of the table, we have $\sum_i N_i / f(s_i) = 3/11 + 2/10 + 4/9 = 0.917172$.

Table 7: Concurrent access loads vs. jitters

N_1  N_2  N_4  N_total   sum_i N_i/f(s_i)   jitters (%)
3    2    4     9        0.917172            0.0472
2    2    5     9        0.937374            1.0306
8    1    1    10        0.938384            1.6425
1    2    6     9        0.957576            0.0583
1    1    7     9        0.968687            0.7472
5    3    2    10        0.976768            7.5900
4    3    3    10        0.996970            5.3550
3    4    3    10        1.006061           13.5175

It can be observed that whenever the total access load is less than 0.97 (approximately 1), the percentage of delay jitters is acceptable (less than 2%). In other words, as long as the combined concurrent access load of a disk array does not exceed the total normalized concurrent access capacity (0.97) of the disk array, these concurrent accesses can be supported with acceptable jitters. This means that we can use the formula $\sum_i N_i / f(s_i)$ to calculate the combined concurrent access load on each disk array, and $1 - \sum_i N_i / f(s_i)$ to calculate the leftover concurrent access capacity of the disk array. This gives us a convenient way to calculate the concurrent access load on each disk array and a method to determine the effectiveness of replication and striping, as shown in the later sections.
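The combined-load formula is easy to evaluate mechanically. The following sketch, written under the assumption that only the striping degrees 1, 2, and 4 of the experiment are involved, computes the normalized load of one disk array with exact fractions and compares it with the 0.97 threshold observed above. The names `F`, `combined_load`, and `THRESHOLD` are illustrative and not from the paper.

```python
from fractions import Fraction

# Striping function values from Table 6 for the degrees used in the experiment.
F = {1: 11, 2: 10, 4: 9}

# Load below which the measured jitter stayed under 2% in Table 7.
THRESHOLD = Fraction(97, 100)


def combined_load(accesses_per_way):
    """Sum of N_i / f(s_i) for one disk array.

    `accesses_per_way` maps a striping degree s to the average number of
    concurrent accesses on this disk array to movies striped s ways.
    """
    return sum(Fraction(n, F[ways]) for ways, n in accesses_per_way.items())


load = combined_load({1: 2, 2: 2, 4: 5})
print(float(load))        # about 0.937374, as in Table 7
print(load <= THRESHOLD)  # True: this mix is within the observed capacity
```

Exact fractions are used so that a load of exactly 1 is not misjudged because of floating-point rounding.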
3 Problem Formulation

In the following, we first formulate the placement problem without replication. We then prove that it is NP-hard, and generalize it to the problem with replication.
3.1 Problem Formulation without Replication

Suppose we have M movies of the same length (see footnote 3), i.e., they occupy the same amount of storage space, and movie i may be accessed concurrently by $a_i$ ($i = 1..M$) users. We assume that a disk array can store z movie files and can support $f(i)$ concurrent accesses when a video file is i-way striped. Here, $f(i)$ is defined as the number of concurrent accesses that a single disk array can support when a file is striped over i disk arrays, as discussed in Section 2.2. That is, i-way striping can support $i \cdot f(i)$ concurrent accesses. In order to support $a_i$ concurrent accesses, video file i should be striped over $s_i$ disk arrays such that $s_i \cdot f(s_i) \ge a_i$. By striping across $s_i$ disk arrays, video file i will thus impose a normalized access load of $\frac{a_i}{s_i f(s_i)}$ on each of the striped disk arrays. Thus, the concurrent access load of any disk array k is

$$\sum_j \frac{N_j}{f(s_j)} = \sum_j \frac{a_j / s_j}{f(s_j)} = \sum_j \frac{a_j}{s_j f(s_j)}, \quad \forall \text{ movie } j \text{ stored on } k.$$

3 We make this assumption to simplify the discussion in this paper. However, it is also known that most movie titles have playback lengths of about 90 minutes, especially in the USA.

Table 8: Parameters

Parameter                          Meaning
M                                  # of movie files.
a_i                                # of concurrent accesses required for movie file i, i = 1..M.
f(i)                               # of concurrent accesses that a single disk can support when i-way striped.
s_i                                striping factor for movie i (i = 1..M).
z                                  # of movies that a disk array can store.
sum_j a_j / (s_j f(s_j))           normalized access load of a disk array.

With the parameters presented in Table 8, the video allocation problem can be described as follows. Given

1. M movies of the same size and their concurrent access profile $a_i$, $i = 1..M$,
2. a set of disk arrays, each with a storage capacity of z movie files, and
3. a striping function f such that $f(s_j)$ is the average number of concurrent accesses that can be supported by a single disk array when a movie j is striped over $s_j$ disk arrays,

we want to determine how these M movie files can be allocated to a minimum number of disk arrays such that the given access profile can be supported.

Theorem 1: The video allocation problem is NP-hard in terms of determining the minimum number of disk arrays required.

Proof: We prove it by reducing the 2-D vector packing problem (which is NP-hard) to it. A 2-D vector packing problem consists of M objects with 2-D requirements $(u_i, v_i)$, $i = 1..M$, and containers with a standard capacity of $(c, d)$. The objective is to pack these M objects into a minimum number of standard-size containers such that for all k objects packed in the same container, we have $\sum_{i=1}^{k} u_i \le c$ and $\sum_{i=1}^{k} v_i \le d$. Returning to our problem, we can treat each movie i as a 2-D object with 2-D size $(u_i = 1, v_i = \frac{a_i}{s_i f(s_i)})$ when it is $s_i$-way striped, and each disk as a 2-D container with standard size $(c = z, d = 1.0)$. Here, $u_i$ is the storage requirement of movie i and $v_i$ is the concurrent access requirement that movie i imposes on each disk. Thus, objects in the 2-D vector packing problem correspond to movie files in our problem and the containers correspond to the disks. The only difference between our problem and the 2-D vector packing problem is that our problem may stripe a movie across different containers (disks) while 2-D vector packing does not allow any striping. However, by setting the striping function to $f(1) = c_0 > 0$ for some constant $c_0$ and $f(i) = 0$ for $i = 2, 3, \ldots$, we ensure that the output of our problem does not have any striping (each striping factor is 1). This is because using striping factors larger than 1 would reduce the concurrent access capacity of the disks to 0, and movie files could not be placed on these disks. Thus, any instance of the 2-D vector packing problem can be transformed in polynomial time into an instance of a special case of our problem that does not have any striping. Since the 2-D vector packing problem is NP-hard [2], our problem is also NP-hard. □
3.2 Problem Formulation with Replication

The formulation in Section 3.1 does not consider the possibility of replicating a video file. The formulation with replication of video files needs to decide the replication factor $r_i$ for each video file i. That is, each movie i ($1 \le i \le M$) with concurrent access requirement $a_i$ can be replicated into $r_i$ copies, with each copy j ($1 \le j \le r_i$) having the access requirement $a_{ij}$, where $\sum_{j=1}^{r_i} a_{ij} = a_i$. Thus, $r_i$ and the partition of the access requirement $a_i$ need to be determined. This problem is harder than the one without replication since we need to decide the replication factor for each movie. It can be shown that it is also NP-hard.
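One simple way to partition an access requirement $a_i$ among $r_i$ copies is to split it as evenly as possible. This is an assumption made here for illustration; the formulation above only requires that the parts sum to $a_i$. The hypothetical helper `split_accesses` below happens to reproduce the access split-ups listed in Table 5.

```python
def split_accesses(total, copies):
    """Split `total` accesses over `copies` replicas as evenly as possible.

    Returns a non-increasing list of per-copy requirements that sums to `total`.
    """
    base, extra = divmod(total, copies)
    # `extra` copies carry one additional access each.
    return [base + 1] * extra + [base] * (copies - extra)


print(split_accesses(44, 8))  # movie A in Table 5
print(split_accesses(30, 8))  # movie B in Table 5
print(split_accesses(7, 2))   # movie D in Table 5
```

An even split keeps every copy's load small, which leaves the placement step more freedom to pair copies on the same disk array.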
4 Heuristic Algorithms

Due to the NP-hardness, the optimal solution to the video allocation problem is difficult to obtain and may not be computable in a reasonable amount of time. Instead, heuristic approaches should be sought. There are three basic subproblems to be solved in the video allocation problem:

1. How many copies (i.e., replication factor) should each movie title have?
2. What is the striping factor for each copy of each movie title?
3. How to place the movies over a minimum number of disk arrays?

In the following discussion, we adopt a bottom-up heuristic approach. That is, we first give a heuristic algorithm to place movie files over disks assuming that the replication factors and striping factors are known. Next we develop some heuristic algorithms to determine good striping factors using our placement heuristic. Finally, we give a heuristic algorithm to replicate the movie titles by balancing the concurrent access requirement and the storage requirement. By combining the three algorithms, we have a heuristic approach to solve the video allocation problem.
4.1 A Heuristic Algorithm to Place Video Files Over a Minimum Number of Disks

Let us first define movie i's minimum striping factor, $s_i^{min}$, as the smallest integer $s_i$ such that $s_i \cdot f(s_i) \ge a_i$. Certainly, using $s_i^{min}$ will not always yield an optimal solution. There is no reason to use a very large striping factor either. In this section, we assume that the striping factor for each movie is known. Using these pre-determined striping factors, we then propose a way to place movie files over a small number (but maybe not the minimum) of disk arrays. The above problem is very similar to the 2-D vector packing problem [2] except that some objects may be placed in more than one container. As a result, we found that many heuristic algorithms developed for the 2-D vector packing problem can be used. We adopt a Best-fit [2] algorithm to place a movie over disk arrays. The idea of a Best-fit algorithm is to place the movie over disk arrays that have been used so far and that will have the heaviest "loads" after the movie is placed on them. Intuitively, this is a greedy approach to reduce the number of disk arrays required. Here, the load C of a disk array is defined as a linear combination of the concurrent access load, $L_{access}$, and the storage load, $L_{storage}$, as follows:

$$C = R_{access/storage} \cdot L_{access} + \beta \cdot R_{storage/access} \cdot \frac{L_{storage}}{z},$$

where $R_{access/storage} = \frac{N_{access}^{min}}{N_{storage}^{min}}$ and $R_{storage/access} = \frac{N_{storage}^{min}}{N_{access}^{min}}$. $N_{access}^{min} = \sum_{i=1}^{M} \frac{a_i}{f(s_i^{min})}$ is the minimum number of disk arrays needed to satisfy the concurrent access requirement of an access profile and $N_{storage}^{min} = \frac{M}{z}$ is the minimum number of disk arrays needed to store M movies. Note that by dividing $L_{storage}$ by the storage capacity z, the storage capacity is normalized to 1. $\beta$ is a positive balancing factor that balances the storage load with the concurrent access load. It should be decided through experiments. In our experiments, we set it to 1.0 and thus treat storage load and concurrent access load equally. This cost function C is adaptive to the access profile in the sense that it favors the concurrent access load when the access profile is concurrent access dominant ($N_{access}^{min} > N_{storage}^{min}$), and favors the storage load when it is storage dominant ($N_{storage}^{min} > N_{access}^{min}$).

The concurrent access load $L_{access}$ and the storage load $L_{storage}$ of each disk array are initialized to zero at the beginning of the placement. After a movie i with striping factor $s_i$ is placed on $s_i$ disk arrays, the concurrent access load of these disk arrays increases by $\frac{a_i}{f(s_i) s_i}$ and the storage load increases by $\frac{1}{s_i}$. Thus, $L_{access}$ and $L_{storage}$ of these $s_i$ disk arrays should be updated as follows:

$$L_{access} = L_{access} + \frac{a_i}{f(s_i) s_i}, \qquad L_{storage} = L_{storage} + \frac{1}{s_i}.$$

For any disk array, $L_{access}$ should not be greater than 1 and $L_{storage}$ should not be greater than z.

There can be some variations of the Best-fit algorithm based on the order in which the movies are placed on the disk arrays. There are three typical orderings: Decreasing Ordering, Increasing Ordering, and Random Ordering. The order should be based on both the storage requirement and the concurrent access requirement of the movies. However, only the concurrent access requirement is considered since the storage requirement is usually less significant in a RAID 3 environment. In order to compare the impact of these possible orderings, a series of experiments was performed. The result shows that both Decreasing Ordering and Random Ordering perform better than Increasing Ordering.
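The load definition and update rule of this subsection can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that only the striping degrees measured in Table 6 are available, writes the balancing factor as a parameter called `beta` (defaulting to 1.0 as in the experiments), and uses hypothetical function names throughout.

```python
from fractions import Fraction

# Striping function from Table 6; only these measured degrees are assumed usable.
F = {1: 11, 2: 10, 4: 9, 8: 8, 16: 7, 32: 6}


def min_striping_factor(a):
    """Smallest measured striping factor s with s * f(s) >= a."""
    for s in sorted(F):
        if s * F[s] >= a:
            return s
    raise ValueError("demand exceeds the largest measured striping degree")


def balance_ratios(profile, z):
    """Return (R_access/storage, R_storage/access) for an access profile."""
    n_access = sum(Fraction(a, F[min_striping_factor(a)]) for a in profile)
    n_storage = Fraction(len(profile), z)
    return n_access / n_storage, n_storage / n_access


def load_cost(l_access, l_storage, z, r_as, r_sa, beta=1):
    """Combined load C of one disk array (beta is the balancing factor)."""
    return r_as * l_access + beta * r_sa * l_storage / z


def place_update(l_access, l_storage, a, s):
    """Loads of one disk array after it receives one stripe of a movie."""
    return l_access + Fraction(a, s * F[s]), l_storage + Fraction(1, s)


profile = [44, 30, 20, 7, 4, 1, 1, 1, 1, 1]  # Table 1
r_as, r_sa = balance_ratios(profile, z=3)
la, ls = place_update(Fraction(0), Fraction(0), a=44, s=8)
print(la, ls, float(load_cost(la, ls, 3, r_as, r_sa)))
```

Because the two ratios are reciprocals, an access-dominant profile weights $L_{access}$ above 1 and $L_{storage}$ below 1, and a storage-dominant profile does the opposite.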
We also discovered a fourth ordering (we call it Matching Ordering) which offers some improvement over Decreasing Ordering and Random Ordering. The idea of Matching Ordering is to balance the load of the disk arrays by combining movie titles that have high concurrent access demand with those that have low concurrent access demand. In the Matching Ordering Best-fit algorithm, we first sort the movie titles in decreasing order. Whenever we have placed the movie with the highest concurrent access requirement, we try to place one or more movie titles with the lowest concurrent access requirement as long as no new (i.e., empty) disk array is used (i.e., we try to fill the disk arrays that have been used so far as full as possible with the movie titles with the lowest concurrent access requirements).

Table 9 compares the results of Increasing Ordering, Random Ordering, Decreasing Ordering, and Matching Ordering with approximately 110 movie titles (i.e., $M \approx 110$). In the comparison, we assume that the movie access profile can be approximated by the normalized geometric distribution as used in [3]. Let the skew factor $\alpha$ ($0 < \alpha < 1$) denote the variation in demand, M denote the number of movie titles, and A denote the total number of concurrent accesses to the M movie titles. The skew factor indicates the deviation in the concurrent access profile. The smaller $\alpha$ is, the greater the deviation of concurrent accesses in the access profile. Figure 3 shows the concurrent access requirement for different values of the skew factor $\alpha$ when $M = 20$ and $A \approx 40$.

[Figure 3: Demand distributions for different $\alpha$. Plot titled "Geometry Distribution of Concurrent Access Demands": concurrent access (vertical axis, 0 to 14) versus movie # (horizontal axis, 2 to 20) for alpha = 0.5, alpha = 0.7, and alpha = 0.9.]

There are three types of concurrent access profile with respect to the ratio of $N_{access}^{min}$ to $N_{storage}^{min}$: the storage dominating profile ($N_{storage}^{min} > N_{access}^{min}$), the concurrent access dominating profile ($N_{storage}^{min} < N_{access}^{min}$), and the balanced profile ($N_{storage}^{min} \approx N_{access}^{min}$). $N_{storage}^{min} = \frac{M}{z}$ is directly proportional to the number of movies M, and $N_{access}^{min} = \sum_{i=1}^{M} \frac{a_i}{f(s_i^{min})}$ increases when $A = \sum_{i=1}^{M} a_i$ increases. The above three kinds of profiles can therefore also be classified by the ratio of A to M.

Table 9: Comparison of Different Orderings of Best-fit Algorithm

Cases                                    alpha  Inc. Ord.  Rand. Ord.  Dec. Ord.  Match Ord.
Access domin. profiles                   0.1    345        339         326        326
(N_storage^min < N_access^min,           0.3    252        239         239        239
 A/M ≈ 11)                               0.5    231        213         213        213
                                         0.7    192        177         172        171
                                         0.9    146        128         128        129
Balanced profiles                        0.1     70         63          49         48
(N_access^min ≈ N_storage^min,           0.3     63         58          47         47
 A/M ≈ 4)                                0.5     63         50          45         44
                                         0.7     58         46          43         42
                                         0.9     49         40          43         38
Storage domin. profiles                  0.1     54         48          42         41
(N_storage^min > N_access^min,           0.3     52         46          42         41
 A/M ≈ 2.5)                              0.5     50         40          41         40
                                         0.7     48         39          40         39
                                         0.9     41         36          38         35
From Table 9, Matching Ordering performs the same as or better than the other orderings except in one case (the access dominating profile with $\alpha = 0.9$). Thus, we use Matching Ordering in our heuristic algorithm. Let the access profile $a$ and the striping vector of all video files $s$ be defined as $a = (a_1, a_2, ..., a_M)$ and $s = (s_1, s_2, ..., s_M)$. Here, the striping vector $s$ is pre-determined and serves as an input to the algorithm. Let $U$ be an upper bound, known in advance, on the number of disk arrays used. How to obtain this upper bound is discussed in the next two subsections. It is used to cut the search cost of the heuristic algorithms in later sections.
Algorithm Placement performs the following steps:

1. Initialize the variables used.
2. Sort the movie titles in non-increasing order with respect to the concurrent access requirement.
3. Execute the following two steps until all the movie titles are placed on disks:
   (a) Place the movie, say $i$, at the front of the sorted sequence (i.e., the movie title that has not been placed yet and has the largest concurrent access requirement). This is done by choosing $s_i$ disks that have been used (i.e., are not empty) and that have the largest loads $C$ after movie $i$ is placed on them.
   (b) Place zero or more movies from the end of the sorted sequence (i.e., the movie titles that have not been placed yet and have the smallest concurrent access requirements). This is done in the same way as placing the movie at the front of the sorted sequence in the step above.
Algorithm Placement: Given the access profile $a$, the striping vector $s$, the minimum number of disks required due to the concurrent access requirement $N^{min}_{access}$, the minimum number of disks required due to the storage requirement $N^{min}_{storage}$, and the upper bound $U$ (i.e., the best solution found so far) on the minimum number of disks required, the algorithm tries to place the movie files over the minimum number of disks $D$.
 1  Placement($a$, $s$, $N^{min}_{access}$, $N^{min}_{storage}$, $U$)
 2  {  /* Initialize the variables. */
 3     $D = 0$;   /* D is the minimum # of disk arrays used. */
 4     $R_{access/storage} = N^{min}_{access} / N^{min}_{storage}$;
 5     $R_{storage/access} = 1 / R_{access/storage}$;
 6     Initialize $L^i_{access}$, $L^i_{storage}$, and $C^i$ to 0 for all disks ($i = 1..M$);
 7     left = 1;    /* left pointer to the sorted sequence. */
 8     right = $M$;   /* right pointer to the sorted sequence. */
 9     Sort the movies ($i = 1..M$) in non-increasing order with respect to $a_i$;
10     WHILE (left <= right) {
11        pick $s_{left}$ disks which have enough capacity and have the largest $C^k$'s after movie (left) is placed on these disk arrays;
12-13     ...
14           IF ($D > U$)   /* Exceed the upper bound U */
15              RETURN ($D$);
16        }
          /* update L_storage and L_access of these s_left disk arrays */
17        FOR ($k \in$ {each of the $s_{left}$ disks selected}) {
18           $L^k_{storage} = L^k_{storage} + \frac{1}{s_{left}}$;
19           $L^k_{access} = L^k_{access} + \frac{a_{left}}{s_{left} f(s_{left})}$;
20           $C^k = R_{access/storage} L^k_{access} + R_{storage/access} \frac{L^k_{storage}}{z}$;
21        }
22        left = left + 1;   /* next one. */
          /* Place one or more `right' movies now. */
23        WHILE (right >= left) {
24           pick $s_{right}$ disks which are not empty, have enough capacity, and have the largest $C^k$'s after movie (right) is placed on these disk arrays;
25           IF (we are only able to find $t < s_{right}$ such disks)
26              break;   /* we don't want to use new disk arrays. */
             /* update L_storage and L_access of these s_right disk arrays */
27           FOR ($k \in$ {each of the $s_{right}$ disks selected above}) {
28              $L^k_{storage} = L^k_{storage} + \frac{1}{s_{right}}$;
29              $L^k_{access} = L^k_{access} + \frac{a_{right}}{s_{right} f(s_{right})}$;
30              $C^k = R_{access/storage} L^k_{access} + R_{storage/access} \frac{L^k_{storage}}{z}$;
31           }
32           right = right - 1;   /* next one. */
33        }   /* end of while */
34     }   /* end of while */
35     RETURN ($D$);
36  }
Lines 3-8 perform Step 1 described above. Line 9 performs Step 2. Lines 10-22 perform Sub-step (a) of Step 3, while Lines 23-32 perform Sub-step (b) of Step 3. Lines 11 and 24 both need to find the $s_i$ disks that have the largest loads after movie $i$ is placed on them. Each of these steps takes time at most $\max_i(s_i D) = O(D^2)$. Thus, the worst case complexity of this algorithm is $O(M D^2)$. This complexity can be improved by keeping the disks sorted according to their loads and re-inserting into the ordered list those disks whose loads have changed after movie $i$ is placed on them. With this ordering of the disks, the cost of selecting $s_i$ disks is constant, and inserting $s_i$ disks into a sorted sequence of maximum length $D$ costs $s_i \log D = O(D \log D)$. Thus, the complexity can be improved to $O(M D \log D)$.
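The two-pointer structure of the Placement algorithm can be illustrated with a short Python sketch. It is a simplified reading of Steps 1 to 3, not the authors' implementation: the early exit on the upper bound $U$ is omitted, loads are stored as fractions of a disk array's capacity, and the names `DiskArray`, `placement`, `best_fit`, and the dictionary `f` (a stand-in for the striping function) are assumptions made for illustration. It also assumes every striping factor satisfies $s_i f(s_i) \ge a_i$, so that a movie always fits on empty disk arrays.

```python
from dataclasses import dataclass

EPS = 1e-9  # tolerance for floating-point capacity checks


@dataclass
class DiskArray:
    storage: float = 0.0  # fraction of storage capacity in use
    access: float = 0.0   # fraction of concurrent access capacity in use


def placement(a, s, f, z, n_access_min, n_storage_min):
    """Matching Ordering best fit (sketch). Returns the disk arrays used."""
    r = n_access_min / n_storage_min  # plays the role of R_access/storage
    order = sorted(range(len(a)), key=lambda i: a[i], reverse=True)
    disks = []

    def share(i):
        # Per-disk storage and access load of movie i striped s[i] ways.
        return 1.0 / (s[i] * z), a[i] / (s[i] * f[s[i]])

    def best_fit(i):
        # Used disk arrays that can hold the share, largest combined load first.
        ds, da = share(i)
        fits = [d for d in disks
                if d.storage + ds <= 1 + EPS and d.access + da <= 1 + EPS]
        fits.sort(key=lambda d: r * (d.access + da) + (d.storage + ds) / r,
                  reverse=True)
        return fits[:s[i]], ds, da

    def put(chosen, ds, da):
        for d in chosen:
            d.storage += ds
            d.access += da

    left, right = 0, len(order) - 1
    while left <= right:
        # Sub-step (a): most demanding unplaced movie; open new arrays if needed.
        chosen, ds, da = best_fit(order[left])
        while len(chosen) < s[order[left]]:
            chosen.append(DiskArray())
            disks.append(chosen[-1])
        put(chosen, ds, da)
        left += 1
        # Sub-step (b): least demanding movies, only on arrays already in use.
        while right >= left:
            chosen, ds, da = best_fit(order[right])
            if len(chosen) < s[order[right]]:
                break
            put(chosen, ds, da)
            right -= 1
    return disks
```

For instance, `placement([22, 5, 3, 1], [3, 1, 1, 1], {1: 11, 2: 10, 3: 9}, 3, 4, 2)` stripes the first movie over three new disk arrays and packs the remaining three movies onto those arrays plus one more. The candidate list is rebuilt and sorted for every movie here, which corresponds to the simpler of the two complexity variants discussed above.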
4.2 Heuristic Algorithms for Determining Striping Factors

In the placement algorithm discussed in the previous subsection, we assumed that the striping factor for each movie title is known. In this section we propose some ways to determine a good combination of striping factors.

Assume that $s^{optimal}_i$ is the optimal striping factor for movie title $i$. That is, if we use a striping factor $s_i \neq s^{optimal}_i$, we might need more disks to place the video files than necessary. The problem here is how to find these optimal striping factors.

A straightforward way to determine a good combination of the striping factors is a brute-force search over all possible combinations. That is, for each combination of the striping factors, we use the Placement algorithm to place the video files, and we take the minimum over all the combinations. The computational complexity of such an algorithm is $M D^2 \prod_{i=1}^{M} (a_i - s^{min}_i)$, which is prohibitively high. In the worst case, in which all $a_i$'s are equal and $s^{min}_i = 1$ for all $i$, the complexity is $O(M D^2 (\frac{A}{M} - 1)^M)$. With $M$ in the range of hundreds and $A$ in the range of thousands, this computation is intractable.

Alternatively, we can use a branch-and-bound technique to reduce the computational complexity and still find a good combination of striping factors. This is addressed in the following subsection. However, in the worst case, the branch-and-bound algorithm still exhibits exponential computational cost with respect to the number of movie titles. In the subsection after that, we develop an efficient algorithm that performs very close to the branch-and-bound algorithm while having a reasonable computational complexity.
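To give a feeling for the size of the brute-force search space, the following Python sketch evaluates the product $\prod_{i=1}^{M} (a_i - s^{min}_i)$ from the text for the worst case in which all $a_i$ are equal and $s^{min}_i = 1$. The values $M = 100$ and $A = 1000$ and the function name `brute_force_combinations` are illustrative assumptions.

```python
from math import prod


def brute_force_combinations(a, s_min):
    """Combination count used in the text: product of (a_i - s_i^min)."""
    return prod(ai - si for ai, si in zip(a, s_min))


M, A = 100, 1000
a = [A // M] * M   # all movies equally popular: a_i = A / M = 10
s_min = [1] * M    # minimum striping factor 1 for every movie

combinations = brute_force_combinations(a, s_min)  # (A/M - 1)^M = 9^100
digits = len(str(combinations))                    # decimal digits of the count
```

Each of these combinations would still require one run of the Placement algorithm, which is why the exhaustive search is impractical at this scale.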
4.2.1 A Branch-and-Bound Placement Algorithm

In this section, we discuss the Branch-and-Bound heuristic algorithm, which finds a good combination of striping factors and places the movies on the disks using the Placement algorithm.

Assume that we know an upper bound $U$ on the number of disks needed. It can easily be obtained by choosing the minimum striping factors for all movie files and then running the Placement algorithm. $U$ is certainly an upper bound for all striping factors. Moreover, from a practical point of view, we might not want a single concurrent access to be split over more than one disk array, with each disk array serving a fraction of that access. This means that the concurrent access requirement $a_i$ is another upper bound for the striping factor $s_i$. Therefore, the striping factor $s_i$ should lie in the range $[s^{min}_i, \min(U, a_i)]$.

Let $s = (s_1, s_2, ..., s_M)$ and $s' = (s'_1, s'_2, ..., s'_M)$ be any two different striping vectors of the $M$ movie titles. We say $s' > s$ if and only if $s'_i > s_i$ ($\forall i, 1 \le i \le M$). Let us define the function $N_{access}(s) = \lceil \sum_{i=1}^{M} \frac{a_i}{f(s_i)} \rceil$ as the lower bound on the number of disk arrays needed (due to concurrent access) using striping vector $s$. Clearly, $N_{access}(s)$ is a monotonically increasing function of $s$ (non-decreasing once the ceiling is taken). That is, if $s' > s$, we have $N_{access}(s') \ge N_{access}(s)$. Let $D_{optimal}$ be the minimum number of disk arrays found so far. When we try to stripe the movies using the striping vector $s$ and find $N_{access}(s) > D_{optimal}$, it is not possible to find a better solution than $D_{optimal}$ using $s$. Moreover, for all $s' > s$, we have $N_{access}(s') \ge N_{access}(s) > D_{optimal}$, so we can trim all the cases with $s' > s$ too.

The Branch-And-Bound-Placement algorithm performs the following six steps:

1. Obtain an initial upper bound on the minimum number of disks needed.
2. Initialize the variables used.
3. Compute the next search point (combination of striping factors $s$) in the search space.
4. Compute the lower bound $N_{access}(s)$ using $s$.
5. If $N_{access}(s)$ is larger than the best solution so far, $D_{optimal}$, then trim all the combinations $s' > s$ from the search space.
6. Use the algorithm Placement to place the video files using $s$ and compute the number of disks used. If it is better than what we have so far, then save the disk placement state to $State_{optimal}$ and set $D_{optimal}$ to the number of disks used.
Algorithm Branch-And-Bound-Placement: Given the number of movie titles $M$, the access profile $a$, and the minimum striping vector $s^{min}$, the algorithm outputs the number of disk arrays required and the disk placement status using the branch-and-bound technique and the Placement algorithm developed in the previous section.
 1  Branch-And-Bound-Placement($M$, $a$, $s^{min}$)
 2  {  /* Initialize some variables. */
 3     $N^{min}_{storage} = \lceil M/z \rceil$;
 4     $N^{min}_{access} = \lceil \sum_{i=1}^{M} \frac{a_i}{f(s^{min}_i)} \rceil$;
 5     $U$ = Placement($M$, $a$, $s^{min}$, $N^{min}_{storage}$, $N^{min}_{access}$, $\infty$);
 6     $D_{optimal} = U$;
 7     FOR ($i = 1$; $i$ ...
       ...
       ... $> D_{optimal}$) {   /* branch-and-bound. */
          $s^{max}_{tracepoint} = s_{tracepoint} - 1$;
          continue;   /* go back to next one. */
       }
       /* Try to place the movies using striping factors s. */
       $D$ = Placement($M$, $a$, $s$, $N^{min}_{storage}$, $N_{access}$, $D_{optimal}$);
       IF ($D < D_{optimal}$) {   /* better solution. */
          $D_{optimal} = D$;
          Save the current state to $State_{optimal}$;
       }
    }   /* Next striping factor. */
    RETURN $D_{optimal}$ and $State_{optimal}$;
    }
Lines 3-5 compute the initial upper bound by calling the Placement algorithm (Step 1). Lines 6-10 do some initialization. Lines 11-28 compute the next search point in the search space. Line 29 computes $N_{access}(s)$. Lines 29-33 trim some search points if possible. Note that for $j < tracepoint$, $s_j = s^{min}_j$, and for $j > tracepoint$, $s_j$ will keep increasing in the rest of the placement process. Thus, for any $s'$ with $s'_{tracepoint} \ge s_{tracepoint}$, we have $s' \ge s$, and we can trim the cases with $s'_{tracepoint} \ge s_{tracepoint}$ from now on. This is done by setting $s^{max}_{tracepoint} = s_{tracepoint} - 1$. Lines 34-40 place the movies and update the best solution if the new one is better.

The average computational time of this algorithm is generally better than that of the brute-force algorithm. However, its worst case complexity is the same as that of the brute-force algorithm, which is $O(M D^2 (\frac{A}{M})^M)$. In the next section, we propose a greedy placement algorithm whose performance is close to that of the branch-and-bound algorithm, while its worst case complexity is much lower.
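The pruning test at the heart of the branch-and-bound search can be sketched in a few lines of Python. The striping function values in `F_TABLE` are a hypothetical stand-in (the paper's striping function table is not reproduced in this section), and the names `n_access` and `can_prune` are illustrative. Exact rational arithmetic is used so that the ceiling is not affected by rounding.

```python
from fractions import Fraction
from math import ceil

# Hypothetical striping function f(s): concurrent accesses one disk array
# supports when files are striped s ways (assumed monotonically decreasing).
F_TABLE = {1: 11, 2: 10, 3: 9, 4: 9, 5: 8, 6: 8}


def n_access(a, s, f=F_TABLE):
    """Lower bound N_access(s) = ceil(sum_i a_i / f(s_i))."""
    return ceil(sum(Fraction(ai, f[si]) for ai, si in zip(a, s)))


def can_prune(a, s, d_optimal, f=F_TABLE):
    """True if striping vector s cannot beat the best solution found so far."""
    return n_access(a, s, f) > d_optimal


a = [44, 22, 11]
bound_min = n_access(a, [6, 3, 1])     # 44/8 + 22/9 + 11/11, rounded up
bound_larger = n_access(a, [6, 5, 1])  # larger s_2, so a larger bound
```

With a best solution of 9 disk arrays, `can_prune(a, [6, 5, 1], 9)` is true while `can_prune(a, [6, 3, 1], 9)` is false, so the search would discard the second vector and every vector dominating it.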
4.2.2 A Greedy Placement Algorithm

Both the brute-force search and the branch-and-bound technique incur exponential computing time in the worst case and are impractical in many cases, as we showed in the previous section. Thus, a more efficient heuristic algorithm is needed. The algorithm we introduce below is a greedy algorithm that requires only polynomial time. This approach provides an efficient method to find a fairly good combination of the striping factors.

The idea is to optimize the placement based on the previous placement history and the current placement requirement. To place movie $i$, we assume that the movie titles that have not been placed yet use their minimum striping factors. Then a brute-force search is used to find a good striping factor for movie $i$. After we find a good striping factor for movie $i$, we fix it for the rest of the placement process and continue with the next movie. We know that the striping factor has the lower bound $s^{min}_i$. We restrict the upper bound of the striping factor $s_i$ to $a_i$, since we do not want to split the load of a single stream over multiple disk arrays. When we place the movies over the disks based on the assumed striping factors, we use the Placement heuristic algorithm described in Section 4.1. This heuristic algorithm packs the disks that have been used as full as possible and potentially reduces the number of extra disks required.

In the following algorithm, $a = (a_1, a_2, ..., a_M)$, $s = (s_1, s_2, ..., s_M)$, and $s^{min} = (s^{min}_1, s^{min}_2, ..., s^{min}_M)$ are the access profile, the striping factors at any given time, and the minimum striping factors for each movie ($i = 1..M$), respectively. $M$ is the number of movies, state $S$ is the placement of video files over the disk arrays at any given time, and state $S_{optimal}$ is the best of the placements found by the algorithm.

The Greedy-Placement algorithm performs the following steps:

1. Initialization.
2. For each movie $i$, do the following:
   (a) Set the striping factor of each movie $j$ ($i < j \le M$) to $s^{min}_j$.
   (b) For each of the striping factors $s^{min}_i \le s_i \le a_i$, use the Placement algorithm to place the video files, find the best striping factor for $s_i$, and save it in optimalStripingFactors[i].
   (c) Fix the striping factor $s_i$ = optimalStripingFactors[i] for the rest of the placement process.
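The greedy steps above amount to a coordinate-wise search over the striping factors. The Python sketch below shows that control flow only; the placement heuristic is passed in as a function `place(a, s)` that returns the number of disk arrays used, and `toy_place` is a made-up cost function that exists purely to make the example runnable. Both names are assumptions, not part of the paper.

```python
def greedy_striping(a, s_min, place):
    """Pick striping factors one movie at a time (Steps 1 and 2 above).

    place(a, s) must return the number of disk arrays used for striping
    vector s; any placement heuristic can be plugged in.
    """
    s = list(s_min)        # Step 2(a): unplaced movies keep their minimum
    best = place(a, s)
    for i in range(len(a)):
        best_si = s[i]
        # Step 2(b): try every striping factor s_i^min <= s_i <= a_i.
        for candidate in range(s_min[i], a[i] + 1):
            s[i] = candidate
            d = place(a, s)
            if d < best:
                best, best_si = d, candidate
        s[i] = best_si     # Step 2(c): fix s_i for the rest of the process
    return s, best


def toy_place(a, s):
    # Made-up stand-in for a placement heuristic: fewest "disks" at s_i = 2.
    return 3 + sum(abs(si - 2) for si in s)


factors, disks_used = greedy_striping([3, 1, 4], [1, 1, 1], toy_place)
```

The search calls `place` at most $\sum_i (a_i - s^{min}_i + 1)$ times plus once for the initial vector, which is where the polynomial bound comes from.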
Algorithm Greedy-Placement: Given the number of movie titles $M$, the access profile $a$, and the minimum striping vector $s^{min}$, the algorithm returns the number of disk arrays required and the corresponding disk placement status using the Placement algorithm described in Section 4.1.

 1  Greedy-Placement($M$, $a$, $s^{min}$)
 2  {
 3     /* Initialize the variables. */
 4     $N^{min}_{storage} = \lceil M/z \rceil$;
 5     $N^{min}_{access} = \lceil \sum_{i=1}^{M} \frac{a_i}{f(s^{min}_i)} \rceil$;
       FOR ($i = 1$; $i$ ...
       ...
25  }

... ($j > i$) need a larger striping factor than the minimum striping factor $s^{min}_j$, Greedy Placement fails to find this better solution. The Branch-and-Bound Placement heuristic algorithm searches almost every combination of striping factors and thus finds a better combination of striping factors in the above scenario.

Since the computational complexity of Branch-and-Bound Placement is much larger than that of Greedy Placement, Branch-and-Bound Placement should be used selectively. We found that when $M$ and $A$ are very large (say, $M = 100$ and $A = 1000$), Branch-and-Bound Placement is only practical to compute for $\alpha \le 0.5$ (access dominating profiles and some balanced profiles). Thus, we propose a hybrid solution as follows:
If $\alpha \le 0.5$, or if $M$ and $A$ are reasonably small, use Branch-and-Bound-Placement.

Otherwise, use Greedy Placement.
4.4 Heuristic Algorithm with Replication Consideration

As we mentioned in Section 1, the larger the striping factor is, the more concurrent access capacity is consumed by each concurrent access. The minimum number of disk arrays needed due to the concurrent access requirement, $N^{min}_{access}$, is realized when all the files use the minimum striping factors $s^{min}_i$. That is, $N^{min}_{access} = \sum_{i=1}^{M} \frac{a_i}{f(s^{min}_i)}$. To satisfy the storage requirement, we need at least $N^{min}_{storage} = \frac{M}{z}$ disk arrays. When a video file is striped $s_i$ ways, we need $s_i$ disk arrays with enough capacity to store it. Thus, $N^{min}_{striping} = \max_i(s^{min}_i)$ is another lower bound for the minimum number of disk arrays needed. That is, the minimum number of disk arrays needed is at least $D = \max(N^{min}_{access}, N^{min}_{storage}, N^{min}_{striping})$.

Replication tends to reduce the bounds $N^{min}_{striping}$ and $N^{min}_{access}$. After replication, the access requirement of a single movie is split over all its copies, and we get a new access profile with more movie titles and a less skewed demand distribution. This reduces the minimum striping factors $s^{min}_i$. The effect is that $N^{min}_{access} = \lceil \sum_{i=1}^{M'} \frac{a'_i}{f(s^{min}_i)} \rceil$ is reduced, since the striping function $f$ is monotonically decreasing. For example, if a movie with a concurrent access requirement of 44 is duplicated into two copies, each with a concurrent access requirement of 22, its contribution to $N^{min}_{access}$ is reduced from $\frac{44}{8} = 5.5$ to $2 \times \frac{22}{9} = 4.9$, if we assume the striping function in Table 6. Another advantage of replication is that, when the access profile is concurrent access dominating and the total concurrent access requirement stays the same, movies with smaller concurrent access requirements can generally be placed over a smaller number of disk arrays. Thus, unless $D = N^{min}_{storage}$, replication will generally reduce $D$ if originally we have a concurrent access dominating profile (i.e., $N^{min}_{access} > N^{min}_{storage}$), and the minimum of $D$ is realized when $N^{min}_{storage} = N^{min}_{access}$. However, replication increases the storage bound $N^{min}_{storage}$, since we have to store multiple copies of a single movie. If originally we have a storage dominating profile (i.e., $N^{min}_{access} < N^{min}_{storage}$), replication is not preferred.

The heuristic solution presented below (we call it the Replication algorithm) aims at minimizing the lower bound $L$ through replication. If originally we have $\min(N^{min}_{striping}, N^{min}_{access}) > N^{min}_{storage}$, then $L$ is minimized when $N^{min}_{storage} \approx \min(N^{min}_{striping}, N^{min}_{access})$. This is because we can replicate to reduce $N^{min}_{striping}$ and $N^{min}_{access}$ and to increase $N^{min}_{storage}$ until $\min(N^{min}_{striping}, N^{min}_{access})$ and $N^{min}_{storage}$ are balanced. In the following process, we replicate in two steps: the first eliminates the bound $N^{min}_{striping}$, and the second further reduces the bound $N^{min}_{access}$.

Replication changes the number of movie copies and thus the minimum striping factors; $M$ and $N^{min}_{storage}$ should reflect this change (the new number of movie files created after replication and the new storage bound). Let $M'$ and $a' = (a'_1, a'_2, ..., a'_{M'})$ be the number of movies (copies) and their concurrent access requirements at any moment during replication. We can calculate the three bounds using the following formulas:

Calculate the smallest $s^{min}_i$ such that $s^{min}_i f(s^{min}_i) \ge a'_i$, $\forall\, 1 \le i \le M'$.

$N^{min}_{storage} = \lceil \frac{M'}{z} \rceil$.

$N^{min}_{access} = \lceil \sum_{i=1}^{M'} \frac{a'_i}{f(s^{min}_i)} \rceil$.

$N^{min}_{striping} = \max_i(s^{min}_i)$.
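The three bounds can be computed directly from these formulas. The following Python sketch does so under one loud assumption: the striping function `F_TABLE` is hypothetical, chosen only so that it agrees with the values the text quotes ($f(1) = 11$, a divisor of 8 for the movie with 44 accesses, and 9 for the copies with 22 accesses). The function names are illustrative as well.

```python
from fractions import Fraction
from math import ceil

# Hypothetical striping function; only f(1) = 11 and the divisors 8 and 9
# used in the 44-versus-22 example are taken from the text.
F_TABLE = {1: 11, 2: 10, 3: 9, 4: 9, 5: 8, 6: 8}


def min_striping_factor(ai, f=F_TABLE):
    """Smallest s with s * f(s) >= a_i."""
    for s in sorted(f):
        if s * f[s] >= ai:
            return s
    raise ValueError("demand exceeds the tabulated striping factors")


def lower_bounds(a, z, f=F_TABLE):
    """Return (s_min, N_storage, N_access, N_striping) for access profile a."""
    s_min = [min_striping_factor(ai, f) for ai in a]
    n_storage = ceil(Fraction(len(a), z))
    n_access = ceil(sum(Fraction(ai, f[si]) for ai, si in zip(a, s_min)))
    n_striping = max(s_min)
    return s_min, n_storage, n_access, n_striping


before = lower_bounds([44], z=3)      # one hot movie
after = lower_bounds([22, 22], z=3)   # the same demand split over two copies
```

In this toy case the access bound drops from 6 to 5 and the striping bound from 6 to 3 after the split, while the storage bound stays at 1 because both copies still fit on one disk array's storage.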
The Replication algorithm consists of three steps:

1. Initialization.
2. Duplicate the hottest movie titles until ...

   ... $> N^{min}_{storage}$) do {
      duplicate the hottest movie (with the largest $a'_i$), with its concurrent accesses split evenly over the copies;
      $M' = M' + 1$;
      /* Recalculate a', N_storage, and N_access */
      Calculate the smallest $s^{min}_i$ such that $s^{min}_i f(s^{min}_i) \ge a'_i$;
      $N^{min}_{storage} = \lceil \frac{M'}{z} \rceil$;
      $N^{min}_{access} = \lceil \sum_{i=1}^{M'} \frac{a'_i}{f(s^{min}_i)} \rceil$;
   }
   RETURN ($M'$, $a'$);
Lines 2-3 perform the initialization. Lines 4-13 perform Step 2. Lines 15-23 perform Step 3.
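The loop that trades the access bound against the storage bound can be sketched as follows. This is an illustration of the last replication step only, under several assumptions that are not confirmed by the text: the loop runs while $N^{min}_{access} > N^{min}_{storage}$, a duplication splits the hottest entry into two halves (rounded up and down), and `F_TABLE` is the same hypothetical striping function as in the earlier sketch.

```python
from fractions import Fraction
from math import ceil

F_TABLE = {1: 11, 2: 10, 3: 9, 4: 9, 5: 8, 6: 8}  # hypothetical f(s)


def min_striping_factor(ai, f):
    """Smallest s with s * f(s) >= a_i."""
    return next(s for s in sorted(f) if s * f[s] >= ai)


def access_bound(a, f):
    """N_access = ceil(sum_i a_i / f(s_i^min))."""
    return ceil(sum(Fraction(ai, f[min_striping_factor(ai, f)]) for ai in a))


def storage_bound(a, z):
    """N_storage = ceil(M' / z)."""
    return ceil(Fraction(len(a), z))


def replicate(a, z, f=F_TABLE):
    """Duplicate the hottest movie until the access bound no longer dominates."""
    a = list(a)
    while access_bound(a, f) > storage_bound(a, z):
        hottest = max(a)
        if hottest <= 1:       # nothing left to split
            break
        a.remove(hottest)
        # Assumed splitting rule: two copies with (almost) equal demand.
        a += [hottest - hottest // 2, hottest // 2]
    return len(a), sorted(a, reverse=True)


m_new, a_new = replicate([44, 2, 1], z=3)
```

For the toy profile $(44, 2, 1)$ the loop stops with 13 copies, when both bounds equal 5. The total demand is preserved because each split only redistributes accesses between the two copies.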
After finishing the replication process, the $M$ movie files will have been duplicated into $M'$ movie files with new concurrent access requirements $a' = (a'_1, a'_2, ..., a'_{M'})$. We apply the heuristic algorithm Greedy-Placement to this new profile.
Algorithm Greedy Placement with Replication: Given the number of movie titles $M$ and the access profile $a$, this algorithm uses the Replication algorithm above, applies Greedy-Placement to the new access profile after replication, and returns the number of disk arrays required and the corresponding disk placement status.
1  Greedy-Placement-with-Replication($M$, $a$)
2  {
3     ($M'$, $a'$) = Replication($M$, $a$);
4     Calculate the new $s^{min} = (s^{min}_1, s^{min}_2, ..., s^{min}_{M'})$;
5     RETURN (Greedy-Placement($M'$, $a'$, $s^{min}$));
6  }
The algorithm above finds the solution shown in Table 5 for the access profile described in Section 1. Table 11 shows some results of this algorithm using the same example data as in the previous section. In the table, $L_{no-replication}$ is the lower bound without replication, while $L$ is the lower bound with replication. Note that $L$ is usually smaller than $L_{no-replication}$ in the concurrent access dominating profile.

Table 11: Comparison of replication versus no-replication

  M    A     $\alpha$   $L_{no-replication}$   Greedy-Best-Fit   $L$   Greedy-with-Replication

$N^{min}_{storage} < N^{min}_{access}$ ($\frac{A}{M} \approx 6$)
  20   117   0.1        14                     17                11    13
  20   118   0.3        14                     14                11    12
  20   117   0.5        13                     14                11    13
  20   114   0.7        12                     12                11    12
  20   110   0.9        11                     11                10    11

$N^{min}_{access} \approx N^{min}_{storage}$ ($\frac{A}{M} \approx 4.2$)
  20   88    0.1        11                     11                9     9
  20   88    0.3        10                     11                9     10
  20   87    0.5        9                      9                 8     8
  20   84    0.7        8                      9                 8     9
  20   81    0.9        8                      8                 8     8

$N^{min}_{storage} > N^{min}_{access}$ ($\frac{A}{M} \approx 2.5$)
  20   48    0.1        7                      8                 7     8
  20   48    0.3        7                      8                 7     8
  20   46    0.5        7                      7                 7     7
  20   46    0.7        7                      7                 7     7
  20   40    0.9        7                      7                 7     7
From the table, we can see that replication does reduce the number of disks used. This reduction is significant when the ratio of $A$ to $M$ is large, i.e., when more hot movies are provided, as illustrated by Table 12 with $\frac{A}{M} \approx 10$. As a matter of fact, when the ratio of $A$ to $M$ is very large, we might be able to replicate to the degree that striping is not necessary, as discussed in the next section.
5 Striping versus Replication

Both replication and striping have benefits and potential penalties; together they complicate the placement process and make it difficult to find a good solution. In some cases, the minimum number of disks can be achieved without using either replication or striping. In Section 4.1, we proposed a solution without replication. In this section, we propose a solution without striping. For an access profile with $N^{min}_{storage} \gg N^{min}_{access}$, replication actually increases the storage bound and thus increases the minimum number of disks required. In this case, the Greedy Placement algorithm without replication will yield good solutions. In the ...

Table 12: Replication versus no-replication when $\frac{A}{M} \approx 10$

  M     A      $\alpha$   $D_{no-replication}$   Greedy-Best-Fit   $D$   Greedy-with-Replication
  100   1098   0.1        247                    320               100   105
  100   1094   0.3        230                    239               100   103
  100   1091   0.5        208                    213               100   119
  100   1091   0.7        166                    172               100   100
  100   1070   0.9        122                    128               98    100

... $\lceil \frac{A}{f(1)} z \rceil - M \ge \sum_{a_i > f(1)} (2^{\lceil \log \frac{a_i}{f(1)} \rceil} - 1)$, then we can replicate the movies in such a way that the lower bound on the number of disk arrays needed is minimized and the maximum concurrent access requirement in the new access profile after replication is equal to or less than $f(1)$.
Proof: We know that the effect of replication is to reduce $N^{min}_{access}$ and to increase $N^{min}_{storage}$. If originally we have $N^{min}_{access} > N^{min}_{storage}$, then the lower bound $D = \max(N^{min}_{access}, N^{min}_{storage})$ is minimized when $N^{min}_{access} = N^{min}_{storage}$. In order to restrict the maximum concurrent access requirement in the new access profile after replication to be equal to or less than $f(1)$, we should have $\lceil \frac{A}{f(1)} \rceil \le \lceil \frac{M}{z} \rceil$ after replication. If we use the Replication algorithm in Section 4.4, we will need to duplicate at least $\lceil \frac{A}{f(1)} z \rceil - M$ times. For a video file with access requirement $a_i$ ($> f(1)$) to be duplicated into multiple copies with concurrent access requirement less than or equal to $f(1)$, we need at most $(2^{\lceil \log \frac{a_i}{f(1)} \rceil} - 1)$ duplications. Thus, if we have $\lceil \frac{A}{f(1)} z \rceil - M \ge \sum_{a_i > f(1)} (2^{\lceil \log \frac{a_i}{f(1)} \rceil} - 1)$, then we will be able to duplicate all video files with $a_i > f(1)$ into copies whose concurrent access requirement is at most $f(1)$. □

Example 1: Let $M = 10$, $A \approx 110$, $f(1) = 11$, $z = 3$, and $a = (90, 2, 1, 1, 1, 1, 1, 1, 1, 1)$. We have $\lceil \frac{A}{11} \cdot 3 \rceil - M = 18$ and $\sum_{a_i > 11} (2^{\lceil \log \frac{a_i}{11} \rceil} - 1) = 15 < 18$. The access profile after replication is $a' = (6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 5, 5, 5, 5, 5, 5, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1)$.

Example 2: Let $M = 10$, $A \approx 110$, and $a = (14, 13, 12, 11, 10, 10, 9, 8, 7, 6)$. We have $\lceil \frac{A}{11} \cdot 3 \rceil - M = 18$ and $\sum_{a_i > 11} (2^{\lceil \log \frac{a_i}{11} \rceil} - 1) = 3 < 18$. The access profile after replication is $a' = (6, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3)$.
The above theorem tells us that in many cases we can place the movie files over the disks without using striping. This allows us to develop a simpler algorithm, as follows:
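The condition used in the proof is easy to check mechanically. The Python sketch below evaluates both sides for the two access vectors exactly as printed in the examples (their entries sum to 100). It assumes the logarithm is base 2, which matches the counts 15 and 3 given in the text; the function names are illustrative.

```python
def duplications_available(a, f1, z):
    """ceil(A * z / f(1)) - M, computed with integer arithmetic."""
    total, m = sum(a), len(a)
    return -(-total * z // f1) - m


def duplications_needed(a, f1):
    """Sum over a_i > f(1) of 2^ceil(log2(a_i / f(1))) - 1."""
    needed = 0
    for ai in a:
        if ai > f1:
            k = 0
            while f1 * 2 ** k < ai:   # smallest k with f(1) * 2^k >= a_i
                k += 1
            needed += 2 ** k - 1
    return needed


def striping_not_needed(a, f1, z):
    """True when the theorem's condition holds for profile a."""
    return duplications_available(a, f1, z) >= duplications_needed(a, f1)


example_1 = [90, 2, 1, 1, 1, 1, 1, 1, 1, 1]
example_2 = [14, 13, 12, 11, 10, 10, 9, 8, 7, 6]
ok_1 = striping_not_needed(example_1, f1=11, z=3)
ok_2 = striping_not_needed(example_2, f1=11, z=3)
```

Both examples satisfy the condition: the left-hand side is 18 in each case, against 15 and 3 duplications needed, respectively.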
Algorithm Placement without Striping: Given the number of movie titles $M$ and the access profile $a$, this algorithm uses the Replication algorithm to replicate the movie titles, applies the Placement algorithm to the new access profile directly without using striping, and returns the number of disk arrays required and the corresponding disk placement status. To use this algorithm, the condition of Theorem 2 must be satisfied.

1  Placement-Without-Striping($M$, $a$) {
2     ($M'$, $a'$) = Replication($M$, $a$);
3     Calculate $N^{min}_{access}$ and $N^{min}_{storage}$ according to $M'$ and $a'$;
4     $U = \max(N^{min}_{access}, N^{min}_{storage})$;
5     RETURN (Placement($M'$, $a'$, $N^{min}_{access}$, $N^{min}_{storage}$, $U$));
6  }

Table 13 shows the comparison of this algorithm with the greedy algorithm using striping.

Table 13: Striping versus no-striping

  M     A      $\alpha$   $D$   Greedy-Placement-with-Replication   Placement-without-Striping
  100   1098   0.1        100   105                                 105
  100   1094   0.3        100   103                                 103
  100   1091   0.5        100   119                                 119
  100   1091   0.7        100   100                                 103
  100   1070   0.9        98    100                                 101
Placement-without-Striping performs as well as Greedy-Placement-with-Replication for $\alpha = 0.1$, $0.3$, and $0.5$, and uses slightly more disk arrays for $\alpha = 0.7$ (103 versus 100) and $\alpha = 0.9$ (101 versus 100). We use the same simple and efficient Placement algorithm in both cases because Greedy-Placement-with-Replication needs to call this algorithm repeatedly to find good striping factors. A more expensive and better heuristic than Best-fit can be used in the Placement-without-Striping algorithm to further improve the placement. In fact, after the replication step, the problem becomes the traditional 2-D vector packing problem. The advantage of this algorithm is that we can use any good (and probably very expensive) heuristic developed for 2-D vector packing, including Simulated Annealing [6], to optimize the placement.
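To make the reduction to 2-D vector packing concrete, the sketch below treats each unstriped movie copy as the vector $(\frac{1}{z}, \frac{a_i}{f(1)})$ and packs the vectors into unit-capacity disk arrays. It uses plain first-fit decreasing, a standard bin-packing heuristic chosen here for brevity; it is not the Matching Ordering best fit of Section 4.1, and the function name and sample profile are assumptions.

```python
from fractions import Fraction


def first_fit_decreasing(a, f1, z):
    """Pack (storage, access) vectors into unit bins, hottest movies first.

    Each unstriped movie copy uses 1/z of a disk array's storage and
    a_i / f(1) of its concurrent access capacity.
    """
    bins = []  # each bin is [storage_used, access_used]
    for ai in sorted(a, reverse=True):
        storage, access = Fraction(1, z), Fraction(ai, f1)
        for b in bins:
            if b[0] + storage <= 1 and b[1] + access <= 1:
                b[0] += storage
                b[1] += access
                break
        else:
            # No existing disk array has room in both dimensions.
            bins.append([storage, access])
    return bins


bins = first_fit_decreasing([6, 5, 4, 3, 3, 1], f1=11, z=3)
```

In this small profile the last movie needs a third disk array even though its demand is tiny, because the first array is full in the access dimension and the second in the storage dimension. Any other 2-D vector packing heuristic could be substituted for the inner loop.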
6 Conclusion

The access profile of a given set of movie titles specifies the number of concurrent accesses that the disk subsystem of a VOD server needs to support. Video files impose requirements on disks in two dimensions: storage and concurrent access. Due to the mechanical features of the disks, the concurrent access requirement tends to impose a more severe challenge than the storage requirement in a RAID 3 environment. In this paper, we proved that placing video files over disks such that both the storage and the concurrent access requirements are satisfied is an NP-hard problem. Since finding the optimal solution is impractical, we have developed some practical heuristic algorithms to solve this problem. We have developed a placement algorithm (Placement) under the assumption that the replication factors and striping factors are known. Based on the placement algorithm, we proposed algorithms using either a Branch-and-Bound strategy (Branch-and-Bound-Placement) or a Greedy strategy (Greedy-Placement) to find a good combination of striping factors. Although the Branch-and-Bound-Placement algorithm performs slightly better than the Greedy-Placement algorithm, it potentially has exponential complexity with respect to the number of movie titles. The Greedy-Placement algorithm, on the other hand, has polynomial complexity with respect to the number of movie titles in the worst case. We also showed that the lower bound on the minimum number of disks required can be minimized through replication, which in turn decides the replication factor for each movie title (Replication algorithm). Combining the Replication algorithm with the Greedy-Placement algorithm, we provided an efficient method to place the video files over a near-minimum number of disk arrays such that the given access profile is satisfied.

It is worth noting that simply striping without replication in an environment with an array of RAID 3s might not always produce the best results, since the access capacity is usually more constrained than the storage capacity in such an environment, especially for the access dominating profiles. Thus, we proposed an algorithm that does not use striping (Placement-without-Striping).

The performance results in our study are limited to the RAID 3 environment, in which the access capacity seems to be more important than the storage capacity. It is not clear whether the same results can be observed in a RAID 5 environment. An important future study will be to investigate the different striping functions for different multiple-disk (disk array) environments (e.g., RAID 5). Further understanding of the trade-offs between striping and replication strategies for different environments will be greatly beneficial to the designers of VOD systems.

In reality, customer demand for the video files may change from time to time, so the estimated access profile may no longer match the actual demand. Dynamic re-allocation of the movie files among the disk arrays may be necessary to solve this problem. We continue to work on this important problem. Efficient schemes are under investigation and will be reported in the near future.
Acknowledgment

The authors would like to thank a number of people. Jon Buerge at the Army High-Performance Computing and Research Center helped conduct the experiments in the paper; Harish Vedavy and Simon Shim at the University of Minnesota provided valuable comments; Dr. Ronald J. Vetter at North Dakota State University and Ted Smith at the University of Minnesota gave valuable suggestions for this paper.
References

[1] Ciprico Inc. RF6700 Controller Board Reference Manual, 1993.
[2] E.G. Coffman Jr., M.R. Garey, and D.S. Johnson. Approximation Algorithms for Bin-Packing: An Updated Survey. Analysis and Design of Algorithms in Combinatorial Optimization, pages 147-172, 1981.
[3] Y. N. Doganata and A. N. Tantawi. A Cost/Performance Study of Video Servers with Hierarchical Storage. Proceedings of the International Conference on Multimedia Computing and Systems, 1994.
[4] Shahram Ghandeharizadeh and Luis Ramos. Continuous Retrieval of Multimedia Data Using Parallelism. IEEE Transactions on Knowledge and Data Engineering, Vol. 5, No. 4, August 1993.
[5] J. Hsieh, M. Lin, J. Liu, and D. Du. Performance of a Mass Storage System for Video-On-Demand. To appear in a Special Issue on Multimedia Processing and Technology, Journal of Parallel and Distributed Computing (JPDC), August 1995.
[6] S. Kirkpatrick, C.D. Gelatt Jr., and M.P. Vecchi. Optimization by Simulated Annealing. Science, Vol. 220, No. 4598, pages 671-680, May 1983.
[7] T.D.C. Little and D. Venkatesh. Popularity-based assignment of movies to storage devices in a video-on-demand system. Multimedia Systems, No. 2, 1995.
[8] J. Liu, J. Hsieh, D. Du, and Mengjou Lin. Performance of A Storage System for Supporting Different Video Types and Qualities. Technical Report TR95-060, Department of Computer Science, University of Minnesota, September 1995.
[9] Ram Ramarao and Victor Ramamoorthy. Architectural Design of On-Demand Video Delivery Systems: The Spatio-Temporal Storage Allocation Problem. Proceedings of International Conference on Communications ICC'91, pages 17.6.1-17.6.5, 1991.