Disk Scheduling for Variable-Rate Data Streams

Jan Korst, Verus Pronk and Pascal Coumans
Philips Research Laboratories
Prof. Holstlaan 4, 5656 AA Eindhoven, The Netherlands
[email protected]

Abstract

We describe three disk scheduling algorithms that can be used in a multimedia server for sustaining a number of heterogeneous variable-rate data streams. A data stream is supported by repeatedly fetching a block of data from the storage device and storing it in a corresponding buffer. For each of the disk scheduling algorithms we give necessary and sufficient conditions for avoiding under- and overflow of the buffers. In addition, the algorithms are compared with respect to buffer requirements as well as average response times.

Key words: disk scheduling, continuous media, video server, multimedia, variable rate, buffer requirement, response time.
1 Introduction

Multimedia applications can be characterized by their extensive use of audio-visual material in an interactive way. The presentation of this material requires a continuous data stream from the storage device on which the material is stored to the user. The magnetic disk is considered most appropriate as secondary storage medium in a multimedia server. It offers a large storage capacity and small random access times at a reasonable cost [Reddy & Wyllie, 1994]. To ensure that magnetic disks are used cost-effectively, several users have to be serviced simultaneously by a single disk or a disk array. This is realized by repeatedly fetching a data block for each of the users and storing it in a corresponding buffer, which has room for a number of data blocks. The buffers are implemented in random access memory (RAM), which is relatively expensive. Here, we assume that a user consumes data from his buffer, via some communication network, at a rate that may vary between zero and a maximum consumption rate. Users may have different maximum consumption rates.

So far, the problem of scheduling variable-rate data streams has received little attention in the literature. Most papers assume constant-rate data streams; see, e.g., Yu, Chen & Kandlur [1993], Özden, Rastogi & Silberschatz [1995], and Korst, Pronk, Aarts & Lamerikx [1995]. Important advantages of considering variable rates are that (i) the server can handle variable-bit-rate-encoded data streams and (ii) no extra provisions have to be made for handling slow-motion and pause/continue requests from the user. Variable-bit-rate-encoded data streams are defined in, for example, the MPEG-2 standard.

A disk scheduling algorithm determines on-line when and how much data must be fetched for each of the users. It must guarantee that buffers do not become empty or overflow. For the sake of convenience, we speak of buffer underflow if a buffer becomes empty. In addition, a disk scheduling algorithm must also respond promptly to user requests. Reading small data blocks from disk results in relatively small buffer sizes and small response times, but also in many disk accesses per unit of time, i.e., in a less effective use of the disk. Therefore, there is generally a trade-off between, on the one hand, buffer sizes and corresponding response times and, on the other, the effectiveness of disk usage, which determines how many users can be serviced simultaneously. The aim of this paper is to describe three disk scheduling algorithms for sustaining a number of variable-rate data streams. The algorithms are compared with respect to buffer requirements and response times.
The remainder of this paper is organized as follows. In Section 2, we give a more precise statement of the disk scheduling problem of our interest. Related work is briefly discussed in Section 3. In Section 4, we first consider the simpler case in which the consumption rates are constant. There, we discuss the well-known double buffering or SCAN algorithm. In Section 5, we introduce three disk scheduling algorithms for sustaining variable-rate data streams, all based on the double buffering algorithm. In Section 6, we compare the buffer requirements and, in Section 7, the average response times of these three scheduling algorithms. Finally, Section 8 contains some concluding remarks.
2 Problem Definition

Figure 1 shows a schematic of a video server. Before giving a more precise statement of the disk scheduling problem, we briefly discuss the three basic components, namely the disks, the users, and the buffers.

Disks. The disk array consists of one or more magnetic disks, using RAID technology [Patterson, Gibson & Katz, 1988]. The data is striped across all disks in the array, such that a request for a data block results in a disk access on each of the disks in the array. In this way, load balancing problems are avoided and a large composite data transfer rate can be guaranteed. As such, the disk array can be regarded as a single virtual disk. We will therefore often regard the array as a single disk in the following sections.

The data on a disk is stored on concentric circles, called tracks. Each track consists of an integer number of sectors, the tracks near the outer edge usually containing more sectors than the tracks near the inner edge of the disk. A disk rotates at a constant angular velocity, so that reading near the outer edge results in a higher data transfer rate than reading near the inner edge. The time required for accessing data from a single disk generally consists of seek time, i.e., the time required to move the reading head to the required track, rotational delay, i.e., the time that passes before the required data moves under the reading head once the required track has been reached, and read time, i.e., the time required to actually read the data. The sum of the seek time and the rotational delay is called the switch time. The read time depends on the amount of data that is to be read and on the radial position of the track(s) on which the data is stored. The rotational delay per access takes at most one revolution of the disk. The seek time per access is maximal if the reading head has to be moved from the inner edge to the outer edge of the disk, or vice versa. To avoid having to take such a maximum seek time into account for each access, disk accesses are generally handled in batches. As the head moves from the inner edge of the disk to the outer edge, or vice versa, the required data blocks are read in the order in which they are encountered by the head. Carrying out such a batch is called a sweep. The worst-case total seek time required to execute a sweep with n disk accesses has been analysed by Oyang [1995].

For our problem, we assume that the disk array is characterized by a guaranteed data transfer rate r and a switch time function s(l, m). The data transfer rate r gives the minimum guaranteed rate at which data can be read. By pairing the tracks of different disks, such that if data is read from the inner track of one disk, then the corresponding data is read from the outer track of the other disk, one can guarantee a rate of d times the average rate of a single disk, where d denotes the number of disks in the array [Birk, 1995]. The switch time function s(l, m) gives the time that is maximally spent on switching when l data blocks have to be fetched in m sweeps, where the l data blocks may be assigned arbitrarily to the m sweeps. The worst-case switch time is defined in this way to allow easy analysis of the disk scheduling algorithms presented in Section 5. A simple illustrative model of such a switch time function is sketched below.
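The paper itself characterizes the disk by measured Seagate Elite 9 parameters; the following minimal sketch is only an assumed, illustrative worst-case model used by the code fragments later in this text. It charges each access at most one revolution plus an incremental seek bound, and each sweep a fixed seek overhead; all constants are placeholders, not measured values.

```python
# Illustrative worst-case switch-time model s(l, m) -- an assumption for the
# sketches in this text, not the measured Seagate Elite 9 characteristics
# used in the paper. Each of the l accesses loses at most one revolution plus
# an incremental seek; each of the m sweeps adds a fixed seek overhead.
REVOLUTION = 60.0 / 5400         # seconds per rotation (assumed 5400 rpm)
SEEK_PER_ACCESS = 0.004          # assumed incremental seek bound per access (s)
SEEK_PER_SWEEP = 0.010           # assumed fixed seek overhead per sweep (s)

def switch_time(l: int, m: int) -> float:
    """Upper bound on the total switch time when l blocks are fetched in m sweeps."""
    return l * (REVOLUTION + SEEK_PER_ACCESS) + m * SEEK_PER_SWEEP
```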
Figure 1: The basic components of a video server.

Users. The users of a server, once admitted service, can be in one of two states: waiting or
consuming. Initially, a user is waiting. When sufficient data has been fetched from disk and stored in his buffer, the user can start to consume. In this state, a user i is allowed to consume data from his buffer for an indefinite period of time, at a rate that may vary between zero and a maximum consumption rate c_i^max. The maximum rate that must be allocated for the playback of a variable-bit-rate MPEG sequence can be determined in an off-line analysis of this sequence, as explained by Dengler, Bernhardt & Biersack [1996]. If we choose c_i^max on the basis of the peak bit rate, which is determined by the size of the largest frame in the sequence, then this generally leads to a c_i^max that is considerably larger than the mean bit rate. If the rate at which data is consumed is averaged for each sequence of, say, N successive frames, then the resulting value for c_i^max can be chosen as the maximum average value, which will be considerably smaller. An example is discussed in Section 7.2. We restrict ourselves in this paper to disk scheduling algorithms that offer deterministic guarantees.

While a user is consuming, additional data must be repeatedly fetched, in such a way that his buffer neither under- nor overflows. A disk scheduling algorithm is called safe if it guarantees that the buffers of consuming users never under- or overflow. For reasons of simplicity, we assume a continuous model, also called a fluid-flow model, with respect to the consumption of data by the users. In practice, data is of course consumed in discrete units. Assuming a continuous model, however, considerably simplifies the analysis of disk scheduling algorithms, while the resulting differences are negligible.

If a consuming user requests other data, he temporarily becomes waiting again. First, his buffer has to be filled with a sufficient amount of new data. The time between the moment that a user request arrives at the server and the moment the user can start consuming the corresponding new data is called the response time of this request. Note that delays caused by the communication network are not incorporated in the response time defined above. Disk scheduling algorithms can be compared with respect to their worst-case as well as their average-case response times. Usually the average-case times are more important, because the probability of a worst-case situation occurring is very small.

Buffers. As already mentioned, buffers are implemented in RAM, which is relatively expensive.
Since a consuming user may also cease to consume data for an indefinite period of time, a data block can only be fetched in a given sweep if at the start of this sweep there is already enough room in the buffer to store it. Otherwise, buffer overflow may occur, causing data that has not yet been read from the buffer to be overwritten. If a data block is fetched in a given sweep, then it may arrive in the buffer at any moment during the sweep, immediately at the start or only at the end. In the analysis of a possible occurrence of buffer underflow, we assume that a data block arrives in the buffer at the end of the sweep.

Given the above assumptions, we can now define the disk scheduling problem as follows. Given a disk with a data transfer rate r and a switch time function s(l, m), find a safe disk scheduling algorithm that can simultaneously service a set U of n users, provided that ∑_{j ∈ U} c_j^max < r, minimizing a combination of buffer requirements and average-case response times. Whether more emphasis should be put on minimizing the buffer sizes or minimizing the response times depends on the specific multimedia application at hand. For video-on-demand applications, response times are probably not that important, as long as they are not too large. For highly interactive applications, such as games, response times are very important.
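As an illustration of the off-line determination of c_i^max mentioned above, the following sketch averages the consumption rate over every window of N successive frames and takes the maximum; the frame sizes are assumed to be known from the encoded sequence, and the function and variable names are illustrative, not taken from the paper.

```python
def max_windowed_rate(frame_sizes_bits, frames_per_second=25.0, window=12):
    """Maximum, over all windows of `window` successive frames, of the average bit rate."""
    if len(frame_sizes_bits) < window:
        raise ValueError("sequence shorter than the averaging window")
    window_bits = sum(frame_sizes_bits[:window])   # bits in the first window
    best = window_bits
    for k in range(window, len(frame_sizes_bits)):
        # slide the window one frame to the right
        window_bits += frame_sizes_bits[k] - frame_sizes_bits[k - window]
        best = max(best, window_bits)
    return best * frames_per_second / window       # bits per second
```

For a window of one frame this reduces to the peak bit rate; the simulations in Section 7.2 use a window of 12 frames.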
3 Related Work

In this section, we discuss related work on the retrieval of variable-rate data streams from disk arrays. Existing papers propose solutions specifically developed either for handling variable-bit-rate MPEG video or for implementing VCR-like functions such as fast forward at variable speed.

Vin, Goyal, Goyal & Goyal [1994] present a disk scheduling algorithm in which the variability in seek times and frame sizes of MPEG-encoded video is exploited to provide only statistical guarantees. A similar statistical approach is pursued by Rautenberg & Rzehak [1996].

Chang & Zakhor [1994] present two general approaches for handling variable-bit-rate MPEG video, called constant time length (CTL) and constant data length (CDL). In the CTL approach, data blocks that correspond to a constant playback duration are periodically retrieved from disk. The successive data blocks will usually vary in size. In the CDL approach, data blocks of constant size are repeatedly fetched, with the playback time of a data block varying from data block to data block. Based on the CTL approach, Chang & Zakhor propose an admission control algorithm that offers statistical guarantees. In addition, the authors consider the use of scalable compression to achieve graceful degradation in overload situations.

Chen, Kandlur & Yu [1995] discuss a number of ways to handle variable-rate playback, e.g., fast forward at different rates, taking into account the interframe dependencies of MPEG video data. They subdivide an MPEG sequence into independently encoded segments and assign these segments to the disks in the disk array in a specific way, to allow the skipping of a variable number of segments between two consecutively retrieved segments. They do not consider the specific problems relating to variable-bit-rate MPEG video.

Dengler, Bernhardt & Biersack [1996] present an interesting retrieval algorithm for variable-bit-rate MPEG video using the CTL approach, which offers deterministic guarantees. The authors determine the maximum size of the data blocks that have to be periodically fetched from disk, based on an off-line analysis of the video data. By averaging the consumption rate over a fixed period, in which a data block is fetched for each user, this maximum size can be chosen considerably smaller than the data block size that can be derived from the peak bit rate of the MPEG sequence. The maximum block size can be further reduced by averaging over more than one period, allowing the admission of more users. However, this increases the buffer sizes as well as the response times. The scheduling algorithm proposed by Dengler, Bernhardt & Biersack presupposes constant-frame-rate consumption and retrieves data blocks in periods of fixed length.
The disk scheduling algorithms that we present in this paper differ from the algorithm of Dengler, Bernhardt & Biersack in the sense that we use periods of variable length, which leads to considerably smaller average-case response times.
4 Scheduling for Constant-Rate Data Streams

Before focusing on variable-rate data streams, let us first consider the simpler case of constant-rate data streams. Hence, we assume that user i consumes data at a constant rate c_i. For ease of reference, the data stream that is being consumed by user i will often be referred to as data stream i below. For this special case, we discuss the double buffering algorithm (DB), which is based on the well-known SCAN algorithm, originally introduced by Denning [1967]. The SCAN algorithm has been adapted by a number of authors for handling continuous data streams; see, e.g., Gemmell [1993], Kandlur, Chen & Shae [1991], Kenchammana-Hosekote & Srivastava [1994], and Rangan, Vin & Ramanathan [1992].

All n data streams are serviced by the double buffering algorithm as follows. Let the time axis be divided into periods of length P. To minimize the buffer requirements, P should be chosen as small as possible. During each period, a data block is fetched for each data stream in a single sweep of the disk reading head. The size B_i of a data block for data stream i equals P · c_i, which is the amount of data user i consumes during one period. In this way, for each user, exactly one data block is fetched and one data block is consumed in every period. The number of data streams that can be guaranteed to be serviced simultaneously is of course bounded by the disk's data transfer rate r. Let C be the sum of the consumption rates of all users, i.e., let C = ∑_{j ∈ U} c_j. Then the users can only be serviced simultaneously if C < r.

The buffer for a data stream has room for exactly two data blocks. A data block that is fetched from disk in a given period is consumed in the next period. If a user i has just been admitted service or has requested another data stream, then the consumption of this new data stream can start at the end of the period in which the first data block was fetched for this stream. This time is denoted as T_i^start.

We first derive a necessary and sufficient condition for the safety of DB. The worst-case time T_s necessary for fetching a data block for each of the data streams in one sweep is given by

    T_s = (∑_{j ∈ U} B_j) / r + s(n, 1).

Now, the period length P must be at least T_s, or

    P ≥ (∑_{j ∈ U} B_j) / r + s(n, 1).   (1)

Since B_i = P · c_i for each i ∈ U, we can rewrite ∑_{j ∈ U} B_j as

    ∑_{j ∈ U} B_j = P · ∑_{j ∈ U} c_j = P · C.   (2)
By combining (1) and (2), we can prove the following result.

Theorem 1. Let U be a given set of n users, with C = ∑_{j ∈ U} c_j. Let C < r and let the buffer for each i ∈ U have room for two data blocks of size B_i. Furthermore, let B_i / c_i = B_j / c_j for each pair i, j ∈ U. Then, the double buffering algorithm is safe if and only if for each i ∈ U

    B_i ≥ (r · s(n, 1) / (r − C)) · c_i.   (3)
Proof. The sufficiency of (3) can be demonstrated as follows. If each sweep is guaranteed to be completed within P time units, i.e., if T_s ≤ P, then neither buffer under- nor overflow will ever occur. This can be shown as follows. At time T_i^start, the buffer for stream i contains exactly one data block. Now, if T_s ≤ P, then in each following period one data block will be consumed, and one data block will be fetched from disk for this stream. Consequently, at the start of each following period, the buffer again contains exactly one data block. Since at the start of a period the buffer contains sufficient data for consumption during that period, buffer underflow will not occur. Since there is already room for one data block at the start of a period, buffer overflow will not occur either. We next prove that T_s ≤ P is equivalent to (3). By combining (1) and (2), it is easily shown that T_s ≤ P is equivalent to P ≥ r · s(n, 1) / (r − C). Since B_i = P · c_i for each i ∈ U, elimination of P yields the required result.

The necessity of (3) can be demonstrated as follows. If (3) does not hold, then there is an i ∈ U such that B_i / c_i < r · s(n, 1) / (r − C). This implies that P < r · s(n, 1) / (r − C), which is equivalent to T_s > P, i.e., the resulting period P is not large enough to accommodate a worst-case sweep. Thus, buffer underflow may occur.

The worst-case response time of DB is 2P. This occurs if a request is issued just after the start of a period. The consumption of the new data can then start only at the end of the next period. The average-case response time is (3/2)P, because on average we have to wait (1/2)P before a request can be handled in the following period, which takes another P.

We end this section with the observation that DB requires the consumption from the buffers to be strictly synchronized with the data retrieval from disk. At the start of each period, there must be exactly one data block in the buffer of each consuming user. Even for constant-rate data streams this strict synchronization may be difficult to realize in practice.
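As a small illustration of the dimensioning implied by Theorem 1, the following sketch computes the smallest safe period P and the block sizes B_i for DB; it assumes the illustrative switch_time() model of Section 2 rather than measured disk parameters.

```python
def db_dimensioning(rates, r, switch_time):
    """Smallest safe period and block sizes for the double buffering algorithm (Eq. (3))."""
    n = len(rates)
    C = sum(rates)
    assert C < r, "the total consumption rate must stay below the disk rate"
    period = r * switch_time(n, 1) / (r - C)     # P = r * s(n, 1) / (r - C)
    blocks = [period * c for c in rates]         # B_i = P * c_i
    return period, blocks
```

Each buffer then holds two blocks of size B_i; the worst-case response time is 2P and the average-case response time is (3/2)P.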
5 Scheduling for Variable-Rate Data Streams

If the consumption rates of the users vary over time, then the double buffering algorithm cannot prevent buffer under- and overflow by just using c_i^max instead of c_i in Equation (3). This can be inferred as follows. Since the consumption rate may be equal to zero, one can only fetch a data block in a given period if there is room for this block at the start of this period. Otherwise, buffer overflow may occur. However, if at the start of some period there is an amount of B_i + ε in the buffer for user i, for some ε with 0 < ε < B_i, and the buffer can store exactly 2·B_i of data, then the next data block can be fetched only in the next period. This block may arrive at the end of this period, in which case user i may have consumed 2·B_i of data, so that buffer underflow may have occurred.

Since the execution of sweeps can no longer be synchronized with the consumption of data from the buffers, it is of no use to allocate a worst-case time for each sweep. A new sweep is started immediately upon completion of the previous one, instead of every P time units. This can improve the average-case response times considerably, as will be shown in Section 7. A user can (re)start consuming at the end of the sweep in which the first data block has been fetched. As in Section 4, this time is denoted by T_i^start. Furthermore, we assume that successive sweeps are consecutively numbered. Next, we discuss three disk scheduling algorithms for sustaining variable-rate data streams, each using this variable-period approach.
5.1 Triple Buffering Algorithm

A straightforward way of generalizing the double buffering algorithm, such that it can handle variable consumption rates, is obtained by using c_i^max instead of c_i in Equation (3) and by extending the buffers such that they can store three data blocks. By analogy, this algorithm is called the triple buffering algorithm (TB). It works as follows. A data block of constant size is fetched for data stream i in a given sweep only if there is room for another data block at the start of the sweep. Again, a data block must contain enough data to survive at least a worst-case sweep. With respect to the safety of TB, we derive the following result.

Theorem 2. Let U be a given set of n users, with C = ∑_{j ∈ U} c_j^max. If C < r and the buffer for each i ∈ U has room for three data blocks of size B_i, then the triple buffering algorithm is safe if and only if for each i ∈ U

    B_i / c_i^max ≥ (∑_{j ∈ U} B_j) / r + s(n, 1).   (4)

Proof. The sufficiency of (4) can be demonstrated as follows. Assuming that (4) holds, it is easy to see that, for each user i ∈ U, at most B_i data will be consumed in a single sweep. Buffer overflow will never occur, because a data block will be fetched in a sweep only if there is already enough room for this data block at the start of the sweep. The proof that buffer underflow cannot occur is by contradiction. Let j be the sweep that is started at time T_i^start, and let j', with j' ≥ j, be the first sweep in which buffer underflow can occur for user i. Clearly, j' > j, since at the start of sweep j there is exactly B_i data in the buffer, so buffer underflow cannot occur in sweep j. Let b_i(k) be defined as the amount of data in the buffer for user i at the start of sweep k. Now, by definition of j', b_i(j') < B_i. Since at most B_i data is consumed in each sweep, we obtain that b_i(j' − 1) < 2·B_i. Clearly, either b_i(j' − 1) < B_i or B_i ≤ b_i(j' − 1) < 2·B_i. The first case contradicts the assumption that j' is the first sweep in which buffer underflow can occur. The second case contradicts b_i(j') < B_i, since during sweep j' − 1 a data block will be fetched from disk for user i, as b_i(j' − 1) < 2·B_i, while at most B_i data will be consumed in this sweep. Consequently, in both cases we derive a contradiction.

The necessity of (4) can be shown as follows. If (4) does not hold, then buffer underflow can occur for some user i, for which (4) does not hold, in the sweep starting at T_i^start. At T_i^start, the buffer for user i contains exactly B_i data. If this sweep is of worst-case duration, user i consumes at maximum rate, and user i is serviced at the end of the sweep, then buffer underflow will occur.

It is remarked that letting the user wait another two sweeps before he can start consuming does not solve the problem of buffer underflow in the second part of the proof of Theorem 2. That a buffer must indeed be large enough to store three data blocks, and not fewer, can be inferred from the following. Suppose that we reserve buffers of size (2 + x)·B_i, with 0 ≤ x < 1. Now, if at the start of a given sweep the buffer contains (1 + x + ε)·B_i data, with ε > 0 and x + ε < 1, then no data can be fetched for user i during this sweep. The next data block may only arrive at the end of the next sweep. In the meantime, 2·B_i may have been consumed. Consequently, with smaller buffers, buffer underflow may occur.

We now derive an expression for the minimum buffer requirements as follows.
Since we have to store three data blocks for each data stream, we have to buffer a total of 3·∑_{j ∈ U} B_j data. By adding all B_i's, using Equation (4), we derive a lower bound on ∑_{j ∈ U} B_j, which is given by

    ∑_{j ∈ U} B_j ≥ (r · s(n, 1) / (r − C)) · C,   (5)

where C = ∑_{j ∈ U} c_j^max. This lower bound can be attained if we assume that B_i / c_i^max = B_j / c_j^max for each pair i, j ∈ U, from which we derive that ∑_{j ∈ U} B_j = (B_i / c_i^max) · C. Using this expression in Equation (5), we derive that B_i / c_i^max ≥ r · s(n, 1) / (r − C). Now, by choosing

    B_i = (r · s(n, 1) / (r − C)) · c_i^max,   (6)

this lower bound is indeed attained.

Since each sweep is started immediately upon completion of the previous one, a data block will not be fetched for each user in each sweep. As a result, the number of data blocks that have to be fetched in a single sweep can be considerably smaller than n. We give an example in Section 7.
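The per-sweep decision of TB can be summarized in a few lines. The following is a minimal sketch with illustrative names, where filling[i] is the amount of data buffered for stream i at the start of the sweep and block_size[i] equals B_i; it is not code from the paper.

```python
def tb_streams_to_fetch(filling, block_size):
    """Streams for which TB fetches a constant-size block in the coming sweep."""
    return [i for i in filling
            if 3 * block_size[i] - filling[i] >= block_size[i]]  # room for a whole block
```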
5.2 Variable-Block Double Buffering Algorithm

An alternative approach for generalizing the double buffering algorithm is obtained by fetching data blocks of variable size. This algorithm is called the variable-block double buffering algorithm (VDB). Instead of fetching a constant-size data block whenever there is room for one in the buffer, one could fetch, in a given sweep, an amount of data that is guaranteed to fit in the buffer, i.e., the size of a data block is chosen identical to the amount of room that is available at the start of the sweep, with a maximum of B_i as given by (6). The buffer for data stream i must have room for two data blocks of size B_i. In that case, it is easily seen that neither buffer underflow nor overflow will occur. Let b_i(j) denote the amount of data in the buffer for user i at the start of sweep j. Buffer overflow will not occur, since a data block will never exceed the room that is available at the start of the sweep in which it is fetched from disk. Buffer underflow will not occur either. At time T_i^start, there is exactly B_i data in the buffer, i.e., enough to survive the first sweep. Let this sweep be denoted by j. Furthermore, for each sweep j' ≥ j, it holds that if underflow does not occur in sweep j', it will not occur in sweep j' + 1, which is shown as follows. By assumption, b_i(j') ≥ B_i, and the amount of data fetched for user i in this sweep equals 2·B_i − b_i(j'). Since user i consumes at most an amount B_i of data, b_i(j' + 1) ≥ b_i(j') + 2·B_i − b_i(j') − B_i = B_i, i.e., user i survives sweep j' + 1. Unless a user does not consume from his buffer, a data block will be fetched in each sweep.

Note that reading data blocks of variable size may impose additional constraints on the layout of the data on disk and the granularity of striping, since we must guarantee that a data block can be fetched by a single disk access.
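For comparison with the TB sketch above, the corresponding VDB rule determines an amount of data rather than a yes/no decision; again a minimal sketch with illustrative names, not code from the paper.

```python
def vdb_amount_to_fetch(filling_i, block_size_i):
    """Amount VDB fetches for one stream: the free room at the start of the sweep, capped at B_i."""
    return min(block_size_i, 2 * block_size_i - filling_i)
```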
5.3 Dual Sweep Algorithm

Finally, we propose the dual sweep algorithm (DS), which operates as follows. In each sweep, a constant-size data block is fetched for data stream i whenever there is room for one in the buffer at the start of the sweep, unless a data block for this stream has already been fetched in the previous sweep. Hence, for each pair of successive sweeps, at most one data block is fetched for each data stream. Let B'_i be the size of the data blocks that are repeatedly fetched from disk for user i by the dual sweep algorithm. B'_i equals the maximum amount of data consumed in two successive sweeps. Since a data block of size B'_i is sufficient to survive two successive sweeps, a buffer need only have room for two such data blocks.
Theorem 3. Let U be a given set of n users, with C = ∑_{j ∈ U} c_j^max. If C < r and the buffer for each i ∈ U has room for two data blocks of size B'_i, then the dual sweep algorithm is safe if and only if for each i ∈ U

    B'_i / c_i^max ≥ (∑_{j ∈ U} B'_j) / r + s(n, 2).   (7)

Proof. The sufficiency of (7) can be shown as follows. It is easy to see that buffer overflow will never occur, since a data block will be fetched in a sweep only if there is already enough room for this data block at the start of the sweep.

With respect to fetching data blocks for stream i, let the set of sweeps be divided into yes-sweeps and no-sweeps. In a yes-sweep, a data block is fetched for stream i; in a no-sweep, no data block is fetched for stream i. Between two successive yes-sweeps there is at least one no-sweep. At time T_i^start, at the start of sweep j, there is exactly B'_i data in the buffer. With respect to buffer underflow, we first show for each i ∈ U that if at the end of a yes-sweep there is at least B'_i data in the buffer, then there is also at least B'_i data in the buffer at the end of the next yes-sweep. We consider two cases. If between these two yes-sweeps there is exactly one no-sweep, then in the time between the completion times of the two yes-sweeps a block of size B'_i is fetched, while at most B'_i data is consumed. On the other hand, if there is more than one no-sweep between the two yes-sweeps, then at the start of the last no-sweep there must be more than B'_i data in the buffer, since otherwise it would not have been a no-sweep. Again, between the start of this last no-sweep and the completion of the succeeding yes-sweep, a block of size B'_i is fetched and at most B'_i is consumed. During any such no-yes subsequence the buffer will not underflow, since at the start at least B'_i data is available while at most B'_i data is consumed. Furthermore, during the remaining no-sweeps underflow will never occur.

The necessity of (7) can be shown analogously to the necessity of (4) in Theorem 2.

By analogy with the minimum buffer requirements for TB, we can derive that B'_i is minimal if

    B'_i = (r · s(n, 2) / (r − C)) · c_i^max.   (8)
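The DS rule adds one extra condition to the TB rule sketched in Section 5.1; a minimal sketch with illustrative names, where fetched_previous_sweep records whether a block for this stream was fetched in the preceding sweep.

```python
def ds_fetch_block(filling_i, block_size_i, fetched_previous_sweep):
    """True if DS fetches a block of size B'_i for this stream in the coming sweep."""
    has_room = 2 * block_size_i - filling_i >= block_size_i
    return has_room and not fetched_previous_sweep
```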
6 Comparing Buffer Requirements

Using Equations (6) and (8), we can compare the minimum buffer requirements of the three disk scheduling algorithms. The relation between the block sizes B_i and B'_i can be rewritten as B'_i / B_i = s(n, 2) / s(n, 1). To give some quantitative results, we use the characteristics of the Seagate Elite 9 disk. We assume that this disk can offer a guaranteed data rate of 44 Mbit/s. For n = 5, 10, 15, 20, and 25, the resulting ratio B'_i / B_i equals 1.185, 1.102, 1.070, 1.053, and 1.045, respectively. It is noted that, for n > 10, B'_i is at most 10% larger than B_i. To give an impression of the buffer requirements, Table 1 presents values of B_i and B'_i for different values of n and c_i^max, assuming that all users have an identical maximum consumption rate and that one Seagate Elite 9 disk is used. The buffer sizes for TB and VDB are given by 3·B_i and 2·B_i, respectively. The buffer sizes for DS are given by 2·B'_i. From (6), it is easily seen that lim_{C→r} B_i = ∞. If, for example, we have 29 users, with c_i^max = 1.5 Mbit/s for each user i, and r = 44 Mbit/s, then B_i = 66.121 Mbit.
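The block sizes compared above follow directly from Equations (6) and (8). The following minimal sketch again assumes the illustrative switch_time() model of Section 2, so the numbers it produces will not coincide with the Seagate-based values in Table 1.

```python
def tb_vdb_ds_block_sizes(c_max, r, switch_time):
    """Minimal block sizes B_i (TB and VDB, Eq. (6)) and B'_i (DS, Eq. (8))."""
    n = len(c_max)
    C = sum(c_max)
    assert C < r, "the total maximum consumption rate must stay below the disk rate"
    b_tb_vdb = [r * switch_time(n, 1) / (r - C) * c for c in c_max]
    b_ds     = [r * switch_time(n, 2) / (r - C) * c for c in c_max]
    return b_tb_vdb, b_ds

# Per-stream buffer requirements: 3*B_i for TB, 2*B_i for VDB, 2*B'_i for DS.
```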
7 Comparing Average Response Times

To compare the average response times of the algorithms, we have carried out two kinds of simulations. In Section 7.1, we assume that the users have identical constant consumption rates, whereas in Section 7.2, we assume that they consume at variable rates. In these simulations, we restrict ourselves to the case in which user requests can be considered independent. This is realised by creating enough time between two successive requests.

                  B_i                                    B'_i
          c_i^max (Mbit/s)                        c_i^max (Mbit/s)
   n    0.5    1.0    1.5    2.0    2.5       0.5    1.0    1.5    2.0    2.5
   5   0.055  0.117  0.188  0.269  0.363     0.065  0.139  0.223  0.319  0.430
  10   0.107  0.244  0.430  0.693  1.094     0.118  0.269  0.474  0.764  1.206
  15   0.165  0.415  0.840  1.721  4.633     0.177  0.444  0.899  1.841  4.957
  20   0.232  0.658  1.691  7.890    –       0.244  0.693  1.781  8.308    –
  25   0.309  1.025  4.494    –      –       0.323  1.071  4.696    –      –

Table 1: Values of B_i and B'_i, in Mbit, for different values of n and c_i^max.
7.1 Simulating Constant-Rate Streams

In order to compare the fixed-period double buffering algorithm with the three variable-period algorithms presented in this paper, we assume that all streams consume at the same constant rate of 1.5 Mbit/s. For the disk parameters, we used the characteristics of the Seagate Elite 9 disk. This disk has a rate that varies from 44 Mbit/s for the inner tracks to 65 Mbit/s for the outer tracks. During each simulation the number n of users is fixed. The size of the data blocks that are repeatedly fetched for a user depends on the disk scheduling algorithm that is being used and on n_max, the maximum number of users that is admitted service simultaneously. Response times are measured by repeatedly generating a request for other data by one of the users. The time between two successive requests is chosen uniformly from the interval [13, 28]. Each request is issued by a randomly chosen user. Each simulation is based on a total of 50,000 requests.

Table 2 gives the average observed response times and corresponding 99% quantiles for TB, VDB, and DS. The 99% quantile is defined as the smallest value for which 99% of the observed values are smaller than this value. For the three algorithms, Figure 2 gives the frequency diagrams of the observed sweep times when n_max = 25 and n = 25. The frequency diagrams of the corresponding response times are given in Figure 3.

Table 2: Average response times and corresponding 99% quantiles (in milliseconds) for TB, VDB, and DS in case of constant-rate simulations, for the various combinations of n and n_max.

The peak structure observed in the frequency diagrams of the sweep times of TB and DS can be explained as follows. The i-th peak represents the sweeps in which i data blocks are retrieved. From Figure 2 it follows that, although n = n_max, the maximum number of disk accesses observed in one sweep differs substantially from the worst-case value of 25. This results from the differences between the average and worst-case data transfer rate, rotational latency, and seek time. Hence, on average, a sweep takes less time. As a result, a data block is fetched only once every few sweeps, leading to a considerable further decrease of the average sweep time. In addition, the variation in the degree to which the buffers are filled at any moment in time appears to be large enough to keep the probability that two or more data blocks have to be fetched in a single sweep rather low. In the frequency diagram of the sweep times of VDB we observe only one peak. Since each consuming user is serviced in each sweep, there is little variation in the duration of the sweeps. As each sweep consists of n disk accesses, the use of VDB results in larger sweep times.

We observe that all three algorithms have average response times that are considerably smaller than those of DB, as discussed in Section 4. For example, if n_max = 25, then a period equals 3 seconds, resulting in an average response time of 4.5 seconds for DB. Hence, starting sweeps immediately upon completion of the previous one can indeed improve the average response times
considerably. In case of TB and DS, users have to wait only a short time before their request is taken into account: either the disk is idle at the moment the request is issued, or a sweep with only a few disk accesses is being carried out. In case of VDB, a user has to wait, on average, half a sweep, in which a small data block is retrieved for each of the n users, before the request is taken into account. In the following sweep, one large data block for the user that issued the request and n − 1 small data blocks for the remaining users have to be retrieved. This results in considerably larger average response times than those of DS and TB.
7.2 Simulating Variable-Rate Streams

In this section we assume that the users consume at variable rates. In the simulations we used a variable-bit-rate-encoded MPEG-2 sequence which consists of 54667 video frames and which has a frame rate of 25 frames per second. The peak bit rate equals 21.7 Mbit/s and the mean bit rate equals 2.7 Mbit/s. Using the peak bit rate for c_i^max would imply that only two users could be serviced by one Seagate Elite 9 disk. As explained by Dengler, Bernhardt & Biersack [1996], averaging over several successive frames results in a much lower bit rate, such that more users can be admitted service simultaneously. For these simulations we averaged over 12 frames, resulting in a bit rate of 9.43 Mbit/s. We assume that all users have the same upper bound on the consumption rate, i.e., c_i^max = 9.43 Mbit/s for all i ∈ U. In order to be able to service more than four users, we assume that the data is striped across an array of 10 Seagate Elite 9 disks that are synchronized, i.e., all disks read from the same radial position at the same time. In this way the array has a data transfer rate that varies from 440 Mbit/s to 650 Mbit/s. Requests are generated in a similar way as described in Section 7.1.

Table 3 gives the average observed response times and corresponding 99% quantiles for TB, VDB, and DS. On average, the users consume at the mean bit rate of 2.7 Mbit/s, while the sizes of the data blocks are based on c_i^max = 9.43 Mbit/s. Together with the reasons presented in the previous section, this causes the disk array to be idle for at least 71% of the total simulation time. As a result, in case of TB and DS almost all sweeps consist of exactly one disk access. Furthermore, as the probability that a request is issued while the disk array is idle is very large, the corresponding response times are almost completely determined by the time it takes to retrieve the requested data block.

Table 3: Average response times and corresponding 99% quantiles (in milliseconds) for TB, VDB, and DS in case of variable-rate simulations, for the various combinations of n and n_max.
Figure 2: Frequency diagram of the sweep times observed for TB, VDB, and DS, in case of constant-rate simulations where n_max = 25 and n = 25.

Figure 3: Frequency diagram of the response times observed for TB, VDB, and DS, in case of constant-rate simulations where n_max = 25 and n = 25.
8 Conclusions

In this paper we have considered three safe disk scheduling algorithms for sustaining multiple heterogeneous variable-rate data streams. We can draw the following conclusions. Using a disk scheduling algorithm that starts a new sweep immediately upon completion of the previous one, instead of executing sweeps strictly periodically, leads to much better average response times. This already holds for constant-rate data streams, as we have seen in Section 7.1, and is even more interesting for variable-rate data streams, as we have seen in Section 7.2.

With respect to the buffer requirements, the triple buffering algorithm uses 50% more buffer space than the variable-block double buffering algorithm. The dual sweep algorithm uses at most 10% more buffer space than the variable-block double buffering algorithm if the number of users is larger than 10, assuming that the disk parameters of the Seagate Elite 9 disk are used. For larger numbers of users, the buffering overhead is even smaller.

With respect to the average response times, the results for the triple buffering and the dual sweep algorithms are more or less the same. In this respect, these algorithms perform considerably better than the variable-block double buffering algorithm. To conclude, DS combines the small buffer requirements of VDB with the favourable response times of TB.
References

Birk, Y. [1995], Track-pairing: a novel data layout for VOD servers with multi-zone-recording disks, Proceedings of the IEEE International Conference on Multimedia Computing and Systems, 248-255.

Chang, E., and A. Zakhor [1994], Proceedings of the 1st International Workshop on Community Networking: Integrated Multimedia Services to the Home, San Francisco, July 13-14, 127-137.

Chen, M.-S., D.D. Kandlur, and P.S. Yu [1995], Storage and retrieval methods to support fully interactive playout in a disk-array-based video server, Multimedia Systems 3, 126-135.

Dengler, J., Ch. Bernhardt, and E. Biersack [1996], Deterministic admission control strategies in video servers with variable bit rate streams, in: B. Butscher, E. Moeller, and H. Pusch (Eds.), Proceedings Workshop Interactive Distributed Multimedia Systems and Services, Berlin, March 4-6, 245-264.

Denning, P.J. [1967], Effects of scheduling file memory operations, Proceedings of the 1967 AFIPS SJCC 30, 9-21.

Dey, J.K., C.-S. Shih, and M. Kumar [1994], Storage server for high-speed network environments, Proceedings of the SPIE 2188, 200-211.

Gemmell, D.J. [1993], Multimedia network file servers: multi-channel delay sensitive data retrieval, ACM Multimedia 6, 243-249.

Kandlur, D.D., M.-S. Chen, and Z.-Y. Shae [1991], Design of a multimedia storage server, IBM Research Report, June 1991.

Kenchammana-Hosekote, D.R., and J. Srivastava [1994], Scheduling continuous media in a video-on-demand server, Proceedings of the International Conference on Multimedia Computing and Systems 5, 19-28.

Korst, J., V. Pronk, E. Aarts, and F. Lamerikx [1995], Periodic scheduling in a multimedia server, Proceedings of the 1995 INRIA/IEEE Symposium on Emerging Technologies and Factory Automation, ETFA'95, Paris, October 10-13, 205-216.

Özden, B., R. Rastogi, and A. Silberschatz [1995], A framework for the storage and retrieval of continuous media data, Proceedings of the International Conference on Multimedia Computing and Systems, Washington, May 15-18, 2-13.
Oyang, Y. [1995], A tight upper bound of the lumped disk seek time for the SCAN disk scheduling policy, Information Processing Letters 54, 355-358.

Paek, S., and S.-F. Chang [1996], Video server retrieval scheduling for variable bit rate scalable video, Proceedings of the IEEE International Conference on Multimedia Computing and Systems, Hiroshima, June 17-23, 108-112.

Patterson, D.A., G.A. Gibson, and R.H. Katz [1988], A case for redundant arrays of inexpensive disks (RAID), Proceedings of the ACM Conference on Management of Data, 109-116.

Rangan, P.V., H.M. Vin, and S. Ramanathan [1992], Designing an on-demand multimedia service, IEEE Communications Magazine 7, 56-64.

Rautenberg, M., and H. Rzehak [1996], A control for an interactive video on demand server handling variable data rates, in: B. Butscher, E. Moeller, and H. Pusch (Eds.), Proceedings Workshop Interactive Distributed Multimedia Systems and Services, Berlin, March 4-6, 265-276.

Reddy, A.L.N., and J.C. Wyllie [1994], I/O issues in a multimedia system, Computer 27, 69-74.

Vin, H.M., P. Goyal, A. Goyal, and A. Goyal [1994], A statistical admission control algorithm for multimedia servers, Proceedings ACM Multimedia, San Francisco, 33-40.

Yu, P.S., M. Chen, and D.D. Kandlur [1992], Design and analysis of a grouped sweeping scheme for multimedia storage management, in: P.V. Rangan (Ed.), Proceedings of the 3rd International Workshop on Network and Operating Systems Support for Digital Audio and Video, La Jolla, CA, Lecture Notes in Computer Science 712, 44-55.