Future Generation Computer Systems 20 (2004) 157–170
Generalized data retrieval for pyramid-based periodic broadcasting of videos

Jin B. Kwon (a), Heon Y. Yeom (b)

(a) Department of Computer and Information Sciences, Sunmoon University, 100 Kalsan, Tangjeong, Asan, Chungnam 336-708, South Korea
(b) Department of Computer Science and Engineering, Seoul National University, San56-1, Sillim-dong, Kwanak-gu, Seoul 151-742, South Korea
Abstract

A true video-on-demand system allows users to view any video program at any time and to perform any VCR function, but its per-user video delivery cost is too high for commercial use. Periodic broadcasting (PB), a near video-on-demand technique, broadcasts videos repeatedly over broadcast channels. In this way, PB can service an unlimited number of clients simultaneously with a bounded service latency. We propose a data retrieval scheme, consisting of buffer management and data placement, for PB servers. Unlike existing schemes devised for a specific PB technique, our scheme can be adopted by PB schemes in general. Furthermore, it is devised with the variations in disk load induced by VBR-encoded videos in mind.

Keywords: Video-on-demand servers; Data placement; Buffer management; Video broadcasting
1. Introduction

A video-on-demand (VOD) service enables subscribers to watch videos of their choice at the press of a button. Generally, a large number of video objects are stored in VOD servers, delivered through high-speed communication networks to distributed clients, and played by each client. In true video-on-demand (TVOD) systems, each subscriber is served by an individually allocated channel. The techniques that allocate channels among clients in this way are referred to as user-centered or TVOD. Although they can respond to requests immediately, the server rapidly depletes the network bandwidth. Therefore, TVOD systems are expensive due to their low scalability. The network I/O bottleneck faced by TVOD techniques may be eliminated by using the multicast facility of modern communication networks to share a server stream among multiple clients. The clients requesting the same video may be serviced by one stream [7], which is referred to as near video-on-demand (NVOD). It is known that most requests are for a small group of videos [7,8]. That is, since the distribution of requests is highly biased, stream sharing is substantially effective. There are two basic approaches to NVOD provision: scheduled multicast and periodic broadcast. In conventional scheduled multicast [2–4,7,8,19], the server collects user requests (i.e., a batch) during a specific time period. Clients requesting the same video within
the same period will receive the video stream over a single multicast channel. Periodic broadcasting (PB), an NVOD technique, can service an unlimited number of clients simultaneously with a bounded service latency [7]. In PB, videos are broadcast periodically, as the name suggests; clients can start watching a video every d minutes (or seconds). Because dedicated server channels are allocated to each video object, these schemes are referred to as data-centered [22]. PB schemes can guarantee that the worst-case service latency experienced by any client is less than d minutes (or seconds). In addition, since PB bypasses the need to process individual user requests, it is more scalable. Because of these benefits, a number of periodic broadcast schemes have recently been published [1,7,9,11,12,16,17,22]. Most of the proposed schemes aim at minimizing the system resources required for a given worst-case service latency, or at minimizing the worst-case service latency for given system resources (server network bandwidth, client I/O bandwidth, client disk space, etc.). In recent work, the video is fragmented into separate segments of “increasing size” and each segment is repeatedly transmitted over a different channel of “equal bandwidth” [9,11,22]; these are called pyramid-based schemes. The techniques mentioned above focus only on the server network bandwidth and the client resources. In addition to network resources, however, servers should manage disk and memory resources efficiently. Since the size and bit rate of video data are substantial and memory space is much more expensive than disk space, the data will be stored on disks and disk I/O will be invoked frequently. Therefore, disk bandwidth is likely to become a bottleneck on the server side, and saving disk bandwidth is essential to building a cost-effective PB server. Disk bandwidth can be saved by an efficient data retrieval scheme, accomplished through disk scheduling, buffer replacement, and data placement; there are a few studies of data retrieval for PB [6,21]. Proper placement of data on disks is one way to save disk bandwidth. Chen and Thapar [6] and Tsao and Huang [21] proposed video data placement schemes for VOD servers that periodically broadcast videos. These enable the server to load data from disk into the memory buffer without disk seeks. However, these schemes are optimized for staggered broadcasting, or equal-length broadcasting (EB) [7], which is one of the earliest periodic broadcast schemes.
Therefore, they are not effective for modern schemes, such as pyramid-based schemes. In this paper, we propose a data retrieval scheme for PB servers, Generalized Data Retrieval for PB (GDRPB). This scheme can be adopted generally by servers that periodically broadcast VBR videos. The scheme reduces the disk bandwidth requirement by shortening the time needed to access data blocks, exploiting the disk access pattern known in advance and the variation in disk load induced by VBR-encoded videos. GDRPB consists of a data placement scheme and a buffer management scheme. The data placement scheme reduces disk seek overhead by placing together data blocks that are frequently read in the same round, and the buffer management scheme reduces disk accesses by caching the blocks with small time-to-next-reference (TTNR). Both exploit the pre-known disk access pattern of PB. In addition, GDRPB is independent of how the video is fragmented into segments. Consequently, PB servers using GDRPB can service more videos or provide shorter service latency by utilizing the saved disk service time. We demonstrate the performance of our scheme using trace-driven simulations. The results show how much GDRPB reduces the required disk bandwidth, and how short a service latency and how many videos it can provide with the saved bandwidth. The remainder of this paper is organized as follows. Section 2 presents our system model and some assumptions, and Section 3 introduces the existing data placement technique. We analyze the disk access pattern of general PB schemes and present a guideline for data retrieval in PB in Section 4. GDRPB is then proposed in Section 5, and some issues not dealt with here are discussed in Section 6. Finally, Section 7 demonstrates the performance of GDRPB, and Section 8 concludes the paper.
2. System model

The sequence of the lengths of the segments Si is called the broadcast series or partition series of the PB scheme, and is represented as 1, α2, α3, ..., αK, where αi is the normalized length of Si and K is the number of segments. Si denotes the ith segment from the beginning of the video. The delay from the request to the start of playout is defined as the service latency. Assuming that D is the length of a video, in minutes or seconds, the worst-case service latency d is the length of S1, i.e., d = D/Σ_{i=1}^{K} αi. Fig. 1 illustrates the video transmission schedule of GDB3 broadcasting [9], the broadcast series of which is 1, 2, 4, 4, 10, 10, .... Si is repeatedly transmitted over channel i every d·αi seconds, and clients receive the video data according to a reception schedule, which is determined by the starting point of the service. The shaded rectangles in the figure indicate the reception schedule along which the clients who requested the video between t0 and t1 proceed. From the clients' viewpoint, the video appears to be broadcast every d seconds; that is why the length of S1, the worst-case service latency, is d.

Fig. 1. PB: GDB3: 1, 2, 4, 4, ....

In typical VOD servers, video data is loaded from disks into the memory buffer round by round [10]. That is, all the streams being serviced retrieve data from disk every round, and the server proceeds to the next round after the current round is finished. For example, if a server is servicing n streams and the round length is β seconds, each stream retrieves β seconds' worth of data every round. If the disk service time spent retrieving the n one-round data units exceeds β seconds, starvation results: some streams cannot obtain the required data because of disk overload. It is reasonable to assume round-level disk I/O for the PB model. With round-level disk I/O, it is natural to use round time length (RTL) striping [13], in which the video is divided into one-round data units and each unit is stored contiguously on disk. These contiguously stored one-round data units are called media blocks. The media blocks are of variable size in the case of VBR-encoded video.
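As an illustration of RTL striping, the following sketch cuts a VBR frame-size trace into one-round media blocks. It is a minimal reading of the description above, not code from the paper; the frame rate and round length are assumed parameters.

def rtl_stripe(frame_sizes, fps=30, round_len=1.0):
    """RTL striping: cut a VBR video into one-round media blocks [13].

    frame_sizes: bytes per frame in playout order (e.g., an MPEG trace).
    Returns the block sizes; block k holds the frames played in round k,
    so block sizes vary with the encoding while playout stays round-aligned.
    """
    per_round = int(fps * round_len)             # frames consumed per round
    return [sum(frame_sizes[s:s + per_round])
            for s in range(0, len(frame_sizes), per_round)]

# A 60 min movie at 30 fps with beta = 1 s yields 3600 variable-size blocks.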
The model of disk service time is important because our work deals mainly with saving disk load. Disk service time consists of transfer time and disk overhead, such as seek time and rotational latency. In PB, because multiple clients are allowed to receive from a single channel, starvation of a channel or stream causes hiccups on the screens of all the clients receiving data from that channel. Therefore, it is desirable for PB servers to provide deterministic quality-of-service (QoS) guarantees, and we consider the worst-case disk service time in order to do so. A server serving n streams induces n seeks within a round. Under SCAN disk scheduling and a realistic seek-time function, the total seek time is maximal when the seek positions of the n requests are equidistant [5,15]. The seek-time function itself is assumed to be proportional to the square root of the seek distance for small distances, below a disk-specific constant, and a linear function of the seek distance for longer distances [18]. Thus, for given disk parameters, the maximum total seek time of a sweep can be computed by assuming the n seek positions to be at cylinders i·c/n for i = 1, ..., n, where c is the total number of disk cylinders, and applying the seek-time function [14]. Let Ttran(Bi) be the transfer time of media block Bi, Trot the worst-case rotational latency, Tseek(n) the worst-case disk seek time needed for n requests, and ε any other overhead. Then the worst-case disk service time Tserv^max is given by

    Tserv^max = Tseek(n) + Σ_{i=1}^{n} [Ttran(Bi) + Trot + ε].    (1)
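The following sketch evaluates Eq. (1) under the seek model just described. The seek-curve constants are illustrative placeholders rather than measured disk parameters.

import math

# Illustrative seek-time curve [18]: square root of the distance up to a
# disk-specific constant C0, linear beyond it (all constants are placeholders).
C0, BASE, SQRT_COEF, LIN_COEF = 400, 1.5e-3, 0.5e-3, 1.0e-7

def seek_time(dist):
    if dist <= 0:
        return 0.0
    if dist < C0:
        return BASE + SQRT_COEF * math.sqrt(dist)
    return BASE + SQRT_COEF * math.sqrt(C0) + LIN_COEF * (dist - C0)

def worst_case_service_time(block_sizes, cylinders, rate, t_rot, eps=0.0):
    """Eq. (1). Under SCAN the total seek time is maximal for equidistant
    requests, i.e. n seeks of c/n cylinders each [5,14,15]."""
    n = len(block_sizes)
    t_seek = n * seek_time(cylinders / n)        # Tseek(n)
    t_tran = sum(size / rate for size in block_sizes)
    return t_seek + t_tran + n * (t_rot + eps)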
3. Data placement for staggered broadcasting

In [6,21], data placement schemes for staggered broadcasting [7], the simplest PB scheme, are presented (the two schemes are almost identical). Let D be the total length of a video and K the number of channels; staggered broadcasting then starts broadcasting the video every D/K minutes, so there are K streams and the maximum service latency is D/K minutes. Staggered broadcasting is the special case of pyramid-based PB whose broadcast series is 1, 1, 1, ..., 1. Assume that m is the length of the segments in rounds, K the number of segments (i.e., the number of channels), and Bi,k the kth media block belonging to the ith segment. Then B1,k, B2,k, ..., BK,k for 1 ≤ k ≤ m are always read in the same round. Fig. 2 illustrates the data access pattern of staggered broadcasting and the disk behavior induced by the data placement schemes. The shaded boxes indicate disk seek time, the boxes with a segment index and a block index indicate the transfer times of the blocks Bi,j, and each vertical bar of boxes indicates the total disk service time of a round. For simplicity, the figure assumes that the size of the media blocks and the disk seek time are fixed; in reality, the media blocks are of variable size in VBR-encoded videos, and the disk seek time varies as well. Fig. 2(a) illustrates RTL striping [13], and Fig. 2(b) illustrates the data placement of [6,21] optimized for staggered broadcasting, which places all the media blocks contiguously in the order B1,1, B2,1, ..., BK,1, B1,2, ..., BK,2, ..., B1,m, ..., BK,m. The scheme therefore induces no seek during the m rounds except for the first seek to B1,1; the whole video can be broadcast round by round without any disk seek. The placement is derived from the observation that Bi,k and Bj,k for i ≠ j are always read in the same round. We call this scheme Simple Data Placement (SDP) in the following sections.

Fig. 2. Disk access of staggered broadcasting: (a) RTL striping, and (b) SDP.

However, SDP has little effect on pyramid-based broadcasting, which uses an increasing broadcast series: since the lengths of the segments differ, Bi,k and Bj,k for i ≠ j may not be read in the same round, unlike in staggered broadcasting. Consequently, a new data retrieval scheme is required that can be used for pyramid-based schemes as well as staggered ones. In the following sections, we propose GDRPB, which includes data placement and buffer management.
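Before moving on, the SDP ordering can be generated mechanically; a small sketch (the function name and interface are ours, not from [6,21]):

def sdp_layout(K, m):
    """SDP order for staggered broadcasting [6,21]:
    B(1,1), B(2,1), ..., B(K,1), B(1,2), ..., B(K,m).
    Returns (segment, block) pairs in on-disk order; the K blocks of each
    round are contiguous, so a round needs at most one seek."""
    return [(i, k) for k in range(1, m + 1) for i in range(1, K + 1)]

print(sdp_layout(3, 2))  # [(1, 1), (2, 1), (3, 1), (1, 2), (2, 2), (3, 2)]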
4. Disk access pattern and block layout

The disk access pattern of a PB scheme is not only known in advance but also repeats periodically. Let the broadcast series of a video be 1, α2, α3, ..., αK, the worst-case service latency be d, and the round length be β. Then the length of Si in rounds is m1·αi = mi, where the length of the first segment S1 in rounds is d/β = m1. Letting γ be the least common multiple (LCM) of the mi, the access pattern of the video repeats in a γ-round cycle. The sequence of γ successive rounds is called a section; the section length is thus the LCM of mi, i = 1, ..., K. Fig. 3 illustrates the disk access pattern of a GDB3 broadcast [9], where m1 is 2 and the broadcast series is 1, 2, 4, 4.
Fig. 3. Access pattern: 1, 2, 4, 4, m1 = 2.
The box with a segment index i and a block index j indicates the block Bi,j. The variability of the media block sizes is not reflected in the figure, for simplicity of illustration. We define the set of media blocks to be read in the tth round within a section as the round group Rt. GDRPB exploits the pre-known disk access pattern and considers the variation of disk load per round. Although blocks might be retrieved even more efficiently by data replication, replication is not considered in this paper. If video data is placed using only the RTL striping of Section 2, the number of seeks per round equals the total number of segments, except in some special cases (Fig. 2(a)). If, however, the pre-known disk access pattern is exploited, the per-round seek overhead can be reduced, and the disk throughput accordingly improved, by contiguously placing media blocks belonging to
the same round group. It is impossible to place all the blocks of each round group contiguously within a section, since a media block belongs to one or more round groups and can be placed contiguously with at most two other blocks. We define the adjacency of blocks as follows.

Definition 1. For any two blocks X, Y ∈ Rt, where Rt is the tth round group, X and Y are adjacent with respect to Rt, written X|t Y (= Y|t X), if and only if

1. X and Y are placed physically contiguously, written X||Y, or
2. all blocks placed between X and Y are in Rt.

As an example, consider the round groups R1, R3, and R5 of Fig. 3. Fig. 4 illustrates the block layout and disk behavior for this example.
Fig. 4. Adjacency groups.
All the blocks of R1 and R3 are adjacent, and accordingly the disk reads all four blocks of each of these round groups with one long stroke after one disk seek. The blocks of R5, however, cannot be read with one seek and one stroke, because some of them are not adjacent (e.g., B1,1 and B3,5 are not adjacent). The subsets of a round group in which all blocks are mutually adjacent are called adjacency groups (AGs). Thus, R1 and R3 each have a single AG, while R5 has two AGs: {B1,1, B2,1} and {B3,5, B4,5}. Because all the blocks belonging to an AG of Rt are read with one seek and one stroke in round t, the number of seeks induced in round t equals the number of AGs of Rt. The smaller the number of AGs per round group, the higher the disk throughput, because of the reduction in disk seeks. The AGs are determined by the disk access pattern and the block layout on the disk. The broadcast series of a PB scheme hints at a suitable block layout: if the blocks that frequently occur in the same rounds lie contiguously on the disk, we can expect the average number of AGs per round group to be reduced. The key point, then, is to find the frequency with which any two blocks occur in the same round groups. This frequency can be computed from the lengths of the segments to which the blocks belong.

Theorem 1. Let F(Bi,j, Bp,q) be the number of round groups, among the γ round groups of a section, to which both blocks Bi,j and Bp,q belong. Then F(Bi,j, Bp,q) = γ/LCM(mi, mp) if F(Bi,j, Bp,q) ≠ 0.

Proof. γ, the LCM of all the segment lengths, is a common multiple of mi and mp, and hence also a multiple of c = LCM(mi, mp). The c-round access pattern on Si and Sp is repeated γ/c times within a section of γ rounds. Since F(Bi,j, Bp,q) ≠ 0, there exists at least one round group among the first c round groups to which both Bi,j and Bp,q belong. Let the first such round group be Rt. Then

    Bi,j ∈ Rt, Rt+mi, Rt+2mi, ..., Rt+c−mi,    (2)

    Bp,q ∈ Rt, Rt+mp, Rt+2mp, ..., Rt+c−mp,    (3)
where max(j, q) ≤ t ≤ c. Assume that there is another round group Rt+d to which both Bi,j and Bp,q belong, where 0 < d ≤ c − t. Then, from Eqs. (2) and (3), t + d = t + mi·x = t + mp·y ≤ c for some positive integers x and y, i.e., d = mi·x = mp·y ≤ c − t < c. Since d would then be a common multiple of mi and mp smaller than c, this violates c = LCM(mi, mp), so Rt+d does not exist. Accordingly, Bi,j and Bp,q belong to only one round group per c rounds, and the c-round pattern is repeated γ/c times during a section. Therefore, F(Bi,j, Bp,q) = γ/c = γ/LCM(mi, mp). □

Fig. 5. Simultaneous occurrence.

Fig. 5 presents an example of the simultaneous occurrence of blocks for the access pattern discussed above. The length of both S3 and S4 is 8 (LCM(m3, m4) = 8), and B3,6 and B4,6, drawn with diamonds in the figure, occur simultaneously in round group R6 (so F(B3,6, B4,6) ≠ 0). Hence F(B3,6, B4,6) = γ/LCM(m3, m4) = 1, because γ = 8; that is, R6 is the only round group to which both B3,6 and B4,6 belong. Since LCM(m1, m2) = 4 and LCM(m1, m3) = 8, F(B1,1, B2,1) = 2 (blocks drawn with circles) and F(B1,2, B3,2) = 1 (blocks drawn with squares). From Theorem 1, we derive the following lemma.

Lemma 1. If LCM(mi, mp) ≤ LCM(mi, mp′) and, for any j, q, and q′, F(Bi,j, Bp,q) ≠ 0 and F(Bi,j, Bp′,q′) ≠ 0, then F(Bi,j, Bp,q) ≥ F(Bi,j, Bp′,q′).

Proof. Straightforward from Theorem 1. □
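The round groups and the frequency F of Theorem 1 can be computed directly from the broadcast series. The sketch below reproduces the running example; the representation, with blocks as (segment, block) index pairs, is ours and is reused by the later sketches.

from math import lcm

def round_groups(alphas, m1):
    """Round groups R_1..R_gamma of one section, for broadcast series
    'alphas' with first-segment length m1 (in rounds). Block (i, j) is read
    in every round t with ((t - 1) mod m_i) + 1 == j."""
    m = [m1 * a for a in alphas]                 # segment lengths m_i
    gamma = lcm(*m)                              # section length
    return [{(i + 1, (t - 1) % m[i] + 1) for i in range(len(m))}
            for t in range(1, gamma + 1)]

def F(groups, x, y):
    """Number of round groups containing both blocks x and y (Theorem 1)."""
    return sum(1 for g in groups if x in g and y in g)

groups = round_groups([1, 2, 4, 4], 2)           # the pattern of Fig. 3
assert len(groups) == 8                          # gamma = LCM(2, 4, 8, 8)
assert F(groups, (1, 1), (2, 1)) == 8 // lcm(2, 4)   # = 2 (circles in Fig. 5)
assert F(groups, (3, 6), (4, 6)) == 8 // lcm(8, 8)   # = 1 (diamonds in Fig. 5)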
When blocks with larger F(X, Y) values tend to be adjacent, the average number of AGs of the layout is smaller. Lemma 1 thus provides an analytic guideline for effective block placement to save disk bandwidth.

5. Generalized data retrieval

In this section, we present a data retrieval scheme, GDRPB, that efficiently fetches data blocks from
disk. GDRPB consists of a data placement scheme using block adjacency and a buffer management scheme using a buffer cache. Both schemes exploit the pre-known disk access pattern and consider the variability of VBR-encoded data. The data placement scheme is based on Lemma 1, and the buffer management scheme uses a caching schedule that specifies when and which blocks are to be kept in the buffer cache.

5.1. Data placement

As described in Section 2, a deterministic QoS must be guaranteed in PB. To provide it, PB servers must ensure that the per-round disk load never exceeds the round length β. The worst-case disk load of a round x, f(x), is the Tserv^max of Eq. (1) in Section 2. The disk bandwidth requirement for deterministic QoS is therefore max_{1≤x≤γ} f(x), and GDRPB aims at minimizing this requirement. In this subsection, we focus only on reducing the number of seeks by placing as one chunk the blocks that are frequently needed in the same round, based on Lemma 1; we do not consider the seek distances between the chunks. The objective of GDRPB is not to minimize the maximum number of AGs in a round group but to minimize the maximum disk load, because the disk load depends on the transfer time as well as the seek time; minimizing the maximum number of AGs does not necessarily minimize the disk bandwidth requirement. Our placement scheme therefore focuses on the round with the maximum load and tries to reduce its disk load by decreasing the seek overhead. That is, the placement scheme reduces the number of AGs of the round group with the maximum load so as to minimize the required disk bandwidth, and Lemma 1 is used as a heuristic to choose the blocks to be grouped into an AG. Initially, a round group consists of K single-block AGs. To restrict a block to being contiguous with at most two other blocks, each block is given two tokens. GDRPB first calculates f(x), 1 ≤ x ≤ γ, by Eq. (1), and then determines the round xmax that maximizes f(x). Let Rmax be the round group of round xmax. Then two blocks are selected from among the blocks of Rmax. The two blocks, X and Y, must satisfy the following conditions:
1. X||Y must be possible; that is, both X and Y must have one or more tokens left.
2. X and Y must not belong to the same AG; that is, they must not already be adjacent.
3. Letting mX and mY be the lengths of the two segments to which X and Y belong, LCM(mX, mY) ≤ LCM(mX′, mY′) for all other pairs X′ and Y′ in Rmax that satisfy the above conditions. That is, X and Y must be the block pair that most frequently occurs simultaneously among the candidate blocks.

A token is taken from each of the two selected blocks (i.e., X||Y), and the two AGs to which the blocks belong are merged into one AG. The seek overhead of round xmax then decreases by the merging of the two AGs of Rmax, which in turn decreases max_{1≤x≤γ} f(x). Next, GDRPB recalculates f(x) for every round x satisfying 1 ≤ x ≤ γ and x = xmax ± c·LCM(mX, mY), c = 0, 1, 2, .... This recalculation is required because X||Y affects every round group to which both X and Y belong. GDRPB then finds the new Rmax and repeats the placement step until no X and Y satisfying the above conditions remain. Finally, the blocks are actually placed on disk so that the blocks of each AG are physically contiguous. A sketch of this procedure is given below.

As an example, consider R1 and R5 in Fig. 3. Since B2,1||B3,1, B4,1||B5,1 and B4,5||B5,5 after phase 1, R1 and R5 each have three AGs: R1 = {B1,1} ∪ {B2,1, B3,1} ∪ {B4,1, B5,1} and R5 = {B1,1} ∪ {B2,1, B3,1} ∪ {B4,5, B5,5}. Assuming that Rmax = R1, B1,1||B2,1 or B1,1||B3,1 in phase 2. Then {B1,1} and {B2,1, B3,1} of R1 and R5 are merged: R1 = {B1,1, B2,1, B3,1} ∪ {B4,1, B5,1} and R5 = {B1,1, B2,1, B3,1} ∪ {B4,5, B5,5}. Next, f(1), f(5), f(9), f(13) and f(17) are recalculated and the placement step is repeated.
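The sketch below implements the greedy merging just described, reusing the round-group representation of the earlier sketch. It simplifies the disk-time model to a fixed seek cost per AG plus per-block transfer times, approximates the chain constraint by the two-token rule alone, and recomputes f for all rounds in each iteration (which subsumes the selective recalculation above); all names are ours.

from math import lcm

def gdrpb_place(groups, seg_len, t_tran, t_seek):
    """Greedy AG merging of Section 5.1 (sketch).

    groups: round groups R_1..R_gamma, each a set of (segment, block) ids.
    seg_len: dict, segment id -> segment length m_i in rounds.
    t_tran: dict, block id -> worst-case transfer time.
    t_seek: seek overhead charged once per AG.
    Returns a union-find parent map; blocks with the same root form one AG
    and are laid out physically contiguously on disk.
    """
    blocks = set().union(*groups)
    parent = {b: b for b in blocks}          # one single-block AG per block
    tokens = {b: 2 for b in blocks}          # a block touches at most two others

    def find(b):                             # AG representative
        while parent[b] != b:
            parent[b] = parent[parent[b]]    # path halving
            b = parent[b]
        return b

    def load(g):                             # worst-case disk load of a round
        n_ags = len({find(b) for b in g})
        return n_ags * t_seek + sum(t_tran[b] for b in g)

    while True:
        r_max = max(groups, key=load)        # round group with maximum f(x)
        pairs = [(x, y) for x in r_max for y in r_max if x < y
                 and tokens[x] > 0 and tokens[y] > 0 and find(x) != find(y)]
        if not pairs:                        # no mergeable pair in R_max: done
            return parent
        # Condition 3 / Lemma 1: pick the pair co-occurring most often,
        # i.e. the pair whose segment lengths have the minimal LCM.
        x, y = min(pairs, key=lambda p: lcm(seg_len[p[0][0]], seg_len[p[1][0]]))
        tokens[x] -= 1
        tokens[y] -= 1
        parent[find(x)] = find(y)            # merge the two AGs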
5.2. Buffer management

The goal of the placement scheme is to reduce the seek overhead in the overall disk service time, not the transfer time. Thus, from Eq. (1), the lower bound on the disk service time achievable by the placement scheme is

    max_{1≤t≤γ} Σ_{B∈Rt} Ttran(B).

By using the memory buffer effectively, this bound can be beaten and the disk bandwidth requirement lowered further. We introduce caching with the surplus buffer space to save disk bandwidth. Caching schemes usually select the blocks kept in the buffer cache based on heuristic strategies, such as LRU or MRU, because the future disk access pattern is not known in advance. In PB, however, the TTNR of a block is known in advance, and it is known that the highest cache hit rate is achieved by caching the blocks with the smallest TTNRs [20]. Since Bi,j is read in an m1·αi-round cycle, the time during which the block has to remain in the buffer cache, from its fetch to the cache hit, is m1·αi rounds. This is the TTNR of Bi,j, and it decreases by one as the rounds proceed. Therefore, the cache strategy that replaces the block with the largest TTNR is best for the average cache hit rate. However, because our scheme aims at minimizing the disk bandwidth requirement needed to provide a deterministic QoS, we focus on the round group that causes the maximum disk load, just as in the placement scheme. It is also important for the proposed buffer management scheme that the disk access pattern is known in advance and that a section of the pattern repeats. These properties make it possible to generate a caching schedule specifying which blocks are kept in the buffer cache in
each round. We propose an algorithm that generates the best caching schedule for saving disk bandwidth. The algorithm is an off-line algorithm, since it generates the caching schedule before the service starts, using the pre-known access pattern and the worst-case disk load given by our disk service time model. Fig. 6 illustrates the basic idea of the caching scheme of GDRPB, using the disk access pattern of the examples in the previous sections. The height of the boxes with indices indicates the block size or disk transfer time, and that of the dark shaded boxes indicates disk seek time. The light shaded boxes with indices represent the blocks kept in the buffer cache. The thick line shows the disk bandwidth required in each round; the requirement is determined by the round group with the highest disk service time, here R6. If B1,2, fetched in round 4, is kept in the buffer cache until round 6, only the other three blocks of R6 need to be fetched from disk, so the requirement drops from (1) to (2) while Rmax remains R6. In the same way, if B2,2, fetched in round 2, is kept until round 6, the requirement drops again, from (2) to (3). It can drop further, to (4), by caching B1,2 in round 6.

Fig. 6. Block caching.

The target block to be cached is the one with the minimal m1·αi among all the non-cached blocks of Rmax. For R6, since m1·α1 = 2, B1,2 is used two rounds before round 6 (i.e., in round 4); that is, B1,2 is the block most recently fetched among all the blocks of R6. The other target blocks are selected in the same way. The caching schedule is stored in a γ × K-bit bitmap, sched[γ][K]. Generalizing the above example yields the following algorithm (a sketch appears after the list):

1. Calculate f(x), the disk service time of round x, for all rounds.
2. Find Rmax.
3. Find a target block, i.e., the block minimizing m1·αi among all non-cached blocks of Rmax.
4. If there is buffer cache space available for the target block, mark sched[xmax][i]; otherwise, stop.
5. Recalculate f(xmax).
6. Go to step 2.

The algorithm continues until it cannot find any block in step 3. The bitmap sched[γ][K] generated by the algorithm is referenced as the caching schedule at run-time: the system maintains the contents of the buffer cache according to the schedule.
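A sketch of steps 1-6, reusing the representations of the earlier sketches, follows. The buffer accounting is deliberately simplified: each cached block is charged once against a fixed surplus budget, whereas the paper tracks cache occupancy round by round; the per-round seek term is likewise collapsed to a single AG.

def caching_schedule(groups, seg_len, t_tran, t_seek, size_of, budget):
    """Off-line caching-schedule generation of Section 5.2 (sketch).

    Returns cached, a set of (round index, block id) pairs: block b is served
    from the buffer cache in round t whenever (t, b) is in cached, standing in
    for the sched[gamma][K] bitmap.
    """
    cached = set()
    used = 0.0

    def load(t):                              # step 1: f(x) of round t
        disk = [b for b in groups[t] if (t, b) not in cached]
        return (t_seek if disk else 0.0) + sum(t_tran[b] for b in disk)

    while True:
        t_max = max(range(len(groups)), key=load)      # step 2: find R_max
        todo = [b for b in groups[t_max] if (t_max, b) not in cached]
        if not todo:                          # nothing left to cache (step 3)
            return cached
        # step 3: the non-cached block of R_max with minimal m_1 * alpha_i,
        # i.e. the one belonging to the shortest segment.
        target = min(todo, key=lambda b: seg_len[b[0]])
        if used + size_of[target] > budget:   # step 4: no space left -> stop
            return cached
        cached.add((t_max, target))           # step 4: mark sched[max][i]
        used += size_of[target]               # simplified space accounting
        # steps 5-6: f is re-evaluated at the top of the loop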
5.3. Integration

While data placement affects only disk seek overhead, the buffer management scheme affects transfer time as well as seek overhead. Since the two schemes affect each other, their interaction must be considered in order to integrate them. In a "caching after placing" approach, if some of the blocks placed adjacently by the placement scheme are cached, the benefit originally intended by the placement is diminished. It is more reasonable for the placement scheme to adapt itself to the new disk access pattern produced by the caching algorithm. Therefore, we adopt a "placing after caching" approach, simply by generalizing the GDRPB placement. Caching changes the disk access pattern described in Section 4; since this change is reflected in the caching schedule, the system is still aware of the future disk access pattern. Let k be the number of rounds in which X or Y is read from the buffer cache; k can be obtained from the caching schedule. Then F(X, Y) is readjusted as follows:

    F(X, Y) = γ/LCM(mX, mY) − k.

The placement scheme is adapted to the new access pattern by using the adjusted F(X, Y), instead of LCM(mX, mY), when selecting the target block pair. In this way, GDRPB integrates the buffer management scheme with the placement scheme. More specifically, the caching schedule is generated first, the blocks are then placed on disk with reference to the caching schedule, and finally the PB server manages the blocks in the buffer cache according to the schedule once the service starts. That is, GDRPB is an off-line technique applied before the service starts.
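In the conventions of the earlier sketches, the readjustment can be expressed as follows; cached is the set of (round, block) pairs produced by the caching-schedule sketch, the helper name is ours, and we read k as the co-occurrence rounds in which either block is cached.

from math import lcm

def f_adjusted(x, y, seg_len, groups, cached):
    """F(X, Y) readjusted for 'placing after caching' (Section 5.3):
    rounds in which X or Y is served from the cache no longer require the
    two blocks to be fetched together, so they are subtracted."""
    k = sum(1 for t, g in enumerate(groups)
            if x in g and y in g and ((t, x) in cached or (t, y) in cached))
    return len(groups) // lcm(seg_len[x[0]], seg_len[y[0]]) - k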
6. Discussion

A good data retrieval scheme must use disk bandwidth efficiently with a reasonable buffer requirement. It is generally assumed that VOD servers use double buffering, with an in-buffer and an out-buffer. All the blocks delivered in the (t + 1)th round are loaded into the in-buffer during the tth round, and all the blocks in the out-buffer are flushed at the end of the tth round. Accordingly, the minimum buffer requirement for PB servers servicing round by round is

    Bufmin = 2 · max_{1≤t≤γ} Σ_{B∈Rt} size(B),    (4)

where size(B) is the size of a block B. In other words, Bufmin is twice the maximum, over the rounds of a section, of the total size of the blocks needed in one round.
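In the same representation as the earlier sketches, Eq. (4) is a one-liner (size_of being a hypothetical block-size map derived from the trace):

def buf_min(groups, size_of):
    """Eq. (4): twice the largest per-round total block size in a section."""
    return 2 * max(sum(size_of[b] for b in g) for g in groups)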
GDRPB does not pre-fetch blocks; it caches blocks from previous rounds using the remainder of the buffer space. Thus, its minimum buffer requirement equals Bufmin. As described in Section 3, since SDP [6,21] is designed only for staggered broadcasting, it cannot be adopted by general pyramid-based PB schemes. Therefore, we can compare SDP with GDRPB only on staggered broadcasting, which is a special case of pyramid-based schemes. As all the segments have equal length, i.e., m1 = m2 = · · · = mK = m, we have γ = m and every block occurs in exactly one round group. Thus, all the blocks in a round group are merged into a single AG by the placement of GDRPB, because all their tokens are taken in that one round group, which means that the blocks can be placed as one chunk. Therefore, for staggered broadcasting, the disk bandwidth required by the placement scheme of GDRPB equals that of SDP; moreover, since GDRPB integrates the caching scheme with the placement, its overall performance is better than SDP's. Our schemes have not yet been implemented; implementation remains future work.
7. Experimental results

We demonstrate the effectiveness of GDRPB through trace-driven simulation. Trace data for the frame sizes of MPEG-1 videos was obtained from [24]; the traces are of 60 min movies, and all data are VBR-encoded. The default round length β in the simulation is 1 s, and the parameters of our disk model are those of an IBM Ultrastar 36LZX (Table 1).
Table 1
Disk characteristics [23]

Rotational latency (ms):   3.0
Average seek time (ms):    4.9
Minimum seek time (ms):    0.5
Maximum seek time (ms):    10.5
Transfer rate (MB/s):      36.1
Number of cylinders:       151,059
The data retrieval schemes compared with GDRPB are BASE, PLACE, and CACHE. BASE uses only RTL striping [13], without our placement and caching schemes; PLACE uses only the placement scheme of GDRPB; and CACHE uses only the buffer management scheme of GDRPB. To show the quality of the solutions found by GDRPB, we also compare them with a lower bound (LB) in which all blocks in a round group are assumed to be adjacent, achieved by replicating blocks; this solution is unrealistic, since it requires far too much disk space.
Fig. 7. Disk bandwidth requirement: GDB3.
Fig. 8. Disk bandwidth requirement: skyscraper.
We measured the disk bandwidth requirement for deterministic QoS (i.e., the worst-case disk load) for ten 60 min movies under BASE, PLACE, LB, CACHE, GDRPB, and LB + CACHE, while varying the service latency and the buffer size. LB + CACHE denotes LB combined with caching. A decrease in service latency means an increase in the broadcast channels or streams required to provide that latency, i.e., a higher disk bandwidth requirement. Fig. 7 shows the results under the GDB3 broadcasting scheme for service latencies of 171, 116, 65, and 27 s. As the results show, the performance of PLACE and GDRPB is close to their respective lower bounds, LB and LB + CACHE. As the buffer size increases, the performance of CACHE, GDRPB, and LB + CACHE improves relative to BASE: the larger the buffer space, the greater the caching effect. As the buffer space grows, the improvement contributed by the placement of GDRPB, i.e., the difference between the results of GDRPB and
CACHE, decreases gradually. The required disk bandwidth of GDRPB is about 68, 54, and 42% less than that of BASE for a service latency of 27 s, when the buffer size is 128, 256, and 512 MB, respectively (Fig. 7(d)). Since the round length β is 1 s, bars above 1000 ms in the figures mean that deterministic QoS cannot be guaranteed. Fig. 8 shows the results under the Skyscraper [11] broadcasting scheme. From the above results, we can expect GDRPB to be capable of providing shorter service latency, or of servicing more videos, than BASE with the saved bandwidth. To verify this expectation, we investigated the service latency and the number of movies that GDRPB can provide. Fig. 9 shows the service latencies that can be provided by the six solutions above while varying the buffer size. The number of channels allocated to each movie increases with the saved bandwidth, and the service latency decreases exponentially as the number of available channels increases in PB.
Fig. 9. Service latency.
Therefore, as the saved bandwidth grows, the service latency decreases exponentially; consequently, as the buffer size grows, the latency of the solutions with caching shortens exponentially. We can also see that the latency of GDRPB is very close to its lower bound: GDRPB can provide almost immediate service with a 512 MB buffer in this setting. Fig. 10 shows how many movies BASE and GDRPB can service with a 512 MB buffer, as the service latency decreases. As the latency gets shorter, the required bandwidth per movie increases, and the
number of movies decreases. We can also see that the improvement increases as the latency gets shorter, for the following reason. The shorter the latency, the shorter the segments; blocks belonging to a short segment are accessed more frequently than those belonging to a long segment, so the average TTNR of the blocks is shorter. Hence, since the caching effect is greater, the improvement increases as the latency gets shorter. As the figure shows, GDRPB can service 1.5–2 times as many movies as BASE.
Fig. 10. Number of videos.
8. Conclusion

PB, an NVOD technique that broadcasts videos repeatedly over broadcast channels, can service an unlimited number of clients simultaneously with a bounded service latency. Disk bandwidth may be saved by an efficient data retrieval scheme, realized through disk scheduling, buffer management, and data placement techniques. In this paper, we proposed such a data retrieval scheme for PB servers, GDRPB, consisting of buffer management and data placement. Unlike existing schemes devised for a specific PB technique, our scheme can be adopted by PB schemes in general. Furthermore, it is devised with the variations in disk load induced by VBR-encoded videos in mind. Consequently, a PB server using GDRPB can service more videos and provide shorter service latency.

References

[1] C.C. Aggarwal, J.L. Wolf, P.S. Yu, A permutation-based pyramid broadcasting scheme for video-on-demand systems, in: Proceedings of the IEEE International Conference on Multimedia Computing and Systems (ICMCS'96), Hiroshima, Japan, June 1996, pp. 118–126.
[2] C.C. Aggarwal, J.L. Wolf, P.S. Yu, On optimal batching policies for video-on-demand storage servers, in: Proceedings of the IEEE International Conference on Multimedia Computing and Systems (ICMCS'96), Hiroshima, Japan, June 1996.
[3] K.C. Almeroth, M. Ammar, A scalable interactive video-on-demand service using multicast communication, in: Proceedings of the International Conference on Computer Communications and Networks (ICCCN'94), San Francisco, CA, September 1994.
[4] K.C. Almeroth, M. Ammar, On the use of multicast delivery to provide an interactive video-on-demand service, IEEE J. Selected Areas Commun. 14 (6) (1996) 1110–1122.
[5] E. Chang, H. Garcia-Molina, Effective memory use in a media server, in: Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB'97), Athens, Greece, August 1997, pp. 496–505.
[6] S. Chen, M. Thapar, A novel video layout strategy for near-video-on-demand servers, in: Proceedings of the IEEE International Conference on Multimedia Computing and Systems (ICMCS'97), Ottawa, Canada, June 1997, pp. 37–45.
[7] A. Dan, D. Sitaram, P. Shahabuddin, Scheduling policies for an on-demand video server with batching, in: Proceedings of ACM Multimedia, October 1994, pp. 15–23.
[8] A. Dan, D. Sitaram, P. Shahabuddin, Dynamic batching policies for an on-demand video server, Multimedia Syst. 4 (3) (1996) 112–121.
[9] L. Gao, J. Kurose, D. Towsley, Efficient schemes for broadcasting popular videos, in: Proceedings of the Eighth International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV'98), Cambridge, UK, July 1998.
[10] J. Gemmell, H.M. Vin, D.D. Kandlur, V. Rangan, Multimedia storage servers: a tutorial and survey, IEEE Comput. 28 (5) (1995) 40–49.
[11] K.A. Hua, S. Sheu, Skyscraper broadcasting: a new broadcasting scheme for metropolitan video-on-demand systems, in: Proceedings of ACM SIGCOMM'97, Cannes, France, September 1997, pp. 89–100.
[12] L. Juhn, L. Tseng, Harmonic broadcasting for video-on-demand service, IEEE Trans. Broadcast. 43 (3) (1997) 268–271.
[13] K.O. Lee, H.Y. Yeom, Deciding round length and striping unit size for multimedia servers, in: Proceedings of the Fourth International Workshop on Multimedia Information Systems (MIS'98), Istanbul, Turkey, September 1998, pp. 33–44.
[14] K.O. Lee, H.Y. Yeom, An effective admission control mechanism for variable-bit-rate video streaming, Multimedia Syst. J. 7 (4) (1999) 305–311.
[15] Y.-J. Oyang, A tight upper bound of the lumped disk seek time for the SCAN disk scheduling policy, Inform. Process. Lett. 54 (1995) 355–358.
[16] J.-F. Pâris, S.W. Carter, D.D.E. Long, A low bandwidth broadcasting protocol for video on demand, in: Proceedings of the IEEE International Conference on Computer Communications and Networks (ICCCN'98), October 1998, pp. 690–697.
[17] J.-F. Pâris, S.W. Carter, D.D.E. Long, Efficient broadcasting protocols for video on demand, in: Proceedings of the International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS'98), July 1998, pp. 127–132.
[18] C. Ruemmler, J. Wilkes, An introduction to disk drive modeling, IEEE Comput. 27 (3) (1994) 17–28.
[19] S. Sheu, K.A. Hua, T.H. Hu, Virtual batching: a new scheduling technique for video-on-demand servers, in: Proceedings of the Fifth DASFAA'97, Melbourne, Australia, April 1997.
[20] A. Silberschatz, P.B. Galvin, Operating System Concepts, Addison-Wesley, Reading, MA, 1995.
[21] S.-L. Tsao, Y.-M. Huang, An efficient storage server in near video-on-demand systems, IEEE Trans. Consumer Electron. 44 (1) (1998) 27–32.
[22] S. Viswanathan, T. Imielinski, Metropolitan area video-on-demand service using pyramid broadcasting, Multimedia Syst. 4 (4) (1996) 197–208.
[23] IBM Hard Disk: Ultrastar 36LZX. http://www.storage.ibm.com/hardsoft/diskdrdl/ultra/ul36lzx.htm.
[24] Informatik MPEG-I Traces. ftp://ftp-info3.informatik.uni-wuerzburg.de/pub/MPEG.
Jin B. Kwon is an assistant professor in the Department of Computer and Information Sciences, Sunmoon University, South Korea. He received a BS in Statistics from Hankuk University of Foreign Studies in 1988, and an MS and a PhD in Computer Science from Seoul National University in 2000 and 2003, respectively. His research interests include multimedia systems, network security and distributed systems.

Heon Y. Yeom is an associate professor in the Department of Computer Science and Engineering, Seoul National University, South Korea. He received a BS from Seoul National University, majoring in Computer Science, in 1984, and an MS and a PhD in Computer Science from Texas A&M University in 1986 and 1992, respectively. His research interests include multimedia systems, distributed systems and fault-tolerant systems.