I/O Scheduling for Digital Continuous Media Part II: VCR-like Operations
Deepak R. Kenchammana-Hosekote & Jaideep Srivastava Department of Computer Science, University of Minnesota, MN 55455
[email protected]

Abstract
Advances in storage, compression, and network technology are making support for digital video and audio, collectively called continuous media (CM), possible. To provide economically feasible access to CM data, an emerging service model is one in which many clients connect across a network to specialized CM servers (CMS). The continuous, real-time data needs of such a service require effective resource management and scheduling of storage devices at the CMS. In this paper we study the effect of clients executing VCR-like operations on the BSCAN [KHS94] scheduling strategy at the CMS. We first define a suite of primitive VCR-like operations that clients can execute to change the flow of CM data. The effect of the execution of such operations on BSCAN is then analyzed. We show that an uncontrolled change in the BSCAN schedule will affect clients' playback. To avoid a breakdown in playback guarantees while executing VCR-like operations at the CMS, we develop two general techniques, namely passive accumulation and active accumulation. Using the response time, i.e. the time to execute a VCR-like operation, as a comparison metric, we show that active accumulation algorithms outperform passive accumulation algorithms. We then derive the optimal response time algorithm in a class of active accumulation strategies. The results presented here are confirmed by simulation studies.
Keywords: Multimedia, Continuous Media, I/O Scheduling, VCR-like operations, Response time
This work was supported in part by Honeywell Inc. under grant F30602-93-C-0172 from Rome Air Force Labs,
Rome, NY.
Contents

1 Introduction
  1.1 Summary of Contributions
  1.2 Relation to Previous Work
  1.3 Organization
2 Scheduling Strategy at CMS
  2.1 The BSCAN Algorithm
  2.2 Computing a BSCAN Schedule
  2.3 Admission Control
  2.4 Relaxing Constraints on BSCAN
3 VCR-like Operations at CMS
  3.1 VCR-like Operations
  3.2 Effect of VCR-like Operations
  3.3 Computing the New State
  3.4 Admission Control for VCR-like Operations
4 Effecting State Transitions in BSCAN
  4.1 State Transition
  4.2 Algorithms for State Change
    4.2.1 Passive Accumulation Algorithms
    4.2.2 Active Accumulation Algorithms
5 Two Phase Active Accumulation Algorithms
  5.1 The Two Phase Algorithm
  5.2 The Time Optimal Two Phase Active Accumulation Algorithm
6 Simulation Studies
7 Concluding Remarks
8 Future Work
A Computing New States
  A.1 Δn for Rate Variation Operations
  A.2 Δn for Sequence Variation Operations
B Derivations for Section 5
  B.1 Derivation of a_d
  B.2 Derivation of B_x
  B.3 Derivation of K
  B.4 Derivation of x
  B.5 Computing G_i^opt
1 Introduction

Multi-media computing is a rapidly emerging application area due to advances in computer hardware and software technology, particularly in mass storage, image and video compression, and high-speed networks. Amongst the various types of data that comprise multi-media, video and audio data require continuous real-time data flow during playback and recording. The growing set of application domains ranges from distance education [SMRD94], entertainment [NYT94a], medical services [NYT94b], and office automation [ONWC87] to national defence [USA94]. Many of these applications require audio and video data, which are collectively described by the term continuous media (CM). Due to the large size of CM data and the high bandwidth needed to transmit it, providing economically feasible access to these applications is the challenge [NYT95]. To provide such a service, an emerging model is one in which many clients (applications) connect across high-speed interconnection networks to specialized data servers that store and retrieve CM data. Such servers, called Continuous Media Servers (CMS), unlike conventional file servers (e.g. NFS), must handle requests for data that require continuous, real-time access to storage devices. File servers like NFS were designed for small file accesses and neither provide data in real time nor exploit access semantics in retrieving and storing CM. Hence, techniques for effective resource management and scheduling of storage devices at the CMS need to be developed. A CMS stores and manages movement of CM data to/from a storage system that typically comprises magnetic and optical storage devices like magnetic disk arrays (e.g. RAID) and optical jukeboxes. Typically, to satisfy the data request of a subscriber, CM data is moved up the storage hierarchy: tertiary (tape/jukebox) to secondary (disk), and then to primary (main memory), after which it is shipped across the network to the subscriber's play-out device.
The data movement from secondary to primary memory is almost always done concurrently with clients' play-out. Given the slow access characteristics of secondary storage devices like magnetic disks, the high volume of CM data, and the presence of concurrent real-time accesses, the CMS must effectively schedule data movement from the storage system while using finite (fixed) main memory. In scheduling data accesses the CMS must ensure that each client is provided continuous, real-time delivery of CM data. Such guarantees are usually given to clients upon admittance, and the CMS is obliged to maintain them thereafter unless the server enters into a re-negotiation with the client(s). In supporting concurrent clients, the CMS must be able to provide playback guarantees even when the access characteristics of a subset of clients change dynamically. Such changes are brought about when clients request VCR-like operations, e.g. Play at a higher rate, FastForward, ReversePlay, etc., on-line. The CMS must react cautiously to such changes, always ensuring that other clients' playback remains unaffected.
Footnote 1: Throughout this paper, for the sake of brevity, we shall discuss the case when clients retrieve data from the CMS. The discussion for the case when clients store data follows symmetrically.
1.1 Summary of Contributions

In this paper we study the effect of clients executing VCR-like operations on the BSCAN [KHS94] scheduling strategy at the CMS. We first define a suite of primitive VCR-like operations that clients can execute to change the flow of CM data. The effect of the execution of such operations on BSCAN is then analyzed. We show that an uncontrolled change in the BSCAN schedule will affect clients' playback. To avoid a breakdown in playback guarantees while executing VCR-like operations at the CMS, we develop two general techniques, namely the passive accumulation and active accumulation strategies. Using the response time, i.e. the time to execute a VCR-like operation, as a comparison metric, we show that active accumulation algorithms generally outperform passive accumulation algorithms. We then derive the optimal response time algorithm in a class of active accumulation strategies. Simulation studies are used to validate the analytical results presented here.
1.2 Relation to Previous Work

The problem of designing a scheduling strategy for CM data has been previously addressed in [RV93], [CL93], [LS92], [AOG92], [RVG+93], [KEL94], [Gem93], [GC92], [CKY94], [TPBG93]. Servicing strategies like [CKY93], [RW93], and [Gem93] use single buffers while allowing head scheduling algorithms like C-SCAN to decide the servicing order. Using single buffers can restrict the subscribers' consumption sequence, thereby disallowing certain kinds of VCR-like operations. [RS92] enumerates a list of VCR-like operations similar to that in Table 1, in the context of a retrieval tool for CM data. However, their main focus is on synchronous delivery of data across the network, and their protocols provide only probabilistic playback guarantees to clients. Much of the previous work on designing CM servers discusses supporting concurrency set operations like Open and Close. [RV93] and [AOG92] discuss safe state transition algorithms to handle this set of operations; however, they do not try to minimize the response time of such operations. The importance of maintaining finite slack time in a cycle to handle VCR-like operations has been mentioned in [TPBG93], which discusses the relationship between slack and the response time of concurrency set operations, denoted there as start-up latency. A key difference between our work and [TPBG93] is that our strategy provides playback guarantees at all times, besides supporting an extended suite of operations. [LS92] discusses support for the Pause, Stop, and Play operations, besides Open and Close; however, it does not provide playback guarantees while such operations are being executed. Concurrent with our work on supporting VCR-like operations like FastForward and ReversePlay are [CKY94] and [DSKT94]. By anticipating the execution of such operations at the server, [CKY94] places CM data in the storage system using techniques like segment sampling, wherein some groups of units are marked for display based on the type of VCR-like operation. The efficacy of such an approach is constrained by the placement strategy selected at storage time. The approach taken by [DSKT94] has been to provide probabilistic guarantees when such operations are being executed, especially for operations like FastForward. Recently, [LV95] presented a placement technique for storing compressed video, specifically MPEG streams, to support robust playback during operations like FastForward and ReversePlay. Again, such a technique is constrained by the storage scheme, which is usually a static optimization.
1.3 Organization

This paper is organized as follows. In Section 2 we outline the BSCAN servicing strategy at the CMS. The classification of VCR-like operations and their effect on BSCAN are presented in Section 3. In Section 4 we discuss the effects of executing VCR-like operations on BSCAN, and introduce the passive and active accumulation algorithms as techniques to prevent any transitory effect on clients' playback. In Section 5 we analyse a class of active accumulation algorithms with the aim of deriving an algorithm with optimal response time. Simulation studies reported in Section 6 confirm our findings.
2 Scheduling Strategy at CMS

2.1 The BSCAN Algorithm

In [KHS94] we described a scheduling approach called the BSCAN (Batched-SCAN) algorithm to service concurrent accesses by clients to the storage system at the CMS. BSCAN uses the SCAN [TP72] algorithm to service different streams and batches accesses to data of a single stream. Since the SCAN algorithm services the s streams in an order based on their location in the storage system, it minimizes the retrieval latency. A stream is stored as sequences of disk blocks within the storage system of the CMS. Typically, in accessing a disk block, the head assembly needs to be positioned at the beginning of the block before the transfer can commence. For data accesses of the same stream, the per-block access time is the time required to position the head assembly from the current disk block to the next adjacent block, γ_i, plus the time to transfer the block. If the size of each disk block is b bytes and R is the transfer rate, then the per-block access time for stream i is v_i = γ_i + b/R. Note that BSCAN does not constrain the placement of disk blocks of a single stream. However, it is advantageous to store groups of disk blocks of a single stream as close to each other as possible, since this minimizes the overhead in accessing them. Data blocks of different streams, however, may not necessarily be stored close to each other. This is because (i) each stream could have been recorded at a different time, and thus stored separately, and (ii) no assumption can be made about which set of streams will be concurrently accessed at the CMS. Thus, we do not assume that the blocks of different streams are placed contiguously in the storage system. When the disk blocks of a stream have been accessed, the storage system starts servicing the next stream in the SCAN order, for which the head assembly needs to be positioned at the disk blocks of that stream. This involves head seek and rotational latencies. For storage systems
with non-linear² actuators, with T tracks and rotational latency t_rot^max, the positioning overhead in servicing s concurrent streams, O(s), is bounded by

    O(s) ≤ (s − 1)(δ_0 + δ_1 √(T/(s − 1))) + s t_rot^max        (1)
In the BSCAN algorithm, on servicing the last stream in the SCAN order, the head assembly reverses direction and begins servicing streams in the reverse order of the SCAN sequence, proceeding as before. We denote each pass of the head assembly over the storage system as a cycle. Thus, in BSCAN, accesses are scheduled in cycles of SCAN order (or reverse-SCAN, depending on the scan direction) for different streams, while block accesses of a single stream are batched. If in each cycle k, n_i^k blocks of data are fetched for stream i, then the duration of cycle k, T_svc^k, is given by

    T_svc^k ≤ (s − 1)(δ_0 + δ_1 √(T/(s − 1))) + s t_rot^max + Σ_{i=1}^{s} v_i n_i^k        (2)

The set of n_i^k's, the blocks fetched in the kth cycle, is called the schedule for that cycle, and the component of the CMS that periodically executes the schedules is the scheduler. The schedule for cycle k is denoted by the vector n^k. While schedules are being executed at the storage system, the (previously) accessed stream data is concurrently consumed by the clients at some (pre-defined) rate. To ensure that a client's consumption rate remains unaffected, i.e. that the client never starves for data, the cumulative data produced must exceed the cumulative data consumed. BSCAN uses a double-buffer organization, thereby allowing out-of-sequence retrieval of data blocks, which increases I/O throughput from the storage system without adding complexity to the disk scheduling algorithm. Data fetched by the scheduler for stream i in the kth cycle is stored in one of the double buffers, while data fetched in the (k − 1)th cycle is stored in the other and consumed by clients.
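To make Equations 1 and 2 concrete, the following sketch computes the positioning-overhead bound O(s) and the cycle-duration bound T_svc for a given schedule. It is illustrative Python; the device parameters (seek constants, track count, rotational latency, per-block access times) are made-up numbers, not values from the paper.

```python
import math

def positioning_overhead(s, delta0, delta1, T, t_rot_max):
    """Bound on per-cycle positioning time O(s) of Eq. 1: (s - 1)
    inter-stream seeks with non-linear seek cost delta0 + delta1*sqrt(t),
    plus one worst-case rotational latency per stream."""
    if s <= 1:
        return s * t_rot_max
    return (s - 1) * (delta0 + delta1 * math.sqrt(T / (s - 1))) + s * t_rot_max

def cycle_duration(n, v, s, delta0, delta1, T, t_rot_max):
    """Bound on cycle duration T_svc of Eq. 2: positioning overhead plus
    the per-block access times v_i for the n_i blocks of each stream."""
    return (positioning_overhead(s, delta0, delta1, T, t_rot_max)
            + sum(vi * ni for vi, ni in zip(v, n)))

# Hypothetical device: seek constants in seconds, 1000 tracks, 8 ms
# worst-case rotational latency, per-block access times in s/block.
O2 = positioning_overhead(2, 0.003, 0.0005, 1000, 0.008)
Tsvc = cycle_duration([100, 80], [0.001, 0.0012], 2, 0.003, 0.0005, 1000, 0.008)
```

Note that both quantities are upper bounds: the actual positioning cost in a cycle depends on where the streams' blocks happen to lie.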
2.2 Computing a BSCAN Schedule

In order to compute feasible BSCAN schedules, two additional constraints need to be considered. These are:

1. The schedule must have entries that are integral multiples of the block size. A basic constraint of the storage system is that data accesses must be in integral multiples of the physical block size. For example, it is not possible to fetch 4.37 blocks, since that involves fetching a fraction of the 5th block.
Footnote 2: In [BG88] the seek time for t tracks in a disk with a non-linear actuator is δ(t) = δ_0 + δ_1 √t, with δ_0, δ_1 > 0, when t > 0, and δ(0) = 0.
2. The schedule must permit client playback to proceed in discrete units. A video stream is a sequence of frames, and the decoders/frame buffers involved in playback periodically require an entire frame. At the time of decompression/rendition, the data corresponding to the entire frame must be in main memory. Thus, the computed schedule must ensure that the entire frame is available in the buffer at the time of decompression/rendition.

The solution to the problem of computing feasible BSCAN schedules that satisfy these two constraints while guaranteeing smooth playback is discussed in [KHS94]. We supply an outline of the solution here. If s clients concurrently request access to different video streams from the CMS, such that client i requires playback at rate η_i frames per second, and u_i is the size of each frame (in blocks) in the stream, then the problem of computing a feasible schedule³, n, requiring minimum buffer, B_min, can be formally stated as:

Problem 1 (P)

    min B = 2 Σ_{i=1}^{s} b n_i

such that

    ∀i: n_i ≥ ⌈(O(s) + v^T n) η_i⌉ u_i

and all n_i's ≥ 0 and are integral.

The solution to Problem P is given as

    n_BSCAN = ⌈n*⌉ + p_u*,  where  n_i* = O(s) η_i u_i / (1 − Σ_{j=1}^{s} v_j η_j u_j)

and p_u* is computed by Algorithm PUSTAR. The proof of this claim is given in [KHS94]. We sketch its outline here. The solution to Problem P is greater than (or equal to) ⌈n*⌉. PUSTAR commences its search of the feasible subspace from this point in R^s. Due to the additional constraints, it examines only integer solutions for n. In each iteration (lines 4-9) of PUSTAR the feasibility of the current solution, ⌈n*⌉ + p, is examined. If it is infeasible, then in line 8 PUSTAR computes the data deficit, Δp, in executing the current schedule. This is carried over to the next iteration, wherein the schedule for stream i is increased by ⌈Δp_i⌉ u_i. PUSTAR terminates as soon as the current schedule is feasible.

Algorithm 1 PUSTAR. Algorithm to derive p_u*.
 1  Δp ← 0;
 2  p ← 0;
 3  do
 4      for i ← 1 to s
 5          p_i ← p_i + ⌈Δp_i⌉ u_i;
 6      end
        ▷ Compute the cycle duration in servicing ⌈n*⌉ + p.
 7      T_svc ← O(s) + v^T (⌈n*⌉ + p);
        ▷ Δp is the data deficit due to servicing ⌈n*⌉ + p.
 8      for i ← 1 to s
            Δp_i ← ⌈T_svc η_i⌉ − (⌈n_i*⌉ + p_i) / u_i;
        end
 9  until (⌈Δp⌉ ≤ 0);
10  p_u* ← p;

Footnote 3: The superscript k is omitted since all cycles are assumed to be identical. Furthermore, a variable typefaced in bold indicates a vector, i.e. n = (n_1, …, n_s)^T.
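The iteration above can be transcribed directly into code. The sketch below is illustrative (the frame rates eta, frame sizes u, access costs v, and O(s) in the example are hypothetical inputs, not parameters from the paper):

```python
import math

def pu_star(n_star, u, eta, v, O_s):
    """Sketch of Algorithm PUSTAR: starting from ceil(n*), repeatedly add
    the (frame-granular) deficit until the schedule is feasible, i.e.
    every stream fetches at least what one cycle's duration consumes."""
    s = len(n_star)
    n_ceil = [math.ceil(x) for x in n_star]
    p = [0] * s
    dp = [0.0] * s
    while True:
        # Line 5: grow each stream's schedule by its deficit, whole frames.
        p = [p[i] + math.ceil(dp[i]) * u[i] for i in range(s)]
        # Line 7: cycle duration when servicing ceil(n*) + p (cf. Eq. 2).
        t_svc = O_s + sum(v[i] * (n_ceil[i] + p[i]) for i in range(s))
        # Line 8: deficit in frames = frames consumed minus frames fetched.
        dp = [math.ceil(t_svc * eta[i]) - (n_ceil[i] + p[i]) / u[i]
              for i in range(s)]
        if all(math.ceil(d) <= 0 for d in dp):   # line 9
            return p

# Hypothetical parameters: ceil(n*) alone is infeasible for stream 1,
# so PUSTAR adds one extra frame (3 blocks) for it.
p = pu_star([6.467, 3.593], u=[3, 2], eta=[30, 25], v=[0.001, 0.0015], O_s=0.06)
```

Because each extra frame fetched lengthens the cycle, the deficit must be re-evaluated after every growth step, which is exactly what the do-loop does.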
2.3 Admission Control

BSCAN controls the admission of new streams and rate changes of admitted streams based on the availability of I/O bandwidth and buffer space. If B_avail is the buffer space available at the CMS, then the condition for admitting stream (s + 1) is as follows. If

    ( Σ_{i=1}^{s+1} v_i η_i u_i < 1 )  ∧  ( 2b Σ_{i=1}^{s+1} n_i^BSCAN ≤ B_avail )        (3)
      [I/O bandwidth limitation]          [buffer space limitation]

then stream (s + 1) is admitted to the CMS with frame rate η_{s+1} and frame size u_{s+1}.
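Condition 3 translates directly into code; a minimal sketch checking both terms (all numeric values are illustrative, not from the paper):

```python
def admit_new_stream(v, eta, u, n_bscan, b, B_avail):
    """Admission test of Eq. 3 over all s+1 streams: the aggregate I/O
    bandwidth term sum(v_i * eta_i * u_i) must stay below 1, and the
    double buffers, 2*b*sum(n_i), must fit in the available space."""
    bandwidth_ok = sum(vi * ei * ui for vi, ei, ui in zip(v, eta, u)) < 1.0
    buffer_ok = 2 * b * sum(n_bscan) <= B_avail
    return bandwidth_ok and buffer_ok

# Two admitted streams plus a candidate third (hypothetical numbers;
# b and B_avail in bytes, v in s/block, eta in frames/s, u in blocks):
ok = admit_new_stream(v=[0.001, 0.001, 0.002], eta=[30, 30, 30],
                      u=[2, 2, 3], n_bscan=[8, 8, 12],
                      b=4096, B_avail=1_000_000)
```

Either term can be the binding one: a fast stream of small frames is bandwidth-limited, while a slow stream of large frames may fail on buffer space first.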
2.4 Relaxing Constraints on BSCAN

When the BSCAN schedule is restricted to having entries that are integral multiples of the block size and the clients' consumption is assumed to proceed in discrete units, the scheduling problem is as outlined in Problem P. However, P is mathematically unwieldy. For purposes of analysis we shall relax the two constraints on P to get a modified problem P′, formally stated as:

Problem 2 (P′)

    min B = 2 Σ_{i=1}^{s} b n_i

such that

    M n ≥ O(s) r,  where M = (bI − r v^T) and n_i ≥ 0 for all 1 ≤ i ≤ s.

Notice that r_i is the consumption rate of stream i in bytes per second, such that r_i = b η_i u_i. Problem P′ has a closed-form solution, n*, given as

    n* = O(s) r / (b − Σ_{i=1}^{s} v_i r_i)        (4)

As stated earlier, n_BSCAN ≥ n*. However, we shall use P′ for all analyses in the rest of this paper.
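The closed form of Equation 4 is straightforward to evaluate. The sketch below uses illustrative values (block size, rates, and access costs are made up) and its correctness can be checked against the constraint of P′, which n* satisfies with equality:

```python
def n_star(O_s, r, v, b):
    """Closed-form minimum schedule of Problem P' (Eq. 4):
    n* = O(s) * r / (b - sum(v_i * r_i)). The denominator must be
    positive, i.e. the aggregate consumption must be sustainable."""
    denom = b - sum(vi * ri for vi, ri in zip(v, r))
    assert denom > 0, "aggregate consumption exceeds I/O bandwidth"
    return [O_s * ri / denom for ri in r]

# Hypothetical setup: two streams, 4 KB blocks, consumption rates r_i
# in bytes/s, per-block access times v_i in s/block, O(s) = 50 ms.
n = n_star(O_s=0.05, r=[245_760.0, 368_640.0], v=[0.001, 0.001], b=4096)
```

By construction, b n_i* − r_i (v^T n*) = O(s) r_i for every stream, which is easy to verify numerically.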
3 VCR-like Operations at CMS

The discussion in the previous section assumed that r, v, and s are constant when the BSCAN schedule is computed, i.e. at some time in the past these parameters were set and thereafter the scheduler has been executing the schedule repeatedly. In practice, clients will initiate consumption of data at some (unpredictable) moment; over the duration of their session with the CMS they may change their consumption rate and/or pattern; and finally, after a finite time, they will end their session. When such changes to the clients' access requests are allowed, the server must make appropriate changes in its operation to satisfy the new requirements without affecting existing clients' playback guarantees. In this section we discuss a set of these client interactions with the CMS, which we denote VCR-like operations, and their effect on BSCAN.
3.1 VCR{like Operations A client views continuous media from the CMS as a sequence of data units owing as a data stream in real time. For example, in a video stream this data unit will be a frame (uncompressed/ intra-frame compressed JPEG video) or a group of frames (IB P B in inter-frame compressed MPEG video) sequenced in a pre-de ned order at (say) 30 frames per second. In order to change the data ow in such a stream the client can execute one or more VCR {like operation(s) from a set categorized as follows.
Rate Variation Operations: Operations in this class change the rate of data units flowing in the stream. Since the stream is a timed sequence of data units, this results in a speed-up or slowdown of the stream. For example, the VCR-like operation SlowMotion on a video stream changes the frame rate of the stream. Thus, if the play-out rate was 30 frames per second, the SlowMotion operation could reduce the rate to 15 frames per second.
Sequence Variation Operations: An operation that changes the order in which the data units flow in the stream belongs to this class. Such operations presume the existence of a (possibly time-stamped) order of data units in the data stream⁴. For example, the operation FastForward on a video stream may achieve its effect by displaying alternate frames, thereby changing the display sequence from the (original) recorded sequence. Notice that the rate at which data units are consumed can remain unchanged in such an operation. The sequence (and hence the contents) of the frames displayed gives the effect of having witnessed, in a time interval, a phenomenon which in reality lasted twice the duration.

Footnote 4: Such data, being discrete samples of a continuous phenomenon, are usually ordered by timestamp.
Concurrency Set Operations: Operations like Start and Stop, which increase and decrease the number of concurrent streams being scheduled, belong to this class. We shall differentiate such operations from the rate variation operations shortly.
Henceforth in this paper, we shall represent the set of rate variation operations by Play and ReversePlay, the set of sequence variation operations by ForwardSkip and ReverseSkip, and the set of concurrency set operations by Open and Close. Table 1 describes the effect of each of these operations on a CM stream S. Other operations like Pause, FastPlay, and SlowMotion can be implemented using this set of primitive operations.
3.2 Effect of VCR-like Operations

A VCR-like operation requested by a client will affect the scheduler, since it will change the schedule computed with PUSTAR. To understand the effect of such an operation, we will define the concept of the state of a scheduler and then relate the execution of the VCR-like operations of Section 3.1 to changes in scheduler state. Before we motivate the definition of the state of the scheduler, it is necessary to introduce a measure of the data buffered in main memory at the CMS at the end of each cycle. Let B_i^k denote the number of excess (in addition to the schedule) data blocks buffered for stream i in main memory at the end of cycle k, and B^k the vector of B_i^k's. The state of the scheduler in cycle k is defined thus:

Definition 1 (State of a scheduler) The state of a scheduler at the end of cycle k is n^k + B^k.

In other words, the amount of data available for consumption at the end of any cycle k defines the state of the scheduler. Note,

    B^k = B^{k−1} + (n^{k−1} − T_svc^k r)        (5)

where B^k is the excess data in the new state, B^{k−1} the excess data from the previous state, and (n^{k−1} − T_svc^k r) the accumulation in cycle k.
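Equation 5 can be sketched as a per-cycle bookkeeping step (illustrative Python; block-granularity rounding is ignored for clarity, and all numbers are hypothetical):

```python
def next_excess(B_prev, n_prev, t_svc, r):
    """Per-stream bookkeeping of Eq. 5: the excess buffered at the end of
    cycle k is the previous excess, plus the blocks fetched in cycle k-1,
    minus the blocks consumed during cycle k (t_svc * r_i)."""
    return [x + n - t_svc * ri for x, n, ri in zip(B_prev, n_prev, r)]

# Steady state (r in blocks/second): the previous cycle's schedule exactly
# covers one cycle's consumption, so the excess stays flat.
B = next_excess(B_prev=[2.0, 1.0], n_prev=[7.5, 11.25], t_svc=0.25,
                r=[30.0, 45.0])
```

In steady state the accumulation term is zero; a VCR-like operation perturbs either t_svc or r, making the excess drift up or down.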
The state of the scheduler can change with cycles. Such changes can occur when any of the three parameters used in computing n^k, i.e. s, r, and v, change, since T_svc^k ∝ n^k. Rate variation operations modify the consumption rate and hence the vector r. Sequence variation operations change the sequence of data blocks accessed; the required data blocks tend to be spaced farther apart compared to accessing them in the sequence in which they were stored, so sequence variation operations change the per-block access cost vector v. Concurrency set operations change the number of concurrent streams being serviced by the scheduler. Table 1 shows the classification of the set of VCR-like operations and the parameters that consequently change, leading to a state change at the scheduler.
Operation              Description                                               Affects
Open(S)                Start S with r = 0                                        s
Close(S)               Terminate S whose r = 0                                   s
Play(S, r)             Play S at rate r                                          r
ReversePlay(S, r)      Play S in reverse at rate r                               r
ForwardSkip(S, skip)   Play S at rate r, skipping every (integral) skip units    v
ReverseSkip(S, skip)   Reverse-play S at rate r, skipping every skip units       v

Table 1: VCR-like Operations on a CM Stream S.

In the table, a set of VCR-like operations is enumerated along with the parameter each modifies. For example, the Close(S) operation on a CM stream S reduces the number of concurrent streams at the server and hence affects s. Similarly, Play(S, r) changes the consumption rate of S and thus affects the rate vector r. Note that ReversePlay(S, r) need not affect v, since the disk scheduler is not constrained to retrieve blocks in the order in which they are to be viewed. Hence, since data fetched in cycle k is not consumed until cycle k + 1, blocks for ReversePlay are fetched in the same sequence as for Play, the only difference being the order in which the data is consumed. Other operations can be implemented using the primitive operations listed in Table 1. For example, Pause(S) is essentially Play(S, 0). Similarly, FastForward(S) may be implemented as ForwardSkip(S, 2)⁵.
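The reduction of compound operations to the primitives of Table 1 can be made explicit. The dispatch tuples below (primitive name, arguments) are an illustrative sketch; the argument shapes and the halving factor for SlowMotion are assumptions, not definitions from the paper:

```python
# Compound VCR-like operations expressed via the primitives of Table 1.
# 'S' is a stream handle; rates are in frames per second.
def pause(S):
    """Pause(S) is a rate variation to rate zero: Play(S, 0)."""
    return ("Play", (S, 0))

def fast_forward(S, rate):
    """FastForward(S) as ForwardSkip(S, 2): display every other unit
    (or every other MPEG group of frames) at an unchanged rate."""
    return ("ForwardSkip", (S, rate, 2))

def slow_motion(S, rate, factor=2):
    """SlowMotion as Play at a reduced rate, e.g. 30 -> 15 fps."""
    return ("Play", (S, rate / factor))
```

Keeping the primitive set small means the scheduler only ever has to reason about changes to s, r, and v.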
3.3 Computing the New State

When a VCR-like operation is invoked by a client, the scheduler needs to compute the new schedule corresponding to that operation and make a transition (if needed) to the new state. Since, from Equation 5, a change in the schedule solely causes the change in state, we will discuss the computation of the new state using the schedule. Let n and n^new be the schedules in the old and new states, respectively, and let

    n^new = n + Δn

Table 2 summarizes the state changes due to the VCR-like operations, derived in Appendix A. It may appear that concurrency set operations can be reduced to equivalent rate variation operations. For example, it appears as if the addition of a new stream should be no different from stepping up its rate from η_i = 0, prior to its play-out, to some non-zero value.

Footnote 5: Such an implementation of FastForward implies skipping alternate frames. In cases where streams are inter-frame compressed MPEG video, a group of frames (IBPB) is skipped, since it is hard (and possibly meaningless) to skip arbitrarily.
Class of Operations   Variable Change      Lower Bound on Δn
Rate Variation        r^new = r + Δr       Δn ≥ O(s) ((b − v^T r) I + r v^T) Δr / ((b − v^T r)(b − v^T r^new))
Sequence Variation    v^new = v + Δv       Δn ≥ O(s) (Δv^T r) r / ((b − v^T r)(b − (v^new)^T r))
Concurrency Set       s^new = s + Δs       Δn ≥ ΔO(s) r / (b − v^T r)

Table 2: New State Computation in BSCAN (∀i, r_i = b η_i u_i).

While this equivalence would appear to hold from the clients' viewpoint, the server needs to differentiate between the two operations. The reason for this distinction is that an extra context-switch time is required to access data for a new stream. As described in Equation 2, the data access time comprises a fixed component and a variable component. Hence, for the server, the process of admitting a new stream and that of stepping up the rate of a stream from η = 0 to a greater value are not equivalent. The distinction is similar to that between the Pause and Stop operations of conventional VCRs.
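The rate-variation row of Table 2 can be evaluated numerically. The sketch below (illustrative values throughout) also makes the claimed equivalence explicit: the bound equals the difference of the closed forms of Equation 4 evaluated at r + Δr and at r.

```python
def delta_n_rate(O_s, r, dr, v, b):
    """Rate-variation row of Table 2 (lower bound on the schedule change):
    dn = O(s) * ((b - v.r) I + r v^T) dr / ((b - v.r) * (b - v.r_new)),
    which equals n*(r + dr) - n*(r) with n* from Eq. 4."""
    vr = sum(vi * ri for vi, ri in zip(v, r))
    vdr = sum(vi * di for vi, di in zip(v, dr))
    vr_new = vr + vdr
    return [O_s * ((b - vr) * di + ri * vdr) / ((b - vr) * (b - vr_new))
            for ri, di in zip(r, dr)]

# Hypothetical example: stream 1 steps its byte rate up by 40960 B/s.
dn = delta_n_rate(O_s=0.05, r=[245_760.0, 368_640.0], dr=[40_960.0, 0.0],
                  v=[0.001, 0.001], b=4096)
```

Note that every stream's schedule grows, not just the one whose rate changed: the longer cycle makes all streams consume more per cycle.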
3.4 Admission Control for VCR-like Operations

BSCAN must control the admission of VCR-like operations based on the availability of I/O bandwidth and buffer space. If

    ( Σ_{i=1}^{s} v_i^new η_i^new u_i^new < 1 )  ∧  ( 2b Σ_{i=1}^{s} n_i^BSCAN ≤ B_avail )        (6)
      [I/O bandwidth limitation]                     [buffer space limitation]

where the parameters are those of the new state, then the VCR-like operation is admitted at the CMS; otherwise the operation is rejected.
4 Effecting State Transitions in BSCAN

The effect of clients' VCR-like operations on the servicing strategy is to change the schedule. In this section we consider the problems associated with switching from one schedule to another. We show that an uncontrolled switching of schedules results in a temporary violation of rate guarantees, which can cause undesirable effects on rendition at the clients' play-out devices. We then outline two general schemes to solve this problem.
4.1 State Transition Consider a VCR {like operation op that requires the scheduler in state S to change to state S new . In the previous section we computed the vector nnew corresponding to this new state S new . The next step is the design of an algorithm that will eect this state change. Figure 1 illustrates a typical state transition at the scheduler. In the gure, the horizontal axis measures time. The vertical 12
Transient states: fT g State S
State S new
transition profile
Bk nk k0
k0 + c
t
Figure 1: State Transition for VCR {like Operation op. axis6 measures data available for consumption. The horizontal line at height nk is the schedule fetched in each cycle prior to cycle k0 + c. Data build{up above this line is Bk , the excess data buered. In changing from state S to S new the scheduler passes through a sequence of transition states, labelled as the set fT g in Figure 1. After c cycles the new state S new is reached and the operation op is said to have been executed. Clearly a gamut of algorithms can eect the state transition, each possibly enforcing a dierent criteria for deciding how and when to change states. A seemingly natural way of handling this situation is to immediately step up(or down) the number of blocks fetched in the next cycle. However, in doing so rate guarantee to clients can be violated. Since the fundamental goal of the scheduling model described in Section 2 is to provide guaranteed data rate to clients at all time, we must select an algorithm that changes states while ensuring that executing VCR {like operations by one(or more) client(s) does not aect the rate guarantee to other clients. If, in the new state, the cycle duration is larger then the scheduler must fetch more blocks of data in every cycle. If these additional blocks of data are to be fetched without aecting the rate guarantees to other clients, then the scheduler must accumulate sucient data in buers of other clients before the VCR {like operation is executed causing the scheduler to resume steady operation in the new state. In Figure 1 this is interpreted as follows: Suppose the VCR {like operation op was requested at the start of cycle k0. In the next c cycles corresponding to the transition states, data is accumulated to reach a level corresponding to state S new , i.e. Bk +c n. The shaded region in Figure 1 shows the pro le of data accumulation in the transition states. We call this pro le the transition pro le for state change. 
In eect, the VCR {like operation is executed only 0
By a plot of a vector we imply a set of plots, one for each element of the vector. However, for the sake of brevity we shall imply henceforth any one such plot. 6
13
after cycle k0 + c. Notice that the time taken to execute the intermediate c cycles is the response time for the VCR {like operation since it is the time elapsed from the time of its invocation to its actual execution. If, in the new state, the cycle duration is smaller then the scheduler needs to reduce its service vector in the subsequent cycles. Figure 1, when followed right to left, illustrates this situation. To eect such a state transition is trivial. The service vector is set to 0 for a few subsequent cycles until nk + Bk = nnew . Since making a transition from S to S new is more involved than moving from S new to S when the cycle duration increases, henceforth we will restrict our discussion to handling state transitions of the former type. Rate Guarantee Violation
Figure 2: An Unsafe Transition Profile

Figure 3: A Safe Transition Profile
Figure 2 is a possible transition profile when the VCR-like operation is executed immediately on invocation. Such a profile is typical of an algorithm that, in an effort to increase B_k, immediately tries to fetch more blocks of data. However, in fetching more data blocks the duration of that scheduling cycle increases. Due to double buffering there will not be enough data fetched in cycle k_0 to sustain the clients in a dilated cycle k_0 + 1. In such an event clients starve temporarily during the cycles following cycle k_0 + 1. Such a transition profile that does not ensure rate guarantees is an unsafe transition profile. Consequently, we define a safe transition profile as follows:
Definition 2 (Safe Transition Profile) A transition profile is safe if for each transition cycle k_0 + j, 0 <= j <= c,

\underbrace{B_{k_0+j}}_{\text{Excess data at the end of cycle } k_0+j} \ge 0
Graphically, a safe transition profile never allows B_{k_0+j} to dip below the horizontal line, since if such a dip does occur in any cycle then it follows that in that cycle clients will starve. The transition profile in Figure 2 is an unsafe transition profile; Figure 3 illustrates a safe transition profile. In the subsequent discussion we shall consider only state transition algorithms that have safe transition profiles, since only they ensure rate guarantees to clients.
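Definition 2 can be illustrated with a small numeric sketch. This is not the MAGELLAN simulator; it is a minimal model assuming double buffering (data consumed in cycle k+1 must have been fetched by cycle k), with single-client parameter values taken from Tables 3 and 4 later in the paper (the fixed overhead s is an assumption inferred from T = s + v(n + a)).

```python
# Minimal model of a transition profile (not the MAGELLAN simulator).
# Assumes double buffering: data consumed in cycle k+1 must have been
# fetched in cycle k.  Parameter values follow Tables 3 and 4 of Section 6.
b = 2048.0            # bytes per block
s = 0.0214            # fixed overhead per cycle, seconds (assumption)
v = 0.004             # per-block access time, seconds

def consumed(blocks, rate):
    """Blocks drained by the client during a cycle that fetches `blocks`."""
    return (s + v * blocks) * rate / b

def profile(plan):
    """plan = [(blocks_fetched, playback_rate), ...]; returns the B_k trace."""
    B, trace = 0.0, []
    for k in range(len(plan) - 1):
        B += plan[k][0] - consumed(plan[k + 1][0], plan[k + 1][1])
        trace.append(B)
    return trace

r_old, r_new = 78.125 * 1024, 312.5 * 1024   # bytes/second

# Immediate switch to the dilated 9-block schedule: B_k dips below zero.
unsafe = profile([(1, r_old)] * 5 + [(9, r_new)] * 5)
# Accumulate slack first and switch late: the profile stays non-negative.
safe = profile([(1, r_old)] * 1030 + [(9, r_new)] * 5)
print(min(unsafe) < 0, min(safe) >= 0)       # True True
```

The immediate switch drains roughly eight blocks more than were prefetched, reproducing the dip of Figure 2; waiting until B_k covers the first dilated cycle reproduces the safe profile of Figure 3.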
4.2 Algorithms for State Change

An algorithm that has a safe transition profile must implement a strategy of fetching additional blocks over and above the schedule in each of the transition cycles k_0 through k_0 + c. By fetching additional data, the scheduler builds up B_k until a time when B_{k_0+c} >= \Delta n. At that point the state change is effected and the VCR-like operation executed. Data accumulation can be done in two ways. In a passive accumulation strategy the schedule is not modified, and the slack time in the cycles in state S is used to accumulate data. As long as there exists some slack time, a passive accumulation algorithm will accumulate data over a finite number of cycles until B_{k_0+c} is large enough to transit to the new state. In an active accumulation strategy an attempt is made to fetch additional data blocks by increasing the length of the schedule. However, the dilation is done carefully to ensure the safety of the resulting transition profile.
4.2.1 Passive Accumulation Algorithms

Passive accumulation algorithms, as the name suggests, require the scheduler to make no explicit attempt to accumulate data towards the state change. They rely critically on the slack time that exists in each cycle in state S to accumulate data. Over some finite number of transition cycles sufficient data is accumulated, which allows the scheduler to switch to servicing n^new. At this point the state transition occurs and the stream operation is safely executed. Let us assume that the scheduler in state S was fetching n + a, where n is the set of n_i given in Equation 4. For passive accumulation algorithms the data accumulated in each cycle is constant, since there is fixed slack time in each transition cycle. If we assume this accumulation to be A_p blocks per cycle, then this quantity can be computed as follows.

A_p = \underbrace{(n + a)}_{\text{Blocks available for consumption}} - \underbrace{\frac{1}{b}\, T_{svc}\, r}_{\text{Blocks actually consumed}}

Since T_{svc} = s + v^T(n+a),

A_p = (n + a) - \frac{1}{b}\left(s + v^T(n+a)\right) r

Since s r = M n,

A_p = \frac{1}{b}\, M\left((n + a) - n\right) = \frac{1}{b}\, M a

Thus,

A_p = \frac{1}{b}\, M a \qquad (7)
This accumulation increases B_k, the excess data buffered at the end of cycle k, to B_{k+1} = B_k + A_p; or, in x cycles, B_{k_0+x} = x A_p + B_{k_0}.
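For a single client the vectors reduce to scalars and M = b - r v, so Equation 7 can be checked numerically. A minimal sketch, assuming the Section 6 parameter values (the fixed overhead s is inferred from T = s + v(n + a) in Table 4):

```python
# Numeric check of Equation 7 for a single client, where the vector
# quantities reduce to scalars and M = b - r*v.  Assumed values: Tables 3/4
# of this paper; the fixed overhead s is inferred from T = s + v*(n + a).
b = 2048.0            # bytes per block
s = 0.0214            # fixed overhead per cycle, seconds (assumption)
v = 0.004             # per-block access time, seconds
r = 78.125 * 1024     # playback rate, bytes/second

n = s * r / (b - v * r)    # steady-state schedule length in blocks (A n = s r)
a = 1.0 - n                # slack: the schedule is rounded up to 1 block

# A_p two ways: blocks fetched minus blocks consumed, and the closed form.
direct = (n + a) - (s + v * (n + a)) * r / b
closed = (b - r * v) * a / b               # (1/b) M a in the scalar case
print(direct, closed)                      # both ~0.0078 blocks per cycle
```

Both forms agree and reproduce the A_p = 0.0078 entry of Table 4.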
Figure 4: Increase in Accumulation Fraction Due to Cycle Dilation

Thus, at the earliest cycle k_0 + c when B_{k_0+c} >= \Delta n, the state transition is made without violating rate guarantees. Notice that as long as Ma > 0 the data accumulated thus will grow. In other words, finite slack is essential to maintaining a safe transition profile for passive accumulation algorithms.
4.2.2 Active Accumulation Algorithms

The main drawback of passive accumulation strategies is that they suffer from a slow (and fixed) rate of growth of B_k. A more aggressive strategy is to increase the rate of growth of B_k by dilating the schedule in order to fetch more data. We denote schemes that dilate their schedule in order to increase the rate of data accumulation as active accumulation algorithms. Active algorithms dilate their schedule one or more times during the transition cycles. The main reason why dilating the schedule increases the rate of data accumulation is that with larger data fetches, the fraction of the bandwidth wasted due to context switching decreases, leading to an increase in the data throughput from the disk. Figure 4 shows the distribution of time spent in each cycle before and after dilating the schedule. Time in each cycle is divided into three parts: (i) the Consumption Fraction (CF), the fraction of the time spent fetching data that is to be consumed; (ii) the Overhead Fraction (OF), the fraction of time spent as overhead in fetching data in this cycle; and (iii) the Accumulation Fraction (AF), the fraction of the time spent fetching data to be accumulated in the cycle. In dilating the schedule the OF reduces, thereby increasing AF. The increase in AF results in a growth in the rate of data accumulation. It is easy to contemplate a wide variety of active accumulation algorithms. Thus, it is useful to classify such algorithms based on when and how many times they change their schedule during transition.

Figure 5: The transition profile of a two phase Active Accumulation Algorithm
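The CF/OF/AF shift of Figure 4 can be computed directly. A minimal sketch with the assumed single-client numbers of Section 6 (the dilation of 0.68 blocks is the a_d reported there):

```python
# CF, OF and AF (Figure 4) for a cycle that fetches `blocks` at playback
# rate `rate`; single-client numbers from Section 6, dilation a_d = 0.68.
b = 2048.0; s = 0.0214; v = 0.004
rate = 78.125 * 1024

def fractions(blocks):
    t_svc = s + v * blocks          # cycle duration
    cf = v * rate / b               # Eq. 8: CF is independent of `blocks`
    of = s / t_svc                  # overhead fraction
    return cf, of, 1.0 - cf - of    # the remainder accumulates (AF)

cf0, of0, af0 = fractions(1.0)          # before dilation (n + a = 1 block)
cf1, of1, af1 = fractions(1.0 + 0.68)   # after dilating by a_d blocks
print(af1 / af0)   # roughly the ~67-fold gain G reported in Section 6
```

CF is invariant under dilation, OF shrinks, and AF grows by roughly the gain G that the optimal algorithm of Section 6 selects.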
Definition 3 (k Phase Active Accumulation Algorithm) A k phase active accumulation algorithm is one that changes its schedule length k - 1 times during the transition cycles.

In the next section we discuss the family of two phase active accumulation algorithms.
5 Two Phase Active Accumulation Algorithms

In this section we analyze the class of two phase active accumulation algorithms. An algorithm in this class dilates its schedule once during transition (Figure 5). Assuming that there existed some slack in state S, a two phase algorithm dilates the schedule only if, by dilating, the scheduler can achieve a G-fold increase in the rate of accumulation over that in state S.
5.1 The Two Phase Algorithm

Figure 5 illustrates the typical transition profile of algorithms in this class. Such an algorithm accumulates data in two phases: a passive phase (P-phase) and an active phase (A-phase). In the P-phase the algorithm passively accumulates data until sufficient accumulation exists to make it fruitful to dilate the schedule. At this point (cycle x in Figure 5) the algorithm makes the decision to dilate the next cycle in exchange for a G-fold growth in the rate of accumulation over its P-phase. In dilating the schedule it permits a temporary fall in B_{x+1}, which is seen as the knee in Figure 5. For the desired growth the schedule is dilated by fetching an additional a_d blocks in each of the cycles in the A-phase. From cycle x + 1 the A-phase commences, wherein accumulation grows at G times the rate in the P-phase. This is continued until the desired amount, i.e. B_{k_0+c} >= \Delta n, is accumulated. Thus, after c cycles the state change is completed. However, there is a limit on how large G can be. This limit is given by Lemma 1.
Lemma 1 If a scheduler is executing in state S with a schedule n + a, the maximum increase in the rate of data accumulation G, w.r.t. state S, is bounded above by

G < \frac{s + v^T(n+a)}{v^T a}
Proof: The maximum possible growth in the rate of accumulation occurs when AF in Figure 4 expands to (almost) completely envelop OF. Notice that OF will tend to zero but will never be zero, since s != 0. When AF grows to cover the entire region of AF + OF, that will be the maximum possible accumulation rate, since CF remains invariant during cycle dilation. Hence, the maximum possible growth in data accumulation w.r.t. AF in state S is

G < \frac{1 - CF}{AF}

Since CF is the fraction of time spent in producing data that is to be consumed in the next cycle, CF is given by

CF = \frac{v^T\left(\frac{1}{b}\left(s + v^T(n+a)\right) r\right)}{s + v^T(n+a)} = \frac{1}{b}\, v^T r \qquad (8)

Since OF = \frac{s}{s + v^T(n+a)}, we can compute AF as AF = 1 - CF - OF, or

AF = 1 - \frac{1}{b}\, v^T r - \frac{s}{s + v^T(n+a)}

This simplifies to

AF = \frac{\left(b - v^T r\right)\, v^T\left((n+a) - n\right)}{b\left(s + v^T(n+a)\right)} \qquad (9)

Substituting the values of CF and AF from Equations 8 and 9, respectively, we get

G < \frac{1 - \frac{1}{b}\, v^T r}{\frac{(b - v^T r)\, v^T a}{b\left(s + v^T(n+a)\right)}} = \frac{s + v^T(n+a)}{v^T a}

This proves the claim.
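For the single-client scenario of Section 6 the bound reduces to the scalar (s + v(n+a))/(v a), i.e. 1/gamma. A quick evaluation with the assumed parameter values of Tables 3 and 4:

```python
# Scalar evaluation of the Lemma 1 bound G < (s + v(n+a)) / (v a) = 1/gamma,
# using assumed single-client values from Tables 3 and 4 of this paper.
b = 2048.0; s = 0.0214; v = 0.004
r = 78.125 * 1024                # 78.125 KBps in bytes/second
n = s * r / (b - v * r)          # 0.9907 blocks, as in Table 4
a = 1.0 - n                      # slack, 0.0093 blocks
g_max = (s + v * (n + a)) / (v * a)
print(g_max)   # ~686; the G_opt = 67.1 of Section 6 is well inside the bound
```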
The limiting value of G can be achieved when the scheduler dilates its cycle such that the overhead fraction (OF) of the service cycle tends to zero. In other words, as the service cycle tends to an infinitely large value the scheduler reaches its maximum throughput, and in such a state achieves the largest growth G in the rate of data accumulation. After cycle x, i.e. the start of the A-phase, the scheduler dilates cycle x + 1 by fetching a_d additional blocks from there onwards. If A_a is the data accumulated in each cycle of the A-phase then we want

\underbrace{\frac{1}{s + v^T(n + a + a_d)}\, A_a}_{\text{Rate of Accumulation in A-phase}} = G\, \underbrace{\frac{1}{s + v^T(n + a)}\, A_p}_{\text{Rate of Accumulation in P-phase}} \qquad (10)
A_a can be re-written in terms of a_d as follows:

A_a = \underbrace{(n + a + a_d)}_{\text{data available for consumption in an A-phase cycle}} - \underbrace{\frac{1}{b}\, T_{svc}^{A\text{-}phase}\, r}_{\text{data consumed in an A-phase cycle}}

Expanding T_{svc}^{A\text{-}phase},

A_a = (n + a + a_d) - \frac{1}{b}\left(s + v^T(n + a + a_d)\right) r

Substituting s r as M n, we get

A_a = \frac{1}{b}\, M (a_d + a) \qquad (11)
Using Equations 10 and 11, a_d is derived to be (see Appendix B.1)

a_d = \frac{G - 1}{1 - \gamma G}\, a \qquad (12)

where \gamma = \frac{v^T a}{s + v^T(n+a)}. Note that in fetching a_d blocks in the switching cycle x + 1, some of the data accumulated until now as B_x will be consumed. Since the transition profile of the algorithm must be safe, we must make sure that
\underbrace{b\left(B_{x-1} + A_p + n + a\right)}_{\text{data available for cycle } x+1} \ge \underbrace{T_{svc}^{x+1}\, r}_{\text{data consumed in cycle } x+1}

Since B_x = B_{x-1} + A_p (see Appendix B.2),

B_x \ge \frac{1}{b}\, r\, v^T a_d - A_p \qquad (13)

B_x measures the depth of the knee, and hence Condition 13 can be graphically interpreted as a bound on the depth of the knee seen in Figure 5. Substituting Equation 12 in Condition 13, we get the earliest cycle x + 1 at which it is safe to dilate the cycle and achieve the increased data accumulation rate:

B_x \ge \frac{v^T a}{b}\cdot\frac{(1 - \gamma) G}{1 - \gamma G}\, r - a
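Equations 12 and 13 can be evaluated for the Section 6 scenario. A minimal sketch (G = 67.1 is the optimal gain reported there; parameter values as assumed from Tables 3 and 4):

```python
# Evaluating Eq. 12 (the dilation a_d) and Condition 13 (the knee depth)
# for the Section 6 scenario; G = 67.1 is the gain reported there.
b = 2048.0; s = 0.0214; v = 0.004
r = 78.125 * 1024
n = s * r / (b - v * r)
a = 1.0 - n
gamma = v * a / (s + v * (n + a))
G = 67.1
a_d = (G - 1) * a / (1 - gamma * G)        # extra blocks per A-phase cycle
B_x_min = (v * a / b) * ((1 - gamma) * G / (1 - gamma * G)) * r - a
print(a_d, B_x_min)   # a_d ~0.68 blocks, matching the Section 6 run
```

The computed a_d of about 0.68 blocks matches the dilation reported for the optimal algorithm in Section 6.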
Equation 12 will yield a feasible solution only if \gamma G < 1. This is indeed true due to Lemma 1. An interesting consequence of Lemma 1 is that the larger the value of v^T a, the smaller the upper bound on G. In other words, the slack is inversely proportional to the upper bound on G. Hence, G alone is inadequate for comparing two 2-phase active accumulation algorithms for state change. G is analogous to the gain of signal amplifiers, where it is similarly defined as the ratio of the output signal to the input signal. A comparison of a 20 dB amplifier with an input signal of strength 1 milliampere to a 10 dB amplifier with an input signal of strength 1 ampere will be inaccurate if based solely on the value of G. Clearly, for a fair comparison the accumulation (slack) in the initial state must be taken into account. In the next section we motivate the use of the response time of a VCR-like operation as an unbiased figure of merit for state transition algorithms and derive a two phase active accumulation algorithm that is optimal in its response time.
5.2 The Time Optimal Two Phase Active Accumulation Algorithm

The aim of active accumulation algorithms is to increase the rate of data accumulation, and thereby reduce the time required to accumulate data for the state transition. As defined previously, this time is the response time for a VCR-like operation, since it is the time elapsed from the instant of the request for the operation until its execution. Hence, the less time it takes to accumulate data, the smaller the response time. A desirable feature for clients as well as the service provider (CMS) is the ability to execute VCR-like operations as quickly as possible. Hence, transition algorithms that reduce this time are desirable. In this context we define the notion of a time optimal safe state transition algorithm as follows.
Definition 4 (Time Optimal Transition Algorithm) A transition algorithm effecting state transitions in the least time is a time optimal transition algorithm.
In this section we construct a time optimal state transition algorithm in the space of two phase active accumulation algorithms. While in the previous section we derived conditions under which a scheduler could dilate its cycle to increase its rate of accumulation in the A-phase to G times that in the P-phase, in this section we pose the question: for what value of G is the response time minimized? Figure 6 illustrates the profiles of three algorithms in this class. If G is too small, e.g. G_small, the rate of accumulation is not steep enough. On the other hand, if G is too large, e.g. G_large, the P-phase will be longer, resulting in too late a switch to the A-phase. Thus, the optimal value of G, G_opt, is the one that minimizes the time to change states, and the corresponding algorithm is the time optimal safe transition algorithm in the space of two phase active accumulation algorithms.

Figure 6: The time optimal and two other two phase active accumulation algorithms.

Figure 7: Computing G_i^{opt}.

Figure 7 illustrates the problem of computing G_i^{opt} for a two phase active algorithm. In the figure, M_i represents the required accumulation for the state transition (\Delta n_i), and K_i is the depth of the knee. Let the rate of accumulation in the P-phase be m_i, and let c_i be the number of transition cycles. Let T and T^{new} be the time durations of each cycle of the P-phase and A-phase, respectively. In Figure 7, T_i represents the time needed to accumulate M_i blocks. Formulated thus, the problem of computing G_i^{opt} is essentially that of finding the value of G_i that minimizes T_i.
\underbrace{m_i (x_i T)}_{\text{Accumulation in P-phase}} + \underbrace{G_i m_i (c_i - x_i) T^{new}}_{\text{Accumulation in A-phase}} = \underbrace{M_i}_{\text{Data for state change}} + \underbrace{K_i}_{\text{Depth of knee}} \qquad (14)

Notice that (c_i - x_i) T^{new} = T_i - x_i T. Hence,

T_i = \frac{1}{m_i G_i}\left((G_i - 1)\, x_i m_i T + M_i + K_i\right) \qquad (15)

From Condition 13 we can compute K_i (see Appendix B.3) and x_i (see Appendix B.4) in terms of G_i as
K_i = u_i\, \frac{G_i}{1 - \gamma G_i} - a_i\,; \qquad x_i = \frac{u_i}{w_i}\cdot\frac{G_i}{1 - \gamma G_i} - \frac{z_i}{w_i} \qquad (16)

where u_i = \frac{v^T a\, (1 - \gamma)}{b}\, r_i, \; z_i = a_i + B_i^{k_0}, and w_i = a_i - \frac{v^T a}{b}\, r_i. Substituting the values of K_i and x_i into Equation 15, we get T_i as a function of G_i:
T_i(G_i) = \left(\frac{M_i - a_i}{m_i} + \frac{z_i T}{w_i}\right)\frac{1}{G_i} + \frac{u_i}{m_i}\cdot\frac{1}{1 - \gamma G_i} + \frac{u_i T}{w_i}\cdot\frac{G_i - 1}{1 - \gamma G_i} - \frac{z_i T}{w_i} \qquad (17)

To get T_i^{min}, Equation 17 is differentiated w.r.t. G_i and set to 0. Solving it yields (see Appendix B.5)

G_i^{opt} = \frac{1}{\gamma + \sqrt{q_i / p_i}} \qquad (18)

where p_i = \frac{M_i - a_i}{m_i} + \frac{z_i T}{w_i} and q_i = \frac{(1 - \gamma)\, u_i T}{w_i} + \frac{\gamma\, u_i}{m_i}.
To derive the transition time for a set of s clients we pick T^{min} to be

T^{min} = \max_{i=1,\ldots,s} T_i\!\left(G_i^{opt}\right)

In other words, the optimal transition time for the CMS servicing a set of s clients will be as large as the slowest transition. Thus, given \Delta n, the time optimal two phase active accumulation algorithm computes the G_i^{opt}'s and uses Condition 13 and Equation 12 to decide when to dilate the schedule and by how much. Such an algorithm minimizes the response time of the VCR-like operation.
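As an illustrative check, Equation 18 can be evaluated with the Section 6 numbers. The sketch below assumes B_i^{k_0} is approximately 0 at the instant of the request (the paper does not report this value), so the result only approximates the G^{opt} = 67.1 of Section 6:

```python
# Illustrative evaluation of Eq. 18 for the Play-rate change of Section 6.
# Assumption: B_{k0} ~ 0 at the instant of the request (not reported in
# the paper), so the result only approximates the reported G_opt = 67.1.
b = 2048.0; s = 0.0214; v = 0.004
r = 78.125 * 1024
n = s * r / (b - v * r); a = 1.0 - n       # state S schedule and slack
gamma = v * a / (s + v * (n + a))
T = s + v * (n + a)                        # P-phase cycle time, ~0.0254 s
r_new = 312.5 * 1024
M = s * r_new / (b - v * r_new) - n        # required accumulation, ~7.93 blocks
w = a - (v * a / b) * r                    # A_p per cycle
m = w / T                                  # accumulation rate, blocks/second
u = (v * a * (1 - gamma) / b) * r
z = a + 0.0                                # z = a + B_{k0}, with B_{k0} ~ 0
p = (M - a) / m + z * T / w
q = (1 - gamma) * u * T / w + gamma * u / m
G_opt = 1.0 / (gamma + (q / p) ** 0.5)
print(G_opt)                               # close to the reported 67.1
```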
6 Simulation Studies

In this section we validate the analyses of the preceding sections via simulation studies. The experiments presented here were conducted using MAGELLAN [KH], a process-oriented simulator written in C++/CSIM [Sch90] and MATLAB. A schematic of MAGELLAN is illustrated in Figure 8. Further description of MAGELLAN can be found in [KH].
Figure 8: Schematic of MAGELLAN Simulator.

For the results presented in this paper, MAGELLAN simulated a CMS with the parameters described in Table 3. This configuration corresponds to a Sun Sparcstation 20 with a Seagate Barracuda 2GB disk and a Parallax video card playing back motion-JPEG compressed video streams.

Parameter                Value
Block size b             2048 bytes
Total Tracks T           2800 tracks
RPM                      5400 rpm
t_rot^max                11.11 msec
Capacity of 1 track      48 MB
Transfer rate R          4.42 MBps
Per block Access v       0.49 msec
Fixed Component          0.005 sec
Variable Component       0.001 sec/track

Table 3: Simulation Parameters for MAGELLAN.

The storage system is assumed to store a set of video streams V = {V_1, ..., V_n}. To demonstrate the performance of the active and passive accumulation algorithms described in this paper we will, for clarity of explanation, consider a scenario in which a single client exists at time t = 0. At that instant the client requests the VCR-like operations Open(V_1) followed by Play^7(V_1, 78.125 KBps). A few seconds later, at time t = 3.2512, the client executes Play(V_1, 312.5 KBps), i.e. the client quadruples the playout rate of V_1. Let the state of the scheduler at t = 3.2512 be S and the state after executing the VCR-like operation be S^new. Table 4 lists the values of the variables involved in the state transition. Figures 9 through 12 plot B_k (in blocks) against t (seconds). In all figures the dark horizontal line marks 0 on the y-axis. Notice that for a safe transition (Definition 2) we require that the line traced by B_k remain above this dark horizontal line. In the period before t = 3.2512 seconds some finite slack exists that causes data build-up in the buffer over time. This accumulation is bounded using the technique described in [KHS94]: as soon as an integral number of blocks has accumulated, the service vector is appropriately adjusted. Such bounding gives rise to the saw-tooth like trace of B_k in steady states like S and S^new, as is seen clearly in Figures 9 through 13. Figure 9 simulates a state transition algorithm that allows immediate transition at time t = 3.25 seconds to the new state S^new. In the process, the safety of the transition is compromised; in fact, for about 2.25 seconds the client perceives jitter caused by starvation at the CMS.
On reaching S^new at time t = 5.57 seconds, B^1_{k_0+c} is consumed during the servicing of the new schedule. Thereafter, the bounded-buffer algorithm in [KHS94] bounds data accumulation to 1 block, causing the saw-tooth pattern once again.

7. We translated frame rate into data rate.
Parameter      S        S^new    Units
r              78.125   312.5    KBps
v              0.004    0.004    seconds
n              0.9907   8.9167   blocks
a              0.0093   0.0833   blocks
n + a          1        9        blocks
M              --       7.93     blocks
A_p            0.0078   0.0313   blocks per cycle
T              0.0254   0.0574   seconds
m (= A_p/T)    0.3076   0.5444   blocks per second
gamma          0.0015   0.0058   --

Table 4: Values of Variables in States S and S^new.
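The two columns of Table 4 are mutually consistent and can be regenerated from n = s r / (b - v^T r). A minimal sketch (the fixed overhead s ~0.0214 s is an assumption inferred from T = s + v(n + a)):

```python
# Regenerating the Table 4 columns from first principles: n = s r/(b - v r),
# with s ~0.0214 s inferred from T = s + v(n + a).  Illustrative sketch.
b = 2048.0; s = 0.0214; v = 0.004

def state(rate_kbps, schedule_blocks):
    r = rate_kbps * 1024                   # bytes/second
    n = s * r / (b - v * r)                # steady-state schedule, blocks
    a = schedule_blocks - n                # slack from rounding up
    T = s + v * schedule_blocks            # cycle duration, seconds
    Ap = a - (v * a / b) * r               # accumulation per cycle, blocks
    return n, a, T, Ap, Ap / T, v * a / T  # last entry is gamma

for rate, sched in ((78.125, 1), (312.5, 9)):
    print([round(x, 4) for x in state(rate, sched)])
```

The printed rows reproduce the n, a, T, A_p, m, and gamma entries of both columns to the precision shown in the table.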
Figure 9: Plot of B^1_k vs. t during S -> S^new with immediate transition. In the period [3.33, 5.57], B^1 < 0, causing client starvation for 2.25 seconds.
Figure 10: Plot of B^1_k vs. t during S -> S^new with the passive algorithm. Transition time T is 25.91 seconds.

Figure 10 shows the transition profile for the passive accumulation algorithm. The algorithm relies critically on the slack in the initial state S. To make the transition the algorithm takes 25.91 seconds, during which B^1_k grows at a constant rate (A_p/T). On accumulating \Delta n_1 blocks the scheduler switches to the new schedule and thereafter operates using the bounded-buffer algorithm in S^new.

Figure 11 shows the transition profile of the time optimal 2-phase active accumulation algorithm. The algorithm spends 0.3048 seconds in the P-phase, wherein it accumulates sufficient data to dilate its cycle by 0.68 blocks. Thereafter it spends the remaining time in the A-phase, accumulating sufficient data to enable a safe state transition. The optimal growth factor G^opt was found to be 67.1 for the transition. This algorithm executes the VCR-like operation in 0.70 seconds, about a 3700% improvement over its passive variant! On reaching S^new the algorithm resumes execution of the bounded-buffer algorithm.

Figure 11: Plot of B^1_k vs. t during S -> S^new with the time optimal 2-phase algorithm. Transition time T is 0.7 seconds if fractional block fetches are allowed.

Figure 12 shows the time optimal 2-phase active accumulation algorithm with a key difference: it illustrates the feasible version of the algorithm, i.e. it implements the modified optimal algorithm wherein cycle dilations are restricted to integral multiples of the block size of the storage system. In our example it is infeasible to dilate a cycle by 0.68 blocks; hence, the smallest larger integral value is used (in this case 1 block). This additional dilation lengthens the P-phase, which is offset by a faster accumulation in the A-phase. Consequently, the feasible implementation of the algorithm executes the VCR-like operation in 0.81 seconds, about 16% off from the optimal transition but still about a 3100% improvement over the passive accumulation algorithm!

Figure 12: Plot of B^1_k vs. t during S -> S^new with a feasible time optimal 2-phase algorithm. Transition time T is 0.81 seconds when only integral block fetches are allowed.

Figure 13 shows a magnified view of the state transition of Figure 12. The P-phase, the knee, and the A-phase are seen more clearly in the figure.

Figure 13: A magnified view of B^1_k vs. t during the transition as seen in Figure 12, showing the two phases of the accumulation algorithm.
7 Concluding Remarks

A servicing strategy for clients' access must be able to handle changes brought about by interactions via VCR-like operations. In this paper we have considered the effect of VCR-like operations requested by clients on BSCAN. We have shown that handling VCR-like operations requires special care to eliminate transient effects in BSCAN if playback guarantees to the clients are to be maintained. We have suggested two general techniques to handle such transitory situations. Using the response time as a figure of merit, we have derived the optimal response time algorithm for a class of active accumulation strategies. Simulation studies support the analyses presented here.
8 Future Work

The techniques described here are being implemented on a Sun SparcStation 20 with a Parallax Video Card, 168M main memory, and 8G secondary storage running Solaris 2.3, as part of the Presto project [AHPR94]. The I/O scheduler is being implemented as a single thread sharing its address space with the clients, using the real-time and multi-thread support in Solaris 2.3 [KSZ92]. Panels analogous to those on VCRs are provided to clients to execute VCR-like operations. The problem of finding the optimal algorithm, i.e. the one with the least response time, to handle state transitions remains unsolved; its construction is part of our future efforts. Recent work like [LV95] is leading us to consider mixed solutions, i.e. solutions that rely on the placement strategy as well as the scheduling strategy to provide fast execution of VCR-like operations.
References

[AHPR94] M. Agrawal, J. Huang, S. Prabhakar, and J. Richardson. Integrated System Support for Continuous Multimedia Applications. In Proceedings of the International Conference on Distributed Multimedia Systems and Applications, August 1994.

[AOG92] D. Anderson, Y. Osawa, and R. Govindan. A file system for continuous media. ACM Transactions on Computer Systems, 10(4), November 1992.

[BG88] D. Bitton and J. Gray. Disk shadowing. In 14th International Conference on Very Large Data Bases, pages 331-338, 1988.

[CKY93] M. Chen, D. Kandlur, and P.S. Yu. Optimization of the Group Sweep Scheduling with Heterogeneous Multimedia Streams. In Proceedings of ACM Multimedia, pages 235-242, Anaheim, CA, August 1993.

[CKY94] M. Chen, D. Kandlur, and P.S. Yu. Support for Fully Interactive Playout in a Disk-array Based Video Server. In Proceedings of the 2nd ACM Multimedia Conference. ACM, October 1994.

[CL93] H.-J. Chen and T.D.C. Little. Physical storage organization for time-dependent multimedia data. In 4th Intl. Conference on Foundations of Data Organization and Algorithms, 1993.

[DSKT94] J. Dey, J. Salehi, J. Kurose, and D. Towsley. VCR Capabilities for Very Large Scale Video on Demand. In Proceedings of the 2nd ACM Multimedia Conference. ACM, October 1994.

[GC92] J. Gemmell and S. Christodoulakis. Principles of Delay-sensitive Multimedia Storage and Retrieval. ACM Transactions on Information Systems, 10(1):51-90, January 1992.

[Gem93] J. Gemmell. Multimedia Network File Servers: Multi-channel Delay Sensitive Data Retrieval. In Proceedings of the 1st ACM Multimedia Conference. ACM, October 1993.

[KEL94] R. Keller, W. Effelsberg, and B. Lamparter. Performance bottlenecks in digital movie systems. In Proc. of the 4th Intl. Workshop on Network and Operating Systems, 1994.

[KH] D.R. Kenchammana-Hosekote. The MAGELLAN Continuous Media Server Simulator. Document in preparation.

[KHS94] D.R. Kenchammana-Hosekote and J. Srivastava. Scheduling Continuous Media on a Video-On-Demand Server. In International Conference on Multimedia Computing Systems, Boston, MA, May 1994. IEEE.

[KSZ92] S. Khanna, M. Sebree, and J. Zolnowsky. Realtime Scheduling in SunOS 5.0. In Proceedings of Winter USENIX. USENIX, 1992.

[LS92] P. Lougher and D. Shepherd. Design and implementation of a continuous media storage server. In Proc. of the 3rd Intl. Workshop on Network and Operating Systems, 1992.

[LV95] T.D. Little and D. Venkatesh. A Scalable Video-on-Demand Service for the Provision of VCR-like Functions. In IEEE Multimedia. IEEE, 1995. To appear.

[NYT94a] NYT. Demanding Task: Video on Demand. Article in the New York Times, Sunday, January 23rd, 1994.

[NYT94b] NYT. Multi-media in Medicine. Article in the New York Times, Sunday, January 23rd, 1994.

[NYT95] NYT. Perspectives on Video-on-Demand. IEEE Spectrum Special Issue, April 1995.

[ONWC87] B.C. Ooi, A.N. Narasimalu, K.Y. Wang, and I.F. Chang. Design of a multi-media file server using optical disks for office automation. In IEEE Computer Society Office Automation Symposium, pages 157-163, 1987.

[RS92] L. Rowe and B.C. Smith. A Continuous Media Player. In Proc. of the 3rd Intl. Workshop on Network and Operating Systems, 1992.

[RV93] V.P. Rangan and H.M. Vin. Designing a multi-user HDTV storage server. IEEE Journal on Selected Areas in Communications, 11(1), January 1993.

[RVG+93] K.K. Ramakrishnan, L. Vaitzblit, C. Gray, U. Vahalia, D. Ting, P. Tzelnic, S. Glasner, and W. Duso. Operating system support for a Video-on-Demand service. In Proc. of the 4th Intl. Workshop on Network and Operating Systems, 1993.

[RW93] A.L.N. Reddy and J. Wyllie. Disk Scheduling in a Multimedia Server. In Proceedings of ACM Multimedia, pages 225-234, Anaheim, CA, August 1993.

[Sch90] H.D. Schwetman. CSIM Reference Manual (Revision 15). Technical Report ACA-ST-257-87, Microelectronics and Computer Technology Corporation, Austin, Texas, 1990.

[SMRD94] J. Schnepf, V. Mashayekhi, J. Riedl, and D.H.-C. Du. Closing the Gap in Distance Learning Education: Computer-Supported, Participative, Media-Rich Education. ED-TECH Review, Fall/Winter 1994.

[TP72] T.J. Teorey and T.B. Pinkerton. A Comparative Analysis of Disk Scheduling Policies. Communications of the ACM, 15(3):177-184, March 1972.

[TPBG93] F. Tobagi, J. Pang, R. Baird, and M. Gang. Streaming RAID: A disk array management system for video files. In 1st ACM Conference on Multimedia, pages 393-400. ACM, 1993.

[USA94] USAF. C4I: The Advanced Warrior Program. USAF Publication, 1994.
A Computing New States

A.1 \Delta n for Rate Variation Operations

We have

\left(bI - (r + \Delta r)\, v^T\right)(n + \Delta n) = s\,(r + \Delta r)

If we let A = (bI - r v^T) then we know that A n = s r. This simplifies the above expression to

\Delta n = s\left(A - \Delta r\, v^T\right)^{-1}(r + \Delta r) - s A^{-1} r

Using \left(A - \Delta r\, v^T\right)^{-1} = A^{-1} + \frac{A^{-1} \Delta r\, v^T A^{-1}}{1 - v^T A^{-1} \Delta r} and A^{-1} r = \frac{1}{b - v^T r}\, r, this expands to

\Delta n = \frac{s}{(b - v^T r)(b - v^T r^{new})}\left((b - v^T r)\, I + r\, v^T\right) \Delta r
A.2 \Delta n for Sequence Variation Operations

We have

\left(bI - r\,(v + \Delta v)^T\right)(n + \Delta n) = s r

If we let A = (bI - r v^T) then we know that A n = s r. The above expression then yields

\Delta n = s\left(A - r\, \Delta v^T\right)^{-1} r - s A^{-1} r

Given that A^{-1} r = \frac{1}{b - v^T r}\, r,

\Delta n = \frac{s\,(\Delta v^T r)}{(b - v^T r)(b - v^{new\,T} r)}\, r
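For a single client the closed form of A.1 can be checked against the direct difference of steady states. A sketch with the Section 6 rate change (assumed parameter values as in Tables 3 and 4):

```python
# Scalar check of A.1: Delta-n from the closed form vs. the direct
# difference of steady states, for the rate change of Section 6.
b = 2048.0; s = 0.0214; v = 0.004
r = 78.125 * 1024
dr = 312.5 * 1024 - r            # the Play operation quadruples the rate
r_new = r + dr

direct = s * r_new / (b - v * r_new) - s * r / (b - v * r)
closed = s * ((b - v * r) + r * v) * dr / ((b - v * r) * (b - v * r_new))
print(direct, closed)            # both ~7.926 blocks, the M of Table 4
```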
B Derivations for Section 5

B.1 Derivation of a_d = \frac{G - 1}{1 - \gamma G}\, a

Starting from

\underbrace{\frac{1}{s + v^T(n + a + a_d)}\, A_a}_{\text{Rate of Accumulation in A-phase}} = G\, \underbrace{\frac{1}{s + v^T(n + a)}\, A_p}_{\text{Rate of Accumulation in P-phase}}

we have

A_a = G\left(1 + \frac{v^T a_d}{s + v^T(n + a)}\right) A_p

Since A_a = \frac{1}{b}\, M (a_d + a) and A_p = \frac{1}{b}\, M a,

a_d + a = G\left(1 + \frac{v^T a_d}{s + v^T(n + a)}\right) a

Or,

\left(I - \frac{G\, a\, v^T}{s + v^T(n + a)}\right) a_d = (G - 1)\, a

Given that (I - u v^T)^{-1} = I + \frac{u v^T}{1 - v^T u} and \gamma G < 1,

a_d = \frac{G - 1}{1 - \gamma G}\, a
B.2 Derivation of B_x \ge \frac{1}{b}\, r\, v^T a_d - A_p

Starting from

\underbrace{b\left(B_{x-1} + A_p + n + a\right)}_{\text{data available for cycle } x+1} \ge \underbrace{T_{svc}^{x+1}\, r}_{\text{consumption in cycle } x+1}

Since B_x = B_{x-1} + A_p,

b\left(B_x + n + a\right) \ge \left(s + v^T(n + a + a_d)\right) r

Since b(n + a) - \left(s + v^T(n + a)\right) r = b A_p,

B_x \ge \frac{1}{b}\, r\, v^T a_d - A_p

Substituting a_d = \frac{G - 1}{1 - \gamma G}\, a and A_p = \frac{1}{b}\left(bI - r v^T\right) a,

B_x \ge \frac{v^T a}{b}\cdot\frac{G - 1}{1 - \gamma G}\, r - a + \frac{1}{b}\, r\, v^T a = \frac{v^T a}{b}\left(\frac{G - 1}{1 - \gamma G} + 1\right) r - a

B_x \ge \frac{v^T a}{b}\cdot\frac{(1 - \gamma)\, G}{1 - \gamma G}\, r - a
B.3 Derivation of K_i = u_i \frac{G_i}{1 - \gamma G_i} - a_i

Starting from K_i = (B_x)_i and the bound at the end of B.2, with u_i = \frac{v^T a\,(1 - \gamma)}{b}\, r_i,

K_i = \frac{v^T a}{b}\cdot\frac{(1 - \gamma)\, G_i}{1 - \gamma G_i}\, r_i - a_i = u_i\, \frac{G_i}{1 - \gamma G_i} - a_i
B.4 Derivation of x_i = \frac{u_i}{w_i}\cdot\frac{G_i}{1 - \gamma G_i} - \frac{z_i}{w_i}

Starting from B_x = A_p x + B_{k_0}, the i-th component gives

\left(a_i - \frac{v^T a}{b}\, r_i\right) x + B_i^{k_0} = u_i\, \frac{G_i}{1 - \gamma G_i} - a_i

Re-arranging, with w_i = a_i - \frac{v^T a}{b}\, r_i and z_i = a_i + B_i^{k_0},

x_i = \frac{u_i}{w_i}\cdot\frac{G_i}{1 - \gamma G_i} - \frac{z_i}{w_i}
B.5 Computing G_i^{opt}

Starting from

T_i(G_i) = \frac{1}{m_i G_i}\left((G_i - 1)\, x_i m_i T + M_i + K_i\right)

and substituting the values of K_i and x_i, we get

T_i(G_i) = \left(\frac{M_i - a_i}{m_i} + \frac{z_i T}{w_i}\right)\frac{1}{G_i} + \frac{u_i}{m_i}\cdot\frac{1}{1 - \gamma G_i} + \frac{u_i T}{w_i}\cdot\frac{G_i - 1}{1 - \gamma G_i} - \frac{z_i T}{w_i}

Since

\frac{d}{dG_i}\,\frac{G_i - 1}{1 - \gamma G_i} = \frac{1 - \gamma}{(1 - \gamma G_i)^2}\,; \qquad \frac{d}{dG_i}\,\frac{1}{1 - \gamma G_i} = \frac{\gamma}{(1 - \gamma G_i)^2}

we get

\frac{dT_i}{dG_i} = \left(\frac{(1 - \gamma)\, u_i T}{w_i} + \frac{\gamma\, u_i}{m_i}\right)\frac{1}{(1 - \gamma G_i)^2} - \left(\frac{M_i - a_i}{m_i} + \frac{z_i T}{w_i}\right)\frac{1}{G_i^2}

Setting \frac{dT_i}{dG_i} = 0 we get

\left(\frac{1 - \gamma G_i}{G_i}\right)^2 = \frac{q_i}{p_i}

Solving for G_i we get

G_i^{opt} = \frac{1}{\gamma + \sqrt{q_i / p_i}}