CONTINUOUS MEDIA RETRIEVAL OPTIMIZER FOR HIERARCHICAL STORAGE STRUCTURES
Cyrus Shahabi, Ali Esmail Dashti, and Shahram Ghandeharizadeh
Integrated Media Systems Center and Computer Science Department
University of Southern California, Los Angeles, California 90089
[cshahabi,dashti,shahram]@cs.usc.edu
ABSTRACT
One of the key components of multimedia systems is a Continuous Media server (CM server) that guarantees the uninterrupted delivery of continuous media data (e.g., video). Digital libraries and commercial broadcasting systems are sample applications that can benefit from such multimedia systems. Queries imposed by such applications might require the retrieval of one or more continuous objects stored on the CM server. Traditionally, multimedia systems have opted to guarantee that the CM server can display all the objects in the set to the user with no interruptions and with very strict timing among the displays of these objects, resulting in a single retrieval plan. However, for a class of applications, we have observed that depending on the user query, user profile, and session profile, there are a number of flexibilities that can be exploited for retrieval optimization. In (Shahabi et al., 1998), we presented a formal definition of these flexibilities and a description of a Profile Aware Retrieval Optimizer (Prime) that utilizes these flexibilities to improve system performance. In this paper, we consider two extensions to Prime, namely: 1) Prime+, which uses memory buffering to alleviate fragmented server bandwidth, and 2) Prime , which is an extension of Prime for a hierarchical storage system. By means of a simulation study, we show that Prime and Prime+ can significantly improve the system response time and/or other application-specific metrics (e.g., latency time).
1 INTRODUCTION
Multimedia applications impose a new set of constraints on data management systems.
This research was supported in part by gifts from Hewlett-Packard and Hughes, NSF grants EEC-9529152 (IMSC ERC), IRI-9203389, IRI-9258362 (NYI award), MRI-9724567, and NIMH grant 1P20MH52194-01A1 (USC Brain Project).
These applications combine a variety of data types, such as text, images, audio, video, and animations. In many of these applications, the result of a query is a set of continuous media objects (i.e., audio or video) that should be retrieved from a continuous media server (CM server) and displayed to the user. The display requirements of such applications can be classified as either: 1) two-pass, or 2) single-pass. In the two-pass paradigm, during the first pass, a set of objects (that satisfies the query filter expression) is identified. Subsequently, the user interactively selects the objects of interest for display. To assist the user, textual and thumbnail information are used to represent the different objects. Most current Internet applications use the two-pass paradigm. In the single-pass paradigm, a set of temporal relationships (among the set of identified objects) governs the display timing of the objects. The display of the objects is considered to be coherent when all of the temporal relationships are satisfied. Therefore, after the submission of the request, no user interaction is necessary. That is, the system starts the display as soon as it can guarantee that the CM server can display all of the objects in the set to the user with no interruptions, while satisfying the temporal relationships. Some of the new Internet applications (e.g., PointCast) that use the push paradigm can be considered as a variation of the single-pass paradigm. In the pure push paradigm, the user does not make a request; rather, the system keeps broadcasting information to the user (i.e., data driven (Shahabi and Ghandeharizadeh, 1995)). Some sample applications that may use the single-pass display paradigm include: customized news-on-demand, customized advertisement, and digital libraries & museums. Applications using the single-pass paradigm can be classified as either: 1) Restricted Presentation Applications (RPA), or 2) Flexible Presentation Applications (FPA). RPA, such as non-linear editing applications, have a very strict set of temporal relationships that have
to be met. This strict timing requirement results in a single retrieval plan, and hence very limited retrieval optimization is possible. However, in FPA, the set of temporal relationships among the objects is not as strict, yielding a number of equivalent retrieval plans that can be used for retrieval optimization purposes. RPA imposes very strict display requirements. This is due to the type of queries imposed by users in such applications. It is usually the case that the user can specify what objects he/she is interested in and how to display these objects in concert. Multimedia systems have to guarantee that the CM server can retrieve all the objects in the set and can satisfy the precise time dependencies, as specified by the user. We have studied the scheduling of continuous media retrievals for RPA in (Chaudhuri et al., 1995; Shahabi, 1996). FPA, however, provide some flexibilities in the presentation of the continuous media objects. It is usually the case that the user does not know exactly what he/she is looking for and is only interested in displaying objects that meet some criteria (e.g., "show me today's news"). In general, almost all applications using a multimedia DBMS fall into this category. In this case, depending on the user query, user profile, and session profile, there are a number of flexibilities that can be exploited for retrieval optimization. We have identified the following flexibilities:
- delay flexibility: specifies the amount of delay the user/application can tolerate between the displays of different continuous media clips (i.e., relaxed meet, after, and before relationships (Allen, 1983)).
- ordering flexibility: refers to the display order of the objects (i.e., to what degree the display order of the objects is important to the user).
- presentation flexibility: refers to the degree of flexibility in the presentation length and presentation startup latency.
- display-quality flexibility: specifies the display qualities acceptable to the user/application, when data is available in multiple formats and/or in hierarchical or layered formats (based on layered compression algorithms (Keeton and Katz, 1995)).
In FPA, these flexibilities allow for the construction of multiple retrieval plans per presentation. Subsequently, the best plan is identified as the one that results in minimum contention at the CM server. To achieve this, three steps should be taken: Step 1: gathering the flexibilities, Step 2: capturing the flexibilities in a formal format, and Step 3: using the flexibilities for optimization. In our system architecture, Figure 1, the first two
steps are carried out by the Profile Aware User Query Combiner (Parrot). It takes as input the user query, user profile, and session profile (e.g., type of display monitor) to generate a query script (as output). This query script captures all the flexibilities and requirements in a formal manner. The query script is then submitted to the Profile Aware Retrieval Optimizer (Prime), which in turn uses it to generate the best retrieval plan for the CM server. The main focus of our study is on the formal definition of the query script and the process of using this information to optimize continuous media data retrieval (Steps 2 and 3). In previous work (Shahabi et al., 1998), we identified an initial framework for Prime. This study, as an extension of (Shahabi et al., 1998), introduces Prime+ and Prime , and includes a performance study. Parrot (Step 1) can be as simple as a graphical user interface that facilitates explicit definition of the flexibilities by the user, or as complex as a high-level query analyzer with knowledge discovery capabilities (to extract information from the various profiles). In this paper, we do not consider the design of Parrot or the process of query script generation. These issues are part of our future research. Using the query script, Prime defines a search space that consists of all the correct retrieval plans. A retrieval plan is correct if and only if it is consistent with the defined flexibilities and requirements. Prime also defines a cost model to evaluate the different retrieval plans. The retrieval plans are then searched (either exhaustively or by employing heuristics) to find the best plan, depending on the metrics defined by the application. We also consider an extension to Prime, termed Prime+, that uses a simple memory buffering technique, SimB, to alleviate retrieval problems when the system bandwidth becomes fragmented. In addition, we introduce a hierarchical storage system that consists of: a tertiary storage device (e.g., tape libraries), a CM server, and some memory. We describe a pipelining technique that might be utilized by an extended Prime, namely Prime , to improve system performance. Our simulation studies show significant improvement when we compare the system performance of the best retrieval plan with that of the worst and average plans. For example, if latency time (i.e., the time elapsed from when the retrieval plan is submitted until the onset of the display of its first object) is considered as a metric, the best plan found by Prime observes 41% to 92% improvement as compared with the worst plan, and 26% to 89% improvement as compared with the average plans, when SimB is not applied. The rest of this paper is organized as follows. In Section 2, we present a brief overview of the Profile Aware Retrieval Optimizer (Prime). In Section 3, we present an extension of Prime that uses memory buffering (Prime+). In Section 4, we introduce some concepts for extending Prime over a hierarchical storage system (Prime ). Section 5 evaluates our optimizer using a simulation model. Our conclusion and future research directions are contained in Section 6.
[Figure 1 content: the Profile Aware User Query Combiner (Parrot) accepts a user query (e.g., "Show me today's news") and consults the user profile, session profile, and meta-data to generate a Query-Script; the Profile Aware Retrieval Optimizer (Prime) accepts the Query-Script, which contains the user retrieval requirements and flexibilities (ordering, delay, display-quality, and presentation) as a formal definition, and generates a retrieval plan that is optimal for the current CM-Server load; the CM Server then streams the video clips to the user. Example meta-data entries: Sport: USC vs. UCLA football, 90 seconds (MPEG II); Sport: USC vs. Stanford waterpolo, 60 seconds (MPEG I); Business: IBM story, 75 seconds (MPEG II); Business: ATT story, 60 seconds (MPEG I).]
Fig. 1 System Architecture.
2 PROFILE AWARE RETRIEVAL OPTIMIZER (Prime)
The query script received by Prime formally captures the flexibilities and requirements imposed by the user query, user profile, and session profile. It is the responsibility of Prime to determine how this query script should be imposed against the CM server to reduce contention. The number of I/Os required to retrieve the continuous media objects from the CM server is fixed; however, different retrieval plans influence retrieval contention at the server. Prime finds a retrieval plan that minimizes contention at the CM server, in order to improve system performance. This process consists of accepting a query script and then finding the best retrieval plan (among a set of correct retrieval plans) to be scheduled on the CM server. Therefore, the optimizer Prime is concerned with three issues: 1) the search space, 2) the cost model, and 3) a strategy to search for the best retrieval plan. The search space is defined by the query script. The number of correct retrieval plans determines the size of this search space. The cost model is a set of metrics used to evaluate each correct retrieval plan that is being considered. The search strategy explores the search space for the best retrieval plan based on the defined metrics;
see (Shahabi et al., 1998) for details. The CM server utilized by Prime guarantees the uninterrupted display of the continuous media objects. There have been a number of studies describing the design and implementation of such servers, see (Berson et al., 1994; Gemmell et al., 1995). We ignore the detailed architecture of the CM server and conceptualize it as a server bandwidth, termed R_CM. Prime considers four types of flexibilities that might be tolerable by the user/application submitting a request: delay, ordering, presentation-time, and display-quality. To capture the delay flexibility, we define a minimum and a maximum tolerable delay between the finishing time of one object and the start time of the subsequent object (T_Delay^Min and T_Delay^Max). For ordering flexibility, we define three variations: 1) Unordered Object Retrieval (UOR), 2) Suggested Object Retrieval (SOR), and 3) Ordered Object Retrieval (OOR). To illustrate, consider a query script q that references n continuous media objects, q = {o_1, o_2, ..., o_n}. UOR imposes no ordering constraint on the display of the n objects. SOR suggests an ordering for the n objects; however, this ordering is not restrictive, rather it is given with some confidence. We expect a large number of the queries imposed on multimedia applications to be of this type. OOR requires the display of the n objects in a specific order, and it is necessary to satisfy this ordering. To capture the presentation-time flexibility, we define two variables: 1) the presentation length, T_Length, as the total time to display all objects in a request plus the delays between them, and 2) the presentation start-up latency, T_Startup, as the time elapsed from the submission of the request to the start of the presentation.
Table 1 Some terms and their corresponding definitions.
q: A query-script, q, is a formal definition of the user, system, and application flexibilities and requirements. This is the input to Prime.
r(q): Query script release time. It is the time at which the query script is released to the optimizer.
p: A retrieval-plan, p, consists of n tuples <o_j, i>, for all i, j: 1 <= i, j <= n, where o_j is one of the objects to be displayed from the query-script, and i is one of the n possible positions.
s(p): Retrieval plan start time. It is the time at which the first object of the retrieval plan is displayed, s(p) = s(o_j) where Pos(o_j, p) = 1.
f(p): Retrieval plan finish time. It is the time at which the last object of the retrieval plan finishes its display, f(p) = f(o_j) where Pos(o_j, p) = n.
T_Response: The time elapsed from the release time of the query script, r(q), to the finish time of the plan, f(p): T_Response(p) = f(p) - r(q).
T_Latency: The time elapsed from the release time of the query script, r(q), to the start time of the retrieval plan, s(p): T_Latency(p) = s(p) - r(q).
δ_Avg(p): The average delay between the finish time of an object and the start time of the subsequent object, δ_Avg(p) = ( Σ_{i=1}^{n-1} δ_(i,i+1)(p) ) / (n-1).
δ_Var(p): The delay variance, δ_Var(p) = ( Σ_{i=1}^{n-1} (δ_(i,i+1)(p) - δ_Avg(p))^2 ) / (n-1).
δ_(i,i+1)(p): The delay between the completion time of the object at position i and the starting time of the object at position i+1, for all i: 1 <= i <= (n-1).
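To make the timing metrics of Table 1 concrete, the following is a minimal sketch (not from the original paper; the function and argument names are hypothetical) of how T_Response, T_Latency, δ_Avg, and δ_Var could be computed for a scheduled plan:

```python
from statistics import mean

def plan_metrics(release_time, starts, finishes):
    """Compute the Table 1 metrics for a scheduled plan.

    release_time     -- r(q), when the query script reached the optimizer
    starts, finishes -- per-object start/finish times, ordered by plan position
    """
    t_latency = starts[0] - release_time        # T_Latency(p) = s(p) - r(q)
    t_response = finishes[-1] - release_time    # T_Response(p) = f(p) - r(q)
    # delta_(i,i+1): gap between consecutive displays
    gaps = [starts[i + 1] - finishes[i] for i in range(len(starts) - 1)]
    d_avg = mean(gaps) if gaps else 0.0
    d_var = (sum((g - d_avg) ** 2 for g in gaps) / len(gaps)) if gaps else 0.0
    return t_response, t_latency, d_avg, d_var

# Example: r(q) = 0, three clips displayed with small gaps between them
print(plan_metrics(0, starts=[5, 70, 140], finishes=[65, 130, 200]))
```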
Furthermore, we define a minimum presentation length, T_Length^Min, and a maximum presentation length, T_Length^Max, tolerated by the user and the application. We also define a minimum startup latency time, T_Startup^Min, and a maximum startup latency time, T_Startup^Max. To capture the display-quality flexibility, we define a function C that returns a set of m acceptable display bandwidths for an object o, C(o) = {c_1(o), c_2(o), ..., c_m(o)}. We assume that this function returns all display formats that are available on the CM server and that are acceptable to the user.
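As an illustration only (this data structure does not appear in the paper; the field names are hypothetical), a query script carrying these flexibility parameters might be represented as follows:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List

class Ordering(Enum):
    UOR = "unordered"   # no ordering constraint
    SOR = "suggested"   # ordering given with some confidence
    OOR = "ordered"     # ordering must be satisfied

@dataclass
class QueryScript:
    objects: List[str]            # o_1 ... o_n (object identifiers)
    t_delay_min: float            # T_Delay^Min (seconds)
    t_delay_max: float            # T_Delay^Max (seconds)
    ordering: Ordering            # UOR, SOR, or OOR
    t_length_min: float           # T_Length^Min (seconds)
    t_length_max: float           # T_Length^Max (seconds)
    t_startup_min: float          # T_Startup^Min (seconds)
    t_startup_max: float          # T_Startup^Max (seconds)
    # C(o): acceptable display bandwidths per object, as fractions of R_CM
    display_qualities: Dict[str, List[float]] = field(default_factory=dict)

q = QueryScript(
    objects=["news_clip_1", "news_clip_2", "news_clip_3"],
    t_delay_min=0.0, t_delay_max=20.0,
    ordering=Ordering.UOR,
    t_length_min=180.0, t_length_max=400.0,
    t_startup_min=0.0, t_startup_max=60.0,
    display_qualities={"news_clip_1": [0.01], "news_clip_2": [0.01, 0.005]},
)
```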
After accepting a query script from Parrot, Prime may search the different correct retrieval plans permitted by the query script to find the best retrieval schedule. The search space consists of all of the correct retrieval plans, where the correctness of a retrieval plan depends on the display order being consistent with the given confidence threshold; correctness is not affected by the other query script parameters. When scheduling the retrieval plans, we implicitly consider T_Response and T_Latency as the two major optimization metrics. This is because we try to schedule a given retrieval plan as soon as possible, hence minimizing both metrics. It is possible to apply three other metrics: the average delay (δ_Avg), the delay variance (δ_Var), and the confidence level of the plan; see Table 1 for definitions. For each schedule, we calculate all of the metrics, and Prime may choose to optimize for one or more of them. The order by which these metrics are applied may affect the selected retrieval plan and is application dependent. The objective of Prime is to find the best retrieval plan in the search space, where "best" depends on the metrics used. This problem can be shown to be NP-complete by reduction from the bin-packing problem (Garey and Johnson, 1979). Even though the problem is NP-complete, it is possible to perform an exhaustive search when a small number of objects is referenced. However, when the number of referenced objects is large, alternative search strategies have to be employed, such as heuristic search strategies or randomized search strategies. Bin-packing strives to fit a set of variable-size items into a set of fixed-size bins, with the objective of minimizing the number of bins filled. A good heuristic devised for bin-packing is First Fit Decreasing (FFD). In our case, each object can be considered as an item, and the bins are discrete time intervals over the system bandwidth. Note that here, bins have variable sizes as a function of the system load. Minimizing response time is consistent with the bin-packing objective. To adapt FFD to our problem, the main issue is to find a measure for the size of the objects. The size of an object o could be proportional to its bandwidth requirement (c(o)), its display time (l(o)), or its real size (say, in megabytes) as c(o) × l(o). Based on our previous experience (Shahabi, 1996), we choose the size to be c(o) × l(o). This is intuitive, as we are considering the area of the rectangular representation of the object as opposed to its height or
length. This heuristic is termed Largest Object First (LOF). We also study the opposite heuristic: Smallest Object First (SOF). This is because, in our previous experiments (Shahabi, 1996), SOF was found to outperform LOF in some cases. The complexity of each of these heuristics is O(n^2 log n).
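The following is a minimal sketch (our own illustration, not the paper's implementation) of the FFD-style ordering behind LOF and SOF, where an object's size is taken to be c(o) × l(o):

```python
from typing import List, Tuple

# Each object is (name, c_o, l_o): bandwidth as a fraction of R_CM and length in seconds.
Obj = Tuple[str, float, float]

def order_objects(objects: List[Obj], heuristic: str = "LOF") -> List[Obj]:
    """Order objects for scheduling: LOF = largest c(o)*l(o) first, SOF = smallest first."""
    size = lambda o: o[1] * o[2]   # area of the object's rectangular representation
    return sorted(objects, key=size, reverse=(heuristic == "LOF"))

clips = [("football", 0.01, 90), ("waterpolo", 0.01, 60), ("ibm_story", 0.01, 75)]
print(order_objects(clips, "LOF"))   # schedule the largest area first
print(order_objects(clips, "SOF"))   # schedule the smallest area first
```

In the full heuristic, each object in this order would then be placed, first-fit, into the earliest time slot of the server-load profile that can accommodate its bandwidth for its entire length; this placement step accounts for the O(n^2 log n) complexity noted above.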
3 OPTIMIZATION WITH MEMORY BUFFERING (Prime+)
When the server load is moderate to high, it is difficult to find time slots (i.e., rectangles of bandwidth over time) that can satisfy the display requirements of the objects for their entire duration. This is due to the fact that: 1) the available server bandwidth may be less than the display requirement of the objects (i.e., not enough height), and 2) there might be time slots that satisfy the display requirement of the objects but whose length is shorter than the length of the objects (i.e., not enough length). We refer to this problem as the server bandwidth fragmentation problem, and it leads to larger T_Response and T_Latency. Memory buffering can play a role in alleviating the server bandwidth fragmentation problem by allowing the emulation of higher server bandwidths when necessary. There are two ways of applying memory buffering: 1) the Simple Memory Buffering mechanism (SimB), and 2) the Variable Rate Memory Buffering mechanism (VarB). The SimB mechanism treats the system load on the server as a discrete function over time: either there is enough bandwidth to satisfy the minimum bandwidth requirement of the object being scheduled, or there is not. The objective of SimB is to mend two or more time slots such that a larger time slot can be emulated. Hence, when a single time slot cannot accommodate the retrieval of an object, it is possible to apply SimB repeatedly so that two or more time slots are mended to accommodate the display of a single object. The other memory buffering mechanism, VarB, takes a more general approach. Its objective is to use the variable server bandwidth together with memory to emulate a constant bandwidth for the display of the objects. Therefore, it treats the system load as a continuous function over time, and it uses concepts similar to the pipelining technique presented in the next section and in (Ghandeharizadeh et al., 1995). In this paper, we present the SimB mechanism as a simple way to use the available memory at the client side (or at the server side) to reduce the delays and improve system performance. To illustrate the SimB mechanism, assume a system is loaded such that there are two time slots of length 30 seconds, starting at times t=0 and t=40 seconds,
Figure 2(a). Moreover, assume that these time slots can satisfy the bandwidth requirement of object o_x, which is 60 seconds in length. Without memory buffering, it is obvious that o_x cannot be scheduled using the two time slots. However, using the SimB mechanism, it is possible to mend the two 30-second time slots such that a longer 60-second time slot is emulated. That is, the retrieval of the first half of object o_x starts in the first time slot, and the retrieval of the second half of the object takes place in the second time slot, Figure 2(a). To ensure the uninterrupted display of o_x, it is necessary to have 10 seconds of o_x prefetched into memory prior to the start of its display, Figures 2(a) and (b). The amount of memory required to mend any two time slots depends on the bandwidth requirement of the object (c(o_x)) and the length of the gap being mended (T_Gap): MemBuff(SimB) = (c(o_x) × T_Gap × R_CM) / 8 Mbytes. Therefore, if c(o_x) = 0.01, R_CM = 1000 Mbps, and T_Gap = 10 seconds, then MemBuff(SimB) = 12.5 Mbytes. Note that SimB is only applicable if the resulting latency time is tolerable by the user (i.e., consistent with the presentation flexibility). SimB is a simple mechanism that mends two time slots into a longer time slot; in the general case, it is necessary to apply this mechanism repeatedly until the object fits in the emulated time slot. It is the responsibility of the optimizer, Prime+, to apply this technique when necessary to improve system performance. The optimizer takes the server load, a retrieval plan, and the maximum available memory as inputs. Subsequently, it produces a retrieval schedule that does not violate the query script
flexibilities and the maximum memory requirement.
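As a rough check of the arithmetic above (a sketch of our own, not the paper's code; c(o) is the object's bandwidth expressed as a fraction of R_CM), the memory needed to mend a gap, and the worked 12.5-Mbyte example, can be computed as follows:

```python
def simb_memory_mbytes(c_o: float, t_gap_s: float, r_cm_mbps: float) -> float:
    """MemBuff(SimB) = c(o) * T_Gap * R_CM / 8: the data (in Mbytes) that must be
    prefetched so the display is not interrupted while server bandwidth is unavailable."""
    return c_o * t_gap_s * r_cm_mbps / 8.0

# Example from the text: c(o_x) = 0.01, R_CM = 1000 Mbps, T_Gap = 10 s
print(simb_memory_mbytes(0.01, 10.0, 1000.0))   # -> 12.5 Mbytes
```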
4 OPTIMIZATION FOR HIERARCHICAL SYSTEMS (Prime )
The storage organization of systems that support large databases of continuous media documents is expected to be hierarchical, consisting of: a tertiary storage device, a CM server (consisting of clusters of disk drives), and some memory (Maier et al., 1993; Ghandeharizadeh et al., 1995). It is expected that the database resides permanently on the tertiary storage device and that its objects are materialized on the CM server on demand (and deleted from the CM server when its storage capacity is exhausted). A small fraction of a referenced object may be staged in memory to support its display. The reason for expecting hierarchical storage managers is the cost of storage. It is economical to stage the data at the different levels of the hierarchy in the following manner: a small fraction of an object in memory for immediate display, a number of frequently accessed objects on the CM server, and the remaining objects on the tertiary storage device.
[Fig. 2 Object retrieval schedule and display with Simple Memory Buffering (SimB): (a) object retrieval and display; (b) memory requirements (Mbytes) over time when using SimB.]
Depending on the system architecture, there are two alternative hierarchical storage models: 1) the pyramid model, where the tertiary is visible only to the CM server via a fixed-size memory, or 2) the relaxed model, where memory serves as an intermediate staging area between the tertiary storage device, the CM server, and the display station. With the first organization, the data must first be staged on the CM server before it can be displayed. With the second organization, the system may elect to display an object from the tertiary storage device by using the memory as an intermediate staging area. When a request references an object that is not CM-server resident, one approach might materialize the object on the CM server in its entirety before initiating its display. In this case, assuming a zero system load, the latency time of the system is determined by: the time for the tertiary to reposition its read head to the starting address of the referenced object, the bandwidth of the tertiary storage device, and the size of the referenced object. A superior alternative is to use pipelining in order to minimize the latency time. The pipelining technique is described in detail in (Ghandeharizadeh et al., 1995). Briefly, the pipelining mechanism splits an object into s logical slices (S_1, S_2, S_3, ..., S_s) such that the display time of S_1 overlaps the time required to materialize S_2, the display time of S_2 overlaps the time to materialize S_3, and so on. This ensures a continuous display while reducing the latency time, because the system initiates the display of an object once a fraction of it (i.e., S_1) becomes disk resident. When the tertiary bandwidth (i.e., the production rate) is less than the object display bandwidth (i.e., the consumption rate), the time required to materialize an object is more than its display time. However, when the tertiary bandwidth exceeds the bandwidth required to display an object, two alternative approaches can be employed to compensate for the fast production rate: either 1) multiplex the bandwidth of the tertiary among several requests referencing different objects, or 2) use the high tertiary bandwidth to materialize the object onto the CM server in a shorter period. The first approach wastes tertiary bandwidth because
the device is required to reposition its read head multiple times. The second approach utilizes more resources in order to avoid repositioning the tertiary device's read head. It is the responsibility of Prime to apply this technique when retrieving objects from the tertiary. The optimizer may use the pipelining technique to materialize objects in parallel with the display of other objects. We expect that this parallelism and pipelining, in combination with the available flexibilities, will result in substantial system performance improvements. The design and implementation of Prime is part of our future research.
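A minimal sketch of the slice-overlap idea behind pipelining (our own illustration under simplified assumptions: constant tertiary and display bandwidths, no repositioning overhead, sizes in Mbits; this is not the scheme of (Ghandeharizadeh et al., 1995) verbatim): the first slice only needs to be large enough that materialization of the remaining data keeps pace with the display.

```python
def pipeline_first_slice(object_mbits: float, b_display: float, b_tertiary: float):
    """Minimum size of the first slice S_1 (Mbits) and the resulting startup latency (s),
    assuming b_tertiary < b_display (both in Mbps). Once S_1 is disk resident the display
    starts; requiring S_1 + b_tertiary*t >= b_display*t for all t up to the display
    duration guarantees the remaining slices are materialized in time."""
    assert 0 < b_tertiary < b_display
    s1 = object_mbits * (1.0 - b_tertiary / b_display)   # tightest at the end of the display
    latency = s1 / b_tertiary                            # time to materialize S_1
    return s1, latency

# Example: a 90-second, 10 Mbps MPEG-2 clip (900 Mbits) read from a 6 Mbps tertiary device:
# latency drops from 900/6 = 150 s (full materialization) to 60 s with pipelining.
print(pipeline_first_slice(900.0, b_display=10.0, b_tertiary=6.0))   # -> (360.0, 60.0)
```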
5 PERFORMANCE EVALUATION
In this section we report the results of our experiments on UOR. We did not consider SOR because the results of those experiments would have been a subset of those reported for UOR. We implemented a simulation model to: 1) show the possible performance improvements even when only a subset of the flexibilities is considered, and 2) evaluate our proposed heuristics (i.e., LOF and SOF). It is important to note that our intent is not only to show that the optimizer chooses a better retrieval plan as compared to a random pick (i.e., no optimizer). Rather, we intend to show that the margin of improvement (even as compared to the average of the plans) is significant enough to justify the use of such an optimizer.
5.1 SIMULATION MODEL
For the purposes of this evaluation, we assumed a continuous media server with a sustained transfer rate of R_CM = 1000 Mbps. All the objects in the media server are MPEG-2 video clips with a 10 Mbps bandwidth requirement (i.e., c(o_j) = 0.01 for all j). Note that assuming a single media type is to the disadvantage of our optimizer, because it reduces the differences between alternative query plans. The size of the objects is random and varies from 30 seconds to 2 minutes. We fixed T_Delay^Min and T_Delay^Max to 0 and 20 seconds, respectively.
We employed a Poisson distribution for request arrivals to generate variable loads on the server. However, we only report the cases where the average load was moderate (60%-80%). The reason is that for low (or high) system loads, all the query plans of a script demonstrated similarly good (or bad) performance; hence, the choice of a plan by the optimizer did not result in a significant difference in system performance. We expect that a real-world system is usually loaded moderately (i.e., neither under-utilized nor over-utilized). A moderate system load is indeed where the optimizer makes a difference. In a real implementation, Prime can examine the system load and only invoke the optimizer when the load is moderate. We varied the size of the query scripts (n) from 3 to 6 objects. Subsequently, for each script q, we generated the entire search space, with cardinality n!. Each plan of q was then scheduled and the different metrics were measured. We only considered two levels of optimization, with T_Response as the primary metric. As secondary metrics, we considered T_Latency, δ_Avg, and δ_Var. For each script, we recorded the best value of a metric, its worst value, and its average value among all the plans. This was done to measure the improvement of the metric as compared to both the worst case and the average case. Note that the improvement of a metric over the average case is very important. For each script size, we generated 1000 scripts with random-size objects in order to eliminate the possibility of luck. Hence, the improvement over the worst case and the average case was averaged over all 1000 scripts. We compared the results of an exhaustive search strategy and two heuristic search strategies with the worst and average cases, for all the relevant metrics. Moreover, we consider the performance improvements with Prime+, when the SimB mechanism is used and when it is not used. The simulation models were implemented in the C language and executed on HP 9000 Series 700 Model 735 machines, under the HP-UX 9.0 operating system.
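For intuition only (this is not the paper's C simulator; names are hypothetical), exhaustive search over the n! plans of a UOR script amounts to enumerating the permutations of the objects, scheduling each one against the current server load, and keeping the plan with the smallest T_Response, breaking ties on T_Latency:

```python
from itertools import permutations
from typing import Callable, Sequence, Tuple

Obj = Tuple[str, float, float]   # (name, c_o as a fraction of R_CM, length in seconds)

def best_plan(objects: Sequence[Obj],
              schedule: Callable[[Sequence[Obj]], Tuple[float, float]]):
    """Exhaustively enumerate the n! display orders; `schedule` is assumed to place a
    given order onto the current server load and return (T_Response, T_Latency)."""
    best = None
    for plan in permutations(objects):
        t_response, t_latency = schedule(plan)
        key = (t_response, t_latency)    # primary: response time, secondary: latency
        if best is None or key < best[0]:
            best = (key, plan)
    (t_response, t_latency), plan = best
    return plan, t_response, t_latency
```

With n between 3 and 6 objects, as in the experiments, at most 6! = 720 plans are evaluated per script; for larger n, the LOF/SOF heuristics of Section 2 would replace the enumeration.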
5.2 EXPERIMENTAL RESULTS
Table 2 summarizes the results of our experiments when the SimB mechanism is not applied (i.e., Prime). Table 3 summarizes the results of our experiments when the SimB mechanism is applied with 4 Mbytes of memory for each presentation (i.e., Prime+). We follow a similar format in both tables; however, we do not compare the results of the two tables with each other, because it is obvious that Prime+ outperforms Prime. The first column is the size of each query-script. The second column shows the improvement in response time as the primary metric. The first and second numbers in each entry are the percentages of improvement over the worst and the average case, respectively.
Observe that the duration of the response time is dominated by the constant length of the objects; therefore, even small improvements in response time contribute significantly to the overall system performance. For example, the improvements over the worst case range from 36%-52% with Prime and 30%-43% with Prime+ (see Note 1). The third column shows the improvement of T_Latency as the secondary metric. Due to the close relation between response time and latency time, the improvement in latency time is also significant. For example, the improvements over the worst case range from 42%-93% with Prime and 46%-54% with Prime+. Observe that minimizing T_Latency and T_Response are non-conflicting optimization objectives. From the poor performance of δ_Avg and δ_Var (see the fourth and fifth columns), however, we can conclude that minimizing δ_Avg and δ_Var conflicts with minimizing T_Response. To illustrate, one can push the scheduling of a plan so late that the delay between the clips becomes zero. This results in the best values for δ_Avg and δ_Var, while the observed latency time, and hence the response time, will be unacceptable. Finally, as we expected, LOF is superior to SOF with Prime. This is due to: 1) the similarity of our problem to bin-packing, and 2) the careful adaptation of the FFD heuristic (devised for bin-packing) by choosing c(o) × l(o) as the size of object o. The performance of LOF is obviously worse than the optimal, but above the average. As the number of objects in the script grows, the performance of SOF approaches that of LOF. Unfortunately, due to the high complexity, we cannot compute average and worst case response times for large n to confirm whether this trend continues. An interesting observation is that with Prime+, SOF is superior to LOF. This is because SimB mends fragmented time slots to emulate larger slots for object retrieval. However, when memory is limited (which is the case in our simulations), the maximum size of the emulated slot is determined by the amount of available memory. Hence, the possibility of emulating smaller slots is higher than that of emulating larger ones. SOF can benefit from small emulated slots by scheduling small objects first. In contrast, LOF cannot utilize the small emulated slots because it stubbornly tries to schedule the largest objects first.
6 CONCLUSION
In this paper, we described two extensions to Prime: Prime+ and Prime . Prime+ uses main memory to reduce bandwidth fragmentation, and Prime  tries to overlap object retrieval from the tertiary with the display of objects.
Note 1: the reason that the margin of improvement with Prime+ is lower than that of Prime is that SimB was applied for all retrieval plans.
Table 2 Prime simulation results. Each entry shows the improvement (or degradation) over the worst/average case; the first four metric columns report the two-level optimization (T_Response primary, the rest secondary), and the last two columns report the T_Response improvement achieved by the SOF and LOF heuristics.

No. of objects n | T_Response | T_Latency | δ_Avg | δ_Var | T_Response, SOF | T_Response, LOF
3 | 36%/22% | 93%/87% | 0%/(-91%) | 4%/(-74%) | 0%/(-22%) | 36%/22%
4 | 52%/33% | 75%/57% | 11%/(-56%) | 27%/(-20%) | 11%/(-26%) | 32%/4%
5 | 47%/27% | 57%/36% | 27%/(-45%) | 36%/(-11%) | 24%/(-4%) | 34%/10%
6 | 36%/22% | 42%/26% | 28%/(-74%) | 38%/(-18%) | 22%/4% | 24%/7%
Table 3 Prime+ simulation results (using 4 Mbytes of memory per presentation). Each entry shows the improvement (or degradation) over the worst/average case; the first four metric columns report the two-level optimization (T_Response primary, the rest secondary), and the last two columns report the T_Response improvement achieved by the SOF and LOF heuristics.

No. of objects n | T_Response | T_Latency | δ_Avg | δ_Var | T_Response, SOF | T_Response, LOF
3 | 30%/16% | 47%/29% | 2%/(-63%) | 8%/(-95%) | 30%/13% | 5%/(-14%)
4 | 43%/24% | 54%/34% | 4%/(-55%) | 1%/(-97%) | 26%/2% | 11%/(-18%)
5 | 40%/24% | 48%/31% | 4%/(-56%) | 3%/(-97%) | 24%/4% | 17%/(-5%)
6 | 38%/22% | 46%/30% | 8%/(-82%) | 7%/(-90%) | 20%/2% | 18%/0%
To verify our optimizer, we conducted a simulation experiment to study Prime and Prime+. The main observations are as follows: 1) invoking the optimizer only makes a difference when the system load is moderate (i.e., 60%-80% of capacity), 2) response time and latency time are consistent metrics, and both were improved significantly by our optimizer, and 3) SOF can benefit from the mending procedure applied by SimB (Prime+), even when the available memory is restricted; however, when no memory is assigned for optimization (i.e., Prime), LOF outperforms SOF. As part of our future study, we plan to consider a number of extensions. First, Prime strives to optimize a single query script (i.e., intra-presentation optimization). We intend to study both inter-presentation and global-presentation optimizations. Second, in this study we investigated a simple memory buffering technique, SimB. As part of our future work, we will consider more elaborate memory buffering techniques for Prime+, such as VarB. Moreover, we will consider the performance gains of Prime  for a hierarchical storage system that utilizes the flexibilities together with the defined pipelining technique. Finally, we will investigate the design of Parrot.
REFERENCES
Allen, J. F. (1983). Maintaining Knowledge about Temporal Intervals. Communications of the ACM, 26(11):832-843.
Berson, S., Ghandeharizadeh, S., Muntz, R., and Ju, X. (1994). Staggered Striping in Multimedia Information Systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data.
Chaudhuri, S., Ghandeharizadeh, S., and Shahabi, C. (1995). Avoiding Retrieval Contention for Composite Multimedia Objects. In Proceedings of the VLDB Conference.
Garey, M. and Johnson, D. (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness, pages 236-242. W.H. Freeman and Company, New York.
Gemmell, D. J., Vin, H. M., Kandlur, D. D., Rangan, P. V., and Rowe, L. A. (1995). Multimedia Storage Servers: A Tutorial. IEEE Computer, 28(5).
Ghandeharizadeh, S., Dashti, A. E., and Shahabi, C. (1995). A Pipelining Mechanism to Minimize the Latency Time in Hierarchical Multimedia Storage Managers. Computer Communications, 18(3).
Keeton, K. and Katz, R. H. (1995). Evaluating Video Layout Strategies for a High-Performance Storage Server. ACM Multimedia Systems, 3(2).
Maier, D., Walpole, J., and Staehli, R. (1993). Storage System Architectures for Continuous Media Data. In Proceedings of the Foundations of Data Organization and Algorithms (FODO) Conference.
Shahabi, C. (1996). Scheduling the Retrieval of Continuous Media Objects. PhD thesis, University of Southern California.
Shahabi, C., Dashti, A. E., and Ghandeharizadeh, S. (1998). Profile Aware Retrieval Optimizer for Continuous Media. In Proceedings of the World Automation Congress (to appear).
Shahabi, C. and Ghandeharizadeh, S. (1995). Continuous Display of Presentations Sharing Clips. ACM Multimedia Systems, 3(2).