On the Complexity of Resource Scheduling for Coordinated Display of Structured Presentations

Martha L. Escobar-Molano and Shahram Ghandeharizadeh
Computer Science Department
University of Southern California

Abstract
With the structured approach to representing video clips, a presentation consists of a collection of background objects and actors (3-D representations) constrained using spatial and temporal constructs along with rendering features (e.g., shading, audience's view point). While the spatial constraints define the position of displayed objects on the screen, the temporal constraints describe when the objects are rendered. As compared with an alternative approach (termed stream-based) that conceptualizes a video clip as a sequence of frames, the structured approach provides for both re-usability of objects in other presentations and effective query processing techniques for retrieval of relevant data. The display of a structured presentation is termed coordinated when the display of objects respects the pre-specified temporal and spatial constraints. Otherwise, the display might suffer from failures that translate into meaningless scenarios. For example, a chase scene between a dinosaur and a jeep becomes meaningless if the system fails to render the dinosaur when displaying the scene. Assuming a multi-disk hardware platform configured with a fixed amount of memory, this study shows the following: (1) the computation of a resource schedule that supports a coordinated display and yields the minimum latency is an NP-Hard problem, and (2) given a system load, the computation of a schedule to change the placement of data across disk drives in minimum time is an NP-Hard problem.

This research was supported in part by NSF grants IRI-9222926, IRI-9258362 (NYI award), and CDA-9216321, and an unrestricted cash/equipment gift from Hewlett-Packard.
1 Introduction

One may represent video using two alternative approaches: stream-based and structured [Gha95]. With the stream-based approach, a video clip consists of a sequence of frames that are displayed at a pre-specified rate (e.g., 30 frames per second) to give the viewer the perception of motion. With the structured approach [EMG95, EM95], a video clip is represented as a collection of objects (e.g., 3-D representations of a dinosaur) with spatial and temporal constraints (e.g., positions of the dinosaur and the time of their appearances) along with their rendering features (e.g., light intensity, view point). Presently, the structured approach is used to produce animated sequences. For example, "Toy Story" [Bat95] and "Reboot" [Ber94] are animations generated using the structured approach.

Structured presentations provide for both re-usability of information and development of effective query processing techniques. They enable a user to extract a character (e.g., a dinosaur) or a motion path (e.g., the trajectory and timing of the dinosaur's motion) from one presentation and re-use it in another. In addition, one can devise algorithms to support processing queries that reason about the temporal and spatial information of a structured presentation. To illustrate, consider the animation "The Lion King". Assuming it was represented using the structured approach, to retrieve the scene where Simba finds his father (Mufasa) dead, a user can pose the following query: select scenes that contain both Simba and Mufasa such that Mufasa is static while Simba is moving. The system locates the relevant data by analyzing the temporal and spatial constraints imposed on the 3-D representations of Mufasa and Simba. With structured presentations, the database contains 3-D representations of different objects, spatial and temporal constraints, and rendering features.

A challenging task for the system is to ensure a coordinated display, where the system renders the objects of a presentation memory resident during the periods defined by the temporal constraints. If the temporal constraints associated with the presentation are not satisfied, then the presentation might suffer from errors. To illustrate, consider the sequence of postures $p_1, \ldots, p_n$ that provide the illusion that a dinosaur is walking. These postures might be a collection of persistent delta updates on the dinosaur's 3-D representation changing its facial expressions (morphing) and moving its body along a curve as a function of time [TT90a, TT90b, Lev79]. If the system fails to render the original 3-D object, the deltas will yield a partially missing character with a changing face and moving body parts.

A resource schedule that supports a coordinated display consists of object retrievals, object migrations and object replications. Once the display starts, the schedule must render objects memory resident during the periods defined by the temporal constraints. To satisfy these temporal constraints, the system can either retrieve the objects immediately before their display, pre-fetch them at an earlier time and keep them memory resident until they are displayed, or manipulate the placement of data to retrieve them from disks different from their original location. There are two alternatives to manipulate the placement of data: migration and replication. With migration, the system removes the original copy of the object, while with replication the system keeps both copies. The focus of this study is the computation of a resource schedule that satisfies the temporal constraints associated with a presentation and yields the minimum latency.

This computation (motivated here by structured presentations) is fundamental for a database management system that supports multimedia applications such as structured presentations, coordinated use of multiple streams in multimedia documents, and interactive TV games. This study shows that this computation is NP-Hard.
Term               Definition
$m$                Number of time intervals in the presentation
$t$                Duration of a time interval
Time interval $i$  Period $[i, i+1]$
Instant $i$        The beginning of time interval $i$
$S_i$              Pages in memory at instant $i$
$P_i$              Pages containing objects displayed during interval $i$
$P_i^d$            Pages in $P_i$ that reside in $d$
$F_i$              Pages retrieved from disk onto memory during time interval $i$
$F_i^d$            Pages retrieved from disk $d$ onto memory during time interval $i$
$U_i^d$            Pages written to disk $d$ during time interval $i$
$K_i$              Pages discarded from memory during time interval $i$
$R_j^a$            Read page $a$ from disk $j$
$W_j^a$            Write page $a$ to disk $j$
$D$                Disk drives in the system
$B$                Maximum number of pages read by a drive during a time interval
$B_i^d$            Disk bandwidth available at drive $d$ during interval $i$
$C$                Number of memory frames in the system
$n$                Number of clauses
$k$                Number of variables
$R$                Set of replications to be scheduled
$[0, N]$           Period when $R$ must be scheduled
$A$                System load during $[0, N]$
$p$                Start up latency in time intervals
Table 1: List of terms used in this paper and their definitions

In Section 2, we present the formal statement of the problem. Section 3 describes related studies. Section 4 shows that given a system load, the computation of a schedule to change the placement of data across disk drives in minimum time is an NP-Hard problem. Section 5 proves that the computation of a resource schedule that supports a coordinated display with the minimum latency is an NP-Hard problem. In Section 6, we present conclusions and future research directions.

2 Statement of the Problem

Our target platform for displaying structured presentations consists of $D$ homogeneous disks and a fixed amount of memory. We assume that a disk page is the unit of transfer between a disk drive and memory. The $D$ disk drives may retrieve $D$ pages into $D$ different memory frames simultaneously. An object might be either smaller or larger than a disk page. When an object is larger than a disk page, it is represented as a collection of pages. We discretize time into fixed-sized units, termed time intervals. The duration of each time interval is denoted as $t$. The beginning of a time interval $i$ is termed time instant $i$ (Figure 1). When a user requests a presentation, the system has advance knowledge of the identity of pages that should be memory resident at specific times to support a coordinated display. This schedule is termed a display schedule:
Definition: A display schedule is a sequence $\{P_0, \ldots, P_{m-1}\}$ of sets of disk pages, where $m$ is the duration of the presentation in time intervals, and $P_i$ is the set of pages displayed during interval $i$.
A resource schedule also depends on the placement of data across disk drives. An unbalanced placement of pages across the disk drives increases memory requirements. It forces the system to use only the aggregate bandwidth of the disks containing referenced pages (instead of the aggregate bandwidth of all disks). This under-utilization of the disk bandwidth increases the number of pre-fetched pages as compared with a system that maintains a balanced placement of data. The increase in the number of pre-fetched pages might raise the memory requirement to the point where it is impossible to display the presentation with the memory available to the system. An alternative to pre-fetching is to either replicate or migrate pages before they are referenced so that the pages referenced simultaneously are evenly distributed across the disk drives.

[Figure 1: Time interval and time instant. A time axis with instants $-p, \ldots, 0, 1, \ldots, i, i+1, \ldots, m$ and memory states $S_{-p}$, $S_0$, $S_i$, $S_m$; the request arrives at instant $-p$, the display starts at instant 0 and ends at instant $m$.]
Definition: A placement of data maps a page identifier and a time interval into one or more disk drives.

The state of memory (i.e., pages occupying memory frames) at each instant $i$, denoted as $S_i$, can be defined in terms of pages swapped out of memory ($K_i$), written to disk ($U_i^d$) and those retrieved from different disks ($F_i^d$):

Definition: Given a system with $D$ disks, the state of memory at each instant $i$ is defined as:

$$S_i = \left(S_{i-1} - K_{i-1} - (U_{i-1}^0 \cup \ldots \cup U_{i-1}^{D-1})\right) \cup \left(F_{i-1}^0 \cup \ldots \cup F_{i-1}^{D-1}\right)$$

A resource scheduler consumes a display schedule, a system configuration $(B, C, D)$ and a placement of data $P$ to compute a schedule of page retrievals, replications and migrations that satisfy the temporal constraints dictated by the display schedule.
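As a minimal sketch of the recurrence for $S_i$, the following Python function applies one step of the definition. The function name and the use of Python sets and lists (one set per disk) are illustrative assumptions, not part of the original formulation.

```python
def next_memory_state(s_prev, discarded, written, fetched):
    """Compute S_i from S_{i-1}.

    s_prev    -- S_{i-1}, the set of pages in memory at instant i-1
    discarded -- K_{i-1}, pages swapped out during interval i-1
    written   -- list of sets U^d_{i-1}, one per disk d (hypothetical layout)
    fetched   -- list of sets F^d_{i-1}, one per disk d (hypothetical layout)
    """
    all_written = set().union(*written)   # U^0 ∪ ... ∪ U^{D-1}
    all_fetched = set().union(*fetched)   # F^0 ∪ ... ∪ F^{D-1}
    return (set(s_prev) - set(discarded) - all_written) | all_fetched
```

For instance, with two disks, discarding page a, writing page b to the second disk, and fetching page c from the first disk turns the state {a, b} into {c}.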
Definition: Consider a system with $C$ memory frames, $D$ drives with disk bandwidth that allows each disk to retrieve $B$ pages during a time interval, an initial state of memory $S_{-p}$, and an initial placement of data $P$. A resource schedule consists of $p + m$ time intervals: $m$ of these overlap with the display, and $p$ of them either pre-fetch pages into memory or modify the placement of data across the disks in preparation for the display. In essence, $p$ denotes the incurred latency. Associated with each time interval $i$ are: (1) a collection of pages retrieved from each of the $D$ disks during time interval $i$, denoted as $F_i^0, \ldots, F_i^{D-1}$, (2) a collection of pages written to each of the $D$ disks during time interval $i$, denoted as $U_i^0, \ldots, U_i^{D-1}$, (3) a collection of pages swapped out of memory to accommodate these retrievals, denoted as $K_i$. Furthermore, the retrieved, written, and swapped pages are subject to the following restrictions:
(i) Once the display starts, the set of pages in memory at each instant $i$ includes those required by the display schedule: For each $i \in [0, m-1]$, $P_i \subseteq S_i$ and $P_i \subseteq S_{i+1}$.

(ii) The number of pages retrieved and written to a disk during a time interval does not exceed $B$: For each $i \in [-p, m-1]$ and each $d \in [0, D-1]$, $|F_i^d| + |U_i^d| \le B$.

(iii) The number of memory resident pages at each time instant does not exceed the number of available memory frames: For each $i \in [-p, m]$, $|S_i| \le C$.

(iv) The retrievals respect the placement of data: For each page $a$, interval $i$, and disk $d$: $a \in F_i^d$ implies that $d \in P'(a, i)$, where $P'$ is the placement of data resulting from updating $P$ with migrations and replications scheduled before interval $i$.

If there is no manipulation of the placement of data (i.e., for each time interval $i$ and drive $d$, $U_i^d$ is empty), then we have a retrieval schedule. This study demonstrates that the computation of a resource schedule that yields a minimum latency and supports a coordinated display of $\{P_0, \ldots, P_{m-1}\}$ in a system configuration $(B, C, D)$ assuming an initial placement of data $P$ is NP-Hard.

[Figure 2: (a) Retrieval Schedule. (b) Resource Schedule. Each panel shows, for time instants -4 through 3, the disk reads and writes issued during each time interval and the pages in memory at each instant, with the request arrival, display start, and display end marked.]

To illustrate these concepts, consider a display schedule that consists of three time intervals: $P_0 = \{a, b\}$, $P_1 = \{c, d\}$, and $P_2 = \{e, f\}$. Assume that the system consists of four disks ($D = 4$), each with the bandwidth to retrieve one disk page during a time interval ($B = 1$). Assuming that all the referenced pages reside on disk one, Figure 2(a) shows a retrieval schedule that supports a coordinated display. $R_1^a$ denotes that disk page $a$ is read from disk one. In this figure, a negative time instant corresponds to page retrievals performed prior to the display. A page might either be retrieved during the time interval prior to its display (e.g., $f$) or pre-fetched at an earlier time interval (e.g., $a$). Pre-fetching increases the memory requirements of the system.
For example, 5 frames of memory are allocated at instant one ($a, b, c, d, e$) while the display schedule dictates that only four should be allocated ($a, b, c, d$). The other page, $e$, is pre-fetched for later use and increases the memory requirements of the system.
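The retrieval schedule of Figure 2(a) can be checked against restrictions (i) to (iii) with a short simulation. The sketch below is illustrative only: the function names, the dictionary layout, the choice of which pages are discarded in intervals 1 and 2, and the values tried for $C$ are assumptions, since the text does not state them.

```python
def memory_states(p, m, fetched, discarded):
    """Return {i: S_i} for i in [-p, m] for a retrieval schedule (no writes)."""
    states = {-p: set()}
    for i in range(-p, m):
        arrived = set().union(*fetched.get(i, {}).values())
        states[i + 1] = (states[i] - discarded.get(i, set())) | arrived
    return states

def satisfies_restrictions(display, states, fetched, B, C):
    m = len(display)
    # (i) displayed pages are resident at both ends of their interval
    covers = all(display[i] <= states[i] and display[i] <= states[i + 1]
                 for i in range(m))
    # (ii) at most B pages per disk per interval (no writes here)
    bandwidth_ok = all(len(pages) <= B
                       for per_disk in fetched.values()
                       for pages in per_disk.values())
    # (iii) at most C resident pages at every instant
    memory_ok = all(len(s) <= C for s in states.values())
    return covers and bandwidth_ok and memory_ok

display = [{"a", "b"}, {"c", "d"}, {"e", "f"}]           # P_0, P_1, P_2
fetched = {-4: {1: {"a"}}, -3: {1: {"b"}}, -2: {1: {"c"}},
           -1: {1: {"d"}}, 0: {1: {"e"}}, 1: {1: {"f"}}}  # all reads from disk one
discarded = {1: {"a", "b"}, 2: {"c", "d"}}               # assumed K_1 and K_2
states = memory_states(4, 3, fetched, discarded)
```

Under these assumptions the state at instant one holds five pages, so the schedule passes the check with five memory frames and fails it with four.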
As illustrated by this example, an unbalanced schedule of references to disks might result in the formation of bottleneck disks that require the system to pre-fetch pages while other disks remain idle. In our example, while the bandwidth of four disks could accommodate the retrieval of four pages, the system was forced to pre-fetch pages because they all reside on disk one. The scheduler may construct resource schedules that utilize the idle disk bandwidth in order to minimize the number of pre-fetched pages. Figure 2(b) shows one such schedule. With this schedule, the system reads page $e$ from disk one during time interval -4 and replicates or migrates it to disk two (denoted as $W_2^e$) during time interval -3. This allows the system to free the memory frame occupied by $e$ at time instant -2 and to utilize disk two to retrieve $e$ during time interval one to satisfy the display schedule. With this schedule, only 4 memory frames are required at instant one ($a, b, c, d$).

The distinction between replications and migrations in the resource schedule is the availability of migrated and replicated pages afterwards. When a page $a$ is replicated from disk $d_1$ to $d_2$, the system can retrieve $a$ from either $d_1$ or $d_2$ after the replication is complete. On the other hand, if $a$ is migrated instead, then the system must retrieve $a$ from $d_2$. The purpose of replications and migrations in a resource schedule is to facilitate the retrieval of a page when it is referenced by the display; otherwise the system does wasteful work. Therefore, at least one of the page references in the display must retrieve the page from the new location. The distinction between replications and migrations vanishes when a page is referenced only during one continuous period in the display, because the new placement of the page is irrelevant for the display after that period. The display schedules considered in the proofs presented in this study reference a page only during one continuous period.
Therefore, the distinction between replications and migrations is not needed. Henceforth, we refer to both of them as replications. The scheduling of replications depends on the system load, namely disk bandwidth and memory, at each time interval.
Definition: The system load for a period $[0, N]$ specifies the availability of system resources as follows: For each time interval $i \in [0, N-1]$: (1) Disk bandwidth available at each drive during interval $i$: $B_i^0, \ldots, B_i^{D-1}$. (2) Page frames available during interval $i$: $M_i$.

A replication consists of a disk page, a source drive and a set of alternative drives to place the replica: $(a, source \to \{target_1, \ldots, target_n\})$. The execution of the replication can utilize intermediate drives. For example, to execute the replication $(a, 5 \to \{6\})$, the system might read $a$ from disk 5 at $t_1$ and write it to disk 1, then read it from disk 1 and write it to disk 6 at $t_2$. The advantage of using disk 1 as an intermediate stage is that it might reduce the memory requirements. In the absence of disk bandwidth for drive 6 between $t_1$ and $t_2$, the system is otherwise forced to stage $a$ in memory during $[t_1, t_2]$. When disk 1 is used as an intermediate stage, the system does not have to stage $a$ in memory between the time interval when $a$ is written to disk 1 and the time interval when it is read from disk 1. The identity of the disk page is irrelevant for the proofs; we thus omit it henceforth.
Definition: Consider a collection of replications $R = \{(source_0 \to \{target^0_1, \ldots, target^0_{n_0}\}), \ldots, (source_r \to \{target^r_1, \ldots, target^r_{n_r}\})\}$ and a system load $A$ for a period $[0, N]$. A schedule for replications $R$ on $A$ maps each replication $(source_i \to \{target^i_1, \ldots, target^i_{n_i}\})$ into a sequence

$$\{t^i_1, (d^i_1, [t^i_2, t^i_3]), \ldots, (d^i_{k_i-1}, [t^i_{2k_i-2}, t^i_{2k_i-1}]), (target^i_p, t^i_{2k_i})\}$$

(Figure 3), such that:

(i) The replications are scheduled within the period $[0, N]$ and the reads and writes are scheduled in the right order: For each $j \in [1, 2k_i]$ and $l \in [1, 2k_i]$, $0 \le t^i_j \le N$, and $j < l$ implies $t^i_j < t^i_l$.

(ii) The page is written to one of the target drives: $p \in [1, n_i]$.

(iii) There is disk bandwidth available in $source_i$ at interval $t^i_1$ and in $target^i_p$ at interval $t^i_{2k_i}$ to read and to write the page, respectively.

(iv) For each $j \in [1, k_i - 1]$, there is bandwidth available in disk $d^i_j$ to write the page at interval $t^i_{2j}$ and to read the page at interval $t^i_{2j+1}$.

(v) There is memory available to have the page memory resident during the following periods: $(t^i_1, t^i_2), \ldots, (t^i_{2k_i-1}, t^i_{2k_i})$.

(vi) The intermediate drives are different from the target drives, otherwise the system would be doing wasteful work: For each $j \in [1, k_i - 1]$, $d^i_j \notin \{target^i_1, \ldots, target^i_{n_i}\}$.

[Figure 3: Schedule of a replication. A time axis from 0 to $N$ showing the read of the page from $source_i$ at $t^i_1$, the write to drive $d^i_1$ at $t^i_2$, the read from drive $d^i_1$ at $t^i_3$, the write to drive $d^i_2$, and so on until the write to drive $target^i_p$ at $t^i_{2k_i}$, with the periods during which the page is in memory marked.]
This study demonstrates that the computation of a schedule that performs replications $R$ in minimum time based on a pre-specified system load $A$ is NP-Hard.
3 Related Work

Several researchers have studied the complexity of scheduling problems [GJ75, EIS76, GJS76]. Their studies assume a pre-defined number of jobs with specific resource requirements and duration. Conversely, in a resource schedule that supports a coordinated display, the duration and resource requirements of the jobs (render an object memory resident during a specific time period) are not pre-defined. To have an object memory resident, the system might either retrieve the object directly from the disk containing it or manipulate the placement of data so that the object can be retrieved from another disk. The duration, memory requirement, and disk bandwidth requirement of the manipulation of the placement of data are not pre-defined. A replication or migration might take several steps to reach its destination. For instance, consider the migration of an object in disk $d_1$ to disk $d_3$. The system might either migrate the object directly from $d_1$ to $d_3$, or migrate the object from disk $d_1$ to $d_2$, and then from $d_2$ to its final destination $d_3$. Furthermore, the object must be memory resident between consecutive reads and writes in a migration or replication (e.g., between 'read from $d_1$' and 'write to $d_2$'). These periods, the disk drives used, and the number of steps in the migration or replication are not pre-defined.

For a single-disk architecture, there is an optimal resource schedule (greedy) for coordinated display of structured presentations [EMGIng]. This optimal schedule minimizes both the memory requirement at each instant and the latency, and can be computed in time $O(n \lg n)$. In a multi-disk architecture, this optimal schedule can be applied as follows: Given a display schedule and an initial placement of data without replicas (i.e., there is only one instance of each page), the projection $\{P_0^d, \ldots, P_{m-1}^d\}$ of the display schedule on each disk drive $d$ is defined by $P_i^d = \{a \mid a \in P_i \wedge a \text{ resides in } d\}$. The union of the retrieval schedules computed by greedy on each projection $\{P_0^d, \ldots, P_{m-1}^d\}$ yields a final retrieval schedule for the display. However, this retrieval schedule does not change the placement of data across disk drives, and manipulating the placement of data might reduce the memory requirement and the latency.
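The projection $P_i^d$ is straightforward to compute when each page has a single copy. The following sketch assumes a hypothetical `placement` dictionary mapping each page to the one disk that holds it; the greedy single-disk scheduler of [EMGIng] is not reproduced here.

```python
def project_display_schedule(display, placement, disks):
    """Return {d: [P_0^d, ..., P_{m-1}^d]} for every disk d.

    display   -- list of sets of pages, P_0 ... P_{m-1}
    placement -- dict mapping each page to the single disk holding it
                 (assumes no replicas)
    disks     -- iterable of disk identifiers
    """
    return {d: [{a for a in pages if placement[a] == d} for pages in display]
            for d in disks}

display = [{"a", "b"}, {"c", "d"}, {"e", "f"}]
placement = {page: 1 for page in "abcdef"}   # every page on disk one
projection = project_display_schedule(display, placement, [1, 2, 3, 4])
```

With every page on disk one, the projection on disk one equals the display schedule and the projections on the other three disks are empty, which is the bottleneck situation of the earlier example.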
4 Replication Scheduling

This section demonstrates that the computation of a schedule that performs replications $R$ in minimum time based on a pre-specified system load $A$ is NP-Hard. We first prove that deciding whether there is a schedule for replications $R$ on a system load $A$ defined over a period $[0, N]$ is NP-Complete. To this end, we introduce a polynomial-time algorithm SAT2RepSc that transforms any instance $C_1, \ldots, C_n; v_1, \ldots, v_k$ of SAT into an instance $N, A, R$ of the replications schedule problem. An instance of SAT is defined as a collection $\{C_1, \ldots, C_n\}$ of $n$ clauses over a set $\{v_1, \ldots, v_k\}$ of $k$ variables. The problem is deciding whether there is a variable assignment that makes all clauses true. Without loss of generality, assume that no clause in the SAT instance contains both disjuncts $v_i$ and $\neg v_i$ for some variable $v_i$ (if this is the case, remove such clauses because they are true for any truth assignment).
SAT2RepSc:

Input: $C_1, \ldots, C_n; v_1, \ldots, v_k$. Output: $R, A, N$.

N: Let $N$ be $(2n + 2) \cdot 2k$.

A: Let $A$ be defined as in Figure 4. Thick time instants denote that the system has 0 memory frames available at that instant: for $i \in \{2, 4, 6, \ldots, N\}$, $M_i = 0$. The system has at least one memory frame available at instant 0: $M_0 > 0$. Thin time instants denote that the system has 1 memory frame available at that instant: for $i \in \{1, 3, 5, \ldots, N-1\}$, $M_i = 1$.

The labels on time intervals denote disk drives with available bandwidth during the interval. If a label has a superscript $+$, then the disk has bandwidth available for the retrieval/write of at least one page during the interval. Otherwise, the disk has bandwidth available for only one page retrieval/write during the interval. The expressions $v_i \in C_j$ and $\neg v_i \in C_j$ denote that $v_i$ and $\neg v_i$ are disjuncts of $C_j$, respectively. To illustrate, consider time interval 2: there is disk bandwidth available in drives $w_3$ and $d_{11}$. If $v_1 \in C_2$, then disk $d_2$ also has bandwidth available during interval 2. The other disk drives do not have bandwidth available during interval 2. The system load for interval 2 is as follows: $M_2 = 0$; for each $j \notin \{w_3, d_{11}, d_2\}$, $B_2^j = 0$; $B_2^{w_3} > 0$; $B_2^{d_{11}} = 1$; if $v_1$ is a literal in $C_2$ then $B_2^{d_2} = 1$, otherwise $B_2^{d_2} = 0$.
R: Let $R$ be defined as follows: For each variable $v_i$, $s_i \to \{t_i, u_i\}$ is a replication in $R$. For each clause $C_j$, $d_j \to \{d_{ji} \mid v_i \in C_j\} \cup \{e_{ji} \mid \neg v_i \in C_j\}$ is a replication in $R$.

To illustrate the transformation, consider the example in Figure 5. The patterns in Figure 5(c) are used to denote the alternatives to schedule replications associated to variables and clauses. Note that the drives $w_j$ cannot participate in any schedule because they are neither a source nor a target. Moreover, they cannot be an intermediate drive because they have bandwidth available only during one time interval. Schedules of replications compete with each other for disk bandwidth and memory. For example, $\{0, (d_{11}, 1)\}$ (an alternative to schedule the replication associated to $C_1$) competes with $\{0, (d_{11}, [1, 2]), (d_{21}, [3, 4]), (t_1, 5)\}$ (an alternative to schedule the replication associated to $v_1$) for the bandwidth of $d_{11}$ at interval 1 and for memory at instant 1. Intuitively, a variable assignment that makes $v_1$ true is equivalent to selecting the replication denoted by the line under $\neg v_1$ with the pattern of $v_1$ (Figure 5(c)) as the schedule for the replication associated to $v_1$. Then, the replication denoted by the line under $v_1$ with the pattern of $C_1$ (Figure 5(c)) can be selected as the schedule for the replication associated to $C_1$ (they would compete neither for the disk bandwidth of drive $d_{11}$ nor for memory at instant 1).

It is easy to see that SAT2RepSc can be performed in polynomial time. SAT2RepSc defines a resource availability $A$ and a collection $R$ of replications so that the possible schedules on $A$ for replications in $R$ follow a specific pattern. For the case of replications associated to variables, there are only two possible schedules on $A$ per variable.
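The parts of SAT2RepSc that do not depend on Figure 4, namely $N$ and $R$, can be sketched in a few lines. The encoding below is an assumption made for illustration: a clause is a list of signed integers (`+i` for $v_i$, `-i` for $\neg v_i$), and drives are tuples such as `("s", i)` for $s_i$ and `("d", j, i)` for $d_{ji}$. The system load $A$ is omitted because it is specified by the figure.

```python
def sat_to_replications(clauses, k):
    """Return (N, R) for a SAT instance with len(clauses) clauses and k variables.

    clauses -- list of clauses; each clause is a list of signed integers,
               +i standing for v_i and -i for ¬v_i (hypothetical encoding)
    R       -- list of (source, set_of_targets) pairs
    """
    n = len(clauses)
    N = (2 * n + 2) * 2 * k
    R = []
    # One replication s_i -> {t_i, u_i} per variable v_i.
    for i in range(1, k + 1):
        R.append((("s", i), {("t", i), ("u", i)}))
    # One replication d_j -> {d_ji | v_i in C_j} ∪ {e_ji | ¬v_i in C_j} per clause.
    for j, clause in enumerate(clauses, start=1):
        targets = {("d", j, lit) if lit > 0 else ("e", j, -lit) for lit in clause}
        R.append((("d", j), targets))
    return N, R

# SAT instance of Figure 5(a): C1 = v1 ∨ ¬v2 ∨ v3, C2 = v1 ∨ v2 ∨ ¬v3.
N, R = sat_to_replications([[1, -2, 3], [1, 2, -3]], 3)
```

For this instance the sketch gives $N = 36$ and, for $C_1$, the replication $d_1 \to \{d_{11}, e_{12}, d_{13}\}$, in agreement with Figure 5(b).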
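Lemma 4.1: Let $R, A, N$ be the output of SAT2RepSc($C_1, \ldots, C_n; v_1, \ldots, v_k$), and $r = s_i \to \{t_i, u_i\}$ be the replication in $R$ associated to variable $v_i$. Let $l = 4(n+1)(i-1)$ and $l' = l + 2(n+1)$. There are only two alternatives to schedule $r$ on $A$:

(a) Starting at interval $l$ and ending before instant $l + 2(n+1)$: read $s_i$, write $d_{1i}$, read $d_{1i}$, $\ldots$, write $d_{ni}$, read $d_{ni}$, write $t_i$; that is, $\{l, (d_{1i}, [l+1, l+2]), \ldots, (d_{ni}, [l+2n-1, l+2n]), (t_i, l+2n+1)\}$.

(b) Starting at interval $l'$ and ending before instant $4(n+1) \cdot i$: read $s_i$, write $e_{1i}$, read $e_{1i}$, $\ldots$, write $e_{ni}$, read $e_{ni}$, write $u_i$; that is, $\{l', (e_{1i}, [l'+1, l'+2]), \ldots, (e_{ni}, [l'+2n-1, l'+2n]), (u_i, l'+2n+1)\}$.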
[Figure 4: System Load. A time axis of intervals 0 to $N$, divided into one block of $4(n+1)$ intervals per variable $v_1, \ldots, v_k$; each block has a first half of $2(n+1)$ intervals labeled $v_i$ and a second half labeled $\neg v_i$. Each interval is labeled with the drives that have bandwidth available during it: the drives $w^+$, the drives $s_i$, $d_{1i}, \ldots, d_{ni}$, $t_i$ in the first half and $s_i$, $e_{1i}, \ldots, e_{ni}$, $u_i$ in the second half, and the clause drive $d_j$ where $v_i \in C_j$ or $\neg v_i \in C_j$, respectively.]
[Figure 5: Example of reduction of a SAT instance into a replications scheduling instance. (a) SAT instance: $C_1 = v_1 \vee \neg v_2 \vee v_3$, $C_2 = v_1 \vee v_2 \vee \neg v_3$. (b) Corresponding set of replications: $v_1$: $s_1 \to \{t_1, u_1\}$; $v_2$: $s_2 \to \{t_2, u_2\}$; $v_3$: $s_3 \to \{t_3, u_3\}$; $C_1$: $d_1 \to \{d_{11}, e_{12}, d_{13}\}$; $C_2$: $d_2 \to \{d_{21}, d_{22}, e_{23}\}$. (c) Patterns to denote alternative schedules of replications associated to variables and clauses. (d) Corresponding system load over intervals 0 to $N$, with drives $w^+_1, \ldots, w^+_{36}$.]
Proof: There are only two time intervals with bandwidth available at drive $s_i$. Consider the case when the schedule starts by reading the page from $s_i$ at interval $4(n+1)(i-1)$. Because there is no memory available at instant $4(n+1)(i-1)+2$, the next step must be to write the page to drive $d_{1i}$ during interval $4(n+1)(i-1)+1$. The next operation to schedule must be to read the page from $d_{1i}$ during interval $4(n+1)(i-1)+2$, because there is no later interval with bandwidth available for drive $d_{1i}$. A similar argument can be applied to conclude that the subsequent steps in the schedule are to write the page to $d_{2i}$ during interval $4(n+1)(i-1)+3$, and then read it from $d_{2i}$ during interval $4(n+1)(i-1)+4$, and so forth. The final step in the schedule must be to write the page to drive $t_i$ at interval $4(n+1)(i-1)+2(n+1)-1$, because there is no memory available to hold the page at instant $4(n+1)(i-1)+2(n+1)$. In sum, one alternative to schedule the replication associated to $v_i$ is the sequence in (a). Similarly, we can show that the other alternative is the sequence in (b).
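The two alternatives of Lemma 4.1 can be generated mechanically from $i$ and $n$. The sketch below is illustrative; the function name and the tuple encoding of drives (`("d", j, i)` for $d_{ji}$, `("t", i)` for $t_i$, and so on) are assumptions, and each schedule is returned as a list mirroring the sequence notation of the schedule definition.

```python
def variable_schedules(i, n):
    """Return the schedules (a) and (b) of Lemma 4.1 for variable v_i."""
    l = 4 * (n + 1) * (i - 1)

    def chain(start, prefix, target):
        steps = [start]                                   # read s_i at interval start
        for j in range(1, n + 1):
            # write to the j-th intermediate drive, then read it back
            steps.append(((prefix, j, i), (start + 2 * j - 1, start + 2 * j)))
        steps.append(((target, i), start + 2 * n + 1))   # final write to t_i or u_i
        return steps

    return chain(l, "d", "t"), chain(l + 2 * (n + 1), "e", "u")

# Variable v_1 of the Figure 5 example (n = 2 clauses).
a, b = variable_schedules(1, 2)
```

For $v_1$ with $n = 2$, alternative (a) is the sequence $\{0, (d_{11}, [1, 2]), (d_{21}, [3, 4]), (t_1, 5)\}$ quoted in the discussion of Figure 5, and alternative (b) is the same chain shifted by $2(n+1) = 6$ intervals on the $e$ drives.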
For the replications associated to clauses, there are only c possible schedules on A for a clause with c disjuncts.
Lemma 4.2: Let $R, A, N$ be the output of SAT2RepSc($C_1, \ldots, C_n; v_1, \ldots, v_k$), and $r = d_j \to \{d_{ji} \mid v_i \in C_j\} \cup \{e_{ji} \mid \neg v_i \in C_j\}$ be the replication associated to clause $C_j$. Let $l = 4(n+1)(i-1)$ and $s = 2(j-1)$. There are only $c$ ($c$ = number of disjuncts in $C_j$) possible schedules for the replication associated to $C_j$:

$$\{\{l + s, (d_{ji}, l + s + 1)\} \mid v_i \in C_j\} \cup \{\{l + 2(n+1) + s, (e_{ji}, l + 2(n+1) + s + 1)\} \mid \neg v_i \in C_j\}$$
Proof: The schedule for the replication associated to $C_j$ must start with a read from $d_j$. From SAT2RepSc, we conclude that there are exactly $c$ time intervals in $A$ with bandwidth available in disk $d_j$. Moreover, the time intervals with bandwidth available for disk $d_j$ are: $\{l + s \mid v_i \in C_j\} \cup \{l + 2(n+1) + s \mid \neg v_i \in C_j\}$. Let $r$ be the time interval when the read is scheduled. There are two cases: (1) $r = l + s$ for some $i$, and $v_i \in C_j$; or (2) $r = l + 2(n+1) + s$ for some $i$, and $\neg v_i \in C_j$. Consider case (1): From the construction of $A$ (transformation SAT2RepSc) we conclude that there is bandwidth available at drive $d_{ji}$ during interval $r + 1$ and there is no memory available at instant $r + 2$. Moreover, $d_{ji}$ is an alternative target for the replication. Therefore, the schedule must finish with a write to disk $d_{ji}$ at interval $r + 1$. A similar argument can be applied to case (2). In conclusion, the possible schedules for the replication associated to $C_j$ are the $c$ alternatives described above.
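The $c$ alternatives of Lemma 4.2 can be enumerated directly from the clause. This is a minimal sketch under an assumed encoding: a clause is a list of signed integers (`+i` for $v_i$, `-i` for $\neg v_i$), drives are tuples such as `("d", j, i)` for $d_{ji}$, and each schedule is a pair (read interval, (target drive, write interval)).

```python
def clause_schedules(j, clause, n):
    """Return the possible schedules of Lemma 4.2 for clause C_j."""
    s = 2 * (j - 1)
    schedules = []
    for lit in clause:
        i = abs(lit)
        l = 4 * (n + 1) * (i - 1)
        if lit > 0:
            # v_i in C_j: read d_j at l + s, write d_ji at l + s + 1
            schedules.append((l + s, (("d", j, i), l + s + 1)))
        else:
            # ¬v_i in C_j: same steps, shifted by 2(n + 1) onto drive e_ji
            start = l + 2 * (n + 1) + s
            schedules.append((start, (("e", j, i), start + 1)))
    return schedules

# Clauses of the Figure 5 example (n = 2).
c1 = clause_schedules(1, [1, -2, 3], 2)
c2 = clause_schedules(2, [1, 2, -3], 2)
```

For $C_1$ the first alternative is $\{0, (d_{11}, 1)\}$, the schedule that competes with alternative (a) of $v_1$ in the discussion of Figure 5.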
Lemma 4.3: Let $R, A, N$ be the output of SAT2RepSc($C_1, \ldots, C_n; v_1, \ldots, v_k$). If there is a truth assignment for variables $\{v_1, \ldots, v_k\}$ that makes all clauses $C_1, \ldots, C_n$ true, then there is a schedule for the replications $R$ on $A$ during period $[0, N]$.
Proof: Let $a$ be a truth assignment that makes all clauses $C_1, \ldots, C_n$ true. Consider the following schedule for the replications in $R$:
(i) For each variable $v_i$: if $a(v_i)$ is true, then consider the schedule in Lemma 4.1 (b) for the replication associated to $v_i$. Otherwise, consider the schedule in Lemma 4.1 (a).
(ii) For each clause $C_j$: let $v_i$ be a variable such that either $a(v_i)$ is true and $v_i \in C_j$, or $a(v_i)$ is false and $\neg v_i \in C_j$. Let $l = 4(n+1)(i-1)$ and $s = 2(j-1)$. If $a(v_i)$ is true and $v_i \in C_j$, then consider the schedule $\{l + s, (d_{ji}, l + s + 1)\}$ for the replication associated to $C_j$. If $a(v_i)$ is false and $\neg v_i \in C_j$, then consider the schedule $\{l + 2(n+1) + s, (e_{ji}, l + 2(n+1) + s + 1)\}$ for the replication associated to $C_j$.

To prove that the above is a replication schedule for $R$ on $A$, it suffices to show that the schedules for each replication do not overlap each other (i.e., they compete neither for disk bandwidth nor for memory). The schedules for replications associated to variables span disjoint periods of time: For each $i$ and $j$ such that $i \neq j$, the following time intervals are disjoint:

$[4(n+1)(i-1),\ 4(n+1)(i-1) + 2(n+1)]$
$[4(n+1)(i-1) + 2(n+1),\ 4(n+1) \cdot i]$
$[4(n+1)(j-1),\ 4(n+1)(j-1) + 2(n+1)]$
$[4(n+1)(j-1) + 2(n+1),\ 4(n+1) \cdot j]$

Similarly, the schedules for replications associated to clauses span disjoint periods of time. Suppose that the schedule for a variable $v_i$ overlaps the schedule for a clause $C_j$. Then either $v_i$ or $\neg v_i$ makes $C_j$ true. If $a(v_i)$ is true, then the schedule for $v_i$ spans the period $[4(n+1)(i-1) + 2(n+1),\ 4(n+1) \cdot i]$ and the schedule for $C_j$ spans the period $[4(n+1)(i-1) + 2(j-1),\ 4(n+1)(i-1) + 2(j-1) + 1]$. However, these two periods are disjoint, which contradicts the assumption that the schedules for $v_i$ and $C_j$ overlap. Similarly, for the case where $a(v_i)$ is false, we can conclude that the schedules do not overlap. Therefore, if there is a truth assignment for variables $\{v_1, \ldots, v_k\}$ that makes all clauses $C_1, \ldots, C_n$ true, then there is a schedule for the replications $R$ on $A$ during period $[0, N]$.

We now prove the other direction: if the replication scheduling instance yielded by SAT2RepSc has a solution, then the input SAT instance has a solution.
Lemma 4.4: Let $R, A, N$ be the output of SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). If there is a replication schedule for $R$ on $A$, then there is a truth assignment for $\{v_1, \ldots, v_k\}$ that makes all clauses $\{C_1, \ldots, C_n\}$ true.
Proof: The schedules for the replications associated to variables in SAT2RepSc follow one of the two patterns of Lemma 4.1. Therefore, a valid truth assignment $a$ is as follows: $a(v_i)$ is true if the execution of the replication associated to $v_i$ follows the pattern in Lemma 4.1 (b), and is false if it follows the pattern in Lemma 4.1 (a). We now show that $a$ makes all clauses $\{C_1, \ldots, C_n\}$ true. Suppose that there exists a clause $C_j$ such that all its disjuncts are false. By Lemma 4.2, the schedule of the replication associated to $C_j$ must be either:

(a) $\{4(n+1)(i-1)+2(j-1),\ (d_{ji},\ 4(n+1)(i-1)+2(j-1)+1)\}$, if $v_i \in C_j$; or
(b) $\{4(n+1)(i-1)+2(n+1)+2(j-1),\ (e_{ji},\ 4(n+1)(i-1)+2(n+1)+2(j-1)+1)\}$, if $\neg v_i \in C_j$.

Suppose that the schedule of $C_j$ is as described in (a). Then the schedule of the replication associated to $v_i$ must follow the pattern in Lemma 4.1 (b); otherwise, there would be a conflict for the disk bandwidth of $d_{ji}$ between the schedules for $C_j$ and $v_i$. Therefore $a(v_i)$ is true, according to the definition of $a$ given above. However, as stated in (a), $v_i \in C_j$, so $C_j$ is true. This contradicts the assumption that all disjuncts in $C_j$ are false. Similarly, we reach a contradiction when the schedule for $C_j$ is as described in (b). Therefore, $a$ makes all clauses $\{C_1, \ldots, C_n\}$ true.

Because the transformation SAT2RepSc is a polynomial-time algorithm, and by Lemmas 4.3 and 4.4, we conclude the following.
Theorem 4.5: Deciding whether there is a schedule for a set $R$ of replications on a system load $A$ over a period $[0, N]$ is NP-Complete.
Corollary 1 Given a system load A, the computation of a schedule to perform a set R of replications (to change the placement of data across disk drives) in minimum time is an NP-Hard problem.
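The time periods used in the proof of Lemma 4.3 can be checked mechanically. The following is a minimal Python sketch, not the authors' construction: it only computes the periods quoted in that proof and tests them pairwise. The function names, the encoding of a literal as a pair `(i, polarity)`, and the choice to treat two periods as overlapping only when they share more than an endpoint are assumptions made for illustration; the read and write patterns of Lemmas 4.1 and 4.2 are not modeled.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Literal = Tuple[int, bool]  # (i, True) stands for v_i, (i, False) for its negation
Window = Tuple[int, int]    # period [start, end]


def variable_window(i: int, n: int, value: bool) -> Window:
    """Period spanned by the replication of v_i (i is 1-based, n = number of clauses)."""
    base = 4 * (n + 1) * (i - 1)
    half = 2 * (n + 1)
    # A true variable uses the second half of its block, a false one the first half.
    return (base + half, base + 2 * half) if value else (base, base + half)


def clause_window(i: int, j: int, n: int, positive: bool) -> Window:
    """Period spanned by the replication of C_j when it is satisfied through v_i."""
    l = 4 * (n + 1) * (i - 1)
    s = 2 * (j - 1)
    start = l + s if positive else l + 2 * (n + 1) + s
    return (start, start + 1)


def overlaps(a: Window, b: Window) -> bool:
    """Assumption of this sketch: sharing a single endpoint is not an overlap."""
    return a[0] < b[1] and b[0] < a[1]


def windows_from_assignment(clauses: List[List[Literal]],
                            assignment: Dict[int, bool]) -> List[Window]:
    """Periods for all variables, then for all clauses, under a satisfying assignment."""
    n = len(clauses)
    windows = [variable_window(i, n, value) for i, value in sorted(assignment.items())]
    for j, clause in enumerate(clauses, start=1):
        # Pick any literal of C_j that the assignment makes true.
        witness = next(((i, pos) for i, pos in clause if assignment[i] == pos), None)
        if witness is None:
            raise ValueError(f"clause C_{j} is not satisfied by the assignment")
        windows.append(clause_window(witness[0], j, n, witness[1]))
    return windows


# Hypothetical instance: C_1 = (v_1 or not v_2), C_2 = (not v_1 or not v_2).
clauses = [[(1, True), (2, False)], [(1, False), (2, False)]]
windows = windows_from_assignment(clauses, {1: True, 2: False})
conflicts = [(a, b) for a, b in combinations(windows, 2) if overlaps(a, b)]
```

For this instance no pair of periods overlaps, which is what the proof requires of a satisfying assignment. An assignment that leaves some clause unsatisfied makes the sketch raise an error instead of producing periods.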
5 Resource Scheduling

This section shows that deciding whether there is a resource schedule for a given display schedule that yields a latency of one time interval is NP-Complete. We first introduce a polynomial-time algorithm SAT2ResSc that transforms any instance $C_1, \ldots, C_n, v_1, \ldots, v_k$ of SAT into an instance $\{P_0, \ldots, P_{m-1}\}, P, B, C, D$ of the resource scheduling problem.
SAT2ResSc:
Input: $C_1, \ldots, C_n, v_1, \ldots, v_k$.
Output: $\{P_0, \ldots, P_{m-1}\}, P, B, C, D$.

Replication Instance:
Let $R, A, N$ be SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$).

System Configuration:
Set the number of disks in the system to the number of different labels used in SAT2RepSc: let $D = m + 1 + 2k + n + 2nk$.
Let $MaxCard = \max\{\,|target_i| : i \in [1, n]$ and $source_i \to target_i$ is the replication associated to $C_i\,\}$.
Let the disk bandwidth of each drive be such that it can retrieve up to $MaxCard + 1$ pages during an interval: let $B = MaxCard + 1$.
Let $q = B(D-1)$.
Set the memory capacity of the system as follows: let $C = 2q$.
Display Schedule:
Let $\{P_0, \ldots, P_{m-1}\}$ ($m = N + n + k$) be the display schedule in column $P_i$ of Table 2.

Placement of Data:
Set the placement of pages on the disk drives so that: (1) any resource schedule must include a replication schedule for $R$, and (2) the system load after applying the retrieval schedule greedy($\{P_0, \ldots, P_{m-1}\}$, $B \cdot D$) (column $\hat{S}_i$ in Table 2) is identical to $A$ during the period $[0, N]$:

- Every page resides on one disk (i.e., there are no replicas).
- Set the placement of disk pages in the display schedule as follows:

Placement of pages retrieved during interval $-1$: $a_1, \ldots, a_q$ are placed on disk drives different from $w_0$ ($B$ pages on each drive). Because $D \cdot B = q + B$, the only drive with available bandwidth during interval $-1$ is $w_0$.

Placement of pages retrieved during even intervals in $[0, N]$: For the assignment of the $q - 1$ pages retrieved during even time intervals before instant $N$, we have two cases:
(1) There are three disks ($x, y, w_i$) with available bandwidth during the interval in $A$: assign the first $(D-3) \cdot B$ pages to drives different from $x, y, w_i$ ($B$ pages to each drive), assign the next $B - 1$ to drive $x$, the next $B - 1$ to drive $y$, and the last page to $w_i$. Then drives $x$ and $y$ have disk bandwidth available for one page retrieval/write each, and drive $w_i$ has disk bandwidth available for $B - 1$ retrievals/writes.
(2) There are two disks ($x, w_i$) with available bandwidth during the interval in $A$: assign the first $(D-2) \cdot B$ pages to drives different from $x, w_i$ ($B$ pages to each drive), and assign the last $B - 1$ pages to drive $x$. Then drive $x$ has disk bandwidth available for one page retrieval/write and drive $w_i$ for $B$ retrievals/writes.

Placement of pages retrieved during odd intervals in $[0, N]$: The assignment of the $q$ pages retrieved during odd time intervals before instant $N$ and after instant 0 is as follows. Let $x, w_i$ be the disk drives with available bandwidth during the interval in $A$. Assign the first $(D-2) \cdot B$ pages to drives different from $x, w_i$ ($B$ pages to each drive), assign the next $B - 1$ pages to drive $x$, and the last page to $w_i$. Then drive $x$ has available disk bandwidth for one page retrieval/write and drive $w_i$ for $B - 1$ retrievals/writes.

Placement of pages retrieved during $[N, N+k]$: The assignment of the $q$ pages retrieved during each interval $i$ is as follows. Let $t_{i-N+1}, u_{i-N+1}$ be the targets of the replication associated to $v_{i-N+1}$. Assign the first $(D-3) \cdot B$ pages to drives different from $\{t_{i-N+1}, u_{i-N+1}, w_{i+1}\}$ ($B$ pages to each drive), assign the next page to $s_{i-N+1}$, the next $B - 1$ to $t_{i-N+1}$, the next $B - 1$ to $u_{i-N+1}$, and the last one to $w_{i+1}$. Then drives $t_{i-N+1}$ and $u_{i-N+1}$ have disk bandwidth available for one page retrieval/write each, drive $s_{i-N+1}$ exceeds its disk bandwidth by one page, and drive $w_{i+1}$ has bandwidth available for $B - 1$ retrievals/writes.

Placement of pages retrieved during $[N+k, m]$: The assignment of the $q$ pages retrieved during each interval $t_i$ is as follows. Let $x_1, \ldots, x_i$ be the target disk drives of the replication associated to $C_{t_i-N-k+1}$. Assign the first $(D-i-1) \cdot B$ pages to drives different from $x_1, \ldots, x_i, w_{t_i+1}$ ($B$ pages to each drive), the next page to $d_{t_i-N-k+1}$, the next $B - 1$ to $x_1$, the next $B - 1$ to $x_2$, and so forth. Finally, assign the last $i - 1$ pages to $w_{t_i+1}$. Then drives $x_1, \ldots, x_i$ have disk bandwidth available for one page retrieval/write each, drive $d_{t_i-N-k+1}$ exceeds its disk bandwidth by one page, and drive $w_{t_i+1}$ has bandwidth available for $B - i + 1$ retrievals/writes.
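The arithmetic behind the system configuration and the placement rules can be verified with a short Python sketch. It is an illustration under stated assumptions: the function names are hypothetical, $N$ and $MaxCard$ are taken as plain inputs rather than derived from SAT2RepSc, and only page counts per group of drives are modeled, not individual pages or drive labels.

```python
from typing import Dict


def system_configuration(n: int, k: int, N: int, max_card: int) -> Dict[str, int]:
    """System parameters as defined in the System Configuration step."""
    m = N + n + k
    D = m + 1 + 2 * k + n + 2 * n * k   # number of disks
    B = max_card + 1                    # pages per drive per interval
    q = B * (D - 1)
    C = 2 * q                           # memory capacity in frames
    return {"m": m, "D": D, "B": B, "q": q, "C": C}


def even_interval_counts(D: int, B: int, three_free: bool) -> Dict[str, int]:
    """Pages placed per group of drives for an even interval in [0, N]."""
    if three_free:   # case (1): drives x, y and w_i have spare bandwidth in A
        return {"others": (D - 3) * B, "x": B - 1, "y": B - 1, "w": 1}
    return {"others": (D - 2) * B, "x": B - 1, "w": 0}   # case (2)


def odd_interval_counts(D: int, B: int) -> Dict[str, int]:
    """Pages placed per group of drives for an odd interval in [0, N]."""
    return {"others": (D - 2) * B, "x": B - 1, "w": 1}


def variable_interval_counts(D: int, B: int) -> Dict[str, int]:
    """Pages placed for an interval in [N, N+k]; drive s receives one extra page."""
    return {"others": (D - 3) * B, "s_extra": 1, "t": B - 1, "u": B - 1, "w": 1}


def clause_interval_counts(D: int, B: int, targets: int) -> Dict[str, int]:
    """Pages placed for an interval in [N+k, m] whose clause has `targets` target drives."""
    return {"others": (D - targets - 1) * B, "d_extra": 1,
            "targets": targets * (B - 1), "w": targets - 1}


# Hypothetical sizes chosen only to exercise the formulas.
cfg = system_configuration(n=2, k=2, N=24, max_card=2)
D, B, q = cfg["D"], cfg["B"], cfg["q"]
totals = {
    "even_three": sum(even_interval_counts(D, B, True).values()),
    "even_two": sum(even_interval_counts(D, B, False).values()),
    "odd": sum(odd_interval_counts(D, B).values()),
    "variable": sum(variable_interval_counts(D, B).values()),
    "clause": sum(clause_interval_counts(D, B, targets=2).values()),
}
```

With these inputs the even-interval placements total $q - 1$ pages and all other placements total $q$ pages, matching the number of pages that the text says are retrieved in each kind of interval.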
Table 2: Display and Retrieval Schedules: $\{P_0, \ldots, P_{m-1}\}$ and $\{\hat{S}_0, \ldots, \hat{S}_m\}$

$i$      $\hat{S}_i$                        $P_i$                              $F_i$
$-1$     $a_1 \ldots a_q$
$0$      $a_{q+1} \ldots a_{2q-1}$          $a_1 \ldots a_q$                   $a_1 \ldots a_q$
$1$      $a_{2q} \ldots a_{3q-1}$           $a_1 \ldots a_{2q-1}$              $a_q \ldots a_{2q-1}$
$2$      $a_{3q} \ldots a_{4q-2}$           $a_q \ldots a_{3q-1}$              $a_{2q} \ldots a_{3q-1}$
$3$      $a_{4q-1} \ldots a_{5q-2}$         $a_{2q} \ldots a_{4q-2}$           $a_{3q-1} \ldots a_{4q-2}$
$4$      ...                                $a_{3q-1} \ldots a_{5q-2}$         $a_{4q-1} \ldots a_{5q-2}$
...      ...                                ...                                ...
$N$      $b_{2q+1} \ldots b_{3q}$           $b_1 \ldots b_{2q}$                $b_{q+1} \ldots b_{2q}$
$N+1$    $b_{3q+1} \ldots b_{4q}$           $b_{q+1} \ldots b_{3q}$            $b_{2q+1} \ldots b_{3q}$
$N+2$    ...                                $b_{2q+1} \ldots b_{4q}$           $b_{3q+1} \ldots b_{4q}$
...      ...                                ...                                ...
$m-1$    ...                                ...                                ...

Observation 1: From the transformation SAT2ResSc, we can observe the following. Let $R, A, N$ = SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). The transformation SAT2ResSc produces a display schedule $\{P_0, \ldots, P_{m-1}\}$, a system configuration $(B, C, D)$, and an initial placement of data $P$ such that:
(1) There is a resource schedule for $\{P_0, \ldots, P_{m-1}\}$ consisting of a retrieval schedule $Ret = \{\hat{S}_0, \ldots, \hat{S}_m\}$ and a replication schedule $Rep$, such that: (i) the system load during period $[0, N]$ resulting from applying $Ret$ is identical to $A$, and (ii) $Rep$ is a schedule for $R$ on the system load resulting from applying $Ret$.
(2) $Ret$ does not pre-fetch pages. Therefore, for any resource schedule and for each instant $i$, $0 \le i \le m-1$, $\hat{S}_i \subseteq S_i$ (Table 2).
(3) For any resource schedule that supports $\{P_0, \ldots, P_{m-1}\}$, there is no memory available at instants $N, \ldots, m-1$ for pre-fetching or replication, because for each $i$, $N \le i \le m-1$, $|\hat{S}_i| = C$.
We now show that, given an instance $C_1, \ldots, C_n, v_1, \ldots, v_k$ of SAT, there is a one-time-interval resource schedule for SAT2ResSc($C_1, \ldots, C_n, v_1, \ldots, v_k$) if and only if SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$) has a solution.
Lemma 5.1: Let $R, A, N$ be the output of SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). The transformation SAT2ResSc produces a display schedule $\{P_0, \ldots, P_{m-1}\}$, a system configuration $(B, C, D)$, and an initial placement of data $P$ such that any resource schedule for $\{P_0, \ldots, P_{m-1}\}$ that yields a one-time-interval latency must schedule replications $R$ during time interval $[0, N]$.

Proof: Before instant $N$, the system must replicate the pages that it cannot retrieve during intervals $N, \ldots, m-1$ (Observation 1 (3)). Hence, for each $i \in [1, k]$ the system must replicate a page from drive $s_i$ to either $t_i$, $u_i$, or $w_{N+i}$ before interval $N$. And for each $i \in [1, n]$ the system must replicate a page from drive $d_i$ to either a drive in the target set of the replication associated to $C_i$ or to $w_{N+k+i}$, before interval $N$. Thus the system must schedule replications $R$ before instant $N$.

$Ret$ retrieves each referenced page during $[0, N]$ only once and does not pre-fetch pages (Observation 1 (2)). Therefore, the disk bandwidth requirement from a drive during $[0, N]$ is at least the disk bandwidth required by $Ret$ during the same period. Then the source, target, and intermediate drives in the schedule of a replication must have disk bandwidth available in $A$ during $[0, N]$; otherwise, the bandwidth requirements of a disk drive would exceed the disk bandwidth availability during $[0, N]$. Thus, disk drives $w_i$ for $i \in [N+1, m]$ cannot be a target drive of a replication.

Replications $R$ must be scheduled after instant 0; otherwise the latency would be higher than one time interval. Starting the schedule of a replication at interval $-1$ would increase the latency because: (1) the retrieval of all pages in $P_0$ requires the disk bandwidth of all disks except $w_0$, and (2) $w_0$ is not a source drive for any replication. In sum, the system must schedule replications $R$ during $[0, N]$.
Lemma 5.2: Let $R, A, N$ be the output of SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). Let $\{P_0, \ldots, P_{m-1}\}, P, B, C, D$ be the output of SAT2ResSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). If there is a replication schedule $RS$ for $R$ on $A$ during $[0, N]$, then there is a resource schedule that yields a one-time-interval latency and supports a coordinated display of $\{P_0, \ldots, P_{m-1}\}$ on a system configuration $(B, C, D)$ and an initial placement of data $P$.
Proof: Construct a resource schedule as follows:
Step 1: Include the retrieval schedule $Ret$ of Observation 1 (1).
Step 2: Change the retrievals in $Ret$ of the pages replicated in $R$ so that they are retrieved from their target drives in $RS$.
Step 3: Include the schedule of replications $RS$.
This resource schedule supports a coordinated display of $\{P_0, \ldots, P_{m-1}\}$ that yields a one-time-interval latency.

To prove the other direction, we show that scheduling a replication $r \in R$ as part of a resource schedule for SAT2ResSc($C_1, \ldots, C_n, v_1, \ldots, v_k$) requires at least the memory required by the scheduling of $r$ on $A$ during $[0, N]$, where $R, A, N$ is the output of SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). Given a schedule of a replication, the time intervals when the reads and writes are scheduled determine the memory requirements. Two replications whose schedules coincide in the time intervals of each read and the next write have identical memory requirements.
Definition: Given a replication schedule

$$\{\, t^i_1,\ (d^i_1, [t^i_2, t^i_3]),\ \ldots,\ (d^i_{k_i-1}, [t^i_{2k_i-2}, t^i_{2k_i-1}]),\ (target_{i_p}, t^i_{2k_i}) \,\}$$

the memory requirements of the replication schedule during $[-p, N]$ are defined as the sequence

$$(\underbrace{0, \ldots, 0}_{p + t^i_1 + 1 \text{ times}},\ \underbrace{1, \ldots, 1}_{t^i_2 - t^i_1 \text{ times}},\ \underbrace{0, \ldots, 0}_{t^i_3 - t^i_2 \text{ times}},\ \ldots,\ \underbrace{1, \ldots, 1}_{t^i_{2k_i} - t^i_{2k_i-1} \text{ times}},\ \underbrace{0, \ldots, 0}_{N - t^i_{2k_i} \text{ times}})$$

that represents the number of memory frames required by the schedule at each instant $i$, $i \in [-p, N]$.
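The definition translates directly into a run-length construction. The sketch below is a hedged illustration in Python: the function name and the flattened input format (the list $t_1, t_2, \ldots, t_{2k}$ of read and write times, without drive labels) are assumptions of this example and do not come from the paper.

```python
from typing import List, Sequence


def memory_requirements(times: Sequence[int], p: int, N: int) -> List[int]:
    """Frames held at each instant of [-p, N] by one replication schedule.

    `times` is the flattened list t_1, t_2, ..., t_{2k}: the first read, then
    alternating write/read times on intermediate drives, ending with the
    write on the target drive.
    """
    if not times or len(times) % 2 != 0:
        raise ValueError("expected a non-empty list with an even number of times")
    if any(later < earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("times must be non-decreasing")
    if times[0] < -p or times[-1] > N:
        raise ValueError("schedule must lie within [-p, N]")

    profile = [0] * (p + times[0] + 1)        # instants -p .. t_1
    for index in range(1, len(times)):
        run = times[index] - times[index - 1]
        # A frame is held from each read until the write that follows it.
        held = 1 if index % 2 == 1 else 0
        profile.extend([held] * run)
    profile.extend([0] * (N - times[-1]))     # instants after the final write
    return profile


# Hypothetical schedule: read at 5, write at 6, read at 7, write at 8.
example = memory_requirements([5, 6, 7, 8], p=0, N=9)
```

The resulting sequence always has $p + N + 1$ entries, one per instant in $[-p, N]$, and is 1 exactly at the instants that fall after a read and up to the next write.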
Let $R, A, N$ be the output of SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). For any resource schedule for SAT2ResSc($C_1, \ldots, C_n, v_1, \ldots, v_k$), there are two alternatives to schedule a replication $r$ in $R$: (1) schedule $r$ based on the system load $A$, or (2) modify the retrieval schedule $Ret$ (see Observation 1) to accommodate the replication $r$. For the second alternative, the system might schedule additional replications. For example, to schedule a replication of page $a$ from drive $s$ to drive $t$, the system might take the disk bandwidth that $Ret$ uses to retrieve a page $b$ from $s$ and use it to read page $a$ from $s$ at interval $t_i$. Then page $b$ can either be pre-fetched at an earlier time interval, or be replicated from $s$ to a disk $u$ with available bandwidth at $t_i$ so that $b$ can be retrieved from $u$ at $t_i$. If there is no memory to pre-fetch $b$, then the system is forced to replicate $b$. Therefore the schedule of the replication (from $s$ to $t$) includes the schedule of a new replication (from $s$ to $u$). The additional replications also increase the memory requirements, so their memory requirements must be included when computing the memory requirements of the schedule.
Definition: Let $\{P_0, \ldots, P_{m-1}\}, P, B, C, D$ be the output of SAT2ResSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). Let $R, A, N$ be the output of SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). If the scheduling of a replication $r_0$ in $R$ modifies the retrieval schedule $Ret$ in such a way that additional replications $r_1, \ldots, r_{n_r}$ must be scheduled, then the extension of $r_0$ is the set of replications $\{r_0, \ldots, r_{n_r}\}$.
For example, suppose that the system schedules replications $r_1$ and $r_2$ to accommodate replication $r_0$. Suppose that the memory requirements of the replication schedules are as follows: $(0, 0, 0, 0, 0, 0, 1, 0, 1, 0)$ for $r_0$, $(0, 0, 0, 0, 1, 0, 0, 0, 0, 0)$ for $r_1$, and $(0, 0, 1, 0, 0, 0, 0, 0, 0, 0)$ for $r_2$. Then the extension of $r_0$ is $\{r_0, r_1, r_2\}$ and its memory requirements are $(0, 0, 1, 0, 1, 0, 1, 0, 1, 0)$, the element-wise sum of the three sequences. The memory requirements of replication schedules define a partial order on the schedules.
Definition: A replication schedule $sr_1$ is greater (smaller) than a replication schedule $sr_2$ if and only if, for each instant $i \in [-p, N]$, the memory requirement of $sr_1$ at $i$ is greater (smaller) than or equal to the memory requirement of $sr_2$ at $i$.
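Both the memory requirements of an extension and the order on schedules reduce to element-wise operations on sequences. The following Python sketch uses the three sequences of the example above; the function names are hypothetical, and representing a schedule only by its sequence of memory requirements is an assumption of this illustration.

```python
from typing import Sequence, Tuple


def extension_requirements(*profiles: Sequence[int]) -> Tuple[int, ...]:
    """Element-wise sum of the memory requirements of r_0 and its added replications."""
    if len({len(profile) for profile in profiles}) != 1:
        raise ValueError("all sequences must cover the same instants")
    return tuple(sum(column) for column in zip(*profiles))


def is_smaller(a: Sequence[int], b: Sequence[int]) -> bool:
    """True when a needs no more frames than b at every instant."""
    if len(a) != len(b):
        raise ValueError("sequences must cover the same instants")
    return all(x <= y for x, y in zip(a, b))


r0 = (0, 0, 0, 0, 0, 0, 1, 0, 1, 0)
r1 = (0, 0, 0, 0, 1, 0, 0, 0, 0, 0)
r2 = (0, 0, 1, 0, 0, 0, 0, 0, 0, 0)
extension = extension_requirements(r0, r1, r2)
```

Here `r0` is smaller than `extension`, while `r1` and `r2` are incomparable because each needs a frame at an instant where the other needs none. This is why the order is only partial.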
Lemma 5.3: Let $\{P_0, \ldots, P_{m-1}\}, P, B, C, D$ be the output of SAT2ResSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). Let $R, A, N$ be the output of SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). Let $sr$ be the schedule of the extension of $r$ in a one-time-interval resource schedule for $\{P_0, \ldots, P_{m-1}\}$, where $r$ is a replication associated to clause $C_j$. Then there exists some replication schedule $sr'$ on $A$ for $r$ such that $sr'$ is smaller than $sr$.
Proof: It suffices to consider the case when the system changes $Ret$ to accommodate the scheduling of $r$. $sr$ must end with a page write on drive $d_{ji}$ or drive $e_{ji}$. This write must be scheduled at an interval $l$ such that there is memory available at instant $l$. Therefore, the write must be scheduled during an odd time interval (i.e., 1, 3, 5, etc.). Suppose that the write is scheduled during an interval $l$ that does not have bandwidth available for any drive in the target set. Then there are two alternatives: (1) to pre-fetch a retrieval from a target drive that was scheduled during $l$ in $Ret$, or (2) to replicate a page $a$ retrieved from a target drive during $l$ in $Ret$ to a drive with available bandwidth in $l$, so that $a$ can be retrieved from another drive during $l$. The first alternative is not possible, because the write operation requires an additional memory frame at instant $l$ to hold the page. Memory is thus exhausted at instant $l$, and there is no memory available to hold the pre-fetched page. The second alternative is not possible either, because the disk drives with available bandwidth during odd time intervals (e.g., $u_i$, $t_i$, $e_{ji}$, $d_{ji}$) do not have disk bandwidth available at an earlier time interval. Therefore, the replication of $a$ would increase the disk bandwidth requirement of such drives to more than what is available during the period $[0, l]$. In sum, the write must be scheduled during an odd time interval $l$ that has bandwidth available for a drive in the target set. Because there is no memory available at instant $l - 1$, the page must be read during interval $l - 1$. In conclusion, an alternative $sr'$ that schedules the replication associated with $C_j$ as in Lemma 4.2 is smaller than $sr$.
Lemma 5.4: Let $\{P_0, \ldots, P_{m-1}\}, P, B, C, D$ be the output of SAT2ResSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). Let $R, A, N$ be the output of SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). Let $sr$ be the schedule of the extension of $r$ in a one-time-interval resource schedule for $\{P_0, \ldots, P_{m-1}\}$, where $r$ is a replication associated to variable $v_i$. Then there exists some replication schedule $sr'$ on $A$ for $r$ such that $sr'$ is smaller than $sr$.
Proof: It suffices to consider the case when the system changes $Ret$ to accommodate the scheduling of $r$. $sr$ must end with a write of page $a$ to either drive $t_i$ or drive $u_i$. As in the case of the write on a target drive in the proof of Lemma 5.3, the write on either drive $t_i$ or $u_i$ must be scheduled during an odd time interval $l$ that has bandwidth available for either drive $t_i$ or $u_i$. Without loss of generality, suppose that it writes the page on drive $t_i$. Because there is no memory available at instant $l - 1$, the page must be read during interval $l - 1$. However, there is no disk bandwidth available for drive $s_i$ during interval $l - 1$. Then the system must either (1) replicate $a$ from $s_i$ to a disk with available bandwidth during $l - 1$ so that $a$ can be retrieved from the other disk, or (2) replicate a page $b$ retrieved from $s_i$ during $l - 1$ in $Ret$ to a drive with available bandwidth during $l - 1$, so that $a$ is retrieved from $s_i$ and $b$ from the new location during $l - 1$. In either case the system must replicate a page ($a$ or $b$) from $s_i$ to $d_{ni}$. To schedule this replication, the write on drive $d_{ni}$ must be scheduled at interval $l - 2$, because it is the only odd time interval before $l - 1$ with available disk bandwidth for $d_{ni}$. Then the system has to schedule a read from $s_i$ at interval $l - 3$, because there is no memory available at instant $l - 3$.

If there is only one clause in the SAT instance, then an alternative $sr'$ that schedules the replication associated with $v_i$ as in Lemma 4.1 is smaller than $sr$. If there is more than one clause in the SAT instance, then there is no disk bandwidth available for $s_i$ at interval $l - 3$. Therefore, as before, the system has to replicate a page from $s_i$ to either $d_n$ (if there is available bandwidth in $d_n$) or $d_{(n-1)i}$. Because there is no bandwidth available for drive $d_n$ during an odd time interval, the system has to schedule the replication from $s_i$ to $d_{(n-1)i}$. Similar reasoning can be applied iteratively to conclude that an alternative $sr'$ that schedules the replication associated with $v_i$ as in Lemma 4.1 is smaller than $sr$.
We now conclude the proof of the other direction of the equivalence of instances.
Lemma 5.5: Let $R, A, N$ be the output of SAT2RepSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). Let $\{P_0, \ldots, P_{m-1}\}, P, B, C, D$ be the output of SAT2ResSc($C_1, \ldots, C_n, v_1, \ldots, v_k$). If there is a resource schedule that yields a one-time-interval latency and supports a coordinated display of $\{P_0, \ldots, P_{m-1}\}$ on a system configuration $(B, C, D)$ and an initial placement of data $P$, then there is a replication schedule for $R$ on $A$ during $[0, N]$.
Proof: Suppose that there is a resource schedule $Sc$ for $\{P_0, \ldots, P_{m-1}\}$ that yields a latency of one interval, and that there does not exist a replication schedule for $R$ on $A$ during $[0, N]$. Consider the following schedule $Sa$ for replications $R$ on $A$: for each replication $r \in R$, take a schedule $sr'$ as in Lemmas 5.3 and 5.4 such that $sr' < sr$, where $sr$ is the schedule of $r$'s extension in $Sc$.

Because there is no replication schedule for $R$ on $A$, there must be two replications $r_1$ and $r_2$ whose schedules in $Sa$ conflict. The only possibility of conflict between the schedules for $r_1$ and $r_2$ in $Sa$ is that $r_1$ is associated to a variable $v_i$ and $r_2$ to a clause $C_j$, because the other combinations do not have overlapping periods. Without loss of generality, suppose that the schedule of $r_1$ in $Sa$ spans the period $[x, x + 2(n+1)]$, where $x = 4(n+1)(i-1)$, and that the schedule of $r_2$ spans the period $[x + 2(j-1), x + 2(j-1) + 1]$. Both schedules require a memory frame at instant $x + 2(j-1) + 1$. Then the schedules of the extensions of $r_1$ and $r_2$ in $Sc$ would also require two memory frames at instant $x + 2(j-1) + 1$. However, there is only one memory frame available at this instant. Therefore $Sc$ is not a resource schedule for $\{P_0, \ldots, P_{m-1}\}$ that yields a latency of one interval, which contradicts our assumption about $Sc$.

The computation of SAT2ResSc takes polynomial time. Therefore, by Lemmas 4.3, 4.4, 5.2, and 5.5, we conclude:
Theorem 5.6: Deciding whether there is a resource schedule that yields a latency of one time interval for a given display schedule is NP-Complete.
Corollary 2 Computing the resource schedule that yields the minimum latency for a given display schedule is NP-hard.
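The counting step at the heart of the proof of Lemma 5.5 (two schedules needing a frame at an instant where only one is free) can be illustrated with a small Python sketch. Everything concrete here is an assumption for illustration: the function name, the dictionary encoding of per-instant demand, the profile chosen for the variable replication (one frame at every odd instant of its period), and the free-frame profile (one frame at odd instants, none at even instants).

```python
from typing import Dict, Optional, Sequence


def first_conflict(demands: Sequence[Dict[int, int]],
                   free: Dict[int, int]) -> Optional[int]:
    """Earliest instant at which the combined demand exceeds the free frames."""
    instants = sorted({t for demand in demands for t in demand})
    for t in instants:
        needed = sum(demand.get(t, 0) for demand in demands)
        if needed > free.get(t, 0):
            return t
    return None


# Hypothetical instance with n = 2 clauses, variable v_1 (i = 1), clause C_2 (j = 2).
n, i, j = 2, 1, 2
x = 4 * (n + 1) * (i - 1)
period_end = x + 2 * (n + 1)

# Assumed free memory: one frame at odd instants, none at even instants.
free = {t: t % 2 for t in range(x, period_end + 1)}
# Assumed demand of the variable replication over [x, x + 2(n+1)].
r1 = {t: 1 for t in range(x, period_end + 1) if t % 2 == 1}
# Demand of the clause replication: one frame at instant x + 2(j-1) + 1.
r2 = {x + 2 * (j - 1) + 1: 1}

conflict = first_conflict([r1, r2], free)
```

Under these assumptions the first conflict falls at instant $x + 2(j-1) + 1$, the instant named in the proof, whereas the variable replication on its own fits within the free frames.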
The computation of a resource schedule is constrained by the memory capacity of the system. An increase of the memory capacity might lead to a decrease in latency. One question that arises is what the minimum memory requirement is to render a resource schedule with a given latency. However, deciding whether there is a resource schedule that yields a latency of one time interval on a system with memory capacity C is NP-Complete. Therefore, computing the minimum memory capacity is NP-hard.
Corollary 3 Computing the minimum memory requirements to render a resource schedule that yields a given latency is NP-hard.
6 Conclusions and Future Research

A coordinated display of a structured presentation must satisfy the temporal and spatial constraints associated with each object. Once the display starts, objects must be rendered at the pre-specified times defined by the temporal constraints. We studied the complexity of a resource scheduler that supports a coordinated display of structured presentations on a multi-disk architecture. We showed the following: (1) the computation of a resource schedule that supports a coordinated display and yields the minimum latency is an NP-Hard problem, and (2) given a system load, the computation of a schedule to change the placement of data across disk drives in minimum time is an NP-Hard problem. One open question is whether resource scheduling remains NP-Hard for special cases of multi-disk architectures, such as a system with 2 disks.