Storage and. JIM GEMMELL. Delay-Sensitive. Multimedia Data. Retrieval. Simon Fraser ... Storage and Retrieval]: Information Search and Retrievalâreb-i-.
Principles of Delay-Sensitive Storage and Retrieval
Multimedia
Data
JIM GEMMELL Simon
Fraser
University
and STAVROS
CHRISTODOULAKIS
University
of Waterloo
This
paper
tive
multimedia
trieval
establishes
some fundamental
data.
of these
data
types
to be acceptable
to the
intuition
reader,
to
theoretical show the
the
framework
bounds
of the
assist
the
possible
for
presented
audio,
has to satisfy is based
the
are
results
for the real-time
storage,
for certain
are basic
all
are
to any playback
in estimating
rate
Making
in
A
data
We and
framework,
scenarios.
categorized
requirements
data.
playback,
of this and
of delay-sensitive
hardware
to provide
of the audio
use
Re-
in order
order
audio
retrieval
then
video.
delay-sensitive
of digital
common
data
of delay-sensiand
constraints
audio
to
of the consumption
synchronized
designer
time
on digital
applicable
secondary
space requirements
and storage animations,
certain
requirements
in terms from
multichannel paper
retrieval
digital
presentation
rate
system
for the
include
storage
requirements
in this
multimedia
design
The
although
for buffer
strategies
The results
user.
these
data
secondary
data-retrieval
are derived
placement
from
is developed
how to describe nature
principles
Delay-sensitive
data
Storage examined.
and should
and in evaluating
choices.
Categories and Subject Descriptors: H. 3.2 [Information Storage and Retrieval]: Information Storage; H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval–reb-iInterfaces and Presentation]: Multimedia Information Syseval models; H.5. 1 [Information tems–animations, General
audio
Terms:
Additional
input
Design,
Key
Words
/output,
uideo
(e.g.,
tape,
disk,
DV,[)
Performance and Phrases:
Continuous
media,
delay-sensitive,
real
time
1. INTRODUCTION Data
is said to be delay
sensitive
when
there
are real-time
deadlines
for the
presentation of successive units of the data to the user. Examples of data that would be classified as delay sensitive are digital audio, animations, and video. For digital audio to correctly reproduce sound, digital audio samples
Authors’
addresses:
Burnaby,
British
Science,
University
Permission not made
J. Gemmell, Columbia,
V5A
of Waterloo,
to copy without or distributed
Department 1S6, Waterloo,
fee all or part
for direct
Simon
N2L
material
advantage,
3G1,
Fraser
Department
0734-2047/92/0100-051
Transactions
on Information
University, of Computer
Canada.
is granted the ACM
provided copyright
that notice
the copies
are
and the title
and notice is given that copying is by permission of the To copy otherwiee, or to republish, requires a fee and/or
permission.
@ 1992 ACM ACM
Science,
S, Christodoulakis,
Ontario, of this
commercial
of the publication and its date appear, Association for Computing Machinery. specific
of Computer
Canada;
$01.50 Systems,
Vol.
10, No. 1, January
1992, Pages 51-90
52
J. Gemmeli and S. Christodouiakis
.
must be delivered cases of animation of successive interruptions. This paper
images
ever,
data.
the the
for
establishes
multimedia discuss
to a digital-to-analog and video, real-time
In
to flow
principles order
retrieval results
motion
could
smoothly,
of storage
to provide
requirements given
converter at a precise rate. In the deadlines must be met for the display
and retrieval
greater
in terms equally
without
pauses
or
of delay-sensitive
intuition
to the
of digital
audio
to any
similar
apply
any
reader,
playback.
we Hovv-
delay-sensitive
data, because the issues involved in digital audio are a superset of the issues involved with most other delay-sensitive data. Very little work has been done in this area to date. Wells et al. and Yu et al. [12, 15] have disk is employed given a high-level sitive
data,
retrieval
examined
a very
uncommon
scenario
in which
an optical
with extremely limited buffering. Anderson et al. [3, 4] have description of a distributed system that includes delay-sen-
without
are met.
describing Abbot
in detail
how real-time
[1] has examined
a storage
requirements placement
for data
strategy
that
yields a high probability of improved system throughput, but does not absolutely guarantee meeting real-time constraints. Park and English [7] give a strategy for using lower-quality, lower-bandwidth audio data when contention for bandwidth occurs. However, their method does not provide a guaranteed minimum quality. In this paper we first develop a theoretical framework for describing the real-time
requirements
strategies
in the light
audio. relate
and
results.
then
Section
Sections 3–5 deal with a dedicated to single-channel (i. e., monophonic)
tiple-channel
playback.
storage
device,
system.
Finally,
gies for merits.
such
Section as would
Section
synchronized
2, REVIEW Digital
of retrieval of those
is the
in
some approaches a multitasking
the possible
multichannel
storage
placement
the basics
of digital
storage device: Sections 3 and 4 playback, and Section 5, to mul-
examines
be found
7 categorizes
OF DIGITAL
audio
6 then
consider 2 reviews
playback
storage
and
or
to a shared distributed
placement
discusses
their
straterelative
AUDIO encoding
of an
audio
signal
as a series
of symbols
(numbers) that can be processed by a computer [11]. Much attention has been given to digital audio, especially since the advent of the compact disk, an optical disk that is used for storing digital audio recordings [2, 14]. Digital audio has been made more practical in recent years by mass storage technology that can handle the large amounts of data that may be involved in digital audio [10], and by the mass production of digital-to-analog and analog-to-digital conversion hardware. At the same time, also in part due to improved mass storage technology, research and which extend
development traditional
video, and sound extended to include ACM
Transactions
have begun in the area of multimedia text-only databases to include images,
[5]. Electronic mail, once strictly sound and images in such systems
on Information
Systems,
Vol
10, No
1, January
1992
databases, animations,
text only, has been as NeXT Computing’s
Delay-Sensitive Multimedia Data Storage and Retrieval workstation devices
[9].
With
becoming
powerful
ever more
dominant form of man-machine Digitization of audio data analog signal at fixed intervals in binary form [6, 11]. A digital samples. converter the then
Playback consists at the same rate
consumption creates
sampling
that
and
multimedia
communication is most commonly
mass will
[8]. achieved
by
and by storing the amplitude audio record therefore consists
53
storage
become sampling
the an
of each sample of a number of
of feeding these samples into a digital-to-analog at which they were sampled. We refer to this closely
system.
The digital-to-analog
approximates
to the digital-to-analog
occurred,
workstations
and powerful,
rate of the playback
a signal
not delivered
graphics
cheap
.
the
converter
the reproduced
sound
original.
at precisely
will
not match
as
converter
If samples the rate
are
at which
the sampled
sound.
Systems may be designed in which samples come off the disk at this precise rate, but in most cases retrieval will be more irregular. When the retrieval rate is irregular, samples must first be buffered and then passed to the converter
at the desired
rate.
(Note that the term read is used in this paper to refer to retrieving data is used to refer to data from the storage device, while the term consume being consumed by the digital-to-analog converter.) If we compare the real-time demands of digital audio playback with real-time demands of most database systems, a significant difference observed. In most database systems, pleting the retrieval of a requested real-time
demand
of an
audio
the real-time demand consists of comrecord before a given deadline. The
system
is that
the
retrieval
of successive
samples of the audio record meet successive deadlines. The time to retrieve the first sample (or set of samples for multichannel important, because it essentially is the time from the request until some sound may be heard. However, correct playback does not depend on it.
3.
PRINCIPLES
In this
section
performance framework
it is not a real-time
OF DELAY-SENSITIVE
DATA RETRIEVAL
we develop
theoretical
a general
the is
framework
that it takes playback) is for playback deadline,
for studying
as a
the
of any playback system with a dedicated storage device. This allows us to determine if the system can meet the real-time
deadlines of playback, given a distribution of the data production rate from a secondary storage device. This production rate depends on the device characteristics, the data placement on the device, the compression rate achieved at any point in the playback stream (if employed), in addition to the number of samples recorded per second, and the granularity of the sample distinct values that can be recorded). Therefore, the general developed in this section provides a set of tools for the multimedia designer deci~ions
to study the effect of the above on the performmme of the system.
We assume throughout this paper in the order the data is consumed. ACM
Transactions
design
and
hardware
(number of framework database selection
that data is read from Abbot [1] has explored
the storage device the possibility of
on Information
10, No. 1, January
Systems,
Vol.
1992.
54
J. Gemmell and S. Christodoulakis
a
Number
of samples read versus time I
R(t)
I
I I I I
I I
I I
I
I I
I
t, t Fig.
1.
R(t):
The reading
function.
reading audio data out of order to reduce rotational delays on a disk. However, in general, reading out of order does not yield any such gain. In fact, it can be shown that, if the consumption order is known and the physical placement of the data may be specified, then reading in terms of buffer requirements and throughput. 3.1
Minimizing
In this audio
section record
Buffer
Utilization
the basic from
for the successful
storage
device
analyze the minimal buffer requirements successful playback. Let the total number of samples read (see Figure 1). The time samples have been read
will
be optimal
and Start Time
requirements
a dedicated
in order
playback
are explained.
and
the
up to and
minimal including
of a digital
In addition, start time
time
we for
t be l?(t)
chosen to be time zero should be chosen so that no prior to that time; that is, for t t,,R(t) = R(tr). R(t) is a strictly nondecreasing
be t,, that function,
is, the
because
earliest the
time
number
at which of samples
read can never decrease. However, the slope of R(t) will be zero whenever no samples are being read, for example, during seeks or during any waits for buffers to become available. Note also that when compression schemes are used R(t) remains the number of decompressed samples. Achieving high compression appears as an apparent increase in the transfer rate of the storage device and, hence, the slope of R(t) at that point. For example, under a differential compression scheme the slope of R(t) would be relatively low when read. ACM
actual
Transactions
samples
are read,
on Information
and would
Systems,
Vol
increase
10, No, 1, January
only
1992
when
differences
are
Delay-Sensitive Multimedia Data Storage and Retrieval
Number
of samples consumed
o
55
begins
be
versus time
# s
a m P 1 e s
to
time Fig. 2.
Let
the time
to. We refer by
the
that
playback
to to as the
are consumed
using
start
digital-to-analog
samples
C(t): The consumption function,
the digital-to-analog
time.
Also,
converter
at
at re samples
converter
let the number
of samples
consumed
t be C(t, to) (see Figure
time
per second,
o
2).
if
t to.
(1)
C(t, to)= { r,(t If the number time,
then
of samples
the samples
read
that
– to)
is greater
have
buffered. The number of samples that then R(t) – C( t, to), which we denote
B(t, to)= See Figure If buffer allocated
3 for an example of the space must be allocated for
the
maximum
peak
the number
but
must as
consumed
not yet played
be buffered
at any
back
at any given
must
be
time
is
R(t) - C(t, to).
(2)
B, C, and R functions. in advance, then enough
B( t,to).We
of
utilization at time t,and the utilization. If B(t,to)is less than zerol
than
been read
value
refer
of B(t,
in the range
If
then
to
space must
13(t, to) as the
to) as the
tos t s tr,this
maximum would
be
buffer buffer
mean
that
the digital-to-analog converter has consumed a sample that has not yet been read from the storage device, which is, of course, impossible. Otherwise,
1 Strictly
speaking,
B( t, to) = O would
also mean
yet read. However, in this paper consumption reality it occurs in discrete one-sample steps. rested
at zero, our continuous
as the error
model
would
that
something
is modeled Therefore,
have
is being
consumed
as a continuous function, if the true discrete model
gone negative,
and thus,
that
is not
although in would have
we can use B( t, to) 0 in the range to < t s t,.The buffer function is equal to the read function up to to, so it is equivalent to, say, B(t, to) a O for O tw. Now consider any point in time Therefore, both B(t, y) and B( t,tw) have the value R(t). If tU, < t < y, then 13(t, y) = R(t) If t > y, then
B(t, y) > B(t, tW). and ll(t, tw)= R(t) – rC(t – tW). Therefore, l?(t, y) = R(t) – rC(t – y) and 13(t, tw) = R(t) – rC(t – tW). Be-
B(t, y) > B(t, tW). Therefore, at any time t, the cause y > tw, once more, buffer space required for playback beginning at time tu is the same or less for playback beginning at time y. This means that beginning playback at time
tw is optimal
in terms
of buffer
❑
space.
Now that we have demonstrated that the solution giving wait also gives the minimum buffer size, we will show how
the minimum to find such
a
solution. THEOREM
Figure
2.
4): First,
then B(t,
0) is a solution,
the minimum B(tn,~,
The minimum value for to may be found as follows (see consider B(t, O). If this function is nonnegative for O s t < t,, value
and the minimum
for B(t,
O) = – m. The intersection
value for to, yielding PROOF.
If
B(t,
a workable O) is
value for to is zero. Otherwise,
O), in the range of R(t)
with
O < t < t,, B(t,
be at time
t~,~
let with
O) + m is at the minimum
playback.
a solution,
then
it
must
Theorem 1. Otherwise, let the minimum value () < t < t,, be at time t~i~ with B(t~i~, ()) = – m.
be for
optimal
according
B( t, O), in
the
to
range
Suppose that a solution is to begin playback at time x. That is, B(t, x) is at B(t, x) with B(t, O) in the least zero for any time O < t < t,.If we compare range
x < t < tr, we see that B(t,
O) = R(t)
(5)
- rC(t - O)
and B(t, This
shows
constant,
us that, rC x. That
x) = R(t)
– rC(t – x) = B(t,
in the range is, they
x) differs x s t < t,,Z3(t,
have
the
same
(6)
O]l + rCx.
slope,
which
from
B(t,
means
O) by a
they
are
parallel in this region. Now suppose that the minimum value for II(t, O) (occurs at a time greater at least zero than or equal to x; that is, t~,~ s_ x. Because B(t, z) is always Therefore, in the range x ~ t < t,, for O < t < t,, we know that B(tmln,x) ~ (). the constant difference between the two equations is at least m if t~,. > x. ACM
Transactions
on Information
Systems,
Vol.
10, No, 1, January
1992.
58
.
J, Gemmell and S. Christodoulakis
Samples read and buffered versus time
# s a m P 1 e s
t
t,
~in
time
Legend:
........... -..--
l?(r) B (t ,0) B(r,O)+m Fig.
Now
suppose
that
t~,n< x. Because
the
4
Finding
minimum
the mmimal
start
value
B( t, O) occurs
for
time,
at some value
of
R(t)
is nondecreasing and nonnegative, it follows that differ l?(t~ln) is at least O. Z?(t~,~, O) is defined as being – m, so the functions by at least m at time tmln.From t~,. to x, B(t, x) grows as R(t) does, while B(t, O) grows only as fast as R(t) – rCt. Therefore, the difference between B( t, x) and B( t, O) remains at least m up to time x. We have already seen that there is a constant difference between the two equations in the range ~ ~ t s tr. This difference must be at least m, because it is at least that value at time x. B(t, O) + m. Let Now consider the intersection of R(t) with the function because by definition it is the intersection be at time t,.B(t, t,) is a solution, than B( t, O) by m, the same as R(t) up to time t,,and after t,it is greater ensuring that it has a value of at least zero. It remains to show that t, is the minimum playback start time. We have shown that there is a constant difference of at least m between has any solution B(t, x) and B(t, O) in the range x < t s t,.B(t, t,) therefore ACM Transactions on Information
Systems,
Vol.
10, No
1, January
1992.
Delay-Sensitive Multimedia Data Storage and Retrieval the least
constant
II( t, y) with
difference
of any
solution.
Suppose
there
.
exists
59
a solution
y < tt. By definition, R(t) B(t,
ti)
if
t< t,,
if
t> ti,
if
tt,) = ~(t) - ~c(t - ti) = R(t) -
that
t,
a consumption
function
of this subsection
rate of the storage
have a maximum
buffer
and
device ( rC > rf),
space utilization
of
at least (19)
In addition,
the number
of sector-sized
buffers
required
for playback
is at least (20)
PROOF.
Let the read
proof,
except
being
read,
with so that
time
and buffering
functions
t = O being
the moment
be adopted when
R(O) = O. The buffering
function
II
s. – rCt
B(t,
O) =
tg s,
from
the first
the previous sector
begins
is then (21)
and tr = lJsS /ri). During every sector read, the buffer function fa”lls by at least a sector, because rC > ri. This causes l?(t, O) to decrease in a zigzag pattern, as illustrated in Figure 7. It is easy to see that it reaches its minimum at the end of the last read, just before the sector data becomes available. This is at time t,, and this minimum is tr(rC – rt) – s.. Using Theorem 2, tminmay be ACM
Transactions
on Information
Systems,
Vol.
10, No. 1, January
1992.
64
J. Gemmell and S. Christodoulakis
.
found by calculating the This is found by solving
intersection
13(t,0) which,
by the definitions
t 11
rt
mm —
+ t,(r,
of R(t)
s. – ret~ln
s.
which
of R(t)
-
r,)
and
with
II(t,
O) + tr(rC – r,) – s,.
(22)
- s. = R(t),
B( t, O), is equivalent
+ tr(rC
– r,)
– s. =
to solving (23)
t~,nc s,, s~ H
yields t,(rC
– rf)
Now is the
consider same
utilization
the buffer
must
requires
function
The buffering
up to time
function
t~,~, so the
B(t,
maximum
t~,~) buffer
be at least
‘(tmin
which
rC
space utilization.
as the reading
– s, (24)
t=mm
,1“(r’‘rr’)- ( )] H
‘mm)=~(~m,n)=
that
‘s 27-C
c
(25)
s,,
R ( ‘rein)
(26)
s.
the next sector-sized buffers be allocated. (Note that this bound is not tight; before a buffer has been emptied, so one read after time tm,nmay complete more buffer may be required. ) ❑ Therefore, we observe is equal to the transfer sector-sized on the
buffers.
length
of the
that, rate
for the special of the device,
In all other recording.
case where the consumption playback can be done using
cases, the buffer In practice,
space required
memory. Therefore, it is desirable handle records of arbitrary length.
that
is dependent
it is undesirable
space dependent on the length of the audio record, because only records up to a certain length can be played back with a playback
rate two
to have
buffer
it will mean that the given system
algorithm
be able
to
It should be noted that, when the consumption rate is less than the transfer rate, samples are being read faster than they can be consumed. If somehow reading could be slowed down, then the buffer requirements would be reduced. For example, if the transfer rate was exactly twice the consumption rate, then a samples-sized gap could be left after each sample to reduce the effective transfer rate. Different strategies are possible, but the essential ingredient in all of them is the introduction of artificial delays, which is discussed 4.2
in the next
Playback
section.
with Artificial
Delays
In the previous section, it was space requirements dependent ACM
Transactions
on Information
Systems,
noted that it is not practical on the length of the audio Vol
10, No, 1, January
1992
to have buffer record. In this
Delay-Sensitive Multimedia Data Storage and Retrieval section, ered.
the
same
case of single-channel
It is assumed
delays,
but
that
requirements.
artificial
algorithms
rate
must
under
from
this
point
on the are
ways.
in
order
on that
to
can
with
limit
For the purposes
the
record.
The
back
of this
rate
otherwise
audio
play
no
buffer
the consumption
of the
that
65
is consid-
continuously
(r, < r~), because
length
those
playback
back
introduced
of the device
depend
in many
be played
are
consideration
length. can be created
uncompressed
could
delays
the transfer
requirements
arbitrary Delays
the record
It is also assumed
is no more than buffer
that
.
records
result,
of
it does
not matter how they are accomplished. The assumption will be that delays cannot be of arbitrary length, but must be in multiples of some minimum delay. An example of an artifical delay on a disk would simply be to stop reading. In this case, the length of the delay must be a multiple of the rotational
delay.
THEOREM 6. For a single-channel which could be read contiguously,
audio record stored but in which delays
be introduced, the number of sector-sized buffers arbitrarily long records, with i-t > rC, is at least
in sectors of size s,, of length d~,~ may
required
for
playback
of
(27)
PROOF.
that
Suppose
the playback
algorithm
requiring
the
least
number
of buffers introduces no delays. For the algorithm to work over all lengths of records, it is apparent from the previous theorem that it must be the case that rt = rC. I-Iowever, introduced. Furthermore, ning
of playback,
this violates these delays
because
prior to that. Any time there
there
is a delay,
our will
assumption, necessarily
is no possible
there
must
so delays must be come after the begin-
gain
from
introducing
be enough
data
buffered
delays to last
for
the duration of the delay, plus the time it takes to read a sector after the delay is complete. So, if the amount of data buffered at the beginning of a delay
is b, then
b=rc(dmin+;) and at least The during
( b / s~l buffers
above
bound
delays.
relies
In practice,
last buffer is being gives an algorithm
filled that
must
this
❑
be allocated.
on using
all
maybe
(28)
of the
available
difficult
buffers
to implement,
another buffer is being uses one buffer more
to hold because
data as the
emptied. The next theorem than the bound presented
above. THEOREM
which
could
7.
For
be read
a single-channel contiguously, ACM
Transactions
audio
record
stored
but in which
delays
on Information
Systems,
in sectors
of length Vol.
of size s.,
d~i~
may be
10, No. 1, January
1992.
66
J. Gemmell and S. Christodoulakis
.
introduced,
and
with
rt > rC, there
exists
a playback
algorithm
using
‘b=lk+d+l+’ sector-sized PROOF.
buffers. Consider
(1) Allocate
the following
~((s. /r,)
(2) Begin
reading
(3) Begin
playback.
(4) Repeat
(29)
until
+ (d~,~))
the audio
no sectors
algorithm: ( rC /s~)]
+ 1 sector-sized
record
sequentially,
are left
to be read:
(i)
If any buffer
is empty,
(ii)
If no buffers
are empty,
read
the next
perform
buffers.
and fill
the first
sector
and fill
an artificial
delay.
The above is a solution if there is always data the time playback begins until all reading
buffer.
it.
buffered, to be consumed from is complete. When playback
begins, all buffers are full. When there are no delays, there will always be data buffered because rf > rC. Delays during playback only occur when there are no free buffers, that nonempt y. The amount amount
of data
is, when n6 – 1 buffers are full, and one buffer is of data consumed during a delay is d~ln rC. The
in the full
buffers
is (30)
Therefore, buffered. another
at the end of any delay, there will be at least (rC / rt)s~ still At the end of each delay, if there are still no free buffers, then delay
may
be introduced
with
the
there are free buffers, then the next sector (rC / r,)s~ is still buffered, there will always ❑ until the read is completed. 4.3
Playback
In this
with Unavoidable
section
we consider
that includes some delays is possible to introduce express certain are given. 4.3.1
Block
THEOREM
of arbitrary
bounds,
Consider length, with
still
holding.
playback
in which
reading
is not continuous,
and then
results
of a simulation
verifying
a single-channel system the following properties:
that plays
the bounds
back audio
(2)
In between reading each block, a delay is incurred. The delay is unpredictable, but is bounded above by d~~X. on Information
but
Allocation
Data is read in blocks, each block consisting of a fixed The block size is s~ = is., where i is an integer.
‘hansactmns
if
Delays
(1)
ACM
Otherwise,
by necessity (e. g., seeks). Again, we assume that it artificial delays. First, theorems are given that
Size and Buffer
8.
above
will be read, and because at least be data buffered for consumption
Systems,
Vol
10, No. 1, January
1992
number
records
of sectors.
length
of each
Delay-Sensitive Multimedia Data Storage and Retrieval The minimum
(3)
The transfer
(4)
length
of an artificial
rate of the storage
delay
device
is d~i~,
is greater
with
than
d~,~
.
67
< d~~x.
the consumption
rate
(r, > 7-C). Then
any
record
algorithm
length
must
that
allocates
use a block
a number
of buffers
independent
of the
size of at least
‘+X-+1 sectors
and
must
allocate
(31)
at least
(32)
sector-sized PROOF.
buffers. Consider
the
case where
all
is, d~.X. Let playback (consumption) read after this time. Let it be the sector
of the
nth
block
is completed,
before this rise, the total amount of data consumed
amount by this
(n-l)
Because
reading
must
the
stay
+ — :)
ahead
that
As the
read
buffered
will
a block for the first
will
be (n – 1)s,.
rise.
Just
The
total
(33)
r, – tOrC.
of consumption,
( =f++-tor
(n-l) stI>(n-l) which
length,
to,and consider
amount
of data read time will be
(d~,X
always
are of maximum
delays
begin at time nth block read.
(34)
d
yields
to d— max s,
>
—
n–1 1
1
LA —
—
rC We can take
n to be arbitrarily
large,
(35)
—
rt so this
will
approach
d max sb>
~
(36)
~ —
re Because
the block
size is in sector
units,
rt we must
have (37)
‘b>i%-%)lss. ACM
Transactions
on Information
Systems,
Vol.
10, No. 1, January
1992.
68
. Now
added
J. Gemmell and S. Christodoulakis consider
the
to the buffer
buffer
allocation
by reading
requirements.
a block
during
Let
playback
the
amount
of data
be
(38)
that the
is, the change first
sector
in the buffer
of the block
from
function,
becomes
available
sector of the block becomes available. when the first sector becomes available, the end of reading
the first
that IJ < d~,.. Consider two blocks amount
of data
left
read
block. after
If
playback after
(
+;
rC d~,,
the moment moment
just
when
before the
last
It is assumed that playback begins so this is the amount in the buffer at ~ > d~,~,
in the buffer
from to the
then
we are done,
(consumption)
the first
block
so assume
has begun.
is read
Let
the
be
(39)
+4+7. )
This means that rCy was left in the buffer previous to the read. y may be zero, but is nonnegative. If ~ + ~ > d~,n, then we are done, so assume that y + -y < dm,n. Let k = \[(2~
+ y)/d~l~]
+ 11. For
all
x it is true
that
lx
+ 1]>
x, so
kd~i~ >2 ~ + y. Now let c >0 be chosen such that c < kd~l~ Suppose that after the first block is read there is a delay of – kd~l. + c, and after the second block is read there is a delay that the first delay is in fact less than d~~X, as the choice of e Also, the first delay is in fact a nonnegative value, as k is at
of d~,X. Note and k ensures. most 2, and is
only that value when 2$ + ~ is at least Suppose also that no artificial delay
reading
second reading
block until
(
will
before
of the
is complete. From the time that the first block completes the time the first sector of the second block is available,
rC d~,,
sectors
d~,.. is introduced
– 2 ~ – y. d~,X + 2 IJ + y
be consumed.
+2~+y–kd~,n+~+~
(40) rt )
There
was
(41)
in the buffer when the first block completed reading; therefore, first sector of the second block becomes available, there is
(
rC d~,X +Z+++y rt = rC(kd~,. ACM
Transactions
on Information
–rCd~,X
as the
+2$+~–kd~i.+e+~
)(
rt )
– e – ~) Systems,
just
(42) Vol.
10, No. 1, January
1992
Delay-Sensitive Multimedia Data Stora!~e and Retrieval in
the
buffer.
buffered
will
SO after
the
second
block
reading,
the
69
amount
be
(
rC dmax + ~
+ ~
+ rC(kd~in
– c – J)
= rC CQx (
introduced
before
)
If k artificial reading,
completes
.
delays
had been
-t ~ + kdm,n – E . (43) ) rt
the
second
block
completed
then (44)
would be left in the buffer. However, the next delay is of length d~,X, so by the time the first sector of the third block becomes available, r,[ d~.. + ( SS/ rf)] would
have
been
consumed.
This
which is an error in playback. introduced, reducing the buffer
r,
c can be chosen utilized
would
the
small.
function
negative,
k – 1 delays
could
be
(45)
d~,X +~+d~i.–c rt
Therefore,
buffer
at most
(
to be arbitrarily
for consumption.
make
Therefore, space to
) However,
the buffer
partial
samples
space must
cannot
be
be at least
(46)
And
the number
of sector-sized
buffers
must
then
be at least
(47)
As
in
Theorem always algorithm
the
case
of Theorem
8 is difficult one buffer that
TEI~O~~NI 9. properties:
to achieve
being
6, the
bourld
in practice,
emptied
as one is being
comes close to achieving Consider
for
a single-channel
buffer
because
the bound playback
allocation
during
fillled.
Theorem
given system
with
Data is read in blocks, each block consisting of a fixed The block size is S6 = is., where i is an integer.
(2)
In between reading each block, a delay is incurred. The delay is unpredictable, but is bounded above by d~~X.
(3)
The minimum
(4)
The transfer (r, > 7-J.
length
of an artificial
ACM
Transactions
device
delay
is d~ ,~, with
is greater
on Information
than
Systems,
is
9 gives
an
8.
the following
number
of sectors.
length
d~,~
of each
< T~.
the consumption
Vol
in
there
in Theorem
(1)
rate of the storage
given
playback
10, No. 1, January
rate
1992.
70
J. Gemmeil and S. Christodoulaids
.
Then
there
exists
an algorithm
requiring
block
sizes of
(48)
sectors
and allocating
(49)
sector-sized
buffers.
PROOF.
buffers
algorithm
The
allocated
(1) Read the first (2) Begin
until
reading
Let the delay
(ii)
While
the
sumption
block
size
and
number
of
incurred
block
be of length buffered
size
to
delay
an artificial
be
T~ < d~~x, is at least
and a block
enough
to satisfy
con-
read,
delay.
block
satisfies
the buffer a playback
buffered,
still
for an artificial
Read the next
show that constitute
is completed:
amount
(a) perform
data
the
block.
(i)
The
Let
Then
playback.
(3) Repeat
(iii)
is as follows:
be as above.
the
allocation solution.
requirements
of the
theorem.
It
remains
to
is sufficient and that the algorithm does It is a playback solution if there is always
consumed
from
the
time
playback
begins
until
all
reading is complete. Clearly there is data buffered when playback begins. The amount buffered after the read of any block is enough to satisfy consumption through the worst-case delay, and reading the next sector. Therefore, playback could only fail
as a result
of delays.
However,
delays
are only
performed
when
there
is
enough data buffered to last through a delay and a block read. Therefore, the above is a playback solution. Now consider the buffer space utilized by the above algorithm. Before reading any block, there will always be less than
‘Jdmin+%) buffered; otherwise, another delay would have buffer space. Therefore, before reading the block,
been there
issued to reduce may be
(50) the
(51) ACM
Transactions
on Information
Systems,
Vol
10, No. 1, January
1992
Delay-Sensitive Multimedia Data Storage and Retrieval buffers
in use, Reading
the block
requires
.
71
at most
R(1-31 more
buffers.
Therefore,
the maximum
number
(52)
of buffers
required
is (53)
4.3.2
Simulation
Delays. magnetic each block then
Results
The scenario disk as the located
a seek will
for
Uncompressed
described in the storage medium.
on a single
track.
be performed
Playback
with
Unavoidable
last two theorems can occur using a Audio data can be stored in blocks,
If blocks
are stored
to each block.
The
in random
seek will
then
locations, be random,
but will be bounded above Following the algorithm
by the maximum seek tilme of the disk drive. given in Theorem 9, a seek would be performed,
and the read
over the first
proper read,
moved were
delay
in practice,
conditions
head,
and
simulation,
(2) Begin
until
(i)
Perform
(ii)
While
sector to read
was modified
to occur.
the
it.
However,
calculation
would
have
Therefore,
it would this
completes
begun
moving
for the
If the not be cannot
to see if under
purposes
the
of the
to the following:
block.
reading
the
overhead,
amount
has the
same
still
the next
buffered
for a rotational
block
is
delay,
is on.
more
than
enough
a block
read,
and any hardware
to
satisfy
wait.
Read the next
case would
is completed:
a seek to the track
consumption
delays. From
time
to be read.
otherwise,
playback.
(3) Repeat
(iii)
the
of the block
be read;
be allowed
by the
be too late
the algorithm
sector
it would
would
because
are satisfied,
it would
(1) Read the first
This
satisfied,
and a rotational
be done the
head
conditions
block.
properties
be a maximum
in terms
seek, plus
of buffer
requirements.
a maximum
d~l~ would be a maximum rotation the theorem, the block size required
rotation,
plus is
d~,X
plus
an,y overhead
in this
any overhead delays.
(54)
i%+-:)1 sectors.
Therefore,
integer,
a consumption
if a block rate
size of
n sectors
is chosen,
with
n being
an
of nsa rC=—
should results
(55)
l–~ d max ()
be feasible to support, according were compared to this value.
rt
to
the
l;heorem.
The
simulation
ACM Transactions on Information Systems,Vol. 10, No. 1, January 1992
72
.
J. Gemmell and S. Christodoulakls BEGIN TB = time to consumea block nrsmsectors= numt!er of sectors in a block /*** wait time is time that we may haveto wait for new data to be *** availableafter issuing a read command ***J waft —time =
Maximum rotation time
+ hardware
overhead
+ time to read a sector ‘read’ first block (actuatly /*** ***
buff_tirue ~o~t
verify)
is the amount of time it takes to consume the totat of data read so f~.
***f btrf_time
=
TB;
1%start playback’1 start_time= the current time thereare more blinks to read
WHILE
seek to the tmck the next block is on temp = buff_timc
- extra;
do { /“ wait as long as we can wids having danger of failure
*/
now = the current time } while ( now - staft_time ‘read’
< buff_time
the next block (actuatly
/***
- wait_timc);
verify)
see if we rats out of data while the block was read.
***
This would
***
not enough to last until the end of reading tbe
be the case if the amount buffered
***
fist
was
sector,
***1 now = the current time lst_sector —read —tirm = now - (time to read nmnsectors-1 spMe_time
=
if (spwe_time
btif_time
sectors)
- starf_time);
< O)
simulation buff
- (1 st_seetor_read_tirm
fails
(exit)
time = buff —time + TB;
ENDWHI~E simulation ENDsimuI
is successful
ation Fig.
8.
Pseudocode
for the simulation
of Theorem
9
The routine that simulates the above algorithm is passed a list of blocks to read, the number of blocks in the list, the consumption rate to simulate, and the number of sectors per block. The pseudocode for it is given in Figure 8. Code was written to worst-case list and a beginning of a track, For a list which is not ACM
Transactions
generate two types of lists for the simulation routine; a nonworst-case list. In each case blocks began at the so that each block would lie entirely within one track. worst case, each block began on a random track. For a
on Information
Systems,
Vol
10, No
1, January
1992
Delay-Sensitive Multimedia Data Storage and Retrieval
Consumption
.
73
rate vs bllock size
14LW.XJ #.x 1211XII
-.2i-
*0- * lm
x
.’x
.4
,+
0
r=
8e032
x
.“x
.’
.*”,*
‘
*+
/“ Jf
.“* /-
M“” #0”
#-
‘:#””
~:+”:;” + F“”#&
m
.+’
,$.,$+$ ;.
4mxl
~ti$
2KYXI
*4 o
1 0
I
I
I
I
r
I
I
1
I
I
I
I
1
1
I
I
512 1024 1536 20.$8 2560 3072 3584 4096 W38 5120 56U 6144 6656 7168 7@0 8192 8704
Block siw
Legend:
# ---- -## estimated value x----x simulation results + --- + worst case simulation results Fig.
9.
Experimental
supportable
by
results
a given
block
for Theorem
9 simulations.
size
Comparison
(bytes).
Maximum
consumption
of experimental
results
rate
(bytes/s)
with
predicted
and
outside
values.
worst-case
list,
blocks
alternated
cylinders, with the actual track The simulation was performed to an IBM
PC/AT
between within using
compatible.
The
being
on the
the cylinder a 155-Mbyte software
was
inside
being random. SCSI hard drive written
under
attached the
DOS
operating system, using Microsoft’s C compiler, assembler, and linker. SCSI commands were sent to the drive using the software driver provided with the controller card. The driver contained a command that allowed SCSI command data blocks to be transferred, so that the drive could be directly under the experimental program’s control, with no intervening algorithms applied. Timing
was
done
using
the
Intel
8254-2
timer/counter
chip
built
into
the
PC/AT, with an accuracy in excess of 1 ps. Experimentally, it was found that a delay of at least 6 ms must occur before a sector may be read. Because of this, when issuing a read command while already on the correct track the possible delay is as high as a maximum rotation plus 6 ms. The rotational delay of the drive is at most 17 ms. ACM
Transactions
on Information
Systems,
Vol.
10, No, 1, January
1992.
74
J. Gemmell and S. Christodoulakis
.
The maximum seek time is 40 ms. The transfer Mbytes/s. Sectors were 512 bytes. The simulation was run using both the worst various
numbers
Each time, maximum
per
block.
The graph in Figure results indicated. simulation
Theorem
results
9. In
some
5. PROPERTIES
meet
cases,
or
the
exceed
playback,
several
simply
important
properties.
storage
device
—The
number
of playback
channel rates
—Playback read for
the
data
No assumptions
channels
system
Let the following
rate
in
the
verifying
than
the
non-
back
synchronously.
to illustrate
the following
some
properties:
playback.
rate,
i-C. The sum of the consump-
of the storage
there
are
device,
nCrC s rf.
during which we use S6 to
as a physical
as to how the data
variables
the was
with
results,
order
have
reading periods that, although
need not be stored
— During each reading period, bounded above by d~,X.
for
read.
is nC
the transfer
are made
rates until in 20 trials
are better
are played
to the audio
can be divided into each channel. (Note
predicted
is studied
has the same consumption
is at most
were
of the simulations,
results
of audio
system
is dedicated
cases,
20 blocks
studied the case of single-channel (i.e., we turn our attention to multichannel
Let the playback
–The
k
nonworst
is 1.07
PLAYBACK
channels
playback
drive
the performance of each simulation deencountered than on the length of seeks.
OF MULTICHANNEL
in which
A relatively
amount,
and trial,
the
worst-case
In the previous sections, we have monophonic) audio playback. Now
tion
each
9 show the results
worst-case results. This is because pends more on the rotational delays
—Each
For
of the
the simulation was run using various consumption consumption rate for which no failures occurred
recorded. predicted The
of sectors
rate
block
of nonzero
are the
for each channel.
are physically
delays
Sb data indicate stored.)
length
that
are
be defined:
the ratio of the data transfer rate of the storage device over the total data consumption rate, that is, rt / nCrC. Because we assume that nCrC s rf, it must
be the case that
1P the reading
period
k > 1.
length.
This
is the
delay
time
plus
the reading
time.
We are particularly interested in the reading period length and the block size. In general, the delays assc,ciated with retrieving the first block for each channel are similar to that for the remaining blocks. Therefore, the reading period length is a good indication as to how long request for playback and the playback beginning, be as low as possible. requirements. ACM Transactions on
Similarly,
Information
s~ gives
Sy;~tems, Vol
10, No
the delay will be between a and it is desirable for it to
a good indication 1, January
1992.
of the
buffering
Delay-Sensitive Multimedia Data Storage and Retrieval The
reading
needed
period
to transfer
length
is at most
one block
of data
the
maximum
delay
.
plus
75
the
time
for each channel: (56)
It is clear
that
be enough duration
for playback
buffered
in
of a reading
to work
each
for records
reading
period;
that
period
of arbitrary to satisfy
length
there
consumption
must
for
the
is, 7-C (57)
sb 2 1P It is possible
that
each reading
period
may
be of maximum
length,
so (58)
which
can be rewritten
using
k as krC d~~X
r, d~~X (59)
‘b> and which obtain
‘nC(k–l)
k–l
can be substituted
into
the
formula
for
reading
period
length
to
k 1P
>
(60)
dmax— k–l”
Solving
for consumption
rate
yields (61) sb
rt or Sb
k–l —
rc ( ns~ / x). This means that the closer we come to using the maximum number of audio channels, the more costly it is to add other nonaudio channels. For example, suppose the numer
of audio
of bandwidth
that
channels
remaining.
we used
(r, / rC) – 0.5 audio
is maximized,
In this
but
there
case, we would
is still
need
channels; a small
an audio
block
that
is,
amount size of
S6 >2 nsd, so that each audio block would have to be twice as large as the sum of all data blocks. Suppose that we wish to keep our audio block size down to at most m, the size of a data block. Using the above results, we obtain n, < ( rt / r,) – ( n / m), or n s m[(ri / rC) – n,]. That is, the number of nonaudio channels should not exceed m, the difference between the maximum number of audio channels and the actual number of audio channels. We observe, then, that the greatest flexibility is obtained by not attempting to utilize fully the bandwidth of the storage device for audio. It was already demonstrated that for a dedicated storage device this is the most economical route. Sharing the storage device only exacerbates the problems associated
with
7. STORAGE
pushing PLACEMENT
the bandwidth
limits.
STRATEGIES
This section considers several strategies for placing audio data on a storage device in order to play back multiple-channel audio synchronously. In Section 7.1 the possible placement strategies are categorized, and in Section 7.2 these categories are applied to some systems currently in use. Section 7.3 discusses ACM
Transactions
on Information
Systems,
Vol.
10, No. 1, January
1992
Delay-Sensitive Multimedia Data Storage and Retrieval issues best
particular under
to interleaving
certain
logical
blocks
impact
of compression
7.1
of audio
Categories
and
conditions.
indicates
Section
in sectors
what
7.4 takes
on a storage
sort
81
of interleaving
up the
device.
.
problem
Section
is
of placing
7.5 considers
the
on placement.
for Placement
Strategies
Placement of data may be either leaved placement groups together
interleaved or noninterleaved. An interdata that are read at the same time from
all of the channels. Notice that this implies that synchronization information is complete before placement and that changing synchronization involves moving the data on the storage device. Interleaving or noninterleaving may be combined contiguous together
with
a scattered
samples,
in a contiguous
a scattered various
either
strategy, strategy,
locations
in
fashion, the
on the
or contiguous the
order
one after
data
may
storage
will
another,
be split
device.
pdacement
they
strategy.
be read,
are
on the storage
up
into
blocks
The placement
and
strategies
In a stored
device.
In
placed
in
discussed,
then, are scattered interleaved, scattered, contiguous interleaved, and contiguous (a placement is assumed to be noninterleaved unless otherwise stated). In scattered placement, data for each channel is split up into blocks. Each data block of a channel may lie anywhere on the stcm-age device. In contiguous placement, on the
placement, nized.
all of the data
storage
device,
in
it is assumed
Instead
for a particular
the that
of storing
all
order
it
several
of the
In
is stored
contiguously
contiguous
channels
have
from
one channel
data
contiguous placement, interleaving read at a particular time together.
channel
is read.
already
interleaved been
together
synchroas in the
places the data from all channels that are These groups of data to be read succes-
sively are then stored one after another, forming one large contiguous file. Scattered interleaved placement is the same, except that the large contiguous file is split up into pieces, the scattered placement.
which
are then
There are various advantages and tered placement allows more flexibility external small fill
fragmentation gaps must
them.
However,
the locations ments require
on a storage
there
is more
of scattered blocks. much less seeking
on the
storage
device
as in
disadvantages to each strategy. Scatthan contiguous placement, so that
may be avoided.
be left
placed
External
device
fragmentation
because
overhead
no file
involved
in
occurs when
is small
enough
to
track
of
keeping
In addition, continuous interleaved placeduring playback than scattered interleaved
placements. Interleaving is used in an attempt to reduce seeking. By placing blocks that must be read at the same time side by side, minimum seeking between that synchronization information must them is guaranteed. Note, however, exist in order to do interleaving, and it cannot change without rewriting the data. Adjustment of synchronization is very flexible under a noninterleaved scheme. Another benefit of using noninterleaved p] acement is that when a certain
sound
is used
several
ACM
times
Transactions
it
need
on Information
only
be recorded
Systems,
Vol.
once
on the
10, No. 1, January
1992.
82
J. Gemmell and S. Christodoulakis
.
storage
device.
With
interleaved
placement,
for every time it is to be used. With noninterleaved placement, channels
being
played
back.
it would
we only
With
need
interleaved
have
to be copied
to consider
data,
the
once
number
of
we also need to consider
the number of interleaved channels, which may be greater. We denote the number of interleaved channels by n, and the number of playback channels by rzC. There are assumed interleaved scheme, because 7.2 In
Existing this
Placement
section
to be at least two otherwise interleaving
playback channels in would be meaningless.
Strategies
some
placement
surveyed and identified section. The best-known digital
strategies
according audio
that
to the
storage
are
categories
medium
digital
audio. can
Data
is placed
be performed
on the storage continuously,
currently given
device without
in
in
is the compact
(CD). As mentioned in the section reviewing digital storage device, which can hold about 72 minutes reading
an
the
use
are
previous
storage
device
audio, a CD is an optical of stereo (two-channel) in a spiral any
seeks.
format,
so that
This
spiral
is
divided into 2K sectors that contain the audio data and some additional error-correction data. Thus, CDs could be classified as using a contiguous interleaved placement strategy, with two channels of data. A spin-off of the CD is compact storage device interactive (CD-I). The same physical storage device is used for this application, but data other than just audio are permitted, including machine instructions that allow user interaction. Of interest to this survey is that, besides including CD-type sound, CD-I also includes sound. Level
three other types, which are referred to as level A, B, or C A uses about half the bandwidth of regular CD quality audio,
level
half
B about
bandwidth each sample.
that
of level
is cut by using The CD-I
A, and level
a lower
standard
sampling
C about rate
allows for buffering so the placement RAM) to be played back later, However, Philips, a major player in developing the
half and/or
that
of level
less bits
B. The
to encode
of data (l-Mbyte system strategy is not fixed. CD-I standard, indicates
only interleaved data [81. For systems using magnetic storage devices as the storage medium, little information regarding placement is available. However, Watkinson [10] indicates that in studio systems an attempt is made to keep the data contiguous where possible, although scattered placement may be necessary due to storage device fragmentation. 7.3
Options
in Interleaving
When dealing with must be considered
interleaved along with
placement, the exact manner of interleaving the placement strategy. A contiguous section
of the storage device containing data for each channel is read in each reading period, but the question remains; How is the data interleaved? One way of interleaving would be to store the data in such a way that the first sample is read for each channel, then the second sample for each channel, etc. Such an ACM Tramsact,onson Information Systems,Vol. 10, No, 1, January 1992.
Delay-Sensitive Multimedia Data Storage and Retrieval interleaving
strategy
interleaving
occurs
lecting
the
data
is referred sample
for each
to
by
here
as
sample.
channel
The
together,
sample other
which
.
interleaving, extreme
because
would
is referred
83
be col-
to as channel
interleaving. As mentioned in the previous section, a block is a unit of data read for a channel in a reading period. A metablock is the unit of data read for all channels in a reading period the data may or may not depending
on the interleaving
We now conditions.
demonstrate
for interleaved strategies. Within a metablock, be divided up into blocks for each channel, strategy.
the
optimality
of sample
interleaving
under
certain
THEOREM 11. For an interleaved placement strategy (scattered or contiguous) in which the number of playback channels is the same as the number of interleaved channels (n, = n,], it is optimal in terms of buffer space require-
ments
and prefetching
time
Suppose
PROOF.
some
to store the data other
scheme
in sample-interleaved
of storing
optimal. Now compare the read functions for the sample-interleaved strategy. Let the read function ith
channel
At
any
under
time
the
sample interleaving t the amount time scheme
optimal
t, a total
solution
interleaved
the reading any channel
data
is
optimal solution and for a and buffer function for the
be, respectively,
of z ~~~R ~(t) samples
distributes read for
the
fashion.
will
R ~(t) and
have
been
read.
evenly among the under the sample
Z3i(t, to). Because
channels, at interleaving
is at least
(70)
Therefore,
because
consumption buffer function be at least
the buffer
function,
which
for any
function
is equal
is constant
channel
under
to the
and the the
read function
minus
same for each channel,
sample-interleaving
scheme
the the must
(71)
Now under
consider
any
the optimal
point scheme.
in time
t such
At such a time,
that
Bi( t, to) z O for
we must
all
channels
have
(72)
ACM
Transactions
on Information
Systems,
Vol.
10, No. 1, January
1992
84
J. Gemmell and S. Chrlstodoulakis
.
Therefore, all channels under the sample-interleaved scheme are nonzero in buffer space. Therefore, sample interleaving may be substituted as a play❑ back solution with the same buffer space and prefetch time. This theorem used to reduce any other
is derived from the reasoning that, when reading time, sample interleaving is never
interleaving
strategy.
However,
in many
seeks cannot be any worse than
cases seeking
can be used
effectively. If, for example, the number of playback channels is less than the number of interleaved channels, it may be worthwhile to divide the data into a block
for each channel.
channel,
rather
interleaving 7.4 This
than
seeks
then
be performed
to pass over
for each playback
unwanted
data
in any
other
scheme.
Placement section
use the
One seek may
many
within discusses
term
sector
Sectors how placement
is affected
for
block,
a physical
by sector
to
avoid
size (recall
confusion
that
with
we
logical
blocks). If the logical block size is such that it is an integer multiple of the sector size, then one may proceed straightforwardly. If this is not the case, then some difficulties arise. For example, suppose the desired logical block size is 1.5, the sector (1) Because
a block
1.5 blocks. when read. (2) What block
of the
multiple
smallest
unit
the first
a logical
second problem
of the sector
immediately to read,
the cost for reading block,
problem clear.
size, then
first,
(s~ / s,]
sectors
0.5 of a sector
logical
strategies
themselves:
may
In general,
may
(assuming
need the
it will
make
size
is not
block
(1) The remainder of the sector may be simply logical block is made up of an integral number padded placement.
to read just
may be higher.
as discussing
If the two
present
it is impossible
the data
at least
is to be done with the extra began on a sector boundary)?
We tackle ties
is the
Thus,
reading
size. Two problems
to be logical
the properan
integer
be taken: left unused, so that each of sectors. This is called a
(2) No padding is added, and data from the next logical up the sector. This is called a close-packed placement.
block
is used to fill
Obviously, a close-packed placement uses the storage device space more effectively, so it is desirable if there are no negative drawbacks. In order to look at the implications of close packing, we consider interleaved and noninterleaved placements separately. In particular, we examine (1) the implications on reading time for a logical block and (2) buffer space utilization. First consider close-packed, contiguous, noninterleaved placement. Two reading strategies are possible. The first strategy is to round up the logical block size to [ S6 / s~l sectors. In this way, (s~ /s~l sectors channel in each reading period. Because this much must
are read for the be read for any
given block without close packing, there are no disadvantages for buffer space or reading time. Note that more data is being read than necessary, so ACM
Transactions
on Information
Systems,
Vol.
10, No
1, January
1992,
Delay-Sensitive Multimedia Data Storage and Retrieval artificial
delays
to become
are introduced
available.
are necessary
The
into
second
in each reading
the system
strategy
period.
while
waiting
is to readl just
Because
the first
.
85
for buffer
as many
strategy
space
sectors
always
as
reads
too much, it is possible that after a certain amount of time the amount overbuffered will accumulate to the point where one (or more) less block(s) may be read in a reading period. The performance of the second strategy is even better than the first in terms of reading time and buffer it is never ahead of the first in reading. Note, however, that, first logical block, it must read just as much as with the first
space, because at least for the strategy, so its
peak buffer utilization and its worst-case reading time are the fore, although it may use less buffer space and take l[ess reading points
during
other
strategy.
data. The that
playback,
reasoning instead
strategies.
it has peaks
In both
for scattered
of two
reading
In the first
each scattered
cases, there
groups
is to be close packed, recall
that
sample
sample interleaving like playing back
because
is identical,
actually
two
of ~s~ / s,! selctors are kept at most that
many
the
except
placement together
in
are, whereas
at
there would be no reason not to close pack. placement, we must consider exactly what
more
interleaving
are
as the
not to close pack
placement
there
and in the second,
some points it may be less. Again, When dealing with interleaved
the same performance
be no reason
noninterleaved strategies
strategy,
location;
of reaching would
same. Theretime at some
than
one channel
was shown
is kept
to be optimal
together.
First,
in some cases. When
is optimal and is employed, then playback is very one channel, and the argument for noninterleaved
much data
above may be used again to show that close packing is optimal. The only situations where sample interleaving would not be used is where it saves time
to do seeks of individual
channels
within
the interleaved
record,
and we
assume in this situation that all of the data for each channel is kept separate within the metablock. In order to make the metablock an integral number of sectors,
three
approaches
(1) The block
may
be taken.
size for each channel
may
be increased
to an integral
of sectors, and hence, the metablock is also an integral This approach ensures that each block for each channel boundary,
whereas
the other
number begins
number of sectors. on a sector
two do not.
(2) Each channel may be increased just integral number of sectors, whereas
enough that the individual
the metablock is an channel blocks are
not. (3) Data
belonging
to the
next
metablock
may
be added
to fill
in the
sector.
Approach (3) can be rejected immediately for scattered interleaved placement. By taking it, data for the next metablock may be widely separated, defeating the purpose of interleaving. The first strategy spreads the blocks out a bit more than the second, which spreads them out more than the third. Having blocks spread out more may result in slightly higher seek (although still within the upper bounds given in this
chapter). ACM
Transactions
on Information
Systems,
Vol.
10, No. 1, January
1992.
86
J. Gemmell and S. Christodoulakis
. When
blocks
( sb /s.] aligned,
are
aligned
sectors
will
more
blocks
always suffice for each channel. When blocks are not may need to be read. For example, suppose that each
logical
block
within
the first
second,
third,
In general,
consists and
on
sector
boundaries
of one-and-two-thirds second
and fourth
sectors.
sectors,
nonalignment
sectors.
The
second
so three
may mean
(approach
sectors
that
The first
logical
(l)),
logical
block
must
reading
block
lies
lies within
the
be read to retrieve
~s~ / s~l + 1 sectors
it.
need to be read
for each channel. Therefore, alignment of each block on physical boundaries may cost slightly more in seek time, whereas nonalignment may cost slightly more in reading time and may require more buffer space. Finally, using padding in any of the three cases would not improve performance cases. 7.5
Effects
It is very
in any way,
We can conclude
of Compression
on Placement
common
to use compression
that
schemes
close packing
on audio
is optimal
data.
in all
A compression
scheme attempts to reduce the amount of data required to store some information by translating to a different form of encoding. In this section we examine
the impact
7.5.1
Compression
resolution
of compression and
on digital
System
of a delay-sensitive
Resolution.
playback
audio
storage
Let
us begin
system
and retrieval,
as follows:
an audio playback system is the smallest and played back independently of the rest A compression scheme should be chosen
amount of data of the data, so as to not limit
the system.
if the compression
For example,
on the entire decompressed. how little
the
that
of
can be read
the resolution scheme
of
worked
file, such that it needed to be read in entirety in order to be This would mean that the entire file must be read no matter
is desired
for the whole
it is undesirable
by defining The resolution
file.
for playback, An example
and also that is a simple
buffer
“delta”
space must
compression
have
algorithm,
space in
which actual samples are not stored; rather, from its predicted value, based on previous compression is employed across an entire file,
the difference of each sample samples, is stored. If delta reading would always have to
proceed from the beginning played back. For reasons of resolution,
if only
of the
file,
it is desirable
even
to treat
the
last
tenth
is to be
each logical
block
as a file
in its own right and to employ compression on it separately. However, general compression algorithms tend to perform better on larger blocks of data. Usually typical sector sizes will suffice [13]; so if compression is applied to either the sector or logical block, whichever is larger, compression performance should be good, and system resolution should be as high A remark on the application to video is in order at this point.
as possible. Many video
compression schemes hinge on the calculation of differences from frame to frame of the same pixel or region. From our study in this paper, we might think of each region as a channel in a multichannel synchronized system. The conclusions then be applied.
for multichannel
audio
storage
and retrieval
ACM Transactions on Information Systems,Vol. 10, No 1, January 1992
strategies
may
Delay-Sensitive Multimedia Data Storage and Retrieval
7.5.2 that
Compression,
compression
ensure
good
Block
should
system
Size,
and
resolution.
storage
of compressed
exactly
one sector
Close Packing.
be performed
data.
on small
However, For
this
example,
and is compressed
We have units
of data
does not
suppose
tell
87
already
seen
in
order
us much
a logical
to 0.8 of a sector.
.
block
Unless
to
about
takes
up
some form
of
close packing is employed, no savings in either storage device space or performance would be realized, because an entire sector must still be occupied and read for this logical block. In terms of playback, there is no reason not to close-pack noninterleaved compressed data, just as for the blocks should be placed one after
noncompressed another, without
aries.
packing
Note,
difficulties
however, in
an
that
editing
close system.
edited it may no longer block was close packed.
be the In the
entire file. However, we systems. In the case of interleaved compressed vary, make the
data
than
for
was
same worst
restrict
of compressed is because
data
when
our
discussion
placements,
may
a logical
the
here
options
data.
to
lead
to
block
is
are
Because
playback-only
more
number
of sectors.
limited
compression
to extend the length of each number of sectors or so that
an integral
logical bound-
size and may not fit in where the old case, this could lead to rewriting the
noncompressed
it would be very difficult them all fill an integral metablock
This
case. Compressed regard for sector
Three
for rates
block so as to the length of
choices
are
then
possible. (1) Pad every sectors.
block
for
every
channel
to make
(2) Close pack blocks within each metablock, make it an integral number of sectors. (3) Close pack
blocks
For contiguous the
metablocks.
However, ceptable
interleaved This
for scattered because
within
each metablock, data,
would
ensure
interleaved
it would
widely
there that
and then
an
integral
number
of
pad each metablock
to
and close pack
would
be no reason
storage-space
close packing, separate
it
metablocks. not to close pack
savings
metablocks
the data within
are realized. would
metablocks,
be unacdefeat-
ing the purpose of interleaving. For scattered interleaved data, approach (2) should be taken, because it will never waste any more space than approach (l), and may do better. In conclusion, close packing should be employed for all compressed data except scattered interleaved, which should combine close packing and padding, as indicated by approach (2) above.
8. SUMMARY The real-time requirements of digital audio playbaclk are significantly different from the requirements of most “real-time’7 databases. In order to have a successful playback, the retrieval of successive samples must meet successive real-time deadlines. ACM
Transactions
on Information
Systems,
Vol
10, No. 1, January
1992.
88
.
J. Gemmell and S Christodoulakis
In Section 3 we have developed a general theoretical framework for studying the performance of any playback system. We have demonstrated that playback that
can always
playback
buffer
space and
finding
Given the
samples mum
start
the minimum
space. from
be achieved,
can be done,
the
by the
start
time. start
device
problem
time
(and,
describes
at a given
hence,
that
is equivalent
that
digital-to-analog
time
all
is then
It was proved
function
storage
by buffering
the
for
and
converter,
samples
the
consumption
buffer
Knowing
the
smallest playback
the minimum
many
a method
the minimum
samples.
single-channel
to finding
how
time
of the
to do it with
for
had
buffer
been rate
finding
read of the
the
mini-
space) has been given.
We have also studied the effects of reading data in sector units, and how to use the previous results in such a case. Reading data in sectors usually means that buffer space must be allocated before a read begins and that data from the following The
sector results
results
multimedia devices for
of Section system
hardware
faster
3 form
designers
selection with
sector is read. the entire was performed on sectors.
is not available until assumed that reading
decisions
a set of design
to study (such
special buffer
effect
Having
are dependent buffer
not desirable
placement, on system
avoided
on the length
most
cases,
by introducing
dependent and Section
artificial
delays.
can be used designs
compression
by and
schemes,
and simpler results Section 4.1 gave
case of single-channel uncomIt was found that, unless the rate of the device, the buffer
of the audio on
of the
performance.
cases have been examined, space have been obtained.
requirements
in
that
of alternative
times)
bounds on buffering requirements for the pressed audio that can be read continuously. consumption rate is equal to the transfer requirements
tools
as data
seek or transfer
In Section 4 several finding minimum
the
All
record.
the length
of the audio
4.2 has examined A lower
bound
how this was found
record
is
could
be
for buffer
requirements, and an algorithm was demonstrated that had buffer requirements very close to the bound. Section 4.3 has extended this work to cover the case in which some unavoidable delays occur in addition to artificial delays introduced and an algorithm demonstrated.
by the algorithm. Again, with buffer requirements
Section illustrate
5 has presented some important
indicative
of buffer
the properties relationships.
space requirements,
a lower bound was produced, approaching the bound was
of multichannel playback in order to It was observed that block size is while
reading
period
length
is indica-
tive of start time delays. It was found that when nonzero seeks are encountered, block size and reading period length asymptotically approach infinity as the total consumption rate approaches the transfer rate of the storage device. Because of this, it is advisable to have some “headroom,” with the transfer rate of the storage device being significantly higher than the total consumption rate. It was shown that, in order to support greater seek time, one may either decrease the consumption rate of each channel of audio, increase the block size read in each reading period, or increase the data transfer
ACM
rate
Transactions
of the storage
on Information
device.
Systems,
VOI
10, No. 1, January
1992
Delay-Sensitive Multimedia Data Storage and Retrieval In Section
6 these
of a shared special might
storage
needs
results
have
device.
Two
of a shared
allow
free time
the absolute ily, but that
been extended models
system.
The
in its reading
length of free time it is more difficult
in order
to consider
have
been presented
first
considered
cycle for other
per reading to increase
.
the case
to consider
how
tasks.
89
the
an audio
task
It was shown
that
period ma,y be increased arbitrarthe percentage of free time. The
second model considered all requests for data on the storage device to belong to a channel of audio playback. This model has been useful in illustrating the effects of bandwidth utilization, and similar to Section 5, it was found that it is not economical to push the bandwidth limits of the device too far. Section
7 took up the topic
synchronized leaved
playback.
or
noninterleaved,
contiguous ity
strategies
and disk
physical
of storage
Strategies and
achieved
fragmentation.
blocks
was
placement
were contiguous
higher
and
analyzed
scattered.
being
logica]
compressed
inter-
Interleaved
at the expense
of placing
for both
for multichannel
as either
performance
The problem
then
strategies
categorized
blocks
and
and
of flexibil
-
of data
in
noncompressed
data. Because the basic real-time requirements of digital audio playback are entirely analogous to other data types such as video and animations, the results presented here extend to delay-sensitive datal in general. The basic principles
for
retrieving
and
storing
delay-sensitive
multimedia
been described in this paper. These principles tem designer to estimate hardware requirements
data
caln be used by and to evaluate
have a sysdesign
strategies. REFERENCES 1. ABBOT, C.
Efficient
editing
of digital
sound
on disk.
J.
Audio
Eng.
Sot.
32, 6 (June
1984),
394-402. 2. ANAZAWA,
T.,
HAYASHI,
H.,
INOKUCHI,
TODOROKI, S., AND YAZAWA, recording Audio
technology
network
Society
Information
system.
Systems
TAKAHASHI, overview
Presented
(Toronto,
D, P., GOVINDAN,
window
K.,
A historical
at Denon.
Engineering
3. ANDERSON,
H.
at the
Ontario,
R., AND HOMSY,
In
Proceedings
’91 (Singapore,
G.
of
Jan.
AES
May the
16-18,
Y.,
TA~ASU,
of the 7th
YAMAMOTO,
K.,
of PCM/digital
International
14-17,
Conference
of the
1989).
Abstractions
for
International 1991).
A.,
developments
continuous
Conference
McGraw-Hill,
media
on
New
in a
Multimedia
York,
1991,
pp.
273-298. 4. ANDERSON,
D.
continuous
P.,
media
on Distributed
Tzou,
York,
of MINOS:
1986,
6. NAKAJIMA, Ridge ondary Systems 8. PHILIPS
H.,
A.,
F.,
GOVINDAN,
In
AND ANDREWS,
of the 10th
M.
Support
International
for
Conference
(Paris, May 28-June 1, 1990). pp. 54-61.
AND THEODORIDOU,
A symmetric
R.,
Proceedings
approach.
In
M.
The
ACM
multimedia
Proceedings
object
presentation
of SIGMOD.
ACM,
New
Tab Books,
Blue
pp. 295-310.
Summit,
7. PARK,
R.,
system.
Systems
Ho,
S.,
WAHBE,
DASH
Computing
5. CHRISTODOULAKIS, manager
S.,
in the
DOI, T., FUKUDA, Pa.,
AND ENGLISH,
storage.
In
P,
A variable
Proceedings
’91 (Singapore,
Jan.
Digital
Vs.,
rate
strategy
of the International 16-18,
Compact
INTERNATIONAL.
Sons, Harrisonburg,
J., AND I~A, A.
Az~dlo
Technology.
1983.
1991). Disk
for
McGraw-Hill,
Interactive:
retrieving
Conference New
audio
data
on Multimedia York,
A Designer’s
1991,
Oueruiew.
from
sec-
Information
pp. 135-146. Donnelley
and
1, January
1992.
1988.
ACM
Transactions
on
Information
Systems,
Vol.
10, No
90
J. Gemmell and S. Christodoulakis
.
9. THOMPSON, T., AND BARAN, 10. WAT~INSON,
J.
Digital
11. WATKINSON,
J.
The Art
12. WELLS, J., YANG, of the
International
16-18,
1991).
13. WILKES,
D.
14. YAMAMOTO,
T.
Yu,
audio
compression Univ.
Society
C., SUN, W.j data
disks
the
YANG,
for
AES May
pp
Boston, data
, 1988
on optical
disks.
Systems
ACM
June
TransactIons
1990;
revised
February
on Information
1991;
Systems,
Vol.
158-175.
1988),
Mass
In
501-508. Proceedings
’91 (Singapore,
Jan.
123-134 addressed
files.
Can 7th
optical
Master’s
disks
International
14-17,
thesis,
Dept.
of
R., AND TULLIS,
applications.
accepted
be
applicable
Conference
to
digital
of the
Audio
1989). Commun,
862-871.
Received
1988),
36, 6 (June
1985.
Q , BRUNO,
real-time
Press,
Sot.
Information
technology: at
B.vte 13, 12 (Nov. Eng.
of audio
randomly
Ontario,
D.,
BITTON,
1991,
Ontario,
recording
(Toronto,
on optical
York, for
Presented
Focal
Multimedia
of Toronto,
Optical
Audio
Placement on
New
computer. J.
Audio
C.
Conference
workstations?
Engineering 15
Q., AND Yu,
Data
Science,
The NeXT recorders.
of Digital
McGraw-Hill,
Computer audio
N.
audio
July
1991
10, No. 1, January
1992.
J.
Efficient ACM
32,
placement 7 (July
of
1989),