Temporal Relations in Multimedia Objects: WWW Presentation from HyTime Specification

Maria da Graça C. Pimentel, Laércio Baldochi Jr, Fabiano Fagundes
Depto. de Computação e Estatística - ICMSC-USP
{mgp|baldochi|fagundes}@icmsc.sc.usp.br
CP 668 - São Carlos - SP - 13560-970 - Brazil

Cesar A.C. Teixeira
Departamento de Computação - UFSCar
[email protected]
CP 668 - São Carlos - SP - 13560-970 - Brazil

Abstract

The representation of temporal relations in multimedia objects is based on models of time that allow the identification and the specification of temporal relations among different media, particularly those relations relevant to the process of multimedia synchronization. Initially, this paper discusses the use of HyTime for the specification of binary temporal relations. Next, the paper discusses an approach to the automatic transformation of the HyTime synchronization specifications into elements to be presented in the context of the World Wide Web environment.

Keywords: Media Scheduling and Synchronization; Synchronization Specification; HyTime; WWW presentation.

1. Introduction

Research in multimedia systems involves the investigation of problems that arise in the integration of distinct media such as text, images, audio and video. The presentation of continuous media such as audio and video, when they are components of a multimedia object, requires the correct positioning in time of each component. The task of coordinating such a presentation is referred to as multimedia synchronization [2].

Multimedia synchronization depends on the specification of temporal relations among several media. Such relations may be implicit, as is the case with the simultaneous acquisition of voice and video data in a video-conference, or they can be explicitly formulated, as in documents composed of text and voice. In both cases the characteristics of each medium and the relationship between them must be established in order to provide a properly synchronized presentation.

The representation of temporal relations between multimedia objects is based on models of time, which allow the identification and specification of temporal relations among distinct media, particularly those relations relevant to the process of multimedia synchronization [15].

This paper contributes (a) a discussion of the specification of binary temporal relations using HyTime, an international standard for the specification of hypermedia documents that allows the formalization of hypertext and multimedia structures [9], and (b) a presentation of the work being carried out to allow the automatic translation of documents containing the HyTime formal specification of those relations into elements to be presented in the context of the World Wide Web.

This paper is organized as follows: Section 2 discusses concepts related to models of time used in multimedia synchronization; Section 3 presents the HyTime formal specification corresponding to binary temporal relations; Section 4 reports the work being carried out to provide automatic translation of HyTime specifications into elements of a WWW presentation; and Section 5 concludes by discussing the next steps for the work reported here as well as some related work.

2. Models of time

The process of authoring a multimedia presentation demands the specification of the relationship among the components of the presentation. Many of these components may be dependent in terms of time [2], and a multimedia system has to follow such a specification in order to guarantee a proper presentation. This section summarizes the models of time found in the literature.

2.1 Conceptual models of time

The concepts of instant and interval are used in the modeling of multimedia synchronization. An instant of time is a moment of length zero in time; an interval of time is defined by the length between two instants. Given two instants a and b with b > a, the length of the interval is given by b - a. An interval can also be defined in terms of other intervals.
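The definitions above can be made concrete with a minimal sketch. The names Interval and concat are hypothetical and not part of the paper; concat illustrates just one possible way of defining an interval in terms of other intervals (joining two adjacent ones).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A time interval delimited by two instants a and b, with b > a."""
    a: float  # starting instant
    b: float  # ending instant

    def __post_init__(self):
        # An interval needs two distinct, ordered instants.
        if not self.b > self.a:
            raise ValueError("an interval requires b > a")

    @property
    def length(self) -> float:
        # Length of the interval as defined in the text: b - a.
        return self.b - self.a


def concat(first: Interval, second: Interval) -> Interval:
    """Define a new interval from two adjacent intervals (illustrative only)."""
    if first.b != second.a:
        raise ValueError("intervals must be adjacent")
    return Interval(first.a, second.b)
```

For instance, Interval(2, 5).length evaluates to 3, and joining Interval(0, 2) with Interval(2, 5) yields an interval of length 5.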

Binary temporal relations. Allen has identified the 13 temporal relations between two intervals [1]; Figure 1 shows seven of them, and the other six are the inverses of the first six. In order to be used in the modeling of multimedia applications, temporal intervals are associated with media components such as an image or an audio segment; such intervals represent the time parameters required for the media presentation. The use of such modeling allows the representation of complex multimedia objects.

Advanced model of synchronization based on intervals. Based on Allen’s binary temporal relations, Wahl and Rothermel defined their model of synchronization based on intervals, with 29 temporal relations identified as relevant to multimedia presentation [18]. These relations were used to define the 10 operators shown in Figure 2.
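As an illustration of Allen's binary relations, the sketch below classifies a pair of intervals by comparing their endpoints. The function name and the "-inverse" suffix used for the six inverse relations are choices made for this example, not notation from [1].

```python
def allen_relation(a_start, a_end, b_start, b_end):
    """Return the name of the Allen relation that holds between A and B."""
    if not (a_start < a_end and b_start < b_end):
        raise ValueError("each interval must start before it ends")
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start and a_end < b_end:
        return "starts"
    if a_end == b_end and a_start > b_start:
        return "finishes"
    if a_start > b_start and a_end < b_end:
        return "during"
    if a_start < b_start < a_end < b_end:
        return "overlaps"
    # None of the seven relations holds, so one of the first six
    # must hold with the roles of A and B exchanged.
    return allen_relation(b_start, b_end, a_start, a_end) + "-inverse"
```

Because exactly one of the 13 relations holds for any two proper intervals, the recursive call always terminates after one step.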

[Figure 1 diagram: the seven relations A before B, A overlaps B, A starts B, A equals B, A meets B, A during B and A finishes B, each drawn as a pair of intervals A and B.]

Figure 1 - Binary Temporal Relations [1]

[Figure 2 diagram: the operators before(δ1), beforeendof(δ1), cobegin(δ1), coend(δ1), while(δ1, δ2), delayed(δ1, δ2), startin(δ1, δ2), endin(δ1, δ2), cross(δ1, δ2) and overlaps(δ1, δ2, δ3), each drawn as intervals A and B connected by delay arrows.]

Figure 2 - Ten operations of synchronization based on intervals [18]

In Figure 2, each arrow indicates that the starting or ending point of object A activates a delay which, when finished, determines the starting or ending of object B. Wahl and Rothermel’s model allows the definition of both the duration of an object of continuous media and the length of time that an object of discrete media is to be presented. Moreover, the model allows the definition of user interaction, which is one of its strengths. The model also allows the specification of nondeterministic temporal relations through the definition of the intervals in terms of their duration and delay parameters. The model is quite flexible and allows the specification of presentations subject to several variations in their execution [15].
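The delay mechanism can be sketched for three of the single-delay operators. The semantics used here are an assumed reading of [18], not a definition taken from this paper: before(δ1) starts B δ1 after A ends, cobegin(δ1) starts B δ1 after A starts, and coend(δ1) ends B δ1 after A ends. The function name schedule_b and its parameters are hypothetical.

```python
def schedule_b(oper, a_start, a_dur, b_dur, delta1):
    """Return (start, end) of B, given A and one delay (assumed semantics)."""
    a_end = a_start + a_dur
    if oper == "before":
        # The end of A triggers a delay; B starts when the delay expires.
        b_start = a_end + delta1
    elif oper == "cobegin":
        # The start of A triggers a delay; B starts when the delay expires.
        b_start = a_start + delta1
    elif oper == "coend":
        # The end of A triggers a delay; B ends when the delay expires.
        b_start = a_end + delta1 - b_dur
    else:
        raise ValueError("operator not covered by this sketch: " + oper)
    return b_start, b_start + b_dur
```

With A starting at 0 and lasting 10 units, B lasting 5 units and a delay of 2, the three operators place B at (12, 17), (2, 7) and (7, 12) respectively.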

2.2 Models for the specification of synchronization of multiple media objects

Some models allow the specification of more than two media objects; the following discussion is based on the work by Steinmetz and Nahrstedt [15].

Specification of synchronization based on flow control. In this case, the flow of the presentation of concurrent threads is synchronized at pre-defined points of the presentation. The three forms of synchronization based on flow control are (a) the hierarchical specification, (b) the specification of synchronization using reference points and (c) the specification of synchronization using Petri Nets [12].

Specification of synchronization based on events. In this case the presentation of the objects is triggered by synchronization events; such events can be external (associated with the expiration of a timer or some user interaction, for instance) or internal to the presentation, when caused by a time-dependent object (when a stream reaches its end, for instance). This model is widely used, one example being the MHEG-5 ISO standard [10].

Specification of synchronization based on axes. In this case the events of a presentation are mapped to one or more axes common to all objects of the presentation. Two forms are possible: synchronization based on one global timer [7], and synchronization based on virtual axes. The latter is directly supported by the HyTime ISO standard, as discussed next.

3. Specification of temporal relations with HyTime

HyTime (Hypermedia/Time-Based Structuring Language) is an international standard for the definition of the structure of documents containing hypermedia and multimedia information [9]; HyTime is based on the international standard SGML (Standard Generalized Markup Language) [8]. HyTime is formally defined by a set of constructors called Architectural Forms (AFs), used in the specification of the structural aspects of a document: hypermedia structural aspects include multi-ended links; multimedia structural aspects include powerful addressing and locating mechanisms used in the scheduling and synchronization of events in space and time. The AFs defined in HyTime are organized in six modules, two of which are relevant to the work reported in this paper: the measurement module, defining the representation of measurement units; and the scheduling module, defining the representation of temporal and spatial relations.

3.1 Positioning and scheduling of objects

Measurement module. The AFs of this module formalize the specification of measurement units using elements that allow marking in space and, as a result, the definition of distance, positioning and length units [6][9]. The measures are expressed as maximum and minimum integer values in a finite space. The architectural forms allow the definition of elements ranging from a point in space to complex areas in several dimensions.

Scheduling module. In order to allow the scheduling of the presentation of objects, HyTime adopts a model for space and time that is based on finite axes, where each axis defines one addressable space [9]. A set of axes is defined as a finite coordinate space using the AF fcs. The scheduling module allows the definition of regions in a finite coordinate space where objects are to be placed; an event is therefore defined as the association of a data object with a particular interval in that space [9]. The semantics associated with an event, as well as its presentation form, are, as always in HyTime, defined by the application [6].
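The notion of an event as a data object associated with an interval on a finite axis can be illustrated with a short sketch. The class names Axis and Event are hypothetical, and the sketch assumes that an extent is given as a first quantum plus a quantum count, with quanta numbered from 1; it models a single time axis rather than a full finite coordinate space.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    obj: str    # identifier of the data object being scheduled
    first: int  # first quantum occupied on the axis
    qcnt: int   # number of quanta occupied

    @property
    def last(self) -> int:
        # Assumption: quanta are counted inclusively.
        return self.first + self.qcnt - 1


@dataclass
class Axis:
    length: int  # the axis is finite: quanta 1..length
    events: list = field(default_factory=list)

    def schedule(self, obj: str, first: int, qcnt: int) -> Event:
        """Associate a data object with an interval on this axis."""
        event = Event(obj, first, qcnt)
        if first < 1 or qcnt < 1 or event.last > self.length:
            raise ValueError("event falls outside the finite axis")
        self.events.append(event)
        return event

    def active_at(self, quantum: int) -> list:
        """Return the objects whose events cover the given quantum."""
        return [e.obj for e in self.events if e.first <= quantum <= e.last]
```

What the application does with the objects returned by active_at (play, display, ignore) is left open here, mirroring the point that event semantics are defined by the application.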

3.2 Specification of temporal relations in multimedia objects

HyTime defines constructors that allow the definition of a relation during the specification of the scheduling, for instance to indicate that an event must begin after another event has finished or that two events coincide. One important constructor is the HyTime element dimref, which chooses a value from a given dimension and copies this value as a marker to the dimension being specified. The element dimref has the attributes elemcomp, which specifies the target dimension, and selcomp, which specifies which information in that dimension is the marker. The attribute selcomp can have the values first or last, indicating the first or last marker in the dimension, respectively; or qcnt, specifying the length of the dimension.

As an example, Figure 3 presents the definition of the dimension event1 (evt1) using the constructor extlist, which allows the definition of markers, and specifies the events evt2 and evt3, using dimref, as dependents of evt1 (the first marker defines the starting point of the event and the second marker defines its length).
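The behaviour of dimref described above can be mimicked with a small resolver. This is only a sketch: extents are modelled as (first, qcnt) pairs, the marker values are illustrative, and the computation of the last marker assumes that quanta are counted inclusively. The way evt2 and evt3 depend on evt1 below is one hypothetical arrangement, not a transcription of Figure 3.

```python
def dimref(extent, selcomp):
    """Pick one value from an extent given as (first marker, length)."""
    first, qcnt = extent
    if selcomp == "first":
        return first
    if selcomp == "last":
        # Assumption: a dimension of qcnt quanta starting at `first`
        # ends at first + qcnt - 1.
        return first + qcnt - 1
    if selcomp == "qcnt":
        return qcnt
    raise ValueError("selcomp must be first, last or qcnt")


# Illustrative values: evt1 starts at marker 1 and has length 20.
evt1 = (1, 20)
# evt2 coincides with evt1: same starting point, same length.
evt2 = (dimref(evt1, "first"), dimref(evt1, "qcnt"))
# evt3 begins right after evt1 finishes and has its own length.
evt3 = (dimref(evt1, "last") + 1, 20)
```

Changing the markers of evt1 is then enough to reposition both dependent events.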

Figure 4: Specification of location and sync using extlist and dimref

Once defined, the elements location and sync themselves can be used to specify the positioning of objects on a time axis. This means that the specification of the synchronization of events on a time axis is made possible using a few architectural forms of the measurement and scheduling modules. Therefore, the binary temporal relations presented in Figure 1 can be mapped to a time axis using the elements location and sync, as shown in Table 1 in Annex A, which shows how a document would specify the use of such relations. It is important to observe that the definitions in Table 1 assume previous knowledge of the dimensions of the related intervals; an action-based specification is presented in Section 3.3.

As shown in the work by Wahl and Rothermel [18], the specification of synchronization may demand the use of delay parameters (Figure 2). These operations demand the definition of a new element, opersync (Figure 5), extending the previous elements location and sync. The specification of opersync defines the attributes oper, delta1, delta2, delta3 and ext: oper indicates the target synchronization operation; delta1, delta2 and delta3 indicate the interval(s) required; and ext gives the extent of the event on the axis. When used in a document, the information is interpreted by the application in charge of the presentation; this application is also responsible for resolving error conditions.



Figure 5: HyTime specification of the operations defined by [18]
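Since the application is responsible for interpreting opersync and for handling error conditions, one check it could perform is that each operation receives the number of delays it needs. The sketch below is an assumption about how such a check might look: the attribute names come from the text, the number of delays per operation is read off Figure 2, and the function name and the dictionary representation of the attributes are hypothetical.

```python
# Number of delay parameters each operation of Figure 2 takes.
ARITY = {
    "before": 1, "beforeendof": 1, "cobegin": 1, "coend": 1,
    "while": 2, "delayed": 2, "startin": 2, "endin": 2, "cross": 2,
    "overlaps": 3,
}


def check_opersync(attrs):
    """Validate opersync attributes and return the delays as integers."""
    oper = attrs.get("oper")
    if oper not in ARITY:
        raise ValueError("unknown operation: %r" % (oper,))
    deltas = [attrs.get(name) for name in ("delta1", "delta2", "delta3")]
    needed = ARITY[oper]
    if any(d is None for d in deltas[:needed]):
        raise ValueError("%s needs %d delay(s)" % (oper, needed))
    if any(d is not None for d in deltas[needed:]):
        raise ValueError("%s takes only %d delay(s)" % (oper, needed))
    return [int(d) for d in deltas[:needed]]
```

A call such as check_opersync({"oper": "while", "delta1": "2", "delta2": "3", "ext": "20"}) returns [2, 3], while omitting delta2 raises an error that the presenting application can report.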

3.3 More powerful HyTime specification

The HyTime feature supporting the definition of virtual axes allows a more elaborate specification of the relationship between media objects; this can be exploited in the definition of action-based operations, as defined in Figure 6.

[Figure 6 markup, partially recoverable: an empty element declaration (- o empty) whose attributes have the declared values (begin|end), cdata, (init|stop) and idref, with defaults #required, #implied, #required and #implied.]

Figure 6: Specification of the operation supporting actions (events)

The feature demonstrated in Figure 6 formalizes synchronization operations between objects whose duration is not known in advance: the end of such an object is treated as the occurrence of an event.

4. Presenting HyTime elements in the WWW

The documents presented in the WWW environment are formatted according to HTML (HyperText Markup Language) [5]. When limited to HTML, these documents are mostly textual, with no facilities for multimedia synchronization. When the WWW environment is extended with the Java Virtual Machine [16], HTML documents may refer to Java applets, which activate the execution of Java programs.

Initially, this section discusses aspects related to the automatic translation of HyTime elements in order to allow their presentation in the WWW. Next, the limitations of the current implementation are discussed.

4.1 Allowing HyTime elements in HTML documents

One of the advantages of using formal specification in the definition of documents is the ease of performing automatic processing of the specifications in order to achieve some useful result. In the case of this paper, the aim is to demonstrate how HyTime synchronization specifications can be handled in an open distributed system such as the WWW with the use of the Java Virtual Machine. Figure 7 illustrates a typical environment that processes HyTime documents: the HyTime environment should contain a HyTime engine that accepts the output of an SGML parser (responsible for the verification of the underlying SGML structure of the document), identifies and processes the HyTime elements contained in the document, and provides applications with the output from both the SGML parser [4] and the HyTime engine.

[Figure 7 diagram, components: database, hyperdocument, SGML parser, HyTime engine, application.]

Figure 7: A typical HyTime document processing environment [13]

In the environment of the WWW, from the HyTime point of view, the browsers are the applications that process the structure of the document, present related information and deal with user interaction. In order to allow the presentation in the WWW environment of a HyTime document containing the synchronization operations defined, two steps were necessary:

1. A Java applet was built to control the presentation of one object of the media text synchronized with one object of the media audio; this applet is called SyncEvent.
2. A translator was built that processes a modified HTML document (the modifications include the use of the synchronization operations specified above) and generates an HTML document that includes the activation of the applet SyncEvent.

The applet SyncEvent presents two objects in the WWW, a textual object and an audio object, related to each other in any of Allen’s 13 temporal relations. SyncEvent exploits the multithreading features of Java in order to implement three threads:

• TextThread: retrieves and presents a textual object; the text is presented in a scrolling mode;
• AudioThread: retrieves and plays back an audio object;
• TimerThread: controls timers associated with the other two threads in order to synchronize the presentation.

After the translation, the resulting document can be loaded and executed by a browser extended with the Java Virtual Machine. This process, illustrated in Figure 8, indicates that such a multimedia service can be made available in the WWW, limited to the capabilities of the Java Virtual Machine environment, as discussed next.
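The three-thread arrangement can be illustrated with a small sketch. It is written in Python rather than Java and is not the SyncEvent applet itself: the function name, the use of threading.Event as the release signal, and the log standing in for actual text scrolling and audio playback are all assumptions made for this example. A timer thread releases the text and audio threads at start offsets that are known in advance.

```python
import threading
import time


def run_presentation(text_start, audio_start):
    """Release a text and an audio 'player' at the given offsets (seconds).

    Returns the order in which the two media threads were started.
    """
    log = []
    lock = threading.Lock()
    go = {"text": threading.Event(), "audio": threading.Event()}

    def media(name):
        # Stand-in for TextThread / AudioThread: wait for the timer's
        # signal, then "present" the object by recording its name.
        go[name].wait()
        with lock:
            log.append(name)

    def timer():
        # Stand-in for TimerThread: fire each start signal at its offset.
        t0 = time.monotonic()
        schedule = sorted(
            [("text", text_start), ("audio", audio_start)],
            key=lambda pair: pair[1],
        )
        for name, start in schedule:
            remaining = start - (time.monotonic() - t0)
            time.sleep(max(0.0, remaining))
            go[name].set()

    threads = [
        threading.Thread(target=media, args=("text",)),
        threading.Thread(target=media, args=("audio",)),
        threading.Thread(target=timer),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling run_presentation(0.0, 0.1) starts the text before the audio, and swapping the offsets reverses the order; a relation such as "A before B" then reduces to choosing the two offsets from the known durations.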

Figure 8: Processing HyTime document to WWW presentation

4.2 Limitations of the current approach

The multithreading capability of Java makes it suitable for the presentation of multimedia data in the WWW only to a limited extent: for instance, the interval associated with a media object has to be known in advance. The current Java Virtual Machine environment, without the full functionality to be provided by the Java Media Framework, limits the implementation of code that supports more elaborate HyTime specifications. For instance, the specification presented in Figure 6 allows the definition of synchronization operations between objects whose duration is not known in advance. Supporting such operations with SyncEvent would require the TimerThread to be informed of the end of a media object by its associated thread, which is not possible with the current Java technology. However, this limitation should be overcome with the full availability of the Java Media Framework for several platforms; the work reported in this paper is under continuous development in order to exploit the partial versions of the Java Media Framework as they are made available.
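The kind of notification that action-based operations require can be sketched in an environment where a media thread is able to signal its own completion. The example below is written in Python, not Java, and makes no claim about the Java Media Framework; the function name and the callables play_a and start_b are hypothetical stand-ins for media playback. The end of object A, whose duration is unknown, is treated as an event that triggers the start of object B.

```python
import threading


def play_then_start(play_a, start_b):
    """Start B when A signals that it has ended (duration of A unknown)."""
    a_ended = threading.Event()

    def run_a():
        play_a()       # blocks for as long as A lasts
        a_ended.set()  # the end of A is raised as an event

    def run_b():
        a_ended.wait()  # B is triggered by the event, not by a timer
        start_b()

    threads = [threading.Thread(target=run_b), threading.Thread(target=run_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

No duration appears anywhere in the sketch, which is the essential difference from a timer-driven presentation.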

5. Final remarks

Some researchers have reported the use of HyTime as a tool for the specification of multimedia objects. Buford et al. demonstrated the use of such specifications in the definition of an interactive slide-show application [3], having defined specifications equivalent to those presented in Figure 4. Similar work is presented by Kimber in [11]. The work reported by Rutledge, Buford and Rutledge demonstrates the use of Java applets in the presentation of their slide-show in the WWW environment; however, very little of the HTML structure is maintained, since the applets communicate directly with their HyTime engine [14].

HyTime is an international standard for the specification of documents containing both multimedia and hypermedia structures; the aim is to provide a standard form to describe documents to be exchanged among several systems and/or platforms. This paper has demonstrated (a) a few of HyTime’s capabilities in terms of the specification of media scheduling and synchronization; and (b) how such specifications can be used in the distributed environment of the WWW, with the support of the Java Virtual Machine.

In terms of specification, the work reported the definition of the synchronization operations defined by [1] and [18]. Such specification was presented using the original HyTime features for addressing virtual axes. More elaborate specifications were provided which extended HyTime’s original event handling in order to allow action-based operations.

The next step in this research involves investigating how HyTime documents can be translated into MHEG-5 objects so that they can be presented by an MHEG-5 engine, as proposed by [17]. This task will be facilitated by the use of the specification of action-based operations presented in this paper, which is similar to the MHEG-5 specification.

Acknowledgments

This work is supported by CNPq/ProTeM-CC/SMmD grant #680077/94-4. Mr. Baldochi and Mr. Fagundes are supported by FAPESP grants #96/595-7 and #96/573-3.

References

[1] Allen, J. F. Maintaining Knowledge about Temporal Intervals. Communications of the ACM, vol. 26, no. 11, pp. 832-843, November 1983.
[2] Buford, J. F. K. Multimedia Systems. ACM Press, 1994.
[3] Buford, J. F. K. et al. HyOctane: a HyTime engine for an MMIS. Multimedia Systems, vol. 1, no. 4, pp. 173-185, February 1994.
[4] Clark, J. SP. http://www.jclark.com/sp/index.html, June 1996.
[5] Connolly, D. W. Hypertext Markup Language - 2.0 HTML Public Text. World Wide Web Consortium, Laboratory for Computer Science, MIT, http://www.w3.org/pub/WWW/MarkUp/html-spec/htmlspec_9.html, 1995.
[6] DeRose, S. J.; Durand, D. G. Making Hypermedia Work: A User’s Guide to HyTime. Kluwer Academic Publishers, Massachusetts, 1994.
[7] Hodges, M. E.; Sasnett, R. M.; Ackerman, M. S. Athena Muse: A Construction Set for Multimedia Applications. IEEE Software, pp. 37-43, January 1989.
[8] ISO/IEC 8879 Information Processing - Text and Office Systems - Standard Generalized Markup Language. October 1986.
[9] ISO/IEC 10744 Hypermedia/Time-Based Structuring Language - HyTime, 1992.
[10] ISO/IEC 13522-5: Support for Base-Level Interactive Applications MHEG-5, 1995.
[11] Kimber, E. A Tutorial on HyTime. 1996.
[12] Little, T. D. C.; Ghafoor, A. Scheduling of Bandwidth-constrained Multimedia Traffic. Computer Communication, vol. 15, pp. 381-387, 1992.
[13] Newcomb, S. R.; Kipp, N. A.; Newcomb, V. T. The HyTime Hypermedia/Time-based document structuring language. Communications of the ACM, vol. 34, no. 11, pp. 67-83, November 1991.
[14] Rutledge, L.; Buford, J. F.; Rutledge, J. L. Applying HyTime to HTML. IASTED Distributed Multimedia Systems and Applications Conference 95, August 1995.
[15] Steinmetz, R.; Nahrstedt, K. Multimedia: Computing, Communications & Applications. Prentice Hall, 1995.
[16] Java and the JVM. http://www.java.sun.com; http://www.javasite.bme.hu/doc/plaftorm/JavaPlatform.doc2.html, 1996.
[17] Teixeira, C. A. C. et al. Arquitetura do projeto SMmD. I Workshop em Sistemas Hipermedia Distribuídos. São Carlos, 1995, pp. 30-39.
[18] Wahl, T.; Rothermel, K. Representing Time in Multimedia Systems. Proceedings of International Conference on Multimedia Computing and Systems, pp. 538-543, May 1994.

Annex A

Table 1: The 13 temporal relations of synchronization specified using location and sync

In the table, interval A spans α1..α2 with length δ1, and interval B spans β1..β2 with length δ2.

Relation        Position of A and B on the time axis
A before B      α2 < β1
A overlaps B    α1 < β1 < α2 < β2
A starts B      α1 = β1, δ1 < δ2
A equals B      α1 = β1, δ1 = δ2
A meets B       α2 = β1
A during B      β1 < α1, α2 < β2
A finishes B    β1 < α1, α2 = β2

[Table 1, specification column: the location and sync markup given for each relation is not recoverable; the markers it uses are α1, δ1, β1 and δ2.]
