Authoring Hypermedia for Fluctuating Resource Availability: An analysis of the problem

Lynda Hardman, Dick C.A. Bulterman
Abstract

A next step in our approach to authoring platform-independent hypermedia documents is to provide a document specification that enables satisfactory playback under fluctuating system resource conditions. We discuss the problem for both fluctuating network usage and variable end-user machine capabilities. The problem is placed in terms of an “information pipeline”, where information is selected from a large store and assembled to deliver a presentation to the end-user. Our approach to the problem is to provide a high-level description of a hypermedia presentation and a means of determining media selection. A brief discussion of related work is given. A solution will be based on extensions of our existing hypermedia authoring tools.
1 Introduction
This paper describes the problem of authoring platform-independent hypermedia presentations that can dynamically adapt to the available resources. This is a continuation of our work on creating transportable hypermedia presentations, where we specify a presentation so that it can be played on a variety of end-user platforms. Here, we concentrate on the problem of assisting an author with the specification of hypermedia presentations which can adapt to the available resources, such as the end-user's hardware platform and fluctuating network loads. In order to support an author in the creation of such presentations we first need to look at the resources that can limit the quality of the presentation.

A hypermedia presentation must be displayable on any of the diverse hardware platforms connected to the network. The platforms need not belong to different users: the same user may wish to access the information in the office, at home, or while travelling. The user needs access to the same information in each of these locations, but the available hardware may differ greatly, for example a high-end Silicon Graphics machine in the office and a laptop Macintosh in the aeroplane. The hypermedia presentation defined by the author needs to run on any of the user's machines and still give the impression that it is the same presentation (although its physical appearance may differ).

The amount of information available to any of the end-user platforms is limited by the communication bandwidth of the network. The problem is to allocate the bandwidth in a convenient and fair manner. For example, if the user has a low-resolution screen, there is
little point in transmitting a high-resolution picture over the network. On the other hand, if the user does have a high-resolution screen then a crowded network will mean a long wait for the high-quality picture. In this case the user is better served if a low-resolution picture is sent initially, with the higher-resolution version being sent as, and if, bandwidth allows. These trade-offs may be not only within one medium, but across media, where bandwidth-intensive media (such as audio and video) may be substituted by lower-bandwidth alternatives (such as pictures and text).

Our question is then the following: how can an author specify a sufficiently rich description of a hypermedia presentation in such a way that it can be played back on a variety of run-time environments, making best use of the hardware capabilities available, while at the same time not wasting network bandwidth? The question has two parts: (1) what information does an author need to specify, and (2) how much support, in terms of tools and computational aids, can we provide the author for specifying this information?

The following section gives an example of a hypermedia presentation and discusses the problems of resource availability, for both fluctuating network usage and variable end-user machine capabilities. Section 3 describes our question in terms of an “information pipeline”. This is followed by an outline of our approach to a solution, requiring a high-level description of a presentation and a means of determining media selection. A discussion of related work is given, and a final section discusses the status of our current tools.
2 Adapting a Hypermedia Presentation to Available Resources
As a starting point for our discussion on dynamically allocating resources for hypermedia presentations, we consider the characteristics of a “typical” hypermedia presentation. One example of such a presentation is shown in Fig. 1, which illustrates three fragments from a tour of the city of Amsterdam. In the top fragment, which is analogous to a table of contents, a user is given a description of the tour and a number of choices that can be selected within the presentation. One choice is illustrated (a description of walks through the city, highlighting several features found on the tour), which is itself sub-divided across a number of other fragments.
Figure 1. An example interactive multimedia presentation. (The figure shows a “Welcome to Amsterdam” contents screen offering leisure activities, walking routes and maps, and two linked screens, “Gables” and “Musicians”, describing the canal houses with captions in English and Dutch; each screen carries a Contents button for returning to the contents screen.)
From a media perspective, each fragment consists of a number of media items displayed on screen or played through loudspeakers. Hypermedia presentations can be read interactively, or they can be displayed in the manner of a film, where a fixed sequence of information fragments is projected to a passive user. A typical characteristic of hypermedia is allowing the end-user to select a navigation path through the information via links. The information at the destination of a link may be stored locally, but may equally be stored anywhere around a (potentially global) network.[1] The interactive nature of the presentation means that the author cannot predict which path a reader will take through the information. Thus the system cannot calculate how much bandwidth to reserve for the data to be transferred. (The current situation with information available via the Internet is that at certain times of the day the network becomes overloaded, so that perceived data transfer rates per individual are extremely low, resulting in large numbers of frustrated users.)

We seek a way of transferring sufficient data to recreate the intended essence of the presentation, but not so much data that the reader has to wait an inordinate amount of time for it to be transferred. For example, where the author has created a presentation such as that in Fig. 1, but the available bandwidth is insufficient for delivering the sound, videos and text in a reasonable amount of time (within a second or two), an adequate impression of this particular presentation can be given by, for example, sending all the text, the first video frame and no sound (since the sound duplicates the information conveyed by the subtitles).

The cause of a decrease in presentation quality may not lie with the network, but with the capabilities of the reader's display environment, e.g. it may have no audio hardware and a low-resolution screen. Some combination of network bandwidth availability and machine capability has to be taken into account. For example, when the network is full, the capabilities of the machine are not the determining factor; conversely, if the user's machine has only basic capabilities, the load on the network is irrelevant. Only when the user has a high-performance machine and the network can guarantee the necessary bandwidth can the user hope for a high-quality presentation (as the author had intended). Information about the user's hardware capabilities and the data “intensity” of the media objects both need to be known and taken into account when calculating which data should be sent across the network.
1. The WWW (World Wide Web) is a good example of globally distributed information, where, using a browser such as Mosaic, at the click of a mouse-button the reader can access text, pictures, or even sound samples and video images, from anywhere on the Internet (Schatz & Hardin 1994).
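To make this combination of machine capability and network availability concrete, the sketch below (our illustration, not part of the original paper; the quality levels, function names and media substitutions are assumptions) expresses the rule that the achievable quality is bounded by the weaker of the two factors, together with the kind of substitution described above for Fig. 1:

    # A minimal sketch of the trade-off described above: the achievable
    # presentation quality is bounded by whichever is weaker, the end-user
    # machine or the network. Levels and substitutions are illustrative.

    def achievable_quality(machine_level: int, network_level: int) -> int:
        # Quality levels: 0 = minimal, 1 = adequate, 2 = preferred.
        # A high-end machine on a congested network, or a basic machine on an
        # idle network, is limited to the lower of the two levels.
        return min(machine_level, network_level)

    def select_media(quality: int) -> list:
        # Illustrative substitutions for the Amsterdam tour fragment of Fig. 1.
        if quality == 2:
            return ["video", "audio", "text"]              # full presentation
        if quality == 1:
            return ["first video frame", "audio", "text"]
        # At the lowest level audio is dropped: the subtitles already carry it.
        return ["first video frame", "text"]

    # A high-end workstation (2) on a crowded network (0) gets the minimal set.
    print(select_media(achievable_quality(2, 0)))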
3 The Information Pipeline
Our view of the information environment is shown in Fig. 2. Information, or media, objects are distributed around a network (a). Each of these is labelled, in some way, with a means of identifying its contents[2] (shown in the figure as small hatched boxes). Some of these objects (b) have been selected (for example by the author) to be part of a presentation. These may have been selected on the basis of content matter, fulfilling a user query, availability (e.g. cost of retrieval in either financial or resource-usage terms), or preferred representation medium (e.g. visual versus audible). Once a number of fragments have been selected, they must be assembled into a coherent presentation. Information, such as grouping of the objects and hyperlinks among the objects, has to be added. This results in a document structure (c), conforming to, for instance, the hypermedia model described in (Hardman et al., 1994). This is then used to generate, taking into account any user preferences for choice of media, a platform-independent presentation specification including layout and synchronization constraints (d). The semantic information (denoted by the small hatched boxes in (c)) has been used to generate these playback constraints and is no longer needed. The final step is to take the machine-independent representation (d), adapt it according to the end-user's hardware (which can be assumed to be static) and the (constantly changing) network load, and play it back on the end-user's machine (e).

The part of the pipeline on which we wish to concentrate is (c), along with the information needed for the conversion from a given hypermedia structure, (c), to a platform-independent presentation specification (d). That is, what type of information needs to be stored in the platform-independent hypermedia specification, or even with the media objects themselves, in order that it can be interpreted to make most efficient use of the end-user platform and current network conditions? (Where “most efficient” means the highest-quality presentation given both the hardware and network limitations.)
Figure 2. The Information Pipeline. (The figure shows: (a) information fragments, labelled data objects distributed around the network; (b) selected fragments, chosen on the basis of content, availability (access rights, existence, cost) and representation; (c) structured fragments, with composition and links added; (d) a CMIF specification for multiple environments; and (e) the final presentation, executed under network and local machine constraints. The stages up to and including (d) are platform independent; (e) is platform dependent.)
2. For example, the URCs (Uniform Resource Citations) being proposed for the WWW (Berners-Lee, 1993). These provide a classification mechanism for data items on the network, similar to the bibliographic information contained in a library collection.
Also, how are we able to use this information for deriving the platform-dependent presentation descriptions?

In addition to the information in the platform-independent presentation description, we are interested in providing the author with a supportive environment for, firstly, creating a high-quality hypermedia presentation and, secondly, describing, with minimal authoring overhead, the possible trade-offs that could be made by the system between high- and low-quality presentations. This authoring support may be in the form of direct authoring tools, such as the CMIFed environment (Van Rossum et al., 1993) for creating non-adaptive hypermedia presentations, but may also include (semi-)automatic tools for satisfying author-specified constraints. (For readers already familiar with the CMIFed work, this might be similar to the derivation of the resource-based view from the structure-based view in CMIFed (Hardman et al., 1993), where the author is able to fine-tune the timing relations in the derived view.)
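As a rough illustration of the pipeline of Fig. 2, the sketch below models stages (a) to (e) as a chain of data transformations. It is our own reading of the figure, not code from the paper; every class and function name, and the toy selection logic, is an assumption made for illustration.

    # Stages (a)-(e) of the information pipeline as a chain of transformations.
    from dataclasses import dataclass

    @dataclass
    class MediaObject:            # (a) a labelled fragment somewhere on the network
        uri: str
        labels: tuple             # content labels, cf. the URC footnote

    @dataclass
    class StructuredDoc:          # (c) grouped fragments plus hyperlinks
        groups: dict
        links: list

    @dataclass
    class PresentationSpec:       # (d) platform-independent layout and sync constraints
        layout: dict
        sync: list

    def select(store, wanted):                   # (a) -> (b): pick fragments by label
        return [m for m in store if wanted & set(m.labels)]

    def structure(fragments):                    # (b) -> (c): add grouping and links
        return StructuredDoc(groups={"tour": fragments}, links=[])

    def generate(doc, user_prefs):               # (c) -> (d): derive playback constraints
        return PresentationSpec(layout={"prefs": user_prefs}, sync=[])

    def execute(spec, machine_level, network_level):   # (d) -> (e): adapt at run time
        return {"spec": spec, "quality": min(machine_level, network_level)}

    store = [MediaObject("cwi.nl/gables.mpg", ("gables", "canal houses")),
             MediaObject("cwi.nl/tulips.gif", ("flowers",))]
    final = execute(generate(structure(select(store, {"gables"})),
                             user_prefs={"audio": False}), 2, 1)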
4 Information Specification Requirements for Authors
4.1 High-level presentation description
In order to be able to derive a number of different “ready-to-play” presentations from one baseline presentation, this base has to be described in non-media-specific terms. That is, the presentation must be described in terms of its message, or content, rather than directly in terms of the media items comprising it, an approach taken by both (MacNeil, 1991) and (Davis, 1994). This can be compared with the current CMIFed structure (or hierarchy) view, as described in (Van Rossum et al., 1993), where the author's narrative of the presentation is built up in a hierarchy, and only the leaf nodes of this description contain references to the media objects to be used. If we omitted the leaf nodes of this hierarchical description, we would be left with the story as the author had planned it, but without the specific items needed to present the narrative. The author then requires some way of specifying candidate items to be played in the presentation, given the user's hardware configuration and the current (or predicted) network loading.
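A minimal sketch of such a media-independent hierarchy is given below. The class names, and the particular way candidate items are attached to leaf nodes, are our own illustration of the description above and are not taken from CMIFed:

    # A presentation hierarchy whose leaf nodes carry candidate media items
    # rather than a single fixed item, so the same narrative can be realised
    # differently on different platforms. Names and fields are illustrative.
    from dataclasses import dataclass, field

    @dataclass
    class Candidate:
        medium: str            # e.g. "video", "audio", "image", "text"
        uri: str
        bandwidth_kbps: int    # rough data intensity, consulted at selection time

    @dataclass
    class Node:
        name: str
        children: list = field(default_factory=list)
        candidates: list = field(default_factory=list)   # only leaf nodes use this

    # The author's narrative survives even if every candidate list is ignored.
    walks = Node("Walking routes", children=[
        Node("Gables", candidates=[
            Candidate("video", "gables.mpg", 1500),
            Candidate("image", "gables.gif", 64),
            Candidate("text",  "gables.txt", 2),
        ]),
    ])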
4.2 Dynamic means of choosing media items
Given a high-level, media-independent description of the presentation, the author needs to specify a number of media representations from which the system can select. Firstly, the representation choices need to be made explicit, and secondly, the grounds on which the system can make a selection need to be specified. For example, a minimal specification would cover the two ends of the spectrum: one description for a “highest possible quality” presentation, making use of the highest-resolution screen available when there is no restriction on available network bandwidth, and an alternative “lowest acceptable quality” presentation for a low-end machine when the network is full.

4.2.1 Adaptive information object model

The object we propose for storing the possible media representations is the AIO (Bulterman, 1993), shown in Fig. 3. In this illustration, three possible representations of a particular scenario are shown: a video clip along with its soundtrack (preferred); an audio
description with a set of still images (adequate); and a text block with captioned stills (minimal). This abstraction captures a number of media representations from which the system can dynamically select. A particular AIO may be complex, having many representations available, as shown in the figure, or it may be a simple server that provides controlled delivery of a single data type. Each AIO may be a separate entity or it may be managed via an object server. The overall role of the AIO is to manage the implementation of constraints on the presentation of data. The AIO is a means of storing the information fragments represented in Fig. 2 (a).

Figure 3. An Example Adaptive Information Object. (The figure shows an AIO offering a preferred, an adequate and a minimal representation.)

4.2.2 Author, reader and environment constraints

The selection of which representation to use from the AIO is made at runtime, based on information given by the author (for example, which are the “preferred” representations) and a number of constraints. These constraints fall into two broad categories: static constraints, whose impact on a particular projection selection is known when the document is defined, and dynamic constraints, whose impact on projection selection is not known until the document is accessed.

Static constraints on the selection of a projection are those that are defined when the document or information object is created. While the resolution of these constraints occurs at runtime, the choices can be analyzed before presentation begins. Examples are:

• information encodings: information can be mapped to one of several projections, for instance encapsulated within an AIO, each of which may use one or more types of presentation media. While selecting among projections is a runtime task, the range of projections available is assumed to be known statically, even though the media representation itself may be synthesized at the time it is referenced.

• author preferences: while defining a presentation, the author can specify the representation(s) that best project the information under different circumstances, for example the choices “preferred”, “adequate”, and “minimal” in Fig. 3.

Dynamic projection constraints are those that depend on the combination of a particular user of a document and a particular presentation environment. In addition, they include constraints based on the runtime state of the interconnection environment. Dynamic factors include:

• reader preferences: for a given presentation, the needs of readers in consuming the information will depend on a variety of factors. The current task may dictate that a reader would rather receive text-based than audio-based information, or that, because the reader is visually monitoring another process, audio is required in place of text;
• system resources: for a given presentation, a homogeneous presentation platform cannot be assumed. Some presentation platforms will support a wide range of input and output devices, but others will contain only a subset of those potentially available. These constraints would be based on the presence or absence of a particular type of input/output functionality;

• environment capabilities: for a given presentation, the basic ability of the support environment to provide information to the user will be constrained by resource availability across the environment. While, for example, a particular user may prefer to see a sequence of video images instead of a block of text, the underlying environment (including information servers, transport networks, intermediate buffering hosts, local operating systems, etc.) may not be able to provide this service (even if the local presentation environment can).

The nature of the various constraints is not yet well understood, and the list above is not complete. What is clear, however, is that the author requires tools for creating a series of different media combinations for different hardware and network configurations, and for specifying which of these are preferred under which circumstances. Based on these author-specified options, plus session-dependent reader preferences, it is then up to the system to select and send the appropriate media items as part of the platform-dependent presentation (as shown in Fig. 2 (e)).
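The sketch below illustrates how run-time selection from an AIO might combine the author's static ranking with the dynamic constraints listed above. The three-way ranking follows Fig. 3; the constraint tests, identifiers and numbers are our own assumptions rather than a specification from the paper:

    # Run-time selection of a representation from an Adaptive Information Object.
    from dataclasses import dataclass

    @dataclass
    class Representation:
        rank: str                  # "preferred", "adequate" or "minimal" (author-set)
        media: tuple               # e.g. ("video", "audio")
        bandwidth_kbps: int

    @dataclass
    class AIO:
        representations: list      # ordered best-first by the author (static constraint)

        def select(self, reader_excludes, machine_media, available_kbps):
            for rep in self.representations:
                if set(rep.media) & reader_excludes:       # dynamic: reader preferences
                    continue
                if not set(rep.media) <= machine_media:    # dynamic: system resources
                    continue
                if rep.bandwidth_kbps > available_kbps:    # dynamic: environment capability
                    continue
                return rep
            return self.representations[-1]                # fall back to "minimal"

    gables = AIO([
        Representation("preferred", ("video", "audio"), 1500),
        Representation("adequate",  ("image", "audio"), 200),
        Representation("minimal",   ("image", "text"),  70),
    ])
    # A laptop without audio hardware on a loaded network gets the minimal version.
    print(gables.select(set(), {"image", "text"}, 100).rank)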
5 Related work
(André & Rist, 1993) use a rich domain representation in order to generate presentations specific to a user's information request. Not only is the presentation itself generated, but also the text and diagrams (built up from graphical primitives) making up the presentation. The major disadvantage of this approach, for our requirement of large information sources, is that it works only for the limited domains which have already been modelled, a time-consuming process. The presentations thus generated take into account the end-user's hardware platform, but not the (fluctuating) network conditions. Another restriction is that the process is fully automatic, so that the author is left with no artistic control over the final presentation. In the “information pipeline”, Fig. 2, we hope to use a content-based description of the media items, while still allowing the author to allocate candidate media items to be used in different circumstances. Our task is to provide convenient tools for the author to specify media choices for different circumstances.

A content-based view, such as the one we have been describing, can also be found in (Aguierre Smith & Pincever, 1991), where the authors define labelling strata to describe film sequences shot for a documentary. They create multiple strata running in parallel with the film, then label sections of the film with these strata to give a content description of the film. This can then be used to search for relevant pieces of film, in particular for assembling pieces for the final documentary. For example, one part of the film may be labelled “ambulance”, which is part of a scene on “rescuing victims”, which in turn is part of a longer piece on “providing disaster relief”. This approach to labelling is not necessarily strictly hierarchical, since the stratum level “ambulance” could be applied to every occurrence of an ambulance in the film, whether it is part of a “rescuing victims” scene or not. This approach mirrors our own, where we intend to take a content
description (e.g. made up of the different strata levels) and have it translated into the appropriate media, rather than take an existing medium and label it with content descriptions.
6 Future Work
The Multimedia Kernel Systems group at CWI (in particular Guido van Rossum, Jack Jansen and Sjoerd Mullender) has already implemented an environment with which we can begin to look at the creation of adaptive presentations. This is the hypermedia authoring environment CMIFed (Van Rossum et al., 1993), which supports the construction of statically specified hypermedia presentations. The editor has been built to be extensible, so with minimal extra effort we will be able to investigate approaches to constructing adaptive documents. The environment already supports a structured approach to authoring hypermedia presentations (Hardman et al., 1993), and this can also be extended to support the extra specifications needed for creating a hypermedia document that can adapt to fluctuating resource availability.
References

Aguierre Smith, T. G., & Pincever, N. C. (1991). Parsing movies in context. In Proceedings: USENIX Summer 1991, Nashville, TN, 157-167.

André, E., & Rist, T. (1993). The Design of Illustrated Documents as a Planning Task. In M. T. Maybury (Ed.), Intelligent Multimedia Interfaces (pp. 94-116). AAAI Press/MIT Press, ISBN 0-262-63150-4.

Berners-Lee, T. (1993). Internet Draft Standards on URNs/URCs.

Bulterman, D. C. A. (1993). Specification and Support of Adaptable Networked Multimedia. ACM/Springer-Verlag Multimedia Systems Journal, 1(2), 68-76.

Davis, M. (1994). Knowledge Representation for Video. In Proceedings AAAI '94, Seattle, WA, 120-127.

Hardman, L., Van Rossum, G., & Bulterman, D. C. A. (1993). Structured Multimedia Authoring. In Proceedings: ACM Multimedia (pp. 283-289), Anaheim, CA, August.

Hardman, L., Bulterman, D. C. A., & Van Rossum, G. (1994). The Amsterdam Hypermedia Model: Adding Time and Context to the Dexter Model. Communications of the ACM, 37(2), Feb, 50-62.

MacNeil, R. (1991). Generating Multimedia Presentations Automatically using TYRO, the Constraint, Case-Based Designer's Apprentice. In Proceedings IEEE Workshop on Visual Languages, 74-79.

Van Rossum, G., Jansen, J., Mullender, K. S., & Bulterman, D. C. A. (1993). CMIFed: a presentation environment for portable hypermedia documents. In Proceedings: ACM Multimedia (pp. 183-188), Anaheim, CA, August.

Schatz, B. R., & Hardin, J. B. (1994). NCSA Mosaic and the World Wide Web: Global Hypermedia Protocols for the Internet. Science, 265, 12 August 1994, 895-901.