Designing Modal Hypermedia Applications
Franca Garzotto §, Luca Mainetti §, Paolo Paolini §,#
§ HOC-Hypermedia Open Center, Politecnico di Milano, Italy
# Telemedia Lab, University of Lecce, Italy
E-mail: {garzotto, mainetti, paolini}@elet.polimi.it
ABSTRACT
Different users of a hypermedia application may require different combinations of modes, i.e., different ways of perceiving the content or different ways of interacting with it. Multimodality, intended as the coexistence of multiple combinations of modes in the same application, can improve application richness and can accommodate the needs of different categories of users. On the other hand, multimodality increases complexity and may affect usability, since a variety of different interaction styles may be disorienting for the users. Designing an effective multimode hypermedia is a difficult problem. This paper discusses this issue, presenting a taxonomy of different kinds of modes in hypermedia applications and introducing the concept of modal hypermedia interaction. Modal interaction means that the semantics of normal application commands depends not only on the application state, as usual, but also on the mode setting. We introduce a formal model for modal hypermedia interaction that helps us analyse design alternatives and their impact on usability more precisely. We illustrate our approach with examples from a museum hypermedia called “Polyptych” that we actually built.
KEYWORDS: modal interaction, usability, hypermedia application design, hypermedia models
1 INTRODUCTION
The term “mode” has been loaded in the literature with a variety of meanings [7, 9, 15]. In the context of hypermedia applications, we can use the term for two broad categories of meaning: communication mode and interaction mode. A communication mode denotes a “carrier of information” [9], i.e., the way used to convey the content of the application to the reader. An interaction mode denotes the way users interact with the application and utilize it. A complex hypermedia application is naturally multimode for both the above aspects: the content is conveyed through various combinations of media, languages, rhetorical styles, and presentation metaphors, and several interaction paradigms are available, related to different styles of information access, e.g., search and navigation [5], and different ways of operating on media of different nature.

Permission to make digital/hard copies of all or part of this material for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copyright is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires specific permission and/or fee. Hypertext 97, Southampton UK © 1997 ACM 0-89791-866-5...$3.50

Some combinations of modes are more appropriate for some user profiles but are unsatisfactory for others [1]. The proper choice of modes, for a given category of users, depends on a number of different factors: the user's expertise with the application domain, his knowledge of computers in general and of hypermedia in particular, the goal he is trying to achieve, the time available for the session, the context of use, the evolution of the current session, etc. If a hypermedia aims to address several types of users and tasks, a multimode system is more appropriate than a mono-mode application (based on a single combination of modes) or a set of mono-mode applications. A mono-mode application is often a crude compromise among different user needs, none of them being fully satisfied. By contrast, different combinations of modes within a single application can accommodate different categories of user requirements and can support a variety of tasks in different situations. Moreover, a user can switch between mode combinations more seamlessly than in a set of mono-mode applications.

Unfortunately, the co-existence of several combinations of modes can affect usability, since the user is faced with the additional complexity of selecting the proper mode combination or switching between different combinations. The aim of this paper is to discuss this problem and to identify crucial usability issues that should be addressed when designing a multimode hypermedia. Our proposal is modal interaction, as a technique combining richness of solutions (i.e., availability of multiple combinations of modes) with usability. Modal interaction is not a totally new concept, since several existing systems (word processors, for example) already make use of it, to a certain degree.
The novelty of this paper is to specialize this concept for hypermedia. Our contributions are a taxonomy of hypermedia-specific modes, a model to precisely define the concept of modal hypermedia application, and an analysis of possible design trade-offs.
Section 2 discusses different types of modes in the context of hypermedia applications and introduces the concept of modal interaction. Section 3 precisely defines the notion of modal hypermedia and proposes a formal model to describe it. Section 4 discusses design options. Section 5 presents a number of examples taken from a hypermedia application named “Polyptych” that we actually built. Section 6 draws the conclusions.
2 MODES AND MODAL INTERACTION IN HYPERMEDIA APPLICATIONS
In hypermedia applications, we can identify several different types of modes within the two broad categories of modes mentioned in the introduction: communication and interaction. They are summarized in the following table (see footnote 1) and discussed in the rest of this section:

Mode Category | Mode Type | Example(s)
Communication | Media | Text, Animation+Sound
Communication | Rhetorical style | Concise
Communication | Language | English
Communication | Size | Short guided tours (e.g., max. 10 steps)
Communication | Topology | Sequence, Tree
Interaction | Control | Couch Potato
Interaction | Access | Navigation, Query

1 This taxonomy does not pretend to be exhaustive, but it covers most of the modes we have found in analyzing over 100 commercial or research hypermedia applications.

Media mode (communication): a medium or a combination of different media characterizes the way content is conveyed. Different media modes (e.g., text, image, animation, video, or “text plus audio”) could be used to convey the same content in different situations.
Rhetorical style mode (communication): within the same medium (text or audio, for example) different styles could be used: concise vs. extensive, light vs. in-depth, expert vs. amateur, etc.
Language mode (communication): the choice of a specific foreign language can be considered as a very simple communication mode.
Size mode (communication): complex objects, such as guided tours [17], collections [3], or entities [2], can be delivered in different sizes, according to different situations. A guided tour on a given subject, for example, could consist of five steps for a short version, and twenty steps for a long version.
Topology mode (communication): different topologies can be used to organize information on the same subject, according to different situations. Guided tours, for example, are very often structured as sequences of steps. We have found that sequences are easy to navigate but sometimes hide the real structure of information; we are currently experimenting (see section 5) with the idea of providing different topologies for the same guided tour, such as trees or lattices, to experienced readers, in order to represent semantic relationships among guided tour constituents, while retaining the linear structuring for more naive readers.
Control mode (interaction): different degrees of control could be exercised over the application execution, ranging from the “couch-potato” mode to the “full active control” mode. In the couch-potato mode, the user is mostly passive, doing almost nothing or making simple choices. Full active control means that the user has total control over execution.
Access mode (interaction): applications may range from a “Question&Answer” style of access (typical of database oriented or information retrieval oriented applications) to “Point&Click” browsing. The different access modes are often intermixed; a typical combination is represented, for example, by browsing over a collection of objects previously selected through a query.

When the application is multimode, i.e., several combinations of modes are needed within the same hypermedia, two extreme options are available. One possibility is to have a different interaction paradigm for each mode setting. Another possibility is to support modal interaction, i.e., to provide the same set of commands for each mode combination but to alter their semantics according to the current setting of modes. We will say that a multimode hypermedia is modal if it supports modal interaction, and modeless otherwise.

Defining when a modal approach is more effective than a modeless approach is a design problem that, to our knowledge, has so far received limited attention. If the application is significantly complex, and the number of mode combinations is large, the modeless “version” may require the user to learn and to remember too many commands (one set for each mode combination), thus violating two fundamental usability factors: learnability and memorability [12, 14]. In a modal hypermedia, a user must learn fewer commands, but he needs to understand how a command's effect depends upon the current setting of modes; usability problems may arise if the application does not provide sufficient perceptual cues [11, 10] to help the user identify the current combination of modes. Furthermore, if mode setting is under the user's control, the user needs to learn how to change the mode configuration, which introduces additional complexity.

It is outside the purpose of this paper to provide general guidelines for designing the appropriate features of modal interaction in each possible situation and for each possible user profile.
Our goal is to identify design alternatives and to suggest possible trade-offs. Before addressing these issues, we will first discuss modal interaction from a formal point of view. The formalism, which is relatively simple, is intended to help define some concepts more rigorously and to provide the terminology needed to analyse more precisely the various design choices discussed in section 4.
3 FORMAL DEFINITION OF MODAL HYPERMEDIA INTERACTION
Our model distinguishes between regular application commands, which affect the execution state of the application, and mode commands, which affect the setting of modes only. The “classic” (i.e., non modal) semantics of commands for modeless applications can be formally defined by the following function:

$$\varphi : \Gamma \times \Sigma \to \Sigma; \qquad \varphi(\gamma, \sigma) \to \sigma' \tag{1}$$
In (1), Γ is the universe of “normal” application commands (possibly with parameters), and Σ is the universe of possible states of execution for the application. φ is the command interpretation function mapping the execution of a command γ, activated in an application state σ, into the new state σ'. The formal definition of “state of execution” is omitted in this paper, since it is not relevant for the purpose of our discussion (see footnote 2). We only assume that an execution state does not include the definition of mode settings in it. φ is a partial function, since some interaction commands might not be available in some states of the application. In other words, φ(γ,σ) is undefined if a command γ is not available in a state σ. To introduce modal interaction, we need to replace the definition (1) with a new one:

$$\varphi' : \Gamma \times \Sigma \times \mathrm{M} \to \Sigma; \qquad \varphi'(\gamma, \sigma, \mu) \to \sigma' \tag{1a}$$
In (1a), Γ, Σ, σ, γ, and σ' are defined as for modeless applications. Μ is the set of mode combinations (also called mode states or mode settings) available in the application, and µ is a combination of modes in Μ which represents the setting of modes currently active in the application state σ. According to this definition, the same command, applied to two identical execution states, can have different semantics, being dependent upon the combination of currently active modes (i.e., µ).

2 The reader is referred to [6, 8] for approaches modelling the state of multimedia objects and to [5] for a definition of the notion of navigation state in modeless applications.

We could have considered the set of modes as part of the description of the execution state of the application, reducing formula (1a) to (1); we believe, instead, that it is convenient to keep the description of the state of execution of an application distinct from the mode setting. One reason is to keep a clean separation between normal application commands and commands altering the mode setting: definition (1a) makes it clear that normal commands do not have side effects on modes, i.e., they do not implicitly modify the current configuration of modes. Thus there is a unique behavior for each command executed in a given state and in a given setting of modes. In order to modify the configuration of modes, we introduce a new set of commands, the semantics of which is defined by the following interpretation function:

$$\psi : \Xi \times \mathrm{M} \times \Sigma \to \mathrm{M} \times \Sigma; \qquad \psi(\xi, \mu, \sigma) \to (\mu', \sigma); \qquad \psi(\xi^{0}, -, \sigma) \to (\mu_{\delta}, \sigma) \tag{2}$$
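As an informal illustration only, the two interpretation functions (1a) and (2) can be sketched as lookup tables over toy data. Everything named in this sketch (the commands `open`, `to_m1`, `to_m2`, the states `s0`..`s2`, the modes `m1` and `m2`, and the identifiers `RESET` and `DEFAULT_MODE`) is a hypothetical placeholder and not part of the model; an undefined value of a partial function is represented by `None`. For brevity, the toy mode table does not make the availability of a mode command depend on the execution state, although definition (2) allows it.

```python
from typing import Optional, Tuple

# Hypothetical placeholders, chosen only for this sketch.
RESET = "xi0"        # the mode re-setting command
DEFAULT_MODE = "m1"  # the default mode setting (mu_delta)

# Partial function (1a): (gamma, sigma, mu) -> sigma'
PHI_MODAL = {
    ("open", "s0", "m1"): "s1",
    ("open", "s0", "m2"): "s2",
}

# Mode part of partial function (2): (xi, mu) -> mu'
PSI = {
    ("to_m2", "m1"): "m2",
    ("to_m1", "m2"): "m1",
}


def phi_modal(gamma: str, sigma: str, mu: str) -> Optional[str]:
    """Interpret a normal command; the mode setting is read but never changed."""
    return PHI_MODAL.get((gamma, sigma, mu))  # None means "undefined"


def psi(xi: str, mu: str, sigma: str) -> Optional[Tuple[str, str]]:
    """Interpret a mode command; the execution state is returned unchanged."""
    if xi == RESET:
        # The re-setting command ignores the current mode setting.
        return (DEFAULT_MODE, sigma)
    new_mu = PSI.get((xi, mu))
    if new_mu is None:
        return None  # the mode command is not available in this mode setting
    return (new_mu, sigma)
```

In this sketch the same command `open`, issued in the same execution state `s0`, leads to `s1` or `s2` depending only on the current mode setting, which is the defining property of modal interaction.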
In (2), Ξ is the set of the mode setting commands, Μ is the set of mode states, and Σ is the set of execution states for the application, as in (1a). The partial function ψ maps each mode command into its meaning; a mode command ξ, applied to a mode state µ, produces a new mode state µ' without affecting the execution state σ. The special mode command ξ⁰ is used to switch to a special mode state µδ and can be considered as the mode re-setting command. In fact, µδ is the standard, i.e., default, mode setting. We can assume that at any time during a session there is only one default mode setting, defined by the default assignment function δ.
4 DESIGN TRADE-OFFS
The formal definitions introduced in the previous section are preliminary to the discussion about when and how the mode combinations should be set or modified in a modal hypermedia, and by whom. Different design options can be considered.
A) To assign the control of the mode settings to the system manager only. Technically speaking, this means that the mode setting commands (the ξ's and ξ⁰) as well as the default assignment function δ are not available within the normal execution of the application, but to a “special” user only. This solution can be valid in situations where the same application must be deployed in different “versions” for different purposes. A museum hypermedia, for example, can be designed so that it is used as an information point installed at the entrance of the museum, as a consultation point in the reading room, as a professional system in an office, as a CD-ROM usable at home, as a WWW application, etc. Instead of creating different applications, the same application could be used with a different standard configuration of modes in each different delivery context. Each deployed application will have one modality of execution (according to the default setting of modes defined by the system manager), possibly different for each installation point. Each deployed application behaves as a modeless application, but the overall application environment is modal.
B) To let the user select the desired combination of modes at the beginning of a session and modify the setting during the normal execution of the application. Technically speaking, this means that the default assignment function δ and the mode setting commands (the ξ's and ξ⁰) are always available to the user within the normal execution of the application. This solution is very flexible, and can be appropriate when the same installation must be shared by a large community of users, with different skills, roles, and tasks. The drawback of this solution is that it requires the user to be able to select the proper setting of modes and to “tune” the application to his needs.
C) To provide a subset of different initial mode settings for each different version of the application, and to allow the user to choose a mode configuration at the beginning of a session and to switch to the default mode setting during the application execution. This is a compromise between (A) and (B). The default assignment function δ and the mode setting commands (the ξ's and ξ⁰) are available to the system manager to configure each installation. Some mode setting commands ξ are also available to the user to select a mode configuration different from the default one (usually set by the system manager), but only at the beginning of a session. In other words, ψ(ξ, µ, σ) → (µ', σ) is defined for the user only for a subset of Μ and only if σ is the initial state σ₀. In addition, at any time during a session the user can switch to the default mode setting by activating the re-setting command ξ⁰. This solution takes into account the different needs of each application deployment, while at the same time it provides a good degree of flexibility to the user (see footnote 3).

3 A slightly more sophisticated version of this solution is to make the default assignment function available to the user as well, to allow him to select a specific mode configuration as his own default setting.

D) To support automatic adaptation of the mode configuration. Technically speaking, this means that the mode setting commands are automatically invoked by the system, under certain circumstances. During the execution of the application, the system could establish the proper configuration of modes depending upon the user profile, the pattern of usage, the task being accomplished, etc. [13, 16]. This solution is the most ambitious and could appear very attractive, but is also the most complex. In practice, it is seldom adopted, and if adopted, it is seldom fully satisfactory. A simple example of automatic mode adaptation can be found in the “Louvre” multimedia CD-ROM (a French product by “Réunion des Musées Nationaux” and Montparnasse Multimédia), where the presentation of a painting and some interaction modalities change if the reader has visited the same subject before. For
example, the first time the user visits a painting, he gets a full screen image and an audio comment; at the end of the audio, the screen automatically changes to a “static” presentation showing a small size image of the same painting, the painter's picture, and some navigation buttons. In any successive access to the same painting (within the same session) the user gets the static presentation first. From here, a navigation button allows the user to access the full screen image with the audio comment; unlike the first time, at the end of the audio nothing happens, and the user must guess that, by clicking anywhere on the screen, he can return to the static presentation. The idea behind this “adaptive” behavior is probably that the static presentation and the full control on navigation are more appropriate if the user is somehow “expert” on the subject domain, i.e., he has some knowledge about the current painting since he has previously explored it. Still, in some usability experiments that we have run, we observed that users were disoriented and did not understand what was happening.

If solutions B) or C) are chosen, an additional design issue concerns how the mode setting commands (the ξ's and ξ⁰) should be made available to the user. One possibility is to provide explicit commands, easily at hand for the user. This solution may increase usability for the sophisticated user, but it may be disorienting for a not-so-expert person, who might involuntarily modify the mode setting and get confused by the change of behavior of the application. To improve usability, it is crucial that the application does a good job of informing the user about the current mode he is in and how to enter the other modes. Another solution is to provide the user with “hidden” mode commands, in the sense that only the expert user is informed of them off-line, while an unaware user may never realise that they are available. This choice has the advantage of retaining simplicity and safeness for the inexperienced user, while still allowing higher control over the navigation style to the experienced user. Some usability problems may result if, by chance, the unaware user discovers this trapdoor and activates mode commands with no knowledge of their effects. Yet another solution is to allow mode setting commands only when the application is in a special execution state, difficult to reach (or reachable with a protected access only); here the lack of flexibility is balanced by the “safeness” of the solution.

In the next section, we will exemplify the concepts discussed so far by briefly describing modal interaction in the hypermedia application “Polyptych”.
5 EXAMPLES FROM THE HYPERMEDIA APPLICATION “POLYPTYCH”
“Polyptych” is a hypermedia presentation of a museum research concerning the “Agostinian Polyptych” by Piero della Francesca (see footnote 4). It is currently installed at the Poldi Pezzoli museum in Milano, within a “traditional” exhibition on Piero della Francesca, and in the museum house in Tuscany (Borgo San Sepolcro) where Piero della Francesca was born. Polyptych will also be available as a CD-ROM edition by next year.

4 The so called “Agostinian Polyptych” is one of the most mature works by Piero della Francesca, one of the greatest artists of the Italian Renaissance. The various components of this polyptych, with the exception of the central panel, which has been lost, are currently exhibited in some important museums world-wide: Frick Collection in New York; National Gallery of Lisbon; National Gallery of London; Poldi Pezzoli Museum in Milano. For years, a team of art researchers has attempted to virtually “reconstruct” how the overall polyptych might have looked. This research is presented in our hypermedia application, proposing a number of possible reconstruction hypotheses. These are based on the analysis of previous restorations, the investigation of ancient documents, and the comparative analysis of sculpture, textile, fashion, jewellery, every-day life, and religious life in the Renaissance.

“Polyptych” has been intended for a variety of users, who have been classified in three major categories:
“casual” visitors: just passing through the information point by chance. For them the application is a “walk-up-and-use” system that is only intended to be used once, probably for a short time. They might differ in their skill with computers and hypermedia technology;
“intentional” visitors: they have some knowledge of, or at least a significant interest in, the subject domain and want to learn more about it. They might differ in their knowledge about hypermedia technology and in the amount of time available to explore the application;
“specialist” visitors: they are specialists in the application domain, e.g., researchers in history of art; again, they might differ in knowledge about technology and in the time available to use the application.

Modal interaction in “Polyptych” has been designed so that it can take into account the needs of these different categories of users, and the different situations of use (i.e., time availability, motivations to use the application, tasks to be accomplished, etc.). In the rest of this section, we will briefly present the content and the structure of the application, using the concepts of HDM, the Hypermedia Design Model [2, 3, 4]. This description is preliminary to the discussion concerning the design of modal interaction in “Polyptych”.
5.1 Content structure
The overall content of “Polyptych” has been organized as a set of eight main structures called “paths” (corresponding to HDM “collections”): “Reconstruction”, “Technical Analysis”, “Restoration”, “Fashion”, “Textiles”, “Jewelry”, “Archive Documents”, “Renaissance Art Related Works”. Each path corresponds to a research topic, and its content has been created by a different group of art researchers (from the Poldi Pezzoli, the Milanese “Academy of Brera”, the Florentine “Opificio delle Pietre Dure”, the University of Lecce, the Library of Borgo San Sepolcro). All paths have a similar organization, consisting of a short Introduction and a set of “sections”. A section corresponds to a subtopic of the general subject of the path, and consists of one node of type “Visual”, one node of type “Text”, and several nodes of type “Detail”. A “Visual” node consists of a large image, a caption, and an audio comment. A “Text” node includes one or two columns of text, one or more images, and, sometimes, animations. Nodes of type “Visual” and “Text” provide essential, non-specialist information about the section subject. A “Detail” node provides additional information on the section subject; it has a structure very similar to that of “Text” nodes, but its rhetorical style is aimed mainly at art experts.

In each path, the set of sections is organized according to two topologies, sequence or lattice, to address different user categories. The lattice structure is intended to capture the semantic relationships among the topics presented in a given path, and is mainly intended for domain experts. The linear structure arranges the sections according to a suggested sequence of reading. Each topology has a corresponding node (“Index Node”, in the HDM terminology [2, 3, 4]), which shows the path structure and allows the user to select a section and access it directly.

Mode type | Mode | Automatic Navigation | Visual Navigation | Text-based Navigation | Expert Navigation
Control | Passive (Couch Potato) | X | - | - | -
Control | Manual | - | X | X | X
Size | Short | X | X | - | -
Size | Average | - | - | X | -
Size | Long | - | - | - | X
Media | Image + Audio | X | X | - | -
Media | Text + (images + animation) | - | - | X | X
Topology | Sequence | X | X | X | -
Topology | Lattice | - | - | - | X
User profile | | Casual visitor, low skill | Casual visitor, average skill | Intentional visitor | Domain specialist
Table 1: mode settings in “Polyptych”
5.2 Mode settings
Four possible settings of modes are available in “Polyptych”, corresponding to different styles of navigation: Automatic Navigation, Visual Navigation, Text-based Navigation, and Expert Navigation. They are schematically summarized in table 1. The rest of this subsection describes the meaning of each mode setting, using examples from the path “Jewelry” (“Ricami Metallici”). Figure 1 shows the node representing the introduction to this path.

Mode setting | Effect of modal command “Entra” | Navigation control on the destination
Automatic Navigation | display the node of type “Visual” in fig. 2a, simultaneously activating the audio comment | At the end of the audio comment, the next node in the path (of type “Visual”) is displayed automatically
Visual Navigation | display the node of type “Visual” in fig. 2b, simultaneously activating the audio comment | At the end of the audio comment, navigation along the path (across nodes of type “Visual”) is under user control
Text-based Navigation | display the node of type “Text” in fig. 3 | Navigation is under user control; next and previous nodes are of type “Text”
Expert Navigation | display the node of type “Text” in fig. 5 | Similar navigation as in Text-based mode setting, but now additional navigation links are available
Table 2: effects of clicking the modal button “Entra” in each different setting of modes
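The content of Table 2 can be read as a lookup from the current mode setting to the effect of the single command “Entra”. The following is only a sketch of that reading, not the implementation used in “Polyptych”: the type `EntraEffect`, its field names, and the function `entra` are hypothetical, and the `audio` field records only whether Table 2 states that an audio comment is activated.

```python
from typing import NamedTuple


class EntraEffect(NamedTuple):
    """Hypothetical record of what "Entra" does in one mode setting."""
    node_type: str      # type of the node that is displayed
    audio: bool         # Table 2 states that the audio comment is activated
    auto_advance: bool  # next node is displayed automatically after the audio
    extra_links: bool   # additional navigation links are available


# One entry per row of Table 2.
ENTRA_EFFECTS = {
    "Automatic Navigation": EntraEffect("Visual", True, True, False),
    "Visual Navigation": EntraEffect("Visual", True, False, False),
    "Text-based Navigation": EntraEffect("Text", False, False, False),
    "Expert Navigation": EntraEffect("Text", False, False, True),
}


def entra(mode_setting: str) -> EntraEffect:
    """Return the effect of the modal command "Entra" in the given mode setting."""
    return ENTRA_EFFECTS[mode_setting]
```

The user learns one command; only the table entry selected by the current mode setting changes, which is what distinguishes a modal command from a set of mode-specific commands.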
Figure 1: introduction node of path “Jewelry” (“Ricami Metallici”). The button “Entra” is associated to a modal command.
The buttons “Avanti” (Forward) and “Indietro” (Backward) correspond to modeless commands, and allow the user to proceed to the introduction of the “next” or “previous” path, respectively. The button “Entra” (standing for “Enter the path”) allows the user to explore the various sections of the path. This button corresponds to a modal command, its effects being dependent on the current setting of modes. Table 2 summarizes the semantics of this command in each different setting of modes, referring to the figures shown throughout this section.
Automatic navigation. Under this setting of modes, the user can browse automatically across the different sections of the current path, visiting only nodes of type “Visual”. This mode setting has been designed for totally novice casual visitors, since it provides a quick overview of the content of each path and requires a minimum degree of user control. An example of a Visual node that has been accessed under this mode setting is shown in Figure 2a.
Figure 2a: first visual node of path “Jewelry”. Mode setting = “Automatic Navigation”.
The application proceeds to the next section at the end of the audio comment, without requiring any user interaction. The user may only change the path of interest or switch to mode setting “Visual Navigation” (see below) by selecting the button “Manuale” (“Manual”).
Visual navigation. This mode setting is intended for casual visitors who have some hypermedia skill and want to exercise a certain degree of control over the navigation. The main difference with respect to the “Automatic Navigation” mode setting is that the user must explicitly request the transition from a section to the Next or Previous section (or to the First or Last section). Figure 2b shows the same section of path “Jewelry” presented in Figure 2a, but now the application is under the “Visual Navigation” mode configuration. The node includes the buttons to navigate across the path, such as “Avanti” (“Forward”) and “Ultimo” (“Last”, to access the last section of the current research path).
Figure 2b: first visual node of path “Jewelry” Mode setting = “Visual Navigation”
The user may change the mode setting (switching to “Automatic”), or change the path of interest. In addition, by using the button “Vai a ....” (Goto), he is allowed to access the Index node which shows the path structure (see figure 4). This setting is intended for “intentional visitors”, since it provides a significant amount of information for each path and requires an active participation of the user to explore such a content. The user can access the different sections, via nodes of type “Text”. Figure 3 shows an example of a node accessed under mode setting “Text-based navigation”. The content (concerning the same section of the path as in figures 2a and 2b) appears in two columns of text and one image. The interaction is slightly more complex than in Visual Navigation mode setting, since some new commands are available. Some of them are associated to symbols embedded in the text. Textual notes, for example, can be displayed by selecting the note symbols—numbers in round brackets- and animations can be activated by selecting the symbol “@”. No command Text-based
navigation.
Figure 3: first text node of path “Jewelry” Mode setting = “Text-based Navigation”.
44
Figure 4: index node showing the structure of path “Jewelry”. Mode setting = “Text-based Navigation”.
is available to the user to control animations. The user may change the mode setting switching to “Automatic” or “Visual” Navigation. Expert navigation. This setting is appropriate for domain
experts, e.g., researchers in the application subject: the amount of content is larger, and the representation of information is richer and more complex than in the other mode settings in terms of path topology (a lattice), content (a section is now represented by a node of type “Text” plus a number of nodes of type “Details”), and rhetorical style (more specialist). Richer structures and contents imply that interaction is also more complex, since new navigation links and commands to control active media (animation) are now available. Figure 5 shows the first node of the first section of path “Jewelry” under mode setting “Expert Navigation”. Compared with figure 3, the text is the same, but the image is now the third frame of an animation; the right column of text is covered by a note commenting on the animation; and below the frame caption there are now buttons to control the animation execution. In addition, we can notice some small book-like icons below the left column of text; they represent navigation commands to access nodes of type “Details” that provide further information on the current section. The nature of these additional contents (not available in the “Text-based” mode setting) is strictly technical, and the rhetorical style of their texts is appropriate for domain experts. One such “Details” node is shown in figure 6.
Figure 5: first text node of path “Jewelry”. Mode setting = “Expert Navigation”.
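To make the difference between the two settings concrete, the following minimal sketch tabulates which commands a node of type “Text” offers under Text-based and Expert Navigation, as described above. All identifiers (ModeSetting, the command labels, is_available) are illustrative assumptions rather than names from the “Polyptych” implementation, and the sketch assumes that the Expert setting keeps every Text-based command and only adds new ones.

```python
from enum import Enum


class ModeSetting(Enum):
    TEXT_BASED = "Text-based Navigation"
    EXPERT = "Expert Navigation"


# Hypothetical command labels for a node of type "Text".
TEXT_BASED_COMMANDS = frozenset({
    "goto_index",        # button "Vai a ...." (Goto)
    "show_note",         # note symbols: numbers in round brackets
    "start_animation",   # symbol "@"
})

# Assumption: Expert Navigation keeps the Text-based commands and adds new ones.
EXPERT_COMMANDS = TEXT_BASED_COMMANDS | frozenset({
    "control_animation",  # buttons below the frame caption
    "open_details",       # book-like icons leading to "Details" nodes
})

COMMANDS_BY_MODE = {
    ModeSetting.TEXT_BASED: TEXT_BASED_COMMANDS,
    ModeSetting.EXPERT: EXPERT_COMMANDS,
}


def is_available(command: str, mode: ModeSetting) -> bool:
    """Tell whether a command is offered on a Text node under the given mode setting."""
    return command in COMMANDS_BY_MODE[mode]


if __name__ == "__main__":
    for mode in ModeSetting:
        print(mode.value, sorted(COMMANDS_BY_MODE[mode]))
```

Keeping the table keyed by mode setting, rather than scattering checks through the node code, is one possible way to keep mode-dependent behaviour in a single place.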
A lattice captures more information, at the expense of a certain difficulty both in understanding the intended meaning of the structure and, above all, in using it. In fact, in the “Expert Navigation” mode setting the commands to navigate across the path are apparently the same, but their semantics are substantially changed, since the path structure is no longer linear. The command associated with the button “Avanti” (Forward), for example, may not take the user directly to the next section; in general, it may instead offer a number of “next” options, since several sections may semantically “follow” the current one in a lattice structure.
5.3 Mode configuration control
A final consideration concerns the way of controlling mode configurations in “Polyptych”. The application is deployed as several information points in the museum, and the system manager can choose different initial mode settings for each installation. When the application is reset to the cover page (either manually or through a time-out mechanism) the mode setting defined at installation time is always restored.
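The reset behaviour just described can be sketched as follows. This is a minimal illustration under stated assumptions: the class InformationPoint, its attributes, and the node label "cover" are hypothetical names, and the time-out mechanism is reduced to an explicit call to reset().

```python
class InformationPoint:
    """One installation of the application in the museum (hypothetical model)."""

    COVER_PAGE = "cover"  # assumed label for the cover page node

    def __init__(self, installation_mode: str) -> None:
        # Mode setting chosen by the system manager at installation time.
        self.installation_mode = installation_mode
        # Mode setting currently in effect; users may change it during a visit.
        self.mode_setting = installation_mode
        # Execution state, kept separate from the mode configuration.
        self.current_node = self.COVER_PAGE

    def set_mode(self, mode_setting: str) -> None:
        """Mode command: changes the mode configuration only."""
        self.mode_setting = mode_setting

    def go_to(self, node: str) -> None:
        """Normal command: changes the execution state only."""
        self.current_node = node

    def reset(self) -> None:
        """Manual or time-out reset: back to the cover page and the installed mode."""
        self.current_node = self.COVER_PAGE
        self.mode_setting = self.installation_mode


if __name__ == "__main__":
    point = InformationPoint("Automatic")
    point.set_mode("Visual")
    point.go_to("Jewelry/section-1")
    point.reset()
    print(point.mode_setting, point.current_node)
```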
Figure 6: third “Details” node associated with the first text node of path “Jewelry”. Mode setting = “Expert Navigation”.
Finally, some commands on the node in figure 5 have a different meaning from the same commands under the Text-based mode setting (see figure 3). For example, the command to access the Index node of the path now displays a lattice structure, shown in figure 7. We can compare figure 7 with figure 4, which shows a linear topology and represents the Index node of the same path under the Text-based, Visual, and Automatic Navigation mode settings.
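The mode-dependent meaning of the Index and “Avanti” (Forward) commands can be illustrated with a minimal sketch. The section names s1 to s4 and both graphs are invented for illustration (the actual structure of path “Jewelry” appears only in figures 4 and 7), and the function names are assumptions, not part of the “Polyptych” implementation. The point is only that the same command, applied to the same execution state, yields different results under different mode settings.

```python
# Hypothetical path structures: each section maps to the sections that follow it.
LINEAR_PATH = {"s1": ["s2"], "s2": ["s3"], "s3": ["s4"], "s4": []}
LATTICE_PATH = {"s1": ["s2", "s3"], "s2": ["s4"], "s3": ["s4"], "s4": []}

# The path topology seen by the user depends on the mode setting.
TOPOLOGY_BY_MODE = {
    "Automatic": LINEAR_PATH,
    "Visual": LINEAR_PATH,
    "Text-based": LINEAR_PATH,
    "Expert": LATTICE_PATH,
}


def index(mode_setting: str) -> dict:
    """Index command: return the path structure shown under this mode setting."""
    return TOPOLOGY_BY_MODE[mode_setting]


def forward(current_section: str, mode_setting: str) -> list:
    """Forward command: return the sections the user may move to next.

    The result depends on both the execution state (current_section)
    and the mode setting, which is what makes the command modal.
    """
    return list(TOPOLOGY_BY_MODE[mode_setting][current_section])


if __name__ == "__main__":
    print(forward("s1", "Text-based"))  # a single successor in a linear path
    print(forward("s1", "Expert"))      # several "next" options in a lattice
```

When forward returns more than one section, an interface would have to present the options to the user instead of jumping directly, which matches the behaviour described for the Expert setting.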
Switching among the Text-based, Visual, and Automatic Navigation mode settings is based upon explicit configuration functions and/or mode commands, which are represented by visible buttons displayed on nodes (see figures 2a, 2b, and 3). Commands to switch to the Expert Navigation mode setting, instead, are somewhat “hidden”, in the sense that an unaware user may never realize that they are available. In fact, the user can set the mode configuration to “Expert Navigation” only from the Index node displaying the structure of the current path (see figure 4), by using the right mouse button to select the section of interest. The effect of this deliberately “obscure” command is to place the user on a Text node like the one shown in figure 5, under the Expert Navigation setting. After this action, any further command other than a mode change command will be interpreted in the context of the Expert Navigation setting. The intention is that only specialists or strongly motivated users will exploit the possibilities of this mode setting, while normal users will not be aware of it and will not be confused by its intrinsic complexity.
5.4 Discussion
We have carried out some evaluation studies of the usability of “Polyptych” and of the effectiveness of its modal interaction design. Intensive user testing has involved casual visitors, art teachers and students visiting the exhibition where the application is installed, and art researchers from the institutions that collaborated in the project. The analysis and interpretation of the test results are still ongoing and will be discussed in detail in a future report.
Figure 7: Index node showing the structure of path “Jewelry”. Mode setting = “Expert Navigation”
We can anticipate here that art specialists, once informed about the “hidden” mechanism to set the mode configuration to “Expert Navigation”, have used it quite extensively with a good degree of satisfaction, and they have also appreciated the possibility of switching to other mode settings to explore the application content in different ways. In particular, art teachers frequently switched from Expert to Visual Navigation when discussing the various topics with their students through the application. We have also noticed that students, after starting their visit under the Automatic Navigation mode, tended to switch to Visual Navigation after 3-6 minutes of use and to continue the exploration under this configuration (with short jumps back and forth to Text-based navigation). Finally, user testing has shown that the mechanism to switch to Expert navigation is safe for naive users: during our observations, no user pressed the right mouse button and switched to Expert navigation by chance; users did so only when explicitly informed about this possibility.
6 CONCLUSIONS
By their very nature, hypermedia applications employ a large number of different modes, of different types. Multiple modes improve the quality and richness of the applications, accommodating the needs of different users in different situations. Multiple combinations of modes, on the other hand, increase complexity and may create usability problems. In this paper we propose modal interaction as a technique of hypermedia interaction design that achieves, at the same time, simplicity for users and the flexibility to tune interaction styles to specific user needs. Modal interaction is characterized by a clean separation between normal application commands and mode setting commands; normal commands affect the execution state of the application, while mode setting commands affect mode configurations only. The semantics of normal application commands depend upon both the execution state and the mode settings.
We have discussed an example of modal interaction, as implemented in “Polyptych”, a hypermedia application developed at HOC-Politecnico di Milano in co-operation with the Poldi Pezzoli Museum in Milano. In “Polyptych”, the design of the various modes and of the various mechanisms of modal interaction has taken into account the needs of different types of users: casual visitors, intentional visitors who have some interest in the subject domain of the application, and experts, i.e., researchers in the history of art. All the configurations of modes and modal navigation techniques discussed for “Polyptych” have been implemented. The implementation is based upon a navigation engine that maintains a representation of the execution state separate from the mode configuration, and distinguishes among modeless commands, mode control commands, and modal commands.
Our research is planned to continue along the following directions:
a) to define a richer set of modes, with the proper effects upon application commands; the most promising are size modes and topology modes
b) to improve the set of mode commands, for initial setting and modification of modes
c) to generalize the architecture of the current navigation engine, enlarging the flexibility of its modal navigation mechanisms
d) to improve the switching among the different modes, under user control
e) to conclude the analysis of usability experiments to validate the effectiveness of the various design choices of “Polyptych”.
ACKNOWLEDGEMENTS
We would like to thank all the members of the “Polyptych” team for their help and constant enthusiasm in this project. We are especially grateful to A. Mottola Molfino, A. Zanni, and A. Di Lorenzo from the Poldi Pezzoli, C. Frosinini and M. Bellucci from Opificio delle Pietre Dure in Florence, L. Polcri from the Library of Borgo San Sepolcro, A. De Marchi from University of Lecce, G. Butazzi and M. Pinin Brambilla, G. Restano, F. Bolognesi, and M. Angeleri from Politecnico di Milano, and the Image Processing group at ITIM-CNR. We would also like to thank the many visitors of the Poldi Pezzoli museum who helped to test “Polyptych”. We also acknowledge the generous contribution of EPSON-Italy for the hardware equipment.
REFERENCES
1. Bearne M., Jones S., Bearne J. S-F. M., “Towards Usability Guidelines for Multimedia Systems”. In Proc. ACM Multimedia ’95, S. Francisco (CA), Oct. 1995
2. Garzotto F., Paolini P., Schwabe D., “HDM—A Model Based Approach to Hypermedia Application Design”. In ACM Trans. Inf. Syst., 11 (1), Jan. 1993
3. Garzotto F., Mainetti L., Paolini P., “Adding Multimedia Collections to the Dexter Model”. In Proc. ACM ECHT'94, Edinburgh (UK), Sept. 1994
4. Garzotto F., Mainetti L., Paolini P., “Hypermedia Design, Analysis, and Evaluation Issues”. In Comm. ACM, 38 (8), Aug. 1995
5. Garzotto F., Mainetti L., Paolini P., “Navigation in Hypermedia Applications: Modeling and Semantics”. In Journal of Organizational Computing, 6 (3), 1996
6. Gibbs S., Breiteneder C., Tsichritzis D., “Data Modeling of Time-Based Media”. In Proc. ACM SIGMOD, Minneapolis, May 1994
7. Hanne K., Bullinger H., “Multimodal Communication: Integrating Text and Gesture”. In Blattner M.M., Dannenberg R.B. (eds.) Multimedia Interface Design, ACM Press, 1992
8. Hardman L., Bulterman D.C.A., Van Rossum G., “Adding Time and Context to the Dexter Model”. In Comm. ACM, 37 (2), Feb. 1994
9. Hill W., Wroblewski D., McCandless T., Cohen R., “Architectural Qualities and Principles for Multimodal and Multimedia Interfaces”. In Blattner M.M., Dannenberg R.B. (eds.) Multimedia Interface Design, ACM Press, 1992
10. Kahn P., “Global and Local Hypermedia Design in the Encyclopaedia Africana”. In Fraisse S., Garzotto F., Isakowitz T., Nanard J., and Nanard M. (eds.) Hypermedia Design, Springer, 1996
11. Norman D., “Design Rules Based on Analyses of Human Errors”. In Comm. ACM, 26 (4), April 1983
12. Nielsen J., “Usability Engineering”, Academic Press, 1993
13. Norcio A.F., Stanley J., “Adaptive Human-Computer Interfaces: a Literature Survey and a Perspective”. In IEEE Trans. Systems, Man, and Cybernetics, 19 (2), March/April 1989
14. Preece J., “Human-Computer Interaction”, Addison Wesley, 1994
15. Rudnicky A.I., Hauptmann A.G., “Multimodal Interaction in a Speech System”. In Blattner M.M., Dannenberg R.B. (eds.) Multimedia Interface Design, ACM Press, 1992
16. Stotts P.D., Furuta R., “Dynamic Adaptation of Hypertext Structure”. In Proc. ACM Hypertext’91, S. Antonio (TX), Dec. 1991
17. Trigg R.H., “Guided Tours and Tabletops: Tools for Communicating in a Hypertext Environment”. In ACM Trans. Inf. Syst., 6 (4), 1988