
In: Working Notes of the IJCAI ’95 Workshop on “Intelligent Multimedia Information Retrieval”, Montreal, Canada, August 19, 1995, pp. 157-171.

Dialogue Techniques for Supporting Multimedia Information Retrieval

Adelheit Stein, Adrian Müller & Ulrich Thiel

GMD-IPSI (German National Research Center for Information Technology – Integrated Publication and Information Systems Institute)
Dolivostr. 15, D-64293 Darmstadt, Germany
e-mail: {stein, amueller, thiel}@darmstadt.gmd.de

Abstract

Advanced information retrieval and database management systems employ a variety of elaborate data representation, indexing and retrieval techniques, but usually apply less elaborate, often simplistic, models of human-computer interaction. Effective interaction, however, is essential for effective information retrieval. To assist and guide the user actively through the entire information-seeking process, a cooperative system must rely on an elaborate model of the interaction, i.e., the dialogue. In this paper we describe a prototype of a multimedia information retrieval system that monitors and guides the user-system interaction based on an explicit representation of the dialogue. The retrieval engine employs a novel approach to content-oriented retrieval based on abductive reasoning. We show by examples how the retrieval engine and the dialogue manager interact to generate situation-dependent interaction possibilities and system responses, creating a coherent interaction structure as the dialogue evolves.

1 Introduction

Present-day multimedia technology and knowledge-based information processing foster the development of advanced information retrieval (IR) systems which offer access not only to references or full-text documents but also to interlinked graphics, images, audio and video documents. This diversity of data requires new indexing and retrieval methods, and an enhanced retrieval functionality as compared to traditional IR systems. As the information and hence the interaction possibilities of users become increasingly complex, the process of information search, retrieval, and relevance assessment has to be supported by easy-to-use multimodal user interfaces. Furthermore, the demand for user guidance in multimedia retrieval applications causes a shift from exploratory interaction styles (“browsing”) to conversational dialogue structures (cf. Thiel 1995).

Advanced interfaces that act as intelligent mediators between the user and other components (agents) of the information system aim at exploiting the dialogue context in order to make the interaction possibilities, their meaning and consequences transparent (cf., for example, Maybury 1993). This includes active (context-, task-, and user-tailored) assistance in any phase of the interaction, for example, clarification of information needs, of related dialogue goals and of the means for accomplishing them (query formulation being only one of these goals). Thus, the importance of monitoring the pragmatics and semantics of the dialogue is emphasized, as is the prominent role of supportive meta-communication. We assume that a substantial improvement in both functionality and user acceptance can be achieved by integrating multiple modes of interaction, in particular when natural language (written or spoken) for conveying meta-dialogic contributions is combined with multimedia presentations of the retrieved target information and, additionally, the user is allowed to directly manipulate dialogue objects (informational objects and interface objects).

In the remainder of this paper we introduce an approach that integrates novel content-oriented retrieval mechanisms with a comprehensive dialogue model in order to support the user-system interaction as the retrieval dialogue develops. Our approach is based on theoretical and empirical results from several research areas, such as information retrieval, discourse theory, and multimedia interfaces, and exploits experience gained with previous prototypes for “conversational information retrieval”, for example, the systems MERIT (cf. Stein & Thiel 1993) and CORINNA (cf. Fischer, Maier & Stein 1994). Examples taken from the current prototype, MIRACLE, are presented and discussed in the last sections of the paper.

2 Conversational Information Retrieval

Traditionally, researchers in information retrieval assumed that it suffices to look at the current query and to identify the allegedly relevant documents by a procedure that computes those documents that match the query according to a given retrieval model. However, as the user is in most cases unable to formulate a query expressing his or her information need precisely, the dynamic aspects of the information retrieval process are extremely important: How can the interaction be structured in a way that supports the user in inspection, judgement and query formulation? Results of user studies support the assumption that users might benefit from a sort of “flexible” user guidance that provides a useful strategy to be executed in order to solve the information problem while allowing for deviations in a natural way, for example, if an explanation is needed. This type of interaction resembles a “conversation” to a large extent (cf., for example, Osgood & Bareiss 1993); hence we will refer to it as “conversational information retrieval” in the sequel.

On a different track, most approaches in content-based retrieval also stress the interactivity of the retrieval process. Here, the user exploits a similarity measure established between non-textual items, e.g., pictures, in order to browse the database. Unrestricted browsing, however, might be extremely cumbersome; therefore the need to impose a structure onto the search process arises. The most obvious way to achieve a structured and coherent navigation makes use of (semantic) representations of the non-textual items. Thus, we are not dependent on isolated atomic features (e.g., a certain color) of the information item when navigating through the collection of items.
Instead, we can exploit the imposed partitioning to access either a close neighbor of the object at hand, i.e., stay within the same class of objects, or use the characteristics defining the current class as a starting point to arrive at a more suitable concept. In sum, this means that the browsing process is not governed by links on the instance level alone. Switching to a different class by deliberately changing a characteristic property, however, is very similar to the modification of a query, which is a common operation in IR systems. Multimedia retrieval systems have to solve at least two different tasks: First, the relevant items have to be identified, and second, they have to be presented in a way that the user can relate them to each other, and, what is often more complicated, to the query. Both problems can be tackled by a semantic representation of the information items.

The current prototype, called MIRACLE, is centered around a logic-based information retrieval engine (Müller & Thiel 1994) that employs an abductive inference mechanism to map a query statement to the existing database(s), index structures and access methods. Figure 1 displays the system architecture with the three basic components:
• The Indexer, called MAGIC (Multimedia oriented Automatic Generation of Indices and Clusters, cf. Müller & Kutschekmanesch 1995), combines probabilistic text indexing with representation methods for non-textual objects derived from content-based retrieval approaches.
• The Abductive Retrieval Engine works on a knowledge base comprising a semantic domain model, a model of document structure, and the semantic counterparts to the index terms assigned to the multimedia documents in the database.
• The Dialogue Manager acts as a mediator between the user and the retrieval engine. In order to achieve an appropriate amount of user guidance, this component relies on an explicit dialogue representation and a repository of dialogue acts (tactics) and strategies.

[Figure 1 diagram. Components shown: Graphical User Interface (input, response); Dialogue Manager (strategies, tactics); Abductive Retrieval Engine (extended query, DB access/instantiation); Stratified Knowledge Base (semantic domain model, document/object structure model, concept index, syntactic term index); Indexer.]

Figure 1: System architecture of MIRACLE

Applying abductive reasoning, we devise a method for content-oriented concept retrieval in multimedia databases. For a given query (Q) the abductive inference mechanism generates query reformulations with respect to the existing information structures. Each reformulation is based on a set of additional “hypotheses” which can be seen as the semantic context in which a query will be embedded. The hypotheses generated may be negotiated with the user interactively. If both the user and the system agree on (some) hypotheses, the corresponding query interpretations can be checked individually by evaluating them with respect to the database’s content. This approach distinguishes between an intensional (or: conceptual) representation of the domain and the extensional model (i.e., instances retrieved from the database) of an inferred query interpretation.

The application of the abductive mechanism goes along with a “natural” distinction of global phases of the interaction (or tasks), i.e., query formulation, inspection of the generated query interpretations, and inspection of instances retrieved from the database. In order to make this implicit order of steps transparent to the user, and to allow additional phases and interaction sequences to be included, we need, however, a more complex model of the interaction. For example, it might be crucial for the user to understand, compare, and evaluate the generated hypotheses before she selects the appropriate ones to be executed; or, after having seen and evaluated the retrieved data (with respect to the query interpretation used), she might want to turn back to a previous stage, select another hypothesis, and so on. To keep track of such complex interaction structures and to help the user stay oriented, the system has to exploit the dialogue history. Based on an explicit representation of the dialogue acts of both user and system, the dialogue manager is responsible for interpreting and guiding the entire interaction cooperatively. This requires an elaborate model of dialogue that specifies the various interaction possibilities and regulates the interchange of the acts that constitute a dialogue.

MIRACLE is at the stage of a prototype and the integration of the retrieval component with the dialogue manager is currently under development.
Future work will include the improvement of the user interface design to support the conversational approach and evaluations of the system with potential end-users – both with respect to the functionality of the individual components (retrieval component, user interface) and the integrated system. Currently, we can draw on experiences and tests with the former systems MERIT and CORINNA, which were also based on an explicit representation of the dialogue (using parts of the dialogue model presented here) but provided a less elaborate retrieval functionality. MERIT employed a “case-based dialogue manager” (cf. Tißen 1993) using cases of successful retrieval sessions as global dialogue plans to guide the user step by step through a new retrieval dialogue. Although this straightforward form of user guidance was combined with flexible options to depart from the suggested path and to modify the current case, we observed a need for a more flexible navigation and local interaction possibilities, for example, clarification dialogues and negotiations of the retrieval strategy. The CORINNA prototype, on the other hand, allows the user to enter clarification dialogues in the query formulation phase, but does not provide sufficient user support on the global level of the interaction.

2.1 Abductive Information Retrieval

Non-trivial domains demand that a retrieval system act as a mediator between the way in which users express information needs and the system’s interpretation and computation of the query. Traditional and state-of-the-art document retrieval systems offer as a retrieval result only a ranked list of documents, since this is the best feedback available (cf., for example, Callan et al. 1993). More complex domains, such as structured documents and multimedia databases with fragments of texts, non-textual data, and implicit as well as explicit links between them, require reformulations of queries on the level of concepts, i.e., dynamic aggregations of basic information items.

A widely used technique for reasoning in information retrieval (cf. Nie 1992, Hess 1992, Meghini et al. 1993) is deductive inference, mostly within first-order or probabilistic logic. Such systems assign a truth value to a given query by computing the deductive closure of a given theory (a set of axioms, stored in a database, and a set of rules) and checking whether the query is an element of this closure. On a certain level of formal abstraction, this is comparable to DATALOG-based information systems, although the (formal) properties differ considerably. All deductive approaches need to provide an accurate model of their domain, and they have to face the fact that changes in any part of the theory might lead to inconsistencies and to unretrievable parts of the data.

The abductive reasoning approach combines methods from logic-based IR and object-oriented database theory: Documents are represented as complex structures containing parts of different data types (texts, pictures, speech, audio, etc.). In accordance with van Rijsbergen’s formulation of the retrieval problem, the retrieval method attempts to prove that a database entry D entails (a part of) the query Q: D –> Q (van Rijsbergen 1989). Abductive reasoning, the inference process we use in the MIRACLE system, can roughly be described as ‘processing this formula from right to left’. Abduction generates a set of explanations which imply the consequence (the query). These formulae may be regarded as system-generated interpretations of the user’s information need in terms of database (or: domain) structure and contents.
As a consequence of applying abductive reasoning, this process yields not only a query expansion on the level of search terms, as for example in thesaurus-based systems, but also produces different possible readings of the query, which may differ in their meaning on both the semantic and the structural level. Thus, the retrieval system provides qualitative feedback on how it processes a given query statement.

The abductive reasoning procedure yields a set of additional assumptions (called hypotheses) which are necessary for deriving evidence that D entails Q within a formal theory that describes the structure of the data. These hypotheses model the context in which the human-computer interaction takes place in the following way: A certain query interpretation is only valid within its associated hypothesis (query context). A hypothesis is the collection of assumptions that the inference process needed in order to reformulate a query statement in such a way that it becomes executable for the given databases.

For example, let us assume that the database consists of a collection of biographies of artists. Each biography is concerned with one person, although it might include many references to related persons (teachers, scholars, etc.). A work of art is typically created by one person and it is mentioned within this person’s biography. Rewriting these assumptions as a set of first-order logic rules allows us to establish a primary domain model. Now consider a query statement like “Which artists are concerned with impressionism?”. The abductive reasoning infers that one way of executing this query is to retrieve all works of art which are indexed with ‘Impressionism’ and group them by their creators. This inference needed an additional assumption to succeed: the query reformulation is based on the hypothesis that it suffices to process the ‘group by’ command by treating documents as biographies, i.e., to find the creator of a work of art by looking for the subject (the artist) of the corresponding document. This assumption is the context in which this query interpretation is executable. If the user agrees with this assumption, the system is able to process the query in the way described above.

In the examples given in Section 2.3 we will show that the trivial domain model does not suffice for all documents (e.g., for survey articles, library catalogues, etc.) in our prototype. The query interpretation of the example above would fail for non-biographies, since there is no general method of relating impressionistic works of art, mentioned for example in a catalogue of an exhibition about impressionism, to individual artists. Another set of domain rules is needed to capture this case, which results in different query contexts.

As the necessary domain models increase in size and complexity, there will be conflicting rules. A deductive inference process cannot treat inconsistent models properly, since the reasoning process is defined over a sound and complete closure of all rules. The abductive reasoning process, as it is used in the MIRACLE system, provides the means for negotiating the necessary assumptions with the user. It finds several alternative explanations within the domain, collecting the required assumptions and presenting them together. Thus, the user can select between several query reformulations and the assumptions on the domain which are a prerequisite for computing the corresponding query reformulation. This process resembles the common human search technique of investigating several hypotheses about a given ‘black box’ while ignoring the fact that these hypotheses might be mutually inconsistent.

Technically, the retrieval engine, the dialogue manager and the user interact via first-order logic formulae. We will see in Section 2.3 examples of such formulae and the way they are presented at the user interface.
Each formula expresses one query interpretation, mapping query predicates to applicable information retrieval operations. The hypotheses are part of this mapping: They are the additional lemmas – propositions which have been derived during the inference process and which are neither part of the query formulation nor of the database layer of the system – combined conjunctively within that formula. Each formula defines an intensional concept for the given domain, i.e., a way to interpret the basic query with respect to the existing database(s). The extensional truth of such a formula is computed in a second step. MIRACLE is implemented in a PROLOG environment equipped with process-communication facilities. The inference process is constrained in such a way that only predicates which refer to external functions can form the base of a reformulated query, e.g., text retrieval functions like the INQUERY system, or picture and audio access methods. Thus, the user’s information need is expressed in terms that can be processed by the underlying database system. The reformulated query is – seen from a logical point of view – a formula containing quantified variables which are instantiated during the recursive evaluation of the query. Function calls corresponding to the executable predicates are sent through the process-communication to external modules, and, collecting the results bottom-up, the query results are constructed. This design guarantees that each query reformulation is valid, i.e., if there are no results for a given interpretation then this is always due to a lack of data, but not due to a misconception of the underlying data structures.
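To make the idea of "processing D –> Q from right to left" more concrete, the following is a minimal propositional sketch in Python (MIRACLE itself is implemented in PROLOG and works on first-order formulae). All rule, fact and hypothesis names are hypothetical illustrations of the artist example; in particular, the catalogue rule merely stands in for the "other set of domain rules" mentioned above and is not taken from the system.

```python
def abduce(goal, rules, facts, abducibles):
    """Return a list of hypothesis sets, each sufficient to derive `goal`.

    rules      -- list of (head, body) pairs, body being a tuple of atoms
    facts      -- atoms that are directly executable/known (need no assumption)
    abducibles -- atoms that may be assumed as hypotheses
    The rule set is assumed to be acyclic, so the recursion terminates.
    """
    if goal in facts:
        return [frozenset()]              # nothing has to be assumed
    explanations = []
    if goal in abducibles:
        explanations.append(frozenset([goal]))
    for head, body in rules:
        if head != goal:
            continue
        # Combine one explanation for every atom in the rule body.
        partial = [frozenset()]
        for atom in body:
            sub = abduce(atom, rules, facts, abducibles)
            partial = [p | s for p in partial for s in sub]
        explanations.extend(partial)
    unique = []                           # drop duplicates, keep order
    for e in explanations:
        if e not in unique:
            unique.append(e)
    return unique


# Hypothetical toy domain model for the query
# "Which artists are concerned with impressionism?"
rules = [
    ("artists_concerned_with_impressionism",
     ("works_indexed_impressionism", "creator_of_work_known")),
    # Biography reading: the creator is the subject of the document.
    ("creator_of_work_known", ("document_is_biography",)),
    # Assumed alternative reading for catalogues (illustrative only).
    ("creator_of_work_known",
     ("document_is_catalogue", "catalogue_lists_creators")),
]
facts = {"works_indexed_impressionism"}   # answerable by an index lookup
abducibles = {"document_is_biography", "document_is_catalogue",
              "catalogue_lists_creators"}

explanations = abduce("artists_concerned_with_impressionism",
                      rules, facts, abducibles)
for hypothesis in explanations:
    print(sorted(hypothesis))
```

Each returned set corresponds to one query context that could be offered to the user for negotiation. Checking an accepted interpretation against the database (its extension) would be a separate, second step.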

2.2 Dialogue Modeling and Management

As many users do not initially have well-defined information needs and are often unable to formulate precise queries, interaction plays a crucial role in information systems. A flexible and cooperative system should be able to keep track of changing user goals and strategies: first, to deal with ‘unexpected’ user actions; second, to adapt its own behavior to changed user strategies and to respond accordingly (for example, negotiating and offering alternative strategies that the system can support); and, third, to allow (meta-)reasoning about the information-retrieval method applied in the current system (for example, offering explanations of the abductive inference procedure, the application of rules, the meaning of the generated hypotheses, etc.). Furthermore, as the retrieval functionality of MIRACLE is considerably richer and the interaction more complex than in our previous retrieval systems (such as MERIT), a more sophisticated support of the interaction process is needed.

Therefore, our dialogue modeling framework (cf. Stein & Thiel 1993, Stein & Maier 1995) is being enhanced to take into account the current state of the retrieval process as defined in MIRACLE. The model comprises two parts (or layers) which interact and mutually constrain each other, thus dynamically generating the resulting dialogue structure (history). First, local patterns of the exchange are covered by the “Conversational Roles model” (COR). Second, a set of “dialogue scripts” describes action sequences that typically occur in information retrieval dialogues to pursue a certain information-seeking strategy.

2.2.1 Local Interaction Possibilities

Defining necessary or recommended problem-solving steps to pursue a task or strategy is a useful means of guiding the user through the global stages of the information retrieval dialogue. However, this is not sufficient to specify all of the interaction possibilities in any situation.
To cover unexpected situations and reactions/interactions as well, the COR model is used as a general abstract model of the local interaction possibilities. COR was developed for the genre of information-seeking dialogues to model mainly the illocutionary aspects of the dialogue development (Sitter & Stein 1992, Stein & Thiel 1993, Stein & Maier 1995). Analogously to the “Conversation for Action” model of Winograd and Flores (1986), the dialogue is seen as a “negotiation” based on the interplay of commitments and expectations of the two participants. As opposed to the modeling of spontaneous conversations (for example, Dessalles 1992) or sales dialogues (for example, Jameson et al. 1994), the COR model is based on the assumption that the information seeker (A) and provider (B) enter a cooperative negotiation. This means not only that their overall goals coincide and they take complementary roles, but also that they develop common plans based on the mutually accepted purpose of the current state and future direction of the dialogue (cf. Stein 1995).

COR defines about 14 general types of dialogue acts with respect to the main illocutionary point (cf. Searle 1979) expressed, for example, request, offer, accept, reject offer, inform (cf. Table 1). The basic units of an actual dialogue are individual or atomic dialogue ‘acts’. They are elements of superordinate complex dialogue contributions, the ‘moves’, which can be assigned the same illocutionary point as the atomic acts. The COR model of the entire ‘dialogue’ is represented as a recursive transition network (ATN) consisting of dialogue states and transitions between the states, i.e., the moves. Thus, for any of the dialogue states the possible follow-up moves and possible action sequences can be described. Table 1 includes a fraction of the possible sequences of moves (see footnote 1), the initial move being either a ‘request’ for information or an ‘offer’ to search for information and afterwards to present the retrieved items (‘inform’). In any state both participants have the opportunity to withdraw a previous decision or reject an offer or request, either definitely (quit) or with the intention to continue the dialogue and begin a new dialogue cycle (see Table 1).

Table 1: Typical COR dialogue sequences

‘Ideal’ course (complying with role expectations)
  Dialogue(A,B) –> request(A,B)  promise(B,A)  inform(B,A)  be-contented(A,B)
  Dialogue(A,B) –> offer(B,A)    accept(A,B)   inform(B,A)  be-contented(A,B)
Alternative course
  Dialogue(A,B) –> offer(B,A)    accept(A,B)   inform(B,A)  be-discontented(A,B)
  Dialogue(A,B) –> offer(B,A)    accept(A,B)   withdraw-accept(A,B)
  Dialogue(A,B) –> offer(B,A)    reject-offer(A,B)
  Dialogue(A,B) –> request(A,B)  reject-request(B,A)
  ...
  Dialogue(A,B) –> request(A,B)  promise(B,A)  inform(B,A)  continue(A,B)  Dialogue(A,B)
  Dialogue(A,B) –> request(A,B)  promise(B,A)  withdraw-promise(B,A)  Dialogue(A,B)
  ...
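The decomposition rules of Table 1 can be read as a small grammar. The following Python sketch checks whether a sequence of move names matches one of the rows listed there, following the recursion for rows that end in a new dialogue cycle. It is only an illustration under simplifying assumptions: it covers just the rows shown in Table 1, it ignores optional "jump" transitions (so an omitted 'promise' is not accepted), and the names `ROWS` and `is_cor_dialogue` are hypothetical, not part of the COR implementation.

```python
# Rows of Table 1, written as sequences of move names. Speaker and addressee
# are implied by the move type (e.g., 'request' is always made by A).
# "DIALOGUE" marks the recursive continuation with a new dialogue cycle.
ROWS = [
    ("request", "promise", "inform", "be-contented"),
    ("offer", "accept", "inform", "be-contented"),
    ("offer", "accept", "inform", "be-discontented"),
    ("offer", "accept", "withdraw-accept"),
    ("offer", "reject-offer"),
    ("request", "reject-request"),
    ("request", "promise", "inform", "continue", "DIALOGUE"),
    ("request", "promise", "withdraw-promise", "DIALOGUE"),
]


def is_cor_dialogue(moves):
    """True if `moves` can be decomposed according to the rows above."""
    moves = tuple(moves)
    for row in ROWS:
        if row[-1] == "DIALOGUE":
            prefix = row[:-1]
            # The prefix must match, and the rest must again be a dialogue.
            if moves[:len(prefix)] == prefix and is_cor_dialogue(moves[len(prefix):]):
                return True
        elif moves == row:
            return True
    return False


print(is_cor_dialogue(["offer", "accept", "inform", "be-contented"]))
print(is_cor_dialogue(["request", "promise", "inform", "continue",
                       "offer", "reject-offer"]))
print(is_cor_dialogue(["offer", "promise"]))
```

A fuller model would represent the states and transitions of the network explicitly, so that subdialogues and optional moves could be handled as well.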

A move can also be represented as a recursive transition network (an example is given in Figure 2). Its elements are atomic acts (“B: offer”), other moves (“assert (B, A, supply context information)”), and dialogues (“dialogue (A, B, solicit context information)”); the latter are particularly important for clarification dialogues related to the previous act. As “jumps” indicate optional transitions, it is obvious that a move can consist of a single atomic dialogue act and that even the entire move may be omitted in certain situations, i.e., when the respective intention can be inferred from the context (for instance, a ‘promise’ is often skipped if the requested information can be given immediately; a ‘continue’ need not be made explicit if the speaker actually continues with a new request).

[Figure 2 diagram: transition network for the move Offer (B,A) with states a, b, b’ and c; arcs labelled “B: offer”, “assert (B,A, supply context info)”, “dialogue (A,B, solicit context info)”, “dialogue (A,B, identify offer)” and “jump”.]

Figure 2: COR network for ‘moves’

––––––––––––––
1. Dialogues on the left hand side of an arrow can be decomposed into the moves on the right hand side. Parameter A represents the information seeker and B the information provider; the first parameter refers to the speaker and the second to the addressee. Rows that end with Dialogue (A,B) indicate recursion, i.e., the dialogue continues with a new dialogue cycle starting in state 1.

Traversing the COR networks recursively during a dialogue session allows a hierarchical structure to be built up that represents the dialogue history. This can best be illustrated by a fragment of an example dialogue (taken from the MERIT system and paraphrased in natural language); the COR analysis is given in Figure 3.

Example dialogue 1

U: Search for projects dealing with “arts”. ... Sorry, that’s wrong.  |  U: request + withdraw
S: Why?  |  S: request
U: Because actually I mean “encyclopedia or dictionary of arts” or the like.  |  U: inform
S: Wait, I’m searching ... I’ve found one project: “Europublishing” within the RACE 2 program.  |  S: promise, S: inform
U: Show me the consortium partners. I’d like to see the addresses and contact persons.  |  U: request + assert
S: [... shows list or table ...]  |  S: inform
U: That’s interesting. Now show me their location on a map.  |  U: continue + request
S: [... shows map of Europe with the locations highlighted ...]  |  S: inform
U: Now, I’d like ...

[Figure 3 diagram: hierarchical COR structure of example dialogue 1. Node labels include dialogue(U,S), request(U,S), withdraw(U,S), dialogue(S,U), promise(S,U), inform(S,U) and “jump” arcs, with the atomic acts at the leaves (e.g., “U: request “arts””, “U: withdraw Sorry...”, “S: request Why?”, “U: inform Because...”, “S: promise Wait...”, “S: inform Europublishing”, “U: request Show ...”, “U: assert I’d like ...”).]

Figure 3: COR analysis of example dialogue 1

2.2.2 Dialogue Scripts and their Relation to the COR Model

As the notion of “dialogue scripts” has been introduced in detail elsewhere (cf. Belkin, Cool, Stein & Thiel 1995), we give here only a brief account of the theoretical background and concentrate on the discussion of examples. Based on a multi-dimensional classification of information-seeking strategies (cf. Belkin, Marchetti & Cool 1993), a set of prototypical dialogue scripts can be defined for a given task setting related to a selected information-seeking strategy (ISS). These scripts can be used as dialogue plans to guide the user through an information retrieval session. As opposed to the “cases” used in the above-mentioned MERIT system, scripts do not only define straightforward paths but also foresee branching points in certain situations, at which the user may choose an appropriate option. They may also contain some meta-sequences, e.g., for negotiating the further strategy or tactic in case the current strategy fails to fulfill the user’s information need.

To illustrate the dialogue guidance we introduce a fragment of a script which is presented here, somewhat abbreviated, in everyday natural language, although within MIRACLE most of the dialogue contributions are realized by graphical means, e.g., query forms, graphs, buttons and icons. On the left hand side of the script we enumerate the stages and optional steps; the assigned (atomic) COR acts appear on the right side, with SD-1 to SD-5 denoting different subdialogues. The variables given in Greek letters indicate the actual interface and informational objects. In the introductory sequence, which is common to all scripts, user and system negotiate the user’s global task and interest until they come to an agreement. This enables the system to instantiate the relevant internal script.

Introductory sequence

1  S  Here’s what we can do ... [list for choice, e.g., tasks or interest: p]  |  S: offer
2  U  Let’s do this ... [points or selects: e]  |  U: accept
3  S  Here’s how we’ll do it ... [describes plan/script, text: h] –> 4 or 1 or 5  |  S: inform
4  U  a. OK. [‘relevant button’: c] –> 5  |  U: continue
      b. No, I don’t like this. [‘irrelevant button’: l] –> 1  |  U: continue

Example script (for ISS 10)

5   U  Generate query interpretations for ... [fills in query form: a] –> 6  |  U: request
6   S  a. Here’s the one interpretation found ... [presents graph: b1] –> 7a or 7a.2a  |  S: inform
       b. Here are some interpretations ... [presents graph: b1 and ‘next button’: g] –> 7a or 7b  |  S: inform
       c. Sorry, I can’t find any interpretation, because ... [d] –> 5 or 7c  |  S: withdraw
       d. Sorry, you have seen all alternatives. –> 5 or 7c  |  S: withdraw
7   U  a. 1. I like this. [‘relevant button’: c] –> 9 or 7a.2a  |  U: continue
          2. Please explain the graph. [‘explain button’: f] –> 7a.2a or 7a.2b  |  U: request (SD-1)
    S        a. [explains, text: h, or other graph: b] –> 7a.3 or 7a.4/5  |  S: inform (SD-1)
             b. I can’t explain, because ... [i] –> 5 or 7c or 11  |  S: reject (SD-1)
    U     3. Please execute this query. [‘search button’: k] –> new script  |  U: request (SD-2)
          4. I don’t like this. [‘irrelevant button’: l] –> 1 or 7c  |  U: continue
          5. Let’s quit. [m] –> 11  |  U: (dis)content
    U  b. 1. Show me the next interpretation. [‘next button’: g] –> 7a or 7b  |  U: request (SD-3)
       ...
    S  c. Here are some ways to find a good query interpretation ... [list for choice: p]  |  S: offer
       ...
8   U  a. Please explain difference between interpretations ... [presents graphs bi and bj]  |  U: request (SD-4)
       ...
       b. Please execute all queries and show results. [‘execute all queries’: s] –> new script  |  U: request
9   S  Shall we save this and continue?  |  S: request (SD-5)
...
13  S  Goodbye.
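One way to picture how such a script can guide and monitor a dialogue is to store, for every step, the speaker, the assigned COR act and the permitted follow-up steps, and to check an observed path against this table. The Python sketch below encodes only those steps of the fragment above that carry an explicit “–>” annotation. The data structure, the function names and the label-matching rule (a target such as “7a” is taken to permit any of its sub-steps, e.g. “7a.1”; “7a.4/5” is split into “7a.4” and “7a.5”) are assumptions made for illustration, not the representation used in MIRACLE.

```python
# step label -> (speaker, COR act, permitted follow-up targets)
SCRIPT = {
    "3":     ("S", "inform",       ["4", "1", "5"]),
    "4a":    ("U", "continue",     ["5"]),
    "4b":    ("U", "continue",     ["1"]),
    "5":     ("U", "request",      ["6"]),
    "6a":    ("S", "inform",       ["7a", "7a.2a"]),
    "6b":    ("S", "inform",       ["7a", "7b"]),
    "6c":    ("S", "withdraw",     ["5", "7c"]),
    "6d":    ("S", "withdraw",     ["5", "7c"]),
    "7a.1":  ("U", "continue",     ["9", "7a.2a"]),
    "7a.2":  ("U", "request",      ["7a.2a", "7a.2b"]),
    "7a.2a": ("S", "inform",       ["7a.3", "7a.4", "7a.5"]),
    "7a.2b": ("S", "reject",       ["5", "7c", "11"]),
    "7a.4":  ("U", "continue",     ["1", "7c"]),
    "7a.5":  ("U", "(dis)content", ["11"]),
    "7b.1":  ("U", "request",      ["7a", "7b"]),
}


def permits(target, step):
    """A target permits a step if it names it exactly or names its group."""
    if step == target:
        return True
    if not step.startswith(target):
        return False
    nxt = step[len(target)]          # first character after the target label
    return nxt == "." or nxt.isalpha()


def follows_script(path):
    """True if every consecutive pair of steps in `path` is permitted."""
    for prev, nxt in zip(path, path[1:]):
        if prev not in SCRIPT:
            return False
        if not any(permits(t, nxt) for t in SCRIPT[prev][2]):
            return False
    return True


print(follows_script(["5", "6b", "7a.2", "7a.2a", "7a.4"]))
print(follows_script(["5", "7a.1"]))
```

Steps that the script does not foresee (for example, a request for help while filling in the query form) would fail this check; in the model described here such cases are left to the local COR level rather than treated as errors.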

As scripts model only the recommended options to pursue a goal (here, specification of a query, inspection, explanation and selection of query interpretations, etc.), not all of the alternative moves a user may wish to perform can be foreseen. She might, for example, quit the dialogue before having retrieved anything, reject a system’s offer, ask for help to fill out the query form or wish to enter another type of clarification dialogue. In such cases the COR model allows the monitoring of the local negotiations (e.g., the insertion of subdialogues), while scripts constrain the number of general interaction possibilities provided by COR. The integration, or interaction, of COR and scripts gives us the means for combining global information-seeking strategies with local and contingent user actions in a unified framework.

Recently, it has been proposed to integrate an implementation of this dialogue model into an established text generation system (KOMET-PENMAN) whose mechanisms are used to represent scripts in a similar way as global text structures and to guide the interaction of the system modules by appropriate realization statements (cf. Bateman, Hagen & Stein 1995). The main goal of this project (the European Union funded project SPEAK!) is to generate meta-comments about the dialogue itself as properly intonated speech.

To integrate the dialogue model into MIRACLE, the repertoire of applicable scripts is being tailored to the functionality of the retrieval system and the tasks that can be supported. The object-oriented perspective allows the integration of the conceptual model of the domain, the dialogue model, and the database access methods in one unified framework, which is construed as a heterogeneous semantic network. The dialogue manager maintains a set of constraints, which is built from elements of this network. These constraints restrict the inferential possibilities of the retrieval engine, thus tuning the system according to the context of the retrieval dialogue. For this integrated retrieval-dialogue system the available interaction modalities, which are direct manipulation and form-based input so far, will be extended by natural language capabilities for generating explanations of the abductive retrieval mechanism and the dialogue development.

2.3 Interaction with MIRACLE: Examples

In the following we give a concrete example of a user’s interaction with MIRACLE, which follows the script introduced in Section 2.2.2. We refer to the numbering of the stages and steps of this script and also use the variables (Greek letters) to identify the respective interface objects, as necessary. The figures give an impression of the visualization of important steps and informational objects on the screen. Figure 4, for instance, displays the query form the user is to fill in, and Figures 5 and 6 show the generated graphs representing different query interpretations.

In the introductory sequence of our example the user selects a domain of interest, here, a “dictionary of art and artists”. The dialogue continues in stage 5 with the user’s first query formulation: “Show me biographies about ‘Art Nouveaux’ since 1875” [a]. This query statement can be entered by means of a text widget (see Figure 4), showing the most prominent features of the domain, or by entering a formula directly.

about: Art Nouveau
artist: ?A
date from: 1875
date to:
country:
picture-subject:
style:

Figure 4: Query statement

The abductive retrieval engine finds two interpretations for this query, thus the dialogue continues at 6.b, showing the first interpretation (b1, see Figure 5) and enabling a ‘next’ button [g]. The formula is laid out as a directed graph whose edge directions indicate the inference sequence. For example, the left-hand side of Figure 5 represents the following inference steps: ‘Assuming some A has a date of birth X, greater than 1875. Then A is qualified for this numerical constraint.’ The right-hand side shows that this interpretation will evaluate the ‘aboutness’ statement by examining the textual components of the document collection. The variable A, the missing link between the two parts of this query reformulation, is restricted to be a concept of type artist() and thus serves as a key for the document collection.

[Figure 5 graph nodes: date_from(1875), greater(X,1875), about(art_nouveaux), birthdate(A,X), doc(D,A,art_nouveaux), artist(A)]

Figure 5: Relevant biographies and the related artist are checked for the artist’s birthdate (b1)

By selecting one of the boxes in the graph b1 the user may ask for an explanation (stage 7a.2, f) of the corresponding concept. Selecting ‘birthdate’, the user is informed (in an additional text window, h) that the time period ‘starting 1875’ is mapped to the birthdate of the relevant artists. Since this differs from what she intended, she uses the ‘next’ button (stage 7b.1, g) to inspect the alternative(s). The system presents the second interpretation (b2), but now disables the ‘next’ button, since there are no more interpretations left. In this interpretation, the time period is checked not against the artists’ birthdates but against the creation dates of the pictures by Art Nouveaux artists. Here, the textual retrieval component (right-hand side) and the pictures are finally linked by the artist A via the computable relation pic_artist(). The user indicates that she “likes” this interpretation by selecting it [c] and asks the system to check the database for relevant elements [k].

[Figure 6 graph nodes: about(art_nouveaux), date_from(1875), greater1(Z,1875), doc(D,A,art_nouveaux), creation_from(Pic,Z), artist(A), pic_desc(Pic,A,U), pic_artist(Pic,A)]

Figure 6: Information is centered around pictures and their date of creation (b2)
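The inspection steps of this example (stage 6.b and the ‘next’ button of stage 7b.1) can be illustrated with a minimal Python sketch. It is not MIRACLE’s implementation: the rule table RULES, the function interpretations() and the class InterpretationBrowser are hypothetical names, and the grouping of the literals from Figures 5 and 6 into alternative assumption sets is an assumption made for illustration only.

```python
from itertools import product

# Hypothetical abduction rules: each query literal maps to the alternative
# sets of assumptions that would explain it. The literals are copied from
# Figures 5 and 6; their grouping into rules is an assumption.
RULES = {
    "about(art_nouveaux)": [
        ["doc(D,A,art_nouveaux)", "artist(A)"],
    ],
    "date_from(1875)": [
        ["birthdate(A,X)", "greater(X,1875)"],            # reading b1
        ["creation_from(Pic,Z)", "greater1(Z,1875)",
         "pic_artist(Pic,A)", "pic_desc(Pic,A,U)"],       # reading b2
    ],
}


def interpretations(query):
    """Combine one alternative per query literal into full interpretations."""
    alternatives = [RULES[literal] for literal in query]
    return [sum(choice, []) for choice in product(*alternatives)]


class InterpretationBrowser:
    """Tracks which interpretation is shown and whether 'next' is enabled."""

    def __init__(self, items):
        self.items = items
        self.index = 0

    @property
    def current(self):
        return self.items[self.index]

    @property
    def next_enabled(self):
        # 'next' stays enabled only while unseen interpretations remain.
        return self.index < len(self.items) - 1

    def next(self):
        if not self.next_enabled:
            raise RuntimeError("no more interpretations")
        self.index += 1
        return self.current


browser = InterpretationBrowser(
    interpretations(["about(art_nouveaux)", "date_from(1875)"]))
```

With these rules the browser starts on the birthdate reading with ‘next’ enabled; after one call to next() it shows the picture-oriented reading and reports next_enabled as False, mirroring the disabled button in the example.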

Now this query formulation is instantiated: First, the system collects all biographies of artists which are related to Art Nouveaux. The documents are assigned to the variable D. Second, all pictures (Pic) and their descriptions (U) are retrieved. Third, all pictures which were created later than 1875 are returned. Both partial solutions are combined during the computation of pic_artist(). No artist appears who did not create a picture after 1875, and no picture is considered that was not created by a qualifying artist. The final sets of proper instantiations are presented to the user.
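The three instantiation steps and the final combination via pic_artist() amount to a join on the artist variable A. The following Python sketch illustrates this under stated assumptions: the function name evaluate_b2 is hypothetical, all data values are invented toy data, and greater1() is read as a strict comparison (“later than”).

```python
# Toy data; every identifier, description, and year below is invented.
BIOGRAPHIES = [  # (document id, artist, topics mentioned in the text)
    ("doc1", "artist_a", {"art_nouveaux"}),
    ("doc2", "artist_b", {"art_nouveaux"}),
    ("doc3", "artist_c", {"cubism"}),
]
PICTURES = [  # (picture id, artist, description, year of creation)
    ("pic1", "artist_a", "poster", 1896),
    ("pic2", "artist_a", "sketch", 1870),
    ("pic3", "artist_b", "portrait", 1860),
    ("pic4", "artist_c", "still life", 1910),
]


def evaluate_b2(topic, date_from):
    """Instantiate interpretation b2 over the toy collections."""
    # Step 1, doc(D, A, topic): biographies related to the topic.
    docs = [(d, a) for d, a, topics in BIOGRAPHIES if topic in topics]
    # Step 2, pic_desc(Pic, A, U): all pictures with their descriptions.
    pics = [(p, a, u, z) for p, a, u, z in PICTURES]
    # Step 3, creation_from(Pic, Z) and greater1(Z, date_from):
    # keep only pictures created later than the given year.
    recent = [(p, a, u) for p, a, u, z in pics if z > date_from]
    # pic_artist(Pic, A): join both partial solutions on the artist.
    return [(d, a, p, u)
            for d, a in docs
            for p, pic_artist, u in recent
            if pic_artist == a]


results = evaluate_b2("art_nouveaux", 1875)
```

In this toy collection, artist_b is dropped because none of her pictures is later than 1875, and pic4 is dropped because its artist has no matching biography, which corresponds to the two exclusion conditions stated above.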

Figure 7: Biography and picture retrieved

Thus, the functions of retrieval engine and dialogue manager are intertwined. The dialogue planning is based on formal constraints which are defined on the semantic properties of the objects involved. For example, the association of a given retrieval result with one of the interpretations of the user’s original query allows the explanation of the relevance decision of the system by referring to the assumptions underlying the interpretation.
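One way to picture this association is to let every retrieved item carry a reference to the interpretation under which it was found, so that an explanation can be phrased from that interpretation’s assumptions. The sketch below is only an illustration: the classes Interpretation and Result, the function explain(), and the wording of the assumptions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpretation:
    label: str
    assumptions: tuple  # human-readable assumptions behind this reading


@dataclass(frozen=True)
class Result:
    item: str
    interpretation: Interpretation  # reading under which the item was found


def explain(result):
    """Justify a retrieved item by the assumptions of its interpretation."""
    reasons = "; ".join(result.interpretation.assumptions)
    return (f"{result.item} was retrieved under "
            f"{result.interpretation.label}, assuming: {reasons}")


# Hypothetical wording for the picture-oriented reading of the example.
b2 = Interpretation("b2", (
    "the period 'since 1875' refers to the picture's date of creation",
    "the picture was created by an artist whose biography matches the topic",
))
hit = Result("pic1", b2)
message = explain(hit)
```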

3 Conclusions

We have introduced a theoretical framework for “conversational information retrieval”, which combines content-oriented information retrieval mechanisms with a comprehensive dialogue model, and its application in a multimedia information retrieval prototype (MIRACLE). The knowledge-based retrieval engine employs a novel mechanism of content-oriented retrieval based on abductive reasoning. The system is able to assist the user actively in her information-seeking interaction, or dialogue. Based on an explicit representation of the dialogue on two layers, i.e., using the COR model and dialogue scripts, the system monitors and guides the interaction as the dialogue develops. Using examples of the user-system interaction in MIRACLE, we demonstrated and discussed how the retrieval engine and the dialogue manager interact with each other to construct, depending on the user’s input, a semantically and pragmatically coherent dialogue course. Future work involves improving the interaction of the system components, designing a more appropriate multimedia interface to support the conversational approach, incorporating text generation components for generating explanations of the abductive retrieval method, and evaluating the integrated prototype.

References

Bateman, J., Hagen, E. & Stein, A. (1995) Dialogue Modeling for Speech Generation in Multimodal Information Systems. In: Proceedings of the ESCA Workshop on Spoken Dialogue Systems – Theories and Applications (ETRW ’95), Vigsø, Denmark, 1995.
Belkin, N.J., Cool, C., Stein, A. & Thiel, U. (1995) Cases, Scripts, and Information Seeking Strategies: On the Design of Interactive Information Retrieval Systems. To appear in: Expert Systems and Applications, Vol. 9, 1995. Also available as: Arbeitspapiere der GMD, No. 875. Sankt Augustin: GMD, Nov. 1994.
Belkin, N.J., Marchetti, P.G. & Cool, C. (1993) BRAQUE: Design of an Interface to Support User Interaction in Information Retrieval. Information Processing and Management, Vol. 29 (3), 1993, pp. 325-344.
Callan, J.P., Croft, W.B. & Harding, S.M. (1992) The INQUERY Retrieval System. In: Proceedings of the Third International Conference on Database and Expert Systems Application. Valencia, Spain: Springer, 1992, pp. 78-83.
Dessalles, J.-L. (1992) From Knowledge to Conversation: A Computational Model of Argumentation. Technical Report, TELECOM Paris 92D019, Paris, Dec. 1992.
Fischer, M., Maier, E. & Stein, A. (1994) Generating Cooperative System Responses in Information Retrieval Dialogues. In: Proceedings of the 7th International Workshop on Natural Language Generation (IWNLG’94), Kennebunkport, Maine, 1994, pp. 207-216.
Hess, M. (1992) An Incrementally Extensible Document Retrieval System Based on Linguistic and Logical Principles. In: Proceedings of the 15th International Conference on Research and Development in Information Retrieval (SIGIR ’92), Pittsburgh, PA, 1992, pp. 190-197.
Jameson, A., Kipper, B., Ndiaye, A., Schäfer, R., Simons, J., Weis, T. & Zimmermann, D. (1994) Cooperating to be Noncooperative: The Dialog System PRACMA. In: Proceedings of the 18th German Conference on Artificial Intelligence (KI ’94), Saarbrücken, 1994, pp. 106-117.
Maybury, M.T. (ed.) (1993) Intelligent Multimedia Interfaces. Menlo Park, CA: AAAI/MIT Press, 1993.
Meghini, C., Sebastiani, F., Straccia, U. & Thanos, C. (1993) A Model of Information Retrieval Based on a Terminological Logic. In: Proceedings of the 16th International Conference on Research and Development in Information Retrieval (SIGIR ’93), Copenhagen, Denmark, 1993, pp. 298-307.
Müller, A. & Kutschekmanesch, S. (1995) Using Abductive Inference and Dynamic Indexing to Retrieve Multimedia SGML Documents. Technical Report, Darmstadt: GMD-IPSI, 1995. (submitted for publication)
Müller, A. & Thiel, U. (1994) Query Expansion in an Abductive Information Retrieval System. In: Proceedings of the RIAO’94 Conference – Intelligent Multimedia Information Retrieval Systems and Management, New York, NY, Vol. 1, 1994, pp. 461-480.
Nie, J.-Y. (1992) Towards a Probabilistic Modal Logic for Semantic-based Information Retrieval. In: Proceedings of the 15th International Conference on Research and Development in Information Retrieval (SIGIR ’92), Pittsburgh, PA, 1992, pp. 140-151.
Osgood, R. & Bareiss, R. (1993) Automated Index Generation for Constructing Large-scale Conversational Hypermedia Systems. In: Proceedings of the 11th National Conference on Artificial Intelligence (AAAI’93), Washington DC. Menlo Park: AAAI Press/The MIT Press, 1993, pp. 309-314.
van Rijsbergen, C.J. (1989) Towards an Information Logic. In: Belkin, N.J. & van Rijsbergen, C.J. (eds): Proceedings of the 12th International Conference on Research and Development in Information Retrieval (SIGIR ’89), Cambridge, MA, 1989, pp. 77-86.
Searle, J.R. (1979) A Taxonomy of Illocutionary Acts. In: Searle, J.R. Expression and Meaning. Studies in the Theory of Speech Acts. Cambridge, MA: Cambridge University Press, 1979, pp. 1-29.
Sitter, S. & Stein, A. (1992) Modeling the Illocutionary Aspects of Information-Seeking Dialogues. Information Processing and Management, Vol. 28 (2), 1992, pp. 165-180.
Stein, A. (1995) Dialogstrategien für kooperative Informationssysteme: Ein komplexes Modell multimodaler Interaktion. To appear in: Sprache und Datenverarbeitung, Vol. 19, 1995.
Stein, A. & Maier, E. (1995) Structuring Collaborative Information-Seeking Dialogues. Knowledge-Based Systems (Special Issue on Human-Computer Collaboration), Vol. 8, March 1995. Also available as: Arbeitspapiere der GMD, No. 853, Sankt Augustin: GMD, June 1994.
Stein, A. & Thiel, U. (1993) A Conversational Model of Multimodal Interaction in Information Systems. In: Proceedings of the 11th National Conference on Artificial Intelligence (AAAI’93), Washington DC. Menlo Park: AAAI Press/The MIT Press, 1993, pp. 283-288.
Thiel, U. (1995) Interaction in Hypermedia Systems: From Browsing to Conversation. In: Schuler, W., Hannemann, J. & Streitz, N. (eds): Designing User Interfaces for Hypermedia (Research Reports ESPRIT Project 6532 HIFI, Vol. 1). Berlin: Springer, 1995, pp. 43-54.
Tißen, A. (1993) Knowledge Bases for User Guidance in Information Seeking Dialogues. In: Wayne, D.G., et al. (eds.): Proceedings of the International Workshop on Intelligent User Interfaces (IWIUI ’93), Orlando, FL. New York: ACM Press, 1993, pp. 149-156.
Winograd, T. & Flores, F. (1986) Understanding Computers and Cognition. Norwood, NJ: Ablex, 1986.
