The Knowledge Mining Center*
Joachim Klausner, Gerhard K. Kraetzschmar**, Josef Schneeberger, Herbert Stoyan
FORWISS, Knowledge Acquisition Research Group, Am Weichselgarten 7, 91058 Erlangen-Tennenlohe, Germany
fjmklausn,gkk,jws,[email protected]

Abstract. Interviewing experts in order to elicit the knowledge they use in solving problems is a very common task in the knowledge acquisition phase. In practice, it is also a tedious and costly one. Surprisingly, very little technological support for performing interviews is widely available so far. Recent developments in hardware and software now provide a significantly better technological base to fill this gap. The Knowledge Mining Center is an easy-to-use, yet very flexible tool for interviewing. It allows for better presentation of information during interviews by using multimedia technology and online access to information such as design documents or protocols of earlier interviews. Its main advantage, however, is its capability to semi-automatically capture, segment, and annotate information verbally communicated during interviews. The paper motivates the need for technological support for interviewing, discusses design considerations for interviewing tools, describes architecture, functionality, and implementation issues of the Knowledge Mining Center, illustrates its use by an application example, and concludes with a summary of experiences gained so far and our plans for future work.

Keywords: knowledge elicitation, interviewing, multimedia applications in knowledge acquisition

1 Introduction
WissAk is a basic research project sponsored by FORWISS. Its goal is to provide tools that improve the knowledge acquisition process. Rather than adding yet another model-based knowledge engineering methodology, we try to identify where problems occur in practical projects and fill these gaps by providing appropriate tools. One such gap is interviewing. Considering the attention knowledge acquisition has received in both industry and research over the last ten years, interviewing technology is surprisingly underdeveloped. As various forms of interviewing experts can be expected to remain the backbone of knowledge acquisition in industrial projects for quite some time, making this process more effective and efficient is highly desirable.

* Submitted to the 1994 European Knowledge Acquisition Workshop (EKAW-94).
** Please send all communications to Gerhard K. Kraetzschmar.

This paper presents the Knowledge Mining Center, a tool providing technological support for interviewing, and describes it in detail. We start out by analyzing both the current practice of interviewing in typical industrial expert system projects (Sect. 2) and the kind of technological support needed to improve the situation (Sect. 3). After discussing some design considerations for interviewing tools (Sect. 4), we describe the architecture, functionality, and implementation of the Knowledge Mining Center (KMC) tool as developed within the WissAk project (Sect. 5). An example of its application is given in Sect. 6. The experiences made so far in using KMC are summarized in Sect. 7, while Sect. 7.1 outlines future developments. The final section draws some conclusions.

2 Current Practice of Interviewing
In practically all the projects we have seen in industry there were experts on one side and knowledge engineers on the other. They had to communicate so that the knowledge engineers could acquire all the knowledge necessary for building the expert system. This communication process, usually called knowledge elicitation, is well known to be difficult for various reasons. Most experts have very little time and hate to waste it. Knowledge engineers have no time to waste either; there is never enough of it in any project. In addition, at least in the beginning, knowledge engineers and experts speak different languages and do not yet understand each other very well. It takes them some time to become acquainted with each other and to sort out their terminology. Thus, there are many negative influences on the knowledge elicitation process.

Considering the number of tools that have been developed for other steps in knowledge acquisition, there is surprisingly little available for knowledge elicitation in general or interviewing in particular. Also, books on knowledge acquisition contain surprisingly little information and good advice about how to structure, lead, and evaluate good interviews with experts.³

Interviewing, however, is still the dominant method of knowledge elicitation in industrial projects, and it is often surprisingly inefficient and frustrating for knowledge engineers. In a typical situation, the interview takes anything between one and four hours, is not very concise, and results in just a few pages of hand-written notes. After the interview, both participants feel totally worn out and do not write a protocol right away (if this is done at all). The next day (or even later, if other things suddenly become more urgent) the knowledge engineer tries to understand what he learned from the expert. More often than not, he discovers that his notes are not sufficient to reconstruct the information exchanged in the interview, and he cannot fill in the missing links from memory. In short, he no longer understands many things that seemed so easy the day before. He must go back to the expert and talk about the very same topic again, thereby running the risk of frustrating the expert. Doing knowledge elicitation like this is obviously neither effective nor efficient.

³ A notable exception is the book by Scott, Clayton and Gibson [4].

When experienced knowledge engineers are asked about the use of tape or video recorders, the answers always seem to be the same: audio recordings would be great, but working through them is hopelessly inefficient. Manually noting tape counter numbers helps a little, but it is usually forgotten during interviews. Video can be attractive in special cases, but is neither practical nor necessary for ordinary interviews. Thus, the off-the-shelf technology sometimes recommended in knowledge acquisition books does not work very well in practice.

3 Technological Support for Interviewing
The bottom line of knowledge elicitation interviews is that the knowledge engineer is interested in maximizing the information he gathers from the expert. The measures that facilitate maximum information transfer can be summarized as
1. access to and presentation of information, and
2. capture and segmentation of information.

While capture and segmentation deal with recording what the expert says or does and cutting it into chunks for later use, access to and presentation of information are measures to evoke more and better communicative activity from the expert. Naturally, access to information is a prerequisite for presenting it.

What kind of information do we refer to, and what kind of technology can be used in an interviewing tool? The next few paragraphs give an open list of answers, with examples of the information at hand and of the multimedia technology that can present or capture it. Information that facilitates good information transfer, and that should therefore be accessible and presentable during knowledge elicitation interviews, includes the following:

Text. Text files range from source code files and design documents to, e.g., lists of possible attributes for a class or concept that have been created specifically for a particular interview. Presenting the text visually is better than just reading it to the expert, because many people are more visually oriented.
Graphics. Presenting a graphical picture, e.g. of a system architecture, an example user interface layout, or a block diagram of a technical system, is often necessary during knowledge elicitation interviews.
Animations. In some cases animated graphics, like those generated by many simulation tools, can be used.
Audio. For some projects it is desirable to present audio recordings, if sound plays a significant role for the expert system. As an example, consider an expert who schedules waste furnaces or a construction engineer for automobile exhausts.
Video. If a large and complex technical object, such as a power plant and its control center or a job shop in a factory, is the topic of knowledge elicitation, then making a video of the object first and using it during knowledge acquisition can be very useful.
Programs. Showing the expert a program relevant to the topic at hand, e.g. a short demo of the current prototype expert system or a program that the expert uses for problem solving, can make knowledge elicitation shorter and more focused.

Until now, most of this information could be accessed and presented only with excessive use of resources (many printouts, tape recorder, amplifier, speaker boxes, TV set, video recorder, etc.). Preparation and handling also took a lot of time and were therefore very costly. Using state-of-the-art multimedia and network technology, many of the above can now, at least in principle, be done on standard workstations. However, some additional technology to improve the presentation of information may still be in order, especially if knowledge elicitation involves more than two or three people, i.e. if it is done in a larger group:

Large Screens, Overhead Projectors, and LCD Displays may be used to project all or part of a computer screen onto a larger screen, so that a group of people (4-6, up to a dozen) can easily view the displayed information.
Large Speaker Boxes, Sound Mixers, and Amplifiers can produce sound at much better quality than the technology built into workstations. However, built-in technology is sufficient for the standard case.

Some examples of capturing information are:

Note Taking by the knowledge engineer or any other participant,
Graphical Sketches generated by the expert to explain the structure of some object or to illustrate a process,
Gestures and Hand Movements performed by the expert to explain operations,
Text Documents and Graphics, Technical Drawings, and Pictures that the expert brought to the interview,
Sound, either on pre-recorded tapes provided by the expert or simply the discussions of the interview, and finally
Vision, e.g. video recordings of (parts of) a knowledge elicitation session.

As experience with classical tape recording in knowledge elicitation has shown, just capturing all this information is not enough. Audio and video in particular generate such an overwhelming amount of unstructured information that using them for post-processing interviews seems to incur prohibitive costs. A central idea of technological support for interviewing is to structure the captured information, at least semi-automatically, and to link related parts. For audio and video this means that the recording is segmented into chunks that can be associated with other pieces of information, e.g. a question text in an interview document.

Given that the use of a computer during knowledge acquisition provides access to all the software running on it, modified forms of classical interviews, such as joint authoring of a document, can also be practiced.

4 Design Considerations for Interviewing Support Tools
One of the reasons for the often disappointing amount of notes taken during traditional knowledge elicitation interviews is that maintaining a lively dialog with the expert and carefully taking notes are at odds with each other. Therefore Scott et al. [4] suggest the participation of two knowledge engineers, such that one can concentrate on creating a good interview atmosphere and maintaining the dialog while the other mainly takes notes. The knowledge engineers may switch roles at breaks during the interview. We join them in recommending this procedure: when interviewing support tools are used, it is best to involve two knowledge engineers in the knowledge engineering process.⁴

With all the jazzy multimedia technology around, using it for knowledge elicitation might not seem to be a problem. However, most of the new multimedia tools are still not very well integrated with each other. A unifying framework that ties the various components and tools together and lets knowledge engineers use them through a concise user interface is still missing. Also, many of these components provide much more functionality than can reasonably be used in an interview. This extraneous functionality can easily get in the way. It is of utmost importance to design interview support tools such that they facilitate dialog instead of becoming an obstacle to it. This issue extends to the consequences that the use of a technology has both for pre- and post-processing interviews and for the implementation of support tools:

⁴ The Knowledge Mining Center described later on may well be used by a single knowledge engineer, but he should be aware that he has to divide his attention between the expert and the computer. We suggest doing this only if the expert can be deeply involved in using KMC, either because graphics, animations, audio or video are included in the interview document and shown to him, or because the expert himself wants to use the system, for instance to draw a diagram.

Pre- and Post-Processing Interviews. Text and graphics pose no big problems for pre- and post-processing, because they use only limited storage and there are many standard programs to process them. Many UNIX workstations nowadays feature audio processors, including microphone and speaker box, as part of the standard configuration, and add-on cards are available for PCs at very low prices. However, audio can generate a significant amount of data. The availability of utilities for compression and for conversion to other formats may be crucial if one wants to record interviews several hours long. On the other hand, recording audio is very attractive (provided the segmentation problem can be solved), because it is a complete record of the information exchanged. Video is a totally different story. Video cards are usually not part of the standard configuration and are still costly. Also, video generates such an overwhelming amount of data that, even with smart compression algorithms, we are talking about extra gigabytes of storage. Given that by far the largest part of an interview usually consists of boring scenes of two people talking to one another, it does not make much sense to record a complete interview. The marginal benefit of video is quite low, while its costs are still high. Therefore, video will be limited to special applications, like those outlined earlier.
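To make the storage argument for audio concrete, the following back-of-the-envelope sketch estimates the size of an uncompressed recording. The sampling parameters (8 kHz, one byte per sample, mono) are an assumption chosen for illustration; they are not figures taken from the KMC implementation, and the function names are hypothetical.

```python
def audio_bytes(hours, sample_rate_hz=8000, bytes_per_sample=1, channels=1):
    """Size in bytes of an uncompressed recording of the given length.

    The default parameters are assumed values for telephone-quality audio,
    not measurements from KMC.
    """
    seconds = hours * 3600
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


def megabytes(n_bytes):
    """Convert bytes to megabytes (10**6 bytes)."""
    return n_bytes / 1_000_000


if __name__ == "__main__":
    for hours in (1, 2, 4):
        print(f"{hours} h: {megabytes(audio_bytes(hours)):.1f} MB")
    # 1 h: 28.8 MB
    # 2 h: 57.6 MB
    # 4 h: 115.2 MB
```

Under these assumptions a four-hour interview stays in the range of a hundred megabytes before compression, which is large but manageable, in contrast to the gigabytes mentioned for video.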

Implementation. Implementing from scratch a tool that provides processing and presentation functionality for all the different kinds of information indicated above is a very costly endeavor. Also, development and standardization are proceeding so rapidly in some areas that keeping the tool up to date seems almost impossible. Instead, we recommend using as many available tools and as much existing technology as possible and constructing just a unifying framework⁵ that integrates all these tools.

5 The Knowledge Mining Center
The Knowledge Mining Center⁶ is a tool supporting interviewing. It is designed to increase the overall performance of meetings and interviews by making use of advanced computer technology.

5.1 Architecture of KMC
The Knowledge Mining Center is a software system that uses and combines several different components to yield multimedia support for interviewing. KMC processes a structured interview document, which contains, among other things, the questions to be asked and links to pieces of information like audio files or graphics. It automatically invokes pre-specified utilities when the presentation of information in such files is asked for. KMC also records both the notes typed in by a knowledge engineer and an audio recording of the whole interview session.⁷ Whenever a knowledge engineer requests the next question, both notes and audio sequence are automatically segmented and the pieces are attached to the associated question.

KMC is installed on a workstation that is placed in a meeting room where knowledge elicitation sessions are performed (see Fig. 1). The workstation is connected to the network of computers the knowledge engineers use during their daily work, so that they can access relevant documents if necessary. The workstation screen is connected to an LCD display on an overhead projector, which generates a large copy of the computer screen on the wall (or a special screen). Ideally, a printer and a scanner are attached to the workstation to allow printing documents and screen dumps and scanning in text and graphics provided by the expert. Microphones are placed in front of the seats of experts and knowledge engineers and connected to the workstation so that audio recordings can be generated. A video camera could be added, if necessary, to make short videos.

⁵ Called meta-tool by others [1].
⁶ For a full description see [2].
⁷ Audio recording may be switched on and off on demand.

[Figure 1: meeting room setup with screen, printer, scanner, overhead projector and LCD display, camera, workstation, microphones, and seats for experts and knowledge engineers.]

Fig. 1. The KMC architecture.

5.2 Functionality of KMC

Using KMC for knowledge elicitation consists of three steps:
1. Preparing an interview document.
2. Performing the interview, thereby modifying the document.
3. Evaluating the modified interview document.

Although KMC is intended for the second step only, it can also be used for pre- and post-processing interview documents. However, the Knowledge Engineering Assistant (KEA), another tool currently being developed in WissAk, will eventually provide much better functionality for this purpose.

The structured interview document used to hold all the relevant information is a normal LaTeX source file written with the kmc style option. This has the advantage that interview protocols can be generated fairly easily by running the document through LaTeX after post-processing the interview. The kmc style option defines LaTeX commands for specifying administrative information (experts and knowledge engineers participating in the interview).

Modified versions of the LaTeX sectioning commands are used to structure the interview. New commands are included to define question nodes, answer possibilities (list of possible answers, multiple-choice, single-choice), and answers (used by KMC to add the answer notes). The commands \link{label} and \goto{label} provide a limited control structure, as needed e.g. for multiple-choice questions. Finally, a range of commands is available to attach additional information to question nodes, such as audio, dvi, and PostScript files and a large number of graphics formats, using different tools for displaying them. This foreign file interface is customizable so that other utilities (mpeg_play, for instance) can easily be added.

When asked to perform an interview, KMC reads a structured interview document and extracts browsing information, which is displayed in the interview browser. The user can navigate through the document by using the browser, by clicking on buttons in the interview control panel, or by typing short control commands. If the user selects a question node, the question is displayed in its own window on the screen. Additional information is displayed either automatically or on demand, depending on what is specified in the interview document. If the external tool that displays a graphic also allows editing it, the user can use this to add annotations to graphical pictures. All this functionality is available in both the browsing mode and the interviewing mode.

Whenever the user switches to interviewing mode, he can enter textual notes, which are attached to the current question node. Switching to interviewing mode also enables audio recording. Segmentation of the audio sequence occurs automatically whenever the user requests to proceed to another question; he may also request segmentation explicitly if the expert's answer becomes very long. The audio sequence is sampled by the audio processor and stored on hard disk.
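The segmentation behavior described above can be sketched as follows. This is a minimal illustration in Python, not the Emacs Lisp code of KMC; the names `Segment`, `InterviewSession`, `goto_question`, and `split_segment` are hypothetical, and representing the audio stream by timestamps in seconds is an assumption made to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    question_id: str
    start: float  # seconds from the start of the recording
    end: float
    notes: list = field(default_factory=list)


class InterviewSession:
    """Tracks which question is on screen and cuts segments at each change."""

    def __init__(self):
        self.segments = []    # closed segments, in recording order
        self._current = None  # (question_id, start_time, notes) or None

    def goto_question(self, question_id, now):
        """Close the running segment (if any) and start one for question_id."""
        self._close(now)
        self._current = (question_id, now, [])

    def add_note(self, text):
        """Attach a typed note to the segment of the current question."""
        if self._current is None:
            raise RuntimeError("no question selected")
        self._current[2].append(text)

    def split_segment(self, now):
        """Explicit segmentation inside a long answer to the same question."""
        if self._current is None:
            raise RuntimeError("no question selected")
        question_id = self._current[0]
        self._close(now)
        self._current = (question_id, now, [])

    def finish(self, now):
        """End the interview and close the last segment."""
        self._close(now)
        self._current = None

    def segments_for(self, question_id):
        """All segments recorded while question_id was on screen."""
        return [s for s in self.segments if s.question_id == question_id]

    def _close(self, now):
        if self._current is not None:
            question_id, start, notes = self._current
            self.segments.append(Segment(question_id, start, now, notes))
```

In this sketch a question may own several segments, which mirrors the explicit segmentation of long answers; looking up `segments_for("q2")` then plays the role of jumping directly to the audio recorded for that question.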
Knowledge elicitation interviews can very rarely be planned to such an extent that all the questions are fixed. The expert may misunderstand a question, or the knowledge engineer may not understand part of the expert's answer. In both cases, a knowledge engineer will simply ask additional questions. KMC provides a very simple way to extend the interview document structure: the ad-hoc facility. With the click of a button or a short keyboard command, the knowledge engineer using KMC can dynamically extend the document by ad-hoc questions.

After the interview, the knowledge engineer can browse through the interview document. He is free to add text to answer nodes or to complete the notes already there. If he does not remember what was said during the interview, he can play the audio recording. He will get the audio sequence that was recorded while the associated question was on the screen. Using the audiotool, he can move to an arbitrary position in the audio file and even edit it. Playing audio sequences is very fast, because the segments are directly accessible and stored on hard disk rather than on tape; thus, there is no time-consuming fast-forwarding or rewinding of a tape.
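One way to picture the ad-hoc facility is as an insertion into the ordered list of question nodes, directly after the question currently on screen. The sketch below is an illustration under that assumption; the class names, the `adhoc-N` labels, and the flat list (instead of the sectioned document structure) are all hypothetical and not taken from KMC.

```python
from dataclasses import dataclass


@dataclass
class QuestionNode:
    label: str
    text: str
    ad_hoc: bool = False


class InterviewDocument:
    """A flat, ordered list of question nodes with a current position."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.position = 0  # index of the question currently on screen
        self._ad_hoc_count = 0

    def current(self):
        return self.nodes[self.position]

    def next_question(self):
        """Advance to the next question, staying on the last one at the end."""
        if self.position < len(self.nodes) - 1:
            self.position += 1
        return self.current()

    def add_ad_hoc(self, text):
        """Insert a new question right after the current one and move to it."""
        self._ad_hoc_count += 1
        node = QuestionNode(f"adhoc-{self._ad_hoc_count}", text, ad_hoc=True)
        insert_at = self.position + 1 if self.nodes else 0
        self.nodes.insert(insert_at, node)
        self.position = insert_at
        return node
```

Because the planned questions keep their order, the interviewer can return to the prepared sequence with `next_question()` once the follow-up has been answered.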

5.3 Implementation Issues
The current version of KMC is implemented in Emacs Lisp and requires Lucid Emacs 19.8 or newer and a machine with an audio device. It has been tested on two platforms:
- SUN SPARCstation 10 with speaker box and SunOS
- PC 486DX33 with SoundBlaster card and Linux

As outlined above, the basic idea for the implementation was to use as many external tools as possible and to combine them under a single user interface. For the presentation of multimedia data KMC uses tools like ghostview, xdvi, xv, mpeg_play, and xflick. Since audio software is very hardware dependent, KMC uses external tools for recording, playing, and compressing audio data. Thus, KMC itself remains hardware independent and can be used on various machines. The use of external tools not only saved implementation time, but also gives the user the possibility to work with tools he already knows well.
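The idea of delegating presentation to external tools can be illustrated with a small dispatch table. The program names are the ones listed above; everything else (the table keys, the function names, and the plain "program file" invocation without options) is an assumption for illustration and does not describe how KMC actually calls these tools. The sketch only builds the argument list and does not start any process.

```python
import shlex

# Viewer programs named in the text. The keys and the bare
# "program file" call convention are assumptions for illustration.
VIEWERS = {
    "ps": ["ghostview"],
    "dvi": ["xdvi"],
    "image": ["xv"],
    "mpeg": ["mpeg_play"],
}


def register_viewer(kind, command_line):
    """Add or replace a viewer, e.g. register_viewer("anim", "xflick")."""
    VIEWERS[kind] = shlex.split(command_line)


def viewer_command(kind, filename):
    """Return the argument list that would display filename; does not run it."""
    try:
        return VIEWERS[kind] + [filename]
    except KeyError:
        raise ValueError(f"no viewer registered for kind {kind!r}") from None
```

A caller could pass the result to `subprocess.Popen` to launch the viewer without blocking the interview tool. Keeping the table as plain data is what makes such an interface customizable: supporting a new file type means adding one entry.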

6 Application Example of KMC
The following paragraphs and figures illustrate the use of KMC. An example interview from a project on coin classification [3] was used.

6.1 Preparing an Interview
Excerpts from the structured interview document are shown in Fig. 2. The line \ps{coin20.ps}{-}, for instance, specifies that a PostScript file named coin20.ps should be displayed on demand only, using ghostview, while \xv{coin20.gif}{-} asks for the corresponding GIF file to be displayed by the program xv. The first two pages of the printout, generated by running the source file through latex and dvips and displaying the result in ghostview, are shown in Fig. 3 and Fig. 4.

6.2 Performing an Interview
Loading the interview document into KMC results in a screen layout similar to the one in Fig. 5. The windows displayed are, from top to bottom and left to right: the audio control panel, the interview control panel, the notes input window, the interview browser, the question display window, and an xv window displaying a coin. For clarity and better illustration, the interview control panel and the interview browser are displayed in larger size in Figs. 6 and 7, respectively. As a final example, a screen dump of the right half of the screen is given in Fig. 8. It shows the question display window, an Emacs buffer displaying a text file, and an xv window displaying another coin.

\documentstyle[a4wide,german,kmc]{article}
\title{MEX-Fragebogen}
\begin{document}
\maketitle
\experts{%
  \person{Dr. Maue, Germanisches Nationalmuseum N"urnberg}{Maue}}
\interviewers{%
  \person{Joachim Rick, Universit"at Erlangen}{JR}
  \person{Joachim Klausner, Universit"at Erlangen}{JK}}
...
\X{Beschreibung von M"unzen}
\XX{Beispiel einer Beschreibung}
\qu{Nehmen wir an, ich bringe Ihnen diese M"unze und bitte Sie als
    M"unzexperte, die M"unze {\bf m"oglichst genau} zu beschreiben.\par
    Wie sieht eine solche Beschreibung von Ihnen aus?}
\ps{coin20.ps}{-}
...
\XX{Liste A}
\text{liste.a}{+}
...
\XX{Fotos}
\qu{Welchen Stellenwert haben fotografische Abbildungen f"ur die
    Beschreibung von M"unzen?}
\xv{coin9.gif}{-}
\xv{coin20.gif}{-}
...
\XX{Gemeinsame Merkmale}
\XXX{f"ur Zeitr"aume}
\qu{Haben M"unzen eines gewissen Zeitraums gemeinsame stilistische
    oder symbolische Merkmale?}
\oc{\ocitem{Ja}{}
    \ocitem{Nein}{f"ur geographische R"aume 2}}
...

Fig. 2. Examples from a source file of a KMC interview document.
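The commands in Fig. 2 are regular enough that browsing information and attachments can be pulled out with simple pattern matching. The following Python sketch shows the idea; it is not a LaTeX parser and not the method used by KMC. It only handles commands whose arguments contain no nested braces, it treats the number of X characters as the nesting depth, and it reads the `+` flag as "display automatically". The last two points are assumptions: the text only states that `-` means "on demand only".

```python
import re

# \ps{file}{flag}, \xv{file}{flag}, \text{file}{flag} as they appear in Fig. 2
ATTACHMENT = re.compile(r"\\(ps|xv|text)\{([^{}]*)\}\{([+-])\}")

# \X{...}, \XX{...}, \XXX{...}; depth = number of X characters (assumed)
HEADING = re.compile(r"\\(X{1,3})\{([^{}]*)\}")


def attachments(source):
    """Return (command, filename, on_demand) for each attachment command."""
    result = []
    for match in ATTACHMENT.finditer(source):
        command, filename, flag = match.groups()
        result.append((command, filename, flag == "-"))
    return result


def outline(source):
    """Return (depth, title) pairs for the sectioning commands, in order."""
    return [(len(m.group(1)), m.group(2)) for m in HEADING.finditer(source)]


if __name__ == "__main__":
    sample = r"""
    \XX{Liste A}
    \text{liste.a}{+}
    \XX{Fotos}
    \xv{coin9.gif}{-}
    """
    print(outline(sample))      # [(2, 'Liste A'), (2, 'Fotos')]
    print(attachments(sample))  # [('text', 'liste.a', False), ('xv', 'coin9.gif', True)]
```

Question texts such as the first `\qu{...}` in Fig. 2 contain nested braces (`{\bf ...}`), so extracting them would need brace counting rather than a single regular expression; that part is deliberately left out here.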

Fig. 3. Page 1 of a KMC interview document prior to the interview.

6.3 Evaluating an Interview
During the interview KMC modifies the interview document and attaches notes and audio files. Each note is automatically tagged with an acronym for the knowledge engineer who entered it, an acronym for the expert who provided the information, and the date. As an example, we include the second page of the printout generated after the interview (Fig. 9).⁸
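The automatic tagging of notes can be pictured with a few lines of Python. The tag layout below (`[engineer/expert date]` with an ISO date) is purely hypothetical, since the paper does not specify the format; the acronyms `JK` and `Maue` are the ones defined in the interview document of Fig. 2, and the date is an arbitrary example value.

```python
from datetime import date


def tag_note(text, engineer, expert, day):
    """Prefix a note with engineer acronym, expert acronym, and ISO date.

    The bracketed layout is an assumption for illustration only.
    """
    return f"[{engineer}/{expert} {day.isoformat()}] {text}"


if __name__ == "__main__":
    print(tag_note("Legend is only partly legible.", "JK", "Maue", date(1994, 3, 1)))
    # [JK/Maue 1994-03-01] Legend is only partly legible.
```

Keeping the tag in a fixed position makes it easy to filter notes later by expert, by engineer, or by interview date.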

⁸ In this example, the first page happens to be the same.

Fig. 4. Page 2 of a KMC interview document prior to the interview.

7 Experiences with KMC
During the test interviews we have performed so far, KMC has shown its usefulness in supporting knowledge elicitation interviews. There was no problem with the experts accepting the new technology. We also found that there is a greater need to prepare for an interview. Knowledge engineers feel less secure going unprepared into a KMC interview than into an interview without it, because KMC makes the lack of preparation so obvious (there is no document and everybody can see it, whereas without KMC one can always come up with some clever questions off the top of one's head). However, this is probably not a drawback at all, because it leads to better preparation and, as a consequence, to better structured and more focused interviews.

Fig. 5. A typical layout for a KMC interview screen.

In order to avoid the extra work of setting up all the equipment every time, a dedicated meeting room should be selected, where all the equipment is set up only once. The workstation could be used by other people, e.g. casual users, but should be available whenever an interview is planned.

The current version of KMC is still just a prototype and needs further improvement in various areas:
- A more complex KMC interview may consist of quite a large number of files (if a lot of graphics are included, for instance). We had interviews with around eighty audio files alone. Since there are at least two versions of every KMC interview (before and after the interview), and possibly many more (interviewing several experts, several steps of post-processing), it can become difficult to manage the files. An interface to a revision control system (e.g. RCS) should be provided.
- The syntax for KMC interview documents should be extended, e.g. by the definition of dynamically configurable questions and more ad-hoc support. Depending on how things proceed in the hypermedia world, switching to HTML might be advisable at some point in the future.
- At the moment, the preparation of KMC interview documents is supported solely by the standard Lucid Emacs facilities for editing LaTeX files. At the least, there should be an Emacs mode for this purpose. The situation will improve once KEA supports this.
- The user interface can be improved. We are in the process of implementing a Tcl/Tk version.
- The communication with the various external tools could possibly be improved. The Tcl/Tk version might help here, too.
- The LaTeX style for printing KMC documents could be improved.

Fig. 6. The KMC control panel.

7.1 Future Work

Besides the improvements already outlined above and the development of tools for preparing and evaluating interview documents in KEA, future work centers around three major threads:

Extending Information Capture in Studio-KMC. The KMC as described in this paper is used in a kind of studio atmosphere. Within the studio setting, additional methods for input capture can be made available, such as video cameras, data gloves, light pens, drawing tablets, etc. Also, presentation facilities could be improved by applying technology from virtual reality.

Getting KMC on the Road with Mobile-KMC. Knowledge engineers often have to do knowledge elicitation on-site, where they do not have all the studio equipment available. For these purposes, and for performing very simple and short interviews, a mobile version of KMC is planned. Presentation of information must be reduced to a notebook screen, and audio recording must be applied more carefully. It remains an open question exactly which part of the current functionality of KMC should be ported to the mobile station.

Enabling Group Work with Group-KMC. Finally, advances in networking technology have not yet been exploited either. The idea here is to avoid a meeting altogether and use computer networks for all the communication. While online remote knowledge elicitation, e.g. using video conferencing, might still be too big a challenge and not yet work reliably enough, the offline variant seems more practical. The knowledge engineer may send a multimedia document to the expert, who works through it when it best fits his schedule and sends the results back. However, this scenario requires the expert to have access to and knowledge of computers, and it may not be advisable to use it in early stages of knowledge elicitation.

Fig. 7. The KMC interview browser.

Fig. 8. An example for displaying text and graphics during a KMC interview.

Fig. 9. Page 2 of a KMC interview document after the interview. Page 1 is the same as in Fig. 3.

8 Conclusions
We have identified a lack of appropriate tools to support knowledge elicitation interviews. In order to maximize information transfer, support is needed to access and present information and to capture and segment information during interviews. New developments in hardware and software, especially multimedia technology and computer networks, now provide a better technological base on which to build such tools. The Knowledge Mining Center is such an easy-to-use interviewing tool. First experiences show good and promising results, but also indicate the necessity of further development.

References
1. Hendrik Eriksson: Meta-Tool Support for Knowledge Acquisition. Ph.D. Thesis, Linköping Studies in Science and Technology, Dissertations No. 244, Department of Computer and Information Science, Linköping University, Linköping, Sweden, 1991.
2. Joachim Klausner: Entwurf und Implementierung eines multimedialen Systems zur Unterstützung von Experteninterviews. Studienarbeit, IMMD-VIII, Universität Erlangen-Nürnberg, Erlangen, 1994.
3. Joachim Rick: Wissenerwerb und Wissensrepräsentation am Beispiel Münzen. Diplomarbeit, IMMD-VIII, Universität Erlangen-Nürnberg, Erlangen, 1993.
4. A. Carlisle Scott, Jan E. Clayton, Elizabeth L. Gibson: A practical guide to knowledge acquisition. Addison-Wesley, New York, 1991.
