Knut Hartmann, Stefan Schlechtweg, Ralf Helbing, Thomas Strothotte: "Knowledge-Supported Graphical Illustration of Texts". In: Proceedings of the International Working Conference on Advanced Visual Interfaces, AVI 2002, May 22-24, Trento, Italy, pp. 300-307, ACM Press, New York.
Knowledge-Supported Graphical Illustration of Texts K. Hartmann, S. Schlechtweg, R. Helbing and Th. Strothotte Department of Simulation and Graphics Otto-von-Guericke University of Magdeburg Universitätsplatz 2, D-39106 Magdeburg, Germany
{knut, stefans, helbing, tstr}@isg.cs.uni-magdeburg.de
ABSTRACT
We introduce a new method to automatically and dynamically illustrate arbitrary texts from a predefined application domain. We demonstrate this method with two experimental systems (Text Illustrator and Agi³le) which are designed to illustrate anatomy textbooks. Both systems exploit a symbolic representation of the content of structured geometric models. In addition, the approach taken by the Agi³le system is based on an ontology providing a formal representation of important concepts within the application domain as well as a thesaurus containing alternative linguistic and visual realizations for entities within the formal domain representation. The presented method is text-driven, i.e., an automated analysis of the morphologic, syntactic and semantic structures of noun phrases reveals the key concepts of a text portion to be illustrated. The specific relevance of entities within the formal representation is determined by a spreading activation approach. This allows us to derive important parameters for a non-photorealistic rendering process: the selection of suitable geometric models, camera positions and presentation variables for individual geometric objects. Part-whole relations are considered to assign visual representations to elements of the formal domain representation. Presentation variables for objects in the 3D rendering are chosen to reflect the estimated relevance of their denotation. As a result, expressive non-photorealistic illustrations which are focussed on the key concepts of individually selected text passages are generated automatically. Finally, we present methods to integrate user interaction within both media, the text and the computer-generated illustration, in order to adjust the presentation to individual information seeking goals.
Categories and Subject Descriptors H.5.1 [Multimedia Information Systems]; H.5.4 [Hypertext/Hypermedia]: Architectures, Theory; I.2.4 [Knowledge Representation Formalisms and Methods]: Semantic networks; I.2.7 [Natural Language Processing]: Text analysis; I.3.6 [Methodology and Techniques]: Interaction techniques
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Advanced Visual Interfaces (AVI 2002) May 22–24, 2002, Trento, Italy Copyright 2002 ACM X-XXXXX-XX-X/XX/XX ...$5.00.
General Terms Text Illustration
Keywords Text Analysis, Semantic Networks, Spreading Activation, Non-photorealistic Rendering, Image-Text Coherence
1.
INTRODUCTION
In many application domains, there is an increasing demand for on-line documents. As a recent trend, various publishers convert printed versions of widely accepted textbooks into multimedia counterparts. Thus, most online reference materials contain illustrations which tend to be little more than digitized versions of those appearing in the printed materials from which they stem. Consequently, these multimedia presentations do not exploit the potential of interactive systems. They would be far more attractive if they could be furnished with illustrations adapted to the current text portion. In this paper we provide advanced solutions to what we call the text illustration problem.
Before presenting our approach to solving the text illustration problem, we shall make some observations on the role of illustrations in documents containing textual and pictorial elements. Illustrations are considered by several authors to be subservient to text ([6], [1]). In many textbooks, important principles are explained textually, while illustrations are often designed to summarize these principles or to provide an example of their application.¹ Illustrations can also provide background information to concepts described textually.
Anatomy textbooks frequently describe complex spatial relations. Figure 1 shows a section from our text corpus describing the location of two muscles with respect to characteristic landmarks such as bone segments. Furthermore, for a limited number of objects, annotations in the illustration point to the denotation of technical terms used in the text. In addition, the figure caption describes the viewing direction. Empirical evaluation reveals that descriptions of spatial relations are comprehended more effectively if furnished with illustrations depicting them.

    THE MUSCULO-SKELETAL SYSTEM
    THE MUSCLES AND MOVEMENTS OF THE LOWER LIMB
    MUSCLES THAT MOVE THE ANKLE JOINT
    MUSCLES THAT EVERT THE FOOT
    muscles from the extensor column evert the foot without contributing to dorsiflexion (Fig. 20.10):
    • peroneus longus
    • peroneus brevis.
    Peroneus longus arises from the lateral condyle of the tibia and the proximal two-thirds of the lateral surface of the fibula. It is subcutaneous throughout its course in the leg towards the posterior surface of the lateral malleolus. Here it runs distally and medially to enter and cross the sole of the foot, where it lies deeply. Peroneus longus inserts into the medial cuneiform bone and the adjacent part of the first metatarsal, near the insertion of tibialis anterior.
    Peroneus brevis takes origin from the distal part of the lateral surface of the fibula. It runs distally, deep to peroneus longus and its tendon. Its tendon accompanies that of peroneus longus around the posterior aspect of the lateral malleolus, lying at first medial to it, then anterior, and finally lateral. Peroneus brevis inserts into the base of the fifth metatarsal.
    Actions. Peroneus longus and brevis evert the foot; they are assisted in this action by peroneus tertius.
    Nerve supply. Like the other muscles derived from the extensor column, peroneus longus and brevis are supplied by the common peroneal nerve; their supply is from spinal nerves L5 and S1.

    Fig. 20.10 The peronei: lateral view

Figure 1: Left: An anatomy text. Right: The illustration to which the anatomy text refers [8, p. 302, 304].
In this paper, we present two approaches exploiting these strategies to generate text illustrations automatically. To illustrate a text portion under consideration, these systems make three assumptions:
1. an illustration should reflect the contents of the text, i.e., objects being described in a text portion should also be portrayed in the illustration,
2. illustrations should reflect the user’s interests, or the system’s beliefs about the user’s interests,
3. the generated illustration as well as the text should be furnished with interaction points.
This paper is organized as follows. In Section 2 we introduce the approach taken in the Text Illustrator to solve the text illustration problem. While the Text Illustrator demonstrates an innovative and fast method to illustrate on-line texts dynamically and interactively, it also poses some problems that are discussed in Section 3. Subsequently, a new architecture to overcome these problems, the prototypical Agi³le system,² is proposed in Section 4, which also introduces the fundamental algorithmic framework of the Agi³le system. Finally, some examples and concluding remarks are given in Sections 5 and 6.
¹ This situation is typical for anatomy textbooks as a specific type of scientific texts. It should be noted that this attitude toward illustrations may be common among scientists and engineers nowadays. Nevertheless, this is not the only way to present information. Illustrations can, in principle, be used to convey key parts of a message, an accompanying text being relegated to the role of providing examples and an alternative explanation. This topic is, however, beyond the scope of this paper.
² Automatic Generation of Intelligent Interactive Interfaces through Language Engineering
2.
TEXT ILLUSTRATOR
A first solution to the text illustration problem was presented by Schlechtweg and Strothotte with their Text Illustrator system [9, 10]. A preselected geometric model and the text to be illustrated serve as input data to the system. Schlechtweg and Strothotte use the symbolic identifiers of objects contained in structured geometric models to determine co-referential text elements and geometric objects in a pattern matching approach. Hyperlinks encode these co-referential relations. Furthermore, the text structure is utilized to assign a “degree of importance” to each of the text parts depending on its position in the hierarchy of the document (chapter, section, etc.). To illustrate the currently visible text portion, the hyperlinks are extracted. Geometric objects with co-referring counterparts in the visible text portion are rendered so that they can be easily identified in the illustration. Figure 2 demonstrates how the Text Illustrator employs different rendering styles to convey the estimated degree of importance. Depending on the intended document type and its runtime restrictions (on-line vs. printed documents), the Text Illustrator selects appropriate illustration techniques. The left illustration in Figure 2 shows a screenshot of an on-line document with an illustration using transparency values to indicate the relevance of geometric objects. The right picture in Figure 2 illustrates this text portion in a printed document with more expressive visual cues, such as different line styles. The presented approach makes it possible to illustrate interactively selected text segments. While the user browses on-line documents, these illustrations present an overview of the document content (illustrative browsing).
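The pattern matching step described above can be sketched in a few lines. This is a minimal illustration, not the Text Illustrator's actual implementation: the `Hyperlink` record, the function names, and the convention that identifiers use hyphens where the text uses spaces are all assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperlink:
    object_id: str  # symbolic identifier of the geometric object
    start: int      # character offset where the matching term starts
    end: int        # character offset just past the matching term


def find_coreferences(text, object_ids):
    """Link every occurrence of an object's name in the text to that object."""
    links = []
    for object_id in object_ids:
        # Assumption: identifiers use hyphens where the text uses spaces.
        term = object_id.replace("-", " ")
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            links.append(Hyperlink(object_id, match.start(), match.end()))
    return sorted(links, key=lambda link: link.start)


def visible_objects(links, view_start, view_end):
    """Objects with a co-referring term inside the visible text range."""
    return {
        link.object_id
        for link in links
        if link.start >= view_start and link.end <= view_end
    }
```

Objects returned by `visible_objects` would be the ones to emphasize for the currently visible text portion. The sketch also makes the limitation discussed in the next section concrete: an object is found only if its identifier occurs literally in the text.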
Figure 2: A screenshot of the Text Illustrator.
3.
THE TEXT ILLUSTRATION PROBLEM REVISITED
The pattern matching approach as used by the Text Illustrator is not without problems. Pattern matching relies on the coincidence of the terminology in the text and the symbolic representation of the geometric model to determine co-referring geometric objects and text elements. This technique can only be applied in domains with highly standardized terminology, such as medicine, because exactly the terms used in the 3D model must occur within the text. To increase the flexibility of an information system, multilingual document versions are often incorporated into on-line documents, which places additional requirements on a pattern matching approach. In the following, we will discuss some of these problems in detail.
3.1
Reference Problem
Terminology. There is a long tradition of standardizing the terminology in important medical disciplines, such as anatomy, histology and embryology, resulting in various nomenclatures. However, several competing nomenclatures are used in parallel. Furthermore, these standards evolved over time and continue to do so, yielding terminological variants. Moreover, an inconsistent and often contradictory terminology is used in various medical disciplines. The terms defined in medical nomenclatures are of Latin or Greek origin. Hence, colloquial terms in the mother tongue are frequently used to denote important terms. Colloquial and standardized terms are inflected, truncated and used in coordinated structures (ligamenta calcaneo- et talonaviculare vs. ligamentum calcaneonaviculare and ligamentum talonaviculare). Consequently, co-referring terms may have different morpho-syntactical structures. This problem could be solved by using different sets of patterns and matching them with a normalized version (i.e., a translation of competing standards into a real standard).
Unique reference. Anatomy names do not necessarily refer to a unique object. Frequently, the same term is used to denote different anatomy objects. For example, opposite bone surfaces of articulating bones are often denoted by the same term. The same phenomenon occurs because anatomy objects are often named according to the pattern classification + direction (e.g., facies articularis superius). Thus, the denotation of these terms can only be resolved by the context.
3.2
Design Problem
In order to create meaningful illustrations, i.e., illustrations which serve the intention of the associated text part, the designer has to select graphical objects and their presentation style. Scholz [12] investigated visualizations in technical documentation, concluding that illustrations do not merely reproduce the shape of objects, but explain the “coherence”, i.e., the interrelations of the depicted objects. That observation is not restricted to visualizations in technical documentation. The Text Illustrator emphasizes geometric objects whenever there are co-referring elements in a visible or user-selected text segment. This may result in unnatural illustrations. To improve the computer-generated illustrations, we analyzed which objects are mentioned in the text and depicted in the associated illustration. The illustration in Figure 1, for example, does not portray some of the objects to which the text refers (e.g., musculus peroneus tertius). Conversely, the appearance of unmentioned objects in the illustration has to be explained (e.g., the femur and the patella in the upper region and the tarsal bones and some ligaments in the lower region). The illustration in Figure 1 is focussed on two objects, the peronei muscles, i.e., musculus peroneus longus and brevis. This illustration should enable the viewer to recognize the peronei muscles by providing contextual information, which may not be completely derived from the text portion to be illustrated. There are visual cues indicating that both muscles are most dominant: they are displayed in red with many details (dotted lines indicating the direction of muscle fibres), whereas the other objects are displayed in gray or are omitted. The annotation and the reference in the figure caption are further indications of the dominance of both muscles.
3.3
Scope Problem and Representation Issues
In the Text Illustrator, the hierarchical structure of the geometric model is the sole basis for determining interrelationships between objects in the text. While this has the advantage of being simple and suffices to a certain extent for many situations, it has several limitations:
• The interrelationships among the objects recorded within the geometric model are motivated by the objects’ physical structure. Hence, functional dependencies of objects are normally not modeled. However, texts generally discuss the function of objects, and often these functional relationships should be reflected in the illustration. Without additional knowledge surpassing the pure geometric data, these functional aspects cannot be ascertained adequately.
• Geometric models are designed to mimic the appearance of entities in the real world. They contain a fixed collection of objects that are modeled irrespective of their function or importance in the text to be illustrated.
• The symbolic information associated with a geometric model is supplied by experts in geometric modeling rather than domain experts and tends to be of an ad hoc nature at best.
The approach introduced in this paper is aimed at long texts covering a broad range of topics. Consequently, a single 3D model cannot suffice to illustrate the different text topics (the scope problem). The collection of anatomy models which we use for our experiments contains 3D models representing different regions of the body, organs in varying levels of detail and with different thematic foci. An independent representation is therefore required to describe the content of the 3D model(s). This separate description of object properties and their interrelations enables the text illustration system to select an appropriate 3D model from the library. This independent representation can be used to establish co-references if the text refers to substructures of objects which are not separated in the 3D model.³ Unlike the geometric models, it is not restricted to a single hierarchy according to the included-in relation. Furthermore, it makes it possible to model immaterial objects, such as holes, which may even have substructures, without modeling these areas as (fully transparent) volumes.
4.
KNOWLEDGE-SUPPORTED ILLUSTRATION
The text illustration approach holds great potential for creating adaptive illustrations for texts created by domain experts, but some issues still remain to be solved. The remainder of this paper presents a new architecture that avoids these problems based on the lessons learned from the Text Illustrator.
4.1
Architecture
The knowledge-supported text illustrator embraces a language- and media-independent formal representation of entities of the application domain. Another important resource contains alternative language- and media-specific realizations of entities of the formal representation. As shown in Figure 3, the main task in this new scenario is to bridge two different levels: the document level, containing entities in the text to be illustrated or object selections in the graphics, and the conceptual level, containing entities in the formal domain representation (knowledge base). Co-referring entities at the document level share a denotation at the conceptual level. A linguistic analysis reveals the morpho-syntactic structure of noun phrases contained in text elements at the document level. The normalized morpho-syntactic structure is compared with morpho-syntactic structures contained in a phrasal lexicon (see Figure 3). These language-specific linguistic annotations enable us to assign noun phrases a denotation, i.e., entities within the conceptual level. To determine the focus of the text, associations between entities at the conceptual level are exploited.
The system architecture is shown in Figure 3. The anatomy tutor system contains large amounts of text and a library of 3D models. First, the user can interactively select a text portion or a 3D model. The overall goal of the Agi³le system is to coordinate the content of both media automatically. In the partial interpretation, references from elements of the document level (noun phrases or geometric objects) to elements within the conceptual level (knowledge base) are extracted. This process comprises a normalization of the morpho-syntactic structures and exploits linguistic and visual annotations. Furthermore, according to the document structure, initial degrees of dominance are assigned to entities of the conceptual level. These initial dominance values are propagated via conceptual associations between entities of the knowledge base.⁴ Starting from the initial dominance, other objects (which may not even be mentioned in the text) can be found relevant based on their interrelations to dominant objects and treated appropriately in the illustration. We refer to this process as generalization. Graphical emphasis techniques can be tailored to the strength of the association in order to create customized illustrations. This strategy presents a new approach to tackle the design problem (see Section 3.2).
Once dominant entities in the conceptual level and their interrelations (the focus structure) are determined, the numeric dominance values must be conveyed in the target media, i.e., we have to translate our results from the conceptual level to the document level. The system’s interaction response depends on the medium in which the user interaction was carried out. After an interaction with the text, the illustration must be updated, and vice versa. These new text portions or generated illustrations are subject to further user interaction (user interaction loop).
³ This usually occurs when the geometric model is not as detailed as the text or its internal structure is not accurate enough. If, for example, there is only one object given for a hand, it is impossible to selectively emphasize the thumb even if it is clearly visible.
4.2
Knowledge Representation
On the conceptual level, a knowledge base contains a formal representation of the relevant application domain’s knowledge. Concepts represent sets with shared properties or attributes, whereas instances represent the attributes of atomic entities (i.e., one-element sets). These attributes are represented with relations between instances or concepts. The knowledge base was created by manually analyzing several anatomy textbooks, anatomy atlases, medical dictionaries and lexica. This analysis reveals several important concepts, their hierarchical classification and the instance attribute values forming a complex semantic network. In addition, the manual analysis reveals alternative Latin, English and German linguistic realizations. Furthermore, the denotation of the objects in several geometric models is represented as visual annotations.
Agi³le contains a hierarchical representation of basic anatomy concepts such as bones, muscles, articulations, tendons as well as their parts and regions. In the current version it covers the objects of the pelvic girdle and the lower limb, i.e., about 50 basic anatomic concepts, 70 relations and over 900 instances. Agi³le could combine the language-independent formal knowledge representation as well as language-specific linguistic annotations in XML topic maps [2, 7], which are then transformed into LISP code using a description logic inference machine (LOOM [5]).
[Figure 3, diagram labels: user interaction; illustrated text; text analysis; illustration; interaction analysis; phrasal lexicon; document level; linguistic annotations; geometric models; partial analysis; visual annotations; knowledge base objects; initial dominance; knowledge base; generalization; conceptual level; relevant objects and relations (focus structure); text illustration; text selection; text generation; geometric model; camera position; presentation variables]
Figure 3: The Agi³le system architecture.
⁴ This process is comparable to global illumination computations used in computer graphics where incoming energy (light) is distributed among surfaces.
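To make the distinction between concepts, instances, relations and annotations more tangible, the following is a minimal Python sketch of such a knowledge base. It is not the XML topic map or LOOM encoding used by the system. The class and method names, the concepts `Anatomic-Object` and `Bone`, the model name `lower-leg-model` and the object id `object-17` are hypothetical; only `Musculus`, `musculus-peronaeus-longus`, `fibula` and `has-Origin` are taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    concept: str
    relations: list = field(default_factory=list)      # (relation, target) pairs
    labels: dict = field(default_factory=dict)         # language -> list of phrases
    model_objects: dict = field(default_factory=dict)  # model name -> 3D object id


class KnowledgeBase:
    def __init__(self):
        self.parent = {}     # concept -> parent concept (None for a root)
        self.instances = {}  # instance name -> Instance

    def add_concept(self, name, parent=None):
        self.parent[name] = parent

    def add_instance(self, instance):
        self.instances[instance.name] = instance

    def is_a(self, concept, ancestor):
        """Walk up the concept hierarchy until the ancestor or a root is reached."""
        while concept is not None:
            if concept == ancestor:
                return True
            concept = self.parent.get(concept)
        return False

    def instances_of(self, concept):
        return sorted(i.name for i in self.instances.values()
                      if self.is_a(i.concept, concept))

    def lookup(self, phrase, language):
        """Linguistic annotation lookup: phrase in a given language -> instances."""
        phrase = phrase.lower()
        return sorted(
            i.name for i in self.instances.values()
            if phrase in (label.lower() for label in i.labels.get(language, []))
        )


kb = KnowledgeBase()
kb.add_concept("Anatomic-Object")                 # hypothetical root concept
kb.add_concept("Musculus", parent="Anatomic-Object")
kb.add_concept("Bone", parent="Anatomic-Object")  # hypothetical concept name
kb.add_instance(Instance("fibula", "Bone",
                         labels={"la": ["fibula"], "en": ["fibula"]}))
kb.add_instance(Instance(
    "musculus-peronaeus-longus", "Musculus",
    relations=[("has-Origin", "fibula")],
    labels={"la": ["musculus peronaeus longus"], "en": ["peroneus longus"]},
    model_objects={"lower-leg-model": "object-17"},  # hypothetical visual annotation
))
```

Keeping the linguistic labels and the visual annotations separate from the relations mirrors the split between the language-independent representation and its language- and media-specific realizations.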
4.3
Partial Interpretation
The analysis of text segments as presented in Figure 4 comprises several steps: First, the stream of characters is segmented into basic elements (tokens). The morphological analysis reveals the classification, morpho-syntactical features and stems of lexical tokens. A syntactical analysis extracts noun phrases and their morpho-syntactic structure.⁵ Subsequently, various methods are applied to normalize truncated and coordinated structures. In the current implementation, we incorporate a probabilistic and lexicon-based part-of-speech tagger⁶ to reduce lexical tokens to their stems and a bottom-up chart parser which extracts the morpho-syntactic structure of noun phrases. In a subsequent step, normalized syntactic structures of noun phrases are matched with lexical annotations in the phrasal lexicon in order to determine entities on the conceptual level (i.e., instances or concepts) to which linguistic structures actually refer.
⁵ The syntactic analysis normally reveals alternative linguistic structures.
⁶ We employ the probabilistic tagger TreeTagger [11] for English and German text combined with two lexicon-based morphologic tools for German (Morphix [4]) and Latin (a morphological analyzer implemented by the first author).
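The paper does not spell out how truncated and coordinated structures are normalized. As one possible reading, the sketch below expands the example from Section 3.1 (ligamenta calcaneo- et talonaviculare) by completing the truncated element with a suffix of its coordinated partner and accepting the completion only if the resulting word is known to the lexicon. The lexicon entries, the stem table, the instance identifiers and all function names are hypothetical, and the sketch only handles a head noun followed by coordinated one-word modifiers.

```python
# Hypothetical phrasal lexicon: normalized phrase -> knowledge base instance.
PHRASAL_LEXICON = {
    "ligamentum calcaneonaviculare": "ligamentum-calcaneonaviculare",
    "ligamentum talonaviculare": "ligamentum-talonaviculare",
}
# Hypothetical stem table standing in for the morphological analysis.
STEMS = {"ligamenta": "ligamentum"}
KNOWN_WORDS = {word for phrase in PHRASAL_LEXICON for word in phrase.split()}
CONJUNCTIONS = {"et", "and", "und"}


def complete_truncated(prefix, full_word):
    """Complete a truncated element such as 'calcaneo-' with a suffix of its partner."""
    for i in range(1, len(full_word)):
        candidate = prefix + full_word[i:]
        if candidate in KNOWN_WORDS:
            return candidate
    return None


def normalize(phrase):
    """Expand 'head mod1- et mod2' into ['head mod1...', 'head mod2']."""
    tokens = phrase.lower().split()
    if not tokens:
        return []
    head = STEMS.get(tokens[0], tokens[0])  # reduce the head noun to its stem
    modifiers = tokens[1:]
    expanded = []
    for i, token in enumerate(modifiers):
        if token in CONJUNCTIONS:
            continue
        if token.endswith("-"):
            # The truncated element borrows its ending from the next full modifier.
            following = next(
                (t for t in modifiers[i + 1:] if t not in CONJUNCTIONS), None)
            completed = complete_truncated(token[:-1], following) if following else None
            if completed:
                expanded.append(completed)
        else:
            expanded.append(token)
    return [f"{head} {modifier}" for modifier in expanded]


def resolve(phrase):
    """Map a noun phrase to knowledge base instances via the phrasal lexicon."""
    return [PHRASAL_LEXICON[p] for p in normalize(phrase) if p in PHRASAL_LEXICON]
```

With these entries, `resolve("ligamenta calcaneo- et talonaviculare")` yields both ligament instances, so the coordinated plural form and the two singular forms end up with the same denotation.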
4.4
Generalization
Our main idea, the distribution of dominance via associations in the conceptual level, is formalized in the dominance propagation algorithm. First, initial dominance values are assigned to entities of the knowledge base within the main focus. Second, this dominance is spread to conceptually associated entities, which can in turn propagate a portion of their dominance. This is achieved by recursively applying the propagation function to all initial dominance values until a threshold terminates the process. Figure 5 presents the key concepts of the dominance propagation algorithm. The semantic network is enhanced to represent dominance values of domain objects (nodes) and the flux of dominance via the relations (edges) between domain objects. We define the focus structure to be a subgraph of this enhanced semantic network marking relevant concepts, instances and attributes of the presentation. An analogous spreading activation approach was developed in cognitive science by Collins and Loftus [3].
After presenting the basic idea of the dominance distribution algorithm, we discuss the initialization and the recursive application in detail. Initial dominance values are assigned to entities on the conceptual level depending on their frequency and their position in the document structure. In the current implementation, we assign to references in headings and in normal text an initial dominance value of 100, whereas references mentioned in emphasized text parts receive an initial dominance value of 500. Subsequently, a generalization takes place via a spread of dominance over relations of a semantic network. The initial dominance values are passed on to associated entities. To estimate the parameters for the dominance distribution algorithm (dominance consumption, distribution and loss, as well as the relation weight; see also Figure 5) three heuristics are used:
1. activated nodes distribute almost all of their input dominance,
2. nodes that have already consumed dominance distribute a larger portion of their input dominance than nodes that have not consumed any dominance before,
3. the full initial dominance is distributed via part-whole relations, regardless of how much dominance is consumed by the node.
The first heuristic establishes interrelations between depicted objects and is intended to tackle the design problem (see Section 3.2) by focusing the illustration on associated objects rather than depicting only the objects mentioned in the text. According to the second heuristic, unvisited nodes selfishly consume dominance during their first activation, but generously distribute dominance in subsequent activations. Consequently, the result of the algorithm is sensitive to the order of processing. Therefore, in each iteration, we use a breadth-first processing of open dominance distribution tasks. The idea behind the third heuristic is that the results of the dominance distribution algorithm should be (almost) independent of the granularity of the semantic description. In other words, the algorithm should produce similar results irrespective of whether the knowledge engineer decides to model a specific relation as between objects or their parts, and without regard to the name of the relation.
[Figure 4, diagram labels: XML text; segmentation; token stream; expand abbreviations, normalize numerical values; token stream; morphologic analysis; token stream; syntactic analysis; syntactic structure; phrasal lexicon; semantic structure; enriched XML text; text illustration]
Figure 4: Linguistic analysis pipeline.
[Figure 5, diagram labels: dominance propagation; activated node; relation weight (resistor / amplifier); activating dominance (flux); dominance consumption; accumulated dominance; dominance loss; dominance distribution]
Figure 5: The dominance propagation: key concepts.
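The following Python sketch shows one way the dominance propagation could be organized around the three heuristics, the breadth-first task queue and the threshold. It is an illustration under stated assumptions, not the Agi³le implementation: the parameter values (`first_share`, `later_share`, `loss`, `threshold`), the even split of outgoing dominance among non-part-whole relations, the relation name `is-Part-of` and the `max_tasks` safety cap are all choices made for this example. The loss is applied on every hop, including part-whole relations, so that the spread dies out as long as relation weights do not exceed 1.

```python
from collections import deque

# Assumed name for part-whole relations; the paper does not list the actual names.
PART_WHOLE = frozenset({"is-Part-of"})


def propagate_dominance(graph, initial, first_share=0.3, later_share=0.1,
                        loss=0.1, threshold=1.0, max_tasks=10_000):
    """Spread dominance over a semantic network.

    graph:   node -> list of (relation, neighbour, weight) edges
    initial: node -> initial dominance (e.g. 100 per reference in the text)
    Returns the dominance accumulated by every activated node.
    """
    dominance = {}
    tasks = deque(initial.items())  # FIFO queue: breadth-first processing
    processed = 0
    while tasks and processed < max_tasks:
        node, incoming = tasks.popleft()
        processed += 1
        if incoming < threshold:
            continue  # this distribution task has fallen below the threshold
        # Heuristic 2: a node consumes more on its first activation than later.
        share = later_share if node in dominance else first_share
        consumed = incoming * share
        dominance[node] = dominance.get(node, 0.0) + consumed
        edges = graph.get(node, [])
        others = [edge for edge in edges if edge[0] not in PART_WHOLE]
        for relation, neighbour, weight in edges:
            if relation in PART_WHOLE:
                # Heuristic 3: part-whole relations pass on the full input,
                # regardless of how much the node itself consumed.
                flux = incoming * (1.0 - loss) * weight
            else:
                # Heuristic 1: the remainder is distributed to associated nodes.
                flux = (incoming - consumed) * (1.0 - loss) * weight / len(others)
            tasks.append((neighbour, flux))
    return dominance
```

For example, starting with an initial dominance of 100 on `musculus-peronaeus-longus` in a small network containing its origin and insertion, the bones receive dominance even though only the muscle was activated, which is the generalization effect the section describes.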
4.5
Media-Specific Realization
In a final step, the dominance values and the visual annotations make it possible to select an appropriate geometric model. The focus of a computer-generated illustration depends on the scope and granularity of the underlying geometric representation. There may be a variety of geometric models which can be used for an illustration. Which one is most appropriate depends on what the author wants to express, or, even more interesting, on what the author wishes to reduce in an illustration. The sum of the dominance values which could be illustrated with a geometric model is used as a measure to select an adequate geometric representation. The dominance values and the visual annotations are also exploited to determine the rendering parameters of geometric objects. First, those objects in the 3D model whose denotations received some dominance during the dominance propagation are determined. Second, these 3D objects are ordered by increasing dominance values. Finally, these dominance values are grouped into clusters, in order to estimate groups of 3D objects displayed with identical rendering parameters.
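The model selection measure and the grouping step can be sketched as follows. The selection follows the description above (sum of the dominance values a model can depict). The clustering method is not specified in the paper, so the gap-based grouping, the `max_gap` parameter and all function and variable names are assumptions for illustration.

```python
def select_model(models, dominance):
    """Pick the model whose depicted entities carry the largest dominance sum.

    models:    model name -> {3D object id -> knowledge base entity}
               (the visual annotations)
    dominance: knowledge base entity -> dominance value
    """
    def coverage(name):
        depicted = set(models[name].values())
        return sum(dominance.get(entity, 0.0) for entity in depicted)

    return max(models, key=coverage)


def cluster_objects(model, dominance, max_gap=10.0):
    """Group 3D objects whose dominance values lie close together.

    Objects without dominance are dropped, the rest are ordered by increasing
    dominance, and a new cluster starts whenever the gap to the previous value
    exceeds max_gap (an assumed, simple 1D clustering rule).
    """
    scored = sorted(
        (dominance[entity], obj)
        for obj, entity in model.items()
        if dominance.get(entity, 0.0) > 0.0
    )
    clusters = []
    for value, obj in scored:
        if clusters and value - clusters[-1][-1][0] <= max_gap:
            clusters[-1].append((value, obj))
        else:
            clusters.append([(value, obj)])
    return [[obj for _, obj in cluster] for cluster in clusters]
```

Each resulting cluster would then share one set of rendering parameters, with later clusters (higher dominance) receiving stronger emphasis.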
5.
A CASE STUDY
In this section, we will analyze the text given in Figure 1 and present an illustration based on the results of the dominance distribution algorithm. First, we present a fragment of the semantic network which formalizes relevant domain knowledge about muscles. For muscles, information on their origin, insertion and nerve supply is crucial and has to be provided. The text presented in Figure 1 contains a typical description of the relevant attributes of musculus peronaeus longus in anatomy textbooks. The manual analysis of another textbook reveals the LOOM representation presented in Figure 6, which slightly differs in the scope and the granularity of description.

    (tell (:about musculus-peronaeus-longus Musculus
      (is-Component-of lower-leg-extensor-muscles)
      (has-Origin fibula)
      (has-Origin tibia-condylus-lateralis)
      (is-Origin-of musculus-plantaris)
      (has-Insertion basis-os-metatarsalis-I)
      (has-Insertion os-cuneiforme-medialis)
      (has-Nerve-supply nervus-peronaeus-superficialis)))

Figure 6: LOOM representation of peronaeus longus
As mentioned in the previous section, the third heuristic to determine the parameters of the dominance distribution algorithm instructs activated parts to transfer the full amount of initial dominance to their neighbours. Thus, irrespective of whether the knowledge engineer decides to model the basis of the first metatarsal or the bone itself to be the insertion of the peroneus longus muscle, an activation of the insertion would yield an identical activation of the muscle. The analysis of entities on the document level reveals their denotation on the conceptual level. The explicit references to entities on the conceptual level extracted from the text in Figure 1 are listed in Figure 7. Note that incorrect reference assignments due to the partial syntactic analysis may be corrected within the dominance distribution algorithm. The syntactic analysis of the phrase at the end of the second paragraph in Figure 1, the adjacent part of the first metatarsal, near the insertion of tibialis anterior, revealed two noun phrases: the first metatarsal, which refers to os-metatarsalis-I, and the insertion of tibialis anterior, for which the system could only establish the reference for the embedded reference musculus-tibialis-anterior. However, the whole phrase refers to the location on the first metatarsal bone where this muscle inserts.
In the discussion of the heuristics to determine the parameters of the dominance propagation algorithm we stress that activated nodes distribute almost all their initial dominance to associated nodes. Thus, there is a good chance to activate the correct reference. In fact, despite a wrong reference assignment, the correct reference basis-metatarsale-I will be activated over the inverse relation of has-Insertion. Furthermore, all related entities on the conceptual level will also be activated, providing background for the focussed object. Multiple activations increase the dominance. Since a discourse normally has a clear subject matter, this multiple activation should increase the dominance of the most relevant domain objects. In the example network, the metatarsal bones (ossametatarsale) are activated via an initial activation of the first metatarsal and via the basis of the fifth metatarsal. The dominance propagation algorithm terminates when all open dominance distribution tasks fall below a threshold value.
In our first experiments, some observations can be made: all objects to which the text refers are assumed to be most dominant, and the most frequently mentioned objects (musculus peronaeus longus and brevis, fibula) received the highest dominance values. The estimated dominance values and the visual annotations are used to measure the ability of geometric models to convey these dominance values. The consideration of conceptual associations in the dominance distribution algorithm enables a flexible determination of co-referring objects. In our example text, the ankle joint (i.e., articulatio-talocruralis) appears in the heading, and thus is assumed to be important. On the other hand, joints are not modeled in the available 3D models. Nevertheless, as the most important function of joints is to connect bones, and this information can be inferred from the knowledge base, this reference activates the connected bones. The resulting illustration is shown in Figure 8.

    instance                          occ.   style
    articulatio-talocruralis          1      head
    musculus-peronaeus-longus         7      emphasized
    musculus-peronaeus-brevis         5      emphasized
    tibia-malleolus-lateralis         2
    tibia                             1
    fibula                            2
    musculus-peronaeus-tertius        1
    musculus-tibialis-anterior        1
    os-metatarsalis-I                 1
    os-metatarsalis-V                 1
    os-cuneiforme-medialis            1
    nervus-peronaeus-superficialis    1

Figure 7: Explicit references to concepts of the formal domain representation in Figure 1.
Figure 8: Illustration created by Agi³le with rendering parameters derived from dominance values.
In our implementation, the dominance values affect the rendering algorithm and the presentation variables of the geometric objects in several ways:
• Dominance values are mapped to opacity values, making less dominant objects transparent.
• When rendering lines, dominance values determine the percentage of lines that are drawn, starting with lines at the sharpest edges. Silhouette lines (the toes in Figure 8) are not affected.
• Graphics objects are sorted by increasing dominance, so that dominant objects are drawn on top of less dominant objects.
• When dominance values reach a certain threshold, lines are drawn even if they would normally be hidden by other objects. This clearly reveals the full extent of an object even if its colors have insufficient contrast with surrounding objects.

The choice and combination of presentation variables and their mapping to dominance values depend, of course, on the target rendering system and on the available geometric models. If, for example, our anatomic models and our rendering system had supported texture mapping, the texture intensity would have provided another degree of freedom, making dominant objects appear more detailed than the rest.
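The mapping from dominance values to presentation variables listed above can be illustrated with a short Python sketch. The linear mapping, the normalization by the largest dominance value, the threshold of 0.8, and all names (`Presentation`, `presentation_variables`) are assumptions made for this example; the paper does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    name: str
    opacity: float            # 0.0 = fully transparent, 1.0 = opaque
    line_fraction: float      # share of non-silhouette lines that are drawn
    show_hidden_lines: bool   # draw lines even where other objects occlude them


def presentation_variables(dominance, hidden_line_threshold=0.8):
    """Derive per-object presentation variables from dominance values.

    Returns the objects sorted by increasing dominance, so that drawing them
    in list order puts dominant objects on top of less dominant ones.
    The linear mapping and the threshold value are assumptions.
    """
    if not dominance:
        return []
    top = max(dominance.values())
    scale = top if top > 0 else 1.0
    result = []
    for name, value in sorted(dominance.items(), key=lambda item: item[1]):
        d = value / scale  # normalized dominance in [0, 1]
        result.append(
            Presentation(
                name=name,
                opacity=d,
                line_fraction=d,  # silhouette lines would always be drawn
                show_hidden_lines=d >= hidden_line_threshold,
            )
        )
    return result


# Hypothetical dominance values for three objects of a foot model.
SCENE = presentation_variables({"fibula": 0.9, "tibia": 0.3, "talus": 0.6})
for item in SCENE:
    print(item)
```

A renderer would then draw the objects in list order and select, for each object, the sharpest edges until `line_fraction` of its lines has been drawn.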
6. CONCLUSIONS
In this paper we have presented a knowledge-supported method to relate entities on the document level and the conceptual level in order to determine co-referring textual and geometric objects. We use the term knowledge-supported instead of the more common term knowledge-based because the Agi3 le system improves a non-knowledge-based approach. Automatically estimated dominance values enable us to generate (non-photorealistic) illustrations to accompany the text. This approach has several advantages over previous systems:

Terminology. Terminological and linguistic variants do not prevent the system from determining the correct reference. The morpho-syntactic analysis of noun phrases takes care of variations in keywords pertaining to declensions, plurals, synonyms, etc. The approach can also be applied in a multilingual framework. Even though the system may fail to determine the correct denotation of some terms, the referred objects often receive some activation as a side effect of the dominance propagation.

Computing interrelationships. Associations between entities on the conceptual level are considered to determine dominant objects. Interrelationships other than included-in relations can be taken into account, whereas previous approaches are restricted to the analysis of the physical hierarchy.

Subjects of ongoing research are the knowledge-based computation of more elaborate interaction possibilities, which should consider the visible objects in the graphics and those mentioned in the visible text portion; the automatic generation of figure captions; and the use of other non-photorealistic rendering techniques in the framework of the anatomy tutor system.
Acknowledgments
The authors wish to thank Antonio Krüger (DFKI Saarbrücken) for many discussions on the subjects presented in this paper. The first author appreciates the constant support of the Agi3 le project from Prof. D. Rösner and the members of the knowledge-based systems and document processing group.