A Strategic Level for Digital Libraries

A Strategic Level for Digital Libraries Ling Feng

Manfred A. Jeusfeld

Jeroen Hoppenbrouwers

Infolab, Tilburg University, PO Box 90153, 5000 LE Tilburg, The Netherlands

fling,

jeusfeld, hoppie g

@kub.nl

Abstract Digital libraries (DLs) are a resource for answering complex questions. Up to now, such systems mainly support keyword-based searching and browsing. The mapping from a research question to keywords and the assessment whether an article is relevant for a research question is completely with the user. In this paper, we present a two-layered digital library model. The aim is to enhance current DLs to support dierent levels of human cognitive acts, thus enabling new kinds of knowledge exchange among library users. The low layer of the model, namely, the tactical cognition support layer, intends to provide users with requested relevant documents, as what searching and browsing do. The upper layer of the model, namely, the strategic cognition support layer, not only provides users with relevant documents but also directly and intelligently answers users' cognitive questions. On the basis of the proposed model, we divide the DL information space into two subspaces, i.e., a knowledge subspace and a document subspace, where documents in the document subspace serves as the justi cation for the corresponding knowledge in the knowledge subspace. Detailed description of the knowledge subspace, as well as query facilities against the enhanced DLs for users' knowledge sharing and exchange, are particularly discussed.

1 Introduction In todays' networked environment, people demand new evolutionary technology for information retrieval and intellectual exchange. DLs serve as an increasingly important channel to a vast array of information sources and services. They provide new opportunities to assemble, organize and access large volumes of information from multiple repositories, while making distributed heterogeneous resources spread across the network appear to be a single uniform federated source [SC99]. Under the assistance of DL systems, users can move from source to source, seeking and linking information automatically or semi-automatically to solve their various information problems. Traditionally, when people retrieve information, their activities are classi ed into two broad categories: searching and browsing. Searching implies that the user knows exactly what to look for, while browsing should assist users navigating among correlated searchable terms to look for something new or interesting. Currently, most of the major work on DL systems focus on supporting these two kinds of activities. However, with more DLs built up to handle users' information requests, we ask questions: Are the existing information technologies powerful enough to facilitate DL users' problem solving? Or what technologies do we still need to help users better do their work? Compared to traditional physical libraries, are there any important features missing from the current DL systems? To answer these questions, let's rst look at a scenario on the use of a DL system. 1

Kooper works in a city plan & management oce. He is investigating the precaution policy against ood this year. If there will be a ood in the coming summer, necessary actions (e.g., strengthing the embankment, resource allocation, etc.) must be taken now to prevent the city from sustaining losses. According to Kooper's previous experience, it seems that \A wet winter might cause ood in summer." To con rm this pre-conceived hypothesis, he logins on a DL system to request information talking about the reasons of ood. The DL system returns a number of articles. He browses through the returned article list and selects three articles that look most relevant. All of the three articles mention that \A precursor to the ooding in summer is a wet winter." To assure this information is also valid for the area where Kooper lives, he accesses the DL again to ask for documents reporting river ood in the city before. From the articles returned, he gets to know the latest 3 severe ood that happened in 1986, 1995, 1997, respectively, in the city. He then navigates to the meteorological repository of the DL, and accesses the weather information of the city during the winter time of these three years. He notes a tight correlation between wet winter and summer ood. Based on the information obtained from the DL, Kooper is pretty sure now about his prior ood prediction assumption for the city. He logs o the system. Having experienced a very wet winter this year, Kooper decides to draw up a city ood precaution plan immediately.

This scenario shows us a few shortcomings that exist in current DLs. Below we outline some of the problems and propose our work to solve them. 1) Inadequate strategic level cognition support. The aim of DL systems is to empower users to nd useful information to solve their problems. When a person goes into a library to look for something, s/he usually has certain purposes in mind. For example, s/he may want to read a speci c article written by a certain author. In this situation, the target of the user is precise and clear. Sometimes, the user wishes to explore the available resources rst before exploiting them. This exploration may be targeted at re ning a prior understanding of a certain information context, or formulating a further concrete requirement for speci c documents. Most eorts of the current DL systems aim to support these two kinds of users' activities, namely, searching and browsing. However, besides simply entertaining users with documents as what searching and browsing do, DL systems should also consider supporting human's strategic level cognitive work which can directly enable correct actions and problem solving. Typically, users have already some prior domain-speci c knowledge or pre-conceived hypotheses. They may expect the library systems to con rm/deny their existing concepts, or to check whether there are some exceptional/contradictional evidences against the pre-existing notions, or to provide some predictive information so that users can take eective actions. Under this circumstance, the users' information needs are not only for relevant documents, but also for intelligent answers to their questions together with a series of supporting literatures for justi cation and explanation, as illustrated in the above scenario. To distinguish this kind of information needs from the traditional searching and browsing requests, we categorize traditional users' information searching and browsing activities into tactical level cognition act and the latter into strategic level cognition act. As [CH98] says, \The nature of an information system is to provide informational support to people as they carry out their intentional tasks." As one of the most complex and advanced forms of information 2

systems, DL system designers must have a comprehensive understanding of users' information needs and their purposes for using DLs. In the above example, if the DL system could automate user's rational exploration from the knowledge space, consisting of propositions and assertions, to the corresponding justi cation/explanation space, consisting of data sources in various forms like textual articles, reports, databases, etc., the information role of the DLs in helping users derive eective decisions will be greatly enforced. Unfortunately, such high-order cognition support and its impact on users' work have not received much exploration in current DL systems. 2) Inadequate knowledge sharing facilities. Traditional libraries are a public place where a large extent of mutual learning, knowledge sharing and exchange can happen. A user may ask a librarian for search assistance. Librarians themselves may collaborate in the process of managing, organizing and disseminating information, or share experiences in using consistently-emerging new systems and tools to tackle users' search questions. Users may communicate and learn from each other by observing how others use library's resources, or by asking for help. With paper sources digitalized and physical libraries moving to virtual DLs, these valuable features of traditional libraries should be retained. We believe DLs of the future should not just be simple storage and archival systems. To be successful, DLs must become a knowledge place where knowledge acquisition, sharing and propagation take place. For example, if the DL in the above scenario could make readily available expertise and answers to strategic level questions, which might require time-consuming search or consultation with experts, it can help users to better exploit the DL and improve working eectiveness. Also, as computerized knowledge does not deteriorate with time as that human knowledge does, for long-term retention, DLs oer ideal repositories for the knowledge in the world, and make them universally accessible. Our work. We propose a two-layered DL function model to support both tactical level and strategic level cognitive tasks of users according to their information needs. The model moves beyond simple searching and browsing across multiple correlated repositories, to acquisition of knowledge. On the basis of the proposed function model, we further divide the DL information space into two subspaces, i.e., a knowledge subspace and a document subspace. Documents in the document subspace serves as the justi cation for the corresponding knowledge in the knowledge subspace. We give a detailed description of the DL knowledge subspace, as well as query facilities against such enhanced DLs. The remainder of the paper is organized as follows. In section 2, we review some closely related work. A two-layered DL model, consisting of a tactical cognition support layer and a strategic cognition support layer, is presented in section 3. A formal description of the DL knowledge subspace and query facilities against the enhanced DLs are presented in section 4 and section 5, respectively. Section 6 concludes the paper with a brief discussion of future work.

2 Related work DLs integrate a variety of information technologies, calling for multi-disciplinary cooperation in information science. In this section, we provide a brief review of relevant work in dierent elds, including ontologies and lexical relations, knowledge representation, information retrieval and multisource integration in DLs. 3

Ontologies and Lexical Relations The Oxford English Dictionary de nes ontology as the \the science or study of being ". In philosophy, the word ontology refers to a systematic description of a minimal set of concepts that a language needs to express all its other concepts [Hop97]. In linguistic and lexicographic contexts, an ontology is a speci cation of the concepts and their relationships that exist for a community of people or automated agents. Its purpose is to enable knowledge sharing and reuses [Gru93]. To re ect cognitive relations among the concepts as expressed by or embodied in words in natural languages, the theory of lexical relations was developed [Kie69], where information is represented as a collection of nodes connected by labeled arcs that express links or relationships between the nodes. These links may represent semantic and syntactic relationships among concepts, and can be used to re ect not only major aspects of word meaning, but also their morphological relationships and implicatures [Kie69, ES79, NFE90, MNE92, Nut89]. The most widely recognized relationships include is-a, synonym, antonym, and so on. In the literature, a link can be denoted using either a labeled arc or a proposition node [ea78, Fin79, CM87]. Further, a methodology for identifying, evaluating and describing dierent relationships, along with their logical properties in modeling human reasoning, was presented [MNE92]. It begins with a study of dictionary and folk de nitions to obtain the linguistic formulae used to express the link in question. Then it discovers how this link appears in anthropology, linguistics, philosophy and psychology, and the way it interacts with other links and the way it is used in conceptual information processing. A hierarchy of lexical-semantic relations is thus constructed which contains both basic and non-basic semantic relations as well as a large number of lexical relations. Many (but not all) of the lexical relations so far identi ed can be extracted automatically from dictionary de nitions [NFE90]. CYC [Len95] aims to build a common sense knowledge base consisting of roughly 100,000 general concepts spanning all aspects of human reality. WordNet is another remarkable and widely used lexicon, which groups the words together by means of synonym sets. It aims to provide an aid to search in a lexicon in a conceptual rather than an alphabetical way [MBF+ 93].

Knowledge Representation Knowledge is an important element of any AI application. A number of knowledge representation schemas are designed in the AI eld so that the knowledge can be applied in the reasoning process to solve problems. These techniques can be roughly categorized as either declarative methods, in which most of the knowledge is represented as a static collection of facts accompanied by a small set of general procedures for manipulating them; or procedural methods, in which the bulk of the knowledge is represented as procedures for using it [And92, Ric83, Gar88]. Typical declarative methods include predicate logic, semantic nets, frames and scripts. 1) Predicate logic involves using standard forms of logical symbolism to represent real-world facts as statements, written as well-formed formulae, which are made up of predicates, constant terms, variables, logical connectives, and quanti ers. 2) Semantic nets take the complex structure of the world into consideration, where information is represented as a set of nodes connected to each other by a set of labeled arcs, representing relationships among the nodes [Min68a, Min68b]. Both 3) frames and 4) scripts serve as general-purpose knowledge structures that represent some common features of things or sequence of events in a particular context. Some 4

procedural knowledge representation methods include procedures and production rules. In our study, we explore the use of predicate logic and semantic network mechanisms to represent hypotheses and their inter-relationships in the DL knowledge subspace.

Information Retrieval To support ecient information searching and browsing activities, many eorts have been made in developing retrieval models, building document and index spaces, extending and re ning queries for information retrieval systems [Fid94, FBY92, CLvRC98]. A three-dimensional model of information exploration is presented in [WC91], with the attempt to clarify the respective roles of the human and the system in browsing and information retrieval, and to characterize alternative interaction styles to maximize retrieval eectiveness. In [DvR93], index terms are automatically extracted from documents and a vector-space paradigm is exploited to measure the matching degrees between queries and documents. Indexes and metadata can also be manually created from which semantic relationships are captured [BS95, Dao98]. Besides, the large collection of documents can be semantically partitioned into dierent clusters, so that queries can be evaluated against relevant clusters [Wil88]. To improve query recall and precision, several query expansion and re nement techniques based on relational lexicons/thesauri or relevance feedback have been explored [VWSG97, Eft93, JGR+ 95]. [Ing94] applies a cognitive polyrepresentation of information needs to information retrieval systems, aiming to improve the intellectual access to information sources on an enriched contextual platform. [BJ98] investigates the interface design issues for a highly interactive information retrieval system based on a probabilistic retrieval model with relevance feedback. [BM90] proposes a method for specifying the functionality of an intelligent interface to large-scale information retrieval systems, and for implementing those functions in an operational environment. Furthermore, [Che92] outlines a theoretical framework for understanding the information storage and retrieval process in terms of the \agents" involved in the indexing and searching processes. It shows how human agents are involved in the retrieval processes, the types of knowledge these agents possess, and the observed characteristics of their indexing and searching behaviors. A recent work incorporates knowledge about the document structures into information retrieval, and the presented query language allows the assignment of structural roles to individual query terms [WFC00]. Individual dierences in using information retrieval systems have also been extensively studied in [Bor87, MDR89, BBDMB94, SS98, Dum99].

Multi-Source Integration in DLs The emerging eld of DLs brings together participants from many existing research areas [FM98, FM01]. From a database or information retrieval perspective, DLs may be seen as a form of federated databases, since a DL usually contains lots of distributed and heterogeneous repositories which may be autonomously managed by dierent organizations. To facilitate users' easy access to diverse sources, many eorts have been engaged in handling various structural and semantics variations and providing users with a coherent view of a massive amount of information. Currently, the interoperability problem has sparked vigorous discussions in the DL community [SC99, SMC+ 99, Sch98, Sch95, Che99, PBLO99, PCGMW98]. The concept extraction, mapping and switching techniques, developed in [BHCS99, MG95, CSN97], enable users in a cer5

tain area to easily search the specialized terminology of another area. A dynamic mediator infrastructure [MGMP00] allows mediators to be composed from a set of modules, each implementing a particular mediation function, such as protocol translation, query translation, or result merging [PBJ+ 00]. [PL99, JL97] present an extensible digital object and repository architecture FEDORA, which can support the aggregation of mixed distributed data into complex objects, and associate multiple content disseminations with these objects. [KW95, PCGM+ 98, PH96] employ the distributed object technology to cope with interoperability among heterogeneous resources. According to topic areas, a distributed semantic framework is proposed to contextualize the entire collection of documents in a DL for ecient large-scale searching [PH99]. Experiences in designing and implementing DLs, including the archival repository architecture, user interface, cross-access mechanism, and intellectual property protection, etc., are substantially presented in [Bat02, HP00, CGM99, CCGM00, HBOS96, Hou95, PW97, CMFH00, Kan98, BSH97, BL97]. Recommendations for DL evaluation from a both longitudinal and multifaceted viewpoint are provided in [Mar00]. On the other hand, applying AI to library science, many library-oriented expert systems have been developed in the literature [LS90, LS97]. However, most of these systems essentially aid in carrying out the support operations of libraries, such as descriptive cataloging, collection development, disaster planning and response, database searching, and document delivery, etc. [LS90, LS97]. There is a fundamental dierence from the work reported in the paper, whose intention is to answer library users' strategic questions.

3 A Two-Layered DL Function Model Supporting users' information searching and browsing has been the focus of DL research for a long time. However, as social humans, their information expectations for a DL are more than pure searching and browsing. In this section, we extend the traditional role of DLs from information provider to information & knowledge provider. Figure 1 illustrates a two-layered DL function model, consisting of a tactical cognition support layer and a strategic cognition support layer. User’s Request

Strategic Level Cognition Support

DL’s Cognitive Function Modules

look for "hypothesis" knowledge with/without justifications look for "potentially" useful documents

Tactical Level Cognition Support

DL’s Result

Knowledge Inquiry

answer to a research question based on knowledge and supporting documents

Information Browsing

correlated concept space with relevant documents

Information Searching

look for "specific" documents

requested documents

Figure 1: A two-layered DL function model

3.1 Tactical Level vs. Strategic Level Cognition Support Tactical level cognition support. We view traditional DL searching and browsing as tactical

level cognitive acts. The target of searching is towards certain speci c documents. One searching 6

example is \Look for the article written by John Brown in the proceedings of VLDB88." As the user's request can be precisely stated beforehand, identifying the target repository where the requested document is located is relatively easy. Primarily, the ability to search indexes of repositories can support this kind of searching activities. In comparison to searching whose objective is wellde ned, browsing aims to provide users with a conceptual map, so that users can navigate among correlated items to hopefully nd some potentially useful documents, or to formulate a more precise retrieval request further. For instance, a user reads an article talking about a water reservoir construction plan in a certain region. S/he wants to know the possible in uence on ecological balance. By following semantic links for the water reservoir plan in the DL, s/he navigates to the related \ecological protection" theme, under which a set of searchable terms with relevant documents are listed for selection. To facilitate browsing, DLs must integrate diverse repositories to provide users with a uniform searching and retrieval interface to a coherent collection of materials. The capability that enables navigation among a network of inter-related concepts, plus the searching capability on each individual repository, constitute the fundamental support to browsing activities. As the techniques of searching and browsing have been extensively studied and published in the literature, we will not discuss these any further here. Strategic level cognition support. In contrast to tactical level cognition support which intends to provide users with requested documents, strategic level cognition support not only provides relevant documents but also intelligently answers high-order cognitive questions, and meanwhile provides justi cations and evidences. Taking the scenario in section 1 as an example, the purpose of Kooper on the use of the DL is to con rm his prior knowledge about \Wet winter causes summer ood ". Instead of retrieving documents using dispersed keywords like wet winter, summer ood, cause, etc, the user would prefer to pose a direct question Q1 as follows, and expect a direct con rmed/denied answer from the DL system rather than a list of articles lacking explanatory semantics and waiting for his assessment. Q1 : \Tell me whether a wet winter will cause summer ood." Other high-order cognitive request examples are like: Q2 : \Tell me what will possibly cause summer ood. Give me referent articles that talk about this." Q3 : \Give me articles which state the necessary in uences of wet winter." In response to dierent questions as described above, it is desirable for DLs to provide knowledgelevel answers, together with the justi cations for these answers. For example, the justi cation for Q1 is a series of articles talking about \wet winter causes ood in summer ", as well as evidence articles which talk about, respectively, wet winter in certain years and summer ood in the next years in certain particular regions. The provision of strategic level cognition support adds values to DLs beyond simply providing document access. It reinforces the exploration and utilization of information in DLs, and advocates a more close and powerful interaction between users and DL systems. With this high-order cognitive assistance, DL users will be able to nd things to solve their real information problems themselves. From the viewpoint of DLs, to realize such a strategic level cognition function, substantial information analysis and abstraction need to be done. This inevitably involves the navigation and correlation of information items across multiple repositories in DLs, and production of intelligent knowledge in 7

answering users' strategic questions.

3.2 An Enlarged DL Information Space In order to support the two kinds of cognitive acts, we further divide the DL information space into two subspaces, i.e., a knowledge subspace and a document subspace, as shown in Figure 2. Note that in this gure, we only illustrate the organization of documents for knowledge justi cation purpose. Documents in a DL are in fact also indexed and clustered based on the ontology and thesauri. Knowledge Subspace

H1:

relate (winter|*, river-behavior|*)

....

H2: cause (wet-winter|*, summer-flood|*) H3:

cause (wet-winter|*, summer-flood and hot-summer|*)

(wet-winter|*, H4: cause summer-flood and cool-summer|*)

......

Document Subspace

...... articles on H3

articles on H2

articles on H1

articles on H4

(for justification) represents a hypothesis represents a specification relationship between two hypotheses represents an opposite relationship between two hypotheses

Figure 2: An enlarged DL information space

The knowledge subspace. The basic constituent of the knowledge subspace is knowledge, such

as hypothesis, rule, belief, etc. 1 Each hypothesis describes a certain relationship among a set of concepts. For example, the hypothesis \H2 : wet winter causes summer ood " explicates a causal relationship between a cause wet winter and the eect summer ood it has. Considering that the DL knowledge subspace is for users to retrieve strategical level knowledge, it will inevitably be subject to the classical information retrieval's vocabulary (synonymy and polysemy) problem. Previous research [Bat86, CLBN93] demonstrated that dierent users tend to use dierent terms to seek identical information. Furnas et.al [FLLD87] showed that the probability of two people using the same term to classify an object was less than 20%. To enable knowledge exchange and reusability, we build various relationships including equivalence, speci cation/generalization and opposition over the hypotheses knowledge, expanding a single user's hypothesis into a network of related hypotheses. Later, if one user's inquiry has the form of a hypothesis, the above relationships can be explored to nd matching hypotheses in the knowledge subspace. The hypotheses together with the backing documents, serving as the justi cation of hypotheses, in the document subspace are returned to the user as a part of the answer to his/her strategical request. For example, a more general hypothesis in respect to H2 is \H1 : wet winter is related with river behavior ". Section 4 gives a detailed description of the knowledge subspace. A query language against the enhanced DLs is provided in Section 5. 1

In this initial study, we focus on hypothesis knowledge in empirical sciences.

8

The document subspace. Under each hypothesis is a justi cation set, giving reasons and evidences

for the knowledge. These justi cations, made up of articles, reports, etc., constitute the document subspace of the DL information space. Taking the above hypothesis H2 for example, the articles mentioning exactly that \wet winter is an indicator of summer ood." constitute the justi cation for that hypothesis. Returning to the previous example questions Q1 , Q2 and Q3 , the DL system can provide not only intelligent answers according to the hypothesis, but also a series of links to the justifying articles for users' reference. It is worth notice here that the document subspace challenges traditional DLs on literature organization, classi cation, and management. For belief justi cations, we must extend the classical keyword -based index schema, which is mainly used for information searching and browsing purposes, to knowledge -based index schema, in order that the information in DLs can be easily retrieved by both keywords and knowledge.

4 A Formal Description of DL Knowledge Subspace In this section, we de ne the basic constituent of the DL knowledge subspace - hypothesis, starting with its two constructional elements, i.e., concept terms and relation terms. Dierent concept-termbased and relation-term-based relationships are then de ned. They form the base for the de nitions of hypotheses and the equivalence, speci cation/generalization and opposition relationships of hypotheses. Throughout the discussion, we assume the following notation is used.

A nite set of entity concepts EC oncept = fe1 ; e2 ; : : : ; et g. A nite set of relation concepts RC oncept = fr1 ; r2 ; : : : ; rs g. A nite set of concepts C oncept, where C oncept = EC oncept [ RC oncept: A nite set of contextual attributes Att = fa1 ; a2 ; : : : ; am g. The domain of ai 2 Att is denoted as Dom(ai ):

A nite set of concept terms CT erm = fn1; n2 ; : : : ; nug. A nite set of relation terms RT erm = fm1 ; m2; : : : ; mv g. A nite set of hypotheses Hypo = fH1 ; H2; : : : ; Hw g.

4.1 Concepts Concepts represent real-world entities, relations, states and events. Here, we classify concepts into two categories, i.e., entity concepts and relation concepts, according to the roles they play in representing knowledge, which will be addressed in this section. Entity concepts are used to describe concept terms, including attributes. They are similar to entities in classical Entity-Relationship (ER) diagrams. Some entity concept examples are dog, animal, trac-jam, wet-winter, summer- ood, etc.. In contrast, relation concepts are mainly for presenting various relationships among concept terms. 9

They thus look like relationships in ER diagrams. For example, the causal relation between a wetwinter in the north and a consequence summer- ood in the south can be expressed with the relation concept cause. Concepts are the building blocks of a sentence. They convey the most fundamental cognitive knowledge in human society. Based on the substantial work on lexicography and ontology [Sow84, Kie69, ES79, NFE90, Gru93, GW00], three typical primitive relationships between concepts can be established. They are Is-A, Synonym, and Opposite, each of which is denoted using a binary predicate. For example, Is-A (summer- ood, river-behavior ) represents the notion that summer- ood is a riverbehavior, Opposite (wet-winter, dry-winter ) represents a pair of antonym concepts, wet-winter and dry-winter, and Is-A (cause, relate ) speci es a more general relation concept relate in respect to cause.

Property 1 The three primitive relationships of concepts (which can be either entity concepts or

relation concepts) have the following properties: 1) Is-A is re exive and transitive. 2) Synonym is re exive, transitive and symmetric. 3) Opposite is symmetric. 2

4.2 Concept Terms Observing that real-world users' investigation and understanding of concepts are usually under certain speci c contexts, in this study, we associate each entity concept with a conceptual context to denote the circumstance under which the entity concept is considered. Basically, we can declare a conceptual context by a set of attributes called contextual attributes, each of which represents a dimension in the real world. Typical contextual attributes include time, space, temperature, and so on. Note that the contextual attributes could be of any kind as long as they are meaningful and make commitment to the existence of entity concepts. For example, the context for the entity concepts wet-winter and summer- ood could be constructed by two contextual attributes, Year and Region, indicating the time and place for the occurrences of wet-winter and summer- ood. Such a context can be further instantiated by assigning concrete values to its constructional attributes. For example, we can restrict the contextual Region and Year of wet-winter to the north and south areas in 2000 by setting Region:=f\north", \south"g and Y ear:=f\2000"g. Region := Dom(Region) assigns all applicable regions (\south", \north", \east", and \west") to contextual attribute Region. In this paper, we view an entity concept and its associated context as an integral unit, where the context spans the universe of discourse for that entity concept. To distinguish it from the traditional notion of entity concept, we name it as a concept term.

De nition 1 A concept term n 2 CT erm is of the form n = e jAV , where e 2 EC oncept and AV = fa := Va j (a 2 Att) ^ (Va Dom(a))g. Let Att = fa1 ; a2 ; : : : ; am g be a set of contextual attributes which constitute the context under consideration. The default setting for attribute ai 2 Att is the whole set of applicable values in the domain of ai , i.e., ai := Dom(ai ). AV = fa1 := Dom(a1 ); a2 := Dom(a2 ); : : : ; am := Dom(am )g depicts a universe context. A simple and equivalent representation of the universe context is AV = . 2 Example 1 Suppose the context is comprised of two contextual attributes Region and Y ear. wet10

winter jfRegion:=f\north";\south"g; Y ear:=f\2000"gg is a concept term, denoting a wet-winter entity concept in the north or south in 2000. The concept term wet-winter jfY ear:=f\2000"gg implies a wetwinter in 2000 in all possible regions, including \north", \south", \west" and \east". The context in wet-winter jfRegion:=Dom(Region); Y ear:=Domain(Y ear)g spans the universe of discourse, equivalent to the expression wet-winter j according to De nition 1. 2 Dierent relationships of concept terms can be identi ed based on their entity concepts and associated contexts. Before giving the formal de nitions, we rst de ne three relationships between two instantiated contexts.

De nition 2 Let AV1 and AV2 be two instantiated contexts. AV1 a AV2 (or AV2 a AV1 ), i 8(a := V2 ) 2 AV2 9(a := V1) 2 AV1 (V1 V2), AV1 =a AV2, i AV1 a AV2 and AV2 a AV1 . AV1 a AV1 ), i (AV1 a AV2 ^ AV1 6=a AV2 ). According to De nition 1, for any instantiated context AV , we have AV a . 2 The =a relationship between two instantiated contexts AV1 and AV2 indicates that they both have exactly the same contextual attributes with the same attribute values. AV1 a AV2 states that AV2 is broader than AV1 , covering the contextual scope of AV1 .

Example 2 Assume we have four instantiated contexts: AV1 = ; AV2 = fY ear := f\1999"; \2000"; \2001"gg; AV3 = fRegion := f\north"g; Y ear := f\2000"gg; and AV4 = fRegion := f\south"g; Y ear := f\2000"gg:

Following De nition 2, AV2 = Doc.

2

The de ned equivalence, (strict) speci cation and opposition relationships of hypotheses lead us to the following axiom and theorem.

Axiom 1 , Let H1; H2 be two hypotheses in Hypo. If H1 is a (strict) speci cation of H2, i.e., H1 h (h)H2, then H1 H2. If H1 is equivalent to H2, i.e., H1 h H2, then H1 = H2. If H1 is opposite to H2, i.e., H1 /h H2, then H1 \ H2 = ;. 2 Using De nition 16 of speciality/generality between hypotheses, we can be sure that if a hypothesis is consistent with a set of documents, any generalization of it will also be consistent with this document set. In contrast, if a document does not conform to a hypothesis, it cannot conform to any specialization of that hypothesis either. For any hypothesis H 2 Hypo where (? h H h >), it is obvious that ; = ? H > = Doc:

Theorem 1 Let H1; H2; H3 be three hypotheses in Hypo, where H1 is a common generalization of H2 and H3, i.e., (H2 h H1 ) and (H3 h H1 ). We have ( H2 [ H3 ) H1 . 2 Proof 1 Since (H2 h H1), according to Axiom 1, ( H2 H1). Similarly, ( H3 H1) because of (H3 h H1 ). Thus, ( H2 [ H3 ) H1. 2

5 Queries Against the Enhanced DLs With the proposed two-layered DL information space, users can pose to DLs various strategic questions like \what will cause a summer- ood in the south? " \Does mobile phone cause brain cancer? If so, give me documents to prove that." To express this kind of knowledge requests, we present a query language against the enhanced DLs, using a syntax similar to logical expressions.

5.1 Query Constructors Basically, a DL query consists of a single head and a single tail separated by the \j" symbol, hheadi j htaili. The tail describes the query condition, while the head describes the query result, which can be either a complete or part of a hypothesis with/without justi cation documents, or a boolean indicating whether the inquiried knowledge is true or false. A query tail is a list of predicates connected by logical connectives AND (^), OR (_), or NOT (:), where each predicate falls into any of the following three categories: 16

an element binding constraint F unc(e)/F unc(e; E ); a hypothesis of a binary predicate form m(IN ; ON ); a comparison constraint between two expressions hexpressioni hopcompi hexpressioni: An element in a query tail can be either a variable or a constant, representing a hypothesis, a hypothesis component (which can be a concept term, a contextual attribute, a contextual attribute value, a relation term or a modal symbol), or a referent document of a hypothesis. helementi ::= hhypothesisi j hhypothesis componenti j hreferent-documenti hhypothesis componenti ::= hconcept-termi j hcontextual-attributei j hcontextual-attribute-valuei j hrelation-termi j hmodali We use dierent constraint functions of the form F unc(e) or F unc(e; E ) to bind an element to the corresponding category.

Hypo (e) - e denotes a hypothesis; Ref (e; H ) - e denotes a referent document of hypothesis H ; CT erm (e; H ) - e denotes a concept term in hypothesis H ; Att (e; H ) - e denotes a contextual attribute in hypothesis H ; V al (e; a) - e denotes a value of the contextual attribute a; RT erm (e; H ) - e denotes a relation term in hypothesis H ; Modal (e; H ) - e denotes a modal in hypothesis H .

An expression in a comparison constraint hexpressioni hopcomp i hexpressioni can be an element or an arithmetic expression involving single-value-oriented operators f+, -, *, /g or set-value-oriented operators f\; [g. That is, hexpressioni ::= helementi j helementi hopmath i hexpressioni hopmath i ::= + j , j j = j \ j [ To query the knowledge in a DL, we expand the traditional comparison operators to include the hypothesis-based comparison operators fh ; h ; h ; h ; h ; 6h ; /hg de ned in the previous section, i.e., hopcompi ::= < j j = j 6= j > j j j j j j h j h j h j 6h j /h j h j h. Within a query (tail and head), we use e to indicate a single element, and brace feg to indicate a set of elements.

5.2 Query Examples In this subsection, we illustrate the usage of the presented query constructors through a set of examples. We formulate our descriptions according to dierent types of query outputs. 17

1) Query the tenability of a hypothesis Q1 : Tell me whether wet winter will cause summer ood. j cause (fwet-winter j g; fsummer-flood jg).

An empty query head merely inquires the tenability of the input knowledge. If it holds, the query returns true, and otherwise, false.

2) Query the referent documents of a hypothesis Q2 : Give me documents which talk about the causal relation between wet winter and summer ood. fdg j Hypo(H ) ^ Ref (d; H ) ^ H h cause (fwet-winter j g; fsummer-flood j g). Some intermediate variables with corresponding binding constraints (e.g., H in this query example) can be employed for the ease of query expressions.

3) Query the hypotheses and their referent documents Q3 : Tell me all hypotheses in contrast to the one \wet winter will cause summer ood". Give me all

the referent documents. fH; fdgg j Hypo(H ) ^ Ref (d; H ) ^ H /h cause (fwet-winter j g; fsummer-flood j g). The result is a set whose element is made up of two elds, a hypothesis and a set of supporting documents for that hypothesis.

4) Query the concept terms in a hypothesis Q4 : Tell me what factor(s) will possibly cause a summer ood. Give me the referent documents. fng; fdg j Hypo(H ) ^ CT erm(n; H ) ^ Ref (d; H ) ^ H h 3 cause (n; fsummer-flood jg).

5) Query the relation term in a hypothesis Q5 : Tell me whether there is a relation between wet winter and summer ood. Give me the referent documents.

m; fdg j Hypo(H ) ^ RT erm(m; H ) ^ Ref (d; H ) ^ H h m (fwet-winter j g; fsummer-flood j g).

6) Query the modal of a relation term in a hypothesis Q6 : Tell me whether wet winter will necessarily or possibly cause summer ood. j Hypo(H ) ^ Modal(; H ) ^ H h cause (fwet-winter j g; fsummer-flood j g. The query answer will be 2, 3 or empty.

18

7) Query the contextual values in a hypothesis Q7 : Tell me which region(s) will have summer ood due to the wet winter in the south. Give me the referent documents.

fvg; fdg j Hypo(H ) ^ V al(v; Region) ^ Ref (d; H ) ^ H h cause (fwet-winter jRegion:=fvg g; fsummer-flood jRegion:=fsouthg g).

8) Query the contexts (attribute-value pairs) in a hypothesis Q8 : Tell me the contexts under which wet winter will cause summer ood. fa := fva gg; fb := fvb gg j Hypo(H ) ^ Att(a; H ) ^ V al(va ; a) ^ Att(b; H ) ^ V al(vb ; b) ^ H h cause (fwet-winter jfa:=fva gg g; fsummer-flood jfb:=fvb gg g).

6 Conclusion Motivated by the shortcomings - (i) inadequate strategical level cognition support; (ii) inadequate knowledge sharing facilities - with the present-day DLs, we rst introduce a two-layered DL model to support dierent levels of human cognitive acts. The tactical level cognition support aims to provide users with requested documents, as searching and browsing do, while the strategic level cognition support can provide not only documents but also intelligent answers to users' high-order cognitive questions. Second, to address users' strategic requests, we propose an enlarged information space comprised of a knowledge subspace and a document subspace. A formal description of the knowledge subspace is particularly discussed. Finally, we present a query language against the enhanced DLs for knowledge acquisition and sharing among DL users. Compared with the traditional knowledge-based systems, the enhanced DL systems with knowledge elements have the following distinguished features. 1) Cognitive function. A knowledge-based system is usually designed to apply logical inference rules to make judgement in processing business routines or come up with a conclusion to a certain pre-de ned problem. The mission of a DL system equipped with a knowledge subspace is to make expertise knowledge widely available to the public. From such DLs, users can obtain not only requested documents, but also intelligent answers to their strategic questions. 2) Broad scope. A knowledge-based system intends to solve problems in a narrow domain, e.g., company delivery charge, heart disease diagnosis, etc. The rules stored in its knowledge base are thus limited to a particular eld. On the contrary, the scope of the DL knowledge subspace is broad, covering a wide spread of scienti c and enginering disciplines. 3) Extensive content. DLs collect a huge body of documents, which can serve as knowledge justi cation. Comparatively, knowledgebased systems contain a limited amount of rules and facts in a particular eld of expertise. The work reported in the paper is a rst step towards strategic level cognition support for DL users. For future work, a lot of issues deserve investigation. 1) One immediate work is to build up the knowledge subspace and link it to the document subspace. Currently, we are exploring two ways of knowledge acquisition to ll in the knowledge subspace. One of them is human-centered knowledge acquisition. Experienced humans input hypothesis knowledge manually, based on which justifying articles are collected by performing searching and browsing on DL systems. We intend to implement some kind of immediate reward system (e.g., extending users' quota for certain activities) that makes 19

it attractive for users to input their knowledge and justifying documents. Existing systems which use this \return on investment" policy have showed that such a policy has a self-regulating property 2 , and indeed may lead to a coarse but useful ltering of data items (documents) useful to support a strategic goal. The second approach is machine-centered knowledge acquisition. Under the assistance of users, DL systems automatically deduce hypothesis knowledge by correlating and analyzing data sources. 2) To facilitate strategic level cognitive activities, ecient storage and management of the knowledge and document subspaces is very important and must be carefully planned. This demands eective indexing strategies for both knowledge and justifying documents. 3) Ecient query evaluation and knowledge navigation mechanisms must be built to help users make the best of information and knowledge assets in solving their problems. 4) Our eventual goal is to develop a practical DL system, which can empower human with real actionable knowledge in solving their information problems.

References [And92]

R.G. Anderson. Information and Knowledge Based Systems: an Introduction. Prentice Hall, Inc., 1992. [Bat86] M.J. Bates. Subject access in online catalogs: A design model. Journal of the American Society for Information Science, 37(6):357{376, 1986. [Bat02] M.J. Bates. The cascade of interactions in the digital library interface. Information Processing and Management, 38(3):381{400, 2002. [BBDMB94] N.J. Belkin, C.L. Borgman, S.T. Dumais, and M.H-Beaulieu. Evaluating interactive retrieval systems. In Proc. of the 17th Intl. Conf. on Research and Development in Information Retrieval, page 361, Dublin, Ireland, July 1994. [BHCS99] N. Bennett, Q. He, C. Chang, and B.R. Schatz. Concept extraction in the interspace prototype. Technical report, Dept. of Computer Science, University of Illinois at Urbana-Champaign, 1999. [BJ98] M. Beaulieu and S. Jones. Interactive searching and interface issues in the Okapi best match probabilistic retrieval system. Interacting with Computers, 10:237{248, 1998. [BL97] B.P. Butten eld and S.T. Larsen. Bilding a digital library for earth system science: The Alexandria project. In Proc. of Annual Meeting of the Geoscience Information Society, pages 23{26, USA, October 1997. [BM90] N.J. Belkin and P.G. Marchetti. Determining the functionality and features of an intelligent interface to an information retrieval system. In Proc. of the 13th Intl. Conf. on Research and Development in Information Retrieval, pages 151{177, Brussels, Belgium, September 1990. [Bor87] C.L. Borgman. Individual dierences in the use of information retrieval systems: Some issues and some data. In Proc. of the 10th Intl. Conf. on Research and Development in Information Retrieval, pages 61{71, New Orleans, USA, June 1987. [BS95] K. Beard and V. Sharma. Multidimensional ranking in digital spatial libraries. In Proc. of the 2nd Intl. Conf. on the Theory and Practice of Digital Libraries, Texas, USA, June 1995. 2

http://www.slashdot.org

20

[BSH97]

K. Beard, T. Smith, and L. Hill. Meta-information models for georeferenced digital libraries. Journal on Digital Libraries, 1(2):153{160, 1997. [CCGM00] B. Cooper, A. Crespo, and H. Garcia-Molina. Implementing a reliable digital object archive. Technical report, Standford University, 2000. [CGM99] A. Crespo and H. Garcia-Molina. Modeling archival repositories for digital libraries. Technical report, Standford University, 1999. [CH98] P. Checkland and S. Holwell. Information, Systems and Information Systems - making sense of the eld. John Willey & Sons, Inc, 1998. [Che80] B.F. Chellas. Modal Logic: An Introduction. Cambridge University Press, 1980. [Che92] H. Chen. Knowledge-based document retrieval: frameowrk and design. Journal of Information Science, 18:293{314, 1992. [Che99] H. Chen. Semantic research for digital libraries. D-Lib Magazine, 5(10), 1999. [CLBN93] H. Chen, K.J. Lynch, K. Basu, and T. Ng. Generating, integrating, and activating thesauri for concept-based document retrieval. IEEE Expert, 8(2):25{34, 1993. [CLvRC98] F. Crestani, M. Lalmas, C. van Rijsgergen, and I. Campbell. `is this document relevant? ... probably': A survey of probabilistic models in information retrieval. ACM Computing Surveys, 30(4), 1998. [CM87] N. Cercone and G. McCalla, editors. The Knowledge Frontier, chapter SNePS considered as a fully intensional propositional semantic network (S.C. Shapiro and W.J. Rapaport). SpringerVerlag, New York, 1987. [CMFH00] T.W. Cole, W.H. Mischo, R. Ferrer, and T.G. Habing. XML technologies for the digital library. Technical report, University of Illinois at Urbana-Champaign, 2000. [CSN97] H. Chen, T.R. Smith, and T.D. Ng. Geoscience self-organizing map and concept space. In Proc. of 2nd ACM Intl. Conf. on Digital Libraries, page 257, Philadelphia, USA, July 1997. [Dao98] T. Dao. An indexing model for structured documents to support queries on content, structure and attributes. In Proc. of the IEEE Forum on Research and Technology Advances in Digital Libraries, pages 88{97, California, USA, April 1998. [Dum99] S. Dumais. Beyond content-based retrieval: Modeling domains, users, and interaction. In Proc. of the IEEE Advanced in Digital Libraries, pages 144{145, Maryland, USA, May 1999. [DvR93] M. Dunlop and C. van Rijsbergen. Hypermedia and free text retrieval. Information Processing and Management, 29(3), 1993. [ea78] R.J. Brachman et al. KL-ONE reference mannual. Technical report, Bolt Beranek and Newman Inc. BBN Report No. 3848, Cambridge, MA, 1978. [Eft93] E. Efthimiadis. A user-centered evaluation of ranking algorithms for interactive query expansion. In Proc. of the 16th Intl. Conf. on Research and Development in Information Retrieval, Pittsburgh, PA, June 1993. [ES79] M.W. Evens and R.N. Smith. A lexicon for a computer question-answering system. American Journal of Computational Linguistics, 83:1{93, 1979. [FBY92] W.B. Frakes and R. Baeza-Yates. Information Retrieval: Data Structures and Algorithms. Prentice Hall, Inc, 1992.

21

[Fid94] [Fin79] [FLLD87] [FM98] [FM01] [Gar88] [Gar01] [Gru93] [GW00] [HBOS96] [Hop97] [Hou95] [HP00] [Ing94]

[JGR+ 95] [JL97] [Kan98] [Kie69]

R. Fidel. User-centered indexing. Journal of the American Society for Information Science, 45(8):572{576, 1994. N.V. Findler, editor. Associative Networks: The Represntation and Use of Knowledge by Computers, chapter The SNePS semantic network processing system (S.C. Shapiro). Academic Press, New York, 1979. G.W. Furnas, T.K. Landauer, L.M.Gomex, and S.T. Dumais. The vocabulary problem in human-system communication. Communications of the ACM, 30(11):964{971, 1987. E.A. Fox and G. Marchionini. Toward a world wide digital library. Communications of the ACM, 41(4):28{32, 1998. E.A. Fox and G. Marchionini. Digital libraries: Introduction. Communications of the ACM, 44(5):30{32, 2001. A. Garnham. Arti cial Intelligence: An Introduction. Routledge & Kegan Paul, Inc, 1988. J. Garson. The Stanford Encyclopedia of Philosophy (Winter 2001 Edition), Edward N. Zalta Ed., chapter Modal Logic. http://plato.stanford.edu/archives/, 2001. T. Gruber. A translation approach to portable ontologies. Knowledge Acqusition, 5(2):199{220, 1993. N. Guarino and C. Welty. Ontological analysis of taxonomic relationships. In Proc. of the 19th International Conference on Conceptual Modeling, USA, October 2000. N.A. Van House, M.H. Butler, V. Ogle, and L. Schi. User-centered iterative design for digital libraries: The cypress experience. D-Lib Magazine, 2(2), 1996. J. Hoppenbrouwers. Conceptual Modeling and the Lexicon. PhD Thesis, Tilburg University, 1997. N.A. Van House. User needs assessment and evaluation for the UC berkeley electronic environmental library project. In Proc. of the 2nd Intl. Conf. on the Theory and Practice of Digital Libraries, Texas, USA, June 1995. J. Hoppenbrouwers and H. Paijmans. Invading the fortress: How to besiege reinforced information bunkers. In Proc. of the IEEE Advanced in Digital Libraries, pages 27{35, Washington D.C., USA, May 2000. P. Ingwersen. Polyrepresentation of infrmation needs and semantic entities: Elements of a cognitive theory for information retrieval interaction. In Proc. of the 17th Intl. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 101{110, Dublin, Ireland, July 1994. S. Jones, M. Gatford, S. Robertson, M. H-Beaulieu, J. Secker, and S. Walker. Interactive thesaurus navigation: Intelligence rules ok? Journal of the American Society for Information Science, 46(1):52{59, 1995. R. Daniel Jr and C. Lagoze. Extending the warwick framework: From metadata containers to active digital objects. D-Lib Magazine, 3(11), 1997. Paul Kantor. Observation and measurement in evaluating digital libraries. Technical report, University of Illinois at Urbana-Champaign, 1998. F. Kiefer, editor. Studies in Syntax and Semantics, chapter Semantics and Lexicography: Towards a New Type of Unilingual Dictionary (Y.D. Apresyan, I.A. Melcuk and A.K. Zolkovsky). Dordrecht - Holland, 1969.

22

[KW95]

R. Kahn and R. Wilensky. A framework for distributed digital object services. Technical report, Corporation for National Research Initiatives, 1995. [Len95] D. Lenat. CYC: a large-scale investment in knowledge infrastructure. Communications of ACM, 13(11), 1995. [LS90] F.W. Lancaster and L.C. Smith, editors. Arti cial Intelligence and expert systems: will they change the library?, chapter Arti cial Inttelligence: What will they think of next? (D.P. Metzler). The Board of Trustees of the University of Illinois, 1990. [LS97] F.W. Lancaster and B. Sandore. Technology and Management in Library and Information Services. University of Illinois Graduate School of Library and Information Science, 1997. [Mar00] G. Marchionini. Evaluating digital libraries: A longitudinal and multifaceted view. Library Trend, 49(2):304{333, 2000. [MBF+ 93] G.A. Miller, R. Beckwith, C. Fellbaum, D. Gross, and K. Miller. Introduction to wordnet: an on-line lexical database. Technical report, Princeton University, 1993. [MDR89] C. McKnight, A. Dillon, and J. Richardson. Problems in hyperland? a human factors perspective. Hypermedia, 1(2):167{178, 1989. [MG95] J.P. Mead and G. Gay. Concept mapping: An innovative approach to digital library design and evaluation. Technical report, Cornell University, 1995. [MGMP00] S. Melnik, H. Garcia-Molina, and A. Paepcke. A mediation infrastructure for digital library services. Technical report, Standford University, 2000. [Min68a] M. Minsky, editor. Semantic Information Processing, chapter Semantic Memory (R. Quillian). MIT Press, Cambridge, Mass., 1968. [Min68b] M. Minsky, editor. Semantic Information Processing, chapter A Computer Program for Semantic Information Retrieval (B. Raphael). MIT Press, Cambridge, Mass., 1968. [MNE92] J.A. Markowitz, J.T. Nutter, and M.W. Evens. Beyond is-a and part-whole: More semantic network links. Journal of Computers and Mathematics with Applications, 23(6-9):377{390, 1992. [NFE90] J.T. Nutter, E.A. Fox, and M.W. Evens. Building a lexicon from machine-readable dictionaries for improved information retrieval. Journal of Literary and Linguistic Computing, 5(2):129{137, 1990. [Nut89] J.T. Nutter. Representing knowledge about words. In Proc. of the Second National Conference of AI and Cognitive Science, pages 313{328, London, 1989. [PBJ+00] A. Paepcke, R. Brandri, G. Janee, R. Larson, B. Ludaescher, S. Melnik, and S. Raghavan. Search middleware and the simple digital library interoperability protocol. D-Lib Magazine, 6(3), 2000. [PBLO99] S. Payette, C. Blanchi, C. Lagoze, and E.A. Overly. Interoperability for digital objects and repositories: The cornell/cnri experiments. D-Lib Magazine, 5(5), 1999. [PCGM+ 98] A. Paepcke, S.B. Cousins, H. Garcia-Molina, S.W. Hassan, S.K. Ketchpel, M. Roscheisen, and T. Winograd. Using distributed objects for digital library interoperability. Technical report, Standford University, 1998. [PCGMW98] A. Paepcke, C.K. Chang, H. Garcia-Molina, and T. Winograd. Interoperability for digital libraries worldwide. Communications of the ACM, 41(4):33{43, 1998.

23

[PH96] [PH99] [PL99] [PW97] [Ric83] [SC99] [Sch95] [Sch98] [SMC+ 99] [Sow84] [SS98] [VWSG97] [WC91] [WFC00] [Wil88]

A. Paepcke and S. Hassan. Combining CORBA and the World-Wide Web in the stanford digital library project. In Proc. of the OMG's WWW/CORBA workshop, June 1996. M. Papazoglou and J. Hoppenbrouwers. Contextualizing the information space in federated digital libraries. SIGMOD Record, 28(1):40{46, 1999. S. Payette and C. Lagoze. Flexible and extensible digital object and repository architecture (FEDORA). In Proc. of the 2nd European Conference on Research and Advanced Technology for Digital Libraries, pages 41{59, Crete, Greece, September 1999. T.A. Phelps and R. Wilensky. Multivalent annotations. In Proc. of the 1st European Conf. on Research and Advanced Technology for Digital Libraries, Pisa, Italy, September 1997. E. Rich. Arti cial Intelligence. McGraw-Hill, Inc, 1983. B. Schatz and H. Chen. Digital libraries: Technological advances and social impacts. IEEE Computer, 32(2):45{50, 1999. B.R. Schatz. Information analysis in the net: The interspace of the of the twenty- rst century. In America in the Age of Information: A Forum on Federal Information and Communications, July 1995. B.R. Schatz. High-performance distributed digital libraries: Building the interspace on the grid. In Proc. of 7th IEEE Intl. Symposium on High-Performance Distributed Computing, pages 224{234, July 1998. B. Schatz, W. Mischo, T. Cole, A. Bishop, S. Harum, E. Johnson, L. Neumann, H. Chen, and D. Ng. Federated search of scienti c literature. IEEE Computer, 32(2):51{59, 1999. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. AddisonWesley, Inc., 1984. A. Spink and T. Saracevic. Human-computer interaction in information retrieval: Nature and manifestations of feedback. Interacting with Computers, 10(3):249{267, 1998. B. Velez, R. Weiss, M.A. Sheldon, and D.K. Giord. Fast and eective query re nement. In Proc. of the 20th Intl. ACM SIGIR Conf. on Research and Development in Information Retrieval, pages 6{15, Philadelphia, USA, July 1997. J.A. Waterworth and M.H. Chignell. A model for information exploration. Hypermedia, 3(1):35{ 58, 1991. J.E. Wol, H. Florke, and A.B. Cremers. Searching and browsing collections of structural information. In Proc. of the IEEE Advances in Digital Libraries, pages 141{150, Washington D.C. USA, May 2000. P. Willett. Recent trends in hierarchical document clustering: A critical review. Information Processing and Management, 24(5), 1988.

24