
Lightweight Deductive Databases on the World-Wide Web

Seng Wai Loke, Andrew Davison, and Leon Sterling

Technical Report 96/25
Department of Computer Science
The University of Melbourne
Parkville, Victoria 3052
Australia

June 26, 1996

Abstract

We investigate a Web information structuring mechanism called lightweight deductive databases. Lightweight deductive databases enable more sophisticated automated searching, extraction, and processing, and can facilitate agent-based programming. We also explore how these deductive databases benefit from being distributed on the Web.

1 Introduction

Our aim is to enhance the Web with information which is more susceptible to sophisticated automated searching, extraction, and processing. Such information needs to be stored in a structured form, and be sufficiently high-level (or abstract) to be programmable, while remaining readable. An interesting candidate as a representation formalism is logic programming [22].

Logic programming is based on mathematical logic, where computation is treated as deduction from a set of axioms or rules. Logic programming provides a uniform means for representing data and computations, is declarative (compared to imperative languages), and has a solid semantic basis. In addition, knowledge-based query processing is possible with the aid of logic programming. For instance, we can represent concept hierarchies, and meta-level knowledge about databases.

Deductive databases extend relational databases, utilising logic programming rules for more complex data modelling [20]. A deductive database is, in essence, a logic program: base relations map to facts, and rules are used to define new relations in terms of base relations, and to process queries. Also, deductive databases structure information according to predefined conceptual schemata. Hence, deductive databases qualify as an appropriate metaphor for information processing on the Web.

This paper investigates how deductive databases can be incorporated into Web pages. In the spirit of [9], we call these databases lightweight deductive databases, since our intention is to use the Web as a source of structured information, rather than to provide functionalities such as transaction processing and query optimisation. Since lightweight deductive databases are distributed on the Web, they have the following features:

- Distributed maintenance. A Web page can contain a part of a deductive database. The completed database can be created as necessary by retrieving the relevant Web pages, and composing them together. This allows the components of a database to be separately maintained, and combined only during query processing.
- Extensibility. The dynamic nature of the complete database allows the incremental addition, or removal, of parts during query evaluation. This facilitates the incremental development of lightweight deductive databases. Moreover, users who are not the database creators can extend the existing database by writing their own rules, using schemata included on the pages containing the database.
- Reusability. Rules and knowledge bases for database query processing can be placed in Web pages, encouraging them to be reused, or shared.
- Client-side processing. Lightweight deductive databases complement the work on building forms-based interfaces to conventional databases. However, in our model, query processing is carried out on the client-side, rather than on the server machine. This reduces server load, and permits state information, such as the results of previous queries, to be kept in the client-side process or files. We also utilise client-side caching, which means that once the databases are loaded, they need not be fetched again.

We base this work on LogicWeb [17], an integration of logic programming and the Web, which treats Web pages as logic programming modules. Meta-level rules on how LogicWeb modules can be accessed and combined are written as logic programs.

In the rest of the paper, we illustrate the above ideas by considering lightweight deductive databases for citations. On-line citations are a valuable resource for bibliography entries, and for obtaining papers. In the following, we assume that the reader is acquainted with Prolog. In §2, we introduce lightweight deductive databases, and give a brief overview of LogicWeb, followed by a simple lightweight deductive database example. In §3, we show how lightweight deductive databases can be combined and extended, and in §4, knowledge-based query processing is explored. In §5, we review related work, and compare existing deductive database systems with our lightweight counterpart. §6 concludes.

2 Lightweight Deductive Databases

LogicWeb provides the framework for lightweight deductive databases. Below, we outline LogicWeb, and describe a lightweight deductive database of citations.

2.1 LogicWeb

LogicWeb treats Web pages as logic program modules, termed LogicWeb modules [17]. Ordinary Web pages can be parsed to extract facts, such as a collection of links used in the page. This provides an additional layer of abstraction beyond the text of a page. More importantly, a LogicWeb module can contain a logic program written in Prolog with some extensions. Among the extra operators are ones for invoking goals in other modules, and for combining modules together. A key element of these features is the use of terms of the form m_id(URL) to refer to modules, where URL is the URL of the module (see footnote 1).

When a LogicWeb module is downloaded, its logic program, and any other information it might contain (written in ordinary HTML; see footnote 2), are loaded into a Prolog environment. The ordinary HTML text is stored in a fact, and can be subsequently processed (e.g., links it contains can be extracted and stored in newly created facts, as mentioned above).

A LogicWeb module may contain a lightweight deductive database (or component thereof). In addition, it may include descriptions of the database (e.g., database schemata) written in ordinary HTML. In the rest of the paper, we shall use the term database modules to refer to LogicWeb modules containing lightweight deductive databases.

2.2 A Citation Database

A lightweight deductive database is a logic program with its clauses categorised according to three main roles: base relations, derived relations, and rules to process queries (see footnote 3). They are illustrated below.

Footnote 1: URL stands for Uniform Resource Locator.
Footnote 2: HTML stands for Hypertext Markup Language.
Footnote 3: These categories are borrowed from the field of deductive databases [20].

Consider a lightweight deductive database of publication citations. A citation consists of distinct components (or attributes), and it should be possible to perform queries using these attributes. Also, it is convenient for citations to be separately stored and maintained (e.g., by the authors themselves). For these reasons, it is useful to store citations in structured databases, and to combine them only during query evaluation. A possible schema for citations is:

    pub_cit(authors,title,pub_type,collection_name,web_location,date)

The schema describes the components of a typical publication citation: the names of the authors, the title of the paper (which is used as the primary key), the type of the publication (e.g., conference, technical report, or journal), the collection in which the paper was published, the URL of an on-line version, and the date of publication. An instance of the schema is:

    pub_cit([author("Seng","Loke"),author("Andrew","Davison")],
            "Logic Programming with the World-Wide Web",
            conference,"Hypertext '96",
            "http://www.cs.mu.oz.au/~swloke/papers/paper1.ps.gz",
            date(march,1996)).

This fact is not in relational database normal form, as structured data is used. Some of the attribute values can be stored as atoms instead of strings, but strings are easier to manipulate in queries (e.g., we could utilise pattern matching with a tolerance for character mismatches). From a database of pub_cit/6 facts, a database of journal citations in 1996 can be formed, which conforms to the schema:

    journal_1996_cit(authors,title,collection_name,month)

The new relation is derived using the rule:

    journal_1996_cit(Authors,Title,CollectionName,Month) :-
        pub_cit(Authors,Title,journal,CollectionName,_,date(Month,1996)).

Rules can be written to process queries on the above databases. For example, the following rules find all the titles of papers by an author in a given year:

    get_titles(Name,Year,Titles) :-
        setof(Title,get_title(Name,Year,Title),Titles).

    get_title(Name,Year,Title) :-
        pub_cit(Authors,Title,_,_,_,date(_,Year)),
        member(Name,Authors).

A database module can be downloaded to a client system, and queried. Alternatively, the base relations (facts) can be stored in one module, and the query processing rules in another. These can be downloaded separately, and combined on the client-side.
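As a minimal sketch of how a base relation, a derived relation, and query rules fit together in a single database, the following plain Prolog program can be loaded into an ordinary Prolog system (SWI-Prolog is assumed here, mainly for member/2 being available without an explicit import). The first fact is the instance given in §2.2; the second fact is a hypothetical entry, invented only so that the derived relation has an answer.

```prolog
% Base relation:
% pub_cit(Authors, Title, PubType, CollectionName, WebLocation, Date).
pub_cit([author("Seng","Loke"), author("Andrew","Davison")],
        "Logic Programming with the World-Wide Web",
        conference, "Hypertext '96",
        "http://www.cs.mu.oz.au/~swloke/papers/paper1.ps.gz",
        date(march,1996)).
% Hypothetical entry (not from the paper), used to exercise the derived relation.
pub_cit([author("Ann","Example")],
        "A Hypothetical Journal Paper",
        journal, "Journal of Examples",
        "http://example.org/paper.ps",
        date(may,1996)).

% Derived relation: journal citations published in 1996.
journal_1996_cit(Authors, Title, CollectionName, Month) :-
    pub_cit(Authors, Title, journal, CollectionName, _, date(Month,1996)).

% Query rules: all titles of papers by an author in a given year.
get_titles(Name, Year, Titles) :-
    setof(Title, get_title(Name, Year, Title), Titles).

get_title(Name, Year, Title) :-
    pub_cit(Authors, Title, _, _, _, date(_,Year)),
    member(Name, Authors).
```

With these clauses loaded, the goal get_titles(author("Seng","Loke"),1996,Titles) binds Titles to a one-element list holding the title of the Hypertext '96 paper, while journal_1996_cit/4 returns only the hypothetical journal entry. Because setof/3 fails when there are no solutions, get_titles/3 fails rather than returning an empty list for an author with no papers in the given year.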

3 Combining and Extending Lightweight Deductive Databases

In this section, we explore the use of virtual relations and relational joins. A virtual relation is formed from other database relations by a set of rules, all defining the same head predicate, but each using a different relation in its body. In a relational join, two relations are combined based on shared attribute values.

From structured logic programming [6], we appropriate the notions of context switching and the composition operations of union, and overriding union. By composing database modules, we effectively combine the databases in those modules. Other ways of composing logic programs have been considered in the literature, such as intersection [4], retraction [5], and various forms of inheritance [6, 19]. Utilising a larger repertoire of operators allows more kinds of compositions to be expressed in queries, but increases complexity.

3.1 Virtual Relations and Joins

The LogicWeb goal, Module#>Goal, causes the evaluation of the logic programming Goal in the LogicWeb Module. Module is loaded the first time a goal requesting it is evaluated. Subsequent evaluations of goals in the module use the already loaded version. The same loading behaviour is employed by the other LogicWeb operators introduced below. The #> operator executes a goal using only the predicates in its prescribed module, excluding other modules. In the following examples, we assume a database of technical report details with the schema:

    tr_cit(authors,title,tr_number,web_location,date)

A virtual relation, cit/6, of citations can be defined in terms of technical report and publication details:

    cit(Authors,Title,Type,BookName,WebLocation,Date) :-
        pub_cit(Authors,Title,Type,BookName,WebLocation,Date).
    cit(Authors,Title,technical_report,BookName,WebLocation,Date) :-
        tr_cit(Authors,Title,BookName,WebLocation,Date).

cit/6 assumes that pub_cit/6 and tr_cit/5 are present in the current module (i.e., the module containing the virtual relation). However, if they are stored in different modules, we can employ #> to retrieve them:

    cit(Authors,Title,Type,BookName,WebLocation,Date) :-
        location_of(publications,PubsURL),
        m_id(PubsURL)#>pub_cit(Authors,Title,Type,BookName,WebLocation,Date).
    cit(Authors,Title,technical_report,BookName,WebLocation,Date) :-
        location_of(technical_reports,TRsURL),
        m_id(TRsURL)#>tr_cit(Authors,Title,BookName,WebLocation,Date).

The locations of the relevant modules are obtained from location_of/2 facts. A derived relation of citations that are both technical reports and publications can be formed as follows:

    pub_tr_cit(Authors,Title,PubWebLocation,TRWebLocation) :-
        location_of(publications,PubsURL),
        location_of(technical_reports,TRsURL),
        m_id(PubsURL)#>pub_cit(Authors,Title,_,_,PubWebLocation,_),
        m_id(TRsURL)#>tr_cit(Authors,Title,_,TRWebLocation,_).

The shared variables, Authors and Title, are used to select citations common to both relations.
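The context-switching behaviour of #> can be approximated in plain Prolog without LogicWeb. The sketch below is not the LogicWeb implementation: it assumes that downloaded modules are represented as mod_clause/3 facts and that goals are run by a small meta-interpreter, solve/2. Both names are hypothetical, as are the module identifiers, author atoms, and URLs.

```prolog
:- op(700, xfx, #>).

% mod_clause(ModuleId, Head, Body): the clause Head :- Body belongs to ModuleId.
% All module contents below are invented stand-ins for downloaded pages.
mod_clause(m_id(pubs_url),
           pub_cit([loke,davison], "Paper A", conference, "Hypertext '96",
                   "http://example.org/a.ps", date(march,1996)),
           true).
mod_clause(m_id(trs_url),
           tr_cit([loke,davison], "Paper A", "96/1",
                  "http://example.org/tr-a.ps", date(january,1996)),
           true).
mod_clause(m_id(trs_url),
           tr_cit([sterling], "Paper B", "96/2",
                  "http://example.org/tr-b.ps", date(june,1996)),
           true).
mod_clause(m_id(main), location_of(publications, pubs_url), true).
mod_clause(m_id(main), location_of(technical_reports, trs_url), true).
% The join from the text, stored in the module that issues the query.
mod_clause(m_id(main),
           pub_tr_cit(Authors, Title, PubWebLocation, TRWebLocation),
           ( location_of(publications, PubsURL),
             location_of(technical_reports, TRsURL),
             m_id(PubsURL) #> pub_cit(Authors, Title, _, _, PubWebLocation, _),
             m_id(TRsURL) #> tr_cit(Authors, Title, _, TRWebLocation, _) )).

% solve(Module, Goal): prove Goal using only the clauses stored in Module.
solve(_, true) :- !.
solve(M, (A, B)) :- !, solve(M, A), solve(M, B).
solve(_, Other #> G) :- !, solve(Other, G).      % context switch
solve(M, G) :- mod_clause(M, G, Body), solve(M, Body).
```

Under these assumptions, solve(m_id(main), pub_tr_cit(A,T,P,R)) has a single answer, for "Paper A", since only that title occurs in both relations with the same author list. A goal such as solve(m_id(main), tr_cit(_,_,_,_,_)) fails, because tr_cit/5 is visible only after switching to the module that stores it.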

3.2 Forming Databases Using Union

The union of two LogicWeb modules corresponds to the set-theoretic union of the clauses in both modules [6]. Union is useful when a query is to be evaluated with respect to two or more modules. In LogicWeb, this is expressed using the lw_union/1 construct in a goal, such as:

    lw_union(ListOfModules)#>Goal

This goal evaluates Goal in the union of the modules in ListOfModules. We can use the following rule to retrieve citation authors from a union of databases:

    authors(CitMod,Modules,Authors) :-
        lw_union([CitMod|Modules])#>cit(Authors,_,_,_,_,_).

CitMod contains the original definition of cit/6 from §3.1. Modules is a list of modules, containing pub_cit/6 and tr_cit/5 facts. Citations are retrieved from the union of all the modules, treated as a single citation database. Hence, the term lw_union([CitMod|Modules]) represents a virtual citation database. If the modules are on different servers, they are retrieved before the cit/6 goal is evaluated. This behaviour allows the combined database to be spread across several servers.
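The effect of evaluating a goal in a union of modules can be sketched in plain Prolog by letting the evaluation context be a list of modules and drawing clauses from any member of that list. As before, this is an illustration under stated assumptions rather than LogicWeb itself: mod_clause/3, solve/2, the module names, and the sample facts are all hypothetical.

```prolog
:- op(700, xfx, #>).

% mod_clause(Module, Head, Body): the clause Head :- Body belongs to Module.
% cit_rules holds the virtual relation; the other two modules hold only facts.
mod_clause(cit_rules, cit(As,T,Ty,B,W,D), pub_cit(As,T,Ty,B,W,D)).
mod_clause(cit_rules, cit(As,T,technical_report,B,W,D), tr_cit(As,T,B,W,D)).
mod_clause(pubs_db,
           pub_cit([loke,davison], "Paper A", conference, "Hypertext '96",
                   "http://example.org/a.ps", date(march,1996)),
           true).
mod_clause(trs_db,
           tr_cit([sterling], "Paper B", "96/2",
                  "http://example.org/tr-b.ps", date(june,1996)),
           true).

% solve(Modules, Goal): prove Goal using the clauses of every module in Modules.
solve(_, true) :- !.
solve(Ms, (A, B)) :- !, solve(Ms, A), solve(Ms, B).
solve(_, lw_union(Ms) #> G) :- !, solve(Ms, G).
solve(Ms, G) :- member(M, Ms), mod_clause(M, G, Body), solve(Ms, Body).

% The authors/3 rule from the text, run through the sketch interpreter.
authors(CitMod, Modules, Authors) :-
    solve([], lw_union([CitMod|Modules]) #> cit(Authors,_,_,_,_,_)).
```

With these facts, authors(cit_rules,[pubs_db,trs_db],A) yields [loke,davison] and then [sterling] on backtracking, because the rules in cit_rules can see the facts of both database modules. With an empty Modules list the goal fails, since cit_rules on its own contains no base facts.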

3.3 Using Dynamic Composition

In discussing the dynamic composition operations, we employ the notion of current context of a goal (or operation). This refers to the module, or the union of modules, in which the operation is evaluated. The dynamic composition operations compose a prescribed module with their current context, whereas the #> operation ignores its current context.

Dynamic union. The dynamic union operation, denoted by Module>>Goal, forms the union of Module and its current context, and evaluates Goal in this union. This differs from a Module#>Goal operation, which ignores its current context. The difference is illustrated below. The operation is dynamic in that the union is only carried out for the duration of the evaluation of the goal. The following rule utilises the module m_id(AddressBookURL) which contains a predicate cit_db/2. cit_db/2 contains the database module URLs for every department. The module URL for a given department is retrieved, and the citation authors in that module are then retrieved.

    authors(Department,AddressBookURL,Authors) :-
        m_id(AddressBookURL)#>cit_db(Department,CitationsModURL),
        m_id(CitationsModURL)>>cit(Authors,_,_,_,_,_).

The use of >> enables cit/6 to be obtained from the current module, and the pub_cit/6 and tr_cit/5 facts from m_id(CitationsModURL). In contrast, the use of #> means that only the cit_db/2 facts in m_id(AddressBookURL) are used, and any such facts in the current module are ignored.

Another use of dynamic union is to load modules containing utility predicates for processing queries. Dynamic union allows the utility predicates to call predicates in the current context.

Dynamic overriding union. Dynamic overriding union, denoted by Module@>Goal, is used when some predicate definitions in Module are to be used in preference to those in the current context. If the dynamic union operation of the previous example was:

    m_id(CitationsModURL)@>cit(Authors,_,_,_,_,_)

then the definition of cit/6 in the current context would momentarily be overridden by the definition of cit/6 in m_id(CitationsModURL).
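The difference between #>, >>, and @> can be made concrete with a small plain Prolog meta-interpreter in which the context is a term built from module names. This is only a sketch: it assumes that overriding works per predicate (a predicate defined in the overriding module hides every definition of it in the context), and mod_clause/3, ctx_clause/3, solve/2, the module names, and the one-argument relations are hypothetical simplifications. The standard operators >> and @> are used here purely as term constructors.

```prolog
:- op(700, xfx, #>).

% mod_clause(Module, Head, Body): hypothetical module contents.
mod_clause(local,   cit(T), pub_cit(T)).
mod_clause(local,   pub_cit("Local draft"), true).
mod_clause(local,   tr_cit("Local report"), true).
mod_clause(dept_db, cit(T), tr_cit(T)).
mod_clause(dept_db, pub_cit("Dept paper"), true).

% ctx_clause(Context, Goal, Body): select a clause for Goal from Context.
% A context is a module name, union(C1,C2), or over(C1,C2) (C1 overrides C2).
ctx_clause(union(C1, C2), G, B) :- !,
    ( ctx_clause(C1, G, B) ; ctx_clause(C2, G, B) ).
ctx_clause(over(C1, C2), G, B) :- !,
    (   defines(C1, G)
    ->  ctx_clause(C1, G, B)
    ;   ctx_clause(C2, G, B)
    ).
ctx_clause(M, G, B) :- mod_clause(M, G, B).

% defines(Context, Goal): Context has at least one clause for Goal's predicate.
defines(Ctx, G) :-
    functor(G, Name, Arity),
    functor(Template, Name, Arity),
    \+ \+ ctx_clause(Ctx, Template, _).

% solve(Context, Goal): prove Goal in Context.
solve(_, true) :- !.
solve(C, (A, B)) :- !, solve(C, A), solve(C, B).
solve(_, M #> G) :- !, solve(M, G).              % ignore the current context
solve(C, M >> G) :- !, solve(union(M, C), G).    % dynamic union
solve(C, M @> G) :- !, solve(over(M, C), G).     % dynamic overriding union
solve(C, G) :- ctx_clause(C, G, Body), solve(C, Body).
```

Starting in the context local, the goal dept_db #> cit(T) fails, because dept_db defines cit/1 in terms of tr_cit/1 but holds no tr_cit/1 facts. The goal dept_db >> cit(T) yields "Local report", "Dept paper", and "Local draft", since both definitions of cit/1 and all facts are visible. The goal dept_db @> cit(T) yields only "Local report": the definition of cit/1 in dept_db hides the local one, while tr_cit/1, which dept_db does not define, is still taken from the current context.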

4 Knowledge-Based Query Processing on the Web

Complex forms of query processing are possible with lightweight deductive databases. We demonstrate this with a longer example involving the logic programming encoding of the domain-specific information within Web pages, and knowledge about the structure of Web sites of a certain type. This information is used to guide automated search.

4.1 Page Types

We assume that research in a Computer Science department is organised into sections. A section is divided into groups, and sections and groups are composed from projects. Each project consists of researchers. This follows the organisation of research in the Computer Science department at the University of Melbourne (see footnote 4) as of May 1996.

When describing the structure of a Web site, we use the notion of page types. Page types are names for sets of pages (some of which may contain only one page). They enable individual pages to be referred to by their types rather than by their URLs which would be too specific, and may change over time. Page types also allow collections of pages to be conveniently addressed. We utilise the following page types in our example: dept, research, project, project_members, and researcher. These page types will be used for describing our Web search strategies.

The Web structure for a Computer Science department is shown in Figure 1. The departmental home page (of type dept) has a link to a page (of type research) containing research information (in HTML). This information includes a description of the relationships between sections, groups, and projects, and links to pages describing projects (of type project). Each project page has a link to a page containing information about its members (of type project_members), and each project_members page has links to the members' home pages (which are of type researcher). The publication citations for each project are distributed (and separately maintained by authors) in database modules accessible from the home pages of the researchers.

Footnote 4: The home page of the University of Melbourne Computer Science department is at http://www.cs.mu.oz.au/.

[Figure 1 diagram: dept -> research -> project -> project_members -> researcher -> (citation database module)]

Figure 1: A representative diagram of the hypertext structure rooted at a departmental home page. The arrows denote the sequence in which the various page types are reached, starting from the dept page.

4.2 Searching for Citations

We wish to develop code to find all the citations for a given section. Two approaches are possible:

- We can store the URLs of all the relevant citation database modules for a section in a pre-determined module, and query the citations using it. The authors/3 predicate of §3.3 employs this technique.
- Starting from some pre-determined page, we can search the Web site for the URLs of the citation database modules.

The first method requires the maintenance of a module containing the database module URLs. For example, if a researcher leaves or joins a section, or if the URLs of pages change, then the module has to be modified. For this reason, we shall use the second method.

Finding database modules consists of two main steps:

1. use the given section name to find the constituent project names; and
2. using those names, search the Web site by following the sequence of page types shown in Figure 1, starting from the dept page, until the relevant database modules are found.

The search starts from the dept page, rather than the research page, in order to accommodate changes to the research page's URL. The URL of the dept page is least likely to change. In order to implement the above strategy, we need to formalise the relationships between sections, groups, and projects (to carry out (1)), and specify the Web structure (in order to do (2)).

4.3 Representing Knowledge

Representing Research Information. To find the projects contained in a given section, we can parse the research page, and extract the required information. However, we would need to do that each time a query is processed. A more efficient alternative is to represent the knowledge on that page as logic programming facts and rules, and reason with them. The relationships between sections, groups, and projects, are depicted as a hierarchy of concepts in Figure 2. This is a representative diagram; a typical department would have many more sections, groups, and projects.

[Figure 2 diagram: section(programming_langs) has the parts project(prolog_techniques) and group(new_declarative_langs); group(new_declarative_langs) has the parts project(mercury) and project(lygon).]

Figure 2: A hierarchy of sections, groups and projects. The edges represent the has_part/2 relationships.

The relationships are represented using has_part/2 facts, which specify how sections, groups, and projects are related. contains/2 defines the transitive closure of the has_part/2 relation.

    has_part(section(programming_langs),group(new_declarative_langs)).
    has_part(group(new_declarative_langs),project(mercury)).
    has_part(group(new_declarative_langs),project(lygon)).
    has_part(section(programming_langs),project(prolog_techniques)).

    contains(X,Y) :- has_part(X,Y).
    contains(X,Z) :- has_part(X,Y), contains(Y,Z).
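To see what the transitive closure delivers for step (1) of the search, the has_part/2 facts and contains/2 rules can be loaded as they stand and queried with setof/3. The helper projects_in/2 below is a hypothetical name introduced for this sketch; everything else is taken from the facts and rules above.

```prolog
% has_part/2 facts and contains/2 rules, as given in the text.
has_part(section(programming_langs), group(new_declarative_langs)).
has_part(group(new_declarative_langs), project(mercury)).
has_part(group(new_declarative_langs), project(lygon)).
has_part(section(programming_langs), project(prolog_techniques)).

contains(X, Y) :- has_part(X, Y).
contains(X, Z) :- has_part(X, Y), contains(Y, Z).

% projects_in(+SectionOrGroup, -Projects): sorted list of the project(Name)
% terms reachable from SectionOrGroup through has_part/2 (hypothetical helper).
projects_in(Unit, Projects) :-
    setof(project(Name), contains(Unit, project(Name)), Projects).
```

The goal projects_in(section(programming_langs),Ps) binds Ps to [project(lygon),project(mercury),project(prolog_techniques)]: two projects are reached through the group and one directly, and setof/3 returns them in standard order. Note that contains/2 terminates here because the has_part/2 facts form a hierarchy; with cyclic facts the second rule would loop under Prolog's top-down evaluation.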

Representing the Web structure. The sequence of page types in Figure 1 can be captured using precedes/2 facts:

    precedes(dept,research).
    precedes(research,project).
    precedes(project,project_members).
    precedes(project_members,researcher).

4.4 An Implementation of Citation Finding

A section name (e.g., "Programming Languages") can be selected by a user from a menu displayed by the browser. This is translated into a section/1 term, and a goal like the following is generated:

get_citations(section(programming_langs),Citations)

Evaluation of the goal results in Citations being instantiated with a list of pub_cit/6 facts. These are formatted and displayed by the browser. get_citations/2 uses the logic programming representation of the research information (contains/2 and has_part/2) to retrieve all projects belonging to the specified section, and then collects the citations from those projects:

    get_citations(Section,Citations) :-
        setof(project(Name),contains(Section,project(Name)),Projects),
        collect_citations(Projects,Citations).

collect_citations/2 uses the Web structure information (coded as precedes/2 facts) and the project names to search over the Web site for citation database module IDs. A union of these databases is accessed to obtain the citations for the projects. The definition of collect_citations/2 is:

    collect_citations(Projects,Citations) :-
        location_of(dept,DeptURL),
        setof(ModId,find_citmod(dept,DeptURL,Projects,ModId),ModIds),
        retrieve_citations(ModIds,Citations).

The call to location_of/2 retrieves the URL of the dept page. The URLs of the citation database modules are found by calling find_citmod/4 in a setof/3 call. The predicate retrieve_citations/2 retrieves citations from the union of the modules using code very much like that in §3.2.

The initial call to find_citmod/4 in collect_citations/2 searches from the dept page until a researcher page is found. The database module URL on that page is then extracted by extract_modURL/2. find_citmod/4 makes use of a predicate next_link/5, which uses heuristics to determine the next link to follow on any page. The definition of find_citmod/4 is as follows:

    find_citmod(researcher,PageURL,_,m_id(CitationsModURL)) :-
        extract_modURL(PageURL,CitationsModURL).
    find_citmod(PageType,PageURL,Projects,CitationsModId) :-
        next_link(PageType,PageURL,Projects,NextPageType,NextPageURL),
        find_citmod(NextPageType,NextPageURL,Projects,CitationsModId).

next_link/5 examines the links on the current page (of type PageType and of URL PageURL) to determine the next page to visit, returning its type (i.e., NextPageType) and URL (i.e., NextPageURL). The choice of page is constrained by the list of project names in Projects, obtained earlier from contains/2. The definition of next_link/5 is as follows:

    next_link(PageType,PageURL,Projects,NextPageType,NextPageURL) :-
        precedes(PageType,NextPageType),
        m_id(PageURL)#>link(Label,NextPageURL),
        useful_link(Projects,NextPageType,Label,NextPageURL).

next_link/5 initially obtains the next page type of interest, by using precedes/2. It also retrieves a link from its current page by calling link/2. The project names, the next page type, the label of the link, and the new page URL, are used in useful_link/4 to evaluate if the link leads the search closer to a citation database module. If it does not, then backtracking will cause a reevaluation of link/2 to obtain another link.

One technique used in useful_link/4 is to analyse the link's label, which is a string, to see if the page it leads to is of type NextPageType. This is achieved by matching the words in the label string against particular cue words. For some page types, the cue words are stored in cue_word/2 facts. For instance, the word "Research" is a cue word for pages of type research:

cue_word("Research",research).

For other page types, the cue words may not be stored in facts. For example, the project names are used as cue words to judge links to project pages. Other heuristics useful for constraining search are described in [18]. extract_modURL/2 uses similar techniques to useful_link/4 to find the database module URL on the researcher's page.
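The search described in this section can be tried out without any network access by replacing the Web with a handful of facts. In the sketch below, find_citmod/4, next_link/5, precedes/2, and the "Research" cue word follow the text, while everything else is an assumption made for illustration: link/3 stands in for m_id(PageURL)#>link(Label,NextPageURL), module_on/2 stands in for whatever extract_modURL/2 really inspects, the remaining cue words and all page names are invented, and labels are atoms rather than strings so that the ISO predicate sub_atom/5 can do the matching.

```prolog
precedes(dept, research).
precedes(research, project).
precedes(project, project_members).
precedes(project_members, researcher).

% link(PageURL, Label, TargetURL): hypothetical in-memory Web site.
link(dept_home,        'Research',                   research_page).
link(dept_home,        'Teaching',                   teaching_page).
link(research_page,    'The mercury project',        mercury_page).
link(research_page,    'The lygon project',          lygon_page).
link(research_page,    'The hardware project',       hardware_page).
link(mercury_page,     'Project members',            mercury_members).
link(lygon_page,       'Project members',            lygon_members).
link(hardware_page,    'Project members',            hardware_members).
link(mercury_members,  'Home page of A. Researcher', researcher_a).
link(lygon_members,    'Home page of B. Researcher', researcher_b).
link(hardware_members, 'Home page of C. Researcher', researcher_c).

% module_on(ResearcherPage, ModuleURL): hypothetical stand-in for page parsing.
module_on(researcher_a, cit_a).
module_on(researcher_b, cit_b).
module_on(researcher_c, cit_c).

cue_word('Research', research).
cue_word('members', project_members).   % assumed cue word
cue_word('Home page', researcher).      % assumed cue word

find_citmod(researcher, PageURL, _, m_id(CitationsModURL)) :-
    extract_modURL(PageURL, CitationsModURL).
find_citmod(PageType, PageURL, Projects, CitationsModId) :-
    next_link(PageType, PageURL, Projects, NextPageType, NextPageURL),
    find_citmod(NextPageType, NextPageURL, Projects, CitationsModId).

next_link(PageType, PageURL, Projects, NextPageType, NextPageURL) :-
    precedes(PageType, NextPageType),
    link(PageURL, Label, NextPageURL),
    useful_link(Projects, NextPageType, Label, NextPageURL).

% Links to project pages are judged by project name, others by cue words.
useful_link(Projects, project, Label, _URL) :-
    member(project(Name), Projects),
    once(sub_atom(Label, _, _, _, Name)).
useful_link(_Projects, Type, Label, _URL) :-
    cue_word(Cue, Type),
    once(sub_atom(Label, _, _, _, Cue)).

extract_modURL(PageURL, ModURL) :- module_on(PageURL, ModURL).
```

With Projects bound to [project(mercury),project(lygon)], the goal setof(M,find_citmod(dept,dept_home,Projects,M),Ms) collects m_id(cit_a) and m_id(cit_b). The hardware project is never visited, because its link label matches none of the requested project names, which is the pruning effect the heuristics are meant to provide.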

4.5 Discussion

This example illustrates how the techniques of knowledge representation and automated Web search can be used to implement complex processing of queries on lightweight deductive databases. Logic programming rules are used to represent a concept hierarchy, and to specify search behaviour over Web pages. The former maps a section name to its component project names, while the latter handles changes to the contents of the Web pages.

The concept hierarchy has the drawback that it needs to be updated if a research section, group, or project, is added or removed. However, this can be done automatically, i.e. by parsing the information on the research page into logic programming facts. In fact, a script has been written which parses the page containing information on sections, groups, and projects at the University of Melbourne (see footnote 5) into logic programming facts.

The facts and rules of the previous subsections reside in a separate module at the Web site. This module is loaded first, and other pages and database modules are loaded during query evaluation.

5 Related Work

5.1 Database on Web Pages

Dobson et al. [8, 9] utilise additional HTML tags to embed relational databases (called lightweight databases) in Web pages. Essentially, an entity-relationship diagram is mapped onto the hypertext structure of the Web. Relationships between entities on different pages are specified by hypertext links, with attributes defining the relationships. Lightweight databases have been used to generate HTML documents, for indexing, and for databases spreading over several servers.

Footnote 5: The URL of this page is http://www.cs.mu.oz.au/alt/research/research.html.

Our work is motivated by their use of relational databases. However, we employ deductive databases to provide more powerful modelling capabilities. We can also specify module composition in logic programming rules.

Sandewall [21] proposes the World-Wide Data Base, where a database consists of downloadable short text files, each file containing an object description. An object consists of properties, represented in a specialised language, and can reference other database files (i.e., objects), or HTML pages. Other object-oriented concepts, such as message-passing, are not employed. The main application discussed is HTML page generation, where objects store resources for the generation of pages. In particular, values of properties may be scripts (in LISP) that specify how to generate HTML expressions.

Our databases are deductive, whereas theirs are object-based. However, their objects can be expressed in our language. Also, they do not provide a uniform query language for their objects database, while we use logic programming for that purpose. Their use of one file per object may incur heavy network transmission costs. In our approach, modules can contain multiple relations.

5.2 Comparison with Deductive Database Systems

Our approach to implementing lightweight deductive databases uses Prolog with LogicWeb extensions. However, existing deductive database systems [20, 13] differ from Prolog systems in several ways, including:

- Query optimisation. Query processing often finds all answers to the query, i.e. the "set at a time" paradigm is more efficient than the "tuple at a time" paradigm of Prolog. To facilitate this, optimised bottom-up evaluation is often used, rather than the top-down evaluation of Prolog systems.
- Restrictions on rules. The rules in deductive database systems are range-restricted, i.e. all variables that appear in the head of the clause must appear in the body. This implies that all facts must be ground. This removes the need for full unification, thereby increasing efficiency. Another common restriction is that all terms in the program are variables or constants. This ensures that logical entailment is decidable. A logic programming language with this restriction is Datalog [20].

A major difference between our lightweight deductive databases and existing deductive database systems is that our databases are not updateable. If updates were possible, transaction processing, and concurrency control would also have to be available. We have not yet considered updates because the main focus of this work is structuring information on the Web.
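The two differences listed above can be illustrated with a short Prolog sketch of naive bottom-up ("set at a time") evaluation, together with a check that the rules are range-restricted. This is a textbook-style naive fixpoint written for illustration, not the optimised evaluation used by the systems cited, and the predicate names rule/2, base_facts/1, range_restricted/2, all_in/2, and fixpoint/2 are hypothetical. The rules and facts reuse the contains/2 example of §4.3, with rule bodies written as lists.

```prolog
% rule(Head, Body): Body is a list of atoms to be matched against known facts.
rule(contains(X,Y), [has_part(X,Y)]).
rule(contains(X,Z), [has_part(X,Y), contains(Y,Z)]).

% The base relation, given as a list of ground facts.
base_facts([ has_part(section(programming_langs), group(new_declarative_langs)),
             has_part(group(new_declarative_langs), project(mercury)),
             has_part(group(new_declarative_langs), project(lygon)),
             has_part(section(programming_langs), project(prolog_techniques)) ]).

% range_restricted(+Head, +Body): every variable of Head also occurs in Body.
range_restricted(Head, Body) :-
    term_variables(Head, HeadVars),
    term_variables(Body, BodyVars),
    forall(member(V, HeadVars), ( member(W, BodyVars), W == V )).

% all_in(+Atoms, +Facts): every atom in the list matches some known fact.
all_in([], _).
all_in([G|Gs], Facts) :- member(G, Facts), all_in(Gs, Facts).

% fixpoint(+Facts0, -Facts): apply all rules to the whole fact set at once,
% add the newly derived facts, and repeat until nothing new is derived.
fixpoint(Facts0, Facts) :-
    findall(H,
            ( rule(H, Body), all_in(Body, Facts0), \+ memberchk(H, Facts0) ),
            New0),
    sort(New0, New),
    (   New == []
    ->  Facts = Facts0
    ;   append(Facts0, New, Facts1),
        fixpoint(Facts1, Facts)
    ).
```

Calling base_facts(B), fixpoint(B,All) derives six contains/2 facts in two rounds and stops in the third, when no rule produces anything new. Because both rules pass range_restricted/2 and the base facts are ground, every derived head is ground as well, which is why the duplicate test with memberchk/2 is safe.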

5.3 Knowledge-Based Access to Information

The Information Manifold [16, 15] is a system for building a knowledge base representing the user's interests. This uses a combination of Horn rules and the CLASSIC knowledge representation language to describe information sources, and taxonomy relationships among them. The knowledge base is also used to process the results of searches submitted to multiple Web index servers.

Our framework supports knowledge-based query processing. Users can build their own knowledge base and query processing rules, perhaps on top of those provided by information servers.

Web pages have been generated from knowledge bases by using user profiles [14], and user queries [11]. We can achieve a similar functionality when the results of lightweight deductive database queries are Web documents.

Barcaroli et al. [2] represent hypertext at a Web site using a knowledge base. Their system answers user queries by returning a sequence of links leading to the page containing the answer. In our approach, the information provider can provide a similar capability by mapping user queries to appropriate URLs.

These approaches typically make use of knowledge representation languages based on description logics in order to represent concept models. We restrict ourselves to logic programming since it is a versatile paradigm, used in a range of AI problems (e.g., expert systems and knowledge representation), and in deductive databases [22, 20].

Context logic, an extension of first order logic where sentences are true with respect to a given context, has been used to integrate databases [12]. Axioms are written which lift sentences from several contexts into a common one. This is similar to defining a new relation in terms of relations in other modules by using context switching, as shown in §3.1.

6 Conclusions

The Web should be enhanced with richer information content, and more sophisticated processing capabilities [3]. Our lightweight deductive databases provide improved querying, automated searching, and sophisticated processing and extraction of structured information on the Web. Lightweight deductive databases can be easily combined, and extended during query processing, using well-established techniques from the areas of deductive databases and structured logic programming.

We conclude with several possible avenues for future work. We have not dealt with integrity constraints in this paper. For example, the uniqueness property of primary keys must be maintained when databases are combined. There exist methods for combining knowledge bases with integrity constraints [1].

We have demonstrated the utility of a small set of composition operations, but it may be useful to explore other operations. A modelling language (with constructs closely related to the application domain) could be used to describe particular kinds of knowledge. This may take the form of additional syntax, which can be translated into Prolog.

The structured information provided by lightweight deductive databases should be more readable by intelligent software agents [10] than plain HTML text. An interesting possibility is to use lightweight deductive databases to store agent functionality, which an agent could upload as it searches the Web.

Lightweight deductive databases must be manually generated at present. However, in some cases, lightweight deductive databases can be generated from text. For example, if citations are sufficiently marked-up in HTML, they can be converted into facts automatically. This idea is used in §4, to generate the logic programming representation of the research information.

Approaches to automatically generating new lightweight deductive databases from existing ones, given database schemas, should be investigated. Work on agent-based knowledge extraction (or data mining) from databases using Inductive Logic Programming [7] might be applicable.
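As an illustration of generating facts from marked-up text, the following sketch converts citations in an HTML fragment into tuples that could be printed as facts. It is not the conversion used in §4. The markup convention (`class="citation"` on list items, with `author`, `title`, and `year` spans), the sample page, and the `citation` predicate name are all assumptions made for this example.

```python
from html.parser import HTMLParser


class CitationParser(HTMLParser):
    """Collect one ("citation", author, title, year) fact per marked-up item.

    Assumed (hypothetical) markup: <li class="citation"> containing
    <span class="author">, <span class="title"> and <span class="year">.
    """

    FIELDS = ("author", "title", "year")

    def __init__(self):
        super().__init__()
        self.facts = []       # completed citation facts
        self._current = None  # fields of the citation being read, if any
        self._field = None    # name of the field whose text is being read

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "citation":
            self._current = {}
        elif tag == "span" and cls in self.FIELDS and self._current is not None:
            self._field = cls

    def handle_data(self, data):
        if self._field is not None:
            previous = self._current.get(self._field, "")
            self._current[self._field] = previous + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            c = self._current
            # Only emit a fact when every expected field was found.
            if all(name in c for name in self.FIELDS):
                self.facts.append(
                    ("citation", c["author"].strip(), c["title"].strip(), int(c["year"]))
                )
            self._current = None


# Made-up sample page; the second list item is not marked as a citation.
page = """
<ul>
  <li class="citation"><span class="author">A. Author</span>,
      <span class="title">An Example Title</span>,
      <span class="year">1996</span>.</li>
  <li>Not a citation.</li>
</ul>
"""

parser = CitationParser()
parser.feed(page)
parser.close()
facts = parser.facts
```

Items that lack any of the three fields are skipped rather than guessed at, which keeps the generated database free of partial facts at the cost of missing poorly marked-up citations.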

References

[1] C. Baral, J. Minker, and S. Kraus. Combining Multiple Knowledge Bases. IEEE Transactions on Knowledge and Data Engineering, 3(2):208–221, June 1991.

[2] C. Barcaroli, L. Iocchi, M. Lenzerini, and D. Nardi. Knowledge-Based Access to the Network. In Fifth International World-Wide Web Conference Workshop on "Artificial Intelligence-based tools to help WWW users", http://www.info.unicaen.fr/serge/3wia/workshop/papers/paper6.html, May 1996.

[3] T. Berners-Lee, R. Cailliau, A. Luotonen, H.F. Nielsen, and A. Secret. The World-Wide Web. Communications of the ACM, 37(8):76–82, August 1994.

[4] A. Brogi. Program Construction in Computational Logic. PhD thesis, Universita di Pisa-Genova-Udine, 1993.

[5] A. Brogi, P. Mancarella, D. Pedreschi, and F. Turini. Composition Operators for Logic Theories. In J.W. Lloyd, editor, Computational Logic, Symposium Proceedings, pages 117–134. Springer-Verlag, 1990.

[6] M. Bugliesi, E. Lamma, and P. Mello. Modularity in Logic Programming. Journal of Logic Programming, pages 443–502, 1994.

[7] W.H.E. Davies and P. Edwards. Agent-Based Knowledge Discovery. ftp://ftp.csd.abdn.ac.uk/pub/pedwards/daviesss95.ps.

[8] S. Dobson and V. Burrill. Towards Improving Automation in the World-Wide Web. http://www.scit.wlv.ac.uk/ndisd/dobson.ps.

[9] S.A. Dobson and V.A. Burrill. Lightweight Databases. In Proceedings of the 3rd International World-Wide Web Conference, http://www.igd.fhg.de/www/www95/proceedings/papers/54/darm.html, April 1995.

[10] O. Etzioni and D. Weld. Intelligent Agents on the Internet: Fact, Fiction, and Forecast. IEEE Expert, 10(4):44–49, August 1995.

[11] J. Euzenat. Knowledge Bases as Web Page Backbones. In Fifth International World-Wide Web Conference Workshop on "Artificial Intelligence-based tools to help WWW users", http://www.info.unicaen.fr/serge/3wia/workshop/papers/paper10.html, May 1996.

[12] A. Farquhar, A. Dappert, R. Fikes, and W. Pratt. Integrating Information Sources Using Context Logic. In On-line Working Notes of the AAAI Spring Symposium Series on Information Gathering from Distributed, Heterogeneous Environments, http://www.isi.edu/sims/knoblock/sss95/farquhar.ps, January 1995.

[13] J. Harland and K. Ramamohanarao. An Aditi Implementation of a Flights Database. Applications of Logic Databases, 1994.

[14] T. Hoppe, C. Kindermann, O.K. Paulus, and R. Tolksdorf. The MIHMA Project: a Web Information Service Based on Description Logics. In Fifth International World-Wide Web Conference Workshop on "Artificial Intelligence-based tools to help WWW users", http://www.info.unicaen.fr/serge/3wia/workshop/papers/paper25.html, May 1996.

[15] T. Kirk. Knowledge Based Access to Information on the World-Wide Web. In Fifth International World-Wide Web Conference Workshop on "Artificial Intelligence-based tools to help WWW users", http://www.info.unicaen.fr/serge/3wia/workshop/papers/paper20.html, May 1996.

[16] T. Kirk, A.Y. Levy, Y. Sagiv, and D. Srivastava. The Information Manifold. In On-line Working Notes of the AAAI Spring Symposium Series on Information Gathering from Distributed, Heterogeneous Environments, http://www.isi.edu/sims/knoblock/sss95/kirk.ps, January 1995.

[17] S.W. Loke and A. Davison. Logic Programming with the World-Wide Web. In Proceedings of the 7th ACM Conference on Hypertext (available at http://www.cs.unc.edu/barman/HT96/P14/lpwww.html), pages 235–245. ACM Press, March 1996.

[18] S.W. Loke, A. Davison, and L. Sterling. CiFi: An Agent for Citation Finding on the World-Wide Web. To appear in the Proceedings of the 4th Pacific Rim International Conference on Artificial Intelligence, 1996. Also available as Technical Report 96/4 at http://www.cs.mu.oz.au/tr db/mu 96 04.ps.gz.

[19] J.J. Moreno-Navarro. Tuple Inheritance: A New Kind of Inheritance for (Constraint) Logic Programming (Extended Abstract). In G. Levi and M. Martelli, editors, Proceedings of the 12th International Conference on Logic Programming, page 829. MIT Press, 1995. Full paper at http://gedeon.ls. .upm.es/jjmoreno/pap bib.html#inh.

[20] K. Ramamohanarao and J. Harland. An Introduction to Deductive Database Languages and Systems. VLDB Journal, 3(2):107–122, April 1994.

[21] E. Sandewall. Towards a World-Wide Data Base. In Proceedings of the 5th International World-Wide Web Conference, http://www5conf.inria.fr/ ch html/papers/P54/Overview.html, May 1996.

[22] L. Sterling and E. Shapiro. The Art of Prolog. MIT Press, 1994.
