An Agent Based Autonomic Semantic Platform

Dario Bonino, Alessio Bosca, Fulvio Corno
Politecnico di Torino, Dipartimento di Automatica ed Informatica, Torino, Italy
{dario.bonino, alessio.bosca, fulvio.corno}@polito.it
Abstract

In 2001, two distinct revolutionary approaches to distributed system integration and web content diffusion were proposed: Autonomic Computing and the Semantic Web. These are currently two of the most active research fields in information technology. After three years, many common themes have emerged: while Autonomic Computing research focuses on ways to define "self-surviving" systems, the Semantic Web strives to define web content semantics, producing complex and distributed services on the Internet. In a sense, the first area approaches systems by defining their "internal, operational knowledge", while the second defines knowledge about the environment in which software entities and humans cohabit. This paper lies at the convergence of the two approaches and proposes an agent-based system able to semantically characterize and search a set of web resources, and to maintain and update its knowledge base by means of monitoring sensors and interaction with external web services. First experimental results confirm the validity of the combined approach in terms of search recall, thanks to the autonomous update of covered topics.
1. Introduction

The Semantic Web vision proposes a new generation of the web in which information is given well-defined meaning, better enabling collaboration between humans and machines in accomplishing everyday tasks. However, handling the power of semantics is not a trivial issue, as the Artificial Intelligence and Semantic Web research communities have experienced. Several solutions for introducing "semantic descriptors" on the web have been proposed, along with a set of semantic toolkits [3,6] and frameworks [2] aimed at providing the basis for exploiting the next generation of the web. Semantic platforms are often composed of some kind of reasoner, a storage facility backed by a database or a file, an ontology that describes the knowledge domain, utility tools for ontology learning [7] and merging [8], and tools for mapping syntactical entities (i.e., resources) to semantically rich objects. These platforms become more and more interrelated and diverse, and software architects are less and less able to anticipate and
correctly design the interactions between the involved entities, leaving problems to be solved at runtime. In a short time it is likely that such systems, as is already happening in other fields, will become too complex to manage, even for skilled programmers. The most suitable solution to this critical problem is to design and develop so-called "Autonomic Computing (AC)" systems [11], i.e., systems that possess explicit knowledge about their components, their state, the resources available for accomplishing tasks, and the neighbor systems to interact with. Such systems are able to successfully handle and autonomously trigger upgrades of their internal components, and to manage internal reconfigurations, looking for workflow improvements that accomplish tasks better, in less time and with fewer resources.

As related work, IBM is promoting several research projects related to AC [18], and there are several interesting works ranging from the definition of negotiation paradigms based on elicitation [19], to the definition of requirements for autonomic computing environments [20], to the integration of autonomic principles into complex applications such as operating systems [21]. Several autonomic computing frameworks have already been proposed, such as eModel [22], and autonomic computing programming toolkits are available [23,24].

In this paper we propose an agent-based semantic platform (DOSE [9]) able to automatically index new resources in response to search failures and to the auto-detection of poorly covered conceptual areas. Since one of the key issues of semantics on the web is the ability to constantly provide up-to-date and semantically rich information, autonomic principles should be integrated to free developers from repeated, annoying and error-prone manual updating. In addition, semantic platforms should be able to face, in real time, requests relevant to the conceptual area they are tied to, consequently updating their internal knowledge base. The platform we propose is therefore able to autonomously interact with existing web services, like the Google web API, and with existing active web entities, like daemons, scripts and server pages, in order to improve the quality of its internal knowledge base.

The paper is organized as follows: section 2 gives a brief introduction to the Semantic Web and to existing semantic platforms, section 3 describes the semantic platform in which AC principles are used, and section 4
details how those guidelines have been integrated into a collaborative agent environment. Section 5 shows some experimental results while section 6 provides conclusions and proposes some future work.
2. The Semantic Web

In May 2001, Tim Berners-Lee published the Semantic Web manifesto in Scientific American. In that article he proposed a new vision of the web: "The Semantic Web (SW) is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation." In his view, the next generation of the web will be strongly based on semantics, in order to allow effective communication between humans and machines and a powerful collaboration between them in accomplishing tasks. As he said, the Semantic Web will bring structure to the meaningful content of web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users. Such an agent coming to a clinic's web page will know not just that the page has keywords such as "treatment, medicine, physical, therapy" (as might be encoded today) but also that Dr. Hartman works at that clinic on Mondays, Wednesdays and Fridays [1].

The ideas formalized by Berners-Lee came out of years of research on Artificial Intelligence and of the relatively recent research on the web, and drew together many researchers from all over the world, promoting further exploration towards the next generation of the web. During the past three years the Semantic Web community has been one of the most active research communities in the world, producing many diverse technologies and applications that try to turn the SW vision into reality. Many milestones have been reached in this exciting process; in particular, sufficiently wide agreement has been reached on how to integrate semantics on the web. There is currently no unique recipe for inserting "meaning descriptors" into the existing web, but the main requirements to satisfy for the development of scalable and useful semantic applications are quite clear. Researchers found that, for an effective inclusion of semantics on the current web, the meaning information should be definable by people or machines potentially different from the content creators, and the commonly agreed way to fulfill this requirement is the definition of entities called "semantic annotations" pointing at the described resources. Consequently, several works proposed techniques to provide semantic information through independent annotations, offering services for annotation editing, storage and retrieval [2]. As those systems reached a significant diffusion in the academic world, some problems were noticed; in particular, it became clear that manually annotating the whole existing web was not feasible, so the subsequent evolution in research involved the design of automatic annotation platforms.
At the same time other issues became important, related to ontology design and reasoning: in the early SW, many ontologies were developed to describe specific knowledge domains. Since one of the assumptions on which the Semantic Web is based is the ability for everyone to write their own ontology describing their little piece of the world, the integration of such ontologies, the elimination of redundancies and the interoperability between different knowledge bases assumed high relevance, fostering research on ontology merging. Moreover, there was often a crisp separation between the world described by ontologies and reality; therefore, frameworks for ontology learning from web resources were needed.

In this active environment many projects try to address at least some of the issues just cited: the W3C, with Amaya [3], focuses on providing tools for semantic annotation during normal web surfing, while working at the same time on standardization and producing powerful languages for ontology description, from RDF [4] to the latest OWL [5]. Alexander Maedche, Steffen Staab and their colleagues at Karlsruhe University proposed many tools for ontology editing (OntoEdit [6]), ontology merging and learning [7], and also designed an ontology-based infrastructure for e-commerce sites named KAON [2]. Systems such as Yawas [15] and Annotea [14] allow users to create and share annotations attached to web documents, i.e., comments, notes, explanations, or other types of external remarks that can be useful to readers. The On-To-Knowledge project [8], funded by the European Commission, produced many tools for ontology editing, merging and learning, and wrote several deliverables about methodologies and techniques for ontology design. Many other important research efforts provide hints and solutions for introducing semantics on the current web; the authors of this paper proposed an automatic semantic annotation platform able to annotate resources at the document substructure level, independently of language. A newly funded European project, CABLE [10], is working on Case Based e-Learning, leveraging the power of semantics through ontology development, maintenance and reasoning for training educators in social environments.
3. DOSE: a Distributed Open Semantic Elaboration platform

In this paper we take as a case study a semantic platform (DOSE [9]): starting from an agent-based deployment of the platform, we apply autonomic computing principles to provide a fully functional automatic semantic annotation platform, able to detect its degree of resource coverage with respect to its internal ontology and to cooperate with other services on the web, from WordNet to the Google web API, in order to update and optimize its internal knowledge base. The platform can provide facilities for semantic indexing of resources and search
services independently of the language of queries and resources, remaining constantly up to date. In this section we introduce the foundations of the DOSE platform, while the following section shows how the autonomic computing principles of self-configuration, self-management and self-optimization can be successfully integrated into such an agent platform, allowing the internal knowledge base to be kept automatically up to date and new requests regarding uncovered conceptual areas to be handled dynamically.
3.1 System architecture

DOSE [9] is an agent-based semantic elaboration platform able to provide Semantic Web services such as automatic annotation and retrieval. The conceptual organization of the platform is shown in figure 1 and aims at being at the same time readily usable with today's technologies and easily scalable and portable to the full power of the Semantic Web. The underlying motivation is to provide an effective tool for the automatic annotation of web resources at the proper level of detail, by identifying specific document substructures using XPointers.
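As an illustration, a fragment identifier of this kind might look like the following; this is a hypothetical example using the xpointer() scheme, as the exact scheme adopted by the platform is not detailed here:

```
http://www.example.org/page.html#xpointer(/html[1]/body[1]/div[2]/p[3])
```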
Semantic annotations related to the same web document are hierarchically organized with respect to the document structure, in order to provide automatic detection of the level of detail and to collapse redundant information in the search phase. DOSE is logically organized into a four-layer architecture (figure 2), where agents are located on layers depending on the kind of service they provide. Communication with external applications uses the XML-RPC protocol, allowing easy integration of the provided semantic functionalities into existing web sites. The "Service layer" includes the agents exposing interfaces to the outer world: the Indexer and the Clever Search Engine. The Indexer coordinates the automatic annotation of a given set of resources by coordinating lower-level service providers, while the Clever Search Engine uses the Semantic Mapper module to translate text queries into conceptual queries, and performs semantic searches on the internal annotation base. The ontology structure, and in particular the relationships between concepts, offers the means for search refinement, by applying automatic relevance feedback over the Basic Search engine module located one level deeper in the architecture. Search results are composed of many fragments coming from different web resources and can be accommodated into one or more result pages using relevance criteria.
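As a sketch of how an external application might invoke the search service over XML-RPC, the following Java fragment uses the Apache XML-RPC client library; the endpoint URL, method name and parameter layout are assumptions made for illustration, not the documented DOSE interface.

```java
import java.net.URL;
import org.apache.xmlrpc.client.XmlRpcClient;
import org.apache.xmlrpc.client.XmlRpcClientConfigImpl;

public class DoseSearchClient {
    public static void main(String[] args) throws Exception {
        // Point the client at the (hypothetical) DOSE XML-RPC endpoint.
        XmlRpcClientConfigImpl config = new XmlRpcClientConfigImpl();
        config.setServerURL(new URL("http://dose-host:8080/xmlrpc"));
        XmlRpcClient client = new XmlRpcClient();
        client.setConfig(config);

        // Hypothetical method name: a textual query plus its language code.
        Object[] params = new Object[] { "lavoro disabile", "it" };
        Object[] fragments = (Object[]) client.execute("dose.cleverSearch", params);
        for (Object f : fragments) {
            System.out.println(f); // XPointer-identified fragments, by relevance
        }
    }
}
```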
Fig. 1. Conceptual Organization of the DOSE (Distributed Open Semantic Elaboration) platform.
DOSE works on concepts, or topics, defined by a knowledge representation model (an ontology): each concept is a language-independent entity, related to similar entities by means of semantic relationships (inheritance, etc.). Each topic in the ontology is associated with a set of lexical entities, representing the words usually adopted to refer to that concept, for all supported languages. Resources are composed of fragments extracted from the original information source and refer to ontology concepts through semantic annotations.
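The knowledge model just described can be summarized with a minimal data-model sketch; the class and field names are our assumptions, not the actual DOSE code:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Language-independent concept, related to other concepts and carrying,
// for each supported language, the lexical entities used to refer to it.
class Topic {
    String id;
    Set<Topic> related = new HashSet<>();                        // inheritance, etc.
    Map<String, Set<String>> lexicalEntities = new HashMap<>(); // language -> words
}

// Links a fragment of an information source to an ontology concept.
class Annotation {
    String xpointer;   // identifies the fragment within the resource
    Topic topic;       // the concept the fragment is about
    double relevance;  // strength of the association
}
```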
Fig. 2. DOSE logical architecture.

The "Kernel layer" is actually composed of two sub-layers: the Kernel front-end layer and the Kernel back-end layer. Agents offering access to semantic and syntactic information are located at the front-end level: the Semantic Mapper, the Annotation Repository, the Substructure Extractor, the Fragment Retriever, the Language Detector and the Basic Search. The Semantic Mapper takes an ontology and a group of lexical entities, and for each input resource returns the collection of ontology concepts the resource is related
with. Each concept of the given ontology is in fact connected to a group of lexical entities (called a synset, as each set of lexical entities is mainly composed of synonyms) that allows mapping between resource terms and semantic entities. Lexical entities are used by classical information retrieval techniques [13] to classify resources and to identify the most reliable associations with the ontology concepts. DOSE allows the annotation of relevant fragments of web resources through the Substructure Extractor module, which can syntactically split a web resource into its basic components and produce unique identifiers using the XPointer syntax [16]. The Annotation Repository stores semantic annotations independently of the annotated resources and is able to identify the appropriate level of detail for annotations by means of generalization relationships. In a search task, more relevant results can therefore be obtained by narrowing or widening the annotation search according to the query. In addition, annotations referring to fragments coming from the same resource can be collapsed into a more general annotation. The Basic Search module performs simple searches using the Annotation Repository, applying a modified vector space retrieval model to provide a set of annotations relevant to a given query, while the Language Detector allows operations on resources written in different languages and facilitates inter-language queries to the system, i.e., it allows a query to be specified in one language while requiring results in another.

Agents belonging to different logical layers are spread across different locations depending on their functional characteristics. Some modules should be resident on a "main" machine for centralized storage and retrieval, e.g., the Annotation Repository and the Search Engine, while the others can migrate towards the information location, thus reducing the overall network load and allowing focused indexing of resources (figure 3). On the main machine also resides a special-purpose agent, the Agency Manager, which coordinates agent migration using different policies for balancing the architecture workload and efficiently managing the distributed semantic annotation process. As an example, consider an indexing scenario: when the Agency Manager receives a new indexing request, it locates the machine that hosts the required information and looks into its internal database for agent availability. If a set of agents able to perform the required task (named an indexing squad) is already up on that machine, it forwards the request to that squad; if no squad is present, the agency creates a new indexing team and migrates those agents to the remote host, as sketched below. The mobile core of the architecture is the so-called "indexing squad", usually composed of a team of five collaborating agents: the Indexer, the Fragmenter, the Retriever, the Language Detector and the Semantic Mapper. This team is provided with a semantic model in the form of an ontology and with a list of resources to be indexed.
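A minimal sketch of this dispatch logic, under the assumption of a simple host-keyed squad registry; all names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of the Agency Manager dispatch logic described above. */
class AgencyManager {
    /** Stub standing in for a team of collaborating agents. */
    static class Squad {
        void migrateTo(String host) { /* move the agents to the remote container */ }
        void index(String uri)      { /* run the distributed annotation task */ }
    }

    private final Map<String, Squad> squadsByHost = new HashMap<>();

    void onIndexingRequest(String uri) {
        String host = locateHost(uri);          // machine hosting the resource
        Squad squad = squadsByHost.get(host);   // is a squad already up there?
        if (squad == null) {                    // if not, create and migrate one
            squad = new Squad();
            squad.migrateTo(host);
            squadsByHost.put(host, squad);
        }
        squad.index(uri);                       // forward the indexing request
    }

    private String locateHost(String uri) {
        return java.net.URI.create(uri).getHost(); // simplistic location policy
    }
}
```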
The squad can migrate over the Internet to the target machine and autonomously accomplish the indexing task. It is constantly in communication with the Agency Manager, providing annotations, and can remain on the remote host at the end of the indexing task, in order to monitor the site and trigger updates of the agency knowledge base. Depending on the kind of remote site to be indexed, the standard squad can be integrated with special-purpose agents tied to specific services or daemons, allowing direct access to dynamic site databases.
4. Autonomic computing solutions in DOSE

One of the core assets of a semantic platform is the richness of the database in which semantic annotations are stored: the quality of search results directly depends on the number and the quality of the entries in the semantic index (the Annotation Repository). We applied the self-management and self-optimization principles of autonomic computing to design a set of sensors for monitoring the DOSE knowledge base. Those sensors have been plugged into the Annotation Repository agent, for self-optimization, and into the Search Engine agent, for managing the global response of the DOSE agency to changing and conflicting requests. A new agent, the Enrichment Agent, located at the kernel front-end logical level, has been designed to coherently update the DOSE knowledge base by interacting with external services and agents (figure 4).
4.1 Self-optimization and self-management in DOSE agents

The Annotation Repository agent resides on the "main" machine, managing the central repository for semantic annotations, which must be kept constantly up to date in order to satisfy user requests. This ability is granted by two self-management paradigms: "uniform topic coverage" and "user triggered knowledge integration". Semantic annotations in the repository catalog resources with
respect to the DOSE ontology and cover a subset of the ontology concepts with different relevance values: a topic occurring in many annotations has a higher coverage value than a concept that has no annotations. The "uniform topic coverage" paradigm aims at maintaining a certain degree of uniformity of topic coverage in the repository, possibly triggering focused indexing processes in order to correct situations in which some topics have a low number of annotations.
Such a paradigm can only improve knowledge that is already present, since the Annotation Repository only knows of the existence of concepts for which at least one annotation exists. The only way to cover new topics is to act at a higher logical level, where the ontology is known. Transparent triggering of the coverage of non-annotated concepts can be achieved by monitoring user requests at the Search Engine and by detecting the knowledge areas that are modeled by the ontology, i.e., for which the system is able to provide a conceptual description, but that are uncovered in the repository due to the lack of annotations. This second mechanism is called "user triggered knowledge integration".

Uniform topic coverage has been implemented in two ways. The first is basically a search for the minimum occurrence of topics in the set of stored annotations: all covered topics in the repository are ordered by annotation occurrence and the lowest ten percent is selected as the "low covered set" and provided to the Enrichment Agent, which is charged with triggering the indexing of new, suitable resources. The second way involves some statistical considerations: basic indexes are computed to evaluate the statistical properties of the topic coverage in the repository, such as the mean occurrence value, the variance and the standard deviation. After the evaluation of these figures, a threshold-based algorithm selects all the topics located under a given fraction of the standard deviation below the mean occurrence value, and triggers an enrichment cycle. The threshold value has been selected manually through different experiments and strongly depends on the shape of the topic occurrence distribution, which is near-uniform only if the amount of stored annotations is reasonably high. Statistically speaking, this technique tends to transform the topic coverage distribution from an unknown non-uniform distribution into a uniform one in which all topics have the same occurrence value.

To implement "user triggered knowledge integration", the Search Engine agent has been enhanced, allowing the semantic platform to dynamically face changes in user habits and interests and to discover modeled topics currently uncovered by annotations. When a query is issued to the search agent, it constantly monitors its results in terms of relevance. If the retrieved resources have relevance weights under a given threshold, or, worse, if no resources can be provided as results due to a lack in the Annotation Repository, the search module sends an asynchronous message to the Enrichment Agent in order to force an indexing cycle focused on the uncovered conceptual areas.
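A minimal sketch of the two "uniform topic coverage" selection strategies described above, assuming topic coverage is available as a map from topic identifiers to annotation counts; all names are hypothetical:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class CoverageSelector {

    /** Strategy 1: the ten percent of covered topics with the fewest annotations. */
    static List<String> lowestTenPercent(Map<String, Integer> coverage) {
        List<String> topics = new ArrayList<>(coverage.keySet());
        topics.sort(Comparator.comparingInt(coverage::get)); // ascending occurrence
        int n = Math.max(1, topics.size() / 10);
        return topics.subList(0, Math.min(n, topics.size()));
    }

    /** Strategy 2: topics whose occurrence falls below mean - k * stddev. */
    static List<String> belowStatisticalThreshold(Map<String, Integer> coverage, double k) {
        double mean = coverage.values().stream()
                .mapToInt(Integer::intValue).average().orElse(0.0);
        double variance = coverage.values().stream()
                .mapToDouble(v -> (v - mean) * (v - mean)).average().orElse(0.0);
        double threshold = mean - k * Math.sqrt(variance);
        List<String> weak = new ArrayList<>();
        for (Map.Entry<String, Integer> e : coverage.entrySet())
            if (e.getValue() < threshold) weak.add(e.getKey());
        return weak;
    }
}
```

The fraction k corresponds to the manually tuned threshold mentioned above.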
4.2 Enrichment Agent

The self-monitoring and self-optimization functions in the core agents of DOSE require the design and development of a new agent, called the Enrichment Agent, that manages the intelligent update of the Annotation Repository.
This new agent should be able to discover new information for semantic indexing and to understand the specification of a conceptual domain, consequently focusing the resource selection in order to reach a satisfying annotation coverage of that area. In other words, the Enrichment Agent accepts as input a set of concepts, performs some internal operations which may involve collaboration with other agents, and provides as output a set of URIs that, when indexed, should generate annotations covering the topics received as input.

In normal operating conditions the Enrichment Agent stays idle, listening for requests coming from trusted agents such as the Annotation Repository and the Search Engine. When a message is received from one of these entities, the agent unpacks it, extracting a list of concepts whose coverage should be enhanced, and contacts the synset wrapper included in the Semantic Mapper to find the lexical entities associated with those topics. It subsequently composes textual queries to be issued to classical text-based search engines. Once the textual queries have been composed, two concurrent processes start: one interacts with the Agency Manager in order to trigger incremental indexing on already known sites, and the second interacts with search web services (the Google web API, for example) in order to retrieve a list of possibly relevant URIs. At the end of these processes the Enrichment Agent performs some filtering on the retrieved URIs, deleting resources it cannot understand, such as .pdf and .doc files (DOSE can only support HTML, XML and XHTML resources), and eventually requesting translation services, operating on external caches which already hold translations of such documents.
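The listening loop of the Enrichment Agent might be sketched as follows with JADE primitives; the agent names, message format and helper methods are assumptions made for illustration:

```java
import jade.core.Agent;
import jade.core.behaviours.CyclicBehaviour;
import jade.lang.acl.ACLMessage;
import jade.lang.acl.MessageTemplate;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EnrichmentAgent extends Agent {

    // Assumed local names of the trusted requesters.
    private static final Set<String> TRUSTED =
            new HashSet<>(Arrays.asList("AnnotationRepository", "SearchEngine"));

    protected void setup() {
        addBehaviour(new CyclicBehaviour(this) {
            public void action() {
                // Stay idle until a REQUEST message arrives.
                ACLMessage msg = myAgent.receive(
                        MessageTemplate.MatchPerformative(ACLMessage.REQUEST));
                if (msg == null) { block(); return; }
                if (!TRUSTED.contains(msg.getSender().getLocalName())) return;

                // Content is assumed to be a comma-separated list of weak topics.
                for (String topic : msg.getContent().split(",")) {
                    List<String> synset = lookupSynset(topic.trim());
                    String query = String.join(" ", synset);
                    // ... issue 'query' to external search services and to the
                    // Agency Manager, filter the returned URIs, then forward an
                    // indexing request to the Indexer squad.
                }
            }
        });
    }

    // Hypothetical helper: would ask the Semantic Mapper's synset wrapper.
    private List<String> lookupSynset(String topic) {
        return Arrays.asList(topic);
    }
}
```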
After the filtering process, the agent composes a list of resources to be indexed, identified by URIs, and sends an indexing request to the indexing agent, which subsequently performs the semantic annotation and updates the Annotation Repository; figure 5 shows the corresponding sequence diagram.
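The format filtering just mentioned reduces, in essence, to dropping unsupported extensions; a minimal illustration follows, though the actual checks in DOSE may well rely on MIME types instead:

```java
import java.util.List;
import java.util.stream.Collectors;

class UriFilter {
    /** Keeps only resources the platform can parse (HTML, XML, XHTML). */
    static List<String> indexable(List<String> candidates) {
        return candidates.stream()
                .filter(u -> !u.toLowerCase().endsWith(".pdf"))
                .filter(u -> !u.toLowerCase().endsWith(".doc"))
                .collect(Collectors.toList());
    }
}
```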
Fig. 5. Enrichment cycle sequence diagram.
5. Experimental results

The former version of DOSE, which was not autonomic, relied entirely on external inputs (given either by humans or by machines) to trigger an indexing operation and acquire new resources, since no update policy was present to manage the Annotation Repository. A direct comparison between the proposed adaptive platform and the former one is therefore not significant and, in some particular conditions (Experiment 2 in the next section), not even possible. To evaluate the DOSE self-management capabilities, several runs have thus been executed, measuring the effectiveness of the proposed rules and assessing the quality of the annotations added to the repository by means of relevance measures.
5.1 Experimental setup

The DOSE agency has been fully implemented in Java, choosing the JADE [17] framework among the many different available agent platforms because of its free availability, its well-known diffusion, and its FIPA-compliant paradigm, which allows an easy implementation of high-level communication among agents. The JADE software framework has been developed by TILAB (formerly CSELT) and supports the implementation of multi-agent systems through a middleware that claims to be fully compliant with the FIPA specifications, in order to inter-operate with other FIPA-compliant systems such as Zeus (British Telecom) and FIPA-OS. JADE agents are implemented as Java threads and live within Agent Containers that provide the runtime support for agent execution and message handling. Containers can be connected via RMI and can be both local and remote; the main container is associated with the RMI registry. Agent activities are modeled through different Behaviours, and execution is managed by an agent-internal scheduler.

The ontology model used in these experiments has been developed in collaboration with the Passepartout service of the city of Turin. Passepartout is a public service for the integration and aid of disabled people, active in social services since 1981. The developed ontology covers disability aids, norms and laws, and social integration; its development involved at least two experts from the Passepartout service and one ontology engineer. The ontology is organized into 4 different areas which are modeled in deep detail; "disabled people working aids" was one of them, for example. At the end of the first interaction cycle the ontology counted about 450 concepts organized into the 4 main areas; for each ontology concept a definition and a set of lexical entities has also been specified (see [9] for a more detailed explanation), for a total amount of over 2500 words. A set of available computational resources was used for the tests, consisting of three local hosts acting as container providers for indexing squads and a "friend" host (www.superabile.it) for which we deployed a special-purpose search agent.
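For reference, a deployment like the one just described can be bootstrapped in JADE roughly as follows; the agent class names are hypothetical:

```java
import jade.core.Profile;
import jade.core.ProfileImpl;
import jade.core.Runtime;
import jade.wrapper.AgentController;
import jade.wrapper.ContainerController;

public class DoseBoot {
    public static void main(String[] args) throws Exception {
        Runtime rt = Runtime.instance();
        Profile profile = new ProfileImpl(); // main container on the local host
        ContainerController main = rt.createMainContainer(profile);

        // Start the central agents; the class names are assumptions.
        AgentController repository = main.createNewAgent(
                "AnnotationRepository", "dose.AnnotationRepositoryAgent", null);
        AgentController enricher = main.createNewAgent(
                "EnrichmentAgent", "dose.EnrichmentAgent", null);
        repository.start();
        enricher.start();
    }
}
```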
5.2 Results

Experimentation was articulated in two phases: in the first we tested the "uniform topic coverage" paradigm in its two implementations, while in the second we tested the combined operation of "uniform topic coverage" and "user triggered knowledge integration". The initial size of the DOSE annotation repository was fixed at about two hundred annotations, corresponding to approximately 20 indexed URIs. We used the first algorithm defined in section 4.1, keeping track of the selected weak topics, and we evaluated their evolution. The data reported in Table 1 show the concepts extracted in three executions of the algorithm, namely the first, the fifth and the twenty-fifth. The experiment took about 10 minutes on three family-class PCs, and the corresponding repository size was around 2 MB, i.e., about two thousand annotations. A second experiment was performed to test the statistical selection method; it involved ten runs of the statistical selection algorithm, with a corresponding repository size increment of around 3.2 MB (~3000 annotations). URIs provided by the Enrichment Agent could be judged as relevant or not by the indexing system, thus resulting in a number of actually indexed resources possibly lower than the amount received. With respect to a human expert selection, provided URIs could in fact be judged relevant by the expert and not relevant by the system, resulting in
the set of so-called missed URIs or, conversely, judged non-relevant by the human and relevant by the system, resulting in the set of wrong annotations.

Table 1. Auto-updating results after 1, 5 and 25 runs.

1st run of the 10% selection algorithm:
Natural filiation, Parking, Invalidities, Documents and services managed from the Motorization, Means of land transport, Mobility of the job, Functional Diagnosis of the unable person, mountain Community, total civil Blindness, Roulotte, Undercarriage elevator, Block of the circulation, Accreditation, local Community, Equipment, IPT, Accident in itinere, Check salary, Retribution, Dismissal for adoption, Legitimate Filiation.

5th run of the 10% selection algorithm:
Functional compromise, Invalidity, Means of land transport, Functional Diagnosis of the unable person, Undercarriage elevator, Block of the circulation, Collaterals, Accreditation, Equipment, Pension to the survivors, Accident in itinere, Check salary, Retribution, Dismissal for adoption, Resources orchestrates them and informative, Target, Area of pause, Elevator, Social Security, primary social Nets, Employ Center, Beneficence, providential Pension, Indemnities of severance pay.

25th run of the 10% selection algorithm:
Taxi, Metropolitan, Target, Employ Centre, Elevator, Office relations with the public, Train, Bus equipped, Industrial accident, Functional Diagnosis of the unable person, Retribution, precarious Occupation, Standard, Contract to partial time, Accessibility, Accident in itinere, Accreditation, deafness & mutism, special integrating Indemnity, the minimal Treatment, metropolitan Area, Invalidity, Bus, Dismissal for adoption, retributive Difference between sexes, Station, Mobility of the job, Disparity of kind, Indemnity of severance.

With regard to the last experiment, the self-managing and self-upgrading processes in the DOSE platform provided 93 distinct URIs by collaborating with the Google web service and the remote indexing squads; 41 of them were successfully indexed, i.e., judged relevant by the system, increasing the overall knowledge base size. Some behavioral tendencies emerge from the experimental results. First, the selected concepts change from one run to the following ones, confirming the ability of the retrieved URIs to effectively fill knowledge gaps. As an example, in the first run the concept "Natural filiation" appears first, while in the fifth run it does not appear at all. In five runs the self-updating
mechanism has been able to increase the annotation coverage of that topic, preventing its inclusion in the low-covered set. Secondly, the degree of randomness present in the set of URIs obtained by collaborating with external services results in an additional knowledge base enlargement, increasing the annotation coverage for topics related to the ones in the low-covered set and speeding up the self-maintenance process.

In the second experimentation phase we tested the combined operation of "uniform topic coverage" and "user triggered knowledge integration". We used an initially empty repository: this condition represents an extreme case where a non-adaptive system cannot provide any service, while an adaptive one succeeds and is able to reach a satisfying operating state. We issued a query to the search engine requiring resources about "lavoro disabile", which is Italian for "disabled people's job". Clearly the search failed, since no knowledge was in the repository; however, the "user triggered knowledge integration" forced the DOSE platform to find and index a minimal set of relevant resources (9 URIs, 663 annotations). The "uniform topic coverage" paradigm subsequently triggered consecutive improvements of the repository, covering to a great extent the ontology branch in which the concept "lavoro disabile" appears, and leading to a total repository size, after two runs, of about 3204 annotations. Re-running the query retrieved a total of 62 resource fragments, thus showing the effectiveness of the two autonomic paradigms integrated into DOSE.
6. Conclusions

This paper proposed an agent-based semantic platform in which autonomic computing principles are applied to ensure the constant update of the platform knowledge base. Self-optimization and self-management techniques proved to be very effective for the population and update of a semantic annotation repository, and preliminary experimental results give positive feedback on sensing knowledge base gaps and finding new resources to fill them. Moreover, the low computational requirements and the built-in, naturally distributed architecture allow an easy deployment of the proposed platform on the current web. Future work will include a more extensive test campaign and the integration of self-healing and self-protection principles into the agents composing the platform.
7. Acknowledgements

This work has been partially funded by the European Commission under the Socrates Programme, project 109883-CP-1-2003-1-IT-MINERVA-M "CABLE: Case-Based e-Learning for Educators". The sole responsibility for this work lies with the authors, and the Commission is not responsible for any use that may be made of the information contained herein.
8. References

[1] T. Berners-Lee, J. Hendler, and O. Lassila, "The Semantic Web", Scientific American, May 2001.
[2] KAON Ontology and Semantic Web Infrastructure, http://kaon.semanticweb.org
[3] The Amaya W3C Editor/Browser, http://www.w3.org/Amaya/
[4] O. Lassila and R. Swick, "Resource Description Framework (RDF) Model and Syntax Specification", World Wide Web Consortium, 22 February 1999.
[5] D. L. McGuinness and F. van Harmelen, "OWL Web Ontology Language Overview", W3C Proposed Recommendation, 15 December 2003.
[6] Y. Sure, M. Erdmann, J. Angele, S. Staab, R. Studer, and D. Wenke, "OntoEdit: Collaborative Ontology Development for the Semantic Web", in Proc. of the 1st International Semantic Web Conference, Sardinia, Italy, 2002.
[7] A. Maedche, Ontology Learning for the Semantic Web, The Kluwer International Series in Engineering and Computer Science, Volume 665.
[8] On-To-Knowledge Project, http://www.ontoknowledge.org
[9] D. Bonino, F. Corno, L. Farinetti, "DOSE: a Distributed Open Semantic Elaboration Platform", ICTAI'03, Sacramento, California, November 2003.
[10] CABLE: CAse Based e-Learning for Educators, http://elite.polito.it/cable
[11] "Autonomic Computing: IBM's Perspective on the State of Information Technology", International Business Machines Corporation, 2001, http://www.research.ibm.com/autonomic/manifesto/
[12] OWL-S 1.0 Release, http://www.daml.org/services/owl-s/1.0/
[13] R. Baeza-Yates, B. Ribeiro-Neto, Modern Information Retrieval, Addison-Wesley, 1999.
[14] J. Kahan, M. Koivunen, E. Prud'Hommeaux, R. Swick, "Annotea: An Open RDF Infrastructure for Shared Web Annotations", in Proc. of the WWW10 International Conference, Hong Kong, May 1-5, 2001.
[15] L. Denoue, L. Vignollet, "An annotation tool for web browsers and its applications to information retrieval", in Proc. of RIAO 2000, Paris, France, April 12-14, 2000.
[16] P. Grosso, E. Maler, J. Marsh, N. Walsh, "XPointer Framework", World Wide Web Consortium, 2002.
[17] F. Bellifemine, A. Poggi and G. Rimassa, JADE Programmer's Guide, last update 21 February 2003, http://sharon.cselt.it/projects/jade
[18] IBM research projects on Autonomic Computing, http://www.research.ibm.com/autonomic/research/projects.html
[19] C. Boutilier, R. Das, J. O. Kephart, G. Tesauro and W. E. Walsh, "Cooperative Negotiation in Autonomic Systems using Incremental Utility Elicitation", in Proc. of the 19th Conference on Uncertainty in Artificial Intelligence, Acapulco, Mexico, August 8-10, 2003.
[20] R. Sterritt, D. Bustard, "Towards an Autonomic Computing Environment", in Proc. of the IEEE Workshop on Autonomic Computing Principles and Architectures (AUCOPA 2003), Banff, Alberta, Canada, August 22-23, 2003.
[21] J. Appavoo, K. Hui, C. A. N. Soules et al., "Enabling autonomic behavior in systems software with hot swapping", IBM Systems Journal, vol. 42, n. 1, 2003.
[22] C. H. Crawford, A. Dan, "eModel: Addressing the Need for a Flexible Modeling Framework in Autonomic Computing", in Proc. of the 10th IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunications Systems, Fort Worth, Texas, October 11-16, 2003.
[23] R. Kumar, P. V. Rao, "A model for self-managing Java servers to make your life easier", http://www-106.ibm.com/developerworks/library/ac-alltimeserver/
[24] "ETTK: Emerging Technologies Toolkit", http://www.alphaworks.ibm.com/tech/ettk/