Gerstner Laboratory for Intelligent Decision Making and Control
Czech Technical University in Prague

Series of Research Reports, Report No. GL 149/02

Translating Between Ontologies For Agent Communication

Marek Obitko [email protected] http://cyber.felk.cvut.cz/gerstner/reports/GL149.pdf

Gerstner Laboratory, Department of Cybernetics
Faculty of Electrical Engineering, Czech Technical University
Technická 2, 166 27 Prague 6, Czech Republic
tel. (+420-2) 2435 7421, fax: (+420-2) 2492 3677
http://gerstner.felk.cvut.cz/

Prague, 2002 ISSN 1213-3000

Table of Contents

1 Introduction
2 Problem Formulation and Approach to Solution
3 State of the Art
   3.1 Ontologies
   3.2 Representation of Ontologies
      3.2.1 Frame Based Models
      3.2.2 Description Logics
      3.2.3 KIF and Ontolingua
      3.2.4 OKBC and FIPA Knowledge Model
      3.2.5 Web Standards – XML and RDF(S)
      3.2.6 XOL
      3.2.7 OIL
      3.2.8 DAML+OIL
      3.2.9 Extensions of Ontology
      3.2.10 Prototype-Based Ontology
   3.3 Operations with Ontologies
   3.4 Relations between Ontologies
   3.5 Translating between Ontologies
   3.6 Language Games
   3.7 Learning of Operations with Ontologies
   3.8 Communication in Multi-Agent Systems and Ontologies
4 Discussion
   4.1 Architecture
      4.1.1 Communication Between Agents
      4.1.2 Agent Architecture
   4.2 Translation Between Ontologies
   4.3 Learning to Translate
   4.4 Testing
      4.4.1 Manufacturing Domain
      4.4.2 OOTW Domain
   4.5 Other Related Issues
5 Conclusion
6 Acknowledgements
7 References
Appendix A – Sample Ontology for Transportation Domain
   A.1 Original XML-based Ontology
   A.2 OIL Ontology
   A.3 OIL Ontology in DAML+OIL Format


1 Introduction

As information systems change from isolated programs to more powerful interconnected systems, new challenges arise. An issue of great importance is communication between these systems. It is particularly important in heterogeneous systems – for example, when connecting existing legacy systems from different companies that were not designed to work together. Connected systems that communicate in order to exchange information and work together to achieve some common goal are studied in the area of multi-agent systems.

An interesting real-life example provided in (Madnick 1995) shows how a transportation chain supplying the US Army, involving different companies, works well for transporting containers weighing tons around the world, but does not work so well for transporting the information saying what is in these containers. The result is the need to manually repack a huge number of containers when they arrive, just to see what is in them. The problem is that there are hundreds of different computer systems and databases in airlines, shipping companies, trucking companies and manufacturing companies that were never designed to operate directly with each other. Although each system may be efficient, the interfaces between these systems introduce tremendous disruption and delay and usually require significant human intervention.

To ensure compatibility for communication, we can distinguish three basic layers – technical compatibility, syntactic compatibility, and finally semantic compatibility. In the first layer, we have to ensure that two systems are able to establish communication (at the lowest level, e.g. through network protocols) and exchange messages. In the second layer, we have to make sure that both systems use the same syntax of messages – i.e. that both systems are able to parse messages from each other and to recognize syntactical elements. The last layer, the most interesting one, is the semantic layer. In this layer we want systems to actually understand the content of messages, i.e. to understand their meaning. It is the last layer that we will deal with in this work.

The problem of semantic understanding is known as the interoperability problem or as semantic heterogeneity. Agents (which can be wrappers of existing systems) want to communicate, but they do not share the form of expressing their messages – concretely, they do not share the ontology (see below) used for expressing the content of their messages. For example, an agent from the U.S. may use a different vocabulary than an agent from France (FIPA 1998), and the problem of communication arises even if the agents are able to connect, simply because they do not understand each other.

There are many areas where the problem of semantic interoperability arises and needs to be solved (Fensel 2001a, Fensel 2001b). Examples include communication in heterogeneous multi-agent systems; integration, fusion and gathering of data, information and knowledge from multiple sources; querying over distributed knowledge bases; system interoperability in e-commerce; application integration; B2B applications; interoperability in the semantic web; and many others. This issue is important not only in communication between agents. Very similar problems are studied, for example, in the integration of databases and their schemas, data warehouses, data migration and database schema evolution.

2 Problem Formulation and Approach to Solution

The problem we will deal with in this report can be informally defined as follows. There are agents that use different ontologies as the basis of their knowledge bases. They want to communicate; however, they can create messages only according to their own ontology, which may differ from the ontologies used by other agents. The ontologies may differ in many ways. The agents' ontologies may not share the conceptualization (roughly speaking, a way of viewing or modeling the world), or the specification of the conceptualization (e.g. the vocabulary used to describe concepts in the same conceptualization), or both. How can these agents with different ontologies understand each other?


To be more specific, we assume that agents create messages using the ontology underlying their knowledge base. When the agents communicate about something that is familiar to both of them, they should be able to proceed with the communication, even if they do not exactly share their ontologies. It should be possible to:

· Translate messages from one agent for another agent, so that the agents can communicate and understand each other. The input for this translation is a message based on one ontology and the output is a message based on another ontology. These messages should be semantically equivalent, if possible, or as equivalent as possible (some information may be lost during the translation between different ontologies, because different ontologies may differ in the details that they take into account).

· Learn how to understand each other, i.e. learn this translation (or mapping) between ontologies, or negotiate the meaning of messages or parts of messages. Of course, to start learning or negotiation, the agents must share something. For example, they may share the world they reside in (it can be a virtual or a physical world), so that they can identify objects or instances from the world represented in their knowledge bases. Another possibility is that the agents share some parts of their ontologies (e.g. the top-level ontology) and understand messages formed in these shared parts, but do not understand terms from the unshared parts.

We will elaborate the background and proposals for these two issues in the rest of this report. We try to make both the translation and the learning of translation as automatic and distributed as possible. The next section also contains a discussion of why one common shared ontology is not sufficient, or even not possible.

3 State of the Art

In this section we will briefly describe the state of the art in the area of ontologies and their use for communication in multi-agent systems.

3.1 Ontologies

Ontology is a term that comes from philosophy, where it is a branch that deals with the nature and the organization of reality. In philosophy, it is often defined as the science of being as such. The philosophical branch of Ontology tries to answer questions like "what is existence", "what properties can explain the existence", "what is the being", or, in a more meaningful reformulation, "what are the features common to all beings" (Guarino, et al. 1995).

The term ontology was adopted by AI and database researchers. The best-known definition comes from Thomas Gruber (Gruber), (Gruber 1993b), (Gruber 1993a): "ontology is an explicit specification of conceptualization". In the sense presented in (Gruber), an explicit specification of conceptualization means that an ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents.

Another, more detailed definition comes from John Sowa (Sowa): "The subject of ontology is the study of the categories of things that exist or may exist in some domain. The product of such a study, called an ontology, is a catalog of the types of things that are assumed to exist in a domain of interest D from the perspective of a person who uses a language L for the purpose of talking about D. The types in the ontology represent the predicates, word senses, or concept and relation types of the language L when used to discuss topics in the domain D. An uninterpreted logic is ontologically neutral: It imposes no constraints on the subject matter or the way the subject is characterized. By itself, logic says nothing about anything, but the combination of logic with an ontology provides a language that can express relationships about the entities in the domain of interest."

Other similar definitions of ontologies for AI purposes, such as "representation of a conceptual system via a logical theory", are discussed rigorously from a logical point of view in (Guarino, et al. 1995).

An informal definition of ontology as a theory of vocabulary or concepts used for building artificial systems is discussed in (Mizoguchi, et al. 1996). Sometimes, ontology is defined as a "body of knowledge describing some domain", typically a common sense knowledge domain. In this case, the ontology includes a whole "upper" knowledge base. A typical usage of this definition is the project CYC (Whitten 1997). A summary of these definitions is available in (Obitko 2001).

Although these definitions are not exactly the same, they in principle say that an ontology consists of these parts:

· conceptualization of a domain, i.e. a way to view/model the domain
· specification of the conceptualization, e.g. a formal description

In addition, both the conceptualization and the specification are influenced by a modeling method (e.g. frames and slots). This can also be considered a part of the ontology (it is sometimes described as a metaontology). At the conceptualization level, we decide which objects from the domain of interest, and which relations between them, we will include in the ontology, and also to what level of detail we will go. We choose the modeling method (see the overview of ontology representation languages in the next section) and model the domain. At the specification level, we formally specify the conceptualization, usually in some formal language. The formality of the specification may influence further reuse and sharing of the ontology. The level of formality can range from a simple glossary of terms without anything else, through informal definitions in natural language, a formal is-a relation, formal descriptions of frames and properties, and value restrictions, to general logical constraints. It is clear that a less formal ontology is much simpler to develop, while a more formal ontology usually enables much easier reuse and sharing. A nice guide to creating an ontology based on classes and slots is available in (Noy, et al. 2001).

The ontology defines how to model the state of affairs in a domain, together with restrictions on what is possible and what is impossible. An ontology should capture knowledge that does not change (or does not change very often), while the particular state of affairs is captured in a knowledge base. For example, an ontology can state that there are patients and illnesses, while the knowledge base would contain information about a particular patient and his illness. For a software agent, what "exists" is only what can be represented in his mind, i.e. what can be represented in the ontology underlying his knowledge base.

Ontologies are developed and used because they enable, among other things:

· sharing knowledge – by sharing an understanding of the structure of information among software agents and people
· reusing knowledge – a once developed ontology can be reused for other systems operating on a similar domain
· making assumptions about a domain explicit

A short survey of other reasons and possible applications of ontologies is available in (Obitko 2001). Semantic interoperability (the possibility to understand shared data, information, and knowledge) is one of the main reasons why ontologies are being used (Fensel 2001a). It was supposed that it should be possible to conceptualize a domain and create a common shared ontology, so that any system working on that domain could use it for sharing knowledge. However, it seems that ontologies for one domain from two ontology engineers are always different, even if they use similar modeling methods and approaches. When two organizations create their ontologies for a shared domain, the ontologies will never be the same. Rational reasons why an ontology that would be optimal for everyone is impossible are given in (O'Leary 2000), together with an illustration on the different best-practice knowledge bases of Arthur Andersen and Price Waterhouse.
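
To make the ontology / knowledge base distinction concrete, the following is a minimal sketch in Python of the patient example above (the names and the triple format are ours, invented only for illustration – the report does not prescribe any implementation):

# A minimal sketch of the ontology / knowledge-base distinction.
# All names are illustrative, not taken from any standard.

# Ontology level: concepts and relations that do not change often.
ontology = {
    "concepts": ["Patient", "Illness"],
    "relations": {"suffers-from": ("Patient", "Illness")},  # (domain, range)
}

# Knowledge-base level: the particular state of affairs.
knowledge_base = [
    ("instance-of", "john", "Patient"),
    ("instance-of", "flu", "Illness"),
    ("suffers-from", "john", "flu"),
]

def is_instance_of(kb, obj, concept):
    return ("instance-of", obj, concept) in kb

# A statement is admissible only if the ontology sanctions its relation
# and the arguments are instances of the right concepts.
def admissible(kb, onto, statement):
    relation, subject, obj = statement
    if relation not in onto["relations"]:
        return False
    domain, range_ = onto["relations"][relation]
    return is_instance_of(kb, subject, domain) and is_instance_of(kb, obj, range_)

print(admissible(knowledge_base, ontology, ("suffers-from", "john", "flu")))  # True

The ontology part would typically be shared and stable, while the knowledge base changes as new patients and diagnoses are recorded.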

3.2 Representation of Ontologies

A number of languages are used for the specification of ontologies (Ribiere, et al.), (Fensel, et al.). These languages impose some restrictions on what can be represented and, on the other hand, may allow constructs that are not expressible in other languages. As a result, ontologies represented in more powerful languages may not be translatable to less powerful languages without loss of information. Otherwise, the translation of ontologies between different languages is only a matter of syntactic translation, which is relatively easy and not as interesting as semantic translation. In this chapter we will very briefly describe languages that are currently used for ontology representation. For many of the described languages, tools are available that support creating ontologies in these particular languages. The most popular tools are described e.g. in (Ribiere, et al.), (Noy, et al. 2001), (Grosso, et al. 1999).

3.2.1 Frame Based Models

Frame based systems use entities like frames and their properties as modeling primitives. Note that this is a common description of the approach; we are not describing one particular language here. The central modeling primitive is a class (also called a frame, sometimes a concept) with attributes (properties, slots). These attributes are applicable only to the classes they are defined for. Value restrictions (facets) can be defined for each attribute. A frame provides a context for modeling one aspect of a domain. An important part of frame-based languages is the possibility of inheritance between frames. Inheritance allows inheriting attributes together with the restrictions on them. A knowledge base then consists of instances (objects) of these frames (classes).

3.2.2 Description Logics

Description Logics (DL) try to find a fragment of first-order logic with high expressive power that still has a decidable and efficient inference procedure. They result from work on semantic networks and define a formal and operational semantics for them. Unlike frame-based systems, which use frames as the modeling primitive, the central modeling primitives of description logics are predicates. Classes or concepts in DL are defined intensionally, in terms of descriptions that specify the properties that objects must satisfy in order to belong to the concept. Implemented systems (AT&T) include BACK, CLASSIC, CRACK, FaCT, FLEX, K-REP, KL-ONE, KRIS, LOOM and YAK. These systems include reasoning support; however, in some cases the support is not powerful enough for practical applications.

3.2.3 KIF and Ontolingua

Knowledge Interchange Format (KIF) (Genesereth, et al. 1994) is a language designed for the exchange of knowledge between different systems. It is based on predicate logic, with a syntax based on LISP. It has declarative semantics, i.e. the meaning of KIF expressions can be understood without defining how these expressions are manipulated. It allows representing arbitrary sentences of the first-order predicate calculus. The Ontolingua tool (Gruber 1993b), which provides a cooperative ontology builder, was built around this language. KIF is highly expressive, so it can serve as an interchange format between various knowledge representation formalisms. When KIF is used, one usually implements a representation formalism in KIF and uses this implementation for representing particular knowledge or ontologies. This is also the case of Ontolingua – the Frame Ontology defining classes, slots, facets etc. was defined in KIF, and KIF together with the Frame Ontology forms the language of Ontolingua, which allows writing ontologies in a canonical form. The drawback of KIF (and Ontolingua) is its high expressive power, which is provided without any means to control it. No reasoning support has ever been provided for these languages.
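
For illustration, a KIF sentence stating that every carnivore is an animal could look as follows (a small sample in KIF s-expression syntax; the class names are ours):

(forall (?x) (=> (carnivore ?x) (animal ?x)))
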
3.2.4 OKBC and FIPA Knowledge Model

Open Knowledge Base Connectivity (OKBC) (Chaudhri, et al. 1998) defines an API for accessing knowledge representation systems. It covers most of the concepts found in frame-based systems, object databases and relational databases (Chaudhri, et al. 1998), (FIPA 1998). The conceptualization in the OKBC format is based on frames, slots, facets, instances, types, and constants. The OKBC API is defined in a language-independent fashion, and implementations exist for Common Lisp, Java, and C (Chaudhri, et al. 1998). OKBC is not sufficient for easily representing axioms and rules. Despite this fact, the OKBC knowledge model was adopted by the FIPA (http://www.fipa.org/) recommendations as the FIPA-meta-ontology (FIPA 1998) that describes the basic modeling constructs for ontologies used in FIPA-compliant systems.

3.2.5 Web Standards – XML and RDF(S)

Extensible Markup Language (XML) (Bray, et al. 1998) is a meta-language designed and developed by the W3C. XML allows defining user tags and attributes and defining possible structures of documents (via DTD and other schema languages). Its advantages include simplicity, extensibility and the separation of content and semantics from presentation. The semantics provided by XML is, however, only informal – in the core standard there are no means to attach meaning to tags explicitly. Resource Description Framework (RDF) (Lassila, et al. 1999) was designed to describe metadata for resources on the web by means of triples of subjects, predicates, and objects. These metadata can be embedded in a document without making any assumption about the document structure. RDF Schema (RDFS) adds a basic type schema for RDF. Objects, classes, and properties can be described, and predefined properties can be used to model instances of a class. Support for defining subclasses (even of relationships), as well as domain restrictions, is included. A specialty of RDF(S) is that everything is stated globally and is not encapsulated as in the case of frame systems.

3.2.6 XOL

The XML-based ontology language (XOL) (Karp, et al. 1999), enabling writing and exchanging ontologies, was developed and is recommended by the BioOntology Working Group (http://smi-web.stanford.edu/projects/bio-ontology/). The syntax is based on XML; the semantics and modeling primitives are based on a simplified OKBC model, called OKBC-Lite. Its knowledge model includes basic data types such as integers, floating point numbers, strings and booleans, and higher level primitives such as classes and individuals, slots and facets and their values.

3.2.7 OIL

The Ontology Inference Layer (OIL) (Fensel, et al.), see http://www.ontoknowledge.org/oil/, is the result of a careful analysis of the previously mentioned ontology representation languages. OIL comes from the EC project On-To-Knowledge and is now proposed as a standard for ontology specification and exchange. It has roots in description logics, frame-based systems and web languages such as RDF(S). The OIL XML syntax is based on XOL. OIL takes the opposite approach to KIF and Ontolingua, which enable expressing anything but do not provide sufficient reasoning support. OIL provides a limited core language that is expressive enough and still enables effective reasoning support. It is possible to extend the language for particular applications where the basic core language is not sufficient. An ontology in OIL is specified in two layers – the ontology definition layer contains the actual ontology definitions, while the ontology container layer contains metadata about the ontology, represented using the Dublin Core standard. The concrete instances based on the ontology are present at the object level, which is in fact a knowledge base. Basic modeling primitives of the ontology definition layer include:

· class-def – type, name, documentation, subclass-of, slot-constraints
· slot-def – name, documentation, subslot-of, domain, range, inverse, properties
· slot-constraint – name, has-value, value-type, max-cardinality, min-cardinality

An example ontology expressed in OIL that illustrates several modeling primitives (another ontology in OIL text format is available in Appendix A):

class-def animal                      % animals are a class
class-def plant                       % plants are a class
  subclass-of NOT animal              % that is disjoint from animals
class-def tree
  subclass-of plant                   % trees are a type of plants
class-def branch
  slot-constraint is-part-of          % branches are parts of some tree
    has-value tree
    max-cardinality 1
class-def defined carnivore           % carnivores are animals
  subclass-of animal
  slot-constraint eats                % that eat any other animals
    value-type animal
class-def defined herbivore           % herbivores are animals
  subclass-of animal, NOT carnivore   % that are not carnivores, and
  slot-constraint eats                % they eat plants or parts of plants
    value-type plant OR (slot-constraint is-part-of has-value plant)

Current limitations of OIL include the lack of support for default reasoning, the inability to express arbitrary axioms and rules, and limited second-order expressivity. Modelers are free to extend the language with their own extensions, but this may compromise decidability and reasoning support.

3.2.8 DAML+OIL

OIL has been adopted by the DARPA Agent Markup Language (DAML) project (DAML). DAML+OIL is a semantic markup language for Web resources. It builds on earlier W3C standards such as RDF(S) and extends these languages with richer modeling primitives commonly found in frame-based languages. The DAML language has a clean and well-defined semantics. Because DAML is based on RDF, a knowledge base based on DAML+OIL has the form of a collection of RDF triples. DAML+OIL specifies the meaning of these triples. Any additional RDF statements not described by a DAML+OIL ontology are allowed, but DAML+OIL then says nothing about their semantics (a KIF axiomatization provides a meaning for them, with no guarantee of reasoning support). DAML+OIL can import other ontologies, which provides a means for the modular construction of ontologies.

3.2.9 Extensions of Ontology

An extended knowledge model facilitating knowledge sharing is presented in (Tamma, et al. 2001). Extensions that provide better means for characterizing parts of an ontology are proposed in addition to typical facets (e.g. in the OKBC model). The extensions include additional characterization of slot values by providing e.g. the typical range of values, exceptions, frequency of change, confidence, events that may change the value, etc. The presented enriched knowledge model explicitly represents the behavior of attributes over time by describing permitted changes. These extensions can be very useful when comparing two ontologies and finding their common parts.

3.2.10 Prototype-Based Ontology

A prototype-based ontology (Sowa) is different from the ontologies mentioned above. It is a terminological ontology whose categories are distinguished by typical instances or prototypes, rather than by axioms and definitions in logic. For every category c in a prototype-based ontology, there must be a prototype p and a measure of semantic distance d(x,y,c), which computes the dissimilarity between two entities x and y when they are considered instances of c. An entity x can then be classified by a simple recursive procedure (Sowa). As an example, a black cat and an orange cat would be considered very similar as instances of the category Animal (Sowa), since their common catlike properties would be the most significant for distinguishing them from other kinds of animals. But in the category Cat, they would share their catlike properties with all the other kinds of cats, and the difference in color would be more significant. The color difference would be the most significant in the category BlackEntity. Prototype-based ontologies are usually derived by a method that learns from examples.

3.3 Operations with Ontologies

Different operations can be performed with ontologies. An overview of operations performed between or on ontologies could include (Sowa 2000, Sowa):

· Merging – merging ontologies to create a new ontology that has features inherited from the original ontologies. Usually it is required that all features of the original ontologies are included; however, this may not be possible because of inconsistencies between the original ontologies. A merged ontology may introduce new concepts and relations that form a bridge between the original ontologies.

· Mapping – mapping concepts and relations (or other parts of an ontology – for simplicity, we will assume in this section that an ontology is formed from concepts and relations) from one ontology to another. In the easiest case, the mapping can be one-to-one, i.e. one concept in ontology A always maps to one concept in ontology B. If this is not the case, the mapping can be partial and cause loss of information. However, it should not introduce any inconsistencies. In fact, a mapping describes how to translate statements from one ontology to the other.

· Alignment – the process of mapping between ontologies in both directions, which may require adapting the original ontologies to ensure that suitable targets for the alignment exist, i.e. when the mapping between the ontologies is not possible, it is necessary to add new concepts or relations that have equivalents in the other ontology. An alignment (as well as a mapping) can be partial. The specification of the alignment between two ontologies is called an articulation.

· Refinement – an alignment of every category of an ontology A to some category of another ontology B. Every category in A must correspond to an equivalent category in B, but some primitives of A might be equivalent to non-primitives in B. Refinement defines a partial ordering between ontologies.

· Unification – a one-to-one alignment of all concepts and relations in two ontologies that allows any inference expressed in one to be mapped to an equivalent inference in the other. The usual way of unifying two ontologies is to refine each of them to more detailed ontologies.

· Integration – the process of finding commonalities between two different ontologies A and B and deriving a new ontology C that facilitates interoperability between computer systems based on the A and B ontologies. The new ontology C may replace A or B, or it may be used only as an intermediary between a system based on A and a system based on B. Depending on the amount of change necessary to derive C from A and B, the level of integration can range from alignment to unification.

· Inheritance – ontology A inherits from ontology B if all the concepts, relations and constraints from ontology B are present in ontology A and no inconsistency is introduced. This notion is important for the idea of modular construction of ontologies from ontology libraries (i.e. starting with a top-level ontology defining what is time, process, etc., and going through a general domain ontology, we can arrive at the definition of an ontology for a particular focused application). Inheritance defines a partial ordering between ontologies.

It can be seen that some of these operations overlap, and their exact meaning can differ between projects (especially mapping and alignment are often interchanged). Some operations may not be available for particular ontologies (see the next section for an overview of relations between ontologies). These operations are usually performed by a human. As in the case of ontology modeling, some tools exist that help perform some of these operations. One of them is the Chimaera system (McGuinness, et al.), which helps with ontology merging. It provides suggestions for subsumption, disjointness or instance relationships. These suggestions are generated heuristically and are presented to an operator, who may choose which one will actually be used. PROMPT/SMART (Noy, et al. 1999), (Klein) is a similar system that provides suggestions based on linguistic similarity, ontology structure and user actions. It points the user to possible effects of these changes.
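
For instance, the inheritance relation above reduces to a simple containment check when ontologies are idealized as plain sets of concepts and relations. The following Python sketch illustrates this under that (strong) simplification – real ontologies would also require checking that no inconsistency is introduced, which is omitted here:

def inherits_from(a, b):
    """Ontology a inherits from ontology b if all concepts and relations
    of b are present in a (consistency checking omitted in this sketch)."""
    return b["concepts"] <= a["concepts"] and b["relations"] <= a["relations"]

top_level = {"concepts": {"Process", "Time"},
             "relations": {("occurs-at", "Process", "Time")}}
domain = {"concepts": {"Process", "Time", "ManufacturingStep"},
          "relations": {("occurs-at", "Process", "Time")}}

print(inherits_from(domain, top_level))  # True: the domain ontology builds on the top-level one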

3.4 Relations between Ontologies

In an open environment, agents may benefit from knowing about the relationships between ontologies. For example, if they know that two ontologies can be mapped 1:1, they may use easier mapping procedures as well as easier learning of the mapping. A possible classification of relationships is provided in the FIPA ontology service specification (FIPA 1998):

· Extension – ontology O1 extends ontology O2. The ontology O1 extends or includes the ontology O2. Informally this means that all the symbols that are defined within O2 are found in O1, together with the restrictions, meanings and other axiomatic relations of these symbols from O2.

· Identical – ontologies O1 and O2 are identical. The vocabulary, axiomatization and language are physically identical; only the name can differ.

· Equivalent – ontologies O1 and O2 are equivalent. The logical vocabulary and logical axiomatization are the same, but the language is different (e.g. XML and Ontolingua). When O1 and O2 are equivalent, they are strongly translatable in both directions.

· Strongly-translatable – source ontology O1 is strongly translatable to the target ontology O2. The vocabulary of O1 can be totally translated to the vocabulary of O2, the axiomatization of O1 holds in O2, there is no loss of information from O1 to O2, and there is no introduction of inconsistency. Note that the representation languages can still be different.

· Weakly-translatable – source ontology O1 is weakly translatable to the target ontology O2. The translation permits some loss of information (e.g. terms are simplified in O2), but does not permit the introduction of inconsistency.

· Approx-translatable – source ontology O1 is approximately translatable to the target ontology O2. The translation permits even the introduction of inconsistencies, i.e. some of the relations are no longer valid and some constraints do not apply anymore.

The problem of deciding the relationship between two ontologies is still an open research issue, mainly because deciding whether two logical theories (as ontologies usually are) are related to each other is in general computationally very difficult. Therefore, knowing about these relationships often requires manual intervention.
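
As a small data-structure sketch (in Python; the six constants mirror the FIPA levels above, while the derived properties are our own reading of the specification text), an agent could record known relationships and derive what a translation based on them guarantees:

from enum import Enum

# The six relationship levels from the FIPA ontology service specification.
class OntologyRelation(Enum):
    EXTENSION = "extension"
    IDENTICAL = "identical"
    EQUIVALENT = "equivalent"
    STRONGLY_TRANSLATABLE = "strongly-translatable"
    WEAKLY_TRANSLATABLE = "weakly-translatable"
    APPROX_TRANSLATABLE = "approx-translatable"

def translation_properties(rel):
    """Illustrative only: what a translating agent might conclude."""
    lossless = rel in (OntologyRelation.IDENTICAL,
                       OntologyRelation.EQUIVALENT,
                       OntologyRelation.STRONGLY_TRANSLATABLE)
    consistent = rel is not OntologyRelation.APPROX_TRANSLATABLE
    return {"lossless": lossless, "consistency_preserved": consistent}

print(translation_properties(OntologyRelation.WEAKLY_TRANSLATABLE))
# {'lossless': False, 'consistency_preserved': True}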

3.5 Translating between Ontologies

When knowledge is to be shared between two systems that do not share the same ontology, either changing the ontologies or translation is needed. We focus here on translation across semantic mismatches; we do not consider translation between different representation languages. In this section we are particularly interested in the form and representation of the mapping. A 1:1 mapping between two ontologies can be represented as a simple table that can be used for direct translation; however, there can be more complicated mappings that require more sophisticated ways of representing the translation.

Mapping between ontologies is often done manually using a supporting system. An example is Microsoft BizTalk Mapper (Microsoft), which supports one-way mapping between different XML documents. The environment of the BizTalk Mapper shows both XML documents and enables connecting their elements visually. When connecting, the user may use so-called functoids, which are various operations for conversion between different fields (e.g. unit conversion). Custom functoids can be created. The result of the mapping is stored and can be used to generate an XSL (Extensible Stylesheet Language) style sheet, which can then be used to convert other XML documents that have the same structure.

The SMART system (Noy, et al. 1999) supports a user in aligning or merging ontologies. The system provides operations on parts of ontologies that can be used to achieve the goal (align or merge). These operations include (the list is not complete):

· Merge – merges two frames of the same type
· Shallow copy – copies a frame from one ontology to another without anything that the frame refers to (e.g. slots)
· Deep copy – copies a frame to another ontology together with a recursive copy of all superclasses of that frame, until the root of the ontology is reached
· Remove frame – removes a frame from the ontology
· Remove parent – removes the relation of being a parent (superclass)
· Add parent – adds the relation of being a parent (superclass)
· Rename – changes the name of a specified frame
· Remove attachment – removes a slot from a frame together with its facets
· Attach slot – attaches a new slot to a frame

These operations can be written as sequences of primitive OKBC API calls.

In (Park, et al.) three kinds of mappings are distinguished – implicit, procedural and declarative. An implicit mapping is a conversion that someone performs to adapt a component to work with another component. A procedural mapping involves explicit translation code that converts instances from a knowledge base. Advantages of procedural mapping include its efficiency, while disadvantages include its complexity and the difficulty of reuse. A declarative mapping constitutes a descriptive method for defining the conversion between entities in the knowledge bases. The mapping interpreter is then usually a translation engine that parses the mapping declarations and performs the conversion. (Park, et al.) propose a simple yet useful mapping ontology for describing several mapping operations. This ontology serves for creating an instance of a mapping that can be used for mapping between two particular ontologies.

Translation between vocabularies defined in logical formulas is described in (van Eijk, et al. 2001). Translation formulas define how formulas from different first-order systems with different signatures can be translated, by saying which symbols have similar meanings. No assumptions are put on the ontologies – they are considered as any set of axioms expressed in a first-order system. An example of a translation formula is θ = ∀x (s = t), which says that a term s from ontology O1 can be translated to a term t in ontology O2. A formula φ expressed in ontology O1 can then be understood in ontology O2 by combining it with the translation formula, i.e. as φ ∧ θ. It is further discussed what properties a translation formula should have; for example, it should not introduce any inconsistency with the target ontology after translation.

OntoMorph (Chalupsky 2000) is a system that provides a language for the syntactic transformation of ontologies that is not very different from XSLT. The rewrite rules used in the language have the general form "rewrite pattern to result". When they are applied, they find patterns in a text form of knowledge base statements and replace these patterns with the results. The language is general enough to provide syntactic translation even between different representation languages.

A general approach to representing mappings between variously structured ontologies is presented in (Burstein, et al., McDermott, et al.). Lambda functions are used as general expressions called "glue code". In this approach, the body of a lambda function represents a conversion procedure between various knowledge structures. It can be applied to statements expressed in one ontology to get equivalent statements expressed in another ontology. Combining lambda functions and applying them to a source statement results in a statement in the target ontology. Conversion lambda functions can be found by solving higher-order equations for the unknown functions.


This process is not automatic; however, it is shown in (Burstein, et al.), on a fairly complex example of two airport agents, how the solving of such an equation for the transformation lambda functions may proceed.
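
To make the declarative style concrete, here is a minimal sketch in Python of a translation engine driven by a declarative 1:1 mapping table with optional value-conversion functions in the spirit of functoids and glue code (the ontologies, the table and the statement format are invented for illustration and are not taken from (Park, et al.) or (Burstein, et al.)):

# Declarative 1:1 mapping between two hypothetical ontologies.
# Each entry: source term -> (target term, optional value converter).
mapping = {
    "name":      ("nomenclature", None),
    "weight-kg": ("weight-lb", lambda kg: kg * 2.20462),  # functoid-like conversion
}

def translate(statement, mapping):
    """Translate a (predicate, subject, value) statement from the
    source ontology to the target ontology, if a mapping entry exists."""
    predicate, subject, value = statement
    if predicate not in mapping:
        raise KeyError("no mapping for term %r" % predicate)
    target, convert = mapping[predicate]
    return (target, subject, convert(value) if convert else value)

# A statement based on ontology #1 becomes a statement based on ontology #2.
print(translate(("name", "part-17", "bolt"), mapping))
# ('nomenclature', 'part-17', 'bolt')
print(translate(("weight-kg", "part-17", 2.0), mapping))
# ('weight-lb', 'part-17', 4.40924)

More complicated mappings would replace the flat table with expressions over several source terms, which is where lambda-function glue code becomes necessary.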

3.6 Language Games

Language game is a term introduced by the philosopher Ludwig Wittgenstein (see e.g. (Shawver)). He emphasized that natural language and meaning are not based on context-independent abstractions, but rather arise as a part of specific interactive situations. This means that the meaning of a word or phrase comes from its role in a game. It can be different in different games (contexts) and usually cannot be clearly and exactly defined in the manner of a pure logical description. He illustrated his ideas on the way in which children (and also adults in unknown situations) learn to speak – they try to say something to describe what they mean; they may succeed in some situations (so that the other person understands them) and may be corrected in other situations. Based on their observed success, they learn where to use which words or phrases.

This idea was taken up by several researchers (Steels 2001), who introduced language games between agents. A language (or guessing) game is played between two agents that share the same world. The agents develop their shared language by pointing to objects in the shared world and by reinforcement learning. The details differ between publications, but the main idea can be described as follows. Agent A says something to agent B, and agent B guesses what agent A described in the world. Agent A may already know, from previous conversations, some construct for describing an object. If not, he may invent a new word or phrase (agents start with empty vocabularies). The agents then point to the object that they identified from the utterance and see whether the game was successful, i.e. whether they pointed to the same object. Based on the game result, both agents may change their vocabularies. They may increase the weights of words that succeeded and decrease the weights of words that failed in communication. After usually several thousand played games, the agents are able to communicate about the shared world. This has been demonstrated both on a very simple two-dimensional world of points (Kaplan) and on a more complex physical world of colored geometric figures on a white board in the Talking Heads experiment (Steels 2001). In the Talking Heads experiment, during a three-month period, the agents played about half a million language games and created a stable core vocabulary of 300 words (they generated thousands of words overall) (Steels 2001).

The core idea is that agents evolve their vocabularies through the language games and that they may introduce new words if their vocabularies are not sufficient to describe something in the world. Even if they start with empty vocabularies, and even when multiple agents are involved, they are able to stabilize the dictionary and evolve a language used for describing a shared domain. The described model of language games was used for meanings constructed from perceptions (e.g. the visual source in the Talking Heads experiment) and for constructing new artificial words for describing the domain from an empty dictionary. It would be interesting to use a similar model for finding a mapping between ontologies, when the agents only want to find the mapping and do not want to develop a completely new language. A work that goes in this direction is (Wiesman, et al. 2001) – an approach to finding a mapping between two address databases based on a language game is described in the next section.
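
The following is a minimal sketch of such a guessing game in Python (the world, the word-invention scheme and the update constants are our own drastic simplifications of the schemes described in (Steels 2001)):

import random

class GameAgent:
    def __init__(self):
        # word -> {object: weight}; both agents start with empty vocabularies
        self.lexicon = {}

    def word_for(self, obj):
        """Pick the strongest word for an object, inventing one if necessary."""
        scored = [(w, m[obj]) for w, m in self.lexicon.items() if obj in m]
        if scored:
            return max(scored, key=lambda p: p[1])[0]
        word = "w%04d" % random.randrange(10000)   # invent a new word
        self.lexicon[word] = {obj: 0.5}
        return word

    def guess(self, word, world):
        """Guess which object the word denotes (random guess if unknown)."""
        m = self.lexicon.get(word)
        return max(m, key=m.get) if m else random.choice(world)

    def update(self, word, obj, success):
        m = self.lexicon.setdefault(word, {})
        m[obj] = m.get(obj, 0.5) + (0.1 if success else -0.1)  # reinforcement step

world = ["circle", "square", "triangle"]
a, b = GameAgent(), GameAgent()
for _ in range(5000):                      # play many games
    topic = random.choice(world)
    word = a.word_for(topic)               # speaker chooses or invents a word
    guess = b.guess(word, world)           # hearer guesses the object
    success = (guess == topic)
    a.update(word, topic, success)
    if not success:
        b.update(word, guess, False)       # weaken the wrong association
    b.update(word, topic, True)            # speaker points to the topic afterwards

After many games the two lexicons converge, so that the hearer almost always points to the same object as the speaker.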

3.7 Learning of Operations with Ontologies

Language games present an approach to learning completely new vocabularies or lexicons that enable describing a shared domain. Other approaches are available that take a structured knowledge base and generate an ontology from it. In this section we are interested mainly in approaches that operate on already existing ontologies or database schemas and that try to learn a mapping or alignment of existing ontologies.


Aligning hierarchies of concepts is described in (Ichise, et al.). A statistical method that evaluates similarity from top to bottom is used to find an alignment between two Internet directories. When comparing two concepts, two categorization criteria S1 and S2 are used. It is supposed that we can decide whether a particular instance belongs to a particular category or not. So, for every two categories we can make a table that shows the number of instances belonging or not belonging to the S1 and S2 categories. We may suppose that if the categories are very similar, then there are many instances that either fall into both categories or fall out of both categories. On the other hand, when they are not similar, instances may fall randomly into one category and not into the other. Pairs of categories are then recursively compared to find an appropriate alignment. According to (Ichise, et al.), over 80% of instances were classified correctly after the alignment, except for problematic parts of the ontologies that are further discussed in that work.
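
A sketch of the instance-overlap idea in Python (the measure shown is a plain agreement ratio, a simplification of the statistical evaluation actually used in (Ichise, et al.)):

def category_similarity(instances_s1, instances_s2, universe):
    """Fraction of the universe on which two categories agree, i.e.
    instances falling into both of them or out of both of them."""
    s1, s2 = set(instances_s1), set(instances_s2)
    agree = sum(1 for i in universe if (i in s1) == (i in s2))
    return agree / len(universe)

universe = ["d1", "d2", "d3", "d4", "d5", "d6"]   # all known instances
category_a = ["d1", "d2", "d3"]                   # a category in directory A
category_b = ["d1", "d2", "d4"]                   # a candidate category in directory B
print(category_similarity(category_a, category_b, universe))  # 0.666...
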
Automatic ontology mapping based on common instances classified in two ontologies is discussed in (Wiesman, et al. 2001). Learning is illustrated on two directories of people that have common instances. Two descriptions of the same person are compared, and a concept mapping is derived using additional string operations such as concatenation. For example, the first name and the family name can be two different attributes in one directory and a single attribute in the other, so the values of these attributes have to be concatenated during mapping. The approach is discussed in relation to language games and communication between agents.

An approach that is also based on comparing tokens used in attribute descriptions is described in (Perkowitz, et al.). An algorithm that learns a mapping from external sources to local categories is illustrated on personal directories. The algorithm finds an explanation of received information by comparing tokens to local categories and makes a hypothesis of relevance from this explanation. The hypothesis can be explanatory, inconsistent or consistent – the triple of evaluations of these properties forms the score of the hypothesis. Generated hypotheses are compared to other hypotheses from the history, and the algorithm terminates when some hypothesis is significantly better than the other ones. A detailed theoretical analysis of the algorithm is provided.

Formal concept analysis (Ganter, et al. 1999) is a mathematically grounded technique for deriving a lattice of concept types from objects (instances) that have been assigned different properties. Objects that have the same properties are classified under the same concept. The resulting concepts are organized in a lattice, i.e. there is a partial order between concepts and we can find a common superconcept and subconcept for any two concepts. Every concept is equivalently specified by its intent, i.e. the concept properties, and its extent, i.e. the objects that are classified under that concept. This method can also be used for deriving lattices from the values of attributes, not only from attribute-object assignments – details are discussed e.g. in (Ganter, et al. 1999). Formal concept analysis can obviously be used for generating the core of an ontology from existing instances. It has been applied as a data mining procedure for finding patterns in sets of objects.

Formal concept analysis was also used for bottom-up merging of ontologies (Stumme, et al.). If there are no instances described in both ontologies, they have to be generated. The generation process uses a linguistic analysis of several documents and creates a list of concepts from both ontologies that seem to be relevant to each document. This yields a formal context – an assignment of objects (documents) and properties (concepts from the original ontologies). This formal context is processed by formal concept analysis to create a pruned lattice. Only concepts that are above at least one formal concept generated by an (ontology) concept of the source ontologies are considered for the resulting lattice. The lattice serves as a suggestion for merging similar concepts. The last step, the derivation of the merged ontology from the resulting lattice, requires human intervention.

The system LSD (Learning Source Descriptions) (Doan, et al. 2001), which learns a 1:1 XML mapping between a mediated schema and data source schemas, uses multi-strategy learning. The input is a set of schemas and common instances together with background knowledge; the output should be the mapping. The key idea behind LSD is that the system should reuse the information hidden in previously created manual or automatically generated mappings.


It learns a mapping from a user and then tries to apply it to other schemas. Several learners observe the mapping made by the user and then produce a prediction together with a confidence score. The base learners include a Name learner, Naive Bayes and an XML learner – each of these learners observes different features of the mapping. The base learners can be combined by meta-learners, together with weights specifying their confidence. The meta-learners are then used (based on the combined confidence scores) to suggest mappings of previously unknown schemas.
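
As an illustration of the multi-strategy idea (in Python; the two stub learners and their weights are invented stand-ins, not LSD's actual components):

# Each base learner returns (predicted target term, confidence in [0, 1]).
def name_learner(source_term):
    # stub for, e.g., string similarity between element names
    return ("nomenclature", 0.7) if source_term == "name" else (None, 0.0)

def instance_learner(source_term):
    # stub for, e.g., a Naive Bayes classifier over instance values
    return ("nomenclature", 0.6) if source_term == "name" else (None, 0.0)

LEARNERS = [(name_learner, 0.5), (instance_learner, 0.5)]  # (learner, weight)

def combine(source_term):
    """Meta-learner: weighted vote over the base learners' predictions."""
    scores = {}
    for learner, weight in LEARNERS:
        prediction, confidence = learner(source_term)
        if prediction is not None:
            scores[prediction] = scores.get(prediction, 0.0) + weight * confidence
    return max(scores.items(), key=lambda kv: kv[1]) if scores else (None, 0.0)

print(combine("name"))  # approximately ('nomenclature', 0.65)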

3.8 Communication in Multi-Agent Systems and Ontologies

Knowledge sharing and exchange is particularly important for communication in multi-agent systems. An agent is usually described as a persistent entity with some degree of independence or autonomy that carries out some set of operations depending on what he perceives. An agent usually contains some level of intelligence, so he has to have some knowledge about his goals or desires. In a multi-agent system, an agent cooperates with other agents, so he should have some social and communicative abilities. Each agent has to know something about the domain he is working in and also has to communicate with other agents. An agent is able to communicate only about things that can be expressed in some ontology. This ontology must be agreed upon and understood within the agent community (or at least within a part of it) in order to enable each agent to understand messages from other agents.

The ontology in a multi-agent system can be explicit or implicit (FIPA 1998). It is explicit when it is specified in a declarative form, e.g. as a set of axioms and definitions. It is implicit when the assumptions about the meaning of its vocabulary are implicitly embedded in the agents, i.e. in the software programs representing the agents. The explicit form enables (and requires) communication about the ontology, which can be modified when the agents agree on that. The implicit form is fixed, so no communication about the ontology is required, but a change is impossible without reprogramming the agents. It is obvious that in open systems, where agents designed by different programmers or organizations may enter into communication, the ontology must be explicit. In these environments, it is also necessary to have some standard mechanism to access and refer to explicitly defined ontologies. We will describe how this mechanism is proposed e.g. by FIPA (FIPA 1998).

First of all, agents must be able to send messages that have a common form not depending on the particular message content or on the ontology that gives semantics to the content. Usually some derivative of KQML (Knowledge Query and Manipulation Language (KQML)) is used. FIPA adopts KQML with some minor changes and recommends using its FIPA agent communication language (ACL) (FIPA 1997). An example message in this language is:

(inform
  :sender agent1
  :receiver hpl-auction-server
  :content (price (bid good02) 150)
  :in-reply-to round-4
  :reply-with bid04
  :language sl
  :ontology hpl-auction
)

As we can see, this format is content and ontology independent – the content can be anything, and the language and ontology of the content are specified explicitly. FIPA adopts the OKBC knowledge model, which specifies what an ontology should look like. The FIPA recommendations also include descriptions of several content languages, such as SL, KIF and RDF.

FIPA proposes a special dedicated "ontology agent" (OA) to handle ontology services. The role of such an agent is to provide some or all of the following services (FIPA 1998):

· discovery of public ontologies in order to access them
· help in selecting a shared ontology for communication
· maintaining (e.g. registering with the directory facilitator, uploading, downloading, or modifying) a set of public ontologies
· translating expressions between different ontologies and/or different content languages
· responding to queries for relationships between terms or between ontologies
· facilitating the identification of a shared ontology for communication between two agents
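
A minimal sketch of such a service interface in Python (the method names and storage are entirely our own – FIPA defines these services only at the level of agent interactions, not as a programming API):

class OntologyAgent:
    """Sketch of an ontology agent's services; all names are illustrative."""

    def __init__(self):
        self.ontologies = {}      # ontology name -> definition
        self.relationships = {}   # (name1, name2) -> relationship level
        self.translations = {}    # (term, source, target) -> translated term

    def register(self, name, definition):
        """Maintain a set of public ontologies."""
        self.ontologies[name] = definition

    def relationship(self, name1, name2):
        """Answer a query for the relationship between two ontologies."""
        return self.relationships.get((name1, name2), "unknown")

    def translate_term(self, term, source, target):
        """Translate a term between two registered ontologies, if known."""
        return self.translations.get((term, source, target))

oa = OntologyAgent()
oa.translations[("name", "ontology-1", "ontology-2")] = "nomenclature"
print(oa.translate_term("name", "ontology-1", "ontology-2"))  # nomenclature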

It is not mandatory that an ontology agent provides all of these services, but every OA must be able to participate in communication about these tasks. Also, it is not mandatory that every agent platform contains an ontology agent, but when an ontology agent is present, it must conform to the FIPA specifications. Example scenarios (FIPA 1998) of using particular OA services are:

· Querying the OA for definitions of terms – a user interface agent A wants to receive pictures from a picture-archiver agent B to show them to a user. It asks agent B for "citrus". However, agent B discovers that it does not have any picture with that description. So it asks the appropriate OA for the sub-species of "citrus" within the given ontology. The OA answers B that "orange" and "lemon" are sub-species of "citrus", so agent B can send pictures with these descriptions to agent A and thus satisfy its requirements.

· Finding an equivalent ontology – an ontology designer declares the ontology "car-product" to the ontology agent OA2 in the U.S. in English terms and translates the same ontology to French for the ontology agent OA1 in France. Agent A2 uses the ontology from OA2 and wants to communicate with agent A1 about cars in the ontology maintained by OA2. Because agent A1 does not know the ontology of agent A2, it queries OA1 for an ontology equivalent to the one used by A2 (and maintained by OA2). OA1 returns its French ontology about cars, so A1 can inform A2 that these two ontologies are equivalent and that OA1 can be used as a translator. After that, a dialogue between A1 and A2 can start.

· Translation of terms – agent A1 wants to translate a given term from ontology #1 into the corresponding term in ontology #2 (for example, the concept "the name of a part" can be called "name" in ontology #1 and "nomenclature" in ontology #2). A1 queries the directory facilitator for an OA that supports the translation between these ontologies. The DF returns the name of an OA that knows the format of these ontologies and has the capability to translate between them. A1 can then query this OA and request the translation of a term from ontology #1 to ontology #2.

The FIPA recommendations only describe some possible scenarios of OA use, together with an ontology for communication with the OA. A full FIPA ontology agent that would provide all the described features (especially translation) has not yet been implemented. The implementation described in (Suguri, et al. 2001) serves in fact as a FIPA-compliant agent interface to an OKBC database: it supports performing actions with ontologies (such as changing and querying them) and queries for ontology relationships (the relationships must be determined by a human and stored in a special database). Translation is the feature that is not supported.

Ontology negotiation between search agents, described as a protocol that enables finding the intended meaning of a used word in a common ontology, is discussed in (Bailin, et al. 2001). Search agents that use only a part of a big common ontology (WordNet (WordNet) in this case) exchange messages and may use the protocol of ontology negotiation to find the common meaning or explanation of a term used in query keywords, by asking for explanations. Relations that are used for the explanation of terms include specialization, generalization, similar meanings, implication and intersection. Agents may update their ontology if they find it useful.


4 Discussion

We probably cannot expect that there will ever be a final ontology, even for a less complex domain, that would be optimal for everyone forever. We can, however, hope that there will be a good ontology representation and sharing framework that enables ontology reuse and sharing even between ontologies that are not exactly the same. When two agents that do not share the same ontologies want to communicate, they may not understand each other, because they use different terms for the same or similar meanings (or the same terms for different meanings). In this case, translation between the meanings of terms, or meaning negotiation, is needed. Today, humans usually construct the translation between two ontologies at once, and the translation is then used to translate various statements based on these ontologies. This is clearly not scalable when many ontologies are involved. More automatic, decentralized and incremental approaches are preferred in this report. As stated in section 2, we would like agents to be able to:

· translate messages from one agent for another agent, so that those agents can communicate and understand each other;

· learn how to understand each other, i.e. learn this translation (or mapping) between ontologies, or negotiate the meaning of messages or parts of messages.

In this section, we will elaborate these ideas in more detail.

4.1 Architecture

A knowledgeable agent has one or several knowledge bases in which it stores its view of the world, its beliefs, desires, plans etc. All of these knowledge bases are based on some ontology, and if an agent wants to communicate about their content, it has to share the ontology with other agents, or there must be a mechanism for translating its messages. We will focus on communication between agents that may not share exactly the same ontology.

4.1.1 Communication Between Agents

In this case, a translation between statements expressed in different ontologies is needed. We can distinguish two possible ways – translation via a specialized agent, or translation by each agent itself.

The first way, translation via a specialized agent, is the approach proposed by FIPA (the ontology agent in (FIPA 1998)). An advantage of this approach is that each agent can specialize in its own task and ask a specialized agent for the translation service. The translation agent can provide translation during communication as well as find a translation for two agents using different ontologies. In this case, both agents must be able to express the form of their ontologies, so that they can communicate with the ontology agent about the ontology they intend to use. However, when the ontology translator agent is unavailable for whatever reason, no communication can proceed. Provided that the translation is straightforward, i.e. no meaning negotiation or learning is needed, this type of communication requires twice as many messages (agent – translator – agent instead of agent – agent, because all communication must go through the ontology translator agent).

The latter way, translation by each agent, requires more powerful agents. They must be able to translate messages themselves, and also to find a translation when the exact process of translation between particular ontologies has not been entered by a human. However, the knowledge of how to translate is stored in a more distributed way in this case. This means that the knowledge can be stored, and also learned, at more places, which requires more memory and time resources; on the other hand, this approach does not rely on one specialized agent that is required for all communication between agents that do not share their ontologies. A central ontology translator also has the disadvantage that it knows about the ontologies of all agents. This may not be acceptable to agents that do not want to fully expose their proprietary ontology and only want to perform a short communication with a particular agent about a particular subject. They may not want their whole ontology to be stored publicly (or even privately) at another place.

Hybrid approaches are also possible. For example, an ontology agent may be able to translate messages, to learn a translation, and to send "translation rules" to individual agents, which can then use them for individual communication without having to learn the translation themselves. In all cases it is obvious that if agents have to learn the translation themselves, they must be able to reason not only about the knowledge they have, but also about the ontologies they use. Clearly, there must be some common protocol and ontology to establish the communication even when the ontology for communication about a particular subject or domain is not shared. This common ontology can include, for example, constructs for expressing parts of ontologies for negotiation about their meaning.

It should be noted that translation rules for a whole ontology are not needed for successful communication. In the case of big ontologies describing a large domain, only a part of the ontology needs to be translated for communication about a part of the domain. Agents have to know how to translate only those terms that are used in their communication.

In this work, we initially focus on a simple model of two communicating agents. Both of these agents share a protocol and an ontology that allows them to communicate about ontologies. The agents have an ordinary structure, but in addition they have a module that is able to translate between ontologies and to reason about ontologies (if they have to learn the translation). This module must also know about the ontology that an agent uses for its knowledge base. Once the "translation rules"5 for a subject of communication are known, this module can serve as a wrapper of the agent's knowledge base for communication with the other agent. The goal is to enable communication between agents that have different ontologies even without prior knowledge of translation rules. The translation knowledge should be built incrementally and partially, without having to find a translation between two whole ontologies in order to proceed with a communication about a simple subject. We believe that this model can then be easily extended to a learning ontology agent that would provide a translation service between two or more communicating agents.

4.1.2 Agent Architecture

In the previous text we discussed ways of communication between agents. However, the internal architecture is also important. An agent that uses explicit ontological knowledge in communication has to store this knowledge somehow and must be able to reason about it. An efficient organization of other types of knowledge stored in separate knowledge bases is discussed e.g. in (Pechoucek, et al. 1998). The ontological knowledge should be accessible at least to the "communication interface" of an agent and available to the translation engine. If the ontology is to evolve and change, the ontological knowledge must also be accessible to all the knowledge bases of the agent that use that ontology. In the easiest case, where a translator wraps the agent's knowledge base and the organization (ontology) of the knowledge base is not to be changed, the ontological knowledge is stored only in the translation wrapper and can be fully separated from the agent body. This allows the integration of existing agents communicating in different ontologies simply by explicitly defining the ontology they use and adding a communication wrapper that enables translation between different ontologies.
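The following sketch illustrates the translation-wrapper idea under the simplifying assumption that messages are lists of terms and that the translation knowledge is a partial term-to-term mapping; all names are hypothetical. A real wrapper would operate on structured content expressed in a content language, but the division of responsibilities would be the same: the wrapper translates inbound and outbound content and knows which parts of a message it can already translate.

# Minimal sketch of a translation wrapper sitting between an agent's
# knowledge base and its communication interface. Messages are modeled as
# lists of terms for illustration only.

class TranslationWrapper:
    def __init__(self, knowledge_base, translation_rules):
        self.kb = knowledge_base        # the agent's own knowledge base
        self.rules = translation_rules  # partial term mapping, may grow

    def inbound(self, message):
        """Translate an incoming message into the agent's own ontology."""
        reverse = {v: k for k, v in self.rules.items()}
        return [reverse.get(t, t) for t in message]

    def outbound(self, message):
        """Translate an outgoing message into the partner's ontology."""
        return [self.rules.get(t, t) for t in message]

    def can_translate(self, message):
        """Only terms covered by the rules learned so far are guaranteed."""
        return all(t in self.rules for t in message)

wrapper = TranslationWrapper(knowledge_base={},
                             translation_rules={"name": "nomenclature"})
print(wrapper.outbound(["name", "part-17"]))       # ['nomenclature', 'part-17']
print(wrapper.can_translate(["name", "part-17"]))  # False: 'part-17' unknown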

5 By "translation rules" we mean here the knowledge about the translation between two particular ontologies. This knowledge does not have to be expressed as rules.


4.2 Translation Between Ontologies

By translation between ontologies we mean the process of translating statements, queries etc. expressed in one ontology into semantically equivalent (if possible) statements in another ontology. The term "semantically equivalent" roughly means that the statements mean the same, i.e.:

· they refer to the same objects (only represented in another way);

· it is possible to infer the same consequences from both statements.

It may not be possible to translate a statement into another ontology so that it means exactly the same – for example because the target ontology does not make the distinctions that the source ontology makes. We generally want the translated statement to be as equivalent as possible, with possible information loss, but without introducing inconsistencies. Even that may not be possible when translating between ontologies that are mutually inconsistent. This problem usually does not arise in translation between simple database schemas, but it can appear when translating between complex ontologies.

To construct a general ontology translation framework, we have to identify the possible mismatches between ontologies and propose an acceptable way of dealing with them (i.e. finding possible translations). A number of ways of translating between ontologies were presented in the state of the art section. The translation algorithm usually derives from the representation of the "translation knowledge" between particular ontologies. From this we can see that the representation of translation knowledge is a core part of an ontology translation framework. Again, we should note that we are interested in knowing how to translate parts of ontologies without knowing how to translate whole ontologies. In this case we obviously have to be able to identify which statements can be translated with the translation knowledge gained so far, and which cannot.

We will start with simple ontologies in the form of hierarchies and continue by adding other features expressible in ontologies, such as attributes and restrictions on attributes or classes (slots, facets). The target is to explore the whole OIL model and to see how far it is possible to go while still being able to say something about the quality of the translation (here we do not mean only the translation itself, which may be easy once we have the translation knowledge, but also the learning of the translation knowledge, which may not be possible in all cases – see the next section). The results – an appropriate translation procedure together with a translation knowledge representation, ontology mismatch resolution, etc. – can be used to implement an ontology translation framework that can serve as a module in agents using different ontologies.
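As a first concrete step in this direction, the sketch below translates a single concept between two hierarchy ontologies: if the concept itself is not mapped, it is translated to the image of its nearest mapped ancestor, which loses information but introduces no inconsistency. The hierarchies, the mapping and the French-like target terms are illustrative assumptions.

# Sketch of one treatment of an ontology mismatch in hierarchies: fall back
# to the nearest mapped ancestor when no exact counterpart exists.

def ancestors(concept, parents):
    """Yield the concept and its ancestors, walking the is-a hierarchy up."""
    while concept is not None:
        yield concept
        concept = parents.get(concept)

def translate_concept(concept, parents, mapping):
    """Exact translation if mapped; otherwise the nearest mapped ancestor."""
    for c in ancestors(concept, parents):
        if c in mapping:
            exact = (c == concept)
            return mapping[c], exact
    return None, False  # untranslatable with the current knowledge

source_parents = {"lemon": "citrus", "citrus": "fruit"}
mapping = {"citrus": "agrume", "fruit": "fruit"}  # hypothetical target terms
print(translate_concept("lemon", source_parents, mapping))
# ('agrume', False) -- translated with information loss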

4.3 Learning to Translate

Our agents should be able to find the translation between ontologies themselves. This means that they have to be able to communicate and reason about ontologies and to learn the translation actively. By "learning actively" we mean that the learning is not only passive learning from available information, but also includes a process of requesting new information in order to find a guaranteed translation. The translation module should be exact, i.e. for a statement there should be an algorithm that translates that statement into another ontology without any uncertainty. However, while learning the translation, an agent may not be able to say exactly which concept corresponds to which, but it may be able to say that a concept can be translated (with possible information loss) to concepts A, B, or C, but not to concept D, since this would introduce an inconsistency. The mapping knowledge can be uncertain in the first phases of learning (knowing that something is possible, sure, or impossible should be distinguished). The mapping knowledge and the knowledge about how accurate the mapping knowledge is should be stored separately. A possible goal of learning to translate a simple statement is to find a guarantee that the statement cannot be translated in a better way than with the current knowledge.

Proposals for how to learn an ontology mapping include a language-game-like approach and negotiation based on ontology structure. It is obvious that agents must share something in order to start communication, i.e. some basic protocol and ontology for communicating about ontologies, or the ability to point to a shared world. Language games learn a shared ontology from described objects that are located in a shared world. Agents can point to these objects, so that they can verify which object they are speaking about. This kind of learning can be used when the ontologies between which we should translate have nothing in common; using this approach, common concepts can be found from examples. Note that, unlike in classical language games, we are interested in finding a translation between existing ontologies. The learning can be driven by an agent that wants to find the meaning of a specific term by asking targeted questions rather than by going through thousands of random examples.

Negotiation of ontology can be used when some parts of the ontologies are shared and agents have to negotiate the meaning of a particular term. This is not an unrealistic assumption, since many ontologies are available for reuse; however, when using them, different authors may add different features appropriate for their applications. There are also efforts to find an upper ontology that would be acceptable as a base for creating ontologies – see e.g. the IEEE Standard Upper Ontology Group (SUO) (SUO). This negotiation can also be used after several common concepts have been found using the approach mentioned in the previous paragraph. In this kind of negotiation, all the available features of ontologies can be used, such as the is-a relation, slots, facets etc., to find structural similarities. Ontology structures can be compared in several ways – basically, top-down and bottom-up approaches can be distinguished. It may also be possible to compare ontologies as complex structures and find a possible mapping between them even when no common concepts are known. This would be possible under the assumption that both ontologies are very similar or identical in structure, and that the only difference is the naming of concepts, attributes etc.

As in the case of ontology translation, in learning we will also start with simple ontologies in the form of hierarchies and continue by adding other features expressible in ontologies. Again, the target is to explore the whole OIL model and to see how far it is possible to go while still being able to say something about the quality and accuracy of the translation. If only the content of a single message has to be translated, the learning should be directed to sufficiently learn only the terms used in that message. Finding the conditions under which learning a translation is achievable is an important part of this work.
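The possible/sure/impossible bookkeeping mentioned above can be sketched as follows: every target concept starts out as a possible image of each source concept, negative evidence eliminates candidates, and a single remaining candidate becomes a sure mapping. How the evidence is obtained – pointing to shared objects in a language game, or structural negotiation – is left abstract here, and all names are illustrative assumptions.

# Sketch of uncertain mapping knowledge kept separately from the (exact)
# translation module: candidate sets shrink as negative evidence arrives.

class MappingLearner:
    def __init__(self, source_concepts, target_concepts):
        # initially, every target concept is a possible image of each source one
        self.candidates = {s: set(target_concepts) for s in source_concepts}

    def rule_out(self, source, target):
        """Record negative evidence: mapping source -> target is impossible."""
        self.candidates[source].discard(target)

    def status(self, source):
        c = self.candidates[source]
        if len(c) == 1:
            return "sure", next(iter(c))
        return ("impossible", None) if not c else ("possible", sorted(c))

learner = MappingLearner(["citrus"], ["agrume", "legume", "fleur"])
learner.rule_out("citrus", "legume")  # e.g. a shared example contradicts it
learner.rule_out("citrus", "fleur")
print(learner.status("citrus"))       # ('sure', 'agrume')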

4.4 Testing

Testing of the translation, and of learning the translation, can be done using artificial ontologies that have the desired features. We plan to start with simple artificial ontologies created for testing purposes and then to apply the algorithms to some of the ontologies mentioned here. Many examples of ontologies conforming to OIL can be found at the DAML website (the ontology library at http://www.daml.org/ontologies/). Some of these ontologies describe the same domain, so it may be challenging to find a translation between them by the processes proposed above. XML schemas for B2B document exchange, which can also be explored, can be found at the BizTalk.Org website. Big ontologies in the form of hierarchies are available at web directory sites, such as Yahoo, Dmoz.Org, the Google directory etc. Many more examples of web sites with ontologies can be found in (Fensel 2001b). Many companies provide information about their products in different forms, so for virtually any B2B/B2C server we can find another one that provides information about similar products in another form. Possible domains of products or services include books, music, traveling, financial data, weather forecasts etc.


We are particularly interested in two domains – the manufacturing (or, more specifically, material handling) domain (Vrba, et al. 2001, Vrba, et al. 2002) and collaboration in the operations other than war (OOTW) application domain (Barta, et al. 2002).

4.4.1 Manufacturing Domain

A multi-agent-based material handling system, as a part of the manufacturing domain, is described in (Vrba, et al. 2002). In the domain of communication on the assembly line we can usually expect a high level of ontology standardization. It may not be necessary to dynamically learn and negotiate the translation between different ontologies in this case; instead, once the translation is found and validated, it is not supposed to change. Often, when two agents of the same type (i.e. representing the same kind of physical entities) communicate, they may not require any translation at all because of the standardization. However, the need for translation between different ontologies may easily arise when two agents of different types communicate, because their ontologies view the same objects from different points of view (a similar situation is illustrated in the airport domain in (Burstein, et al. 2002)).

Let us illustrate this situation with a simplified example scenario that takes into account a broader area of manufacturing. Agent O in a company that produces engines receives an order for a particular type of engine. It records the order with the details about the customer and asks a planning agent P to estimate the time needed to produce such an engine, announcing that time back to the customer. The planning agent (or another one) also prepares a description of the jobs required to assemble that engine and sends the proper information to agents Mi that control assembly and other machines. These agents may request delivery agents Di to deliver workpieces that are to be manufactured and assembled, or to deliver prefabricated parts to other machines. Once the job is done, the engine is sent to a storage area to be distributed to the customer, and agent O is notified. Each part of the communication between these different agents requires a different ontology, even when the agents are talking about the same physical object. The same problems arise when agents from different companies are involved in the communication; different companies may in addition use different conceptualizations of the same subdomain. Clearly, the translation between these ontologies requires a lot of background knowledge – it is not based on simple rules only.

Another, simpler scenario is the sorting line of a delivery service company. Packaged parcels may enter the line at different places (as they are delivered from different locations). The content of the parcels needs to be unpacked and transported to the right place, where it is packed into a parcel again to be delivered to another location. The unpacking and packing machines view the objects differently from the agents that take care of the transportation between these machines. Again, a translation between different ontologies that in fact describe the same objects from different views is needed.

The advantage of the manufacturing domain, or the material handling subdomain, is the relative simplicity of the used ontologies (see Appendix A for an example of the material transportation ontology), so ontologies from this domain can be used for early testing. On the other hand, the information loss that can arise in communication because of translation may cause serious problems. In this case, agents should be able to recognize the situation and possibly ask a human for conflict resolution.

4.4.2 OOTW Domain

Collaboration in coalitions between different organizations or institutions in the operations other than war (OOTW) domain is studied in the CPlanT project (Barta, et al. 2002). These operations include humanitarian help after disasters. The organization of the collaborating entities is dynamic and flexible, in contrast to a strictly hierarchical organization. The main motivation of this approach is that the organizations do not want to share their private information (e.g. intentions, goals, resources) with a central planning facility and want to be as flexible as possible in collaboration. Every organization wants to keep its private information private and still be able to communicate with other organizations that could be possible collaborators.


There are several ontologies used in the CPlanT system. One of the most interesting for our purposes is the one that describes the humanitarian services that the agents (or the organizations represented by these agents) provide. It is likely that in real situations different organizations will use different ontologies for representing knowledge about their resources and services, for the reasons mentioned above (different views and distinctions etc.). A scenario of agents using different ontologies that need to translate between them can easily occur here. The advantage of this domain is that it can provide very rich ontologies for testing purposes, and that a heuristic translation – which may introduce a loss of information due to the differing views in the ontologies – can be more easily employed here than in the manufacturing domain. Here a loss of information does not mean that the communication should be stopped until someone else resolves the conflicts.

4.5 Other Related Issues

We will enumerate a few other related issues that are interesting in the frame of this report:

· An ontology may evolve (e.g. new features of products may be introduced for a shopping agent) and thus partially invalidate learned translation rules. How can this situation be recognized, and how can invalid translation rules be updated efficiently?

· A change of ontology may also require a change of the underlying knowledge base. For this purpose, the ontology translation framework can be used after finding a mapping between the old and the new ontology (the change will typically be very small).

· It may be desirable to be able to explain or prove the learned translation to a user (or to another agent), or to explain the process of translation. This can be done either by a textual explanation or through a proper visualization. Several ways of ontology visualization exist, so their usage for this purpose can be explored.

· How can the prototype-based ontology be used for learning a mapping? This is not straightforward, since we need exact translation rules, while the prototype-based ontology contains uncertainty.

· An ontology can contain features other than those reflected in OIL, for example inference rules or general axioms. How should they be handled in translation and translation learning?

· Algorithms developed for the negotiation of ontology between agents can also be used for the same task between an agent and a human. The algorithms may thus also be useful for software agents acquiring ontologies from experts.

5 Conclusion

We have presented and discussed the area of using ontologies for communication in multi-agent systems. The problem of semantic interoperability arises in many areas, such as managing distributed knowledge bases, e-commerce, the semantic web, and the other areas presented. Currently, the translation between different representations is usually performed manually. This approach is not scalable, so more automatic, distributed and incremental approaches are needed. Even a partial solution of the problems mentioned above, or finding the conditions under which these problems are solvable, could help in developing such approaches.

6 Acknowledgements

I would like to thank professor Vladimír Mařík and Dr. Michal Pěchouček for their comments, which helped to improve this proposal. Pavel Vrba is thanked for his help with the manufacturing domain and scenarios, and Jaroslav Bárta for his help with the OOTW domain. This work has been supported by the MSMT grant no. 212300013.


7 References 1. (AT&T 2001) AT&T. Implemented Description Logic-based Systems. 2001. http://www.research.att.com/sw/tools/classic/imp-systems.html 2. (Bailin, et al. 2001) Sidney C. Bailin and Walt Truszkowski. Ontology Negotiation Between Scientific Archives. Proceedings of the 13th International Conference on Scientific and Statistical Database Management. 245-250. 2001. 3. (Barta, et al. 2002) Jaroslav Barta, Michal Pechoucek and Vladimir Marik. CPlanT Coalition Planning Tool. 2002. http://agents.felk.cvut.cz/cplant/ 4. (Bray, et al. 1998) Tim Bray, Jean Paoli and C. M. Sperberg-McQueen. Extensible Markup Language (XML) 1.0. 1998. http://www.w3.org/TR/REC-xml 5. (Burstein, et al. 2002) Mark Burstein, Drew McDermott and Douglas R. Smith. Derivation of Glue Code for Agent Interoperation. Journal of Autonomous Agents and Multi-Agent Systems (to appear). 2002 6. (Chalupsky 2000) Hans Chalupsky. OntoMorph: a translation system for symbolic knowledge. Principles of Knowledge Representation and Reasoning: Proceedings of the Seventh International Conference (KR2000). 2000. 7. (Chaudhri, et al. 1998) Vinay K. Chaudhri, Adam Farquhar, Richard Fikes, Peter D. Karp and James P. Rice. Open Knowledge Base Connectivity 2.0.3. 1998. http://www.pms.informatik.uni-muenchen.de/mitarbeiter/ohlbach/Ontology/OKBC/okbc-2-03.pdf 8. (DAML 2001) DAML. DAML - DARPA Agent Mark-Up Language. 2001. http://www.daml.org/ 9. (Doan, et al. 2001) A. Doan, P. Domingos and A. Halevy. Reconciling Schemas of Disparate Data Sources: A Machine Learning Approach. Proceedings of the ACM SIGMOD Conference on Management of Data (SIGMOD-2001). 2001. 10. (Fensel 2001a) Dieter Fensel. Ontologies and Electronic Commerce. IEEE Intelligent Systems. 16. 2001a 11. (Fensel 2001b) Dieter Fensel. Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. Springer-Verlag. 2001b. 12. (Fensel, et al. 2000) Dieter Fensel, I. Horrocks, F. Van Harmelen, S. Decker, M. Erdmann and M. Klein. OIL in a Nutshell. Proceedings of the 12th European Workshop on Knowledge Acquisition, Modeling, and Management (EKAW'00). 2000. 13. (FIPA 1997) FIPA. Agent Communication Language Specification. 1997 14. (FIPA 1998) FIPA. Ontology Service Specification. 1998 15. (Ganter, et al. 1999) Bernhard Ganter and Rudolf Wille. Formal Concept Analysis Mathematical Foundations. Springer-Verlag. 1999. 16. (Genesereth, et al. 1994) Michael R. Genesereth and Richard E. Fikes. Knowledge Interchange Format, Version 3.0, Reference Manual. 1994. http://logic.stanford.edu/kif/Hypertext/kif-manual.html 17. (Grosso, et al. 1999) W. E. Grosso, H. Eriksson, R. W. Fergerson, J. H. Gennari, S. W. Tu and M. A. Musen. Knowledge Modeling at the Millennium - The Design and Evolution of Protégé-2000. Twelfth Workshop on Knowledge Acquisition, Modeling and Management (KAW'99). 1999. http://www-smi.stanford.edu/pubs/SMI_Reports/SMI-1999-0801.pdf 18. (Gruber 1993a) Thomas R. Gruber. Toward Principles for the Design of Ontologies Used for Knowledge Sharing. Formal Ontology in Conceptual Analysis and Knowledge Representation. 1993a. 19. (Gruber 1993b) Thomas R. Gruber. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition. 5. 1993b


20. (Gruber 1994) Thomas R. Gruber. What is an Ontology? 1994. http://www-ksl.stanford.edu/kst/what-is-an-ontology.html
21. (Guarino, et al. 1995) Nicola Guarino and Pierdaniele Giaretta. Ontologies and Knowledge Bases - Towards a Terminological Clarification. Towards Very Large Knowledge Bases. 1995.
22. (Ichise, et al. 2001) Ryutaro Ichise, Hideaki Takeda and Shinichi Honiden. Rule Induction for Concept Hierarchy Alignment. Proceedings of the IJCAI-01 Workshop on Ontology Learning (OL-2001). 2001.
23. (Kaplan 1998) Frédéric Kaplan. A new approach to class formation in multi-agent simulations of language evolution. Proceedings of the third international conference on multiagent systems (ICMAS 98). 1998.
24. (Karp, et al. 1999) Peter D. Karp, Vinay K. Chaudhri and Jerome Thomere. XOL Ontology Exchange Language. 1999. http://www.ai.sri.com/pkarp/xol/
25. (Klein 2001) Michel Klein. Combining and relating ontologies: an analysis of problems and solutions. Workshop on Ontologies and Information Sharing, IJCAI'01. 2001.
26. (KQML 2001) UMBC KQML. UMBC KQML Web. 2001. http://www.cs.umbc.edu/kqml/
27. (Lassila, et al. 1999) Ora Lassila and Ralph R. Swick. Resource Description Framework (RDF) Model and Syntax Specification. 1999. http://www.w3.org/TR/PR-rdf-syntax/
28. (Madnick 1995) Stuart E. Madnick. From VLDB to VMLDB (Very MANY Large Data Bases): Dealing with Large-Scale Semantic Heterogeneity. Proceedings of the 21st VLDB Conference. 1995.
29. (McDermott, et al. 2001) Drew McDermott, Mark Burstein and Douglas R. Smith. Overcoming Ontology Mismatches in Transactions with Self-Describing Service Agents. International Semantic Web Workshop. 2001.
30. (McGuinness, et al. 2000) Deborah L. McGuinness, Richard Fikes, James P. Rice and Steve Wilder. The Chimaera Ontology Environment. Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI2000). 2000.
31. (Microsoft 2001) Microsoft. Microsoft BizTalk Server Homepage. 2001. http://www.microsoft.com/biztalk/
32. (Mizoguchi, et al. 1996) Riichiro Mizoguchi and Mitsuru Ikeda. Towards Ontology Engineering. The Institute of Scientific and Industrial Research, Osaka University. 1996.
33. (Noy, et al. 2001) Natalya F. Noy and Deborah L. McGuinness. Ontology Development 101: A Guide to Creating Your First Ontology. Stanford University. 2001.
34. (Noy, et al. 1999) Natalya F. Noy and M. A. Musen. SMART: Automated Support for Ontology Merging and Alignment. Twelfth Banff Workshop on Knowledge Acquisition, Modeling, and Management. 1999.
35. (Obitko 2001) Marek Obitko. Ontologies - Description and Applications. Gerstner Lab for Intelligent Decision Making and Control, Czech Technical University in Prague. 2001. http://cyber.felk.cvut.cz/gerstner/reports/GL126.pdf
36. (O'Leary 2000) Daniel E. O'Leary. Different Firms, Different Ontologies, and No One Best Ontology. IEEE Intelligent Systems. 2000.
37. (Park, et al. 1998) John Y. Park, John H. Gennari and Mark A. Musen. Mappings for Reuse in Knowledge-Based Systems. Eleventh Workshop on Knowledge Acquisition, Modeling and Management. 1998.
38. (Pechoucek, et al. 1998) Michal Pechoucek, Jiri Lazansky, Vladimir Marik and Olga Stepankova. Tri-Base Acquaintance Model for Project Driven Production Modelling. Changing the Ways We Work. 623-631. 1998.
39. (Perkowitz, et al. 1995) Mike Perkowitz and Oren Etzioni. Category Translation: Learning to understand information on the Internet. Proceedings of IJCAI95. 1995.


40. (Ribiere, et al. 2001) Myriam Ribiere and Patricia Charlton. Ontology Overview. Motorola Labs. 2001.
41. (Shawver 2001) Lois Shawver. On Wittgenstein's Concept of a Language Game. 2001. http://www.california.com/~rathbone/word.htm
42. (Sowa 2000) John F. Sowa. Knowledge Representation - Logical, Philosophical and Computational Foundations. Brooks/Cole. 2000.
43. (Sowa 2001) John F. Sowa. Ontology. 2001. http://www.jfsowa.com/ontology/index.htm
44. (Steels 2001) Luc Steels. Language Games for Autonomous Robots. IEEE Intelligent Systems. 2001.
45. (Stumme, et al. 2001) Gerd Stumme and Alexander Maedche. Ontology Merging for Federated Ontologies on the Semantic Web. IJCAI '01 - Workshop on Ontologies and Information Sharing. 2001.
46. (Suguri, et al. 2001) Hiroki Suguri, Eiichiro Kodama, Masatoshi Miyazaki, Hiroshi Nunokawa and Shoichi Noguchi. Implementation of FIPA Ontology Service. OAS2001. 2001.
47. (SUO 2001) IEEE SUO. IEEE Standard Upper Ontology Study Group Homepage. 2001. http://suo.ieee.org/
48. (Tamma, et al. 2001) Valentina Tamma and Trevor Bench-Capon. A conceptual model to facilitate knowledge sharing in multiagent systems. OAS01. 2001.
49. (van Eijk, et al. 2001) Rogier M. van Eijk, Frank S. de Boer, Wiebe van der Hoek and John-Jules Ch. Meyer. On Dynamically Generated Ontology Translators in Agent Communication. International Journal of Intelligent Systems. 26. 587-607. 2001.
50. (Vrba, et al. 2001) Pavel Vrba and Václav Hrdonka. Material Handling Problem: FIPA Compliant Agent Implementation. Proceedings of the Twelfth International Workshop on Database and Expert Systems Applications. 2001.
51. (Vrba, et al. 2002) Pavel Vrba and Václav Hrdonka. Material Handling Problem: FIPA Compliant Agent Implementation. Multi-Agent Systems and Applications II (to be published). 2002.
52. (Whitten 1997) David Whitten. The Unofficial, Unauthorized Cyc Frequently Asked Questions. 1997.
53. (Wiesman, et al. 2001) F. Wiesman, N. Roos and P. Vogt. Automatic Ontology Mapping for Agent Communication. BNAIC01. 2001.
54. (WordNet 1996) WordNet. WordNet - a Lexical Database for English. 1996. http://www.cogsci.princeton.edu/~wn/


Appendix A – Sample Ontology for Transportation Domain

An ontology for the transportation/material handling domain is described in (Vrba, et al. 2001, Vrba, et al. 2002). This appendix contains a brief description of this XML-based ontology, together with an ontology developed in OIL that extends it. The OIL ontology is described in plain text OIL and DAML+OIL formats for illustration.

A.1 Original XML-Based Ontology

The ontology described in (Vrba, et al. 2001, Vrba, et al. 2002) is used for communication between material transportation agents. The agents communicate about a particular configuration of the transportation (or material handling) system and about changes to this configuration, as well as about the items (workpieces) that are sent between agents. The meaning of the used XML elements is described in the following table (Vrba, et al. 2001):

XML representation