Ontology-Based Integration of Data Sources

Michel Gagnon
Defence R&D Canada - Valcartier
Department of National Defence
Québec, Canada
[email protected]

Abstract – Many applications, e.g., data/information fusion, data mining, and decision aids, need to access multiple heterogeneous data sources. These data sources may come from internal and external databases. They have to evolve due to requirement changes, and any change in an application domain induces semantic changes in the data sources. The integration of these data sources raises several semantic heterogeneity problems. This has traditionally been the subject of data/schema integration and mapping. However, many heterogeneity conflicts remain in information integration due to a lack of semantics. Therefore, richer data semantics are needed to resolve the heterogeneity problems. Ontological approaches now offer new solution avenues to this interoperability limitation. In this perspective, we propose ontology-based information integration with a local-to-global ontology mapping as an approach to the integration of heterogeneous data sources.

Keywords: Interoperability, data semantics, data/schema mapping, ontology mapping, information integration.

1 Introduction

Accessing heterogeneous data repositories is a requirement encountered in various applications. It is particularly essential in data/information fusion, data mining, and decision aids. For these applications, data/information coming from different sources must be collected, combined, corrected, aligned, and aggregated. This is the case, for instance, in intelligence, security and military applications, where experts must share parts of the data they have. In order to enable cooperative work between applications, multiple data sources must be linked and integrated before being processed. Any advanced information integration approach should support data/information fusion, data mining and decision aid applications [29]. In addition to autonomy, heterogeneity and distribution, any data integration system has, as a requirement, a capability for continuous change and evolution [3]. The capacity to access data/information from multiple information systems is known as information interoperability [34]. More precisely, it is the capacity of different information systems, applications and services to communicate, share and interchange data, information and knowledge in an effective and precise way, as well as to integrate with other systems, applications and services in

order to deliver new electronic products and services [32]. Interoperability must be provided on both a technical and an informational level [34]. Interoperability has notably been addressed in the database field, where it is often seen as transparent access to different data files on different computers interconnected through public and private networks. This topic is the concern of schema integration, which is the constitution of a coherent data set obtained from several heterogeneous data sources. It is a fundamental problem in many database application domains, such as cooperative information systems, e-business, data warehousing, and semantic query processing. It also concerns the challenging evolution problems of systems that store high-volume and/or complex data in databases. Schema integration is also known under various terms in the literature, such as schema alignment, although the predominant term is schema matching [25]. In practice, data/schema matching is typically performed manually by domain experts and database designers. This operation has significant limitations: manually specifying data/schema matches is a tedious, time-consuming, error-prone, and therefore expensive process [25]. Consequently, as databases become more complex and access to data on the Web becomes mandatory, there is a real need to automate this process reliably. Fortunately, much research and many projects on schema alignment and matching have been developed in the context of schema translation and integration, knowledge representation, machine learning and information retrieval [25]. The aim is to develop methods and tools to integrate multiple data sources in a transparent and seamless way. Our focus is on improving the automation of the integration of heterogeneous data sources by means of their ontologies. An ontology represents a shared, explicit specification of a conceptualization of a domain of knowledge [10].
It provides a formal specification of the vocabulary of concepts and the relationships among them in a domain of interest. Ontologies now support the integration task, as they describe more explicitly the exact content and semantics of the data sources. This paper presents a study of advances related to this interoperability topic. The main goals are to review the semantic heterogeneity problems related to data/schema mapping and to describe an ontology-based data

integration concept that maps data objects by means of their ontologies. The paper is organized as follows. Section 2 identifies and categorizes the semantic conflicts met during data/schema integration and reviews system architectures for data integration. Section 3 describes recent advances in ontology-based information integration and the approaches proposed to achieve it. Section 4 describes the mapping approach and the steps to map a global ontology to local ontologies. In each of these sections, we highlight some relevant works and make the link with data/information fusion and data mining. Section 5 presents our main conclusions.
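As a toy illustration of the automatic schema matching discussed above, a purely name-based matcher might look like the sketch below. The attribute names and the similarity threshold are invented, and real matchers [25] also exploit types, structure and instance data; this is only a minimal sketch of the idea.

```python
# Toy name-based schema matcher: proposes attribute correspondences
# between two schemas from string similarity alone. All names and the
# threshold are hypothetical; real matchers combine many more criteria.
from difflib import SequenceMatcher

schema_a = ["emp_name", "salary", "dept"]
schema_b = ["employee_name", "wage", "department"]

def match_schemas(attrs_a, attrs_b, threshold=0.5):
    """Pair each attribute of A with its most similar attribute of B."""
    matches = {}
    for a in attrs_a:
        best, score = None, 0.0
        for b in attrs_b:
            s = SequenceMatcher(None, a, b).ratio()
            if s > score:
                best, score = b, s
        if score >= threshold:  # keep only sufficiently similar pairs
            matches[a] = best
    return matches

print(match_schemas(schema_a, schema_b))
# 'salary' finds no counterpart above the threshold and stays unmatched,
# illustrating why such proposals still need expert validation.
```

Even this tiny example shows the limits of purely syntactic matching: a synonym pair such as salary/wage is invisible to string similarity, which is precisely the kind of conflict that richer semantics are meant to resolve.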

2 Semantic data integration

Integration describes the features needed to reconcile and condense collected or stored data. Semantic integration is the process of matching/merging/fusing individual data/schemas in order to provide a global and integrated (unified) view. Therefore, it involves the transformation of data sources into an integrated view and the resolution of heterogeneity conflicts [7].

2.1 Data integration operations

There are many integration operations on data sources. There are situations in which data are collected unchanged from the data sources, and others in which semantic integration at higher levels is also performed, e.g., in decision aids and data/information fusion. In addition to data/schema matching (alignment), other data integration operations are:

Collection: Data objects are collected unchanged. There is no matching with equivalent data objects coming from other sources.

Fusion: This term, introduced by [15], is much narrower than its use in the Fusion model. The integration of data objects is done by a simple extraction; no further abstracting computations are done. In contrast to the collection approach, data object fusion is performed in conjunction with semantically equivalent data objects coming from different sources. Furthermore, the fusion process tries to determine consistent representations: if data sources report contradicting values for the same data item, the fusion uses mapping rules and heuristics to remove data conflicts. It should be noted that even integration at the data fusion level may be very difficult. Frequently, it is impossible to identify data objects or to decide which data value is correct.

Abstraction: Transformations may be applied to the source data to match the abstraction level. Abstraction encompasses functions for aggregating data, reclassifying entities, or even more complex reasoning processes.

Supplementation: Data are not only derived from object data; other data are added which describe the content or context of the object data (e.g., semantic metadata). Such integration is used to handle the implicit semantics of objects. This operation is necessary whenever data sources provide no schema and the integration is based on a metadata schema.

Aggregation and grouping are other well-known data integration operation types. Note that many data transformations often prevent a possible mapping between the original source data and the resulting global schema.

2.2 Heterogeneity conflicts

Heterogeneity problems and integration conflicts have been the subject of several taxonomies, mainly within the community of distributed database systems [1, 4, 6, 14, 16, 21, 22, 26, 34]. We generally recognize three main conflict categories:

• Syntactic heterogeneity, which concerns differences between data models (entity-relationship, relational, object-oriented);

• Structural heterogeneity (schematic heterogeneity), which means that different information systems store their data in different structures (e.g., [14, 16]);

• Semantic heterogeneity, which considers the content of an information item and its intended meaning (data heterogeneity). In order to achieve semantic interoperability in a heterogeneous information system, the meaning of the information that is interchanged must be understood across the systems (e.g., [16]). In a broader view, semantic interoperability is at the knowledge level and at the application level. The latter allows different software components to exchange information even though their implementation languages, interfaces or execution platforms differ. Semantic conflicts at this level are harder to disambiguate. These conflicts occur whenever two contexts do not use the same interpretation of the information.

Taking the contextual perspective into account, Goh et al. [9] identify three main causes of semantic heterogeneity:

• "Confounding conflicts occur when information items seem to have the same meaning, but differ in reality e.g., due to different temporal contexts.
• Scaling conflicts occur when different reference systems are used to measure a value. Examples are different currencies.
• Naming conflicts occur when naming schemes of information differ significantly. A frequent phenomenon is the presence of homonyms and synonyms."

Park and Ram [22] analyzed semantic conflicts while summarizing the results of several previous studies. The authors separated the semantic conflicts that occur at the data level from those at the schema level. Table 1 presents the taxonomy resulting from this analysis, which incorporates these previous studies.

Table 1 - Taxonomy of semantic conflicts

Data-level conflicts:
• Data value: different meanings
• Data representation: different formats
• Data unit: different measurement units
• Data precision: e.g., grade and granularity differences

Schema-level conflicts:
• Naming: entity and attribute homonyms and synonyms; different labels
• Entity identifier: different identifiers
• Schema isomorphism: different attributes
• Generalization: different concepts
• Aggregation: one attribute for many attributes
• Schematic discrepancies: different structures

Such a conflict taxonomy is above all useful for developing efficient automatic mapping algorithms. A pure schema-level approach, without data-level interoperability, may achieve interoperability between schemas that are structurally similar but semantically different. It is, therefore, essential to achieve interoperability at both levels in most applications.
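As a hedged illustration of the data-level conflicts in Table 1, scaling and naming conflicts can be resolved by making conversion and synonym rules explicit before records are fused. The source records, attribute names and exchange rates below are all invented:

```python
# Illustrative resolution of data-level conflicts (Table 1): a naming
# conflict (synonymous attribute names) and a scaling conflict
# (different currencies). All records, names and rates are hypothetical.

# Naming conflict: both attribute names denote the same thing.
SYNONYMS = {"cost": "price", "price": "price"}

# Scaling conflict: values are reported in different currencies.
TO_USD = {"USD": 1.0, "CAD": 0.75}  # assumed fixed rates, for illustration

def normalize(record):
    """Map attribute names to a canonical vocabulary and values to USD."""
    out = {}
    for name, (value, unit) in record.items():
        canonical = SYNONYMS.get(name, name)
        out[canonical] = round(value * TO_USD[unit], 2)
    return out

src_a = {"price": (100.0, "USD")}
src_b = {"cost": (133.0, "CAD")}   # same item, different name and unit

a, b = normalize(src_a), normalize(src_b)
print(a, b)  # both records now expose a comparable 'price' in USD
```

Only after such normalization can the fusion operation of Section 2.1 meaningfully compare values from different sources and detect genuinely contradicting reports.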

2.3 Integration system architectures

Integration implies a form of cooperation between several users and several sources of data. There are several taxonomies of integration systems, e.g., [7, 27]. For the current purpose, we consider the integration architecture independently of the notion of centralized or distributed information systems.

Multibase system. A multibase (multiple database) system allows users to view the databases through a single global schema, giving them the impression that a single federated database exists. However, in this architecture there is no attempt to unify the semantics of data from the various sources. The integration is considered dynamic, as mapping links between schemas are not predefined but established as required [8]. The multibase system has two components: a schema design tool and a query processing system. The first provides the tools needed by the database designer to design the global schema and define a mapping from the local databases to the global schema. The second uses the mapping definition to translate global queries into local queries.

Federated systems. The architecture of federated systems has been defined by Sheth and Larson [27]. It is characterized by the existence of a federated schema, which establishes the interface to the integrated system. The integration is achieved at the schema level of each data source. The design of a federated schema requires unifying the source schemas and handling their heterogeneity. It is necessary to identify the mappings and to resolve the conflicts between the schema elements. This mapping can be expressed, for example, by means of various languages, by means of rules or, as we will see below, through an ontology. Thus, in this architecture, there is a unified view of the data sources. Integration offers common access and a common representation to the data sources. In this architecture, the integration of federated systems is considered static, as mapping links between schemas are predefined.

Mediation system. The idea of a mediator was introduced by Wiederhold [35], who defined it as follows: "A mediator is a software module that exploits encoded knowledge about some sets or subsets of data to create information for a higher layer of applications." Since then, it has been used in many data integration projects and techniques. We may perceive a mediator as a software component that mediates between the user and the physical data sources. Data are not stored in the mediation system but remain at their sources. Interrogation of the data sources is made by wrappers, which establish an interface to the various data sources. These wrappers translate sub-queries into the query language specific to each source. The results are then returned to the wrappers, which integrate them before presenting them to the user [18].

Data warehouse. In this architecture, data are accessed, transformed and stored in a single location, the data warehouse. Once the data are extracted, they are processed by analytics tools, data fusion and data mining tools, decision aid tools and query languages. There is one global schema, and there is no need to return to the original data sources. This architecture is often privileged in large organizations.
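The mediator/wrapper pattern described above can be sketched as follows. The sources, attribute mappings and query form are all invented for illustration; a real mediation system would also handle query planning and conflict resolution:

```python
# Minimal mediator/wrapper sketch: wrappers hide source-specific access,
# the mediator fans a query out and merges the answers. All sources and
# attribute names are hypothetical.

class Wrapper:
    """Translates a mediator query into a source-specific lookup."""
    def __init__(self, data, attr_map):
        self.data = data          # the source's records
        self.attr_map = attr_map  # global attribute -> local attribute

    def query(self, attr):
        local = self.attr_map[attr]
        return [row[local] for row in self.data if local in row]

class Mediator:
    """Sends sub-queries to every wrapper and integrates the results."""
    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, attr):
        results = []
        for w in self.wrappers:
            results.extend(w.query(attr))
        return results

# Two sources name the same attribute differently ('name' vs 'nom').
w1 = Wrapper([{"name": "Alice"}], {"person": "name"})
w2 = Wrapper([{"nom": "Bob"}], {"person": "nom"})
mediator = Mediator([w1, w2])
print(mediator.query("person"))  # ['Alice', 'Bob']
```

Note that the data stay at their sources: the mediator only holds the mappings, which is exactly the virtual-database character that distinguishes mediation from the data warehouse approach.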

3 Linking data to ontologies

We mentioned above that data integration is an effective approach to sharing data that reside in different sources, for the purpose of providing users with a single point of access to those data. To achieve this interoperability, data accuracy and consistency are mandatory. This is why the creation of a global database schema has been pursued by many organizations: it provides a single point of access and a way to manage semantic problems. However, this approach has severe limitations when dynamically integrating data from external sources. Relational models and UML-type class models only represent data semantics at the schema level (e.g., tables, classes, attributes, and class structures). The need for richer data semantics for resolving conflicts emerging from heterogeneous data sources has been recognized. To overcome many heterogeneity problems, researchers have recently proposed the use of ontologies in the context of data integration [2, 11, 34]. Furthermore, the use of ontologies is now recognized as a key to interoperability [30]. Since then, many ontology-based approaches have been developed in order to achieve information interoperability [2].

The creation and use of ontologies differ from those of schemas. Ontologies first appeared in database design to communicate information requirements with end users and to reduce the amount of work for a designer. As Figure 1 shows, ontology creation typically results from the conceptualization process and precedes (logical and physical) database design. Compared to schemas, ontologies aim at providing richer semantics and possible means to overcome semantic heterogeneity problems through the elicitation of knowledge that remains implicit and hidden at the schema level. To one extent or another, the use of the term "ontology integration" may be confusing [23]. Based on a literature analysis, Pinto et al. [23] characterize at least three different meanings (and uses) of ontology integration, depending on the situation:
• Integration – when building a new ontology reusing (by assembling, extending, specializing or adapting) other ontologies already available;
• Fusion – when building an ontology by merging several ontologies about the same subject into a single one that unifies all of them;
• Usage in applications – when building an application using one or more ontologies. Existing ontologies are adapted if necessary.

[Figure 1 - Database design process: from real-world conceptualization, through representation in a conceptual model and mapping, to database design]

One of the main justifications of ontology-based integration approaches is the diversity of data/information sources and the explicit description of the semantics of the data sources. Hakimpour [11] distinguished two trends in using ontologies to resolve semantic heterogeneity. The first uses ontologies for translating queries or their results. This approach is suitable whenever schemas are subject to frequent changes, when many data sources are involved, or when the number of involved data sources changes frequently (e.g., data sources on the Internet). The second trend uses ontologies for the generation of global schemas, e.g., [2, 9]. This is suitable whenever the schemas are not subject to frequent changes. In this perspective, database schemas commit to the ontology of a community. This is done by relating every term in the schema definitions to a definition in the community's ontology.

3.1 Ontologies for content explication

In [34], Wache et al. adapted a model describing a federated information system architecture previously proposed by [3]. The authors similarly derived a classification of ontology-based integration approaches into three main categories (Figure 2):
1. Global ontology approach. An integrated ontology describes the data in all sources, and querying is done through this global ontology (Figure 2a). This approach may look straightforward; however, it requires a domain expert who knows the semantics of all data sources to define the global ontology. The global ontology can also be a combination of several specialized ontologies.
2. Multiple ontology approach. Each data source is represented by its own local ontology, and mappings between the ontologies have to be established (Figure 2b). Querying the integrated data is done through a local ontology, and the mapping is used to perform queries on the other local ontologies. There is no need to integrate the data in a global ontology. Besides, changes to local ontologies may not affect the mapping.
3. Hybrid ontology approach. This approach is intended to overcome the drawbacks of the previous two. Data in each source are represented by a local ontology, and a shared vocabulary (not an ontology!) is built for sharing terms among the local ontologies (Figure 2c). This approach is intended to combine the advantages of the first two: the ease of defining ontologies locally and of querying through a shared vocabulary.

3.2 Ontology mapping

The correspondence between a schema and an ontology is achieved by an operation also known as mapping. Wache et al. [34] analyzed approaches to ontology-based data/information integration and reported 25 mapping applications. More recently, Kalfoglou and Schorlemmer [13] presented a survey of the state of the art in ontology mapping. The authors review approaches, techniques and tools and argue: "Multiple data models need to be accessed from several applications. Representing these data models by means of ontologies and mapping these ontologies can provide a common layer from which these ontologies could be accessed and hence could exchange information in semantically sound manners." A typical example of mappings between Description Logics, Entity-Relationship and UML class models and an ontology is proposed in [12].
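The hybrid approach among Wache et al.'s categories can be sketched as follows: each local ontology declares how its concepts relate to a shared vocabulary, and a query phrased in the shared vocabulary is translated into each local ontology's own terms. All ontology and term names below are invented:

```python
# Sketch of the hybrid ontology approach: local ontologies map their
# concepts onto a shared vocabulary; a shared-vocabulary term is then
# translated into each local ontology's own concepts. All names are
# hypothetical.

SHARED_VOCABULARY = {"Vehicle", "Person"}

# local concept -> shared term, one mapping per local ontology
local_to_shared = {
    "ontology_a": {"Car": "Vehicle", "Truck": "Vehicle", "Driver": "Person"},
    "ontology_b": {"Automobile": "Vehicle", "Owner": "Person"},
}

def local_terms_for(shared_term):
    """Find, per local ontology, the concepts that realize a shared term."""
    assert shared_term in SHARED_VOCABULARY
    result = {}
    for onto, mapping in local_to_shared.items():
        result[onto] = sorted(c for c, s in mapping.items() if s == shared_term)
    return result

print(local_terms_for("Vehicle"))
# {'ontology_a': ['Car', 'Truck'], 'ontology_b': ['Automobile']}
```

The shared vocabulary stays small and neutral, which is precisely what makes local ontologies easy to define independently while keeping queries source-independent.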

Figure 2 - Basic architectures in ontology integration approaches (reproduced from [34])

It should be noted that several research efforts address the ontology matching problem from different perspectives: artificial intelligence, information systems and databases [28]. To quote Shvaiko and Euzenat [28], "ontologies and schemas are similar in the sense that (i) they both provide a vocabulary of terms that describes a domain of interest and (ii) they both constrain the meaning of terms used in the vocabulary." However, there are some important differences as well as commonalities between schema and ontology matching, even if the processes and heterogeneity problems are similar; in particular, semantics is not part of database specifications.

4 Ontology to ontology mappings

There are many issues to consider when analyzing ontology compatibility: the context, the ontological representation language used, the level of generality and detail, etc. Besides, it is recognized that reusing an ontology is more cost-effective than building a new one from scratch. The level of effort depends on the integration type: "combining ontologies that have been designed for the same domain" or "combining ontologies from different domains" [23]. Once again, the integration conflicts are quite similar to data and schema integration conflicts, since ontologies may still use the same terminology with different semantics. Many applications, e.g., decision aids, data/information fusion and data mining, rely most of the time on record linkage techniques [29] and data warehousing to access data. The creation of integrated views (user views) over heterogeneous data is a fundamental pre-processing step for these applications. To fulfill these data/information needs, we investigated integration approaches and methods allowing the sharing and/or merging of different ontologies in a more or less automated way. Technical solutions for managing multiple ontologies have been developed, e.g., the KRAFT Architecture for Knowledge Fusion and Transformation [24]. These solutions differ according to the operations performed (matching, transformation, version management, etc.) and according to the mapping techniques used [5, 19].

4.1 Ontology-based data integration architecture

Building on technological advances, especially in the semantic Web and ontology mapping, we describe an ontology-based data integration system that consists in building a global ontology from the local ontologies corresponding to the data sources, as opposed to a federated system approach. This integration goes further than the one described by Wache et al. [34], as we now develop both a local ontology for each data source and a global ontology. The role of the data integration system, which may be designed as a semantic portal for end users at the organization level (cf. [31]), is to exploit the global ontology and its integration with the local ontologies of the data sources, as illustrated in Figure 3.

[Figure 3 - An Ontology-Based Data Integration Architecture: end users and applications (decision aids, data mining, data fusion, GIS) send queries to the data integration system, whose mediator uses the global ontology and mappings to dispatch sub-queries through wrappers to the data sources and other sources, and returns the answers]

In this architecture, the data integration system constitutes a virtual database, as opposed to a data warehouse, which copies the data from several data sources into a single database. The mediator maps the queries and answers between the global ontology/schema and the local ontologies with their associated source schemas.

4.2 Ontology mapping construction method

The building of the ontology-based data integration system proceeds in three stages.

The first stage consists in creating, if not already available, a local ontology for each data source. Local ontologies developed using a semantic model are independent of the logical or physical implementation of the local data sources (databases). One reason for designing a local ontology is to capture more complete structural information about a local database, which may not be captured in the global ontology at a later stage of the integration process. Implicit information from the data models is made explicit and captured in a semantically rich set of constructs, to alleviate the semantic gaps between the data model and the domain knowledge represented in the ontology.

The second stage consists in extracting the global ontology from the various concepts used in the local ontologies. It should be noted that some concepts may exist only in the local ontologies.

The third stage is the definition of the various correspondences between the data sources, the local ontologies and the global ontology. This may require translating the existing ontological/modeling languages into the chosen representation language, e.g., that of the Protégé ontology editor [20]. The correspondence then has to rest on the formulation of axioms and rules for that purpose (cf. [18]). Paradoxically, the variety of ontologies, often represented in different languages and formats, may itself raise problems at the syntactic and semantic levels. Validation by domain experts is still required: the task consists in revising and aligning the concepts correctly and in resolving the remaining semantic conflicts. It is also necessary to take care of the context knowledge (the semantics of the concepts). This will increase the precision and the quality of the discovered knowledge.
In brief, the construction of a global ontology is made easier because the semantics of the terms are already contained in their context (the local ontologies). This method is bottom-up, which facilitates the semantic reconciliation between the various concepts. It also increases the capability to automate the whole integration process. The data integration problem still remains: establishing sound mechanisms for linking existing data to the instances of the concepts and the roles in the ontology. Consequently, in the absence of a unique concept identifier, the integration process will remain semi-automatic, driven by a domain expert.
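The second stage of the method, extracting a global ontology from the local ones, can be sketched as a union of local concepts modulo a synonym table. The concepts and synonym pairs below are invented, and the synonym table stands in for the expert-validated alignments the text calls for:

```python
# Sketch of stage 2 of the construction method: build the global
# ontology's concept set from the local ontologies, collapsing synonyms
# to one canonical concept. All names are hypothetical; in practice the
# synonym table is produced and validated by domain experts.

local_ontologies = {
    "source1": {"Ship", "Harbour"},
    "source2": {"Vessel", "Port", "Cargo"},
}

# expert-validated synonym pairs: variant -> canonical global concept
synonyms = {"Vessel": "Ship", "Harbour": "Port"}

def build_global_concepts(locals_, syn):
    """Union of local concepts, with synonyms collapsed to canonical names."""
    global_concepts = set()
    for concepts in locals_.values():
        for c in concepts:
            global_concepts.add(syn.get(c, c))
    return global_concepts

print(sorted(build_global_concepts(local_ontologies, synonyms)))
# ['Cargo', 'Port', 'Ship']
```

Note that 'Cargo' survives even though it appears in only one source, matching the observation above that some concepts may exist only in the local ontologies.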

4.3 Defining ontology mapping

For this purpose, we represent an ontology by a pair O = (C, A) where C is a set of concepts and A is a set of axioms describing the interpretation of the concepts in a given domain. According to [13], a total ontology

mapping from O1 = (C1, A1) to O2 = (C2, A2) can be expressed as a morphism f : C1 → C2 that semantically relates the concepts of C1 to those of C2, such that A2 ⊨ f(A1), i.e., all interpretations that satisfy O2's axioms also satisfy O1's translated axioms. We say there is a partial ontology mapping from O1 = (C1, A1) to O2 = (C2, A2) if there exists a sub-ontology O′1 = (C′1, A′1), with C′1 ⊆ C1 and A′1 ⊆ A1, such that there is a total mapping from O′1 to O2. This last definition allows the introduction of a composite approach. Different applications use different ontologies, and it is difficult or unnecessary to pursue the creation of a single global ontology. Therefore, a composite approach may be more appropriate: there are several small islands built around essential domain ontologies, and within each island there is a form of global integration. A predominant ontology, i.e., the one that can be shared by the largest number of sources, becomes the global ontology. A number of local ontologies are mapped to this global ontology, as required. Mappings between the ontologies are generated, as illustrated in Figure 4.
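A toy check of the total-mapping condition above can be written by restricting the axioms to subsumption (is-a) statements and computing entailment as their transitive closure. This is a deliberate simplification of the definition (real axiom sets and entailment are far richer), and all concept names are invented:

```python
# Toy check of the total-mapping condition A2 |= f(A1): ontologies carry
# only subsumption axioms (sub, super), and entailment is approximated by
# the transitive closure of subsumption. Concept names are illustrative.

def closure(axioms):
    """Transitive closure of a set of (sub, super) subsumption axioms."""
    closed = set(axioms)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closed):
            for (c, d) in list(closed):
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed

def is_total_mapping(f, axioms1, axioms2):
    """True iff every translated O1 axiom is entailed by O2's axioms."""
    translated = {(f[a], f[b]) for (a, b) in axioms1}
    return translated <= closure(axioms2)

A1 = {("Car", "Vehicle")}
A2 = {("Automobile", "MotorVehicle"), ("MotorVehicle", "Conveyance")}
f = {"Car": "Automobile", "Vehicle": "Conveyance"}
print(is_total_mapping(f, A1, A2))  # True: Automobile below Conveyance
```

Dropping an axiom of A1 from consideration while keeping the check corresponds to the partial-mapping case: a total mapping is required only from the sub-ontology O′1.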

Figure 4 - Composite approach to ontology mapping

We mentioned the mapping example in [12], where the authors describe their method for integrating two data sources with UML class models based on an analysis of their ontologies. In [12], an ontology describes the semantics of the data of each source. The ontologies are then analyzed and compared to determine their similarities and differences. The result of this comparison is used to devise an integrated ontology that enables querying the integrated information.
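The selection step of the composite approach, promoting the predominant ontology (the one shared by the largest number of sources) to global, can be sketched as a simple count. The source and ontology names are invented:

```python
# Composite-approach sketch: the predominant ontology, i.e. the one
# already usable by the most sources, is promoted to global; the others
# are then mapped to it. All names are hypothetical.
from collections import Counter

# which existing ontology each data source already commits to
source_ontology = {
    "db1": "transport_onto",
    "db2": "transport_onto",
    "db3": "logistics_onto",
    "db4": "transport_onto",
}

def predominant_ontology(assignments):
    """Pick the ontology shared by the largest number of sources."""
    counts = Counter(assignments.values())
    return counts.most_common(1)[0][0]

global_onto = predominant_ontology(source_ontology)
print(global_onto)  # 'transport_onto'
locals_to_map = sorted(set(source_ontology.values()) - {global_onto})
print(locals_to_map)  # ['logistics_onto'] still needs a mapping to it
```

Within each island, this yields the form of global integration described above without forcing all applications onto a single universal ontology.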

5 Conclusion

Since data are rarely collected and stored at a single entry point, integration from multiple heterogeneous sources is a prerequisite step for many applications, e.g., decision aids, data/information fusion and data mining. It is also a task pursued by many organizations in order to improve their knowledge sharing as well as the efficiency of their operations. Ontology-based data integration is an attractive avenue, as it is also a key factor for enabling interoperability. However, integrating vast amounts of information from different sources is a difficult, complex and demanding task. Ontology-based data integration systems and tools that partly automate the data integration task and reduce this effort are therefore welcome.

The contribution of the method proposed here is that it does not require committing to a single global ontology. The local ontologies must represent the vocabulary used in the domain in order to recognize synonym relations and the hierarchical relations between the concepts. On this basis, ontology matching becomes less time-consuming than global schema matching, as the method aims to reduce the number of integration decisions and rules. One of the key challenges now is to establish semantic correspondences between ontologies. It appears that ontologies can face hard heterogeneity problems just like any other pieces of information [33]. Therefore, advanced projects to develop more efficient integration techniques to achieve this interoperability goal are still needed. Finally, if ontology-based approaches appear to be a promising way to resolve semantic issues in information interoperability, it also appears that no single shared ontological representation language will answer the requirements of all actors. Advances in intelligent systems, e.g., "intelligent information agents" for the Internet, will help. Emerging and more mature standards such as the Extensible Markup Language (XML), the Web Ontology Language (OWL) and Web services based on the Simple Object Access Protocol (SOAP), Universal Description, Discovery, and Integration (UDDI) and the Web Service Description Language (WSDL) will also help to resolve many software-level interoperability problems. We believe that new project endeavors should rely on these Web interoperability standards in order to integrate information dynamically.

References

[1] Carlo Batini, Maurizio Lenzerini, and S. B. Navathe, A comparative analysis of methodologies for database schema integration, ACM Computing Surveys, Vol 18, No. 4, pp. 323-364, 1986.
[2] Sonia Bergamaschi, Silvana Castano, Maurizio Vincini, and Domenico Beneventano, Semantic integration of heterogeneous information sources, Data and Knowledge Engineering, Vol 36, No. 3, pp. 215-249, 2001.
[3] Susanne Busse, Ralf-Detlef Kutsche, Ulf Leser, and Herbert Weber, Federated Information Systems: Concepts, Terminology and Architectures, Technical Report TR99-9, Technical University of Berlin, 1999.
[4] Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, Daniele Nardi, and Riccardo Rosati, Description logic framework for information integration, In Proceedings of the 6th Conference on the Principles of Knowledge Representation and Reasoning (KR'98), Morgan Kaufmann, 1998, pp. 2-13.

[5] Diego Calvanese, Giuseppe De Giacomo, and Maurizio Lenzerini, Ontology of integration and integration of ontologies, In Proceedings of the 9th International Conference on Conceptual Structures (ICCS'01), Stanford, CA, USA, August 2001.
[6] R. M. Colomb, Impact of Semantic Heterogeneity on Federating Databases, The Computer Journal, Vol 40, No. 5, pp. 235-244, 1997.
[7] Renato Fileto, and Claudia Bauzer Medeiros, A survey of Information System Interoperability, Technical Report IC-03-030, Instituto de Computação, Universidade Estadual de Campinas, 2003.
[8] Manuel Garcia-Solaco, Felix Saltor, and Malu Castellanos, Semantic heterogeneity in multidatabase systems, In Object-Oriented Multidatabase Systems: A Solution for Advanced Applications, Omran A. Bukhres and Ahmed K. Elmagarmid, editors, Prentice-Hall, 1996, Chapter 5, pp. 129-202.
[9] Cheng Hian Goh, Stephane Bressan, Stuart Madnick, and Michael Siegel, Context Interchange: New Features and Formalisms for the Intelligent Integration of Information, ACM Transactions on Information Systems, Vol 17, No. 3, pp. 270-293, 1999.
[10] Tom Gruber, A translation approach to portable ontology specifications, Knowledge Acquisition, Vol 5, No. 2, pp. 199-220, 1993.
[11] Farshad Hakimpour, and Andreas Geppert, Resolving Semantic Heterogeneity in Schema Integration: an Ontology Based Approach, In Proceedings of the Conference on Formal Ontology in Information Systems (FOIS'01), Ogunquit, Maine, USA, October 17-19, 2001.
[12] Manachaya Jamadhvaja, and Twittie Senivongse, An Integration of Data Sources with UML Class Models, In Proceedings of IHIS'05, Bremen, Germany, November 4, 2005.
[13] Yannis Kalfoglou, and Marco Schorlemmer, Ontology mapping: the state of the art, The Knowledge Engineering Review, Vol 18, No. 1, pp. 1-31, 2003.
[14] Vipul Kashyap, and Amit Sheth, Semantic and schematic similarities between database objects: a context-based approach, The VLDB Journal, Vol 5, No. 4, pp. 276-304, 1996.
[15] William Kent, The Breakdown of the Information Model in Multi-database Systems, SIGMOD Record, Vol 20, No. 4, pp. 10-15, December 1991.

[16] Won Kim, and Jungyun Seo, Classifying schematic and data heterogeinity in multidatabase systems, IEEE Computer, Vol 24, No. 12, pp.12-18, December 1991. [17] Alon Y. Levy, Divesh Srivastava, and Thomas Kirk, Data model and query evaluation in global information systems, Journal of Intelligent Information Systems, Vol 5, No. 2, pp. 121-143, September 1995. [18] Nora Maiz, Omar Boussaid, and Fadila Bentayeb, Un système de médiation basé sur les ontologies, Proceedings of Extraction et Gestion des Connaissances (EGC 2006), Lille, 17 janvier 2006, pp. 27-38. [19] Natalya F. Noy, Tools for Mapping and Merging Ontologies, In Handbook on Ontologies, S. Staab and R. Studer editors, Springer-Verlag, 2003, pp. 365-384. [20] Natalya F. Noy, Semantic Integration: A Survey Of Ontology-Based Approaches, Sigmod Record, Vol 33, No. 4, pp. 65-70, December 2004.

Autonomous Databases, ACM Computing Surveys, Vol 22, No. 3, pp. 183-236, September 1990. [28] Pavel Shvaiko, and Jérôme Euzenat, A survey of schema-based matching approaches, Journal on Data Semantics, Vol IV, pp.146-171, 2005. [29] Vicenç Torra, and J. Domingo-Ferrer, Record linkage methods for multidatabase data mining, In Information Fusion in Data Mining Studies in Fuzziness and Soft Computing, Vicenç Torra, editor, Vol. 123, Springer-Verlag, Heidelberg, 2003. [30] M. Uschold, and M. Gruninger, Ontologies : principles, methods and applications, Knowledge, Engineering Review, Vol 11, No. 2, pp. 93-155, 1996. [31] Mike Uschold, M. King, S. Moralee, and Y. Zorgios, The Enterprise Ontology, The Knowledge Engineering Review, Vol 13, No. 1, pp. 31-89, 1998.

[21] Aris M. Ouskel, and Amit Sheth, Semantic Interoperability in Global Information Systems - A brief Introduction to the Research Area and the Special Section, SIGMOD Record, Vol 28, No. 1, March 1999.

[32] J. A. Martinez Useroa, P. Beltrán Orenesa, J. A. Martínez Comechea, and R. San Segundo Manuelb, Information interoperability evaluation model for public web sites, Knowledge and Information Management, 0101-2006, Online.

[22] Jinsoo Park, and Sudha Ram, Information systems interoperability: What lies beneath? ACM Transactions on Information Systems, Vol 22, No. 4, pp. 595-632, October 2004.

[33] A. Valente, T. Russ, R. MacGrecor, and W. Swartout, Building and (re)using an ontology for air campaign planning, IEEE Intelligent Systems, Vol 14, No. 1, pp. 27-36, 1999.

[23] H. Sofia Pinto, Asunción Gómez-Pérez, and João P. Martins, Some Issues on Ontology Integration, In Proceedings of the IJCAI-99 Workshop on Ontologies and Problem-Solving Methods (KRR5), Stockholm, Sweden, August 1999.

[34] H. Wache, T. Vögele, U. Visser, H. Stuckenschmidt, G. Schuster, H. Neumann, and S. Hübner, Ontology-based Integration of Information - A Survey of Existing Approaches, In Proceedings of IJCAI-01 Workshop: Ontologies and Information Sharing, Seattle, WA, 2001, pp. 108-117.

[24] Alan Preece, Kit Hui, Alex Gray, Philippe Marti, Trevor Bench-Capon, Dean Jones and Zhan Cui, The KRAFT Architecture for Knowledge Fusion and Transformation, In Intelligent Systems XVI (Proc ES99), Research and Development, M. Bramer, A Macintosh & F. Coenen, editors, Springer, New York, 1999, pp. 23-38. [25] Erhard Rahm, and Philip A. Bernstein, A survey of approaches to automatic schema matching, The VLDB Journal, Vol 10, pp. 334-350, 2001. [26] Sudha Ram, and V. Ramesh, Schema Integration: Past, Current and Future, In Management of Heterogeneous and Autonomous Database Systems, Edited by A. Elmagarmid, M. Rusinkeiwicz, and A.P. Sheth, San Francisco : Morgan Kaufmann, 1999, pp. 119155. [27] Amit P. Sheth, and J.A. Larson, Federated Database Systems for Managing Distributed, Heterogeneous, and

[35] G. Wiederhold, Mediators in the Architecture of Future Information Systems, IEEE Computer, Vol 25, No. 3, pp. 38-49, 1992.
