
Ontologies for Geographic Information Integration

H. Stuckenschmidt, U. Visser, G. Schuster & T. Vögele

TZI - Center for Computing Technologies, University of Bremen, Universitätsallee 21-23, D-28359 Bremen, Germany. Email: {heiner|visser|schuster|vogele}@informatik.uni-bremen.de

Abstract

The opening of geographical information systems (GIS) and the interoperability between these systems place new requirements on the description of the underlying data. The exchange of data between GIS is problematic and often fails due to confusion about the meaning of concepts. The semantic translator, a translator between GIS and/or catalogue systems that gives the user the option to map data between the systems, is a current research topic. This paper gives an overview of formal ontologies and how they can be used for geographical information integration. An intelligent architecture for semantic-based information retrieval is introduced, and we show how this approach can be used for general purposes. In conclusion, we attempt to provide a roadmap for the use of ontologies in geographic information processing.

1. Introduction

Information processing in geographical applications is a complex task. Solving a problem in an environmental domain (e.g. where does the high sulfate concentration in the river come from?) involves various data from different areas (e.g. data regarding the river, adjacent waste dumps, ground water flow etc.). Frequently, the data are not all available from one database but are distributed and have different formats (e.g. single data, time series, spatial data with different resolutions). This requires thorough data preparation before the actual analysis can be carried out. Recent studies in areas such as data warehousing (Wiener et al., 1996), information integration (Galhardas et al., 1998; Bergamaschi et al., 1999) and interoperability between GIS (Vckovski et al., 1999) have addressed these problems. The problem domain is complex, not completely understood, and dynamic. Inventing and developing methods for environmental information systems is a challenging task for both computer scientists and geo-scientists. How, then, do we integrate the intelligent methods and algorithms that computer scientists offer, and how do we merge them with the knowledge of environmental experts?

1.1. Spatio-Temporal Information

Spatio-temporal information can be described as a window at a certain place and time. This window gives insight into the data but is only one window in the space-time continuum. The OpenGIS™ Consortium¹ (OGC) defines place as a measurable piece of the real world. Time is a point, an interval, or a collection of points and intervals in what we perceive as the time continuum. Time and place can be measured and surveyed, and their coordinates in a particular spatial and temporal reference system can be derived. The consortium uses the term location to cover both place and time (OGC, 1999b). Günther (1998) describes the properties of environmental data as follows (extract):

1. Complex: spatial data objects often have a complex structure. An object could be represented by a single point or by thousands of polygons.
2. Dynamic: spatial data are dynamic. Insertions and deletions interact with updates.
3. Large: spatial data tend to be large (e.g. geographical maps).
4. No standard algebra defined: basically, this means that there is no standard set of operators².

¹ A non-profit organization dedicated to open systems geo-processing.
² Lately, the OGC published their abstract specification (OGC, 1999), which specifies simple features (among other specifications) of GIS.

1.2. Geographic Information Systems

Geographical information systems are essential tools for analysing and visualizing spatio-temporal information. Originally developed for the creation of thematic maps, GIS support data capture (e.g. digitizing), data storage (DBMS, spatial DBMS), and data analysis (e.g. combination of spatial and non-spatial data). Lately, the OGC has formulated new requirements for GIS. The objective of the OGC is the full integration of geo-spatial data and geo-processing resources into mainstream computing. Open and interoperable geo-processing, i.e. the ability to share heterogeneous geo-data and geo-processing resources transparently in a networked environment, is the main aim of this organization. The interoperability of GIS creates new requirements, which can be met in two ways. Firstly, the developers of GIS have to come together and define de facto standards. The OGC's abstract specification models are a first (and big!) step in this direction. Secondly, semantic translators can be developed that define the meaning of concepts. For example, the concept forest in the ATKIS catalogue (AdV, 1998) has different semantics than the concept forest in the CORINE land cover catalogue (EEA, 1997-1999). We will discuss this example later in this paper. Figure 1 gives an overview of the OGC's vision of the future of geographic information (GI) systems. The idea is to define 'simple features' and to compose these into a customized GIS.

[Figure 1 (diagram not reproduced). Labels: GIS I, GIS II and GIS III as autonomous systems on various platforms; simple features as basic functions of various GIS, provided as components; GIS uses components; towards a fully componentized GIS.]

Figure 1: Aspired Development of GIS Systems

1.3. Current activities

As mentioned previously, the activities related to interoperability between GIS can be seen as two main streams. Firstly, the OGC is pursuing the standardization of components and has published its first frozen abstract specification (OGC, 1999a). Secondly, the activities around semantic translators are worth mentioning. The OGC's topic 14, Semantics and Information Communities, has not yet been considered in depth; however, a core task force will be working on this topic in the future.

Standardization: The OGC is working on an evolution of traditional GIS solutions in which proprietary data models and monolithic software functions are made interoperable and extensible. Applications that adhere to the objectives of Open GIS are free to access and use various types of distributed data and to utilize multiple geo-processing tools and services. A formal specification, the Open Geodata Interoperability Specification (OGIS), is currently under development and will define the types and methods necessary to build interoperable systems. At the 2nd International Conference on Interoperating Geographic Information Systems (Vckovski, Brassel, & Schek, 1999), almost 50% of the contributions were related to the OpenGIS ideas. We can assume that the OGC and its ideas and visions will remain relevant in the near future. The relation between the IT industry's standards approach and the GIS interoperability approach was one of the topics at the above-mentioned conference (Berre, 1999). General problems and solutions for syntactic and semantic interoperability were discussed in the context of IT standards such as ISO RM-ODP, ISO CSMF, CORBA/EJB/COM+, UML and XML, and of the European DISGIS Esprit IV project (DISGIS-Project, 1999), which deals with practical experiences regarding the use of the ISO/TC211 and OpenGIS interoperability approaches.

Semantic Translation: Semantic translation is a method for data translation that goes beyond the traditional mapping and conversion of geometric primitives. When we use the term 'semantic' with respect to geographical data, we are referring to the meaning of a concept (e.g. the concept forest in a geographical sense). This is quite different from the term 'semantics' as it is used for programming languages, where the semantics determines the exact function of a language.

Commercial & non-commercial tools: Currently, there are a few commercial and non-commercial systems on the market that make use of semantic translation. One example is the Feature Manipulation Engine (FME). Originally developed for the Canadian Government, it is 'emerging as a de facto standard in the industry for sharing geospatial data between diverse applications' (Michael Cosentino, Geospatial Market Development Manager, Sun Systems Inc.). Underlying the engine is a rich data model, which is internally consistent and inherently extensible. Constructs within the models of the input or output formats or systems are mapped to constructs in the engine's model. The engine provides a series of methods to carry out model-to-model transformations, applicable to data on either input or output. Cosentino argues that this functionality ensures that neither the data provider nor the data consumer feels constrained; they can use their respective systems however they wish. FME provides a translation tool through which sophisticated spatial translation operations between various standard GIS data formats can be performed. FME is the core of a number of applications, such as the Geo-Task Server (Huber, 1998).

Other activities: The German Federal/States working group 'environmental information systems' (BLAK UIS) stated at their workshop this year (IFGI, 1999) that semantic interoperability is required for open environmental systems. It is anticipated that the authorities will perform further work on this topic. One idea for overcoming the deficiencies in exchanging or comparing data between GIS and/or between catalogues is to use ontologies. The advantage of ontologies is the existence of formal semantics. This allows us to define ontologies for concepts (such as forest) in different catalogue systems and to define axioms for the 'translation' between those ontologies.

2. Ontologies and Information Sources

The term ontology was originally used in philosophy to describe a theory of "being and existence". In the area of artificial intelligence it was adopted to describe knowledge models that provide definitions of the vocabulary used to describe a certain domain. An often cited definition of ontology is the one given by Gruber (1993): "An Ontology is an explicit specification of a conceptualization."

2.1. Knowledge representation with ontologies

Ontologies are useful for our purposes because they precisely describe what a certain term means. For this purpose, an ontology defines concepts, functions describing attributes of concepts, and relations between these defined concepts. What distinguishes ontologies from other well-known data modeling approaches such as UML (Oestereich, 1998) is that they also provide necessary and sufficient conditions for membership in concepts and relations, thus enabling a conclusion as to whether a certain object belongs to a class or a relation.

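[Figure 2 (diagram not reproduced). Techniques shown: Human Expertise, Conceptual Models, Ontologies, Artificial Neural Networks, Classical Knowledge Representation, Meta Data. Axes: increasing expressiveness; increasing determinism. Further labels: first-order, propositional; intuitive, structured, formal.]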

Figure 2: A Comparison of Knowledge Modeling Techniques

There are different approaches to specifying an ontology. We use the Ontolingua approach (Gruber, 1993), which provides methods and tools to ease the development and application of ontologies. It has become something of a standard because it enables the translation of specified ontologies into different formats. The core of the Ontolingua approach is its modeling language. It is based on the Knowledge Interchange Format (KIF) (Genesereth & Fikes, 1992), a uniform language that represents models of different applications in a logic-based setting. Ontolingua extends KIF with primitives for object-oriented modeling.

What distinguishes ontologies from other representations of knowledge is the combination of expressiveness and formal strength that enables us to formally define complex domains. From a formal point of view they are comparable to classical forms of knowledge representation such as logic programs or frame languages. At the same time they reach an expressiveness comparable to conceptual models of expertise (Figure 2). This combination makes the definition of ontologies a time-consuming and difficult task. Therefore, the Ontolingua approach provides tools via the Internet (http://www.ksl.stanford.edu) that support the specification of ontologies. The Ontolingua server includes an ontology editor and a library of predefined ontologies for frequently used concepts such as natural numbers.

2.2. Information Integration

An ontology can be used as an abstract interface to an information source. This is because the exact definition of the meaning of the different terms used in the information enables the use of this information by those who are not familiar with the intended meaning. These properties enable ontologies to be used for the integration of different and possibly heterogeneous information sources. Figure 3 shows three different ways in which information sources can be integrated in principle.

[Figure 3 (diagram not reproduced). Three panels, each connecting Information Source 1 and Information Source 2. Panel labels: Common Data Format (with Translation); Reference; Deductive Matching (with source ontologies).]

Figure 3: General Approaches to Information Integration

The first way to access heterogeneous information usually requires a translation from the representation of one source into the representation of another, resulting in a common model that can be accessed. The drawback of this approach is that it cannot be guaranteed that a translation is possible for all available information, because this depends strongly upon the expressiveness of the representations used. This approach is also computationally expensive, as it needs n² translators for n sources.

The second way is to describe different information sources using a meta-language. This approach, known as meta-information, is widely used and constitutes the state of the art in the environmental information systems community. A good example is the German environmental data catalogue (UDK) (Lessing, Günther, & Swoboda, 1995). It describes different sources of environmental information in Central Europe. This approach needs n translators.

The use of ontologies to define a common vocabulary for environmental information integrates the two approaches described above and solves some of their problems. Firstly, the ontology provides a description of the information sources in the sense that it makes clear what kind of information is available as instances of specified concepts. Secondly, the information can be accessed via abstract concepts that subsume the more specific concepts used by the different information sources. This approach needs n ontologies and an inference machine.
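The translator counts for the three approaches can be compared directly. The following is a rough count, assuming that each pairwise translator works in one direction only; this is an assumption, since the text states only the order n².

```latex
\begin{align*}
T_{\text{pairwise}}(n) &= n(n-1) \approx n^2, \\
T_{\text{meta}}(n) &= n, \\
T_{\text{ontology}}(n) &= n \ \text{ontologies} + 1 \ \text{inference machine}.
\end{align*}
% Example with n = 5 sources: 5 * 4 = 20 pairwise translators,
% compared with 5 translators into a common meta-language.
```

Under this assumption, adding a sixth source requires ten new pairwise translators (five in each direction), but only one new translator or one new ontology in the other two approaches.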

2.3. Information Retrieval

In its common definition, information retrieval (IR) is concerned with the selection of documents from a collection that are of interest to a user. With the growth of textual information sources available on the internet, this discipline has grown in importance. While many approaches to IR are keyword-based and therefore syntactic in nature, complex queries and information that goes beyond textual description benefit from the use of semantic information. Formal ontologies offer all the benefits of semantics-based IR, e.g. extended expressiveness and reasoning and learning facilities (see Möller, Haarslev, & Neumann, 1998). In particular, knowledge derived in a reasoning process can be used to identify interrelations that are not explicitly contained in the information sources. Additionally, the integration facilities described in the last section can be exploited to retrieve information from heterogeneous information sources.

An example of an ontology-based approach to semantic information retrieval from the internet is the "Ontobroker" (see Fensel, Decker, Erdmann & Studer, 1998 and fig. 4). This approach consists of several components:

1. A library of ontologies used by a special group of users to semantically enrich their information by annotating their pages with terms from the ontology.
2. A search engine that searches a predefined section of the web for annotated pages and stores the content in a fact database for further processing.
3. A query environment that allows users to compose complex queries regarding properties and relationships of information specified in the ontology.
4. An inference engine that performs symbolic reasoning on the fact database in order to give well-founded answers to the queries stated in the query environment.

The architecture of this approach implies some common steps needed for an application. First, ontologies describing the domain of interest have to be developed and stored in the ontology library. Furthermore, web pages that are meant to be sources of information have to be identified, annotated with terms from the ontology, and registered by the search engine. The annotation principle leaves freedom for different kinds of retrieval, because it is possible to annotate not only semi-structured text with ontological constructs. One can also imagine annotating information sources as a whole. In this case, the result of ontology-based information retrieval is a list of valuable information sources (e.g. databases).

[Figure 4 (diagram not reproduced). Components shown: Query Interface, Inference Engine, Search Engine, Facts, Ontologies. Further labels: Interpretation, Annotation.]

Figure 4: The Ontobroker Architecture

3. Geographic Information Sources and ontologies, an example

This section gives an example of the geographical information sources we use and of their description with Ontolingua. We use two catalogue systems, namely the German ATKIS-OK-250 (AdV, 1998) and the European CORINE land cover catalogue (EEA, 1997-1999). A vegetation ontology will be used for the definition of primitives (e.g. forest-trees, forest-plants, grass).

3.1. Ontologies for ATKIS, CORINE and Vegetation

The ATKIS-OK-250 Catalogue: ATKIS (Amtliches Topographisch-Kartographisches Informationssystem) is an official geoinformation system in Germany. It is a project of the head surveying offices of all the German states. The working group offers digital landscape models (e.g. DLM 250, 1:250 000) with detailed documentation in the object catalogue OK-250. This catalogue is the basis for our description. The ontology for our concept forest consists of classes, functions and instances. One class is the following:

    ;;; ------------------ Classes --------------

    ;;; Forest

    (Define-Okbc-Frame Forest
      :Direct-Superclasses (Vegetation-Area)
      :Direct-Types (Class Primitive)
      :Own-Slots ((Arity 1))
      :Sentences ((=> (Forest ?X0)
                      (Or (Has-Vegetation ?X0 Forest-Plants)
                          (And (Has-Vegetation ?X0 Grass)
                               (Is-Cultivated ?X0 1))))
                  (=> (Forest ?X0)
                      (> (Size-In-Hectares ?X0) 10)))
      :Template-Facets ((Size-In-Hectares (Numeric-Minimum 10))))

We see that the class forest is a subclass of vegetation_area. We also see that there is an internal rule which says that a thing is a forest if it has forest-plants or cultivated grass as vegetation. In addition, the size of the area has to be at least 10 hectares.

    ;;; ------------------ Instance -------------

    ;;; Stadtwald-1990

    (Define-Okbc-Frame Stadtwald-1990
      :Direct-Types (Forest)
      :Own-Slots ((Is-Cultivated 0) (Is-Cultivated)
                  (Has-Vegetation Forest-Trees)
                  (Size-In-Hectares 25)))

    ;;; Weidedamm3-1990

    (Define-Okbc-Frame Weidedamm3-1990
      :Direct-Types (Forest)
      :Own-Slots ((Is-Cultivated 0) (Is-Cultivated)
                  (Has-Vegetation Forest-Plants)
                  (Size-In-Hectares 12)))
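For readers more familiar with PROLOG, the two instance frames above can also be written as plain facts. The following is a minimal sketch, not the authors' encoding. The predicate names follow the PROLOG rules used in section 3.2, the value 0 of Is-Cultivated is read as false, and `violates_Forest_Size/1` is a hypothetical helper that checks the size condition of the class definition, (> (Size-In-Hectares ?X0) 10).

```prolog
% ATKIS-OK-250 instances from the text, rewritten as facts (illustrative).
instance_Of(stadtwald_1990, forest).
instance_Of(weidedamm3_1990, forest).

size_In_Hectares(stadtwald_1990, 25).
size_In_Hectares(weidedamm3_1990, 12).

has_Vegetation(stadtwald_1990, forest_Trees).
has_Vegetation(weidedamm3_1990, forest_Plants).

% (Is-Cultivated 0) in the frames is read here as "not cultivated".
is_Cultivated(stadtwald_1990, false).
is_Cultivated(weidedamm3_1990, false).

% Hypothetical helper: an instance declared as a forest violates the
% size condition of the class if its size is not greater than 10 ha.
violates_Forest_Size(X) :-
    instance_Of(X, forest),
    size_In_Hectares(X, S),
    S =< 10.
```

The query `?- violates_Forest_Size(X).` fails for this knowledge base, since both 25 and 12 are greater than 10. Both instances therefore satisfy the size condition of the class Forest.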

The instances Stadtwald_1990 and Weidedamm3_1990 show the state of a particular area in 1990. The Stadtwald is not cultivated, has forest-trees, and covers an area of 25 hectares, while Weidedamm3 is not cultivated, has forest-plants, and measures twelve hectares.

CORINE land cover: From 1985 to 1990, the European Commission carried out the CORINE Programme (Co-ordination of Information on the Environment). The results are essentially of three types, corresponding to the three aims of the Programme: (a) An information system on the state of the environment in the European Community has been created (the CORINE system). It is composed of a series of databases describing the environment in the European Community, as well as of databases with background information. (b) Nomenclatures and methodologies were developed for carrying out the programme, which are now used as the reference in the areas concerned at the Community level. (c) A systematic effort was made to concert activities with all the bodies involved in the production of environmental information, especially at the international level. As a result of this activity, and indeed of the whole programme, several groups of international scientists have been working together towards agreed targets. They now share a pool of expertise on various themes of environmental information. The CORINE land cover nomenclature with its 44 classes is the basis for our description. In order to demonstrate the hierarchy, we use a tree to describe parts of the ontology (see fig. 5):

[Figure 5 (tree diagram, shown here as an indented outline)]

    Area (size > 24 ha)
      Level 1: Artificial-Surfaces (is_cultivated = yes)
        Level 2: Urban-Fabric
          Level 3: Discontinuous-Urban-Fabric, ...
        Level 2: Artificial-Non-Agricultural-Vegetated-Area (has_vegetation sod_grass)
          Level 3: Green-Urban-Areas, Sport-And-Leisure-Facilities
        ...
      Level 1: Forests-And-Semi-Natural-Areas
        Level 2: Forests (has_vegetation forest_plants)
          Level 3: Mixed-Forest, Broad-leaved Forest, ...
        ...
      ...

Figure 5: Part of the CORINE land cover ontology

Here we see that a forest has forest-plants and is at least 25 ha in size, because the superclass of forest is Forests-And-Semi-Natural-Areas and, above that, area. The minimum size of an area is 25 hectares (see the facet in class area). Please note that, according to the CORINE nomenclature, sport and leisure facilities are artificial-non-agricultural-vegetated-areas, which in turn have sod_grass as vegetation. Let us define some instances for this ontology:

    ;;; ------------------ Instance -------------

    ;;; Pauliner-Marsch

    (Define-Okbc-Frame Pauliner-Marsch
      :Direct-Types (Sport-And-Leisure-Facilities)
      :Own-Slots ((Size-In-Hectares 40) (Size-In-Hectares)))

    ;;; Stadtwald-2000

    (Define-Okbc-Frame Stadtwald-2000
      :Direct-Types (Mixed-Forest)
      :Own-Slots ((Size-In-Hectares 25)
                  (Is-Cultivated 0)
                  (Has-Vegetation Forest-Trees)))

    ;;; Weidedamm3-2000

    (Define-Okbc-Frame Weidedamm3-2000
      :Direct-Types (Discontinuous-Urban-Fabric)
      :Own-Slots ((Size-In-Hectares 30)
                  (Is-Cultivated 1)))

This is the information we would get out of a classified satellite image. There is a Stadtwald_2000 instance, which is of type mixed forest, covers 25 ha, is not cultivated, and has forest trees. There is also the Pauliner Marsch, a sport and leisure facility according to CORINE land cover, with 40 ha.

Vegetation: If we want to match or process the knowledge from the ontologies described above, we either have to define another ontology which matches the concepts, or we use a domain ontology which acts as a foundation for the two ontologies (see also figure 6). In this ontology, primitives such as plants, soil type etc. are defined. Please note that we show a part of the ontology as a tree for better understanding.

[Figure 6 (tree diagram, shown here as an indented outline)]

    Plants
      Forest-Plants
        Forest-Trees, ...
      Grass
        Pasture-Grass, Sod-Grass, ...
      Special-Culture
        ...
      ...

Figure 6: Part of the vegetation ontology

We define forest_trees as a kind of forest_plants, and those as a kind of plants. Likewise, sod_grass is a kind of grass, and grass is a kind of plants. Special cultures such as vines or hops are also defined as plants.
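The taxonomy described above can be written down as direct subclass facts in PROLOG. This is a minimal sketch under two assumptions: the atom spelling follows the PROLOG clauses used in section 3.2, and pasture_Grass is placed below grass as read from Figure 6, which the text does not state explicitly. `superclass_Of/2` is a hypothetical helper, not a predicate from the paper.

```prolog
% Part of the vegetation ontology as direct subclass facts.
subclass_Of(forest_Trees,    forest_Plants).
subclass_Of(forest_Plants,   plants).
subclass_Of(sod_Grass,       grass).
subclass_Of(pasture_Grass,   grass).   % placement assumed from Figure 6
subclass_Of(grass,           plants).
subclass_Of(special_Culture, plants).

% Hypothetical helper: all direct and indirect superclasses of a class.
superclass_Of(Class, Super) :-
    subclass_Of(Class, Super).
superclass_Of(Class, Super) :-
    subclass_Of(Class, Mid),
    superclass_Of(Mid, Super).
```

With these facts, `?- subclass_Of(sod_Grass, X).` yields `X = grass`, and `?- findall(S, superclass_Of(forest_Trees, S), L).` yields `L = [forest_Plants, plants]`.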

3.2. Flexible Retrieval of Geographic Information

In this section we show how the methods mentioned above can be used for flexible retrieval of geographical information. We mentioned in section 2.3 how information can be retrieved in general and that ontology-based information retrieval offers benefits for this process. In order to show how this works, we come back to the concept forest within the ATKIS-OK-250 and CORINE land cover catalogues. The use of ontologies gives us two options: (a) integrated views and (b) verification. An integrated view, from the user's perspective, merges the data of the two catalogues. This process can be seen as two layers which lie on top of each other. The view needs a third ontology with axioms for the translation process between the concepts. The second option gives users the opportunity to verify ATKIS-OK-250 data with CORINE land cover data or vice versa.

[Figure 7 (diagram not reproduced). Labels: Satellite picture; ATKIS map; Solution?; ATKIS object type Vegetation; CORINE land cover class Forests...; Data structure (for each source); Theorem prover; Ontology ATKIS; Ontology CORINE; domain ontologies, such as plants, soil type etc.]

Figure 7: Deductive Integration of Geographic Information

A query interface (this could be an intelligent dialogue within a GIS) sends its request to an inference engine. The inference engine builds up the actual knowledge base by using the ontologies of the concepts. The interesting part of the whole idea is that the inference engine can perform inferences on the knowledge base and is therefore able to derive new knowledge, which can be used for further questions. A typical problem within the planning process of authorities is the use of heterogeneous data categorized as ATKIS-OK-250 data and CORINE land cover satellite pictures (see figure 5). In order to map or exchange data between these two catalogues, we can use one of the three methods described in section 2.2 ("Information Integration"). We will use the third approach and first look at the ontologies, which are partly described in section 3.1 (ATKIS-OK-250) and in figure 5 (CORINE land cover). As theorem prover we use a PROLOG system such as SWI-Prolog (Wielemaker, 1998).

A simple query to an inference engine could be: "What is the superclass of 'Sod-Grass'?" In a PROLOG system we would pose the query subclass_Of(sod_Grass, X) and would get the solution grass. We would get all solutions, and thus the complete class hierarchy, if we posed the query subclass_Of(X, Y). More complex queries require more complex representations. We use the following axioms to describe the concept forest. There are two possibilities:

1. The size in hectares must be greater than 10 and the vegetation has to be forest plants. The concept forest_plants is defined in the vegetation ontology.
2. The size in hectares must be greater than 10 and the vegetation has to be grass. The concept grass is also defined in the vegetation ontology. In addition, the vegetation has to be cultivated.

We denote the rules in PROLOG syntax:

    forest(X) :-
        size_In_Hectares(X, Y),
        Y > 10,
        has_Vegetation(X, Z),
        a_kind_of(Z, forest_Plants).

    forest(X) :-
        size_In_Hectares(X, Y),
        Y > 10,
        has_Vegetation(X, Z),
        a_kind_of(Z, grass),
        is_Cultivated(X, true).
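To run the two rules, the predicates they rely on must be defined as well. The following is a minimal sketch of such a knowledge base. The definition of `a_kind_of/2` is an assumption (the paper does not list it), chosen so that a class is a kind of itself, of its direct superclass, and of every superclass above that. The instance facts restate the CORINE frames from section 3.1, with the Is-Cultivated values 0 and 1 read as false and true.

```prolog
% Classes and direct subclass links from the vegetation ontology.
class(plants).
class(forest_Plants).
class(forest_Trees).
class(grass).
class(sod_Grass).

subclass_Of(forest_Trees,  forest_Plants).
subclass_Of(forest_Plants, plants).
subclass_Of(sod_Grass,     grass).
subclass_Of(grass,         plants).

% Assumed definition of a_kind_of/2 (not listed in the text).
a_kind_of(X, Y) :- class(X), class(Y), X = Y.
a_kind_of(X, Y) :- subclass_Of(X, Y).
a_kind_of(X, Y) :- subclass_Of(X, Z), a_kind_of(Z, Y).

% CORINE instances from section 3.1, rewritten as facts.
size_In_Hectares(stadtwald_2000, 25).
size_In_Hectares(weidedamm3_2000, 30).

has_Vegetation(stadtwald_2000, forest_Trees).

is_Cultivated(stadtwald_2000, false).
is_Cultivated(weidedamm3_2000, true).

% The two rules for the concept forest, as given in the text.
forest(X) :-
    size_In_Hectares(X, Y),
    Y > 10,
    has_Vegetation(X, Z),
    a_kind_of(Z, forest_Plants).

forest(X) :-
    size_In_Hectares(X, Y),
    Y > 10,
    has_Vegetation(X, Z),
    a_kind_of(Z, grass),
    is_Cultivated(X, true).
```

With this knowledge base, the query `?- forest(stadtwald_2000).` succeeds through the first rule, while `?- forest(weidedamm3_2000).` fails because no vegetation is recorded for that instance. The order of calls in a trace depends on the assumed clauses of `a_kind_of/2` and may differ from the traces printed in the paper.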

An example of an integrated view is the following scenario: The user wants to see the development of the forests within a certain area over the last years. He uses ATKIS-OK-250 data within his GIS and wants to verify the data with current satellite images. He obtains categorized CORINE land cover data and is looking for the equivalent of forest in this catalogue (see Figure 5). The theorem prover derives the answer to this question while building up the knowledge base. The query would be forest(stadtwald_2000). The following shows the traced path through the knowledge base (KB). An Exit marks a return with a Yes.

    Call: forest(stadtwald_2000)
    Call: size_In_Hectares(stadtwald_2000, _L144)
    Exit: size_In_Hectares(stadtwald_2000, 25)
    Call: 25>10
    Exit: 25>10
    Call: has_Vegetation(stadtwald_2000, _L145)
    Exit: has_Vegetation(stadtwald_2000, forest_Trees)
    Call: a_kind_of(forest_Trees, forest_Plants)
    Call: class(forest_Trees)
    Exit: class(forest_Trees)
    Call: class(forest_Plants)
    Exit: class(forest_Plants)
    Call: forest_Trees=forest_Plants
    Fail: forest_Trees=forest_Plants
    Redo: a_kind_of(forest_Trees, forest_Plants)
    Call: subclass_Of(forest_Trees, forest_Plants)
    Exit: subclass_Of(forest_Trees, forest_Plants)
    Exit: a_kind_of(forest_Trees, forest_Plants)
    Exit: forest(stadtwald_2000)

We can see that the theorem prover first checks whether the size is greater than ten. It knows (through the instance entered by the user) that stadtwald_2000 has forest trees. The system then looks up forest_Trees and concludes that forest_Trees is a kind of forest_Plants. This matches the first PROLOG rule mentioned above, and therefore the answer to the query is Yes.

The user now checks whether the area Weidedamm III, which according to the ATKIS-OK-250 catalogue was a forest in 1990, still is a forest. He checks this with the help of current satellite images in CORINE land cover format. The query would be forest(weidedamm3_2000), and the answer can be seen here:

    Call: forest(weidedamm3_2000)
    Call: size_In_Hectares(weidedamm3_2000, _L144)
    Exit: size_In_Hectares(weidedamm3_2000, 30)
    Call: 30>10
    Exit: 30>10
    Call: has_Vegetation(weidedamm3_2000, _L145)
    Fail: has_Vegetation(weidedamm3_2000, _L145)
    Fail: forest(weidedamm3_2000)

As we see, the query fails because the slot has_vegetation fails. This is because there is no vegetation on this area anymore; the satellite image was classified as Discontinuous_Urban_Fabric within the CORINE catalogue.

The results presented are not very surprising, because most of the conditions for membership in the concept forest were directly met by the instance 'stadtwald', and the missing vegetation in the instance weidedamm is also a criterion that is easy to check. Nevertheless, we want to show that the ontological foundation enables us to perform reasoning that produces results that are not obvious and require some additional knowledge. The ATKIS-OK-250 ontology gives two possible definitions of the concept forest. We will use the second one to deduce that the so-called 'Pauliner Marsch' is also a forest according to that ontology. The only information we have is that it is a member of the concept Sport_And_Leisure_Facilities taken from the CORINE land cover ontology. To classify this area as a 'forest' we need background knowledge about the vegetation and cultivation of sport and leisure facilities. This background knowledge can also be specified using PROLOG clauses:

    is_Cultivated(X, true) :-
        instance_Of(X, Y),
        a_kind_of(Y, artificial_Surfaces).

    has_Vegetation(X, sod_Grass) :-
        instance_Of(X, Y),
        a_kind_of(Y, sport_And_Leisure_Facilities).

The clauses attach characterizing properties to the concepts artificial_Surfaces and sport_And_Leisure_Facilities. These properties are inherited by the instances of the subconcepts, thereby completing the properties needed to classify the instance under consideration as a member of the concept forest. The trace of the PROLOG reasoner illustrates this:

    Call: forest(pauliner_Marsch)
    Call: size_In_Hectares(pauliner_Marsch, _L193)
    Exit: size_In_Hectares(pauliner_Marsch, 40)
    Call: 40>10
    Exit: 40>10
    Call: has_Vegetation(pauliner_Marsch, _L194)
    Call: instance_Of(pauliner_Marsch, _L206)
    Exit: instance_Of(pauliner_Marsch, sport_And_Leisure_Facilities)
    Call: a_kind_of(sport_And_Leisure_Facilities, sport_And_Leisure_Facilities)
    Exit: a_kind_of(sport_And_Leisure_Facilities, sport_And_Leisure_Facilities)
    Exit: has_Vegetation(pauliner_Marsch, sod_Grass)
    Call: a_kind_of(sod_Grass, grass)
    Exit: a_kind_of(sod_Grass, grass)

After comparing the size of the instance with the required size, the vegetation is checked. As there is no vegetation defined for the instance, the axiom mentioned above is used to deduce that all instances of that class have sod_Grass as vegetation. Just as in the examples above, the system is able to identify sod grass as being a kind of grass. The fact that the 'Pauliner Marsch' is cultivated is deduced in a similar way, as shown in the trace below:

    Call: is_Cultivated(pauliner_Marsch, true)
    Call: instance_Of(pauliner_Marsch, _L252)
    Exit: instance_Of(pauliner_Marsch, sport_And_Leisure_Facilities)
    Call: a_kind_of(sport_And_Leisure_Facilities, artificial_Surfaces)
    ...
    Redo: a_kind_of(artificial_Non_Agricultural_Vegetated_Area, artificial_Surfaces)
    Call: subclass_Of(artificial_Non_Agricultural_Vegetated_Area, _L276)
    Exit: subclass_Of(artificial_Non_Agricultural_Vegetated_Area, artificial_Surfaces)
    Call: a_kind_of(artificial_Surfaces, artificial_Surfaces)
    ...
    Exit: a_kind_of(artificial_Surfaces, artificial_Surfaces)
    Exit: a_kind_of(artificial_Non_Agricultural_Vegetated_Area, artificial_Surfaces)
    Exit: a_kind_of(sport_And_Leisure_Facilities, artificial_Surfaces)
    Exit: is_Cultivated(pauliner_Marsch, true)
    Exit: forest(pauliner_Marsch)
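The derivation shown in the two traces can be reproduced with a small, self-contained knowledge base. The following is only a sketch under stated assumptions: the two background clauses are taken from the text, whereas the forest/1 clause, the a_kind_of/2 definition and the subclass_Of/2 facts are reconstructed from the order of goals in the traces and may differ from the definitions actually used in the ATKIS-OK-250 and CORINE ontologies.

```prolog
% Taxonomy fragment. ASSUMPTION: these subclass_Of/2 facts are inferred
% from the traces; the full ontologies contain many more classes.
subclass_Of(sport_And_Leisure_Facilities, artificial_Non_Agricultural_Vegetated_Area).
subclass_Of(artificial_Non_Agricultural_Vegetated_Area, artificial_Surfaces).
subclass_Of(sod_Grass, grass).

% Instance data as it appears in the trace.
instance_Of(pauliner_Marsch, sport_And_Leisure_Facilities).
size_In_Hectares(pauliner_Marsch, 40).

% ASSUMPTION: a_kind_of/2 is the reflexive, transitive closure of
% subclass_Of/2 (the trace shows a class being a kind of itself).
a_kind_of(X, X).
a_kind_of(X, Z) :-
    subclass_Of(X, Y),
    a_kind_of(Y, Z).

% Background knowledge, as given in the text.
is_Cultivated(X, true) :-
    instance_Of(X, Y),
    a_kind_of(Y, artificial_Surfaces).
has_Vegetation(X, sod_Grass) :-
    instance_Of(X, Y),
    a_kind_of(Y, sport_And_Leisure_Facilities).

% ASSUMPTION: a reading of the second forest definition that matches
% the goals in the trace (size, then vegetation, then cultivation).
forest(X) :-
    size_In_Hectares(X, Size),
    Size > 10,
    has_Vegetation(X, Vegetation),
    a_kind_of(Vegetation, grass),
    is_Cultivated(X, true).
```

Loaded into a PROLOG system, the query forest(pauliner_Marsch) succeeds, and running it under the tracer visits the goals in the order listed above. Neither the vegetation nor the cultivation of the instance is stored as a fact; both are derived from its class membership.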

This example by no means covers all possibilities of ontology-based information integration. We restricted ourselves to simple taxonomic reasoning using only the second-order predicates class, subclass_Of and instance_Of. One can imagine making use of other concepts, such as range restrictions on slots or mathematical properties of relations.
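As an illustration of what a range restriction on a slot might look like in the same clause style, consider the following minimal sketch. It is not part of the system described here: the predicates slot_range/2 and valid_slot_value/2, the class vegetation and the fact that grass is a subclass of it are all hypothetical names introduced for this example.

```prolog
% HYPOTHETICAL slot declaration: every value of the slot has_Vegetation
% must be a kind of vegetation.
slot_range(has_Vegetation, vegetation).

% ASSUMED taxonomy fragment, for illustration only.
subclass_Of(grass, vegetation).
subclass_Of(sod_Grass, grass).

% Reflexive, transitive closure of subclass_Of/2 (assumed definition).
a_kind_of(X, X).
a_kind_of(X, Z) :-
    subclass_Of(X, Y),
    a_kind_of(Y, Z).

% A value is admissible for a slot if it is a kind of the slot's
% declared range.
valid_slot_value(Slot, Value) :-
    slot_range(Slot, Range),
    a_kind_of(Value, Range).
```

With these clauses, valid_slot_value(has_Vegetation, sod_Grass) succeeds, while valid_slot_value(has_Vegetation, artificial_Surfaces) fails. A check of this kind could flag slot values that do not belong to the expected part of the taxonomy.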

Discussion

In this paper we demonstrated how the use of formal ontologies can enhance intelligent information retrieval. We have seen that ontologies with formal semantics can help to generate semantic translators between data sources. There are several ways to translate one data source into another, but the benefits of using underlying ontologies together with an inference engine that is able to derive new knowledge are obvious. We outlined the advantages of ontologies and stated that their formal semantics can help to support the semi-automatic translation process between data sources.

We noted that adding new knowledge to an ontology is easier than adding knowledge to semi-structured meta-data. The formal description helps us to find errors (e.g. in units or scales). Ontologies therefore help to improve the data quality. Ontologies are also more flexible, because we can use them not only for functional components (Ref IBROW).

In addition, we can think about the integration of other sources. The satellite picture, for instance, could be pre-processed with advanced image operators (e.g. for texture, edges and colors). This additional knowledge could be transferred into a knowledge base semi-automatically and could act as an additional source for potential queries.

References

AdV (1998). Amtliches Topographisch-Kartographisches Informationssystem ATKIS. Bonn.

Bergamashi, Castano, Vincini, & Beneventano (1999). Intelligent Techniques for the Extraction and Integration of Heterogeneous Information, Workshop Intelligent Information Integration, IJCAI 99, Stockholm, Sweden.

Berre, A. (1999). The IT Standards Approach to GIS Interoperability. A. Vckovski, K.E. Brassel, & H.-J. Schek (Eds.), INTEROP 99, Vol. 1580 (pp. 328), Zürich.

DISGIS-Project (1999). Distributed Geographical Information Systems (DISGIS).

EEA (1997-1999). CORINE Land Cover.

Fensel, D., Decker, S., Erdmann, M., & Studer, R. (1998). Ontobroker: The Very High Idea, FLAIRS-98, 11th International Conference, Sanibel Island, USA.

Galhardas, H., Simon, E., & Tomasic, A. (1998). A Framework for classifying Environmental Metadata, AAAI, Workshop on AI and Information Integration, Madison, WI.

Genesereth, M.R., & Fikes, R.E. (1992). Knowledge Interchange Format Version 3.0 Reference Manual: Knowledge Systems Laboratory, Stanford University.

Gruber, T.R. (1993). A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition, 5(2).

Günther, O. (1998). Environmental Information Systems. Berlin.

Huber, M. (1998). High-tech-Entscheidungstrends bei Geodaten-Servern, GeoBit 3/98, pp. 18-20.

IFGI, I.f.G. (1999). Offene Umweltinformationssysteme, BLAK UIS Workshop, Münster.

Lessing, H., Günther, O., & Swoboda, W. (1995). An Object-Oriented Class Model for the Environmental Data catalogue. H. Kremers & W. Pillmann (Eds.), Space and Time in Environmental Information Systems, 9th International Symposium on Computer Science for Environmental Applications: Metropolis Verlag.

Möller, R., Haarslev, V., & Neumann, B. (1998). Semantics-Based Information Retrieval. J. Cuena (Ed.), IT & KNOWS Information Technology and Knowledge Systems, XV. IFI World Computer Congress, Vienna, Budapest: Austrian Computer Society.

Oestereich, B. (1998). Objektorientierte Softwareentwicklung Analyse und Design mit der Unified Modeling Language. München: Oldenbourg.

OGC (1999a). The OpenGIS Abstract Specification, Open GIS Consortium 99-100r1.doc.

OGC (1999b). Topic 2: Spatial Reference Systems.

Vckovski, A., Brassel, K.E., & Schek, H.-J. (1999). Proceedings of the 2nd International Conference on Interoperating Geographic Information Systems. A. Vckovski, K.E. Brassel, & H.-J. Schek (Eds.), INTEROP 99, Vol. 1580 (pp. 328), Zürich.

Wielemaker, J. (1998). SWI-Prolog 3.1.

Wiener, J.L., Gupta, H., Labio, W.J., Zhuge, Y., Garcia-Molina, H., & Widom, J. (1996). WHIPS: A system prototype for warehouse view maintenance, Workshop on materialized views (pp. 26-33), Montreal, Canada.