Abduction and Deduction in Geologic Hypermaps

Agnes Voisard
Computer Science Institute, Freie Universität Berlin
D-14195 Berlin, Germany
[email protected]
http://www.inf.fu-berlin.de/ voisard

Abstract. A geologic map is a 2-dimensional representation of an interpretation of 3-dimensional phenomena. The work of a geologist consists mainly of (i) inferring subsurface structures from observed surface phenomena and (ii) building abductive models of the events and processes that shaped them during the geologic past. To do this, chains of explanations are used to reconstruct the Earth's history step by step. In this context, many interpretations may be associated with a given output. In this paper, we first present the general contexts of geologic map manipulation and design. We then propose a framework for geologic map designers which supports multiple interpretations.

1 Introduction

A geologic map of a given area is a 2-dimensional representation of accepted models of its 3-dimensional subsurface structures. It contains geologic data which allow an understanding of the distribution of the rocks that make up the crust of the Earth as well as the orientation of the structures they contain. It is based on a geologist's model explaining the observed phenomena and the processes that shaped them in the geologic past. In the current analog approach, this model is recorded in a geologic map with an explanatory booklet that describes the author's conclusions as well as relevant field and other observations (e.g., tectonic measurements, drill-hole logs, fossil records). Today, this variety of information is handled in a digital and even hypermedia form, which requires a suitable geologic hypermap model to be conceived, developed, and implemented beforehand.

The main objective of our project is the design of models and tools well suited to the interaction between users and geologic hypermaps. Geologic hypermaps are a family of hyperdocuments [CACM95] with peculiar requirements due to the richness (in terms of semantics and structure) of the information to be stored. In these applications, users in general are both endusers (e.g., engineers or geology researchers) and designers (map makers). This contribution focuses on the handling and representation of geologic knowledge within hypermap applications and addresses the need for a data model that supports multiple interpretations from one or many geologists. When designing geologic maps, the objectives are twofold: (i) inferring subsurface structures from observed surface phenomena and (ii) building abductive models of the events and processes that shaped them in the geologic past. For this, chains of explanations are used to reconstruct the Earth's history step by step.

Basic requirements for handling geologic applications in DBMS environments are described in [Voi98]. They have been studied within a joint project with geologists at the Free University of Berlin. We focus here on the task of the map maker. To our knowledge, this is one of the first attempts to define tools for geologists. The US Geological Survey [USGS97] is currently defining a database to store geologic information in a relational context. A few authors (e.g., [Hou94,BN96,BBC97]) have also studied the 3-dimensional geometric aspects of geologic applications. Note however that, even though our goal is to build tools for the next generation of geologic maps (i.e., maps stored in a database), map creation cannot be fully automated, as some knowledge is difficult to express in terms of rules: there are no real laws that hold at all times in all possible situations. Hence some steps still need to be performed manually.

Building the next generation of tools for geologists is a challenging task. The underlying geologic models and all the possible ways of manipulating the information make the supporting systems extremely complex. In addition, these systems borrow from many disciplines, such as geospatial databases but also sophisticated visualization, simulation models, and artificial intelligence. Here we restrict our attention to the reasoning process of geologists. As a first step, we present an explanation model for map designers, which is meant to be used during the abduction and deduction processes. The model is based on complex explanation patterns (e.g., simulation models, similarities) and confidence coefficients.
Even though geologic map interpretation has been studied thoroughly in the field of pure geology (e.g., [Bly76,Pow92]) for many years, to the best of our knowledge the mechanism behind geologic map making has not been studied by computer scientists. Besides its complexity, one reason for this is the current lack of data. At present, not many complete geologic maps are stored in digital form. So far, publishing a geologic map has been a time-consuming task. These maps are now in the process of being digitized, but it will take many years before several documented versions of a map are available. Our ultimate goal is to participate in the definition of a digital library of geologic maps whose elements may serve as starting points for further geologic map definitions.

This paper is organized as follows. Section 2 gives examples of geologic map manipulation and shows three main categories of users. Objects of interest and representative queries are given for each category. In Section 3, we present a reasoning model based on explanations and coefficients of confidence. Finally, Section 4 draws our conclusions, relates our work to other disciplines, and presents the future steps of our project.

2 Geologic hypermaps manipulation

Depending on the degree of expertise of the enduser, manipulating geologic hypermaps can be understood at many levels of abstraction. Three main categories of users can be identified, from the naive user to the application designer. They all communicate with the geologic map database in a different way. While a naive user would like to know, for instance, the nature of soils in a given area, a more sophisticated user would like to find out why a particular geologic object was defined in a certain way. An even more sophisticated user would like to access knowledge on other areas, for instance for comparison. Note that the most sophisticated users also have the requirements of the less qualified ones. They all access the basic data of their level together with metadata, which is understood differently for each category. In this section, each category of users is studied separately. For each one we give the main objects of interest. To illustrate the discussion, examples of queries and a description of the tools needed are also given. This leads to a hierarchy of tools for manipulating such maps.

2.1 Traditional enduser (User type 1)

These users want to retrieve straightforward information stored in a geologic database. A typical user of this category is an engineer who would like to find out the type of soil in a given area in order to install a water pipe. These endusers need to access such basic information as well as metainformation. Metainformation has two aspects here. In a geospatial sense, it denotes information regarding the origin of data, for instance the date and the conditions under which some measurements were performed. This metadata is invariant across the three categories. The other kind of metadata is a high-level description of the data in the database, for instance the possible values of a given attribute at the naive user level.

Objects of interest. The major objects of interest at this level are geospatial, textual and multimedia objects (e.g., pictures and videos). These objects, denoted HO, are linked together via hyperlinks. A geologic object is a complex entity of the real world. It has a description as well as spatial attributes. The description may be elaborate, as we will see later. In particular, structured information, possibly of a multimedia type, can be attached to a geologic object. This information can be easily accessed in a hypermap system. What the users see and interact with are hypermaps in a graphical window. These are manipulated in a straightforward manner, through both mouse clicks on cartographic objects and a basic user interface that gives access to data values. More information on the manipulation and structure of HOs can be found in [Voi98].

Definition 2.1 A geologic map GM is a directed weakly connected graph $(V_{HO}, E_{HO}, F_{HO})$, where
1. $V_{HO}$ is a set of hyperobjects HO.
2. $E_{HO}$ is a set of edges.
3. $F_{HO} : E_{HO} \rightarrow \mathcal{P}(V_{HO})$ is a labeling function.
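The structure in Definition 2.1 can be illustrated with a minimal Python sketch. The hyperobject names, the representation of the labeling function as a dictionary, and the helper `is_weakly_connected` are assumptions made for illustration only; they are not part of the model described in [Voi98].

```python
from collections import defaultdict, deque


def is_weakly_connected(vertices, edges):
    """Return True if the directed graph is connected once edge
    directions are ignored (the 'weakly connected' condition)."""
    if not vertices:
        return False
    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first traversal over undirected neighbours
        node = queue.popleft()
        for other in neighbours[node] - seen:
            seen.add(other)
            queue.append(other)
    return seen == set(vertices)


# Hypothetical hyperobjects (V) and hyperlinks (E).
vertices = {"map_sheet", "fault_F1", "drill_log_text", "outcrop_photo"}
edges = [
    ("map_sheet", "fault_F1"),
    ("fault_F1", "drill_log_text"),
    ("fault_F1", "outcrop_photo"),
]
# Assumed encoding of the labeling function F: each edge maps to a subset of V.
labels = {edge: frozenset({edge[1]}) for edge in edges}

print(is_weakly_connected(vertices, edges))
print(all(label <= vertices for label in labels.values()))
```

Both checks succeed for this toy graph; adding a hyperobject without any hyperlink would violate the connectivity condition.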

A prototype for this category of users was coded using ArcView [ESRI96a] and its programming language Avenue [ESRI96b], as reported in [KSV98]. In addition to basic visualization and querying, features such as flexible attribute display and specialized tools for specialists (e.g., a soil scientist) were also implemented. Figure 1 shows a screenshot of the prototype.

Fig. 1. Screenshot from the prototype

Examples of queries. Queries to be posed to the system may include:

At a basic data level (instance level)
- What are the characteristics (alphanumeric description) of this object? (point query)
- What are all geo-objects in this area? (region query)
- What is the nature of the soil here?
- Where on this map do I find fossils of type FS22?
- How much of geologic layer l is represented at depth d?

At a metadata level
- What are the existing soil classifications in this map?
- When was this map created?

2.2 Geology researcher (User type 2)

A geologic map is the basic tool of geologists. Hence these sophisticated users need to access basic information, but they also need to understand why objects relate to each other. They therefore access the theories that justify the existence or the particular aspect (e.g., attribute values or shape) of each geologic object that constitutes the map. For instance, they may require the explanations behind given phenomena.

Objects of interest. These users access all objects described above but also another type of hyperobject. As we will see in the next section, some geologic objects are part of a map for sure (with probability 1), and other objects are "guesstimated" by the map maker (with an existence probability between 0 and 1). Type 2 users can access these two categories of objects as well as the complex structures behind them.

Examples of queries. The interesting queries in this category are those related to object existence as well as assumptions or explanations on objects. The list is quite long as we want to show many possible manipulations of the structure defined in the next section.

At a basic data level
- What are the geologic objects related to the existence of this one?
- What are all the explanations of Ms. XYZ in this area?
- What assumptions led to the definition of this object (if any)?
- Which objects are not based on assumptions?
- Which objects are defined exclusively under assumptions?
- Which objects are defined using this explanation?
- What is the explanation of the genesis of this layer?

At a metadata level
- What are the maps designed after October 1990?
- What are all the geologic maps using Explanation E?
- What are the names of the geologists who studied this area?
- How many versions do I have for geologic map GM231?
- Where should we install drill-holes in this map?
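As a rough illustration, a few of the basic-level queries can be phrased as simple filters. The sketch below assumes a hypothetical in-memory representation in which every geologic object records the identifiers of its assumptions and explanations; the object names and the identifiers `A1`, `A2`, `E1`, `E2` are invented for the example and do not come from the prototype.

```python
# Hypothetical records: object name -> assumptions and explanations it relies on.
objects = {
    "fault_F1": {"assumptions": set(), "explanations": {"E1"}},
    "layer_L2": {"assumptions": {"A1"}, "explanations": {"E1", "E2"}},
    "ore_body_O3": {"assumptions": {"A1", "A2"}, "explanations": {"E2"}},
}


def assumptions_of(name):
    """What assumptions led to the definition of this object (if any)?"""
    return sorted(objects[name]["assumptions"])


def not_based_on_assumptions():
    """Which objects are not based on assumptions?"""
    return sorted(n for n, rec in objects.items() if not rec["assumptions"])


def defined_using(explanation):
    """Which objects are defined using this explanation?"""
    return sorted(n for n, rec in objects.items() if explanation in rec["explanations"])


print(assumptions_of("ore_body_O3"))   # ['A1', 'A2']
print(not_based_on_assumptions())      # ['fault_F1']
print(defined_using("E2"))             # ['layer_L2', 'ore_body_O3']
```

In a database setting these filters would presumably be written as query expressions over the stored structure rather than as Python functions.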

2.3 Designer (User type 3)

One peculiarity of this level is that, for a given map (i.e., covering a given area), many versions based on different interpretations can be defined, as described below. The map making process can be described by two basic mechanisms, abduction and deduction. The geologist starts with a map of a given area. Currently, this map is often a topographic map. From such a map, s/he first looks at broad topographic features such as valleys and summits. Relationships between geologic borders and topography are then inferred. With knowledge of geologic structures at the surface (from drill-holes, for instance), the geologist also infers other geologic structures and groups them together. Groups are obtained by interpolation at the surface but also in the subsurface. In addition, s/he associates explanations or justifications of the presence of certain geologic features with a coefficient of confidence (abduction mechanism).

Geologic map making is an iterative process. The map maker draws a first version of a map based on observed features but also on his/her interpretation of the whole scene. Then s/he verifies the hypotheses by running simulation models and going into the field. A new version is then inferred, and so on. Eventually, s/he obtains a map that corresponds to his/her interpretation but which is not "frozen", as things may still evolve in the future (e.g., with new explanations or newly observed facts). However, the delta between successive versions usually tends to become smaller and smaller.

What we want to provide the designer with, besides assistance in this process, is the possibility of storing many underlying theories. Thus many explanations can be associated with one object, with different coefficients of confidence. Explanations can be combined to form chains of explanations. Different combinations of explanations lead to many interpretations for the same output.
A given interpretation of a geologic region together with all its interpreted geologic features is a version of a geologic map (in the database sense) for that region. Hence there are two ways to create new versions of a geologic map GM in the database:
1. Various explanations can be associated with one geologic object o (o ∈ GM) by one or many geologists.
2. Various definitions (identification and attribute value assignments) of a collection of geologic objects within the same region can be given by one or many geologists.

In addition, the novelty of the approach that underpins our new generation of tools for map makers is the ability to start with existing geologic maps. These are then transformed and customized according to different interpretations. Hence there is no longer any need to create a geologic map from scratch.

Objects of interest. The objects manipulated by these users are geologic maps composed of documented geologic features (in our model, we prefer the term geologic feature to geologic object in order to stay closer to the geologist's reasoning, although both terms are used interchangeably in the paper). Geologic features are simple or complex, as defined below:

    geologic-feature =
        (AttributeList, SpatialExtent, UndergroundExtent)                        /* atomic geologic feature */
      | (AttributeList, SpatialExtent, UndergroundExtent, {geologic-feature})    /* complex geologic feature */

Complex geologic features are, for instance, stratigraphic or tectonic structures. Note that some values of a complex geologic feature (e.g., SpatialExtent and UndergroundExtent) can be inferred from those of its subobjects [EF89]. There is a clear distinction between observed objects, which exist for sure in a map (probability 1), and objects that are "guesstimated". In the sequel we refer to observed objects as "hard objects", as opposed to "soft objects". A soft object can become a hard object when a hypothesis is verified, while a hard object cannot turn into a soft object. The probability attached to the existence of a soft object can change under a different interpretation. Note that within a given map, some external events can change the nature of geologic objects; for instance, the introduction of a drill-hole introduces a new geologic object in the map. Associated with these objects are explanations together with coefficients of confidence. An explanation can justify the presence of an object or document the value of an attribute (e.g., the soil concentration of a component).
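The one-way transition between soft and hard objects can be sketched as follows. The class `Feature` and its methods `reinterpret` and `verify` are hypothetical names chosen for this illustration, and representing a hard object by the probability value 1.0 is an assumption of the sketch rather than a detail given in the model.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    probability: float = 1.0  # assumed encoding: 1.0 means observed ("hard")

    @property
    def is_hard(self) -> bool:
        return self.probability == 1.0

    def reinterpret(self, probability: float) -> None:
        """Change the existence probability of a soft object."""
        if self.is_hard:
            raise ValueError("a hard object cannot turn into a soft object")
        if not 0.0 <= probability < 1.0:
            raise ValueError("a soft object needs a probability below 1")
        self.probability = probability

    def verify(self) -> None:
        """A verified hypothesis turns a soft object into a hard one."""
        self.probability = 1.0


ore_body = Feature("cobalt ore body", probability=0.4)  # hypothetical soft object
ore_body.reinterpret(0.7)   # a different interpretation raises the confidence
ore_body.verify()           # e.g., confirmed by a new drill-hole
print(ore_body.is_hard)     # True
```

Once `verify` has been called, any further call to `reinterpret` raises an error, which mirrors the rule that hard objects never revert to soft ones.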

Objects manipulation and querying. Obviously, querying does not play a crucial role for this category of users. What is important is to provide designers with tools for defining and manipulating the underlying structure (see Section 3). Nevertheless, a few examples of queries and structure manipulation are given below. As the selected collection of queries shows, the designer has many concerns, from the representation of a given attribute to the number of maps defined in a given area.

At a basic data level
- What are the interpretations using Geologic Model M?
- What if the assumptions of Ms. XYZ are not justified anymore?
- What if model M turns out to be inapplicable in this area?
- I know there should be cobalt here. Where should it be (shape/location)?

At a metadata level
- What geologic model could I use to justify this tectonic structure?
- What maps were defined with geologic map GM324 as a starting point?
- How should I represent a layer with iron?
- What are the maps published in area a?
- What is the difference between interpretation I1 and interpretation I2?

3 Supporting Geologic Map Making

This section presents the kernel of a tool that assists map designers in the geologic map making process. We place ourselves in the context of a geologic map factory that communicates with a geologic map library. Eventually, objects defined in the map factory will be validated and stored in the geologic map library. Such a library is extremely useful in this context, as it allows in particular geologic maps to be built with other maps as starting points. Browsing the library is hence a key functionality of the environment offered to geologic map makers. The generic geologic map library is not presented here. It is a special kind of geospatial library [SF95,FE99] that supports multiversioning based on various interpretations. A geologic map factory contains many modules, among them a reasoning module. Components such as the cartographic module, the help module, or the validation module are not our focus here. Rather, we study the major task of the reasoning module, namely the support for abduction and deduction in the map making process.

3.1 Reasoning model

In order to assist the designer in the map making process, the main object to consider is a reasoning structure, i.e., a documented geologic map (DGM). All the elements of a DGM, i.e., geologic features (atomic or complex) and explanations, are described below, first in general terms and then using an O2-like [BDK92] specification language. For the sake of simplicity and legibility, the specifications we give are kept as simple as possible.

Geologic features. The basic elements to consider are geologic features. These can be atomic (e.g., a fault) or complex (e.g., a tectonic structure). In any case they have a description (alphanumeric attributes), a spatial extension (spatial part that gives the shape and the location) and an underground extent. A geologic feature is of one of the two following types:

Hard type. Such objects are part of the map with probability one. The feature was, for instance, seen in the field, and the numeric values of its attributes could be computed.

Soft type. These objects belong to the map with a probability (confidence) between 0 and 1. We will see further how they are created.

Many kinds of relationships exist among geologic features. Besides composition, topological relationships such as adjacency are of prime importance. These relationships are beyond the scope of our study. In addition, fuzziness plays a crucial role in these applications, as geologic features can have (i) a fuzzy description (e.g., a concentration of cobalt between 70 and 100 %), (ii) a fuzzy spatial part (imprecise location, shape with fuzzy borders), and (iii) a fuzzy underground extent.

Schema definition:

    Class GeologicFeature
        type tuple (description: AttributeList,
                    spatialextent: Spatial,
                    undergroundextent: Solid)
    Class SimpleGeologicFeature inherits GeologicFeature
    Class ComplexGeologicFeature inherits GeologicFeature
        type tuple (geolfeatures: {GeologicFeature})
    Class HardGF inherits GeologicFeature
    Class SoftGF inherits GeologicFeature
        type tuple (origin: Text)

Classes AttributeList, Spatial, and Solid are not detailed here. Spatial and Solid embody both geometry (coordinate location and shape) and topology. For a possible definition of Spatial (basically points, lines, curves, polygons and regions) in a database context, see [EF89,SV92,Wor94,GS95,Sch97]. Solid (volume) modeling (3D geometric modeling) appears typically in CAD/CAM applications (see for instance [BN96,DH97]).
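To make the feature hierarchy more concrete, the following Python sketch mirrors part of the schema and shows one way the spatial extent of a complex feature could be inferred from its subobjects. Reducing `Spatial` to an axis-aligned bounding box, making the extent optional, and the function `inferred_extent` are all simplifying assumptions of this sketch; the underground extent is omitted.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed stand-in for the Spatial class: (xmin, ymin, xmax, ymax).
Box = Tuple[float, float, float, float]


@dataclass
class GeologicFeature:
    description: dict
    spatialextent: Optional[Box] = None


@dataclass
class ComplexGeologicFeature(GeologicFeature):
    geolfeatures: List[GeologicFeature] = field(default_factory=list)


def inferred_extent(feature: GeologicFeature) -> Optional[Box]:
    """Return the stored extent, or the bounding box of all subobjects."""
    if feature.spatialextent is not None:
        return feature.spatialextent
    parts = [inferred_extent(sub) for sub in getattr(feature, "geolfeatures", [])]
    parts = [box for box in parts if box is not None]
    if not parts:
        return None
    return (
        min(b[0] for b in parts),
        min(b[1] for b in parts),
        max(b[2] for b in parts),
        max(b[3] for b in parts),
    )


fault = GeologicFeature({"kind": "fault"}, (0.0, 0.0, 2.0, 1.0))
fold = GeologicFeature({"kind": "fold"}, (1.0, 1.0, 4.0, 3.0))
structure = ComplexGeologicFeature({"kind": "tectonic structure"}, None, [fault, fold])
print(inferred_extent(structure))  # (0.0, 0.0, 4.0, 3.0)
```

A real implementation would rely on proper geometric types and would also have to deal with the fuzzy extents mentioned above.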

Explanations. We are interested here in the process of documenting such maps, i.e., in the possible collections of explanations that justify:
1. The existence of a geologic feature.
2. The particular values of some attributes of a geologic feature.
3. The presence of other explanations (for instance when an explanation contains references, typically when a bibliographic reference is given in an explanatory text).

In this context, an explanation can be of three different types:
1. Provable (reliable). It can be justified by a hard fact, such as a drill-hole, or by a geologic simulation model. Note the hierarchy in the reliability.
2. Similarity-based (training areas). This occurs when some part of a map seems to be similar to a part of either the same map or another map.
3. Experience-based (or feeling). Such explanations can also mutate to become provable if an underlying assumption is verified.

An explanation is a complex object composed of structured text and, possibly, (bibliographic) references, references to simulation models, geologic features in the map, and other areas (coordinates) from either the same or a different map. Moreover, it contains the geologic features that serve as justification for the argumentation as well as the geologic features that could be further consequences of this explanation through the deduction mechanism. Such arguments could be defined as query expressions over a geologic map (set of geologic features). The simple specification of an explanation is given below. Note the basic superclass Explanation, which is further specialized into various explanation classes such as ProvableExplanation, SimilarityExplanation, and ExpertExplanation. In addition, an explanation is either basic or complex, which leads to classes BasicExplanation and ComplexExplanation.

Schema definition:

    Class Explanation
        type tuple (author: string,
                    argument: {HardGF},
                    consequence: {SoftGF})
    Class BasicExplanation inherits Explanation ()
    Class ComplexExplanation inherits Explanation
        type tuple (all-explanations: {Explanation})

    Class ProvableExplanation inherits Explanation      /* e.g., models or drill-holes */
    Class SimilarityExplanation inherits Explanation    /* similarity-based explanation */
    Class ExpertExplanation inherits Explanation        /* experience-based explanation */

    Class BasicExplanation
        type tuple (text: string)
    Class ModelExplanation inherits BasicExplanation
        type tuple (argument: string)                   /* argument = modelref */
    Class BiblioExplanation inherits BasicExplanation
        type tuple (argument: bibitem)                  /* argument = bibliographic reference */
    Class HardObjectExplanation inherits BasicExplanation
        type tuple (argument: {GeologicFeature})        /* argument = geologic features */
    Class SoftObjectExplanation inherits BasicExplanation
        type tuple (argument: {GeologicFeature})        /* argument based on geologic features */
    Class AreaExplanation inherits BasicExplanation
        type tuple (area: zone)                         /* argument: Area, the coordinates of a region */

The environmental processes that lead to a given geologic feature can be found by recursively examining all its explanations of type ModelExplanation.
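This recursive lookup can be sketched in Python as follows. The classes are a reduced, assumed rendering of the schema (the attribute `argument` of ModelExplanation is renamed `modelref`, following the schema comment), the function name `model_references` is hypothetical, and the model names in the example are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Explanation:
    author: str


@dataclass
class BasicExplanation(Explanation):
    text: str = ""


@dataclass
class ModelExplanation(BasicExplanation):
    modelref: str = ""  # reference to a simulation model


@dataclass
class ComplexExplanation(Explanation):
    all_explanations: List[Explanation] = field(default_factory=list)


def model_references(explanation: Explanation,
                     _seen: Optional[Set[int]] = None) -> List[str]:
    """Collect the model references reachable from an explanation,
    descending recursively into complex explanations."""
    seen = set() if _seen is None else _seen
    if id(explanation) in seen:  # guard against shared or cyclic parts
        return []
    seen.add(id(explanation))
    if isinstance(explanation, ModelExplanation):
        return [explanation.modelref]
    if isinstance(explanation, ComplexExplanation):
        refs: List[str] = []
        for part in explanation.all_explanations:
            refs.extend(model_references(part, seen))
        return refs
    return []


erosion = ModelExplanation(author="XYZ", text="valley shape", modelref="erosion-model")
note = BasicExplanation(author="XYZ", text="field note")
inner = ComplexExplanation(author="XYZ", all_explanations=[erosion, note])
uplift = ModelExplanation(author="ABC", modelref="uplift-model")
outer = ComplexExplanation(author="ABC", all_explanations=[inner, uplift])
print(model_references(outer))  # ['erosion-model', 'uplift-model']
```

The `seen` set is a precaution added in this sketch so that an explanation shared by several chains is visited only once.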

Complete and Documented Geologic Map (DGM). A complete geologic map is a 5-tuple (r, d, c, l, dgm), where r is the reference of the map in the map library, d the date of creation, c the coordinates of the covered area (dom(c) = (