USING CASE-BASED REASONING IN A SYSTEM THAT SUPPORTS PETROGRAPHIC ANALYSIS

M. Abel, E.B. Reategui, J.M. Castilho

Abstract: This paper presents an intelligent database system to support petrographic analysis. The main goal of the research is to solve problems in knowledge acquisition and representation in order to develop efficient systems in domains with weak theory. We propose coupling case-based reasoning technology with a database system to develop systems that store and manage information and support expert tasks. The system manages petrographic information and helps beginners to describe and interpret reservoir sandstone occurrences. The paper evaluates the use of cases to represent knowledge in a geological domain, describes the knowledge acquisition techniques used to construct the knowledge base, and discusses the decisions made and the problems met during system development.

1. INTRODUCTION

Part of the task of exploring for petroleum lies in understanding the oil reservoir rock. Geologists collect hundreds of petrographic analyses to study a sedimentary unit that may be economically interesting. Each petrographic analysis includes a complete description of the minerals and structures of the rock sample (optical and electronic microscope analysis, isotopic and chemical analysis and petrophysics) and proposes some theories about the genesis of the rock. This information composes the body of knowledge about a sedimentary unit or oil field, which is kept by sedimentology experts. This work focuses on the petrographic analysis and interpretation of the most important oil reservoir rock, the sandstone. We present a system that assists non-expert petrographers in describing and interpreting rock samples using previous similar cases. It stores and manages petrographic information on several sedimentary units, performs automatic quantitative classification of sandstones and suggests a genetic interpretation based on previous cases and on the expert's heuristics. When two cases are similar, the solution adopted for the previous case (provided it was a good solution) can usually be adopted for the new case or, at other times, be modified to fit the particular requirements of the new case.

2. THE APPLICATION DOMAIN

The petrographic study of reservoir rock gives important information for determining how valuable a deposit could be. With a petrographic analysis we can estimate the porosity and permeability of the rock that contains oil, which are related to oil flow and production. The exploration of a single oil field may demand hundreds of rock sample analyses to characterize the kind of reservoir. Studying a rock sample to support oil exploration means collecting the results of several different forms of analysis, such as optical and electronic microscope analysis, isotopic and chemical analysis and petrophysics, and developing some interpretation of all the data. In an oil company, these analyses are made by petrographers who each use their own methods to perform the task and to record the information. As a result, the body of knowledge about an oil reservoir is distributed among several stores, including human minds, written documentation and magnetic storage. Experience has shown that most of the information produced is lost or cannot be retrieved when a field is re-evaluated later.

Besides the problem of the geological memory of the oil field, which concerns the database resource, there is the problem of supporting an expert task and standardizing the petrographic task. Training a good petrographer demands years of experience and study in a hard subject. Each professional tends to develop a particular method of describing a rock sample, including some subjective interpretation of the rock features. One petrographer can hardly reconstruct the original aspect of a rock from a description made by another. This is one of the causes of information loss. Another problem is the representation of weakly structured information using database system models. This work addresses two problems in the petrological study of reservoirs:

• how to acquire, represent and store symbolic and weakly structured information in an information system that guarantees safe and efficient management;

• how to use the expert knowledge embedded within this information to guide the description of new samples in a non-subjective and standard way, and to support geological correlation and interpretation.

In the first phase of the project, we collected the user requirements for the reservoir petrographic information system. The requirements address four aspects of the application: the heterogeneity, the weak structure and the great amount of data, and the need for knowledge processing. The system should manage information about geological unit descriptions and geographically referenced rock analyses to support geological correlations among layers and units. To accomplish this goal the system should manage different types of data, such as images, numeric tables and textual information, and include features of Geographical Information Systems (GIS, as described by Abdelmonty et al. 1993) to treat spatially correlated data. Besides information representation, there is the problem of acquiring and using the expert's problem-solving knowledge. Given this information and these requirements, we decided to address a representative part of the problem. Three types of description obtained through optical microscope analysis are treated by the system:

• a qualitative description of a rock sample, which describes the way that minerals and structures are presented in the rock, their interrelationships and main features. Each sample has about 45 features that need some textual description;

• a quantitative description, which quantifies the minerals, pores (the empty space among minerals, which could store oil) and special features present in the thin section. Each sample contains about eighty minerals or features that need to be quantified;

• a microphotograph obtained with an optical or electronic microscope, which illustrates the general aspects of the rock.

The knowledge about how to describe and interpret rock samples is used to support the rock description and to suggest the provenance and origin of the rock. In a group of 20 samples that are quantified, only one is described and photographed. This complete description - the qualitative description - provides the main information for classifying and interpreting the rock features. It has the form of a weakly structured text and describes the textural aspects of the rock resulting from the environmental conditions of the sedimentary unit. The petrographer describes the general aspects of the rock, the structures that can be identified, and how the minerals were disposed or altered during deposition.

The quantitative description consists of a numeric table, each field representing a mineral present in the rock. A rock may contain eighty out of five hundred possible minerals or features. The petrographer counts the occurrences of each mineral in the area of a thin section. All the minerals are grouped into ten different categories and their totals are used to characterize the rock. The classification, provenance and porosity of the rock are obtained by plotting the proportion of each category in one of several types of compositional triangular diagram (Folk, 1968). These evaluations are used to indicate the genetic and environmental interpretation of the rock.

The microphotograph illustrates the general aspects of the sample and exploits the strong pictorial memory of the geologist. Many sedimentary interpretations are made using only the visual appearance of the rock.
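The step from point counts to a position in a triangular diagram can be illustrated with a minimal Python sketch. It is not the system's implementation: the mapping of minerals to the three poles Q (quartz), F (feldspar) and L (lithic fragments), the function names and the field boundaries are assumptions made for illustration, the boundaries being the ones commonly quoted for Folk's sandstone diagram rather than values stated in this paper.

```python
def qfl_percentages(counts, pole_of):
    """Sum point counts per pole and renormalise the three poles to 100%."""
    totals = {"Q": 0.0, "F": 0.0, "L": 0.0}
    for mineral, n in counts.items():
        pole = pole_of.get(mineral)  # minerals outside the three poles are ignored
        if pole in totals:
            totals[pole] += n
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no framework grains counted")
    return {p: 100.0 * v / grand_total for p, v in totals.items()}


def folk_name(q, f, l):
    """Name a sandstone from Q, F, L percentages (assumed field boundaries)."""
    if q >= 95:
        return "quartzarenite"
    if q >= 75:
        return "subarkose" if f >= l else "sublitharenite"
    if l == 0 or f / l >= 3:
        return "arkose"
    if f / l >= 1:
        return "lithic arkose"
    if f / l >= 1 / 3:
        return "feldspathic litharenite"
    return "litharenite"


# Hypothetical point counts for one thin section (invented values).
counts = {"quartz": 120, "k-feldspar": 60, "plagioclase": 10,
          "volcanic fragment": 10, "calcite cement": 40}
pole_of = {"quartz": "Q", "k-feldspar": "F", "plagioclase": "F",
           "volcanic fragment": "L"}

pct = qfl_percentages(counts, pole_of)
name = folk_name(pct["Q"], pct["F"], pct["L"])
```

Cement and pores are left out of the renormalisation here because only framework grains are plotted in this kind of diagram; the system's ten mineral categories would replace the three-pole mapping assumed above.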

Other objects of the system are basin, field, unit, depth and rock. At this time, these objects represent attributes of the single object sample. With the expansion of the system they will be used for geographic and geological correlation.

3. APPROACH AND CONCEPTS APPLIED

The system was designed in accordance with the requirements of a real application. The geological knowledge was elicited from an expert petrographer in two phases using different knowledge engineering techniques. The system was implemented to support ordinary database queries as well as more complex queries that demand the use of artificial intelligence techniques. To achieve this goal, the system has four main components:

• a case library that contains previous cases solved by the expert, stored in a relational database;

• a knowledge base that stores, in a symbolic system, the domain knowledge and the problem-solving method of the expert;

• a reasoning mechanism that looks for the cases in the case library that most resemble the problem case, implemented using heuristic search and the indexing mechanism of the database;

• an adaptation mechanism that modifies the solution adopted for an old case and makes it fit the problem case, using the knowledge stored in the knowledge base.

3.1. The knowledge acquisition and modelling

The knowledge acquisition process was carried out in two distinct phases. In the first phase, traditional knowledge engineering techniques were used, such as bibliographic immersion, interviews using retrospective and concurrent protocols, and card-sorting (Wright and Ayton 1987). The main goals of this phase were to identify the user's expectations for the system and to specify the object hierarchy that would compose the domain knowledge in sedimentary petrography. In this phase the schema (as used in Mattos, 1991) of the knowledge base was generated.

In the second phase of the knowledge acquisition process, the expert provided descriptions of a large number of samples. Each sample description corresponds to a previous problem in sedimentary and genetic interpretation that was solved by the expert, and includes the diagenesis, the genetic interpretation and the classification of that sample. A description also constitutes a case and represents knowledge at an operational level (Kolodner 1993), that is, it makes explicit how a sedimentary interpretation was carried out, or how a piece of knowledge (for instance a set of rock features) was applied in identifying a specific kind of deposit.

The reasoning by which the expert built the rock sample interpretations from the information obtained from a thin section was analyzed using retrospective protocols. The expert was presented with a number of cases, or solved problems, and was asked to explain why a specific solution was chosen instead of another and which features supported his decision. This information was represented as knowledge graphs (Leão and Rocha 1990), a knowledge graph being a directed acyclic AND/OR graph in which three types of nodes can be identified:

• hypothesis nodes: representing the classification hypothesis considered in the graph;

• evidence nodes: representing different features that support the classification hypothesis. They appear in the graph in their order of importance;

• intermediate nodes: representing different groupings of evidence used by an expert for reasoning about a problem.

There is a weight associated with each node of the graph that indicates how much that particular node influences the classification hypothesis. The graph shown in figure 1 illustrates how the expert recognizes a tractional deposit using textural features of the rock, that is, stratification, granulometry and sorting. The numbers associated with the nodes indicate the confidence factor (CF) of the corresponding feature (Buchanan and Shortliffe, 1984), or how much the expert believes that a hypothesis is true if some particular evidence (or group of evidence) associated with the hypothesis is confirmed. In the example of figure 1, the feature cross-stratification alone practically confirms the interpretation, as shown by its CF of 9. The features coarse to medium grained and moderately sorted each weakly indicate a tractional deposit, with CFs of 6 and 7 respectively. Taken together, these two features raise the confidence in the interpretation to 8. All three features together are enough to confirm the interpretation, as indicated by the CF of 10.

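[Figure 1: knowledge graph. Hypothesis node TRACTIONAL DEPOSITS (10), supported by the evidence node CROSS-STRATIFICATION (9) and by an intermediate node (8) that groups the evidence nodes COARSE TO MEDIUM GRAINED (6) and MODERATELY SORTED (7).]

Figure 1 - The knowledge graph showing the relationship among rock features and the interpretation of the sedimentary deposit.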
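The graph of figure 1 can be written down as a small data structure. The Python sketch below is only one possible reading: the class name `Node`, the function `confidence` and the evaluation rule (a grouping node contributes its own CF when all of its children are confirmed, otherwise the strongest confirmed branch is used) are assumptions chosen so that the numbers quoted in the text are reproduced; the paper does not state how the factors are chained.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    cf: int  # expert-assigned confidence factor, on the 0-10 scale of figure 1
    children: List["Node"] = field(default_factory=list)  # empty for evidence nodes


def confidence(node, observed):
    """Confidence contributed by `node`, given the set of observed feature names."""
    if not node.children:  # evidence node
        return node.cf if node.name in observed else 0
    scores = [confidence(child, observed) for child in node.children]
    if all(s > 0 for s in scores):
        return node.cf  # every grouped piece of evidence is confirmed
    return max(scores)  # assumed fallback: strongest confirmed branch


# The graph of figure 1.
texture = Node("texture", 8, [
    Node("coarse to medium grained", 6),
    Node("moderately sorted", 7),
])
tractional = Node("tractional deposits", 10, [
    Node("cross-stratification", 9),
    texture,
])

only_cross = confidence(tractional, {"cross-stratification"})
only_texture = confidence(tractional, {"coarse to medium grained", "moderately sorted"})
all_three = confidence(tractional, {"cross-stratification",
                                    "coarse to medium grained",
                                    "moderately sorted"})
```

With this rule the three evaluations give 9, 8 and 10, matching the values discussed for figure 1.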

The collected cases represent instances of problems, but they do not cover the whole range of possible rock occurrences. It is almost impossible to determine the number of cases necessary to cover all possible rock occurrences, and it is just as difficult to convince the expert of the need for such a large number of cases. For this reason, knowledge graphs were used to expand the range of problems addressed by the system. When building the knowledge graphs, the knowledge engineer guided the expert by using complementary information represented in the domain knowledge acquired in the first phase of the knowledge acquisition process. He also checked the expert's graphs against the cases already stored in the library in order to validate the graphs.

SANDSTONE
1. Qualitative description
1.1 Identification
1.2 Macroscopy description
1.3 Microscopy description
1.4 Detrital composition
1.5 Diagenesis
1.6 Porosity
1.7 Genetic interpretation
1.8 Classification
2. Quantitative description
2.1 Quantity of each mineral set
2.2 Quantity of each kind of mineral
3. Photography or image

Figure 2. The object sandstone, composing the overall structure of a case.

The domain knowledge is represented by a set of complex objects formed by an aggregation of attributes and values plus the relationships among objects (Abel and Castilho 1993). A number of constraints control the values of object attributes and guarantee that each object in the knowledge base is a possible object in the real world. Furthermore, each attribute is classified as essential or accessory, a significance factor that indicates how important the feature is for the recognition of the object. This classification is based on the ranking of attributes in the system Internist (Miller et al. 1986), and it has been used to represent the knowledge stored in intermediate layers of neural networks (Reategui and Leão 1993) and to direct the reasoning process of a hybrid symbolic/connectionist system in cardiology (Leão and Reategui 1993). Essential attributes represent the features that guarantee the identity of the object. Accessory attributes complement the description of the object, but they do not have to be present for the object to be correctly identified.

The description of a sample is a special object that represents a case in the case library. The attributes and values represent the petrographic features of the rock; the classification, diagenesis and genetic interpretation are special attributes that hold the solution of the problem (items 1.5, 1.7 and 1.8 in figure 2). The overall structure of a case is presented in figure 2.
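As an illustration of the case structure just described, the following Python sketch mirrors the outline of figure 2. The class and field names are hypothetical; the paper represents cases as objects in a symbolic system and as relational tables, not as Python classes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Attribute:
    value: str
    essential: bool = False  # True for essential attributes, False for accessory ones


@dataclass
class SandstoneCase:
    identification: str
    # Qualitative description: attribute name -> textual value plus significance.
    qualitative: Dict[str, Attribute] = field(default_factory=dict)
    # Quantitative description: mineral or feature name -> point count.
    quantitative: Dict[str, int] = field(default_factory=dict)
    # Solution part of the case (items 1.5, 1.7 and 1.8 of figure 2).
    diagenesis: Optional[str] = None
    genetic_interpretation: Optional[str] = None
    classification: Optional[str] = None
    image_path: Optional[str] = None  # hypothetical reference to the microphotograph

    def is_solved(self):
        """A case is solved once all three solution attributes are filled in."""
        return None not in (self.diagenesis, self.genetic_interpretation,
                            self.classification)

    def essential_features(self):
        """Values of the attributes that guarantee the identity of the object."""
        return {k: a.value for k, a in self.qualitative.items() if a.essential}


# Invented example values, for illustration only.
case = SandstoneCase(
    identification="sample-001",
    qualitative={
        "stratification": Attribute("cross-stratification", essential=True),
        "colour": Attribute("reddish"),
    },
    quantitative={"quartz": 120, "k-feldspar": 60},
)
```

A new sample entered by the user would start with the three solution fields empty; a stored case has them filled in by the expert.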

4. THE QUERYING AND REASONING PROCESS

The user describes a sandstone sample guided by a predefined but flexible structure, similar to that depicted in figure 2. This structure is stored and managed mainly by the database system. The database system also keeps a dictionary of petrographic terms, such as mineral and feature names, and the association of each legal term with the specific attribute of the case that the term qualifies. These terms are suggested to the user when he fills in the mineralogical, textural and structural characteristics of the rock sample. Using the filled structure, the system computes the contents of certain attributes, previously classified as essential by the expert, and applies the compositional classification method to obtain the formal name of the rock. This method considers the proportions of the rock-forming minerals, listed in the quantitative description, and the textural aspects of the rock. The algorithm is based on the Folk (1968) method of compositional classification, but the system could easily include alternative classifications.

After that, using the formal name of the rock and particular values of the textural and structural descriptions, the system searches for similar cases in the case library. The main classes of rock and the features that indicate the depositional environment of the rock are used for indexing the samples in the database, in order to improve the retrieval of similar cases. The similarity concept applied in this work is not based on the number of attributes with the same value. The system considers which types of attribute match and on which values the match was achieved. This is necessary to treat the sufficiency index involved in the textural aspects of the rocks, which is implemented by an inference system written in Lisp. That system performs a forward search among the graphs and the cases selected by the database management system in the previous phase, looking for the specific sample that most closely matches the user's sample. After that, the system evaluates the interpretation associated with this sample using the knowledge graph and the features of the rock. It modifies the inappropriate parts of the solution and stores this new interpretation with the sample in the database.

Using knowledge graphs to guide case adaptation is in fact similar to using inferential rules and heuristics for adaptation. Most case-based systems that include adaptation tasks use some sort of knowledge elicited from experts in the adaptation process. For instance, the system Persuader (Sycara 1988) uses rules and heuristics to create adaptation strategies to generate compromise solutions in labour negotiations. The knowledge graph differs from rules in the amount of knowledge that it can represent: each graph would have to be replaced by four or five production rules.
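The difference between counting matching attributes and weighting them by type can be shown with a short Python sketch. It is a simplification under stated assumptions: the weights `w_essential` and `w_accessory`, the function names and the example values are invented, and the actual system performs this step with a Lisp inference system and the knowledge graphs rather than with a single weighted sum.

```python
def similarity(new, old, essential, w_essential=3.0, w_accessory=1.0):
    """Weighted share of the new sample's attributes that the old case matches."""
    total = matched = 0.0
    for attr, value in new.items():
        weight = w_essential if attr in essential else w_accessory
        total += weight
        if old.get(attr) == value:
            matched += weight
    return matched / total if total else 0.0


def retrieve(new, library, essential, k=1):
    """Return the k stored cases most similar to the new sample."""
    ranked = sorted(library,
                    key=lambda case: similarity(new, case["features"], essential),
                    reverse=True)
    return ranked[:k]


# Invented example: case A shares one essential attribute with the new sample,
# case B shares two accessory attributes.
essential = {"stratification"}
new_sample = {"stratification": "cross-stratification",
              "granulometry": "coarse to medium",
              "sorting": "moderately sorted",
              "colour": "red"}
library = [
    {"id": "A", "interpretation": "tractional deposit",
     "features": {"stratification": "cross-stratification",
                  "granulometry": "fine", "sorting": "well sorted",
                  "colour": "grey"}},
    {"id": "B", "interpretation": "other",
     "features": {"stratification": "massive",
                  "granulometry": "coarse to medium",
                  "sorting": "moderately sorted", "colour": "grey"}},
]
best = retrieve(new_sample, library, essential)[0]
```

Case B matches more attributes by count, yet case A is retrieved because the single attribute it shares is an essential one, which is the behaviour the paragraph above asks of the similarity measure.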

5. IMPLEMENTATION ASPECTS

The cases are mapped from the object representation described in section 3.1 to a set of normalized tables in a relational database system. The relational model itself was not used as the conceptual model for the cases because it would lead to difficulties in representing symbolic knowledge (Kaula & Ngwenyama 1990). For instance, the relational model would not be appropriate to represent objects that have a different and variable number of attributes, or objects that have a large number of relationships among them. However, using a relational database system to manage the information allows the storage of larger amounts of data and may improve the performance of the system (Parsaye et al. 1989). Figure 3 depicts the overall structure of the system.

[Figure 3: block diagram with the components INTERFACE, INFERENCE SYSTEM, KNOWLEDGE GRAPHS, SQL PROCESSOR, CASES and DBMS.]

Figure 3. The components of the system: the case library, stored in the database, the integration module, the symbolic system and user interface.

Only the descriptive part of the knowledge - the rock structure and its features - has been represented in the tables of the database system. The more complex constraints, the methods and the knowledge graphs have been stored in a symbolic Lisp system that accesses and modifies the database through an interface and the SQL language processor. This system applies the knowledge graphs using forward-backward inference combined with a method of chaining the confidence and influence factors. The system is being developed in the Microsoft Windows environment, using a database management system and a Lisp language, coupled through the ODBC interface. Problems with the integration of the different environments and difficulties in developing some aspects of the system point to a reimplementation of the system in the C language. At present the system has a rudimentary interface and stores about eighty quantitative descriptions and twelve qualitative descriptions of samples from two different sedimentary units.
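One way to map cases with a variable number of attributes onto normalized tables is an attribute-value table keyed by sample. The Python sketch below uses the standard `sqlite3` module purely as a stand-in for the DBMS reached through ODBC; the table names, column names, index and sample rows are all assumptions, since the paper does not give its schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (
    sample_id  INTEGER PRIMARY KEY,
    unit       TEXT,
    depth_m    REAL,
    rock_class TEXT              -- formal name from the compositional classification
);
CREATE TABLE feature (           -- one row per described attribute of a sample
    sample_id  INTEGER REFERENCES sample(sample_id),
    attribute  TEXT,
    value      TEXT,
    PRIMARY KEY (sample_id, attribute)
);
CREATE INDEX idx_sample_class ON sample(rock_class);
""")

# Invented rows, for illustration only.
conn.executemany("INSERT INTO sample VALUES (?, ?, ?, ?)", [
    (1, "unit-1", 1520.0, "arkose"),
    (2, "unit-1", 1534.5, "arkose"),
    (3, "unit-2", 980.0, "litharenite"),
])
conn.executemany("INSERT INTO feature VALUES (?, ?, ?)", [
    (1, "stratification", "cross-stratification"),
    (1, "sorting", "moderately sorted"),
    (2, "stratification", "massive"),
    (3, "stratification", "cross-stratification"),
])


def candidates(conn, rock_class, attribute, value):
    """Preselect stored samples by rock class and one indexed feature value."""
    rows = conn.execute(
        "SELECT s.sample_id FROM sample s "
        "JOIN feature f ON f.sample_id = s.sample_id "
        "WHERE s.rock_class = ? AND f.attribute = ? AND f.value = ? "
        "ORDER BY s.sample_id",
        (rock_class, attribute, value),
    ).fetchall()
    return [row[0] for row in rows]


preselected = candidates(conn, "arkose", "stratification", "cross-stratification")
```

In this arrangement the database narrows the case library to a handful of candidates, and the symbolic system would then apply the knowledge graphs to that reduced set.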

6. CONCLUSION

This work has explored the efficiency of a database system and the flexibility and richness of a knowledge base system applied to a complex domain. Knowledge elicited through different knowledge acquisition techniques has been combined with that of real cases to assist petrographers in interpreting new rock samples. A prototype has been developed and is being tested, using simple but efficient approaches in order to produce an economically feasible system. The main aspects involved in this project are:

• the selection of the most appropriate knowledge acquisition techniques for a complex domain such as sedimentary geology;

• the definition of a conceptual model for the representation of knowledge and data for efficient access and update;

• the use of case-based reasoning to assist the user in describing rock samples and providing sedimentary interpretation;

• the definition of the appropriate computational environment to develop and support this kind of system.

Each decision was made taking into account aspects such as the flexibility and subjectivity of the task, the richness of the inference, the amount of data expected in real use, the efficiency of the running version and the estimated cost of the final version of the system.

In future work the system will be extended to identify and interpret other kinds of reservoir rocks, such as conglomerates and carbonate rocks, the most economically important sedimentary rocks. We are also incorporating into the system another type of reasoning that experts in the field use when building geological interpretations for a new rock sample: they use implicit information about the geographic location of the sample and about the neighboring rocks that form the geological context of the sedimentary unit. We will incorporate into the system methods generally used in Geographic Information Systems to deal with spatial reasoning.

7. REFERENCES

ABDELMONTY, A.I.; WILLIAMS, M.H. and PATON, N.W. Deduction and deductive databases for geographic data handling. In ABEL, D. and OOI, B. C. (Eds.) Advances in Spatial Databases. Lecture Notes in Computer Science, 692. Heidelberg, Springer-Verlag, 1993.

ABEL, M. and CASTILHO, J. M. V. de. Hybrid information systems: integrating data and knowledge management. In Proceedings International Conference of the Sociedade Chilena de Ciencia de la Computacion. La Serena, Chile, 1993.

ABEL, M.; REATEGUI, E. B.; CASTILHO, J. M. V. and CAMPBELL, J. Evaluating Case-Based Reasoning in a geological model. Proceedings. London, DEXA 95, 1995.

BUCHANAN, B. and SHORTLIFFE, E. Rule-based expert systems: the MYCIN experiments. Reading, Addison-Wesley, 1984.

FOLK, R. L. Petrology of sedimentary rocks. Austin, Texas, Hemphill's Book Store, 1968.

KAULA, R. and NGWENYAMA, O. K. An approach to open intelligent information systems. Information Systems, 15(4): 489-496, 1990. Oxford: Pergamon Press.

KOLODNER, J. Case-based reasoning. San Mateo, Morgan Kaufmann, 1993.

LEÃO, B. F. and ROCHA, A. F. Proposed methodology for knowledge acquisition: a study on congenital heart disease diagnosis. Methods of Information in Medicine, 29: 30-40, 1990.

LEÃO, B. F. and REATEGUI, E. B. A hybrid connectionist expert system to solve classificational problems. In Proceedings of Computers in Cardiology, 1993. London, UK.

MATTOS, N. M. An approach to knowledge base management. Berlin, Springer-Verlag, 1991.

MILLER, R. A. et al. Internist-I: an experimental computer-based diagnostic consultant for general internal medicine. In REGGIA, J. A. and STANLEY, T. (Eds.). Computer-assisted medical decision making - vol. 2. New York, NY: Springer-Verlag, 1986.

PARSAYE, K. et al. Intelligent databases. Toronto, John Wiley & Sons, 1989.

REATEGUI, E. B. and LEÃO, B. F. Integrating neural networks with the formalism of frames. In Proceedings of the First World Congress on Neural Networks '93. Portland, Oregon, 1993.

SYCARA, K. Using case-based reasoning for plan adaptation and repair. In Proceedings of the Case-Based Reasoning Workshop '88. Clearwater Beach, Florida, 1988.

WRIGHT, G. and AYTON, P. Eliciting and modelling expert knowledge. Decision Support Systems 3: 13-26. Elsevier Science Publishers, North-Holland, 1987.
