Integrating Inductive and Case-Based Technologies for Classification and Diagnostic Reasoning Klaus-Dieter Althoff
Michel Manago
Ralph Traphöner
Ralph Bergmann
Eric Auriol
Martin Bräuer
Frank Maurer
Noël Conruyt
Stefan Dittrich
Stefan Wess
AcknoSoft
tecInno GmbH
P.O. Box 3049
58a rue du Dessous des
Sauerwiesen 2
University of Kaiserslautern
Berges
D-67661 Kaiserslautern
[email protected]
75013 Paris, France
Germany
Abstract In this paper, we present two techniques for reasoning from cases: induction, and case-based reasoning. We compare and contrast the two technologies (that are often confused) and show how they complement one-another. We then describe how they can be integrated in a single platform for reasoning from cases: the INRECA system.
1
Introduction
Currently, the acquisition of knowledge and the construction of a knowledge-based system from this knowledge is performed largely by hand. To improve the process of knowledge acquisition to overcome this well-known bottleneck of expert system development, various activities are ongoing from both the knowledge engineering as well as the machine learning community. This is partially reflected by a number of scientific workshops on topics like multistrategy learning, computational architectures for supporting machine learning and knowledge acquisition, integrated learning architectures, mixed-paradigm reasoning among others. One central issue here is the integration of different knowledge representation and processing schemes. Focusing on diagnostic problems, we will describe two different approaches to solve such problems using machine learning techniques, namely induction and case-based reasoning, respectively. Since it is the current state of the art that there are no satisfying answers available how these two approaches to acquire diagnostic competence differ in principle from a more theoretical, machine learning-oriented standpoint (cf., e.g., Shavlik & Dietterich, 1990), we will show that induction and case-based reasoning are complementary techniques from a knowledge acquisition technology perspective. We will describe one concrete inductive approach to diagnostic problem
solving as well as a case-based one. Therefore, we will shortly introduce exemplary systems developed by the authors which are the starting point for the integration of both technologies. We will then show how the two approaches are complementary and will benefit from each while being integrated within one computational architecture, the INRECA system. Approaches to diagnostic problem solving have to meet certain requirements in order to be successful. We now introduce requirements which represent partially the authors´ past experiences from developing solutions for diagnostic problems. We concentrate on aspects which are of major importance for the integration of induction and case-based reasoning for diagnostic problem solving. In a diagnostic problem, some complete fault situations have occured but are only partially known, i.e. one is confronted with some incomplete situation. The task is to determine the diagnosis of the unknown complete situation (at least with some certainty). At first glance this seems to be a pure classification problem. With equal right one can say, however, that the real problem is to find an optimal way to complete incomplete situations sufficiently enough such that a diagnosis with a high degree of certainty can be established. This task has been attacked less successfully in the literature. Thus, the task of diagnostic reasoning can be decomposed into one classification subtask (associating given symptom/attribute values with a diagnosis) and one test selection subtask (associating given symptom/attribute values with another symptom to test). Acquiring diagnostic competence then means acquiring classification and test selection competence, respectively. An approach to diagnostic reasoning should be able to deal with incomplete information. The knowledge base should be easily maintainable, e.g. be able to deal with exceptional cases and/or be easily adaptable to modifications of the underlying system (e.g. an engineering system). Incremental interactive development of the knowledge base should be possible as well as automated knowledge acquisition to a high degree (a more detailed discussion of requirements can be found in Traphöner, Manago et al., 1992; Althoff, 1992a+b). After the introduction of an exemplary inductive and an exemplary case-based reasoning system, we will focus on the handling of unknown (missing) values during consultation. This appears to be one key issue for the integration of these different technologies. Then an integrated learning architecture which includes inductive and case-based reasoning is presented. We also dis cuss our approach from several scientific points of view.
2
The Inductive Approach
Induction is a technology that automatically extracts knowledge from training examples (reference cases). It derives general knowledge from the cases: from an extensional description of concepts (i.e. the examples), it derives an intensional description of these concepts in the form of a
decision tree, a set of most general rules (most general version of the concepts), or a characteristic description of the examples (most specific version of the concepts) (Manago, 1987). This general knowledge is then used to solve new problems. We now briefly introduce KATE as an exemplary inductive reasoning system for diagnostic problem solving. KATE automatically builds a decision tree from a data base of training cases. KATE uses the same search strategy (hill-climbing) and preference criteria (information gain that is based on Shannon's entropy) as the ID3 system (Quinlan, 1983). At each node in the decision, it evaluates information gain for all the attributes which are relevant and picks the one which yields the highest increase of the information gain measure. However, unlike ID3, KATE can handle complex data represented by structured objects, relations and background knowledge about the domain (Manago & Kodratoff, 1991). At each node, KATE does some additional work to compute the set of attributes of objects that are relevant in the current context. For example, as soon as it is known that an object that appears in the tree is an instance of class "males", it will stop considering attributes such as "state of pregnancy" that are clearly irrelevant in the current context. ID3, on the other hand, only uses data represented by a flat table of attributes and performs unnecessary computations such as calculating the information gain of an attribute like "state of pregnancy" when it already knows that the value of attribute "sex of patient" is equal to male. KATE uses simple background domain knowledge encoded in the objects' definitions in order to constrain the search space. It can also handle relations between objects, hierarchies of objects (with multiple inheritance capabilities), hierarchies of values and can tackle applications that could not be handled by earlier versions of ID3. Since our object-based representation is a particular form of predicate calculus (Nilsson, 1980), the extension to ID3 we have made allows us to extend the representational capabilities of the system from propositional calculus to first order logic. However, our normalized object representation (binary predicates) and the use of background knowledge encoded in objects allows us to constrain the search space and preserves the efficiency of the algorithm (Manago, 1989). We have not yet made any extensive comparisons between KATE and other extended versions of ID3 that can handle complex relational data such as C4 (Quinlan, 1991) although we have discussed this with Ross Quinlan on a few occasions by comparing, for example, results obtained by KATE and C4 on the Michalski's trains toy application (Medin et al., 1987).
3
The Case-Based Reasoning Approach
Case-based reasoning is a technology that allows to find analogies between a current working case and past experiences (reference cases). Case-based reasoning makes direct use of past experiences to solve a new problem by recognizing its similarity with a specific known problem and by, at least partially, applying the known solution to reach a solution for the actual new problem. We now
briefly introduce PATDEX as an exemplary case-based reasoning system for diagnostic problem solving. PATDEX realizes a learning-apprentice-like case-based diagnosis system. It consists of two case-based reasoning subcomponents for classification and test selection, respectively. Both components rely on a case-based reasoning mechanism that bases on the similarity of the underlying cases. The similarity is interpreted by using an evaluation function (similarity measures). A procedure that dynamically partitions the case base enables an efficient computation and updating of the similarity. In case of the classification subcomponent, the applied similarity measures are completely dynamical. The underlying evaluation function can be adapted to observed expert behavior using a connectionist learning technique (competitive learning). For the test selection component, the adaptation of similarity measures is based on an estimation of the average costs for ascertaining symptoms using an A*-like procedure. PATDEX can deal with redundant, incomplete, and incorrect cases and includes the processing of uncertain knowledge through default values for symp toms. PATDEX has been described in detail and discussed from several (other) points of view in Althoff and Wess (1991), Richter and Wess (1991), and Wess (1991).
4
Handling missing values with KATE
KATE presents some limitations for building an identification system that can handle missing values during consultation. Consider the following data base of cases: CASE CLASS Ex1 PARADISCONEMA Ex2 COSCINONEMA Ex3 CORYNONEMA ... ...
SHAPE(BODY) ELLIPSOID CONICAL ELLIPSOID ...
TEETH-TIP(MACRAMPHIDISQUES) LARGE LANCET-SHAPE LANCET-SHAPE ...
... ... ... ... ...
Table 1 - A data base of cases for an application which identifies marine sponges This example has been extracted from an actual application which identifies marine sponges developed by the Museum of Natural History in Paris using KATE. KATE works in two stages: it first learns the decision tree, such as the one shown below, and then uses that tree to identify the class of a new incoming sponge whose class is unknown. Consider what happens if the user does not know the answer to the first question that is asked during interactive consultation (i.e. what is the value of teeth-tip(macramphidisques)?):
teeth-tip(macramphidisques)= ??? lancet-shape
large
shape(body) = conical ellipsoid corynonema: ex3
paradisconema: ex1 conical coscinonema : ex2
Figure 1: A consultation of the decision tree learned by KATE When the user answers "unknown" to this question, KATE proceeds by following both branches "lancet-shape" and "large" and combines the conclusions found at the leaf nodes. In the "large" branch, KATE reaches the "Paradisconema" leaf node. In the "lancet-shape" branch, it reaches a test node and the user is queried for the value of the "shape" of the object "body". He answers "conical". The system follows that branch and reaches the "Coscinonema" leaf node. It then combines the two leaf nodes and concludes with some uncertainty that the current case is a "Paradisconema" with a probability of 0.5 or a "Coscinonema" with a probability of 0.5. However, if we consider the case at the "Paradisconema" leaf node (ex1), we note that its characteristic "shape(body)" has the value "ellipsoid" unlike the current case where it is "conical". Therefore, the current case is closer to ex2 than to ex1 and the correct conclusion ought to have been "Coscinonema" with a probability of 1. Unfortunately, the information about the body shape of ex1 was generalized away during the induction phase and this information is no longer available during consultation. Note that there are other methods for handling an unknown value during consultation such as assigning a probability to each branch (Quinlan, 1989) but that this does not change the problem presented above. This problem is not caused by a flaw of the particular induction algorithm used in KATE (we could have used another induction algorithm and encounter a similar problem) nor is it a flaw of the decision tree formalism (we could have used production rules generated automatically or even manually and still encounter a similar problem). This particular problem is due to the fact that we are reasoning using some abstract knowledge and have lost some useful information which was originally contained in the training cases. It is, therefore, a flaw of the knowledge-based system approach to problem solving (reasoning about a problem using abstract knowledge). We argue that in order to provide a general solution to this problem, we must adopt a case-based reasoning approach instead of a knowledge-based system approach. One might object that an "intelligent" induction algorithm could know which information will never become of crucial importance and can therefore be generalized away. One might also object that if we had the same configuration of unknown values in the training cases (for instance, a fourth case with unknown "teeth-tip(amphidisques)"), the behavior of the resulting consultation system would be correct. We argue that in some application it is impossible to know in a fixed way
which information will be useful, and that it is impossible to produce an exhaustive data base of cases with all configurations of unknown values. Consider an application where we assist the user in identifying an object on a photo. We are dealing with a three-dimensional object from which we can only see one part (e.g., the right side and not the left side). Moreover, parts of the object may be hidden by other objects which are on the first plan or can simply be missing (for example, the deep sea diver might bring back a partial sample of a sponge whose pedoncula is still at the bottom of the ocean). What would be the size of the data base of training cases if we wanted to enter all the configurations of unknown values which could possibly occur? We were faced to this problem in practice. In an application we had to deal with, an object is described by 58 characters. Since the characters of the object can be hidden/missing in any combinations, there are 258 (˜ 2,8 * 1017) combinations of unknown values (i.e. a character is known versus it is unknown) per class which is a very large number. In order to provide an exhaustive data base, we would need to collect at least that many examples for each case to identify. This is clearly not possible. Instead, we prefer to enter prototypic reference cases (reconstructed from several photos) and index them dynamically in order to retrieve the ones which best match the current picture. Since this indexing cannot be done in a fixed way (for example by constructing a static decision tree), the identification system must react dynamically to the information available from the user and find analogies with the reference cases. Thus, we need casebased reasoning.
5
Handling missing values with PATDEX
PATDEX has been tested within the area of fault diagnosis of tool machines. For an experimental evaluation of the system, we have been interested in measuring its ability to reach a correct solution, though the presented information (working case) has been incomplete (i.e. contains unknown values). Splitting cases into disjoint training and test sets and then measuring the classification accuracy is not the most important aspect. Since PATDEX is a precedent-based (or casematching) case-based reasoner, it can only find the correct diagnosis if there is at least one case in its case base which includes it. Thus, it makes no sense to check classification accuracy against such test cases (this point of view is, e.g., conformable with that of Bareiss, 1989). Even for a huge number of cases such a test procedure would be of lower interest, because during the knowledge acquisition phase one important criterion is to have every possible diagnosis included in the knowledge/case base. In addition, PATDEX needs only a few cases per diagnosis (we call this the "density" of each diagnosis) to classify each input correctly which is completely and correctly specified. Reduced information (%)
0
10
20
30
40
50
60
70
80
90
100
Classification accuracy (%)
100
99
97
96
91
90
76
60
28
11
0
Table 2 - Measuring correctness against reduction of information Experiments have been conducted with a training set of one hundred cases from the area of "fault diagnosis in tool machines". The test set also consists of (the same) one hundred cases. For every test case the number of known symptom values has been stepwise reduced. Thus, classification accuracy is measured against reduction of the presented information. For this, we reduce the PATDEX system to an IB1-like algorithm (cf. Aha, Kibler & Albert, 1991) but use the PATDEX similarity measure. The result is shown in table 1. Here, reduced information of, e.g., 70% means that every case is classified based on 30% of its known symptom values (where 60% of such cases have been correctly classified).
INRECA Activation
Heuristic Classification
Case-Based Classification Decision Tree
Cases
Cases
Cases
Cases
Cases
Cases
Cases
Case Base
KATE™
S3-CASE
Figure 2: Subcomponents of INRECA As we can see, up to a certain limit, classification accuracy is not significantly decreased by reducing the number of known attribute values in the current case. This is due to the fact that when
a feature is unknown, a case-based reasoning tool such as PATDEX looks for other alternative features which allow to identify the current case. Unlike in the inductive approach, the cases are not indexed by using a static structure such as a decision tree but it can dynamically react and exp loit all the information available. As a comparison, in the marine sponge application presented in the previous section, when a single value used in the decision tree is unknown (which corresponds to a 0.5% reduction in the information available) the classification accuracy immediately drops by 50%. In addition, the system is more resilient to errors made by the user during consultation which is a particular form of noise (Manago & Kodratoff, 1987b). It computes a similarity measure which is based on the global description of the cases (and not a minimal subset like the inductive approach) and allows to confirm the results of the interactive consultation by asking additional questions. Thus, PATDEX not only has the ability to focus on a particular hypothesis by restraining more and more the set of cases considered (like by going through a decision tree), but it also has the ability to expand the current set of cases which are currently considered.
6
Integration
We now introduce the INRECA approach to the integration of inductive and case-based technologies for classification and diagnostic reasoning. The presented architecture already considers the integration of domain knowledge. Though we will not discuss this topic in the very detail here, we will present some basic ideas and point out relevant literature for this aspect. We also rise some interesting questions which still are to be answered.
Initialization Domain Knowledge Classification
Decision Tree yes
Output
Induction
CBR
Terminate ?
Case Base
Ascertain a symptom
no
Test Selection
Information Gain
Domain Knowledge
Case-Based Reasoning
Figure 3: Test selection and classification in INRECA 6.1
Basic Architecture
The INRECA system1 consists of a case data base, a fast retrieval mechanism for nearest neighbor search, a heuristic classification component based on an automatically generated decision tree, a case-based classification component as well as a component which controls the test selection and the classification process (cf. figure 22). From every consultation, a protocol is generated. If a finished consultation has been marked as successful by the user, the case is passed to the case base. To be memorized, the case has to pass an IB2-like filter mechanism (cf. Aha, Kibler & Albert, 1991; Manago, Althoff et al., 1992). Cases that have been added to the case base could be used by the case-based reasoning system directly afterwards. The default consultation mode of the INRECA system tries to classify presented symptom values based on its decision tree (cf. figure 3). By default, the test selection is driven by the information gain criterion. If the user cannot or does not want to answer certain questions, the case-based
1 2
This description does not necessarily reflect the official opinion of the whole INRECA consortium. Ongoing applications and deeper insights might change this. S3-Case is a product of tecInno GmbH based on PATDEX a.o.
reasoner is activated. Based on the known symptom values, it retrieves the actual most similar cases from the case base. If there is at least one case sufficiently similar (with respect to a prespecified diagnosis -dependent threshold), this case, i.e. its included diagnosis, is the result of the classification process. Otherwise, additional symptoms have to be ascertained. The test selection process is driven by the actual most similar case: it is tried to determine all symptoms which are included in the case description but for which values have not yet been specified by the user. Additional domain knowledge (costs, probabilities, causal relations a.o.: cf. Althoff, 1992a,b) is also used here. If there is no case which is similar enough, INRECA stops with the statement that it is not competent enough to classify the input data. 6.2
Improving the Knowledge/Case Base
In view of the cooperation of the decision tree and the case-based reasoner during consultation to improve the knowledge/case base, the decision tree controls the test selection process while the case-based reasoner does its case-based classification as a background job. The test selection is driven by the case-based reasoner for clarification purposes only, i.e. if contradictory classifications of the classification subcomponents are present and cannot be solved immediately. The following situations are of special interest: •
The decision tree classifies correctly and the case-based reasoner does not find a sufficiently similar case in its case base: The diagnosis can be made, because no contradiction has been observed.
•
Both the decision tree and the case-based reasoner classify correctly: The diagnosis can be made. There is the possibility to exclude the cases from the CBR case base, because the decision tree already covers them.
•
The case-based reasoner classifies correctly and the decision tree does not find any diagnosis: There is a gap in the decision tree. If the most similar case within the CBR case base is a temporary exception, i.e. it has been added after the decision tree has been generated, then a new decision tree could be generated and the case excluded from the CBR case base.
•
The decision tree and the case-based reasoner classify differently: Different conclusions have to be taken depending on the actual situation. It is possible that the case-based reasoner is overgeneralizing and the decision tree is classifying correctly, or that the case-based reasoner is classifying in a more special way than the decision tree and an exception can be identified (cf., e.g., Althoff & Wess, 1991a; Golding & Rosenbloom, 1993). It is necessary to prove the case chosen by the case-based reasoner. If the case is proven to be sound, then the decision tree does not fit to the current situation. It has to be checked if the decision tree still makes sense taking a more general point of view. If it is not
possible to prove the selected case, its similarity has to be updated such that the respective diagnosis -dependent threshold of the case-based reasoner is not exceeded by the case´s similarity any more. •
Other situations, e.g. wrong classifications by the case-based reasoner or the decision tree: The correct case can be directly given as input. Afterwards the case-based reasoner can correctly classify the identical situation. If necessary, wrong parts of the decision tree have to be corrected (e.g., by the generation of a new tree).
6.3
Extensions
There are some extended features of the INRECA system which further improve the overall classification process. E.g., the decision tree could be used to detect incorrect cases. In addition, some kinds of misclassifications of the case-based reasoner could be repaired automatically by the use of a competitive learning strategy. Therefore, every attribute has a diagnosis -dependent weight (relevance) which are all together collected within a (relevance) matrix. If the similarity of a case exceeds its threshold, i.e. the case is sufficiently similar to classify the input data, but, actually the case includes the wrong diagnosis, then the competitive learning strategy computes an update of the relevances such that case´s similarity does not exceed its threshold any more. During training this strategy is applied to the whole CBR case base. Additionally, it is used if the user rejects a classification of the case-based reasoner (see above). INRECA can use domain knowledge to improve its diagnostic reasoning capabilities. E.g., causal and empirical (i.e. total and partial) determination rules as well as default values can be used to derive new symptom values from already known ones. This shortens the test selection process and improves the classification process. If uncertain symptom values have been used, the user gets a note on that. Thus, he has the possibility to ascertain such values if he thinks that it is necessary. Empirical determination rules could be generated by an inductive learning process which also considers already available domain knowledge (cf. Althoff, 1992a). Causal determination rules can be, e.g., generated from a deep functional model of the engineering system which is to be diagnosed (cf. Rehbold, 1991; Althoff, Maurer & Rehbold, 1990). Such a deep functional model can also be used to detect incomplete, redundant and/or incorrect cases. It also allows to use the casebased reasoner for "problem solving case-based reasoning", i.e. to adapt the underlying solution (a diagnosis) of a case to the specified input data (cf. Wess, 1993; Pews, Weiler & Wess, 1992). 6.4
Deep Integration
In view of a deeper integration of inductive and case-based reasoning we developed a retrieval mechanism which is based on k-d trees, a multi-dimensional binary search tree (Bentley, 1975; Friedman, Bentley & Finkel, 1977; Wess, 1993). This mechanism is built on top of an objectoriented data base (Öchsner & Wess, 1992). Retrieval with k-d trees could be seen in direct correspondence to the M AC part of Gentner and Forbus´ M AC/FAC model (Gentner & Forbus, 1991).
A2
A1 Figure 4: Bounds-test for nearest neighbor search Within the k-d tree an incremental best-match search is used to find the m most similar cases (nearest neighbors) within a set of n cases with k specified indexing attributes. The search is guided by application-dependent similarity measures based on user-defined value ranges. The used similarity measures are constructed according to Tversky´s contrast model (Tversky, 1977), but the user is free to define other ones. He is only restricted to use ordered value ranges and monotonic similarity functions. The k-d tree uses the inhomogeneity of the search space for density-based structuring. The balanced retrieval structure results in a small number of accesses to external media. Every node within the k-d tree represents a subset of the cases of the case base, the root node represents the whole case base. Every inner node partitions the represented case set into two dis joint subsets. The next discriminating attribute within the tree is selected based on the interquartil distance of the attributes´ value ranges (cf. Koopmans, 1987). Splitting in the median of the dis criminating attribute makes the k-d tree an optimal one (the tree is optimal if all leaf nodes are at adjoining levels). Search in the k-d tree is done via recursive tree search and the use of two test procedures: BALL-WITHIN-BOUNDS (BWB) and BOUNDS-OVERLAP-BALL (BOB) (cf. figure 4). These procedures check whether it would be reasonable to explore certain areas of the search space in more detail or not. Such tests can be carried out without retrieving the respective cases. The geometric bounds of the considered subspaces are used to compute a "similarity interval" whose upper bound then "answers" the question to explore or not.
The average case effort (measured by the number comparisons; cf. Jacquemain, 1988) for generating a k-d tree is O[k*n*log2n], for the worst case O[k*n2]. The average costs for retrieving the most similar case are O[log2n], if the tree is optimal organized. For the worst case the retrieval costs are O[n]. The retrieval mechanism is correct in that sense that it always finds the m most similar cases. The costs for the reorganization of the k-d tree (making the tree an optimal one again) are O[l*log2l], where l is the number of leaf nodes belonging to the non-balanced subtree, i.e. the costs to rebuild the whole tree are O[n*log2n]. 6.5
Open Issues
The above described retrieval mechanism is already a compromise between case-based and inductive technologies. It is a fixed, automatically generated indexing structure for the cases, which is comparable to the use of the decision tree during consultation (see above, section 6.1). This enables a fast retrieval of cases. Otherwise, the handling of unknown/missing values, the adaptation of the similarity measures, and the use of domain knowledge can be integrated into this retrieval mechanism (cf. Althoff, Maurer & Wess, 1993). Up to now, the k-d tree is built from a set of cases, global diagnosis -dependent similarity measures, and local value range dependent similarity measures. Inductive learning might be useful for improving the building of the tree, e.g. for the selection of the discriminating attributes or for improving the above mentioned test procedures (BOB & BWB). The use of domain knowledge might also be useful for the building of as well as searching within the k-d tree. Open questions are whether it should be preferred to directly enhance the basic search algorithm or to build something on top of the k-d tree retrieval mechanism (may be comparable to the FAC phase, see above) and how to integrate decision trees, k-d trees and, e.g., domain dependent fault trees.
7
Discussion
Case-based reasoning represents a specific method for solving a certain class of problems, especially for the treatment of inhomogeneous solution spaces. Within such solution spaces, cases correspond to homogeneous (i.e. “small” changes of the problem descriptions result in “small” changes of the solutions/justifications) subspaces. An overview on case-based reasoning can be found in Kolodner (1988), Hammond (1989), Riesbeck and Schank (1989), Bareiss (1991), Althoff and Wess (1991), Althoff, Wess et al. (1992), Wess, Althoff et al. (1992), or Kolodner (1993). Case-based reasoning is a well-suited approach if cases are an important knowledge source within the underlying domain, and the available experts reason from cases (even a formal dis cipline as mathematics uses case-based reasoning, e.g., to find a certain proof: Owen, 1990). In addition,
many domains are “case-based” in their overall structure, e.g., law, medicine, and economy. The keyword here is "help desk applications" or "case-based decision aiding" (cf. Simoudis & Miller, 1991; Kolodner, 1991; Slade, 1991). Commercial systems like CBR-EXPRESS and REMIND (cf., e.g., Schult, 1992) aim at that kind of applications. Both inductive learning and case-based learning have in common that they derive "global" knowledge from "local" observations (which, of course, are uncertain, respectively). However, they use different techniques to achieve this: inductive learning bases mainly on logical concept descriptions ("logical reasoning"), whereas case-based reasoners often use analytic descriptions ("geometric reasoning") (cf., e.g., Richter, 1992). One consequence from this is that inductive learners mostly start with the "dropping of complete dimensions" in contrast to case-based reasoners like PATDEX which "decompose complete dimensions into intervals". It depends on the use of a learning result which particular technique is then the more successful one. Therefore, the INRECA approach integrates both learning strategies within a broader architecture for identification and diagnostic reasoning. Up to now, much work has been done on the integration of different knowledge representation and processing schemes to improve knowledge acquisition. E.g., a comparative analysis as well as a proposed integration of models, cases and compiled knowledge have been given by van Someren, Zheng and Post (1990). The M OLTKE architecture also bases on these three schemes (cf. Althoff, Maurer & Rehbold, 1990; Althoff, Maurer et al., 1992; Althoff, 1992a). The GRANUL system integrates several existing knowledge acquisition tools into one coherent system that supports several styles of knowledge acquisition (Aben & van Someren, 1992). The M OBAL system is an interesting example for the integration of manual and automatic knowledge acquisition methods (the balanced cooperative modeling issue, cf. Morik, 1991). Van de Velde and Aamodt (1992) have analyzed the possible use of machine learning techniques within the KADS approach to expert system development. Rissland and Skalag (1989) introduced the notion of mixed paradigm reasoning for the integration of different reasoning reasoning schemes (reasoning from cases, rules, constraints, deep models etc.). Examples here are CABARET (Rissland, Basu et al., 1991), CREEK (Aamodt, 1991), PATDEX /M OLTKE (Althoff & Wess, 1991), GREBE (Branting & Porter, 1991), and JULIA (Hinrichs & Kolodner, 1991) among others. A first suggestion for the integration of case-based reasoning and model-based knowledge acquisition is given in Janetzko and Strube (1992). Tecuci (1991), e.g., describes an integration of different learning strategies for improving domain modelling and knowledge acquisition.
8
Conclusion
We have introduced the architecture of the INRECA system which uses induction and casebased reasoning for classification and diagnostic reasoning. INRECA is being applied to real world problems in the areas of technical maintenance as well as the pharmaceutical industry. Results from
this applications might change the suggested architecture. In addition, empirical evaluations based on smaller test applications are ongoing which also could result in a new architecture. We also think about a generic architecture which will be adaptable to different domain classes (for identification and diagnostic reasoning tasks).
9
Acknowledgement
Funding for this research has been provided by the Commission of the European Communities (ESPRIT contract P6322, the INRECA project). The partners of INRECA are AcknoSoft (prime contractor, France), tecInno (Germany), Irish Medical Systems (Ireland), the University of Kaiserslautern (Germany). KATE is a trademark of Michel Manago, KATIOUCHA is a trademark of AcknoSoft. S3-CASE is a product of tecInno GmbH. We would like to thank Prof. Claude Lévi and Mr. Jacques Le Renard at the Museum of Natural History in Paris for providing the sample application we have used to illustrate some of the ideas presented in this paper.
10
References
Aamodt, A. (1991). A Knowledge-Intensive, Integrated Approach to Problem Solving and Sustained Learning. Doctoral Dissertation, University of Trondheim Aben, M., van Someren, M. W. & Terpstra, P. (1992). Functional and Representational Integration in Knowledge Acquisition. Proc. International Machine Learning Conference, Workshop on "Computational Architectures for Supporting Machine Learning and Knowledge Acquisition" in Aberdeen Aha, D. W., Kibler, D. & Albert, M. K. (1991). Instance-Based Learning Algorithms. Machine Learning, 6, 37-66 Althoff, K.-D. (1992a). A Case-Based Learning Component as an Integrated Subpart of the M OLTKE Workbench for Fault Diagnosis in Engineering Systems (in German: Eine fallbasierte Lernkomponente als integrierter Bestandteil der M OLTKE-Werkbank zur Diagnose technischer Systeme). Doctoral Dissertation, University of Kaiserslautern; also: St. Augustin (Germany): DISKI 23, infix Verlag Althoff, K.-D. (1992b). Machine Learning and Knowledge Acquisition in a Computational Architecture for Fault Diagnosis in Engineering Systems. Proc. International Machine Learning Conference, Workshop on "Computational Architectures for Supporting Machine Learning and Knowledge Acquisition" in Aberdeen (edited by M. Weintraub) Althoff, K.-D., Maurer, F. & Rehbold R. (1990). Multiple Knowledge Acquisition Strategies in M OLTKE. In: B. J. Wielinga, J. Boose, B. Gaines et al. (eds.), Current Trends in Knowledge Acquisition (Proc. EKAW-90). Amsterdam: IOS Press, 21-40 Althoff, K.-D., Maurer, F. & Wess, S. (1993). Retrieval of Cases in INRECA. Technical Report, University of Kaiserslautern (forthcoming) Althoff, K.-D., Maurer, F., Wess, S. & Traphöner, R. (1992). M OLTKE - An Integrated Workbench for Fault Diagnosis in Engineering Systems. Proc. EXPERSYS -92, Paris
Althoff, K.-D. & Wess, S. (1991a). Case-Based Knowledge Acquisition, Learning, and Problem Solving in Diagnostic Real World Tasks. Proc. EKAW-91, Glasgow & Crieff Althoff, K.-D. & Wess, S. (1991b). Case-Based Reasoning and Expert System Development. In: F. Schmalhofer, G. Strube & T. Wetter (eds.), Contemporary Knowledge Engineering and Cognition, Springer Verlag Althoff, K.-D., Wess, S., Bartsch-Spörl, B. & Janetzko, D. (eds.) (1992). Similarity of Cases in CaseBased Reasoning (in German: Ähnlichkeit von Fällen beim fallbasierten Schließen). Proc. of the 1st Meeting of the German Working Group on Case-Based Reasoning, SEKI W ORKING PAPER SWP -92-11, University of Kaiserslautern Althoff, K.-D., Wess, S., Bartsch-Spörl, B. & Janetzko, D., Maurer, F. & Voß, A. (1992). Case-Based Reasoning in Expert Systems: Which Role Play Cases for Knowledge-Based Systems ? (in German: Fallbasiertes Schließen in Expertensystemen: Welche Rolle spielen Fälle für wissensbasierte Systeme ?). KI (4)92, 14-21 Bentley, J. L. (1975). Multidimensional Search Trees Used for Associative Searching. Communications of the ACM 18, 509-517 Bareiss, R. (1989). Exemplar-Based Knowledge Acquisition. London: Academic Press Bareiss, R. (ed.) (1991). Proceedings 3rd DARPA Case-Based Reasoning Workshop, San Mateo, California, Morgan Kaufmann Barletta, R. & Mark, W. (1988). Explanation-Based Indexing of Cases. Proc. of the DARPA Workshop on Case-Based Reasoning, Clearwater, FL Bergadano, F., Kodratoff, Y. & Morik, K. (1992). Machine Learning and Knowledge Acquisition. AI Communications, Vol. 5, No. 1, 19-24 Branting, L. K. & Porter, B. W. (1991). Rules and Precedents as Complementary Warrants. Proc A AAI-91, 3-9 De la Ossa, A. (1992). Knowledge Adaptation: Analogical Transfer of Empirical Knowledge Across Domains for Case-Based Problem Solving. Doctoral Dissertation, University of Kaiserslautern Friedman, J. H., Bentley, J. L. & Finkel, R. A. (1977). An Algorithm for Finding Best Matches in Logarithmic Expected Time. ACM Trans. math. Software 3, 209-226 Gentner, D. & Forbus, K. D. (1991). M AC/FAC: A Model of Similarity-Based Retrieval. Proc. of the 13th Annual Conference of the Cognitive Science Society, 504-509 Golding, A. R. & Rosenbloom, P. S. (1993). Improving Rule-Based Systems Through Case-Based Reasoning. In: B. G. Buchanan & D. C. Wilkins (eds.), Readings in Knowledge Acquisition and Learning, 759-764, Morgan Kaufmann Hammond, K. (ed.) (1989). Proc. 2nd DARPA Workshop on Case-Based Reasoning. Holliday Inn, Pensacola Beach: Morgan Kaufmann Hinrichs, T. R. & Kolodner, J. L. (1991). The Roles of Adaptation in Case-Based Design. In: R. Bareiss (ed.), Proc. 3rd DARPA Workshop on Case-Based Reasoning, Morgan Kaufmann, 121-132
Jacquemain, K. J, (1988). Efficient Data Structures and Algorithms for Multidimensional Search Problems (in German: Effiziente Datenstrukturen und Algorithmen für mehrdimensionale Suchprobleme). Hochschultexte Informatik (Bd. 5), Heidelberg: Hüthig Verlag Janetzko, D. & Strube, G. (1992). Case-based Reasoning and Model-based Knowledge Acquisition. In: F. Schmalhofer, G. Strube & Th. Wetter (eds.), Contemporary Knowledge Engineering and Cognition, Springer Verlag Kolodner, J. L. (ed.) (1988). Proc. of a DARPA Workshop on Case-Based Reasoning. Holliday Inn, Clearwater Beach: Morgan Kaufmann Kolodner, J. L. (1991). Improving Human Decision Making Through Case-Based Decision Aiding. AI Magazine, 91(2), 52-68 Kolodner, J. L. (ed.) (1993). Special Issue on Case-Based Reasoning. Machine Learning (forthcoming) Koopmans, L. H. (1987). Introduction to Contemporary Statistical Methods. Second Edition, Duxbury Press, Boston Levinson, R. A. (1990). Pattern Associativity and the Retrieval of Semantic Networks. Technical Report 90-30, Baskin Center of Computer Science, University of California, Santa Cruz Lebowitz, M. (1987). Experiments with Incremental Concept Formation: UNIMEM . Machine Learning, Vol. 2, No.2, 103-138 Manago, M. & Kodratoff, Y. (1987a). Model Driven Learning of Disjunctive Concepts. Progress in Machine Learning (Proc. of 2nd European Workshop on Machine Learning), Bradko & Lavrac eds, Sigma Press (distributed by John Wiley & sons) Manago, M. & Kodratoff, Y. (1987b). Noise and Knowledge Acquisition. Proc. of the 10th International Joint Conference on Artificial Intelligence, Morgan Kaufmann Manago, M. (1989). Knowledge Intensive Induction. Proc. of the 6th International Machine Learning Workshop, Morgan Kaufmann Manago, M., Althoff, K.-D., Conruyt, N., Maurer, F., Traphöner, R. & Wess, S. (1992). Induction and Reasoning from Cases. Submitted (also Technical Report, University of Kaiserslautern) Manago, M. & Kodratoff, Y. (1990). KATE: A Piece of Computer Aided Knowledge Engineering. Proc. of the 5th AAAI workshop on Knowledge Acquisition for Knowledge-Based Systems, Gaines B. R. and Boose J. eds., Banff (Canada), A AAI Press Manago, M. & Kodratoff, Y. (1991). Induction of Decision Trees from Complex Structured Data. Knowledge Discovery in Databases, Piatesky-Shapiro G. and Frawley, W., A AAI Press (distributed by the M IT Press) Manago, M., Conruyt, N. & Le Renard, J. (1992). Acquiring Descriptive Knowledge for Classification and Identification. In: Th. Wetter, K.-D. Althoff, J. Boose, B. Gaines, M. Linster & F. Schmalhofer (eds.), Current Developments in Knowledge Acquisition EKAW ´92, Springer Verlag Medin, D. L., Wattenmaker, D. & Michalski, R.S. (1987). Constraints and Preferences in Inductive Learning : An experimental Study of Human and Machine Performance. Cognitive Science, Vol 11-3, pp. 299-349. Norwood: Ablex Publisher.
Morik, K. (1991). Balanced Cooperative Modeling Using M OBAL - An Introduction. Technical Report (GMD-F3-Nachrichten AC Special Nr. 3), GM D, St. Augustin Nilsson, N. (1980). Principles of Artificial Intelligence, Chapter 9. San Matteo, CA: Morgan Kaufmann. Öchsner, H. & Wess, S. (1992). Similarity-Based Retrieval of Cases by Associative Search in a Multidimensional Search Space (in German: Ähnlichkeitsbasiertes Retrieval von Fällen durch assoziative Suche in einem mehrdimensionalen Datenraum). In: Althoff, Wess et al. (1992), 101-106 Owen, S. (1990). Analogy for Automated Reasoning. Academic Press Pews, G., Weiler, F. & Wess, S. (1992). Computation of Similarity for Case-Based Diagnosis Based on Simulation of Engineering Systems (in German: Bestimmung der Ähnlichkeit in der fallbasierten Diagnose mit simulationsfähigen Maschinenmodellen). In: Althoff, Wess et al. (1992), 47-50 Quinlan, R. (1983) Learning efficient classification procedures and their application to chess end games. In R. S. Michalski, J. G. Carbonell & T. M. Mitchell (Eds), Machine Learning: An Artificial Intelligence Approach (Vol. 1). San Matteo, CA: Morgan Kaufmann. Quinlan, J. R., (1987). Generating Production Rules from Decision Trees, Proceedings of the Tenth. International Joint Conference on Artificial Intelligence (pp. 304-307). Morgan Kaufmann. Quinlan, J. R. (1989). Unknown Attribute Values in Induction. Proceedings. of the Sixth International Workshop on Machine Learning (pp. 164-168). Ithaca, NY: MorganKaufmann. Rehbold, R. (1991). Integration of Model-Based Knowledge into Diagnostic Expert Systems for Engineering Systems (in German: Integration modellbasierten Wissens in technische Diagnostik-Expertensysteme). Doctoral Dissertation, University of Kaiserslautern Richter, M. M. (1992). Classification and Learning of Similarity Measures. Proc. of the 16th Annual Meeting of the German Society for Classification, Springer Verlag Richter, M. M. & Wess, S. (1991). Similarity, Uncertainty and Case-Based Reasoning in PATDEX . Automated Reasoning - Essays in Honor of Woody Bledsoe, Kluwer Academic Publishers Riesbeck, C. K. & Schank, R. C. (1989). Inside Case-Based Reasoning. Hillsdale, NJ: Lawrence Erlbaum Associates Rissland, E. L., Basu, C., Daniels, J. L., McCarthy, J., Rubinstein, Z. B. & Skalag, D. B. (1991). A Blackboard-Based Architecture for Case-Based Reasoning: An Initial Report. In: R. Bareiss (ed.), Proc. 3rd DARPA Workshop on Case-Based Reasoning, Morgan Kaufmann, 77-92 Rissland, E. L. & Skalag , D. B. (1989). Combining Case-Based and Rule-Based Reasoning: A Heuristic Approach. Proc IJCAI-89, 524-530 Schult, Th. J. (1992). Tools for Case-Based Systems (in German: Werkzeuge für fallbasierte Systeme). KI (4)92, 52-53
Shavlik, J. D. & Dietterich, T. G. (eds.) (1990). Readings in Machine Learning. San Mateo: Morgan Kaufmann Simoudis, E. & Miller, J. S. (1991). The Application of Case-Based Reasoning to Help Desk Applications. In: Bareiss (1991), 25-36 Slade, S. (1991). Case-Based Reasoning: A Research Paradigm. AI Magazine, 91(1), 42-55 Tecuci, G. (1991). A multistrategy learning approach to domain modelling and knowledge acquisition. In: Y. Kodratoff (ed.), Proceedings of the 5th European Working Session on Learning, 14-32, Springer Verlag Traphöner, R., Manago, M. et al. (1992). Industrial Criteria for Comparing Induction and CaseBased Reasoning. Technical Report, ESPRIT P6322 (INRECA-D4) Tversky, A. (1977). Features of Similarity. Psychological Review, 84, 327-352 Utgoff, P. (1988). ID5 : An Incremental ID3. In Proceedings of the fifth International Conference on Machine Learning. Irvine, CA: Morgan Kaufmann. Van de Velde, W. & Aamodt, A. (1992). Machine Learning Issues in CommonKADS. ESPRIT Project P5248, Technical Report KADS-II/TII.4.3/TR/VUB/002/3.0 van Someren, M. W., Zheng, L. L. & Post, W. (1990). Cases, Models or Compiled Knowledge: a Comparative Analysis and Proposed Integration. In: B. J. Wielinga, J. Boose, B. Gaines et al. (eds.), Current Trends in Knowledge Acquisition (Proc. EKAW-90). Amsterdam: IOS Press Wess, S. (1991). PATDEX /2: an Adaptive, Case-Focussing Learning System for Technical Diagnosis (in German: PATDEX /2: ein System zum adaptiven, fallfokussierenden Lernen in technischen Diagnosesituationen). S EKI Working Paper SWP-91-01 Wess, S. (1992). PATDEX - Knowledge-based, Incremental Improvement of Similarity Measurement in Case-Based Diagnostic Problem Solving (in German: PATDEX - ein Ansatz zur wissensbasierten und inkrementellen Verbesserung von Ähnlichkeitsbewertungen in der fallbasierten Diagnostik. Proc. of the 2nd German Conference on Expert Systems, Hamburg Wess, S. (1993). Case-Based Reasoning in Knowledge-Based Systems for Engineering Applications (in German: Fallbasiertes Schließen in wissensbasierten Systemen für technische Anwendungen). Doctoral Dissertation, University of Kaiserslautern (forthcoming) Wess, S., Althoff, K.-D., Maurer, F., Paulokat, J., Praeger, R. & Wendel, O. (eds.) (1992). CaseBased Reasoning - An Overview (in German: Fallbasiertes Schließen - Eine Übersicht). Vol. 1-3: Seki Working Paper SWP -92-08/SWP -92-09/SWP -92-10, 425 pages Wess, S., Janetzko, D. & Melis, E. (1992). Goal-Driven Similarity Assessment. S EKI-Report, University of Kaiserslautern