From Cases to Classes: Focusing on Abstraction in Case-Based Reasoning
Isabelle BICHINDARITZ
LIAP-5, UFR de Mathématiques et Informatique, 45 rue des Saints-Pères, 75006 Paris, FRANCE
tel: (+33) 1 44 55 35 63, fax: (+33) 1 44 55 35 36, email:
[email protected]
Introduction
Object-oriented methodology (OOM) and case-based reasoning methodology (CBR) have closely related roots in the 1970s: frames for OOM [7] and scripts for CBR [10]. Although these methodologies have evolved independently since then, it is interesting to study how they have developed, and to compare them. This paper therefore studies the main characteristics of these methodologies, presents a CBR system built following OOM, and describes the advantages gained. A key concept in this article is that of abstraction. It can be defined as the act of considering separately a representation element (whether an attribute, a relation or a behaviour), or a subset of the available representation elements, by focusing on it and ignoring the others. It is also the result of this process: a representation element, or a subset of representation elements, isolated by the mind.
1 Object-Oriented Methodology
OOM has been imported from artificial intelligence (AI) into software engineering. The purpose of OOM is to provide a methodology for building a computer program or system. It relies on Minsky's concept of frame (the ancestor of classes) [7] to represent knowledge. What has actually been transferred is the concept of class, and the central role played in modeling by the classification process (i.e. the organization of knowledge around classes, and the associated reasoning). A class represents a set of objects sharing both common properties and common behaviours. These objects are called instances of this class. The abstraction of these common representation elements into a class presents several advantages, the main ones being:
1. reusability: once classes have been defined, they can be reused later without going back to the details of their definition. They are building blocks that can be used to build more complex systems. The links between classes, which make it possible to build assemblies of classes, can be of various kinds, but the main link used in OOM is the generalization/specialization link. The inheritance mechanism describes how the properties and the behaviours of classes linked by a generalization/specialization link are related to one another. In essence, the specialized classes of a more general class share the properties and behaviours of this more general class. But specialized classes can also have specific description elements.
2. ease of evolution: the abstraction effort made in building the system concentrates both properties and behaviours, so that they need to be modified in only one place in the model or its implementation when an update becomes necessary. This makes the evolution of the system easier.
3. data encapsulation: abstraction also makes it possible, within a class, to distinguish the representation elements that are private to the class (its implementation details) from those that are accessible to other classes. This information hiding is called data encapsulation. One of its advantages is that the implementation details of a class can be modified without modifying any other class, because the access to the modified class has not changed.
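The three advantages above can be illustrated with a minimal Python sketch. The class names (Shape, Circle, Square) and their attributes are hypothetical, chosen only for illustration; they are not taken from any system discussed in this paper.

```python
import math


class Shape:
    """General class: holds what all shapes share (hypothetical example)."""

    def __init__(self, name):
        # Leading underscore: private by convention (data encapsulation).
        self._name = name

    def area(self):
        raise NotImplementedError("specialized classes define their own area")

    def describe(self):
        # Behaviour defined once here and inherited by every specialization,
        # so an update only has to be made in this one place.
        return f"{self._name} with area {self.area():.2f}"


class Circle(Shape):
    """Specialized class: inherits describe() and adds a specific property."""

    def __init__(self, radius):
        super().__init__("circle")
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2


class Square(Shape):
    """Another specialization with its own private representation."""

    def __init__(self, side):
        super().__init__("square")
        self._side = side

    def area(self):
        return self._side ** 2


descriptions = [shape.describe() for shape in (Circle(1.0), Square(2.0))]
```

Callers only use describe() and area(), so the private attributes of Circle or Square could be changed without touching any other class.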
Fig. 1 - The reasoning cycle of OMT. (Diagram: starting from a new problem and its input data, Analysis produces the object model, the dynamic model and the functional model; Conception produces the detailed object, dynamic and functional models; Implementation produces the computer system.)
This importance granted to the abstraction of classes is shared by all stages of software engineering, namely analysis, design and implementation. One of its consequences is the reusability of the classes designed and implemented. It is then possible to formulate a relation such as:
abstraction → reusability
where → means entails.
An example of an OOM methodology is OMT (Object Modeling Technique). With OMT, the realization of a computer system goes through several steps, starting with the definition of the problem to computerize (see figure 1):
1. analysis: the result of this step is a model of the application domain composed of three models. The first one is the object model, composed of the objects of the domain, linked by relations (static relations between the objects). The second model is the dynamic model, showing the different events happening between objects, and the states of each object between the events happening to it. It represents the dynamic relations between the objects. The third model is the functional model. It shows in detail how output values are calculated from input values.
2. design: the analysis model is transformed into a more detailed model dealing with computer objects and relations. These are derived from the application domain objects and relations of the analysis model. Finally, the dynamic and the functional models are merged into the object model by being transformed into behaviours (or methods) associated with the classes of the object model.
3. implementation: this model is then implemented as a program written in an object-oriented language, such as Smalltalk-80 (Visualworks) or C++. The substantial modeling effort makes this step straightforward.
2 Case-Based Reasoning Methodology
CBR is an AI methodology whose purpose is to design and implement a computer system that reasons from past experiences [5]. It is grounded in a cognitive model: Schank's model of reasoning, in which reasoning, memory and learning are closely linked [9]. A case is a set of empirical data. The general idea underlying CBR is that, in order to process a new case, it is preferable (more efficient, more exact, faster ...) to use one or several cases processed before. The classical CBR cycle, as presented in figure 2, which has been inspired by Aamodt [1] and Bichindaritz [2], follows five main steps: to abstract, to retrieve, to reuse, to revise and to retain.
2.1 Abstract
The initial case representation is abstracted into an abstract case representation, in which the relevant features are calculated. The abstracted case is then matched with the cases in memory. This matching process is directed by the task to perform. The result returned is a set of cases, which can be considered the most similar to the new case: $Cand = \{Case_i\}$. These cases are ranked by a similarity measure, such as

$$sim(Case_i, Case_l) = \frac{\beta \sum_{k=1}^{n} sim(El_{i,k}, El_{l,k}) + \sum_{k=1}^{n} n_{i,pert}\, n_{k_i,pred}\, sim(El_{i,k}, El_{l,k})}{\beta\, n + \sum_{k=1}^{n} n_{i,pert}\, n_{k_i,pred}}$$

where $\beta$ is a weight associated with the representation elements which are not important for the current process, and where $n_{i,pert}$ and $n_{k_i,pred}$ are weighting variables learnt by the system during the processing of previous cases. These variables are associated with the representation elements $El_{i,k}$ and $El_{l,k}$ of the cases.
Fig. 2 - The reasoning cycle of CBR. (Diagram: the input data form a new case; Abstract yields the abstract case, Retrieve the retrieved case, Reuse the solved case, Revise the repaired case and Retain the learnt case; the cycle is built around an Experimental Memory and a Theoretical Memory.)
2.2 Retrieve
The case most similar to the new case, according to a similarity assessment such as the preceding similarity measure, is selected to be reused.
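Ranking and retrieval can be sketched as follows in Python. This is only an illustration under stated assumptions: the measure is a plain normalized weighted average, not the exact measure used by the system described in this paper, and the local measure element_similarity, the default weight, the feature names and the example weights are all hypothetical.

```python
def element_similarity(a, b):
    """Similarity of two element values in [0, 1] (assumed local measure)."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        largest = max(abs(a), abs(b))
        if largest == 0:
            return 1.0
        return max(0.0, 1.0 - abs(a - b) / largest)
    return 1.0 if a == b else 0.0


def case_similarity(new_case, old_case, weights, default_weight=0.1):
    """Normalized weighted average of the element similarities.

    `weights` maps element names to learnt weights (assumed given);
    elements without a learnt weight receive `default_weight`.
    """
    shared = set(new_case) & set(old_case)
    total = sum(weights.get(name, default_weight) for name in shared)
    if total == 0:
        return 0.0
    score = sum(
        weights.get(name, default_weight)
        * element_similarity(new_case[name], old_case[name])
        for name in shared
    )
    return score / total


def retrieve(new_case, memory, weights):
    """Return the memorized cases ranked from most to least similar."""
    return sorted(
        memory,
        key=lambda stored: case_similarity(new_case, stored["features"], weights),
        reverse=True,
    )


# Hypothetical memory and new case.
memory = [
    {"name": "case A", "features": {"age": 30, "symptom": "fever"}},
    {"name": "case B", "features": {"age": 60, "symptom": "cough"}},
]
new_case = {"age": 32, "symptom": "fever"}
weights = {"symptom": 2.0, "age": 1.0}

best = retrieve(new_case, memory, weights)[0]
```

Here best is the stored case whose weighted similarity to new_case is highest; the other ranked cases remain available if the reuse step needs more than one.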
2.3 Reuse
This step is very dependent on the task performed. Examples of reuse are the adaptation of the most similar memorized case, or the building of an argumentation around a subset of the most similar memorized cases.
2.4 Revise
This step is concerned with the improvement of the reused case, after an evaluation of the results of its use.
2.5 Retain
Finally, the revised case is added to the memory of the system. So although abstraction is used in CBR, it is not always necessary; it is thus possible to formulate a relation such as:
reusability → abstraction
where → means entails.
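The five steps of the cycle can be chained as in the following Python sketch. It is a toy illustration, not the implementation of any system discussed here: the domain (estimating a cost from a surface), the feature names and the adaptation rule are all assumptions made for the example.

```python
def abstract(raw_case):
    """Abstract: compute the relevant features from the raw input data."""
    return {"size": raw_case["width"] * raw_case["height"],
            "kind": raw_case["kind"]}


def retrieve(abstract_case, memory):
    """Retrieve: select the memorized case closest to the abstract case."""
    def distance(stored):
        # Assumed rule: a different kind is heavily penalized.
        penalty = 0 if stored["features"]["kind"] == abstract_case["kind"] else 1000
        return abs(stored["features"]["size"] - abstract_case["size"]) + penalty
    return min(memory, key=distance)


def reuse(abstract_case, retrieved):
    """Reuse: adapt the retrieved solution (assumed proportional to size)."""
    ratio = abstract_case["size"] / retrieved["features"]["size"]
    return retrieved["solution"] * ratio


def revise(solution, feedback=None):
    """Revise: replace the solution by the evaluated outcome, if any."""
    return solution if feedback is None else feedback


def retain(abstract_case, solution, memory):
    """Retain: add the revised case to the memory."""
    memory.append({"features": abstract_case, "solution": solution})


def solve(raw_case, memory, feedback=None):
    """Run the whole cycle on one new case and return its solution."""
    abstract_case = abstract(raw_case)
    retrieved = retrieve(abstract_case, memory)
    solution = revise(reuse(abstract_case, retrieved), feedback)
    retain(abstract_case, solution, memory)
    return solution


# Hypothetical memory of two solved cases.
memory = [
    {"features": {"size": 10, "kind": "wall"}, "solution": 50.0},
    {"features": {"size": 40, "kind": "roof"}, "solution": 400.0},
]
result = solve({"width": 4, "height": 5, "kind": "wall"}, memory)
```

Note that only the first step abstracts anything; the rest of the cycle works directly on cases, which is the sense in which reusability comes first in CBR.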
3 Comparing OOM and CBR
In the first place, the aim of CBR is much more restricted than that of OOM. The scope of OOM is to build any kind of computerized system, not only the special kind of system that reasons from cases, as in CBR. And OOM can certainly be used to build a CBR system. Although CBR is, even more than OOM, focused on reusability, it is not the same kind of reusability. In OOM, reusability deals with classes, which are constructed by abstraction during modeling, whatever the software development stage. So reusability is a consequence of abstraction. In CBR, reusability deals with cases. The whole methodology is based on reusability, and its advantages, whatever they are, are consequences of reusability. And abstraction, when it is used in some CBR systems to abstract categories or concepts from cases, is secondary to reusability. The process of abstraction, which is expensive, is avoided as far as possible in CBR: it occurs mainly either in the abstraction step of the CBR reasoning (see figure 2), or, in some CBR systems, in the abstraction of the categories of cases that structure the case memory.

                                  OOM                         CBR
Objective                         build a computer system     build a system reasoning from cases
Based on                          Minsky's frames             Schank's scripts
Relation abstraction/reusability  abstraction → reusability   reusability → abstraction
What is reused                    Classes                     Cases
Learning                          Difficult (rigidity)        Easy

In the following section, an object-oriented CBR system is presented.
4 An Object-Oriented CBR System
MNAOMIA is a CBR system which has been built following the OMT methodology. The object-oriented model resulting from OMT is composed of an object model, a dynamic model and a functional model. These models are presented here for the analysis step, because they are much more detailed in the design step, and so not as easy to understand.
1. the object model: figure 3 shows the general object model. The memory of the system is composed of nodes of four types: cases, concepts, prototypes and models. Concepts and models are organized in hierarchies dependent upon the points of view, which are specialized models [3]. Cases are linked to concepts by instantiation links, while prototypes are linked to models by other instantiation links. Other parts of the object model, not represented in this figure, deal with other elements such as indices, indexes and relations.
Fig. 3 - The object-model of MNAOMIA. (Diagram labels: Memory, composed by, Node, Concept, Case, Prototype, Model, instance of, generalizes, organized by, Point of View, Trend, TimeIndependent Concept.)
2. the dynamic model: it shows the succession in time of the different reasoning steps presented with the CBR methodology. It globally represents the same sequence as in figure 2.
3. the functional model: examples of such a model represent the credit and blame assignment performed in the system, or the similarity measure.
After the analysis, the design and the implementation lead to a computer program in Smalltalk-80 (Visualworks). The substantial modeling effort makes the implementation step straightforward. The advantages gained are the possibility of reusing the classes defined, the ease of evolution of the system, and its independence from the hardware (it can run on several platforms, such as Unix, Windows and Mac).
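The object model described in item 1 above can be sketched in a few classes. The sketch is written in Python for brevity, whereas MNAOMIA itself is implemented in Smalltalk-80; the attribute names (parent, point_of_view, concept, model), the method names and the example node names are assumptions made for illustration and are not taken from the system.

```python
class Node:
    """Any element stored in the memory."""

    def __init__(self, name):
        self.name = name


class Concept(Node):
    """Concept, placed in a hierarchy that depends on a point of view."""

    def __init__(self, name, parent=None, point_of_view=None):
        super().__init__(name)
        self.parent = parent                # generalization link
        self.point_of_view = point_of_view  # hierarchy it belongs to


class Model(Node):
    """Model, also organized in a hierarchy."""

    def __init__(self, name, parent=None, point_of_view=None):
        super().__init__(name)
        self.parent = parent
        self.point_of_view = point_of_view


class PointOfView(Model):
    """A point of view is a specialized model."""


class Case(Node):
    """A case is linked to a concept by an instantiation link."""

    def __init__(self, name, concept):
        super().__init__(name)
        self.concept = concept


class Prototype(Node):
    """A prototype is linked to a model by an instantiation link."""

    def __init__(self, name, model):
        super().__init__(name)
        self.model = model


class Memory:
    """The memory is composed of nodes."""

    def __init__(self):
        self.nodes = []

    def add(self, node):
        self.nodes.append(node)
        return node

    def nodes_of_type(self, node_type):
        return [node for node in self.nodes if isinstance(node, node_type)]


# Hypothetical content, only to exercise the links.
memory = Memory()
view = memory.add(PointOfView("view 1"))
general = memory.add(Concept("general concept", point_of_view=view))
specific = memory.add(Concept("specific concept", parent=general, point_of_view=view))
case = memory.add(Case("case 1", concept=specific))
model = memory.add(Model("model 1", point_of_view=view))
prototype = memory.add(Prototype("prototype 1", model=model))
```

Because PointOfView specializes Model, memory.nodes_of_type(Model) returns both the point of view and the plain model.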
5 Dynamic abstraction
A class Case has been defined in the previous section, by abstraction during OOM. Although both cases and classes are reused, only classes are abstracted. Cases are specific chunks of knowledge. The classes thus defined are given to the system: they are static. An interesting challenge is for such a system to add classes during processing, by dynamic abstraction from the cases. Machine learning inferences, such as an inductive method, make it possible to construct structures similar to classes. For example, incremental concept learning [6] can be used to learn concepts. These concepts can be defined in such a way that they correspond to the definition of classes. When concepts are used to gather and index cases, such cases are instances of these classes, and instantiation is then a form of indexing [4]. The relations between CBR, classification-based reasoning and incremental concept learning (or categorization) have been studied by Napoli [8].
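As a rough illustration of dynamic abstraction, the Python sketch below builds a class at run time from a set of cases. It is not the incremental concept learning method of [6]: the induction is reduced to keeping the attribute values shared by all the cases, and the function names, the attribute names and the example cases are hypothetical.

```python
def generalize(cases):
    """Keep the attribute-value pairs shared by all the cases."""
    common = dict(cases[0])
    for case in cases[1:]:
        common = {
            name: value
            for name, value in common.items()
            if name in case and case[name] == value
        }
    return common


def make_class(name, cases):
    """Create a new class during processing from the given cases."""
    shared = generalize(cases)

    def matches(cls, case):
        # A case is an instance of the class if it has every shared value.
        return all(
            attr in case and case[attr] == value
            for attr, value in cls.shared.items()
        )

    # Three-argument type() builds a class object at run time.
    return type(name, (object,), {"shared": shared,
                                  "matches": classmethod(matches)})


# Hypothetical cases described by attribute-value pairs.
cases = [
    {"wings": True, "flies": True, "colour": "black"},
    {"wings": True, "flies": True, "colour": "white"},
]
Bird = make_class("Bird", cases)

is_bird = Bird.matches({"wings": True, "flies": True, "colour": "grey"})
is_not_bird = Bird.matches({"wings": False, "flies": False})
```

The learnt class can then index the cases it matches, so that checking membership plays the role of instantiation.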
Conclusion
A case-based reasoning system can be analyzed, designed and implemented following an OOM, with the well-known advantages of such a methodology. It entails a better formalization of CBR. But machine learning techniques are then important to compensate for the rigidity of OOM, which can be an obstacle to the main quality of CBR: learning through experience. Learning new classes, and refining them through experience, is a suitable way to compensate for this OOM rigidity.
References
[1] Aamodt Agnar and Plaza Enric, "Case-based reasoning: Foundational issues, methodological variations, and system approaches", AI Communications, 7(1), March 1994.
[2] Bichindaritz Isabelle, Apprentissage de concepts dans une mémoire dynamique : raisonnement à partir de cas adaptable à la tâche cognitive, Thesis, Université René Descartes, Paris, 1994.
[3] Bichindaritz Isabelle, "Case-based Reasoning Adaptive to Different Cognitive Tasks", In: Proceedings ICCBR, A. Aamodt and M. Veloso (Eds.), Springer-Verlag LNCS/LNAI, 1995, 391-400.
[4] Bichindaritz Isabelle, "Case-Based Reasoning and Conceptual Clustering: For a Co-operative Approach", In: Proceedings 1st UK CBR Workshop, I. Watson and F. Mahrir (Eds.), Springer-Verlag LNCS/LNAI, 1995, in press.
[5] Kolodner Janet L., Case-Based Reasoning, Morgan Kaufmann Publishers, San Mateo, California, 1993.
[6] Lebowitz Michael, "Concept Learning in a Rich Input Domain: Generalization-Based Memory", In: Machine Learning: An Artificial Intelligence Approach, Vol. 2, R.S. Michalski, J.G. Carbonell and T.M. Mitchell (Eds.), Morgan Kaufmann, Los Altos, CA, 1986.
[7] Minsky Marvin, "A Framework for Representing Knowledge", In: The Psychology of Computer Vision, P.H. Winston (Ed.), McGraw-Hill, 1975.
[8] Napoli Amedeo, "Catégorisation, raisonnement par classification et raisonnement à partir de cas", In: JAVA'94, Strasbourg, E1-E14, 1994.
[9] Schank Roger C., Dynamic Memory: A Theory of Reminding and Learning in Computers and People, Cambridge University Press, Cambridge, 1982.
[10] Schank Roger C. and Abelson Robert P., Scripts, Plans, Goals and Understanding, Lawrence Erlbaum Associates, Hillsdale, New Jersey, 1977.