A Framework for Learning Adaptation Knowledge Based on Knowledge Light Approaches
Wolfgang Wilke1, Ivo Vollrath1, Klaus-Dieter Althoff2, Ralph Bergmann1
Abstract In this paper, we present a framework for learning adaptation knowledge with knowledge light approaches for case-based reasoning (CBR) systems. "Knowledge light" means that these approaches use knowledge already acquired and represented inside the CBR system. Therefore, we describe the sources of knowledge inside a CBR system in terms of its different knowledge containers. Next, we present our framework in terms of these knowledge containers. Further, we apply our framework to two very different knowledge light approaches for learning adaptation knowledge. After that, we point out some issues which should be addressed during the design or the use of such algorithms for learning adaptation knowledge. From our point of view, many of these issues should be the topic of further research. Finally, we close with a short discussion.
1 Introduction
One of the major challenges in designing a case-based reasoning (CBR) system is the modeling of appropriate adaptation knowledge. In contrast to cases, adaptation knowledge is usually not readily available and is hard to acquire. Nevertheless, adaptation is unavoidable when CBR is used for more complex tasks, because it keeps problem solving tractable or makes it possible at all: adaptation knowledge broadens the coverage of past cases to new problems. One approach to reducing the necessary knowledge engineering effort is the automatic learning of adaptation knowledge. Until now, there have been only a few investigations into learning adaptation knowledge. Some approaches for learning adaptation knowledge can be found in DIAL (Leake, 1993; Leake, 1995b; Leake, 1995a) or CHEF (Hammond, 1986; Hammond, 1989). These systems use knowledge intensive derivational analogy approaches (Carbonell, 1986; Veloso and Carbonell, 1993) to learn adaptation knowledge. Knowledge intensive means that these approaches require a lot of background and problem solving knowledge. For example, in DIAL and CHEF
1 University of Kaiserslautern, Centre for Learning Systems and Applications (LSA), Department of Computer Science, P.O. Box 3049, D-67653 Kaiserslautern, Germany, e-Mail: {wilke, vollrath, [email protected]
2 Fraunhofer Einrichtung für Experimentelles Software Engineering (FhG IESE), Sauerwiesen 6, D-67661 Kaiserslautern, Germany, e-Mail: althoff@iese.fhg.de
adaptation strategies for special problem fields are acquired based on general domain knowledge. So a reduction of the knowledge acquisition cost is not guaranteed, because the required knowledge engineering effort might itself be costly. In this work we focus on what we call knowledge light approaches for learning adaptation knowledge3. Knowledge light means that these algorithms do not presume a lot of knowledge acquisition work before learning; they use knowledge already acquired inside the system to learn adaptation knowledge. The learning of parameters used during adaptation from already acquired knowledge is an example of such an approach. More complex work in this area was done by Hanney and Keane (Hanney, 1996; Hanney and Keane, 1996). They use an inductive learning algorithm to extract adaptation knowledge from the cases in the case base. The hope is to overcome the knowledge acquisition bottleneck (Feigenbaum and McCorduck, 1983) which arises when the acquisition and explicit representation of general domain and problem solving knowledge is necessary to solve a problem. In this paper, we first categorize the knowledge inside a CBR system using the knowledge containers first described by Richter (Richter, 1995). Based upon this, we sketch a framework for learning adaptation knowledge from these containers. Further, we enrich this framework with two case studies and address open research issues. At the end, we close with a discussion and an outlook on our further work.
2 The Different Sources of Knowledge in a CBR System
Richter (Richter, 1995) described four containers in which a CBR system can store knowledge. This knowledge may include domain knowledge as well as problem solving knowledge which describes the "method of application" of the domain knowledge inside the container. The four containers are:
The vocabulary used to describe the domain, the case base, the similarity measure used for retrieval, and the solution transformation used during adaptation.
In general, each container is able to hold all the available knowledge, but this is not advisable. The first three containers include compiled knowledge. By "compile time" we mean the development time before actual problem solving, and "compilation" is taken in a very general sense that includes human knowledge engineering activities. The case base consists of case-specific knowledge that is interpreted at run time, i.e. during the process of problem solving. For compiled knowledge, especially if manually compiled by human beings, the acquisition and maintenance task is as difficult as for knowledge-based systems in general. However, for interpreted knowledge the acquisition and maintenance task is potentially easier because it requires updating the case base only4. Part of the attractiveness of CBR comes from the flexibility to pragmatically decide which container includes which knowledge and thereby to choose the degree of compilation versus case interpretation. When developing a CBR system, a general aim should be to manually compile as little knowledge as possible and only as much as absolutely necessary. More precisely, a CBR system developer has to decide how the knowledge is distributed over the different containers, depending on the availability of knowledge and the engineering effort. If there is not enough knowledge available to fill one container as requested, there is a need for knowledge transformation from some containers to others. An example of such a transformation can be found in (Globig and Wess, 1993), where an improvement of the similarity measure is learned by knowledge transfer from the case base into the similarity container. If knowledge is transferred into the adaptation container, this is a knowledge light approach for learning adaptation knowledge, because the knowledge used is already available and coded in some of the other containers. We will focus on learning adaptation knowledge by knowledge transfer from other containers into the adaptation container, or on improving the adaptation container itself with already acquired adaptation knowledge. We will provide a general framework that makes it possible to compare and evaluate different approaches to this problem. When developing such a learning algorithm, several design decisions have to be made. Some problems and possible solutions concerning these decisions will be discussed after the general framework has been described.
3 In the following, we always use the term "learning adaptation knowledge" for knowledge light approaches.
3 A Framework for Learning Adaptation Knowledge
In this section, we first describe our framework for learning adaptation knowledge in the light of the discussed knowledge containers. After that, we present two case studies which are classified according to our framework, and we discuss some open research issues.
3.1 How to Learn Adaptation Knowledge from Knowledge Containers
Figure 1 gives an abstract view of the process of learning adaptation knowledge with an inductive algorithm. The sources of knowledge are the previously described knowledge containers: the vocabulary, the similarity measure, the case base, and the adaptation container. This knowledge is transformed into adaptation knowledge by a learning algorithm. We focus on inductive algorithms because they generate general knowledge from examples. A CBR system often consists of examples because the available model of the real-world problem is incomplete.
4 Adding new cases to the case base could also cause a need for re-engineering some of the compiled knowledge, but this should not happen very often during maintenance of a CBR system.
Figure 1: How adaptation knowledge is learned from the knowledge containers
First, there must be a selection of knowledge from the containers to learn from. Depending on the kind of selected knowledge and the inductive algorithm used, this data has to be preprocessed into a suitable representation. The result is a set of examples. Every example is characterized by a set of example attributes. These attributes are derived from the knowledge in the described containers. Finally, it is necessary to integrate the output of the learning process with the old adaptation knowledge to form an adaptation container with the improved knowledge.
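The process of Figure 1 can be summarized as a small pipeline of three steps: preprocessing, inductive learning, and integration. The sketch below is purely illustrative; all function and container names are our own, and concrete systems would substitute their own preprocessors, learners, and integration strategies.

```python
def learn_adaptation_knowledge(containers, preprocess, induce, integrate):
    """Abstract learning pipeline: select and preprocess knowledge from the
    containers, run an inductive learner on the resulting examples, and
    merge the output with the existing adaptation container."""
    examples = preprocess(containers)      # selection + preprocessing
    learned = induce(examples)             # inductive learning step
    # integration of the result with the old adaptation knowledge
    return integrate(containers["adaptation"], learned)

# Toy instantiation: "learn" a parameter k from the case base size.
containers = {"case base": [("q1", 1), ("q2", 2)], "adaptation": {"k": 1}}
result = learn_adaptation_knowledge(
    containers,
    preprocess=lambda c: c["case base"],
    induce=lambda examples: {"k": len(examples)},
    integrate=lambda old, new: {**old, **new},
)
# result == {"k": 2}
```

The pipeline deliberately keeps the three steps as separate, exchangeable functions, mirroring the separation of preprocessor, learning algorithm, and integration in the framework.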
3.2 Case Studies
Let us now apply our framework to two very different approaches for the task of learning adaptation knowledge:
1. One well known adaptation technique is weighted majority voting (Michie et al., 1994, p. 10; Weiss and Kulikowski, 1991, p. 70) used with k-NN retrieval: the retrieval mechanism returns the k nearest neighbors to a given query, and the target of the query is computed as the weighted majority5 of the targets of the retrieved cases. A possible technique for learning the best k for k-NN retrieval is used in the feature weight learning algorithms VSM (Lowe, 1995) and k-NNvsm (Wettschereck and Aha, 1995). They perform a leave-one-out test (Weiss and Kulikowski, 1991, p. 31) which computes for each case q a set of the n closest neighbors, where n is varied from 1 to the size of the case base. To each of these sets of neighbors, weighted majority voting is applied to calculate the target of q. These calculated targets are compared to the real target of q, and the value of n that results in the best overall quality of the CBR system is used as the value for k in the future. Let us now describe this technique for learning adaptation knowledge using the terms of our framework: the computation of the leave-one-out results can be seen as the preprocessing stage. The preprocessor needs input from the case base container and from the similarity container.
5 Or the weighted numerical average for continuous targets.
The output of the preprocessor, the examples for the learning algorithm, are pairs containing a value for n and the corresponding overall quality of the CBR system. The learning algorithm takes these example pairs and finds the value for n with the best overall quality, which is then used as the value for k. The adaptation container consists of some representation of the (possibly) dynamic adaptation knowledge and an algorithm which describes the application of this adaptation knowledge. In our example, the weighted majority strategy is the adaptation algorithm, and the parameter k is the only dynamic part of the adaptation container. In this case, learning means optimizing the value for k.
2. A more sophisticated approach towards learning adaptation knowledge has been described by Hanney and Keane (Hanney and Keane, 1996). Their algorithm builds pairs of cases and uses the feature differences of these case pairs to build adaptation rules. We will briefly describe this algorithm in terms of our framework: the preprocessor builds pairs from all possible cases and extends them by noting the feature differences and the target difference of these cases. Information from the containers "case base" and "vocabulary" is needed. Hanney and Keane also suggest constraining the case pairs by limiting their number or by taking advantage of the similarity measure to select pairs suitable for learning. The example input for the learning algorithm consists of the case pairs computed by the preprocessor. In a first step, the learning algorithm builds adaptation rules by taking a case pair's feature differences as preconditions and its target difference as the conclusion. These rules are subsequently refined and generalized to extend the coverage of the rule base. For complexity reasons, these generalizations are only performed at adaptation time. Hanney and Keane also describe how the learning of rules may be constrained and guided by explicitly known domain knowledge which has been manually compiled into the adaptation container before the automatic rule learning takes place. If this kind of information is to be exploited, the preprocessor must also use the adaptation container as a source of input. The dynamic part of the adaptation container consists of a set of adaptation rules along with some additional information such as confidence ratings for each rule. These confidence ratings indicate the reliability of a rule's information. They are calculated by the learning algorithm based on the degree of generalization that has been applied to generate the associated rule. The algorithm that controls the rule application and the strategy for resolving conflicts is the non-dynamic part of the adaptation container (see (Hanney, 1996; Hanney and Keane, 1996) for details). The authors also describe an approach for integrating the learned rules into already known rules. In this case, learning means determining and generalizing the rules and improving the confidence ratings.
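The leave-one-out selection of k from the first case study can be sketched in a few lines. This is a simplified illustration rather than the VSM or k-NNvsm algorithm itself: it uses an unweighted majority vote (the cited systems weight the votes), Euclidean distance as the similarity measure, and names of our own choosing.

```python
import math
from collections import Counter

def best_k_leave_one_out(cases, max_k=None):
    """Leave-one-out selection of k for k-NN majority voting.
    Each case is a (feature_vector, target) pair."""
    max_k = max_k or len(cases) - 1
    best_k, best_accuracy = 1, -1.0
    for k in range(1, max_k + 1):
        correct = 0
        for i, (query, target) in enumerate(cases):
            # Neighbours of the held-out case i, closest first.
            rest = sorted(
                (c for j, c in enumerate(cases) if j != i),
                key=lambda c: math.dist(query, c[0]),
            )
            # Unweighted majority vote over the k nearest targets.
            votes = Counter(t for _, t in rest[:k])
            if votes.most_common(1)[0][0] == target:
                correct += 1
        accuracy = correct / len(cases)
        if accuracy > best_accuracy:  # keep the smallest best k
            best_k, best_accuracy = k, accuracy
    return best_k
```

The outer loop corresponds to varying n, the inner loop to the leave-one-out test, and the final comparison to selecting the value with the best overall quality, here measured as classification accuracy.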
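The pair-difference idea behind Hanney and Keane's approach can also be illustrated with a minimal sketch. The representation below (feature dictionaries, numeric targets, difference tuples) is our own simplification and omits the refinement, generalization, and confidence-rating machinery of their actual algorithm.

```python
from itertools import permutations

def build_adaptation_rules(cases):
    """For each ordered pair of cases, record the feature differences as the
    rule's precondition and the target difference as its conclusion.
    Each case is a (features_dict, numeric_target) pair."""
    rules = []
    for (f1, t1), (f2, t2) in permutations(cases, 2):
        # Precondition: which feature values change, and from what to what.
        precondition = {a: (f1[a], f2[a]) for a in f1 if f1[a] != f2[a]}
        if precondition:  # skip pairs with identical features
            rules.append((precondition, t2 - t1))
    return rules

# Toy domain: price adaptation for flats differing in room count.
rules = build_adaptation_rules([({"rooms": 2}, 100), ({"rooms": 3}, 150)])
# rules == [({"rooms": (2, 3)}, 50), ({"rooms": (3, 2)}, -50)]
```

Each resulting rule reads as "if the query differs from the retrieved case in exactly these feature values, adjust the target by this amount", which is the raw material that the subsequent refinement and generalization steps operate on.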
3.3 Issues for Designing Knowledge Light Learning Algorithms
In this section, we point out some issues which should be addressed during the design or the use of an algorithm for learning adaptation knowledge. The designer should ask some questions and be aware of some aspects that have a significant influence on the applicability of the final algorithm and the quality of its results. We now categorize some of these issues according to the terms of our framework. From our point of view, many of these open questions should be the topic of further research.
General Issues for Designing a Learning Algorithm In general, the first question should be: "Is a knowledge light approach promising for my learning task?" The answer depends mainly on the learning goal and on the available knowledge. In particular, the knowledge in the containers must be sufficient to solve the learning task. The learning goal and the available knowledge constrain the possible inductive learning algorithms which might be useful for the learning task.
Preprocessing of the Knowledge After the availability of the knowledge has been confirmed, a learning algorithm is selected, and the learning goal is well defined, the designer has to focus on the selection and preprocessing of the information from the knowledge containers. Many "how to" questions arise here, such as: how to select the knowledge to learn from, measure the quality of knowledge for learning, construct the learning examples, handle noisy or unknown data, or derive additional knowledge from the containers for the learning algorithm. If the different sources of knowledge are contradictory inside a knowledge container or between different containers, a conflict resolution strategy may be necessary during the selection of the knowledge.
Choosing a Learning Algorithm Concerning the selected learning algorithm, there are also some interesting points to inquire about. It might be useful to know how many learning examples the algorithm needs and how to solve conflicts between contradictory examples during learning. Also, the designer has to show that the learning algorithm is appropriate for the learning task and the given examples. There are many other issues concerning learning algorithms which are addressed in the research area of algorithmic learning theory, where this is currently an active research topic. For example, this year's International Conference on Algorithmic Learning Theory will have a special track on learning in the area of CBR (ALT97, 1997).
Integrating the Learned Knowledge After the learning step, the learned knowledge has to be integrated into the already known adaptation knowledge. So the designer has to show that this is possible and how to manage that task. The improved adaptation container should also be evaluated against the original adaptation knowledge. Possible criteria for the evaluation are the quality, correctness, or coverage of the old and the new knowledge. The problem might also occur that the learned knowledge contradicts the old knowledge, so conflict resolution might also be a task during this process.
However, we believe that the discussed issues are mostly unanswered today, and we see many opportunities for further research.
4 Discussion and Related Work
The main intention of this paper is to provide a starting point for a framework for learning adaptation knowledge with knowledge light approaches. We think that complex adaptation knowledge is hard to acquire with such techniques. Nevertheless, there are a number of problems where such an approach might be useful, like classification tasks with numerical or symbolic targets. The learning of parameters for more sophisticated adaptation solutions might also be possible with such learning algorithms. The open questions and issues could be a starting point for further work in this area. Finding answers to some of the addressed issues could lead to a methodology for the design, classification, and comparison of knowledge light adaptation learning algorithms. A well-proven technique from the area of software engineering for this task could be the GQM approach (Basili et al., 1994), which we successfully used for a comparison between different commercial CBR tools (Althoff et al., 1995). A framework for knowledge intensive learning approaches would also extend this work.
Acknowledgments The authors would like to thank Prof. Michael M. Richter for the helpful discussions and suggestions. This work was partially funded by the Commission of the European Communities (ESPRIT contract P22196, the INRECA II project: Information and Knowledge Reengineering for Reasoning from Cases) with the partners: AcknoSoft (prime contractor, France), Daimler Benz (Germany), tecInno (Germany), Irish Medical Systems (Ireland) and the University of Kaiserslautern (Germany).
References
ALT97 (1997). International Conference on Algorithmic Learning Theory - Call for Papers. http://www.maruoka.ecei.tohoku.ac.jp/~alt97/cfp.html.
Althoff, K.-D., Barletta, R., Manago, M., and Auriol, E. (1995). A Review of Industrial Case-Based Reasoning Tools. AI Intelligence, Oxford.
Basili, V. R., Caldiera, G., and Rombach, H. D. (1994). The Goal Question Metric Approach. In J. J. Marciniak, editor, Encyclopedia of Software Engineering, volume 1, pages 528-532. Wiley, New York.
Carbonell, J. G. (1986). Derivational analogy: a theory of reconstructive problem solving and expertise acquisition. In Michalski, R., Carbonell, J., and Mitchell, T., editors, Machine Learning: An Artificial Intelligence Approach, volume 2, pages 371-392. Morgan Kaufmann, Los Altos, CA.
Feigenbaum, E. and McCorduck, P. (1983). The Fifth Generation. Addison Wesley.
Globig, C. and Wess, S. (1993). Case-based and symbolic classification algorithms - a case study using version space. In Richter, M. M., Wess, S., Althoff, K.-D., and Maurer, F., editors, Proceedings First European Workshop on Case-Based Reasoning, Lecture Notes in Artificial Intelligence, 837, pages 133-138. Springer Verlag.
Hammond, K. J. (1986). Learning to Anticipate and Avoid Planning Problems through the Explanation of Failures. In Proceedings of the 5th Annual National Conference on Artificial Intelligence AAAI-86, pages 556-560, Philadelphia, Pennsylvania, USA. Morgan Kaufmann Publishers.
Hammond, K. J. (1989). Case-Based Planning: Viewing Planning as a Memory Task. Academic Press, Boston, Massachusetts.
Hanney, K. (1996). Learning Adaptation Rules From Cases. Diploma Thesis, Trinity College, Dublin.
Hanney, K. and Keane, M. (1996). Learning Adaptation Rules From a Case Base. In Smith, I. and Faltings, B., editors, Advances in Case-Based Reasoning (EWCBR-96), pages 179-192. Springer Verlag.
Leake, D. B. (1993). Learning Adaptation Strategies by Introspective Reasoning about Memory Search. In Leake, D. B., editor, Proceedings AAAI-93 Case-Based Reasoning Workshop, volume WS-93-01, pages 57-63, Menlo Park, CA. AAAI Press. ftp://ftp.cs.indiana.edu/pub/leake/INDEX.html.
Leake, D. B. (1995a). Becoming An Expert Case-Based Reasoner: Learning To Adapt Prior Cases. In Proceedings of the Eighth Annual Florida Artificial Intelligence Research Symposium, pages 112-116. ftp://ftp.cs.indiana.edu/pub/leake/INDEX.html.
Leake, D. B. (1995b). Combining Rules and Cases to Learn Case Adaptation. In Proceedings of the Seventeenth Annual Conference of the Cognitive Science Society. ftp://ftp.cs.indiana.edu/pub/leake/INDEX.html.
Lowe, D. (1995). Similarity metric learning for a variable-kernel classifier. Neural Computation, 7:72-85.
Michie, D., Spiegelhalter, D., and Taylor, C. C. (1994). Machine Learning, Neural and Statistical Classification. Ellis Horwood.
Richter, M. M. (1995). The knowledge contained in similarity measures. Invited Talk at ICCBR-95. http://wwwagr.informatik.uni-kl.de/~lsa/CBR/Richtericcbr95remarks.html.
Veloso, M. M. and Carbonell, J. G. (1993). Toward scaling up machine learning: A case study with derivational analogy in PRODIGY. In Minton, S., editor, Machine Learning Methods for Planning, chapter 8, pages 233-272. Morgan Kaufmann, San Mateo.
Weiss, S. M. and Kulikowski, C. A. (1991). Computer Systems That Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems. Morgan Kaufmann.
Wettschereck, D. and Aha, D. W. (1995). Weighting features. In Veloso, M. and Aamodt, A., editors, Case-Based Reasoning Research and Development, pages 347-358. Springer.