Incorporating (Re)-Interpretation in Case-Based Reasoning*
Scott O'Hara and Bipin Indurkhya
College of Computer Science, Northeastern University, Boston, MA 02115, USA
Email: {bipin|ohara}@ccs.neu.edu Tel: 617-373-5204; Fax: 617-373-5121
1
Introduction
One advantage of case-based reasoning over rule-based reasoning that has been advocated is that cases can be interpreted differently, whereas once a rule has been abduced from cases, there is no possibility of re-interpreting the cases. For instance, Riesbeck and Schank ([16], pp. 9-14) compare and contrast three modes of reasoning: 1) reasoning with ossified cases (rules or abstract principles), 2) reasoning with paradigmatic cases (cases with a given interpretation), and 3) reasoning with stories (cases with many possible interpretations and capable of re-interpretations). They argue that it is the third mode of reasoning that displays the most flexibility and power of having a knowledge base containing cases.
However, most of the existing work on case-based reasoning remains confined to the second mode or to a version of the third mode where cases have a number of fixed interpretations. Almost all existing case-based reasoning systems associate dimensions (also called indices) with every case in a case-base, and use these dimensions for similarity assessment and retrieval. Consider, for instance, the system Hypo that applies case-based reasoning to law [2]. At the time each case is entered in the case base, one must determine the possible ways in which that case might be relevant, and each relevant factor that is found is assigned a dimension. Before a new case is given to the system, it is also dimensioned by the human user, and these dimensions are compared with the dimensions of the cases in the case library to determine the relevant cases that might apply to the new case.

To see the limitation of this approach, consider the domain of home-office tax deduction that is discussed in [17]. Here, one issue that has often been the bone of contention in legal arguments is what is the taxpayer's "principal place of business". Until a point in time, the courts were consistently interpreting "principal place of business" to mean the place where the most visible part of the taxpayer's business was carried out: classroom for a teacher, concert stage for a musician, courtroom for a judge, hospital for a doctor, and so on. However, in 1983, in a case involving a concert violinist (Drucker v. C.I.R.; 715 F.2d 67), the courts decided that this interpretation is not always fair, and in certain situations the taxpayer's home-office could be the principal place of business even though the most visible part of the taxpayer's activity is not carried out there. To further justify its decision, and to counter the Tax Commissioner's arguments, the court distinguished between the principal place of business of the employee and the principal place of business of the employer, and allowed that the two do not necessarily have to coincide. In particular, in the case of the concert violinist, the courts ruled that the principal place of business of the violinist was his studio at home where he spent most of his time practising, and not the Lincoln Center, which was the principal place of business of his employer, the Metropolitan Opera Association, Inc.

Obviously, the Drucker case introduces new dimensions: "principal place of business of the employee", "principal place of business of the employer", and "the place where the taxpayer spends most time in business-related activities." But to incorporate these dimensions, one needs to reexamine each of the cases in the case library to determine whether the new dimensions apply to it. Moreover, the system, as it was, would have been quite unhelpful when applied to the facts of Drucker, because the new dimensions were not apparent until after the case was decided.

Of course, one solution is to try to anticipate all possible ways in which a case might be relevant, and include a large number of dimensions with it. But this has two disadvantages. One is that it might cause far too many cases to be retrieved, most of which may well be irrelevant. The other is that, no matter how hard one tries, it is impossible to anticipate all possible ways in which a case might be relevant.

* This paper describes research supported in part by National Science Foundation grant IRI-9105806.

Fig. 1. A figure in a case-base.
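The limitation of fixed dimensions can be made concrete with a minimal sketch. It is not Hypo's actual mechanism: the `Case` class, the `retrieve` function, the overlap-count score and all dimension names below are hypothetical and chosen only to illustrate the argument.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """A stored case with hand-assigned dimensions (indices)."""
    name: str
    dimensions: set[str] = field(default_factory=set)


def retrieve(case_base: list[Case], query_dims: set[str]) -> list[tuple[Case, int]]:
    """Return cases sharing at least one dimension with the query, best first."""
    scored = [(case, len(case.dimensions & query_dims)) for case in case_base]
    hits = [(case, score) for case, score in scored if score > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical case library, indexed before Drucker was decided.
case_base = [
    Case("teacher", {"most_visible_place_is_classroom", "has_home_office"}),
    Case("judge", {"most_visible_place_is_courtroom", "has_home_office"}),
]

# A query phrased with a dimension that only became apparent after Drucker.
drucker_dims = {"place_where_most_time_is_spent"}
print(retrieve(case_base, drucker_dims))  # no stored case carries this dimension
```

In this toy setting the only remedy is to revisit every stored case and decide by hand whether the new dimension applies to it, which is exactly the cost described above.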
This point is best illustrated with the domain of geometric figures. Imagine a case-based system where the cases are geometric figures. When a new figure is input, the system retrieves all the cases which are similar to the input figure, possibly with some measure of the similarity and an explanation of why the figures are considered to be similar. Suppose that the figure shown in Figure 1 is one of the cases in the case-base. Now suppose that several different figures are input to our geometric case-based system as illustrated in Figure 2. Which input figures will cause the retrieval of Figure 1? In Figure 2(a), a window-pane-like figure is input consisting of four squares arranged in a box shape. Figure 1 is retrieved if it has been dimensioned as being composed of four triangles arranged in a box shape. In Figure 2(b), a figure is input consisting of two overlapping squares. Figure 1 will be retrieved if it has also been dimensioned as being composed of two overlapping triangles.

Fig. 2. Figures input to a geometric case-based system. Is Fig. 1 retrieved?

Now consider the input figure shown in Figure 2(c). Will Figure 1 be returned? If so, what explanation will be given? There is a context in which Figure 1 can be seen as obviously similar to the cross-inside-a-square figure of 2(c), and this is the proportional analogy shown in Figure 3. A proportional analogy is a four-part analogy of the form A is to B as C is to D, of the kind often found on intelligence tests. An example of a verbal proportional analogy would be: 'a man is to lungs as a fish is to gills.' In this paper we will deal with geometric proportional analogies, of which Figure 3 is an example.

Fig. 3. A proportional analogy involving Fig. 1.

Perhaps the reader saw this similarity and perhaps not. Surely, the reader will admit that it is possible that the programmer may have neglected to include the sort of dimensions needed to find this similarity. More importantly, no matter how many dimensions were used in the initial representation of Figure 1, it will always be possible to produce another figure in some new context that is similar to it, but requires a new dimension. It is clear that what is necessary is a way to interpret the figure differently depending on the context, and create new dimensions or indices when necessary. This is the ultimate promise of case-based reasoning, as rightly emphasized by Riesbeck and Schank, that is yet to be delivered.

We have been working towards fulfilling the promise of case-based reasoning. One of the authors [6, 7] has been working on formalizing the process of re-interpretation in an algebraic framework, and on articulating the crucial role it plays in many aspects of cognition. The other author [14] has been implementing a system PAN that models this re-interpretation process in the domain of geometric proportional analogy problems in which the component geometric figures may require re-interpretation. We will present an overview of PAN in Section 2. In
Section 3 we discuss how the re-interpretation mechanism might be incorporated in a conventional case-based reasoning system. In Section 4, we point out the further research questions that are raised by our approach.

2
An Overview of PAN
PAN (for Proportional ANalogy) is a program we are developing to solve geometric proportional analogies. PAN takes as input three geometric figures A, B and C made up of straight line segments and returns an answer figure D such that the four descriptions of A, B, C and D satisfy the proportional analogy relation: A is to B as C is to D. Descriptions in PAN are at a higher "conceptual level" than the initial line-segment input and involve rotations, translations, repetitions, convex polygons, symmetry, etc. A key feature of PAN is that the descriptions of the figures are constructed "in context", i.e., the figures and their descriptions act as contexts for one another during their construction. In connecting PAN to case-based reasoning, note that the dimensions of a case may be thought of as higher-level descriptions of the case.

In this section, we first discuss the overall architecture of PAN. Then we discuss the algebraic description language used by PAN and define what we mean by a description. Next we define the proportional analogy relation used by PAN and give a more precise statement of the input/output structure of PAN. We then discuss the search strategy we expect PAN to use and consider how PAN might solve the proportional analogy in Figure 3. Finally, we state briefly what parts of PAN have been implemented and what parts remain.

2.1
The Architecture of PAN
The architecture of PAN is illustrated by the diagram in Figure 4. In the diagram, circular and oval shapes represent data structures and rectangular shapes represent processes. The input to PAN is represented by the bottom-most oval labeled raw input data and is simply a set of line segments representing the figures A, B and C. The first task performed by PAN is represented in the diagram by the rectangle labeled preprocessor module. The preprocessor converts the raw input data into a graph-like structure that makes computations on the figures more efficient. One such graph is constructed for each of figures A, B and C and is stored in the description space.

After PAN finishes the preprocessing step, it enters into a search process that builds the descriptions of the figures in the description space. Initially, figures A, B and C each have "null" partial descriptions in the description space. The descriptions are built up together in an iterative fashion in which each iteration consists of: (1) examining the contents of the description space, (2) choosing an action to be taken in the description space, and (3) applying the action in the description space. This process is represented in Figure 4 by the blocks labeled perception module and proportional analogy (PA) module and the loop of arrows which first point from the description space to the perception and PA modules (description space data is input to these modules) and which then point back from the perception and PA modules to the description space (modifications are made to the description space).

The perception module contains various routines for detecting different kinds of geometric concepts such as polygons, partial polygons, iterated (or repeated) figures and other relations between figures such as above, left-of, etc. Since a vast number of geometric concepts may potentially be detected in a figure, an ordering is imposed on the detection of concepts. For example, in the case of polygons, the simplest shapes, such as regular polygons, are detected first, followed by polygons with fewer regularities. Top-down considerations provided by the PA module may be used to modify the concept-ordering in the perception module.

The PA module consists of a set of condition-action rules for building proportional analogies and may be thought of as a single perception module for detecting proportional analogies. Generally, the condition part of a rule is applied to the description space and detects relationships between the partial descriptions relevant to creating a proportional analogy. The action part of a rule modifies the description space in order to build a proportional analogy. The actions that may be undertaken by the PA module include: invoking the perception module to detect geometric concepts in one or more of the figures; extending one or more of the partial descriptions of A, B and C; creating mappings between partial descriptions of A, B and C; evaluating the "goodness" of the partial proportional analogy so far constructed; and finally, constructing the description of figure D when the descriptions of A, B and C are completed.

Fig. 4. The Architecture of PAN
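The examine/choose/apply cycle driven by condition-action rules can be sketched as follows. This is only an illustration under strong assumptions: the description space is reduced to a dictionary, the rule-selection policy (first applicable rule) is invented, and `run_pa_module`, `needs_description`, `describe`, `ready_for_d` and `build_d` are hypothetical names, not parts of PAN.

```python
from typing import Callable

DescriptionSpace = dict  # hypothetical stand-in for PAN's description space
Rule = tuple[Callable[[DescriptionSpace], bool], Callable[[DescriptionSpace], None]]


def run_pa_module(space: DescriptionSpace, rules: list[Rule], max_steps: int = 100) -> DescriptionSpace:
    """Repeat: examine the space, choose a rule whose condition holds, apply its action."""
    for _ in range(max_steps):
        applicable = [action for condition, action in rules if condition(space)]
        if not applicable:
            break  # no rule fires: nothing more to build
        applicable[0](space)  # assumption: take the first applicable rule
    return space


def needs_description(figure: str) -> Callable[[DescriptionSpace], bool]:
    """Condition: the figure still has a null description."""
    return lambda space: space[figure] is None


def describe(figure: str) -> Callable[[DescriptionSpace], None]:
    """Action: fill in a placeholder where real perception would run."""
    def action(space: DescriptionSpace) -> None:
        space[figure] = f"description of {figure}"
    return action


def ready_for_d(space: DescriptionSpace) -> bool:
    """Condition: A, B and C are described but D is not."""
    return space["D"] is None and all(space[f] is not None for f in "ABC")


def build_d(space: DescriptionSpace) -> None:
    """Action: construct the description of D (placeholder)."""
    space["D"] = "description of D"


space = {"A": None, "B": None, "C": None, "D": None}
rules = [(needs_description(f), describe(f)) for f in "ABC"] + [(ready_for_d, build_d)]
run_pa_module(space, rules)
print(space["D"])  # description of D
```

The point of the sketch is the control structure: the rule for D can only fire once the rules for A, B and C have modified the shared description space.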
2.2
An Algebraic Description Language
An algebra consists of a set of objects and a set of operators defined over these objects. The operators have the property of closure, which means that when an operator from the set of operators is applied to objects in the set of objects, the result is an object in the set. In PAN, a geometric algebra is defined consisting of a set of primitive geometric figures and a set of operations which transform geometric objects into new ones. The set of objects of the geometric algebra is then the closure of the geometric operations over the set of primitives. PAN uses an algebraic description language because of the natural way in which algebra can be used to represent the decomposition of geometric figures. We are further motivated to build on the algebraic approach to analogy and metaphor discussed in [6, 7]. Let us take a closer look at the objects and operators of the geometric algebra.

Objects: The set of objects in the geometric algebra includes the set of all geometric figures that can be formed with straight line segments. The objects of the geometric algebra are exactly those objects permitted in PAN analogies. We define primitive objects to be the set of all polygons and the set of all partial polygons, where a partial polygon is any connected subfigure of a polygon. Note that a line segment is a partial polygon.

Operators: The set of operators in the geometric algebra is quite large and consists of three distinct subsets: the global operators, the iterative processes and the join operators.

Global Operators: The global operators are parameterized, single argument operators that act on every point of a geometric figure. For example, a rotation applied to a geometric figure rotates every point in the figure around a given point. The angle and point of rotation are parameters of the class of rotation operators.
For space reasons, we won't discuss the details of how the global operators are parameterized, though we hope the reader can infer how this is done from the description below. The global operators break down into the seven classes of operators shown in Figure 5. Note that the objects depicted in the figure are surrounded by a dotted box. This box is not part of the geometric object but is provided as a frame of reference so the position and orientation of different figures can be visually compared.

The operator classes TRANS, ROT, REFL and GLIDE_REFL are known as isometries, which are geometric transformations that preserve size and shape. A TRANS operator translates its argument figure along the x and y axes. A ROT operator rotates its argument around some point in the plane. A REFL operator reflects its argument across some line in the plane. A GLIDE_REFL operator first reflects its argument across a line in the plane and then translates the result parallel to the line of reflection.

The operator classes SCALE, SCALE_ROT and SCALE_REFL are known as similarities, which are geometric transformations that preserve shape but not size. A SCALE operator changes the size (and possibly location) of its argument by multiplying the x and y coordinates of the argument by a constant. A SCALE_ROT operator first scales its argument then rotates it. A SCALE_REFL operator first scales its argument then reflects it across some line.

Fig. 5. The Global Operators.

Iterative Processes: The iterative processes are parameterized, single argument operators which make successive copies of their arguments. We include the iterative processes in our algebra because of the psychological salience of repetitive structures (see [11, 20] among many who make this observation). Two examples of iterative processes are shown in Figure 6. The iterative processes are parameterized with a global operator and with the number of iterations. In the first example, the global operator parameter is a TRANS operator and the number of iterations is three. In the second example, the global operator parameter is a ROT operator and the number of iterations is two. An additional parameter is included in the ROT operators representing the rotational symmetry of the figure. In the second example, this parameter is six.

Fig. 6. Two Examples of Iterative Processes
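As an illustration of how parameterized global operators and iterative processes compose, here is a minimal sketch. It makes several assumptions that are not from PAN: a figure is represented as a plain list of vertex points, only three of the seven operator classes are shown, and the function names `trans`, `rot`, `scale` and `iterate` are hypothetical. The iteration count is read as the total number of copies, following the IP(TRANS,2) example that yields two triangles.

```python
import math

Point = tuple[float, float]
Figure = list[Point]  # assumption: a figure is just a list of vertex points


def trans(dx: float, dy: float):
    """TRANS: translate every point along the x and y axes."""
    return lambda fig: [(x + dx, y + dy) for x, y in fig]


def rot(angle_deg: float, center: Point = (0.0, 0.0)):
    """ROT: rotate every point by angle_deg (counter-clockwise) around center."""
    a = math.radians(angle_deg)
    cx, cy = center

    def apply(fig: Figure) -> Figure:
        return [
            (cx + (x - cx) * math.cos(a) - (y - cy) * math.sin(a),
             cy + (x - cx) * math.sin(a) + (y - cy) * math.cos(a))
            for x, y in fig
        ]
    return apply


def scale(k: float):
    """SCALE: multiply the x and y coordinates by a constant."""
    return lambda fig: [(k * x, k * y) for x, y in fig]


def iterate(op, n: int):
    """Iterative process: n copies in total, each obtained by applying op to the previous one."""
    def apply(fig: Figure) -> list[Figure]:
        copies = [fig]
        for _ in range(n - 1):
            copies.append(op(copies[-1]))
        return copies
    return apply


triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
row = iterate(trans(1.0, 0.0), 3)(triangle)  # three copies side by side
print(row[2])  # [(2.0, 0.0), (3.0, 0.0), (2.0, 1.0)]
```

Because every operator is a function from figures to figures, an iterative process can take any global operator as its parameter, which mirrors the closure property of the algebra.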
Join Operators: Join operators are multi-argument operators which combine two or more figures into a single composite figure. Associated with each join operator is a relation which the arguments must satisfy. Typical join operators are ABOVE, LEFT, INSIDE, and so on.

Description: A description of a geometric figure is a tree representing a history of how the figure might have been constructed. The external nodes of the tree are labeled with objects and the internal nodes are labeled with operators. The object represented by a description can be found by recursively evaluating the nodes in the tree. Two examples of descriptions of Figure 1 are shown in Figure 7. The first description (Figure 7a) describes Figure 1 as two copies of a pair of repeated triangles where one copy is the reflection of the other. The operator IP(TRANS,2) is applied to the lower left-hand triangle resulting in two horizontally adjacent triangles. The operator IP(REFL) is then applied resulting in Figure 1. The second description (Figure 7b) describes Figure 1 as a diamond shape inside a concave polygon. The operator INSIDE is applied to the outer concave figure and the diamond shape also resulting in Figure 1.

Fig. 7. Two Descriptions of Figure 1

2.3
The Proportional Analogy Relation
The proportional analogy relation is defined as a precise relationship between the complete descriptions of the four figures A, B, C and D involved in the analogy. Description pairs (A,B), (A,C), (B,D) and (C,D) are related to one another by substituting, inserting and deleting nodes in the description trees.

Adapting Descriptions: Descriptions may be adapted to other descriptions by applying d-adaptations. A d-adaptation is a sequence of zero or more primitive d-adaptations. There are three primitive d-adaptations: INSERT, DELETE and SUB. INSERT inserts exactly one operator node into a description. Associated with each INSERT is an operator label specifying which operator is to be inserted and a pointer into the description specifying where in the description to make the insertion. DELETE removes exactly one operator node from a description and also has an operator label and a pointer associated with it. (Note: for simplicity of exposition, we allow the insertion and deletion of unary operators only. It is possible to insert and delete multi-argument operators, but we will not discuss that possibility here.) SUB substitutes an object or operator node with some other object or operator node of the same arity. Associated with each SUB are two pointers, one into each description, and two operator labels designating the operator to be matched and its substitute. Substitutions between nodes in different descriptions are permitted only if there exists some suitable generalization between the object or operator labels of the nodes.
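The three primitive d-adaptations can be sketched on a small tree structure. The sketch below is a simplification made for illustration, not PAN's data structure: the "pointer into the description" is modeled as a parent node plus a child index (so the root cannot be wrapped or removed), the generalization check that licenses a SUB is omitted, and the names `Node`, `sub`, `insert`, `delete` and `show` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A description-tree node: operators are internal nodes, objects are leaves."""
    label: str
    children: list["Node"] = field(default_factory=list)


def sub(node: Node, new_label: str) -> None:
    """SUB: relabel a node in place; the arity is unchanged because children are kept."""
    node.label = new_label


def insert(parent: Node, index: int, operator: str) -> None:
    """INSERT: put one unary operator node between parent and its index-th child."""
    parent.children[index] = Node(operator, [parent.children[index]])


def delete(parent: Node, index: int) -> None:
    """DELETE: remove one unary operator node, reattaching its only child."""
    target = parent.children[index]
    if len(target.children) != 1:
        raise ValueError("only unary operator nodes may be deleted")
    parent.children[index] = target.children[0]


def show(node: Node) -> str:
    """Render a tree as nested text, e.g. INSIDE(square, cross)."""
    if not node.children:
        return node.label
    return f"{node.label}({', '.join(show(c) for c in node.children)})"


d_a = Node("INSIDE", [Node("square"), Node("cross")])
sub(d_a, "LEFT")        # SUB on the root operator label
insert(d_a, 1, "ROT")   # INSERT a unary operator above the second child
print(show(d_a))        # LEFT(square, ROT(cross))
delete(d_a, 1)          # DELETE undoes the insertion
print(show(d_a))        # LEFT(square, cross)
```

A d-adaptation in the sense of the text is then simply a sequence of such calls applied in order.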
Fig. 8. A Diagram Illustrating the Proportional Analogy Relation
A-Adaptations: D-adaptations can in turn be modified by a-adaptations. An a-adaptation is a unary function that takes a d-adaptation as input and returns a new, modified d-adaptation. A-adaptations may add, remove or modify SUB primitives in the input d-adaptation. A SUB primitive can be modified by applying other SUB primitives to the two associated operator labels. INSERT and DELETE primitives may not be added to or removed from the input d-adaptation, but they may be modified by applying SUBs to their associated operator labels.

Proportional Analogy: A careful examination of real-world examples of proportional analogies reveals that exchanging the terms A and D, or B and C, of a proportional analogy always produces a new proportional analogy. A significant feature of our definition is that it captures this symmetry property of proportional analogies. Inspiration for our definition of proportional analogy in an algebraic framework was found in the symmetric definition of [15] based on the notion of a 'pushout' from category theory.

Proportional Analogy Relation: For geometric figures A, B, C and D, A is to B as C is to D if and only if there exist four descriptions $d_A$, $d_B$, $d_C$ and $d_D$ of figures A, B, C and D respectively, four d-adaptations $F$, $F'$, $G$ and $G'$ such that $F(d_A) = d_B$, $G(d_A) = d_C$, $F'(d_C) = d_D$ and $G'(d_B) = d_D$, and two a-adaptations $\mathcal{F}$ and $\mathcal{G}$ such that $\mathcal{G}(F) = F'$ and $\mathcal{F}(G) = G'$ (see Figure 8).

We can now state the input/output structure of PAN. The input consists of three geometric figures A, B and C. The output consists of a new geometric figure D, the descriptions $d_A$, $d_B$, $d_C$ and $d_D$, the four d-adaptations $F$, $F'$, $G$ and $G'$, and the two a-adaptations $\mathcal{G}$ and $\mathcal{F}$.
2.4
A "Multiple Hill-Climbing" Search Strategy
The overall search process executed by PAN may be characterized as "multiple hill-climbing." At any given time, the search focuses on building a single proportional analogy. The next step in constructing the proportional analogy is chosen in a hill-climbing fashion, always choosing the best of the possible choices at the current node in the search tree. To evaluate proportional analogies and allow their comparison with one another, a proportional analogy quality (PAQ) function is defined on an integer scale from 0 to infinity where a value of 0 stands for the "best analogy". A proportional analogy is considered to be better than some other proportional analogy involving the same figures if its PAQ value is less than the PAQ value of the other proportional analogy.

The search process has three parameters which are fixed at the beginning of the search. The first, the optimal quality threshold, specifies a maximum PAQ value below which the search procedure may halt: a good analogy has been found. The second, the minimum quality threshold, specifies a maximum PAQ value above which the system will not even consider an analogy as a possible answer. Finally, the third parameter, the search tree size, places a limit on the maximum size of the search space.

Progress along the current path continues until either (1) a proportional analogy is found; or (2) no further progress can be made from this node; or (3) the minimum quality threshold is exceeded. In case 1, if the optimal quality threshold is not exceeded, then the search halts and an answer has been found; if the optimal quality threshold is exceeded, then the search continues as in cases 2 and 3. If progress has halted on the current path, then search must continue from one of the other nodes in the search tree. Criteria for making this choice will be determined empirically. At present we speculate that these "choice points" should be the most developed partial proportional analogies with the lowest PAQ value. If the search continues until the search tree size is reached, and no analogy with a PAQ below the optimal quality threshold has been found, then the system simply returns the best proportional analogy found so far, that is, the one with the lowest PAQ value.

The entire search tree consisting of partially constructed proportional analogies is kept in the description space. Space is conserved by storing object and operator information only once for any given description node and using pointers to this information from the partial proportional analogies in the search tree.
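The control flow of this strategy can be sketched in a few lines. The sketch is an interpretation, not PAN's implementation: the function `multiple_hill_climb` and its parameters are hypothetical, "not exceeded" is read as "less than or equal to", and choice points are ranked by PAQ value alone, whereas the text also speculates about preferring the most developed partial analogies.

```python
import heapq
from typing import Callable, Hashable, Iterable, Optional


def multiple_hill_climb(
    root: Hashable,
    expand: Callable[[Hashable], Iterable[Hashable]],
    paq: Callable[[Hashable], int],
    is_complete: Callable[[Hashable], bool],
    optimal_threshold: int,
    minimum_threshold: int,
    max_tree_size: int,
) -> Optional[Hashable]:
    """Return a complete analogy, preferring one within optimal_threshold."""
    choice_points = []  # heap of (PAQ, tie-breaker, node): unexplored alternatives
    counter = 0
    size = 1
    best = None
    current = root
    while current is not None and size <= max_tree_size:
        if is_complete(current):
            if best is None or paq(current) < paq(best):
                best = current
            if paq(current) <= optimal_threshold:
                return current  # case 1 with a good enough analogy: halt
            children = []
        else:
            # Case 3: drop children whose PAQ exceeds the minimum quality threshold.
            children = sorted(
                (c for c in expand(current) if paq(c) <= minimum_threshold), key=paq)
        size += len(children)
        if children:
            current = children[0]  # hill-climbing step: follow the best child
            for sibling in children[1:]:
                heapq.heappush(choice_points, (paq(sibling), counter, sibling))
                counter += 1
        elif choice_points:
            current = heapq.heappop(choice_points)[2]  # resume from another choice point
        else:
            current = None  # case 2 everywhere: the tree is exhausted
    return best  # lowest-PAQ complete analogy seen, or None


# Toy search tree with hypothetical PAQ values.
tree = {"root": ["a", "b"], "a": ["a1"], "b": ["b1"], "a1": [], "b1": []}
paq_values = {"root": 0, "a": 1, "b": 2, "a1": 9, "b1": 3}
answer = multiple_hill_climb(
    "root", tree.__getitem__, paq_values.__getitem__, {"a1", "b1"}.__contains__,
    optimal_threshold=3, minimum_threshold=5, max_tree_size=50)
print(answer)  # b1
```

In the toy run the search first climbs through node a, abandons that path because its only continuation exceeds the minimum quality threshold, and then resumes from the stored choice point b.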
An example of how PAN might solve the proportional analogy in Figure 3 is shown in Figure 9, which depicts a hypothetical search tree. The order in which the nodes are expanded is labeled 0 through 7. For expository reasons, the search tree is considerably simplified from what it is likely to be. A challenge in our research will be to find heuristics powerful enough to avoid an exponential explosion. In part, the PAQ function and the quality thresholds are meant to do this, but the branching factor in these problems can still be quite large, and the search tree can be quite "bushy" below the quality thresholds. In discussing this example, we will point out some heuristics that we think will be helpful.

Fig. 9. Searching for a Proportional Analogy.

The search starts at node 0 where there are initially no descriptions of any of the figures A, B or C. The PA module initially focuses attention on figure C. (Actually, the PA module probably would use a "figure complexity" heuristic here and thus focus on figure A or B, which are simpler in some sense than figure C. However, starting with figure C will help us illustrate the quality thresholds.) Having decided to focus on figure C, the first step is to find a primitive with the lowest PAQ value. Here polygons with the fewest distinct line-segment lengths might be preferred. An equilateral triangle would have a low PAQ value since it has three line segments of the same length. In figure C, there are six possible triangles, four small and two large, so these must be distinguished. Preference (i.e., a low PAQ value) could be for smaller (or larger) figures. Let's say smaller figures are preferred, leaving us with four small triangles. We could then prefer higher figures over lower figures, leaving us with two triangles. Finally, we could prefer left-hand figures over right-hand figures, leaving us with the single small, upper-left triangle of node 1. In node 2, various possibilities of applying iterative processes are considered. Using gestalt-like criteria, the PAQ value of an iterative process is proportional to the distance between the components of the iterative process. This criterion would leave us with two possibilities: the triangle directly below and the triangle directly to the right of the initial upper-left triangle. To distinguish these two
possibilities, a preference ordering might be imposed on the global operators in which translations have a lower PAQ value than reflections. In this case then, the right-hand triangle is selected to form an iterative process. In node 3, an approach similar to that applied in node 2 is used to form the final reflective iterative process. This now gives us a full description of figure C, and the PA module would now want to build the descriptions of the other figures. Since figure C must be the result of a d-adaptation G applied to A's description, focus now shifts to building A's description. PAN will try to build A's description to be structurally similar to C's description so as to minimize the complexity of d-adaptation G and thus minimize the overall PAQ value. However, the structure of figure A does not permit a low-complexity d-adaptation with respect to C's description. Some d-adaptation might be found, but it would have a high number of INSERTs and DELETEs and thus a high PAQ value. Most likely, the minimum quality threshold will be exceeded. Thus a new choice point must be found. Since nodes 1 and 2 are not particularly well developed, processing moves back to node 0. Now, the PA module shifts attention to figure A. In node 4, methods similar to those described above are used to detect the square and the cross subfigures. Since these figures are unconnected, the task is made easier. In node 5, an attempt is made to find iterative processes but no additional copies of the square or cross are found. It is thus decided to join the two figures with the INSIDE operator. Given that a description of A has been found, focus shifts to figure B. The fact that A has been described with a square and cross is used directly to start describing B as also consisting of a square and cross.
(Figure B in this example is fairly unambiguous; in a different example, however, the square and cross might be connected and thus other ways of describing B might be possible.) PAN proceeds in node 6 to describe figure B as a square left of a cross. The d-adaptation F (not shown, but implied) has a very low PAQ value since no operators are deleted or inserted and the PAQ values of the substitutions INSIDE to LEFT, square to square, and cross to cross are low. Since we have constructed descriptions of figures A and B, the PA module shifts the focus to describing figure C. In this attempt at describing C, the context of A's description is used to influence the process. The INSIDE operator in A's description causes the system to look for "containing figures", resulting in the choice of the outer concave polygon in figure C. Because of the context of A's description, the PAQ value of the concave polygon is rated much lower than it otherwise would have been. The INSIDE operator along with the concave polygon subsequently serve to lower the PAQ value of the diamond shape inside the concave polygon. Thus, in node 7, we create a complete description of C. With three complete descriptions of A, B and C, and d-adaptations F and G, the d-adaptations $F'$ and $G'$ and the a-adaptations $\mathcal{F}$ and $\mathcal{G}$ are constructed, and the description of D is then generated.
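The final step of this walkthrough can be checked on a toy encoding. The sketch below rests on assumptions that go beyond the text: descriptions are nested tuples, a d-adaptation consisting only of SUBs is a label-to-label dictionary, an a-adaptation is modeled as rewriting both labels of every SUB with another adaptation's SUBs, and the resulting description of D is our reading of the example rather than a result reported for PAN. The names `apply_d` and `apply_a` are hypothetical.

```python
def apply_d(adaptation: dict, description):
    """Apply a SUB-only d-adaptation (label -> label) to a nested-tuple description."""
    if isinstance(description, tuple):
        op, *args = description
        return (adaptation.get(op, op), *(apply_d(adaptation, a) for a in args))
    return adaptation.get(description, description)


def apply_a(modifier: dict, adaptation: dict) -> dict:
    """Model an a-adaptation: rewrite both labels of every SUB with the modifier's SUBs."""
    return {modifier.get(old, old): modifier.get(new, new) for old, new in adaptation.items()}


d_a = ("INSIDE", "square", "cross")

# d-adaptations read off the walkthrough (SUBs only, no INSERT or DELETE).
F = {"INSIDE": "LEFT", "square": "square", "cross": "cross"}                # A -> B
G = {"INSIDE": "INSIDE", "square": "concave polygon", "cross": "diamond"}  # A -> C

d_b = apply_d(F, d_a)    # ('LEFT', 'square', 'cross')
d_c = apply_d(G, d_a)    # ('INSIDE', 'concave polygon', 'diamond')

F_prime = apply_a(G, F)  # plays the role of the a-adaptation applied to F, giving C -> D
G_prime = apply_a(F, G)  # plays the role of the a-adaptation applied to G, giving B -> D

d_d = apply_d(F_prime, d_c)
print(d_d)                           # ('LEFT', 'concave polygon', 'diamond')
print(d_d == apply_d(G_prime, d_b))  # True: both routes to D agree
```

Under this encoding the two routes to D agree, which is the commuting condition that the proportional analogy relation demands.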
2.5
Current Status of PAN
PAN is currently a partially implemented system. With reference to the diagram in Figure 4, the preprocessor module and much of the perception module have been implemented. The preprocessor converts figures represented as lists of line segments into graphs for easier computation. The routines in the perception module use the graphs as input. The perception module contains a polygon recognizer and a partial polygon recognizer. The figure recognizers do not find all polygons and partial polygons at once. Instead, an ordering is placed on the figures such that the "simplest" figures are found first. The perception module also includes a routine that, given an arbitrary shape, can find this shape as a subshape of another figure. The subshape may be isometric or similar (in the technical geometric sense) to the original shape. A routine also exists to find iterative processes, which uses the shape-finding routine as a subroutine. Given an arbitrary subfigure, this routine can detect repetitions of the subfigure. In addition, the routine can be constrained to find repeated figures involving only one kind of global operator, such as only translations or only rotations.
3
Re-Interpretation in Case-Based Reasoning
In the introduction, we articulated the need for a re-interpretation component in case-based reasoning. We would like to emphasize here that re-interpretation plays a fundamental role in human cognition and problem-solving. Empirical studies of creative problem-solving in real-world situations have shown that one of its key mechanisms is the ability to see the problem situation from different points of view [5, 9, 18]. Re-interpretation, typically called change of representation in AI, has been given some attention by AI researchers [1, 4, 10, 12, 13, 19]. Nevertheless, it has not yet been brought into the mainstream of AI research, especially not in the area of case-based reasoning.
In the context of case-based reasoning systems, we would like to point out that we propose re-interpretation not as an alternative to the conventional approach using dimensions (or indices) but as an addition to it. It should be clear from our brief description of the PAN architecture that re-interpretation is a computationally expensive process. Moreover, when an aspect of a case is deemed relevant and is turned into a dimension, it is usually because it is considered to have a more general appeal than a mere idiosyncrasy of the case. It therefore seems quite likely that many new problems could be solved using conventional dimensions, which allow fast retrieval of similar past cases. So it would be prudent to continue to encode each case in terms of dimensions, depending on which of its aspects seem relevant at the time the case is entered into the case library. But then we could provide an interpretation module that is invoked when the retrieval based on conventional dimensions is not helpful. This could be because the retrieved cases, even though they are similar to the problem, do not have solutions that can easily be adapted to solve the problem [3]. Or it could be because the problem at hand requires attention to an aspect that was not considered relevant so far, and is therefore not included in the dimensions. In all such situations, the re-interpretation mechanism is activated, which alters the similarity metric (as manifested by the existing dimensions) so that the cases in the case library are made to look similar to the new problem, like making Figure 1 seem similar to the cross-in-square figure from the introduction.
In combining re-interpretation with conventional case-based reasoning, one necessary ingredient is that the reasoning system must be aware of its own progress at some meta-level. This is because the computationally expensive re-interpretation module is invoked only when the conventional index-based retrieval fails or is not good enough. To make that decision, the system must be able to evaluate how strongly or weakly the cases it has retrieved so far apply to the problem at hand. It must also be given a threshold so that if the strength of the cases retrieved by the conventional mechanism is below that threshold, then the re-interpretation module is activated.
It may seem at first that the re-interpretation process is rather like a runaway horse, retrieving a horde of useless cases from the case library, for almost anything could be made to look similar to anything else. However, a careful analysis in any domain shows that there are sufficient top-down, goal-directed constraints to keep the reins on re-interpretation. For instance, in the domain of geometric proportional analogy, the context provided by the other figures acts as a powerful constraint to focus the search for new dimensions in the right direction. In the domain of legal reasoning, which one of the authors has been exploring and where there is a crucial need for the re-interpretation mechanism, we have found that the goals of the user serve as a beacon to keep the search for new dimensions and relevant precedents from growing exponentially [8].
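The control scheme described above, in which the expensive module is tried only when index-based retrieval is too weak, might be sketched as follows. This is an illustration under stated assumptions rather than the authors' design: the Jaccard overlap used as the "strength" of a retrieved case, the names `index_retrieve`, `retrieve` and `placeholder_reinterpret`, and the example threshold are all hypothetical, and the placeholder does not model how new dimensions would actually be created.

```python
def jaccard(a, b):
    """Overlap between two sets of dimensions, in the range [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def index_retrieve(problem_dims, library):
    """Conventional retrieval: rank cases by overlap of fixed dimensions."""
    scored = [(jaccard(problem_dims, dims), name) for name, dims in library.items()]
    return sorted(scored, reverse=True)

def retrieve(problem_dims, library, reinterpret, threshold):
    """Return (ranked cases, whether re-interpretation was invoked).

    The meta-level check is the comparison of the best strength with
    the threshold; only a weak result triggers the costly module.
    """
    ranked = index_retrieve(problem_dims, library)
    best = ranked[0][0] if ranked else 0.0
    if best >= threshold:
        return ranked, False
    return reinterpret(problem_dims, library), True

def placeholder_reinterpret(problem_dims, library):
    # Stand-in only: a real module would construct new dimensions here
    # before ranking the cases again.
    return index_retrieve(problem_dims, library)

# Hypothetical case library: case name -> set of assigned dimensions.
library = {"case_a": {"d1", "d2"}, "case_b": {"d3"}}

strong, used_strong = retrieve({"d1", "d2"}, library, placeholder_reinterpret, 0.5)
weak, used_weak = retrieve({"d4"}, library, placeholder_reinterpret, 0.5)
```

Passing the re-interpretation step in as a function keeps the cheap path free of any dependency on the expensive one, which matches the proposal that dimensions remain the default retrieval mechanism.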
4
Conclusions and Further Research
We have argued in this paper for the need to incorporate a re-interpretation mechanism in case-based reasoning systems, and have outlined an approach to it. Obviously, we are just crossing the threshold into a new realm where a lot of exploration needs to take place: our work on modeling re-interpretation in geometric proportional analogies and legal reasoning is only a beginning. There are a number of issues that have to be studied before re-interpretation can be incorporated into practical case-based reasoning systems. First of all, the mechanism of re-interpretation has to be studied in a number of different domains in which case-based reasoning systems already exist. Then one needs to design and implement re-interpretation modules in each of these domains and experiment with different control structures for integrating them with a conventional index-based case retriever. Only after that can we hope to design a domain-independent case-based reasoning shell that has the ability to create new dimensions, and yet is efficient when applied to those problems that do not require such creativity.
References
1. Amarel S., 1968, "On the Representations of Problems of Reasoning about Actions," in D. Michie (ed.) Machine Intelligence 3, American Elsevier, New York, NY.
2. Ashley K.D., 1990, Modeling Legal Argument: Reasoning with Cases and Hypotheticals, MIT Press, Cambridge, Mass.
3. Börner K., 1993, "Structural-Similarity as Guidance in Case-Based Design," in this volume.
4. French R. and Hofstadter D., 1991, "Tabletop: An Emergent, Stochastic Computer Model of Analogy-Making," in Proceedings, 13th Annual Cognitive Science Society Conference, pp. 708-713, Lawrence Erlbaum Associates, Hillsdale, New Jersey.
5. Gordon W.J.J., 1961, Synectics: The Development of Creative Capacity, Harper and Row, New York, NY.
6. Indurkhya B., 1991, "On the Role of Interpretive Analogy in Learning," New Generation Computing 8, No. 4, pp. 385-402.
7. Indurkhya B., 1992, Metaphor and Cognition, Kluwer Academic Publishers, Dordrecht, The Netherlands.
8. Janetzko D. and Indurkhya B., in preparation, "Toward a Model of Reinterpretation in Legal Reasoning."
9. Koestler A., 1964, The Act of Creation, Hutchinsons of London; 2nd Danube ed., 1976.
10. Korf R. E., 1980, "Toward a Model of Representation Changes," Artificial Intelligence 14, pp. 41-78.
11. Leyton M., 1992, Symmetry, Causality, Mind, MIT Press, Cambridge, Massachusetts.
12. Lowry M., 1990, "STRATA: Problem Reformulation and Abstract Data Types," in D. Paul Benjamin (ed.) Change of Representation and Inductive Bias, Kluwer Academic Publishers, Dordrecht, The Netherlands.
13. Mitchell M. and Hofstadter D., 1990, "The Emergence of Understanding in a Computer Model of Concepts and Analogy-Making," Physica D, 42:322-334.
14. O'Hara S., 1992, "A Model of the 'Redescription' Process in the Context of Geometric Proportional Analogy Problems," in K.P. Jantke (ed.) Analogical and Inductive Inference, Lecture Notes in Artificial Intelligence 642, Springer-Verlag, Berlin, Germany, pp. 268-293.
15. Pötschke D. P., 1987, "Analogical Reasoning Using Graph Transformations," pp. 135-144 in Analogical and Inductive Inference, K.P. Jantke (ed.), International Workshop AII '86, Wendisch-Rietz, GDR, October 6-10, 1986, Proceedings, Springer-Verlag, Berlin, GDR.
16. Riesbeck C.K. and Schank R.C., 1989, Inside Case-Based Reasoning, Lawrence Erlbaum and Associates, Hillsdale, New Jersey.
17. Rissland E.L. and Skalak D.B., 1991, "Cabaret: Rule Interpretation in a Hybrid Architecture," International Journal of Man-Machine Studies 34, pp. 839-887.
18. Schön D.A., 1963, Displacement of Concepts, Humanities Press, New York.
19. Subramanian D., 1990, "A Theory of Justified Reformulations," in D. Paul Benjamin (ed.) Change of Representation and Inductive Bias, Kluwer Academic Publishers, Dordrecht, The Netherlands.
20. van der Helm P. A., van Lier R. J., and Leeuwenberg E. L. J., 1992, "Serial Pattern Complexity: Irregularity and Hierarchy," Perception, Volume 21, pp. 517-544.