Annals of Operations Research 72 (1997) 51–73

A framework for retrieval in case-based reasoning systems

Ali Reza Montazemi and Kalyan Moy Gupta
School of Business, McMaster University, Hamilton, Ontario, Canada L8S 4M4
E-mail: [email protected]; [email protected]

A case-based reasoning (CBR) system supports decision makers when solving new decision problems (i.e., new cases) on the basis of past experience (i.e., previous cases). The effectiveness of a CBR system depends on its ability to retrieve useful previous cases. The usefulness of a previous case is determined by its similarity with the new case. Existing methodologies assess similarity by using a set of domain-specific production rules. However, production rules are brittle in ill-structured decision domains, and their acquisition is complex and costly. We propose a framework of methodologies, based on decision theory, for assessing the similarity of a new case with previous cases; this framework ameliorates the deficiencies associated with the use of production rules. An empirical test of the framework in an ill-structured diagnostic decision environment shows that it significantly improves the retrieval performance of a CBR system.

1. Introduction

A CBR system supports problem solving based on past experience with similar decision problems. To assist a decision-maker (DM), a CBR system proceeds as follows [36, 44, 52] (see figure 1): a previous case (or cases) similar to the new decision problem (new case) is retrieved; the solution of the previous case is mapped as a solution for the new case; the mapped solution is adapted to account for the differences between the new case and the previous case; and the adapted solution is then evaluated against hypothetical situations. To aid future decision making, feedback on the success or failure of the evaluated solution is obtained from the DM. Most recently developed CBR systems retrieve previous cases to provide decision support [1, 32]. The retrieval of relevant previous cases is critical to the success of a CBR system [39]. Central to retrieval methodologies are the search for and filtering of previous cases, and an assessment of the similarities of a new case with previous cases. Production rules are essential to this filtering and assessment process [29]. However, the acquisition of production rules creates a bottleneck in CBR systems development. In this paper, to eliminate the need for production rules,


A.R. Montazemi, K.M. Gupta / Retrieval in case-based reasoning systems

Figure 1. Processes in a CBR system.

decision theoretic techniques are used for similarity assessment, and a constraint-based methodology is proposed for filtering previous cases. Explanation-based learning is used to acquire these constraints, thereby eliminating the difficulty associated with the use of production rules in the filtering process. The paper is structured as follows: section 2 provides an overview of retrieval methodologies; section 3 presents the methodologies for similarity assessment and filtering; section 4 describes our investigation of the effectiveness of the proposed methodology; and section 5 concludes the article.

2. Retrieval methodologies overview

The aim of case-based retrieval is to retrieve the most useful previous cases towards the optimal resolution of a new case [23, 31] and to ignore those previous cases that are irrelevant [33]. Retrieval in a CBR system takes place as follows. Based on a description of the new case, the case-base is searched for previous cases that have the potential to provide decision support (see figure 2). Typically, the search is under-constrained and a large number of previous cases are retrieved [5]. It is, however, possible to filter the previous cases based on exclusion criteria [49]; this involves comparison and filtering [6]. The previous cases that remain after filtering are matched and ranked in order of decreasing degree of similarity. Matching is the process that assesses the degree of similarity of a potentially useful previous case with the new case.
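The retrieval sequence just described (search, then filter by exclusion criteria, then match and rank) can be sketched in code. This is our own illustration of the process, not the authors' implementation; the function names, the dict-based case representation, and the toy predicate and similarity functions are all hypothetical.

```python
# Sketch of the CBR retrieval pipeline: search, filter, match, rank.
# The passes_filter and similarity functions are illustrative placeholders.

def retrieve(new_case, case_base, passes_filter, similarity, top_k=3):
    """Return up to top_k previous cases ranked by decreasing similarity."""
    # 1. Search: gather candidate previous cases (here: the whole base).
    candidates = list(case_base)
    # 2. Filter: drop candidates excluded by domain criteria.
    candidates = [c for c in candidates if passes_filter(new_case, c)]
    # 3. Match and rank: score each candidate, most similar first.
    ranked = sorted(candidates, key=lambda c: similarity(new_case, c),
                    reverse=True)
    return ranked[:top_k]

# Toy usage: cases as dicts holding a single descriptor score "x".
base = [{"id": 1, "x": 0.9}, {"id": 2, "x": 0.2}, {"id": 3, "x": 0.8}]
new = {"x": 1.0}
result = retrieve(new, base,
                  passes_filter=lambda n, c: c["x"] > 0.1,
                  similarity=lambda n, c: 1 - abs(n["x"] - c["x"]))
print([c["id"] for c in result])  # [1, 3, 2]
```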


Figure 2. Retrieval in CBR.

2.1. Matching

A case can be considered a schema that consists of a set of attribute-value pairs (i.e., descriptors) [21, 29]. For example, in a credit assessment decision scenario, a loan manager may assess several attribute-value pairs (e.g., the attribute "Character of the applicant" has the value "average"). Matching establishes the similarity of the schema of the new case with the schemata of previous cases, and involves two steps: (1) assessment of the similarity of the schemata of the new case and the previous case along the descriptors, and (2) assessment of the overall similarity of the schemata by a matching function. Similarity of the schemata of two cases along descriptors has been assessed by domain-specific matching rules (e.g., [53], JULIA [25], and PROTOS [40]). For example, a matching rule can determine that the descriptor "color of the object" with the value orange is very similar to the same descriptor with the value red. However, a large number of matching rules would be required to determine the similarity of all possible pairs of values for the descriptor "color of the object". Acquisition of matching rules, therefore, can be an onerous task [44]. The overall similarity of a new case with a previous case is assessed by aggregating the similarity along descriptors with a matching function. In this investigation, we used the nearest-neighbour (NN) matching function to assess overall similarity. Nearest-neighbour matching [17] is widely used in current CBR systems [11, 16, 26, 29]. The overall similarity OS^NN of the new case n and the previous case p_k using the NN matching function is as follows:

    OS^NN(n, p_k) = [ Σ_{i=1}^{m} w_i^k · sim(a_i^n, a_i^{p_k}) ] / [ Σ_{i=1}^{m} w_i^k ],    (1)

where sim(a_i^n, a_i^{p_k}) is the similarity of the new case with previous case k along the ith descriptor pair, and w_i^k is the importance of the ith descriptor. The NN matching function assesses overall similarity by a weighted linear combination of similarities along descriptors. This is similar to the methods used in multi-attribute decision making. Weighting represents the degree of importance of the descriptors towards the goal of a decision problem. The NN matching function has been adopted from the pattern-matching literature. In pattern matching, all previous cases are represented by the same set of descriptors, and their importance is determined by an inductive machine learning technique that minimizes classification error [17]. However, this approach is not feasible in CBR systems, because the number of descriptors that could be used to describe previous cases is large, and only a subset of descriptors can be used to describe a particular previous case. The importance of a descriptor in CBR systems has been used at two levels of granularity: global and local [29]. At the global level, the importance of a descriptor is the same irrespective of the previous case in which it is used, whereas at the local level, the importance of a descriptor is specific to a previous case. The global level is coarse and context insensitive. In contrast, the local level is fine grained and context sensitive (e.g., MEDIATOR [28]). In some CBR systems, the degree of importance of a descriptor at the local level is acquired from the domain expert by a knowledge engineer [26, 40]. However, assessments of the importance of a descriptor provided by a domain expert can be noisy [7, 10]. Furthermore, the importance of descriptors acquired from domain experts is static and independent of the previous cases in the case base. An alternative approach is to determine the degree of importance dynamically, during retrieval.
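The NN matching function of equation (1) reduces, in code, to a weighted average of per-descriptor similarities. The sketch below is our own illustration (the function and variable names are not from the paper):

```python
def overall_similarity(sims, weights):
    """Nearest-neighbour matching (equation (1)): weighted average of the
    per-descriptor similarities sim(a_i^n, a_i^{p_k}) with weights w_i^k."""
    assert len(sims) == len(weights)
    total_w = sum(weights)
    return sum(w * s for w, s in zip(weights, sims)) / total_w

# With equal weights this reduces to a plain average of the similarities.
print(round(overall_similarity([1.0, 0.8, 1.0, 0.3], [1, 1, 1, 1]), 3))  # 0.775
```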
For example, domain-specific rules are used to determine a descriptor's importance during retrieval in HYPO [3]. This approach takes the context of the new case into account during retrieval. Nonetheless, the need to determine rules for this methodology limits its application. New cases are matched with previous cases with the purpose of applying the solution of a previous case to the new case [48]. Matching in CBR systems is only partial, because CBR systems support ill-structured decision problems. This lack of structure could lead to instances in which, despite a high degree of overall similarity, the solution of a previous case is not applicable to the new case. To deal with such instances, filtering is necessary.

2.2. Comparison and filtering

Improper application of the solution of a previous case to a new case results in what is called over-generalization. Over-generalization can occur when the solution of a previous case is not applicable because of certain conditions existing in the new case [19]. For example, in the credit assessment decision environment, the decision rules used to assess the loan application of a middle-aged entrepreneur may not be applicable to an assessment of a loan application from a young entrepreneur, despite


a high degree of similarity along other descriptors. To prevent this type of over-generalization, production rules are used to assess the validity of a previous case toward the new case [13, 19, 40, 49]. Production rules have two limitations: first, they assume well-defined domain knowledge; second, their acquisition from a domain expert is fraught with difficulty [8]. This is why, instead of rules, we propose the use of constraints that take into account the imperfections of domain knowledge, and provide an explanation-based learning method to acquire these constraints.

3. Proposed retrieval methodologies

In response to the description of a new case that consists of a set of descriptors, the case memory is searched to determine a set of candidate previous cases potentially useful for providing decision support in the new case. Candidate previous cases are matched with the new case and rank ordered in decreasing degree of similarity. This is effected by the proposed retrieval methodologies, which have the following components:

(1) Similarity assessment along descriptors: a multi-attribute decision making technique to determine the closeness of the new case and a previous case along descriptors.

(2) Contextual determination of importance of descriptors: a domain-independent technique to determine, during retrieval, the importance of descriptors in previous cases in the context of a new case.

(3) Acquisition and application of validity constraints: an explanation-based learning method to acquire the validity constraints that prevent over-generalization.

3.1. Similarity assessment

The values of descriptors are depicted by a variety of scoring scales; these can be numeric, ordinal, or nominal valued [29]. For example, in an assessment of credit worthiness, the descriptor "character of applicant" is measured on the following scale of ordinal linguistic values (scored 0 to 5):

Very poor (0) | Poor (1) | Average (2) | Good (3) | Very good (4) | Excellent (5)

Each descriptor i has an acceptable range R_i. Let A^n be the set of descriptors in the problem-schema of the new case such that:

    A^n = {a_i^n},    i = 1, ..., m,


where a_i^n is the ith descriptor of the new case, and x_i^n is its score on the scale. Let there be l previous cases that are candidates for matching, and let A^{p_k} be the set of descriptors for the problem-schema of the kth candidate previous case:

    A^{p_k} = {a_i^{p_k}},    i = 1, ..., m,  k = 1, ..., l (candidate previous cases),

where a_i^{p_k} is the ith descriptor of the kth candidate previous case, and x_i^{p_k} its score. x_i^n is the desired or ideal score that should be achieved by a candidate previous case [57]. The degree of closeness c_i^k along the ith descriptor for the kth candidate previous case is determined as follows:

    c_i^k = 1 − |x_i^n − x_i^{p_k}| / R_i,    ∀i and k,    (2)

where R_i is the range of the scoring scale of the ith descriptor. For example, if the descriptor "character of applicant" with the value excellent (i.e., score 5) in the new case is matched with the same descriptor in a previous case with the value average (i.e., score 2), then their closeness is c_i^k = 1 − |5 − 2|/5 = 0.4. The same result in the form of a production rule is as follows:

IF "character of applicant in the new case" is excellent
AND "character of applicant in the previous case" is average
THEN "the closeness along character of applicant" is 0.4.

However, ten rules would be required to accommodate all possible combinations (i.e., C(5,2) = 10) of the values of the two descriptors in a rule-based system. This example restricted the possible values of the descriptor (i.e., "character of applicant") to five discrete values. This restriction is necessary for encoding the combinatoric space of the explicit representation used by production rules, and it makes the rule-based representation brittle [8, 54]. An attractive feature of our model is that the descriptors in equation (2) can take any value within their range R_i. Therefore, the method applies equally to ordinal-valued and numeric-valued descriptors. Furthermore, the addition of a new value does not require the specification of any additional information, whereas a rule-based representation would require the addition of new rules and the modification of existing ones. Thus, the degree-of-closeness methodology overcomes the brittleness associated with the rule-based representation. In existing CBR systems, the matching of a nominal-valued descriptor is exact [11].
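Equation (2) can be stated directly as a one-line function. The sketch below is our illustration; the function name and arguments are ours, not the paper's:

```python
def closeness(x_new, x_prev, scale_range):
    """Degree of closeness along a descriptor (equation (2)):
    c_i^k = 1 - |x_i^n - x_i^{p_k}| / R_i."""
    return 1 - abs(x_new - x_prev) / scale_range

# "Character of applicant": excellent (score 5) in the new case versus
# average (score 2) in the previous case, on a scale with range R_i = 5.
print(closeness(5, 2, 5))  # 0.4
```

Because the function accepts any value within the range, no new rules are needed when a descriptor takes a previously unseen value.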
For example, the descriptor "City" with the value "Los Angeles" matches the value "Los Angeles" but does not match "San Diego". This can be dealt with, within our representation, by converting a nominal-valued descriptor into binary-valued descriptors. For example, the descriptor "City" with the value "Los Angeles" is converted to the descriptor "Los Angeles City" with values "True" and "False". However, representation of cases by nominal-valued descriptors should be avoided because they do not allow meaningful similarity assessment. Instead, any nominal-valued descriptor can be decomposed into component descriptors that allow comparison of two different cases [46]. For example, depending on the system objective, the nominal-valued descriptor "City" can be decomposed into component ordinal-valued descriptors such as "Economic prosperity", "Population density", and "Size". Another feature of our implicit model of closeness is the possibility of deriving the degree of closeness in the form of fuzzy membership. This enables us to adapt the retrieval of previous cases to the requirements of the DM.

3.1.1. Degree of closeness adaptation

A DM, faced with an ill-structured decision problem, engages in both a knowledge search and a problem search [37]. This leads to the preparation versus deliberation trade-off (see figure 3). Preparation is the immediate knowledge encoded in memory (i.e., the domain knowledge of the DM), and deliberation is the

Figure 3. Preparation versus deliberation trade-off.

knowledge obtained by considering multiple situations (i.e., previous cases) through search. With an exponential fan-out of the search, each additional increment of coverage requires considering a larger number of previous cases. Therefore, the requirement for previous cases for DM "A" (see figure 3) increases as the complexity of the new case increases. Furthermore, the curves in figure 3 are equiperformance isobars. This means that DMs (e.g., A and B) with different levels of immediate knowledge can attain the same task performance (i.e., solution to a new case) by searching a different number of previous cases. Therefore, a CBR retrieval mechanism


should adapt its matching processes to support the requisite variety of DMs faced with decision problems of varying degrees of complexity. Although this adaptation is extremely difficult to implement when explicit production rules are used, our matching function can support it with ease. Our matching function interprets the degree of closeness c_i^k as the degree of membership of the kth previous case for the ith descriptor. This is akin to fuzzy set theory. A fuzzy subset of some universe U is a collection of objects from U (the set part) such that each object is associated with a degree of membership (the fuzzy part). The degree of membership is always a real number between zero and one, and it measures the extent to which an element belongs to a fuzzy set. In our case, the universe is the set of l previous cases. The degree of closeness c_i^k determined by equation (2) is based on a membership function that is implicitly generated for each possible combination of descriptor scores. Consider five candidate previous cases with closeness values c_i^1 = 1.0, c_i^2 = 0.9, c_i^3 = 0.2, c_i^4 = 0.8, and c_i^5 = 0.3; their degrees of membership for the ith descriptor of the new case are then 1.0, 0.9, 0.2, 0.8, and 0.3, respectively. This means that, with regard to the ith descriptor, previous cases three and five have little similarity with the new case, and previous case one is very similar to the new case. The attractiveness of fuzzy set theory is that it provides two powerful operations for adjusting the degree of membership in relation to task complexity: concentration and dilation [47]. The concentration operation decreases the membership of descriptors with low degrees of membership proportionally more than that of descriptors with high degrees of membership. Dilation is the opposite: it increases the membership of descriptors with low degrees of membership proportionally more than that of descriptors with high degrees of membership.
The membership function of c_i^k in equation (2) can be changed by a power parameter t as follows:

    c_i^k → (c_i^k)^t,    0 ≤ t < ∞,    (3)

where "→" means "is replaced by". The concentration operation is performed with t > 1, and the dilation operation with t < 1. The concentration operation is suitable for retrieving previous cases for simple new cases with a low need for information search (see figure 3). The dilation operation can increase the information search for complex new cases by increasing the membership of descriptors with low c_i^k, thereby increasing the chance of selecting the kth previous case for retrieval. For instance, when a dilation of t = 0.01 is applied to the closeness vector of five candidate previous cases along a descriptor, the differences between the closeness values all but disappear, reducing the contrast among the candidate previous cases along that descriptor (see table 1). However, when a concentration operation of t = 2.0 is applied, the closeness values along a descriptor move apart, giving a sharper contrast between the candidate previous cases.
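The effect of the power parameter t of equation (3) can be demonstrated directly; the sketch below is our illustration, using the "Normal" closeness row of table 1 as input (printed values may differ from the table in the last rounded digit):

```python
def adjust_closeness(c_values, t):
    """Apply the power parameter t of equation (3): c -> c**t.
    t > 1 concentrates (sharpens contrast); t < 1 dilates (reduces it)."""
    return [c ** t for c in c_values]

closeness_vec = [0.20, 0.60, 0.65, 0.90, 1.00]
# Dilation: low closeness values are pulled up towards 1, contrast fades.
print([round(c, 2) for c in adjust_closeness(closeness_vec, 0.01)])
# Concentration: low values drop fast, contrast sharpens.
print([round(c, 2) for c in adjust_closeness(closeness_vec, 2.00)])
# -> [0.04, 0.36, 0.42, 0.81, 1.0]
```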


Table 1
Effect of concentration and dilation on the degree of closeness along a descriptor.

Operation        Power parameter t    Previous cases
                                      1      2      3      4      5
Normal           1.00                 0.20   0.60   0.65   0.90   1.00
Dilation         0.90                 0.23   0.63   0.68   0.91   1.00
                 0.50                 0.45   0.77   0.81   0.95   1.00
                 0.01                 0.98   0.99   0.99   0.99   1.00
Concentration    1.10                 0.17   0.57   0.62   0.89   1.00
                 1.50                 0.09   0.46   0.52   0.85   1.00
                 2.00                 0.04   0.36   0.42   0.81   1.00

The overall similarity is determined by aggregating the similarity along the descriptors by the NN matching function. The NN matching function determines the overall similarity by a weighted linear combination of similarity along the descriptors. Equation (1) is thus transformed as follows:

    OS^NN(n, p_k) = [ Σ_{i=1}^{m} w_i^k · c_i^k ] / [ Σ_{i=1}^{m} w_i^k ].    (4)

The question is how to determine an appropriate set of weights w_i^k (i.e., the importance of the descriptors).

3.2. Contextual importance determination

Typically, the importance of the descriptors in a previous case is acquired directly from a domain expert [29]. This approach ignores the other previous cases in the case-base, as well as the possible effect of the context of the new case on the importance of descriptors. Matching should consider the similarities and differences of the new case with a candidate previous case, as well as the similarities and differences among the candidate previous cases [2]. This assertion is based on the view that previous cases compete with each other for the best match during retrieval [31, 55]. DMs find the evaluation of alternatives (e.g., previous cases) easier to interpret when the decision making methodology produces the greatest divergence in their attractiveness [20, 57]. Therefore, the competition among the candidate previous cases should tend to produce maximum divergence in their overall similarity. Viewed as an information processing activity, the decision-relevant information about the previous cases is transmitted, perceived, and processed via their descriptors. Thus, the ability of a descriptor to provide information useful for discriminating the previous


cases is a measure of its importance. The variation of the degree of closeness along a descriptor across the candidate previous cases can be used to measure this ability. The standard deviation σ i of the closeness along a descriptor is a measure of its variation:

    σ_i = sqrt( (1/l) Σ_{k=1}^{l} (c_i^k − c̄_i)^2 ),    ∀i,    (5)

where

    c̄_i = (1/l) Σ_{k=1}^{l} c_i^k,

and l is the number of candidate previous cases. Therefore, the contextually determined importance wd_i^k of the ith descriptor is as follows:

    wd_i^k = σ_i / Σ_{I ∈ A^n ∩ A^{p_k}} σ_I,    ∀k.    (6)

The contextual determination of the importance of a descriptor is illustrated by the following example. Consider four candidate previous cases with all descriptors equally important (i.e., w_i^k = 1, ∀k, i) and whose degrees of closeness along the descriptors are shown in table 2(a). The closeness c_3 has the greatest variation (σ_3 = 0.4085); hence it has the greatest ability to discriminate the candidate previous cases. Closeness c_1 has no variation across the candidate previous cases (σ_1 = 0.0); hence it cannot discriminate them. Table 2(b) shows the contextually determined weights wd_i^k for the previous cases of table 2(a). For example, the weights of descriptors that are equally important in table 2(a) (e.g., w_2^3 = w_3^3 = 1) have changed to the values shown in table 2(b) (e.g., wd_2^3 = 0.9684 and wd_3^3 = 1.7692). Furthermore, with equal weighting of descriptors, the overall similarity of previous case three and previous case four is the same (i.e., OS^NN(N, 3) = OS^NN(N, 4) = 0.775) (see table 2(a)). The changes resulting from the contextual determination of importance make the overall similarity of the third previous case (OS^NN(N, 3) = 0.731) greater than that of the fourth previous case (OS^NN(N, 4) = 0.623) (see table 2(b)). The methodologies described in sections 3.1 and 3.2 are domain independent. They not only eliminate the use of production rules, but also make it possible to take into account the individual preferences of the DM and to incorporate the effect of the context of a new case to improve retrieval in CBR systems. A judicious combination of domain-independent techniques with domain knowledge can significantly improve retrieval in a CBR system. The underlying issue, however, is the acquisition of domain knowledge.
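As a check on the worked example, the computations of equations (5) and (6) can be sketched as follows. The printed weights in table 2(b) appear to be rescaled so that each case's weights average to one; that rescaling (the `len(present)` factor below) is our inference from the printed values, not an equation stated in the text, and it cancels in the normalized matching function of equation (4):

```python
from statistics import pstdev  # population standard deviation, as in eq. (5)

def contextual_weights(closeness_matrix):
    """Contextual importance of descriptors (equations (5) and (6)).

    closeness_matrix[k][i] is the closeness of previous case k to the new
    case along descriptor i; None marks a descriptor absent from that case
    (treated as 0 when computing the spread, matching table 2(a)).
    Weights are rescaled to average 1 over each case's own descriptors."""
    m = len(closeness_matrix[0])
    # Equation (5): spread of closeness along each descriptor across cases.
    sigma = [pstdev([(0.0 if row[i] is None else row[i])
                     for row in closeness_matrix]) for i in range(m)]
    weights = []
    for row in closeness_matrix:
        present = [i for i in range(m) if row[i] is not None]
        total = sum(sigma[i] for i in present)  # denominator of eq. (6)
        weights.append([round(len(present) * sigma[i] / total, 4)
                        if i in present else None for i in range(m)])
    return weights

# Closeness values from table 2(a); None = descriptor not in the case.
c = [[1, 0.6, None, 0.5],
     [1, 0.4, None, None],
     [1, 0.8, 1.0, 0.3],
     [1, 1.0, 0.3, 0.8]]
for w in contextual_weights(c):
    print(w)  # rows match table 2(b) up to rounding
```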
In the following section, we present a technique to represent the domain knowledge in the form of validity constraints and provide a machine learning technique to acquire them.


Table 2
Contextual determination of importance of descriptors.

(a) Variation of closeness (c_i^k) along the descriptors of candidate previous cases with equal weights (w_i^k = 1), and their overall similarity (OS^NN) with the new case.

Previous case    c1    c2      c3      c4      OS^NN
1                1     0.6     0.0*    0.5     0.700
2                1     0.4     0.0#    0.0#    0.700
3                1     0.8     1.0     0.3     0.775
4                1     1.0     0.3     0.8     0.775
σ                0     0.2236  0.4085  0.2915

* Descriptor not included in previous case 1; Σ_{i=1}^{3} w_i^1 = 3.
# Descriptors not included in previous case 2; Σ_{i=1}^{2} w_i^2 = 2.

(b) Contextually determined importance of the descriptors (wd_i^k) of the candidate previous cases and its effect on their overall similarity (OS^NN).

Previous case    wd1    wd2      wd3      wd4      OS^NN
1                0      1.3022   0.0000   1.6977   0.543
2                0      2.0000   0.0000   0.0000   0.400
3                0      0.9684   1.7692   1.2624   0.731
4                0      0.9684   1.7692   1.2624   0.623

3.3. Validity constraints

Over-generalization can occur when the solution of the previous case is not applicable because certain conditions in the new case prevent it [19]. Validity constraints are conditions (i.e., domain knowledge) that prevent over-generalization. Violation of these constraints invalidates the application of the solution of a previous case as a solution to a new case. In other words, validity constraints specialize a previous case, or limit its applicability [15]. For example, in the decision domain of credit assessment, a particular previous case is applicable to a new case only when the "age of the applicant" is between "35 and 50". The same can be represented in the form of a production rule as follows:

IF "Age of applicant in the new case" is not "between 35 and 50"
THEN filter the previous case.
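Representing validity constraints as bounds on descriptor scores, rather than as rules, makes the filtering check mechanical. The sketch below is our own illustration of such a bound check; the dict representation and function name are hypothetical:

```python
def passes_constraints(new_case, constraints):
    """Return True if the new case violates no validity constraint.
    Constraints map a descriptor to (lower, upper) bounds on its score;
    None means unbounded on that side. A violation filters out the
    previous case to which these constraints belong."""
    for descriptor, (lower, upper) in constraints.items():
        value = new_case.get(descriptor)
        if value is None:
            continue  # descriptor absent from the new case: cannot fire
        if lower is not None and value < lower:
            return False
        if upper is not None and value > upper:
            return False
    return True

# A previous case applicable only when the applicant's age is 35-50 and
# no collateral is offered (lower and upper bounds both 0).
constraints = {"age": (35, 50), "collateral": (0, 0)}
print(passes_constraints({"age": 42, "collateral": 0}, constraints))  # True
print(passes_constraints({"age": 28, "collateral": 0}, constraints))  # False
```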


Validity constraints are represented as bounds on the descriptors in the previous cases. For example, a constraint on the descriptor "Age of applicant" is that it should be greater than 35 (i.e., lower bound 35). Another example could be one in which the previous case is applicable only if the new applicant provides no "collateral" (i.e., the lower and upper bounds of the descriptor "collateral" are both 0). Validity constraint acquisition from a domain expert requires two steps. First, the expert must recognize that a validity constraint is required; to do so, the expert must recall and structure all possible scenarios to assess the applicability of a previous case. This is difficult because, in an ill-structured decision problem, there are potentially a large number of scenarios, and because DMs are not good at unaided recall [41]. Second, the expert must determine the bounds of a validity constraint. This poses considerable difficulty because the expert must select, from a large number of possible values, bounds that correctly generalize or specialize a previous case. Currently, no techniques are available for the identification and specification of validity constraints. The following section provides a methodology to acquire validity constraints from a domain expert.

3.3.1. Validity constraints acquisition

The machine learning approach of explanation-based learning (EBL) [14] can be used to acquire validity constraints. EBL uses examples to construct an explanation of a concept; the learned explanation is then applied to other instances of the same concept [35]. Validity constraints are explanations of why a particular previous case is not applicable to a new case. Acquiring validity constraints for the stored previous cases proceeds in three steps:

Step 1. In response to a new case, the CBR system retrieves previous cases and presents them to the domain expert. The new case provides a possible scenario that enables the recognition of situations where validity constraints are needed. This improves the expert's performance, since recognition is much easier than recall [41].

Step 2. The domain expert identifies the irrelevant retrieved previous cases.

Step 3. For each irrelevant previous case, the domain expert provides an explanation. The explanation is structured as follows: if the score of the descriptor in the new case is greater (or lower) than the score of the descriptor in the previous case, then the score of the descriptor in the new case becomes the upper (or lower) bound of the validity constraint. For example, in the decision domain of credit assessment, when the CBR system is used to assess the application of a young entrepreneur and a previous case of a middle-aged entrepreneur is retrieved, the domain expert explains that the previous case is not applicable to the new case because the age of the applicant in the new case is too low. Therefore, the lower bound of the validity constraint


on "age" is set to the "age" of the applicant in the new case. By repeated application of new scenarios, the appropriate bounds are deduced. To learn the validity constraints applicable to all the previous cases, each previous case is used as a training case (i.e., example) to retrieve the remaining previous cases, and steps 1 through 3 are followed.

4. Empirical evaluation

4.1. Objectives

An empirical evaluation was conducted to determine: (1) the effectiveness of the closeness model and the effect of the power parameter t on retrieval performance; (2) the effectiveness of the contextual determination of importance of descriptors; and (3) the effect of validity constraints on retrieval performance.

4.2. Methodology

4.2.1. Environment

We developed a CBR system to assist the service personnel of a manufacturing and service organization in diagnosing and repairing alternating current (AC) motors. AC motors are used in diverse applications that range from driving exhaust fans in mine shafts to driving pumps in sewage stations. Therefore, a large number of combinations of symptoms and faults are possible. Experience with the applications and the equipment that utilize AC motors is required to identify faults and repair them. Descriptions of trouble-shooting episodes were available from service reports, in which details of diagnostic and repair activity were recorded. These service reports formed the basis of our case-base.

4.2.2. Subjects

The organization under study has several regional service divisions across Canada and a central engineering services division in Ontario. Each division has a team of service engineers who trouble-shoot a variety of electrical machinery. Problems that are not solved at the regional level are referred to Central Engineering Services. Ten trouble-shooters from Central Engineering Services and Regional Field Services participated in this investigation. All subjects had a college or university degree. Their trouble-shooting experience ranged from 4 to 30 years, with an average of 10 years. All subjects were male.


4.2.3. Instruments

A single-item questionnaire with a 7-point Likert scale [18] was used to measure the usefulness of a retrieved previous case towards the new decision problem (test case). This is based on reported findings that the usefulness and the relevance of retrieved information are equivalent [43]. The scale was as follows:

The previous case ___________ was useful towards solving and analyzing the new case.

likely: 3 (extremely) | 2 (quite) | 1 (slightly) | 0 (neither) | -1 (slightly) | -2 (quite) | -3 (extremely) :unlikely

The usefulness of the previous cases was used to determine the rank ordering of previous cases towards analyzing and solving a new case.

4.2.4. Tool
A Case-Based Reasoning Shell (CBRS) was developed that incorporated the matching methodologies presented in section 3. The CBRS accepts the description of a new case, retrieves the most appropriate previous cases, and compares and analyzes the new case on the basis of the retrieved previous cases. CBRS can also explain its retrieval and matching process. CBRS was used to develop the Trouble-shooting and Repair Assistant for AC motors (TRAAC). On the basis of a general diagnostic model [42], TRAAC assists the trouble-shooter in hypothesizing faults and gathering evidence. TRAAC then retrieves previous cases to confirm or reject the proffered hypotheses. An analysis of the new case based on the retrieved previous cases is also provided. TRAAC also supports the identification of fault(s) and assists in formulating a repair plan.

The TRAAC system was developed by priming CBRS with AC motor knowledge. Thirty-five representative previous cases from the available service reports were selected in consultation with experts for priming its knowledge base. A vocabulary of 152 descriptors was developed, and these were used to represent the schema of previous cases. The number of descriptors in the schema of a previous case ranged from 6 to 17. The importance (i.e., weights) of descriptors for these previous cases was acquired from the domain expert by use of a 5-point Likert-type scale. The expert's responses were converted into the relative importance of the descriptors. Next, validity constraints were acquired from the domain expert by means of the methodology described in section 3.3.1.

4.2.5. Test
Eleven test cases (i.e., new cases) representative of problems that had occurred in the field were selected for our investigation. The diagnosis and the repair solution used in the test cases were known.
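The conversion of the expert's 5-point importance ratings into relative importances (section 4.2.4) is not spelled out in the paper; a simple proportional normalization, offered here only as an assumption, would look like the following (the descriptor names are hypothetical):

```python
def relative_importance(ratings):
    """Convert 5-point Likert importance ratings (1-5) for one case's
    descriptors into relative weights that sum to 1. The proportional
    normalization is an assumption; the paper does not specify the
    conversion actually used."""
    total = sum(ratings.values())
    return {descriptor: rating / total for descriptor, rating in ratings.items()}

# Hypothetical descriptor ratings for one previous case.
weights = relative_importance({"bearing noise": 5, "winding temp": 3, "load type": 2})
```

Any scheme that preserves the expert's ordering and normalizes the weights would serve the same role in the matching computation.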
To begin with, the functionality and the utility of TRAAC were demonstrated to the subject by reference to a trouble-shooting event that had occurred in the field. The subject was then given a warm-up test case to familiarize himself with the functionality of the system; this test case was removed from subsequent analysis. After the subject was comfortable with the system, he was given one test case at a time. Learning during the evaluation can affect the subject's response [43]; hence, the test cases were assigned in a random order. The subject was given the general specification of the AC motor application and the initial report of symptoms for each test case. Based on the recommendations of TRAAC, the subject selected descriptors that best described the test scenario. To simulate the conditions in the field, the subject was allowed to ask for any information regarding tests performed and observations that were made. The responses to such questions were based on the information contained in the service reports.

Based on the test case description provided by the subject, TRAAC retrieved a subset of the 35 previous cases and presented them in a random order to the subject. The subset included all previous cases that matched at least one descriptor of the new case; therefore, all potentially useful cases were retrieved. After reading the content of each retrieved previous case, the subject rated it by means of the usefulness questionnaire. When the responses indicated ties in the usefulness rating, the subject was given the option of expressing a preferred order among the tied retrieved previous cases. On completing the rating, the subject wrote an analysis of the new case and suggested a repair technique. The experimental session, comprising the description of the new case, the matching process, and the retrieved previous cases, was recorded by TRAAC for later analysis. On average, it took 45 minutes to complete the analysis of a test case. Due to lack of time, the number of test cases assessed by the subjects varied between four and ten. A total of 80 cases were analyzed by the 10 subjects.

4.2.6. Measure of CBR retrieval performance
Although a number of CBR systems have been reported in the literature, only a few attempted to evaluate their performance [12]. Some CBR systems use classification accuracy as a measure of retrieval performance [40, 51], while others use recall and precision [33, 50]. Recall and precision have been adopted from the information retrieval literature (e.g., see [27, 38, 45]). However, these measures are not suitable when multiple previous cases are retrieved and rank ordered by their degree of usefulness, because recall and precision ignore the rank ordering of the retrieved previous cases [22]. Furthermore, the use of these two measures makes the comparison of alternative retrieval methodologies ambiguous.

A measure of retrieval performance of a CBR system should incorporate the following four components: (1) retrieved previous cases useful to the new case; (2) retrieved previous cases not useful to the new case; (3) useful previous cases not retrieved; and (4) agreement of the ranking produced by the CBR system with the ranking expected by the DM. We used Kendall's tau with ties [24] to measure retrieval performance; it incorporates all the above components [22]. Kendall's tau with ties (τ) measures the agreement of judgments from two sources, each of which produces an ordinal ranking of a set of items. The two sources in the CBR retrieval evaluation are the rank ordering of previous cases determined by the usefulness ratings provided by the DM and the rank ordering of the previous cases retrieved by the CBR system. The numbers of agreements and disagreements between the two rank orders are determined, and the statistical correlation between the two rank orderings is then measured.

4.2.7. Experimental design
The following three computations were made towards the empirical evaluation of our objectives as presented in section 4.1:
(1) Retrieval was performed without the validity constraints, and the power parameter "t" was varied from 0.5 to 2 (computation set 1).
(2) Retrieval was performed without the validity constraints, using both contextually determined importance of descriptors and importance of descriptors acquired from the domain experts (computation set 2).
(3) Retrieval was performed with the validity constraints. The power parameter "t" was set equal to 1, and retrieval was performed without any contextually determined importance weights (computation set 3).

4.3. Analysis
The solution provided by the subjects for each test case was scored against the actual solution implemented in the field. The average score was 87.38%. The question arises whether there were any significant differences among these solutions. To assess differences between subjects, an analysis of variance (ANOVA) of their scores was conducted. The results indicated that no significant difference existed among the subjects (p = 0.139, F = 1.58).
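The rank-agreement measure of section 4.2.6 can be sketched in code. The exact variant in [24] is not reproduced in the paper, so the tie handling below (the tau-b convention, which discounts pairs tied in only one ranking and drops pairs tied in both) is an assumption; the example rankings are hypothetical:

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau with ties (tau-b) between two rankings x and y of the
    same retrieved previous cases (lower rank = more useful). Assumes at
    least one untied pair exists in each ranking."""
    concordant = discordant = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx * dy > 0:
            concordant += 1       # pair ordered the same way by both sources
        elif dx * dy < 0:
            discordant += 1       # pair ordered oppositely
        elif dx == 0 and dy == 0:
            pass                  # tied in both rankings: dropped from all terms
        elif dx == 0:
            ties_x += 1           # tied only in ranking x
        else:
            ties_y += 1           # tied only in ranking y
    denom = sqrt((concordant + discordant + ties_x) *
                 (concordant + discordant + ties_y))
    return (concordant - discordant) / denom

# DM ranking vs. system ranking of four retrieved previous cases.
tau = kendall_tau_b([1, 2, 3, 4], [1, 3, 2, 4])
```

Perfect agreement yields τ = 1, a complete reversal yields τ = -1, and one swapped pair among four items (as above) yields τ = 2/3.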
Subjects were thus uniformly competent in assessing the test cases (i.e., the new cases). Retrieval was carried out for computation sets 1 to 3 by reference to the descriptions of the test cases provided by the subjects. τ was computed for each matching methodology, and comparisons were made using a pairwise sign test. The analyses of computation sets 1 to 3 are presented next.

4.3.1. Computation set 1
First, the effectiveness of similarity assessment along descriptors was assessed. As noted earlier, the retrieved previous cases were randomly presented to the subjects.


Table 3
Agreement of previous cases selected by the subjects with those ranked highest by TRAAC using different values of the power parameter.

    Power parameter t    Agreement (%)
    0.50                 69.18
    0.75                 69.06
    1.00                 67.78
    1.50                 66.41
    2.00                 65.22

Table 4
Effect of power parameter t on retrieval performance.

    Comparison           Wins       Losses     Draws      z-value (level of
    A vs. B              τA > τB    τA < τB    τA = τB    significance p)
    t = 0.5 vs. t = 1    20         10         50         1.4605 (0.0721)
    t = 0.75 vs. t = 1   16          7         57         1.6667 (0.0475)
    t = 1 vs. t = 1.5    18         10         52         1.3228 (0.0951)
    t = 1 vs. t = 2.0    29         14         47         2.1347 (0.0166)
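The sensitivity to "t" in table 4 reflects the power transform in the closeness model of section 3. The exact aggregation formula is not reproduced in this section, so the weighted power form below is an assumption used only to illustrate why small "t" compresses similarities toward 1.0; the descriptor closeness values are hypothetical:

```python
def overall_similarity(closeness, weights, t):
    """Aggregate per-descriptor closeness values (each in (0, 1]) under a
    power transform with exponent t. This weighted power form is an
    assumption, not necessarily the exact formula of section 3."""
    return sum(w * c ** t for c, w in zip(closeness, weights))

# Two hypothetical candidate previous cases, equally weighted descriptors.
w = [0.5, 0.5]
a, b = [0.9, 0.4], [0.6, 0.6]

def gap(t):
    """Separation between the two candidates' overall similarities."""
    return abs(overall_similarity(a, w, t) - overall_similarity(b, w, t))

# As t -> 0, every c**t -> 1, so the candidates become nearly
# indistinguishable and the rank ordering degrades; larger t widens
# the separation between them.
print(gap(0.01), gap(1.0), gap(2.0))
```

Under this form, lowering "t" toward 0 drives every per-descriptor closeness toward 1, which is consistent with the degraded rank ordering at t = 0.5 reported in table 4.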

On average, the subjects selected 3.13 retrieved previous cases as being relevant for solving a test case. Subsequent analysis showed that these were among the 67.78% of previous cases ranked highest by TRAAC when we set t = 1 (see table 3); i.e., on average, the 3 previous cases selected by a subject would have been among the first 4.5 previous cases ranked highest by our method. Therefore, the methodology of similarity assessment at t = 1 was quite effective. Next, the power parameter "t" was varied from 0.5 to 2.0, and the retrieval performance at each value was compared with the retrieval performance at t = 1 (see table 4). The results show that retrieval performance improved significantly at t = 0.75 (p = 0.0475), whereas it deteriorated significantly at t = 2.00 (p = 0.0166). A comparison of the previous cases selected by the subjects with those ranked highest by our method at t = 0.75 revealed that the cases selected by the subjects would have been among the 69.06% of previous cases ranked highest by TRAAC, an overall improvement of 1.28% over the set of test cases compared to t = 1 (see table 3). Reducing the parameter value below 0.5 would result in a drop in performance. This is because, as the parameter "t" approaches 0, the difference between cases diminishes; that is, the overall similarity of every candidate approaches 1.0. Consequently, the rank ordering of retrieved cases is adversely affected. This is evident from table 4, which shows that the proportion of losses to wins increases as the parameter is changed from 0.75 to 0.5.

The question arises whether any relationship exists between test case complexity and the power parameter "t". To answer this question, further analysis was performed, and the results were categorized into two groups. One group consisted of the test cases in which retrieval performance deteriorated at t < 1. The test cases in this group had, on average, 7.2 descriptors, and the subjects viewed 4.03 candidate previous cases. The second group consisted of the test cases in which retrieval performance deteriorated at t > 1. On average, the test cases in this group had 6.5 descriptors, and the subjects viewed 3.05 candidate previous cases. Therefore, based on the number of descriptors, the first group of test cases was more complex than the second group. Additionally, these results show that the information search (i.e., the number of retrieved cases viewed) was higher for the first group than for the second. This is in line with our conjecture in section 3.2. Therefore, parameter "t" can be used to adapt the retrieval process to the complexity of the new case.

4.3.2. Computation set 2
Table 5 shows the comparison of retrieval performance between contextually determined weights and equally weighted descriptors. Contextually determined weights improved retrieval performance; nonetheless, the improvement was not significant (p = 0.1335). A comparison of retrieval performance with contextually determined weights and with weights acquired from the domain expert showed that there was no significant difference between them (p = 0.3250).
However, retrieval using weights acquired from the domain expert is significantly better than retrieval with equally weighted descriptors (p = 0.0264). Therefore, contextually determined weights can reduce the knowledge acquisition bottleneck without a significant loss in retrieval performance.

Table 5
Contextual determination of importance of descriptors.

    Comparison                                Wins       Losses     Draws      z-value (level of
    A vs. B                                   τA > τB    τA < τB    τA = τB    significance p)
    Contextually determined weights
      vs. equally weighted descriptors        18         11         51         1.11406 (0.1335)
    Weights acquired from domain experts
      vs. contextually determined weights     23         21         36         0.45659 (0.3250)
    Weights acquired from domain expert
      vs. equally weighted descriptors        38         22         20         1.9365 (0.0264)

4.3.3. Computation set 3
Table 6 shows the effect of validity constraints on retrieval performance. The application of validity constraints improved retrieval performance significantly (p = 0.0028). This was due to the ability of validity constraints to filter out irrelevant candidate previous cases. Therefore, the EBL methodology of acquiring the validity constraints was effective.

Table 6
Effect of validity constraints on retrieval performance.

    Comparison                                Wins       Losses     Draws      z-value (level of
    A vs. B                                   τA > τB    τA < τB    τA = τB    significance p)
    With validity constraints
      vs. without validity constraints        47         22         11         2.768 (0.0028)

5. Discussion

CBR systems support ill-structured decision problems. Unlike rule-based expert systems, which are brittle [34, 54], CBR systems are flexible and able to evolve along with a domain characterized by change. Such change is typical of ill-structured decision problems. The basis of this flexibility lies in the retrieval and matching methodologies that enable a CBR system to present useful previous cases to support a DM. It is to this end that existing CBR systems rely on the use of matching rules. However, matching rules require the acquisition of domain knowledge to search for and retrieve relevant previous cases, and the acquisition of matching rules creates a bottleneck in a CBR system's development [44]. We presented an alternative: a framework of methodologies based on decision theory which eliminates the need for matching rules. Our framework included three methodologies: a methodology to assess similarity along the descriptors; a methodology to determine the importance of descriptors contextually; and a methodology to filter irrelevant candidate previous cases.

The similarity assessment by degree of closeness eliminates the need for matching rules, thereby reducing the knowledge acquisition bottleneck and overcoming the brittleness associated with rule-based representation. The methodology deals with numeric and ordinal-valued descriptors. It can also deal with nominal-valued descriptors by converting them into binary-valued descriptors. Our methodology, developed to determine the closeness of previous case descriptors to a new case, makes possible the use of the powerful operations of fuzzy set theory and the determination of an appropriate level of information search. The CBR system, by interacting with the DM, collects and analyzes his/her preferences and determines the decision rules for selecting the adaptive retrieval parameter "t". On the basis of retrieval performance analysis with various values of "t", the complexity of a new case is defined for the DM. Using this definition, the DM selects appropriate values of "t" for a given new case to optimize retrieval performance. Future research should investigate the use of machine learning approaches for pattern recognition and automatic adaptation of case retrieval. Furthermore, in this investigation, our subjects were equally knowledgeable of the decision domain. Future research could assess the validity of the power parameter "t" in case retrieval for DMs with different levels of domain expertise.

The contextually determined weights improved retrieval performance. This implies that the context of a new case exerts a strong influence on the importance of descriptors and that DMs use information that allows them to discriminate among the candidate previous cases. It is the variation of scores along the descriptors that provides the necessary discriminatory information. Although a CBR system's ability to discriminate among previous cases during retrieval has been considered important [4], only a few current CBR systems support this capability, and they do so by means of rules that make the process domain dependent. The methodology we propose eliminates the need for domain-specific rules, which makes it applicable to diverse decision domains. Since our contextual determination methodology supports retrieval performance comparable to that obtained using descriptor importances acquired from a domain expert, it eliminates the bottleneck attributable to knowledge acquisition.
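The role of score variation can be illustrated with a small sketch. The paper's contextual method is defined in section 3 and is not reproduced here, so weighting each descriptor by the variance of its scores across the candidate previous cases is an assumption standing in for it; the score matrix is hypothetical:

```python
from statistics import pvariance

def contextual_weights(score_matrix):
    """Weight each descriptor (column) by the population variance of its
    closeness scores across the candidate previous cases (rows), then
    normalize the weights to sum to 1. This variance heuristic is a sketch
    of the idea that descriptors which discriminate among candidates should
    weigh more; the exact method of section 3 may differ."""
    variances = [pvariance(column) for column in zip(*score_matrix)]
    total = sum(variances) or 1.0   # guard against all-constant columns
    return [v / total for v in variances]

# Rows: candidate previous cases; columns: descriptors.
scores = [[0.9, 0.5],
          [0.1, 0.5],
          [0.5, 0.5]]
w = contextual_weights(scores)
# The first descriptor varies across candidates and receives all the
# weight; the constant second descriptor cannot discriminate and gets none.
```

The design point is that the weights are computed from the retrieval context itself rather than elicited from an expert, which is what removes the knowledge acquisition step.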
In CBR systems with a multitude of cases in the case-base, where the knowledge acquisition problem is acute, the benefits of our proposed methodology could be substantial. Consequently, further assessment by application to other CBR systems is required.

Current CBR systems use rules for filtering irrelevant previous cases. The underlying issue is the difficulty associated with the acquisition of these rules. We proposed a methodology based on validity constraints to filter irrelevant previous cases, and a deductive machine learning technique (EBL) for acquiring the validity constraints. Our empirical evaluation showed that the validity constraints acquired using EBL significantly improve retrieval performance. Our methodology did, however, require the acquisition of validity constraints from a domain expert, and this was time-consuming. Future research should be directed towards a consideration of inductive machine learning techniques to assess relevant validity constraints.

Acknowledgements

The authors would like to extend their appreciation to the employees of Westinghouse Canada Inc. for their help and cooperation in this research, and would like to thank the two anonymous reviewers for their helpful comments on the original version of the manuscript. This paper has been supported by Grant No. 39126 from the Natural Sciences and Engineering Research Council of Canada.

References

[1] B.P. Allen, Case based reasoning: Business applications, Communications of the ACM 37(1994)40–42.
[2] K.D. Ashley, Assessing similarity among cases: A position paper, Proceedings of the DARPA Workshop on Case Based Reasoning, Morgan Kaufmann, San Mateo, CA, 1989, pp. 72–75.
[3] K.D. Ashley and E.L. Rissland, A case-based approach to modeling legal expertise, IEEE Expert 3(1988)70–77.
[4] K.D. Ashley and E.L. Rissland, Compare and contrast, a test of expertise, Proceedings of the 6th National Conference on AI, AAAI, Seattle, WA, 1987, pp. 273–278.
[5] R. Bariess and J.A. King, Similarity assessment in case-based reasoning, Proceedings of the DARPA Workshop on Case Based Reasoning, Morgan Kaufmann, San Mateo, CA, 1989, pp. 67–71.
[6] N.J. Belkin and W.B. Croft, Information filtering and information retrieval: Two sides of the same coin?, Communications of the ACM 35(1992)29–38.
[7] C. Bento and E. Costa, Retrieval of cases imperfectly described and explained: A quantitative approach, Case-Based Reasoning: Papers from the 1993 Workshop, Technical Report WS-93-01, AAAI Press, Menlo Park, CA, 1993, p. 156.
[8] M.H. Bickhard and L. Terveen, Foundational Issues in Artificial Intelligence and Cognitive Science: Impasse and Solution, Elsevier, New York, 1995.
[9] F. Bolger and G. Wright, Assessing the quality of expert judgement: Issues and analysis, Decision Support Systems 11(1994).
[10] T. Cain, M. Pazzani and G. Silverstein, Using domain knowledge to influence similarity judgments, Proceedings: Case-Based Reasoning Workshop, Washington, DC, May 1991, pp. 191–198.
[11] Cognitive Systems, REMIND Developer's Reference Manual, Boston, MA, 1992.
[12] P.R. Cohen, Evaluation and case-based reasoning, Proceedings of the DARPA Workshop on Case Based Reasoning, Morgan Kaufmann, San Mateo, CA, 1989, pp. 168–172.
[13] L. Console and P. Torasso, A multi-level architecture for diagnostic problem solving, in: Computational Intelligence, Vol. 1, A. Martelli and G. Valle, eds., Elsevier Science, Amsterdam, The Netherlands, 1989, pp. 101–112.
[14] G. DeJong and R. Mooney, Explanation-based learning: An alternative view, Machine Learning 1(1986)145–176.
[15] T.G. Dietterich and R.Z. Michalski, A comparative review of selected methods for learning from examples, in: Machine Learning: An Artificial Intelligence Approach, R.Z. Michalski, J.G. Carbonell and T.M. Mitchell, eds., MIT Press, Cambridge, MA, 1990, pp. 41–81.
[16] D. Donahue, OGRE: Generic reasoning from experience, Proceedings of the DARPA Workshop on Case Based Reasoning, Morgan Kaufmann, San Mateo, CA, 1989, pp. 248–252.
[17] R. Duda and P. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973.
[18] M.B. Eisenberg, Measuring relevance judgements, Information Processing and Management 24(1988)373–389.
[19] T.C. Eskeridge, Continuous analogical reasoning: A summary of current research, Proceedings of the DARPA Workshop on Case Based Reasoning, Morgan Kaufmann, San Mateo, CA, 1989, pp. 253–257.
[20] L. Festinger, Conflict, Decision and Dissonance, Tavistock Publications, London, UK, 1964.
[21] D. Gentner, Structure mapping: A theoretical framework for analogy, Cognitive Science 7(1983)155–170.


[22] K.M. Gupta and A.R. Montazemi, A methodology for evaluating the retrieval performance of case-based reasoning systems, Research and Working Paper Series, #398, School of Business, McMaster University, 1994.
[23] K.M. Gupta and A.R. Montazemi, Empirical evaluation of retrieval in case-based reasoning systems using modified cosine matching function, IEEE Transactions on Systems, Man, and Cybernetics, forthcoming.
[24] W.L. Hays, Statistics, Holt, Rinehart and Winston, New York, 1963.
[25] T.R. Hinrichs, Problem Solving in Open Worlds: A Case Study in Design, Erlbaum, Northvale, NJ, 1992.
[26] J. King and R. Bareiss, Similarity assessment in case-based reasoning, Proceedings of the DARPA Workshop on Case Based Reasoning, Morgan Kaufmann, San Mateo, CA, 1989, pp. 67–77.
[27] D.W. King and E.C. Bryant, The Evaluation of Information Services and Products, Information Resources Press, Washington, DC, 1971.
[28] J.L. Kolodner and R.L. Simpson, The MEDIATOR: Analysis of an early case-based problem solver, Cognitive Science 13(1989)507–549.
[29] J.L. Kolodner, Case-Based Reasoning, Morgan Kaufmann, San Mateo, CA, 1993.
[30] J.L. Kolodner, Improving human decision making through case-based decision aiding, AI Magazine 12(1991)52–68.
[31] J.L. Kolodner, Judging which is the best case for a case-based reasoner, Proceedings of the DARPA Workshop on Case Based Reasoning, Morgan Kaufmann, San Mateo, CA, 1989, pp. 77–81.
[32] J.L. Kolodner and W. Mark, Case-based reasoning, IEEE Expert 7(1992)5–6.
[33] M. Kriegsman and R. Barletta, Building a case-based help desk application, IEEE Expert 8(1993)18–26.
[34] D.B. Lenat, R.V. Guha, K. Pittman, D. Pratt and M. Shepherd, CYC: Toward programs with common sense, Communications of the ACM 33(1990)30–49.
[35] S. Minton, J.G. Carbonell, C.A. Knoblock, D.R. Kuokka, O. Etzioni and Y. Gil, Explanation-based learning: A problem solving perspective, in: Machine Learning: Paradigms and Methods, J.G. Carbonell, ed., MIT Press, Cambridge, MA, 1990, pp. 64–118.
[36] A.R. Montazemi and K.M. Gupta, An adaptive agent for case description in diagnostic CBR systems, Journal of Computers in Industry 29(1996)209–224.
[37] A. Newell, Unified Theories of Cognition, Harvard University Press, Cambridge, MA, 1994.
[38] E. Ozakarahan, Database Machines and Database Management, Prentice-Hall, Englewood Cliffs, NJ, 1986.
[39] Panel of CBR Workshop, Case-based reasoning from DARPA machine learning program, Proceedings of the DARPA Workshop on Case Based Reasoning, Morgan Kaufmann, San Mateo, CA, 1989, pp. 1–14.
[40] B.W. Porter, R. Bariess and R.C. Holte, Concept learning in weak theory domains, Artificial Intelligence 45(1990)229–264.
[41] J. Preece, Y. Rogers, H. Sharp, D. Benyon, S. Holland and T. Carey, Human-Computer Interaction, Addison-Wesley, New York, 1994.
[42] O. Raoult, A survey of diagnosis expert systems, in: Knowledge Based Systems for Test and Diagnosis, G. Saucier, A. Ambler and M.A. Breuer, eds., Elsevier Science, New York, 1989, pp. 153–167.
[43] J.J. Regazzi, Performance measures for information retrieval systems – an experimental approach, Journal of the American Society for Information Science 39(1988)235–251.
[44] C.K. Riesbeck and R.C. Schank, Inside Case-Based Reasoning, Lawrence Erlbaum Associates, Hillsdale, NJ, 1989.
[45] G. Salton, The state of retrieval system evaluation, Information Processing and Management 28(1992)441–449.
[46] T.L. Saaty, Thoughts on decision making, ORMS Today 23(1996).
[47] K.J. Schmucker, Fuzzy Sets, Natural Language Computations, and Risk Analysis, Computer Science Press, Rockville, MD, 1984.


[48] H.S. Shinn, Abstractional analogy: A model of analogical reasoning, Proceedings of the DARPA Workshop on Case Based Reasoning, Morgan Kaufmann, San Mateo, CA, 1988, pp. 370–389.
[49] E. Simoudis and J. Miller, Validated retrieval in case-based reasoning, 8th National Conference on AI, AAAI, Vol. 1, 1990, pp. 310–315.
[50] E. Simoudis, Using case-based retrieval for customer technical support, IEEE Expert 7(1992)7–11.
[51] C. Stanfill and D.L. Waltz, Memory-based reasoning paradigm, Proceedings of the DARPA Workshop on Case Based Reasoning, Morgan Kaufmann, San Mateo, CA, 1988, pp. 414–424.
[52] R.J. Sternberg, Component processes in analogical reasoning, Psychological Review 84(1977)353–378.
[53] R.H. Stottler, CBR for cost and sales prediction, AI Expert 9(1994)25–33.
[54] R. Sun, A connectionist model for commonsense reasoning incorporating rules and similarities, Knowledge Acquisition 4(1992)293–332.
[55] P. Thagard, K.J. Holyoak, G. Nelson and D. Gochfeld, Analog retrieval by constraint satisfaction, Artificial Intelligence 46(1990)259–310.
[56] C. Tsatsoulis and R.L. Kashyap, Case-based reasoning in manufacturing with TOLTEC planner, IEEE Transactions on Systems, Man, and Cybernetics 23(1994)1010–1023.
[57] M. Zeleny, Multiple Criteria Decision Making, McGraw-Hill, New York, 1982.
