Macro and Micro Applications of Case-Based Reasoning to Feature-Based Product Selection

Guy Saward, Toby O'Dell
University of Hertfordshire, College Lane, Hatfield, Herts AL10 9AB, UK
{g.r.saward, t.odell}@herts.ac.uk
+44 (0) 1707 284375

ABSTRACT: This paper examines alternative applications of case-based reasoning to product selection. It is motivated by shortfalls in e-commerce solutions but addresses wider issues in the selection of case representations and reasoning strategies where solution requirements are negotiable and the number of potential features is large. Two approaches are examined: the micro approach uses the traditional CBR nearest-neighbour algorithm to augment product searches, while the macro approach is based on the traditional CBR-cycle but substitutes crisp constraint relaxation for similarity-based retrieval. While the discussion is set against the background of a specific case study, the results have a wider applicability across a number of problem domains.
1. Introduction
The pivotal role of knowledge management in e-commerce applications is acknowledged at both a strategic level [PIU 1999] and at the level of individual applications. Many sites clearly fail to provide the knowledge¹ required to "help consumers make purchase decisions and buy products" [STA 1999]. The inability of users to navigate a site easily and to correctly identify relevant products is highlighted as one of the most important issues for e-commerce [Shern & Crawford 1999]. The issue of product selection will become even more important as e-tailers (on-line retailers) "create new, large product databases that can be used to give online consumers a better choice of products than can be found off-line." [Jones et al, 1999].

The use of expert systems for product selection and configuration is a well-established idea (e.g. the R1 system used for configuring DEC VAX systems [McDermott, 1980]). More recently, case-based reasoning has been applied at a number of different levels. Fuzzy similarity-based matching can be used (e.g. [Vollrath et al, 1998]) to provide a more graded response than that obtained by using crisp Boolean searches. This avoids the issue of getting too many results if the search is under-constrained, or too few if the search is over-constrained. Another approach is to focus on the CBR cycle of retrieval, reuse, revision and retention [Aamodt and Plaza, 1994]. This parallels work in the information retrieval domain, where iterative searching is becoming more important [Chen et al, 1998], and the convergence of case-based reasoning, information retrieval [Rissland and Daniels, 1996] and knowledge management processes [Saward, 2000]. Extensions of the CBR cycle have also been proposed to support product selection and sales [Wilke et al, 1999].

¹ Using a teleological definition of knowledge as information likely to increase the probability of action [Nonaka 1994]
In this paper we separate the use of similarity-based approaches from the CBR cycle. Although both approaches can still be characterised as case-based reasoning [Watson, 1997], we describe the first as a micro-application of CBR that uses one of CBR's dominant techniques and the second as a macro-application of CBR that is based on its underlying philosophy. In describing the macro approach we show how a query-based case representation can be used with crisp case matching to implement the standard CBR-cycle.
2. Products, Features and Cases
At a rational level the sales process can be viewed as a way of selecting a product that will best fulfil a customer's needs. One way of doing this is to translate product features into comparative advantages that will then deliver specific benefits to meet those needs [Kotler 2000]. However, in the main body of the paper we shall focus on product selection at the feature level. The justification for this is two-fold: first, the majority of large e-tailers are working at the feature level [Saward et al 2000]; second, the approach outlined here could easily be adapted to work with needs and benefits through the use of a hierarchical case base structure.

2.1 Products as Cases

Taking a feature-based approach has the advantage of simplifying the sales process as there is no need to translate customer needs into product features. It also simplifies the knowledge representation as cases become simple relations. Taking a case as a situation-solution pair, each product becomes a single case. The product features $F_i$ not only describe the product but also describe the customer feature requirements that the product would satisfy.

$$C_i ::= \langle P_i, F_i \rangle, \qquad F_i ::= \langle v_{1,i}, \ldots, v_{n,i} \rangle \in A_1 \times \cdots \times A_n$$

where
$C_i$ ::= case for ith product in case base
$P_i$ ::= product identifier for ith case
$F_i$ ::= feature values for ith product
$A_j$ ::= set of allowed values for jth feature
$v_{j,i}$ ::= value of jth feature of ith product
In general the product cases should be kept as simple and as descriptive as possible and focus on the product specification, rather than the requirements that a product may fit. For example, although a requirement for a red car may be met by cars that are maroon or scarlet, this information would not be captured in the car's colour attribute. The natural place to store this knowledge is in the functions used to assess similarity.

2.2 Assessing Similarity

The product feature case model fits easily with a relational database design and allows for easy extensions from Boolean queries, e.g. using SQL, to similarity-based searches. Similarity between any product case $C_i$ and a customer's requirements $R$, expressed by a set of feature values, can be assessed using the standard nearest neighbour similarity function applied to the product features $F_i$.
$$\mathrm{Sim}(R, F_i) = \frac{\sum_{j=1}^{n} w_j \times \sigma_j(r_j, v_{j,i})}{\sum_{j=1}^{n} w_j}$$

where
$w_j$ ::= weighting for jth attribute
$\sigma_j : A_j \times A_j \to [-1, 1]$ ::= similarity function for jth attribute, used to assess the degree of match between requirement value $r_j$ and product feature value $v_{j,i}$
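The weighted nearest neighbour function above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `global_similarity` and `linear_sim`, the attribute names, the weights and the linear fall-off are all assumptions made for the example and are not taken from the application described in this paper. The sum is taken over the attributes that the customer has actually specified, which is also an assumption.

```python
from typing import Callable, Dict


def linear_sim(span: float) -> Callable[[float, float], float]:
    """Hypothetical local similarity: 1.0 for an exact match, falling
    linearly to -1.0 once the values differ by `span` or more."""
    def sim(required: float, actual: float) -> float:
        return max(-1.0, 1.0 - 2.0 * abs(required - actual) / span)
    return sim


def global_similarity(
    requirements: Dict[str, float],
    features: Dict[str, float],
    weights: Dict[str, float],
    local_sims: Dict[str, Callable[[float, float], float]],
) -> float:
    """Weighted mean of the per-attribute similarities, i.e. Sim(R, F_i)."""
    total_weight = sum(weights[a] for a in requirements)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(
        weights[a] * local_sims[a](requirements[a], features[a])
        for a in requirements
    )
    return weighted / total_weight


# Illustrative data only: price in thousands of pounds, engine size in cc.
products = {
    "P1": {"price": 9.0, "engine_cc": 2000.0},
    "P2": {"price": 12.0, "engine_cc": 1600.0},
}
requirements = {"price": 10.0, "engine_cc": 2000.0}
weights = {"price": 2.0, "engine_cc": 1.0}
local_sims = {"price": linear_sim(10.0), "engine_cc": linear_sim(2000.0)}

# Rank product identifiers by decreasing similarity to the requirements.
ranked = sorted(
    products,
    key=lambda pid: global_similarity(
        requirements, products[pid], weights, local_sims
    ),
    reverse=True,
)
```

With these illustrative values, `P1` ranks above `P2` because it is closer on both attributes; dividing by the sum of the weights keeps the score within the same [-1, 1] range as the local similarity functions.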
The similarity functions used in assessing a product's fulfilment of a specific customer requirement (i.e. a single attribute) may take many forms, both in terms of the type of data and the shape of the function. Using car selection as an example domain, the data used could be numeric (e.g. price), symbolic (e.g. 4-door or estate car) or some combination of the two via an encoding scheme (e.g. location or colour). The similarity functions could be:
• symmetric - e.g. for matching engine size
• upper bounded - e.g. for price
• lower bounded - e.g. for number of seats / passengers
• pair-wise relational - e.g. for colour matches between any pair of colours
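The four shapes listed above might look as follows in Python. This is a hedged sketch rather than the functions used in any particular system: the function names, the linear fall-off, the `span` parameter and the colour scores are assumptions chosen for illustration, and "upper bounded" is read here as "the requirement is a ceiling", which is one plausible interpretation.

```python
def symmetric(required: float, actual: float, span: float) -> float:
    """Penalise deviation in either direction, e.g. engine size."""
    return max(-1.0, 1.0 - 2.0 * abs(required - actual) / span)


def upper_bounded(required: float, actual: float, span: float) -> float:
    """Treat the requirement as a ceiling, e.g. price: anything at or
    below it is a full match, anything above it is penalised."""
    if actual <= required:
        return 1.0
    return max(-1.0, 1.0 - 2.0 * (actual - required) / span)


def lower_bounded(required: float, actual: float, span: float) -> float:
    """Treat the requirement as a floor, e.g. number of seats."""
    if actual >= required:
        return 1.0
    return max(-1.0, 1.0 - 2.0 * (required - actual) / span)


# Hypothetical pair-wise scores; a real table would come from domain analysis.
COLOUR_SIM = {
    frozenset(("red", "maroon")): 0.8,
    frozenset(("red", "scarlet")): 0.9,
}


def pairwise(required: str, actual: str, table: dict, default: float = -1.0) -> float:
    """Look up the similarity of an unordered pair of symbolic values."""
    if required == actual:
        return 1.0
    return table.get(frozenset((required, actual)), default)
```

Keying the colour table on unordered pairs makes the relation symmetric. That is a simplification: if a requirement for red is better met by maroon than a requirement for maroon is met by red, an ordered pair would be needed as the key instead.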
The similarity functions defined above allow for easy extension of relational database queries to incorporate fuzzy retrieval of candidate products without reference to the revision or reuse parts of the CBR-cycle. For this basic approach, as with any other knowledge engineering task, the case attributes, similarity functions and attribute weightings would need to be determined by an analysis of the problem domain. However, there are generic issues with this basic approach that need to be addressed, as discussed below.

2.3 Managing Choice

The main motivation for using CBR for product selection is to deal with the issue of search requirement specification. Using Boolean database queries, under-constrained search criteria will result in too many candidate products while over-constrained criteria will result in too few. In principle the product feature case approach allows under- and over-specification to be managed by ranking candidate products according to degree of similarity. In practice, the large number of product features and the size of the case base (i.e. number of potential products) make the specification, interpretation and use of similarity a non-trivial task. For example, when choosing a car one might have the option to specify values for major criteria such as:
• Make and Model
• Price
• Availability (i.e. delay in manufacture / delivery / time to market)
• Age and mileage (if not a new car)
• Body style
• Engine size
• Exterior and Interior colour
• Dealer location
• Equipment level, e.g. entry level, luxury, sporty etc
The choice generated by this list of features is substantial. This choice is compounded if one includes the specification of individual items of equipment such as entertainment (radio, tape, CD, PC) or other systems (trip computer, phone, navigation), safety systems (antilock brakes, air bags, fog lights), climate control (air conditioning, seat heating, quick clear windscreens) and so on. The key issue with using the product feature case approach outlined above is the difficulty in specifying the notion of similarity so that the system can cope with the number of trade-offs and the subjectivity or personalisation involved in that assessment. For example, how might a consumer compromise their requirements if their ideal car is not going to be available for six months, and their closest alternatives are a larger, more expensive car or a two-year-old, lime green car at the other end of the country? We have identified three distinct approaches for creating a solution that addresses this issue:
• An adjusted weighting model that focuses on the similarity measure to determine a weighting that provides the highest utility in the ranking of cases;
• An abstract feature model that focuses on the case representation to reduce the number of cases, and thereby reduce the discrimination needed by the similarity measure;
• An iterative relaxation model that focuses on the CBR cycle to allow for iteration through the selection cycle and identifies the best dimensions for compromise.
Figures 1b-1d show the effect of each of these approaches on the CBR process through changes in the data, neighbourhood geometry and starting point of the basic product feature case approach. Figure 1a shows an abstract representation of the product feature case approach in which similarity is assessed in two dimensions² to see which products (♦) best match / are closest to the specified requirements (ο). In figure 1b, decreasing the importance of the price allows more variation in its value while maintaining the same similarity score³. Figure 1c shows the effect of using a more abstract feature representation in which the number of cases is reduced, while figure 1d shows the effect of revising the initial user requirements. Although these approaches could be combined, we treat them as distinct in this paper. We define the first two as micro-applications of CBR as they focus on the low-level mechanics of case-based reasoning, while the third is defined as a macro-application as it works with high-level procedural issues derived from CBR.

² A circle is used to represent a neighbourhood of equal similarity.
³ This approach mirrors the stretching proposed by Wettschereck et al [1997] for k-NN classifiers.
[Figure 1 panels: 1a Product Feature Cases, 1b Adjusted Weights and 1d Iterative Relaxation plot Engine Size (cc) against Price (£K); 1c Abstracted Features plots Power (rated 1-5) against Price Range (£K).]
Figure 1: Alternative case representation and reasoning approaches

The differences in approach stem from the case representations used in each. In the adjusted weighting model, the representation remains the product feature case, with each individual product represented as a case. In the abstract feature model, each case is used to model a product exemplar in which individual features are used to derive more abstract attributes. For example, a car may be deemed sporty if it has a high power to weight ratio, alloy wheels and spoilers. Finally, in the iterative relaxation model, a case is used to model a previously successful query and CBR is used to determine which relaxation of requirements is likely to be most successful. The three types of case are increasingly abstract, and increasingly general or fuzzy about the exact product specification.
3. Micro-CBR Approaches
The emphasis for micro-CBR approaches is on tuning the knowledge representation of the product cases and the retrieval algorithm in order to produce the best possible search results.

3.1 Adjusted Weighting

There are two basic approaches to refining the potentially arbitrary similarity function. The first is to delegate responsibility for setting the weighting to the user. This allows the user to set and adjust the relative importance of particular product attributes in an attempt to find the best match. This adjustment could take place during the search process and would allow users to evaluate "what if" scenarios. Although this allows the system to cater for the subjectivity and personal preference inherent in product selection, it does not present a systematic approach to managing the problem of choice. Moreover, this approach could be said to compound the problem as the user now has twice as much choice: not only do they have to pick values for particular attributes, they also have to rank or score the importance of each attribute.

The second approach is to allow the CBR system to adjust the weightings automatically. This could either be done without intervention, using a maximum discrimination or entropy approach [Wilke et al, 1999], or by allowing the user to pick the product that they perceive best matches their need. This user selection is then used as a training input to adjust the attribute weightings to reduce the error between the current system-generated best selection and the user's best selection. There are various approaches that could be taken to this type of lazy learning, where performance feedback (i.e. user perception of success) is used to tune the weighting of features [Wettschereck et al, 1997]. One dimension of difference is that between on-line optimisers that adjust the weights after each use, and batch optimisers, e.g. those based on genetic algorithms, that require training on a batch of stored data.
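As an illustration of the on-line variant, the sketch below nudges each weight after a single piece of user feedback. It is a simple perceptron-style heuristic assumed for this example, not one of the methods from the cited literature; the function name `update_weights`, the learning `rate` and the sample similarity values are all hypothetical.

```python
from typing import Dict


def update_weights(
    weights: Dict[str, float],
    sims_user_choice: Dict[str, float],
    sims_system_best: Dict[str, float],
    rate: float = 0.1,
) -> Dict[str, float]:
    """One on-line step: raise the weight of attributes on which the
    user's chosen product scores higher than the system's top-ranked
    product, and lower it where the system's pick scores higher.
    Weights are clipped at zero so they never become negative."""
    return {
        a: max(0.0, w + rate * (sims_user_choice[a] - sims_system_best[a]))
        for a, w in weights.items()
    }


def score(weights: Dict[str, float], sims: Dict[str, float]) -> float:
    """Weighted mean of per-attribute similarities."""
    return sum(weights[a] * sims[a] for a in weights) / sum(weights.values())


# Illustrative per-attribute similarities to the stated requirements.
weights = {"price": 1.0, "colour": 1.0}
user_pick = {"price": 0.2, "colour": 1.0}    # the product the user preferred
system_best = {"price": 1.0, "colour": 0.4}  # the product the system ranked first

gap_before = score(weights, system_best) - score(weights, user_pick)
weights = update_weights(weights, user_pick, system_best)
gap_after = score(weights, system_best) - score(weights, user_pick)
```

After one step the weight on price falls and the weight on colour rises, so the gap between the system's pick and the user's pick narrows. Repeating the step on further feedback would eventually rank the user's pick first in this two-attribute example, although, as discussed below, that is not guaranteed when attributes interact.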
The simplest approach to feature weighting depends on the notion that there is a single best set of weightings. Although the case library is being used for only one purpose⁴, the degree of subjectivity in assessing the user's best selection means that the system will only be able to arrive at a compromise weighting covering all users of the system. It would be possible to extend the system to allow it to be trained for individual users or groups of users. This would mean incorporating personal characteristics such as age, wealth, occupation etc into the case representation. Another change to the representation would be to allow different sets of features (i.e. variable weightings) to be used for different types of products [Domingos, 1997]. Other extensions would apply to the individual attributes' similarity functions. These may also need to be adjusted, as subjectivity is not limited to trade-offs between features but is also inherent in the compromises for specific features. For example, sport enthusiasts might consider red and yellow very similar, while more conventional drivers would associate red with purple.

A final issue to consider here is the degree of independence between attributes and their weightings. For example, the amount of flexibility in the price a customer is prepared to pay is not just a function of the user's bank balance and the scale of the purchase. It will also relate to perceived benefit or "value for money", which in turn is a function of other product features [Hulthage & Stobie, 1998]. The result of this is that in some situations it is not possible to model the user preferences by adjusting the weightings of individual features so that the user's preferred product becomes the highest ranked selection. Although techniques exist to tune models with interacting features, they either require domain-specific knowledge to guide the learning process or result in representations that cannot be meaningfully interpreted [Wettschereck et al, 1997].

⁴ This is normally taken as an indicator of success [Kolodner 1993, p359].

3.2 Abstracted Features

The abstracted feature approach is based on the notion of reducing the number of features used in the case representation. This approach can also be extended to reducing the set of possible values for a particular feature, e.g. by mapping the size of the engine to a range of discrete values, say 1 - 5, or symbolic values such as very powerful. This approach is akin to exemplar-based classification in which a single item is used to represent a class of similar objects. It is highly appropriate in this situation as the feature-rich products provide "well articulated models of the items (the system) is trying to match" which are reported to work well with this type of representation [Kolodner 1993, p482].

The abstraction that comes from using exemplars can be generated in two ways: it may be inherent in the configuration of standard products for sale; or it may be generated from the class of instances either through some kind of induction, or through explicit knowledge engineering within the domain. The former is exemplified by the purchase of new cars in which users can select the major characteristics such as model, body shape, equipment level and price band before tailoring their purchase through the selection of additional optional extras.
The latter approach is akin to second hand car purchases where combinations of features might be used to determine if the car is nearly new (by age or mileage), and/or a family car (by number of seats and boot size). This distinction is also becoming important in the housing market where buyers of new houses have the opportunity to specify certain elements of the design while those purchasing existing houses must look for properties that meet their requirements be they a family house or a bachelor pad. In general, using abstract features should help to manage choice as it reduces the number of degrees of freedom that the user has to work within. In practice, this will only happen if the abstractions are meaningful and generally applicable. This is not the case for example if one looks at car model designations where a range of options are packaged together. A top of the range Ford that includes almost all options could be branded a "GhiaX" while the equivalent Vauxhall would be a "CD". Other designations are more applicable, e.g. an "L" designation, but vary from one manufacturer to another. The fact that the same designation from the same manufacturer might also include different options over time will also decrease the meaningfulness of such abstractions.
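The kind of feature abstraction described in this section can be sketched as a mapping from concrete product features to a smaller set of derived attributes. Every threshold and field name below (`engine_cc`, `age_years`, `mileage`, `seats`, `boot_litres`, the band edges) is a hypothetical choice made for illustration; in practice these would come from induction or from knowledge engineering within the domain.

```python
from typing import Dict, Sequence, Union

Value = Union[int, float, bool]


def power_band(engine_cc: float, edges: Sequence[float] = (1200, 1600, 2000, 2500)) -> int:
    """Map engine size onto a discrete 1-5 scale: the band is one more
    than the number of (assumed) band edges that the engine exceeds."""
    return 1 + sum(engine_cc > edge for edge in edges)


def abstract_case(car: Dict[str, float]) -> Dict[str, Value]:
    """Derive abstract attributes from concrete features of a used car."""
    return {
        "power": power_band(car["engine_cc"]),
        # nearly new: judged by age or by mileage
        "nearly_new": car["age_years"] <= 1 or car["mileage"] <= 10_000,
        # family car: judged by number of seats and boot size
        "family_car": car["seats"] >= 5 and car["boot_litres"] >= 400,
    }


car_a = {"engine_cc": 1800, "age_years": 3, "mileage": 8_000, "seats": 5, "boot_litres": 450}
car_b = {"engine_cc": 1998, "age_years": 1, "mileage": 22_000, "seats": 5, "boot_litres": 480}

# Two distinct product feature cases collapse onto the same exemplar.
same_exemplar = abstract_case(car_a) == abstract_case(car_b)
```

In this sketch two different cars map to the same abstract case, which is how the approach reduces the number of cases and the degrees of freedom presented to the user. The reduction is only useful if the derived attributes are meaningful to the user, as the discussion of model designations above makes clear.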
4. Macro-CBR Approaches
The iterative relaxation approach is based on the notion that previous searches can be used to guide the modification or compromise of user requirements. In doing this, the underlying case representation is changed from the product feature case to a query requirements case. This is a fundamental change to the knowledge representation⁵ and differs from a use-adapted case retriever in which the history is used to augment an existing case retrieval [Alterman and Griffin, 1996]. It also represents a departure from the CBR-cycle for electronic sales support applications proposed by Wilke et al [1999] of:
• retrieve products that match the initial user demands
• reuse products as the starting point for configuring an offer to meet the user's specific demands
• revise⁶ view of products to establish list of evaluated products
• refine list of user requirements based on evaluated products
In the proposed iterative relaxation approach the product feature case is replaced by cases that are used to represent how users compromise their requirements. The single set of product features $F_i$ used to represent both the product features and user requirements is replaced by two sets of product features. One set of features is used to represent the initial user query and the second represents the revised requirements that lead to a successful search. Both the initial query $Q_i$ and the revised requirements $R_i$ are expressed as sets of attribute values.

$$C_i ::= \langle Q_i, R_i \rangle, \qquad Q_i \in A_1 \times \cdots \times A_n, \qquad R_i \in A_1 \times \cdots \times A_n$$
The standard CBR-cycle is then used to generate crisp database queries, with the result set used to judge the success or failure of the query. The search may be considered unsuccessful if it returns too few, too many or simply the wrong type of product. In this situation, a case base search is undertaken to find the best adaptation of requirements, as shown in figure 2, using the traditional CBR cycle applied to the query cases:
• retrieve previous queries or searches that are similar to the initial user demands
• reuse queries by substituting variables as required
• revise query by trial execution to see if it represents a better solution
• retain query if it results in a successful search
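The four steps above can be sketched over an in-memory product list. This is a minimal sketch under several assumptions that go beyond the text: queries are crisp equality constraints, retrieval ranks stored query cases by the number of attribute values they share with the new query, reuse is simplified to dropping the attributes that the earlier user changed or gave up, and success is judged only by result count (the "wrong type of product" case is not modelled). All names (`QueryCase`, `crisp_search`, `relax`, `min_hits`, `max_hits`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Query = Dict[str, object]


@dataclass
class QueryCase:
    initial: Query  # Q_i: the query as first posed
    revised: Query  # R_i: the requirements that led to a successful search


def crisp_search(products: List[Query], query: Query) -> List[Query]:
    """Boolean match: every constrained attribute must be equal."""
    return [p for p in products if all(p.get(a) == v for a, v in query.items())]


def is_successful(results: List[Query], min_hits: int = 1, max_hits: int = 20) -> bool:
    """Neither too few nor too many results."""
    return min_hits <= len(results) <= max_hits


def overlap(q1: Query, q2: Query) -> int:
    """Number of attributes on which two queries agree."""
    return sum(1 for a, v in q1.items() if q2.get(a) == v)


def relax(products: List[Query], query: Query, case_base: List[QueryCase]) -> Optional[Query]:
    # Retrieve: previous queries most similar to the current one come first.
    ranked = sorted(case_base, key=lambda c: overlap(query, c.initial), reverse=True)
    for case in ranked:
        # Reuse: relax the attributes that the earlier user changed or dropped.
        changed = {a for a, v in case.initial.items() if case.revised.get(a) != v}
        candidate = {a: v for a, v in query.items() if a not in changed}
        if candidate == query:
            continue  # this case suggests no relaxation for the current query
        # Revise: trial execution of the adapted query.
        if is_successful(crisp_search(products, candidate)):
            # Retain: store the successful adaptation as a new query case.
            case_base.append(QueryCase(dict(query), candidate))
            return candidate
    return None


products = [{"model": "Mondeo", "colour": "green", "location": "Essex"}]
case_base = [
    QueryCase(
        initial={"model": "Mondeo", "colour": "blue", "location": "Devon"},
        revised={"model": "Mondeo", "colour": "blue"},
    )
]
query = {"model": "Mondeo", "colour": "green", "location": "Devon"}
relaxed = relax(products, query, case_base)
```

Here the original query returns nothing, the stored case suggests giving up the location constraint, and the relaxed query succeeds and is retained, so the case base grows with each successful adaptation.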
⁵ Although this is a fundamental change to the product feature case approach, the notion of storing local actions for optimal behaviour is not new, e.g. [Sycara and Miyashita, 1994].
⁶ The original does not identify what is actually being revised - it is assumed that it is the customer's view of the product that is revised.
[Diagram: a cycle linking the Case Base with the Retrieve, Reuse, Revise and Retain steps; the arrows are labelled user requirements, similar searches, revised requirements and adapted search.]
Figure 2: Iterative relaxation CBR cycle

As an example, suppose the user is looking for a green, 2 litre Ford Mondeo in Devon costing £10,000. If a search of the product database fails to retrieve any matching products then the user must compromise on their requirements. A search of the case base might reveal that similar requests had been met by an £8,000 blue Mondeo from Essex and a 1.8 litre from Coventry. Taking a simple majority voting approach over the previously relaxed attributes would indicate that the user should compromise on location first. In practice, the inclusion of the reuse and revise phases of the CBR cycle will depend upon the ability to generalise from queries and apply specific rewrite rules. In the case of general query representations, the case will also need to have some measure of utility attached to it, so that the most successful query adaptations, i.e. most successful dimension for constraint relaxation, are attempted first.
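The majority vote in the Mondeo example can be sketched as follows. The sketch assumes, for illustration only, that both earlier requests started from the same requirements as the current user, and it counts any changed or dropped attribute as a relaxation; the attribute names and the functions `relaxed_attributes` and `vote` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

Query = Dict[str, object]


def relaxed_attributes(initial: Query, revised: Query) -> Set[str]:
    """Attributes whose value differs between the initial query and the
    requirements that were finally met."""
    return {a for a, v in initial.items() if revised.get(a) != v}


def vote(similar_cases: Iterable[Tuple[Query, Query]]) -> List[Tuple[str, int]]:
    """Count how often each attribute was relaxed, most frequent first."""
    votes: Counter = Counter()
    for initial, revised in similar_cases:
        votes.update(relaxed_attributes(initial, revised))
    return votes.most_common()


request = {"model": "Mondeo", "colour": "green", "engine_litres": 2.0,
           "location": "Devon", "price": 10_000}

similar_cases = [
    # met by an 8,000 pound blue Mondeo from Essex
    (request, {**request, "colour": "blue", "location": "Essex", "price": 8_000}),
    # met by a 1.8 litre from Coventry
    (request, {**request, "engine_litres": 1.8, "location": "Coventry"}),
]

ranking = vote(similar_cases)
first_choice = ranking[0]
```

Location is the only attribute relaxed in both earlier cases, so it receives two votes and heads the ranking; the remaining attributes tie on one vote each, which is where a utility measure attached to each case would be needed to break the tie.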
5. Implementation Issues
We are currently enhancing a nearest neighbour, product feature case application to investigate the comparative advantages and disadvantages of micro- and macro-CBR approaches. The application is based on an Access database and accessed via a combination of JavaScript, VBScript and ASP. Although the trial domain is used cars, the approach is equally applicable to other feature-rich domains such as property, PCs, other consumer electronics and holidays. These sectors account for almost 50% of current e-commerce sales [Jones et al, 1999] as well as representing some of the high growth sectors. The system should also work equally well with new, "previously enjoyed" or auctioned goods. A key issue for the evaluation of the macro-approach to product selection will be the acquisition of sufficient customer data to allow for meaningful constraint relaxation. This is currently being investigated with a number of major e-tailers.
Another key issue is that of case base maintenance. For the micro-approaches that use the product feature case, the case base must be maintained as new products are added and old products deleted. Where abstract features are used and the products represent exemplars of a product range this should be relatively straightforward as the product changes are less frequent than for concrete product feature cases. In the worst case where each case represents a single unique instance of a product, such as used cars, the case base indices and reasoning mechanisms must be maintained in real time as major dealers can buy and sell hundreds of products a day. The changing stock of available products has less predictable consequences for the macro-CBR approach. The query cases show how the user requirements were modified at a particular point in time to meet a specific need. However, this compromise of requirements would have been done within the context of a particular set of choices, i.e. the products available within the product database at that specific point in time. The applicability of what was the best choice at that point in time will be reduced if the context has changed. The degree to which this affects the system usability can only be established through an on-going, long-term system evaluation.
6. Conclusions
Feature-based product selection is alive and well on the Internet and will continue to be so. However, it is clear that if e-commerce is to meet its explosive growth targets, e-tailers must provide systems to support less knowledgeable shoppers [Saward et al, 2000]. Case-based reasoning is an approach that can provide a number of different solutions to the problem of managing choice, all based on a simple underlying knowledge representation. We have presented a range of alternatives from fuzzy database matching and the use of abstraction, to iterative approaches to searching. There are other interesting CBR solutions for product selection and recommendation that have not been explored in this paper. They include:
• the combination of approaches outlined above for feature-based selection;
• the use of hierarchical or stratified CBR [Branting and Aha, 1995] to allow for the translation of user needs and benefits into product features for high ticket items;
• user classification into on-line communities such as reading circles and music channels for cross-selling related products for smaller ticket items.

The context of the work presented here has been limited to a small number of domains. However, the approaches outlined are widely applicable in an industrial context as well as providing a platform for exploring synergies between CBR, machine learning and information retrieval.
References

Aamodt A. and Plaza E., 1994. Case-based reasoning: Foundational issues, methodological variations and system approaches, AI Communications.
Aha D.W., 1997. Lazy Learning, Artificial Intelligence Review, 11(1-5), pp1-423.
Alterman R. and Griffin D., 1996. Improving Case Retrieval by Remembering Questions, Thirteenth National Conference on Artificial Intelligence, pp678-683.
Branting L. and Aha D., 1995. Stratified Case-Based Reasoning: Reusing Hierarchical Problem Solving Episodes, Fourteenth International Joint Conference on Artificial Intelligence, pp384-390.
Chen et al, 1998. Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques, Journal of the American Society of Information Science, July 1998, pp582-603.
Domingos P., 1997. Context-Sensitive Feature Selection for Lazy Learners, Artificial Intelligence Review, 11(1-5), pp227-253.
Hulthage I.E. and Stobie I., 1998. Countrywide Automated Property Evaluation System - CAPES, Tenth Innovative Applications of Artificial Intelligence, pp1039-1046.
Jones I., Patel V., Beauvillain O. and Neufeld E., 1999. European Online Shopping, Jupiter Communications, London.
Kolodner J., 1993. Case-Based Reasoning, Morgan Kaufmann.
Kotler P., 2000. Marketing Management, Prentice Hall.
McDermott J., 1980. R1: an expert in the computer system domain, Proc National Conference on Artificial Intelligence, pp269-71.
Nonaka I., 1994. A Dynamic Theory of Organizational Knowledge Creation, Organization Science, 5(1), pp14-37.
PIU, 1999. e-commerce@its.best.uk, Performance and Innovation Unit Report, UK Cabinet Office, September 1999.
Rissland E. and Daniels J., 1996. The Synergistic Application of CBR to IR, Artificial Intelligence Review, 10(1-2), pp441-475.
Saward G., 2000. The challenge for customer service: managing heterogeneous knowledge, in Schwartz D. (ed.) Internet Information Systems for Knowledge Management, Idea Group Publishing.
Saward G., Ambrosiadou V. and Polovina S., 2000. A FAB Approach to E-Commerce Knowledge Accessibility Requirements, Technical Report, University of Hertfordshire, forthcoming.
Shern S. and Crawford F., 1999. 2nd Annual Internet Shopping Study, Ernst & Young, New York.
STA, 1999. Click Here Commerce, Shelley Taylor Associates, February 1999, http://www.infofarm.com
Sycara K. and Miyashita K., 1994. Case-based acquisition of User Preferences for Solution Improvement in Ill-Structured Domains, Twelfth National Conference on Artificial Intelligence, pp44-55.
Vollrath I., Wilke W. and Bergmann R., 1998. Case-based reasoning support for online catalogue sales, IEEE Internet Computing, July-August 1998.
Wettschereck D., Aha D. and Mohri T., 1997. A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms, Artificial Intelligence Review, 11(1-5), pp273-314.
Wilke W., Bergmann R. and Wess, 1999. Negotiation During Intelligent Sales Support with Case-Based Reasoning, German Workshop on CBR.