Top (2011) 19:402–420 DOI 10.1007/s11750-009-0133-0 ORIGINAL PAPER
Domain structuring for knowledge-based multiattribute classification (a verbal decision analysis approach) Eugenia M. Furems
Received: 25 December 2008 / Accepted: 15 December 2009 / Published online: 6 January 2010 © Sociedad de Estadística e Investigación Operativa 2010
Abstract Classification problems in decision making are, at least, ill-structured or even unstructured ones, since, among other things, human judgments (i.e., Decision Maker preferences and/or expert knowledge) are the primary sources of information for their solving. Thus, not only the classification rules eliciting, but the application domain structuring as well, is a complex problem itself. The paper focuses on knowledge-based classification problem structuring in the context of complete (up to the expert knowledge) and consistent knowledge base construction for a Diagnostic Decision Support System. Two structuring techniques are proposed as expert aids, as well as an approach to large-size problem decomposition. It is asserted that application domain structuring and classification rules eliciting have to be arranged as interconnected procedures. Keywords Multiattribute classification problems · Verbal decision analysis · Knowledge acquisition · Application domain structuring · Large-size problem decomposition Mathematics Subject Classification (2000) 91B06 · 68U35 · 68T35
E.M. Furems, Institute for System Analysis, Russian Academy of Science, 9, Prospect of 60 let Octyabrya, 117312 Moscow, Russia

1 Introduction

Multiple criteria classification problem (MCCP) is one of the typical decision making problems. It consists in assigning the given objects (alternatives, situations, etc.), described with the values upon multiple criteria, to the categories/classes from their
predefined set according to Decision Maker (DM) judgments. There are various approaches to MCCPs solving in multicriteria decision aiding, the most well known of which are: utility theory (von Neumann and Morgenstern 1947; Keeney and Raiffa 1976; Jacquet-Lagreze and Siskos 1982), outranking approach (Roy 1981; Figueira et al. 2005), rough sets theory (Greco et al. 1998, 2002, 2005; Greco 2008), fuzzy sets theory (Zadeh 1965), and verbal decision analysis (VDA) (Larichev and Moshkovich 2001; Ashikhmin et al. 2008). As emphasized in Zopounidis and Doumpos (2004), the difference between MCCPs and traditional classification problems studied within the statistical and machine learning framework is based on possibility to obtain some preferential information both in respect of the alternatives’ values upon criteria and in respect of predefined classes, and, thus, the nature of such classification is specified in ordinal rather than nominal terms. Thus, in such a context, the term “judgments” referred above has to be replaced with the term “preferences,” and an MCCP is formulated as follows: assigning a set of the given objects (alternatives, situations, etc.), described with the values upon multiple criteria, to a set of preordered classes according to DM’s preferences. Such MCCPs are often called multicriteria sorting problems (Zopounidis 2002; Zopounidis and Doumpos 2002). A different kind of MCCP—multiple criteria nominal classification—is addressed in Chen (2006), where classes are not ordered by preference, but the resulting classification is based on DM’s preferences. However, in some application context, e.g., medical or technical diagnostics, the concept “preferences” is not appropriate, and the term “judgments” has to be replaced with “knowledge.” Let us explain this statement. Certainly, DM’s preferences are based, inter alia, on his/her knowledge, experience, skills, and intuition. 
Nevertheless, the concept “preference” implies both a possibility of ordering the objects according to their value, utility, etc., and a possibility of choice among them (Doyle 2004). Thus, it is incorrect to use the term “preference” in the context of diagnostics. Indeed, it would sound odd to say that a physician acting as a Decision Maker prefers one symptom to others. Besides, there is no ultimate choice goal in such classification, and the terms “criterion” and “alternative” are not appropriate. Moreover, it should be noted that in the literature devoted to MCCPs the terms “criteria” and “attributes” are distinguished, for example, as follows: “The notion of attribute differs from that of criterion because the domain (scale) of a criterion has to be ordered according to a decreasing or increasing preference, while the domain of the attribute does not have to be ordered” (Greco et al. 2002, p. 250). Thus, in Multiple Criteria Decision Making (MCDM) theory the term “criteria” is associated with preferences, while the attributes are considered as nonordered characteristics of the objects to be classified. However, as shown below, in some cases the values on the attributes’ scales may also be ordered on the basis of the DM’s knowledge of such values’ inherence in (typicality to) classes. Thus, to avoid confusion, below we shall use the terms “MCCP”, “criteria”, and “alternatives” for preference-based classification problems, and the terms “multi-attribute classification problem” (MACP), “attributes”, and “objects” for knowledge-based ones. Besides, in the case of MACP, instead of “Decision Maker” we use the term “expert”, which means here a person who makes classification decisions on the basis of his/her knowledge. Accordingly, we may redefine the problem under
consideration as follows: assigning the given objects described with the values upon multiple attributes to the categories/classes from their pre-defined set according to expert knowledge. Nevertheless, such delimitation of concepts and definitions does not prevent the application of the approaches above to knowledge-based MACPs. According to Simon (1960), both MCCPs and MACPs are, at least, ill-structured or even unstructured, since, among other things, human judgments are the primary source of information for solving them, and such judgments are subjective to a great extent, even if they are based not on preferences but on knowledge. Therefore, structuring such problems is a complex problem in itself. In von Winterfeldt (1980) problem structuring is described as the most difficult part of decision aiding; the absence of a “sound methodology” for “formal and manageable” structuring is noted, and it is emphasized that “this step is still an art left to the intuition and craftsmanship of the individual analyst” (p. 71). Although that paper was published about 30 years ago, the situation has changed little. The comprehensive review (Wallenius et al. 2008) of the achievements of MCDM theory and practice over about the last fifteen years, and of the future agenda in this area, notes that the structuring of multi-criteria decision making problems continues to be of significant interest. In Montero et al. (2007) it is emphasized that a structure (objects, criteria, classes, alternatives, etc.) is a surprising missing piece in many mathematical models. The same is true for a MACP. The approaches listed in the beginning of this section (except for VDA) are motivated to some extent by the poor accessibility of a DM or, in other words, by the limited (if any) willingness of a DM to participate in the elicitation of his/her preferences, because this process is very time-consuming. 
VDA-based methods, described briefly in the next section, are designed for MCCPs/MACPs where there is a DM/expert who is both able and willing to share his/her preferences/knowledge and, thus, to participate at all stages of problem solving. To assist a DM/expert in such participation, these methods involve various techniques to cope with the challenge of time consumption. However, the paper focuses not on the methods for solving MACPs but on structuring such problems with a view to constructing a complete (up to the expert’s knowledge) and consistent knowledge base for Diagnostic Decision Support Systems (DDSS). In Sect. 2 we describe in brief the VDA-based methods for knowledge-based MACP. Then, in Sect. 3 the elements of application domain (AD) structure for VDA-based MACPs are determined. Two techniques are proposed in Sect. 4 as aids for an expert in such problem structuring. It is asserted there that AD structuring and classification rules eliciting have to be arranged as interconnected procedures to ensure that the structure is as complete as possible. Section 5 presents an approach to large-size problem decomposition. Section 6 contains conclusions. A preliminary draft of the paper was published as a conference paper (Furems 2008).
2 Methods for MACP solving within VDA paradigm

Table 1 VDA-based Methods for MACP

| Partial orders over the scales of attributes | Order over the set of diagnostic decisions (classes): classes are ordered according to a level of expert’s confidence in the only diagnosis/level of the diagnosis severity | Order over the set of diagnostic decisions (classes): no order (different diagnoses) |
|---|---|---|
| Partial orders according to values’ inherence in classes | ORCLASS (Larichev et al. 1991), DIFCLASS (Larichev and Bolotov 1996), CYCLE (Larichev et al. 2002) | NORClass^a (Larichev et al. 1991) |
| No partial orders | N/A | STEPCLASS (Furems and Gnedenko 1996) |

^a This method is untitled in Larichev et al. (1991); however, due to its Nominal-ORdinal nature we shall refer to it as NORClass hereafter

For the first time, expert’s knowledge, rather than DM’s preferences, as an input for decision-making problems has been proposed in the monograph Larichev et al. (1991). Knowledge acquisition for a DDSS, including, but not limited to medical
ones, is stated there as MACP, where the objects to be classified are described with the values of multiple attributes, and a class is the subset of objects in respect of which an expert makes the same diagnostic decision (in this paper, we consider “diagnostic decision” and “classification decision” as synonyms). Both the methods for ordinal and nominal-ordinal MACPs proposed in Larichev et al. (1991) and developed further in Larichev and Bolotov (1996), Larichev and Naryzhny (1999), Larichev et al. (2001, 2002), and the method for nominal MACP (Furems and Gnedenko 1996) are based on the VDA principles, which require (1) using only those operations of eliciting information from the DM/expert that are deemed reliable from the psychological point of view; (2) checking the information so obtained for consistency; and (3) processing such information (which has a verbal or, in other words, qualitative nature) without any quantitative conversion, so that any resulting conclusion is both transparent and well explainable to the DM/expert. These methods are listed in Table 1. The main idea behind all of these methods is the following: a DDSS is an application-oriented system, and, thus, its knowledge base has to contain the classification rules of an expert who is an efficient practitioner. Naturally, such an expert would be able to specify a number of his/her classification rules directly. However, it is well known that an expert knows more than he/she is able to say, and thus, most certainly, the classification rules listed by an expert directly would be applicable to the typical objects only. So, the set of such rules would be incomplete both in relation to his/her knowledge and in regard to the AD coverage. There are various reasons for this phenomenon, and one of them is as follows: an expert does not formulate the rules in his/her routine activity, but applies them while analyzing really existing objects. 
Thus, it seems to be reasonable to generate the objects to be classified as the combinations of the attributes’ values and to present them to the expert for analysis and classification. Both in ORCLASS (and its modifications listed in Table 1) and NORClass, multi-attribute description of an object to be classified is displayed to an expert in toto, and an expert is invited to choose the proper class/classes the object belongs to, while in STEPCLASS this procedure simulates
an expert’s decision-making process more closely: the expert is shown only the values of those attributes that he/she explicitly requests for the object under consideration. All of these methods provide for eliciting the complete (up to the expert’s knowledge) and consistent set of the expert’s classification rules, allowing one to classify each object from the set of all hypothetically possible objects in the given AD described with the values of the expert-specified attributes. Consistency means here that any number of rules may exist for an object; however, all of these rules have to assign it to the same class/classes. Thus, the general formal statement of knowledge-based MACP is as follows:

It is given: some AD, each object of which may belong to one or more classes:

$C = \{C_1, C_2, \ldots, C_L\}$: the names of classes, predefined by the expert;
$Q = \{Q_1, Q_2, \ldots, Q_M\}$: the names of attributes, values of which describe the objects within the given AD;
$K_m = \{k_{m1}, k_{m2}, \ldots, k_{mn_m}\}$: the values (scale) of the $m$th attribute;
$A = K_1 \times K_2 \times \cdots \times K_M$: the set of $M$-attribute descriptions of all hypothetically possible objects of AD; the $i$th object is specified as the $M$-tuple $a_i = (a_{i1}, a_{i2}, \ldots, a_{iM})$, where $a_{im} \in K_m$, $m = 1, \ldots, M$.
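The set A of all hypothetically possible objects defined above is a Cartesian product of the attribute scales, so it can be enumerated mechanically. The following is a minimal sketch, not part of the methods themselves; only the attribute “Localization of Pain” comes from the text, while the other two attributes, their values, and the function name `object_space` are hypothetical and serve only as an illustration.

```python
from itertools import product

# Attribute scales K_m. Only "Localization of Pain" appears in the paper;
# the other two attributes and all their values are made up for illustration.
scales = {
    "Localization of Pain": ["retrosternal", "left of sternum"],
    "Pain duration": ["under 20 min", "over 20 min"],      # hypothetical
    "ECG changes": ["absent", "present", "unknown"],      # hypothetical
}


def object_space(scales):
    """Return A = K_1 x K_2 x ... x K_M as a list of M-tuples."""
    return list(product(*scales.values()))


A = object_space(scales)
print(len(A))   # |A| is the product of the scale sizes: 2 * 2 * 3
print(A[0])     # first object: the first value of every attribute
```

Since the size of A grows multiplicatively with every added attribute, the sketch also shows why presenting every object to the expert quickly becomes impractical.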
It is required: to assign the objects from $A$ to classes from $C$ on the basis of the expert’s knowledge so that the resulting classification is both complete (up to the expert’s knowledge) and consistent. To reduce the number of objects presented to an expert for direct classification, the methods ORCLASS, DIFCLASS, CYCLE, and NORClass use the so-called Inherence Hypothesis (IH). In the case of the only diagnosis (ORCLASS, DIFCLASS, CYCLE), a class is a set of objects with the same level of the expert’s confidence in the presence of such diagnosis or with the same level of such diagnosis severity, and, thus, the classes are ordered in a natural way. IH means here the assumption that the expert is able to order the values of each attribute from the most inherent in (typical to) such diagnosis to the least inherent (typical) one independently of the values of other attributes. For example, if a problem is to classify the combinations of the attributes’ values (the latter play in such a problem the role of symptoms) according to the severity of the only diagnosis “Myocardial Infarction,” and one of the attributes is “Localization of Pain” with two values: (1) retrosternal pain and (2) pain to the left of the sternum, the expert may order the values of this attribute as follows: retrosternal pain is more inherent in (typical to) Myocardial Infarction than pain to the left of the sternum. In the case of different diagnoses, and, thus, nonorderable classes (NORClass), it is assumed that (i) the individual values of each attribute are inherent in (typical to) each class differently; (ii) the expert is able to order such values in terms of their inherence in an appropriate class; and (iii) the orders of the values of each attribute are independent of the other attributes’ values. For example, if a problem is to assign the combinations of symptoms (attributes’ values) to either of such classes as “Myocardial Infarction” and “Stenocardia,” and
there is the attribute “Localization of Pain” with two values: (1) retrosternal pain and (2) pain to the left of the sternum, the expert may order these values according to their inherence in (typicality to) each class above as follows: (a) retrosternal pain is more inherent in (typical to) Myocardial Infarction than pain to the left of the sternum; (b) pain to the left of the sternum is more inherent in (typical to) Stenocardia than retrosternal pain. Inherence-based orderings of the attributes’ values for each class are used to construct so-called binary Inherence Dominance Relations (IDRs) over the set of multiattribute objects for each class (the only IDR, in the case of the only diagnosis). In the case of the only diagnosis, once the expert has identified the class an object belongs to, all objects described by attributes’ values not less inherent in such diagnosis than those of that object may belong only to the classes corresponding to not less severity of (confidence in) such diagnosis, including the same class. In the case of different diagnoses (nonorderable classes), once the expert has identified a class an object belongs to, all objects described by attributes’ values not less inherent in this class than those of that object belong to the same class as well. Similarly, if an object does not belong to a class, the objects less inherent in the class do not belong to the class either. IDRs and the rules above are specified more formally in the next section. These methods significantly reduce the number of multiattribute objects to be presented to the expert for direct analysis and, thereby, avoid exhaustive search. The effectiveness of the methods above depends on the sequence in which the objects are presented to an expert. On average, about 30% of the objects are presented to an expert for direct classification, and about 70% are classified indirectly. 
Besides, the methods above allow an expert to express different levels of confidence in respect of the objects assigned to the same class. And last, but not least, these methods allow controlling the consistency of the expert’s decisions in terms of nonviolation of the binary IDRs over the set of the objects. The Inherence Hypothesis has been verified in a number of applications, and the methods it underlies have been implemented successfully for constructing a number of medical DDSSs (e.g., diagnostics of ischemic heart disease at the prehospital stage and diagnostics of acute surgical diseases of the abdominal organs). However, it should be noted that the procedure of ordering values is not part of an expert’s usual practice, so the applicability of the IH-based approach depends on the expert’s ability to make such orderings, and, thus, it is not appropriate for every MACP. The method STEPCLASS is designed for MACPs where IH is not applicable (an expert is not able to order the values of attributes because he/she does not understand such a task and/or because of significant dependence among the attributes). It addresses nominal knowledge-based multi-attribute classification and contains techniques both to avoid exhaustive search, including, but not limited to, so-called Implicit Rules (see below), and to control the consistency of the expert’s decisions.
3 Not “it is given” but “it is required”

The ill-structured or unstructured nature of MCCPs/MACPs underlies the fact that structuring such problems, which involves defining the criteria/attributes and their values, is a
key part of decision making (Scheubrein and Zionts 2006). Structuring is the first step in representing and solving a problem, since it provides comprehension of the relevant elements of the problem under consideration (Saaty and Shihb 2009). Thus, it is incorrect to think that the names of attributes, their values, and, therefore, the set of objects to be classified are given. It is partially true for the classes as well. The problem of adequate structuring of MACP is a complex problem in itself. Thus, the specification of the structure elements should be moved from the part “it is given” into the part “it is required” of the MACP model. Let us define the structure of a VDA-based MACP as follows:

$$S = \langle C, Q, \{K_m\}, \{V_m\}, P^C, R_m^l \rangle,$$

where:

$C = \{C_1, C_2, \ldots, C_L\}$, $Q = \{Q_1, Q_2, \ldots, Q_M\}$, and $K_m = \{k_{m1}, k_{m2}, \ldots, k_{mn_m}\}$ are defined as above;

$V_m \equiv \|v_{mj}^l\|$ is an $L \times n_m$ matrix of the admissibility of the values of attribute $Q_m$ to the classes: $v_{mj}^l = 1$ if $k_{mj} \in K_m$ is admissible to the class $C_l$, otherwise $v_{mj}^l = 0$ ($m = 1, \ldots, M$);

$P^C = \{(C_i, C_j) \in C \times C \mid i < j\}$ is a linear, antireflexive, and transitive binary relation over the set of classes according to a level of expert’s confidence in the presence or absence of the only diagnosis in the objects or according to a level of such diagnosis severity;

$R_m^l = \{(k_{mi}, k_{mj}) \in K_m \times K_m \mid v_{mi}^l = 1, v_{mj}^l = 1\}$ are reflexive and transitive binary relations over the values of each attribute $Q_m$ admissible to the class $C_l$, so that a pair $(k_{mi}, k_{mj}) \in R_m^l$ if $k_{mi}$ is not less inherent in $C_l$ than $k_{mj}$.
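To make the elements of the structure S concrete, the sketch below stores them in a single container and checks the shape of the admissibility matrices. It is an illustration only: the class name `MACPStructure`, its field names, and the rank-based encoding of the relations $R_m^l$ are assumptions introduced here, and a rank encoding can only represent orderings in which no two values are incomparable.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MACPStructure:
    """Illustrative container for S = <C, Q, {K_m}, {V_m}, P^C, R_m^l>."""
    classes: List[str]                      # C
    scales: Dict[str, List[str]]            # Q (keys) and K_m (values)
    # V_m: for each attribute, an L x n_m matrix of 0/1 entries.
    admissibility: Dict[str, List[List[int]]] = field(default_factory=dict)
    # P^C: True if the order of `classes` is meaningful (ordered classes).
    classes_ordered: bool = False
    # R_m^l encoded as ranks[class][attribute][value]; lower = more inherent.
    # Assumption: ranks cannot express incomparable values.
    ranks: Optional[Dict[str, Dict[str, Dict[str, int]]]] = None

    def validate(self) -> bool:
        """Check that every V_m is an L x n_m matrix of zeros and ones."""
        for attr, matrix in self.admissibility.items():
            n_m = len(self.scales[attr])
            if len(matrix) != len(self.classes):
                raise ValueError(f"{attr}: expected {len(self.classes)} rows")
            for row in matrix:
                if len(row) != n_m or any(v not in (0, 1) for v in row):
                    raise ValueError(f"{attr}: rows need {n_m} entries, each 0 or 1")
        return True


s = MACPStructure(
    classes=["Myocardial Infarction", "Stenocardia"],
    scales={"Localization of Pain": ["retrosternal", "left of sternum"]},
    admissibility={"Localization of Pain": [[1, 1], [1, 1]]},
)
print(s.validate())
```

Keeping the admissibility matrices and the ranks as separate, defaulted fields mirrors the fact that the structure may be filled in gradually as the expert supplies more information.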
The classes, the attributes, and their values are the obligatory elements of the AD structure for MACP. The admissibility matrices are auxiliary elements that may be specified for any MACP. Indeed, for each object $a_i$, it is possible to specify a list of admissible classes. Let $a_{im} = k_{mj}$ and denote $v^l(a_{im}) = v_{mj}^l$. Then the $L$-dimensional vector of $a_i$’s admissibility to the classes from $C$ is defined as follows:

$$V_i = \left( \prod_{m=1}^{M} v_{mj}^1, \; \prod_{m=1}^{M} v_{mj}^2, \; \ldots, \; \prod_{m=1}^{M} v_{mj}^L \right).$$
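The admissibility vector can be computed directly from the matrices $V_m$. The sketch below assumes that each component is the product of the 0/1 admissibility entries over all attributes, so that a class is admissible for an object only if every one of its values is admissible to that class. The matrices, the 0-based indexing, and the function name `admissibility_vector` are illustrative assumptions.

```python
from math import prod


def admissibility_vector(obj, V):
    """Compute V_i for one object.

    obj -- tuple of 0-based value indices j, one per attribute m
    V   -- list of M matrices; V[m][l][j] is 1 if value j of attribute m
           is admissible to class l, otherwise 0
    """
    n_classes = len(V[0])
    return [prod(V[m][l][j] for m, j in enumerate(obj)) for l in range(n_classes)]


# Hypothetical admissibility matrices for M = 2 attributes and L = 2 classes.
V = [
    [[1, 0],    # attribute 1, class 1
     [1, 1]],   # attribute 1, class 2
    [[1, 1],    # attribute 2, class 1
     [0, 1]],   # attribute 2, class 2
]

for obj in [(0, 0), (1, 0), (1, 1), (0, 1)]:
    print(obj, admissibility_vector(obj, V))
```

In this toy data the object (1, 0) gets an all-zero vector, (0, 0) and (1, 1) each have exactly one admissible class, and (0, 1) remains admissible to both classes.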
In Furems and Gnedenko (1996) the following obvious Implicit Rules for any object ai are determined: (1) If all the components of the vector Vi equal zero (i.e., the object ai is described with the values, combination of which is inadmissible to any class from C), this object does not belong to any such class (or it belongs to a nonlisted class, or its description contains incompatible values).
(2) If exactly one component of the vector $V_i$ equals one and all of its other components equal zero, this object most likely belongs to the class with the same number as that nonzero component.

If an object satisfies one of the Implicit Rules above, then it is not necessary to present it to the expert for direct classification (at least until the AD structure is changed at further stages, if applicable). However, it should be noted that, in practice, the Implicit Rules alone may be insufficient to classify all hypothetically possible objects in the given AD. In general, the more classes a value is admissible for, the less opportunity (if any) there is to form Implicit Rules involving such value. Nevertheless, such Implicit Rules are useful for reducing the number of objects presented to the expert directly.

Binary relations $P^C$ and $R_m^l$ are the optional elements. They are determined only in those cases when the expert is able to order the attributes’ values according to their inherence in (typicality to) the classes. In the case of the only diagnosis, a class is a set of objects with the same level of the expert’s confidence in the presence or absence of such diagnosis or with the same level of such diagnosis severity. If the expert is able to order the values of each attribute according to their inherence in (typicality to) such diagnosis independently of the values of other attributes and, thus, to specify binary relations $R_m$ over the values of each attribute $Q_m$ (here the upper index $l$ in $R_m$ is omitted, since the attributes’ values are ordered in respect to the only diagnosis), it is possible to construct the following reflexive and transitive binary Inherence Dominance Relation $R$ over the set $A$:

$$R = \{(a_i, a_j) \in A \times A \mid \forall m \in \{1, \ldots, M\}: (a_{im}, a_{jm}) \in R_m\}.$$

Then the following rules are applicable (Larichev et al. 1991):

1. If the expert assigns an object $a_i$ to class $C_l$, any object $a_j$ such that $(a_j, a_i) \in R$ may belong only to one of the classes $C_x$, $x \le l$.
2. Accordingly, any object $a_j$ such that $(a_i, a_j) \in R$ may belong only to one of the classes $C_x$, $x \ge l$.

If the classes correspond to various diagnoses and the expert is able to order the values of an attribute according to their inherence in (typicality to) each class independently of the values of other attributes (i.e., to specify binary relations $R_m^l$ over the values of each attribute $Q_m$ admissible to the class $C_l$), it is possible to construct the following reflexive and transitive binary Inherence Dominance Relations $R^l$ over the set $A$:

$$R^l = \{(a_i, a_j) \in A \times A \mid \forall m \in \{1, \ldots, M\}: (a_{im}, a_{jm}) \in R_m^l\}, \quad l \in \{1, \ldots, L\}.$$

Then the following rules are applicable (Larichev et al. 1991):

1. If the expert assigns an object $a_i$ to class $C_l$, any object $a_j$ such that $(a_j, a_i) \in R^l$ belongs to $C_l$ as well.
2. If an object $a_i$ does not belong to class $C_l$, any object $a_j$ such that $(a_i, a_j) \in R^l$ does not belong to $C_l$ either.
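The two rules for the case of different diagnoses can be sketched as a small propagation routine. This is only an illustration under stated assumptions, not the NORClass implementation: the relations $R_m^l$ are encoded as integer ranks (lower rank means more inherent, and a value without a rank is treated as inadmissible to the class), the attribute “Pain duration” and its ranks are hypothetical, and the function names `dominates` and `propagate` are introduced here.

```python
from itertools import product


def dominates(a, b, ranks_l):
    """True if (a, b) is in R^l: every value of a is admissible to the class
    and not less inherent in it than the corresponding value of b."""
    for m, (x, y) in enumerate(zip(a, b)):
        r = ranks_l[m]
        if x not in r or y not in r:   # value not admissible to the class
            return False
        if r[x] > r[y]:                # x is less inherent than y
            return False
    return True


def propagate(obj, belongs, objects, ranks_l):
    """Indirect conclusions that follow from one direct expert decision.

    belongs=True  -> rule 1: objects dominating obj also belong to the class
    belongs=False -> rule 2: objects dominated by obj do not belong to it
    """
    if belongs:
        return {o: True for o in objects if o != obj and dominates(o, obj, ranks_l)}
    return {o: False for o in objects if o != obj and dominates(obj, o, ranks_l)}


# Ranks for the class "Myocardial Infarction". The order of the pain
# localization values follows the text; "Pain duration" is hypothetical.
ranks_mi = [
    {"retrosternal": 1, "left of sternum": 2},
    {"over 20 min": 1, "under 20 min": 2},
]
objects = list(product(*(r.keys() for r in ranks_mi)))

decided = ("retrosternal", "under 20 min")
print(propagate(decided, True, objects, ranks_mi))
print(propagate(decided, False, objects, ranks_mi))
```

In this toy example a single positive decision classifies one further object indirectly, and a single negative decision excludes another one, which is the mechanism that lets the expert skip part of the object space.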
Such rules are useful for the indirect classification of some objects on the basis of the direct classification of other objects by the expert. We include the admissibility matrix (as the auxiliary element) and the binary relations above (as the optional elements) in the MACP structure, since they may be defined initially at the stage of AD structuring (nevertheless, they may be adjusted at the stage of classification). Thus, MACP may be restated as two subproblems as follows:

It is given: some AD, each object of which may belong to one or more classes.

It is required:

1. To define the structure of the AD: $S = \langle C, Q, \{K_m\}, \{V_m\}, P^C, R_m^l \rangle$.
2. To assign the objects from $A = K_1 \times K_2 \times \cdots \times K_M$ to classes from $C$ on the basis of the expert’s knowledge so that the resulting classification is both complete (up to the expert’s knowledge) and consistent.

The structure defined as above allows one to determine the type of MACP and to choose an appropriate method for solving it. If the structure contains binary relation $P^C$ and the only IDR, the method ORCLASS (or its modifications from Table 1) is applicable. If the structure contains $L$ IDRs, the method NORClass is applicable. Finally, if all binary relations are absent (the classes are nonordered and/or the expert is not able to order the values of attributes according to their inherence in (typicality to) classes), the method STEPCLASS is applicable.
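The choice of method described above amounts to a simple lookup on the optional elements of the structure. The sketch below is one possible reading of that paragraph; the function name `choose_method` and its parameters are assumptions made for illustration, and combinations that the text does not cover are rejected rather than guessed.

```python
def choose_method(has_class_order: bool, n_idrs: int, n_classes: int) -> str:
    """Map the optional elements of the structure to a VDA-based method.

    has_class_order -- the structure contains the relation P^C
    n_idrs          -- number of Inherence Dominance Relations available
    n_classes       -- L, the number of classes
    """
    if has_class_order and n_idrs == 1:
        return "ORCLASS"      # or its modifications DIFCLASS, CYCLE
    if not has_class_order and n_classes > 0 and n_idrs == n_classes:
        return "NORClass"
    if not has_class_order and n_idrs == 0:
        return "STEPCLASS"
    raise ValueError("combination not covered by the methods discussed")


print(choose_method(True, 1, 4))
print(choose_method(False, 3, 3))
print(choose_method(False, 0, 3))
```

Ordered classes without any partial orders over the attribute scales correspond to the “N/A” cell of Table 1, which is why that combination raises an error here.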
4 Two approaches for MACP structuring

The fact discussed in Sect. 2 above, that an expert does not formulate his/her decision rules in practice but applies them while analyzing factual objects, remains true for such MACP structure elements as the names of classes, attributes, and their values, even if he/she is able to list some of them. Thus, it is necessary to support an expert not only in revealing her/his classification rules but in AD structuring as well. We propose to define an AD structure with either of two procedures, Explicit Structuring and Structuring by Examples. Both of these procedures begin with a request to the expert to list the classes. Such a task is relatively easy for a domain expert, since he/she knows the scope of the future DDSS. Nevertheless, since it is necessary to construct a complete and consistent knowledge base, it is impossible to require the expert to predefine all possible classes at this stage.

4.1 Explicit structuring

The Explicit Structuring (Fig. 1) is a simple branched procedure which allows the expert to list the attributes and to specify their values admissible for each class. Let us consider the procedure of Explicit Structuring for the case of three predefined classes $C_1$, $C_2$, and $C_3$ and two attributes $Q_1$ and $Q_2$ specified by the expert initially.
Fig. 1 Scheme of Explicit Structuring
1. The expert is asked to list the values of the attribute $Q_1$ admissible to the class $C_1$, and he/she indicates the values $k_{11}$ and $k_{12}$. Thereby, he/she implicitly defines the admissibility of each of these values for the given class: $v_{11}^1 = 1$, $v_{12}^1 = 1$.
2. The expert is asked to list the classes, other than $C_1$, the value $k_{11}$ is admissible to. Let the expert list the classes $C_2$ and $C_3$. This means that $v_{11}^2 = 1$ and $v_{11}^3 = 1$.
3. The expert is asked to list the classes, other than $C_1$, the value $k_{12}$ is admissible to. Let the expert indicate the class $C_3$. This means that $v_{12}^2 = 0$ (since he/she does not indicate the class $C_2$) and $v_{12}^3 = 1$.
4. The expert is asked to list the values of the attribute $Q_1$, other than $k_{11}$, admissible to the class $C_2$, and he/she indicates $k_{13}$ (thus, $v_{13}^2 = 1$).
5. The expert is asked to list the classes, other than $C_2$, the value $k_{13}$ is admissible to. Let the expert indicate the class $C_3$. This means that $v_{13}^1 = 0$ (since he/she does not indicate the class $C_1$) and $v_{13}^3 = 1$.
6. The expert is asked to list the values of the attribute $Q_1$, other than $k_{11}$, $k_{12}$, $k_{13}$, admissible to the class $C_3$, and he/she indicates $k_{14}$ (thus, $v_{14}^3 = 1$).
7. Finally, the expert is asked to list the classes, other than $C_3$, the value $k_{14}$ is admissible to. The expert indicates classes $C_1$ and $C_4$ (a new class). This means that $v_{14}^1 = 1$, $v_{14}^4 = 1$, and $v_{14}^2 = 0$ (since the class $C_2$ is not indicated).

And so on, until the expert says he/she is not able to recall a new value of the attribute $Q_1$. The same procedure is carried out for attribute $Q_2$. For each attribute $Q_m$, the expert has to perform $L \times n_m$ operations. To prove this, let us denote by $n_m^l$ the number of $Q_m$’s values the expert specifies as admissible to a
class Cl , l = 1, L (either initially as for C1 or additionally for each other class). The expert has to match each of such values with the remaining L − 1 classes. Thus, the number of such operations for a class Cl , l = 1, L, and attribute Qm is equal to nl + (L − 1)nlm = L × nlm , and the total number of operations for Qm is equal to mL L l l for all attributes Qm , m = 1, M, l=1 L × nm = L l=1 nm = Lnm . Consequently, the total number of operations is equal to L M n m=1 m , and, thus, this approach has polynomial algorithmic complexity. 4.2 Structuring by examples Another procedure for MAC structuring is the Structuring by Examples: 1. The expert enters at least one example of an object for each Class (either in free format, which is converted then in the production form “If-Then” or in the latter form immediately). 2. Then, she/he parses each example to determine the following: (i) The attributes whose values are given in the given example (ii) Other values each of these attributes may have for the given class (iii) Other classes each of the listed values of each attribute are admissible for (iv) Other values of the given attribute that may exist in objects belonging to other classes. (Steps 2(ii)–2(iv) are the same as in Fig. 1). Let us consider the procedure of Structuring by Examples for some fictitious problem of seasons’ classification. Such problem seems to be funny, but the weather behaves itself so confusingly, that the reclassification of seasons becomes a topical problem. So, we have four pre-defined classes: “Winter”, “Spring”, “Summer”, and “Autumn”. Let us assume further that our expert is a poetical person and that she/he extracts the example for the class “Autumn” from the poem of the great Russian poet Alexander Pushkin: “October has arrived—the woods have tossed Their final leaves from naked branches; A breath of autumn chill—the road begins to freeze. . 
.” Then, our expert converts this text into production form: If October , and Naked branches , and Road begins to freeze , then “Autumn”, and parses it answering the leading questions as in Fig. 2. As in the procedure of Explicit Structuring above, the admissibility of the attributes’ values to classes is determined automatically. Besides, in both procedures, should the expert be able ordering the attributes’ values according to their inherence in (typicality to) classes independently of the values of other attributes, she/he makes such orderings. The list of attributes and their values extracted through the example above parsing are shown in Table 2. Numbers “1” and “0” there indicate the admissibility of the attributes’ values to classes, and the ranks of the attributes’ values according to their inherence in (typicality) to classes are shown in parentheses. It should be noted that
Fig. 2 The example for class “Autumn” parsing

Table 2 Attributes and their values after example for “Autumn” parsing

| Attribute | Value | Autumn | Winter | Spring | Summer |
|---|---|---|---|---|---|
| Q1 (Month) | January | 1 (4) | 1 (1) | 0 | 0 |
| | February | 0 | 1 (1) | 0 | 0 |
| | March | 0 | 1 (3) | 1 (2) | 0 |
| | April | 0 | 0 | 1 (1) | 0 |
| | May | 0 | 0 | 1 (3) | 1 |
| | June | 0 | 0 | 0 | 1 (1) |
| | July | 0 | 0 | 0 | 1 (1) |
| | August | 1 (2) | 0 | 0 | 1 (1) |
| | September | 1 (1) | 0 | 0 | 1 (2) |
| | October | 1 (1) | 1 (4) | 0 | 1 (3) |
| | November | 1 (1) | 1 (2) | 0 | 0 |
| | December | 1 (3) | 1 (1) | 0 | 0 |
| Q2 (Foliage) | Naked branches | 1 (2) | 1 (1) | 1 (3) | 0 |
| | Yellow and red leaves | 1 (1) | 0 | 0 | 0 |
| | Plants shoot out buds | 0 | 0 | 1 (1) | 0 |
| | Exuberant foliage | 1 (3) | 0 | 1 (2) | 1 (1) |
| | Leaves are pert and green | 0 | 0 | 1 | 1 (1) |
| Q3 (Ground frosts) | Hoar-frosty ground | 1 (1) | 1 (2) | 1 (2) | 0 |
| | There are ground frosts | 1 (1) | 1 (1) | 1 (3) | 0 |
| | There are no ground frosts | 1 (2) | 0 | 1 (1) | 1 (1) |
Fig. 3 The example for class “Winter” parsing
such ranks reflect the order of the values according to their inherence in the classes, and the least number corresponds to the most inherent value.

Then the expert enters the example for class “Winter”: If Night temperature is below −10°C, Day temperature is not higher than −5°C, and Thick snow cover, then “Winter”. As one can see, the expert has defined three new attributes in this example, “Night temperature”, “Day temperature”, and “Snow cover”, which were not indicated in the previous example (for class “Autumn”). The expert either forgot to indicate these attributes previously or did not consider them important for the class “Autumn”. Nevertheless, it is necessary to determine whether the values of the new attributes that the expert indicated for “Winter” are admissible for other classes, including “Autumn” (see Fig. 3). The list of the attributes and their values (along with their admissibility and ranks) after parsing the second example is shown in Table 3. The examples for the other classes (“Spring” and “Summer”) are entered and parsed similarly. As a result, the sets of classes, attributes, and their values are obtained, along with the admissibility of the attributes’ values for each class and (if applicable) their ranks according to their inherence in (typicality to) the classes. The algorithmic complexity of this procedure is comparable with that of Explicit Structuring.

4.3 Discussion

Both Explicit Structuring and Structuring by Examples support an expert in determining the names of classes, attributes, and their values. Besides, these procedures yield auxiliary information, namely the admissibility of values for classes, and, if applicable, optional information such as the order over the classes and/or the partial orders of the attributes’ values according to their inherence in (typicality to) classes.
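As a rough check of the operation count stated for Explicit Structuring, $L \sum_{m} n_m$, the following Python sketch enumerates one admissibility judgment per (attribute value, class) pair and compares the tally with the closed form. The function names and the input layout are illustrative assumptions; the value counts (12 months, 5 foliage values, 3 ground-frost values) and the four classes are those of the seasons example.

```python
def count_admissibility_judgments(values_per_attribute, num_classes):
    """Tally one judgment for every (attribute, value, class) triple."""
    judgments = 0
    for n_values in values_per_attribute.values():
        for _value in range(n_values):
            for _cls in range(num_classes):
                judgments += 1
    return judgments


def closed_form(values_per_attribute, num_classes):
    """Closed form L * sum_m n_m for the same tally."""
    return num_classes * sum(values_per_attribute.values())


# Numbers of values per attribute in the seasons example (Table 2).
seasons = {"Month": 12, "Foliage": 5, "Ground frosts": 3}

total = count_admissibility_judgments(seasons, num_classes=4)
assert total == closed_form(seasons, num_classes=4)
print(total)  # 80
```

The count grows linearly in both the number of classes and the total number of attribute values, which is the sense in which the procedure is polynomial.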
Table 3 Attributes and their values after example for “Winter” parsing

| Attribute | Value | Autumn | Winter | Spring | Summer |
|---|---|---|---|---|---|
| Q4 (Night temperature) | Below −10°C | 1 | 1 (1) | 0 | 0 |
| | −10 to −5°C | 1 (4) | 1 (2) | 1 (3) | 0 |
| | −4 to +5°C | 1 (1) | 1 (3) | 1 (1) | 0 |
| | +6 to +15°C | 1 (2) | 0 | 1 (2) | 1 (2) |
| | Above 15°C | 1 (3) | 0 | 1 | 1 (1) |
| Q5 (Day temperature) | Is not higher than −5°C | 1 | 1 | 1 | 0 |
| | −5 to +5°C | 1 (2) | 1 (1) | 1 (3) | 0 |
| | +6 to +15°C | 1 (1) | 0 | 1 (1) | 0 (3) |
| | +16 to +25°C | 1 (3) | 0 | 1 (2) | 1 (2) |
| | Above 25°C | 1 (4) | 0 | 1 (4) | 1 (1) |
| Q6 (Snow cover) | Thick snow cover | 0 | 1 (1) | 0 | 0 |
| | Thin snow cover | 1 (2) | 1 (2) | 1 (2) | 0 |
| | There is no snow cover | 1 (1) | 1 (3) | 1 (1) | 1 (1) |
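Admissibility tables such as Tables 2 and 3 can be held as a simple lookup structure. The Python sketch below stores, for a few of the tabulated values, the set of classes for which each value is admissible and intersects these sets to find the classes that remain possible for an object. Using admissibility to prune candidate classes in this way is an assumption made for illustration, not a step described in the text; the names `ADMISSIBLE` and `candidate_classes` are hypothetical, and “No snow cover” abbreviates the corresponding table value.

```python
CLASSES = ("Autumn", "Winter", "Spring", "Summer")

# attribute -> value -> classes for which the value is admissible ("1" in the tables)
ADMISSIBLE = {
    "Foliage": {
        "Naked branches": {"Autumn", "Winter", "Spring"},
        "Yellow and red leaves": {"Autumn"},
        "Plants shoot out buds": {"Spring"},
        "Exuberant foliage": {"Autumn", "Spring", "Summer"},
    },
    "Snow cover": {
        "Thick snow cover": {"Winter"},
        "Thin snow cover": {"Autumn", "Winter", "Spring"},
        "No snow cover": {"Autumn", "Winter", "Spring", "Summer"},
    },
}


def candidate_classes(obj):
    """Return the classes for which every value of the object is admissible."""
    candidates = set(CLASSES)
    for attribute, value in obj.items():
        candidates &= ADMISSIBLE[attribute][value]
    # Report the result in the fixed class order for readability.
    return [c for c in CLASSES if c in candidates]


print(candidate_classes({"Foliage": "Naked branches", "Snow cover": "Thick snow cover"}))
print(candidate_classes({"Foliage": "Exuberant foliage", "Snow cover": "Thin snow cover"}))
print(candidate_classes({"Foliage": "Yellow and red leaves", "Snow cover": "Thick snow cover"}))
```

Under these entries the first object can only be “Winter”, the second can be “Autumn” or “Spring”, and the third combination is admissible for no class, so it would not need to be presented to the expert as a realistic object.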
One of the major requirements for the structure of a decision-making problem is its completeness, so that nothing that has an important influence is left out (Saaty and Shih 2009). Nevertheless, although the procedure of Explicit Structuring involves special leading questions that help an expert to recall a new class and/or new values of predefined attributes admissible for different classes (e.g., a new class C4 at Step 7 above and the value k14 admissible for the class C1 but not indicated at Step 1), such structuring is not part of her/his routine activity. While classifying objects, an expert knows the attributes she/he has to pay primary attention to, and she/he most likely lists such attributes without difficulty. However, depending on the values of these initial attributes, the analysis of an object may proceed in different directions and may require ascertaining the values of different attributes. Thus, an expert has to imagine combinations of the initial attributes’ values in order to recall such additional attributes. So it is difficult (if it is possible at all) to require a complete domain structure from an expert at this stage.

Structuring by Examples is more effective than Explicit Structuring, since it does not require an expert to list attributes and their values but allows deriving them from the examples for each class. Besides, in contrast to Explicit Structuring, this procedure allows adding new attributes. However, this approach is alien to an expert’s practice as well. As a rule, an expert gives only typical examples for each class, in which the attributes required for classifying more difficult and infrequent objects are absent. Thus, a structure defined with either approach above may be considered as preliminary only, and it is necessary to provide for its adjustment and/or extension at the stage of classification itself (Fig. 4).
While analyzing an object, the expert has to be able, if necessary, to indicate a new class not specified by him/her at the stage of AD structuring and/or to request the value of a new attribute not defined previously. Such structure elements may be incorporated into the AD structure at the classification stage through the procedure of Explicit
Fig. 4 Structuring and classification as interconnected procedures
Structuring, which also allows one to specify the admissibility of all existing attributes’ values for a new class and the admissibility of a new attribute’s values for the existing classes (along with the corresponding orderings, if applicable).
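The bookkeeping implied by such an extension can be sketched as follows. In this minimal Python illustration, adding a class returns one pending admissibility question for every existing attribute value, and adding an attribute returns one question for each of its values and every existing class. The class `DomainStructure` and its methods are hypothetical names introduced here; they are not part of STEPCLASS or of the procedures described in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class DomainStructure:
    classes: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)  # attribute name -> list of values

    def add_class(self, new_class):
        """Add a class and return the (attribute, value, class) questions still owed."""
        self.classes.append(new_class)
        return [
            (attribute, value, new_class)
            for attribute, values in self.attributes.items()
            for value in values
        ]

    def add_attribute(self, name, values):
        """Add an attribute and return its questions against every existing class."""
        values = list(values)
        self.attributes[name] = values
        return [(name, value, cls) for value in values for cls in self.classes]


structure = DomainStructure(
    classes=["Autumn", "Winter"],
    attributes={"Foliage": ["Naked branches", "Yellow and red leaves"]},
)

pending_for_class = structure.add_class("Spring")
pending_for_attribute = structure.add_attribute(
    "Snow cover", ["Thick snow cover", "Thin snow cover", "There is no snow cover"]
)
print(len(pending_for_class), len(pending_for_attribute))  # 2 9
```

The two lists make explicit what the expert still has to answer after each extension, so the structure never contains a class or a value whose admissibility is undefined.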
5 Decomposition of large-size MACPs

The performance of various knowledge acquisition techniques suffers from the so-called curse of dimensionality (Huang 2003). As noted in Puppe (1998), since the difficulty of knowledge acquisition increases exponentially with the dimension of a knowledge base, it is impossible to assume that a knowledge base is complete, either with respect to the coverage of all possibly relevant objects or with respect to the correctness and completeness of the knowledge used to draw conclusions from them.

Such “impossibility” is debatable. Indeed, if an expert determines the structure of his/her AD with a large number of attributes and a large number of values of each attribute, the number of hypothetically possible objects to be classified amounts to millions or even billions, and it is impractical to construct the complete knowledge base within an acceptable period of time, even though the methods listed in Table 1 significantly reduce the number of objects presented to an expert for direct classification. Nevertheless, as a rule, in practice an expert does not deal with the whole set of attributes simultaneously; he/she analyses some of them either independently of the others or taking into account a preliminary decision based on the analysis of combinations of other attributes.

One way to address this challenge is to decompose the problem into a number of subproblems with reduced structure that are as close to the expert’s practice as possible. The basic approaches to large-size problem decomposition are summarized in Levin (2006). The methods in Table 1 provide for a combined approach to problem decomposition: decomposition by attributes and/or decomposition by classes.

Decomposition by attributes is as follows: the set of attributes is analysed with a view to dividing it into subsets that are meaningful for the expert and, thus, to forming hierarchical substructures for a number of classification subproblems. To explain the term “meaningfulness” used here, let us call the attributes of a substructure “primary” ones. Then, “meaningfulness” means that the expert is able to divide all hypothetically possible combinations of the values of all primary attributes in the given substructure into classes (i.e., preliminary diagnoses) that she/he can use as the values of a new “aggregate” attribute in the original problem instead of the set of such primary attributes.

To illustrate this approach, let us consider not a fictitious problem, as in the previous section, but a real-world one, namely Differential Diagnostics of Bronchial Asthma in Children (DDBAC). A knowledge base for this problem has been constructed with STEPCLASS by Dr. L. Sokolova, commissioned by the Moscow Institute of Pediatrics and Children’s Surgery. The structure of the DDBAC problem consists of 13 classes (diagnoses) and 34 attributes, and the number of values per attribute varies from 2 to 10; thus, the total number of all possible objects to be classified amounts to some tens of billions. While analyzing the attributes, the expert divided them into meaningful groups in accordance with the sections of the uniform case history of a pulmonology patient, such as Complaints, Clinical Examination, Anamnesis, Blood Test, X-Ray Data, and Respiratory Function. As a result, the problem has been decomposed into subproblems so that each of them had an “acceptable” dimension (2–5 attributes) and was not laborious for the expert (Fig. 5).

Fig. 5 Decomposition by Attributes for DDBAC problem
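The effect of decomposition by attributes on the number of hypothetically possible objects can be illustrated with a small calculation. In the Python sketch below, the group names follow the case-history sections mentioned above, but the numbers of values per attribute and the numbers of preliminary diagnoses per group are hypothetical and are not taken from the DDBAC knowledge base.

```python
from math import prod

# Hypothetical numbers of values of the primary attributes in each group.
groups = {
    "Complaints": [3, 4, 2],
    "Clinical Examination": [2, 5, 3, 2],
    "Blood Test": [4, 3],
}

# Hypothetical numbers of preliminary diagnoses (values of the aggregate attributes).
aggregate_values = {"Complaints": 3, "Clinical Examination": 4, "Blood Test": 3}

# Without decomposition: every combination of all primary attribute values.
flat = prod(n for counts in groups.values() for n in counts)

# With decomposition: one subproblem per group plus the top-level problem
# stated in terms of the aggregate attributes.
subproblems = sum(prod(counts) for counts in groups.values())
top_level = prod(aggregate_values.values())
decomposed = subproblems + top_level

print(flat, decomposed)  # 17280 132
```

Even in this toy setting the product over all attributes is replaced by a sum of much smaller products, which is why subproblems of 2 to 5 attributes remain manageable for an expert.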
Decomposition by classes (Fig. 6) is used when some attributes are relevant to some, but not all, classes, and such attributes are used to differentiate such
Fig. 6 Decomposition by Classes
classes and/or to clarify classification/diagnostic decisions. Thus, the attributes may be divided into common ones (for each class, there is at least one admissible value of each common attribute) and subsets of attributes that are relevant only for a given subset of classes (it should be noted that such subsets may share attributes). Thus, the expert first uses only the common attributes to divide the combinations of their values into groups of the original classes, and then she/he proceeds to classification within such groups using the relevant attributes.
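The division into common and class-specific attributes can be derived mechanically from admissibility information. The Python sketch below applies the definition given above: an attribute is common if every class has at least one admissible value of it; otherwise it is recorded together with the subset of classes it is relevant for. The function name, the data layout, and the diagnoses D1 to D3 with their attributes are hypothetical illustrations.

```python
def split_attributes(admissible, classes):
    """Split attributes into common ones and those relevant to a subset of classes.

    admissible: attribute -> value -> set of classes for which the value is admissible.
    """
    all_classes = set(classes)
    common, specific = [], {}
    for attribute, by_value in admissible.items():
        # Classes that have at least one admissible value of this attribute.
        relevant = set().union(*by_value.values()) & all_classes
        if relevant == all_classes:
            common.append(attribute)
        else:
            specific[attribute] = relevant
    return common, specific


# Hypothetical admissibility data for three diagnoses.
admissible = {
    "Cough": {"dry": {"D1", "D2"}, "productive": {"D2", "D3"}},
    "X-ray shadow": {"present": {"D2"}, "absent": {"D3"}},
}

common, specific = split_attributes(admissible, ["D1", "D2", "D3"])
print(common)    # ['Cough']
print(specific)
```

Here “Cough” would be used first to form groups of classes, and “X-ray shadow” only afterwards, within the group containing D2 and D3.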
6 Conclusion

AD structuring for a knowledge-based MACP is an important and complex problem, since it is not part of an expert’s practice: he/she does not formulate structure elements in his/her daily activity but applies them while analyzing real objects. The proposed techniques of Explicit Structuring and Structuring by Examples support an expert in determining the names of classes, attributes, and their values and in obtaining auxiliary information, namely the admissibility of values for classes, and, if applicable, optional information such as the order over the classes and/or partial orders of the attributes’ values according to their inherence in (typicality to) classes. However, a structure defined with either approach may be considered as preliminary only, and it is necessary to provide for its adjustment and/or extension at the stage of classification itself.

The approaches proposed in the paper, both for MACP structuring and for reducing a large-size MACP to subproblems of acceptable dimension, have been applied to a number of real-world problems. Some of these are listed in Table 4 along with the numbers of their structure elements, the decomposition techniques applied, and the total time spent by the corresponding expert to define a preliminary structure and then to construct the complete (up to his/her knowledge) and consistent set of classification rules.
Table 4 DDSSs constructed in STEPCLASS environment

| Problem | Classes | Attributes | Values | Decomposition by attributes | Decomposition by classes | Time spent by the expert on knowledge base construction |
|---|---|---|---|---|---|---|
| Diagnostics of the fluid inflows into exploratory well | 8 | 15 | 2–8 | + | – | 14 working days |
| Diagnostics of genetic types of terrigenous and carbonate deposits | 87 | 124 | 5–10 | + | + | 45 working days |
| Differential diagnostics of Bronchial Asthma in Children | 13 | 34 | 2–10 | + | + | 20 working days |
Acknowledgements This work is partially supported by the Russian Academy of Sciences, Research Programs “Basic Problems of Informatics and Information Technologies”, “Bases of Information Technologies and Systems”, the Russian Foundation for Basic Research (projects 08-01-00247, 08-07-89***, 07-01-00515).
References

Ashikhmin I, Furems E, Petrovsky A, Sternin M (2008) Intelligent DSS under verbal decision analysis. In: Adam F, Humphreys P (eds) Encyclopedia of decision making and decision support technologies, vols 1, 2. Information Science Reference, Hershey, pp 514–527
Chen Y (2006) Multiple criteria decision analysis: classification problems and solutions. A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of PhD in Systems Design Engineering, Waterloo, Ontario, Canada. Available via http://etd.uwaterloo.ca/etd/y3chen2006.pdf
Doyle J (2004) Prospects for preferences. Comput Intell 20(2):112–136
Figueira J, Mousseau V, Roy B (2005) Electre methods. In: Figueira J, Greco S, Ehrgott M (eds) Multiple criteria decision analysis: state of the art surveys. Springer, Berlin, pp 133–162
Furems E (2008) Knowledge-based multi-attribute classification problems structuring. In: Ruan D, Montero J et al (eds) Computational intelligence in decision and control. Proceedings of the 8th international FLINS conference. World Scientific, Singapore, pp 465–470
Furems E, Gnedenko L (1996) Stepclass – a system for eliciting expert knowledge and conducting expert analyses for solving diagnostic problems. Autom Doc Math Linguist 30(5):23–28. Translations of selected articles from Nauchno-Tekhnicheskaya Informatsiya
Greco S (2008) Dominance-based rough set approach for decision analysis – a tutorial. Lecture notes in computer science, vol 5009/2008. Springer, Berlin, pp 23–24
Greco S, Matarazzo B, Slowinski R (1998) A new rough set approach to evaluation of bankruptcy risk. In: Zopounidis C (ed) Operational tools in the management of financial risks. Kluwer Academic, Dordrecht, pp 121–136
Greco S, Matarazzo B, Slowinski R (2002) Rough sets methodology for sorting problems in presence of multiple attributes and criteria. Eur J Oper Res 138:247–259
Greco S, Matarazzo B, Slowinski R (2005) Decision rule approach. In: Figueira J, Greco S, Ehrgott M (eds) Multiple criteria decision analysis: state of the art surveys. Springer, Berlin, pp 507–562
Huang S (2003) Dimensionality reduction in automatic knowledge acquisition: a simple greedy search approach. IEEE Trans Knowl Data Eng 15(6):1364–1373
Jacquet-Lagreze E, Siskos Y (1982) Assessing a set of additive utility functions for multicriteria decision making: the UTA method. Eur J Oper Res 10:151–164
Keeney R, Raiffa H (1976) Decisions with multiple objectives: preferences and value tradeoffs. Wiley, New York
Larichev O, Bolotov A (1996) The DIFKLASS system: construction of complete and noncontradictory expert knowledge bases in problems of differential classification. Autom Doc Math Linguist 30(5):12–17. Translations of selected articles from Nauchno-Tekhnicheskaya Informatsiya
Larichev O, Moshkovich H (2001) Verbal decision analysis for unstructured problems. Kluwer Academic, Dordrecht
Larichev O, Naryzhny Y (1999) Computer-based tutoring of medical procedural knowledge. In: Lajoie S, Vivet M (eds) Artificial intelligence in education. IOS Press, Amsterdam, pp 517–523
Larichev O, Moshkovich H, Furems E et al (1991) Knowledge acquisition for the construction of the full and contradiction free knowledge bases. Iec ProGAMMA, Groningen
Larichev O, Asanov A, Naryzhny Y, Strahov S (2001) ESTHER – expert system for the diagnostics of acute drug poisonings. In: Macintosh A, Moulton M, Preece A (eds) Applications and innovations in intelligent systems IX, Proceedings of ES2001, the 21 SGES international conference on knowledge based systems and applied artificial intelligence. Springer, Berlin, pp 159–168
Larichev O, Asanov A, Naryzhny Y (2002) Effectiveness evaluation of expert classification methods. Eur J Oper Res 138(2):260–273
Levin M (2006) Composite systems decisions. Springer, New York
Montero J, Gómez D, Bustince H (2007) On the relevance of some families of fuzzy sets. Fuzzy Sets Syst 158:2429–2442
Puppe F (1998) Knowledge reuse among diagnostic problem-solving methods in the shell-kit D3. Int J Hum-Comput Stud 49:627–649
Roy B (1981) A multicriteria analysis for trichotomic segmentation problems in multiple criteria analysis. In: Nijkamp P, Spronk J (eds) Operational methods. Gower Press, Farnborough, pp 245–257
Saaty T, Shih H-S (2009) Structures in decision making: on the subjective geometry of hierarchies and networks. Eur J Oper Res 199(3):867–872
Scheubrein R, Zionts S (2006) A problem structuring front end for a multiple criteria decision support system. Comput Oper Res 33(1):18–31
Simon H (1960) The new science of management decision. Harper and Row, New York
von Neumann J, Morgenstern O (1947) Theory of games and economic behavior, 2nd edn. Princeton University Press, Princeton
von Winterfeldt D (1980) Structuring decision problems for decision analysis. Acta Psychol 45:71–93
Wallenius J, Dyer J, Fishburn P, Steuer R, Zionts S, Deb K (2008) Multiple criteria decision making, multiattribute utility analysis: recent accomplishments and what lies ahead. Manag Sci 54(7):1336–1349
Zadeh L (1965) Fuzzy sets. Inf Control 8:338–353
Zopounidis C (2002) MCDA methodologies for classification and sorting. Eur J Oper Res 138(2):227–228
Zopounidis C, Doumpos M (2002) Multicriteria classification and sorting methods: a literature review. Eur J Oper Res 138(2):229–246
Zopounidis C, Doumpos M (2004) Multicriteria decision aid in classification problems. EWG-MCDA Newsletter, Ser 3(10)