IVTURS: A Linguistic Fuzzy Rule-Based Classification System Based On a New Interval-Valued Fuzzy Reasoning Method With Tuning and Rule Selection

José Antonio Sanz, Alberto Fernández, Humberto Bustince, Member, IEEE, and Francisco Herrera, Member, IEEE
Abstract—Interval-valued fuzzy sets have been shown to be a useful tool to deal with the ignorance related to the definition of the linguistic labels. Specifically, they have been successfully applied to solve classification problems, performing simple modifications on the fuzzy reasoning method to work with this representation and making the classification based on a single number. In this paper, we present IVTURS, which is a new linguistic fuzzy rule-based classification method based on a new completely interval-valued fuzzy reasoning method. This inference process uses interval-valued restricted equivalence functions to increase the relevance of the rules in which the equivalence of the interval membership degrees of the patterns and the ideal membership degrees is greater, which is a desirable behavior. Furthermore, their parametrized construction allows the computation of the optimal function for each variable to be performed, which could involve a potential improvement in the system’s behavior. Additionally, we combine this tuning of the equivalence with rule selection in order to decrease the complexity of the system. In this paper, we name our method IVTURS-FARC, since we use the FARC-HD method to accomplish the fuzzy rule learning process. The experimental study is developed in three steps in order to ascertain the quality of our new proposal. First, we determine both the essential role that interval-valued fuzzy sets play in the method and the need for the rule selection process. Next, we show the improvements achieved by IVTURS-FARC with respect to the tuning of the degree of ignorance when it is applied in both an isolated way and when combined with the tuning of the equivalence. Finally, the significance of IVTURS-FARC is further depicted by means of a comparison by which it is proved to outperform the results of FARC-HD and FURIA, which are two high performing fuzzy classification algorithms. Index Terms—Fuzzy reasoning method, interval-valued fuzzy sets (IVFS), interval-valued restricted equivalence functions (IV-REF), linguistic fuzzy rule-based classification systems (FRBCS), rule selection, tuning.
Manuscript received February 22, 2012; revised September 19, 2012 and December 28, 2012; accepted January 10, 2013. Date of publication January 25, 2013; date of current version May 29, 2013. This work was supported in part by the Spanish Ministry of Science and Technology under Project TIN2011-28488 and Project TIN2010-15055, as well as by Andalusian Research under Plan P10-TIC-6858 and Plan P11-TIC-7765.
J. A. Sanz and H. Bustince are with the Departamento de Automática y Computación, Universidad Pública de Navarra, 31006 Pamplona, Spain (e-mail: [email protected]; [email protected]).
A. Fernández is with the Department of Computer Science, University of Jaén, 23071 Jaén, Spain (e-mail: [email protected]).
F. Herrera is with the Department of Computer Science and Artificial Intelligence, Research Center on Information and Communications Technology (CITIC-UGR), University of Granada, 18071 Granada, Spain (e-mail: [email protected]).
Digital Object Identifier 10.1109/TFUZZ.2013.2243153
I. INTRODUCTION

FUZZY rule-based classification systems (FRBCSs) have been widely employed in the field of pattern recognition and classification problems [1]–[8]. Aside from their good performance, FRBCSs are adequate since they also provide a linguistic model that is interpretable to the users, because they are composed of a set of rules built from linguistic terms [9], [10].
One of the key points in the subsequent success of fuzzy systems (like FRBCSs) is the choice of the membership functions [11]. This is a complex problem due to the uncertainty related to their definition, whose source can be both the intrapersonal and the interpersonal uncertainty associated with the linguistic terms [12], [13]. Interval-valued fuzzy sets (IVFSs) [14] have proven to be an appropriate tool to model the system uncertainties and the ignorance in the definition of the fuzzy terms [15]. An IVFS provides an interval, instead of a single number, as the membership degree of each element to this set. The length of the interval can be seen as a representation of the ignorance related to the assignment of a single number as membership degree [16]. IVFSs have been successfully applied in computing with words [17], mobile robots [18], and image processing [19], [20], among others.
In this paper, we present IVTURS, which is short for linguistic FRBCS based on an Interval-Valued fuzzy reasoning method (IV-FRM) with TUning and Rule Selection. The main contribution of IVTURS is a novel IV-FRM in which the ignorance represented by the IVFSs is taken into account throughout the reasoning process. To do so, we completely extend the classical fuzzy reasoning method [21], including the computation of the matching degree using interval-valued restricted equivalence functions (IV-REFs) [22], [23]. The goal is to quantify how equivalent the interval membership degrees of the antecedents of the rules are to the ideal interval membership degree ([1, 1]). In a nutshell, the higher the equivalence between the example and the antecedent, the greater the significance of the rule in the decision process.
IV-REFs are constructed using parametrized functions. It is, therefore, easy to construct different functions by modifying the values of their parameters. This makes it possible to compute the most suitable set of IV-REFs for each specific problem by defining a genetic tuning to accomplish this optimization problem. Additionally, we combine it with a fuzzy rule selection process
as it is a well-known synergy in this field to improve both the interpretability and the accuracy of the final fuzzy system [10].
IVTURS is composed of three stages: 1) the generation of an initial Interval-Valued Fuzzy Rule-Based Classification System (IV-FRBCS). To do this, we first learn the Rule Base (RB) using the recent fuzzy rule learning algorithm known as FARC-HD [1] (Fuzzy Association Rule-based Classification model for High Dimensional problems). Then, we model its linguistic labels with IVFSs and initialize the IV-REF for each variable of the problem; 2) the application of the new IV-FRM, which makes use of IV-REFs; and 3) the optimization step using the proposed synergy between the tuning of the equivalence and rule selection.
In this paper, our methodology is built up over the FARC-HD algorithm in order to learn the initial FRBCS, hence denoting the whole model as IVTURS-FARC. We show the goodness and high potential of IVTURS-FARC first by determining the suitability of the synergy produced when combining IVFSs, tuning, and rule selection and then by studying whether the new IV-FRBCS enhances the results achieved by the tuning of the weak ignorance [24] when it is performed both individually and combined with the new tuning of the equivalence. Furthermore, we analyze the significance of the results obtained with IVTURS-FARC versus the ones achieved by two of the best performing fuzzy methods published in the specialized literature, i.e., the original FARC-HD model [1] and the FURIA algorithm [2].
The performance of the proposals will be evaluated according to the accuracy rate and will be tested over a wide collection of datasets selected from the KEEL dataset repository (http://www.keel.es/dataset.php) [25], [26]. We will use some nonparametric tests [27]–[29] for the purpose of showing the significance of the performance improvements achieved by our proposal.
This paper is arranged as follows. In Section II, we recall some preliminary concepts in both FRBCSs and IVFSs theory, together with the description of the construction method of IV-REFs used in this paper. Next, we introduce in detail the components of IVTURS-FARC in Section III, which involves the description of the initialization of the components of the IV-FRBCS, the definition of the new IV-FRM, and the proposal to combine the genetic tuning of the system's parameters with rule selection. The experimental framework and the results obtained by the application of our approaches, together with the corresponding analysis, are presented in Sections IV and V, respectively. We finish this paper with the main concluding remarks in Section VI.

II. PRELIMINARIES

In this section, for the sake of completeness, we first review several preliminary concepts in both FRBCSs and IVFSs. Then, we recall the concept of IV-REFs [22], [23] and the construction method of these functions used in this paper.

A. Fuzzy Rule-Based Classification Systems

There are a lot of techniques used to deal with classification problems in the data mining field. Among them, FRBCSs are
widely employed as they provide an interpretable model by means of the use of linguistic labels in their rules. The two main components of FRBCSs are as follows.
1) Knowledge base: This is composed of both the RB and the database, where the rules and the membership functions are stored, respectively.
2) Fuzzy reasoning method: This is the mechanism used to classify objects using the information stored in the knowledge base.
In order to generate the knowledge base, a fuzzy rule learning algorithm is applied that uses a set of $P$ labeled patterns $x_p = (x_{p1}, \ldots, x_{pn})$, $p = 1, 2, \ldots, P$, where $x_{pi}$ is the $i$th attribute value ($i = 1, 2, \ldots, n$). Each of the $n$ attributes is described by a set of linguistic terms together with their corresponding membership functions. In this paper, we consider the use of fuzzy rules of the following form:

Rule $R_j$: If $x_1$ is $A_{j1}$ and $\ldots$ and $x_n$ is $A_{jn}$ then Class $= C_j$ with $RW_j$ \qquad (1)

where $R_j$ is the label of the $j$th rule, $x = (x_1, \ldots, x_n)$ is an $n$-dimensional pattern vector, $A_{ji}$ is an antecedent fuzzy set representing a linguistic term, $C_j$ is the class label, and $RW_j$ is the rule weight [30]. Specifically, in this paper, we consider the computation of the rule weight using the most common specification, i.e., the fuzzy confidence value or certainty factor defined in [31] as

$$RW_j = CF_j = \frac{\sum_{x_p \in \mathrm{Class}\, C_j} \mu_{A_j}(x_p)}{\sum_{p=1}^{P} \mu_{A_j}(x_p)} \qquad (2)$$

where $\mu_{A_j}(x_p)$ is the matching degree of the pattern $x_p$ with the antecedent part of the fuzzy rule $R_j$.
Let $x_p = (x_{p1}, \ldots, x_{pn})$ be a new pattern to be classified, let $L$ denote the number of rules in the RB, and let $M$ be the number of classes of the problem; then, the steps of the fuzzy reasoning method [21] are as follows.
1) Matching degree, i.e., the strength of activation of the if-part for all rules in the RB with the pattern $x_p$. A conjunction operator (t-norm) $T$ is applied in order to carry out this computation:

$$\mu_{A_j}(x_p) = T(\mu_{A_{j1}}(x_{p1}), \ldots, \mu_{A_{jn}}(x_{pn})), \quad j = 1, \ldots, L. \qquad (3)$$

2) Association degree. This step computes the association degree of the pattern $x_p$ with the $M$ classes according to each rule in the RB. To this aim, a combination operator $h$ is applied to combine the matching degree with the rule weight. When using rules in the form shown in (1), this association degree only refers to the consequent class of the rule (i.e., $k = \mathrm{Class}(R_j)$):

$$b_j^k = h(\mu_{A_j}(x_p), RW_j^k), \quad k = 1, \ldots, M, \quad j = 1, \ldots, L. \qquad (4)$$

3) Pattern classification soundness degree for all classes. We use an aggregation function $f$, which combines the positive
degrees of association calculated in the previous step:

$$Y_k = f(b_j^k,\; j = 1, \ldots, L \ \text{and}\ b_j^k > 0), \quad k = 1, \ldots, M. \qquad (5)$$

4) Classification. We apply a decision function $F$ over the soundness degree of the system for the pattern classification for all classes. This function determines the class label $l$ corresponding to the maximum value:

$$F(Y_1, \ldots, Y_M) = \arg\max_{k = 1, \ldots, M} (Y_k). \qquad (6)$$
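To make the four inference steps above concrete, the following minimal Python sketch implements them for the configuration used later in this paper (product t-norm as conjunction and combination operator, additive combination as the aggregation function f). The rule encoding (tuples of per-variable membership functions) and the function names are illustrative assumptions, not part of the original FARC-HD implementation.

import numpy as np

def matching_degree(rule_mfs, x):
    """Eq. (3): product t-norm over the antecedent membership degrees."""
    return float(np.prod([mf(v) for mf, v in zip(rule_mfs, x)]))

def classify(rules, x, n_classes):
    """Classical FRM: matching (3), association (4), additive soundness (5), argmax (6)."""
    Y = np.zeros(n_classes)
    for rule_mfs, class_label, rule_weight in rules:
        mu = matching_degree(rule_mfs, x)           # step 1
        b = mu * rule_weight                        # step 2, h = product
        if b > 0:
            Y[class_label] += b                     # step 3, additive combination
    return int(np.argmax(Y))                        # step 4

# Toy usage: two triangular labels on one variable, two rules, two classes.
def tri(a, b, c):
    return lambda v: max(0.0, min((v - a) / (b - a) if b > a else 1.0,
                                  (c - v) / (c - b) if c > b else 1.0))

rules = [((tri(0.0, 0.0, 0.5),), 0, 0.9),   # "if x is Low  then class 0 with RW 0.9"
         ((tri(0.5, 1.0, 1.0),), 1, 0.8)]   # "if x is High then class 1 with RW 0.8"
print(classify(rules, [0.2], n_classes=2))  # -> 0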
B. Interval-Valued Fuzzy Sets

In this section, we introduce the IVFSs' theoretical concepts that are necessary to understand the paper. First, we recall the definition of IVFSs with our interpretation of their length. Then, we recall both the intersection operation on IVFSs, which will be used to carry out the conjunction among the antecedents of the rules, and the complement operation on IVFSs, which is used in the definition of IV-REFs. Next, we present the interval arithmetic that will be used to compute the rule weight as an element of $L([0, 1])$. Finally, we introduce the total order relationship for intervals, which will be used in the classification step of the IV-FRM.
Let us denote by $L([0, 1])$ the set of all closed subintervals in $[0, 1]$, i.e., $L([0, 1]) = \{\mathbf{x} = [\underline{x}, \overline{x}] \mid (\underline{x}, \overline{x}) \in [0, 1]^2 \ \text{and}\ \underline{x} \leq \overline{x}\}$.
Definition 1 [32], [33]: An IVFS (or interval type-2 fuzzy set) $A$ on the universe $U \neq \emptyset$ is a mapping $A^{IV} : U \to L([0, 1])$ so that

$$A^{IV}(u_i) = [\underline{A}(u_i), \overline{A}(u_i)] \in L([0, 1]) \quad \text{for all } u_i \in U.$$

We denote by $\mathcal{L}$ the length of the interval under consideration, i.e., $\mathcal{L}(A^{IV}(u_i)) = \overline{A}(u_i) - \underline{A}(u_i)$. The length of the IVFSs can be seen as a representation of the ignorance related to the definition of the membership functions [16]. Independently of the source of the ignorance, it can be quantified by means of weak ignorance functions, as introduced in our previous work [34].
In this paper, we will use two basic operations with IVFSs, namely, intersection and complement. On the one hand, we will apply t-norms [35], [36] to model the conjunction among the linguistic variables composing the antecedent of the rules. Therefore, we recall their extension on IVFSs.
Definition 2 [37]: A function $\mathbf{T} : (L([0, 1]))^2 \to L([0, 1])$ is said to be an interval-valued t-norm if it is commutative, associative, increasing in both arguments (with respect to the order $\mathbf{x} \leq_L \mathbf{y}$ if and only if $\underline{x} \leq \underline{y}$ and $\overline{x} \leq \overline{y}$), and has the neutral element $1_L$.
Definition 3 [37]: An interval-valued t-norm is said to be t-representable if there are two t-norms $T_a$ and $T_b$ in $[0, 1]$, with $T_a \leq T_b$, so that $\mathbf{T}(\mathbf{x}, \mathbf{y}) = [T_a(\underline{x}, \underline{y}), T_b(\overline{x}, \overline{y})]$ for all $\mathbf{x}, \mathbf{y} \in L([0, 1])$.
All interval-valued t-norms without zero divisors verify that $\mathbf{T}(\mathbf{x}, \mathbf{y}) = 0_L$ if and only if $\mathbf{x} = 0_L$ or $\mathbf{y} = 0_L$. In this paper, we model the intersection by means of t-representable interval-valued t-norms without zero divisors, which will be denoted $\mathbf{T}_{T_a, T_b}$, since they can be represented by $T_a$ and $T_b$, as defined previously.
On the other hand, we will use interval-valued fuzzy negations, which are the extension of fuzzy negations on IVFSs, because IV-REFs must fulfill a condition based on them. In fuzzy set theory, a strictly decreasing continuous function $c : [0, 1] \to [0, 1]$ so that $c(0) = 1$ and $c(1) = 0$ is called a strict negation [38]. If $c$ is also involutive, then it is called a strong negation. On this basis, we recall the definition of an interval-valued fuzzy negation.
Definition 4 [37]: An interval-valued fuzzy negation is a function $N : L([0, 1]) \to L([0, 1])$ that is strictly decreasing (with respect to $\leq_L$) so that $N(1_L) = 0_L$ and $N(0_L) = 1_L$. If $N(N(\mathbf{x})) = \mathbf{x}$ for all $\mathbf{x} \in L([0, 1])$, then $N$ is said to be involutive.
Next, we present the interval arithmetic that we will use to compute the rule weight as an element of $L([0, 1])$. This allows us to extend the IV-FRM in such a way that the ignorance represented by the IVFSs is taken into account throughout the inference process. A deep study of interval arithmetic can be found in [39]. Let $[\underline{x}, \overline{x}]$ and $[\underline{y}, \overline{y}]$ be two intervals in $\mathbb{R}^+$ so that $[\underline{x}, \overline{x}] \leq_L [\underline{y}, \overline{y}]$; the rules of interval arithmetic are as follows.
1) Addition: $[\underline{x}, \overline{x}] + [\underline{y}, \overline{y}] = [\underline{x} + \underline{y}, \overline{x} + \overline{y}]$.
2) Subtraction: $[\underline{x}, \overline{x}] - [\underline{y}, \overline{y}] = [|\underline{y} - \overline{x}|, \overline{y} - \underline{x}]$.
3) Multiplication: $[\underline{x}, \overline{x}] * [\underline{y}, \overline{y}] = [\underline{x} * \underline{y}, \overline{x} * \overline{y}]$.
4) Division: $[\underline{x}, \overline{x}] / [\underline{y}, \overline{y}] = [\min(\min(\underline{x}/\underline{y}, \overline{x}/\overline{y}), 1), \min(\max(\underline{x}/\underline{y}, \overline{x}/\overline{y}), 1)]$.
When we need to use a total order relationship for intervals, i.e., when the largest interval membership needs to be determined in the last step of the IV-FRM, we will use the one defined by Xu and Yager in [40]. Let $[\underline{x}, \overline{x}], [\underline{y}, \overline{y}] \in L([0, 1])$, let $s([\underline{x}, \overline{x}]) = \underline{x} + \overline{x} - 1$ be the score of $[\underline{x}, \overline{x}]$, and let $h([\underline{x}, \overline{x}]) = 1 - (\overline{x} - \underline{x})$ be the accuracy degree of $[\underline{x}, \overline{x}]$. Then
1) if $s([\underline{x}, \overline{x}]) < s([\underline{y}, \overline{y}])$, then $[\underline{x}, \overline{x}] < [\underline{y}, \overline{y}]$;
2) if $s([\underline{x}, \overline{x}]) = s([\underline{y}, \overline{y}])$, then
a) if $h([\underline{x}, \overline{x}]) = h([\underline{y}, \overline{y}])$, then $[\underline{x}, \overline{x}] = [\underline{y}, \overline{y}]$;
b) if $h([\underline{x}, \overline{x}]) < h([\underline{y}, \overline{y}])$, then $[\underline{x}, \overline{x}] < [\underline{y}, \overline{y}]$.
Observe that any two intervals are comparable with this order relation. Moreover, it follows easily that $0_L$ is the smallest element in $L([0, 1])$ and $1_L$ is the largest.
We must remark that in the case of working with intervals in $\mathbb{R}^+_0$, the previously described total order relationship still works, but the domain of the result for both the score and the accuracy degrees changes (from $[-1, 1]$ to $[-1, \infty]$ for the score degree and from $[0, 1]$ to $[-\infty, 1]$ for the accuracy degree).
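The following Python sketch illustrates the interval operations that are needed later to compute interval rule weights (addition and the capped division above) and the Xu and Yager score/accuracy comparison used in the classification step. The helper names are our own, and the interval subtraction is intentionally omitted, so this is a minimal sketch rather than the authors' implementation.

from typing import Tuple

Interval = Tuple[float, float]  # (lower, upper) with 0 <= lower <= upper

def iv_add(x: Interval, y: Interval) -> Interval:
    return (x[0] + y[0], x[1] + y[1])

def iv_mul(x: Interval, y: Interval) -> Interval:
    return (x[0] * y[0], x[1] * y[1])

def iv_div(x: Interval, y: Interval) -> Interval:
    # Division capped at 1, as defined above for the certainty-factor computation.
    r1, r2 = x[0] / y[0], x[1] / y[1]
    return (min(min(r1, r2), 1.0), min(max(r1, r2), 1.0))

def xu_yager_key(x: Interval) -> Tuple[float, float]:
    # score s = lower + upper - 1, accuracy h = 1 - (upper - lower);
    # comparing (s, h) lexicographically reproduces the total order of [40].
    return (x[0] + x[1] - 1.0, 1.0 - (x[1] - x[0]))

# Example: rule weight as an interval certainty factor, then interval comparison.
num = (0.0, 0.0)
den = (0.0, 0.0)
memberships = [((0.6, 0.7), True), ((0.2, 0.4), False)]  # (interval degree, pattern in class?)
for degree, in_class in memberships:
    den = iv_add(den, degree)
    if in_class:
        num = iv_add(num, degree)
print(iv_div(num, den))                                   # interval rule weight
print(max([(0.5, 0.6), (0.55, 0.58)], key=xu_yager_key))  # Xu-Yager largest interval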
C. Construction Method of Interval-Valued Restricted Equivalence Functions

This section is aimed at providing an appropriate background about the concept of IV-REFs [22], [23], which is one of the main tools used in our new IV-FRM. IV-REFs are used to quantify the equivalence degree between two intervals. They are the
extension on IVFSs of the restricted equivalence functions [41], since they allow us to quantify how equivalent two values are.
Definition 5 [41]: A function $REF : [0, 1]^2 \to [0, 1]$ is called a restricted equivalence function associated with a strong negation $c$ if it satisfies the following conditions:
(R1) $REF(x, y) = REF(y, x)$ for all $x, y \in [0, 1]$;
(R2) $REF(x, y) = 1$ if and only if $x = y$;
(R3) $REF(x, y) = 0$ if and only if $x = 1$ and $y = 0$, or $x = 0$ and $y = 1$;
(R4) $REF(x, y) = REF(c(x), c(y))$ for all $x, y \in [0, 1]$, $c$ being a strong negation;
(R5) for all $x, y, z \in [0, 1]$, if $x \leq y \leq z$, then $REF(x, y) \geq REF(x, z)$ and $REF(y, z) \geq REF(x, z)$.
We must point out that in this study, we use the standard negation, i.e., $c(x) = 1 - x$.
Among the methods developed to construct restricted equivalence functions [42], we use the one based on automorphisms, which are defined as follows.
Definition 6: An automorphism of the unit interval is any continuous and strictly increasing function $\phi : [0, 1] \to [0, 1]$ so that $\phi(0) = 0$ and $\phi(1) = 1$.
Example 1: The equation $\phi(x) = x^a$, where $a \in (0, \infty)$, generates a family of automorphisms.
1) If $a = 1$, then $\phi(x) = x$.
2) If $a = 2$, then $\phi(x) = x^2$.
3) If $a = 0.5$, then $\phi(x) = x^{1/2}$.
4) If $a = 100$, then $\phi(x) = x^{100}$.
5) If $a = 0.01$, then $\phi(x) = x^{1/100}$.
Fig. 1 depicts the behavior of these five automorphisms.

Fig. 1. Example of different automorphisms generated by varying the value of the parameter a, as shown in Example 1.

Specifically, the construction method of restricted equivalence functions based on automorphisms is the one introduced in Proposition 1.
Proposition 1 [42]: If $\phi_1, \phi_2$ are two automorphisms of the unit interval, then $REF(x, y) = \phi_1^{-1}(1 - |\phi_2(x) - \phi_2(y)|)$, with $c(x) = \phi_2^{-1}(1 - \phi_2(x))$, is a restricted equivalence function.
Example 2: Taking $\phi_1(x) = x$, $\phi_2(x) = x$, we obtain the following restricted equivalence function: $REF(x, y) = 1 - |x - y|$, which satisfies conditions (R1)–(R5) with $c(x) = 1 - x$ for all $x \in [0, 1]$.
As mentioned previously, IV-REFs are the extension of restricted equivalence functions on IVFSs. Their definition is as follows.
Definition 7 [22], [23]: An IV-REF associated with an interval-valued negation $N$ is a function $IV\text{-}REF : L([0, 1])^2 \to L([0, 1])$ so that
(1R1) $IV\text{-}REF(\mathbf{x}, \mathbf{y}) = IV\text{-}REF(\mathbf{y}, \mathbf{x})$ for all $\mathbf{x}, \mathbf{y} \in L([0, 1])$;
(1R2) $IV\text{-}REF(\mathbf{x}, \mathbf{y}) = 1_L$ if and only if $\mathbf{x} = \mathbf{y}$;
(1R3) $IV\text{-}REF(\mathbf{x}, \mathbf{y}) = 0_L$ if and only if $\mathbf{x} = 1_L$ and $\mathbf{y} = 0_L$, or $\mathbf{x} = 0_L$ and $\mathbf{y} = 1_L$;
(1R4) $IV\text{-}REF(\mathbf{x}, \mathbf{y}) = IV\text{-}REF(N(\mathbf{x}), N(\mathbf{y}))$, with $N$ being an involutive interval-valued negation;
(1R5) for all $\mathbf{x}, \mathbf{y}, \mathbf{z} \in L([0, 1])$, if $\mathbf{x} \leq_L \mathbf{y} \leq_L \mathbf{z}$, then $IV\text{-}REF(\mathbf{x}, \mathbf{y}) \geq_L IV\text{-}REF(\mathbf{x}, \mathbf{z})$ and $IV\text{-}REF(\mathbf{y}, \mathbf{z}) \geq_L IV\text{-}REF(\mathbf{x}, \mathbf{z})$.
In [22] and [23], the authors show several construction methods of IV-REFs. Among them, we make use of the IV-REF construction method given in the following corollary.
Corollary 1 [22], [23]: Let $REF$ be a restricted equivalence function, and let $T$ and $S$ be any t-norm and any t-conorm in $[0, 1]$. Then

$$IV\text{-}REF(\mathbf{x}, \mathbf{y}) = [T(REF(\underline{x}, \underline{y}), REF(\overline{x}, \overline{y})),\; S(REF(\underline{x}, \underline{y}), REF(\overline{x}, \overline{y}))]$$

is an IV-REF.
Using the construction method of IV-REFs given in Corollary 1 and applying the construction method of restricted equivalence functions recalled in Proposition 1, we obtain the IV-REF construction method used in this paper:

$$IV\text{-}REF(\mathbf{x}, \mathbf{y}) = [T(\phi_1^{-1}(1 - |\phi_2(\underline{x}) - \phi_2(\underline{y})|),\; \phi_1^{-1}(1 - |\phi_2(\overline{x}) - \phi_2(\overline{y})|)),\; S(\phi_1^{-1}(1 - |\phi_2(\underline{x}) - \phi_2(\underline{y})|),\; \phi_1^{-1}(1 - |\phi_2(\overline{x}) - \phi_2(\overline{y})|))]. \qquad (7)$$

Example 3: Taking $\phi_1(x) = x$, $\phi_2(x) = x$, we obtain the following IV-REF:

$$IV\text{-}REF(\mathbf{x}, \mathbf{y}) = [T(1 - |\underline{x} - \underline{y}|, 1 - |\overline{x} - \overline{y}|),\; S(1 - |\underline{x} - \underline{y}|, 1 - |\overline{x} - \overline{y}|)].$$
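As a quick check of (7), the sketch below builds REFs and IV-REFs from power automorphisms φ1(x) = x^a and φ2(x) = x^b with T = minimum and S = maximum (the initial configuration used later in Section III-B), and reproduces the interval equivalence degrees listed later in Example 4 of Section III-C. The function names are ours and the code is only an illustration of the construction, not the authors' implementation.

def make_ref(a: float, b: float):
    """REF(x, y) = phi1^{-1}(1 - |phi2(x) - phi2(y)|) with phi1(x)=x^a, phi2(x)=x^b."""
    def ref(x: float, y: float) -> float:
        return (1.0 - abs(x**b - y**b)) ** (1.0 / a)   # phi1^{-1}(t) = t^(1/a)
    return ref

def make_iv_ref(a: float, b: float):
    """IV-REF from Corollary 1 / Eq. (7) with T = min and S = max."""
    ref = make_ref(a, b)
    def iv_ref(x, y):
        lo, hi = ref(x[0], y[0]), ref(x[1], y[1])
        return (min(lo, hi), max(lo, hi))
    return iv_ref

ideal = (1.0, 1.0)
degree = (0.6, 0.7)
print(make_iv_ref(1.0, 1.0)(degree, ideal))    # -> (0.6, 0.7)
print(make_iv_ref(2.0, 1.0)(degree, ideal))    # -> (~0.77, ~0.84), as in Example 4(b)
print(make_iv_ref(0.5, 1.0)(degree, ideal))    # -> (0.36, 0.49), as in Example 4(c)

With the identity automorphisms, the interval equivalence to [1, 1] is simply the interval membership degree itself, which is why the initial IV-REFs leave the inference unchanged until the genetic tuning modifies a and b.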
Fig. 2. Flowchart of our new IV-FRBCS.
III. LINGUISTIC FUZZY RULE-BASED CLASSIFICATION SYSTEM BASED ON AN INTERVAL-VALUED FUZZY REASONING METHOD WITH TUNING AND RULE SELECTION

In our new approach, we propose the introduction of the concept of minimum distance classifiers in the IV-FRM of the IV-FRBCSs. To do so, we compute the matching degree between the patterns and the antecedent of the rules using IV-REFs in order to quantify the equivalence degree between the interval membership degree and the ideal interval membership degree ($1_L = [1, 1]$) for each linguistic label composing the antecedent of the rule. The motivation is to strengthen the relevance of the rules with a higher equivalence degree with respect to the new pattern to be classified.
The parametrized construction of the IV-REFs allows an easy generation of many of these functions. In this manner, we face the problem of choosing a suitable similarity function by applying a genetic tuning, which can lead to an improvement of the behavior of the system in a general framework by looking for the most appropriate set of IV-REFs to solve each specific problem with which we deal.
In the remainder of this section, we first present a general outline of our new method (see Section III-A). Then, we describe in detail the initialization of the system's parameters (see Section III-B) and the novel IV-FRM making use of IV-REFs (see Section III-C). Finally, we introduce the tuning approach used to choose the most appropriate IV-REF for each variable together with the rule selection method (see Section III-D).

A. Overviewing IVTURS

This section is aimed at showing a general overview of IVTURS. As depicted in Fig. 2, it is composed of three steps.
1) Initialization of the IV-FRBCS. This step involves the following tasks:
a) Generate the initial FRBCS by means of the FARC-HD method by Alcalá-Fdez et al. [1].
b) Model the linguistic labels of the FRBCS by means of IVFSs.
c) Generate the initial IV-REF for each variable of the problem.
2) Application of the new IV-FRM, which is the extension of the fuzzy reasoning method on IVFSs.
3) Application of the optimization approach, which is composed of
a) the genetic tuning in which we look for the best values of the IV-REFs' parameters;
b) the rule selection process in order to decrease the system's complexity.
As we have mentioned, in this paper, our approach combines IVTURS with the FARC-HD method to carry out the fuzzy rule learning process. Therefore, we denote our new proposal as IVTURS-FARC. In the remainder of this section, we describe in detail each step composing our new method.

B. Initialization of the Interval-Valued Fuzzy Rule-Based Classification System

It is well known that there are two possibilities in the generation of a type-2 model [43]: 1) a partially dependent one, where an initial type-1 fuzzy model is learned and then used as a smart initialization of the parameters of the type-2 fuzzy model [44]–[46]; and 2) a totally independent method, where the type-2 fuzzy model is learned without the help of a base type-1 fuzzy model [47].
In this paper, we use the first option, i.e., we generate a base FRBCS using the FARC-HD algorithm, which is based on three stages:
1) Extract the fuzzy association rules for classification by applying a search tree in which the depth of the branches is limited.
2) Preselect the most interesting rules using subgroup discovery in order to decrease the computational cost of the system.
3) Optimize the knowledge base by means of a combination between the well-known tuning of the lateral position of the membership functions and a rule selection process.
We make use of the first two stages in order to learn the initial FRBCS, which is the basis of our IV-FRBCS. We must point out that for the learning step, we consider triangular membership functions, which are obtained by performing a linear partitioning of the input domain of each variable.
After having the base FRBCS, we model its linguistic labels by means of IVFSs. To this aim, we apply the following process.
1) We take as the lower bound of each IVFS the initial membership function (the one used in the learning step).
2) We generate the upper bound of each linguistic label. For their construction, the amplitude of the support of the upper bounds is determined by the value of the parameter W, which is initially set to 0.25 to achieve an amplitude 50% larger than that of their lower bound counterpart.
An example of these sets is depicted in Fig. 3. Finally, we also have to generate the initial IV-REF associated with each variable of the problem. To this end, we apply the construction method recalled in Section II-C using the identity function as automorphism (φ(x) = x) in every case. In this
manner, the initial IV-REF for every variable of the problem is the one that was shown in Example 3.

Fig. 3. Example of an initial constructed IVFS as defined in [24] and [34]. The solid line is the initial fuzzy set, and therefore, it is the lower bound of the IVFS. The dashed line is the upper bound of the IVFS.

C. Interval-Valued Fuzzy Reasoning Method

This section is aimed at describing the new IV-FRM. To do so, we modify all the steps of the fuzzy reasoning method [21] recalled in Section II-A. This way, we develop a method that intrinsically manages the ignorance that the IVFSs represent.
Let $L$ be the number of rules in the RB and $M$ the number of classes of the problem. If $x_p = (x_{p1}, \ldots, x_{pn})$ is a new pattern to be classified, the steps of the IV-FRM are the following.
1) Interval matching degree: We use IV-REFs to compute the similarity between the interval membership degrees (of each variable of the pattern to the corresponding IVFS) and the ideal membership degree $1_L$, and then, we apply a t-representable interval-valued t-norm ($\mathbf{T}_{T_a, T_b}$, as introduced in Section II-B) to these results:

$$[\underline{A_j}(x_p), \overline{A_j}(x_p)] = \mathbf{T}_{T_a, T_b}(IV\text{-}REF([\underline{A_{j1}}(x_{p1}), \overline{A_{j1}}(x_{p1})], [1, 1]), \ldots, IV\text{-}REF([\underline{A_{jn}}(x_{pn}), \overline{A_{jn}}(x_{pn})], [1, 1])), \quad j = 1, \ldots, L. \qquad (8)$$

We must point out that the result of the initial IV-REF for each variable is the interval membership degree, since the equation shown in Example 3 is applied. The result of each IV-REF changes according to the values of its parameters. These situations are shown in Example 4, where the result of the initially constructed IV-REF is shown in the first item, whereas the two last items show results of IV-REFs when the initial values of their parameters have been modified.
Example 4: Let $[\underline{A}(u), \overline{A}(u)] = [0.6, 0.7]$. The results provided when using IV-REFs [see (7)] constructed from different automorphisms are
a) $\phi_1(x) = x$ and $\phi_2(x) = x$: $IV\text{-}REF([0.6, 0.7], [1, 1]) = [0.6, 0.7]$;
b) $\phi_1(x) = x^2$ and $\phi_2(x) = x$: $IV\text{-}REF([0.6, 0.7], [1, 1]) = [0.77, 0.84]$;
c) $\phi_1(x) = x^{0.5}$ and $\phi_2(x) = x$: $IV\text{-}REF([0.6, 0.7], [1, 1]) = [0.36, 0.49]$.
2) Interval association degree: We apply a combination operator $h$ to the interval matching degree computed previously and the rule weight:

$$[\underline{b_j^k}, \overline{b_j^k}] = h([\underline{A_j}(x_p), \overline{A_j}(x_p)], [\underline{RW_j^k}, \overline{RW_j^k}]), \quad k = 1, \ldots, M, \quad j = 1, \ldots, L. \qquad (9)$$

We must point out that the rule weight is an element of $L([0, 1])$. To compute it, we apply the certainty factor [see (2)] making use of the interval arithmetic introduced in Section II-B. The resulting equation is shown as follows:

$$[\underline{RW_j}, \overline{RW_j}] = \frac{\sum_{x_p \in \mathrm{Class}\, C_j} [\underline{A_j}(x_p), \overline{A_j}(x_p)]}{\sum_{p=1}^{P} [\underline{A_j}(x_p), \overline{A_j}(x_p)]}. \qquad (10)$$

3) Interval pattern classification soundness degree for all classes: We aggregate the positive interval association degrees of each class by applying an interval aggregation function $f$:

$$[\underline{Y_k}, \overline{Y_k}] = f([\underline{b_j^k}, \overline{b_j^k}],\; j = 1, \ldots, L \ \text{and}\ [\underline{b_j^k}, \overline{b_j^k}] > 0_L), \quad k = 1, \ldots, M. \qquad (11)$$

4) Classification. We apply a decision function $F$ over the interval soundness degree of the system for the pattern classification for all classes:

$$F([\underline{Y_1}, \overline{Y_1}], \ldots, [\underline{Y_M}, \overline{Y_M}]) = \arg\max_{k = 1, \ldots, M} ([\underline{Y_k}, \overline{Y_k}]). \qquad (12)$$

The last step of the IV-FRM consists of selecting the maximum interval soundness degree. Therefore, in order to be able to make this decision, we use the total order relationship for intervals [40] presented in Section II-B.
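The following self-contained Python sketch strings the four interval-valued steps together for the configuration described later in Section IV (product interval-valued t-norm as conjunction and combination operator, interval sum as aggregation, and the Xu–Yager order for the final argmax). The rule encoding and all function names are illustrative assumptions under those settings; it is a sketch of the inference flow, not the authors' implementation.

from typing import List, Tuple

Interval = Tuple[float, float]  # (lower, upper)

def iv_ref(x: Interval, ideal: Interval, a: float, b: float) -> Interval:
    """Eq. (7) with T = min, S = max and power automorphisms phi1(x)=x^a, phi2(x)=x^b."""
    lo = (1.0 - abs(x[0] ** b - ideal[0] ** b)) ** (1.0 / a)
    hi = (1.0 - abs(x[1] ** b - ideal[1] ** b)) ** (1.0 / a)
    return (min(lo, hi), max(lo, hi))

def iv_prod(x: Interval, y: Interval) -> Interval:
    """t-representable product interval-valued t-norm (also used as combination operator h)."""
    return (x[0] * y[0], x[1] * y[1])

def iv_matching(iv_memberships: List[Interval], params: List[Tuple[float, float]]) -> Interval:
    """Eq. (8): IV-REF of each variable against [1, 1], combined with the product t-norm."""
    degree: Interval = (1.0, 1.0)
    for m, (a, b) in zip(iv_memberships, params):
        degree = iv_prod(degree, iv_ref(m, (1.0, 1.0), a, b))
    return degree

def xu_yager_key(x: Interval) -> Tuple[float, float]:
    return (x[0] + x[1] - 1.0, 1.0 - (x[1] - x[0]))   # (score, accuracy)

def iv_classify(rules, x, params, n_classes: int) -> int:
    """Rules are (per-variable IVFS evaluators, class label, interval rule weight)."""
    Y = [(0.0, 0.0)] * n_classes
    for ivfs_list, cls, rw in rules:
        memberships = [ivfs(v) for ivfs, v in zip(ivfs_list, x)]
        match = iv_matching(memberships, params)            # step 1, Eq. (8)
        b = iv_prod(match, rw)                              # step 2, Eq. (9)
        if b > (0.0, 0.0):                                  # skip zero association degrees
            Y[cls] = (Y[cls][0] + b[0], Y[cls][1] + b[1])   # step 3, Eq. (11), f = interval sum
    return max(range(n_classes), key=lambda k: xu_yager_key(Y[k]))  # step 4, Eq. (12)

# Toy usage: one rule over one variable whose IVFS evaluation is stubbed out.
rules = [([lambda v: (0.6, 0.7)], 0, (0.8, 0.9))]
print(iv_classify(rules, [0.3], params=[(2.0, 1.0)], n_classes=2))  # -> 0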
D. Tuning of the Equivalence and Rule Selection

In this proposal, we make use of genetic algorithms with a double aim: 1) to tune the values of the parameters used in the construction of the IV-REFs in order to increase the reasoning capabilities of the IV-FRM and 2) to perform a rule selection process in which we obtain a compact and cooperative fuzzy rule set.
According to (7), each IV-REF is constructed using the automorphisms $\phi_1$ and $\phi_2$. In this paper, we take $\phi_1(x) = x^a$ and $\phi_2(x) = x^b$, where $a, b \in (0, \infty)$. Therefore, we can generate a huge number of IV-REFs by taking different values for the parameters $a$ and $b$. The selection of a suitable IV-REF to measure the equivalence degree in each variable could involve an improvement in the behavior of the system to solve specific problems. To face this optimization problem, we propose the tuning of the values of the parameters $a$ and $b$ used to construct the IV-REFs associated with each variable of the problem. In order
to cover as much search space as possible, we suggest varying these values in the interval [0.01, 100]. This range is due to the fact that when using φ(x) = x^0.01, the shape of the function is close to a crisp one, and we use the inverse function (φ(x) = x^100) in order to provide the same flexibility on both sides of the identity function. The shaded surface of Fig. 1 (see Section II-C) depicts the search space covered when using the proposed variation interval, which is almost the whole search space.
On the other hand, regarding the size of the output model, fuzzy rule learning methods usually generate a large number of fuzzy rules so as to achieve a highly accurate system. However, in the fuzzy rule set created, we can find irrelevant, redundant, erroneous, or conflicting rules, which may perturb the performance of the system [48]. Therefore, a rule reduction process is often applied in order to improve the system's accuracy by removing useless rules and, in turn, easing the readability of the system. There are a lot of methods that deal with the rule reduction process [10], [49]; from among them, we will use a rule selection approach developed using a simple binary codification in order to express whether or not the fuzzy rules belong to the rule set.
In order to accomplish this genetic process, we consider the use of the CHC evolutionary model [50] due to both its good properties to deal with complex search spaces [51], [52] and the good results provided in this topic [53], [54]. In the following, we describe the specific features of our evolutionary model.
1) Coding Scheme. Each chromosome is composed of two well-differentiated parts implying a double codification scheme: real codification for the tuning of the equivalence ($C_E$) part and binary coding for the rule selection process ($C_R$).
a) Tuning of the equivalence. Letting $n$ be the number of attributes, the part of the chromosome to carry out the tuning of the IV-REFs is a vector of size $2 \times n$: $C_E = \{a_1, b_1, a_2, b_2, \ldots, a_n, b_n\}$, where $a_i, b_i \in [0.01, 1.99]$ with $i = 1, 2, \ldots, n$. Each pair of genes $(a_i, b_i)$ represents the values of the parameters $a$ and $b$ used to construct the IV-REF associated with the $i$th attribute. To construct the corresponding IV-REF, we have to adapt the gene values to the interval in which the automorphisms can vary ([0.01, 100]). To do so, we adapt the value of each parameter $a$ using the following equation (we consider the same adaptation for parameter $b$):
$$a = \begin{cases} a, & \text{if } 0 < a \leq 1 \\[4pt] \dfrac{1}{2 - a}, & \text{if } 1 < a < 2. \end{cases} \qquad (13)$$
b) Rule selection. Let $L$ be the number of fuzzy rules in the RB; the part of the chromosome to perform the rule selection is a vector of size $L$, $C_R = \{r_1, \ldots, r_L\}$, where $r_i \in \{0, 1\}$ with $i = 1, 2, \ldots, L$, determining the subset of fuzzy rules which compose the final RB as follows: if $r_i = 1$, then $R_i \in RB$; else $R_i \notin RB$.
Therefore, the whole chromosome scheme is as follows: $C_{E+R} = \{C_E, C_R\}$.
2) Initial Gene Pool. To include the initial FRBCS in the population, we initialize an individual with all genes set to value 1. In this manner, we construct the initial IV-REFs using the identity function as automorphism for each variable, and we include all the fuzzy rules in the RB.
3) Chromosome Evaluation. We use the most common metric for classification, i.e., the accuracy rate.
4) Crossover Operator. Due to the double coding scheme of the chromosome, we apply a different crossover operator for each part of the chromosome: the parent centric BLX operator [55] (which is based on the BLX-α) is used for the real coding part and the half uniform crossover scheme [56] is considered for the remainder.
5) Restarting Approach. To get away from local optima, we consider a restarting approach. To do so, as in the elitist scheme, we include the best global solution found until this moment in the next population, and we generate the remaining individuals at random.
For more details about the evolutionary algorithm, see [1].

IV. EXPERIMENTAL FRAMEWORK

In this section, we first present the real-world classification datasets selected for the experimental study. Next, we introduce the parameter setup considered throughout this study. Finally, we introduce the statistical tests that are necessary to compare the results achieved throughout the experimental study.

A. Datasets

We have selected a wide benchmark of 27 real-world datasets from the KEEL dataset repository [25], [26], which are publicly available on the corresponding webpage (http://www.keel.es/dataset.php), including general information about them, partitions for the validation of the experimental results, and so on. Table I summarizes the properties of the selected datasets, showing for each dataset the number of examples (#Ex.), the number of attributes (#Atts.), and the number of classes (#Class.). We must point out that the magic, page-blocks, penbased, ring, satimage, and shuttle datasets have been stratified sampled at 10% in order to reduce their size for training. In the case of missing values (crx, dermatology, and wisconsin), those instances have been removed from the dataset.
A fivefold cross-validation model was considered in order to carry out the different experiments. That is, we split the dataset into five random partitions of data, each one with 20% of the patterns, and we employed a combination of four of them (80%) to train the system and the remaining one to test it.

B. Methods Setup

This section is aimed at introducing the configurations that have been considered for the different methods used along the
TABLE I
SUMMARY DESCRIPTION FOR THE EMPLOYED DATASETS

Id.  | Data-set        | #Ex.  | #Atts. | #Class.
aus  | Australian      | 690   | 14     | 2
bal  | Balance         | 625   | 4      | 3
cle  | Cleveland       | 297   | 13     | 5
con  | Contraceptive   | 1,473 | 9      | 3
crx  | Crx             | 653   | 15     | 2
der  | Dermatology     | 358   | 34     | 6
eco  | Ecoli           | 336   | 7      | 8
ger  | German          | 1,000 | 20     | 2
hab  | Haberman        | 306   | 3      | 2
hay  | Hayes-Roth      | 160   | 4      | 3
hea  | Heart           | 270   | 13     | 2
ion  | Ionosphere      | 351   | 33     | 10
iri  | Iris            | 150   | 4      | 3
mag  | Magic           | 1,902 | 10     | 2
new  | New-Thyroid     | 215   | 5      | 3
pag  | Page-blocks     | 548   | 10     | 5
pen  | Penbased        | 1,992 | 16     | 10
pim  | Pima            | 768   | 8      | 2
sah  | Saheart         | 462   | 9      | 2
spe  | Spectfheart     | 267   | 44     | 2
tae  | Tae             | 151   | 5      | 3
tit  | Titanic         | 2,201 | 3      | 2
two  | Twonorm         | 740   | 20     | 2
veh  | Vehicle         | 846   | 18     | 4
win  | Wine            | 178   | 13     | 3
wiR  | Winequality-Red | 1,599 | 11     | 11
wis  | Wisconsin       | 683   | 9      | 2
experimental study: the FARC-HD method [1], our different versions of this method using IVFSs, and the FURIA algorithm [2], which is briefly described as follows (see [2] for details).
FURIA [2] builds upon the RIPPER interval rule induction algorithm [57]. The model built by FURIA uses fuzzy rules of the form given in (1) using fuzzy sets with trapezoidal membership functions. Specifically, FURIA builds the fuzzy RB by means of two steps.
1) Learn a rule set for every single class using a one-versus-all decomposition. To this aim, a modified version of RIPPER is applied, which involves a building and an optimization phase.
2) Obtain the fuzzy rules by means of fuzzifying the final rules from the modified RIPPER algorithm in a greedy way.
At classification time, the class predicted by FURIA is the one with maximal support. In case the query is not covered by any rule, a rule stretching method is proposed based on modifying the rules in a local way to make them applicable to the query.
Regarding the configurations, for the FARC-HD algorithm we will apply the following one:
1) Conjunction operator: product t-norm.
2) Combination operator: product t-norm.
3) Rule weight: certainty factor.
4) Fuzzy reasoning method: additive combination [21].
5) Number of linguistic labels per variable: five labels.
6) Minsup: 0.05.
7) Maxconf: 0.8.
8) Depth_max: 3.
9) k_t: 2.
For IVTURS-FARC, we have considered the following configuration:
1) Conjunction operator: product interval-valued t-norm.
2) Combination operator: product interval-valued t-norm.
3) IVFSs construction:
a) Shape: triangular membership functions.
b) Upper bound: 50% greater than the lower bound (W = 0.25).
4) Configuration of the initial IV-REFs:
a) T-norm: minimum.
b) T-conorm: maximum.
c) First automorphism: φ1(x) = x^1 (a = 1).
d) Second automorphism: φ2(x) = x^1 (b = 1).
Regarding the genetic tuning with rule selection process, we have used the values suggested in [1], which are:
1) Population size: 50 individuals.
2) Number of evaluations: 20 000.
3) Bits per gene for the Gray codification (for incest prevention): 30 bits.
Finally, for the parameters of the FURIA algorithm, namely, the number of folds and optimizations, we have set their values to 3 and 2, respectively, as recommended by the authors.

C. Statistical Tests for Performance Comparison

In this paper, we use some hypothesis validation techniques in order to give statistical support to the analysis of the results [58], [59]. We will use nonparametric tests because the initial conditions that guarantee the reliability of the parametric tests cannot be fulfilled, which implies that the statistical analysis loses credibility with these parametric tests [27].
Specifically, we use the Friedman aligned ranks test [60] to detect statistical differences among a group of results and the Holm post hoc test [61] to find the algorithms that reject the equality hypothesis with respect to a selected control method. The post hoc procedure allows us to know whether a hypothesis of comparison could be rejected at a specified level of significance α. Furthermore, we compute the adjusted p-value (APV) in order to take into account the fact that multiple tests are conducted. In this manner, we can directly compare the APV with respect to the level of significance α in order to be able to reject the null hypothesis.
Furthermore, we consider the method of aligned ranks of the algorithms in order to show graphically how good a method is with respect to its partners. The first step to compute this ranking is to obtain the average performance of the algorithms in each dataset. Next, we compute the difference between the accuracy of each algorithm and the average value for each dataset. Then, we rank all these differences in descending order, and, finally, we average the rankings obtained by each algorithm. In this manner, the algorithm that achieves the lowest average ranking is the best one.
These tests are suggested in the studies presented in [27]–[29] and [58], where it is shown that their use in the field of machine learning is highly recommended. A complete description of the tests and software for their use can be found at http://sci2s.ugr.es/sicidm/.
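The ranking procedure just described (per-dataset mean, differences to that mean, a single ranking of all differences in descending order, and per-algorithm averages) and the Holm adjustment can be sketched as follows. This is our own minimal Python illustration with simplified tie handling, not the software referenced above; the Holm APV formula used here is the standard step-down adjustment, stated as an assumption since the paper does not spell it out.

def aligned_friedman_ranks(acc):
    """acc[i][j]: accuracy of algorithm j on dataset i. Returns average aligned rank per algorithm."""
    n_data, n_alg = len(acc), len(acc[0])
    diffs = []                                    # (difference to dataset mean, algorithm index)
    for row in acc:
        mean = sum(row) / n_alg
        diffs.extend((v - mean, j) for j, v in enumerate(row))
    diffs.sort(key=lambda t: t[0], reverse=True)  # rank all n_data*n_alg differences at once
    totals = [0.0] * n_alg
    for rank, (_, j) in enumerate(diffs, start=1):    # ties get arbitrary order in this sketch
        totals[j] += rank
    return [t / n_data for t in totals]               # lower average rank = better algorithm

def holm_apv(p_values):
    """Standard Holm step-down adjusted p-values for k comparisons against a control method."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    apv, running = [0.0] * k, 0.0
    for pos, i in enumerate(order):
        running = max(running, (k - pos) * p_values[i])
        apv[i] = min(1.0, running)
    return apv

acc = [[0.86, 0.85, 0.88], [0.80, 0.82, 0.83], [0.58, 0.57, 0.60]]  # toy accuracies
print(aligned_friedman_ranks(acc))   # smallest value marks the best-ranked algorithm
print(holm_apv([0.001, 0.020, 0.045]))  # toy unadjusted p-values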
TABLE II
DESCRIPTION OF THE METHODS USED IN SECTIONS V-A AND V-B OF THE EXPERIMENTAL STUDY

Notation     | Linguistic labels | Matching degree | Tuning of the equivalence | Tuning of the ignorance degree | Rule selection
FS_T_E       | Fuzzy sets        | REFs            | Yes                       | No                             | No
FS_T_E+R     | Fuzzy sets        | REFs            | Yes                       | No                             | Yes
IVFS_T_E     | IVFSs             | IV-REFs         | Yes                       | No                             | No
IVFS_T_WI    | IVFSs             | IV-REFs         | No                        | Yes                            | No
IVFS_T_E+WI  | IVFSs             | IV-REFs         | Yes                       | Yes                            | No
IVTURS-FARC  | IVFSs             | IV-REFs         | Yes                       | No                             | Yes
V. ANALYSIS OF THE USEFULNESS OF IVTURS-FUZZY ASSOCIATION RULE-BASED CLASSIFICATION MODEL FOR HIGH-DIMENSIONAL PROBLEMS

In this section, we analyze the behavior of IVTURS-FARC. To do so, we develop an experimental study composed of three steps.
1) We determine the importance of both the rule selection process and, foremost, the IVFSs by comparing IVTURS-FARC versus its fuzzy counterpart with and also without rule selection (see Section V-A).
2) We analyze the improvements achieved with respect to our previous proposal in the topic (see Section V-B).
3) We study whether IVTURS-FARC improves the results obtained by two state-of-the-art fuzzy classifiers (see Section V-C).
The description of the methods used to carry out the first two steps of the experimental study is introduced in Table II. Methods using the prefix FS_ use the fuzzy system that is learned by applying the first two stages of the FARC-HD algorithm, as defined in [1], and they apply REFs (these functions allow us to quantify the equivalence between two numbers; they are the non-interval-valued fuzzy version of the IV-REFs, as explained in Section II-C) to compute the matching degree. Methods using the suffix T_WI apply our previous proposal to tune the ignorance degree that each IVFS represents [24].
Table III shows the classification accuracy of the different approaches used along the experimental study. Results are grouped in pairs for training and test, where the best global result for each dataset is emphasized in bold face. Vertical lines group the methods involved in each scenario. In the remainder of this section, we develop the analysis carried out in the three aforementioned scenarios.

A. Determining the Suitability of IVTURS-FARC

This section is aimed at showing the goodness of IVTURS-FARC, i.e., the combination of IVFSs with the tuning of the equivalence and a rule selection process. To do so, we have to analyze whether the rule selection step strengthens the quality of the results, and, most importantly, we must determine whether the use of IVFSs is the main cause of the enhancement achieved. For this reason, we compare IVTURS-FARC with respect to its version without the rule selection process (IVFS_T_E), as well as with the fuzzy counterparts (FS_T_E+R and FS_T_E), in order to support the key role that IVFSs play in the system.
Analyzing the results achieved by these four approaches, we find that IVTURS-FARC reaches the best performance in 12 out of the 27 datasets, implying a global performance improvement. This fact is confirmed in Fig. 4, where it is clearly shown that our new IV-FRBCS is the best ranking method.
When applying the Friedman aligned ranks test, we find a p-value of 5.70E-5, which confirms the existence of statistical differences among these four approaches. For this reason, we perform the Holm post hoc test, the results of which are shown in Table IV, selecting IVTURS-FARC as the control method since it is the best ranking one. These results clearly determine the quality of IVTURS-FARC since 1) the use of IVFSs leads to the outperforming of the results of the non-interval-valued fuzzy versions of IVTURS-FARC with and without the rule selection stage, and 2) the rule selection process allows us to enhance the results of the tuning of the equivalence.

B. Analyzing the Performance Improvement With Respect to the Tuning of the Weak Ignorance Degree

Once the suitability of IVTURS-FARC is confirmed, we study whether our new proposal enhances the results of our previous approach to the topic. To do so, we consider the use of the following proposals.
1) IVFS_T_WI: the initial IV-FRBCS⁴ with a previous tuning approach to modify the ignorance degree that each IVFS represents [24], i.e., we modify the value of the parameter W used in the construction of the IVFSs;
2) IVFS_T_E+WI: the initial IV-FRBCS with a tuning approach that simultaneously performs both the tuning of the ignorance degree [24] and the tuning of the equivalence, i.e., we modify the values of the parameters W, a, and b used in the construction of the IVFSs and the IV-REFs, respectively;
3) IVTURS-FARC: the proposed method, as explained in Section III-A.
From the results in Table III, we can observe that IVTURS-FARC achieves a notable enhancement of the global performance, which is based on the achievement of the best result in more than half of the datasets. This situation is confirmed in Fig. 5, where it is shown that the best ranking is reached by our new IV-FRBCS.
In order to detect significant differences among the results of the three approaches, we carry out the Friedman aligned ranks test. This test obtains a p-value near zero (3.88E-5), which implies that there are significant differences between the results. For this reason, we can apply a post hoc test to compare our new methodology with the previous ones. Specifically, a Holm test is applied, which is presented in Table V. The statistical analysis reflects that IVTURS-FARC is the best choice among this group of proposals.
⁴The initial IV-FRBCS is the system obtained after the modeling of the linguistic labels of the base fuzzy classifier by means of IVFSs. It uses the new IV-FRM introduced in Section III-C with the identity function as the automorphism for the construction of each IV-REF.
TABLE III
RESULTS IN TRAIN (TR.) AND TEST (TST) ACHIEVED BY THE DIFFERENT APPROACHES CONSIDERED IN THIS PAPER
(Each row lists the 27 per-dataset results in the dataset order of Table I: aus bal cle con crx der eco ger hab hay hea ion iri mag new pag pen pim sah spe tae tit two veh win wiR wis.)

FS_T_E (Tr.): 88.44 84.68 83.50 56.96 89.01 99.44 88.17 81.95 78.68 86.36 91.94 98.29 98.17 82.14 95.47 96.44 96.09 79.56 79.17 91.20 70.04 77.06 96.05 75.30 99.44 62.59 97.80 — Mean 86.07
FS_T_E (Tst): 86.38 80.96 58.92 53.02 85.45 92.75 78.60 71.10 73.84 76.41 88.15 91.18 96.00 79.13 92.09 94.52 91.82 75.51 72.28 77.51 52.37 77.06 89.19 64.66 93.24 59.54 96.49 — Mean 79.56
FS_T_E+R (Tr.): 90.65 91.56 88.30 61.80 91.58 100.00 91.07 86.98 82.19 91.28 94.44 98.58 98.50 83.43 97.91 96.85 97.68 82.29 82.74 92.60 75.50 79.07 96.76 78.69 99.86 64.93 98.61 — Mean 88.66
FS_T_E+R (Tst): 84.93 82.40 54.54 53.90 86.83 90.49 80.07 73.80 72.19 79.46 86.67 90.60 94.67 80.33 94.42 94.89 92.00 76.17 69.70 77.87 54.43 78.87 90.54 66.90 95.49 59.66 96.78 — Mean 79.95
IVFS_T_E (Tr.): 88.77 86.92 80.81 57.04 89.40 99.65 86.24 81.48 77.61 86.36 92.31 98.36 98.17 81.32 96.63 96.03 93.68 78.71 78.52 89.42 70.37 77.65 95.91 70.77 99.30 60.94 97.88 — Mean 85.56
IVFS_T_E (Tst): 85.07 82.08 57.58 52.68 86.53 94.69 76.20 71.90 72.22 80.23 86.30 91.17 96.67 78.81 92.09 94.71 91.09 74.99 69.28 80.52 53.68 77.65 93.24 65.26 96.08 58.91 96.34 — Mean 79.85
IVFS_T_WI (Tr.): 83.22 88.56 80.81 57.16 84.07 98.39 85.49 78.25 74.43 81.26 90.09 96.15 97.83 78.25 95.23 94.66 94.45 75.39 77.22 85.68 67.55 77.65 95.71 70.30 98.88 58.30 96.12 — Mean 83.75
IVFS_T_WI (Tst): 83.48 82.40 57.25 52.68 82.85 94.69 77.69 71.00 73.20 77.18 86.30 92.33 96.67 76.18 93.02 93.42 91.16 72.00 70.99 77.13 50.39 77.65 91.49 62.89 97.76 56.35 95.90 — Mean 79.04
IVFS_T_E+WI (Tr.): 88.84 88.12 81.82 57.37 89.47 99.65 87.43 81.65 78.92 86.36 92.22 98.58 98.33 81.65 98.14 96.26 94.23 79.46 79.06 90.08 72.19 77.65 96.39 71.57 99.44 61.60 98.13 — Mean 86.10
IVFS_T_E+WI (Tst): 85.51 82.88 59.59 52.55 87.75 94.42 77.39 70.70 73.51 80.23 85.19 90.33 96.00 78.97 93.49 94.52 90.18 74.35 68.40 80.51 54.34 77.65 92.97 63.48 94.95 58.97 96.78 — Mean 79.84
FARC-HD (Tr.): 90.43 92.12 89.56 62.80 91.65 100.00 92.11 87.70 81.53 91.28 94.63 98.50 98.50 84.46 98.95 96.94 98.34 83.66 83.49 92.42 77.15 79.07 97.84 80.38 99.86 56.22 98.76 — Mean 88.83
FARC-HD (Tst): 85.51 87.36 57.92 52.68 86.53 89.94 80.07 71.60 71.22 80.20 84.44 90.32 94.00 80.49 95.35 94.34 92.64 74.08 70.77 78.64 48.41 78.87 89.32 68.44 96.62 53.96 96.63 — Mean 79.64
FURIA (Tr.): 88.99 88.84 62.37 56.81 89.70 98.88 89.66 76.90 75.57 88.44 89.72 97.65 98.50 84.03 99.07 99.54 98.98 79.17 74.84 95.70 54.63 78.46 99.46 79.34 99.58 57.12 98.83 — Mean 85.21
FURIA (Tst): 86.09 83.68 56.57 54.17 86.37 93.86 80.06 73.30 72.55 81.00 78.15 88.91 94.00 80.65 94.88 95.25 92.45 76.17 70.33 77.88 47.08 78.51 88.11 70.21 93.78 51.29 96.63 — Mean 79.33
IVTURS-FARC (Tr.): 90.04 91.84 85.44 59.69 91.42 99.86 89.06 85.35 80.72 91.28 93.61 98.72 98.17 82.28 98.84 96.85 95.50 80.57 79.55 90.45 73.18 79.07 96.35 72.81 99.30 62.01 98.50 — Mean 87.42
IVTURS-FARC (Tst): 85.80 85.76 59.60 53.36 87.14 94.42 78.58 73.10 72.85 80.23 88.15 89.75 96.00 79.76 95.35 95.07 92.18 75.90 70.99 80.52 50.34 78.87 92.30 67.38 97.19 58.28 96.49 — Mean 80.57
Fig. 4. Rankings of the four versions of IVTURS-FARC.

TABLE IV
HOLM TEST TO COMPARE IVTURS-FARC WITH RESPECT TO ITS DIFFERENT VERSIONS
i | Algorithm | Hypothesis               | APV
1 | FS_T_E    | Rejected for IVTURS-FARC | 0.003
2 | IVFS_T_E  | Rejected for IVTURS-FARC | 0.018
3 | FS_T_E+R  | Rejected for IVTURS-FARC | 0.053
C. On the Comparison Versus State-of-the-Art Fuzzy Classifiers

In this section, we analyze the performance of IVTURS-FARC against two recognized state-of-the-art classifiers, i.e., the original FARC-HD algorithm by Alcalá-Fdez et al. [1] and the FURIA algorithm by Hühn and Hüllermeier [2].
Fig. 5. Rankings of both our new and our previous proposals in the topic.
TABLE V
HOLM TEST TO COMPARE IVTURS-FARC WITH RESPECT TO OUR PREVIOUS TUNING APPROACHES

i | Algorithm   | Hypothesis               | APV
1 | IVFS_T_WI   | Rejected for IVTURS-FARC | 4.21E-6
2 | IVFS_T_E+WI | Rejected for IVTURS-FARC | 0.019
From the results presented in the last three pairs of columns of Table III, we must highlight the mean performance improvement obtained by our new proposal with respect to the FURIA algorithm (1.22%), as well as the notable improvement versus the original FARC-HD method.
Fig. 6. Rankings of the two state-of-the-art fuzzy classifiers, together with IVTURS-FARC.

TABLE VI
HOLM TEST TO COMPARE IVTURS-FARC WITH RESPECT TO FARC-HD AND FURIA

i | Algorithm | Hypothesis               | APV
1 | FURIA     | Rejected for IVTURS-FARC | 0.011
2 | FARC-HD   | Rejected for IVTURS-FARC | 0.011

In order to compare the results, we have applied the nonparametric tests described in Section IV-C. The p-value obtained by the Aligned Friedman test is 3.20E-5, which implies the existence of significant differences among the three approaches. Fig. 6 graphically shows the rankings of the different methods (which have been computed using the Aligned Friedman test). We now apply Holm's test to compare the best ranking method (IVTURS-FARC) with the remaining ones. Table VI shows that the hypothesis of equality is rejected for the rest of the methods with a high level of confidence.
Therefore, taking into account the previous findings, we can conclude that IVTURS-FARC is the best performing method among these three proposals.

VI. CONCLUDING REMARKS

In this paper, we have introduced a new linguistic fuzzy rule-based classification method called IVTURS. It is based on a new IV-FRM in which all the steps make the computation using intervals, allowing an integral management of the ignorance that the IVFSs represent. The key concepts are the IV-REFs, which are applied to compute the matching degree, allowing the relevance of the rules with a high interval equivalence degree to be emphasized. Furthermore, the parametrized construction of these functions allows us to compute the most suitable set of IV-REFs to solve each specific problem. In this manner, we have defined a genetic process to choose the set of IV-REFs for each problem and to reduce the complexity of the system by performing a fuzzy rule set reduction process.
For the experimental study, we have used an instance of IVTURS using the FARC-HD method, which has been denoted IVTURS-FARC. Throughout the experimental analysis, it has been shown that IVTURS-FARC enhances the results obtained when applying several tuning approaches to the new IV-FRBCS. We must stress that the use of IVFSs allows us to strengthen the quality of the results, since the performance of the fuzzy counterparts of our new IV-FRBCS is highly improved. The highlight of the study is the high accuracy of IVTURS-FARC, since it outperforms the results achieved by two state-of-the-art fuzzy classifiers.

REFERENCES
[1] J. Alcalá-Fdez, R. Alcalá, and F. Herrera, "A fuzzy association rule-based classification model for high-dimensional problems with genetic rule selection and lateral tuning," IEEE Trans. Fuzzy Syst., vol. 19, no. 5, pp. 857–872, Oct. 2011.
[2] J. Hühn and E. Hüllermeier, "FURIA: An algorithm for unordered fuzzy rule induction," Data Mining Knowl. Disc., vol. 19, no. 3, pp. 293–319, 2009.
[3] Z. Chi, H. Yan, and T. Pham, Fuzzy Algorithms With Applications to Image Processing and Pattern Recognition. Singapore: World Scientific, 1996.
[4] O. Cordón, M. J. del Jesus, and F. Herrera, "Genetic learning of fuzzy rule-based classification systems cooperating with fuzzy reasoning methods," Int. J. Intell. Syst., vol. 13, no. 10–11, pp. 1025–1053, 1998.
[5] H. Ishibuchi, T. Yamamoto, and T. Nakashima, "Hybridization of fuzzy GBML approaches for pattern classification problems," IEEE Trans. Syst. Man Cybern. B, Cybern., vol. 35, no. 2, pp. 359–365, Apr. 2005.
[6] A. González and R. Pérez, "Slave: A genetic learning system based on an iterative approach," IEEE Trans. Fuzzy Syst., vol. 7, no. 2, pp. 176–191, Apr. 1999.
[7] E. Mansoori, M. Zolghadri, and S. Katebi, "Sgerd: A steady-state genetic algorithm for extracting fuzzy classification rules from data," IEEE Trans. Fuzzy Syst., vol. 16, no. 4, pp. 1061–1071, Aug. 2008.
[8] Y.-C. Hu, R.-S. Chen, and G.-H. Tzeng, "Finding fuzzy classification rules using data mining techniques," Pattern Recognit. Lett., vol. 24, no. 1–3, pp. 509–519, 2003.
[9] E. H. Mamdani, "Applications of fuzzy algorithm for control of a simple dynamic plant," Proc. IEEE, vol. 121, no. 12, pp. 1585–1588, Dec. 1974.
[10] M. J. Gacto, R. Alcalá, and F. Herrera, "Interpretability of linguistic fuzzy rule-based systems: An overview of interpretability measures," Inf. Sci., vol. 181, no. 20, pp. 4340–4360, 2011.
[11] O. Cordón, F. Herrera, and P. Villar, "Analysis and guidelines to obtain a good uniform fuzzy partition granularity for fuzzy rule-based systems using simulated annealing," Int. J. Approx. Reason., vol. 25, no. 3, pp. 187–215, 2000.
[12] J. M. Mendel, "Computing with words and its relationships with fuzzistics," Inf. Sci., vol. 177, no. 4, pp. 988–1006, 2007.
[13] W.-H. Au, K. Chan, and A. Wong, "A fuzzy approach to partitioning continuous attributes for classification," IEEE Trans. Knowl. Data Eng., vol. 18, no. 5, pp. 715–719, May 2006.
[14] R. Sambuc, "Fonction Φ-flous, application à l'aide au diagnostic en pathologie thyroidienne," Ph.D. dissertation, Univ. Marseille, Marseille, France, 1975.
[15] H. Bustince, M. Pagola, E. Barrenechea, J. Fernandez, P. Melo-Pinto, P. Couto, H. R. Tizhoosh, and J. Montero, "Ignorance functions. An application to the calculation of the threshold in prostate ultrasound images," Fuzzy Sets Syst., vol. 161, no. 1, pp. 20–36, 2010.
[16] D. Dubois and H. Prade, "An introduction to bipolar representations of information and preference," Int. J. Intell. Syst., vol. 23, no. 8, pp. 866–877, 2008.
[17] D. Wu and J. M. Mendel, "Computing with words for hierarchical decision making applied to evaluating a weapon system," IEEE Trans. Fuzzy Syst., vol. 18, no. 3, pp. 441–460, Jun. 2010.
[18] C. Wagner and H. Hagras, "Toward general type-2 fuzzy logic systems based on zSlices," IEEE Trans. Fuzzy Syst., vol. 18, no. 4, pp. 637–660, Aug. 2010.
[19] E. Barrenechea, H. Bustince, B. De Baets, and C. Lopez-Molina, "Construction of interval-valued fuzzy relations with application to the generation of fuzzy edge images," IEEE Trans. Fuzzy Syst., vol. 19, no. 5, pp. 819–830, Oct. 2011.
[20] P. Couto, A. Jurio, A. Varejao, M. Pagola, H. Bustince, and P. Melo-Pinto, "An IVFS-based image segmentation methodology for rat gait analysis," Soft Comput. Fusion Found. Methodol. Appl., vol. 15, no. 10, pp. 1937–1944, 2011.
[21] O. Cordón, M. J. del Jesus, and F. Herrera, "A proposal on reasoning methods in fuzzy rule-based classification systems," Int. J. Approx. Reason., vol. 20, no. 1, pp. 21–45, 1999.
[22] M. Galar, J. Fernandez, G. Beliakov, and H. Bustince, "Interval-valued fuzzy sets applied to stereo matching of color images," IEEE Trans. Image Process., vol. 20, no. 7, pp. 1949–1961, Jul. 2011.
[23] A. Jurio, M. Pagola, D. Paternain, C. Lopez-Molina, and P. Melo-Pinto, "Interval-valued restricted equivalence functions applied on clustering techniques," in Proc. Joint Int. Fuzzy Syst. Assoc. World Congr. Eur. Soc. Fuzzy Logic Technol. Conf., Lisbon, Portugal, Jul. 2009, pp. 831–836.
[24] J. Sanz, A. Fernández, H. Bustince, and F. Herrera, "Improving the performance of fuzzy rule-based classification systems with interval-valued fuzzy sets and genetic amplitude tuning," Inf. Sci., vol. 180, no. 19, pp. 3674–3685, 2010.
[25] J. Alcalá-Fdez, L. Sánchez, S. García, M. J. del Jesus, S. Ventura, J. Garrell, J. Otero, C. Romero, J. Bacardit, V. Rivas, J. C. Fernández, and F. Herrera, "KEEL: A software tool to assess evolutionary algorithms for data mining problems," Soft Comput., vol. 13, no. 3, pp. 307–318, 2009.
[26] J. Alcalá-Fdez, A. Fernández, J. Luengo, J. Derrac, S. García, L. Sánchez, and F. Herrera, "KEEL data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework," J. Multiple-Valued Logic Soft Comput., vol. 17, no. 2/3, pp. 255–287, 2011.
[27] J. Demšar, "Statistical comparisons of classifiers over multiple data sets," J. Mach. Learn. Res., vol. 7, pp. 1–30, 2006.
[28] S. García and F. Herrera, "An extension on statistical comparisons of classifiers over multiple data sets for all pairwise comparisons," J. Mach. Learn. Res., vol. 9, pp. 2677–2694, 2008.
[29] S. García, A. Fernández, J. Luengo, and F. Herrera, "Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power," Inf. Sci., vol. 180, no. 10, pp. 2044–2064, 2010.
[30] H. Ishibuchi and T. Nakashima, "Effect of rule weights in fuzzy rule-based classification systems," IEEE Trans. Fuzzy Syst., vol. 9, no. 4, pp. 506–515, Aug. 2001.
[31] H. Ishibuchi and T. Yamamoto, "Rule weight specification in fuzzy rule-based classification systems," IEEE Trans. Fuzzy Syst., vol. 13, no. 4, pp. 428–435, Aug. 2005.
[32] H. Bustince, E. Barrenechea, and M. Pagola, "Generation of interval-valued fuzzy and Atanassov's intuitionistic fuzzy connectives from fuzzy connectives and from Kα operators: Laws for conjunctions and disjunctions, amplitude," Int. J. Intell. Syst., vol. 23, no. 6, pp. 680–714, 2008.
[33] D. Wu and J. M. Mendel, "Linguistic summarization using if-then rules and interval type-2 fuzzy sets," IEEE Trans. Fuzzy Syst., vol. 19, no. 1, pp. 136–151, Feb. 2011.
[34] J. Sanz, A. Fernández, H. Bustince, and F. Herrera, "A genetic tuning to improve the performance of fuzzy rule-based classification systems with interval-valued fuzzy sets: Degree of ignorance and lateral position," Int. J. Approx. Reason., vol. 52, no. 6, pp. 751–766, 2011.
[35] G. Beliakov, A. Pradera, and T. Calvo, Aggregation Functions: A Guide for Practitioners (Studies in Fuzziness and Soft Computing). Berlin, Germany: Springer, 2007.
[36] T. Calvo, A. Kolesarova, M. Komornikova, and R. Mesiar, "Aggregation operators: Properties, classes and construction methods," in Aggregation Operators: New Trends and Applications. Heidelberg, Germany: Physica-Verlag, 2002.
[37] G. Deschrijver, C. Cornelis, and E. Kerre, "On the representation of intuitionistic fuzzy t-norms and t-conorms," IEEE Trans. Fuzzy Syst., vol. 12, no. 1, pp. 45–61, Feb. 2004.
[38] E. Trillas, "Sobre funciones de negación en la teoría de conjuntos difusos," Stochastica, vol. 3, no. 1, pp. 47–59, 1979 (English version reprinted in Advances in Fuzzy Logic, S. Barro et al., Eds. Universidad de Santiago de Compostela, 1998, pp. 31–43).
[39] G. Deschrijver, "Generalized arithmetic operators and their relationship to t-norms in interval-valued fuzzy set theory," Fuzzy Sets Syst., vol. 160, no. 21, pp. 3080–3102, 2009.
[40] Z. S. Xu and R. R. Yager, "Some geometric aggregation operators based on intuitionistic fuzzy sets," Int. J. General Syst., vol. 35, no. 4, pp. 417–433, 2006.
[41] H. Bustince, E. Barrenechea, and M. Pagola, "Restricted equivalence functions," Fuzzy Sets Syst., vol. 157, no. 17, pp. 2333–2346, 2006.
[42] H. Bustince, E. Barrenechea, and M. Pagola, "Image thresholding using restricted equivalence functions and maximizing the measures of similarity," Fuzzy Sets Syst., vol. 158, no. 5, pp. 496–516, 2007.
[43] J. M. Mendel, Rule-Based Fuzzy Logic Systems: Introduction and New Directions. Englewood Cliffs, NJ, USA: Prentice-Hall, 2001.
[44] E. A. Jammeh, M. Fleury, C. Wagner, H. Hagras, and M. Ghanbari, "Interval type-2 fuzzy logic congestion control for video streaming across IP networks," IEEE Trans. Fuzzy Syst., vol. 17, no. 5, pp. 1123–1142, Oct. 2009.
[45] H. Wu and J. M. Mendel, "Classification of battlefield ground vehicles using acoustic features and fuzzy logic rule-based classifiers," IEEE Trans. Fuzzy Syst., vol. 15, no. 1, pp. 56–72, Feb. 2007.
[46] R. Hosseini, S. Qanadli, S. Barman, M. Mazinani, T. Ellis, and J. Dehmeshki, "An automatic approach for learning and tuning Gaussian interval type-2 fuzzy membership functions applied to lung CAD classification system," IEEE Trans. Fuzzy Syst., vol. 20, no. 2, pp. 224–234, Apr. 2012.
[47] D. Wu and W. W. Tan, "Genetic learning and performance evaluation of interval type-2 fuzzy logic controllers," Eng. Appl. Artif. Intell., vol. 19, no. 8, pp. 829–841, 2006.
[48] R. Alcalá, J. Alcalá-Fdez, M. J. Gacto, and F. Herrera, "Rule base reduction and genetic tuning of fuzzy systems based on the linguistic 3-tuples representation," Soft Comput., vol. 11, no. 5, pp. 401–419, 2007.
[49] M. J. Gacto, R. Alcalá, and F. Herrera, "Adaptation and application of multi-objective evolutionary algorithms for rule reduction and parameter tuning of fuzzy rule-based systems," Soft Comput., vol. 13, no. 5, pp. 419–436, 2009.
[50] L. Eshelman, "The CHC adaptive search algorithm: How to have safe search when engaging in nontraditional genetic recombination," in Foundations of Genetic Algorithms. San Mateo, CA, USA: Morgan Kaufmann, 1991, pp. 265–283.
[51] D. Whitley, S. Rana, J. Dzubera, and K. E. Mathias, "Evaluating evolutionary algorithms," Artif. Intell., vol. 85, no. 1/2, pp. 245–276, 1996.
[52] A. Fernández, M. J. del Jesus, and F. Herrera, "On the influence of an adaptive inference system in fuzzy rule based classification systems for imbalanced data-sets," Expert Syst. Appl., vol. 36, no. 6, pp. 9805–9812, 2009.
[53] R. Alcalá, J. Alcalá-Fdez, and F. Herrera, "A proposal for the genetic lateral tuning of linguistic fuzzy systems and its interaction with rule selection," IEEE Trans. Fuzzy Syst., vol. 15, no. 4, pp. 616–635, Aug. 2007.
[54] A. Fernández, M. J. del Jesus, and F. Herrera, "On the 2-tuples based genetic tuning performance for fuzzy rule based classification systems in imbalanced data-sets," Inf. Sci., vol. 180, no. 8, pp. 1268–1291, 2010.
[55] F. Herrera, M. Lozano, and A. M. Sánchez, "A taxonomy for the crossover operator for real-coded genetic algorithms: An experimental study," Int. J. Intell. Syst., vol. 18, no. 3, pp. 309–338, 2003.
[56] L. J. Eshelman and J. Schaffer, "Real-coded genetic algorithms and interval schemata," in Foundations of Genetic Algorithms. San Mateo, CA, USA: Morgan Kaufmann, 1993, pp. 187–202.
[57] W. W. Cohen, "Fast effective rule induction," presented at the 12th Int. Conf. Mach. Learn., Lake Tahoe, CA, USA, 1995.
[58] S. García, A. Fernández, J. Luengo, and F. Herrera, "A study of statistical techniques and performance measures for genetics-based machine learning: Accuracy and interpretability," Soft Comput., vol. 13, no. 10, pp. 959–977, 2009.
[59] D. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures, 2nd ed. London, U.K.: Chapman & Hall/CRC, 2006.
[60] J. L. Hodges and E. L. Lehmann, "Rank methods for combination of independent experiments in analysis of variance," Ann. Math. Statist., vol. 33, pp. 482–497, 1962.
[61] S. Holm, "A simple sequentially rejective multiple test procedure," Scandinav. J. Stat., vol. 6, pp. 65–70, 1979.
José Antonio Sanz received the M.Sc. and Ph.D. degrees in computer sciences from the Public University of Navarra, Pamplona, Spain, in 2008 and 2011, respectively. He is currently an Associate Lecturer with the Department of Automatics and Computation, Public University of Navarra. He is the author of seven published original articles. His research interests include fuzzy techniques for classification problems, interval-valued fuzzy sets, genetic fuzzy systems, data mining, and medical applications using soft computing techniques. Dr. Sanz is a member of the European Society for Fuzzy Logic and Technology.
Alberto Fernández received the M.Sc. and Ph.D. degrees in computer science from the University of Granada, Granada, Spain, in 2005 and 2010, respectively. He is currently an Assistant Professor with the Department of Computer Science, University of Jaén, Jaén, Spain. His research interests include data mining, classification in imbalanced domains, fuzzy rule learning, evolutionary algorithms, resolution of multiclassification problems with ensembles, and decomposition techniques. Dr. Fernández received the "Lotfi A. Zadeh Prize" of the International Fuzzy Systems Association for the "Best Paper in 2009–2010" for his work on a hierarchical fuzzy rule-based classification system with genetic rule selection for imbalanced data-sets.
Humberto Bustince (M'06) received the B.Sc. degree in physics from the University of Salamanca, Salamanca, Spain, in 1983 and the Ph.D. degree in mathematics from the Public University of Navarra, Pamplona, Spain, in 1994. He has been a Teacher with the Public University of Navarra since 1991, where he is currently a Full Professor with the Department of Automatics and Computation. He served as Subdirector of the Technical School for Industrial Engineering and Telecommunications from January 1, 2003 to October 30, 2008, and was involved in the introduction of Computer Science studies at the Public University of Navarra. He currently teaches artificial intelligence to computer science students. He has authored more than 70 journal papers (Web of Knowledge) and more than 73 contributions to international conferences, and he is a coauthor of four books on fuzzy theory and extensions of fuzzy sets. His current research interests include interval-valued fuzzy sets, Atanassov's intuitionistic fuzzy sets, aggregation functions, implication operators, inclusion measures, image processing, decision making, and approximate reasoning. Dr. Bustince is a Fellow of the IEEE Computational Intelligence Society and a member of the Board of the European Society for Fuzzy Logic and Technology. He is currently the Editor-in-Chief of the Mathware & Soft Computing Magazine and of Notes on Intuitionistic Fuzzy Sets. He is also a Guest Editor of Fuzzy Sets and Systems and a member of the Editorial Boards of the Journal of Intelligent & Fuzzy Systems, the International Journal of Computational Intelligence Systems, and Axioms.
Francisco Herrera (M'10) received the M.Sc. and Ph.D. degrees in mathematics from the University of Granada, Granada, Spain, in 1988 and 1991, respectively. He is currently a Professor with the Department of Computer Science and Artificial Intelligence, University of Granada. He has contributed more than 230 papers to international journals and is a coauthor of Genetic Fuzzy Systems: Evolutionary Tuning and Learning of Fuzzy Knowledge Bases (Singapore: World Scientific, 2001). His current research interests include computing with words and decision making, bibliometrics, data mining, data preparation, instance selection, fuzzy rule-based systems, genetic fuzzy systems, knowledge extraction based on evolutionary algorithms, memetic algorithms, and genetic algorithms. Dr. Herrera is the Editor-in-Chief of Progress in Artificial Intelligence. He is an Area Editor of the International Journal of Computational Intelligence Systems and an Associate Editor of the IEEE TRANSACTIONS ON FUZZY SYSTEMS, Information Sciences, Knowledge and Information Systems, Advances in Fuzzy Systems, and the International Journal of Applied Metaheuristic Computing. He also serves on several journal Editorial Boards, including Fuzzy Sets and Systems, Applied Intelligence, Information Fusion, Evolutionary Intelligence, the International Journal of Hybrid Intelligent Systems, Memetic Computation, and Swarm and Evolutionary Computation. His honors and awards include the ECCAI Fellowship (2009), the 2010 Spanish National Award on Computer Science ARITMEL to the "Spanish Engineer on Computer Science," the International Cajastur "Mamdani" Prize for Soft Computing (Fourth edition, 2010), the IEEE TRANSACTIONS ON FUZZY SYSTEMS Outstanding 2008 Paper Award (bestowed in 2011), and the 2011 Lotfi A. Zadeh Prize Best Paper Award from the International Fuzzy Systems Association.