
Fuzzy Rule Weights Optimization based on Imperialist Competitive Algorithm

Mansoureh Rezaei

Reza Boostani

Computer Science & Engineering Department Electrical and Computer Engineering Faculty Shiraz University, Shiraz, Iran [email protected]

Computer Science & Engineering Department Electrical and Computer Engineering Faculty Shiraz University, Shiraz, Iran [email protected]

Abstract— Fuzzy rule-based systems are well suited to classification problems because they are both interpretable and accurate. This paper aims to improve the performance of Fuzzy Rule-Based Classification Systems (FRBCS) by learning their rule weights with the Imperialist Competitive Algorithm (ICA). ICA is chosen among the evolutionary algorithms because it is less prone to the premature convergence that affects other competitive algorithms. To evaluate the proposed method, several datasets from the UCI database are used as benchmarks. The FRBCS optimized by ICA is compared with FRBCSs whose weights are adjusted by other evolutionary algorithms, namely the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). On most of the datasets, the results indicate that the proposed combined scheme outperforms these rivals.

Keywords- FRBCS; Imperialist Competitive Algorithm (ICA); GA; PSO.

I. INTRODUCTION

Fuzzy logic was first proposed by Zadeh [1] and has since been deployed in many applications [2]. However, fuzzy systems lack a systematic design procedure and a learning capability [3]. Recently, optimization algorithms have been used to adjust the structure of fuzzy systems and so overcome these shortcomings in adaptation and learning. Although some efforts have been made to improve Fuzzy Rule-Based Classification Systems (FRBCS) with neural networks [4-6], evolutionary algorithms such as the Genetic Algorithm (GA) [7-11] and Particle Swarm Optimization (PSO) [12-14] have been used repeatedly for this purpose. Some approaches embed the genetic algorithm deeply in their structure: they encode the fuzzy rules in chromosomes and evolve the rule antecedents with novel operators to create rules that are both efficient and short [15]. Moreover, PSO and GA have been employed to optimize membership function parameters [16], rule aggregation [17] and defuzzification methods [18-19]. Nevertheless, FRBCSs optimized by current evolutionary algorithms still suffer from premature convergence, low speed, and weakness in dealing with non-convex, nonlinear, mixed-integer optimization problems.

Dividing each attribute interval into more segments improves the decision accuracy, but interpretability becomes problematic because many rules with long antecedents arise. Hence, there is a trade-off between the interpretability and the accuracy of fuzzy rule-based systems. To address this problem, Ishibuchi [8] proposed a flexible structure that creates fuzzy rules with different antecedent lengths by iteratively increasing the number of intervals. In this paper, we employ Ishibuchi's algorithm [8] as the core, and the Imperialist Competitive Algorithm (ICA) optimizes the rule weights to improve the performance of the reasoning method. ICA is a socio-politically motivated global search strategy [23] that can tune the rule weights continuously, provided that enough randomness is involved in the algorithm.

The rest of this paper is organized as follows. Section II explains the proposed method in detail. Section III presents the experimental results and discusses the advantages and disadvantages of the proposed method compared with other heuristic algorithms. Section IV concludes the paper and outlines future work.

978-1-4799-3351-8/14/$31.00 ©2014 IEEE

II. METHODS

In this part, Ishibuchi's algorithm [8] is explained first. Then ICA is briefly introduced, and finally the proposed algorithm is described.

A. Ishibuchi's method

A fuzzy if-then rule for a pattern classification problem is expressed as follows:

$$R_j: \text{If } X_1 \text{ is } A_{1j} \text{ and } \ldots \text{ and } X_n \text{ is } A_{nj} \text{ Then Class } C_j \tag{1}$$

where $X = [X_1, X_2, \ldots, X_n]$ is an n-dimensional pattern vector, $A_{ij}$ ($i = 1, 2, \ldots, n$) are antecedent linguistic values and $C_j$ is the consequent class of the jth rule $R_j$. Generally, to design a rule-based classification system for an M-class problem, a set of fuzzy rules of the form (1) should be provided. First, each attribute is rescaled to the unit interval [0, 1]. Then, the pattern space is partitioned into subspaces, each of which is identified by a fuzzy rule [21]. One approach to creating fuzzy rules is to consider all combinations of antecedent linguistic values and generate a complete fuzzy rule base that covers all possible states. This approach fails for high-dimensional inputs, because the complexity increases drastically and the interpretability decreases sharply. For this reason, rule evaluation measures have been proposed to select, among all candidates, a small subset of rules that covers the most probable states and increases the interpretability. In Ishibuchi's approach [8], an arbitrarily specified number of rules can be generated for classification. Because some antecedent conditions can be "don't care", the number of antecedent conditions in our fuzzy if-then rules is not always n. Usually, the confidence and the support are used as the rule evaluation measures. In [20], the confidence of a fuzzy rule is measured as:

$$\mathrm{Conf}(A_j \rightarrow \text{Class } T) = \frac{\sum_{X_p \in \text{Class } T} \mu_j(X_p)}{\sum_{p=1}^{m} \mu_j(X_p)} \tag{2}$$

where m is the number of given training patterns. The consequent class $C_j$ of the fuzzy rule $R_j$ is determined by finding the class with the maximum confidence obtained by (2). Another rule evaluation measure is the support, which is defined in [22] as follows:

$$S_F(A_j \rightarrow \text{Class } C_j) = \frac{1}{m} \sum_{X_p \in \text{Class } C_j} \mu_j(X_p) \tag{3}$$

The compatibility grade $\mu_j(X_p)$ is usually defined by the minimum operator or the product operator. Here, we use the product operator:

$$\mu_j(X_p) = \prod_{i=1}^{n} \mu_{ji}(x_{pi}) \tag{4}$$

In [8], the goal was to consider both confidence and support as the rule evaluation criterion. In this regard, the following definition is used:

$$S(A_j \rightarrow \text{Class } T) = \mathrm{Conf}(A_j \rightarrow \text{Class } T) \times \frac{\sum_{p=1}^{m} \mu_j(X_p)}{m} \tag{5}$$

The class with the maximum value of S is selected as the consequent of the rule with antecedent $A_j$. The generated fuzzy if-then rules are divided into M groups according to their consequent classes. Finally, the best rules in each group are selected.

B. Imperialist Competitive Algorithm

Imperialist Competitive Algorithm (ICA) was introduced by Atashpaz-Gargari and Lucas [23]. It is a computational method inspired by human social and political behavior. In continuous optimization, ICA has provided acceptable results in several applications, and in some of them it has outperformed other evolutionary algorithms such as the genetic algorithm (GA) and particle swarm optimization (PSO) [23]. The algorithm starts with a random population of initial countries. Each country has a fitness determined by the optimization problem. Some of the countries with the best fitness are selected as imperialists, and all other countries become their colonies. The colonies are distributed among the imperialists according to the imperialists' power. After all colonies have been assigned and the initial empires created, the colonies start moving toward their imperialist. This movement is a simple model of the assimilation policy pursued by some imperialists. Fig. 1 shows the movement of a colony toward its imperialist. In this assimilation policy, θ and x are random numbers with uniform distributions as given in (6), and d is the distance between the colony and the imperialist.

$$x \sim U(0, \beta \times d), \quad \theta \sim U(-\gamma, \gamma) \tag{6}$$

Here β and γ are arbitrary values that modify the area colonies randomly search around the imperialist. The total power of an empire is the sum of the power of its imperialist and a percentage of the mean power of its colonies. In the imperialistic competition phase, all empires try to take possession of the colonies of other empires and control them. Fig. 2 shows this model of imperialistic competition. The termination criterion depends on the problem: the algorithm may stop when only one imperialist remains and all other countries have become its colonies, after a predefined number of iterations, or when the optimization condition is satisfied.
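The assimilation move in (6) and the total-power rule of an empire can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `assimilate` and `empire_total_cost`, the parameter name `xi` for the colony percentage, and the choice to apply the angular deviation θ along a random direction orthogonal to the colony-imperialist line are all assumptions made here, and power is expressed as a cost to be minimized.

```python
import numpy as np


def assimilate(colony, imperialist, beta, gamma, rng):
    """Move a colony toward its imperialist following (6)."""
    colony = np.asarray(colony, dtype=float)
    imperialist = np.asarray(imperialist, dtype=float)
    diff = imperialist - colony
    d = np.linalg.norm(diff)            # distance d between colony and imperialist
    if d == 0.0:
        return colony.copy()            # the colony already sits on the imperialist
    u = diff / d                        # unit vector pointing at the imperialist
    x = rng.uniform(0.0, beta * d)      # x ~ U(0, beta * d)
    theta = rng.uniform(-gamma, gamma)  # theta ~ U(-gamma, gamma)
    # Assumption: the deviation acts along a random direction orthogonal to u.
    r = rng.standard_normal(colony.shape)
    v = r - np.dot(r, u) * u
    v_norm = np.linalg.norm(v)
    if v_norm < 1e-12:                  # no orthogonal direction (e.g. 1-D problems)
        return colony + x * u
    v /= v_norm
    return colony + x * (np.cos(theta) * u + np.sin(theta) * v)


def empire_total_cost(imperialist_cost, colony_costs, xi):
    """Imperialist cost plus a fraction xi of the mean cost of its colonies."""
    if len(colony_costs) == 0:
        return float(imperialist_cost)
    return float(imperialist_cost) + xi * float(np.mean(colony_costs))


# Illustrative values only.
rng = np.random.default_rng(0)
colony = np.array([0.5, 1.0, 1.5])
imperialist = np.array([1.0, 1.0, 1.0])
moved = assimilate(colony, imperialist, beta=2.0, gamma=np.pi / 4, rng=rng)
```

With β greater than 1 the colony can overshoot the imperialist, which is what lets it search on both sides of it; the step length never exceeds β × d.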

Figure 1: Colony movement

Figure 2: Imperialist competition algorithm

C. The Proposed Method

The main goal of this paper is to optimize only the weights of the constructed fuzzy rules, without changing the number of rules or their antecedent lengths during the evolution. Our approach consists of two phases: candidate rule generation by rule evaluation measures, and rule weighting by the Imperialist Competitive Algorithm. First, candidate fuzzy if-then rules are generated from numerical data and prescreened using the rule evaluation measures of confidence and support. Then the weights of these rules are optimized. In this implementation, finding the optimal weights is formulated as a global search problem. Here, a country is an array of variable values, namely the weights of the rule set constructed by Ishibuchi's algorithm. In the first approach, weights are assigned only to the consequent part of the rules, while in the second proposed approach both antecedents and consequents are assigned weights. In the second approach, the number of weights in the antecedent part of each rule is equal to the rule's length. Therefore, each country encodes a rule set in which every rule has its own weights, which are initially selected at random. The proposed method is explained in the form of 8 stages described as follows:

(1) Pre-processing phase. The rule set is constructed according to Ishibuchi's algorithm [8] and the initial population of weights is created randomly. In this phase, the prescreened fuzzy rule set is generated; the number of rules differs depending on the type of dataset.

(2) Initialize the empires. The pattern classification rate is the main factor for the fuzzy rule set, and we consider it as the cost function, defined as follows:

$$\text{accuracy}(R) = \frac{n_{correct}}{D_{tot}} \tag{7}$$

where R is a rule set, $n_{correct}$ is the number of training instances correctly classified by R and $D_{tot}$ denotes the number of input samples. Depending on the fitness, some of these rule sets are selected as imperialists and the rest are considered colonies in the current generation.

(4) Apply assimilation operators. In this phase, the weights of the colonies (each colony is considered a rule set) are changed according to the weights of their imperialist.

(5) Apply revolution operators to each colony. With this operator, some of the rule sets are selected randomly and their weights are changed to random numbers (mutation).

(6) Apply empire possession operators. If the cost of a colony is better than that of its imperialist, the colony takes the place of the imperialist.

(7) Apply imperialist competition operators. In this phase, one of the rule sets of the weakest empire is assigned to another empire.

(8) Unify two empires if their costs are similar. If two rule sets are similar, they are merged into one rule set.

In the second proposed approach, in which both antecedents and consequents are assigned weights, all the above steps are repeated, except that the number of variables to be adjusted (the country length) increases. The number of weight values in this approach is the number of antecedents and consequents.

III. EXPERIMENTAL RESULT

In this section, the performance of the proposed method is compared with that of Ishibuchi's method, Ishibuchi's method combined with the genetic algorithm, Ishibuchi's method combined with Particle Swarm Optimization, and the genetic fuzzy system methods of [24-25]. The parameters of the problem are listed in Table II. In the implementation, β is set to about 2 and γ to about π/2 radians, which resulted in good convergence of the countries to the global minimum. Triangular fuzzy membership functions are used because they provide better interpretability. For this evolutionary fuzzy system, the interval of each attribute is segmented into 14 parts, each overlapping its neighboring membership function by 50%, and the single-winner method is used for the final reasoning. In the implemented Ishibuchi's algorithm, each rule antecedent contains just one, two or three features, so the maximum rule length is three. The constructed rules are then evaluated by ten-times ten-fold cross validation. The weight values can vary between zero and two; this interval was determined empirically. Ten datasets from the UCI machine learning database are selected for evaluating the proposed method. Table I lists these datasets with their numbers of instances, attributes and classes. Table III compares the results of the proposed method with those of the other algorithms.

As Table III shows, the proposed scheme outperforms the GA and PSO optimization methods on most datasets, especially imbalanced datasets such as Glass: its error rate is lower than that of the GA-based scheme on seven of the ten datasets. Against PSO, the proposed method gives a lower error rate on all ten datasets; ICA is similar to PSO but is more integrated, and enough randomness is involved in the algorithm. For the same maximum number of iterations, the proposed scheme shows better convergence because it preserves enough diversity throughout the search.
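The ingredients of this setup (triangular membership functions with 50% overlap, the product operator for the compatibility grade, single-winner reasoning and the accuracy cost) can be put together in a short Python sketch. It is a hedged illustration rather than the authors' code: the rule encoding as `(antecedent, class)` pairs, the helper names, and the way a rule weight multiplies the compatibility grade before the winner is chosen are assumptions made for this example. Features missing from an antecedent are treated as "don't care".

```python
import numpy as np


def triangular_partition(x, k):
    """Memberships of x (rescaled to [0, 1]) in k evenly spaced triangular sets.

    Neighboring sets cross at membership 0.5, i.e. they overlap by 50%.
    Assumes k >= 2.
    """
    centers = np.linspace(0.0, 1.0, k)
    width = 1.0 / (k - 1)
    return np.clip(1.0 - np.abs(x - centers) / width, 0.0, 1.0)


def compatibility(x, antecedent, k):
    """Product-operator compatibility grade of pattern x with one rule.

    antecedent maps a feature index to a fuzzy-set index.
    """
    grade = 1.0
    for feature, fuzzy_set in antecedent.items():
        grade *= triangular_partition(x[feature], k)[fuzzy_set]
    return grade


def classify(x, rules, weights, k):
    """Single-winner reasoning: the rule with the largest weighted grade decides."""
    scores = [w * compatibility(x, ant, k) for (ant, _), w in zip(rules, weights)]
    best = int(np.argmax(scores))
    return rules[best][1] if scores[best] > 0 else None  # None: no rule fires


def accuracy(X, y, rules, weights, k):
    """Fraction of patterns classified correctly by the weighted rule set."""
    correct = sum(classify(x, rules, weights, k) == label for x, label in zip(X, y))
    return correct / len(y)


# Toy example (hypothetical data): one feature, three fuzzy sets, two rules.
rules = [({0: 0}, 0), ({0: 1}, 1)]
X = [[0.0], [0.5], [0.2]]
y = [0, 1, 1]
acc_equal = accuracy(X, y, rules, [1.0, 1.0], k=3)
acc_tuned = accuracy(X, y, rules, [1.0, 2.0], k=3)
```

In the toy example the third pattern is misclassified with equal weights and correctly classified once the second rule's weight is raised, which is the effect that weight tuning inside the interval [0, 2] relies on.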

TABLE I. THE DATASETS EMPLOYED IN EXPERIMENTAL STUDY

| Dataset | Number of Instances | Number of Attributes | Number of Classes |
|---|---|---|---|
| Bupa | 345 | 5 | 2 |
| Iris | 195 | 3 | 4 |
| Glass | 214 | 9 | 7 |
| Haberman | 306 | 3 | 2 |
| New thyroid | 215 | 5 | 3 |
| Cancer | 286 | 9 | 2 |
| Wine | 178 | 13 | 3 |
| Pima | 768 | 8 | 2 |
| Balance | 625 | 4 | 3 |
| Ecoli | 336 | 7 | 8 |

TABLE II. THE BEST PARAMETER SETTINGS

| Parameters | value | Parameters | value |
|---|---|---|---|
| Number of Initial Countries | 100 | Damp Ratio | 0.9 |
| Number of Imperialist | 8 | Uniting Threshold | 0.02 |
| Revolution rate | 0.3 | Number of Generations | 200 |
| Assimilation Coefficient | 2 | β | 2 |
| Variable Interval | [0,2] | γ | π/4 |
| | | Assimilation Angle Coefficient | 0.5 |
| | | Coefficient of countries impact power of the empire | 0.02 |

TABLE III. ERROR RATES FOR ALL CLASSIFIERS ON TEN DATASETS BY TEN-TIMES TEN-FOLDS CROSS VALIDATION.

| Dataset | Ishibuchi | Proposed Method | Ishibuchi & Genetic | Ishibuchi & PSO | Target [24] | Chi-IVFS-Amp [25] |
|---|---|---|---|---|---|---|
| Iris | 3.57 | 1.42 | 2.14 | 3.05 | 7.07 | 5.33 ± 2.98 |
| Bupa | 41.43 | 39.11 | 39.41 | 39.15 | 34.03 | 42.3 ± 1.59 |
| Glass | 46.33 | 34.73 | 36.19 | 36.19 | 55.89 | 40.14 ± 5.24 |
| Haberman | 25.72 | 25.09 | 24.74 | 25.95 | 28.5 | 27.12 ± 0.84 |
| New thyroid | 7.28 | 7.26 | 6.78 | 8.21 | 13.21 | 14.88 ± 3.53 |
| Cancer | 5.32 | 4.58 | 3.84 | 4.99 | 4.25 | 3.95 ± .03 |
| Wine | 6.74 | 3.12 | 4.49 | 5.21 | 17.76 | 5.67 ± 1.68 |
| Pima | 28.63 | 26.01 | 26.37 | 27.13 | 26.98 | 27.61 ± 2.41 |
| Balance | 15.95 | 13.36 | 16.89 | 15.95 | 24.38 | 9.12 ± 7.24 |
| Ecoli | 26.66 | 26.07 | 27.56 | 26.66 | 34.51 | 28.55 ± 7.22 |
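The comparison against the other weight-tuning schemes can be checked with a short tally over the error rates of Table III. The Python snippet below is just a convenience sketch: the dictionary layout and the name `wins` are assumptions, and the values are transcribed from the table in the column order Ishibuchi, Proposed Method, Ishibuchi & Genetic, Ishibuchi & PSO.

```python
# Error rates from Table III: (Ishibuchi, Proposed, Ishibuchi & Genetic, Ishibuchi & PSO)
error_rates = {
    "Iris":        (3.57, 1.42, 2.14, 3.05),
    "Bupa":        (41.43, 39.11, 39.41, 39.15),
    "Glass":       (46.33, 34.73, 36.19, 36.19),
    "Haberman":    (25.72, 25.09, 24.74, 25.95),
    "New thyroid": (7.28, 7.26, 6.78, 8.21),
    "Cancer":      (5.32, 4.58, 3.84, 4.99),
    "Wine":        (6.74, 3.12, 4.49, 5.21),
    "Pima":        (28.63, 26.01, 26.37, 27.13),
    "Balance":     (15.95, 13.36, 16.89, 15.95),
    "Ecoli":       (26.66, 26.07, 27.56, 26.66),
}

# Count the datasets on which the proposed method has a strictly lower error rate.
wins = {"ishibuchi": 0, "ga": 0, "pso": 0}
for ishibuchi, proposed, ga, pso in error_rates.values():
    wins["ishibuchi"] += proposed < ishibuchi
    wins["ga"] += proposed < ga
    wins["pso"] += proposed < pso

for rival, count in wins.items():
    print(f"proposed method beats {rival} on {count} of {len(error_rates)} datasets")
```

The tally only compares mean error rates; it says nothing about statistical significance, for which the per-fold results would be needed.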

IV. CONCLUSION

We have presented an algorithm that uses Ishibuchi's method in its first phase and the Imperialist Competitive Algorithm in its second phase, and we have shown that the proposed scheme outperforms the GA and PSO optimization methods on most datasets. The method involves two tasks, rule construction and weight adjustment: Ishibuchi's method is used for rule construction and the Imperialist Competitive Algorithm for weight adjustment. In this way, we were able to obtain appropriate rule weights. It would be more suitable still if the rules themselves were constructed by the evolutionary algorithm, namely by ICA. As future work, both the rules and their weights could be optimized using the Imperialist Competitive Algorithm.

REFERENCES

[1] L. A. Zadeh, "Fuzzy logic and approximate reasoning," Synthese, vol. 30, pp. 407-428, May 1975.
[2] D. Dubois, H. Prade, R. R. Yager, Fuzzy information engineering: a guided tour of applications, John Wiley & Sons, 1st ed., Michigan, 1996.
[3] O. Cordón, Genetic Fuzzy Systems, Evolutionary Tuning And Learning Of Fuzzy Knowledge Bases, Wspc, Fuzzy Systems - Applications & Theory, 2002.
[4] R. P. Li, M. Mukaidono and I. Burhan Turksen, "A fuzzy neural network for pattern classification and feature selection," Fuzzy Sets and Systems, vol. 130, pp. 101-108, Aug. 2002.
[5] Sa. H. Lee, J. S. Lim, and D. K. Shin, "Extracting Fuzzy Rules to Classify Motor Imagery Based on a Neural Network with Weighted Fuzzy Membership Function," Communications in Computer and Information Science, vol. 87, pp. 7-14, 2010.
[6] V. Ravi, H.-J. Zimmermann, "A neural network and fuzzy rule base hybrid for pattern classification," Soft Computing, vol. 5, pp. 152-159, April 2001.
[7] Y. C. Hu, R. S. Chen, and G. H. Tzeng, "Finding fuzzy classification rules using data mining techniques," Pattern Recognition Letters, vol. 24, pp. 509-519, Jan. 2003.
[8] H. Ishibuchi and T. Yamamoto, "Fuzzy rule selection by multiobjective genetic local search algorithms and rule evaluation measures in data mining," Fuzzy Sets and Systems, vol. 141, pp. 59-88, Jan. 2004.
[9] H. Ishibuchi, T. Yamamoto and T. Murata, "Three-objective genetics-based machine learning for linguistic rule extraction," Information Sciences, vol. 136, no. 4, pp. 109-133, Aug. 2001.
[10] H. Ishibuchi, K. Nozaki, N. Yamamoto and H. Tanaka, "Selecting Fuzzy If-Then Rules for Classification Problems Using Genetic Algorithms," IEEE Transactions on Fuzzy Systems, vol. 3, no. 3, pp. 260-270, 1995.
[11] H. Ishibuchi, T. Murata and I. B. Turksen, "Single-Objective and Two-Objective Genetic Algorithms for Selecting Linguistic Rules for Pattern Classification Problems," Fuzzy Sets and Systems, vol. 89, no. 2, pp. 135-150, 1997.
[12] C. C. Chen, "Design of PSO-based Fuzzy Classification Systems," Science and Engineering, vol. 9, no. 1, pp. 63-70, 2006.
[13] D. Chen, J. Wang, F. Zou, H. Zhang, and W. Hou, "Linguistic fuzzy model identification based on PSO with different length of particles," Applied Soft Computing, vol. 12, pp. 3390-3400, Nov. 2012.
[14] A. Kashnipour, N. Shamshiri Milani, A. R. Kashanipour, H. Haji Eghrari, "Robust Color Classification Using Fuzzy Rule-Based Particle Swarm Optimization," Image and Signal Processing, vol. 2, pp. 110-114, 2008.
[15] E. G. Mansoori, M. J. Zolghadri, and S. D. Katebi, "SGERD: A Steady-State Genetic Algorithm for Extracting Fuzzy Classification Rules From Data," IEEE Transactions on Fuzzy Systems, vol. 16, no. 4, pp. 1061-1071, Aug. 2008.
[16] E. Nakamura and N. Kehtarnavaz, "Optimization of fuzzy membership function parameters," IEEE International Conference on Systems, Man and Cybernetics, vol. 1, pp. 1-6, Oct. 1995.
[17] Z. Jing, G. Qiang, H. Zhigang and H. Z. Ting, "Optimization of fuzzy rules by Multi-objective genetic algorithm in avionic fault diagnosis system," IEEE International Conference on Intelligent Computing and Intelligent Systems, vol. 1, pp. 433-437, Nov. 2009.
[18] O. Cordon, F. Herrera, F. A. Marquez and A. Peregrin, "A Study on the Evolutionary Adaptive Defuzzification Methods in Fuzzy Modeling," International Journal of Hybrid Intelligent Systems, vol. 1, pp. 36-48, April 2004.
[19] F. Valdez and P. Melin, "A New Evolutionary Method with Particle Swarm Optimization and Genetic Algorithms Using Fuzzy Systems to Dynamically Parameter Adaptation," Soft Computing for Recognition Based on Biometrics, Studies in Computational Intelligence, vol. 312, pp. 225-243, 2010.
[20] H. Ishibuchi and T. Nakashima, "Effect of rule weights in fuzzy rule-based classification systems," IEEE Transactions on Fuzzy Systems, vol. 9, no. 4, pp. 506-515, Aug. 2001.
[21] H. Ishibuchi, K. Nozaki, and H. Tanaka, "Distributed representation of fuzzy rules and its application to pattern classification," Fuzzy Sets and Systems, vol. 52, no. 1, pp. 21-32, 1992.
[22] T. P. Hong, C. S. Kuo, and S. C. Chi, "Trade-off between computation time and number of rules for fuzzy mining from quantitative data," Int. J. Uncertainty, Fuzziness Knowl.-Based Syst., vol. 9, no. 5, pp. 587-604, 2001.
[23] E. Atashpaz-Gargari and C. Lucas, "Imperialist Competitive Algorithm: An Algorithm for Optimization Inspired by Imperialistic Competition," IEEE Congress on Evolutionary Computation (CEC 2007), pp. 4661-4667, 2007.
[24] J. Brian Gray and G. Fan, "Classification tree analysis using TARGET," Computational Statistics & Data Analysis, vol. 52, issue 3, pp. 1362-1372, Jan. 2008.
[25] J. A. Sanz, A. Fernandez, H. Bustince, and F. Herrera, "Improving the performance of fuzzy rule-based classification systems with interval-valued fuzzy sets and genetic amplitude tuning," Information Sciences, vol. 180, issue 19, pp. 3674-3685, Oct. 2010.
