February 26, 2010 9:15 WSPC/S1469-0268
157-IJCIA
00276
International Journal of Computational Intelligence and Applications, Vol. 9, No. 1 (2010) 49–67 © Imperial College Press DOI: 10.1142/S1469026810002768
MIXED GENETIC ALGORITHM APPROACH FOR FUZZY CLASSIFIER DESIGN
D. DEVARAJ
Professor and Head, Department of Electrical and Electronics Engineering, Arulmigu Kalasalingam College of Engineering, Krishnankoil-626190, Tamil Nadu, India
[email protected]
P. GANESH KUMAR
Lecturer, Department of Information Technology, Anna University Coimbatore, Coimbatore 641047, Tamil Nadu, India
pganeshkumar [email protected]
Received 23 December 2008
Revised 16 July 2009
An important issue in the design of fuzzy rule based systems (FRBS) is the formation of the fuzzy if-then rules and the membership functions. This paper presents a Mixed Genetic Algorithm (MGA) approach to obtain the optimal rule set and membership functions of a fuzzy classifier. When the genetic algorithm (GA) is applied to fuzzy classifier design, the membership functions are represented as real numbers and the fuzzy rules as a binary string. Modified forms of the crossover and mutation operators are proposed to deal with the mixed string. The proposed genetic operators help to improve the convergence of the GA and the accuracy of the classifier. The performance of the proposed approach is evaluated by developing fuzzy classifiers for seven standard data sets. The simulation study shows that the proposed algorithm produces a fuzzy classifier with a minimum number of rules and high classification accuracy. Statistical analysis of the test results shows the superiority of the proposed algorithm over the existing methods.
Keywords: Fuzzy classifier; if-then rules; membership function; genetic algorithm.
1. Introduction
Fuzzy Rule Based Systems (FRBS) have been successfully applied in modeling,9,32,43 control,42,32 and classification problems.27 The key to the success of an FRBS is its ability to incorporate human expert knowledge in decision-making. The important tasks in the design of an FRBS are the formation of the fuzzy if-then rules and of the membership functions. Generally, the rules and membership functions are formed from the experience of human experts. However, for problems with many input variables, the possible number of rules increases exponentially, which makes it difficult for experts to define a complete rule set. Data-driven approaches30,34,47 have been proposed for developing the FRBS from numerical data without the knowledge of domain experts. Abe et al.1,2 proposed a rule generation method in which each fuzzy if-then rule is represented by a hyper box in the multidimensional pattern space. This approach is very weak in self-learning and in determining the required number of fuzzy if-then rules. Ishibuchi et al.26,28 proposed a heuristic method for generating fuzzy if-then rules for pattern classification problems using a grid-type fuzzy partition. In this method, prior knowledge of the linguistic values is required for specifying the membership functions. This method also fails to handle high dimensional problems because of the curse of dimensionality.
During the late 1990s, attempts were made to improve the learning ability of fuzzy rule based systems using soft computing techniques. The Genetic Fuzzy Rule Based System11 (GFRBS) is one such approach, in which an FRBS is augmented by a learning process based on a Genetic Algorithm (GA). GAs18 are search algorithms based on the mechanics of natural genetics. The GFRBS proposed in the literature fall into four categories: (i) learning fuzzy rules with fixed fuzzy membership functions,25 (ii) learning fuzzy membership functions with fixed fuzzy rules,49 (iii) learning fuzzy rules and membership functions in stages (i.e., first evolving good fuzzy rule sets using fixed membership functions, then tuning the membership functions using the derived fuzzy rule sets)38,39,41 and (iv) learning fuzzy rules and membership functions simultaneously.40,46 This paper follows the last approach.
The choice of an efficient representation is one of the most important issues in designing a fuzzy system using GA. Ishibuchi et al.27 introduced the concept of distributed fuzzy if-then rules. They coded a set of fuzzy if-then rules as a binary string and treated it as an individual in the GA. Heuristic procedures have been used in Ref. 25 to improve the searching capability of the fuzzy classifier system for large-scale pattern classification problems. An integer string was used in Ref. 49 to represent the rule set and the membership functions. In addition, the genetic parameters of the evolutionary algorithm were adapted via a fuzzy system. A two-step approach is proposed in Ref. 41 for function approximation, dynamic system modeling and data classification problems. Here, fuzzy clustering is first applied to obtain a compact initial rule-based model. Then a real-coded GA10 is applied for the simultaneous optimization of the parameters of the membership functions and the rule sets. In Wang et al.46 a GA-based fuzzy-knowledge integration framework has been proposed to integrate multiple fuzzy knowledge sources into a single knowledge base. Here, each fuzzy rule set with its associated membership functions is first transformed into an intermediary representation and then further encoded as a string.
A number of crossover operators have been used in the literature for real coded genetic algorithms. Herrera et al.22 give a detailed taxonomy of the crossover operators for real coded genetic algorithms. Hybrid GA approaches that use a combination of binary strings and real numbers are proposed in Refs. 46, 40, 7, 3 and 4. Wang et al.46 applied a two-substring crossover and a two-part mutation operator to the chromosome. Russo et al.41 used multi-cut crossover for the binary strings and weighted mean crossover for the real parameters. Casillas et al.7 used a threefold coding scheme that uses real and integer coded chromosomes. The authors used max-min arithmetic crossover for the real part and two-point crossover for the integer part. Alcala et al.3 proposed a double coding scheme for both rule selection and weight derivation. They used two-point crossover for the binary string and max-min arithmetic crossover for the real parameters. Alcala et al.4 used two different kinds of coding schemes based on two different types of tuning (global tuning of the semantics and local tuning of the rules). For both cases, real parameter coding is considered and the Parent Centric BLX (PCBLX) and Half Uniform (HUX) crossover operators are used.
This paper proposes a Mixed Genetic Algorithm (MGA) approach in which a mixed form of representation is used to encode the membership functions and the rule set. In the proposed MGA, floating point numbers represent the membership functions and binary strings represent the rule set. For effective genetic operation, modified forms of the crossover and mutation operators that can deal with the mixed string are proposed. Two-point crossover, the Gene Cross Swap Operator (GCSO),14 and bit-wise mutation are applied to the rule set. Arithmetic crossover and uniform mutation are applied to the membership functions. The performance of the proposed approach has been tested using seven benchmark datasets.
2. Fuzzy Logic for Data Classification
Classification is a supervised learning problem that takes labeled data samples and generates a model (classifier) that assigns new data samples to predefined groups or classes. The classification problem can be solved using fuzzy logic. Fuzzy logic45 uses fuzzy set theory, in which a variable is a member of one or more sets with a specified degree of membership. When applied to computers, fuzzy logic allows them to emulate the human reasoning process, quantify imprecise information, and make decisions based on vague and incomplete data, yet arrive at definite conclusions by applying a “defuzzification” process. Fuzzy classifiers consist of interpretable if-then rules relating the input features to the output class, of the form

$$R_j:\ \text{if } x_{p1} \text{ is } A_{j1} \text{ and } \ldots \text{ and } x_{pn} \text{ is } A_{jn} \text{ then class } C_j$$

where $A_{j1}, \ldots, A_{jn}$ are the antecedent fuzzy sets of the input variables $x_{p1}, \ldots, x_{pn}$ and $C_j$ is one of the output class labels. A collection of such rules is used as the knowledge base of the fuzzy classifier. With the input-output relationship expressed as a collection of fuzzy if-then rules, in which the “if” part uses the linguistic values of each fuzzy set and the “then” part holds a class label, qualitative reasoning is performed to infer the result. In this paper, a Mamdani inference system31 with product t-norm and max t-conorm is used. Here the set of input variables is matched against the “if” part of each if-then rule, and the response of each rule is obtained through the fuzzy implication operation.
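The matching step can be illustrated with a short sketch. It is written in Python purely for illustration (the simulations in this paper were run in MATLAB), and the function names, class labels and membership degrees are hypothetical. It assumes that the product t-norm combines the antecedent degrees of one rule and that the max t-conorm combines the rules that vote for the same class.

```python
from math import prod

def rule_firing_strength(memberships):
    """Product t-norm over the antecedent membership degrees of one rule."""
    return prod(memberships)

def classify(rule_strengths):
    """Combine (class_label, firing_strength) pairs with the max t-conorm.

    Returns the winning label and the per-class confidence.
    """
    confidence = {}
    for label, strength in rule_strengths:
        confidence[label] = max(confidence.get(label, 0.0), strength)
    best = max(confidence, key=confidence.get)
    return best, confidence

# Hypothetical membership degrees for one input pattern and three rules.
rules = [
    ("C1", rule_firing_strength([0.8, 0.5])),
    ("C2", rule_firing_strength([0.2, 0.9])),
    ("C1", rule_firing_strength([0.6, 0.3])),
]
label, confidence = classify(rules)
```

In this toy case the two rules for class C1 fire with different strengths, and the larger one determines the confidence for C1.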
The response of each rule is weighted according to the extent to which the rule fires. The responses of all the fuzzy rules for a particular output class are combined to obtain the confidence with which the input is classified to that class. Generally, the rules and the membership functions used by the fuzzy logic for solving the classification problem are formed from the experience of human experts. With an increasing number of variables, the possible number of rules increases exponentially, which makes it difficult for experts to define a complete rule set for good system performance. The system performance can also be improved by tuning the membership functions. In this paper, a modified form of GA is proposed to develop the fuzzy classifier.
3. Review of Genetic Algorithm
GA18 is a generalized search and optimization technique inspired by the theory of biological evolution. Figure 1 shows the steps in applying a GA to an optimization problem. A GA maintains a population of individuals that represent candidate solutions to the problem. Each individual in the population is evaluated using the objective function to give a measure of its fitness.
Fig. 1. Flow chart of genetic algorithm. Steps shown: generate initial population; calculate the fitness value; select parents for reproduction; apply crossover and mutation; evaluate fitness of individuals; if converged, stop, otherwise loop back.
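The loop in Fig. 1 can be sketched as follows. This is a minimal Python illustration, not the authors' MATLAB implementation. It assumes a fixed number of generations as the convergence test and uses a toy bit-counting objective with one-point crossover. All function names and default values are hypothetical.

```python
import random

def run_ga(fitness, random_individual, crossover, mutate,
           pop_size=30, generations=50, tournament_size=2, seed=0):
    """Generic GA loop: evaluate, select, recombine, mutate, repeat."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):  # assumed stopping rule: fixed generation count
        # Tournament selection fills the mating pool.
        pool = [max(rng.sample(population, tournament_size), key=fitness)
                for _ in range(pop_size)]
        # Pair the parents, recombine them and mutate the children.
        population = []
        for a, b in zip(pool[0::2], pool[1::2]):
            c1, c2 = crossover(a, b, rng)
            population += [mutate(c1, rng), mutate(c2, rng)]
        generation_best = max(population, key=fitness)
        if fitness(generation_best) > fitness(best):
            best = generation_best
    return best

def one_point(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def bit_flip(individual, rng, pm=0.05):
    return [1 - g if rng.random() < pm else g for g in individual]

# Toy problem: maximize the number of ones in a 20-bit string.
best = run_ga(fitness=sum,
              random_individual=lambda rng: [rng.randint(0, 1) for _ in range(20)],
              crossover=one_point, mutate=bit_flip)
```

The best individual found so far is tracked separately because the sketch uses no elitism, so the final population need not contain it.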
In each generation, a new population is formed by selecting the fitter individuals according to a particular selection strategy. Some members of the new population undergo genetic operations to form new solutions. The two commonly used genetic operations are crossover and mutation. Crossover is a mixing operator that combines genetic material from the selected parents. Mutation acts as a background operator and explores the unexplored search space by randomly changing the values at one or more positions of the selected chromosome. These steps are repeated until the convergence criterion is satisfied.
4. Proposed Genetic Algorithm
In a standard Simple Genetic Algorithm (SGA), binary strings are used to represent the solution variables. This may cause difficulties when coding continuous variables because of the Hamming cliff problem.13 Moreover, for discrete variables whose total number of permissible choices is not equal to $2^k$ (where $k$ is an integer), it becomes difficult to represent all permissible values with a fixed-length binary code. To overcome these difficulties, the following modifications have been made to the standard GA to improve its efficiency in designing the fuzzy classifier.
• A mixed form of chromosome representation is used for the rule set and the membership functions in the genetic population (i.e., the membership functions are represented using real numbers and the rule set is represented by a binary string).
• With the mixed form of representation, the selection scheme remains the same, but modifications are needed for the crossover and mutation operators.
The components of the modified GA are described in the following subsections.
4.1. Representation
When designing a fuzzy system using GA, the first important consideration is the representation strategy to be followed. A fuzzy system is completely specified only when the rule set and the membership function associated with each fuzzy set are defined.
The parameters of the membership functions are represented as real numbers. In this work, trapezoidal and triangular membership functions are used. Three membership points are needed to represent each membership function. To represent a variable with three fuzzy sets, a total of nine membership points (P1, P2, P3, P4, P5, P6, P7, P8, P9) are required, as shown in Fig. 2. Of these nine points, the first and last (P1 and P9) are fixed at the minimum and maximum of the input variable. The remaining seven membership points are evolved within dynamic ranges, such that P2 has [P1, P9], P3 has [P2, P9], P4 has [P2, P3], P5 has [P4, P9], P6 has [P5, P9], P7 has [P5, P6] and P8 has [P7, P9] as limits. As an extension of this method, if five fuzzy sets are used to represent each variable, then a total of fifteen membership points (P1, P2, ..., P15) are required, as shown in Fig. 3. This type of representation is simple and can be extended to any number of fuzzy sets. It also has advantages over binary coding of the membership functions: the efficiency of the GA is increased, as there is no need to convert the input variables to binary.
Fig. 2. Fuzzy space with three fuzzy sets per variable (membership grade plotted against the variable range, with membership points P1 to P9).
Fig. 3. Fuzzy space with five fuzzy sets per variable (membership grade plotted against the variable range, with membership points P1 to P15).
Binary strings are used to represent the rule set. The length of the binary string depends on the number of fuzzy sets defined for each input variable and on the number of output classes. To develop a classifier with a compact rule set, the maximum number of rules (NR) is specified, and a rule selection bit is used to select the optimum number of rules from within those NR rules. The value of NR is chosen arbitrarily and is changed during the development stage depending on the performance of the classifier. For a three-class fuzzy classifier with three fuzzy sets defined for each input variable, two bits each are needed to represent an antecedent and the output class. With the above representation, a typical chromosome will be as shown in Fig. 4.
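The nested limits on the nine membership points given earlier in this subsection can be sketched as below. This is a hedged Python illustration: the function name is hypothetical, the points are drawn uniformly within their limits (the text does not state the distribution), and the example range is only illustrative. The sketch does not group the points into the three membership functions.

```python
import random

def sample_membership_points(lo, hi, rng):
    """Draw P1..P9 for one variable, each inside its dynamic range."""
    p = [None] * 10               # 1-based indexing; p[0] is unused
    p[1], p[9] = lo, hi           # fixed at the minimum and maximum
    p[2] = rng.uniform(p[1], p[9])
    p[3] = rng.uniform(p[2], p[9])
    p[4] = rng.uniform(p[2], p[3])
    p[5] = rng.uniform(p[4], p[9])
    p[6] = rng.uniform(p[5], p[9])
    p[7] = rng.uniform(p[5], p[6])
    p[8] = rng.uniform(p[7], p[9])
    return p[1:]

# Illustrative variable range only.
points = sample_membership_points(1.0, 6.9, random.Random(0))
```

Because each point is drawn after the points that bound it, the ordering constraints hold by construction and no repair step is needed at initialization.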
Rule set (binary part), each rule laid out as RS | IP1 | IP2 | ... | IPn | OP (RS: rule selection bit; IP: input variable; OP: output class):
Rule 1:    0 | 01 | 10 | ... | 00 | 01
Rule 2:    1 | 10 | 11 | ... | 10 | 00
...
Rule NR:   0 | 01 | 00 | ... | 11 | 10
Membership functions (real part), each input laid out as MFlow | MFmedium | MFhigh:
Input 1:   5.6, 6.1, 7.0 | 2.6, 3.2, 3.8 | 2.4, 3.9, 5.4
...
Input n:   0.3, 0.6, 0.9 | 1.2, 1.5, 1.8 | 2.4, 2.7, 3.1
Fig. 4. Typical representation of chromosome.
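The binary part of the chromosome in Fig. 4 can be decoded as sketched below. This is a Python illustration with hypothetical names. It assumes one selection bit, two bits per antecedent and two bits for the class, and it uses the two-bit linguistic codes that Sec. 5 gives for the Iris data.

```python
# Two-bit antecedent codes as given for the Iris data in Sec. 5.
LINGUISTIC = {"00": None, "01": "low", "10": "medium", "11": "high"}  # None = don't care

def decode_rules(bits, n_inputs):
    """Split a rule bit string into (selected, antecedents, class_code) tuples."""
    width = 1 + 2 * n_inputs + 2          # RS bit + antecedents + output class
    assert len(bits) % width == 0
    rules = []
    for start in range(0, len(bits), width):
        chunk = bits[start:start + width]
        selected = chunk[0] == "1"
        antecedents = [LINGUISTIC[chunk[1 + 2 * i: 3 + 2 * i]]
                       for i in range(n_inputs)]
        rules.append((selected, antecedents, chunk[-2:]))
    return rules

# Two hypothetical rules for a four-input problem (11 bits each).
rule_a = "1" + "00" + "00" + "01" + "01" + "01"
rule_b = "0" + "01" + "10" + "00" + "11" + "10"
decoded = decode_rules(rule_a + rule_b, n_inputs=4)
```

Only rules whose selection bit is 1 would take part in classification, which is how the rule count can shrink below NR without changing the chromosome length.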
4.2. Fitness function
The next important consideration after the representation is the formation of the fitness function. In the classification problem under consideration, the objective is to maximize the number of correctly classified data (equivalently, to minimize the difference between the total number of data and the number of correctly classified data) and to minimize the number of rules. This is represented mathematically as

$$\min f = (S - C_c) + (k \times N_S). \tag{1}$$

Here $S$, $C_c$, $N_S$ and $k$ are the total number of samples, the number of correctly classified data, the number of selected rules and a constant, respectively. In this paper, the value of $k$ is taken as 5. During the GA run, the GA searches for a solution with maximum fitness function value. Hence, the minimization objective function given by Eq. (1) is transformed into a fitness function to be maximized:

$$\text{Fitness} = \frac{1}{f}. \tag{2}$$

Each individual in the population is evaluated by calculating the objective function value for the problem from its parameter set. The result of the objective function calculation is then used to calculate the fitness value of the individual.
4.3. Selection
The selection of individuals to produce successive generations plays an important role in GA. Tournament selection13 is used in this work. In tournament selection, T individuals are selected randomly from the population, and the best of the T is inserted into the new population for further genetic processing. Tournaments are often held between pairs of individuals, although larger tournaments can be used. This procedure is repeated until the mating pool is filled.
4.4. Crossover
As the proposed GA uses a mixed coding, two reproduction schemes are defined, one for the binary genes and another for the real genes. For the rule set, which is represented as a binary string, two-point crossover18 is first applied and then an advanced operator called the Gene Cross Swap Operator (GCSO)14 is applied. GCSO randomly selects two chromosomes (individuals) from the population. Next, one gene from each chromosome is selected randomly and their values are swapped. In Fig. 5(a), the highlighted strings are the selected genes for the GCSO operation. The children after the GCSO operation are shown in Fig. 5(b). While two-point crossover exchanges information between highly fit chromosomes, the GCSO searches for alternative alleles, exploiting information stored even in low-fitness strings.
Each rule is laid out as RS | IP1 | IP2 | ... | IPn | OP; the selected genes are shown in brackets.
(a) Parent 1:  Rule 1 = 0 | [01] | 10 | ... | 00 | 01;  Rule 2 = 1 | 10 | 11 | ... | 10 | 00;  ...;  Rule NR = 0 | 01 | 00 | ... | 11 | 10
    Parent 2:  Rule 1 = 0 | 01 | 10 | ... | 00 | 01;  Rule 2 = 1 | 10 | 11 | ... | 10 | 00;  ...;  Rule NR = 0 | 01 | 00 | ... | [11] | 10
(b) Child 1:   Rule 1 = 0 | [11] | 10 | ... | 00 | 01;  Rule 2 = 1 | 10 | 11 | ... | 10 | 00;  ...;  Rule NR = 0 | 01 | 00 | ... | 11 | 10
    Child 2:   Rule 1 = 0 | 01 | 10 | ... | 00 | 01;  Rule 2 = 1 | 10 | 11 | ... | 10 | 00;  ...;  Rule NR = 0 | 01 | 00 | ... | [01] | 10
Fig. 5. (a) Original strings, (b) children after GCSO operation.
For the membership functions, which are represented in real form, a simple arithmetic crossover is applied. Arithmetic crossover randomly selects two chromosomes, takes the parent values $p_i^1$ and $p_i^2$ of a particular variable, and produces two offspring as given below:

$$C_i^1 = \lambda \, p_i^1 + (1 - \lambda)\, p_i^2 \tag{3}$$

$$C_i^2 = \lambda \, p_i^2 + (1 - \lambda)\, p_i^1, \tag{4}$$
where $\lambda$ is a uniform random number in $[0, 1]$. Figure 6 illustrates the arithmetic crossover for the one-dimensional case. In this figure, $p_i^1$ and $p_i^2$ are the two parents and $a_i$, $b_i$ are the lower and upper limits of the $i$th variable.
Fig. 6. Schematic representation of arithmetic crossover (offspring positions between the parents $p_i^1$ and $p_i^2$ on the interval $[a_i, b_i]$ for $\lambda = 1/3$, $1/2$ and $2/3$).
4.5. Mutation
The mutation operator introduces new material into the population, and thereby allows faster convergence and prevents trapping in a local optimum. For the rule set, bit-wise mutation is applied, which switches a few randomly chosen bits from 1 to 0 or from 0 to 1 with a small mutation probability ($P_m$). For the membership functions, the “Uniform Mutation” operator is applied. In uniform mutation, a variable is first selected randomly from an individual and is then set to a uniform random number between the variable’s lower and upper limits.
5. Simulation Result
This section presents the details of the simulation carried out on the benchmark datasets to demonstrate the effectiveness of the proposed GA-based approach for fuzzy classifier design. The datasets considered are the “iris”, “wine”, “breast cancer”, “glass”, “credit” and “appendicitis” datasets available in the UCI machine learning repository35 and the tcpdump dataset available from MIT Lincoln Labs.50 The details of the datasets are given in Table 1. The proposed MGA was developed in MATLAB and executed on a PC with a 2.40 GHz Pentium IV processor and 256 MB of RAM. Simulations were carried out to examine the learning ability as well as the generalization ability of the classifier.
Table 1. Details of dataset used.
Name of the dataset   No. of samples   Attributes   Classes
Iris                  150              4            3
Wine                  178              13           3
Breast cancer         690              14           2
Glass                 214              10           6
Appendicitis          106              7            2
Credit                699              14           2
Tcpdump               294              5            2
The details of the simulation and the performance of the classifier for the Iris data are presented here. The Iris dataset consists of 150 four-dimensional vectors representing 50 plants each of three species: iris setosa, iris versicolor and iris virginica. The four input features are sepal length, sepal width, petal length and petal width. All four input features are continuous. Each input variable is represented by three fuzzy sets, namely “low”, “medium” and “high”. A triangular membership function is used to represent the “low” and “high” fuzzy sets. A trapezoidal membership function is used to represent the “medium” fuzzy set. Following the representation strategy given in Sec. 4.1, the membership functions and the fuzzy if-then rules are represented by a mixed string. A total of seven points are needed to represent the three fuzzy sets belonging to a variable. The range of each membership function point is computed dynamically.
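The two membership shapes named above can be sketched as follows. This is a Python illustration only. It assumes the standard three-breakpoint triangle and four-breakpoint trapezoid. The paper's own point-based parameterization of these shapes is not spelled out here, and all breakpoint values below are hypothetical rather than evolved values.

```python
def triangular(x, a, b, c):
    """Degree rises linearly from a to the peak b, then falls to c (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Degree rises from a to b, stays at 1 up to c, then falls to d (a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Hypothetical breakpoints for one variable.
low_degree = triangular(2.5, 1.0, 2.0, 3.0)
medium_degree = trapezoidal(2.5, 2.0, 3.0, 5.0, 6.0)
```

At x = 2.5 both sets are active to a partial degree, which is the overlap that lets one input fire rules with different linguistic labels.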
A maximum of seven rules is included in the genetic population, and each antecedent is represented by a two-bit substring, with 01 representing “low”, 10 representing “medium”, 11 representing “high” and 00 representing “don’t care”. The output class is also represented by a two-bit binary string, with 01 representing “iris virginica”, 10 representing “iris versicolor”, 11 representing “iris setosa” and 00 representing “don’t care”. The substring corresponding to each rule has three sections: the rule selection bit, the representation of the input variables and the representation of the output class. When coded as a binary string, each rule requires 11 bits (one selection bit, two bits for each of the four inputs and two bits for the output class), and hence a total of 77 (11 × 7) bits are needed to represent the complete rule set in the genetic population. The learning ability is examined by using all 150 samples as training patterns and evolving the fuzzy classifier. The GA was run for 30 independent trials with different random seeds and GA control parameters. The optimal results were obtained with the following settings:
Population Size: 30
Crossover Probability: 0.8
Mutation Probability: 0.05
Swap Probability: 0.02
Tournament Size: 2
The convergence behavior of the proposed MGA during learning is shown in Fig. 7. Figure 7 shows that the fitness value increases steeply in the first 25 generations and that the algorithm converges in 200 iterations.
Fig. 7. Convergence of GA showing learning ability.
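With the probabilities listed above, the mixed operators of Secs. 4.4 and 4.5 might be wired together as in the following Python sketch. Several details are assumptions rather than statements of the authors' implementation. The rule part is held as a list of gene strings, crossover cuts fall on gene boundaries, GCSO only swaps genes of equal bit width, a single λ is drawn per crossover, and uniform mutation is triggered once per child with the mutation probability. The nested ordering of the membership points is ignored, and all example values are hypothetical.

```python
import random

P_CROSS, P_MUT, P_SWAP = 0.8, 0.05, 0.02   # settings reported in the text

def two_point_crossover(a, b, rng):
    """Exchange the gene segment between two random cut points."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def gcso(a, b, rng):
    """Gene Cross Swap: swap one random gene of a with one random gene of b.
    Assumption: only genes of equal bit width are swapped."""
    a, b = list(a), list(b)
    i = rng.randrange(len(a))
    j = rng.choice([k for k, g in enumerate(b) if len(g) == len(a[i])])
    a[i], b[j] = b[j], a[i]
    return a, b

def bit_flip(genes, rng):
    """Flip each bit independently with probability P_MUT."""
    flip = {"0": "1", "1": "0"}
    return ["".join(flip[c] if rng.random() < P_MUT else c for c in g)
            for g in genes]

def arithmetic_crossover(p1, p2, rng):
    """Eqs. (3) and (4) applied to every real gene with one shared lambda."""
    lam = rng.random()
    c1 = [lam * x + (1 - lam) * y for x, y in zip(p1, p2)]
    c2 = [lam * y + (1 - lam) * x for x, y in zip(p1, p2)]
    return c1, c2

def uniform_mutation(values, bounds, rng):
    """Reset one randomly chosen real gene uniformly within its limits."""
    values = list(values)
    i = rng.randrange(len(values))
    values[i] = rng.uniform(*bounds[i])
    return values

def recombine(parent1, parent2, bounds, rng):
    """Produce two children from two (rule_genes, mf_values) parents."""
    (r1, m1), (r2, m2) = parent1, parent2
    m1, m2 = list(m1), list(m2)
    if rng.random() < P_CROSS:
        r1, r2 = two_point_crossover(r1, r2, rng)
        m1, m2 = arithmetic_crossover(m1, m2, rng)
    if rng.random() < P_SWAP:
        r1, r2 = gcso(r1, r2, rng)
    r1, r2 = bit_flip(r1, rng), bit_flip(r2, rng)
    if rng.random() < P_MUT:
        m1 = uniform_mutation(m1, bounds, rng)
    if rng.random() < P_MUT:
        m2 = uniform_mutation(m2, bounds, rng)
    return (r1, m1), (r2, m2)

# Hypothetical parents: one rule (RS, IP1, IP2, IP3, OP) and three real genes.
rng = random.Random(1)
bounds = [(4.3, 7.9)] * 3
child1, child2 = recombine((["0", "01", "10", "00", "01"], [5.6, 6.1, 7.0]),
                           (["1", "10", "11", "10", "00"], [4.9, 6.4, 7.3]),
                           bounds, rng)
```

Keeping the binary and real parts as separate sequences lets each operator act only on the gene type it was designed for, so the selection scheme can treat the pair as one individual.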
A fuzzy system with three rules and 98.6% classification accuracy was evolved. The three rules evolved by the GA are given below:
1. If petal length is medium then the flower is iris setosa.
2. If petal length is low and petal width is low then the flower is iris virginica.
3. If sepal width is low and petal width is low then the flower is iris versicolor.
For comparison, two different GA-based approaches using binary representation were developed. The first, the Simple Genetic Algorithm (SGA), uses the basic genetic operators alone. The second, the Improved Genetic Algorithm (IGA), uses advanced genetic operators such as the Gene Cross Swap Operator (GCSO), Gene Max-Min Operator (GMMO) and Gene Inverse Operator (GIO) in addition to the basic genetic operators. Table 2 compares the learning ability of the proposed MGA with that of the other two GA models, which use binary representation. From Table 2, it is inferred that the proposed MGA classifies 148 data correctly in just 200 generations, showing very good learning ability compared with the other GA models. Table 2 also shows that the proposed MGA converges faster (167 seconds) than the other two GA models (302 and 257 seconds). Since SGA and IGA use only binary strings, they need extra time to decode them into continuous values for the membership functions. Table 2 further indicates that the proposed mixed form of representation avoids the difficulties of coding continuous variables and allows all permissible values for the membership functions, which results in better classification accuracy (98.8%) than the other two GA models. Next, the generalization ability of the proposed fuzzy classifier is examined using the 5 × 2 cross-validation datasets used in the KEELv1.0 software.4 In 5 × 2 cross-validation, the dataset is divided into 10 subsets, each containing the same number of data. Five of the subsets are used for training and the remaining five are used for testing.
The 150 data of the iris flower are divided into 10 subsets, each containing 75 data, using 5 × 2 cross-validation as given in Table 3. The training dataset is used to find the optimal membership functions and the rule set. The convergence behavior of the proposed MGA during generalization is shown in Fig. 8. The figure shows that the fitness value increases steeply in the first 25 generations and that the algorithm converges in 120 iterations.
Table 2. Learning ability of GA-based approaches for Iris data.
Performance parameter       SGA     IGA     MGA
No. of generations          300     240     200
Population size             30      30      30
Correctly classified data   146.1   147.4   148.2
No. of rules (average)      2.6     2.8     2.9
Accuracy in percentage      97.4%   98.2%   98.8%
CPU time in seconds         302     257     167.48
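As a worked instance of Eqs. (1) and (2), the following Python sketch evaluates a classifier with three selected rules that classifies 148 of the 150 samples correctly, using the whole-number counts quoted in the text rather than the trial averages in Table 2. The function names are hypothetical.

```python
def objective(total_samples, correctly_classified, selected_rules, k=5):
    """Eq. (1): misclassified samples plus k times the number of selected rules."""
    return (total_samples - correctly_classified) + k * selected_rules

def fitness(total_samples, correctly_classified, selected_rules, k=5):
    """Eq. (2): reciprocal of the objective, to be maximized by the GA."""
    return 1.0 / objective(total_samples, correctly_classified, selected_rules, k)

f_compact = objective(150, 148, 3)   # 2 misclassified + 5 * 3 rules
f_full = objective(150, 150, 7)      # perfect accuracy but all seven rules kept
fit_compact = fitness(150, 148, 3)
fit_full = fitness(150, 150, 7)
```

With k = 5, each extra rule costs as much as five misclassified samples, so the three-rule classifier scores a higher fitness than a perfectly accurate seven-rule one.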
Fig. 8. Convergence of GA showing generalization ability.
Table 3. Iris data partition using 5 × 2 cross-validation.
Set   Data series selected                  Data size   Usage
S1    1, 2, 5, 6, 9, 10, . . . , 149        75          Training
S2    3, 4, 7, 8, 11, 12, . . . , 150       75          Testing
S3    1, 3, 5, 7, 9, 11, . . . , 149        75          Training
S4    2, 4, 6, 8, 10, 12, . . . , 150       75          Testing
S5    3, 4, 7, 8, 11, 12, . . . , 150       75          Training
S6    1, 2, 5, 6, 9, 10, . . . , 149        75          Testing
S7    2, 4, 6, 8, 10, 12, . . . , 150       75          Training
S8    1, 3, 5, 7, 9, 11, . . . , 149        75          Testing
S9    1, 2, 3, 7, 8, 9, . . . , 147         75          Training
S10   4, 5, 6, 10, 11, 12, . . . , 150      75          Testing
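The alternating index patterns in Table 3 can be generated by cutting the sample numbers into fixed-size blocks and assigning the blocks alternately to the two halves. The Python sketch below assumes that the elided entries continue the visible pattern. It is shown only for block sizes 1 and 3, which give the S3/S4 and S9/S10 pairs. The function name is hypothetical.

```python
def interleaved_split(n, block):
    """Assign consecutive blocks of sample numbers 1..n alternately to two halves."""
    first = [i for i in range(1, n + 1) if ((i - 1) // block) % 2 == 0]
    second = [i for i in range(1, n + 1) if ((i - 1) // block) % 2 == 1]
    return first, second

s3, s4 = interleaved_split(150, block=1)    # odd / even sample numbers
s9, s10 = interleaved_split(150, block=3)   # alternating blocks of three
```

Each pair supplies one training set and one testing set of 75 samples, and exchanging the roles of the two halves gives a second train/test pair from the same split.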
During the course of the GA run, the ranges of the membership function points are evolved and tuned simultaneously. After 25 generations, the membership function points begin to distribute uniformly over the range of the variables, with halfway overlap between them. The optimal membership function obtained for the variable petal length is shown in Fig. 9. The testing dataset is used to evaluate the performance of the designed fuzzy classifier. During testing, the proposed MGA correctly classified 74.8 data (average value). Table 4 shows the results achieved by the proposed MGA and the other GA models on the various data subsets of the Iris data. Table 5 compares the performance of the proposed MGA with that of the other two GA models, which use binary representation. From Tables 4 and 5, it is inferred that the proposed MGA evolved a fuzzy classifier that has good classification accuracy and uses only three rules. The proposed approach also took fewer generations than the other approaches reported in the literature.
Fig. 9. Optimal membership function for the variable petal length.
Table 4. Generalization ability of GA-based approaches for Iris data (CC: correctly classified data; NR: number of rules).
Training   Testing   SGA CC   SGA NR   IGA CC   IGA NR   MGA CC   MGA NR
S1         S2        73       2        74       3        74       3
S3         S4        74       3        74       3        75       3
S5         S6        72       3        74       3        75       3
S7         S8        73       2        73       2        75       3
S9         S10       73       3        73       3        75       3
Average value        73.2     2.6      73.6     2.8      74.8     3
Table 5. Performance of GA-based approaches for Iris data.
Performance parameter       SGA     IGA     MGA
Generations                 175     75      50
Population size             30      30      30
Correctly classified data   73.2    73.6    74.8
No. of rules                2.6     2.8     3
Accuracy in percentage      97.6%   98.1%   99.7%
CPU time in seconds         247.6   56.81   29.17
Next, the proposed approach is applied to develop fuzzy classifiers for the remaining six datasets, following the same procedure as for the Iris data. Table 6 shows the performance of MGA along with that of SGA and IGA for the six datasets. To compare the performance of the proposed approach with the existing fuzzy logic based approaches, simulations were carried out using Fuzzy-SLAVE, Fuzzy-GP and Fuzzy-Chi-RW5 available in the educational version of experiments design in the KEELv1.0 software. The experimental settings used in these three approaches are given in Table A in the appendix. Table 7 gives the performance comparison of MGA with the three fuzzy learning approaches available in the KEELv1.0 software during testing. From this table, it is found that the proposed MGA outperforms the binary coded GA as well as the other fuzzy learning approaches in all cases.
Table 6. Performance of GA-based approaches for other datasets (CA: classification accuracy; NR: number of rules).
Datasets               SGA CA (%)   SGA NR   IGA CA (%)   IGA NR   MGA CA (%)   MGA NR
Wine                   91.67        30.3     93.33        15.7     96.67        5.2
Breast cancer          96.05        12.9     97.37        9.4      98.98        4.5
Glass identification   69.01        28.7     80           18.2     84.5         9.7
Appendicitis           94           6.5      94           6.3      99.6         2.3
Credit approval        90.9         18.3     93.13        11.5     94.85        4.6
Tcpdump                91.84        2.4      96.94        2.3      98.98        2.1
Table 7. Performance comparison of MGA with fuzzy approaches.
Datasets               Fuzzy-SLAVE (%)   Fuzzy-GP (%)   Fuzzy-Chi-RW (%)   MGA (%)
Iris                   93                86.1           92.5               99.7
Wine                   90.3              85.6           93.2               96.67
Breast cancer          95.1              95.5           87.2               98.98
Glass identification   57.1              47.4           58.6               84.5
Appendicitis           78.3              81.4           85.7               99.6
Credit approval        82.9              79.5           83.1               94.85
Tcpdump                85.8              87.9           82.4               98.98
Statistical evaluation of experimental results is considered an essential part of the validation of new machine learning methods. To compare the performance of the proposed algorithm with the existing algorithms on multiple datasets, Receiver Operating Characteristic (ROC) analysis12,15 is performed. ROC graphs are a useful technique for organizing classifiers and visualizing their performance. A ROC curve is a two-dimensional depiction of classifier performance. To compare the performance of the proposed MGA-based fuzzy classifier, the ROC performance is reduced to a single scalar value by calculating the Area Under the ROC Curve (AUC).6 Since the AUC is a portion of the area of the unit square, its value always lies between 0 and 1.0. The AUC of a classifier is equivalent to the probability that the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance. Table 8 gives the value of AUC for the different GA-based approaches. Table 9 gives the value of AUC for the three fuzzy learning approaches and MGA for all seven cases. From Table 8, it is clear that the proposed MGA-based fuzzy classifier has a higher AUC than the other GA models for all seven datasets. Table 9 shows that, even though the proposed classifier is slightly inferior in classifying the credit and appendicitis datasets, its average rank is clearly superior to that of the other approaches. From these comparisons, it is found that MGA
Table 8. Statistical analysis of GA-based methods (AUC, with rank in parentheses).

Datasets               SGA           IGA           MGA
Iris                   0.583 (3)     0.589 (2)     0.595 (1)
Wine                   0.955 (2)     0.920 (3)     0.962 (1)
Breast cancer          0.719 (3)     0.755 (2)     0.793 (1)
Credit approval        0.625 (2.5)   0.625 (2.5)   0.675 (1)
Glass identification   0.814 (3)     0.838 (2)     0.866 (1)
Appendicitis           0.785 (2)     0.732 (2)     0.810 (1)
Tcpdump                0.983 (3)     0.989 (2)     0.995 (1)
Average rank           2.64          2.21          1.00

Table 9. Statistical comparison of MGA with other approaches (AUC, with rank in parentheses).

Datasets               Fuzzy-SLAVE   Fuzzy-GP      Fuzzy-Chi-RW   MGA
Iris                   0.481 (4)     0.580 (2)     0.569 (3)      0.595 (1)
Wine                   0.878 (4)     0.898 (3)     0.901 (2)      0.962 (1)
Breast cancer          0.780 (2)     0.690 (3)     0.520 (4)      0.793 (1)
Credit approval        0.630 (3)     0.612 (4)     0.778 (1)      0.675 (2)
Glass identification   0.769 (3)     0.783 (2)     0.699 (4)      0.866 (1)
Appendicitis           0.835 (1)     0.694 (4)     0.713 (3)      0.810 (2)
Tcpdump                0.913 (2.5)   0.889 (4)     0.913 (2.5)    0.995 (1)
Average rank           2.78          3.14          2.78           1.28

Table 10. Performance comparison of proposed MGA.

Dataset                Approaches               Classification accuracy (%)
Iris                   Ref. 27                  94.67
                       Ref. 19                  95.7
                       Ref. 29                  96.4
                       Heesoo et al. (2004)     96.7
                       Proposed method (MGA)    99.7
Wine                   Ref. 8                   95.76
                       Ref. 29                  96.2
                       Proposed method (MGA)    96.67
Glass identification   Ref. 23                  53.8
                       Ref. 25                  64.4
                       Proposed method (MGA)    84.5
Appendicitis           Ref. 48                  89.6
                       Ref. 20                  87.7
                       Ref. 25                  84.9
                       Proposed method (MGA)    99.6
Credit approval        Ref. 37                  85.8
                       Ref. 25                  86.7
                       Proposed method (MGA)    94.85
Breast cancer          Ref. 36                  97.51
                       Ref. 16                  98.2
                       Proposed method (MGA)    98.98
Tcpdump                Ref. 33                  93.26
                       Ref. 17                  97.14
                       Proposed method (MGA)    98.98
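The average ranks reported for Table 9 can be recomputed from its AUC values with a short sketch. On each dataset the approach with the highest AUC receives rank 1, tied approaches share the mean of the ranks they span, and the ranks are then averaged over the seven datasets. The helper name `rank_with_ties` is an assumption introduced for this illustration; the AUC values are copied from Table 9.

```python
def rank_with_ties(values):
    """Rank values so the largest gets rank 1; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value equals the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = ((start + 1) + (end + 1)) / 2.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


methods = ["Fuzzy-SLAVE", "Fuzzy-GP", "Fuzzy-Chi-RW", "MGA"]
# AUC per dataset, copied from Table 9 (column order as in `methods`).
auc_table = {
    "Iris": [0.481, 0.580, 0.569, 0.595],
    "Wine": [0.878, 0.898, 0.901, 0.962],
    "Breast cancer": [0.780, 0.690, 0.520, 0.793],
    "Credit approval": [0.630, 0.612, 0.778, 0.675],
    "Glass identification": [0.769, 0.783, 0.699, 0.866],
    "Appendicitis": [0.835, 0.694, 0.713, 0.810],
    "Tcpdump": [0.913, 0.889, 0.913, 0.995],
}

rank_sums = [0.0] * len(methods)
for aucs in auc_table.values():
    for j, rank in enumerate(rank_with_ties(aucs)):
        rank_sums[j] += rank
average_rank = {m: s / len(auc_table) for m, s in zip(methods, rank_sums)}

for m in methods:
    print(f"{m}: {average_rank[m]:.3f}")
```

The computed averages are 19.5/7, 22/7, 19.5/7 and 9/7 (about 2.786, 3.143, 2.786 and 1.286), which agree with the tabulated 2.78, 3.14, 2.78 and 1.28 when truncated to two decimals.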
provides better overall classification performance than the other algorithms in designing the fuzzy classifier across the seven datasets. Table 10 compares the results obtained by the proposed approach with previously published results. From this table, it is clear that the proposed mixed form of representation with modified genetic operators outperforms the existing classifiers reported in the literature.

6. Conclusion

The bottleneck of a fuzzy logic based system for any application is the development of the rule base and the formation of the membership functions. This paper has proposed a Mixed Genetic Algorithm approach for the optimal design of a fuzzy system for the classification task. In the proposed approach, a mixed form of representation is used to encode the rule base and the membership functions, which are evolved simultaneously. In addition to the basic genetic operators, an advanced problem-specific operator and modified forms of the genetic operators have been applied to improve the convergence and the quality of the solution. The effectiveness of the proposed approach for developing fuzzy classifiers has been demonstrated on seven standard datasets. In all cases, the proposed approach generated a compact fuzzy system with a high classification rate when compared with the binary coded genetic algorithm.

Appendix

Table A.
Experimental settings of fuzzy approaches.

Sl. No   Parameter description                          Fuzzy SLAVE   Fuzzy GP   Fuzzy Chi RW
1        Number of labels                               3             3          3
2        Population size                                30            30         -
3        Number of rules                                -                        -
           i) Iris                                                    3
           ii) Wine                                                   6
           iii) Breast cancer                                         5
           iv) Glass identification                                   10
           v) Appendicitis                                            3
           vi) Credit approval                                        5
           vii) Tcpdump                                               2
4        Number of iterations                           300           300        -
5        Mutation probability                           0.01          -          -
6        Use rule weights                               Yes           -          -
7        Numisland                                      -             2          -
8        Steady                                         -             1          -
9        T-norm for the computation of compatibility    -             -          Product
10       Rule weight                                    -             -          Penalized certainty factor
11       Fuzzy reasoning method                         -             -          Winning rule
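The Fuzzy Chi RW settings listed in Table A (product T-norm for compatibility, rule weights, and the winning rule as the reasoning method) can be illustrated with a minimal sketch. It is not the KEEL implementation. The triangular partition into three labels, the rule base, and the rule weights are all hypothetical; in particular, the weights are fixed constants here rather than penalized certainty factors computed from training data.

```python
def triangular(x, left, peak, right):
    """Triangular membership degree of x for the set (left, peak, right)."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Three labels on a normalized [0, 1] input (hypothetical partition).
LABELS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def compatibility(antecedent, x):
    """Product T-norm over the membership degrees of the antecedent labels."""
    degree = 1.0
    for value, label in zip(x, antecedent):
        degree *= triangular(value, *LABELS[label])
    return degree


def classify_winning_rule(rules, x):
    """Return the class of the rule with the largest weight * compatibility."""
    best_class, best_score = None, 0.0
    for antecedent, rule_class, weight in rules:
        score = weight * compatibility(antecedent, x)
        if score > best_score:
            best_class, best_score = rule_class, score
    return best_class  # None means that no rule fires for x


# Hypothetical two-input rule base: (antecedent labels, class, rule weight).
example_rules = [
    (("low", "low"), "A", 0.9),
    (("medium", "high"), "B", 0.7),
    (("high", "high"), "B", 0.8),
]
predicted = classify_winning_rule(example_rules, (0.2, 0.1))
print(predicted)
```

For the input (0.2, 0.1) only the first rule has a nonzero compatibility (0.6 × 0.8 = 0.48, weighted to 0.432), so class A wins. Under the winning rule only the single best rule determines the class, in contrast to voting schemes that sum the contributions of all fired rules.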
References
1. S. Abe and M. S. Lan, A method for fuzzy rule extraction directly from numerical data and its application to pattern classification, IEEE Trans. Fuzzy Syst. 3 (1995) 18–28.
2. S. Abe and R. Thawonmas, A fuzzy classifier with ellipsoidal regions, IEEE Trans. Fuzzy Syst. 5 (1997) 358–368.
3. R. Alcalá, J. Alcalá-Fdez, J. Casillas, O. Cordón and F. Herrera, Hybrid learning models to get the interpretability-accuracy trade-off in fuzzy modeling, Soft Comput. 10 (2006) 717–734.
4. R. Alcalá, J. Alcalá-Fdez, M. J. Gacto and F. Herrera, Rule base reduction and genetic tuning of fuzzy systems based on the linguistic 3-tuples representation, Soft Comput. 11 (2007) 401–419.
5. J. Alcalá-Fdez, L. Sanchez, S. Garcia, M. J. del Jesus, S. Ventura, J. M. Garell, J. Otero, C. Romero, J. Bacardit, V. M. Rivas, J. C. Fernandez and F. Herrera, KEEL: A software tool to access evolutionary algorithms to data mining problems, Soft Comput. In Press.
6. A. P. Bradley, The use of area under the ROC curve in the evaluation of machine learning algorithms, Pattern Recogn. 30 (1997) 1145–1159.
7. J. Casillas, O. Cordon, M. J. del Jesus and F. Herrera, Genetic tuning of fuzzy rule deep structures preserving interpretability and its interaction with fuzzy rule set reduction, IEEE Trans. Fuzzy Syst. 13 (2005) 13–28.
8. L. Castillo, A. Gonzalez and P. Perez, Including a simplicity criterion in the selection of the best rule in a genetic fuzzy learning algorithm, Fuzzy Sets Syst. 120 (2001) 309–321.
9. F. Cheong and R. Lai, Constraining the optimization of a fuzzy logic controller using an enhanced genetic algorithm, IEEE Trans. Syst. Man Cybern. B 30 (2000) 31–46.
10. A. L. Corcoran and S. Sen, Using real-valued genetic algorithms to evolve rule sets for classification, Proc. 1st IEEE Conf. Evolutionary Computation, Orlando (1994) pp. 120–124.
11. O. Cordon, F. Gomide, F. Herrera, F. Hoffmann and L. Magdalena, Ten years of genetic fuzzy systems: Current framework and new trends, Fuzzy Sets Syst. 141 (2004) 5–31.
12. J. Demsar, Statistical comparisons of classifiers over multiple datasets, J. Mach. Learn. Res. 7 (2006) 1–30.
13. D. Devaraj and B. Yegnanarayana, Genetic algorithm-based optimal power flow for security enhancement, IEE Proc. Generation Transmission and Distribution 152 (2005) 899–905.
14. S. Durairaj, D. Devaraj and P. S. Kannan, Voltage stability constrained reactive power planning using improved genetic algorithm, Int. J. Water and Energy 1 (2006) 56–64.
15. T. Fawcett, ROC Graphs: Notes and Practical Considerations for Researchers (Kluwer Academic Publishers, 2004), pp. 1–38.
16. P. GaneshKumar and D. Devaraj, Improved genetic-fuzzy system for breast cancer diagnosis, Int. J. Systemics, Cybernetics and Informatics (2008) 24–28.
17. P. GaneshKumar and D. Devaraj, Soft computing techniques for network intrusion detection, Proc. National Conf. Advanced Computing, MIT Chennai (2007) pp. 111–123.
18. D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning (Addison-Wesley, Reading, MA, 1989).
19. A. Gonzalez and R. Perez, SLAVE: A genetic learning system based on an iterative approach, IEEE Trans. Fuzzy Syst. 7 (1999) 176–191.
20. M. Grabisch and J. M. Nicolas, Classification by fuzzy integral: Performance & Test, Fuzzy Sets Syst. 65 (1994) 255–271.
21. H. Hwang, Identification of a Gaussian fuzzy classifier, Int. J. Cont. Automat. Syst. 2 (2004) 118–124.
22. F. Herrera, M. Lozano and A. M. Sánchez, A taxonomy for the crossover operator for real-coded genetic algorithms: An experimental study, Int. J. Intell. Syst. 18 (2003) 309–338.
23. R. C. Holte, Very simple classification rules that perform well on most commonly used dataset, Mach. Learn. 11 (1993) 63–91.
24. H. Ishibuchi and T. Nakashima, Improving the performance of fuzzy classifier systems for pattern classification problems with continuous attributes, IEEE Trans. Ind. Electron. 46 (1999) 1057–1068.
25. H. Ishibuchi, T. Nakashima and T. Murata, Performance evaluation of fuzzy classifier systems for multidimensional pattern classification problems, IEEE Trans. Syst. Man Cybern. B 29 (1999) 601–617.
26. H. Ishibuchi, K. Nozaki and H. Tanaka, Distributed representation of fuzzy rules and its application to pattern classification, Fuzzy Sets Syst. 52 (1992) 21–32.
27. H. Ishibuchi, K. Nozaki, N. Yamamoto and H. Tanaka, Selecting fuzzy if-then rules for classification problems using genetic algorithms, IEEE Trans. Fuzzy Syst. 3 (1995) 260–270.
28. H. Ishibuchi, K. Nozaki, N. Yamamoto and H. Tanaka, Adaptive fuzzy rule-based classification systems, IEEE Trans. Fuzzy Syst. 4 (1998) 238–250.
29. H. Ishibuchi and T. Yamamoto, Fuzzy rule selection by multi-objective genetic local search algorithms and rule evaluation measures in data mining, Fuzzy Sets Syst. 141 (2004) 59–88.
30. J. S. R. Jang, Fuzzy controller design without domain experts, Proc. 1st IEEE Int. Conf. Fuzzy Systems, San Diego (1992) pp. 289–296.
31. J. S. R. Jang, C. T. Sun and E. Mizutani, Neuro-Fuzzy and Soft Computing (Prentice Hall, Englewood Cliffs, New Jersey, 1997).
32. C. C. Lee, Fuzzy logic in control systems: Fuzzy logic controller-part I and part II, IEEE Trans. System Man Cybern. 20 (1990) 404–435.
33. K. C. Lee and L. Mikhailov, Intelligent intrusion detection system, Proc. IEEE Int. Conf. Intell. Syst. (2004) pp. 497–502.
34. D. Nauck and R. Kruse, A neuro-fuzzy method to learn fuzzy classification rules from data, Fuzzy Sets Syst. 89 (1997) 277–288.
35. D. J. Newman, S. Hettich, C. L. Blake and C. J. Merz, UCI repository of machine learning databases, University of California, Irvine, http://www.ics.uci.edu/~mlearn/MLRepository.html.
36. C. A. Pena-Reyes and M. Sipper, A fuzzy-genetic approach to breast cancer diagnosis, Artif. Intell. Med. 17 (1999) 131–155.
37. J. R. Quinlan, Simplifying decision trees, Int. J. Man-Mach. Stud. 27 (1987) 221–234.
38. I. Rojas, J. Gonzalez, H. Pomares, F. J. Rojas, F. J. Fernandez and A. Prieto, Multidimensional and multideme genetic algorithms for the construction of fuzzy systems, Int. J. Approximate Reasoning 26 (2001) 179–210.
39. H. Roubos and M. Setnes, Compact and transparent fuzzy models and classifiers through iterative complexity reduction, IEEE Trans. Fuzzy Syst. 9 (2001) 516–524.
40. M. Russo, Genetic fuzzy learning, IEEE Trans. Evol. Comput. 4 (2000) 259–273.
41. M. Setnes and H. Roubos, GA-Fuzzy modeling and classification: Complexity and performance, IEEE Trans. Fuzzy Syst. 8 (2000) 509–522.
42. M. Sugeno, An introductory survey of fuzzy control, Information Sciences 36 (1985) 59–83.
43. M. Sugeno and T. Yasukawa, A fuzzy-logic-based approach to qualitative modeling, IEEE Trans. Fuzzy Syst. 1 (1993) 7–31.
44. T. Takagi and M. Sugeno, Fuzzy identification of systems and its applications to modeling and control, IEEE Trans. Syst. Man Cybern. 15 (1985) 116–132.
45. T. Ross, Fuzzy Logic with Engineering Applications (Tata McGraw Hill, 1995).
46. C. H. Wang, T. Hong and S. Tseng, Integrating fuzzy knowledge by genetic algorithms, IEEE Trans. Evol. Comput. 2 (1998) 138–148.
47. L. X. Wang and J. M. Mendel, Generating fuzzy rules by learning from examples, IEEE Trans. Syst. Man Cybern. 22 (1992) 1414–1427.
48. S. M. Weiss and C. A. Kulikowski, Computer Systems That Learn (Morgan Kaufmann, San Mateo, CA, 1991).
49. S. Yuhui, E. Russell and C. Yaobin, Implementation of evolutionary fuzzy system, IEEE Trans. Fuzzy Syst. 7 (1999) 109–119.
50. M. Zissman, DARPA intrusion detection scenario specific datasets, Lincoln Laboratory, Massachusetts Institute of Technology, Lexington, http://www.ll.mit.edu/IST/ideval/data/data_index.html.