IADIS European Conference Data Mining 2008
FUZZY ASSOCIATION RULE REDUCTION USING CLUSTERING IN SOM NEURAL NETWORK
Marjan Kaedi, Mohammadali Nematbakhsh, Nasser Ghasem-aghaee
Department of Engineering, University of Isfahan, Isfahan, Iran
ABSTRACT
The major drawback of fuzzy data mining is that, after applying fuzzy data mining to quantitative data, the number of extracted fuzzy association rules is very large. When many association rules are obtained, their usefulness is reduced. In this paper, we introduce an approach to reduce and summarize the extracted fuzzy association rules after fuzzy data mining. In our approach, we first encode each obtained fuzzy association rule as a string of numbers. Then we use a self-organizing map (SOM) neural network iteratively, in a tree structure, to cluster these encoded rules and summarize them into a smaller collection of fuzzy association rules. This approach has been applied to a database containing information about 5000 employees and has shown good results.
KEYWORDS
Fuzzy Data Mining, Rule Reduction, SOM.
1. INTRODUCTION
Data mining is a process of discovering various models, summaries, and derived values from a given collection of data (Glenn 2006). Data mining techniques have been developed to turn data into useful, task-oriented knowledge. Associations reflect relationships among items in databases and have been widely studied in the fields of knowledge discovery and data mining. Most algorithms for mining association rules identify relationships among transactions using binary values. Transactions with quantitative values and items are, however, commonly seen in real-world applications. Recent years have witnessed many efforts on discovering fuzzy associations, aimed at coping with fuzziness in knowledge representation and decision support processes (Delgado et al. 2005). Fuzzy association rules described in natural language are well suited to human thinking. According to (Delgado et al. 2005), (Lee and Kwang 1997) is the first paper introducing fuzzy sets into association rules to diminish the granularity of quantitative attributes. That model uses a membership threshold to change fuzzy transactions into crisp ones before looking for ordinary association rules in the set of crisp transactions. Items are still pairs, i.e. (attribute, label). In (Hong et al. 1999), only one item per attribute is considered: the pair (attribute, label) with the greatest support among those items based on the same attribute. The model is the usual generalization of support and confidence based on sigma-counts. The proposed mining algorithm first transforms each quantitative value into fuzzy sets in linguistic terms and then calculates the scalar cardinalities of all linguistic terms in the transaction data. (Au and Chan 1999) presented a novel algorithm, called FARM, which employs linguistic terms to represent the revealed regularities and exceptions.
FARM employs adjusted difference analysis to identify interesting associations among attributes without using any user-supplied thresholds. In (Hong et al. 2003), a fuzzy multiple-level data-mining algorithm was proposed that can process transaction data with quantitative values and discover interesting patterns among them. If the mining procedure produces a huge number of rules, a human user cannot analyze them all. However, if such a huge number of rules does exist in the data, it is not appropriate to arbitrarily discard any of them or to generate only a small subset of them; it is much more desirable to summarize them. In (Farzanyar et al. 2006), the size of the average transactions and of the original dataset is reduced by the recognition and fusion of similarly behaving attributes, and mining is performed on the reduced dataset, which produces a much smaller but richer set of fuzzy association rules; this has been confirmed by
ISBN: 978-972-8924-63-8 © 2008 IADIS
experimental results. But the disadvantage of this method is that, through merging some fields of the table, some significant and key fields may be eliminated. It is better to preserve all data and table fields, perform fuzzy data mining on them, and then summarize the extracted fuzzy rules; in this case the results will be more exact and no key data will be eliminated. In this article, rule clustering is introduced for reducing the number of fuzzy association rules extracted by fuzzy data mining algorithms. For this purpose, a SOM neural network is used. To use the SOM neural network, the obtained rules must first be encoded into a form suitable as input for the SOM network. Thus we encode the rules, turning them into vectors in Euclidean space. The SOM network then receives the encoded rules as input vectors and clusters the similar rules in a tree structure. After the clustering process, the template vector of each final cluster is decoded and turned back into the form of a linguistic rule. Each decoded rule then replaces all the fuzzy rules in its cluster. Section 2 of this article gives an introduction to clustering and the SOM. The encoding process is discussed in section 3, and section 4 deals with using the SOM network for the reduction of fuzzy rules. Section 5 deals with decoding the reduced rules. Section 6 shows some experimental results, and section 7 concludes the article.
2. CLUSTERING AND SOM NEURAL NETWORK
Clustering is the process of organizing objects into groups whose members are similar in some way. A cluster is therefore a collection of objects which are similar to each other and dissimilar to the objects belonging to other clusters. A SOM is a type of artificial neural network that is trained using unsupervised learning to produce a low-dimensional representation of the input space of the training samples, called a map. The SOM is based on competitive learning. The net consists of a set A of n neurons, each represented by a weight vector wi. Furthermore, the neurons are mutually interconnected, and these connections form a topological grid. When a pattern x is presented to the network, exactly one neuron can be the winner, and its weights are adapted proportionally to the pattern (the neuron is moved closer to it). The neighborhood N(c) is formally defined as the set of neurons that are topologically near the winner c. The winner of the competition is determined as the neuron with the minimum distance to the pattern. Then the weights are adapted: the weight vectors of the winner and of the neurons in its neighborhood are moved toward the input pattern by a fraction of the difference between the current weights and the pattern. The weight vector representing a group of patterns is called the template vector of that group. The SOM thus adapts the weights of the neurons to cover the most dense regions of the input space, and therefore naturally finds data clusters (Freeman 1991; Wasserman 1989).
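As a concrete illustration of the competitive learning described above, the following Python sketch trains a small one-dimensional SOM: for each pattern the winner is the neuron with minimum distance, and the winner together with its topological neighborhood is moved toward the pattern with a decaying learning rate and a shrinking neighborhood radius. This is a generic SOM sketch, not the paper's specific network; the function name and parameters are our own.

```python
import numpy as np

def train_som(patterns, n_neurons=4, epochs=50, lr=0.5):
    """One-dimensional SOM trained by competitive learning."""
    rng = np.random.default_rng(0)
    weights = rng.random((n_neurons, patterns.shape[1]))  # one weight vector w_i per neuron
    for epoch in range(epochs):
        # neighborhood radius N(c) and learning rate both shrink over time
        radius = int((n_neurons / 2) * (1 - epoch / epochs))
        alpha = lr * (1 - epoch / epochs)
        for x in patterns[rng.permutation(len(patterns))]:
            # winner c: the neuron with minimum distance to the pattern
            c = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
            # adapt the winner and its topological neighbors toward the pattern
            for i in range(max(0, c - radius), min(n_neurons, c + radius + 1)):
                weights[i] += alpha * (x - weights[i])
    return weights
```

After training, the weight vectors sit in the dense regions of the input space, so each neuron's weight vector acts as the template of a cluster of nearby patterns.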
3. FUZZY ASSOCIATION RULE ENCODING
For clustering the fuzzy rules, the linguistic representation of the rules is not suitable; each rule must be encoded into a suitable form. For this purpose, we convert each rule into a list of numerical attributes. According to (Kacprzyk and Zadrozny 2005), the general form of a fuzzy association rule may be written as: Qy's are S, in which Q is a quantity in agreement, i.e. a linguistic quantifier (e.g. most); y is a set of objects in a database, e.g. the set of workers; and S is a summarizer, i.e. an attribute together with a linguistic value (fuzzy predicate) defined on the domain of that attribute (e.g. "low salary" for the attribute "salary"). For example: "most of the employees earn a low salary". A more general form of a fuzzy association rule is: QRy's are S, in which R is a qualifier, i.e. another attribute together with a linguistic value (fuzzy predicate) defined on the domain of that attribute (e.g. "young" for the attribute "age"). For example: "most of the young employees earn a low salary". Considering this most general form, each of the linguistic terms in QRy's are S, i.e. Q, R, and S, has several fuzzy levels; each level can therefore be represented by a number reflecting the intensity of the fuzzy term. For example, the linguistic quantifier Q may have the levels none, very few, few, half of, most of, almost all, and all. For encoding the fuzzy rules, each of these levels is designated by a number according to its intensity. These numbers are shown in table 1. As shown there, the numbers assigned to the terms are primes; this facilitates decoding the rules, because of the arithmetic that is performed on them (described in section 4.1). This
designation will also be made for the other linguistic terms, i.e. S and R. Suppose that we want to perform the fuzzy data mining process on employee data. The linguistic term R describes the age of each employee as young, middle age, or old. S describes the salary of each employee in 5 levels: very low, low, medium, high, and very high. The numbers assigned to these linguistic terms are presented in table 1. After fuzzy data mining is performed and all linguistic terms in the fuzzy association rules are encoded, the encoded fuzzy association rules, instead of the linguistic ones, are used for clustering. Some samples of these rules are shown in table 2. Each encoded rule can be illustrated as a vector in 3-dimensional Euclidean space (figure 1).

Table 1. Assigned numbers to fuzzy terms

  Q                 R                 S
  none        2     young       2     very low    2
  very few    3     middle age  3     low         3
  few         5     old         5     medium      5
  half of     7                       high        7
  most of    11                       very high  11
  almost all 13
  all        17

Figure 1. Representation of an encoded fuzzy association rule in Euclidean space

Table 2. Some encoded fuzzy association rules

  Linguistic rule                                Encoded rule
  most of young employees earn low salary        11 - 2 - 3
  few of old employees earn very low salary       5 - 5 - 2
  most of old employees earn high salary         11 - 5 - 7
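The encoding of Tables 1 and 2 can be sketched directly: each linguistic term is mapped to its prime code, and a rule becomes a three-component vector (Q, R, S). The dictionary and function names below are illustrative, not from the paper.

```python
# Prime codes from Table 1, one dictionary per linguistic variable.
Q_CODES = {"none": 2, "very few": 3, "few": 5, "half of": 7,
           "most of": 11, "almost all": 13, "all": 17}
R_CODES = {"young": 2, "middle age": 3, "old": 5}
S_CODES = {"very low": 2, "low": 3, "medium": 5, "high": 7, "very high": 11}

def encode_rule(q, r, s):
    """Encode a rule 'Q R employees earn S salary' as a (Q, R, S) vector of primes."""
    return (Q_CODES[q], R_CODES[r], S_CODES[s])
```

For example, `encode_rule("most of", "young", "low")` yields the vector (11, 2, 3) from the first row of Table 2.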
4. CLUSTERING OF FUZZY ASSOCIATION RULES BY SOM
As mentioned, the SOM network receives data as input vectors and clusters the similar vectors, thereby merging similar data and reducing the data volume. Since this article deals with the problem of numerous association rules, the SOM network is a helpful tool for this purpose. Figure 2 illustrates the proposed structure for rule reduction. In the following section, we describe the structure of our designed SOM network.
Figure 2. Process diagram
4.1 Topology of SOM Neural Network
After encoding the fuzzy rules, the next step is clustering the vectors obtained from the encoding. The clustering process groups the vectors such that similar rules are placed in a single cluster. The structure of a SOM network depends on the number of final fuzzy rules: every neuron in this structure will represent one compressed fuzzy rule after clustering, so the number of output rules determines the number of neurons in the network. Therefore, when the SOM network is applied in the classic manner, the number of final compressed rules (i.e. the number of clusters) must be known in advance in order to design the network structure. But no information about the suitable number of clusters is available before the clustering process, so applying the SOM network in the classic manner is not appropriate here. Instead, for clustering the encoded fuzzy rules, the SOM network is used in a tree structure. First, all encoded rules are divided into two clusters by a SOM network with two output neurons (figure 3), so that each cluster contains similar rules. Then the rules in each cluster are again passed through a SOM network, so that the rules within each cluster are again divided into two clusters of similar fuzzy rules. This process continues until, after passing the rules of a cluster through the SOM network, all of them are assigned to one cluster while the other cluster remains empty. At that point clustering stops, and the template vector of each final cluster replaces the rules of that cluster. This process is shown in figure 4. The update process in the designed network is performed by multiplying all three parts (Q, R, S) of the input vector into the representative vector, component-wise. As input vectors are applied to the network, the template vectors form gradually, and each becomes the template of a cluster of similar fuzzy rules.
Figure 3. Topology of the SOM neural network used
Figure 4. Using the SOM neural network in a tree structure
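The tree-structured clustering above can be sketched as a recursive two-way split. One deliberate simplification: the paper updates each template multiplicatively (the prime product used for decoding), whereas this sketch keeps templates as cluster means for the distance computation; the stopping rule, stop when one side of a split comes out empty, follows the paper. All names are our own.

```python
import numpy as np

def two_way_split(rules):
    """One pass of a two-neuron SOM: assign each encoded rule to the nearer template."""
    rules = np.asarray(rules, dtype=float)
    # seed the two templates with the most distant pair of rules
    d = np.linalg.norm(rules[:, None, :] - rules[None, :, :], axis=2)
    i, j = np.unravel_index(int(np.argmax(d)), d.shape)
    templates = rules[[i, j]].copy()
    winners = np.zeros(len(rules), dtype=int)
    for _ in range(10):  # a few adaptation passes
        winners = np.argmin(
            np.linalg.norm(rules[:, None, :] - templates[None, :, :], axis=2), axis=1)
        for c in (0, 1):
            if np.any(winners == c):
                templates[c] = rules[winners == c].mean(axis=0)
    return [rules[winners == c] for c in (0, 1)]

def cluster_tree(rules, leaves=None):
    """Split recursively until one side of a split is empty (the paper's stopping rule)."""
    if leaves is None:
        leaves = []
    a, b = two_way_split(rules)
    if len(a) == 0 or len(b) == 0:
        leaves.append(np.asarray(rules, dtype=float))
        return leaves
    cluster_tree(a, leaves)
    cluster_tree(b, leaves)
    return leaves
```

Running `cluster_tree` on the encoded rules of Table 2 (with a duplicate added) groups the identical rules into one leaf and leaves the dissimilar rules in leaves of their own.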
5. DECODING OF SOM TEMPLATE VECTORS
After the clustering, the corresponding linguistic form of each template vector must be obtained. As said before, when a vector (an encoded rule) is input and placed in one of the clusters of the SOM network, the representative vector of that cluster is updated by multiplying the input vector into it, component-wise. After the clustering process, each component of a representative vector is therefore a product of prime numbers (because the fuzzy rules are encoded by primes). Thus, by decomposing the representative vectors into their constituent primes and deleting the repeated primes, the representative vector of each cluster can be decoded and converted into a linguistic rule according to table 1.
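Since each template component is a product of the primes of the member rules, decoding reduces to integer factorization with repeated primes discarded, as described above. A minimal sketch, with a hypothetical reverse-lookup table for S built from the Table 1 codes:

```python
def distinct_prime_factors(n):
    """Distinct prime factors of n (template components are products of small primes)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return sorted(factors)

# Hypothetical reverse lookup for the summarizer S, built from Table 1.
S_TERMS = {2: "very low", 3: "low", 5: "medium", 7: "high", 11: "very high"}

def decode_component(product, terms):
    """Decode one component of a template vector into its linguistic term(s)."""
    return [terms[p] for p in distinct_prime_factors(product)]
```

For example, a cluster whose member rules contribute the S codes 3 and 3 has template component 9, which factors to the single prime 3 and decodes to "low".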
6. EXPERIMENTAL RESULTS
To test our approach, we chose a database containing information about 5000 employees. This information includes age, salary, education level, number of children, level of job satisfaction, etc. For simplicity, we selected just age and salary, to find the relation between their quantities via the fuzzy data mining process. The first step is the definition of fuzzy sets for these two quantitative attributes. We defined five fuzzy sets for each attribute. Then we built a new table with 10 columns. This table was filled with the membership degrees
of age and salary in the related fuzzy sets. We chose 4000 records as the training set. The mining algorithm was a fuzzy extension of the "apriori" data mining algorithm. After running the fuzzy mining algorithm, we obtained 16 fuzzy association rules describing the relation between the age and salary of employees. Some samples of these rules are: "Very few of young employees earn very high salary", "Most of young employees earn medium salary", "Few of old employees earn low salary", "Most of employees are young", and so on. These rules were encoded and applied to the SOM neural network to merge the redundant rules and reduce the number of rules. The SOM network reduced the 16 fuzzy association rules to 3 rules. To evaluate the reduction process, we calculated the average supports of the obtained rules on the remaining 1000 records (the testing set). We also tested the primitive rules on these records. Table 4 shows the results. As presented in table 4, the average support of the association rules decreased by only about 7% after an 81% reduction in the number of rules. This is a good result: the reduction of rules by this approach does not come at a great cost in accuracy.

Table 4. The average supports of rules

                     Testing data    Training data
  Primitive rules        0.21            0.223
  Reduced rules          0.197           0.208
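The paper does not give its support formula; a common choice for fuzzy rules is the sigma-count, i.e. the mean over records of min(mu_R, mu_S). Assuming that definition, the evaluation on the test records could be sketched as follows; the membership columns and rule list here are hypothetical.

```python
import numpy as np

def rule_support(mu_r, mu_s):
    """Fuzzy support of 'R employees earn S salary' via sigma-counts:
    the mean over records of min(mu_R, mu_S)."""
    return float(np.minimum(mu_r, mu_s).mean())

def average_support(rules, memberships):
    """Mean support of a rule set; each rule names its R and S fuzzy sets."""
    return float(np.mean([rule_support(memberships[r], memberships[s])
                          for r, s in rules]))

# Hypothetical membership columns for three test records.
memberships = {"young": np.array([1.0, 0.5, 0.0]),
               "old":   np.array([0.0, 0.5, 1.0]),
               "low":   np.array([0.8, 0.5, 0.1])}
avg = average_support([("young", "low"), ("old", "low")], memberships)
```

Comparing this average for the primitive rule set and the reduced rule set, on training and testing records separately, yields the four numbers of Table 4.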
7. CONCLUSION
The great number of rules extracted by data mining algorithms, especially fuzzy data mining, decreases the effectiveness of the rules and makes it difficult for the user to interpret and act on them. This article proposes a method for the reduction of fuzzy association rules extracted by fuzzy data mining. In this method, after suitably encoding the fuzzy association rules, the rules are clustered using a SOM neural network in a tree structure, and similar rules are joined and expressed by one rule. This results in a considerable decrease in the number of fuzzy association rules, but does not reduce the accuracy of the rules significantly.
REFERENCES
Au, W.H. and Chan, K.C.C., 1999. FARM: A Data Mining System for Discovering Fuzzy Association Rules. In Proceedings of the IEEE International Fuzzy Systems Conference, Seoul, Korea, Vol. 3, pp. 1217-1222.
Delgado, M. et al., 2005. Mining Fuzzy Association Rules: An Overview. Springer, Berlin/Heidelberg, pp. 351-373.
Farzanyar, Z. et al., 2006. Effect of Similar Behaving Attributes in Mining of Fuzzy Association Rules in the Large Databases. In Proceedings of ICCSA, Springer-Verlag, Berlin/Heidelberg, pp. 1100-1109.
Freeman, A., 1991. Neural Networks: Algorithms, Applications and Programming Techniques. Addison Wesley, Reading, MA.
Glenn, J., 2006. Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining. John Wiley.
Hong, T.P. et al., 1999. Mining association rules from quantitative data. In Intelligent Data Analysis, Elsevier, Vol. 3, pp. 363-376.
Hong, T.P. et al., 2003. Mining Fuzzy Multiple-Level Association Rules from Quantitative Data. In Applied Intelligence, Vol. 18, No. 1, pp. 79-90.
Kacprzyk, J. and Zadrozny, S., 2005. Fuzzy Linguistic Data Summaries as a Human Consistent, User Adaptable Solution to Data Mining. Springer-Verlag, Berlin/Heidelberg, pp. 321-340.
Lee, J.H. and Kwang, H.L., 1997. An extension of association rules using fuzzy sets. In Proceedings of IFSA'97, Prague, pp. 399-402.
Wasserman, P.D., 1989. Neural Computing: Theory and Practice. Van Nostrand Reinhold, New York.