A decision-theoretic rough set approach for dynamic data mining


Hongmei Chen, Tianrui Li*, Senior Member, IEEE, Chuan Luo, Shi-Jinn Horng, Member, IEEE, and Guoyin Wang, Senior Member, IEEE



Abstract—Uncertainty and fuzziness generally exist in real-life data. In rough set theory, approximations are employed to describe uncertain information approximately. Certain and uncertain rules are induced directly from the different regions partitioned by the approximations, and approximations can further be applied to data-mining-related tasks, e.g., attribute reduction. Nowadays, different types of data collected from different applications evolve with time; in particular, new attributes may appear while new objects are added. This paper presents an approach for the dynamic maintenance of approximations w.r.t. objects and attributes added simultaneously under the framework of Decision-Theoretic Rough Sets (DTRS). The equivalence feature vector and matrix are defined first to update the approximations of DTRS at different levels of granularity. Then, the information system is decomposed into subspaces, and the equivalence feature matrix is updated in the different subspaces incrementally. Finally, the approximations of DTRS are renewed during the process of updating the equivalence feature matrix. Extensive experimental results verify the effectiveness of the proposed methods.

Index Terms—Granular computing, Decision-theoretic rough set, Information system, Incremental learning.

* Corresponding author.
Hongmei Chen, Tianrui Li, and Chuan Luo are with the School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China. E-mail: {hmchen, trli}@swjtu.edu.cn, [email protected]
Shi-Jinn Horng is with the School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China, and the Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan. E-mail: [email protected]
Guoyin Wang is with the School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China, and the Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China. E-mail: [email protected]
Manuscript received February 3, 2014; revised May 17, 2014.

1 INTRODUCTION

Granular computing (GrC) provides multi-view and multi-level solutions to different problems, finding optimized methods by the decomposition and combination of granules [59][69]. A granule is a clump of objects drawn together by indistinguishability, similarity, or functionality [68]. Granulation, the relationships between granules, and the variation of granules are important topics in GrC, from which different levels of concepts or rules are then unraveled. The frameworks, models,

methodologies, and techniques of GrC have been widely studied in fuzzy logic theory [42][68], rough set theory [17][19][34][43][53][61][72], formal concept analysis [56], and the label semantics framework [24]. GrC has been successfully applied in data mining, pattern recognition, and image processing [4][33][38][49][57][63]. For example, Pedrycz proposed a knowledge-based fuzzy clustering to form granules and further investigated a granular-oriented neural network and a granular time series approach [12][36][37][39]. In rough set theory, granules are induced by the equivalence relation, and any subset of the universe is described by equivalence classes [35][41][48].

DTRS, proposed by Yao in the early 1990s, is based on the well-established Bayesian decision procedure [65]. The required threshold values in DTRS can be calculated and easily interpreted based on a semantically meaningful loss or risk function. Furthermore, DTRS is a general probabilistic rough set model, since different probabilistic rough set models may be derived from it [62]. It has been successfully applied in three-way decisions [64], multi-agent three-way decisions [58], risk decision making [27], a four-level approach of probabilistic rule-choosing criteria [29], autonomous knowledge-oriented clustering [67], attribute reduction decisions [66], and a rough cluster quality index [28].

In this paper, we investigate the expression of granules in rough set theory (i.e., formally expressing the characteristics of a granule) and study the dynamic variation of granules and their relationships in a dynamic environment under the framework of DTRS. We propose methods to update knowledge incrementally at the granule level, which may improve the efficiency of knowledge discovery. Nowadays, different types of data collected from different applications increase rapidly, and these data may evolve with time, i.e., the objects, attributes, and attribute values may change dynamically [6][20]. How to update information effectively is crucial to the efficiency of a knowledge discovery algorithm. Incremental updating methods are used frequently in interactive applications, stream data processing, massive data processing, and cases where the storage space or


computing capacity is limited. Batarseh et al. proposed an incremental validation method for knowledge-based systems based on a life cycle model of system development [1]. Cao et al. proposed a summation-based incremental learning algorithm for Info-Kmeans clustering [5]. In rough set theory, many works have considered the alteration of objects in the literature [54]. Shan et al. proposed an incremental methodology for finding all maximally generalized rules and for adaptively modifying them when new data become available [46]. Ziarko proposed an incremental learning process in which the structure of rough decision tables is updated [73]. Fan et al. analyzed the different cases of Strength Index (SI) change when objects are added to the information system and then proposed an incremental method for updating rules [13]. Blaszczynski proposed an algorithm to induce decision rules incrementally when objects are added under the Dominance-based Rough Set Approach (DRSA) [2]. Liu et al. defined an accuracy matrix and a coverage matrix under Variable Precision Rough Sets (VPRS) and proposed an incremental approach to update both matrices to obtain interesting knowledge w.r.t. the immigration or emigration of objects [30]. Chen et al. analyzed the variation of the relative degree of misclassification and of granules in VPRS and designed algorithms to update approximations when an object is deleted from or inserted into the universe [9]. Zhang et al. investigated an approach for updating approximations under Neighborhood Rough Sets (NRS) [70]. Luo et al. proposed incremental approaches for updating approximations while the object set varies over time in set-valued information systems [31]. Li et al. studied methods for maintaining approximations of the upward and downward unions under DRSA [25].

In the case of the variation of attributes, Ciucci discussed the properties of approximations when attributes alter in the information system [11]. Li et al. investigated a method for updating decision rules when multiple attributes are deleted or added simultaneously under the characteristic-relation-based rough set [26]. Cheng et al. proposed methods to update approximations incrementally in rough fuzzy sets w.r.t. the addition of attributes [10]. Zhang et al. investigated matrix-based approaches for updating approximations in set-valued information systems [71]. In the case of the variation of attribute values, Chen et al. studied methods for updating approximations while attribute values are coarsened and refined in complete information systems and incomplete ordered information systems, respectively [7][8].

However, in real-life applications, new attributes may appear when new objects are inserted into the information system, i.e., objects and attributes vary simultaneously. In this case, methods for updating approximations that consider only the variation of objects or only the variation of attributes are ineffective; the granular structure of the information system must be taken into consideration. There has been no report on updating approximations in DTRS for dynamic data mining w.r.t.


objects and attributes evolving simultaneously. Approximations are basic operators in rough set theory: certain and uncertain rules are extracted directly from the different regions partitioned by approximations, and approximations can further be applied to attribute reduction and other data-mining-related work. The problem of inducing decision rules using the principles of rough set theory has been studied extensively, and several software systems have been developed, e.g., RSES [3][47], LERS [14][15], KDDR [74], ROSETTA [22][23], RoughDAS and RoughClass [51], RoughFamily [32], and Datalogic/R [44], which have been successfully applied in many domains. For example, RSES has been used in classification problems, e.g., robot obstacle classification [40], network traffic data [50], and gastroenterology disease [45]. LERS has been used in studying global temperature stability [16], medical data [18], and the biomedical domain [52].

This study investigates an approach for updating approximations w.r.t. the simultaneous variation of objects and attributes in the framework of DTRS. The contribution of this paper is outlined as follows: (1) it defines an equivalence feature matrix and updates approximations at the granule level; (2) it partitions the universe after the variation of objects and attributes into several subspaces, and the equivalence feature matrices of the different sub-information systems are updated and merged; (3) it updates approximations during the merging of the equivalence feature matrices, which avoids some time-consuming set operations; and (4) extensive experiments on real-life data verify the effectiveness and efficiency of the method.

The paper is organized as follows. In Section 2, basic concepts and principles of DTRS are introduced. In Section 3, the granularity of the information system when objects and attributes are added simultaneously is analyzed. In Section 4, an approach for updating approximations incrementally based on granules in DTRS is presented. In Section 5, an example is employed to explain the proposed methods. In Section 6, experimental evaluations of the incremental approaches are carried out on UCI data sets. Section 7 concludes the paper and outlines future research directions.

2 DECISION-THEORETIC ROUGH SET MODELS

Yao proposed DTRS in [65]. In this section, we introduce the basic concepts and principles of DTRS [62][65].

Definition 2.1. An information system S is a 4-tuple (U, A, V, f), where U = {x_1, x_2, ..., x_{|U|}} is a non-empty finite set of objects, called the universe; A = {a_1, a_2, ..., a_{|A|}} is a non-empty finite set of attributes, with A = C ∪ D and C ∩ D = ∅, where C and D denote the sets of condition attributes and decision attributes, respectively; V is the domain of the attributes, V = {V_{a_1}, V_{a_2}, ..., V_{a_{|A|}}}; and f : U × A → V is an information function, f = {f(x_i, q) | f(x_i, q) : x_i → v_q, q ∈ C, x_i ∈ U, 1 ≤ i ≤ |U|}. f(x_i, a_l) = v_{il} (i = 1, 2, ..., |U|, l = 1, 2, ..., |A|) denotes the attribute value of object x_i under a_l.
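To make the constructions below concrete, here is a minimal Python sketch, ours rather than the paper's, of Definition 2.1: the information function f is naturally a lookup table from (object, attribute) to a value, and storing each object's row as a tuple keeps the later constructions compact. All names here are hypothetical.

```python
# A toy information system (our sketch of Definition 2.1; names are ours).
U = ["x1", "x2", "x3"]                    # hypothetical universe
C = ["a1", "a2"]                          # hypothetical condition attributes
table = {"x1": (1, 0), "x2": (1, 0), "x3": (0, 1)}  # one row per object

def f(x, a):
    """The information function: f(x_i, a_l) = v_il."""
    return table[x][C.index(a)]

print(f("x3", "a2"))                      # 1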


Definition 2.2. The equivalence relation on the attribute set C is defined as follows:

$R_C = \{(x, y) \in U \times U \mid \forall a \in C, f(x, a) = f(y, a)\}$   (1)

Let [x]_{R_C} = [x]_C = {y ∈ U | (x, y) ∈ R_C} denote the equivalence class including the object x. U/R_C = {[x]_C | x ∈ U} = {E_1, ..., E_k} (1 ≤ k ≤ |U|) is the family of all equivalence classes. The conditional probability P(X|[x]_C) is estimated as |X ∩ [x]_C| / |[x]_C| [21]. It satisfies the following properties.

Property 2.1. For P(X|[x]_C),
(1) P(X|[x]_C) = 0 ⇔ [x]_C ∩ X = ∅;
(2) P(X^c|[x]_C) = 1 − P(X|[x]_C), where X^c is the complement of X;
(3) P(X|[x]_C) = 1 ⇔ [x]_C ⊆ X;
(4) 0 < P(X|[x]_C) < 1 ⇔ [x]_C ∩ X ≠ ∅ ∧ ¬([x]_C ⊆ X).

In the following, we introduce the definitions of approximations in DTRS. A pair of thresholds decided by the loss function is introduced to define the positive, boundary, and negative regions in DTRS.

Definition 2.3. Let U be a non-empty finite set of objects, R an equivalence relation on U, X ⊆ U, and 0 ≤ β < α ≤ 1. The lower and upper approximations are defined as follows:

$\underline{apr}_{(\alpha,\beta)}(X) = \{x \in U \mid P(X|[x]_C) \ge \alpha\}$   (2)

$\overline{apr}_{(\alpha,\beta)}(X) = \{x \in U \mid P(X|[x]_C) > \beta\}$   (3)

Then, the (α, β) positive, boundary, and negative regions are given as follows, respectively:

$POS_{(\alpha,\beta)}(X) = \{x \in U \mid P(X|[x]_C) \ge \alpha\}$   (4)

$BND_{(\alpha,\beta)}(X) = \{x \in U \mid \beta < P(X|[x]_C) < \alpha\}$   (5)

$NEG_{(\alpha,\beta)}(X) = \{x \in U \mid P(X|[x]_C) \le \beta\}$   (6)

Here, the thresholds α and β are determined by the loss function. The basic idea is as follows. When an object x belongs to X, let λ_PP, λ_NP, and λ_BP denote the losses incurred when x is assigned to POS_(α,β)(X), NEG_(α,β)(X), and BND_(α,β)(X), respectively. On the contrary, when an object x does not belong to X, let λ_PN, λ_NN, and λ_BN denote the losses incurred when x is assigned to POS_(α,β)(X), NEG_(α,β)(X), and BND_(α,β)(X), respectively. Generally, the loss function satisfies λ_PP ≤ λ_BP ≤ λ_NP and λ_NN ≤ λ_BN ≤ λ_PN, which means that when x actually belongs to X, the risk of assigning x to POS_(α,β)(X) is not higher than that of assigning it to BND_(α,β)(X), and the risk of assigning x to POS_(α,β)(X) or BND_(α,β)(X) is lower than that of assigning it to NEG_(α,β)(X). In the other case, if x does not actually belong to X, then the risk of assigning x to NEG_(α,β)(X) is not higher than that of assigning it to BND_(α,β)(X), and the risk of assigning x to NEG_(α,β)(X) or BND_(α,β)(X) is lower than that of assigning x to POS_(α,β)(X). Then, we have

$\alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}$   (7)

$\gamma = \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{PN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{PP})}$   (8)

$\beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}$   (9)

where α ∈ (0, 1], γ ∈ (0, 1), β ∈ [0, 1). Different types of probabilistic rough sets are then defined by different kinds of loss functions. If P(X|[x]_C) + P(X^c|[x]_C) = 1 and (λ_PN − λ_BN)(λ_NP − λ_BP) > (λ_BP − λ_PP)(λ_BN − λ_NN), then α > β and α > γ > β. Therefore, three-way decision rules are induced from the different regions in DTRS, i.e., positive rules for acceptance, boundary rules for indecision or delayed decision, and negative rules for rejection [64]. If α = 1 and β = 0, then the Traditional Rough Set (TRS) model is easily derived, that is,

$\underline{apr}_{(1,0)}(X) = \cup\{[x] \in U/R \mid [x] \subseteq X\};$   (10)

$\overline{apr}_{(1,0)}(X) = \cup\{[x] \in U/R \mid [x] \cap X \ne \emptyset\}.$   (11)
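As a brief illustration, ours rather than the paper's code, the following Python sketch implements Definitions 2.2 and 2.3: objects with identical condition-attribute rows form one equivalence class, and each class is assigned to a region by comparing the conditional probability P(X|[x]_C) with the thresholds α and β.

```python
# Our sketch of Definitions 2.2-2.3 (equivalence classes and DTRS regions).
from collections import defaultdict

def equivalence_classes(table):
    """U/R_C: group objects whose condition-attribute vectors coincide."""
    classes = defaultdict(set)
    for obj, values in table.items():
        classes[tuple(values)].add(obj)
    return list(classes.values())

def dtrs_regions(classes, X, alpha, beta):
    pos, bnd, neg = set(), set(), set()
    for E in classes:
        p = len(E & X) / len(E)      # P(X | E) = |X ∩ E| / |E|
        if p >= alpha:
            pos |= E                 # Eq. (4)
        elif p > beta:
            bnd |= E                 # Eq. (5)
        else:
            neg |= E                 # Eq. (6)
    return pos, bnd, neg
```

With alpha = 1 and beta = 0 this reduces to the traditional model of Eqs. (10)-(11): pos is the lower approximation and pos together with bnd forms the upper one.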

3 THE GRANULARITY PROPERTY OF THE INFORMATION SYSTEM WHEN OBJECTS OR ATTRIBUTES VARY

In DTRS, equivalence classes form a partition of the universe, and granules are induced by the equivalence classes. The finer the granules, the stronger the ability to distinguish objects; the coarser the granules, the rougher that ability. An arbitrary set in the universe is described approximately by the granules induced by equivalence classes, so the measure of granularity is important in rough set theory. Yao proposed a measure of granularity for a partition in [60] as follows.

Definition 3.1. [60] Let (U, R) be an approximation space and suppose U/R = {E_1, E_2, ..., E_k} (1 ≤ k ≤ n) is a partition of U. A measure of granularity for a partition is defined as

$G(U/R) = \sum_{i=1}^{k} \frac{|E_i|}{|U|} \log_2 |E_i|$   (12)

where |E_i|/|U| represents the probability of the equivalence class E_i within the universe U, and log_2 |E_i| is commonly known as the Hartley measure of information of the set E_i [60].

In the following, we discuss the property of the measure of granularity of a partition w.r.t. the alteration of attributes or objects. Let S^t = (U^t, A^t, V^t, f^t) be the information system at time t, and let U^t/R^t = (E_1^t, E_2^t, ..., E_n^t) denote a partition of the universe U^t. Let S^{t+1} = (U^{t+1}, A^{t+1}, V^{t+1}, f^{t+1}) be the information system at time t+1, and let U^{t+1}/R^{t+1} = (E_1^{t+1}, E_2^{t+1}, ..., E_m^{t+1}) be a partition of the universe U^{t+1}. If ∀E_i^{t+1} ∈ U^{t+1}/R^{t+1} (1 ≤ i ≤ m), ∃E_j^t ∈ U^t/R^t (1 ≤ j ≤ n) s.t. E_i^{t+1} ⊆ E_j^t, then U^{t+1}/R^{t+1} ≺ U^t/R^t, i.e., U^{t+1}/R^{t+1} is a finer partition than U^t/R^t.
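The measure in Definition 3.1 is straightforward to compute. A short sketch of ours, a direct transcription of Eq. (12):

```python
# Our transcription of Eq. (12): the granularity of a partition U/R.
import math

def granularity(partition):
    n = sum(len(E) for E in partition)   # |U|
    return sum(len(E) / n * math.log2(len(E)) for E in partition)

# The measure is 0 for the discrete partition (all singletons, log2(1) = 0)
# and log2(|U|) for the one-block partition, matching the intuition that
# finer granules carry more discriminating power.
print(granularity([{1}, {2}, {3}, {4}]))   # 0.0
print(granularity([{1, 2, 3, 4}]))         # 2.0
```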


Considering the addition of objects or attributes, the following properties hold.

Property 3.1. When inserting an object set ∆U into the universe, i.e., A^{t+1} = A^t and U^{t+1} = U^t ∪ ∆U, G(U^t/R^t) may be either greater or less than G(U^{t+1}/R^{t+1}).

Proof. Without loss of generality, we consider the case of adding a single object x_{n+1}^{t+1}. Then one of the following two cases holds: i. U^{t+1}/R^{t+1} = U^t/R^t ∪ {{x_{n+1}^{t+1}}}, i.e., x_{n+1}^{t+1} becomes a new equivalence class in U^{t+1}/R^{t+1}; ii. E_j^{t+1} = E_i^t ∪ {x_{n+1}^{t+1}}, i.e., x_{n+1}^{t+1} merges into a certain equivalence class E_i^t (1 ≤ i ≤ n) of U^t/R^t.

In Case i,

$G(U^{t+1}/R^{t+1}) = \sum_{i=1}^{n+1} \frac{|E_i^{t+1}|}{|U|+1} \log_2|E_i^{t+1}| = \sum_{i=1}^{n} \frac{|E_i^{t+1}|}{|U|+1} \log_2|E_i^{t+1}| + \frac{|E_{n+1}^{t+1}|}{|U|+1} \log_2|E_{n+1}^{t+1}|.$

Since |E_{n+1}^{t+1}| = 1 and |E_i^{t+1}| = |E_i^t| for 1 ≤ i ≤ n, we get

$G(U^{t+1}/R^{t+1}) = \sum_{i=1}^{n} \frac{|E_i^t|}{|U|+1} \log_2|E_i^t| < \sum_{i=1}^{n} \frac{|E_i^t|}{|U|} \log_2|E_i^t| = G(U^t/R^t).$

In Case ii,

$G(U^{t+1}/R^{t+1}) = \sum_{i=1}^{j-1} \frac{|E_i^{t+1}|}{|U|+1}\log_2|E_i^{t+1}| + \frac{|E_j^{t+1}|}{|U|+1}\log_2|E_j^{t+1}| + \sum_{i=j+1}^{n} \frac{|E_i^{t+1}|}{|U|+1}\log_2|E_i^{t+1}|.$

Since |E_j^{t+1}| = |E_j^t| + 1, the j-th term increases, i.e., (|E_j^{t+1}|/(|U|+1)) log_2|E_j^{t+1}| > (|E_j^t|/|U|) log_2|E_j^t|, while every other term decreases, i.e., Σ_{i≠j} (|E_i^{t+1}|/(|U|+1)) log_2|E_i^{t+1}| < Σ_{i≠j} (|E_i^t|/|U|) log_2|E_i^t|. Let ∆X denote the increase of the j-th term and ∆Y the total decrease of the remaining terms. If ∆Y > ∆X, then G(U^t/R^t) > G(U^{t+1}/R^{t+1}); if ∆Y < ∆X, then G(U^t/R^t) < G(U^{t+1}/R^{t+1}).

From i and ii, it is clear that G(U^t/R^t) is neither always greater nor always less than G(U^{t+1}/R^{t+1}) when adding an object set ∆U.

Example 3.1. The relationship between the measure of granularity and the objects is depicted in Figure 1. We download the data set "Chess" from the UC Irvine Machine Learning Database Repository (www.ics.uci.edu/~mlearn/MLRepository.html). Initially, 12 attributes and 10 objects are selected. In each experiment, 10 objects are added and the attributes keep unchanged. The horizontal coordinate of a point is the cardinality of the object set; the vertical coordinate is the value of the measure of granularity. From Figure 1, we can see that the measure of granularity may increase or decrease with the addition of objects.

Fig. 1: The relationship between granularity and objects

Property 3.2. [60] When inserting attributes ∆A into the universe, i.e., U^{t+1} = U^t and A^{t+1} = A^t ∪ ∆A, G(U^t/R^t) ≥ G(U^{t+1}/R^{t+1}).

Example 3.2. The relationship between granularity and attributes is depicted in Figure 2. The data set is again "Chess". The number of initial objects is 3196 and that of attributes is 3; then 3 attributes are added in each experiment. The horizontal coordinate of a point is the cardinality of the attribute set; the vertical coordinate is the value of the measure of granularity. From Figure 2, we can see that the measure of granularity decreases with the increase of attributes.

Fig. 2: The relationship between granularity and attributes

Property 3.3. When inserting an object set ∆U and an attribute set ∆A into the universe, i.e., A^{t+1} = A^t ∪ ∆A and U^{t+1} = U^t ∪ ∆U, G(U^t/R^t) may be either greater or less than G(U^{t+1}/R^{t+1}).

Proof. It is clear from Properties 3.1 and 3.2.

4 INCREMENTAL UPDATING OF APPROXIMATIONS IN DTRS BASED ON GRANULES WHILE ATTRIBUTES AND OBJECTS VARY SIMULTANEOUSLY

Granulation and approximation are two important concepts in rough set theory. Granules induced by equivalence classes form a partition of the universe, and the equivalence classes are used to describe an uncertain concept or subset approximately. The objects, attributes, and attribute values in the information system may change,


and then the granularity of the granules may alter. The granules may become finer with the addition of attributes; they may become coarser, or new granules may emerge, when objects are added. The approximations of a concept may vary according to the variation of granules, i.e., the positive, boundary, and negative regions may change, which means the three kinds of rules may alter. GrC presents a multi-view and multi-level method for the information system. Using the information of granules effectively while ignoring unnecessary details contributes to improving the efficiency of knowledge discovery. We therefore present the definitions of the equivalence feature vector and the equivalence feature matrix as follows.

Definition 4.1. Let S = (U, A, V, f) be an information system and U/R_C = {E_1, E_2, ..., E_i, ..., E_l}. The equivalence feature vector of E_i is defined as $\vec{E}_i$ = (Index_i, obj_i, reg_i), and $\vec{E}_{ic}$ = (e_{i1}, e_{i2}, ..., e_{ij}, ..., e_{im}) denotes the characteristic value vector of E_i, where e_{ij} = f(x_k, a_j) (∀x_k ∈ E_i, a_j ∈ C), Index_i = x_k (∃x_k ∈ E_i) is the feature index, and obj_i = {x_k | x_k ∈ E_i}. If E_i ∈ POS_(α,β)(X), then reg_i = P; if E_i ∈ BND_(α,β)(X), then reg_i = B; if E_i ∈ NEG_(α,β)(X), then reg_i = N.

Definition 4.2. Let S = (U, A, V, f) be an information system and $\vec{E}_i$ = (Index_i, obj_i, reg_i) the equivalence feature vector of the equivalence class E_i. The equivalence feature matrix and the characteristic value matrix of the information system are defined as follows, respectively:

$M_E = \begin{pmatrix} \vec{E}_1 \\ \vdots \\ \vec{E}_j \\ \vdots \\ \vec{E}_l \end{pmatrix} = \begin{pmatrix} Index_1 & obj_1 & reg_1 \\ \vdots & \vdots & \vdots \\ Index_j & obj_j & reg_j \\ \vdots & \vdots & \vdots \\ Index_l & obj_l & reg_l \end{pmatrix}$   (13)

$M_{EC} = \begin{pmatrix} \vec{E}_{1c} \\ \vdots \\ \vec{E}_{jc} \\ \vdots \\ \vec{E}_{lc} \end{pmatrix} = \begin{pmatrix} e_{11} & \ldots & e_{1m} \\ \vdots & \ddots & \vdots \\ e_{j1} & \ldots & e_{jm} \\ \vdots & \ddots & \vdots \\ e_{l1} & \ldots & e_{lm} \end{pmatrix}$   (14)

where l = |U/R_C|. The granulation and approximation information are both encoded in the equivalence feature matrix, which may change when the information system evolves with time.

Lemma 4.1. Let S = (U, A, V, f) be an information system and M_E its equivalence feature matrix. Then $\underline{apr}_{(\alpha,\beta)}(X)$ = ∪{obj_i | obj_i ∈ $\vec{E}_i$ ∧ reg_i = P, 1 ≤ i ≤ l} and $\overline{apr}_{(\alpha,\beta)}(X)$ = ∪{obj_i | obj_i ∈ $\vec{E}_i$ ∧ (reg_i = P ∨ reg_i = B), 1 ≤ i ≤ l}.

Proof. It follows directly from Definition 2.3.
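A minimal sketch of ours for Definitions 4.1-4.2 (all function and variable names are ours): one feature row (index, objects, region) per equivalence class, keyed by the class's characteristic value vector so that granules can be matched and merged later.

```python
# Our sketch of the equivalence feature matrix of Definitions 4.1-4.2.
def feature_matrix(table, X, alpha, beta):
    M = {}                                 # char. value vector -> row
    for obj, values in table.items():
        key = tuple(values)
        if key not in M:
            M[key] = [obj, set()]          # Index_i, obj_i
        M[key][1].add(obj)
    for key, (index, objs) in M.items():
        p = len(objs & X) / len(objs)      # P(X | obj_i)
        reg = "P" if p >= alpha else ("B" if p > beta else "N")
        M[key] = (index, objs, reg)        # the row (Index_i, obj_i, reg_i)
    return M
```

Lemma 4.1 then amounts to reading the approximations off the reg labels of the rows.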

Now, we consider the variation of the information system when objects and attributes alter simultaneously. Let S^t = (U^t, A^t, V^t, f^t) denote the information system at time t, and S^{t+1} = (U^{t+1}, A^{t+1}, V^{t+1}, f^{t+1}) the information system at time t+1, where U^{t+1} = U^t ∪ ∆U, A^t = C^t ∪ d^t, A^{t+1} = C^{t+1} ∪ d^t, and C^{t+1} = C^t ∪ ∆C. For convenience, the equivalence feature vector, characteristic value vector, and equivalence feature matrix of the information system S^t are denoted as $\vec{E}_i^t$ = (Index_i^t, obj_i^t, reg_i^t), $\vec{E}_{ic}^t$ = (e_{i1}^t, ..., e_{im}^t), and M_E^t, respectively. Then the equivalence feature matrix of S^t is defined as follows:

$M_E^t = \begin{pmatrix} \vec{E}_1^t \\ \vdots \\ \vec{E}_j^t \\ \vdots \\ \vec{E}_{l^t}^t \end{pmatrix} = \begin{pmatrix} Index_1^t & obj_1^t & reg_1^t \\ \vdots & \vdots & \vdots \\ Index_j^t & obj_j^t & reg_j^t \\ \vdots & \vdots & \vdots \\ Index_{l^t}^t & obj_{l^t}^t & reg_{l^t}^t \end{pmatrix}$  (l^t = |U^t/R_{C^t}|)   (15)

Now, we partition the information system S^{t+1} into two subspaces, i.e., S^{U^{t+1}} = (U^{t+1}, A^t, V^{U^{t+1}}, f^{U^{t+1}}) and S^{∆A} = (U^{t+1}, ∆C, V^{∆A}, f^{∆A}). S^{U^{t+1}} is further partitioned into two subspaces: S^t = (U^t, A^t, V^t, f^t) and S^{∆U} = (∆U, A^t, V^{∆U}, f^{∆U}). Generally, the concepts in the information system may change. Let X^t, X^{U^{t+1}}, X^{∆U}, and X^{t+1} denote the concepts in S^t, S^{U^{t+1}}, S^{∆U}, and S^{t+1}, respectively. Because the decision attributes do not change, X^{t+1} = X^{U^{t+1}} and X^{t+1} = X^t ∪ X^{∆U}. It is then clear that S^{∆A} and S^{∆U} are information systems added to the previous information system S^t. Each information system has its own granular structure; merging them into one united granular structure is taken into consideration, and the different regions are updated during the course of merging granules.

Proposition 4.1. Given the feature matrices M_E^t and M_E^{∆U}, the following three cases hold for M_E^{U^{t+1}}:
i. If $\vec{E}_{ic}^t = \vec{E}_{kc}^{∆U}$, then obj_i^{U^{t+1}} = obj_i^t ∪ obj_k^{∆U} and Index_i^{U^{t+1}} = Index_i^t, where reg_i^{U^{t+1}} is determined as follows:
  A. If reg_i^t = reg_k^{∆U}, then reg_i^{U^{t+1}} = reg_i^t;
  B. Otherwise,
    (a) if α = 1, β = 0, then reg_i^{U^{t+1}} = B;
    (b) if P(X^{t+1}|obj_i^{U^{t+1}}) ≥ α, then reg_i^{U^{t+1}} = P; if β < P(X^{t+1}|obj_i^{U^{t+1}}) < α, then reg_i^{U^{t+1}} = B; if P(X^{t+1}|obj_i^{U^{t+1}}) ≤ β, then reg_i^{U^{t+1}} = N.
ii. If ¬∃$\vec{E}_{kc}^{∆U}$ s.t. $\vec{E}_{ic}^t = \vec{E}_{kc}^{∆U}$, then $\vec{E}_i^{U^{t+1}} = \vec{E}_i^t$;
iii. Otherwise, if ¬∃$\vec{E}_{ic}^t$ s.t. $\vec{E}_{ic}^t = \vec{E}_{kc}^{∆U}$, then j = l^t + 1 and $\vec{E}_j^{U^{t+1}} = \vec{E}_k^{∆U}$.

Proof. i. If $\vec{E}_{ic}^t = \vec{E}_{kc}^{∆U}$, then obj_i^{U^{t+1}} = obj_i^t ∪ obj_k^{∆U}. Since X^{t+1} = X^t ∪ X^{∆U}, obj_i^t ∩ obj_k^{∆U} = ∅, and X^t ∩ X^{∆U} = ∅, we have obj_i^{U^{t+1}} ∩ X^{t+1} = (obj_i^t ∩ X^t) ∪ (obj_k^{∆U} ∩ X^{∆U}). A. If reg_i^t = P and reg_k^{∆U} = P, then |obj_i^t ∩ X^t| / |obj_i^t| ≥ α and |obj_k^{∆U} ∩ X^{∆U}| / |obj_k^{∆U}| ≥ α. Therefore,

$P(X^{t+1}|obj_i^{U^{t+1}}) = \frac{|obj_i^t \cap X^t| + |obj_k^{\Delta U} \cap X^{\Delta U}|}{|obj_i^t| + |obj_k^{\Delta U}|} \ge \frac{\alpha|obj_i^t| + \alpha|obj_k^{\Delta U}|}{|obj_i^t| + |obj_k^{\Delta U}|} = \alpha,$

so reg_i^{U^{t+1}} = P. Similarly, if reg_i^t = B and reg_k^{∆U} = B, then reg_i^{U^{t+1}} = B; if reg_i^t = N and reg_k^{∆U} = N, then reg_i^{U^{t+1}} = N. Therefore, if reg_i^t = reg_k^{∆U}, then reg_i^{U^{t+1}} = reg_i^t. B. The following mixed cases may happen: reg_i^t = P and reg_k^{∆U} = N, reg_i^t = P and reg_k^{∆U} = B, reg_i^t = N and reg_k^{∆U} = B, reg_i^t = N and reg_k^{∆U} = P, reg_i^t = B and reg_k^{∆U} = P, or reg_i^t = B and reg_k^{∆U} = N. (a) When α = 1 and β = 0, if reg_i^t = P and reg_k^{∆U} = N, we have obj_i^t ⊆ X^t and obj_k^{∆U} ∩ X^{∆U} = ∅. Then obj_i^{U^{t+1}} = (obj_i^t ∪ obj_k^{∆U}) ⊄ X^t ∪ X^{∆U} and obj_i^{U^{t+1}} ∩ (X^t ∪ X^{∆U}) ≠ ∅. Therefore, reg_i^{U^{t+1}} = B. The other cases can be proved in a similar way, namely reg_i^{U^{t+1}} = B. (b) It follows directly from Definition 2.3. Cases ii and iii are obvious.

In Proposition 4.1, three cases of granule variation are presented when adding objects to the information system. If the characteristic value vectors of granules in S^{∆U} and S^t are the same, then these granules merge into one granule in S^{U^{t+1}}. If the characteristic value vector of a granule in S^{∆U} (resp. S^t) differs from that of every granule in S^t (resp. S^{∆U}), then this granule does not merge with any other granule in S^{t+1}. For M_E^{t+1}, the following proposition holds.
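The merging step of Proposition 4.1 can be sketched in a few lines, reusing the feature_matrix representation above. This is our rendering, not the paper's implementation; the region labels "P"/"B"/"N" and function names are ours.

```python
# Our sketch of Proposition 4.1: rows with equal characteristic vectors
# merge (case i); unmatched rows on either side carry over (cases ii, iii).
def merge_feature_matrices(M_t, M_du, X_new, alpha, beta):
    merged = dict(M_t)                                 # case ii: keep old rows
    for key, (idx_d, objs_d, reg_d) in M_du.items():
        if key in merged:                              # case i: merge granules
            idx, objs, reg = merged[key]
            objs = objs | objs_d
            if reg != reg_d:                           # case i.B: re-evaluate
                p = len(objs & X_new) / len(objs)
                reg = "P" if p >= alpha else ("B" if p > beta else "N")
            merged[key] = (idx, objs, reg)
        else:                                          # case iii: new granule
            merged[key] = (idx_d, objs_d, reg_d)
    return merged
```

Case i.A needs no recomputation at all, which is where the incremental saving over recomputing every conditional probability comes from.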

Proposition 4.2. Given the feature matrix M_E^{U^{t+1}}, for M_E^{t+1} we have $\vec{E}_{jc}^{t+1}$ = (e_{j1}^{U^{t+1}}, ..., e_{j|C^t|}^{U^{t+1}}, e_{i1}^{∆A}, ..., e_{i|∆C|}^{∆A}) and obj_j^{t+1} ∈ obj_j^{U^{t+1}}/R_{∆C}; for reg_j^{t+1}, the following hold:
i. If obj_j^{t+1} ≠ obj_j^{U^{t+1}}, i.e., the granule splits under R_{∆C}, then
  A. If α = 1 and β = 0, then
    (a) if reg_j^{U^{t+1}} = B: (i) if (obj_j^{U^{t+1}} ∩ obj_i^{∆A}) ⊆ X^{t+1}, then reg_j^{t+1} = P; (ii) if (obj_j^{U^{t+1}} ∩ obj_i^{∆A}) ∩ X^{t+1} = ∅, then reg_j^{t+1} = N; otherwise reg_j^{t+1} = B;
    (b) otherwise, reg_j^{t+1} = reg_j^{U^{t+1}}.
  B. Otherwise,
    (a) if P(X^{t+1}|(obj_j^{U^{t+1}} ∩ obj_i^{∆A})) ≥ α, then reg_j^{t+1} = P;
    (b) if β < P(X^{t+1}|(obj_j^{U^{t+1}} ∩ obj_i^{∆A})) < α, then reg_j^{t+1} = B;
    (c) if P(X^{t+1}|(obj_j^{U^{t+1}} ∩ obj_i^{∆A})) ≤ β, then reg_j^{t+1} = N.
ii. Otherwise, reg_j^{t+1} = reg_j^{U^{t+1}}.

Proof. It follows directly from Definitions 2.2 and 2.3.

In the following, we design an algorithm for updating the regions based on granules while the attribute set and the object set vary simultaneously.

Algorithm 4.1. Updating Approximations based on Granules w.r.t. Objects and Attributes Added Simultaneously (UAGOAAS)

UAGOAAS(M_E^t, U^t, X^t, ∆U, ∆C, α, β)
 1  k ← 1
 2  for each x_i in ∆U
 3      do COMPUTEMDU(∆U, X^t, α, β)                ▷ compute M_E^{∆U}
 4  X^{t+1} ← X^t ∪ X^{∆U}
 5  for each $\vec{E}_j^t$ in M_E^t
 6      do COMPUTEMUT1($\vec{E}_j^t$, $\vec{E}_k^{∆U}$, α, β)   ▷ compute M_E^{U^{t+1}}
 7  if ¬∃$\vec{E}_{ic}^t$ s.t. $\vec{E}_{kc}^{∆U} = \vec{E}_{ic}^t$
 8      then $\vec{E}_{l^t+1}^{U^{t+1}}$ ← $\vec{E}_k^{∆U}$
 9  if ¬∃$\vec{E}_{kc}^{∆U}$ s.t. $\vec{E}_{ic}^t = \vec{E}_{kc}^{∆U}$
10      then $\vec{E}_j^{U^{t+1}}$ ← $\vec{E}_j^t$
11  COMPUTEMT1(M_E^{U^{t+1}})                        ▷ compute M_E^{t+1}
12  $\overline{apr}_{(\alpha,\beta)}(X^{t+1})$ ← BND_(α,β)(X^{t+1}) ∪ $\underline{apr}_{(\alpha,\beta)}(X^{t+1})$
13  return M_E^{t+1}, X^{t+1}, $\underline{apr}_{(\alpha,\beta)}(X^{t+1})$, $\overline{apr}_{(\alpha,\beta)}(X^{t+1})$

COMPUTEMDU(∆U, X^t, α, β)
 1  if ¬∃E_k^{∆U} ∈ ∆U/R_{C^t} s.t. x_i ∈ E_k^{∆U}
 2      then compute $\vec{E}_k^{∆U}$, k ← k + 1
 3  return $\vec{E}_k^{∆U}$, X^{∆U}

COMPUTEMUT1($\vec{E}_j^t$, $\vec{E}_k^{∆U}$, α, β)
 1  if $\vec{E}_{kc}^{∆U} = \vec{E}_{jc}^t$
 2      then obj_j^{U^{t+1}} ← obj_k^{∆U} ∪ obj_j^t; Index_j^{U^{t+1}} ← Index_j^t
 3      if reg_k^{∆U} = reg_j^t
 4          then reg_j^{U^{t+1}} ← reg_j^t
 5      elseif α = 1, β = 0
 6          then reg_j^{U^{t+1}} ← B
 7      else EQUIREGIONS(X^{t+1}, obj_j^{U^{t+1}}, α, β)
 8  return $\vec{E}_j^{U^{t+1}}$

COMPUTEMT1(M_E^{U^{t+1}})
 1  for each $\vec{E}_i^{U^{t+1}}$ in M_E^{U^{t+1}}
 2      do k ← 0
 3      for each x_j in obj_{E_i}^{U^{t+1}}
 4          if ¬∃E_k^{t+1} ∈ obj_{E_i}^{U^{t+1}}/R_{∆C} s.t. x_j ∈ E_k^{t+1}
 5              then compute E_k^{t+1} (E_k^{t+1} ∈ obj_{E_i}^{U^{t+1}}/R_{∆C})
 6              else continue
 7          if E_k^{t+1} = obj_{E_i}^{U^{t+1}}
 8              then UPDATEAPPROXIMATIONS($\vec{E}_i^{t+1}$), break
 9              else k ← k + 1, obj_i^{t+1} ← E_k^{t+1}, Index_i^{t+1} ← x_l (x_l ∈ E_k^{t+1})
10          if α = 1, β = 0
11              then TRAEQUIREGIONS(X^{t+1}, obj_i^{t+1}, reg_i^{U^{t+1}})
12              else EQUIREGIONS(X^{t+1}, obj_i^{t+1}, α, β)
13          UPDATEAPPROXIMATIONS($\vec{E}_k^{t+1}$)
14          if k > 1
15              then l′ ← l^{U^{t+1}} + 1, $\vec{E}_{l′}^{t+1}$ ← $\vec{E}_k^{t+1}$
16  return M_E^{t+1}, $\underline{apr}_{(\alpha,\beta)}(X^{t+1})$, BND_(α,β)(X^{t+1})

TRAEQUIREGIONS(X, obj_i, reg_o)
 1  if reg_o = B
 2      then if obj_i ⊆ X
 3          then reg_i ← P
 4      elseif obj_i ∩ X ≠ ∅
 5          then reg_i ← B
 6      else reg_i ← N
 7  else reg_i ← reg_o
 8  return reg_i

UPDATEAPPROXIMATIONS($\vec{E}_i^t$)
 1  if reg_i^t = P
 2      then $\underline{apr}_{(\alpha,\beta)}(X^{t+1})$ ← $\underline{apr}_{(\alpha,\beta)}(X^{t+1})$ ∪ obj_i^t
 3  elseif reg_i^t = B
 4      then BND_(α,β)(X^{t+1}) ← BND_(α,β)(X^{t+1}) ∪ obj_i^t

EQUIREGIONS(X, obj, α, β)
 1  P_t ← P(X|obj)
 2  if P_t ≥ α
 3      then reg ← P
 4  elseif β < P_t < α
 5      then reg ← B
 6  elseif P_t ≤ β
 7      then reg ← N
 8  return reg

In Algorithm UAGOAAS, M_E^{∆U} is computed in line 3 by the function COMPUTEMDU(∆U, X^t, α, β) in the subspace S^{∆U}, and X^{t+1} is updated in line 4. Then, the equivalence feature vectors in M_E^{U^{t+1}} are generated by merging the equivalence feature vectors in M_E^t and M_E^{∆U} in lines 5 to 10. Finally, M_E^{t+1} is calculated in line 11 by the function COMPUTEMT1(M_E^{U^{t+1}}); the approximations are generated during the course of generating M_E^{t+1}.
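Putting the pieces together, the following compact Python rendition of the UAGOAAS workflow is ours, not the paper's code; it reuses feature_matrix, merge_feature_matrices, and equivalence_classes from the earlier sketches, and for clarity it re-evaluates every split granule rather than reusing unchanged regions as the paper's case analysis does.

```python
# Our end-to-end sketch of the UAGOAAS workflow.
def uagoaas(M_t, X_t, table_du, X_du, new_cols, alpha, beta):
    """M_t: feature matrix of S^t; table_du: rows of dU over the old
    attributes; new_cols: values of the added attributes dC for every
    object in U^{t+1} (assumed available)."""
    X_new = X_t | X_du                                       # line 4
    M_du = feature_matrix(table_du, X_du, alpha, beta)       # ComputeMDU
    M_u = merge_feature_matrices(M_t, M_du, X_new, alpha, beta)  # ComputeMUT1
    final, lower, bnd = {}, set(), set()
    for key, (idx, objs, _) in M_u.items():                  # ComputeMT1:
        # split every merged granule by the new attributes dC
        for sub in equivalence_classes({o: new_cols[o] for o in objs}):
            p = len(sub & X_new) / len(sub)
            reg = "P" if p >= alpha else ("B" if p > beta else "N")
            some = next(iter(sub))
            final[key + new_cols[some]] = (some, sub, reg)
            if reg == "P":
                lower |= sub
            elif reg == "B":
                bnd |= sub
    return final, lower, lower | bnd            # M_E^{t+1}, apr, apr-bar
```

The key point of the design survives in the sketch: only the new objects are granulated from scratch, and only the affected granules have their conditional probabilities recomputed.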

5 AN ILLUSTRATIVE EXAMPLE

An example is given to illustrate how the proposed granule-based method updates approximations when objects and attributes vary simultaneously. Let S^t = (U^t, A^t, V^t, f^t) be the information system at time t, where U^t = {x_i, 1 ≤ i ≤ 10}, A^t = C^t ∪ d, C^t = {a_i, 1 ≤ i ≤ 4}, V^t = {0, 1} (see Table 1). At time t+1, the attributes {a5, a6} and the objects {x11, x12, x13, x14, x15} are added to the information system S^t (see Table 2).

TABLE 1: An information system S^t at time t

U    a1  a2  a3  a4  d
x1   1   0   0   1   0
x2   0   1   0   0   0
x3   1   1   0   0   0
x4   1   1   0   0   1
x5   0   1   0   0   1
x6   1   0   0   1   1
x7   1   0   0   1   0
x8   1   0   0   1   0
x9   0   0   0   1   1
x10  1   1   0   0   0

TABLE 2: An information system S^{t+1} at time t+1

U    a1  a2  a3  a4  a5  a6  d
x1   1   0   0   1   0   0   0
x2   0   1   0   0   1   1   0
x3   1   1   0   0   0   1   0
x4   1   1   0   0   0   0   1
x5   0   1   0   0   1   1   1
x6   1   0   0   1   0   1   1
x7   1   0   0   1   0   0   0
x8   1   0   0   1   0   1   0
x9   0   0   0   1   0   0   1
x10  1   1   0   0   0   1   0
x11  1   1   0   0   0   1   1
x12  0   0   0   1   0   0   0
x13  0   1   1   0   1   0   0
x14  0   1   1   0   1   0   0
x15  1   0   0   1   0   0   0

Then, U^t/R_{C^t} = {E_1^t, E_2^t, E_3^t, E_4^t}, with E_1^t = {x1, x6, x7, x8}, E_2^t = {x2, x5}, E_3^t = {x3, x4, x10}, E_4^t = {x9}; U^t/d = {D_1, D_2}, with D_1 = {x1, x2, x3, x7, x8, x10} and D_2 = {x4, x5, x6, x9}. Let α = 0.7, β = 0.3, f(X^t, d) = 0, X^t = D_1. Then P(X^t|E_1^t) = 0.75, P(X^t|E_2^t) = 0.5, P(X^t|E_3^t) = 0.67, P(X^t|E_4^t) = 0. Therefore, POS_(α,β)(X^t) = E_1^t = {x1, x6, x7, x8}, BND_(α,β)(X^t) = E_2^t ∪ E_3^t = {x2, x3, x4, x5, x10}, NEG_(α,β)(X^t) = E_4^t = {x9}, and

M_E^t =
  [ x1   {x1, x6, x7, x8}   P ]
  [ x2   {x2, x5}           B ]
  [ x3   {x3, x4, x10}      B ]
  [ x9   {x9}               N ]

M_EC^t =
  [ 1 0 0 1 ]
  [ 0 1 0 0 ]
  [ 1 1 0 0 ]
  [ 0 0 0 1 ]

Now, we partition the information system into S^{U^{t+1}} = (U^{U^{t+1}}, A^{U^{t+1}}, V^{U^{t+1}}, f^{U^{t+1}}) and S^{∆A} = (U^{∆A}, A^{∆A}, V^{∆A}, f^{∆A}), where U^{U^{t+1}} = U^{t+1}, A^{U^{t+1}} = A^t = {a1, a2, a3, a4, d}, U^{∆A} = U^{t+1}, and A^{∆A} = ∆A = {a5, a6} (see Tables 3 and 4). S^{U^{t+1}} is further divided into S^t and S^{∆U} = (∆U, A^{∆U}, V^{∆U}, f^{∆U}), where ∆U = {x11, x12, x13, x14, x15} and A^{∆U} = A^t (see Table 5).

TABLE 3: An information system S^{U^{t+1}}

U    a1  a2  a3  a4  d
x1   1   0   0   1   0
x2   0   1   0   0   0
x3   1   1   0   0   0
x4   1   1   0   0   1
x5   0   1   0   0   1
x6   1   0   0   1   1
x7   1   0   0   1   0
x8   1   0   0   1   0
x9   0   0   0   1   1
x10  1   1   0   0   0
x11  1   1   0   0   1
x12  0   0   0   1   0
x13  0   1   1   0   0
x14  0   1   1   0   0
x15  1   0   0   1   0

TABLE 4: An information system S^{∆A}

U    a5  a6        U    a5  a6
x1   0   0         x9   0   0
x2   1   1         x10  0   1
x3   0   1         x11  0   1
x4   0   0         x12  0   0
x5   1   1         x13  1   0
x6   0   1         x14  1   0
x7   0   0         x15  0   0
x8   0   1

TABLE 5: An information system S^{∆U}

U    a1  a2  a3  a4  d
x11  1   1   0   0   1
x12  0   0   0   1   0
x13  0   1   1   0   0
x14  0   1   1   0   0
x15  1   0   0   1   0

i. Compute M_E^{∆U} in the subspace S^{∆U}:

M_E^{∆U} =
  [ x11  {x11}       N ]
  [ x12  {x12}       P ]
  [ x13  {x13, x14}  P ]
  [ x15  {x15}       P ]

M_EC^{∆U} =
  [ 1 1 0 0 ]
  [ 0 0 0 1 ]
  [ 0 1 1 0 ]
  [ 1 0 0 1 ]

X^{∆U} = {x12, x13, x14, x15}, and X^{t+1} = X^t ∪ X^{∆U} = {x1, x2, x3, x7, x8, x10, x12, x13, x14, x15}.

ii. Compute M_E^{U^{t+1}} in the subspace S^{U^{t+1}} incrementally.
a) Since $\vec{E}_{1c}^t = \vec{E}_{4c}^{∆U}$, obj_1^{U^{t+1}} = obj_1^t ∪ obj_4^{∆U} = {x1, x6, x7, x8, x15} and Index_1^{U^{t+1}} = Index_1^t. Since reg_1^t = reg_4^{∆U}, reg_1^{U^{t+1}} = reg_1^t = P.
b) Since ¬∃$\vec{E}_{kc}^{∆U}$ s.t. $\vec{E}_{2c}^t = \vec{E}_{kc}^{∆U}$, obj_2^{U^{t+1}} = obj_2^t, Index_2^{U^{t+1}} = Index_2^t, reg_2^{U^{t+1}} = reg_2^t = B.
c) Since $\vec{E}_{3c}^t = \vec{E}_{1c}^{∆U}$, obj_3^{U^{t+1}} = obj_3^t ∪ obj_1^{∆U} = {x3, x4, x10, x11} and Index_3^{U^{t+1}} = Index_3^t. Since reg_3^t ≠ reg_1^{∆U} and P(X^{t+1}|E_3^{U^{t+1}}) = 0.5, reg_3^{U^{t+1}} = B.
d) Since $\vec{E}_{4c}^t = \vec{E}_{2c}^{∆U}$, obj_4^{U^{t+1}} = obj_4^t ∪ obj_2^{∆U} = {x9, x12} and Index_4^{U^{t+1}} = Index_4^t. Since reg_4^t ≠ reg_2^{∆U} and P(X^{t+1}|E_4^{U^{t+1}}) = 0.5, reg_4^{U^{t+1}} = B.
e) Since ¬∃$\vec{E}_{ic}^t$ s.t. $\vec{E}_{3c}^{∆U} = \vec{E}_{ic}^t$, let i = l^t + 1 = 5; then obj_5^{U^{t+1}} = obj_3^{∆U} = {x13, x14}, Index_5^{U^{t+1}} = Index_3^{∆U}, and reg_5^{U^{t+1}} = reg_3^{∆U} = P.

That is,

M_E^{U^{t+1}} =
  [ x1   {x1, x6, x7, x8, x15}  P ]
  [ x2   {x2, x5}               B ]
  [ x3   {x3, x4, x10, x11}     B ]
  [ x9   {x9, x12}              B ]
  [ x13  {x13, x14}             P ]

iii. Compute M_E^{t+1}.
a) Since E_1^{U^{t+1}}/R_{{a5,a6}} = {{x1, x7, x15}, {x6, x8}}, P(X^{t+1}|{x1, x7, x15}) = 1 > 0.7 gives reg_1^{t+1} = P, and 0.3 < P(X^{t+1}|{x6, x8}) = 0.5 < 0.7 gives reg_6^{t+1} = B. Then $\vec{E}_1^{t+1}$ = (x1, {x1, x7, x15}, P), $\vec{E}_6^{t+1}$ = (x6, {x6, x8}, B), POS_(α,β)(X^{t+1}) = {x1, x7, x15}, BND_(α,β)(X^{t+1}) = {x6, x8}.
b) Since E_2^{U^{t+1}}/R_{{a5,a6}} = E_2^{U^{t+1}}, $\vec{E}_2^{t+1}$ = $\vec{E}_2^{U^{t+1}}$ = (x2, {x2, x5}, B), and BND_(α,β)(X^{t+1}) = {x2, x5, x6, x8}.
c) Since E_3^{U^{t+1}}/R_{{a5,a6}} = {{x3, x10, x11}, {x4}} and reg_3^{U^{t+1}} = B: 0.3 < P(X^{t+1}|{x3, x10, x11}) = 0.67 < 0.7, so reg_3^{t+1} = B, i.e., $\vec{E}_3^{t+1}$ = (x3, {x3, x10, x11}, B) and BND_(α,β)(X^{t+1}) = BND_(α,β)(X^{t+1}) ∪ {x3, x10, x11} = {x2, x3, x5, x6, x8, x10, x11}. Because P(X^{t+1}|{x4}) = 0 < 0.3, reg_7^{t+1} = N and $\vec{E}_7^{t+1}$ = (x4, {x4}, N).
d) Since E_4^{U^{t+1}}/R_{{a5,a6}} = E_4^{U^{t+1}}, $\vec{E}_4^{t+1}$ = $\vec{E}_4^{U^{t+1}}$ = (x9, {x9, x12}, B), and BND_(α,β)(X^{t+1}) = BND_(α,β)(X^{t+1}) ∪ {x9, x12} = {x2, x3, x5, x6, x8, x9, x10, x11, x12}.
e) Since E_5^{U^{t+1}}/R_{{a5,a6}} = E_5^{U^{t+1}}, $\vec{E}_5^{t+1}$ = $\vec{E}_5^{U^{t+1}}$ = (x13, {x13, x14}, P), and POS_(α,β)(X^{t+1}) = POS_(α,β)(X^{t+1}) ∪ {x13, x14} = {x1, x7, x13, x14, x15}.

That is,

M_E^{t+1} =
  [ x1   {x1, x7, x15}      P ]
  [ x2   {x2, x5}           B ]
  [ x3   {x3, x10, x11}     B ]
  [ x9   {x9, x12}          B ]
  [ x13  {x13, x14}         P ]
  [ x6   {x6, x8}           B ]
  [ x4   {x4}               N ]

with POS_(α,β)(X^{t+1}) = {x1, x7, x13, x14, x15} and BND_(α,β)(X^{t+1}) = {x2, x3, x5, x6, x8, x9, x10, x11, x12}.

iv. Output $\underline{apr}_{(\alpha,\beta)}(X^{t+1})$ = POS_(α,β)(X^{t+1}) = {x1, x7, x13, x14, x15} and $\overline{apr}_{(\alpha,\beta)}(X^{t+1})$ = POS_(α,β)(X^{t+1}) ∪ BND_(α,β)(X^{t+1}) = {x1, x2, x3, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15}.
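The final regions can be cross-checked non-incrementally with a few lines of Python (our sketch, reusing equivalence_classes and dtrs_regions from Section 2): granulate the rows of Table 2 over a1..a6 directly and compare.

```python
# Brute-force check of the example; objects are named x01..x15 (our
# renaming) so that sorted() orders them naturally.
rows = {  # object: (a1, a2, a3, a4, a5, a6, d), copied from Table 2
    "x01": (1, 0, 0, 1, 0, 0, 0), "x02": (0, 1, 0, 0, 1, 1, 0),
    "x03": (1, 1, 0, 0, 0, 1, 0), "x04": (1, 1, 0, 0, 0, 0, 1),
    "x05": (0, 1, 0, 0, 1, 1, 1), "x06": (1, 0, 0, 1, 0, 1, 1),
    "x07": (1, 0, 0, 1, 0, 0, 0), "x08": (1, 0, 0, 1, 0, 1, 0),
    "x09": (0, 0, 0, 1, 0, 0, 1), "x10": (1, 1, 0, 0, 0, 1, 0),
    "x11": (1, 1, 0, 0, 0, 1, 1), "x12": (0, 0, 0, 1, 0, 0, 0),
    "x13": (0, 1, 1, 0, 1, 0, 0), "x14": (0, 1, 1, 0, 1, 0, 0),
    "x15": (1, 0, 0, 1, 0, 0, 0),
}
X = {o for o, r in rows.items() if r[-1] == 0}      # X^{t+1}: d = 0
classes = equivalence_classes({o: r[:-1] for o, r in rows.items()})
pos, bnd, neg = dtrs_regions(classes, X, alpha=0.7, beta=0.3)
print(sorted(pos))  # ['x01', 'x07', 'x13', 'x14', 'x15'], as in step iv
```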

6 EXPERIMENTAL EVALUATION

Extensive experiments are carried out to verify the effectiveness of the proposed method when objects and attributes are added simultaneously under the framework of DTRS. Eight data sets, listed in Table 6, are downloaded from the UC Irvine Machine Learning Database Repository (www.ics.uci.edu/~mlearn/MLRepository.html). All attribute types are nominal. We perform the experiments on a computer with an Intel Core 2 Duo T6500 2.10 GHz CPU and 4.0 GB of memory, running Microsoft Windows Vista Home Basic. The algorithm is developed in C#.

TABLE 6: A description of data sets

Data set                                    Abbreviation  Attributes  Rows  Classes  Missing values
Chess (King-Rook vs. King-Pawn)             Chess         36          3196  2        No
Splice-junction Gene Sequences              Splice        61          3190  2        No
Optical Recognition of Handwritten Digits   Optdigits     65          3823  9        No
Statlog (Landsat Satellite)                 Statlog       38          4435  7        No
US Census Data (1990)                       USCensus      69          4999  10       No
Insurance Company Benchmark (COIL 2000)     Insurance     87          5822  10       No
Musk (Version 2)                            Musk          168         6598  2        No
Connect-4                                   Connect-4     44          6756  3        No

6.1 Comparative experiments with the increasing size of data sets

In this subsection, we compare the computation time of Algorithm UAGOAAS with that of the non-incremental updating method when objects and attributes are added simultaneously on the 8 real-life data sets shown in Table 6. We take out 10%, 20%, ..., 90% of the objects and attributes of each data set listed in Table 6 as test set 1, test set 2, etc. Then data from the rest of each data set, whose size is 5% of the test set, is added to the test set. The comparative results are depicted in Figure 3, where the x-coordinate is the number of the test set and the y-coordinate is the logarithm of the computation time of the two algorithms; square lines denote the computation time of the non-incremental updating method, and circle lines denote that of Algorithm UAGOAAS.

Fig. 3: A comparison of UAGOAAS and the non-incremental updating method when objects and attributes are added simultaneously in DTRS. Panels (a)-(h): Chess, Connect-4, Insurance, Musk, Optdigits, Splice, Statlog, USCensus.

The computation time of Algorithm UAGOAAS is lower than that of the non-incremental updating method when objects and attributes are added simultaneously. With the increase of the size of the test sets, the computation times of both Algorithm UAGOAAS and the non-incremental updating method increase. In Table 7, we list the speed-up ratios of the experiments on different data sets with different test sets. From Table 7, it is clear that Algorithm UAGOAAS is effective in maintaining approximations w.r.t. objects and attributes added simultaneously.

TABLE 7: Speed-up ratio w.r.t. different test sets

Test set  Chess      Connect-4   Insurance  Musk       Optdigits  Splice    Statlog   USCensus
1         4.121050   15.960940   4.838566   10.862227  2.741677   2.818100  3.492651  4.013137
2         7.128718   8.481307    4.857986   5.740119   5.383915   2.409246  2.717219  2.629266
3         2.990636   4.329740    5.584297   2.706809   4.155330   4.696177  4.152685  4.103433
4         6.142953   4.739208    3.452812   2.980205   4.865840   3.070611  3.327443  3.357445
5         10.964786  7.350014    4.118697   3.262777   4.621711   4.196029  3.216701  3.286038
6         7.404989   12.347331   3.590970   3.910592   4.675433   3.524953  3.157294  2.878476
7         8.192254   11.082535   3.310550   2.831588   4.266656   3.076634  3.066948  3.487950
8         7.143551   6.632745    3.500533   3.528354   4.341512   3.200972  3.313112  3.770010
9         6.422111   5.350532    3.647401   —          4.746722   3.225922  —         3.248472
Average   6.723450   8.474928    4.100201   4.477834   4.422088   3.357627  3.305507  3.419359

6.2 Experiments on the case of increasing number of added objects and attributes

In this subsection, we investigate the case where the test sets keep unchanged and the amount of added objects and attributes varies. We want to answer the question of to what extent the proposed incremental method remains effective. We take out 50% of the objects and attributes of each data set as the test set. Then data from the rest of each data set, whose size ranges from 10% to 90% of the test set in steps of 10%, is added to the test set. The results are depicted in Figure 4, where the x-coordinate is the size of the objects and attributes added to the test set (from 10% to 90% of the test set) and the y-coordinate is the computation time of the two algorithms; square lines denote the non-incremental updating method, and circle lines denote Algorithm UAGOAAS.

Fig. 4: A comparison of computation time when the amount of added objects and attributes changes but the test sets keep unchanged. Panels (a)-(h) follow the same data set order as Figure 3.

From Figure 4, Algorithm UAGOAAS becomes ineffective on some data sets, e.g., Musk, Optdigits, and Splice, when the objects and attributes added exceed about 40% (see Figures 4d, 4e, 4f). For data sets such as Statlog and USCensus, Algorithm UAGOAAS becomes ineffective when the objects and attributes added reach about 30% (see Figures 4g, 4h). Algorithm UAGOAAS still works well on Chess, Connect-4, and Insurance even when about 90% of the objects and attributes are added (see Figures 4a, 4b, 4c). There is a fluctuation on the Musk data set when 70% of the objects and attributes are added (see Figure 4d).

In Figure 5, we depict the speed-up ratio of the eight data sets when adding different amounts of objects and attributes. The speed-up ratio of Algorithm UAGOAAS decreases with the increase of added objects and attributes on Splice, Statlog, USCensus, Insurance, Musk, and Connect-4. For Chess and Optdigits, there are some fluctuations in the speed-up ratio due to uncertain changes of the granularity of the equivalence classes, the approximation properties of the information system, and the variation of the concept X.

Fig. 5: Speed-up ratio while adding different amounts of data

6.3 Experiments on the case of relatively small additions of objects and attributes

In this subsection, we test the performance of Algorithm UAGOAAS in its most favorable setting. We carry out two types of tests: one where a single object and a single attribute are inserted into the information system, and another where 1% of the objects and attributes are added. We take out 90% of the objects and attributes of the eight data sets in Table 6 as test sets. The results are depicted in Figure 6, which shows the speed-up ratio of Algorithm UAGOAAS in these two cases. From Figure 6, we can see that the performance of Algorithm UAGOAAS is very good w.r.t. the addition of a relatively small amount of objects and attributes, especially in the case when a single object and a single attribute are added.

Fig. 6: Speed-up ratio while adding a relatively small amount of objects and attributes

6.4 Experiments under the variation of α and β

In DTRS, the lower and upper approximations are affected by the pair of parameters α and β. Do these two parameters affect the computation time of the incremental updating method? In this subsection, we carry out experiments to test the computation time under different α and β. 90% of the objects and attributes are taken out from seven data sets in Table 6 as test sets. The size of the data added is 5% of the test sets, drawn from the rest of the data sets. The results are depicted in Figure 7, where the x-coordinate pertains to the different α and β parameters and the y-coordinate is the computation time. Clearly, the computation times fluctuate only a little with different α and β when objects and attributes are inserted simultaneously into the universe in Algorithm UAGOAAS. There is no regular relation between the variation of α and β and the computation time of incrementally updating the approximations, since the complexity of the computation is almost the same for different α and β except for some random factors.

Fig. 7: A comparison of the incremental algorithm with different α and β

6.5 Computational complexity analysis

Let S^t = (U^t, C^t ∪ D^t, V^t, f^t) be an information system S at time t. The approximations in DTRS are affected by two factors, i.e., the conditional probability and the thresholds α and β. The computation of approximations includes three parts, i.e., the time used to compute the equivalence classes, the concept X^t ∈ U^t/R_{D^t}, and the conditional probability. Let S^{t+1} = (U^{t+1}, C^{t+1} ∪ D^{t+1}, V^{t+1}, f^{t+1}) be the information system at time t+1, where U^{t+1} = U^t ∪ ∆U, C^{t+1} = C^t ∪ ∆C, D^{t+1} = D^t, X^{t+1} ∈ U^{t+1}/R_{D^{t+1}}. In the non-incremental updating, the computational complexity of computing the equivalence classes U^{t+1}/R_{C^{t+1}} is O(|U^{t+1}|(|U^{t+1}|+1)|C^{t+1}|/2), that of computing the concept X^{t+1} is O(|U^{t+1}|(|U^{t+1}|+1)|D^{t+1}|/2), and that of computing the approximations is O(|U^{t+1}/R_{C^{t+1}}| · |X^{t+1}|). Then, the computational complexity of the non-incremental updating is O(|U^{t+1}|(|U^{t+1}|+1)(|C^{t+1}|+|D^{t+1}|)/2 + |U^{t+1}/R_{C^{t+1}}| · |X^{t+1}|).

Algorithm UAGOAAS includes four steps, i.e., the computation of X^{∆U}, M_E^{∆U}, M_E^{U^{t+1}}, and M_E^{t+1}, respectively, and the approximations are calculated during the course of generating M_E^{t+1}. The computational complexity of computing X^{∆U} is |∆U|(|∆U|+1)|D^t|/2, that of computing M_E^{∆U} is |∆U|(|∆U|+1)|C^t|/2, that of computing M_E^{U^{t+1}} is |U^t/R_{C^t}| · |∆U/R_{C^t}|, and that of computing M_E^{t+1} is |U^{t+1}/R_{C^{t+1}}| · |∆C|. Therefore, the computational complexity of Algorithm UAGOAAS is O(|∆U|(|∆U|+1)(|C^t|+|D^t|)/2 + |U^t/R_{C^t}| · |∆U/R_{C^t}| + |U^{t+1}/R_{C^{t+1}}| · |∆C|), which is clearly less than that of the non-incremental updating method.
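To get a feel for the gap between the two bounds, here is a back-of-envelope instance in Python; all sizes are assumed, purely for illustration, and do not come from the experiments above.

```python
# Hypothetical sizes: |U^t| = 5000, |dU| = 250, |C^t| = 40, |dC| = 2,
# |D| = 1, and a few hundred equivalence classes.
U, dU, C, dC, D = 5000, 250, 40, 2, 1
gran_t, gran_du, gran_new = 400, 50, 450   # |U^t/R|, |dU/R|, |U^{t+1}/R|
X_new = 2500                               # |X^{t+1}|

non_incremental = (U + dU) * (U + dU + 1) * (C + dC + D) // 2 + gran_new * X_new
incremental = dU * (dU + 1) * (C + D) // 2 + gran_t * gran_du + gran_new * dC
print(non_incremental, incremental)  # roughly 5.9e8 versus 1.3e6 basic steps
```

Under these assumed sizes the incremental bound is smaller by more than two orders of magnitude, which is consistent with the speed-up ratios observed in Table 7 being largest when the added portion is small.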
7 CONCLUSION

In this paper, the dynamic relation between granules and approaches for updating approximations incrementally in DTRS were first studied based on a new concept, the equivalence feature matrix. Then, Algorithm UAGOAAS was proposed for updating approximations in DTRS for dynamic data mining by using equivalence feature matrices when objects and attributes are inserted into the information system simultaneously; this avoids recomputing approximations directly, which is time-consuming. We carried out extensive experiments to verify the effectiveness of the proposed algorithms. Experimental results indicate that Algorithm UAGOAAS is effective in maintaining knowledge when the objects and attributes in the information system change dynamically. This method is the first effort to efficiently update approximations w.r.t. objects and attributes added simultaneously in DTRS. However, missing values may appear when the data set varies with time, so future investigation will focus on accommodating the proposed approaches to incomplete information systems. In addition, the experimental results show that the presented algorithm has good scalability with the increasing size of objects and attributes, but things get challenging when the data is too big; new policies, e.g., data pipes [55][75] and parallelizing techniques [72], will be very useful for the development of a scalable algorithm, and this will be another future work. Moreover, it is worthwhile to investigate the variation of existing rules for evolving information systems, which is one of the crucial issues in incremental learning.

ACKNOWLEDGMENT

This work is supported by the National Science Foundation of China (Nos. 61175047, 61100117, 61170111, 71201133) and NSAF (No. U1230117), the Youth Social Science Foundation of the Chinese Education Commission (No. 11YJC630127), the Scientific Research Foundation of Sichuan Provincial Education Department (No. 13ZB0210), the 2013 Doctoral Innovation Funds of Southwest Jiaotong University, the Fundamental Research Funds for the Central Universities (SWJTU11ZT08, SWJTU12CX091), and the Beijing Key Laboratory of Traffic Data Analysis and Mining (BKLTDAM2014001).

REFERENCES

[1] F.A. Batarseh, A.J. Gonzalez, "Incremental lifecycle validation of knowledge-based systems through CommonKADS", IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 43, no. 3, pp. 643-654, 2013.
[2] J. Blaszczynski, R. Slowinski, "Incremental induction of decision rules from dominance-based rough approximations", Electronic Notes in Theoretical Computer Science, vol. 82, no. 4, pp. 40-51, 2003.
[3] J.G. Bazan, M.S. Szczuka, "RSES and RSESlib: a collection of tools for rough set computations", in Proceedings of the 2nd International Conference on Rough Sets and Current Trends in Computing, pp. 106-113, 2000.
[4] S. Calegari, D. Ciucci, "Granular computing applied to ontologies", International Journal of Approximate Reasoning, vol. 51, no. 4, pp. 391-409, 2010.
[5] J. Cao, Z. Wu, J.J. Wu, H. Xiong, "SAIL: Summation-based incremental learning for information-theoretic text clustering", IEEE Transactions on Cybernetics, vol. 43, no. 2, pp. 570-584, 2013.
[6] M. Cheng, B. Fang, Y.Y. Tang, T.P. Zhang, J. Wen, "Incremental embedding and learning in the local discriminant subspace with application to face recognition", IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 40, no. 5, pp. 580-591, 2010.
[7] H.M. Chen, T.R. Li, D. Ruan, "Maintenance of approximations in incomplete ordered decision systems", Knowledge-Based Systems, vol. 31, pp. 140-161, 2012.
[8] H.M. Chen, T.R. Li, S.J. Qiao, D. Ruan, "A rough sets based dynamic maintenance approach for approximations in coarsening and refining attribute values", International Journal of Intelligent Systems, vol. 25, no. 10, pp. 1005-1026, 2010.
[9] H.M. Chen, T.R. Li, D. Ruan, J.H. Lin, C.X. Hu, "A rough-set based incremental approach for updating approximations under dynamic maintenance environments", IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 2, pp. 274-284, 2013.
[10] Y. Cheng, "The incremental method for fast computing the rough fuzzy approximations", Data & Knowledge Engineering, vol. 70, no. 1, pp. 84-100, 2011.
[11] D. Ciucci, "Classification of rough sets dynamics", in Proceedings of RSCTC'10, LNAI 6086, pp. 257-266, 2010.
[12] R.J. Dong, W. Pedrycz, "A granular time series approach to long-term forecasting and trend forecasting", Physica A, vol. 387, no. 13, pp. 3253-3270, 2008.
[13] Y.N. Fan, T.L. Tseng, C.C. Chern, C.C. Huang, "Rule induction based on an incremental rough set", Expert Systems with Applications, vol. 36, no. 9, pp. 11439-11450, 2009.
[14] J.W. Grzymala-Busse, "LERS: A system for learning from examples based on rough sets", in Intelligent Decision Support: Handbook of Applications and Advances of the Rough Set Theory, R. Slowinski (Ed.), Kluwer Academic Publishers, vol. 11, pp. 3-18, 1992.
[15] J.W. Grzymala-Busse, "A new version of the rule induction system LERS", Fundamenta Informaticae, vol. 31, no. 1, pp. 27-39, 1997.
[16] J.D. Gunn, J.W. Grzymala-Busse, "Global temperature stability by rule induction: An interdisciplinary bridge", Human Ecology, vol. 22, no. 1, pp. 59-81, 1994.
[17] Q.H. Hu, L. Zhang, D.G. Chen, W. Pedrycz, D.R. Yu, "Gaussian kernel based fuzzy rough sets: Model, uncertainty measures and applications", International Journal of Approximate Reasoning, vol. 51, no. 4, pp. 453-471, 2010.

[18] G. Ilczuk, R. Mlynarski, A. Wakulicz-Deja, A. Drzewiecka, W. Kargul, "Rough set techniques for medical diagnosis systems", in Proceedings of Computers in Cardiology, pp. 837-840, 2005.
[19] M. Inuiguchi, T. Miyajima, "Rough set based rule induction from two decision tables", European Journal of Operational Research, vol. 181, pp. 1540-1553, 2007.
[20] M. Karasuyama, I. Takeuchi, "Multiple incremental decremental learning of support vector machines", IEEE Transactions on Neural Networks, vol. 21, no. 7, pp. 1048-1059, 2010.
[21] W. Kotlowski, K. Dembczynski, S. Greco, R. Slowinski, "Stochastic dominance-based rough set model for ordinal classification", Information Sciences, vol. 178, pp. 4019-4037, 2008.
[22] J. Komorowski, A. Ohrn, A. Skowron, "The ROSETTA rough set software system", in Handbook of Data Mining and Knowledge Discovery, Oxford University Press, 2002.
[23] J. Komorowski, A. Ohrn, A. Skowron, ROSETTA: A rough set toolkit for analysis of data. Available: http://www.lcb.uu.se/tools/rosetta/
[24] J. Lawry, Y.C. Tang, "Granular knowledge representation and inference using labels and label expressions", IEEE Transactions on Fuzzy Systems, vol. 18, no. 3, pp. 500-514, 2010.
[25] S.Y. Li, T.R. Li, D. Liu, "Incremental updating approximations in dominance-based rough sets approach under the variation of the attribute set", Knowledge-Based Systems, vol. 40, pp. 17-26, 2013.
[26] T.R. Li, D. Ruan, W. Geert, J. Song, Y. Xu, "A rough sets based characteristic relation approach for dynamic attribute generalization in data mining", Knowledge-Based Systems, vol. 20, no. 5, pp. 485-494, 2007.
[27] H.X. Li, X.Z. Zhou, "Risk decision making based on decision-theoretic rough set: A three-way view decision model", International Journal of Computational Intelligence Systems, vol. 4, pp. 1-11, 2011.
[28] P. Lingras, M. Chen, D.Q. Miao, "Rough cluster quality index based on decision theory", IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 7, pp. 1014-1026, 2009.
[29] D. Liu, T.R. Li, D. Ruan, "Probabilistic model criteria with decision-theoretic rough sets", Information Sciences, vol. 181, no. 17, pp. 3709-3722, 2011.
[30] D. Liu, T.R. Li, D. Ruan, W.L. Zou, "An incremental approach for inducing knowledge from dynamic information systems", Fundamenta Informaticae, vol. 94, pp. 245-260, 2009.
[31] C. Luo, T.R. Li, H.M. Chen, D. Liu, "Incremental approaches for updating approximations in set-valued ordered information systems", Knowledge-Based Systems, vol. 50, pp. 218-233, 2013.
[32] R. Mienko, R. Slowinski, J. Stefanowski, R. Susmaga, "RoughFamily: software implementation of rough set based data analysis and rule discovery techniques", in Proceedings of the Fourth International Workshop on Rough Sets, Fuzzy Sets and Machine Discovery, pp. 437-440, 1996.
[33] G. Panoutsos, M. Mahfouf, "A neural-fuzzy modelling framework based on granular computing: Concepts and applications", Fuzzy Sets and Systems, vol. 161, no. 21, pp. 2808-2830, 2010.
[34] Z. Pawlak, A. Skowron, "Rudiments of rough sets", Information Sciences, vol. 177, no. 1, pp. 3-27, 2007.
[35] Z. Pawlak, A. Skowron, "Rough sets: Some extensions", Information Sciences, vol. 177, no. 1, pp. 28-40, 2007.
[36] W. Pedrycz, "A granular-oriented development of functional radial basis function neural networks", Neurocomputing, vol. 72, pp. 420-435, 2008.
[37] W. Pedrycz, "The design of cognitive maps: A study in synergy of granular computing and evolutionary optimization", Expert Systems with Applications, vol. 37, no. 10, pp. 7288-7294, 2010.
[38] W. Pedrycz, A. Bargiela, "An optimization of allocation of information granularity in the interpretation of data structures: Toward granular fuzzy clustering", IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 42, no. 3, pp. 582-590, 2012.
[39] W. Pedrycz, V. Loia, S. Senatore, "Fuzzy clustering with viewpoints", IEEE Transactions on Fuzzy Systems, vol. 18, no. 2, pp. 274-284, 2010.
[40] J.F. Peters, S. Ramanna, M.S. Szczuka, "Towards a line-crawling robot obstacle classification system: A rough set approach", in RSFDGrC 2003, LNAI 2639, Springer, pp. 303-307, 2003.
[41] L. Polkowski, A. Skowron, "Rough mereology: A new paradigm for approximate reasoning", International Journal of Approximate Reasoning, vol. 15, no. 4, pp. 333-365, 1996.

[42] Y.H. Qian, J.Y. Liang, W.Z. Wu, C.Y. Dang, "Information granularity in fuzzy binary GrC model", IEEE Transactions on Fuzzy Systems, vol. 19, no. 2, pp. 253-264, 2011.
[43] Y.H. Qian, J.Y. Liang, Y.Y. Yao, C.Y. Dang, "MGRS: A multi-granulation rough set", Information Sciences, vol. 180, pp. 949-970, 2010.
[44] A. Szladow, "Datalogic/R for databases mining and decision support", in Proceedings of the Second International Workshop on Rough Sets and Knowledge Discovery, pp. 511-513, 1993.
[45] A. Sahiner, T. Yigit, "A study of rough set approach in gastroenterology", Computational and Mathematical Methods in Medicine, pp. 1-7, 2013.
[46] N. Shan, W. Ziarko, "Data-based acquisition and incremental modification of classification rules", Computational Intelligence, vol. 11, pp. 357-370, 1995.
[47] A. Skowron, J. Bazan, M.S. Szczuka, J. Wroblewski, Rough Set Exploration System. Available: http://logic.mimuw.edu.pl/~rses/
[48] A. Skowron, J. Stepaniuk, "Tolerance approximation spaces", Fundamenta Informaticae, vol. 27, no. 2-3, pp. 245-253, 1996.
[49] A.R. Solis, G. Panoutsos, "Granular computing neural-fuzzy modelling: A neutrosophic approach", Applied Soft Computing, vol. 13, no. 9, pp. 4010-4021, 2013.
[50] N. Sengupta, J. Sil, "Evaluation of rough set theory based network traffic data classifier using different discretization method", International Journal of Information and Electronics Engineering, vol. 2, no. 3, pp. 338-341, 2012.
[51] R. Slowinski, J. Stefanowski, "'RoughDAS' and 'RoughClass' software implementations of the rough sets approach", in Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory, R. Slowinski (Ed.), Kluwer Academic Publishers, vol. 11, pp. 445-456, 1992.
[52] K. Varmuza, J.W. Grzymala-Busse, Z.S.H.T. Mroczek, "Comparison of consistent and inconsistent models in biomedical domain: A rough sets approach to melanoma data", in Artificial Intelligence Methods, Gliwice, Poland, pp. 323-328, 2003.
[53] C.Z. Wang, C.X. Wu, D.G. Chen, Q.H. Hu, C. Wu, "Communicating between information systems", Information Sciences, vol. 178, pp. 3228-3239, 2008.
[54] A. Wojna, "Constraint based incremental learning of classification rules", in Rough Sets and Current Trends in Computing, Springer, pp. 428-435, 2001.
[55] M. Wojnarski, "Debellor: Open source modular platform for scalable data mining", in Proceedings of Intelligent Information Systems (IIS) 2009, Krakow, Poland, pp. 669-682, 2009.
[56] W.Z. Wu, Y. Leung, J.S. Mi, "Granular computing and knowledge reduction in formal contexts", IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 10, pp. 1461-1474, 2009.
[57] R.R. Yager, "Participatory learning with granular observations", IEEE Transactions on Fuzzy Systems, vol. 17, pp. 1-13, 2009.
[58] X.P. Yang, J.T. Yao, "Modelling multi-agent three-way decisions with decision theoretic rough sets", Fundamenta Informaticae, vol. 115, pp. 157-171, 2012.
[59] Y.Y. Yao, "A unified framework of granular computing", in Handbook of Granular Computing, Wiley, New York, pp. 401-410, 2008.
[60] Y.Y. Yao, "Probabilistic approaches to rough sets", Expert Systems, vol. 20, no. 5, pp. 287-297, 2003.
[61] Y.Y. Yao, "Information granulation and rough set approximation", International Journal of Intelligent Systems, vol. 16, no. 1, pp. 87-104, 2001.
[62] Y.Y. Yao, "Probabilistic rough set approximations", International Journal of Approximate Reasoning, vol. 49, no. 2, pp. 255-271, 2008.
[63] Y.Y. Yao, "Interpreting concept learning in cognitive informatics and granular computing", IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 4, pp. 855-866, 2009.
[64] Y.Y. Yao, "The superiority of three-way decisions in probabilistic rough set models", Information Sciences, vol. 181, no. 6, pp. 1080-1096, 2011.
[65] Y.Y. Yao, S.K.M. Wong, "A decision theoretic framework for approximating concepts", International Journal of Man-Machine Studies, vol. 37, no. 6, pp. 793-809, 1992.
[66] Y.Y. Yao, Y. Zhao, "Attribute reduction in decision-theoretic rough set models", Information Sciences, vol. 178, no. 17, pp. 3356-3373, 2008.
[67] H. Yu, S.S. Chu, D.C. Yang, "Autonomous knowledge-oriented clustering using decision-theoretic rough set theory", Fundamenta Informaticae, vol. 115, pp. 141-156, 2012.

[68] L.A. Zadeh, "Fuzzy logic = Computing with words", IEEE Transactions on Fuzzy Systems, vol. 4, no. 2, pp. 103-111, 1996.
[69] L.A. Zadeh, "Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic", Fuzzy Sets and Systems, vol. 90, no. 2, pp. 111-127, 1997.
[70] J.B. Zhang, T.R. Li, D. Ruan, D. Liu, "Neighborhood rough sets for dynamic data mining", International Journal of Intelligent Systems, vol. 27, pp. 317-342, 2012.
[71] J.B. Zhang, T.R. Li, D. Ruan, D. Liu, "Rough sets based matrix approaches with dynamic attribute variation in set-valued information systems", International Journal of Approximate Reasoning, vol. 53, pp. 620-635, 2012.
[72] J.B. Zhang, T.R. Li, D. Ruan, Z.Z. Gao, C.B. Zhao, "A parallel method for computing rough set approximations", Information Sciences, vol. 194, pp. 209-223, 2012.
[73] W. Ziarko, "Incremental learning and evaluation of structures of rough decision tables", Transactions on Rough Sets IV, vol. 3700, pp. 162-177, 2005.
[74] W. Ziarko, N. Shan, "KDD-R: A comprehensive system for knowledge discovery in databases using rough sets", in Soft Computing, T.Y. Lin, A.M. Wildberger (Eds.), Springer, pp. 298-301, 1995.
[75] Pipes.py. Available: https://github.com/mwojnars/nifty/blob/master/data/pipes.py

Shi-Jinn Horng received the B.S. degree in Electronics Engineering from National Taiwan Institute of Technology, the M.S. degree in Information Engineering from National Central University, and the Ph.D. degree in Computer Science from National Tsing Hua University, in 1980, 1984, and 1989, respectively. He is currently the Chair and a Distinguished Professor in the Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology. He has published more than 200 research papers and received many awards, notably the Distinguished Research Award from the National Science Council of Taiwan in 2004. His research interests include VLSI design, biometric recognition, image processing, and information security.

Hongmei Chen received the M.S. degree from the University of Electronic Science and Technology of China in 2000, and the Ph.D. degree from Southwest Jiaotong University in 2013. She is currently an Assistant Professor in the School of Information Science and Technology, Southwest Jiaotong University. Her research interests include data mining, pattern recognition, fuzzy sets, and rough sets.

Tianrui Li (SM'10) received the B.S., M.S., and Ph.D. degrees from Southwest Jiaotong University, Chengdu, China, in 1992, 1995, and 2002, respectively. He was a Postdoctoral Researcher with SCK•CEN, Belgium, from 2005 to 2006, and a Visiting Professor with Hasselt University, Belgium, in 2008, the University of Technology, Sydney, Australia, in 2009, and the University of Regina, Canada, in 2014. He is currently a Professor and the Director of the Key Laboratory of Cloud Computing and Intelligent Techniques, Southwest Jiaotong University. He has authored or coauthored more than 150 research papers in refereed journals and conferences. His research interests include big data, cloud computing, data mining, granular computing, and rough sets.

Guoyin Wang received the B.S., M.S., and Ph.D. degrees from Xi'an Jiaotong University, Xi'an, China, in 1992, 1994, and 1996, respectively. He worked at the University of North Texas, USA, and the University of Regina, Canada, as a visiting scholar during 1998-1999. Since 1996, he has been with the Chongqing University of Posts and Telecommunications, where he is currently a Professor, the Director of the Chongqing Key Laboratory of Computational Intelligence, and the Dean of the College of Computer Science and Technology. He was appointed as the Director of the Institute of Electronic Information Technology, Chongqing Institute of Green and Intelligent Technology, CAS, China, in 2011. He is the author of 10 books, the editor of dozens of proceedings of international and national conferences, and has over 200 reviewed research publications. His research interests include rough sets, granular computing, knowledge technology, data mining, neural networks, and cognitive computing.

Chuan Luo received the B.S. degree from the College of Computer Science, Sichuan Normal University, Chengdu, China, in 2009. Currently, he is working toward the Ph.D. degree in the School of Information Science and Technology, Southwest Jiaotong University, China. His research interests include data mining, cloud computing, granular computing and rough sets.
