
Proceedings of the 2002 IEEE International Symposium on Intelligent Control, Vancouver, Canada, October 27-30, 2002

Objects' Combination Based Simple Computation of Attribute Core

Zheng Zheng, Guoyin Wang, Yu Wu
Inst. of Computer Science and Technology, Chongqing Univ. of Posts and Telecommunications, Chongqing 400065, P.R. China
[email protected]

Based on Professor Skowron's discernibility matrix, Xiaohua Hu presented a method for calculating the attribute core of a decision information system [4]. But Dongyi Ye found an error in it with a counterexample and then presented a new method [6]. Hu's and Ye's methods are used in many information reduction algorithms, but their time and space complexities are too high. Moreover, although Dongyi Ye corrected Hu's error, he did not identify the root cause of the error, and his method has higher complexity. We investigate these problems with information entropy and prove that there is a substantial difference between the attribute core in the algebra view and in the information view of rough set theory.

Abstract- Rough set theory is emerging as a new tool for dealing with fuzzy and uncertain data. Attribute reduction is one of the most important parts of rough set theory. The attribute core of a decision table is often the starting point and key of many information reduction procedures. Hu's method for computing the attribute core based on the discernibility matrix was wrong because it ignored some factors. The error was corrected by Dongyi Ye, but his algorithm's complexity was too high. In this paper, we present a new algorithm based on objects' combination. The algorithm corrects Hu's error and has lower complexity than Ye's and Hu's methods.

Key words--Attribute Core, Knowledge Reduction, Discernibility Matrix, Rough Set

For solving these problems, we develop a combination rule. It can be used to reduce a large decision information table (whether consistent or inconsistent) to a smaller consistent one, and we prove that the tables before and after combination have the same attribute core. Based on the combination rule, a new algorithm for calculating the attribute core of a decision table is presented, which has lower complexity than Ye's method and, in most cases, lower time complexity than Hu's method.

I. INTRODUCTION

The classical rough set theory developed by Professor Z. Pawlak is a valid mathematical theory with the ability to deal with imprecise, uncertain and vague information. In recent years, it has been successfully applied in such fields as machine learning, data mining, and knowledge acquisition. In this theory, attribute reduction is one of the most important parts, and several algorithms for it exist at present, in which the attribute core is often the starting point. Hence, computation of the attribute core is key for rough set theory.

II. SOME BASIC NOTIONS OF ROUGH SET

For convenience of description, we first introduce some basic notions of information systems.

Def.1 A decision information system is defined as S = (U, R, V, f), where U is a finite set of objects and R = C ∪ D is a finite set of attributes. C is the condition attribute set and D is the decision attribute set. With every attribute a∈R, a set V of its values is associated. Each attribute has a determining function f: U × R → V.
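Def.1 can be sketched concretely in code. The following toy system (its object names, attributes and values are ours for illustration; they are not the paper's Tables 1 and 2) realizes U, R = C ∪ D and f, together with the indiscernibility partition used throughout the paper:

```python
# A toy decision information system S = (U, R, V, f); all names and
# values here are illustrative, not taken from the paper's tables.
U = ["x1", "x2", "x3", "x4"]
C = ["a", "b"]          # condition attributes
D = ["d"]               # decision attribute
f = {                   # the information function f: U x R -> V
    "x1": {"a": 0, "b": 0, "d": 0},
    "x2": {"a": 0, "b": 0, "d": 1},  # x2 conflicts with x1 on d
    "x3": {"a": 1, "b": 0, "d": 1},
    "x4": {"a": 1, "b": 1, "d": 0},
}

def ind_classes(attrs):
    """Partition U into indiscernible classes with respect to attrs."""
    classes = {}
    for x in U:
        classes.setdefault(tuple(f[x][a] for a in attrs), set()).add(x)
    return list(classes.values())
```

Here ind_classes(C) yields the partition U/C = {{x1, x2}, {x3}, {x4}}.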

Many scholars have made successes in research on the attribute core. Unfortunately, a lot of problems remain to be solved.

This paper is partially supported by the National Science Foundation of P.R. China (No.69803014), the National Climb Program of P.R. China, the Foundation for University Key Teachers by the State Education Ministry of P.R. China (No.GG-520-10617IOOI), the Scientific Research Foundation for Returned Overseas Chinese Scholars by the State Education Ministry of P.R. China, and the Application Science Foundation of Chongqing.

0-7803-7620-X/02/$17.00 © 2002 IEEE

Def.2 POS_P(Q) = ∪_{X∈U/Q} P_(X) is the P positive region of Q, where P and Q are both attribute sets of an information system and P_(X) is the P lower approximation of an object set X.
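Def.2 can be realized directly: an object belongs to POS_P(Q) exactly when its P-indiscernibility class lies wholly inside one class of U/Q. A minimal sketch on a hypothetical table (the data is ours, not the paper's):

```python
# POS_P(Q) computed per Def.2 on a hypothetical table.
f = {
    "x1": {"a": 0, "b": 0, "d": 0},
    "x2": {"a": 0, "b": 0, "d": 1},
    "x3": {"a": 1, "b": 0, "d": 1},
    "x4": {"a": 1, "b": 1, "d": 0},
}
U = list(f)

def partition(attrs):
    """U/IND(attrs): indiscernible classes with respect to attrs."""
    classes = {}
    for x in U:
        classes.setdefault(tuple(f[x][a] for a in attrs), set()).add(x)
    return list(classes.values())

def positive_region(P, Q):
    """POS_P(Q): union of the P-lower approximations of the classes of U/Q."""
    pos = set()
    for X in partition(Q):
        for E in partition(P):
            if E <= X:          # E is contained in X, so E is in P_(X)
                pos |= E
    return pos
```

For this table, POS_{a,b}({d}) = {x3, x4}: the class {x1, x2} is excluded because its members disagree on d.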


Eg.1 With the combination rule, Table 2 can be constructed from Table 1.

Def.3 An attribute a of an information system is said to be relatively dispensable or superfluous in P if POS_P(Q) = POS_{P-{a}}(Q); otherwise, it is relatively indispensable. A relative reduct of P is a set of attributes S ⊆ P such that all attributes a∈P-S are relatively dispensable, all attributes a∈S are relatively indispensable, and POS_S(Q) = POS_P(Q). We use the term Red_D(B) to denote the family of relative reducts of B. Core_D(B) = ∩Red_D(B) is called the D-core of attribute set B.
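Def.3 suggests a direct, if naive, way to obtain the D-core without enumerating reducts: an attribute c∈C is in the core iff deleting it shrinks POS_C(D). A sketch, again on a hypothetical table (attribute c below is constant, hence superfluous):

```python
# D-core via Def.3's dispensability test, on a hypothetical table in
# which condition attribute "c" is constant and therefore superfluous.
f = {
    "x1": {"a": 0, "b": 0, "c": 0, "d": 0},
    "x2": {"a": 0, "b": 0, "c": 0, "d": 1},
    "x3": {"a": 1, "b": 0, "c": 0, "d": 1},
    "x4": {"a": 1, "b": 1, "c": 0, "d": 0},
}
U = list(f)

def partition(attrs):
    classes = {}
    for x in U:
        classes.setdefault(tuple(f[x][a] for a in attrs), set()).add(x)
    return list(classes.values())

def positive_region(P, Q):
    return {x for X in partition(Q) for E in partition(P) if E <= X for x in E}

def core(C, D):
    """All relatively indispensable attributes: removing one shrinks POS_C(D)."""
    full = positive_region(C, D)
    return {c for c in C
            if positive_region([a for a in C if a != c], D) != full}
```

Here core(["a", "b", "c"], ["d"]) = {a, b}. This brute-force check costs one positive-region computation per attribute; the paper's point is precisely that cheaper routes to the core exist.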

TABLE 1 THE DECISION TABLE BEFORE COMBINATION

TABLE 2 THE DECISION TABLE AFTER COMBINATION

Def.4 By d(xi) = card{f(y, D) : y∈[xi]_C} we mean the number of different decision attribute values in the indiscernible class of object xi with respect to condition attribute set C.


Def.5 An object is consistent if all the objects in its indiscernible class with respect to condition attribute set C have the same decision value, otherwise, inconsistent.
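Def.5 (and the table-level Def.6 below) can be checked mechanically: an object is consistent iff every object sharing its condition attribute values also shares its decision value. A small sketch with illustrative data:

```python
# Consistency of objects (Def.5) on a hypothetical table: each row is a
# (condition-values, decision-value) pair.
rows = [
    ({"a": 0, "b": 0}, 0),
    ({"a": 0, "b": 0}, 1),   # same conditions as the row above, different d
    ({"a": 1, "b": 0}, 1),
]
C = ["a", "b"]

def is_consistent_object(i):
    """True iff all objects C-indiscernible from rows[i] agree on the decision."""
    key = tuple(rows[i][0][a] for a in C)
    return len({d for cond, d in rows
                if tuple(cond[a] for a in C) == key}) == 1

def is_consistent_table():
    return all(is_consistent_object(i) for i in range(len(rows)))
```

The first two objects are inconsistent, so this table is inconsistent.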

Def.6 A decision table is consistent if its objects are all consistent; otherwise, it is inconsistent.

III. COMBINATION RULE AND ITS PROPERTIES

With the combination rule, a large decision information table (consistent or inconsistent) can be reduced to a smaller consistent one. According to the method mentioned in [3], any decision table can be transformed into a decision table with only one decision attribute, so we might as well suppose D = {d}.

In the original decision table S, we might as well suppose that: U/C = {X1, X2, ..., Xα}, U/D = {Y1, Y2, ..., Yβ}, where α and β are the numbers of indiscernible classes in U/C and U/D respectively, and C(xi), D(xi) are the sets of xi's condition attribute values and decision attribute value respectively.

In the decision table S', we might as well suppose also: U'/C = {X1', X2', ..., X_{α'}'}, U'/D = {Y1', Y2', ..., Y_{β'}'}, where α' and β' are the numbers of indiscernible classes in U'/C and U'/D respectively, and C(xi'), D(xi') are the sets of xi''s condition attribute values and decision attribute value respectively.



Def.7 (Combination Rule) Given a decision information system S = (U, R, V, f), combine all the objects in each indiscernible class of U/C into one object to form the decision table S'; if d(xi) > 1, then let f(xi', D) = * in decision table S'; that is, U'/C = {{x1'}, {x2'}, ..., {x_{α'}'}}.

Property 3: S' is a consistent decision table (according to Def.6).

Property 4: POS_C(D) = U' in S'.

The main idea of the combination rule is to combine all the objects in each indiscernible class with respect to the condition attributes into one object. If there are conflicts among them, let the decision value of the counterpart in S' be "*", while its condition attribute values are unchanged; otherwise, all its attribute values are unchanged. We regard "*" as a specific value different from any value in V. The following example illustrates how to use the combination rule to generate a new consistent decision system from an original decision system.
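The rule described above admits a short implementation. In this sketch (the data is illustrative, not the paper's Table 1), each C-indiscernible class collapses to one object, and classes with d(xi) > 1 receive the decision value "*":

```python
# A sketch of the combination rule on a hypothetical table: rows are
# (condition-values, decision-value) pairs.
rows = [
    ({"a": 0, "b": 0}, 0),
    ({"a": 0, "b": 0}, 1),   # conflict: same conditions, different decisions
    ({"a": 1, "b": 0}, 1),
    ({"a": 1, "b": 1}, 0),
]
C = ["a", "b"]

def combine(rows):
    """Merge each C-indiscernible class into one object; '*' marks conflicts."""
    merged = {}
    for cond, d in rows:
        merged.setdefault(tuple(cond[a] for a in C), set()).add(d)
    return [(dict(zip(C, key)), ds.pop() if len(ds) == 1 else "*")
            for key, ds in merged.items()]

s_prime = combine(rows)
```

The result has three objects with pairwise distinct condition tuples, so S' is consistent (Property 3) and POS_C(D) = U' (Property 4), with "*" treated as an ordinary decision value distinct from all others.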

IV. THE EQUIVALENCE OF THE ATTRIBUTE CORE OF S AND S'

Theorem 1 Suppose S = (U, R, V, f) is a decision table and S' is obtained from S by the combination rule. Since C-{ci}(xp') = C-{ci}(xp) = C-{ci}(xq) = C-{ci}(xq'), we know xp' is an inconsistent object after deleting attribute ci from S'. In addition, xp' is a consistent object before deleting attribute ci; therefore, the number of inconsistent objects in S' must increase after attribute ci is deleted.

Proof. (⇐) Suppose xp' will be changed from a consistent object into an inconsistent one after attribute ci is deleted from decision table S'. We can deduce from the definition of a consistent object that there is at least one object xq' such that C(xp') ≠ C(xq'), C-{ci}(xp') = C-{ci}(xq') and D(xp') ≠ D(xq'). The necessity can be proved in the following three cases:

(1). If D(xq') = *, we can deduce from the combination rule that there are at least two objects xm and xn in the original decision table S that satisfy C-{ci}(xm) = C-{ci}(xn) = C-{ci}(xq') = C-{ci}(xp') and D(xm) ≠ D(xn). And we can find an object xp such that C(xp) = C(xp'), so C-{ci}(xp) = C-{ci}(xp') = C-{ci}(xm) = C-{ci}(xn). And since D(xm) ≠ D(xn), we know either D(xp) ≠ D(xm) or D(xp) ≠ D(xn). Thus, xp will be an inconsistent object after deleting attribute ci from S. Since D(xq') = * and D(xp') ≠ D(xq'), we have D(xp') ≠ *, and we can deduce from the combination rule that xp is a consistent object before deleting attribute ci. Therefore, the number of inconsistent objects in S must increase after attribute ci is deleted.

(2'). If xq is a consistent object before attribute ci is deleted from S, then we can find an object xq' in decision table S' which satisfies C(xq) = C(xq') and D(xq) = D(xq'); in addition, D(xp) ≠ D(xq) and D(xp) = D(xp'), thus D(xp') ≠ D(xq'). And since C-{ci}(xp') = C-{ci}(xp) = C-{ci}(xq) = C-{ci}(xq'), xp' is an inconsistent object after deleting attribute ci from S'. In addition, xp' is a consistent object before deleting attribute ci, so the number of inconsistent objects in S' must increase after attribute ci is deleted. From (1') and (2'), we can conclude that after some attribute is deleted from S and S', if the number of inconsistent objects in decision table S increases, then the number of inconsistent objects in decision table S' increases too. From all of the above, we can conclude that Theorem 4 holds.

(2). If D(xp') = *, we can prove it in a similar way as (1).

(3). If D(xq') ≠ * and D(xp') ≠ *, we can deduce from the combination rule that there are at least two consistent objects xp and xq such that C(xp) = C(xp'), C(xq) = C(xq'), D(xp) = D(xp'), and D(xq) = D(xq'). And since D(xp') ≠ D(xq'), we know D(xp) ≠ D(xq). In addition, C-{ci}(xp) = C-{ci}(xp') = C-{ci}(xq') = C-{ci}(xq), so xp will be an inconsistent object after deleting attribute ci from S; since xp is a consistent object before the deletion, the number of inconsistent objects in S must increase after attribute ci is deleted.
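The equivalence argued for in this section can be sanity-checked numerically on small tables. The sketch below (toy data and helper names are ours) computes the core of a table S and of its combination S' and confirms they coincide, provided "*" is treated as a value distinct from every genuine decision value, as the rule requires:

```python
# Toy check that Core(S) = Core(S') under the combination rule, with "*"
# treated as a distinct decision value; all data here is illustrative.
rows = [
    ({"a": 0, "b": 0}, 0),
    ({"a": 0, "b": 0}, 1),   # inconsistent pair in S
    ({"a": 1, "b": 0}, 1),
    ({"a": 1, "b": 1}, 0),
]
C = ["a", "b"]

def groups(table, attrs):
    """Indices of table rows grouped by their values on attrs."""
    g = {}
    for i, (cond, _) in enumerate(table):
        g.setdefault(tuple(cond[a] for a in attrs), []).append(i)
    return g

def positive_region(table, attrs):
    """Objects whose attrs-indiscernibility class carries one decision value."""
    return {i for members in groups(table, attrs).values()
            if len({table[j][1] for j in members}) == 1
            for i in members}

def core(table):
    full = positive_region(table, C)
    return {c for c in C
            if positive_region(table, [a for a in C if a != c]) != full}

def combine(table):
    out = []
    for key, members in groups(table, C).items():
        ds = {table[i][1] for i in members}
        out.append((dict(zip(C, key)), ds.pop() if len(ds) == 1 else "*"))
    return out
```

On this table, core(rows) and core(combine(rows)) both equal {a, b}, and the combined table's positive region is all of U' (Property 4).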
