Int. J. Mach. Learn. & Cyber. DOI 10.1007/s13042-015-0407-9
ORIGINAL ARTICLE
Multigranulation decision-theoretic rough sets in incomplete information systems

Hai-Long Yang1,2,3 • Zhi-Lian Guo4

Received: 30 January 2015 / Accepted: 21 July 2015 © Springer-Verlag Berlin Heidelberg 2015

Abstract We study multigranulation decision-theoretic rough sets in incomplete information systems. Based on Bayesian decision procedure, we propose the notions of weighted mean multigranulation decision-theoretic rough sets, optimistic multigranulation decision-theoretic rough sets, and pessimistic multigranulation decision-theoretic rough sets in an incomplete information system. We investigate the relationships between the proposed multigranulation decision-theoretic rough set models and other related rough set models. We also study some basic properties of these models. We give an example to illustrate the application of the proposed models.

Keywords Decision-theoretic rough sets · Multigranulation · Bayesian decision procedure · Multigranulation decision-theoretic rough sets · Incomplete information systems

Hai-Long Yang [email protected]
Zhi-Lian Guo [email protected]

1 College of Mathematics and Information Science, Shaanxi Normal University, Xi’an 710119, People’s Republic of China
2 Department of Computer Science, University of Regina, Regina, SK S4S 0A2, Canada
3 The School of Management, Xi’an Jiaotong University, Xi’an 710049, People’s Republic of China
4 College of Economics, Northwest University of Political Science and Law, Xi’an 710063, People’s Republic of China
1 Introduction

Based on Bayesian decision procedure, Yao and Wong [37] proposed decision-theoretic rough sets (DTRS). The DTRS model covers Pawlak rough sets [18, 19], variable precision rough sets [42], and 0.5-probabilistic rough sets [29] as its submodels. In the DTRS model, the required thresholds can be calculated from a loss function according to the minimum expected overall risk, where the losses are associated with the decision risk. Many researchers have studied theoretical aspects of DTRS. Liang et al. [10] proposed triangular fuzzy DTRS by generalizing the precise values of the loss function to triangular fuzzy numbers. Jia et al. [5] proposed an optimization representation of the DTRS model. Zhou [41] constructed a new formulation of multi-class DTRS. Liu et al. [13] proposed fuzzy interval DTRS by using risk lovers and risk averters strategies. Sun et al. [24] studied the decision-theoretic rough fuzzy set model and its application. Zhao and Hu [39] proposed fuzzy and interval-valued fuzzy DTRS by using a fuzzy probability measure. Li et al. [9] studied an axiomatic characterization of DTRS.

Based on rough set theory, Yao [35, 36] proposed a theory of three-way decisions as a new decision-making method. A basic idea of three-way decisions is to divide a universe into three pair-wise disjoint regions, namely the positive, negative, and boundary regions, based on an evaluation function and a pair of thresholds. Three-way decisions have been applied in many domains, such as decision-making [8], email filtering [4, 40], cluster analysis [11, 38], government decisions [12], and so on. Hu [2] established three-way decision spaces by unifying decision measurement, decision conditions and evaluation functions of three-way decisions. Yang and Yao [33] studied multi-agent three-way decisions with decision-theoretic rough sets.

In order to apply rough set theory to more practical applications, Qian et al. [20] proposed multigranulation
rough sets. Multigranulation rough sets have since attracted the attention of many researchers. Wu and Leung [28] proposed a formal approach to granular computing with multiscale data measured at different levels of granulation, and studied the theory and applications of granular labelled partitions in multi-scale decision information systems. Xu et al. [30] applied multigranulation rough sets to ordered information systems. She and He [23] studied the algebraic and topological structure of multigranulation rough sets. Lin et al. [17] studied feature selection via neighborhood multi-granulation fusion. Liu et al. [14] and Lin et al. [15] generalized multigranulation rough sets based on equivalence relations to rough sets based on coverings [25]. Huang et al. [3] proposed three types of intuitionistic fuzzy multigranulation rough sets, together with a computing method for approximation sets and a reduction method for these models. Lin et al. [16] studied the connection between multigranulation rough set theory and Dempster–Shafer theory. Yang et al. [31] studied test cost sensitive multigranulation rough sets. Qian et al. [22] introduced the notion of multigranulation decision-theoretic rough sets by combining the multigranulation idea and decision-theoretic rough sets.

An incomplete information system is a system with missing data. The Pawlak rough set model can be extended to non-equivalence relations, which can be used in knowledge acquisition and decision making in incomplete information systems [6, 7]. Many studies on multigranulation rough sets and decision-theoretic rough sets have addressed incomplete information systems. Qian et al. [21] proposed incomplete multigranulation rough sets by extending the rough set model based on a tolerance relation to an incomplete rough set model based on multiple tolerance relations. Tripathy et al. [26] proposed the concept of multigranulation intuitionistic fuzzy rough sets in incomplete information systems and studied some related properties. Wang et al. [27] studied multigranulation rough sets in incomplete ordered decision systems. Yang et al. [32] studied decision-theoretic rough sets and three-way decisions in incomplete information systems by total probability.

The study of multigranulation decision-theoretic rough sets in incomplete information systems is still lacking. The purpose of this paper is to study multigranulation decision-theoretic rough sets in incomplete information systems. In other words, we apply multigranulation decision-theoretic rough sets to incomplete information systems. We shall propose three types of multigranulation decision-theoretic rough sets in an incomplete information system based on the Bayesian decision procedure.

The rest of the paper is structured as follows. In the next section, we recall some basic notions about Pawlak rough sets, multigranulation rough sets, Bayesian decision
procedure, and decision-theoretic rough sets. In Sect. 3, we propose three types of multigranulation decision-theoretic rough sets in an incomplete information system based on Bayesian decision procedure. We illustrate the relationships between the multigranulation decision-theoretic rough sets and other related rough sets. Some related properties are also discussed. In Sect. 4, an example is used to illustrate the proposed model. The last section summarizes the conclusions and presents some topics for future research.
2 Preliminaries

In this section, we recall some basic notions used in this paper.

2.1 Pawlak rough sets

An information system is a tuple $I = (U, AT, V, f)$, where $U$ is a finite nonempty set of objects, $AT$ is a nonempty finite set of attributes, $V = V_{AT} = \bigcup_{a \in AT} V_a$, where $V_a$ is a set of values of the attribute $a$, and $f : U \times AT \to V$ is an information function satisfying $f(x, a) \in V_a$ $(\forall x \in U, a \in AT)$. Every nonempty subset $A$ of $AT$ determines an indiscernibility relation (i.e. an equivalence relation), defined as

$$IND(A) = \{(x, y) \in U \times U \mid f(x, a) = f(y, a), \ \forall a \in A\}.$$

The indiscernibility relation $IND(A)$ partitions $U$ into equivalence classes given by

$$U / IND(A) = \{[x]_A \mid x \in U\},$$

where $[x]_A = \{y \in U \mid (x, y) \in IND(A)\}$.

By $IND(A)$, the lower and upper approximations of an arbitrary subset $X$ of $U$ are defined as follows, respectively,

$$\underline{A}(X) = \{x \in U \mid [x]_A \subseteq X\}$$

and

$$\overline{A}(X) = \{x \in U \mid [x]_A \cap X \neq \emptyset\}.$$

The pair $(\underline{A}(X), \overline{A}(X))$ is referred to as a Pawlak rough set [18, 19, 34] with respect to the set of attributes $A$.

2.2 Incomplete information systems and incomplete multigranulation rough sets

In an incomplete information system, attributes’ values for some objects are unknown. In this paper, an incomplete information system is still denoted by $I = (U, AT, V, f)$. It is assumed that an object $x \in U$ possesses only one value for an attribute $a$. A special symbol ‘‘*’’ is used to indicate that the value of an attribute is unknown. The unknown value is just missing, but it does exist, and the real value
must be from the set $V_a \setminus \{*\}$. $I = (U, AT, V, f, D)$ is called an incomplete target information system if values of some attributes in $AT$ are missing and those of all attributes in $D$ are known, where $AT$ is called a conditional attribute set and $D$ is a decision attribute set.

Let $I = (U, AT, V, f)$ be an incomplete information system, $A \subseteq AT$. Kryszkiewicz [6] proposed the following binary relation with respect to $A$, denoted by $R_A$:

$$R_A = \{(x, y) \in U \times U \mid f(x, a) = f(y, a) \lor f(x, a) = * \lor f(y, a) = *, \ \forall a \in A\}.$$

Clearly, $R_A$ is a tolerance relation, i.e. $R_A$ is reflexive and symmetric. Given an incomplete information system $I = (U, AT, V, f)$, $\forall x \in U$, let $R_A(x) = \{y \in U \mid (x, y) \in R_A\}$, which is called the $R_A$-related set of $x$. Obviously, $R_A(x)$ is the maximal set of objects that are possibly indiscernible by $A$ with $x$.

Let $I = (U, AT, V, f)$ be an incomplete information system, $A_1, \ldots, A_m \subseteq AT$. Then, for $X \subseteq U$, the optimistic multigranulation lower approximation $\underline{\sum_{i=1}^{m} A_i}^{O}(X)$ and upper approximation $\overline{\sum_{i=1}^{m} A_i}^{O}(X)$ of $X$ in an incomplete information system are defined as follows,

$$\underline{\sum_{i=1}^{m} A_i}^{O}(X) = \{x \in U \mid R_{A_1}(x) \subseteq X \text{ or } \cdots \text{ or } R_{A_m}(x) \subseteq X\},$$

$$\overline{\sum_{i=1}^{m} A_i}^{O}(X) = \neg\Big(\underline{\sum_{i=1}^{m} A_i}^{O}(\neg X)\Big) = \{x \in U \mid R_{A_1}(x) \cap X \neq \emptyset \text{ and } \cdots \text{ and } R_{A_m}(x) \cap X \neq \emptyset\},$$

where $\neg X$ is the complement of $X$. The pair $\big(\underline{\sum_{i=1}^{m} A_i}^{O}(X), \overline{\sum_{i=1}^{m} A_i}^{O}(X)\big)$ is called an optimistic multigranulation rough set [21].

Let $I = (U, AT, V, f)$ be an incomplete information system, $A_1, \ldots, A_m \subseteq AT$. Then, for $X \subseteq U$, the pessimistic multigranulation lower approximation $\underline{\sum_{i=1}^{m} A_i}^{P}(X)$ and upper approximation $\overline{\sum_{i=1}^{m} A_i}^{P}(X)$ of $X$ in an incomplete information system are defined as follows,

$$\underline{\sum_{i=1}^{m} A_i}^{P}(X) = \{x \in U \mid R_{A_1}(x) \subseteq X \text{ and } \cdots \text{ and } R_{A_m}(x) \subseteq X\},$$

$$\overline{\sum_{i=1}^{m} A_i}^{P}(X) = \neg\Big(\underline{\sum_{i=1}^{m} A_i}^{P}(\neg X)\Big) = \{x \in U \mid R_{A_1}(x) \cap X \neq \emptyset \text{ or } \cdots \text{ or } R_{A_m}(x) \cap X \neq \emptyset\}.$$

The pair $\big(\underline{\sum_{i=1}^{m} A_i}^{P}(X), \overline{\sum_{i=1}^{m} A_i}^{P}(X)\big)$ is called a pessimistic multigranulation rough set [21].

2.3 Bayesian decision procedure

We review the Bayesian decision-theoretic framework [1]. Let $\Omega = \{\omega_1, \ldots, \omega_s\}$ be a finite set of $s$ states, and $\mathcal{A} = \{a_1, \ldots, a_m\}$ be a finite set of $m$ possible actions. Let $P(\omega_j \mid \mathbf{x})$ be the conditional probability of an object $x$ being in state $\omega_j$ given that the object is described by $\mathbf{x}$. Let $\lambda(a_i \mid \omega_j)$ denote the loss, or cost, for taking action $a_i$ when the state is $\omega_j$. For an object with description $\mathbf{x}$, suppose action $a_i$ is taken. Since $P(\omega_j \mid \mathbf{x})$ is the probability that the true state is $\omega_j$ given $\mathbf{x}$, the expected loss associated with taking action $a_i$ is given by:

$$R(a_i \mid \mathbf{x}) = \sum_{j=1}^{s} \lambda(a_i \mid \omega_j) P(\omega_j \mid \mathbf{x}).$$

Given a description $\mathbf{x}$, a decision rule is a function $\tau(\mathbf{x})$ that specifies which action to take. That is, for every $\mathbf{x}$, $\tau(\mathbf{x})$ takes one of the actions $a_1, \ldots, a_m$. The overall risk $\mathbf{R}$ is the expected loss associated with a given decision rule. Since $R(\tau(\mathbf{x}) \mid \mathbf{x})$ is the conditional risk associated with action $\tau(\mathbf{x})$, the overall risk $\mathbf{R}$ is defined by

$$\mathbf{R} = \sum_{\mathbf{x}} R(\tau(\mathbf{x}) \mid \mathbf{x}) P(\mathbf{x}),$$

where the summation is over the set of all possible descriptions of objects. For every $\mathbf{x}$, we select the action which causes minimum conditional risk. Then the overall risk is minimized. If more than one action minimizes $R(a_i \mid \mathbf{x})$, a tie-breaking criterion can be used.

2.4 Decision-theoretic rough sets

Yao [36] proposed an outline of a theory of three-way decisions. Compared with two-way decisions, three-way decisions have a third option, i.e., non-commitment, in addition to acceptance and rejection. For the problem of three-way decisions, the following formal description was given in [36]:

The problem of three-way decisions. Suppose $U$ is a nonempty finite set and $C$ is a finite set of criteria. The problem of three-way decisions is to divide, based on the set of criteria $C$, $U$ into three regions, POS, NEG, and BND, called the positive, negative, and boundary regions, respectively, which satisfy the following conditions:

Condition (1) POS, NEG and BND are pair-wise disjoint,
Condition (2) $POS \cup NEG \cup BND = U$.

If an object belongs to the positive region, then we accept the object. If an object belongs to the negative region, then we reject the object. If an object belongs to the boundary region, then we defer a decision. Rules constructed from
the three regions are associated with different actions and decisions, which lead to the notion of three-way decision rules. The positive rules make decisions of acceptance. The negative rules make decisions of rejection, and the boundary rules make decisions of non-commitment.

Within the frame of three-way decisions, the set of states is given by $\Omega = \{X, \neg X\}$ (where $\neg X$ denotes the complement of $X$), and the set of actions is given by $\mathcal{A} = \{a_P, a_B, a_N\}$, where $a_P$, $a_B$, and $a_N$ represent the three actions in classifying an object $x$, namely, deciding $x \in POS(X)$, deciding $x$ should be further investigated, $x \in BND(X)$, and deciding $x \in NEG(X)$. $\lambda_{PP}$, $\lambda_{BP}$, and $\lambda_{NP}$ denote the losses incurred for taking actions $a_P$, $a_B$, and $a_N$, respectively, when an object belongs to $X$. Similarly, $\lambda_{PN}$, $\lambda_{BN}$, and $\lambda_{NN}$ denote the losses incurred for taking the corresponding actions when the object belongs to $\neg X$. By Bayesian decision procedure, for an object $x$, the expected loss $R(a_\bullet \mid [x])$ associated with taking the individual actions can be expressed as:

$$R(a_P \mid [x]) = \lambda_{PP} P(X \mid [x]) + \lambda_{PN} P(\neg X \mid [x]),$$
$$R(a_N \mid [x]) = \lambda_{NP} P(X \mid [x]) + \lambda_{NN} P(\neg X \mid [x]),$$
$$R(a_B \mid [x]) = \lambda_{BP} P(X \mid [x]) + \lambda_{BN} P(\neg X \mid [x]).$$

The Bayesian decision procedure suggests the following three minimum-risk decision rules:

(P1) If $R(a_P \mid [x]) \le R(a_B \mid [x])$ and $R(a_P \mid [x]) \le R(a_N \mid [x])$, then decide $x \in POS(X)$;
(N1) If $R(a_N \mid [x]) \le R(a_P \mid [x])$ and $R(a_N \mid [x]) \le R(a_B \mid [x])$, then decide $x \in NEG(X)$;
(B1) If $R(a_B \mid [x]) \le R(a_P \mid [x])$ and $R(a_B \mid [x]) \le R(a_N \mid [x])$, then decide $x \in BND(X)$.

By considering a reasonable kind of losses with $0 \le \lambda_{PP} \le \lambda_{BP} < \lambda_{NP}$ and $0 \le \lambda_{NN} \le \lambda_{BN} < \lambda_{PN}$, the decision rules (P1)–(B1) can be expressed concisely as:

(P2) If $P(X \mid [x]) \ge \alpha$ and $P(X \mid [x]) \ge \gamma$, then decide $x \in POS(X)$;
(N2) If $P(X \mid [x]) \le \gamma$ and $P(X \mid [x]) \le \beta$, then decide $x \in NEG(X)$;
(B2) If $P(X \mid [x]) \le \alpha$ and $P(X \mid [x]) \ge \beta$, then decide $x \in BND(X)$;

where

$$\alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}, \quad \gamma = \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{PN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{PP})}, \quad \beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}.$$

If $0 \le \beta < \gamma < \alpha \le 1$, then decision rules (P2)–(B2) can be rewritten as follows:

(P3) If $P(X \mid [x]) \ge \alpha$, then decide $x \in POS(X)$;
(N3) If $P(X \mid [x]) \le \beta$, then decide $x \in NEG(X)$;
(B3) If $\beta < P(X \mid [x]) < \alpha$, then decide $x \in BND(X)$.

Based on the decision rules above, we can obtain the lower and upper approximations of the decision-theoretic rough sets [35, 37] as follows,

$$\underline{PR}(X) = \{x \in U \mid P(X \mid [x]) \ge \alpha\}, \qquad \overline{PR}(X) = \{x \in U \mid P(X \mid [x]) > \beta\}.$$
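The threshold formulas and rules (P3)–(B3) above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `dtrs_thresholds` and `three_way_decision` and the numeric losses are assumptions made for the example, not values or code from the paper. Exact fractions are used so that comparisons at the thresholds are not affected by floating-point rounding.

```python
from fractions import Fraction


def dtrs_thresholds(l_pp, l_bp, l_np, l_pn, l_bn, l_nn):
    """Return (alpha, gamma, beta) computed from the six losses.

    Assumes 0 <= l_pp <= l_bp < l_np and 0 <= l_nn <= l_bn < l_pn,
    so every denominator below is strictly positive.
    """
    l_pp, l_bp, l_np, l_pn, l_bn, l_nn = map(
        Fraction, (l_pp, l_bp, l_np, l_pn, l_bn, l_nn)
    )
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    gamma = (l_pn - l_nn) / ((l_pn - l_nn) + (l_np - l_pp))
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    return alpha, gamma, beta


def three_way_decision(p, alpha, beta):
    """Apply rules (P3), (N3), (B3); only meaningful when beta < alpha."""
    if p >= alpha:
        return "POS"
    if p <= beta:
        return "NEG"
    return "BND"


# Hypothetical losses, chosen only for illustration.
ALPHA, GAMMA, BETA = dtrs_thresholds(l_pp=0, l_bp=2, l_np=8, l_pn=6, l_bn=2, l_nn=0)
```

With these hypothetical losses the thresholds are 2/3, 3/7 and 1/4, so the ordering $0 \le \beta < \gamma < \alpha \le 1$ required by rules (P3)–(B3) holds.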
3 Three types of multigranulation decision-theoretic rough sets

In this section, we will give three types of multigranulation decision-theoretic rough sets in an incomplete information system based on Bayesian decision procedure.

Let $I = (U, AT, V, f)$ be an incomplete information system and $A_1, \ldots, A_m \subseteq AT$. The $m$ tolerance relations $R_{A_1}, \ldots, R_{A_m}$ (induced by $A_1, \ldots, A_m$, respectively) are called $m$ granular structures. Let $\Omega_k = \{X, \neg X\}$ be the set of states for the $k$th granular structure $(k = 1, \ldots, m)$, indicating that an object $x$ is in $X$ and not in $X$, respectively. The set of actions is given by $\mathcal{A} = \{a_P, a_B, a_N\}$, where $a_P$, $a_B$, and $a_N$ represent the three actions in classifying an object $x$, namely, deciding $x \in POS(X)$, deciding $x$ should be further investigated, $x \in BND(X)$, and deciding $x \in NEG(X)$. $\lambda^k_{PP}$, $\lambda^k_{BP}$, and $\lambda^k_{NP}$ denote the losses incurred for taking actions $a_P$, $a_B$, and $a_N$, respectively, when an object $x$ belongs to $X$ for the $k$th granular structure. Similarly, $\lambda^k_{PN}$, $\lambda^k_{BN}$, and $\lambda^k_{NN}$ denote the losses incurred for taking the corresponding actions when the object belongs to $\neg X$. By the Bayesian approach, for the $k$th granular structure, the expected losses of taking actions $a_P$, $a_B$ and $a_N$ for $x$ are given as follows:

$$R(a_P \mid R_{A_k}(x)) = \lambda^k_{PP} P(X \mid R_{A_k}(x)) + \lambda^k_{PN} P(\neg X \mid R_{A_k}(x)),$$
$$R(a_B \mid R_{A_k}(x)) = \lambda^k_{BP} P(X \mid R_{A_k}(x)) + \lambda^k_{BN} P(\neg X \mid R_{A_k}(x)),$$
$$R(a_N \mid R_{A_k}(x)) = \lambda^k_{NP} P(X \mid R_{A_k}(x)) + \lambda^k_{NN} P(\neg X \mid R_{A_k}(x)).$$

Based on these expected losses, we can define different types of multigranulation decision-theoretic rough sets.
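The quantities used per granular structure can be sketched in Python as follows. The sketch makes several assumptions that are not fixed by the text at this point: the conditional probability is estimated by the ratio $|X \cap R_{A_k}(x)| / |R_{A_k}(x)|$, the small table `TABLE` is hypothetical, and the helper names are invented for the example.

```python
from fractions import Fraction

MISSING = "*"  # symbol for an unknown attribute value

# Hypothetical incomplete information table: object -> {attribute: value}.
TABLE = {
    "x1": {"a": 1, "b": 0},
    "x2": {"a": 1, "b": MISSING},
    "x3": {"a": MISSING, "b": 1},
    "x4": {"a": 0, "b": 1},
}


def tolerance_class(table, attrs, x):
    """R_A(x): objects that agree with x on every attribute in attrs,
    where a missing value is compatible with any value."""
    def compatible(u, v):
        return all(
            table[u][a] == table[v][a] or MISSING in (table[u][a], table[v][a])
            for a in attrs
        )
    return {y for y in table if compatible(x, y)}


def cond_prob(table, attrs, x, target):
    """P(X | R_A(x)), estimated here (an assumption) by relative frequency."""
    cls = tolerance_class(table, attrs, x)
    return Fraction(len(cls & target), len(cls))


def expected_losses(p, losses):
    """Expected loss of each action for one granular structure.

    losses maps an action name to (loss if x is in X, loss if x is not in X);
    p is P(X | R_A(x)), so P(not X | R_A(x)) = 1 - p.
    """
    return {a: in_x * p + not_x * (1 - p) for a, (in_x, not_x) in losses.items()}
```

For example, with `X = {"x1", "x2"}`, the call `cond_prob(TABLE, ["a"], "x1", X)` uses the tolerance class `{"x1", "x2", "x3"}` and yields 2/3. The tolerance class is never empty because the relation is reflexive, so the division is always defined.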
3.1 Weighted mean incomplete multigranulation decision-theoretic rough sets

For $m$ granular structures, the expected overall loss of taking actions $a_P$, $a_B$ and $a_N$ for the object $x$ can be computed by the weighted mean idea as follows:
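$$R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \sum_{k=1}^{m} \omega_k \lambda^k_{PP} P(X \mid R_{A_k}(x)) + \sum_{k=1}^{m} \omega_k \lambda^k_{PN} P(\neg X \mid R_{A_k}(x)),$$
$$R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \sum_{k=1}^{m} \omega_k \lambda^k_{BP} P(X \mid R_{A_k}(x)) + \sum_{k=1}^{m} \omega_k \lambda^k_{BN} P(\neg X \mid R_{A_k}(x)),$$
$$R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \sum_{k=1}^{m} \omega_k \lambda^k_{NP} P(X \mid R_{A_k}(x)) + \sum_{k=1}^{m} \omega_k \lambda^k_{NN} P(\neg X \mid R_{A_k}(x)),$$

where $\omega_k$ is the weight of the $k$th granular structure, satisfying $\omega_k \in [0, 1]$ and $\sum_{k=1}^{m} \omega_k = 1$.

For $\lambda^k$, there are two cases. One is that $\lambda^k$ is irrelevant to $k$. Another is that $\lambda^k$ is relevant to $k$. In this paper, we assume that $\lambda^k$ is irrelevant to $k$. That is to say,

$$\lambda^1_{PP} = \cdots = \lambda^m_{PP} = \lambda_{PP}, \quad \lambda^1_{BP} = \cdots = \lambda^m_{BP} = \lambda_{BP}, \quad \lambda^1_{NP} = \cdots = \lambda^m_{NP} = \lambda_{NP},$$
$$\lambda^1_{PN} = \cdots = \lambda^m_{PN} = \lambda_{PN}, \quad \lambda^1_{BN} = \cdots = \lambda^m_{BN} = \lambda_{BN}, \quad \lambda^1_{NN} = \cdots = \lambda^m_{NN} = \lambda_{NN}.$$

By $P(X \mid R_{A_k}(x)) + P(\neg X \mid R_{A_k}(x)) = 1$, we have:

$$R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \lambda_{PP} \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) + \lambda_{PN} \Big(1 - \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x))\Big),$$
$$R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \lambda_{BP} \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) + \lambda_{BN} \Big(1 - \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x))\Big),$$
$$R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \lambda_{NP} \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) + \lambda_{NN} \Big(1 - \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x))\Big).$$

The Bayesian decision procedure suggests the following three minimum-risk decision rules:

(WMIP1) If $R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$ and $R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$, then decide $x \in POS_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X)$,
(WMIN1) If $R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$ and $R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$, then decide $x \in NEG_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X)$,
(WMIB1) If $R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$ and $R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$, then decide $x \in BND_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X)$.

Under the reasonable assumption $0 \le \lambda_{PP} \le \lambda_{BP} < \lambda_{NP}$ and $0 \le \lambda_{NN} \le \lambda_{BN} < \lambda_{PN}$, the decision rules (WMIP1)–(WMIB1) can be expressed concisely as:

(WMIP2) If $\sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \ge \alpha$ and $\sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \ge \gamma$, then decide $x \in POS_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X)$,
(WMIN2) If $\sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \le \gamma$ and $\sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \le \beta$, then decide $x \in NEG_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X)$,
(WMIB2) If $\sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \le \alpha$ and $\sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \ge \beta$, then decide $x \in BND_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X)$,

where

$$\alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}, \quad \gamma = \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{PN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{PP})}, \quad \beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}.$$

If $(\lambda_{PN} - \lambda_{BN})(\lambda_{NP} - \lambda_{BP}) > (\lambda_{BN} - \lambda_{NN})(\lambda_{BP} - \lambda_{PP})$, we get $0 \le \beta < \gamma < \alpha \le 1$, and the rules (WMIP2)–(WMIB2) can be rewritten as follows:

(WMIP3) If $\sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \ge \alpha$, then decide $x \in POS_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X)$,
(WMIN3) If $\sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \le \beta$, then decide $x \in NEG_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X)$,
(WMIB3) If $\beta < \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) < \alpha$, then decide $x \in BND_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X)$.

Furthermore, the weighted mean multigranulation positive, negative, and boundary regions of $X$ are defined as follows, respectively,

$$POS_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X) = \Big\{x \in U \mid \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \ge \alpha\Big\},$$
$$NEG_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X) = \Big\{x \in U \mid \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \le \beta\Big\},$$
$$BND_{\sum_{k=1}^{m} P^{WMI}_{A_k}}(X) = \Big\{x \in U \mid \beta < \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) < \alpha\Big\}.$$

By the three pair-wise disjoint regions above, we can obtain the lower and upper approximations of $X$ as follows:

$$\underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha}(X) = \Big\{x \in U \mid \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \ge \alpha\Big\},$$
$$\overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta}(X) = \Big\{x \in U \mid \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) > \beta\Big\}.$$

Based on it, we have the following definition.

Definition 3.1 Let $I = (U, AT, V, f)$ be an incomplete information system, $A_1, \ldots, A_m \subseteq AT$, and let $P : 2^U \to [0, 1]$ be a probability function defined on the power set $2^U$. With $\alpha, \beta$ as above, for any $X \subseteq U$, we define the lower and upper approximations of $X$ as follows,

$$\underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha}(X) = \Big\{x \in U \mid \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) \ge \alpha\Big\},$$
$$\overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta}(X) = \Big\{x \in U \mid \sum_{k=1}^{m} \omega_k P(X \mid R_{A_k}(x)) > \beta\Big\},$$

where $\omega = (\omega_1, \ldots, \omega_m)$ is the weight vector of $(P(X \mid R_{A_1}(x)), \ldots, P(X \mid R_{A_m}(x)))$ with $\omega_k \in [0, 1]$ $(k = 1, 2, \ldots, m)$ and $\sum_{k=1}^{m} \omega_k = 1$.

The pair $\big(\underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha}(X), \overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta}(X)\big)$ is called a weighted mean multigranulation decision-theoretic rough set. The weighted mean multigranulation decision-theoretic rough set is a special weighted mean multigranulation probabilistic rough set, in which $\alpha, \beta$ can be interpreted and computed by loss or cost based on Bayesian decision procedure.

Qian et al. [22] proposed the mean multigranulation decision-theoretic rough set model in a complete information system. The following remark points out the relationship between them.

Remark 3.1 If $\omega_1 = \cdots = \omega_m = \frac{1}{m}$ and $(U, AT, V, f)$ is a complete information system, then the weighted mean multigranulation decision-theoretic rough set model given in Definition 3.1 degenerates to the mean multigranulation decision-theoretic rough set model [22].

The following properties of the lower approximation operator $\underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha}$ and the upper approximation operator $\overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta}$ can be easily obtained.

Proposition 3.1 Let $I = (U, AT, V, f)$ be an incomplete information system, $A_1, \ldots, A_m \subseteq AT$, and let $P : 2^U \to [0, 1]$ be a probability function defined on the power set $2^U$. For any $X, Y \subseteq U$, we have

(1) $\underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha}(X) \subseteq \overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta}(X)$,
(2) $\underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha}(\emptyset) = \overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta}(\emptyset) = \emptyset$, $\underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha}(U) = \overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta}(U) = U$,
(3) If $X \subseteq Y$, then $\underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha}(X) \subseteq \underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha}(Y)$ and $\overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta}(X) \subseteq \overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta}(Y)$,
(4) $\underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha}(X) = \neg \overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,1-\alpha}(\neg X)$ $(\alpha > 0.5)$, $\overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta}(X) = \neg \underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,1-\beta}(\neg X)$ $(\beta < 0.5)$,
(5) If $0 < \alpha_1 \le \alpha_2 \le 1$ and $0 \le \beta_1 \le \beta_2 < 1$, then $\underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha_2}(X) \subseteq \underline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\alpha_1}(X)$ and $\overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta_2}(X) \subseteq \overline{\sum_{k=1}^{m} P_{A_k}}^{WMI,\beta_1}(X)$.

Proof It follows immediately from Definition 3.1. $\square$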
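A minimal Python sketch of the weighted mean lower and upper approximations of Definition 3.1 is given below. It rests on assumptions that are not part of the definition itself: the conditional probability is estimated by $|X \cap R_{A_k}(x)| / |R_{A_k}(x)|$, and the table, weights, thresholds and function names are hypothetical and chosen only to make the example checkable by hand.

```python
from fractions import Fraction

MISSING = "*"  # symbol for an unknown attribute value

# Hypothetical incomplete information table: object -> {attribute: value}.
TABLE = {
    "x1": {"a": 1, "b": 0},
    "x2": {"a": 1, "b": MISSING},
    "x3": {"a": MISSING, "b": 1},
    "x4": {"a": 0, "b": 1},
}


def tolerance_class(table, attrs, x):
    """R_A(x): objects compatible with x on every attribute in attrs."""
    def compatible(u, v):
        return all(
            table[u][a] == table[v][a] or MISSING in (table[u][a], table[v][a])
            for a in attrs
        )
    return {y for y in table if compatible(x, y)}


def weighted_mean_prob(table, granulations, weights, x, target):
    """Weighted mean over k of P(X | R_{A_k}(x)), with P taken as a
    relative frequency (an assumption of this sketch)."""
    total = Fraction(0)
    for attrs, w in zip(granulations, weights):
        cls = tolerance_class(table, attrs, x)
        total += Fraction(w) * Fraction(len(cls & target), len(cls))
    return total


def weighted_mean_approximations(table, granulations, weights, target, alpha, beta):
    """Lower approximation: weighted mean >= alpha; upper: weighted mean > beta."""
    probs = {
        x: weighted_mean_prob(table, granulations, weights, x, target)
        for x in table
    }
    lower = {x for x, p in probs.items() if p >= alpha}
    upper = {x for x, p in probs.items() if p > beta}
    return lower, upper


# Two granular structures {a} and {b} with equal weights; X = {x1, x2}.
LOWER, UPPER = weighted_mean_approximations(
    TABLE,
    granulations=[["a"], ["b"]],
    weights=[Fraction(1, 2), Fraction(1, 2)],
    target={"x1", "x2"},
    alpha=Fraction(2, 3),
    beta=Fraction(1, 4),
)
```

In this toy example the weighted means are 5/6, 7/12, 5/12 and 1/6 for `x1` to `x4`, so the lower approximation is `{"x1"}`, the upper approximation is `{"x1", "x2", "x3"}`, and `x4` falls in the negative region.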
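3.2 Optimistic multigranulation decision-theoretic rough sets

Liu et al. [13] proposed a three-way decision model of fuzzy interval decision-theoretic rough sets by using risk lovers strategy. We apply this strategy to the incomplete multigranulation case. For risk lovers, their attitude towards risk is optimistic. The expected overall loss of taking actions $a_P$, $a_B$ and $a_N$ for the object $x$ can be computed as follows:

$$R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \bigwedge_{k=1}^{m} \lambda^k_{PP} P(X \mid R_{A_k}(x)) + \bigwedge_{k=1}^{m} \lambda^k_{PN} P(\neg X \mid R_{A_k}(x)),$$
$$R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \bigwedge_{k=1}^{m} \lambda^k_{BP} P(X \mid R_{A_k}(x)) + \bigwedge_{k=1}^{m} \lambda^k_{BN} P(\neg X \mid R_{A_k}(x)),$$
$$R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \bigwedge_{k=1}^{m} \lambda^k_{NP} P(X \mid R_{A_k}(x)) + \bigwedge_{k=1}^{m} \lambda^k_{NN} P(\neg X \mid R_{A_k}(x)).$$

Assume that

$$\lambda^1_{PP} = \cdots = \lambda^m_{PP} = \lambda_{PP}, \quad \lambda^1_{BP} = \cdots = \lambda^m_{BP} = \lambda_{BP}, \quad \lambda^1_{NP} = \cdots = \lambda^m_{NP} = \lambda_{NP},$$
$$\lambda^1_{PN} = \cdots = \lambda^m_{PN} = \lambda_{PN}, \quad \lambda^1_{BN} = \cdots = \lambda^m_{BN} = \lambda_{BN}, \quad \lambda^1_{NN} = \cdots = \lambda^m_{NN} = \lambda_{NN}.$$

By $P(X \mid R_{A_k}(x)) + P(\neg X \mid R_{A_k}(x)) = 1$, we have:

$$R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \lambda_{PP} \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) + \lambda_{PN} \Big(1 - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))\Big),$$
$$R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \lambda_{BP} \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) + \lambda_{BN} \Big(1 - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))\Big),$$
$$R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \lambda_{NP} \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) + \lambda_{NN} \Big(1 - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))\Big).$$

(i) If $\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) = 0$ and $\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) = 1$, then $R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = 0$, which implies that, in this case, it is trivial.

(ii) If $\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) \neq 0$ or $\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) \neq 1$, then the Bayesian decision procedure suggests the following three minimum-risk decision rules:

(OIP1) If $R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$ and $R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$, then decide $x \in POS_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X)$,
(OIN1) If $R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$ and $R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$, then decide $x \in NEG_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X)$,
(OIB1) If $R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$ and $R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x)))$, then decide $x \in BND_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X)$.

By $\frac{b}{a} \ge \frac{d}{c} \iff \frac{b}{a+b} \ge \frac{d}{c+d}$ $(\forall a, b, c, d > 0)$ and the reasonable assumption $0 \le \lambda_{PP} \le \lambda_{BP} < \lambda_{NP}$, $0 \le \lambda_{NN} \le \lambda_{BN} < \lambda_{PN}$, we have the following:

(1) For rule (OIP1):

$$R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}$$

and

$$R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{PN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{PP})}.$$

(2) For rule (OIN1):

$$R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{PN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{PP})}$$

and

$$R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}.$$

(3) For rule (OIB1):

$$R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}$$

and

$$R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}.$$

Therefore, the decision rules (OIP1)–(OIB1) can be expressed concisely as:

(OIP2) If $\frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha$ and $\frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \gamma$, then decide $x \in POS_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X)$,
(OIN2) If $\frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \gamma$ and $\frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \beta$, then decide $x \in NEG_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X)$,
(OIB2) If $\frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \alpha$ and $\frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \beta$, then decide $x \in BND_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X)$,

where

$$\alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}, \quad \gamma = \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{PN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{PP})}, \quad \beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}.$$

If $(\lambda_{PN} - \lambda_{BN})(\lambda_{NP} - \lambda_{BP}) > (\lambda_{BN} - \lambda_{NN})(\lambda_{BP} - \lambda_{PP})$, then we get $0 \le \beta < \gamma < \alpha \le 1$. Thus the rules (OIP2)–(OIB2) can be rewritten as follows:

(OIP3) If $\frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha$, then decide $x \in POS_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X)$,
(OIN3) If $\frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \beta$, then decide $x \in NEG_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X)$,
(OIB3) If $\beta < \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} < \alpha$, then decide $x \in BND_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X)$.

Furthermore, the optimistic multigranulation positive, negative, and boundary regions of $X$ are defined as follows, respectively,

$$POS_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X) = \Big\{x \in U \mid \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha\Big\},$$
$$NEG_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X) = \Big\{x \in U \mid \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \beta\Big\},$$
$$BND_{\sum_{k=1}^{m} P^{OI}_{A_k}}(X) = \Big\{x \in U \mid \beta < \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} < \alpha\Big\}.$$

By the three pair-wise disjoint regions above, we can obtain the lower and upper approximations of $X$ as follows:

$$\underline{\sum_{k=1}^{m} P_{A_k}}^{OI,\alpha}(X) = \Big\{x \in U \mid \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha\Big\},$$
$$\overline{\sum_{k=1}^{m} P_{A_k}}^{OI,\beta}(X) = \Big\{x \in U \mid \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} > \beta\Big\}.$$

Definition 3.2 Let $I = (U, AT, V, f)$ be an incomplete information system, $A_1, \ldots, A_m \subseteq AT$, and let $P : 2^U \to [0, 1]$ be a probability function defined on the power set $2^U$. With $\alpha, \beta$ as above, for any $X \subseteq U$, we define the lower and upper approximations of $X$ as follows,

$$\underline{\sum_{k=1}^{m} P_{A_k}}^{OI,\alpha}(X) = \Big\{x \in U \mid \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha\Big\},$$
$$\overline{\sum_{k=1}^{m} P_{A_k}}^{OI,\beta}(X) = \Big\{x \in U \mid \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} > \beta\Big\}.$$

The pair $\big(\underline{\sum_{k=1}^{m} P_{A_k}}^{OI,\alpha}(X), \overline{\sum_{k=1}^{m} P_{A_k}}^{OI,\beta}(X)\big)$ is called an optimistic multigranulation decision-theoretic rough set. The optimistic multigranulation decision-theoretic rough set is a special optimistic multigranulation probabilistic rough set, in which $\alpha, \beta$ can be interpreted and computed by loss or cost based on Bayesian decision procedure.

The following two remarks point out the relationships between the optimistic multigranulation decision-theoretic rough set model in incomplete information systems and the related rough set models [21, 35, 37].

Remark 3.2 If $\alpha = 1$, $\beta = 0$ and $P(X \mid R_{A_k}(x)) = \frac{|X \cap R_{A_k}(x)|}{|R_{A_k}(x)|}$, then

$$\underline{\sum_{k=1}^{m} P_{A_k}}^{OI,1}(X) = \Big\{x \in U \mid \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge 1\Big\} = \Big\{x \in U \mid \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) \ge 1\Big\}$$
$$= \{x \in U \mid P(X \mid R_{A_1}(x)) = 1 \text{ or } \cdots \text{ or } P(X \mid R_{A_m}(x)) = 1\} = \{x \in U \mid R_{A_1}(x) \subseteq X \text{ or } \cdots \text{ or } R_{A_m}(x) \subseteq X\},$$

$$\overline{\sum_{k=1}^{m} P_{A_k}}^{OI,0}(X) = \Big\{x \in U \mid \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} > 0\Big\} = \Big\{x \in U \mid \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) > 0\Big\}$$
$$= \{x \in U \mid P(X \mid R_{A_1}(x)) > 0 \text{ and } \cdots \text{ and } P(X \mid R_{A_m}(x)) > 0\} = \{x \in U \mid R_{A_1}(x) \cap X \neq \emptyset \text{ and } \cdots \text{ and } R_{A_m}(x) \cap X \neq \emptyset\},$$

which implies that, in this case, the optimistic multigranulation decision-theoretic rough set model in an incomplete information system degenerates to the optimistic multigranulation rough set model [21].

Remark 3.3 If $m = 1$ and $(U, AT, V, f)$ is a complete information system, then the optimistic multigranulation decision-theoretic rough set model in an incomplete information system degenerates to the decision-theoretic rough set model [35, 37].

By Definition 3.2, the following proposition holds.

Proposition 3.2 Let $I = (U, AT, V, f)$ be an incomplete information system, $A_1, \ldots, A_m \subseteq AT$, and let $P : 2^U \to [0, 1]$ be a probability function defined on the power set $2^U$. For any $X \subseteq U$, we have

(1) $\underline{\sum_{k=1}^{m} P_{A_k}}^{OI,\alpha}(X) \subseteq \overline{\sum_{k=1}^{m} P_{A_k}}^{OI,\beta}(X)$,
(2) $\underline{\sum_{k=1}^{m} P_{A_k}}^{OI,\alpha}(\emptyset) = \overline{\sum_{k=1}^{m} P_{A_k}}^{OI,\beta}(\emptyset) = \emptyset$, $\underline{\sum_{k=1}^{m} P_{A_k}}^{OI,\alpha}(U) = \overline{\sum_{k=1}^{m} P_{A_k}}^{OI,\beta}(U) = U$,
(3) $\underline{\sum_{k=1}^{m} P_{A_k}}^{OI,\alpha}(X) = \neg \overline{\sum_{k=1}^{m} P_{A_k}}^{OI,1-\alpha}(\neg X)$ $(\alpha > 0.5)$, $\overline{\sum_{k=1}^{m} P_{A_k}}^{OI,\beta}(X) = \neg \underline{\sum_{k=1}^{m} P_{A_k}}^{OI,1-\beta}(\neg X)$ $(\beta < 0.5)$,
(4) If $0 < \alpha_1 \le \alpha_2 \le 1$ and $0 \le \beta_1 \le \beta_2 < 1$, then $\underline{\sum_{k=1}^{m} P_{A_k}}^{OI,\alpha_2}(X) \subseteq \underline{\sum_{k=1}^{m} P_{A_k}}^{OI,\alpha_1}(X)$ and $\overline{\sum_{k=1}^{m} P_{A_k}}^{OI,\beta_2}(X) \subseteq \overline{\sum_{k=1}^{m} P_{A_k}}^{OI,\beta_1}(X)$.

Proof We only show (3). $\forall\, 1 \ge \alpha > 0.5$,

$$\neg \overline{\sum_{k=1}^{m} P_{A_k}}^{OI,1-\alpha}(\neg X) = \neg \Big\{x \in U \mid \frac{\bigwedge_{k=1}^{m} P(\neg X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(\neg X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(\neg X \mid R_{A_k}(x))} > 1 - \alpha\Big\}$$
$$= \Big\{x \in U \mid \frac{1 - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \le 1 - \alpha\Big\}$$
$$= \Big\{x \in U \mid \frac{\bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha\Big\} = \underline{\sum_{k=1}^{m} P_{A_k}}^{OI,\alpha}(X).$$

By $\underline{\sum_{k=1}^{m} P_{A_k}}^{OI,\alpha}(X) = \neg \overline{\sum_{k=1}^{m} P_{A_k}}^{OI,1-\alpha}(\neg X)$, $\forall\, 0 \le \beta < 0.5$, we have

$$\overline{\sum_{k=1}^{m} P_{A_k}}^{OI,\beta}(X) = \neg\Big(\neg \overline{\sum_{k=1}^{m} P_{A_k}}^{OI,\beta}(\neg(\neg X))\Big) = \neg \underline{\sum_{k=1}^{m} P_{A_k}}^{OI,1-\beta}(\neg X). \qquad \square$$

3.3 Pessimistic multigranulation decision-theoretic rough sets

Liu et al. [13] proposed a three-way decision model of fuzzy interval decision-theoretic rough sets by using risk averters strategy. We apply this strategy to the incomplete multigranulation case. For risk averters, their attitude towards risk is pessimistic, so the expected overall loss of taking actions $a_P$, $a_B$ and $a_N$ for the object $x$ can be computed as follows:

$$R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \bigvee_{k=1}^{m} \lambda^k_{PP} P(X \mid R_{A_k}(x)) + \bigvee_{k=1}^{m} \lambda^k_{PN} P(\neg X \mid R_{A_k}(x)),$$
$$R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \bigvee_{k=1}^{m} \lambda^k_{BP} P(X \mid R_{A_k}(x)) + \bigvee_{k=1}^{m} \lambda^k_{BN} P(\neg X \mid R_{A_k}(x)),$$
$$R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \bigvee_{k=1}^{m} \lambda^k_{NP} P(X \mid R_{A_k}(x)) + \bigvee_{k=1}^{m} \lambda^k_{NN} P(\neg X \mid R_{A_k}(x)),$$

where $\vee$ denotes maximum. We assume that

$$\lambda^1_{PP} = \cdots = \lambda^m_{PP} = \lambda_{PP}, \quad \lambda^1_{BP} = \cdots = \lambda^m_{BP} = \lambda_{BP}, \quad \lambda^1_{NP} = \cdots = \lambda^m_{NP} = \lambda_{NP},$$
$$\lambda^1_{PN} = \cdots = \lambda^m_{PN} = \lambda_{PN}, \quad \lambda^1_{BN} = \cdots = \lambda^m_{BN} = \lambda_{BN}, \quad \lambda^1_{NN} = \cdots = \lambda^m_{NN} = \lambda_{NN}.$$

By $P(X \mid R_{A_k}(x)) + P(\neg X \mid R_{A_k}(x)) = 1$, we have:

$$R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \lambda_{PP} \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) + \lambda_{PN} \Big(1 - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))\Big),$$
$$R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \lambda_{BP} \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) + \lambda_{BN} \Big(1 - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))\Big),$$
$$R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) = \lambda_{NP} \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) + \lambda_{NN} \Big(1 - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))\Big),$$

where $\wedge$ denotes minimum. The Bayesian decision procedure suggests the following three minimum-risk decision rules: (PIP1)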
If RðaP jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ RðaB jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ and RðaP jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ R ðaN jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ, then decide x 2 POSPm PAPI ðXÞ, k¼1
k
123
Int. J. Mach. Learn. & Cyber.
(PIN1)
If RðaN jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ RðaP jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ and RðaN jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ RðaB jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ, then decide x 2 NEGPm PAPI ðXÞ,
(PIB1)
If RðaB jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ RðaP jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ and RðaB jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ RðaN jðRA1 ðxÞ; . . .; RAm ðxÞÞÞ, then decide x 2 BNDPm PAPI ðXÞ.
k¼1
k¼1
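The minimum-risk rules (PIP1)–(PIB1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the loss values are the ones chosen later in Example 4.1, and ties between equal losses are broken in the order P, B, N, which the rules themselves leave open.

```python
from fractions import Fraction

# Loss values shared by all granulations (taken from Example 4.1).
LOSS = {
    "PP": Fraction(0), "BP": Fraction(1, 5), "NP": Fraction(3, 5),
    "NN": Fraction(0), "BN": Fraction(1, 5), "PN": Fraction(3, 5),
}

def pessimistic_losses(probs, loss=LOSS):
    """Expected losses of actions a_P, a_B, a_N for one object x.

    probs holds P(X | R_{A_k}(x)) for k = 1..m.
    """
    hi, lo = max(probs), min(probs)  # the join and meet over the m granulations
    return {
        "P": loss["PP"] * hi + loss["PN"] * (1 - lo),
        "B": loss["BP"] * hi + loss["BN"] * (1 - lo),
        "N": loss["NP"] * hi + loss["NN"] * (1 - lo),
    }

def min_risk_action(probs, loss=LOSS):
    """Return 'P', 'B' or 'N', whichever has the smallest expected loss."""
    risks = pessimistic_losses(probs, loss)
    # min() keeps the first minimal element, so ties resolve as P, then B, then N.
    return min(("P", "B", "N"), key=lambda action: risks[action])
```

For instance, an object whose two conditional probabilities are 1 and 2/3 has expected losses 1/5, 4/15, and 3/5 for the three actions, so the sketch assigns it to the positive region.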
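By $\frac{b}{a} \ge \frac{d}{c} \iff \frac{b}{a+b} \ge \frac{d}{c+d}$ $(\forall a, b, c, d > 0)$ and the reasonable assumption $0 \le \lambda_{PP} \le \lambda_{BP} < \lambda_{NP}$, $0 \le \lambda_{NN} \le \lambda_{BN} < \lambda_{PN}$, we have the following:

(1) For rule (PIP1):

$$R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}$$

and

$$R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{PN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{PP})}.$$

(2) For rule (PIN1):

$$R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{PN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{PP})}$$

and

$$R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}.$$

(3) For rule (PIB1):

$$R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_P \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}$$

and

$$R(a_B \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \le R(a_N \mid (R_{A_1}(x), \ldots, R_{A_m}(x))) \iff \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}.$$

Therefore, the decision rules (PIP1)–(PIB1) can be expressed concisely as:

(PIP2) If $\frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha$ and $\frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \gamma$, then decide $x \in \mathrm{POS}_{\sum_{k=1}^{m} P_{A_k}^{PI}}(X)$,

(PIN2) If $\frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \gamma$ and $\frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \beta$, then decide $x \in \mathrm{NEG}_{\sum_{k=1}^{m} P_{A_k}^{PI}}(X)$,

(PIB2) If $\frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \alpha$ and $\frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \beta$, then decide $x \in \mathrm{BND}_{\sum_{k=1}^{m} P_{A_k}^{PI}}(X)$,

where

$$\alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}, \quad \gamma = \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{PN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{PP})}, \quad \beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}.$$

If $(\lambda_{PN} - \lambda_{BN})(\lambda_{NP} - \lambda_{BP}) > (\lambda_{BN} - \lambda_{NN})(\lambda_{BP} - \lambda_{PP})$, we get $0 \le \beta < \gamma < \alpha \le 1$, and the rules (PIP2)–(PIB2) can be rewritten as follows:

(PIP3) If $\frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha$, then decide $x \in \mathrm{POS}_{\sum_{k=1}^{m} P_{A_k}^{PI}}(X)$,

(PIN3) If $\frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \beta$, then decide $x \in \mathrm{NEG}_{\sum_{k=1}^{m} P_{A_k}^{PI}}(X)$,

(PIB3) If $\beta < \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} < \alpha$, then decide $x \in \mathrm{BND}_{\sum_{k=1}^{m} P_{A_k}^{PI}}(X)$.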
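The step from comparing expected losses to the ratio test can be written out explicitly. The derivation below is a sketch for the first condition of rule (PIP1); the shorthand $u = \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))$ and $v = \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))$ is introduced here only for readability, and the strict inequalities $u > 0$, $v < 1$, and $\lambda_{BP} > \lambda_{PP}$ are assumed so that every division is legitimate. The other five conditions follow in the same way.

```latex
\begin{align*}
R(a_P \mid \cdot) \le R(a_B \mid \cdot)
  &\iff \lambda_{PP}\, u + \lambda_{PN} (1 - v) \le \lambda_{BP}\, u + \lambda_{BN} (1 - v) \\
  &\iff (\lambda_{PN} - \lambda_{BN})(1 - v) \le (\lambda_{BP} - \lambda_{PP})\, u \\
  &\iff \frac{u}{1 - v} \ge \frac{\lambda_{PN} - \lambda_{BN}}{\lambda_{BP} - \lambda_{PP}} \\
  &\iff \frac{u}{1 + u - v} \ge
        \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})} = \alpha .
\end{align*}
```

The last equivalence applies the fraction identity quoted above with $b = u$, $a = 1 - v$, $d = \lambda_{PN} - \lambda_{BN}$, and $c = \lambda_{BP} - \lambda_{PP}$.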
Furthermore, the pessimistic multigranulation positive, negative, and boundary regions of X are defined as follows, respectively,
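$$\mathrm{POS}_{\sum_{k=1}^{m} P_{A_k}^{PI}}(X) = \left\{ x \in U \;\middle|\; \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha \right\},$$

$$\mathrm{NEG}_{\sum_{k=1}^{m} P_{A_k}^{PI}}(X) = \left\{ x \in U \;\middle|\; \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \le \beta \right\},$$

$$\mathrm{BND}_{\sum_{k=1}^{m} P_{A_k}^{PI}}(X) = \left\{ x \in U \;\middle|\; \beta < \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} < \alpha \right\}.$$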
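By the three pair-wise disjoint regions above, we can obtain the lower and upper approximations of $X$ as follows:

$$\underline{\sum_{k=1}^{m} P_{A_k}}^{PI,\alpha}(X) = \left\{ x \in U \;\middle|\; \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha \right\},$$

$$\overline{\sum_{k=1}^{m} P_{A_k}}^{PI,\beta}(X) = \left\{ x \in U \;\middle|\; \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} > \beta \right\}.$$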
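A compact way to see how the thresholds and the three regions fit together is the following Python sketch. It is illustrative only: the function names are hypothetical, the loss values are those of Example 4.1, and exact fractions are used so that boundary cases such as a ratio equal to $\alpha$ are decided without rounding.

```python
from fractions import Fraction

def thresholds(loss):
    """Return (alpha, gamma, beta) computed from the six loss values."""
    alpha = (loss["PN"] - loss["BN"]) / ((loss["PN"] - loss["BN"]) + (loss["BP"] - loss["PP"]))
    gamma = (loss["PN"] - loss["NN"]) / ((loss["PN"] - loss["NN"]) + (loss["NP"] - loss["PP"]))
    beta = (loss["BN"] - loss["NN"]) / ((loss["BN"] - loss["NN"]) + (loss["NP"] - loss["BP"]))
    return alpha, gamma, beta

def pessimistic_ratio(probs):
    """max / (1 + max - min) over the probabilities P(X | R_{A_k}(x))."""
    hi, lo = max(probs), min(probs)
    return hi / (1 + hi - lo)  # the denominator is always at least 1

def region(probs, alpha, beta):
    """Classify one object by rules (PIP3), (PIN3), (PIB3)."""
    r = pessimistic_ratio(probs)
    if r >= alpha:
        return "POS"
    if r <= beta:
        return "NEG"
    return "BND"

LOSS = {
    "PP": Fraction(0), "BP": Fraction(1, 5), "NP": Fraction(3, 5),
    "NN": Fraction(0), "BN": Fraction(1, 5), "PN": Fraction(3, 5),
}
ALPHA, GAMMA, BETA = thresholds(LOSS)
```

With these losses the sketch yields $\alpha = 2/3$, $\gamma = 1/2$, and $\beta = 1/3$, so the ordering $\beta < \gamma < \alpha$ required for rules (PIP3)–(PIB3) holds. The lower approximation is then the set of objects classified as POS, and the upper approximation is the union of POS and BND.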
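Definition 3.3 Let $I = (U, AT, V, f)$ be an incomplete information system, $A_1, \ldots, A_m \subseteq AT$, and let $P : 2^U \to [0,1]$ be a probability function defined on the power set $2^U$. With $\alpha$ and $\beta$ as above, for any $X \subseteq U$, we define the lower and upper approximations of $X$ as follows:

$$\underline{\sum_{k=1}^{m} P_{A_k}}^{PI,\alpha}(X) = \left\{ x \in U \;\middle|\; \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha \right\},$$

$$\overline{\sum_{k=1}^{m} P_{A_k}}^{PI,\beta}(X) = \left\{ x \in U \;\middle|\; \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} > \beta \right\}.$$

The pair $\left( \underline{\sum_{k=1}^{m} P_{A_k}}^{PI,\alpha}(X), \overline{\sum_{k=1}^{m} P_{A_k}}^{PI,\beta}(X) \right)$ is called a pessimistic multigranulation decision-theoretic rough set.

The pessimistic multigranulation decision-theoretic rough set is a special pessimistic multigranulation probabilistic rough set, in which $\alpha$ and $\beta$ can be interpreted and computed from losses or costs based on the Bayesian decision procedure. Next, two remarks point out the relationships between the pessimistic multigranulation decision-theoretic rough set model in an incomplete information system and the related rough set models [21, 35, 37].

Remark 3.4 If $\alpha = 1$, $\beta = 0$ and $P(X \mid R_{A_k}(x)) = \frac{|X \cap R_{A_k}(x)|}{|R_{A_k}(x)|}$ (where $|X|$ denotes the cardinality of the set $X$), then we have

$$\underline{\sum_{k=1}^{m} P_{A_k}}^{PI,1}(X) = \left\{ x \in U \;\middle|\; \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge 1 \right\} = \left\{ x \in U \;\middle|\; \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) \ge 1 \right\}$$

$$= \{ x \in U \mid P(X \mid R_{A_1}(x)) = 1 \text{ and } \cdots \text{ and } P(X \mid R_{A_m}(x)) = 1 \} = \{ x \in U \mid R_{A_1}(x) \subseteq X \text{ and } \cdots \text{ and } R_{A_m}(x) \subseteq X \},$$

$$\overline{\sum_{k=1}^{m} P_{A_k}}^{PI,0}(X) = \left\{ x \in U \;\middle|\; \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} > 0 \right\} = \left\{ x \in U \;\middle|\; \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) > 0 \right\}$$

$$= \{ x \in U \mid P(X \mid R_{A_1}(x)) > 0 \text{ or } \cdots \text{ or } P(X \mid R_{A_m}(x)) > 0 \} = \{ x \in U \mid R_{A_1}(x) \cap X \ne \emptyset \text{ or } \cdots \text{ or } R_{A_m}(x) \cap X \ne \emptyset \},$$

which means that, in this case, the pessimistic multigranulation decision-theoretic rough set model degenerates to the pessimistic multigranulation rough set model [21].

Remark 3.5 If $m = 1$ and $(U, AT, V, f)$ is a complete information system, then the pessimistic multigranulation decision-theoretic rough set model degenerates to the decision-theoretic rough set model [35, 37].

By Definition 3.3, we have the following proposition.

Proposition 3.3 Let $I = (U, AT, V, f)$ be an incomplete information system, $A_1, \ldots, A_m \subseteq AT$, and let $P : 2^U \to [0,1]$ be a probability function defined on the power set $2^U$. For any $X \subseteq U$, we have

(1) $\underline{\sum_{k=1}^{m} P_{A_k}}^{PI,\alpha}(X) \subseteq \overline{\sum_{k=1}^{m} P_{A_k}}^{PI,\beta}(X)$,

(2) $\underline{\sum_{k=1}^{m} P_{A_k}}^{PI,\alpha}(\emptyset) = \overline{\sum_{k=1}^{m} P_{A_k}}^{PI,\beta}(\emptyset) = \emptyset$, $\underline{\sum_{k=1}^{m} P_{A_k}}^{PI,\alpha}(U) = \overline{\sum_{k=1}^{m} P_{A_k}}^{PI,\beta}(U) = U$,

(3) $\underline{\sum_{k=1}^{m} P_{A_k}}^{PI,\alpha}(X) = \neg \overline{\sum_{k=1}^{m} P_{A_k}}^{PI,1-\alpha}(\neg X)$ $(\alpha > 0.5)$, $\overline{\sum_{k=1}^{m} P_{A_k}}^{PI,\beta}(X) = \neg \underline{\sum_{k=1}^{m} P_{A_k}}^{PI,1-\beta}(\neg X)$ $(\beta < 0.5)$,

(4) If $0 < \alpha_1 \le \alpha_2 \le 1$ and $0 \le \beta_1 \le \beta_2 < 1$, then $\underline{\sum_{k=1}^{m} P_{A_k}}^{PI,\alpha_2}(X) \subseteq \underline{\sum_{k=1}^{m} P_{A_k}}^{PI,\alpha_1}(X)$ and $\overline{\sum_{k=1}^{m} P_{A_k}}^{PI,\beta_2}(X) \subseteq \overline{\sum_{k=1}^{m} P_{A_k}}^{PI,\beta_1}(X)$.

Proof We only show (3). For all $1 \ge \alpha > 0.5$,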
$$\neg \overline{\sum_{k=1}^{m} P_{A_k}}^{PI,1-\alpha}(\neg X) = \neg \left\{ x \in U \;\middle|\; \frac{\bigvee_{k=1}^{m} P(\neg X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(\neg X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(\neg X \mid R_{A_k}(x))} > 1 - \alpha \right\}$$

$$= \left\{ x \in U \;\middle|\; \frac{1 - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x)) + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))} \le 1 - \alpha \right\}$$

$$= \left\{ x \in U \;\middle|\; \frac{\bigvee_{k=1}^{m} P(X \mid R_{A_k}(x))}{1 + \bigvee_{k=1}^{m} P(X \mid R_{A_k}(x)) - \bigwedge_{k=1}^{m} P(X \mid R_{A_k}(x))} \ge \alpha \right\} = \underline{\sum_{k=1}^{m} P_{A_k}}^{PI,\alpha}(X).$$

By $\underline{\sum_{k=1}^{m} P_{A_k}}^{PI,\alpha}(X) = \neg \overline{\sum_{k=1}^{m} P_{A_k}}^{PI,1-\alpha}(\neg X)$, for all $0 \le \beta < 0.5$ we have

$$\neg \underline{\sum_{k=1}^{m} P_{A_k}}^{PI,1-\beta}(\neg X) = \neg \left( \neg \overline{\sum_{k=1}^{m} P_{A_k}}^{PI,\beta}(\neg(\neg X)) \right) = \overline{\sum_{k=1}^{m} P_{A_k}}^{PI,\beta}(X). \qquad \square$$

4 An example

We will give an example to illustrate the application of the models proposed in Sect. 3.

Example 4.1 Consider the descriptions of several cars in Table 1, where the set of cars is $U = \{x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9, x_{10}\}$, the condition attribute set is $AT = \{a_1, a_2, a_3, a_4\}$ (where $a_1$ denotes "Price", $a_2$ denotes "Size", $a_3$ denotes "Engine", and $a_4$ denotes "Max-speed"), and the decision attribute set is $D = \{d\}$. Let $A_1 = \{a_1, a_4\}$ and $A_2 = \{a_2, a_3\}$. Then, by Table 1, we can obtain $R_{A_1}$ and $R_{A_2}$ induced by $A_1$ and $A_2$, which are given in Tables 2 and 3, respectively. By Table 1, it is easy to see that $U/IND(d) = \{D_1, D_2\}$, where $D_1 = \{x_2, x_5, x_8, x_9\}$ and $D_2 = \{x_1, x_3, x_4, x_6, x_7, x_{10}\}$.

Table 1 The incomplete information table about cars

U    a1    a2       a3        a4      d
x1   Low   Compact  Gasoline  Low     Poor
x2   Low   Full     Diesel    High    Good
x3   High  Full     Diesel    Medium  Poor
x4   High  *        Diesel    Medium  Poor
x5   Low   Full     Gasoline  High    Good
x6   *     Full     Gasoline  Low     Poor
x7   High  Compact  Diesel    Low     Poor
x8   Low   Full     *         Medium  Good
x9   High  Compact  Gasoline  High    Good
x10  *     Compact  Gasoline  Medium  Poor

Table 2 The first granular structure R_A1 induced by A1

R_A1  x1  x2  x3  x4  x5  x6  x7  x8  x9  x10
x1    1   0   0   0   0   1   0   0   0   0
x2    0   1   0   0   1   0   0   0   0   0
x3    0   0   1   1   0   0   0   0   0   1
x4    0   0   1   1   0   0   0   0   0   1
x5    0   1   0   0   1   0   0   0   0   0
x6    1   0   0   0   0   1   1   0   0   0
x7    0   0   0   0   0   1   1   0   0   0
x8    0   0   0   0   0   0   0   1   0   1
x9    0   0   0   0   0   0   0   0   1   0
x10   0   0   1   1   0   0   0   1   0   1

Table 3 The second granular structure R_A2 induced by A2

R_A2  x1  x2  x3  x4  x5  x6  x7  x8  x9  x10
x1    1   0   0   0   0   0   0   0   1   1
x2    0   1   1   1   0   0   0   1   0   0
x3    0   1   1   1   0   0   0   1   0   0
x4    0   1   1   1   0   0   1   1   0   0
x5    0   0   0   0   1   1   0   1   0   0
x6    0   0   0   0   1   1   0   1   0   0
x7    0   0   0   1   0   0   1   0   0   0
x8    0   1   1   1   1   1   0   1   0   0
x9    1   0   0   0   0   0   0   0   1   1
x10   1   0   0   0   0   0   0   0   1   1

According to the customer's requirement, one of the three types of models given in Sect. 3 can be used. Without loss of generality, we choose the pessimistic model (Definition 3.3) to illustrate the application process. Take $\lambda_{PP} = 0$, $\lambda_{BP} = 0.2$, $\lambda_{NP} = 0.6$, $\lambda_{NN} = 0$, $\lambda_{BN} = 0.2$, $\lambda_{PN} = 0.6$. Then

$$\alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})} = \frac{2}{3}, \qquad \beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})} = \frac{1}{3}.$$

By using Definition 3.3 and $P(X \mid R_{A_k}(x)) = \frac{|X \cap R_{A_k}(x)|}{|R_{A_k}(x)|}$, we have

$$\underline{\sum_{k=1}^{2} P_{A_k}}^{PI,\frac{2}{3}}(D_1) = \{x_5\}, \qquad \overline{\sum_{k=1}^{2} P_{A_k}}^{PI,\frac{1}{3}}(D_1) = \{x_2, x_5, x_6, x_8, x_9\},$$

$$\underline{\sum_{k=1}^{2} P_{A_k}}^{PI,\frac{2}{3}}(D_2) = \{x_1, x_4, x_7, x_{10}\}, \qquad \overline{\sum_{k=1}^{2} P_{A_k}}^{PI,\frac{1}{3}}(D_2) = \{x_1, x_3, x_4, x_6, x_7, x_8, x_9, x_{10}\}.$$
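The computation behind these sets can be replayed from Table 1. The sketch below is a hedged illustration, not the authors' implementation: it assumes the tolerance relation of Kryszkiewicz [6] (two objects are related when they agree on every attribute of the subset, with `*` matching any value), which reproduces Tables 2 and 3, and all function names are hypothetical.

```python
from fractions import Fraction

# Table 1: (a1 Price, a2 Size, a3 Engine, a4 Max-speed, d); "*" is a missing value.
CARS = {
    "x1": ("Low", "Compact", "Gasoline", "Low", "Poor"),
    "x2": ("Low", "Full", "Diesel", "High", "Good"),
    "x3": ("High", "Full", "Diesel", "Medium", "Poor"),
    "x4": ("High", "*", "Diesel", "Medium", "Poor"),
    "x5": ("Low", "Full", "Gasoline", "High", "Good"),
    "x6": ("*", "Full", "Gasoline", "Low", "Poor"),
    "x7": ("High", "Compact", "Diesel", "Low", "Poor"),
    "x8": ("Low", "Full", "*", "Medium", "Good"),
    "x9": ("High", "Compact", "Gasoline", "High", "Good"),
    "x10": ("*", "Compact", "Gasoline", "Medium", "Poor"),
}
A1, A2 = (0, 3), (1, 2)  # column indices of {a1, a4} and {a2, a3}

def tolerance_class(x, cols):
    """R_A(x): objects that agree with x on every column, '*' matching anything."""
    def compatible(u, v):
        return all(CARS[u][c] == CARS[v][c] or "*" in (CARS[u][c], CARS[v][c])
                   for c in cols)
    return {y for y in CARS if compatible(x, y)}

def ratio(x, target, granulations=(A1, A2)):
    """Pessimistic ratio max / (1 + max - min) of P(target | R_{A_k}(x))."""
    probs = []
    for cols in granulations:
        cls = tolerance_class(x, cols)  # never empty, since x is related to itself
        probs.append(Fraction(len(target & cls), len(cls)))
    hi, lo = max(probs), min(probs)
    return hi / (1 + hi - lo)

def lower(target, alpha):
    return {x for x in CARS if ratio(x, target) >= alpha}

def upper(target, beta):
    return {x for x in CARS if ratio(x, target) > beta}

D1 = {x for x, row in CARS.items() if row[4] == "Good"}
D2 = set(CARS) - D1
ALPHA, BETA = Fraction(2, 3), Fraction(1, 3)
```

Under these assumptions the upper approximations agree with the sets listed above. With exact rational arithmetic, however, $x_2$ (for $D_1$) and $x_3$ (for $D_2$) attain a ratio of exactly $2/3 = \alpha$, so the $\ge \alpha$ test of Definition 3.3 also places them in the respective lower approximations, which is what the duality in Proposition 3.3 (3) requires. The resulting three-way decisions are not affected by this difference.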
By using three-way decision theory, we have the following decision rules:
(1) If $x \in \underline{\sum_{k=1}^{2} P_{A_k}}^{PI,\frac{2}{3}}(D_1)$ or $x \in U - \overline{\sum_{k=1}^{2} P_{A_k}}^{PI,\frac{1}{3}}(D_2)$, then the car $x$ is Good.

(2) If $x \in \underline{\sum_{k=1}^{2} P_{A_k}}^{PI,\frac{2}{3}}(D_2)$ or $x \in U - \overline{\sum_{k=1}^{2} P_{A_k}}^{PI,\frac{1}{3}}(D_1)$, then the car $x$ is Poor.

(3) Others are non-committed.

Thus, we can obtain the following results:

(1) The cars $x_2$ and $x_5$ are Good.

(2) The cars $x_1, x_3, x_4, x_7$, and $x_{10}$ are Poor.

(3) The cars $x_6, x_8$, and $x_9$ are non-committed.
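The three decision rules reduce to a few set operations. The following Python sketch applies them to the four approximation sets stated in Example 4.1; the function name `three_way` and the variable names are hypothetical and serve only this illustration.

```python
U = {f"x{i}" for i in range(1, 11)}

# Approximations of D1 (Good) and D2 (Poor) as stated in Example 4.1.
LOWER_D1 = {"x5"}
UPPER_D1 = {"x2", "x5", "x6", "x8", "x9"}
LOWER_D2 = {"x1", "x4", "x7", "x10"}
UPPER_D2 = {"x1", "x3", "x4", "x6", "x7", "x8", "x9", "x10"}

def three_way(lower_good, upper_good, lower_poor, upper_poor, universe):
    """Apply rules (1)-(3): return the Good, Poor, and non-committed sets."""
    good = lower_good | (universe - upper_poor)   # rule (1)
    poor = lower_poor | (universe - upper_good)   # rule (2)
    undecided = universe - good - poor            # rule (3)
    return good, poor, undecided

GOOD, POOR, NON_COMMITTED = three_way(LOWER_D1, UPPER_D1, LOWER_D2, UPPER_D2, U)
```

Running the rules on these inputs gives the same partition as the results listed above: two Good cars, five Poor cars, and three non-committed cars.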
5 Conclusion

This paper studies multigranulation decision-theoretic rough sets in an incomplete information system. Based on the Bayesian decision procedure, we propose three types of multigranulation decision-theoretic rough sets in an incomplete information system. The relationships between these models and other rough set models are studied, and an example illustrates the ideas of the proposed models. This paper assumes that the risk $\lambda^k$ does not depend on $k$. In future research, we will study multigranulation decision-theoretic rough sets in incomplete information systems in the case where the risks $\lambda^k$ depend on $k$. In this paper, we give only a tentative study of the application aspect of the models. It is desirable to apply the proposed models to other practical problems in order to show their practical value. Extending the proposed multigranulation decision-theoretic rough sets to other types of data sets is also a future research direction.

Acknowledgments This paper was completed during the authors' visit to the University of Regina. The authors sincerely thank their supervisor, Professor Yiyu Yao, for his valuable suggestions and careful reading. This work is supported by the National Natural Science Foundation of China (No. 61473181), China Postdoctoral Science Foundation funded project (No. 2013M532063), and Shaanxi Province Postdoctoral Science Foundation funded project (The first batch).
References

1. Duda RO, Hart PE (1973) Pattern classification and scene analysis. Wiley, New York
2. Hu BQ (2014) Three-way decisions space and three-way decisions. Inf Sci 281:21–52
3. Huang B, Guo C-X, Zhuang Y-L, Li H, Zhou X (2014) Intuitionistic fuzzy multigranulation rough sets. Inf Sci 277:299–320
4. Jia XY, Zheng K, Li WW, Liu TT, Shang L (2012) Three-way decisions solution to filter spam email: an empirical study. In: Yao JT et al (eds) RSCTC 2012, LNAI 7413. Springer, Berlin, pp 287–296
5. Jia XY, Tang ZM, Liao WH, Shang L (2014) On an optimization representation of decision-theoretic rough set model. Int J Approx Reas 55(1):156–166
6. Kryszkiewicz M (1998) Rough set approach to incomplete information systems. Inf Sci 112:39–49
7. Leung Y, Wu W-Z, Zhang W-X (2006) Knowledge acquisition in incomplete information systems: a rough set approach. Eur J Oper Res 168:164–180
8. Li HX, Zhou XZ, Huang B, Liu D (2013) Cost-sensitive three-way decision: a sequential strategy. In: Proceedings of RSKT 2013, LNCS (LNAI), vol 8171, pp 325–337
9. Li T-J, Yang X-P (2014) An axiomatic characterization of probabilistic rough sets. Int J Approx Reas 55(1):130–141
10. Liang D, Liu D, Pedrycz W, Hu P (2013) Triangular fuzzy decision-theoretic rough sets. Int J Approx Reas 54(8):1087–1106
11. Lingras P, Chen M, Miao DQ (2009) Rough cluster quality index based on decision theory. IEEE Trans Knowl Data Eng 21(7):1014–1026
12. Liu D, Li TR, Liang DC (2012) Three-way government decision analysis with decision-theoretic rough sets. Int J Uncertain Fuzziness Knowl Based Syst 20(Supp. 1):119–132
13. Liu D, Li T, Liang D (2013) Fuzzy interval decision-theoretic rough sets. In: IFSA world congress and NAFIPS annual meeting (IFSA/NAFIPS), 2013 joint, pp 1315–1320
14. Liu CH, Miao DQ, Qian J (2014) On multi-granulation covering rough sets. Int J Approx Reas 55(6):1404–1418
15. Lin GP, Liang JY, Qian YH (2013) Multigranulation rough sets: from partition to covering. Inf Sci 241(20):101–118
16. Lin GP, Liang JY, Qian YH (2015) An information fusion approach by combining multigranulation rough sets and evidence theory. Inf Sci 314(1):184–199
17. Lin Y, Li J, Lin P, Lin G, Chen J (2014) Feature selection via neighborhood multi-granulation fusion. Knowl Based Syst 67:162–168
18. Pawlak Z (1982) Rough sets. Int J Comput Inf Sci 11:341–356
19. Pawlak Z (1991) Rough sets: theoretical aspects of reasoning about data. Kluwer Academic, Dordrecht
20. Qian YH, Liang JY, Yao YY, Dang CY (2010) MGRS: a multigranulation rough set. Inf Sci 180:949–970
21. Qian YH, Liang JY, Dang CY (2010) Incomplete multigranulation rough set. IEEE Trans Syst Man Cybern Part A Syst Hum 40(2):420–431
22. Qian YH, Zhang H, Sang YL, Liang JY (2014) Multigranulation decision-theoretic rough sets. Int J Approx Reas 55(1):225–237
23. She YH, He XL (2012) On the structure of the multigranulation rough set model. Knowl Based Syst 36:81–92
24. Sun BZ, Ma WM, Zhao HY (2014) Decision-theoretic rough fuzzy set model and application. Inf Sci 283:180–196
25. Tan AH, Li J, Lin GP, Lin Y (2015) Fast approach to knowledge acquisition in covering information systems using matrix operations. Knowl Based Syst 79:90–98
26. Tripathy BK, Panda GK, Mitra A (2012) Incomplete multigranulation based on rough intuitionistic fuzzy sets. UNIASCIT 2(1):118–124
27. Wang LJ, Yang XB, Yang JY, Wu C (2012) Incomplete multigranulation rough sets in incomplete ordered decision system. In: Huang D-S et al (eds) ICIC 2011, LNBI, vol 6840, pp 323–330
28. Wu WZ, Leung Y (2011) Theory and applications of granular labeled partitions in multi-scale decision tables. Inf Sci 181:3878–3897
29. Wong SKM, Ziarko W (1987) Comparison of the probabilistic approximate classification and the fuzzy set model. Fuzzy Sets Syst 21:357–362
30. Xu WH, Sun WX, Zhang XY, Zhang WX (2012) Multiple granulation rough set approach to ordered information systems. Int J Gen Syst 41(5):475–501
31. Yang XB, Qi YS, Song XN, Yang JY (2013) Test cost sensitive multigranulation rough set: model and minimal cost selection. Inf Sci 250:184–199
32. Yang XP, Lu ZJ, Li T-J (2013) Decision-theoretic rough sets in incomplete information system. Fundam Inf 126:353–375
33. Yang XP, Yao JT (2012) Modelling multi-agent three-way decisions with decision-theoretic rough sets. Fundam Inf 115:157–171
34. Yao YY (2015) The two sides of the theory of rough sets. Knowl Based Syst 80:67–77
35. Yao YY (2010) Three-way decisions with probabilistic rough sets. Inf Sci 180:341–353
36. Yao YY (2012) An outline of a theory of three-way decisions. In: Yao J, Yang Y, Slowinski R, Greco S, Li H, Mitra S, Polkowski L (eds) RSCTC 2012. LNCS (LNAI), vol 7413. Springer, Heidelberg, pp 1–17
37. Yao YY, Wong SKM (1992) A decision theoretic framework for approximating concepts. Int J Man Mach Stud 37:793–809
38. Yu H, Liu ZG, Wang GY (2014) An automatic method to determine the number of clusters using decision-theoretic rough set. Int J Approx Reas 55(1):101–115
39. Zhao XR, Hu BQ (2015) Fuzzy and interval-valued fuzzy decision-theoretic rough set approaches based on fuzzy probability measure. Inf Sci 298(20):534–554
40. Zhou B, Yao YY, Luo JG (2010) A three-way decision approach to email spam filtering. In: Farzindar A, Keselj V (eds) Canadian AI 2010, LNAI, vol 6085. Springer, Berlin, pp 28–39
41. Zhou B (2014) Multi-class decision-theoretic rough sets. Int J Approx Reas 55:211–224
42. Ziarko W (1993) Variable precision rough set model. J Comput Syst Sci 46:39–59