Probabilistic Dependencies in Linear Hierarchies of Decision Tables

Wojciech Ziarko
Department of Computer Science, University of Regina
Regina, Saskatchewan, Canada S4S 0A2

Abstract. The article is a study of probabilistic dependencies between attribute-defined partitions of a universe in hierarchies of probabilistic decision tables. The dependencies are expressed through two measures: the probabilistic generalization of Pawlak's measure of the dependency between attributes, and the expected certainty gain measure introduced by the author. The expected certainty gain measure reflects subtle grades of probabilistic dependence between events. Both dependency measures are developed, and it is shown how they can be extended from flat decision tables to dependencies existing in hierarchical structures of decision tables.

1 Introduction

The notion of decision table has been around for a long time and has been widely used in circuit design, software engineering, business, and other application areas. In the original formulation, decision tables are static, due to the lack of any ability to automatically learn and adapt their structure based on new information. Decision tables representing data-acquired classification knowledge were introduced by Pawlak [1]. In Pawlak's approach, decision tables are dynamic structures derived from data, with the ability to adjust to new information. This fundamental difference opens the way to novel uses of decision tables in applications related to reasoning from data, such as data mining, machine learning or complex pattern recognition. Decision tables are typically used for making predictions about the value of a target decision attribute, such as a medical diagnosis, based on combinations of values of condition attributes, for example symptoms and test results, as measured on new, previously unseen objects (for example, patients). However,

decision tables often suffer from problems related to the fact that they are typically computed from a subset, a sample, of the universe of all possible objects. Firstly, the decision table may have an excessive decision boundary, often due to the poor quality of the descriptive condition attributes, which may be weakly correlated with the decision attribute. An excessive decision boundary leads to an excessive number of incorrect predictions. Secondly, the decision table may be highly incomplete, i.e. excessively many measurement vectors of condition attributes of new objects are not matched by any combination of condition attribute values present in the decision table. Such a highly incomplete decision table leads to an excessive number of new, unrepresented observations for which the prediction of the decision attribute value is not possible. With condition attributes weakly correlated with the decision attribute, increasing their number does not rectify the first problem. Attempting to increase the number of condition attributes, or the number of possible values of the attributes, results in an exponential explosion in the complexity of decision table learning and leads to a rapid increase in the table's degree of incompleteness [8]. In general, the decision boundary reduction problem conflicts with the decision table incompleteness minimization problem.

To deal with these fundamental difficulties, an approach involving building hierarchies of decision tables was proposed [6]. The approach focuses on learning hierarchical structures of decision tables, rather than individual tables, subject to learning complexity constraints. In this approach, a linear hierarchy of decision tables is formed, in which the parent layer's decision boundary defines the universe of discourse for the child layer's table. The decision tables on each layer are size-limited by reducing the number of condition attributes and their values, thus bounding their learning complexity [8]. Each layer contributes a degree of decision boundary reduction, while providing a shrinking decision boundary to the next layer. In this way, even in the presence of relatively weak condition attributes, a significant total boundary reduction can be achieved, while preserving the learning complexity constraints on each level.

As with a single-layer decision table, the hierarchy of decision tables needs to be evaluated from the point of view of its quality as a potential classifier of new observations. The primary evaluative measure for decision tables, as introduced

by Pawlak, is the measure of partial functional dependency between attributes [1] and its probabilistic extension [7]. Another measure is the recently introduced expected gain measure, which captures more subtle probabilistic associations between attributes [7]. In this paper, these measures are reviewed and generalized to hierarchical structures of decision tables. A simple recursive method for their computation is also discussed. The measures, referred to as the γ and λ measures respectively, provide a tool for the assessment of decision table-based classifiers derived from data.

The basics of rough set theory and the techniques for the analysis of decision tables are presented in this article in the probabilistic context, with the underlying assumption that the universe of discourse U is potentially infinite and is known only partially, through a finite collection of observation vectors (the sample data). This assumption is consistent with the great majority of applications in the areas of statistical analysis, data mining and machine learning.

2 Attribute-Based Probabilistic Approximation Spaces

In this section, we briefly review the essential assumptions, definitions and notations of rough set theory in the context of probability theory.

2.1 Attributes and Classifications

We assume that observations about objects are expressed through values of attributes, which are assumed to be functions a : U → V_a, where V_a is a finite set of values called the domain of a. The attributes represent properties of the objects in U. It should be mentioned, however, that in practice the attributes may not be functions but general relations, due to the influence of random measurement noise. The presence of noise may cause multiple attribute values to be associated with a single object. Traditionally, the attributes are divided into two disjoint categories: condition attributes, denoted as C, and decision attributes D = {d}. In many rough set-oriented applications, attributes are finite-valued functions obtained by discretizing values of real-valued variables representing measurements taken on objects e ∈ U.

As with individual attributes, any non-empty subset of attributes B ⊆ C ∪ D defines a mapping from the set of objects U into the set of vectors of values of attributes in B. This leads to the idea of an equivalence relation on U, called the indiscernibility relation IND_B = {(e1, e2) ∈ U × U : B(e1) = B(e2)}. According to this relation, objects having identical values of attributes in B are equivalent, that is, indistinguishable in terms of values of attributes in B. The collection of classes of identical objects will be denoted as U/B, and the pair (U, U/B) will be called an approximation space. The object sets G ∈ U/C ∪ D will be referred to as atoms. The sets E ∈ U/C will be referred to as elementary sets. The sets X ∈ U/D will be called decision categories. Each elementary set E ∈ U/C and each decision category X ∈ U/D is a union of some atoms. That is, E = ∪{G ∈ U/C ∪ D : G ⊆ E} and X = ∪{G ∈ U/C ∪ D : G ⊆ X}.
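To make these constructions concrete, the following Python sketch (our own illustration; the sample objects and attribute names are hypothetical, not taken from the article) computes the partitions U/C and U/D and the atoms of U/C ∪ D by grouping objects with identical attribute-value vectors:

from collections import defaultdict

# A hypothetical sample of the universe; each object is a dict of
# attribute values (a, b, c are condition attributes, d is the decision).
sample = [
    {"a": 1, "b": 1, "c": 2, "d": 1},
    {"a": 1, "b": 1, "c": 2, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 0},
]

def partition(objects, attributes):
    """Equivalence classes of IND_B: group objects having identical
    values on the given attributes."""
    classes = defaultdict(list)
    for obj in objects:
        classes[tuple(obj[a] for a in attributes)].append(obj)
    return dict(classes)

C, D = ["a", "b", "c"], ["d"]
elementary_sets = partition(sample, C)      # U/C
decision_categories = partition(sample, D)  # U/D
atoms = partition(sample, C + D)            # U/(C ∪ D)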

2.2 Probabilities

We assume that all subsets X ⊆ U under consideration are measurable by a probability measure function P, normally estimated from collected data in a standard way, with 0 < P(X) < 1, which means that they are likely to occur but their occurrence is not certain. In particular, each atom G ∈ U/C ∪ D is assigned a joint probability P(G). From our initial assumption and from the basic properties of the probability measure P, it follows that for all atoms G ∈ U/C ∪ D we have 0 < P(G) < 1 and Σ_{G ∈ U/C∪D} P(G) = 1. Based on the joint probabilities of atoms, the probabilities of an elementary set E and of a decision category X can be calculated by P(E) = Σ_{G ⊆ E} P(G). The probability P(X) of the decision category X in the universe U is the prior probability of the category X. It represents the degree of confidence in the occurrence of the decision category X in the absence of any information expressed by attribute values. The conditional probability of a decision category X,

P(X|E) = P(X ∩ E) / P(E),

conditioned on the occurrence of the elementary set E, represents the degree of confidence in the occurrence of the decision category X, given information indicating that E occurred. The conditional probability can be expressed in terms of joint probabilities of atoms by

P(X|E) = Σ_{G ⊆ X∩E} P(G) / Σ_{G ⊆ E} P(G).

This property allows for simple computation of the conditional probabilities of decision categories.
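These estimates can be sketched in the same vein, assuming the joint probabilities of atoms have been estimated from data as relative frequencies (the numbers below are hypothetical):

# Hypothetical joint probabilities of atoms, keyed by
# (condition-value tuple, decision value); in practice these would be
# relative frequencies estimated from the sample.
P_atom = {
    ((1, 1, 2), 1): 0.23,
    ((1, 0, 1), 1): 0.20,
    ((1, 0, 1), 0): 0.13,
    ((0, 2, 1), 0): 0.44,
}

def P_elementary(t):
    """P(E_t): sum of joint probabilities of atoms contained in E_t."""
    return sum(p for (cond, _), p in P_atom.items() if cond == t)

def P_conditional(x, t):
    """P(X|E_t): probability mass of atoms in X ∩ E_t over that of E_t."""
    num = sum(p for (cond, dec), p in P_atom.items() if cond == t and dec == x)
    return num / P_elementary(t)

print(round(P_conditional(1, (1, 0, 1)), 2))  # -> 0.61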

2.3 Variable Precision Rough Sets

Rough set theory underlies the methods for the derivation, optimization and analysis of decision tables acquired from data. In this part, we review the basic definitions and assumptions of the variable precision rough set model (VPRSM) [5][7]. The VPRSM is a direct generalization of Pawlak rough sets [1]. One of the main objectives of rough set theory is the formation and analysis of approximate definitions of otherwise undefinable sets [1]. The approximate definitions, in the form of the lower approximation and the boundary area of a set, allow for the determination of an object's membership in the set with varying degrees of certainty. The lower approximation permits uncertainty-free membership determination, whereas the boundary defines an area of objects which are not certain, but possible, members of the set [1]. The VPRSM extends these ideas by parametrically defining the positive region as an area where the certainty degree of an object's membership in a set is relatively high, the negative region as an area where the certainty degree of an object's membership in a set is relatively low, and the boundary as an area where the certainty of an object's membership in a set is deemed neither high nor low.

The defining criteria in the VPRSM are expressed in terms of conditional probabilities and of the prior probability P(X) of the set X in the universe U. The prior probability P(X) is used as a reference value here, as it represents the likelihood of the occurrence of X in the extreme case characterized by the absence of any attribute-based information. In the context of the attribute-value representation of sets of the universe U, as described in the previous section, we will assume that the sets of interest are decision categories X ∈ U/D. Two precision control parameters are used: the lower limit l, 0 ≤ l < P(X) < 1, representing the highest acceptable degree of the conditional probability P(X|E) for the inclusion of the elementary set E in the negative region of the set X; and the upper limit u, 0 < P(X) < u ≤ 1, reflecting the least acceptable degree of the conditional probability P(X|E) for the inclusion of the elementary set E in the positive region, or u-lower approximation, of the set X.

The l-negative region of the set X, denoted as NEG_l(X), is defined by:

NEG_l(X) = ∪{E : P(X|E) ≤ l}.    (1)

The l-negative region of the set X is a collection of objects for which the probability of membership in the set X is significantly lower than the prior probability P(X). The u-positive region of the set X, POS_u(X), is defined as:

POS_u(X) = ∪{E : P(X|E) ≥ u}.    (2)

The u-positive region of the set X is a collection of objects for which the probability of membership in the set X is significantly higher than the prior probability P(X). The objects which are classified neither in the u-positive region nor in the l-negative region belong to the (l, u)-boundary region of the decision category X, denoted as:

BND_{l,u}(X) = ∪{E : l < P(X|E) < u}.    (3)

The boundary is a specification of objects for which the associated probability of belonging, or not belonging, to the decision category X is not much different from the prior probability P(X) of the decision category. The VPRSM reduces to standard rough sets when l = 0 and u = 1.
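The three region definitions (1)-(3) translate directly into code; the following sketch (function and parameter names are our own) assigns an elementary set to a region based on its conditional probability:

def vprsm_region(p_x_given_e, l, u):
    """Assign an elementary set E to a VPRSM region of X from P(X|E),
    per definitions (1)-(3), with 0 <= l < u <= 1."""
    if p_x_given_e >= u:
        return "POS"  # E ⊆ POS_u(X): confidently a member
    if p_x_given_e <= l:
        return "NEG"  # E ⊆ NEG_l(X): confidently a non-member
    return "BND"      # E ⊆ BND_{l,u}(X): neither high nor low certainty

# With hypothetical limits l = 0.1 and u = 0.8:
print(vprsm_region(0.61, l=0.1, u=0.8))  # -> "BND"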

3 Structures of Decision Tables Acquired from Data

To describe functional or partial functional connections between attributes of objects of the universe U, Pawlak introduced the idea of a decision table acquired from data [1]. Probabilistic decision tables and their hierarchies extend this idea into the probabilistic domain by forming representations of probabilistic relations between attributes.

3.1 Probabilistic Decision Tables

For a given decision category X ∈ U/D and set values of the VPRSM lower and upper limit parameters l and u, we define the probabilistic decision table DT^{C,D}_{l,u} as a mapping C(U) → {POS, NEG, BND} derived from the classification table as follows: the mapping assigns to each tuple of condition attribute values t ∈ C(U) the unique designation of the VPRSM approximation region POS_u(X), NEG_l(X) or BND_{l,u}(X) that the corresponding elementary set E_t is included in, along with the associated elementary set probability P(E_t) and conditional probability P(X|E_t):

DT^{C,D}_{l,u}(t) =
    (P(E_t), P(X|E_t), POS) ⇔ E_t ⊆ POS_u(X)
    (P(E_t), P(X|E_t), NEG) ⇔ E_t ⊆ NEG_l(X)    (4)
    (P(E_t), P(X|E_t), BND) ⇔ E_t ⊆ BND_{l,u}(X)

The probabilistic decision table is an approximate representation of the probabilistic relation between condition and decision attributes, via a collection of uniform-size probabilistic rules corresponding to rows of the table. An example probabilistic decision table is shown in Table 1. In this table, the condition attributes are a, b, c; the attribute-value combinations correspond to elementary sets E; and Region is the designation of the approximation region the corresponding elementary set belongs to: positive (POS), negative (NEG) or boundary (BND).

Probabilistic decision tables are most useful for decision making or prediction when the relation between condition and decision attributes is largely non-deterministic. However, they suffer from an inherent contradiction between accuracy and completeness. In the presence of a boundary region, higher accuracy, i.e. reduction of the boundary region, can be achieved either by adding new condition attributes or by increasing the precision of existing ones (for instance, by making the discretization procedure finer). Both solutions lead to exponential growth in the maximum number of attribute-value combinations to be stored in the decision table [8]. In practice, this results in such negative effects as excessive size of the decision table, a likely high degree of table incompleteness (in the sense of missing many feasible attribute-value combinations), weak data support for the elementary sets represented in the table and, consequently, unreliable estimates of probabilities. The use of hierarchies of decision tables, rather than individual tables, in the process of classifier learning from data provides a partial solution to these problems [6].
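For illustration, a table in the sense of formula (4) can be assembled from the elementary set and conditional probabilities as follows; this is a sketch, with hypothetical l, u values and probabilities chosen to reproduce Table 1:

def probabilistic_decision_table(cond_probs, elem_probs, l, u):
    """Build DT^{C,D}_{l,u} of formula (4): map each condition-value
    tuple t to (P(E_t), P(X|E_t), region)."""
    table = {}
    for t, p_x_e in cond_probs.items():
        if p_x_e >= u:
            region = "POS"
        elif p_x_e <= l:
            region = "NEG"
        else:
            region = "BND"
        table[t] = (elem_probs[t], p_x_e, region)
    return table

# Hypothetical parameters l = 0.1, u = 0.8 and probabilities matching Table 1:
elem_probs = {(1, 1, 2): 0.23, (1, 0, 1): 0.33, (2, 2, 1): 0.11,
              (2, 0, 2): 0.01, (0, 2, 1): 0.32}
cond_probs = {(1, 1, 2): 1.00, (1, 0, 1): 0.61, (2, 2, 1): 0.27,
              (2, 0, 2): 1.00, (0, 2, 1): 0.06}
dt = probabilistic_decision_table(cond_probs, elem_probs, l=0.1, u=0.8)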

3.2 Probabilistic Decision Table Hierarchies

Since the VPRSM boundary region BND_{l,u}(X) is a definable subset of the universe U, it makes it possible to structure decision tables into hierarchies by treating the boundary region BND_{l,u}(X) as a sub-universe of U, denoted as U' = BND_{l,u}(X). The "child" sub-universe U' so defined can be made completely independent of its "parent" universe U by having its own collection of condition attributes C' to form a "child" approximation sub-space (U', U'/C'). As on the parent level, in

a  b  c  P(E)  P(X|E)  Region
1  1  2  0.23   1.00    POS
1  0  1  0.33   0.61    BND
2  2  1  0.11   0.27    BND
2  0  2  0.01   1.00    POS
0  2  1  0.32   0.06    NEG

Table 1. An example of a probabilistic decision table

the approximation space (U', U'/C'), the decision table for the subset X' ⊆ X of the target decision category X, X' = X ∩ BND_{l,u}(X), can be derived by adapting formula (4). By repeating this step recursively, a linear hierarchy of probabilistic decision tables can be grown until either the boundary area disappears in one of the child tables, or no attributes can be identified to produce a non-boundary decision table at the final level. Other termination conditions are possible, but this issue is outside the scope of this article.

The nesting of approximation spaces obtained as a result of the recursive computation of decision tables, as described above, creates a new approximation space on U. The resulting hierarchical approximation space (U, R) cannot be expressed by the indiscernibility relation, as defined in Section 2, in terms of the attributes used to form the local sub-spaces on individual levels of the hierarchy. This leads to the basic question: how to measure the degree of the, generally probabilistic, dependency between the hierarchical partition R of U and the partition (X, ¬X) corresponding to the decision category X ⊆ U. Some probabilistic inter-partition dependency measures are explored in the next section.
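The recursive construction can be sketched as follows, reusing the probabilistic_decision_table helper from the previous sketch and assuming the per-level probability estimates are re-derived from data on each boundary sub-universe:

def build_hierarchy(levels, l, u):
    """Sketch of the linear hierarchy construction: each level's table is
    derived on the sub-universe U' = BND_{l,u}(X) left by its parent.
    levels: per-level (cond_probs, elem_probs) estimates, assumed to be
    re-computed from data for each boundary sub-universe."""
    hierarchy = []
    for cond_probs, elem_probs in levels:
        table = probabilistic_decision_table(cond_probs, elem_probs, l, u)
        hierarchy.append(table)
        # Stop growing the hierarchy once the boundary region disappears.
        if not any(region == "BND" for _, _, region in table.values()):
            break
    return hierarchy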

4 Dependencies in Decision Table Hierarchies

The dependencies between partitions are fundamental to rough set-based non-probabilistic and probabilistic reasoning and prediction. They make it possible to predict the occurrence of a class of one partition based on the information that a class of another partition occurred. There are several ways dependencies between partitions can be defined in decision tables. In Pawlak's early works, functional and partial functional dependencies were explored [1]. A probabilistic generalization of these dependencies was also defined and investigated in the framework of the variable precision rough set model. All these dependencies represent the relative size of the positive and negative regions of the target set X. They reflect the quality of approximation of the target category in terms of the elementary sets of the approximation space. Following Pawlak's original terminology, we will refer to these dependencies as γ-dependencies. Another kind of dependency, based on the notion of the certainty gain measure, reflects the average degree of improvement of the certainty of occurrence of the decision category X, or ¬X, relative to its prior probability P(X) [7] (see also [2] and [4]). We will refer to these dependencies as λ-dependencies. Both the γ-dependencies and the λ-dependencies can be extended to hierarchies of probabilistic decision tables, as described below. Because there is no single collection of attributes defining the partition of U, the dependencies of interest in this case are dependencies between the hierarchical partition R generated by the decision table hierarchy, forming the approximation space (U, R), and the partition (X, ¬X) defined by the target set.

4.1 γ-Dependencies for Decision Tables

The partial functional dependency among attributes, referred to as the γ-dependency measure γ(D|C), was introduced by Pawlak [1]. It can be expressed in terms of the probability of the positive region of the partition U/D defining the decision categories:

γ(D|C) = P(POS^{C,D}(U)),    (5)

where POS^{C,D}(U) is the positive region of the partition U/D in the approximation space induced by the partition U/C. In the binary case of two decision categories, X and ¬X, the γ(D|C)-dependency can be extended to the VPRSM by defining it as the combined probability of the u-positive and l-negative regions:

γ_{l,u}(X|C) = P(POS_u(X) ∪ NEG_l(X)).    (6)

The γ-dependency measure reflects the proportion of objects in U which can be classified with sufficiently high certainty as being members, or non-members, of the set X.
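Given a decision table in the form sketched in Section 3.1, the dependency (6) is simply the total probability mass of the POS and NEG rows:

def gamma(dt):
    """gamma_{l,u}(X|C) of formula (6): the probability mass of the
    u-positive and l-negative regions of the table."""
    return sum(p_e for p_e, _, region in dt.values()
               if region in ("POS", "NEG"))

print(gamma(dt))  # 0.23 + 0.01 + 0.32 = 0.56 for the Table 1 numbers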

4.2 Computation of γ-Dependencies in Decision Table Hierarchies

In the case of an approximation space obtained via the hierarchical classification process, the γ-dependency between the hierarchical partition R and the partition (X, ¬X) can be computed directly by analyzing all classes of the hierarchical partition. However, an easier to implement recursive computation is also possible. This is done by recursively applying, starting from the leaf table of the hierarchy and going up to the root table, the following formula (7) for computing the dependency γ^U_{l,u}(X|R) of the parent table in the hierarchical approximation space (U, R), given the dependency γ^{U'}_{l,u}(X|R') of the child-level table in the sub-approximation space (U', R'):

γ^U_{l,u}(X|R) = γ^U_{l,u}(X|C) + P(U') γ^{U'}_{l,u}(X|R'),    (7)

where C is the collection of attributes inducing the approximation space on U and U' = BND_{l,u}(X). As in the flat-table case, this dependency measure represents the fraction of objects that can be classified with acceptable certainty into the decision categories X or ¬X by applying the decision tables in the hierarchy. The dependency of the whole structure of decision tables, that is, the last dependency computed by the recursive application of formula (7), will be called the global γ-dependency. Alternatively, the global γ-dependency can be computed straight from the definition (5). This computation requires checking all elementary sets of the hierarchical partition for inclusion in POS_u(X) ∪ NEG_l(X), which seems to be less elegant and more time consuming than the recursive method.
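Formula (7) unwinds into a simple bottom-up loop. A sketch, assuming each level of the hierarchy is summarized by its flat dependency γ_{l,u}(X|C) and the probability P(U') of its boundary region:

def global_gamma(levels):
    """Recursive computation of formula (7), evaluated bottom-up.
    levels: (gamma_flat, p_boundary) pairs, root first, where gamma_flat
    is gamma_{l,u}(X|C) of that level's table and p_boundary is P(U')."""
    total = 0.0
    for gamma_flat, p_boundary in reversed(levels):  # leaf up to root
        total = gamma_flat + p_boundary * total
    return total

# Hypothetical two-level hierarchy: the root resolves 0.56 of U and passes
# a boundary of mass 0.44 to a child table that resolves 0.5 of it:
print(global_gamma([(0.56, 0.44), (0.5, 0.0)]))  # -> 0.56 + 0.44 * 0.5 = 0.78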

4.3 Certainty Gain Functions

Based on the probabilistic information contained in the data, as given by the joint probabilities of atoms, it is also possible to evaluate the degree of probabilistic dependency between any elementary set and a decision category. The dependency measure is called the absolute certainty gain [7] (gabs). It represents the degree of influence the occurrence of an elementary set E has on the likelihood of occurrence of the decision category X. The occurrence of E can increase, decrease, or have no effect on the probability of occurrence of X. The probability of occurrence of X, in the absence of any other information, is given by its prior probability P(X). The degree of variation of the probability of X due to the occurrence of E is reflected by the absolute certainty gain function:

gabs(X|E) = |P(X|E) − P(X)|,    (8)

where |·| denotes the absolute value function. The values of the absolute gain function fall in the range 0 ≤ gabs(X|E) ≤ max(P(¬X), P(X)) < 1. In addition, if the sets X and E are independent in the probabilistic sense, that is, if P(X ∩ E) = P(X)P(E), then gabs(X|E) = 0.

The definition of the absolute certainty gain provides a basis for the definition of a new probabilistic dependency measure between attributes. This dependency can be expressed as the average degree of change of the occurrence certainty of the decision category X, or of its complement ¬X, due to the occurrence of any elementary set [7], as defined by the expected certainty gain function:

egabs(X|C) = Σ_{E ∈ U/C} P(E) gabs(X|E),    (9)

where X ∈ U/D. The expected certainty gain is a more subtle inter-partition dependency than the γ-dependency, since it takes into account the probabilistic distribution information in the boundary region of X. The egabs(X|C) measure can be computed directly from the joint probabilities of atoms. It can be proven [7] that the expected gain function falls in the range 0 ≤ egabs(X|C) ≤ 2P(X)(1 − P(X)), where X ∈ U/D.
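Formulas (8) and (9) can be sketched as follows, with the prior P(X) recovered from the (hypothetical) table rows by total probability:

def gabs(p_x_given_e, p_x):
    """Absolute certainty gain of formula (8)."""
    return abs(p_x_given_e - p_x)

def egabs(rows, p_x):
    """Expected certainty gain of formula (9); rows is a list of
    (P(E), P(X|E)) pairs, one per elementary set E in U/C."""
    return sum(p_e * gabs(p_x_e, p_x) for p_e, p_x_e in rows)

# Hypothetical rows taken from Table 1; the prior follows by total probability.
rows = [(0.23, 1.00), (0.33, 0.61), (0.11, 0.27), (0.01, 1.00), (0.32, 0.06)]
p_x = sum(p_e * p_x_e for p_e, p_x_e in rows)  # P(X) = Σ_E P(E) P(X|E) ≈ 0.49
print(round(egabs(rows, p_x), 3))              # -> 0.324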

4.4 Attribute λ-Dependencies in Decision Tables

The strongest dependency between attributes of a decision table occurs when the decision category X is definable, i.e. when the dependency is functional. Consequently, the dependency in this deterministic case can be used as a reference value to normalize the certainty gain function. The following normalized expected gain function λ(X|C) measures the expected degree of the probabilistic dependency between the elementary sets and the decision categories belonging to U/D [7]:

λ(X|C) = egabs(X|C) / (2P(X)(1 − P(X))),    (10)

where X ∈ U/D. The λ-dependency quantifies, in relative terms, the average degree of deviation of elementary sets from statistical independence with the decision class X ∈ U/D. The dependency function reaches its maximum λ(X|C) = 1 only if the dependency is deterministic (functional), and is at its minimum when all events represented by elementary sets E ∈ U/C are unrelated to the occurrence of the decision class X ∈ U/D. In the latter case, the conditional probability of the decision class, P(X|E), equals its prior probability P(X). The value of the λ(X|C) dependency function can be easily computed from the joint probabilities of atoms. As opposed to the generalized γ(X|C)-dependency, the λ(X|C)-dependency has the monotonicity property [3], that is, λ(X|C) ≤ λ(X|C ∪ {a}), where a is an extra condition attribute outside the set C. This monotonicity property allows for dependency-preserving reduction of attributes, leading to the notion of the probabilistic λ-reduct of attributes, as defined in [3].
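The normalization of formula (10) is then a one-liner on top of egabs (same hypothetical rows and prior as above):

def lam(rows, p_x):
    """Normalized expected gain lambda(X|C) of formula (10)."""
    return egabs(rows, p_x) / (2.0 * p_x * (1.0 - p_x))

# A value of 1 would indicate a fully functional dependency.
print(round(lam(rows, p_x), 2))  # -> 0.65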

4.5 Computation of λ-Dependencies in Decision Table Hierarchies

The λ-dependencies can be computed directly based on any known partitioning of the universe U. In cases when the approximation space is formed through the hierarchical classification process, the λ-dependency between the partition R so created and the target category X can be computed via a recursive formula derived below. Let

egabs_{l,u}(X|C) = Σ_{E ∈ POS_u ∪ NEG_l} P(E) gabs(X|E)    (11)

denote the conditional expected gain function, i.e. the expected gain restricted to the union of the positive and negative regions of the target set X in the approximation space generated by the attributes C. The maximum value of egabs_{l,u}(X|C), achievable in the deterministic case, is 2P(X)(1 − P(X)). Thus, the normalized conditional λ-dependency function can be defined as:

λ_{l,u}(X|C) = egabs_{l,u}(X|C) / (2P(X)(1 − P(X))).    (12)

Like γ-dependencies, λ-dependencies between the target partition (X, ¬X) and the hierarchical partition R can be computed recursively. The following formula (13) describes the relationship between the λ-dependency computed in the approximation space (U, R) and the dependency computed over the approximation sub-space (U', R'), where R and R' are hierarchical partitions of the universes U and U' = BND_{l,u}(X), respectively. Let λ_{l,u}(X|R) and λ_{l,u}(X|R') denote the λ-dependency measures in the approximation spaces (U, R) and (U', R'), respectively. The λ-dependencies in those approximation spaces are related by the following:

λ_{l,u}(X|R) = λ_{l,u}(X|C) + P(BND_{l,u}(X)) λ_{l,u}(X|R').    (13)

The proof of the above formula follows directly from Bayes' equation. In practical terms, formula (13) provides a method for the efficient computation of the conditional λ-dependency in a hierarchical arrangement of probabilistic decision tables. According to this method, to compute the conditional λ-dependency for each level of the hierarchy, it suffices to compute that level's conditional λ-dependency and to know the conditional λ-dependency of the "child" level defined by BND_{l,u}(X). That is, the conditional λ-dependency is computed first for the bottom-level table using formula (12), and then for each subsequent level in a bottom-up fashion by successively applying (13). In a similar way, the "unconditional" λ-dependency λ(X|R) can be computed over all elementary sets of the hierarchical approximation space. This is made possible by the following variant of formula (13):

λ(X|R) = λ_{l,u}(X|C) + P(BND_{l,u}(X)) λ(X|R').    (14)

The recursive process based on formula (14) is essentially the same as in the case of (13), except that the bottom-up procedure starts with the computation of the "unconditional" λ-dependency by formula (10) for the bottom-level table.
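The recursions (13) and (14) have the same shape as the γ recursion of formula (7); a sketch, where each non-leaf level is summarized by its conditional λ_{l,u}(X|C) and boundary probability, and the leaf value is seeded by formula (12) or (10) as appropriate:

def global_lambda(levels, leaf_lambda):
    """Bottom-up evaluation of formulas (13)/(14).
    levels: (lambda_flat, p_boundary) pairs for the non-leaf tables, root
    first; leaf_lambda: the bottom table's lambda, from formula (12) for
    the conditional variant or formula (10) for the unconditional one."""
    total = leaf_lambda
    for lambda_flat, p_boundary in reversed(levels):
        total = lambda_flat + p_boundary * total
    return total

# Hypothetical two-level hierarchy: root contributes 0.4 with boundary
# mass 0.3, leaf lambda is 0.6; global lambda = 0.4 + 0.3 * 0.6 = 0.58.
print(global_lambda([(0.4, 0.3)], leaf_lambda=0.6))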

5 Concluding Remarks

The learning and evaluation of hierarchical structures of probabilistic decision tables is the main focus of this article. The earlier introduced measures of gamma and lambda dependencies between attributes [7], for decision tables acquired from data, are not directly applicable to approximation spaces corresponding to hierarchical structures of decision tables. The main contribution of this work is the extension of these measures to the decision table hierarchy case and the derivation of recursive formulas for their easy computation. The gamma dependency measure allows for the assessment of the prospective ability of a classifier based on a hierarchy of decision tables to predict the values of the decision attribute at the required level of certainty. The lambda dependency measure captures the relative degree of probabilistic correlation between the classes of the partitions corresponding to the condition and decision attributes, respectively. The degree of the correlation in this case represents the average improvement of the ability to predict the occurrence of the target set X, or its complement ¬X. Jointly, both measures enable the user to evaluate the progress of learning with the addition of new training data and to assess the quality of the empirical classifier.

Three experimental applications of the presented approach are currently under development. The first one is concerned with face recognition, using photos to develop a classifier in the form of a hierarchy of decision tables; the second one is aimed at adaptive learning of spam recognition among e-mails; and the third one is focused on stock price movement prediction using historical data.

Acknowledgment: This paper is an extended version of the article included in the Proceedings of the International Conference on Rough Sets and Emerging Intelligent Systems Paradigms, devoted to the memory of Professor Zdzislaw Pawlak, held in Warsaw, Poland in 2007. The support of the Natural Sciences and Engineering Research Council of Canada in funding the research presented in this article is gratefully acknowledged.

References

1. Pawlak, Z.: Rough Sets - Theoretical Aspects of Reasoning About Data. Kluwer, 1991.
2. Greco, S., Matarazzo, B., Slowinski, R.: Rough membership and Bayesian confirmation measures for parametrized rough sets. Proc. of the 10th RSFDGrC'2005, LNAI 3641, Springer, 2005, 314-324.
3. Slezak, D., Ziarko, W.: The investigation of the Bayesian rough set model. International Journal of Approximate Reasoning, vol. 40, 2005, 81-91.
4. Yao, Y.: Probabilistic approaches to rough sets. Expert Systems, vol. 20(5), 2003, 287-291.
5. Ziarko, W.: Variable precision rough sets model. Journal of Computer and System Sciences, vol. 46(1), 1993, 39-59.
6. Ziarko, W.: Acquisition of hierarchy-structured probabilistic decision tables and rules from data. Proc. of the IEEE Intl. Conf. on Fuzzy Systems, Honolulu, 2002, 779-784.
7. Ziarko, W.: Probabilistic rough sets. Proc. of the 10th RSFDGrC'2005, LNAI 3641, Springer, 2005, 283-293.
8. Ziarko, W.: On learnability of decision tables. Proc. of RSCTC'2004, Uppsala, Sweden, LNAI 3066, Springer, 2004, 394-401.
