fuzzy approximate dependencies over imprecise ...

FUZZY APPROXIMATE DEPENDENCIES OVER IMPRECISE DOMAINS. AN EXAMPLE IN SOIL DATA MANAGEMENT J. Calero, G. Delgado,

J.M. Serrano

D. Sánchez, M.A. Vila

Department of Pedology and Agricultural Chemistry University of Granada (Spain)

Department of Computer Science University of Jaen (Spain)

Department of Computer Science and A.I. University of Granada (Spain)

ABSTRACT Fuzzy databases are commonly used when managing imprecise or uncertain information as, for example, soil color data, that is usually affected by inherent imprecision factors. In many occasions, a set of semantic relations between real problems data can be modelled by means of fuzzy similarity relations. Also, attributes over numeric domains can be transformed, if we define adequate sets of linguistic labels, in order to reduce granularity. Data mining tools, as association rules and approximate dependencies, have been proven effective and useful when users are looking for implicit or non-intuitive relations between data. Nevertheles, data mining techniques must be extended in order to allow this imprecise or uncertain data. Several approaches can be found in the literature with this aim. In this work, we comment some of them and apply both crisp and fuzzy techniques over a soil database, comparing obtained results with domain experts aid. KEYWORDS

Fuzzy approximate dependencies, fuzzy data mining, imprecise domains, fuzzy similarity relations, fuzzy databases.

1. INTRODUCTION Knowledge discovery in databases (KDD) is concerned with finding previously unknown and potentially useful knowledge from databases. Roughly, it consists of three main steps: data preprocessing (preparing data), data mining (finding interesting patterns in data) and interpretation of data mining results to provide the final knowledge. There are several ways in which fuzzy set technology is useful in KDD, see (Pedrycz, 1998). First, participation of users in all the steps of the KDD process is crucial, and in particular the ultimate objective of KDD is to provide users with understandable information. Fuzzy set technology is a suitable tool for this purpose, specially in those cases where we want to express summaries or relations between numerical data by means of linguistic terms. For instance, we can represent a set of numeric values related to Clay percentage by means of a set of linguistic labels {High, Average, Low}. Also, linguistic assessments of patterns are helpful in order to judge them in the last step of KDD. However, there is another important fact that links KDD and fuzzy sets: in many cases data is inherently imprecise or uncertain, and several fuzzy relational, deductive and object-oriented database models have been developed in order to cope with this. A more usual case, that provides a similar scenario, is that of fuzzy data obtained from crisp data in the preprocessing step by aggregation, summarization or change of granularity level. The analysis of such information requires the development of specific tools as fuzzy extensions of existing ones. An example of both cases is soil data, and more concretely, soil color information. On the one hand, some similarities can be established between attribute values according to semantic relations. On the other hand, the definition of sets of linguistic labels over numeric domains can help us in the reduction of granularity information. For instance, color is a very remarked characteristic of soils. It can be easily determined with

I - 396

FUZZY APPROXIMATE DEPENDENCIES OVER IMPRECISE DOMAINS. AN EXAMPLE IN SOIL DATA MANAGEMENT

little expert aid, and it lets us to qualitatively estimate the sets of materials conforming soil horizons and soilforming processes (Bigham and Ciolkosz, 1993). Several authors have studied the existing relations between soil color and soil components (Torrent et al., 1980; Schwertmann, 1993; Schulze et al., 1993). In (Sánchez-Marañón et al., 1997) we can find a deeper study of the so called “Mediterranean red soils”, typical of Mediterranean climate. Here, statistic tools are applied to suggest and contrast a certain number of hypothesis, relating some soil components with soil color. Unfortunately, most statistical techniques can not be applied over data modelled by means of fuzzy sets so, our intention in the present paper is to apply a set of fuzzy data mining tools in order to extend the statistic results where possible. For lack of space, we reduce to the study of dependencies between attributes, but at a future task, we will afford the local study of relations between values, by means of association rules. The paper is structured as follows. First, we summarize the data mining techniques and tools that we shall apply. The next section is devoted to explain materials and methods, where the soil database is described, and is followed by an expert interpretation of the obtained results. Finally, some concluding remarks and future tasks are added before the bibliographic references are listed.

2. DATA MINING TOOLS A very common issue in database analysis is the study of existing relations between data stored in a database. Mainly, we can distinguish two basic types of relations. On the one hand, there can be implicit or hidden relations between attribute values (at value level), not clear at a first moment. As we will see in the next subsection, those relations can be obtained by means of association rules. On the other hand, we can be more interested in finding explicit relations between data, where implications between attributes can be easily detected (at attribute level). Approximate dependencies discovery is a suitable technique in order to accomplish this task. Moreover, when source data is affected with a given imprecision or uncertain degree, existing techniques must be extended in order to manage this. In this section we summarize the techniques we have employed to analyze data, in particular, the one corresponding to soil color and properties.

2.1 Association Rules Given a set I ("set of items") and a set of transactions T (also called T-set), each transaction being a subset of I, association rules are "implications" of the form A ⇒ C that relate the presence of itemsets (sets of items) A and C in transactions of T, assuming A, C ⊆ I, A∩C =∅ and A, C ≠∅. In the case of relational databases, it is usual to consider that items are pairs , and transactions are tuples in a table. For example, the item is in the transaction associated to a tuple t if and only if t[X] = x0. The ordinary measures proposed in (Agrawal et al., 1993) to assess association rules are confidence (the conditional probability, p(C|A), noted Conf(A⇒C)) and support (the joint probability, p(A∪C), noted Supp(A⇒C)). It is usual to establish two thresholds, called minsupp and minconf, respectively. Some authors have shown that confidence can yield misleading results in some cases. Basically, the problem with confidence is that it does not take into account the support of C, hence it is unable to detect statistical independence or negative dependency, i.e., a high value of confidence can be obtained in those cases. This problem is specially important when there are some items with very high support. In the worst case, given an itemset C such that Supp(C)=1, every rule of the form A⇒ C will be strong provided that Supp(A)>minsupp. It has been shown that in practice, a large amount of rules with high confidence are misleading because of the aforementioned problems. A summary of papers discussing this problem can be found in (Berzal et al., 2001; Berzal et al., 2002), where an alternative framework is proposed. In this framework, accuracy is measured by means of Shortliffe and Buchanan’s certainty factors (Shortliffe and Buchanan, 1975), in the following way: the certainty factor of the rule A ⇒ C is (Conf ( A ⇒ C )) − Supp (C ) CF = ( A ⇒ C ) = 1 − S (C )

if Conf(A ⇒ C) > Supp(C), and

I - 397

IADIS International Conference Applied Computing 2004

CF = ( A ⇒ C ) =

(Conf ( A ⇒ C )) − Supp (C ) S (C )

if Conf(A ⇒ C) < Supp(C), and 0 otherwise. Certainty factors take values in [-1, 1], indicating the extent to which our belief that the consequent is true varies when the antecedent is also true. It ranges from 1, meaning maximum increment (i.e., when A is true then C is true) to -1, meaning maximum decrement.

2.2 Approximate Dependencies In the following, let RE={At1,…,Atm} be a relational scheme and let r be an instance of RE such that |r|=n. Also, let V, W ⊆RE with V∩W =∅. A functional dependence V → W holds in RE if and only if

∀t, s ∈ r if t[V ] = s[V ] then t[W] = s[W] Approximate dependencies can be roughly defined as functional dependencies with exceptions. The definition of approximate dependence is then a matter of how to define exceptions, and how to measure the accuracy of the dependence (Bosc and Lietard, 1997). We shall follow the approach introduced in (Delgado et al., 2000a; Blanco et al., 2000), where the same methodology employed in mining for association rules is applied to the discovery of approximate dependencies. The idea is that, since a functional dependency "V → W" can be seen as a rule that relates the equality of attribute values in pairs of tuples (see equation above), and association rules relate the presence of items in transactions, we can represent approximate dependencies as association rules by using the following interpretations of the concepts of item and transaction: • An item is an object associated to an attribute of RE. For every attribute At∈RE we note itAt the associated item. • We introduce the itemset IV to be IV = {itAt | At∈V} • Tr is a T-set that, for each pair of tuples ∈r×r contains a transaction ts∈Tr verifying itAt∈ts , t[At] = s[At] It is obvious that |Tr| = |r×r| = n2. Then, an approximate dependence V → W in the relation r is an association rule IV ⇒ IW in Tr (Delgado et al., 2000a; Blanco et al., 2000). The support and certainty factor of IV ⇒ IW measure the interest and accuracy of the dependence V → W.

2.3 Fuzzy Association Rules In (Delgado et al., 2003), the model for association rules is extended in order to manage fuzzy values in databases. The approach is based on the definition of fuzzy transactions as fuzzy subsets of items. Let I = {i1,…, im} be a set of items and T’ be a set of fuzzy transactions, where each fuzzy transaction is a fuzzy subset of I. Let τ∈T’ be a fuzzy transaction, we note τ(ik) the membership degree of ik in τ. A fuzzy association rule is an implication of the form A ⇒ C such that A, C⊂RE and A∩C =∅. It is immediate that the set of transactions where a given item appears is a fuzzy set. We call it representation of the item. For item ik in T’ we have the following fuzzy subset of T’:

Γik =

∑τ (i ) /τ

τ ∈T '

k

This representation can be extended to itemsets as follows: let I0∈I be an itemset, its representation is the following subset of T’:

ΓI 0 =

∩ Γ = min i

i∈ I 0

i∈ I 0

Γi

In order to measure the interest and accuracy of a fuzzy association rule, we must use approximate reasoning tools, because of the imprecision that affects fuzzy transactions and, consequently, the

I - 398


representation of itemsets. In (Delgado et al., 2003), a semantic approach is proposed based on the evaluation of quantified sentences (see (Zadeh, 1983)). Let Q be a fuzzy coherent quantifier: • The support of an itemset ΓI 0 is equal to the result of evaluating the quantified sentence Q of T’ are ΓI 0 . •

The support of the fuzzy association rule A ⇒ C in the FT-set T’, Supp(A ⇒ C), is the evaluation of the quantified sentence Q of T are ΓA∪ C = Q of T are ( ΓA ∩ ΓC ).

•

The confidence of the fuzzy association rule A ⇒ C in the FT-set T’ Conf(A ⇒ C), is the evaluation of the quantified sentence Q of ΓA are ΓC .

As seen in (Delgado et al., 2003), the proposed method is a generalization of the ordinary association rule assessment framework in the crisp case. The sentences can be evaluated for instance by means of method GD, defined in (Delgado et al., 2000b), as ⎛ | (G ∩ F )α i | ⎞ ⎟ GDQ (G / F ) = (α i − α i +1 )Q⎜ ⎜ ⎟ | Fα i | α i ∈∆ (G / F ) ⎝ ⎠ where ∆ (G / F ) = ∧(G ∩ F ) ∪ ∧ F , ∧ F being the level set of F, and ∆(G / F ) = {α1,..., α p } with αi > αi+1

∑

for every i ∈ {1,…,p}. The set F is assumed to be normalized. If not, F is normalized and the normalization factor is applied to G ∩ F. We choose the quantifier QM, defined by QM(x) = x, since it verifies the conditions we request for a quantifier (to be a coherent quantifier) and it has a valuable property: the values obtained by using it in previous definitions in the case of crisp transactions, are the ordinary measures of support and confidence in the crisp case. This way, the proposed method is a generalization of the ordinary association rule assessment framework in the crisp case.

2.4 Fuzzy Approximate Dependencies As seen in (Bosc and Lietard, 1997), it is possible to extend the concept of functional dependence in several ways by smoothing some of the elements of the rule in section 2.2. We want to consider as much cases as we can, integrating both approximate dependencies (exceptions) and fuzzy dependencies. For that purpose, in addition to allowing exceptions, we have considered the relaxation of several elements of the definition of functional dependencies. In particular we consider membership degrees associated to pairs (attribute, value) as in the case of fuzzy association rules, and also fuzzy similarity relations to smooth the equality of the rule. We shall define fuzzy approximate dependencies in a relation as fuzzy association rules on a special FTset obtained from that relation, in the same way that approximate dependencies are defined as association rules on a special T-set. Let IRE = {itAt | At∈RE} be the set of items associated to the set of attributes RE. We define a FT-set T'r associated to table r with attributes in RE as follows: for each pair of rows in r×r we have a fuzzy transaction ts in T'r defined as

∀it At ∈ T ' r , ts(it At ) = min( µt ( At ), µs ( At ), S At (t ( At ), s ( At ))) This way, the membership degree of a certain item itAt in the transaction associated to tuples t and s takes into account the membership degree of the value of At in each tuple (µt(At)) and the similarity between them (SAt). This value represents the degree to which tuples t and s agree in At. On this basis, we define fuzzy approximate dependencies as follows (Berzal et al., 2003; Serrano, 2003): Let X, Y ⊆ RE with X∩Y =∅ and X, Y ≠∅. The fuzzy approximate dependence X → Y in r is defined as the fuzzy association rule IX ⇒ IY in T'r. The support and certainty factor of IX ⇒ IY are calculated from T'r as explained in sections 2.3 and 2.1, and they are employed to measure the importance and accuracy of X → Y. A fuzzy approximate dependence X → Y holds with total accuracy (certainty factor CF(X → Y) = 1) in a relation r if and only if ts(IX)≤ts(IY) ∀ts∈T’r (let us remember that ts(IX)=minAt∈X ts(itAt), ∀X∈RE). Moreover,

I - 399


since fuzzy association rules generalize crisp association rules, fuzzy approximate dependencies generalize approximate dependencies. Additional properties and an efficient algorithm for computing fuzzy approximate dependencies can be found in (Berzal et al., 2003; Serrano, 2003).

3. BIBLIOGRAPHIC SOURCES AND DATABASES The studied database consists of information about three mesoenvironments from the South and Southeast of the Iberian Peninsula under Mediterranean climate: Sierra Nevada, Sierra of Gádor and Southeast (involving part of the provinces of Murcia and Almería). Table 1 shows the main characteristics of local information sources. We used two PhD thesis and five cartographic sheets from LUCDEME, scale 1:100000. Data from Sierra of Gádor was extracted from (Oyonarte, 1990) and consists of 70 soil profiles and 176 horizons. Altitude fluctuate from 100 to 2200 m, and rainfall from 213 mm/year (semiarid climate) to 813 mm/year (wet climate), with a mean annual rainfall of 562 mm/year. Lower annual mean temperature is 6.4º C and higher is 21.0º C, with a mean of 12.7º C. Original material of soils are of carbonated type, mainly limestones and dolomites. Data from Southeast was extracted from LUCDEME soil maps, specifically from sheets 1041 from Vera, Almería (Delgado et al., 1991), 911 from Cehegin, Murcia (Alias, 1987), 1030 from Tabernas, Almería (Pérez Pujalte, 1987), 912 from Mula, Murcia (Alias, 1986) and 1031 from Sorbas, Almería (Pérez Pujalte, 1989). There is a total of 89 soil profiles and 262 horizons. Altitude fluctuates from 65 to 1120 m, and rainfall from 183 mm/year (arid climate) to 359 mm/year (semiarid climate), with a mean annual rainfall of 300 mm/year. Lower annual mean temperature is 13.2º C and higher is 19.0º C, with a mean of 17.0º C. Geological environment and Original materials of soils are extremely different, we can find carbonated, acids and volcanic rocks. Data from Sierra Nevada was extracted from (Sánchez-Marañón, 1992). There is a total of 35 soil profiles and 103 horizons. Altitude fluctuates from 1320 to 3020 m, and rainfall from 748 mm/year (semihumid climate) to 1287 mm/year (hyperhumid climate), with a mean annual rainfall of 953 mm/year. Lower annual mean temperature is 0.0º C and higher is 12.1º C. Geological environment and Original materials of soils are mainly acids, but it is not strange to find basic rocks. Soil colors can be quantified by means of several color systems. The most extended of these systems is the Munsell Color System (Munsell, 1954; Soil Survey Staff, 1975). It is based on three parameters: Hue, Chroma and Value. Hue is related with the dominant length wave in reflected radiation, Value (or lightness) expresses the proportion of reflected light, and finally, Chroma means the chromatic intensity or relative purity on color. Starting from the correlation matrix appeared in (Sánchez-Marañón, 1992), we selected those soil components more correlated (positive or negatively) with Hue, Value and Chroma. The studied components were: Clay percentage, Sand percentage, and Organic Carbon percentage. Database values had to be preprocessed before the analysis. In order to reduce the granularity degree, attributes with numeric domains were discretized, following the discussed techniques in (Hussain et al., 1999), under supervision of domain experts. A set of linguistic labels {Low, Medium, High} was defined for every attribute. Later, these labels were associated to possibilistic distributions. Attributes with categorical domains were fuzzified considering fuzzy similarity relations according to semantics between values. Table 1. Attributes used in knowledge discovery and semantic groups Semantic group Environmental Stations

Attribute Mesoenvironment

Soil-forming Factors

Altitude Mean Annual Rainfall Mean Annual Temperature Original Material

I - 400

Comments Multidimensional combinations of environmental factors that define more or less homogeneous spaces of influence in soil later development. These factors are also named soil-forming factors. Soil-forming factors do not belong to the soil ‘sensu strictu’ and they do not belong to its structure. They are general environmental factors, measurable, and act as causal agents in soil-forming processes in soil development.


Horizons

Horizon Type

Components

Calcic carbonate percentage CEC Useful water percentage Sand percentage Clay percentage Organic carbon percentage Free iron percentage Value Chroma Hue

Properties

It expresses the characteristics in homogeneous zones in soil and can be obtained as a final result of several soil-forming processes due to the action of soil-forming agents (factors) Components and properties are measurable characteristics, morphological or analytical, described in each horizon, and they can act as a diagnostic for it. In soil genesis, properties are consequence of components.

4. RESULTS AND INTERPRETATION In this section, we apply the domain experts aid in order to give an interpretation of the obtained results. First, as a previous exploratory step, we applied a crisp approximate dependencies (section 2.2) extraction algorithm. We reduced to the case of one antecedent and one consequent, and fixed a minimum threshold of 0.7 for CF measure. According to this value and to the experts’ opinion, the “best” obtained approximate dependencies were the following, [Dry Hue] → [Wet Hue], supp 17.35%, CF 0.91 [Wet Hue] → [Dry Hue], supp 17.35%, CF 0.88 [Altitude] → [Mean Annual Rainfall], supp 35.51%, CF 0.80 [Mean Annual Temperature] → [Mean Annual Rainfall], supp 31.87%, CF 0.78 [Free iron percentage] → [Mesoenvironment], supp 28.62%, CF 0.76 [Mean Annual Rainfall] → [Altitude], supp 35.51%, CF 0.75 [Mesoenvironment] → [Mean Annual Rainfall], supp 31.16%, CF 0.71 From these dependencies, the first one is trivial, from an expert’s point of view, since it just relates the two existing moisture states (wet and dry) for the Hue color parameter. This property, as seen before, gives us information about the reflected radiation wave length by the soil sample. Hue, opposing to other properties as Value and Chroma, is hardly modifiable by moisture changes. In this way, the relation is logical for Hue, but not for Value of Chroma. The second dependence shows the narrow relation between three climatic properties. Following the regional climatic pattern, a higher rainfall corresponds to a higher altitude, and viceversa, as shown by the results. Another revealed relation is the one between temperature and rainfall. In the geographical zone studied, the higher the temperature, the lower the rainfall, almost invariably. Another result reveals a relation between rainfall and mesoenviroment. This is very reasonable, from the experts’ point of view. Looking at the previous dependencies and knowing the existing altitude gradient in Southeast-Sierra of Gádor-Sierra Nevada, a relation like this was expected. Moreover, Free iron percentage seems to be related to soil evolution, that is, the obtained dependence shows that a general evolutive relation between mesoenvironments exists. For all, it must be remarked that, by means of crisp techniques, it is not possible to obtain dependencies involving color properties (Hue, Chroma, Value) and any other soil attribute, as it was our first objective. Nevertheless, applying fuzzy approximate dependencies, a higher number of results (up to six times, maintaining the same CF threshold of 0.7) is obtained. This fact, in particular, means a higher possibility of discovering potentially useful information in a database. Within the obtained results, we have selected those dependencies involving soil color properties and other attributes, [% Organic Carbon] → [Wet Value], supp 27.6%;CF 0.75 [CEC] → [Wet Value], supp 26.31%;CF 0.7 [% Clay] → [Dry Chroma], supp 28.9%;CF 0.99 [% Organic Carbon] → [Dry Chroma], supp 31.4%;CF 0.91

I - 401


[% Calcic Carbonate] → [Dry Chroma], supp 27.12%;CF 0.8 [CEC] → [Dry Chroma], supp 27.11%;CF 0.71 [% Useful water] → [Dry Chroma], supp 29.12%;CF 0.97 Organic carbon percentage, Cation exchange capacity (CEC), Clay percentage, Calcic Carbonate percentage and Useful water in soil conform the most relevant properties group in soil analysis. Because of this, these obtained dependencies have a very high interest, according to experts’ opinion. Relations between CEC and Organic carbon are well-known in the studied knowledge area. A higher Organic carbon percentage (humus content in soil), implies a higher Value (lower luminosity, darker soils). This fact is always present in studied soils, as experts verify. This effect is clearly more remarkable in wet soils (attribute Wet Value) than in dry soils (obtained dependencies involving Dry Value gave only CF under 0.54). For this particular case, organic matter shows a higher CEC (about 300 cmol(+)/kg) than the remaining soil components (i.e., clay has a value of 30 cmol(+)/kg), so a narrow relation appears between these attributes. For that reason, the fuzzy approximate dependence [CEC] → [Wet Value] is easily explainable, although with a lower CF. In other order of things, the dependence between Clay percentage and Dry Chroma is almost perfect. In every soil in the world, a higher value for Clay percentage implies a higher Chroma, since this is invariably related to iron oxide fine particle releasing (clay-scale particles), acting as pigments and intensifying soil color. This effect is even more remarkable on dry soils, since sample humidification reduces reflection. The meaning of variation of Dry Chroma with Organic carbon is not so clear, even though they appear to be very related. A higher quantity of humus generates an intensity loss in soil color,but a local study by means of association rules between attribute values would be desirable, in order to verify this hypothesis. Then, experts could confirm the direction of the dependencies involving Dry Chroma with CEC and Useful water, as both are very dependent of Clay and Organic carbon percentages. Nevertheless, the dependencies seems to be reasonable in some measure. Finally, the fuzzy approximate dependence between Calcic carbonate percentage and Dry Chroma seems to be very interesting, according to experts. A priori, it could be argued that a high carbonate percentage should lead to a low intense and whitish soil color. For this particular case, a local analysis would be desirable, too.

5. CONCLUSION We conclude that, for this particular case, asked experts were highly satisfied as knowledge extracted by means of fuzzy data mining was more suitable to "fusion" or comparison with expert knowledge that crisp. Moreover, fuzzy data mining was sensitive to low support dependencies, that were discarded in crisp data mining. Agricultural information, and in particular, soil data, is inherently affected by imprecision of uncertain factors and can be modelled very efficiently in fuzzy databases. Our methodology generates a set of dependencies that could be applied to datasets description or data prediction. The obtained results were considered, according to domain experts opinion, interesting enough to continue studying, as a future work, the possible existing relations at a local level, that is, between attribute values. Fuzzy association rules techniques over categorical domains, as well as over sets of fuzzy transactions, might be studied and developed. As another pending task, we propose to solve this same problem in a general case. With a domain expert aid, we could define the set of criteria for database preprocessing but also discern when fuzzy techniques generate better results than crisp ones. One first approach can be found in (Berzal et al, 2003b), where a medical database is analyzed also by means of fuzzy approximate dependencies. Further efforts would be also devoted to develop user-guided tools in order to obtain better knowledge and hence take better decisions.

ACKNOWLEDGEMENT This work is part of the research project Fuzzy-KIM CICYT TIC2002-04021-C02-02.

I - 402


REFERENCES Agrawal R. et al, (1993). Mining association rules between sets of items in large databases. In Proc. Of the 1993 ACM SIGMOD Conference, pages 207–216. Alias J. (1986). Mapa de suelos de Mula. Mapa 1:100000 y memoria. LUCDEME; MAPA-ICONA-University of Murcia. Alias J. (1987). Mapa de suelos de Cehegin. Mapa 1:100000 y memoria. LUCDEME; MAPA-ICONA. University of Murcia. Berzal F. et al, (2003). A definition for fuzzy approximate dependencies. Fuzzy Sets and Systems. Submitted. Berzal F. et al, (2003b). Finding Fuzzy Approximate Dependencies within STULONG Data. Proceedings of the ECML/PKDD 2003, Workshop on Discovery Challenge. Berzal F. et al, (2001). A new framework to assess association rules. In Hoffmann, F., editor, Advances in Intelligent Data Analysis. Fourth International Symposium, IDA’01. Lecture Notes in Computer Science 2189, pages 95–104. Springer-Verlag. Berzal F. et al, (2002). Measuring the accuracy and interest of association rules: A new framework. Intelligent Data Analysis. An extension of (Berzal et al., 2001), submitted. Bigham J.M., and Ciolkosz E.J., eds. (1993). Soil color. Soil Sci. Soc. Am. Spec. Publ. No. 31, 159 pp. Blanco I. et al, (2000). On the support of dependencies in relational databases: strong approximate dependencies. Data Mining and Knowledge Discovery. Submitted. Bosc P. and Lietard L. (1997). Functional dependencies revisited under graduality and imprecision. In Annual Meeting of NAFIPS, pages 57–62. Delgado G. et al, (1991). Mapa de Suelos de Vera. LUCDEME, ICONA-Universidad de Granada. Delgado M. et al, (2003). Fuzzy association rules: General model and applications. IEEE Transactions on Fuzzy Systems 11(2), pages 214–225. Delgado M. et al, (2000a). Mining strong approximate dependencies from relational databases. In Proceedings of IPMU’2000. Delgado, M. et al, (2000b). Fuzzy cardinality based evaluation of quantified sentences. International Journal of Approximate Reasoning, vol. 23, pp. 23-66. Hussain F. et al, (1999). Discretization: An enabling technique. Technical report, The National University of Singapore. Munsell. (1954). Soil Color Charts. Munsell Color Company Inc. Baltimore, Maryland. Oyonarte C. (1990). Estudio Edáfico de la Sierra de Gádor (Almería). Evaluación para usos forestales. PhD thesis, University of Granada. Pedrycz W. (1998). Fuzzy Set Technology in Knowledge Discovery. Fuzzy Sets and Systems 98, pp. 279-290. Pérez-Pujalte A. (1987). Mapa de suelos de Tabernas. Mapa 1:100000 y memoria. LUCDEME; MAPAICONA-CSIC. Pérez-Pujalte A. (1989). Mapa de suelos de Sorbas. Mapa 1:100000 y memoria. LUCDEME; MAPA-ICONA-CSIC. Sánchez-Marañón M. (1992). Los suelos del Macizo de Sierra Nevada. Evaluación y capacidad de uso (in Spanish). PhD thesis, University of Granada. Sánchez-Marañón M. et al, (1997). CIELAB color parameters and their relationship to soil characteristic in Mediterranean Red Soils. Soil Sci. 11, 833-842. Schwertmann, U. (1993). Relations between iron oxides, soil color and soil formation. Soil Sci. Soc. Am. Spec. Publ. No. 31, 51-69. Schulze, D.G. et al, (1993). Significance of organic matter in determining soil colors. Soil Sci. Soc. Am. Spec. Publ. No. 31, 71-90. Serrano J.M. (2003). Fusión de Conocimiento en Bases de Datos Relacionales: Medidas de Agregación y Resumen (in Spanish). PhD thesis, University of Granada. Shortliffe E. and Buchanan B. (1975). A model of inexact reasoning in medicine. Mathematical Biosciences, 23, pages 351–379. Soil Survey Staff (1975). Soil Taxonomy. U.S. Dept. Agri. Handbook No. 436, 754 pp. Torrent J. et al, (1980). Iron oxide mineralogy of two river terraces sequences in Spain. Geoderma 23, 191-208. Zadeh L. (1983). A computational approach to fuzzy quantifiers in natural languages. Computing and Mathematics with Applications, 9(1):149–184.

I - 403

fuzzy approximate dependencies over imprecise ...

fuzzy approximate dependencies over imprecise ...

Suggest Documents

Discovery of Approximate Differential Dependencies

Fuzzy approximate continuity

Fuzzy Functional Dependencies in Relational

Mining Approximate Functional Dependencies ... - Semantic Scholar

Mining Approximate Functional Dependencies and Concept ...

Characterizing Approximate-Matching Dependencies in ... - HAL-Inria

Efficient Discovery of Approximate Dependencies - VLDB Endowment

Answering Imprecise Queries over Autonomous Web Databases

Functional Dependencies over XML Documents with DTDs

potentials of fuzzy logic:an approach to handle imprecise ... - CiteSeerX

Approximate Equality is no Fuzzy Equality

Approximate Solution of Fuzzy Matrix Equations

Fast Approximate Discovery of Inclusion Dependencies - BTW 2017 in

Page 1 Imprecise (Fuzzy) Information in Geostatistics' A. Bardossy,* 1 ...

On an Indexing Mechanism for Imprecise Numerical Data for Fuzzy

Iterative method for approximate solution of fuzzy

Approximate Reasoning for Solving Fuzzy Linear Programming

Fuzzy Logic and Approximate Reasoning - DiVA

Approximate solution of intuitionistic fuzzy differential

Fuzzy Approximate Reasoning toward Multi-Objective

Equalizing imbalanced imprecise datasets for genetic fuzzy classifiers

Processing Imprecise Database Queries by Fuzzy Clustering Algorithms

Bayesian Networks with Imprecise Probabilities - Imprecise Probability ...

Evaluating Trajectory Queries Over Imprecise Location Data - PolyU