2018 Tenth International Conference on Advanced Computational Intelligence (ICACI) March 29–31, 2018, Xiamen, China
A Density-based Discretization Method With Inconsistency Evaluation

Rong Zhao, Yanpeng Qu, and Ansheng Deng
Information Science and Technology College, Dalian Maritime University, Dalian, 116026, China
Email: {rongzhao, yanpengqu, ashdeng}@dlmu.edu.cn

Reyer Zwiggelaar
Department of Computer Science, Institute of Mathematics, Physics and Computer Science, Aberystwyth University, SY23 3DB Aberystwyth, UK
E-mail: [email protected]
Abstract—The data used in real-world applications is commonly of two types: continuous data and discrete data. Continuous data represents a range of values, while discrete data refers to information that shares certain commonalities. Since discretized data is generally simpler to use, many data mining methods such as rough set theory and decision trees are designed to deal with discrete data. Because continuous attributes are abundant in real data sets, discretization is an important data preprocessing step. In this paper, a density-based clustering algorithm is used to build a discretization method. Specifically, in order to automatically find a proper number of clusters, clustering by fast search and find of density peaks is employed to divide the data set into clusters. A top-down splitting strategy is then applied to discretize the value range of each attribute. Furthermore, a novel probabilistic inconsistency measure is proposed to evaluate the results of discretization methods. The experimental results demonstrate that the discretization methods selected by the inconsistency measure achieve higher classification accuracy than those selected by other indicators. Therefore, the inconsistency measure can be used as an evaluation indicator.

Keywords—Inconsistency measure; Discretization method; Clustering
I. INTRODUCTION

Data mining is the process of extracting novel and useful knowledge from an information resource. Generally, the commonly-used information falls into two categories: discrete data (nominal) and continuous data (numerical). Discrete data is qualitative, such as colour or gene, whereas continuous data is quantitative, such as income level or length [1], [2]. Real databases contain a large amount of data with continuous attributes, but many data mining methods such as decision trees and rough set theory are more effective at dealing with discrete or categorical data [3], [4]. Therefore, discretization of continuous attributes is a necessary preprocessing step for methods that can only handle discrete attributes. Discretization converts continuous attributes into categorical attributes and thus simplifies the structure of the data. In short, the technique divides the values of a continuous attribute into several discrete intervals, and the attribute values falling in each interval are eventually represented by a distinct symbol or value. In general, discretization methods can be classified along three dimensions. a) Global [5] versus local [6]
discretization: a global method uses the entire sample space, while a local method works on part of the sample space. b) Dynamic [7] versus static discretization: dynamic methods discretize the continuous attributes while the classification model is being built, whereas static methods preprocess the data before the classification task. c) Supervised versus unsupervised discretization [8]: supervised methods take the class information of the data into account, while unsupervised methods do not. For unsupervised discretization, the input data contain only the condition attributes. The main unsupervised methods are equal frequency, equal width and cluster-based discretization. The first two require the user to specify the number of intervals over the attribute values. A further drawback of the equal width method is that it is sensitive to outliers, which can lead to an uneven distribution of instances; such unevenness weakens the ability of the equal width method to build a good decision structure from a feature. Although the equal frequency method avoids this problem, it may assign values with the same class label to different intervals in order to keep a fixed number of instances per interval. Cluster-based discretization consists of two steps: first, the values of a continuous attribute are divided into clusters; second, the clusters are further processed by a top-down splitting strategy or a bottom-up merging strategy. The disadvantage of this approach is that the number of clusters has to be specified by the user. An unsupervised discretization method that performs density estimation for univariate data was presented in [9]. In addition, a discretization algorithm called Low Frequency Discretizer (LFD) uses only the numerical values with low frequency as potential cut points [10]. Inconsistency measures can be applied to analyze inconsistencies in a knowledge base and to suggest ways of repairing them. A larger value of an inconsistency measure implies more severe inconsistency among the data, and the value is zero if and only if the knowledge base is consistent. However, inconsistency is a concept that is difficult to quantify, and inconsistency measures have been proposed to address this problem, especially for classical propositional logic [11]. Such measures can be
divided into several families, e.g. inconsistency measures based on minimal inconsistent sets [12]–[14], inconsistency measures based on maximal consistency [15], probabilistic inconsistency measures [16], variable-based inconsistency measures [17], distance-based inconsistency measures [18] and proof-based inconsistency measures [19].

In this paper, a density-based discretization method is proposed to process data sets with continuous attributes. In order to avoid the disadvantage of cluster-based discretization, clustering by fast search and find of density peaks is employed to automatically select the number of cluster centers. A novel probabilistic inconsistency measure is proposed as a criterion to evaluate the discretization results. Several data sets are discretized by a number of discretization methods, and the results are evaluated by the proposed index and three other evaluation indexes, i.e. the chi-square test, the standard deviation and the dependence degree. Inconsistent records may appear when continuous attributes are transformed into discrete ones; inconsistent records are records with the same condition values but different decisions. Hence, discretization should ensure that the inconsistency of the discretized data set is not larger than that of the original data set. Semantically, inconsistency reflects contradiction and incompatibility within the data, and a decrease of inconsistency indicates that the discretization method works well on the data set.

The rest of this paper is organized as follows. Some discretization methods are briefly reviewed in Section II. In Section III, the density-based discretization method for continuous attributes and the inconsistency-based evaluation index are introduced. The experimental results are analyzed and discussed in Section IV. Finally, Section V is devoted to concluding remarks.

II. PRELIMINARIES

A. Clustering by Fast Search and Find of Density Peaks

The clustering algorithm by fast search and find of density peaks [20] is, like the K-medoids algorithm, based on the distances between data points. In addition, it can detect non-spherical clusters, similarly to DBSCAN and mean-shift, and it can automatically find the correct number of clusters. The algorithm rests on the assumption that cluster centers are points surrounded by neighbors with lower local density and lying at a relatively large distance from any point with higher local density. For each data point, the local density and the distance to the data points with higher density are calculated. These two quantities depend only on the distances between data points, which are assumed to satisfy the triangle inequality. The local density ρ_i of data point i is defined as follows:

$\rho_i = \sum_{j} \chi(d_{ij} - d_c)$,  (1)
where χ(x) = 1 if x < 0 and χ(x) = 0 otherwise, and d_ij is the distance between data point i and data point j. d_c is the cutoff distance, taken as the value at the two-percent position of all pairwise distances sorted in ascending order. In effect, ρ_i counts the number of data points that are closer to data point i than the cutoff distance. Either the cut-off kernel or the Gaussian kernel can be applied to compute the local density: equation (1) corresponds to the cut-off kernel, while the Gaussian kernel is defined as follows:

$\rho_i = \sum_{j} e^{-\left(d_{ij}/d_c\right)^2}$.  (2)
Of these two ways of calculating the local density, the former yields discrete values while the latter yields continuous values; the latter reduces the probability that different data points obtain the same local density. The algorithm is sensitive only to the relative magnitude of ρ_i across data points, which means that, for large data sets, the results of the analysis are robust with respect to the choice of d_c. δ_i is measured as the distance between data point i and the other data points with higher density. The definition of δ_i is as follows:

$\delta_i = \min_{j:\rho_j > \rho_i} (d_{ij})$,  (3)
in which d_ij represents the distance between data point i and data point j. Equation (3) implies that if a point has a relatively high density and its distance to the points with higher density is relatively large, then the point is recognized as a cluster center. For the data point with the highest density, δ_i is instead computed as:

$\delta_i = \max_{j} (d_{ij})$.  (4)

In other words, if the current data point i already has the highest local density, δ_i is the maximum distance from data point i to any other point. Since such a point is a local or global maximum of the density, its δ_i is much larger than the distance from data point i to its nearest neighbor. Consequently, cluster centers can be identified as the points with anomalously large values of δ_i.
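For illustration, the following minimal Python sketch computes ρ and δ from a pairwise distance matrix according to equations (1)–(4) and picks cluster centers as points with anomalously large δ. It is not the authors' implementation: the Gaussian kernel of equation (2) is used, and the threshold rule for flagging "anomalously large" values (delta_factor) is an assumption made purely for this sketch.

```python
import numpy as np

def fsfdp(dist, percent=2.0, delta_factor=2.0):
    """Sketch of clustering by fast search and find of density peaks.

    dist: (N, N) symmetric matrix of pairwise distances.
    percent: position (in %) of the sorted pairwise distances used as d_c.
    delta_factor: multiplier used to flag anomalously large delta values
                  (an illustrative assumption, not taken from the paper).
    """
    n = dist.shape[0]
    # Cutoff distance d_c: value at the two-percent position of the
    # ascending pairwise distances (upper triangle only).
    pairwise = dist[np.triu_indices(n, k=1)]
    d_c = np.sort(pairwise)[int(round(len(pairwise) * percent / 100.0))]

    # Local density with the Gaussian kernel, equation (2); subtract the self term.
    rho = np.sum(np.exp(-(dist / d_c) ** 2), axis=1) - 1.0

    # Delta, equations (3) and (4): distance to the nearest point of higher density.
    order = np.argsort(-rho)                 # indices sorted by decreasing density
    delta = np.zeros(n)
    delta[order[0]] = dist[order[0]].max()   # highest-density point, equation (4)
    for pos in range(1, n):
        i = order[pos]
        delta[i] = dist[i, order[:pos]].min()  # equation (3)

    # Centers: points whose delta (and density) are anomalously large;
    # the globally densest point is always kept as a center.
    centers = np.where((delta > delta_factor * delta.mean()) &
                       (rho > rho.mean()))[0]
    centers = np.union1d(centers, [order[0]])

    # Assign each remaining point to the cluster of its nearest neighbor
    # of higher density, following the density ordering.
    labels = -np.ones(n, dtype=int)
    labels[centers] = np.arange(len(centers))
    for pos in range(n):
        i = order[pos]
        if labels[i] == -1:
            higher = order[:pos]
            labels[i] = labels[higher[np.argmin(dist[i, higher])]]
    return labels, rho, delta
```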
B. Class-Attribute Interdependence Maximization

Class-attribute interdependence maximization (CAIM) is a top-down discretization method [21]. The purpose of the algorithm is to maximize the class-attribute interdependence and to generate a minimal number of discrete intervals. CAIM proposes a heuristic measure that quantifies the dependency between the classes and the discrete intervals: the larger the CAIM value, the greater the dependency. The discretization process starts by selecting a numeric attribute and arranging its values in ascending order. The CAIM criterion is defined as:

$CAIM(C, D \mid F) = \frac{1}{n} \sum_{r=1}^{n} \frac{max_r^2}{M_{+r}}$,  (5)

where n is the number of intervals, max_r (r = 1, 2, ..., n) is the maximum value in the r-th column of the quanta matrix, M_{+r} is the total number of continuous values of attribute F that fall in the r-th interval, C is the class variable, D is the discretization variable and F is the continuous attribute.
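As an illustration of equation (5), the short sketch below computes the CAIM value of a candidate discretization scheme from its quanta matrix (rows indexed by class, columns by interval). It is a small helper written for this description of CAIM, not code from [21], and the function name is illustrative.

```python
import numpy as np

def caim_value(quanta):
    """CAIM criterion of equation (5).

    quanta: (num_classes, num_intervals) array; entry (c, r) is the number
            of values of attribute F belonging to class c that fall into
            the r-th interval of the candidate discretization scheme.
    """
    quanta = np.asarray(quanta, dtype=float)
    n = quanta.shape[1]             # number of intervals
    max_r = quanta.max(axis=0)      # largest class count in each interval
    m_plus_r = quanta.sum(axis=0)   # total number of values in each interval
    return np.sum(max_r ** 2 / m_plus_r) / n

# Example: two classes, three candidate intervals; larger values indicate
# stronger dependency between classes and intervals.
# caim_value([[10, 2, 0],
#             [ 1, 8, 9]])
```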
C. Probability Function of Inconsistency Measure

One of the probabilistic inconsistency measures is I_η [22], which is based on a probability function over the underlying propositional language. A probability function P ∈ F(At) is extended to formulas φ ∈ ι(At) via

$P(\phi) = \sum_{\omega \in \Omega(At),\, \omega \models \phi} P(\omega)$.  (6)
Here At is a finite set of propositions, the fixed propositional signature, and ι(At) is the corresponding propositional language generated by the atoms in At and the connectives ∧ (and), ∨ (or) and ¬ (negation). ω denotes an interpretation of the language over At, i.e. a function ω : At → {true, false}, and Ω(At) represents the set of all interpretations of At. Probabilistic semantics are given by probability functions on Ω(At): F(At) denotes the set of all probability functions P, where a probability function P is a function P : Ω(At) → [0, 1] with Σ_{ω∈Ω(At)} P(ω) = 1. The probability of a formula is thus the sum of the probabilities of the interpretations that satisfy that formula. The main idea of the probabilistic inconsistency measure is to find a probability function that assigns the maximum possible probability to every formula in the knowledge base. If a probability function can be found that assigns probability 1 to every formula, the knowledge base is consistent. If the knowledge base is not consistent, the probability mass must be spread out, because an inconsistent set of formulas cannot be satisfied by a single interpretation, so the probability P(ω) can only be distributed over a subset of Ω(At). The smaller the maximum probability that can be assigned to all formulas, the more inconsistent the knowledge base is.

III. DISCRETIZATION METHOD BASED ON FSFDP AND EVALUATION METHOD OF INCONSISTENCY

In this section, a cluster-based approach is proposed to discretize the continuous attributes of a data set, and the inconsistency measure is applied as an index to evaluate the effect of the discretization.

A. Discretization Method based on Clustering by Fast Search and Find of Density Peaks

The discretization method based on FSFDP and a splitting strategy is used to discretize the continuous attributes of data sets. The general process of the top-down splitting strategy, which further processes the clustering results, involves four steps. Firstly, the successive attribute values are sorted in descending order according to a specified rule. Secondly, candidate breakpoints of the continuous attribute are determined. Thirdly, the intervals are split according to the given judgment criterion. Finally, if the termination condition of the criterion
is satisfied after the third step, the discretization of the continuous attribute terminates; otherwise, the third step is repeated until the termination condition is reached.

Algorithm 1 DFSFDP Algorithm
Input: data set with continuous attributes X
Output: discretized data set disc_data; discrete intervals for each attribute d_m
 1: Compute the matrix of distances between the sample data.
 2: Take the distance at the two-percent position of the sorted pairwise distances as the cutoff distance.
 3: Calculate the local density with the Gaussian kernel and sort it in descending order (ordrho).
 4: for i = 2 to ND do
 5:   for j = 1 to i − 1 do
 6:     if dist(ordrho(i), ordrho(j)) < delta(ordrho(i)) then
 7:       Update δ(ordrho(i)) with this distance.
 8:     end if
 9:   end for
10: end for
11: Set δ of the highest-density point to the maximum distance.
12: Select the cluster centers.
13: Assign the data points according to the cluster centers.
14: Apply the top-down splitting strategy to each cluster.
15: Take the range of values of an attribute as the initial interval.
16: for j = 1 to size_F do
17:   if ∼F(j, 2) then
18:     Select the smallest of the remaining unmarked values as a new breakpoint and evaluate it.
19:     Add the value to the discrete intervals.
20:   end if
21: end for
22: for i = 1 to L − 1 do
23:   for j = 1 to size_d − 1 do
24:     Find the indices that satisfy S(:, i) ≥ d(j) and S(:, i) ≤ d(j + 1).
25:     Divide the data according to the discrete intervals.
26:   end for
27: end for

Algorithm 1 gives the basic procedure of the method proposed in this paper. The first step is to calculate the distances between the data points and to determine the cutoff distance. Next, the local density is calculated with the Gaussian kernel (or the cut-off kernel), and the distance between the current point and the points with higher density is computed. The points with anomalously large δ are then selected as cluster centers, and the remaining data points are assigned to the clusters. After the clustering, a top-down splitting strategy is applied to each cluster: the minimum and maximum values of a feature are taken as the initial interval, breakpoints are then selected from the remaining unlabeled attribute values, and it is decided whether each candidate point is inserted into the interval set or not. Finally, all the data are assigned to the resulting intervals, and the data belonging to the same interval receive the same label.
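To illustrate how the clusters produced by FSFDP can be turned into discrete intervals for one attribute, the following sketch derives cut points from the cluster boundaries and maps every value to an interval code. The paper only outlines the splitting criterion of steps 15–21, so the midpoint rule used here is an assumption made for illustration, and the function and variable names are hypothetical.

```python
import numpy as np

def discretize_attribute(values, labels):
    """Turn cluster labels of a single attribute into discrete codes.

    values: 1-D array with the continuous values of one attribute.
    labels: cluster label of each value (e.g. from the fsfdp sketch above,
            applied to the pairwise |values_i - values_j| distances).
    Returns the cut points and the discrete code of every value.
    """
    values = np.asarray(values, dtype=float)
    # Boundary between two adjacent clusters: midpoint between the largest
    # value of the lower cluster and the smallest value of the upper cluster.
    # (Assumed rule; the paper's splitting criterion may differ.)
    cluster_ranges = sorted((values[labels == c].min(), values[labels == c].max())
                            for c in np.unique(labels))
    cuts = [(lo_hi[1] + cluster_ranges[k + 1][0]) / 2.0
            for k, lo_hi in enumerate(cluster_ranges[:-1])]
    # Interval membership: code r means cuts[r-1] < value <= cuts[r].
    codes = np.searchsorted(cuts, values)
    return cuts, codes

# Example usage with a hypothetical attribute x and its cluster labels:
# cuts, codes = discretize_attribute(x, labels)
```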
Continuous attribute discretization reduces the number of distinct values of a continuous feature. Not only are discrete values closer to a knowledge-level representation than continuous values, they are also easier to comprehend, use and interpret. In addition, defects hidden in the data can be effectively overcome, and discretization can lead to a more stable model structure. Moreover, the proposed DFSFDP method automatically determines the number of clusters, which avoids the main disadvantage of cluster-based discretization.

B. Evaluation Method of Inconsistency Measure

The quality of a discretization can be judged in several ways, e.g. completeness of the discretization, simplicity of the discrete result, prediction accuracy, etc. In this paper, the inconsistency measure is employed to evaluate the results after discretization. When the inconsistency measure is used as the evaluation index, the inconsistency of the original data set and of the discretized data set are both calculated. If the inconsistency of the discretized data set is less than that of the original data set, the discretization method has reduced the contradiction and incompatibility within the data.

The discretization of a data set is assessed with a probabilistic inconsistency measure, which is applied to one feature at a time. The proportion of each attribute value within the same decision class, the proportion of each attribute value, and the proportion of each decision class are calculated separately. The probability function is the sum, over all samples grouped by decision class, of the products of these proportions. The probability function of the probabilistic inconsistency measure is expressed as:

$P(\varphi) = \sum_{\alpha \in D} P(\alpha) \left( \sum_{\beta \in C,\, \gamma \in C \cap D} P(\beta) P(\gamma) \right)$,  (7)
where D denotes the decision attribute and C the condition attribute; P(α) = |α|/N is the proportion of each decision class; P(β) = |β|/N is the proportion of each value of the attribute; P(γ) = |γ|/|C ∩ D| is the proportion of each value with the same decision class under the attribute; and N is the total number of samples. The idea of the proposed probability function conforms to the original probabilistic inconsistency measure: the probability of a formula is the sum of the probabilities of the possible interpretations that satisfy it. The probabilistic inconsistency measure is defined as:

$\Gamma(S) = 1 - \max\{\varepsilon \mid \exists P \in \wp(\Psi) : \forall \lambda \in S : P(\lambda) \geq \varepsilon\}$,  (8)

in which S denotes the entire discretized data set and ℘(Ψ) is the set of all probability functions defined above. The probabilistic inconsistency measure quantifies the degree of conflicting records in the data set. P(ϕ) indicates the consistency of the data set, computed by combining the various proportions that belong to the same decision class; when P(ϕ) reaches its maximum, Γ(S) corresponds to the smallest inconsistency of the data set.
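As a concrete illustration, the sketch below computes the probability function of equation (7) for one discretized attribute and reports 1 − P(ϕ) as its inconsistency, in the spirit of equation (8). The paper does not fully pin down how P(β) and P(γ) are grouped, so the reading used here (P(γ) as the proportion of samples of decision class α that take attribute value β) is an assumption, and the function names are illustrative.

```python
from collections import Counter

def p_phi(attr, classes):
    """Probability function of equation (7) for one discretized attribute.

    attr:    discrete value of the attribute for every sample.
    classes: decision class of every sample.
    Assumed reading: P(alpha) = |alpha|/N, P(beta) = |beta|/N and
    P(gamma) = |{value = beta and class = alpha}| / |alpha|.
    """
    n = len(attr)
    class_count = Counter(classes)            # |alpha| per decision class
    value_count = Counter(attr)               # |beta| per attribute value
    joint_count = Counter(zip(attr, classes)) # |gamma| per (value, class) pair

    total = 0.0
    for alpha, n_alpha in class_count.items():
        p_alpha = n_alpha / n
        inner = sum((value_count[beta] / n) * (joint_count[(beta, alpha)] / n_alpha)
                    for beta in value_count)
        total += p_alpha * inner
    return total

def inconsistency(attr, classes):
    """Per-attribute inconsistency in the spirit of equation (8): 1 - P(phi)."""
    return 1.0 - p_phi(attr, classes)
```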
IV. EXPERIMENT RESULTS

In this section, the proposed FSFDP-based discretization method and four other discretization methods are applied to eight data sets from the UCI machine learning repository [23], and the effect of the discretization on the data sets is evaluated. The eight data sets are Cleveland, Climate, Ecoli, Liver, Olitos, Parkinson, Sonar and Waveform. Table I shows their numbers of objects, features and decision classes. The performance of the discretization methods on the eight data sets is discussed in detail below.

TABLE I. Brief Introduction of Data Sets

Dataset     Objects  Features  Decision classes
Cleveland       297        13                 5
Climate         540        18                 2
Ecoli           336         7                 8
Liver           345         6                 2
Olitos          120        25                 4
Parkinson       195        22                 2
Sonar           208        60                 2
Waveform       5000        40                 3
Among the implemented discretization methods, K-means [24] and DBSCAN [25] are unsupervised. K-means partitions the input data set into k clusters according to an initial value, while DBSCAN is a density-based clustering algorithm that can find arbitrarily shaped clusters in noisy data. In contrast, Chimerge [26] and CAIM [21] are supervised. To transform the continuous attributes into discrete variables, the Chimerge method computes the chi-square value of each pair of adjacent intervals and merges the pair with the lowest value for each feature, while the CAIM algorithm maximizes the class-attribute interdependence and generates a minimal number of discrete intervals.

A. Evaluation Index

In this paper, the probabilistic inconsistency measure is proposed as an index to evaluate the effect of discretization. The standard deviation, the chi-square test and the dependence degree are used as the other indexes. The evaluation of the different discretization methods on the different data sets by these four indicators is reported in Table II to Table V. For the inconsistency measure, the chi-square test and the standard deviation, a smaller value indicates a better discretization result, whereas for the dependence degree a larger value indicates a better result. For example, according to Table II, the inconsistency measure selects the CAIM method on the data sets Climate, Ecoli, Liver, Olitos, Sonar and Waveform; DFSFDP and DBSCAN on the data sets Cleveland, Ecoli and Olitos; K-means on the data sets Cleveland and Ecoli; and Chimerge on the data sets Ecoli and Parkinson. For the other three indexes, the discretization methods are selected from Table III to Table V in the same way as from Table II. Note that the values in Table IV are divided by 1000.

TABLE II. Inconsistency Measure of Different Data Sets and Discretization Methods

Dataset     DFSFDP     CAIM  Chimerge   DBSCAN  K-means
Cleveland   0.9839   0.9847    0.9854   0.9839   0.9839
Climate     0.8001   0.3245    0.7514   0.3262   0.8086
Ecoli       0.9997   0.9997    0.9997   0.9997   0.9997
Liver       0.5917   0.3195    0.5934   0.711    0.6774
Olitos      0.9068   0.9068    0.9404   0.9068   0.9519
Parkinson   0.8016   0.7977    0.4163   0.8016   0.482
Sonar       0.3637   0.3612    0.5561   0.3637   0.8153
Waveform    0.5324   0.5316    0.7087   0.5324   0.7476
TABLE III. Standard Deviation of Different Data Sets and Discretization Methods

Dataset     DFSFDP     CAIM  Chimerge   DBSCAN  K-means
Cleveland   0.7173   0.5871    1.1848   0.6188   1.4329
Climate     0.8550   0.3289    2.3045   0.3298   0.5152
Ecoli       0.8667   1.3741    1.3645   1.2741   1.8151
Liver       0.8420   0.3144    1.5293   1.2290   1.1694
Olitos      0.8518   0.8664    2.3100   0.8518   2.9092
Parkinson   0.3921   0.4008    2.1873   0.3921   1.5226
Sonar       0.4244   0.4253    2.1593   0.4244   1.3358
Waveform    0.7535   0.7537    1.9748   0.7535   1.2838
TABLE IV. Chi-square Test of Different Data Sets and Discretization Methods

Dataset     DFSFDP     CAIM  Chimerge   DBSCAN  K-means
Cleveland   5.0242   5.8212    4.1993   5.2556   2.6055
Climate     53.518   11.745    203.54   11.767   50.552
Ecoli       77.904   34.608    87.646   74.189   165.94
Liver       44.091    19.78    67.866   56.293   48.066
Olitos      3.5705   3.6080    2.1220   3.5705   2.8183
Parkinson   3.6493   3.6381    19.911   3.6493   9.1709
Sonar        0.306    0.304      0.79    0.306     2.12
Waveform    1815.3     1807    2715.5   1815.3   3687.9
TABLE V. Dependence Degree of Different Data Sets and Discretization Methods

Dataset     DFSFDP     CAIM  Chimerge   DBSCAN  K-means
Cleveland   0.9630   0.8889         1   0.9562        1
Climate     0.9963   0.9611         1   0.9648   0.8086
Ecoli       0.7619   0.9048    0.9048   0.8958   0.8571
Liver       0.5043   0.0783    0.8812   0.7884   0.6774
Olitos           1        1         1        1        1
Parkinson   0.9179    0.959         1   0.9179        1
Sonar            1        1         1        1        1
Waveform         1        1         1        1        1
B. Classification Accuracy

In order to judge whether the discretization methods selected by the evaluation criteria indeed give better results, the discretized data sets are tested with different classifiers. If the discretization method chosen by the inconsistency measure achieves higher classification accuracy than the methods selected by the other indicators, the proposed inconsistency measure can be employed as an evaluation index. In the experiments, the same classifier is used to classify the discretized versions of the different data sets. Table VI, Table VII and Table VIII show the classification accuracy of the different data sets and discretization methods under the JRip, IBK and J48 classifiers, respectively.

TABLE VI. Classification Accuracy by JRip

Dataset     DFSFDP    CAIM  Chimerge   DBSCAN  K-means
Cleveland    54.06   54.02     53.72    54.39    54.19
Climate      91.67   94.20     90.39    93.41    91.24
Ecoli        81.87   83.61     83.84    77.60    82.95
Liver        64.82   68.39     65.80    66.53    65.08
Olitos       71.83   72.42     67.50    71.83    64.75
Parkinson    85.78   86.94     89.95    85.78    86.08
Sonar        77.56   79.52     77.94    77.56    78.78
Waveform     78.40   78.74     79.97    78.40    79.69

TABLE VII. Classification Accuracy by IBK

Dataset     DFSFDP    CAIM  Chimerge   DBSCAN  K-means
Cleveland    54.35   53.96     57.53    54.96    55.14
Climate      92.48   92.06     91.74    92.06    91.20
Ecoli        78.42   85.81     84.94    73.94    85.31
Liver        67.07   69.11     62.27    60.31    63.28
Olitos       79.42   79.00     84.17    79.42    83.92
Parkinson    86.86   86.42     91.09    86.86    92.57
Sonar        87.30   86.72     82.85    87.30    84.04
Waveform     76.88   77.33     76.92    76.88    82.01

TABLE VIII. Classification Accuracy by J48

Dataset     DFSFDP    CAIM  Chimerge   DBSCAN  K-means
Cleveland    50.47   53.96     55.10    54.63    53.01
Climate      91.61   94.28     90.87    94.22    91.48
Ecoli        84.67   84.29     84.28    76.04    81.64
Liver        61.98   69.11     67.02    65.16    65.13
Olitos       71.00   74.42     68.25    71.00    57.67
Parkinson    86.42   87.61     88.78    86.42    87.09
Sonar        81.00   81.25     76.47    81.00    76.41
Waveform     75.43   75.64     75.89    75.43    76.22
According to the evaluation results of the four indicators and the classification accuracy, a well-performing discretization method can be identified for each data set. For the J48 classifier and the Parkinson data set, the Chimerge method is selected by the inconsistency measure, the dependence degree selects Chimerge and K-means, the CAIM method is chosen by the chi-square test, and the standard deviation chooses DFSFDP. According to the results in Table VIII, the classification accuracies of the selected Chimerge, CAIM, K-means and DFSFDP methods under this classifier are 88.78, 87.61, 87.09 and 86.42, respectively. These results indicate that the discretization method chosen by the inconsistency measure is better than those chosen by the other indicators in most cases. For the JRip classifier, apart from Ecoli, the
discretization methods selected by the inconsistency measure perform better on the corresponding data sets than those chosen by the other indexes.

V. CONCLUSION

The idea behind clustering by fast search and find of density peaks is simple and elegant. In this paper, a density-based discretization method is proposed to deal with data sets with continuous attributes. First of all, the clustering method is employed to automatically find the number of clusters. Then the discrete intervals are obtained by further splitting each cluster. Finally, the attribute values are assigned labels according to the intervals, which completes the discretization. In order to evaluate the results of the discretization, the probabilistic inconsistency measure is proposed as an evaluation index. The experimental results have demonstrated that the inconsistency measure can select a better discretization method than the other three evaluation indexes in some cases. Therefore, the inconsistency measure can be used as an indicator to evaluate the effect of discretization.

Although the inconsistency measure has shown promising performance in evaluating the effects of discretization, more work can be done on this topic. Since the proposed DFSFDP method performs best only on a few occasions, the formulas used for calculating the density or the distance may be replaced by other kernels or distance measures in future work. In addition, when the sample size is small but the number of decision classes is large, the parameters in the probability function become small, which makes the values of the inconsistency measure very similar and degrades the experimental results. Therefore, in future work the formula for the inconsistency measure will be modified so that it is not affected by the number of samples and decision classes.

ACKNOWLEDGMENT

This work is jointly supported by the National Natural Science Foundation of China (No. 61502068), the China Postdoctoral Science Foundation (No. 2013M541213 and 2015T80239), and the Royal Society International Exchanges Cost Share Award with NSFC (No. IE160875).
REFERENCES

[1] L. Peng, W. Qing, and G. Yujia, "Study on comparison of discretization methods," in Artificial Intelligence and Computational Intelligence, vol. 4, pp. 380–384, 2009.
[2] S. Ramírez-Gallego and S. García, "Multivariate discretization based on evolutionary cut points selection for classification," IEEE Transactions on Cybernetics, vol. 46, no. 3, pp. 595–608, 2016.
[3] J. Catlett, "On changing continuous attributes into ordered discrete attributes," in Machine Learning: EWSL-91, pp. 164–178, 1991.
[4] R. Kerber, "Chimerge: Discretization of numeric attributes," in Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 123–128, 1992.
[5] H. Luo and J. Yan, "A global discretization method based on clustering and rough set," in The International FLINS Conference, pp. 400–405, 2015.
[6] K. Black, "Approximation of Navier-Stokes incompressible flow using a spectral element method with a local discretization in spectral space," Numerical Methods for Partial Differential Equations, vol. 13, no. 6, pp. 587–599, 2015.
[7] J. Zhu and M. Collette, "A dynamic discretization method for reliability inference in dynamic Bayesian networks," Reliability Engineering and System Safety, vol. 138, pp. 242–252, 2015.
[8] J. Dougherty and R. Kohavi, "Supervised and unsupervised discretization of continuous features," in Machine Learning: Proceedings of the Twelfth International Conference, vol. 12, pp. 194–202, 1995.
[9] G. Schmidberger and E. Frank, "Unsupervised discretization using tree-based density estimation," in PKDD, vol. 5, pp. 240–251, 2005.
[10] M. G. Rahman and M. Z. Islam, "Discretization of continuous attributes through low frequency numerical values and attribute interdependency," Expert Systems with Applications, vol. 45, pp. 410–423, 2016.
[11] M. Thimm, "On the expressivity of inconsistency measures," Artificial Intelligence, vol. 234, pp. 120–151, 2016.
[12] A. Hunter and S. Konieczny, "Measuring inconsistency through minimal inconsistent sets," in KR, vol. 8, pp. 358–366, 2008.
[13] K. Mu, W. Liu, and Z. Jin, "A general framework for measuring inconsistency through minimal inconsistent sets," Knowledge and Information Systems, vol. 27, no. 1, pp. 85–114, 2011.
[14] K. Mu, "Responsibility for inconsistency," International Journal of Approximate Reasoning, vol. 61, pp. 43–60, 2015.
[15] J. Grant and A. Hunter, "Measuring consistency gain and information loss in stepwise inconsistency resolution," in Symbolic and Quantitative Approaches to Reasoning with Uncertainty, pp. 362–373, 2011.
[16] K. Knight, A Theory of Inconsistency, 2002.
[17] N. Potyka, "Linear programs for measuring inconsistency in probabilistic logics," in KR, 2014.
[18] J. Grant and A. Hunter, "Analysing inconsistent information using distance-based measures," International Journal of Approximate Reasoning, vol. 89, pp. 3–26, 2017.
[19] S. Jabbour and B. Raddaoui, "Measuring inconsistency through minimal proofs," in European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty, pp. 290–301, 2013.
[20] A. Rodriguez and A. Laio, "Clustering by fast search and find of density peaks," Science, vol. 344, pp. 1492–1496, 2014.
[21] A. Cano, D. T. Nguyen, and S. Ventura, "ur-CAIM: improved CAIM discretization for unbalanced and balanced data," Soft Computing, vol. 20, no. 1, pp. 173–188, 2016.
[22] N. Potyka and M. Thimm, "Probabilistic reasoning with inconsistent beliefs using inconsistency measures," in IJCAI, pp. 3156–3163, 2015.
[23] C. Blake, "UCI repository of machine learning databases," 1998.
[24] N. Dixit, "An improved SVM classifier for discretization of attributes using k-means clustering," International Journal of Computer Applications, vol. 159, no. 9, 2017.
[25] I. Cordova and T. S. Moh, "DBSCAN on resilient distributed datasets," in International Conference on High Performance Computing and Simulation, pp. 531–540, 2015.
[26] S. Rosati, G. Balestra, and V. Giannini, "Chimerge discretization method: Impact on a computer aided diagnosis system for prostate cancer in MRI," in Medical Measurements and Applications (MeMeA), 2015 IEEE International Symposium on, pp. 297–302, IEEE, 2015.