Journal of Information & Computational Science 10:13 (2013) 4105–4115 Available at http://www.joics.com
September 1, 2013
A Backtracking Approach to Minimal Cost Feature Selection of Numerical Data ⋆

Hong Zhao ∗, Fan Min ∗, William Zhu

Lab of Granular Computing, Zhangzhou Normal University, Zhangzhou 363000, China
Abstract

In data mining applications, feature selection is one of the most fundamental problems; it can improve the generalization capability of patterns. Recent developments in this field have led to renewed interest in cost-sensitive feature selection. Minimal cost feature selection seeks a trade-off between test costs and misclassification costs. However, this problem has so far been addressed only on nominal data. In this paper, we consider numerical data with measurement errors and propose a backtracking approach to this problem. First, we build a data model with normally distributed measurement errors. To deal with this model, the neighborhood of each data item is constructed through the confidence interval. Compared with discretized intervals, neighborhoods preserve more of the information in the data. We then redefine the minimal total cost feature selection problem on the neighborhood model. Finally, a backtracking algorithm with three effective pruning techniques is proposed for this problem. Experimental results indicate that the pruning techniques are effective and that the algorithm is efficient for datasets with nearly one thousand objects.

Keywords: Feature Selection; Normal Distribution Measurement Errors; Test Costs; Misclassification Costs
1 Introduction
Minimal cost feature selection seeks a trade-off between test costs and misclassification costs. Test costs are what we pay for collecting data items [10]; they are often measured in time, money, or other resources. When only test costs are considered (see, e.g., [12, 16, 17, 19, 20]), the minimal cost feature selection problem degrades to the minimal test cost reduct problem [10, 11]. Therefore, the minimal cost feature selection problem is a generalization of the minimal test cost reduct problem.

⋆ This work is supported in part by the Natural Science Foundation of Fujian Province, China under Grant No. 2012J01294, the Fujian Province Foundation of Higher Education under Grant No. JK2012028, the National Science Foundation of China under Grant No. 61170128, and the Education Department of Fujian Province under Grant No. JA12222.
∗ Corresponding author. Tel.: +86 133 7690 8359
Email addresses:
[email protected] (Hong Zhao),
[email protected] (Fan Min).
1548–7741 / Copyright © 2013 Binary Information Press DOI: 10.12733/jics20102163
In addition to test costs, misclassification costs must be considered in cost-sensitive learning [22]. The misclassification cost is the penalty we incur when deciding that an object belongs to class J while its real class is K [2, 3]. For example, in medical diagnosis, if cancer (non-cancer) is regarded as the negative (positive) class, misclassifying a patient incurs a penalty. When only misclassification costs are considered (see, e.g., [2, 26]), the average misclassification cost is a more general metric than the accuracy [13].

It is important to consider both test costs and misclassification costs in many applications. Test costs are paid on each object, while misclassification costs are paid only on misclassified objects [20]. Therefore we should take into account the total cost over a set of objects. This issue has recently been addressed on nominal data. In real applications, however, data are often numerical, and they always carry some measurement error. Such measurement errors are ubiquitous and unavoidable.

In this paper, we consider numerical data with measurement errors and propose a backtracking approach to the minimal cost feature selection problem. First, a data model with measurement errors under the normal distribution is defined. Second, we construct the neighborhood of each data item through the confidence interval. Compared with discretized intervals, neighborhoods preserve more of the information in the data. Third, considering the trade-off between test costs and misclassification costs, we define a new minimal total cost feature selection problem. Fourth, a backtracking algorithm with three effective pruning techniques is proposed for this problem. Four open datasets from the University of California - Irvine (UCI) library are employed to study the effectiveness and efficiency of our algorithm. Experiments are undertaken with the open source software Coser [15] to validate the performance of this algorithm.
Experimental results indicate that the pruning techniques are effective and that the algorithm is efficient for datasets with nearly one thousand objects.

This paper is organized as follows. Section 2 introduces a decision system with normal distribution measurement errors; we then introduce test costs and misclassification costs into this decision system and define a cost-sensitive decision system. In Section 3, we define a new minimal cost feature selection problem considering both test costs and misclassification costs. A full description of a backtracking algorithm for this problem is given in Section 4. Experimental settings and results are discussed in Section 5. Finally, Section 6 concludes and suggests further research directions.
2 Data Models
In this section, the concept of decision systems with Normal Distribution Measurement Errors (NDME) is revisited. Then the neighborhood of each data item is constructed through the confidence interval. Finally, decision systems based on NDME with test costs and misclassification costs are presented.

Definition 1 [24] A decision system with Normal Distribution Measurement Errors (NEDS) S is the 6-tuple:

S = (U, C, d, V = {Va | a ∈ C ∪ {d}}, I = {Ia | a ∈ C ∪ {d}}, n),  (1)
where U is the nonempty set called a universe, C is the nonempty set of conditional attributes, d is the decision attribute, Va is the set of values for each a ∈ C ∪ {d}, and Ia : U → Va is an information function for each a ∈ C ∪ {d}. n : C → R+ ∪ {0} gives the maximal measurement error of each a ∈ C, and ±n(a) are the confidence limits of a, that is, the lower and upper boundaries of a confidence interval.

In real applications, there are a number of measurement methods for obtaining a numerical data item, with different measurement errors. Measurement errors often follow a normal distribution, which is found to be applicable over almost the whole of science and engineering measurement. We therefore introduce the confidence interval of the normal distribution into our model. Based on Definition 1, we have defined a neighborhood in [25] as follows.

Definition 2 Let S = (U, C, d, V, I, n) be a NEDS. Given xi ∈ U and B ⊆ C, the neighborhood of xi with respect to normal distribution measurement errors on test set B is defined as

nB(xi) = {x ∈ U | ∀a ∈ B, |a(x) − a(xi)| ≤ 2n(a)},  (2)

where the error value of each a lies in [−n(a), +n(a)]. From Definition 2 we know that

nB(xi) = ∩_{a∈B} n{a}(xi).  (3)
That is, the neighborhood nB(xi) is the intersection of a number of basic neighborhoods (see, e.g., [1, 7, 8, 23]). For any x ∈ U and any a ∈ B, x ∈ nB(x); therefore, for any B ⊆ C, ∪_{x∈U} nB(x) = U. Hence the set {nB(xi) | xi ∈ U} is a covering (see, e.g., [9, 27, 28, 29, 30]) of U.

In cost-sensitive learning, test costs and misclassification costs are the two most important types of costs. We now define a cost-sensitive decision system considering both.

Definition 3 A decision system based on NDME with test costs and misclassification costs (NEDS-TM) S is the 8-tuple:

S = (U, C, d, V, I, n, tc, mc),  (4)

where U, C, d, V, I and n have the same meanings as in Definition 1, tc : C → R+ ∪ {0} is the test cost function, and mc : k × k → R+ ∪ {0} is the misclassification cost function, where k = |Id|. For any B ⊆ C, the sequence-independent test cost function tc is defined as

tc(B) = Σ_{a∈B} tc(a).  (5)
The misclassification cost function can be represented by a matrix mc = {mc_{k×k}}. The misclassification cost [4, 5, 21] is the penalty we receive when deciding that an object belongs to class m while its real class is n [3, 14]. If the classification is correct, the misclassification cost mc[m, m] = 0. The following example gives an intuitive understanding.
Table 1: An example of a numerical attribute decision table

x    a1    a2    a3    d
x1   0.31  0.23  0.08  y
x2   0.14  0.38  0.23  y
x3   0.25  0.40  0.40  y
x4   0.60  0.46  0.15  n
x5   0.41  0.64  0.62  n
x6   0.35  0.50  0.75  n
Table 2: A neighborhood boundary vector and a test cost vector

a     a1      a2      a3
n(a)  0.0069  0.0087  0.0086
c(a)  $28     $19     $56
Example 1 A NEDS-TM is given by Tables 1 and 2, and

    mc = [  0   800
           200    0 ].  (6)
That is, the test costs are $28, $19 and $56, respectively. In this dataset, the decision attribute splits the objects into two sets. If a person belonging to class "y" is misclassified into class "n", a penalty of $800 is paid; in the opposite case, $200 is paid.
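To make Definition 2 concrete, the following sketch computes neighborhoods on the data of Tables 1 and 2. The dictionary layout and the helper name `neighborhood` are our own illustrative choices, not part of the paper's Coser implementation.

```python
# Table 1: objects, their attribute values, and decision classes.
data = {
    "x1": {"a1": 0.31, "a2": 0.23, "a3": 0.08, "d": "y"},
    "x2": {"a1": 0.14, "a2": 0.38, "a3": 0.23, "d": "y"},
    "x3": {"a1": 0.25, "a2": 0.40, "a3": 0.40, "d": "y"},
    "x4": {"a1": 0.60, "a2": 0.46, "a3": 0.15, "d": "n"},
    "x5": {"a1": 0.41, "a2": 0.64, "a3": 0.62, "d": "n"},
    "x6": {"a1": 0.35, "a2": 0.50, "a3": 0.75, "d": "n"},
}
# Table 2: maximal measurement errors n(a).
n_err = {"a1": 0.0069, "a2": 0.0087, "a3": 0.0086}

def neighborhood(xi, B):
    """Eq. (2): nB(xi) = {x in U | for all a in B, |a(x) - a(xi)| <= 2 n(a)}."""
    return {x for x, row in data.items()
            if all(abs(row[a] - data[xi][a]) <= 2 * n_err[a] for a in B)}

# Eq. (3): nB(xi) is the intersection of single-attribute neighborhoods.
assert neighborhood("x1", ["a1", "a2"]) == \
    neighborhood("x1", ["a1"]) & neighborhood("x1", ["a2"])

# With the small error bounds of Table 2 every neighborhood is a singleton:
# no two a1-values lie within 2 * 0.0069 = 0.0138 of each other.
print(neighborhood("x1", ["a1"]))  # {'x1'}
```

Note that for B = ∅ the membership condition holds vacuously, so n∅(xi) = U for every xi; this matches the covering property discussed after Eq. (3).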
3 Minimal Cost Feature Selection Problem
In this work, we focus on selecting a subset of features to minimize the total cost, that is, cost-sensitive feature selection based on test costs and misclassification costs. The problem of finding such a subset of features is called the minimal cost feature selection problem.

Problem 1 The minimal cost feature selection problem (MCFS).
Input: S = (U, C, d, V, I, n, tc, mc);
Output: R ⊆ C and the average total cost ATC(R);
Optimization objective: min ATC(R).

Compared with the classical minimal reduct problem, there are several differences. The first is the input: the external information includes the test costs and misclassification costs as well as the normal distribution measurement errors. The second is the optimization objective, which is to minimize the total cost instead of the number of features.

In data mining applications, the average total cost (ATC) is considered a more general metric than the accuracy [26]. The average total cost is computed as follows.

Step 1. Compute the neighborhood of each data item. Let B be a selected feature set, and let N(B) be the set {nB(xi) | xi ∈ U}.
Step 2. Assign one class. Let U′ = nB(xi) and let cB(xi)(dm) be the number of objects of class dm in U′, where dm ∈ {Id}. We relabel the elements of U′ with a single class so as to minimize the misclassification cost of U′. There are two cases.

1. U′ ⊆ POS_B({d}) if and only if d(x) = d(y) for any x, y ∈ U′. In this case, the misclassification cost of U′ is mc(U′, B) = 0, and for any x ∈ U′ the assigned class is d′(x) = d(x).

2. If there exist x, y ∈ U′ such that d(x) ≠ d(y), we relabel the elements of U′ with one class. Let d(x) be the m-class and d(y) the n-class. We select the relabelling that minimizes the misclassification cost of U′:

mc(U′, B) = min(mc_{m,n} × |U′_m|, mc_{n,m} × |U′_n|).  (7)

For any x ∈ U′,

d′(x) = { n-class, if mc(U′, B) = mc_{m,n} × |U′_m|,
          m-class, if mc(U′, B) = mc_{n,m} × |U′_n|,  (8)

where mc_{m,n} is the cost of classifying an object of the m-class into the n-class, and |U′_m| is the number of m-class objects.
Step 3. Compute the average misclassification cost. The class of xi is d′(xi) = dm if and only if cB(xi)(dm) = max{cB(xi)(dl) | dl ∈ {Id}}. The misclassification cost is

mc*(d(xi), d′(xi)) = { 0,         if d(xi) = d′(xi),
                       mc_{m,n},  if d(xi) = m and d′(xi) = n,
                       mc_{n,m},  if d(xi) = n and d′(xi) = m.  (9)

In this way, the average misclassification cost is given by

mc(U, B) = ( Σ_{xi∈U} mc*(d(xi), d′(xi)) ) / |U|.  (10)
Step 4. Compute the average total cost (ATC). The average total cost is given by

ATC(U, B) = tc(B) + mc(U, B).  (11)

In this context, we select a feature subset that minimizes the average total cost. The minimal average total cost is given by

ATC(U, B) = min{ATC(U, B′) | B′ ⊆ C}.  (12)
The MCFS problem has a motivation similar to that of the cheapest cost problem [18] and the Minimal Test cost Reduct (MTR) problem (see, e.g., [10]). However, our MCFS problem differs from the MTR problem in two respects.

1. In addition to the test cost of each attribute, we take the misclassification cost into account. When the misclassification costs are very large compared with the test costs, the MCFS problem coincides with the MTR problem. Therefore, the MCFS problem is a generalization of the MTR problem.

2. Attribute reduction [6] must preserve a particular property of the decision system, whereas feature selection relies only on the cost information.
4 Algorithm
In this section, the backtracking algorithm for the Minimal Cost Feature Selection (MCFS) problem is illustrated in Algorithm 1. To invoke the algorithm, one should initialize the global variables as follows: R = ∅ is the feature subset with minimal total cost, and cmc = mc(U, R) is the current minimal cost; then use the statement backtracking(R, 0). The feature subset with minimal total cost is stored in R.

Algorithm 1 A backtracking algorithm for the MCFS problem.
Input: (U, C, d, {Va}, {Ia}, n, tc, mc), selected tests R, current level test index lower bound l
Output: A set of features R with ATC, and cmc; both are global variables
Method: backtracking
 1: for (i = l; i < |C|; i++) do
 2:   B = R ∪ {ai}
 3:   if (tc(B) > cmc) then
 4:     continue; // Pruning for too expensive test cost
 5:   end if
 6:   if ((ATC(U, B) ≥ ATC(U, R)) and (mc(B) < mc(R))) then
 7:     continue; // Pruning for non-decreasing total cost despite decreasing misclassification cost
 8:   end if
 9:   if (ATC(U, B) < cmc) then
10:     cmc = ATC(U, B); // Update the minimal total cost
11:     R = B; // Update the set of features with minimal total cost
12:   end if
13:   backtracking(B, i + 1);
14: end for

Generally, the search space of a feature selection algorithm is of size 2^|C|. In this context, a backtracking algorithm with pruning techniques is used to select a feature subset minimizing the total cost. Three pruning techniques are employed in Algorithm 1.

1. Line 1 indicates that the variable i starts from l instead of 0. Whenever we move forward (see Line 13), the lower bound is increased. With this pruning technique, the solution space is 2^|C| instead of |C|!.

2. Lines 3 through 5 show the second pruning technique. Since misclassification costs are non-negative in practical applications, a feature subset B can be discarded if its test cost alone exceeds the current minimal cost (cmc). This technique prunes most branches.
3. Lines 6 through 8 indicate that if the new feature subset produces a higher cost, the current branch will never yield the feature subset with minimal total cost, so it is pruned.
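The pseudocode above can be sketched in a self-contained way on the Example 1 data. This is our own illustrative transcription under the simplified cost computation of Section 3 (the authors' Coser implementation may differ in detail); `best` plays the role of the global variables R and cmc.

```python
# Example 1 data (Tables 1 and 2, cost matrix of Eq. (6)).
data = {
    "x1": {"a1": 0.31, "a2": 0.23, "a3": 0.08, "d": "y"},
    "x2": {"a1": 0.14, "a2": 0.38, "a3": 0.23, "d": "y"},
    "x3": {"a1": 0.25, "a2": 0.40, "a3": 0.40, "d": "y"},
    "x4": {"a1": 0.60, "a2": 0.46, "a3": 0.15, "d": "n"},
    "x5": {"a1": 0.41, "a2": 0.64, "a3": 0.62, "d": "n"},
    "x6": {"a1": 0.35, "a2": 0.50, "a3": 0.75, "d": "n"},
}
n_err = {"a1": 0.0069, "a2": 0.0087, "a3": 0.0086}
tc = {"a1": 28, "a2": 19, "a3": 56}
mc = {("y", "y"): 0, ("n", "n"): 0, ("y", "n"): 800, ("n", "y"): 200}
attrs = ["a1", "a2", "a3"]

def avg_mc(B):
    """Average misclassification cost mc(U, B), Eqs. (7)-(10), simplified."""
    total = 0
    for xi in data:
        nbhd = [x for x, row in data.items()
                if all(abs(row[a] - data[xi][a]) <= 2 * n_err[a] for a in B)]
        classes = [data[x]["d"] for x in nbhd]
        c = min(set(classes),
                key=lambda lab: sum(mc[(k, lab)] for k in classes))
        total += mc[(data[xi]["d"], c)]
    return total / len(data)

def atc(B):
    return sum(tc[a] for a in B) + avg_mc(B)  # Eq. (11)

best = {"R": [], "cmc": avg_mc([])}  # R = empty set, cmc = mc(U, R)

def backtracking(R, l):
    for i in range(l, len(attrs)):       # pruning 1: i starts from l
        B = R + [attrs[i]]
        if sum(tc[a] for a in B) > best["cmc"]:
            continue                     # pruning 2: test cost alone too high
        if atc(B) >= atc(best["R"]) and avg_mc(B) < avg_mc(best["R"]):
            continue                     # pruning 3: total cost not decreasing
        if atc(B) < best["cmc"]:
            best["cmc"], best["R"] = atc(B), B
        backtracking(B, i + 1)

backtracking([], 0)
print(best)  # {'R': ['a2'], 'cmc': 19.0}
```

On this tiny example, pruning 2 cuts every extension of {a2} (any superset costs at least $75 in tests alone), so the search visits only a handful of the 2^3 subsets before settling on R = {a2}.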
5 Experiments
Experiments are undertaken on four datasets from the UCI Repository of Machine Learning Databases, as listed in Table 3. We conduct three groups of experiments from different viewpoints.
In many real-world applications, such as fraud detection and medical diagnosis, the costs of different types of misclassification can be vastly different. Therefore, in our experiments, we assume that the misclassification costs differ from each other. Table 4 shows the optimal feature subset under different misclassification costs for the Diab dataset; the ratio of the two misclassification costs is set to 10 in this experiment. As shown in the table, when the misclassification costs are very large compared with the test costs, the test costs increase and even become equal to the average total cost. In this case, the MCFS problem coincides with the MTR problem. Table 5 shows the optimal feature subset under a unified misclassification cost. Since the misclassification costs are small enough,

Table 3: Dataset information

No.  Name    Domain    |U|  |C|  D = {d}
1    Liver   clinic    345  6    selector
2    Credit  commerce  690  15   class
3    Iono    physics   351  34   class
4    Diab    clinic    768  8    class
Table 4: The optimal feature subset under different misclassification costs

MisCost1  MisCost2  Test costs  Total cost  Feature subset
120       1200      6.00        9.75        [1, 2, 3]
140       1400      6.00        10.38       [1, 2, 3]
160       1600      6.00        11.00       [1, 2, 3]
180       1800      6.00        11.63       [1, 2, 3]
200       2000      6.00        12.25       [1, 2, 3]
220       2200      10.00       12.58       [1, 3, 6]
240       2400      10.00       12.81       [1, 3, 6]
260       2600      13.00       13.00       [1, 2, 6]
Table 5: The optimal feature subset under a unified misclassification cost

MisCost1  MisCost2  Test costs  Total cost  Feature subset
120       120       6.00        9.59        [1, 2, 3]
140       140       6.00        10.19       [1, 2, 3]
160       160       10.00       10.42       [0, 1, 2, 3]
180       180       10.00       10.47       [0, 1, 2, 3]
200       200       10.00       10.52       [0, 1, 2, 3]
220       220       10.00       10.57       [0, 1, 2, 3]
240       240       10.00       10.63       [0, 1, 2, 3]
260       260       10.00       10.68       [0, 1, 2, 3]
the algorithm chooses a feature subset with minimal average total cost.

To evaluate the efficiency of the algorithm, we design two sets of experiments on the performance of the pruning techniques. We compare a Fast approach, the backtracking algorithm with all three pruning techniques as given in Algorithm 1, with a Slow approach, which omits the third pruning technique. The first set of experiments studies how the number of backtracking steps changes with different misclassification costs. Fig. 1 shows the backtracking steps of the two approaches; the smaller misclassification cost is listed on the X-axis. From these figures we observe the effectiveness of the third pruning technique.
Fig. 1: Backtracking steps of the Fast and Slow approaches under different misclassification cost settings: (a) Liver; (b) Credit; (c) Iono; (d) Diab
From the cost viewpoint, the changes of the test costs and the average minimal total cost are shown in Fig. 2; the smaller misclassification cost is listed on the X-axis. As shown in Fig. 2 (a), when the misclassification costs are large enough, the average total cost becomes equal to the test cost. In this case the misclassification cost is zero, and the set of selected tests is an attribute reduct. In real applications, no attribute may be selected at all when the misclassification costs are low; Fig. 2 (d) shows this situation.
Fig. 2: Test costs and average total cost under different misclassification cost settings: (a) Liver; (b) Credit; (c) Iono; (d) Diab
6 Conclusion
In this paper, we take data with normal distribution measurement errors into account and study feature selection with minimal cost. This minimal cost feature selection problem has a wide application area for two reasons. From the viewpoint of the data, the measurement errors considered here are ubiquitous. From the viewpoint of the minimal cost problem, the resources one can afford are often limited. To obtain the optimal result, a backtracking algorithm with three effective pruning techniques is designed for the MCFS problem. Experimental results show that the pruning techniques are effective. This work also serves as a benchmark for heuristic algorithms, which we will design in future work for large datasets.
References

[1] A. Bargiela, W. Pedrycz, Granular Computing: An Introduction, Kluwer Academic Publishers, Boston, 2002
[2] C. Elkan, The foundations of cost-sensitive learning, in: Proceedings of the 7th International Joint Conference on Artificial Intelligence, 2001
[3] E. B. Hunt, J. Marin, P. J. Stone (eds.), Experiments in Induction, Academic Press, New York, 1966
[4] M. Kukar, I. Kononenko et al., Cost-sensitive learning with neural networks, in: Proceedings of the 13th European Conference on Artificial Intelligence (ECAI-98), Chichester, John Wiley & Sons, 1998
[5] J. Lan, M. Hu, E. Patuwo, G. Zhang, An investigation of neural network classifiers with unequal misclassification costs and group sizes, Decision Support Systems, 48(4), 2010, 582-591
[6] H. X. Li, X. Z. Zhou, J. B. Zhao, D. Liu, Attribute reduction in decision-theoretic rough set model: A further investigation, in: Proceedings of Rough Sets and Knowledge Technology, LNCS, Vol. 6954, 2011
[7] T. Y. Lin, Granular computing on binary relations - analysis of conflict and Chinese wall security policy, in: Proceedings of Rough Sets and Current Trends in Computing, LNAI, Vol. 2475, 2002
[8] T. Y. Lin, Granular computing - structures, representations, and applications, in: Lecture Notes in Artificial Intelligence, Vol. 2639, 2003
[9] L. W. Ma, On some types of neighborhood-related covering rough sets, International Journal of Approximate Reasoning, 53(6), 2012, 901-911
[10] F. Min, H. P. He, Y. H. Qian, W. Zhu, Test-cost-sensitive attribute reduction, Information Sciences, 181 (2011), 4928-4942
[11] F. Min, Q. H. Liu, A hierarchical model for test-cost-sensitive decision systems, Information Sciences, 179 (2009), 2442-2452
[12] F. Min, W. Zhu, Attribute reduction with test cost constraint, Journal of Electronic Science and Technology of China, 9(2), 2011, 97-102
[13] F. Min, W. Zhu, Minimal cost attribute reduction through backtracking, in: Database Theory and Application, Bio-Science and Bio-Technology, Communications in Computer and Information Science, Vol. 258, 2011, 100-107
[14] F. Min, W. Zhu, A competition strategy to cost-sensitive decision trees, in: Proceedings of Rough Set and Knowledge Technology, LNAI, Vol. 7414, 2012
[15] F. Min, W. Zhu, H. Zhao, G. Y. Pan, J. B. Liu, Z. L. Xu, Coser: Cost-sensitive rough sets, 2012, http://grc.fjzs.edu.cn/~fmin/coser/
[16] S. Norton, Generating better decision trees, in: Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, 1989
[17] M. Núñez, The use of background knowledge in decision tree induction, Machine Learning, 6(3), 1991, 231-250
[18] R. Susmaga, Computation of minimal cost reducts, in: Z. Ras, A. Skowron (eds.), Foundations of Intelligent Systems, LNCS, Vol. 1609, 1999, 448-456
[19] M. Tan, Cost-sensitive learning of classification knowledge and its applications in robotics, Machine Learning, 13(1), 1993, 7-33
[20] P. Turney, Cost-sensitive classification: Empirical evaluation of a hybrid genetic decision tree induction algorithm, Journal of Artificial Intelligence Research (JAIR), 2 (1995), 369-409
[21] P. Turney, Types of cost in inductive concept learning, in: Proceedings of the ICML'2000 Workshop on Cost-Sensitive Learning, 2000
[22] Y. Weiss, Y. Elovici, L. Rokach, The CASH algorithm - cost-sensitive attribute selection using histograms, Information Sciences, Vol. 222, 2013, 247-268
[23] H. Zhao, F. Min, W. Zhu, Test-cost-sensitive attribute reduction based on neighborhood rough set, in: Proceedings of the 2011 IEEE International Conference on Granular Computing, 2011
[24] H. Zhao, F. Min, W. Zhu, Inducing covering rough sets from error distribution, Journal of Information & Computational Science, 10(3), 2013, 851-859
[25] H. Zhao, F. Min, W. Zhu, Test-cost-sensitive attribute reduction of data with normal distribution measurement errors, to appear in Mathematical Problems in Engineering, 2013
[26] Z. Zhou, X. Liu, Training cost-sensitive neural networks with methods addressing the class imbalance problem, IEEE Transactions on Knowledge and Data Engineering, 18(1), 2006, 63-77
[27] W. Zhu, Generalized rough sets based on relations, Information Sciences, 177(22), 2007, 4997-5011
[28] W. Zhu, Relationship among basic concepts in covering-based rough sets, Information Sciences, 179(14), 2009, 2478-2486
[29] W. Zhu, Relationship between generalized rough sets based on binary relation and covering, Information Sciences, 179(3), 2009, 210-225
[30] W. Zhu, F. Wang, Reduction and axiomization of covering generalized rough sets, Information Sciences, 152(1), 2003, 217-230