SVM classifier based feature selection using GA, ACO and PSO for siRNA Design

Yamuna Prasad1, K. K. Biswas1, Chakresh Kumar Jain2

1 Department of Computer Science and Engineering, Indian Institute of Technology, Delhi, India
[email protected], [email protected]
2 Department of Biotechnology, Jaypee Institute of Information Technology University, Noida, India
[email protected]

Abstract— There has recently been considerable interest in applying evolutionary and natural computing techniques to the analysis of large datasets with many features. In particular, siRNA efficacy prediction has attracted many researchers because of the large number of features involved. In the present work, we apply an SVM-based classifier coupled with PSO, ACO and GA to the Huesken dataset of siRNA features, as well as to two other benchmark datasets, the wine dataset and the wdbc breast cancer dataset, and achieve considerably high accuracy. We also identify the training-set size needed for better SVM accuracy with a selected kernel, and show that both groups of features (sequential and thermodynamic) are important for siRNA efficacy prediction. The results of our study are compared with other results available in the literature.

Keywords: siRNA, ACO, GA, PSO, LibSVM, RBF

1 Introduction

In RNAi pathways, siRNAs are small molecules, generated endogenously or exogenously from ds-RNAs, which cleave the sequence-specific complementary mRNA segment via the incorporation of DICER enzymes and RISC formation. These siRNA sequences can be designed computationally [3]. Many tools are currently available on public servers for predicting siRNAs against a gene of interest, but these tools generate a bunch of siRNAs of varied efficacy rather than an optimal set of siRNAs [5]. A study [2] attributes the large number of designed siRNA sequences to biased, small or noisy datasets, and sometimes to overfitting. Consequently, designing the most effective siRNA through optimal feature selection is a current challenge before the scientific community in RNAi technology.

In the hunt for suitable features, various models have been developed. Since the siRNA dataset [4] became available, a number of researchers have been analysing it for better siRNA efficacy prediction [6, 7, 8, 25]. In [8], an SVM classifier was applied to the Huesken dataset [4] and achieved 77% classification accuracy with certain filtering criteria; that work used the 200 most potent and 200 least potent siRNAs for design purposes. Although sequence-based features play a comparatively significant role in determining efficacy [2], thermodynamic features also play an important role in effective design [8, 25]. [8] also brings out the need to model the feature selection process.

The problem of feature selection in classification can be defined as finding a minimal subset s of features (|s| < n) from the feature set S, where |S| = n, while simultaneously achieving maximal classification accuracy. Feature selection which simultaneously maximizes classification accuracy is therefore a two-objective optimization problem. Searching for the minimal subset which maximizes classification accuracy is hard, as it requires computation time of exponential order, O(2^n) [16, 17, 21, 25]; for the 110 siRNA features considered here, exhaustive search over 2^110 subsets is clearly infeasible. For the feature selection problem, researchers have applied widely known evolutionary and natural computing heuristics such as GA, ACO and PSO [16, 17, 21, 25] for better classification, and have reported their suitability for problems across a large class of research domains such as network simulation, anomaly detection, security and vigilance, image classification and bioinformatics [15, 16].

Our work is directed towards handling the data-size issue for better classification, identifying optimal features, and assessing kernel applicability for the siRNA dataset. We show that an SVM classifier for siRNA efficacy prediction, coupled with an evolutionary or natural computing heuristic, significantly improves the prediction results, and we use these heuristics to obtain the most appropriate set of features for siRNA efficacy prediction. We use both the linear and the RBF kernels for the SVM classifier, and present results for GA-SVM, ACO-SVM and PSO-SVM on three datasets: the siRNA dataset, the wine dataset and the wdbc breast cancer dataset [9]. A new control parameter is also proposed as an ant traversal stopping criterion for ACO. We compare our results with the published results on the three datasets [8, 21, 25], and also study the stability of the proposed models.

2. Methodology

Heuristic solution methods for optimization, often based on local neighbourhood searches, are rather sensitive to starting conditions and tend to get trapped in local minima. To avoid such problems, Ant Colony Optimization, Genetic Algorithms and Particle Swarm Optimization explore the search space stochastically and also use information from previous history [15, 16, 17, 18, 19, 22, 23], which can yield global optima with proper tuning of the parameters. These optimization approaches have been widely used for a large class of challenging problems in various domains. Their use for optimal feature selection is described in the subsections below.

2.1 ACO

ACO is a meta-heuristic algorithm applied to several classes of optimization problems. The algorithm was originally developed by Dorigo M. [12, 14] and later extended by others [15, 16, 17, 25]; it is conceptually based on the foraging behaviour of ants. The algorithm uses approximate methods to provide good solutions to hard combinatorial problems. Heuristic desirability is the backbone of the ACO procedure for stochastically computing optimal solutions from a finite set of solution components. Solution construction starts with an empty partial solution, and in succeeding steps the current partial solution is extended by adding a feasible solution component from the set of solution components [14]. In the current work, SVM classifier performance is used as the heuristic information for feature selection. The transition rule by which an ant decides whether to include the i-th feature at time t in its solution is jointly influenced by the heuristic desirability and the pheromone level. The probabilistic transition rule is given by equation (1):

$$p_i^k(t) = \begin{cases} \dfrac{[\tau_i(t)]^{\alpha}\,[\eta_i]^{\beta}}{\sum_{u \in h^k} [\tau_u(t)]^{\alpha}\,[\eta_u]^{\beta}} & \text{if } i \in h^k \\[2mm] 0 & \text{otherwise} \end{cases} \qquad (1)$$

where h^k is the set of feasible features that can be added to the partial solution; τ_i and η_i are, respectively, the pheromone value and heuristic desirability associated with feature i; and α and β are two parameters that determine the relative importance of the pheromone value and the heuristic information. The transition probability used by ACO is thus a balance between pheromone intensity (the history of previous successful moves), τ_i, and heuristic information (the desirability of the move), η_i, which effectively balances the exploitation–exploration trade-off. The best balance is achieved through proper selection of the parameters α and β. If α = 0, no pheromone information is used, i.e. previous search experience is neglected and the search degrades to a stochastic greedy search. If β = 0, the attractiveness (or potential benefit) of moves is neglected. Unlike approaches based on mutual information scores and other statistical measures [16, 17], the local heuristic desirability η_i of the i-th feature is evaluated here using the SVM classifier's classification accuracy.
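As a concrete illustration, the C sketch below samples the next feature for one ant according to equation (1) by roulette-wheel selection. This is our own sketch, not the authors' implementation; the array names and the use of rand() are assumptions.

#include <stdlib.h>
#include <math.h>

/* Illustrative sketch of Eq. (1): one ant picks its next feature by
   roulette-wheel sampling. tau[] holds pheromone values, eta[] the
   heuristic desirability (SVM accuracy), feasible[] flags features not
   yet in the partial solution. */
int choose_next_feature(const double *tau, const double *eta,
                        const int *feasible, int n,
                        double alpha, double beta)
{
    double sum = 0.0;
    double *w = malloc(n * sizeof(double));
    for (int i = 0; i < n; i++) {
        w[i] = feasible[i] ? pow(tau[i], alpha) * pow(eta[i], beta) : 0.0;
        sum += w[i];
    }
    double r = ((double)rand() / RAND_MAX) * sum;   /* spin the wheel */
    int pick = -1;                                  /* -1: nothing feasible */
    for (int i = 0; i < n; i++) {
        if (!feasible[i]) continue;
        pick = i;
        r -= w[i];
        if (r <= 0.0) break;        /* probability proportional to Eq. (1) */
    }
    free(w);
    return pick;
}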

After all ants have completed their solutions, pheromone evaporation on all nodes is triggered, and then each ant k deposits a quantity of pheromone on each node it has traversed, according to Eq. (2):

$$\Delta\tau_i^k(t) = \varphi\, C(S^k(t)) + (1-\varphi)\,\frac{N - |S^k(t)|}{N}, \quad i \in S^k(t) \qquad (2)$$

where S^k(t) is the feature subset found by ant k at iteration t, and |S^k(t)| is its length. The pheromone is updated according to both the measure of classifier performance, C(S^k(t)), and the feature subset length; φ is a parameter that controls the relative weight of classifier performance versus subset length, and N is the total number of features in the dataset. In our experiments we assume that classifier performance is more important than subset length, so the value of φ has been set to 0.95.

The total pheromone on each feature, contributed by all ants, is then updated by equation (3), which also includes the effect of evaporation:

$$\tau_i(t+1) = (1-\rho)\,\tau_i(t) + \sum_{k=1}^{m} \Delta\tau_i^k(t) \qquad (3)$$

where m is the number of ants at each iteration and ρ is the pheromone trail decay coefficient. The main role of pheromone evaporation is to avoid stagnation, that is, the situation in which all ants construct the same solution.
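A minimal C sketch of the combined update of equations (2) and (3) follows, assuming each ant's accuracy C(S^k(t)) is already scaled to [0, 1]; the data layout is our own assumption.

/* Illustrative sketch of Eqs. (2)-(3): evaporate on every feature, then
   each of the m ants deposits on the features it selected. C[k] is ant
   k's classifier accuracy in [0,1], len[k] its subset length, sel[k][i]
   a 0/1 selection flag; phi = 0.95 and rho = 0.2 as in Table 1. */
void update_pheromone(double *tau, int n, int m, int **sel,
                      const double *C, const int *len,
                      double phi, double rho)
{
    for (int i = 0; i < n; i++)
        tau[i] *= (1.0 - rho);                      /* evaporation, Eq. (3) */
    for (int k = 0; k < m; k++) {
        double delta = phi * C[k]                   /* performance term     */
                     + (1.0 - phi) * (double)(n - len[k]) / n; /* size term, Eq. (2) */
        for (int i = 0; i < n; i++)
            if (sel[k][i])
                tau[i] += delta;                    /* deposit on traversed features */
    }
}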

2.1.1 Ant traversal stopping criteria: Various methodologies have evolved for the feature selection stopping criterion, such as a fixed number of features or a constant number of accuracy inversions [16, 17]. In the former approach, the user defines minimum and maximum limits on the feature subset length, and an ant stops selecting features once the specified number of features has been included. In the latter approach, the selection of a feature that degrades performance is termed an accuracy inversion; once the number of inversions reaches a specified limit, the ant stops selecting features and returns its subset. In our proposed methodology, a control parameter µ is defined to decide the feature selection stopping criterion automatically, according to equation (4):

$$I_{\max} = \left\lceil \frac{N}{\mu} \right\rceil, \qquad 1 \le \mu \le N \qquad (4)$$

where I_max is the number of accuracy inversions after which an ant stops its traversal and N is the total number of features. When µ = 1, the number of allowed inversions equals the total number of features; when µ equals the number of features, the ant stops feature selection as soon as a single accuracy inversion takes place.
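Under the reconstruction of equation (4) above, the stopping test reduces to a counter compared against ceil(N/µ), as the short C sketch below illustrates; the loop fragment is schematic.

/* Sketch of the stopping rule of Eq. (4): the ant stops once its
   accuracy inversions reach ceil(N/mu). With mu = 1 all N inversions
   are tolerated; with mu = N a single inversion suffices. */
int max_inversions(int n_features, int mu)
{
    return (n_features + mu - 1) / mu;   /* ceil(N / mu) */
}

/* Inside the traversal loop (schematic):
     if (acc_with_new_feature < best_acc) inversions++;
     if (inversions >= max_inversions(N, mu)) break;   // mu = 3 here */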

After performing several experiments with an artificial dataset, we have tuned the value of µ to 3. The flow of the ACO algorithm with SVM is illustrated in Figure 1.

[Figure 1 flowchart: Start → Generate ants → Initialize pheromones → Calculate feature importance (local heuristic) → ants 1..m each choose the next feature until the FS stopping criterion is met → Gather selected subsets → Evaluate selected features using SVM classifier → Update pheromone and generate new ants until the stopping criterion/max iterations is reached → Return best subset]

Fig. 1. ACO-SVM model for feature selection

2.2 GA

A genetic algorithm (GA) is a stochastic computational model, inspired by the principles of biological evolutionary theory, for solving optimization problems. GAs model the natural phenomenon of genetic inheritance based on the principle of "survival of the fittest". Several researchers have shown the applicability of GAs to sequential decision processes in function optimization, machine learning and general optimization problems [16, 18, 20]. Various methods have been developed for selecting an optimal feature subset based on classifier accuracy; the optimal binary vector (BVO) and the m-ary vector are the most widely used feature selection encodings in the literature to date. The former uses one binary bit per feature, where a '1' or '0' indicates that the corresponding feature (index in the feature vector) is selected or dropped, respectively, to optimize classifier performance. In the latter, a real-valued weight is assigned to each feature instead of abruptly dropping or including it as in the former case [20]. Researchers have observed that the former offers better resolution in the multidimensional space [20]. In this work we use the BVO encoding with the roulette-wheel selection method for crossover, along with an elitism technique that retains some of the better chromosomes in the new generation; a sketch of these operators is given below.
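The following C fragments sketch the GA operators just described on BVO chromosomes. The one-point crossover is our illustrative assumption, since the paper does not name the crossover operator; the mutation rate of 0.02 follows Table 1.

#include <stdlib.h>

/* Roulette-wheel (fitness-proportional) parent selection. */
int roulette_select(const double *fitness, int popsize)
{
    double sum = 0.0;
    for (int i = 0; i < popsize; i++) sum += fitness[i];
    double r = ((double)rand() / RAND_MAX) * sum;
    for (int i = 0; i < popsize; i++) {
        r -= fitness[i];
        if (r <= 0.0) return i;
    }
    return popsize - 1;
}

/* One-point crossover of two binary (BVO) chromosomes of length n. */
void one_point_crossover(const int *p1, const int *p2, int *child, int n)
{
    int cut = rand() % n;                /* crossover point */
    for (int i = 0; i < n; i++)
        child[i] = (i < cut) ? p1[i] : p2[i];
}

/* Bit-flip mutation at the given rate (0.02 in Table 1). */
void mutate(int *chrom, int n, double rate)
{
    for (int i = 0; i < n; i++)
        if ((double)rand() / RAND_MAX < rate)
            chrom[i] = 1 - chrom[i];     /* flip: select <-> drop feature */
}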

2.3 Particle Swarm Optimization

Particle swarm optimization (PSO) was developed to solve optimization problems using the social and cognitive behaviour of swarms [23]. In PSO, each particle has a velocity, according to which it moves in the multi-dimensional solution space, and a memory that keeps information about the space it has previously visited. Its movement is therefore influenced by two factors: the best solution found by the particle itself (local best) and the best solution found by all particles participating in the solution space (global best). In the global neighbourhood, each particle searches for the best position and moves towards the best particle of the whole swarm, while in the local neighbourhood each particle moves towards the best position found within its restricted neighbourhood. During an iteration of the algorithm, the local best and global best positions are updated whenever a better solution is found, and the process is repeated until the desired results are achieved or the specified number of iterations is exhausted.

Consider an N-dimensional solution space. The i-th particle is represented by an N-dimensional position vector, where the first subscript denotes the particle number and the second the dimension. The velocity of the particle is denoted by an N-dimensional vector V_i. The memory of the particle's previous best position is represented by an N-dimensional vector Pos_i, and the global best position (considering the whole population as the topological neighbourhood) by Pos_g. Initially, the particles are distributed randomly over the search space; in succeeding iterations, each particle is updated using the Pos_i and Pos_g values. Each particle keeps track of the coordinates in the problem space that are associated with the best solution (fitness) it has achieved so far via Pos_i. The velocity of particle X_i in dimension d at iteration (k+1) is updated by:

$$v_{id}^{k+1} = \omega\, v_{id}^{k} + \alpha\, r_1 \left(Pos_{id} - x_{id}^{k}\right) + \beta\, r_2 \left(Pos_{gd} - x_{id}^{k}\right) \qquad (5)$$

The corresponding position of the particle is updated by:

$$x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1} \qquad (6)$$

where i = 1, 2, ..., m, with m the number of particles; d = 1, 2, ..., N indexes the dimensions of a particle; α and β are positive constants, called the cognitive parameter and the social parameter respectively, which indicate the relative influence of the local and global best positions; r_1 and r_2 are random numbers distributed uniformly in [0, 1]; k = 1, 2, ..., Max_iteration is the iteration (generation) step; and ω is the inertia weight [23]. For the feature selection problem, in binary PSO the particles are represented by vectors of binary values '1' or '0', indicating the selection or removal of the corresponding feature in the feature vector represented by the particle. The velocity and particle updates in binary PSO are the same as in the continuous case; however, the final decisions are made from the output of a sigmoid function [22, 23], a probabilistic decision induced by the velocity vector.

Each particle selects features according to the value, between 0 and 1, obtained from equation (7):

$$S(v_{id}) = \frac{1}{1 + e^{-v_{id}}} \qquad (7)$$

where S(v_{id}) is the sigmoid value generated from the velocity component of the particle in dimension d. If S(v_{id}) is larger than a random number r produced uniformly within [0, 1] for that dimension, the corresponding feature is selected; otherwise it is dropped. Details can be found in [21].
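The loop below is a minimal C sketch of one binary-PSO update, equations (5)-(7), for a single particle; parameter values follow Table 1 (α = β = 2.0, ω = 1.0), and the array names are our own assumptions.

#include <stdlib.h>
#include <math.h>

/* One binary-PSO step for a particle with n feature bits: continuous
   velocity update, then the sigmoid-based bit decision that replaces
   the continuous position update of Eq. (6). */
void pso_update(int *x, double *v, const int *pbest, const int *gbest,
                int n, double omega, double alpha, double beta)
{
    for (int d = 0; d < n; d++) {
        double r1 = (double)rand() / RAND_MAX;
        double r2 = (double)rand() / RAND_MAX;
        v[d] = omega * v[d]
             + alpha * r1 * (pbest[d] - x[d])   /* cognitive term, Eq. (5) */
             + beta  * r2 * (gbest[d] - x[d]);  /* social term             */
        double s = 1.0 / (1.0 + exp(-v[d]));    /* sigmoid, Eq. (7)        */
        x[d] = (s > (double)rand() / RAND_MAX) ? 1 : 0; /* select or drop feature d */
    }
}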

2.4 SVM

Machine learning methods are generally used to identify patterns in an input dataset. Support vector machines (SVM), developed by Vapnik in 1963, are a group of supervised learning methods that can be applied to classification or regression; initially applied to linear classification, they were later extended to nonlinear classification and regression [11]. SVMs treat each input sample as a vector in a multidimensional space and construct a maximal-margin hyperplane between the two classes. This provides better classification, on the assumption that a maximal margin between the classes leads to lower generalization error and thus improves classification accuracy. LibSVM [10] is an integrated software package for support vector classification (C-SVC, nu-SVC), regression (epsilon-SVR, nu-SVR) and distribution estimation (one-class SVM). In this work we apply the SVM classifier to evaluate fitness in each of the methodologies described above, pipelining the SVM classifier of the LibSVM library with the feature selection methodologies presented earlier; a sketch of this fitness evaluation is given at the end of this section. We have also investigated kernel suitability for the various benchmark datasets [4, 9, 21].

2.5 Feature Subset Selection Criteria

After evaluating each new generation in all of the proposed models, the best population so far is computed, and the best feature subset is selected from it according to a proposed selection rule.
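As an illustration of the fitness pipeline, the C sketch below evaluates one candidate feature subset by 10-fold cross-validation with LibSVM's svm_cross_validation routine. This is our own sketch: it assumes the subset mask has already been applied when the svm_node vectors in prob were built, and it borrows the parameter values from Section 3.2.

#include <stdlib.h>
#include "svm.h"   /* LibSVM header [10] */

/* Returns 10-fold cross-validation accuracy (percent) for a problem
   whose feature vectors already reflect the candidate subset.
   C and gamma follow Sec. 3.2 (2^12 and 1 for wdbc/siRNA). */
double subset_fitness(const struct svm_problem *prob)
{
    struct svm_parameter param = {0};
    param.svm_type    = C_SVC;
    param.kernel_type = RBF;           /* LINEAR for the linear-kernel runs */
    param.C           = 4096.0;        /* 2^12 */
    param.gamma       = 1.0;
    param.cache_size  = 100.0;
    param.eps         = 1e-3;
    param.shrinking   = 1;

    double *target = malloc(prob->l * sizeof(double));
    svm_cross_validation(prob, &param, 10, target);

    int correct = 0;
    for (int i = 0; i < prob->l; i++)
        if (target[i] == prob->y[i]) correct++;
    free(target);
    return 100.0 * correct / prob->l;
}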

3. Implementation and Results

3.1 Dataset

In our experiments we used three benchmark datasets:
- the siRNA dataset, with 2431 siRNA sequences [4];
- the wine dataset, with 178 samples and 13 features [9];
- the wdbc breast cancer dataset, with 569 samples and 30 features [9].

The siRNA dataset comprises thermodynamic features (columns 1-21) and sequence features (columns 22-110), together with the measured efficacy. We carried out a study on the choice of training and testing sets for the SVM classifier, which shows that using the 200 most potent and 200 least potent siRNAs from the original siRNA dataset provides the best classifier learning, with 92.60% cross-validation accuracy, as shown in Figure 2.
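A small C sketch of this selection step follows, assuming one efficacy value per record; the record layout is an illustrative assumption.

#include <stdlib.h>

/* Sort the siRNA records by measured efficacy and keep the k most and
   k least potent ones (k = 200 here, n = 2431). */
typedef struct { double efficacy; double feat[110]; } sirna_t;

static int by_efficacy(const void *a, const void *b)
{
    double d = ((const sirna_t *)a)->efficacy - ((const sirna_t *)b)->efficacy;
    return (d > 0) - (d < 0);
}

/* Copies the 2*k extreme records of data[0..n-1] into out[0..2k-1]. */
void select_extremes(sirna_t *data, int n, int k, sirna_t *out)
{
    qsort(data, n, sizeof(sirna_t), by_efficacy);
    for (int i = 0; i < k; i++) {
        out[i]     = data[i];            /* k least potent */
        out[k + i] = data[n - 1 - i];    /* k most potent  */
    }
}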


Fig. 2. Training dataset size and cross-validation accuracy. The x-axis represents the training data size, where '1' refers to the 200 most potent and 200 least potent siRNAs, '2' to the 300 most potent and 300 least potent siRNAs, and so on; the y-axis represents the cross-validation accuracy.

3.2 Experiments

The experiments were implemented in C on a Linux platform with the gcc compiler. Table 1 lists the parameter values tuned for the ACO-SVM, GA-SVM and PSO-SVM experiments.

Table 1. Parameter values for the models

Model   | τ   | α   | β   | φ    | ρ   | μ | ω   | v (initial velocity) | Crossover rate | Mutation rate | Popsize | Max iterations
ACO-SVM | 1.0 | 1.0 | 0.1 | 0.95 | 0.2 | 3 | -   | -                    | -              | -             | 30      | 50
GA-SVM  | -   | -   | -   | -    | -   | - | -   | -                    | 0.8            | 0.02          | 30      | 50
PSO-SVM | -   | 2.0 | 2.0 | -    | -   | - | 1.0 | 0.0                  | -              | -             | 30      | 50

The two SVM parameters C and γ were set to 2^12 and 16 for the wine dataset, and to 2^12 and 1 for the wdbc and siRNA datasets, after performing several experiments. We do not claim optimality of these parameters; the values were obtained by running the experiments many times on a small, artificially generated dataset.

3.3 Results and Discussions

All model parameters were set according to Table 1. For the experiments, each dataset was randomly divided into two groups, with 90% of the samples for training and 10% for testing. The proposed models compute more stable results because, for each feature subset, the dataset is randomly partitioned into training and testing sets 10 times, classification is performed each time, and the average is taken as the fitness value for that subset. The control parameter (µ) automatically takes care of the size of the feature subset, which improves accuracy. We implemented all three proposed models with two different kernels for the SVM classifier, linear and radial basis function (RBF); the results are shown in Tables 2 and 3, respectively. The results show that all three approaches produce better results than the conventional SVM approach. In the tables, NF stands for number of features, CVA for cross-validation accuracy and TA for test accuracy.

Table 2. Results using the linear kernel SVM classifier

Dataset           | SVM, all features (TA / CVA) | ACO-SVM (TA / CVA / NF) | GA-SVM (TA / CVA / NF) | PSO-SVM (TA / CVA / NF)
Wine (178x13)     | 96.00 / 96.25                | 98.15 / 97.19 / 8       | 100.00 / 97.19 / 5     | 100.00 / 96.63 / 10
Wdbc (569x30)     | 92.98 / 96.09                | 98.25 / 95.96 / 15      | 98.77 / 97.19 / 18     | 100.00 / 97.37 / 17
Huesken (400x110) | 87.50 / 87.78                | 90.25 / 87.45 / 56      | 95.28 / 91.11 / 79     | 95.94 / 91.35 / 71

Table 3. Results using the RBF kernel SVM classifier

Dataset           | SVM, all features (TA / CVA) | ACO-SVM (TA / CVA / NF) | GA-SVM (TA / CVA / NF) | PSO-SVM (TA / CVA / NF)
Wine (178x13)     | 97.50 / 96.25                | 98.15 / 97.75 / 8       | 100.00 / 98.88 / 9     | 100.00 / 95.51 / 9
Wdbc (569x30)     | 94.55 / 96.29                | 98.83 / 96.84 / 20      | 98.95 / 97.36 / 7      | 100.00 / 98.07 / 18
Huesken (400x110) | 89.23 / 88.23                | 91.67 / 88.75 / 42      | 96.50 / 91.50 / 75     | 97.50 / 91.00 / 66

The accuracy of PSO-SVM is higher than that of GA-SVM and ACO-SVM, whereas the accuracy of ACO-SVM is comparable to that of GA-SVM. The number of features obtained by ACO-SVM is much smaller than that obtained by PSO-SVM and GA-SVM for the Huesken and wine datasets. The accuracy with the linear kernel is slightly lower than with the RBF kernel; during the experiments we also observed that the time taken by the linear kernel optimizer is much larger than that of the RBF kernel. The observed results have been compared with results reported by various researchers: comparisons of the proposed models for the wine and wdbc datasets are given in Table 4, and for the Huesken dataset in Table 5.

Table 4. Comparison of the proposed models with previous PSO-SVM models (test accuracy)

Dataset | Chung-Jui [21] | Proposed PSO-SVM | Proposed ACO-SVM | Proposed GA-SVM
Wine    | 100            | 100              | 98.15            | 100
Wdbc    | 95.61          | 100              | 98.83            | 98.95

Table 5. Comparison of the proposed models with previous SVM models for the siRNA data (test accuracy)

Dataset | Wang et al. [8] | Jain et al. [25] | Proposed PSO-SVM | Proposed ACO-SVM | Proposed GA-SVM
Huesken | 77.00           | 71.10            | 97.50            | 91.67            | 96.50


From Table 4 it is evident that our models, with the parameters described in Table 1, yield higher accuracy than reported by Chung-Jui et al. [21]. Table 5 shows that the results obtained after feature selection outperform those reported by Wang et al. [8] and Jain et al. [25]. Further, a study of the stability of the proposed models has been carried out by computing the population fitness over the iterations. Figures 3a, 3b and 3c show the stability of all the proposed methods for the wine, wdbc and Huesken datasets, respectively.


Fig. 3a. Population fitness for wine dataset


Fig. 3b. Population fitness for wdbc dataset


Fig. 3c. Population fitness for siRNA dataset

The stability of ACO-SVM and PSO-SVM is greater than that of the GA-SVM model; both ACO-SVM and PSO-SVM show better stability in terms of population fitness, as can be seen in Figures 3a, 3b and 3c. The observed results show higher accuracy than previously reported. ACO-SVM and PSO-SVM are rather more stable than GA-SVM in achieving optimality, even though the complexity of PSO-SVM and GA-SVM is lower than that of ACO-SVM.

4. Conclusion

In this paper, we have established that the GA, ACO and PSO methods, supported by an SVM classifier, yield better siRNA efficacy predictions, and that they also carry out better feature selection for the wine and wdbc datasets. The results obtained using our proposed GA-SVM and PSO-SVM models confirm the significance of both sequence and thermodynamic features [4, 25], while ACO-SVM assigns more significance to the sequence features [2, 25] in the siRNA dataset generated by Huesken et al. [4].

References

1. Elbashir SM, Harborth J, Lendeckel W, Yalcin A, Weber K, Tuschl T.: Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature, vol. 411(6836), pp. 428-429 (2001)
2. Saetrom P, Snove O.: A comparison of siRNA efficacy predictors. Biochem Biophys Res Commun, vol. 321(1), pp. 247-253 (2004)
3. Reynolds A, Leake D, Boese Q, Scaringe S, Marshall WS, Khvorova A.: Rational siRNA design for RNA interference. Nat Biotechnol, vol. 22(3), pp. 326-330 (2004)
4. Huesken D, Lange J, Mickanin C, Weiler J, Asselbergs F, Warner J, Meloon B, Engel S, Rosenberg A, Cohen D, Labow M, Reinhardt M, Natt F, Hall J.: Design of a genome-wide siRNA library using an artificial neural network. Nat Biotechnol, vol. 23, pp. 995–1001 (2005)
5. Lu ZJ, Mathews DH.: OligoWalk: an online siRNA design tool utilizing hybridization thermodynamics. Nucleic Acids Research, vol. 36, suppl. 2, pp. W104-W108 (2008)
6. Vert JP, Foveau N, Lajaunie C, Vandenbrouck Y.: An accurate and interpretable model for siRNA efficacy prediction. BMC Bioinformatics, vol. 7:520 (2006)
7. Matveeva O, Nechipurenko Y, Rossi L, Moore B, Sætrom P, Ogurtsov AY, Atkins JF, Shabalina SA.: Comparison of approaches for rational siRNA design leading to a new efficient and transparent method. Nucleic Acids Res, vol. 35, e63 (2007)
8. Wang Xiaowei, Wang Xiaohui, Verma RK, Beauchamp L, Maghdaleno S, Surendra TJ.: Selection of hyperfunctional siRNAs with improved potency and specificity. Nucleic Acids Research, vol. 37, no. 22, e152 (2009)
9. Asuncion, A., Newman, D.J.: UCI Machine Learning Repository [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, School of Information and Computer Science (2007)
10. Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm (2001)
11. Drucker, H., Burges, C.J.C., Kaufman, L., Smola, A., Vapnik, V.: Support Vector Regression Machines. Advances in Neural Information Processing Systems 9 (NIPS), pp. 155-161 (1997)
12. Dorigo, M., Maniezzo, V., Colorni, A.: The ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics - Part B, vol. 26(1), pp. 29-42 (1996)
13. Huang, C.-L.: ACO-based hybrid classification system with feature subset selection and model parameters optimization. Neurocomputing, vol. 73, pp. 438–448 (2009)
14. Dorigo, M., Blum, C.: Ant colony optimization theory: A survey. Theoretical Computer Science, pp. 243–278 (2005)
15. Tsang, C.H.: Ant Colony Clustering and Feature Extraction for Anomaly Intrusion Detection. In: Swarm Intelligence in Data Mining, Springer, Berlin/Heidelberg, pp. 101-123 (2007)
16. Nemati, S., Basiri, M.E., Ghasem-Aghaee, N., Aghdam, M.H.: A novel ACO–GA hybrid algorithm for feature selection in protein function prediction. Expert Systems with Applications, vol. 36, pp. 12086–12094 (2009)
17. Aghdam, M.H., Ghasem-Aghaee, N., Basiri, M.E.: Text feature selection using ant colony optimization. Expert Systems with Applications, vol. 36, pp. 6843–6853 (2009)
18. Yang, J., Honavar, V.: Feature subset selection using a genetic algorithm. IEEE Intelligent Systems, vol. 13(2), pp. 44–49 (1998)
19. Zhao, X.-M., Huang, D.-S., Cheung, Y.-m., Wang, H.-q., Huang, X.: A Novel Hybrid GA/SVM System for Protein Sequences Classification. In: IDEAL 2004, LNCS 3177, pp. 11–16 (2004)
20. Raymer, M., Punch, W., Goodman, E., Kuhn, L., Jain, A.K.: Dimensionality reduction using genetic algorithms. IEEE Transactions on Evolutionary Computation, vol. 4, pp. 164–171 (2000)
21. Tu, C.-J., Chuang, L.-Y., Chang, J.-Y., Yang, C.-H.: Feature Selection using PSO-SVM. IAENG International Journal of Computer Science, 33:1, IJCS_33_1_18 (2007)
22. Liu, Y., Qin, Z., Xu, Z., He, H.: Feature selection with particle swarms. In: CIS 2004, LNCS 3314, pp. 425–430, Springer-Verlag, Berlin, Heidelberg (2004)
23. Khanesar, M.A., Teshnehlab, M., Soorehdeli, M.A.: A Novel Binary Particle Swarm Optimization. In: Proc. 15th Mediterranean Conference on Control and Automation (2007)
24. Correa, S., Freitas, A.A., Johnson, C.G.: Particle Swarm and Bayesian networks applied to attribute selection for protein functional classification. In: Proc. of the GECCO-2007 Workshop on Particle Swarms: The Second Decade, pp. 2651–2658 (2007)
25. Jain, C.K., Prasad, Y.: Feature selection for siRNA efficacy prediction using natural computation. In: World Congress on Nature & Biologically Inspired Computing (NaBIC 2009), pp. 1759-1764, IEEE Press (2009)
164–171 (2000) Chung-Jui Tu, Li-Yeh Chuang, Jun-Yang Chang, and Cheng-Hong Yang: Feature Selection using PSO-SVM. IAENG International Journal of Computer Science, 33:1, IJCS_33_1_18, (2007) Liu Y., Qin Z., Xu Z., & He H.: Feature selection with particle swarms. CIS 2004, LNCS 3314, pp. 425–430, Berlin, Heidelberg: Springer-Verlag (2004) Khanesar M. A. , Teshnehlab M., and Soorehdeli M.A.: A Novel Binary Particle Swarm Optimization. In: Proc. 15Th Mediterranean Conference on Control and Automation, (2007) Correa S., Freitas A. A., & Johnson C. G.: Particle Swarm and Bayesian networks applied to attribute selection for protein functional classification. In: Proc. of the GECCO- 2007 workshop on particle swarms, The second decade, pp. 2651–2658 (2007) Jain Chakresh Kumar and Prasad Yamuna,: Feature selection for siRNA efficacy prediction using natural computation. In: World Congress on Nature & Biologically Inspired Computing (NaBIC 2009), pp. 1759-1764, IEEE Press (2009)