2014 International Computer Science and Engineering Conference (ICSEC)
An Improved Grey Wolf Optimizer for Training q-Gaussian Radial Basis Functional-link Nets

Nipotepat Muangkote, Khamron Sunat, Sirapat Chiewchanwattana
Department of Computer Science, Faculty of Science, Khon Kaen University, Khon Kaen, Thailand
E-mails: [email protected], [email protected], [email protected]
Abstract—In this paper, a novel meta-heuristic technique, an improved Grey Wolf Optimizer (IGWO), is proposed as an improved version of the Grey Wolf Optimizer (GWO). Its performance is evaluated by adopting the IGWO to train q-Gaussian Radial Basis Functional-link nets (qRBFLNs) neural networks. Function approximation problems in the regression area and a multiclass classification problem in the classification area are employed to test the algorithm. In particular, for the multiclass classification problem, the dataset of the screening risk groups of the population aged 15 years and over in Charoensin District, Sakon Nakhon Province, Thailand is used in the experiments. The results on the function approximation problems and on the real multiclass classification application prove that the proposed algorithm is able to address the test problems. Moreover, the proposed algorithm obtains competitive performance compared to other meta-heuristic methods.
Keywords—improved Grey Wolf Optimizer (IGWO); Grey Wolf Optimizer (GWO); optimization; Radial Basis Functional-link Nets (RBFLN)
I. INTRODUCTION

This paper presents an improved version of a novel optimization algorithm in the class of population-based meta-heuristics, also known as Swarm Intelligence (SI). The algorithm was inspired by the hunting mechanism and democratic behavior of a pack of grey wolves in the wild and is therefore named the "Grey Wolf Optimizer" (GWO) [6]. The GWO is powerful in terms of exploration, exploitation, local optima avoidance, and convergence. However, its ability still depends on, and is somewhat limited by, the mechanisms that balance exploration and exploitation. Since there is still room to improve the GWO's exploration and exploitation ability, this paper attempts to fill that gap. The attempt can be described through the following objectives.
1) To propose the IGWO algorithm, which improves the performance of the GWO algorithm. We concentrate on the mechanism that balances the exploration and exploitation abilities of GWO, and a new technique for balancing exploration and exploitation is proposed.

2) The radial basis functional-link nets (RBFLNs) neural networks based on the q-Gaussian presented in [7] are used as an application of the proposed algorithm. The IGWO is applied to the training of the neural networks, and other meta-heuristic algorithms are evaluated comparatively in order to assess the performance of the IGWO algorithm.

3) Furthermore, the dataset of the screening risk groups of the population aged 15 years and over in Charoensin District, Sakon Nakhon Province, Thailand, fiscal year 2012, is used as a real-world classification problem.

The rest of the paper is organized as follows: Section II briefly presents the radial basis functional-link nets neural network and the GWO algorithm. The proposed IGWO is presented in Section III; the performance evaluation and experimental results are described in Section IV; and conclusions are drawn in the last section.

II. RELATED WORK
Over the last two decades, meta-heuristic optimization techniques have become popular and been applied in various fields of research. Some of the well-known meta-heuristic optimization techniques are, for example, Genetic Algorithm (GA) [1], Differential Evolution (DE) [2], Particle Swarm Optimization (PSO) [3], Gravitational Search Algorithm (GSA) [4], Teaching–learning-based optimization algorithm (TLBO) [5], etc.
A. RBFLNs and RBFLNs based on q-Gaussian

Radial basis functional-link nets (RBFLNs), first proposed by Looney [8], are a generalized version of radial basis function neural networks (RBFNNs). In other words, the RBFLNs are a modification of the RBFNN architecture. Generally, the model of an RBFNN can be described as

$$y_j(X; \Phi, \Omega) = f\left( \sum_{i=1}^{n} w_{ji}\, \phi_i(X; \mu_i, \sigma_i) + b_j \right), \qquad (1)$$

where $X = [x_1, \ldots, x_d]^T$ is the input vector with $d$ dimensions; $Y = [y_1, \ldots, y_m]^T$ is the output vector of the neural network with $m$ dimensions; $y_j$ and $b_j$ are the output and bias of the $j$-th output neuron, respectively; $\Phi = \{\phi_1, \ldots, \phi_n\}$ is the set of radial basis functions; $\mu_i = [\mu_{i1}, \ldots, \mu_{id}]^T$ and $\sigma_i$ are the center and the width of the radial basis function $\phi_i$; $n$ is the number of hidden nodes; and $\Omega = \{w_{11}, \ldots, w_{1n}, \ldots, w_{m1}, \ldots, w_{mn}\}$ is the set of weights connected to the output neurons.
Radial basis functional-link nets (RBFLNs) are RBFNNs which link the extra lines from the input nodes directly to the output neurodes with another set of weights {umj}. The input xm is weighted with umj at the jth output node. The RBFLNs output components are given by
$$y_j(X; \Phi, \Omega) = \frac{1}{n + d} \left\{ \sum_{i=1}^{n} w_{ij}\, \phi_i + \sum_{m=1}^{d} u_{mj}\, x_m \right\}. \qquad (2)$$

The radial basis functional-link nets neural network based on the q-Gaussian function (qRBFLNs) utilizes the q-Gaussian in the RBFLNs. The q-Gaussian function allows the shape of the RBFLNs to change according to the real parameter q. Brief definitions of the q-logarithm and its inverse, the q-exponential, are

$$\ln_q(x) \equiv \frac{x^{1-q} - 1}{1 - q}, \quad x > 0, \qquad (3)$$

$$e_q^x \equiv \begin{cases} \left[ 1 + (1-q)x \right]^{\frac{1}{1-q}}, & 1 + (1-q)x \ge 0, \\ 0, & \text{otherwise}. \end{cases} \qquad (4)$$
See [9] for more details. There are several radial basis functions for RBFNNs. The most commonly used radial basis function is the Gaussian function:
$$\phi(x; \mu, \sigma) = \exp\left( -\frac{\| x - \mu \|^2}{2\sigma^2} \right). \qquad (5)$$
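To make the model concrete, the following Python/NumPy sketch (ours, not from the paper; variable names are illustrative) evaluates the RBFLN output of Eq. (2) using the Gaussian basis of Eq. (5):

```python
import numpy as np

def gaussian_rbf(x, mu, sigma):
    """Gaussian radial basis function, Eq. (5)."""
    return np.exp(-np.sum((x - mu) ** 2) / (2.0 * sigma ** 2))

def rbfln_output(x, mu, sigma, w, u):
    """RBFLN output, Eq. (2): hidden RBF responses plus direct
    input-to-output links, scaled by 1/(n + d)."""
    n, d = mu.shape                        # n hidden nodes, d inputs
    phi = np.array([gaussian_rbf(x, mu[i], sigma[i]) for i in range(n)])
    return (w @ phi + u @ x) / (n + d)     # shape: (m outputs,)

# toy example: d = 1 input, n = 3 hidden nodes, m = 1 output
rng = np.random.default_rng(0)
mu = rng.uniform(0.0, 10.0, size=(3, 1))   # RBF centers
sigma = np.full(3, 1.0)                    # RBF widths
w = rng.normal(size=(1, 3))                # hidden-to-output weights
u = rng.normal(size=(1, 1))                # direct input-to-output weights
print(rbfln_output(np.array([4.0]), mu, sigma, w, u))
```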
The use of the q-Gaussian function as the radial basis function is defined by the following equation:

$$\phi(x; q, \mu_q, \sigma_q) = e_q^{-\frac{\| x - \mu_q \|^2}{(3-q)\,\sigma_q^2}}. \qquad (6)$$

The non-extensive entropic index, q, varies in the range $(-\infty, 3)$ and can be found by employing a meta-heuristic algorithm. In this paper, the extended version of RBFLNs that uses the q-Gaussian function as the radial basis function, presented in [7], is used to test the IGWO algorithm. See [10] for more details on the q-Gaussian.
B. Grey Wolf Optimizer (GWO)

The GWO was first proposed by Mirjalili et al. [6]. The algorithm was inspired by the democratic behavior and the hunting mechanism of grey wolves in the wild. In a pack, the grey wolves have a very strict social dominance hierarchy. The leaders, a male and a female, are called alphas. The second level of grey wolves, the subordinate wolves that help the leaders, are called betas. The third level of grey wolves is the deltas, which have to submit to the alphas and betas but dominate the omegas. The lowest-ranking grey wolves are the omegas, which have to submit to all the other dominant wolves. The GWO algorithm is described by the following mathematical models.

Fig. 1. Position movement of ω or any other hunters in GWO and IGWO. [Figure: a search agent updates its position under the guidance of α, β, and δ through the distances D_alpha, D_beta, and D_delta; both the GWO move and the IGWO move are depicted.]

1) Social hierarchy: In the social hierarchy of wolves, when designing GWO, the best solution is considered the alpha (α), and the second and third best solutions are considered beta (β) and delta (δ), respectively. The rest of the candidate solutions are assumed to be omega (ω). The ω wolves are guided by α, β, and δ, and follow these three wolves.

2) Encircling prey: The grey wolves encircle the prey during the hunt. The encircling behavior can be mathematically modeled as follows:

$$\vec{D} = |\vec{C} \cdot \vec{X}_p(t) - \vec{X}(t)|, \qquad (7)$$

$$\vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D}, \qquad (8)$$

where $t$ is the current iteration, $\vec{A}$ and $\vec{C}$ are coefficient vectors, $\vec{X}_p$ is the position vector of the prey (global solution), and $\vec{X}$ is the position vector of a grey wolf. The vectors $\vec{A}$ and $\vec{C}$ are calculated as follows:

$$\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}, \qquad (9)$$

$$\vec{C} = 2 \cdot \vec{r}_2, \qquad (10)$$

where the components of $\vec{a}$ are linearly decreased from 2 to 0 over the course of the iterations, and $\vec{r}_1$, $\vec{r}_2$ are random vectors in [0, 1].
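As a minimal illustration (our sketch, not the authors' code), the coefficient vectors of Eqs. (9)-(10) and the encircling update of Eqs. (7)-(8) can be written as:

```python
import numpy as np

rng = np.random.default_rng(42)

def coefficients(a, dim):
    """Coefficient vectors A and C, Eqs. (9)-(10)."""
    A = 2.0 * a * rng.random(dim) - a      # A = 2a.r1 - a, r1 in [0, 1]
    C = 2.0 * rng.random(dim)              # C = 2.r2,      r2 in [0, 1]
    return A, C

def encircle(X, X_p, A, C):
    """Encircling behavior, Eqs. (7)-(8)."""
    D = np.abs(C * X_p - X)                # Eq. (7)
    return X_p - A * D                     # Eq. (8)

max_iter, t, dim = 100, 10, 2
a = 2.0 - t * (2.0 / max_iter)             # a decreases linearly from 2 to 0
A, C = coefficients(a, dim)
print(encircle(X=np.zeros(dim), X_p=np.ones(dim), A=A, C=C))
```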
3) Hunting: The hunt is usually guided by the alpha, beta, and delta, which have better knowledge about the potential location of the prey. The other search agents update their positions according to the positions of these best search agents. The position update can be formulated as follows:
$$\vec{D}_\alpha = |\vec{C}_1 \cdot \vec{X}_\alpha - \vec{X}|, \quad \vec{D}_\beta = |\vec{C}_2 \cdot \vec{X}_\beta - \vec{X}|, \quad \vec{D}_\delta = |\vec{C}_3 \cdot \vec{X}_\delta - \vec{X}|, \qquad (11)$$

$$\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}_\alpha, \quad \vec{X}_2 = \vec{X}_\beta - \vec{A}_2 \cdot \vec{D}_\beta, \quad \vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}_\delta, \qquad (12)$$

$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3}. \qquad (13)$$

The adaptive values of a and A balance exploration and exploitation: when |A| < 1 the grey wolves are forced to attack the prey, and when |A| > 1 the grey wolves are forced to diverge from the prey.
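A compact sketch (ours) of the hierarchy-guided update of Eqs. (11)-(13), where each leader receives fresh A and C vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

def gwo_move(X, X_alpha, X_beta, X_delta, a):
    """Standard GWO position update, Eqs. (11)-(13)."""
    X_new = np.zeros_like(X)
    for leader in (X_alpha, X_beta, X_delta):
        A = 2.0 * a * rng.random(X.size) - a   # Eq. (9)
        C = 2.0 * rng.random(X.size)           # Eq. (10)
        D = np.abs(C * leader - X)             # Eq. (11)
        X_new += leader - A * D                # Eq. (12)
    return X_new / 3.0                         # Eq. (13)

X = rng.random(2)
print(gwo_move(X, np.array([1.0, 1.0]), np.array([0.9, 1.1]),
               np.array([1.1, 0.9]), a=1.0))
```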
III. PROPOSED ALGORITHM
Although the GWO has a helpful mechanism to smoothly balance exploration and exploitation, i.e., the adaptive values of a and A, the GWO may still become trapped in local optima, since its exploration ability depends only on the vector C. Regarding the parameter A, we found that it can be utilized to select the strategy for computing the vectors $\vec{D}'_\alpha$, $\vec{D}'_\beta$, and $\vec{D}'_\delta$, depending on whether the absolute value of A is less than 1 or greater than 1. The aim of this hybridization strategy is to increase the diversity of the agents. The mathematical model is described as follows.
A. An improved GWO algorithm (IGWO)

The movement of an agent in GWO depends greatly on the alpha, beta, and delta positions; Fig. 1 shows how a search agent updates its position. In contrast, in IGWO the movement of an agent depends on the alpha, beta, and delta, or on randomly chosen agents, according to the random values of $\vec{A}$. In this paper, we add a new strategy to calculate the vectors $\vec{D}'_\alpha$, $\vec{D}'_\beta$, and $\vec{D}'_\delta$, which helps a search agent gain more exploration ability and avoid being trapped in local optima. The updated position can be formulated as follows:

$$\vec{D}'_\alpha = |\vec{C}_1 \cdot \vec{X}_{r1} - \vec{X}_{r3}|, \quad \vec{D}'_\beta = |\vec{C}_2 \cdot \vec{X}_{r2} - \vec{X}_{r1}|, \quad \vec{D}'_\delta = |\vec{C}_3 \cdot \vec{X}_{r3} - \vec{X}_{r1}|, \qquad (14)$$

$$\vec{X}'_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}'_\alpha, \quad \vec{X}'_2 = \vec{X}_\beta - \vec{A}_2 \cdot \vec{D}'_\beta, \quad \vec{X}'_3 = \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}'_\delta, \qquad (15)$$

$$\vec{X}'(t+1) = \frac{\vec{X}'_1 + \vec{X}'_2 + \vec{X}'_3}{3}, \qquad (16)$$

where the indexes $r_1, r_2, r_3 \in \{1, 2, \ldots, NP\}$ are randomly chosen and $r_1 \ne r_2 \ne r_3$. The pseudo-code of the IGWO algorithm is shown in Fig. 2; the differences between the standard GWO and the improved IGWO are marked with "(*)".

    Initialize the grey wolf population X_i (i = 1, 2, ..., n)
    Initialize a, A, and C
    Calculate the fitness of each search agent
    X_alpha = the best search agent
    X_beta = the second best search agent
    X_delta = the third best search agent
    while (t < Max number of iterations)
        for each search agent
            if |A| < 1
                update the position of the current search agent by Eqs. (11)-(13)
            else
                update the position of the current search agent by Eqs. (14)-(16)   (*)
            end if
        end for
        Update a, A, and C
        Calculate the fitness of all search agents
        Update X_alpha, X_beta, and X_delta
        t = t + 1
    end while
    return X_alpha

Fig. 2. Pseudo-code of the IGWO algorithm.
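The sketch below gives one plausible Python reading of the branch in Fig. 2 (ours; the exact per-vector form of the |A| test is an assumption): when |A| ≥ 1 the distances come from three mutually distinct random agents as in Eq. (14), otherwise the standard distances of Eq. (11) are used.

```python
import numpy as np

rng = np.random.default_rng(1)

def igwo_move(i, pop, X_alpha, X_beta, X_delta, a):
    """IGWO position update, Eqs. (14)-(16), with fallback to Eq. (11)."""
    NP, dim = pop.shape
    X = pop[i]
    # three mutually distinct random indexes r1 != r2 != r3, Eq. (14)
    r1, r2, r3 = rng.choice(NP, size=3, replace=False)
    sources = [(X_alpha, pop[r1], pop[r3]),
               (X_beta,  pop[r2], pop[r1]),
               (X_delta, pop[r3], pop[r1])]
    X_new = np.zeros(dim)
    for leader, Xa, Xb in sources:
        A = 2.0 * a * rng.random(dim) - a
        C = 2.0 * rng.random(dim)
        if np.all(np.abs(A) >= 1.0):       # exploration: modified distance, Eq. (14)
            D = np.abs(C * Xa - Xb)
        else:                              # exploitation: standard distance, Eq. (11)
            D = np.abs(C * leader - X)
        X_new += leader - A * D            # Eq. (15)
    return X_new / 3.0                     # Eq. (16)

pop = rng.random((5, 2))
print(igwo_move(0, pop, pop[0], pop[1], pop[2], a=1.2))
```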
B. IGWO for training RBFLNs

Recently, many meta-heuristic algorithms have been used to find a combination of weights, biases, or other parameters that allows a neural network to attain the lowest error or the highest accuracy rate [7, 11]. In the same light, this paper adopts the novel meta-heuristic algorithm IGWO to find the parameters of the RBFLNs neural networks. The training of the RBFLNs is described below.

1) Objective function: Two evaluation metrics are adopted as the objective function of the optimizer: the Mean Square Error (MSE) and the geometric mean of the per-class classification accuracies (G-mean). The MSE measures the error between the actual and predicted values in function approximation problems. The MSE is defined as:
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_{\mathrm{actual},i} - \hat{y}_{\mathrm{predicted},i} \right)^2, \qquad (17)$$
where N is the number of output nodes. On the other hand, with regard to the classification area, we introduce the G-mean, presented in [12], to measure the effectiveness of a classifier. The G-mean is adopted in lieu of the overall accuracy rate of all samples in order to give more insight into the per-class accuracy. In particular, for a binary classification problem, the G-mean is the square root of the positive-class accuracy multiplied by the negative-class accuracy. The G-mean is defined as:
$$\text{G-mean} = \sqrt{\frac{TP}{TP + FN} \times \frac{TN}{TN + FP}}, \qquad (18)$$

where TP, TN, FP, and FN stand for true positive, true negative, false positive, and false negative, respectively. Therefore, to cast the training as a minimization problem, the fitness function of each candidate solution $X_i$ can be defined as follows:

$$\mathrm{Fitness}(X_i) = 1 - \text{G-mean}(X_i). \qquad (19)$$
Hereafter, the G-mean is used as the objective function of all meta-heuristic algorithms. Based on our experimental results, we found that the G-mean is more suitable than the overall accuracy rate for the multiclass imbalanced classification problem.
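A small sketch (ours) of the multiclass G-mean and the minimization objective of Eq. (19):

```python
import numpy as np

def g_mean(y_true, y_pred, classes):
    """Geometric mean of per-class accuracies; for two classes this
    reduces to Eq. (18)."""
    accs = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.prod(accs) ** (1.0 / len(accs)))

def fitness(y_true, y_pred, classes):
    """Minimization objective, Eq. (19): 1 - G-mean."""
    return 1.0 - g_mean(y_true, y_pred, classes)

y_true = np.array([0, 0, 1, 1, 1, 2])
y_pred = np.array([0, 1, 1, 1, 1, 2])
print(fitness(y_true, y_pred, classes=[0, 1, 2]))   # approx. 0.206
```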
2) Encoding strategy: This section illustrates how the centers (μ), the widths of the radial basis functions (σ), and the non-extensive entropic indexes (q) of the qRBFLNs are encoded for the search agents of the IGWO algorithm. The solution positions (population) of the IGWO algorithm are encoded as follows:

$$\mathrm{agent}(:,:,i) = [\boldsymbol{\mu}, \boldsymbol{\sigma}, \mathbf{q}], \qquad (20)$$

$$\boldsymbol{\mu} = \begin{bmatrix} \mu_{1,1} & \cdots & \mu_{1,d} \\ \vdots & & \vdots \\ \mu_{n,1} & \cdots & \mu_{n,d} \end{bmatrix}, \quad \boldsymbol{\sigma} = [\sigma_1, \ldots, \sigma_n], \quad \mathbf{q} = [q_1, \ldots, q_n], \qquad (21)$$

where the range of the centers (μ) of the radial basis functions is set to the range of the input data of the neural network, n is the number of hidden nodes, and d is the dimension of the input vector of the neural network.

The descriptions mentioned above can usually be applied to other meta-heuristic algorithms as well.
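To illustrate the encoding of Eqs. (20)-(21), the following sketch (ours; the initialization ranges for σ and q are illustrative assumptions, with q kept below 3) builds one search agent:

```python
import numpy as np

rng = np.random.default_rng(7)

def init_agent(n, d, x_min, x_max):
    """One search agent encoded as [mu, sigma, q], Eqs. (20)-(21)."""
    mu = rng.uniform(x_min, x_max, size=(n, d))  # centers within the input range
    sigma = rng.uniform(0.1, 2.0, size=n)        # widths (assumed range)
    q = rng.uniform(-1.0, 2.9, size=n)           # entropic indexes, q < 3 (assumed range)
    return {"mu": mu, "sigma": sigma, "q": q}

agent = init_agent(n=20, d=1, x_min=0.0, x_max=10.0)
print(agent["mu"].shape, agent["sigma"].shape, agent["q"].shape)
```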
IV. PERFORMANCE EVALUATION

In this section, the performance of the proposed IGWO algorithm is compared with popular nature-inspired meta-heuristic algorithms (PSO, TLBO, and GSA) as well as the original GWO. These algorithms are adopted in the training process of the qRBFLNs neural networks. Benchmark problems in the regression area and a real problem in the classification area are used to compare the performance of each algorithm. Thirty trials were conducted for all experiments and the average results are reported. All algorithms were implemented in MATLAB 2012b on a personal computer with a 3.4 GHz CPU and 8 GB RAM under the Windows 7 Professional 64-bit platform.

A. Data specification

The performance of the proposed algorithm is evaluated on the regression and classification areas, namely the function approximation problems and the multiclass classification problem. These two problems are discussed as follows:

1) Regression problem: The regression problem, or function approximation problem, is a typical application of neural networks. In this study, the five one-dimensional functions used in [7] are selected to test the algorithm. The five function approximation problems are shown in Table I. The training set $(x_i, y_i)$, consisting of 500 data points, is created by

$$y = f(x) + \varepsilon, \qquad (22)$$

where x is randomly distributed within the input interval and ε is added Gaussian white noise with mean 0 and variance 0.1. The 500 data points of the testing set $(x_i, y_i)$ are generated by

$$y = f(x), \qquad (23)$$

where x is uniformly distributed within the input interval. The training set and the testing set are synthesized anew by (22) and (23) in each run, respectively.

2) Pre-Hypertension classification problem: We further investigate the ability of the proposed algorithm by applying it to a real-world problem. The dataset used in the classification experiments is the screening risk groups of the population aged 15 years and over residing in Charoensin District, Sakon Nakhon Province, Thailand, fiscal year 2012 (October 2011 to September 2012). The dataset has 12 factors and 2,987 samples. It was presented in [13] and is categorized as a multiclass and imbalanced dataset.
TABLE I. THE FIVE FUNCTION APPROXIMATION PROBLEMS

| Function approximation | x |
|---|---|
| f1(x) = x + 8 sin(3x) + 9 cos(2x) + 25 | [0, 2π] |
| f2(x) = sin(0.8πx) + cos(0.2πx) | [0, 10] |
| f3(x) = exp(2 sin(x))/x | [−10, 10] |
| f4(x) = 1.1(1 − x + 2x²) exp(−x/2) | [−10, 10] |
| f5(x) = (x⁴ + x² + 1) cos(x)/50 | [0, 10] |
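For example, the training and testing sets of Eqs. (22)-(23) can be synthesized as below (our sketch; f2 from Table I is used, under the reading given there):

```python
import numpy as np

rng = np.random.default_rng(3)

def make_sets(f, lo, hi, n=500, noise_var=0.1):
    """Training set per Eq. (22) and noise-free testing set per Eq. (23)."""
    x_tr = rng.uniform(lo, hi, n)                            # random inputs
    y_tr = f(x_tr) + rng.normal(0.0, np.sqrt(noise_var), n)  # Eq. (22)
    x_te = np.linspace(lo, hi, n)                            # uniform inputs
    y_te = f(x_te)                                           # Eq. (23)
    return (x_tr, y_tr), (x_te, y_te)

f2 = lambda x: np.sin(0.8 * np.pi * x) + np.cos(0.2 * np.pi * x)
(train_x, train_y), (test_x, test_y) = make_sets(f2, 0.0, 10.0)
print(train_x.shape, test_y[:3])
```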
The K-fold cross validation is adopted to create the training and testing sets for each class. The 10-fold cross validation of the dataset is shown in Table II.

TABLE II. 10-FOLD CROSS VALIDATION FOR THE PRE-HYPERTENSION DATASET [13]

| Fold | Tr: Class 1 | Tr: Class 2 | Tr: Class 3 | Tr: Class 4 | Tr: Total | Te: Class 1 | Te: Class 2 | Te: Class 3 | Te: Class 4 | Te: Total |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 297 | 2150 | 228 | 16 | 2691 | 32 | 238 | 25 | 1 | 296 |
| 2 | 296 | 2149 | 227 | 15 | 2687 | 33 | 239 | 26 | 2 | 300 |
| 3 | 296 | 2149 | 227 | 15 | 2687 | 33 | 239 | 26 | 2 | 300 |
| 4 | 296 | 2149 | 227 | 15 | 2688 | 33 | 239 | 26 | 2 | 299 |
| 5 | 296 | 2149 | 228 | 15 | 2688 | 33 | 239 | 25 | 2 | 299 |
| 6 | 296 | 2149 | 228 | 15 | 2688 | 33 | 239 | 25 | 2 | 299 |
| 7 | 296 | 2149 | 228 | 15 | 2688 | 33 | 239 | 25 | 2 | 299 |
| 8 | 296 | 2149 | 228 | 15 | 2688 | 33 | 239 | 25 | 2 | 299 |
| 9 | 296 | 2149 | 228 | 16 | 2689 | 33 | 239 | 25 | 1 | 298 |
| 10 | 296 | 2150 | 228 | 16 | 2690 | 33 | 238 | 25 | 1 | 297 |

Tr and Te stand for the training set and testing set, respectively.
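A split such as Table II can be produced with a stratified 10-fold procedure; the sketch below (ours, scikit-learn assumed, with synthetic labels mimicking the class imbalance) illustrates the idea:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(5)
X = rng.random((2987, 12))                       # 2,987 samples, 12 factors
y = rng.choice([1, 2, 3, 4], size=2987,
               p=[0.110, 0.800, 0.085, 0.005])   # imbalance similar to Table II

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for fold, (tr, te) in enumerate(skf.split(X, y), start=1):
    per_class = np.bincount(y[tr], minlength=5)[1:]
    print(f"fold {fold}: train per class {per_class.tolist()}, test size {te.size}")
```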
TABLE III. PERFORMANCE COMPARISON OF MSE AVERAGED OVER 30 RUNS OF THE qRBFLNs TRAINED BY PSO, TLBO, GSA, GWO, AND IGWO FOR APPROXIMATING 5 FUNCTIONS

| Func. | Nodes | PSO [7] Train | PSO [7] Test | TLBO [7] Train | TLBO [7] Test | GSA Train | GSA Test | GWO Train | GWO Test | IGWO Train | IGWO Test |
|---|---|---|---|---|---|---|---|---|---|---|---|
| f1 | 20 | 9.2420E-03 | 6.2406E-04 (\|) | 9.4783E-03 | 5.6698E-04 (\|) | 9.5382E-03 | 4.7849E-04 (+) | 9.4981E-03 | 5.8096E-04 (\|) | 9.0251E-03 | 5.9802E-04 |
| f1 | 30 | 9.0060E-03 | 9.8879E-04 (-) | 9.2554E-03 | 8.5547E-04 (\|) | 9.3409E-03 | 7.3383E-04 (+) | 9.2039E-03 | 9.1143E-04 (\|) | 8.9331E-03 | 8.4112E-04 |
| f1 | 40 | 8.5588E-03 | 1.3791E-03 (\|) | 8.8836E-03 | 1.0532E-03 (+) | 8.9978E-03 | 8.9723E-04 (+) | 8.7145E-03 | 1.3029E-03 (\|) | 8.7429E-03 | 1.3456E-03 |
| f2 | 20 | 9.2497E-03 | 6.7294E-04 (\|) | 9.3661E-03 | 6.1629E-04 (\|) | 9.5973E-03 | 4.7658E-04 (+) | 9.4905E-03 | 6.2811E-04 (\|) | 9.3083E-03 | 6.4070E-04 |
| f2 | 30 | 8.8798E-03 | 1.0876E-03 (\|) | 9.0685E-03 | 8.4881E-04 (+) | 9.2055E-03 | 6.9108E-04 (+) | 9.1419E-03 | 8.9221E-04 (+) | 8.9721E-03 | 1.0742E-03 |
| f2 | 40 | 8.5409E-03 | 7.1798E-03 (\|) | 9.0556E-03 | 1.1988E-03 (\|) | 9.2063E-03 | 8.7746E-04 (+) | 8.7873E-03 | 1.3575E-03 (\|) | 8.4992E-03 | 1.3162E-03 |
| f3 | 20 | 9.2804E-03 | 6.6472E-04 (\|) | 9.2900E-03 | 6.7414E-04 (\|) | 9.5219E-03 | 5.1719E-04 (+) | 9.4933E-03 | 5.6075E-04 (\|) | 9.4715E-03 | 6.4491E-04 |
| f3 | 30 | 9.1392E-03 | 9.8003E-04 (\|) | 9.1021E-03 | 8.8551E-04 (\|) | 9.3358E-03 | 7.4111E-04 (+) | 9.2016E-03 | 9.5448E-04 (\|) | 9.1511E-03 | 9.3726E-04 |
| f3 | 40 | 8.7142E-03 | 1.3904E-03 (\|) | 8.7238E-03 | 1.1735E-03 (\|) | 9.1134E-03 | 9.3802E-04 (+) | 8.8485E-03 | 1.2691E-03 (\|) | 8.6249E-03 | 1.2214E-03 |
| f4 | 20 | 9.2421E-03 | 6.5036E-04 (+) | 9.3879E-03 | 7.4494E-04 (\|) | 9.2794E-03 | 6.1746E-04 (+) | 9.1659E-03 | 8.0036E-04 (\|) | 9.1097E-03 | 1.1096E-03 |
| f4 | 30 | 8.9860E-03 | 1.0406E-03 (\|) | 9.1814E-03 | 9.2846E-04 (+) | 9.1525E-03 | 8.4039E-04 (+) | 9.0573E-03 | 1.0633E-03 (\|) | 8.8378E-03 | 1.1452E-03 |
| f4 | 40 | 8.6396E-03 | 1.4795E-03 (+) | 8.7809E-03 | 1.3305E-03 (+) | 9.0566E-03 | 1.1245E-03 (+) | 8.7724E-03 | 1.4743E-03 (\|) | 8.4788E-03 | 1.5437E-03 |
| f5 | 20 | 1.5490E-01 | 1.4248E-01 (-) | 1.3484E-01 | 1.2628E-01 (-) | 1.6475E-01 | 1.5394E-01 (-) | 1.4199E-01 | 1.3408E-01 (-) | 1.2430E-01 | 1.1297E-01 |
| f5 | 30 | 1.2701E-01 | 1.2203E-01 (-) | 1.0032E-01 | 8.6299E-02 (\|) | 1.4944E-01 | 1.4118E-01 (-) | 1.1841E-01 | 1.0670E-01 (-) | 9.2922E-02 | 8.4211E-02 |
| f5 | 40 | 1.0647E-01 | 2.2995E-01 (-) | 7.5923E-02 | 6.8447E-02 (\|) | 1.3177E-01 | 1.2335E-01 (-) | 9.7210E-02 | 8.6791E-02 (-) | 7.5583E-02 | 6.6789E-02 |
| Summary | | | (-)4, (+)2, (\|)9 | | (-)1, (+)4, (\|)10 | | (-)3, (+)12, (\|)0 | | (-)3, (+)1, (\|)11 | | |

Wilcoxon's test on the MSE values at a 0.05 significance level. (-), (+), and (|) indicate that the MSE is statistically worse than, better than, and similar to that of IGWO-qRBFLNs, respectively.
TABLE IV. PERFORMANCE COMPARISON OF G-MEAN AND ACCURACY AVERAGED OVER 30 RUNS OF THE qRBFLNs TRAINED BY PSO, GSA, TLBO, GWO, AND IGWO FOR THE PRE-HYPERTENSION CLASSIFICATION PROBLEM

G-mean:

| Algorithm | 60 nodes: Train (%) | 60: Test (%) | 60: Time (s) | 80: Train (%) | 80: Test (%) | 80: Time (s) | 100: Train (%) | 100: Test (%) | 100: Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| PSO-qRBFLNs | 64.6267 | 61.4743 | 1.21E+03 | 65.6766 | 62.8641 | 1.61E+03 | 66.3303 | 63.6231 | 2.09E+03 |
| GSA-qRBFLNs | 62.4018 | 60.1280 | 7.23E+02 | 63.7704 | 60.8961 | 9.10E+02 | 64.6371 | 61.8013 | 1.17E+03 |
| TLBO-qRBFLNs | 66.8531 | 63.2335 | 9.53E+02 | 67.2673 | 63.4516 | 1.24E+03 | 67.5659 | 64.1860 | 1.54E+03 |
| GWO-qRBFLNs | 67.2051 | 63.0006 | 1.19E+03 | 67.9042 | 63.8101 | 1.55E+03 | 68.0525 | 64.0506 | 1.99E+03 |
| IGWO-qRBFLNs | 67.3858 | 63.4507 | 8.00E+02 | 67.8757 | 63.9399 | 1.05E+03 | 68.2381 | 64.2944 | 1.35E+03 |

Accuracy:

| Algorithm | 60 nodes: Train (%) | 60: Test (%) | 60: Time (s) | 80: Train (%) | 80: Test (%) | 80: Time (s) | 100: Train (%) | 100: Test (%) | 100: Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| PSO-qRBFLNs | 82.8116 | 82.0487 | 1.21E+03 | 82.8988 | 82.0783 | 1.61E+03 | 82.9746 | 82.1061 | 2.09E+03 |
| GSA-qRBFLNs | 82.4881 | 81.9710 | 7.23E+02 | 82.6592 | 82.0122 | 9.10E+02 | 82.8022 | 82.0831 | 1.17E+03 |
| TLBO-qRBFLNs | 82.9677 | 82.1353 | 9.53E+02 | 83.0403 | 82.1174 | 1.24E+03 | 83.0893 | 82.2098 | 1.54E+03 |
| GWO-qRBFLNs | 83.0206 | 82.0610 | 1.19E+03 | 83.1222 | 82.1710 | 1.55E+03 | 83.1485 | 82.1743 | 1.99E+03 |
| IGWO-qRBFLNs | 83.0531 | 82.1545 | 8.00E+02 | 83.1082 | 82.1843 | 1.05E+03 | 83.1624 | 82.2119 | 1.35E+03 |
B. Parameter settings

In all experiments, the population size is set to 30, the maximum number of function evaluations (max_NFEs) is set to 10,000 as the stopping criterion, and the number of runs is set to 30. The PSO parameters are set as in [7]: the acceleration constants c1 and c2 are both set to 2, and the inertia weight ω decreases linearly from 0.9 to 0.4. The TLBO algorithm requires no special parameter settings. For the GSA algorithm, all parameters are set as in [4]: G0 is set to 100, α is set to 20, and K0 is set to the total number of agents and decreases linearly to 1. Lastly, for the GWO and IGWO algorithms, the parameter a is linearly decreased from 2 to 0.
TABLE V. p-VALUES OF THE WILCOXON SIGNED-RANK TEST FOR PAIRWISE COMPARISONS OF G-MEAN BETWEEN THE ALGORITHMS ON THE PRE-HYPERTENSION CLASSIFICATION PROBLEM

60 hidden nodes:

| | PSO | GSA | TLBO | GWO |
|---|---|---|---|---|
| GSA | 0.0012 | | | |
| TLBO | 0.9225 | 0.0005 | | |
| GWO | 0.7240 | 0.0003 | 0.5970 | |
| IGWO | 0.6996 | 0.0073 | 0.8295 | 0.7095 |

80 hidden nodes:

| | PSO | GSA | TLBO | GWO |
|---|---|---|---|---|
| GSA | 0.0001 | | | |
| TLBO | 0.1502 | 0.0053 | | |
| GWO | 0.1328 | 0.0342 | 0.4747 | |
| IGWO | 0.6137 | 0.0009 | 0.6944 | 0.1907 |

100 hidden nodes:

| | PSO | GSA | TLBO | GWO |
|---|---|---|---|---|
| GSA | 0.0001 | | | |
| TLBO | 0.0459 | 0.0101 | | |
| GWO | 0.0030 | 0.3484 | 0.0886 | |
| IGWO | 0.0526 | 0.0119 | 0.7423 | 0.0880 |

All algorithms are the qRBFLNs variants (e.g., PSO-qRBFLNs). A p-value smaller than 0.05 means that one algorithm is significantly different from the other.
C. Experimental results

1) Testing with regression problems: The activation function in the output neuron is f(x) = x. The experimental results are presented in Table III; the lowest MSE is the best. It can be seen that the test MSE of IGWO is the best on function f5, even though it is slightly higher than that of GSA on functions f1, f2, f3, and f4. However, the GSA can hardly solve f5, which is the most difficult of the five problems [7]; when solving it, all the other algorithms are inferior to the IGWO. The superior performance of IGWO results from the balance between exploration and exploitation, which helps the agents avoid local optima and reach the global optimum.

2) Testing with the Pre-Hypertension classification problem: The Pre-Hypertension dataset is a high-dimensional imbalanced multiclass classification problem, which is more challenging for neural networks, particularly RBFNNs. The experimental results are shown in Table IV; the highest G-mean and accuracy are the best. The IGWO yields the highest G-mean and accuracy in the majority of cases, and the training time of IGWO is also shorter than that of GWO. That is to say, the improved IGWO is properly integrated into the training process of qRBFLNs to solve a more difficult problem, particularly a high-dimensional imbalanced multiclass classification problem.

3) Statistical test result: To assess the significance of the differences between the other meta-heuristic algorithms and IGWO, the pairwise Wilcoxon signed-rank test at a 0.05 significance level is used. Table V shows the p-values of the G-mean comparisons between pairs of algorithms on the Pre-Hypertension classification problem.
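For reference, such a pairwise test can be run as follows (our sketch with synthetic per-run scores; SciPy assumed):

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(9)
gmean_igwo = rng.normal(67.4, 0.8, size=30)   # 30 runs (synthetic scores)
gmean_gsa = rng.normal(62.4, 0.9, size=30)

stat, p = wilcoxon(gmean_igwo, gmean_gsa)     # paired signed-rank test
print(f"p-value = {p:.4f}; significant at the 0.05 level: {p < 0.05}")
```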
V. CONCLUSIONS

This paper proposes an improved IGWO algorithm, which concentrates on the mechanism that balances the exploration and exploitation abilities of GWO. The proposed method is simple to use and to implement. In order to evaluate the performance of the proposed algorithm, the RBFLNs neural networks based on the q-Gaussian are used as an application, and other existing meta-heuristic algorithms are compared against the IGWO algorithm. According to the experimental results, despite the fact that the IGWO is slightly inferior to the GSA in training qRBFLNs for the one-dimensional function approximation problems, the IGWO is outstanding in solving the more difficult problem (f5). Moreover, the results on the real-world Pre-Hypertension problem reveal that the IGWO yields the highest G-mean and accuracy; this is due to its ability to balance exploration and exploitation. In this work, the proposed IGWO method proved very suitable for training qRBFLNs to solve the multiclass classification problem. The proposed algorithm can also be applied to various other neural networks.

For future work, with regard to regression problems, we are going to extend our investigation to the performance of the algorithm on high-dimensional function approximation problems. Furthermore, real-world data should also be used to validate the proposed method. As for the classification problem, the results of the proposed method can be further compared against other existing classification methods.

REFERENCES

[1] K.S. Tang, K.F. Man, S. Kwong, and Q. He, "Genetic algorithms and their applications," IEEE Signal Processing Magazine, vol. 13, no. 6, pp. 22-37, Nov. 1996.
[2] R. Storn and K. Price, "Differential evolution: A simple and efficient heuristic for global optimization over continuous spaces," J. Global Optimiz., vol. 11, no. 4, pp. 341-359, Dec. 1997.
[3] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proc. IEEE Int. Conf. Neural Networks, vol. 4, 1995, pp. 1942-1948.
[4] E. Rashedi, H. Nezamabadi-pour, and S. Saryazdi, "GSA: A gravitational search algorithm," Inf. Sci., vol. 179, no. 13, pp. 2232-2248, Jun. 2009.
[5] R.V. Rao, V.J. Savsani, and D.P. Vakharia, "Teaching-learning-based optimization: A novel method for constrained mechanical design optimization problems," Comput. Aid. Des., vol. 43, no. 3, pp. 303-315, Mar. 2011.
[6] S. Mirjalili, S.M. Mirjalili, and A. Lewis, "Grey wolf optimizer," Adv. Eng. Softw., vol. 69, pp. 46-61, Mar. 2014.
[7] N. Muangkote, K. Sunat, and S. Chiewchanwattana, "Evolutionary training of a q-Gaussian radial basis functional-link nets for function approximation," in Proc. JCSSE, 2013, pp. 58-63.
[8] C.G. Looney, "Radial basis functional link nets and fuzzy reasoning," Neurocomputing, vol. 48, no. 1-4, pp. 489-509, Oct. 2002.
[9] T. Yamano, "Some properties of q-logarithm and q-exponential functions in Tsallis statistics," Physica A, vol. 305, pp. 486-496, 2002.
[10] W.J. Thistleton, J.A. Marsh, K. Nelson, and C. Tsallis, "Generalized Box-Müller method for generating q-Gaussian random deviates," IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4805-4810, Dec. 2007.
[11] S. Mirjalili, S.Z.M. Hashim, and H.M. Sardroudi, "Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm," Appl. Math. Comput., vol. 218, no. 22, pp. 11125-11137, Jul. 2012.
[12] W. Zong, G.-B. Huang, and Y. Chen, "Weighted extreme learning machine for imbalance learning," Neurocomputing, vol. 101, pp. 229-242, Feb. 2013.
[13] C. Somchai, S. Chiewchanwattana, K. Sunat, and N. Muangkote, "Extreme learning machine for Pre-Hypertension classification," in Proc. ICSEC, 2013, pp. 501-506.
K.S. Tang, K.F. Man, S. Kwong, Q. He, “Genetic algorithms and their applications,” IEEE Signal Processing Magazine, vol 13, no. 6, pp. 2237, Nov. 1996. R. Storn and K. Price, “Differential evolution: A simple and efficient heuristic for global optimization over continuous spaces,” J. Global Optimiz., vol. 11, no. 4, pp. 341-359, Dec. 1997. J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proc. IEEE Int. Conf. Neural Networks, vol. 4, 1995, pp. 1942–1948. E. Rasedi, H. Nezamabadi-pour, and S. Saryazdi, “GSA: A gravitational search algorithm,” Inf. Sci, vol. 179, no. 13, pp. 2232-2248, Jun. 2009. R.V. Rao, V.J. Savsani, and D.P. Vakharia, “Teaching–learning-based optimization: A novel method for constrained mechanical design optimization problems,” Comput. Aid. Des., vol. 43, no. 3, pp. 303-315, Mar. 2011. S. Mirjalili, S.M. Mirjalili, and A. Lewis, “Grey wolf optimizer,” Adv. Eng. Softw., vol. 69, pp. 46-61, Mar. 2014. N. Muangkote, K. Sunat, and S. Chiewchanwattana, “Evolutionary training of a q-gaussian radial basis functional-link nets for function approximation,” in Proc. JCSSE, 2013, pp. 58-63. C.G. Looney, “Radial basis functional link nets and fuzzy reasoning,” Neurocomputing, vol. 48, no. 1–4, pp. 489-509, Oct. 2002. T. Yamano, “Some properties of q-logarithm and q-exponential functions in Tsallis statistics,” Physica A., vol. 305, pp. 486 - 496, 2002. W.J. Thistleton, J.A. Marsh, K. Nelson, and C. Tsallis, “Generalized Box–Müller Method for Generating q-Gaussian Random Deviates,” IEEE Trans. Info. Theo., vol. 53, no. 12, pp. 4805 - 4810, Dec. 2007. S. Mirjalili, S.Z.M. Hashim, and H.M. Sardroudi, “Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm,” Appl. Math. Comput., vol. 218, no. 22, pp. 11125–11137, Jul. 2012. W. Zong, G.-B. Huang, and Y. Chen, “Weighted extreme learning machine for imbalance learning,” Neurocomputing, vol. 101, pp. 229– 242, Feb. 2013. C. Somchai, S. Chiewchanwattana, K. Sunat, and N. Muangkote, “Extreme learning machine for Pre-Hypertension classification,” in Proc. ICSEC, 2013, pp. 501-506.