Neural Comput & Applic (2013) 23:293–297 DOI 10.1007/s00521-012-0896-3

ORIGINAL ARTICLE

Research on using genetic algorithms to optimize Elman neural networks

Shifei Ding · Yanan Zhang · Jinrong Chen · Weikuan Jia



Received: 12 October 2011 / Accepted: 14 February 2012 / Published online: 1 March 2012 © Springer-Verlag London Limited 2012

Abstract Elman neural networks provide a dynamic mapping capability when processing non-linear complex data. Because the Elman network inherits features of the back-propagation (BP) network to some extent, it also has many of BP's defects: it falls easily into local minima, its learning rate is fixed, and the number of hidden-layer neurons is uncertain, all of which affect processing accuracy. We therefore use a genetic algorithm to optimize the weights, thresholds and number of hidden-layer neurons of Elman networks. This improves the training speed and generalization ability of the Elman network and yields an optimal algorithm model. Experimental analysis shows that the new algorithm is superior to the traditional model in convergence rate, prediction error and the number of successful training runs, which demonstrates the effectiveness of the new algorithm and its value for wider application.

Keywords Elman neural networks · Genetic algorithm · GA-Elman algorithm

S. Ding (✉) · Y. Zhang · J. Chen · W. Jia
School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
e-mail: [email protected]; [email protected]

S. Ding
Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China

1 Introduction

High dimensionality, complex structure and information-relatedness characterize non-linear complex data. Processing such data is the basis of pattern recognition and data mining, and it remains a great

challenge. Artificial neural networks (ANN) [7] offer large-scale computing capability and can readily realize non-linear mappings, which gives them a unique advantage when dealing with large, complex non-linear systems. The Elman neural network [3] is a feedback neural network. Based on the back-propagation (BP) network [8], it adds an undertake (context) layer alongside the hidden layer that acts as a one-step delay operator for memory, giving the system a dynamic time-varying capacity and strong global stability. Because the Elman network is essentially an extension of BP, it inherits BP's characteristics, including its defects: it falls easily into local minima, the learning rate is fixed, and the number of hidden-layer neurons is hard to determine [6]. These defects can cause training to fail, waste time on repeated training runs, produce the illusion of good fitting, and ultimately reduce recognition accuracy. The Elman network shares these defects to a certain extent. The genetic algorithm (GA) [4, 5], as a representative of evolutionary computation, is robust, non-linear and parallel, and can solve complex problems without prior knowledge of the problem domain. Using GA to optimize the Elman network's weights and the number of hidden-layer neurons simplifies the network structure and improves training speed, convergence and generalization ability. In recent years, using GA to optimize neural networks has attracted the attention of many scholars. Zhang Lijun proposed a GA to optimize the initial weights of an Elman network, applied it to stock-price prediction and built an efficient GA-Elman dynamic recurrent neural network stock-prediction model [12]. Wang Tian'e used GA to optimize both structure and weights, proposed a modified dynamic Elman network model and applied it successfully to Dongfeng Motor's stock-price prediction [10]. Zhang Xiuling used GA to optimize the initial weights and thresholds and built an Elman network model for predicting


nickel-metal hydride battery capacity [13]. Ding Shifei optimized BP connection weights and thresholds with GA to improve BP's training speed and overcome its tendency to fall into local minima [1, 2]. Ding Shifei also used a hybrid GA coding to optimize RBF structure and weights, then adjusted the network with the pseudo-inverse method or LMS, obtaining a better network structure and stronger classification ability while reducing the time needed to build the network; it is an adaptive, intelligent learning algorithm [1, 2]. Most of the above studies achieved good results, but each optimizes either the Elman network weights or the network structure separately. In this paper, we optimize the network weights and the network architecture simultaneously with GA, propose the resulting GA-Elman algorithm, and verify its effectiveness by experiment.

2 Basic theory

2.1 The basic theory of GA

GA is a highly parallel, randomized, adaptive search algorithm developed from the mechanisms of natural selection and evolution; it searches with a population of candidate solutions [9]. Selection, crossover, mutation and other genetic operations produce each new generation, and the population gradually evolves until it reaches an optimal state containing an approximately optimal solution. The main problems in constructing a genetic algorithm are the encoding method and the design of the genetic operators. Different optimization problems need different encodings and operators, and understanding the problem to be solved is the key to a successful GA application. GA is in fact an iterative process: in each iteration it retains a set of candidate solutions, ranks them by quality, chooses some of them according to certain indicators, and applies genetic operators to produce a new generation of candidates, repeating until the convergence target is met. Figure 1 shows the GA flow [11], where Gen is the generation number and i counts the individuals processed; when i = M, the algorithm moves to the next generation. The basic genetic algorithm is as follows (a minimal sketch in code is given after the steps):

Step 1 Build an initial population of random strings.

Fig. 1 The flow chart of GA (initialize the population with Gen = 0; calculate fitness; if the termination condition is satisfied, output the result and stop; otherwise, with probabilities Pc, Pv and Pe, select a genetic operator: copy one individual chosen by fitness, exchange two individuals chosen by fitness, or apply variation; insert each result into the new group while incrementing i, and when i = M set Gen = Gen + 1 and continue)


Step 2 Calculate the fitness of each individual.
Step 3 According to the genetic probabilities, apply replication, crossover and mutation to generate a new population.
Step 4 Repeat Steps 2 and 3 until the termination condition is met; select the best individual as the result of the genetic algorithm.
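To make Steps 1-4 concrete, the following minimal Python sketch implements a basic binary-coded GA; the population size, string length, toy fitness function and operator probabilities are illustrative assumptions, not values from the paper.

```python
import random

# Illustrative parameters, not values from the paper.
POP_SIZE, CHROM_LEN, GENERATIONS = 30, 20, 100
PC, PM = 0.8, 0.01          # crossover and mutation probabilities

def fitness(chrom):
    # Toy objective (count of 1-bits); replace with a problem-specific measure.
    return sum(chrom)

def select_two(pop):
    # Roulette-wheel selection; fitness shifted by 1 to avoid zero weights.
    weights = [fitness(c) + 1 for c in pop]
    return random.choices(pop, weights=weights, k=2)

def crossover(a, b):
    # Single-point exchange with probability PC.
    if random.random() < PC:
        p = random.randint(1, CHROM_LEN - 1)
        return a[:p] + b[p:], b[:p] + a[p:]
    return a[:], b[:]

def mutate(chrom):
    # Flip each bit with probability PM (the "variation" operator).
    return [bit ^ 1 if random.random() < PM else bit for bit in chrom]

# Step 1: a random initial population of binary strings.
pop = [[random.randint(0, 1) for _ in range(CHROM_LEN)] for _ in range(POP_SIZE)]
for gen in range(GENERATIONS):              # one pass = one generation (Gen)
    new_pop = []
    while len(new_pop) < POP_SIZE:          # the i = 0..M loop of Fig. 1
        a, b = select_two(pop)              # Step 2: fitness-based selection
        c, d = crossover(a, b)              # Step 3: genetic operators
        new_pop.extend([mutate(c), mutate(d)])
    pop = new_pop[:POP_SIZE]
best = max(pop, key=fitness)                # Step 4: best individual as result
```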

2.2 The theory of the Elman neural network

Figure 2 shows the structure of the Elman network. The topology is generally divided into four layers: input layer, hidden layer, undertake (context) layer and output layer. The undertake layer remembers the output of the hidden layer and can be seen as a one-step delay operator. On top of the BP network, the hidden layer's output is associated with its input through the delay and storage of the undertake layer. This association is sensitive to historical data, and the internal feedback increases the network's ability to handle dynamic information. Remembering the internal state gives the network a dynamic mapping function and lets the system adapt to time-varying characteristics.

Suppose the network has n inputs and m outputs, the hidden and undertake layers each have r neurons, the weights from the input layer to the hidden layer are w^1, from the undertake layer to the hidden layer w^2, and from the hidden layer to the output layer w^3; u(k-1) is the network input, x(k) the hidden-layer output, x_c(k) the undertake-layer output and y(k) the network output. Then

x(k) = f(w^2 x_c(k) + w^1 u(k-1)),   (1)

where x_c(k) = x(k-1) and f is the hidden-layer transfer function, commonly an S-type (sigmoid) function:

f(x) = (1 + e^{-x})^{-1}.   (2)

g is the transfer function of the output layer, often a linear function:

y(k) = g(w^3 x(k)).   (3)

The Elman network uses the BP algorithm to revise the weights; the network error is

E = \sum_{k=1}^{m} (t_k - y_k)^2,   (4)

where t_k is the target output vector.
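As a numerical illustration of Eqs. (1)-(4), here is a minimal NumPy sketch of one Elman forward pass and the squared error; the layer sizes, random weights and target are illustrative assumptions (thresholds are omitted for brevity).

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, m = 4, 6, 2                  # assumed input, hidden/undertake, output sizes

w1 = rng.standard_normal((r, n))   # input -> hidden weights
w2 = rng.standard_normal((r, r))   # undertake -> hidden weights
w3 = rng.standard_normal((m, r))   # hidden -> output weights

def f(x):
    # Eq. (2): the S-type (sigmoid) hidden transfer function.
    return 1.0 / (1.0 + np.exp(-x))

def elman_step(u_prev, xc):
    x = f(w2 @ xc + w1 @ u_prev)   # Eq. (1): hidden-layer output
    y = w3 @ x                     # Eq. (3) with linear g
    return x, y

xc = np.zeros(r)                   # undertake layer starts empty
for k in range(5):                 # drive the network with a short sequence
    u_prev = rng.standard_normal(n)
    x, y = elman_step(u_prev, xc)
    xc = x                         # x_c(k+1) = x(k): the one-step delay
t = np.zeros(m)                    # illustrative target vector
E = np.sum((t - y) ** 2)           # Eq. (4): error sum of squares
```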

3 GA-Elman algorithm

3.1 GA-Elman optimization

Using GA to optimize an Elman network involves chromosome coding, the fitness function and the design of the genetic operators. GA-Elman optimization [11] can be seen as an adaptive system that adjusts its connection weights and structure automatically, without human intervention; this integration of GA and Elman is shown in Fig. 3. Earlier GA-optimized networks optimize either the weights alone or the network structure alone. Optimizing the network weights replaces the traditional learning algorithm with GA to overcome the tendency to fall into local minima. On one hand, GA is good at searching large-scale, non-differentiable, multi-modal spaces and needs no gradient information about the error function, a unique advantage when such information is hard to obtain. On the other hand, because the error function need not be differentiable, penalty terms can be added to it to improve the versatility of the network and reduce its complexity, which shows great potential for evolving connection weights. For a given problem, it is hard to judge the right network size: too few connections and hidden nodes limit the network's capacity, while too many cause noise to be trained along with the signal and give poor generalization ability.

Fig. 2 The topology structure of Elman NN (inputs u1 … un feed the hidden layer x1 … xr through weights w1; the undertake layer xc1 … xcr feeds back to the hidden layer through w2; the hidden layer feeds the output layer y1 … yn through w3)

Fig. 3 GA combined with NN (the genetic algorithm evolves groups of chromosomes; decoding a chromosome yields a new neural network, which is trained on training samples and checked on simulating samples; the evaluated network performance gives the fitness that drives the GA's evolution and finally yields the trained network)


Optimizing the network structure lets different problems use appropriate topologies, which addresses the difficulty of determining the number of hidden-layer neurons. Above all, optimizing the connection weights and the structure at the same time provides an automatic design method for neural networks. The biggest problem is how to encode the network weights, thresholds and the number of hidden-layer neurons. Here the connection weights and thresholds use binary encoding, the number of hidden neurons uses real encoding, and a maximum number of hidden-layer neurons is set; together these yield the best network model.

3.2 GA-Elman algorithm

The basic steps of the GA-Elman algorithm are as follows (a sketch of the loop is given at the end of this subsection):

Step 1 Unify the Elman network weights and thresholds and encode them in binary; encode the number of hidden-layer neurons with real coding (and set the maximum number of neurons).
Step 2 Decode the encoding of Step 1 to obtain a concrete network.
Step 3 Train the network with the given training samples; if the precision meets the requirement, stop training, otherwise go to Step 4.
Step 4 Determine each individual's fitness from the objective function and the training result; choose a number of individuals with the largest fitness to pass on to the next generation.
Step 5 Apply crossover, mutation and other operations to the current population to produce the next population.
Step 6 Return to Step 2 until the set number of generations is reached.

The significance of the new model is that optimizing the connection weights and thresholds improves training speed and convergence, saves running time and raises network efficiency, while optimizing the number of hidden-layer neurons determines the optimal structure and improves the processing capacity of the network.
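To make the hybrid concrete, here is a compact Python sketch of the GA-Elman loop under the mixed encoding described above. The chromosome layout, bit precision, toy data, and the stand-in fitness (which scores a decoded feed-forward pass instead of fully training an Elman network) are illustrative assumptions, not the authors' implementation.

```python
import random
import numpy as np

N_IN, N_OUT, MAX_HIDDEN, BITS = 4, 1, 10, 8   # toy sizes, not the paper's
rng = np.random.default_rng(0)
X = rng.standard_normal((20, N_IN))           # toy training inputs
T = rng.standard_normal((20, N_OUT))          # toy targets

def n_weights(r):
    # w1 (r x n) + w2 (r x r) + w3 (m x r) + thresholds (r + m)
    return r * N_IN + r * r + N_OUT * r + r + N_OUT

def random_individual():
    # Step 1: real-coded neuron count plus a binary weight/threshold string.
    r = random.randint(2, MAX_HIDDEN)
    bits = [random.randint(0, 1) for _ in range(n_weights(MAX_HIDDEN) * BITS)]
    return [r, bits]

def decode(ind):
    # Step 2: each BITS-bit group maps to a weight in [-1, 1]; an r-neuron
    # network uses only the first n_weights(r) groups.
    r, bits = ind
    vals = [int("".join(map(str, bits[i * BITS:(i + 1) * BITS])), 2)
            / (2 ** BITS - 1) * 2 - 1 for i in range(n_weights(r))]
    return r, np.array(vals)

def fitness(ind):
    # Steps 3-4 stand-in: score the decoded network's squared error on the
    # toy data (a real implementation would train the Elman network here,
    # including the undertake-layer weights w2, which are skipped below).
    r, v = decode(ind)
    w1 = v[:r * N_IN].reshape(r, N_IN)
    i = r * N_IN + r * r                      # skip w2: no sequence data here
    w3 = v[i:i + N_OUT * r].reshape(N_OUT, r)
    Y = (1 / (1 + np.exp(-(X @ w1.T)))) @ w3.T
    return 1.0 / (1.0 + np.sum((T - Y) ** 2))

pop = [random_individual() for _ in range(20)]
for gen in range(30):                         # Step 6: generation loop
    pop.sort(key=fitness, reverse=True)
    elite = pop[:10]                          # Step 4: keep the fittest half
    children = []
    for _ in range(10):                       # Step 5: crossover + mutation
        a, b = random.sample(elite, 2)
        cut = random.randrange(1, len(a[1]))
        bits = a[1][:cut] + b[1][cut:]
        bits = [x ^ 1 if random.random() < 0.01 else x for x in bits]
        children.append([random.choice([a[0], b[0]]), bits])
    pop = elite + children
best = max(pop, key=fitness)
```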

4 Experiments

This paper uses the radar ionosphere data from the standard UCI data sets [http://www.ics.uci.edu/~mlearn/databases/ionosphere/] for the simulation. The data set has 351 samples; each sample has 34 features and is used to predict the quality of the radar return. The test selected 300 samples as training samples and the remaining 51 as simulation samples. First, MATLAB was used to establish a traditional Elman network with 34 input-layer neurons and one output-layer neuron.


Table 1 The comparison of the performance of the Elman model and the GA-Elman models

Network model   Successful training rate (%)   Steps to convergence   Error sum of squares   Accuracy (%)
Elman           53.33                          837                    4.3219                 78.43
GA-Elman1       60.00                          300 + 209              2.0093                 86.27
GA-Elman2       93.33                          300 + 143              1.8760                 88.23
GA-Elman3       96.67                          300 + 65               0.9081                 96.08

Next, GA was used to optimize only the number of hidden neurons (GA-Elman1); then to optimize only the network weights and thresholds (GA-Elman2); and finally to optimize the network weights and the number of hidden neurons at the same time (GA-Elman3). The results of these models are listed in Table 1. The genetic optimization was run for 300 generations; following Gao Daqi's empirical formula (taking four times its value), the maximum number of hidden-layer neurons was set to 44. Table 1 measures the performance of these models by the following criteria: the rate of successful training, i.e. the number of successful training runs out of a fixed total (30 in this experiment); the number of steps the best training run takes to converge (each model was trained 30 times and the best run's steps are reported); the error sum of squares, i.e. the sum of the squared differences between predicted and actual values, which measures how close the predictions are to the actual values (at the same accuracy, a smaller error sum of squares indicates higher precision); and the accuracy on the simulation samples. From Table 1, in training success rate (successful runs out of 30) the GA-Elman3 model is superior to the traditional model and slightly better than the other two models, while GA-Elman1 and GA-Elman2 are comparable. In the convergence of the best training run, the number of steps decreases from model to model, indicating faster convergence; GA-Elman3 uses the fewest steps and converges fastest. In error sum of squares and simulation accuracy, GA-Elman3 is significantly better than the traditional Elman model and better than the other two models, while GA-Elman1 and GA-Elman2 are themselves quite effective. A sketch of the data split and the evaluation measures follows.
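As a hedged sketch of this setup (the paper used MATLAB), the following Python fragment shows the 300/51 split of the UCI ionosphere data and the two evaluation measures reported in Table 1; the local file name and the label handling are assumptions about the data format.

```python
import numpy as np

# UCI ionosphere: 351 rows, 34 numeric features, then a 'g'/'b' class label.
# The local file name "ionosphere.data" is an assumption about the setup.
raw = np.loadtxt("ionosphere.data", delimiter=",", dtype=str)
X = raw[:, :34].astype(float)
y = (raw[:, 34] == "g").astype(float)      # 1 = good radar return, 0 = bad

X_train, y_train = X[:300], y[:300]        # 300 training samples
X_test, y_test = X[300:], y[300:]          # remaining 51 simulation samples

def evaluate(pred, target):
    # The two Table 1 measures: error sum of squares and accuracy (%).
    sse = np.sum((target - pred) ** 2)
    acc = np.mean((pred > 0.5) == (target > 0.5)) * 100
    return sse, acc
```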

5 Results and discussion

The Elman network inherits the defects of BP: training the network from random initial weights can lead to local


convergence (a local minimum) and excessive training time, and the number of hidden-layer neurons is difficult to determine, which makes the structure of the network hard to design. To address these defects, this paper proposes the GA-Elman algorithm, which optimizes the network weights and the number of hidden-layer neurons together. The experiment summarized in Table 1 shows that optimizing the network weights and structure together yields the optimum model, which verifies the effectiveness of the new algorithm. Using GA to optimize the Elman network takes some time itself, but it improves the convergence speed and the success rate of training; overall it saves a great deal of training time, improves operational efficiency and yields the optimal model for simulation, giving the best results. Optimizing the weights and the structure at the same time is superior to optimizing either alone. The new algorithm greatly improves performance: it strengthens the self-learning ability, speeds up convergence and saves running time, which increases the efficiency of the network.

Acknowledgments This work is supported by the National Natural Science Foundation of China (Nos. 41074003, 60975039) and the Opening Foundation of the Key Laboratory of Intelligent Information Processing of the Chinese Academy of Sciences (No. IIP2010-1).

References

1. Ding SF, Su CY, Yu JZ (2011) An optimizing BP neural network algorithm based on genetic algorithm. Artif Intell Rev 36(2):153–162
2. Ding SF, Jia WK, Su CY et al (2011) Research of neural network algorithm based on factor analysis and cluster analysis. Neural Comput Appl 20(2):297–302
3. Elman JL (1990) Finding structure in time. Cogn Sci 14(2):179–211
4. Holland JH (1975) Adaptation in natural and artificial systems. MIT Press, Cambridge
5. Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge
6. Lu JJ, Chen H (2006) Researching development on BP neural network. Control Eng China 13(5):449–451
7. McCulloch WS, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5:115–133
8. Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536
9. Shi ZZ (2009) Neural networks. Higher Education Press, Beijing
10. Wang T, Ye DQ (2009) Application of improved dynamic neural network based on GA to stock market prediction. Comput Technol Dev 19(1):214–216
11. Wang YN (2002) Intelligent information processing technology. Higher Education Press, Beijing
12. Zhang LJ, Yuan D (2008) Stock market forecasting research based on GA-Elman neural network. East China Econ Manag 22(9):79–82
13. Zhang XL, Zhu CY (2009) Elman neural networks based on genetic algorithms and its application in prediction of MH-Ni battery capacity. Ind Instrum Autom 4:100–102
