
Proceedings of the 5th World Congress on Intelligent Control and Automation, June 15-19, 2004, Hangzhou, P.R. China

A Local Linear Wavelet Neural Network

Yuehui Chen, Jiwen Dong, Bo Yang and Yong Zhang
School of Information Science and Engineering, Jinan University, 250022 Jinan, P.R. China
[email protected]

Abstract - This paper presents a local linear wavelet neural network. The difference between this network and the original wavelet neural network (WNN) is that the connection weights between the hidden layer and the output layer of the original WNN are replaced by a local linear model. A simple and fast training algorithm, particle swarm optimization (PSO), is also introduced for training the local linear wavelet neural network. Simulation results for the identification of nonlinear systems show the feasibility and effectiveness of the proposed method.

Index Terms - Local linear wavelet neural network, particle swarm optimization algorithm, system identification

I. INTRODUCTION

Recently, instead of using the common sigmoid activation functions, the wavelet neural network (WNN), which employs nonlinear wavelet basis functions (named wavelets) that are localized in both the time space and the frequency space, has been developed as an alternative approach to nonlinear fitting problems [3]. Two key problems in designing a WNN are how to determine its architecture and which learning algorithm can be used effectively for training [2][8]. These problems amount to determining the optimal architecture of the WNN, arranging the windows of the wavelets, and finding a proper orthogonal or non-orthogonal wavelet basis. The curse of dimensionality is a largely unsolved problem in WNN theory, and it brings difficulties when applying WNNs to high-dimensional problems.

The basis function neural networks are a class of neural networks in which the output of the network is a weighted sum of a number of basis functions. Commonly used basis functions include B-spline basis functions, wavelet basis functions and some neurofuzzy basis functions [4][15].

Particle swarm optimization (PSO) is a population-based optimization method first proposed by Kennedy and Eberhart [1]. Attractive features of PSO include its ease of implementation and the fact that no gradient information is required. It can be used to solve a wide array of optimization problems; example applications include neural network training [9][10][11][12] and function minimization [13][14].

In this paper, a local linear wavelet neural network is proposed in which the connection weights between the hidden layer units and the output units are replaced by a local linear model. The usual learning algorithm for wavelet neural networks is the gradient descent method, but its disadvantages are slow convergence and a tendency to become trapped in local minima. Particle swarm optimization is a recent population-based stochastic global search algorithm, and the proposed local linear wavelet neural network is trained by the PSO algorithm. Simulation results for the identification of nonlinear systems show the effectiveness of the proposed method.

The paper is organized as follows. The local linear wavelet neural network is introduced in Section II. The particle swarm optimization used for learning the local linear wavelet neural network is described in Section III. Experiments on nonlinear system identification problems are given in Section IV. Finally, concluding remarks are drawn in the last section.

II. LOCAL LINEAR WAVELET NEURAL NETWORK

In terms of wavelet transformation theory, wavelets of the form

$$\Psi = \left\{ \Psi_i(x) = |a_i|^{-1/2}\, \psi\!\left(\frac{x - b_i}{a_i}\right) : a_i, b_i \in \mathbb{R},\ i \in \mathbb{Z} \right\}$$

$$x = (x_1, \ldots, x_N), \quad a_i = (a_{i1}, \ldots, a_{iN}), \quad b_i = (b_{i1}, \ldots, b_{iN})$$

form a family of functions generated from a single function $\psi(x)$ by the operations of dilation and translation. $\psi(x)$, which is localized in both the time space and the frequency space, is called the mother wavelet, and the parameters $a_i$ and $b_i$ are named the scale (dilation) and translation parameters, respectively. In the standard form of a wavelet neural network, the output of the WNN is given by

$$f(x) = \sum_{i=1}^{M} w_i \Psi_i(x) = \sum_{i=1}^{M} w_i\, |a_i|^{-1/2}\, \psi\!\left(\frac{x - b_i}{a_i}\right) \qquad (1)$$

where $\Psi_i$ is the wavelet activation function of the $i$th unit of the hidden layer and $w_i$ is the weight connecting the $i$th unit of the hidden layer to the output layer unit. Obviously, the localization of the $i$th hidden unit is determined by the scale parameter $a_i$ and the translation parameter $b_i$. According to previous research, these two parameters can either be predetermined based on wavelet transformation theory or be determined by a training algorithm. Note that the above wavelet neural network is a kind of basis function neural network in the sense that the


wavelets constitute the basis functions. An intrinsic feature of basis function networks is the localized activation of the hidden layer units, so the connection weights associated with the units can be viewed as locally accurate piecewise constant models whose validity for a given input is indicated by the activation functions. Compared with the multilayer perceptron, this local capacity provides advantages such as learning efficiency and structure transparency. However, it also causes the main problem of basis function networks: owing to the crudeness of the local approximation, a large number of basis function units has to be employed to approximate a given system. A shortcoming of the wavelet neural network, in particular, is that many hidden units are needed for higher-dimensional problems. In order to take advantage of the local capacity of the wavelet basis functions while keeping the number of hidden units acceptable, we propose an alternative type of wavelet neural network here. The output of its $k$th output unit is given by

$$y_k = \sum_{i=1}^{M} \left( w_{i0} + w_{i1}x_1 + \cdots + w_{iN}x_N \right) \Psi_i(x) \qquad (2)$$

where, instead of the straightforward weight $w_i$ (a piecewise constant model), a linear model

$$v_i = w_{i0} + w_{i1}x_1 + \cdots + w_{iN}x_N$$

is introduced. Because the activities of the linear models $v_i$ ($i = 1, 2, \ldots, M$) are determined by the associated locally active wavelet functions $\Psi_i$ ($i = 1, 2, \ldots, M$), each $v_i$ is only locally valid. Therefore, the output of the proposed local linear wavelet neural network is

$$f(x) = \sum_{i=1}^{M} \left( w_{i0} + w_{i1}x_1 + \cdots + w_{iN}x_N \right) \Psi_i(x) = \sum_{i=1}^{M} \left( w_{i0} + w_{i1}x_1 + \cdots + w_{iN}x_N \right) |a_i|^{-1/2}\, \psi\!\left( \frac{x - b_i}{a_i} \right) \qquad (3)$$

where $x = [x_1, x_2, \ldots, x_N]$. The architecture is shown in Fig. 1.

Fig. 1. A local linear wavelet neural network: the inputs $x_1, \ldots, x_n$ feed the wavelet units $\psi_1, \psi_2, \ldots, \psi_n$; each unit's output is weighted by its local linear model $w_{i0} + w_{i1}x_1 + \cdots + w_{in}x_n$, and the weighted outputs are summed ($\Sigma$) to produce the output $y$.

Note that for the $N$-dimensional input space, the multivariate wavelet basis function can be calculated as the tensor product of $N$ one-dimensional wavelet basis functions:

$$\psi(x) = \prod_{i=1}^{N} \psi(x_i) \qquad (4)$$

Note that the dilation and translation parameters are randomly initialized at the beginning and are then optimized by the PSO algorithm. A minimal sketch of this forward pass is given below.
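The paper gives no code, so as a concrete reading of equations (2)-(4), the following is a minimal Python (NumPy) sketch of the LLWNN forward pass. The function names, the packing of all parameters into one flat vector, and the use of the mother wavelet of equation (7) (introduced later, in Section IV) are illustrative assumptions rather than the authors' implementation; in particular, $|a_i|^{-1/2}$ is read here as the product of the per-dimension dilation magnitudes raised to the power $-1/2$.

```python
import numpy as np

def mother_wavelet(z):
    # psi(z) = -z * exp(-z^2 / 2), the mother wavelet of equation (7)
    return -z * np.exp(-z ** 2 / 2.0)

def llwnn_output(params, x, M, N):
    """Output f(x) of the LLWNN of equation (3) for one input vector x of length N.

    `params` is a flat vector packing, for each of the M hidden units:
    N dilations a_i, N translations b_i, and N + 1 local linear weights w_i.
    """
    p = params.reshape(M, 3 * N + 1)
    a, b, w = p[:, :N], p[:, N:2 * N], p[:, 2 * N:]
    # keep dilations away from zero so (x - b) / a stays finite
    a = np.where(a >= 0, np.maximum(a, 1e-6), np.minimum(a, -1e-6))
    z = (x - b) / a
    # multivariate wavelet as the tensor product of equation (4),
    # scaled by |a_i|^(-1/2) with |a_i| taken as prod_j |a_ij|
    psi = np.prod(mother_wavelet(z), axis=1) / np.sqrt(np.abs(np.prod(a, axis=1)))
    v = w[:, 0] + w[:, 1:] @ x          # local linear models v_i of equation (2)
    return float(np.sum(v * psi))
```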

III. LEARNING ALGORITHM

Particle swarm optimization was originally developed through the simulation of bird flocking in two-dimensional space [1][6]. The position of each agent is represented by its XY position, and its velocity is expressed by $(v_x, v_y)$. Agents modify their positions using their position and velocity information. Bird flocking optimizes a certain objective function: each agent knows its best value so far (pbest) and the corresponding position, and each agent also knows the best value achieved so far in the group (gbest) among all the pbests. The velocity of each agent is modified by the following equation:

$$v_i^{k+1} = w\, v_i^{k} + c_1\, \mathrm{rand}_1 \times \left( pbest_i - s_i^{k} \right) + c_2\, \mathrm{rand}_2 \times \left( gbest - s_i^{k} \right) \qquad (5)$$

where $v_i^{k}$ is the velocity of agent $i$ at iteration $k$, $w$ is the inertia weight, $c_1$ and $c_2$ are weighting factors, $s_i^{k}$ is the current position of agent $i$ at iteration $k$, $pbest_i$ is the pbest of agent $i$, and $gbest$ is the gbest of the group. Using this equation, a velocity that gradually moves each agent toward its pbest and the gbest can be calculated. The current position (the searching point in the solution space) is then modified by

$$s_i^{k+1} = s_i^{k} + v_i^{k+1} \qquad (6)$$

The general flow of PSO for optimizing a local linear wavelet neural network can be described as follows:

Step 1. Generation of the initial condition of each agent. Initial searching points ($s_i^0$) and velocities ($v_i^0$) of each agent are usually generated randomly within the allowable range. Note that the dimension of the search space consists of all the parameters used in the local linear wavelet neural network, as shown in equation (3). The current searching point is set to pbest for each agent, the best evaluated value among the pbests is set to gbest, and the agent number with the best value is stored.


Step 2. Evaluation of the searching point of each agent. The objective function value is calculated for each agent. If the value is better than the current pbest of the agent, pbest is replaced by the current value. If the best value among the pbests is better than the current gbest, gbest is replaced by that value and the agent number with the best value is stored.

Step 3. Modification of each searching point. The current searching point of each agent is changed using equations (5) and (6).

Step 4. Checking the exit condition. If the current iteration number reaches the predetermined maximum iteration number, exit; otherwise, go to Step 2. A minimal sketch of this training loop is given below.
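Steps 1-4 are not given as pseudocode in the paper; the following Python sketch, building on the `llwnn_output` function above, is one plausible realization. The swarm size, iteration budget, inertia weight `w = 0.7`, and coefficients `c1 = c2 = 1.5` are illustrative assumptions, not values reported by the authors.

```python
import numpy as np

def rmse(params, X, y, M, N):
    # root mean square error of equation (8), used as the objective function
    pred = np.array([llwnn_output(params, xi, M, N) for xi in X])
    return float(np.sqrt(np.sum((y - pred) ** 2) / (len(y) - 1)))

def pso_train(X, y, M, N, n_agents=30, iters=2000, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    dim = M * (3 * N + 1)                    # all LLWNN parameters of equation (3)
    # Step 1: random initial searching points and velocities
    s = rng.uniform(-1.0, 1.0, (n_agents, dim))
    v = np.zeros((n_agents, dim))
    pbest = s.copy()
    pbest_val = np.array([rmse(p, X, y, M, N) for p in s])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
    for _ in range(iters):
        # Step 2: evaluate each agent and update pbest / gbest
        for i in range(n_agents):
            val = rmse(s[i], X, y, M, N)
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = s[i].copy(), val
                if val < gbest_val:
                    gbest, gbest_val = s[i].copy(), val
        # Step 3: velocity update, equation (5), then position update, equation (6)
        r1 = rng.random((n_agents, dim))
        r2 = rng.random((n_agents, dim))
        v = w * v + c1 * r1 * (pbest - s) + c2 * r2 * (gbest - s)
        s = s + v
    # Step 4: the exit condition here is simply the fixed iteration budget
    return gbest, gbest_val
```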

IV. NUMERICAL EXAMPLES

For the following experiments, the mother wavelet used is

$$\psi(x) = -x \exp\!\left( -\frac{x^2}{2} \right) \qquad (7)$$

The objective function used is the root mean square error (RMSE):

$$\mathrm{RMSE} = \sqrt{ \frac{1}{N-1} \sum_{i=1}^{N} \left( y_1^{i} - y_2^{i} \right)^2 } \qquad (8)$$

where $y_1^{i}$ and $y_2^{i}$ denote the desired output and the model output, respectively.

A. Static Nonlinear Function Approximation

A static nonlinear benchmark modeling problem [7] is considered. The system is described by the equation

$$y = \frac{1.0 + \sin(\pi x_1)\cos(\pi x_2)}{2}, \qquad -1 \le x_1, x_2 \le 1$$

Over the domain [-1, 1] × [-1, 1], we generate a set of 49 data points equally spaced on a 7 × 7 grid as the training data set. The validation data set consists of 169 data points equally spaced on a 13 × 13 grid. The static nonlinear function is approximated by a local linear wavelet neural network with 3 hidden units (3 wavelets). The evolved local linear wavelet neural network was obtained at iteration 3000 with an RMSE of 0.0085 for the training data set and 0.011 for the validation data set. Figures 2 and 3 present the outputs of the actual static nonlinear function and of the local linear wavelet neural network, together with the approximation errors. To give a meaningful comparison, a PSO-optimized WNN was also implemented with 10 wavelets; its RMSEs for the training and test data sets are 0.013 and 0.026, respectively. Figures 4 and 5 show the corresponding model and actual outputs and the approximation errors. It is clear that the proposed local linear wavelet model outperforms the WNN. The data generation for this experiment is sketched below.
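For concreteness, the training and validation grids of this experiment can be generated as follows; `pso_train` and `rmse` are the hypothetical routines sketched in Section III, and their hyperparameters remain assumptions.

```python
import numpy as np

def target(x1, x2):
    # the static benchmark function of Section IV.A
    return (1.0 + np.sin(np.pi * x1) * np.cos(np.pi * x2)) / 2.0

def grid_data(n):
    # n x n points equally spaced on the domain [-1, 1] x [-1, 1]
    g = np.linspace(-1.0, 1.0, n)
    x1, x2 = np.meshgrid(g, g)
    X = np.column_stack([x1.ravel(), x2.ravel()])
    return X, target(X[:, 0], X[:, 1])

X_train, y_train = grid_data(7)      # 49 training points on a 7 x 7 grid
X_val, y_val = grid_data(13)         # 169 validation points on a 13 x 13 grid

# 3 hidden units (3 wavelets) over 2 inputs, as in the experiment
params, train_rmse = pso_train(X_train, y_train, M=3, N=2, iters=3000)
val_rmse = rmse(params, X_val, y_val, M=3, N=2)
```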


Fig. 2. Comparison of the model output, dynamic system output and identification error for the training data set.

Fig. 3. Comparison of the model output, dynamic system output and identification error for the test data set.

Fig. 4. Comparison of the model output, dynamic system output and identification error for the training data set (WNN with 10 wavelets).

Fig. 5. Comparison of the model output, dynamic system output and identification error for the test data set (WNN with 10 wavelets).

B. Nonlinear System Identification

The nonlinear system to be identified is given by the following equation [5]:

$$y(k) = \frac{y(k-1)\, y(k-2)\, \left[ y(k-1) + 2.5 \right]}{1 + y^2(k-1) + y^2(k-2)} + u(k-1) \qquad (9)$$

The input signal for training is selected as random numbers in the interval [0, 1], and the resulting data are used as the training data set. The test data set is created using the input signal

$$u(k) = 0.5 \sin(2\pi k / 10) + 0.5 \sin(2\pi k / 25) \qquad (10)$$

A sketch of this data generation is given below.
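The regressor choice $[y(k-1), y(k-2), u(k-1)]$ used here is a natural reading of equation (9) but is not stated explicitly in the paper, and the sequence length of 200 is an assumption; `pso_train` is the hypothetical routine from Section III.

```python
import numpy as np

def plant(u):
    # simulate the nonlinear difference equation (9), starting from rest
    y = np.zeros(len(u) + 1)
    for k in range(2, len(u) + 1):
        y[k] = (y[k - 1] * y[k - 2] * (y[k - 1] + 2.5)
                / (1.0 + y[k - 1] ** 2 + y[k - 2] ** 2)) + u[k - 1]
    return y

K = 200                                       # sequence length (assumption)
rng = np.random.default_rng(0)
u_train = rng.uniform(0.0, 1.0, K)            # random training input on [0, 1]
y_tr = plant(u_train)

k = np.arange(K)
u_test = 0.5 * np.sin(2 * np.pi * k / 10) + 0.5 * np.sin(2 * np.pi * k / 25)  # eq. (10)
y_te = plant(u_test)

# regressors [y(k-1), y(k-2), u(k-1)] predicting y(k); 5 wavelets as in the paper
X_tr = np.column_stack([y_tr[1:K], y_tr[:K - 1], u_train[1:K]])
params, train_rmse = pso_train(X_tr, y_tr[2:], M=5, N=3)
```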

The local linear wavelet neural network trained by the PSO algorithm converged at iteration 2038 with an RMSE of 0.029 for the training data set and 0.034 for the test data set. The network has 5 hidden units (5 wavelets). Figure 6 compares the output of the dynamic system, the output of the local linear wavelet neural network, and the identification error. From Figure 6, it can be seen that the proposed local linear wavelet neural network works well for the identification of nonlinear systems.

Fig. 6. Comparison of the model output, dynamic system output and identification error for the test data set.

V. CONCLUSION

In this paper, a local linear wavelet neural network is proposed. The characteristic of the network is that the straightforward weight is replaced by a local linear model. The working process of the proposed network can be viewed as decomposing a complex, nonlinear system into a set of locally active submodels and then smoothly integrating those submodels through their associated wavelet basis functions. One advantage of the proposed method is that it needs fewer wavelets for a given problem than the common wavelet neural network. A fast training algorithm, particle swarm optimization, is also introduced for training the local linear wavelet neural network. Simulation results for static nonlinear function approximation and dynamic system identification show the effectiveness of the proposed approach.

REFERENCES

[1] J. Kennedy et al., "Particle Swarm Optimization", Proc. of IEEE International Conference on Neural Networks, Vol. IV, pp. 1942-1948, 1995.
[2] Yuehui Chen et al., "Evolving Wavelet Neural Networks for System Identification", Proc. of the International Conference on Electrical Engineering, pp. 279-282, 2000.
[3] Q. Zhang and A. Benveniste, "Wavelet Networks", IEEE Trans. on Neural Networks, Vol. 3, No. 6, pp. 889-898, 1992.
[4] Yuehui Chen et al., "Evolving the Basis Function Neural Networks for System Identification", International Journal of Advanced Computational Intelligence, Vol. 5, No. 4, pp. 229-238, 2001.
[5] K. S. Narendra et al., "Adaptive Control Using Neural Networks and Approximation Models", IEEE Trans. on Neural Networks, Vol. 8, No. 3, pp. 475-485, 1997.
[6] H. Yoshida et al., "A Particle Swarm Optimization for Reactive Power and Voltage Control Considering Voltage Security Assessment", IEEE Trans. on Power Systems, Vol. 15, No. 4, November 2000.
[7] T. Wang et al., "A wavelet neural network for the approximation of nonlinear multivariable function", The Trans. of the Institute of Electrical Engineers C, Vol. 102-C, pp. 185-193, 2000.
[8] Yuehui Chen et al., "Evolving Wavelet Neural Networks for System Identification", Proc. of the International Conference on Electrical Engineering, pp. 279-282, Kitakyushu, Japan, 2000.
[9] A. P. Engelbrecht and A. Ismail, "Training product unit neural networks", Stability and Control: Theory and Applications, Vol. 2, No. 1-2, pp. 59-74, 1999.
[10] F. van den Berg, "Particle Swarm Weight Initialization in Multi-layer Perceptron Artificial Neural Networks", in Development and Practice of Artificial Intelligence Techniques, pp. 41-45, 1999.
[11] F. van den Berg and A. P. Engelbrecht, "Cooperative Learning in Neural Networks using Particle Swarm Optimizers", South African Computer Journal, pp. 84-90, Nov. 2000.
[12] R. C. Eberhart and X. Hu, "Human Tremor Analysis Using Particle Swarm Optimization", Proc. of the Congress on Evolutionary Computation, pp. 1927-1930, 1999.
[13] Y. Shi and R. C. Eberhart, "Empirical Study of Particle Swarm Optimization", Proc. of the Congress on Evolutionary Computation, pp. 1945-1949, 1999.
[14] Y. Shi and R. C. Eberhart, "A Modified Particle Swarm Optimizer", IEEE International Conference on Evolutionary Computation, May 1998.
[15] S. Kawaji and Yuehui Chen et al., "Evolving Neurofuzzy Systems by Hybrid Soft Computing Approaches for System Identification", International Journal of Advanced Computational Intelligence, Vol. 5, No. 4, pp. 229-238, 2001.