Short Paper Int. J. on Recent Trends in Engineering and Technology, Vol. 6, No. 1, Nov 2011
Improved EBP Training Algorithm (GK-LM) with Numerical Method for NN Image Compression
G.R.C. Kaladhara Sarma1, P.V. GopiKrishna Rao2, Dr. K. Muralidhar Reddy3
1,2 RGMCET/EIE, Nandyal, India
3 Professor, RGMCET/EIE, Nandyal, India
[email protected]
Abstract— In this paper we propose an algorithm for neural network training. The algorithm is developed by modifying the EBP algorithm with the help of numerical methods for MLP neural network learning. The proposed algorithm has good convergence and reduces the amount of oscillation in the learning procedure. We name this algorithm the GK-LM method. An example is given to show the usefulness of the method, and a simulation verifies the results of the proposed method.
Keywords— GK-LM method, modification, neural network, variable learning rate.
I. INTRODUCTION
The Error Back Propagation (EBP) algorithm [1]–[4] has been a significant improvement in neural network research, but it has a weak convergence rate. Many efforts have been made to speed up the EBP algorithm [5]–[9], yet these methods yield only modestly acceptable results. The EBP-LM algorithm [4], [10]–[13] ensued from the development of EBP-based methods. It provides a good trade-off between the speed of the Newton algorithm and the stability of the steepest descent method [11], which are the two foundations of the EBP-LM algorithm. An attempt has been made to speed up the EBP algorithm with a modified performance index, gradient computation [14], and numerical differentiation, although it is unable to reduce error oscillation. Another effort, with a variable decay rate, has been made to reduce error oscillation [15]. In this paper a modification is made to the computation of the differentiations, which takes less time than the standard method, and a change in the learning parameter decreases both the number of learning iterations and the oscillation, resulting in fast convergence. A modification that varies the learning parameter has been made to speed up the EBP algorithm while decreasing error oscillation. Section II describes the EBP algorithm. In Section III the proposed GK-LM method is introduced. In Section IV a simulation is discussed.
II. THE STANDARD EBP-BASED METHOD: REVIEW
In the EBP algorithm, the performance index F(w) to be minimized is defined as the sum of squared errors between the target outputs and the network's simulated outputs, namely

F(w) = e^T e,

where w = [w1, w2, ..., wN] consists of all weights of the network and e is the error vector comprising the errors for all the training examples. The standard training process for image compression can be illustrated by the following pseudo-code (a short preprocessing sketch follows the list):
1. Divide the original image into 8x8 pixel blocks and reshape each one into a 64x1 column vector.
2. Arrange the column vectors into a 64x1024 matrix.
3. Let the target matrix equal the matrix from step 2.
4. Choose a suitable learning algorithm and parameters, and start training.
5. Simulate the network with the input matrix and the target matrix.
6. Obtain the output matrices of the hidden layer and the output layer.
7. Post-process them to obtain the compressed image.
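As a point of reference, steps 1-3 can be carried out in a few lines of NumPy. The sketch below is not from the paper: the 256x256 grayscale test image, the row-major patch ordering, and the name image_to_blocks are assumptions made purely for illustration.

import numpy as np

def image_to_blocks(img, block=8):
    """Split a grayscale image into non-overlapping block x block patches
    and stack each patch as a column of a (block*block, n_patches) matrix."""
    h, w = img.shape
    cols = []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            patch = img[r:r + block, c:c + block]
            cols.append(patch.reshape(block * block, 1))  # 64x1 column vector
    return np.hstack(cols)

# Assumed 256x256 test image: 32*32 = 1024 blocks give the 64x1024 matrix of step 2.
img = np.random.rand(256, 256)   # stand-in for the actual image
X = image_to_blocks(img)         # network input, shape (64, 1024)
T = X.copy()                     # step 3: the target matrix equals the input matrix

Because the target equals the input, the two-layer network acts as an auto-associator: a hidden layer smaller than 64 units carries the compressed representation and the output layer reconstructs the block.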
III. THE PROPOSED GK-LM METHOD
The improved back-propagation algorithm used in this work can be summarized in the following steps.
Step 1: Initialize all the weights and biases to small random values between -1 and +1.
Step 2: Read the input vector x and the desired output vector d.
Step 3: Compute the actual outputs of the network, y_j = f(sum_i w_ij x_i), where f(.) is the nonlinear activation function.
Step 4: Adjust the weights by

w_ij(t+1) = w_ij(t) + eta * delta_j * x_i + alpha * [w_ij(t) - w_ij(t-1)],
where w_ij(t) is the weight from node i to node j at time t, eta is the learning rate, alpha is a positive momentum constant between zero and one, and delta_j is the error term for node j. If node j is an output node, then delta_j follows directly from the difference between the desired and actual outputs; if node j is an internal hidden node, then delta_j is obtained for hidden unit j by the chain rule from the error terms of the units it feeds.
Step 5: Compute the mean squared error between the desired output and the actual output, denoted by E.
If E becomes smaller than some predefined error goal, stop the iterations; otherwise go to Step 2 (a sketch of this training loop follows).
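For concreteness, the following is a minimal NumPy sketch of Steps 1-5 for a two-layer auto-associative network. It is an illustration only: the sigmoid activation, the batch (rather than per-pattern) update, the omission of biases, and the names train_ebp, eta, alpha are choices made here, not details taken from the paper.

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_ebp(X, T, nh=16, eta=0.1, alpha=0.9, goal=1e-3, max_iter=1000):
    """Steps 1-5: EBP with momentum for an ni-nh-no network (biases omitted for brevity)."""
    ni, _ = X.shape
    no = T.shape[0]
    rng = np.random.default_rng(0)
    W1 = rng.uniform(-1.0, 1.0, (nh, ni))        # Step 1: small random weights in [-1, +1]
    W2 = rng.uniform(-1.0, 1.0, (no, nh))
    dW1_prev = np.zeros_like(W1)
    dW2_prev = np.zeros_like(W2)
    for _ in range(max_iter):
        Z = sigmoid(W1 @ X)                      # Step 3: hidden-layer outputs
        Y = sigmoid(W2 @ Z)                      # Step 3: actual network outputs
        d2 = (T - Y) * Y * (1 - Y)               # Step 4: error term for output nodes
        d1 = (W2.T @ d2) * Z * (1 - Z)           # Step 4: hidden error terms via the chain rule
        dW2 = eta * d2 @ Z.T + alpha * dW2_prev  # weight change with momentum
        dW1 = eta * d1 @ X.T + alpha * dW1_prev
        W2 += dW2
        W1 += dW1
        dW2_prev, dW1_prev = dW2, dW1
        E = np.mean((T - Y) ** 2)                # Step 5: mean squared error
        if E < goal:                             # stop when the error goal is reached
            break
    return W1, W2, E

The GK-LM method keeps this overall structure; what changes, in essence, is the Step 4 weight adjustment, which is replaced by the Levenberg-Marquardt style increment derived below.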
The gradient of the error function with respect to a weight w_ji is a local computation: it involves the product of the error signal (y_nj - t_nj) associated with the output end of the link w_ji and the variable x_ni associated with the input end of the link. By the chain rule for partial derivatives, the required derivative is obtained by multiplying the value of delta for the unit at the output end of the weight by the value of z for the unit at the input end; what remains is to calculate delta, which follows directly from the output error for an output unit and, for a hidden unit, from the delta values of the units it feeds.

Using the Newton method we have

dw = -[grad^2 F(w)]^(-1) grad F(w).

The gradient can be written as

grad F(w) = J^T(w) e(w),

where J(w) is called the Jacobian matrix. Next we want to find the Hessian matrix; working element by element, it can be expressed as

grad^2 F(w) = J^T(w) J(w) + S(w),   with   S(w) = sum_i e_i(w) grad^2 e_i(w).
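The quantities above can be sanity-checked numerically. The sketch below approximates the Jacobian by finite differences on a toy residual e(w) = Aw - b; the helper name numerical_jacobian, the step size h, and the toy problem itself are assumptions made here for illustration, not the authors' procedure.

import numpy as np

def numerical_jacobian(residual, w, h=1e-6):
    """Finite-difference Jacobian J[k, j] = d e_k / d w_j of the error vector e(w)."""
    e0 = residual(w)
    J = np.zeros((e0.size, w.size))
    for j in range(w.size):
        wp = w.copy()
        wp[j] += h
        J[:, j] = (residual(wp) - e0) / h
    return J

# Toy residual: e(w) = A w - b.
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([1.0, 0.0, 1.0])
residual = lambda w: A @ w - b

w = np.zeros(2)
J = numerical_jacobian(residual, w)
e = residual(w)
grad = J.T @ e          # gradient of F(w) = e^T e (up to a constant factor)
H_approx = J.T @ J      # Gauss-Newton approximation of the Hessian

In the paper the derivatives come from the back-propagation relations above; the finite-difference Jacobian is only a generic numerical stand-in used to keep the example short.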
If we assume that S(w) is small, we can approximate the Hessian matrix as

grad^2 F(w) ~ J^T(w) J(w).

Using this approximation in the Newton update, we obtain the Gauss-Newton method:

dw = [J^T(w) J(w)]^(-1) J^T(w) e(w).

The advantage of Gauss-Newton is that it does not require the calculation of second derivatives. The problem with the Gauss-Newton method is that the matrix H = J^T J may not be invertible. This can be overcome by the following modification: the Hessian matrix can be written as

G = H + mu*I = J^T J + mu*I.
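A quick numerical check makes the point. The rank-deficient Jacobian below is a made-up example: its J^T J is singular, while J^T J + mu*I is invertible for any mu > 0.

import numpy as np

# A Jacobian with linearly dependent columns makes H = J^T J singular.
J = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])        # second column = 2 * first column
H = J.T @ J
print(np.linalg.matrix_rank(H))   # 1  -> H is singular and cannot be inverted

mu = 0.1
G = H + mu * np.eye(2)
print(np.linalg.matrix_rank(G))   # 2  -> G is invertible
print(np.linalg.eigvalsh(G) - np.linalg.eigvalsh(H))  # each eigenvalue shifted by mu

The last line anticipates the eigenvalue argument that follows: adding mu*I shifts every eigenvalue of H by mu while leaving the eigenvectors unchanged.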
Suppose that the eigenvalues and eigenvectors of H are {lambda_1, lambda_2, ..., lambda_n} and {z_1, z_2, ..., z_n}. Then

G z_i = [H + mu*I] z_i = H z_i + mu z_i = lambda_i z_i + mu z_i = (lambda_i + mu) z_i.
Therefore the eigenvectors of G are the same as the eigenvectors of H, and the eigenvalues of G are (lambda_i + mu). The matrix G can be made positive definite by increasing mu until (lambda_i + mu) > 0 for all i, so that the matrix is invertible. This leads to the Levenberg-Marquardt algorithm:

dw = [J^T(w) J(w) + mu*I]^(-1) J^T(w) e(w).
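A minimal sketch of this update, assuming J and e are already available (for instance from the finite-difference helper sketched earlier); the name lm_step and the use of a linear solver instead of an explicit inverse are choices made here.

import numpy as np

def lm_step(J, e, mu):
    """One Levenberg-Marquardt weight increment: solve (J^T J + mu I) dw = J^T e."""
    n = J.shape[1]
    G = J.T @ J + mu * np.eye(n)
    return np.linalg.solve(G, J.T @ e)

# Usage with a Jacobian J and error vector e obtained as in the earlier sketch:
# dw = lm_step(J, e, mu=0.01)
# w = w - dw            # move the weights toward the desired output

In the standard LM method mu is a constant; the GK-LM modification described next ties it directly to the current error.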
As is known, the learning parameter mu determines the step size with which the actual output moves toward the desired output. In the standard LM method, mu is a constant. This paper modifies the LM method by choosing mu as

mu = 0.001 e^T e,

where e is a k x 1 error vector, so e^T e is a 1 x 1 scalar and [J^T J + mu*I] remains invertible. Therefore, if the actual output is far from the desired output, i.e., the errors are large, the method converges toward the desired output with large steps; likewise, when the measured error is small, the actual output approaches the desired output with soft steps. As a result, the error oscillation is greatly reduced.

IV. SIMULATION RESULTS
In this section we take an image as a case study and compress it with both the standard EBP method and the GK-LM method, using a neural network with two layers. Table I lists standard values for neural-network compression with the EBP method, Table II shows the results of the modified GK-LM method, and Table III compares the convergence times. As the results show, both the number of learning iterations and the convergence time are reduced with the GK-LM method.

TABLE I. SOME STANDARD VALUES FOR EBP-BASED IMAGE COMPRESSION (FROM WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY)
TABLE II. THE PROPOSED GK-LM METHOD RESULTS
TABLE III. COMPARISON OF CONVERGENCE TIME FOR THE STANDARD CAMERAMAN AND LENA IMAGES
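To tie the pieces together, here is a sketch of one GK-LM iteration with the adaptive learning parameter mu = 0.001 e^T e, run on a toy linear least-squares problem. It is an illustration only, not the authors' implementation; the names gk_lm_iteration, residual and jacobian are assumptions.

import numpy as np

def gk_lm_iteration(residual, jacobian, w):
    """One GK-LM iteration: set mu from the current error, then take an LM-style step."""
    e = residual(w)                       # current error vector e(w)
    mu = 0.001 * float(e @ e)             # proposed rule: mu = 0.001 * e^T e
    J = jacobian(w)                       # Jacobian of e(w) with respect to w
    G = J.T @ J + mu * np.eye(w.size)     # damped Gauss-Newton matrix, invertible
    dw = np.linalg.solve(G, J.T @ e)
    return w - dw, float(e @ e)

# Usage on a toy least-squares problem e(w) = A w - b:
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([1.0, 0.0, 1.0])
residual = lambda w: A @ w - b
jacobian = lambda w: A                    # for a linear residual the Jacobian is constant
w = np.zeros(2)
for _ in range(5):
    w, sq_err = gk_lm_iteration(residual, jacobian, w)
print(w, sq_err)

The learning parameter thus scales with the current squared error, which is the mechanism the paper credits for the reduced oscillation and faster convergence reported in Tables II and III.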
REFERENCES
[1] Rumelhart, D. E., Hinton, G. E. and Williams, R. J., "Learning internal representations by error propagation," in Parallel Distributed Processing, Cambridge, MA: MIT Press, vol. 1, pp. 318-362.
[2] Rumelhart, D. E., Hinton, G. E. and Williams, R. J., "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
[3] Werbos, P. J., "Back-propagation: past and future," Proceedings of the International Conference on Neural Networks, San Diego, CA, vol. 1, pp. 343-354, 1988.
[4] Hagan, M. T. and Menhaj, M. B., "Training feedforward networks with the Marquardt algorithm," IEEE Trans. on Neural Networks, vol. 5, no. 6, pp. 989-993, 1994.
[5] Bello, M. G., "Enhanced training algorithms, and integrated training/architecture selection for multilayer perceptron networks," IEEE Trans. on Neural Networks, vol. 3, pp. 864-875, 1992.
[6] Samad, T., "Back-propagation improvements based on heuristic arguments," Proceedings of the International Joint Conference on Neural Networks, Washington, vol. 1, pp. 565-568, 1990.
[7] Solla, S. A., Levin, E. and Fleisher, M., "Accelerated learning in layered neural networks," Complex Systems, vol. 2, pp. 625-639, 1988.
[8] Miniani, A. A. and Williams, R. D., "Acceleration of back-propagation through learning rate and momentum adaptation," Proceedings of the International Joint Conference on Neural Networks, San Diego, CA, vol. 1, pp. 676-679, 1990.
[9] Jacobs, R. A., "Increased rates of convergence through learning rate adaptation," Neural Networks, vol. 1, no. 4, pp. 295-308, 1988.
[10] Andersen, T. J. and Wilamowski, B. M., "A modified regression algorithm for fast one layer neural network training," World Congress of Neural Networks, Washington, DC, USA, vol. 1, pp. 687-690, July 17-21, 1995.
[11] Battiti, R., "First- and second-order methods for learning: between steepest descent and Newton's method," Neural Computation, vol. 4, no. 2, pp. 141-166, 1992.
[12] Charalambous, C., "Conjugate gradient algorithm for efficient training of artificial neural networks," IEE Proceedings, vol. 139, no. 3, pp. 301-310, 1992.
[13] Shah, S. and Palmieri, F., "MEKA - a fast, local algorithm for training feedforward neural networks," Proceedings of the International Joint Conference on Neural Networks, San Diego, CA, vol. 3, pp. 41-46, 1990.
[14] Wilamowski, B. M., Chen, Y. and Malinowski, A., "Efficient algorithm for training neural networks with one hidden layer," in Proc. IJCNN, vol. 3, pp. 1725-1728, 1999.
[15] Chen, T. C., Han, D. J., Au, F. T. K. and Than, L. G., "Acceleration of Levenberg-Marquardt training of neural networks with variable decay rate," IEEE Trans. on Neural Networks, vol. 3, no. 6, pp. 1873-1878, 2003.
[16] Suratgar, A. A., "Modified Levenberg-Marquardt method for neural networks training," Proceedings of World Academy of Science, Engineering and Technology, vol. 6, 2005.